
I can't say what's happening inside GitHub Copilot, but it's not necessarily true that the only way to produce syntactically valid output is to copy substrings of the source text. A model can learn something approximating a generative grammar.

Take a look at https://karpathy.github.io/2015/05/21/rnn-effectiveness/
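To make the point concrete, here is a minimal sketch using a character-level n-gram model. This is a much simpler stand-in for the RNNs in that post and says nothing about how Copilot works. The function names `train` and `sample`, the toy corpus, and the context length of 3 are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def train(text, order):
    """Map each `order`-character context to the characters seen after it."""
    table = defaultdict(list)
    for i in range(len(text) - order):
        table[text[i:i + order]].append(text[i + order])
    return table

def sample(table, order, seed_text, length, rng):
    """Extend seed_text one character at a time using the learned table."""
    out = seed_text
    while len(out) < length:
        followers = table.get(out[-order:])
        if not followers:  # context has no recorded follower: stop early
            break
        out += rng.choice(followers)
    return out

# Toy training text (hypothetical): two similar C-style loops.
corpus = (
    "for (i = 0; i < n; i++) { sum += a[i]; }\n"
    "for (j = 0; j < m; j++) { total += b[j]; }\n"
)

ORDER = 3
model = train(corpus, ORDER)
generated = sample(model, ORDER, corpus[:ORDER], 60, random.Random(0))

print(generated)
print("verbatim substring of corpus:", generated in corpus)
```

Every 4-character window of the output appears somewhere in the corpus, yet the output as a whole need not be a substring of it, because the model can splice pieces of the two loops together. Raising `ORDER` pushes the model toward copying the training text verbatim, which is the other half of the question.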

At the same time, I would not be surprised if some outputs do correspond directly to the training data.


