
Someone will still need to review the reams and reams of bullshit code generated by these Markov chains. It'll be like all your code gets written by first-term co-op students who have discovered alcohol for the first time and think it's pretty neat, but you're still responsible for the final output and don't get to do any of the fun and interesting stuff. Just constant, mind-numbing code reviews of something that might look OK on the surface but is really just blow-up punching clowns all the way down, and it's already shipped by the marketing department because your backlog is massive and, after all, you've been replaced by a random number generator fed through an incomplete Bayesian database of random bad code that costs so much less.


You described v1. Just wait for v2.

You can’t possibly think these things will remain clowns forever.

These things aren’t “Markov chains” - the architecture is significantly more scalable, which is exactly why this time is different.


A Markov chain is a mathematical structure, not a machine learning architecture. If there is a finite number of states, and a function that gives the probability of transitioning to any given state conditioned only on the current state, then it's a Markov chain.

A transformer with a finite block size has a finite number of states (every possible window of context tokens), and the next token is sampled based only on the current window, so it is a Markov chain.
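
For what it's worth, here's a minimal sketch of that argument in Python. `token_probs` is a hypothetical stand-in for any fixed-context model (transformer or otherwise) that maps a window to a distribution over the next token; the point is only that the transition depends on nothing but the current window, which is the Markov property.

```python
import random
from typing import Callable, Dict, Tuple

State = Tuple[int, ...]  # the last k token ids: the Markov chain's state

def step(state: State, token_probs: Callable[[State], Dict[int, float]]) -> State:
    """One Markov transition: sample the next token from P(token | state),
    then slide the window by one position."""
    dist = token_probs(state)
    tokens, probs = zip(*dist.items())
    nxt = random.choices(tokens, weights=probs, k=1)[0]
    return state[1:] + (nxt,)  # drop the oldest token, append the new one

# Toy usage: a uniform "model" over a 3-token vocabulary, window k = 2.
uniform = lambda s: {0: 1/3, 1: 1/3, 2: 1/3}
state: State = (0, 1)
for _ in range(5):
    state = step(state, uniform)

# With vocabulary size |V| and window size k, there are at most |V|**k
# possible states, so the chain is finite -- the claim made above.
```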


GPT-3 was released four years ago; counting iterations since then, we're at roughly v5, and progress has been incremental relative to that milestone. Transformer models can only be scaled so far before not even VC money can sustain the training costs. I believe we will get there eventually, but transformer-based LLMs have been hitting a ceiling for a long time, and we need to think differently.



