Is there really value being presented here? Is this codebase a stable enough base to continue developing this compiler, or does it warrant a total rewrite? Honest question: the author seems to have mentioned it being at its limits. This mirrors my own experience with Opus, in that it isn't that great at defining abstractions, at least not in one shot. Maybe with enough loops it could converge, but I haven't seen definitive proof of that in the current generation of these ambitious, clickbaity projects.
This is an experiment to see the current limit of AI capabilities. The end result isn't useful, but the fact is established that in Feb 2026, you can spend $20k on AI to get an inefficient but working C compiler.
Of course it's impressive. I am just pointing out that these experiments with the million-line browser and now this C compiler seem to greatly extrapolate their conclusions. The researchers claim they prove you can scale agents horizontally for economic benefit, but the products both of these built are of questionable technical quality, and it isn't clear to me they are a stable enough foundation to build on top of. But everyone in the hype crowd just assumes this is true. At least this researcher has sort of promised to pursue this project, whereas Wilson already pretty much gave up on his browser; I hadn't seen a commit in that repo for weeks. Given that, I am not going to immediately assume these agents truly achieved anything of economic value relative to what a smaller set of agents could have achieved.
FWIW, an inefficient but working product is pretty much the definition of a startup MVP. People are getting hung up on the fact that it doesn't beat gcc and clang, and generalizing to the idea that such a thing can't possibly be useful.
But clearly it can be, and is. This builds and boots Linux. A putative MVP might launch someone's dreams. For $20k!
The reflexive Luddism is kinda scary, actually. We're beyond the "will it work" phase, and the disruption is happening in front of us. I was a Luddite 10 months ago. I was wrong.
You are projecting and overreacting. My response is measured against the insane hype this is getting beyond what was demonstrated. I never said it wasn't impressive.
I'm not hung up on anything. Clearly the project isn't stable, because it can't be modified without regressions. It can be an MVP, but if it needs someone to rewrite it, or to spend many man-months just to grok the code before adding to it, then it's conceivable it isn't an economic win in the long run. Also, they haven't compared this to what a smaller set of agents could accomplish on the same task, so I am still not fully sold on the economic viability of horizontally scaling agents at this time (at least not on the task that was tested).
Then, as your parent comment asked, is there value in it? $20k, which is more than the yearly minimum wage in several European countries, was spent recreating a worse version of something we already have, just to see if it was possible, using a system that increases inequality and worsens climate change, which is causing people to die.
If it generates a booting kernel and passes the test suite at 99% it's probably good enough to use, yeah.
The point isn't to replace GCC per se, it's to demonstrate that reasonably working software of equivalent complexity is within reach for $20k to solve whatever problem it is you do have.
> that reasonably working software of equivalent complexity is within reach for $20k to solve
But if this can't come close to replacing GCC and can't be modified without introducing bugs, then it hasn't proven this yet. I learned some new hacks from the paper, and that's great and all, but from my experience trying to harness even four Claude sessions in parallel on a complex task, it just goes off the rails in terms of coherence. I'll try the new techniques, but my intuition is that it's not really as good as you're selling it.
What does that mean, though? I mean, it's already meeting a very high quality bar by booting at all and passing those tests. No, it doesn't beat existing solutions on all the checkboxes, but that's not what the demo is about.
The point being demonstrated is that if you need a "custom compiler" or something similar for your own new, greenfield requirement, you can have it at pretty-clearly-near-shippable quality in two weeks for $20k.
And if people can't smell the disruption there, I don't know what to say.
Is it really shippable if it is strictly worse than the thing it copied? Do you know anyone who would use a vibe-coded compiler that can't be modified without introducing regressions (as the researcher admitted)?
> if you spend months writing a tight spec, tests and have a better version of the compiler around to use when everything else fails.
Doesn't matter because your competitors will have beaten you to market. That's just a simple Darwinian point, no AI magic needed.
No one doubts that things will be different in the coming Claudepocalypse, and new ideas about quality and process will need to happen to manage it. But sticking our heads in the sand and pretending that our stone tools are still better is just a path to early retirement at this point.
I feel like maybe you spend too much time watching hypefluencers. AI tools are great, but if they're already superintelligent, why haven't you gotten a swarm of agents to build yourself a billion-dollar SaaS?
It's hard to separate the bullshit from reality when the hype is turned to the max everywhere you turn. It feels like I'm in some elaborate psy-op where my experience with these tools is an order of magnitude below the hype, and I can't even express those thoughts without having a "luddite" label attached to me. And if you read between the lines of what Karpathy wrote in his famous "anxiety" post, it kind of echoes my point: it's "an alien technology and we can't wield it right", yada yada. Which is an odd way to say "sometimes this thing works magically, but a lot of the time it's total shit, so you aren't as productive as you would like".
Copilot has access to the latest models like Opus 4.6 in agentic mode as well. It's got certain quirks and I prefer a TUI myself but it isn't radically different.
You submit a SQL query to run periodically; we run it and store the results. As we ingest more documents (dozens of sources are ingested every day), we run it again. If the output differs, you get an email.
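For anyone curious, the core loop you're describing is simple to sketch. Here's a minimal Python version, assuming a SQLite store, a local mail server, and a scheduler that persists the last result hash between runs; `run_watched_query`, `fingerprint`, and the addresses are all my own placeholders, not your actual system:

```python
# Minimal sketch of a "re-run a saved query, email on change" watcher.
# Assumptions: SQLite database, local SMTP server, caller stores last_hash.
import hashlib
import sqlite3
import smtplib
from email.message import EmailMessage

def fingerprint(rows):
    """Hash the sorted result set so runs can be compared cheaply."""
    return hashlib.sha256(repr(sorted(rows)).encode()).hexdigest()

def run_watched_query(db_path, query, last_hash, recipient):
    """Re-run the saved query; email the recipient if the results changed."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(query).fetchall()
    new_hash = fingerprint(rows)
    if last_hash is not None and new_hash != last_hash:
        msg = EmailMessage()
        msg["Subject"] = "Watched query results changed"
        msg["From"] = "alerts@example.com"  # hypothetical sender
        msg["To"] = recipient
        msg.set_content(f"Query:\n{query}\n\nNew result count: {len(rows)}")
        with smtplib.SMTP("localhost") as smtp:  # assumes a local MTA
            smtp.send_message(msg)
    return new_hash  # caller persists this for the next scheduled run
```

Hashing the result set rather than diffing row by row keeps the stored state tiny; the trade-off is the alert says "something changed" without saying what.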
> If the private market doesn't want bonds, the central bank can purchase them. That's not inflationary.
A central bank buying bonds and increasing the money supply absolutely is inflationary. That is precisely how open market operations (OMOs) work, with the end goal of increasing or decreasing the money supply depending on inflation and the labour market. So if you already have stubborn inflation and a fiscal crisis, then unmooring inflation expectations by lowering rates is exactly what you don't want to do (and risk becoming a banana republic that inflates away its debt). I don't think this will happen in the near future, but it is absolutely a risk, and you'd be foolish as a central banker not to consider it.
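One textbook way to frame this (a simplification, not a claim about any specific episode) is the equation of exchange, MV = PQ: M is the money supply, V its velocity, P the price level, Q real output. Bond purchases funded with newly created reserves raise M, and if V and Q don't move to offset it, P has to rise. Whether V and Q actually hold still is exactly what the "it's not inflationary" side disputes.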