Try using Spec Kit. Codex 5 high for planning; Claude Code (Sonnet 4.5) for implementation; Codex 5 high for checking the implementation; back to Claude Code for addressing feedback from Codex; ask Claude Code to create a PR; read the PR description to ensure it tracks your expectations.
There’s more you’ll get a feel for when you do all that. But it’s a place to start.
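One concrete bit for that last step: you can pull the PR title and description straight from the shell with gh and compare them against the spec (the PR number below is just a placeholder):

    # print the PR, including its description, so you can check it against the spec
    gh pr view 123

    # and skim the actual diff while you're at it
    gh pr diff 123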
I mean, if taxi companies could build their own Uber in-house, I'm sure they'd love to, and take at least some customers from Uber itself.
A lot of startups are middlemen with snazzy UIs. Middlemen won't be as useful in a post-AI world, the same way devs won't be as needed (devs are middlemen to working software) or artists (middlemen to art assets).
Is that why you use Uber, because the app has more depth and is more polished?
Most people use it for price and the ability to get a driver quickly, some for safety, and many because of the brand.
Having a functioning app with an easy interface helps onboard and funnel people, but it's not a moat, just an on-ramp, like the phone number many taxis have.
These guys are pursuing what they believe to be the biggest prize ever in the history of capitalism. Given that, viewing their decisions cynically by default seems like a rational place to start.
I don’t understand the use of MCP described in the post.
Claude Code can access pretty much all of those third-party services from the shell, using curl or gh and so on. And in at least one case, using MCP can cause trouble: the Linear MCP server truncates long issues, in my experience, whereas curling the API does not.
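For example, pulling an issue straight from Linear's GraphQL API with curl looks roughly like this (the API key and issue ID are placeholders, and I'm going from memory on the exact field names):

    # fetch the full issue body directly, no MCP server in the middle
    curl -s https://api.linear.app/graphql \
      -H "Authorization: $LINEAR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"query": "{ issue(id: \"YOUR-ISSUE-ID\") { title description } }"}'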
You're exactly right. To be honest, in pretty much every case I've seen, pointing the model at a read-only resource directly in the prompt outperforms using MCP for it. You should really only reach for MCP if you need MCP-specific functionality imo (elicitation, sampling).
> Overall, LLMs aren’t yet at the point where they can replace all engineers. But I don’t doubt they will be soon enough.
All engineers? This doesn't match my hands-on experience at all.
If you give a chainsaw to everyone, it doesn't make everyone a lumberjack. And a chainsaw itself certainly isn't a lumberjack.
If you give Claude Code or the like to everyone, it doesn't make everyone a highly skilled software engineer. And Claude Code itself isn't a highly skilled software engineer.
I've come around to this view. When I first began using these things for building software (the moment ChatGPT dropped), I was immediately skeptical of the view that these things are merely glorified autocomplete. They felt so different than that. These computers would do what I _meant_, not what I _said_. That was a first and very unlike any autocomplete I'd ever seen.
Now, with experience using them to build software and feeling how they are improving, I believe they are nothing more or less than fantastically good autocomplete: so good that it was previously unimaginable outside of science fiction.
Why autocomplete and not highly skilled software engineer? They have no taste. And at best they only sometimes pretend to know the big picture.
They do generate lots of code, of course. And you need something / someone that you can trust to review all that code. And THAT thing needs to have good taste and to know the big picture, so that it can reliably make good judgement calls.
LLMs can't. So using LLMs to write loads of code just shifts the bottleneck from writing code to reviewing code. Moreover, LLMs and their trajectory of improvements do not, to this experienced software engineer, feel like the kind of solution and the kind of improvements needed to get to an automated code review system so reliable that a company would bet its life on it.
Are we going to need fewer software engineers per line of code written? Yes. Are lines of code going to go way up? Yes. Is the need for human code review going to go way up? Yes, until something other than mere LLM improvements arrive.
Even if you assume 100% of code is written by LLMs, all engineers aren't going to be replaced by LLMs.
Tbh, it could be a breakthrough in model design, a smart optimization (like the recent DeepConf paper), a more brute-force approach (like the recent CodeMonkeys paper), or a completely new paradigm (at which point we won't even call it an LLM anymore). Either way, I believe it's hard to claim this will never happen.
It's pretty easy to understand why it will never happen. AI isn't alive. It's intellectually more akin to a sword or a gun than to even the simplest living thing. Nobody has any intention of changing this, because there's money to be had selling swords and guns and no money to be had in selling living entities that seek self-preservation and a soul.