I wonder, at the end of this, if it's still worth the risk?
A lot of how I form my thoughts is driven by writing code, and seeing it on screen, running into its limitations.
Maybe it's the kind of work I'm doing, or maybe I just suck, but the code to me is a forcing mechanism into ironing out the details, and I don't get that when I'm writing a specification.
I second this. This is the matter against which we form understanding. This here is the work at hand, our own notes, discussions we have with people, the silent walk where our brain kinda processes errors and ideas .. it's always been like this since I was a kid, playing with construction toys. I never ever wanted somebody to play while I wait to evaluate if it fits my desires. Desires that often come from playing.
Outsourcing this to an LLM is similar to an airplane stall .. I just dip mentally. The stress goes away too, since I assume the LLM will get rid of the "problem" but I have no more incentives to think, create, solve anything.
Still blows my mind how different people approach some fields. I see people at work who are drooling about being able to have code made for them .. but I'm not in that group.
I'll push back against this a little bit. I find any type of deliberative thinking to be a forcing function. I've recently been experimenting with writing very detailed specifications and prompts for an LLM to process. I find that as I go through the details, thoughts will occur to me. Things I hadn't thought about in the design will come to me. This is very much the same phenomenon as when I was writing the code by hand. I don't think this is a binary either/or. There are many ways to have a forcing function.
I think it's analogous to writing and refining an outline for a paper. If you keep going, you eventually end up at an outline where you can concatenate what are basically sentences together to form paragraphs. This is sort of where you are now, if you spec well you'll get decent results.
I agree, I felt this a bit. The LLM can be a modeling peer in a way. But the phase where it goes to validate / implement is also key to my brain. I need to feel the details.
My think/create/solve focus is on making my agentic coding environment produce high quality code with the least cost. Seems like a technical challenge worth playing with.
It probably helps that I have 40 years of experience with producing code the old ways, including using punch cards in middle school and learning basic on a computer with no persistent storage when I was ten.
I think I've done enough time in the trenches and deserve to play with coding agents without shame.
Alternatively, another second order effect is that you can't sip a latte anymore because you're orchestrating 8 bots to do the work and you're back to 80%-100% time saturation.
So far in my career I have always had more requests coming in than implementations going out. If I can go 3 or 10 times faster, then I will still have plenty of work. Especially for the slew of ideas that are never even put in front of a dev, because they're already considered too low value to be built. Or the ideas that are so far fetched they were never considered feasible. I am not worried work will dry up.
What I believe is going to be interesting is what happens when non-engineers adopt building with agentic AI. Maybe 70 or 80% of their needs will be met without anyone else directly involved. My suspicion is that it will just create more work: making those generated apps work in a trustworthy manner, giving the agents more access to build context and make decisions, turning those one off generated apps into something maintainable, etc.
Exactly this. Even if right now you, bottom level wage earning grunt, get to lighten your workload for a fleeting second, sit back and enjoy the latte, it's only a fleeting second until the capital class tightens the screws.
Most people will get laid off and made redundant and those who remain are going to have to run faster than ever to produce wealth for the capital owners.
Yea, I don’t think that will be the case. Spreadsheets simplified the work of junior finance people who did all the work by hand before. But more people work in finance now than before.
Actually for me it was the opposite: before I wasn't able to play around and experiment in my free time that much, because I didn't have enough energy left to actualize the thoughts and ideas I have since I have a day job.
Now, since the bottleneck of moving the fingers to write code has gone down, I actually started to enjoy doing side projects. The mental stress from writing code has gone down drastically with Claude Code, and I feel the urge to create more nowadays!
you have a point.. i'm still confused about how this will affect jobs, markets
in a way a personal project is different from a job duty, here you're exploring, with fewer deadlines if any.. at work if I feel the llm is doing everything and I don't really master it, i risk my job and my skills rot.
There were always many mediocre engineers around, some of them even with fancy titles like "Senior," "Principal", and CTO.
We have always survived it, so probably we can also survive mediocre coders not reading the code the LLM generates for them because they are unable to see the problems that they were never able to see in their handwritten code.
Honestly it’s not that hard. I already coded less and less as part of my job as I got more senior and just didn’t have time, but it was still easy for me to do code reviews and fix bugs, or sit down and whip out a thousand lines in a power session. Once you learn it, it doesn’t take much practice to maintain it. A lot of traditional coding is very inefficient. With AI it’s like we’re moving from combustion cars to EVs, the energy efficiency is night and day, for doing the same thing.
That said, the next generation may struggle, but they’ll find their way.
It’s going to be extremely difficult if PR and code reviews do not prune unnecessary functions. From what I’m experiencing now, there’s a lot of additional code that gets generated.
In this case why can’t other agents just automate your job completely ? They are capable of that. What do you bring in the process of still doing manual organization ?
I still have to tell it what to do, and often how to do it. I manage its external memory and guidelines, and review implementation plans. I’m still heavily involved in software design and test coverage.
AI is not capable yet of automating my job completely – I anticipate this will happen within two years, maybe even this year (I’m an ML researcher).
No, I mean that my job in its current form – as an ML researcher with a phd and 15 years of experience - will be completely automated within two years.
Is the progress of LLMs moving up abstraction layers inevitable as they gather more data from each layer? First, we fed LLMs raw text and code, and now they are gathering our interactions with the LLM regarding generated code. It seems like you could then use the interactions to make an LLM that is good at prompting and fixing another LLM's generated code. Then it's on to the next abstraction layer.
What you described makes sense, and it's just one of the things to try. There are lots of other research directions: online learning, more efficient learning, better loss/reward functions, better world models from training on Youtube/VR simulations/robots acting in real world, better imitation learning, curriculum learning, etc. There will undoubtedly be architectural improvements, hardware improvements, longer context windows, insights from neuroscience, etc. There is still so much to research. And there are more AI researchers now than ever. Plus current AI models already make us (AI researchers) so much more productive. But even if absolutely no further progress is made in AI research, and foundational model development stops today, there's so much improvement to be made in the tooling around the models: agentic frameworks, external memory management, better online search, better user interactions, etc. The whole LLM field is barely 5 years old.
So your assumption is that it will ultimately be the users of software themselves who will throw some every day language at an AI and it will reliably generate something that meets those users' intuitive expectations?
Yes, it will be at least as reliable as an average software engineer at an average company (probably more reliable than that), or at least as reliable as a self-driving car where a user says get me to this address, and the car does it better (statistically) than an average human driver.
I think this could work for some tasks but not for others.
We didn't invent formal languages to give commands to computers. We invented them as a tool for thinking and communicating things that are hard to express in natural language.
I doubt that we will stop thinking and I doubt that it will ever be efficient to specify tasks purely in terms of natural language.
One of my first jobs as a software engineer was for a bank (~30 years ago). This bank manager wasn't a man of many words. He just handed us an Excel sheet as a specification for what he wanted us to implement.
My job right now is to translate natural English statements from my bosses/colleagues into natural English instructions for Claude. Yes, it takes skill and experience to do this effectively. But I don't see any reasons Gemini 4, Opus 5 or GPT-6 won't be able to do this just as well as I do.
I have enough savings for a few years, so I might just move to a lower COL area, and wait it out. Hopefully after the initial chaos period things will improve.
For someone at your position with your experience it’s quite depressing that your job is going to be automated. I feel quite anxious when I see young generations in my country that say themselves they are lazy about learning new things. The next generation will be useless to capitalist societies, in a sense that they won’t be able to bring value through administrative or white collar work. I hope some areas of the industry will move slowly toward AI
Everything you have said here is completely true, except for "not in that group": the cost-benefit analysis clearly favors letting these tools rip, even despite the drawbacks.
But it's also likely that these tools will produce mountains of unmaintainable code and people will get buried by the technical debt. It kind of strikes me as similar to the hubris of calling the Titanic "unsinkable." It's an untested claim with potentially disastrous consequences.
> But it's also likely that these tools will produce mountains of unmaintainable code and people will get buried by the technical debt.
It's not just likely, but it's guaranteed to happen if you're not keeping an eye on it. So much so, that it's really reinforced my existing prejudice towards typed and compiled languages to reduce some of the checking you need to do.
Using an agent with a dynamic language feels very YOLO to me. I guess you can somewhat compensate with reams of tests though. (which begs the question, is the dynamic language still saving you time?)
You can (and probably should) still do tests, but there's an entire class of errors you know can't happen, so you need far fewer tests, focusing only on business logic for the most part.
Static type checking is even faster than running the code. It doesn't catch everything, but if finding a type error in a fast test is good, then finding it before running any tests seems like it would be even better.
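To make the point above concrete, here is a minimal sketch in Python with type hints. The function names are hypothetical, and it assumes a static checker such as mypy is run over the file. The mismatch in `rarely_used_report` is the kind of error a checker is expected to flag without executing anything, while at runtime it only surfaces if that branch is actually hit.

```python
from __future__ import annotations


def total_cents(prices: list[int]) -> int:
    """Sum a list of prices given in integer cents."""
    return sum(prices)


def rarely_used_report(raw_fields: list[str]) -> int:
    """Hypothetical code path that tests may never exercise."""
    # Bug: the strings are never converted to int. A static checker is
    # expected to reject this call (list[str] passed where list[int] is
    # declared) before any test runs. Without one, the mistake only shows
    # up as a TypeError when this function is actually called.
    return total_cents(raw_fields)
```

The happy path (`total_cents([199, 250])`) works fine, which is exactly why a test suite that never calls `rarely_used_report` would stay green.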
I can provide evidence for your claim. The technical debt can easily snowball if the review process is not stringent enough to keep out unnecessary functions.
Oh I'm well aware of this. I admitted defeat in a way.. I can't compete. I'm just at a loss, and unless LLMs stall and break for some reason (ai bubble, enshittification..) I don't see a future for me in "software" in a few years.
Somehow I appreciate this type of attitude more than the one which reflects total denial of the current trajectory. Fervent denial and AI trash-talking being maybe the single most dominant sentiment on HN over the last year, by all means interspersed with a fair amount of amazement at our new toys.
But it is sad if good programmers should lose sight of the opportunities the future will bring (future as in the next few decades). If anything, software expertise is likely to be one of the most sought-after skills - only a slightly different kind of skill than churning out LOCs on a keyboard faster than the next person: People who can harness the LLMs, design prompts at the right abstraction level, verify the code produced, understand when someone has injected malware, etc. These skills will be extremely valuable in the short to medium term AFAICS.
But ultimately we will obviously become obsolete if nothing (really) catastrophic happens, but when that happens then likely all human labor will be obsolete too, and society will need to be organized differently than exchanging labor for money for means of sustenance.
If the world comes to that it will be absolutely catastrophic, and it’s a failure of grappling with the implications that many of the executives of AI companies think you can paper over the social upheaval with some UBI. There will be no controlling what happens, and you don’t even need to believe in some malicious autonomous AI to see that.
I get crazy over the 'engineers are not paid to write loc' line, nobody is sad because they don't have to type anymore. My two issues are that it levels the delivery game (for the average web app, anybody can now output something acceptable), and that it doesn't help me conceptualize solutions better, so I revert to letting it produce stuff that is not malleable enough.
I wonder about who "anybody can now output something acceptable" will hit most - engineers or software entrepreneurs.
Any implementation moat around rapid prototyping, and any fundraising moat around hiring a team of 10 to knock out your first few versions, seems gone now. Trying to sell MVP-tier software is real hard when a bunch of your potential customers will just think "thanks for the idea, I'll just make my own."
The crunch for engineers, on the other hand, seems to be that even if engineers are needed to "orchestrate the agents" and manage everything, there could be a feature-velocity barrier for the software that you can still sell (either internally or externally). Changing stuff more rapidly can quickly hit a point of limited ROI if users can't adjust, or are slowed by constant tooling/workflow churn. So at some point (for the first time in many engineers' careers, probably) you'll probably see product say "ok even though we built everything we want to test, we can't roll it all out at once!". But maybe what is learned from starting to roll those things out will necessitate more changes continually that will need some level of staffing still. Or maybe cheaper code just means ever-more-specialized workflows instead of pushing users to one-size-fits-all tooling.
In both of those cases the biggest challenge seems to be "how do you keep it from toppling down over time" which has been the biggest unsolved problem in consumer software development for decades. There's a prominent crowd right now saying "the agents will just manage it by continuing to hack on everything new until all the old stuff is stable too" but I'm not sure that's entirely realistic. Maybe the valuable engineering skills will be putting in the right guardrails to make sure that behavioral verification of the code is a tractable problem. Or maybe the agents will do that too. But right now, like you say, I haven't found particularly good results in conceptualizing better solutions from the current tools.
> your potential customers will just think "thanks for the idea, I'll just make my own."
yeah, and i'm surprised nobody talks about this much. prompting is not that hard, and some non software people are smart enough to absorb the necessary details (especially since the llm can tutor them on the way) and then let the loop produce the MVP.
> Or maybe cheaper code just means ever-more-specialized workflows instead of pushing users to one-size-fits-all tooling.
The future is either a language model trained on AI code bloats and the ways to optimize the bloat away
OR,
something like Mercor, currently getting paid really well by Meta, OpenAI, Anthropic and Gemini to pay very smart humans really well to proof language model outputs.
i'm sorry if I pulled everybody down .. but it's been many months since gemini and claude became solid tools, and regularly i have this strong gut feeling. i tried reevaluating my perception of my work, goals, value .. but i keep going back to nope.
After a multi-decade career that spanned what is rapidly seeming like the golden age of software development, I have two emotions: first gratefulness; second a mixture of resignation, maudlin reflection, and bitterness that I am fighting hard to resist.
As someone who’s always wanted to “get home and code something on my own”, I do have a glimmer of hope that I wonder if others share. I’ve worked extensively with Claude and there’s no question I am now a high velocity “builder” and my broad experience has some value here. I am sad that I won’t be able to deeply look at all the code I am producing, but I am making sure the LLM and I structure things so that I could eventually dig in to modules if needed (unlikely to happen I suppose).
Anyway, my hope/question: if I embrace my new role as fast system builder and I am creative in producing systems that solve real problems “first”, is there a path to making that a career (I.e. 4 friends and I cranking out real production software that’s filling a real niche)? There must be some way for this to succeed —- I am not yet buying the “everything will be instantly copyable and so any solution is instantly commodity” argument. If that’s true, then there is no hope. I am still in shape, though, so going pro in pickleball is always an option, ha ha.
Unfortunately you aren't a high velocity builder. The velocity curve has now shifted and everyone having Claude blast out loc after loc is now a high velocity builder. And when everyone is a high velocity builder...nobody is.
Fair point, but my hope is that the creativity involved in deciding what to build, with the choice informed by engineering experience (the project/value will not be obvious to everyone) will allow differentiation.
"creativity involved in deciding what to build, with the choice informed by engineering experience (the project/value will not be obvious to everyone) will allow differentiation."
How? Anyone upon seeing your digital product can just prompt the same thing in no time. If you can prompt it, I can prompt it and so can a million other people.
Nobody whether an individual or business holds any uniqueness or advantage to themselves. All careers and skill sets are leveled and worthless. Implementation skills are worthless. Creativity is worthless.
Agree on data value, but as mentioned above I am not yet buying the “everything will be instantly copyable and so any solution is instantly commodity” argument … crud web-app sure, something with significant back-end complexity or a multi-service systems level solution, not so much. Perhaps optimistic, admittedly. Cheers.
I hear you. And maybe you're right. Maybe I'm deluding myself, but: when I look at my skilled colleagues who vibecode, I can't understand how this is sustainable. They're smart people, but they've clearly turned off. They can't answer non-trivial questions about the details of the stuff they (vibe-)delivered without asking the LLM that wrote it. Whoever uses the code downstream isn't gonna stand (or pay!) for this long-term! And the skills of the (vibe-)authors will rapidly disappear.
Maybe I'm just as naive as those who said that photographs lack the soul of paintings. But I'm not 100% convinced we're done for yet, if what you're actually selling is thinking, reasoning and understanding.
The difference with a purely still photograph is that code is a functional encoding of an intention. Code from an LLM could be perfect and still not encode the perfect intention of the product. I’ve seen that on many occasions.
Many people don’t understand what code really is about and think they have a printer toy now and we don’t have to use pencils.
That’s not at all the same thing.
Code is intention, logic, specific use case all at once. With a non deterministic system and vague prompting there will be misinterpreted intentions from LLM because the model makes decisions to move forward. The problem is the scale of it, we’re not talking about 1000 loc. In a month you can generate millions of loc, in a year hundreds of millions of loc.
Some will have to crash and burn their company before they realize that having no human at all in the loop is nonsense. Let them touch fire and make up their mind I guess.
> Code is intention, logic, specific use case all at once. With a non deterministic system and vague prompting there will be misinterpreted intentions from LLM because the model makes decisions to move forward. The problem is the scale of it, we’re not talking about 1000 loc. In a month you can generate millions of loc, in a year hundreds of millions of loc.
People are also non deterministic. When I delegate work to team of five or six mid level developers or God forbid outsourced developers, I’m going to have to check and review their work too.
It’s been over a decade since my vision/responsibility could be carried out by just my own two hands and be done on time within 40 hours a week - until LLMs.
People are indeed not deterministic. But they are accountable. In the legal sense, of course, but more importantly, in an interpersonal sense.
Perhaps outsourcing is a good analogy. But in that case I'd call it outsourcing without accountability. LLMs feel more like an infinite chain of outsourcing.
As a former tech lead and now staff consultant who leads cloud implementations + app dev, I am ultimately responsible for making sure that projects are done on time, on budget and meet requirements. Neither my manager nor the customer would allow me to say it’s one of my team members’ fault that something wasn’t done correctly, any more than I could say don’t blame me, blame Codex.
I’ve said repeatedly over the past couple of days that if a web component was done by someone else, it might as well have been created by Claude, I haven’t done web development in a decade. If something isn’t right or I need modifications I’m going to either have to Slack the web developer or type a message to Claude.
Ofc people are non deterministic. But usually we expect machines to be. That’s why we trust them blindly and don’t check the calculations. We review people’s work all the time though.
Here people will stop reviewing LLM-generated code, as it’s kind of a source of truth like in other areas. That’s my point: reviewing code takes time, and even more time when no human wrote it. It’s a dangerous path to stop reviews because of trust in the machine, now that the machine is just kind of like humans, non deterministic.
No one who has any knowledge or who has ever used an LLM expects determinism.
And there are no computer professionals who haven’t heard about hallucinations.
Reviewing whether the code meets requirements through manual and automated tests - and that’s all I cared about when I had a team of 8 under me - is the same regardless. I wasn’t checking whether John used a for loop or while loop in between my customer meetings and meetings with the CTO. I definitely wasn’t checking the SOQL (not a typo) of the Salesforce consultants we hired. I was testing inputs and outputs and UX.
Having a team of 8 people producing code is manageable. Having an AI with 8 agents that write code all day long is not the same volume: it can generate more code in a day than one person can review in a week.
What you're saying is that product teams will prompt what they want to a framework, and the framework will take care of spec analysis, development, reviews, compliance with spec. Product teams with QA will make sure the delivery is functionally correct.
No humans need to make sure of anything code related.
What we don’t know yet is: will AI still produce solid code through the years? It’s all statistical analysis, and with the volume of millions of loc, the refactoring needed, data migrations etc, what will happen?
For context, I just started using coding agents - Codex CLI and Claude Code - in October. Once I saw that you had to be billed by use, I decided I wasn’t going to use my own money for it when it’s for a company.
Two things changed - Codex CLI now lets you use it with your $20 a month subscription and I have never run into quota issues with it, and my employer signed up for the enterprise version of Claude and we each have an $800 a month allowance.
My argument though is “why should I care about the code?” for the most part. If I were outsourcing a project or delegating it to a team lead, I would be asking high level architectural, security and scalability questions.
AI generated the code, AI maintains the code. I am concerned about abstractions and architecture.
You shouldn’t have to maintain or refactor “millions of lines of code”. If your code is well modularized with clean interfaces, making a change for $x7 may mean making a change for $x1…$x6, but you should still be working locally in one module at a time. You should do the same for the benefit of coders. Heck my little 5 week project has three independently deployable repos in a root folder. My root Agents file just has a summary of how all three relate via a clean interface.
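As a rough illustration of "working locally in one module behind a clean interface", here is a small Python sketch. Everything in it (`DocumentStore`, `InMemoryStore`, `LoggingStore`, `ingest`) is a hypothetical name invented for the example, not something from the project described here. The caller depends only on the interface, so either implementation can be changed or replaced without touching `ingest`.

```python
from __future__ import annotations

from typing import Protocol


class DocumentStore(Protocol):
    """The interface: the only thing callers are allowed to know about."""

    def put(self, key: str, body: str) -> None: ...
    def get(self, key: str) -> str: ...


class InMemoryStore:
    """One module's implementation, backed by a plain dict."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def put(self, key: str, body: str) -> None:
        self._items[key] = body

    def get(self, key: str) -> str:
        return self._items[key]


class LoggingStore:
    """A second implementation that wraps another store and records writes."""

    def __init__(self, inner: DocumentStore) -> None:
        self._inner = inner
        self.written_keys: list[str] = []

    def put(self, key: str, body: str) -> None:
        self.written_keys.append(key)
        self._inner.put(key, body)

    def get(self, key: str) -> str:
        return self._inner.get(key)


def ingest(store: DocumentStore, docs: dict[str, str]) -> int:
    """Caller code: written against the interface, not an implementation."""
    for key, body in docs.items():
        store.put(key, body)
    return len(docs)
```

Swapping `InMemoryStore()` for `LoggingStore(InMemoryStore())` changes nothing in `ingest`, which is the property that keeps a change local to one module.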
In the project I am working on now, besides “does it meet the requirements”, I care about security, scalability, concurrency, user experience for the end user, user experience for the operations folks when they need to make config changes, and user experience for any developers who have to make changes long after I’m off this project. I haven’t looked at a single line of code - besides the CloudFormation templates. But I can answer any architectural question about any of it. The architecture and abstractions were designed by me and dictated to the agents
On this particular project, on the coding level, there is absolutely nothing that application code like this can do that could be insecure except hypothetically embed AWS credentials into the code. But it can’t do that either since it doesn’t have access to it [1].
In this case security posture comes from the architecture - S3 block public access, well scoped IAM roles, not running “in a VPC”. Things I am checking in the infrastructure as code and I was very specific about.
The user experience has to come from design and checking manually.
I mentioned earlier that my first stab at it scaled poorly. This was caused by my design and I suspected it would beforehand. But building the first version was so fast because of AI tools, I felt no pain in going with my more architecturally complicated plan B and throwing the first version away. I wouldn’t have known that by looking at the code. The code was fine; it was the underlying AWS service. I could only know that by throwing 100K documents at it instead of 1000.
I designed a concurrent locking mechanism that had a subtle flaw. When I threw the code into ChatGPT in thinking mode, it immediately found it. I might have been better off just telling the coding agents “design a locking mechanism for $x” instead of detailing it.
Even maintainability was helped because I knew I or anyone else who touched it was probably going to be using an LLM. From the get go I threw the initial contract, the discovery sessions transcripts, the design diagrams, the review of the design diagrams, my project plan and breakdown into ChatGPT and told it to render a detailed markdown file of everything - that was the beginning of my AGENTS.md file.
I asked both Codex and Claude to log everything I was doing and my decisions into separate markdown files.
Any new developer could come into my repo, fire up Claude and it wouldn’t just know what was coded, it would have full context of the project from the initial contract through to the delivery
[1] code running on AWS never explicitly has to worry about AWS credentials, the SDKs can find the information by themselves by using the credentials of the IAM role attached to the EC2 instance, Lambda, Docker container, etc.
Even locally you should be getting temporary credentials that are assigned to environment variables that the SDK retrieves automatically.
Okay - and the person ultimately leading the team is still responsible for it whether you are delegating to more junior developers or AI. You’re still reviewing someone else’s code based on your specs
I have this nagging feeling I’m more and more skimming text, not just what the LLMs output, but all types of text. I’m afraid people will get too lazy to read, when the LLM is almost always right. Maybe it’s a silly thought. I hope!
People will say "oh, it's the same as when the printing press came, people were afraid we'd get lazy from not copying text by hand", or any of a myriad of other innovations that made our lives easier. I think this time it's different though, because we're talking about offloading the very essence of humanity – thinking. Sure, getting too lazy to walk after cars became widespread was detrimental to our health, but if we get too lazy to think, what are we?
there are some youtube videos about the topic, be it pupils in high school addicted to llms, or adults losing skills, and not devs only, society is starting to see strange effects
I feel the same. And I expect even a lot of the early adopters and AI enthusiasts are going to find themselves on the short end of the stick sooner rather than later.
I've already seen this play out. The lazies on our floor were all crazy about AI because they could finally work less and finish their tasks. Until they realized that they were visibly replaceable now. The motto in team chats is "we'll lie about the productivity gains to management, just say 10% but with lots of caretaking" now
Imagine everyone who is in less technical or skilled domains.
I can't help but resist this line of thinking as a result. If the end is nigh for us, it's nigh for everyone else too. Imagine the droves of less technical workers in the workforce who will be unseated before software engineers. I don't think it is tenable for every worker in the first world to become replaced by a computer. If an attempt at this were to occur, those smart unemployed people would be a real pain in the ass for the oligarchs.
I think you have every right to doubt those telling us that they run 5 agents to generate a new SAAS-product while they are sipping latté in a bar. To work like that I believe you'll have to let go of really digging into the code, which in my experience is needed if you want good quality.
Yet I think coding agents can be quite a useful help for some of the trivial, but time consuming chores.
For instance I find them quite good at writing tests. I still have to tweak the tests and make sure that they do as they say, but overall the process is faster IMO.
They are also quite good at brute-forcing some issue with a certain configuration in a dark corner of your android manifest. Just know that they WILL find a solution even if there is none, so keep them on a leash!
Today I used Claude for bringing a project I abandoned 5 years ago up to speed. It's still a work in progress, but the task seemed insurmountable (in my limited spare time) without AI, now it feels like I'm half-way there in 2-3 hours.
I think we really need to have a serious think of what is "good quality" in the age of coding agents. A lot of the effort we put into maintaining quality has to do with maintainability, readability etc. But is it relevant if the code isn't for humans? What is good for a human is not what is good for an AI necessarily (not to say there is no overlap). I think there are clearly measurable things we can agree still apply around bugs, security etc, but I think there are also going to be some things we need to just let go of.
The implication of your statement seems to me to be: "you'll never have to directly care about it yourself, so why do you care about it?". Unless you were talking about the codebase in a user-application relationship, in which case feel free to ignore the rest of my post.
I don't believe that the code will become an implementation detail, ever. When all you do is ship an MVP to demonstrate what you're building then no one cares, before or after LLM assistance. But any codebase that lives more than a year and serves real users while generating revenue deserves to have engineers who know what's happening beyond authoring markdown instructions to multiple agents.
Your claim seems to push us towards a territory where externalizing our thought processes to a third party is the best possible outcome for all parties, because the models will only get better and stay just as affordable.
I will respond to that by pointing out that models that are ultimately flawless in code generation will be worth a fortune in terms of added value, and any corporation that wins the arms race will actually be killing itself by not raising the cost of access to its services by a metric ton. This is because there will be few LLM providers that are actually worth it by then, and because oligopoly is a thing.
So no. I don't expect that we'll ever reach a point where the average person will be "speaking forth" software the same way they post on Reddit, without paying cancer treatment levels of money.
But even if it's actually affordable... Why would I ever want to use your app instead of just asking an LLM to make me one from scratch? No one seems to think about that.
i've been building agent tooling for a while and this is the question i keep coming back to. the actual failure mode isn't messy code, agents produce reasonably clean, well-typed output these days. it's that the code confidently solves a different problem than what you intended. i've had an agent refactor an auth flow that passed every test but silently dropped a token refresh check because it "simplified" the logic. clean code, good types, tests green, security hole. so for me "quality" has shifted from cyclomatic complexity and readability scores to "does the output behaviour match the specification across edge cases, including the ones i didn't enumerate." that's fundamentally an evaluation problem, not a linting problem.
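A rough sketch of the failure mode described above, not the actual auth flow from that comment: every name here (`Token`, `refresh`, `get_token_original`, `get_token_simplified`, `check_flow`) is hypothetical and made up for illustration. The point is that a happy-path test stays green for both versions, and only a check on the expired-token edge case shows the behaviour changed.

```python
from dataclasses import dataclass


@dataclass
class Token:
    value: str
    expires_at: float  # seconds since epoch


def refresh(token: Token, now: float) -> Token:
    # Hypothetical refresh: issue a new token valid for one hour.
    return Token(value=token.value + "-refreshed", expires_at=now + 3600)


def get_token_original(token: Token, now: float) -> Token:
    # Original flow: refresh when the token has expired.
    if token.expires_at <= now:
        return refresh(token, now)
    return token


def get_token_simplified(token: Token, now: float) -> Token:
    # "Simplified" refactor: the expiry check was silently dropped.
    return token


def is_valid(token: Token, now: float) -> bool:
    return token.expires_at > now


def check_flow(get_token) -> dict:
    # Evaluate behaviour on the happy path and on the expired-token edge case.
    now = 1000.0
    fresh = Token("abc", expires_at=now + 60)
    expired = Token("abc", expires_at=now - 1)
    return {
        "happy_path": is_valid(get_token(fresh, now), now),
        "expired_token": is_valid(get_token(expired, now), now),
    }


original_results = check_flow(get_token_original)
simplified_results = check_flow(get_token_simplified)
```

If the suite only contained the `happy_path` case, both versions would look identical. The difference only shows up when the evaluation enumerates the edge case.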
This is where I think it's going, it feels that in the end we will end up with an "llm" language, one that is more suited to how an llm works and less human.
You can’t drop anything as long as a programmer is expected to edit the source code directly. Good luck investigating a bug when the code is unclear semantically, or updating a piece correctly when you’re not really sure it’s the only instance.
I think that's the question. Is a programmer expected to ever touch the source code? Or will AI -- and AI alone -- update the code that it generated?
Not entirely unlike other code generation mechanisms, such as tools for generating HTML based on a graphical design. A human could edit that, but it may not have been the intent. The intent was that, if you want a change, go back to the GUI editor and regenerate the HTML.
> Not entirely unlike other code generation mechanisms, such as tools for generating HTML based on a graphical design. A human could edit that, but it may not have been the intent. The intent was that, if you want a change, go back to the GUI editor and regenerate the HTML.
We largely moved back away from "work in a graphic tool then spit out HTML from it" because it wasn't robust for the level of change/iteration pace, this wasn't exactly my domain but IIRC there were especially a lot of problems around "small-looking changes are now surprisingly big changes in the generated output that have a large blast radius in terms of the other things (like interactivity) we've added in."
Any time you do a refactor that changes contract boundaries between functions/objects/models/whatever, and you have to update the tests to reflect this, you have a big risk of your new tests not covering exactly the same set of component interactions that your old tests did. LLM's don't change this. They can iterate until the tests are green, but certain changes will require changing the tests, and now "iterating until the tests are green" could be resolved by changing the tests in a way that subtly breaks surprising user-facing things.
The value of good design in software is having boundaries aligned with future desires (obviously this is never perfect foresight) to minimize that risk. And that's the scary thing to me about not even reading the code.
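One way to hedge against the risk described above is to snapshot the old behaviour over a fixed set of inputs before the refactor and diff it afterwards, independently of whatever tests got rewritten along the way. This is only a minimal sketch under made-up assumptions: `price_old`, `price_new`, `snapshot` and the discount rule are hypothetical, not taken from any real codebase.

```python
def price_old(quantity: int, unit_cents: int) -> int:
    # Old contract: a 10% bulk discount applies at 10 or more units.
    total = quantity * unit_cents
    if quantity >= 10:
        total = total * 90 // 100
    return total


def price_new(quantity: int, unit_cents: int) -> int:
    # Refactored version: the boundary drifted from >= 10 to > 10.
    total = quantity * unit_cents
    return total * 90 // 100 if quantity > 10 else total


def snapshot(fn, cases):
    # Record the observed output for every input case.
    return {case: fn(*case) for case in cases}


# Inputs chosen around the contract boundary, fixed before the refactor.
cases = [(1, 100), (9, 100), (10, 100), (11, 100)]
baseline = snapshot(price_old, cases)
candidate = snapshot(price_new, cases)

# Any entry here is a behaviour change, whether or not the tests are green.
diffs = {c: (baseline[c], candidate[c]) for c in cases if baseline[c] != candidate[c]}
```

The snapshot cases are written against the old code, so "iterating until green" cannot quietly redefine them. It only helps for the inputs someone thought to include, which is the same foresight problem again.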
So like we went from assembler to higher level programming languages, we will now move to specifications for LLMs? Interesting thought... Maybe, once the "compilers" get good enough, but for mission critical systems they are not nearly good enough yet.
Right. I work in aerospace software, and I do not know if this option would ever be on the table. It certainly isn't now.
So I think this question needs to be asked in the context of particular projects, not as an industry-wide yes or no answer. Does your particular project still need humans involved at the code level? Even just for review? If so, then you probably ought to retain human-oriented software design and coding techniques. If not, then, whatever. Doesn't matter. Aim for whatever efficiency metric you like.
It’s also pretty close to Steve Jobs' initial vision of computing in the future (https://stevejobsarchive.com/stories/objects-of-our-life, 1983) but my point is that whatever it is we call AI now became reality so much faster than anyone really saw coming. Even if the pace slows down, and it hasn't yet, things are improving so massively all the time that the world can’t keep up changing to accommodate.
> I think you have every right to doubt those telling us that they run 5 agents to generate a new SAAS-product while they are sipping latté in a bar. To work like that I believe you'll have to let go of really digging into the code, which in my experience is needed if you want good quality.
Also we live in a capitalist society. The boss will soon ask: "Why the fuck am I paying you to sip a latte in a bar while a machine does your work? Use all your time to make money for me, or you're fired."
AI just means more output will be expected of you, and they'll keep pushing you to work as hard as you can.
> AI just means more output will be expected of you, and they'll keep pushing you to work as hard as you can.
That’s a bit too cynical for me. After all, yes, your boss is not paying you for sipping lattes, but for producing value for the company. If there is a tool that maximises your output, why wouldn’t he want you to use that to great efficiency?
Put differently, would a carpenter shop accept employees rejecting the power saw in favour of a hand saw to retain their artisanal capability?
> why wouldn’t he want you to use that to great efficiency
Because I deny that? It's not fun for me.
> would a carpenter shop accept employees rejecting the power saw in favour of a hand saw to retain their artisanal capability?
Why not? If that makes enough money to keep going.
You might argue that in theoretical ideal market companies who're not utilizing every possible trick to improve productivity (including AI) will lose competition, but let's be real, a lot of companies are horribly inefficient and that does not make them bankrupt. The world of producing software is complicated.
I know that I deliver. When I'm asked to write code, I deliver it and I'm responsible for it. I enjoy the process and I can support this code. I can't deliver with AI. I don't know what it'll generate. I don't know how much time it would take to iterate to the result that I precisely want. So I can no longer be responsible for my own output. Or I'd spend more time baby-sitting AI than it would take me to write the code. That's my position. Maybe I'm wrong, they'll fire me and I'll retire, who knows. AI hype is real and my boss is often copy&pasting ChatGPT output, asking me to argue with it. That's super stupid and irritating.
I started this career because I liked writing code. I no longer write a lot of code as a lead, but I use writing code to learn, to gain a deeper understanding of the problem domain etc. I'm not the type who wants to write specs for every method and service but rather explore and discover and draft and refactor by... well, coding. I'm amazed at creating and reading beautiful, stylish, working code that tells a story.
If that's taken away, I'm not sure how I could retain my interest in this profession. Maybe I'll need to find something else, but after almost a decade this will be a hard shift.
I totally empathise as a fellow developer, but I doubt you realise what an incredibly privileged position it is to just refuse working if you don't have fun doing it. And it doesn't really make for a convincing argument to keep you employed either.
> Why not? If that makes enough money to keep going.
If all other competing carpenters use power tools, you're going to lose contracts. We've had a few incredibly easy decades as software developers where market pressure wasn't really a thing, but that is about to change when the cost of producing code drops considerably.
> You might argue that in theoretical ideal market companies who're not utilizing every possible trick to improve productivity (including AI) will lose competition […]
You're moving the goalposts here. We're not talking about wringing every last drop of efficiency out of employees. We're talking about businesses not tolerating paying for licenses for AI agents to enable developers to sip Lattés while their computer does their job. That's a fundamentally different proposition.
> That’s a bit too cynical for me. After all, yes, your boss is not paying you for sipping lattes, but for producing value for the company. If there is a tool that maximises your output, why wouldn’t he want you to use that to great efficiency?
Sitting in a cafe enjoying a latte is not "producing value for the company." If having "5 agents to generate a new SAAS-product" matches your non-AI capacity and gives you enough free time to relax in a cafe, he's going to want you to run 50 agents generating 5 new SAAS products, until you hit your capacity.
If he doesn't need 5 new SAAS products, just one, then he's going to fire you or other members of your team.
Think of it this way: you're a piece of equipment to your boss, and every moment he lets you sit idle (on the clock) is money lost. He wants to run that piece of equipment as hard as he can, to maximize his profit.
I still do this, but when I'm reviewing what's been written and / or testing what's been built.
How I see it is we've reverted back to a heavier spec type approach, however the turn around time is so fast with agents that it still can feel very iterative simply because the cost of bailing on an approach is so minimal. I treat the spec (and tests when applicable) as the real work now. I front load as much as I can into the spec, but I also iterate constantly. I often completely bail on a feature or the overall approach to a feature as I discover (with the agent) that I'm just not happy with the gotchas that come to light.
AI agents to me are a tool. An accelerator. I think there are people who've figured out a more vibey approach that works for them, but for now at least, my approach is to review and think about everything we're producing, which forms my thoughts as we go.
Historically software engineering has been seen as "assembly line" work by a lot of people (see all the efforts to outsource it through spec handoffs and waterfall through the years) but been implemented in practice as design-as-you-build (nobody anticipates all the questions or edge cases in advance, software specs are often an order of magnitude simpler than the actual number of branches in the code).
For mission-critical applications I wonder if making "writing the actual code" so much cheaper means that it would make more sense to do more formal design up front instead, when you no longer have a human directly in the loop during the writing of the code to think about those nasty pops-up-on-the-fly decisions.
> software specs are often an order of magnitude simpler than the actual number of branches in the code
Love this! Be it design specs or a mock from the designer. So many unaccounted for decisions. Good devs will solve many on their own, uplevel when needed, and provide options.
And absolutely it means more design up front. And without human in the direct loop, maybe people won’t skimp on this!
I also second this. I find that I write better by hand, although I work on niche applications, not really standard crud or react apps. I use LLMs in the same way I used to use stack overflow, if I go much farther to automate my work than that I spend more time on cleanup compared to if I just write code myself.
Sometimes the AI does weird stuff too. I wrote a texture projection for a nonstandard geometric primitive, the projection used some math that was valid only for local regions… long story. Claude kept on wanting to rewrite the function to what it thought was correct (it was not) even when I directed to non related tasks. Super annoying. I ended up wrapping the function in comments telling it to f#=% off before it would leave it alone.
> I use LLMs in the same way I used to use stack overflow, if I go much farther to automate my work than that I spend more time on cleanup compared to if I just write code myself.
yea, same here.
i've asked an ai to plan and setup some larger non straight forwards changes/features/refactorings but it usually devolves into burning tokens and me clicking the 'allow' button and re-clarifying over and over when it keeps trying to confirm the build works etc...
when i'm stuck though, or when im curious of some solution it usually opens the way to finish the work similar to stack overflow
Using AI or writing your own code isn't an xor thing. You can still write the code but have a coding assistant or something an alt/cmd-tab away. I enjoy writing code, it relaxes me so that's what I do but when I need to look something up or i'm not clear on the syntax for some particular operation instead of tabbing to a browser and google.com I tab to the agent and ask it to take a look. For me, this is especially helpful for CSS and UI because I really suck at and dislike that part of development.
I also use these things to just plan out an approach. You can use plan mode for yourself to get an idea of the steps required and then ask the agent to write it to a file. Pull up the file and then go do it yourself.
>but the code to me is a forcing mechanism into ironing out the details, and I don't get that when I'm writing a specification.
This is so on point. The spec as code people try again and again. But reality always punches holes in their spec.
A spec that wasn't exercised in code, is like a drawing of a car, no matter how detailed that drawing is, you can't drive it, and it hides 90% of the complexity.
To me the value of LLMs is not so much in the code they write. They're usually too verbose, and start building weird things when you don't constantly micromanage them.
But you can ask very broad questions, iteratively refine the answer, critique what you don't like. They're good as a sounding board.
I love using LLMs as well as rubber ducks - what does this piece of code do? How would you do X with Y? etc.
The problem is that this spec-driven philosophy (or hype, or mirage...) would lead to code being entirely deprecated, at least according to its proponents. They say that using LLMs as advisors is already outdated, we should be doing fully agentic coding and just nudge the LLM etc. since we're losing out on 'productivity'.
>They say that using LLMs as advisors is already outdated, we should be doing fully agentic coding and just nudge the LLM etc. since we're losing out on 'productivity'.
As long as "they" are people that either profit from FOMO or bad developers that still don't produce better software than before, I'm ok ignoring the noise.
In 1987 when I first started coding, I would either write my first attempt in BASIC and see it was too slow and rewrite parts in assembly or I would know that I had to write what I wanted from the get go in assembly because the functionality wasn’t exposed at all in BASIC (using the second 64K of memory or using double hires graphics).
This past week, I spent a couple of days modifying a web solution written by someone else + converting it from a Terraform based deployment to CloudFormation using Codex - without looking at the code, as someone who hasn’t done front end development in a decade - I verified the functionality.
More relevantly but related, I spent a couple of hours thinking through an architecture - cloud + an Amazon managed service + infrastructure as code + actual coding, diagramming it, labeling it, and thinking about the breakdown and phases to get it done. I put all of the requirements - that I would have done anyway - into a markdown file and told Claude and Codex to mark off items as I tested each item and summarize what it did.
Looking at the amount of work, between modifying the web front end and the new work, it would have taken two weeks with another developer helping me before AI based coding. It took me three or four days by myself.
The real kicker though is while it worked as expected for a couple of hundred documents, it fell completely to its knees when I threw 20x documents into the system. Before LLMs, this would have made me look completely incompetent telling the customer I now wasted two weeks worth of time and 2 other resources.
Now, I just went back to the literal drawing board, rearchitected it, did all of the things with code that the managed services abstracted away with a few tweaks, created a new markdown file and was done in a day. That rework would have taken me a week by itself. I knew the theory behind what the managed service was doing. But in practice I had never done it.
It’s been over a decade since I was responsible for a delivery that I could do by myself without delegating to other people, or that was simple enough that I wouldn’t start with a design document for my own benefit. Now within the past year, I can take on larger projects by myself without the coordination/“Mythical Man-Month” overhead.
I can also in a moment of exasperation say to Codex “what you did was an over complicated stupid mess, rethink your implementation from first principles” without getting reported to HR.
There is also a lot of nice to have gold plating that I will do now knowing that it will be a lot faster
I dunno. On the one hand, I keep hearing anecdata, including hackernews comments, friends, and coworkers, suggesting that AI-assisted coding is a literal game changer in terms of productivity, and if you call yourself a professional you'd better damn well lock the fuck in and learn the tools. At the extreme end this takes the form of, you're not a real engineer unless you use AI because real engineering is about using the optimal means to solve problems within time, scale, and budget constraints, and writing code by hand is now objectively suboptimal.
On the other hand, every time the matter is seriously empirically studied, it turns out that overall:
* productivity gains are very modest, if not negative
* there are considerable drawbacks, including most notably the brainrot effect
Furthermore, AI spend is NOT delivering the promised returns to the extent that we are now seeing reversals in the fortunes of AI stocks, up to and including freakin' NVIDIA, as customers cool on what's being offered.
So I'm supposed to be an empiricist about this, and yet I'm supposed to switch on the word of a "cool story bro" about how some guy built an app or added a feature the other day that he totally swears would have taken him weeks otherwise?
I'm like you. I use code as a part of my thought process for how to solve a problem. It's a notation for thought, much like mathematical or musical notation, not just an end product. "Programs must be written for people to read, and only incidentally for machines to execute." I've actually come to love documenting what I intend to do as I do it, esp. in the form of literate programming. It's like context engineering the intelligence I've got upstairs. Helps the old ADHD brain stay locked in on what needs to be done and why. Org-mode has been extremely helpful in general for collecting my scatterbrained thoughts. But when I want to experiment or prove out a new technique, I lean on working directly with code an awful lot.
That's because many developers are used to working like this.
With AI, the correct approach is to think more like a software architect.
Learning to plan things out in your head upfront without having to figure things out while coding requires a mindset shift, but is important to work effectively with the new tools.
To some this comes naturally, for others it is very hard.
I think what GP is referring to are technical semantics and accidental complexity. You can’t plan for those.
The same kind of planning you’re describing can and does happen sans LLM, usually on the sofa, or in front of a whiteboard. Or by reading some research materials. No good programmer rushes to coding without a clear objective.
But the map is not the territory. A lot of questions surface during coding. LLMs will guess and the result may be correct according to the plan, but technically poor, unreliable, or downright insecure.
I don't think any complex plan should be planned in your head. But drawing diagrams, sketching components, listing pros and cons, 100%. Not jumping directly into coding might look more like jumping into writing a spec or a PoC.
Maintaining a 'mental RAM Cache' is a powerful tool to understanding the system as a whole on a deep and intuitive level, even if you can only 'render' sections at a time. The bigger it is the more you can keep track of to be able to foresee interactions between distant pieces.
It shouldn't be your only source of a plan as you'd likely wind up dropping something, but figuring out how to jiggle things around before getting it 'on paper' is something I've found helpful.
Following the RAM analogy, this sounds like saving files only in RAM, instead of creating the files in the file system, persisted on disk, and then caching it in RAM.
Personally, for me without writing or sketching I cannot think complex things: as in complex logic, constraints, etc.
I guess this topic is too abstract, so we can read different things into it.
> A lot of how I form my thoughts is driven by writing code, and seeing it on screen, running into its limitations.
If you need that, don't use AI for it. What is it that you don't enjoy coding, or think is tangential to your thinking process? Maybe while you focus on the code have an agent build a testing pipeline, or deal with other parts of the system that are not very ergonomic or need some cleanup.
this is the right answer, but many companies mandate to use ai (burn x tokens and y percent of code) now, so people are bound to use it where it might not fit
I was just thinking this the other day after I did a coding screen and didn't do well. I know the script for the interviewee is you're not supposed to write any code until you talk through the whole thing, but I think I would have done better if I could have just written a bunch of throwaway code to iterate on.
> A lot of how I form my thoughts is driven by writing code, and seeing it on screen, running into its limitations.
Two principles I have held for many years which I believe are relevant both to your sentiment and this thread are reproduced below. Hopefully they help.
First:
When making software, remember that it is a snapshot of
your understanding of the problem. It states to all,
including your future-self, your approach, clarity, and
appropriateness of the solution for the problem at hand.
Choose your statements wisely.
And:
Code answers what it does, how it does it, when it is used,
and who uses it. What it cannot answer is why it exists.
Comments accomplish this. If a developer cannot be bothered
with answering why the code exists, why bother to work with
them?
To your first point - so are my many markdown files that I tell Codex/Claude to keep updated while I’m doing my work including telling them to keep them updated with why I told them to do certain things. They have detailed documentation of my initial design goals and decisions that I wrote myself.
Actually those same markdown files answer the second question.
> If a developer cannot be bothered with answering why the code exists, why bother to work with them?
Most people can't answer why they themselves exist, or justify why they are taking up resources rather than eating a bullet and relinquishing their body-matter.
According to the philosophy herein, they are therefore worthless and not worth interacting with, right?
I think of it differently: I’ve been coding so long that ironing out the details and working through the specification with AI comes extremely naturally. It’s like how I would talk to a colleague and iterate on their work.
However, the quality of the code produced by LLMs needs to be carefully managed to ensure it’s of a high standard. That’s why I formalized a system of checks and balances for my agentic coding that contains architectural guidelines as well as language-specific taste advice.
I liken it to manual versus automated industrial production. I think manual coding will always have its place, just like how there are still people who craft things by manual labor, whether it’s woodworkers only using manual tools or blacksmiths who still manually stoke coke fires and produce very unique and custom products; vs the highly automated production lines we have that produce acceptable forms of something efficiently, and so many of them that many people can have them.
Some people like to lay the brick, some people like to draw the blueprints. I don’t think there is anything wrong with not subscribing to this onslaught of AI tooling, doing the hard work is rewarding. Whether AI will become a standard in how code is written in the future is still to be determined, and I think there is a real chance that is where it goes, but it shouldn’t hinder your love for doing what you do.
100%. To me the real question is whether all the bother getting the agents to not waste time nets out to real gains, or perceived gains (while possibly even losing efficiency).
It's not at all clear to me which is true given the level of hype and antipathy out there. I'm just going to watch and wait, and experiment cautiously, till it's more clearcut.
Are there still people under the impression that the correct way to use Stack Overflow all these years was to copy & paste without analyzing what the code did and making it fit for purpose?
If I have to say, we're just waiting for the AI concern caucus to get tired of performing for each other and justifying each other's inaction in other facets of their lives.
> A lot of how I form my thoughts is driven by writing code, and seeing it on screen, running into its limitations.
I completely agree but my thought went to how we are supposed to estimate work just like that. Or worse, planning poker where I'm supposed to estimate work someone else does.
i go back and forth on this. when i'm working on something where the hard part is the actual algorithm, say custom scheduling logic or a non-trivial state machine, i need my hands in the code because the implementation is the thinking. but for anything where the complexity is in integration rather than logic, wiring up OAuth flows, writing CRUD endpoints, setting up CI pipelines, agents save me hours and the output is usually fine after one review pass. the "code as thought" argument is real but it applies to maybe 20% of what most of us ship day to day. the other 80% is plumbing where the bottleneck is knowing what to build, not how.
This is exactly the issue I’m facing especially when working with AI-generated codebases.
Coding is significantly faster but my understanding of the system takes a lot longer because I’m having to merge my mental model with what was produced.
I sometimes wonder if the economics of AI coding agents only work if you totally ignore all the positive externalities that come with writing code.
Is the entire AI bubble just the result of taking performance metrics like "lines of code written per day" to their logical extreme?
Software quality and productivity have always been notoriously difficult to measure. That problem never really got solved in a way that allowed non technical management to make really good decisions from the spreadsheet level of abstraction... but those are the same people driving adoption of all these AI tools.
Engineers sometimes do their jobs in spite of poor incentives, but we are eliminating that as an economic inefficiency.