More

33MHz-i486 · 2026-05-02T18:56:28 1777748188

I remember Travis Kalanik spouting the talking points about self-driving in 2017, that after Waymo, Tesla had the advantage because they had the best data, that they were going to crack self-driving soon. Then I remember Dara scuttling Uber’s entire self driving division in 2019.

Self-driving is possible but it requires a massive sustained investment in custom hardware on the car, in real and simulation testing, in painstaking software developlment covering tens of thousands of scenarios, realtime remote control failsafes, fleet management capabilities in every city. Waymo is the only company that comes close to the right approach. All these other Elons, GM, Uber CEOs are just jangling shiny objects in front of investors. A moonshot on the financial model for what are otherwise mature stagnant businesses.

33MHz-i486 · 2026-04-29T21:24:19 1777497859

I recall a lot of the original funding (Billions of dollars) was spent on 3rd party consultants to run various multi-year long review processes (environmental, legal compliance, eminent domain, community agreements etc.)

33MHz-i486 · 2026-04-28T20:22:03 1777407723

theyre getting high on their own supply, and instead really need to hire some senior engineers

33MHz-i486 · 2026-04-28T20:17:58 1777407478

at this level they just decide and spin up a swat team to execute it in a couple weeks without politicking. the bureaucratic ways, reviews are just for the low levels, to keep them busy with feature scraps while they mostly do operations

aab99 · 2026-04-28T23:42:52 1777419772

yup, I think there are few public articles on aws mantle so you can look it up, but internally this is pretty common knowledge. The entire inference engine of bedrock is built and maintained by a handful of ec2 engineers (all principals and above). Judging by the commit history of the project they are able to just build independent of any of the traditional bureaucracy.

33MHz-i486 · 2026-04-29T00:34:21 1777422861

the way in which Mantle was built is highlighted internally by PEs as some sort of triumph but really its a fairly tone deaf indicted of AWS’s engineering culture ... “To achieve a meaningful result in a reasonable amount of time we had to break nearly ever constraint that we force all other engineers to work under. good luck to you plebs of L6 and below”

darkwater · 2026-04-29T08:38:07 1777451887

I'm going to play devil's advocate here: _what if_ most of lower level engineers are actually not able to self-organize themselves like this dedicated, I bet hand-picked group of PE can? I'm pretty sure AWS ruthless culture would gladly use less middle-managers and be swifter in time to market if it were so easy, no? What works for a single, highly-focused project (or a handful of similar situations) doesn't scale when you have to take care of bazillion customers, do boring/smaller tasks and keep the machine ticking.

coredog64 · 2026-04-29T15:23:05 1777476185

Having seen how the sausage is made, Amazon's internal engineering culture is a textbook example of the principal/agent problem. You only get a raise from a promo, and you only get a promo from scope. It's way easier for management to demonstrate scope and so those folks build out empires of crap. In the best case they're just consuming oxygen, but I have definitely seen orgs created that were a negative influence on productivity.

darkwater · 2026-04-29T16:11:43 1777479103

Never worked for BigTech, so I'mm really not in my territory, but I wonder what is the purpose of all those interviews to get "the best of the best" and then let these things happen. What are the incentives for the management layer?

33MHz-i486 · 2026-04-29T21:14:37 1777497277

probably the main thing is just having low-level engineers competent enough to run operations on the dozens of poorly maintained 50k LoC codebases per team, hundreds of bespoke internal dependencies, processes, and custom tooling. build ecosystems of 5+ languages.

secondarily its an industrialization of software development. the hiring process is where they try define the labor as a replaceable component. grab the best cogs you can annually for the lowest price, run them in the machine for 2-4 years and swap most of them out before they get too expensive, or specialized or uppity.

jjmarr · 2026-04-29T12:34:44 1777466084

[deleted]

spelunker · 2026-04-29T02:03:20 1777428200

Check out this all of this stuff we can build with a room full of PEs and no rules!

avereveard · 2026-04-29T12:44:02 1777466642

For better and worse tho.

o10449366 · 2026-04-28T21:11:50 1777410710

Lol, spinning up swat teams because someone high up decides "drop everything this is my pet priority now" is politicking. It looks good for the leaders, meanwhile its the engineers pulling the all nighters and dealing with having to maintain systems that are operationally compromised from day 0 because there's no proper planning/scoping involved other than "Big Man says this needs to be done in 2 weeks"

tt24 · 2026-04-29T02:03:55 1777428235

Okay but I’ll be able to use OpenAI models on bedrock now

Why would I care if AWS asks their engineers to work a little harder on a project

phillipcarter · 2026-04-28T21:24:09 1777411449

Sure, but also...

...anyone with a brain at AWS knows that supporting OpenAI's latest models on Bedrock is simply good for AWS. That context is rather important!

ignoramous · 2026-04-29T02:51:16 1777431076

> engineers pulling the all nighters

There's always some carrot with the stick, even if an imaginary one!

33MHz-i486 · 2026-04-24T20:46:06 1777063566

I think the subtext of the last few weeks is the Anthropic was becoming severely capacity constrained (or approaching that). They seem to have had to sign two somewhat adverse contracts with Amazon and Google in short succession. suddenly model quality is back up again.

tiffanyh · 2026-04-24T21:15:19 1777065319

That’s what’s needed when you go from $9B in ARR … to $30B in ARR literally just one quarter later.

That kind of insane growth & demand is unprecedented at that scale.

https://www.anthropic.com/news/google-broadcom-partnership-c...

an0malous · 2026-04-24T21:52:24 1777067544

What is all this AI doing? People are spending 10’s to 100’s of billions and no service or technology seems better or cheaper. Everything is more expensive and worse.

barnabee · 2026-04-24T22:55:12 1777071312

Where I work:

- Development velocity is very noticeably much higher across the board. Quality is not obviously worse, but it's LLM assisted, not vibe coding (except for experiments and internal tools).

- Things that would have been tactically built with TypeScript are now Rust apps.

- Things that would have been small Python scripts are full web apps and dashboards.

- Vibe coding (with Claude Desktop, nobody is using Replit or any of the others) is the new Excel for non tech people.

- Every time someone has any idea it's accompanied by a multi page "Clauded" memo explaining why it's a great idea and what exactly should be done (about 20% of which is useful).

- 80% of what were web searches now go to Claude instead (for at least a significant minority of people, could easily be over 50%).

- Nobody talks about ChatGPT any more. It's Claude or (sometimes) Gemini.

- My main job isn't writing code but I try to keep Claude Code (both my personal and corpo accounts) and OpenCode (also almost always Claude, via Copilot) busy and churning away on something as close to 100% of the time as I can without getting in the way of my other priorities.

We (~20 people) are probably using 2 orders of magnitude more inference than we were at the start of the year and it's consolidated away from cursor, ChatGPT and Claude to just be almost all Claude (plus a little Gemini as that's part of our Google Whateverspace plan and some people like it, mostly for non-engineering tasks).

No idea if any of this will make things better, exactly, but I think we'd be at a severe competitive disadvantage if we dropped it all and went back how things were.

stasomatic · 2026-04-24T23:59:01 1777075141

I am hobbyist playing around. Recently dropped CC (which gave me a sense of awe 2 months ago), but they realized GPUs need CapEx and I want to screw around with pi.dev on a budget. Then on to GH Copilot but couldn't understand their cost structure, ran out of quota half month in, now on Codex. I don't really see any difference for little stuff. I also have Antigravity through a personal Gmail account with access to Opus et al and I don't understand if I am paying for it or not. They don't have my CC so that's a breather.

It's all romantic, but a bunch of devs are getting canned left and right, a slice of the population whose disposable income the economy depends on.

It's too late to be a contrarian pundit, but what's been done besides uncovering some 0-days? The correction will be brutal, worse than the Industrial Revolution. Just the recent news about Meta cuts, SalesForce, Snap, Block, the list is long.

Have you shipped anything commercially viable because of AI or are you/we just keeping up?

fc417fc802 · 2026-04-25T00:49:54 1777078194

> The correction will be brutal, worse than the Industrial Revolution.

Has it occurred to you that there might not be a correction, and that the outcome would still be brutal, at least on par with the industrial revolution.

SlinkyOnStairs · 2026-04-25T09:53:01 1777110781

It won't get that far.

It's physically impossible to build out the datacenters required for the "AI is actually good and we have mass layoffs" scenario. This Anthropic investment is spurred on because they've already hit a brick wall with capacity.

$40B goes a long way, but not for datacenters where nearly every single component and service is now backordered. Even if you could build the DC, the power connection won't be there.

The current oil crisis just makes all of that even worse.

fc417fc802 · 2026-04-25T19:44:31 1777146271

Doesn't that just draw out the AI revolution by a few years? I don't see why it would stop anything though.

Imagine a scenario where someone claimed that it was physically impossible to replace all the buggies with automobiles because everything was backordered and there were labor shortages. Surely the replacement still happens eventually though?

SlinkyOnStairs · 2026-04-25T23:33:56 1777160036

A drawn out long change simply doesn't have the major societal upset that imminent mass-unemployment has.

With how much scale AI datacenters want and how the Trump administration has made supply problems significantly worse, we'd be talking decades, plural.

fc417fc802 · 2026-04-26T02:15:03 1777169703

I don't think lowering the rate a bit is going to be sufficient to avoid major upsets. If (arbitrary example) every software developer were forced to switch jobs over a 10 year period that would still be an extremely disruptive sequence of events. And I don't think there's any scenario in which software developers are widely impacted but other industries somehow aren't.

Digitization was already fairly disruptive and that involved much smaller changes than what we appear to be facing while also taking place over something like 30 years or more.

fallat · 2026-04-25T11:47:37 1777117657

We pretty much already had the layoffs, at least that's my perception.

The next level of layoffs is probably still 25 years out.

SlinkyOnStairs · 2026-04-25T12:20:41 1777119641

There's layoffs, certainly.

But all the economic indicators suggest those are "bad economy" layoffs dressed up as "AI" layoffs to keep the shareholders happy.

bitmasher9 · 2026-04-25T14:32:04 1777127524

The real “AI layoffs” are all the people that are PIPed because their colleagues are better at leveraging AI.

Spivak · 2026-04-25T18:59:32 1777143572

We must have a very different view of the world because in my neck of the woods companies are desperate for senior talent. And it's become even harder to find seniors now that everyone has access to a machine that can create the appearance of experience.

embedding-shape · 2026-04-25T13:37:03 1777124223

> The next level of layoffs is probably still 25 years out.

Hasn't even been 25 years years since the previous layoffs before the current ones.

stasomatic · 2026-04-25T00:54:04 1777078444

Do you mean as in there will be no happy ending / reset and no another century of prosperity?

fc417fc802 · 2026-04-25T01:04:56 1777079096

I mean as in living through the industrial revolution would have been wild. So whether we have an AI revolution or an AI bubble it's bound to be a roller coaster.

And that's without accounting for the various wars (and resultant economic impacts) that are already in progress. A large part of what drove the meat grinder of WWI was (very approximately) the various actors repeatedly misjudging the overall situation and being overly enthusiastic to try out their shiny new weapons systems. If one or more superpowers decide to have a showdown the only thing that might minimize loss of life this time around is (ironically enough) the rise of autonomous weapons systems. Even in that case as we know from WWII the logical outcome is a decimated economy and manufacturing sector regardless of anything else that might happen.

Aeolun · 2026-04-25T06:34:42 1777098882

> minimize loss of life this time around is (ironically enough) the rise of autonomous weapons systems

I think that just means the relative civilian loss of life will increase once again.

fc417fc802 · 2026-04-25T07:27:31 1777102051

What strategic merit is there in targeting civilians or life critical infrastructure in a fully automated battlebot scenario? Perhaps it's naive but I would expect stockpiles, datacenters, and any key infrastructure on which the local semiconductor fabrication depends to be the primary targets.

kakacik · 2026-04-25T10:06:02 1777111562

Look au Ukraine for answers and how russians target almost purely civilian infrastructure and civilians in terror campaigns every single day and night, same as nazis did to Britain in WWII. With exactly same results but they just double down and send more drones next day.

russia is really and empire of the dumb and subjugated serfs at this point (again, history repeats), but they are far from only such place.

Dont expect more, most people are not that nice when SHTF.

bojan · 2026-04-25T11:05:06 1777115106

The current reality doesn't match your expectations. Russia is using automated warfare to strike what are primarily human life-critical targets.

tacet · 2026-04-26T06:15:53 1777184153

The aim of war is to make political change and gain control of opponent. it is much more valuable to capture datacenters, infra and semiconductor fabricaton than to destroy and rebuild it.

rhubarbtree · 2026-04-25T01:36:38 1777080998

Bubble or revolution - not a dichotomy.

Bubbles like the AI bubble are a game theoretic outcome of a revolution. Many players invest heavily to avoid losing, but as a whole the market over invests. This leads to a bubble.

chpatrick · 2026-04-25T01:56:55 1777082215

Imagine you're a typesetter and they just invented computerized printing.

Spivak · 2026-04-25T19:12:15 1777144335

These kinds of comments are so confusing to me, is the work you do day to day really so trivial that you can be wholesale replaced by an LLM?

chpatrick · 2026-04-25T22:35:43 1777156543

90% of the actual code writing? Yes. The actually valuable part is coming up with the ideas for what to do.

There isn't going to be a great reset where everyone goes back to coding by hand any more than we're going back to typesetting by hand.

fc417fc802 · 2026-04-25T19:50:44 1777146644

We're not talking about the LLMs of today but whatever shows up 2 years from now and then again 2 years after that. Don't look at the present state of things but instead project the trend line.

jameshart · 2026-04-25T03:32:01 1777087921

There has always been a gap between the experience of solo/small shop developers, vs. developers who work in teams in a large corporate environment. But thanks to open source, we have for the past twenty years at least mostly all been using the same tools.

But right now, the difference in developer experience between a dev on a team at a business which has corporate copilot or Claude licenses and bosses encouraging them to maximize token usage, vs a solo dev experimenting once every few months with a consumer grade chat model is vast.

eieie · 2026-04-25T03:49:09 1777088949

Let’s take an extreme example.

Meta seemingly has a constant stream of product managers. If llm’s really augment the productivity of engineers, why isn’t meta launching lots more stuff? I mean there’s no harm in at least launching one new thing.

What are all those people doing with the so called productivity enhancements?

What I’m calling into question is how much does generating more code matter if the bottle neck is creativity/imagination for projects?

The only thing I’ve seen is a really crummy meta AI thing implemented within WhatsApp.

bushbaba · 2026-04-25T05:53:05 1777096385

It’s allowed a sludge of internal tools to spin up, and more bloat. The ability to sand bag and over build these tools has gotten 2-10x worse.

Only solution I can think of is to drastically cut headcount so productivity is back to prior levels, and profitability is raised. Big Tech is mostly market constrained with not much room to grow beyond the market itself growing.

As for startups, seems like AI tools have drastically reduced their time to market and accelerated their growth curves.

eieie · 2026-04-26T01:24:55 1777166695

Im convinced the most scarce skill on the planet is the ability to a) envision something that needs to exist in the world b) explain how the thing creates value from a financial perspective.

Most people tend to think they know what they are talking about (e.g. surface level understanding of how to think economically) and end up making basket-case decisions - only realising it months later. By that point they will fail to admit defeat and keep going on.

"As for startups, seems like AI tools have drastically reduced their time to market and accelerated their growth curves."

You mean like openclaw? lol

aenis · 2026-04-25T18:27:56 1777141676

In a word, bottlenecks moved.

What I see in my backyard: coding now takes significantly less time, but its just coding. Before one gets to building there are squabbles between business and product people. Testing takes just as much as it used to. Since nice to haves are easy to add and product people begin to take it for granted, the product cycles don't get shorter.

Give it time. Right now its just coding, but procedural AI will come after product development, architecture, and then whatever is left of management.

eieie · 2026-04-26T01:22:48 1777166568

Absolute delusion.

The best people can not only envision products but also possess great judgement without needing data. For AI to even come close it would need an insane amount of data that is nuanced and subtle - by the the time the AI has obtained all the necessary data and made sense of it the human is long gone working on something else.

stasomatic · 2026-05-02T12:42:03 1777725723

But these people will age out and juniors do not get hired. “Good judgment comes from experience, and experience comes from bad judgment.” and all that.

Is LLM going to invent its own languages that no average programmer will understand? As in "I don't need your C++ human, I will rewrite your fart app in ClaudASM and you will like it". These are naive questions, but I can't visualize how all of this will unfold.

elliotec · 2026-04-25T05:14:41 1777094081

Forgive my ignorance, but what exactly is the vast difference? Who's doing more of what, or whatever you're implying? And how do you quantify this?

aenis · 2026-04-25T18:33:43 1777142023

The people who use AI to the maximum learn more.

A neutral hobbyist on a $20 budget will build something and immediately bump into quotas. Its not going to be an enjoyable experience.

A negatively predisposed pro who only dabbles in AI gets to the first disappointment, smiles, and thinks "yeah, about what i expected" and quits.

To learn those new tools one needs to not be stingy. Invest as much as needed into tokens, subscriptions, and maybe most importantly invest the time. Spend time building various things. Try out various models not just for coding, but as part of apps being built. For bonus points, meaningfully experiment with local models. I try to avoid discussions with sceptics who have not put at least a few months of effort into learning those tools. It's like discussing driving with my mother in law, who spent maybe 20hrs behind the wheel through her whole life (and is very, very opinionated!).

LtWorf · 2026-04-26T10:39:59 1777199999

And it's not a waste of money because?

aenis · 2026-04-26T17:25:22 1777224322

You'd have learned something new. Useful, not useful, thorough understanding of a new thing is rewarding.

Also, its not primarily about money - the real investment here is time.

LtWorf · 2026-04-26T18:03:29 1777226609

In my opinion it's a complete waste of time and money to learn something that is gated by a company that might disappear tomorrow.

It's akin to company courses to learn something that is specific to that company. Of course you do them on the job, there is no point in doing them if you don't work there.

Similarly what's the point of trying 300 different models if any job will decide for you which one they approve the use of, and you are liable to get fired and asked for damages if you let anything else access company intellectual property?

jameshart · 2026-04-25T06:33:33 1777098813

The difference is (if you'll forgive me recruiting a couple of straw men for the purpose of illustrating the spectrum we are talking about here):

Hobbyist solo dev, counting tokens, hitting quotas, trying things on little projects, giving up and not seeing what the fuss is about.

vs

Corporate developer, increasingly held accountable by their boss for hitting metrics for token usage; being handed every new model as soon as it comes out; working with the tools every day on code changes that impact other developers on other teams all of whom have access to those same tools.

elliotec · 2026-04-25T06:46:38 1777099598

Okay, so just to be clear you're not commenting on productivity? Or what does "changes that impact" mean?

I might be missing a lot of self-evident assumptions here but I feel like I'm still missing so much context and have no idea what this difference is actually describing.

jameshart · 2026-04-25T07:07:03 1777100823

If you have some objective measure of productivity in mind, feel free to share it, but no that's not what I'm commenting on.

I'm talking more about why threads like this seem to be full of people saying 'this has completely changed how corporate development works' and other people saying 'I tried it a few times and I don't get the hype'

bluGill · 2026-04-25T14:59:44 1777129184

Developers being let go is about the economy. Every time we see a slowdown people are let go and we always blame the fad but it's the economy not whatever.

kaiokendev · 2026-04-25T20:49:52 1777150192

> Every time someone has any idea it's accompanied by a multi page "Clauded" memo explaining why it's a great idea and what exactly should be done (about 20% of which is useful).

we're in the same boat, and currently trying to fix that 20% problem because it's the biggest hindrance to shipping things quickly

there is a ton of learned ceremony that we have to undue gracefully because it's extremely tempting to vibe code a problem spec as opposed to just... talking to users directly and understanding what the actual problem is

mullingitover · 2026-04-24T23:26:36 1777073196

> - Development velocity is very noticeably much higher across the board

It's an absolute tornado of PRs these days. Everyone making the most of these tools is effectively an engineering team lead.

MrDarcy · 2026-04-25T00:56:41 1777078601

The CTO/VP of engineering role down is now singularly focused on keeping agents fed with a backlog of Linear issues. This is the new normal.

aenis · 2026-04-25T18:37:48 1777142268

As a CTO I can say that this is not my experience.

My experience these days is fighting corporate bureaucracy and inertia to make sure we reap the benefits of faster coding. Feeding agents with work is not a problem. Building teams that use those tools effectively is the problem. (Say, shall we merge product and engineering teams? Do we start getting rid of people who refuse to use AI? What do we do with pentests? How do we strengthen the tools that do code analysis and weed out lazy devs who can now more easily pretend to be invested in their work? Stuff like this keeps me busy.

lobsterthief · 2026-04-26T13:53:47 1777211627

As a CTO this has been my experience as well. I would add in every non-technical C-suite member aiming to use AI as some magic lever to avoid prioritizing projects or engaging in real critical thinking. Too many people are offloading their cognitive decisionmaking to some magic box, thinking it has all the answers, because its output appears magical and complete.

After 25 years in programming I think I’ll finally start that farm ;)

mullingitover · 2026-04-26T00:43:36 1777164216

> Do we start getting rid of people who refuse to use AI?

I don't even think the bigger companies are going to waste time on figuring out how to retrain, they're just going to do industrial scale layoffs and then rebuild from the ground up with people who won't get past interviews without demonstrating hard skills in this area.

There is a shocking gap growing right now, it's a Wile E. Coyote not realizing he already walked off the cliff type of situation for a lot of people.

eieie · 2026-04-26T01:35:14 1777167314

Lol people like you operate in a local maximum.

Ultimately the shareholders want to see the money. They dont give a crap about what you think or what the poster above thinks - you're both accountable to the shareholders who do not employ you for fun. They employ you for the sole purpose of making them wealthier. All this incremental spend on tokens shows up in the financials positively or it doesn't.

mullingitover · 2026-04-26T21:04:16 1777237456

> Ultimately the shareholders want to see the money.

Seems like we're saying the same thing?

> All this incremental spend on tokens shows up in the financials positively or it doesn't.

Right, and we're talking about the staff failing to spend the incremental tokens at all, thus failing to discover whether or not they'll show up positively. I'm just saying, investors are probably going to decide to roll the dice on a complete staffing rebuild rather than try to wait for the existing corporate culture to adapt because they're going to get fomo. Arguably it's already happening.

am17an · 2026-04-25T03:36:45 1777088205

Sounds exhausting. Are your revenue numbers up?

camdenreslink · 2026-04-25T04:50:53 1777092653

I am also curious about the correlation between more PRs getting merged faster and actual business outcomes.

My impression has always been it's more important the build the correct thing (what the customer needs/wants) rather than more stuff faster.

TeMPOraL · 2026-04-25T06:39:22 1777099162

> My impression has always been it's more important the build the correct thing (what the customer needs/wants) rather than more stuff faster.

The process of learning what the customer needs/wants is a heavily iterative one, often involving throwing prototypes at them or betting at a solution, then course-correcting based on their reaction. Similarly, the process of building the correct thing is almost always an iterative approximation - correctness is something you discover and arrive at after research and prototypes and trying and getting it wrong.

All of that benefits from any of its steps being done faster - but it's up to the org/team whether they translate this speedup to quality or velocity. For example, if AI lets you knock out prototypes and hypothesis-testing scripts much faster, you can choose whether to finish earlier (and start work on next thing sooner), or do more thorough research, test more hypothesis, and finish as normally, but with better result.

(Well, at least theoretically. If you're under competitive pressure, the usual market dynamics will take the choice away, but that's another topic.)

dominotw · 2026-04-25T13:19:58 1777123198

no customers will accept "throwing prototypes at them". my time is not for QA-ing your product.

why do you think restaurants rarely change their menus.

joshuacc · 2026-04-25T19:04:59 1777143899

You have a specific idea of customer in mind. Likely different than the gp’s. Many types of customers are quite happy to have prototypes thrown at them. Sometimes it’s even contractually required in agency work.

eueew · 2026-04-25T16:50:11 1777135811

Is it just me or is this whole mania exposing those people who thought they were great ‘thinkers’? The takes I see are so utterly flawed it’s ironic - people refer to llm’s as hallucinating when the real halluncinations are from people cosplaying the role of management/investors when they have never done said role professionally in their life.

aenis · 2026-04-25T18:42:26 1777142546

For sure, but these days product management mistakes can be more easily rectified. Before that, if we invested 4 months in building something that did not land, we'd be quite reluctant to jettison this and start fresh. Egos, career considerations, sunk cost, etc. I think I will soon be able to say "not any more", since doing a U turn can be cheaper than pretending the bad choice is the best choice. "Oops, lets redo this" vs. 6 months of executive squabbling about whose fault it is that we wasted $3M in development costs on something that clearly does not perform.

Also, give it time. Real adoption in boring companies started Q1. Q2 is, I think, this settling in and people learning how to do their work and manage their responsibilities. Q3/Q4 will be the time when I expect to start seeing higher velocities across all IT-adjacent products I use.

barnabee · 2026-04-26T01:44:26 1777167866

Iteration speed is far more important than the volume of features delivered, though we are tackling aspirational features that would previously have been considered too complex or niche, too.

mewpmewp2 · 2026-04-25T08:45:42 1777106742

This with the ability to research, iterate on prototypes, in my opinion allows to determine the right thing quicker as well. Of course right now the value is largely intuition based, there may be some immediate revenue/profit, but revenue/profit will take time to follow, so in a way it is a speculative intuition based bet. Financial gains will take time to follow, so for a period of time it will be "trust me bro" for at least some cases, but I suppose future will show, since the intuition seems so strong about it. You can't have good data about an emerging tech like that.

camdenreslink · 2026-04-25T16:02:53 1777132973

I question the premise that faster iteration on prototypes leads to better revenue or business outcomes.

fragmede · 2026-04-25T19:56:07 1777146967

Isn't that trivially true? Scenario 1) Spend $10,000 to make one prototype. You get one shot, so you prepare and do as much pre-work as humanly possible, but because you only get one shot, you forgot the ask the question that in Hobbs sight was obvious. Scenario 2) prototypes cost $1,000 so you get multiple shots. So you don't do as much pre-work, throw a half dozen things at the wall. One of them sticks! It really resonates with customers. You iterate a few more times, and when it's finally on the market, you have a successful business.

The difference is all that pre-work. The problem with that is some things are only obvious after you've built one and it doesn't fit just right for some reason. That reason is impossibly harder to just reason about and figure out vs iterating where possible. For software things that's easier. For hardware, we have stories like the palm pilot engineer having a wooden block with them for a week before deciding on the form factor for it. Such pre-work is valuable, but if the cost of prototypes is way down, you can afford to iterate instead of trying to psychically predict everything up front. Of course that doesn't work for eg trips to the Moon, but most busineeses aren't doing that.

camdenreslink · 2026-04-26T01:52:18 1777168338

The problem is in validating the prototype. Whether the users are consumers or enterprises or internal stakeholders, they aren’t going to try 10 different prototypes. They will try one or two.

Most business software isn’t complicated to implement (i.e. it doesn’t require multiple prototypes to determine which technical approach is best). Usually for most apps you approximately know the technical implementation. What requires taste, experience, or whatever you want to call it, is the user experience and if your software actually solves a real problem. You can’t really just churn on prototypes to solve that. You will lose the patience of your user base.

barnabee · 2026-04-26T01:47:59 1777168079

Yes, it is trivially true much of the time.

Even so-called UX and product experts get stuff wrong all the time. Going from idea to prototype to feedback in hours or days rather than days or weeks feels like a superpower, at least in the very customer facing parts of what we do.

groundzeros2015 · 2026-04-25T19:08:47 1777144127

Did CAD make engineers better? certain products are only possible because of CAD but the pen and paper guys weren’t obviously less efficient, and I personally think they were very efficient.

When prototypes are harder to build you focus on answering the biggest questions. I feel like you spend more time iterating on details in CAD, even when the larger idea is invalid.

eueew · 2026-04-25T16:51:33 1777135893

Agreed it’s far more complex than that.

But people who have only wrote software their entire life wouldn’t know that would they?

It’s like the econ prof’s who theorise about the theory of the firm but have never done it themselves.

eieie · 2026-04-25T03:46:02 1777088762

Incremental cash flows is what we should be observing - have to net out the costs of llm associated with the activity.

Thats just one set of costs but a good starting point.

xnx · 2026-04-25T10:02:30 1777111350

Reducing costs is also a business benefit.

am17an · 2026-04-25T10:21:59 1777112519

The cost being reduced is the cost of your labour. Tokens are only getting more expensive.

xnx · 2026-04-26T15:14:47 1777216487

Did you mean less expensive?

https://pricepertoken.com/trends#frontier-pricing

am17an · 2026-04-29T00:34:05 1777422845

No I mean more expensive, i.e. you're consuming vastly more tokens.

barnabee · 2026-04-26T01:42:03 1777167723

It’s no more exhausting than the alternative. It feels good being able to build more and experiment more.

The biggest downside is the feeling that people sometimes turn their brain off and aren’t even doing basic checks on some of the slop their LLMs produce.

jeremyjh · 2026-04-24T23:52:09 1777074729

It sounds very similar to my shop. I have QA people and Product Managers using Claude to develop better integration and reporting tools in Python. Business users are vibe coding all kinds of tools shared as Claude Artifacts, the more ambitious ones are building single page app prototypes. We ported one prototype to Next.js and hosted on Vercel in a couple of days and then handed it back to them with a Devcontainer and Claude Code so they can iterate on it themselves; and we also developed all the security infrastructure, scaffolding, agent instructions & policy required to do this for low stakes apps in a responsible way.

It hardly seems worth it to try to iterate on design when they can just build a completely functional prototype themselves in a few hours. We're building APIs for internal users in preference to UIs, because they can build the UIs themselves and get exactly what they need for their specific use cases and then share it with whoever wants it.

We replaced an expensive, proprietary vendor product in a couple of weeks.

I have no delusions about the scale or complexity limits of these projects. They can help with large, complex systems but mostly at the margins: help with impact analysis, production support, test cases, code review. We generate a lot of code too but we're not vibe coding a new system of record and review standards have actually increased because refactoring is so much cheaper.

The fact is that ordinary businesses have a LOT of unmet demand for low stakes custom software. The ones that lean into this will not develop superpowers but I do think they will out-compete slow adopters and those companies will be forced to catch up in the next few years.

I develop presentations now by dumping a bunch of context in a folder with a template and telling Claude Cowork what I want (it does much better than web version because of its python and shell tools and it can iterate, render, review, repeat until its excellent). The copy is quite good, I rewrite less than a third of it and the style and graphics are so much better than I could do myself in many hours.

No one likes reading a bunch of vibe coded slop and cultural norms about this are still evolving; but on balance its worth it by far.

realusername · 2026-04-25T10:12:29 1777111949

Personally at my place, there hasn't been a noticable velocity change since the adoption of Claude Code. I'd say it's even slightly worse as now you have junior frontend engineers making nonsense PRs in the backend.

Mainn blockers are still product, legal, management ... which Claude code didn't help with.

davidcann · 2026-04-24T23:28:33 1777073313

Is your team measuring how much of your code is being written with claude and comparing amongst the team, like what works best in your codebase? How are you learning from each other?

I’m making a team version of my buildermark.dev open source project and trying to learn about how teams would like to use it.

barnabee · 2026-04-24T23:46:21 1777074381

Different teams are using it in very different ways so it can be tough to compare meaningfully.

Backends handling tens to hundreds of thousands of messages per second with extremely high correctness and resilience requirements are necessarily taking a different approach to less critical services that power various ancillary sites/pages or to front end web apps.

That said there's a lot of very open discussion around tooling, "skills", MCP, etc., harnesses, and approaches and plenty of sharing and cross-pollination of techniques.

It would be great to find ways to better quantify the actual value add from LLMs and from the various ways of using them, but our experience so far is that the landscape in terms of both model capability and tooling is shifting so fast that that's quite hard to do.

davidcann · 2026-04-25T00:03:52 1777075432

Thanks for the feedback. I agree that it’s changing very fast, which is why my thesis is that this tooling will be needed to help everyone on the team keep up.

croes · 2026-04-25T10:18:09 1777112289

Jevon‘s paradox comes into play.

https://en.wikipedia.org/wiki/Jevons_paradox

In the end only profit matters

ojr · 2026-04-25T02:22:10 1777083730

I am an early Gemini daily driver type engineer, feels like Node, Firefox, React and Tailwind all over again, Claude Sonnet is 10x more expensive, quick thought experiment do you think 10 Gemini prompts is needed to match the quality of one Claude Code prompt? The harness around Gemini is an issue but I built my own (in Rust)

jwpapi · 2026-04-25T00:48:33 1777078113

I think if you drop this all you will absolutely kill it.

komali2 · 2026-04-25T01:58:36 1777082316

I'm not sure. I have a buddy that's one of the better engineers I know personally, and he struggled to maintain an "AI Lent" for even a month. He found he just wasn't productive enough without it.

He did a writeup: https://buduroiu.com/blog/ai-lent-end/

svieira · 2026-04-25T11:21:15 1777116075

> I delivered more work that I was less confident about, making me more miserable in the process

Don't leave the kicker out of the story

komali2 · 2026-04-26T04:57:27 1777179447

The kicker is he wasn't able to compete without the agents.

IMO this is the natural end state of LLM fueled capitalism: products skating along the razor edge between "has value under capitalism" and "is a heap of garbage" until we suddenly realize there's nothing under our feet.

jwpapi · 2026-04-26T20:08:54 1777234134

His success measurement is PRs. The success measurement I mean is increased likelihood of company succeeding.

ttul · 2026-04-25T00:41:04 1777077664

This sounds like my office, but we're a bit more tilted toward Codex. I personally use Claude Cowork for drudge-admin work, GPT 5.5-Pro for several big research tasks daily, and the LLMs munge on each other's slop all day as I try my best to wrap my head around what has been produced and get it into our document repository -- all the while being conscious that the enormous volume of stuff I'm producing is a bit overwhelming for everyone.

We are definitely reaching the point where you need an LLM to deal with the onslaught of LLM-generated content, even if the humans are being judicious about editing everything. We're all just cranking on an inhumanly massive amount of output and it's frankly scary.

JambalayaJimbo · 2026-04-25T04:37:51 1777091871

Didn’t got 5.5 just come out lol. Am I just reading slop on this website?

ttul · 2026-04-26T01:25:47 1777166747

Yeah, it did just come out. I should have probably said "GPT-Pro" and left out the version :)

dominotw · 2026-04-25T13:19:06 1777123146

what have you guys built exactly?

barnabee · 2026-04-26T21:36:21 1777239381

Everything from complex backend logic and processing to new user facing festivals, ops and infra tools, analytical apps, etc.

A lot of engineers now describe the problem, discuss the outline of the solution with the LLM, and then get it to write and test most of it ahead of their review. They tell me it usually takes the same approach they would anyway and even when it doesn’t, it’s often faster to explain what’s wrong and give the LLM another try.

Very little (and even then, only simple internal tools) gets written without a human owning the code and reviewing it thoroughly, but even with that overhead the productivity boost is impressive.

pwinnski · 2026-04-25T13:53:34 1777125214

I kept asking this question last year, especially after that initial METR report showing people believed themselves to be faster when they were slower. Then I decided to dive in feet-first for a few weeks so that nobody could say I hadn't tried all I could.

At work, what I see happening is that tickets that would have lingered in a backlog "forever" are getting done. Ideas that would have come up in conversation but never been turned into scoped work is getting done, too. Some things are no faster at all, and some things are slower, mostly because the clankers can't be trusted and human understanding can't be sped up, or because input is needed from product team, etc. But the sorts of things that don't make it into release notes, and are never announced to customers, those are happening faster, and more of them are happening.

We review server logs, create tickets for every error message we see, and chase them down, either fixing the cause or mitigating and downgrading the error message, or however is appropriate to the issue. This was already a practice, but it used to feel like we were falling farther behind every week, as the backlog of such tickets grew longer. Most low-priority stuff, since obviously we prioritized errors based on user impact, but now remediation is so fast that we've eliminated almost the entire backlog. It's the sort of things that if we were a mobile app, would be described as "improvement and bug fixes" generically. It's a lot of quality-of-life issues for use as backend devs.

At home, I'm creating projects I don't intend for anyone outside my family to see. So far things I could theoretically have done myself, even related to things I've done myself before, but at a scale I wouldn't bother. Like a price-checker that tracks a watchlist of grocery items at nine local stores and notifies me in discord of sales on items and in categories I care about. It's a little agent posting to a discord channel that I can check before heading out for groceries.

Or several projects related to my hobbies, automating the parts I don't enjoy so much to give me more time for the parts I do. My collection of a half-dozen python scripts and three cron jobs related to those hobbies has grown to just over 20 such scripts and 14 cron jobs. Plus some that are used by an agent as part of a skill, although still scripts I can call manually, because I'll go back to cron jobs for everything if the price of tokens rises a bit more.

I was super-skeptical, and now I'm not. I think companies laying off employees are delusional or using LLMs as an excuse, but there is zero question in my mind that these things can be a huge boon to productivity for some categories of coding.

LittleBox · 2026-04-26T07:47:40 1777189660

I went through a similar journey. That and having read all the other experienced engineers’ anecdotes I think the current consensus is that it can boost productivity and does so for a lot of people but that vibe coding still remains unviable in a lot of situations.

renegade-otter · 2026-04-25T19:53:12 1777146792

Development velocity is faster, but the code quality hits take a while to manifest.

Some places are more diligent, but most are not. We HATE reading other people's code, and we only have so much focus capacity per day to review all the shit these clunkers spew out.

Over time, the errors induced by Looks Good To Me code reviews compound.

Jagerbizzle · 2026-04-24T21:56:17 1777067777

I'm burning an insane number of tokens 8-12 hours a day for the dramatic improvement of some internal tooling at a big tech company. Using it heavily for an unannounced future project as well.

I presume I'm not the only one.

msy · 2026-04-24T22:06:05 1777068365

We suddenly have a proliferation of new internal tools and resources, nearly all of which are barely functional and largely useless with no discernible impact on the overall business trajectory but sure do seem to help come promo time.

Barely an hour goes by without a new 4-page document about something that that everyone is apparently ment to read, digest and respond to, despite its 'author' having done none of those steps, it's starting to feel actively adversarial.

kranke155 · 2026-04-24T22:24:05 1777069445

Without good management AI is just a new way to make terrible work in unprecedented quantities.

With good management you will get great work faster.

The distinguishing feature between organisations competing in the AI era is process. AI can automate a lot of the work but the human side owns process. If it’s no good everything collapses. Functional companies become hyper functional while dysfunctional companies will collapse.

Bad ideas used to be warded off by workers who in some shape or form of malicious compliance just would slow down and redirect the work while advocating for better solutions.

That can’t happen as much anymore as your manager or CEO can vibe code stuff and throw it down the pipeline for the workers to fix.

If you have bad processes your company will die, or shrivel or stagnate at best. Companies with good process will beat you.

komali2 · 2026-04-25T02:01:55 1777082515

> but sure do seem to help come promo time.

I personally noticed this. The speed at which development was happening at one gig I had was impossible to keep up with without agentic development, and serious review wasn't really possibile because there wasn't really even time to learn the codebase. Had a huge stack of rules and MCPs to leverage that kinda kept things on the rails and apps were coming out but like, for why? It was like we were all just abandoning the idea of good code and caring about the user and just trying to close tickets and keep management/the client happy, I'm not sure if anyone anywhere on the line was measuring real world outcomes. Apparently the client was thrilled.

It felt like... You know that story where two economists pass each other fifty bucks back and forth and in doing so skyrocket the local GDP? Felt like that.

qingcharles · 2026-04-24T23:03:53 1777071833

My main use of vibecoding is creating dozens of internal tools that have sped up tasks, or made tasks possible that were previously not. These tools would have taken weeks of time to build manually and would have been hard to justify, rather than just struggling with manual processes every now and again. AI has been life-changing in creating these kinda janky tools with janky UI that do everything they're supposed to perfectly, but are ugly as hell.

Jach · 2026-04-25T00:06:05 1777075565

Are you able to describe any of those internal tools in more detail? How important are they on average? (For example, at a prior job I spent a bit of time creating a slackbot command "/wtf acronym" which would query our company's giant glossary of acronyms and return the definition. It wasn't very popular (read: not very useful/important) but it saved myself some time at least looking things up (saving more time than it took to create I'm sure). I'd expect modern LLMs to be able to recreate it within a few minutes as a one-shot task.)

shimman · 2026-04-25T00:40:58 1777077658

It's almost always a CRUD app or dashboard that no one uses while being extremely overkill for their use case.

edit: LOL called it, a bunch of useless garbage that no one really cares about but used to justify corporate jobs programs.

Daishiman · 2026-04-25T03:24:34 1777087474

If it's useless that's a you problem. I've been building CRUDs that would have taken me a month to get perfectly right in the span of 4-5 days which save an enormous number of human tech support hours.

shimman · 2026-04-25T03:30:37 1777087837

Sorry man but the software world is littered with CRUD apps, they are called CRUD apps for a reason. They're basically the mass assembled stamped L-bracket of the software world. CRUD apps have also had template generators for like 30 years now too.

Still useless in the sense that if you died tomorrow and your app was forgotten in a week the world will still carry on. As it should. Utterly useless in pushing humanity forward but completely competent at creating busy work that does not matter (much like 99% of CRUD apps and dashboards).

But sure yeah, the dashboard for your SMB is amazing.

Daishiman · 2026-04-25T04:53:42 1777092822

The software industry's value proposition for the vast majority of businesses running the world lies in CRUD apps that properly capture business requirements. That's infinitely more relevant in insurance, pharma, banking and logistics than any technological breakthrough of the past 25 years.

Your rant just shows you don't understand why people pay for software.

girvo · 2026-04-25T00:52:29 1777078349

Ah but it looks cool and I can put it on my stack ranking perf eval

scottyah · 2026-04-25T01:00:16 1777078816

I have one that serves a few functions- Tracks certificates and licenses (you can export certs in any of the majorly requested formats), a dashboard that tells you when licenses and certs are close to expiring, a user count, a notification system for alerts (otherwise it's a mostly buried Teams channel most people miss), a Downtime Tracker that doesn't require people to input easily calculatable fields, a way for teams to reset their service account password and manage permissions, as well as add, remove, switch which project is sponsoring which person, edit points of contact, verify project statuses, and a lot more. It even has some quick charts that pull from our Jira helpdesk queue- charts that people used to run once a week for a meeting are just live now in one place. It also has application statuses and links, and a lot more.

I'd been fighting to make this for two years and kept getting told no. I got claude to make a PoC in a day, then got management support to continue for a couple weeks. It's super beneficial, and targets so many of our pain points that really bog us down.

CoolThings · 2026-04-25T02:09:55 1777082995

>> a dashboard that tells you when licenses and certs are close to expiring

Or, Excel > Data > Sort > by the Date column. No dashboard needed, no app needed.

Jach · 2026-04-25T13:33:29 1777124009

A lot of businesses can get by just fine with making it one person's responsibility to maintain a spreadsheet for this. It can be fragile though as the company grows and/or the number of items increases, and you have to make sure it's all still centralized and teams aren't randomly purchasing licenses or subscriptions without telling anyone, it needs to be properly handed off if the person leaves/dies/takes a vacation, backed up if not using a cloud spreadsheet... I've probably seen at least a dozen startups come and go over the years purporting to solve this kind of problem, other businesses integrate it into an existing Salesforce/other deployment... it seems like a fine choice for an internal tool, so long as the tool is running on infrastructure that is no less stable than a spreadsheet on someone's machine.

In the startup world something like "every emailed spreadsheet is a business" used to be a motivating phrase, it must be more rough out there when LLMs can business-ify so many spreadsheet processes (whether it's necessary for the business yet or not). And of course with this sort of tool in particular, more eyes seeing "we're paying $x/mo for this service?" naturally leads to "can't we just use our $y/mo LLM to make our own version?". Not sure I'd want to be in small-time b2b right now.

jeremyjh · 2026-04-25T22:44:40 1777157080

Or better yet just remember it all, no spreadsheet needed either!

Daishiman · 2026-04-25T03:23:38 1777087418

Why are you ignoring the fact that grabbing data from heterogeneous sources, combining it and presenting it is generally never a trivial task? This is exactly what LLMs are good for.

camdenreslink · 2026-04-25T04:56:08 1777092968

If you are using an LLM to actually fetch that data, combine it, and present it to you in an ad hoc way (like you run the same prompt every month or something), I wouldn't trust that at all. It still hallucinates, invents things and takes short cuts too often.

If you are using an LLM to create an application to grab data from heterogeneous sources, combine it and present it, that is much better, but could also basically be the excel spreadsheet they are describing.

Daishiman · 2026-04-25T15:08:57 1777129737

Your knowledge of LLMs is outdated by at least a year. For the past three months at least my team has been one-shotting complex SQL queries that are as semantically correct as your ability to describe them.

And why do you diminish the skill of good data wrangling as if it weren’t the most valuable skill in the vast majority of computer programming jobs? Your cynicism doesn’t correspond with the current ground truth in LLM usage.

camdenreslink · 2026-04-25T16:02:04 1777132924

Well, that is still having the LLM write code which is more like my second scenario. I use SOTA LLMs for coding literally every day. I don't think my knowledge is "outdated by at least a year".

qingcharles · 2026-04-25T01:17:16 1777079836

The ones I can mention.. one that watches a specific web site until an offer that is listed expires and then clicks renew (happens about once a day, but there is no automated way in the system to do it and having the app do it saves it being unlisted for hours and saves someone logging in to do it). Several that download specific combinations of documents from several different portals, where the user would just suck it up previously and right-click on each one to save it (this has a bunch of heuristics because it really required a human before to determine which links to click and in what order, but Claude was able to determine a solid algo for it). Another one that opens PDFs and pulls the titles and dates from the first page of the documents, which again was just done manually before, but now sends the docs via Gemma4 free API on Google to extract the data (the docs are a mess of thousands of different layouts).

Denzel · 2026-04-25T03:01:03 1777086063

None of these projects sound like weeks worth of scope w/o AI.

Gigachad · 2026-04-25T01:41:49 1777081309

We had a coworker vibecode an internal tool, do a bunch of marketing to the company at how incredible it is. Then got hired somewhere else.

I just went and deleted it because it's completely broken at every edge case and half of the happy paths too.

hdndjsbbs · 2026-04-24T22:37:31 1777070251

My team has also adopted this - it's much easier to add another layer than to refine or simplify what exists. We have AI skills to help us debug microservices that call microservices that have circular dependencies.

This was possible before but someone would maybe notice the insane spaghetti. Now it's just "we'll fix it with another layer of noodles".

vineyardmike · 2026-04-25T01:46:49 1777081609

That's so interesting because where I work, the push was to "add one more API" to existing services, turning them into near monoliths for the sake of deployment and access. Still a mess of util and helper functions recursively calling each other, but at least it's one binary in one container.

mancerayder · 2026-04-25T00:05:37 1777075537

Unfortunately I saw this pre-AI with microservices, where while empowering developers with their beloved microservices, we create intense complexity and deployment headaches. AI will fix the slop with an obscuring layer of complexity on top.

layoric · 2026-04-25T00:32:50 1777077170

Are you concerned this will just lead to coupling everywhere like microservices tend to do?

hdndjsbbs · 2026-04-26T12:52:13 1777207933

Oh the "micro services" are all coupled. To test anything you have to deploy a constellation of interdependent services with redundant DBs, each generating new IDs for the same underlying resource.

cobolcomesback · 2026-04-24T23:57:08 1777075028

We’re seeing the exact same where I work. Our main Slack channels have become inundated with “new tool announcements!”, multiple per day, often solving duplicate problems or problems that don’t exist. We’ve had to stop using those channels for any real conversation because most people are muting them due to the slop noise.

And what’s worse is that when someone does build a decent tool, you can’t help but be skeptical because of all the absolute slop that has come out. And everyone thinks their slop doesn’t stink, so you can’t take them at their word when they say it doesn’t. Even in this thread, how are you to know who is talking about building something useful vs something they think is useful?

A lot of people that have always wanted to be developers but didn’t have the skills are now empowered to go and build… things. But AI hasn’t equipped them with the skill of understanding if it actually makes sense to build a thing, or how to maintain it, or how to evolve it, or how to integrate it with other tools. And then they get upset when you tell them their tool isn’t the best thing since sliced bread. It’s exhausting, and I think we’ve yet to see the true consequences of the slop firehose.

trhway · 2026-04-24T23:24:55 1777073095

>Barely an hour goes by without a new 4-page document about something that that everyone is apparently ment to read, digest and respond to, despite its 'author' having done none of those steps, it's starting to feel actively adversarial.

well, isn't that what AI can be used effectively for - to generate [auto]response to the AI generated content.

duskdozer · 2026-04-25T09:32:05 1777109525

What a delightful world we're building.

trhway · 2026-04-25T22:57:32 1777157852

ideally we'd delegate all mindless/routine stuff to AI, and we'd dedicate ourselves to the higher creative and scientific pursuits. Somehow though i think that ideal will come with some adjustments/distortions/bugs :)

Jagerbizzle · 2026-04-24T22:11:42 1777068702

I'm sorry to hear that you have people abusing their new superpowers.

I run a team and am spending my time/tokens on serious pain points.

casey2 · 2026-04-24T22:38:16 1777070296

Such as?

tonyarkles · 2026-04-25T01:00:15 1777078815

I'll throw this out as something where it has saved literally weeks of work: debugging pathological behaviour in third-party code. Prompt example: "Today, when I did U, V, and W. I ended up with X happening. I fixed it by doing Y. The second time I tried, Z happened instead (which was the expected behaviour). Can you work out a plausible explanation for why X happened the first time and why Y fixed it? Please keep track of the specific lines of code where the behaviour difference shows up."

This is in a real-time stateful system, not a system where I'd necessarily expect the exact same thing to happen every time. I just wanted to understand why it behaved differently because there wasn't any obvious reason, to me, why it would.

The explanation it came back with was pretty wild. It essentially boiled down to a module not being adequately initialized before it was used the first time and then it maintained its state from then on out. The narrative touched a lot of code, and the source references it provided did an excellent job of walking me through the narrative. I independently validated the explanation using some telemetry data that the LLM didn't have access to. It was correct. This would have taken me a very long time to work out by hand.

Edit: I have done this multiple times and have been blown away each time.

jeppebemad · 2026-04-25T06:18:59 1777097939

This seems to be a common denominator for what LLMs actually do well: Finding bugs and explaining code. Anything about producing code is still a success to be seen.

zahlman · 2026-04-25T11:56:02 1777118162

> Prompt example: "Today, when I did U, V, and W. I ended up with X happening. I fixed it by doing Y. The second time I tried, Z happened instead (which was the expected behaviour). Can you work out a plausible explanation for why X happened the first time and why Y fixed it? Please keep track of the specific lines of code where the behaviour difference shows up."

> The explanation it came back with was pretty wild. It essentially boiled down to a module not being adequately initialized before it was used the first time and then it maintained its state from then on out.

Even without knowing any of the variable values, that explanation doesn't sound wild at all to me. It sounds in fact entirely plausible, and very much like what I'd expect the right answer to sound like.

tonyarkles · 2026-04-25T17:28:00 1777138080

The wild part, for me at the time, was how many steps there were from cause and effect and how perfectly they'd been reasoned through. The first time I had that experience was my first real "this LLM stuff might have some legs". My second similar experience several days later was "hmmm that wasn't a fluke..."

I'm still at a stage where I'm not completely sure that I like the code that Codex or Claude wants to write. Sometimes it's good, sometimes it takes 5 or 6 iterations to get it somewhere I'm happy with. But wow, on the front end of the work, they are great design/review/iterate partners; sometimes I let the tools write the first draft and then I find the gaps, sometimes I write the first draft and let the tools find the gaps. Either way has worked really well for making solid debt-free progress.

Jagerbizzle · 2026-04-24T22:50:53 1777071053

I answered this in a different comment below, but a lot of the friction is around the amount of time it takes to test/review/submit etc, and a lot of this is centered around tooling that no one has had the time to improve, perf problems in clunky processes that have been around longer than anyone individual, and other things of this nature. Addressing these issues is now approachable and doable in one's "spare time".

casey2 · 2026-04-24T23:19:43 1777072783

The point of that friction is to keep the human in the loop wrt code quality, it's not meant to be meaningless busywork. It's difficult to believe that you sustain the benefit of those systems. Anthropic and Microsoft publicly failed to keep up code quality. They would probably be in a better spot currently if they used neither, no friction, no AI. But that friction exists for a reason and AI doesn't have the "context length" to benefit from it.

This the the difference between intentional and incidental friction, if your CI/CD pipeline is bad it should be improved not sidestepped. The first step in large projects is paving over the lower layer so that all that incidental friction, the kind AI can help with, is removed. If you are constantly going outside that paved area, sure AI will help, but not with the success of the project which is more contingent on the fact that you've failed to lay the groundwork correctly.

girvo · 2026-04-25T00:54:49 1777078489

For me/my team, I use it to fix DevProd pain points that I would otherwise never get the investment to go solve. Just removed Webpack for Rspack, for example. Could easily do it myself, which is why I can prompt it correctly and review the output properly, but I can let it run while I’m in meetings over more important product or architectural decisions

serf · 2026-04-24T23:02:32 1777071752

>Such as?

it's crazy that the experiences are still so wildly varying that we get people that use this strategy as a 'valid' gotcha.

AI works for the vast majority of nowhere-near-the-edge CS work -- you know, all the stuff the majority of people have to do every day.

I don't touch any kind of SQL manually anymore. I don't touch iptables or UFW. I don't touch polkit, dbus, or any other human-hostile IPC anymore. I don't write cron jobs, or system unit files. I query for documentation rather than slogging through a stupid web wiki or equivalent. a decent LLM model does it all with fairly easy 5-10 word prompts.

ever do real work with a mic and speech-to-text? It's 50x'd by LLM support. Gone are the days of saying "H T T P COLON FORWARD SLASH FORWARD SLASH W W W".

this isn't some untested frontier land anymore. People that embrace it find it really empowering except on the edges, and even those state-of-the-art edge people are using it to do the crap work.

This whole "Yeah, well let me see the proof!" ostrich-head-in-the-sand thing works about as long as it takes for everyone to make you eat their dust.

hattmall · 2026-04-25T04:20:38 1777090838

People ask for examples because they want to know what other people are doing. Everything you mention here is VERY reasonable. It's exactly the kind of stuff no one is going to be surprised that you are getting good results with the current AI. But none of that is particularly groundbreaking.

I'm not trying to marginalize your or anyone else's usage of AI. The reason people are saying "such as" is to gauge where the value lies. The US GDP is around 30T. Right now there's is something like ~12T reasonably involved in the current AI economy. That's massive company valuations, data center and infrastructure build out a lot of it is underpinning and heavily influencing traditional sectors of the economy that have a real risk of being going down the wrong path.

So the question isn't what can AI do, it can do a lot, even very cheap models can handle most of what you have listed. The real question is what can the cutting edge state of the art models do so much better that is productively value added to justify such a massive economic presence.

leptons · 2026-04-24T23:47:22 1777074442

That's all well and good, but what happens when the price to run these AIs goes up 10x or even 100x.

It's the same model as Uber, and I can't afford Uber most of the time anymore. It's become cost prohibitive just to take a short ride, but it used to cost like $7.

It's all fun and games until someone has to pay the bill, and these companies are losing many billions of dollars with no end in sight for the losses.

I doubt the tech and costs for the tech will improve fast enough to stop the flood of money going out, and I doubt people are going to want to pay what it really costs. That $200/month plan might not look so good when it's $2000/month, or more.

nvader · 2026-04-25T00:39:40 1777077580

Why not try it yourself? Inference providers like BaseTen and AWS Bedrock have perfectly capable open source models as well as some licensed closed source models they host.

You can use "API-style" pricing on these providers which is more transparent to costs. It's very likely to end up more than 200 a month, but the question is, are you going to see more than that in value?

For me, the answer is yes.

leptons · 2026-04-25T02:28:22 1777084102

What makes you think I haven't tried it myself?

The "costs" are subsidized, it's a loss-leader.

jeremyjh · 2026-04-25T22:50:05 1777157405

Bedrock and other third party open weight hosted model costs are not subsidized. What could possibly be the investment strategy for being one of twelve fly-by-night openrouter operators hosting the latest Qwen?

Jach · 2026-04-25T00:20:54 1777076454

It's an important concern for those footing the bill, but I expect companies really in the face of being impacted by it to be able to do a cost-benefit calculation and use a mix of models. For the sorts of things GP described (iptables whatever, recalling how to scan open ports on the network, the sorts of things you usually could answer for yourself with 10-600 seconds in a manpage / help text / google search / stack overflow thread), local/open-weight models are already good enough and fast enough on a lot of commodity hardware to suffice. Whereas now companies might say just offload such queries to the frontier $200/mo plan because why not, tokens are plentiful and it's already being paid for, if in the future it goes to $2000/mo with more limited tokens, you might save them for the actual important or latency-sensitive work and use lower-cost local models for simpler stuff. That lower-cost might involve a $2000 GPU to be really usable, but it pays for itself shortly by comparison. To use your Uber analogy, people might have used it to get to downtown and the airport, but now it's way more expensive, so they'll take a bus or walk or drive downtown instead -- but the airport trip, even though it's more expensive than it used to be, is still attractive in the face of competing alternatives like taxis/long term parking.

Planktonne · 2026-04-25T00:02:27 1777075347

None of that is concrete though; it's all alleged speed-ups with no discernable (though a lot of claimed) impact.

> This whole "Yeah, well let me see the proof!" ostrich-head-in-the-sand thing works about as long as it takes for everyone to make you eat their dust.

People will stop asking for the proof when the dust-eating commences.

nathancahill · 2026-04-24T22:44:35 1777070675

Creating stakeholder value

natpalmer1776 · 2026-04-24T22:59:09 1777071549

Promoting synergy

paradoxyl · 2026-04-25T00:28:19 1777076899

Creating productivity gain narrtives

scottyah · 2026-04-25T01:01:15 1777078875

Aligning stakeholders

DANmode · 2026-04-25T05:45:11 1777095911

Eating a bagel

er2d · 2026-04-24T22:08:01 1777068481

Im convinced none of these people have any training in corporate finance. For if they did they'd realise they were wasting money.

I guess you gotta look busy. But the stick will come when the shareholders look at the income statement and ask... So I see an increase in operating expenses. Let me go calculate the ROIC. Hm its lower, what to do? Oh I know, lets fire the people who caused this (it wont be the C-Suite or management who takes the fall) lmao.

dpark · 2026-04-24T22:42:33 1777070553

Do you really think companies have started spending millions on tokens and no one from finance has been involved?

You could argue that all the spending is wasted (doubtless some is), but insisting that the decision is being made in complete ignorance of financial concerns reeks of that “everyone’s dumb but me” energy.

hattmall · 2026-04-25T04:29:25 1777091365

There is a difference to just noticing and attributing it to and recognizing negative financial outcomes. Right now for most companies they are still adjusting to declining inflation. Their bottom lines are doing quite well because consumer price inflation is much stickier than supply inflation. We are coming off of one of the quickest and largest supply lead inflationary cycles. It may not be immediately apparent for many companies that new expenditures are a drag on profitability.

The real thing to look at is whether or not the future outlook for company AI spend is heading up or down?

wjeje · 2026-04-25T00:52:36 1777078356

What a finance team allocates on spend has nothing to do with what the tokens actually get used for.

Are they peeking over the shoulder of each team and individual? Of course not.

It can be the case that the spend is absolutely wasteful. Numbers don’t lie.

temp8830 · 2026-04-24T23:47:56 1777074476

> Do you really think companies have started spending millions on tokens and no one from finance has been involved?

Oh, they were involved all right. They ran their analyses and realized that the increase in Acme Corp's share price from becoming "AI-enabled" will pay for the tokens several times over. For today. They plan to be retired before tomorrow.

wjeje · 2026-04-25T00:49:46 1777078186

That magic trick only works for publicly traded stocks.

Most firms are not a google or a Microsoft - a firms cash balance can become a strategic weapon in the right environment. So wasting money is not a great idea. Lest we forget dividends.

Moreover if you have a budget set re. Spend on tokens - you have rationing. Therefore the firm should be trying to get the most out of token spend. If you are wasting tokens on stuff that doesn’t create a benefit financially for the firm then indeed it is not inline with proper corporate financial theory.

strange_quark · 2026-04-25T03:18:51 1777087131

No, it works for any VC-backed companies. Something like 60% of VC funding last year went to AI companies. VCs aren't going to give you a money unless you're building an agentic AI-native agent platform for agents.

wjeje · 2026-04-25T03:40:27 1777088427

No Employees of publicly traded firms benefit from short-term gains in the stock price, assuming the stock price jump holds throughout the period of grant/vesting.

People who work at VC-backed firms do not get to enjoy the same degree of liquidity, not even close. There can be some outliers but that is 0.1% of all.

Can't believe simple stuff like this has to be said.

strange_quark · 2026-04-25T03:48:57 1777088937

CFOs or VPs absolutely benefit by hyping their company up to private investors by allowing tokenmaxxing to go on unchecked. Tender offers, acquisitions, and aquihires all exist. Or just good old fashioned resume padding by saying you "enabled AI transformation" or whatever helps you land a big payday at some other company.

dpark · 2026-04-24T23:56:14 1777074974

Sounds like they did train in corporate finance.

wjeje · 2026-04-25T00:55:51 1777078551

Sounds like you haven’t had training in corporate finance.

casey2 · 2026-04-24T23:05:06 1777071906

More that there is a poor incentive structure. Just like how PE can make money by leveraged buyouts and running businesses into the ground. Many of the financial instruments that make both that and the current AI bubble possible were legal then made illegal within the lifetimes of the last 16 presidents.

Round-tripping used to be regulated. SPVs used to be regulated. If you need a loan you used to have to go to something called a bank, now it comes from ???? who knows drug cartels, child traffickers, blackstone, russians & chinese oligarchs. Even assuming it doesn't collapse tommorow why should they make double digit returns on AI datacenters built on the backs of Americans?

dpark · 2026-04-25T00:01:47 1777075307

My issue was not with criticism of the money being spent or how it’s being obtained. I was specifically commenting on this statement:

> “Im convinced none of these people have any training in corporate finance. For if they did they'd realise they were wasting money.”

This isn’t meaningful criticism. This is a vacuous “those guys are so dumb”.

fc417fc802 · 2026-04-25T00:50:54 1777078254

Sounds like a workplace wide DDoS.

_zoltan_ · 2026-04-25T07:30:55 1777102255

That's not on Claude, that's on the authors.

Claude is a tool. It can be abused, or used in a sloppy way. But it can also be used rigorously.

I've been beating my team to be more papercut-free in the tooling they develop and it's been rough mostly because of the velocity.

But overall it's a huge net positive.

jeremyjh · 2026-04-25T00:15:55 1777076155

I'm sorry to hear you have such poor leadership.

BloondAndDoom · 2026-04-24T22:01:16 1777068076

AI is truly perfect for internal tooling. Security is less or no concern, bugs are more acceptable, performance / scalability rarely a concern. Quickest way to get things done, and speed up production development, MVP development etc.

jdub · 2026-04-24T22:16:27 1777068987

> Security is less or no concern

[waits for chickens to come home to roost]

overfeed · 2026-04-24T22:49:03 1777070943

> [waits for chickens to come home to roost]

"We are writing down X billions over 4 years, and have cancel several ambitious programs related to our AI experiments. We were following standard practice in the industry, so [shareholders] can't blame us for these chickens coming to roost. If everyone is guilty, is anyone really guilty?"

connicpu · 2026-04-24T22:28:39 1777069719

Doesn't take long until someone has the bright idea to pipe customer tickets directly into the poorly written internal tool

TeMPOraL · 2026-04-25T10:10:29 1777111829

If security was the prime concern, there would be no chickens and no coop and no farm - people would still be living in caves. After all, outside is dangerous, and Grug Chief said, smart ass grugs with their smart ass ideas like fire or agriculture just invite complexity and create security vulnerabilities.

After all (Grug Chief reminds us), the only truly secure computing system is an inert rock.

jdub · 2026-04-26T05:55:25 1777182925

You're hearing "don't invent fire" but what's being said is "for fuck's sake, stop lighting fires in the cave".

2ndorderthought · 2026-04-24T23:02:31 1777071751

No problems at all except, unauthorized access to a model they were claiming was a weapon and couldn't be released to the public and having their cli code leaked in the last two weeks. Everything's just fine

LPisGood · 2026-04-24T22:37:55 1777070275

When attackers can move laterally through everything because every internal tool leaks credentials and data there will be issues.

therealdrag0 · 2026-04-25T02:48:36 1777085316

Internal tool Doesn’t have credential. Checkmate ;)

sumedh · 2026-04-24T22:51:56 1777071116

Anthropic seems to be doing fine :)

cobolcomesback · 2026-04-24T23:44:38 1777074278

This comment makes me want to scream.

lioeters · 2026-04-25T03:14:53 1777086893

This is what happens when entire industries go all in on "Move fast and break things." Imagine what they said about software applying to everything else in the world. That's what's coming.

> Security is less or no concern, bugs are more acceptable, performance / scalability rarely a concern. Quickest way to get things done

TeMPOraL · 2026-04-25T10:13:36 1777112016

> This is what happens when entire industries go all in on "Move fast and break things." Imagine what they said about software applying to everything else in the world. That's what's coming.

This is literally how rest of the world works already, and always had. We'd still be living in caves otherwise. Fortunately most people (at least outside software) seem to understand that security is a trade-off against usefulness, and not an end goal in itself.

_zoltan_ · 2026-04-25T07:32:15 1777102335

This is not going away.

Even right now the difference with working with 'AI native' developers or with regular developers is day and night.

I certainly wouldn't want a non-clause enabled developer on my team now.

svieira · 2026-04-25T11:31:48 1777116708

> I certainly wouldn't want a non-clause enabled developer on my team now.

You only want to work with people who are hip with the North Pole?

_zoltan_ · 2026-04-25T13:01:36 1777122096

Typo obviously :-)

amluto · 2026-04-24T23:27:59 1777073279

I am, oddly, able to get really quite a lot of mileage out of $20/mo of OpenAI plan, and I have never encountered a usage limit. I have gotten warnings that I was close a couple times.

I wonder what I’m doing differently.

I did spend quite a bit of time, mostly manually, improving development processes such that the agent could effectively check its work. This made a difference between the agent mostly not working and mostly working. Maybe if I had instead spent gobs of money it would have worked output tooling improvements?

komali2 · 2026-04-25T02:10:53 1777083053

I wonder if you're like me? I tried out the MCPs and sub agents and rules and bells and whistles and always just came back to a plain Codex / Claude Code / Cursor Agent terminal window, where I say what I want, @ a few files, let it rip, check the diff, ask for some adjustments, then commit and start the process over after clearing context.

Haven't found a process that beats this yet and I burn very few tokens this way.

devmor · 2026-04-25T03:57:02 1777089422

I don’t really write code with it at all, and that’s why I burn so many tokens.

I like writing code, I’m good at writing code. What I hate doing is dredging through logs, filtering out test scenarios and putting together disparate information from knowledge silos - so I have the AI doing that. It’s my research assistant.

Effectively I’m using it like an automated search engine that indexes anything I want and refines the results by using the statistical near neighbors of how other people explained their searches.

se4u · 2026-04-24T22:02:04 1777068124

I'd be interested to learn what kind of internal tooling are you improving ?

Jagerbizzle · 2026-04-24T22:13:34 1777068814

We've had a lot of complaints about our review processes, time to submit, etc, and a lot of that boils down to tools no one has time to improve.

It's now trivial to fix these problems while still doing our day jobs -- shipping a product.

TimTheTinker · 2026-04-24T22:09:36 1777068576

Personally, a static analysis PR check to catch some types of preventable runtime production errors in application code