It is like needing a prediction (e.g. about market behavior) and knowing that somewhere out there is a person who will make the perfect one. Your problem is no longer making the prediction; it is finding and identifying that expert. But is the problem you have converted yours into any less hard?
I too have had some minor successes; the current products are definitely a great step forward. However, every time I start anything more complex, I never know in advance whether I will end up with something usable or with utterly unusable code, even after corrections (with the "AI" always confidently claiming that it has now definitely fixed the problem).
All examples such as yours suffer from one big problem: they are selected after the fact.
To be useful, you would have to predict in advance how useful the "AI" will be, then run it and have that prediction verified.
Selecting positive examples after the work is done is not very helpful. All it does is prove that at least sometimes somebody gets something useful out of using an LLM for a complex problem. Okay? I think most people understand that by now.
PS/Edit: Success stories we only hear about but cannot follow and reproduce may have been somewhat useful early on, but by now most people are past that point: they are willing to give it a try and would like a link to a working, reproducible example. I understand that work can rarely be shared, but that makes those stories far less useful. What would add real value for readers of these discussions now is for people who report success to post the full, working, reproducible example.
EDIT 2: Another thing: I see comments from people who say they tweaked CLAUDE.md and got it to work. But the point is predictability and consistency! If you have one project where you fiddled with the file, adding random sentences you hoped would get the LLM to do what you need, that is not very useful. We already know that trying many things sometimes yields results; what we need is predictability and consistency.
We are used to being able to try things out, and once we got something working we could almost always confidently say we had found the solution and share it. But LLMs are not that consistent.
My point is that these are not minor successes, and not occasional. Not every attempt is equally successful, but a significant majority of my attempts are. Otherwise I wouldn't be letting it run for longer and longer without intervention.
For me this isn't one project where I've "twiddled around with the file and added random sentences". It's an increasingly systematic method: giving it a process for making changes, giving it regression tests, and making it make small, testable changes.
I do that because, at this point, I can predict with a high success rate that it will make progress for me.
There are failures, but they are few, and they are usually fixed simply by restarting it from the last successful change when it goes too long without passing more tests. Occasionally it requires me to turn off --dangerously-skip-permissions and guide it through a tricky part, but that is getting rarer and rarer.
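The restart-from-the-last-successful-change loop above can be sketched in miniature. This is an illustration only, not the actual tooling: the dict "workspace" and `run_tests` callable are stand-ins for a git working tree and a real test suite.

```python
# Sketch of a checkpoint-and-revert loop: apply a proposed change to a copy
# of the workspace, keep it only if the tests pass, otherwise fall back to
# the last known-good state. (In practice this would be git commits/resets
# around an LLM's edits; the dict here is a toy stand-in.)
import copy

def apply_with_checkpoint(workspace, change, run_tests):
    """Apply `change` to a copy of `workspace`; keep it only if tests pass."""
    candidate = copy.deepcopy(workspace)
    change(candidate)                   # the proposed edit
    if run_tests(candidate):
        return candidate, True          # checkpoint: tests green, keep it
    return workspace, False             # revert: last successful state wins

# A change that breaks the (toy) invariant gets rolled back:
ws = {"answer": 42}
ws, ok = apply_with_checkpoint(ws, lambda w: w.update(answer=41),
                               lambda w: w["answer"] == 42)
```

The point of the pattern is that a failed attempt costs only one retry from the checkpoint, not a corrupted working tree.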
No, I haven't formally documented it, so it's reasonable to be skeptical. (I have, however, started packaging up the hooks, agents, and instructions that consistently work for me across multiple projects. For now it's just for a specific client, but I might do a writeup at some point.) At the same time, it's equally warranted to wonder whether the vast difference in reported results comes down to what you suggest, or to something you're doing differently in how you use these tools.
New hires perform consistently. Even if you can't predict beforehand how well they'll work, after a short observation period you can predict very well how they will continue to work.
But that is exactly the problem, no?