It's the subtle errors that are really difficult to navigate. I got burned for about 40 hours on a conditional being backward in the middle of an otherwise flawless method.
The apparent speed-up is mostly a deception. It definitely helps with rough outlines and approaches. But the faster you go, the less you notice the fine details, and the more assumptions you accumulate before realizing the fundamental error.
I'd rather find out I was wrong within the same day. I'd probably have written some unit tests and played around with that function a lot more if I had handcrafted it.
When I'm able to ask an LLM a very simple question and that saves me from context-switching to answer the same simple question myself, it's a big time saver for me, even if it's hard to quantify.
Anything that reduces my cognitive load when the pressure is on is a blessing on some level.
Cognitive load is something people always leave out. I can fuckin code drunk with these things. Or just increase stamina to push farther than I would writing every single line.
You could make the same argument for any non-AI driven productivity tool/technique. If we can't trust the user to determine what is and is not time-saving then time-saving isn't a useful thing to discuss outside of an academic setting.
My issue with most AI discussions is they seem to completely change the dimensions we use to evaluate basic things. I believe if we replaced "AI" with "new useful tool" then people would be much more eager to adopt it.
What clicked for me is when I started treating it more like a tool and less like some sort of nebulous Pandora's box.
Now to me it's no different than auto completing code, fuzzy finding files, regular expressions, garbage collection, unit testing, UI frameworks, design patterns, etc. It's just a tool. It has weaknesses and it has strengths. Use it for the strengths and account for the weaknesses.
Like any tool it can be destructive in the hands of an inexperienced person or a person who's asking it to do too much. But in the hands of someone who knows what they're doing and knows what they want out of it - it's so freakin' awesome.
Sorry for the digression. All that to say that if someone believes it's a productivity boost for them then I don't think they're being misled.
Except actual studies objectively show efficiency gains, more so with junior devs, which makes sense. So no, it's not a "deception", but it is often overstated in popular media.
And anecdotes are useless. If you want to show me studies justifying your claim, great, but no, I don't value random anecdotes. There are countless conflicting anecdotes (including my own).
I find the opposite: the more senior you are, the more value they offer, because you know how to ask the right questions, how to vary the questions and try different tacks, and how to spot errors or mistakes.
That’s the thing, isn’t it? The craft of programming in the small is one of being intimate with the details, thinking things through conscientiously. LLMs don’t do that.
I find that it depends very heavily on what you're up to. When I ask it to write Nix code it'll just flat out forget how the syntax works halfway through. But if I want it to troubleshoot an Emacs config or wield matplotlib it's downright wizardly, often including the kind of thing that does indicate an intimacy with the details. I get distracted because I'm then asking it:
> I undid your change, which made no sense to me, and now everything is broken. Why is what you did necessary?
I think we just have to ask ourselves what we want it to be good at, and then be diligent about generating decades worth of high quality training material in that domain. At some point, it'll start getting the details right.
What languages/toolkits are you working with that are less than 10 years old?
Anyhow, it seems to me like it is working. It's just working better for the really old stuff because:
- there has been more time for training data to accumulate
- some of it predates the trend of monetizing data, so there was less hoarding and more sharing
It may be that the hard slow way is the only way to get good results. If the modern trends re: products don't have the longevity/community to benefit from it, maybe we should fix that.
I use ChatGPT for coding / API questions pretty frequently. It's bad at writing code with any kind of non-trivial design complexity.
There have been a bunch of times where I've asked it to write me a snippet of code, and it cheerfully gave me back something that didn't work for one reason or another. Hallucinated methods are common. Then I ask it to check its code, and it'll find the error and give me back code with a different error. I'll repeat the process a few times before it eventually gets back to code that resembles its first attempt. Then I'll give up and write it myself.
As an example of a task that it failed to do: I asked it to write me an example Python function that runs a subprocess, prints its stdout transparently (so that I can use it for running interactive applications), but also records the process's stdout so that I can use it later. I wanted something that used non-blocking I/O methods, so that I didn't have to explicitly poll every N milliseconds or something.
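For what it's worth, here's a rough sketch of the kind of solution I was after, using Python's selectors module so the loop blocks on the pipe instead of polling on a timer. This is illustrative only: it assumes line-buffered text output on a Unix-like platform, and a fully interactive child that prompts without a trailing newline would need the pty module instead.

```python
import selectors
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd, echo its stdout live, and return the captured output.

    Sketch only: assumes line-buffered text output and a Unix-like
    selector over a pipe; a truly interactive child (prompts with no
    trailing newline) would need the pty module instead.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ)

    captured = []
    eof = False
    while not eof:
        # Blocks until the pipe is readable -- no fixed polling interval.
        for key, _ in sel.select():
            line = key.fileobj.readline()
            if line:
                sys.stdout.write(line)   # print transparently
                captured.append(line)    # record for later use
            else:
                eof = True               # child closed its stdout

    sel.unregister(proc.stdout)
    proc.stdout.close()
    proc.wait()
    return "".join(captured)

if __name__ == "__main__":
    out = run_and_capture([sys.executable, "-c", "print('hello'); print('world')"])
    print("captured:", repr(out))
```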
Honestly I find that when GPT starts to lose the plot it's a good time to refactor and then keep on moving. "Break this into separate headers or modules and give me some YAML-like markup with function names, return types, etc. for each file." Or just use stubs instead of dumping every line of code in.
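By "stubs" I mean handing it just the signatures and docstrings, something like the sketch below (the function names are made up, purely to show the shape):

```python
# Stub context for the model: signatures and docstrings only, no bodies.
# These names are purely illustrative, not from any real project.

def load_config(path: str) -> dict:
    """Read the config file at `path` and return it as a dict."""
    ...

def sync_records(source: dict, dest: dict) -> list[str]:
    """Copy records missing from dest, returning the IDs that were copied."""
    ...
```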
If it takes almost no cognitive energy, quite a while. Even if it's a little slower than what I can do, I don't care because I didn't have to focus deeply on it and have plenty of energy left to keep on pushing.
As my mother used to say, "I love work. I could watch it all day!"
I can see where you are coming from.
Maintaining a better creative + technical balance, instead of see-sawing. More continuous conscious planning, less drilling.
Plus the unwavering, tireless help of these AIs seems psychologically conducive to maintaining one's own motivation. Even if I end up designing an elaborate garden estate, or a simpler, better six-axis camera stabilizer/tracker, or refactoring how I think of primes before attempting a theorem... when that was not my agenda for the day. Or any day.
I'm constantly having to go back and tell the AI about every mistake it makes and remind it not to reintroduce mistakes that were previously fixed. "no cognitive energy" is definitely not how I would describe that experience.
Exactly, 1 step forward, 1 step backward. Avoiding edge cases is something that can't be glossed over, and for that I need to carefully review the code. Since I'm accountable for it, and can't skip this part anyway, I'd rather review my own than some chatbot's.
Why would you skip writing unit tests just because AI wrote the function? Unit tests should be written regardless of the skill of the developer. Ironically, unit tests are also one area where AI really does help you move faster.
High-level design, rough outlines and approaches, is the worst place to use AI. The other place AI is pretty good is surfacing API or function calls you might not know about if you're new to the language. Basically, it can save you a lot of time by avoiding tons of internet searching in some cases.
> With minimal guidance[, LLM-based systems] put out pretty sensible tests.
Yes and no. They get all the initial annoying boilerplate of writing tests out of the way, and the tests end up being mostly decent on the surface, but I have to manually tweak the behavior and write most of the important parts myself, especially for non-trivial, tricky scenarios.
However, I am not saying this as a point against LLMs. The fact that they are able to get a good chunk of the boring boilerplate parts of writing unit tests out of the way and let me focus on the actual logic of individual tests has been noticeably helpful to me, personally.
I only use LLMs for the very first phase of writing unit tests, with most of the work still being done by me. But that initial phase is the most annoying and boring part of the process for me. So even if I still spend 90% of the time writing code manually, I'm still very glad to get that initial boring part out of the way quickly, without wasting mental effort on it.
The fact that you think "change detection" tests offer zero value speaks volumes. Those may well be the most important use of unit tests. Getting the function correct in the first place isn't that hard for a senior developer, which is often why it's tempting to skip unit tests. But then you go refactor something and oops you broke it without realizing it, some boring obvious edge case, or the like.
These tests are also very time-consuming to write, with lots of boilerplate that AI is very good at writing.
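To make that concrete, here's a minimal sketch of a change-detection test; parse_price is a hypothetical helper, and the point is that the boring edge cases are pinned down so a later refactor can't silently break them (run with pytest):

```python
# Change-detection tests for a hypothetical parse_price() helper.
# They pin current behaviour, including edge cases, so a refactor
# that breaks one of them fails loudly instead of silently.

def parse_price(text: str) -> int:
    """Parse a price like '$1,234.50' into cents."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    dollars, _, cents = cleaned.partition(".")
    return int(dollars or 0) * 100 + int((cents or "0").ljust(2, "0")[:2])

def test_parse_price_basic():
    assert parse_price("$1,234.50") == 123450

def test_parse_price_edge_cases():
    # The "obvious" cases a refactor is most likely to break without noticing.
    assert parse_price("$0.99") == 99
    assert parse_price(".5") == 50
    assert parse_price("$10") == 1000
```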
>The fact that you think "change detection" tests offer zero value speaks volumes.
But code should change. What shouldn't change, if business rules don't change, is APIs and contracts. And for that we have integration tests and end to end tests.
I'm kind of starting to doubt the utility of unit tests. From a theoretical perspective I see the point of writing them, but in practice I've rarely seen them be useful. Guy A writes poor logic and sets that poor logic in stone by writing a unit test. Manual testing discovers a bug, so guy B has to modify both that poor logic and the unit test.
I'd rather see the need for integration tests and end-to-end tests. I want to test business logic, not assert that 2 + 2 = 4.
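To illustrate the contrast (all names and the discount rule here are hypothetical):

```python
# The kind of unit test I find pointless: it just restates the code.
def add(a, b):
    return a + b

def test_add():
    assert add(2, 2) == 4

# The kind of test I'd rather have: it pins an actual business rule
# (orders over $100 get 10% off) rather than trivial arithmetic.
def apply_discount(total_cents: int) -> int:
    return int(total_cents * 0.9) if total_cents > 100_00 else total_cents

def test_bulk_order_discount():
    assert apply_discount(150_00) == 135_00   # rule applies
    assert apply_discount(99_99) == 99_99     # rule does not apply
```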