It's the subtle errors that are really difficult to navigate. I got burned for about 40 hours on a conditional being backward in the middle of an otherwise flawless method.
The apparent speed-up is mostly a deception. It definitely helps with rough outlines and approaches. But the faster you go, the less you notice the fine details, and the more assumptions you accumulate before realizing the fundamental error.
I'd rather find out I was wrong within the same day. I'd probably have written some unit tests and played around with that function a lot more if I had handcrafted it.
When I'm able to ask an LLM a very simple question and that saves me from context-switching to answer the same simple question myself, it's a big time saver for me, even if it's hard to quantify.
Anything that reduces my cognitive load when the pressure is on is a blessing on some level.
Cognitive load is something people always leave out. I can fuckin code drunk with these things. Or just increase stamina to push farther than I would writing every single line.
You could make the same argument for any non-AI driven productivity tool/technique. If we can't trust the user to determine what is and is not time-saving then time-saving isn't a useful thing to discuss outside of an academic setting.
My issue with most AI discussions is they seem to completely change the dimensions we use to evaluate basic things. I believe if we replaced "AI" with "new useful tool" then people would be much more eager to adopt it.
What clicked for me is when I started treating it more like a tool and less like some sort of nebulous Pandora's box.
Now to me it's no different than auto completing code, fuzzy finding files, regular expressions, garbage collection, unit testing, UI frameworks, design patterns, etc. It's just a tool. It has weaknesses and it has strengths. Use it for the strengths and account for the weaknesses.
Like any tool it can be destructive in the hands of an inexperienced person or a person who's asking it to do too much. But in the hands of someone who knows what they're doing and knows what they want out of it - it's so freakin' awesome.
Sorry for the digression. All that to say that if someone believes it's a productivity boost for them then I don't think they're being misled.
Except actual studies objectively show efficiency gains, more so with junior devs, which makes sense. So no, it's not a "deception", but it is often overstated in popular media.
And anecdotes are useless. If you want to show me studies justifying your claim, great, but no, I don't value random anecdotes. There are countless conflicting anecdotes (including my own).
I find the opposite: the more senior you are, the more value they offer, because you know how to ask the right questions, how to vary the questions and try different tacks, and how to spot errors or mistakes.
That’s the thing, isn’t it? The craft of programming in the small is one of being intimate with the details, thinking things through conscientiously. LLMs don’t do that.
I find that it depends very heavily on what you're up to. When I ask it to write Nix code it'll just flat out forget how the syntax works halfway through. But if I want it to troubleshoot an Emacs config or wield matplotlib it's downright wizardly, often including the kind of thing that does indicate an intimacy with the details. I get distracted because I'm then asking it:
> I undid your change, which made no sense to me, and now everything is broken. Why is what you did necessary?
I think we just have to ask ourselves what we want it to be good at, and then be diligent about generating decades worth of high quality training material in that domain. At some point, it'll start getting the details right.
What languages/toolkits are you working with that are less than 10 years old?
Anyhow, it seems to me like it is working. It's just working better for the really old stuff because:
- there has been more time for training data to accumulate
- some of it predates the trend of monetizing data, so there was less hoarding and more sharing
It may be that the hard slow way is the only way to get good results. If the modern trends re: products don't have the longevity/community to benefit from it, maybe we should fix that.
I use ChatGPT for coding / API questions pretty frequently. It's bad at writing code with any kind of non-trivial design complexity.
There have been a bunch of times where I've asked it to write me a snippet of code, and it cheerfully gave me back something that didn't work for one reason or another. Hallucinated methods are common. Then I ask it to check its code, and it'll find the error and give me back code with a different error. I'll repeat the process a few times before it eventually gets back to code that resembles its first attempt. Then I'll give up and write it myself.
As an example of a task that it failed to do: I asked it to write me an example Python function that runs a subprocess, prints its stdout transparently (so that I can use it for running interactive applications), but also records the process's stdout so that I can use it later. I wanted something that used non-blocking I/O methods, so that I didn't have to explicitly poll every N milliseconds or something.
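For what it's worth, here's a rough sketch of the kind of solution I was after, using Python's selectors module so the loop blocks on the pipe instead of polling on a timer. This is illustrative only: it assumes line-buffered text output on a Unix-like platform, and a fully interactive child that prompts without a trailing newline would need the pty module instead.

```python
import selectors
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd, echo its stdout live, and return the captured output.

    Sketch only: assumes line-buffered text output and a Unix-like
    selector over a pipe; a truly interactive child (prompts with no
    trailing newline) would need the pty module instead.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ)

    captured = []
    eof = False
    while not eof:
        # Blocks until the pipe is readable -- no fixed polling interval.
        for key, _ in sel.select():
            line = key.fileobj.readline()
            if line:
                sys.stdout.write(line)   # print transparently
                captured.append(line)    # record for later use
            else:
                eof = True               # child closed its stdout

    sel.unregister(proc.stdout)
    proc.stdout.close()
    proc.wait()
    return "".join(captured)

if __name__ == "__main__":
    out = run_and_capture([sys.executable, "-c", "print('hello'); print('world')"])
    print("captured:", repr(out))
```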
Honestly I find that when GPT starts to lose the plot it's a good time to refactor and then keep on moving. "Break this into separate headers or modules and give me some YAML-like markup with function names, return types, etc. for each file." Or just use stubs instead of dumping every line of code in.
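By "stubs" I mean handing it just the signatures and docstrings, something like the sketch below (the function names are made up, purely to show the shape):

```python
# Stub context for the model: signatures and docstrings only, no bodies.
# These names are purely illustrative, not from any real project.

def load_config(path: str) -> dict:
    """Read the config file at `path` and return it as a dict."""
    ...

def sync_records(source: dict, dest: dict) -> list[str]:
    """Copy records missing from dest, returning the IDs that were copied."""
    ...
```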
If it takes almost no cognitive energy, quite a while. Even if it's a little slower than what I can do, I don't care because I didn't have to focus deeply on it and have plenty of energy left to keep on pushing.
As my mother used to say, "I love work. I could watch it all day!"
I can see where you are coming from.
Maintaining a better creative + technical balance, instead of see-sawing. More continuous conscious planning, less drilling.
Plus the unwavering, tireless help of these AIs seems psychologically conducive to maintaining one's own motivation. Even if I end up designing an elaborate garden estate, or a simpler, better six-axis camera stabilizer/tracker, or refactoring how I think of primes before attempting a theorem... when that was not my agenda for the day. Or any day.
I'm constantly having to go back and tell the AI about every mistake it makes and remind it not to reintroduce mistakes that were previously fixed. "no cognitive energy" is definitely not how I would describe that experience.
Exactly, 1 step forward, 1 step backward. Avoiding edge cases is something that can't be glossed over, and for that I need to carefully review the code. Since I'm accountable for it, and can't skip this part anyway, I'd rather review my own than some chatbot's.
Why would you skip writing unit tests just because AI wrote the function? Unit tests should be written regardless of the skill of the developer. Ironically, unit tests are also one area where AI really does help you move faster.
High-level design, rough outlines and approaches, is the worst place to use AI. The other place AI is pretty good is surfacing API or function calls you might not know about if you're new to the language. Basically, it can save you a lot of time by avoiding tons of internet searching in some cases.
> With minimal guidance[, LLM-based systems] put out pretty sensible tests.
Yes and no. They get all the initial annoying boilerplate of writing tests out of the way, and the tests end up being mostly decent on the surface, but I have to manually tweak the behavior and write most of the important parts myself, especially for non-trivial, tricky scenarios.
However, I am not saying this as a point against LLMs. The fact that they are able to get a good chunk of the boring boilerplate parts of writing unit tests out of the way and let me focus on the actual logic of individual tests has been noticeably helpful to me, personally.
I only use LLMs for the very first phase of writing unit tests, with most of the work still being done by me. But that initial phase is the most annoying and boring part of the process for me. So even if I still spend 90% of the time writing code manually, I'm still very glad to get that initial boring part out of the way quickly, without wasting mental effort on it.
The fact that you think "change detection" tests offer zero value speaks volumes. Those may well be the most important use of unit tests. Getting the function correct in the first place isn't that hard for a senior developer, which is often why it's tempting to skip unit tests. But then you go refactor something and oops you broke it without realizing it, some boring obvious edge case, or the like.
These tests are also very time-consuming to write, with lots of boilerplate that AI is very good at writing.
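To make that concrete, here's a minimal sketch of a change-detection test; parse_price is a hypothetical helper, and the point is that the boring edge cases are pinned down so a later refactor can't silently break them (run with pytest):

```python
# Change-detection tests for a hypothetical parse_price() helper.
# They pin current behaviour, including edge cases, so a refactor
# that breaks one of them fails loudly instead of silently.

def parse_price(text: str) -> int:
    """Parse a price like '$1,234.50' into cents."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    dollars, _, cents = cleaned.partition(".")
    return int(dollars or 0) * 100 + int((cents or "0").ljust(2, "0")[:2])

def test_parse_price_basic():
    assert parse_price("$1,234.50") == 123450

def test_parse_price_edge_cases():
    # The "obvious" cases a refactor is most likely to break without noticing.
    assert parse_price("$0.99") == 99
    assert parse_price(".5") == 50
    assert parse_price("$10") == 1000
```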
>The fact that you think "change detection" tests offer zero value speaks volumes.
But code should change. What shouldn't change, if business rules don't change, is APIs and contracts. And for that we have integration tests and end to end tests.
I'm kind of starting to doubt the utility of unit tests. From a theoretical perspective I see the point of writing them, but in practice I've rarely seen them be useful. Guy A writes poor logic and sets that poor logic in stone by writing a unit test. Manual testing discovers a bug, so guy B has to modify both that poor logic and the unit test.
I'd rather see the need for integration tests and end-to-end tests. I want to test business logic, not assert that 2 + 2 = 4.
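To illustrate the contrast (all names and the discount rule here are hypothetical):

```python
# The kind of unit test I find pointless: it just restates the code.
def add(a, b):
    return a + b

def test_add():
    assert add(2, 2) == 4

# The kind of test I'd rather have: it pins an actual business rule
# (orders over $100 get 10% off) rather than trivial arithmetic.
def apply_discount(total_cents: int) -> int:
    return int(total_cents * 0.9) if total_cents > 100_00 else total_cents

def test_bulk_order_discount():
    assert apply_discount(150_00) == 135_00   # rule applies
    assert apply_discount(99_99) == 99_99     # rule does not apply
```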