> I’ve seen employees spend days drafting documents that a free tool like Mistral could generate in seconds, leaving them 30-60 minutes to review and refine.
What I have seen is employees spending days asking the model again and again to actually generate the document they need, then submitting it without reviewing it, only for a problem to explode a month later because no one noticed a glaring absurdity in the middle of the AI-polished garbage.
You're describing incompetence or laziness—I’ve encountered those kinds of people as well. But I’ve also seen others who are 2-3 times more productive thanks to AI. That said, I’m not suggesting AI should be used for every single task, especially if the output is garbage. If someone blindly relies on AI without adding any real value beyond typing prompts, then they’re not contributing anything meaningful.
Up to now, my attempts at doing what the author claims to be possible end up in a broken piece of code that the agent only makes worse when asked to debug, until finally it won't even compile. There seems to be a threshold of difficulty above which the agent goes bug-runaway. I honestly haven't seen this threshold going up. If anything, it seems to be saturating.
Every issue at work becomes an item in this org file, and all information relevant to the issue goes there: emails, chats, code snippets, documentation paragraphs, etc.
It's amazing how much clearer things become, and how quickly you can get back into action if the issue resurfaces a few years later.
Yes, until all managers become "idea antibodies" and the winner is whoever can push all the work to other teams. The company coasts along while its cash cows live.
I find it philosophical that suddenly everyone wants to take a medicine to stop wanting food. "I want to not want, but I can't help but wanting, so I want a medicine to make me stop wanting..."
> It also told me that it had permanently disabled my Facebook account—an account that I’d had for more than 15 years, and that was my primary way of staying in touch with family and friends around the world.
You don't need Facebook for that. Write down a list of the people you care about and contact them with some frequency, at least on their birthday.
contact them how? they don't read email (if they even have an address), and they don't use any other messenger. and my introvert nature makes phonecalls very uncomfortable. at best i could send SMS, but that is not suitable (and, across continents, expensive) for real conversations. also, none of the alternatives allow me to keep up with what's happening with them, because they only post it on facebook. and there is at least one project that is important to me that i can't participate in without a facebook account, because it is coordinated only there.
in general, reaching someone through other ways than their preferred channel only works if both sides are willing to do so. very often, especially with non-technical people that is simply not the case.
you can replace facebook with any other tool that is used for messaging, and you will find cases where not being able to access that tool would be a problem.
most of my contacts are only reachable through one specific messaging tool.
the reality is that we can no longer allow any of these companies to control who does and who doesn't get access. getting access must be a right that mustn't be denied to you if you choose to use it. or, better, all major messaging systems should be made interoperable so that you can stay in touch without needing an account there.
The actual reality is that you worked yourself into the untenable position of being so hands-off with people that you need a social network to do what a phone can do just as well, and that's because most of the people in everyone's social network are not actually people that we care to stay in touch with on a daily or even monthly basis. You don't need a social network. You choose to use one.
There are people in my facebook feed who I haven't seen in person in over 20 years. It's nice to keep up with them and their lives and families, but if I lost access to facebook tomorrow, it wouldn't change a thing for me. I am friendly with these people but I don't actually know them anymore and am not all that invested in their lives.
It is totally possible to have a robust social life and keep up with friends and family without Social Media. OP just needs to put in effort and not hide behind all those excuses. I burned my Facebook account at least 10 years ago, maybe 15... I can’t remember when, it’s been so long. Yet the people who are actually important to me stay in touch. I don’t consider someone an actual “friend” if they are unwilling to even communicate with me if I don’t use social media. These people aren’t actual friends.
it's not just friends. it's teachers, parents of my children's friends, my landlord, doctors, customers, employers, colleagues, even my parents or my partner and other relatives, etc. these people absolutely do have the power to dictate which way i communicate with them (and for some of those people i absolutely don't want them to have my phone number if i can avoid it). same goes for groups: i may be able to ask individuals to switch messengers to stay in touch, but i can't move a whole group if that group matters to me.
the problem is that the choice of messengers is decided at the beginning, when you meet someone for the first time. the person is not yet a friend, but they may become a friend if you can only find a way to stay in touch. for me, as the more technical person, in most cases this means that i must accept their choice of messenger, because i can't babysit them into switching. (though i do have counterexamples where switching worked)
i travel a lot. when i come to a new place and need to build up connections to locals, it is totally not possible to come in and dictate which messenger they should use to communicate with me. i need to use theirs. only once i have gained their trust and a friendship emerges may i suggest switching to a better way to communicate. but the reality is that if i refuse to use their messenger, or worse am blocked from using it, then i am unable to stay in touch with the people i meet.
when i moved to china, i refused for a long time to use wechat. i was unable to stay in touch with most of the people i met during that time because of that. until i gave up resisting.
> the reality is that we can no longer allow any of these companies to control who does and who doesn't get access. getting access must be a right that mustn't be denied to you if you choose to use it.
A right to an account on web platforms? Okay... does that apply to every single platform, or only those of a certain size? What size? What about people using the platform to spam, harass, or threaten other users?
> or, better, all major messaging systems should be made interoperable so that you can stay in touch without needing an account there.
This is a far more reasonable take, though still probably not viable.
You didn't answer the rest of my questions, which are sort of relevant. If I run a small platform of a couple hundred users, and one or two of them are actively harassing and threatening other users, my only option should be taking them to court? To say nothing of jurisdictions, or anonymity, or any number of other issues, you should realize that this gives an unfair advantage to platforms that can spend more money on lawyers to argue their case, or lobby for legislative change, etc.
please use a more generous reading of what i said. clearly small companies shouldn't be held to the same standards as big ones. but even small companies should be able to provide an explanation for account closures, and no terms and conditions can abrogate the right to sue. so you don't have to take them to court, but they may well take you to court after you close their accounts. however, if you have done your homework and documented the abuse, you'll win, and they will have to pay your court fees.
but until someone sues small companies will probably fly under the radar and the focus is on big companies as it should be. and yes big companies do have an advantage, that's why their actions deserve closer scrutiny.
I dunno. One of the core skills of bullies in school is taking advantage of "due process".
Funny; in American schools, teachers and administrators are known to show allyship to bullies, in that they take no action to stop them unless things go really too far and somebody gets killed. I took a summer course, though, that was taught by two German instructors who actually led the bullying directly. Maybe they have a more toxic culture than the US.
I know the EU has a "right for embezzlers to reoffend" law which must have been an example of people who use their social skills as a weapon against other people using their social skills to turn the law into a weapon against the rest of us. The "right to be forgotten" is itself a crime when embezzling is concerned because embezzlers have a very high recidivism rate, essentially 100% when behavioral addictions like gambling are involved. This is a crime which can destroy businesses, ruin lives, deny people a secure retirement, and cause unemployment. I can think of no reason why embezzlement treason should ever be forgot.
ignoring that in my opinion bullying is an entirely different issue that has nothing to do with what we are discussing, if bullies can take advantage of due process then the rules are bad or unsuitable, and just because that may be the case in schools that doesn't mean that due process is a bad idea to begin with.
yes, the laws are not perfect and may need to be changed. actually, that is the core of most of this subthread: the example shows that for this case the laws in germany are better than those in the UK, and that it is possible to improve the protection of users against these big companies.
> I can think of no reason why embezzlement treason should ever be forgot
this is another example of a law that may not be perfect. again, laws can and should be changed when evidence emerges that the current laws are not suitable to protect people from harm.
in this particular case i would argue that we may not have enough data to decide. there may also be a component of believing that people can change, and the question is what causes them to recommit such a crime, and what can we do about it to change that. one problem here is that a permanent record does not stop people from recommitting a crime, but it does make it very difficult for them to reform because reform requires that past transgressions are forgotten.
but we could go on endlessly finding examples of laws that are broken, and argue about how they should be fixed, or claim that because there are other broken laws, the one we are discussing must be broken too, or whatever the argument is here. but doing so while we are discussing one such law is not really helpful, and feels like whataboutism.
say, when 1-10% of a country's (or state's, or whatever) citizens are users of the service, someone should pop their head in and say "we need a different legal strategy for retention and user onboarding/offboarding."
It should be taught in MBA programs, and VCs should tell startups that aim to get "millions" of users that they need to plan, in advance, for this sort of thing.
This isn't really a gotcha; I'd put a hard line somewhere around "10% of the global population" as needing extreme scrutiny.
Due process? Tell me you’ve never read the Terms and Conditions without telling me. You don’t have a legal right to anything on Facebook’s servers. Just because you invest effort and time into something doesn’t mean you have a right or ownership. You aren’t squatting on Facebook’s land. They just haven’t forced you off yet.
In some countries you have legal rights to information collected about you. This can include information collected by social media sites. Just because Facebook has a forced-arbitration agreement in their TOS doesn't mean it's valid everywhere, especially in countries that nullify those clauses. The same goes for information collection clauses. Laws supersede terms of service.
And personally, while I don't mind users being banned for harassing other users, I do think everyone, including trolls, should have the right to the information collected about them and their account.
It’s valid in the UK, which is where this person’s rights apply. Sure, people should have a right to the information collected about themselves, but a lot of countries don’t extend that right. Perhaps in a different reality, or in 10-15 years’ time, things will change. Not while Zuck is sucking people dry of their data and people use Facebook because they see identity as a valid reason to give up their freedoms so they can sell something to someone they’ll never communicate with again.
Not for people in the UK where it matters. Yes I understand hypotheticals and navel gazing at Germany’s data laws. That doesn’t make them more real or possible for this incident.
Sounds nice but the UK doesn’t benefit in regards to laws from another country.
this person doesn't benefit, but the country does. people could demand a change based on the german example. i don't know how likely that is to happen now, but some time ago the UK used to be part of the EU, which means there was a time when such a change would have been quite likely actually.
the german federal cartel office, for example, forced amazon to change their terms and conditions so that they may no longer arbitrarily close accounts. account closures must also include a reason. further, german users can now sue against closures in germany.
so yes, companies cannot arbitrarily ignore due process in their terms and conditions
Well, this person lives in the UK, where such protections do not extend, so I do not see the relevance to this topic. I would also be curious to find out whether there is a difference between “account closed” and “account disabled indefinitely”.
the point is that it doesn't have to be that way. and the examples in other countries show that it is indeed not like that everywhere. it is a fair question to ask which way is better, and looking at other ways to respond to these cases is relevant in my opinion.
It doesn’t have to be that way but it is and won’t be anything different until many variables change. As far as this case is concerned: no there is and never will be due process for this situation nor does any UK law allow for that.
Maybe the user could make an argument in court that Facebook was hurting his business, but that's hard to prove with a free service. No real harm has come to this person.
Usually a “losing access to customers” argument is tied to loss of capital to make the argument stick. It is harder to tie a customer to loss of capital in a free service. Especially a free service that isn’t the only offering.
i don't see how you could come to that conclusion. first please read my response here: https://news.ycombinator.com/item?id=41350245 and then reread my parent comment, in particular the last paragraph where i talk about "these companies" and "major messaging systems". it's clearly not referring to "all companies" and even less to a custom tool you built for your friends. at best "replace facebook with any other tool" could be interpreted as applying to your tool, but even then it should be pretty obvious that i could not possibly have a problem not being able to access your tool unless at least one of your friends is also my friend.
> and my introvert nature makes phonecalls very uncomfortable
It's got nothing to do with introversion; you aren't comfortable making phonecalls. Start making phonecalls and it will get better.
If you truly struggle, stop lying to yourself that it's just your introversion, and seek help. Mental health issues aren't something to be ashamed of; if one is affecting your quality of life, it should be diagnosed and treated by a professional.
I'm not saying you have a mental health issue. I'm just saying if you do, you don't need to let it affect your life.
possibly, though the reality is that i grew up before the internet and i was making phonecalls just fine then. somehow it got worse. either bad experiences, or something changed when i discovered the ability to communicate with short written messages instead. i have no idea.
i like to take the time to think before i answer, and that just doesn't work on the phone.
but the real point here is that you're telling me i should switch away from a method of communication that i don't like to another one that i like even less. that doesn't make any sense. why would i do that? (and i haven't even touched on how much i hate giving people my phone number; on most messengers at least i can ignore people if they bother me)
Early 2023, when everyone started using ChatGPT for coding, I thought it would be a big boost because it enabled us to quickly employ a large number of people on a project regardless of language or framework.
A year into the project, I am forced to revise my opinion. When browsing my codebase I often stumble upon abstruse niche solutions to problems that should not have existed. It was clearly the work of someone inexperienced walking through walls in an AI-fuelled coding frenzy.
Having an oracle that knows all answers is useless if you don't know what to ask.
> It was clearly the work of someone inexperienced walking through walls in an AI-fuelled coding frenzy.
Isn't this what code reviews are for? I catch a decent amount of code that looks AI-generated. Typically it's some very foreign pattern or syntax that this engineer has never used and that isn't common in the codebase. Or something weirdly obtuse that could be refactored and shows a lack of understanding.
Normally I ask something like, "Interesting approach! Is there a reason to do it this way over (mention a similar pattern in our codebase)?" or if it's egregious, I might ask, "Can you explain this to me?".
This feels similar to early-career engineers copy-pasting Stack Overflow code. Now it's just faster and easier for them to do. It's still fairly easy to spot, though.
You would. And if the juniors don't really _write_ code in the languages you use, if they don't make mistakes and do research themselves, it'll take them a lot longer to learn them sufficiently to be able to do those reviews.
Listening to people talk in Japanese, I picked up enough to have some idea what they're talking about. I can't speak the language beyond some canned phrases. I certainly wouldn't claim to know Japanese. And I definitely wouldn't get a job writing books in Japanese, armed with Google Translate.
For programming, my solution to this right now is lots of pair programming. Really gives you a good idea of where somebody is at, what they still need to learn, and lots of teaching opportunities. I just hired a junior and we spend about 8 hours a week pairing.
Yeah. Are we assuming 100% of the people working on this project are incompetent? I thought OP was talking about new people joining a project producing low-quality code that was likely generated by AI.
Code reviews, done by existing, more senior members of the project/team, should prevent this. If they aren't, then the project has more issues than AI-generated code.
There is no substitute for doing something correctly in the first place. The problem is that in the real world, deadlines and lack of time will always cause the default solution to be accepted some small percentage of the time, even when it is not ideal. The increasing creep of AI will only exacerbate that, and most technophiles will default to thinking up a new and improved AI tool to help with the problem, until it's AI tools all the way down.
I would agree that there is no substitute for doing something correctly in the first place, but I would argue that in this case the "first place" is hiring/training better, so your employees don't throw unrefined AI shit at the walls but instead take the output of AI and hone it before creating a PR.
If you have an engineering culture that doesn't emphasize thorough code review (at least of juniors, leads and architects emergency-pushing is a different story) that's a problem. In addition to catching bugs, that's a major vector for passing on knowledge.
I agree with you, no doubt there. But still, technology often offers the path of least resistance. So even if you want to hire better people, what of all the people who will grow up training themselves with Copilot? Yes, you can filter them out but it will be harder and harder to find good people. And then, companies will want to go after the bottom line: what if they can get a bigger and more complex product out there with AI assistance but with code that is more unreliable in the long term?
I tell you, it's spiralling out of control. Besides, even if you're doing the hiring, you may not have control if there is a profit motive. Us technical people cannot control the race to the bottom line, especially over the period of decades.
A foundational concept of quality control is to not rely on inspection to catch production defects. Why not? It diffuses responsibility, lets more problems get to the customer and is less efficient than doing it correctly to start with.
Code review isn’t inspection, that would be testing - whether automatic or manual.
Code review is the feedback mechanism that allows quality control to improve the process, teaching the engineers how to do it right in the first place.
Sure. If it's generating 85% code that passes code review and 15% that doesn't... I'll take those percentages. That is certainly better than the average early-career engineer's PR.
If you're trying to do things efficiently, you can't afford not to do code reviews. Only well-funded organizations can afford to write the same code twice after the first attempt has to be thrown out.
I work at a FANMG company and see the value of code reviews for the code we're working on. But I shipped 20 commercial products before that with no code review, just play testing (games). Our velocity without code review was easily 2x what it is with it. For those projects I think that would still be true, whereas I'd never suggest getting rid of code review at my current job.
The point being, I think whether or not code review is a win depends on the project, its size, audience, lifetime, etc. My gut says it usually is a win, but not always.
At the one company I was where we did code reviews, they took at most one hour per day. The reviewer would go through the changes, ask for explanations when they didn't understand something, suggest small modifications to fit the codebase's style/structure/vibe, and move on.
They definitely didn't take 50% of our time. Half my time wasn't even spent coding, it was spent thinking about problems (and it was a workplace with an unusually high coding-over-thinking ratio, because of its good practices).
> At the one company I was where we did code reviews, they took at most one hour per day.
> They definitely didn't take 50% of our time.
Imo introducing code reviews might very well decrease the velocity by 50% even if the actual time spent on code reviews is much less.
I see at least two reasons for this: 1) an increased need for synchronization/communication, and 2) increased subjective friction and mental overhead (if only from task switching).
Code reviews are not special in this respect, similar effects can be caused by any changes to the process.
Thorough code reviews are expensive. When I review code, I have to know how to solve the problem well in the first place—that is, do all of the effort that went into writing the original code under review, minus time spent typing.
Less time may be required if you have good rapport and long work relationship with the person, but I would say halving productivity sounds about right if we are talking mandatory reviews.
Using an LLM to generate code means having to do the same. The longer the autocompleted text, the more careful you have to be vetting it. Lower predictability, compared to a person and especially one you know well, means that time requirement would not go down too much.
> When I review code, I have to know how to solve the problem well in the first place—that is, do all of the effort that went into writing the original code under review, minus time spent typing
Agree 100%, even when LLMs aren’t involved.
Certainly not all code reviews are like this—in many cases the approach is fundamentally sound, but you might have some suggestions to improve clarity or performance.
But where I work it’s not all that rare to open up a PR and find that the person has gone full speed in the wrong direction. In those cases you’re losing at least a morning to do a lot of the same leg work they did to find a solution for the problem, to then get them set off again down a better path.
Now in the ideal case they understood the problem they needed to solve and perhaps just didn’t have the knowledge or experience to know why their initial solution wasn’t good; in such a scenario the second attempt is usually much quicker and much higher quality. But introduce LLMs into the mix and you can make no guarantees about what, if anything, was understood. That’s where things really start to go off the rails imo.
> in many cases the approach is fundamentally sound
Sure, but I mean you still have to determine whether it is sound or not in the first place; i.e., you must know how to solve the problem. You can only evaluate it as right or not if you have something to compare against, after all.
If you have worked with a person for a while, though, I can see that you can spot various patterns (and ultimately whether the approach is sane) faster thanks to established rapport. Not so with LLMs, as you probably agree.
I mean, it's something we did once per day, usually towards the end of the day. This was pre-Covid, so our manager was in the room next to ours and we could just pop by his office to tell him we were ready for today's review.
It really doesn't need to be a massive amount of task-switching. And the benefits were obvious.
It probably also depends on the people that you're working with. I can easily imagine the velocity plummeting when the person reviewing your code loves to nitpick and bicker and ask for endless further changes with every next round of alterations that you do.
Doubly so if they want to do synchronous calls to go over everything and essentially create visible work out of code that already worked and was good enough in the first place.
I'm not saying that there aren't times like that for everyone occasionally, but there are people for whom that behaviour is a consistent pattern.
OMG this. Nitpicking code reviews are the worst. I've actually instituted a rule that code must be formatted with an opinionated formatter and linted, that whatever the linter/formatter decides is right, and that no formatting/linting comments are allowed in code review, to avoid this sort of bullshit review comment.
Code review matters when you have a product that has an indeterminate lifespan. When new people come on, they need to learn the codebase and code review both helps ensure a codebase that is easy for new devs to understand (because hard to understand code doesn't survive review) and also it provides an onboarding experience as new devs can be coached during code review.
"One and done" products like games where you're mostly fixing bugs and releasing new content without major overhauls to the underlying code probably only need code reviews for code that is complex/clever where bugs are likely to be hiding, beyond that QA is fine.
Not just thrown out, but also the customer relations when the bugs affect customers, the investigation to find the cause of the bugs, the migrations to fix any bad/corrupted data because of the bugs. Code review is a far cheaper way to catch errors, bad code, footguns and places where bugs might creep in than letting them into production and having them impact customers. Code reviews also help keep code tidier and cleaner which makes future understanding and changes easier. Small companies can’t afford to omit code reviews.
Can you elaborate on some scenarios you are describing? The only organization I can imagine that can't do code reviews would be an organization of 1. I don't even see how performing code reviews could ever be a net negative in terms of money or overall productivity.
Even if you're not working in the same language/framework/whatever, you can still do code reviews. You might not know all the syntax and high-level details, but even just an overview of the logic happening in the code is already better than 0 review of any kind
usually 'organization' excludes projects consisting of just one person, and if there are two people working on a project that involves writing code, the cheapest amount of that code for both of them to be familiar with is never zero, though it is often less than 100% of all the code they write. so i don't think that 'not all organizations can afford code reviews' is true, interpreted in the usual way. maybe you can be more specific about the scenario you're thinking of?
As mentioned in the other comments, code reviews are typically not done in the game industry, for example, or other write-once, never read scenarios. Very early stage startups typically don’t have formal code reviews either.
while that is true, and some of it is rational, other times it's just another stupid way people waste money and fuck up their projects. because even in a startup or a game there is code that people waste time debugging alone, or waiting for the one guy who knows it to get back from vacation, or out of bed
I recently stumbled upon a code change by a colleague who had just followed a Copilot suggestion as-is, where it recommended `func (string, string) (error, struct)` whereas everywhere else in the code we use `func (string, string) (struct, error)`.
When I asked him what prompted him to do that, he said Copilot suggested it, so he just followed it. I wonder if you could hijack Copilot's results and inject malicious code; since many end users don't understand a lot of the niche code it generates, you could manipulate them into adding the malicious code to the org's codebase.
The insidious thing is that quite possibly, it "feels" there should be a couple of bugs like that because all the codebases it was trained on had a few.
It might even take the context of the typos in your code comments, and conclude "yeah, this easy to miss subtle error feels right about here".
That is a problem, but thankfully there is a lot of attention right now on training with highly curated, high-quality data, because it is a known problem. Buggy code is still valuable training data if you use it as part of a question and evaluate the response against a corrected version of the code when training the model to perform a task like bug fixing.
It's definitely possible to inject malicious code that humans don't spot, there was a whole competition dedicated to humans doing this in C well before LLMs: https://en.wikipedia.org/wiki/Underhanded_C_Contest
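As a toy Python illustration of my own (not from the contest) of how innocuous such a bug can look at a glance:

```python
# Intended check: only "root" and "admin" are privileged.
def is_privileged(role):
    # BUG: `or "admin"` is a non-empty string, hence always truthy,
    # so this returns a truthy value for *every* role.
    # The correct check would be: role in ("root", "admin")
    return role == "root" or "admin"

print(bool(is_privileged("guest")))  # True -- quietly wrong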
Now I'm wondering, can you put in a comment which the LLM will pay attention to such that it generates subtle back-doors? And can this comment be such that humans looking at the code don't realise this behaviour will be due to the comment?
Almost two decades ago, I saw a junior colleague (this was Java) try to add "static" to almost all of one specific class' methods and members, i.e. making them global instead of attached to each class instance. Obviously this completely broke the software, although it did build. When questioned during review, my colleague just shrugged and said "Because Eclipse suggested to do so".
Apparently, if you tried to access a class member without specifying a class instance, one of Eclipse's "auto-fix-it" suggestions was to make all members of that class static, and he just followed that suggestion blindly.
> A year into the project, I am forced to revise my opinion. When browsing my codebase I often stumble upon abstruse niche solutions to problems that should not have existed.
This is a widespread problem regardless of AI. Hence the myriad Stack Overflow users who are frustrated after asking insane questions and getting pushback, who then dig their heels in after being told the entire approach they're using to solve a problem is bonkers and they're going to run into endless problems continuing down the path they're on.
Not that people aren't on too fine a hair trigger for that kind of response. But the sensitivity of that reaction is a learned defense mechanism against the sheer volume of it.
> Hence the myriad Stack Overflow users who are frustrated after asking insane questions and getting pushback, who then dig their heels in after being told the entire approach they're using to solve a problem is bonkers and they're going to run into endless problems continuing down the path they're on.
The problem is, SO can't tell someone who asks an insane question from someone who asks the same question but has constraints that make it sane. *
So in time, sane people doing unusual stuff stop asking questions, and you're left with homework.
* For example, "we can't afford to refactor the whole codebase because some architecture astronaut on SO says so" is a constraint.
Or another nice one is "this is not and will never be a project that will handle google-like volumes of data".
> The problem is, SO can't tell someone who asks an insane question from someone who asks the same question but has constraints that make it sane.
Stack Overflow is not there to help you solve your use-case. It's there to create a body of knowledge that everyone can refer to. You need to spell out your specific reasons so that the question and answers become useful to others.
A lot of the friction on Stack Overflow comes from people thinking it's a free help website rather than an attempt to create a collaborative knowledgebase.
I think a lot of SO's problems come from it trying to pretend it was building a knowledge base when it was really only ever a question-and-answer site.
It was nice to use early in its life when every answer was fresh. These days, the stale knowledge outweighs the fresh and the top answer is usually no longer correct.
If SO wants to be a knowledgebase, it needs to be redesigned, as its current structure is not well suited to it.
How ‘bout remembering a million tokens? I’m not feeling too confident about that. Basically my only moat, if there is one, is that I’m able to rely on higher level cognition which LLMs don’t yet have, rather than just on associative memory alone.
I might, actually. Think of where electric cars were six years ago — 2018. Not much has changed. Or, at least, there are still fundamental problems to be solved.
In the same way I can imagine that by 2030 LLMs will still have memory problems and hallucinations. Although I’m sure by then we’ll have something better than pure LLMs.
I've heard claims that context without forgetfulness has already been reached 2 months ago, but as I'm not a domain expert I don't trust that I can differentiate breakthroughs from marketing BS, and I definitely can't differentiate either of those from a Clever Hans: https://arstechnica.com/information-technology/2024/03/claud...
I work in this field, so here's a comment with higher signal-to-noise ratio than you'll commonly find on HN when it comes to LLMs: notice how the demo use cases for very long context stuff deal almost universally with point retrieval, and never demonstrate a high degree of in-context learning. That is not coincidental. The ability to retrieve stuff is pretty great and superhuman already. The ability to reason about it or combine it in nontrivial ways leaves a lot to be desired still - for that you have to train (or at least fine tune) the underlying model. Which IMO is great, because it neatly plugs the gaps in human capability.
So do I, and worse. Look, all I'm saying is I'm thankful for this crutch that helps me deal with the limitations of my associative memory, so long as it can't think and can't replace me entirely.
bacopa and lion's mane were night and day for my limited memory. But the obvious things, writing simpler code, keeping a notebook while working, etc., and really spending time breaking down problems into much smaller scopes, while simultaneously keeping copious notes on where I am in a given process, helped immensely with dealing with peanut-brained memory. Sure, it's not quick, but my work is usually very readable and understandable for the future reader. I'm not convinced that a tool to help me overcome that memory barrier would actually help me write better code, maybe just write worse code faster. Of course, that's probably the corporate goal though.
Lion's mane didn't seem to do much for me. My memory is actually not that bad, though certainly I have seen people with _much_ better memory than mine. I still maintain contact with some of them, whom I met in college and then later at the various companies. It's just that I deal with so much information, and so many streams of it, that even writing it down would be a massive chore.
I would pay for a pre-packaged system which could _locally_ and _privately_ make sense of all the emails, PDFs, slack messages, web pages I saw, other documents shared with me, code, etc., and make it all easily queryable with natural language, with references back to the sources. Sooner or later someone will make something like that.
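As a rough sketch of what I mean (toy code, using TF-IDF as a stand-in for real embeddings; the document ingestion, mail/Slack connectors, and the natural-language layer are the hard parts I'm omitting, and the file names are made up):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Pretend these were scraped locally from mail, Slack, PDFs, etc.
documents = {
    "email/2024-03-01.txt": "Quarterly budget review moved to Friday.",
    "slack/eng-channel.txt": "Auth tokens rotate every 24 hours.",
    "docs/design.pdf.txt": "Authentication uses short-lived JWTs issued by the gateway.",
}

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(documents.values())

def query(text, top_k=2):
    """Return the top_k source documents most similar to the query, with scores."""
    scores = cosine_similarity(vectorizer.transform([text]), matrix)[0]
    return sorted(zip(documents, scores), key=lambda p: -p[1])[:top_k]

# Everything stays on disk; the "references back to the sources" are the keys.
print(query("how does authentication work?"))
```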
The context heads of an LLM are more analogous to the sort of processing that goes on in, e.g., Broca's area of your brain, as opposed to working memory. You can't have anything analogous to working memory as long as LLMs are operating on a strict feed-forward basis[1]. And the fact that LLMs can talk so fluently without anything like a human working memory (yet) is a bit terrifying.
[1] Technically LLMs do have a "forget that last token and go back so I can try again" operation, so this is only 99% true.
I think humans have better general recall whilst lacking any kind of precision. After reading an entire book, I definitely can’t replicate much (if any) of the precise wording of it, but given a reasonably improbable sentence I can probably tell with certainty that it didn’t appear. LLMs are probably much more prone to believing they’ve read things that aren’t there and don’t even pass a basic sanity check, no matter how long the context window.
>> Having an oracle that knows all answers is useless if you don't know what to ask.
That is a great point. The issue of not asking the right questions has been around for as long as I can remember, but I guess it wasn't seen as the bottleneck because people were so focused on solving problems by any means possible that they never had to think about solving problems in a simple way. We're still very far from that, though, and in some ways we have taken steps back. I hope AI will help to shift human focus towards code architecture, because that's something that has been severely neglected. Most complex projects I've seen are severely over-engineered... They are complex, but they should not have grown to hundreds of thousands of lines of code; had people asked the right questions, focused on the right problems and chosen the right trade-offs, they would have been under 10K lines and way more efficient, interoperable and reliable.
I should note, though, that my experience with coding with AI is that it often makes mistakes with complex algorithms, or implements them in an inefficient way, and I almost always have to change them. I get a lot of benefit from asking questions about APIs, verifying my assumptions, or getting suggestions about possible approaches to do something.
How do you know it was "AI-fuelled"? And what makes it a "frenzy"?
People have been committing terrible code to projects for decades now, long before AI.
The solution is a code review process that works, and accountability if experienced employees are approving commits without properly reviewing them.
AI shouldn't have anything to do with it. Bad code shouldn't be passing review, period, no matter whether it was AI-assisted or not. And if your org doesn't do code review, then that's the actual problem.
> and accountability if experienced employees are approving commits without properly reviewing them.
You’re putting the entire responsibility on senior employees. So we need many more of them. In fact, we don’t need juniors, because we can generate all possible code combinations. After all, it’s the responsibility of the seniors to select which one is correct.
It’s like how hiring was made crap by the “One-click apply” on LinkedIn and all the other platforms. Sure, it’s easy for thousands of people to apply. Fact is, we offer quite a good job with a high salary, and we were looking for 5 people. We’ve spent a full year selecting them, because we received hundreds of irrelevant applications, probably some of them AI-generated.
It’s no use flooding a filter with crap and hoping that the filter will do better work because it has a lot of input.
This is almost a special case of the very similar (and common) argument that ‘AI-generated misinformation isn’t bad; we’ve always had misinformation’.
The answer is also the same.
Volume. AI makes it trivially easy to generate vast amounts of it that don’t betray their lack of coherence easily. As with much AI content, it creates arbitrary amounts of work for humans to have to sift through in order to know it’s right. And it gives confidence to those who don’t know very much to then start polluting the informationsphere with endless amounts of codswallop.
> Having an oracle that knows all answers is useless if you don't know what to ask.
Honestly I find this to be the biggest advantage to using a coding LLM. It's like a more interactive debugging duck. By the time I've described my problem in sufficient detail for the LLM to generate a useful answer, I've solved it.
I’ve had a project I’ve been doing for ~6 months, learning Python through Copilot+ChatGPT. It feels like any other project that accrues technical debt, but that debt is just a lot weirder. It was rough at first, but refactoring has gotten a lot better recently with bigger context sizes.
Good one. However, like Google making memory worse, Copilot seems to erode knowledge of programming quite quickly. Last week a colleague was at my house; when I hired him during covid he knew these things just fine, but after not having had to type a list comprehension for around two years, when my Starlink went down for a bit he couldn’t for the life of him write one; he asked Claude Opus on his phone instead.
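(For the record, the kind of one-liner he was stuck on is about as basic as Python gets; a generic example:)

```python
# Squares of the even numbers in a list -- a bread-and-butter comprehension.
nums = [1, 2, 3, 4, 5, 6]
even_squares = [n * n for n in nums if n % 2 == 0]  # [4, 16, 36]
```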
For real. I have the same experience. I'm 19, freelance developer writing Python among other things. Except I never learned Python syntax, and maybe never will.
For a senior-level professional? Yes, it does matter.
I wouldn't be comfortable with an accountant who couldn't do practical arithmetic in their head, or a surveyor who didn't have a fluent grasp of trigonometry and the ratios or function values for common angles.
Of course -- I don't care about the people working for them as a bookkeeper or assistant. They can be button monkeys as much as their boss lets them, but I also wouldn't expect those folks to reach very high in their careers. Not everyone's going to, and a disinterest in and lack of technical fluency is a darn good predictor.
Your teacher was giving good advice about building skills and internalizing knowledge, because those are what contribute to mastery of a craft; maybe you were just being too pedantic or cynical to hear it?
I think this advice is backwards, as is the problem. I’d trust an accountant’s mind or a builder’s eye if I knew they were acting from experience. But you don’t wanna trust someone who believes that doing it in their head is what makes them superior, reliable, or good.
The core idea is: no, calculating in your head doesn’t turn you into a pro. Being a pro makes calculating in your head safe enough to rely on until you get to your desk and double-check.
That said, I was an integrator for half my life and have seen some accountants, big and small. Everyone used a calculator; it’s right next to their mouse, and there’s another one elsewhere. And no one ever tried to talk about numbers afk, except for ballparks. Most of the time they suggest going to query a db together. I find it very professional.
I just trust that my accountant will use Excel and that what they are doing isn't rocket science.
I have no interest in interacting with someone who is going to get into how great the slide rule is and how kids these days need to learn the slide rule. When I was younger I thought this type of person was highly admirable, but now, older and wiser, I see how full of shit they are.
>> If you need Copilot to code in Python, have you really learned Python?
> Does it really matter though? It sounds awfully like when a school teacher said you're not going to have a calculator in your pocket.
Yes. A younger me had teachers say that to me, and that younger me thought they were wrong.
But it turns out they were right, and younger me was wrong. Calculator dependence created a ceiling built from "educational debt" that many years later severely limited my mathematical ability.
The problem is focusing too much on getting the "answer," and losing sight of building the abilities that allow you to get there with your own mind. Eventually the need to manipulate your automated crutch turns into a bottleneck.
It’s also kind of annoying to intuitively get to the correct answer and have everyone claim they can’t distinguish between that and you using a calculator.
I wonder if that would be an issue if nobody had any.
Isn't it obvious? If you can't do arithmetic without a calculator, it makes it hard to do algebraic manipulation. (Un)luckily a calculator that could do that arrived in my lap just in time. Then you get to calculus, and it was the same story. I eked by with a lot of furious typing, but the mental workload of that made it untenable towards the end, and I wasn't really gaining much anyway because of it. It would have been far better if I just hadn't been allowed to use a calculator from the start.
It didn't affect my ability to reason and prove things, as long as those things don't strongly require the knowledge and skills I should have gotten from the calculator-shaped gap in my education. I lack a lot of the background knowledge and/or familiarity and comfort with many skills that I should have.
I think we're using the term “mathematical ability” in different ways. For me, it's an ability to prove a theorem, not to solve a differential equation.
Yes, it does matter. While there are plenty of situations where it's foolish to forbid calculators, that's not universally true.
Forbidding calculators and requiring students to do mental math for absolutely everything is unnecessary. But requiring students to solve integrals by hand when they're learning about integrals? Entirely reasonable.
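For instance, integration by parts is exactly the kind of by-hand mechanics the exercise is meant to teach (a standard worked example, not from the thread):

```latex
\int x e^x \, dx = x e^x - \int e^x \, dx = (x - 1) e^x + C
```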
If your goal when teaching coding is to teach the mechanical process of writing code: sure, go ahead and use LLMs for that process. But if your goal is to develop a deeper understanding of how to code, then LLMs can very easily obscure that. The end goal is not always just the answer.
> But requiring students to solve integrals by hand when they're learning about integrals? Entirely reasonable.
I remember having to write code on a piece of paper in university.
Curiously, this led to a lot of other students not really learning or caring about indentation, or attempting to format code in a generally readable way, once they actually got access to IDEs and such.
They'd write it much like you would regular text on a piece of paper and would get absolutely stumped with unbalanced brackets whenever some nested structure would be needed.
Not only that, but somehow a lot of them didn't quite get around to using the various refactoring features of the tools, since they treated them like editors for writing text.
Did it help me memorize the basic constructs of the languages in question? Absolutely not, at least in the long term, since nowadays I have to jump between a lot of different languages and none of the syntax quite sticks; what sticks is a more high-level approach to problem solving.
LLMs just make that easier for me, letting me offload more work around language constructs and boilerplate, albeit requiring occasional intervention.
AI is not replacing the same function as a calculator does when solving a math problem.
Using AI to “help” learn to program replaces the person’s effort of thinking through the problems fully, and is ultimately going to stunt their learning.
A calculator is fine so long as you understand what the operations actually represent. Remembering the multiplication table is not important, but understanding what multiplication actually does is.
The problem with AI-assisted coding is that, applied uncritically, it circumvents this understanding. And without it, you won't even be able to judge whether the code that CoPilot etc spit out is correct for the task.
yeah, at the moment anyway, because if you don't know what the syntax means, you can't tell when the code means something you don't want. that means you can't tell which copilot suggestions to accept and which ones to reject, and you can't debug your code; asking gpt-4 what's wrong with it only goes so far
imagine if the calculator in your pocket gave you subtly wrong answers most of the time
This is my new capability: I am not a coder or a programmer, and I can get things built, in my own time, at my own speed, solo.
Would it be better code if someone with 3 years of university and 5 years of coding practice did it? Yes, very probably, but the gap seems to be narrowing. Humorously, I don't know enough about good code to tell you whether what I build with LLMs is good code. Sometimes I build a function that feels magical; other times it seems like a fragile mess. But I don't know.
Do I know "javascript" or "python" or the theory of good coding practice? No, not currently. But I am building things. Things that I have personal, very specific requirements for.
Where I don't have to liaise with or berate someone else. Where I don't have to pay someone else. Where I don't share the recognition (if there is ever any) of the thing I, and only I, have produced (with ChatGPT, Gemini, and most recently Llama 3).
Folks have been feeling superior for 70 years and earning a good living because they spoke the intermediary language of compute engines. What makes them actually special NOW is computer science, the theory. The languages? We have very cheap (and, in the case of open-source local models, free) translators for those now. And they can teach you some computer science as well, but that still takes time and practice.
I'm the muggle. The blunt. And I'm loving this glowy lantern, this psi-rig.
One of the lessons that one learns as a programmer is to be able to write code that one can later read back and understand. This includes code written by others as well as code written by oneself.
When it comes to production quality code that should capture complex and/or business-critical functionality, you do want an experienced person to have architected the solution and to have written the code and for that code to have been tested and reviewed.
The risk right now is of many IT companies trying to win bids by throwing inexperienced devs at complex problems and committing to lower prices and timelines by procuring a USD 20 per month GitHub Copilot subscription.
You individually may enjoy being able to put together solutions as a non-programmer. Good for you. I myself recently used ChatGPT to understand how to write a web-app using Rust and I was able to get things working with some trial and error so I understand your feeling of liberation and of accomplishment.
Many of us on this discussion thread work in teams and on projects where the code is written for professional reasons and for business outcomes. The discussion is therefore focused on the reliability of and the readability of AI-assisted coding.
hmm.
I ended up with a few 750+ line chunks of JS, beyond the ability of ChatGPT to parse back at me. So my go-to technique now is to break them into smaller chunks and make them files in a folder structure, rather than living inside a single *.js
So readability is an issue for me- even more so because I rely on ChatGPT to parse it back- sometimes I understand the code, but usually I need the llm to confirm my understanding.
I'm not sure if this scales for teams. My work has Sourcegraph, which should assist with codebases. So far it hasn't been particularly useful; I can use it to find specific vulnerable libraries, keys in code, etc., but that is just search.
What I really need is things like "show me the complete chain of code for this particular user activity in the app and highlight tokens used in authentication"... something senior engineers struggle to pull from our hundreds of services and huge pile of code. And so far Sourcegraph and Lightstep are incapable of doing that job. Maybe with better RAG or infinite context length or some other improvement there will be that tool. But currently the combined output of thousands of engineers over the years is almost un-navigable.
Some of that code might be crisp; some of it is definitely of LLM-like quality (in a bad way). I know this because I hear people's explanations of said code, and how they misremembered its function, during post-mortems. Folks copy and paste outdated example code from the wiki etc., i.e. making things they don't understand. I presume that used to happen from Stack Overflow too. Engineers moving to LLMs won't make too much difference IMO.
I agree, your points are valid, but I see "prompt engineering" as democratization of the ability to code. Previously this was all out of reach for me, behind a wall of memorization of language and syntax that I touched in the Pascal era and never crossed.
12 hours to build my first Node.js app that did something in exactly the way I had wanted for 30 years (including installing git and VS Code on Windows; see, now I am truly one to be reviled).
The problem I currently have with AI-generated code, as an experienced programmer who can read and understand the output, isn’t that the code quality is bad, but that it’s often buggy and wrong. I recently recommended in my company that if copilot is allowed to be used, the developers using it must thoroughly understand every line of code it writes before accepting it, because the risk of error is too high.
Copilot may work for simple scripts, but even for those, where it mostly got things right, in my experience it still introduced subtle bugs and incorrect results more often than not.
I've been coding Python for 15 years, but I probably couldn't code in Python now without Copilot or a lot of reference docs. There's so much meaningless trivia and so many idioms I've purposefully pushed out of my mind so I can focus on other things. If Copilot has my back, why do I need to remember that crap?
Learning to be a good programmer is as much about learning how to avoid technical debt as it is about learning to use a programming language. It may take a while until an AI assistant is able to help with that.
>Having an oracle that knows all answers is useless if you don't know what to ask.
This sentence summarizes the issue with the current AI debacle, along with the whole "just copy/paste code from Stack Overflow and earn top bucks" meme that was going around in the 2010s.
You're not gonna be a valuable dev if you just write wrong code faster. Not only does ChatGPT/Copilot give haphazard code half of the time, it uses seemingly random syntax and formatting. Even if LLMs get polished, you're gonna need sound software engineering knowledge to know what's right and wrong.
Coding just keeps getting grosser and weirder decade over decade as new layers of abstraction and complexity get piled on. Just about everyone who does this shit for a living eventually hits a break point where the headassery du jour becomes too much to turn a blind eye to.
LLMs are no excuse for bad code reviews or developers who don’t know what it’s spitting out.
In my code reviews the person who wrote the code needs to explain to me what they changed and why. If they can’t then we are going to have a problem. If you don’t understand the code that an LLM spits out you don’t use it, it’s that simple. If you use it and can’t explain it, well… we are going to have to have some discussions and if it keeps happening you’re going to need to find other employment.
The exact same thing has been happening for pretty much the entire time we’ve had internet. Stack Overflow being the primary example now but there were plenty of other resources before SO. People have always been able to copy/paste code they don’t understand and shove it into a codebase. LLMs make that easier, no doubt, but the core issue has always been there and we, as an industry, have had decades to come up with defenses to this. Code review being the best tool in our toolbox IMHO.
> You might be surprised to learn that I actually think LLMs have the potential to be not only fun but genuinely useful. “Show me some bullshit that would be typical in this context” can be a genuinely helpful question to have answered, in code and in natural language — for brainstorming, for seeing common conventions in an unfamiliar context, for having something crappy to react to.
> Alas, that does not remotely resemble how people are pitching this technology.
It is exactly what happened to you: it wrote bullshit. Plausible bullshit but bullshit nonetheless.
It’s also just not as good at the task anymore. It frequently gets lazy and gives you an outline with a bunch of vague pseudocode. Compare to when GPT-4 was slower at producing output, but all of that output was solid, detailed work. Some of the magic that made you say “wow” feels like it’s been enshittified out of it.
I sometimes try the free ChatGPT when I run into a problem, and it's just hilarious how terrible it is. It loves to go around in circles with the same made-up solution that has no basis in reality, using functions in libraries that would be great if they actually existed.
I noticed that starting about a week ago. Output is faster, but not impressive. Now I just skip to Stack Overflow or the docs. The output is also giving errors a lot more, as if the libraries the examples are based on were old. Sometimes it's a really trivial task just to save time, and it's just not of any help. It's still helpful when you want to start something new; it just doesn't scale that well.
Yes, rather poor, but people can always post new answers, and votes sort the answers. It might not work all that well, but there is a mechanism for improvement and for keeping things up to date.
Language models can copy the top answers from SO, ingest docs and specs, etc. And then the information is never updated? Or are they going to train it from scratch? On what? Outdated GitHub saved games?
It is difficult to get a man to understand something when his salary depends on his not understanding it. ~ U. Sinclair
A significant problem is the subconscious defense mechanism, or bias, that compels us to conclude that AI has various shortcomings, asserting the ongoing need for the status quo.
The capabilities of GPT-3.x in early 2023 pale in comparison to today's AI, and it will continue to evolve and improve.
I’m surprised to see this comment was downvoted heavily. The quote is very popular around here, to the point where there is an Ask HN if it’s the most quoted quote:
I don't even use a third-party software keyboard; I just use Termux's special key bar. To set it up, add something like the following to ~/.termux/termux.properties:
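(An illustrative extra-keys setup; the rows and keys are configurable, so pick whatever suits your workflow -- ESC, TAB, CTRL, ALT and the arrows are the handy ones for org-mode:)

```
extra-keys = [['ESC','/','-','HOME','UP','END','PGUP'],['TAB','CTRL','ALT','LEFT','DOWN','RIGHT','PGDN']]
```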
This has been enough for using org-mode in everyday life tasks, and I don't need to keep swapping keyboards.