It's really quite amazing that people would actually hook an AI company up to data that actually matters. I mean, we all know that they're only doing this to build a training data set to put your business out of business and capture all the value for themselves, right?
A few months ago I would have said that no, Anthropic make it very clear that they don't ever train on customer data - they even boasted about that in the Claude 3.5 Sonnet release back in 2024: https://www.anthropic.com/news/claude-3-5-sonnet
> One of the core constitutional principles that guides our AI model development is privacy. We do not train our generative models on user-submitted data unless a user gives us explicit permission to do so.
This sucks so much. Claude Code started nagging me for permission to train on my input the other day, and I said "no" but now I'm always going to be paranoid that I miss some opt-out somewhere and they start training on my input anyway.
And maybe that doesn't matter at all? But no AI lab has ever given me a convincing answer to the question "if I discuss company private strategy with your bot in January, how can you guarantee that a newly trained model that comes out in June won't answer questions about that to anyone who asks?"
I don't think that would happen, but I can't in good faith say to anyone else "that's not going to happen".
For any AI lab employees reading this: we need clarity! We need to know exactly what it means to "improve your products with your data" or whatever vague weasel-words the lawyers made you put in the terms of service.
I often suspect that the goal isn't exclusively training data so much as the freedom to do things that they haven't thought of yet.
Imagine you come up with non-vague consumer terms for your product that perfectly match your current needs as a business. Everyone agrees to them and is happy.
And then OpenAI discover some new training technique which shows incredible results but relies on a tiny sliver of seemingly unimportant data that you've just cut yourself off from!
So I get why companies want terms that sound friendly but keep their options open for future unanticipated needs. It's sensible from a business perspective, but it sucks for someone like me who is frequently asked how safe it is to sign up as a customer of these companies, because I can't provide credible answers.
To me this is the biggest threat that AI companies pose at the moment.
As everyone rushes to them for fear of falling behind, they're forking over their secrets. And these users are essentially depending on -- what? The AI companies' goodwill? The government's ability to regulate and audit them so they don't steal and repackage those secrets?
Fifty years ago, I might've shared that faith unwaveringly. Today, I have my doubts.
Why do you even necessarily think that wouldn't happen?
As I understand it, we'd essentially be relying on something like an mp3 compression algorithm to fail to capture a particular, subtle transient -- the lossy nature itself is the only real protection.
I agree that it's vanishingly unlikely if one person includes a sensitive document in their context, but what if a company has a project context which includes the same document in 10,000 chats? Maybe then it's much more likely that some private memo could be captured in training...
I did get an answer from a senior executive at one AI lab who called this the "regurgitation problem" and said that they pay very close attention to it, to the point that they won't ship model improvements if they are demonstrated to cause this.
Lol and that was enough for you? You really think they can test every single prompt before release to see if it regurgitates stuff? Did this exec work in sales too :-D
They have a clear incentive to do exactly what they said - regurgitation is a problem because it indicates the model failed to generalize from the data and merely memorized it.
I think they can run benchmarks to see how likely it is for prompts to return exact copies of their training data and use those benchmarks to help tune their training procedures.
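Not a lab insider, but a crude version of that kind of benchmark is easy to sketch: take prompts drawn from training documents, generate completions, and flag any long verbatim n-gram overlap. Everything below is hypothetical - the `model_generate` callable and the 50-token threshold are stand-ins, not any lab's actual procedure.

```python
# Hypothetical regurgitation check: flag completions that reproduce a long
# verbatim span of a known training document. Crude whitespace tokenization.

def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def regurgitates(completion: str, training_doc: str, n: int = 50) -> bool:
    """True if the completion shares any n-token span verbatim with the document."""
    return bool(ngrams(completion.split(), n) & ngrams(training_doc.split(), n))

def regurgitation_rate(model_generate, probes) -> float:
    """probes: (prompt, source_document) pairs sampled from the training set."""
    hits = sum(regurgitates(model_generate(p), doc) for p, doc in probes)
    return hits / len(probes)
```

Run something like that over enough probes before and after a training change and you at least get a number to tune against, even if it can't prove a negative.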
I despise the thumbs-up and thumbs-down buttons for exactly this reason: “whoops, I accidentally pressed the button and can't undo it - looks like I just opted into my code being used as training data, retained for life, with their employees reading everything.”
> I mean, we all know that they're only doing this to build a training data set
That's not a problem. It leads to better models.
> to put your business out of business and capture all the value for themselves, right?
That's both true and paranoid. Yes, LLMs will subsume much of the software industry, and many things downstream of it. There's little anyone can do about it; this is what happens when someone invents a brain on a chip. But no, LLM vendors aren't gunning for your business specifically. They neither care, nor have the capability to execute on it if they did.
In fact my prediction is that LLM vendors will refrain from cannibalizing distinct businesses for as long as they can - because as long as they just offer API services (broad as they may be), they can charge rent from an increasingly large amount of the software industry. It's a goose that lays golden eggs - makes sense to keep it alive for as long as possible.
It's impossible to explain this to business owners: giving one company this much access can't end well. Right now Google, Slack, and Apple each hold a share of the data, but with this, Claude can get all of it.
Doesn't matter to 99.99% of businesses using social media. Only to the silly ones who decided to use a platform to compete with the platform itself, and to the ones that made a platform a critical dependency without realizing they were making a bet, then were surprised when it didn't pan out.
It's either that, or you're 100x slower for not using Claude Code. The man-hour savings are most likely worth more than protecting some inputs.
You could also always run a local LLM like GLM for sensitive documents or information on a separate computer, and never expose that to third party LLMs.
You also need to remember that regular employees are untrustworthy at a base level too. There needs to be some obfuscation anyway, since a human can just as easily steal your data or info. It's a very common case, especially when they run off to somewhere like China, where IP laws don't matter, to clone your company.
In other news, I've been using Devstral 2 (Ollama) with OpenCode, and while it's not as good as Claude Code, my initial sense is that it's nonetheless good enough and doesn't require me to send my data off my laptop.
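For what it's worth, the local path is pretty low-friction now. A minimal sketch, assuming the `ollama` Python client (`pip install ollama`) and a model you've already pulled - the "devstral" tag below is a placeholder for whatever is actually on your machine:

```python
# Hypothetical sketch: send prompts to a locally running Ollama model so that
# sensitive text never leaves the machine. The model tag is a placeholder.
import ollama

def ask_local(prompt: str) -> str:
    resp = ollama.chat(
        model="devstral",  # substitute your local model tag (a GLM build, etc.)
        messages=[{"role": "user", "content": prompt}],
    )
    # Older client versions return a plain dict; newer ones also expose resp.message.content.
    return resp["message"]["content"]

if __name__ == "__main__":
    print(ask_local("Summarize this internal memo: ..."))
```

Wire something like that into your editor or a small CLI and the sensitive-document case never touches a third-party API.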
I kind of wonder how close we are to alternative (not from a major AI lab) models being good enough for a lot of productive work and data sovereignty being the deciding factor.
I've been coding professionally for almost 30 years. I use Claude Code heavily, even for larger features. To get that to work half-decent, you have to take on a PM/Tech-lead role, you're no longer a senior engineer.
For large pieces of work, I will iterate with CC to generate a feature spec. It's usually pretty good at getting you most of the way there on the first shot, and then I either have it tweak things or do so manually.
Implementation is having CC first generate a plan, and iterating with it on the plan - a bit like mentoring a junior, except CC won't remember anything after a little while. Once you get the plan in place, then CC is generally pretty good at getting through code and tests, etc. You'll still have to review it after for all the reasons others have mentioned, but in my experience, it'll get through it way faster than I would on my own.
To parallelize some of the work, I often have Visual Studio Code open to monitor what's happening while it's working so I can redirect early if necessary. It also allows me to get a head start on the code review.
I will admit that I spent a lot of time iterating on my way of working to get to where I am, and I don't feel at all done (CC has workflows and subagents to help with common tasks that I haven't fully explored yet). I think the big thing is that tools like CC allow us to work in new ways but we need to shift our mindset and invest time in learning how to use these tools.
* brainstorm all the ideas, get Claude to write docs + code for all of them, and then throw away the code
* ask it to develop architecture and design principles based on the contents of those docs
* get it to write a concise config spec doc that incorporates all the features, respects the architecture and design as appropriate
* iterate over that for a while until I get it into a state I like
* ask it to write an implementation plan for the config spec
* babysit it as I ask it to implement the plan phase by phase while adhering to the config spec
It’s a bit slower than what I’d hoped originally, but it’s a lot better in terms of end result and gives me more opportunity to verify tests, tweak the implementation, briefly segue or explore enhancements, etc.
> To get that to work half-decent, you have to take on a PM/Tech-lead role, you're no longer a senior engineer.
But you’re saying it can be half-decent?
The problem is that about 75% of HN commenters have their identities tightly wound up in being a (genuflect) senior engineer and putting down PM/tech-lead type roles.
They’ll do anything to avoid losing that identity including writing non-stop about how bad AI code is. There’s an Upton Sinclair quote that fits the situation quite nicely.
I'd agree that 75% you speak of is generally hostile to the mere concept of PMs, but that's usually from a misapplication of PMs as proxy-bosses for absentee product owners/directors who don't want to talk to nerds - flow interruptions, beancounting perceived as useless, pointless ceremonies, even more pointless(er) meetings etc, and the further defiling of the definition of "agile".
But a deep conceptual product and roadmap understanding that helps one steer Claude Code is invaluable for both devs and PMs, and I don't think most of that 75% would begrudge that quality in a PM.
Aren't executives just responding rationally to the current environment? Right or wrong, broadly speaking, the current thinking is that GenAI will be super impactful. That means there is a lot of risk in being seen as underinvesting in GenAI, even when the ROI isn't there. Until the hype dies down and there is a broad, practical understanding of the value of GenAI, I don't see how it could work any other way.
I'm unsure how strange this is. As a Canadian, when I left the country I had to undergo what's termed a deemed disposition - i.e., pretend you sold all your assets and then pay the relevant taxes on the net gains you've enjoyed to that point. This includes proposing a value for any companies that are not publicly traded. See: https://www.canada.ca/en/revenue-agency/services/tax/interna...
So if all your money is tied up in your company, you have to sell part of your business in order to be allowed to leave the country - and, by the way, thanks for creating all those jobs? Sounds slightly CCP to me.
I know of at least 4 countries that have exit taxes, and while the US doesn’t have an exit tax if you simply move abroad (it does if you renounce citizenship) it has other very punitive taxes for expats. So, it isn’t a unique thing to Germany or, assuming you’re correct, China.
It's not that it has punitive taxes for expats; it's just that, as a US citizen, the US doesn't care where you live -- you're subject to US tax. It has rules, applying to everyone, that prohibit deferring income tax through non-US tax-opaque entities, but that's not so much a capital gains concern as a timing issue.
It's much simpler in many ways, although it creates its own issues with juggling tax treaties and realization timing.
Forcing you to pay a tax on unrealized gains is anathema to the US system, and would definitely burden founders to the extent they'd be well advised not to form in that jurisdiction in the first place.
PFIC taxation, which some pension accounts as well as all foreign mutual funds and ETFs qualify for, is absolutely punitive, and does in fact tax unrealized gains. Even if you don't have any PFICs, or don't even owe taxes to the US, the fact that you need to pay an accountant ~$500+ just to file the taxes is, in my opinion, punitive as well.
Well there are actual realizations in the ETF and mutual fund context, and the market has simply decided that structure still works for the commingled investments of US taxpayers. PFIC taxation I concede is a head-scratcher, but it's somewhat orthogonal to the capital gains issue. The big take home is that the US wants tax on all of your income, and it doesn't want you hiding any of that offshore. I'm an expat who deals personally with the last issue, and again I agree it's quite frustrating, but that's not what punitive means. That's just the type of frustration that arises when one is required to synchronize one or more complex systems.
China hardly allows its citizens to send or invest money abroad at all. Yes, this means that most of the Chinese who bought houses in the US and Canada were breaking Chinese law.
One cause of China's real-estate bubble was too much (domestic) savings chasing too few investment vehicles.
To learn more, read up on financial repression in China.
I used to use Cursor and just deal with the slow requests for most of the month because it was the most affordable way to leverage an agent for coding, but I didn't find it so much better than Cline or Roo. When I first tried Claude Code, it was immediately clear to me that it worked better, both as an agent and for me, but it was way too expensive. Now with the $200/mo. Max plan, I couldn't be happier.
That said, I still approach it with the assumption that Claude Code is just mashing its fists on the keyboard and that there needs to be really strong, in-loop verification to keep it in line.
I've spent a lot of hours vibe coding with sonnet 3.7 thinking and I'm not seeing anything in the article that jumps out at me as being different from my experience.
After spending some time vibe coding, I think this article is pretty accurate in that it aligns with a) how poorly AI agents work in practice and b) the fact that non-coders are expecting magic from AI (which, to be fair, is what the AI companies are promising with all of their hype).
Where I have found vibe coding as an approach really shine is if I need to write some sort of quick utility to get a task done. Something that might take an hour or more to slap together to solve some menial task that I need to do on a bunch of files. Here I can definitely throw it together quicker than manually and don't care if it is messy code.
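As a concrete (entirely made-up) example of the kind of throwaway utility I mean - the directory name and renaming scheme below are invented:

```python
# Throwaway, vibe-coded-style utility (hypothetical example): rename a folder
# of exported files to lowercase, dash-separated names prefixed with their
# modification date. Messy is fine; it runs once and then gets deleted.
import datetime
import pathlib

SRC = pathlib.Path("exports")  # placeholder directory

for f in SRC.iterdir():
    if not f.is_file():
        continue
    stamp = datetime.date.fromtimestamp(f.stat().st_mtime).isoformat()
    clean = f.stem.lower().replace(" ", "-")
    f.rename(f.with_name(f"{stamp}-{clean}{f.suffix.lower()}"))
    print(f"renamed {f.name}")
```

The value isn't the code itself - it's that I didn't have to spend an hour slapping it together by hand.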
Larger, more complicated apps that are meant for production are painful to get AI tools to build. I spend so much time prompting the AI to get a task done without breaking something else that I doubt I'm any faster than just hand-coding it alongside a co-pilot.
Claude Code was released a little while ago and I've been using it on a production codebase at work. It's really good at repetitive tasks like filling out JSON schemas, cloning boilerplate logic, and so on. It's also honestly not half bad at pointing me to the right locations where bugs can be found.
I find that it works best when used by an actual programmer who has a good idea of exactly what they want to do and how they want it done. I often find myself telling it extremely specific things like, add a switch case in this callback in this file. Add a command in this file after this other one. Create a new file in this directory that follows the convention of all the others. And so on. If you instruct it well, you can then tell it to repeat what it just did for every item in a list that is like 20 items long and you will have saved hours of development time. Very rarely does it spit out fully functional code but it's very good at saving you the time it takes to constantly repeat yourself.
(This codebase isn't that good at DRY, I try my best with things like higher-order functions but there's only so much I can do, I still need to repeat myself in many cases.)
What do you mean, cloning boilerplate logic? Don't you just write it once and then call the function? Need to change things? Okay, do a little abstraction. But I thought a big part of coding was to reduce repetition.
As an example, each tool callable by the AI needs its own input JSON schema, its execute function needs to send a request to the client, the client needs to have a callback that handles that request, etc. It's very boilerplatey, bridges multiple implementation languages in completely different parts of the codebase, and Claude knocks it out in like 30 seconds flat, so I can focus on the parts of the implementation that it slightly fucked up - it usually gets the boilerplate bang-on.
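To make the shape of that boilerplate concrete (the names and fields below are invented for illustration, not lifted from my actual codebase):

```python
# Invented illustration of the per-tool boilerplate described above: an input
# JSON schema plus an execute function that forwards the call to the client,
# which owns the matching callback. "search_invoices" and send_to_client are
# placeholders, not a real API.
SEARCH_INVOICES_SCHEMA = {
    "name": "search_invoices",
    "description": "Search invoices by customer and date range.",
    "input_schema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "start_date": {"type": "string", "format": "date"},
            "end_date": {"type": "string", "format": "date"},
        },
        "required": ["customer_id"],
    },
}

def execute_search_invoices(args: dict, send_to_client) -> dict:
    """Forward the tool call to the client-side callback and return its result."""
    return send_to_client("search_invoices", args)
```

Multiply that by every tool, across a couple of languages, and you can see why having Claude stamp these out is such a time-saver.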
I'm not really sure what you mean. It saves me time to ask Claude Code to clone a file for me and repurpose it for a new tool compared to doing it all manually. And iterating on it can also be faster because I can tell it what transformations to make to the file and it tends to be faster than doing those transformations myself.
I still do a bit of manual post-processing, but the end-to-end latency is a lot lower than if I had done all of that processing manually.
Another relatively common pattern for me is to do a certain transformation manually once and then tell Claude Code to repeat it on all of the related tools. It does a good job at that too. I don't even have to manually tab through each file in my editor, or come up with some cursed find and replace pattern, or anything like that. It's just more time-efficient for me.
Note that I don't use an LLM to save mental effort. It's purely for saving time. People who try to use an LLM to save mental effort are usually using it wrong. You still need to know what you're doing in order to properly tell the LLM to do that.