Back when Claude Code had per-token pricing, almost nobody used it, because it was clearly much more expensive than Cursor - $20/month flat for Cursor vs. $5-10/day for per-token Claude. The incentives showed in how each product used tokens: Claude Code has no particular qualms about submitting a gigantic number of tokens and letting Sonnet figure it all out, whereas Cursor puts a lot of traditional software engineering into assembling the minimal context it actually needs. Now that Claude Code is on a fixed-price plan, Anthropic strangely doesn't seem to be doing anything to optimize the number of tokens Claude consumes.
I think it's quite plausible that Anthropic is bleeding ~$100/month in token costs per $20/month user: even at an 80% gross margin on API pricing, $100 of list-price tokens costs roughly $20 to serve, which makes the subscription merely breakeven. Their limited capacity also means that they are _losing_ the opportunity to sell the same compute at a per-token marginal profit. I think the only plausible endgame here is that Anthropic uses the usage data to RL-finetune Claude Code to the point where it is actually worth a $200/month subscription.
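To make that back-of-envelope arithmetic explicit (the 80% margin and $100/month figures are my assumptions above, not Anthropic's actual numbers):

```python
# Rough breakeven arithmetic under the assumptions stated above.
token_spend_at_list_price = 100.0  # $/month a heavy user would pay at API rates
gross_margin = 0.80                # assumed margin baked into API pricing
compute_cost = token_spend_at_list_price * (1 - gross_margin)  # ~$20/month
subscription_revenue = 20.0        # $/month Pro plan

print(subscription_revenue - compute_cost)  # 0.0 -> merely breakeven
```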
Enjoy the $20/month Claude Pro plan while it lasts; I don't really see it sticking around for more than a year at best.
The Claude Code privacy policy[0] is pretty explicit that, by default, they don't train on prompts, usage data, or even explicitly provided feedback (presumably via /bug?), though feedback may be used for other product improvements.
> By default, Anthropic does not train generative models using code or prompts that are sent to Claude Code.
> We aim to be fully transparent about how we use your data. We may use feedback to improve our products and services, but we will not train generative models using your feedback from Claude Code.
[...]
> If you choose to send us feedback about Claude Code, such as transcripts of your usage, Anthropic may use that feedback to debug related issues and improve Claude Code’s functionality (e.g., to reduce the risk of similar bugs occurring in the future). We will not train generative models using this feedback. Given their potentially sensitive nature, we store user feedback transcripts for only 30 days.
As for the value they place on that data: they do have a program where you can opt in to having your data used for training[1] in exchange for a discount on API rates.
As a former big tech engineer, I can't help but come up with a gazillion ways to work around these sorts of seemingly straightforward policies.
Here's one way they could get around their own privacy policy: keep track of what % of Claude-generated code is retained in the codebase over time (as an indicator of how high-quality / bug-free the code was), then A/B test variations of Claude Code to see which have higher retention percentages.
No usage data is retained, no code is retained, nothing is used other than a single floating-point number, and yet they get to improve their product atop your usage patterns.
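A minimal sketch of what that could look like, assuming a client-side harness that hashes generated lines (all names here are hypothetical, not anything Anthropic has described):

```python
# Hypothetical sketch of a privacy-preserving "code retention" A/B metric.
# Only line hashes live on the client, and only one float per experiment
# arm ever leaves it; nothing here is Anthropic's actual pipeline.
from dataclasses import dataclass

@dataclass
class Session:
    arm: str             # which Claude Code variant served this session
    generated: set[str]  # hashes of lines Claude wrote
    surviving: set[str]  # hashes still present in the repo 30 days later

def retention(s: Session) -> float:
    """Fraction of Claude-generated lines still in the codebase later."""
    return len(s.generated & s.surviving) / len(s.generated) if s.generated else 0.0

def ab_report(sessions: list[Session]) -> dict[str, float]:
    """Mean retention per A/B arm - the single number reported upstream."""
    by_arm: dict[str, list[float]] = {}
    for s in sessions:
        by_arm.setdefault(s.arm, []).append(retention(s))
    return {arm: sum(vals) / len(vals) for arm, vals in by_arm.items()}
```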
Here's another idea: use a summarization model to transform your session transcript into a set of bits - "user was satisfied/dissatisfied with this conversation", "user indicated that Claude was doing something dangerous", "user indicated that Claude was doing something overly complicated / too simple", "user interrupted Claude", "user indicated Claude should remember something in CLAUDE.md", etc. - and then train on these auxiliary signals, without ever seeing the original code or usage data.
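Again only as a hedged sketch - the signal names are mine, and the keyword matching is a stand-in for a real summarization model:

```python
# Hypothetical sketch: distill a transcript into coarse boolean signals and
# discard the text. The signal names and keyword matching are illustrative
# stand-ins for a summarization model, not an actual Anthropic pipeline.
def classify(transcript: str) -> dict[str, bool]:
    t = transcript.lower()
    return {
        "user_satisfied": "thanks" in t or "perfect" in t,
        "claude_did_something_dangerous": "rm -rf" in t,
        "solution_overly_complicated": "too complicated" in t,
        "user_interrupted_claude": "[request interrupted" in t,
        "user_updated_claude_md": "claude.md" in t,
    }

def to_training_signal(transcript: str) -> dict[str, bool]:
    bits = classify(transcript)
    # Only the bits survive as auxiliary training labels; the original
    # code and usage data are never stored or trained on.
    return bits
```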
I always get a kick out of the sheer number of HNers with deep concern about “training on their data” while hacking on a CRUD Spring Boot service with a Next.js front-end :)
Compared to its original release in late February, Claude Code's token use is greatly reduced now. But I agree with you about the period since the late May Claude 4 releases; it hasn't decreased much since then.
$20? I'd be excited if the $200/month plan is still around after a year. Cheapskate that I am, I went with Max reluctantly, and there's no way I'm giving it up now. I'm really worried that if they raise the prices I'll find it extremely hard not to fall for it! Just hoping the open-source models catch up by then. Even if they only reach CC's abilities of today, that's good enough for me!