Back when Claude Code had per-token pricing, almost nobody used it, because it was clearly much more expensive than Cursor - $20/month flat for Cursor vs. $5-10/day for per-token Claude. The incentives showed in how each product used tokens: Claude Code has no particular qualms about submitting a gigantic number of tokens and letting Sonnet figure it all out, whereas Cursor puts a lot of traditional software engineering into assembling the minimal context it actually needs. Now that Claude Code is on a fixed-price plan, Anthropic strangely doesn't seem to be doing anything to optimize the number of tokens Claude consumes.
I think it's quite plausible that Anthropic is bleeding ~$100/month in token costs per $20/month user: even at an 80% gross margin on API pricing, $100 of list-price tokens costs roughly $20 to serve, which makes the subscription merely breakeven. Their limited capacity also means that they are _losing_ the opportunity to sell the same compute at a per-token marginal profit. I think the only plausible endgame here is that Anthropic uses the usage data to RL-finetune Claude Code to the point where it is actually worth a $200/month subscription.
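To make that back-of-envelope arithmetic explicit (the 80% margin and $100/month figures are my assumptions above, not Anthropic's actual numbers):

```python
# Rough breakeven arithmetic under the assumptions stated above.
token_spend_at_list_price = 100.0  # $/month a heavy user would pay at API rates
gross_margin = 0.80                # assumed margin baked into API pricing
compute_cost = token_spend_at_list_price * (1 - gross_margin)  # ~$20/month
subscription_revenue = 20.0        # $/month Pro plan

print(subscription_revenue - compute_cost)  # 0.0 -> merely breakeven
```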
Enjoy the $20/month Claude Pro plan while it lasts; I don't really see it sticking around for more than a year at best.
The Claude Code privacy policy[0] is pretty explicit that, by default, they don't train on prompts, usage data, or even explicitly provided feedback (presumably via /bug?), though feedback may be used for other product improvements.
> By default, Anthropic does not train generative models using code or prompts that are sent to Claude Code.
> We aim to be fully transparent about how we use your data. We may use feedback to improve our products and services, but we will not train generative models using your feedback from Claude Code.
[...]
> If you choose to send us feedback about Claude Code, such as transcripts of your usage, Anthropic may use that feedback to debug related issues and improve Claude Code’s functionality (e.g., to reduce the risk of similar bugs occurring in the future). We will not train generative models using this feedback. Given their potentially sensitive nature, we store user feedback transcripts for only 30 days.
As for the value they place on that data: they do have a program where you can opt in to having your data used for training[1] in exchange for a discount on API rates.
As a former big tech engineer, I can't help but come up with a gazillion ways to work around these sorts of seemingly straightforward policies.
Here's one way they could get around their own privacy policy: keep track of what % of Claude-generated code is retained in the codebase over time (as an indicator of how high-quality / bug-free the code was), then A/B test variations of Claude Code to see which have higher retention percentages.
No usage data is retained, no code is retained, nothing is used other than a single floating-point number, and yet they get to improve their product atop your usage patterns.
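A minimal sketch of what that could look like, assuming a client-side harness that hashes generated lines (all names here are hypothetical, not anything Anthropic has described):

```python
# Hypothetical sketch of a privacy-preserving "code retention" A/B metric.
# Only line hashes live on the client, and only one float per experiment
# arm ever leaves it; nothing here is Anthropic's actual pipeline.
from dataclasses import dataclass

@dataclass
class Session:
    arm: str             # which Claude Code variant served this session
    generated: set[str]  # hashes of lines Claude wrote
    surviving: set[str]  # hashes still present in the repo 30 days later

def retention(s: Session) -> float:
    """Fraction of Claude-generated lines still in the codebase later."""
    return len(s.generated & s.surviving) / len(s.generated) if s.generated else 0.0

def ab_report(sessions: list[Session]) -> dict[str, float]:
    """Mean retention per A/B arm - the single number reported upstream."""
    by_arm: dict[str, list[float]] = {}
    for s in sessions:
        by_arm.setdefault(s.arm, []).append(retention(s))
    return {arm: sum(vals) / len(vals) for arm, vals in by_arm.items()}
```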
Here's another idea: use a summarization model to transform your session transcript into a set of bits - "user was satisfied/dissatisfied with this conversation", "user indicated that Claude was doing something dangerous", "user indicated that Claude was doing something overly complicated / too simple", "user interrupted Claude", "user indicated Claude should remember something in CLAUDE.md", etc. - and then train on these auxiliary signals, without ever seeing the original code or usage data.
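Again only as a hedged sketch - the signal names are mine, and the keyword matching is a stand-in for a real summarization model:

```python
# Hypothetical sketch: distill a transcript into coarse boolean signals and
# discard the text. The signal names and keyword matching are illustrative
# stand-ins for a summarization model, not an actual Anthropic pipeline.
def classify(transcript: str) -> dict[str, bool]:
    t = transcript.lower()
    return {
        "user_satisfied": "thanks" in t or "perfect" in t,
        "claude_did_something_dangerous": "rm -rf" in t,
        "solution_overly_complicated": "too complicated" in t,
        "user_interrupted_claude": "[request interrupted" in t,
        "user_updated_claude_md": "claude.md" in t,
    }

def to_training_signal(transcript: str) -> dict[str, bool]:
    bits = classify(transcript)
    # Only the bits survive as auxiliary training labels; the original
    # code and usage data are never stored or trained on.
    return bits
```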
I always get a kick out of the sheer number of HNers with deep concern about “training on their data” while hacking on a CRUD Spring Boot service with a Next.js front-end :)
Compared to its original release in late February, Claude Code's token use is greatly reduced now. But I agree with you about the period since the late May Claude 4 releases; it hasn't decreased much since then.
$20? I'd be excited if the $200/month plan is still around after a year. Cheapskate that I am, I went with Max reluctantly, and there's no way I'm giving it up now. I'm really worried that if they raise the prices I'll find it extremely hard not to fall for it! Just hoping the open-source models catch up by then. Even if they only reach CC's abilities of today, that's good enough for me!