I've been on the 1M context window with Claude since 4.0, and it gets pretty expensive when you run 1M of context on a long-running project (I'm mostly using it in Cline for coding). I think they've realized that more context length means more $ for most agentic coding workflows on the API.
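A rough back-of-the-envelope on why, since an agent loop resends the whole context as input tokens on every turn. The $3 per million input tokens below is a made-up illustrative rate, not any provider's actual pricing:

  # Cost of resending a large context on every agent turn.
  # The per-token price is hypothetical, purely for illustration.
  PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000  # USD per token, assumed

  def session_cost(context_tokens: int, turns: int) -> float:
      # Each turn resends the full context as input tokens.
      return context_tokens * turns * PRICE_PER_INPUT_TOKEN

  print(session_cost(1_000_000, 50))  # 150.0 -> ~$150 for a 50-turn session
  print(session_cost(100_000, 50))    #  15.0 -> ~$15 at a 100k budget

This ignores output tokens and prompt caching, which change the numbers but not the shape: cost scales linearly with both context size and turn count.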




You should be doing everything you can to keep context under 200k tokens, ideally even under 100k. All the models degrade badly as the context grows.
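For what it's worth, here's a minimal sketch of enforcing that kind of budget in an agent loop, assuming a messages list in the usual {role, content} shape; count_tokens is a crude 4-chars-per-token stand-in for whatever tokenizer your provider actually exposes:

  TOKEN_BUDGET = 100_000

  def count_tokens(text: str) -> int:
      return len(text) // 4  # rough heuristic; swap in a real tokenizer

  def trim_history(messages: list[dict], budget: int = TOKEN_BUDGET) -> list[dict]:
      # Always keep the system prompt, then keep the newest
      # messages that still fit under the budget.
      system, rest = messages[0], messages[1:]
      kept, used = [], count_tokens(system["content"])
      for msg in reversed(rest):
          cost = count_tokens(msg["content"])
          if used + cost > budget:
              break
          kept.append(msg)
          used += cost
      return [system] + list(reversed(kept))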

I don't have that experience with Gemini. Up to about 90% full, it's just fine.

If a model is designed around long context, rather than resorting to compression to reach higher input token lengths, it doesn't 'fall off' as it nears the context window limit. When working with large codebases, exhausting or compacting the context actually causes more issues, since the agent forgets what was in the other libraries and files. Google realized this internally and was among the first to reach a 2M-token context length (internally at first, then released publicly).
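To make the compaction point concrete, here's a sketch of what a typical compaction step does; summarize() is a stub standing in for a real model call, and the point is that whatever detail it drops is gone for good:

  def summarize(turns: list[str]) -> str:
      # Stand-in for an LLM summarization call; real versions
      # lose file- and symbol-level detail the same way.
      return f"[summary of {len(turns)} earlier turns; detail discarded]"

  def compact(history: list[str], keep_recent: int = 10) -> list[str]:
      # Replace everything but the most recent turns with one summary.
      old, recent = history[:-keep_recent], history[-keep_recent:]
      if not old:
          return history
      return [summarize(old)] + recent

Once the old turns are collapsed, the agent can no longer quote the code it summarized away, which is exactly the forgetting described above.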


