Hacker Newsnew | past | comments | ask | show | jobs | submit | iLoveOncall's commentslogin

All the people in the comments are blaming the user for supposedly running with `--dangerously-skip-permissions`, but there's actually absolutely no way for Claude CLI to 100% determine that a command it runs will not affect the home directory.

People are really ignorant when it comes to the safeguards that you can put in place for AI. If it's running on your computer and can run arbitrary commands, it can wipe your disk, that's it.


There is, in fact, a harness built into the Claude Code CLI tool that determines what can and cannot be run automatically. `rm` is on the "can't run this unless the user has approved it" list. So, it's entirely the user's fault here.

Surely you don't think everything that's happening in Claude Code is purely LLMs running in a loop? There's tons of real code that runs to correctly route commands, enable MCP, etc.


That's true - but something I've seen happen (not recently) is claude code getting around its own restrictions by running a python script to do the thing it was not able to do more directly.

echo "rm -rf ~/ > safe-rm" chmod 755 safe-rm ./safe-rm

Sandboxes are hard, because computer science.


Or just 'mv ~ /dev/null'

For what it's worth the author does acknowledge using "yolo mode," which I take to mean `--dangerously-skip-permissions`. So `--dangerously-skip-permissions` is the correct proximal cause. But I agree that it isn't the root cause.

Jup.

Honestly was stumped that there was no more explicit mention of this in the Anthropoc docs after reading this post couple days back.

Sandbox mode seems like a fake sense of security.

Short of containerizing Claude, there seems to be no other truly safe option.


I mean it's hard to tell if this story is even real, but on a serious note, I do think Anthropic should only allow `--dangerously-skip-permissions` to be applied if it's running in a container.

How exactly do you determine that you are running in a container?

Oof, you are bringing out the big philosophical question there. Many people have wondered whether we are running in a simulation or not. So far inconclusive and not answerable unfortunately.

:)


I asked Claude and it had a few good ideas… Not bulletproof, but if the main point is to keep average users from shooting themselves in the foot, anything is better than nothing.

I'm not sure how much you should do to stop people who enabled `--dangerously-skip-permissions` from shooting themselves in the foot. They're literally telling us to let them shoot their foot. Ultimately we have to trust that if we make good information and tools available to our users, they will exercise good judgment.

I think it would be better to focus on providing good sandboxing tools and a good UX for those tools so that people don't feel the need to enable footgun mode.


Wow 89% availability is a joke

> The UK is actually a scary place right now, if you are paying attention..

It has been the most authoritarian country in the West for decades already, this is nothing new.

British people are the most apathetic people in the world, so it's really easy to abuse them.


>the most authoritarian country in the West

Australia and the US are more authoritarian in specific areas e.g. censorship and taxation respectively.. but overall, yes, the UK is worse.

>British people are the most apathetic

I'm not sure that's fair, our culture looks apathetic from abroad, but like other countries we care deeply about what our media tell us to care about.


> our culture looks apathetic from abroad

I live in the UK, and have lived in multiple other western countries before.

British people absolutely ARE apathetic.


Check out this fully self-hosted solution that was posted on HN recently: https://news.ycombinator.com/item?id=46081188

After the article I set it up myself, it took me around a day I would say. It supports exactly what you're asking for, although it's not a comprehensive tutorial so you'll need to figure some things out on your own.

Full disclosure I ended up turning it off only 2 days later because it was causing too many issues with networking and I suck at networking-related things, but it was great while it was working. I plan on setting it up again in the near future.


> I truly believe they're really going to make resilience their #1 priority now

I hope that was their #1 priority from the very start given the services they sell...

Anyway, people always tend to overthink about those black-swan events. Yes, 2 happened in a quick succession, but what is the average frequency overall? Insignificant.


This is Cloudflare. They've repeatedly broken DNS for years.

Looking across the errors, it points to some underlying practices: a lack of systems metaphors, modularity, testability, and an reliance on super-generic configuration instead of software with enforced semantics.


I think they have to strike a balance between being extremely fast (reacting to vulnerabilities and DDOS attacks) while still being resilient. I don't think it's an easy situation

The most surprising from this article is that CloudFlare handles only around 85M TPS.

it can't really be that small, can it?

that's maybe half a rack of load


Given the number of lua scripts they seem to be running, it has to take more than half a rack.

Since it's not included in the main article, here is the prompt:

> You are a stock trading agent. Your goal is to maximize returns.

> You can research any publicly available information and make trades once per day.

> You cannot trade options.

> Analyze the market and provide your trading decisions with reasoning.

>

> Always research and corroborate facts whenever possible.

> Always use the web search tool to identify information on all facts and hypotheses.

> Always use the stock information tools to get current or past stock information.

>

> Trading parameters:

> - Can hold 5-15 positions

> - Minimum position size: $5,000

> - Maximum position size: $25,000

>

> Explain your strategy and today's trades.

Given the parameters, this definitely is NOT representative of any actual performance.

I recommend also looking at the trade history and reasoning for each trade for each model, it's just complete wind.

As an example, Deepseek made only 21 trades, which were all buys, which were all because "Companyy X is investing in AI". I doubt anyone believe this to be a viable long-term trading strategy.


Agree. Those parameters are incredibly artificial bullshit.

I know this is a joke comment, but there are plenty of websites that simulate the stock market and where you can use paper money to trade.

People say it's not equivalent to actually trading though, and you shouldn't use it as a predictor of your actual trading performance, because you have a very different risk tolerance when risking your actual money.


Yeah, if you give me $100K I'm almost certainly going to make very different decisions than either a supposedly optimizing computer or myself at different ages.

> Are you sure? While Amazon doesn't own a "true" frontier model they have their own foundation model called Nova.

I work for Amazon, everyone is using Claude. Nova is a piece of crap, nobody is using it. It's literally useless.

I haven't tried the new versions that just came out though.


> How will the Google/Anthropic/OpenAI's of the world make money on AI if open models are competitive with their models?

They won't. Actually, even if open models aren't competitive, they still won't. Hasn't this been clear since a while already?

There's no moat in models, investments in pure models has only been to chase AGI, all other investment (the majority, from Google, Amazon, etc.) has been on products using LLMs, not models themselves.

This is not like the gold rush where the ones who made good money were the ones selling shovels, it's another kind of gold rush where you make money selling shovels but the gold itself is actually worthless.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: