Allowing LLMs to execute unrestricted commands without human review is risky and...

jahooma · on Nov 7, 2024

Yes, this is a good point. I think not asking to run commands is maybe the most controversial choice we've made so far.

The reason we don't ask for human review is simply: we've found that it works fine to not ask.

We've had a few hundred users so far, and people are usually skeptical of this at first, but as they use it they find that they don't want it to ask for every command. It enables cool use cases where Codebuff can iterate by running tests, seeing the error, attempting a fix, and running them again.
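
The run-tests, see-error, attempt-fix, rerun loop described above could be sketched roughly as follows. This is a minimal illustration, not Codebuff's actual code: `iterate_until_green`, `run_tests`, `attempt_fix`, and `max_fixes` are all hypothetical names, and the two callbacks stand in for "run the test command" and "ask the model for a patch".

```python
from typing import Callable, Tuple


def iterate_until_green(
    run_tests: Callable[[], Tuple[bool, str]],
    attempt_fix: Callable[[str], None],
    max_fixes: int = 5,
) -> bool:
    """Run tests, hand any failure output to a fixer, and retry.

    run_tests   -- hypothetical callback returning (passed, output)
    attempt_fix -- hypothetical callback that edits code based on the output
    max_fixes   -- upper bound on fix attempts, so the loop always terminates
    """
    for _ in range(max_fixes):
        passed, output = run_tests()
        if passed:
            return True
        # Feed the error text back so the next attempt can react to it.
        attempt_fix(output)
    # One last run to see whether the final fix worked.
    passed, _ = run_tests()
    return passed
```

The cap on fix attempts is a design choice for the sketch: without it, an agent that never converges would keep running commands indefinitely.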

If you use source control like git, I also think that it's very hard for things to go wrong. Even if it ran rm -rf from your project directory, you should be able to undo that.

But here's the other thing: it won't do that. Claude is trained to be careful about this stuff and we've further prompted it to be careful.

I think not asking to run commands is the future of coding agents, so I hope you will at least entertain this idea. It's ok if you don't want to trust it, we're not asking you to do anything you are uncomfortable with.

israrkhan · on Nov 7, 2024

I am not afraid of rm -rf on a whole directory. I am afraid of the other stuff it can do to my machines: leaking my SSH keys, cookies, and personal data, touching network devices, and making persistent modifications (malware) to my system. Or maybe inadvertently messing with my Python version, or globally installing some library that messes up the whole system.

Retr0id · on Nov 7, 2024

I, as a well-intending human, have run commands that broke my local python install. At least I was vaguely aware of what I did, and was able to fix things. If I didn't know what had happened I'd be pretty lost.

robertlagrant · on Nov 7, 2024

You'd hope that most of it is just rm -rf .venv && poetry install, or similar.

boratanrikulu · on Nov 7, 2024

> it won't do that. Claude is trained to be careful about this stuff and we've further prompted it to be careful.

Could you please explain a bit how you are sure about it?

jahooma · on Nov 7, 2024

It's mainly from experience. When I first set it up, I didn't have the feature to ask whether to run commands. It has been rawdogging commands this whole time and that has never been a problem for me.

I think we have many other users who are similar. To be fair, sometimes after watching it install packages with npm, people are surprised and say that they would have preferred that it asked. But usually this is just the initial reaction. I'm pretty confident this is the way forward.

boratanrikulu · on Nov 7, 2024

Do you have any sandbox-like restrictions in place to ensure that commands can only touch the project folder and nothing else on the system?

imiric · on Nov 8, 2024

You can use pledge[1] to restrict the tool to read/write only in specific directories, or to use only certain system calls. This is easier than running it in a container or VM, but can be a bit fiddly to set up at first.

That's assuming you trust it with the files in your codebase, and with them being shared with third parties, which is a hard pill to swallow for a proprietary program.

[1]: https://justine.lol/pledge/

jahooma · on Nov 7, 2024

We always reset the directory back to the project directory on each command, so that helps.

But we're open to adding more restrictions so that it can't for example run `cd /usr && rm -rf .`
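
For readers wondering what "resetting the directory on each command" might look like, here is a minimal sketch. It assumes each command is launched in a fresh shell whose working directory is the project root; `run_in_project` is a hypothetical helper, not Codebuff's actual implementation.

```python
import subprocess
from pathlib import Path


def run_in_project(command: str, project_dir: str) -> subprocess.CompletedProcess:
    """Run one shell command with the working directory set to project_dir.

    Hypothetical helper: every call starts a new shell in the project root,
    so a `cd` in one command does not carry over to the next one.
    """
    root = Path(project_dir).resolve()
    return subprocess.run(
        command,
        shell=True,           # the command is an arbitrary shell string
        cwd=root,             # always start from the project root
        capture_output=True,  # collect stdout/stderr for the caller
        text=True,
    )
```

Note that this only resets state between commands. A single command such as `cd /usr && rm -rf .` still runs the `cd` and the `rm` in the same shell, which is why it does not count as a sandbox.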

ATechGuy · on Nov 7, 2024

How about executing commands in a VM (perhaps Firecracker)?

YetAnotherNick · on Nov 7, 2024

It's strange that all the closed models whose stated reason for being closed is safety allow this, while constantly banning apps that allow erotic roleplay. Roleplay is significantly less dangerous than full shell control.

codenamev · on Nov 7, 2024

You are really missing out: https://github.com/e2b-dev/e2b

boratanrikulu · on Nov 7, 2024

I don't see any sandbox usage in the demo video.