
Legit question: why would you be using an untrusted tool in the first place?

Why are people surprised they are vulnerable to a malicious tool when they are using untrusted and/or remotely hosted tools?

Without some method to tag context as sensitive, and a model or service that respects that tagging, you'll likely never be able to trust that the LLM isn't sending sensitive information to an untrusted endpoint. If you accept that, then you have to design your system around not using untrusted endpoints. Just adding untrusted endpoints is kinda like running untrusted applications on your machine. It's fine until it isn't.

At the very least, your agent should have some way to mark the entire session as 'tainted', so that calling out to untrusted sources is forbidden once sensitive context enters the loop. That check would need to live outside the LLM calling loop, since the LLM could be tricked before the sensitive data was introduced. With the tool annotations being added to the spec, your internal tools could provide those flags to the agent to facilitate such a blunt security process. And I am aware there are likely holes in such a plan, hence my first question.
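
To make the 'tainted session' idea concrete, here is a rough sketch of what I mean, not a real implementation. Everything in it is made up for illustration: `Tool`, `Session`, `TaintedSessionError`, and the `trusted` / `returns_sensitive` flags are hypothetical names, not anything from the spec's tool annotations. The point is only that the check sits in the code that dispatches tool calls, so nothing the model says can unset it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


class TaintedSessionError(Exception):
    """Raised when a tainted session tries to call an untrusted tool."""


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[str], str]
    trusted: bool                    # hypothetical flag: internal / vetted tool
    returns_sensitive: bool = False  # hypothetical flag: output may be sensitive


class Session:
    """Dispatches tool calls on behalf of the LLM loop and owns the taint bit."""

    def __init__(self, tools: Iterable[Tool]) -> None:
        self._tools = {t.name: t for t in tools}
        self.tainted = False  # one-way: nothing in this class ever resets it

    def call(self, name: str, arg: str) -> str:
        tool = self._tools[name]
        # Enforced here, outside the model: a tricked LLM can request the
        # call, but it cannot skip this check.
        if self.tainted and not tool.trusted:
            raise TaintedSessionError(
                f"{name} is untrusted and the session holds sensitive context"
            )
        # Taint before running, so a partial failure still counts.
        if tool.returns_sensitive:
            self.tainted = True
        return tool.run(arg)


session = Session([
    Tool("read_payroll", lambda q: "salary data", trusted=True, returns_sensitive=True),
    Tool("web_search", lambda q: "results for " + q, trusted=False),
])

session.call("web_search", "weather")   # allowed: session is still clean
session.call("read_payroll", "q3")      # allowed, and taints the session
try:
    session.call("web_search", "weather")
except TaintedSessionError as err:
    print("blocked:", err)
```

This is deliberately blunt. It does nothing about a trusted tool that itself relays data somewhere else, and it doesn't help if the untrusted tool already poisoned the context before the sensitive data showed up, which is one of the holes I mentioned.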



For the same reason people use untrustworthy extensions in browsers or IDEs. Those extensions need not even start out untrustworthy - they change hands and become malicious after establishing popularity.



