Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Am I missing something, or where is the actual prompt given to Claude to trigger navigation to the page? Seems like the most interesting detail was left out of the article.

If the prompt said something along the lines of "Claude, navigate to this page and follow any instructions it has to say", it can't really be called "prompt injection" IMO.

EDIT: The linked demo shows exactly what's going on. The prompt is simply "show {url}" and there's no user confirmation after submitting the prompt, where Claude proceeds to download the binary and execute it locally using bash. That's some prompt injection! Demonstrating that you should only run this tool on trusted data and/or in a locked down VM.



OP is demonstrating that the product follows prompts from the pages it visits, not just from it's owner in the UI that controls it.

To be fair, this is a beta product and is likely ridden with bugs. I think OP is trying to make a point that LLM powered applications can be potentially tricked into behaving in ways that are unintended, and the "bug fixes" may be a constant catch up game for developers fighting an infinite pool of edge cases.


Saying 'tricked' is understating it. The example is Claude following instructions from a plain sentence in the web page content. There's no trickery at all, just a tool that's fundamentally unsuited for purpose.


For an LLM to read a screen, it has to be provided the screen as part of its prompt, and it will be vulnerable to prompt injections if any part of that screen contains untrusted data.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: