Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

All do respect to the efforts here to make things more secure, but this doesn't make much sense to me.

How can an individual MCP server assess prompt injection threats for my use case?

Why is it the Supabase MCP server's job to sanitize the text that I have in my database rows? How does it know what I intend to use that data for?

What if I have a database of prompt injection examples I am using for a training? Supabase MCP is going to amend this data?

What if I'm building an app where the rows are supposed to be instructions?

What if I don't use MCP and I'm just using Supabase APIs directly in my agent code? Is Supabase going to sanitize the API output as well?

We all know that even if you "Wrap all SQL responses with prompting that discourages the LLM from following instructions/commands injected within user data" future instructions can still override this. Ie this is exactly why you have to add these additional instructions in the first place because the returned values override previous instructions!

You don't have to use obvious instruction / commands / assertive language to prompt inject. There are a million different ways to express the same intent in natural language, and a gazillion different use cases of how applications will be using Supabase MCP results. How confident are you that you will catch them all with E2E tests? This feels like a huge game of whack-a-mole.

Great if you are adding more guardrails for Supabase MCP server. But what about all the other MCP servers? All it takes is a client connected to one other MCP server that returns a malicious response to use the Supabase MCP Server (even correctly within your guardrails) and then use that response however it sees fit.

All in all I think effort like this will give us a false sense of security. Yes they may reduce chances for some specific prompt injections a bit - which sure we should do. But just because they and turn some example Evals or E2E tests green we should not feel good and safe and that the job is done. At the end of the day the system is still inherently insecure, and not deterministically patched. It only takes 1 breach for a catastrophe.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: