Table of Contents
- Introduction
- SQL Injection via get_count
- Sensitive Data Exposure via get_config
- Local File Inclusion via resources/read
- What These Three Paths Have in Common
- How to Prevent MCP Data Leaks
- Conclusion
Introduction
MCP servers are often presented as a clean interface for AI agents to discover tools and interact with applications. That framing can be misleading. In practice, an MCP endpoint is still an application surface, and if its tools proxy unsafe backend behavior, it can become a highly efficient data-exposure layer.
Broken Crystals shows this clearly. Its MCP endpoint at /api/mcp uses a separate initialize step, issues its own Mcp-Session-Id, and then allows clients to enumerate tools and resources before invoking them. Once that session is established, the question is no longer just whether the app has vulnerabilities. The question is which of those vulnerabilities have been wrapped into agent-friendly capabilities.
Three of the most important examples in this repo are get_count, get_config, and resources/read. They look like convenient tools. In reality, they create three different paths to sensitive data leakage.
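The session flow described above can be sketched as a sequence of JSON-RPC request bodies. This is a minimal illustration that only builds the messages and headers; it sends nothing over the network. The helper names, the client name, the protocol revision string, and the placeholder session id are assumptions for illustration and are not taken from the Broken Crystals repo.

```python
import json
from itertools import count

_next_id = count(1)  # source of JSON-RPC request ids


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request body for an MCP method."""
    message = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        message["params"] = params
    return message


def mcp_headers(session_id=None):
    """Headers for a POST to the MCP endpoint; the session id is added once known."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if session_id is not None:
        headers["Mcp-Session-Id"] = session_id
    return headers


# Step 1: initialize. The server's response is expected to carry Mcp-Session-Id.
initialize = rpc_request(
    "initialize",
    {
        "protocolVersion": "2025-03-26",  # assumed protocol revision
        "capabilities": {},
        "clientInfo": {"name": "audit-client", "version": "0.1"},  # hypothetical
    },
)

# Step 2: with the session id attached, enumerate tools and resources.
SESSION_ID = "example-session-id"  # placeholder for the value returned in step 1
list_tools = rpc_request("tools/list")
list_resources = rpc_request("resources/list")

for body in (initialize, list_tools, list_resources):
    session = None if body is initialize else SESSION_ID
    print(json.dumps(body), mcp_headers(session))
```

Nothing in this flow requires prior knowledge of the backend: once the session id is attached, the enumeration calls tell the client what it can invoke.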
SQL Injection via get_count
The get_count tool is exposed as a public MCP capability. Its contract is simple: accept a query string and return a count. Under the hood, though, it proxies the user-supplied value directly into /api/testimonials/count and returns the raw result as text.
That design turns the MCP tool into a database disclosure primitive. Instead of restricting the caller to a fixed counting operation, it lets the caller decide what SQL gets executed. For example, the tool can be called with a query such as select count(table_name) as count from information_schema.tables, and the response comes back as the query result. That is already a leak: it exposes database metadata and confirms the caller can query internal schema information rather than just count testimonials.
This is why SQL injection through MCP matters even when the tool name sounds harmless. An AI agent, attacker, or compromised integration does not need to know hidden routes or reverse engineer the backend. The tool is already documented, discoverable, and callable through the MCP flow.
The fix is not to “watch the prompts” more carefully. It is to stop accepting raw SQL as tool input. MCP tools should expose typed business parameters, not backend query language.
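One way to picture "typed business parameters" is a counting function whose SQL text is fixed and whose only caller input is an optional, validated filter passed as a bound parameter. The sketch below uses an in-memory SQLite database as a stand-in for the real backend; the table layout, the name filter, the function name, and the length limit are all hypothetical.

```python
import sqlite3
from typing import Optional

MAX_NAME_LENGTH = 64  # hypothetical limit on the filter value


def count_testimonials(conn: sqlite3.Connection, name: Optional[str] = None) -> int:
    """Count testimonials, optionally filtered by author name.

    The SQL text is fixed; caller input only ever reaches the database
    as a bound parameter, never as part of the statement.
    """
    if name is None:
        row = conn.execute("SELECT COUNT(*) FROM testimonials").fetchone()
    else:
        if not isinstance(name, str) or len(name) > MAX_NAME_LENGTH:
            raise ValueError("name must be a string of at most 64 characters")
        row = conn.execute(
            "SELECT COUNT(*) FROM testimonials WHERE name = ?", (name,)
        ).fetchone()
    return row[0]


# Stand-in data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testimonials (name TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO testimonials VALUES (?, ?)",
    [("alice", "Great crystals"), ("bob", "Fast shipping")],
)

print(count_testimonials(conn))                  # all rows
print(count_testimonials(conn, "alice"))         # filtered by a typed parameter
print(count_testimonials(conn, "x' OR '1'='1"))  # injection text is just a literal
```

With this shape, the caller can still ask a useful question, but it can no longer choose which tables or columns the query touches.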
Sensitive Data Exposure via get_config
If get_count shows how MCP can leak data by executing unsafe queries, get_config shows how it can leak secrets by simply returning too much.
In Broken Crystals, get_config is an admin-only tool, but that does not make it safe. The implementation proxies /api/config, and unless include_sensitive is explicitly set to false, it returns the full configuration object. In other words, sensitive output is the default behavior.
The example response in the repo includes an S3 bucket URL, a PostgreSQL connection string, and a Google Maps API key. That is exactly the kind of data security teams try to keep out of logs, frontends, test fixtures, and support tooling. Exposing it through MCP means any agent or workflow with admin-level MCP access can retrieve it in one structured call.
This is a common failure mode in AI integrations. Teams assume the main risk is unauthorized public access. But over-privileged internal access is often the more realistic problem. If an agent is granted broad admin permissions for convenience, or if an authenticated MCP session is compromised, a configuration tool like this can leak credentials, infrastructure locations, service URLs, and third-party keys immediately.
The lesson is straightforward: admin-only is not a substitute for output minimization. Sensitive config should never be the default payload of an MCP tool. If a tool must exist at all, it should return a tightly redacted view designed for that specific use case.
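A tightly redacted view can be as simple as an allowlist of fields that are safe to return, applied before anything leaves the tool. The sketch below assumes a flat configuration dictionary; the key names and values are placeholders invented for illustration and do not reflect the actual /api/config payload.

```python
SAFE_CONFIG_KEYS = frozenset({"app_name", "environment"})  # hypothetical allowlist


def redacted_config(full_config: dict, allowed_keys=SAFE_CONFIG_KEYS) -> dict:
    """Return only allowlisted keys; anything not listed is dropped by default."""
    return {key: value for key, value in full_config.items() if key in allowed_keys}


# Placeholder configuration; none of these values are real.
full_config = {
    "app_name": "demo-app",
    "environment": "staging",
    "db_url": "postgres://user:password@db.internal/app",
    "maps_api_key": "PLACEHOLDER_KEY",
    "bucket_url": "https://example-bucket.invalid",
}

print(redacted_config(full_config))
```

An allowlist is used here rather than a denylist of known secrets, so a newly added sensitive field stays hidden until someone deliberately exposes it.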
Local File Inclusion via resources/read
The most direct data leak in the MCP layer is resources/read.
Broken Crystals exposes a resource model that accepts file:// URIs and proxies them into /api/file/raw. The implementation parses the URI, extracts the path, and returns the file contents. A caller can therefore request file:///etc/hosts or file:///etc/passwd and receive the contents of those server-side files.
This is classic local file inclusion, but MCP makes it easier to operationalize. The caller does not need a browser exploit, path traversal trick, or guesswork about an upload directory. It can simply call resources/list, see that local file access exists, and then invoke resources/read with a server-side file URI.
That matters because local files are rarely just harmless system text. In real environments, file access can expose application configs, environment files, service credentials, SSH material, cloud metadata, and signing keys. Once file read is available through an agent-facing interface, the MCP server has effectively become a controlled exfiltration channel.
The right fix is to avoid exposing raw filesystem access through MCP in the first place. Resources should be virtualized, explicitly allowlisted, and mapped to safe application objects, not arbitrary local paths.
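A virtualized resource model might look like the following sketch: the server publishes a fixed registry of resource URIs, and any URI outside that registry is rejected without ever being interpreted as a path. The docs:// scheme, the registry contents, and the exception name are hypothetical; a temporary file stands in for a real application object.

```python
import tempfile
from pathlib import Path


class ResourceNotAllowed(Exception):
    """Raised when a requested URI is not in the published registry."""


def read_resource(uri: str, registry: dict) -> str:
    """Resolve a URI through the registry only; the URI is never parsed as a path."""
    try:
        path = registry[uri]
    except KeyError:
        raise ResourceNotAllowed(uri) from None
    return path.read_text(encoding="utf-8")


# Stand-in file so the sketch runs on its own.
workdir = Path(tempfile.mkdtemp())
changelog = workdir / "changelog.txt"
changelog.write_text("v1.0: initial release\n", encoding="utf-8")

# The only resources the server chooses to publish.
REGISTRY = {"docs://changelog": changelog}

print(read_resource("docs://changelog", REGISTRY))
try:
    read_resource("file:///etc/passwd", REGISTRY)
except ResourceNotAllowed as exc:
    print("rejected:", exc)
```

Because lookups are exact matches against server-chosen keys, there is no path for the caller to manipulate, which sidesteps traversal and scheme tricks entirely.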
What These Three Paths Have in Common
These issues are different technically, but they share the same architectural problem: MCP is wrapping sensitive backend behavior in a discoverable interface built for automation.
get_count leaks through unsafe query execution. get_config leaks through overbroad secret exposure. resources/read leaks through direct file access. In each case, the underlying bug is familiar. What changes with MCP is the delivery mechanism. The dangerous functionality becomes easier to find, easier to invoke, and easier to chain into larger attack flows.
That is why MCP endpoints need their own AppSec review, not just inherited trust from the APIs behind them. Once a tool or resource is published to an agent, it becomes part of the attack surface.
How to Prevent MCP Data Leaks
Start with the basics, but apply them at the MCP layer itself.
Do not expose backend query languages as tool parameters. Do not return sensitive configuration by default. Do not map raw local paths into MCP resources. Treat every tool definition as a privilege decision, every resource as a data exposure decision, and every MCP session as its own trust boundary.
Most importantly, test MCP directly. Broken Crystals is valuable because it demonstrates these paths end to end: session initialization, role checks, tool invocation, resource reads, and concrete leaked outputs. That is the level where real agent security problems appear.
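Direct testing can start with the tool definitions themselves. As a rough first pass, the sketch below scans a tools/list result for parameter names that often signal a backend primitive being passed through. It is a naive heuristic, not a scanner: the list of risky names is an assumption, and the sample schemas are invented for illustration rather than copied from the repo.

```python
# Heuristic list of parameter names worth a manual look (assumption).
RISKY_PARAMETER_NAMES = {"query", "sql", "path", "uri", "file", "command"}


def flag_risky_tools(tools_list_result: dict) -> dict:
    """Map tool name -> sorted list of parameter names that deserve review."""
    findings = {}
    for tool in tools_list_result.get("tools", []):
        properties = tool.get("inputSchema", {}).get("properties", {})
        risky = sorted(
            name for name in properties if name.lower() in RISKY_PARAMETER_NAMES
        )
        if risky:
            findings[tool["name"]] = risky
    return findings


# Invented sample shaped like a tools/list result.
sample = {
    "tools": [
        {
            "name": "get_count",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
            },
        },
        {
            "name": "get_config",
            "inputSchema": {
                "type": "object",
                "properties": {"include_sensitive": {"type": "boolean"}},
            },
        },
    ]
}

print(flag_risky_tools(sample))
```

A name-based check like this misses tools such as get_config, whose risk lies in what it returns, so it complements but does not replace invoking each tool and inspecting the output.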
Conclusion
Sensitive data leakage through MCP does not require a new class of AI-specific vulnerability. It can happen whenever existing application behavior is exposed through an interface designed for discovery, automation, and chained execution. That makes familiar weaknesses far more usable in practice.
For teams adopting MCP, the takeaway is straightforward: treat agent-facing integrations as first-class attack surfaces. Review what they expose, minimize the data they return, and test them directly. If security validation stops at the underlying API layer, the most important risks may still be sitting in the MCP layer above it.
