MVAR: Deterministic execution firewall for LLM agents (50 attacks blocked)

Posted by ShawnC21 1 hour ago


Comments

Comment by ShawnC21 1 hour ago

One clarification: MVAR is not a prompt filter and not a model judge.

The enforcement happens at the execution boundary. If model output reaches a critical sink (shell, filesystem, credentials, etc.) with untrusted provenance, the runtime blocks the call deterministically.

The repo includes the full attack corpus and proof pack if anyone wants to test the enforcement model locally. Cheers - Shawn

Comment by ShawnC21 1 hour ago

Hi HN — I'm Shawn, the author. We did a Show HN for the GitHub launch a few weeks back. A number of things have shipped since then, so I'm posting an update.

The core thesis

Prompt injection is not a prompt problem. It's an execution problem.

When an LLM agent can run shell commands, call APIs, read files, or use credentials, model output is effectively privileged code. Most defenses try to detect malicious prompts. That breaks down once output reaches execution sinks.

MVAR assumes prompt injection will occur and enforces deterministic policy between model output and privileged operations.

Invariant:

UNTRUSTED input + CRITICAL sink → BLOCK

No prompt classification. No heuristics. No secondary model judging intent. Deterministic enforcement at runtime.
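To make the invariant concrete, here is a minimal sketch of what deterministic enforcement at the execution boundary could look like. This is an illustrative model, not MVAR's actual implementation; the `Provenance`, `Criticality`, and `enforce` names are hypothetical.

```python
from enum import Enum


class Provenance(Enum):
    TRUSTED = "trusted"      # e.g. operator-authored instructions
    UNTRUSTED = "untrusted"  # e.g. model output derived from external content


class Criticality(Enum):
    NORMAL = "normal"
    CRITICAL = "critical"    # shell, filesystem, credentials, ...


class BlockedError(Exception):
    """Raised when the invariant is violated."""


def enforce(provenance: Provenance, sink: Criticality) -> None:
    # The invariant: UNTRUSTED input + CRITICAL sink -> BLOCK.
    # A single deterministic check -- no classifier, no heuristic,
    # no secondary model judging intent.
    if provenance is Provenance.UNTRUSTED and sink is Criticality.CRITICAL:
        raise BlockedError("untrusted input reaching critical sink")
```

The point of the sketch: the decision depends only on two labels carried at runtime, so the same inputs always produce the same allow/block outcome.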

Changes since the previous Show HN:

- Verified execution contracts — per-call invocation hash binding with replay protection
- Ed25519-only enterprise hardening mode with signed policy bundles
- Portable offline witness verifier for verifying decision chains without the runtime
- CI-enforced governance gate — security validation required on every merge to main
- First-party adapters for MCP, LangChain, OpenAI Agents SDK, AutoGen, CrewAI, OpenClaw
- PyPI release: pip install mvar-security

Reproducible validation:

  git clone https://github.com/mvar-security/mvar
  cd mvar
  bash scripts/repro-validation-pack.sh

This runs the red-team gate, a 50-vector prompt-injection attack corpus across 9 categories, and the full validation suite.

Each run emits a timestamped artifact pack with SHA-256 checksums and an independently verifiable witness chain.
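An independently verifiable witness chain is typically a hash chain: each record commits to the hash of its predecessor, so tampering with any decision invalidates every later link. A minimal sketch of that idea (the `chain_hash` / `verify_chain` helpers are hypothetical, not MVAR's verifier):

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def chain_hash(prev_hash: str, record: dict) -> str:
    # Commit to both the previous link and a canonical encoding of
    # this record, so any edit changes all downstream hashes.
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def verify_chain(records: list, hashes: list, genesis: str = GENESIS) -> bool:
    # Recompute the chain offline and compare against the recorded hashes.
    prev = genesis
    for record, expected in zip(records, hashes):
        prev = chain_hash(prev, record)
        if prev != expected:
            return False
    return True
```

Because verification only needs the records and the hashes, a standalone verifier can check a decision chain without running the enforcement runtime, which matches the "portable offline witness verifier" in the changelog above.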

Expected result:

  LAUNCH GATE: ALL SYSTEMS GO
  50/50 attack vectors blocked

Minimal integration:

  pip install mvar-security

  from mvar import protect
  safe_tool = protect(my_tool)

This wraps any callable tool in an enforcement boundary and works with most agent runtimes.
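For intuition, a `protect`-style wrapper can be sketched as a decorator that checks provenance before the wrapped tool runs. This is a toy model under assumed conventions (a `provenance` keyword and a string sink label), not the `mvar` package's real API:

```python
import functools

# Hypothetical set of sink labels treated as critical.
CRITICAL_SINKS = {"shell", "filesystem", "credentials"}


def protect(tool, sink: str = "shell"):
    """Wrap a callable tool in a deterministic enforcement boundary."""
    @functools.wraps(tool)
    def wrapper(*args, provenance: str = "untrusted", **kwargs):
        # Default-deny: unless the caller attests trusted provenance,
        # calls into a critical sink are blocked before execution.
        if sink in CRITICAL_SINKS and provenance == "untrusted":
            raise PermissionError(f"blocked: untrusted input -> {sink}")
        return tool(*args, **kwargs)
    return wrapper
```

The design choice worth noting: the check sits in front of the call itself, so it holds regardless of which prompts, models, or agent frameworks produced the arguments.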

Curious to hear from anyone building agent runtimes or MCP tool chains. If you find a bypass or failure mode, we'd definitely want to see it.

https://github.com/mvar-security/mvar