Security & Trust

Threat Model

Threat model, TCB policy, credential security, supply chain, chaos testing, and compliance.
Figure: HELM Trust Boundary. Every governed call produces receipts that can be inspected, exported, and verified.
Flow: AI Client (OpenAI-compatible SDK) → HELM Proxy (base URL boundary) → Policy Engine (allow / deny / require) → Receipt (signed decision record) → Verifier (offline evidence checks)

Audience

Outcome

After this page you should know what this surface is for, which source files own the behavior, which public route or adjacent page to use next, and which validation command to run before changing the claim.

Source Truth

  • Public route: security/threat-model
  • Source document: helm-ai-enterprise/docs/public/security-and-trust/threat-model.md
  • Public manifest: helm-ai-enterprise/docs/public-docs.manifest.json
  • Source inventory: helm-ai-enterprise/docs/source-inventory.manifest.json
  • Validation: corepack pnpm run docs:coverage, corepack pnpm run docs:truth, and npm run coverage:inventory from docs-platform

Do not expand this page with unsupported product, SDK, deployment, compliance, or integration claims unless the inventory manifest points to code, schemas, tests, examples, or an owner doc that proves the claim.

Troubleshooting

| Symptom | First check |
| --- | --- |
| A link or route is missing from the docs website | Check docs/public-docs.manifest.json, llms.txt, search, and the per-page Markdown export before changing navigation. |
| A claim is not backed by code or tests | Remove the claim or add the missing code, example, schema, or validation command before publishing. |

Trust Boundaries

┌─────────────────────────────────────────────────────┐
│                    UNTRUSTED                        │
│  LLM Provider · User Prompts · Connector Outputs    │
└───────────────────────┬─────────────────────────────┘
                        │
                  ┌─────▼─────┐
                  │ HELM      │  ← PEP boundary (schema + hash)
                  │ Kernel    │  ← Guardian (policy engine)
                  │           │  ← SafeExecutor (signed receipts)
                  └─────┬─────┘
                        │
┌───────────────────────▼─────────────────────────────┐
│                    TRUSTED                          │
│  Signed Receipt Store · ProofGraph DAG · Trust Reg  │
└─────────────────────────────────────────────────────┘

Threat Categories

T1: Unauthorized Tool Execution

Attack: Model generates a tool call not sanctioned by the current policy.

Defense: Guardian policy engine maintains an explicit allowlist. Undeclared tools are blocked before reaching the executor. Default-deny.

Residual risk: Runtime coverage must prove every advertised tool transport is mediated before dispatch. Launchpad MCP mediation tests cover stdio in the kernel CLI, HTTP JSON-RPC, /mcp/v1/execute, generated client configs, MCPB packaging, and unsupported WebSocket fail-closed behavior.
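
The default-deny behavior described in the defense above can be illustrated with a minimal sketch. This is not the Guardian implementation: `ToolPolicy`, `Decision`, and the tool names are hypothetical, chosen only to show that a tool absent from the allowlist is rejected before any dispatch happens.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


@dataclass
class ToolPolicy:
    # Hypothetical policy shape: only explicitly declared tools may run.
    allowed_tools: frozenset = field(default_factory=frozenset)

    def evaluate(self, tool_name: str) -> Decision:
        if tool_name in self.allowed_tools:
            return Decision(True, "tool is on the allowlist")
        # Default-deny: anything not declared is blocked before dispatch.
        return Decision(False, f"tool {tool_name!r} is not declared")


# Illustrative tool names, not taken from any HELM policy pack.
policy = ToolPolicy(allowed_tools=frozenset({"search_docs", "read_file"}))
```

An empty policy denies everything, which is the property that matters: forgetting to declare a tool fails safe rather than open.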

T2: Argument Tampering

Attack: Malicious input crafts tool arguments that bypass validation or alter semantics.

Defense:

  1. Schema validation against pinned JSON Schema (fail-closed)
  2. JCS canonicalization (RFC 8785) eliminates encoding ambiguity
  3. SHA-256 hash of canonical args (ArgsHash) bound into signed receipt

Residual risk: Schema must be correct. HELM enforces the schema, not its semantic correctness.
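
To make the canonicalize-then-hash step concrete, here is a rough sketch under stated assumptions. It approximates JCS with the standard `json` module (sorted keys, no insignificant whitespace, UTF-8). Full RFC 8785 additionally pins number serialization and sorts keys by UTF-16 code units, which this sketch does not implement, so it should not be treated as a conforming canonicalizer. The function names `canonicalize` and `args_hash` are illustrative.

```python
import hashlib
import json


def canonicalize(args: dict) -> bytes:
    # Approximation of JCS: sorted keys, compact separators, UTF-8 output.
    # Assumption: inputs use only simple values where this matches RFC 8785.
    text = json.dumps(args, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def args_hash(args: dict) -> str:
    # SHA-256 over the canonical bytes; this digest is what a receipt would bind.
    return hashlib.sha256(canonicalize(args)).hexdigest()
```

Two argument objects that differ only in key order or whitespace produce the same digest, while any change to a value produces a different one.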

T3: Output Spoofing

Attack: Malicious connector returns data that doesn't match the declared output schema.

Defense: Output validation against pinned schema. Contract drift produces ERR_CONNECTOR_CONTRACT_DRIFT and halts execution.

Residual risk: Connector could return semantically wrong but schema-valid data.
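
The fail-closed output check might look roughly like the sketch below. The contract format (field name mapped to a Python type) and the `validate_output` helper are assumptions made for illustration; a real deployment would validate against the pinned JSON Schema. Only the error code `ERR_CONNECTOR_CONTRACT_DRIFT` comes from this page.

```python
class ContractDriftError(Exception):
    code = "ERR_CONNECTOR_CONTRACT_DRIFT"


# Hypothetical pinned output contract: field name -> required Python type.
PINNED_OUTPUT = {"id": str, "count": int}


def validate_output(output: dict, contract: dict = PINNED_OUTPUT) -> dict:
    # Fail closed on missing or unexpected fields.
    if set(output) != set(contract):
        raise ContractDriftError(
            f"fields {sorted(output)} do not match contract {sorted(contract)}"
        )
    for name, expected in contract.items():
        value = output[name]
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            raise ContractDriftError(f"field {name!r} is not {expected.__name__}")
    return output
```

As the residual risk notes, a check like this only catches shape drift: `{"id": "x", "count": 3}` passes whether or not `3` is the true count.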

T4: Resource Exhaustion (WASI)

Attack: Uploaded WASM module consumes unbounded CPU, memory, or time.

Defense:

  • Gas metering: hard budget per invocation
  • Wall-clock timeout: configurable per-tool
  • Memory cap: WASM linear memory bounded
  • Deterministic trap codes on budget exhaustion

Residual risk: None for compute resources. Side-channels at the host OS level are out of scope.
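
Gas metering with a deterministic trap can be sketched in a few lines. This is a toy model, not the WASI runtime: instruction costs are passed in as plain integers, and `GasMeter`, `GasExhausted`, and the trap code string are hypothetical names.

```python
class GasExhausted(Exception):
    # Hypothetical trap code; the real codes are not listed on this page.
    trap_code = "TRAP_GAS_EXHAUSTED"


class GasMeter:
    def __init__(self, budget: int):
        self.remaining = budget

    def charge(self, cost: int) -> None:
        # Trap before the instruction runs if it would exceed the budget.
        if cost > self.remaining:
            self.remaining = 0
            raise GasExhausted(f"needed {cost} gas units")
        self.remaining -= cost


def run(instruction_costs, budget):
    """Execute a list of instruction costs under a hard gas budget."""
    meter = GasMeter(budget)
    executed = 0
    for cost in instruction_costs:
        meter.charge(cost)
        executed += 1
    return executed, meter.remaining
```

Because the trap depends only on the instruction sequence and the budget, the same module with the same budget always stops at the same point, which is what makes the trap code reproducible.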

T5: Receipt Forgery

Attack: Attacker creates fake receipts to claim executions that didn't happen.

Defense: Ed25519 signatures on canonical payloads. Verification requires the signer's public key.

Residual risk: Key compromise. Mitigated by Trust Registry key rotation.

T6: Replay Attacks

Attack: Attacker replays a valid receipt to re-execute an effect.

Defense:

  • Lamport clock monotonicity per session
  • Causal PrevHash chain (each receipt signs over previous receipt's signature)
  • Idempotency cache in executor

Residual risk: None within a single session. Cross-session replay mitigated by session scoping.
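
The interaction between the Lamport clock and the PrevHash chain can be shown with a small sketch. One deliberate simplification: the page says each receipt signs over the previous receipt's signature, whereas this sketch links receipts with a SHA-256 digest of the previous receipt, because the Python standard library has no Ed25519. The receipt fields and helper names are illustrative.

```python
import hashlib
import json


def digest(receipt: dict) -> str:
    body = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def append(chain: list, effect: str) -> dict:
    prev = chain[-1] if chain else None
    receipt = {
        "lamport": prev["lamport"] + 1 if prev else 1,
        "prev_hash": digest(prev) if prev else "",
        "effect": effect,
    }
    chain.append(receipt)
    return receipt


def verify(chain: list) -> bool:
    prev = None
    for receipt in chain:
        # Lamport clock must strictly increase within the session.
        if prev is not None and receipt["lamport"] <= prev["lamport"]:
            return False
        # Each receipt must commit to the exact receipt before it.
        expected_prev = digest(prev) if prev is not None else ""
        if receipt["prev_hash"] != expected_prev:
            return False
        prev = receipt
    return True
```

A replayed receipt fails both checks at once: its clock value is no longer ahead of the chain head, and its `prev_hash` points at an older receipt.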

T7: Approval Bypass

Attack: Model or operator bypasses human approval for high-risk operations.

Defense:

  • Timelock: approval window must elapse before execution
  • Deliberate confirmation: approver must produce a hash derived from the original intent
  • Domain separation: approval keys are distinct from execution keys
  • Challenge/response ceremony for disputes

Residual risk: Social engineering of the human approver is out of scope.
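
The timelock and deliberate-confirmation checks combine naturally into one gate. The sketch below is an assumption-laden illustration: `intent_hash` and `may_execute` are hypothetical helpers, times are plain numbers in seconds, and key separation and the challenge/response ceremony are out of its scope.

```python
import hashlib
import hmac
import json


def intent_hash(intent: dict) -> str:
    # Hash derived from the original intent; the approver must reproduce it.
    body = json.dumps(intent, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def may_execute(intent, confirmation, requested_at, now, timelock_seconds):
    # Timelock: the approval window must have fully elapsed.
    if now - requested_at < timelock_seconds:
        return False
    # Deliberate confirmation: constant-time comparison against the intent hash.
    return hmac.compare_digest(confirmation, intent_hash(intent))
```

Binding the confirmation to the intent means an approval given for one operation cannot be reused for a modified one, since changing any field changes the hash.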

T8: Trust Registry Manipulation

Attack: Attacker adds a rogue key or revokes a legitimate one.

Defense: Event-sourced trust registry. Every key lifecycle event (add/revoke/rotate) is a signed, immutable event with Lamport ordering. Registry state is replayable from genesis.

Residual risk: Compromise of the registry admin key. Mitigated by ceremony-based key management.
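
Replaying registry state from genesis can be sketched as a fold over ordered events. The event fields (`type`, `key_id`, `new_key_id`, `lamport`) are hypothetical, and signature verification of each event is deliberately omitted here; a real replay would verify every event before applying it.

```python
def replay(events):
    """Rebuild the set of active key IDs from an ordered event log."""
    active = set()
    last_clock = 0
    for event in events:
        # Reject logs whose Lamport ordering is broken or duplicated.
        if event["lamport"] <= last_clock:
            raise ValueError("events are not in strict Lamport order")
        last_clock = event["lamport"]
        kind = event["type"]
        if kind == "add":
            active.add(event["key_id"])
        elif kind == "revoke":
            active.discard(event["key_id"])
        elif kind == "rotate":
            # Rotation retires the old key and activates its successor.
            active.discard(event["key_id"])
            active.add(event["new_key_id"])
        else:
            # Fail closed on unknown lifecycle events.
            raise ValueError(f"unknown event type {kind!r}")
    return active
```

Because the state is derived only from the log, two parties holding the same events reach the same key set, and a rogue key can only appear through an event that is itself visible in the log.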

T9: Proxy Sidecar Attacks

Attack vectors:

  1. MITM between client and proxy: Attacker intercepts traffic between the app and the local HELM proxy, injecting tool calls or modifying responses.

  2. Budget bypass: Attacker circumvents budget enforcement by directly hitting the upstream API, bypassing the proxy entirely.

  3. Receipt store tampering: Attacker modifies the JSONL receipt store on disk to cover traces or inject fake receipts.

  4. Session fixation: Attacker reuses a session-scoped Lamport counter to replay receipts from a previous session.

  5. SSE stream poisoning: In streaming mode, attacker injects partial tool_call fragments into the SSE stream to trigger unintended executions.

Defense:

  1. Proxy binds to localhost only; TLS is recommended for remote deployments.
  2. Budget enforcement is advisory in OSS sidecar mode. For hard enforcement, use --island-mode or deploy as a network gateway.
  3. Receipts are Ed25519-signed. Tampered receipts fail helm-ai-kernel pack verify. ProofGraph DAG nodes have causal chain integrity (prevHash linking).
  4. Session-scoped Lamport clocks with atomic increments. Cross-session replay detected by helm-ai-kernel replay --verify.
  5. Streaming responses are buffered and validated before governance checks. Partial tool_calls are held until the complete SSE stream is received.

Residual risk:

  • Local attacker with filesystem access can bypass the sidecar. This is inherent to sidecar architectures and mitigated by island mode for high-security environments.
  • SSE streaming governance is eventual (validated after full buffering), not inline.
  • Launchpad CONNECT proxy receipts prove destination allowlisting only. Payload contents remain encrypted and opaque without a token broker or model-gateway inspection path.
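
The buffering behavior described for SSE streams (hold partial tool_calls until the stream completes) might be sketched as follows. The chunk shape is a simplification invented for this example: each chunk is an `(index, name, arguments_fragment)` tuple, and the string `"[DONE]"` marks the end of the stream. It is not the proxy's actual parser.

```python
import json

DONE = "[DONE]"


def buffer_tool_calls(chunks):
    """Accumulate tool_call fragments; release them only after the stream ends."""
    pending = {}
    complete = False
    for chunk in chunks:
        if chunk == DONE:
            complete = True
            break
        index, name, fragment = chunk
        slot = pending.setdefault(index, {"name": None, "arguments": ""})
        if name:
            slot["name"] = name
        slot["arguments"] += fragment
    if not complete:
        # Stream ended without the terminator: hold everything back.
        return []
    calls = []
    for index in sorted(pending):
        slot = pending[index]
        # json.loads raises on malformed arguments, so bad fragments fail closed.
        calls.append({"name": slot["name"], "arguments": json.loads(slot["arguments"])})
    return calls
```

Only the fully assembled calls returned here would then be handed to policy evaluation, which is why governance in this mode is eventual rather than inline.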

T10: Launchpad Container Isolation Overclaim

Attack: A hostile local app or agent abuses Docker daemon, kernel, namespace, or filesystem boundaries while the product claim implies stronger isolation than the selected substrate provides.

Defense: Launchpad records an explicit isolation tier in substrate specs, policy packs, runtime evidence, and start receipts. docker-default is a baseline developer substrate. Hardened claims require Docker rootless/userns, Docker ECI, gVisor, Kata/Firecracker, or dedicated VM evidence.

Residual risk: Baseline Docker remains inappropriate for hostile-code claims. Unsupported or unconfigured hardened modes fail closed before launch.
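
A fail-closed tier check could look like the sketch below. Only `docker-default` is an identifier taken from this page; the hardened tier names, the `evidence` set, and `check_launch` are hypothetical stand-ins for whatever the substrate specs actually record.

```python
BASELINE_TIERS = {"docker-default"}
# Hypothetical identifiers for the hardened substrates named above.
HARDENED_TIERS = {
    "docker-rootless",
    "docker-eci",
    "gvisor",
    "kata-firecracker",
    "dedicated-vm",
}


def check_launch(tier: str, evidence: set, hostile_code: bool) -> str:
    """Return the tier to record in the start receipt, or refuse to launch."""
    if tier in BASELINE_TIERS:
        if hostile_code:
            # Baseline Docker must not back a hostile-code claim.
            raise PermissionError("baseline tier cannot run hostile code")
        return tier
    if tier in HARDENED_TIERS and tier in evidence:
        return tier
    # Unknown tiers, or hardened tiers without evidence, fail closed.
    raise PermissionError(f"isolation tier {tier!r} is unsupported or unproven")
```

The important property is that a hardened tier is never assumed from configuration alone: without matching evidence the launch is refused rather than downgraded silently.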

Out of Scope

  • Content safety / prompt injection within the text domain
  • Vulnerabilities in upstream LLM providers
  • Host OS / hardware side channels
  • Network-level attacks (TLS is assumed)
  • Social engineering of human approvers

Diagram

flowchart TD
    subgraph Ingestion["1. Ingestion & Context Plane"]
        source["Threat Model"]
        s0["Trust Boundaries"]
        s2["Out of Scope"]
        output["Reader outcome"]
    end

    subgraph Evaluation["2. Evaluation & Policy Plane"]
        s1["Threat Categories"]
    end

    %% Operational Flow Edges
    source --> s0
    s0 --> s1
    s1 --> s2
    s2 --> output

    %% Premium Styling Rules
    style s1 fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff