---
title: "Threat Model"
canonical: "https://helm.docs.mindburn.org/security/threat-model"
source: "helm-ai-enterprise/docs/public/security-and-trust/threat-model.md"
edit: "https://github.com/Mindburn-Labs/helm-ai-enterprise/edit/main/docs/public/security-and-trust/threat-model.md"
section: "trust"
access: "public"
sensitivity: "public"
last_reviewed: "2026-04-30"
checksum_sha256: "sha256:90a8b1a6291d9efd28c1b5b0db725e800fb27066b4c77b05d28ccad83011b859"
build_timestamp: "2026-05-24T13:40:27.882Z"
---
# Threat Model

## Audience

## Outcome

After reading this page, you should know what this surface is for, which source files own the behavior, which public route or adjacent page to use next, and which validation command to run before changing a claim.

## Source Truth

- Public route: `security/threat-model`
- Source document: `helm-ai-enterprise/docs/public/security-and-trust/threat-model.md`
- Public manifest: `helm-ai-enterprise/docs/public-docs.manifest.json`
- Source inventory: `helm-ai-enterprise/docs/source-inventory.manifest.json`
- Validation: `corepack pnpm run docs:coverage`, `corepack pnpm run docs:truth`, and `npm run coverage:inventory` from `docs-platform`

Do not expand this page with unsupported product, SDK, deployment, compliance, or integration claims unless the inventory manifest points to code, schemas, tests, examples, or an owner doc that proves the claim.

## Troubleshooting

| Symptom | First check |
| --- | --- |
| A link or route is missing from the docs website | Check `docs/public-docs.manifest.json`, `llms.txt`, search, and the per-page Markdown export before changing navigation. |
| A claim is not backed by code or tests | Remove the claim or add the missing code, example, schema, or validation command before publishing. |

## Trust Boundaries

```
┌─────────────────────────────────────────────────────┐
│                    UNTRUSTED                        │
│  LLM Provider · User Prompts · Connector Outputs    │
└───────────────────────┬─────────────────────────────┘
                        │
                  ┌─────▼─────┐
                  │ HELM      │  ← PEP boundary (schema + hash)
                  │ Kernel    │  ← Guardian (policy engine)
                  │           │  ← SafeExecutor (signed receipts)
                  └─────┬─────┘
                        │
┌───────────────────────▼─────────────────────────────┐
│                    TRUSTED                          │
│  Signed Receipt Store · ProofGraph DAG · Trust Reg  │
└─────────────────────────────────────────────────────┘
```

## Threat Categories

### T1: Unauthorized Tool Execution

**Attack:** Model generates a tool call not sanctioned by the current policy.

**Defense:** The Guardian policy engine maintains an explicit allowlist and is default-deny: undeclared tools are blocked before they reach the executor.

**Residual risk:** Runtime coverage must prove every advertised tool transport
is mediated before dispatch. Launchpad MCP mediation tests cover stdio in the
kernel CLI, HTTP JSON-RPC, `/mcp/v1/execute`, generated client configs, MCPB
packaging, and unsupported WebSocket fail-closed behavior.
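
The default-deny rule described for T1 can be illustrated with a minimal sketch. The tool names, the `Decision` type, and the `evaluate` function are hypothetical and are not Guardian's API; the only behavior taken from this page is that a tool absent from the allowlist is denied before dispatch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# Hypothetical tool names, for illustration only.
ALLOWLIST = frozenset({"search_docs", "read_file"})


def evaluate(tool_name: str, allowlist: frozenset = ALLOWLIST) -> Decision:
    """Default-deny: only tools explicitly declared in the allowlist pass."""
    if tool_name in allowlist:
        return Decision(True, "declared in policy allowlist")
    # Anything not declared falls through to a denial; there is no implicit allow.
    return Decision(False, "tool not declared; denied by default")
```

In this sketch an empty allowlist denies everything, which is the property a default-deny policy is meant to preserve.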

### T2: Argument Tampering

**Attack:** Malicious input crafts tool arguments that bypass validation or alter semantics.

**Defense:**
1. Schema validation against pinned JSON Schema (fail-closed)
2. JCS canonicalization (RFC 8785) eliminates encoding ambiguity
3. SHA-256 hash of canonical args (`ArgsHash`) bound into signed receipt

**Residual risk:** Schema must be correct. HELM enforces the schema, not its semantic correctness.
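
A rough sketch of the canonicalize-then-hash steps for T2 follows. It assumes a stdlib-only approximation of JCS: `json.dumps` with sorted keys and compact separators removes key-order and whitespace ambiguity, but it is not a complete RFC 8785 implementation. The function names are hypothetical, and the exact encoding of `ArgsHash` in real receipts is not specified on this page.

```python
import hashlib
import json


def canonicalize(args: dict) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8 output.

    Assumption: full RFC 8785 also pins number formatting and sorts keys by
    UTF-16 code units, which this stdlib-only stand-in does not guarantee.
    """
    text = json.dumps(
        args,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,  # NaN and Infinity are not valid JSON; reject them
    )
    return text.encode("utf-8")


def args_hash(args: dict) -> str:
    """SHA-256 hex digest of the canonical argument bytes."""
    return hashlib.sha256(canonicalize(args)).hexdigest()
```

Because the hash is taken over the canonical form, two argument objects that differ only in key order produce the same digest, while any change to a value produces a different one.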

### T3: Output Spoofing

**Attack:** Malicious connector returns data that doesn't match the declared output schema.

**Defense:** Output validation against pinned schema. Contract drift produces `ERR_CONNECTOR_CONTRACT_DRIFT` and halts execution.

**Residual risk:** Connector could return semantically wrong but schema-valid data.
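
The fail-closed output check for T3 might look like the sketch below. The schema shape (field name mapped to a Python type) and the `validate_output` helper are assumptions made for illustration; only the error code `ERR_CONNECTOR_CONTRACT_DRIFT` comes from this page, and a real deployment would validate against a pinned JSON Schema.

```python
class ContractDriftError(Exception):
    """Raised when connector output does not match the pinned schema."""

    code = "ERR_CONNECTOR_CONTRACT_DRIFT"


# Hypothetical pinned output schema: field name -> exact Python type.
PINNED_OUTPUT_SCHEMA = {"status": str, "count": int}


def validate_output(output: dict, schema: dict = PINNED_OUTPUT_SCHEMA) -> dict:
    """Return the output unchanged if it conforms; otherwise halt with an error."""
    if set(output) != set(schema):
        raise ContractDriftError(f"fields {sorted(output)} != {sorted(schema)}")
    for field, expected in schema.items():
        # Exact type match, so a bool is not accepted where an int is pinned.
        if type(output[field]) is not expected:
            raise ContractDriftError(f"{field}: expected {expected.__name__}")
    return output
```

The residual risk is visible here: `{"status": "ok", "count": -999}` passes validation even if the count is wrong, because the check covers shape and type, not meaning.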

### T4: Resource Exhaustion (WASI)

**Attack:** Uploaded WASM module consumes unbounded CPU, memory, or time.

**Defense:**
- Gas metering: hard budget per invocation
- Wall-clock timeout: configurable per-tool
- Memory cap: WASM linear memory bounded
- Deterministic trap codes on budget exhaustion

**Residual risk:** None for compute resources. Side-channels at the host OS level are out of scope.
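
To make the gas-metering idea in T4 concrete, here is a toy interpreter that charges a fixed cost before each operation and traps when the budget is exhausted. The op set, the costs, and the trap code string are all hypothetical; a real WASM runtime meters at the instruction level, and this sketch does not model the wall-clock timeout or the memory cap.

```python
class OutOfGas(Exception):
    """Deterministic trap raised when the gas budget cannot cover the next op."""

    trap_code = "TRAP_OUT_OF_GAS"  # hypothetical trap code


# Hypothetical per-operation gas costs.
COSTS = {"add": 1, "mul": 3}


def run_metered(ops: list, gas_limit: int) -> int:
    """Run (name, operand) ops against an accumulator under a hard gas budget."""
    gas = gas_limit
    acc = 0
    for name, operand in ops:
        cost = COSTS[name]  # unknown ops raise KeyError rather than run for free
        if cost > gas:
            raise OutOfGas(f"need {cost} gas for {name!r}, have {gas}")
        gas -= cost  # charge before executing, so no op runs unpaid
        if name == "add":
            acc += operand
        else:  # "mul"
            acc *= operand
    return acc
```

For the same program and the same budget the trap occurs at the same operation every time, which is what makes the failure deterministic.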

### T5: Receipt Forgery

**Attack:** Attacker creates fake receipts to claim executions that didn't happen.

**Defense:** Ed25519 signatures on canonical payloads. Verification requires the signer's public key.

**Residual risk:** Key compromise. Mitigated by Trust Registry key rotation.

### T6: Replay Attacks

**Attack:** Attacker replays a valid receipt to re-execute an effect.

**Defense:**
- Lamport clock monotonicity per session
- Causal `PrevHash` chain (each receipt signs over the previous receipt's signature)
- Idempotency cache in executor

**Residual risk:** None within a single session. Cross-session replay is mitigated by session scoping.
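
The three replay defenses in T6 can be sketched together. This is a simplified model under stated assumptions: receipts are plain dictionaries, the chain links each receipt to a SHA-256 digest of the previous receipt rather than to its signature, and signing is omitted entirely. The field and function names are hypothetical.

```python
import hashlib
import json


def digest(receipt: dict) -> str:
    """SHA-256 over a deterministic JSON encoding of the receipt."""
    body = json.dumps(receipt, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_receipt(chain: list, lamport: int, effect: str) -> None:
    """Append a receipt linked to the digest of its predecessor."""
    prev = digest(chain[-1]) if chain else ""
    chain.append({"lamport": lamport, "prev_hash": prev, "effect": effect})


def verify_chain(chain: list) -> bool:
    """Check Lamport monotonicity and prev-hash linkage from the first receipt."""
    prev_hash, prev_clock = "", 0
    for receipt in chain:
        if receipt["lamport"] <= prev_clock:
            return False  # clock did not advance: replayed or reordered
        if receipt["prev_hash"] != prev_hash:
            return False  # link broken: inserted or modified receipt
        prev_hash, prev_clock = digest(receipt), receipt["lamport"]
    return True


class IdempotentExecutor:
    """Runs the effect for a given key at most once and caches the result."""

    def __init__(self):
        self._results = {}

    def execute(self, key: str, effect):
        if key not in self._results:
            self._results[key] = effect()
        return self._results[key]
```

Appending a copy of an earlier receipt to a valid chain fails the monotonicity check, and re-submitting the same idempotency key returns the cached result without running the effect again.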

### T7: Approval Bypass

**Attack:** Model or operator bypasses human approval for high-risk operations.

**Defense:**
- Timelock: approval window must elapse before execution
- Deliberate confirmation: approver must produce a hash derived from the original intent
- Domain separation: approval keys are distinct from execution keys
- Challenge/response ceremony for disputes

**Residual risk:** Social engineering of the human approver is out of scope.
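
The timelock and deliberate-confirmation checks in T7 could be combined as in the sketch below. It assumes the confirmation value is a plain SHA-256 hex digest of the serialized intent and that times are given in seconds; both are illustrative choices, not the documented derivation. Key separation and the challenge/response ceremony are not modeled.

```python
import hashlib
import hmac


def confirmation_hash(intent: bytes) -> str:
    """Hash the approver must reproduce; derived only from the original intent."""
    return hashlib.sha256(intent).hexdigest()


def may_execute(
    intent: bytes,
    presented: str,
    requested_at: float,
    now: float,
    timelock_seconds: float,
) -> bool:
    """Allow execution only after the timelock elapses and the hash matches."""
    if now - requested_at < timelock_seconds:
        return False  # approval window has not elapsed yet
    expected = confirmation_hash(intent)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Because the confirmation is bound to the intent bytes, an approval produced for one request does not validate a modified request.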

### T8: Trust Registry Manipulation

**Attack:** Attacker adds a rogue key or revokes a legitimate one.

**Defense:** Event-sourced trust registry. Every key lifecycle event (add/revoke/rotate) is a signed, immutable event with Lamport ordering. Registry state is replayable from genesis.

**Residual risk:** Compromise of the registry admin key. Mitigated by ceremony-based key management.
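
Replaying registry state from genesis, as described for T8, can be sketched as a fold over an ordered event log. The event shape (`lamport`, `kind`, `key_id`, `new_key_id`) is hypothetical, and signature verification of each event is deliberately left out to keep the example stdlib-only.

```python
def replay(events: list) -> set:
    """Rebuild the set of active key IDs by applying events in Lamport order."""
    active = set()
    last_clock = 0
    for event in events:
        if event["lamport"] <= last_clock:
            raise ValueError("events must have strictly increasing Lamport clocks")
        last_clock = event["lamport"]
        kind = event["kind"]
        if kind == "add":
            active.add(event["key_id"])
        elif kind == "revoke":
            active.discard(event["key_id"])
        elif kind == "rotate":
            # Rotation retires the old key and activates its replacement.
            active.discard(event["key_id"])
            active.add(event["new_key_id"])
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return active
```

Since state is derived only from the log, replaying the same events always yields the same set of active keys, and replaying a prefix yields the registry as it stood at that point.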

### T9: Proxy Sidecar Attacks

**Attack vectors:**

1. **MITM between client and proxy:** Attacker intercepts traffic between the app and the local HELM proxy, injecting tool calls or modifying responses.

2. **Budget bypass:** Attacker circumvents budget enforcement by directly hitting the upstream API, bypassing the proxy entirely.

3. **Receipt store tampering:** Attacker modifies the JSONL receipt store on disk to cover traces or inject fake receipts.

4. **Session fixation:** Attacker reuses a session-scoped Lamport counter to replay receipts from a previous session.

5. **SSE stream poisoning:** In streaming mode, attacker injects partial tool_call fragments into the SSE stream to trigger unintended executions.

**Defense:**
1. Proxy binds to localhost only; TLS is recommended for remote deployments.
2. Budget enforcement is advisory in OSS sidecar mode. For hard enforcement, use `--island-mode` or deploy as a network gateway.
3. Receipts are Ed25519-signed. Tampered receipts fail `helm-ai-kernel pack verify`. ProofGraph DAG nodes have causal chain integrity (prevHash linking).
4. Session-scoped Lamport clocks with atomic increments. Cross-session replay detected by `helm-ai-kernel replay --verify`.
5. Streaming responses are buffered and validated before governance checks. Partial `tool_call` fragments are held until the complete SSE stream is received.

**Residual risk:**
- Local attacker with filesystem access can bypass the sidecar. This is inherent to sidecar architectures and mitigated by island mode for high-security environments.
- SSE streaming governance is deferred rather than inline: tool calls are validated only after the full stream has been buffered.
- Launchpad CONNECT proxy receipts prove destination allowlisting only. Payload
  contents remain encrypted and opaque without a token broker or model-gateway
  inspection path.
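
The buffering behavior described in defense 5 can be modeled with a small holding buffer. The `ToolCallBuffer` class and the fragment format (pieces of one JSON object) are assumptions for illustration; the proxy's actual SSE parsing is not shown on this page.

```python
import json


class ToolCallBuffer:
    """Holds tool-call fragments and releases nothing until the stream closes."""

    def __init__(self):
        self._fragments = []
        self._complete = False

    def feed(self, fragment: str) -> None:
        """Store one streamed fragment without interpreting it."""
        self._fragments.append(fragment)

    def close(self) -> None:
        """Mark the stream as fully received."""
        self._complete = True

    def release(self):
        """Return the parsed tool call, or None while the stream is still open."""
        if not self._complete:
            return None
        # json.loads raises ValueError on malformed input, so a poisoned or
        # truncated stream never yields a tool call.
        return json.loads("".join(self._fragments))
```

Governance checks would run only on the value returned by `release()`, which matches the deferred (not inline) validation noted in the residual risks.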

### T10: Launchpad Container Isolation Overclaim

**Attack:** A hostile local app or agent abuses Docker daemon, kernel, namespace,
or filesystem boundaries while the product claim implies stronger isolation than
the selected substrate provides.

**Defense:** Launchpad records an explicit isolation tier in substrate specs,
policy packs, runtime evidence, and start receipts. `docker-default` is a
baseline developer substrate. Hardened claims require Docker rootless/userns,
Docker ECI, gVisor, Kata/Firecracker, or dedicated VM evidence.

**Residual risk:** Baseline Docker remains inappropriate for hostile-code
claims. Unsupported or unconfigured hardened modes fail closed before launch.
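
The fail-closed tier rule for T10 can be sketched as a small decision function. Only `docker-default` is an identifier taken from this page; the hardened tier names and the `evaluate_tier` function are hypothetical placeholders for whatever the substrate specs actually use.

```python
# Hypothetical identifiers for hardened tiers; only "docker-default" is documented.
HARDENED_TIERS = frozenset({"docker-rootless", "gvisor", "kata", "dedicated-vm"})


def evaluate_tier(tier: str, evidence_present: bool) -> tuple:
    """Return (may_launch, may_claim_hardened) for the selected isolation tier."""
    if tier == "docker-default":
        # Baseline developer substrate: launches, but supports no hardened claim.
        return (True, False)
    if tier in HARDENED_TIERS and evidence_present:
        return (True, True)
    # Unknown tier, or a hardened tier without evidence: fail closed before launch.
    return (False, False)
```

The final branch is the important one: anything that is neither the documented baseline nor a hardened tier with evidence is refused rather than silently downgraded.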

## Out of Scope

- Content safety / prompt injection within the text domain
- Vulnerabilities in upstream LLM providers
- Host OS / hardware side channels
- Network-level attacks (TLS is assumed)
- Social engineering of human approvers

## Diagram

```mermaid
flowchart TD
    subgraph Ingestion["1. Ingestion & Context Plane"]
        source["Threat Model"]
        s0["Trust Boundaries"]
        s2["Out of Scope"]
        output["Reader outcome"]
    end

    subgraph Evaluation["2. Evaluation & Policy Plane"]
        s1["Threat Categories"]
    end

    %% Operational Flow Edges
    source --> s0
    s0 --> s1
    s1 --> s2
    s2 --> output

    %% Premium Styling Rules
    style s1 fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
```