HELM AI Kernel

Prompt Injection Watchlist - April 2026

Open-source execution kernel, CLI, MCP, conformance, verification, and compatibility.

PublicSource-ownedMarkdown export

Audience

Security maintainers tracking public prompt-injection research that affects HELM AI Kernel execution-boundary design.

Outcome

After this page you should know what this surface is for, which source files own the behavior, which public route or adjacent page to use next, and which validation command to run before changing the claim.

Source Truth

Public route: helm-ai-kernel/security/prompt-injection-watchlist-2026-04
Source document: helm-ai-kernel/docs/security/prompt-injection-watchlist-2026-04.md
Public manifest: helm-ai-kernel/docs/public-docs.manifest.json
Source inventory: helm-ai-kernel/docs/source-inventory.manifest.json
Validation: make docs-coverage, make docs-truth, and npm run coverage:inventory from docs-platform

Do not expand this page with unsupported product, SDK, deployment, compliance, or integration claims unless the inventory manifest points to code, schemas, tests, examples, or an owner doc that proves the claim.

Troubleshooting

Symptom	First check
Published output is stale or incomplete	Run `npm run helm-public:accuracy` in `docs-platform`, then check the source path and public manifest row for this page.
A claim needs implementation backing	Check the Source Truth files above and update the implementation, manifest, source inventory, or page in the same change.

Diagram

This scheme maps the main sections of Prompt Injection Watchlist - April 2026 in reading order.

Diagram1. Ingestion & Context Plane -> Prompt Injection Watchlist - April 2026 -> Source Verification -> HELM Mapping -> No-Go Criteria

flowchart TD
    subgraph Ingestion["1. Ingestion & Context Plane"]
        Page["Prompt Injection Watchlist - April 2026"]
        A["Source Verification"]
        B["HELM Mapping"]
        C["No-Go Criteria"]
    end

    %% Operational Flow Edges
    Page --> A
    A --> B
    B --> C

    %% Premium Styling Rules

Mermaid source

flowchart TD
    subgraph Ingestion["1. Ingestion & Context Plane"]
        Page["Prompt Injection Watchlist - April 2026"]
        A["Source Verification"]
        B["HELM Mapping"]
        C["No-Go Criteria"]
    end

    %% Operational Flow Edges
    Page --> A
    A --> B
    B --> C

    %% Premium Styling Rules

This note records the source verification and implementation decision for the April 2026 HOSS radar items on upstream prompt-injection defenses. It is not a production commitment; HELM remains the deterministic downstream execution boundary.

Source Verification

Linear	Source	Verification	Decision
`MIN-237`	AgentWatcher: A Rule-based Prompt Injection Monitor	arXiv record exists, submitted April 1, 2026; title and authors match the radar text	Keep as watchlist/prototype material
`MIN-238`	ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction	arXiv record exists, submitted February 24, 2026; title and authors match the radar text	Keep as watchlist/prototype material

HELM Mapping

AgentWatcher is useful to evaluate because its rule-oriented framing can provide an explainable pre-filter before requests reach Guardian. A production implementation should live behind a policy toggle and emit evidence about which rule, source segment, and confidence threshold caused a short-circuit.

ICON is an inference-time defense. It is complementary to HELM, not a replacement for HELM: the model-layer probe may reduce compromised plans before they are proposed, while HELM still governs the downstream action boundary with policy, effect, delegation, and receipt evidence.

No-Go Criteria

Do not merge either approach into the default path until:

the implementation can run deterministically or preserve a deterministic evidence envelope around model-assisted decisions;
benchmark fixtures show the false-positive impact on benign tool-use workflows;
policy authors can disable the pre-filter without weakening HELM's existing Guardian gate;
emitted evidence can be replayed or independently inspected during incident review.

Review Cadence

Keep this watchlist tied to dated examples, affected adapters, and the receipt fields that expose attempted injection.