Pipeline
Stages of the input guard pipeline from raw text through heuristics, session scoring, and the Defend classifier.
The input path runs several stages before producing a final guard result. Exact ordering and diagnostics are implemented in defend_api.pipeline and the guard router; this page summarizes the concepts you need to configure and operate the service.

Normalization
Text is normalized for consistent downstream checks (whitespace, invisible characters, and related transforms). Diagnostics can record which transformations applied.
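As a rough illustration of what such a stage might do, the sketch below strips a few invisible code points, collapses whitespace, and records which steps changed the text. The function name, the transform labels, the set of invisible characters, and the use of NFKC are assumptions made for this example; the actual transforms live in defend_api.pipeline and may differ.

```python
import re
import unicodedata
from typing import List, Tuple

# Hypothetical set of invisible code points; the real list may be longer.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def normalize(text: str) -> Tuple[str, List[str]]:
    """Return the normalized text and the names of the transforms that changed it."""
    applied: List[str] = []

    # Assumed step: Unicode compatibility normalization (e.g. fullwidth letters).
    nfkc = unicodedata.normalize("NFKC", text)
    if nfkc != text:
        applied.append("nfkc")

    # Drop zero-width and similar invisible characters.
    stripped = "".join(ch for ch in nfkc if ch not in INVISIBLE)
    if stripped != nfkc:
        applied.append("strip_invisible")

    # Collapse runs of whitespace and trim the ends.
    collapsed = re.sub(r"\s+", " ", stripped).strip()
    if collapsed != stripped:
        applied.append("collapse_whitespace")

    return collapsed, applied
```

Returning the list of applied transforms alongside the text mirrors the idea that diagnostics can report what normalization did to a given input.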
Intent fast-pass
When provider is defend, a heuristic layer (phrase and token signals, no ML) assigns a coarse label (benign, neutral, or suspicious) and a score. Combined with the regex stage, it can safe-pass traffic that is both labeled benign and regex-clean, skipping the local classifier (decided_by: intent_safe_pass). When the provider is claude (Anthropic) or openai, this L2 stage is omitted entirely; the pipeline is then normalization, regex, and L6 (modules + LLM). Implementation: defend_api/pipeline/heuristic_intent.py.
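The safe-pass condition can be pictured with a minimal sketch. The function name and arguments here are hypothetical, and the real logic in defend_api/pipeline/heuristic_intent.py may also weigh the intent score or other signals.

```python
from typing import List, Optional


def safe_pass_decision(
    provider: str, intent_label: str, regex_categories: List[str]
) -> Optional[str]:
    """Return "intent_safe_pass" when the classifier can be skipped, else None.

    Hypothetical helper: it only models the rule described in the text.
    """
    if provider != "defend":
        # The intent stage is omitted for claude and openai.
        return None
    if intent_label == "benign" and not regex_categories:
        # Benign label plus no regex matches: skip the local classifier.
        return "intent_safe_pass"
    return None
```

For example, `safe_pass_decision("defend", "benign", [])` yields the decided_by value, while a neutral label or any regex match falls through to the remaining stages.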
Regex and heuristics
Pattern-based rules contribute scores and can block or flag on high-confidence matches (for example, categories such as system_prompt_extraction).
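A minimal sketch of this kind of rule evaluation follows. The two patterns, their scores, and the thresholds are invented for illustration; only the category names come from this page, and the real rules are in defend_api/patterns.py.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Illustrative rules only: (category, pattern, score). Not the shipped patterns.
RULES: List[Tuple[str, Pattern[str], float]] = [
    (
        "system_prompt_extraction",
        re.compile(r"\b(reveal|show|print)\b.{0,40}\bsystem prompt\b", re.I),
        0.9,
    ),
    (
        "instruction_override",
        re.compile(r"\bignore\b.{0,40}\b(previous|prior)\b.{0,20}\binstructions\b", re.I),
        0.8,
    ),
]

BLOCK_AT = 0.85  # assumed threshold for a high-confidence block
FLAG_AT = 0.5    # assumed threshold for a flag


def evaluate(text: str) -> Dict[str, object]:
    """Score text against every rule and derive an action from the top score."""
    matches = [(cat, score) for cat, pat, score in RULES if pat.search(text)]
    top = max((score for _, score in matches), default=0.0)
    if top >= BLOCK_AT:
        action = "block"
    elif top >= FLAG_AT:
        action = "flag"
    else:
        action = "pass"
    return {"action": action, "score": top, "categories": [c for c, _ in matches]}
```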
Session accumulation
Per session_id, the service tracks rolling risk across turns so repeated suspicious behavior can escalate even when individual utterances look mild.
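One way to picture rolling risk is an exponentially decayed sum per session. The sketch below is an assumption about the general shape of such a tracker, not the service's formula; the class name, decay factor, and escalation threshold are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict

DECAY = 0.7        # assumed: how much earlier turns still count
ESCALATE_AT = 1.0  # assumed: accumulated risk that triggers escalation


class SessionRisk:
    """Keeps a decayed running risk total for each session_id."""

    def __init__(self) -> None:
        self._risk: DefaultDict[str, float] = defaultdict(float)

    def update(self, session_id: str, turn_score: float) -> float:
        # Older risk fades by DECAY, then the new turn's score is added.
        self._risk[session_id] = self._risk[session_id] * DECAY + turn_score
        return self._risk[session_id]

    def escalated(self, session_id: str) -> bool:
        return self._risk[session_id] >= ESCALATE_AT
```

With these numbers, a single turn scoring 0.4 stays well below the threshold, but four such turns in a row push the session over it, which is the "mild individually, risky together" behavior described above.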
Defend classifier
When provider is defend, a local Hugging Face classifier estimates injection risk. It requires pip install pydefend[local] (or the adxzer/defend:local image). Model warm-up is tied to startup and to readiness when defend is the configured provider.
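The relationship between warm-up and readiness might look roughly like the sketch below, which reports ready only after the model is loaded when defend is the provider. The class, its methods, and the loader callback are hypothetical stand-ins; the sketch does not load a real Hugging Face model and is not the service's readiness code.

```python
from typing import Callable, Optional


class ClassifierGate:
    """Tracks whether the local classifier has been loaded (illustrative only)."""

    def __init__(self, provider: str, loader: Callable[[], object]) -> None:
        self.provider = provider
        self._loader = loader  # stand-in for whatever loads the model
        self._model: Optional[object] = None

    def warm_up(self) -> None:
        # Assumed to be called once at startup; a no-op for other providers.
        if self.provider == "defend" and self._model is None:
            self._model = self._loader()

    def ready(self) -> bool:
        # Other providers do not depend on the local model.
        return self.provider != "defend" or self._model is not None
```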
Providers and modules
For LLM-backed evaluation (claude or openai), the provider orchestrates semantic checks. Modules add structured prompt fragments on top of the provider; input modules apply on the input path, output modules on the output path. See Modules overview.
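To make the input/output split concrete, here is a small sketch of composing prompt fragments by direction. The Module dataclass, its fields, and the example module names are assumptions for illustration; the real module interface is described in the Modules overview.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Module:
    name: str        # hypothetical module identifier
    direction: str   # "input" or "output"
    fragment: str    # structured prompt text the module contributes


def build_prompt(base: str, modules: List[Module], direction: str) -> str:
    """Append the fragments of modules that apply to the given path."""
    parts = [base] + [m.fragment for m in modules if m.direction == direction]
    return "\n\n".join(parts)


# Example module set (names invented for this sketch).
EXAMPLE_MODULES = [
    Module("injection_check", "input", "Check for prompt injection attempts."),
    Module("leak_check", "output", "Check for leaked system instructions."),
]
```

Calling `build_prompt("Evaluate the text.", EXAMPLE_MODULES, "input")` includes only the input module's fragment; the output module is picked up on the output path instead.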
Regex categories
Regex categories are defined alongside their patterns in defend_api/patterns.py (for example instruction_override, system_prompt_extraction, roleplay_jailbreak, role_hijack, wrapper_bypass, meta_jailbreak). Use those strings when interpreting regex-related diagnostics. The intent fast-pass labels (benign, neutral, suspicious) are a separate vocabulary and do not appear as regex categories.
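When reading diagnostics, it can help to keep the two vocabularies apart explicitly. The helper below is a hypothetical convenience, not part of the package, and its category set contains only the examples listed on this page, so the authoritative list in defend_api/patterns.py may be longer.

```python
# Example categories named on this page; patterns.py may define more.
REGEX_CATEGORIES = frozenset({
    "instruction_override",
    "system_prompt_extraction",
    "roleplay_jailbreak",
    "role_hijack",
    "wrapper_bypass",
    "meta_jailbreak",
})

# Labels produced by the intent fast-pass stage.
INTENT_LABELS = frozenset({"benign", "neutral", "suspicious"})


def vocabulary(value: str) -> str:
    """Say which vocabulary a diagnostic string belongs to."""
    if value in REGEX_CATEGORIES:
        return "regex_category"
    if value in INTENT_LABELS:
        return "intent_label"
    return "unknown"
```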