Defend

Actions and providers

This page covers pass, flag, and block semantics, how the defend provider differs from the claude and openai providers, and how output on_fail behaves.

Actions

Guard endpoints return an action field with one of:

  • pass - Allow the flow to continue under a default policy.
  • flag - Elevated risk; your application chooses follow-up (logging, review, safer retry).
  • block - Hard stop for this path: do not call the LLM on blocked input, and do not return blocked output verbatim.
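
As a rough illustration of how an application might act on the three values above, here is a minimal Python sketch. The names guarded_completion, guard_input, guard_output, and call_llm are hypothetical stand-ins, not part of Defend. The sketch assumes only that each guard call yields a mapping with an action key. Rejecting unknown action values is a design choice of this sketch.

```python
from typing import Callable, Dict

VALID_ACTIONS = ("pass", "flag", "block")


def _action(verdict: Dict) -> str:
    # Assumption: the guard response is a mapping with an "action" key.
    action = verdict["action"]
    if action not in VALID_ACTIONS:
        raise ValueError(f"unexpected action: {action!r}")
    return action


def guarded_completion(
    prompt: str,
    guard_input: Callable[[str], Dict],   # hypothetical input-guard call
    call_llm: Callable[[str], str],       # hypothetical LLM call
    guard_output: Callable[[str], Dict],  # hypothetical output-guard call
    refusal: str = "Request declined by policy.",
) -> Dict:
    in_action = _action(guard_input(prompt))
    if in_action == "block":
        # Blocked input: the LLM is never called.
        return {"text": refusal, "llm_called": False, "flagged": False}

    flagged = in_action == "flag"
    answer = call_llm(prompt)

    out_action = _action(guard_output(answer))
    if out_action == "block":
        # Blocked output: do not return the model text verbatim.
        return {"text": refusal, "llm_called": True, "flagged": flagged}

    # "flag" leaves follow-up (logging, review, safer retry) to the caller.
    flagged = flagged or out_action == "flag"
    return {"text": answer, "llm_called": True, "flagged": flagged}
```

Here a flag only sets a marker on the result. What happens next (logging, human review, a safer retry) remains an application decision.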

On the output path, if the configured LLM provider is unavailable, the service falls back to guards.output.on_fail (block or flag) instead of failing the HTTP request.
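
The fallback described above can be modelled in a few lines of Python. This is a sketch of the documented behavior, not the service's actual code. ProviderUnavailable, evaluate_output, provider_call, and the fallback key in the result are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict


class ProviderUnavailable(Exception):
    """Hypothetical error raised when the LLM provider cannot be reached."""


def evaluate_output(
    text: str,
    provider_call: Callable[[str], Dict],  # hypothetical provider-backed check
    on_fail: str = "block",                # mirrors guards.output.on_fail
) -> Dict:
    if on_fail not in ("block", "flag"):
        raise ValueError("on_fail must be 'block' or 'flag'")
    try:
        return provider_call(text)
    except ProviderUnavailable:
        # Return the configured action instead of propagating the failure.
        # The "fallback" key is an illustrative addition.
        return {"action": on_fail, "fallback": True}
```

Choosing block fails closed when the provider is down, while flag keeps traffic flowing and defers the decision to the application.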

Providers

  • defend - local classifier and pipeline-focused checks; suitable for input guarding without sending text to a third-party LLM API (subject to your deployment).
  • claude / openai - LLM-backed evaluation. Required for output guarding when modules are used, because output evaluation is implemented through the provider stack. The selected model must support tool/function calling, because Defend obtains structured verdicts through provider tools.

Set provider for the main evaluation path, and set guards.input.provider or guards.output.provider to override it for one direction, as defined in Configuration.
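
One way to picture the override lookup is the Python sketch below. It assumes the configuration has been loaded into a nested dict whose layout mirrors the dotted key names, and that an unset per-direction key falls back to the top-level provider. Both are assumptions, and Configuration remains the authoritative description. The functions resolve_provider and check_output_provider are hypothetical helpers.

```python
from typing import Dict, Optional, Sequence

LLM_PROVIDERS = ("claude", "openai")


def resolve_provider(config: Dict, direction: str) -> Optional[str]:
    """Return the provider for 'input' or 'output' (assumed lookup order)."""
    if direction not in ("input", "output"):
        raise ValueError("direction must be 'input' or 'output'")
    guards = config.get("guards") or {}
    override = (guards.get(direction) or {}).get("provider")
    # Assumption: a missing override falls back to the top-level provider.
    return override or config.get("provider")


def check_output_provider(provider: Optional[str], modules: Sequence[str]) -> None:
    """Reject output guarding with modules on a non-LLM provider."""
    if modules and provider not in LLM_PROVIDERS:
        raise ValueError("output guarding with modules needs claude or openai")
```

For example, a config with provider set to defend and guards.output.provider set to claude would resolve to defend for input and claude for output under these assumptions.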

External provider behavior

When provider is claude or openai, decisions are module-driven:

  • flag and block are accepted only when modules_triggered contains valid configured module names.
  • If no configured modules are available (or attribution is empty), the decision resolves to pass.
  • Selected models must support tool/function calling because Defend requires structured provider verdicts.
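
The module-driven rules above can be expressed as a small Python function. This is a sketch only: resolve_decision is a hypothetical name, and it assumes that a single valid configured module name in modules_triggered is enough to accept a flag or block. The source does not say whether every listed name must be valid.

```python
from typing import Iterable, Sequence


def resolve_decision(
    action: str,
    modules_triggered: Sequence[str],
    configured_modules: Iterable[str],
) -> str:
    """Downgrade flag/block to pass when module attribution is missing."""
    if action not in ("flag", "block"):
        return "pass"
    configured = set(configured_modules)
    # Assumption: at least one triggered module must be a configured one.
    attributed = [name for name in modules_triggered if name in configured]
    if not configured or not attributed:
        return "pass"
    return action
```

Under this reading, a block verdict with an empty modules_triggered list, or one that names only unknown modules, resolves to pass.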

For published accuracy comparisons of the local defend path on a standard injection-detection benchmark, see Benchmarks.