Toxicity

Detect toxic or abusive content in user input.

YAML key: toxicity
Direction: input

Configuration

| Prop | Type | Example |
| --- | --- | --- |
| `categories` | list | `[]` |
| `severity` | string | `"standard"` |

defend.config.yaml (fragment)

```yaml
guards:
  input:
    modules:
      - toxicity:
          categories: []
          severity: "standard"
```
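An empty `categories` list typically means all categories are checked; to scope the module to specific categories, the list can be populated. A filled-in sketch is below; the category names are hypothetical placeholders for illustration, not a confirmed list from this product's documentation:

```yaml
guards:
  input:
    modules:
      - toxicity:
          # hypothetical category names, for illustration only
          categories: ["harassment", "hate", "threats"]
          severity: "standard"
```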

Provider usage

Configure the module under `guards.input.modules` when using an LLM-backed input provider.

See the modules overview.