Red Team — Generative Adversarial Security Design

Install: /plugin install loa-freeside

Purpose

Use the Flatline Protocol's red team mode to generate creative attack scenarios against design documents. The skill produces structured attack scenarios, each with a consensus classification and an architectural counter-design.

Invocation

/red-team grimoires/loa/sdd.md
/red-team grimoires/loa/sdd.md --focus "agent-identity,token-gated-access"
/red-team grimoires/loa/sdd.md --mode quick
/red-team grimoires/loa/sdd.md --depth 2 --mode deep
/red-team --spec "Users authenticate via wallet signature and receive a JWT"

Arguments

| Argument | Flag | Default | Description |
|----------|------|---------|-------------|
| document | positional | required | Path to the document to red-team |
| spec | --spec | | Inline spec text (creates a temp document) |
| focus | --focus | all | Comma-separated attack surface categories |
| section | --section | all | Specific document section to target |
| depth | --depth | 1 | Number of attack/counter-design iterations |
| mode | --mode | standard | Execution mode: quick, standard, deep |
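
As an illustration of the argument table, here is a minimal sketch of an equivalent parser written with Python's argparse. The parser is hypothetical and is not the skill's own code; only the flag names and defaults come from the table. The rule that exactly one of the positional document or --spec must be supplied is an assumption inferred from the invocation examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the argument table; not the skill's actual code.
    parser = argparse.ArgumentParser(prog="red-team")
    parser.add_argument("document", nargs="?", help="Path to the document to red-team")
    parser.add_argument("--spec", help="Inline spec text")
    parser.add_argument("--focus", default="all", help="Comma-separated categories")
    parser.add_argument("--section", default="all", help="Document section to target")
    parser.add_argument("--depth", type=int, default=1, help="Iterations")
    parser.add_argument("--mode", choices=["quick", "standard", "deep"], default="standard")
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: a run needs exactly one input source, a path or inline text.
    if (args.document is None) == (args.spec is None):
        parser.error("provide either a document path or --spec")
    # "all" means no filtering; otherwise split the comma-separated list.
    if args.focus == "all":
        args.focus = None
    else:
        args.focus = [part.strip() for part in args.focus.split(",") if part.strip()]
    return args
```

For example, `parse(["grimoires/loa/sdd.md", "--depth", "2", "--mode", "deep"])` yields `depth == 2` and `mode == "deep"`, with `focus` left as `None`.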

Workflow

  1. Validate Config: Check red_team.enabled: true in .loa.config.yaml
  2. Input Handling: Load document or create temp file from --spec
  3. Surface Loading: Load attack surfaces from registry, filter by --focus
  4. Invoke Orchestrator: Call flatline-orchestrator.sh --mode red-team
  5. Present Results: Show attack summary with consensus categories
  6. Human Gate: If any severity >800, require human acknowledgment
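
Steps 1 and 3 of the workflow can be sketched as follows. This is an illustrative Python sketch, not the orchestrator's implementation: the function names are hypothetical, the config is assumed to be already parsed into a dict, and the registry is assumed to map each attack surface category name to a list of entries (the real layout of attack-surfaces.yaml is not shown in this document). Rejecting unknown categories is likewise an assumption.

```python
def check_enabled(config: dict) -> None:
    # Step 1: refuse to run unless red_team.enabled is exactly true.
    if config.get("red_team", {}).get("enabled") is not True:
        raise RuntimeError("red_team.enabled is not true")


def filter_surfaces(registry: dict, focus=None) -> dict:
    # Step 3: keep every category when no focus is given,
    # otherwise keep only the requested ones.
    if not focus:
        return dict(registry)
    unknown = [name for name in focus if name not in registry]
    if unknown:
        # Assumption: unknown categories are an error rather than silently ignored.
        raise ValueError(f"unknown attack surface categories: {unknown}")
    return {name: registry[name] for name in focus}
```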

Execution Modes

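| Mode | Models | Cross-Validation | Counter-Design | Budget |
|------|--------|------------------|----------------|--------|
| Quick | 2 (primary only) | Skip | Inline only | 50K tokens |
| Standard | 4 (primary + secondary) | Full | Full synthesis | 200K tokens |
| Deep | 4 + iteration | Full | Full + multi-depth | 500K tokens |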
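
The per-mode token budgets lend themselves to a simple lookup. The sketch below assumes a hypothetical helper, `remaining_budget`, that a caller invokes with a running token count; how the orchestrator actually meters tokens is not described here. The error text matches the "Budget exceeded" entry under Error Handling.

```python
# Budgets from the execution modes table, in tokens.
MODE_BUDGETS = {
    "quick": 50_000,
    "standard": 200_000,
    "deep": 500_000,
}


def remaining_budget(mode: str, tokens_used: int) -> int:
    # Hypothetical helper: returns the tokens left, or raises once the limit is passed.
    limit = MODE_BUDGETS[mode]
    if tokens_used > limit:
        raise RuntimeError("Budget exceeded")
    return limit - tokens_used
```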

Quick Mode Restrictions

  • Outputs labeled UNVALIDATED
  • Cannot produce CONFIRMED_ATTACK — all findings are THEORETICAL or CREATIVE_ONLY
  • No cross-validation performed
  • For exploratory use only, not for gating decisions

Consensus Categories

| Category | Criteria | Meaning |
|----------|----------|---------|
| CONFIRMED_ATTACK | Both models score >700 | Attack is realistic and should be addressed |
| THEORETICAL | One model >700, other ≤700 | Plausible but models disagree |
| CREATIVE_ONLY | Neither model scores >700 | Novel but neither model finds it convincing |
| DEFENDED | Both models >700 AND counter-design exists | Attack is real but already has effective defense |

Score Examples:

  • GPT=850, Opus=900 → CONFIRMED_ATTACK (both >700)
  • GPT=800, Opus=400 → THEORETICAL (one >700, other ≤700)
  • GPT=650, Opus=750 → THEORETICAL (Opus >700, GPT ≤700)
  • GPT=500, Opus=600 → CREATIVE_ONLY (neither >700)
  • GPT=300, Opus=200 → CREATIVE_ONLY (neither >700)
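
The classification rules and score examples above can be expressed as a small function. This is a sketch under stated assumptions: the function name and signature are hypothetical, the existence of a counter-design is reduced to a boolean, and DEFENDED is assumed to take precedence over CONFIRMED_ATTACK when both criteria hold. The 700 threshold corresponds to the confirmed_attack value in the configuration.

```python
def classify(score_a: int, score_b: int,
             has_counter_design: bool = False,
             threshold: int = 700) -> str:
    # Count how many of the two models rate the attack strictly above the threshold.
    above = int(score_a > threshold) + int(score_b > threshold)
    if above == 2:
        # Assumption: an existing counter-design turns a confirmed attack into DEFENDED.
        return "DEFENDED" if has_counter_design else "CONFIRMED_ATTACK"
    if above == 1:
        return "THEORETICAL"
    return "CREATIVE_ONLY"
```

Note that the comparison is strict, so a score of exactly 700 does not count as above the threshold, matching the "≤700" wording in the table.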

Human Validation Gate

When any attack scores severity >800:

Interactive mode: Present attack details and require acknowledgment:

HUMAN REVIEW REQUIRED

ATK-003: Confused Deputy in Ensemble Routing
Severity: 920/1000
Consensus: CONFIRMED_ATTACK

[A]cknowledge / [D]ismiss / [E]scalate

Autonomous mode: Write to pending-review.json for later human review.
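
A minimal sketch of the autonomous branch of the gate follows. The attack record fields (`id`, `severity`) and the layout of pending-review.json are assumptions made for illustration; the document names the file but not its schema. The gate value of 800 corresponds to human_review_gate in the configuration.

```python
import json
from pathlib import Path

GATE = 800  # human_review_gate


def flagged_for_review(attacks, gate=GATE):
    # An attack needs review only when its severity is strictly above the gate.
    return [attack for attack in attacks if attack["severity"] > gate]


def queue_pending_review(attacks, path, gate=GATE):
    # Autonomous mode: persist flagged attacks for a human to review later.
    # Assumption: the file holds a plain JSON list of attack records.
    flagged = flagged_for_review(attacks, gate)
    if flagged:
        target = Path(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(flagged, indent=2), encoding="utf-8")
    return flagged
```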

Output Files

| File | Permissions | Content |
|------|-------------|---------|
| .run/red-team/rt-{id}-result.json | 0644 | Full JSON result |
| .run/red-team/rt-{id}-report.md | 0600 | Full report (restricted) |
| .run/red-team/rt-{id}-summary.md | 0644 | Safe summary for PR/CI |
| .run/red-team/.ci-safe | 0644 | Manifest of CI-safe files |
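
One way to write files with the permissions listed above is sketched below for POSIX systems. The helper name and its parameters are hypothetical. The file is created owner-only and then set to its exact mode before any content is written, so the restricted report is never briefly readable by others.

```python
import os

# Permission bits from the output files table, keyed by file name suffix.
OUTPUT_MODES = {
    "result.json": 0o644,
    "report.md": 0o600,
    "summary.md": 0o644,
}


def write_output(directory: str, run_id: str, kind: str, text: str) -> str:
    # Hypothetical helper; POSIX only because it relies on os.fchmod.
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"rt-{run_id}-{kind}")
    mode = OUTPUT_MODES[kind]
    # Create (or truncate) the file as owner-only first.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Set the exact mode regardless of umask, then write the content.
        os.fchmod(handle.fileno(), mode)
        handle.write(text)
    return path
```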

Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| "red_team.enabled is not true" | Config toggle off | Set red_team.enabled: true |
| "Input blocked by sanitizer" | Credentials in document | Remove credentials from input |
| "Budget exceeded" | Token limit hit | Use a lower execution mode |
| "Orchestrator failed" | Model invocation error | Check API keys, retry |

Configuration

red_team:
  enabled: true
  mode: standard
  thresholds:
    confirmed_attack: 700
    theoretical: 400
    human_review_gate: 800
  budgets:
    quick_max_tokens: 50000
    standard_max_tokens: 200000
    deep_max_tokens: 500000

Simstim Integration

When red_team.simstim.auto_trigger: true, the red team automatically runs as Phase 4.5 (RED TEAM SDD) during the simstim workflow, after FLATLINE SDD review and before PLANNING.
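
The phase ordering described above can be sketched as follows. Only the three phase names mentioned in this section are modeled; the rest of the simstim workflow is omitted, and the function is hypothetical.

```python
def simstim_phases(config: dict) -> list:
    # Only the phases named in this section; the full simstim workflow has more.
    phases = ["FLATLINE SDD", "PLANNING"]
    auto = config.get("red_team", {}).get("simstim", {}).get("auto_trigger")
    if auto is True:
        # Phase 4.5 slots in after the Flatline SDD review and before planning.
        phases.insert(phases.index("PLANNING"), "RED TEAM SDD")
    return phases
```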

Related

  • /flatline-review — Standard Flatline Protocol quality review
  • /audit — Codebase security audit (implementation-level)
  • .claude/data/attack-surfaces.yaml — Attack surface registry
  • .claude/data/red-team-golden-set.json — Calibration corpus

Repository: 0xHoneyJar/loa-freeside (GitHub)
Stars: 7
License: NOASSERTION
Contributors: 6
Last commit: 2026-04-30T00:44:24Z
File: .claude/skills/red-teaming/SKILL.md