generate-asset-actions
Generate asset-actions.yaml from ASSETS.yaml by classifying assets into priority tiers. Use when the user asks to regenerate, update, or refresh the asset actions.
/plugin install inspect_evalsdetails
Generate Asset Policy
Regenerate internal/audits/asset-actions.yaml and internal/audits/audit-summary.md from ASSETS.yaml.
If ASSETS.yaml may be stale, run uv run python tools/generate_asset_manifest.py first.
Run uv run python tools/summarise_asset_manifest.py to get aggregate counts (by type, by state, totals). Use these numbers when populating audit-summary.md.
Classification
Read ASSETS.yaml. For each asset, determine target stage first, then priority. Process both state: floating assets AND state: pinned assets that match known-unstable sources (since their target is controlled, they are not yet at their target stage).
Target stages (per ADR-0007)
The target stage depends on host reliability, not asset type:
controlled(Stage 2) — any asset where upstream has broken before, maintainer is unresponsive/deprecated, OR host is unreliable (personal repos, Google Drive,.edudomains, university servers, any host without version control). This applies togit_clone,direct_url, andhuggingfacealike.pinned(Stage 1) — assets on reliable, version-controlled hosts (GitHub, HuggingFace, well-known CDNs) with no history of breakage.
Per ADR-0007: "Anything hosted on a less reliable domain (personal websites, Google Drive, university servers, or any host without version control) should skip straight to Stage 2."
Priority tiers
- Urgent — all other floating refs on reliable hosts. Target is
pinned. - High — matches a known-unstable source (see registry below). Target is
controlled. - Medium — unreliable host (
drive.google.com,.edudomains, personal repos/websites) not already in the known-unstable registry. Target iscontrolled.
For assets with state: pinned and a {SHA} placeholder but no checksum, classify as Low (target: pinned with checksum).
Omit assets already at their target stage.
Every entry needs: eval, source, type, state, target, action, reason.
Known-Unstable Sources
Update this list when new instability is discovered.
| Source | Eval | Incident |
|---|---|---|
xlang-ai/OSWorld | osworld | Files removed (PR #958) |
openai/evals | makemesay | Deprecated upstream |
corebench.cs.princeton.edu | core_bench | University server, no versioning |
epatey/fonts | osworld | Personal repo |
ShishirPatil/gorilla | bfcl | Data format issues (PR #954) |
yunx-z/MLRC-Bench | mlrc_bench | Broken task |
LRudL/sad | sad | Upstream bugs (issues #7, #8) |
meg-tong/sycophancy-eval | sycophancy | Invalid JSON/NaN, workaround in code |
josancamon/paperbench | paperbench | Paper ID mismatch (HF discussion #2) |
sentientfutures/moru-benchmark | moru | Exact duplicate rows |
Verification
asset-actions.yamlparses as valid YAML- Every floating asset in ASSETS.yaml appears in urgent, high, or medium
floating_assets + needing_checksums + no_action_needed == total_external_assets- Numbers in
audit-summary.mdmatch output ofsummarise_asset_manifest.py
technical
- github
- UKGovernmentBEIS/inspect_evals
- stars
- 517
- license
- MIT
- contributors
- 100
- last commit
- 2026-05-29T04:29:08Z
- file
- .claude/skills/generate-asset-actions/SKILL.md