Generate Asset Policy

Regenerate internal/audits/asset-actions.yaml and internal/audits/audit-summary.md from ASSETS.yaml.

If ASSETS.yaml may be stale, run uv run python tools/generate_asset_manifest.py first.

Run uv run python tools/summarise_asset_manifest.py to get aggregate counts (by type, by state, totals). Use these numbers when populating audit-summary.md.

Classification

Read ASSETS.yaml. For each asset, determine target stage first, then priority. Process both state: floating assets AND state: pinned assets that match known-unstable sources (since their target is controlled, they are not yet at their target stage).

Target stages (per ADR-0007)

The target stage depends on host reliability, not asset type:

controlled (Stage 2) — any asset where upstream has broken before, maintainer is unresponsive/deprecated, OR host is unreliable (personal repos, Google Drive, .edu domains, university servers, any host without version control). This applies to git_clone, direct_url, and huggingface alike.
pinned (Stage 1) — assets on reliable, version-controlled hosts (GitHub, HuggingFace, well-known CDNs) with no history of breakage.

Per ADR-0007: "Anything hosted on a less reliable domain (personal websites, Google Drive, university servers, or any host without version control) should skip straight to Stage 2."

Priority tiers

Urgent — all other floating refs on reliable hosts. Target is pinned.
High — matches a known-unstable source (see registry below). Target is controlled.
Medium — unreliable host (drive.google.com, .edu domains, personal repos/websites) not already in the known-unstable registry. Target is controlled.

For assets with state: pinned and a {SHA} placeholder but no checksum, classify as Low (target: pinned with checksum).

Omit assets already at their target stage.

Every entry needs: eval, source, type, state, target, action, reason.

Known-Unstable Sources

Update this list when new instability is discovered.

Source	Eval	Incident
`xlang-ai/OSWorld`	osworld	Files removed (PR #958)
`openai/evals`	makemesay	Deprecated upstream
`corebench.cs.princeton.edu`	core_bench	University server, no versioning
`epatey/fonts`	osworld	Personal repo
`ShishirPatil/gorilla`	bfcl	Data format issues (PR #954)
`yunx-z/MLRC-Bench`	mlrc_bench	Broken task
`LRudL/sad`	sad	Upstream bugs (issues #7, #8)
`meg-tong/sycophancy-eval`	sycophancy	Invalid JSON/NaN, workaround in code
`josancamon/paperbench`	paperbench	Paper ID mismatch (HF discussion #2)
`sentientfutures/moru-benchmark`	moru	Exact duplicate rows

Verification

asset-actions.yaml parses as valid YAML
Every floating asset in ASSETS.yaml appears in urgent, high, or medium
floating_assets + needing_checksums + no_action_needed == total_external_assets
Numbers in audit-summary.md match output of summarise_asset_manifest.py

generate-asset-actions

details

Generate Asset Policy

Classification

Target stages (per ADR-0007)

Priority tiers

Known-Unstable Sources

Verification

technical

related