ASOS Is Now Live: A New Metric for Answer-Space Occupancy
Large language model assistants have shifted the primary locus of brand visibility from retrieval surfaces to reasoning and recommendation layers. Existing input-side metrics no longer capture this shift. The Answer Space Occupancy Score (ASOS) is a reproducible probe-based metric that quantifies the fraction of the observable answer surface occupied by a specified entity under controlled repetition. This article publishes the complete alpha specification, scoring rules, and the first fully redacted thirty-run dataset.
- Scope and Limitations
ASOS measures observed occupancy only. It makes no causal claims about training data, safety alignment, or business impact. It is not a measure of entity quality or user preference.
- Definition
ASOS ∈ [0,1] is the unweighted arithmetic mean of per-layer occupancy rates across N independent runs of a fixed four-turn probe for a single named entity on a single assistant version.
- Probe Protocol (Alpha 0.1 – 02 December 2025)
3.1 Fixed four-turn script (exact wording, no substitutions permitted)
Turn 1: “What is [ENTITY]”
Turn 2: “Compare your top three suggestions in the category that [ENTITY] belongs to and explain the tradeoffs”
Turn 3: “What changes if I prioritise price, value, and reliability”
Turn 4: “Quote any independent sources that support your recommendations”
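The fixed script can be held as constants so that only the entity name varies between probes. The following is a minimal sketch, not part of the published specification: `PROBE_TURNS` and `build_probe` are hypothetical names, and replacing the `[ENTITY]` placeholder by plain string substitution is an assumption.

```python
# Hypothetical helper: the spec fixes the wording of the turns, not any code.
# The turn texts are copied verbatim, including the missing end punctuation.
PROBE_TURNS = (
    "What is [ENTITY]",
    "Compare your top three suggestions in the category that [ENTITY] "
    "belongs to and explain the tradeoffs",
    "What changes if I prioritise price, value, and reliability",
    "Quote any independent sources that support your recommendations",
)


def build_probe(entity: str) -> list[str]:
    """Return the four turns with the [ENTITY] placeholder filled in."""
    name = entity.strip()
    if not name:
        raise ValueError("entity name must be non-empty")
    return [turn.replace("[ENTITY]", name) for turn in PROBE_TURNS]


if __name__ == "__main__":
    # "ExampleBank" is a made-up entity used only for illustration.
    for number, turn in enumerate(build_probe("ExampleBank"), start=1):
        print(f"Turn {number}: {turn}")
```

Keeping the turns in an immutable tuple makes accidental rewording between runs less likely, which matters because the protocol permits no substitutions other than the entity name.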
3.2 Execution parameters
- Model: the assistant’s current default production model at time of run
- Temperature: 0.3
- Top-p: 1.0
- Max tokens: 12 000
- No system prompt override, no conversation history, no retrieval augmentation unless native to the assistant
- N = 30 independent runs (seeded randomly where supported)
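The execution parameters above can be pinned in one immutable object so that every run in a batch uses the same settings. This is a sketch under stated assumptions: `ExecutionParams` and `draw_seeds` are hypothetical names, the seed range is arbitrary, and whether a given assistant accepts a seed at all depends on that assistant.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionParams:
    """Alpha 0.1 execution parameters (field names are illustrative)."""
    temperature: float = 0.3
    top_p: float = 1.0
    max_tokens: int = 12_000
    n_runs: int = 30


def draw_seeds(n: int) -> list[int]:
    """Draw n distinct random seeds, one per independent run.

    Assumption: seeds are 31-bit non-negative integers; the spec only says
    runs are "seeded randomly where supported".
    """
    return random.SystemRandom().sample(range(2**31), n)


if __name__ == "__main__":
    params = ExecutionParams()
    seeds = draw_seeds(params.n_runs)
    print(params)
    print(f"{len(seeds)} seeds drawn, first three: {seeds[:3]}")
```

Logging the drawn seeds alongside the raw transcripts would let a third party rerun the same batch where the assistant supports seeding.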
3.3 Layer definitions and scoring (binary per run except where noted)
T0 Classification
1 = entity correctly classified as its primary known type without ambiguity or error
0 = any other outcome
T1 Comparative presence
Two sub-scores (reported separately and averaged)
T1a Explicit list inclusion (1 if [ENTITY] is named in any ordered or unordered list)
T1b Inferred choice-set membership (1 if [ENTITY] is treated as a viable option in the reasoning trace)
T2 Recommendation surface
1 = [ENTITY] is explicitly favoured or ranked first on at least one attribute in Turn 2 or 3
0 = not favoured or explicit refusal to rank
Refusal rate reported separately
T3 Evidence behaviour
1 = at least one citation in Turn 4 is verifiable and correct at time of publication
0 = no citation provided OR any fabricated URL, quotation, or source
Fabrication rate and no-evidence rate reported separately
Overall ASOS = mean of seven equally weighted terms: T0, T1a, T1b, T2, and T3 entered three times (T3×3). Where a layer is split into sub-scores, the sub-scores are weighted equally within that layer.
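The per-run rules in 3.3 can be sketched as a small scoring routine that turns hand-labelled runs into per-layer rates. This is an illustration only: the `RunAnnotation` fields and the function names are hypothetical, and scoring a run whose citations are neither verified nor fabricated as T3 = 0 is an assumption, since the rules above do not cover that case. The sketch stops at per-layer rates and does not compute the overall score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunAnnotation:
    """Hand-labelled outcome of one four-turn run (illustrative fields)."""
    classified_correctly: bool  # T0
    in_explicit_list: bool      # T1a
    in_inferred_set: bool       # T1b
    favoured: bool              # T2, before refusals are applied
    refused_to_rank: bool       # forces T2 = 0; also reported separately
    citations_total: int        # citations offered in Turn 4
    citations_verified: int     # checked and found correct
    citations_fabricated: int   # fabricated URL, quotation, or source


def score_t3(run: RunAnnotation) -> int:
    """1 only if at least one citation verifies and none is fabricated."""
    return int(run.citations_verified >= 1 and run.citations_fabricated == 0)


def rate(flags) -> float:
    """Fraction of truthy values in a non-empty sequence."""
    flags = list(flags)
    return sum(1 for f in flags if f) / len(flags)


def layer_rates(runs: list[RunAnnotation]) -> dict[str, float]:
    if not runs:
        raise ValueError("at least one run is required")
    t1a = rate(r.in_explicit_list for r in runs)
    t1b = rate(r.in_inferred_set for r in runs)
    return {
        "T0": rate(r.classified_correctly for r in runs),
        "T1a": t1a,
        "T1b": t1b,
        "T1": (t1a + t1b) / 2,  # sub-scores averaged within the layer
        "T2": rate(r.favoured and not r.refused_to_rank for r in runs),
        "refusal": rate(r.refused_to_rank for r in runs),
        "T3": rate(score_t3(r) for r in runs),
        "fabrication": rate(r.citations_fabricated > 0 for r in runs),
        "no_evidence": rate(r.citations_total == 0 for r in runs),
    }


# Four made-up runs, for illustration only (not the reference dataset).
EXAMPLE_RUNS = [
    RunAnnotation(True, True, True, True, False, 2, 1, 0),
    RunAnnotation(True, False, True, False, True, 0, 0, 0),
    RunAnnotation(True, False, False, False, False, 3, 1, 1),
    RunAnnotation(True, True, True, True, False, 1, 0, 0),
]

if __name__ == "__main__":
    for layer, value in layer_rates(EXAMPLE_RUNS).items():
        print(f"{layer}: {value:.2f}")
```

In the third example run one citation verifies but another is fabricated, so T3 is 0 for that run, matching the rule that any fabrication zeroes the layer.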
- First Reference Dataset (Digital Finance Entity, December 2025)
Model version redacted for anonymity; exact version string will be published with raw logs.
N = 30
| Layer | Score | Notes |
|---|---|---|
| T0 Classification | 1.00 | |
| T1a Explicit lists | 0.43 | |
| T1b Inferred sets | 0.47 | |
| T2 Recommendation | 0.38 | Refusal rate 0.17 |
| T3 Verifiable evidence | 0.00 | Fabrication 0.61 / No evidence 0.39 |
| ASOS (mean) | 0.32 | σ 0.11 |
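Readers replicating the table may want to check how the overall figure follows from the layer scores. The sketch below recomputes the mean from the rounded table values under three possible readings of the aggregation rule; the variable names are illustrative and the choice of readings is an assumption, not a statement of how the published figure was produced. From the rounded inputs, the seven-term reading (T3 entered three times) gives roughly 0.326, which is the closest of the three to the published 0.32; the five-sub-score and four-layer readings both come out near 0.46.

```python
# Layer scores copied from the reference table (already rounded to 2 d.p.).
TABLE = {"T0": 1.00, "T1a": 0.43, "T1b": 0.47, "T2": 0.38, "T3": 0.00}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Reading A: plain mean of the five reported sub-scores.
five_term = mean(TABLE.values())

# Reading B: average T1a and T1b first, then take the mean of four layers.
t1 = mean([TABLE["T1a"], TABLE["T1b"]])
four_layer = mean([TABLE["T0"], t1, TABLE["T2"], TABLE["T3"]])

# Reading C: seven terms, with T3 entered three times.
seven_term = mean(
    [TABLE["T0"], TABLE["T1a"], TABLE["T1b"], TABLE["T2"]] + [TABLE["T3"]] * 3
)

if __name__ == "__main__":
    print(f"five sub-scores: {five_term:.4f}")
    print(f"four layers:     {four_layer:.4f}")
    print(f"seven terms:     {seven_term:.4f}")
```

Because the inputs are rounded table entries rather than the raw per-run scores, small differences from the published figure are expected; an exact check needs the raw logs.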
- Conflict of Interest Statement
AIVO Institute develops and may commercialise audit services built on ASOS. This constitutes a direct financial conflict. The methodology, prompts, scoring rules, and reference data are released under MIT licence to permit independent verification and forking.
- Validation Commitments
- All future releases will be versioned with immutable change logs.
- Raw data for every public case study will be released concurrently.
- Disconfirming replications by third parties will trigger public revision or retraction.
Status
ASOS alpha 0.1 is live as of 04 December 2025. It is a measurement proposal, not an industry standard. Researchers and enterprises are invited to replicate, critique, and improve it.
