AZ-024 — Concrete Conformance Corpus Layout v1
Status
This document defines the concrete layout of the conformance corpus for ATLAS ZERO.
AZ-011 defined the concepts of:
- conformance vectors,
- canonical examples,
- deterministic verdicts,
- test families.
AZ-024 turns that model into a concrete corpus structure:
- directories,
- files,
- manifests,
- fixtures,
- expected outputs,
- naming policy,
- sharding,
- versioning,
- promotion rules.
Its purpose is to answer one question: what does the corpus look like, physically, logically and normatively, when two different implementations use it to demonstrate that they decode, validate, execute and finalize identically?
This document builds on:
- AZ-002 through AZ-023, with direct emphasis on AZ-011, AZ-018, AZ-019, AZ-020 and AZ-021.
Terms:
- MUST = mandatory
- MUST NOT = forbidden
- SHOULD = strongly recommended
- MAY = optional
1. Objective
AZ-024 answers 10 practical questions:
- How is the concrete corpus organized on disk?
- How are suites and vectors divided?
- Which files must exist in every test case?
- How are large fixtures referenced without needless duplication?
- How are expected outputs and verdicts stored?
- How are the corpus and its subsets versioned?
- How is the corpus run incrementally or in full?
- How does the corpus tie into manifests, attestations and releases?
- How is a vector promoted from draft to official?
- How do we avoid corpora that are chaotic, duplicated, non-canonical or impossible to reproduce?
2. Principles
2.1 Corpus is a release-grade protocol artifact
The conformance corpus for consensus-critical behavior MUST be treated as a serious artifact:
- versioned,
- manifested,
- hashed,
- attested,
- and distributed under control.
2.2 Layout must be boringly deterministic
The ordering of directories, names and manifests MUST be clear enough that two different teams arrive at the same structure and the same references.
2.3 Shared fixtures over redundant duplication
The corpus SHOULD reuse fixtures through canonical references rather than excessive copying, as long as the traceability and independence of tests remain clear.
2.4 Test case is a unit of truth
A test case MUST have:
- its own identity,
- clear inputs,
- clear context,
- clear expected outputs,
- sufficient metadata.
2.5 Promotion must be explicit
A vector does not become “official” merely because it exists on disk. It must have:
- status,
- review,
- manifest membership,
- and, where criticality requires it, attestation.
3. Corpus classes
3.1 Recommended corpus classes
ATLAS ZERO SHOULD distinguish:
- CC_DEV
- CC_INTERNAL
- CC_PUBLIC
- CC_RELEASE
- CC_MAINNET_CRITICAL
3.2 Meaning
CC_DEV
Working corpus, iterative, still unstable.
CC_INTERNAL
Corpus used internally for validation and serious development.
CC_PUBLIC
Corpus distributed publicly for testnet and integrators.
CC_RELEASE
Corpus frozen for a release candidate or stable release.
CC_MAINNET_CRITICAL
The minimal critical subset used for final validation of candidate mainnet implementations.
3.3 Rule
Different classes MUST be clearly separated in the manifest and in the promotion state.
4. Corpus top-level layout
4.1 Recommended layout
conformance_corpus/
  corpus_manifest/
  suites/
  fixtures/
  shared/
  expected/
  indexes/
  attestations/
  docs/
4.2 Directory meanings
corpus_manifest/
Manifests for the corpus, its subsets and its shards.
suites/
Suites organized by behavior family.
fixtures/
State fixtures, object fixtures, dependency fixtures.
shared/
Common artifacts reused by multiple suites.
expected/
Centralized expected outputs or bundles of expected outputs, where the model requires separation.
indexes/
Derived indexes for fast lookup.
attestations/
Approvals and reviews of the corpus or its subsets.
docs/
Non-normative documentation, run guides, migration notes.
5. Suite layout
5.1 Each suite SHOULD live under:
suites/<suite_name>/
5.2 Recommended suite names
- encoding
- hashing
- policies
- tx_validation
- state_transition
- consensus
- bvm
- witnesses
- economics
- agents
- governance
- genesis
- adversarial
- cross_version
5.3 Rule
Suite names MUST be stable, lowercase, normalized and not semantically overloaded.
6. Suite internal layout
6.1 Recommended layout
suites/<suite_name>/
  suite_manifest.json
  shards/
  cases/
  notes/
6.2 Meaning
suite_manifest.json
Describes suite-level metadata and entries.
shards/
Optional grouping for partial runs or parallelization.
cases/
The concrete test cases.
notes/
Non-normative documentation, rationale, migration notes.
7. Case directory layout
7.1 Recommended per-case layout
suites/<suite_name>/cases/<case_id>/
  case_manifest.json
  inputs/
  context/
  expected/
  attachments/
7.2 Meaning
case_manifest.json
The canonical object that defines the test case.
inputs/
Primary inputs for the test.
context/
State fixtures, parameter fixtures, extra references.
expected/
Verdicts and expected roots/hashes/errors.
attachments/
Optional non-normative or large artifacts, only if policy permits.
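As a rough illustration of the per-case layout, a structural check might look like the following Python sketch. The function name missing_entries is hypothetical, and treating context/ and attachments/ as optional is an assumption based on context_root being optional in the case manifest and attachments being described as optional.

```python
from pathlib import Path

# Entries assumed to be required in every case directory.
# context/ and attachments/ are treated as optional in this sketch.
REQUIRED_ENTRIES = ("case_manifest.json", "inputs", "expected")

def missing_entries(case_dir: Path) -> list:
    """Return the required entries that are absent from a case directory."""
    return [name for name in REQUIRED_ENTRIES if not (case_dir / name).exists()]
```

A check like this only confirms that the expected entries are present; normative identity still comes from manifests and hashes, not from the directory listing.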
8. Case identity
8.1 case_id policy
Every case MUST have a case_id that is unique across the entire corpus.
8.2 Recommended naming
az024.<suite_name>.<family>.<nnnn>
Examples:
- az024.encoding.objects.0001
- az024.tx_validation.machine_call.0042
- az024.consensus.notarization.0017
8.3 Rule
case_id MUST NOT depend on an accidental local path.
The path hosts the case; it does not define it.
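The recommended naming shape can be checked mechanically. The sketch below is one possible reading: the exact character classes, the fixed four-digit sequence width, and the name parse_case_id are assumptions made for illustration, since the text only recommends the general pattern az024.<suite_name>.<family>.<nnnn>.

```python
import re

# Assumed pattern: lowercase snake_case segments and a 4-digit sequence number.
CASE_ID_RE = re.compile(
    r"az024\.(?P<suite>[a-z][a-z0-9_]*)\.(?P<family>[a-z][a-z0-9_]*)\.(?P<seq>[0-9]{4})"
)

def parse_case_id(case_id: str) -> dict:
    """Split a case_id into its parts, or raise ValueError if it is malformed."""
    m = CASE_ID_RE.fullmatch(case_id)
    if m is None:
        raise ValueError(f"malformed case_id: {case_id!r}")
    return {"suite": m["suite"], "family": m["family"], "seq": int(m["seq"])}
```

Note that the function takes only the identifier string, never a filesystem path, which mirrors the rule that the path hosts the case but does not define it.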
9. Case manifest
9.1 Canonical structure
CorpusCaseManifest {
version_major
version_minor
case_id
suite_name
suite_version
corpus_class
protocol_version
case_category
case_status
criticality_tier
input_root
context_root?
expected_root
tags_hash?
notes_hash?
}
9.2 case_status
Recommended values:
- DRAFT
- REVIEWED
- OFFICIAL
- DEPRECATED
- SUPERSEDED
- REVOKED
9.3 Rule
Only OFFICIAL cases SHOULD count toward release-grade conformance, unless policy says otherwise.
10. Case categories
10.1 Examples
- object_valid
- object_invalid
- hash_derivation
- policy_satisfied
- policy_unsatisfied
- tx_accept
- tx_reject
- state_transition
- consensus_finalize
- consensus_reject
- bvm_success
- bvm_trap
- witness_active
- witness_revoked
- economics_formula
- agent_deny
- governance_activation
- genesis_recompute
- adversarial_reject
- cross_version_boundary
10.2 Rule
Category MUST align with suite semantics and expected outputs.
11. Input files
11.1 Inputs directory SHOULD contain:
- canonical objects
- raw bytes blobs
- tx files
- module blobs
- manifest refs
- reference lists
11.2 Naming recommendation
inputs/
  input_001.blob
  input_002.blob
  tx.blob
  module.blob
  object_list.blob
11.3 Rule
File names are operational only. Normative references MUST be through manifest entries and content hashes.
12. Context files
12.1 Context MAY include
- pre_state fixture
- parameter_state fixture
- genesis subset fixture
- prior snapshot fixture
- witness status context
- committee derivation seed/context
- mandate state context
12.2 Recommended layout
context/
  pre_state.blob
  parameter_state.blob
  committee_context.blob
  mandate_state.blob
12.3 Rule
If a case depends on context, the context MUST be fully captured or referenced canonically. No hidden external state.
13. Expected files
13.1 Expected directory SHOULD contain:
- expected_verdict.blob
- expected_state_root.txt or blob
- expected_receipt.blob
- expected_error_code.txt or blob
- expected_effect_digest.txt or blob
- expected_exec_units.txt or blob depending on case type
13.2 Recommended minimal names
expected/
  verdict.blob
  outputs.blob
13.3 Rule
Expected outputs MUST be machine-readable and canonical where consensus-relevant.
14. Shared fixtures
14.1 Need
Many cases use the same:
- initial state,
- genesis subset,
- parameter state,
- validator set,
- machine module,
- witness bundle.
14.2 Recommended layout
fixtures/
  states/
  parameters/
  validators/
  bvm_modules/
  witnesses/
  governance/
  genesis/
14.3 Fixture identity
Each fixture SHOULD be treated as a separate artifact with:
- fixture_id
- content_hash
- canonical_hash
- fixture_manifest, if it is a bundle
14.4 Rule
Cases SHOULD reference fixtures by id/hash, not by fragile relative assumptions alone.
15. Fixture reference model
15.1 Canonical structure
FixtureRef {
fixture_class
fixture_id
content_hash
canonical_hash
}
15.2 Use
A case manifest or suite manifest MAY include fixture refs instead of local copies.
15.3 Rule
If a case references an external fixture, the reference MUST be sufficient to verify exactly which artifact was used.
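A minimal sketch of how a runner could check a resolved fixture against its reference is shown below. It assumes that content_hash is the hex SHA-256 digest of the raw fixture bytes; the document does not name the hash function here, and the difference between content_hash and canonical_hash is not modeled. The function name verify_fixture is hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class FixtureRef:
    fixture_class: str
    fixture_id: str
    content_hash: str    # assumed: hex SHA-256 of the raw fixture bytes
    canonical_hash: str  # hash of the canonical form; not checked in this sketch

def verify_fixture(ref: FixtureRef, raw_bytes: bytes) -> bool:
    """Return True only if the bytes match the hash recorded in the reference."""
    return hashlib.sha256(raw_bytes).hexdigest() == ref.content_hash
```

Because the check depends only on the bytes and the recorded hash, it gives the same answer regardless of where the fixture was fetched from.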
16. Shared directory
16.1 Purpose
Holds common helper artifacts not tied to a single suite. Examples:
- common schemas
- shared object bundles
- sample signature bundles
- registry snapshots
- common policy objects
16.2 Rule
Shared artifacts MUST remain immutable within a corpus release. Any change creates new artifact identity.
17. Shards
17.1 Need
A large corpus must be runnable incrementally and in parallel.
17.2 Recommended layout
suites/<suite_name>/shards/<shard_id>.json
17.3 Shard object
SuiteShardManifest {
shard_id
suite_name
included_case_ids
case_root
shard_class
}
17.4 shard_class examples
- smoke
- full
- heavy
- negative_only
- launch_critical
- regression_hotspots
17.5 Rule
Shards are derived subsets. They MUST NOT redefine the truth of cases.
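One way to enforce that shards stay derived subsets is to reject any shard that lists a case the suite does not define. The helper below is a hypothetical sketch; it assumes the shard manifest has been loaded into a dict with an included_case_ids list.

```python
def unknown_shard_cases(shard: dict, suite_case_ids: set) -> list:
    """Return case ids listed by the shard that the suite does not define."""
    return sorted(set(shard["included_case_ids"]) - set(suite_case_ids))
```

An empty result means the shard only selects existing cases; it carries no inputs or expected outputs of its own, so it cannot redefine them.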
18. Suite manifest
18.1 Canonical structure
SuiteManifest {
version_major
version_minor
suite_name
suite_version
suite_status
corpus_class
case_count
case_entries_root
shard_root?
fixture_refs_root?
notes_hash?
}
18.2 Suite status
- DRAFT
- REVIEWED
- OFFICIAL
- DEPRECATED
- SUPERSEDED
18.3 Rule
Official suite manifest SHOULD enumerate all official cases included in that suite version.
19. Corpus root manifest
19.1 Purpose
Anchors the entire corpus release.
19.2 Canonical structure
CorpusRootManifest {
version_major
version_minor
corpus_id
corpus_class
corpus_version
protocol_version_scope
suite_manifest_refs
suite_root
shared_fixture_root
attestation_root?
notes_hash?
}
19.3 corpus_id
corpus_id = H("AZ:CONFORMANCE_CORPUS:" || canonical_corpus_root_manifest)
19.4 Rule
A release-grade corpus MUST have exactly one authoritative root manifest per release scope.
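The corpus_id derivation can be sketched as follows. Several choices here are assumptions and not part of this document: H is taken to be SHA-256, the canonical form is approximated as UTF-8 JSON with sorted keys and no extra whitespace, and the corpus_id field itself is excluded from the hashed bytes to avoid a circular definition.

```python
import hashlib
import json

DOMAIN_TAG = b"AZ:CONFORMANCE_CORPUS:"

def canonical_bytes(manifest: dict) -> bytes:
    # Assumed canonical form: UTF-8 JSON, sorted keys, compact separators.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")

def corpus_id(manifest: dict) -> str:
    """Hash the domain tag followed by the canonical manifest bytes."""
    # Assumption: the corpus_id field is not part of its own preimage.
    body = {k: v for k, v in manifest.items() if k != "corpus_id"}
    return hashlib.sha256(DOMAIN_TAG + canonical_bytes(body)).hexdigest()
```

The domain tag keeps this hash from colliding with hashes of other object types that might serialize to the same bytes.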
20. Root derivation rules
20.1 Case roots
For each case:
input_root = MerkleRoot(hash(inputs entries))
context_root = MerkleRoot(hash(context entries)) or EMPTY_ROOT
expected_root = MerkleRoot(hash(expected entries))
20.2 Suite case entries root
case_entry_hash_i = H("AZ:CORPUS_CASE_ENTRY:" || canonical_case_entry_i)
case_entries_root = MerkleRoot(case_entry_hash_i...)
20.3 suite_root
suite_root = H("AZ:SUITE_ROOT:" || canonical_suite_manifest)
20.4 corpus suite root
suite_manifest_ref_hash_i = H("AZ:SUITE_REF:" || canonical_suite_ref_i)
suite_root = MerkleRoot(suite_manifest_ref_hash_i...)
20.5 Rule
All roots MUST use canonical ordering by ids/hashes.
21. Empty root convention
21.1 Rule
Corpus layout MUST reuse the global empty-root convention:
EMPTY_ROOT = H("AZ:EMPTY_ROOT:")
21.2 Use
For absent context, absent shard lists, absent shared fixture refs, etc.
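The root derivation rules and the empty-root convention can be combined in one small sketch. It assumes SHA-256 for H, a binary Merkle tree, sorting leaf hashes as the canonical ordering, and duplicating the last node on odd levels; none of these details are fixed by this document, and the helper name entries_root is hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # assumed hash function

EMPTY_ROOT = H(b"AZ:EMPTY_ROOT:")

def merkle_root(leaf_hashes: list) -> bytes:
    """Binary Merkle root over leaf hashes, sorted for canonical ordering."""
    if not leaf_hashes:
        return EMPTY_ROOT  # absent context, shard lists, fixture refs, etc.
    level = sorted(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def entries_root(entries: list) -> bytes:
    """input_root / context_root / expected_root over raw entry bytes."""
    return merkle_root([H(e) for e in entries])
```

Sorting before hashing means the root does not depend on directory listing order, which is what lets two teams reach the same root from the same files.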
22. Expected verdict object
22.1 Standard structure
ExpectedVerdict {
verdict_class
primary_status
error_code?
receipt_hash?
state_root?
effect_digest?
exec_units?
auxiliary_root?
}
22.2 verdict_class examples
- accept
- reject
- state_transition
- finalize
- vm_result
- governance_result
- witness_status
- economics_result
22.3 Rule
Case runners MUST compare according to verdict_class semantics, not only string equality of human messages.
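Comparison by verdict_class semantics might be sketched like this. The mapping of classes to compared fields is purely illustrative and is not defined by this document; the point is only that free-form human messages never take part in the comparison.

```python
# Hypothetical mapping: which ExpectedVerdict fields matter per verdict_class.
COMPARED_FIELDS = {
    "accept": ("primary_status", "receipt_hash"),
    "reject": ("primary_status", "error_code"),
    "state_transition": ("primary_status", "state_root", "effect_digest"),
    "vm_result": ("primary_status", "exec_units", "effect_digest"),
}

def verdict_matches(expected: dict, observed: dict) -> bool:
    """Compare only the fields that are meaningful for the expected class."""
    cls = expected["verdict_class"]
    if observed.get("verdict_class") != cls:
        return False
    fields = COMPARED_FIELDS.get(cls, ("primary_status",))
    return all(expected.get(f) == observed.get(f) for f in fields)
```

With this shape, two implementations that reject with the same error code but different log messages are treated as agreeing.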
23. Human-readable notes
23.1 Notes MAY exist:
- per suite
- per case
- per corpus release
23.2 Rule
Notes are advisory only. They MUST NOT be required for machine validation.
23.3 Use
Helpful for:
- rationale
- bug history
- migration notes
- interpretation guidance
24. Corpus statuses and promotion
24.1 Promotion ladder
A case or suite SHOULD move through:
- DRAFT
- REVIEWED
- OFFICIAL
- RELEASE_LOCKED
- DEPRECATED
- SUPERSEDED
24.2 Rule
Promotion MUST be explicit and journaled. No silent “we now treat this test as official”.
24.3 RELEASE_LOCKED
Recommended for corpus subset used in a release candidate or launch scope.
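An explicit, journaled promotion step could look like the sketch below. It assumes forward-only movement along the ladder and leaves open whether steps may be skipped, which is a policy question this document does not settle. The journal entry fields and the actor parameter are hypothetical.

```python
LADDER = ["DRAFT", "REVIEWED", "OFFICIAL", "RELEASE_LOCKED", "DEPRECATED", "SUPERSEDED"]

def promote(case: dict, new_status: str, journal: list, actor: str) -> None:
    """Move a case forward on the ladder and record the change in a journal."""
    old_status = case["case_status"]
    # LADDER.index raises ValueError for statuses that are not on the ladder.
    if LADDER.index(new_status) <= LADDER.index(old_status):
        raise ValueError(f"cannot move {old_status} -> {new_status}")
    journal.append(
        {"case_id": case["case_id"], "from": old_status, "to": new_status, "actor": actor}
    )
    case["case_status"] = new_status
```

The journal entry is appended before the status changes, so a status can never change without a matching record.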
25. Criticality tiers in corpus
25.1 Suggested tiers
- CT_LOW
- CT_NORMAL
- CT_IMPORTANT
- CT_CRITICAL
- CT_MAINNET_CRITICAL
25.2 Rule
Mainnet-critical cases SHOULD live in a clearly queryable subset/shard.
25.3 Use
Tiers determine:
- required review
- required attestation
- release inclusion priority
- runner default selection
26. Corpus attestation policy
26.1 Recommendation
A release-grade corpus SHOULD require:
- review attestation on suites
- approval for corpus root manifest
- security review or equivalent for mainnet-critical subsets if they guard launch claims
26.2 Rule
A release binary claiming conformance against corpus X SHOULD point to exact corpus_id and exact suite refs.
27. Corpus release linkage
27.1 Need
A release candidate may be tied to a specific corpus version.
27.2 Structure
ReleaseCorpusLinkage {
linkage_id
release_candidate_id
corpus_id
included_suite_scope_root
launch_critical_shard_root?
}
27.3 Rule
Conformance claims MUST be scope-locked to exact corpus version.
28. Cross-version corpus support
28.1 Need
Different protocol versions need different expected results.
28.2 Rule
Corpus root manifest MUST declare protocol_version_scope.
28.3 Cases
A case MAY be:
- valid only for protocol v1.0
- boundary case between v1.0 and v1.1
- deprecated after v1.1
28.4 Recommendation
Cross-version boundary tests SHOULD live in dedicated suite cross_version.
29. Corpus indexing
29.1 Recommended indexes
- case_id -> path
- suite_name -> case_ids
- corpus_class -> suite refs
- protocol_version_scope -> cases
- criticality_tier -> cases
- fixture_ref usage index
- verdict_class index
- shard membership index
- status index
29.2 Rule
Indexes are derived conveniences. Truth remains in manifests.
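Because indexes are derived, they can be rebuilt from case manifests at any time. The sketch below builds three of the recommended indexes from manifests loaded as dicts; the function name build_indexes and the output shape are assumptions for illustration.

```python
from collections import defaultdict

def build_indexes(case_manifests: list) -> dict:
    """Derive lookup indexes from case manifests; manifests stay authoritative."""
    by_suite = defaultdict(list)
    by_tier = defaultdict(list)
    by_status = defaultdict(list)
    # Sort by case_id so the derived indexes are deterministic.
    for m in sorted(case_manifests, key=lambda m: m["case_id"]):
        by_suite[m["suite_name"]].append(m["case_id"])
        by_tier[m["criticality_tier"]].append(m["case_id"])
        by_status[m["case_status"]].append(m["case_id"])
    return {
        "suite_name": dict(by_suite),
        "criticality_tier": dict(by_tier),
        "case_status": dict(by_status),
    }
```

If an index file on disk ever disagrees with a rebuild like this, the manifests win and the index is regenerated.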
30. Runner contract
30.1 A conformant runner SHOULD:
- load corpus root manifest
- verify suite manifests
- resolve fixtures
- load case manifest
- materialize inputs/context
- execute subsystem under test
- compare against expected verdict
- emit machine-readable result record
30.2 Rule
Runners MUST NOT inject hidden defaults outside corpus-declared context.
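The core of the runner contract, resolving declared context and refusing to invent anything else, can be sketched as below. The case dict shape (fixture_ids, inputs, expected_verdict) and the execute callback are hypothetical stand-ins for the real manifest loading and the subsystem under test, and plain equality stands in for the verdict comparison.

```python
def run_case(case: dict, fixtures: dict, execute) -> dict:
    """Run one case using only the context the corpus itself declares."""
    context = {}
    for fixture_id in case.get("fixture_ids", []):
        if fixture_id not in fixtures:
            # Missing context is an invalid run, never a silent default.
            return {"case_id": case["case_id"], "pass_fail": "invalid_run"}
        context[fixture_id] = fixtures[fixture_id]
    observed = execute(case["inputs"], context)
    outcome = "pass" if observed == case["expected_verdict"] else "fail"
    return {"case_id": case["case_id"], "pass_fail": outcome}
```

Returning invalid_run for an unresolved fixture keeps a broken environment from being reported as an implementation failure.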
31. Runner result record
31.1 Canonical structure
CorpusRunResult {
run_id
corpus_id
suite_name
case_id
implementation_id
implementation_version
observed_verdict_hash
pass_fail
mismatch_class?
timestamp_unix_ms
}
31.2 pass_fail
- pass
- fail
- skipped
- invalid_run
31.3 Rule
Machine-readable result records SHOULD be archivable and linkable to release evidence.
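A result record that is easy to archive might be modeled as an immutable object serialized to one JSON line. The sketch below follows the field list in 31.1; the JSON encoding, the validation in __post_init__, and the helper name to_json_line are assumptions. The optional mismatch_class field is placed last only because Python dataclasses require defaulted fields to follow the others.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

PASS_FAIL_VALUES = ("pass", "fail", "skipped", "invalid_run")

@dataclass(frozen=True)
class CorpusRunResult:
    run_id: str
    corpus_id: str
    suite_name: str
    case_id: str
    implementation_id: str
    implementation_version: str
    observed_verdict_hash: str
    pass_fail: str
    timestamp_unix_ms: int
    mismatch_class: Optional[str] = None

    def __post_init__(self):
        if self.pass_fail not in PASS_FAIL_VALUES:
            raise ValueError(f"unknown pass_fail value: {self.pass_fail!r}")

def to_json_line(result: CorpusRunResult) -> str:
    """Serialize one record as a single JSON line with sorted keys."""
    return json.dumps(asdict(result), sort_keys=True, separators=(",", ":"))
```

One record per line makes run logs appendable and simple to link from release evidence by run_id and corpus_id.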
32. Heavy fixtures and large artifacts
32.1 Need
Some fixtures may be too large for duplication per case.
32.2 Rule
Large fixtures SHOULD live in shared fixture directories and be referenced by id/hash.
32.3 Recommendation
Cases SHOULD remain self-descriptive enough that a runner can fetch or resolve the fixture deterministically.
33. Corpus packaging
33.1 A release-grade corpus SHOULD be packageable as:
- vault subset
- canonical archive
- release bundle component
33.2 Rule
Packaging container MUST preserve internal canonical file bytes and manifest truth.
33.3 Recommendation
Corpus release package SHOULD include:
- corpus root manifest
- suite manifests
- required fixtures
- attestation bundle
- checksum bundle
- optional docs
34. Duplicate prevention
34.1 Rule
Corpus SHOULD reject:
- exact duplicate case with different case_id
- same case_id with conflicting inputs/expecteds
- semantically duplicated official mainnet-critical case unless supersession says otherwise
34.2 Near-duplicate handling
Near-duplicates MAY exist if:
- rationale differs,
- bug regression differs,
- protocol version differs,
- boundary semantics differ,
but they SHOULD be documented or tagged distinctly.
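The first two rejection rules in 34.1 (exact duplicates under a different case_id, and one case_id with conflicting content) can be checked from manifest roots alone. The sketch below assumes manifests are loaded as dicts and that equality of the three roots is an adequate notion of "exact duplicate"; semantic duplication is not something a hash comparison can detect.

```python
def find_conflicts(case_manifests: list) -> list:
    """Report exact duplicates and conflicting reuse of a case_id."""
    content_by_id = {}   # case_id -> content roots first seen under that id
    id_by_content = {}   # content roots -> first case_id seen with them
    problems = []
    for m in case_manifests:
        cid = m["case_id"]
        content = (m["input_root"], m.get("context_root"), m["expected_root"])
        if cid in content_by_id and content_by_id[cid] != content:
            problems.append(("conflicting_case_id", cid, cid))
        content_by_id.setdefault(cid, content)
        if content in id_by_content and id_by_content[content] != cid:
            problems.append(("exact_duplicate", cid, id_by_content[content]))
        id_by_content.setdefault(content, cid)
    return problems
```

Near-duplicates pass this check by construction, which is why they still need distinct documentation or tags.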
35. Supersession and deprecation
35.1 Case supersession
A case MAY be superseded when:
- bug in expected output fixed
- schema representation corrected
- protocol version evolved
- shard structure reorganized
35.2 Rule
Supersession MUST be explicit:
- old_case_id
- new_case_id
- reason class
- compatibility note hash if relevant
35.3 Deprecation
Deprecated cases remain historical but SHOULD NOT count toward current official release conformance unless policy says so.
36. Quarantine and invalid corpus artifacts
36.1 Cases or fixtures SHOULD be quarantined if:
- manifest mismatch
- missing context
- broken expected output encoding
- duplicate conflicting identity
- wrong protocol version scope
- corrupted shared fixture
- suspicious shadow update
36.2 Rule
A quarantined mainnet-critical case MUST NOT remain in active launch-critical shard.
37. Concrete naming recommendations
37.1 Suite manifest
suite_manifest.json
37.2 Case manifest
case_manifest.json
37.3 Corpus root manifest
corpus_root_manifest.json
37.4 Shared fixture manifest
fixture_manifest.json
37.5 Expected verdict
expected/verdict.blob
37.6 Rule
Physical filenames SHOULD be stable and predictable. Normative identity still comes from manifests and hashes.
38. Corpus integrity and vault integration
38.1 Rule
Release-grade corpus artifacts SHOULD be admitted into the secure vault with:
- artifact identity
- manifest chain
- attestation policy
- snapshot coverage
38.2 Rule
A corpus root manifest SHOULD be promotable like any critical release artifact.
39. Corpus anti-patterns
Systems SHOULD avoid:
- flat pile of test files with no manifests
- fixture duplication everywhere
- case identity by filename only
- expected outputs stored as prose only
- hidden environment assumptions not in context
- corpus updates that overwrite old cases in place
- mainnet-critical subset not distinguishable from dev noise
- release binary claiming conformance without exact corpus reference
- undocumented near-duplicate cases causing confusion
- notes/docs treated as normative input
40. Formal goals
AZ-024 pursues the following goals:
40.1 Corpus determinism
The same corpus release yields the same manifests, roots and case identities.
40.2 Corpus completeness
Each official case contains enough information to be run deterministically.
40.3 Corpus scalability
The layout supports large numbers of cases through fixtures, shards and manifests without losing clarity.
40.4 Corpus release integrity
A release can bind itself to exact conformance evidence through exact corpus references.
41. Document formula
Concrete Conformance Corpus = canonical root manifest + suite manifests + case manifests + shared fixtures + expected verdict objects + explicit promotion states
42. Relationship to the rest of the suite
- AZ-011 defines what conformance is.
- AZ-021 defines the build/release pipeline.
- AZ-024 defines the concrete form of the corpus that feeds and proves that conformance.
In short: AZ-011 gives the theory of vectors; AZ-024 gives the real, executable corpus.
43. What comes next
After AZ-024, the next document is:
AZ-025 — Validator and Operator Launch Manual
That document must fix:
- the exact steps for operators and validators,
- verification of the release package and genesis package,
- initial configuration,
- node bootstrap,
- pre-launch and post-launch checks,
- behavior in safe mode and during early incidents.
Closing
A good corpus is not just a collection of tests. It is an infrastructure of comparable truth: manifests, fixtures, expected outputs, status, promotion and release linkage. Without this structure, conformance is a team habit. With it, conformance becomes a verifiable protocol artifact.
That is where a serious corpus begins.