AZ-036 — Network Upgrade Rollout and Version Compatibility Matrix v1
Status
This document defines:
- the version compatibility matrix;
- the rules for rolling out upgrades across the network;
- mixed-fleet behavior;
- gating for validation and signing;
- the controlled transition before, during, and after activation of a new version.
AZ-001 through AZ-035 already provide:
- the specification of the protocol and its subsystems;
- governance, activations, and hard forks;
- launch discipline, monitoring, decision ledger, and archive;
- key discipline and post-compromise recovery.
AZ-036 answers the question: how do we move a live network from one version to another with no ambiguity about who is still compatible, who is still allowed to sign, and which version combinations may temporarily coexist without producing semantic divergence or operational risk?
The purpose of this document is to fix:
- the compatibility classes between versions;
- the official compatibility matrix;
- the rules for mixed-fleet behavior;
- the rollout and readiness thresholds;
- gating for proposer/verifier/notary/observer;
- the operational steps for rollout and cutover;
- post-upgrade observation and stabilization.
This document builds on:
- AZ-002 through AZ-035, with direct emphasis on AZ-004, AZ-009, AZ-015, AZ-017, AZ-021, AZ-025, AZ-028, AZ-030, AZ-031, AZ-034, and AZ-035.
Terms:
- MUST = mandatory
- MUST NOT = forbidden
- SHOULD = strongly recommended
- MAY = optional
1. Objective
AZ-036 answers 10 critical questions:
- What does version compatibility mean in ATLAS ZERO?
- Which version combinations can coexist in a live network?
- When can an old node still validate but no longer sign?
- When must a node fail closed?
- Which rollout thresholds are required before activation?
- How is cutover performed for compatible and incompatible upgrades?
- What do we observe during the mixed-fleet period?
- How do we tie together release package, activation boundary, and operator behavior?
- How do we avoid an accidental split caused by mixed versions?
- How do we archive and audit the entire version transition?
2. Principles
2.1 Compatibility is behavior, not label only
Compatibility MUST mean compatible behavior under the active rules, not merely the same version family or the same general impression.
2.2 Signing eligibility is stricter than observation eligibility
A node MAY be able to observe, or even partially validate, for longer than it is allowed to sign. Signing MUST have stricter thresholds.
2.3 Mixed fleets require explicit policy
Coexistence of two versions MUST be explicitly permitted by the compatibility matrix. Otherwise, fail-closed is assumed for critical roles.
2.4 Activation boundary and rollout policy must agree
It is not enough for an activation to exist. There must also be clear rules about:
- who must already be upgraded;
- who may lag behind;
- when old-version signing becomes forbidden.
2.5 Old versions must not improvise support
An old node MUST NOT continue to sign if it cannot demonstrate support for the semantics active after the boundary.
2.6 Rollout is part of protocol safety
Rollout MUST be treated as a protocol safety problem, not only as a fleet management problem.
3. Version model
3.1 Recommended version axes
ATLAS ZERO SHOULD distinguish:
- protocol_major
- protocol_minor
- execution_profile_version
- validation_profile_version
- feature_profile_version
- release_version
- conformance_corpus_version
3.2 Rule
Nodes SHOULD make compatibility decisions based on the official matrix, not on raw version tuples alone.
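As a minimal sketch of the rule above, the version axes can be carried as one immutable value and used only as a lookup key into the matrix, never compared field by field. The field types, the dictionary-shaped matrix, and the name `lookup_compatibility` are assumptions made for illustration; this document does not define them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeVersion:
    """One value per version axis listed in 3.1 (field types are assumed)."""
    protocol_major: int
    protocol_minor: int
    execution_profile_version: int
    validation_profile_version: int
    feature_profile_version: int
    release_version: str
    conformance_corpus_version: int


def lookup_compatibility(
    matrix: dict[tuple[NodeVersion, NodeVersion], str],
    local: NodeVersion,
    target: NodeVersion,
) -> Optional[str]:
    """Return the class the matrix assigns to (local, target), or None if unlisted."""
    return matrix.get((local, target))


# Illustrative values only.
v_old = NodeVersion(1, 2, 3, 3, 1, "1.2.0", 7)
v_new = NodeVersion(1, 3, 4, 3, 2, "1.3.0", 8)
matrix = {(v_old, v_new): "VC_TRANSITION_ONLY"}

print(lookup_compatibility(matrix, v_old, v_new))  # VC_TRANSITION_ONLY
print(lookup_compatibility(matrix, v_new, v_old))  # None
```

A `None` result means the pair is not listed; in line with principle 2.3, a caller would treat that as not permitted for critical roles rather than guessing from the version numbers.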
4. Compatibility classes
4.1 Standard compatibility classes
ATLAS ZERO SHOULD support at least:
- VC_FULLY_COMPATIBLE
- VC_COMPATIBLE_OBSERVER_ONLY
- VC_COMPATIBLE_VALIDATION_ONLY
- VC_COMPATIBLE_SIGNING_RESTRICTED
- VC_TRANSITION_ONLY
- VC_INCOMPATIBLE_POST_BOUNDARY
- VC_HARD_FORK_SEPARATE_NETWORK
4.2 Meaning
VC_FULLY_COMPATIBLE
The versions can coexist and participate fully under the active policy.
VC_COMPATIBLE_OBSERVER_ONLY
The older version may observe, but is not safe enough for validation or signing.
VC_COMPATIBLE_VALIDATION_ONLY
May validate or follow the protocol well enough for observation and local verification, but may not sign.
VC_COMPATIBLE_SIGNING_RESTRICTED
May sign only within a limited scope or only before a specific boundary.
VC_TRANSITION_ONLY
Compatibility holds only during the pre-activation period or within a strict transition window.
VC_INCOMPATIBLE_POST_BOUNDARY
After the boundary, the old version must fail closed for critical roles.
VC_HARD_FORK_SEPARATE_NETWORK
After cutover, the two lines must be treated as separate networks.
5. Compatibility matrix purpose
5.1 The matrix should answer:
- can version A coexist with version B?
- can version A propose/verify/notarize after boundary X?
- can version A only observe?
- which upgrade path must be followed?
- what is the mandatory fail-closed moment?
5.2 Rule
Without an official compatibility matrix, operators would improvise per release, which is dangerous.
6. Version compatibility matrix object
6.1 Canonical structure
VersionCompatibilityMatrix {
version_major
version_minor
matrix_id
target_network_class
current_protocol_version
target_protocol_version
compatibility_entries_root
rollout_policy_root
activation_boundary
metadata_hash?
}
6.2 matrix_id
matrix_id = H("AZ:VERSION_COMPAT_MATRIX:" || canonical_matrix_body)
6.3 Rule
The matrix MUST be scope-bound to the exact target version pair and activation boundary.
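The matrix_id derivation in 6.2 can be sketched as follows. Two things are assumptions here, since this document does not fix them: H is taken to be SHA-256, and canonical_matrix_body is taken to be compact UTF-8 JSON with sorted keys that excludes matrix_id itself. The field values in `example` are placeholders.

```python
import hashlib
import json

DOMAIN_TAG = b"AZ:VERSION_COMPAT_MATRIX:"


def canonical_matrix_body(matrix: dict) -> bytes:
    # Assumed canonical form: compact UTF-8 JSON with sorted keys,
    # excluding matrix_id (the id cannot be part of its own preimage).
    body = {k: v for k, v in matrix.items() if k != "matrix_id"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def compute_matrix_id(matrix: dict) -> str:
    # Assumed H = SHA-256; "||" is read as byte concatenation.
    return hashlib.sha256(DOMAIN_TAG + canonical_matrix_body(matrix)).hexdigest()


# Placeholder values for illustration.
example = {
    "version_major": 1,
    "version_minor": 0,
    "target_network_class": "testnet",
    "current_protocol_version": "1.2",
    "target_protocol_version": "1.3",
    "compatibility_entries_root": "00" * 32,
    "rollout_policy_root": "11" * 32,
    "activation_boundary": 500000,
}
matrix_id = compute_matrix_id(example)
```

Because the version pair and the activation boundary are inside the hashed body, changing either yields a different matrix_id, which is one way to make the scope binding of 6.3 mechanically checkable.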
7. Compatibility entry object
7.1 Canonical structure
CompatibilityEntry {
entry_id
from_version
to_version
compatibility_class
allowed_role_classes_root
disallowed_role_classes_root?
valid_before_boundary_only
valid_after_boundary
notes_hash?
}
7.2 allowed_role_classes examples
- observer
- validator_validation_only
- proposer
- verifier
- notary
- archive_reader
- tooling_client
7.3 Rule
Compatibility MUST be expressed per role, not only per generic node.
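A per-role check against compatibility entries might look like the sketch below. It replaces allowed_role_classes_root with a plain set of role names and reads the two boundary flags in the most direct way (an entry applies after the boundary only if valid_after_boundary is set and valid_before_boundary_only is not). Both choices are assumptions for illustration, as is the function name `role_allowed`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompatibilityEntry:
    from_version: str
    to_version: str
    compatibility_class: str
    allowed_roles: frozenset  # plain-set stand-in for allowed_role_classes_root
    valid_before_boundary_only: bool
    valid_after_boundary: bool


def role_allowed(entries, from_version, to_version, role, after_boundary):
    """True only if some entry for this version pair explicitly grants the role."""
    for e in entries:
        if (e.from_version, e.to_version) != (from_version, to_version):
            continue
        if role not in e.allowed_roles:
            continue
        if after_boundary and (e.valid_before_boundary_only or not e.valid_after_boundary):
            continue
        return True
    return False  # nothing grants the role: treat it as not allowed


# Illustrative entries: signing roles only before the boundary, observers throughout.
ENTRIES = [
    CompatibilityEntry("v1.2", "v1.3", "VC_TRANSITION_ONLY",
                       frozenset({"proposer", "verifier", "notary"}), True, False),
    CompatibilityEntry("v1.2", "v1.3", "VC_COMPATIBLE_OBSERVER_ONLY",
                       frozenset({"observer"}), False, True),
]

print(role_allowed(ENTRIES, "v1.2", "v1.3", "notary", after_boundary=False))  # True
print(role_allowed(ENTRIES, "v1.2", "v1.3", "notary", after_boundary=True))   # False
```

Using several entries for one version pair lets different roles carry different boundary flags, which a single node-wide flag could not express.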
8. Rollout policy root
8.1 Purpose
Defines the operational rules of the rollout.
8.2 MAY include:
- minimum upgraded validator threshold
- minimum upgraded notary threshold
- proposer threshold
- observer-only tolerance
- pre-boundary rollout deadline
- signing stop deadline for old versions
- post-boundary quarantine policy for laggards
- rollback posture if allowed
8.3 Rule
Rollout policy MUST be explicit before cutover.
9. Node role classes under rollout
9.1 Core role classes
- ROLE_OBSERVER
- ROLE_VALIDATION_ONLY
- ROLE_PROPOSER
- ROLE_VERIFIER
- ROLE_NOTARY
- ROLE_ARCHIVE_ONLY
9.2 Rule
Each role SHOULD have its own version gating. An observer does not have the same requirements as a notary.
10. Mixed-fleet periods
10.1 Definition
Mixed-fleet = a period in which several distinct versions are present in the network simultaneously.
10.2 Types
- pre-activation mixed fleet
- transition mixed fleet
- post-activation tolerated mixed fleet
- post-activation forbidden mixed fleet
10.3 Rule
Each upgrade SHOULD declare which mixed-fleet periods are permitted and for how long.
11. Pre-activation rollout phase
11.1 Purpose
Operators upgrade their nodes before the boundary.
11.2 Expected behavior
- old and new versions coexist as the matrix allows
- operators on the new version run compat mode if the upgrade requires it
- monitoring tracks adoption rate and role readiness
- old-version signing remains permitted only within the declared limits
11.3 Rule
Pre-activation rollout SHOULD produce explicit readiness metrics before activation is allowed.
12. Activation boundary behavior
12.1 At boundary, nodes MUST decide deterministically, depending on the compatibility matrix and active release scope, whether to:
- continue in full role
- continue validation-only
- continue observer-only
- disable signing
- stop / fail-closed
12.2 Rule
Boundary behavior MUST NOT depend on operator guesswork during critical moment.
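One way to keep the boundary decision free of operator guesswork is to make it a pure function of verified inputs. The sketch below assumes a downgrade ladder (full role, then validation-only, then observer-only, then fail-closed) and uses the role names from 7.2; the ladder order, the enum, and the parameter names are illustrative assumptions, not part of this specification.

```python
from enum import Enum


class BoundaryAction(Enum):
    CONTINUE_FULL_ROLE = "continue_full_role"
    VALIDATION_ONLY = "validation_only"   # signing disabled
    OBSERVER_ONLY = "observer_only"       # signing and validation duties disabled
    FAIL_CLOSED = "fail_closed"


def decide_at_boundary(current_role, roles_allowed_after, matrix_verified, scope_matches):
    """Pick the post-boundary action from verified inputs only."""
    # Unverifiable matrix or mismatched release scope: never guess.
    if not (matrix_verified and scope_matches):
        return BoundaryAction.FAIL_CLOSED
    if current_role in roles_allowed_after:
        return BoundaryAction.CONTINUE_FULL_ROLE
    # Assumed downgrade ladder: fall back to the strongest role still allowed.
    if "validator_validation_only" in roles_allowed_after:
        return BoundaryAction.VALIDATION_ONLY
    if "observer" in roles_allowed_after:
        return BoundaryAction.OBSERVER_ONLY
    return BoundaryAction.FAIL_CLOSED


# A notary on an old version for which only observing remains allowed.
action = decide_at_boundary("notary", {"observer"}, matrix_verified=True, scope_matches=True)
print(action.name)  # OBSERVER_ONLY
```

The same inputs always give the same action, so the decision can be precomputed and exercised in tests before the boundary arrives.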
13. Post-boundary behavior
13.1 After boundary, nodes SHOULD be in one of:
- fully supported active version
- restricted but allowed validation-only or observer-only state
- failed closed due to incompatibility
13.2 Rule
Old version consensus-signers MUST NOT continue post-boundary if matrix says incompatible.
14. Signing gating model
14.1 A node MAY sign only if:
- version is allowed to sign for current boundary state
- release package scope matches active upgrade scope
- key bindings are valid
- local preflight for upgraded role passed
- no decision ledger restriction blocks role
14.2 Rule
Signing eligibility after upgrade MUST be stricter than mere peer compatibility.
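The five conditions in 14.1 are conjunctive: a single failing condition blocks signing. A minimal sketch, with hypothetical field names standing in for checks that a real node would compute from the matrix, release package, key bindings, preflight results, and decision ledger:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SigningContext:
    version_may_sign_now: bool            # per matrix, for the current boundary state
    release_scope_matches: bool           # release package scope == active upgrade scope
    key_bindings_valid: bool
    upgraded_role_preflight_passed: bool
    ledger_restriction_blocks_role: bool  # True if a decision ledger entry blocks the role


def signing_blockers(ctx: SigningContext) -> list[str]:
    """Return every reason signing is blocked; an empty list means eligible."""
    blockers = []
    if not ctx.version_may_sign_now:
        blockers.append("version not allowed to sign in current boundary state")
    if not ctx.release_scope_matches:
        blockers.append("release package scope does not match active upgrade scope")
    if not ctx.key_bindings_valid:
        blockers.append("key bindings invalid")
    if not ctx.upgraded_role_preflight_passed:
        blockers.append("local preflight for upgraded role not passed")
    if ctx.ledger_restriction_blocks_role:
        blockers.append("decision ledger restriction blocks role")
    return blockers


def may_sign(ctx: SigningContext) -> bool:
    return not signing_blockers(ctx)


ok = SigningContext(True, True, True, True, False)
stale = SigningContext(True, True, True, False, False)
print(may_sign(ok), may_sign(stale))  # True False
```

Returning the list of blockers rather than a bare boolean gives monitoring something concrete to record when a signer is refused.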
15. Validation-only gating model
15.1 A node MAY remain validation-only if:
- it can still parse and validate enough active semantics safely
- matrix marks role as validation-only compatible
- operator policy allows it
- it does not emit incompatible signatures
15.2 Rule
Validation-only should not become an excuse for silent, semantically outdated participation.
16. Observer-only gating model
16.1 Observer-only mode MAY be allowed for:
- lagging infrastructure
- auditors
- archival readers
- external tooling
- temporary compatibility windows
16.2 Rule
Observer-only nodes MUST be provably non-signing in contexts where signing is forbidden.
17. Fail-closed conditions
17.1 Nodes MUST fail-closed for critical roles if:
- current active semantics are unsupported
- activation boundary passed and matrix forbids role
- release/genesis/upgrade scope mismatch exists
- version support uncertain
- upgrade package or matrix cannot be verified
- post-boundary mixed-fleet tolerance expired
17.2 Rule
“Likely still works” is not acceptable justification for continuing to sign.
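The fail-closed conditions of 17.1 differ from an ordinary checklist in that uncertainty counts as failure. A sketch that models each check as tri-state (True, False, or None for unknown) makes this explicit; the check names below are hypothetical labels for the six listed conditions.

```python
from typing import Mapping, Optional

# Hypothetical names, one per condition in 17.1, phrased so that True means "safe".
REQUIRED_CHECKS = (
    "active_semantics_supported",
    "role_permitted_for_current_boundary_state",
    "release_genesis_upgrade_scope_matches",
    "version_support_certain",
    "package_and_matrix_verified",
    "mixed_fleet_tolerance_still_open",
)


def must_fail_closed(checks: Mapping[str, Optional[bool]]) -> bool:
    """Fail closed unless every required check is positively True.

    False (violated), None (unknown), and a missing key all trigger fail-closed.
    """
    return any(checks.get(name) is not True for name in REQUIRED_CHECKS)


all_ok = {name: True for name in REQUIRED_CHECKS}
uncertain = dict(all_ok, version_support_certain=None)
print(must_fail_closed(all_ok), must_fail_closed(uncertain))  # False True
```

Comparing with `is not True` rather than testing truthiness is deliberate: an unknown or absent result can never be mistaken for a pass.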
18. Upgrade threshold model
18.1 Rollout policy SHOULD define thresholds such as:
- minimum upgraded proposer fraction
- minimum upgraded verifier fraction
- minimum upgraded notary fraction
- minimum upgraded validator cluster coverage
- mandatory named operators upgraded
- minimum archive/observer availability
18.2 Rule
Thresholds SHOULD be role-aware. Notary threshold matters more than archive-only threshold.
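Role-aware thresholds can be evaluated as one minimum fraction per role, so that a shortfall among notaries cannot be hidden by a high fleet-wide average. The threshold values and counts below are invented for illustration; this document does not prescribe any specific fractions.

```python
from fractions import Fraction


def threshold_shortfalls(upgraded, total, required):
    """Return {role: observed_fraction} for every role below its required minimum."""
    shortfalls = {}
    for role, minimum in required.items():
        n_total = total.get(role, 0)
        n_upgraded = upgraded.get(role, 0)
        # A role with no known members cannot satisfy a threshold.
        observed = Fraction(n_upgraded, n_total) if n_total else Fraction(0)
        if observed < minimum:
            shortfalls[role] = observed
    return shortfalls


# Illustrative numbers only: notaries carry the strictest threshold.
required = {
    "notary": Fraction(9, 10),
    "verifier": Fraction(4, 5),
    "proposer": Fraction(4, 5),
    "archive_only": Fraction(1, 2),
}
total = {"notary": 20, "verifier": 45, "proposer": 10, "archive_only": 6}
upgraded = {"notary": 17, "verifier": 40, "proposer": 9, "archive_only": 3}

shortfalls = threshold_shortfalls(upgraded, total, required)
print(shortfalls)  # {'notary': Fraction(17, 20)}
```

Here 90 of 101 nodes overall are upgraded, yet the verdict is still negative because only 17 of 20 notaries are. Exact fractions avoid floating-point edge cases when a count sits exactly on a threshold.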
19. Threshold evidence object
19.1 Canonical structure
UpgradeThresholdEvidence {
evidence_id
proposal_id
observation_window_hash
upgraded_role_counts_root
upgraded_role_threshold_verdict
timestamp_unix_ms
}
19.2 Rule
Cutover SHOULD rely on explicit threshold evidence, not optimistic fleet estimates.
20. Rollout readiness stages
20.1 Recommended stages
- UR_STAGE_PREPARE
- UR_STAGE_DISTRIBUTE
- UR_STAGE_UPGRADE_IN_PROGRESS
- UR_STAGE_THRESHOLD_REACHED
- UR_STAGE_ACTIVATION_READY
- UR_STAGE_BOUNDARY_ACTIVE
- UR_STAGE_POST_BOUNDARY_STABILIZING
- UR_STAGE_COMPLETE
20.2 Rule
Transition between stages SHOULD be backed by records and monitoring evidence.
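The stage rule can be sketched as a small state machine that refuses to advance without a reference to supporting evidence. Strictly linear progression with no skipped stages is an assumption of this sketch; the document lists the stages but does not state whether holds or deferrals may move a rollout backwards. `advance_stage` and the shape of `evidence_ref` are hypothetical.

```python
STAGES = (
    "UR_STAGE_PREPARE",
    "UR_STAGE_DISTRIBUTE",
    "UR_STAGE_UPGRADE_IN_PROGRESS",
    "UR_STAGE_THRESHOLD_REACHED",
    "UR_STAGE_ACTIVATION_READY",
    "UR_STAGE_BOUNDARY_ACTIVE",
    "UR_STAGE_POST_BOUNDARY_STABILIZING",
    "UR_STAGE_COMPLETE",
)


def advance_stage(current: str, target: str, evidence_ref: str) -> str:
    """Move to the next stage only, and only with an evidence reference."""
    if not evidence_ref:
        raise ValueError("stage transition requires an evidence or record reference")
    idx = STAGES.index(current)  # raises ValueError for an unknown stage
    if idx + 1 >= len(STAGES) or STAGES[idx + 1] != target:
        raise ValueError(f"cannot move from {current} to {target}")
    return target


stage = advance_stage("UR_STAGE_PREPARE", "UR_STAGE_DISTRIBUTE", "record-0001")
print(stage)  # UR_STAGE_DISTRIBUTE
```

Attempting to jump from UR_STAGE_DISTRIBUTE straight to UR_STAGE_ACTIVATION_READY, or passing an empty evidence reference, raises `ValueError` instead of silently advancing.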
21. Rollout record object
21.1 Canonical structure
UpgradeRolloutRecord {
rollout_id
proposal_id
rollout_stage
active_matrix_id
readiness_evidence_ref?
decision_ref?
timestamp_unix_ms
}
21.2 Rule
A serious network upgrade SHOULD have an explicit rollout record chain.
22. Operator upgrade checklist linkage
22.1 Operators SHOULD complete upgrade-specific checklists for:
- artifact verification
- matrix verification
- boundary understanding
- signing stop criteria
- post-upgrade preflight
- rejoin after upgrade
22.2 Rule
Operator rollout readiness SHOULD be linkable to checklist records, not only version reporting.
23. Release package linkage
23.1 Each upgrade rollout SHOULD bind exact:
- target release package id
- minimum required release versions
- incompatible superseded releases
- upgrade proposal id
- matrix id
23.2 Rule
Version numbers alone SHOULD NOT be the sole truth source for rollout.
24. Activation decision linkage
24.1 Upgrade rollout SHOULD integrate with decision ledger decisions such as:
- upgrade_hold
- upgrade_proceed
- upgrade_go
- upgrade_defer
- cutover_authorized
- laggard_quarantine
- rejoin_approved
24.2 Rule
Cutover without explicit decision linkage is operationally weak.
25. Monitoring during rollout
25.1 Rollout monitoring SHOULD track:
- version adoption by role
- signer eligibility failures
- peer compatibility mismatch rates
- validation divergence signals
- no-finality risk during transition
- rollback triggers if applicable
- laggard node population
- post-boundary failure-to-stop signals
25.2 Rule
Mixed-fleet monitoring MUST be stricter around activation boundary than during quiet pre-upgrade staging.
26. Laggard node policy
26.1 Laggard nodes are nodes that remain on outdated version beyond tolerated stage.
26.2 Policy SHOULD define:
- when laggards are merely warned
- when laggards are signing-forbidden
- when laggards are quarantined
- when laggards are dropped or treated as incompatible peers
26.3 Rule
Laggard treatment MUST be explicit and role-aware.
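An explicit, role-aware laggard policy can be written down as one escalation ladder per role, keyed by how long a node has stayed on the outdated version past the tolerated deadline. The durations, action names, and the choice to give unknown roles the strictest treatment are all assumptions for illustration.

```python
# Per-role escalation ladders: (milliseconds past the tolerated deadline, action).
# Each ladder is sorted by ascending threshold. All values are illustrative.
LADDERS = {
    "notary": [(0, "signing_forbidden"), (3_600_000, "quarantined"), (86_400_000, "dropped")],
    "observer": [(0, "warned"), (604_800_000, "dropped")],
}


def laggard_action(role: str, ms_past_deadline: int) -> str:
    """Return the strongest escalation step the laggard has reached."""
    ladder = LADDERS.get(role)
    if ladder is None:
        return "dropped"  # assumption: roles without a declared policy get the strictest step
    action = "none"
    for threshold_ms, step in ladder:
        if ms_past_deadline >= threshold_ms:
            action = step
    return action


print(laggard_action("notary", 4_000_000))    # quarantined
print(laggard_action("observer", 4_000_000))  # warned
```

The same delay produces different outcomes for a notary and an observer, which is the role-awareness the rule asks for.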
27. Rollback model
27.1 Some compatible upgrades MAY support rollback if:
- pre-boundary or immediate post-boundary safe window exists
- no irreversible state migration has committed
- decision ledger and policy explicitly allow it
27.2 Rule
Hard forks and irreversible migrations SHOULD assume rollback is not trivial.
27.3 Rule
Rollback policy MUST be explicit before rollout, not invented after trouble begins.
28. Transition-only compatibility
28.1 Some versions MAY be tolerated only in narrow window:
- can join pre-boundary
- must stop signing at boundary
- may remain observer-only briefly post-boundary
28.2 Rule
This class SHOULD be used explicitly for bridging releases and not confused with durable compatibility.
29. Hard fork rollout specifics
29.1 Hard fork rollout MUST define:
- fork boundary
- old/new identity treatment
- replay protection
- required release package set
- node stop behavior on old chain if joining new chain
- archive and audit requirements
- post-fork monitoring window
29.2 Rule
Hard fork mixed-fleet post-boundary MUST be treated as separate network coexistence problem, not normal compatibility.
30. Version compatibility matrix publication
30.1 The matrix SHOULD be published as:
- canonical matrix artifact
- operator-readable summary
- release notes linkage
- rollout advisory linkage
30.2 Rule
Human-readable summary is helpful, but canonical matrix object remains normative.
31. Conformance requirements for version transitions
31.1 Upgrades SHOULD include tests for:
- old->new pre-boundary behavior
- boundary cutover
- post-boundary supported role behavior
- unsupported old version fail-closed behavior
- mixed-fleet tolerance if any
- replay and migration correctness if applicable
31.2 Rule
Version transition semantics SHOULD be part of conformance corpus, not only operator docs.
32. Key and role interaction
32.1 Rollout MAY coincide with key rotation or role rebinding.
32.2 Rule
If upgrade changes key requirements, matrix and rollout policy MUST state:
- whether old keys remain valid
- whether new role bindings are required
- whether signing must stop until new bindings active
32.3 Rule
Version rollout and key rotation combined can multiply risk and SHOULD be handled with extra explicitness.
33. Archive and audit linkage
33.1 A serious upgrade rollout SHOULD archive:
- compatibility matrix
- rollout policy
- readiness evidence
- threshold evidence
- decision ledger subset
- operator advisory bundle
- activation and observation records
- post-upgrade stabilization evidence
33.2 Rule
Future reviewers SHOULD be able to reconstruct exactly why mixed fleets were allowed or forbidden.
34. Compatibility summary matrix example semantics
34.1 Example rows
- v1.2 -> v1.3 compatible gated, observers yes, validators yes, signers yes pre-boundary, old signers no post-boundary
- v1.3 -> v2.0 hard fork separate network, observers old-chain only, new-chain join requires full upgrade
- v1.3 -> v1.3.1 fully compatible, all roles yes, no boundary disruption
34.2 Rule
These summaries are examples only; actual matrix MUST be canonical and scope-bound.
35. Anti-patterns
Systems SHOULD avoid:
- saying “everyone please upgrade soon” with no matrix
- allowing old signers to continue after incompatible boundary
- mixed fleets with no monitoring or threshold evidence
- role eligibility not distinguished from observer compatibility
- rollback assumptions with irreversible migration
- version numbers changed with no activation record
- operator docs that conflict with matrix object
- hard fork with no explicit post-boundary peer treatment
- no laggard policy
- no archive of the rollout path
36. Formal goals
AZ-036 pursues the following goals:
36.1 Deterministic version compatibility
Nodes and operators can determine exactly whether a version is allowed for each role at each stage.
36.2 Safe rollout
Upgrade rollout can happen with explicit thresholds, staged monitoring and predictable cutover behavior.
36.3 Controlled mixed fleets
Temporary coexistence of versions does not become accidental semantic divergence.
36.4 Audit-grade transition history
The network can later reconstruct exactly how it moved from one version regime to another.
37. Document formula
Network Upgrade Rollout = official compatibility matrix + role-aware gating + rollout thresholds + explicit activation boundary + mixed-fleet monitoring + fail-closed rules for unsupported signers
38. Relationship to the rest of the suite
- AZ-034 defines the classification of upgrades and hard forks.
- AZ-035 defines key and role discipline.
- AZ-036 defines how the live network actually moves between versions, with no ambiguity about compatibility.
In short: AZ-034 says what changes and when; AZ-036 says who is still allowed to participate, and in what way, for the duration of the transition.
39. What comes next
After AZ-036, the next document is:
AZ-037 — Long-Term Archive Verification and Preservation Schedule
That document must fix:
- periodic verification of archives;
- rotation of storage media;
- revalidation of manifests and hashes;
- very long-term retention policies;
- how we ensure that launch, upgrade, and incident archives remain verifiable years later.
Closing
A network does not upgrade safely merely because new code exists. It upgrades safely when every node knows exactly: which versions can coexist, when it is allowed to sign, when it must stay silent, and when an old version is no longer merely “older” but already incompatible.
That is where mature rollout between versions begins.