AZ-036 — Network Upgrade Rollout and Version Compatibility Matrix v1

Status

This document defines:

  • the version compatibility matrix;
  • the rules for rolling out network upgrades;
  • mixed-fleet behavior;
  • gating for validation and signing;
  • the controlled transition before, during, and after activation of a new version.

After AZ-001 through AZ-035, the following already exist:

  • the specification of the protocol and its subsystems;
  • governance, activations, and hard forks;
  • launch discipline, monitoring, decision ledger, and archive;
  • key discipline and post-compromise recovery.

AZ-036 answers the question: how do we move a live network from one version to another with no ambiguity about who is still compatible, who may still sign, and which version combinations can temporarily coexist without causing semantic divergence or operational risk?

The purpose of this document is to fix:

  • the compatibility classes between versions;
  • the official compatibility matrix;
  • the rules for mixed-fleet behavior;
  • the rollout and readiness thresholds;
  • the gating for proposer/verifier/notary/observer;
  • the operational steps for rollout and cutover;
  • post-upgrade observation and stabilization.

This document builds on:

  • AZ-002 through AZ-035, with direct emphasis on AZ-004, AZ-009, AZ-015, AZ-017, AZ-021, AZ-025, AZ-028, AZ-030, AZ-031, AZ-034, and AZ-035.

Terms:

  • MUST = mandatory
  • MUST NOT = forbidden
  • SHOULD = strongly recommended
  • MAY = optional

1. Objective

AZ-036 answers 10 critical questions:

  1. What does compatibility between versions mean in ATLAS ZERO?
  2. Which version combinations can coexist in a live network?
  3. When can an old node still validate but no longer sign?
  4. When must a node fail-closed?
  5. Which rollout thresholds are required before activation?
  6. How is cutover performed for compatible and incompatible upgrades?
  7. What do we observe during the mixed-fleet period?
  8. How do we link release package, activation boundary, and operator behavior?
  9. How do we avoid an accidental split caused by mixed versions?
  10. How do we archive and audit the entire version transition?

2. Principles

2.1 Compatibility is behavior, not just a label

Compatibility MUST mean compatible behavior under the active rules, not merely the same version family or the same general impression.

2.2 Signing eligibility is stricter than observation eligibility

A node MAY be able to observe, or even partially validate, for longer than it is allowed to sign. Signing MUST have stricter thresholds.

2.3 Mixed fleets require explicit policy

Coexistence of two versions MUST be explicitly permitted by the compatibility matrix. Otherwise, fail-closed is assumed for critical roles.

2.4 Activation boundary and rollout policy must agree

The existence of an activation is not enough. There must also be clear rules about:

  • who must already be upgraded;
  • who may lag behind;
  • when old-version signing becomes forbidden.

2.5 Old versions must not improvise support

An old node MUST NOT continue signing if it cannot demonstrate support for the semantics active after the boundary.

2.6 Rollout is part of protocol safety

Rollout MUST be treated as a protocol safety problem, not just a fleet management problem.


3. Version model

3.1 Recommended version axes

ATLAS ZERO SHOULD distinguish:

  • protocol_major
  • protocol_minor
  • execution_profile_version
  • validation_profile_version
  • feature_profile_version
  • release_version
  • conformance_corpus_version

3.2 Rule

Nodes SHOULD make compatibility decisions based on the official matrix, not on raw version tuples alone.


4. Compatibility classes

4.1 Standard compatibility classes

ATLAS ZERO SHOULD support at least:

  • VC_FULLY_COMPATIBLE
  • VC_COMPATIBLE_OBSERVER_ONLY
  • VC_COMPATIBLE_VALIDATION_ONLY
  • VC_COMPATIBLE_SIGNING_RESTRICTED
  • VC_TRANSITION_ONLY
  • VC_INCOMPATIBLE_POST_BOUNDARY
  • VC_HARD_FORK_SEPARATE_NETWORK

4.2 Meaning

VC_FULLY_COMPATIBLE

The versions can coexist and participate fully under the active policy.

VC_COMPATIBLE_OBSERVER_ONLY

The older version can observe, but is not safe enough for validation or signing.

VC_COMPATIBLE_VALIDATION_ONLY

The version can validate or follow the protocol well enough for observation and local verification, but cannot sign.

VC_COMPATIBLE_SIGNING_RESTRICTED

The version can sign only in a limited scope or only before a specific boundary.

VC_TRANSITION_ONLY

Compatibility holds only in the pre-activation period or within a strict transition window.

VC_INCOMPATIBLE_POST_BOUNDARY

After the boundary, the old version must fail-closed for critical roles.

VC_HARD_FORK_SEPARATE_NETWORK

After cutover, the two lines must be treated as separate networks.


5. Compatibility matrix purpose

5.1 The matrix should answer:

  • can version A coexist with version B?
  • can version A propose/verify/notarize after boundary X?
  • can version A only observe?
  • which upgrade path must be followed?
  • what is the mandatory fail-closed moment?

5.2 Rule

Without an official compatibility matrix, operators would improvise per release, which is dangerous.


6. Version compatibility matrix object

6.1 Canonical structure

VersionCompatibilityMatrix {
  version_major
  version_minor

  matrix_id
  target_network_class
  current_protocol_version
  target_protocol_version
  compatibility_entries_root
  rollout_policy_root
  activation_boundary
  metadata_hash?
}

6.2 matrix_id

matrix_id = H("AZ:VERSION_COMPAT_MATRIX:" || canonical_matrix_body)

6.3 Rule

The matrix MUST be scope-bound to the exact target version pair and activation boundary.
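
The identifier derivation in 6.2 can be sketched as follows. This is a minimal illustration only: it assumes SHA-256 for H and sorted-key compact JSON as the canonical encoding, neither of which is specified in this document, and the helper names and example values are hypothetical.

```python
import hashlib
import json

# Domain-separation tag taken from section 6.2.
DOMAIN_TAG = b"AZ:VERSION_COMPAT_MATRIX:"


def canonical_matrix_body(matrix: dict) -> bytes:
    """Serialize the matrix without matrix_id.

    Assumption: sorted-key compact JSON stands in for the real canonical encoding.
    """
    body = {k: v for k, v in matrix.items() if k != "matrix_id"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def compute_matrix_id(matrix: dict) -> str:
    """matrix_id = H(tag || canonical_matrix_body), with H assumed to be SHA-256."""
    return hashlib.sha256(DOMAIN_TAG + canonical_matrix_body(matrix)).hexdigest()


# Illustrative field values; none of them come from the specification.
example = {
    "version_major": 1,
    "version_minor": 0,
    "target_network_class": "mainnet",
    "current_protocol_version": "1.2",
    "target_protocol_version": "1.3",
    "compatibility_entries_root": "aa" * 32,
    "rollout_policy_root": "bb" * 32,
    "activation_boundary": 1_000_000,
}
matrix_id = compute_matrix_id(example)
```

Because the version pair and the activation boundary are part of the hashed body, changing either one yields a different matrix_id, which is one way to realize the scope binding required by 6.3.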


7. Compatibility entry object

7.1 Canonical structure

CompatibilityEntry {
  entry_id
  from_version
  to_version
  compatibility_class
  allowed_role_classes_root
  disallowed_role_classes_root?
  valid_before_boundary_only
  valid_after_boundary
  notes_hash?
}

7.2 allowed_role_classes examples

  • observer
  • validator_validation_only
  • proposer
  • verifier
  • notary
  • archive_reader
  • tooling_client

7.3 Rule

Compatibility MUST be expressed per role, not just per generic node.
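
A per-role lookup against compatibility entries might look like the sketch below. It is an illustration under stated assumptions: plain sets of role names stand in for allowed_role_classes_root and disallowed_role_classes_root, the function names are hypothetical, and a missing entry is treated as "not allowed", in line with principle 2.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompatibilityEntry:
    from_version: str
    to_version: str
    compatibility_class: str
    allowed_roles: frozenset                 # stands in for allowed_role_classes_root
    disallowed_roles: frozenset = frozenset()
    valid_before_boundary_only: bool = False
    valid_after_boundary: bool = True


def role_allowed(entry: CompatibilityEntry, role: str, boundary_passed: bool) -> bool:
    """True only if this entry explicitly permits the role in the current boundary state."""
    if role in entry.disallowed_roles or role not in entry.allowed_roles:
        return False
    if boundary_passed:
        return entry.valid_after_boundary and not entry.valid_before_boundary_only
    return True


def matrix_allows(entries, from_version, to_version, role, boundary_passed) -> bool:
    """No matching entry means no permission (fail-closed default)."""
    return any(
        e.from_version == from_version
        and e.to_version == to_version
        and role_allowed(e, role, boundary_passed)
        for e in entries
    )


# Illustrative entries: old signers only before the boundary, observers throughout.
entries = [
    CompatibilityEntry("1.2", "1.3", "VC_TRANSITION_ONLY",
                       frozenset({"proposer", "verifier", "notary"}),
                       valid_before_boundary_only=True, valid_after_boundary=False),
    CompatibilityEntry("1.2", "1.3", "VC_COMPATIBLE_OBSERVER_ONLY",
                       frozenset({"observer"})),
]
```

With these entries, a v1.2 notary is permitted before the boundary and refused after it, while a v1.2 observer is permitted in both states.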


8. Rollout policy root

8.1 Purpose

Defines the operational rules of the rollout.

8.2 MAY include:

  • minimum upgraded validator threshold
  • minimum upgraded notary threshold
  • proposer threshold
  • observer-only tolerance
  • pre-boundary rollout deadline
  • signing stop deadline for old versions
  • post-boundary quarantine policy for laggards
  • rollback posture if allowed

8.3 Rule

Rollout policy MUST be explicit before cutover.


9. Node role classes under rollout

9.1 Core role classes

  • ROLE_OBSERVER
  • ROLE_VALIDATION_ONLY
  • ROLE_PROPOSER
  • ROLE_VERIFIER
  • ROLE_NOTARY
  • ROLE_ARCHIVE_ONLY

9.2 Rule

Each role SHOULD have its own version gating. An observer does not have the same requirements as a notary.


10. Mixed-fleet periods

10.1 Definition

Mixed-fleet = a period in which multiple distinct versions are present in the network at the same time.

10.2 Types

  • pre-activation mixed fleet
  • transition mixed fleet
  • post-activation tolerated mixed fleet
  • post-activation forbidden mixed fleet

10.3 Rule

Each upgrade SHOULD declare which mixed-fleet periods are permitted and for how long.


11. Pre-activation rollout phase

11.1 Purpose

Operators update their nodes before the boundary.

11.2 Expected behavior

  • old and new versions coexist according to the matrix
  • operators on the new version run in compat mode if the upgrade requires it
  • monitoring tracks adoption rate and role readiness
  • old-version signing remains permitted only within the declared limits

11.3 Rule

Pre-activation rollout SHOULD produce explicit readiness metrics before activation is allowed.


12. Activation boundary behavior

12.1 At the boundary, nodes MUST decide deterministically, based on the compatibility matrix and the active release scope, whether to:

  • continue in full role
  • continue validation-only
  • continue observer-only
  • disable signing
  • stop / fail-closed

12.2 Rule

Boundary behavior MUST NOT depend on operator guesswork at the critical moment.
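
One way to keep the boundary decision free of guesswork is to make it a pure function of verified inputs. The sketch below is an assumption-laden illustration: the fallback ladder (signing roles may drop to validation-only, then observer-only) and all names are hypothetical, and any downgrade is taken to imply that signing is disabled.

```python
from enum import Enum


class BoundaryAction(Enum):
    CONTINUE_FULL_ROLE = "continue_full_role"
    CONTINUE_VALIDATION_ONLY = "continue_validation_only"
    CONTINUE_OBSERVER_ONLY = "continue_observer_only"
    STOP_FAIL_CLOSED = "stop_fail_closed"


_SIGNER_FALLBACKS = ("ROLE_VALIDATION_ONLY", "ROLE_OBSERVER")

# Hypothetical downgrade ladder per configured role.
FALLBACKS = {
    "ROLE_PROPOSER": _SIGNER_FALLBACKS,
    "ROLE_VERIFIER": _SIGNER_FALLBACKS,
    "ROLE_NOTARY": _SIGNER_FALLBACKS,
    "ROLE_VALIDATION_ONLY": ("ROLE_OBSERVER",),
}

ACTION_FOR_FALLBACK = {
    "ROLE_VALIDATION_ONLY": BoundaryAction.CONTINUE_VALIDATION_ONLY,
    "ROLE_OBSERVER": BoundaryAction.CONTINUE_OBSERVER_ONLY,
}


def decide_at_boundary(node_role: str, roles_allowed_after_boundary: frozenset,
                       matrix_verified: bool) -> BoundaryAction:
    """Same inputs always give the same action; nothing is left to the operator."""
    if not matrix_verified:
        return BoundaryAction.STOP_FAIL_CLOSED
    if node_role in roles_allowed_after_boundary:
        return BoundaryAction.CONTINUE_FULL_ROLE
    # Downgrades below all imply that signing is disabled.
    for fallback in FALLBACKS.get(node_role, ()):
        if fallback in roles_allowed_after_boundary:
            return ACTION_FOR_FALLBACK[fallback]
    return BoundaryAction.STOP_FAIL_CLOSED
```

For example, a notary on a version whose post-boundary entry only allows ROLE_OBSERVER ends up in observer-only mode, and any node that cannot verify the matrix stops.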


13. Post-boundary behavior

13.1 After the boundary, nodes SHOULD be in one of:

  • fully supported active version
  • restricted but allowed validation-only or observer-only state
  • failed closed due to incompatibility

13.2 Rule

Old-version consensus signers MUST NOT continue post-boundary if the matrix marks them as incompatible.


14. Signing gating model

14.1 A node MAY sign only if:

  • version is allowed to sign for current boundary state
  • release package scope matches active upgrade scope
  • key bindings are valid
  • local preflight for upgraded role passed
  • no decision ledger restriction blocks role

14.2 Rule

Signing eligibility after upgrade MUST be stricter than mere peer compatibility.
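
The conditions in 14.1 are conjunctive: a single failed check blocks signing. A minimal sketch, with hypothetical field and function names, that also reports which checks failed:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SigningPreconditions:
    # One flag per bullet in 14.1; how each is computed is out of scope here.
    version_allowed_for_boundary_state: bool
    release_scope_matches_upgrade_scope: bool
    key_bindings_valid: bool
    upgraded_role_preflight_passed: bool
    no_decision_ledger_restriction: bool


def signing_blockers(p: SigningPreconditions) -> list:
    """Names of all failed preconditions, in declaration order."""
    return [name for name, ok in asdict(p).items() if not ok]


def may_sign(p: SigningPreconditions) -> bool:
    """Signing is allowed only when every precondition holds."""
    return not signing_blockers(p)
```

Returning the list of blockers rather than a bare boolean makes signer eligibility failures easy to surface in rollout monitoring.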


15. Validation-only gating model

15.1 A node MAY remain validation-only if:

  • it can still parse and validate enough active semantics safely
  • matrix marks role as validation-only compatible
  • operator policy allows it
  • it does not emit incompatible signatures

15.2 Rule

Validation-only should not become an excuse for silently participating with outdated semantics.


16. Observer-only gating model

16.1 Observer-only mode MAY be allowed for:

  • lagging infrastructure
  • auditors
  • archival readers
  • external tooling
  • temporary compatibility windows

16.2 Rule

Observer-only nodes MUST be provably non-signing in contexts where signing is forbidden.


17. Fail-closed conditions

17.1 Nodes MUST fail-closed for critical roles if:

  • current active semantics are unsupported
  • activation boundary passed and matrix forbids role
  • release/genesis/upgrade scope mismatch exists
  • version support is uncertain
  • upgrade package or matrix cannot be verified
  • post-boundary mixed-fleet tolerance expired

17.2 Rule

“Likely still works” is not an acceptable justification for continuing to sign.
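
The fail-closed conditions in 17.1 can be sketched as a single predicate. The parameter names are hypothetical; the point of the illustration is that "uncertain" (modeled here as None) is handled exactly like "unsupported".

```python
from typing import Optional


def must_fail_closed(*, semantics_supported: Optional[bool],
                     boundary_passed: bool,
                     role_allowed_after_boundary: bool,
                     scope_matches: bool,
                     artifacts_verified: bool,
                     relied_on_tolerance_expired: bool) -> bool:
    """True if a critical role must stop, following the conditions listed in 17.1."""
    # None means "uncertain"; only an explicit True counts as supported.
    if semantics_supported is not True:
        return True
    if boundary_passed and not role_allowed_after_boundary:
        return True
    if not scope_matches:          # release/genesis/upgrade scope mismatch
        return True
    if not artifacts_verified:     # upgrade package or matrix could not be verified
        return True
    if boundary_passed and relied_on_tolerance_expired:
        return True
    return False
```

A node that passes every other check but reports semantics_supported=None still fails closed, which is the code-level counterpart of rejecting "likely still works".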


18. Upgrade threshold model

18.1 Rollout policy SHOULD define thresholds such as:

  • minimum upgraded proposer fraction
  • minimum upgraded verifier fraction
  • minimum upgraded notary fraction
  • minimum upgraded validator cluster coverage
  • mandatory named operators upgraded
  • minimum archive/observer availability

18.2 Rule

Thresholds SHOULD be role-aware. Notary threshold matters more than archive-only threshold.


19. Threshold evidence object

19.1 Canonical structure

UpgradeThresholdEvidence {
  evidence_id
  proposal_id
  observation_window_hash
  upgraded_role_counts_root
  upgraded_role_threshold_verdict
  timestamp_unix_ms
}

19.2 Rule

Cutover SHOULD rely on explicit threshold evidence, not optimistic fleet estimates.
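
A role-aware threshold verdict could be computed from observed counts roughly as follows. This is a sketch only: the threshold fractions and fleet counts are invented for illustration, and plain dictionaries stand in for upgraded_role_counts_root. Exact fractions are used so that a role sitting exactly on its threshold is not misjudged by floating-point rounding.

```python
from fractions import Fraction


def threshold_verdict(upgraded: dict, total: dict, minimum: dict) -> dict:
    """Per-role verdict: True if the upgraded fraction meets the role's minimum."""
    verdict = {}
    for role, required in minimum.items():
        population = total.get(role, 0)
        if population == 0:
            # No observed population means no evidence, so the role is not ready.
            verdict[role] = False
            continue
        verdict[role] = Fraction(upgraded.get(role, 0), population) >= required
    return verdict


# Illustrative numbers only.
minimum = {"notary": Fraction(9, 10), "verifier": Fraction(4, 5), "proposer": Fraction(4, 5)}
total = {"notary": 20, "verifier": 50, "proposer": 30}
upgraded = {"notary": 18, "verifier": 45, "proposer": 23}

per_role = threshold_verdict(upgraded, total, minimum)
cutover_ready = all(per_role.values())
```

In this example the notary and verifier thresholds are met, but 23 of 30 proposers is below four fifths, so the overall verdict is negative even though fleet-wide adoption is about 86%.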


20. Rollout readiness stages

20.1 Recommended stages

  • UR_STAGE_PREPARE
  • UR_STAGE_DISTRIBUTE
  • UR_STAGE_UPGRADE_IN_PROGRESS
  • UR_STAGE_THRESHOLD_REACHED
  • UR_STAGE_ACTIVATION_READY
  • UR_STAGE_BOUNDARY_ACTIVE
  • UR_STAGE_POST_BOUNDARY_STABILIZING
  • UR_STAGE_COMPLETE

20.2 Rule

Transition between stages SHOULD be backed by records and monitoring evidence.
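
The stage progression can be sketched as a simple ordered state machine in which every step requires an evidence reference. Strictly linear progression is an assumption made for this illustration (a deferral or rollback might need additional transitions), and next_stage is a hypothetical helper.

```python
# Stage names from 20.1, in their listed order.
STAGES = (
    "UR_STAGE_PREPARE",
    "UR_STAGE_DISTRIBUTE",
    "UR_STAGE_UPGRADE_IN_PROGRESS",
    "UR_STAGE_THRESHOLD_REACHED",
    "UR_STAGE_ACTIVATION_READY",
    "UR_STAGE_BOUNDARY_ACTIVE",
    "UR_STAGE_POST_BOUNDARY_STABILIZING",
    "UR_STAGE_COMPLETE",
)


def next_stage(current: str, evidence_ref: str) -> str:
    """Advance exactly one stage, and only when an evidence reference is supplied."""
    if not evidence_ref:
        raise ValueError("stage transition requires an evidence reference")
    index = STAGES.index(current)  # raises ValueError for an unknown stage
    if index == len(STAGES) - 1:
        raise ValueError("rollout is already complete")
    return STAGES[index + 1]
```

Each successful call would correspond to one new rollout record whose evidence or decision reference points at the supplied evidence.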


21. Rollout record object

21.1 Canonical structure

UpgradeRolloutRecord {
  rollout_id
  proposal_id
  rollout_stage
  active_matrix_id
  readiness_evidence_ref?
  decision_ref?
  timestamp_unix_ms
}

21.2 Rule

A serious network upgrade SHOULD have an explicit rollout record chain.


22. Operator upgrade checklist linkage

22.1 Operators SHOULD complete upgrade-specific checklists for:

  • artifact verification
  • matrix verification
  • boundary understanding
  • signing stop criteria
  • post-upgrade preflight
  • rejoin after upgrade

22.2 Rule

Operator rollout readiness SHOULD be linkable to checklist records, not only version reporting.


23. Release package linkage

23.1 Each upgrade rollout SHOULD bind exact:

  • target release package id
  • minimum required release versions
  • incompatible superseded releases
  • upgrade proposal id
  • matrix id

23.2 Rule

Version numbers alone SHOULD NOT be the sole truth source for rollout.


24. Activation decision linkage

24.1 Upgrade rollout SHOULD integrate with decision ledger decisions such as:

  • upgrade_hold
  • upgrade_proceed
  • upgrade_go
  • upgrade_defer
  • cutover_authorized
  • laggard_quarantine
  • rejoin_approved

24.2 Rule

Cutover without explicit decision linkage is operationally weak.


25. Monitoring during rollout

25.1 Rollout monitoring SHOULD track:

  • version adoption by role
  • signer eligibility failures
  • peer compatibility mismatch rates
  • validation divergence signals
  • no-finality risk during transition
  • rollback triggers if applicable
  • laggard node population
  • post-boundary failure-to-stop signals

25.2 Rule

Mixed-fleet monitoring MUST be stricter around activation boundary than during quiet pre-upgrade staging.


26. Laggard node policy

26.1 Laggard nodes are nodes that remain on an outdated version beyond the tolerated stage.

26.2 Policy SHOULD define:

  • when laggards are merely warned
  • when laggards are signing-forbidden
  • when laggards are quarantined
  • when laggards are dropped or treated as incompatible peers

26.3 Rule

Laggard treatment MUST be explicit and role-aware.
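
An explicit, role-aware laggard policy can be expressed as a per-role escalation schedule. Everything concrete below is hypothetical: the use of hours past a deadline as the trigger, the specific hour values, and the choice to give unknown roles the strictest treatment.

```python
from enum import IntEnum


class LaggardAction(IntEnum):
    NONE = 0
    WARN = 1
    SIGNING_FORBIDDEN = 2
    QUARANTINE = 3
    DROP = 4


# Hypothetical schedule: (hours past deadline, action that starts at that point).
SCHEDULE = {
    "ROLE_NOTARY": [(0, LaggardAction.SIGNING_FORBIDDEN),
                    (1, LaggardAction.QUARANTINE),
                    (24, LaggardAction.DROP)],
    "ROLE_OBSERVER": [(0, LaggardAction.WARN),
                      (72, LaggardAction.QUARANTINE),
                      (168, LaggardAction.DROP)],
}

# Assumption: roles without a schedule get the strictest treatment.
DEFAULT_SCHEDULE = [(0, LaggardAction.DROP)]


def laggard_action(role: str, hours_past_deadline: float) -> LaggardAction:
    """Return the strongest action whose start time has been reached."""
    if hours_past_deadline < 0:
        return LaggardAction.NONE
    action = LaggardAction.NONE
    for start, step in SCHEDULE.get(role, DEFAULT_SCHEDULE):
        if hours_past_deadline >= start:
            action = step
    return action
```

Under this illustrative schedule a lagging notary loses signing rights immediately, while a lagging observer is only warned for the first three days.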


27. Rollback model

27.1 Some compatible upgrades MAY support rollback if:

  • pre-boundary or immediate post-boundary safe window exists
  • no irreversible state migration has committed
  • decision ledger and policy explicitly allow it

27.2 Rule

Hard forks and irreversible migrations SHOULD assume rollback is not trivial.

27.3 Rule

Rollback policy MUST be explicit before rollout, not invented after trouble begins.


28. Transition-only compatibility

28.1 Some versions MAY be tolerated only in a narrow window:

  • can join pre-boundary
  • must stop signing at boundary
  • may remain observer-only briefly post-boundary

28.2 Rule

This class SHOULD be used explicitly for bridging releases and not confused with durable compatibility.


29. Hard fork rollout specifics

29.1 Hard fork rollout MUST define:

  • fork boundary
  • old/new identity treatment
  • replay protection
  • required release package set
  • node stop behavior on old chain if joining new chain
  • archive and audit requirements
  • post-fork monitoring window

29.2 Rule

A post-boundary mixed fleet after a hard fork MUST be treated as a separate-network coexistence problem, not as normal compatibility.


30. Version compatibility matrix publication

30.1 The matrix SHOULD be published as:

  • canonical matrix artifact
  • operator-readable summary
  • release notes linkage
  • rollout advisory linkage

30.2 Rule

A human-readable summary is helpful, but the canonical matrix object remains normative.


31. Conformance requirements for version transitions

31.1 Upgrades SHOULD include tests for:

  • old->new pre-boundary behavior
  • boundary cutover
  • post-boundary supported role behavior
  • unsupported old version fail-closed behavior
  • mixed-fleet tolerance if any
  • replay and migration correctness if applicable

31.2 Rule

Version transition semantics SHOULD be part of conformance corpus, not only operator docs.


32. Key and role interaction

32.1 Rollout MAY coincide with key rotation or role rebinding.

32.2 Rule

If upgrade changes key requirements, matrix and rollout policy MUST state:

  • whether old keys remain valid
  • whether new role bindings are required
  • whether signing must stop until new bindings active

32.3 Rule

Version rollout and key rotation combined can multiply risk and SHOULD be handled with extra explicitness.


33. Archive and audit linkage

33.1 A serious upgrade rollout SHOULD archive:

  • compatibility matrix
  • rollout policy
  • readiness evidence
  • threshold evidence
  • decision ledger subset
  • operator advisory bundle
  • activation and observation records
  • post-upgrade stabilization evidence

33.2 Rule

Future reviewers SHOULD be able to reconstruct exactly why mixed fleets were allowed or forbidden.


34. Compatibility summary matrix example semantics

34.1 Example rows

  • v1.2 -> v1.3 compatible gated, observers yes, validators yes, signers yes pre-boundary, old signers no post-boundary
  • v1.3 -> v2.0 hard fork separate network, observers old-chain only, new-chain join requires full upgrade
  • v1.3 -> v1.3.1 fully compatible, all roles yes, no boundary disruption

34.2 Rule

These summaries are examples only; actual matrix MUST be canonical and scope-bound.


35. Anti-patterns

Systems SHOULD avoid:

  1. saying “everyone please upgrade soon” with no matrix
  2. allowing old signers to continue after incompatible boundary
  3. mixed fleets with no monitoring or threshold evidence
  4. role eligibility not distinguished from observer compatibility
  5. rollback assumptions with irreversible migration
  6. version numbers changed with no activation record
  7. operator docs that conflict with matrix object
  8. hard fork with no explicit post-boundary peer treatment
  9. no laggard policy
  10. no archive of the rollout path

36. Formal goals

AZ-036 pursues these goals:

36.1 Deterministic version compatibility

Nodes and operators can determine exactly whether a version is allowed for each role at each stage.

36.2 Safe rollout

Upgrade rollout can happen with explicit thresholds, staged monitoring and predictable cutover behavior.

36.3 Controlled mixed fleets

Temporary coexistence of versions does not become accidental semantic divergence.

36.4 Audit-grade transition history

The network can later reconstruct exactly how it moved from one version regime to another.


37. Document formula

Network Upgrade Rollout = official compatibility matrix + role-aware gating + rollout thresholds + explicit activation boundary + mixed-fleet monitoring + fail-closed rules for unsupported signers


38. Relationship to the rest of the suite

  • AZ-034 defines the classification of upgrades and hard forks.
  • AZ-035 defines key and role discipline.
  • AZ-036 defines how the live network actually moves between versions, with no ambiguity about compatibility.

In short: AZ-034 says what changes and when; AZ-036 says who may still participate, and in what way, during the transition.


39. What comes next

After AZ-036, the next document is:

AZ-037 — Long-Term Archive Verification and Preservation Schedule

It should fix:

  • periodic archive verification;
  • rotation of storage media;
  • revalidation of manifests and hashes;
  • very long-term retention policies;
  • how to ensure that launch, upgrade, and incident archives remain verifiable years later.

Closing

A network does not upgrade safely just because new code exists. It upgrades safely when every node knows exactly which versions can coexist, when it may sign, when it must stay silent, and when an old version is no longer merely "older" but already incompatible.

That is where mature rollout between versions begins.