AZ-025 — Validator and Operator Launch Manual v1
Status
This document defines the operational launch manual for validators and operators in the ATLAS ZERO ecosystem.
With AZ-001 through AZ-024 in place, the following already exist:
- the protocol specification;
- the rules for validation, consensus, BVM, witness, economics, and governance;
- the security model and the incident runbooks;
- the concrete genesis package;
- the artifact vault and the release pipeline;
- the concrete conformance corpus;
- the implementation milestones.
AZ-025 answers the question: what exactly must a validator or operator do to start a node correctly, safely, and verifiably before, during, and immediately after launch?
The purpose of this document is to fix:
- the exact pre-launch steps;
- verification of the release package and the genesis package;
- preparation of keys and environment;
- integrity checks before boot;
- activation of consensus roles;
- monitoring of the first epochs;
- behavior in safe mode, incident, and recovery at the start of the network.
This document builds on:
- AZ-002 through AZ-024, with direct emphasis on AZ-015, AZ-016, AZ-017, AZ-021, and AZ-022.
Terms:
- MUST = mandatory
- MUST NOT = forbidden
- SHOULD = strongly recommended
- MAY = optional
1. Objective
AZ-025 answers 10 operational questions:
- What does the operator verify before starting the node?
- How does the operator validate the release package and the genesis package?
- How does the operator prepare keys, configuration, and the runtime environment?
- How does the operator start the node without introducing local ambiguities?
- How does the operator activate the proposer/verifier/notary roles?
- What checks does the operator run in the launch window and the first epochs?
- How does the operator respond to mismatches, missing finality, or invalid packages?
- How does the operator enter safe mode or local halt without falsifying protocol reality?
- How does the operator execute bootstrap, restart, and rejoin safely?
- What evidence and logs must the operator keep for audit?
2. Principles
2.1 Verify before boot
Before starting in a validator role, a node MUST verify:
- the release package,
- the genesis package,
- the binary,
- the configuration,
- the keys.
2.2 No local reinterpretation
The operator MUST NOT “guess”:
- which genesis is correct,
- which release is correct,
- which parameters should “probably” be used.
Everything must be verified from the canonical artifacts.
2.3 Separate local health from protocol truth
Local node problems MUST be treated as one of the following, never as changes to protocol truth:
- local degradation,
- local service halt,
- withdrawal from role.
2.4 Start conservative
In the launch window and the first epochs, the operator SHOULD put the following ahead of aggressive availability or performance tuning:
- safety,
- verification,
- observability.
2.5 Preserve evidence
Any significant anomaly MUST be logged and preserved for:
- incident response,
- audit,
- replay,
- a possible fraud proof.
3. Role classes covered by this manual
3.1 Covered roles
This manual primarily covers:
- full validation node operator
- validator operator
- proposer operator
- verifier operator
- notary operator
- archival/observer operator with launch duties
3.2 Additional operators
Some sections may also help:
- genesis custodian
- release manager
- validator bootstrap coordinator
- recovery operator
3.3 Rule
Every operator MUST know exactly which roles are enabled. Consensus roles are never started “implicitly”.
4. Launch phases
4.1 Operational phases
Operators SHOULD treat the launch as a sequence of phases:
- Artifact Intake
- Local Verification
- Environment Preparation
- Node Preflight
- Bootstrap Start
- Role Activation
- Launch Window Monitoring
- Early Epoch Stabilization
- Restricted Post-Launch Operation
- Normalization
4.2 Rule
Transitions between phases SHOULD be explicit and verifiable.
5. Required artifacts before launch
5.1 Every validator/operator SHOULD possess:
- exact release package
- exact genesis package
- exact release manifest
- exact genesis package manifest
- exact conformance corpus reference or launch-critical conformance evidence
- exact operator guide bundle if distributed separately
- signed/checksummed binary artifacts
- launch window instructions
- incident escalation contacts or hashes/references if policy uses them
5.2 Role-specific extras
A validator operator SHOULD also have:
- validator identity ref
- proposer/verifier/notary key refs or secure access to them
- local role configuration
- network boot peers or discovery config
- monitoring and log sink config
5.3 Rule
Missing mandatory launch artifacts MUST be treated as blocker before node role activation.
6. Artifact verification checklist
6.1 Before boot, operator MUST verify:
- release package manifest
- release artifact hashes
- release approvals/attestations
- genesis package manifest
- genesis package hashes
- genesis package attestations
- exact genesis_hash
- exact chain_id
- compatibility between release package and genesis package
- local binary hash matches approved release artifact
6.2 Rule
A validator SHOULD NOT join using a package set that has not been explicitly matched and validated.
7. Release package verification steps
7.1 Minimum checks
Operator SHOULD:
- verify published release manifest authenticity
- verify required artifacts present
- verify binary content_hash and canonical identity
- verify release candidate / final release approvals
- verify no revocation on release artifact set
- verify scope lock to intended target network
7.2 Rule
If release package validation fails, node MUST NOT start in consensus role.
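The hash checks in 6.1 and 7.1 can be sketched as follows. This is a minimal illustration, not the actual release tooling: the manifest layout (an `artifacts` mapping from file name to hex digest), the function names, and the use of SHA-256 are all assumptions made for the example, since this manual does not fix a manifest format or a hash function.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_release_artifacts(manifest: dict, artifact_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means every artifact matched.

    `manifest["artifacts"]` is a hypothetical mapping {file_name: expected_hex}.
    """
    expected_hashes = manifest.get("artifacts", {})
    if not expected_hashes:
        # An empty manifest must not pass silently.
        return ["manifest lists no artifacts"]
    problems = []
    for name, expected in expected_hashes.items():
        path = artifact_dir / name
        if not path.is_file():
            problems.append(f"missing artifact: {name}")
        elif file_sha256(path) != expected.lower():
            problems.append(f"hash mismatch: {name}")
    return problems
```

An empty result is necessary but not sufficient: manifest authenticity, approvals, revocation status, and scope lock still have to be checked against the canonical artifacts before any consensus role starts.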
8. Genesis package verification steps
8.1 Minimum checks
Operator MUST:
- load genesis package manifest
- verify package manifest integrity
- verify all required artifacts present
- recompute artifact hashes
- validate genesis_spec.blob
- recompute genesis_hash
- recompute chain_id
- recompute derived roots
- verify validator set bundle
- verify parameter state bundle
- verify registry and policy bundles
- verify attestation sufficiency
8.2 Rule
Any mismatch in genesis_hash, chain_id, derived roots or validator bundle MUST be treated as hard stop.
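The hard-stop rule in 8.2 can be sketched as a comparison between expected and locally recomputed values that fails closed. How genesis_hash, chain_id, and the derived roots are actually recomputed is defined elsewhere in the suite; the dictionary layout, the `validator_bundle_hash` field, and the exception name below are hypothetical.

```python
class GenesisHardStop(Exception):
    """Raised when a recomputed genesis value disagrees with the expected one."""


# Fields named as hard-stop conditions in 8.2 (field names are illustrative).
HARD_STOP_FIELDS = ("genesis_hash", "chain_id", "derived_roots",
                    "validator_bundle_hash")


def check_genesis(expected: dict, recomputed: dict) -> None:
    """Compare recomputed values with expected ones; raise on any mismatch.

    A field that is absent from `expected` counts as a mismatch, so a
    truncated manifest cannot pass by omission.
    """
    mismatched = [
        name for name in HARD_STOP_FIELDS
        if name not in expected or expected[name] != recomputed.get(name)
    ]
    if mismatched:
        raise GenesisHardStop("genesis mismatch in: " + ", ".join(mismatched))
```

Raising an exception instead of returning a flag makes it harder for calling code to ignore the result and continue into a consensus role.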
9. Compatibility check between release and genesis
9.1 Operator MUST verify:
- release package target network class == genesis package target network class
- release package chain_id compatibility == genesis package chain_id
- release binary protocol version supports genesis parameter state
- mainnet/public-testnet scope matches intended launch scope
- no superseded/revoked artifact still used
9.2 Rule
A node binary validated for one genesis scope MUST NOT be assumed valid for another launch scope without explicit linkage.
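The compatibility checks in 9.1 can be expressed as a single function that reports every violated condition. The two record types and their fields are hypothetical stand-ins for whatever metadata the release and genesis manifests actually carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseInfo:
    network_class: str                      # e.g. "mainnet" or "public-testnet"
    chain_id: str                           # chain_id the release was approved for
    supported_protocol_versions: frozenset  # versions the binary can run
    revoked: bool = False                   # superseded or revoked artifact set


@dataclass(frozen=True)
class GenesisInfo:
    network_class: str
    chain_id: str
    protocol_version: int                   # required by the genesis parameter state
    revoked: bool = False


def compatibility_problems(release: ReleaseInfo, genesis: GenesisInfo,
                           intended_scope: str) -> list[str]:
    """Return every 9.1 condition that does not hold (empty list = compatible)."""
    problems = []
    if release.network_class != genesis.network_class:
        problems.append("target network class differs")
    if release.chain_id != genesis.chain_id:
        problems.append("chain_id differs")
    if genesis.protocol_version not in release.supported_protocol_versions:
        problems.append("binary does not support genesis protocol version")
    if intended_scope != release.network_class or intended_scope != genesis.network_class:
        problems.append("launch scope does not match packages")
    if release.revoked or genesis.revoked:
        problems.append("superseded or revoked artifact in use")
    return problems
```

Collecting all problems instead of stopping at the first one gives the operator a complete picture to attach to an escalation if one is needed.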
10. Local environment preparation
10.1 Operator SHOULD prepare:
- isolated host or controlled environment
- pinned configuration
- correct clock synchronization
- storage paths
- snapshot/replay capacity
- logging sink
- metrics sink
- alert channels
- network/firewall rules
- secure key access path
10.2 Rule
Consensus-role nodes SHOULD run in environment with minimal undeclared mutable state.
11. Key preparation
11.1 Required key classes as applicable
- validator identity key
- proposer signing key
- verifier signing key
- notary signing key
- admin key for local service controls if used
- emergency local stop controls if used operationally
11.2 Rule
Role keys SHOULD be separated. Same hot key for all roles SHOULD be avoided, especially for production launch.
11.3 Pre-launch checks
Operator MUST verify:
- correct key loaded for correct role
- no wrong-network key mix-up
- key material accessibility path works
- signer process integrity
- backup/recovery or rotation plan exists
12. Time and environment sanity
12.1 Operator MUST verify:
- local clock within tolerated drift
- timezone assumptions not affecting protocol configuration
- host identity and networking consistent
- adequate disk space
- write permissions to required paths
- snapshot path available
- log sink writable
12.2 Rule
A node with broken time sync or unstable storage SHOULD NOT enter consensus role.
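Several of the 12.1 checks lend themselves to automation. The sketch below covers clock drift, path availability, write permissions, and free disk space. The thresholds and function name are illustrative, and the clock offset is assumed to be measured elsewhere (for example by the host's time synchronization service) and passed in.

```python
import os
import shutil
from pathlib import Path


def environment_problems(data_dir: Path, log_dir: Path, min_free_bytes: int,
                         clock_offset_s: float, max_drift_s: float) -> list[str]:
    """Return the environment problems that should keep a node out of consensus."""
    problems = []
    # Clock: the offset is supplied by the caller; this only applies the policy.
    if abs(clock_offset_s) > max_drift_s:
        problems.append(
            f"clock offset {clock_offset_s:.3f}s exceeds {max_drift_s:.3f}s")
    # Paths: each must exist and allow creating files inside it.
    for label, directory in (("data", data_dir), ("log", log_dir)):
        if not directory.is_dir():
            problems.append(f"{label} path missing: {directory}")
        elif not os.access(directory, os.W_OK | os.X_OK):
            problems.append(f"{label} path not writable: {directory}")
    # Disk: measured on the filesystem that holds the data directory.
    if data_dir.is_dir() and shutil.disk_usage(data_dir).free < min_free_bytes:
        problems.append("insufficient free disk space")
    return problems
```

`os.access` reports what the current user may do at the moment of the call, so this check should run under the same account that the node process uses.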
13. Local configuration policy
13.1 Local config SHOULD include only:
- node role settings
- network endpoints / peers
- storage paths
- telemetry endpoints
- safe mode / local halt controls
- key access references
- resource limits
13.2 Local config MUST NOT redefine:
- genesis truth
- protocol parameters
- chain identity
- release identity
- activation boundaries
13.3 Rule
If local config appears to override protocol truth, the launch must stop and the operator configuration must be reviewed.
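One way to enforce 13.1 and 13.2 mechanically is to scan the parsed local configuration for sections it should not contain and for keys that would redefine protocol truth. The section names and key names below are hypothetical; the real configuration schema is not defined in this manual.

```python
# Hypothetical names standing in for the categories listed in 13.1 and 13.2.
ALLOWED_SECTIONS = {"roles", "network", "storage", "telemetry",
                    "safe_mode", "keys", "resources"}
FORBIDDEN_KEYS = {"genesis_hash", "chain_id", "protocol_parameters",
                  "release_id", "activation_boundaries"}


def _keys_with_paths(node, path=""):
    """Yield (dotted_path, key) for every mapping key at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield child, key
            yield from _keys_with_paths(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from _keys_with_paths(value, f"{path}[{index}]")


def config_violations(config: dict) -> list[str]:
    """Return unexpected top-level sections first, then forbidden keys."""
    violations = [f"unexpected section: {name}"
                  for name in config if name not in ALLOWED_SECTIONS]
    violations += [f"forbidden key: {path}"
                   for path, key in _keys_with_paths(config)
                   if key in FORBIDDEN_KEYS]
    return violations
```

A name-based scan only catches overrides that use a known key name, so it complements a manual review of the configuration; it does not replace one.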
14. Node preflight
14.1 Before full boot, operator SHOULD run preflight mode that checks:
- binary hash
- release package linkage
- genesis package linkage
- storage readiness
- key presence
- network configuration sanity
- telemetry path
- snapshot restore ability if relevant
- local config syntax/semantic validity
- role enablement policy
14.2 Rule
Preflight failures MUST block consensus-role startup.
15. Preflight verdicts
15.1 Recommended verdicts
- PREFLIGHT_OK
- PREFLIGHT_OK_WITH_WARNINGS
- PREFLIGHT_BLOCKED
- PREFLIGHT_SCOPE_MISMATCH
- PREFLIGHT_KEY_FAILURE
- PREFLIGHT_ARTIFACT_FAILURE
- PREFLIGHT_ENV_FAILURE
15.2 Rule
Consensus roles MAY start only with PREFLIGHT_OK, or with narrowly defined PREFLIGHT_OK_WITH_WARNINGS classes approved by launch policy.
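The gate in 15.2 can be sketched as an enumeration plus one decision function. The verdict names come from 15.1. The notion of named warning classes and the `approved_warnings` set are assumptions about how launch policy might express which warnings are tolerable.

```python
from enum import Enum


class PreflightVerdict(Enum):
    PREFLIGHT_OK = "PREFLIGHT_OK"
    PREFLIGHT_OK_WITH_WARNINGS = "PREFLIGHT_OK_WITH_WARNINGS"
    PREFLIGHT_BLOCKED = "PREFLIGHT_BLOCKED"
    PREFLIGHT_SCOPE_MISMATCH = "PREFLIGHT_SCOPE_MISMATCH"
    PREFLIGHT_KEY_FAILURE = "PREFLIGHT_KEY_FAILURE"
    PREFLIGHT_ARTIFACT_FAILURE = "PREFLIGHT_ARTIFACT_FAILURE"
    PREFLIGHT_ENV_FAILURE = "PREFLIGHT_ENV_FAILURE"


def may_start_consensus(verdict: PreflightVerdict,
                        warning_classes=frozenset(),
                        approved_warnings=frozenset()) -> bool:
    """Decide whether consensus roles may start for a given preflight verdict."""
    if verdict is PreflightVerdict.PREFLIGHT_OK:
        return True
    if verdict is PreflightVerdict.PREFLIGHT_OK_WITH_WARNINGS:
        # Every reported warning must be explicitly approved; a warning
        # verdict that names no warning classes is treated as not approved.
        return bool(warning_classes) and set(warning_classes) <= set(approved_warnings)
    return False
```

Any verdict not explicitly handled falls through to `False`, so adding a new failure verdict later cannot accidentally open the gate.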
16. Bootstrap start sequence
16.1 Recommended sequence
- verify artifacts
- verify config
- run preflight
- initialize local data stores
- load genesis package
- derive genesis state and roots
- compare derived values with expected values
- open network connections
- sync or confirm initial protocol view
- enter validation-only mode first
- activate consensus role only after local and network sanity checks
16.2 Rule
Nodes SHOULD avoid jumping directly into proposer/notary role before initial validation-only bootstrap.
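The ordering in 16.1 matters because each step assumes the previous ones succeeded. A minimal fail-fast runner, with hypothetical step callables supplied by the operator's tooling, might look like this:

```python
from typing import Callable, Iterable, List, Tuple

# A step is a (name, check) pair; `check` returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_bootstrap(steps: Iterable[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first one that fails.

    Later steps are never executed after a failure, so a node whose
    artifacts did not verify never reaches the network or role steps.
    """
    log: List[str] = []
    for name, check in steps:
        if not check():
            log.append(f"FAILED: {name}")
            return False, log
        log.append(f"ok: {name}")
    return True, log
```

The returned log doubles as evidence of which steps ran and in what order, which supports the audit requirements later in this manual.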
17. Validation-only bootstrap
17.1 Purpose
Allows the node to confirm the following without yet affecting the network by proposing or notarizing:
- it understands the network correctly,
- it sees the expected genesis,
- it derives the same protocol state.
17.2 Recommended checks in this phase
- peer compatibility
- chain_id match
- genesis_hash match
- initial parameter state match
- validator role eligibility match
- no local replay mismatch
17.3 Rule
If any of these checks fail, operator MUST remain out of consensus role.
18. Role activation policy
18.1 Role activation SHOULD be explicit per role:
- validation active
- proposer active
- verifier active
- notary active
18.2 Rule
Notary role SHOULD be activated last among consensus roles unless launch process explicitly requires simultaneous activation and tooling guarantees readiness.
18.3 Additional caution
If node is healthy enough to validate but not fully healthy enough to sign, operator SHOULD keep signing roles disabled.
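The activation policy in section 18 can be sketched as a small state holder that refuses unsafe transitions. The class and its method names are hypothetical. It encodes four rules from this manual: no implicit roles, validation first, no signing role without signing-grade health, and notary last among the configured signing roles.

```python
SIGNING_ROLES = {"proposer", "verifier", "notary"}


class RoleActivation:
    def __init__(self, configured: set):
        # Only roles the operator explicitly configured may ever be activated.
        self.configured = set(configured)
        self.active = set()

    def activate(self, role: str, healthy_to_sign: bool = False) -> None:
        if role not in self.configured:
            raise ValueError(f"role not configured: {role}")
        if role == "validation":
            self.active.add(role)
            return
        if "validation" not in self.active:
            raise RuntimeError("validation-only bootstrap must come first")
        if not healthy_to_sign:
            raise RuntimeError(f"node not healthy enough to sign as {role}")
        if role == "notary":
            # Notary goes last: every other configured signing role
            # must already be active.
            pending = (self.configured & SIGNING_ROLES) - self.active - {"notary"}
            if pending:
                raise RuntimeError(
                    "activate before notary: " + ", ".join(sorted(pending)))
        self.active.add(role)
```

This sketch does not model the exception in 18.2 for launches that explicitly require simultaneous activation; a real implementation would need a policy-controlled override for that case.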
19. Proposer activation checklist
19.1 Before enabling proposer role, operator SHOULD verify:
- mempool/candidate pool healthy
- local state current
- peer connectivity acceptable
- no unresolved preflight warnings in consensus-critical scope
- proposer key reachable and correct
- telemetry for block production active
19.2 Rule
A node with uncertain local state SHOULD NOT propose.
20. Verifier activation checklist
20.1 Before enabling verifier role, operator SHOULD verify:
- candidate validation path passes self-checks
- vote signing path correct
- consensus state current
- fraud proof logging enabled
- replay path available for anomaly investigation
20.2 Rule
Verifier role SHOULD be disabled if node cannot deterministically reproduce validation path under launch conditions.
21. Notary activation checklist
21.1 Before enabling notary role, operator MUST verify:
- reexecution path healthy
- finality threshold and committee view correct
- notary key correct and isolated
- notarization logs and evidence capture active
- no unresolved validation divergence
- no suspicious launch anomaly active
21.2 Rule
Notary role is highest-risk among core launch roles and SHOULD be activated only after strongest confidence checks.
22. Peer and network checks
22.1 Operator SHOULD verify:
- peers report expected chain identity
- enough peer diversity
- no obvious partition
- acceptable latency
- expected launch peers reachable
- peer software versions acceptable per launch policy
22.2 Rule
A node connected mostly to mismatched or suspicious peers SHOULD NOT activate consensus roles.
23. Launch window monitoring
23.1 In launch window, operator SHOULD monitor:
- finalized epoch cadence
- block proposal acceptance/rejection patterns
- verifier/notary participation metrics
- invalid object rates
- BVM failure rates
- witness/proof anomalies
- governance activation anomalies
- local resource saturation
- key/signing path health
23.2 Rule
Launch window monitoring MUST be higher-sensitivity than steady-state operation.
24. Early epoch checks
24.1 During first epochs, operator SHOULD confirm:
- expected genesis anchored
- first finalized roots consistent
- validator participation expected
- no unexplained no-finality
- no deterministic replay mismatch
- no artifact scope mismatch discovered post-start
24.2 Rule
If early epoch truth is uncertain, operator SHOULD step down to validation-only or local safe mode rather than continue signing blindly.
25. Launch anomaly classes
25.1 Recommended classes
- artifact mismatch
- genesis mismatch
- validator set mismatch
- parameter state mismatch
- consensus participation anomaly
- no-finality anomaly
- BVM divergence anomaly
- witness/proof anomaly
- governance anomaly
- local environment anomaly
- key/signer anomaly
25.2 Rule
Every anomaly class SHOULD map to an operator action profile.
26. Operator action profiles
26.1 Standard profiles
- OP_OBSERVE
- OP_VALIDATION_ONLY
- OP_DISABLE_PROPOSER
- OP_DISABLE_SIGNING_ALL
- OP_LOCAL_SAFE_MODE
- OP_LOCAL_SERVICE_HALT
- OP_ESCALATE_INCIDENT
- OP_RECOVERY_REPLAY
26.2 Rule
Operators SHOULD choose the least dangerous profile that preserves protocol truth and local evidence.
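The rule in 25.2, that every anomaly class maps to an action profile, can be checked mechanically. In the sketch below the class names come from 25.1 and the profile names from 26.1, but the specific assignments are placeholders chosen for illustration: the real mapping is a launch policy decision that this manual does not fix.

```python
ANOMALY_CLASSES = (
    "artifact mismatch", "genesis mismatch", "validator set mismatch",
    "parameter state mismatch", "consensus participation anomaly",
    "no-finality anomaly", "BVM divergence anomaly", "witness/proof anomaly",
    "governance anomaly", "local environment anomaly", "key/signer anomaly",
)

# Placeholder assignments only; replace with the approved launch policy.
ACTION_PROFILES = {
    "artifact mismatch": "OP_ESCALATE_INCIDENT",
    "genesis mismatch": "OP_ESCALATE_INCIDENT",
    "validator set mismatch": "OP_VALIDATION_ONLY",
    "parameter state mismatch": "OP_VALIDATION_ONLY",
    "consensus participation anomaly": "OP_OBSERVE",
    "no-finality anomaly": "OP_ESCALATE_INCIDENT",
    "BVM divergence anomaly": "OP_ESCALATE_INCIDENT",
    "witness/proof anomaly": "OP_DISABLE_SIGNING_ALL",
    "governance anomaly": "OP_ESCALATE_INCIDENT",
    "local environment anomaly": "OP_LOCAL_SAFE_MODE",
    "key/signer anomaly": "OP_DISABLE_SIGNING_ALL",
}

# Used when an anomaly class is not recognized; also an assumption.
DEFAULT_PROFILE = "OP_DISABLE_SIGNING_ALL"


def unmapped_classes() -> list[str]:
    """Return anomaly classes that have no action profile assigned."""
    return [name for name in ANOMALY_CLASSES if name not in ACTION_PROFILES]


def profile_for(anomaly_class: str) -> str:
    """Look up the profile, falling back to a conservative default."""
    return ACTION_PROFILES.get(anomaly_class, DEFAULT_PROFILE)
```

Running `unmapped_classes()` as part of preflight would catch a policy table that has drifted out of step with the anomaly class list.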
27. Local safe mode
27.1 Local safe mode MAY include:
- disable proposer
- disable verifier/notary signing
- keep network and validation alive
- keep metrics and logs alive
- freeze admin changes
- preserve snapshots
- increase alert sensitivity
27.2 Rule
Local safe mode MUST be clearly local. It does not alter network protocol rules.
28. Local service halt
28.1 Purpose
Stop dangerous or broken local components.
28.2 May include stopping:
- proposer service
- notary service
- RPC write endpoints
- local agent integrations
- indexer or explorer adjunct
28.3 Rule
Local service halt SHOULD be used if continuing to sign or submit is riskier than going temporarily dark.
29. Incident escalation
29.1 Operator MUST escalate when:
- genesis mismatch discovered
- binary/release mismatch discovered
- deterministic divergence suspected
- no-finality persists beyond threshold
- conflicting notarization seen
- key compromise suspected
- impossible governance activation seen
- BVM consensus-critical anomaly seen
29.2 Escalation package SHOULD include:
- node identity
- role status
- exact artifact ids and hashes
- relevant logs
- state roots
- observed anomaly class
- timestamps
- any preserved evidence refs
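The escalation package in 29.2 maps naturally onto a structured record. The field names and the JSON serialization below are illustrative; the actual escalation format, if one is mandated, would come from the incident runbooks.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """Current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class EscalationPackage:
    node_identity: str
    role_status: dict        # e.g. {"proposer": "disabled", "notary": "disabled"}
    artifact_hashes: dict    # exact artifact ids mapped to their hashes
    anomaly_class: str       # one of the classes from section 25
    state_roots: dict = field(default_factory=dict)
    log_refs: list = field(default_factory=list)
    evidence_refs: list = field(default_factory=list)
    observed_at: str = field(default_factory=_utc_now)

    def to_json(self) -> str:
        # Sorted keys keep the output stable, which makes packages easy to diff.
        return json.dumps(asdict(self), sort_keys=True, indent=2)
```

Making the first four fields mandatory means an escalation cannot be constructed without stating which node, which roles, which artifacts, and which anomaly class are involved.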
30. Replay and rebuild actions
30.1 If the operator suspects local corruption or divergence, the operator SHOULD:
- disable signing roles
- preserve current logs and snapshots
- identify last trusted finalized checkpoint
- replay from trusted checkpoint
- compare derived roots and receipts
- re-evaluate whether node can rejoin safely
30.2 Rule
A node MUST NOT resume signing after replay mismatch without explicit incident handling and resolution.
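The root comparison in 30.1 and the rule in 30.2 can be sketched as follows. Representing checkpoints as a mapping from height to root hash is an assumption made for the example, as are the function names.

```python
from typing import Optional


def first_divergence(trusted: dict, replayed: dict) -> Optional[int]:
    """Return the lowest height where replayed roots differ from trusted ones.

    Both arguments map checkpoint height -> root hash (hypothetical layout).
    A height missing from the replay counts as a divergence.
    """
    if not trusted:
        raise ValueError("no trusted checkpoints to compare against")
    for height in sorted(trusted):
        if replayed.get(height) != trusted[height]:
            return height
    return None


def may_resume_signing(trusted: dict, replayed: dict,
                       incident_resolved: bool = False) -> bool:
    """Signing may resume only on a clean replay or after a resolved incident."""
    return first_divergence(trusted, replayed) is None or incident_resolved
```

The `incident_resolved` flag stands in for whatever explicit sign-off the incident process produces; it should never be set by the node itself.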
31. Restart procedure
31.1 Safe restart sequence
- preserve state and logs
- verify artifacts unchanged
- verify no local config drift
- run preflight again
- restore from last good checkpoint if needed
- start validation-only
- re-enable signing roles gradually
31.2 Rule
Crash/restart MUST NOT imply immediate automatic re-entry into all signing roles unless launch policy explicitly permits and health checks pass.
32. Rejoin procedure after downtime
32.1 Operator SHOULD:
- verify current release/genesis scope still same
- verify local binary still valid for active network scope
- sync state and compare finalized checkpoints
- run replay spot-checks if downtime significant or incident occurred
- enter validation-only first
- re-enable roles only after healthy sync
32.2 Rule
Rejoin after suspicious downtime SHOULD be conservative.
33. Snapshot policy for operators
33.1 Before launch, operator SHOULD ensure:
- initial empty/pre-genesis local snapshot policy defined
- post-genesis snapshot available or derivable
- periodic finalized checkpoint snapshots enabled
- pre-restart and pre-recovery snapshots possible
33.2 Rule
Snapshots used operationally MUST be tied to canonical roots and trusted package scope.
34. Logging and audit requirements
34.1 Launch-time logs SHOULD capture:
- binary identity
- release package id
- genesis package id
- genesis_hash
- chain_id
- role enablement events
- preflight verdict
- first peer compatibility checks
- first finalized epoch observations
- anomalies and local safe mode transitions
34.2 Rule
If a validator cannot later prove which exact artifacts it launched with, launch audit quality is insufficient.
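To make the rule in 34.2 enforceable, a node can refuse to write a launch record that lacks the identifying fields. The field names below are hypothetical labels for the items in 34.1, and the single-line JSON output is one possible log format, not a mandated one.

```python
import json

# Identity fields without which a launch cannot be audited later.
REQUIRED_FIELDS = ("binary_hash", "release_package_id", "genesis_package_id",
                   "genesis_hash", "chain_id", "preflight_verdict")


def launch_record(**fields) -> str:
    """Serialize a launch record as one JSON line, or fail if it is incomplete."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("launch record incomplete: " + ", ".join(missing))
    return json.dumps(fields, sort_keys=True)
```

Extra fields such as role enablement events or first peer checks pass through unchanged, so the same function can serve later entries as long as they repeat the identity fields.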
35. Communication discipline
35.1 Operator communications SHOULD distinguish:
- local node issue
- release artifact issue
- genesis package issue
- network-wide consensus issue
- observability-only issue
35.2 Rule
Do not label local misconfiguration as protocol fault until evidence supports it.
36. Genesis ceremony and launch ceremony integration
36.1 Operators SHOULD treat the following as explicit ceremony steps, not informal chat confirmations:
- package verification
- checksum/root confirmation
- role readiness confirmation
36.2 Recommended confirmations
- “verified genesis_hash”
- “verified chain_id”
- “verified binary hash”
- “validator role ready”
- “notary role ready”
- “monitoring live”
- “incident path staffed”
36.3 Rule
Ceremony statements SHOULD map to actual checks, not ritual words.
37. Launch blockers for individual operators
37.1 An operator MUST NOT activate consensus role if:
- release artifact mismatch
- genesis package mismatch
- key mapping incorrect
- preflight blocked
- state store unhealthy
- clock drift severe
- signer unavailable or misconfigured
- validator set eligibility unclear
- telemetry/incident path absent for critical roles
37.2 Rule
Individual no-go is preferable to unsafe participation.
38. Post-launch restricted posture
38.1 For early epochs/days, operators SHOULD:
- avoid unnecessary config changes
- avoid unnecessary binary changes
- keep signing roles conservative
- increase snapshot frequency
- monitor more aggressively
- require stricter internal approval for local modifications
38.2 Rule
Early launch is stabilization period, not optimization period.
39. Change control during launch window
39.1 Operators SHOULD NOT during launch window:
- swap binaries casually
- modify local role mappings ad hoc
- change trusted package source
- alter genesis files
- patch configs that affect consensus semantics
- rotate keys without recorded reason and process
39.2 Allowed emergency changes
Only those required by incident response and already covered by runbook/process.
40. Minimal operator checklist summary
40.1 Before launch
- verify release package
- verify genesis package
- verify binary hash
- verify chain_id and genesis_hash
- verify keys and roles
- verify preflight
- verify monitoring
- verify incident path
40.2 At bootstrap
- start validation-only
- verify peers and state
- enable roles in controlled order
- watch first finalized epochs
40.3 If anomaly
- preserve evidence
- disable risky roles
- escalate
- replay/recover if needed
41. Anti-patterns
Operators SHOULD avoid:
- starting from unverified downloaded binaries
- hand-editing genesis or package files
- enabling all roles at once before validation-only checks
- continuing to sign after replay mismatch
- assuming peer majority means local node is correct
- mixing local debug config into launch production config
- treating missing monitoring as acceptable for notary role
- restarting into full signing mode automatically after crash
- using same unprotected hot key for all roles
- improvising launch confirmations without actual artifact verification
42. Formal goals
AZ-025 pursues the following goals:
42.1 Safe operator bootstrap
Validators and operators can start nodes without introducing artifact or configuration ambiguity.
42.2 Launch-role discipline
Consensus roles activate only after explicit verification and readiness checks.
42.3 Evidence-preserving anomaly handling
Early launch anomalies are contained and investigated without destroying useful evidence.
42.4 Rejoin safety
Nodes can restart or rejoin conservatively without silently poisoning consensus with local uncertainty.
43. Document formula
Validator/Operator Launch Manual = verify artifacts + verify environment + preflight node + activate roles conservatively + monitor first epochs + preserve evidence on anomaly
44. Relationship to the rest of the suite
- AZ-022 defines the concrete genesis package.
- AZ-017 defines the launch criterion.
- AZ-025 defines how operators and validators actually execute that launch.
In short: AZ-017 says when you are allowed to launch; AZ-025 says how to start the nodes without breaking the launch.
45. What comes next
After AZ-025, the next document is:
AZ-026 — Genesis Ceremony and Launch Ceremony Protocol
That document should fix:
- the formal ceremony steps,
- who confirms what,
- in what order,
- how approvals are closed,
- and how the transition from package verification to network start is officially marked.
Closing
A good launch manual does not just say “start the node”. It says exactly: what you verify, what you must not assume, when you are allowed to sign, when you must stop, and what evidence you keep if something does not match.
That is where truly disciplined validator operation begins.