Runbooks Index
Catalog of AngaraBase operator runbooks. All runbooks are versioned with the code and updated with each release train.
By Category
Lifecycle
| Runbook | When to use |
|---|---|
| Upgrade and migration | Before a version upgrade — pre-flight, rolling, verification |
| MVCC and GC operator minimum | AngaraGC setup, visibility diagnostics |
Reliability
| Runbook | When to use |
|---|---|
| Backup and restore | Regular backup, base/PITR restore, verification |
| Disaster recovery playbook | Full instance loss, host migration, restore oracle |
| Replication v2 operations guide | Managing AngaraReplica v2 |
Performance
| Runbook | When to use |
|---|---|
| Performance tuning guide | Targeted workload optimization |
| Parallel runtime observability | Parallel execution diagnostics, DOP caps |
| jemalloc heap profiling | Investigating memory growth |
Observability
| Runbook | When to use |
|---|---|
| Observability metrics checklist | Configure the minimum metric/alert set |
| Diagnostics bundle | Collect artifacts during an incident |
| Troubleshooting guide | Symptom → cause → action |
| Alert runbooks (RM-0.6.3.8 S7) | Per-alert remediation: backing pages for each runbook_url in tools/observability/alerts/angarabase_alerts.yaml |
Security
| Runbook | When to use |
|---|---|
| Security operations baseline | Regular security checks, knobs registry |
| Hardening | Move an instance to production-ready security configuration |
Reference (operator)
| Document | What it contains |
|---|---|
| Configuration schema reference | Full registry of TOML/env parameters |
| Client compatibility baseline | Tested clients, known limitations |
| Known issues baseline | Operator-level known issues |
| Operational policies baseline | Production policy baseline |
Validation
| Document | When to use |
|---|---|
| Testing and validation baseline | Acceptance checks before production |
| Golden dataset management | Managing golden datasets |
| CI reproducibility contract | Artifact reproducibility contract |
By Symptom (Quick Navigation)
| Symptom | Where to look |
|---|---|
| Server does not start | Troubleshooting → Configuration → Crash recovery |
| Queries became slower | Performance tuning → Diagnostics → Diagnostics bundle |
| 0A000 feature_not_supported error | SQL compatibility → Known issues |
| Disk usage is growing | MVCC and GC operator minimum → Diagnostics |
| RSS is growing / OOM kills | jemalloc profiling → Configuration |
| Backup or restore failed | Backup and restore → Disaster recovery |
| Authentication / RLS / audit behave unexpectedly | Security operations → Security model |
| Client / ORM problem | Client compatibility → SQL compatibility |
| Suspected data corruption | Verify release artifacts → Disaster recovery |
If the runbook did not help
Collect a diagnostics bundle and contact us through the Support flow.
Next
- Troubleshooting guide — symptom index and first actions.
- Disaster recovery playbook — “lost lease / damaged datadir” scenarios.
- Diagnostics bundle runbook — how to collect everything needed for escalation in one package.