Operations Overview
The canonical AngaraBase operations corpus for DBAs and SREs. It collects runbooks, baselines, and checklists that are kept in sync with the code and release trains.
If you are just starting, begin with the user guides in Operations (How-to): they are shorter and work well as an entry point. This section is for operator-level deep dives.
How to Navigate
| Task | Where to go |
|---|---|
| Bring up an instance from scratch | Installation → Configuration |
| Start a container / minimal k8s deployment | Container deployment quickstart |
| Move to production | Operational policies baseline → Hardening → Security operations baseline |
| Configure monitoring and alerts | Observability metrics checklist → Parallel runtime observability |
| Investigate a production issue | Troubleshooting guide → Diagnostics bundle runbook → Error debug runbook |
| Read query plans | How to read query plans → Performance tuning guide |
| Optimize performance | Performance tuning guide → How to read query plans → Parallel runtime observability |
| Backup / restore | Backup and restore (operator-level) → Disaster recovery playbook |
| Upgrade to a new version | Upgrade and migration |
| Connect an unfamiliar client / ORM | Client compatibility baseline |
| Prepare a diagnostics bundle for a bug report | Diagnostics bundle runbook → Support |
Canonical Operations Pages
Lifecycle
- Upgrade and migration — pre-flight, rolling steps, verification.
- MVCC and GC operator minimum — AngaraGC behavior and operator knobs.
- Checkpoint operations — managing the checkpoint process.
Reliability
- Container deployment quickstart — image-first startup, cgroup probe, minimal k8s smoke.
- Backup and restore — operator-level baseline (cold + online/PITR).
- Disaster recovery playbook — DR scenarios, host migration.
- Replication v2 operations guide — AngaraReplica v2.
Performance
- Performance tuning guide — workload-driven knobs, what to measure first.
- Statistics and ANALYZE — statistics collection and persistence.
- How to read query plans — EXPLAIN, operators, diagnostics, cache/replan signals.
- Parallel runtime observability runbook — DOP caps, partitioned join.
- jemalloc heap profiling runbook — memory diagnostics.
Observability
- Observability metrics checklist — required minimum metrics.
- Diagnostics bundle runbook — what to collect during an incident.
Security
- Security operations baseline — security knobs registry, regular checks.
Reference
- Configuration schema reference — all TOML/env parameters with types and defaults.
- Client compatibility baseline — list of tested clients and caveats.
- Known issues baseline — operator-level known issues.
- Operational policies baseline — production-policy baseline.
Troubleshooting
- Troubleshooting guide — common incidents and actions.
- Runbooks index — table of contents for all runbooks.
Validation
- Testing and validation baseline — what to check before production.
- Golden dataset management — managing golden data.
- CI reproducibility contract — build reproducibility contract.
Links
- Architecture overview — how the database is structured (operations context).
- Security model — the full security model.
- SQL compatibility — supported SQL boundaries.
- Support — how to report a problem.