Architecture Overview
This document is a map of the AngaraBase architecture as-is: what major subsystems exist, how an SQL query flows through them, and where the boundaries of responsibility lie. For a user-facing introduction, see AngaraBase Architecture.
High-Level Components
| Component | What it does |
|---|---|
| angarabased | Server adapter: pgwire protocol, listener, connection and session management |
| angarabase (engine core) | Parse/bind/plan/execute, transactions, storage API, WAL/recovery primitives |
| angara-cli | CLI for administration (identity, ops via admin endpoint) |
| Operational surface | Configuration, metrics, logs, diagnostic bundles, upgrade policies |
Full layering contract and dependency rules: Layering and Boundaries.
Query Flow (Simplified)
flowchart LR
C[Client/Driver] -- pgwire --> S[angarabased adapter]
S -- SQL + session ctx --> E[angarabase engine core]
E --> P[Parse / Bind / Plan / Execute]
P --> Sec[Security: RBAC + RLS]
P --> T[Txn / MVCC]
P --> St[Storage API]
P --> Stat[Stats / CBO feedback]
T --> Wal[WAL / Recovery]
St --> Wal
Wal --> IO[IO / fsync contract]
E -- rows / errors --> S
S -- pgwire responses --> C
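
The adapter/core split in the flow above can be illustrated with a minimal sketch. Everything here is hypothetical and chosen for illustration only (`SessionCtx`, `Result`, `EngineCore`, `Adapter`, and the text-based "framing" are not AngaraBase APIs); the point is only that the adapter decodes bytes and routes, while the core receives SQL plus session context and returns rows or an explicit error.

```python
from dataclasses import dataclass, field


@dataclass
class SessionCtx:
    """Hypothetical session context passed from the adapter to the core."""
    user: str
    database: str


@dataclass
class Result:
    rows: list = field(default_factory=list)
    sqlstate: str = "00000"  # standard SQLSTATE for successful completion
    message: str = ""


class EngineCore:
    """Stand-in for the engine core: it never sees wire-protocol bytes."""

    def execute(self, sql: str, ctx: SessionCtx) -> Result:
        # A real core would parse, bind, plan and execute here.
        if sql.strip().upper().rstrip(";") == "SELECT 1":
            return Result(rows=[(1,)])
        # Anything this toy core cannot handle is reported explicitly.
        return Result(sqlstate="0A000", message="feature not supported")


class Adapter:
    """Stand-in for the server adapter: framing and routing, no SQL logic."""

    def __init__(self, core: EngineCore):
        self.core = core

    def handle_query(self, payload: bytes, ctx: SessionCtx) -> bytes:
        sql = payload.decode("utf-8")          # "framing": bytes to text
        result = self.core.execute(sql, ctx)   # routing to the core
        if result.sqlstate != "00000":
            return f"ERROR {result.sqlstate}: {result.message}".encode("utf-8")
        return f"ROWS {result.rows}".encode("utf-8")
```

With this shape, the wire protocol could be swapped without touching `EngineCore`, which is the property the layering contract is meant to protect.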
Key Architectural Decisions
| Area | Decision | Why |
|---|---|---|
| MVCC | UNDO-log (history is a separate append-only log; heap contains only current versions) | Less bloat, no heavy VACUUM, deterministic GC |
| Storage | Pluggable: row-store baseline + AngaraMemory; AngaraColumn on the roadmap | HTAP direction, different tiers for different workloads |
| Recovery | WAL-first, idempotent replay, fail-closed when WAL integrity cannot be verified | Correctness is more important than latency |
| Optimizer | Cost-based AngaraPlan + LEO feedback loop, robust planning | Resilience to estimation errors |
| Execution | Volcano streaming (AngaraFlow) + vector path (AngaraVector) | Separation by plan shapes, explicit management via EXPLAIN |
| Catalog | Persisted SysCatalog, DDL survives restart | Predictability for production |
| Security | 6-layer model: TLS/Auth → RBAC → RLS → Break-glass → Audit chain → TDE | Defence-in-depth, fail-closed |
| Backup | Per-database, cold + online/PITR baseline | Multi-tenant isolation |
| Distribution | Single-node engine; distributed SQL is on the roadmap for future major branches | Focus on correctness first |
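
To make the MVCC row of the table more concrete, here is a toy sketch of the UNDO-log idea: the heap keeps only the current version of each key, and older versions are pushed to a separate append-only log that readers walk when their snapshot is older than the current version. The class names and the simplification that transaction ids are commit-ordered and always committed are assumptions for illustration, not a description of the actual engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UndoRecord:
    value: str
    txn_id: int                   # transaction that wrote this older version
    prev: Optional["UndoRecord"]  # next-older version, if any


@dataclass
class HeapTuple:
    value: str
    txn_id: int                   # transaction that wrote the current version
    prev: Optional[UndoRecord] = None


class Table:
    def __init__(self):
        self.heap = {}       # key -> HeapTuple (current versions only)
        self.undo_log = []   # append-only history of superseded versions

    def write(self, key, value, txn_id):
        current = self.heap.get(key)
        undo = None
        if current is not None:
            # Move the superseded version into the undo log before overwriting.
            undo = UndoRecord(current.value, current.txn_id, current.prev)
            self.undo_log.append(undo)
        self.heap[key] = HeapTuple(value, txn_id, undo)

    def read(self, key, snapshot_txn_id):
        """Return the newest value written by a txn_id <= snapshot_txn_id."""
        version = self.heap.get(key)
        while version is not None and version.txn_id > snapshot_txn_id:
            version = version.prev  # step back through the undo chain
        return None if version is None else version.value
```

In this model the heap never grows with the number of versions, which is the intuition behind the "less bloat" rationale; reclaiming history amounts to trimming the undo log once no snapshot can reach it.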
Boundaries and Invariants
- angarabased (adapter) does not contain SQL logic: only pgwire framing, session ctx, and routing to the core.
- angarabase core does not know about pgwire: it communicates via the core API contract.
- Storage does not perform MVCC visibility, only heap I/O. Visibility is computed by the MVCC layer.
- Index does not determine visibility — it only points to the TID; visibility is always rechecked against the heap.
- Any unsupported SQL construct returns an explicit SQLSTATE (0A000, etc.): no silent bypasses.
- Public API: pgwire + admin endpoint. Internal modules are an implementation detail and may change.
Architectural constraints and do-not-block rules: Architecture Constraints.
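
The storage, index, and visibility invariants above can be sketched as follows. This is an illustrative model only: `Heap`, `is_visible`, `index_scan`, the `created_by`/`deleted_by` fields, and the representation of a snapshot as a set of visible transaction ids are all assumptions, not the engine's real structures. The index yields candidate TIDs, storage returns raw tuples, and a separate MVCC function decides what the snapshot may see.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeapTuple:
    row: tuple
    created_by: int                   # txn id that created the tuple
    deleted_by: Optional[int] = None  # txn id that deleted it, if any


class Heap:
    """Storage stand-in: pure I/O by TID, no visibility decisions."""

    def __init__(self):
        self._tuples = {}

    def put(self, tid, tup: HeapTuple):
        self._tuples[tid] = tup

    def fetch(self, tid) -> Optional[HeapTuple]:
        return self._tuples.get(tid)


def is_visible(tup: HeapTuple, snapshot: set) -> bool:
    """MVCC-layer stand-in: snapshot is the set of txn ids visible to the reader."""
    if tup.created_by not in snapshot:
        return False
    return tup.deleted_by is None or tup.deleted_by not in snapshot


def index_scan(index: dict, heap: Heap, key, snapshot: set) -> list:
    rows = []
    for tid in index.get(key, []):   # the index only points to TIDs
        tup = heap.fetch(tid)        # recheck against the heap
        if tup is not None and is_visible(tup, snapshot):
            rows.append(tup.row)
    return rows
```

Because the index entry alone never produces a row, a stale or not-yet-visible entry in the index cannot leak data to a reader in this model.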
Reliability and Physical Portability
- Cold/offline backup and restore — full-instance copy at the data-directory level (see Backup/Restore).
- Host migration without pg_dump/pg_restore: copy + verify + start. More details in Crash recovery.
- Identity rehearsal: every release goes through the rehearsal upgrade pipeline.
- Page checksums + WAL CRC — corruption detection upon reading/recovery.
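
As a rough illustration of checksum-based corruption detection combined with idempotent, fail-closed replay, consider the sketch below. The record layout (CRC, then LSN, page id, payload), the use of CRC-32 from Python's `zlib`, and the names `encode_record`, `decode_record`, `replay`, and `WalCorruption` are assumptions made for this example; the source does not specify the actual on-disk format or checksum algorithm.

```python
import struct
import zlib


class WalCorruption(Exception):
    """Raised when a record fails its CRC check; replay stops (fail-closed)."""


def encode_record(lsn: int, page_id: int, payload: bytes) -> bytes:
    # Hypothetical layout: [crc32: u32][lsn: u64][page_id: u32][payload]
    body = struct.pack("<QI", lsn, page_id) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def decode_record(raw: bytes):
    (stored_crc,) = struct.unpack_from("<I", raw, 0)
    body = raw[4:]
    if zlib.crc32(body) != stored_crc:
        raise WalCorruption("WAL record CRC mismatch")
    lsn, page_id = struct.unpack_from("<QI", body, 0)
    return lsn, page_id, body[12:]  # header is 8 + 4 = 12 bytes


def replay(records, pages: dict) -> dict:
    """Apply records to pages, where pages maps page_id -> (page_lsn, data)."""
    for raw in records:
        lsn, page_id, payload = decode_record(raw)  # raises on corruption
        page_lsn, _ = pages.get(page_id, (0, b""))
        if lsn <= page_lsn:
            continue  # already applied: replaying twice changes nothing
        pages[page_id] = (lsn, payload)
    return pages
```

Comparing each record's LSN with the page's LSN is what makes a second replay a no-op, and raising on the first bad CRC rather than skipping the record reflects the fail-closed stance described in this document.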
Additional Resources
- Layering and Boundaries — official layering contract.
- AngaraBase Architecture (user-facing) — overview for users and DBAs.
- Project Principles — the ideological compass of the project.