AngaraBase Architecture
This document explains AngaraBase’s internal design in enough detail to support practical decisions: choosing configuration, diagnosing issues, and assessing whether AngaraBase fits your workload.
Detailed technical specification: docs/01_ARCHITECTURE.md.
Multi-layered Architecture
AngaraBase consists of six layers. Each layer has its own API and depends only on the layers below it. This allows implementations (e.g., storage engine) to be swapped out without changing the other layers.
┌──────────────────────────────────────────────────────────────┐
│ TIER 1: CLIENT LAYER (Wire Protocol) │
│ │
│ • pgwire protocol (compatibility with psql, JDBC, etc.) │
│ • connection pooling │
│ • async event loop │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 2: SESSION / TRANSACTION LAYER │
│ │
│ • sessions and session variables │
│ • transaction management (BEGIN/COMMIT/ROLLBACK/SAVEPOINT) │
│ • isolation levels (READ COMMITTED, REPEATABLE READ) │
│ • locks and deadlock detection │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 3: QUERY EXECUTION LAYER │
│ │
│ • SQL query parsing │
│ • semantic validation and type checking │
│ • planning and optimization (AngaraPlan) │
│ • physical plan execution (AngaraFlow) │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 4: CATALOG & TYPE SYSTEM │
│ │
│ • registry of tables, schemas, databases │
│ • registry of types, functions, operators │
│ • index registry (access methods) │
│ • system views sys.* │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 5: STORAGE LAYER (Pluggable Storage) │
│ │
│ • row-store engine (OLTP baseline) │
│ • pluggable: in-memory and column-store (planned) │
│ • indexes (AngaraTree: B+tree, BRIN) │
│ • Transaction Log (WAL) — transaction journal │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 6: SYSTEM LAYER │
│ │
│ • buffer manager and page cache │
│ • metrics and telemetry │
│ • crash recovery │
│ • resource scheduler (CPU, memory, I/O) │
└──────────────────────────────────────────────────────────────┘
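The layering rule can be illustrated with a small sketch. The Python below is not AngaraBase code: the StorageEngine interface, the two toy engines, and the Executor class are hypothetical names invented for this example. It only shows why an upper layer that talks to a lower layer through an interface keeps working when the implementation underneath is swapped.

```python
from typing import Iterator, Protocol


class StorageEngine(Protocol):
    """Hypothetical storage-layer interface (not the real AngaraBase API)."""

    def insert(self, table: str, row: dict) -> None: ...

    def scan(self, table: str) -> Iterator[dict]: ...


class RowStore:
    """Toy row-oriented engine: each table is a list of row dicts."""

    def __init__(self) -> None:
        self._tables = {}

    def insert(self, table: str, row: dict) -> None:
        self._tables.setdefault(table, []).append(dict(row))

    def scan(self, table: str) -> Iterator[dict]:
        yield from self._tables.get(table, [])


class ColumnStore:
    """Toy column-oriented engine: each table is a dict of column lists.

    Assumes every row of a table has the same set of columns.
    """

    def __init__(self) -> None:
        self._tables = {}

    def insert(self, table: str, row: dict) -> None:
        columns = self._tables.setdefault(table, {name: [] for name in row})
        for name, value in row.items():
            columns[name].append(value)

    def scan(self, table: str) -> Iterator[dict]:
        columns = self._tables.get(table, {})
        names = list(columns)
        # Re-assemble rows by walking all column lists in parallel.
        for values in zip(*columns.values()):
            yield dict(zip(names, values))


class Executor:
    """Upper layer: depends only on the StorageEngine interface."""

    def __init__(self, storage: StorageEngine) -> None:
        self._storage = storage

    def count_where(self, table: str, column: str, value: object) -> int:
        return sum(1 for row in self._storage.scan(table) if row[column] == value)


def demo(engine: StorageEngine) -> int:
    """Run the same workload against whichever engine is passed in."""
    engine.insert("sale_order", {"id": 1, "state": "draft"})
    engine.insert("sale_order", {"id": 2, "state": "done"})
    engine.insert("sale_order", {"id": 3, "state": "done"})
    return Executor(engine).count_where("sale_order", "state", "done")
```

Calling demo(RowStore()) and demo(ColumnStore()) gives the same answer, because Executor never looks past the interface.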
What this means for you
- TIER 1: you connect using a standard PostgreSQL client; no special tools are needed.
- TIER 2: transactions work as usual: BEGIN, COMMIT, ROLLBACK, SAVEPOINT.
- TIER 3: SQL queries go through a parser, optimizer, and executor. EXPLAIN shows the execution plan.
- TIER 4: metadata about tables, types, and indexes is accessible via sys.* system views (e.g. SELECT * FROM sys.tables).
- TIER 5: data is stored in a pluggable storage engine. Currently, it’s a row-store; in future versions, you’ll be able to choose the engine when creating a table.
- TIER 6: buffer, metrics, and recovery are part of the infrastructure layer working transparently. You interact with it via configuration and monitoring.
Named Components
Key AngaraBase subsystems have their own names. This simplifies diagnostics, documentation, and configuration — when you see a name in the logs or metrics, you know which part of the system it refers to.
| Component | What it does | Status |
|---|---|---|
| AngaraTree | Indexes: B+tree, BRIN | Available |
| AngaraStat | Table statistics: NDV, histograms, MCV | Available |
| AngaraPlan | Cost-based query optimizer | Available |
| AngaraFlow | Streaming query execution (iterator/Volcano model) | Available |
| AngaraIO | Async I/O pipeline (storage, WAL, prefetch) | Available |
| AngaraGC | MVCC garbage collection (cleaning up obsolete row versions) | Available |
| AngaraVector | Vectorized execution (SIMD-optimized) | Available |
| AngaraParallel | Parallel query execution | Available |
| AngaraMemory | In-memory storage engine | Available |
Example: if EXPLAIN shows AngaraTree: Index Scan, it means the query is using a B+tree index.
If AngaraGC appears in the logs, it’s the garbage collector for obsolete row versions.
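The table describes AngaraFlow as following the iterator (Volcano) model. The sketch below shows the general idea of that model with Python generators. It is not AngaraFlow’s implementation, and the operator names (seq_scan, filter_rows, project, limit) are made up for illustration. Each operator pulls rows from its child one at a time, so rows stream through the plan without intermediate results being materialized.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

Row = dict


def seq_scan(source: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: yields stored rows one by one."""
    for row in source:
        yield row


def filter_rows(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Passes through only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: list) -> Iterator[Row]:
    """Keeps only the requested columns of each row."""
    for row in child:
        yield {name: row[name] for name in columns}


def limit(child: Iterator[Row], count: int) -> Iterator[Row]:
    """Stops pulling from the child after `count` rows."""
    yield from islice(child, count)


def run_example():
    """Roughly: SELECT id FROM t WHERE amount > 50 LIMIT 2 over 1000 rows."""
    scanned = []  # records which rows the scan actually produced

    def source():
        for i in range(1, 1001):
            scanned.append(i)
            yield {"id": i, "amount": i * 10}

    plan = limit(
        project(
            filter_rows(seq_scan(source()), lambda row: row["amount"] > 50),
            ["id"],
        ),
        2,
    )
    return list(plan), len(scanned)
```

Because limit stops pulling once it has two rows, the scan at the bottom of the plan reads only a small prefix of the 1000 source rows rather than the whole table.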
Data Model
AngaraBase uses a four-level hierarchy (similar to MS SQL Server):
Instance (angarabased process)
└─ Database
   └─ Schema
      └─ Table
- Instance: a single running angarabased process. It can contain multiple databases.
- Database: an isolated unit with its own data files, transaction log, and settings. Backup and restore operate at the level of an individual database.
- Schema: a logical grouping of tables within a database (the default is public).
- Table: the object that holds the data rows.
Example:
angarabased (instance)
├─ Database "odoo_prod"
│ ├─ Schema "public"
│ │ ├─ Table "res_partner"
│ │ ├─ Table "sale_order"
│ │ └─ ...
│ └─ Schema "staging"
│ └─ ...
├─ Database "analytics"
│ └─ Schema "public"
│ └─ ...
└─ System catalog (sys.*)
Each database is independent: a backup of odoo_prod does not affect analytics, and vice versa.
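One way to picture the hierarchy is as a three-part address for every table. The sketch below is a hypothetical illustration, not AngaraBase behavior: the resolve_name function, the dotted database.schema.table notation, and the rule that shorter names fall back to the current database and the public schema are all assumptions made for the example.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    database: str
    schema: str
    table: str


def resolve_name(name: str, current_database: str, default_schema: str = "public") -> TableRef:
    """Expand a dotted name to a full (database, schema, table) reference.

    Assumed rules (hypothetical, for illustration only):
      table                  -> current database, default schema
      schema.table           -> current database
      database.schema.table  -> fully qualified
    """
    parts = name.split(".")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"invalid table name: {name!r}")
    if len(parts) == 1:
        return TableRef(current_database, default_schema, parts[0])
    if len(parts) == 2:
        return TableRef(current_database, parts[0], parts[1])
    return TableRef(parts[0], parts[1], parts[2])
```

For a session connected to odoo_prod, resolve_name("res_partner", "odoo_prod") would yield the reference (odoo_prod, public, res_partner) under these assumed rules.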
Configuration Hierarchy
Settings in AngaraBase are applied at three levels, from broadest to narrowest:
Instance (angarabase.conf)
└─ Database (ALTER DATABASE ... SET ...)
   └─ Session (SET ...)
A narrower level overrides a broader one:
- Instance: server settings (port, memory limits, file paths). Some of these require a restart.
- Database: database-specific settings (limits, storage parameters). Applied without a restart.
- Session: settings for the current connection (SET timezone = 'Europe/Moscow'). Active until the session ends.
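The override order can be modeled as a lookup that checks the narrowest level first. The following is a minimal sketch using Python’s collections.ChainMap. The setting names and values (work_mem_mb, port 5432, and so on) are hypothetical placeholders, not actual AngaraBase parameters, and the real server may resolve settings differently.

```python
from collections import ChainMap

# Hypothetical setting names and values, for illustration only.
instance_settings = {"timezone": "UTC", "work_mem_mb": 64, "port": 5432}
database_settings = {"work_mem_mb": 256}          # as if set via ALTER DATABASE ... SET ...
session_settings = {"timezone": "Europe/Moscow"}  # as if set via SET ...

# ChainMap searches its maps left to right, so the narrowest level goes first.
effective = ChainMap(session_settings, database_settings, instance_settings)


def source_of(name: str) -> str:
    """Report which level supplies the effective value of a setting."""
    levels = (
        ("session", session_settings),
        ("database", database_settings),
        ("instance", instance_settings),
    )
    for level, settings in levels:
        if name in settings:
            return level
    raise KeyError(name)
```

Here effective["timezone"] comes from the session, effective["work_mem_mb"] from the database, and effective["port"] from the instance. Removing the session entry makes the lookup fall back to the next level, which mirrors a session setting expiring when the connection closes.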
What this means in practice
The AngaraBase architecture follows several principles that affect daily operations:
- Connect with standard tools. pgwire compatibility means you don’t need custom drivers or libraries. psql, DBeaver, your Python or Java application: everything connects just as it does to standard PostgreSQL.
- Per-database isolation. Each database is an independent unit for backup, restore, and configuration. This is handy for multi-tenant scenarios: each client can have their own DB with individual settings and a separate backup schedule.
- Clear diagnostics. System views sys.* provide access to metadata and system state. Named components (AngaraTree, AngaraPlan, etc.) are reflected in EXPLAIN, logs, and metrics, so you always know what part of the system is involved.
- Pluggable storage. A row-store (optimized for OLTP) is available now. In future versions, you’ll be able to choose the storage engine when creating a table: in-memory for hot data, column-store for analytics.
- Fail-closed behavior. If an SQL construct isn’t supported, you’ll get an explicit error with an SQLSTATE code, not an unexpected result. This is predictable and safe for production.
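The fail-closed principle can be sketched in a few lines. The code below is a toy dispatcher, not AngaraBase’s parser: SqlError, dispatch, and the set of supported statements are hypothetical. The SQLSTATE values 0A000 (feature not supported) and 42601 (syntax error) are standard codes also used by PostgreSQL; whether AngaraBase returns exactly these codes is an assumption.

```python
class SqlError(Exception):
    """Error carrying a five-character SQLSTATE code."""

    def __init__(self, sqlstate: str, message: str) -> None:
        super().__init__(f"[{sqlstate}] {message}")
        self.sqlstate = sqlstate


FEATURE_NOT_SUPPORTED = "0A000"  # standard SQLSTATE; its use here is an assumption
SYNTAX_ERROR = "42601"           # standard SQLSTATE; its use here is an assumption

# Hypothetical toy set of statements this dispatcher knows how to run.
SUPPORTED_STATEMENTS = {"SELECT", "INSERT", "UPDATE", "DELETE"}


def dispatch(sql: str) -> str:
    """Return the statement keyword, or fail loudly for anything unsupported."""
    words = sql.split()
    if not words:
        raise SqlError(SYNTAX_ERROR, "empty statement")
    keyword = words[0].upper()
    if keyword not in SUPPORTED_STATEMENTS:
        # Fail closed: refuse explicitly instead of guessing at a behavior.
        raise SqlError(FEATURE_NOT_SUPPORTED, f"statement {keyword} is not supported")
    return keyword
```

A client can then branch on the sqlstate attribute of the exception rather than parsing message text, which is the practical benefit of returning a code with every refusal.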
Additional Resources
- Canonical architecture doc: docs/01_ARCHITECTURE.md (complete technical specification).
- Storage Engine: concepts/storage-engine.md (row-store, pages, pluggable storage).
- Query Processing: concepts/query-processing.md (parser, planner, optimizer, executor).
- Catalog and Metadata: concepts/catalog-and-metadata.md (SysCatalog and system views).