AngaraBase Architecture

This document describes AngaraBase’s internal design in enough detail to support practical decisions: choosing configuration, diagnosing issues, and assessing whether the system fits your workload.

Detailed technical specification: docs/01_ARCHITECTURE.md.


Multi-layered Architecture

AngaraBase consists of six layers. Each layer has its own API and depends only on the layers below it. This allows implementations (e.g., storage engine) to be swapped out without changing the other layers.

┌──────────────────────────────────────────────────────────────┐
│  TIER 1: CLIENT LAYER (Wire Protocol)                        │
│                                                              │
│  • pgwire protocol (compatibility with psql, JDBC, etc.)     │
│  • connection pooling                                        │
│  • async event loop                                          │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────┴────────────────────────────────────┐
│  TIER 2: SESSION / TRANSACTION LAYER                         │
│                                                              │
│  • sessions and session variables                            │
│  • transaction management (BEGIN/COMMIT/ROLLBACK/SAVEPOINT)  │
│  • isolation levels (READ COMMITTED, REPEATABLE READ)        │
│  • locks and deadlock detection                              │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────┴────────────────────────────────────┐
│  TIER 3: QUERY EXECUTION LAYER                               │
│                                                              │
│  • SQL query parsing                                         │
│  • semantic validation and type checking                     │
│  • planning and optimization (AngaraPlan)                    │
│  • physical plan execution (AngaraFlow)                      │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────┴────────────────────────────────────┐
│  TIER 4: CATALOG & TYPE SYSTEM                               │
│                                                              │
│  • registry of tables, schemas, databases                    │
│  • registry of types, functions, operators                   │
│  • index registry (access methods)                           │
│  • system views sys.*                                        │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────┴────────────────────────────────────┐
│  TIER 5: STORAGE LAYER (Pluggable Storage)                   │
│                                                              │
│  • row-store engine (OLTP baseline)                          │
│  • pluggable: in-memory and column-store (planned)           │
│  • indexes (AngaraTree: B+tree, BRIN)                        │
│  • Transaction Log (WAL): transaction journal                │
└─────────────────────────┬────────────────────────────────────┘
                          │
┌─────────────────────────┴────────────────────────────────────┐
│  TIER 6: SYSTEM LAYER                                        │
│                                                              │
│  • buffer manager and page cache                             │
│  • metrics and telemetry                                     │
│  • crash recovery                                            │
│  • resource scheduler (CPU, memory, I/O)                     │
└──────────────────────────────────────────────────────────────┘
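
The rule that each layer depends only on the layers below it can be checked mechanically. The following is a minimal sketch, not AngaraBase code: the short layer names and the dependency edges are hypothetical and exist only to illustrate the rule shown in the diagram above.

```python
# Tier numbers follow the diagram: 1 is the top layer, 6 is the bottom layer.
# The short names are hypothetical labels chosen for this sketch.
LAYERS = {
    "client": 1,
    "session": 2,
    "query": 3,
    "catalog": 4,
    "storage": 5,
    "system": 6,
}


def violations(dependencies):
    """Return the (user, used) pairs that break the 'only lower layers' rule."""
    return [
        (user, used)
        for user, used in dependencies
        # A dependency is allowed only if it points to a strictly lower tier.
        if LAYERS[used] <= LAYERS[user]
    ]


# Hypothetical dependency edges, for illustration only.
deps = [("query", "catalog"), ("query", "storage"), ("storage", "system")]
print(violations(deps))                           # no upward references
print(violations(deps + [("storage", "query")]))  # storage must not call query
```

Because nothing below the storage tier refers back up to the query tier, a storage engine can be replaced without touching the layers above it.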

What this means for you

  • TIER 1: you connect using a standard PostgreSQL client — no special tools needed.
  • TIER 2: transactions work as usual — BEGIN, COMMIT, ROLLBACK, SAVEPOINT.
  • TIER 3: SQL queries go through a parser, optimizer, and executor. EXPLAIN shows the execution plan.
  • TIER 4: metadata about tables, types, and indexes is accessible via sys.* system views (e.g. SELECT * FROM sys.tables).
  • TIER 5: data is stored in a pluggable storage engine. Currently, it’s a row-store; in future versions, you’ll be able to choose the engine when creating a table.
  • TIER 6: the buffer manager, metrics, and crash recovery run transparently in the infrastructure layer. You interact with them through configuration and monitoring.
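
Since TIER 1 speaks the PostgreSQL wire protocol, a client should be able to address AngaraBase with an ordinary PostgreSQL connection URI. The sketch below only builds such a URI. The host, port, and user are placeholders invented for this example, not AngaraBase defaults; the database name is taken from the example later in this document.

```python
from urllib.parse import quote


def pg_uri(user, host, port, database):
    """Build a PostgreSQL-style connection URI from its parts."""
    # Percent-encode the user and database names so special characters are safe.
    return f"postgresql://{quote(user, safe='')}@{host}:{port}/{quote(database, safe='')}"


# Placeholder values: substitute the address of your own instance.
uri = pg_uri("app", "db.example.internal", 5432, "odoo_prod")
print(uri)  # postgresql://app@db.example.internal:5432/odoo_prod
```

A URI of this shape is what psql and most PostgreSQL drivers accept as a connection string.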

Named Components

Key AngaraBase subsystems have their own names. This simplifies diagnostics, documentation, and configuration — when you see a name in the logs or metrics, you know which part of the system it refers to.

Component | What it does | Status
AngaraTree | Indexes: B+tree, BRIN | Available
AngaraStat | Table statistics: NDV, histograms, MCV | Available
AngaraPlan | Cost-based query optimizer | Available
AngaraFlow | Streaming query execution (iterator/Volcano model) | Available
AngaraIO | Async I/O pipeline (storage, WAL, prefetch) | Available
AngaraGC | MVCC garbage collection (cleaning up obsolete row versions) | Available
AngaraVector | Vectorized execution (SIMD optimization) | Available
AngaraParallel | Parallel query execution | Available
AngaraMemory | In-memory storage engine | Available
Example: if EXPLAIN shows AngaraTree: Index Scan, it means the query is using a B+tree index. If AngaraGC appears in the logs, it’s the garbage collector for obsolete row versions.
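
When scanning logs or plans, the component names can be turned into a quick lookup. The sketch below is an illustration only: the descriptions are copied from the component table above, while the regular expression and the sample log line are assumptions, since the exact log and EXPLAIN formats are not specified here.

```python
import re

# Descriptions copied from the component table in this document.
COMPONENTS = {
    "AngaraTree": "indexes (B+tree, BRIN)",
    "AngaraStat": "table statistics (NDV, histograms, MCV)",
    "AngaraPlan": "cost-based query optimizer",
    "AngaraFlow": "streaming query execution",
    "AngaraIO": "async I/O pipeline",
    "AngaraGC": "MVCC garbage collection",
    "AngaraVector": "vectorized execution",
    "AngaraParallel": "parallel query execution",
    "AngaraMemory": "in-memory storage engine",
}

# Assumed naming pattern: "Angara" followed by a capitalized suffix.
_NAME = re.compile(r"\bAngara[A-Z][A-Za-z]*\b")


def components_in(line):
    """Return the known component names mentioned in a line, in order of appearance."""
    return [name for name in _NAME.findall(line) if name in COMPONENTS]


for name in components_in("AngaraTree: Index Scan"):
    print(name, "->", COMPONENTS[name])
```

Names that match the pattern but are not in the table, such as the product name AngaraBase itself, are ignored.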


Data Model

AngaraBase uses a four-level hierarchy (similar to MS SQL Server):

Instance (angarabased process)
 └─ Database
     └─ Schema
         └─ Table
  • Instance — a single running angarabased process. It can contain multiple databases.
  • Database — an isolated database. Each DB has its own data files, transaction log, and settings. Backup and restore operate on the individual database level.
  • Schema — logical grouping of tables within a database (default is public).
  • Table — a table containing data.

Example:

angarabased (instance)
 ├─ Database "odoo_prod"
 │   ├─ Schema "public"
 │   │   ├─ Table "res_partner"
 │   │   ├─ Table "sale_order"
 │   │   └─ ...
 │   └─ Schema "staging"
 │       └─ ...
 ├─ Database "analytics"
 │   └─ Schema "public"
 │       └─ ...
 └─ System catalog (sys.*)

Each database is independent: a backup of odoo_prod does not affect analytics, and vice versa.
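
The hierarchy can be modeled as nested mappings to show how a table name resolves. This is a toy sketch, not the real catalog: it assumes a lookup is scoped to a single database and that an unqualified name falls back to the default schema public. Whether AngaraBase accepts cross-database names is not stated in this document.

```python
DEFAULT_SCHEMA = "public"

# Toy catalog mirroring the example tree: database -> schema -> set of tables.
instance = {
    "odoo_prod": {
        "public": {"res_partner", "sale_order"},
        "staging": set(),
    },
    "analytics": {"public": set()},
}


def resolve(instance, database, name):
    """Resolve 'table' or 'schema.table' inside one database to a full path."""
    # rpartition yields an empty schema when the name has no dot.
    schema, _, table = name.rpartition(".")
    schema = schema or DEFAULT_SCHEMA
    if table not in instance[database].get(schema, set()):
        raise LookupError(f"{database}.{schema}.{table} does not exist")
    return (database, schema, table)


print(resolve(instance, "odoo_prod", "res_partner"))
print(resolve(instance, "odoo_prod", "public.sale_order"))
```

The same table name looked up in analytics raises LookupError, which reflects that each database holds its own set of schemas and tables.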


Configuration Hierarchy

Settings in AngaraBase are applied at three levels, from broadest to narrowest:

Instance (angarabase.conf)
 └─ Database (ALTER DATABASE ... SET ...)
     └─ Session (SET ...)

A narrower level overrides a broader one:

  • Instance — server settings (port, memory limits, file paths). Some of these require a restart.
  • Database — database-specific settings (limits, storage parameters). Applied without a restart.
  • Session — settings for the current connection (SET timezone = 'Europe/Moscow'). Active until the session ends.
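
The override order can be sketched with a layered lookup in which the narrowest level is consulted first. The setting names and values other than timezone are hypothetical and serve only to illustrate the precedence; this is not how AngaraBase stores its configuration.

```python
from collections import ChainMap

# Hypothetical settings at each level, for illustration only.
instance_conf = {"timezone": "UTC", "statement_timeout": "0", "port": "5432"}
database_conf = {"statement_timeout": "30s"}  # as if set by ALTER DATABASE ... SET ...
session_conf = {"timezone": "Europe/Moscow"}  # as if set by SET timezone = '...'

# ChainMap searches its maps left to right, so the narrowest level goes first.
effective = ChainMap(session_conf, database_conf, instance_conf)


def source_of(name):
    """Report which level supplies the effective value of a setting."""
    levels = (
        ("session", session_conf),
        ("database", database_conf),
        ("instance", instance_conf),
    )
    for level, conf in levels:
        if name in conf:
            return level
    raise KeyError(name)


for name in ("timezone", "statement_timeout", "port"):
    print(name, "=", effective[name], "from", source_of(name))
```

When the session ends, its overrides disappear and the database or instance value becomes effective again, which matches the "active until the session ends" rule above.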

What this means in practice

The AngaraBase architecture follows several principles that affect daily operations:

  • Connect with standard tools. pgwire compatibility means you don’t need custom drivers or libraries: psql, DBeaver, and your Python or Java application all connect just as they would to PostgreSQL.

  • Per-database isolation. Each database is an independent unit for backup, restore, and configuration. This is handy for multi-tenant scenarios: each client can have their own DB with individual settings and a separate backup schedule.

  • Clear diagnostics. System views sys.* provide access to metadata and system state. Named components (AngaraTree, AngaraPlan, etc.) are reflected in EXPLAIN, logs, and metrics — you always know what part of the system is involved.

  • Pluggable storage. A row-store (optimized for OLTP) is available now. In future versions, you’ll be able to choose the storage engine when creating a table — in-memory for hot data, column-store for analytics.

  • Fail-closed behavior. If an SQL construct isn’t supported, you get an explicit error with an SQLSTATE code rather than an unexpected result. This makes behavior predictable and safe for production.
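
On the client side, fail-closed errors can be handled by inspecting the SQLSTATE code. The sketch below uses stand-ins throughout: DatabaseError and fake_execute are hypothetical and replace a real driver, and FROBNICATE is a made-up statement. Class 0A means "feature not supported" in the SQL standard, but whether AngaraBase reports unsupported constructs under that class is an assumption.

```python
class DatabaseError(Exception):
    """Minimal stand-in for a driver exception carrying a five-character SQLSTATE."""

    def __init__(self, sqlstate, message):
        super().__init__(message)
        self.sqlstate = sqlstate


def is_unsupported_feature(error):
    """True for SQLSTATE class 0A (assumed here to mark unsupported constructs)."""
    return error.sqlstate.startswith("0A")


def fake_execute(sql):
    """Stand-in for a driver call; rejects one made-up statement."""
    if sql.upper().startswith("FROBNICATE"):
        raise DatabaseError("0A000", "FROBNICATE is not supported")
    return "ok"


def run_or_fallback(execute, sql, fallback_sql):
    """Run sql; if the server reports it as unsupported, run fallback_sql instead."""
    try:
        return execute(sql)
    except DatabaseError as error:
        if is_unsupported_feature(error):
            return execute(fallback_sql)
        raise  # Any other error is propagated unchanged.


print(run_or_fallback(fake_execute, "FROBNICATE t", "SELECT 1"))
```

With a real driver the code is read from the exception object; psycopg2, for example, exposes it as the pgcode attribute.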


Additional Resources