AngaraBase
A relational DBMS with PostgreSQL protocol and predictable behavior.
Written in Rust. Current branch — 0.6.x.
AngaraBase is designed for ERP/SaaS workloads, where predictability is more important than “magic”, and every behavior is defined by an explicit contract.
This documentation covers what already works in the code on the current branch. Roadmap features are marked separately. AngaraBase is a young project, and we prefer accuracy over marketing promises.
Getting Started
Choose your entry point based on your role:
| Who you are | Where to go |
|---|---|
| Exploring the product | What is AngaraBase → High-level Architecture |
| Want to try locally | Quickstart in 5 minutes |
| Application Developer | SQL compatibility → Data types → Known issues |
| DBA / SRE | Installation → Configuration → Backup/Restore → Operations runbooks |
| Security engineer | Security model → Hardening → Audit |
| Contributor / Researcher | Architecture overview → Layering and Boundaries → Glossary |
| Reporting an issue | Support flow |
Key Features
PostgreSQL Compatibility Without Surprises
- Standard pgwire protocol: psql, JDBC, psycopg, node-postgres, pgx, Npgsql work just like with standard PostgreSQL.
- Contractual SQL subset: what is supported is supported fully; what is not supported returns an explicit SQLSTATE (0A000 etc.) instead of a silent bypass.
- Pinned compatibility tests (compat suite) as proof, rather than a “compatibility percentage”.
Read more: SQL compatibility overview.
Modern Engine: MVCC via UNDO-log, Not via Bloat
Unlike PostgreSQL, AngaraBase stores historical row versions in a separate UNDO-log (like Oracle/InnoDB), rather than in the table itself. As a result:
- heap pages contain only current row versions → less bloat;
- VACUUM in the traditional sense is not needed: old versions are cleaned up by the background AngaraGC according to a strict contract;
- visibility for transactions is determined deterministically by snapshot.
Read more: Transactions and MVCC.
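The difference from in-heap versioning can be illustrated with a toy model. This is a sketch under simplifying assumptions, not AngaraBase's implementation: the class ToyTable, the integer commit timestamps, and the gc rule are all hypothetical and ignore concurrency, locking, and durability.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy undo-log MVCC: the heap keeps only the newest version of each row."""
    heap: dict = field(default_factory=dict)  # key -> (commit_ts, value)
    undo: dict = field(default_factory=dict)  # key -> [(commit_ts, value)], oldest first

    def write(self, key, value, commit_ts):
        # Move the version being replaced into the undo log, then overwrite in place.
        if key in self.heap:
            self.undo.setdefault(key, []).append(self.heap[key])
        self.heap[key] = (commit_ts, value)

    def read(self, key, snapshot_ts):
        if key not in self.heap:
            return None
        ts, value = self.heap[key]
        if ts <= snapshot_ts:
            return value  # the current version is visible to this snapshot
        # Otherwise walk the undo chain from newest to oldest.
        for ts, value in reversed(self.undo.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None  # the row did not exist yet for this snapshot

    def gc(self, oldest_snapshot_ts):
        # Drop undo versions that no snapshot at or after oldest_snapshot_ts can see.
        for key, chain in self.undo.items():
            heap_ts, _ = self.heap[key]
            if heap_ts <= oldest_snapshot_ts:
                chain.clear()  # every remaining snapshot sees the heap version
                continue
            visible = [v for v in chain if v[0] <= oldest_snapshot_ts]
            newer = [v for v in chain if v[0] > oldest_snapshot_ts]
            chain[:] = visible[-1:] + newer
```

In this model the heap never grows with the number of updates; only the undo chain does, and it shrinks as soon as the oldest snapshot moves forward.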
Pluggable Storage From Day One
- Row-store (baseline): for OLTP.
- AngaraMemory: in-memory tables with three durability tiers (none/logged/snapshotted) and a hard row-cap.
- AngaraColumn: columnar storage for analytics (in roadmap; HTAP direction).
Storage engine is selected during CREATE TABLE ... WITH (storage='memory'|'row'|...).
Read more: Storage engine.
Cost-based Optimizer and Vectorized Execution
- AngaraPlan: CBO with robust planning, resilient to estimation errors.
- AngaraStat: statistics, including HLL for NDV, equi-height histograms, and MCV (reservoir sampling).
- AngaraFlow: streaming execution (Volcano).
- AngaraVector: vectorized path for scan/filter/project/join/aggregate; modes auto/force_vector/force_row. EXPLAIN explicitly shows VectorHashJoin, VectorAgg, etc.
- AngaraParallel: partitioned parallel join, DOP-caps (ANGARABASE_PARALLEL_DOP_CAP_*).
Read more: Query processing.
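To make the term equi-height histogram concrete, here is a minimal sketch of how such a histogram can be built from a sample and used for a rough selectivity estimate. The function names equi_height_bounds and estimate_le are hypothetical, and nothing here is claimed to match AngaraStat's actual algorithm.

```python
def equi_height_bounds(sample, buckets):
    """Return the upper bound of each bucket so that buckets hold roughly equal row counts."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    values = sorted(sample)
    if not values:
        return []
    n = len(values)
    bounds = []
    for i in range(1, buckets + 1):
        # Index of the last sampled value that falls into the i-th bucket.
        idx = max(0, min(n - 1, (i * n) // buckets - 1))
        bounds.append(values[idx])
    return bounds


def estimate_le(bounds, x):
    """Estimate the fraction of rows <= x, counting only buckets entirely at or below x."""
    if not bounds:
        return 0.0
    return sum(1 for b in bounds if b <= x) / len(bounds)
```

On skewed data several buckets end up sharing the same bound, which is the signal a planner can use to recognize a frequent value.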
Multi-layered Security With Explicit Contracts
Security is built into the core, not “bolted on” as an afterthought. Six layers of protection, each with its own contract:
- Transport and identity: TLS, SCRAM, cert authentication.
- RBAC: who is allowed at all.
- RLS v1: which rows are visible and mutable.
- Break-glass: the only way to bypass RLS (even SUPERUSER has no bypass of its own), always with a REASON, TTL, and audit.
- Audit chain: append-only, tamper-evident (SHA-256 chain).
- TDE: encryption of pages, WAL, and audit-sink; fail-closed without a key.
Read more: Security model.
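The phrase "tamper-evident (SHA-256 chain)" refers to a general technique that can be sketched in a few lines. The record layout, the GENESIS value, and the function names below are assumptions made for illustration; the real audit format is described in the Security model, not here.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical predecessor hash for the first record


def _digest(prev_hash, event):
    # Serialize deterministically so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain, event):
    """Append an event whose hash covers both its payload and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)})


def verify_chain(chain):
    """Return the index of the first broken record, or None if the chain is intact."""
    prev_hash = GENESIS
    for i, rec in enumerate(chain):
        if rec["prev"] != prev_hash or rec["hash"] != _digest(prev_hash, rec["event"]):
            return i
        prev_hash = rec["hash"]
    return None
```

Because each hash includes its predecessor, editing or removing a record in the middle invalidates that record and everything after it unless all later hashes are recomputed.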
Operations Built for Predictability
- Per-database backup and restore (cold + online/PITR baseline).
- Clear diagnostics: EXPLAIN/EXPLAIN ANALYZE, sys.* views (sys.identity, sys.health, sys.settings, sys.tables, sys.column_stats, angara_stat_activity, angara_stat_statements).
- Structured logging, OpenTelemetry-style spans, USDT/eBPF probes.
- Prometheus metrics: angara_*; named subsystems are visible in logs, metrics, and EXPLAIN.
- Native RPM/DEB packages, init-first service start fence, systemd units.
Read more: Operations runbooks.
Project Principles (Explicit Stance)
| Principle | What it means |
|---|---|
| Restrictive by default | Strict checks and authentication by default. Bypasses — only via explicit flag |
| Contract-first | Every feature has an explicit contract: what is supported, what SQLSTATE on failure, what invariants |
| Fail-closed | When uncertain — an explicit error, not “it will work somehow” |
| Evidence-first | Correctness is proven by test artifacts, not marketing |
| PostgreSQL-friendly | No custom clients: all compatibility is via pgwire |
| Minimum dependencies | Fewer runtime dependencies → smaller supply-chain attack surface |
Full declaration: Project principles.
Documentation Sections
Tutorials
Concepts (Explanation)
- Storage Engine: row-store, pages, slotted pages, pluggable storage
- Transactions and MVCC: UNDO-log MVCC, isolation, GC
- Indexes: AngaraTree (B+tree, BRIN), IndexStore
- Query Processing: AngaraPlan, AngaraStat, AngaraFlow, AngaraVector
- Catalog and Metadata: SysCatalog and sys.* views
- Instance Lifecycle: init, startup, recovery, shutdown
SQL Reference
- Compatibility Overview — policy, SQLSTATE codes, vector execution
- Data Types — supported types, casting, NULL
- DDL — CREATE/ALTER/DROP, indexes, constraints
- DML — INSERT, UPDATE, DELETE, mutation policies
- Queries — CTE, JOIN, aggregates, ORDER BY
- Partitioning — RANGE/LIST partitioning, routing, pruning
Security (How-to)
- Security Model
- Authentication
- Authorization (RBAC + RLS)
- Audit
- Encryption (TDE + client-side)
- Break-glass
- Hardening Runbook
- GOST Compliance
Operations (How-to)
- Installation: portable archive, RPM/DEB, source build
- Configuration: TOML, env, sys.settings
- Backup and Restore: cold + online/PITR
- Crash Recovery: host migration, WAL replay
- Version Upgrade
- Monitoring: Prometheus, Grafana, health probes
- Diagnostics: EXPLAIN, slow log, sys.*
- Logging, Tracing, USDT/eBPF probes
- GC auto-tuning
- Error debug runbook (10 minutes)
- GOST crypto setup
- Verify release artifacts
Operator deep-dives — runbooks (Reference)
- Operations overview
- Runbooks index
- Troubleshooting guide
- Disaster recovery playbook
- Performance tuning guide
- MVCC and GC operator minimum
- Diagnostics bundle runbook
- Security operations baseline
- Upgrade and migration
- Backup and restore (operator-level)
- Configuration schema reference
- Observability metrics checklist
- jemalloc heap profiling runbook
- Parallel runtime observability runbook
- Replication v2 operations guide
- Operational policies baseline
- Client compatibility baseline (operator)
- Testing and validation baseline
- Golden dataset management
- CI reproducibility contract
- Known issues baseline (operator)
Architecture (Reference)
Reference
- Known Issues and SQLSTATE
- Glossary and Named Subsystems
- System Views sys.*
- Client Compatibility
- Support and Bug Report Artifact Collection
- Generated reference (auto-generated registries)
Changelog
- AngaraBook changelog (highlights) — concise change feed from a user perspective.
Community and Contribution
We welcome community contributors:
- Found a bug or regression? Gather artifacts according to the Support flow and open an issue.
- Want to change behavior? Propose an RFC via the project’s development process (internal loop).
- Want to help with documentation? AngaraBook is documentation-as-code in the same repository. Formatting rules: see the internal WRITING_RULES.md next to the book.
- Want to track development? Follow the changelog and release notes.
About the Documentation
AngaraBook is built on the principles of Diátaxis:
- Tutorials — learning through action, for new users.
- How-to guides (Security / Operations) — recipes for specific tasks.
- Reference (SQL / Architecture) — exact description of behavior.
- Explanation (Concepts) — why things are designed the way they are.
Quality Guarantees:
- Documentation is code: edited in the same repository, passes the same CI as the engine.
- Any claimed SQL/ops behavior is verified by pinned tests or explicitly marked as a roadmap feature.
- Anti-drift: command versions, configuration keys, and SQLSTATE codes are checked automatically.
- Public build passes a security gate: internal processes and confidential links do not make it into the public portal.
If you notice a discrepancy between the documentation and actual behavior — this is a documentation bug; please report it.
AngaraBase v0.6.x · Linux x86_64/aarch64 · glibc >= 2.28
What is AngaraBase
AngaraBase is a relational DBMS written in Rust, compatible with the PostgreSQL protocol and a subset of its SQL. It is designed for ERP/SaaS workloads, where predictable behavior, explicit boundaries, and an absence of “magic” are critical.
- Server platform: Linux x86_64 / aarch64 (glibc >= 2.28)
- Clients: any platform via standard PostgreSQL drivers
- Current branch: 0.6.x
AngaraBase is a young project. This documentation covers what is already working in the code on the current branch, and explicitly separates it from what is in the roadmap.
What you get right now
| Capability | Status |
|---|---|
| pgwire protocol, connecting psql/JDBC/psycopg/pgx without modifications | Available |
| A PostgreSQL SQL subset with an explicit contract and pinned tests | Available |
| Transactions, MVCC (UNDO-log), READ COMMITTED / REPEATABLE READ isolation levels | Available |
| Per-database backup/restore (cold + online/PITR baseline) | Available |
| Multi-layer security: SCRAM/TLS, RBAC, RLS v1, audit chain, TDE, break-glass | Available |
| AngaraTree indexes (B+tree, BRIN), AngaraStat statistics (HLL, histograms, MCV) | Available |
| AngaraPlan cost-based optimizer, AngaraFlow streaming execution | Available |
| AngaraVector vectorized execution (scan/filter/project/join/agg, auto/force_* modes) | Available (bounded) |
| AngaraMemory in-memory storage (storage='memory', durability tiers) | Available (bounded) |
| AngaraParallel parallel execution (DOP-caps, partitioned join) | Available (bounded) |
| AngaraColumn columnar storage, HTAP, distributed SQL | In roadmap (see Architecture) |
The exact boundaries of SQL support and current limitations are documented in the SQL compatibility overview and Known issues. AngaraBase does not publish a “compatibility percentage” — instead, it provides an exact contract.
Principles
| Principle | What it means in practice |
|---|---|
| Restrictive by default | Strict checks, constraints, and authentication by default. “Magic” bypasses require an explicit flag |
| Contract-first | Every feature has an explicit contract (what is supported, what SQLSTATE on failure, what invariants) |
| No semantic surprises | Unsupported constructs return an explicit SQLSTATE (0A000 etc.), not a silent bypass |
| Fail-closed | When uncertain, the system rejects the request rather than letting it through |
| Evidence-first | Correctness is proven by test artifacts and oracle scripts, not marketing |
| PostgreSQL-friendly | pgwire and SQL subset compatibility; no custom wrappers in the client |
PostgreSQL Compatibility
AngaraBase implements the pgwire protocol as its primary API. From the client’s perspective, it is a standard PostgreSQL endpoint:
psql "host=127.0.0.1 port=5432 user=angara_root dbname=base sslmode=verify-full"
| Stack | Driver | Status |
|---|---|---|
| Python | psycopg2, psycopg3 | Supported |
| Node.js | pg (node-postgres) | Supported |
| Java | PostgreSQL JDBC | Supported |
| Go | lib/pq, pgx | Supported |
| .NET | Npgsql | Supported |
| Tooling | psql, DBeaver | Supported with caveats — see Client compatibility |
Full compat contract and smoke scenarios: SQL compatibility overview.
How AngaraBase differs from PostgreSQL
| Area | PostgreSQL | AngaraBase |
|---|---|---|
| Pluggable storage | In progress (pg_am v2) | Built-in: row-store + AngaraMemory; AngaraColumn — in roadmap |
| MVCC | Old row versions kept in the heap (bloat, VACUUM) | UNDO-log (separate log, heap contains only current versions) |
| Backup/restore | Cluster-wide | Per-database, cold + online/PITR baseline |
| Security | Extensions and configuration | Multi-layer model out of the box: RBAC + RLS + audit chain + TDE + break-glass |
| Named subsystems | — | AngaraTree, AngaraStat, AngaraPlan, AngaraFlow, AngaraIO, AngaraGC, AngaraVector, AngaraMemory, AngaraParallel — each with explicit contract and metrics |
| Behavior on unsupported SQL | Often best-effort | Explicit SQLSTATE, fail-closed |
| Compatibility as a metric | Full SQL | Contractual subset with pinned tests and public known-issues registry |
Who it is for
- ERP/SaaS teams (e.g. based on Odoo) who need a predictable PostgreSQL-compatible database with an explicit compatibility contract.
- DBAs who value explicit behavioral boundaries over “best-effort” compatibility.
- Engineers who care about knowing exactly what is supported and having reproducible tests as proof.
- Community members willing to participate in shaping a young DBMS.
What AngaraBase does not do (on the current branch)
- It does not provide distributed SQL and multi-master HA — these are for future major branches.
- It does not implement full PostgreSQL SQL — only a contractual subset with clear boundaries.
- It does not mask unsupported features — you get an explicit error with a SQLSTATE.
- It does not run on non-Linux servers. Clients are cross-platform.
Getting Started
| You | Where to go |
|---|---|
| Meeting us for the first time | Architecture “from a bird’s-eye view” |
| Want to run locally | Quickstart |
| Evaluating fit for your stack | SQL compatibility overview, Known issues |
| Planning a production deployment | Installation, Security, Hardening |
| Reporting an error | Support |
Links
- Quickstart — build, run, and execute your first SQL in a few minutes.
- Architecture — how the DB is structured internally.
- SQL reference — what SQL is supported.
- Security model: the layers of protection and their contracts.
- Operations — configuration and operations.
- Glossary — terms and named subsystems.
AngaraBase Architecture
This document provides an understanding of AngaraBase’s internal design at a level sufficient for making decisions: configuration choices, issue diagnostics, and assessing applicability.
Detailed technical specification: docs/01_ARCHITECTURE.md.
Multi-layered Architecture
AngaraBase consists of six layers. Each layer has its own API and depends only on the layers below it. This allows implementations (e.g., storage engine) to be swapped out without changing the other layers.
┌──────────────────────────────────────────────────────────────┐
│ TIER 1: CLIENT LAYER (Wire Protocol) │
│ │
│ • pgwire protocol (compatibility with psql, JDBC, etc.) │
│ • connection pooling │
│ • async event loop │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 2: SESSION / TRANSACTION LAYER │
│ │
│ • sessions and session variables │
│ • transaction management (BEGIN/COMMIT/ROLLBACK/SAVEPOINT) │
│ • isolation levels (READ COMMITTED, REPEATABLE READ) │
│ • locks and deadlock detection │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 3: QUERY EXECUTION LAYER │
│ │
│ • SQL query parsing │
│ • semantic validation and type checking │
│ • planning and optimization (AngaraPlan) │
│ • physical plan execution (AngaraFlow) │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 4: CATALOG & TYPE SYSTEM │
│ │
│ • registry of tables, schemas, databases │
│ • registry of types, functions, operators │
│ • index registry (access methods) │
│ • system views sys.* │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 5: STORAGE LAYER (Pluggable Storage) │
│ │
│ • row-store engine (OLTP baseline) │
│ • pluggable: in-memory and column-store (planned) │
│ • indexes (AngaraTree: B+tree, BRIN) │
│ • Transaction Log (WAL) — transaction journal │
└─────────────────────────┬────────────────────────────────────┘
│
┌─────────────────────────┴────────────────────────────────────┐
│ TIER 6: SYSTEM LAYER │
│ │
│ • buffer manager and page cache │
│ • metrics and telemetry │
│ • crash recovery │
│ • resource scheduler (CPU, memory, I/O) │
└──────────────────────────────────────────────────────────────┘
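The rule that each layer depends only on the layers below it can be expressed as a simple check. The short tier labels and both function names are hypothetical; this is a sketch of the rule itself, not of how the AngaraBase build enforces it.

```python
# Tiers listed top to bottom, matching TIER 1..6 in the diagram above.
TIERS = ["client", "session", "query", "catalog", "storage", "system"]


def may_depend_on(upper, lower):
    """A layer may depend only on layers strictly below it in the stack."""
    return TIERS.index(upper) < TIERS.index(lower)


def find_violations(deps):
    """Return the (from, to) pairs that point upward or sideways in the stack."""
    return [(a, b) for a, b in deps if not may_depend_on(a, b)]
```

For example, find_violations([("client", "session"), ("storage", "catalog")]) flags only the second pair, because storage sits below the catalog.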
What this means for you
- TIER 1: you connect using a standard PostgreSQL client; no special tools are needed.
- TIER 2: transactions work as usual: BEGIN, COMMIT, ROLLBACK, SAVEPOINT.
- TIER 3: SQL queries go through a parser, optimizer, and executor. EXPLAIN shows the execution plan.
- TIER 4: metadata about tables, types, and indexes is accessible via sys.* system views (e.g. SELECT * FROM sys.tables).
- TIER 5: data is stored in a pluggable storage engine. Row-store is the OLTP baseline, AngaraMemory is available within its documented bounds via CREATE TABLE ... WITH (storage='memory'), and column-store is on the roadmap.
- TIER 6: buffer, metrics, and recovery are part of the infrastructure layer and work transparently. You interact with it via configuration and monitoring.
Named Components
Key AngaraBase subsystems have their own names. This simplifies diagnostics, documentation, and configuration — when you see a name in the logs or metrics, you know which part of the system it refers to.
| Component | What it does | Status |
|---|---|---|
| AngaraTree | Indexes: B+tree, BRIN | Available |
| AngaraStat | Table statistics: NDV, histograms, MCV | Available |
| AngaraPlan | Cost-based query optimizer | Available |
| AngaraFlow | Streaming query execution (iterator/Volcano model) | Available |
| AngaraIO | Async I/O pipeline (storage, WAL, prefetch) | Available |
| AngaraGC | MVCC garbage collection (cleaning up obsolete row versions) | Available |
| AngaraVector | Vectorized execution (SIMD-optimization) | Available |
| AngaraParallel | Parallel query execution | Available |
| AngaraMemory | In-memory storage engine | Available |
Example: if EXPLAIN shows AngaraTree: Index Scan, it means the query is using a B+tree index.
If AngaraGC appears in the logs, it’s the garbage collector for obsolete row versions.
Data Model
AngaraBase uses a four-level hierarchy (similar to MS SQL Server):
Instance (angarabased process)
└─ Database
└─ Schema
└─ Table
- Instance: a single running angarabased process. It can contain multiple databases.
- Database: an isolated database. Each DB has its own data files, transaction log, and settings. Backup and restore operate on the individual database level.
- Schema: logical grouping of tables within a database (default is public).
- Table: a table containing data.
Example:
angarabased (instance)
├─ Database "odoo_prod"
│ ├─ Schema "public"
│ │ ├─ Table "res_partner"
│ │ ├─ Table "sale_order"
│ │ └─ ...
│ └─ Schema "staging"
│ └─ ...
├─ Database "analytics"
│ └─ Schema "public"
│ └─ ...
└─ System catalog (sys.*)
Each database is independent: a backup of odoo_prod does not affect analytics, and vice versa.
Configuration Hierarchy
Settings in AngaraBase are applied at three levels, from broadest to narrowest:
Instance (angarabase.conf)
└─ Database (ALTER DATABASE ... SET ...)
└─ Session (SET ...)
A narrower level overrides a broader one:
- Instance: server settings (port, memory limits, file paths). Some of these require a restart.
- Database: database-specific settings (limits, storage parameters). Applied without a restart.
- Session: settings for the current connection (SET timezone = 'Europe/Moscow'). Active until the session ends.
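The override order (session over database over instance) behaves like a layered lookup. The sketch below models it with collections.ChainMap; the setting statement_timeout_ms and all values are hypothetical and serve only to show the resolution order.

```python
from collections import ChainMap

# Hypothetical settings at each level, for illustration only.
instance = {"timezone": "UTC", "statement_timeout_ms": 0}
database = {"statement_timeout_ms": 30000}
session = {"timezone": "Europe/Moscow"}

# ChainMap searches its maps left to right, so the narrowest level goes first.
effective = ChainMap(session, database, instance)


def source_of(name):
    """Report which level supplies the effective value of a setting."""
    levels = (("session", session), ("database", database), ("instance", instance))
    for level, values in levels:
        if name in values:
            return level
    raise KeyError(name)
```

When the session ends (modeled here by emptying the session dict), lookups fall back to the database level and then to the instance level.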
What this means in practice
AngaraBase architecture is designed with several principles that affect daily operations:
- Connect with standard tools. pgwire compatibility means you don’t need custom drivers or libraries. psql, DBeaver, your Python or Java application: everything connects just like standard PostgreSQL.
- Per-database isolation. Each database is an independent unit for backup, restore, and configuration. This is handy for multi-tenant scenarios: each client can have their own DB with individual settings and a separate backup schedule.
- Clear diagnostics. System views sys.* provide access to metadata and system state. Named components (AngaraTree, AngaraPlan, etc.) are reflected in EXPLAIN, logs, and metrics, so you always know what part of the system is involved.
- Pluggable storage. Row-store (optimized for OLTP) is the baseline, and AngaraMemory covers in-memory tables within its documented bounds. Column-store for analytics (AngaraColumn) is on the roadmap.
- Fail-closed behavior. If an SQL construct isn’t supported, you get an explicit error with an SQLSTATE code, not an unexpected result. This is predictable and safe for production.
Additional Resources
- Canonical architecture doc: docs/01_ARCHITECTURE.md, the complete technical specification.
- Storage Engine: concepts/storage-engine.md (row-store, pages, pluggable storage).
- Query Processing: concepts/query-processing.md (parser, planner, optimizer, executor).
- Catalog and Metadata: concepts/catalog-and-metadata.md (SysCatalog and system views).
Quickstart (testing)
Goal
Start angarabased, connect via psql, execute basic DDL/DML, and verify that pgwire works.
Prerequisites
- Linux x86_64
- One of the installation options:
  - Rust toolchain (see rust-toolchain.toml) for a source build,
  - or the portable archive x86_64-unknown-linux-gnu (glibc >= 2.28).
Install from portable archive
mkdir -p /opt/angarabase
tar -xzf angarabase-0.6.3-x86_64-unknown-linux-gnu.tar.gz -C /opt/angarabase
/opt/angarabase/angarabase-0.6.3/bin/angarabase-server --version
If runtime glibc is below baseline (2.28), angarabase-server exits fail-closed with an explicit compatibility message.
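If you want to check the glibc baseline on a host before unpacking the archive, a small pre-check is enough. This is a convenience sketch with hypothetical function names; it reports the libc the Python interpreter is linked against and says nothing about how angarabase-server performs its own check.

```python
import platform


def parse_version(text):
    """Turn '2.28' into (2, 28) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def meets_glibc_baseline(found, baseline="2.28"):
    """True if the found glibc version is at least the documented baseline."""
    return parse_version(found) >= parse_version(baseline)


def local_glibc_version():
    """Return the glibc version of the running interpreter, or None on non-glibc systems."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None
```

Comparing tuples rather than strings matters here: as strings, "2.9" would sort after "2.28".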
Native package flow
For RPM/DEB deployments, service start is intentionally blocked before secure init:
angarabase-server --init /var/lib/angarabase --superuser admin --auth-mode scram --superuser-password-file /secure/path/pass.txt --require-auth
systemctl start angarabase
If you intentionally need trust bootstrap for isolated labs, it must be explicit:
angarabase-server --init /var/lib/angarabase --auth-mode trust --insecure-trust
Build
cargo build -p angarabase-server
cargo build -p angara-cli
Run server (local)
AngaraBase uses explicit instance initialization (--init) before regular startup.
Minimal path for testing (without manual config creation):
- Run the one-time initialization in the instance directory.
target/debug/angarabase-server --init /tmp/angarabase-instance --superuser angara_root --superuser-password 'change-me' --auth-mode scram
By default, this will create:
- data/ at /tmp/angarabase-instance/data
- txlog/ at /tmp/angarabase-instance/txlog
- config angarabase.conf at /tmp/angarabase-instance/angarabase.conf
- Start the server:
target/debug/angarabase-server --config /tmp/angarabase-instance/angarabase.conf
In this scenario, the SCRAM bootstrap user angara_root is used.
For local trust/no-auth mode, you can explicitly run with --allow-insecure-no-auth.
SecurityContext note:
- In scram/cert modes, protected SQL execution requires session context.
- Minimal setup for tenant-scoped workloads:
SET SESSION CONTEXT 'app.tenant_id' = 'public';
Alternative path (if you want to use an existing config):
- --config <path> with --init is read as the input config if the file exists,
- and written as the output config if the file does not exist.
Examples:
# init using an existing config (input)
target/debug/angarabase-server --config ./angarabase.conf --init
# init and write a new config (output; file must not exist)
target/debug/angarabase-server --config /tmp/angarabase.conf --init /tmp/angarabase-instance
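The input/output rule for --config combined with --init can be restated as a one-line decision. The helper name config_role_for_init is hypothetical; the sketch only mirrors the rule as documented above and is handy in provisioning scripts that need to know which case they are in.

```python
import os
import tempfile


def config_role_for_init(config_path):
    """Mirror the documented rule: an existing file is input, a missing one is output."""
    return "input" if os.path.isfile(config_path) else "output"


# Demonstrate both cases in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "angarabase.conf")
    before = config_role_for_init(path)  # the file is absent at this point
    open(path, "w").close()              # create an empty config file
    after = config_role_for_init(path)   # the file now exists
```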
A shortcut is allowed for local development:
target/debug/angarabase-server --config angarabase.conf --dev
--dev preserves the auto-init behavior only for dev/test scenarios.
Connect with psql
psql "host=127.0.0.1 port=5432 user=angara_root dbname=base password=change-me sslmode=disable"
Smoke SQL
CREATE TABLE t (id INT PRIMARY KEY, v INT);
INSERT INTO t (id, v) VALUES (1, 10);
INSERT INTO t (id, v) VALUES (2, 20);
SELECT * FROM t ORDER BY id;
Restart check (DDL survives restart)
Table metadata created by CREATE TABLE (the catalog) should survive a restart.
- Stop the server (Ctrl+C).
- Start it again.
- Verify that the table is visible:
SELECT table_name FROM sys.tables WHERE table_name = 't';
Sys introspection (sys.*)
Examples of useful queries:
SELECT * FROM sys.identity;
SELECT * FROM sys.health;
SELECT * FROM sys.settings WHERE name IN ('server.addr','storage.data_directory');
SELECT * FROM sys.tables;
SELECT * FROM sys.columns WHERE table_name = 't';
Optional: SQL shutdown (fail-closed)
By default, shutdown via SQL is disabled. To enable it (locally/for tests):
export ANGARABASE_ALLOW_SQL_SHUTDOWN=1
You can then request a shutdown from psql:
SELECT sys.request_shutdown();
If something fails
- Check “Known issues”: ../reference/known-issues.md
- For connecting clients (DBeaver, etc.): ../reference/client-compatibility.md
- To report bugs, gather artifacts according to ../reference/support.md.
What’s Next
Once the server answers psql -h 127.0.0.1 and the basic SELECT succeeds, the logical next steps are:
- What is AngaraBase: a product overview covering what the project is for and how it differs from vanilla PostgreSQL.
- SQL Compatibility Overview: what parts of the standard you can use right now.
- Configuration: how to run the server beyond defaults, tailored to your scenario.
- Security Model: read it before giving anyone but yourself access.
Connecting clients: psql, Python, JDBC
What you’ll get in 15 minutes
After this tutorial, you will have three working methods to connect to a locally running AngaraBase instance:
- psql, the interactive PostgreSQL console.
- Python via psycopg[binary], a typical application script.
- JDBC via the standard org.postgresql:postgresql driver, a typical Java/Kotlin/Scala stack.
All three methods work via the standard pgwire protocol: AngaraBase presents itself to clients as PostgreSQL, so no special drivers are needed.
This describes the minimal guaranteed working path. The full list of tested clients and nuances of specific GUI tools (DBeaver, IntelliJ DataGrip, etc.) are in the separate Client compatibility guide.
Prerequisites
- AngaraBase running locally according to Quickstart. We assume the server is listening on 127.0.0.1:5432, and there is a user angara and a database angara_demo.
- Installed psql (any PostgreSQL version ≥ 13).
- Python 3.10+ (for Step 2).
- JDK 17+ and Maven/Gradle (for Step 3).
Verify that psql is installed and that the server is listening:
psql --version
# psql (PostgreSQL) 16.4
ss -ltnp 'sport = 5432'
# LISTEN ... 127.0.0.1:5432 ...
If the port is not listening — go back to Quickstart and make sure angarabased started without errors.
Step 1. psql — interactive console
1.1. Connecting
psql 'postgresql://angara@127.0.0.1:5432/angara_demo'
Password (if set) — at the prompt. Success sign: the angara_demo=> prompt.
1.2. Minimal scenario: create a table, insert, select
-- Inside psql:
CREATE TABLE products (
id BIGINT PRIMARY KEY,
name TEXT NOT NULL,
price NUMERIC(10, 2) NOT NULL
);
INSERT INTO products (id, name, price) VALUES
(1, 'Coffee', 4.50),
(2, 'Tea', 3.00);
SELECT id, name, price FROM products ORDER BY id;
Expected output:
id | name | price
----+--------+-------
1 | Coffee | 4.50
2 | Tea | 3.00
(2 rows)
1.3. Useful \ commands
| Command | Purpose |
|---|---|
| \dt | List user tables in the current database. |
| \d products | Structure of the products table (columns, types, indexes). |
| \du | List roles (RBAC). |
| \timing on | Enable displaying execution time for each query. |
| \q | Exit psql. |
1.4. If something goes wrong
- Error "could not connect to server: Connection refused": the server isn’t running or isn’t listening on 127.0.0.1. Check ss -ltnp 'sport = 5432' and the angarabased logs.
- Error 'authentication failed for user "angara"': the password isn’t set or doesn’t match. See Authentication.
- Error "feature_not_supported (0A000)": you hit an SQL construct that AngaraBase does not support. This is an explicit fail-closed contract; see Known Issues and SQLSTATE.
Step 2. Python via psycopg
2.1. Installing the driver
We use psycopg version 3 (binary wheel — without local compilation):
python3 -m venv .venv
source .venv/bin/activate
pip install 'psycopg[binary]>=3.1,<4'
2.2. Minimal script
Create a file connect_demo.py:
import psycopg

DSN = "postgresql://angara@127.0.0.1:5432/angara_demo"

with psycopg.connect(DSN) as conn:
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO products (id, name, price) VALUES (%s, %s, %s)",
            (3, "Espresso", 4.25),
        )
        cur.execute("SELECT id, name, price FROM products ORDER BY id")
        for row in cur.fetchall():
            print(row)
    conn.commit()
Run:
python3 connect_demo.py
Expected output:
(1, 'Coffee', Decimal('4.50'))
(2, 'Tea', Decimal('3.00'))
(3, 'Espresso', Decimal('4.25'))
2.3. Important things to know about the Python client
- Parameterized queries are mandatory. Don’t substitute values via f"...{value}..."; that is a path to SQL injection. psycopg 3 sends parameter values to the server separately from the query text (server-side binding), so they are never spliced into the SQL.
- with psycopg.connect(...) as conn: already manages the transaction. In psycopg 3, leaving the block commits on success, rolls back on an exception, and closes the connection. The explicit conn.commit() in the script is redundant but harmless; it only makes the commit point visible.
- AngaraBase predictably returns SQLSTATE. Catch psycopg.errors.FeatureNotSupported and check e.diag.sqlstate == "0A000" to gracefully handle unsupported constructs (fail-closed contract).
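The difference between string substitution and parameter binding is easy to see in a runnable form. The sketch below uses the standard-library sqlite3 module purely as a stand-in so that it runs without a server; that substitution is an assumption of this example, and with psycopg the placeholder would be %s instead of ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO products (id, name) VALUES (1, 'Coffee'), (2, 'Tea')")

hostile = "x' OR '1'='1"

# Unsafe: the value is spliced into the SQL text and changes the query's meaning.
unsafe_sql = f"SELECT name FROM products WHERE name = '{hostile}' ORDER BY id"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value travels as a parameter and is compared as a plain string.
safe_rows = conn.execute(
    "SELECT name FROM products WHERE name = ? ORDER BY id", (hostile,)
).fetchall()
```

The unsafe query returns every row because the injected OR clause is always true; the parameterized query returns nothing because no product has that literal name.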
Step 3. JDBC via org.postgresql:postgresql
3.1. Dependency
Maven (pom.xml):
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.7.4</version>
</dependency>
Gradle (build.gradle.kts):
dependencies {
    implementation("org.postgresql:postgresql:42.7.4")
}
3.2. Minimal class
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ConnectDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://127.0.0.1:5432/angara_demo";
        java.util.Properties props = new java.util.Properties();
        props.setProperty("user", "angara");
        props.setProperty("preferQueryMode", "simple");

        try (Connection conn = DriverManager.getConnection(url, props)) {
            conn.setAutoCommit(false);

            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)")) {
                ps.setLong(1, 4L);
                ps.setString(2, "Cappuccino");
                ps.setBigDecimal(3, new java.math.BigDecimal("4.75"));
                ps.executeUpdate();
            }

            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT id, name, price FROM products ORDER BY id");
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf(
                            "%d %s %s%n",
                            rs.getLong("id"),
                            rs.getString("name"),
                            rs.getBigDecimal("price"));
                }
            }

            conn.commit();
        }
    }
}
3.3. Important things to know about the JDBC client
- preferQueryMode=simple is the recommended default for AngaraBase. This disables the aggressive probe mode of the extended protocol that the driver uses for compatibility with PostgreSQL extensions. AngaraBase implements pgwire by contract, and some extended protocol checks are not needed.
- assumeMinServerVersion=9.0: add it to props if you plan to work via DBeaver/DataGrip; see Client compatibility → DBeaver.
- Transactions and setAutoCommit(false). AngaraBase implements MVCC via UNDO-log; explicit transactions provide a predictable snapshot. Do not leave long transactions open, because this slows down AngaraGC.
What’s Next
- SQL Compatibility Overview — what exactly from the SQL standard is available via all three clients.
- Client Compatibility — reference guide for specific GUI tools and drivers (DBeaver, IntelliJ DataGrip, etc.).
- Security Model — how to set up TLS, SCRAM, and certificate authentication for production connections.
- Known Issues and SQLSTATE — what codes the client should expect instead of “magic” values.
Contracts in AngaraBase
Goal
Explain what the term “contract” means in AngaraBase, the levels of contracts that exist, and the mechanisms used to enforce them. This page is for users, DBAs, and new contributors: to make it clear when reading documentation, code, or error messages what guarantees can be relied upon, and what is explicitly outside the contract.
What We Mean by a Contract
A contract in AngaraBase is an explicit promise about the observable behavior of a component: what it takes as input, what it returns, what guarantees it provides, and how it behaves when boundaries are violated. A contract is not “best intentions” or “how it usually works” — it is a formalized agreement that both sides must adhere to: the implementation and the caller (or client).
A contract has three key properties:
- Explicitness. The contract is documented in a single canonical source — not in a chat, not in a code comment, not in the “shared understanding of the team.”
- Verifiability. Compliance with the contract is verified automatically: by tests, types, metrics, or lint checks.
- Fail-closed. When a contract boundary is violated, the system returns an explicit error (with a known code), rather than “somehow continuing to work.”
If something in the system is not covered by a contract, it is explicitly marked as roadmap, experimental, or known limitation. Behavior outside a contract can change without a deprecation cycle.
Levels of Contracts
AngaraBase has several levels of contracts, each with its own source of truth and method of verification.
1. SQL Contract (External, for the User)
What is supported is supported fully. What is not supported returns an explicit SQLSTATE (0A000 feature_not_supported, etc.) rather than a silent bypass or distorted result.
- Source: SQL Compatibility Overview, Known Issues and SQLSTATE.
- Verification: pinned compat suite, regression tests for every documented SQLSTATE.
- What this means in practice: a client can catch psycopg.errors.FeatureNotSupported and know that it hit a documented limitation, not a bug.
2. Configuration Contract
Every config key has a type, a default value, a range of acceptable values, and defined behavior for absence or invalid values.
- Source: Configuration, Configuration schema reference.
- Verification: the parser rejects unknown/invalid keys on startup (fail-closed), rather than silently ignoring them.
- Changing the semantics of a key goes through a deprecation cycle (see WRITING_RULES.md §9a).
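The fail-closed parsing described in this section can be sketched as follows. This is a minimal illustration, not AngaraBase's parser: the `KeySpec` type, the `load_config` function, and the defaults and ranges in `SCHEMA` are all assumptions made for the example; only the key names come from this page.

```python
from dataclasses import dataclass
from typing import Optional


class ConfigError(Exception):
    """Raised at startup; the process should refuse to start."""


@dataclass(frozen=True)
class KeySpec:
    type_: type
    default: object
    min_value: Optional[int] = None
    max_value: Optional[int] = None


# Hypothetical schema: defaults and ranges are illustrative, not AngaraBase's.
SCHEMA = {
    "buffer_pool_size_mb": KeySpec(int, 1024, 64, 1_048_576),
    "max_concurrent_queries": KeySpec(int, 64, 1, 10_000),
}


def load_config(raw: dict) -> dict:
    """Return the effective config or raise ConfigError (fail-closed)."""
    unknown = sorted(set(raw) - set(SCHEMA))
    if unknown:
        raise ConfigError(f"unknown config keys: {unknown}")
    effective = {}
    for name, spec in SCHEMA.items():
        value = raw.get(name, spec.default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, spec.type_):
            raise ConfigError(f"{name}: expected {spec.type_.__name__}")
        if spec.min_value is not None and value < spec.min_value:
            raise ConfigError(f"{name}: {value} is below {spec.min_value}")
        if spec.max_value is not None and value > spec.max_value:
            raise ConfigError(f"{name}: {value} is above {spec.max_value}")
        effective[name] = value
    return effective
```

A misspelled key such as bufer_pool_size_mb raises ConfigError here instead of silently falling back to the default, which is the property the configuration contract asks for.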
3. Operational Contract
Metrics, USDT probes, names of sys.* views, backup/restore formats, log formats, and runbook output — all of these are public names that monitoring and operator automation rely on.
- Source: System tables, Observability metrics checklist, Backup and restore, USDT/eBPF probes.
- Principle: every resource boundary (buffer pool RAM budget, transaction write-set limit, snapshot age, etc.) must have a Prometheus metric and explicit fail-closed behavior on violation. A boundary without observability is not a contract.
4. Internal API Contracts (For Contributors)
Each core subsystem has a public Rust trait defining its semantics: TableEngine, PageProvider, TransactionLogSink, StorageIo, etc. Without implementing the contract, the code won’t compile — this is an “honest checked promise,” not just a “developer’s word.”
- Source: doc-comments on traits, Architecture overview, API Boundaries.
- Verification: Rust compiler + property-tests for invariants + layering lints (Core doesn’t depend on Adapters/Tooling).
5. Documentation Contract (Anti-drift)
Documentation is part of the code. Any change to a public contract (SQL surface, config keys, metrics, SQLSTATE, subsystem names, protective defaults, init/upgrade sequence) must be accompanied by an update to AngaraBook in the same PR.
- Source: WRITING_RULES.md §8 — Anti-drift contract.
- Verification: pre-commit / CI run tools/docs/lint_angarabook_public.py and tools/docs/check_public_build_security.py, and mark drift as blocking.
How We Enforce Contracts
A contract without an enforcement mechanism is just a declaration. AngaraBase uses multiple layers of enforcement working together.
The Type System as the First Line of Defense
- Result<T, Error> instead of panics. unwrap()/expect() are forbidden in production code.
- Bounded generics and trait objects instead of dynamic dispatch where invariants can be encoded at the type level.
- No-panic policy: the server does not crash on user input — it returns an SQLSTATE.
Restrictive by Default + Fail-closed
Every component with a resource boundary must define:
- its boundary (e.g., buffer_pool_size_mb, txn_max_write_set_mb, max_concurrent_queries),
- behavior upon violation (an explicit error with a known SQLSTATE),
- the reaction of the caller code (Reaction Propagation Contract).
No boundary and no fail-closed behavior — no merge.
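As an illustration of the three parts above (boundary, behavior on violation, observable reject), here is a minimal sketch of a per-transaction write-set budget. The class names, the counter, and the choice of SQLSTATE 53000 (PostgreSQL's insufficient_resources class) are assumptions for the example; this page does not say which code AngaraBase returns.

```python
class BudgetExceeded(Exception):
    """Explicit error carrying a SQLSTATE, instead of degraded behavior."""

    def __init__(self, sqlstate: str, message: str):
        super().__init__(message)
        self.sqlstate = sqlstate


class WriteSetBudget:
    """Tracks bytes written by one transaction against a fixed limit."""

    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self.rejects_total = 0  # would back a metrics counter in a real system

    def reserve(self, n: int) -> None:
        if self.used_bytes + n > self.limit_bytes:
            self.rejects_total += 1
            # "53000" is an assumed code, chosen only for illustration.
            raise BudgetExceeded(
                "53000",
                f"write set would reach {self.used_bytes + n} bytes, "
                f"limit is {self.limit_bytes}",
            )
        self.used_bytes += n
```

A rejected reservation leaves used_bytes unchanged, so the caller can roll back or retry with a smaller batch while the counter records that the boundary was hit.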
Pinned Tests and Golden Datasets
The contract for SQL compatibility and client compatibility is verified by pinned tests, not a “compatibility percentage.” If a test is pinned — changing behavior requires either updating the test with a justification, or rolling back the change.
More details: Testing and validation baseline, Golden dataset management, CI reproducibility contract.
Deprecation Cycle
When a contract is phased out, it does not silently disappear. A unified cycle is applied: Active → Deprecated → Removed, with at least one major release between Deprecated and Removed; before v1.0 — at least two minor versions. Every phase change is an atomic PR (code + AngaraBook + Migration steps).
Full procedure: WRITING_RULES.md §9a — Deprecation policy. Public list of all deprecated/removed contracts: Known Issues and SQLSTATE.
Security Gate as Fail-closed for Documentation
Public builds of AngaraBook pass through a security gate: links to internal directories (RFC/development/spec/rules) and paths with /internal/ are forbidden. Internal content is hidden via <!-- internal --> ... <!-- /internal --> blocks; the preprocessor fails closed on nested/unclosed markers.
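The fail-closed handling of internal markers can be sketched like this. It is a simplified stand-in for the real preprocessor: the function name `strip_internal` and the exception type are hypothetical, and only the two marker strings come from this page.

```python
OPEN = "<!-- internal -->"
CLOSE = "<!-- /internal -->"


class MarkerError(Exception):
    """Raised instead of publishing a page with ambiguous markers."""


def strip_internal(text: str) -> str:
    """Remove internal blocks; fail on nested, unclosed, or stray markers."""
    out = []
    pos = 0
    inside = False
    while True:
        next_open = text.find(OPEN, pos)
        next_close = text.find(CLOSE, pos)
        if not inside:
            if next_close != -1 and (next_open == -1 or next_close < next_open):
                raise MarkerError("closing marker without an opening one")
            if next_open == -1:
                out.append(text[pos:])
                return "".join(out)
            out.append(text[pos:next_open])
            pos = next_open + len(OPEN)
            inside = True
        else:
            if next_close == -1:
                raise MarkerError("unclosed internal block")
            if next_open != -1 and next_open < next_close:
                raise MarkerError("nested internal block")
            pos = next_close + len(CLOSE)
            inside = False
```

The important design choice is that every ambiguous input raises: a build that cannot prove where internal content ends does not publish anything.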
What This Means for the User
- Predictable Behavior. If behavior is documented, it is stable within a major release. If documented as a limitation with an SQLSTATE, it will return exactly that SQLSTATE, rather than “sometimes working.”
- Safe Client Code. You can catch specific SQLSTATEs and build retry/error handling logic without heuristics or parsing text messages.
- Clear Upgrades. Changing the behavior of a public contract goes through an explicit deprecation cycle with migration steps; you see what will change and when, well in advance.
- Observable Guarantees. Every resource boundary has a metric; you can see utilization and rejects before they become incidents.
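As a sketch of what "Safe Client Code" can look like, the function below decides what to do from the SQLSTATE alone, without parsing message text. The two codes are the ones named on this page; the `Action` enum and the `classify` function are illustrative names, not part of any driver.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"                          # transient, safe to re-run
    REPORT_LIMITATION = "report_limitation"  # documented unsupported feature
    FAIL = "fail"                            # anything else: surface the error


def classify(sqlstate: str) -> Action:
    """Map a SQLSTATE to a client-side reaction."""
    if sqlstate == "40001":  # serialization failure
        return Action.RETRY
    if sqlstate == "0A000":  # feature_not_supported
        return Action.REPORT_LIMITATION
    return Action.FAIL
```

Drivers usually expose the code on the exception object (psycopg 3, for example, as the sqlstate attribute), so the result of classify can drive retry or reporting logic directly.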
What This Means for Developers and Contributors
- Compiler Contract, Not Review Comment. If an invariant can be encoded in a trait or type, it is encoded. Code review catches what the compiler cannot.
- Single Source of Truth per Knowledge Type. No need to “search for the current truth across multiple documents”: there is a single canonical owner for each layer of the contract.
- Predictable Debt Management. A deprecated contract is tracked in reference/known-issues.md, and if not resolved in one PR, in the technical debt registry with a status of scheduled <target-train>.
- Anti-drift in a Single PR. Change the behavior and update AngaraBook in the same PR. No “update docs later.”
What Is Not a Contract
To avoid false expectations, the following are explicitly not considered contracts:
- internal module names, filenames, and private core functions (can change during refactoring without deprecation);
- behavior of features with status: experimental frontmatter or --experimental-* CLI flags (they were never considered stable);
- error message texts (the contract is the SQLSTATE and its meaning, not the text);
- undocumented side effects noticed empirically (“it works for me if…”);
- benchmarks and numerical latency values: this is observability, not a performance claim (a performance claim requires a pinned benchmark, see tools/perf_pack/).
If you rely on any of these in production, it is technical debt on the client side, and a future AngaraBase update will expose it.
Links
- Project Principles §1 — Restrictive by Default — the foundation of the fail-closed approach.
- SQL Compatibility Overview — the SQL contract.
- Known Issues and SQLSTATE — public list of boundaries and deprecated/removed contracts.
- Configuration schema reference — the configuration contract.
- Architecture overview, API Boundaries — internal API contracts and layering.
- Observability metrics checklist — observability as part of the contract.
- WRITING_RULES.md — the documentation contract (anti-drift, deprecation policy).
Storage Engine
Goal
Explain how AngaraBase stores data on disk, what file formats are used, and what storage engines are available (or planned).
Pluggable Storage Architecture
AngaraBase uses a modular storage architecture: the storage engine is responsible for physical data placement, while upper layers (SQL, transactions, indexes) interact via a unified interface. This allows swapping different engines for different workloads without modifying the SQL layer.
The current engine is the Row-Store. Column-Store and In-Memory Engines are planned for the future.
Row-Store (Current Engine)
Page-Based Heap Storage
Data is stored in fixed-size pages (16 KB). Each table is represented by a set of heap pages, where rows are placed sequentially.
Slotted Pages
Each page is structured as a slotted page:
┌─────────────────────────────────────┐
│ Page Header (LSN, checksum, flags) │
├─────────────────────────────────────┤
│ Slot Array → [offset₁, offset₂…] │
│ (grows downwards ↓) │
│ │
│ free space │
│ │
│ (row data grows upwards ↑) │
│ Row₂ data │ Row₁ data │
└─────────────────────────────────────┘
- Header contains the LSN (log sequence number), checksum, page_type, and flags.
- Slot array: an array of pointers to rows within the page. This allows moving rows within the page without altering external references (TID = page_id + slot_id).
- Row data is written from the end of the page towards the beginning.
Page types (page_type): 0 = data (heap), 1 = index (reserved), 2 = meta (reserved), 3 = overflow (reserved).
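The slotted layout can be sketched in a few lines. This is a toy model under stated assumptions: the 24-byte header size and the 4-byte slot encoding (2-byte offset plus 2-byte length) are invented for the example; only the 16 KB page size and the two growth directions come from the description above.

```python
import struct

PAGE_SIZE = 16 * 1024
HEADER_SIZE = 24  # assumed size; the real header layout is not shown here
SLOT_SIZE = 4     # assumed encoding: 2-byte offset + 2-byte length


class SlottedPage:
    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)
        self.slot_count = 0
        self.data_start = PAGE_SIZE  # row data grows towards the header

    def free_space(self) -> int:
        slots_end = HEADER_SIZE + self.slot_count * SLOT_SIZE
        return self.data_start - slots_end

    def insert(self, row: bytes) -> int:
        """Store a row and return its slot_id."""
        if len(row) + SLOT_SIZE > self.free_space():
            raise ValueError("page full")
        self.data_start -= len(row)
        self.buf[self.data_start:self.data_start + len(row)] = row
        slot_pos = HEADER_SIZE + self.slot_count * SLOT_SIZE
        struct.pack_into("<HH", self.buf, slot_pos, self.data_start, len(row))
        self.slot_count += 1
        return self.slot_count - 1

    def read(self, slot_id: int) -> bytes:
        if not 0 <= slot_id < self.slot_count:
            raise IndexError("no such slot")
        slot_pos = HEADER_SIZE + slot_id * SLOT_SIZE
        offset, length = struct.unpack_from("<HH", self.buf, slot_pos)
        return bytes(self.buf[offset:offset + length])
```

Because callers hold only the slot_id, the page is free to move row bytes around (for example during compaction) as long as it rewrites the offset stored in the slot.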
Page Checksums
Each page is protected by a checksum (CRC32C). When reading a page from disk, the checksum is verified; if it doesn’t match, the server returns an error rather than serving corrupted data (fail-closed with diagnostics).
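A minimal sketch of checksum verification on read, assuming the checksum occupies the first four bytes of the page and covers everything after it (the real header layout is not specified here). CRC32C is implemented bit by bit for clarity; production code would use a table-driven or hardware-accelerated version.

```python
import struct


def crc32c(data: bytes) -> int:
    """CRC32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


class PageCorrupted(Exception):
    """Raised instead of returning data that failed verification."""


def seal_page(payload: bytes) -> bytes:
    # Assumed layout: 4-byte little-endian checksum followed by the payload.
    return struct.pack("<I", crc32c(payload)) + payload


def read_page(page: bytes) -> bytes:
    (stored,) = struct.unpack_from("<I", page, 0)
    payload = page[4:]
    actual = crc32c(payload)
    if actual != stored:
        raise PageCorrupted(f"checksum mismatch: stored={stored:#010x} actual={actual:#010x}")
    return payload
```

The error message carries both values, which matches the "fail-closed with diagnostics" behavior: the read is refused, and the operator still gets enough detail to investigate.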
File Formats
AngaraBase uses a per-database file model: each database consists of a pair of files.
| Extension | Purpose | Magic |
|---|---|---|
| .adb | Heap pages containing table data and indexes. Self-contained per-database storage file. | APG1 |
| .atl | Transaction log (WAL) for the specific database. Per-database WAL. | ADB1 |
AngaraTree indexes are stored inside the .adb file — page_type = 1 in the page header is reserved for them. There is no separate file for indexes.
Source of truth: crates/angarabase/src/on_disk.rs, angarabook/src/operations/upgrade-and-migration.md.
Data Directory Layout
The data directory is defined by the storage.data_directory setting. Typical layout:
data_directory/
├── VERSION # initialization marker (AVR1, 256 bytes, CRC32C)
├── base.adb # system database (SysCatalog) — heap pages
├── base.atl # WAL for the system database
├── mydb.adb # user DB — heap pages + index pages
├── mydb.atl # WAL for the user DB
└── …
WAL is not stored in separate segmented files (like wal_000001 in PostgreSQL). In AngaraBase, the WAL is a single .atl file per database, located in the same data_directory.
The storage.transaction_log_directory setting defines an alternative directory for .atl files (useful for placing WAL on a separate disk).
Key Settings
[storage]
data_directory = "/var/lib/angarabase/data"
transaction_log_directory = "/var/lib/angarabase/txlog"
More details on settings — Configuration.
Column-Store (Planned, v6)
A columnar engine based on an Arrow/Parquet-like format, oriented towards analytical queries (OLAP). Data is stored by columns, supporting compression and vectorized scans.
Status: not implemented, planned for the v6 roadmap.
In-Memory Engine (Planned, v5 — AngaraMemory)
An engine for storing data in RAM. Three modes are planned:
| Mode | Description |
|---|---|
| volatile | Data in memory only; lost upon restart. |
| logged | Writes are duplicated in the WAL; recovered upon restart. |
| snapshotted | Periodic snapshots to disk + WAL. |
Status: in development.
HTAP Direction
AngaraBase’s long-term strategy is HTAP (Hybrid Transactional/Analytical Processing):
- Row-Store serves OLTP (transactional workloads).
- Column-Store serves OLAP (analytics).
- Between them lies asynchronous replication: data from the row-store is converted into columnar format for analytical queries.
This will allow running analytics on fresh data without ETL pipelines and without affecting transactional performance.
Related Sections
Concepts (What to read next)
- Transactions and MVCC — how page versions relate to MVCC and WAL.
- Indexes — how B+tree pages fit into the tablespace.
- Catalog and Metadata — where physical table metadata is visible from SQL.
How-to (What to do)
- Configuration: storage, wal, checkpoint settings.
- Backup and Restore: how to transfer a datadir between instances.
- Crash recovery: storage behavior after an unexpected shutdown.
- Diagnostics: how to view IO/page-cache metrics.
Reference
- System Views sys.*: sys.tablespaces, sys.health for inspecting storage state.
- Known Issues and SQLSTATE: STORAGE_* error section.
Transactions and MVCC
AngaraBase ensures concurrent access to data via MVCC (Multi-Version Concurrency Control). Transactions guarantee atomicity of changes, while MVCC allows readers and writers to operate simultaneously without mutually blocking each other.
Transaction Basics
Transaction Management
BEGIN; -- start an explicit transaction
SAVEPOINT sp1; -- create a savepoint
ROLLBACK TO SAVEPOINT sp1; -- rollback to the savepoint
COMMIT; -- commit the transaction
ROLLBACK; -- rollback the entire transaction
Autocommit
By default, AngaraBase operates in autocommit mode: each individual SQL statement executes as a standalone transaction. If the statement completes successfully, the result is committed automatically; on error, it is rolled back.
For operations affecting multiple rows or tables, use explicit transactions (BEGIN / COMMIT) to group changes into a single atomic unit.
MVCC: Row Versioning
The core idea of MVCC is that readers do not block writers, and writers do not block readers. This is achieved by storing multiple versions of each row.
Version Metadata
Each row version contains two system fields:
| Field | Purpose |
|---|---|
| created_commit | Epoch (commit timestamp) when the version was created |
| deleted_commit | Epoch when the version was marked as deleted (∞ for active versions) |
Visibility Rule
A row version is visible to a transaction with snapshot S if both conditions are met:
- created_commit <= S: the version was created before or at the time of the snapshot
- The version is not deleted, or deleted_commit > S: the deletion occurred after the snapshot
Write Operations
- INSERT: creates a new row version with created_commit = current epoch
- UPDATE: does not modify the row in place. Instead, it marks the current version as deleted (deleted_commit = current epoch) and creates a new version with the updated data
- DELETE: marks the version as deleted (deleted_commit = current epoch)
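The visibility rule and the three write operations can be put together in a small in-memory model. This is a sketch for intuition only: every operation is treated as its own committed transaction, versions live in a plain list rather than in heap pages and an UNDO-log, and the class and method names are invented for the example.

```python
from dataclasses import dataclass

INFINITY = float("inf")


@dataclass
class RowVersion:
    key: int
    data: str
    created_commit: int
    deleted_commit: float = INFINITY  # infinity marks a live version


def is_visible(version: RowVersion, snapshot: int) -> bool:
    return version.created_commit <= snapshot and version.deleted_commit > snapshot


class VersionStore:
    def __init__(self):
        self.versions = []
        self.epoch = 0

    def _next_epoch(self) -> int:
        self.epoch += 1
        return self.epoch

    def _live(self, key: int) -> RowVersion:
        for v in self.versions:
            if v.key == key and v.deleted_commit == INFINITY:
                return v
        raise KeyError(key)

    def insert(self, key: int, data: str) -> None:
        self.versions.append(RowVersion(key, data, self._next_epoch()))

    def update(self, key: int, data: str) -> None:
        old = self._live(key)
        epoch = self._next_epoch()
        old.deleted_commit = epoch  # the old version ends here
        self.versions.append(RowVersion(key, data, epoch))

    def delete(self, key: int) -> None:
        old = self._live(key)
        old.deleted_commit = self._next_epoch()

    def read(self, key: int, snapshot: int):
        for v in self.versions:
            if v.key == key and is_visible(v, snapshot):
                return v.data
        return None
```

After an insert at epoch 1 and an update at epoch 2, a reader holding snapshot 1 still gets the original data while a reader at snapshot 2 gets the new data, which is why readers and writers do not need to wait for each other.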
Isolation Levels
Each transaction receives a snapshot — a fixed view of the data at a specific point in time.
READ COMMITTED (Default)
The snapshot is updated before each statement. The transaction sees all data committed before the start of the current statement. This is the recommended isolation level for most workloads.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
REPEATABLE READ
The snapshot is fixed at the time of BEGIN and remains unchanged until the end of the transaction. All statements within the transaction see the same state of the data.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
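The difference between the two levels comes down to when the snapshot is taken. The sketch below models only that decision; the `Database` and `Transaction` classes are hypothetical and reduce the snapshot to a single commit epoch.

```python
class Database:
    def __init__(self):
        self.commit_epoch = 0

    def commit(self) -> None:
        """Some other transaction commits and advances the epoch."""
        self.commit_epoch += 1


class Transaction:
    def __init__(self, db: Database, isolation: str):
        self.db = db
        self.isolation = isolation
        self.begin_snapshot = db.commit_epoch  # captured at BEGIN

    def snapshot_for_statement(self) -> int:
        if self.isolation == "READ COMMITTED":
            # Refreshed before every statement.
            return self.db.commit_epoch
        # REPEATABLE READ: fixed for the whole transaction.
        return self.begin_snapshot
```

If another session commits between two statements, the READ COMMITTED transaction sees the new epoch on its next statement, while the REPEATABLE READ transaction keeps reading as of its BEGIN.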
SERIALIZABLE
As of version 0.6.4.4, AngaraBase implements the full SERIALIZABLE isolation level (SSI).
In SERIALIZABLE mode, write-skew and phantom anomalies are prevented through SIREAD locks and tracking of read-write anti-dependencies.
Transactions that violate serializability are aborted with SQLSTATE 40001.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
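Because a serialization failure aborts the whole transaction, clients running at SERIALIZABLE are normally expected to re-run it. A minimal retry sketch follows; `SerializationFailure` is a local stand-in for whatever exception your driver raises for SQLSTATE 40001, and the attempt count and backoff values are arbitrary assumptions.

```python
import time


class SerializationFailure(Exception):
    """Stand-in for the driver exception carrying SQLSTATE 40001."""

    sqlstate = "40001"


def run_with_retry(txn_body, max_attempts: int = 5, base_delay: float = 0.01):
    """Run txn_body(); on 40001 retry the whole transaction with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The callable passed as txn_body should contain the complete transaction, from BEGIN to COMMIT, so that each retry starts from a fresh snapshot.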
Locks
Read Operations
Reading uses the MVCC snapshot and does not require locks. A reader never waits for a writer, and vice versa.
Write Operations (Lock-Free DML)
Writers use atomic operations (Compare-And-Swap) for row-level version installation, ensuring lock-free and conflict-free modifications. Traditional write locks are no longer held.
DDL Operations
Schema change operations (CREATE TABLE, ALTER TABLE, DROP TABLE) acquire table-level locks for the duration of the execution.
Deadlock Detection
AngaraBase detects deadlocks using:
- Timeout — if a transaction waits for a lock longer than a configured threshold, it is aborted
- Victim selection — upon detecting a cycle, the system chooses a victim transaction to rollback
- Deterministic lock ordering — an internal strategy for ordering locks to reduce the likelihood of deadlocks
Garbage Collection (AngaraGC)
Over time, the storage accumulates old row versions that are no longer visible to any active transaction. The AngaraGC subsystem is responsible for cleaning them up.
How It Works
- GC watermark: calculated as the minimum snapshot among all active transactions: min(active_snapshots)
- Row versions with deleted_commit < watermark are safe to delete: no active transaction can see them
- Cleanup is performed by a background process without pausing query execution
- Bounded slices: GC processes data in fixed-size chunks to avoid latency spikes
- Epoch Reaper: a background worker (since version 0.6.5.24) that prevents the GC watermark from stalling due to abruptly disconnected or hung sessions.
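The watermark rule and the bounded-slice behavior can be sketched as follows. Versions are plain dictionaries here, and falling back to the current epoch when no transaction is active is an assumption of the example, not a documented AngaraGC rule.

```python
def gc_watermark(active_snapshots, current_epoch: int) -> int:
    """Oldest snapshot any active transaction may still read from."""
    # Assumption: with no active transactions, use the current epoch.
    return min(active_snapshots, default=current_epoch)


def collect_slice(versions: list, watermark: int, slice_size: int) -> int:
    """Remove at most slice_size reclaimable versions; return how many."""
    reclaimable = [
        v for v in versions
        if v["deleted_commit"] is not None and v["deleted_commit"] < watermark
    ]
    victims = reclaimable[:slice_size]
    victim_ids = {id(v) for v in victims}
    versions[:] = [v for v in versions if id(v) not in victim_ids]
    return len(victims)
```

Calling collect_slice repeatedly with a small slice_size spreads the cleanup over many short steps, which is the point of bounded slices: no single step holds up foreground work for long.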
Difference from PostgreSQL
AngaraBase does not have autovacuum in the traditional sense. It uses a hybrid design with an epoch-based watermark (similar to Oracle/InnoDB), allowing for more precise control over the cleanup timing.
Recommendations
- Use READ COMMITTED (the default level) for most workloads
- Avoid long-running transactions — they hold back the GC watermark and prevent the cleanup of old row versions, which increases disk space consumption
- When using REPEATABLE READ, be aware of potential write skew. If strict serializability is needed, use the SERIALIZABLE level or explicit locks (SELECT ... FOR UPDATE)
MVCC State Upon Crash Recovery
Upon restarting after a crash, AngaraBase restores the MVCC state from the transaction log (WAL).
What Is Restored
- Committed transactions — transactions that managed to write a COMMIT to the WAL
- Aborted transactions — incomplete transactions are marked as aborted
- MVCC visibility — information about which row versions are visible for each commit epoch
- Transaction counters — current commit epoch and other counters
Recovery Process
- WAL scan — scanning the transaction log files in chronological order
- MVCC replay — restoring in-memory MVCC structures from WAL records
- Cleanup — marking uncompleted transactions as aborted
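The three steps above can be sketched as a single pass over the log. The record shape (txn_id, kind, payload) and the record kinds BEGIN, WRITE, and COMMIT are assumptions made for this illustration; the actual .atl record format is not described on this page.

```python
def recover(wal_records):
    """Replay WAL records in log order.

    Returns (state, committed, aborted): the key/value state produced by
    committed transactions, and the ids of committed and aborted ones.
    """
    pending = {}   # txn_id -> buffered writes of not-yet-committed txns
    state = {}
    committed = []
    for txn_id, kind, payload in wal_records:
        if kind == "BEGIN":
            pending[txn_id] = []
        elif kind == "WRITE":
            pending[txn_id].append(payload)
        elif kind == "COMMIT":
            # Only a COMMIT record makes the writes take effect.
            for key, value in pending.pop(txn_id):
                state[key] = value
            committed.append(txn_id)
    # Anything still pending never reached COMMIT: mark it aborted.
    aborted = sorted(pending)
    return state, committed, aborted
```

A transaction whose writes reached the log but whose COMMIT did not ends up in the aborted list, and none of its writes appear in the recovered state.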
Limitations
- Backend requirement: MVCC recovery only works with transaction_log.backend = "file_bin"
- Memory rebuild: The MVCC state is rebuilt in memory, which can take time for large WAL volumes
- Read-your-writes: Immediately after restart, uncompleted transactions are invisible (marked as aborted)
Monitoring Recovery
-- Check the recovery mode
SELECT recovery_mode FROM sys.identity;
-- Check system health after recovery
SELECT txn_commit_epoch_current FROM sys.health;
Possible recovery_mode values:
- "normal": normal start without recovery
- "crash_recovery": recovery after a crash
- "forced_takeover": forced instance lease takeover
Related Sections
- Storage Engine — storage architecture and page format
- Instance Lifecycle — instance lifecycle and crash recovery
- Crash Recovery — operational procedures for recovery
- SQL Reference — SQL statement syntax
Indexes
Goal
Explain what types of indexes are available in AngaraBase, when to use them, and how they interact with MVCC.
AngaraTree — index engine
AngaraTree is the index engine for AngaraBase. Indexes are stored as index pages inside the database's .adb file, alongside heap data; the .atl file holds only the transaction log.
B+tree (default)
The primary index type. Suitable for equality and range queries; keys are stored in deterministic order.
-- Creating a B+tree index (equivalent forms):
CREATE INDEX idx_name ON orders (customer_id);
CREATE INDEX idx_name ON orders USING btree (customer_id);
A B+tree index accelerates:
- Exact matches: WHERE customer_id = 42
- Ranges: WHERE created_at >= '2026-01-01' AND created_at < '2026-02-01'
- Sorting: ORDER BY customer_id
BRIN (Block Range Index)
A compact index for data with a natural order (append-only, time-series). BRIN stores min/max values for ranges of heap pages, allowing entire blocks to be skipped during scans.
CREATE INDEX idx_ts ON events USING brin (created_at);
Supported key types:
| Type | Aliases |
|---|---|
| INTEGER | int, int4 |
| BIGINT | int8 |
| DATE | — |
| TIMESTAMP | — |
| TIMESTAMPTZ | — |
How BRIN works: the index acts as an accelerator path — first it prunes blocks that do not contain the required values, then a heap fetch is performed with an MVCC predicate recheck. BRIN does not guarantee exactness — it only narrows the search area.
Efficiency metric: angara_brin_range_efficiency shows the fraction of blocks skipped thanks to BRIN. The closer to 1.0, the more efficient the index (data is well-clustered).
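A toy version of the prune-then-recheck path looks like this. For simplicity it keeps one min/max summary per page, whereas a real BRIN summarizes ranges of pages, and it ignores MVCC. The skipped fraction it returns is meant to mirror the idea behind angara_brin_range_efficiency; how the real metric is computed is an assumption.

```python
def build_brin(pages):
    """One (min, max) summary per non-empty page of key values."""
    return [(min(page), max(page)) for page in pages]


def brin_scan(pages, summary, lo, hi):
    """Return (matching values, fraction of pages skipped) for lo <= v <= hi."""
    matches = []
    skipped = 0
    for page, (page_min, page_max) in zip(pages, summary):
        if page_max < lo or page_min > hi:
            skipped += 1  # the whole page cannot contain a match
            continue
        # The summary only says "maybe": recheck every value on the page.
        matches.extend(v for v in page if lo <= v <= hi)
    return matches, skipped / len(pages)
```

With well-clustered data such as append-only timestamps, most pages fall entirely outside the requested range and are skipped. With randomly ordered data, almost every page overlaps the range and the index stops helping.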
Hash / Bloom
Reserved as optional/future index types. Not currently implemented.
Indexes and MVCC
An index stores TID references (page_id, slot_id) to rows in the heap. The visibility of a row is determined not by the index, but by the MVCC layer when reading the heap page:
- The query accesses the index → retrieves a set of TIDs.
- For each TID, the heap page is read.
- The MVCC layer checks the visibility of the row version for the current transaction.
Consequence: an index may contain references to invisible (obsolete) row versions. This is normal — such entries are filtered during the heap fetch.
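The lookup path above can be sketched with plain dictionaries. The data layout is invented for the example: the index maps a key to a list of (page_id, slot_id) TIDs, and each heap slot holds (created_commit, deleted_commit, data) with None meaning "not deleted".

```python
def index_lookup(index, heap, key, snapshot):
    """Return the row data visible at `snapshot` for `key`."""
    rows = []
    for page_id, slot_id in index.get(key, []):
        created, deleted, data = heap[page_id][slot_id]
        # The index is only a hint; visibility is decided on the heap.
        if created <= snapshot and (deleted is None or deleted > snapshot):
            rows.append(data)
    return rows
```

An index entry pointing at an obsolete version costs one extra heap read but never produces a wrong result, because the version fails the visibility check and is dropped.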
IndexStore — persistent secondary indexes
AngaraBase supports persistent secondary indexes for RowStore tables via IndexStore.
How it works
- CREATE INDEX builds the index via a full table scan (build_from_rows) and saves the result.
- DML (INSERT/DELETE) automatically updates all table indexes, fail-closed: if the index update fails, the heap mutation is rolled back.
- The optimizer uses the index for WHERE col = value queries (O(log N) instead of O(N) seq_scan).
Resource constraints
| Constraint | Config | On violation |
|---|---|---|
| Max pages per index | storage.max_index_pages_per_table | PageLimitExceeded → DML abort |
| Index maintenance time | storage.index_maintenance_budget_ms (default: 5000ms) | MaintenanceBudgetExceeded → DML abort |
Observability
| Metric | Description |
|---|---|
| angarabase_index_inserts_total | Total inserts into the index |
| angarabase_index_deletes_total | Total deletes from the index |
| angarabase_index_reject_total | DMLs rejected due to index errors |
| angarabase_index_maintenance_duration_ms | Histogram of index maintenance duration |
Current Limitations
| Limitation | Status |
|---|---|
| Single-column indexes only | Current version (v0 bound) |
| No partial indexes | Not supported (v4 scope) |
| No expression indexes | Not supported (v4 scope) |
| No covering indexes | Not supported (v4 scope) |
| Online index build (without DML lock) | Not supported (H1-v0.7.x) |
| WAL-first for index mutations | In-memory index: recovered via build_from_rows on recovery. Disk-backed WAL-first — in roadmap for future releases. |
Attempting to create an unsupported index returns SQLSTATE 0A000 (feature_not_supported).
When to create indexes
Recommended:
- On columns frequently used in WHERE, JOIN ON, ORDER BY.
- BRIN: on time-series columns of tables with append_only = true, where data is inserted in ascending order of the key.
Not recommended:
- On tables with a small number of rows (a full scan will be faster).
- On columns with very low selectivity (e.g., boolean flags).
- Creating many indexes on a single table slows down INSERT/UPDATE/DELETE.
Use EXPLAIN ANALYZE to check if the optimizer is using an index. For more details — Query processing.
Index Integrity Check
To perform an offline check of B+tree index integrity, use the angara_index_validate() function:
SELECT angara_index_validate('idx_name');
Running it after a crash or after a restore from backup is recommended.
Related Sections
Concepts (What to read next)
- Query Processing — how the optimizer selects and combines indexes.
- Storage Engine — how B+tree pages map onto the tablespace.
- Transactions and MVCC — why updating indexes under load requires MVCC visibility.
How-to (What to do)
- DDL: CREATE/DROP INDEX: syntax for creating and dropping indexes.
- Diagnostics: how to use EXPLAIN ANALYZE and sys.* to see if an index is used.
Reference
- Data Types: which types are supported as index keys.
- System Views sys.*: sys.indexes, sys.column_stats for coverage analysis.
Query Processing
Goal
Explain how AngaraBase processes SQL queries: from text to result. Useful for understanding EXPLAIN plans and performance diagnostics.
Pipeline Overview
Each SQL query goes through four stages:
SQL text ──▸ Parsing ──▸ Planning ──▸ Optimization ──▸ Execution ──▸ Result
1. Parsing: SQL → AST
The parser converts the query text into an Abstract Syntax Tree (AST). AngaraBase uses a PostgreSQL-compatible SQL dialect.
If the syntax is not supported, the server returns SQLSTATE 0A000 (feature_not_supported) with a description of the unsupported construct.
2. Planning: AST → Logical Plan
During the planning stage:
- Name resolution — matching table, column, and function names to catalog objects.
- Type checking — verifying types and applying automatic coercion if needed.
The result is a logical plan describing what needs to be done, but not how.
3. Optimization (AngaraPlan)
The AngaraPlan optimizer converts the logical plan into a physical one, choosing the most efficient execution strategy.
Cost-based optimizer (CBO): decisions are made based on statistics (AngaraStat) — row counts, value distributions, index availability.
Key optimizer decisions:
| Decision | Options |
|---|---|
| Access path | Full table scan, B+tree index scan, BRIN scan |
| Join method | Hash join, nested loop join |
| Join order | Reordering tables to minimize intermediate results |
Robust planning: the optimizer is resilient to estimation errors — with a significant discrepancy between estimated and actual rows, the plan remains viable (avoids worst-case behavior).
LEO (Learning Optimizer): feedback loop — after query execution, actual statistics are used to improve future estimates.
4. Execution (AngaraFlow)
The AngaraFlow executor runs the physical plan in an iterator/streaming model (Volcano): each operator requests the next row from its child operator.
Core Operators:
| Operator | Description |
|---|---|
| Scan | Reading rows from the heap (full scan) or an index (index scan) |
| Filter | Applying WHERE predicates |
| Hash Join | Joining via a hash table; Grace hash join for large datasets |
| Nested Loop | Joining via nested loops (for small tables or index lookups) |
| Group By | Aggregation (GROUP BY, HAVING) |
| Sort | Sorting; external sort for data that doesn’t fit in memory |
| Limit | Limiting the number of rows |
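The pull-based iterator model is easy to see with Python generators, where each operator asks its child for the next row only when it needs one. This is a sketch of the general Volcano idea, not AngaraFlow code; the operator functions and the stats dictionary are illustrative.

```python
def scan(table, stats):
    """Leaf operator: yields rows one at a time and counts reads."""
    for row in table:
        stats["rows_read"] += 1
        yield row


def filter_(child, predicate):
    """Passes through only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def limit(child, n):
    """Stops pulling from the child after n rows."""
    if n <= 0:
        return
    for produced, row in enumerate(child, start=1):
        yield row
        if produced >= n:
            return


# Roughly: SELECT * FROM orders WHERE customer_id = 0 LIMIT 2
stats = {"rows_read": 0}
orders = [{"id": i, "customer_id": i % 3} for i in range(1, 10)]
plan = limit(filter_(scan(orders, stats), lambda r: r["customer_id"] == 0), 2)
result = list(plan)
```

Because Limit stops pulling once it has two rows, the scan reads only as far as the second match instead of the whole table, which is the main practical benefit of a streaming executor for queries with LIMIT.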
AngaraStat (Statistics)
The optimizer uses statistics from system tables to estimate plan costs.
sys.table_stats
| Column | Description |
|---|---|
| row_count | Estimated number of rows in the table |
| mutation_epoch | Mutation counter (to identify stale statistics) |
sys.column_stats
| Column | Description |
|---|---|
| ndv | Number of distinct values (HyperLogLog) |
| min_value / max_value | Value range boundaries |
| null_count | Number of NULL values |
| histogram | Value distribution (equi-height histogram) |
| mcv | Most common values (and their frequencies) |
Managing Statistics Level
ALTER TABLE t SET (stats_level_max = 2);
| Level | What is gathered |
|---|---|
| 0 | Only row_count and mutation_epoch |
| 1 | + NDV, min/max, null_count |
| 2 | + Histograms, MCV (reservoir sampling) |
| 3 | + Extended statistics (reserved) |
A higher level gives the optimizer more information but increases statistics gathering time.
Query Diagnostics
EXPLAIN
Viewing the execution plan without executing the query:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
Variants:
EXPLAIN ANALYZE SELECT ...; -- executes the query, shows actual rows/time
EXPLAIN (BUFFERS) SELECT ...; -- + page read statistics
EXPLAIN (FORMAT JSON) SELECT ...; -- JSON output
System Views
| View | Description |
|---|---|
| angara_stat_activity | Currently active queries |
| angara_stat_statements | Aggregated statistics by query type |
| angara_top_queries | Top queries by execution time |
Slow Query Log
Enabled via environment variables:
ANGARABASE_LOG_MIN_DURATION_MS=100 # log queries taking longer than 100 ms
ANGARABASE_LOG_QUERY_TEXT=1 # include the query text in the log
More details — Diagnostics.
Future
Planned query processing improvements:
| Component | Description |
|---|---|
| AngaraVector | Vectorized/SIMD execution — processing batches by column instead of row-at-a-time |
| AngaraParallel | Morsel-driven parallelism — parallel execution across multiple cores |
| AngaraAdapt | Adaptive processing — switching strategies during execution |
Related Sections
Concepts (What to read next)
- Indexes: how the optimizer chooses an index for SELECT/UPDATE.
- Catalog and Metadata: where statistics used by the planner are stored.
- Transactions and MVCC: how the isolation level affects the execution plan.
How-to (What to do)
- Diagnostics: EXPLAIN ANALYZE, slow-query log, how to read a plan.
- Tracing: distributed tracing of parse → plan → execute phases.
- Logging: ANGARABASE_LOG_QUERY_TEXT, ANGARABASE_LOG_MIN_DURATION_MS.
Reference
- SQL Overview: what parts of the standard are supported at the planner level.
- System Views sys.*: sys.queries, sys.column_stats, sys.health.
Catalog and Metadata
AngaraBase stores metadata for all database objects in a central registry — the SysCatalog. For users, access to metadata is provided via the sys.* system views and introspection functions.
Object Hierarchy
Database objects are organized in a strict hierarchy:
Instance → Database → Schema → Table → Column
Each level has a unique identifier in the SysCatalog. Tables belong to schemas, schemas to databases, and databases to the instance.
SysCatalog
SysCatalog is the central metadata registry. It stores information about:
- tables, columns, and their data types
- indexes and constraints
- users, roles, and privileges
- functions and aggregates
- security policies (RLS)
- statistics for the query optimizer
The SysCatalog is updated during DDL operations (CREATE, ALTER, DROP) and when statistics are gathered.
System Views (sys.*)
System views are read-only and do not require special privileges (with the exception of security-related views).
Instance Identification and State
| View | Description |
|---|---|
| sys.identity | Instance identity: version, instance_id |
| sys.health | Server health status |
| sys.settings | Effective configuration (name, value). Secrets are not exposed |
Data Structure
| View | Description |
|---|---|
| sys.tables | All tables with metadata (schema, name, type, row count) |
| sys.columns | Columns for each table: column_name, data_type, nullable, etc. |
Statistics
| View | Description |
|---|---|
| sys.table_stats | Table-level statistics: stats_level_max, last_committed_rowid, mutation epochs |
| sys.column_stats | Column-level statistics: ndv_approx, min/max, null_count, histograms, MCV |
Security and Access
| View | Description |
|---|---|
| sys.users | User accounts |
| sys.roles | Roles |
| sys.user_roles | Assignments of users to roles |
| sys.role_privileges | Role privileges |
| sys.object_grants | Object-level grants |
| sys.my_privileges | Current user’s privileges |
| sys.security_policies | RLS policies |
| sys.audit_log | Audit log |
Introspection Functions
AngaraBase provides a set of built-in functions for programmatic access to metadata.
Roles and Privileges
| Function | Purpose |
|---|---|
| angara_user_roles() | Current user’s roles |
| angara_role_privileges() | Privileges of a given role |
| angara_user_privileges() | Effective user privileges |
| angara_object_privileges() | Privileges on a specific object |
| angara_has_privilege() | Check for a specific privilege |
Security (RLS, audit, break-glass)
| Function | Purpose |
|---|---|
| angara_table_policies() | RLS policies for a table |
| angara_is_rls_active() | Whether RLS is active for a table |
| angara_effective_rls_predicate() | The final RLS predicate for the current user |
| angara_break_glass_status() | Status of the break-glass session |
| angara_audit_verify_chain() | Verify the integrity of the audit chain |
Diagnostics and Performance
| Function / View | Purpose |
|---|---|
| angara_stat_activity | Active sessions and running queries |
| angara_stat_statements | Aggregated statistics for executed queries |
| angara_top_queries() | Top queries by resource consumption |
| angara_stat_statements_reset() | Reset query statistics |
Practical Examples
Instance Information
SELECT * FROM sys.identity;
Checking Server Health
SELECT * FROM sys.health;
List All Tables
SELECT * FROM sys.tables;
Checking Security Settings
SELECT name, value FROM sys.settings WHERE name LIKE 'security.%';
Viewing Table Column Statistics
SELECT column_name, ndv_approx, null_count
FROM sys.column_stats
WHERE table_name = 'orders';
Checking Current User Privileges
SELECT * FROM sys.my_privileges;
SELECT angara_has_privilege('orders', 'SELECT');
Related Sections
- Quickstart (sys.* examples) — first steps with system views
- Security — security introspection functions
- Diagnostics — monitoring and troubleshooting
- Query Processing — how the optimizer uses catalog metadata
Instance Lifecycle
This document explains the conceptual model of AngaraBase instance identity, lifecycle, and the Instance Lease system that enables safe crash recovery and storage portability.
Instance Identity
Each AngaraBase instance has a unique identity established during initialization:
Core Identity Components
- cluster_id: UUID identifying the logical database cluster
- instance_id: UUID identifying this specific instance
- Data directory: Physical location of database files
- Transaction log directory: Physical location of WAL files
Identity Persistence
Identity is stored in two places:
- VERSION marker: Binary file with format version and IDs
- System catalog pages: In base.adb reserved pages with full metadata
Instance Lease System
The Instance Lease prevents multiple instances from accessing the same data files simultaneously, which would cause corruption.
Lease Structure
pub struct InstanceLeaseV0 {
    pub holder_id: String,          // UUID of owning instance
    pub acquired_at_unix_s: u64,    // When lease was taken
    pub expires_at_unix_s: u64,     // When lease expires (TTL)
    pub holder_pid: u32,            // Process ID (diagnostic)
    pub holder_hostname: String,    // Hostname (diagnostic)
}
Lease State Machine
[None] ──acquire──> [Held] ──heartbeat──> [Held]
   ↑                   │                     │
   │                   │                     │
   └──expired/release──┘                     │
                                             │
[Expired] <──────────timeout─────────────────┘
    │
    └──takeover──> [Held by new instance]
State Transitions
- None → Held: First instance startup or after graceful shutdown
- Held → Held: Periodic heartbeat updates (every 10s by default)
- Held → None: Graceful shutdown releases lease immediately
- Held → Expired: Heartbeat stops (crash, network partition)
- Expired → Held: New instance takes over after TTL expiration
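The transitions above can be sketched as a few pure functions over the timing fields of the lease. This is a minimal illustration under stated assumptions, not the engine's implementation: the Lease struct is a reduced copy of InstanceLeaseV0, the names LeaseState, acquire, heartbeat, and lease_state are hypothetical, and the sketch assumes that a heartbeat sets the expiry to "now + TTL" and that a lease counts as expired once the current time reaches expires_at_unix_s.

```rust
/// Reduced copy of InstanceLeaseV0 with only the timing-related fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Lease {
    pub holder_id: String,
    pub acquired_at_unix_s: u64,
    pub expires_at_unix_s: u64,
}

/// The three states from the diagram. The "None" state is called NoLease
/// here to avoid confusion with Option::None.
#[derive(Debug, PartialEq)]
pub enum LeaseState {
    NoLease,
    Held,
    Expired,
}

/// None -> Held: create a lease that is valid for ttl_s seconds.
pub fn acquire(holder_id: &str, now_unix_s: u64, ttl_s: u64) -> Lease {
    Lease {
        holder_id: holder_id.to_string(),
        acquired_at_unix_s: now_unix_s,
        expires_at_unix_s: now_unix_s + ttl_s,
    }
}

/// Held -> Held: a heartbeat pushes the expiry forward by one TTL (assumption).
pub fn heartbeat(lease: &mut Lease, now_unix_s: u64, ttl_s: u64) {
    lease.expires_at_unix_s = now_unix_s + ttl_s;
}

/// Classify a stored lease at a given point in time.
pub fn lease_state(lease: Option<&Lease>, now_unix_s: u64) -> LeaseState {
    match lease {
        None => LeaseState::NoLease,
        Some(l) if now_unix_s >= l.expires_at_unix_s => LeaseState::Expired,
        Some(_) => LeaseState::Held,
    }
}
```

With the default timing (TTL 30s, heartbeat every 10s), a lease acquired at t=1000 and renewed at t=1010 stays Held until t=1040 in this model.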
Lease Storage
- Location: Stored in SysCatalogMetaV0 within base.adb pages
- Persistence: Atomic updates with full page images
- Reliability: Works on NFS/SAN where flock() is unreliable
Startup Sequence
Phase 1: Pre-flight Checks
- Verify data directory exists and is initialized
- Check VERSION marker compatibility
- Validate page size matches compiled binary
Phase 2: Lease Acquisition
- Load system catalog from base.adb
- Check existing lease status:
- No lease: Acquire immediately
- Expired lease: Take over with warning
- Active lease: Fail with informative error
- Force takeover: Override active lease (dangerous)
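The four outcomes of the lease check can be written as a single decision function. The sketch below is illustrative only: StartupDecision and decide_startup are hypothetical names, the Lease struct keeps just two fields of InstanceLeaseV0, and the expiry rule (expired once now >= expires_at_unix_s) is an assumption.

```rust
/// Reduced view of InstanceLeaseV0 (only the fields this sketch needs).
#[derive(Debug, Clone, PartialEq)]
pub struct Lease {
    pub holder_id: String,
    pub expires_at_unix_s: u64,
}

/// Hypothetical outcome of the Phase 2 lease check.
#[derive(Debug, PartialEq)]
pub enum StartupDecision {
    /// No lease stored: acquire immediately.
    Acquire,
    /// Stored lease has expired: take over (with a warning).
    TakeOverExpired,
    /// Lease is still active but the operator forced a takeover (dangerous).
    ForcedTakeover,
    /// Lease is still active: refuse to start.
    Refuse { held_by: String },
}

pub fn decide_startup(existing: Option<&Lease>, now_unix_s: u64, force: bool) -> StartupDecision {
    match existing {
        None => StartupDecision::Acquire,
        Some(l) if now_unix_s >= l.expires_at_unix_s => StartupDecision::TakeOverExpired,
        Some(_) if force => StartupDecision::ForcedTakeover,
        Some(l) => StartupDecision::Refuse { held_by: l.holder_id.clone() },
    }
}
```

The order of the match arms matters: an expired lease is taken over normally even when the force flag is set, so the forced path in this sketch is only reached when the lease is still active.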
Phase 3: Recovery
- WAL Recovery: Replay transaction log (file_bin backend)
- MVCC Recovery: Restore in-memory transaction state
- Heartbeat Start: Begin periodic lease renewal
Phase 4: Ready for Connections
- Start protocol listeners (pgwire, admin)
- Begin accepting client connections
- Continue heartbeat until shutdown
Recovery Modes
AngaraBase tracks the recovery mode for operational visibility:
Normal Startup
- Clean start on existing, properly shut down data
- No WAL replay required
- recovery_mode = "normal"
Crash Recovery
- Previous instance terminated unexpectedly
- WAL replay recovers committed transactions
- MVCC state rebuilt from transaction log
- recovery_mode = "crash_recovery"
Forced Takeover
- Operator used ANGARABASE_FORCE_LEASE_TAKEOVER=1
- May indicate emergency recovery scenario
- recovery_mode = "forced_takeover"
Shared Storage Scenarios
The Instance Lease system enables AngaraBase to work correctly on shared storage where multiple hosts can access the same files.
NFS/SAN Deployment
Host A ──┐
         ├── NFS/SAN ──> [data/] [txlog/]
Host B ──┘               [base.adb with lease]
Benefits
- Failover: Host B can take over if Host A crashes
- Maintenance: Move instance between hosts without dump/restore
- Testing: Run against production data copies safely
Limitations
- Single writer: Only one instance can write at a time
- Network partitions: May cause false lease expiration
- Performance: Network storage latency affects throughput
File Copy Scenarios
For non-shared storage, manual file copy enables:
- Backup testing: Verify backup integrity on different host
- Development: Use production data copy for debugging
- Migration: Move to new hardware without downtime
Configuration
Lease Timing
- ANGARABASE_LEASE_TTL_S: How long lease lasts (default: 30s)
- ANGARABASE_LEASE_HEARTBEAT_S: Renewal frequency (default: 10s)
Safety Controls
- ANGARABASE_FORCE_LEASE_TAKEOVER: Emergency override (default: false)
Recommended Settings
# Production: Longer TTL for network stability
export ANGARABASE_LEASE_TTL_S=60
export ANGARABASE_LEASE_HEARTBEAT_S=20
# Development: Shorter TTL for faster iteration
export ANGARABASE_LEASE_TTL_S=15
export ANGARABASE_LEASE_HEARTBEAT_S=5
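How much slack a TTL/heartbeat pair leaves can be estimated with simple arithmetic. The sketch below is an illustration under assumptions, not documented AngaraBase validation: the function name missed_heartbeats_tolerated is hypothetical, and it assumes that each successful heartbeat extends the lease by one full TTL and that the lease is lost as soon as the TTL has elapsed since the last renewal.

```rust
/// Returns how many consecutive heartbeats may fail before the lease
/// expires, or None if the pair is unusable (heartbeat interval is zero
/// or not shorter than the TTL).
pub fn missed_heartbeats_tolerated(ttl_s: u64, heartbeat_s: u64) -> Option<u64> {
    if heartbeat_s == 0 || heartbeat_s >= ttl_s {
        return None;
    }
    // After a renewal at time t, heartbeats are due at t + k * heartbeat_s.
    // Only attempts with k * heartbeat_s < ttl_s can still save the lease;
    // there are (ttl_s - 1) / heartbeat_s of them, and one must succeed.
    Some((ttl_s - 1) / heartbeat_s - 1)
}
```

Under these assumptions the default pair (30s/10s) and both recommended pairs (60s/20s and 15s/5s) tolerate exactly one missed heartbeat; a pair such as 30s/7s would tolerate three.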
Monitoring and Observability
Instance Status
-- Check current lease holder
SELECT lease_holder_id, lease_holder_hostname,
lease_expires_at, recovery_mode
FROM sys.identity;
-- Check system health
SELECT uptime_seconds, txn_commit_epoch_current
FROM sys.health;
Lease Events
AngaraBase logs lease events to stderr:
Instance lease acquired: holder=abc123...
Instance lease taken over: holder=def456...
Warning: lease heartbeat failed: I/O error
Instance lease released: holder=abc123...
Metrics Integration
Future versions will expose lease metrics via:
- Prometheus metrics endpoint
- sys.metrics virtual table
- Structured logging output
Security Considerations
Access Control
- Lease system does NOT provide authentication
- File system permissions still required
- Network access controls recommended for shared storage
Audit Trail
- Lease changes logged with timestamps
- Instance identity tracked in sys.identity
- Recovery mode visible for forensics
Troubleshooting
Common Issues
“Cannot start: database files are owned by another instance”
- Diagnosis: Active lease prevents startup
- Resolution: Wait for expiration or verify other instance is dead
Frequent lease takeovers
- Diagnosis: Network instability or resource contention
- Resolution: Increase TTL, check network/disk performance
“MVCC recovery failed”
- Diagnosis: Corrupted transaction log
- Resolution: Check filesystem, restore from backup if needed
Debug Information
-- Instance identity and lease
SELECT * FROM sys.identity;
-- Recent recovery statistics
SELECT * FROM sys.health;
-- Transaction log status
SELECT * FROM sys.settings WHERE name LIKE 'transaction_log.%';
Related Sections
Concepts (What to read next)
- Storage Engine: the datadir that the instance lease protects.
- Transactions and MVCC: what happens to active transactions during forced_takeover.
How-to (What to do)
- Crash recovery: operational procedures for recovery after a crash.
- Configuration: variables instance_lease.*, recovery.*.
- Backup and Restore: how to protect datadir and take a snapshot.
Reference
- System views sys.*: sys.identity, sys.health for lease state diagnostics.
- Known issues and SQLSTATE: INSTANCE_* error section.
SQL compatibility overview
Goal
Understand AngaraBase’s compatibility model with PostgreSQL and quickly interpret errors during testing.
Approach
AngaraBase implements a limited subset of PostgreSQL SQL via the pgwire protocol. Unsupported constructs return an explicit SQLSTATE (most often 0A000 feature_not_supported), rather than a silent incorrect result.
Principle: fail-closed — if a feature is not implemented, the client receives a deterministic error.
Source of truth
| Source | Path |
|---|---|
| Known issues (canonical) | angarabook/src/operations/known-issues.md |
| Compat probes (tests) | tools/compat_suite/run.sh |
| User-facing known issues | Known issues |
Practical advice
- ORM/tooling (DBeaver, psql, Hibernate, etc.) often run pg_catalog queries upon connection. Rely on the compat suite results in --dbeaver-smoke / --nightly modes.
- If you observe a hang/stall, this is a P0/P1 bug. Collect artifacts following the instructions in Support.
SQLSTATE quick reference
| SQLSTATE | Name | Typical scenario |
|---|---|---|
| 0A000 | feature_not_supported | WITH RECURSIVE, complex RLS predicates, multi-column ON CONFLICT target, ON CONFLICT ON CONSTRAINT |
| 23514 | check_violation | Partition routing: no matching partition and no DEFAULT |
| 42809 | wrong_object_type | UPDATE/DELETE on append-only table; PK/FK update under no_delete |
| 22023 | invalid_parameter_value | Invalid stats_level_max, invalid break-glass TTL, setval value out of [MINVALUE..MAXVALUE] |
| 42501 | insufficient_privilege | Missing roles for security operations, no SecurityContext |
| 25001 | active_sql_transaction | SET SESSION CONTEXT inside an active transaction |
| 2200H | sequence_generator_limit_exceeded | nextval past MAXVALUE/MINVALUE without CYCLE (RM-0.6.3.7) |
| 55000 | object_not_in_prerequisite_state | currval(seq) before any nextval in the current session (RM-0.6.3.7, session-bound contract) |
| 42P07 | duplicate_object | CREATE SEQUENCE of an existing name without IF NOT EXISTS |
| 42P01 | undefined_table | DROP SEQUENCE / nextval / currval / setval on a missing sequence |
| 428C9 | generated_always | INSERT with explicit non-NULL into a GENERATED ALWAYS AS IDENTITY column |
Full list with context: Known issues.
SQL reference pages
Current status of SQL subsystems on the 0.6.x branch. Badges reflect how well the subsystem is covered by pinned compat-suite tests and whether it can be relied upon in production scenarios.
| Topic | Page | Status |
|---|---|---|
| Data types | data-types.md | Stable |
| DDL (CREATE, ALTER, DROP) | ddl.md | Stable |
| DML (INSERT, UPDATE, DELETE) | dml.md | Stable |
| Queries (CTE, JOIN, ORDER BY) | queries.md | Stable |
| Table partitioning | partitioning.md | Baseline |
SQL Functions (RM-0.6.5.5)
AngaraBase implements a subset of PostgreSQL built-in functions.
| Function | Signature | Description |
|---|---|---|
| NOW() | () → timestamp | Current UTC time |
| CURRENT_TIMESTAMP | () → timestamp | Alias for NOW() |
| date_trunc(field, ts) | (text, timestamp) → timestamp | Truncate timestamp to field. Supported fields: microseconds, milliseconds, second, minute, hour, day, week, month, quarter, year, decade, century, millennium. Unknown field returns NULL. |
| set_config(name, val, local) | (text, text, bool) → text | Stub: returns val. Used for Django compatibility. |
| obj_description(oid, cat) | (oid, text) → text | Stub: returns NULL. Used for Django introspection. |
Vector execution visibility
When the execution mode permits the vector path (ANGARABASE_SQL_EXECUTION_MODE=auto for supported plans, or force_vector), EXPLAIN surfaces vector operator names:
- VectorSeqScan
- VectorFilter
- VectorProject
- VectorHashJoin
- VectorAgg
If the plan shape is not supported by AngaraVector, planner/executor keeps row operators in EXPLAIN.
Execution mode behavior:
- auto (default), Stable: use vector only for fully supported plan shapes, otherwise deterministic row fallback.
- force_row, Stable: always use the row executor.
- force_vector, Experimental: fail-closed with feature_not_supported if vector execution is not possible. Suitable for targeted testing of the vector path; not recommended for production workloads in 0.6.x.
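The three modes reduce to a small decision table. The following sketch only illustrates the documented behavior; the names ExecutionMode, Executor, parse_mode, and choose_executor, as well as the plan_vectorizable flag, are hypothetical and do not describe the planner's internal API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExecutionMode {
    Auto,
    ForceRow,
    ForceVector,
}

#[derive(Debug, PartialEq)]
pub enum Executor {
    Row,
    Vector,
}

/// Parses the value of ANGARABASE_SQL_EXECUTION_MODE.
pub fn parse_mode(value: &str) -> Option<ExecutionMode> {
    match value {
        "auto" => Some(ExecutionMode::Auto),
        "force_row" => Some(ExecutionMode::ForceRow),
        "force_vector" => Some(ExecutionMode::ForceVector),
        _ => None,
    }
}

/// Picks an executor, or returns SQLSTATE 0A000 when force_vector is set
/// and the plan shape cannot be vectorized.
pub fn choose_executor(mode: ExecutionMode, plan_vectorizable: bool) -> Result<Executor, &'static str> {
    match (mode, plan_vectorizable) {
        (ExecutionMode::ForceRow, _) => Ok(Executor::Row),
        (ExecutionMode::Auto, true) | (ExecutionMode::ForceVector, true) => Ok(Executor::Vector),
        (ExecutionMode::Auto, false) => Ok(Executor::Row),
        (ExecutionMode::ForceVector, false) => Err("0A000"),
    }
}
```

Only force_vector can produce an error here; auto never fails because of vectorization, it silently falls back to the row executor.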
AngaraMemory table options
CREATE TABLE ... WITH (...) now supports bounded memory-table surfaces:
- storage='memory' (Baseline): selects AngaraMemory table engine.
- durability='none'|'logged'|'snapshotted' (Baseline): volatile/durable behavior.
- max_rows=<n> (Stable): hard row-cap; overflow is fail-closed (54023).
- eviction_policy='error'|'fifo' (Experimental): default error; fifo is opt-in.
- checkpoint_interval_ms=<n> (Baseline): valid only with durability='snapshotted'.
If max_rows is omitted for memory tables, bounded default is applied from instance policy.
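The interaction between max_rows and eviction_policy can be modeled with a bounded queue. This is a toy model, not the AngaraMemory engine: MemoryTable and its methods are hypothetical names, rows are reduced to single integers, and the sketch assumes that fifo evicts the oldest row to make room for the new one.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EvictionPolicy {
    Error,
    Fifo,
}

pub struct MemoryTable {
    rows: VecDeque<i64>,
    max_rows: usize,
    policy: EvictionPolicy,
}

impl MemoryTable {
    /// max_rows is clamped to at least 1 so the cap is always meaningful.
    pub fn new(max_rows: usize, policy: EvictionPolicy) -> Self {
        MemoryTable { rows: VecDeque::new(), max_rows: max_rows.max(1), policy }
    }

    /// Inserts a row. Returns the evicted row (if any), or SQLSTATE 54023
    /// when the table is full and the policy is Error.
    pub fn insert(&mut self, row: i64) -> Result<Option<i64>, &'static str> {
        let mut evicted = None;
        if self.rows.len() >= self.max_rows {
            match self.policy {
                EvictionPolicy::Error => return Err("54023"),
                EvictionPolicy::Fifo => evicted = self.rows.pop_front(),
            }
        }
        self.rows.push_back(row);
        Ok(evicted)
    }

    pub fn len(&self) -> usize {
        self.rows.len()
    }
}
```

In both policies the row count never exceeds max_rows; the policies differ only in whether the overflowing insert fails or displaces an older row.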
Behavior clarification for durability='snapshotted':
- DML hot path does not perform immediate page persistence.
- Checkpoint worker applies per-table scheduling using each table’s checkpoint_interval_ms (with bounded fallback).
Links
- Known issues: Known issues
- Support: Support
- Security model: Security / authorization
Data types
Goal
Reference for supported AngaraBase data types and type casting rules.
Supported types
| SQL type | Alias | Storage | Notes |
|---|---|---|---|
| INTEGER | INT, INT4 | 32-bit signed | Primary numeric type |
| BIGINT | INT8 | 64-bit signed | Large counters, IDs |
| VARCHAR(n) | - | Variable-length text | Bounded by n characters |
| TEXT | - | Variable-length text | Unbounded text |
| BOOLEAN | BOOL | 1-byte | TRUE / FALSE / NULL (OID 16) |
| TIMESTAMP | - | Text-backed compat | ISO 8601 UTC (OID 1114) |
| DATE | - | Text-backed compat | ISO 8601 date portion (OID 1082) |
Text-backed temporal types
TIMESTAMP and DATE are stored in text-backed compatibility mode. This means:
- Comparisons are performed as text (lexicographical); ISO 8601 format guarantees correct ordering.
- Serialization (RM-0.6.5.5): TIMESTAMP is always serialized in UTC without offset (e.g., 2026-05-07 14:30:00.123). Trailing microsecond zeros are truncated.
- Arithmetic (INTERVAL etc.) is not supported: 0A000.
- BRIN indexes on date / timestamp / timestamptz columns are supported (using text min/max).
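Why plain text comparison is enough for these values can be checked directly: in the zero-padded ISO 8601 layout, byte-wise string order coincides with chronological order, including for fractional seconds with truncated trailing zeros. The helper name sort_as_text below is hypothetical and the snippet is only a demonstration of the ordering property, not engine code.

```rust
/// Sorts text-backed timestamps exactly as a plain string comparison would.
pub fn sort_as_text(mut values: Vec<&str>) -> Vec<&str> {
    values.sort();
    values
}

/// Counter-example helper: without zero padding, text order and
/// chronological order disagree ("2026-9-30" sorts after "2026-10-01").
pub fn unpadded_sorts_wrong() -> bool {
    "2026-9-30" > "2026-10-01"
}
```

For example, the values 2026-05-07 14:30:00, 2026-05-07 14:30:00.123, and 2026-05-07 14:30:00.5 sort in that order as text, which is also their chronological order.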
Planned types
| SQL type | Status |
|---|---|
| DECIMAL / NUMERIC | Planned |
| UUID | Planned |
Attempting to use an unsupported type will result in a parser error or 0A000 feature_not_supported.
NULL handling
- All types allow NULL unless the column is declared as NOT NULL.
- In ORDER BY ASC, NULL values are treated as the largest (appear last).
- Explicit NULLS FIRST / NULLS LAST control is not supported: 0A000.
Type casting
AngaraBase supports PostgreSQL-syntax type casting:
SELECT '42'::INTEGER;
SELECT id::TEXT FROM t;
Casting between incompatible types results in a runtime error with the corresponding SQLSTATE.
Expected SQLSTATE
| Situation | SQLSTATE |
|---|---|
| Unsupported type in DDL | 0A000 |
| INTERVAL arithmetic | 0A000 |
| NULLS FIRST / NULLS LAST | 0A000 |
| Invalid cast | Runtime error |
Links
- SQL compatibility overview: overview.md
- DDL (CREATE TABLE with types): ddl.md
- Known issues: Known issues
DDL — Data Definition Language
Goal
Reference for supported DDL operations: creating, altering, and dropping tables, indexes, and constraints.
CREATE TABLE
Basic form
CREATE TABLE t (
id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
v BIGINT
);
With table options
CREATE TABLE events (
id INTEGER PRIMARY KEY,
ts TIMESTAMP NOT NULL,
data TEXT
) WITH (append_only = true);
CREATE TABLE ledger (
id INTEGER PRIMARY KEY,
amount BIGINT,
ref_id INTEGER
) WITH (mutation_policy = 'no_delete');
Constraints
- PRIMARY KEY: required for every table (exactly one).
- NOT NULL: column does not accept NULL.
- FOREIGN KEY ... NOT ENFORCED: declarative FK without runtime checking.
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
parent_id INTEGER NOT NULL,
FOREIGN KEY (parent_id) REFERENCES parents (id) NOT ENFORCED
);
Enforced foreign keys are not supported — attempting to create an FK without NOT ENFORCED will return 0A000.
ALTER TABLE
Table-level options
ALTER TABLE t SET (append_only = true);
ALTER TABLE t SET (append_only = false);
ALTER TABLE t SET (mutation_policy = 'no_delete');
ALTER TABLE t SET (mutation_policy = 'unrestricted');
ALTER TABLE t SET (stats_level_max = 2);
ALTER TABLE t SET (stats_reservoir_size = 1000);
Column-level options
ALTER TABLE t ALTER COLUMN v SET (stats_level_max = 1);
Table options reference
| Option | Values | Default | Description |
|---|---|---|---|
| append_only | true / false | false | Reject UPDATE/DELETE (SQLSTATE 42809) |
| mutation_policy | unrestricted / append_only / no_delete | unrestricted | Fine-grained mutation control |
| stats_level_max | 0–3 | 0 | Max statistics collection level |
| stats_reservoir_size | ≥ 1 | Engine default | Reservoir sample size for Level 2 stats |
append_only = true is equivalent to mutation_policy = 'append_only'.
DROP TABLE
DROP TABLE t;
With DROP TABLE, owned sequences (SERIAL/IDENTITY) are dropped together with the table, even without CASCADE (PostgreSQL behavior).
CREATE / ALTER / DROP SEQUENCE
AngaraBase supports first-class sequence objects (SEQUENCE) — RM-0.6.3.7, RFC-2026-497. They are persisted in sys_catalog, survive server restart, and back SERIAL/BIGSERIAL and GENERATED [ALWAYS|BY DEFAULT] AS IDENTITY.
CREATE SEQUENCE
CREATE SEQUENCE s1; -- start=1, inc=1, no upper bound
CREATE SEQUENCE s2 START WITH 100 INCREMENT BY 5;
CREATE SEQUENCE s3 MINVALUE 1 MAXVALUE 999 CYCLE; -- cyclic counter
CREATE SEQUENCE IF NOT EXISTS s1; -- idempotent
Options (any order):
START WITH n, INCREMENT BY n, MINVALUE n / NO MINVALUE, MAXVALUE n / NO MAXVALUE, CYCLE / NO CYCLE.
ALTER SEQUENCE
ALTER SEQUENCE s1 INCREMENT BY 10;
ALTER SEQUENCE s1 RESTART WITH 1; -- reset counter
ALTER SEQUENCE s1 MAXVALUE 1000 NO CYCLE;
ALTER SEQUENCE t_id_seq OWNED BY t.id; -- bind to column
ALTER SEQUENCE s1 OWNED BY NONE; -- unbind
DROP SEQUENCE
DROP SEQUENCE s1;
DROP SEQUENCE IF EXISTS s1;
DROP SEQUENCE s1 CASCADE;
Function behavior
See dml.md — “Sequence functions” section (nextval, currval, setval).
SQLSTATE
| Situation | SQLSTATE |
|---|---|
| CREATE SEQUENCE with an existing name without IF NOT EXISTS | 42P07 |
| DROP SEQUENCE for a non-existent name without IF EXISTS | 42P01 |
| ALTER SEQUENCE of a non-existent sequence | 42P01 |
| MINVALUE > MAXVALUE or START outside [MIN..MAX] | 22023 |
CREATE INDEX
AngaraBase supports single-column indexes of two types: btree (default) and brin.
-- btree (default)
CREATE INDEX idx_t_v ON t (v);
-- btree (explicit)
CREATE INDEX idx_t_v ON t USING btree (v);
-- brin
CREATE INDEX idx_events_ts ON events USING brin (ts);
BRIN supported key types
| Type | Aliases |
|---|---|
| int | int4, integer |
| bigint | int8 |
| date | - |
| timestamp | - |
| timestamptz | - |
BRIN remains an accelerator path with heap fetch + MVCC predicate recheck.
Current bounds
- Single-column indexes only.
- Composite and expression indexes are not supported: 0A000.
- Unsupported index methods (GIN, GiST, etc.): 0A000.
Expected SQLSTATE
| Situation | SQLSTATE |
|---|---|
| Unsupported DDL form | 0A000 |
| Enforced FK | 0A000 |
| Composite / expression index | 0A000 |
| Unsupported index method | 0A000 |
| stats_level_max outside [0..3] | 22023 |
Links
- Data types: data-types.md
- DML (INSERT/UPDATE/DELETE): dml.md
- Partitioning (CREATE TABLE … PARTITION BY): partitioning.md
- Known issues: Known issues
DML — Data Manipulation Language
Goal
Reference for supported DML operations and mutation policy behavior.
INSERT
INSERT INTO t (id, v) VALUES (1, 100);
INSERT INTO t (id, v) VALUES (2, 200), (3, 300);
For partitioned tables, an INSERT into the parent table automatically routes the row to the appropriate partition. If no suitable partition (and no DEFAULT) is found — 23514 check_violation.
INSERT … SELECT
RM-0.6.3.7 (RFC-2026-497, S8): the source of an INSERT can be an arbitrary SELECT query. The result columns are mapped to the target columns positionally (or by explicit column list); SERIAL/IDENTITY/DEFAULT are resolved independently for each row.
INSERT INTO archive (id, v) SELECT id, v FROM t WHERE created_at < '2025-01-01';
INSERT INTO log (note) SELECT 'rebuilt:' || name FROM rebuilt_items;
INSERT … ON CONFLICT (UPSERT)
RM-0.6.3.7 (RFC-2026-497, S9): single-column conflict target (or default PK) is supported. EXCLUDED.<col> refers to the proposed row.
-- DO NOTHING (earlier behavior, now accepts an explicit target):
INSERT INTO t (id, v) VALUES (1, 100) ON CONFLICT (id) DO NOTHING;
-- DO UPDATE SET ... [WHERE ...]
INSERT INTO counters (key, n)
VALUES ('hits', 1)
ON CONFLICT (key) DO UPDATE SET n = counters.n + EXCLUDED.n;
-- WHERE-filter on existing row (PG-semantics: conflict + WHERE-false = silent no-op):
INSERT INTO inventory (sku, qty)
VALUES ('A', 5)
ON CONFLICT (sku) DO UPDATE SET qty = EXCLUDED.qty
WHERE inventory.qty < EXCLUDED.qty;
Not supported:
- multi-column target (ON CONFLICT (a, b)) → 0A000;
- ON CONFLICT ON CONSTRAINT <name> → 0A000.
UPDATE
UPDATE t SET v = 999 WHERE id = 1;
Since RM-0.6.5.5, UPDATE SET supports functional expressions and type casting:
UPDATE t SET write_date = NOW() WHERE id = 1;
UPDATE t SET ts = '2026-05-07 14:30:00'::timestamp WHERE id = 1;
UPDATE t SET day = date_trunc('day', NOW()) WHERE id = 1;
Supported are NOW(), CURRENT_TIMESTAMP, CURRENT_DATE, date_trunc(), and explicit CAST (syntax ::).
DELETE
DELETE FROM t WHERE id = 2;
RETURNING
RM-0.6.3.7 (RFC-2026-497, S10): RETURNING is supported for multi-row INSERT, UPDATE, DELETE. The projection allows *, an explicit column list, and expressions. For INSERT ... ON CONFLICT DO UPDATE, the post-update row is returned (like in PostgreSQL).
INSERT INTO t (v) VALUES (100), (200), (300) RETURNING id, v;
UPDATE t SET v = v + 1 WHERE v > 0 RETURNING id, v AS new_v;
DELETE FROM t WHERE v IS NULL RETURNING id;
Sequence functions
RM-0.6.3.7 (RFC-2026-497): nextval / currval / setval for SEQUENCE objects (see ddl.md — “CREATE / ALTER / DROP SEQUENCE” section). Non-transactional: gap-on-rollback is the correct behavior.
SELECT nextval('s1'); -- 1, 2, 3, ...; overflow without CYCLE → 2200H
SELECT currval('s1'); -- last value issued in THIS session
SELECT setval('s1', 100); -- last_value=100, is_called=true; next nextval = 101
SELECT setval('s1', 50, false); -- next nextval = 50
currval is session-bound: it returns the value of the last nextval (or setval(_, _, true)) in the same pgwire session. If nextval has not yet been called in the current session — 55000 object_not_in_prerequisite_state, regardless of whether nextval was called in other sessions.
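The contract described above (is_called flag, overflow, CYCLE, session-bound currval) can be captured in a small state model. This is a simplified sketch, not the engine's code: the Sequence and Session types are hypothetical, only ascending sequences (positive increment) are modeled, and a Session tracks the last value of a single sequence.

```rust
#[derive(Debug)]
pub struct Sequence {
    min: i64,
    max: i64,
    increment: i64,
    cycle: bool,
    last_value: i64,
    is_called: bool,
}

/// Per-session state: the last value issued to this session, if any.
#[derive(Debug, Default)]
pub struct Session {
    last_issued: Option<i64>,
}

impl Sequence {
    pub fn new(start: i64, increment: i64, min: i64, max: i64, cycle: bool) -> Self {
        Sequence { min, max, increment, cycle, last_value: start, is_called: false }
    }

    pub fn nextval(&mut self, session: &mut Session) -> Result<i64, &'static str> {
        let next = if !self.is_called {
            // First call (or after setval(.., false)): hand out last_value itself.
            self.last_value
        } else {
            match self.last_value.checked_add(self.increment) {
                Some(v) if v <= self.max => v,
                _ if self.cycle => self.min, // wrap around
                _ => return Err("2200H"),    // overflow without CYCLE
            }
        };
        self.last_value = next;
        self.is_called = true;
        session.last_issued = Some(next);
        Ok(next)
    }

    /// Session-bound: fails with 55000 until this session has issued a value.
    pub fn currval(&self, session: &Session) -> Result<i64, &'static str> {
        session.last_issued.ok_or("55000")
    }

    pub fn setval(&mut self, session: &mut Session, value: i64, is_called: bool) -> Result<i64, &'static str> {
        if value < self.min || value > self.max {
            return Err("22023");
        }
        self.last_value = value;
        self.is_called = is_called;
        if is_called {
            session.last_issued = Some(value);
        }
        Ok(value)
    }
}
```

Because the sequence state lives outside any transaction in this model, a rolled-back transaction cannot return a value it has already consumed, which is where the gaps come from.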
SERIAL / IDENTITY
SERIAL / BIGSERIAL and GENERATED [ALWAYS|BY DEFAULT] AS IDENTITY automatically create an owned sequence <table>_<col>_seq, which is dropped when the table is dropped.
CREATE TABLE users (id SERIAL PRIMARY KEY, name TEXT);
INSERT INTO users (name) VALUES ('alice'), ('bob') RETURNING id;
-- id is populated via nextval('users_id_seq')
CREATE TABLE orders (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
amount BIGINT
);
INSERT INTO orders (id, amount) VALUES (42, 100);
-- → 428C9 generated_always: id column is GENERATED ALWAYS
CREATE TABLE invoices (
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
total BIGINT
);
INSERT INTO invoices (id, total) VALUES (DEFAULT, 250) RETURNING id;
INSERT INTO invoices (id, total) VALUES (1000, 250); -- explicit OK
SELECT
SELECT id, v FROM t WHERE v > 50 ORDER BY id;
For details on queries (CTE, JOIN, GROUP BY, etc.) — see queries.md.
TRUNCATE
TRUNCATE TABLE t;
TRUNCATE is a DDL reset of the table. For append-only tables, this is the only way to delete data.
Mutation policy enforcement
Mutation policy controls the allowed DML operations on a table.
append_only
| Operation | Result |
|---|---|
| INSERT | Allowed |
| UPDATE | Rejected: 42809 wrong_object_type |
| DELETE | Rejected: 42809 wrong_object_type |
| TRUNCATE | Allowed (DDL reset) |
no_delete
| Operation | Result |
|---|---|
| INSERT | Allowed |
| UPDATE (non-PK, non-FK columns) | Allowed |
| UPDATE PK column (id) | Rejected: 42809 wrong_object_type |
| UPDATE FK child column | Rejected: 42809 wrong_object_type |
| DELETE | Rejected: 42809 wrong_object_type |
| TRUNCATE | Rejected: 42809 wrong_object_type |
unrestricted
All DML operations are allowed (default behavior).
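The three policy tables can be condensed into one lookup function. The sketch below restates the documented matrix and nothing more; MutationPolicy, Operation, and check_mutation are hypothetical names, and UPDATE is split into UpdatePlain (non-PK, non-FK columns) and UpdateKey (PK or FK child column) purely for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MutationPolicy {
    Unrestricted,
    AppendOnly,
    NoDelete,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operation {
    Insert,
    UpdatePlain, // UPDATE of non-PK, non-FK columns
    UpdateKey,   // UPDATE of a PK column or an FK child column
    Delete,
    Truncate,
}

/// Returns Ok(()) if the operation is allowed, or SQLSTATE 42809 otherwise.
pub fn check_mutation(policy: MutationPolicy, op: Operation) -> Result<(), &'static str> {
    let allowed = match (policy, op) {
        (MutationPolicy::Unrestricted, _) => true,
        (_, Operation::Insert) => true,
        // append_only: TRUNCATE is the only way to remove data.
        (MutationPolicy::AppendOnly, Operation::Truncate) => true,
        (MutationPolicy::AppendOnly, _) => false,
        // no_delete: ordinary updates pass, key updates and removals do not.
        (MutationPolicy::NoDelete, Operation::UpdatePlain) => true,
        (MutationPolicy::NoDelete, _) => false,
    };
    if allowed { Ok(()) } else { Err("42809") }
}
```

Note the asymmetry around TRUNCATE: it is allowed under append_only but rejected under no_delete.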
Autocommit vs explicit transactions
By default, every DML query runs in autocommit mode (an implicit transaction, committed immediately).
For explicit transactions:
BEGIN;
INSERT INTO t (id, v) VALUES (10, 1000);
UPDATE t SET v = 2000 WHERE id = 10;
COMMIT;
ROLLBACK aborts all changes of the current transaction.
Expected SQLSTATE
| Situation | SQLSTATE |
|---|---|
| UPDATE/DELETE on append-only table | 42809 |
| DELETE/TRUNCATE under no_delete | 42809 |
| PK/FK UPDATE under no_delete | 42809 |
| Insert into partitioned table, no matching partition | 23514 |
| nextval overflow (without CYCLE) | 2200H |
| currval before the first nextval in this session | 55000 |
| setval value outside [MINVALUE..MAXVALUE] | 22023 |
| Non-existent sequence (nextval/currval/setval/DROP SEQUENCE without IF EXISTS) | 42P01 |
| INSERT into a GENERATED ALWAYS AS IDENTITY column with an explicit non-NULL | 428C9 |
| INSERT ... ON CONFLICT (a, b) (multi-column target) | 0A000 |
| INSERT ... ON CONFLICT ON CONSTRAINT <name> | 0A000 |
Links
- DDL (CREATE TABLE, mutation policy options): ddl.md
- Queries (SELECT details): queries.md
- Partitioning: partitioning.md
- Known issues: Known issues
Queries
Goal
Reference for supported query forms: CTE, JOIN, aggregation, sorting, and statistical surfaces.
Non-recursive WITH (CTE)
AngaraBase supports limited (non-recursive) CTEs as a derived source in FROM:
WITH recent AS (
SELECT id, v FROM t WHERE v > 100
)
SELECT r.id, r.v
FROM recent r
ORDER BY r.id;
Supported CTE projection forms
WITH c AS (SELECT id, v FROM t)
SELECT * FROM c WHERE id < 10;
Explicitly unsupported
- WITH RECURSIVE: 0A000 feature_not_supported
- Data-modifying CTEs (WITH ... INSERT/UPDATE/DELETE): 0A000 feature_not_supported
JOINs
SELECT a.id, b.name
FROM orders a
INNER JOIN customers b ON a.customer_id = b.id;
SELECT a.id, b.name
FROM orders a
LEFT JOIN customers b ON a.customer_id = b.id;
Supported join types: INNER JOIN and LEFT JOIN.
ORDER BY
SELECT * FROM t ORDER BY v;
SELECT * FROM t ORDER BY v ASC;
SELECT * FROM t ORDER BY v DESC;
ORDER BY <expr> supports limited scalar expressions already supported by the engine.
Support for aliases
ORDER BY can reference aliases defined in the SELECT list using the AS keyword. This works for both regular columns and aggregation results:
-- Alias for an expression
SELECT x * 2 AS doubled FROM t ORDER BY doubled;
-- Alias for aggregation
SELECT grp, sum(amount) AS total
FROM t
GROUP BY grp
ORDER BY total DESC;
Column ordinals
Sorting by the ordinal number of a column in the SELECT list (1-based) is supported, according to the SQL:2011 standard:
-- Sort by the first column (a)
SELECT a, b FROM t ORDER BY 1;
-- Sort by the second column (b) in descending order
SELECT a, b FROM t ORDER BY 2 DESC;
If the specified ordinal is out of range for the number of columns in the query (or equals 0), a 42P10 (invalid_column_reference) error is returned.
NULL ordering
- In ASC order, NULL is treated as the largest value (appears last).
- In DESC order, NULL appears first.
- Explicit NULLS FIRST / NULLS LAST is not supported: 0A000.
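The fixed NULL placement is equivalent to a comparator in which NULL is greater than every non-NULL value. The sketch below models a nullable integer column as Option<i64>; cmp_nulls_largest and order_by are hypothetical names used only for illustration. Rust's built-in ordering for Option puts None first, so a custom comparator is needed to reproduce the documented behavior.

```rust
use std::cmp::Ordering;

/// ASC comparison in which NULL (None) is larger than any value.
pub fn cmp_nulls_largest(a: &Option<i64>, b: &Option<i64>) -> Ordering {
    match (a, b) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) => Ordering::Greater,
        (Some(_), None) => Ordering::Less,
        (Some(x), Some(y)) => x.cmp(y),
    }
}

/// Sorts a nullable column the way ORDER BY v [ASC | DESC] is documented to.
pub fn order_by(mut column: Vec<Option<i64>>, descending: bool) -> Vec<Option<i64>> {
    if descending {
        // DESC is the exact reverse of ASC, so NULLs come first.
        column.sort_by(|a, b| cmp_nulls_largest(b, a));
    } else {
        column.sort_by(cmp_nulls_largest);
    }
    column
}
```

Since DESC simply reverses the same comparator, "NULL last in ASC" and "NULL first in DESC" are two views of one rule rather than two independent settings.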
GROUP BY / HAVING
SELECT v, COUNT(*) AS cnt
FROM t
GROUP BY v
HAVING COUNT(*) > 1;
LIMIT / OFFSET
SELECT * FROM t ORDER BY id LIMIT 10;
SELECT * FROM t ORDER BY id LIMIT 10 OFFSET 20;
Expression Errors
When incompatible types are used in an arithmetic expression (for example, adding a string to a number, as in 'string' + 1), AngaraBase returns a 42883 (undefined_function, operator not found) error.
Note: string literals containing numbers (e.g., '123') may be automatically cast to a numeric type depending on the context.
Unsupported query forms
| Form | SQLSTATE |
|---|---|
| UNION / INTERSECT / EXCEPT | 0A000 |
| Window functions (OVER (...)) | 0A000 |
| ORDER BY ... NULLS FIRST / NULLS LAST | 0A000 |
| WITH RECURSIVE | 0A000 |
| Data-modifying CTEs | 0A000 |
AngaraStat surfaces
AngaraBase provides built-in statistical views.
sys.table_stats
| Column | Description |
|---|---|
| stats_level_max | Max collection level configured |
| last_committed_rowid | Last committed row ID |
| last_insert_epoch | Epoch of last insert |
| last_mutation_epoch | Epoch of last mutation |
sys.column_stats
| Column | Description |
|---|---|
| ndv_approx | Approximate number of distinct values (HLL) |
| col_min | Column minimum (typed Value) |
| col_max | Column maximum (typed Value) |
| null_count | Number of NULLs |
| stats_epoch | Stats collection epoch |
| hll_enabled | Whether HLL tracking is active |
| histogram_bounds | Equi-depth histogram boundaries (Level 2) |
| mcv_values | Most common values (Level 2) |
| mcv_frequencies | MCV frequencies (Level 2) |
| reservoir_size | Reservoir sample size (Level 2) |
| reservoir_epoch | Reservoir epoch (Level 2) |
| reservoir_drift_count | Drift count since last reservoir refresh (Level 2) |
SELECT * FROM sys.table_stats WHERE table_name = 'events';
SELECT * FROM sys.column_stats WHERE table_name = 'events' AND column_name = 'ts';
Expected SQLSTATE
| Situation | SQLSTATE |
|---|---|
| WITH RECURSIVE | 0A000 |
| Data-modifying CTE | 0A000 |
| UNION / INTERSECT / EXCEPT | 0A000 |
| Window functions | 0A000 |
| ORDER BY ... NULLS FIRST / NULLS LAST | 0A000 |
| ORDER BY ordinal out of range | 42P10 |
| Expression type error | 42883 |
Links
- DML operations: dml.md
- DDL (stats options): ddl.md
- SQL compatibility overview: overview.md
- Known issues: Known issues
Table partitioning
Goal
Reference for table partitioning: RANGE, LIST, partition management, and runtime behavior.
PARTITION BY RANGE
CREATE TABLE events (
id INTEGER PRIMARY KEY,
ts DATE NOT NULL,
data TEXT
) PARTITION BY RANGE (ts);
Range partitions
CREATE TABLE events_2025 PARTITION OF events
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
CREATE TABLE events_2026 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');
PARTITION BY LIST
CREATE TABLE metrics (
id INTEGER PRIMARY KEY,
region VARCHAR(20) NOT NULL,
value BIGINT
) PARTITION BY LIST (region);
List partitions
CREATE TABLE metrics_eu PARTITION OF metrics
FOR VALUES IN ('eu-west', 'eu-east');
CREATE TABLE metrics_us PARTITION OF metrics
FOR VALUES IN ('us-east', 'us-west');
DEFAULT partition
CREATE TABLE events_other PARTITION OF events DEFAULT;
Rows that do not fit into any partition are routed to DEFAULT.
If DEFAULT does not exist and no suitable partition is found — 23514 check_violation.
ALTER TABLE — attach / detach
ALTER TABLE events ATTACH PARTITION events_2027
FOR VALUES FROM ('2027-01-01') TO ('2028-01-01');
ALTER TABLE events DETACH PARTITION events_2025;
DROP PARTITION
DROP PARTITION events_2025;
Runtime behavior
INSERT routing
An INSERT into the parent table automatically routes the row to the appropriate child partition:
INSERT INTO events (id, ts, data) VALUES (1, '2026-06-15', 'test');
-- → events_2026
If no partition matches and DEFAULT is missing:
ERROR: 23514: new row for relation "events" violates check constraint
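Range routing can be sketched as a linear scan over partition bounds. The code is a simplified illustration, not the engine's router: RangePartition and route are hypothetical names, bounds are assumed to be half-open (FROM inclusive, TO exclusive, as in PostgreSQL), and keys are compared as ISO 8601 text, consistent with the text-backed DATE type.

```rust
pub struct RangePartition {
    pub name: &'static str,
    pub from: &'static str, // inclusive lower bound
    pub to: &'static str,   // exclusive upper bound
}

/// Returns the target partition name, the DEFAULT partition if no range
/// matches, or SQLSTATE 23514 when there is no DEFAULT either.
pub fn route<'a>(
    partitions: &'a [RangePartition],
    default_partition: Option<&'a str>,
    key: &str,
) -> Result<&'a str, &'static str> {
    for p in partitions {
        if p.from <= key && key < p.to {
            return Ok(p.name);
        }
    }
    default_partition.ok_or("23514")
}
```

With the events_2025 and events_2026 partitions defined earlier, the key '2026-06-15' routes to events_2026, while '2028-03-01' either lands in the DEFAULT partition or fails with 23514.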
Append-only inheritance
If the parent table is declared as append_only = true:
- All attached partitions inherit the append-only mode.
- A partition cannot disable append_only while the parent remains append-only: 42809.
CREATE TABLE events (
id INTEGER PRIMARY KEY,
ts DATE NOT NULL
) PARTITION BY RANGE (ts) WITH (append_only = true);
-- partitions inherit append_only
CREATE TABLE events_2026 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');
Current v0 bounds
| Limitation | Status |
|---|---|
| Single-column partition key | Only supported form |
| Hash partitioning | Not supported |
| Subpartitioning | Not supported |
Unsupported forms return 0A000 feature_not_supported.
Expected SQLSTATE
| Situation | SQLSTATE |
|---|---|
| No matching partition, no DEFAULT | 23514 |
| Hash partitioning | 0A000 |
| Multi-column partition key | 0A000 |
| Subpartitioning | 0A000 |
| Disable append-only on child (parent is append-only) | 42809 |
Links
- DDL reference: ddl.md
- DML (INSERT routing): dml.md
- SQL compatibility overview: overview.md
- Known issues: Known issues
Security model (overview)
Goal
Understand the layered security architecture of AngaraBase: which controls exist, how they interact, and how to verify your instance is running in a secure configuration.
Prerequisites
- A running AngaraBase instance (local or staging).
- SQL session access (pgwire).
- Basic understanding of roles and tables in your database.
Security model (layers)
AngaraBase uses a layered defence model. Each layer is independent and composable — no single layer bypass compromises the whole.
Layer 1 — Transport and identity
- TLS protects the wire protocol.
- Auth modes (trust, scram, cert) control how clients prove identity.
- Fail-closed: remote bind without TLS is rejected when tls.require_on_remote_bind = true.
See authentication.md for setup and verification.
Layer 2 — Authorization and data visibility
- RBAC (roles, grants, privileges) decides whether an operation is allowed at all.
- RLS (row-level security policies) decides which rows are visible or modifiable.
- Deny-by-default: enabling RLS without policies blocks all rows, including for the table owner.
See authorization.md for SQL surface and introspection.
Layer 3 — Controlled privilege escalation
- Break-glass is the only way to bypass RLS; even SUPERUSER cannot.
- Activation requires a mandatory REASON and TTL.
- Every query during break-glass generates a dedicated audit entry.
See break-glass.md for the full lifecycle.
Layer 4 — Audit and accountability
- Audit chain is append-only and tamper-evident (SHA-256 chain hash).
- Scope: auth, DDL, DCL, policy changes, break-glass lifecycle.
- DML audit policy: configurable off | allowlist | denylist per table.
See audit.md for configuration and verification.
Layer 5 — Data-at-rest protection
- TDE (Transparent Data Encryption) covers pages, WAL, and audit sink.
- Fail-closed: missing or invalid key material prevents startup and audit I/O.
See encryption.md for TDE setup and key management.
Layer 6 — Client-encrypted columns (v0)
- Server stores ciphertext + metadata (alg, mode, key_id) but never the keys.
- DETERMINISTIC mode allows equality predicates; RANDOMIZED rejects server-side predicates (0A000).
See encryption.md for the SQL surface and operator rules.
How features work together
| Combination | Behaviour |
|---|---|
| RBAC + RLS | RBAC decides “is this operation allowed at all”; RLS further restricts “which rows”. |
| Break-glass + audit | Temporary elevation is accepted only with a reason and full traceability in the audit chain. |
| TDE + audit | When TDE is enabled, audit bytes on disk are encrypted; sys.audit_log remains readable only with the correct key. |
| Client encryption + SQL bounds | Deterministic mode allows a limited predicate path; randomized mode fail-closes unsupported server-side operations. |
Quick security verification
Step 1 — Check effective settings
SELECT name, value
FROM sys.settings
WHERE name LIKE 'tls.%'
OR name LIKE 'security.%'
OR name LIKE 'audit.%'
ORDER BY name;
Returns effective security knobs without exposing secrets.
Step 2 — Check security surfaces
SELECT * FROM angara_user_roles() LIMIT 20;
SELECT * FROM angara_table_policies('public.users');
SELECT * FROM angara_break_glass_status();
SELECT * FROM angara_audit_verify_chain();
Validates that key introspection/verification functions are available and responsive.
Step 3 — Validate RLS explanation surface
SELECT * FROM angara_effective_rls_predicate('public.users');
Returns the effective predicate and helps explain row-visibility behaviour.
Expected result
- sys.settings shows security knobs without secrets.
- Security functions return data (or empty results) without internal errors.
- Unsupported operations terminate with an explicit SQLSTATE (0A000, 42501, or 22023), never a silent bypass.
Troubleshooting
- 42501 insufficient_privilege on security DDL/ops
  Check user roles and session context; see authorization.md.
- 0A000 feature_not_supported in policy/encrypted path
  This is a bounded contract (not a bug); use the supported syntax or mode.
- TDE enabled but audit/data I/O fails
  Verify master key presence and correctness; fail-closed is expected. See encryption.md.
- Need a bug-report artifact? Follow the bundle steps in ../reference/support.md.
Links
- Security knobs registry: angarabook/src/operations/security-operations.md
- Authentication: authentication.md
- Authorization: authorization.md
- Audit: audit.md
- Encryption: encryption.md
- Break-glass: break-glass.md
- Hardening runbook: hardening.md
Authentication
Goal
Configure and verify transport security (TLS) and client authentication modes so that only identified clients can connect to AngaraBase.
Prerequisites
- Access to the server configuration (TOML config and/or environment variables).
- TLS certificate and key files (for scram or cert modes with remote clients).
- Ability to restart the server after configuration changes.
Auth modes
AngaraBase supports three authentication modes, set via ANGARABASE_AUTH_MODE or security.auth_mode in the
config:
| Mode | When to use | Identity proof |
|---|---|---|
| trust | Local development / testing only | None: any connecting client is accepted |
| scram | Production and staging | SCRAM-SHA-256 password challenge |
| cert | mTLS environments | Client TLS certificate validated against CA |
Default: trust (requires explicit opt-in flag for remote bind — see Startup safety below).
Steps
1) Configure TLS
TLS is controlled in the [tls] section of angarabase.conf:
[tls]
enabled = true
cert_path = "/etc/angarabase/tls/server.crt"
key_path = "/etc/angarabase/tls/server.key"
require_on_remote_bind = true
Or via environment variables:
export ANGARABASE_TLS_ENABLED=1
export ANGARABASE_TLS_CERT_PATH=/etc/angarabase/tls/server.crt
export ANGARABASE_TLS_KEY_PATH=/etc/angarabase/tls/server.key
export ANGARABASE_TLS_REQUIRE_ON_REMOTE_BIND=1
require_on_remote_bind = true enforces fail-closed behaviour: if the server binds to a non-loopback address
and TLS is not enabled, startup is refused.
2) Set auth mode
export ANGARABASE_AUTH_MODE=scram
Or in angarabase.conf:
[security]
auth_mode = "scram"
3) SCRAM setup
When auth_mode = scram, the superuser must be bootstrapped with a password at init time:
angarabase-server --init /var/lib/angarabase \
--superuser admin \
--superuser-password 'strong-password' \
--auth-mode scram
The password is converted to a SCRAM-SHA-256 verifier before it touches disk — plaintext is never stored.
Additional users are created via SQL:
CREATE USER app_reader WITH PASSWORD 'change-me';
Password storage format: SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>.
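To make the storage format concrete, here is a minimal sketch that derives a verifier string of that shape using the standard SCRAM key derivation (RFC 5802 with SHA-256, as in RFC 7677). It is not AngaraBase's implementation: the function name, the 16-byte salt, the 4096 iteration count, and the base64 encoding of the fields are assumptions, and SASLprep normalization of the password is omitted.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def scram_sha256_verifier(password: str,
                          salt: Optional[bytes] = None,
                          iterations: int = 4096) -> str:
    """Build 'SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>'."""
    if salt is None:
        salt = os.urandom(16)  # assumed salt length
    # SaltedPassword := PBKDF2-HMAC-SHA-256(password, salt, iterations)
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return (f"SCRAM-SHA-256${iterations}:{b64(salt)}"
            f"${b64(stored_key)}:{b64(server_key)}")


print(scram_sha256_verifier("change-me"))
```

Only StoredKey and ServerKey are kept; neither the password nor ClientKey can be read back from the stored string, which is why the plaintext never needs to touch disk.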
4) Startup safety
To protect against accidental production use of trust mode:
- trust mode on a non-loopback bind address requires the explicit flag --allow-insecure-no-auth.
- Without this flag the server refuses to start (fail-closed).
- scram or cert modes do not require the flag.
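The two fail-closed startup rules (TLS on remote bind, and no trust mode on remote bind without the flag) can be sketched as one pre-flight check. The function and exception names are hypothetical, the bind address is assumed to be an IP literal, and treating the two rules as independent checks is an assumption about how the server combines them.

```python
import ipaddress


class StartupRefused(Exception):
    """Raised when the configuration must not be allowed to start."""


def check_startup(bind_host: str,
                  auth_mode: str,
                  tls_enabled: bool,
                  require_tls_on_remote_bind: bool = True,
                  allow_insecure_no_auth: bool = False) -> None:
    if auth_mode not in ("trust", "scram", "cert"):
        raise StartupRefused(f"unknown auth mode: {auth_mode!r}")

    # Anything outside 127.0.0.0/8 and ::1 counts as a remote bind.
    remote = not ipaddress.ip_address(bind_host).is_loopback

    if remote and require_tls_on_remote_bind and not tls_enabled:
        raise StartupRefused("remote bind requires TLS")

    if remote and auth_mode == "trust" and not allow_insecure_no_auth:
        raise StartupRefused(
            "trust mode on a remote bind requires --allow-insecure-no-auth")


# Local development: trust mode on loopback starts without extra flags.
check_startup("127.0.0.1", "trust", tls_enabled=False)
```

Returning nothing on success and raising on every unsafe combination mirrors the fail-closed stance: there is no "warn and continue" path.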
5) Verify effective config from SQL
After startup, confirm the active settings:
SELECT name, value
FROM sys.settings
WHERE name IN (
'tls.enabled',
'tls.require_on_remote_bind',
'security.auth_mode'
)
ORDER BY name;
No secret material (keys, passwords, verifiers) appears in sys.settings.
Expected result
- In scram mode, unauthenticated connections are rejected.
- In cert mode, clients without a valid certificate are rejected.
- trust on remote bind without --allow-insecure-no-auth prevents startup.
- sys.settings confirms the effective auth mode and TLS state without leaking secrets.
Troubleshooting
- Server refuses to start in trust mode
  You are binding to a non-loopback address. Either switch to scram/cert or pass --allow-insecure-no-auth for development.
- authentication failed for user "..." on connect
  Verify the password or certificate; check that security.auth_mode matches the client’s auth method.
- TLS handshake failure
  Confirm cert_path and key_path point to valid, non-expired files; verify the client’s sslmode setting.
- Init refused: scram requires password
  Provide --superuser-password (or --superuser-password-file) when --auth-mode scram is set.
- Need a bug-report artifact? See ../reference/support.md.
Links
- Security model overview: overview.md
- Hardening runbook: hardening.md
- Configuration reference: ../operations/configuration.md
- Security knobs registry: angarabook/src/operations/security-operations.md
Authorization (RBAC and RLS)
Goal
Set up role-based access control and row-level security policies so that users see and modify only the data they are permitted to.
Prerequisites
- SQL session with a user that has SUPERUSER or SECURITY_ADMIN privileges.
- A test table (e.g. public.users) with representative data.
RBAC — users, roles, privileges
Create a user and grant a role
CREATE USER app_reader WITH PASSWORD 'change-me';
GRANT reader TO app_reader;
GRANT SELECT ON public.users TO reader;
Object-level grants
GRANT SELECT ON TABLE public.orders TO analyst;
GRANT INSERT, UPDATE ON TABLE public.orders TO writer_role;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;
Access is deny-by-default: a user without an explicit GRANT cannot access another user’s tables.
Built-in role hierarchy
SUPERUSER
├── SECURITY_ADMIN — policy, audit, key, and break-glass management
├── DBA — ops (shutdown, backup, restore, settings, diagnostics)
├── CREATEROLE — CREATE/ALTER/DROP USER and ROLE
└── CREATEDB — CREATE DATABASE
SUPERUSER does not bypass RLS — only BREAK_GLASS can (see break-glass.md).
RLS — row-level security
Enable RLS on a table
ALTER TABLE public.users ENABLE ROW LEVEL SECURITY;
After this, the table follows deny-by-default: no rows are visible to anyone (including the owner) until at least one policy is created.
Create a security policy
CREATE SECURITY POLICY p_users_tenant ON public.users
USING (tenant_id = current_setting('app.tenant_id')::int);
The USING predicate is applied automatically to every statement on the table: as an implicit WHERE filter for SELECT, UPDATE, and DELETE, and as a check on the new row for INSERT and UPDATE.
RLS enforcement rules
| Operation | Behaviour |
|---|---|
| SELECT | Rows that fail the predicate are silently excluded. |
| INSERT | The new row must satisfy the predicate; otherwise an error is raised. |
| UPDATE | The old row must be visible (silent skip if not); the new row must satisfy the predicate (error if not). |
| DELETE | The row must be visible (silent skip if not; returns 0 affected rows). |
Multiple policies on one table use AND semantics — all must pass.
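The enforcement table and the AND rule can be modelled over an in-memory list of rows. This is a behavioural sketch only: policies are plain Python callables, all names are hypothetical, and the real engine rewrites queries rather than filtering lists.

```python
class RlsViolation(Exception):
    """Raised when a new or updated row fails a policy predicate."""


def passes_all(row, policies):
    # Deny-by-default: no policies means no row passes.
    # AND semantics: every predicate must accept the row.
    return bool(policies) and all(predicate(row) for predicate in policies)


def rls_select(rows, policies):
    return [row for row in rows if passes_all(row, policies)]


def rls_insert(rows, new_row, policies):
    if not passes_all(new_row, policies):
        raise RlsViolation("new row violates row-level security policy")
    rows.append(new_row)


def rls_update(rows, match, changes, policies):
    staged, affected = list(rows), 0
    for i, row in enumerate(rows):
        if not (match(row) and passes_all(row, policies)):
            continue  # invisible rows are skipped silently
        updated = {**row, **changes}
        if not passes_all(updated, policies):
            raise RlsViolation("updated row violates row-level security policy")
        staged[i] = updated
        affected += 1
    rows[:] = staged  # apply only if every updated row passed
    return affected


def rls_delete(rows, match, policies):
    kept = [r for r in rows if not (match(r) and passes_all(r, policies))]
    affected = len(rows) - len(kept)
    rows[:] = kept
    return affected


table = [{"id": 1, "tenant_id": 1}, {"id": 2, "tenant_id": 2}]
tenant_policy = [lambda row: row["tenant_id"] == 1]
print(rls_select(table, tenant_policy))  # [{'id': 1, 'tenant_id': 1}]
```

Note the asymmetry the table describes: reads and deletes of invisible rows are silent, while writing a row that would fall outside the predicate is an error.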
RLS v1 masking metadata
Policies can include a MASK clause for user-facing field masking:
ALTER SECURITY POLICY p_users_tenant ON public.users
USING (tenant_id = current_setting('app.tenant_id')::int)
MASK (email USING 'partial');
Supported mask types in v1:
| Mask | Effect |
|---|---|
| partial | Returns a stable masked shape of the original value. |
| nullify | Returns NULL in place of the real value. |
Unsupported mask expressions return 0A000 feature_not_supported.
Unsupported predicates — fail-closed
In the IR/planner mode, complex predicates that cannot be safely rewritten (subqueries, arbitrary function calls, JOINs) are rejected:
ERROR: 0A000 feature_not_supported
unsupported RLS predicate form
This is a bounded contract, not a bug. Use the supported predicate language (column references,
current_setting(), current_user, literals, comparisons, AND/OR/NOT, IN, IS [NOT] NULL, type
casts).
Introspection
SELECT * FROM angara_table_policies('public.users');
SELECT * FROM angara_effective_rls_predicate('public.users');
SELECT * FROM sys.users;
SELECT * FROM sys.roles;
SELECT * FROM sys.user_roles;
SELECT * FROM sys.role_privileges;
SELECT * FROM sys.object_grants;
SELECT * FROM sys.my_privileges;
SELECT * FROM sys.security_policies;
SELECT * FROM angara_user_roles('alice');
SELECT * FROM angara_user_privileges('alice');
SELECT angara_has_privilege('alice', 'SELECT', 'TABLE', 'public.orders');
SELECT angara_is_rls_active('public.users');
angara_effective_rls_predicate() shows the combined predicate, provenance, and mask metadata for a table —
useful for explaining row-visibility behaviour.
Expected result
- RBAC grants control object-level access; users without grants are denied.
- RLS predicates filter rows transparently on SELECT/INSERT/UPDATE/DELETE.
- angara_table_policies reflects active policies, mask metadata, and provenance.
- Unsupported paths return a deterministic SQLSTATE, never a silent bypass.
Troubleshooting
- 42501 insufficient_privilege
  The current user lacks the required role or grant. Check sys.my_privileges and angara_user_roles().
- 0A000 feature_not_supported on RLS policy or mask
  The predicate or mask expression is outside the supported v1 syntax. Simplify the expression.
- Rows unexpectedly invisible after enabling RLS
  Deny-by-default is working correctly. Create a SECURITY POLICY with a USING predicate to allow the intended rows.
- Need a bug-report artifact? See ../reference/support.md.
Links
- Security model overview: overview.md
- Break-glass (RLS bypass): break-glass.md
- Audit: audit.md
- SQL compatibility: ../sql-reference/overview.md
- Known issues: ../reference/known-issues.md
Audit
Goal
Understand and configure AngaraBase’s tamper-evident audit subsystem: chain verification, DML audit policy, and JSON export.
Prerequisites
- A running AngaraBase instance with audit enabled (default).
- SQL session with SECURITY_ADMIN or SUPERUSER privileges (for configuration changes).
- Write-accessible path for the audit log file.
Audit chain concept
AngaraBase maintains an append-only, tamper-evident audit trail:
- Every security-relevant event (auth, DDL, DCL, policy changes, break-glass lifecycle) produces an AuditEvent.
- Each event contains prev_hash, a SHA-256 hash of the preceding entry, forming a hash chain.
- Breaking or modifying any entry invalidates all subsequent hashes, making tampering detectable.
- The chain is separate from the transactional data path — audit events are recorded even for rolled-back transactions.
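A hash chain of this kind fits in a few lines. The sketch below is a generic illustration, not the on-disk format: the canonical-JSON serialization, the all-zero genesis value, and the function names are assumptions. The return shape of `verify_chain` mirrors the `is_valid` / `first_broken_seq` fields reported by angara_audit_verify_chain().

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first entry


def entry_hash(entry: dict) -> str:
    # Canonical JSON so the same entry always hashes to the same value.
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_event(chain: list, event_type: str, payload: dict) -> dict:
    prev_hash = entry_hash(chain[-1]) if chain else GENESIS
    entry = {
        "seq": len(chain) + 1,
        "event_type": event_type,
        "payload": payload,
        "prev_hash": prev_hash,
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list):
    """Return (is_valid, first_broken_seq)."""
    expected = GENESIS
    for entry in chain:
        if entry["prev_hash"] != expected:
            return False, entry["seq"]
        expected = entry_hash(entry)
    return True, None


log = []
append_event(log, "auth_success", {"user": "alice"})
append_event(log, "ddl", {"statement": "CREATE TABLE t (id INTEGER)"})
append_event(log, "grant", {"role": "reader", "to": "alice"})
print(verify_chain(log))           # (True, None)

log[0]["payload"]["user"] = "mallory"
print(verify_chain(log))           # (False, 2)
```

In this sketch a modified entry is detected at its successor, because that is where the stored `prev_hash` stops matching; the last entry of a chain therefore needs an external anchor to be protected.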
Audit scope
| Event category | v0 (baseline) | v1 |
|---|---|---|
| Auth (success / failure / disconnect) | Yes | Yes |
| DDL (CREATE / ALTER / DROP) | Yes | Yes |
| DCL (GRANT / REVOKE) | Yes | Yes |
| User / role management | Yes | Yes |
| Security policy changes | Yes | Yes |
| Break-glass lifecycle | Yes | Yes |
| DML (SELECT / INSERT / UPDATE / DELETE) | No | Yes (policy-driven) |
| Key operations | No | Yes |
Steps
1) Verify chain integrity
SELECT * FROM angara_audit_verify_chain();
Returns is_valid, first_broken_seq, and details. A healthy chain returns is_valid = true.
2) Query the audit log
SELECT * FROM sys.audit_log
WHERE event_type = 'break_glass_query'
AND timestamp > now() - INTERVAL '24 hours'
ORDER BY seq DESC
LIMIT 50;
Columns: seq, timestamp, event_type, user_name, auth_method, client_ip, database,
session_claims, payload, prev_hash.
3) Configure audit v1 DML policy
DML audit is controlled by three knobs:
export ANGARABASE_AUDIT_DML_MODE=allowlist
export ANGARABASE_AUDIT_DML_ALLOWLIST=public.users,public.payments
| Mode | Behaviour |
|---|---|
| off | No DML events recorded (default). |
| allowlist | Record DML only for listed schema.table entries (targeted compliance). |
| denylist | Record DML for all tables except those listed (broad coverage with exclusions). |
Use ANGARABASE_AUDIT_DML_DENYLIST for the denylist.
Malformed policy or ambiguous object references cause a startup/config-apply rejection (fail-closed).
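The three modes and the fail-closed parsing rule can be sketched as a small policy builder. The names below are hypothetical, and the exact validation the server performs is not documented here; the sketch only assumes that every entry must be a fully qualified schema.table name.

```python
def parse_table_list(raw: str) -> frozenset:
    """Parse 'schema.table,schema.table'; reject unqualified or empty names."""
    if not raw.strip():
        return frozenset()
    entries = set()
    for item in raw.split(","):
        name = item.strip()
        parts = name.split(".")
        if len(parts) != 2 or not all(parts):
            # Fail-closed: a malformed entry rejects the whole policy.
            raise ValueError(f"malformed audit table reference: {name!r}")
        entries.add(name)
    return frozenset(entries)


def build_dml_policy(mode: str, allowlist: str = "", denylist: str = ""):
    """Return a function table_name -> bool (should DML be audited?)."""
    if mode == "off":
        return lambda table: False
    if mode == "allowlist":
        listed = parse_table_list(allowlist)
        return lambda table: table in listed
    if mode == "denylist":
        listed = parse_table_list(denylist)
        return lambda table: table not in listed
    raise ValueError(f"unknown audit DML mode: {mode!r}")


should_audit = build_dml_policy(
    "allowlist", allowlist="public.users,public.payments")
print(should_audit("public.users"))   # True
print(should_audit("public.orders"))  # False
```

Building the policy once at configuration time, and raising there, is what turns a typo into a startup rejection instead of a silent gap in audit coverage.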
4) Configure audit log path
export ANGARABASE_AUDIT_LOG_PATH=/var/lib/angarabase/audit/audit.jsonl
The path must be writable. If the path is inaccessible, audit writes fail-closed.
5) Configure JSON export
export ANGARABASE_AUDIT_EXPORT_JSON_ENABLED=1
export ANGARABASE_AUDIT_EXPORT_RATE_LIMIT_RPS=50
Export is bounded and rate-limited. Export failures are reported but never expose secret payload fragments in error text.
6) TDE interaction
When TDE is enabled (ANGARABASE_TDE_ENABLE=1), the audit sink on disk is encrypted transparently:
- Key: audit_dek = KDF(master_key, domain="audit-v0", key_id).
- sys.audit_log decrypts on read when the correct key is available.
- Without the key, audit read/write is impossible (fail-closed).
- Encrypted audit data remains encrypted in backups and copied artefacts.
See encryption.md for TDE configuration.
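The key formula `audit_dek = KDF(master_key, domain="audit-v0", key_id)` describes domain-separated key derivation. The source does not name the KDF, so the sketch below substitutes HKDF with SHA-256 (RFC 5869) purely as a stand-in; the way the domain and key_id are packed into the `info` field is also an assumption.

```python
import hashlib
import hmac


def hkdf_sha256(key_material: bytes, info: bytes,
                length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    # Extract: an empty salt defaults to HashLen zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, key_material,
                   hashlib.sha256).digest()
    # Expand: T(n) = HMAC(PRK, T(n-1) | info | n)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_audit_dek(master_key: bytes, key_id: str) -> bytes:
    # Assumed encoding: domain label, a zero byte, then the key id.
    info = b"audit-v0" + b"\x00" + key_id.encode("utf-8")
    return hkdf_sha256(master_key, info)


master = bytes.fromhex("ab" * 32)  # demo value only, never a real key
dek = derive_audit_dek(master, "master-prod-2026q1")
print(len(dek))  # 32
```

The point of the domain label is that the audit key differs from any key derived for pages or WAL from the same master key, so compromising one derived key does not expose the others.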
Expected result
- angara_audit_verify_chain() returns is_valid = true for an intact chain.
- sys.audit_log shows auth, DDL, DCL, policy, and break-glass events.
- With DML policy set to allowlist or denylist, matching DML operations appear in the audit log.
- TDE-encrypted audit files are unreadable without the master key.
Troubleshooting
- angara_audit_verify_chain() returns is_valid = false
  The chain has been tampered with or corrupted. Note the first_broken_seq and investigate the audit file. The audit subsystem can self-repair by truncating to the last valid entry.
- DML events not appearing in audit log
  Check audit.dml_mode (default is off). Verify that the target table is in the allowlist (or not in the denylist).
- Audit write failures after enabling TDE
  Verify ANGARABASE_TDE_MASTER_KEY_HEX and key validity. Fail-closed behaviour is expected when key material is missing.
- Break-glass activation fails with “audit subsystem unavailable”
  Break-glass requires a healthy audit subsystem. Fix the audit path or key material first.
- Need a bug-report artifact? See ../reference/support.md.
Links
- Security model overview: overview.md
- Encryption (TDE): encryption.md
- Break-glass: break-glass.md
- Configuration reference: ../operations/configuration.md
Encryption
Goal
Configure data-at-rest protection with TDE and understand the server-side contract for client-encrypted columns.
Prerequisites
- Access to server environment variables or configuration file.
- A 256-bit master key (64 hex characters) for TDE.
- Ability to restart the server after TDE changes.
TDE — Transparent Data Encryption (v0)
TDE encrypts data at rest without application changes. When enabled, the following are encrypted:
- Storage pages (the .adb data file).
- WAL (write-ahead log) entries.
- Audit sink on disk (JSONL trail).
Configure TDE
export ANGARABASE_TDE_ENABLE=1
export ANGARABASE_TDE_MASTER_KEY_HEX=<64-hex-characters>
export ANGARABASE_TDE_MASTER_KEY_ID=master-prod-2026q1
export ANGARABASE_TDE_LAST_ROTATION_UNIX=1760000000
| Variable | Purpose |
|---|---|
| ANGARABASE_TDE_ENABLE | 1 to enable; unset or 0 to disable. |
| ANGARABASE_TDE_MASTER_KEY_HEX | 256-bit key in hex (64 characters). Never appears in sys.settings or logs. |
| ANGARABASE_TDE_MASTER_KEY_ID | Human-readable, non-secret key identifier (visible in sys.settings). |
| ANGARABASE_TDE_LAST_ROTATION_UNIX | Unix timestamp of the last key rotation (metadata only). |
Fail-closed behaviour
- If ANGARABASE_TDE_ENABLE=1 and the key is missing, malformed, or wrong length, the server refuses to start.
- If the key is invalid for the existing data directory, page/WAL decryption fails at startup (also fail-closed).
- The audit sink follows the same rule: without a valid key, audit read/write is impossible.
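The first rule (missing, malformed, or wrong-length key refuses startup) amounts to a strict parse of the environment. A minimal sketch, assuming the variable names from the table above; the function and exception names are hypothetical, and checking the key against an existing data directory is out of scope here.

```python
import re
from typing import Mapping, Optional


class TdeConfigError(Exception):
    """Raised when TDE is enabled but key material is unusable."""


_HEX_256 = re.compile(r"[0-9a-fA-F]{64}")


def load_master_key(env: Mapping[str, str]) -> Optional[bytes]:
    """Return the 32-byte master key, or None when TDE is disabled."""
    if env.get("ANGARABASE_TDE_ENABLE", "0") != "1":
        return None
    key_hex = env.get("ANGARABASE_TDE_MASTER_KEY_HEX")
    if key_hex is None:
        raise TdeConfigError("TDE is enabled but the master key is missing")
    if _HEX_256.fullmatch(key_hex) is None:
        # The message deliberately never echoes the supplied value.
        raise TdeConfigError("master key must be exactly 64 hex characters")
    return bytes.fromhex(key_hex)


key = load_master_key({
    "ANGARABASE_TDE_ENABLE": "1",
    "ANGARABASE_TDE_MASTER_KEY_HEX": "ab" * 32,  # demo value only
})
print(len(key))  # 32
```

Using a full-match on the hex pattern, instead of relying on `bytes.fromhex` alone, also rejects keys with embedded whitespace, which `bytes.fromhex` would otherwise skip.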
Verify from SQL
SELECT name, value
FROM sys.settings
WHERE name IN (
'security.tde_enabled',
'security.tde_master_key_id'
)
ORDER BY name;
Only non-secret metadata (key_id, enabled flag) is exposed. The hex key itself never appears.
Client-encrypted columns
Client-encrypted columns allow applications to store ciphertext in the database while the server never holds key material.
SQL surface
ALTER TABLE customers
ALTER COLUMN tax_id
SET ENCRYPTED WITH (TYPE=DETERMINISTIC, KEY_ID='cust-key-01');
Full syntax:
ALTER TABLE <table>
ALTER COLUMN <column>
SET ENCRYPTED WITH (
TYPE = DETERMINISTIC | RANDOMIZED,
KEY_ID = '<opaque-id>',
ALG = '<algorithm-id>' -- optional, defaults to AES-256-SIV / AES-256-GCM
);
What the server stores
| Field | Content |
|---|---|
| enc.alg | Algorithm identifier (whitelisted). |
| enc.mode | deterministic or randomized. |
| enc.key_id | Opaque, non-secret key identifier. |
| payload | Ciphertext bytes — never interpreted by the server. |
The server never accepts raw key material via SQL, environment variables, config, logs, or sys.* views.
Operator rules
| Mode | Allowed operations | Rejected operations |
|---|---|---|
| DETERMINISTIC | Equality predicates (=, !=, IN) | Range, LIKE, aggregation, ordering |
| RANDOMIZED | None (read/write only) | All server-side predicates |
Unsupported operations return 0A000 feature_not_supported (shape-stable error).
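The operator rules reduce to a lookup from encryption mode to the set of permitted operators. The sketch below models that gate; the names are hypothetical and the real check presumably happens in the planner, not on operator strings.

```python
class FeatureNotSupported(Exception):
    """Stands in for SQLSTATE 0A000 (feature_not_supported)."""
    sqlstate = "0A000"


ALLOWED_OPERATORS = {
    "DETERMINISTIC": frozenset({"=", "!=", "IN"}),
    "RANDOMIZED": frozenset(),  # read/write only, no server-side predicates
}


def check_predicate(mode: str, operator: str) -> None:
    """Raise unless the operator is allowed on a column with this mode."""
    allowed = ALLOWED_OPERATORS.get(mode.upper())
    if allowed is None:
        # Fail-closed on invalid metadata instead of guessing a mode.
        raise ValueError(f"invalid encryption mode: {mode!r}")
    if operator.upper() not in allowed:
        raise FeatureNotSupported(
            f"operator {operator} is not supported on {mode} encrypted columns")


check_predicate("DETERMINISTIC", "=")      # accepted
try:
    check_predicate("RANDOMIZED", "=")
except FeatureNotSupported as exc:
    print(exc.sqlstate)                    # 0A000
```

Equality can work in deterministic mode because equal plaintexts encrypt to equal ciphertexts under the same key, so the server can compare bytes without ever seeing the key; randomized mode gives up that property by design.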
Fail-closed and observability
- Missing or invalid encryption metadata on a column causes DML to be rejected.
- sys.settings and diagnostics expose only key_id, mode, and algorithm, never ciphertext blobs or key-like material.
- Error messages never include ciphertext fragments or key material.
Expected result
- With TDE enabled, the data directory contains no plaintext data, WAL, or audit payloads.
- sys.settings shows security.tde_enabled = true and the key_id without exposing the key.
- Client-encrypted columns store ciphertext; deterministic mode allows equality queries; randomized mode rejects server-side predicates.
- All fail-closed paths produce deterministic errors, not silent fallbacks.
Troubleshooting
- Server refuses to start after enabling TDE
  Check ANGARABASE_TDE_MASTER_KEY_HEX: it must be exactly 64 hex characters. If the data directory was previously unencrypted, you cannot enable TDE retroactively on existing data (see migration docs).
- 0A000 feature_not_supported on encrypted column query
  The query uses an unsupported operator for the column’s encryption mode. For RANDOMIZED columns, only read/write is allowed. For DETERMINISTIC, only equality predicates work.
- Audit read/write fails with TDE enabled
  The audit DEK derives from the master key. Verify the master key matches the one used when the audit file was created.
- sys.settings does not show TDE knobs
  Ensure ANGARABASE_TDE_ENABLE=1 is set in the environment before server startup.
- Need a bug-report artifact? See ../reference/support.md.
Links
- Security model overview: overview.md
- Audit (TDE interaction): audit.md
- Hardening runbook: hardening.md
- Configuration reference: ../operations/configuration.md
- Security knobs registry: angarabook/src/operations/security-operations.md
Break-glass
Goal
Understand and use controlled privilege escalation (break-glass) to temporarily bypass RLS policies with full auditability.
Prerequisites
- SQL session with a user who has been granted the BREAK_GLASS capability.
- A healthy audit subsystem (break-glass cannot activate if audit is unavailable).
What is break-glass?
Break-glass is AngaraBase’s mechanism for controlled, time-limited, fully audited bypass of row-level
security policies. It exists because SUPERUSER alone does not bypass RLS — this is a deliberate design
choice.
Key properties:
- Activation requires a mandatory reason and a mandatory TTL.
- Every query executed during break-glass generates a dedicated break_glass_query audit entry.
- There is no “silent” bypass: the audit chain always records break-glass activity.
- This is a first-in-class database feature: neither PostgreSQL nor MS SQL Server has a built-in break-glass mechanism with TTL + reason + mandatory audit.
Steps
1) Grant the break-glass capability
A SECURITY_ADMIN grants the BREAK_GLASS role to a user or role:
GRANT BREAK_GLASS TO dba_team;
2) Activate break-glass
The user who has been granted BREAK_GLASS activates it with a reason and duration:
SET BREAK_GLASS REASON='INCIDENT-789: data corruption investigation' TTL='2h';
Duration format: '15m', '2h', '1d' etc. Maximum TTL is controlled by the server configuration (see
below).
3) Check status
SELECT * FROM angara_break_glass_status();
Returns: is_active, reason, expires_at, activated_at.
4) Work under break-glass
While break-glass is active, RLS policies are bypassed. Every query in this session generates an audit
entry with event_type = 'break_glass_query', including the full (sanitized) SQL text.
5) Deactivate (manual or automatic)
RESET BREAK_GLASS;
If not deactivated manually, break-glass auto-expires when the TTL elapses. After expiry, RLS applies again immediately.
6) Revoke the capability
REVOKE BREAK_GLASS FROM dba_team;
Configuration
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_SECURITY_BREAK_GLASS_MAX_TTL | 24h | Maximum allowed TTL for any break-glass session. Requests exceeding this are rejected. |
Also exposed as security.break_glass_max_ttl in sys.settings.
Audit trail
All break-glass lifecycle events are recorded:
| Event type | When |
|---|---|
| break_glass_activate | SET BREAK_GLASS succeeds. |
| break_glass_query | Every query while break-glass is active. |
| break_glass_deactivate | RESET BREAK_GLASS is called. |
| break_glass_expire | TTL elapses without manual deactivation. |
Invariants
- Audit must be healthy. If the audit subsystem is down or corrupted, break-glass activation fails (fail-closed).
- TTL is mandatory. SET BREAK_GLASS without TTL → error.
- Reason is mandatory. SET BREAK_GLASS without REASON → error.
- Max TTL is server-enforced. Exceeding security.break_glass_max_ttl → 22023 invalid_parameter_value.
- No refresh. A client cannot extend the TTL; deactivate and re-activate with a new reason/TTL instead.
- SUPERUSER ≠ RLS bypass. Only BREAK_GLASS bypasses RLS.
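The activation invariants can be expressed as a short validation routine. This is a sketch under stated assumptions: only the documented 'm', 'h', and 'd' units are parsed, the SQLSTATE for a missing reason or TTL is not specified in this section so a generic `ValueError` stands in for it, and all function names are hypothetical. The returned fields mirror angara_break_glass_status().

```python
import re
from datetime import datetime, timedelta, timezone


class InvalidParameterValue(Exception):
    """Stands in for SQLSTATE 22023 (invalid_parameter_value)."""
    sqlstate = "22023"


_UNITS = {"m": "minutes", "h": "hours", "d": "days"}
_DURATION = re.compile(r"([1-9][0-9]{0,5})([mhd])")


def parse_ttl(text: str) -> timedelta:
    match = _DURATION.fullmatch(text)
    if match is None:
        raise InvalidParameterValue(f"invalid TTL format: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def activate(reason, ttl, max_ttl="24h", now=None):
    if not reason or not reason.strip():
        raise ValueError("REASON is mandatory")
    if not ttl:
        raise ValueError("TTL is mandatory")
    requested = parse_ttl(ttl)
    if requested > parse_ttl(max_ttl):
        raise InvalidParameterValue(
            "TTL exceeds security.break_glass_max_ttl")
    now = now or datetime.now(timezone.utc)
    return {
        "is_active": True,
        "reason": reason,
        "activated_at": now,
        "expires_at": now + requested,  # fixed at activation, never extended
    }


status = activate("INCIDENT-789: data corruption investigation", "2h")
print(status["expires_at"] - status["activated_at"])  # 2:00:00
```

Because `expires_at` is computed once and there is no function to move it, extending a session means deactivating and activating again, which produces a fresh reason and a fresh audit entry.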
Expected result
- SET BREAK_GLASS with valid reason and TTL activates bypass; angara_break_glass_status() confirms.
- All queries during break-glass appear in sys.audit_log with event_type = 'break_glass_query'.
- After TTL expiry or RESET BREAK_GLASS, RLS enforcement resumes.
- Invalid TTL returns 22023; missing reason or TTL returns an error.
Troubleshooting
- 22023 invalid_parameter_value on SET BREAK_GLASS
  The TTL exceeds security.break_glass_max_ttl or is in an invalid format. Check the max TTL setting and use a supported duration format ('15m', '2h', '1d').
- 42501 insufficient_privilege on SET BREAK_GLASS
  The current user has not been granted BREAK_GLASS. A SECURITY_ADMIN must run GRANT BREAK_GLASS TO <user>.
- Break-glass activation fails with “audit unavailable”
  The audit subsystem must be healthy. Check ANGARABASE_AUDIT_LOG_PATH and audit key material if TDE is enabled.
- Break-glass expired unexpectedly
  TTL is server-enforced and cannot be refreshed. Deactivate and re-activate with a new reason and TTL.
- Need a bug-report artifact? See ../reference/support.md.
Links
- Security model overview: overview.md
- Authorization (RLS policies): authorization.md
- Audit: audit.md
- Known issues: ../reference/known-issues.md
Deployment hardening runbook
Goal
Walk through a step-by-step process to launch AngaraBase in a production-secure configuration, and verify the result from SQL.
Prerequisites
- Access to server configuration (TOML config and/or environment variables).
- TLS certificate and key files.
- A 256-bit master key (64 hex characters) for TDE.
- Ability to restart the server.
Security hardening checklist
Before starting, confirm each item applies to your deployment:
- TLS enabled for all remote connectivity.
- Auth mode set explicitly (scram or cert), not left as trust for production.
- TDE enabled with valid key material.
- Audit log path set and writable.
- DML audit policy chosen deliberately (off, allowlist, or denylist).
- Break-glass max TTL configured appropriately.
- angara_audit_verify_chain() runs clean.
Steps
Step 1 — Enable TLS
[tls]
enabled = true
cert_path = "/etc/angarabase/tls/server.crt"
key_path = "/etc/angarabase/tls/server.key"
require_on_remote_bind = true
This protects the wire protocol and enforces fail-closed on non-loopback bind without TLS.
See authentication.md for TLS details.
Step 2 — Set auth mode
export ANGARABASE_AUTH_MODE=scram
Disables trust-only behaviour for production. The superuser must have been bootstrapped at --init time with
a SCRAM password.
See authentication.md for auth modes and SCRAM setup.
Step 3 — Enable TDE
export ANGARABASE_TDE_ENABLE=1
export ANGARABASE_TDE_MASTER_KEY_HEX=<64-hex-secret>
export ANGARABASE_TDE_MASTER_KEY_ID=master-prod-2026q1
export ANGARABASE_TDE_LAST_ROTATION_UNIX=1760000000
Enables at-rest encryption for pages, WAL, and audit sink. Without a valid key the server refuses to start (fail-closed).
See encryption.md for TDE configuration and key management.
Step 4 — Configure audit policy
export ANGARABASE_AUDIT_LOG_PATH=/var/lib/angarabase/audit/audit.jsonl
export ANGARABASE_AUDIT_DML_MODE=allowlist
export ANGARABASE_AUDIT_DML_ALLOWLIST=public.users,public.payments
export ANGARABASE_AUDIT_EXPORT_JSON_ENABLED=1
export ANGARABASE_AUDIT_EXPORT_RATE_LIMIT_RPS=50
Sets targeted DML audit coverage and bounded JSON export.
See audit.md for audit policy options and chain verification.
Step 5 — Verify effective config from SQL
After starting the server, confirm all settings from a SQL session:
SELECT name, value
FROM sys.settings
WHERE name IN (
'tls.enabled',
'tls.require_on_remote_bind',
'security.auth_mode',
'security.tde_enabled',
'security.tde_master_key_id',
'audit.dml_mode',
'audit.export_json_enabled',
'audit.export_rate_limit_rps'
)
ORDER BY name;
Only non-secret metadata appears. The TDE hex key, passwords, and SCRAM verifiers are never exposed.
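Once the rows from the query above are fetched by any PostgreSQL client, checking them against the checklist is a plain comparison. The sketch below takes the rows as (name, value) pairs and returns a list of findings; the expected values, the assumption that booleans are rendered as the text `true`, and the function name are all illustrative, not part of the product.

```python
EXPECTED = {
    "tls.enabled": {"true"},
    "tls.require_on_remote_bind": {"true"},
    "security.auth_mode": {"scram", "cert"},
    "security.tde_enabled": {"true"},
    "audit.dml_mode": {"off", "allowlist", "denylist"},
}


def audit_settings(rows):
    """rows: iterable of (name, value) pairs read from sys.settings."""
    actual = {name: str(value).lower() for name, value in rows}
    findings = []
    for name, accepted in EXPECTED.items():
        if name not in actual:
            findings.append(f"{name}: missing")
        elif actual[name] not in accepted:
            findings.append(f"{name}: unexpected value {actual[name]!r}")
    return findings


rows = [
    ("tls.enabled", "true"),
    ("tls.require_on_remote_bind", "true"),
    ("security.auth_mode", "trust"),
    ("security.tde_enabled", "true"),
    ("audit.dml_mode", "allowlist"),
]
for finding in audit_settings(rows):
    print(finding)  # security.auth_mode: unexpected value 'trust'
```

An empty findings list means the instance matches the checklist; treating a missing setting as a finding keeps the check fail-closed.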
Step 6 — Verify audit chain
SELECT * FROM angara_audit_verify_chain();
A healthy deployment returns is_valid = true.
Step 7 — Verify security surfaces
SELECT * FROM angara_user_roles() LIMIT 20;
SELECT * FROM angara_break_glass_status();
Confirm that introspection functions are responsive and return expected data.
Expected result
- The server refuses to start in unsafe configurations (fail-closed).
- sys.settings shows only non-secret metadata.
- Disk contains no plaintext audit payloads when TDE is enabled.
- Audit chain is intact; DML events appear for tables in the allowlist.
- Authentication rejects unauthenticated connections in scram/cert mode.
Troubleshooting
- Server does not start after enabling TDE
  Verify ANGARABASE_TDE_MASTER_KEY_HEX is exactly 64 hex characters and matches the data directory’s key. Fail-closed is expected.
- Auth/TLS conflict on remote bind
  If binding to a non-loopback address, tls.enabled must be true or --allow-insecure-no-auth must be passed (dev only). Check tls.require_on_remote_bind and server.host.
- DML audit events missing
  Verify audit.dml_mode is not off and that the target table matches the allowlist/denylist entries. Table names must be fully qualified (schema.table).
- angara_audit_verify_chain() returns is_valid = false
  The audit chain is corrupted. Note the first_broken_seq and investigate the audit file.
- Break-glass cannot activate
  The audit subsystem must be healthy. Fix the audit path or TDE key material first.
- Need a bug-report artifact? See ../reference/support.md.
Links
- Security model overview: overview.md
- Authentication: authentication.md
- Authorization: authorization.md
- Audit: audit.md
- Encryption: encryption.md
- Break-glass: break-glass.md
- Configuration reference: ../operations/configuration.md
- Security knobs registry: angarabook/src/operations/security-operations.md
GOST Security Compliance & Testing Guide
Status: TLS implemented, TDE planned
Target Audience: Security Auditors, DevOps, QA
1. GOST Security Ecosystem in AngaraBase
AngaraBase implements a layered approach to Russian national cryptographic standards (GOST).
1.1. Transport Layer (TLS) — Available
Protection of data in transit using GOST R 34.10-2012 (Public Key) and GOST 28147-89 (Cipher suites).
- Implementation: Provider-based abstraction (OpenSSL Engine / Rustls).
- Policy: Fail-closed (server refuses to start if configured GOST provider is missing).
- Configuration: tls.gost_enabled, tls.gost_cipher_suites.
1.2. Data-at-Rest (TDE) — Planned
Protection of data on disk (Pages, WAL, Audit Logs) using block ciphers Kuznyechik (GOST 34.12-2015) or Magma.
- Scope: Transparent Data Encryption (TDE) for storage files.
- Key Management: Integration with external KMS supporting GOST keys.
- Status: Roadmap item.
1.3. Integrity & Authentication — Future
- Hashing: Migration from SHA-256 to Streebog (GOST R 34.11-2012) for data checksums and SCRAM authentication.
- Audit Signing: Digital signature of audit logs to ensure non-repudiation.
2. Testing GOST TLS Support
This guide describes how to verify that AngaraBase is correctly using GOST cipher suites and strictly enforcing the fail-closed policy.
Prerequisites
You need a Linux environment with OpenSSL configured for GOST.
# Debian/Ubuntu
sudo apt-get install openssl libssl-dev libengines-gost
# Verify engine availability
openssl engine gost -t
# Output should contain: [gost] Reference implementation of GOST engine -> [ available ]
Step 1: Generate GOST Certificates
Standard RSA/ECDSA certificates will not work with GOST cipher suites. You must generate keys using GOST algorithms.
# 1. Generate a private key using GOST R 34.10-2012 (256 bit)
openssl genpkey -algorithm gost2012_256 -pkeyopt paramset:A -out gost_server.key
# 2. Generate a self-signed certificate
openssl req -new -x509 -days 365 \
-key gost_server.key \
-out gost_server.crt \
-subj "/CN=localhost"
# 3. Verify the certificate algorithm
openssl x509 -in gost_server.crt -text -noout | grep "Signature Algorithm"
# Expected: Signature Algorithm: GOST R 34.10-2012 with GOST R 34.11-2012 (256 bit)
Step 2: Configure AngaraBase
Enable TLS and GOST mode. Ensure allow_insecure is OFF to test strict mode.
export ANGARABASE_TLS_ENABLE=1
export ANGARABASE_TLS_CERT_PATH=$(pwd)/gost_server.crt
export ANGARABASE_TLS_KEY_PATH=$(pwd)/gost_server.key
export ANGARABASE_TLS_GOST_ENABLED=1
export ANGARABASE_TLS_GOST_CIPHER_SUITES="GOST2012-GOST8912-GOST8912"
# Start the server
./angarabase-server
Step 3: Verification (Positive Test)
Connect using a client that supports GOST (e.g., openssl s_client or a patched psql).
Using OpenSSL s_client:
# -starttls postgres sends the pgwire SSLRequest before the TLS handshake
openssl s_client -starttls postgres -connect localhost:5152 -servername localhost
Verification Checklist:
- Look for Cipher : GOST2012-GOST8912-GOST8912 (or similar GOST suite) in the output.
- Look for Protocol : TLSv1.2.
- Ensure the handshake completes successfully.
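The checklist above can be automated when the s_client output is captured to a file or a pipe. This is a rough sketch under assumptions: the helper names are hypothetical, and treating any cipher name that starts with GOST as a GOST suite is a heuristic, not an AngaraBase rule.

```python
import re


def parse_s_client_session(output: str) -> dict[str, str]:
    """Extract the 'Protocol' and 'Cipher' fields printed by `openssl s_client`."""
    fields: dict[str, str] = {}
    for line in output.splitlines():
        # Session lines look like "    Cipher    : <name>".
        match = re.match(r"\s*(Protocol|Cipher)\s*:\s*(\S+)", line)
        if match:
            # Keep the first occurrence of each field.
            fields.setdefault(match.group(1), match.group(2))
    return fields


def looks_like_gost_session(output: str) -> bool:
    """Heuristic (assumption): GOST suites are recognised by their name prefix."""
    fields = parse_s_client_session(output)
    return (
        fields.get("Cipher", "").startswith("GOST")
        and fields.get("Protocol") == "TLSv1.2"
    )
```

A failed handshake typically reports a placeholder cipher, so looks_like_gost_session returns False in that case as well.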
Using SQL (if psql supports it):
SELECT name, value FROM sys.settings WHERE name LIKE 'tls.%';
-- Verify tls.gost_enabled is 'true'
Step 4: Fail-Closed Verification (Negative Test)
Verify that the server refuses to start if the environment is broken.
- Scenario A: Missing Provider.
  Temporarily disable the GOST engine (e.g., by renaming the library or changing OpenSSL config) and try to start AngaraBase with ANGARABASE_TLS_GOST_ENABLED=1.
  - Expected Result: Server panic/exit with “GOST provider not available”.
- Scenario B: Invalid Cipher Suite.
  Set ANGARABASE_TLS_GOST_CIPHER_SUITES="INVALID-CIPHER".
  - Expected Result: Server panic/exit with configuration error.
- Scenario C: RSA Certificate with GOST Ciphers.
  Try to start with ANGARABASE_TLS_GOST_ENABLED=1 but provide standard RSA certificates.
  - Expected Result: Handshake failures (OpenSSL error: “no shared cipher” or “wrong signature type”).
3. Troubleshooting
| Symptom | Probable Cause | Fix |
|---|---|---|
| no shared cipher | Client does not support GOST or Server has RSA certs. | Install libengines-gost on client; Use GOST certs on server. |
| wrong signature type | Certificate key type mismatch. | Ensure gost2012_256 is used for key generation. |
| Server fails to start | openssl.cnf not configured for GOST. | Run openssl engine gost -t to verify system setup. |
Next steps
Once you have determined which GOST scenarios you need:
- GOST crypto setup — step-by-step installation of the cryptographic provider and profile switching.
- Encryption (TDE + client) — general contract for TDE and client encryption.
- Audit — how to close GOST signatures on the audit-chain.
- Hardening runbook — final checklist before production.
Installation
AngaraBase is distributed through package repositories. Public access is free and requires no registration.
Supported operating systems
- RHEL-compatible (8+): Red Hat Enterprise Linux, CentOS Stream, AlmaLinux, Rocky Linux, Oracle Linux, Fedora.
- Gentoo Linux: supported via the official binhost.
RPM (RHEL / CentOS / Alma / Fedora)
Installation is performed using the standard dnf package manager.
1. Import the GPG key
Import the public key used to sign the packages:
rpm --import https://rpm.angarabase.dev/release-key.asc
2. Add the repository
Create the repository configuration file:
cat > /etc/yum.repos.d/angarabase.repo << 'EOF'
[angarabase]
name=AngaraBase
baseurl=https://rpm.angarabase.dev/el$releasever/stable/
enabled=1
gpgcheck=1
gpgkey=https://rpm.angarabase.dev/release-key.asc
EOF
3. Install the package
Install the server package:
dnf install angarabase-server
Gentoo Linux
Gentoo users can install from a pre-built binhost with compiled packages.
1. Import the GPG key
gpg --fetch-keys https://rpm.angarabase.dev/release-key.asc
2. Add the binhost
Add the repository to your Portage configuration:
cat >> /etc/portage/binrepos.conf << 'EOF'
[angarabase]
priority = 9999
sync-uri = https://rpm.angarabase.dev/gentoo/stable/
EOF
3. Install the package
Install the package using the binary build:
emerge --getbinpkg angarabase
Enterprise repository
Enterprise subscribers get access to a private repository with binary builds. Access is secured via mTLS (mutual certificate authentication).
Setup instructions are provided together with your invitation code when the subscription is activated. If you are a customer and do not have the instructions, contact us at team@angarabase.com.
More information: angarabase.dev/enterprise
Configuration
Goal
Configure an AngaraBase instance for local testing or production deployment using TOML config keys, data-directory conventions, and environment variable overrides.
Prerequisites
- Built angarabase-server binary (see Quickstart)
Minimal config keys
Server
[server]
addr = "127.0.0.1:5152" # host:port; use 0.0.0.0:PORT to bind all interfaces
Storage
[storage]
data_directory = "/var/lib/angarabase/data"
transaction_log_directory = "/var/lib/angarabase/txlog"
Logging
[logging]
log_level = "info"
log_directory = "/var/log/angarabase"
Ops (metrics / admin listeners)
[ops]
metrics_addr = "127.0.0.1:9898" # Prometheus metrics endpoint; empty = disabled
# admin_addr = "127.0.0.1:9999" # reserved, not yet active
Transaction log (durability)
[transaction_log]
backend = "noop"
durability = "strict"
fsync = true
TLS (optional)
[tls]
enabled = true
cert_path = "/etc/angarabase/tls/server.crt"
key_path = "/etc/angarabase/tls/server.key"
# If true and server.host is non-loopback, TLS is required (fail-closed).
require_on_remote_bind = true
WAL (Write-Ahead Log)
[wal]
vlf_enable = true
max_size_mb = 512
vlf_size_mb = 32
init_vlfs = 2
auto_shrink = true
SQL Execution
[execution]
mode = "auto" # auto | force_vector | force_row
vector_batch_size = 1024
query_memory_limit_mb = 256
Adaptive Query Processing (AQP)
[aqp]
enabled = true
mode = "conservative" # conservative | aggressive
min_query_time_ms = 100
learning_rate = 0.1
max_correction = 100.0
variance_threshold = 10.0
correction_cache_mb = 64
store_capacity_mb = 1024
Diagnostics
[diagnostics]
log_min_duration_ms = -1 # -1 = disabled
log_query_text = false
stat_statements_max = 0 # 0 = disabled
Initialization workflow
Run --init before the first normal start:
angarabase-server --config /path/to/angarabase.conf --init
- Directories data_directory and transaction_log_directory must be writable.
- When a valid logging.log_directory is set, the server writes angarabase-server.log.
- --init creates all required data-directory artifacts (see below).
For local dev/test shortcuts see Quickstart.
Data directory artifacts
Inside storage.data_directory the server creates and maintains:
| Artifact | Purpose |
|---|---|
| VERSION | Initialization marker (format_version, min_server_version); enforces fail-closed startup. |
| base.adb | On-disk system database file. |
Legacy text artifacts (identity_v0.txt, sys_catalog/*.txt) are no longer the source of truth and are not
created by --init. If they remain from older versions they are ignored.
Environment variables
Environment variables override the corresponding config keys. When both are set, the env variable wins.
Precedence rule: default → config → env override
All parameters can be configured in angarabase.conf. Environment variables are intended for operational
override without restart (for dynamic parameters) or for diagnostics. For permanent configuration, use the
TOML config file.
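The precedence rule can be pictured as a three-level lookup. The sketch below is illustrative only: the function name and the source labels ("default", "config", "env") are hypothetical and are not confirmed to match what sys.settings reports.

```python
from typing import Any, Mapping


def resolve_setting(
    key: str,
    env_var: str,
    defaults: Mapping[str, Any],
    config: Mapping[str, Any],
    env: Mapping[str, str],
) -> tuple[Any, str]:
    """Return (value, source) following default -> config -> env override."""
    if env_var in env:
        # Environment values arrive as strings; parsing is left to the caller.
        return env[env_var], "env"
    if key in config:
        return config[key], "config"
    return defaults[key], "default"
```

With a built-in default of 512, a config value of 1024, and ANGARABASE_WAL_MAX_SIZE_MB=2048 in the environment, the lookup returns the environment value, which matches the rule that the env variable wins when both are set.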
Core server knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_SHUTDOWN_TIMEOUT_MS | 1000 | Bounded graceful-shutdown timeout (ms). |
| ANGARABASE_ALLOW_SQL_SHUTDOWN | off | Allow shutdown trigger via SELECT sys.request_shutdown(). |
| ANGARABASE_STORAGE_STRICT_STARTUP | enabled | Storage verification on startup. Set 0/false/off to disable. |
| ANGARABASE_METRICS_ADDR | — | Override for [ops].metrics_addr. |
| ANGARABASE_SUPPORT_CONTACT | https://github.com/angarabase/angarabase/issues | Support contact shown by linux-gnu glibc compatibility guard when runtime glibc < 2.28. Override only if you run a forked distribution with its own support channel. |
| ANGARABASE_IN_MEMORY_MAX_ROWS_PER_TABLE | 100000 | Default hard cap for storage='memory' tables when max_rows is omitted. Must be positive. |
TLS knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_TLS_ENABLE | off | Enable TLS upgrade via pgwire SSLRequest. |
| ANGARABASE_TLS_CERT_PATH | — | PEM certificate chain (required when TLS enabled). |
| ANGARABASE_TLS_KEY_PATH | — | PEM private key (required when TLS enabled). |
Transaction log knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_TRANSACTION_LOG | — | Override txlog backend (noop / file / file_bin). |
| ANGARABASE_TRANSACTION_LOG_DURABILITY | — | Override durability (strict / relaxed / group_commit). |
| ANGARABASE_TRANSACTION_LOG_FSYNC | — | Override fsync for strict mode (0 / 1). |
WAL (Write-Ahead Log) knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_WAL_VLF_ENABLE | enabled | Enable VLF (Virtual Log Files) circular layout. Set 0 to disable and use linear WAL. |
| ANGARABASE_WAL_MAX_SIZE_MB | 512 | Maximum WAL file size in MB. Controls circular buffer size for VLF. |
| ANGARABASE_WAL_VLF_SIZE_MB | 32 | Individual VLF segment size in MB. Must be smaller than WAL_MAX_SIZE_MB. |
| ANGARABASE_WAL_INIT_VLFS | 2 | Initial number of VLF segments to allocate at startup. |
| ANGARABASE_WAL_AUTO_SHRINK | enabled | Automatically shrink unused VLF segments after checkpoint. Set 0 to disable. |
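The relationship between these knobs can be sanity-checked before deployment. The sketch below is an assumption-laden illustration: only the rule that the VLF size must be smaller than the maximum WAL size comes from the table; treating the segment capacity as the integer quotient and rejecting an initial count above it are hypothetical checks, not documented server behavior.

```python
def check_wal_layout(max_size_mb: int, vlf_size_mb: int, init_vlfs: int) -> int:
    """Validate WAL sizing and return how many whole VLF segments fit."""
    if max_size_mb <= 0 or vlf_size_mb <= 0 or init_vlfs <= 0:
        raise ValueError("sizes and counts must be positive")
    if vlf_size_mb >= max_size_mb:
        # Documented constraint: VLF size must be smaller than the WAL maximum.
        raise ValueError("vlf_size_mb must be smaller than max_size_mb")
    capacity = max_size_mb // vlf_size_mb  # assumption: whole segments only
    if init_vlfs > capacity:
        # Assumption: the initial allocation should fit inside the budget.
        raise ValueError("init_vlfs exceeds the number of segments that fit")
    return capacity
```

With the defaults from the table (512 MB maximum, 32 MB segments, 2 initial segments) the function returns 16.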
Diagnostics / slow-query knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_LOG_MIN_DURATION_MS | -1 (disabled) | Slow-query log threshold in milliseconds. |
| ANGARABASE_LOG_QUERY_TEXT | 0 | Include raw SQL text in slow log (0 / 1). |
| ANGARABASE_STAT_STATEMENTS_MAX | — | Max in-memory entries for angara_stat_statements (LRU bounded). |
Optimizer planning knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_OPTIMIZER_PLANNING_TIMEOUT_MS | 0 | CBO planning budget in milliseconds (0 = disabled). On timeout the planner degrades to a greedy fallback plan instead of returning an error. |
| sql.optimizer.planning_timeout_ms | 0 | Settings surface alias for optimizer planning timeout; reflected in sys.settings when configured. |
Vector execution knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_SQL_EXECUTION_MODE | auto | Execution mode for SELECT pipeline: auto (default, vector only when fully supported), force_vector (fail-closed if vector path is unavailable), force_row (always row engine). Legacy aliases vector and vector_auto remain accepted for compatibility. |
| ANGARABASE_VECTOR_BATCH_SIZE | 1024 | Vector batch size (1..1024), mostly useful for tests and diagnostics. |
| ANGARABASE_QUERY_MEMORY_LIMIT_MB | 256 | Per-query vector memory budget; exceeding the limit fails closed with SQLSTATE 53100 (no OOM fallback). |
Parallel execution governance knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_PARALLEL_DOP_CAP_GLOBAL | CPU cores | Global upper bound for SQL parallel workers. |
| ANGARABASE_PARALLEL_DOP_CAP_QUERY | CPU cores | Per-query upper bound for SQL parallel workers. Effective DOP is capped by both values. |
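The statement that effective DOP is capped by both values amounts to taking a minimum. A small sketch, assuming a hypothetical requested degree from the planner and a floor of one worker (neither is specified above):

```python
import os
from typing import Optional


def effective_dop(
    requested: int,
    global_cap: Optional[int] = None,
    query_cap: Optional[int] = None,
) -> int:
    """Cap a requested degree of parallelism by the global and per-query limits."""
    cores = os.cpu_count() or 1  # documented default for both caps: CPU cores
    g = cores if global_cap is None else global_cap
    q = cores if query_cap is None else query_cap
    # Assumption: at least one worker always runs.
    return max(1, min(requested, g, q))
```

For instance, a request for 8 workers with a global cap of 4 and a per-query cap of 6 yields 4.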
Adaptive query processing knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_AQP_ENABLED | 1 | Global AQP switch for advisory feedback learning/apply. |
| ANGARABASE_AQP_MODE | conservative | AQP mode (conservative / aggressive); conservative keeps correction hysteresis. |
| ANGARABASE_AQP_MIN_QUERY_TIME_MS | 100 | Minimum query runtime eligible for feedback observation. |
| ANGARABASE_AQP_LEARNING_RATE | 0.1 | EMA learning rate used for correction updates. |
| ANGARABASE_AQP_MAX_CORRECTION | 100.0 | Upper correction multiplier bound (guardrail against outliers). |
| ANGARABASE_AQP_VARIANCE_THRESHOLD | 10.0 | If variance exceeds threshold, correction entry is marked unstable and ignored. |
| ANGARABASE_AQP_CORRECTION_CACHE_MB | 64 | In-memory correction cache budget. |
| ANGARABASE_AQP_STORE_CAPACITY_MB | 1024 | Total bounded advisory store capacity; overflow evicts deterministic low-value entries. |
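To make the learning-rate, max-correction, and variance knobs concrete, here is a generic exponential moving average (EMA) step. It is a sketch of the textbook technique, not AngaraBase's actual update rule: the function names and the choice of an actual/estimated ratio as the observed signal are assumptions.

```python
def update_correction(
    previous: float,
    observed_ratio: float,
    learning_rate: float = 0.1,
    max_correction: float = 100.0,
) -> float:
    """One EMA step, clamped to the upper correction bound."""
    blended = (1.0 - learning_rate) * previous + learning_rate * observed_ratio
    return min(blended, max_correction)


def is_stable(variance: float, variance_threshold: float = 10.0) -> bool:
    """An entry whose variance exceeds the threshold is treated as unstable."""
    return variance <= variance_threshold
```

Starting from a correction of 1.0 and observing a ratio of 3.0, one step with the default learning rate moves the correction to about 1.2, which shows why a small rate damps the effect of a single outlier.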
OpenTelemetry tracing knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_OTEL_ENABLED | 0 | Enable OTel-style query span export (0 / 1). |
| ANGARABASE_OTEL_SAMPLE_RATE_PPM | 1000000 | Sampling in parts-per-million (0..1000000). |
| ANGARABASE_OTEL_EXPORTER | stderr | Export sink (stderr / file). |
| ANGARABASE_OTEL_ENDPOINT | — | Export target for file exporter (JSONL path). |
See also Diagnostics for query performance analysis.
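A parts-per-million sample rate maps naturally onto a bucket comparison. The sketch below shows one common way to do this deterministically; how AngaraBase actually selects spans is not described here, so the hash-based choice and the name should_sample are assumptions.

```python
import zlib

PPM = 1_000_000


def should_sample(trace_key: str, sample_rate_ppm: int) -> bool:
    """Deterministically decide whether a span is exported at the given rate."""
    if not 0 <= sample_rate_ppm <= PPM:
        raise ValueError("sample rate must be within 0..1000000")
    # Map the key to a stable bucket in [0, PPM) and compare with the rate.
    bucket = zlib.crc32(trace_key.encode("utf-8")) % PPM
    return bucket < sample_rate_ppm
```

A rate of 1000000 exports every span and a rate of 0 exports none; intermediate values keep roughly that fraction of distinct keys.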
Authentication knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_AUTH_MODE | — | Auth mode contract (trust / scram / cert). |
Startup safety: trust/no-auth mode now requires explicit --allow-insecure-no-auth; without this flag the
server refuses to start when ANGARABASE_AUTH_MODE resolves to trust/no-auth.
Security / break-glass knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_SECURITY_BREAK_GLASS_MAX_TTL | 86400 | Max allowed break-glass TTL in seconds. |
Audit knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_AUDIT_LOG_PATH | — | Path to append-only audit sink file (JSONL + chain-hash fields). |
| ANGARABASE_AUDIT_MAX_BYTES | 4194304 (4 MiB) | Rotation threshold for audit sink. |
| ANGARABASE_AUDIT_DML_MODE | off | Audit policy for DML (off / allowlist / denylist). |
| ANGARABASE_AUDIT_DML_ALLOWLIST | — | CSV schema.table list for audited DML (allowlist mode). |
| ANGARABASE_AUDIT_DML_DENYLIST | — | CSV schema.table list excluded from audit (denylist mode). |
| ANGARABASE_AUDIT_EXPORT_JSON_ENABLED | 0 | Enable JSON export worker. |
| ANGARABASE_AUDIT_EXPORT_SYSLOG_ENABLED | 0 | Enable syslog export worker. |
| ANGARABASE_AUDIT_EXPORT_RATE_LIMIT_RPS | — | Export rate limit in records/second (bounded). |
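The three DML audit modes can be read as a simple membership test. The following sketch mirrors the table's wording; the helper names are hypothetical, and exact, case-sensitive matching of schema.table names is an assumption.

```python
def parse_table_csv(raw: str) -> set[str]:
    """Parse a CSV list of fully qualified schema.table names."""
    tables = set()
    for item in raw.split(","):
        name = item.strip()
        if not name:
            continue
        schema, dot, table = name.partition(".")
        if not (schema and dot and table) or "." in table:
            raise ValueError(f"table name must be schema.table: {name!r}")
        tables.add(name)
    return tables


def should_audit_dml(mode: str, table: str, allowlist: str = "", denylist: str = "") -> bool:
    """Decide whether DML on a table is audited under the given mode."""
    if mode == "off":
        return False
    if mode == "allowlist":
        return table in parse_table_csv(allowlist)
    if mode == "denylist":
        return table not in parse_table_csv(denylist)
    raise ValueError(f"unknown audit DML mode: {mode!r}")
```

An unqualified entry such as orders is rejected up front, which surfaces the most common reason for missing DML audit events.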
AngaraIO v2 I/O Scheduler knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_IO_URING_QUEUE_DEPTH_LOW | entries/4 | Low-priority I/O queue depth. Controls backpressure for non-critical operations. Clamped to 1–16384. |
| ANGARABASE_IO_URING_BATCH_MAX | 16 | Maximum I/O operations batched per scheduler round. Higher values improve throughput but increase latency. Clamped to 1–1024. |
Note: High-priority queue is unbounded per RFC invariant to ensure critical operations never block.
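The clamping described in the table can be illustrated as follows. This is only a sketch: interpreting "entries" as the io_uring ring size and using integer division for the default are assumptions, and the function name is hypothetical.

```python
from typing import Optional


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def io_scheduler_limits(
    ring_entries: int,
    queue_depth_low: Optional[int] = None,
    batch_max: int = 16,
) -> dict[str, int]:
    """Apply the documented clamps to the two scheduler knobs."""
    # Assumption: the default low-priority depth is ring_entries // 4.
    depth = ring_entries // 4 if queue_depth_low is None else queue_depth_low
    return {
        "queue_depth_low": clamp(depth, 1, 16384),
        "batch_max": clamp(batch_max, 1, 1024),
    }
```

Out-of-range values are silently pulled back into range in this sketch, so an oversized override does not fail startup.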
TDE (Transparent Data Encryption) knobs
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_TDE_ENABLE | 0 | Enable TDE v0 for page/WAL at-rest encryption. |
| ANGARABASE_TDE_MASTER_KEY_HEX | — | Master key bytes in hex (64 chars; secret; required when TDE enabled). |
| ANGARABASE_TDE_MASTER_KEY_ID | — | Non-secret key identifier visible in sys.settings. |
| ANGARABASE_TDE_LAST_ROTATION_UNIX | — | Non-secret last-rotation timestamp visible in sys.settings. |
Security note: when ANGARABASE_TDE_ENABLE=1 and ANGARABASE_AUDIT_LOG_PATH is set, the audit sink is
encrypted at rest; without valid key material audit read/write is fail-closed (no plaintext fallback).
Expected result
After a successful --init and start:
curl -sS http://127.0.0.1:9898/health/ready # port matches [ops].metrics_addr
Returns {"status":"ready"}. Effective configuration is visible via:
SELECT * FROM sys.settings;
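In scripts, the readiness check is usually wrapped in a retry loop because the endpoint is unavailable while the server starts. A minimal sketch, assuming the default metrics address shown above; the function names, attempt count, and delay are illustrative choices, not part of AngaraBase.

```python
import json
import time
import urllib.request


def fetch_ready(url: str = "http://127.0.0.1:9898/health/ready", timeout: float = 2.0) -> str:
    """Fetch the readiness endpoint body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def wait_until_ready(fetch=fetch_ready, attempts: int = 30, delay_s: float = 1.0) -> bool:
    """Poll until the body is {"status": "ready"} or attempts run out."""
    for attempt in range(attempts):
        try:
            body = json.loads(fetch())
            if isinstance(body, dict) and body.get("status") == "ready":
                return True
        except (OSError, ValueError):
            # Connection refused, HTTP errors, or a non-JSON body: retry.
            pass
        if attempt + 1 < attempts:
            time.sleep(delay_s)
    return False
```

The fetch parameter is injectable so the loop can be exercised without a running server.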
Troubleshooting
| Symptom | Action |
|---|---|
| Server refuses to start in trust mode | Add --allow-insecure-no-auth. |
| VERSION mismatch at startup | The data directory was initialized by a different server version. Re-init or upgrade. |
| TDE fail-closed after restore | Ensure the same key material (ANGARABASE_TDE_MASTER_KEY_HEX) is available. |
| Parameter not applied | Check precedence: env variables override config. Use SELECT * FROM sys.settings to see effective values and sources. |
For unresolved issues see Known issues and Support.
Links
- Security/ops knobs registry (full defaults, fail-closed gates, sys.settings contract): angarabook/src/operations/security-operations.md
- Operator runbook: angarabook/src/operations/troubleshooting.md
- Security overview: Security model
- Quickstart: Quickstart
Container Deployment Quickstart
Operator quickstart for image-first AngaraBase startup:
- local docker run with a cgroup-aware startup probe;
- minimal k8s deployment (single-node, without HA);
- smoke checks for readiness and basic diagnostics.
0) 30-second evaluator path (image-only)
If you need a quick evaluator/DPP smoke without a Rust toolchain:
tools/compat_suite/image_smoke.sh
The script starts a container from the canonical Dockerfile, waits for /health/ready,
checks the startup log (deployment probe resolved effective budgets, cgroup_version),
and verifies probe metrics in /metrics.
1) Local container smoke (docker run)
Build the image:
docker build -t angarabase:local .
Run with a memory limit (probe contract):
docker run --rm --name angarabase-local \
--memory=512m \
-e ANGARABASE_DEPLOYMENT_PROFILE=container \
-p 5152:5152 \
-p 9898:9898 \
angarabase:local
Expected signal in logs:
- the deployment probe resolved effective budgets line is present;
- deployment_profile=container;
- memory_source/cpu_source reflect cgroup_v1 or cgroup_v2 (or proc_fallback with fallback_reason).
Check readiness in the container:
curl -fsS http://127.0.0.1:9898/health/ready
The container HEALTHCHECK uses the same endpoint, so it becomes green only
after health_readiness_reason_v0() passes.
Check metrics:
curl -fsS http://127.0.0.1:9898/metrics | rg "deployment_probe|deployment_profile"
2) Resource-limit examples
CPU + memory cap:
docker run --rm --name angarabase-capped \
--memory=1g \
--cpus=1.5 \
-e ANGARABASE_DEPLOYMENT_PROFILE=container \
-p 5152:5152 \
-p 9898:9898 \
angarabase:local
Explicit override (the probe must not overwrite it):
docker run --rm \
--memory=512m \
-e ANGARABASE_DEPLOYMENT_PROFILE=container \
-e ANGARABASE_MEMORY_BUDGET=256MB \
-e ANGARABASE_CPU_BUDGET=1000m \
-p 5152:5152 \
-p 9898:9898 \
angarabase:local
Expected: the budget source in metrics/logs becomes config_override.
3) Kubernetes minimal smoke
Apply manifests:
kubectl apply -f tools/deploy/kubernetes/minimal/namespace.yaml
kubectl apply -f tools/deploy/kubernetes/minimal/configmap.yaml
kubectl apply -f tools/deploy/kubernetes/minimal/service.yaml
kubectl apply -f tools/deploy/kubernetes/minimal/statefulset.yaml
Verify:
kubectl -n angarabase get pods
kubectl -n angarabase get svc angarabase
kubectl -n angarabase logs statefulset/angarabase
kubectl -n angarabase get statefulset angarabase -o yaml | rg "resources:|health/"
Important: resources.requests/limits are set in statefulset.yaml. If limits are removed,
the probe may see “unlimited” and switch to a host-level fallback for budgets.
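The "unlimited" case is easy to see in the cgroup v2 interface, where memory.max contains either a byte count or the literal string max. The sketch below illustrates that decision only; it is not the server's probe, the function name is hypothetical, and the source labels are borrowed from the log fields mentioned earlier.

```python
from typing import Optional


def resolve_memory_budget(
    cgroup_memory_max: Optional[str],
    host_total_bytes: int,
) -> tuple[int, str]:
    """Pick a memory budget from cgroup v2 memory.max, else fall back to the host.

    cgroup_memory_max is the raw file content, or None if the file is absent.
    """
    if cgroup_memory_max is not None:
        text = cgroup_memory_max.strip()
        if text and text != "max":
            # A numeric value is the container limit in bytes.
            return int(text), "cgroup_v2"
    # "max" (no limit) or a missing file: use the host-level value.
    return host_total_bytes, "proc_fallback"
```

With a 512 MiB limit the file contains 536870912 and the budget follows the container; without limits the same call returns the host total, which is why removing limits changes the reported source.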
Delete:
kubectl delete namespace angarabase
4) Diagnostics and backup entrypoints
For the packaged/operator path, use angara-cli instead of direct tools/... calls:
angara-cli diagnostics bundle --root artifacts/diagnostics/container-smoke --json
angara-cli backup full --config /etc/angarabase/angarabase.conf --out /tmp/base_full.angarabk
Related runbooks:
DPP alignment:
- for resource-limit safety rails and fail-closed mode, see business_strategy/PILOT_CHECKLIST.md (§5 and §6).
Backup and restore
Goal
Create and restore backups of an AngaraBase instance, from cold/offline baselines through an online FULL + LOG chain with point-in-time recovery (PITR).
Prerequisites
- Initialized AngaraBase instance (angarabase-server --init; see Configuration)
- Built angara-cli binary (see Quickstart)
- For online backups (phase 1b): a running server with WAL (transaction_log backend ≠ noop)
Cold / offline backup
The original supported method. The server must be stopped (or will be stopped by the runner).
- Copy storage.data_directory and storage.transaction_log_directory.
- After restore, run the txlog-level oracle.
- The source must be a correctly initialized data directory.
Pinned commands (legacy shell)
Backup:
tools/backup_restore/run.sh backup \
--data-dir /var/lib/angarabase/data \
--txlog-dir /var/lib/angarabase/transaction_log \
--out /tmp/angarabase-backup.tar.gz \
--root artifacts/backup_restore/backup
Restore:
tools/backup_restore/run.sh restore \
--archive /tmp/angarabase-backup.tar.gz \
--dest /tmp/angarabase-restore \
--force \
--root artifacts/backup_restore/restore
Oracle (post-restore verification):
tools/backup_restore/oracle.sh --root artifacts/backup_restore_oracle/run_1
Remote Admin Backup Flow (Online Backup via Admin API)
AngaraBase supports triggering backups remotely via the Admin TCP endpoint. This is the recommended approach for production deployments, as it does not require direct file system access to the database node.
Requirements
- The angarabase-server must be running with the Admin API enabled (configured via admin.listen_address in angarabase.conf or the ANGARABASE_ADMIN_ADDR environment variable).
- angara-cli must be executed with the --addr <host:port> parameter instead of --config.
- The user executing the backup must have network access to the Admin API port (default: 9899).
How it works
When using --addr, angara-cli connects to the running server’s Admin API and issues a RUN backup_full
command. The server coordinates the backup process internally (Fuzzy Copy, WAL Inclusion) and streams the
resulting .angarabk file back to the client over the network.
This allows operators to take backups from a central management node without SSH access to the database servers.
Storage layout note
User tables use the page-based .adb path. Legacy heap_store/*.bin is migrated to .adb on first access.
For backup/restore and triage:
- Both files per database are critical: <db>.adb and <db>.atl.
- heap_store/ is not the source of truth after migration.
- Upgrading old instances may take time proportional to heap_store volume.
- If the backup was taken with TDE enabled (ANGARABASE_TDE_ENABLE=1), restore requires the same valid key material; without it, startup after restore is fail-closed.
Backup/Restore v2 — phase 1a: one-file backupset
Minimal offline pipeline producing a single *.angarabk file (manifest-first format).
Capabilities
| Command | Purpose |
|---|---|
| backup full | Offline FULL backup to a single file. |
| backup inspect | Read manifest only (no payload scan). |
| backup verify | Struct/hash integrity check (binary answer). |
| backup restore | Local restore + txlog-level oracle (best-effort). |
Pinned commands (phase 1a)
# FULL (offline/local baseline)
angara-cli backup full \
--config /path/to/angarabase.conf \
--out /tmp/full_0001.angarabk \
--db-id base
# FULL (remote execution via Admin API)
angara-cli backup full \
--addr 127.0.0.1:9899 \
--out /tmp/full_0001.angarabk \
--db-id base
# INSPECT (manifest-only)
angara-cli backup inspect --file /tmp/full_0001.angarabk
# VERIFY v0 (struct/hash)
angara-cli backup verify --file /tmp/full_0001.angarabk --json
# RESTORE (local) into a new directory
angara-cli backup restore \
--config /path/to/angarabase.conf \
--file /tmp/full_0001.angarabk \
--target-dir /tmp/restore_0001 \
--overwrite
Evidence pack (N=3)
tools/backup_restore/evidence_v2_phase1a.sh --runs 3
Inspect vs verify
- backup inspect: manifest-only, no payload scan.
- backup verify: struct/hash integrity; proves backupset consistency, not SQL-level correctness.
Backup/Restore v2 — phase 1b: online FULL + LOG chain + PITR
Online pipeline with point-in-time recovery support.
Capabilities
| Command | Purpose |
|---|---|
| backup full-online | Online FULL boundary (best-effort, with backup fence). |
| backup log | WAL chunk by LSN range (strictly contiguous chain). |
| backup chain-validate | Binary validation of the full chain. |
| backup restore-chain | PITR restore to target_lsn + oracle. |
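The "strictly contiguous chain" requirement can be pictured as a check over adjacent LSN boundaries. This is an illustrative sketch, not the chain-validate implementation: the Chunk type is hypothetical, and it assumes each chunk starts exactly at the previous chunk's end LSN (the real boundary convention may differ).

```python
from typing import NamedTuple, Optional


class Chunk(NamedTuple):
    name: str
    start_lsn: int
    end_lsn: int


def first_gap(chain: list[Chunk]) -> Optional[tuple[str, str]]:
    """Return the first pair of adjacent chunks that is not contiguous, if any."""
    for prev, nxt in zip(chain, chain[1:]):
        # Assumption: contiguity means next.start_lsn == prev.end_lsn.
        if nxt.start_lsn != prev.end_lsn:
            return prev.name, nxt.name
    return None
```

A result of None means every LOG chunk picks up where the previous file ended; any other result names the two files between which a LOG chunk must be re-taken.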
Pinned commands (phase 1b)
# ONLINE FULL (best-effort, with backup fence)
# Local execution:
angara-cli backup full-online \
--config /path/to/angarabase.conf \
--out /tmp/full_online_0001.angarabk \
--db-id base
# Remote execution (via Admin API):
angara-cli backup full-online \
--addr 127.0.0.1:9899 \
--out /tmp/full_online_0001.angarabk \
--db-id base
# LOG chunk (LSN boundaries must be contiguous with FULL/prev LOG)
angara-cli backup log \
--config /path/to/angarabase.conf \
--out /tmp/log_0001.angarabk \
--start-lsn <u64> --end-lsn <u64> \
--db-id base
# Validate chain contiguity
angara-cli backup chain-validate \
--chain /tmp/full_online_0001.angarabk,/tmp/log_0001.angarabk \
--json
# PITR restore to target_lsn
angara-cli backup restore-chain \
--config /path/to/angarabase.conf \
--target-dir /tmp/restore_chain_0001 \
--target-lsn <u64> \
--chain /tmp/full_online_0001.angarabk,/tmp/log_0001.angarabk \
--overwrite
Evidence pack (N=3)
tools/backup_restore/evidence_v2_phase1b.sh --runs 3
Expected result
After a successful restore the server should start cleanly and previously committed data should be present:
SELECT * FROM sys.health;
SELECT * FROM sys.identity;
For PITR restores, data reflects the state at target_lsn.
Troubleshooting
| Symptom | Action |
|---|---|
| Restore fails with TDE error | Ensure the same ANGARABASE_TDE_MASTER_KEY_HEX is set. See Configuration — TDE knobs. |
| Chain validation fails | LSN boundaries between FULL and LOG chunks must be contiguous. Re-take the LOG chunk from the correct start LSN. |
| Oracle reports mismatch | The txlog replay-pages oracle is best-effort. Collect artifacts and file a report via Support. |
| heap_store migration slow after restore | Expected for legacy-format backups. Migration is proportional to heap_store size. |
For unresolved issues see Known issues and Support.
Host migration (without dump/restore)
An alternative to backup/restore for moving AngaraBase to another host is direct data-file copying using the Instance Lease system.
When applicable
- Single-node deployment — one AngaraBase instance
- Shared storage — NFS, SAN, or manual file copy
- Same version — the same AngaraBase version on source and target
- Maintenance window — the instance can be stopped during copying
Step-by-step instructions
- Stop the source instance:
# Graceful shutdown to release the lease
kill -TERM <angarabase-pid>
# Verify shutdown completed
ps aux | grep angarabase-server
- Verify lease release (optional):
-- On another instance or via backup connection
SELECT lease_holder_id FROM sys.identity;
- Copy data files:
# Copy data directory
rsync -av /source/data/ /target/data/
# Copy transaction log directory
rsync -av /source/txlog/ /target/txlog/
# Confirm the .adb files arrived (this lists files; it does not verify their contents)
find /target/data -name "*.adb" -exec ls -la {} \;
- Start on the target host:
# Update configuration if needed
vim /target/angarabase.conf
# Start AngaraBase
angarabase-server --config /target/angarabase.conf
- Verify migration:
-- Check lease holder
SELECT lease_holder_hostname, recovery_mode FROM sys.identity;
-- Verify data integrity
SELECT COUNT(*) FROM your_tables;
-- Check system state
SELECT * FROM sys.health;
What to check after startup
- lease_holder_hostname changed to the new host
- recovery_mode shows the recovery type
- All tables are accessible and contain the expected data
- No errors in server logs
- Client connections work correctly
Limitations
- Downtime required — requires stopping the service during copying
- File consistency — files must be copied atomically
- Version compatibility — AngaraBase versions must match
- Configuration — paths in the configuration may need updates
Comparison with backup/restore
| Aspect | Host migration | Backup/restore |
|---|---|---|
| Downtime | File copy time | Backup + restore time |
| Disk space | 1x (target only) | 2x (backup + target) |
| Network | Direct copy | Through backup storage |
| Complexity | Low | Medium |
| Point-in-time | Current state only | Any LSN |
See also
Detailed host migration instructions: Crash Recovery — Host Migration
Links
- Canonical operator runbook: angarabook/src/operations/backup-restore.md
- Configuration reference: Configuration
- Diagnostics (post-restore health check): Diagnostics
- Host migration details: Crash Recovery
Crash Recovery and Storage Portability
This guide covers crash recovery scenarios and how to safely restart AngaraBase instances on existing data files, including migration to different hosts.
Overview
AngaraBase includes an Instance Lease system that prevents dual-write corruption while enabling safe crash recovery and storage portability. The system works on any filesystem, including NFS and SAN where traditional file locking is unreliable.
What Happens During Crash Recovery
When AngaraBase starts on existing data files, it performs these recovery phases:
Phase A: WAL Recovery (file_bin backend only)
- Scans transaction log files for incomplete entries
- Truncates partial tail records to maintain consistency
- Replays committed page deltas (redo operations)
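Truncating a partial tail record is a standard log-recovery step. The sketch below shows the idea on a made-up framing (a 4-byte little-endian length followed by the payload); AngaraBase's real record format is not described here and would normally include checksums as well.

```python
import struct

# Hypothetical framing: 4-byte little-endian payload length, then the payload.
HEADER = struct.Struct("<I")


def valid_prefix_length(log: bytes) -> int:
    """Return the byte length of the longest prefix made of complete records."""
    offset = 0
    while offset + HEADER.size <= len(log):
        (length,) = HEADER.unpack_from(log, offset)
        end = offset + HEADER.size + length
        if end > len(log):
            # Partial tail record: stop before it.
            break
        offset = end
    return offset
```

Recovery would then truncate the file to the returned length before replaying, so a record torn by the crash is never interpreted.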
Phase B: MVCC History Restore
- Recovers in-memory MVCC state from transaction log
- Marks uncommitted transactions as aborted
- Restores visibility information for concurrent reads
Phase C: Instance Lease Check
- Checks for active instance lease in base.adb
- Prevents dual instance startup with fail-closed error
- Automatically takes over expired leases (crash recovery)
- Automatically takes over expired leases (crash recovery)
Restarting on Existing Files (Same Host)
For normal restart scenarios on the same machine:
- Stop the server (if running):
# Graceful shutdown releases the lease automatically
kill -TERM <angarabase-pid>
- Start normally:
angarabase-server --config angarabase.conf
- Verify recovery:
SELECT recovery_mode, lease_holder_id FROM sys.identity;
After a graceful shutdown the lease is already released. After a crash, the instance lease is taken over automatically once its TTL expires (default: 30 seconds).
Host Migration (Without dump/restore)
To move data files to a different host:
Prerequisites
- Both hosts have the same AngaraBase version
- Same page size (checked automatically)
- Shared storage OR manual file copy
Step-by-Step Process
- Stop the source instance:
# Graceful shutdown to release lease
kill -TERM <angarabase-pid>
# Verify shutdown completed
ps aux | grep angarabase-server
- Verify lease is released:
# Check that no process holds the lease
# (Optional: use another AngaraBase instance to query sys.identity)
- Copy data files (if not using shared storage):
# Copy entire data directory
rsync -av /old/host/data/ /new/host/data/
# Copy transaction log directory
rsync -av /old/host/txlog/ /new/host/txlog/
- Start on new host:
angarabase-server --config angarabase.conf
- Verify migration:
SELECT lease_holder_hostname, recovery_mode FROM sys.identity;
SELECT COUNT(*) FROM your_tables; -- Verify data integrity
Force Lease Takeover
If the previous instance crashed and the lease hasn’t expired, use force takeover:
When to Use
- Previous instance confirmed dead (host crashed, process killed)
- Lease shows expired time but takeover not automatic
- Emergency recovery scenarios
How to Use
# Set environment variable before starting
export ANGARABASE_FORCE_LEASE_TAKEOVER=1
angarabase-server --config angarabase.conf
Safety Checks
Before forcing takeover, verify:
- Previous instance process is definitely terminated
- No other AngaraBase processes accessing the same files
- Network partitions resolved (if applicable)
Warning: Force takeover with a running instance will cause data corruption.
Diagnostics
Check Lease Status
SELECT
lease_holder_id,
lease_holder_hostname,
lease_expires_at,
lease_acquired_at,
recovery_mode
FROM sys.identity;
Check Recovery Metrics
SELECT * FROM sys.health;
Lease Configuration
Environment variables (set before startup):
- ANGARABASE_LEASE_TTL_S: Lease duration in seconds (default: 30)
- ANGARABASE_LEASE_HEARTBEAT_S: Heartbeat interval (default: 10)
- ANGARABASE_FORCE_LEASE_TAKEOVER: Force takeover flag (default: false)
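To make the defaults above concrete, here is a minimal sketch of reading these three variables the way a wrapper script might. The `LeaseConfig` type and `load_lease_config` function are hypothetical, and two rules are assumptions of this sketch rather than documented behavior: which strings enable the force flag, and that the heartbeat must be shorter than the TTL.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseConfig:
    ttl_s: int
    heartbeat_s: int
    force_takeover: bool


def load_lease_config(env=None):
    """Read lease settings from an environment mapping, applying documented defaults."""
    env = os.environ if env is None else env
    ttl_s = int(env.get("ANGARABASE_LEASE_TTL_S", "30"))
    heartbeat_s = int(env.get("ANGARABASE_LEASE_HEARTBEAT_S", "10"))
    # Assumption: "1" and "true" enable the flag; the guide only shows "1".
    raw_force = env.get("ANGARABASE_FORCE_LEASE_TAKEOVER", "")
    force = raw_force.strip().lower() in ("1", "true")
    # Assumption: a heartbeat at or above the TTL would let the lease lapse
    # between renewals, so this sketch rejects that combination.
    if heartbeat_s >= ttl_s:
        raise ValueError("heartbeat interval must be shorter than the lease TTL")
    return LeaseConfig(ttl_s, heartbeat_s, force)
```

Passing a plain dict instead of `os.environ` keeps the function easy to check in a pre-flight script before the server is started.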
Limitations
WAL Backend Requirements
- Full recovery: Requires transaction_log.backend = "file_bin"
- Partial recovery: noop backend has no WAL replay (data loss possible)
Filesystem Considerations
- Local filesystems: Full support (ext4, xfs, btrfs, etc.)
- NFS/SAN: Instance lease works; verify file copy consistency
- Network partitions: May cause false lease expiration
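One way to "verify file copy consistency" after a manual copy is to compare content digests of the source and target trees while the instance is stopped. The sketch below is an illustration only; `tree_digests` and `diff_trees` are hypothetical helper names, and AngaraBase does not ship this script.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h = hashlib.sha256()
        with path.open("rb") as f:
            # Read in 1 MiB chunks so large data files are not loaded whole.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def diff_trees(source, target):
    """Return relative paths that are missing on one side or differ in content."""
    src, dst = tree_digests(source), tree_digests(target)
    return sorted(k for k in src.keys() | dst.keys() if src.get(k) != dst.get(k))
```

An empty result from `diff_trees` for both the data directory and the transaction log directory is a reasonable precondition for starting the server on the new host.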
Version Compatibility
- Same major.minor version required for host migration
- Page size must match (checked automatically)
- Configuration compatibility recommended
Troubleshooting
“Cannot start: database files are owned by another instance”
- Cause: Active lease held by another instance
- Solution: Wait for lease expiration or use force takeover (if safe)
“MVCC recovery failed”
- Cause: Corrupted transaction log files
- Solution: Check disk space, filesystem errors; may need backup restore
“VERSION decode failed”
- Cause: Corrupted version marker or incompatible format
- Solution: Restore from backup; check filesystem integrity
Performance After Recovery
- First queries may be slower (cold buffer pool)
- MVCC state rebuilds incrementally
- Statistics may need refresh (ANALYZE TABLE)
See Also
- Instance Lifecycle - Conceptual overview
- Backup and Restore - Host migration alternatives
- Configuration - Lease settings reference
Binary release upgrade
This scenario covers upgrade/downgrade within the current release line without changing the on-disk format.
Invariants
- The upgrade changes binaries and unit/scripts, but does not touch user data.
- The data directory /var/lib/angarabase is not deleted.
- The config /etc/angarabase/angarabase.conf is preserved (package noreplace policy).
Upgrade via DEB/RPM
DEB
sudo dpkg -i angarabase-server_<VERSION>_amd64.deb
sudo systemctl status angarabase --no-pager
RPM
sudo rpm -Uvh angarabase-server-<VERSION>-1.x86_64.rpm
sudo systemctl status angarabase --no-pager
Upgrade from tarball
sudo systemctl stop angarabase
sudo install -m 0755 bin/angarabase-server /usr/bin/angarabase-server
sudo install -m 0755 bin/angara-cli /usr/bin/angara-cli
sudo systemctl daemon-reload
sudo systemctl start angarabase
Post-upgrade check
angarabase-server --version
sudo systemctl is-active angarabase
Smoke-check:
- the server starts without panic;
- config and data dir are accessible;
- the basic SQL health check passes.
Rollback
Rollback is performed by installing the previous package/archive.
Important: rollback is allowed only between compatible storage-contract versions.
Next
After a successful upgrade and post-upgrade checklist:
- Crash recovery — what to do if the new version does not start.
- Backup and restore — take a control backup immediately after the upgrade.
- Verify release artifacts — make sure the next version is downloaded and verified the same way.
- Operator deep-dive: Upgrade and migration — full playbook with rollback scenarios.
Monitoring
Goal
Connect Prometheus and Grafana to AngaraBase metrics and get a working dashboard for ops/triage monitoring.
Prerequisites
- Running angarabase-server (see Quickstart)
- Access to the AngaraBase metrics listener (ANGARABASE_METRICS_ADDR)
- Running Prometheus instance
- Running Grafana instance
Step 1: enable the metrics endpoint
Set the listener address in the config file:
[ops]
metrics_addr = "127.0.0.1:9898"
Or use the environment variable override (takes precedence over config):
export ANGARABASE_METRICS_ADDR=127.0.0.1:9898
angarabase-server --config /etc/angarabase/angarabase.conf
Verify the endpoint responds:
curl -sS http://127.0.0.1:9898/metrics | rg '^angarabase_' -m 5
curl -sS http://127.0.0.1:9898/health/live
curl -sS http://127.0.0.1:9898/health/ready
curl -sS http://127.0.0.1:9898/health/startup
See Configuration — Ops knobs for the full [ops] section reference.
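The same prefix check can be scripted when ripgrep is not available. This is a minimal sketch that parses the Prometheus text exposition format loosely; `metric_names` is a hypothetical helper, and fetching the text from the endpoint is left to the caller.

```python
def metric_names(exposition_text, prefix="angarabase_"):
    """Return the sorted, de-duplicated metric names that start with prefix.

    Comment lines (# HELP / # TYPE) are skipped; on a sample line the name is
    everything before the first "{" or whitespace.
    """
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        head = line.split("{", 1)[0].split()
        if not head:
            continue  # malformed line, nothing before the label block
        if head[0].startswith(prefix):
            names.add(head[0])
    return sorted(names)
```

An empty result for the `angarabase_` prefix means the listener is answering but is not the AngaraBase metrics endpoint, or that the expected prefix is wrong.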
OpenTelemetry tracing
AngaraBase also exposes opt-in OTel-style span export for query lifecycle triage.
Minimal setup example:
export ANGARABASE_OTEL_ENABLED=1
export ANGARABASE_OTEL_EXPORTER=file
export ANGARABASE_OTEL_ENDPOINT=artifacts/otel/spans.jsonl
export ANGARABASE_OTEL_SAMPLE_RATE_PPM=1000000
Spans are emitted for bounded stage names (accept, auth, session, parse, plan, execute,
storage_io, commit/rollback) and are intended for diagnostics evidence, not for raw SQL capture.
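The sample rate above is expressed in parts per million, so 1000000 means every span is exported. The sketch below shows the conversion and one simple deterministic way to apply such a rate. The bucketing scheme in `should_sample` is an assumption for illustration; this guide does not describe how the server selects spans.

```python
PPM = 1_000_000  # parts per million


def sample_fraction(rate_ppm):
    """Convert a parts-per-million sample rate to a fraction in [0.0, 1.0]."""
    if not 0 <= rate_ppm <= PPM:
        raise ValueError("sample rate must be between 0 and 1,000,000 ppm")
    return rate_ppm / PPM


def should_sample(trace_key, rate_ppm):
    """Deterministic keep/drop decision for an integer trace key.

    Assumption: keys are bucketed by `trace_key % 1_000_000`, and a key is
    kept when its bucket number is below the configured rate.
    """
    sample_fraction(rate_ppm)  # validates the range
    return trace_key % PPM < rate_ppm
```

For example, a rate of 10000 ppm corresponds to a fraction of 0.01, i.e. roughly one span in a hundred.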
Step 2: configure the Prometheus scrape
Add a job to prometheus.yml:
scrape_configs:
  - job_name: angarabase
    scrape_interval: 15s
    metrics_path: /metrics
    static_configs:
      - targets: ["127.0.0.1:9898"]
Restart Prometheus and verify in the UI (/targets) that the angarabase target is UP.
Step 3: add the Prometheus data source in Grafana
- Open the Grafana UI.
- Navigate to Connections → Data sources.
- Add Prometheus.
- Set the URL to your Prometheus instance (e.g., http://127.0.0.1:9090).
- Click Save & test and confirm the connection succeeds.
Step 4: import the AngaraBase dashboard
- Go to Dashboards → New → Import.
- Upload the file tools/observability/grafana/angarabase-overview-v2.json.
- Select the Prometheus data source added in the previous step.
- Save the dashboard.
Alternative — download the JSON directly from the server:
curl -fsS http://127.0.0.1:9898/grafana/angarabase-overview.json -o angarabase-overview.json
Then import angarabase-overview.json in Grafana.
Expected result
After import, the dashboard should display the following panels:
| Panel | Key metric(s) |
|---|---|
| QPS | angarabase_queries_total |
| Query latency p50/p95/p99 | histogram buckets |
| Slow query count | angarabase_slow_query_total |
| Active connections | gauge |
| TPS / commit latency | angarabase_commits_total |
| Lock contention | lock wait counters |
| Buffer pool pressure | angarabase_buffer_pool_hit_total, angarabase_buffer_pool_miss_total |
| BRIN range efficiency | angara_brin_range_efficiency |
| Mutation policy rejections | angara_table_no_delete_rejected_*, angara_table_mutation_epoch |
| WAL lag / fsync latency | WAL counters |
| IO latency / checkpoint / GC pressure | IO counters |
If panels are empty, see Troubleshooting below.
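For the buffer pool panel, the two counters listed in the table can be turned into a hit ratio over an interval. This is a sketch under the assumption that "pressure" is read as hit ratio; the dashboard's actual expression is in the JSON file and may differ. `hit_ratio` is a hypothetical helper.

```python
def hit_ratio(hits_before, misses_before, hits_after, misses_after):
    """Buffer pool hit ratio over an interval, from two counter samples.

    Counters only increase, so the interval value is the difference between
    the samples. Returns None when there was no buffer pool traffic at all.
    """
    hits = hits_after - hits_before
    misses = misses_after - misses_before
    if hits < 0 or misses < 0:
        raise ValueError("counter went backwards (server restart?)")
    total = hits + misses
    return None if total == 0 else hits / total
```

With 90 new hits and 10 new misses between two scrapes, the ratio is 0.9.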
Troubleshooting
No data in panels
- Check http://<prometheus>/targets: the angarabase target must be UP.
- Ensure ANGARABASE_METRICS_ADDR matches the targets value in Prometheus.
- Verify metrics use the angarabase_ prefix:
curl -sS http://127.0.0.1:9898/metrics | rg '^angarabase_' -m 20
Data source test fails in Grafana
- Verify the data source URL and network reachability from Grafana to Prometheus.
- For Docker/K8s use the service DNS/hostname (not 127.0.0.1) when Grafana and Prometheus run in different containers.
Readiness probe is not ready
- GET /health/ready returns a reason in JSON; this is the primary triage signal.
- Collect a diagnostics bundle and attach summary.json:
tools/diagnostics_bundle/run.sh \
--root artifacts/diagnostics/<incident-id> \
--config <path> \
--metrics-url http://<metrics-host>/metrics
- For further investigation see Diagnostics and Support.
Links
- Dashboard JSON: tools/observability/grafana/angarabase-overview-v2.json
- Dashboard JSON from server: GET /grafana/angarabase-overview.json (on ANGARABASE_METRICS_ADDR / [ops].metrics_addr)
- Dashboard import guide: tools/observability/README.md
- Operator runbook (metrics + probes): Troubleshooting guide
- Metrics checklist (contract): Observability metrics checklist
- Configuration reference: Configuration
- Support flow: Support
Diagnostics
Goal
Analyze query performance, inspect server state, and triage issues using AngaraBase’s built-in diagnostic tools.
Prerequisites
- Running angarabase-server (see Quickstart)
- psql or another pgwire-compatible client
- For slow-query logging: env variables set before server start (see Configuration)
EXPLAIN variants
Basic query plan
EXPLAIN SELECT * FROM t WHERE id = 1;
Shows the planned execution path without running the query.
EXPLAIN ANALYZE (with actual timing)
EXPLAIN ANALYZE SELECT * FROM t WHERE id > 100;
Executes the query and reports actual row counts and timing alongside the plan.
For parallel join plans, EXPLAIN ANALYZE also reports best-effort join accounting counters:
- join_build_rows: rows processed by the join build phase,
- join_probe_rows: rows processed by the join probe phase.
EXPLAIN ANALYZE for DML (dry-run)
EXPLAIN ANALYZE over INSERT, UPDATE, or DELETE runs the statement inside an isolated dry-run
transaction that rolls back automatically — no data is modified.
EXPLAIN ANALYZE INSERT INTO t (id, v) VALUES (999, 42);
EXPLAIN ANALYZE UPDATE t SET v = v + 1 WHERE id = 1;
EXPLAIN ANALYZE DELETE FROM t WHERE id = 1;
Buffer statistics
EXPLAIN (BUFFERS) SELECT * FROM t WHERE id = 1;
Adds buffer-pool hit/miss counters to the plan output.
JSON output
EXPLAIN (FORMAT JSON) SELECT * FROM t WHERE id = 1;
Returns the plan as a JSON document — useful for programmatic analysis.
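For programmatic analysis, a schema-agnostic walk is a safe starting point because it does not depend on the exact plan layout. The sketch below collects the values stored under a given key anywhere in the document. The key names a caller passes in (for example a node-type key) are assumptions: this guide does not specify the JSON plan schema.

```python
import json


def iter_nodes(doc):
    """Yield every JSON object (dict) found anywhere in a parsed document."""
    stack = [doc]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            yield item
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)


def collect_values(plan_json, key):
    """Collect the values stored under `key` in any object of the plan."""
    return [n[key] for n in iter_nodes(json.loads(plan_json)) if key in n]
```

Inspect one real plan first to learn which keys the server emits, then pass those keys to `collect_values`.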
Runtime diagnostic views
angara_stat_activity
Shows currently executing queries and their wait state.
SELECT pid, state, wait_event_type, query
FROM angara_stat_activity;
wait_event_type values: Lock, IO, Net, CPU — provides coarse wait categorization for triage.
angara_stat_statements
Aggregated per-query statistics (call count, total time, rows, etc.).
SELECT query, calls, total_time, rows
FROM angara_stat_statements
ORDER BY total_time DESC
LIMIT 10;
Reset accumulated stats:
SELECT angara_stat_statements_reset();
The maximum number of tracked statements is controlled by ANGARABASE_STAT_STATEMENTS_MAX (LRU-bounded). See
Configuration — Diagnostics knobs.
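"LRU-bounded" means that once the limit is reached, the least recently used statement entry is dropped to make room for a new one. The following sketch illustrates that policy with an ordered map. It is not the server's code; the `StatementStats` class and its fields are hypothetical.

```python
from collections import OrderedDict


class StatementStats:
    """LRU-bounded per-statement counters (illustrative only)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # query text -> [calls, total_time_ms]

    def record(self, query, elapsed_ms):
        if self.max_entries <= 0:
            return  # tracking disabled; nothing is stored
        entry = self._entries.pop(query, None) or [0, 0.0]
        entry[0] += 1
        entry[1] += elapsed_ms
        self._entries[query] = entry  # re-insert as most recently used
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def top(self, n):
        """Return up to n (query, calls, total_time_ms) rows by total time."""
        rows = [(q, c, t) for q, (c, t) in self._entries.items()]
        return sorted(rows, key=lambda r: r[2], reverse=True)[:n]
```

A practical consequence of this policy is that rarely executed statements can disappear from the view on a busy server, and a limit of 0 leaves the view empty.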
angara_top_queries
Convenience wrapper returning the top-N queries by cumulative time:
SELECT * FROM angara_top_queries(10);
Slow-query log
Capture queries that exceed a duration threshold to the server log.
Configuration
Set before server start (env variables override config):
| Variable | Default | Description |
|---|---|---|
| ANGARABASE_LOG_MIN_DURATION_MS | -1 (disabled) | Threshold in milliseconds. Set to 0 to log all queries. |
| ANGARABASE_LOG_QUERY_TEXT | 0 | Include raw SQL text in the log entry (0 / 1). |
Example:
export ANGARABASE_LOG_MIN_DURATION_MS=500
export ANGARABASE_LOG_QUERY_TEXT=1
angarabase-server --config /path/to/angarabase.conf
Slow queries appear in the server log at logging.log_directory.
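The threshold semantics from the table (negative disables, 0 logs everything) can be summarized in a few lines. This is a sketch of the documented rule, not the server's code; `should_log_query` is a hypothetical name, and treating a duration exactly equal to the threshold as slow is an assumption.

```python
def should_log_query(duration_ms, min_duration_ms=-1):
    """Apply the documented threshold semantics for the slow-query log.

    A negative threshold disables logging, 0 logs every query, and a positive
    threshold logs queries that run at least that long (assumption: a duration
    equal to the threshold counts as slow).
    """
    if min_duration_ms < 0:
        return False
    return duration_ms >= min_duration_ms
```

With the example setting of 500, a 499 ms query is not logged and a 500 ms query is.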
Saturation / backpressure (operability)
When p99/p99.9 degrades under load, you usually want to distinguish:
- lock/contention waits,
- IO/scheduler saturation,
- admission/queue rejects vs “random” timeouts.
Practical entry points:
- angara_stat_activity.wait_event_type (coarse wait classification)
- operator runbook: angarabook/src/operations/troubleshooting.md
System introspection views
sys.health
Overall server health status:
SELECT * FROM sys.health;
sys.settings
Effective configuration (config + env overrides resolved):
SELECT * FROM sys.settings;
SELECT * FROM sys.settings WHERE name LIKE 'storage.%';
sys.identity
Instance identity metadata:
SELECT * FROM sys.identity;
Expected result
- EXPLAIN variants return plan text (or JSON) without errors.
- angara_stat_activity shows at least the current session.
- angara_stat_statements accumulates entries after queries are executed.
- sys.health returns {"status":"ready"} on a healthy instance.
Troubleshooting
| Symptom | Action |
|---|---|
| EXPLAIN ANALYZE modifies data | This should not happen: DML runs in a dry-run transaction. If data changes persist, file a bug via Support. |
| angara_stat_statements is empty | Ensure queries have been executed since the last reset. Check ANGARABASE_STAT_STATEMENTS_MAX is not 0. |
| Slow-query log has no entries | Verify ANGARABASE_LOG_MIN_DURATION_MS is set to a non-negative value and queries actually exceed the threshold. |
| sys.health shows not-ready | Check GET /health/ready for the JSON reason. Collect a diagnostics bundle (see Monitoring). |
For unresolved issues see Known issues and Support.
Links
- SQL compatibility reference: SQL overview
- Monitoring setup: Monitoring
- Configuration (env knobs): Configuration
- Diagnostics bundle tool: tools/diagnostics_bundle/run.sh
- Operator runbook: angarabook/src/operations/troubleshooting.md
- Incident runbook (10 minutes): Error debug runbook
- Support flow: Support
Structured Logging
Status: ✅ Active
Scope: Production operations, diagnostics, troubleshooting
Overview
AngaraBase uses structured logging for production observability and diagnostics. This replaces ad-hoc
println! calls with consistent, level-based log messages using the Rust log crate and env_logger
backend.
Key Benefits
- Consistent Format: All log messages follow structured key=value format
- Level Control: Runtime-configurable verbosity (error/warn/info/debug/trace)
- Production Ready: Suitable for log aggregation systems (ELK, Splunk, etc.)
- Performance: Low overhead when disabled, structured context when enabled
Log Levels
Level Policy
| Level | Usage | Examples |
|---|---|---|
| error | System failures, data corruption, unrecoverable errors | Database corruption, OOM, panic recovery |
| warn | Recoverable failures, degraded performance, misconfigurations | Failed heartbeat, audit sink errors, fallback modes |
| info | Operational events, lifecycle changes, important state transitions | Instance startup, lease acquisition, stats completion |
| debug | Detailed diagnostics, performance metrics, internal state | Micro-rescan progress, MVCC recovery details, buffer pool stats |
| trace | Very verbose debugging, hot path details | Individual tuple processing, lock acquisition |
Production Recommendations
- Production: info (default) - captures operational events at low overhead
- Troubleshooting: debug - detailed diagnostics for performance analysis
- Development: trace - full verbosity for debugging
Configuration
Static Configuration (angarabase.conf)
[logging]
log_level = "info"
Supported values: error, warn, info, debug, trace
Default: info
Restart required: No (dynamic setting)
Environment Variable Override
export ANGARABASE_LOG_LEVEL=debug
./angarabase-server --config /etc/angarabase/angarabase.conf
Environment variables take precedence over config file settings during bootstrap.
Runtime Configuration
Change log level without restart using SQL:
-- Check current setting
SELECT * FROM sys.settings WHERE name = 'logging.log_level';
-- Change to debug level
SELECT sys.set_setting('logging.log_level', 'debug');
-- Verify change
SELECT * FROM sys.settings WHERE name = 'logging.log_level';
Note: Runtime changes are volatile and reset on restart. Update angarabase.conf for persistent changes.
Log Format
Structure
All log messages follow this format:
2026-03-09T10:30:45.123Z INFO [angarabase_server::stats] stats_scheduler: auto_analyze_triggered table=users staleness_score=2.45
Components:
- Timestamp: ISO 8601 with millisecond precision
- Level: ERROR/WARN/INFO/DEBUG/TRACE
- Module: Rust module path (e.g., angarabase_server::stats)
- Context: Structured key=value pairs with operation context
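The components listed above map directly onto a small parser. This is a minimal sketch for ad-hoc analysis, assuming the exact layout of the sample line shown in this section; `parse_log_line` is a hypothetical helper, and values that contain spaces are not handled.

```python
import re

LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+"
    r"\[(?P<module>[^\]]+)\]\s+(?P<operation>\w+):\s*(?P<rest>.*)$"
)
PAIR_RE = re.compile(r"(\w+)=(\S+)")


def parse_log_line(line):
    """Split one log line into its fixed fields plus a dict of key=value context."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # line does not follow the structured format
    record = m.groupdict()
    rest = record.pop("rest")
    record["context"] = dict(PAIR_RE.findall(rest))
    # Anything before the first key=value pair is the event name, if present.
    record["event"] = PAIR_RE.split(rest, maxsplit=1)[0].strip()
    return record
```

Applied to the sample line, this yields the operation `stats_scheduler`, the event `auto_analyze_triggered`, and a context with `table` and `staleness_score`. All context values come back as strings.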
Context Format
Log messages use consistent key=value structured context:
// Good: structured context
log::info!("instance_lease: acquired holder={} expires_at={}", holder_id, expires_at);
// Good: operation context
log::debug!("micro_rescan: completed table={} scanned_rows={} duration={:?}", table, rows, duration);
// Good: error context
log::warn!("audit_sink: write_failed path={} err={}", path, error);
Context Guidelines:
- Use snake_case for keys
- Include operation prefix (e.g., micro_rescan:, instance_lease:)
- Provide enough context for troubleshooting
- Avoid sensitive data in logs (use IDs, not content)
Production Deployment
Log Aggregation
AngaraBase structured logs integrate well with log aggregation systems:
ELK Stack:
# logstash.conf
filter {
if [program] == "angarabase-server" {
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:module}\] %{GREEDYDATA:context}" }
}
kv {
source => "context"
field_split => " "
value_split => "="
}
}
}
Splunk:
[angarabase]
EXTRACT-level = (?<level>ERROR|WARN|INFO|DEBUG|TRACE)
EXTRACT-module = \[(?<module>[^\]]+)\]
EXTRACT-operation = (?<operation>\w+):
Log Rotation
Configure log rotation for production deployments:
# /etc/logrotate.d/angarabase
/var/log/angarabase/*.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
create 0644 angarabase angarabase
postrotate
systemctl reload angarabase
endscript
}
Monitoring
Key log patterns to monitor:
# Error rate monitoring
grep "ERROR" /var/log/angarabase/server.log | wc -l
# Instance lease issues
grep "instance_lease.*failed" /var/log/angarabase/server.log
# Performance degradation
grep "stats_scheduler.*io_backpressure" /var/log/angarabase/server.log
# MVCC recovery events
grep "mvcc_recovery:" /var/log/angarabase/server.log
Troubleshooting
Common Issues
High log volume at debug level:
-- Reduce to info level
SELECT sys.set_setting('logging.log_level', 'info');
Missing log output:
# Check RUST_LOG environment variable
echo $RUST_LOG
# Override with explicit filter
export RUST_LOG=angarabase_server=debug,rustls=warn
Log format parsing issues:
- Ensure timestamp parsing handles millisecond precision
- Context parsing should handle key=value pairs with spaces
- Module names contain :: separators
Performance Impact
Log level performance characteristics:
| Level | Overhead | Use Case |
|---|---|---|
| error/warn | Minimal | Always safe in production |
| info | Low | Default production level |
| debug | Moderate | Short-term troubleshooting |
| trace | High | Development/debugging only |
Recommendation: Use info for production, temporarily increase to debug for troubleshooting specific
issues.
Migration Notes
From println! to log::*!
Current implementation migrated all production println! calls to structured logging:
- 60+ calls across 11 files replaced
- Backward compatibility: No breaking changes to existing functionality
- Enhanced context: More structured information for operations
Future Enhancements (Phase 2)
Planned for next releases:
- Tracing integration: Distributed tracing with spans
- OpenTelemetry: OTLP export for APM systems
- Metrics correlation: Link logs with performance metrics
- Sampling: Adaptive sampling for high-volume environments
See Also
- RFC-2026-360: Structured Logging & Tracing v0
- Instance Lifecycle - Instance lease logging
- Crash Recovery - MVCC recovery logging
- Settings Management - Runtime configuration
AngaraBase Tracing Operations Guide
Audience: Database operators, SREs, performance engineers
Overview
AngaraBase includes structured tracing based on the tracing crate with OpenTelemetry support. This system
replaces legacy env_logger and provides end-to-end visibility for the query execution pipeline.
Key Benefits:
- End-to-end spans: accept → authenticate → parse → plan → execute → storage → response
- Each query has a unique trace_id
- JSON-structured logs for machine parsing
- Integration with Jaeger/Tempo through OpenTelemetry
- Automatic tracing-context propagation across async/sync boundaries
Configuration
Basic Setup
In angarabase.conf:
[diagnostics]
# Tracing output format: "text" (human-readable) or "json" (structured)
tracing_format = "json"
# Log level filtering (same as RUST_LOG)
log_level = "info"
Environment Variables
| Variable | Description | Example |
|---|---|---|
| RUST_LOG | Log level filter | RUST_LOG=angarabase=debug,tokio=info |
| ANGARABASE_OTLP_ENDPOINT | OpenTelemetry collector endpoint | http://jaeger:14268/api/traces |
| ANGARABASE_TRACE_SAMPLE_RATE | Sampling rate (0.0-1.0) | 0.1 (10% sampling) |
JSON vs Text Format
Text format (human-readable):
2026-03-12T10:30:45.123Z INFO angarabase::query::executor: query_start session_id=12345 sql="SELECT * FROM users"
2026-03-12T10:30:45.125Z DEBUG angarabase::query::planner: plan_created plan_hash=abc123 estimated_rows=1000
JSON format (machine-parseable):
{"timestamp":"2026-03-12T10:30:45.123Z","level":"INFO","target":"angarabase::query::executor","fields":{"session_id":12345,"sql":"SELECT * FROM users"},"span":{"name":"query_execution","trace_id":"abc123"}}
Tracing Architecture
Span Hierarchy
query_execution (root span)
├── parse (SQL → AST)
├── plan (AST → execution plan)
├── execute
│ ├── storage_io (page reads/writes)
│ ├── lock_acquisition
│ └── wal_flush
└── commit (autocommit finalization)
Span Propagation
AngaraBase automatically propagates tracing context through:
- Async boundaries: tokio::task::spawn_blocking calls
- Thread pool: Worker thread execution
- Network layer: AngaraNet io_uring/tokio integration
Implementation pattern:
let span = tracing::Span::current();
tokio::task::spawn_blocking(move || {
    let _enter = span.enter();
    // Sync code here inherits tracing context
    engine.execute(&query)
}).await
Operational Procedures
Enabling Tracing
- Development/Debug:
RUST_LOG=angarabase=trace ./angarabase-server
- Production (JSON logs):
# angarabase.conf
[diagnostics]
tracing_format = "json"
log_level = "info"
- OpenTelemetry Export:
export ANGARABASE_OTLP_ENDPOINT=http://jaeger:14268/api/traces
export ANGARABASE_TRACE_SAMPLE_RATE=0.05 # 5% sampling
./angarabase-server
Monitoring Query Performance
1. Slow Query Detection
Text logs:
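grep "query_execution.*duration_ms=" /var/log/angarabase.log | \
awk 'match($0, /duration_ms=[0-9]+/) && substr($0, RSTART + 12, RLENGTH - 12) + 0 > 1000' | \
head -10 # Queries > 1 second (assumes key=value text format: duration_ms=<n>)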
JSON logs:
jq 'select(.span.name == "query_execution" and .fields.duration_ms > 1000)' \
/var/log/angarabase.log
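When jq is not installed on the host, the same filter is easy to express in a short script. This sketch assumes the JSON record shape shown earlier in this guide (`span.name` and `fields.duration_ms`); `slow_spans` is a hypothetical helper name.

```python
import json


def slow_spans(lines, span_name="query_execution", threshold_ms=1000):
    """Yield parsed JSON log records for spans slower than the threshold."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip text-format or truncated lines
        if not isinstance(record, dict):
            continue
        span = record.get("span")
        fields = record.get("fields")
        name = span.get("name") if isinstance(span, dict) else None
        duration = fields.get("duration_ms") if isinstance(fields, dict) else None
        if (
            name == span_name
            and isinstance(duration, (int, float))
            and duration > threshold_ms
        ):
            yield record
```

Because the function accepts any iterable of lines, an open log file can be passed directly and is processed one line at a time.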
2. Per-Phase Timing Analysis
Look for spans with names: parse, plan, execute, commit
Example JSON query:
jq 'select(.span.name == "execute" and .fields.duration_ms > 500)' \
/var/log/angarabase.log | \
jq '.fields | {trace_id, duration_ms, estimated_rows}'
3. Lock Contention Detection
# Find queries waiting on locks
jq 'select(.fields.wait_event_type == "Lock")' /var/log/angarabase.log
Troubleshooting Common Issues
High Parse Time
# Find queries with slow parsing
jq 'select(.span.name == "parse" and .fields.duration_ms > 100)' \
/var/log/angarabase.log | jq '.fields.sql'
Common causes:
- Complex SQL with many JOINs
- Large IN clauses
- Deeply nested subqueries
High Plan Time
# Find queries with slow planning
jq 'select(.span.name == "plan" and .fields.duration_ms > 200)' \
/var/log/angarabase.log
Common causes:
- Missing statistics
- Complex join ordering
- Large number of tables
High Execute Time
# Correlate with storage I/O
jq 'select(.span.name == "storage_io" and .fields.duration_ms > 1000)' \
/var/log/angarabase.log
Common causes:
- Sequential scans
- I/O bottlenecks
- Lock contention
Integration with External Tools
Jaeger Integration
- Setup Jaeger:
docker run -d --name jaeger \
-p 14268:14268 -p 16686:16686 \
jaegertracing/all-in-one:latest
- Configure AngaraBase:
export ANGARABASE_OTLP_ENDPOINT=http://localhost:14268/api/traces
- View traces: http://localhost:16686
Grafana/Tempo Integration
# tempo.yaml
server:
  http_listen_port: 3200
distributor:
  receivers:
    otlp:
      protocols:
        http:
          endpoint: 0.0.0.0:4318
storage:
  trace:
    backend: local
    local:
      path: /tmp/tempo/traces
Log Aggregation (ELK Stack)
Logstash configuration:
input {
file {
path => "/var/log/angarabase.log"
codec => "json"
}
}
filter {
if [span][name] {
mutate {
add_field => { "trace_operation" => "%{[span][name]}" }
}
}
}
output {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "angarabase-traces-%{+YYYY.MM.dd}"
}
}
Performance Impact
Overhead Measurements
| Configuration | CPU Overhead | Latency Impact |
|---|---|---|
| Text format, INFO level | < 1% | < 5ms p99 |
| JSON format, INFO level | < 2% | < 8ms p99 |
| JSON + OTLP export, 5% sampling | < 3% | < 10ms p99 |
| JSON + OTLP export, 100% sampling | < 8% | < 20ms p99 |
Production Recommendations
- Use JSON format for structured log processing
- Set appropriate log levels: INFO for production, DEBUG for troubleshooting
- Configure OTLP sampling: 1-10% for high-traffic systems
- Monitor log volume: JSON logs are ~2x larger than text
Alerting and Monitoring
Key Metrics to Monitor
- Query Duration Distribution:
histogram_quantile(0.95,
rate(angarabase_query_duration_seconds_bucket[5m])
)
- Slow Query Count:
increase(angarabase_slow_queries_total[5m])
- Tracing Overhead:
rate(angarabase_tracing_events_total[5m])
Alerting Rules
groups:
  - name: angarabase_tracing
    rules:
      - alert: HighQueryLatency
        expr: |
          histogram_quantile(0.95,
            rate(angarabase_query_duration_seconds_bucket[5m])
          ) > 1.0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "AngaraBase query latency is high"
      - alert: TracingVolumeHigh
        expr: |
          rate(angarabase_tracing_events_total[5m]) > 1000
        for: 5m
        labels:
          severity: info
        annotations:
          summary: "AngaraBase tracing volume is high"
Security Considerations
Sensitive Data in Logs
⚠️ WARNING: Tracing logs may contain SQL queries with sensitive data.
Mitigation strategies:
- Query parameter redaction:
[diagnostics]
redact_query_params = true # Replace literals with ?
- Log rotation and retention:
# logrotate configuration
/var/log/angarabase.log {
daily
rotate 7
compress
delaycompress
missingok
notifempty
}
- Access control:
chmod 640 /var/log/angarabase.log
chown angarabase:angarabase-ops /var/log/angarabase.log
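To illustrate what "replace literals with ?" means for the redaction option above, here is a deliberately simple regex-based sketch. It is not how the server implements `redact_query_params`, and it does not handle comments, dollar-quoted strings, or other dialect details; `redact_literals` is a hypothetical name.

```python
import re

# String literals ('...' with '' as an escaped quote) or standalone numbers.
LITERAL_RE = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def redact_literals(sql):
    """Replace string and numeric literals in a SQL text with '?'."""
    return LITERAL_RE.sub("?", sql)
```

Digits that are part of an identifier, such as `t1`, are left alone because the numeric pattern requires a word boundary on both sides.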
OTLP Export Security
# Use TLS for OTLP export
export ANGARABASE_OTLP_ENDPOINT=https://jaeger.internal:14268/api/traces
export ANGARABASE_OTLP_TLS_CERT=/etc/ssl/certs/angarabase.crt
export ANGARABASE_OTLP_TLS_KEY=/etc/ssl/private/angarabase.key
Troubleshooting
Common Issues
1. No Tracing Output
Symptoms: No tracing spans in logs
Causes:
- RUST_LOG not set or too restrictive
- tracing_format misconfigured
- Tracing not enabled in binary
Solution:
# Check tracing is compiled in
./angarabase-server --version | grep tracing
# Enable debug tracing
RUST_LOG=angarabase=debug ./angarabase-server
2. Missing Span Context
Symptoms: Broken trace chains, missing parent-child relationships
Causes:
- Missing span.enter() in spawn_blocking calls
- Incorrect async context propagation
Solution: Check for proper span propagation pattern in code
3. High Log Volume
Symptoms: Disk space issues, performance degradation
Causes:
- Log level too verbose (TRACE in production)
- No log rotation
- High query volume
Solution:
# Reduce log level
export RUST_LOG=angarabase=info
# Enable log rotation
logrotate -f /etc/logrotate.d/angarabase
4. OTLP Export Failures
Symptoms: Traces not appearing in Jaeger/Tempo
Causes:
- Network connectivity issues
- Incorrect endpoint configuration
- Authentication failures
Solution:
# Test OTLP endpoint
curl -v $ANGARABASE_OTLP_ENDPOINT/health
# Check AngaraBase logs for export errors
grep "otlp.*error" /var/log/angarabase.log
Advanced Topics
Custom Span Attributes
AngaraBase automatically adds these attributes to spans:
| Attribute | Description | Example |
|---|---|---|
| session_id | Database session ID | 12345 |
| query_fingerprint | SQL query hash | abc123def |
| plan_hash | Execution plan hash | def456ghi |
| estimated_rows | Query planner estimate | 1000 |
| actual_rows | Actual rows returned | 987 |
| wait_event_type | Current wait type | Lock, IO, Net |
| wait_event | Specific wait event | RowLock, PageRead |
Correlation with System Metrics
CPU correlation:
# Find high-CPU queries
jq 'select(.span.name == "execute" and .fields.cpu_time_ms > 1000)' \
/var/log/angarabase.log
I/O correlation:
# Find I/O-heavy queries
jq 'select(.fields.io_reads > 1000 or .fields.io_writes > 100)' \
/var/log/angarabase.log
Custom Dashboards
Grafana query examples:
- Query throughput by operation:
sum(rate(angarabase_queries_total[5m])) by (operation)
- Average query duration by phase:
avg(angarabase_query_phase_duration_seconds) by (phase)
- Lock contention rate:
sum(rate(angarabase_wait_events_total{type="Lock"}[5m]))
References
- RFC-2026-360: Structured Logging and Tracing v0
- RFC-2026-461: Async Runtime Migration Strategy v0
- CODING_STANDARDS.md: Tracing guidelines (§9)
- ASYNC_GUIDELINES.md: Span propagation patterns
- Tracing crate docs: https://docs.rs/tracing/
- OpenTelemetry spec: https://opentelemetry.io/docs/
Contact: Database SRE team Last updated: 2026-03-12 by current implementation phase
AngaraBase USDT Probes Operations Guide
Audience: Database operators, performance engineers, eBPF developers
Overview
AngaraBase includes User Statically-Defined Tracing (USDT) probes — zero-overhead instrumentation points for integration with eBPF tools (bpftrace, bcc, pg_expecto).
Key Benefits:
- Zero overhead when not attached (NOP instructions)
- Real-time attachment without restarting the process
- Correlation of engine-level events with OS-level metrics
- Compatibility with existing eBPF toolchains
Probe Taxonomy: 12 probe points cover the query lifecycle and wait events.
Probe Architecture
Probe Categories
| Category | Probes | Description |
|---|---|---|
| Query Lifecycle | query_start, query_end | Full query execution timing |
| Query Phases | phase_start, phase_end | Per-phase timing (parse/plan/exec/commit) |
| Lock Events | lock_wait_start, lock_wait_end | Lock contention measurement |
| I/O Events | io_start, io_end | Storage I/O operations |
| Network Events | net_stall_start, net_stall_end | Network I/O stalls |
| Scheduler Events | sched_wait_start, sched_wait_end | Thread pool queue waits |
Probe Signature
All probes follow consistent ABI:
// Start probes (2 arguments)
probe angarabase:query_start(uint64_t session_id, uint64_t query_fingerprint);
probe angarabase:lock_wait_start(uint64_t session_id, uint8_t lock_type);
// End probes (3-4 arguments)
probe angarabase:query_end(uint64_t session_id, uint64_t query_fingerprint,
uint64_t duration_us, uint8_t outcome);
probe angarabase:lock_wait_end(uint64_t session_id, uint64_t wait_duration_us);
Probe Locations
Probes fire at same instrumentation points as WaitEventGuard RAII:
// Example: Lock wait instrumentation
let _guard = WaitEventGuard::enter(session_id, WaitEventType::Lock, WaitEvent::RowLock);
// probe_lock_wait_start! fires here
// Blocking operation (lock acquisition)
let result = lock_manager.acquire_lock(&key);
// probe_lock_wait_end! fires when _guard drops
Compilation and Feature Flags
Build Configuration
Default build (probes enabled):
cargo build --release
# USDT probes compiled in (zero overhead when not attached)
Minimal build (probes excluded):
cargo build --release --no-default-features
# No USDT probes, smaller binary
Verify probe compilation:
# Check probes are embedded
readelf -n ./target/release/angarabase-server | grep -A5 "stapsdt"
# List available probes
bpftrace -l 'usdt:./target/release/angarabase-server:angarabase:*'
Expected output:
usdt:./target/release/angarabase-server:angarabase:query_start
usdt:./target/release/angarabase-server:angarabase:query_end
usdt:./target/release/angarabase-server:angarabase:phase_start
usdt:./target/release/angarabase-server:angarabase:phase_end
usdt:./target/release/angarabase-server:angarabase:lock_wait_start
usdt:./target/release/angarabase-server:angarabase:lock_wait_end
usdt:./target/release/angarabase-server:angarabase:io_start
usdt:./target/release/angarabase-server:angarabase:io_end
usdt:./target/release/angarabase-server:angarabase:net_stall_start
usdt:./target/release/angarabase-server:angarabase:net_stall_end
usdt:./target/release/angarabase-server:angarabase:sched_wait_start
usdt:./target/release/angarabase-server:angarabase:sched_wait_end
bpftrace Examples
Query Tags
query_tag is a u64 hash (xxh64) propagated through all probe fires for a query/session.
You can set it per-session:
SET query_tag = 'demo-heavy-report';
or per-query:
/*+ angarabase:tag=demo-heavy-report */ SELECT * FROM orders;
The scripts below filter on “tag != 0” (tagged traffic only), so regular traffic keeps zero-overhead behavior.
# 1) Lock + IO latency by tagged query
bpftrace -e '
usdt:./angarabased:angarabase:lock_wait_end /arg2 != 0/ {
@lock_wait_us_by_tag[arg2] = hist(arg1);
}
usdt:./angarabased:angarabase:io_end /arg2 != 0/ {
@io_wait_us_by_tag[arg2] = hist(arg1);
}
'
# 2) Per-operator breakdown (fires only for tag != 0)
bpftrace -e '
usdt:./angarabased:angarabase:operator_end /arg1 != 0/ {
@op_duration_us[arg1, arg2] = hist(arg4);
@op_rows_out[arg1, arg2] = sum(arg3);
}
'
# 3) Compare p99 across tagged query cohorts (A/B)
bpftrace -e '
usdt:./angarabased:angarabase:query_end /arg4 != 0/ {
@query_total_us_by_tag[arg4] = hist(arg2);
@query_rows_by_tag[arg4] = sum(arg5);
}
'
Basic Query Monitoring
1. Query Throughput and Latency
#!/usr/bin/env bpftrace
// query_latency.bt - Monitor query execution time
usdt:./angarabase-server:angarabase:query_start {
@start[arg0] = nsecs; // session_id -> start_time
@queries++;
}
usdt:./angarabase-server:angarabase:query_end {
$duration_us = arg2; // duration from probe
$outcome = arg3; // 0=Ok, 1=Error
delete(@start[arg0]);
if ($outcome == 0) {
@latency_us = hist($duration_us);
@successful++;
} else {
@failed++;
}
}
interval:s:5 {
printf("\n=== Query Stats (5s) ===\n");
printf("Total queries: %d\n", @queries);
printf("Successful: %d\n", @successful);
printf("Failed: %d\n", @failed);
print(@latency_us);
clear(@queries); clear(@successful); clear(@failed);
}
END {
clear(@start); clear(@latency_us);
}
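The histogram printed by query_latency.bt can be post-processed to get a rough percentile. The sketch below is an illustration only: it assumes the usual text layout of bpftrace hist() output (power-of-two buckets with K/M/G suffixes), and the helper names parse_hist and percentile_upper_bound are hypothetical. Because hist() buckets are coarse, the result is the upper edge of the bucket containing the percentile, not an exact value.

```python
import re

# bpftrace prints power-of-two bucket edges, so 1K is assumed to mean 1024.
_SUFFIX = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}
_BUCKET = re.compile(r"^\[(\d+)([KMG]?)(?:,\s*(\d+)([KMG]?))?[\)\]]\s+(\d+)\s*\|")


def parse_hist(text):
    """Return (lower, upper, count) tuples parsed from hist() text output."""
    buckets = []
    for line in text.splitlines():
        m = _BUCKET.match(line.strip())
        if not m:
            continue  # map name, blank line, or unrelated output
        lo = int(m.group(1)) * _SUFFIX[m.group(2)]
        # Single-value buckets such as "[0]" print no upper edge.
        hi = int(m.group(3)) * _SUFFIX[m.group(4)] if m.group(3) else lo + 1
        buckets.append((lo, hi, int(m.group(5))))
    return buckets


def percentile_upper_bound(buckets, pct):
    """Upper edge of the bucket that contains the requested percentile."""
    total = sum(count for _, _, count in buckets)
    if total == 0:
        return None
    threshold = total * pct / 100.0
    seen = 0
    for _, hi, count in sorted(buckets):
        seen += count
        if seen >= threshold:
            return hi
    return None


SAMPLE = """\
@latency_us:
[128, 256)            10 |@@@@@@                                              |
[256, 512)            85 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)              4 |@@                                                  |
[1K, 2K)               1 |                                                    |
"""

buckets = parse_hist(SAMPLE)
print("p50 <=", percentile_upper_bound(buckets, 50), "us")
print("p99 <=", percentile_upper_bound(buckets, 99), "us")
```

The sample text is made up for the example; real output should be captured from the script's interval block.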
2. Per-Phase Timing Breakdown
#!/usr/bin/env bpftrace
// phase_timing.bt - Analyze query phase performance
usdt:./angarabase-server:angarabase:phase_start {
$session_id = arg0;
$phase = arg1; // 1=parse, 2=plan, 3=execute, 4=commit
@phase_start[$session_id, $phase] = nsecs;
}
usdt:./angarabase-server:angarabase:phase_end {
$session_id = arg0;
$phase = arg1;
$duration_us = arg2;
delete(@phase_start[$session_id, $phase]);
if ($phase == 1) {
@parse_time = hist($duration_us);
} else if ($phase == 2) {
@plan_time = hist($duration_us);
} else if ($phase == 3) {
@exec_time = hist($duration_us);
} else if ($phase == 4) {
@commit_time = hist($duration_us);
}
}
interval:s:10 {
printf("\n=== Phase Timing Distribution ===\n");
printf("Parse time (us):\n"); print(@parse_time);
printf("Plan time (us):\n"); print(@plan_time);
printf("Execute time (us):\n"); print(@exec_time);
printf("Commit time (us):\n"); print(@commit_time);
}
END {
clear(@phase_start);
clear(@parse_time); clear(@plan_time);
clear(@exec_time); clear(@commit_time);
}
Lock Contention Analysis
3. Lock Wait Time Distribution
#!/usr/bin/env bpftrace
// lock_contention.bt - Monitor lock contention
usdt:./angarabase-server:angarabase:lock_wait_start {
$session_id = arg0;
$lock_type = arg1; // Lock type enum
@lock_start[$session_id] = nsecs;
@lock_requests[$lock_type]++;
}
usdt:./angarabase-server:angarabase:lock_wait_end {
$session_id = arg0;
$wait_duration_us = arg1;
delete(@lock_start[$session_id]);
@lock_wait_time = hist($wait_duration_us);
if ($wait_duration_us > 1000) { // > 1ms
@slow_locks++;
printf("SLOW LOCK: session=%d wait=%d us\n", $session_id, $wait_duration_us);
}
}
interval:s:5 {
printf("\n=== Lock Contention Stats ===\n");
printf("Lock requests by type:\n"); print(@lock_requests);
printf("Lock wait time distribution (us):\n"); print(@lock_wait_time);
printf("Slow locks (>1ms): %d\n", @slow_locks);
clear(@lock_requests); clear(@slow_locks);
}
END {
clear(@lock_start); clear(@lock_wait_time);
}
I/O Performance Monitoring
4. Storage I/O Latency
#!/usr/bin/env bpftrace
// io_latency.bt - Monitor storage I/O performance
usdt:./angarabase-server:angarabase:io_start {
$session_id = arg0;
$op_type = arg1; // I/O operation type
$bytes = arg2; // Bytes to read/write
@io_start[$session_id] = nsecs;
@io_bytes += $bytes;
@io_ops++;
}
usdt:./angarabase-server:angarabase:io_end {
$session_id = arg0;
$latency_us = arg1;
delete(@io_start[$session_id]);
@io_latency = hist($latency_us);
if ($latency_us > 10000) { // > 10ms
@slow_io++;
printf("SLOW I/O: session=%d latency=%d us\n", $session_id, $latency_us);
}
}
interval:s:5 {
printf("\n=== I/O Performance Stats ===\n");
printf("Total I/O ops: %d\n", @io_ops);
printf("Total I/O bytes: %d\n", @io_bytes);
printf("I/O latency distribution (us):\n"); print(@io_latency);
printf("Slow I/O ops (>10ms): %d\n", @slow_io);
clear(@io_ops); clear(@io_bytes); clear(@slow_io);
}
END {
clear(@io_start); clear(@io_latency);
}
Advanced Use Cases
Correlation with System Events
5. Engine + Kernel Correlation
#!/usr/bin/env bpftrace
// engine_kernel_correlation.bt - Correlate AngaraBase events with kernel
#include <linux/sched.h>
// Track AngaraBase query starts, keyed by thread id: sched_switch reports
// per-thread ids, which differ from the process id in a multi-threaded server
usdt:./angarabase-server:angarabase:query_start {
@query_tids[tid] = 1;
@query_start_time[arg0] = nsecs; // session_id -> start_time
}
// Track context switches for AngaraBase threads
tracepoint:sched:sched_switch {
$prev_tid = args->prev_pid; // the kernel "pid" here is a thread id
// Count when an AngaraBase thread gets scheduled out
if (@query_tids[$prev_tid]) {
@context_switches[$prev_tid]++;
}
}
// Track page faults for AngaraBase threads
tracepoint:exceptions:page_fault_user {
if (@query_tids[tid]) {
@page_faults[tid]++;
}
}
// Counters are cumulative per thread, and the output assumes query_end
// fires on the same thread that fired query_start
usdt:./angarabase-server:angarabase:query_end {
$session_id = arg0;
$duration_us = arg2;
delete(@query_start_time[$session_id]);
printf("Query session=%d duration=%d us ctx_switches=%d page_faults=%d\n",
$session_id, $duration_us, @context_switches[tid], @page_faults[tid]);
}
END {
clear(@query_tids); clear(@query_start_time);
clear(@context_switches); clear(@page_faults);
}
Custom Aggregations
6. Top Slow Queries by Fingerprint
#!/usr/bin/env bpftrace
// top_slow_queries.bt - Track slowest queries by fingerprint
usdt:./angarabase-server:angarabase:query_end {
$session_id = arg0;
$query_fingerprint = arg1;
$duration_us = arg2;
$outcome = arg3;
if ($outcome == 0 && $duration_us > 1000) { // Successful queries > 1ms
@slow_queries[$query_fingerprint] = hist($duration_us);
@query_count[$query_fingerprint]++;
@total_time[$query_fingerprint] += $duration_us;
}
}
interval:s:30 {
printf("\n=== Top 10 Slowest Query Fingerprints ===\n");
print(@slow_queries, 10);
printf("\n=== Query Execution Counts ===\n");
print(@query_count, 10);
printf("\n=== Total Time by Fingerprint (us) ===\n");
print(@total_time, 10);
}
END {
clear(@slow_queries); clear(@query_count); clear(@total_time);
}
Production Deployment
Performance Impact
| Configuration | CPU Overhead | Memory Overhead |
|---|---|---|
| Probes compiled, not attached | 0% | +~50KB binary size |
| Light monitoring (query_start/end) | < 0.5% | +~1MB |
| Full monitoring (all 12 probes) | < 2% | +~5MB |
| Heavy aggregation (histograms) | < 5% | +~20MB |
Recommended Monitoring Strategy
Production Environment
# Light monitoring - query throughput and basic latency
bpftrace query_latency.bt
# Periodic deep-dive - run for 5 minutes every hour
0 * * * * timeout 300 bpftrace phase_timing.bt >> /var/log/angarabase-phase-timing.log
Troubleshooting Environment
# Full monitoring during incident investigation
bpftrace lock_contention.bt &
bpftrace io_latency.bt &
bpftrace engine_kernel_correlation.bt &
# Stop all monitoring
pkill bpftrace
Security Considerations
Required Capabilities
# BPF capabilities required for probe attachment (CAP_BPF and CAP_PERFMON, available since Linux 5.8)
sudo setcap cap_bpf,cap_perfmon+ep /usr/bin/bpftrace
# Alternative: run as root (not recommended)
sudo bpftrace query_latency.bt
Access Control
# Restrict bpftrace access to specific group
groupadd angarabase-monitoring
usermod -a -G angarabase-monitoring monitoring-user
# Set appropriate permissions
chown root:angarabase-monitoring /usr/bin/bpftrace
chmod 750 /usr/bin/bpftrace
Integration with Monitoring Systems
Prometheus Integration
#!/bin/bash
# angarabase_probe_exporter.sh - Export probe metrics to Prometheus
# (node-exporter textfile collector)
OUT=/var/lib/prometheus/node-exporter/angarabase-probes.prom
# Run bpftrace and parse output. Counters stay cumulative so that rate() works.
bpftrace -e '
usdt:./angarabase-server:angarabase:query_end {
@query_duration_sum += arg2;
@query_count++;
}
interval:s:15 {
printf("angarabase_query_duration_sum %d\n", @query_duration_sum);
printf("angarabase_query_count %d\n", @query_count);
printf("---\n");
}
' | while read -r line; do
# Collect one batch of metric lines, then publish the file atomically
case "$line" in
angarabase_*) echo "$line" >> "$OUT.$$" ;;
---) mv "$OUT.$$" "$OUT" ;;
esac
done
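The write-then-rename step of the exporter matters because the textfile collector may read the file at any moment. As a minimal sketch of the same idea, assuming a helper named write_prom_atomically (hypothetical, not part of AngaraBase), all metric lines are written to a temporary file in the target directory and then renamed over the final path in one step:

```python
import os
import tempfile


def write_prom_atomically(path, metrics):
    """Write every metric line to a temp file, then rename it over `path`."""
    directory = os.path.dirname(path) or "."
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(body)
        os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; the collector needs read access
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return body
```

A reader of the target path then sees either the previous complete file or the new complete file, never a partially written one.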
Grafana Dashboard
{
"dashboard": {
"title": "AngaraBase USDT Probes",
"panels": [
{
"title": "Query Rate",
"type": "stat",
"targets": [
{
"expr": "rate(angarabase_query_count[5m])",
"legendFormat": "Queries/sec"
}
]
},
{
"title": "Average Query Latency",
"type": "stat",
"targets": [
{
"expr": "rate(angarabase_query_duration_sum[5m]) / rate(angarabase_query_count[5m])",
"legendFormat": "Avg latency (us)"
}
]
}
]
}
}
Troubleshooting
Common Issues
1. No Probes Listed
Symptoms:
$ bpftrace -l 'usdt:./angarabase-server:angarabase:*'
# No output
Causes:
- Binary compiled with --no-default-features
- Wrong binary path
- Probes stripped during build
Solution:
# Verify binary has probes
readelf -n ./angarabase-server | grep stapsdt
# Check build configuration
./angarabase-server --version | grep usdt
# Rebuild with probes
cargo build --release --features usdt
2. Permission Denied
Symptoms:
$ bpftrace query_latency.bt
ERROR: failed to attach probe
Causes:
- Missing BPF capabilities
- SELinux/AppArmor restrictions
- Kernel version incompatibility
Solution:
# Add BPF capabilities (CAP_BPF and CAP_PERFMON, available since Linux 5.8)
sudo setcap cap_bpf,cap_perfmon+ep /usr/bin/bpftrace
# Check kernel version (bpftrace requires 4.9+)
uname -r
# Temporary workaround
sudo bpftrace query_latency.bt
3. High Overhead
Symptoms:
- CPU usage increase > 5%
- Query latency degradation
- Memory usage growth
Causes:
- Too many active probes
- Heavy aggregation (histograms)
- High-frequency events
Solution:
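// Reduce monitoring frequency
interval:s:60 { ... } // Instead of interval:s:5
// Use sampling: filter on session_id (arg0), because pid is identical
// for every probe fired by a single server process
usdt:./angarabase-server:angarabase:query_start / arg0 % 10 == 0 / {
// Monitor roughly 10% of sessions
}
// Bound the histogram size with a linear histogram
@latency = lhist($duration_us, 0, 10000, 100); // min 0, max 10000, step 100 (about 100 buckets)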
4. Missing Events
Symptoms:
- Expected probes don’t fire
- Incomplete event sequences
- Zero counts in monitoring
Causes:
- Process not using instrumented code paths
- Probe attachment race condition
- Incorrect probe signatures
Solution:
# Verify process is running instrumented code
ps aux | grep angarabase-server
# List the probes visible in the running process
bpftrace -p "$(pgrep -o angarabase-server)" -l 'usdt:*:angarabase:*'
# Debug probe firing
bpftrace -e 'usdt:./angarabase-server:angarabase:* { printf("probe=%s\n", probe); }'
Best Practices
Probe Development
- Start Simple: Begin with basic query_start/query_end monitoring
- Add Gradually: Introduce additional probes based on specific needs
- Test Impact: Measure overhead before deploying to production
- Use Sampling: For high-traffic systems, sample events (e.g., 1 in 100)
Script Organization
# Directory structure
/opt/angarabase/monitoring/
├── scripts/
│ ├── query_latency.bt
│ ├── lock_contention.bt
│ └── io_latency.bt
├── dashboards/
│ └── grafana-angarabase-probes.json
└── exporters/
└── prometheus-exporter.sh
Monitoring Strategy
- Always-On: Basic query throughput and latency
- Periodic: Detailed phase timing (every hour for 5 minutes)
- On-Demand: Lock contention and I/O analysis during incidents
- Correlation: Engine + kernel events for deep troubleshooting
References
- RFC-2026-369: USDT eBPF Observability Probes v0
- bpftrace documentation: https://github.com/iovisor/bpftrace
- BCC tools: https://github.com/iovisor/bcc
- USDT specification: https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
- Linux eBPF: https://ebpf.io/
Contact: Database SRE team
Last updated: 2026-03-12 (current implementation phase)
GC auto-tuning
Status: Production-ready
Contract: RFC-2026-034 §12 (AngaraGC Phase 4 auto-tuning)
Overview
AngaraGC auto-tuning automatically adjusts Garbage Collection budgets based on workload telemetry:
- Bloat ratio: how much dead data accumulates
- Epoch lag: how far the oldest active transaction trails the committed state
- Cycle latency: how long each GC cycle takes
The controller uses a PID-like feedback loop to balance aggressive cleanup (high budget) against latency impact (low budget).
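To make the trade-off concrete, here is a deliberately simplified sketch of one control step. It is a threshold-and-step controller, not a PID loop and not the AngaraBase implementation. The names Targets and next_budget, the 25% step, and the rule that a latency spike takes priority over bloat or lag are all assumptions made for illustration; only the default targets and bounds are taken from the tuning parameters documented in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    bloat_ratio_pct: float = 20.0  # ANGARABASE_GC_TARGET_BLOAT_RATIO
    epoch_lag: int = 1000          # ANGARABASE_GC_TARGET_LAG
    latency_ms: float = 100.0      # ANGARABASE_GC_LATENCY_THRESHOLD
    budget_min: int = 100          # ANGARABASE_GC_BUDGET_MIN
    budget_max: int = 100_000      # ANGARABASE_GC_BUDGET_MAX


def next_budget(budget, bloat_pct, lag, cycle_ms, targets=Targets(), step=0.25):
    """Return (new_budget, decision) for one hypothetical control step."""
    if cycle_ms > targets.latency_ms:
        # Assumed priority: back off first when a GC cycle is too slow.
        decision, proposed = "decrease", budget * (1 - step)
    elif bloat_pct > targets.bloat_ratio_pct or lag > targets.epoch_lag:
        decision, proposed = "increase", budget * (1 + step)
    else:
        decision, proposed = "hold", budget
    # Bounds clamp every adjustment, so the budget never leaves [min, max].
    clamped = max(targets.budget_min, min(targets.budget_max, int(proposed)))
    return clamped, decision


print(next_budget(1000, bloat_pct=35, lag=500, cycle_ms=20))
print(next_budget(1000, bloat_pct=10, lag=100, cycle_ms=150))
```

The clamp on the last line of next_budget is the part that corresponds to the bounds guarantee: whatever the decision, the returned budget stays inside the configured range.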
Configuration
Enable auto-tuning
export ANGARABASE_GC_BACKGROUND=1 # Enable GC background worker
export ANGARABASE_GC_AUTO_TUNING=1 # Enable auto-tuning controller
Or via config file:
[gc]
auto_tuning_enabled = true
Tuning parameters (optional)
Default values are production-safe. Override only if you understand the trade-offs:
# Target bloat ratio (default: 20%)
export ANGARABASE_GC_TARGET_BLOAT_RATIO=20
# Target epoch lag (default: 1000 epochs)
export ANGARABASE_GC_TARGET_LAG=1000
# Latency spike threshold (default: 100ms)
export ANGARABASE_GC_LATENCY_THRESHOLD=100
# Budget bounds (default: 100..100000 tuples/cycle)
export ANGARABASE_GC_BUDGET_MIN=100
export ANGARABASE_GC_BUDGET_MAX=100000
Monitoring
View current auto-tuning state
SELECT * FROM sys.gc_tuning_status;
Returns:
- current_budget: current GC budget (tuples/cycle)
- current_sleep_ms: current sleep interval between cycles
- tuning_decision: last decision (increase, decrease, hold)
Prometheus metrics
# Controller state
angarabase_gc_tuning_budget_tuples_per_cycle
angarabase_gc_tuning_sleep_ms
angarabase_gc_tuning_bloat_ratio_percent
angarabase_gc_tuning_min_active_epoch_lag
angarabase_gc_tuning_cycle_duration_ms_last
# Decision counters
angarabase_gc_tuning_decision_total_increase
angarabase_gc_tuning_decision_total_decrease
angarabase_gc_tuning_decision_total_hold
Operational notes
When to enable
- Production workloads with variable load: auto-tuning adapts to workload changes
- High bloat risk: auto-tuning increases budget when bloat accumulates
- Latency-sensitive workloads: auto-tuning backs off when GC causes latency spikes
When NOT to enable
- Predictable workloads with stable budget: manual tuning may be simpler
- Debugging GC issues: disable auto-tuning to isolate issues with fixed budget
Bounds safety
- Controller never violates min_budget or max_budget
- Bounds clamp adaptive adjustments, ensuring predictable worst-case behavior
- Default bounds are conservative; adjust only after profiling
Troubleshooting
Auto-tuning oscillates
Symptom: Decision alternates increase/decrease rapidly.
Fix: Increase latency threshold or bloat target to reduce sensitivity:
export ANGARABASE_GC_LATENCY_THRESHOLD=200 # more tolerance for latency
export ANGARABASE_GC_TARGET_BLOAT_RATIO=30 # more tolerance for bloat
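Before changing thresholds, it can help to quantify how often the controller actually reverses direction. The sketch below assumes you have sampled the tuning_decision column of sys.gc_tuning_status into a list over time; the function name flip_rate and the idea of ignoring hold decisions are illustrative choices, not part of AngaraBase.

```python
def flip_rate(decisions):
    """Fraction of consecutive non-hold decisions that reverse direction."""
    moves = [d for d in decisions if d in ("increase", "decrease")]
    if len(moves) < 2:
        return 0.0
    flips = sum(1 for prev, cur in zip(moves, moves[1:]) if prev != cur)
    return flips / (len(moves) - 1)


samples = ["increase", "decrease", "increase", "hold", "decrease"]
print(f"flip rate: {flip_rate(samples):.2f}")
```

A value close to 1.0 means nearly every adjustment undoes the previous one, which matches the oscillation symptom; a value close to 0.0 means the controller is moving steadily in one direction.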
Budget stuck at min/max
Symptom: current_budget reaches bound and stays there.
Diagnosis:
- Stuck at max: Bloat or lag persistently above target → increase max_budget or investigate workload
- Stuck at min: Latency spikes persist → investigate GC contention or reduce GC work
Reference
- RFC-2026-034 §12: AngaraGC auto-tuning design
- Implementation scope
  - crates/angarabased/src/gc_worker.rs: Controller implementation
  - crates/angarabase/src/virtual_catalog/mod.rs: sys.gc_tuning_status view
Next
After angara_gc.auto_tuning = on and GC metrics have stabilized:
- Diagnostics: how to read sys.health and GC metrics under load.
- Monitoring: which alerts to create for the “GC backlog is growing” area.
- Transactions and MVCC: conceptually, what AngaraGC cleans up.
Incident runbook: debug errors in 10 minutes
Goal
Quickly localize the cause of production degradation/errors: understand what is breaking (logs/tracing) and where the bottleneck is (USDT wait events + eBPF).
Prerequisites
- angarabase-server is running.
- Access to server logs.
- Access to an SQL client (psql/pgwire).
- For USDT/eBPF: Linux with eBPF support, bpftrace installed, and permissions (CAP_BPF/CAP_PERFMON or root).
Fast path (10 minutes)
Step 1 (0-2 min): record the symptom and affected sessions
Check active sessions and wait state:
SELECT pid, state, wait_event_type, wait_event, query
FROM angara_stat_activity
ORDER BY pid;
If the problem is widespread, save the snapshot to the incident ticket/chat.
Step 2 (2-4 min): understand exactly what is failing or degrading
Find ERROR/WARN entries in logs around the incident time:
rg "ERROR|WARN|panic|timeout|failed|degraded" /var/log/angarabase.log
If tracing is enabled (JSON), quickly filter long-running operations:
jq 'select(.fields.duration_ms != null and .fields.duration_ms > 1000)' /var/log/angarabase.log
Interpretation:
- many Lock/timeout entries -> contention is likely;
- many io/fsync/wal entries -> a storage bottleneck is likely;
- network errors -> the Net path is likely.
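If jq is not available on the host, the same duration filter can be approximated in a few lines. This sketch assumes one JSON object per log line with a fields.duration_ms number, as implied by the jq expression in this step; the function name slow_operations and the sample records are hypothetical.

```python
import json


def slow_operations(lines, threshold_ms=1000):
    """Yield parsed log records whose fields.duration_ms exceeds threshold_ms."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # plain-text lines mixed into the log are skipped
        if not isinstance(record, dict):
            continue
        fields = record.get("fields")
        duration = fields.get("duration_ms") if isinstance(fields, dict) else None
        if isinstance(duration, (int, float)) and duration > threshold_ms:
            yield record


sample_log = [
    '{"level": "INFO", "fields": {"message": "commit", "duration_ms": 12}}',
    '{"level": "WARN", "fields": {"message": "fsync", "duration_ms": 2300}}',
    "not json at all",
    '{"level": "INFO", "fields": {"message": "no duration"}}',
]
for rec in slow_operations(sample_log):
    print(rec["fields"]["message"], rec["fields"]["duration_ms"])
```

In practice the lines argument would be an open file handle on the server log; unlike jq, this version skips malformed lines instead of stopping on them.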
Step 3 (4-7 min): capture runtime evidence via USDT probes
Verify that probes are available:
bpftrace -l 'usdt:./angarabase-server:angarabase:*'
Quick lock-wait histogram:
bpftrace -e 'usdt:./angarabase-server:angarabase:lock_wait_end { @lock_us = hist(arg1); } interval:s:10 { print(@lock_us); clear(@lock_us); }'
Quick I/O latency histogram:
bpftrace -e 'usdt:./angarabase-server:angarabase:io_end { @io_us = hist(arg1); } interval:s:10 { print(@io_us); clear(@io_us); }'
Quick query-latency slice:
bpftrace -e 'usdt:./angarabase-server:angarabase:query_end { @q_us = hist(arg2); } interval:s:10 { print(@q_us); clear(@q_us); }'
Step 4 (7-9 min): correlation and root-cause hypothesis
Correlate:
- angara_stat_activity.wait_event_type
- logs/traces
- USDT histograms
Triage rule:
- high lock_wait_end + wait_event_type=Lock -> contention/serialization;
- high io_end + storage warnings -> disk/flush path;
- normal lock/io but high query_end -> planner/execute CPU path.
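The triage rule can be written down as a small decision function, which is convenient when the same check is repeated across incidents. This is only a sketch: the function name classify_incident is hypothetical, and the default thresholds are assumptions (1 ms for locks and 10 ms for I/O mirror the "slow" cut-offs used in the USDT example scripts; the 100 ms query threshold is an arbitrary placeholder to tune per workload).

```python
def classify_incident(lock_p99_us, io_p99_us, query_p99_us,
                      wait_event_type=None, storage_warnings=False,
                      lock_slow_us=1_000, io_slow_us=10_000,
                      query_slow_us=100_000):
    """Map histogram summaries and log signals to a first root-cause hypothesis."""
    lock_high = lock_p99_us > lock_slow_us
    io_high = io_p99_us > io_slow_us
    query_high = query_p99_us > query_slow_us

    if lock_high and wait_event_type == "Lock":
        return "contention/serialization"
    if io_high and storage_warnings:
        return "disk/flush path"
    if query_high and not lock_high and not io_high:
        return "planner/execute CPU path"
    return "unclassified"


print(classify_incident(5_000, 200, 50_000, wait_event_type="Lock"))
print(classify_incident(100, 200, 500_000))
```

An "unclassified" result is a signal to collect more evidence (longer USDT slices, higher log level) rather than a diagnosis in itself.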
Step 5 (9-10 min): record evidence and next action
Minimum report contents:
- time window;
- top symptoms from logs;
- 1-2 commands and their output (histograms);
- preliminary root cause;
- immediate mitigation (for example, reduce load, limit heavy queries, increase monitoring).
Expected result
Within 10 minutes you have:
- a reproducible evidence pack;
- initial incident classification (Lock/IO/Net/CPU/Scheduler);
- a clear next step for mitigation/fix.
Troubleshooting
| Symptom | Action |
|---|---|
| bpftrace does not see probes | Check the binary and stapsdt section: `readelf -n ./angarabase-server` |
| failed to attach probe | Run as root or grant capability to bpftrace |
| Logs exist but root cause is unclear | Increase the logging level during the incident and repeat the USDT slice for 60-120 seconds |
| phase_* probes correlate poorly across sessions | Use query_*, lock_*, io_* as the primary signal; treat phase_* as auxiliary |
Links
- Structured logging: Structured logging
- Tracing: Tracing
- USDT probes: USDT probes
- General diagnostics: Diagnostics
GOST crypto profiles setup
Status: Production-ready, opt-in
Contract: Provider-based GOST support (OQ-2026-022 Option A)
Overview
AngaraBase supports GOST cipher suites for TLS 1.2 in regulated environments (GOST 28147-89, GOST R 34.10-2012).
Key properties:
- Opt-in: GOST disabled by default
- Fail-closed: Server refuses to start if GOST is enabled but crypto provider is incompatible
- Provider-based: Uses OpenSSL GOST engine or compatible crypto provider (not bundled)
Prerequisites
1. Install GOST crypto provider
Option A: OpenSSL with GOST engine (Linux)
# Install OpenSSL with GOST support
sudo apt-get install openssl libssl-dev libengines-gost
# Verify GOST engine is available
openssl engine gost -c
Option B: Custom provider
Implement GostCryptoProvider trait in crates/angarabase/src/security/crypto.rs.
2. Generate TLS certificates
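# Generate server certificate with GOST algorithm
# (the GOST engine typically requires an explicit parameter set; paramset A is one valid choice)
openssl req -new -x509 -days 365 \
-newkey gost2012_256 \
-pkeyopt paramset:A \
-keyout server.key \
-out server.crt \
-nodes \
-subj "/CN=localhost"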
Configuration
Enable GOST cipher suites
export ANGARABASE_TLS_ENABLE=1
export ANGARABASE_TLS_CERT_PATH=/path/to/server.crt
export ANGARABASE_TLS_KEY_PATH=/path/to/server.key
export ANGARABASE_TLS_GOST_ENABLED=1
export ANGARABASE_TLS_GOST_CIPHER_SUITES="GOST2012-GOST8912-GOST8912"
Or via config file:
[tls]
enable = true
cert_path = "/etc/angarabase/tls/server.crt"
key_path = "/etc/angarabase/tls/server.key"
gost_enabled = true
gost_cipher_suites = "GOST2012-GOST8912-GOST8912"
Verify configuration
Check effective settings:
SELECT name, value FROM sys.settings WHERE name LIKE 'tls.%';
Expected output:
tls.enable | true
tls.cert_path | /etc/angarabase/tls/server.crt
tls.key_path | /etc/angarabase/tls/server.key
tls.gost_enabled | true
tls.gost_cipher_suites | GOST2012-GOST8912-GOST8912
Client connection
psql with GOST
Requires psql built with OpenSSL GOST support:
psql "host=localhost port=5152 dbname=mydb sslmode=require"
Verify cipher suite
From client:
SHOW ssl_cipher;
Should return GOST cipher suite name.
Security notes
Fail-closed behavior
- If ANGARABASE_TLS_GOST_ENABLED=1 but GOST provider is unavailable, server refuses to start (no silent fallback to standard ciphers)
- Invalid cipher suites are rejected at startup (fail-closed validation)
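The fail-closed rule is easiest to see as a startup check that raises instead of falling back. The sketch below is a language-neutral illustration written in Python, not the actual Rust code in crates/angarabase/src/security/crypto.rs. StartupError, select_cipher_suites, and the colon-separated suite list are assumptions made for the example.

```python
class StartupError(Exception):
    """Raised when the TLS configuration cannot be honored as written."""


def select_cipher_suites(gost_enabled, requested, provider_suites, standard_suites):
    """Return the cipher suites to offer, or raise instead of falling back."""
    if not gost_enabled:
        return list(standard_suites)
    if not provider_suites:
        # Fail closed: never silently downgrade to the standard ciphers.
        raise StartupError("GOST provider not available")
    wanted = [name for name in requested.split(":") if name]
    unknown = [name for name in wanted if name not in provider_suites]
    if not wanted or unknown:
        raise StartupError(f"invalid GOST cipher suites: {unknown or '(empty)'}")
    return wanted


print(select_cipher_suites(
    True,
    "GOST2012-GOST8912-GOST8912",
    provider_suites=["GOST2012-GOST8912-GOST8912"],
    standard_suites=["TLS_AES_128_GCM_SHA256"],
))
```

The important property is that every failure path raises: there is no branch in which GOST is enabled and the function returns the standard suites.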
Secrets handling
All tls.* knobs are marked security-sensitive in settings registry:
- tls.gost_cipher_suites is sensitive (policy: all tls.* knobs are security-sensitive per SECURITY_GOVERNANCE.md)
- Private key (tls.key_path) is never exposed in sys.settings or logs
Troubleshooting
Server fails to start with “GOST provider not available”
Cause: GOST crypto provider is not installed or OpenSSL GOST engine is missing.
Fix: Install OpenSSL GOST support (see Prerequisites).
Invalid cipher suites error
Cause: tls.gost_cipher_suites contains invalid cipher names.
Fix: Use valid GOST cipher suite names from OpenSSL GOST engine documentation:
# List available GOST ciphers
openssl ciphers -v | grep GOST
Client connection fails with “no shared cipher”
Cause: Client does not support GOST cipher suites.
Fix: Use psql/libpq built with OpenSSL GOST support.
Limitations (v0)
- Provider availability: GOST support requires compatible crypto provider (not bundled with AngaraBase)
- Platform support: Linux only (OpenSSL GOST engine availability)
- Cipher suite coverage: TLS 1.2 + GOST only (TLS 1.3 GOST deferred to future release)
Reference
- OQ-2026-022: GOST crypto support decision
- Implementation scope
  - crates/angarabase/src/security/crypto.rs: GOST provider abstraction
Next
After the GOST provider is installed and angarabase starts with crypto.profile = gost:
- GOST compatibility — which scenarios (TLS, TDE, audit-sink) are already covered and which are on the roadmap.
- Encryption (TDE + client-side) — the general encryption contract that embeds the GOST profile.
- Hardening runbook — the final check before putting the instance into production.
Verify release artifacts
Goal
Confirm three properties before installation:
- the artifact is authentic (GPG signature);
- the artifact is not corrupted (SHA256);
- the artifact matches the expected version.
Step-by-step verify path
VERSION="0.6.3"
ART="angarabase-server-v${VERSION}-x86_64-unknown-linux-gnu.tar.gz"
BASE_URL="https://s3.angarabase.io/stable/v${VERSION}"
# 1) Download
curl -fsSL "${BASE_URL}/${ART}" -o "${ART}"
curl -fsSL "${BASE_URL}/${ART}.asc" -o "${ART}.asc"
curl -fsSL "${BASE_URL}/SHA256SUMS" -o SHA256SUMS
# 2) Import the release key (once)
gpg --keyserver hkps://keys.openpgp.org --recv-keys <KEY_FINGERPRINT>
# alternative:
# curl -fsSL https://angarabase.io/release-key.gpg | gpg --import
# 3) Verify signature
gpg --verify "${ART}.asc" "${ART}"
# 4) Verify checksum
sha256sum --check --ignore-missing SHA256SUMS
Success criteria
- gpg --verify returns Good signature.
- sha256sum --check returns OK.
If any step fails, do not use the artifact.
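On hosts without sha256sum, the checksum step can be reproduced with the Python standard library. This sketch covers only the integrity check, not the GPG signature, and it assumes the usual SHA256SUMS layout of one "<hex digest>  <file name>" pair per line. The function names are hypothetical. Unlike --ignore-missing, a file that has no entry in SHA256SUMS is treated as a failure here.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large archives do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(sums_text, filename):
    """Look up `filename` in SHA256SUMS content; return None when it is absent."""
    for line in sums_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        # sha256sum marks binary-mode entries with a leading '*'.
        if parts[1].strip().lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify_artifact(path, sums_text):
    """True only when the file is listed and its digest matches."""
    expected = expected_digest(sums_text, os.path.basename(path))
    return expected is not None and sha256_of(path) == expected
```

Typical use is verify_artifact(ART, open("SHA256SUMS").read()), with the result treated exactly like a failed sha256sum --check when it is False.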
Next
After the signature and SHA-256 match:
- Installation — unpack the verified archive or install the package.
- GOST crypto setup — if GOST cryptography is required during verification.
- Support and bug-report artifact collection — what to do if the hash does not match.
Operations Overview
The canonical AngaraBase operations corpus for DBAs and SREs. It collects runbooks, baselines, and checklists that are kept in sync with the code and release trains.
If you are just starting, begin with the user guides in Operations (How-to): they are shorter and work well as an entry point. This section is for operator-level deep dives.
How to Navigate
| Task | Where to go |
|---|---|
| Bring up an instance from scratch | Installation → Configuration |
| Start a container / minimal k8s deployment | Container deployment quickstart |
| Move to production | Operational policies baseline → Hardening → Security operations baseline |
| Configure monitoring and alerts | Observability metrics checklist → Parallel runtime observability |
| Investigate a production issue | Troubleshooting guide → Diagnostics bundle runbook → Error debug runbook |
| Read query plans | How to read query plans → Performance tuning guide |
| Optimize performance | Performance tuning guide → How to read query plans → Parallel runtime observability |
| Backup / restore | Backup and restore (operator-level) → Disaster recovery playbook |
| Upgrade the version | Upgrade and migration |
| Connect an unfamiliar client / ORM | Client compatibility baseline |
| Prepare an evidence bundle for a bug report | Diagnostics bundle runbook → Support |
Canonical Operations Pages
Lifecycle
- Upgrade and migration — pre-flight, rolling steps, verification.
- MVCC and GC operator minimum — AngaraGC behavior and operator knobs.
- Checkpoint operations — managing the checkpoint process.
Reliability
- Container deployment quickstart — image-first startup, cgroup probe, minimal k8s smoke.
- Backup and restore — operator-level baseline (cold + online/PITR).
- Disaster recovery playbook — DR scenarios, host migration.
- Replication v2 operations guide — AngaraReplica v2.
Performance
- Performance tuning guide: workload-driven knobs, what to measure first.
- Statistics and ANALYZE: statistics collection and persistence.
- How to read query plans: EXPLAIN, operators, diagnostics, cache/replan signals.
- Parallel runtime observability runbook: DOP caps, partitioned join.
- jemalloc heap profiling runbook: memory diagnostics.
Observability
- Observability metrics checklist — required minimum metrics.
- Diagnostics bundle runbook — what to collect during an incident.
Security
- Security operations baseline — security knobs registry, regular checks.
Reference
- Configuration schema reference — all TOML/env parameters with types and defaults.
- Client compatibility baseline — list of tested clients and caveats.
- Known issues baseline — operator-level known issues.
- Operational policies baseline — production-policy baseline.
Troubleshooting
- Troubleshooting guide — common incidents and actions.
- Runbooks index — table of contents for all runbooks.
Validation
- Testing and validation baseline — what to check before production.
- Golden dataset management — managing golden data.
- CI reproducibility contract — build reproducibility contract.
Links
- Architecture overview — how the database is structured (operations context).
- Security model — the full security model.
- SQL compatibility — supported SQL boundaries.
- Support — how to report a problem.
Runbooks Index
Catalog of AngaraBase operator runbooks. All runbooks are tied to the code and updated together with release trains.
By Category
Lifecycle
| Runbook | When to use |
|---|---|
| Upgrade and migration | Before a version upgrade — pre-flight, rolling, verification |
| MVCC and GC operator minimum | AngaraGC setup, visibility diagnostics |
Reliability
| Runbook | When to use |
|---|---|
| Backup and restore | Regular backup, base/PITR restore, verification |
| Disaster recovery playbook | Full instance loss, host migration, restore oracle |
| Replication v2 operations guide | Managing AngaraReplica v2 |
Performance
| Runbook | When to use |
|---|---|
| Performance tuning guide | Targeted workload optimization |
| Parallel runtime observability | Parallel execution diagnostics, DOP caps |
| jemalloc heap profiling | Investigating memory growth |
Observability
| Runbook | When to use |
|---|---|
| Observability metrics checklist | Configure the minimum metric/alert set |
| Diagnostics bundle | Collect artifacts during an incident |
| Troubleshooting guide | Symptom → cause → action |
| Alert runbooks (RM-0.6.3.8 S7) | Per-alert remediation: backing pages for each runbook_url in tools/observability/alerts/angarabase_alerts.yaml |
Security
| Runbook | When to use |
|---|---|
| Security operations baseline | Regular security checks, knobs registry |
| Hardening | Move an instance to production-ready security configuration |
Reference (operator)
| Document | What it contains |
|---|---|
| Configuration schema reference | Full registry of TOML/env parameters |
| Client compatibility baseline | Tested clients, known limitations |
| Known issues baseline | Operator-level known issues |
| Operational policies baseline | Production policy baseline |
Validation
| Document | When to use |
|---|---|
| Testing and validation baseline | Acceptance checks before production |
| Golden dataset management | Managing golden datasets |
| CI reproducibility contract | Artifact reproducibility contract |
By Symptom (Quick Navigation)
| Symptom | Where to look |
|---|---|
| Server does not start | Troubleshooting → Configuration → Crash recovery |
| Queries became slower | Performance tuning → Diagnostics → Diagnostics bundle |
| 0A000 feature_not_supported error | SQL compatibility → Known issues |
| Used disk size is growing | MVCC and GC operator minimum → Diagnostics |
| RSS / OOM is growing | jemalloc profiling → Configuration |
| Backup or restore failed | Backup and restore → Disaster recovery |
| Authentication / RLS / audit behave unexpectedly | Security operations → Security model |
| Client / ORM problem | Client compatibility → SQL compatibility |
| Suspected data corruption | Verify release artifacts → Disaster recovery |
If the runbook did not help
Collect a diagnostics bundle and contact us through the Support flow.
Next
- Troubleshooting guide — symptom index and first actions.
- Disaster recovery playbook — “lost lease / damaged datadir” scenarios.
- Diagnostics bundle runbook — how to collect everything needed for escalation in one package.
Alert Runbooks
Operator-facing runbooks for each alert rule from
tools/observability/alerts/angarabase_alerts.yaml (RM-0.6.3.8 S7).
Each alert contains annotations.runbook_url with a link to one of
the pages below — this is the binding between the observability surface and the operator
remediation path.
Repo-reproducibility contract (G2-FIX cycle 2 / F-DOC-1): for each runbook_url in the alert YAML there is a backing markdown file in this directory. Verifier:
python3 - <<'PY'
import re, pathlib
rules = pathlib.Path("tools/observability/alerts/angarabase_alerts.yaml").read_text()
slugs = re.findall(r"runbooks/([a-z0-9-]+)", rules)
root = pathlib.Path("angarabook/src/operations/runbooks")
missing = [s for s in slugs if not (root / f"{s}.md").exists()]
print("OK" if not missing else f"MISSING: {missing}")
PY
By Alert Rule
| Alert | Severity | Runbook |
|---|---|---|
| AngarabaseDown | critical | angarabase-down.md |
| HighP99Latency | warning | high-p99-latency.md |
| HighSlowQueryRatio | warning | high-slow-query-ratio.md |
| BufferPoolPressure | warning | buffer-pool-pressure.md |
| WALFsyncSlow | warning | wal-fsync-slow.md |
| DeadlockSpike | critical | deadlock-spike.md |
| LongTransaction | warning | long-transaction.md |
| GCBloatHigh | warning | gc-bloat-high.md |
| ReplicationLag | warning | replication-lag.md |
| IndexRoutingLegacyFallback | warning | index-routing-legacy-fallback.md |
URL Convention
The production angarabook deployment maps /operations/runbooks/<slug> →
angarabook/src/operations/runbooks/<slug>.md. If your build
uses a different layout, update runbook_url in the alert YAML
accordingly (the source of truth is the alert file, not the runbooks themselves).
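As an illustration of the mapping above, a minimal Python sketch could resolve a runbook_url to the markdown file expected to back it. The function name runbook_source_path and the host name in the usage line are hypothetical; only the URL prefix and the source directory come from the convention described here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Prefix and source root as described in the URL convention.
URL_PREFIX = PurePosixPath("/operations/runbooks")
SOURCE_ROOT = PurePosixPath("angarabook/src/operations/runbooks")


def runbook_source_path(runbook_url: str) -> PurePosixPath:
    """Map a runbook_url to the markdown file expected to back it."""
    path = PurePosixPath(urlparse(runbook_url).path)
    if path.parent != URL_PREFIX:
        raise ValueError(f"not a runbook URL: {runbook_url}")
    return SOURCE_ROOT / f"{path.name}.md"


# Hypothetical host; only the path part matters for the mapping.
print(runbook_source_path("https://docs.example.com/operations/runbooks/wal-fsync-slow"))
```

A deployment with a different layout would change URL_PREFIX (and the runbook_url values in the alert YAML), not the runbook files.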
New Runbook Page Template
Each runbook page contains:
- What it means (required): short explanation of alert semantics + PromQL link.
- Severity: critical / warning / info.
- Initial response (≤ 5 minutes): what to do right now.
- Diagnostics: concrete commands (curl, psql, iostat, …).
- Mitigation: “symptom → action” table.
- Escalation: when and how to escalate.
- Related: links to adjacent runbooks and reference docs.
Related
- Runbooks index — general catalog of operator runbooks.
- Observability metrics checklist — minimal metric set.
Runbook: AngarabaseDown
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7 (Prometheus Alert Rules v0).
What It Means
Prometheus has not received a response from the up{job="angarabase"} target for more than 30 seconds.
The server either crashed, does not respond on /metrics, or the network path between Prometheus and the instance is broken.
Severity
critical. Affects service availability for all clients.
Initial response (5 minutes)
# 1. Check the process
systemctl status angarabase-server # or your service manager
ps -ef | grep angarabase-server
# 2. Check the port
ss -ltnp | grep -E ':(5432|9898)'
# 3. Fetch metrics directly from the host
curl -sf http://127.0.0.1:9898/metrics | head -5
Diagnostics
- Server log: journalctl -u angarabase-server -n 200 (or your log path).
- Crash diagnostics (RM-0.6.5.6):
  - Panic hook: on crash, the server writes [PANIC] thread='...' message='...' backtrace: to stderr (usually redirected to wrapper.log). Look for the backtrace to understand the cause.
  - Supervisor crash log: manage.sh writes [CRASH] pid=N exit_code=M to wrapper.log. This line confirms that the process crashed under supervisor control.
  Commands for quick diagnostics:
  # Find the latest panic with backtrace (show 20 context lines):
  grep -A 20 "\[PANIC\]" artifacts/golden_db/logs/wrapper.log | tail -40
  # Find all crash events with exit codes:
  grep "\[CRASH\]" artifacts/golden_db/logs/wrapper.log | tail -10
  # Example output: [CRASH] pid=18073 exit_code=101 timestamp=2026-05-07T07:03:57Z
  # Show the 5 log lines preceding each crash or panic marker (last 30 lines of output):
  grep -B 5 "\[CRASH\]\|\[PANIC\]" artifacts/golden_db/logs/wrapper.log | tail -30
- Lease: see crash-recovery.md if the server failed because of ResourceBusy (PID file / lease).
- Network: ss -s, iptables -L -n, check the firewall between Prometheus and the instance.
Mitigation
| Scenario | Action |
|---|---|
| Process crashed | systemctl restart angarabase-server + collect a crash dump |
| Lease stuck | ANGARABASE_FORCE_LEASE_TAKEOVER=1 + restart (see troubleshooting.md) |
| Network | Check firewall, route, DNS |
| /metrics overloaded | Increase scrape_interval (scrape less often); check timeouts in Prometheus |
Escalation
If restart does not help for more than 10 minutes, collect a diagnostics bundle and escalate through the support flow.
Related
Runbook: HighP99Latency
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7.
What It Means
P99 query latency exceeds 100 ms for 5 minutes.
Metric: histogram_quantile(0.99, rate(angarabase_query_exec_duration_ms_bucket[5m])).
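To make the quantile estimate concrete, the sketch below approximates what a quantile over cumulative histogram buckets computes: it finds the bucket containing the target rank and interpolates linearly inside it. This is a simplified illustration in the spirit of PromQL's histogram_quantile, not the code of Prometheus or of the server; the bucket bounds and counts are invented for the example.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Uses linear interpolation inside the bucket that contains the target rank.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            in_bucket = count - prev_count
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


# Illustrative cumulative bucket counts, upper bounds in milliseconds.
sample = [(10, 900), (50, 980), (100, 995), (250, 999), (math.inf, 1000)]
p99 = histogram_quantile(0.99, sample)
print(round(p99, 1))  # 83.3
```

Because the result is interpolated, its resolution is limited by the bucket boundaries: with these sample buckets, any P99 between 50 and 100 ms is an estimate within a single bucket.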
Severity
warning. Signal of degraded UX, not an outage.
Initial response
- Open the Grafana dashboard AngaraBase Overview v2 → row “Query Performance”.
- Compare with P50/P95: if all three increased together, this is a global issue (CPU/IO/lock); if only P99 increased, suspect tail latency (GC, fsync stall, a single slow query).
- Check the slow_query_total rate to see whether the number of slow queries is growing.
Diagnostics
# Slow-query counter and latency histogram buckets
curl -sf http://127.0.0.1:9898/metrics | rg slow_query_total
curl -sf http://127.0.0.1:9898/metrics | rg query_exec_duration_ms_bucket
# Active long-running transactions
psql -c "SELECT pid, age(now(), xact_start), query FROM angara_stat_activity \
WHERE state = 'active' ORDER BY xact_start LIMIT 10;"
Cross-check with other signals: BufferPoolPressure, WALFsyncSlow, LongTransaction.
Mitigation
- Optimization plan: see performance-tuning.md.
- ANALYZE on hot tables.
- Indexes: check angarabase_index_routing_legacy_total > 0; if so, run DROP+CREATE INDEX (see index-routing-legacy-fallback).
- Buffer pool: hit ratio < 90% → increase buffer_pool_pages.
Escalation
If latency does not decrease after standard actions for more than 30 minutes → diagnostics bundle + escalation.
Related
Runbook: HighSlowQueryRatio
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7. Renamed from HighErrorRate in G2-FIX cycle 2 (F-S7-1, 2026-04-19) to reflect the semantics accurately.
What It Means
The share of slow queries exceeds 1% of total queries over the last 5 minutes:
rate(angarabase_slow_query_total[5m])
/ clamp_min(rate(angarabase_query_exec_total[5m]), 1)
> 0.01
Important: this is NOT a true error rate. AngaraBase does not yet split angarabase_query_exec_total into _ok / _err counters (Design Gap DG-1, moved to RM-0.6.6.0). Slow-query ratio is a best-effort proxy for client-perceived degradation. A true HighErrorRate will appear after the counters are split.
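The effect of clamp_min in the alert expression can be sketched in a few lines of Python. The function names and sample rates are illustrative assumptions; only the formula and the 0.01 threshold come from the expression above.

```python
def slow_query_ratio(slow_per_sec: float, total_per_sec: float) -> float:
    """Mirror of the alert expression: slow rate / clamp_min(total rate, 1)."""
    return slow_per_sec / max(total_per_sec, 1.0)


def alert_fires(slow_per_sec: float, total_per_sec: float, threshold: float = 0.01) -> bool:
    return slow_query_ratio(slow_per_sec, total_per_sec) > threshold


# Busy instance: 200 queries/s, 3 of them slow -> 1.5% -> fires.
print(alert_fires(3.0, 200.0))   # True
# Quiet instance: 0.2 queries/s, 2% of them slow; the clamp divides by 1, not 0.2.
print(alert_fires(0.004, 0.2))   # False
```

At low traffic the clamp divides by 1 rather than by the real query rate, so the alert stays quiet on nearly idle instances even when a larger share of their queries is slow.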
Severity
warning. Degradation signal, not an outage.
Initial response
- Open Grafana Overview v2 → row “Query Performance” → panel “Slow queries / Total queries ratio”.
- Drill down into the Query Store dashboard → top-N slow queries.
- Check correlation with BufferPoolPressure, LongTransaction, WALFsyncSlow.
Diagnostics
curl -sf http://127.0.0.1:9898/metrics | rg '^angarabase_(slow_query|query_exec)_total'
psql -c "SELECT * FROM angara_stat_statements ORDER BY total_time DESC LIMIT 10;"
Mitigation
| Symptom | Action |
|---|---|
| Specific query | EXPLAIN ANALYZE → recreate the index / rewrite the query |
| runtime_facts.spill_bytes > 0 | Not enough memory for the operator. See Performance tuning (increase memory limit / work_mem) |
| seq scan chosen: low cardinality / low selectivity | Expected with thresholds in [execution]. First run ANALYZE and check distinct_estimate. Then, if needed, adjust index_cardinality_threshold / index_scan_selectivity_threshold in angarabase.conf (or env before startup) and restart; a SET issued over the Simple Query protocol is not applied. See Statistics, Performance tuning |
| All queries are slower | See HighP99Latency — check system signals first |
| Growing after deploy | Roll back the release; check the query plan |
| Correlates with GC | See GCBloatHigh |
Escalation
If the ratio does not drop for more than 30 minutes → diagnostics bundle + escalation.
Related
- Performance tuning guide
- HighP99Latency
- DG-1 split of _ok/_err counters (RM-0.6.6.0 Specs Backlog)
Runbook: BufferPoolPressure
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7.
What It Means
The buffer pool hit ratio has stayed below 90% (metric angarabase_buffer_pool_hit_ratio_milli < 900)
for 10 minutes. A growing share of page reads is served from disk instead of memory.
Severity
warning. Read performance is degrading; not yet critical.
Initial response
- Grafana Overview v2 → row “Buffer Pool & Memory”.
- Compare the pages_loaded rate with the pages_evicted rate to see whether there is churn.
- Check angarabase_jemalloc_resident_bytes to see whether RSS is growing (hint of a leak).
Diagnostics
# Current buffer-pool capacity and load (RM-0.6.6.3 S6-D2)
curl -sf http://127.0.0.1:9898/metrics | rg "buffer_pool_capacity|buffer_pool_hit|buffer_pool_miss"
# Example output:
# angarabase_buffer_pool_capacity_pages 195797 ← auto-detect 25% AvailRAM (3.0 GiB)
# angarabase_buffer_pool_hit_total 4120000
# angarabase_buffer_pool_miss_total 380000
curl -sf http://127.0.0.1:9898/metrics | rg buffer_pool
curl -sf http://127.0.0.1:9898/metrics | rg jemalloc
# Top tables by reads
psql -c "SELECT relname, heap_blks_read, heap_blks_hit \
FROM pg_statio_user_tables ORDER BY heap_blks_read DESC LIMIT 10;"
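As a quick cross-check of the example output above, the hit ratio can be derived from the two counters. This is a sketch under assumptions: the helper name is hypothetical, the counters are lifetime totals (so the result is a long-run average, not the windowed value the alert evaluates), and the "milli" unit is read as thousandths.

```python
def hit_ratio_milli(hits: int, misses: int) -> int:
    """Hit ratio in thousandths (900 == 90%), computed from cumulative counters."""
    total = hits + misses
    if total == 0:
        return 1000  # no reads yet; treated as a perfect ratio (assumption)
    return hits * 1000 // total


# Counter values from the example output.
ratio = hit_ratio_milli(4_120_000, 380_000)
print(ratio, ratio < 900)  # 915 False
```

To approximate the alert's view, apply the same formula to the difference between two scrapes taken a few minutes apart rather than to the lifetime totals.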
Mitigation
- Auto-sizing (RM-0.6.6.3): starting with this release, the engine automatically determines the buffer-pool size at startup: 25% of MemAvailable from /proc/meminfo, clamped to [1.6 GiB, 32 GiB]. Restarting after freeing memory on the host often solves the problem without changing config.
  For a forced value: export ANGARABASE_STORAGE_MAX_CACHED_PAGES=<N> before startup, where N = number of 16 KiB pages (for example, 200000 ≈ 3.1 GiB).
- Working set > RAM: consider partitioning or archiving old data.
- GC churn: check GCBloatHigh; bloat increases the working set.
- Memory leak: see jemalloc-profiling.md.
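The auto-sizing rule can be checked by hand before choosing a value for ANGARABASE_STORAGE_MAX_CACHED_PAGES. The sketch below restates the documented rule (25% of MemAvailable, clamped to [1.6 GiB, 32 GiB], 16 KiB pages); the function names and the rounding down to whole pages are assumptions, not the engine's actual code.

```python
GIB = 1024 ** 3
PAGE_BYTES = 16 * 1024  # 16 KiB pages, as stated above


def auto_buffer_pool_pages(mem_available_bytes: int) -> int:
    """25% of MemAvailable, clamped to [1.6 GiB, 32 GiB], expressed in pages."""
    target = 0.25 * mem_available_bytes
    clamped = min(max(target, 1.6 * GIB), 32 * GIB)
    return int(clamped // PAGE_BYTES)  # rounding down is an assumption


def pages_to_gib(pages: int) -> float:
    """Convert a page count back to GiB for sanity checks."""
    return pages * PAGE_BYTES / GIB


print(auto_buffer_pool_pages(12 * GIB))   # 196608
print(round(pages_to_gib(200_000), 2))    # 3.05
```

On a host with little free memory the lower clamp applies, so the pool does not shrink below 1.6 GiB no matter how small MemAvailable is at startup.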
Escalation
If the hit ratio does not recover after a config change / restart, collect a diagnostics bundle.
Related
- Performance tuning guide
- Configuration schema reference
- jemalloc profiling
- Backpressure Coordinator (RM-0.6.3.9 §S5+§S9): unified pool/WAL/uncommitted-pages backpressure decisions, including the pool_wait_timeout_ms knob, angarabase_buffer_pool_over_capacity_pages, angarabase_buffer_pool_evict_failed_total, the angarabase_buffer_pool_waiter_wait_seconds histogram, and the BufferPoolError::WaitTimeout SQL error path (RM-0.6.3.9 §S2+§S8 capacity waiter).
- Resource Advisors v0 (RM-0.6.3.9 §S10): angarabase_memory_pressure_ratio correlates with sustained BufferPoolPressure events when working-set growth, not churn, is the cause.
Runbook: WALFsyncSlow
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7.
What It Means
P99 fsync latency for WAL exceeds 50 ms for 5 minutes. Each commit waits for disk longer than the target budget — TPS drops, commit latency grows, and cascading backlog risk increases.
Severity
warning. At 200 ms+, it is close to critical (consider escalation).
Initial response
- Grafana Overview v2 → row “WAL & Durability”.
- Check whether WAL throughput (bytes/s) has grown; a surge can overflow the write buffer.
- Run iostat -xm 1 5 on the host to see whether the WAL disk is saturated.
Diagnostics
curl -sf http://127.0.0.1:9898/metrics | rg transaction_log
iostat -xm 1 5
dmesg | tail -50 # I/O errors / SMART warnings
Mitigation
| Cause | Action |
|---|---|
| Disk saturated | Move WAL to a separate disk; use SSD/NVMe instead of HDD |
| Group commit off | Enable wal.group_commit = true in config |
| Network FS | Do NOT use NFS / CIFS for wal/ — fsync semantics are unpredictable |
| Large wal_buffer_bytes | Reduce to a reasonable value (16-64 MB) |
| Filesystem barriers off | Check mount options (barrier=1, data=ordered) |
Escalation
If fsync > 200 ms persists for more than 10 minutes, this is a path to coordinated omission and commit loss; collect a diagnostics bundle and escalate urgently (durability-critical).
Related
Runbook: CommitLatencyTuning
Sources of truth:
- RM-0.6.3.10 (Track B S11/S12/S13) — group-commit baseline.
- RM-0.6.4.0 (Sprint 2/3) — WAL contract, SyncAtCommit mode.
What It Means
Runbook for situations where COMMIT latency is higher than expected or unstable between identical workloads (single-client cron jobs, batch DML, mixed RW).
RM-0.6.4.0 introduced the sync_at_commit mode (alias strict): each COMMIT
forces a WAL fsync before acknowledgment. This adds new rows
to the expected-latency table.
Durability Modes (RM-0.6.4.0+)
Configured through ANGARABASE_TRANSACTION_LOG_DURABILITY (env).
| Mode | Env value | Behavior | Use |
|---|---|---|---|
| Relaxed | relaxed | WAL is buffered, no fsync per commit | Dev/bench only |
| Group commit | group_commit | WAL pump coalesces and fsyncs by batch | Production (default) |
| Sync at commit | sync_at_commit or strict | fsync on every COMMIT | Banks, finance, max durability |
Important:
SET [LOCAL] durability = ... and COMMIT WITH DURABILITY = ... are reserved for v0.6.5 and return SQLSTATE 0A000 feature_not_supported. Use env for configuration.
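Since the mode is set only through the environment, a pre-start check in deployment tooling can catch typos early. The sketch below is a hypothetical helper, not part of AngaraBase: it accepts the values from the table above, maps the strict alias, and falls back to the documented default. How the server itself treats unknown values or letter case is not specified here, so the strictness below is an assumption.

```python
import os

# Accepted values from the table above; "strict" is an alias for sync_at_commit.
_MODES = {
    "relaxed": "relaxed",
    "group_commit": "group_commit",
    "sync_at_commit": "sync_at_commit",
    "strict": "sync_at_commit",
}


def resolve_durability(env=os.environ) -> str:
    """Return the canonical durability mode for the given environment mapping."""
    raw = env.get("ANGARABASE_TRANSACTION_LOG_DURABILITY")
    if raw is None:
        return "group_commit"  # documented production default
    try:
        return _MODES[raw]
    except KeyError:
        raise ValueError(f"unknown durability mode: {raw!r}") from None


print(resolve_durability({"ANGARABASE_TRANSACTION_LOG_DURABILITY": "strict"}))  # sync_at_commit
```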
Baseline Latency Expectations
| Mode | Condition | Expectation (guide) |
|---|---|---|
| relaxed | fsync=false | sub-ms COMMIT; not for production |
| group_commit | fsync=false | COMMIT ~0.1–5 ms; batches smooth spikes |
| group_commit | fsync=true | COMMIT 2–20 ms; disk dominates |
| sync_at_commit | NVMe | COMMIT 1–5 ms per tx (one fsync) |
| sync_at_commit | HDD | COMMIT 5–20+ ms per tx |
If p50 or p99 is significantly above the range, check the diagnostics block below.
Which Metrics to Watch
New RM-0.6.4.0 Metrics (WAL Commit Path)
curl -sf http://127.0.0.1:9898/metrics | rg "wal_commit|wal_durability|wal_barrier"
| Metric | Meaning |
|---|---|
| angarabase_wal_commit_fsync_total | Number of WAL writer fsync calls (growth = active sync) |
| angarabase_wal_durability_epoch | Monotonic counter of durability barrier epochs |
| angarabase_wal_barrier_wait_total | Number of transactions that waited for the durability barrier |
| angarabase_wal_barrier_duration_seconds | Histogram of barrier wait time |
Baseline Metrics (group commit / write path)
curl -sf http://127.0.0.1:9898/metrics | rg "write_path_phase_b|group_commit|transaction_log"
| Metric | Meaning |
|---|---|
| angarabase_write_path_phase_b_duration_seconds | Phase B histogram (commit hot path) |
| angarabase_write_path_phase_b_timeout_total | Phase B timeouts; should be low |
| angarabase_group_commit_batches_total | Number of pump batches |
| angarabase_group_commit_batch_size | Batch-size distribution |
| angarabase_transaction_log_group_commit_pumps_total | Number of pump runs |
| angarabase_transaction_log_group_commit_pump_duration_ms | Duration of one pump |
Quick Diagnostics
# New WAL metrics (RM-0.6.4.0)
curl -sf http://127.0.0.1:9898/metrics | rg "wal_(commit|durability|barrier)"
# Group commit baseline
curl -sf http://127.0.0.1:9898/metrics | rg "write_path_phase_b|group_commit|transaction_log_group_commit"
# I/O correlate
iostat -xm 1 5
If iostat shows high await/util and *_pump_duration_ms and p99 COMMIT grow at the same time,
the problem is almost always in the I/O layer.
With sync_at_commit: if angarabase_wal_commit_fsync_total grows
proportionally to tx rate, the mode works correctly. If the rate is disproportionately high,
check wal_barrier_duration_seconds for stalls.
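The proportionality check for sync_at_commit can be made explicit by comparing the growth of angarabase_wal_commit_fsync_total between two scrapes with the number of write transactions committed in the same window. The sketch below assumes roughly one fsync per COMMIT, as the latency table suggests; the function names, the source of the commit count, and the 20% tolerance are hypothetical.

```python
def fsyncs_per_commit(fsync_before: int, fsync_after: int, commits_in_window: int) -> float:
    """Ratio of WAL fsync calls to committed write transactions in a window."""
    if commits_in_window <= 0:
        raise ValueError("need at least one commit in the window")
    return (fsync_after - fsync_before) / commits_in_window


def looks_proportional(ratio: float, tolerance: float = 0.2) -> bool:
    # Under sync_at_commit, about one fsync per COMMIT is expected (assumption).
    return abs(ratio - 1.0) <= tolerance


# Two scrapes of the fsync counter and 1000 commits observed in between.
ratio = fsyncs_per_commit(10_000, 10_950, 1_000)
print(ratio, looks_proportional(ratio))  # 0.95 True
```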
Tuning Order
- Confirm the durability mode (ANGARABASE_TRANSACTION_LOG_DURABILITY) and target SLA.
- For sync_at_commit: make sure WAL files are on NVMe / a separate spindle.
- Check whether the workload is bursty: compare tx-rate and batch-size histogram.
- For production, stabilize disk first, then tune group_commit_interval_ms.
- For bench/dev, relaxed is allowed, but record it in the report.
DML-coverage check
For triage, it is useful to confirm that the latency anomaly is not masking a regression:
- INSERT INTO t(...) VALUES (..., now()) should succeed.
- UPDATE t SET x = x + 1 WHERE ... should apply the expression.
- UPDATE/DELETE in autocommit and in txn should return the correct row count.
Escalation
- If fsync=true and p99 COMMIT > 200 ms for more than 10 minutes, escalate as a durability-risk incident.
- If wal_barrier_duration_seconds p99 > 50 ms with sync_at_commit, check for I/O stall.
- If errors_total > 0 in TPC-B-lite smoke, stop performance claims until correctness is fixed.
Related
- Runbook: WALFsyncSlow
- Runbook: BufferPoolPressure
- Backpressure runtime settings
- Performance tuning guide
Runbook: DeadlockSpike
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7, RM-0.6.4.4 (SSI).
What It Means
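rate(angarabase_deadlock_detected_total[1m]) > 1: the deadlock detector
is firing faster than the alert threshold. PromQL rate() is a per-second average, so as written
this means more than one deadlock per second over the 1-minute window, not one per minute.
One or two deadlocks per hour is normal; a spike points to a problematic workload pattern.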
For SERIALIZABLE transactions: a spike in 40001 errors (serialization_failure)
can look like a deadlock spike in application logs, but has a different cause
(rw anti-dependencies). See the angarabase_ssi_aborts_total metric.
Severity
critical. Deadlock = aborted transaction = potential loss of client work.
For SSI: 40001 is expected behavior, but a high rate requires contention analysis.
Initial response
- Grafana Overview v2 → row “Locks”.
- Check which tables participate in the spike (see server log messages deadlock detected: ...).
- Correlate with recent deploy / migration: is there a new workload?
Diagnostics
curl -sf http://127.0.0.1:9898/metrics | rg 'lock_|deadlock'
journalctl -u angarabase-server -n 500 | rg -i 'deadlock'
# Active locks (if a compatible view exists)
psql -c "SELECT * FROM angara_stat_locks WHERE granted = false;"
Mitigation
| Cause | Action |
|---|---|
| Different lock acquisition orders | Standardize the order (UPDATE by PK ASC) in client code |
| Long-running txn holds a lock | See LongTransaction |
| Hot row contention | Shard the counter; use a sequence instead of UPDATE |
| Specific code needs updating | Roll back deploy, fix, redeploy |
Escalation
If the spike does not subside for more than 15 minutes, it blocks business operations; escalate urgently.
Related
Runbook: LongTransaction
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7, RM-0.6.4.4 (SSI).
What It Means
angarabase_txn_oldest_snapshot_age_seconds > 300 — the oldest open transaction
has been alive for more than 5 minutes. This blocks MVCC GC and leads to bloat.
For SERIALIZABLE transactions: holding a transaction for a long time also blocks cleanup (GC) of SIREAD locks and the SSI conflict graph, which can increase false positive aborts (40001) for new transactions because of lock escalation.
Severity
warning. At 30+ minutes it becomes a real GC blocker.
For SSI workloads, it is critical for throughput because of aborts.
Initial response
- Grafana Overview v2 → row “GC / MVCC”.
- Find the transaction PID:
SELECT pid, age(now(), xact_start) AS age, state, query
FROM angara_stat_activity
WHERE state IN ('idle in transaction', 'active')
ORDER BY xact_start ASC LIMIT 5;
Diagnostics
curl -sf http://127.0.0.1:9898/metrics | rg 'txn_(oldest_snapshot|active|idle)'
curl -sf http://127.0.0.1:9898/metrics | rg gc_
Mitigation
| Cause | Action |
|---|---|
| Client stuck in idle in transaction | Enable idle_in_transaction_session_timeout |
| Long analytical query | Move to a read replica; split into batches |
| Pgbouncer pool | Check server_idle_timeout, restart the pool |
| Application bug | Fix on the client side (transaction scope) |
Forced abort (last resort):
SELECT pg_terminate_backend(<pid>);
Escalation
If the transaction is older than 1 hour and has blocked GC long enough for bloat to exceed 30%, consider terminate + escalation and document the incident.
Related
Runbook: GCBloatHigh
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7.
What It Means
angarabase_gc_tuning_bloat_ratio_percent > 50 — for each “live” version there is
more than one “dead” version (which AngaraGC cannot remove). Most often this is a symptom
of a blocking long transaction (see LongTransaction).
Severity
warning. At 80%+ bloat, the buffer pool hit ratio drops.
Initial response
- Grafana Overview v2 → row “GC / MVCC”.
- Check the LongTransaction alert; the root cause is usually there.
- Check gc_tuning_state to see whether auto-tuning is reacting by itself.
Diagnostics
curl -sf http://127.0.0.1:9898/metrics | rg gc_
curl -sf http://127.0.0.1:9898/metrics | rg mvcc_
# Top tables by bloat
psql -c "SELECT schemaname, relname, n_dead_tup, n_live_tup,
round(100.0 * n_dead_tup / NULLIF(n_live_tup,0), 2) AS bloat_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY bloat_pct DESC NULLS LAST LIMIT 10;"
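When comparing the alert threshold with the query output, note that two different ratios may be in play. The description of the alert ("more than one dead version per live version" at > 50) is consistent with dead / (dead + live), while bloat_pct in the query above is dead / live. Which formula the metric actually uses is an assumption here; the sketch only shows how far the two can diverge for the same table.

```python
def dead_share_pct(dead: int, live: int) -> float:
    """Dead versions as a share of all versions: dead / (dead + live)."""
    total = dead + live
    return 100.0 * dead / total if total else 0.0


def dead_to_live_pct(dead: int, live: int):
    """Same formula as bloat_pct in the query: dead / live (None when live is 0)."""
    return round(100.0 * dead / live, 2) if live else None


dead, live = 1500, 1000
print(dead_share_pct(dead, live))    # 60.0
print(dead_to_live_pct(dead, live))  # 150.0
```

A table at 150 in the query output therefore corresponds to 60 on the share scale, so the 50/70/80 thresholds in this runbook should not be applied to bloat_pct without checking which definition is meant.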
Mitigation
- Close long transactions — see LongTransaction.
- Run vacuum on hot tables.
- Tune GC — increase auto-tuning aggressiveness (see mvcc-gc.md §Knobs).
- Full rebuild (downtime) if bloat > 70% and vacuum does not help.
Escalation
If bloat > 70% and does not fall after vacuum + closing long txns, collect diagnostics and escalate (service downtime may be needed).
Related
Runbook: ReplicationLag
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7.
What It Means
Replica lag (angarabase_replication_lag_bytes or equivalent in seconds) > 10 seconds.
The replica is behind primary; reads from the replica return stale data.
Severity
warning. At > 60 seconds there is a risk of data loss during failover.
Initial response
- Grafana Overview v2 → row “Replication”.
- On primary: check slot status / sender backpressure.
- On replica: check apply rate / disk space / network bandwidth.
Diagnostics
# Primary
curl -sf http://primary:9898/metrics | rg replication
# Replica
curl -sf http://replica:9898/metrics | rg replication
# Apply (replay) lag on the replica, returned as an interval
psql -h replica -c "SELECT now() - pg_last_xact_replay_timestamp() AS apply_lag;"
See also replication-v2.md §Diagnostics.
Mitigation
| Cause | Action |
|---|---|
| Network | Check bandwidth, RTT, packet loss between primary and replica |
| Replica slower than primary | Upgrade hardware (SSD, CPU, RAM) on replica |
| Large slot backlog | Free it (risky — drop inactive slot) |
| Apply bottleneck (single-threaded) | See replication-v2.md §Tuning |
| Competing GC on replica | Reduce query load on replica |
Escalation
If lag > 60 seconds and grows for more than 15 minutes, assess split-brain risk during failover and prepare a recovery plan.
Related
Runbook: IndexRoutingLegacyFallback
Source of truth:
tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S5 + S7. Synergy alert: binds Track 1 storage-correctness counter to Track 2 alerting layer.
What It Means
angarabase_index_routing_legacy_total{db="<db>"} > 0 post-upgrade — this instance
has secondary indexes whose catalog records (IndexDef.index_db_name) do not yet contain the
owning DB. They work correctly through legacy fallback to base.adb, but this means:
- the index physically lives in base.adb instead of <db>.adb,
- a backup of only <db>.adb will lose this index on restore,
- RFC-2026-087 §4.1 invariant (per-DB co-location of pages) is violated.
Severity
warning. Not critical — data is not lost and the index works, but migration is required.
Why this alert fires after upgrade
RM-0.6.3.8 introduced per-DB IndexStore page routing. Binaries before RM-0.6.3.8 created
IndexDef without the index_db_name field. After upgrade, such indexes decode
with index_db_name = None and go through the legacy path, incrementing this counter.
Initial response
# Which DBs contain legacy indexes
curl -sf http://127.0.0.1:9898/metrics | rg index_routing_legacy_total
# List indexes on the problematic DB
psql -d <db> -c "SELECT schemaname, tablename, indexname FROM pg_indexes \
WHERE schemaname NOT IN ('pg_catalog','information_schema');"
Mitigation: DROP + CREATE INDEX
For each affected index, run (during a maintenance window):
\c <db>
DROP INDEX IF EXISTS public.<index_name>;
CREATE INDEX <index_name> ON public.<table> (<col>);
After recreation, the counter stops growing (the old IndexDef record is replaced
with a new one that has the correct index_db_name = Some("<db>"); pages are written to <db>.adb).
Verification
# Counter should not grow after migration
watch -n 30 'curl -sf http://127.0.0.1:9898/metrics | rg index_routing_legacy_total'
And check the per-DB file size:
ls -lh <datadir>/<db>.adb
ls -lh <datadir>/base.adb # should shrink after indexes move
Escalation
Not required — this is a planned migration, not an incident. If the counter grows without an upgrade, this is a bug; escalate.
Related
- Backup and restore — why legacy indexes broke backup
- Upgrade and migration
- RFC-2026-087 §4.1 + Addendum §S2 (RM-0.6.3.8)
Troubleshooting Guide
This document moves the key operator fast path from the legacy troubleshooting runbook into AngaraBook.
Scope
It covers common AngaraBase operational incidents and quick actions for diagnostics/remediation.
Related documents:
Incident: False Positive Commit Conflicts (40001)
Symptoms
- Clients receive SQLSTATE 40001 on every COMMIT attempt.
- Startup logs may contain recovery warnings.
Typical causes
- Version below the fix release for the VLF recovery path.
- data_directory and transaction_log_directory are mixed up.
Actions
- Upgrade to the fixed version.
- Separate storage.data_directory and storage.transaction_log_directory.
Incident: Backpressure active (no-steal)
Symptoms
- buffer_pool_backpressure_active == 1
- buffer_pool_uncommitted_dirty_ratio and txn_write_set_limit_exceeded_total are growing
Actions
- Reduce write transaction batch size.
- Enable buffer_pool.backpressure.mode = fail_fast if needed.
- Lower txn.max_write_set_pages and/or increase buffer_pool.size_bytes.
Incident: p99 spikes during checkpoint
Symptoms
- Growth in checkpoint_duration_seconds and latency spikes.
Actions
- Increase checkpoint.target_ms.
- Limit writeback.max_bytes_per_sec.
- Tune checkpoint.dirty_ratio_hard for earlier background writeback.
Incident: commit fsync tails / durable_lsn lag grows
Symptoms
- Growth in commit_ack_latency_seconds and durable_lsn_lag_bytes.
Actions
- Move WAL/TL to a separate volume if possible.
- Tune group_commit.max_wait_us.
- Reduce writeback interference.
Start / stop (operator baseline)
angarabase-server --config /etc/angarabase/angarabase.conf
Minimum checks before startup:
- valid config;
- correct data/txn log directories;
- sufficient disk limits and fsync latency budget.
Incident: CRC mismatch in Delete Vector blob
Symptoms
- Query fails with error: CRC mismatch for DV blob <path> (segment <id>): expected <exp>, got <got>
- Possible during compaction or applying columnar DELETE.
What it means
The .bdel file (Delete Vector blob) is corrupted. The blob_uri field points to the exact file path, and segment_id is the segment identifier inside the blob. The error is fail-closed: reading stops and data is not modified.
Actions
- Find the corrupted file by blob_uri from the error message.
- Check storage volume integrity (IO errors in dmesg, S.M.A.R.T.).
- If the file is irreversibly corrupted, restore from backup (disaster-recovery.md).
- For recurring CRC errors, enable monitoring of angarabase_columnar_pending_deleted_rows to track DV fragmentation pressure.
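For the first action, the blob path and segment can be pulled out of log lines mechanically. The sketch below builds a regular expression from the message format quoted under Symptoms; the helper name and the sample log line are hypothetical, and since the rendering of the expected/got values (hex or decimal) is not specified, they are kept as strings.

```python
import re

# Pattern follows the message format quoted in Symptoms.
DV_CRC_ERROR = re.compile(
    r"CRC mismatch for DV blob (?P<blob_uri>.+?) \(segment (?P<segment_id>[^)]+)\): "
    r"expected (?P<expected>[^,\s]+), got (?P<got>\S+)"
)


def parse_dv_crc_error(line: str):
    """Return blob_uri, segment_id, expected and got as a dict, or None if no match."""
    match = DV_CRC_ERROR.search(line)
    return match.groupdict() if match else None


# Hypothetical log line shaped like the documented message.
sample = "ERROR: CRC mismatch for DV blob /data/t1/000042.bdel (segment 7): expected 0x1a2b3c4d, got 0xdeadbeef"
print(parse_dv_crc_error(sample)["blob_uri"])  # /data/t1/000042.bdel
```

Grouping parsed results by blob_uri shows whether recurring errors point at one file or at many files on the same volume, which helps decide between a single restore and a storage-level investigation.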
Triage fast-path
- Check the binary version and active config.
- Capture baseline metrics (commit_ack_latency, checkpoint, backpressure).
- Check recovery/txn log state.
- Apply remediation for the corresponding incident.
Extended related materials:
Next
- Diagnostics bundle runbook — what to attach to a ticket if the symptom does not map to a runbook.
- Disaster recovery playbook — for lease loss or corrupted datadir cases.
- Performance tuning guide — if the symptom is degradation, not outage.
- Operations overview — navigation across other operator materials.
Disaster Recovery Playbook
Basic DR playbook for cases where the normal recovery path does not resolve the incident.
Canonical source: this runbook in angarabook/src/operations/.
Scope
Covers the minimal scenarios:
- WAL corruption;
- data directory loss;
- emergency modes with deliberate risk.
1) Corrupted WAL
Symptoms
- ChecksumMismatch or InvalidRecord at startup.
Actions
- If corruption is in the tail, expect the normal truncate/recovery path.
- If corruption is in the middle:
- prefer restore from a valid backup (see Backup and restore);
- emergency truncate is allowed only as a last resort with transaction-loss risk.
2) Lost data directory
Actions
- Restore data_directory from full backup (procedure: Backup and restore).
- Verify that WAL contains a contiguous chain after the backup point.
- Run replay and confirm consistency with checks.
3) Emergency modes (high risk)
- Ignoring/weakening integrity checks is allowed only as break-glass.
- Any such startup requires explicit incident evidence and post-incident restoration to normal mode.
4) Prevention baseline
- Regular verified backup/restore rehearsal.
- Atomic data+txlog snapshots when using a snapshot strategy.
- Pinned evidence for recent DR exercises.
Next
- Backup and restore (operator-level) — which preliminary snapshots are required for DR scenarios.
- Upgrade and migration — overlap with DR during cross-version migration.
- Replication v2 operations guide — how DR is built on top of logical replication.
- Troubleshooting guide — if the DR procedure gets stuck in a specific phase.
Performance Tuning Guide
Operator baseline for performance tuning in early releases.
Canonical source: this runbook in angarabook/src/operations/.
Scope
Focus:
- buffer pool / checkpoint / writeback;
- TL/WAL durability and group commit;
- no-steal guardrails for large transactions.
Core principle (MVP)
MVP uses no-steal:
- uncommitted pages are not flushed to disk;
- recovery correctness is simpler, but strict guardrails are needed for write pressure.
Quick profiles
OLTP (short transactions)
- durability = sync_at_commit (strict) for maximum reliability, or group_commit with a small group_commit.max_wait_us for lower latency.
- Conservative txn.max_write_set_pages limits.
- buffer_pool.backpressure.mode = block for predictable behavior.
Analytics / long queries
- A higher write set ceiling is acceptable.
- buffer_pool.backpressure.mode = fail_fast if latency/SLO is the priority.
- Stronger control of commit tail latency with group_commit.
Storage Compression (RM-0.6.4.8+)
- Page Compression is enabled via CREATE TABLE ... WITH (compression='lz4').
- During intensive reads of compressed pages, watch angarabase_buffer_pool_decomp_spill_total. If it grows, consider limiting concurrency or increasing resources.
- If compression fails during page eviction, the system falls back to writing without compression (fail-open) and increments angarabase_compression_downgrade_total.
SIMD Float Aggregation (RM-0.6.6.5)
- Aggregate functions SUM(float4) and SUM(float8) automatically use SIMD instructions (AVX2 or NEON) when supported by the CPU.
- This significantly speeds up analytical queries over floating-point numbers.
- If SIMD is unavailable, the system transparently falls back to the scalar implementation, incrementing angarabase_simd_agg_fallback_total.
Adaptive Hash-Join (RM-0.6.6.5)
- The planner automatically swaps the Build and Probe sides in Hash Join if their actual size ratio exceeds adaptive_hash_join_swap_ratio (default 4).
- This uses the smaller table to build the hash table, saving memory and reducing spill probability.
- The switch is recorded in the angarabase_adaptive_probe_swap_total metric.
Dev / test
- durability = relaxed is acceptable (as a deliberate choice).
- txn.statement_timeout_ms = 0.
- fail_fast is useful for early overload detection.
Knobs (MVP list)
- durability = sync_at_commit|strict|group_commit|relaxed (env: ANGARABASE_TRANSACTION_LOG_DURABILITY)
  - sync_at_commit / strict: fsync on every COMMIT (max durability, RM-0.6.4.0)
  - group_commit: pump coalesces fsync (default, production)
  - relaxed: no fsync (dev/bench only)
- group_commit.max_batch_size
- group_commit.max_wait_us
- checkpoint.interval_ms
- checkpoint.target_ms
- checkpoint.dirty_ratio_soft|hard
- writeback.max_bytes_per_sec
- txn.max_write_set_pages|bytes
- buffer_pool.uncommitted_pages_ratio_hard (RM-0.6.3.9 §S5+§S9 rename; old name removed without alias)
- buffer_pool.backpressure.mode = block|fail_fast
- [execution].index_cardinality_threshold (default 0.15, env: ANGARABASE_INDEX_CARDINALITY_THRESHOLD)
  - If predicate selectivity is strictly above this threshold, single-key index scan is rejected (seq scan chosen: low cardinality).
- [execution].index_scan_selectivity_threshold (default 0.05, env: ANGARABASE_INDEX_SCAN_SELECTIVITY_THRESHOLD)
  - If selectivity is not below this threshold, index scan is also rejected (seq scan chosen: low selectivity).
  - On mixed OLTP workloads a filter may return ~10-15% of rows: the cardinality threshold already allows the plan, but the selectivity threshold 0.05 does not; in that case raise index_scan_selectivity_threshold (for example to 0.15) in config and restart the process.
- [execution].late_materialization_selectivity_threshold (default 0.3, env: ANGARABASE_LATE_MATERIALIZATION_SELECTIVITY_THRESHOLD)
  - Selectivity threshold for enabling the LateMaterialize node. If the filter passes fewer than 30% of rows, delayed column materialization is enabled.
- [execution].adaptive_hash_join_swap_ratio (default 4.0, env: ANGARABASE_ADAPTIVE_HASH_JOIN_SWAP_RATIO)
  - Ratio for adaptive side swap in Hash Join. If the side size ratio (probe/build) becomes ≥ 4, sides are swapped to optimize memory use.
- The [execution] thresholds are changed only through the configuration file (angarabase.conf, [execution] section) or env before process startup; a server restart is then required. SET optimizer.* / regular SET ... in Simple Query protocol do not change the planner: pgwire returns successful CommandComplete, but the value is not applied (client compatibility). To test a hypothesis, edit config or env and restart.
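The two scan gates described above can be read as a small decision function. The following is a minimal sketch that only mirrors the documented rules; the function name is hypothetical, it is not the planner's code, and the number the real planner prints in parentheses may be computed differently.

```python
def scan_strategy_reason(selectivity: float,
                         index_cardinality_threshold: float = 0.15,
                         index_scan_selectivity_threshold: float = 0.05) -> str:
    """Hypothetical helper: apply the two documented gates in order."""
    # Gate 1: selectivity strictly above the cardinality threshold rejects the index.
    if selectivity > index_cardinality_threshold:
        return f"seq scan chosen: low cardinality ({selectivity:.4f})"
    # Gate 2: selectivity that is not below the selectivity threshold also rejects it.
    if selectivity >= index_scan_selectivity_threshold:
        return f"seq scan chosen: low selectivity ({selectivity:.4f})"
    return f"index scan: high selectivity ({selectivity:.4f})"


# A filter returning ~12% of rows passes gate 1 but fails gate 2 with defaults;
# raising the second threshold to 0.15 lets the index scan through.
print(scan_strategy_reason(0.12))
print(scan_strategy_reason(0.12, index_scan_selectivity_threshold=0.15))
```

Under these assumptions, the mixed-OLTP case from the list is visible directly: only the second gate changes the outcome for a 10-15% filter.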
Symptoms -> actions (fast path)
- Checkpoint p99 spikes: increase checkpoint.target_ms, limit writeback.max_bytes_per_sec.
- Frequent backpressure: reduce batch size, lower txn.max_write_set_pages, and if needed increase buffer pool.
- durable_lsn lag / commit tails: check fsync latency, tune group_commit parameters.
- Slow query / plan changed: capture EXPLAIN (VERBOSE, DIAGNOSTIC) and read the plan using How to read query plans.
- Unexpected SeqScan on a large table: read scan_strategy_reason. Both gates reject when selectivity is at or above the threshold, so for low cardinality, raise [execution].index_cardinality_threshold if needed; for low selectivity, raise [execution].index_scan_selectivity_threshold. First check statistics (ANALYZE, distinct_estimate). Restart after changing thresholds.
Must-have alerts
- buffer_pool_backpressure_active == 1 longer than the threshold.
- buffer_pool_uncommitted_dirty_ratio above hard-limit.
- Growth in txn_write_set_limit_exceeded_total.
- GC/watermark stall (according to project SLO).
Next
- How to read query plans: how to read EXPLAIN, cost/rows, Vector*, replan_reason, cache_status, and reason_codes.
- Observability metrics checklist: what must be measured before and after tuning changes.
- Parallel runtime observability runbook: for CPU-bound workloads and DOP caps.
- jemalloc heap profiling runbook: if the bottleneck is memory, not CPU.
- MVCC and GC operator minimum: if latency growth correlates with GC backlog.
How to Read Query Plans
This guide helps DBAs/SREs read EXPLAIN in AngaraBase without knowing
the planner internals. The goal is not to manually “outplay”
the optimizer, but to quickly answer operator questions:
- which execution path the database chose;
- why that path was chosen;
- whether the query uses vector/parallel path;
- whether the plan was reused from cache or rebuilt;
- where to look for the cause of high latency.
Quick Start
For a regular plan:
EXPLAIN SELECT * FROM public.orders WHERE customer_id = 42;
For operator diagnostics:
EXPLAIN (DIAGNOSTIC)
SELECT * FROM public.orders WHERE customer_id = 42;
For detailed output:
EXPLAIN (VERBOSE, DIAGNOSTIC)
SELECT * FROM public.orders WHERE customer_id = 42;
For machine-readable evidence:
EXPLAIN (VERBOSE, DIAGNOSTIC, FORMAT JSON)
SELECT * FROM public.orders WHERE customer_id = 42;
If you need to see runtime counters, use EXPLAIN ANALYZE.
It executes the query, so for DML use it carefully and only
in a safe environment.
Runtime Facts
In ANALYZE mode, AngaraBase collects additional query execution facts
in the runtime_facts block. This block appears if the query encountered
waits, data spill to disk, or rejection due to resource limits.
Example JSON output:
"runtime_facts": {
"spill_bytes": 4096,
"wal_sync_wait_ms": 12,
"resource_reject_count": 1,
"last_runtime_reason": "spilled_memory_budget"
}
Example text output:
runtime_facts: spill_bytes=4096 wal_sync_wait_ms=12 resource_reject_count=1 last_runtime_reason=spilled_memory_budget
Main fields:
- spill_bytes: amount of data spilled to disk (for example, when HashJoin or Sort lacks memory).
- wal_sync_wait_ms: WAL synchronization wait time. May be omitted for SELECT, with durability=relaxed, or if the transaction successfully joined group commit without additional I/O wait.
- resource_reject_count: number of rejections due to resource limits.
- last_runtime_reason: reason code, for example spilled_memory_budget.
Note: only fields with non-zero values are emitted.
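For scripted triage, the text form of runtime_facts can be turned into a dictionary. This is a minimal sketch, assuming the line keeps the space-separated key=value layout shown in the example and that values never contain spaces; the helper names are hypothetical.

```python
COUNTER_FIELDS = ("spill_bytes", "wal_sync_wait_ms", "resource_reject_count")


def parse_runtime_facts(line: str) -> dict:
    """Parse a 'runtime_facts: k=v k=v ...' line into a dict (assumed layout)."""
    prefix = "runtime_facts:"
    if not line.startswith(prefix):
        raise ValueError("not a runtime_facts line")
    facts = {}
    for token in line[len(prefix):].split():
        key, _, value = token.partition("=")
        # Counters are integers; reason codes stay as strings.
        facts[key] = int(value) if value.isdigit() else value
    return facts


def with_zero_defaults(facts: dict) -> dict:
    """Only non-zero fields are emitted, so treat missing counters as 0."""
    return {**{name: 0 for name in COUNTER_FIELDS}, **facts}


line = ("runtime_facts: spill_bytes=4096 wal_sync_wait_ms=12 "
        "resource_reject_count=1 last_runtime_reason=spilled_memory_budget")
print(with_zero_defaults(parse_runtime_facts(line)))
```

Filling in zeros makes before/after comparisons simpler, since a missing field and a zero counter mean the same thing for these three counters.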
How to Read the Tree
Read the plan bottom-up. The bottom operator receives data from a table or index. Each next operator above applies a filter, join, aggregation, sort, or projection.
Example:
Project cost=0.00..1030.00 rows=100
  VectorFilter cost=0.00..1025.00 rows=100
    VectorSeqScan workers_planned=2 workers_launched=2 numa_affinity=disabled table=public.ux_stats_v2 cost=0.00..1000.00 rows=1000
--- Optimizer Diagnostics ---
query_fingerprint=1795416667712787713
plan_fingerprint=3192678580981205807
workload_class=select
replan_reason=none
cache_status=hit
reason_codes=stats_default_fallback
Read it like this:
- VectorSeqScan reads table public.ux_stats_v2.
- VectorFilter applies the WHERE condition.
- Project keeps the required columns in the result.
- The Optimizer Diagnostics block explains query/plan identifiers, cache, replan reason, and reason codes.
Operator Line Format
Let’s break down the line:
VectorSeqScan workers_planned=2 workers_launched=2 numa_affinity=disabled table=public.ux_stats_v2 cost=0.00..1000.00 rows=1000
| Field | What it means | How an operator should read it |
|---|---|---|
| VectorSeqScan | Operator type. Vector = vector executor, SeqScan = sequential table read. | Reads the whole table in batches. Good for full scan / analytics, bad for point lookup on a large table without an index. |
| workers_planned=2 | How many workers the planner wanted to use. | Plan allows parallelism. |
| workers_launched=2 | How many workers were actually allocated. | If fewer than planned, runtime pressure or parallelism limits are possible. |
| numa_affinity=disabled | Whether binding to NUMA-node is enabled. | Usually disabled is normal for dev/cloud; on bare metal it may be a separate tuning question. |
| table=public.ux_stats_v2 | Source table. | Verify that the expected table/schema is scanned. |
| cost=0.00..1000.00 | startup_cost..total_cost in planner units. | Not milliseconds. Compare with alternative plans, not wall-clock. |
| rows=1000 | Estimated output rows from the operator. | Large estimate error often leads to bad join order or unnecessary full scan. |
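The fields in the table can also be extracted mechanically, for example to flag a gap between planned and launched workers. A minimal sketch, assuming the text layout of the example line (operator name first, then space-separated key=value tokens); quoted values with spaces are not handled, and the function name is hypothetical.

```python
def parse_operator_line(line: str) -> dict:
    """Split one text-plan operator line into typed fields (assumed layout)."""
    tokens = line.split()
    node = {"node_type": tokens[0]}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value, e.g. filter expressions
            node[key] = value
    if "cost" in node:
        # cost is startup_cost..total_cost in planner units, not milliseconds
        startup, total = node.pop("cost").split("..")
        node["startup_cost"] = float(startup)
        node["total_cost"] = float(total)
    for key in ("rows", "workers_planned", "workers_launched"):
        if key in node:
            node[key] = int(node[key])
    return node


node = parse_operator_line(
    "VectorSeqScan workers_planned=2 workers_launched=2 numa_affinity=disabled "
    "table=public.ux_stats_v2 cost=0.00..1000.00 rows=1000"
)
shortfall = node["workers_planned"] - node["workers_launched"]
print(node["node_type"], node["total_cost"], shortfall)
```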
Cost and Rows
cost is an internal work estimate, not execution time.
- startup_cost: cost to get the first row.
- total_cost: cost to get all rows.
- rows: expected row count after the operator.
Common mistake: reading cost=1000 as 1000 ms. Do not do that.
Cost is used by the optimizer to compare alternatives:
- full scan vs index scan;
- hash join vs nested/index join;
- sort before or after filter;
- aggregate over all rows or over already filtered input.
If rows is clearly far from reality, check statistics first:
ANALYZE public.orders;
SELECT *
FROM sys.table_stats
WHERE schema_name = 'public' AND table_name = 'orders';
SELECT *
FROM sys.column_stats
WHERE schema_name = 'public' AND table_name = 'orders';
Vector Prefix
Operators with the Vector prefix execute through the vectorized path:
- VectorSeqScan
- VectorIndexScan
- VectorFilter
- VectorProject
- VectorWindowFunction
- VectorSetOperation
The vector path processes data in batches, reducing per-row overhead. For operators this is usually a good sign, especially on scan/filter/aggregate workload.
If the expected Vector* disappeared:
- Check query shape: whether an expression was added that is not yet supported by the vector executor.
- Compare EXPLAIN (VERBOSE, DIAGNOSTIC) before/after the query change.
- Inspect reason_codes and plan node types.
- For latency regressions, use Performance tuning guide and Parallel runtime observability runbook.
Operator Glossary
| Operator | What it does | When good | When suspicious |
|---|---|---|---|
| Scan / VectorSeqScan | Reads the whole table. | Small table, analytics, low filter selectivity. | Point lookup on a large table where an index should exist. |
| IndexScan / VectorIndexScan | Reads through an index, then checks residual filter if needed. | Selective predicate on indexed column. | If it returns a large share of the table, full scan may be cheaper. |
| IndexOnlyScan | Reads only the index, without heap fetch, if visibility map allows. | Covering index + all-visible pages. | If it often falls back to heap, check visibility map / vacuum-like processes. |
| Filter / VectorFilter | Applies WHERE/predicate to input stream. | After scan/index scan. | If filter is above expensive join, check pushdown. |
| Project / VectorProject | Selects/computes output columns. | Normal top operator for SELECT. | Usually not an issue except very expensive expressions. |
| Join | Generic join node with kind=inner/left/right/full/cross. | Expected join type matches SQL. | cross almost always needs attention. |
| HashSemiJoin | Implements EXISTS/semi join through hash. | Good sign for decorrelated EXISTS. | If semi join was expected but nested/cross-like plan appears. |
| HashAntiJoin | Implements NOT EXISTS/anti join through hash. | Good sign for anti-semi workload. | If input is large and there is no memory headroom. |
| NLIndexJoin | Nested-loop probe by index. | Small outer input + selective index lookup. | Large outer input: may become many index probes. |
| Aggregate | COUNT, SUM, GROUP BY, and other aggregate operations. | After filter or join with already reduced input. | If aggregate must materialize huge input. |
| Sort | Sorts the stream. | For ORDER BY, merge-like paths. | Large sort without LIMIT or index on order key. |
| Distinct | Removes duplicates. | Needed for DISTINCT. | On large input without prior row reduction. |
| Limit / Offset | Limits or skips rows. | LIMIT can sharply reduce total cost. | Large OFFSET still forces reading/skipping many rows. |
| WindowFunction | Window functions. | Analytical queries. | If it requires large sort/partition. |
| SetOperation | UNION, INTERSECT, EXCEPT. | Set queries. | If unexpectedly expensive due to dedup/sort. |
| LateralJoin | LATERAL/derived-table dependent path. | Correlated derived inputs. | Can be expensive on large outer inputs. |
| LateMaterialize | Delayed column materialization. Reads only columns needed for filtering, then reads the rest later for rows that passed the filter. | High filter selectivity (the filter passes fewer than 30% of rows, selectivity < 0.3). | If selectivity is low (most rows pass), the double read may be more expensive than a normal read. |
| DmlInsert / DmlUpdate / DmlDelete | Sentinel for DML. | EXPLAIN DML shows intent. | For runtime counters, use EXPLAIN ANALYZE carefully. |
| Ddl | Sentinel for DDL. | Shows DDL path. | Not a query performance hot path. |
Scan Strategy Reason
For Scan (SeqScan) and IndexScan nodes, the planner prints the reason for choosing
a specific scan strategy in the scan_strategy_reason field. This helps
understand why the optimizer preferred sequential scan over index scan
or vice versa.
Output examples:
- index scan: high selectivity (0.0005) means the index was chosen because the condition is highly selective.
- seq scan chosen: low cardinality (0.1328) means SeqScan was chosen because selectivity is above [execution].index_cardinality_threshold (the planner considers the column “too low-cardinality” for an index on this predicate).
- seq scan chosen: low selectivity (0.1111) means SeqScan was chosen because selectivity is not below [execution].index_scan_selectivity_threshold (separate gate after cardinality).
If you see seq scan chosen where you expected an index:
- Check statistics freshness (ANALYZE).
- Check distinct_estimate in sys.column_stats.
- Tune thresholds in angarabase.conf ([execution]) or through env before startup (ANGARABASE_INDEX_CARDINALITY_THRESHOLD, ANGARABASE_INDEX_SCAN_SELECTIVITY_THRESHOLD), then restart the server. SET ... from psql in Simple Query protocol does not change these knobs (see Performance tuning).
Optimizer Diagnostics
EXPLAIN (DIAGNOSTIC) adds a block:
--- Optimizer Diagnostics ---
query_fingerprint=1795416667712787713
plan_fingerprint=3192678580981205807
workload_class=select
replan_reason=none
cache_status=hit
reason_codes=stats_default_fallback
query_fingerprint
Stable identifier of the query’s logical shape. Literal values usually should not create a new fingerprint for every constant.
Use it to correlate:
- slow query;
- metrics;
- logs;
- repeated EXPLAIN;
- regression evidence.
plan_fingerprint
Identifier of the plan shape. If the query is the same but the plan changed,
query_fingerprint remains the same while plan_fingerprint changes.
This is useful when investigating:
- “after ANALYZE, the query became faster/slower”;
- “after adding an index, the plan changed”;
- “yesterday it was IndexOnlyScan, today it is SeqScan again”.
workload_class
Workload class:
- select
- write
- ddl
- other classes if a specific path marks them.
For operators, this helps separate OLTP read path from write/DDL events.
replan_reason
Why the plan was rebuilt or why there is no explicit reason.
| Value | Meaning | What to do |
|---|---|---|
| none | No explicit replan reason. Usually normal path. | If cache_status=hit, cache is working. |
| stats_drift | Statistics changed enough that the old plan could be stale. | Check ANALYZE frequency, table churn, latency stability. |
| schema_changed | Schema changed: DDL, index, column, or another schema signal. | Normal after migrations; suspicious with frequent DDL in production. |
| aqp_feedback | Runtime feedback affected estimates/planning. | Check AQP metrics and workload skew. |
| forced_fallback | Planner/runtime chose a safe fallback. | Compare reason codes and unsupported expressions. |
cache_status
Shows how the query relates to the plan cache.
| Value | Meaning | How to interpret |
|---|---|---|
| hit | Plan reused. | Good for stable OLTP. |
| miss | Plan built again. | Normal for first run or new query shape. |
| bypass | Cache deliberately not used. | Check DDL, volatile shape, diagnostics mode, or safety path. |
| invalidated | Old plan dropped. | Look for replan_reason. |
| unknown | Runtime did not pass status. | Do not infer cache behavior only from this field. |
reason_codes
Reasons for planner choice or fallback.
| Code | Meaning | What to check |
|---|---|---|
| stats_default_fallback | Planner could not use detailed statistics and applied defaults. | Run ANALYZE, check sys.table_stats and sys.column_stats. |
| index_only_eligible | Plan can read only the index without heap fetch. | Check visibility map and index coverage. |
| bitmap_candidate_rejected | There was an alternative bitmap-like/index path, but another path or residual filter was chosen. | Compare predicate selectivity and presence of a suitable index. |
| hash_join_fits_work_mem | Hash join is considered memory-eligible. | When p99 grows, check memory pressure and join cardinality. |
| used_multicol_stats | Multi-column statistics were used. | Good sign for correlated predicates. |
If reason_codes is empty, AngaraBase shows stats_default_fallback
so operators do not get a “silent” diagnostic block.
JSON Format
For CI, evidence packs, and diffs between releases, use JSON:
EXPLAIN (VERBOSE, DIAGNOSTIC, FORMAT JSON)
SELECT * FROM public.orders WHERE customer_id = 42;
In JSON, the same entities are represented as fields:
- Node Type
- Startup Cost
- Total Cost
- Plan Rows
- Plans
- workers_planned
- workers_launched
- numa_affinity
- query_fingerprint
- plan_fingerprint
- replan_reason
- reason_codes
- cache_status
For release evidence, do not compare the entire JSON byte-for-byte; compare stable properties: node class, join type, fingerprints, reason codes, and key estimates.
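One way to compare only stable properties is to project each plan onto a reduced view before diffing. The sketch below is illustrative: it assumes the JSON is a single object whose children sit under Plans and whose diagnostics fields sit at the top level, which may not match the actual output layout; the function names are hypothetical.

```python
import json


def stable_view(plan: dict) -> dict:
    """Keep node types, row estimates, and diagnostics; drop costs and cache state."""
    def node_view(node: dict) -> dict:
        return {
            "Node Type": node.get("Node Type"),
            "Plan Rows": node.get("Plan Rows"),
            "Plans": [node_view(child) for child in node.get("Plans", [])],
        }

    view = node_view(plan)
    # Assumed location of the diagnostics fields: the top-level object.
    for key in ("query_fingerprint", "plan_fingerprint", "reason_codes"):
        view[key] = plan.get(key)
    return view


def plans_equivalent(old_json: str, new_json: str) -> bool:
    """True when two EXPLAIN JSON documents agree on their stable properties."""
    return stable_view(json.loads(old_json)) == stable_view(json.loads(new_json))
```

With this projection, a change in Total Cost or cache_status alone does not fail a release check, while a changed node type or fingerprint does.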
Common Reading Scenarios
1. Slow point lookup
Symptom:
VectorSeqScan table=public.orders ... rows=1000000
What to check:
- Whether there is an index on the filter column.
- Whether the planner sees statistics (sys.column_stats).
- Whether diagnostics shows stats_default_fallback.
- Whether filter selectivity is too low.
The desired plan for point lookup is usually closer to:
IndexScan index_name=... index_col=customer_id key_range=eq(...)
or:
IndexOnlyScan index_name=... index_col=customer_id index_only_reason="..."
2. Late Materialization
If the filter rejects a significant share of rows, the planner may choose the LateMaterialize node. This avoids expensive reading of all columns for rows that will be filtered out anyway.
The enable threshold is controlled by [execution].late_materialization_selectivity_threshold (default 0.3).
Plan example:
Project cost=10.00..50.00 rows=100
  LateMaterialize cost=5.00..45.00 rows=100
    VectorFilter (x > 100) cost=0.00..40.00 rows=100
      VectorSeqScan table=large_table cost=0.00..30.00 rows=1000
3. EXISTS should not be nested-loop
For query:
EXPLAIN (DIAGNOSTIC)
SELECT *
FROM public.orders o
WHERE EXISTS (
SELECT 1
FROM public.order_items i
WHERE i.order_id = o.id
);
Good sign:
HashSemiJoin kind=semi on=...
This means the optimizer decorrelated EXISTS and chose hash semi join.
3. NOT EXISTS and anti join
Good sign:
HashAntiJoin kind=anti on=...
If input is large, check hash_join_fits_work_mem and memory metrics.
4. GROUP BY is too expensive
Symptom:
Aggregate cost=... rows=...
VectorSeqScan table=...
What to check:
- Whether rows can be filtered before aggregate.
- Whether there are unnecessary projected columns.
- Whether group key fits fast path (for example, single integer key).
- Whether there are too many groups.
- Whether the query requires sorting after aggregate.
5. Parallel planned, but latency is high
Symptom:
workers_planned=2 workers_launched=0
or workers_launched is less than workers_planned.
What to check:
- Global parallel runtime limits.
- CPU saturation.
- pgwire/runtime queues.
- Memory pressure.
- Parallel runtime observability runbook.
6. Plan changed after ANALYZE
Compare:
- query_fingerprint: should remain stable for the same SQL shape;
- plan_fingerprint: changes if the plan shape changed;
- replan_reason: should explain the rebuild;
- reason_codes: show which new factors became available.
If IndexScan or IndexOnlyScan appears after ANALYZE, that is usually
a good sign. If SeqScan appears on a large OLTP lookup, check
selectivity and statistics.
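The before/after comparison can be scripted from two captured diagnostics blocks. This is a minimal sketch, assuming the key=value text layout of the Optimizer Diagnostics block; the function names and the returned messages are hypothetical.

```python
def parse_diagnostics(text: str) -> dict:
    """Parse the key=value lines of an Optimizer Diagnostics block."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("---"):
            continue  # skip blank lines and the block header
        key, sep, value = line.partition("=")
        if sep:
            fields[key] = value
    return fields


def classify_change(before: dict, after: dict) -> str:
    """Summarise what changed between two captures of the same query."""
    if before["query_fingerprint"] != after["query_fingerprint"]:
        return "different query shape: plans are not comparable"
    if before["plan_fingerprint"] == after["plan_fingerprint"]:
        return "same plan"
    if after.get("replan_reason", "none") == "none":
        return "plan changed without explicit reason: candidate for escalation"
    return f"plan changed: {after['replan_reason']}"
```

Pass only the diagnostics block to parse_diagnostics; plan tree lines also contain `=` and would be picked up as spurious keys.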
Relationship with sys.* views
EXPLAIN shows the plan, and sys.* helps verify whether the optimizer
has data for a good decision.
Minimum set:
SELECT *
FROM sys.table_stats
WHERE schema_name = 'public' AND table_name = 'orders';
SELECT *
FROM sys.column_stats
WHERE schema_name = 'public' AND table_name = 'orders';
SELECT *
FROM sys.multicolumn_stats
WHERE schema_name = 'public' AND table_name = 'orders';
SELECT *
FROM sys.workload_stats
WHERE schema_name = 'public' AND table_name = 'orders';
How to read:
- row_count_estimate helps understand whether the optimizer knows table size.
- distinct_estimate helps estimate equality predicate selectivity.
- min_i64 / max_i64 help range predicates.
- multicolumn_stats helps with correlated conditions.
- workload_stats shows how the table is actually used.
Triage Checklist
When a user says “the query became slow”, proceed as follows:
- Capture the plan: EXPLAIN (VERBOSE, DIAGNOSTIC) <query>;
- If safe, capture runtime: EXPLAIN ANALYZE <query>;
- Read the tree bottom-up.
- Find the widest input (rows sharply above expected).
- Check whether the expected operator class is used: IndexScan, IndexOnlyScan, HashSemiJoin, Aggregate, Vector*.
- Check reason_codes.
- If stats_default_fallback is present, run ANALYZE and compare the plan.
- Compare query_fingerprint and plan_fingerprint before/after.
- If the issue is in parallel path, go to Parallel runtime observability runbook.
- If the issue is in storage/IO, go to Performance tuning guide.
Common Interpretation Mistakes
| Mistake | Why it is wrong | Correct approach |
|---|---|---|
| cost=1000 means 1000 ms | Cost is a relative optimizer model. | For timing, use EXPLAIN ANALYZE and latency metrics. |
| SeqScan is always bad | Full scan may be optimal for small tables or low-selectivity filters. | Check table size, selectivity, and index availability. |
| IndexScan is always better | Index scan may be worse than full scan if it returns a large share of the table. | Compare rows/cost and actual runtime. |
| workers_planned=2 guarantees 2x speedup | Workers have overhead and may not start. | Check workers_launched and runtime metrics. |
| replan_reason=none means optimizer did nothing | It means there is no explicit replan reason. | Check cache_status, fingerprints, and reason codes. |
| stats_default_fallback can be ignored | It signals the optimizer may be guessing without statistics. | Run ANALYZE and check sys.* views. |
When to Escalate
Escalate as a bug/perf issue if:
- EXPLAIN (DIAGNOSTIC) does not show a diagnostic block. First make sure you are not using EXPLAIN (DIAGNOSTIC ON): AngaraBase does not support the boolean suffix ON/OFF and silently ignores the option, so use EXPLAIN (DIAGNOSTIC) without a suffix;
- query_fingerprint is unstable for the same query shape;
- plan_fingerprint changes without schema/stats/AQP reason;
- replan_reason=stats_drift appears too often on a stable table;
- IndexOnlyScan is chosen, but runtime constantly does heap fetch;
- HashSemiJoin / HashAntiJoin disappear for simple EXISTS / NOT EXISTS;
- workers_launched is systematically below workers_planned without a clear pressure signal;
- JSON/text outputs contradict each other.
For bug report, attach:
- query SQL;
- EXPLAIN (VERBOSE, DIAGNOSTIC) text;
- EXPLAIN (VERBOSE, DIAGNOSTIC, FORMAT JSON);
- relevant rows from sys.table_stats, sys.column_stats, sys.multicolumn_stats, sys.workload_stats;
- AngaraBase version and capability/profile snapshot, if available.
Next
- Performance tuning guide: what to do after reading the plan if the issue is latency/throughput.
- Parallel runtime observability runbook: how to investigate workers_planned / workers_launched and runtime pressure.
- Observability metrics checklist: which metrics to correlate with query_fingerprint and plan_fingerprint.
- Diagnostics bundle runbook: how to collect evidence for support.
MVCC and GC Operator Minimum
Minimal operator contract for triaging GC/MVCC behavior.
Goal
Make GC predictable:
- see lag and stalls;
- bound the pause budget;
- understand which knobs to adjust first.
Metrics to watch
- Watermark:
  - angarabase_gc_watermark_snapshot
- Slice latency:
  - angarabase_gc_compact_slice_duration_ms_*
- GC progress:
  - angarabase_gc_compact_slices_total
  - angarabase_gc_compact_tables_scanned_total
  - angarabase_gc_compact_versions_removed_total
  - angarabase_gc_compact_tables_removed_total
- Long snapshot risk:
  - txn_oldest_snapshot_age_seconds
  - txn_long_snapshot_warn_total
  - txn_long_snapshot_hard_total
Core knobs
- ANGARABASE_GC_BUDGET_TABLES
- ANGARABASE_GC_BUDGET_MS
- ANGARABASE_GC_BUDGET_VERSIONS
- ANGARABASE_GC_BURST_SLICES
- ANGARABASE_GC_BURST_MAX_MS
- ANGARABASE_GC_CURSOR_FILE (best-effort persisted cursor)
Full settings: src/operations/config-schema.md.
Triage: “GC not keeping up”
- Check txn_oldest_snapshot_age_seconds: large age limits the watermark by contract.
- Check the tail of gc_compact_slice_duration_ms_*: if it grows, reduce the slice budget.
- Check the trend of *_versions_removed_total and *_tables_scanned_total: if there is no progress, look for a long snapshot and environment issues through a diagnostics bundle.
UndoStore GC (RM-0.6.5.20)
RM-0.6.5.20 introduced epoch-based UNDO log GC:
How It Works
- UndoGcWorker starts as a background thread at server startup
- Every ~60 seconds (configurable interval), gc_watermark is computed for each DB
- UndoStore::gc_purge_older_than(gc_watermark) removes records older than the watermark
- Watermark = committed_epoch minus safety margin (protects active read-only transactions)
Metric
angarabase_undo_purged_records_total — gauge showing UNDO record cleanup progress. Updated when GC is active.
Diagnostics
SELECT * FROM sys.metrics WHERE name LIKE '%undo%';
-- Expected: angarabase_undo_purged_records_total > 0 under write load
Troubleshooting (UNDO GC not working):
If angarabase_undo_purged_records_total stays at 0 for a long time during active UPDATE/DELETE:
- Check txn_oldest_snapshot_age_seconds: long (stuck) transactions block gc_watermark advancement.
- Find and terminate stuck transactions (kill).
- Check server logs for UndoGcWorker errors (for example, I/O errors with .aud files).
Manual heap-file compaction
# one-shot compact for a specific DB:
bash tools/golden_db/manage.sh compact <db_name>
Use after bulk DELETE / many UPDATEs if the .adb file is suspiciously large.
Related runbooks
- src/operations/diagnostics-bundle.md
- src/operations/performance-tuning.md
Diagnostics Bundle Runbook
Operator runbook for quickly collecting triage artifacts.
Canonical source: this runbook in angarabook/src/operations/.
Goal
The diagnostics bundle should produce a predictable package:
- version and runtime environment;
- basic on-disk snapshot;
- config in redacted form;
- metrics and final artifact index.
Pinned commands
CLI (operator-facing, preferred for packaged distribution):
angara-cli diagnostics bundle \
--root artifacts/diagnostics/incident-1 \
--config /etc/angarabase/angarabase.conf \
--data-dir /var/lib/angarabase/data \
--txlog-dir /var/lib/angarabase/transaction_log \
--json
Legacy tools entrypoint (workspace/dev path):
tools/diagnostics_bundle/run.sh --root artifacts/diagnostics/dev
With config and directories:
tools/diagnostics_bundle/run.sh \
--root artifacts/diagnostics/incident-1 \
--config /etc/angarabase/angarabase.conf \
--data-dir /var/lib/angarabase/data \
--txlog-dir /var/lib/angarabase/transaction_log
Structure validation:
tools/diagnostics_bundle/validate.sh <bundle_root>
Artifact layout (minimum)
- system.txt, versions.txt, env_angarabase.txt
- on_disk_inspect.json
- config.redacted.conf, config_redaction.txt
- metrics.prom (or metrics.prom.error.txt)
- summary.json
- ok.txt
Security policy
Secrets in config are redacted: values of keys matching password, secret, token, api_key
are replaced with "<REDACTED>".
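The redaction rule can be illustrated with a short filter. This is a sketch under assumptions: it treats the config as `key = value` lines, matches key names case-insensitively, and treats `#` as the comment marker; none of these details are confirmed for the actual bundle tool.

```python
import re

# Key-name fragments named by the security policy.
SENSITIVE = re.compile(r"password|secret|token|api_key", re.IGNORECASE)


def redact_config(text: str) -> str:
    """Replace values of sensitive keys; leave sections and comments untouched."""
    out = []
    for line in text.splitlines():
        key, sep, _value = line.partition("=")
        is_setting = sep and not line.lstrip().startswith(("#", "["))
        if is_setting and SENSITIVE.search(key):
            line = f'{key.rstrip()} = "<REDACTED>"'
        out.append(line)
    return "\n".join(out)


print(redact_config("[security]\nauth_password = hunter2\nallow_insecure = false"))
```

Matching on the key rather than the value keeps the redaction predictable: a value that merely contains the word "token" is not touched.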
Evidence routing
- Heavy bundle is stored in artifacts/diagnostics/<stamp>/...
- Documentation records only compact pinned summaries/evidence pointers.
Next
- Troubleshooting guide — where to attach the collected bundle.
- Disaster recovery playbook — separate artifact set for DR escalations.
- Observability metrics checklist — which metrics to duplicate in the bundle.
Security Operations Baseline
Short operator security baseline. Extended context and additional details are in migration history and related AngaraBook pages.
Goal
Bring security-relevant knobs into a single operational contract:
- safe defaults;
- fail-closed gates;
- observability points (sys.settings).
Source of truth
- Config schema: src/operations/config-schema.md
- Runtime settings surfaces: crates/angarabase/src/settings.rs, crates/angarabase/src/virtual_catalog.rs
- Security governance: src/operations/operational-policies.md
Defaults
- server.addr = 127.0.0.1:5152 as the safe default.
- Remote bind is forbidden by default without an explicit insecure override.
- TLS is opt-in by default; for remote bind, policy may require TLS fail-closed.
Required fail-closed gates
- Remote bind without allow_insecure must fail startup.
- Password auth without TLS must fail startup.
- Runtime settings changes (sys.set_setting) require the session_settings role (RM-0.6.4.16).
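The first two gates can be expressed as a pre-flight check. The sketch below is not the server's startup code: the function name and parameters are hypothetical, and it assumes server.addr is an IPv4 literal in host:port form.

```python
import ipaddress


def startup_gate_errors(addr: str, allow_insecure: bool = False,
                        tls_enabled: bool = False,
                        password_auth: bool = False) -> list:
    """Return the fail-closed violations for a given configuration."""
    host = addr.rsplit(":", 1)[0]
    errors = []
    # Anything that is not a loopback address counts as a remote bind here.
    remote_bind = not ipaddress.ip_address(host).is_loopback
    if remote_bind and not allow_insecure:
        errors.append("remote bind without allow_insecure")
    if password_auth and not tls_enabled:
        errors.append("password auth without TLS")
    return errors


print(startup_gate_errors("127.0.0.1:5152"))
print(startup_gate_errors("0.0.0.0:5152", password_auth=True))
```

A non-empty list corresponds to a refused startup; an empty list means these two gates raise no objection.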
Knobs registry (operator highlights)
- [security] allow_insecure, [security] dev_mode
- [tls] enabled, [tls] cert_path, [tls] key_path, [tls] require_on_remote_bind
- ANGARABASE_AUTH_MODE, ANGARABASE_TLS_ENABLE
- ANGARABASE_TDE_ENABLE, ANGARABASE_TDE_MASTER_KEY_ID
- ANGARABASE_AUDIT_LOG_PATH, ANGARABASE_AUDIT_DML_MODE
Secrets (for example, ANGARABASE_AUTH_PASSWORD, master key) must not appear in sys.settings.
Security modes matrix
- Local + strict/group_commit: allowed.
- Local + relaxed durability: allowed with warning.
- Remote bind: only with explicit override and warning.
- Remote bind + relaxed: only with override and stronger warning.
Threat model and evidence
For threat inventory and evidence pointers, use:
src/operations/operational-policies.md
Next
- Operational policies baseline: where the policies underlying security operations are defined.
- Backup and restore (operator-level): data protection as part of the SecOps perimeter.
- Disaster recovery playbook: security incidents as a DR case.
Upgrade and Migration
Key pre-v1 operator contract for on-disk format and migration procedures.
Goal
Pin fail-closed rules:
- data and WAL layout;
- format version/magic;
- startup guards;
- required actions when the on-disk format changes.
Current on-disk layout (as implemented)
- System DB data: base.adb
- System DB WAL: base.atl
- User DB data: <db_name>.adb
- User DB WAL: <db_name>.atl
- Init marker: VERSION (binary AVR1, CRC32C)
MVCC history is stored in .atl; separate mvcc_history.v1.bin is no longer created.
Startup behavior (fail-closed)
- non-dev startup without VERSION → reject.
- Text legacy VERSION is not supported.
- format_version above/below supported → reject.
- page_size from VERSION does not match compiled PAGE_SIZE → reject.
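The startup rules above amount to an ordered series of checks. The following sketch is illustrative only: the VersionMarker fields, the supported version range, and the acceptance of a missing marker in dev mode are assumptions, and the CRC32C check of the binary marker is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionMarker:
    magic: bytes          # b"AVR1" for the binary marker
    format_version: int
    page_size: int


def startup_guard(marker: Optional[VersionMarker], *, dev_mode: bool,
                  min_format: int, max_format: int,
                  compiled_page_size: int) -> Optional[str]:
    """Return a rejection reason, or None when startup may proceed."""
    if marker is None:
        # Only the non-dev rejection is documented; allowing dev mode is assumed.
        return None if dev_mode else "VERSION missing"
    if marker.magic != b"AVR1":
        return "legacy or unknown VERSION format"
    if not (min_format <= marker.format_version <= max_format):
        return "unsupported format_version"
    if marker.page_size != compiled_page_size:
        return "page_size mismatch"
    return None
```

Each check fails closed: the first rule that does not hold stops startup, and no later rule can override it.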
Format identifiers
- Storage page magic/version: APG1 / v3
- WAL record magic/version: ADB1 / v2 (v3 planned)
Single source of truth for magic/version:
crates/angarabase/src/on_disk.rs
Offline migration baseline
Recommended general path:
- Back up the old state.
- Run --init in the new format.
- Post-check startup and recovery.
In-place migration pre-v1 is limited and must be performed only through a documented runbook path.
Upgrade rehearsal
Before production rollout:
- run an upgrade rehearsal on staging;
- record artifacts in evidence;
- verify the rollback plan.
Detailed procedures:
- src/operations/testing-validation.md
- src/operations/backup-restore.md
Next
- Backup and restore (operator-level) — required step before starting the upgrade.
- Disaster recovery playbook — rollback and recovery scenarios for a failed upgrade.
- Replication v2 operations guide — upgrade in an active replication setup.
Backup and Restore
Key operator baseline for backup/restore. For details and extended procedures, see the migrated AngaraBook runbook flow.
Goal
Pin the verifiable workflow:
- what backup guarantees;
- how restore is performed;
- which artifacts confirm correctness.
Contract (cold/offline)
- Backup is performed with the server stopped.
- Backup includes:
  - storage.data_directory (including .adb);
  - storage.transaction_log_directory;
  - storage.undo_directory (if moved separately), or .aud from data_directory.
- The snapshot is consistent at the moment of stop.
- For TDE backup, restore requires valid key material (fail-closed).
What is out of scope
- No hot backup in this contract.
- No PITR.
- No incremental backup.
Pinned commands
CLI (operator-facing, preferred for packaged distribution):
angara-cli backup full --config /etc/angarabase/angarabase.conf --out /tmp/base_full.abk
angara-cli backup verify --file /tmp/base_full.abk --json
Restore via CLI:
angara-cli backup restore \
--config /etc/angarabase/angarabase.conf \
--file /tmp/base_full.abk \
--target-dir /tmp/angarabase-restore \
--overwrite
Legacy tools entrypoint (workspace/dev path):
tools/backup_restore/run.sh backup \
--data-dir /var/lib/angarabase/data \
--txlog-dir /var/lib/angarabase/transaction_log \
--out /tmp/angarabase-backup.tar.gz \
--root artifacts/backup_restore/backup
tools/backup_restore/run.sh restore \
--archive /tmp/angarabase-backup.tar.gz \
--dest /tmp/angarabase-restore \
--force \
--root artifacts/backup_restore/restore
Restore oracle (txlog-level):
tools/backup_restore/oracle.sh --root artifacts/backup_restore_oracle/run_1
Evidence surfaces
Minimum for triage:
summary.json(top-level outcome);backup_manifest.json(inspect);verify_report.json(verify);- oracle JSON (
txlog scan/replay-pages,compare.json).
Upgrade linkage
Before a version upgrade, it is recommended to:
- take a cold backup;
- save the archive as a rollback point;
- run the restore oracle on a separate directory.
Compatibility and on-disk policy: src/operations/upgrade-and-migration.md.
Remote Admin Flow (updated 2026-04-23 for TD-2026-0012)
For packaged distribution (apt install angarabase-server), angara-cli is preferred (remote admin over TCP
or local). Legacy tools/backup_restore/run.sh remains only for dev/workspace.
Remote admin flow:
- angara-cli backup ... --remote-admin (or through the configured admin endpoint).
- Runbooks in angarabook/src/operations/ now point to packaged CLI instead of direct tool calls.
- Evidence: tools/ci/backup_smoke.sh + golden_db restore oracle.
last_reviewed: 2026-04-23. Drift resolved.
Config Schema
Operator summary of the AngaraBase configuration surface.
Canonical source: this runbook in angarabook/src/operations/.
Goal
Pin the config contract:
- keys and sections;
- defaults;
- precedence and compatibility.
Core sections
- [server]: addr (main bind endpoint), host/port as deprecated.
- [storage]: data_directory, transaction_log_directory (wal_directory as alias), io_backend_strict.
- [logging]: log_level, log_directory.
- [transaction_log]: backend, durability, fsync, checkpoint_target_lsn_lag_mb, checkpoint_min_interval_s.
- [ops]: metrics_addr, admin_addr.
- [security]: allow_insecure, dev_mode, TDE metadata fields.
- [memory]: soft_limit_mb, hard_limit_mb, max_dataset_bytes.
- [wal]: max_size_mb, performance and observability knobs.
- [execution], [aqp], [diagnostics], [optimizer]: performance and observability knobs.
Config Strictness and Unknown Keys (RM-0.6.5.6)
Starting with RM-0.6.5.6, unknown keys in named sections ([server], [storage], [wal], etc.) cause a fatal error at server startup. This catches configuration typos that were previously ignored silently.
- On error, the server prints a hint (Levenshtein suggestion) for the closest existing key.
Example typo error in a key:
[ERROR] config: unknown key 'max_siz_mb' in section [wal]; did you mean 'max_size_mb'?
Diagnostics: if the server does not start, check stderr / wrapper.log:
grep -i "unknown key\|config:" artifacts/golden_db/logs/wrapper.log | tail -20
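The "did you mean" hint can be reproduced offline when reviewing a config before deployment. The sketch below is an illustration only: it ranks candidate keys by plain Levenshtein distance using awk. The helper name suggest_key is hypothetical, and the server's real tie-breaking rules and distance cutoff are not documented here, so treat the result as an approximation of the hint.

```shell
# Sketch only: suggest the closest known key for a mistyped one.
# Usage: suggest_key <typo> <candidate>...
suggest_key() {
  typo=$1
  shift
  # Remaining arguments are candidate keys, fed to awk one per line.
  printf '%s\n' "$@" | awk -v typo="$typo" '
    function min3(a, b, c,   r) {
      r = a
      if (b < r) r = b
      if (c < r) r = c
      return r
    }
    # Dynamic-programming Levenshtein distance between s and t.
    function lev(s, t,   i, j, n, m, d, cost) {
      n = length(s)
      m = length(t)
      for (i = 0; i <= n; i++) d[i, 0] = i
      for (j = 0; j <= m; j++) d[0, j] = j
      for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
          cost = (substr(s, i, 1) == substr(t, j, 1)) ? 0 : 1
          d[i, j] = min3(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
        }
      }
      return d[n, m]
    }
    {
      dist = lev(typo, $0)
      # Keep the first candidate with the smallest distance.
      if (NR == 1 || dist < best) { best = dist; key = $0 }
    }
    END { if (NR > 0) print key }
  '
}

# Prints max_size_mb (distance 1 from the typo).
suggest_key max_siz_mb soft_limit_mb hard_limit_mb max_size_mb
```

The candidate list here mixes keys from [memory] and [wal] purely for illustration; in practice you would pass only the keys valid for the section being checked.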
WAL and Checkpoint Tuning (RM-0.6.5.8)
- [wal] max_size_mb: Maximum WAL segment size. Applied since RM-0.6.5.6 (previously ignored).
  - Default: 512
  - ENV: ANGARABASE_WAL_MAX_SIZE_MB
  - Startup log: wal: max_size_mb=2048 MiB (source=config)
- [transaction_log] checkpoint_target_lsn_lag_mb: Target LSN lag for checkpoint trigger.
  - Default: 256
  - ENV: ANGARABASE_CHECKPOINT_LSN_LAG_TRIGGER_MB
  - Startup log: checkpoint: lsn_lag_trigger_mb=256
- [transaction_log] checkpoint_min_interval_s: Minimum interval between checkpoints in seconds.
  - Default: 300
  - ENV: ANGARABASE_CHECKPOINT_INTERVAL_MS (specified in milliseconds for ENV)
- [storage] io_backend_strict (default: false): Strict I/O backend verification mode.
- ANGARABASE_CHECKPOINT_BACKGROUND=true: Background checkpoint is now enabled by default.
Verify max_size_mb application at startup:
grep "wal: max_size_mb" artifacts/golden_db/logs/wrapper.log | tail -3
# Expected output: [INFO] wal: max_size_mb=512 MiB (source=config)
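The grep above can be turned into a pass/fail check for deployment scripts. This is a minimal sketch that assumes the startup line keeps exactly the format shown in this section ("wal: max_size_mb=N MiB (source=...)"); the function names are hypothetical, and source values other than config are an assumption.

```shell
# Extract "<value> <source>" from the last "wal: max_size_mb=..." line on stdin.
wal_max_size_from_log() {
  sed -n 's/.*wal: max_size_mb=\([0-9][0-9]*\) MiB (source=\([a-z_]*\)).*/\1 \2/p' | tail -n 1
}

# Compare the value the server reports with the value the operator expects.
# Returns 0 on match, non-zero on mismatch or when no startup line was found.
wal_max_size_matches() {
  expected_mb=$1
  actual=$(wal_max_size_from_log | cut -d' ' -f1)
  [ -n "$actual" ] && [ "$actual" -eq "$expected_mb" ]
}

# Example with the log line documented above.
printf '%s\n' '[INFO] wal: max_size_mb=512 MiB (source=config)' \
  | wal_max_size_matches 512 && echo "max_size_mb applied"
```

Against a real log, pipe the file instead of the sample line, for example: wal_max_size_matches 512 < artifacts/golden_db/logs/wrapper.log.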
Memory Limits (RM-0.6.5.8)
The [memory] section controls server process RAM consumption.
- soft_limit_mb: RSS (Resident Set Size) threshold in MiB at which the server starts emitting warnings.
  - Default: disabled (if not set)
  - Behavior: when the limit is crossed, angarabase_memory_soft_limit_exceeded_total is incremented.
  - Example: soft_limit_mb = 4096
- hard_limit_mb: Hard RSS limit in MiB.
  - Default: disabled (if not set)
  - Behavior: when the limit is exceeded, the server performs an emergency flush and exits with exit(1).
  - Example: hard_limit_mb = 8192
Index Maintenance and Durability (RM-0.6.5.8)
- Index Durability: Starting with RM-0.6.5.8, CREATE INDEX guarantees durability. After the command completes, the index is fully synchronized and available after recovery even if a crash happens immediately after creation.
- storage.max_index_pages_per_table (default: 65535): page limit per index.
- storage.index_maintenance_budget_ms (default: 5000): time budget for index maintenance in one DML command.
- visibility_map.rebuild_max_pages_per_tick (default: 1024): background VM rebuild rate.
- visibility_map.rebuild_io_budget_bytes (default: 10MB): I/O budget of the VM worker.
Init behavior (--init)
angarabase-server --init uses effective settings and creates the bootstrap layout:
- <root>/data
- <root>/txlog
- <root>/angarabase.conf (if an existing config path is not set)
Precedence
Precedence rule:
- default
- config (angarabase.conf)
- environment override
Contract: default -> config -> env.
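The precedence contract can be expressed as a small resolver, which is handy when explaining to a teammate why a value in angarabase.conf appears to be ignored. This is a sketch under assumptions: the function name is hypothetical, an empty string stands for "not set at this layer", and the source=env and source=default labels are illustrative (only source=config appears in the startup log documented above).

```shell
# Print the effective value under the contract default -> config -> env.
# An empty string means "not set at this layer" (an assumption of this sketch).
resolve_setting() {
  default_value=$1
  config_value=$2
  env_value=$3
  if [ -n "$env_value" ]; then
    printf '%s (source=env)\n' "$env_value"
  elif [ -n "$config_value" ]; then
    printf '%s (source=config)\n' "$config_value"
  else
    printf '%s (source=default)\n' "$default_value"
  fi
}

# [wal] max_size_mb: default 512, config sets 2048, environment may override.
resolve_setting 512 2048 "${ANGARABASE_WAL_MAX_SIZE_MB:-}"
```

With ANGARABASE_WAL_MAX_SIZE_MB unset, the last line reports the config value; exporting the variable makes the environment layer win.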
Critical env surface (operator minimum)
- ANGARABASE_TRANSACTION_LOG* (backend/durability/fsync)
- ANGARABASE_METRICS_ADDR
- ANGARABASE_TDE_*
- ANGARABASE_TLS_*
- ANGARABASE_MAX_DATASET_BYTES
- ANGARABASE_AQP_*
- ANGARABASE_GC_*
Backward compatibility policy
Breaking includes:
- renaming a key without an alias period;
- unsafe default change (for example, binding externally);
- semantic change without migration notes.
Non-breaking:
- new keys with safe defaults;
- new fail-closed checks for unsafe combinations with explicit override.
Spill / Temp Storage (RM-0.6.4.2 — Spill to Disk)
New ENV knobs control spill-to-disk (Grace Hash Join, External Merge Sort, Set Ops) when QueryMemoryBudget is exceeded. Default disabled (safe fail-closed on OOM 53100).
Core spill knobs:
- ANGARABASE_QUERY_SPILL_ENABLED=0: toggles the spill path; the default is 0 (set to 1 for analytical workloads).
- ANGARABASE_TEMP_MAX_BYTES_PER_QUERY (unlimited): per-query soft quota.
- ANGARABASE_TEMP_MAX_BYTES_TOTAL_* (SOFT/HARD): global spill limits, fail-closed on hard.
- ANGARABASE_TEMP_DIRECT_IO=0, ANGARABASE_TEMP_USE_O_TMPFILE=0, ANGARABASE_TEMP_OTMPFILE_DIRECT_FALLBACK=0: io_uring + O_DIRECT + O_TMPFILE profile (production-like, kernel-managed cleanup on crash).
- ANGARABASE_SPILL_HASH_JOIN_* (MAX_PARTITION_ROWS=8192, MAX_RECURSION_DEPTH=3, SKEW_THRESHOLD=75%, BLOOM_BITS=65536): tuning for recursion, skew handling, and the prefilter. Overflow → SQLSTATE 53400 graceful refusal.
Monitoring: see observability-metrics.md for angarabase_spill_*, angarabase_wal_* counters and
sys.wait_events.
See RM-0.6.4.2 Surface Map and RFC-2026-492 for full contract. Recommended for HTAP/TPC-H with low
ANGARABASE_QUERY_MEMORY_LIMIT_MB.
Next
- Operations overview — where config-schema fits into the broader operator material.
- Operational policies baseline — which configuration values are fixed by policy.
Observability Metrics Reference
Full AngaraBase metrics reference with diagnostic routes and a quick reference card.
Canonical source: this runbook in angarabook/src/operations/.
Quick Reference Card (Top-10 for wallboard)
Print this and keep it near the on-call desk. These 10 metrics cover 80% of production incidents.
| # | Metric | Type | Normal range | What crossing the boundary means |
|---|---|---|---|---|
| 1 | angarabase_connections_active | gauge | < 80% max_pool | Connection leak / missing PgBouncer — check angara_stat_activity |
| 2 | angarabase_txn_rollback_total (rate 1m) | counter rate | < 5% of commit rate | Abnormal rollback rate — MVCC conflicts, deadlock, or application bugs |
| 3 | angarabase_storage_dirty_pages_total | gauge | < 10,000 pages | Checkpoint cannot keep up — lower write rate or reduce checkpoint interval |
| 4 | angarabase_checkpoint_errors_total (change) | counter | 0 | Checkpoint error = critical incident; inspect logs immediately |
| 5 | angarabase_transaction_log_flush_lsn vs durable_lsn (delta) | gauge | < 1 MB | Large gap = WAL durability lag; data-loss risk on crash |
| 6 | angarabase_query_exec_duration_ms_bucket P99 | histogram | < 100 ms | P99 degradation — check angara_stat_activity + EXPLAIN |
| 7 | angarabase_buffer_pool_miss_total (rate) | counter rate | miss ratio < 20% | Low cache hit ratio — increase buffer_pool_size_mb |
| 8 | angarabase_memory_rss_bytes | gauge | < soft_limit*0.9 | Approaching soft limit — OOM risk; check query patterns + GC |
| 9 | angarabase_qos_rejected_critical_total (rate) | counter rate | 0 | Any CRITICAL rejections = production incident candidate |
| 10 | angarabase_uptime_seconds | gauge | monotonically increasing | Value < 60 after a pause = unexpected restart / crash |
Full Metrics Reference
Connections and Sessions
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_connections_active | gauge | Active client connections | < max_pool * 0.8 | Check pool config, connection leaks |
| angarabase_connections_accepted_total | counter | Total connections since startup | monotonic | Sudden rate spike — DDoS or reconnect storm |
| angarabase_pgwire_active_tasks | gauge | Active pgwire spawn_blocking tasks | ≤ max_blocking_threads | Saturation of blocking runtime path |
| angarabase_session_claims_set_total | counter | Session claims set operations (app.*) | — | Used for audit trail |
Connection diagnostics:
SELECT pid, state, consumer_id, wait_event FROM angara_stat_activity;
Transactions and MVCC
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_txn_begin_total | counter | Total BEGIN | — | Throughput baseline |
| angarabase_txn_commit_total | counter | Total COMMIT | — | rate(1m) = TPS |
| angarabase_txn_rollback_total | counter | Total ROLLBACK | < 5% of commit | Conflicts, application errors |
| angarabase_txn_active_count | gauge | Transactions in flight | < 100 (OLTP) | Long txns — check txn_oldest_snapshot_age_seconds |
| angarabase_txn_commit_conflicts_total | counter | MVCC conflicts | close to 0 | High rate = competing writes to the same rows |
| angarabase_txn_oldest_snapshot_age_seconds | gauge | Age of oldest snapshot | < 60s | Long snapshot blocks GC → GC bloat |
| angarabase_mvcc_history_versions_total | gauge | Versions in MVCC store | grows slowly | Fast growth = GC cannot keep up (see MVCC GC runbook) |
| angarabase_txn_commit_epoch_current | gauge | Current commit epoch | monotonic | Does not change for > 30s under load = WAL issue |
PromQL — TPS:
rate(angarabase_txn_commit_total[1m])
PromQL — Conflict ratio:
rate(angarabase_txn_commit_conflicts_total[5m]) / rate(angarabase_txn_commit_total[5m])
WAL and durability
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_transaction_log_flush_lsn | gauge | LSN of last flush | monotonic | Growth stops = WAL writer hung |
| angarabase_transaction_log_durable_lsn | gauge | LSN of last fsync | ≤ flush_lsn | gap > 1 MB = durability lag |
| angarabase_transaction_log_last_checkpoint_id | gauge | ID of last checkpoint | monotonic | — |
| angarabase_transaction_log_checkpoint_end_valid_total | counter | Successful checkpoint ends | monotonic | — |
| angarabase_transaction_log_checkpoint_end_invalid_total | counter | Invalid checkpoint ends | 0 | > 0 = WAL corruption |
| angarabase_wal_sync_wait_total | counter | WAL sync waits (strict mode) | — | rate grows = I/O latency |
| angarabase_wal_group_commit_wait_total | counter | WAL group commit waits | — | rate grows = group commit backlog |
| angarabase_transaction_log_bytes_appended_total | counter | Bytes written to WAL | — | WAL write throughput |
PromQL — WAL durability gap (bytes):
angarabase_transaction_log_flush_lsn - angarabase_transaction_log_durable_lsn
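When Prometheus itself is unavailable, the same gap can be computed from a single scrape of the metrics endpoint. The helper below is a sketch, not part of any AngaraBase tooling: the function name is hypothetical, and it assumes both gauges are exported without labels, as the table above suggests.

```shell
# Read Prometheus text exposition on stdin and print flush_lsn - durable_lsn.
# Exits non-zero if either gauge is missing from the input.
wal_durability_gap() {
  awk '
    $1 == "angarabase_transaction_log_flush_lsn"   { flush = $2; seen_flush = 1 }
    $1 == "angarabase_transaction_log_durable_lsn" { durable = $2; seen_durable = 1 }
    END {
      if (!(seen_flush && seen_durable)) exit 1
      # %.0f avoids integer truncation of large LSNs in some awk builds.
      printf "%.0f\n", flush - durable
    }
  '
}

# Example on a captured scrape; prints 1048576 (a 1 MiB gap).
printf '%s\n' \
  'angarabase_transaction_log_flush_lsn 2097152' \
  'angarabase_transaction_log_durable_lsn 1048576' | wal_durability_gap
```

On a live node the input would come from the metrics endpoint, for example curl -sf http://127.0.0.1:9898/metrics | wal_durability_gap, with the address taken from ANGARABASE_METRICS_ADDR.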
Storage and buffer pool
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_storage_dirty_pages_total | gauge | Dirty pages in memory | < 10,000 | Checkpoint lag; reduce write rate or checkpoint_interval |
| angarabase_storage_cached_pages_total | gauge | Cached pages | grows up to bp size | Sudden drop = eviction storm |
| angarabase_buffer_pool_hit_total | counter | Cache hits | — | hit rate = hits / (hits + misses) |
| angarabase_buffer_pool_miss_total | counter | Cache misses | — | miss rate > 20% = larger buffer pool needed |
| angarabase_buffer_pool_warmup_pages_total | counter | Pages loaded during warmup | — | After restart |
| angarabase_storage_flush_ok_total | counter | Successful flushes | monotonic | — |
| angarabase_storage_backpressure_events_total | counter | Backpressure events | 0 | > 0 = writer faster than disk |
| angarabase_storage_backpressure_commit_rejected_total | counter | Commit rejected by backpressure | 0 | I/O performance is insufficient |
| angarabase_storage_flush_bytes_total | counter | Bytes flushed to disk | — | I/O write throughput |
PromQL — Buffer pool hit ratio:
rate(angarabase_buffer_pool_hit_total[5m]) /
(rate(angarabase_buffer_pool_hit_total[5m]) + rate(angarabase_buffer_pool_miss_total[5m]))
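The ratio above works on rates, so the per-second divisor cancels out and only the counter deltas over the window matter. The sketch below reproduces that arithmetic from two manual scrapes; the function name is hypothetical, and the arguments are the hit and miss counters read at the start and end of the window.

```shell
# Hit ratio over a window, from counters sampled at its start (h1, m1) and end (h2, m2).
# Exit 2: a counter went backwards (restart between scrapes). Exit 1: no traffic.
buffer_pool_hit_ratio_between() {
  awk -v h1="$1" -v m1="$2" -v h2="$3" -v m2="$4" 'BEGIN {
    dh = h2 - h1
    dm = m2 - m1
    if (dh < 0 || dm < 0) exit 2
    if (dh + dm == 0) exit 1
    printf "%.4f\n", dh / (dh + dm)
  }'
}

# 900 new hits and 100 new misses in the window: prints 0.9000.
buffer_pool_hit_ratio_between 1000 200 1900 300
```

A result of 0.9000 corresponds to a 10% miss ratio, which is inside the 20% boundary quoted in the table.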
Checkpoint and bgwriter
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_checkpoint_total | counter | Successful checkpoints | > 0 in 5 min | = 0 for 10 min = checkpoint stopped |
| angarabase_checkpoint_errors_total | counter | Checkpoint errors | 0 | Inspect logs immediately |
| angarabase_checkpoint_dirty_pages | gauge | Dirty pages at checkpoint time | < 5,000 | High value = checkpoint cannot keep up |
| angarabase_checkpoint_duration_ms_sum | counter | Total checkpoint time (ms) | — | avg = sum/count |
| angarabase_checkpoint_aborted_total | counter | Aborted checkpoints | 0 | > 0 = cancellations; check reason |
| angarabase_checkpoint_per_db_timeout_total | counter | Per-DB checkpoint timeouts | 0 | timeout = disk too slow |
| angarabase_angarabase_wal_forced_checkpoints_total | counter | Forced checkpoints due to backpressure | 0 | > 0 = write pressure is critical |
SQL — bgwriter state:
SELECT * FROM angara_stat_bgwriter;
PromQL — checkpoint avg duration:
rate(angarabase_checkpoint_duration_ms_sum[5m]) / rate(angarabase_checkpoint_duration_ms_count[5m])
Query execution
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_query_exec_total_ok_select | counter | SELECT queries OK | — | QPS baseline |
| angarabase_query_exec_total_ok_write | counter | Write queries OK | — | Write TPS |
| angarabase_query_exec_total_err_select | counter | SELECT errors | close to 0 | rate grows = bugs or overload |
| angarabase_query_exec_duration_ms_bucket | histogram | Latency distribution | P99 < 100ms | P99 > 500ms = degradation |
| angarabase_slow_query_total | counter | Slow queries (> threshold) | 0 | > 0 = EXPLAIN slow queries needed |
| angarabase_sql_routing_not_supported_total | counter | Unsupported SQL routes | 0 | > 0 = application uses unsupported SQL |
| angarabase_legacy_fallback_triggered_total | counter | Legacy path fallbacks | 0 | > 0 = unsupported query plan |
| angarabase_simd_agg_fallback_total | counter | SIMD aggregation fallback to scalar path | 0 | > 0 = AVX2/NEON support missing or type incompatibility |
| angarabase_adaptive_probe_swap_total | counter | Number of adaptive Hash Join side swaps | — | Shows optimizer activity under table-size skew |
PromQL — P99 latency:
histogram_quantile(0.99,
rate(angarabase_query_exec_duration_ms_bucket[5m])
)
SQL — slow queries:
SELECT query, calls, mean_exec_time_ms, max_exec_time_ms
FROM angara_stat_statements
ORDER BY mean_exec_time_ms DESC LIMIT 10;
Memory
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_memory_rss_bytes | gauge | Process RSS (bytes) | < soft_limit * 0.9 | OOM risk; check query patterns |
| angarabase_memory_soft_limit_exceeded_total | counter | soft_limit_mb crossings | 0 | > 0 = memory under pressure |
| angarabase_tx_overlay_dataset_bytes_total | gauge | In-memory tx overlay size | < 512 MB | Large txns keep much data in memory |
QoS Scheduler
| Metric | Type | What it measures | Normal | Crossing the boundary |
|---|---|---|---|---|
| angarabase_qos_rejected_critical_total | counter | CRITICAL queue rejections | 0 | Incident candidate — immediate triage |
| angarabase_qos_rejected_interactive_total | counter | INTERACTIVE queue rejections | 0 | User-facing degradation |
| angarabase_qos_rejected_background_total | counter | BACKGROUND queue rejections | — | Reduce background concurrency |
| angarabase_qos_blocking_inflight | gauge | Blocking tasks | < max_blocking | scheduler saturation |
| angarabase_spawn_blocking_active | gauge | Active spawn_blocking | < max_blocking | — |
Troubleshooting by Dashboard
Route 1: High P99 latency
angarabase_query_exec_duration_ms P99 > 500ms?
│
├─ Yes → angara_stat_activity: any waiting sessions?
│ │
│ ├─ Yes (wait_event != '') → Lock contention or WAL sync wait
│ │ → check angarabase_txn_commit_conflicts_total
│ │ → check angarabase_wal_sync_wait_total
│ │
│ └─ No → angara_stat_statements: top queries by max_exec_time_ms
│ → EXPLAIN the top query
│ → check buffer_pool_miss_total rate (I/O bound?)
│
└─ No → baseline normal, false alarm
SQL:
SELECT query, calls, max_exec_time_ms, mean_exec_time_ms
FROM angara_stat_statements
ORDER BY max_exec_time_ms DESC LIMIT 5;
Route 2: QPS Drop (sudden SELECT rate drop)
rate(angarabase_query_exec_total_ok_select[1m]) dropped sharply?
│
├─ connections_active also dropped → process restarted? uptime < 60s?
│ → check logs for panic / OOM / segfault
│
├─ connections_active high, QPS low → scheduler saturation?
│ → qos_rejected_* > 0?
│ → qos_blocking_inflight high?
│ → spawn_blocking_active ≈ spawn_blocking_max?
│
└─ Connections normal → long transaction blocking?
→ angara_stat_activity WHERE state = 'idle in transaction'
→ txn_oldest_snapshot_age_seconds > 60s?
Route 3: GC Pressure / MVCC bloat
mvcc_history_versions_total grows monotonically without decrease?
│
├─ txn_oldest_snapshot_age_seconds > 120s → long open snapshot
│ → find pid from angara_stat_activity ORDER BY query_start ASC
│ → terminate or wait for completion
│
├─ columnar_pending_deleted_rows > 1M → compaction lagging
│ → check Background Compactor in angara_stat_activity
│ → temporarily SET angarabase.compaction_enabled = true
│
└─ memory_rss_bytes grows together → GC bloat + memory pressure
→ see mvcc-gc.md runbook
Route 4: Checkpoint Issues
checkpoint_errors_total changed?
│
├─ Yes → inspect logs immediately (disk full? I/O error?)
│ → storage_backpressure_events_total > 0?
│ → df -h on data directory
│
└─ No, but dirty_pages_total high (> 10,000)?
→ checkpoint cannot keep up with writes
→ lower checkpoint_interval_ms
→ or limit write throughput
→ SQL: SELECT * FROM angara_stat_bgwriter;
Memory and Buffer Pool Metrics (RM-0.6.5.8)
Goal
Keep the minimum sufficient signal set for:
- durability;
- concurrency/locks;
- storage/checkpoint;
- recovery.
Metrics source
- ANGARABASE_METRICS_ADDR=host:port
- endpoint: GET /metrics (Prometheus format)
Must-have groups
- Transactions / concurrency
- Transaction log / durability
- Locks
- Storage / writeback / checkpoint
- Query diagnostics / stats
- Recovery / replay outcomes
Memory and Buffer Pool Metrics (RM-0.6.5.8)
| Metric | Type | Meaning |
|---|---|---|
| angarabase_memory_rss_bytes | gauge | Resident Set Size of the server process in bytes. Updated every 5s. |
| angarabase_memory_soft_limit_exceeded_total | counter | Number of soft_limit_mb threshold crossings (edge-trigger). |
| angarabase_buffer_pool_warmup_evictions_during_warmup_total | counter | Number of page evictions from buffer pool during warmup (warmup cap enforcement). |
| angarabase_buffer_pool_warmup_completed_pages | counter | Number of pages loaded during warmup. |
| angarabase_buffer_pool_warmup_aborted_at_cap_total | counter | Warmup aborted because cap was exceeded (>95%). |
PromQL — Alert when approaching soft limit:
# Replace <soft_limit_bytes> with soft_limit_mb * 1024 * 1024
# For example, for soft_limit_mb = 4096: threshold = 4294967296
angarabase_memory_rss_bytes > <soft_limit_bytes> * 0.9
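The placeholder substitution is plain arithmetic and is easy to script when generating alert rules. The following sketch uses hypothetical helper names and shell integer arithmetic, so the 90% threshold is truncated to whole bytes.

```shell
# Convert soft_limit_mb (MiB) into the 90% alert threshold in bytes.
# Integer arithmetic: the result is truncated toward zero.
soft_limit_alert_threshold_bytes() {
  soft_limit_mb=$1
  echo $(( soft_limit_mb * 1024 * 1024 * 9 / 10 ))
}

# Print a PromQL expression with the threshold already substituted.
soft_limit_alert_expr() {
  printf 'angarabase_memory_rss_bytes > %s\n' "$(soft_limit_alert_threshold_bytes "$1")"
}

# For soft_limit_mb = 4096 this prints:
# angarabase_memory_rss_bytes > 3865470566
soft_limit_alert_expr 4096
```

Precomputing the number keeps the alert rule free of per-host arithmetic, at the cost of regenerating the rule whenever soft_limit_mb changes.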
Storage and Checkpoint Metrics (RM-0.6.5.8)
| Metric | Type | Meaning |
|---|---|---|
| angarabase_checkpoint_total | counter | Total number of completed checkpoints. > 0 after 5 min uptime confirms auto-checkpoint is working. |
Visibility Map and Index-Only Scan (RM-0.6.4.3)
| Metric | Type | Meaning |
|---|---|---|
| angarabase_visibility_map_all_visible_fraction | gauge | Share of all-visible pages (planner signal). |
| angarabase_index_only_scan_hits_total | counter | Successful Index-Only Scan (without Heap access). |
| angarabase_index_only_scan_heap_fetches_total | counter | Fallback to Heap during Index-Only Scan (VM bit=0). |
| angarabase_visibility_map_rebuild_pages_remaining | gauge | Remaining pages for background VM rebuild. |
| angarabase_visibility_map_corrupt_total | counter | Detected VM corruptions (rebuild trigger). |
Specific metric names linked to dashboard panels: see the table. The full name contract is pinned by a test; link below in “Contract pinning”.
New RM-0.6.4.0 Metrics (WAL Commit Path + Durability)
Added in Sprint 2/3 RM-0.6.4.0 (RFC-2026-090). Cover the new sync_at_commit
mode and the durability barrier group.
curl -sf http://127.0.0.1:9898/metrics | rg "wal_(sync_wait|group_commit_wait)|wait_events_total\\{event=\"wal_"
| Metric | Type | Meaning |
|---|---|---|
| angarabase_wal_sync_wait_total | counter | Number of commit-wait events on the IO::WalSync path (strict durability). |
| angarabase_wal_group_commit_wait_total | counter | Number of commit-wait events on the IO::WalGroupCommit path (batched durability wait). |
| angarabase_wait_events_total{event="wal_sync"} | counter | Unified wait-event counter for the WAL sync path. |
| angarabase_wait_events_total{event="wal_group_commit"} | counter | Unified wait-event counter for the group-commit path. |
Diagnostics by mode
- relaxed: wal_sync_wait_total and wal_group_commit_wait_total are close to 0.
- group_commit: wal_group_commit_wait_total grows; wal_sync_wait_total is usually noticeably lower.
- sync_at_commit / strict: wal_sync_wait_total grows; wait_events_total{event="wal_sync"} reflects long-term sync-path load.
Durability mode is checked through env ANGARABASE_TRANSACTION_LOG_DURABILITY.
SQL SET durability / COMMIT WITH DURABILITY are reserved for v0.6.5 → SQLSTATE 0A000.
Details: WAL writer contract spec (wal_writer_contract_v0.md) and RFC-2026-090.
HTAP / Vector Execution Metrics (RM-0.6.4.13 / RM-0.6.4.14 / RM-0.6.6.9)
HTAP-specific metrics for diagnosing vector and stream execution paths.
The label contract is stable starting with v0.6.x.
curl -sf http://127.0.0.1:9898/metrics | grep -E "scan_stream|vector_fallback|vector_memory|columnar_manifest|vector_columnar_native|columnar_batched_scan|segments_pruned|parallel_agg"
| Metric | Type | Meaning |
|---|---|---|
| angarabase_scan_stream_materialize_total{reason="batch_to_rows"} | counter | Materialization at batch→rows boundary. |
| angarabase_scan_stream_materialize_total{reason="drain_rows_default"} | counter | Materialization through drain_rows (fallback default). |
| angarabase_scan_stream_materialize_total{reason="stream_to_relation_boundary"} | counter | Materialization at stream→relation boundary. |
| angarabase_scan_stream_fallback_total | counter | Stream-plan fallback to legacy executor. |
| angarabase_vector_fallback_total | counter | Vector-path fallback to row path (unsupported plan or type error). |
| angarabase_vector_columnar_native_total | counter | Successful native vector-path activations for columnar tables. |
| angarabase_columnar_batched_scan_batches_total | counter | Total processed columnar batches in native path. |
| angarabase_columnar_segments_pruned_total | counter | Number of segments pruned by metadata (zone-map pruning). |
| angarabase_parallel_agg_total | counter | Number of parallel aggregator runs. |
| angarabase_vector_memory_budget_exceeded_total | counter | Vector budget allocation refusal (SQLSTATE 53100). |
| angarabase_columnar_manifest_init_failed_total | counter | SegmentManifest init error during CREATE TABLE USING COLUMNAR. |
Note: reason= labels on angarabase_scan_stream_materialize_total are a stable operator-facing contract within v0.6.x.
Columnar DV Pressure (RM-0.6.4.19 Track C C2)
angarabase_columnar_pending_deleted_rows — signed gauge showing the total
number of logically deleted rows in live segments that have not yet been reclaimed by compaction.
- Increment on AttachDeleteVector (on every columnar DELETE): +row_count from the DV op.
- Decrement on compact_l0_to_l1: -rows_reclaimed by number of rows not included in the L1 pack.
Normally, the gauge grows after DELETE and decreases after a Background Compactor run. If the gauge grows monotonically, compaction is lagging or fully disabled.
curl -sf http://127.0.0.1:9898/metrics | rg "pending_deleted_rows"
Alert rule (DV fragmentation)
# Alert if accumulated DV pressure > 5 million rows.
angarabase_columnar_pending_deleted_rows > 5_000_000
Recommended severity:
- warning when >1M rows: compaction is likely lagging;
- critical when >10M rows: scan performance degradation is possible.
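The two severity thresholds can be applied to a raw gauge reading in an ad-hoc health check. This sketch uses a hypothetical function name and assumes the value is passed as a plain integer; an exposition value in scientific notation would need converting first.

```shell
# Map a pending-deleted-rows reading to the recommended severity.
# The gauge is signed, so negative readings are accepted and reported as ok.
dv_pressure_severity() {
  pending_rows=$1
  if [ "$pending_rows" -gt 10000000 ]; then
    echo critical
  elif [ "$pending_rows" -gt 1000000 ]; then
    echo warning
  else
    echo ok
  fi
}

dv_pressure_severity 2500000   # prints warning
```

The comparisons are strict, matching the ">1M" and ">10M" wording: a reading of exactly 1,000,000 still reports ok.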
Interpretation:
- gauge ≤ 0 — normal (all DV reclaimed, possibly a small transient underflow during replay);
- gauge grows without decrease for > 30 minutes: check Background Compactor (angara_stat_activity, angarabase_columnar_compaction_total).
| Metric | Type | Meaning |
|---|---|---|
| angarabase_columnar_pending_deleted_rows | gauge (signed) | Net pending-deleted rows across all columnar segments. |
Heap fetch fallback reason metrics (RM-0.6.5.6)
- angarabase_heap_point_fetch_fallback_reason_stale_tid_index_total: fallback due to stale tid index
- angarabase_heap_point_fetch_fallback_reason_not_found_total: fallback due to row not found
Quick check (curl):
curl -s http://localhost:8080/metrics | grep "fallback_reason"
# angarabase_heap_point_fetch_fallback_reason_stale_tid_index_total 0
# angarabase_heap_point_fetch_fallback_reason_not_found_total 0
PromQL — fallback rate by reason:
rate(angarabase_heap_point_fetch_fallback_reason_stale_tid_index_total[5m])
rate(angarabase_heap_point_fetch_fallback_reason_not_found_total[5m])
If stale_tid_index grows, there may be an issue with the V3 chain path or index rebuild. If not_found grows, data loss or an MVCC visibility bug is possible.
QoS Scheduler and spawn_blocking (RM-0.6.4.10 / RM-0.6.4.19)
RM-0.6.4.10 adds runtime signals for QoS scheduler and blocking path. They
help distinguish SQL contention from scheduler saturation: if QoS
rejections or qos_blocking grow, the problem is in execution queues, not in
row/table locks.
curl -sf http://127.0.0.1:9898/metrics | rg "qos_(queued|rejected|blocking)|spawn_blocking"
| Metric | Type | Meaning |
|---|---|---|
| angarabase_qos_queued_critical_total | counter | Total tasks placed in QoS CRITICAL queue. |
| angarabase_qos_queued_interactive_total | counter | Total tasks placed in QoS INTERACTIVE queue. |
| angarabase_qos_queued_background_total | counter | Total tasks placed in QoS BACKGROUND queue. |
| angarabase_qos_rejected_critical_total | counter | CRITICAL queue rejections with SQLSTATE 53600. |
| angarabase_qos_rejected_interactive_total | counter | INTERACTIVE queue rejections with SQLSTATE 53600. |
| angarabase_qos_rejected_background_total | counter | BACKGROUND queue rejections with SQLSTATE 53600. |
| angarabase_qos_blocking_inflight | gauge | Current blocking tasks across QoS shards. |
| angarabase_spawn_blocking_max | gauge | spawn_blocking thread limit from max_blocking_threads; 0 before startup init. |
| angarabase_spawn_blocking_active | gauge | Active spawn_blocking tasks. Incremented at start, decremented on completion via SpawnBlockingGuard (RM-0.6.4.19 Track C C2). |
QoS queues by level:
rate({__name__=~"angarabase_qos_queued_.*_total"}[5m])
QoS rejections by level:
rate({__name__=~"angarabase_qos_rejected_.*_total"}[5m])
Alert on any scheduler rejection:
sum(rate({__name__=~"angarabase_qos_rejected_.*_total"}[5m])) > 0
Blocking pressure:
angarabase_qos_blocking_inflight > 0
Blocking budget headroom:
angarabase_spawn_blocking_max - angarabase_spawn_blocking_active
Interpretation:
- queued_background_total grows but no rejected_*: scheduler accepts batch workload; usually normal;
- rejected_background_total grows: batch/ETL is too aggressive; lower concurrency or raise ANGARABASE_QOS_MAX_QUEUED;
- rejected_critical_total grows: production incident candidate, since CRITICAL workload should not regularly hit the queue cap;
- qos_blocking_inflight > 0 together with growth in qos_blocking wait event means pressure in the blocking runtime path.
Query Execution Duration Histogram (RM-0.6.5.10)
angarabase_query_exec_duration_ms — histogram of SQL query execution latency.
Note (RM-0.6.5.10 S6): histogram_quantile(0.99) is correct only if the value is < 10,000 ms. At p99 ≥ 10,000 ms, inspect the share in bucket +Inf. Buckets: [1,5,10,50,100,500,1000,2500,5000,10000,+Inf] ms.
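The share of observations that fall only into the +Inf bucket can be read from one scrape: it is the difference between the cumulative +Inf count and the cumulative count of the last finite bucket, divided by the +Inf count. The sketch below uses a hypothetical function name and assumes the bucket series carry no labels other than le and that the last finite bound is rendered as le="10000".

```shell
# Fraction of queries slower than the last finite bucket (10000 ms).
# Reads Prometheus text exposition on stdin; exits 1 if no samples were found.
overflow_bucket_share() {
  awk '
    index($1, "angarabase_query_exec_duration_ms_bucket{le=\"10000\"}") == 1 { finite = $2 }
    index($1, "angarabase_query_exec_duration_ms_bucket{le=\"+Inf\"}") == 1  { total = $2 }
    END {
      if (total == 0) exit 1
      printf "%.4f\n", (total - finite) / total
    }
  '
}

# 990 of 1000 queries finished within 10 s: prints 0.0100.
printf '%s\n' \
  'angarabase_query_exec_duration_ms_bucket{le="10000"} 990' \
  'angarabase_query_exec_duration_ms_bucket{le="+Inf"} 1000' | overflow_bucket_share
```

Once this share reaches 0.01 or more, the 99th percentile lies in the +Inf bucket, and histogram_quantile can only report the upper bound of the highest finite bucket, so the P99 panel flattens at 10,000 ms regardless of the true latency.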
SLO-oriented usage
- Latency: histogram_quantile() over *_bucket (p95/p99)
- Throughput: rate() over counters
- Errors/contention: conflict/timeout/deadlock rates
- Saturation: backpressure counters and queue depth
Contract pinning
Must-have metric names are considered part of the operability contract and are protected by a test:
- crates/angarabase/src/metrics.rs
- prometheus_export_contains_must_have_metrics_names
Next
- Performance tuning guide — which metrics to read first during degradation.
- Parallel runtime observability runbook — narrow metrics for the parallel runtime.
- MVCC and GC operator minimum — separate MVCC/GC metrics and alerts package.
Columnar / HTAP Tables
Source of truth: RFC-2026-092, RM-0.6.4.0 Sprint 3.
What Is HTAP row-column store
AngaraBase supports storing tables in the HTAP row-column format (HtapRowColumn):
rows are written through the regular WAL, and columns are segmented in ColumnStore (SegmentManifest).
This allows serving OLTP transactions and AP scans from one table.
Vector Pipeline (OLAP)
High-performance analytics (OLAP) runs on the Vector Pipeline, which applies SIMD and parallel aggregation directly to columnar data.
Creating an HTAP Table (copy-paste ready)
There are three equivalent syntaxes:
-- Variant 1: PostgreSQL-style USING (recommended)
CREATE TABLE metrics (
ts TIMESTAMPTZ NOT NULL,
device_id INT NOT NULL,
value FLOAT8
) USING COLUMNAR;
-- Variant 2: WITH option (explicit alias)
CREATE TABLE events (
id SERIAL PRIMARY KEY,
data TEXT
) WITH (storage = 'columnar');
-- Variant 3: canonical engine name
CREATE TABLE events2 (
id SERIAL PRIMARY KEY,
data TEXT
) WITH (storage = 'htap_row_column');
All three forms are equivalent: the table is created with TableStorageEngineV0::HtapRowColumn
in the catalog.
Checking engine in catalog
-- Through sys_catalog (RM-0.6.4.0)
SELECT table_name, storage_engine
FROM sys.tables
WHERE table_name = 'metrics';
Background Compaction (L1 Compaction)
Starting with RM-0.6.4.8, AngaraBase supports background compaction (L1 Compaction) for columnar tables.
The compactor merges small L0 segments into larger L1 pack files, reducing inode pressure and improving compression.
The process is fully transparent.
Metrics for monitoring compaction:
- angarabase_columnar_compaction_total
- angarabase_columnar_compaction_duration_ms
- angarabase_columnar_direct_io_fallback_total
Errors and Their Meaning
Unknown USING method → 0A000
ERROR: USING brin is not a supported storage method; ...
SQLSTATE: 0A000 (feature_not_supported)
What to do: use USING COLUMNAR, USING htap_row_column (→ HTAP),
or omit USING (→ row_store by default).
Invalid storage value
ERROR: storage expects one of: row_store, memory, columnar, htap_row_column
SQLSTATE: 42601 (syntax_error)
What to do: use one of the listed values.
Non-goals (Phase 1, RM-0.6.4.0)
- SegmentManifest auto-init on table creation — deferred to Sprint 4 (OQ-S3-3).
- OOM protection for Column Cache — Sprint 4 (OQ-S3-4).
- Full AP scan optimization — RFC-2026-092 Phase 2.
Related
- Performance tuning guide — tuning for mixed OLTP+HTAP workloads.
- Observability metrics — columnar store metrics.
- RFC-2026-092: HTAP Sync and Materialization v1.
Wait Events
AngaraBase 0.6.3.9 §S11 — baseline wait events model. RM-0.6.4.10 adds QoS scheduler events. RM-0.6.4.19 Track C C1 adds per-session counters and per-session query
angara_stat_wait_events.
This page describes the WaitEvent taxonomy that AngaraBase uses to
classify blocking operations. For operators, wait events answer the question:
“what is the cluster waiting on right now?” without strace, manual stack trace analysis, or
instrumenting every call site.
Why the wait events model is needed
The model is similar to pg_stat_activity.wait_event in PostgreSQL or
sys.dm_os_wait_stats in SQL Server:
- every blocking code section gets a specific wait reason;
- current wait is visible at the session/activity level;
- aggregated metrics provide rate, active count, and latency distribution for each event;
- dashboards compare different wait classes through a unified label
event=<variant_snake_case>.
Two observability layers:
- Current session wait: angara_stat_activity.wait_event_type and angara_stat_activity.wait_event.
- Aggregated Prometheus metrics: counters, gauges, and histograms for each wait event.
Events
WaitEvent is a stable public API. Adding a variant is non-breaking:
dashboards will see the new event=... label after upgrade. Deleting, renumbering,
or changing the as_str() value is considered a breaking change.
| Variant | Label | Wait type | When it fires |
|---|---|---|---|
| RowLock | row_lock | Lock | Waiting for tuple-level lock. |
| PageLock | page_lock | Lock | Waiting for page-level latch. |
| TableLock | table_lock | Lock | Waiting for relation-level lock for DDL/lock manager. |
| TransactionLock | transaction_lock | Lock | Waiting for another transaction to commit/finish. |
| PredicateLockAcquire | predicate_lock_acquire | Lock | Waiting to acquire predicate lock for SSI foundation. |
| PredicateConflictCheck | predicate_conflict_check | Lock | Waiting to check predicate conflict graph. |
| PageRead | page_read | IO | Reading heap/index page on cache miss. |
| PageWrite | page_write | IO | Page write-back during checkpoint or eviction. |
| WalFlush | wal_flush | IO | WAL flush / fsync path. |
| Fsync | fsync | IO | Other fsync paths: catalog, FPI, and related operations. |
| WalSync | wal_sync | IO | Strict WAL sync wait in durability path. |
| WalGroupCommit | wal_group_commit | IO | Waiting for group commit batch. |
| ColumnarCompaction | columnar_compaction | IO | Background compactor waits for disk I/O or manifest append mutex in compact_l0_to_l1(). |
| ClientRead | client_read | Net | Reading from client socket. |
| ClientWrite | client_write | Net | Writing to client socket. |
| ReplicaRead | replica_read | Net | Reading from replica connection. |
| ReplicaWrite | replica_write | Net | Writing to replica connection. |
| NetRead | net_read | Net | Generic network read. |
| NetWrite | net_write | Net | Generic network write. |
| CpuRun | cpu_run | CPU | Session is running on CPU; this is not blocking. |
| PageDecompression | page_decompression | CPU | CPU time for page decompression on buffer-pool miss. |
| PageCompression | page_compression | CPU | CPU time for dirty page compression before flush. |
| AdmissionQueue | admission_queue | Scheduler | Waiting for admission control queue. |
| IoSchedulerQueue | io_scheduler_queue | Scheduler | Waiting for I/O scheduler queue. |
| MemoryGrantQueue | memory_grant_queue | Scheduler | Waiting for memory grant. |
| BufferPoolEviction | buffer_pool_eviction | Scheduler | Session waits for a free or evictable buffer-pool slot. |
| BackpressureThrottle | backpressure_throttle | Scheduler | Unified backpressure coordinator throttles caller. |
| DiskRestartHarness | disk_restart_harness | Scheduler | Test harness waits for on-disk state re-hydration in disk-restart test. |
| QosQueue | qos_queue | Scheduler | Async task is in per-shard DRR queue of the QoS scheduler before dispatch. |
| QosBlocking | qos_blocking | Scheduler | Blocking task waits for dispatch through the QoS blocking path. |
QoS Events RM-0.6.4.10
qos_queue means the task has already been classified by service level
(critical, interactive, background) and is waiting for dispatch in the scheduler queue.
Growth in this wait usually indicates scheduler saturation or a load burst.
qos_blocking means the task entered the blocking path of the QoS scheduler.
Watch it together with gauges angarabase_qos_blocking_inflight and
angarabase_spawn_blocking_max: if blocking wait grows and inflight is close to the
limit, cluster pressure is in the runtime/blocking pool, not SQL locks.
In Sprint 2A, service-level granularity is intentionally coarse: there are no separate
qos_queue_critical, qos_queue_interactive, or qos_queue_background events.
For a per-service-level breakdown, use the QoS counters
angarabase_qos_queued_*_total and angarabase_qos_rejected_*_total.
Ordinals and compatibility
Ordinals are append-only and pinned in WaitEvent::ordinal():
- QosQueue has ordinal 28;
- QosBlocking has ordinal 29.
The WaitEvent::ALL array is used to render all label values in metrics.
The fixed metrics array size is defined by N_WAIT_EVENT_VARIANTS.
Compatibility rules:
- adding a variant: non-breaking;
- deleting a variant: breaking;
- renumbering an ordinal: breaking;
- renaming a label value returned by as_str(): breaking for dashboards and alerts.
Per-session wait events (RM-0.6.4.19 Track C C1)
Starting with RM-0.6.4.19, angara_stat_wait_events supports per-session mode:
-- Process-wide aggregates (as before):
SELECT * FROM angara_stat_wait_events;
-- Per-session counters of the current session:
SELECT * FROM angara_stat_wait_events WHERE session_id = current_session();
In per-session mode:
- total: total number of entries into this wait event for the current session since it started.
- active and total_duration_us: always 0 in phase 1 (per-session histogram deferred to phase 2).
- Counters are incremented via WaitEventGuard::enter and stored in AtomicWaitState::event_counts (per-session registry, indexed by session_id).
If the session has not entered any wait event, every total is 0 (an empty wait state returns zeros).
Metrics
For each event, three Prometheus series are exported with label
event=<variant_snake_case>:
| Metric | Type | Meaning |
|---|---|---|
| angarabase_wait_events_total | counter | How many times code entered this wait type. |
| angarabase_wait_events_active | gauge | How many waits of this type are active right now. |
| angarabase_wait_event_duration_seconds | histogram | Wait duration distribution. |
Histogram buckets in seconds: 0.001, 0.005, 0.01, 0.05, 0.1,
0.5, 1, 5, +Inf.
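As a rough illustration of how the three series relate, the sketch below models one event's counter, gauge, and histogram behind an RAII guard. WaitSeries and WaitGuard are hypothetical names invented for this example; the real WaitEventGuard and its storage may be shaped quite differently. Only the bucket bounds are taken from the text above.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Finite bucket upper bounds in seconds, as listed above; +Inf is the implicit last slot.
pub const BUCKET_BOUNDS: [f64; 8] = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0];

/// Hypothetical per-event storage for the three exported series.
#[derive(Default)]
pub struct WaitSeries {
    /// Counter: how many times code entered this wait.
    pub total: AtomicU64,
    /// Gauge: how many waits are in flight right now.
    pub active: AtomicU64,
    /// Per-bucket (non-cumulative) counts; index 8 is the +Inf slot.
    pub buckets: [AtomicU64; 9],
}

impl WaitSeries {
    /// Records one finished wait in the first bucket whose bound covers it.
    pub fn observe(&self, waited: Duration) {
        let secs = waited.as_secs_f64();
        let idx = BUCKET_BOUNDS
            .iter()
            .position(|&le| secs <= le)
            .unwrap_or(BUCKET_BOUNDS.len());
        self.buckets[idx].fetch_add(1, Ordering::Relaxed);
    }

    /// Cumulative count up to bucket `idx`, which is what a Prometheus `le` series reports.
    pub fn cumulative(&self, idx: usize) -> u64 {
        self.buckets[..=idx]
            .iter()
            .map(|b| b.load(Ordering::Relaxed))
            .sum()
    }
}

/// Hypothetical guard: entering bumps `total` and `active`;
/// dropping decrements `active` and records the elapsed time.
pub struct WaitGuard<'a> {
    series: &'a WaitSeries,
    started: Instant,
}

impl<'a> WaitGuard<'a> {
    pub fn enter(series: &'a WaitSeries) -> Self {
        series.total.fetch_add(1, Ordering::Relaxed);
        series.active.fetch_add(1, Ordering::Relaxed);
        WaitGuard { series, started: Instant::now() }
    }
}

impl Drop for WaitGuard<'_> {
    fn drop(&mut self) {
        self.series.active.fetch_sub(1, Ordering::Relaxed);
        self.series.observe(self.started.elapsed());
    }
}
```

Tying the gauge to the guard's lifetime means an early return or a panic inside the blocking section still decrements the active count.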
PromQL Examples
Top-N wait classes by accumulated time over 5 minutes:
topk(
5,
rate(angarabase_wait_event_duration_seconds_sum[5m])
)
Active waits right now:
sum by (event) (angarabase_wait_events_active)
p99 latency for buffer-pool eviction:
histogram_quantile(
0.99,
rate(angarabase_wait_event_duration_seconds_bucket{event="buffer_pool_eviction"}[5m])
)
Backpressure throttle rate:
rate(angarabase_wait_events_total{event="backpressure_throttle"}[1m])
QoS queue wait rate:
rate(angarabase_wait_events_total{event="qos_queue"}[5m])
p95 waits in QoS queue:
histogram_quantile(
0.95,
rate(angarabase_wait_event_duration_seconds_bucket{event="qos_queue"}[5m])
)
Active QoS blocking waits:
angarabase_wait_events_active{event="qos_blocking"}
Alert for long QoS queue:
histogram_quantile(
0.99,
rate(angarabase_wait_event_duration_seconds_bucket{event="qos_queue"}[5m])
) > 0.5
Alert for blocking pool pressure:
angarabase_wait_events_active{event="qos_blocking"} > 0
and on ()
angarabase_qos_blocking_inflight > 0
Correlation of QoS waits with rejections:
rate(angarabase_wait_events_total{event="qos_queue"}[5m])
and on ()
sum(rate({__name__=~"angarabase_qos_rejected_.*_total"}[5m])) > 0
Operator playbook
BufferPoolEviction is growing:
- buffer pool is smaller than the working set;
- max_cached_pages has been reached;
- check the buffer-pool-pressure runbook.
BackpressureThrottle is growing:
- WAL queue or buffer pool exhaustion slows clients;
- check angarabase_buffer_pool_uncommitted_pages_ratio;
- correlate with WAL group-commit latency.
WalFlush or WalSync p99 is above 100 ms:
- fsync regression or storage stall is likely;
- use the wal-fsync-slow runbook.
RowLock has high duration:
- look for lock contention and long transactions;
- use the deadlock-spike runbook.
QosQueue is growing:
- check angara_stat_qos_queues;
- watch angarabase_qos_rejected_*_total;
- reduce batch job concurrency;
- move heavy jobs to SET service_level = 'background';
- review ANGARABASE_QOS_WEIGHTS and ANGARABASE_QOS_MAX_QUEUED.
QosBlocking is growing:
- check angarabase_qos_blocking_inflight;
- check angarabase_spawn_blocking_max;
- look for blocking workload that displaces runtime capacity;
- do not try to fix this by raising the SQL lock timeout: the wait is in the scheduler/runtime path.
Source of truth
- Code: crates/angarabase/src/observability/wait_events.rs
- Per-session dispatch: crates/angarabase/src/virtual_catalog.rs + virtual_catalog/shared_catalog.rs
- Metrics: crates/angarabase/src/metrics/core.rs
- Render: crates/angarabase/src/metrics/render.rs
- QoS scheduler: crates/angarabase/src/qos_manager.rs
- RM: docs/planning/v0.6/RM-0.6.3.9.md §S11, docs/planning/v0.6/RM-0.6.4.10.md, docs/planning/v0.6/RM-0.6.4.19.md Track C C1
Resource Advisors v0
AngaraBase 0.6.3.9 §S10 — single-node, in-process advisors. Closes the Article #5 review finding “RFC-2026-010 ↔ runtime drift”.
This page documents the two minimum-viable advisors that ship with
0.6.3.9 — the AIMD checkpoint IoAdvisor and the RSS-sensor
MemoryAdvisor — together with the metrics they expose and the
guarantees they explicitly do not make. The full AngaraTuner
Resource Broker (distributed, QoS-weighted, schema-aware) remains a
future train (RM-0.7.0 / RM-0.8.0); RM-0.6.3.9 promotes only the
single-node sensor stubs called out as [Future] in
RFC-2026-010 §3 to Current v0.
Why advisors at all
Modern databases self-tune — Postgres auto_explain + extension-based
tuners, SQL Server’s automatic plan correction, Oracle’s adaptive
execution. AngaraBase’s long-term plan is to collapse the ~60
operator knobs down to ~10–15 budgets (memory_budget, io_budget,
cpu_budget) plus QoS policies, with an in-process broker computing
the rest.
For 0.6.3.9 we ship just enough of that vision to:
- anchor the public Article #5 narrative (no more “future-only” advisors), and
- give downstream code (plan-cache eviction, future spill paths) a stable hint API to consume now without committing to the full broker contract.
IoAdvisor — AIMD checkpoint throttler
Algorithm
Single-knob AIMD over observed flush IOPS, ticked once per attempted
checkpoint (the CheckpointWorker::with_io_advisor integration path):
on tick(observed_iops):
if observed_iops > iops_threshold:
batch_size *= decrease_factor # multiplicative-decrease, >= min
decision = throttle
else:
batch_size += increase_step # additive-increase, <= max
decision = recover
if batch_size unchanged:
decision = hold
Defaults (crates/angarabase/src/storage/advisors/io.rs,
IoAdvisorConfig::default):
| Knob | Default | Notes |
|---|---|---|
| initial_batch_size | 64 pages | Resets to this on restart (no persistence) |
| min_batch_size | 8 pages | Hard floor |
| max_batch_size | 1024 pages | Hard ceiling |
| iops_threshold | 5 000 IOPS | Above → multiplicative-decrease |
| decrease_factor | 0.5 | Clamped into (0.0, 1.0) |
| increase_step | 8 pages | Additive-increase per tick |
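The pseudocode and defaults above can be sketched in Rust roughly as follows. This is a minimal illustration, not the code in storage/advisors/io.rs: AimdConfig, AimdAdvisor, and Decision are names made up for the example, and the clamping of decrease_factor into (0.0, 1.0) is omitted for brevity.

```rust
/// Decision labels from the pseudocode above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Throttle,
    Recover,
    Hold,
}

/// Illustrative config mirroring the defaults table; not the real IoAdvisorConfig.
#[derive(Debug, Clone)]
pub struct AimdConfig {
    pub initial_batch_size: u32,
    pub min_batch_size: u32,
    pub max_batch_size: u32,
    pub iops_threshold: u64,
    pub decrease_factor: f64,
    pub increase_step: u32,
}

impl Default for AimdConfig {
    fn default() -> Self {
        AimdConfig {
            initial_batch_size: 64,
            min_batch_size: 8,
            max_batch_size: 1024,
            iops_threshold: 5_000,
            decrease_factor: 0.5,
            increase_step: 8,
        }
    }
}

/// Hypothetical advisor holding only the current recommendation.
pub struct AimdAdvisor {
    cfg: AimdConfig,
    batch_size: u32,
}

impl AimdAdvisor {
    pub fn new(cfg: AimdConfig) -> Self {
        // Keep the starting point inside [min, max].
        let batch_size = cfg
            .initial_batch_size
            .max(cfg.min_batch_size)
            .min(cfg.max_batch_size);
        AimdAdvisor { cfg, batch_size }
    }

    pub fn batch_size(&self) -> u32 {
        self.batch_size
    }

    /// One AIMD step for a single attempted checkpoint.
    pub fn tick(&mut self, observed_iops: u64) -> Decision {
        let before = self.batch_size;
        let (next, moved) = if observed_iops > self.cfg.iops_threshold {
            // Multiplicative decrease, never below the floor.
            let shrunk = (f64::from(before) * self.cfg.decrease_factor) as u32;
            (shrunk.max(self.cfg.min_batch_size), Decision::Throttle)
        } else {
            // Additive increase, never above the ceiling.
            let grown = before.saturating_add(self.cfg.increase_step);
            (grown.min(self.cfg.max_batch_size), Decision::Recover)
        };
        self.batch_size = next;
        // A step that leaves the batch size unchanged is reported as hold.
        if next == before { Decision::Hold } else { moved }
    }
}
```

With the default values, three consecutive ticks above the threshold take the recommendation from 64 to 32, 16, and 8 pages; a fourth tick stays at the floor and is reported as hold.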
Metrics
- angarabase_io_advisor_current_batch_size (gauge, pages): the advisor’s currently recommended checkpoint batch size.
- angarabase_io_advisor_decisions_total{action="throttle"|"recover"|"hold"} (counter): split by AIMD decision. Sum reproduces the historical decision count.
What v0 does not do
- It does not yet enforce the recommended batch size: the periodic checkpoint flush still drains every dirty page ≤ target_lsn to preserve the completion invariant (RFC-2026-073 §S12). Wiring the recommendation into the flush path is tracked in DEBT_REGISTER as a follow-up.
- It does not consider latency, only IOPS; adaptive io_uring queue depth (TD-2026-0122) is the v1 follow-up.
- It does not persist its state across restarts.
MemoryAdvisor — RSS sensor
What it samples
On every sample() call (driven by the same checkpoint worker tick on
Linux):
- read process_rss from /proc/self/statm (pages * sysconf(_SC_PAGESIZE)),
- compute ratio = process_rss / configured_limit,
- publish the angarabase_memory_pressure_ratio gauge,
- emit a WARN log line if ratio >= warn_threshold (default 0.8).
limit_bytes = 0 disables the advisor: is_under_pressure() returns
false and the gauge stays at 0. Non-Linux platforms always
return None from sample() in v0 (portable sensor is a follow-up).
Hint API
let advisor: Arc<MemoryAdvisor> = ...;
if advisor.is_under_pressure() {
    // shed load: e.g. evict from plan cache, fall back to spill plan
}
The check is a single relaxed atomic load — safe to call on the hot path. The decision of what to do under pressure is intentionally left to each subsystem so the advisor itself stays narrow.
Metrics
- angarabase_memory_pressure_ratio (gauge, float in [0.0, 8.0]): most recent process_rss / limit_bytes ratio. Hard-clamped to 8.0 to bound the impact of bogus RSS reads.
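To make the sampling and clamping rules concrete, here is a small sketch under stated assumptions. The function names and PressureFlag are hypothetical. The sketch also assumes that is_under_pressure() compares against the warn threshold, which the text above does not state, and it leaves the page size to the caller rather than calling sysconf.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Upper clamp for the published ratio, per the metric description above.
pub const MAX_RATIO: f64 = 8.0;

/// Parses the resident-pages field (second column) from the text of /proc/self/statm.
pub fn resident_pages(statm: &str) -> Option<u64> {
    statm.split_whitespace().nth(1)?.parse().ok()
}

/// Computes the pressure ratio; `None` means the advisor is disabled (limit of 0).
pub fn pressure_ratio(rss_bytes: u64, limit_bytes: u64) -> Option<f64> {
    if limit_bytes == 0 {
        return None;
    }
    Some((rss_bytes as f64 / limit_bytes as f64).min(MAX_RATIO))
}

/// Hypothetical stand-in for the advisor's hint flag.
pub struct PressureFlag {
    warn_threshold: f64,
    under_pressure: AtomicBool,
}

impl PressureFlag {
    pub fn new(warn_threshold: f64) -> Self {
        PressureFlag {
            warn_threshold,
            under_pressure: AtomicBool::new(false),
        }
    }

    /// Sampling side: stores the latest verdict (assumed rule: ratio >= warn_threshold).
    pub fn update(&self, ratio: Option<f64>) {
        let hot = ratio.map_or(false, |r| r >= self.warn_threshold);
        self.under_pressure.store(hot, Ordering::Relaxed);
    }

    /// Hot-path side: a single relaxed atomic load.
    pub fn is_under_pressure(&self) -> bool {
        self.under_pressure.load(Ordering::Relaxed)
    }
}
```

A caller would multiply the result of resident_pages by the system page size before passing it to pressure_ratio. Keeping the parse and the ratio as pure functions makes the clamp and the disabled case easy to test without touching /proc.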
Recommended PromQL
# Checkpoint throttling intensity over the last 5m
rate(angarabase_io_advisor_decisions_total{action="throttle"}[5m])
# Memory pressure crossing the warn threshold
angarabase_memory_pressure_ratio > 0.8
# Current checkpoint batch recommendation, for capacity dashboards
angarabase_io_advisor_current_batch_size
Operator playbook
| Symptom | What to check | Action |
|---|---|---|
| io_advisor_current_batch_size stuck at min_batch_size | rate(io_advisor_decisions_total{action="throttle"}[5m]) consistently > 0 | Storage IOPS budget is the bottleneck. Either provision more IOPS or raise iops_threshold after measuring sustained capacity. |
| memory_pressure_ratio > 0.9 for > 5m | RSS growth pattern; per-subsystem memory metrics | Consider lowering max_cached_pages or query_memory_limit_mb. Plan cache eviction will start consulting is_under_pressure() as the consumer-side wiring lands. |
| io_advisor_decisions_total{action="hold"} ≫ throttle/recover | Workload is stable around the threshold | No action: AIMD is doing its job. |
Compatibility contract
- Metric names (angarabase_io_advisor_*, angarabase_memory_pressure_ratio) are stable within the 0.6.x series. Adding new advisors or new action= label values is a non-breaking change.
- The Rust API (IoAdvisor::tick, MemoryAdvisor::sample, MemoryAdvisor::is_under_pressure) is internal (pub for cross-crate wiring, but not stabilised for downstream consumers outside the AngaraBase workspace).
- The full AngaraTuner Resource Broker (RFC-2026-010 Phase 1+2) will coexist with v0 and may layer over these advisors; v0 metric names will continue to be emitted for backwards compatibility.
Cross-references
Backpressure Coordinator
AngaraBase 0.6.3.9 §S5+§S9 — unified backpressure surface.
This page documents how AngaraBase decides to slow down (or refuse) write work when one of its internal queues is at risk of overflowing, and the single Prometheus surface operators should use to investigate it.
Why a coordinator
Up to v0.6.3.8 the storage layer carried three independent backpressure mechanisms:
- The buffer_pool.uncommitted_pages_ratio_hard threshold (write transactions blocked until the writeback worker drains the uncommitted-pages set).
- The high-priority I/O queue depth (low-priority prefetch dropped when OLTP demand reads saturate the I/O scheduler).
- The buffer pool capacity waiter introduced in §S2+§S8 (writers blocked when no frame can be evicted).
Each mechanism had its own metric and ad-hoc decision logic. There was no single answer to the operator question:
Why is the database refusing my write right now?
Starting with RM-0.6.3.9 the same three mechanisms remain (each as an
isolated BackpressureSource), but they are evaluated through one
BackpressureCoordinator façade and report through one unified
metric family.
Decision model
Each source returns one of three decisions on every coordinator evaluation:
| Decision | Meaning | Typical caller reaction |
|---|---|---|
| pass | Source reports no pressure; the request may proceed without delay. | Continue. |
| throttle | Source reports elevated pressure; the caller should slow down. | Block on a WaitEventGuard for BackpressureThrottle. |
| reject | Source reports critical pressure; the request must be rejected. | Surface a 53400 INSUFFICIENT_RESOURCES error. |
The coordinator’s combined decision uses a strict dominance rule:
reject > throttle > pass
That is, any source reporting reject wins immediately, and any source
reporting throttle wins over a passing source. This mirrors the
fail-fast / block semantics that already existed for the
buffer_pool.backpressure.mode knob (see
runtime_settings.md).
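The dominance rule amounts to taking the maximum over an ordered enum. The sketch below is illustrative only: Decision, combine, and active_sources are hypothetical names, not the BackpressureCoordinator API.

```rust
/// Variant order encodes the dominance rule: Pass < Throttle < Reject.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Decision {
    Pass,
    Throttle,
    Reject,
}

/// Combines per-source decisions; the strongest one wins.
/// With no sources at all the result is Pass (an assumption of this sketch).
pub fn combine<I: IntoIterator<Item = Decision>>(decisions: I) -> Decision {
    decisions.into_iter().max().unwrap_or(Decision::Pass)
}

/// Number of sources reporting non-pass, i.e. what an "active sources" gauge would show.
pub fn active_sources(decisions: &[Decision]) -> usize {
    decisions.iter().filter(|d| **d != Decision::Pass).count()
}
```

For example, combining Pass, Throttle, and Pass yields Throttle, and a single Reject among any number of other decisions yields Reject.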
Sources
| Source label | Signal | Tunable knob |
|---|---|---|
| uncommitted_pages | Fraction of buffer-pool frames carrying uncommitted page-image deltas. | buffer_pool.uncommitted_pages_ratio_hard (default 0.30). |
| wal_queue | High-priority I/O queue depth (OLTP demand reads above the saturation watermark). | (internal, default threshold 4) |
| buffer_pool | Buffer-pool capacity waiter (max_cached_pages exhausted, no evictable frame). | buffer_pool.pool_wait_timeout_ms (default 5000). |
The buffer_pool source reports reject when the pool is over capacity
(an eviction failed because every frame is currently pinned) and
throttle when a writer is parked on the capacity cv. The two conditions
are tracked independently of each other.
BREAKING (RM-0.6.3.9 §S5+§S9, decision #5): the buffer_pool.uncommitted_pages_ratio_hard knob was previously named uncommitted_dirty_ratio_hard. The legacy identifier is removed without a compatibility alias, so operators upgrading from v0.6.3.8 or earlier must rename the key in their config files. The release entry for RM-0.6.3.9 in docs/planning/releases/v0/RELEASE_NOTES.md contains the migration note.
OPERATOR-UX hardening (2026-04-20, RM-0.6.3.10, closes F-UX-1 + OQ-2026-054 + TD-2026-0175): the parser is now fail-closed on the legacy key. A config that still contains [buffer_pool] uncommitted_dirty_ratio_hard = … will refuse to start with exit 78 (EX_CONFIG) and an operator-facing message naming the renamed key. Unknown keys (typos, future-feature backports) emit a structured tracing::warn!(target = "config", section, key) and increment the counter angarabase_config_unknown_keys_total; the recommended alert is > 0 after a fresh deploy. Soft-deprecated aliases ([server] host/port, [storage] wal_directory) remain silently recognized for compatibility.
Metrics
All metrics are emitted on the standard /metrics endpoint
(see observability.md).
| Metric | Type | Labels | Meaning |
|---|---|---|---|
| angarabase_backpressure_throttle_decisions_total | counter | source, decision | Per (source × decision) counter incremented on every coordinator evaluation. |
| angarabase_backpressure_active_sources | gauge | (none) | Number of sources currently reporting non-pass (snapshot, refreshed on every evaluation). |
Label sets are stable across releases:
- source ∈ {uncommitted_pages, wal_queue, buffer_pool}
- decision ∈ {pass, throttle, reject}
PromQL recipes
Detect any active backpressure right now:
angarabase_backpressure_active_sources > 0
Decision rate by source over the last 5 minutes:
sum by (source) (
rate(angarabase_backpressure_throttle_decisions_total{decision!="pass"}[5m])
)
Reject rate (the operator pager-worthy signal):
sum(rate(
angarabase_backpressure_throttle_decisions_total{decision="reject"}[5m]
))
Operator playbooks
| Symptom | First check | Next step |
|---|---|---|
| angarabase_backpressure_active_sources >= 1 for >30 s | Which source label dominates the decision counters? | Follow per-source playbook below. |
| source="uncommitted_pages",decision="throttle" rate climbing | buffer_pool_uncommitted_dirty_ratio near buffer_pool.uncommitted_pages_ratio_hard? | Increase buffer pool size, or shrink concurrent write batch sizes. |
| source="wal_queue",decision="throttle" rate climbing | angarabase_io_advisor_current_batch_size shrinking? (correlated) | Investigate disk saturation; throttle prefetch / background warmup. |
| source="buffer_pool",decision="reject" non-zero | angarabase_buffer_pool_over_capacity_pages > 0? | Pinned-page leak is suspected: capture a diagnostics bundle and open an incident. |
Compatibility contract
- The (source, decision) label sets above are part of the public Prometheus contract and will only change in a major release.
- Adding a new source or decision is backward-compatible; removing or renaming requires a deprecation cycle documented in CHANGELOG.md.
- Coordinator dominance order (reject > throttle > pass) is part of the contract: alerts may rely on it.
Vector Observability
AngaraBase 0.6.3.10 §S17 — closes the RFC-2026-151 §7a Observability Contract that was deferred from RM-0.6.2.9 G2 (Sprint 4 G2-001 disposition). User direction 2026-04-20 «extended scope» — close before the HTAP column-store train (RM-0.6.4.0).
This page documents the vector executor observability surface: 3 USDT
probes on the hot path, 1 selection-ratio histogram exposed via /metrics,
and the operator playbooks that turn those signals into rewrite / index
decisions.
Surface summary
| Surface | Type | Source |
|---|---|---|
| angarabase:vector_batch_start | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase:vector_batch_end | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase:vector_fallback | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase_vector_selection_ratio | Prometheus histogram | crates/angarabase/src/metrics/core.rs |
| angarabase_vector_fallback_total | Prometheus counter (existing) | crates/angarabase/src/metrics/core.rs |
| angarabase_vector_rows_produced_total | Prometheus counter (existing) | crates/angarabase/src/metrics/core.rs |
The probes carry the inline #[cfg(feature = "usdt")] guard, so non-usdt
builds (WASM, slim test profiles) pay zero instructions per call. RFC-2026-369
remains the canonical source for the broader USDT/eBPF probe infrastructure;
this page covers only the vector-executor-specific subset finalised by S17.
USDT probes
All three probes use provider name angarabase and follow the
<subsystem>_<event> convention. Numeric discriminants for every enum
argument are append-only — adding a new variant is non-breaking, but
renumbering or removing one requires an RFC update.
vector_batch_start(operator_kind: u8, batch_size: u32, source: u8)
Fired at the entry of VectorOperator::next_batch() for every primary
operator (Filter / SeqScan / IndexScan / Bridge / ParallelSeqScan).
- operator_kind: ProbeVectorOperatorKind discriminant. Stable values: Filter=0, SeqScan=1, IndexScan=2, Bridge=3, ParallelSeqScan=4, HashJoin=5 (reserved), Aggregate=6 (reserved), Project=7 (reserved).
- batch_size: upstream batch length in rows.
- source: ProbeVectorBatchSource discriminant. Stable values: HeapScan=0, IndexScan=1, UpstreamVector=2, ParallelMorsel=3.
vector_batch_end(operator_kind: u8, rows_produced: u32, rows_filtered: u32, duration_us: u64)
Fired at the exit. rows_filtered is the count dropped by this operator;
rows_produced is the count emitted to the next operator. duration_us is
the wall-time of the call, including any upstream next_batch() recursion.
vector_fallback(plan_kind: u8, reason: u8)
Fired wherever the planner / executor falls back to the row path.
- plan_kind: ProbeOperator discriminant (best-effort tag for the plan node that tripped the fallback; e.g. HashProbe=4, Aggregate=7).
- reason: ProbeVectorFallbackReason discriminant. Stable values: UnsupportedPlan=0, TypeError=1, NonEquiJoin=2, BudgetExceeded=3, FeatureDisabled=4.
Wire contract notice. S17 finalises the vector_fallback argument shape, replacing the legacy ad-hoc (u64, u64) literals that the S9-D4 code shipped with. RFC-2026-369 was open at S17 close so no production bpftrace consumers were broken; new consumers MUST use the ProbeOperator × ProbeVectorFallbackReason mapping.
bpftrace recipes
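// Live histogram of post-Filter selectivity in percent (per batch; counts accumulate until the script exits, e.g. after 60 s).
// The predicate also skips batches with no rows to avoid a division by zero.
usdt:./angarabased:angarabase:vector_batch_end /arg0 == 0 && arg1 + arg2 > 0/ {
@sel = hist(arg1 * 100 / (arg1 + arg2));
}
// Top fallback reasons, keyed by (plan_kind, reason), for as long as the script runs (e.g. one hour).
usdt:./angarabased:angarabase:vector_fallback {
@[arg0, arg1] = count();
}
// Vector hot-path call count by operator_kind.
usdt:./angarabased:angarabase:vector_batch_start {
@[arg0] = count();
}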
Self-test scripts live under tools/usdt/ (per the standing convention from
RFC-2026-369 §4 — bpftrace -l 'usdt:./angarabased:angarabase:*').
Histogram: angarabase_vector_selection_ratio
Cumulative Prometheus histogram (HELP / TYPE headers emitted on every
scrape) tracking the per-batch ratio rows_produced / rows_scanned observed
by VectorFilterV0::next_batch().
Bucket scheme (compatible with histogram_quantile()):
[0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0, +Inf]
The +Inf bucket exists for protocol compliance only — selection ratio is
bounded to [0.0, 1.0] by construction (kept ≤ scanned enforced inside
apply_predicate). Empty batches (rows_scanned == 0) carry no signal and
are silently dropped by VectorSelectionRatioHistogram::observe().
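The bucket scheme and the empty-batch rule can be sketched as below. This is a simplified, single-threaded illustration; SelectionRatioHistogram is a name chosen for the example and is not the VectorSelectionRatioHistogram type in metrics/core.rs, which presumably uses atomics.

```rust
/// Finite bucket upper bounds from the scheme above; +Inf is the implicit last slot.
pub const BOUNDS: [f64; 10] = [0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0];

/// Hypothetical single-threaded histogram with per-bucket counts plus sum and count.
#[derive(Debug, Default)]
pub struct SelectionRatioHistogram {
    counts: [u64; 11],
    sum: f64,
    count: u64,
}

impl SelectionRatioHistogram {
    /// Records one batch; empty batches are skipped because they carry no signal.
    pub fn observe(&mut self, rows_produced: u64, rows_scanned: u64) {
        if rows_scanned == 0 {
            return;
        }
        let ratio = rows_produced as f64 / rows_scanned as f64;
        // First bucket whose upper bound covers the ratio, else the +Inf slot.
        let idx = BOUNDS
            .iter()
            .position(|&le| ratio <= le)
            .unwrap_or(BOUNDS.len());
        self.counts[idx] += 1;
        self.sum += ratio;
        self.count += 1;
    }

    /// Cumulative (le-style) counts, the form histogram_quantile() consumes.
    pub fn cumulative(&self) -> [u64; 11] {
        let mut out = [0u64; 11];
        let mut running = 0u64;
        for (i, c) in self.counts.iter().enumerate() {
            running += *c;
            out[i] = running;
        }
        out
    }

    /// Number of recorded (non-empty) batches.
    pub fn count(&self) -> u64 {
        self.count
    }

    /// Sum of recorded ratios, matching the `_sum` series.
    pub fn sum(&self) -> f64 {
        self.sum
    }
}
```

Because the ratio never exceeds 1.0, the +Inf slot always equals the le="1.0" bucket in the cumulative view, which is why it exists for protocol compliance only.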
The histogram is rendered alongside the existing wait-event histogram by
render_prometheus() and is covered by:
- metrics::render::tests::vector_selection_ratio_histogram_appears_in_prometheus_output_rm06310_s17 (exposition-shape test);
- metrics::render::tests::vector_selection_ratio_histogram_observe_buckets_rm06310_s17 (bucket-edge test, independent of the renderer).
PromQL examples
# Median Filter selectivity over the last 5 minutes.
histogram_quantile(
0.5,
rate(angarabase_vector_selection_ratio_bucket[5m])
)
# Share of batches with extremely-low selectivity (≤ 5 %).
sum(rate(angarabase_vector_selection_ratio_bucket{le="0.05"}[5m]))
/
sum(rate(angarabase_vector_selection_ratio_count[5m]))
# Mean selectivity (sum / count, both already rate-friendly).
rate(angarabase_vector_selection_ratio_sum[5m])
/
rate(angarabase_vector_selection_ratio_count[5m])
Operator playbook
| Observation | Likely diagnosis | Recommended action |
|---|---|---|
| p50 ≥ 0.9 consistently | Filter is essentially a no-op; predicate could be pushed down or removed entirely | Review query plan: candidate for filter pushdown to scan / index level; rewrite query |
| p95 ≤ 0.05 and high vector_batch_start rate | Index missing: Filter is throwing away ≥ 95 % of every batch | Run ANALYZE; add an index on the predicate columns; verify with EXPLAIN |
| p99 = 1.0 only on a small set of queries | Selective predicate fires occasionally on a hot table | Acceptable; consider partial index if the predicate is stable |
| vector_fallback rate spike | A new query shape is tripping the row-path fallback | Filter vector_fallback{reason=…} and match against ProbeVectorFallbackReason |
Cross-reference runbooks/buffer-pool-pressure.md for I/O-side correlation
and the future commit-latency-tuning.md (Track B S13) for write-path
overlay.
Source-of-truth contract
| Artifact | Role |
|---|---|
| RM-0.6.3.10 §S17 | Sprint contract (this page is the operator-facing rendering) |
| RFC-2026-151 §7a Observability Contract | Long-form design (closed by S17) |
| RFC-2026-369 USDT/eBPF Observability Probes | Broader probe taxonomy (open at S17 close; soft prereq) |
| crates/angarabase/src/observability/probes.rs | USDT macro definitions + enum stability tests |
| crates/angarabase/src/metrics/core.rs::VectorSelectionRatioHistogram | Histogram storage + observe() API |
| crates/angarabase/src/metrics/render.rs::render_vector_selection_ratio | Prometheus exposition |
| crates/angarabase/src/query/vector.rs::VectorFilterV0::next_batch | Single observation point (Filter operator boundary) |
jemalloc Heap Profiling Runbook
Operator runbook for memory/heap analysis based on jemalloc.
Canonical source: this runbook in angarabook/src/operations/.
Scope
- feature: jemalloc-prof (opt-in);
- heap-fragmentation metrics;
- on-demand profiling in staging/debug;
- leak-check for long-running runs (Golden DB).
Build and verify
cargo build --release --features jemalloc-prof
curl http://localhost:9091/metrics | rg jemalloc
Key metrics
- angarabase_jemalloc_allocated_bytes
- angarabase_jemalloc_resident_bytes
- angarabase_jemalloc_active_bytes
- angarabase_jemalloc_mapped_bytes
- angarabase_jemalloc_fragmentation_ratio
Practical interpretation of angarabase_jemalloc_fragmentation_ratio:
- ~1.0: low fragmentation;
- >1.5: fragmentation risk;
- >2.0: memory path investigation required.
Heap profiling workflow
- Start with MALLOC_CONF=prof:true,....
- Send SIGUSR1 for forced dump.
- Analyze with jeprof (text/pdf/diff).
Leak-check for long-lived runs
Golden DB flow:
- tools/golden_db/manage.sh leak-check baseline
- long-running load
- tools/golden_db/manage.sh leak-check report
Compare baseline/after by allocated/resident/mapped and fragmentation.
Related references
- tools/golden_db/manage.sh
- src/operations/golden-dataset.md
Parallel Runtime Observability Runbook
Operator runbook for diagnosing regressions in AngaraParallel.
Canonical source: this runbook in angarabook/src/operations/.
Goal
Quickly determine the source of QPS drops / latency growth without deep debugging in code:
- planner/plan shape;
- runtime/scheduler pressure;
- storage/IO contention.
Fast triage
- Compare bench metrics and server metrics in the same time window.
- Check QPS, p95/p99, queue depth, lock waits, error-rate.
- Classify the issue: planner vs runtime vs storage.
Required signals
- USDT:
  - probe_parallel_query_start
  - probe_morsel_dispatched
  - probe_morsel_completed
- Prometheus minimum:
  - angarabase_storage_io_read_duration_ms_*
  - angarabase_storage_io_write_duration_ms_*
  - angarabase_pgwire_pool_queue_depth
  - angarabase_lock_wait_duration_ms_*
  - angarabase_slow_query_total
Incident playbook
- Capture baseline and regression run on the same profile.
- Collect EXPLAIN ANALYZE for slow queries.
- Verify that the expected parallel path is used: workers_planned, workers_launched, Vector* operators, and reason_codes.
- Correlate dispatch/completion with tail latency.
- Check that memory guardrails lead to graceful degradation rather than a hard failure.
- Record a short report: impact, suspect component, next action.
Next
- How to read query plans: detailed explanation of workers_planned, workers_launched, Vector*, and optimizer diagnostics.
- Performance tuning guide: general approaches to tuning for parallelism.
- Observability metrics checklist: general metrics that include parallel counters.
AngaraReplica v2 Operations Guide
Short operator guide for streaming replication v2.
Canonical source: this runbook in angarabook/src/operations/.
Topology and scope
- 1 primary + up to 8 standby nodes (async replication).
- Standby works in read-only mode (SQLSTATE 25006 on write).
- Promote is performed manually (auto-failover is planned for the next major line).
Configuration baseline
Primary:
- [replication].role = "primary"
- listen_addr
- wal_retention_segments
Standby:
- [replication].role = "standby"
- primary_addr
- slot_name
- wal_path
Operations flow
- Start primary.
- Start standby and check lag metrics.
- Monitor replication lag / reconnects / slots.
Promote (manual failover)
- Promote must complete through the sync-checkpoint handshake.
- Promote timeout fails closed (standby does not accept writes if the handshake did not complete).
- Lease-based fencing reduces split-brain risk, but does not fully replace STONITH/Raft.
Key monitoring signals
- angara_node_is_standby
- angara_replication_lag_bytes
- angara_replication_lag_ms
- angara_replication_reconnects_total
- angara_promote_total
- angara_promote_duration_ms_last
Typical incidents
- Standby does not connect: address/port/firewall/reconnects.
- WAL segment gone: base backup and standby restart are required.
- Promote timeout: check network and WAL write path on primary.
Next
- Disaster recovery playbook: DR scenarios on top of replication.
- Backup and restore (operator-level): how replication complements (does not replace) backup.
- Operational policies baseline: SLA/RTO/RPO agreements within which replication v2 operates.
Operational Policies Baseline
Short summary of operational policies for release discipline.
Breaking changes policy (v1.*)
Default for minor trains: avoid breaking changes.
If a breaking change is unavoidable, the required package is:
- RFC/design note;
- migration notes;
- release notes with impact/steps.
On-disk changes require strict upgrade discipline:
- update crates/angarabase/src/on_disk.rs;
- update src/operations/upgrade-and-migration.md;
- attach upgrade rehearsal evidence.
Breaking budget registry
The breaking surfaces registry must explicitly record:
- status (no change / changed with notes / planned);
- evidence path;
- updates within each train with potential impact.
SLO/SLA methodology
Measurements are based on a reproducible profile and pinned evidence:
- latency;
- throughput;
- saturation/backpressure signals.
Runner baseline:
tools/perf_pack/run.sh
Evidence policy
- Heavy artifacts: locally in artifacts/....
- Pinned evidence: compact reports in docs/planning/evidence/....
- RM/release notes link only to pinned evidence.
Triage entry points
- src/operations/observability-metrics.md
- src/operations/troubleshooting.md
Next
- Configuration schema reference — specific keys referenced by policies.
- Security operations baseline — security part of operator policies.
- Testing and validation baseline — how policies are verified in CI.
Client Compatibility Baseline
Operator baseline for client/ORM compatibility.
Canonical source: this runbook in angarabook/src/operations/.
Supported Framework Matrix (RM-0.6.5.7)
| Framework / Driver | Version | Status | Notes |
|---|---|---|---|
| psql | any | ✅ Supported | baseline |
| psycopg3 | 3.3.4 | ✅ Supported | Fixed in v0.6.5.3: EQP cast/arithmetic, date/timestamp encoding, IS NULL, = ANY(ARRAY). Fixed in v0.6.5.5: correct OID mapping and UTC serialization. Fixed in v0.6.5.7: DATE type (OID 1082, binary i32), bool comparison (1=2 → false), FK DDL accept (NOT ENFORCED), multi-col CREATE INDEX accept |
| Django ORM + psycopg3 | 5.x | ✅ Supported | Basic migrations PASS; EQP gaps fixed in v0.6.5.3. Fixed in v0.6.5.5: set_config/obj_description stubs unblock introspection. |
| Odoo 19 (community) | 19.x | ✅ Supported | Fixed in v0.6.5.3: IS NULL, = ANY(ARRAY), pg_class filter, pg_index, CREATE SEQUENCE. Fixed in v0.6.5.5: GAP-C2 (UPDATE SET func), GAP-C5 (date_trunc). |
| sqlx | 0.8.6 | ✅ Supported | Fixed in v0.6.5.3: ParameterDescription in Describe(S) |
| tokio-postgres (simple query) | 0.7.17 | ✅ Supported | Simple query protocol |
| tokio-postgres (extended query) | 0.7.17 | ✅ Supported | Fixed in v0.6.5.3: ParameterDescription in Describe(S). Fixed in v0.6.5.7: binary encode for DATE/TIMESTAMP, binary param decode OID 1114/1184, ParameterDescription returns real OIDs from Parse |
Goal
Maintain a repeatable compatibility baseline for:
- psql
- DBeaver
- Odoo-shaped probes
and track regressions through pinned reproducible checks.
Phase A focus (Odoo)
- Runtime smoke without critical SQL errors.
- Stable trace replay without shape/protocol regressions.
- Explicit boundary between acceptable shape stubs and unacceptable semantic-unsafe stubs.
Pinned tooling
- tools/pg_catalog_trace/run_odoo_stages_angara.sh
- tools/pg_catalog_trace/extract_angara_trace.py
- tools/pg_catalog_trace/replay_angara_trace.sh
- tools/compat_suite/run.sh --nightly --runs 3
- tools/compat_suite/run.sh --odoo --runs 3
High-signal checks
- SQLSTATE mapping (42601, 0A000) is stable.
- Catalog/info_schema response shapes are stable for the probe set.
- Odoo/DBeaver smoke path is not broken by changes in pg_catalog.
Regression triage
Compat-suite artifacts:
- summary.json
- iter_<N>/summary.json
- iter_<N>/test_<name>.log
Function Compatibility (RM-0.6.5.5)
The following functions were added to support ORM (Django, Odoo):
- set_config(name, value, is_local): Stub, returns value. Allows Django to configure TimeZone and search_path without errors.
- obj_description(oid, catalog): Stub, returns NULL. Allows Django to perform database introspection.
- date_trunc(field, timestamp): Full implementation. Supports all standard fields (year, month, day, hour, etc.).
- NOW(), CURRENT_TIMESTAMP: Return the current time in UTC.
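To make the truncation semantics concrete, here is a minimal Python sketch of what date_trunc does for a few common fields. It illustrates the standard SQL behavior only and is not AngaraBase's implementation; the helper name and the limited field list are assumptions made for this example.

```python
from datetime import datetime

# Fields ordered from coarsest to finest; everything finer than the
# requested field is reset to its minimum value.
_FIELDS = ["year", "month", "day", "hour", "minute", "second"]
_RESET = {"month": 1, "day": 1, "hour": 0, "minute": 0, "second": 0}


def date_trunc(field: str, ts: datetime) -> datetime:
    """Truncate ts to the given field (illustrative subset of fields)."""
    if field not in _FIELDS:
        raise ValueError(f"unsupported field in this sketch: {field}")
    finer = _FIELDS[_FIELDS.index(field) + 1:]
    changes = {name: _RESET[name] for name in finer}
    # Sub-second precision is always dropped for the fields listed above.
    return ts.replace(microsecond=0, **changes)


ts = datetime(2026, 5, 8, 12, 34, 56, 789)
print(date_trunc("month", ts))  # first instant of the month
print(date_trunc("hour", ts))   # start of the hour
```

Fields such as week or quarter need calendar arithmetic and are left out of the sketch on purpose.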
DML Compatibility (RM-0.6.5.5)
- UPDATE SET col = func_call(): Functions (for example, write_date = NOW()) and explicit type casts (col = val::type) are now supported in the SET clause. This is critical for Odoo 19.
Query Execution & Bug Fixes (RM-0.6.5.7)
- Bool comparison: Comparing constants of different types now correctly coerces to boolean.
Example: SELECT 1 = 2; returns f (previously could cause a type error).
Supported Types (RM-0.6.5.7)
- date: Native type. OID 1082. Binary mode: BE i32 (days since 2000-01-01). Text mode: ISO-8601 (YYYY-MM-DD). current_date is supported. Example: SELECT '2026-05-08'::date;
- timestamp: Native type. OID 1114 (without time zone) / 1184 (with time zone). Binary mode: BE i64 (microseconds since 2000-01-01). Casting string literals in ISO-8601 format is supported. Example: SELECT '2026-05-08 12:00:00'::timestamp;
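The binary layouts described above can be reproduced with the Python standard library. The following is a minimal sketch of the wire encoding (big-endian i32 days and big-endian i64 microseconds, both counted from 2000-01-01); the function names are hypothetical and the code is independent of any driver or of AngaraBase itself.

```python
import struct
from datetime import date, datetime, timedelta

PG_EPOCH_DATE = date(2000, 1, 1)
PG_EPOCH_TS = datetime(2000, 1, 1)


def encode_date(d: date) -> bytes:
    """Big-endian signed 32-bit count of days since 2000-01-01."""
    return struct.pack(">i", (d - PG_EPOCH_DATE).days)


def decode_date(raw: bytes) -> date:
    (days,) = struct.unpack(">i", raw)
    return PG_EPOCH_DATE + timedelta(days=days)


def encode_timestamp(ts: datetime) -> bytes:
    """Big-endian signed 64-bit count of microseconds since 2000-01-01 00:00:00."""
    delta = ts - PG_EPOCH_TS
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return struct.pack(">q", micros)


def decode_timestamp(raw: bytes) -> datetime:
    (micros,) = struct.unpack(">q", raw)
    return PG_EPOCH_TS + timedelta(microseconds=micros)


wire = encode_date(date(2026, 5, 8))
print(len(wire), decode_date(wire))  # 4 bytes, round-trips to the same date
```

Dates before the epoch encode as negative day counts, which is why both formats are signed.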
DDL Compatibility (RM-0.6.5.7)
Foreign Key Constraints
The REFERENCES and FOREIGN KEY ... REFERENCES syntax is accepted by the parser (v0.6.5.7+).
Constraints are not enforced (NOT ENFORCED). The server logs [FK constraint] ... accepted as NOT ENFORCED.
Example:
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT REFERENCES users(id) -- NOT ENFORCED
);
Multi-Column Indexes
CREATE INDEX ON t(a, b, c) is accepted (v0.6.5.7+). The index is built only on the first column.
Remaining columns are preserved in metadata. The server logs a warning.
Example:
CREATE INDEX idx_multi ON my_table (col1, col2, col3); -- Built only on col1
Known Limitations (RM-0.6.6.3)
SQL-level PREPARE / EXECUTE / DEALLOCATE
SQL syntax PREPARE stmt AS ... / EXECUTE stmt(...) / DEALLOCATE stmt
is not supported in AngaraBase v0.6. It returns feature_not_supported.
What to use instead: extended query protocol (Parse/Bind/Execute pgwire messages), which is used automatically by all supported drivers:
# psycopg3: parameterized queries go through EQP automatically (placeholders are %s, not $1)
with conn.cursor() as cur:
    cur.execute("SELECT %s::int", (42,))  # → Parse + Bind + Execute under the hood
// JDBC
PreparedStatement ps = conn.prepareStatement("SELECT ?::int");
ps.setInt(1, 42);
# psql — uses simple query protocol; PREPARE as SQL does NOT work.
# For interactive testing, use \bind (psql 16+):
psql> SELECT $1::int \bind 42 \g
pg_sleep()
The pg_sleep(seconds) function is not implemented in v0.6.
To test timeouts, use a heavy query:
SET statement_timeout = 10; -- 10 ms
SELECT count(*) FROM large_table a CROSS JOIN large_table b; -- → ERROR 57014
Protocol Compatibility (RM-0.6.5.7)
- Binary encode DATE/TS: date and timestamp types are encoded in binary format as BE i32 and BE i64 respectively.
- Binary param decode: Binary parameter decoding is supported for OID 1114 (timestamp) and 1184 (timestamptz).
- ParameterDescription OID: The ParameterDescription message (Parse phase) now returns real parameter type OIDs instead of zeros.
Fast Path:
- find the failing test in summary;
- open the corresponding log;
- reproduce locally with the exact cargo test or suite-runner command.
Testing and Validation
Operator baseline for crash/recovery validation and related checks.
Canonical source: this runbook in angarabook/src/operations/.
Goal
Verify that the recovery path:
- does not allow silent corruption;
- preserves the durability-mode contract;
- remains idempotent across repeated restarts.
Core invariants
- No silent corruption: either clean startup or explicit failure with diagnostics.
- Idempotent recovery: repeated restart does not change the oracle outcome.
- Visibility safety: uncommitted changes do not become visible after recovery.
- Durability semantics:
  - strict: ack-commit must survive crash;
  - group_commit: ack semantics strictly match the stated contract;
  - relaxed: some ack-commits may be lost within the contract, but without integrity violations.
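The invariants above can be expressed as a small post-restart oracle. The sketch below is a hypothetical illustration, not the logic of tools/storage_poc/crash_loop.sh: the function name and the three transaction-id sets are assumptions, and the group_commit branch only checks visibility because its exact ack contract is not spelled out here.

```python
def check_recovery(mode, acked, committed, recovered):
    """Return a list of invariant violations (an empty list means pass).

    acked     -- ids of transactions whose commit was acknowledged to a client
    committed -- ids of transactions that reached commit before the crash
    recovered -- ids of transactions visible after restart
    """
    if mode not in ("strict", "group_commit", "relaxed"):
        raise ValueError(f"unknown durability mode: {mode}")

    violations = []

    # Visibility safety: nothing that never committed may become visible.
    leaked = recovered - committed
    if leaked:
        violations.append(f"visibility: uncommitted transactions visible: {sorted(leaked)}")

    # strict: every acknowledged commit must survive the crash.
    if mode == "strict":
        lost = acked - recovered
        if lost:
            violations.append(f"durability(strict): acknowledged commits lost: {sorted(lost)}")

    # relaxed: lost acknowledged commits are tolerated, so nothing more to check.
    return violations


print(check_recovery("strict", {1, 2}, {1, 2, 3}, {1, 3}))
```

Running the same check against the summaries of the first and the second restart is one way to approximate the idempotence invariant: both runs should yield the same result.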
Minimal runner
Pinned runner:
tools/storage_poc/crash_loop.sh
Key profiles:
- --nightly
- --dirty-pressure
- --double-restart
- --durability strict|group_commit|relaxed
- --corrupt-txlog / --corrupt-storage (fail-fast checks)
Minimum scenarios
- SIGKILL during commit storm (tail handling).
- SIGKILL around checkpoint markers.
- Double restart idempotence.
Required artifacts
- txlog_scan_*.json
- txlog_replay_*/*.json
- recovery_summary.json (+ restart2 summary for idempotence)
- machine-readable pass/fail summary for CI/nightly.
Exit criteria (operator gate)
- All required scenarios pass.
- Artifacts are valid and available for triage.
- No visibility/durability invariant violations.
Related operations references
- src/operations/backup-restore.md
- src/operations/disaster-recovery.md
- src/operations/diagnostics-bundle.md
Next
- Golden dataset management — which data validation scenarios run on.
- CI reproducibility contract — reproducibility guarantees for the validation pipeline.
- Operational policies baseline — which policies must have validation coverage.
Golden Dataset Management
Operator baseline for maintaining a persistent Golden DB.
Canonical source: this runbook in angarabook/src/operations/.
Goal
Use a stable large dataset for:
- release closure validation;
- upgrade rehearsal;
- performance drift tracking;
- soak scenarios under realistic load.
Canonical sources
- RFC: RFC-2026-380-continuous-validation-infrastructure-v0
- Tooling: tools/golden_db/manage.sh
Infrastructure baseline
- Storage: .fastio/golden_db (NVMe).
- Separate data/txlog paths.
- Production-like durability and binary WAL.
Main commands
- tools/golden_db/manage.sh init
- tools/golden_db/manage.sh start
- tools/golden_db/manage.sh stop
- tools/golden_db/manage.sh status
- tools/golden_db/manage.sh grow --rows <n>
- tools/golden_db/manage.sh upgrade-check --binary <path>
Routine release flow
- Stop Golden DB.
- Run upgrade-check with the new binary on a snapshot.
- Verify startup/connectivity/row-count oracle.
- Record artifacts and the final verdict.
Validation tiers
- Tier 1: read compatibility (required).
- Tier 2: write compatibility (planned).
- Tier 3: performance canary (planned).
Next
- Testing and validation baseline — how the golden dataset is connected to the validation pipeline.
- CI reproducibility contract — fixture reproducibility requirements.
CI Reproducibility Contract
Short reproducibility contract for local CI gates.
Canonical source: this runbook in angarabook/src/operations/.
Goal
Make CI gates reproducible in a clean local environment to reduce bus factor.
Pinned entrypoint:
tools/ci_pr.sh
Must
- The script explicitly reports required tools.
- Runs are deterministic where possible.
- Artifacts are saved to stable paths under artifacts/.
- All key gates are available through a single entrypoint.
Should
- Have a container/dev-env run option.
- Add an advisory/license gate (cargo-deny and analogs) when network policy allows it.
Non-goals
- Hard binding to a specific CI provider.
- Identical perf metrics across different machines.
Dogfooding linkage (pilot)
Practical cycle: deploy -> smoke -> workload -> observe -> backup/restore rehearsal -> triage.
Pinned evidence paths (legacy surface):
- docs/planning/evidence/archive/legacy-root-20260417/pilot/latest.md
- docs/planning/evidence/archive/legacy-root-20260417/pilot/report_v0.md
Next
- Testing and validation baseline — which scenarios must be reproducible under this contract.
- Golden dataset management — how fixture versions are pinned for reproducibility.
Known Issues
Live registry of known issues for repeatable triage.
Canonical source: this runbook in angarabook/src/operations/.
Purpose
- Explicitly record known gaps;
- link reproducible repro + expected SQLSTATE/shape;
- simplify release decisions.
Usage rules
- Add an issue as soon as the problem is reproducible and temporarily acceptable.
- Prefer a link to the regression test that pins the expected behavior.
- Close an issue only after the fix + tests/evidence update.
Entry template (minimum)
- ID
- Area
- Severity
- Affects
- Repro
- Expected
- Observed
- Status
- Owner
- Next step
Fixed in v0.6.5.5
GAP-C2: UPDATE SET col = func() not supported
- Status: Fixed in RM-0.6.5.5.
- Summary: UPDATE SET now supports function calls (e.g., NOW()) and CAST expressions, unblocking Odoo 19 record writes.
GAP-C5: date_trunc() not implemented
- Status: Fixed in RM-0.6.5.5.
- Summary: date_trunc(field, timestamp) is fully implemented for all standard fields, unblocking Odoo reporting.
GAP-DJ-001/002: Missing set_config() and obj_description()
- Status: Fixed in RM-0.6.5.5.
- Summary: Added stubs for set_config and obj_description to ensure Django ORM introspection and migrations work without monkey-patches.
Protocol: Timestamp UTC serialization & OID mapping
- Status: Fixed in RM-0.6.5.5.
- Summary: TIMESTAMP WITHOUT TIME ZONE is now correctly mapped to OID 1114 and serialized in UTC without offset, fixing time-shift issues in psycopg3.
Fixed in v0.6.5.3
GAP-RUST-001: Missing ParameterDescription in Extended Query Protocol
- Status: Fixed in RM-0.6.5.3.
- Summary: ParameterDescription ('t') is now sent before RowDescription/NoData for Statement Describe, unblocking sqlx and tokio-postgres EQP.
GAP-A1: Cast expressions fail in Extended Query Protocol
- Status: Fixed in RM-0.6.5.3.
- Summary: SELECT 1::int4 and arithmetic via EQP no longer cause “vector type mismatch Utf8”.
GAP-A2: date/timestamp encoding violation in pgwire
- Status: Fixed in RM-0.6.5.3.
- Summary: date and timestamp values are returned in ISO-8601 format, not as raw integers.
GAP-C1: pg_catalog.pg_index missing
- Status: Fixed in RM-0.6.5.3.
- Summary: pg_catalog.pg_index now returns an empty result set instead of an error.
GAP-C3: IS NULL / IS NOT NULL predicates not supported
- Status: Fixed in RM-0.6.5.3.
- Summary: IS NULL and IS NOT NULL work in the general query path.
GAP-C4: = ANY(ARRAY[…]) not supported
- Status: Fixed in RM-0.6.5.3.
- Summary: col = ANY(ARRAY[v1, v2, v3]) now works (desugars to IN).
GAP-C6: pg_class WHERE filter bypass
- Status: Fixed in RM-0.6.5.3.
- Summary: The pg_class WHERE relname = 'foo' filter now works correctly.
GAP-C7: CREATE SEQUENCE + nextval() catalog registration
- Status: Fixed in RM-0.6.5.3.
- Summary: CREATE SEQUENCE + nextval() works in the simple query path.
KI-2026-005: SQLSTATE leakage
- Status: Fixed in RM-0.6.5.3.
- Summary: Internal execution errors (e.g., raw 54000) are no longer leaked directly to clients and are properly mapped to standard protocol errors.
Current open issues
KI-2026-006 — SELECT ... FOR UPDATE supports only the documented single-table form
- ID: KI-2026-006
- Area: SQL / Row locking
- Severity: Medium (application compatibility)
- Affects: clients that rely on SELECT ... FOR UPDATE with joins, subqueries, NOWAIT, SKIP LOCKED, or FOR SHARE
- Repro:
  - Prepare two simple tables:
    SET search_path = public;
    CREATE TABLE t_lock_a (id INT);
    CREATE TABLE t_lock_b (id INT);
    INSERT INTO t_lock_a (id) VALUES (1);
    INSERT INTO t_lock_b (id) VALUES (1);
  - Run a locking query outside the documented single-table form:
    SELECT t_lock_a.id FROM t_lock_a JOIN t_lock_b ON t_lock_a.id = t_lock_b.id FOR UPDATE;
  - Or run a lock-clause variant:
    SELECT id FROM t_lock_a FOR UPDATE SKIP LOCKED;
- Expected: unsupported locking forms fail closed with SQLSTATE 0A000 feature_not_supported; parser-level incompatibility may fail earlier with SQLSTATE 42601 syntax_error.
- Observed: only the plain single-table FOR UPDATE path with id in projection is part of the documented contract.
- Status: Open.
- Owner: SQL / MVCC owner
- Next step: define the full lock-clause compatibility matrix before documenting additional forms as supported.
KI-2026-004 — current_setting() returned NULL for unknown keys (fixed RM-0.6.5.1)
- ID: KI-2026-004
- Area: SQL / Session claims
- Severity: Minor (operator UX)
- Affects: clients calling current_setting('app.nonexistent') and expecting a PostgreSQL-compatible error
- Repro: SELECT current_setting('app.nonexistent') on a session without that claim set
- Expected: ERROR: unrecognized configuration parameter "app.nonexistent" (SQLSTATE 42704)
- Observed: result was NULL (silent empty result), diverging from PostgreSQL semantics
- Status: Fixed (RM-0.6.5.1 Track Z item Z4). Unit test current_setting_unknown_key_returns_42704 pins the fix.
- Owner: Core team
- Next step: N/A — closed.
KI-2026-001 — WAL strict-mode latency on write-heavy workloads
- ID: KI-2026-001
- Area: WAL / Commit path
- Severity: Major (performance)
- Affects: deployments with ANGARABASE_TRANSACTION_LOG_DURABILITY=strict and frequent DML commits
- Repro:
  - Run write-heavy transactional workload (INSERT/UPDATE/DELETE) in strict durability mode.
  - Compare commit latency against the same workload with grouped flush boundaries.
- Expected: strict mode preserves durability with near-commit-boundary flush cost.
- Observed: per-record fsync amplification can inflate commit latency noticeably.
- Status: Open (tracked as TD-2026-0184, scheduled RM-0.6.4.0)
- Owner: WAL owner / Core team
- Next step: land commit-boundary fsync consolidation and re-run strict-mode benchmark pack.
KI-2026-002 — Grafana 12 UI importer fails on canonical dashboard JSON
- ID: KI-2026-002
- Area: Observability / Dashboards
- Severity: Medium (operator UX)
- Affects: manual dashboard import through Grafana 12 UI
- Repro:
  - Open Grafana 12.x and use “Import dashboard”.
  - Load one of tools/observability/grafana/*.json.
- Expected: dashboard imports via UI without manual rewriting.
- Observed: Grafana 12 UI may fail with Cannot read properties of undefined (reading 'type').
- Status: Mitigated workaround available (tools/observability/grafana/import/ bundle, tracked by TD-2026-0163)
- Owner: Observability owner
- Next step: keep canonical JSON as source of truth; for Grafana 12 manual import use the import/ bundle flow.
KI-2026-003 — No offline major-version migrator yet
- ID: KI-2026-003
- Area: Upgrade / Operations
- Severity: Medium (operational risk)
- Affects: major upgrades that require on-disk format change
- Repro:
  - Prepare data dir with one format version.
  - Attempt direct startup with incompatible major format.
- Expected: supported offline migrator path for major on-disk transitions.
- Observed: fail-closed behavior works, but migration path is still dump/restore.
- Status: Open (TD-2026-0170, linked risk R-0.6.3.11-04)
- Owner: Core team
- Next step: design and implement offline migrator contract in a dedicated train.
Statistics and ANALYZE
This section describes how to work with the statistics collection subsystem in AngaraBase.
ANALYZE
The ANALYZE command collects statistics about data distribution in tables, which the optimizer uses to build efficient query execution plans.
Drift Detection
When running ANALYZE, AngaraBase uses a drift detection mechanism to
minimize unnecessary writes to the system catalog. If the new
distinct_estimate value changed by less than 10% compared with the current
stored value, the statistics update for that column is skipped.
This avoids unnecessary I/O and cached-plan invalidation for minor data changes.
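The drift rule can be sketched in a few lines of Python. This is an illustration of the 10% rule as described above, not the engine's code: the function name is hypothetical, and two details are assumptions made for the sketch, namely that the change is measured relative to the stored value and that a missing or zero stored value always triggers an update when the new value differs.

```python
def should_update_stats(stored, fresh, drift_threshold=0.10):
    """Decide whether a new distinct_estimate should be written to the catalog."""
    if stored is None:
        return True             # no statistics yet: always persist (assumption)
    if stored == 0:
        return fresh != 0       # avoid division by zero (assumption)
    relative_change = abs(fresh - stored) / stored
    # Changes below the threshold are treated as noise and skipped.
    return relative_change >= drift_threshold


print(should_update_stats(1000, 1050))  # 5% drift
print(should_update_stats(1000, 1300))  # 30% drift
```

With these inputs the first call skips the update and the second one persists the new value.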
Column Statistics and distinct_estimate
The main metric for estimating selectivity is distinct_estimate (analogous to
n_distinct in PostgreSQL). It estimates the number of unique values in a column.
- If distinct_estimate equals the total number of rows, the column is unique.
- If distinct_estimate is small compared with the row count, the column has low cardinality.
The optimizer (CBO) uses this data to calculate predicate selectivity.
Cardinality-Aware Index Scan
Starting with version 0.6.5.2, AngaraBase uses an improved algorithm for choosing between
IndexScan and SeqScan that accounts for column cardinality.
Previously, the planner could incorrectly choose an index for columns with few unique values because of strict minimum-selectivity limits. That limit has now been removed, and CBO correctly computes cost for low-cardinality columns.
How it works:
- The planner computes filter selectivity from distinct_estimate.
- If selectivity exceeds [execution].index_cardinality_threshold (default 0.15), the planner prefers SeqScan for the corresponding gate (“low cardinality” in EXPLAIN; see also index_scan_selectivity_threshold for the “low selectivity” reason).
- EXPLAIN prints the selection reason: seq scan chosen: low cardinality (0.1328).
Example:
If a 1,000,000-row table has only 3 values in the status column,
the selectivity of the status = 'ACTIVE' filter is about 0.33. Because 0.33 > 0.15,
the database chooses a sequential scan, since reading a third of the table through
an index would be slower because of random I/O.
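The arithmetic in this example can be written down as a short Python sketch. It assumes an equality predicate and a uniform value distribution, so selectivity is simply 1 / distinct_estimate; the real cost model is likely richer, and the function name is hypothetical.

```python
def choose_scan(distinct_estimate, index_cardinality_threshold=0.15):
    """Return (scan_kind, selectivity) for an equality filter on one column."""
    if distinct_estimate <= 0:
        raise ValueError("distinct_estimate must be positive")
    # Assumption: values are uniformly distributed, so each one matches
    # roughly 1/distinct_estimate of the rows.
    selectivity = 1.0 / distinct_estimate
    if selectivity > index_cardinality_threshold:
        return "SeqScan", selectivity
    return "IndexScan", selectivity


kind, sel = choose_scan(3)            # status column with 3 distinct values
print(kind, round(sel, 2))
kind, sel = choose_scan(1_000_000)    # unique column
print(kind, sel)
```

For the three-value status column the selectivity is about 0.33, above the 0.15 default, so the sketch picks a sequential scan; for a unique column it picks the index.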
Multicolumn Statistics
Multicolumn statistics allow the optimizer to account for correlation between several columns, which is critical for complex predicates.
New in 0.6.4.18: Multicolumn statistics collected by ANALYZE are now persisted to disk (sys_catalog snapshot protocol v4). Statistics survive server restarts. To force a fresh collection, run ANALYZE <table> again.
Viewing Statistics
Statistics are available through system views (sys_catalog).
AngaraStream
AngaraStream provides Change Data Capture (CDC) capabilities for AngaraBase.
start_offset semantics
The start_offset parameter controls which events are delivered to a new subscription.
| Value | Behaviour |
|---|---|
| 'latest' (default) | Skips historical events. Only new events published after the subscription is created are delivered. |
| 'earliest' | Delivers all events currently in the buffer, starting from the oldest retained event. |
Buffer overflow with earliest
If the internal event buffer has been partially evicted before 'earliest' is processed,
AngaraBase returns StreamGapError. In this case, drop the subscription and recreate it
(the gap cannot be recovered without a re-seed).
-- recreate after StreamGapError
SELECT angara_stream_unsubscribe('my-sub');
SELECT angara_stream_subscribe('my-topic', 'my-sub', start_offset => 'earliest');
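The interaction between start_offset and buffer eviction can be modeled with a small in-memory sketch. Everything here is hypothetical (the EventBuffer class, its methods, and the offset numbering); it only illustrates why a subscriber positioned at an evicted offset cannot continue and has to be recreated.

```python
from collections import deque


class StreamGapError(Exception):
    """Raised when a subscriber's position has already been evicted."""


class EventBuffer:
    def __init__(self, capacity):
        self._events = deque()   # (offset, payload) pairs, oldest first
        self._capacity = capacity
        self._next_offset = 0    # offset the next published event receives

    def publish(self, payload):
        self._events.append((self._next_offset, payload))
        self._next_offset += 1
        if len(self._events) > self._capacity:
            self._events.popleft()  # evict the oldest retained event

    def oldest_offset(self):
        return self._events[0][0] if self._events else self._next_offset

    def subscribe(self, start_offset="latest"):
        if start_offset == "latest":
            return self._next_offset       # only events published from now on
        if start_offset == "earliest":
            return self.oldest_offset()    # oldest event still retained
        raise ValueError(f"unknown start_offset: {start_offset}")

    def read(self, cursor):
        if cursor < self.oldest_offset():
            raise StreamGapError(f"offset {cursor} was evicted")
        return [(off, p) for off, p in self._events if off >= cursor]


buf = EventBuffer(capacity=3)
buf.publish("a")
buf.publish("b")
cursor = buf.subscribe("earliest")
for payload in ("c", "d"):
    buf.publish(payload)        # publishing "d" evicts offset 0
try:
    buf.read(cursor)
except StreamGapError as exc:
    print("gap:", exc)          # the subscription must be recreated
```

Sizing the buffer for the slowest expected consumer reduces how often such gaps occur.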
Checkpoint Operations
Checkpoint is the process of flushing dirty pages from memory to disk and synchronizing WAL, ensuring data durability and speeding up recovery after failures.
CheckpointWorker
The checkpoint process is managed by the background worker CheckpointWorker.
New in 0.6.4.18: CheckpointWorker has been refactored. The checkpoint logic is now consolidated in storage::CheckpointWorker and exposed via the CheckpointEngineOps trait. The server-side entry point is a thin wrapper. Legacy behaviour is preserved via run_legacy() and activated automatically for heap-primary tables.
Configuration
Checkpoint parameters are configured in angarabase.toml in the [storage.checkpoint] section.
Invariant Registry — Engineer Guide
Canonical registry:
Lint tool: tools/lint/invariant_refs.py
Introduced: RM-0.6.4.11
AngaraBase Invariant Registry makes key architectural guarantees machine-traceable: each invariant links ID → RFC/spec → tests → metrics → evidence. This prevents drift between documentation, code, and observability.
Current Invariants (seed v0)
| ID | Subsystem | Risk | Status |
|---|---|---|---|
| INV-MVCC-SNAPSHOT-VISIBILITY | tx_overlay | critical | active |
| INV-WAL-DURABLE-AFTER-FSYNC | wal | critical | active |
| INV-RECOVERY-REDO-UNDO-CLR | recovery | critical | active |
| INV-RESOURCE-FAIL-CLOSED | storage/backpressure | high | active |
| INV-SPILL-RECURSION-OVERFLOW-53400 | query/spill | high | active |
When INV-ID Is Required in Code
When changing files in the following paths, include a // INV-<ID> comment
or add the file to the scoped allowlist:
| Path | Required invariant |
|---|---|
| crates/angarabase/src/tx_overlay/ | INV-MVCC-SNAPSHOT-VISIBILITY |
| crates/angarabase/src/wal/ | INV-WAL-DURABLE-AFTER-FSYNC |
| crates/angarabase/src/recovery.rs | INV-RECOVERY-REDO-UNDO-CLR |
| crates/angarabase/src/storage/backpressure/ | INV-RESOURCE-FAIL-CLOSED |
| crates/angarabase/src/query/spill/ | INV-SPILL-RECURSION-OVERFLOW-53400 |
Example code comment
// INV-WAL-DURABLE-AFTER-FSYNC: UndoAppend must be WAL-durable before PageDelta is written.
// See: RFC-2026-073 §4.Y — ordering invariant (UNDO-before-heap).
assert!(
    undo_lsn < page_delta_lsn,
    "WAL ordering invariant violated: undo_lsn={} >= page_delta_lsn={}",
    undo_lsn, page_delta_lsn
);
How to Add a New Invariant
Step 1 — Add an entry to invariants.yaml
- id: INV-<CATEGORY>-<NAME>
  title: "Human-readable name"
  surface: <module_path>            # e.g. tx_overlay, wal, query/spill
  risk_level: critical              # critical | high | medium
  owner: <team>                     # e.g. storage-team, query-team
  spec_refs:
    - rfcs/RFC-YYYY-NNN-name.md     # path relative to repo root docs/rfcs/
  test_refs:
    - crates/angarabase/src/<path>/tests.rs
  metric_refs:
    - angarabase_<metric_name>      # Prometheus metric name (not a file path)
  evidence_refs: []                 # or list of pinned evidence paths
  status: active                    # or pending_evidence if refs TBD
Step 2 — If refs are not implemented yet
Use status: pending_evidence and add target_rm:
status: pending_evidence
target_rm: RM-0.6.5.0 # train that will create tests/metrics
Lint does not check file refs for pending_evidence invariants.
Step 3 — Run lint
python3 tools/lint/invariant_refs.py --check invariants.yaml
# → exit 0: OK
Step 4 — Add a comment in code
Find the assertion/guard in source code and add the // INV-<ID> comment nearby.
Step 5 — Commit
tools/dev/git-commit-safe.sh -m "NEW(invariants): add INV-<ID> to registry" \
-- invariants.yaml
Schema Fields
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | ✓ | Unique ID, starts with INV- |
| title | string | ✓ | Short human-readable name |
| surface | string | ✓ | Module/subsystem |
| risk_level | enum | ✓ | critical / high / medium |
| owner | string | ✓ | Owning team |
| spec_refs | list[path] | ✓ | Paths to RFC/spec (checked by lint) |
| test_refs | list[path] | ✓ | Paths to test files (checked by lint) |
| metric_refs | list[string] | ✓ | Prometheus metrics (names) |
| evidence_refs | list[path] | ✓ | Pinned evidence (checked by lint; [] = none) |
| status | enum | ✓ | active / pending_evidence |
| target_rm | string | for pending_evidence | Train that closes the gap |
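The schema rules in the table can be checked mechanically. The sketch below is a hypothetical, simplified validator written for illustration; it is not the code of tools/lint/invariant_refs.py, it does not resolve file paths, and the error message wording is invented.

```python
REQUIRED_FIELDS = (
    "id", "title", "surface", "risk_level", "owner",
    "spec_refs", "test_refs", "metric_refs", "evidence_refs", "status",
)
LIST_FIELDS = ("spec_refs", "test_refs", "metric_refs", "evidence_refs")
RISK_LEVELS = {"critical", "high", "medium"}
STATUSES = {"active", "pending_evidence"}


def validate_entry(entry):
    """Return a list of schema errors for one registry entry (empty = valid)."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in entry]
    if "id" in entry and not str(entry["id"]).startswith("INV-"):
        errors.append("id must start with INV-")
    if "risk_level" in entry and entry["risk_level"] not in RISK_LEVELS:
        errors.append(f"unknown risk_level: {entry['risk_level']}")
    if "status" in entry and entry["status"] not in STATUSES:
        errors.append(f"unknown status: {entry['status']}")
    for name in LIST_FIELDS:
        if name in entry and not isinstance(entry[name], list):
            errors.append(f"{name} must be a list")
    # target_rm is only required while evidence is still pending.
    if entry.get("status") == "pending_evidence" and "target_rm" not in entry:
        errors.append("target_rm is required when status is pending_evidence")
    return errors


entry = {
    "id": "INV-WAL-DURABLE-AFTER-FSYNC",
    "title": "WAL durable after fsync",
    "surface": "wal",
    "risk_level": "critical",
    "owner": "storage-team",
    "spec_refs": [], "test_refs": [], "metric_refs": [], "evidence_refs": [],
    "status": "pending_evidence",
}
print(validate_entry(entry))  # reports the missing target_rm
```

A check like this is cheap to run before the real lint, for example in a pre-commit hook.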
Running Lint in CI
Lint runs automatically through docs/validate-docs.sh:
bash docs/validate-docs.sh
# → ✅ Invariant registry valid (5 invariant(s) checked, 0 errors)
Standalone:
python3 tools/lint/invariant_refs.py --check invariants.yaml
# exit 0 → OK
# exit 1 → broken ref or schema error (blocks CI)
Related Documents
Project Health Dashboard
Project Health Dashboard helps the Tech Lead track project health as both a product and a repository. This is separate from runtime/DBA observability: it analyzes trains, risks, tech debt, LOC, and quality gates.
What to Use It For
- Quickly understand which train is active and which was closed most recently.
- See problem areas: active Red/Amber risks and open High/Critical debt.
- Find the largest files (top-10 LOC) and src_loc_guard risk.
- Check quality gate status from evidence without manually walking logs.
- Assess architectural discipline through invariant signals.
Data Source
Metrics are generated by the script:
python3 tools/observability/export_metrics.py --format prometheus --out artifacts/tmp/project_health.prom
Key sources:
- PROJECT_STATUS.md
- latest gate artifacts in docs/planning/evidence/release_trains/<RM-ID>/ (if present)
Metrics and Meaning
- angara_project_release_health — release context (active/closed/planning).
- angara_project_active_risks — number of active risks.
- angara_project_tech_debt_by_component — debt by component/severity.
- angara_project_top_loc_files — top LOC + threshold markers.
- angara_project_gate_status — PASS/FAIL/unknown quality gates.
- angara_project_architecture_health — invariants total + missing links.
Grafana dashboard
- Title: AngaraBase Project Health
- UID: angarabase-project-health-v0
- Path: tools/observability/grafana/angarabase-project-health.json
The dashboard intentionally does not include runtime panels (QueryStore/pgwire/io_uring/WAL runtime): this is the project governance level.
Architecture Overview
This document is a map of the AngaraBase architecture as-is: what major subsystems exist, how an SQL query flows through them, and where the boundaries of responsibility lie. For a user-facing introduction, see AngaraBase Architecture.
High-Level Components
| Component | What it does |
|---|---|
| angarabased | Server adapter: pgwire protocol, listener, connection and session management |
| angarabase (engine core) | Parse/bind/plan/execute, transactions, storage API, WAL/recovery primitives |
| angara-cli | CLI for administration (identity, ops via admin endpoint) |
| Operational surface | Configuration, metrics, logs, diagnostic bundles, upgrade policies |
Full layering contract and dependency rules: Layering and Boundaries.
Query Flow (Simplified)
flowchart LR
C[Client/Driver] -- pgwire --> S[angarabased adapter]
S -- SQL + session ctx --> E[angarabase engine core]
E --> P[Parse / Bind / Plan / Execute]
P --> Sec[Security: RBAC + RLS]
P --> T[Txn / MVCC]
P --> St[Storage API]
P --> Stat[Stats / CBO feedback]
T --> Wal[WAL / Recovery]
St --> Wal
Wal --> IO[IO / fsync contract]
E -- rows / errors --> S
S -- pgwire responses --> C
Key Architectural Decisions
| Area | Decision | Why |
|---|---|---|
| MVCC | UNDO-log (history is a separate append-only log; heap contains only current versions) | Less bloat, no heavy VACUUM, deterministic GC |
| Storage | Pluggable: row-store baseline + AngaraMemory; AngaraColumn in roadmap | HTAP direction, different tiers for different workloads |
| Recovery | WAL-first, idempotent replay, fail-closed on lack of WAL integrity | Correctness is more important than latency |
| Optimizer | Cost-based AngaraPlan + LEO feedback loop, robust planning | Resilience to estimation errors |
| Execution | Volcano streaming (AngaraFlow) + vector path (AngaraVector) | Separation by plan shapes, explicit management via EXPLAIN |
| Catalog | Persisted SysCatalog, DDL survives restart | Predictability for production |
| Security | 6-layer model: TLS/Auth → RBAC → RLS → Break-glass → Audit chain → TDE | Defence-in-depth, fail-closed |
| Backup | Per-database, cold + online/PITR baseline | Multi-tenant isolation |
| Distribution | Single-node engine; distributed SQL is on the horizon of major branches | Concentration on correctness first |
Boundaries and Invariants
- angarabased (adapter) does not contain SQL logic — only pgwire framing, session ctx, and routing to the core.
- angarabase core does not know about pgwire — it communicates via the core API contract.
- Storage does not perform MVCC visibility — only heap I/O. Visibility is computed by the MVCC layer.
- Index does not determine visibility — it only points to the TID; visibility is always rechecked against the heap.
- Any unsupported SQL construct returns an explicit SQLSTATE (0A000, etc.) — no silent bypasses.
- Public API: pgwire + admin endpoint. Internal modules are an implementation detail and may change.
Architectural constraints and do-not-block rules: Architecture Constraints.
Reliability and Physical Portability
- Cold/offline backup and restore — full-instance copy at the data-directory level (see Backup/Restore).
- Host migration — without pg_dump/pg_restore: copy + verify + start. More details in Crash recovery.
- Identity rehearsal — every release goes through the rehearsal upgrade pipeline.
- Page checksums + WAL CRC — corruption detection upon reading/recovery.
Additional Resources
- Layering and Boundaries — official layering contract.
- AngaraBase Architecture (user-facing) — overview for users and DBAs.
- Project Principles — the ideological compass of the project.
Table Partitioning
AngaraBase supports declarative partitioning (RFC-2026-097 v1): RANGE and LIST strategies with DEFAULT catch-all.
Supported Strategies
| Strategy | Syntax | When to Use |
|---|---|---|
| RANGE | PARTITION BY RANGE (col) | Time series, ID ranges |
| LIST | PARTITION BY LIST (col) | Categorical values (region, type) |
| DEFAULT | PARTITION OF parent DEFAULT | Catch-all for rows outside of all ranges |
DDL
Creating a Partitioned Table
Use the PARTITION BY clause to create a partitioned table.
CREATE TABLE orders (
id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
amount_usd BIGINT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
month_ts BIGINT NOT NULL,
CONSTRAINT orders_pkey PRIMARY KEY (id, month_ts)
) PARTITION BY RANGE (month_ts);
Attaching a Child Partition
Child partitions are created by specifying the parent table and the range of values (for RANGE) or list of values (for LIST).
-- RANGE partition for January 2025
CREATE TABLE orders_p2025_01 PARTITION OF orders
FOR VALUES FROM (1735689600) TO (1738368000);
-- RANGE partition for February 2025
CREATE TABLE orders_p2025_02 PARTITION OF orders
FOR VALUES FROM (1738368000) TO (1740787200);
-- DEFAULT partition (catch-all)
CREATE TABLE orders_p_default PARTITION OF orders DEFAULT;
DML Through the Parent
INSERT
An INSERT into the parent is automatically routed to the appropriate child partition based on the partition key value.
-- Row will be routed to the appropriate month-partition
INSERT INTO orders (id, user_id, amount_usd, month_ts)
VALUES (1, 42, 9900, 1735689601);
No match error: if a row does not fall into any range and there is no DEFAULT partition, the INSERT fails with SQLSTATE 23514 (check_violation).
ON CONFLICT constraint: ON CONFLICT is not supported on a partitioned parent — returns a feature_not_supported error (SQLSTATE 0A000).
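The routing rule described in this section (match a child range, otherwise fall back to DEFAULT, otherwise fail with 23514) can be modeled in a few lines. This is an illustrative sketch, not the engine's code: the class and function names are hypothetical, and it assumes FROM is inclusive and TO is exclusive, which is what the adjacent bounds in the DDL examples suggest.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


class CheckViolation(Exception):
    """Models SQLSTATE 23514 (check_violation)."""
    sqlstate = "23514"


@dataclass(frozen=True)
class RangePartition:
    name: str
    lower: int  # inclusive bound (FROM), an assumption of this sketch
    upper: int  # exclusive bound (TO), an assumption of this sketch


def route(key: int, children: Sequence[RangePartition],
          default: Optional[str] = None) -> str:
    """Return the name of the child partition that should receive `key`."""
    for child in children:
        if child.lower <= key < child.upper:
            return child.name
    if default is not None:
        return default  # catch-all DEFAULT partition
    raise CheckViolation(f"no partition found for key {key}")


children = [
    RangePartition("orders_p2025_01", 1735689600, 1738368000),
    RangePartition("orders_p2025_02", 1738368000, 1740787200),
]
print(route(1735689601, children, default="orders_p_default"))  # orders_p2025_01
```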
SELECT Through the Parent (UNION ALL Expansion + Pruning)
A SELECT from the parent is automatically expanded into a UNION ALL across all children. If the WHERE clause contains a condition on the partition key, partitions that cannot contain matching rows are skipped (pruning).
-- Scans only the partition for January 2025
SELECT * FROM orders WHERE month_ts >= 1735689600 AND month_ts < 1738368000;
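Pruning for a range predicate amounts to an interval-overlap test between the query range and each child's bounds. The sketch below is a simplified model under the same half-open-bounds assumption as the DDL examples; the function name is hypothetical, and how the engine treats the DEFAULT partition during pruning is not modeled here.

```python
from typing import List, Tuple

# (name, lower inclusive, upper exclusive)
Partition = Tuple[str, int, int]


def prune(children: List[Partition], lo: int, hi: int) -> List[str]:
    """Names of children whose [lower, upper) overlaps the query range [lo, hi)."""
    return [name for name, lower, upper in children if lower < hi and lo < upper]


children = [
    ("orders_p2025_01", 1735689600, 1738368000),
    ("orders_p2025_02", 1738368000, 1740787200),
]
# WHERE month_ts >= 1735689600 AND month_ts < 1738368000
print(prune(children, 1735689600, 1738368000))  # ['orders_p2025_01']
```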
UPDATE Through the Parent
An UPDATE on non-partition key columns works via a fan-out to all (or pruned) children.
UPDATE orders SET status = 'shipped' WHERE id = 1 AND month_ts = 1735689601;
Limitation: Updating the partition key column is forbidden — returns SQLSTATE 23514. Cross-partition row movement is not supported in v1.
DELETE Through the Parent
A DELETE with a WHERE clause on the partition key applies pruning and deletes only from the matching children.
DELETE FROM orders WHERE month_ts = 1735689601 AND id = 1;
Monitoring
| Metric | Description |
|---|---|
| angarabase_partition_route_ok_total | Successful INSERT routing to a child |
| angarabase_partition_route_no_match_total | INSERT without a matching partition (→ 23514) |
| angarabase_partition_route_default_total | INSERT into the DEFAULT partition |
| angarabase_partition_pruned_branches_total | Children skipped during SELECT/DML |
Limitations v1
- Hash partitioning — not supported (planned for v0.7)
- Subpartitioning / multi-column partition key — not supported (v0.7)
- Cross-partition UPDATE — forbidden (explicit error 23514)
- ON CONFLICT on a parent table — not supported
- REINDEX through parent — not supported
Troubleshooting
| Symptom | Cause | Solution |
|---|---|---|
| ERROR 23514 check_violation during INSERT | Partition key value does not fall into any range | Add a DEFAULT partition or check the value |
| ERROR 23514 during UPDATE | Attempting to modify the partition key column | Do not modify the partition key; use DELETE + INSERT instead of UPDATE |
| ERROR 0A000 ON CONFLICT not supported | ON CONFLICT through the partitioned parent | Perform the INSERT directly into the child partition |
Index Durability
Overview
AngaraBase ensures that all indexes, including PRIMARY KEY and secondary BTree indexes, are persistent and survive system restarts or crashes. Indexes are managed by the IndexStore component, which handles the allocation of unique table identifiers (index_table_id) and coordinates with the Checkpoint worker to flush index pages to persistent storage.
There is one deliberate exception: indexes over volatile InMemory tables (storage='memory', durability='none'),
including CREATE TEMP TABLE, are catalog-visible for the session/query planner but do not allocate a persistent
index_table_id. They are rebuilt from the live in-memory heap when needed and disappear with the table.
Durability pipeline (PRIMARY KEY)
The durability of PRIMARY KEY indexes is guaranteed through a multi-stage process that integrates with the system catalog and the checkpointing mechanism:
- Table Creation: When a CREATE TABLE ... PRIMARY KEY statement is executed, the DDL executor allocates a unique index_table_id using the storage engine.
- Catalog Persistence: The Primary Key definition is saved in the SysCatalog with its assigned index_table_id. This ensures that the mapping between the table and its index is persistent.
- Checkpoint Integration: The Checkpoint worker periodically calls flush_all_indexes(). This operation forces all dirty index pages from memory to disk.
- Recovery and Restoration: Upon system restart, the recovery process reads the index definitions from the catalog and restores the index state from the persisted pages on disk.
- Startup Backfill (Legacy Migration): For legacy databases where PRIMARY KEY indexes were created without an index_table_id, AngaraBase performs an asynchronous background backfill after the database begins accepting connections (to avoid blocking startup). This process identifies legacy PKs, allocates the missing index_table_id, updates the catalog, and persists the index pages.
Durability pipeline (secondary indexes)
Secondary indexes follow a “durable-by-default” path during creation and maintenance:
- Immediate Persistence: During CREATE INDEX, the system immediately forces a synchronous flush of index pages to disk after the index build is complete. This ensures that the index is durable even if a crash occurs before the next checkpoint.
- Periodic Flushing: Similar to Primary Keys, secondary indexes are included in the periodic flush_all_indexes() calls by the Checkpoint worker.
Volatile InMemory indexes
InMemory tables with durability='none' use a volatile index path. This includes temporary tables, because
CREATE TEMP TABLE is forced to memory + durability=none even if the statement includes a durable storage hint.
Behavior:
- CREATE INDEX succeeds on volatile InMemory tables.
- The index definition is visible in the catalog while the table exists.
- index_table_id is intentionally absent, because there are no persistent index pages to restore.
- The executor can still use a B-Tree/index-scan path by building an ephemeral index from current in-memory rows.
- On session disconnect or temp table cleanup, the table and its volatile index metadata are removed together.
Example:
SET search_path = public;
CREATE TEMP TABLE tt_items (
id INT,
code mvarchar(16)
);
INSERT INTO tt_items (id, code) VALUES (1, 'AbC ');
CREATE INDEX idx_tt_items_code ON tt_items (code);
-- Uses mvarchar equality semantics and can be served through the volatile index path.
SELECT id FROM tt_items WHERE code = 'abc';
This design keeps temp-table workloads fast and avoids WAL/checkpoint pressure, while preserving the same
comparison contract as durable B-Tree indexes. In particular, mvarchar index keys are normalized with the same
case-insensitive and trailing-space-insensitive rules used by expression evaluation.
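To make the comparison contract concrete, here is a toy model of an ephemeral index whose keys are normalized before insertion and before lookup. It is only a sketch: the exact mvarchar normalization rules (for example Unicode case folding, or which whitespace characters count as trailing) are not specified in this section, so lowercasing plus stripping trailing ASCII spaces is an assumption, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize_key(value: str) -> str:
    """Assumed normalization: ignore trailing spaces and letter case."""
    return value.rstrip(" ").lower()


def build_ephemeral_index(rows: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Map normalized code -> row ids, rebuilt from the live in-memory rows."""
    index: Dict[str, List[int]] = defaultdict(list)
    for row_id, code in rows:
        index[normalize_key(code)].append(row_id)
    return index


index = build_ephemeral_index([(1, "AbC ")])
# Equivalent of: SELECT id FROM tt_items WHERE code = 'abc';
print(index.get(normalize_key("abc"), []))  # [1]
```

The point of the model is that the same normalization function is applied on both the build side and the probe side, so index lookups and plain expression evaluation agree on equality.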
Checkpoint fail-closed semantics
AngaraBase employs a fail-closed approach to index durability during the checkpoint process. The flush_all_indexes() operation returns a success status:
- Success: If all indexes are successfully flushed, the checkpoint continues, and the end_checkpoint record is written to the WAL.
- Failure: If flush_all_indexes() returns false (indicating a flush error), the checkpoint is aborted. The end_checkpoint record is not written.
In the event of an aborted checkpoint, the next system startup will trigger a WAL replay starting from the last successfully completed checkpoint, ensuring that no index data is lost or left in an inconsistent state. The metric angarabase_checkpoint_index_flush_errors_total tracks these occurrences.
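The fail-closed rule can be summarized as "no successful index flush, no end_checkpoint record". The following sketch models only that control flow. The WAL is represented as a plain list and the metrics registry as a Counter, both of which are stand-ins chosen for illustration rather than the engine's real types.

```python
from collections import Counter
from typing import Callable, List

FLUSH_ERRORS = "angarabase_checkpoint_index_flush_errors_total"


def run_checkpoint(flush_all_indexes: Callable[[], bool],
                   wal: List[str], metrics: Counter) -> bool:
    """Write end_checkpoint only if every index page was flushed."""
    if not flush_all_indexes():
        # Fail closed: abort, count the error, leave the WAL untouched so that
        # recovery replays from the last successfully completed checkpoint.
        metrics[FLUSH_ERRORS] += 1
        return False
    wal.append("end_checkpoint")
    return True


wal: List[str] = []
metrics: Counter = Counter()
run_checkpoint(lambda: False, wal, metrics)  # aborted, nothing written
run_checkpoint(lambda: True, wal, metrics)   # completed
print(wal, metrics[FLUSH_ERRORS])  # ['end_checkpoint'] 1
```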
Monitoring
You can monitor the status of index durability and backfill operations using the following Prometheus metrics:
| Metric | Type | Description |
|---|---|---|
| angarabase_pkey_backfill_in_progress | Gauge | 1 while startup PK backfill is running |
| angarabase_pkey_backfill_ok_total | Counter | PK indexes successfully backfilled with TableId |
| angarabase_pkey_backfill_fail_total | Counter | PK backfill failures (allocation or persist error) |
| angarabase_checkpoint_index_flush_errors_total | Counter | Checkpoint aborted due to index flush failure |
| angarabase_index_pkey_no_table_id_total | Gauge | Legacy PK indexes found at startup (before backfill) |
| angarabase_index_restore_empty_total | Counter | Indexes that restored empty after restart (Alert if > 0) |
| angarabase_wal_lsn_drift_resets_total | Counter | WAL VLF LSN drift resets after crash (Alert if > 0) |
| angarabase_index_stale_tuple_fallbacks_total | Counter | UPDATE fallbacks to seq-scan due to stale index tuple |
Alerting Guidance & PromQL
- Empty Indexes: rate(angarabase_index_restore_empty_total[5m]) > 0
  - Threshold: > 0 is critical. Indicates data loss or corruption in index persistence.
- WAL Drift: rate(angarabase_wal_lsn_drift_resets_total[5m]) > 0
  - Threshold: > 0 is critical. Indicates WAL corruption or crash-safety failure.
- Stale Tuples: rate(angarabase_index_stale_tuple_fallbacks_total[5m]) > 0.1
  - Threshold: Occasional fallbacks are normal, but a high rate indicates index corruption or MVCC issues.
Troubleshooting
“After restart the index is empty (entries=0)”
If an index appears empty after a restart despite having data previously (or if angarabase_index_restore_empty_total > 0), check the following:
- Legacy PKs: Check angarabase_index_pkey_no_table_id_total. If it is greater than 0, the backfill might still be in progress (wait for completion) or have failed (check logs for allocation errors).
- Checkpoint Failures: Verify if angarabase_checkpoint_index_flush_errors_total is incrementing. A failure to flush indexes prevents the checkpoint from completing. Check disk space and permissions.
- Logs: Inspect system logs for WAL replay errors or messages indicating that index pages could not be restored.
“Checkpoint loop doesn’t complete”
If the checkpoint process seems stuck or keeps restarting:
- Flush Errors: Check the angarabase_checkpoint_index_flush_errors_total metric.
- Worker Logs: Look for checkpoint_worker: flush_all_indexes failed in the logs. This indicates that the index store is unable to persist pages, possibly due to disk space issues or I/O errors.
“WAL LSN drift resets > 0”
If angarabase_wal_lsn_drift_resets_total is incrementing:
- This indicates that after a crash (e.g., kill -9), the WAL head LSN was out of sync with the actual physical WAL size, and the system had to reset it.
- Check the logs for append_commit FAILED or WAL VLF LSN drift messages. While the system recovers automatically, frequent occurrences might indicate underlying storage fsync issues.
“High rate of stale tuple fallbacks”
If angarabase_index_stale_tuple_fallbacks_total is spiking:
- This means UPDATE operations are finding index entries that point to non-existent or stale tuples, forcing a slow sequential scan fallback.
- This can happen during heavy concurrent UPDATEs. If it persists, consider REINDEXing the affected table.
SQL examples
-- Verify index usage and check for stale index fallbacks
EXPLAIN SELECT * FROM public.your_table WHERE id = 123;
-- If the query plan shows [stale_index], it means the index entry was stale
-- and the engine fell back to a sequential scan.
NOTE: System catalog tables for index metadata (angara_sys.indexes and angara_sys.index_stats) are not part of the current user-facing SQL surface. Use EXPLAIN to verify index usage.
Architecture Layers and Dependency Governance
The layering model is stabilized (active). Changes are made only via RFC.
Goal: To have a simple, verifiable model of layers and dependencies, so that changes do not blur the boundaries.
Layers (Working Model)
- Core engine (angarabase): SQL semantics, transactions, storage API, WAL/recovery primitives.
- Adapters (angarabased, future): pgwire/HTTP/admin surfaces, protocol transformation.
- Tooling (angara-cli, tools/*): utilities, runners, test harnesses.
- Distribution/Packaging tooling (packaging/*, tools/release/*): release artifacts, signatures, package manifests, publication scripts.
- Operational surfaces: configuration, runbooks, policies, evidence packs.
Dependency Rule (Most Important)
- Core does not depend on adapters and tooling.
- Adapters may depend on core.
- Tooling may depend on core and adapters (but without backward hooks into core).
- Distribution/Packaging tooling may depend on tooling/core artifacts and does not introduce runtime dependencies in core/adapters.
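A dependency rule like this is easy to verify mechanically. The sketch below encodes the four rules as an allow-list and reports any edge that violates it. The layer keys and the find_violations helper are hypothetical names for illustration; a real check would derive the edges from the actual build metadata, which is outside the scope of this example.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Which layers each layer is allowed to depend on (taken from the rules above).
ALLOWED: Dict[str, Set[str]] = {
    "core": set(),                    # core depends on nothing above it
    "adapters": {"core"},
    "tooling": {"core", "adapters"},
    "packaging": {"core", "tooling"},
}


def find_violations(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (from_layer, to_layer) edges that break the layering rule."""
    return [
        (src, dst)
        for src, dst in edges
        if src != dst and dst not in ALLOWED[src]
    ]


edges = [
    ("adapters", "core"),
    ("tooling", "adapters"),
    ("core", "adapters"),  # backward dependency: not allowed
]
print(find_violations(edges))  # [('core', 'adapters')]
```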
Visual Diagram
flowchart TB
  subgraph Core["Core engine (angarabase)"]
    CoreAPI[engine public API]
  end
  subgraph Adapters["Adapters (angarabased, pgwire, ...)"]
    Pgwire[pgwire]
    Admin[admin surfaces]
  end
  subgraph Tooling["Tooling (angara-cli, tools/*)"]
    CLI[angara-cli]
    Runners[test/bench runners]
  end
  Adapters --> Core
  Tooling --> Core
  Tooling --> Adapters
  CoreAPI --> Core
Related Documents
- Do-not-block constraints: angarabook/src/architecture/constraints.md
- Engine core map: angarabook/src/architecture/components/engine_core.md
- Governance RFC: RFC-2026-449-architecture-layering-and-dependency-governance-v0
Known issues (testing)
This is a user-facing “summary”. The canonical list is: angarabook/src/operations/known-issues.md.
SQLSTATE quick reference
| SQLSTATE | Name | Context |
|---|---|---|
| 0A000 | feature_not_supported | Unsupported SQL forms, complex RLS predicates in IR mode, server-side predicates on client-encrypted columns (randomized), unsupported RLS mask expressions |
| 22023 | invalid_parameter_value | Invalid stats_level_max values, invalid SET BREAK_GLASS ... TTL=... values |
| 23514 | check_violation | Insert into partitioned parent with no matching child/DEFAULT partition |
| 25001 | active_sql_transaction | SET SESSION CONTEXT inside an active transaction |
| 42501 | insufficient_privilege | User/role/policy/break-glass actions without required roles, protected SQL without SecurityContext |
| 42809 | wrong_object_type | DML on append-only tables, disabling append-only on child while parent is append-only |
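On the client side, the table above can be turned into a small lookup so that documented, expected errors are handled differently from genuinely unexpected ones. This is a sketch of one possible approach: the dictionary simply mirrors the table, and the helper name is hypothetical. How the SQLSTATE is obtained from an exception depends on the driver and is not shown.

```python
from typing import Optional

# Mirrors the quick-reference table: SQLSTATE -> condition name.
EXPECTED_SQLSTATES = {
    "0A000": "feature_not_supported",
    "22023": "invalid_parameter_value",
    "23514": "check_violation",
    "25001": "active_sql_transaction",
    "42501": "insufficient_privilege",
    "42809": "wrong_object_type",
}


def documented_condition(sqlstate: str) -> Optional[str]:
    """Return the condition name if the code is a documented, expected error."""
    return EXPECTED_SQLSTATES.get(sqlstate.upper())


print(documented_condition("23514"))  # check_violation
print(documented_condition("XX000"))  # None
```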
General
pg_database probe (KI-2026-001)
Some pg_database queries may lead to a stall/hang.
See details and repro: angarabook/src/operations/known-issues.md → KI-2026-001.
Checksum verification
- If on-disk page verification fails, read path fails closed with explicit checksum mismatch diagnostics.
- Quick operator probe:
angara-cli storage verify-pages --dir <data_dir> --json
SQL behavior
Unsupported SQL forms
0A000 feature_not_supported is expected for currently out-of-scope SQL forms:
- WITH RECURSIVE
- Set operations (UNION/INTERSECT/EXCEPT)
- Window functions
- ORDER BY ... NULLS FIRST|LAST
Partitioning
23514 check_violation is expected when inserting into a partitioned parent table and no child partition (and no DEFAULT partition) matches the row partition key.
Append-only tables
42809 wrong_object_type is expected for:
- UPDATE on an append-only table,
- DELETE on an append-only table,
- Attached child partition trying to disable append-only while parent remains append-only.
Stats parameter validation
22023 invalid_parameter_value is expected for invalid stats_level_max values outside [0..3].
Statistics
HLL NDV tracking
HLL NDV tracking for TEXT columns skips very long values (len > 256) by design.
This keeps memory and merge cost bounded for streaming stats.
Security behavior
Privilege enforcement
- 42501 insufficient_privilege is expected when user/role/policy/break-glass actions are executed without required roles.
- 22023 invalid_parameter_value is expected for invalid SET BREAK_GLASS ... TTL=... values (missing/zero/exceeds max TTL).
- 25001 active_sql_transaction is expected for SET SESSION CONTEXT inside an active transaction.
Session security context
- 42501 insufficient_privilege is expected in scram/cert modes when protected SQL executes without a session SecurityContext.
- 0A000 feature_not_supported is expected in IR mode for unsupported complex RLS predicates (bounded fail-closed planner/IR contract).
Client encryption and RLS masks
- 0A000 feature_not_supported is expected for unsupported server-side predicates on client-encrypted columns in randomized mode.
- 0A000 feature_not_supported is expected for unsupported RLS mask expressions outside v1 bounded forms (partial, nullify).
What’s Next
- Support and Bug Report Artifact Collection — how to report a new issue if yours is not in the list.
- Client Compatibility — which clients are tested as “works as documented”.
- SQL Compatibility Overview — which SQL constructs return 0A000 and why.
Glossary
A unified reference of AngaraBase terms. Terms are listed in alphabetical order.
A
AngaraAdapt — adaptive query processing subsystem (v5). Automatic plan correction based on runtime feedback: if the actual cardinality differs significantly from the estimate, the plan is recalculated on the fly.
AngaraFlow — streaming query execution subsystem. Implements the Volcano model (iterator model) with scan, filter, join, sort, and aggregate operators. Each operator requests the next batch of data from its child operator.
AngaraGC — MVCC garbage collection subsystem. Uses an epoch-based watermark to determine the safe boundary for cleanup. Deletes row versions invisible to any active transaction.
AngaraIO — asynchronous I/O subsystem. Uses io_uring for storage and WAL operations, minimizing system calls and context switches.
AngaraMemory — in-memory storage engine (v5). Supports three operation modes: volatile (data only in memory), logged (with WAL writing), and snapshotted (with periodic disk snapshots).
AngaraNet — io_uring-based network I/O subsystem (v5). Provides asynchronous processing of network operations without blocking threads.
AngaraParallel — parallel query execution subsystem (v5). Morsel-driven model with a work-stealing scheduler for multi-core processor utilization. NUMA-aware distribution (per-node morsel queues, thread pinning) is deferred to v0.7 (RFC-2026-376 §13 H2 — “NUMA-aware shard selection”); until then, execution is NUMA-agnostic, and EXPLAIN reports numa_affinity=disabled.
AngaraPlan — cost-based query optimizer. Uses robust planning (resilience to estimation errors) and LEO feedback (learning from actual execution metrics).
AngaraPool — connection and thread management subsystem. Responsible for connection pooling, thread scheduling, and resource allocation between sessions.
AngaraStat — statistics collection subsystem for the optimizer. Includes HyperLogLog for NDV, equi-height histograms, MCV (Most Common Values), and reservoir sampling for building samples.
AngaraTree — index engine. Supports B+tree (primary index type), BRIN (Block Range Index for append-only data), and Hash (for point equality lookups).
AngaraVector — vectorized execution subsystem (v5). Processes data in batches using SIMD instructions: AVX2, AVX-512 on x86-64, and NEON on ARM.
B
BRIN (Block Range Index) — a lightweight index storing min/max values for page ranges. Effective for append-only and time-series data where values are naturally ordered.
Break-glass — a controlled privilege escalation mechanism. Allows authorized users to temporarily gain elevated privileges with mandatory reason specification, limited TTL, and full audit of all actions.
C
CBO (Cost-Based Optimizer) — a query optimizer that selects an execution plan based on statistics (cardinality, NDV, histograms) and cost models (CPU, I/O, memory).
E
Epoch — a logical unit of time in the MVCC subsystem. Every successful commit increments the global epoch. Epoch is used to determine the visibility of row versions and calculate the GC watermark.
F
Fail-closed — a security principle where the system rejects an operation in case of uncertainty, rather than allowing it. For example, if an RLS policy cannot be evaluated, access is denied.
G
GC watermark — the minimum snapshot among all active transactions. Determines the safe boundary for garbage collection: row versions with deleted_commit < watermark can be deleted.
H
HLL (HyperLogLog) — a probabilistic data structure for approximate counting of unique values (NDV). Uses a fixed amount of memory (~1 KB) regardless of the number of values. Relative error is ~2%.
L
LEO (Learning Optimizer) — an AngaraPlan component that adjusts cost models based on actual query execution metrics. After each execution, it compares the estimated and actual cardinality and updates correction factors.
LSN (Log Sequence Number) — a monotonically increasing identifier for a WAL record. Used to determine the order of records, recovery position, and replication.
M
MCV (Most Common Values) — a list of the most frequently occurring values in a column along with their frequencies. Used by the optimizer for accurate cardinality estimation when filtering by specific values.
MVCC (Multi-Version Concurrency Control) — a mechanism for concurrent data access via row versioning. Allows readers and writers to work simultaneously without mutual blocking. Each modification creates a new row version instead of modifying the existing one.
N
NDV (Number of Distinct Values) — the number of unique values in a column. A key metric for the optimizer: affects the cardinality estimation of joins and group by.
P
pgwire — PostgreSQL wire protocol. The network protocol used by AngaraBase to communicate with clients. Ensures compatibility with PostgreSQL clients (psql, libpq drivers, JDBC).
PITR (Point-In-Time Recovery) — a mechanism for restoring the database to an arbitrary point in time. Uses a base backup + replay of WAL records up to a specified LSN or timestamp.
R
RBAC (Role-Based Access Control) — a role-based access control model. Privileges are assigned to roles, and roles to users. Supports nested roles and privilege inheritance.
RLS (Row-Level Security) — a mechanism for row-level visibility policies on a table. Allows restricting access to specific rows based on the attributes of the current user (role, department, tenant_id).
S
SQLSTATE — a 5-character error code according to the SQL standard (ISO/IEC 9075). AngaraBase uses explicit SQLSTATEs for all expected errors, simplifying error handling in client applications.
SysCatalog — system catalog, the central registry of metadata for all database objects: tables, columns, indexes, users, roles, privileges, security policies, and statistics.
T
TDE (Transparent Data Encryption) — transparent data encryption on disk. Encrypts data pages, WAL records, and the audit log. Transparent to applications — encryption and decryption occur at the storage engine level.
TID (Tuple Identifier) — the physical address of a row in storage, consisting of two components: (page_id, slot_id). Used for direct access to a row via an index.
W
WAL (Write-Ahead Log) — a transaction log ensuring durability and recovery. All changes are first written to the WAL, then applied to data pages. During crash recovery, the WAL is used to reapply committed but not yet disk-flushed changes.
What’s Next
- What is AngaraBase — a product overview for those who came to the glossary from a search.
- AngaraBase Architecture — how named subsystems (AngaraTree, AngaraPlan, etc.) are assembled into a working engine.
- System Views sys.* — where the same terms are visible from SQL.
System Catalog
Goal: Describe the structure and purpose of AngaraBase system tables for monitoring, diagnostics, and metadata management.
The AngaraBase system catalog is accessible via the sys virtual schema. These tables provide real-time information about the instance state, configuration, and database objects.
sys.tables Table
Contains metadata for all tables across all databases and schemas.
| Field | Type | Description |
|---|---|---|
| db_id | string | Database identifier. |
| schema_name | string | Schema name (usually public). |
| table_name | string | Table name. |
| tablespace_name | string | Tablespace name. |
| storage_engine | string | Storage engine type: row_store (HeapFile), memory (In-memory), htap_row_column. |
| durability | string | Durability level (for memory tables): none, logged, snapshotted. |
| max_rows | uint64 | Row limit (for memory tables). |
| eviction_policy | string | Eviction policy (for memory tables): error, fifo, lru, lfu. |
| checkpoint_interval_ms | uint64 | Checkpoint interval in ms (for snapshotted). |
| current_rows | uint64 | Current number of live rows (approximate). |
| evictions_total | uint64 | Counter of evicted rows. |
| limit_errors_total | uint64 | Counter of row limit exceeded errors. |
| append_only | bool | Append-only mode flag (deprecated, see mutation_policy). |
| mutation_policy | string | Mutation policy: unrestricted, no_delete, append_only. |
Storage Engine Note
- row_store: Standard disk storage (HeapFile).
- memory: Data is stored in RAM. Durability is controlled by the durability parameter.
sys.settings Table
Provides access to the current server configuration settings (Effective Configuration).
| Field | Type | Description |
|---|---|---|
| name | string | Parameter name (e.g., server.addr). |
| value | string | Current effective value. |
| source | string | Value source: default, config, bootstrap_env, sql_runtime. |
| dynamic | bool | true if the parameter can be changed without a restart. |
| doc | string | Brief description of the parameter. |
sys.databases Table
List of available databases.
| Field | Type | Description |
|---|---|---|
| db_id | string | Unique database identifier. |
| name | string | Database name. |
sys.schemas Table
List of schemas within databases.
| Field | Type | Description |
|---|---|---|
| db_id | string | Database identifier. |
| schema_name | string | Schema name. |
Troubleshooting
- Symptom: The sys.tables table returns an empty result.
  - Cause: The user does not have permission to view metadata, or no database is selected (if filtering is used).
  - Solution: Check access rights (RBAC).
- Symptom: A change in sys.settings is not applied.
  - Cause: The parameter has dynamic = false or is overridden by an environment variable (source = bootstrap_env).
  - Solution: A server restart or a change in configuration/environment variables is required.
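The second case can be checked programmatically from a sys.settings row. The sketch below takes a row as a plain dictionary with the dynamic and source fields described in the table above and returns the likely reasons a change did not take effect. The function name and the dictionary representation are assumptions made for illustration.

```python
from typing import Dict, List, Union

SettingRow = Dict[str, Union[str, bool]]


def why_not_applied(row: SettingRow) -> List[str]:
    """Explain why a runtime change to a setting may not have taken effect."""
    reasons: List[str] = []
    if not row["dynamic"]:
        reasons.append("parameter is not dynamic: a server restart is required")
    if row["source"] == "bootstrap_env":
        reasons.append("value is overridden by an environment variable")
    return reasons


row = {"name": "server.addr", "dynamic": False, "source": "bootstrap_env"}
for reason in why_not_applied(row):
    print(reason)
```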
Client Compatibility
This guide covers compatibility considerations and configuration steps for connecting various database clients to AngaraBase.
DBeaver
DBeaver is a popular database administration tool that can connect to AngaraBase via the PostgreSQL protocol.
However, DBeaver automatically sends metadata queries to pg_catalog and information_schema tables that
AngaraBase does not support, which can cause connection failures.
Problem
When connecting DBeaver to AngaraBase, you may encounter feature_not_supported errors. This happens because
DBeaver automatically sends queries to PostgreSQL system catalogs (pg_catalog.* and information_schema.*)
during connection setup and schema browsing. AngaraBase does not implement these PostgreSQL-specific metadata
tables.
Solution
Follow these steps to configure DBeaver for AngaraBase compatibility:
1. Connection Properties
In DBeaver, edit your connection and go to Driver Properties:
- Set assumeMinServerVersion=9.0
- Set preferQueryMode=simple
These settings reduce the number of metadata queries DBeaver sends.
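The same two driver properties can also be passed in the JDBC URL, which is convenient when the connection is defined outside the DBeaver UI. The sketch below only assembles the URL string. The host, port, and database name are placeholders taken from the psql example in this guide, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def jdbc_url(host: str, port: int, database: str, **props: str) -> str:
    """Build a PostgreSQL JDBC URL with driver properties as query parameters."""
    return f"jdbc:postgresql://{host}:{port}/{database}?{urlencode(props)}"


url = jdbc_url(
    "localhost", 5152, "your_database",
    assumeMinServerVersion="9.0",
    preferQueryMode="simple",
)
print(url)
# jdbc:postgresql://localhost:5152/your_database?assumeMinServerVersion=9.0&preferQueryMode=simple
```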
2. PostgreSQL Connection Settings
In the PostgreSQL tab of your connection settings:
- Uncheck “Read all data types”
- Uncheck “Show non-default schemas”
This prevents DBeaver from querying system catalogs for type and schema information.
3. Server Type Selection
When creating the connection, select PostgreSQL 9.x as the server type. This uses a compatibility mode with minimal system catalog queries.
Diagnostics
If you’re still experiencing issues, you can enable slow query logging to see exactly what queries DBeaver is sending:
export ANGARABASE_LOG_MIN_DURATION_MS=0
export ANGARABASE_LOG_QUERY_TEXT=1
Start your AngaraBase server with these environment variables to log all queries. Check the logs to identify which specific queries are failing.
Alternative Approach
If the above configuration doesn’t work for your use case, consider using a simpler PostgreSQL client like
psql for command-line access:
psql -h localhost -p 5152 -U your_username -d your_database
Limitations
Even with proper configuration, some DBeaver features may not work with AngaraBase:
- Schema browser may show limited information
- Auto-completion may be reduced
- Some administrative features will not be available
For the most complete AngaraBase experience, consider using the native angara-cli tool or connecting via
standard PostgreSQL drivers in your applications.
Other Clients
psql
The PostgreSQL command-line client works well with AngaraBase:
psql -h localhost -p 5152 -U username -d database_name
JDBC/ODBC Drivers
Standard PostgreSQL JDBC and ODBC drivers should work with AngaraBase for basic SQL operations. Avoid using driver features that query system catalogs.
Application Frameworks
Most application frameworks (Django, Rails, etc.) work with AngaraBase when configured to use PostgreSQL drivers, though some ORM introspection features may be limited.
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| feature_not_supported on connection | Client querying pg_catalog | Configure client to minimize metadata queries |
| Slow connection setup | Too many system queries | Use preferQueryMode=simple |
| Missing schema information | AngaraBase doesn’t implement full pg_catalog | Use direct SQL queries instead of client introspection |
Getting Help
If you encounter client compatibility issues not covered here:
- Check the Known Issues page
- Enable query logging to identify problematic queries
- Consult the Support page for additional resources
For additional client-specific configuration tips, see our community documentation.
What’s Next
- Quickstart — example of connecting psql with pgwire parameters.
- SQL Compatibility Overview — what exactly the SQL layer supports.
- Known Issues and SQLSTATE — what codes the client should expect instead of “magic”.
- Support and Bug Report Artifact Collection — if your client driver behaves differently than described.
Support / bug reports (testing)
What to include in a report
Every report should contain the following:
- Version: git commit hash (or tag), OS/kernel.
- Launch config: config (angarabase.conf) and launch command.
- Reproduction steps: SQL/steps/client (psql/ORM/tool).
- Expected vs actual behavior: what was expected and what happened.
- Artifacts:
  - Server logs.
  - If this is a crash/recovery/backup topic: summary.json and the entire artifacts/ folder.
Helpful tooling (already in repo)
Nightly-style evidence pack
Running a local evidence pack (single run):
tools/ci/nightly_gate.sh --runs 1 --root artifacts/nightly_local
Diagnostics bundle
Collecting a diagnostic bundle:
tools/diagnostics_bundle/run.sh --root artifacts/diagnostics_bundle
Specific scenarios
Hang / stall
If the problem is a hang or stall:
- Attach client-side timeout/hang description.
- Attach server logs.
- Attach stack trace or diag bundle (if possible).
- Check known issues — specifically KI-2026-001 (pg_database probe stall).
Crash / recovery
- Attach summary.json from artifacts/.
- Attach the full artifacts/ directory.
- Specify whether this occurred during normal startup, restart, or backup/restore.
Unexpected SQLSTATE
- Provide the full error text (SQLSTATE code + message).
- Check known issues — many SQLSTATE codes are documented as expected behavior.
What’s Next
- Known Issues and SQLSTATE — check if your problem is described as a known limitation.
- Client Compatibility — if the problem is in the driver or client.
- Diagnostics — how to collect an EXPLAIN, slow-log, or sys.* snapshot to attach to an issue.
Generated Reference Artifacts
Auto-generated markdown documentation artifacts are placed here.
AngaraBook changelog (user/testing highlights)
This is not a copy of CHANGELOG.md and not a replacement for the release notes in the planning packages.
Goal: give testers a short summary of what changed in the testing experience, plus links to the canonical sources.
Source of truth
- Product changelog (executive): CHANGELOG.md (+ CHANGELOG.ru.md)
- Release trains / gates: docs/planning/releases/v2/minor/README.md
- Known issues (canonical): angarabook/src/operations/known-issues.md
Unreleased (testing focus)
- RM-0.6.3.4 (in_review) — CBO P4.1 remediation:
- train:
docs/planning/v0.6/RM-0.6.3.4.md - highlights: optimizer planning timeout contract hardened (
sql.optimizer.planning_timeout_ms/ANGARABASE_OPTIMIZER_PLANNING_TIMEOUT_MS), timeout path degrades to greedy planning, and optimizer observability expanded with planning counters/histogram.
- train:
- RM-5.17 (closed) — SQL Coverage Expansion:
  - train: `docs/planning/releases/v5/minor/RM-5.17.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: Window functions v0 (`ROW_NUMBER`, `RANK`, `LAG`, `LEAD`, `SUM`/`COUNT` `OVER`), Set operations (`UNION`, `INTERSECT`, `EXCEPT`), TPC-H partial benchmarks, pgbench read-write support.
- RM-5.16 (closed) — Columnar Segment Format v0 (AngaraColumn prep):
  - train: `docs/planning/releases/v5/minor/RM-5.16.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: columnar segment on-disk format v0 + column cache + zone maps; baseline read-only scan surface; see evidence pack for closure notes.
- RM-5.15.13 (closed) — Page-Based Default + Overlay Hydrate Bridge:
  - train: `docs/planning/releases/v5/minor/RM-5.15.13.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: page-based storage default ON, hydrate bridge restores persistent tables on startup, configurable `flush_on_commit` for snapshotted tables.
- RM-5.15.12 (closed) — Txlog Path Guardrails & Regression Tests:
  - train: `docs/planning/releases/v5/minor/RM-5.15.12.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: regression tests for commit conflict 40001 (hotfix follow-up), runtime warning for split-directory misconfiguration, and risk closure.
- RM-5.13.1 (in_review) — AQP hardening fix release:
  - train: `docs/planning/releases/v5/minor/RM-5.13.1.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: bounded async feedback ingest path, self-join-safe operator identity matching, and documented AQP capacity knob (`ANGARABASE_AQP_STORE_CAPACITY_MB`) for deterministic bounded advisory store behavior.
- RM-5.15.11 (closed) — IR Executor Refactor + Unwrap/Expect Cleanup:
  - train: `docs/planning/releases/v5/minor/RM-5.15.11.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - note: no user-facing SQL/ops contract changes (refactor-only train; internal executor module decomposition + guardrails).
- RM-5.12.3 (closed) — Comparative Benchmark Infrastructure:
  - train: `docs/planning/releases/v5/minor/RM-5.12.3.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: SQL-level benchmark suite (B1-B7), AngaraBase-vs-PostgreSQL comparator reports, SQL coverage corpus and score reporting, and optional nightly SQL benchmark hook.
- RM-5.12.2 (closed) — Parallel Performance Polish:
  - train: `docs/planning/releases/v5/minor/RM-5.12.2.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: removed sequential join-build merge bottleneck via shared partition build path; preserved join accounting telemetry continuity in `EXPLAIN ANALYZE`.
- RM-5.12.1 (closed) — AngaraParallel gap closure:
  - train: `docs/planning/releases/v5/minor/RM-5.12.1.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: partitioned parallel join build phase, settings-governed DOP caps (`ANGARABASE_PARALLEL_DOP_CAP_GLOBAL` / `ANGARABASE_PARALLEL_DOP_CAP_QUERY`), and `EXPLAIN ANALYZE` parallel join counters (`join_build_rows`, `join_probe_rows`).
- RM-5.10.1 (closed) — AngaraVector gap closure:
  - train: `docs/planning/releases/v5/minor/RM-5.10.1.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: execution mode contract aligned to `auto/force_vector/force_row`, explicit vector bridges (`RowToColumnBridge`, `BatchToRowBridge`), and column-native vector hash kernels used in vector join/aggregate path.
- RM-5.10 (closed) — AngaraVector phase-1:
  - train: `docs/planning/releases/v5/minor/RM-5.10.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: vector execution mode now covers join/aggregate plan paths and `EXPLAIN` marks vector operators (`VectorHashJoin`, `VectorAgg`) when vector mode is active.
- RM-5.9 (closed) — AngaraVector phase-0:
  - train: `docs/planning/releases/v5/minor/RM-5.9.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: introduced vector batch format baseline (`batch_size` default 1024), scan/filter/project vector path, and bounded per-query vector memory budget knobs.
- RM-5.8.1 (closed) — AngaraMemory async snapshots + per-table scheduler:
  - train: `docs/planning/releases/v5/minor/RM-5.8.1.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: `durability='snapshotted'` no longer performs immediate hot-path page persistence; checkpoint worker now honors per-table `checkpoint_interval_ms` scheduling.
- RM-5.8 (closed) — AngaraMemory phase-1:
  - train: `docs/planning/releases/v5/minor/RM-5.8.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: `durability='logged'|'snapshotted'`, opt-in `eviction_policy='fifo'`, and SQL-visible memory-table runtime counters in `sys.tables`.
- RM-5.7 (closed) — AngaraMemory phase-0:
  - train: `docs/planning/releases/v5/minor/RM-5.7.md`
  - release notes: `docs/planning/releases/v5/RELEASE_NOTES.md`
  - highlights: `storage='memory'` table surface, fail-closed `max_rows` enforcement (54023), and volatile `durability='none'` restart semantics.
- RM-5.6.5 (closed) — reliability engineering closure:
  - train: `docs/planning/releases/v5/minor/RM-5.6.5/RM-5.6.5.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.6.5/release_notes.md`
  - note: no user-facing SQL/ops contract changes (runtime panic-hardening + CI governance).
- RM-5.6.4 (closed) — architecture governance closure:
  - train: `docs/planning/releases/v5/minor/RM-5.6.4/RM-5.6.4.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.6.4/release_notes.md`
  - note: no user-facing SQL/ops contract changes (layering/dependency CI guardrails).
- RM-5.6.3 (closed) — native packaging + secure init-first bootstrap:
  - train: `docs/planning/releases/v5/minor/RM-5.6.3/RM-5.6.3.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.6.3/release_notes.md`
  - highlights:
    - secure init CLI hardening (`--superuser-password-file|--superuser-password-env`, `--require-auth`, explicit `--insecure-trust`)
    - native RPM/DEB packaging manifests with init-first service start fence (`ConditionPathExists=/var/lib/angarabase/data/VERSION`)
    - release signing helpers and deterministic repo-layout scaffolding for package publication
- RM-5.6.2 (closed) — packaging baseline for operator install path:
  - train: `docs/planning/releases/v5/minor/RM-5.6.2/RM-5.6.2.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.6.2/release_notes.md`
  - highlights:
    - portable `x86_64-unknown-linux-gnu` archive build in pinned RHEL8/UBI8 (glibc 2.28) environment
    - Gentoo source ebuild baseline with systemd/OpenRC assets
    - runtime fail-closed glibc compatibility guard (`glibc >= 2.28`) with operator-facing support contact message
- RM-5.6.1 (closed) — architecture hygiene patch:
  - train: `docs/planning/releases/v5/minor/RM-5.6.1/RM-5.6.1.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.6.1/release_notes.md`
  - note: no user-facing SQL/ops contract changes (AngaraBook: no user-facing changes)
- RM-5.5 / RM-5.6 (closed) — advanced diagnostics + pilot validation:
  - trains:
    - `docs/planning/releases/v5/minor/RM-5.5/RM-5.5.md`
    - `docs/planning/releases/v5/minor/RM-5.6/RM-5.6.md`
  - highlights:
    - wait-event taxonomy and usage stats surfaces (`sys.table_stats`, `sys.index_stats`) with bounded reset semantics
    - OTel-style span export knobs for query-stage triage evidence (`ANGARABASE_OTEL_*`)
    - production pilot evidence updated with workload command and reprioritization notes
- AngaraBook security docs expanded:
  - pages: `security/overview.md`, `security/authorization.md`, `security/authentication.md`, `security/audit.md`, `security/encryption.md`, `security/break-glass.md`, `security/hardening.md`
  - highlights: end-to-end user-facing security documentation for RM-4.25/RM-5.3/RM-5.3.1/RM-5.4 surfaces (RBAC/RLS/break-glass/audit/TDE/client-encrypted columns)
- RM-5.4 (closed) — Security Layer Reinforcement Phase 2:
  - train: `docs/planning/releases/v5/minor/RM-5.4/RM-5.4.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.4/release_notes.md`
  - highlights: Audit v1 DML policy controls (`off|allowlist|denylist`), RLS v1 masking/provenance introspection, and client-encrypted columns v0 metadata contract with fail-closed predicate bounds
- RM-5.3.1 (closed) — TDE patch for audit-at-rest:
  - train: `docs/planning/releases/v5/minor/RM-5.3.1/RM-5.3.1.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.3.1/release_notes.md`
  - highlights: audit sink bytes are encrypted when TDE is enabled, `sys.audit_log` stays readable with key material, missing key is fail-closed for audit sink read/write
- RM-5.3 (closed) — TDE v0 baseline:
  - train: `docs/planning/releases/v5/minor/RM-5.3/RM-5.3.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.3/release_notes.md`
  - highlights: fail-closed TDE enablement for page/WAL at-rest encryption, `sys.settings` metadata-only introspection for key id/rotation timestamp, restore fail-closed without keys
- RM-5.2 (closed) - module decomposition phase-1 (`.inc.rs` elimination, security module layout, pgwire tests split, architecture doc consolidation) completed as refactor-only train:
  - train: `docs/planning/releases/v5/minor/RM-5.2/RM-5.2.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.2/release_notes.md`
  - note: no user-facing SQL/ops contract changes
- See current train planning: `docs/planning/releases/README.md`
- RM-5.1 (closed) - module decomposition phase-0 (`sys_catalog`, `virtual_catalog`, `metrics`) completed as refactor-only train:
  - train: `docs/planning/releases/v5/minor/RM-5.1/RM-5.1.md`
  - release notes: `docs/planning/releases/v5/minor/RM-5.1/release_notes.md`
  - note: no user-facing SQL/ops contract changes
- RM-4.0 line (closed) — major-line closure completed with truth-of-now planning sync:
  - major entry point: `docs/planning/releases/v4/RM-4.0.md`
  - train index: `docs/planning/releases/v4/minor/README.md`
  - rollup changelog: `docs/planning/releases/v4/CHANGELOG.md`
  - closure evidence: `docs/planning/evidence/release_lines/rm-4.0/major_closure_20260216.md`
  - transition: `docs/planning/releases/v5/RM-5.0.md`
- RM-4.25.1 (closed) — Security Hardening & RLS Optimization:
  - train: `docs/planning/releases/v4/minor/RM-4.25.1/RM-4.25.1.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.25.1/release_notes.md`
  - highlights: fail-closed IR RLS predicate checks (`0A000` on unsupported complexity), mandatory SecurityContext enforcement (`42501` in non-trust modes), bounded planner-stage RLS rewrite for IR SELECT, and audit fsync barriers for break-glass lifecycle events
- RM-4.25 (closed) — Security Reinforcement Phase 1:
  - train: `docs/planning/releases/v4/minor/RM-4.25/RM-4.25.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.25/release_notes.md`
  - highlights: secure `--init` superuser bootstrap, RLS v0 on reads+writes, break-glass lifecycle, audit chain verification, `sys.*` security introspection views/functions
- RM-4.21 (closed) — AngaraStat Level 2 reservoir stats (bounded):
  - train: `docs/planning/releases/v4/minor/RM-4.21/RM-4.21.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.21/release_notes.md`
  - highlights: `stats_reservoir_size`, Level 2 histogram/MCV surfaces in `sys.column_stats`
- RM-4.22 (closed) — query diagnostics v0:
  - train: `docs/planning/releases/v4/minor/RM-4.22/RM-4.22.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.22/release_notes.md`
  - highlights: `EXPLAIN` / `EXPLAIN ANALYZE`, slow query log, `angara_stat_activity`, `angara_stat_statements`, `angara_top_queries()`
- RM-4.24 (closed) — reliability/efficiency hardening:
  - train: `docs/planning/releases/v4/minor/RM-4.24/RM-4.24.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.24/release_notes.md`
  - highlights: `REINDEX INDEX`, BRIN range-efficiency metric, strict storage startup default, no-auth startup guardrail
- RM-4.24.1 (closed) - mutation policy `no_delete`:
  - train: `docs/planning/releases/v4/minor/RM-4.24.1/RM-4.24.1.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.24.1/release_notes.md`
  - highlights: unified `mutation_policy`, `42809` guards for DELETE/TRUNCATE and PK/FK updates, `sys.tables.mutation_policy`
- RM-4.24.2 (closed) — SQL semantics/stats hardening:
  - train: `docs/planning/releases/v4/minor/RM-4.24.2/RM-4.24.2.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.24.2/release_notes.md`
  - highlights: typed min/max streaming stats, mutation epoch metric, EXPLAIN ANALYZE dry-run for DML, wait-event classification
- RM-4.24.3 (closed) — SQL/stats closure:
  - train: `docs/planning/releases/v4/minor/RM-4.24.3/RM-4.24.3.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.24.3/release_notes.md`
  - highlights: typed `col_min/col_max` surfaces, typed reservoir samples with membership-aware UPDATE/DELETE handling, `wait_event_type` in `angara_stat_activity`
- RM-4.24.4 (closed) — core decomposition for executor:
  - train: `docs/planning/releases/v4/minor/RM-4.24.4/RM-4.24.4.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.24.4/release_notes.md`
  - highlights: internal refactor (`ir_executor` split into scan/join/aggregate/sort modules), no user-facing SQL/ops contract changes
- RM-4.15 (closed) — BRIN baseline (bounded):
  - train: `docs/planning/releases/v4/minor/RM-4.15/RM-4.15.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.15/release_notes.md`
  - evidence: `docs/planning/evidence/release_trains/rm-4.15-4.16/release_closure_20260216.md`
- RM-4.16 (closed) — page checksums v0 + verify-pages triage surface:
  - train: `docs/planning/releases/v4/minor/RM-4.16/RM-4.16.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.16/release_notes.md`
  - evidence: `docs/planning/evidence/release_trains/rm-4.15-4.16/release_closure_20260216.md`
- RM-4.17 (closed) - SQL semantics tranche (bounded `WITH`, `ORDER BY <expr>`, deterministic `0A000` hygiene):
  - train: `docs/planning/releases/v4/minor/RM-4.17/RM-4.17.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.17/release_notes.md`
  - evidence: `docs/planning/evidence/release_trains/rm-4.17-4.18/release_closure_20260216.md`
- RM-4.18 (closed) - table partitioning v0 (`RANGE/LIST`, routing, pruning, per-partition cascade):
  - train: `docs/planning/releases/v4/minor/RM-4.18/RM-4.18.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.18/release_notes.md`
  - evidence: `docs/planning/evidence/release_trains/rm-4.17-4.18/release_closure_20260216.md`
- RM-4.19 (closed) - append-only table mode v0 (`append_only` DDL/property, `42809` mutation guards, partition inheritance, rowid watermark):
  - train: `docs/planning/releases/v4/minor/RM-4.19/RM-4.19.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.19/release_notes.md`
  - evidence: `docs/planning/evidence/release_trains/rm-4.19-4.20/release_closure_20260216.md`
- RM-4.20 (closed) - AngaraStat Level 1 (`ndv_approx/min/max/null_count`, `stats_level_max` controls, stats observability):
  - train: `docs/planning/releases/v4/minor/RM-4.20/RM-4.20.md`
  - release notes: `docs/planning/releases/v4/minor/RM-4.20/release_notes.md`
  - evidence: `docs/planning/evidence/release_trains/rm-4.19-4.20/release_closure_20260216.md`
- RM-4.23 (closed) - unified `.adb` storage path for heap tables:
  - train + gates + evidence: `docs/planning/releases/v4/minor/RM-4.23/RM-4.23.md`
  - runtime routing fixed: user DB writes go to `<db>.adb` and `<db>.atl` (not `base.*`)
  - backup/restore note updated: `operations/backup-and-restore.md`
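
Among the entries above, RM-5.15.12 pins regression tests for commit conflict 40001. On the client side, such a conflict is commonly handled by retrying the whole transaction. The following is a minimal Python sketch under stated assumptions: the conflict reaches the client as SQLSTATE `40001`, and the driver exposes the code on the exception (`sqlstate` in psycopg 3, `pgcode` in psycopg2). The helper `run_with_retry` and its parameters are hypothetical names, not part of AngaraBase.

```python
import time


def run_with_retry(transaction, max_attempts=3, backoff_s=0.05, sleep=time.sleep):
    """Call transaction() and retry it when it fails with SQLSTATE 40001.

    `transaction` must run the whole transaction (BEGIN..COMMIT) so that a retry
    starts from a fresh snapshot. Errors with any other SQLSTATE propagate.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except Exception as exc:
            # Drivers differ: psycopg 3 uses `sqlstate`, psycopg2 uses `pgcode`.
            code = getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)
            if code != "40001" or attempt == max_attempts:
                raise
            sleep(backoff_s * attempt)  # linear backoff before the next attempt
```

The retry is bounded on purpose: once `max_attempts` is reached, the last conflict is re-raised so it can be reported with its full error text rather than hidden.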
Milestones (testing-ready)
- RM-2.3 (closed) - backup/restore baseline is testable:
  - cold/offline backup/restore + restore oracle: `docs/planning/releases/v2/minor/RM-2.3/RM-2.3.md`
  - runbook: `angarabook/src/operations/backup-restore.md`
- RM-2.4 (closed) - execution/compat deepening is pinned:
  - train + gates + evidence pointers: `docs/planning/releases/v2/minor/RM-2.4/RM-2.4.md`
  - known issues remain explicit: `angarabook/src/operations/known-issues.md`
- RM-2.7 (closed) - backup/restore v2 phase 1a is testable (offline/local baseline):
  - train + gates + evidence: `docs/planning/releases/v2/minor/RM-2.7/RM-2.7.md`
  - pinned evidence runner: `tools/backup_restore/evidence_v2_phase1a.sh`
  - AngaraBook page: `operations/backup-and-restore.md`
- RM-2.8 (closed) - validation hardening is testable:
  - train + gates: `docs/planning/releases/v2/minor/RM-2.8/RM-2.8.md`
  - compat nightly emits `coverage_report.json` (probe-level): `tools/compat_suite/run.sh`
- RM-2.9 (closed) - backup/restore v2 phase 1b (online/PITR) is testable:
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.9/RM-2.9.md`
  - pinned evidence runner: `tools/backup_restore/evidence_v2_phase1b.sh`
  - AngaraBook page: `operations/backup-and-restore.md`
- RM-2.10 (closed) - SysCatalog identity + native `sys.*` introspection is available:
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.10/RM-2.10.md`
  - identity file: `storage.data_directory/identity_v0.txt`
- RM-2.11 (closed) - `pg_catalog` semantics are rooted in SysCatalog (trace-driven) + identity rehearsal gate:
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.11/RM-2.11.md`
  - identity rehearsal runner: `tools/release_preflight/rehearsal_identity.sh` (renamed from `rehearsal_identity_rm211.sh`)
  - compat matrix (truth source): `angarabook/src/operations/client-compatibility.md`
- RM-2.12 (closed) - upgrade rehearsal is wired into nightly discipline + docs anti-drift is enforced:
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.12/RM-2.12.md`
  - rehearsal runner: `tools/upgrade_rehearsal/run.sh`
  - preflight wiring: `tools/release_preflight/run.sh`
  - docs validator: `docs/validate-docs.sh`
- RM-2.13 (closed) - code health hardening: prevent god-files growth (touched-file budget gate):
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.13/RM-2.13.md`
  - gate runner: `tools/lint/code_health.sh`
  - PR wiring: `tools/ci_pr.sh`
- RM-2.14 (closed) - admin remote transport v0 (TCP) + `angara-cli` remote identity:
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.14/RM-2.14.md`
  - server: env `ANGARABASE_ADMIN_ADDR` + `crates/angarabased/src/admin_tcp.rs`
  - client: `angara-cli admin identity --addr <host:port> [--json]`
- RM-2.15 (closed) - persisted SysCatalog v0: DDL survives restart:
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.15/RM-2.15.md`
  - persisted catalog file: `storage.data_directory/sys_catalog_v0.txt`
- RM-2.16 (closed) - graceful shutdown contract (bounded):
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.16/RM-2.16.md`
  - env knob: `ANGARABASE_SHUTDOWN_TIMEOUT_MS`
- RM-2.17 (closed) - admin/ops via pgwire (SQL/sys.*, no Unix sockets):
  - train + gates + pinned pointers: `docs/planning/releases/v2/minor/RM-2.17/RM-2.17.md`
  - sys views: `sys.identity`, `sys.health`, `sys.settings`
  - optional SQL shutdown (fail-closed): `ANGARABASE_ALLOW_SQL_SHUTDOWN=1` + `SELECT sys.request_shutdown()`
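
Because RM-2.17 exposes admin state as ordinary SQL views, a snapshot of them can be collected with any pgwire client. Here is a minimal Python sketch that dumps the three views named above to JSON through a generic DB-API 2.0 connection. The function name `collect_sys_snapshot` is hypothetical, and the column layout of the views is not assumed: the sketch reads whatever columns the server returns.

```python
import json


def collect_sys_snapshot(conn, views=("sys.identity", "sys.health", "sys.settings")):
    """Dump each listed view into a JSON string using any DB-API 2.0 connection."""
    snapshot = {}
    for view in views:
        cur = conn.cursor()
        try:
            # View names come from the fixed tuple above, never from user input.
            cur.execute(f"SELECT * FROM {view}")
            columns = [col[0] for col in cur.description]
            snapshot[view] = [dict(zip(columns, row)) for row in cur.fetchall()]
        finally:
            cur.close()
    # default=str keeps non-JSON values (timestamps, UUIDs) readable in the dump.
    return json.dumps(snapshot, indent=2, sort_keys=True, default=str)
```

The connection itself would come from your driver of choice (for example psycopg); host, port, and credentials depend on the deployment. The resulting JSON can be attached to an issue as the `sys.*` snapshot.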
Released tags
For the list of released tags and links to release notes, see CHANGELOG.md → “Released”.