Performance Tuning Guide
Operator baseline for performance tuning in early releases.
Canonical source: this runbook in angarabook/src/operations/.
Scope
Focus:
- buffer pool / checkpoint / writeback;
- TL/WAL durability and group commit;
- no-steal guardrails for large transactions.
Core principle (MVP)
MVP uses no-steal:
- uncommitted pages are not flushed to disk;
- recovery correctness is simpler, but strict guardrails are needed for write pressure.
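The guardrails implied by no-steal can be sketched as a toy model. This is only an illustration under stated assumptions, not the engine's implementation: the class `NoStealPool`, its methods, and both exception types are hypothetical names, and the fields merely stand in for the knobs txn.max_write_set_pages, buffer_pool.uncommitted_pages_ratio_hard, and buffer_pool.backpressure.mode.

```python
from dataclasses import dataclass, field


class WriteSetLimitExceeded(Exception):
    """A transaction tried to dirty more pages than its write-set limit."""


class Backpressure(Exception):
    """The pool is at the uncommitted-page hard limit and mode is fail_fast."""


@dataclass
class NoStealPool:
    """Hypothetical toy model; names and structure are assumptions."""
    capacity_pages: int
    max_write_set_pages: int        # stands in for txn.max_write_set_pages
    uncommitted_ratio_hard: float   # stands in for buffer_pool.uncommitted_pages_ratio_hard
    mode: str = "fail_fast"         # buffer_pool.backpressure.mode: block | fail_fast
    uncommitted: dict = field(default_factory=dict)  # txn id -> set of dirty page ids

    def uncommitted_pages(self) -> int:
        return sum(len(pages) for pages in self.uncommitted.values())

    def dirty(self, txn: int, page: int) -> str:
        pages = self.uncommitted.setdefault(txn, set())
        if page in pages:
            return "ok"  # page is already part of this write set
        if len(pages) + 1 > self.max_write_set_pages:
            raise WriteSetLimitExceeded(txn)
        ratio = (self.uncommitted_pages() + 1) / self.capacity_pages
        if ratio > self.uncommitted_ratio_hard:
            if self.mode == "fail_fast":
                raise Backpressure(txn)
            return "wait"  # block mode: caller retries once a commit frees pages
        pages.add(page)
        return "ok"

    def evictable(self, page: int) -> bool:
        # No-steal: a page dirtied by an open transaction is never flushed or evicted.
        return all(page not in pages for pages in self.uncommitted.values())

    def commit(self, txn: int) -> None:
        self.uncommitted.pop(txn, None)


pool = NoStealPool(capacity_pages=10, max_write_set_pages=3, uncommitted_ratio_hard=0.5)
pool.dirty(1, 100)
print(pool.evictable(100))  # False while transaction 1 is open
pool.commit(1)
print(pool.evictable(100))  # True after commit
```

The model shows why write pressure needs limits: every uncommitted page is pinned in memory until commit, so a single large transaction can exhaust the pool unless it is capped or slowed down.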
Quick profiles
OLTP (short transactions)
- durability = sync_at_commit (strict) for maximum reliability, or group_commit with a small group_commit.max_wait_us for lower latency.
- Conservative txn.max_write_set_pages limits.
- buffer_pool.backpressure.mode = block for predictable behavior.
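To see how group_commit.max_wait_us trades commit latency against fsync count, the following simulation may help. It is a sketch under simplifying assumptions: fsync itself takes zero time, a batch is flushed either when it is full or when the wait window of its first commit expires, and the function names are hypothetical. The real pump may behave differently.

```python
def group_commit_batches(arrivals_us, max_batch_size, max_wait_us):
    """Group commit arrival times (microseconds) into fsync batches.

    A batch is flushed when it holds max_batch_size commits or when
    max_wait_us has elapsed since its first commit, whichever comes first.
    Returns a list of (flush_time_us, [arrival times in the batch]).
    """
    batches = []
    current = []
    for t in sorted(arrivals_us):
        if current and t > current[0] + max_wait_us:
            # The open batch's wait window expired before this commit arrived.
            batches.append((current[0] + max_wait_us, current))
            current = []
        current.append(t)
        if len(current) == max_batch_size:
            batches.append((t, current))
            current = []
    if current:
        batches.append((current[0] + max_wait_us, current))
    return batches


def worst_added_latency_us(batches):
    """Longest time any commit waited for its batch to be flushed."""
    return max(flush - min(batch) for flush, batch in batches)


arrivals = [0, 10, 20, 500, 510]
for wait in (200, 50):
    batches = group_commit_batches(arrivals, max_batch_size=3, max_wait_us=wait)
    print(wait, len(batches), worst_added_latency_us(batches))
# prints "200 2 200" then "50 2 50"
```

In this toy trace the number of fsyncs is the same for both settings, but the smaller wait window cuts the worst added latency from 200 us to 50 us, which is the effect the OLTP profile relies on.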
Analytics / long queries
- A higher write set ceiling is acceptable.
- buffer_pool.backpressure.mode = fail_fast if latency/SLO is the priority.
- Stronger control of commit tail latency with group_commit.
Storage Compression (RM-0.6.4.8+)
- Page Compression is enabled via CREATE TABLE ... WITH (compression='lz4').
- During intensive reads of compressed pages, watch angarabase_buffer_pool_decomp_spill_total. If it grows, consider limiting concurrency or increasing resources.
- If compression fails during page eviction, the system falls back to writing without compression (fail-open) and increments angarabase_compression_downgrade_total.
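The fail-open behavior on eviction can be illustrated with a short sketch. Everything here is an assumption made for illustration: `zlib` from the standard library stands in for lz4, the function name and the codec tags "compressed" and "none" are hypothetical, and the counter is a plain in-process dictionary rather than a real metrics registry.

```python
import zlib
from collections import Counter

metrics = Counter()  # stand-in for a metrics registry


def encode_page_for_writeback(page: bytes, compress=zlib.compress):
    """Return (codec_tag, payload) for a page that is about to be written.

    Fail-open: if the compressor raises, the page is written uncompressed
    and the downgrade counter is incremented instead of failing the eviction.
    """
    try:
        payload = compress(page)
    except Exception:
        metrics["angarabase_compression_downgrade_total"] += 1
        return "none", page
    return "compressed", payload


def broken_compressor(_page: bytes) -> bytes:
    raise RuntimeError("simulated compressor failure")


page = b"x" * 8192
print(encode_page_for_writeback(page)[0])                     # compressed
print(encode_page_for_writeback(page, broken_compressor)[0])  # none
print(metrics["angarabase_compression_downgrade_total"])      # 1
```

A steadily growing downgrade counter therefore means pages are silently landing on disk uncompressed, so it is worth watching alongside disk usage.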
SIMD Float Aggregation (RM-0.6.6.5)
- Aggregate functions SUM(float4) and SUM(float8) automatically use SIMD instructions (AVX2 or NEON) when supported by the CPU.
- This significantly speeds up analytical queries over floating-point numbers.
- If SIMD is unavailable, the system transparently falls back to the scalar implementation, incrementing angarabase_simd_agg_fallback_total.
Adaptive Hash-Join (RM-0.6.6.5)
- The planner automatically swaps the Build and Probe sides in Hash Join if their actual size ratio exceeds adaptive_hash_join_swap_ratio (default 4).
- This uses the smaller table to build the hash table, saving memory and reducing spill probability.
- The switch is recorded in the angarabase_adaptive_probe_swap_total metric.
Dev / test
- durability = relaxed is acceptable (deliberately).
- txn.statement_timeout_ms = 0.
- fail_fast is useful for early overload detection.
Knobs (MVP list)
- durability = sync_at_commit|strict|group_commit|relaxed (env: ANGARABASE_TRANSACTION_LOG_DURABILITY)
  - sync_at_commit / strict: fsync on every COMMIT (max durability, RM-0.6.4.0)
  - group_commit: pump coalesces fsync (default, production)
  - relaxed: no fsync (dev/bench only)
- group_commit.max_batch_size
- group_commit.max_wait_us
- checkpoint.interval_ms
- checkpoint.target_ms
- checkpoint.dirty_ratio_soft|hard
- writeback.max_bytes_per_sec
- txn.max_write_set_pages|bytes
- buffer_pool.uncommitted_pages_ratio_hard (RM-0.6.3.9 §S5+§S9 rename; old name removed without alias)
- buffer_pool.backpressure.mode = block|fail_fast
- [execution].index_cardinality_threshold (default 0.15, env: ANGARABASE_INDEX_CARDINALITY_THRESHOLD)
  - If predicate selectivity is strictly above this threshold, single-key index scan is rejected (seq scan chosen: low cardinality).
- [execution].index_scan_selectivity_threshold (default 0.05, env: ANGARABASE_INDEX_SCAN_SELECTIVITY_THRESHOLD)
  - If selectivity is not below this threshold, index scan is also rejected (seq scan chosen: low selectivity).
  - On mixed OLTP workloads a filter may return ~10-15% of rows: the cardinality threshold already allows the plan, but the selectivity threshold 0.05 does not; in that case raise index_scan_selectivity_threshold (for example to 0.15) in config and restart the process.
- [execution].late_materialization_selectivity_threshold (default 0.3, env: ANGARABASE_LATE_MATERIALIZATION_SELECTIVITY_THRESHOLD)
  - Selectivity threshold for enabling the LateMaterialize node. If the filter passes fewer than 30% of rows, delayed column materialization is enabled.
- [execution].adaptive_hash_join_swap_ratio (default 4.0, env: ANGARABASE_ADAPTIVE_HASH_JOIN_SWAP_RATIO)
  - Ratio for adaptive side swap in Hash Join. If the side size ratio (probe/build) becomes ≥ 4, sides are swapped to optimize memory use.
- The [execution] knobs above are changed only through the configuration file (angarabase.conf, [execution] section) or env before process startup; a server restart is then required. SET optimizer.* / regular SET ... in Simple Query protocol do not change the planner: pgwire returns a successful CommandComplete, but the value is not applied (client compatibility). To test a hypothesis, edit config or env and restart.
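The interaction of the two scan thresholds, including the 10-15% case, can be worked through with a small model. This is a sketch that mirrors only the two rejection rules as worded in this guide; the function names are hypothetical, and the real planner also weighs statistics and costs that are not modeled here.

```python
import os


def load_threshold(env_name: str, default: float) -> float:
    """Read a threshold from the environment, as a process would at startup."""
    raw = os.environ.get(env_name)
    return float(raw) if raw is not None else default


def choose_scan(selectivity, cardinality_threshold=0.15, selectivity_threshold=0.05):
    """Return (strategy, reason) for a single-key predicate.

    selectivity is the fraction of rows the predicate is expected to return.
    """
    if selectivity > cardinality_threshold:        # strictly above: rejected
        return "seq_scan", "seq scan chosen: low cardinality"
    if selectivity >= selectivity_threshold:       # not below: rejected
        return "seq_scan", "seq scan chosen: low selectivity"
    return "index_scan", None


def late_materialize(selectivity, threshold=0.3):
    """Delayed column materialization applies when the filter passes fewer rows."""
    return selectivity < threshold


# A filter returning 12% of rows passes the cardinality rule but not the default
# selectivity rule; raising the selectivity threshold to 0.15 admits the index scan.
print(choose_scan(0.12))
# ('seq_scan', 'seq scan chosen: low selectivity')
print(choose_scan(0.12, selectivity_threshold=0.15))
# ('index_scan', None)
```

`load_threshold("ANGARABASE_INDEX_SCAN_SELECTIVITY_THRESHOLD", 0.05)` shows the startup-only nature of these knobs: the value is read once, so changing the environment of a running process has no effect until restart.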
Symptoms -> actions (fast path)
- Checkpoint p99 spikes: increase checkpoint.target_ms, limit writeback.max_bytes_per_sec.
- Frequent backpressure: reduce batch size, lower txn.max_write_set_pages, and if needed increase buffer pool.
- durable_lsn lag / commit tails: check fsync latency, tune group_commit parameters.
- Slow query / plan changed: capture EXPLAIN (VERBOSE, DIAGNOSTIC) and read the plan using How to read query plans.
- Unexpected SeqScan on a large table: read scan_strategy_reason. For low cardinality, lower [execution].index_cardinality_threshold if needed; for low selectivity, raise [execution].index_scan_selectivity_threshold. First check statistics (ANALYZE, distinct_estimate). Restart after changing thresholds.
Must-have alerts
- buffer_pool_backpressure_active == 1 longer than the threshold.
- buffer_pool_uncommitted_dirty_ratio above hard-limit.
- Growth in txn_write_set_limit_exceeded_total.
- GC/watermark stall (according to project SLO).
Next
- How to read query plans: how to read EXPLAIN, cost/rows, Vector*, replan_reason, cache_status, and reason_codes.
- Observability metrics checklist: what must be measured before and after tuning changes.
- Parallel runtime observability runbook: for CPU-bound workloads and DOP caps.
- jemalloc heap profiling runbook: if the bottleneck is memory, not CPU.
- MVCC and GC operator minimum: if latency growth correlates with GC backlog.