Resource Advisors v0
AngaraBase 0.6.3.9 §S10 — single-node, in-process advisors. Closes the Article #5 review finding “RFC-2026-010 ↔ runtime drift”.
This page documents the two minimum-viable advisors that ship with
0.6.3.9 — the AIMD checkpoint IoAdvisor and the RSS-sensor
MemoryAdvisor — together with the metrics they expose and the
guarantees they explicitly do not make. The full AngaraTuner
Resource Broker (distributed, QoS-weighted, schema-aware) remains a
future train (RM-0.7.0 / RM-0.8.0); RM-0.6.3.9 promotes only the
single-node sensor stubs called out as [Future] in
RFC-2026-010 §3 to Current v0.
Why advisors at all
Modern databases self-tune: Postgres through auto_explain plus
extension-based tuners, SQL Server through automatic plan correction,
Oracle through adaptive execution. AngaraBase’s long-term plan is to collapse the ~60
operator knobs down to ~10–15 budgets (memory_budget, io_budget,
cpu_budget) plus QoS policies, with an in-process broker computing
the rest.
For 0.6.3.9 we ship just enough of that vision to:
- anchor the public Article #5 narrative (no more “future-only” advisors), and
- give downstream code (plan-cache eviction, future spill paths) a stable hint API to consume now without committing to the full broker contract.
IoAdvisor — AIMD checkpoint throttler
Algorithm
Single-knob AIMD over observed flush IOPS, ticked once per attempted
checkpoint (the CheckpointWorker::with_io_advisor integration path):
    on tick(observed_iops):
        if observed_iops > iops_threshold:
            batch_size *= decrease_factor   # multiplicative-decrease, >= min
            decision = throttle
        else:
            batch_size += increase_step     # additive-increase, <= max
            decision = recover
        if batch_size unchanged:
            decision = hold
Defaults (crates/angarabase/src/storage/advisors/io.rs,
IoAdvisorConfig::default):
| Knob | Default | Notes |
|---|---|---|
| initial_batch_size | 64 pages | Resets to this on restart (no persistence) |
| min_batch_size | 8 pages | Hard floor |
| max_batch_size | 1024 pages | Hard ceiling |
| iops_threshold | 5 000 IOPS | Above → multiplicative-decrease |
| decrease_factor | 0.5 | Clamped into (0.0, 1.0) |
| increase_step | 8 pages | Additive-increase per tick |
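
The pseudocode and defaults above can be sketched in Rust as follows. This is a minimal illustration, not the code in io.rs: the type names `SketchConfig`, `SketchIoAdvisor` and `Decision` are hypothetical, the integer widths are guesses, and rounding the multiplicative decrease down is an assumption. Validation of `decrease_factor` into (0.0, 1.0) is omitted for brevity.

```rust
/// Outcome of one AIMD tick; variants mirror the `action` metric label values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Throttle,
    Recover,
    Hold,
}

/// Hypothetical config; field names and defaults follow the table above.
#[derive(Debug, Clone)]
pub struct SketchConfig {
    pub initial_batch_size: u32,
    pub min_batch_size: u32,
    pub max_batch_size: u32,
    pub iops_threshold: u64,
    pub decrease_factor: f64,
    pub increase_step: u32,
}

impl Default for SketchConfig {
    fn default() -> Self {
        Self {
            initial_batch_size: 64,
            min_batch_size: 8,
            max_batch_size: 1024,
            iops_threshold: 5_000,
            decrease_factor: 0.5,
            increase_step: 8,
        }
    }
}

/// Hypothetical stand-in for the advisor; state lives only in memory.
pub struct SketchIoAdvisor {
    cfg: SketchConfig,
    batch_size: u32,
}

impl SketchIoAdvisor {
    pub fn new(cfg: SketchConfig) -> Self {
        let batch_size = cfg
            .initial_batch_size
            .clamp(cfg.min_batch_size, cfg.max_batch_size);
        Self { cfg, batch_size }
    }

    pub fn batch_size(&self) -> u32 {
        self.batch_size
    }

    /// One AIMD step: shrink multiplicatively above the threshold,
    /// grow additively otherwise, and report `Hold` if clamping left
    /// the batch size unchanged.
    pub fn tick(&mut self, observed_iops: u64) -> Decision {
        let before = self.batch_size;
        let (proposed, intent) = if observed_iops > self.cfg.iops_threshold {
            // Assumption: the scaled value is rounded down.
            let scaled = (before as f64 * self.cfg.decrease_factor).floor() as u32;
            (scaled, Decision::Throttle)
        } else {
            (before.saturating_add(self.cfg.increase_step), Decision::Recover)
        };
        self.batch_size = proposed.clamp(self.cfg.min_batch_size, self.cfg.max_batch_size);
        if self.batch_size == before {
            Decision::Hold
        } else {
            intent
        }
    }
}
```

With the defaults, three consecutive over-threshold ticks take the batch size from 64 to 32, 16 and then 8; a fourth reports hold because the floor of 8 pages has been reached.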
Metrics
- angarabase_io_advisor_current_batch_size (gauge, pages): the advisor’s currently recommended checkpoint batch size.
- angarabase_io_advisor_decisions_total{action="throttle"|"recover"|"hold"} (counter): split by AIMD decision. The sum across action values reproduces the historical decision count.
What v0 does not do
- It does not yet enforce the recommended batch size: the periodic checkpoint flush still drains every dirty page ≤ target_lsn to preserve the completion invariant (RFC-2026-073 §S12). Wiring the recommendation into the flush path is tracked in DEBT_REGISTER as a follow-up.
- It does not consider latency, only IOPS; adaptive io_uring queue depth (TD-2026-0122) is the v1 follow-up.
- It does not persist its state across restarts.
MemoryAdvisor — RSS sensor
What it samples
On every sample() call (driven by the same checkpoint worker tick on
Linux):
- read process_rss from /proc/self/statm (pages * sysconf(_SC_PAGESIZE)),
- compute ratio = process_rss / configured_limit,
- publish the angarabase_memory_pressure_ratio gauge,
- emit a WARN log line if ratio >= warn_threshold (default 0.8).
limit_bytes = 0 disables the advisor: is_under_pressure() returns
false and the gauge stays at 0. Non-Linux platforms always
return None from sample() in v0 (portable sensor is a follow-up).
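
The sampling steps can be sketched as below. This is an illustration under stated assumptions rather than the shipped sensor: the function names are hypothetical, the page size is passed in as a parameter instead of being read via sysconf so the example needs no external crate, and returning a ratio of 0.0 when the limit is 0 is a guess at how the disabled case maps onto the gauge. The clamp at 8.0 follows the documented range of the gauge.

```rust
/// Upper bound applied to the published ratio (the documented gauge range).
const MAX_RATIO: f64 = 8.0;

/// Extracts the resident-set size, in pages, from the text of
/// `/proc/self/statm`. The file is a single line of whitespace-separated
/// page counts; the second field is the resident set.
pub fn parse_statm_resident_pages(contents: &str) -> Option<u64> {
    contents.split_whitespace().nth(1)?.parse().ok()
}

/// Computes `rss / limit`, clamped to `MAX_RATIO`.
/// Assumption: a limit of 0 (advisor disabled) yields 0.0.
pub fn pressure_ratio(rss_bytes: u64, limit_bytes: u64) -> f64 {
    if limit_bytes == 0 {
        return 0.0;
    }
    (rss_bytes as f64 / limit_bytes as f64).min(MAX_RATIO)
}

/// Hypothetical end-to-end sample: parse, convert pages to bytes, compute
/// the ratio. Returns `None` if the statm text cannot be parsed.
pub fn sample_from(contents: &str, page_size: u64, limit_bytes: u64) -> Option<f64> {
    let pages = parse_statm_resident_pages(contents)?;
    Some(pressure_ratio(pages.saturating_mul(page_size), limit_bytes))
}
```

On Linux the `contents` argument would come from `std::fs::read_to_string("/proc/self/statm")`; keeping the parser separate from the file read makes it testable on any platform.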
Hint API
    let advisor: Arc<MemoryAdvisor> = ...;
    if advisor.is_under_pressure() {
        // shed load: e.g. evict from plan cache, fall back to spill plan
    }
The check is a single relaxed atomic load — safe to call on the hot path. The decision of what to do under pressure is intentionally left to each subsystem so the advisor itself stays narrow.
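
One way a hint like this could stay a single relaxed load is to keep the latest ratio as `f64` bits in an `AtomicU64`. The sketch below is hypothetical throughout: `PressureHint`, `publish` and `plan_cache_target` are illustrative names, and comparing against the warn threshold is an assumption, since this page does not say which threshold `is_under_pressure()` uses.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical holder for the most recent pressure ratio.
pub struct PressureHint {
    /// The ratio stored as raw `f64` bits so it fits in one atomic word.
    ratio_bits: AtomicU64,
    /// Assumption: the hint trips at the same threshold as the WARN log.
    warn_threshold: f64,
}

impl PressureHint {
    pub fn new(warn_threshold: f64) -> Self {
        Self {
            ratio_bits: AtomicU64::new(0.0_f64.to_bits()),
            warn_threshold,
        }
    }

    /// Called by the sampler after each reading.
    pub fn publish(&self, ratio: f64) {
        self.ratio_bits.store(ratio.to_bits(), Ordering::Relaxed);
    }

    /// Hot-path check: one relaxed load plus a float comparison.
    pub fn is_under_pressure(&self) -> bool {
        f64::from_bits(self.ratio_bits.load(Ordering::Relaxed)) >= self.warn_threshold
    }
}

/// Hypothetical consumer: halve the plan-cache target while under pressure.
pub fn plan_cache_target(hint: &PressureHint, current_entries: usize) -> usize {
    if hint.is_under_pressure() {
        current_entries / 2
    } else {
        current_entries
    }
}
```

Relaxed ordering is enough here because the consumer only needs an eventually fresh value and does not synchronise any other memory through the load.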
Metrics
- angarabase_memory_pressure_ratio (gauge, float in [0.0, 8.0]): most recent process_rss / limit_bytes ratio. Hard-clamped to 8.0 to bound the impact of bogus RSS reads.
Recommended PromQL
# Checkpoint throttling intensity over the last 5m
rate(angarabase_io_advisor_decisions_total{action="throttle"}[5m])
# Memory pressure crossing the warn threshold
angarabase_memory_pressure_ratio > 0.8
# Current checkpoint batch recommendation, for capacity dashboards
angarabase_io_advisor_current_batch_size
Operator playbook
| Symptom | What to check | Action |
|---|---|---|
| angarabase_io_advisor_current_batch_size stuck at min_batch_size | rate(angarabase_io_advisor_decisions_total{action="throttle"}[5m]) consistently > 0 | Storage IOPS budget is the bottleneck. Either provision more IOPS or raise iops_threshold after measuring sustained capacity. |
| angarabase_memory_pressure_ratio > 0.9 for more than 5m | RSS growth pattern; per-subsystem memory metrics | Consider lowering max_cached_pages or query_memory_limit_mb. Plan cache eviction will start consulting is_under_pressure() as the consumer-side wiring lands. |
| angarabase_io_advisor_decisions_total{action="hold"} ≫ throttle/recover | Workload is stable around the threshold | No action; AIMD is doing its job. |
Compatibility contract
- Metric names (angarabase_io_advisor_*, angarabase_memory_pressure_ratio) are stable within the 0.6.x series. Adding new advisors or new action= label values is a non-breaking change.
- The Rust API (IoAdvisor::tick, MemoryAdvisor::sample, MemoryAdvisor::is_under_pressure) is internal (pub for cross-crate wiring, but not stabilised for downstream consumers outside the AngaraBase workspace).
- The full AngaraTuner Resource Broker (RFC-2026-010 Phase 1+2) will coexist with v0 and may layer over these advisors; v0 metric names will continue to be emitted for backwards compatibility.