Vector Observability
AngaraBase 0.6.3.10 §S17 — closes the RFC-2026-151 §7a Observability Contract that was deferred from RM-0.6.2.9 G2 (Sprint 4 G2-001 disposition). User direction 2026-04-20 «extended scope» — close before the HTAP column-store train (RM-0.6.4.0).
This page documents the vector executor observability surface: 3 USDT
probes on the hot path, 1 selection-ratio histogram exposed via /metrics,
and the operator playbooks that turn those signals into rewrite / index
decisions.
Surface summary
| Surface | Type | Source |
|---|---|---|
| angarabase:vector_batch_start | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase:vector_batch_end | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase:vector_fallback | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase_vector_selection_ratio | Prometheus histogram | crates/angarabase/src/metrics/core.rs |
| angarabase_vector_fallback_total | Prometheus counter (existing) | crates/angarabase/src/metrics/core.rs |
| angarabase_vector_rows_produced_total | Prometheus counter (existing) | crates/angarabase/src/metrics/core.rs |
The probes carry the inline #[cfg(feature = "usdt")] guard, so non-usdt
builds (WASM, slim test profiles) pay zero instructions per call. RFC-2026-369
remains the canonical source for the broader USDT/eBPF probe infrastructure;
this page covers only the vector-executor-specific subset finalised by S17.
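The zero-cost claim follows from how `#[cfg]` works: a guarded block is removed before code generation, so nothing is left to execute. A minimal sketch of the pattern, assuming a hypothetical wrapper `fire_vector_batch_start` and a hypothetical `PROBE_HITS` counter standing in for the real USDT macro invocation in probes.rs:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the probe side effect, so the sketch can be observed in a test.
/// The real probes are USDT macro invocations, not a counter.
pub static PROBE_HITS: AtomicU64 = AtomicU64::new(0);

/// Hypothetical wrapper mirroring the vector_batch_start argument shape.
#[inline(always)]
#[allow(unexpected_cfgs)]
pub fn fire_vector_batch_start(operator_kind: u8, batch_size: u32, source: u8) {
    // Compiled only when the crate is built with `--features usdt`.
    #[cfg(feature = "usdt")]
    {
        PROBE_HITS.fetch_add(1, Ordering::Relaxed);
    }
    // Keeps the arguments "used" in both configurations; generates no code.
    let _ = (operator_kind, batch_size, source);
}
```

Without the `usdt` feature the function body is empty, and after inlining the call site should disappear as well.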
USDT probes
All three probes use provider name angarabase and follow the
<subsystem>_<event> convention. Numeric discriminants for every enum
argument are append-only — adding a new variant is non-breaking, but
renumbering or removing one requires an RFC update.
vector_batch_start(operator_kind: u8, batch_size: u32, source: u8)
Fired at the entry of VectorOperator::next_batch() for every primary
operator (Filter / SeqScan / IndexScan / Bridge / ParallelSeqScan).
- operator_kind: ProbeVectorOperatorKind discriminant. Stable values: Filter=0, SeqScan=1, IndexScan=2, Bridge=3, ParallelSeqScan=4, HashJoin=5 (reserved), Aggregate=6 (reserved), Project=7 (reserved).
- batch_size: upstream batch length in rows.
- source: ProbeVectorBatchSource discriminant. Stable values: HeapScan=0, IndexScan=1, UpstreamVector=2, ParallelMorsel=3.
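A consumer that reads raw probe arguments needs to map the `u8` values back to names. The sketch below restates the stable values listed above as `#[repr(u8)]` enums. The `from_raw` helpers are hypothetical and are not claimed to exist in probes.rs; the actual definitions there may differ in derives and layout.

```rust
/// Operator kinds with the stable discriminants documented above.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeVectorOperatorKind {
    Filter = 0,
    SeqScan = 1,
    IndexScan = 2,
    Bridge = 3,
    ParallelSeqScan = 4,
    HashJoin = 5,  // reserved
    Aggregate = 6, // reserved
    Project = 7,   // reserved
}

/// Batch sources with the stable discriminants documented above.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeVectorBatchSource {
    HeapScan = 0,
    IndexScan = 1,
    UpstreamVector = 2,
    ParallelMorsel = 3,
}

impl ProbeVectorOperatorKind {
    /// Hypothetical decoder: returns None for values this build does not know,
    /// which is how an older consumer would see a newly appended variant.
    pub fn from_raw(raw: u8) -> Option<Self> {
        Some(match raw {
            0 => Self::Filter,
            1 => Self::SeqScan,
            2 => Self::IndexScan,
            3 => Self::Bridge,
            4 => Self::ParallelSeqScan,
            5 => Self::HashJoin,
            6 => Self::Aggregate,
            7 => Self::Project,
            _ => return None,
        })
    }
}

impl ProbeVectorBatchSource {
    /// Hypothetical decoder, same contract as above.
    pub fn from_raw(raw: u8) -> Option<Self> {
        Some(match raw {
            0 => Self::HeapScan,
            1 => Self::IndexScan,
            2 => Self::UpstreamVector,
            3 => Self::ParallelMorsel,
            _ => return None,
        })
    }
}
```

Returning `None` for unknown values, rather than panicking, fits the append-only rule: a consumer built against an older enum keeps working when a variant is added.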
vector_batch_end(operator_kind: u8, rows_produced: u32, rows_filtered: u32, duration_us: u64)
Fired at the exit of the same next_batch() call. rows_filtered is the count
dropped by this operator; rows_produced is the count emitted to the next
operator. duration_us is the wall-clock time of the call, including any
upstream next_batch() recursion.
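Per-batch selectivity can be derived from the two row counts. The sketch below assumes that rows scanned equals rows_produced + rows_filtered, which is the same assumption the bpftrace selectivity recipe on this page makes. The function name `selectivity_percent` is hypothetical.

```rust
/// Percentage of scanned rows that survived the operator.
/// Returns None for an empty batch, which carries no selectivity signal.
pub fn selectivity_percent(rows_produced: u32, rows_filtered: u32) -> Option<u64> {
    // Widen before adding so two large u32 counts cannot overflow.
    let scanned = u64::from(rows_produced) + u64::from(rows_filtered);
    if scanned == 0 {
        return None;
    }
    // Integer division truncates, matching integer arithmetic in a tracing script.
    Some(u64::from(rows_produced) * 100 / scanned)
}
```

For example, a batch that emits 25 rows and drops 75 yields `Some(25)`.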
vector_fallback(plan_kind: u8, reason: u8)
Fired wherever the planner / executor falls back to the row path.
- plan_kind: ProbeOperator discriminant (best-effort tag for the plan node that tripped the fallback; e.g. HashProbe=4, Aggregate=7).
- reason: ProbeVectorFallbackReason discriminant. Stable values: UnsupportedPlan=0, TypeError=1, NonEquiJoin=2, BudgetExceeded=3, FeatureDisabled=4.
Wire contract notice. S17 finalises the vector_fallback argument shape, replacing the legacy ad-hoc (u64, u64) literals that the S9-D4 code shipped with. RFC-2026-369 was open at S17 close so no production bpftrace consumers were broken; new consumers MUST use the ProbeOperator × ProbeVectorFallbackReason mapping.
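A consumer of the finalised argument shape might decode the reason like this. It is a sketch only: `from_raw` and `describe_fallback` are hypothetical helpers. plan_kind is left as a raw number because this page lists only two ProbeOperator values.

```rust
/// Fallback reasons with the stable discriminants documented above.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeVectorFallbackReason {
    UnsupportedPlan = 0,
    TypeError = 1,
    NonEquiJoin = 2,
    BudgetExceeded = 3,
    FeatureDisabled = 4,
}

impl ProbeVectorFallbackReason {
    /// Hypothetical decoder; None means the value is newer than this consumer.
    pub fn from_raw(raw: u8) -> Option<Self> {
        Some(match raw {
            0 => Self::UnsupportedPlan,
            1 => Self::TypeError,
            2 => Self::NonEquiJoin,
            3 => Self::BudgetExceeded,
            4 => Self::FeatureDisabled,
            _ => return None,
        })
    }
}

/// Hypothetical formatter for one (plan_kind, reason) pair as captured by a tracer.
pub fn describe_fallback(plan_kind: u8, reason: u8) -> String {
    match ProbeVectorFallbackReason::from_raw(reason) {
        Some(r) => format!("plan_kind={} reason={:?}", plan_kind, r),
        None => format!("plan_kind={} reason=unknown({})", plan_kind, reason),
    }
}
```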
bpftrace recipes
# Live histogram of post-Filter selectivity (per-batch, last 60 s).
# Empty batches are skipped so the ratio never divides by zero.
usdt:./angarabased:angarabase:vector_batch_end /arg0 == 0 && arg1 + arg2 > 0/ {
@sel = hist(arg1 * 100 / (arg1 + arg2));
}
# Top fallback reasons in the last hour.
usdt:./angarabased:angarabase:vector_fallback {
@[arg0, arg1] = count();
}
# Vector hot-path call rate by operator.
usdt:./angarabased:angarabase:vector_batch_start {
@[arg0] = count();
}
Self-test scripts live under tools/usdt/, per the standing convention from
RFC-2026-369 §4. To confirm that the probes are present in a binary, list
them with bpftrace -l 'usdt:./angarabased:angarabase:*'.
Histogram: angarabase_vector_selection_ratio
Cumulative Prometheus histogram (HELP / TYPE headers emitted on every
scrape) tracking the per-batch ratio rows_produced / rows_scanned observed
by VectorFilterV0::next_batch().
Bucket scheme (compatible with histogram_quantile()):
[0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0, +Inf]
The +Inf bucket exists for protocol compliance only — selection ratio is
bounded to [0.0, 1.0] by construction (kept ≤ scanned enforced inside
apply_predicate). Empty batches (rows_scanned == 0) carry no signal and
are silently dropped by VectorSelectionRatioHistogram::observe().
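The observation rule can be sketched as follows. `SelectionRatioSketch` is a hypothetical stand-in written for this page, not the real VectorSelectionRatioHistogram, whose storage and synchronisation are not described here. Only the bucket bounds and the empty-batch rule are taken from the text above.

```rust
/// Finite upper bounds from the bucket scheme; +Inf is stored as an extra slot.
pub const BOUNDS: [f64; 10] = [0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0];

/// Hypothetical single-threaded histogram with cumulative bucket counts.
#[derive(Debug, Default)]
pub struct SelectionRatioSketch {
    /// buckets[i] counts observations <= BOUNDS[i]; buckets[10] is the +Inf bucket.
    pub buckets: [u64; 11],
    pub sum: f64,
    pub count: u64,
}

impl SelectionRatioSketch {
    pub fn observe(&mut self, rows_produced: u64, rows_scanned: u64) {
        // Empty batches carry no signal and are dropped without touching any counter.
        if rows_scanned == 0 {
            return;
        }
        let ratio = rows_produced as f64 / rows_scanned as f64;
        // Cumulative scheme: every bucket whose bound covers the value is incremented.
        for (i, bound) in BOUNDS.iter().enumerate() {
            if ratio <= *bound {
                self.buckets[i] += 1;
            }
        }
        self.buckets[10] += 1; // +Inf always matches
        self.sum += ratio;
        self.count += 1;
    }
}
```

Because the ratio cannot exceed 1.0, the +Inf slot always equals the le="1.0" bucket, which is why it exists for protocol compliance only.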
The histogram is rendered alongside the existing wait-event histogram by
render_prometheus() and is covered by:
- metrics::render::tests::vector_selection_ratio_histogram_appears_in_prometheus_output_rm06310_s17 (exposition-shape test);
- metrics::render::tests::vector_selection_ratio_histogram_observe_buckets_rm06310_s17 (bucket-edge test, independent of the renderer).
PromQL examples
# Median Filter selectivity over the last 5 minutes.
histogram_quantile(
0.5,
rate(angarabase_vector_selection_ratio_bucket[5m])
)
# Share of batches with extremely-low selectivity (≤ 5 %).
sum(rate(angarabase_vector_selection_ratio_bucket{le="0.05"}[5m]))
/
sum(rate(angarabase_vector_selection_ratio_count[5m]))
# Mean selectivity (sum / count, both already rate-friendly).
rate(angarabase_vector_selection_ratio_sum[5m])
/
rate(angarabase_vector_selection_ratio_count[5m])
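The quantile queries above return estimates, not exact values: histogram_quantile() finds the bucket that contains the requested rank and interpolates linearly inside it. The sketch below reproduces that interpolation for cumulative counts so the granularity imposed by the bucket scheme is easier to reason about. `quantile_from_buckets` is a hypothetical helper. It simplifies Prometheus behaviour by returning None where Prometheus returns NaN or an infinity, and by returning the bucket start when the selected bucket is empty.

```rust
/// Estimates quantile `q` from cumulative bucket counts.
/// `bounds` holds the finite upper bounds in ascending order;
/// `cumulative` holds one count per bound plus a final +Inf count.
pub fn quantile_from_buckets(q: f64, bounds: &[f64], cumulative: &[u64]) -> Option<f64> {
    if bounds.is_empty() || cumulative.len() != bounds.len() + 1 {
        return None;
    }
    if !(0.0..=1.0).contains(&q) {
        return None; // also rejects NaN
    }
    let total = *cumulative.last()?;
    if total == 0 {
        return None;
    }
    let rank = q * total as f64;
    // First bucket whose cumulative count reaches the rank.
    let idx = cumulative.iter().position(|&c| c as f64 >= rank)?;
    if idx == bounds.len() {
        // Rank falls in +Inf: report the highest finite bound.
        return Some(bounds[bounds.len() - 1]);
    }
    // The first bucket is assumed to start at 0, as for positive bounds.
    let (start, below) = if idx == 0 {
        (0.0, 0)
    } else {
        (bounds[idx - 1], cumulative[idx - 1])
    };
    let in_bucket = (cumulative[idx] - below) as f64;
    if in_bucket == 0.0 {
        return Some(start);
    }
    Some(start + (bounds[idx] - start) * (rank - below as f64) / in_bucket)
}
```

With this scheme a reported median can only be resolved to within its bucket, so a p50 of 0.9 means "somewhere in (0.75, 0.9]" once interpolation error is taken into account.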
Operator playbook
| Observation | Likely diagnosis | Recommended action |
|---|---|---|
| p50 ≥ 0.9 consistently | Filter is essentially a no-op; predicate could be pushed down or removed entirely | Review query plan: candidate for filter pushdown to scan / index level; rewrite query |
| p95 ≤ 0.05 and high vector_batch_start rate | Index missing: Filter is throwing away ≥ 95 % of the rows in nearly every batch | Run ANALYZE; add an index on the predicate columns; verify with EXPLAIN |
| p99 = 1.0 only on a small set of queries | Selective predicate fires occasionally on a hot table | Acceptable; consider partial index if the predicate is stable |
| vector_fallback rate spike | A new query shape is tripping the row-path fallback | Filter vector_fallback{reason=…}; match against ProbeVectorFallbackReason |
Cross-reference runbooks/buffer-pool-pressure.md for I/O-side correlation
and the future commit-latency-tuning.md (Track B S13) for write-path
overlay.
Source-of-truth contract
| Artifact | Role |
|---|---|
| RM-0.6.3.10 §S17 | Sprint contract (this page is the operator-facing rendering) |
| RFC-2026-151 §7a Observability Contract | Long-form design (closed by S17) |
| RFC-2026-369 USDT/eBPF Observability Probes | Broader probe taxonomy (open at S17 close; soft prereq) |
| crates/angarabase/src/observability/probes.rs | USDT macro definitions + enum stability tests |
| crates/angarabase/src/metrics/core.rs::VectorSelectionRatioHistogram | Histogram storage + observe() API |
| crates/angarabase/src/metrics/render.rs::render_vector_selection_ratio | Prometheus exposition |
| crates/angarabase/src/query/vector.rs::VectorFilterV0::next_batch | Single observation point (Filter operator boundary) |