Vector Observability

AngaraBase 0.6.3.10 §S17 — closes the RFC-2026-151 §7a Observability Contract that was deferred from RM-0.6.2.9 G2 (Sprint 4 G2-001 disposition). User direction 2026-04-20 «extended scope» — close before the HTAP column-store train (RM-0.6.4.0).

This page documents the vector executor observability surface: 3 USDT probes on the hot path, 1 selection-ratio histogram exposed via /metrics, and the operator playbooks that turn those signals into rewrite / index decisions.

Surface summary

| Surface | Type | Source |
|---|---|---|
| angarabase:vector_batch_start | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase:vector_batch_end | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase:vector_fallback | USDT probe | crates/angarabase/src/observability/probes.rs |
| angarabase_vector_selection_ratio | Prometheus histogram | crates/angarabase/src/metrics/core.rs |
| angarabase_vector_fallback_total | Prometheus counter (existing) | crates/angarabase/src/metrics/core.rs |
| angarabase_vector_rows_produced_total | Prometheus counter (existing) | crates/angarabase/src/metrics/core.rs |

The probes carry the inline #[cfg(feature = "usdt")] guard, so non-usdt builds (WASM, slim test profiles) pay zero instructions per call. RFC-2026-369 remains the canonical source for the broader USDT/eBPF probe infrastructure; this page covers only the vector-executor-specific subset finalised by S17.

USDT probes

All three probes use provider name angarabase and follow the <subsystem>_<event> convention. Numeric discriminants for every enum argument are append-only — adding a new variant is non-breaking, but renumbering or removing one requires an RFC update.

vector_batch_start(operator_kind: u8, batch_size: u32, source: u8)

Fired at the entry of VectorOperator::next_batch() for every primary operator (Filter / SeqScan / IndexScan / Bridge / ParallelSeqScan).

  • operator_kind: ProbeVectorOperatorKind discriminant. Stable values: Filter=0, SeqScan=1, IndexScan=2, Bridge=3, ParallelSeqScan=4, HashJoin=5 (reserved), Aggregate=6 (reserved), Project=7 (reserved).
  • batch_size: upstream batch length in rows.
  • source: ProbeVectorBatchSource discriminant. Stable values: HeapScan=0, IndexScan=1, UpstreamVector=2, ParallelMorsel=3.
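The stable discriminants listed above can be pinned in code with explicit `#[repr(u8)]` values. The following is a minimal sketch, not the actual contents of probes.rs: the enum and variant names come from this page, while `batch_start_args` is a hypothetical helper added only to show how typed values flatten into the `(u8, u32, u8)` probe arguments.

```rust
/// Operator tag carried as `operator_kind`. Values copied from the list above.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProbeVectorOperatorKind {
    Filter = 0,
    SeqScan = 1,
    IndexScan = 2,
    Bridge = 3,
    ParallelSeqScan = 4,
    HashJoin = 5,  // reserved
    Aggregate = 6, // reserved
    Project = 7,   // reserved
}

/// Batch origin carried as `source`. Values copied from the list above.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProbeVectorBatchSource {
    HeapScan = 0,
    IndexScan = 1,
    UpstreamVector = 2,
    ParallelMorsel = 3,
}

/// Hypothetical helper (not part of the documented API): flattens the typed
/// arguments into the tuple shape that vector_batch_start carries.
pub fn batch_start_args(
    kind: ProbeVectorOperatorKind,
    batch_size: u32,
    source: ProbeVectorBatchSource,
) -> (u8, u32, u8) {
    (kind as u8, batch_size, source as u8)
}
```

Because the discriminants are written out explicitly, appending a variant cannot silently renumber the existing ones, which is the property the append-only rule relies on.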

vector_batch_end(operator_kind: u8, rows_produced: u32, rows_filtered: u32, duration_us: u64)

Fired at the exit of the same next_batch() call. rows_produced is the count emitted to the next operator; rows_filtered is the count dropped by this operator. duration_us is the wall-clock time of the call in microseconds, including any upstream next_batch() recursion.
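One way to derive these arguments is to time the whole call and compute the dropped rows from the input and output counts. The sketch below assumes a hypothetical wrapper `timed_batch` and a hypothetical `BatchEndArgs` struct; neither name appears in the codebase as far as this page states, and the closure stands in for the operator's real work.

```rust
use std::time::Instant;

/// Arguments of vector_batch_end, in probe order (hypothetical holder type).
#[derive(Debug, PartialEq, Eq)]
pub struct BatchEndArgs {
    pub operator_kind: u8,
    pub rows_produced: u32,
    pub rows_filtered: u32,
    pub duration_us: u64,
}

/// Hypothetical wrapper around one next_batch() call.
/// `run` does the operator's work and returns (rows_in, rows_out).
pub fn timed_batch<F>(operator_kind: u8, run: F) -> BatchEndArgs
where
    F: FnOnce() -> (u32, u32),
{
    let started = Instant::now();
    // Any upstream recursion happens inside `run`, so it is included in the timing.
    let (rows_in, rows_out) = run();
    let duration_us = started.elapsed().as_micros() as u64;
    BatchEndArgs {
        operator_kind,
        rows_produced: rows_out,
        // Rows dropped by this operator; saturating to stay safe on bad input.
        rows_filtered: rows_in.saturating_sub(rows_out),
        duration_us,
    }
}
```

For example, `timed_batch(0, || (1000, 40))` describes a Filter batch that received 1000 rows, emitted 40, and dropped 960.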

vector_fallback(plan_kind: u8, reason: u8)

Fired wherever the planner / executor falls back to the row path.

  • plan_kind: ProbeOperator discriminant (best-effort tag for the plan node that tripped the fallback; e.g. HashProbe=4, Aggregate=7).
  • reason: ProbeVectorFallbackReason discriminant. Stable values: UnsupportedPlan=0, TypeError=1, NonEquiJoin=2, BudgetExceeded=3, FeatureDisabled=4.
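On the consumer side, the numeric `reason` argument has to be mapped back to a variant name. The sketch below shows one way to do that; `fallback_reason_name` and `describe_fallback` are hypothetical helper names, and only the reason values listed above are assumed. Unknown values are reported rather than rejected, since the discriminants are append-only and a newer build may emit variants an older consumer does not know.

```rust
/// Maps the `reason` argument of vector_fallback to its variant name.
/// Returns None for values outside the list documented on this page.
pub fn fallback_reason_name(reason: u8) -> Option<&'static str> {
    match reason {
        0 => Some("UnsupportedPlan"),
        1 => Some("TypeError"),
        2 => Some("NonEquiJoin"),
        3 => Some("BudgetExceeded"),
        4 => Some("FeatureDisabled"),
        _ => None,
    }
}

/// Formats one (plan_kind, reason, count) row, for example a row read from
/// the bpftrace map output. plan_kind is left numeric because this page
/// lists only example ProbeOperator values, not the full mapping.
pub fn describe_fallback(plan_kind: u8, reason: u8, count: u64) -> String {
    let reason = fallback_reason_name(reason).unwrap_or("Unknown");
    format!("plan_kind={plan_kind} reason={reason} count={count}")
}
```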

Wire contract notice. S17 finalises the vector_fallback argument shape, replacing the legacy ad-hoc (u64, u64) literals that the S9-D4 code shipped with. RFC-2026-369 was open at S17 close so no production bpftrace consumers were broken; new consumers MUST use the ProbeOperator × ProbeVectorFallbackReason mapping.

bpftrace recipes

# Live histogram of post-Filter selectivity in percent (per batch, printed and reset every 60 s).
usdt:./angarabased:angarabase:vector_batch_end /arg0 == 0 && (arg1 + arg2) > 0/ {
    @sel = hist(arg1 * 100 / (arg1 + arg2));
}
interval:s:60 {
    print(@sel);
    clear(@sel);
}

# Fallback counts by (plan_kind, reason) since attach; leave running for an hour for an hourly view.
usdt:./angarabased:angarabase:vector_fallback {
    @[arg0, arg1] = count();
}

# Vector hot-path call rate by operator.
usdt:./angarabased:angarabase:vector_batch_start {
    @[arg0] = count();
}

Self-test scripts live under tools/usdt/, per the standing convention from RFC-2026-369 §4. To list the probes a binary exposes, run bpftrace -l 'usdt:./angarabased:angarabase:*'.

Histogram: angarabase_vector_selection_ratio

Cumulative Prometheus histogram (HELP / TYPE headers emitted on every scrape) tracking the per-batch ratio rows_produced / rows_scanned observed by VectorFilterV0::next_batch().

Bucket scheme (compatible with histogram_quantile()):

[0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0, +Inf]

The +Inf bucket exists for protocol compliance only — selection ratio is bounded to [0.0, 1.0] by construction (kept ≤ scanned enforced inside apply_predicate). Empty batches (rows_scanned == 0) carry no signal and are silently dropped by VectorSelectionRatioHistogram::observe().
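The observation rule described above (cumulative buckets, empty batches dropped) can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the VectorSelectionRatioHistogram implementation: the struct name `SelectionRatioHistogram`, its field layout, and the use of plain integers instead of atomics are all choices made for the example.

```rust
/// Finite upper bounds from the bucket scheme above. The +Inf bucket is
/// implicit: its cumulative count always equals `count`.
pub const BOUNDS: [f64; 10] = [0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0];

/// Hypothetical single-threaded stand-in for the real histogram storage.
#[derive(Debug, Default)]
pub struct SelectionRatioHistogram {
    /// Cumulative counts: buckets[i] counts observations with ratio <= BOUNDS[i].
    pub buckets: [u64; 10],
    pub sum: f64,
    pub count: u64,
}

impl SelectionRatioHistogram {
    /// Records one batch. Empty batches carry no signal and are dropped.
    pub fn observe(&mut self, rows_produced: u64, rows_scanned: u64) {
        if rows_scanned == 0 {
            return;
        }
        let ratio = rows_produced as f64 / rows_scanned as f64;
        // Cumulative layout: bump every bucket whose bound covers the ratio.
        for (slot, bound) in self.buckets.iter_mut().zip(BOUNDS.iter()) {
            if ratio <= *bound {
                *slot += 1;
            }
        }
        self.sum += ratio;
        self.count += 1;
    }
}
```

A batch that keeps 5 of 100 rows lands in the `le="0.05"` bucket and every wider one, which is what makes the bucket series directly usable by histogram_quantile().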

The histogram is rendered alongside the existing wait-event histogram by render_prometheus() and is covered by:

  • metrics::render::tests::vector_selection_ratio_histogram_appears_in_prometheus_output_rm06310_s17 (exposition-shape test);
  • metrics::render::tests::vector_selection_ratio_histogram_observe_buckets_rm06310_s17 (bucket-edge test, independent of the renderer).
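For orientation, a Prometheus histogram in the text exposition format consists of HELP and TYPE headers, one `_bucket` line per upper bound plus `+Inf`, and `_sum` / `_count` lines. The sketch below produces that shape; it is not the body of render_vector_selection_ratio. The function name `render_selection_ratio`, its signature, and the HELP text are assumptions made for the example, and the real renderer may format bounds differently (Rust's default float formatting prints 1.0 as `1`).

```rust
use std::fmt::Write;

/// Renders cumulative bucket counts as Prometheus text exposition.
/// `bounds` are the finite upper bounds; `cumulative[i]` is the count of
/// observations <= bounds[i]; `count` doubles as the +Inf bucket value.
pub fn render_selection_ratio(bounds: &[f64], cumulative: &[u64], sum: f64, count: u64) -> String {
    let name = "angarabase_vector_selection_ratio";
    let mut out = String::new();
    // HELP text is illustrative only; the real wording is not given on this page.
    writeln!(out, "# HELP {name} Per-batch selection ratio of the vector Filter.").unwrap();
    writeln!(out, "# TYPE {name} histogram").unwrap();
    for (bound, hits) in bounds.iter().zip(cumulative) {
        writeln!(out, "{name}_bucket{{le=\"{bound}\"}} {hits}").unwrap();
    }
    writeln!(out, "{name}_bucket{{le=\"+Inf\"}} {count}").unwrap();
    writeln!(out, "{name}_sum {sum}").unwrap();
    writeln!(out, "{name}_count {count}").unwrap();
    out
}
```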

PromQL examples

# Median Filter selectivity over the last 5 minutes.
histogram_quantile(
  0.5,
  rate(angarabase_vector_selection_ratio_bucket[5m])
)

# Share of batches with extremely-low selectivity (≤ 5 %).
sum(rate(angarabase_vector_selection_ratio_bucket{le="0.05"}[5m]))
  /
sum(rate(angarabase_vector_selection_ratio_count[5m]))

# Mean selectivity (sum / count, both already rate-friendly).
rate(angarabase_vector_selection_ratio_sum[5m])
  /
rate(angarabase_vector_selection_ratio_count[5m])
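When reading the quantile queries above, it helps to remember that histogram_quantile() does not see individual batches: it finds the bucket containing the target rank and interpolates linearly inside it, so the result is only as precise as the bucket edges. The sketch below mirrors that documented behaviour in simplified form; `estimate_quantile` is a hypothetical name and the code is an illustration, not Prometheus source.

```rust
/// Estimates quantile `q` from cumulative bucket counts.
/// `bounds` holds the finite upper bounds; `cumulative` has one extra
/// trailing entry for the +Inf bucket (the total observation count).
pub fn estimate_quantile(q: f64, bounds: &[f64], cumulative: &[u64]) -> Option<f64> {
    if bounds.is_empty() || cumulative.len() != bounds.len() + 1 {
        return None;
    }
    let total = *cumulative.last()? as f64;
    if total == 0.0 || !(0.0..=1.0).contains(&q) {
        return None;
    }
    let rank = q * total;
    // First bucket whose cumulative count reaches the target rank.
    let idx = cumulative.iter().position(|&c| c as f64 >= rank)?;
    if idx == bounds.len() {
        // Rank falls in the +Inf bucket: report the highest finite bound.
        return Some(bounds[bounds.len() - 1]);
    }
    // The lowest bucket is assumed to start at 0, as ratios are non-negative.
    let (lower, below) = if idx == 0 {
        (0.0, 0.0)
    } else {
        (bounds[idx - 1], cumulative[idx - 1] as f64)
    };
    let in_bucket = cumulative[idx] as f64 - below;
    if in_bucket == 0.0 {
        // Only reachable for q = 0 with an empty first bucket.
        return Some(lower);
    }
    // Linear interpolation between the bucket's lower and upper bound.
    Some(lower + (bounds[idx] - lower) * (rank - below) / in_bucket)
}
```

With bounds `[0.1, 0.5, 1.0]` and cumulative counts `[10, 30, 40, 40]`, the median rank is 20, which sits halfway through the (0.1, 0.5] bucket, so the estimate is about 0.3 regardless of where the batches in that bucket actually fell.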

Operator playbook

| Observation | Likely diagnosis | Recommended action |
|---|---|---|
| p50 ≥ 0.9 consistently | Filter is essentially a no-op; predicate could be pushed down or removed entirely | Review query plan: candidate for filter pushdown to scan / index level; rewrite query |
| p95 ≤ 0.05 and high vector_batch_start rate | Index missing: Filter discards ≥ 95 % of rows in at least 95 % of batches | Run ANALYZE; add an index on the predicate columns; verify with EXPLAIN |
| p99 = 1.0 only on a small set of queries | Selective predicate fires occasionally on a hot table | Acceptable; consider partial index if the predicate is stable |
| vector_fallback rate spike | A new query shape is tripping the row-path fallback | Filter vector_fallback{reason=…} and match against ProbeVectorFallbackReason |

Cross-reference runbooks/buffer-pool-pressure.md for I/O-side correlation and the future commit-latency-tuning.md (Track B S13) for write-path overlay.

Source-of-truth contract

| Artifact | Role |
|---|---|
| RM-0.6.3.10 §S17 | Sprint contract (this page is the operator-facing rendering) |
| RFC-2026-151 §7a Observability Contract | Long-form design (closed by S17) |
| RFC-2026-369 USDT/eBPF Observability Probes | Broader probe taxonomy (open at S17 close; soft prerequisite) |
| crates/angarabase/src/observability/probes.rs | USDT macro definitions + enum stability tests |
| crates/angarabase/src/metrics/core.rs::VectorSelectionRatioHistogram | Histogram storage + observe() API |
| crates/angarabase/src/metrics/render.rs::render_vector_selection_ratio | Prometheus exposition |
| crates/angarabase/src/query/vector.rs::VectorFilterV0::next_batch | Single observation point (Filter operator boundary) |