Runbook: GCBloatHigh

Source of truth: tools/observability/alerts/angarabase_alerts.yaml. Backed by: RM-0.6.3.8 S7.

What It Means

angarabase_gc_tuning_bloat_ratio_percent > 50: for each “live” version there is more than one “dead” version that AngaraGC cannot remove. Most often this is a symptom of a long transaction that blocks garbage collection (see LongTransaction).
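
To make the threshold concrete, the arithmetic can be sketched as below. This assumes the metric is computed as dead / (dead + live) × 100, which is inferred from the description above and not confirmed by the alert source; `bloat_ratio_percent` is a hypothetical helper, not part of any AngaraBase tooling.

```shell
# Hypothetical helper: bloat as a percentage of all row versions.
# ASSUMPTION: ratio = dead / (dead + live) * 100 (not confirmed by the alert definition).
bloat_ratio_percent() {
  local dead="$1" live="$2"
  awk -v d="$dead" -v l="$live" 'BEGIN {
    total = d + l
    if (total == 0) { print "0.00"; exit }   # no versions at all: report zero bloat
    printf "%.2f\n", 100 * d / total
  }'
}
```

Under that assumption, one dead version per live version sits exactly at the 50% boundary, and any additional dead version pushes the ratio over it.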

Severity

warning. At 80%+ bloat, the buffer pool hit ratio drops.

Initial response

  1. Grafana Overview v2 → row “GC / MVCC”.
  2. Check the LongTransaction alert — the root cause is usually there.
  3. Check gc_tuning_state to see whether auto-tuning is already reacting on its own.
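
The metric check in the last step can be done from a terminal with a small filter like the one below. It is a minimal sketch that assumes the metrics endpoint serves the Prometheus text exposition format; `metric_samples` is a hypothetical helper name. Unlike a plain text search, it matches only on the metric name, so HELP/TYPE comment lines and label values do not produce false hits.

```shell
# Hypothetical helper: print every sample line whose metric name contains
# the given substring. Reads metrics text on stdin.
# ASSUMPTION: input is Prometheus text exposition format.
metric_samples() {
  local pattern="$1"
  awk -v pat="$pattern" '
    /^#/ || NF == 0 { next }          # skip HELP/TYPE comments and blank lines
    {
      name = $0
      sub(/[{ ].*/, "", name)         # keep only the metric name (drop labels and value)
      if (index(name, pat) > 0) print
    }'
}
```

For example, `curl -sf http://127.0.0.1:9898/metrics | metric_samples gc_tuning_state` would list only the tuning-state samples.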

Diagnostics

# GC and MVCC metrics from the local metrics endpoint
curl -sf http://127.0.0.1:9898/metrics | rg gc_
curl -sf http://127.0.0.1:9898/metrics | rg mvcc_

# Top tables by bloat (dead tuples as a percentage of live tuples)
psql -c "SELECT schemaname, relname, n_dead_tup, n_live_tup,
                round(100.0 * n_dead_tup / NULLIF(n_live_tup,0), 2) AS bloat_pct
         FROM pg_stat_user_tables
         WHERE n_dead_tup > 1000
         ORDER BY bloat_pct DESC NULLS LAST LIMIT 10;"

Mitigation

  1. Close long transactions — see LongTransaction.
  2. Run vacuum on hot tables.
  3. Tune GC — increase auto-tuning aggressiveness (see mvcc-gc.md §Knobs).
  4. Perform a full rebuild (requires downtime) if bloat stays above 70% and vacuum does not help.
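
The vacuum step can be prepared as a reviewable list of statements instead of being typed by hand. The sketch below assumes that AngaraBase accepts PostgreSQL-style `VACUUM (VERBOSE)` syntax, which the runbook does not confirm; `vacuum_statements` is a hypothetical helper. It only prints statements and executes nothing.

```shell
# Hypothetical helper: turn "schema|table" lines (the default output of
# `psql -At`) into one VACUUM statement per table.
# ASSUMPTION: PostgreSQL-style VACUUM (VERBOSE) is supported by the server.
vacuum_statements() {
  awk -F'|' 'NF == 2 {
    s = $1; t = $2
    gsub(/"/, "\"\"", s); gsub(/"/, "\"\"", t)   # escape embedded double quotes
    printf "VACUUM (VERBOSE) \"%s\".\"%s\";\n", s, t
  }'
}

# Possible usage (review the output before piping it back into psql):
#   psql -At -c "SELECT schemaname, relname FROM pg_stat_user_tables
#                WHERE n_dead_tup > 1000 ORDER BY n_dead_tup DESC LIMIT 10" \
#     | vacuum_statements
```

Quoting both identifiers keeps mixed-case or otherwise unusual table names intact.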

Escalation

If bloat stays above 70% and does not fall after running vacuum and closing long transactions, collect diagnostics and escalate (service downtime may be needed).
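
The escalation rule can be expressed as a small check, shown here as a sketch. `should_escalate` is a hypothetical helper; it takes the bloat percentage measured before mitigation and the one measured after, and treats "does not fall" as "after is greater than or equal to before", which is an interpretation rather than something the alert definition states.

```shell
# Hypothetical helper: exit status 0 means "escalate".
# Rule from this runbook: bloat is above 70% and did not fall after mitigation.
# ASSUMPTION: "did not fall" is read as after >= before.
should_escalate() {
  local before="$1" after="$2"
  awk -v b="$before" -v a="$after" 'BEGIN { exit !(a > 70 && a >= b) }'
}

# Possible usage:
#   if should_escalate 74.5 76.0; then echo "collect diagnostics and escalate"; fi
```

A reading that is still above 70% but falling does not trigger the check, which leaves room for vacuum to keep making progress before anyone is paged.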