Transactions and MVCC
AngaraBase ensures concurrent access to data via MVCC (Multi-Version Concurrency Control). Transactions guarantee atomicity of changes, while MVCC allows readers and writers to operate simultaneously without mutually blocking each other.
Transaction Basics
Transaction Management
BEGIN; -- start an explicit transaction
SAVEPOINT sp1; -- create a savepoint
ROLLBACK TO SAVEPOINT sp1; -- rollback to the savepoint
COMMIT; -- commit the transaction
ROLLBACK; -- rollback the entire transaction
Autocommit
By default, AngaraBase operates in autocommit mode: each individual SQL statement executes as a standalone transaction. If the statement completes successfully, the result is committed automatically; on error, it is rolled back.
For operations affecting multiple rows or tables, use explicit transactions (BEGIN / COMMIT) to group changes into a single atomic unit.
MVCC: Row Versioning
The core idea of MVCC is that readers do not block writers, and writers do not block readers. This is achieved by storing multiple versions of each row.
Version Metadata
Each row version contains two system fields:
| Field | Purpose |
|---|---|
created_commit | Epoch (commit timestamp) when the version was created |
deleted_commit | Epoch when the version was marked as deleted (∞ for active versions) |
Visibility Rule
A row version is visible to a transaction with snapshot S if both conditions are met:
created_commit <= S— the version was created before or at the time of the snapshot- The version is not deleted, or
deleted_commit > S— the deletion occurred after the snapshot
Write Operations
- INSERT — creates a new row version with
created_commit= current epoch - UPDATE — does not modify the row in-place. Instead, it marks the current version as deleted (
deleted_commit= current epoch) and creates a new version with the updated data - DELETE — marks the version as deleted (
deleted_commit= current epoch)
Isolation Levels
Each transaction receives a snapshot — a fixed view of the data at a specific point in time.
READ COMMITTED (Default)
The snapshot is updated before each statement. The transaction sees all data committed before the start of the current statement. This is the recommended isolation level for most workloads.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
REPEATABLE READ
The snapshot is fixed at the time of BEGIN and remains unchanged until the end of the transaction. All statements within the transaction see the same state of the data.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SERIALIZABLE
As of version 0.6.4.4, AngaraBase implements full SERIALIZABLE (SSI) isolation level.
In SERIALIZABLE mode, write skew and phantoms anomalies are prevented through SIREAD locks and tracking read-write anti-dependencies.
Transactions violating serializability are aborted with code 40001.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Locks
Read Operations
Reading uses the MVCC snapshot and does not require locks. A reader never waits for a writer, and vice versa.
Write Operations (Lock-Free DML)
Writers use atomic operations (Compare-And-Swap) for row-level version installation, ensuring lock-free and conflict-free modifications. Traditional write locks are no longer held.
DDL Operations
Schema change operations (CREATE TABLE, ALTER TABLE, DROP TABLE) acquire table-level locks for the duration of the execution.
Deadlock Detection
AngaraBase detects deadlocks using:
- Timeout — if a transaction waits for a lock longer than a configured threshold, it is aborted
- Victim selection — upon detecting a cycle, the system chooses a victim transaction to rollback
- Deterministic lock ordering — an internal strategy for ordering locks to reduce the likelihood of deadlocks
Garbage Collection (AngaraGC)
Over time, the storage accumulates old row versions that are no longer visible to any active transaction. The AngaraGC subsystem is responsible for cleaning them up.
How It Works
- GC watermark — calculated as the minimum snapshot among all active transactions:
min(active_snapshots) - Row versions with
deleted_commit < watermarkare safe to delete — no active transaction can see them - Cleanup is performed by a background process without pausing query execution
- Bounded slices — GC processes data in fixed-size chunks to avoid latency spikes
- Epoch Reaper — a background worker (since version 0.6.5.24) that prevents the GC watermark from stalling due to abruptly disconnected or hung sessions.
Difference from PostgreSQL
AngaraBase does not have autovacuum in the traditional sense. It uses a hybrid design with an epoch-based watermark (similar to Oracle/InnoDB), allowing for more precise control over the cleanup timing.
Recommendations
- Use READ COMMITTED (the default level) for most workloads
- Avoid long-running transactions — they hold back the GC watermark and prevent the cleanup of old row versions, which increases disk space consumption
- When using REPEATABLE READ, be aware of potential write skew. If strict serializability is needed, use explicit locks (
SELECT ... FOR UPDATE)
MVCC State Upon Crash Recovery
Upon restarting after a crash, AngaraBase restores the MVCC state from the transaction log (WAL).
What Is Restored
- Committed transactions — transactions that managed to write a COMMIT to the WAL
- Aborted transactions — incomplete transactions are marked as aborted
- MVCC visibility — information about which row versions are visible for each commit epoch
- Transaction counters — current commit epoch and other counters
Recovery Process
- WAL scan — scanning the transaction log files in chronological order
- MVCC replay — restoring in-memory MVCC structures from WAL records
- Cleanup — marking uncompleted transactions as aborted
Limitations
- Backend requirement: MVCC recovery only works with
transaction_log.backend = "file_bin" - Memory rebuild: The MVCC state is rebuilt in memory, which can take time for large WAL volumes
- Read-your-writes: Immediately after restart, uncompleted transactions are invisible (marked as aborted)
Monitoring Recovery
-- Check the recovery mode
SELECT recovery_mode FROM sys.identity;
-- Check system health after recovery
SELECT txn_commit_epoch_current FROM sys.health;
Possible recovery_mode values:
"normal"— normal start without recovery"crash_recovery"— recovery after a crash"forced_takeover"— forced instance lease takeover
Related Sections
- Storage Engine — storage architecture and page format
- Instance Lifecycle — instance lifecycle and crash recovery
- Crash Recovery — operational procedures for recovery
- SQL Reference — SQL statement syntax