Undo-based rewind
OrioleDB provides the ability to rewind database state to a some point in the past. This is achieved by using the undo log that records modifications for all orioledb
tables. For heap tables this is used by delaying vaccuum, so that while the transaction is not past the rewind boundary, the past versions of tuples are retained in the heap table, being invisible to the user.
Undo-based rewind is faster than PITR based on applying WAL or logical backup. But it doesn't prohibit the latter.
This feature is currently experimental and should be used with caution.
Undo-based rewind is an optional feature that can be enabled by setting the appropriate configuration parameters.
Usage
To use rewind functionality, the following parameters should be set before starting a cluster:
orioledb.enable_rewind = true
Set parameters to limit retaining rewind data:
orioledb.rewind_max_time
-- time in seconds to hold rewind informationorioledb.rewind_max_transactions
-- overall number of OrioleDB and Heap transactions (including subtransactions) to hold for rewindorioledb.rewind_buffers
-- number of in-memory buffers to store rewind data
Full description of configuration parameters
After setting the GUC parameters above start the cluster. From this time rewind data will be stored into a temporary rewind buffer and could be used to rewind.
To start rewind run one of the queries:
select orioledb_rewind_by_time(1000);
-- rewind all transactions committed after 1000 seconds before present
select orioledb_rewind_to_timestamp('2024-12-18 10:59:54 GMT+4');
-- rewind all transactions committed after specified timestamp
select pg_current_xact_id();
pg_current_xact_id
--------------------
1750
select orioledb_get_current_oxid();
orioledb_get_current_oxid
---------------------------
555
// Work with database normally add/modify/delete rows, tables etc.
orioledb_rewind_to_transaction(1750, 555);
This will rewind a cluster to a previously acquired transaction state specified by a pair of (xid, oxid)
After rewind database shuts down. Start it again and you'll get it in a requested working state in the past.
Trying to rewind past time or xid theshold will output an error. If xid or oxid is in the future, rewind will be based on the other one.
To check if you can rewind to a transaction:
select orioledb_get_complete_xid();
orioledb_get_complete_xid
---------------------------
738
select orioledb_get_complete_oxid();
orioledb_get_complete_oxid
----------------------------
3
You can rewind to a precise transaction oxid 555 xid 1750 as if both numbers are bigger than the oldest stored in the rewind buffer (oxid 3 xid 738).
After rewind transactions that are past rewind point completely disappear. There is no way to undo rewind, other than by restoring from a backup, so it's wise to have a backup at the latest point of time before starting rewind. Particularly, if you use backups and rewind you need to be extra careful about what points of time you get backups at and not occasionally overwrite older backups (before doing rewind) with newer ones (after doing rewind).
Limitations
As mentioned above, rewind mode is currently experimental. The major limitations of this mode are the following.
- Undo-based rewind is limited to a cluster uptime. Rewind buffer is not persistent and you can not rewind past cluster start time. At cluster start time all transactions that were reserved to rewind are permanently fixed (completed).
- Starting database cluster after rewind could not be done by the backend that invokes rewind. So database will be shut down after rewind and then the user should start it.
- Undo-based rewind automatically supports
heap
tables in the cluster. Significant writes toheap
tables will retain old versions of tuples reserved for rewind and cause bloat to heap tables. This bloat can not be removed by vacuum until the transaction is past the rewind retain boundary. It's recommended to use rewind for clusters with most of the writes in theorioledb
tables, and only occasional writes toheap
. - There are yet no safeguards against interruption of rewind when it's already started. This interruption can leave database in an inconsistent partially rewound state.
- Rewind is not yet compatible with replication yet as rewind buffer state is not fixed into WAL, so any transactions that could be rewound on source, will still look as committed on replica.