Streaming replication in Postgres allows for continuous data replication from a primary server to standby servers, to ensure high availability and balance read-only workloads. However, replication lag can occur, leading to delays in data synchronization. This guide provides steps to troubleshoot and mitigate replication lag.
To start investigation we need to understand where we actually have lag, on which stage of replication:
- sending WAL stream to replica via network by
walsender
- receiving WAL stream on replica from network by
walreciever
- writing WAL on disk on replica by
walreciever
- applying (replaying) WAL as a recovery process
Thus, streaming replication lag can be categorized into three types:
- Write Lag: The delay between when a transaction is committed on the primary and when it is written to the WAL on the standby.
- Flush Lag: The delay between when a transaction is written to the WAL on the standby and when it is flushed to the disk.
- Apply (Replay) Lag: The delay between when a transaction is flushed to the disk and when it is applied to the database on the standby.
To identify the lag, use the following SQL query:
select
pid,
client_addr,
application_name,
state,
coalesce(pg_current_wal_lsn() - sent_lsn, 0) AS sent_lag_bytes,
coalesce(sent_lsn - write_lsn, 0) AS write_lag_bytes,
coalesce(write_lsn - flush_lsn, 0) AS flush_lag_bytes,
coalesce(flush_lsn - replay_lsn, 0) AS replay_lag_bytes,
coalesce(pg_current_wal_lsn() - replay_lsn, 0) AS total_lag_bytes
from pg_stat_replication;
We will get something like this:
postgres=# select
pid,
client_addr,
application_name,
state,
coalesce(pg_current_wal_lsn() - sent_lsn, 0) AS sent_lag_bytes,
coalesce(sent_lsn - write_lsn, 0) AS write_lag_bytes,
coalesce(write_lsn - flush_lsn, 0) AS flush_lag_bytes,
coalesce(flush_lsn - replay_lsn, 0) AS replay_lag_bytes,
coalesce(pg_current_wal_lsn() - replay_lsn, 0) AS total_lag_bytes
from pg_stat_replication;
pid | client_addr | application_name | state | sent_lag_bytes | write_lag_bytes | flush_lag_bytes | replay_lag_bytes | total_lag_bytes
---------+----------------+------------------+-----------+----------------+-----------------+-----------------+------------------+-----------------
3602908 | 10.122.224.101 | backupmachine1 | streaming | 0 | 728949184 | 0 | 0 | 0
2490863 | 10.122.224.102 | backupmachine1 | streaming | 0 | 519580176 | 0 | 0 | 0
2814582 | 10.122.224.103 | replica1 | streaming | 0 | 743384 | 0 | 1087208 | 1830592
3596177 | 10.122.224.104 | replica2 | streaming | 0 | 2426856 | 0 | 4271952 | 6698808
319473 | 10.122.224.105 | replica3 | streaming | 0 | 125080 | 162040 | 4186920 | 4474040
Meaning of those _lsn
sent_lsn
: How much WAL (lsn position) has already been sent over the networkwrite_lsn
: How much WAL (lsn position) has been sent to the operating system (before flushing)flush_lsn
: How much WAL (lsn position) has been flushed to the disk (written on the disk)replay_lsn
: How much WAL (lsn position) has been applied (visible for queries)
So lag is a gap between pg_current_wal_lsn
and replay_lsn
(total_lag_bytes
, and it's a good idea to add it to monitoring, but for troubleshooting purposes we will need all 4
- Lag on
sent_lag_bytes
means we have issues with sending the data, i.e. CPU saturatedWALsender
or overloaded network socket on the primary side - Lag on
write_lag_bytes
means we have issues with receiving the data, i.e. CPU saturatedWALreceiver
or overloaded network socket on the replica side - Lag on
flush_lag_bytes
means we have issues with writing the data on the disk on replica side, i.e. CPU saturated or IO contention ofWALreceiver
- Lag
replay_lag_bytes
means we have issues with applying WAL on replica, usually CPU saturated or IO contention of postgres process
Once we pinpointed the problem, we need to troubleshoot the process(es) on the OS level to find the bottleneck.
TBD
- Streaming replication (official Postgres docs)
- pg_stat_replication view (official Postgres docs)
- Replication configuration parameters (official Postgres docs)
- Dmitry Fomin
- Sadeq Dousti
- Nikolay Samokhvalov