/ Server / Postgresql Stream Replication Setup


Published:2014-06-12T13:23:24Z Edited:2016-11-06T13:23:24Z

Recently Postgresql Stream replication become even easier, In this article you''ll be shown how to setup stream replication for postgresql 9.3+.

 

Postgresql Stream Replication

stream replication is a feature that let a mirror/replica/standby database to get a stream of changes that have been done on another primary database, to be replicated on as many standby databases as needed.

 

Use Cases

stream replication can be used as follows:

  • As Part of Data protection plan, Whenever you critical data and you cannot afford loosing any piece of it, then stream replication will be just part of your strategy along with other replication methods, such as syncronous replication, hot-standby replication and frequent offline backups ..etc
  • To scale horizontally to accomdate Read-Only high web traffic load, by adding more standby databases you can scale read-only capacity almost linearly, this is very important if using caching/reverse proxy is not an option.
  • Migration your database to another webhost, when your database is mostly read-only, migration to another host with zero downtime is a breeze.

 

 

Stream replication steps

To perform stream replication, between one primary database on 192.168.1.5 and nother standby database on 192.168.1.7, both servers must have postgresql database installed, then follow the steps

On The Primary postgresql do:

  1. Create a user with replication previliages, login to psql as database superuser.
    On Debian be root user then
    su postgres
    psql
    
    On OpenBSD
     psql -U postgres
    
    And once you're in psql create the replication user
    CREATE USER "replication" WITH REPLICATION;
    
    Set the user password
    \password replication
    
    You'll be prompted to enter the password
    then Exit psql console
  2. Now append to pg_hba.conf replace 192.168.1.7 with standby server ip address
     host replication replication 192.168.1.7/32 md5
    
  3. Append to/Edit postgresql.conf so it contains the following values:
    wal_level=replica
    
    max_wal_senders control concurrency, minimum value for replication is 2, the more standby servers this value need to increased
    max_wal_senders = 2
    
    wal segments to be retain, for heavy-write server consider increase it
    wal_keep_segments = 32
    
    So all settings that need to be present in postgresql.conf are the following:
    wal_level=replica
    max_wal_senders = 2
    wal_keep_segments = 32
    
    thats it, now restart postgresql: On Debian
    service postgresql restart
    
    On OpenBSD :
    rcctl restart postgresql
    
    This restart could be avoided if you plan ahead, having these settings from the begining :)
  4. Install postgresql on the new standby:
    • On Debian run:
      apt-get install postgresql
      
    • On Openbsd run:
      pkg_add postgresql-server
    • Make sure to stop postgres daemon if its already running.
      • On Debian
         service postgresql stop 
      • On openBSD
        rcctl stop postgresql
  5. Go to postgresql home dirctory, ensure that the data directory is indeed empty so that streaming operation can fill it from the primary server:
    • Postgresql Home On Debian its /var/lib/postgresql
    • Postgresql Home On OpenBSD its /var/postgresql
  • Switch to Database System user:
    • On Debian its postgres
    • On OpenBSD its _postgresql
  • Run pg_basebackup replace 192.168.1.5, 5432 with ipaddress, port respectively where primary database daemon is listening
    pg_basebackup -D data -R -P -U replication --xlog-method=stream -h 192.168.1.5 -p 5432
    
  • Now start postgresql daemon on the secondary.
    • On Debian
      service postgresql start
    • On openBSD
      rcctl start postgresql
      

 

Standby database will catch up unless ...

if there wasn't any downtime in the connection between primary and standby server then standby database will keep syncronized to primary server.
 
But what if connection went down for some time will it catch up with changes done on primary?
 
Maybe, but likely it won't, especially if you've heavy-write primary, wal segments may've been gone already before theyve been sent to the standby database and noway to retrieve them unless you startover with a fresh base data using pg_basebackup like the above.
 

 

Make stream replication survive longer lags

If we activiate archiving then standby database can follow up by using shipped archived wal segments to be replayed, this will let it catchup more likely ( depend on the wal_keep_segments )
Assuming you want to store database archive in /tmp/db-archive/ on standby server, then to enable archive follow the following steps:
 

On the primary database

append the following to postgresq.conf
archive_mode = on
archive_command = 'rsync -vka %p remote@192.168.1.7:/tmp/db-archive/%f'
Or if you mounted the standby database host on /mnt/standby7
archive_command = 'test ! -f /mnt/standby7/tmp/db-archive/%f && cp %p /mnt/standby7/tmp/%f
Save file then Restart postgresql daemon
 

On Standby server

Edit recovery.conf that had been writen automaticlly by pg_basebackup to setup restore_command as follow
restore_command = 'cp /tmp/db-archive/%f "%p"'
Then restart postgresql daemon
 
 

Promoting Standby database to primary

In case you've used stream replication to migrate your database server to another webhost, then you need to promote your standby database to become the new primary database, to do that run the following command as postgresql user:
  • On Debian
    pg_ctl promote -D /var/lib/postgresql/9.6/main
    
  • On openBSD
    pg_ctl promote -D /var/postgresql/9.6/main
    
Where 9.6/main is database data directory.
 
 

References

Next
iptables rules for securing webservers
Previous
Tags