
Keepalived

The PostgreSQL component can optionally be deployed with a PostgreSQL router.

The router consists of a high-availability (HA) setup of two servers with a virtual IP address (VIP).

This VIP is managed using Keepalived.

Requirements

On the PostgreSQL router servers, the following is required for Keepalived to function properly:

  • The binary is rolled out using the (standard) Keepalived RPM (from Satellite).
  • The Keepalived configuration lives in /etc/keepalived/keepalived.conf and is deployed and managed via Ansible.
  • The value of virtual_router_id must be unique per Keepalived cluster within a network segment. Ansible generates this value pseudo-randomly.
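For orientation, a minimal sketch of the relevant parts of such a keepalived.conf. The interface, priority, VIP, and virtual_router_id values below are placeholders; the real values come from the Ansible deployment:

```
vrrp_instance VI_1 {
    state BACKUP              # both nodes start as BACKUP in balanced mode
    interface ens192
    virtual_router_id 51      # placeholder; generated per cluster by Ansible
    priority 100              # equal on both nodes
    nopreempt                 # no fallback when a recovered node returns
    virtual_ipaddress {
        10.0.4.28/23          # placeholder VIP
    }
}
```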

Important:
Ensure that the router nodes are patched one at a time. After patching the first node, verify that it has fully recovered and the VIP is active before proceeding with the second node.

Usage

Keepalived is configured in balanced mode in this setup.

This means both instances have the same priority, and no fallback is triggered when both nodes are available again.

Under normal circumstances, both nodes are available and one of them holds the VIP (which one is not deterministic).

Which node holds the VIP can be checked with the following command:


Node 1

me@gurus-ansible-server1 ~/g/ansible-postgres (tmp)> ssh gurus-pgsdb-server1.acme.corp.com
[me@gurus-pgsdb-server1 ~]$ ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever

2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:50:56:9d:54:47 brd ff:ff:ff:ff:ff:ff
    inet 10.0.4.*26/23 brd 10.0.4.455 scope global noprefixroute ens192
      valid_lft forever preferred_lft forever

[me@gurus-pgsdb-server1 ~]$ logout
Connection to gurus-pgsdb-server1.acme.corp.com closed.

Node 2

me@gurus-ansible-server1 ~/g/ansible-postgres (tmp)> ssh gurus-pgsdb-server2.acme.corp.com
[me@gurus-pgsdb-server2 ~]$ ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever

2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:50:56:9d:79:5e brd ff:ff:ff:ff:ff:ff
    inet 10.0.4.*27/23 brd 10.0.4.455 scope global noprefixroute ens192
      valid_lft forever preferred_lft forever
    inet 10.0.4.*28/23 scope global secondary ens192
      valid_lft forever preferred_lft forever

In this example, node 2 holds the VIP (it has two IP addresses: 10.0.4.*27 and 10.0.4.*28), while node 1 does not (one IP address: 10.0.4.*26).
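Rather than reading the full `ip a` output, the check can be narrowed down to the VIP itself. A minimal sketch; the VIP 10.0.4.28 and the interface ens192 are illustrative, and the check here runs against captured sample output so it is self-contained — on a router node you would pipe `ip -4 addr show dev ens192` instead:

```shell
#!/bin/sh
# Illustrative VIP; the real (redacted) address comes from the Ansible deployment.
VIP="10.0.4.28"

# Captured sample of `ip -4 addr show dev ens192` from the node holding the VIP.
sample='    inet 10.0.4.27/23 scope global noprefixroute ens192
    inet 10.0.4.28/23 scope global secondary ens192'

# grep -q exits 0 only if the VIP appears as an inet address on the interface.
if printf '%s\n' "$sample" | grep -q "inet ${VIP}/"; then
    echo "VIP present on this node"
else
    echo "VIP absent on this node"
fi
```

On the node without the VIP, the same check prints "VIP absent on this node", which makes it suitable for the patch-one-node-at-a-time verification described above.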

ToDo

Keepalived could be extended with a check script that verifies whether HAProxy, PgRoute66, and potentially even PostgreSQL behind HAProxy are reachable.

This could further increase availability.

In practice, however, no issue has occurred so far that such a check would have prevented.
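Such a check could be hooked into Keepalived with a vrrp_script block. A minimal sketch, assuming a systemd-managed HAProxy; the script command and names are illustrative:

```
vrrp_script chk_haproxy {
    # Illustrative check: a persistently failing script puts the node into
    # FAULT state, making it ineligible for the VIP.
    script "/usr/bin/systemctl is-active --quiet haproxy"
    interval 2    # run every 2 seconds
    fall 2        # require 2 failures before marking the check down
    rise 2        # require 2 successes before marking it up again
}

vrrp_instance VI_1 {
    # ... existing instance configuration ...
    track_script {
        chk_haproxy
    }
}
```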