VRRP with Keepalived

VRRP (Virtual Router Redundancy Protocol) is commonly used for providing first-hop IPv4 or IPv6 router (“default gateway”) redundancy for network-attached devices.

Some network appliances like wireless LAN controllers use it to provide a virtual IP that can always be used for reaching the active member in the device cluster. VRRP can also be used on application server clusters.

Keepalived is routing software that provides VRRP functionality, among other features.

This post demonstrates a Keepalived IPv4 configuration with two servers and provides some additional information about the VRRP networking details. It does not cover all VRRP or Keepalived features, nor how to implement high availability for arbitrary business-critical applications.

Let’s go.

Basic Keepalived VRRP setup with two servers

In this post I’m using Keepalived 2.2.7, as packaged in Debian 12.

For the demonstration of an application that runs on the servers and for which high availability is needed, I’ll use Python’s http.server module to serve a file:

# On the first server (192.168.7.131):
markku@server1:~$ mkdir /tmp/app
markku@server1:~$ cd /tmp/app
markku@server1:/tmp/app$ echo This is server 1 > server.txt
markku@server1:/tmp/app$ python3 -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

# On the second server (192.168.7.132):
markku@server2:~$ mkdir /tmp/app
markku@server2:~$ cd /tmp/app
markku@server2:/tmp/app$ echo This is server 2 > server.txt
markku@server2:/tmp/app$ python3 -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

Note: This is by no means a secure or recommended way to run a web server in a production environment. It is just a quick way to demonstrate TCP/IP connectivity from one host to another.

Testing connection from test client to both servers:

markku@testclient:~$ curl http://192.168.7.131:8000/server.txt
This is server 1
markku@testclient:~$ curl http://192.168.7.132:8000/server.txt
This is server 2
markku@testclient:~$

Installing Keepalived on the servers is simple: sudo apt install keepalived

There is no default configuration applied, so we have to create some. This is a simple configuration for a start, in /etc/keepalived/keepalived.conf (on both servers):

vrrp_instance WEB_VIP {
        interface ens18
        virtual_router_id 1
        virtual_ipaddress {
                192.168.7.130/24
        }
}

“WEB_VIP” is a string we selected for the VRRP instance name. It will be shown in the logs.

ens18 is the network interface of this server that we run Keepalived on.

The virtual router ID is a number from 1 to 255. If more than one VRRP group runs on the same network (broadly speaking), each group must have a unique ID so that its VRRP announcements do not interfere with the other groups.

Virtual IP address is the address that will be used by the clients to connect to the active VRRP member.

I’ll first start keepalived on server 1 only: sudo systemctl start keepalived

Then let’s test the virtual IP address (192.168.7.130) from the test client:

markku@testclient:~$ curl http://192.168.7.130:8000/server.txt
This is server 1
markku@testclient:~$

It’s working: server 1 responds to connections for the virtual IP address.

Starting the Keepalived service on server 2 as well (“sudo systemctl start keepalived”) does not change anything yet: server 1 still responds for the virtual IP address. This is because both servers have the default VRRP priority (100), so whichever server’s Keepalived process comes up first gets and keeps the virtual IP address.

Keepalived sends VRRP advertisements periodically (every second by default), and the other VRRP hosts on the same network listen for them. That’s how the hosts figure out who should be the master and own the virtual IP address. Only the host in the master state sends advertisements; the other members of the VRRP group start sending them if they no longer hear anything from the master.
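To make the advertisement contents a bit more concrete, here is a minimal Python sketch that parses the fixed part of a VRRP version 2 advertisement as laid out in RFC 3768. The names (VrrpV2Advert, parse_vrrpv2) are my own, the checksum is only read and not verified, and this is an illustration of the packet format rather than anything Keepalived itself runs.

```python
import ipaddress
import struct
from dataclasses import dataclass


@dataclass
class VrrpV2Advert:
    version: int
    msg_type: int          # 1 = ADVERTISEMENT
    vrid: int
    priority: int
    auth_type: int
    advert_interval: int   # seconds
    checksum: int          # read as-is, not verified in this sketch
    addresses: list


def parse_vrrpv2(payload: bytes) -> VrrpV2Advert:
    """Parse the VRRPv2 payload that follows the IPv4 header (IP protocol 112)."""
    if len(payload) < 8:
        raise ValueError("too short for a VRRPv2 header")
    ver_type, vrid, priority, count, auth_type, interval, checksum = struct.unpack(
        "!BBBBBBH", payload[:8]
    )
    end = 8 + 4 * count
    if len(payload) < end:
        raise ValueError("truncated virtual IP address list")
    # Each virtual IPv4 address takes four bytes after the fixed header.
    addresses = [
        str(ipaddress.IPv4Address(payload[i:i + 4])) for i in range(8, end, 4)
    ]
    return VrrpV2Advert(
        version=ver_type >> 4,
        msg_type=ver_type & 0x0F,
        vrid=vrid,
        priority=priority,
        auth_type=auth_type,
        advert_interval=interval,
        checksum=checksum,
        addresses=addresses,
    )
```

Feeding it a hand-built payload with VRID 1, priority 101 and the virtual address 192.168.7.130 returns those same values in the corresponding fields.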

Let’s test by stopping Keepalived on server 1: sudo systemctl stop keepalived

And then test from the test client again:

markku@testclient:~$ curl http://192.168.7.130:8000/server.txt
This is server 2
markku@testclient:~$

Now it’s server 2 that is responding for the virtual IP address.

Starting Keepalived again on server 1 does not change the ownership of the virtual IP address right away since server 2 is still running.

To make server 1 always have the virtual IP address whenever running, we can configure higher priority for it:

vrrp_instance WEB_VIP {
        interface ens18
        virtual_router_id 1
        priority 101
        virtual_ipaddress {
                192.168.7.130/24
        }
}

Restarting Keepalived on server 1 (or reloading it with “sudo systemctl reload keepalived”) returns it to the master state in the VRRP group (because server 2 has the default priority of 100), and the test client connection goes to server 1 again:

markku@testclient:~$ curl http://192.168.7.130:8000/server.txt
This is server 1
markku@testclient:~$

This was a very basic configuration for Keepalived VRRP.

It is important to understand that Keepalived is just routing software: it affects which server the IPv4 or IPv6 packets are routed to. The VRRP function has no application session awareness. VRRP works best when the application mostly deals with small, short-lived connections, because a VRRP switchover causes existing TCP sessions to fail: the new VRRP master does not know about the sessions on the other server.

Actually, by default, Keepalived doesn’t even know if there is an application running on the server or not.

Let’s stop the Python web server on server 1 with Ctrl-C, and test again from the test client:

markku@testclient:~$ curl http://192.168.7.130:8000/server.txt
curl: (7) Failed to connect to 192.168.7.130 port 8000 after 0 ms: Couldn't connect to server
markku@testclient:~$

VRRP is still master on server 1 but there is no application running on port 8000, hence the error message on the client.

Starting the web server again on server 1 (python3 -m http.server) lets connections succeed again:

markku@testclient:~$ curl http://192.168.7.130:8000/server.txt
This is server 1
markku@testclient:~$

VRRP tracking can be used for providing application state information to Keepalived.

VRRP tracking

Tracking in VRRP means checking some state on the local server, and based on that state Keepalived can increase or decrease the VRRP priority of the instance.

In our simple example, we only want the server to be master in the VRRP group if the web server process is running. Here is the new full configuration for server 1:

vrrp_track_process HTTP_SERVER {
        process python3 -m http.server
        weight -10
}

vrrp_instance WEB_VIP {
        interface ens18
        virtual_router_id 1
        priority 101
        virtual_ipaddress {
                192.168.7.130/24
        }
        track_process {
                HTTP_SERVER
        }
}

This configures Keepalived to track the process “python3” with parameters “-m” and “http.server”. If the process is not found, it will decrease the local VRRP priority by 10.

With this new configuration reloaded on server 1, let’s see what happens in the Keepalived log (with “sudo journalctl -fu keepalived”) when I now stop the Python web server process with Ctrl-C:

Apr 18 16:39:55 server1 Keepalived_vrrp[1722]: Quorum lost for tracked process HTTP_SERVER
Apr 18 16:39:55 server1 Keepalived_vrrp[1722]: (WEB_VIP) Changing effective priority from 101 to 91
Apr 18 16:39:58 server1 Keepalived_vrrp[1722]: (WEB_VIP) Master received advert from 192.168.7.132 with higher priority 100, ours 91
Apr 18 16:39:58 server1 Keepalived_vrrp[1722]: (WEB_VIP) Entering BACKUP STATE

Within one second Keepalived noticed that the tracked process disappeared, and it lowered the total announced priority from 101 to 91, which enabled server 2 to become master in the VRRP group and start the VRRP announcements.

markku@testclient:~$ curl http://192.168.7.130:8000/server.txt
This is server 2
markku@testclient:~$

Starting the web server process again brought VRRP master back to server 1:

Apr 18 16:41:52 server1 Keepalived_vrrp[1722]: Quorum gained for tracked process HTTP_SERVER
Apr 18 16:41:52 server1 Keepalived_vrrp[1722]: (WEB_VIP) Changing effective priority from 91 to 101
Apr 18 16:41:53 server1 Keepalived_vrrp[1722]: (WEB_VIP) received lower priority (100) advert from 192.168.7.132 - discarding
Apr 18 16:41:54 server1 Keepalived_vrrp[1722]: (WEB_VIP) received lower priority (100) advert from 192.168.7.132 - discarding
Apr 18 16:41:55 server1 Keepalived_vrrp[1722]: (WEB_VIP) received lower priority (100) advert from 192.168.7.132 - discarding
Apr 18 16:41:55 server1 Keepalived_vrrp[1722]: (WEB_VIP) Entering MASTER STATE
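The roughly three-second gap between the priority change and the state change in the logs above is consistent with the Master_Down_Interval timer defined in RFC 3768: a backup that discards lower-priority advertisements takes over only when that timer expires. Here is a small sketch of the RFC formula; the function names are mine, and Keepalived’s actual timer handling may differ in detail.

```python
def skew_time(priority: int) -> float:
    """Skew_Time from RFC 3768, in seconds: higher priority means shorter skew."""
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_interval: float = 1.0) -> float:
    """Master_Down_Interval from RFC 3768: 3 * Advertisement_Interval + Skew_Time."""
    return 3 * advert_interval + skew_time(priority)


# Priorities used in this post, with the default 1 second advertisement interval.
print(master_down_interval(101))  # server 1
print(master_down_interval(100))  # server 2
```

With the default one-second interval both servers end up with a timer of about 3.6 seconds, and the server with the higher priority gets the slightly shorter one.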

The same track configuration is usually applied to all servers in the VRRP group.

If none of the servers in the VRRP group have the tracked application process running, all the VRRP instances will use the decreased priority, but one of the servers will become VRRP master in any case.
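The priority arithmetic can be sketched in a few lines of Python. This is a simplified model under my own assumptions: it only handles a negative weight, it assumes server 2 uses the same tracking configuration with its default base priority of 100, and it ignores any limits Keepalived applies to the resulting value. The function names are made up for this illustration.

```python
def effective_priority(base: int, weight: int, tracked_ok: bool) -> int:
    """With a negative weight, the priority drops by abs(weight) while the check fails."""
    if weight >= 0:
        raise ValueError("this sketch only models negative weights")
    return base if tracked_ok else base + weight


def expected_master(priorities: dict[str, int], current: str) -> str:
    """A strictly higher priority takes over; on a tie the current master stays."""
    best = max(priorities.values())
    if priorities[current] == best:
        return current
    return max(priorities, key=priorities.get)


# Web server stopped on server 1 only: 91 versus 100.
one_down = {
    "server1": effective_priority(101, -10, tracked_ok=False),
    "server2": effective_priority(100, -10, tracked_ok=True),
}
print(expected_master(one_down, current="server1"))

# Web server stopped on both servers: 91 versus 90.
both_down = {
    "server1": effective_priority(101, -10, tracked_ok=False),
    "server2": effective_priority(100, -10, tracked_ok=False),
}
print(expected_master(both_down, current="server2"))
```

In the first case server 2 takes over, and in the second case server 1 wins again even though nothing is listening on port 8000 on either server.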

Another common way to use tracking in Keepalived is to use vrrp_script and track_script configuration to have Keepalived run a custom script periodically, for example to check the application status. A systemd-based example would be:

vrrp_script MY_SERVICE {
        script "systemctl -q is-active myapp.service"
        weight -10
}

vrrp_instance WEB_VIP {
        ...
        track_script {
                MY_SERVICE
        }
}

The script command can also be a shell script that contains more operations and returns 0 when the checks succeed and non-zero when there is a problem, guiding Keepalived to lower the priority.
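As an illustration of such a check, here is a Python sketch that probes the demo web server over HTTP instead of only checking that a process exists. The function names, the timeout and the URL are my own choices for this example, not anything defined by Keepalived.

```python
import http.client
import urllib.request

# A local health check should not go through any configured HTTP proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, malformed reply.
        return False


def exit_code(url: str) -> int:
    """Map the check result to the exit code convention: 0 = OK, 1 = failed."""
    return 0 if http_ok(url) else 1
```

A wrapper script would end with something like sys.exit(exit_code("http://127.0.0.1:8000/server.txt")), and the path of that script would then be given as the script command in the vrrp_script block.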

Network observations about Keepalived VRRP

Now let’s proceed to the “let’s see what actually happens” part of the post.

Server 1 is still master in the VRRP group. This is what the ARP table shows in the test client after testing all the web server connections:

markku@testclient:~$ ip neighbor
192.168.7.130 dev ens18 lladdr bc:24:11:7b:02:af REACHABLE
192.168.7.131 dev ens18 lladdr bc:24:11:7b:02:af REACHABLE
192.168.7.132 dev ens18 lladdr bc:24:11:45:af:db REACHABLE
markku@testclient:~$

Note how the VIP address 192.168.7.130 is shown with the same MAC address as server 1 has: bc:24:11:7b:02:af.

That’s not how I’ve seen it previously with other VRRP implementations. RFC 3768 (VRRP version 2, which is the default in Keepalived) even says specifically:

The virtual router MAC address associated with a virtual router is an IEEE 802 MAC Address in the following format: 00-00-5E-00-01-{VRID}

Clearly Keepalived is doing it differently by default.
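For reference, the RFC 3768 address format quoted above is easy to compute from the virtual router ID. A tiny sketch (the function name is mine):

```python
def vrrp_ipv4_virtual_mac(vrid: int) -> str:
    """Build the RFC 3768 virtual router MAC address 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return f"00:00:5e:00:01:{vrid:02x}"


print(vrrp_ipv4_virtual_mac(1))  # 00:00:5e:00:01:01
```

With virtual_router_id 1, as configured in this post, the expected virtual MAC address is therefore 00:00:5e:00:01:01.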

Let’s see how the ARP table changes when I stop Keepalived on server 1 and then see the ARP table again:

markku@testclient:~$ ip neighbor
192.168.7.130 dev ens18 lladdr bc:24:11:45:af:db REACHABLE
192.168.7.131 dev ens18 lladdr bc:24:11:7b:02:af REACHABLE
192.168.7.132 dev ens18 lladdr bc:24:11:45:af:db REACHABLE
markku@testclient:~$

The VIP address switches to the MAC address of server 2 right away: bc:24:11:45:af:db.

I’ll fire up my Wireshark with sshdump to capture the traffic on the test client to see what happens. I’ll use “proto 112 or arp” as the capture filter (protocol number 112 is VRRP). This is what happened when Keepalived was started again on server 1 (with the higher priority still configured):

If you want to see it in Wireshark, here is the capture file:

VRRP-with-physical-MAC.pcapng (github.com)

First there were the VRRP multicast announcements from server 2 (destination MAC address is 01:00:5e:00:00:12, destination IP address is 224.0.0.18), then Keepalived was started on server 1, resulting in VRRP announcements starting from server 1, as well as two batches of gratuitous ARPs from server 1, announcing the new MAC address of the VIP address 192.168.7.130.
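The destination MAC address is not specific to VRRP: it follows from the standard IPv4 multicast mapping, where the low 23 bits of the group address are placed after the fixed 01:00:5e prefix. A short sketch of that mapping (the function name is mine):

```python
import ipaddress


def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF  # only the low 23 bits are copied into the MAC
    octets = [0x01, 0x00, 0x5E, (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{octet:02x}" for octet in octets)


print(ipv4_multicast_mac("224.0.0.18"))  # 01:00:5e:00:00:12
```

The last octet 0x12 is simply 18 in hexadecimal, matching the VRRP group address 224.0.0.18.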

Because of the server-specific MAC address of the VRRP VIP it was necessary to update the ARP tables on adjacent devices with the gratuitous ARPs when the VRRP master role was switched from server 2 to server 1.
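To show what makes an ARP packet “gratuitous”, here is a sketch that builds the 28-byte ARP payload for such an announcement: the sender and target IP addresses are both the VIP. The field values follow the generic ARP layout; the all-zero target hardware address and the request opcode are my assumptions, so this is not necessarily byte-for-byte what Keepalived sends (check the capture file for that).

```python
import socket
import struct


def gratuitous_arp_payload(mac: str, ip: str) -> bytes:
    """Build an ARP request payload announcing that `ip` is at `mac`."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    if len(sender_mac) != 6:
        raise ValueError("expected a 6-byte MAC address")
    address = socket.inet_aton(ip)
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request (assumption for this sketch)
        sender_mac,   # sender hardware address: the new owner of the VIP
        address,      # sender protocol address: the VIP
        bytes(6),     # target hardware address: zeros (assumption)
        address,      # target protocol address: the VIP again
    )


payload = gratuitous_arp_payload("bc:24:11:7b:02:af", "192.168.7.130")
print(len(payload), payload.hex())
```

On the wire this payload is carried in an Ethernet broadcast frame, which is why every host and switch on the LAN sees the new MAC address for the VIP.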

As mentioned, this seems a bit atypical to me: I expected to see 00:00:5e:00:01:01 as the MAC address of the VIP, not either of the server MAC addresses.

The Keepalived VRRP configuration manual describes a “use_vmac” option. Let’s add it inside the vrrp_instance configuration block on both servers, restart Keepalived on both, start the capture, and after a few seconds stop Keepalived on server 1:

VRRP-with-virtual-MAC.pcapng (github.com)

Now the source MAC address of the packets is “IETF-VRRP-VRID_01” (00:00:5e:00:01:01), which is the MAC address format given in the RFC.

This is now the ARP table on the test client:

markku@testclient:~$ ip nei
192.168.7.130 dev ens18 lladdr 00:00:5e:00:01:01 REACHABLE
192.168.7.131 dev ens18 lladdr bc:24:11:7b:02:af REACHABLE
192.168.7.132 dev ens18 lladdr bc:24:11:45:af:db REACHABLE
markku@testclient:~$

VIP 192.168.7.130 is now 00:00:5e:00:01:01 as expected.

Even though the MAC address of the VIP didn’t change in the VRRP switchover, gratuitous ARPs were sent anyway. In this case the end host ARP tables didn’t need to change, but the switches in the underlying LAN must update their MAC address tables so that packets destined to the VRRP MAC address are forwarded to the new VRRP master.

In the end, I think both ways (device MAC and virtual MAC, for the VIP) will work just fine in normal circumstances. But if your specific setup doesn’t work as expected, be sure to try it the other way around.

See “man keepalived.conf” (or the documentation page) for more details on configuring VRRP. Some other aspects that may be of interest for your specific needs:

Adjusting advertisement and tracking intervals
Setting a notify script (to assist in monitoring)
