Socket High Availability Failover Fails Due To Meraki Switch GARP Limitation

Issue

When a Socket HA pair performs an HA failover, the newly designated master Socket sends a gratuitous ARP (GARP) broadcast and then starts responding to ARP broadcast requests for the Site's LAN IP address. For more information, see Understanding Socket High Availability and Failover.

However, when a Meraki switch interconnects the two Sockets, the HA failover may fail: the switch continues to forward all frames to the Socket that is now the slave, resulting in an outage.

Environment

This issue pertains specifically to Socket HA pairs interconnected through a Meraki switch. The proposed solution is applicable to Socket version 13 and above.

Troubleshooting

  • In adherence to RFC 2338, the Socket sends a Gratuitous ARP REQUEST packet during HA failover. You can validate this behavior in a packet capture, where the ARP header of the GARP packet shows opcode = 1.
  • Meraki support has confirmed that Meraki switches disregard GARP requests with opcode = 1 and therefore do not update the switch's CAM table. For details, see Meraki MS switching and Gratuitous ARP.
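
When reviewing a capture, it can help to check the opcode programmatically. The following is a minimal Python sketch, not a Cato tool: it assumes you have the raw bytes of an untagged Ethernet frame (no 802.1Q VLAN tag) carrying IPv4-over-Ethernet ARP, and the function name classify_garp is hypothetical. It treats a frame as a gratuitous ARP when the sender and target IP addresses match, then reports whether the opcode is 1 (REQUEST) or 2 (REPLY).

```python
import struct

ARP_ETHERTYPE = 0x0806

def classify_garp(frame: bytes):
    """Return 'request' or 'reply' for a gratuitous ARP frame, else None.

    Assumes an untagged Ethernet II frame (no VLAN tag) with IPv4 ARP.
    """
    if len(frame) < 42:  # 14-byte Ethernet header + 28-byte ARP payload
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ARP_ETHERTYPE:
        return None
    # ARP fixed header: hardware type, protocol type, address lengths, opcode
    htype, ptype, hlen, plen, opcode = struct.unpack("!HHBBH", frame[14:22])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    sender_ip = frame[28:32]
    target_ip = frame[38:42]
    if sender_ip != target_ip:  # not gratuitous: the sender asks about another host
        return None
    return {1: "request", 2: "reply"}.get(opcode)
```

A result of "request" for the packet sent at failover matches the behavior described above, which is the form the Meraki switch disregards.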

Solution

Starting with Socket version 13, you can ask Support to apply a backend configuration that changes the opcode in the gratuitous ARP packet, which mitigates the issue with Meraki switches.

Note: This configuration should only be applied to sites with Meraki switches and not at the account level.

The recommended backend configuration changes the opcode in the GARP packet to 2, denoting a REPLY, so that the switch updates its CAM table.
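
To illustrate how small the difference between the two packet forms is, here is a hedged Python sketch that builds a gratuitous ARP frame for either opcode. It does not reflect the Socket's actual implementation: the function name build_garp, the MAC and IP values in the usage note, and the choice of a broadcast target hardware address for both forms are assumptions made for illustration.

```python
import struct

BROADCAST_MAC = b"\xff" * 6

def build_garp(mac: bytes, ip: bytes, opcode: int) -> bytes:
    """Build an untagged Ethernet frame carrying a gratuitous ARP.

    opcode 1 = REQUEST, opcode 2 = REPLY. Sender and target IP are the
    same address, which is what makes the ARP gratuitous.
    """
    if len(mac) != 6 or len(ip) != 4 or opcode not in (1, 2):
        raise ValueError("expected 6-byte MAC, 4-byte IPv4 address, opcode 1 or 2")
    ethernet = BROADCAST_MAC + mac + struct.pack("!H", 0x0806)
    # Hardware type 1 (Ethernet), protocol 0x0800 (IPv4), lengths 6 and 4
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, opcode)
    # Target hardware address set to broadcast here (an assumption)
    arp_body = mac + ip + BROADCAST_MAC + ip
    return ethernet + arp_header + arp_body
```

Building both forms with the same hypothetical addresses, for example build_garp(bytes.fromhex("02005e000001"), bytes([10, 0, 0, 1]), 1) and the same call with opcode 2, yields frames that differ only in the two-byte opcode field at offsets 20 and 21 of the frame.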
