Troubleshooting Azure HA Deployment

Overview

This article provides suggestions for issues that you might experience when deploying vSockets for Azure high availability (HA).

For the Cato HA script, you might experience the following two behaviors:

  • The create_ha_settings script completes successfully, but the HA API test from the Socket UI fails
  • Running the create_ha_settings script fails

For more information about Azure vSocket HA, see Configuring High Availability (HA) for Azure vSockets.

Checking the Activity log

Azure uses an activity log to store the events that occur inside Azure. If the deployment is successful, but one of the roles is not assigned or the Floating IP is missing from the LAN NIC, check the activity log for the related events.

Browse to the VM or the NIC and select Activity log.

Screen_Shot_2021-05-09_at_14.53.30.png
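The same events can also be listed from the Azure CLI. The following is a minimal sketch, assuming that the Azure CLI is signed in to the relevant subscription. The function name list_failed_events and its arguments are hypothetical, and the field names in the --query expression might differ between CLI versions.

```shell
# Hypothetical helper: list failed activity-log events for a resource group.
# $1 = resource group name
# $2 = look-back window, for example 1d or 6h (defaults to 1d)
list_failed_events() {
  az monitor activity-log list \
    --resource-group "$1" \
    --offset "${2:-1d}" \
    --status Failed \
    --query "[].{time:eventTimestamp, operation:operationName.value, status:status.value}" \
    --output table
}
```

For example, `list_failed_events MyResourceGroup 2d` would show the failed operations from the last two days, which can help locate a failed role assignment or IP configuration change.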

Pinging the Floating IP

Pinging the Floating IP is supported in Socket version 11.0 and higher. If the Azure vSockets run a version lower than 11.0, you can’t ping the Floating IP to test the HA status for the vSockets.

Troubleshooting HA API Test Failures 

The troubleshooting steps in this section explain how to:

  • Verify that the floating IP is assigned to the main NIC
  • Verify that the VM identity is "UserAssigned"
  • Verify that all the necessary roles are assigned properly

Verifying the Floating IP Assignment in the NIC

To call the HA API, Azure assigns the Floating IP to the NIC of the primary vSocket.

Go to the primary vSocket VM LAN NIC > IP Configuration and verify that the Floating IP exists as "Secondary".

Screen_Shot_2021-03-05_at_20.52.34.png
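As an alternative to the portal, the IP configurations of the LAN NIC can be searched from the Azure CLI. This is a sketch only: has_floating_ip is a hypothetical helper name, and the resource group, NIC name, and IP address are values that you supply. It searches the raw JSON for the quoted address and does not depend on specific property names.

```shell
# Hypothetical helper: succeed if the Floating IP appears in any
# IP configuration of the given NIC.
# $1 = resource group, $2 = LAN NIC name, $3 = Floating IP address
has_floating_ip() {
  az network nic ip-config list \
    --resource-group "$1" \
    --nic-name "$2" \
    --output json | grep -qF "\"$3\""
}
```

A possible use is `has_floating_ip MyResourceGroup MyLanNic 10.0.1.10 && echo "Floating IP found"`. The check only confirms that the address is present on the NIC, so it is still worth confirming in the portal that it is listed as a secondary configuration.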

Verifying the VM identity

To check the VM identity assignment, open the Azure CLI inside the Azure Portal.

Screen_Shot_2021-03-05_at_21.59.42.png

Then enter the following command:

az vm identity show --resource-group [RG name] --name [vSocket name]

If the identity is assigned properly, expect to see the following output:

Screen_Shot_2021-03-03_at_21.52.36.png

Verifying the Azure Role Assignment 

The Azure HA script (create_ha_settings) validates that the Azure subscription has two valid vSockets and then assigns the Roles that create the HA and failover mechanism.

The script creates the Role that is stored in Azure Managed Identities.

Screen_Shot_2021-03-01_at_18.29.48.png

The script assigns this Role to the different virtual resources (NICs, subnets, and VMs) that are attached to the vSocket. The components in the Azure infrastructure that use the Role are:

  • LAN Network Interface (NIC) for each vSocket
  • A virtual network that uses the LAN subnet
  • Both vSocket VMs

Access Control (IAM) 

The role assignments for the NIC and the subnet are made with the Access control (IAM) option in Azure. The Role should be assigned to both LAN NICs and their subnets.

The following screenshot shows the Role assigned to the LAN NIC:

Screen_Shot_2021-03-01_at_18.35.16.png
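The role assignments on a LAN NIC can also be listed from the Azure CLI. The sketch below assumes a signed-in CLI session. The helper name list_nic_role_assignments is hypothetical, and the resource group and NIC name are placeholders for your own values.

```shell
# Hypothetical helper: list the role assignments scoped to a NIC.
# $1 = resource group, $2 = LAN NIC name
list_nic_role_assignments() {
  # Resolve the full resource ID of the NIC, then use it as the scope.
  nic_id="$(az network nic show --resource-group "$1" --name "$2" \
    --query id --output tsv)" || return 1
  az role assignment list --scope "$nic_id" --output table
}
```

Run it once for each LAN NIC, for example `list_nic_role_assignments MyResourceGroup MyLanNic`, and look for the Role that the HA script created. The same approach should work for the subnet if its resource ID is passed as the scope.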

vSocket VM (Identity)

For each vSocket VM, the role is assigned in Identity > User assigned as you can see in the screenshot below:

Screen_Shot_2021-03-02_at_9.00.31.png

Troubleshooting HA Script Failures

The most common failure reasons for the HA deployment are:

  • DNS issue in Azure 
  • Permissions related to the Azure account

DNS Issue

Make sure that the default DNS in Azure is configured for both the LAN subnet and the associated NICs. Otherwise, the HA script fails to create and assign the Role, and the HA deployment fails.

To check that the DNS configuration is set to the default values, go to your Virtual network > DNS servers, and make sure that the Default option is selected.

Screen_Shot_2021-03-02_at_9.56.19.png
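The DNS settings can also be read from the Azure CLI. In this sketch, the helper names vnet_custom_dns and nic_custom_dns are hypothetical, and the property paths dhcpOptions.dnsServers and dnsSettings.dnsServers are assumed to match the output of your CLI version. Empty output is expected to mean that no custom DNS servers are set, so the default is in use.

```shell
# Hypothetical helper: print custom DNS servers set on a virtual network.
# $1 = resource group, $2 = virtual network name
vnet_custom_dns() {
  az network vnet show --resource-group "$1" --name "$2" \
    --query "dhcpOptions.dnsServers" --output tsv
}

# Hypothetical helper: print custom DNS servers set on a NIC.
# $1 = resource group, $2 = NIC name
nic_custom_dns() {
  az network nic show --resource-group "$1" --name "$2" \
    --query "dnsSettings.dnsServers" --output tsv
}
```

If either command prints one or more addresses, custom DNS servers are configured at that level, and the setting should be reviewed in the portal before the HA script is run again.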

Permissions in Azure 

To run the HA script successfully, make sure that the Azure user has Owner permissions. Go to Resource Group > Access Control (IAM), and verify that the user account is assigned the Owner role.
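The Owner assignment can be checked from the Azure CLI as well. This is a minimal sketch that assumes a signed-in CLI session with permission to read role assignments. The helper name list_owner_assignments is hypothetical, and the assignee can be a sign-in name or an object ID.

```shell
# Hypothetical helper: list Owner role assignments for a user on a
# resource group, including assignments inherited from parent scopes.
# $1 = resource group, $2 = user sign-in name or object ID
list_owner_assignments() {
  az role assignment list \
    --resource-group "$1" \
    --assignee "$2" \
    --role Owner \
    --include-inherited \
    --output table
}
```

For example, `list_owner_assignments MyResourceGroup user@example.com` should return at least one row if the user has Owner permissions on the resource group, either directly or through the subscription.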
