When you deploy vSocket high availability (HA) in Azure, you might experience one of the following two behaviors:
- The create_ha_settings script completes successfully, but the HA API test from the Socket UI fails
- Running the create_ha_settings script fails
For more information about Azure vSocket HA, see Configuring High Availability (HA) for Azure vSockets.
Checking the Activity log
Azure uses an activity log to store all the events that happen inside Azure. Check the activity log if the deployment is successful but one of the roles is not assigned, or the floating IP is missing from the LAN NIC.
Browse to the VM or the NIC and select Activity log.
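If you prefer to review the events outside the portal, the following is a minimal sketch that picks the failed events out of an activity log export. It assumes the export is the JSON produced by a command such as az monitor activity-log list --resource-group [RG name], and that each event carries status.value, operationName.value, eventTimestamp, and resourceId fields. The function name and the sample data are hypothetical.

```python
import json


def failed_events(activity_log_json: str) -> list[dict]:
    """Return a short summary of every failed event in an activity log export."""
    failed = []
    for event in json.loads(activity_log_json):
        # Assumed shape: {"status": {"value": "Failed"}, ...}
        status = (event.get("status") or {}).get("value", "")
        if status.lower() == "failed":
            failed.append({
                "time": event.get("eventTimestamp"),
                "operation": (event.get("operationName") or {}).get("value"),
                "resource": event.get("resourceId"),
            })
    return failed


# Hypothetical sample data, for illustration only.
sample_log = json.dumps([
    {"status": {"value": "Succeeded"},
     "operationName": {"value": "Microsoft.Compute/virtualMachines/write"},
     "eventTimestamp": "2024-01-01T10:00:00Z",
     "resourceId": "/subscriptions/sub-1/resourceGroups/rg-1/providers/Microsoft.Compute/virtualMachines/vsocket-1"},
    {"status": {"value": "Failed"},
     "operationName": {"value": "Microsoft.Authorization/roleAssignments/write"},
     "eventTimestamp": "2024-01-01T10:05:00Z",
     "resourceId": "/subscriptions/sub-1/resourceGroups/rg-1/providers/Microsoft.Network/networkInterfaces/lan-nic-1"},
])

for item in failed_events(sample_log):
    print(item["time"], item["operation"], item["resource"])
```

A failed role assignment or NIC update in this list is a good starting point for the checks later in this article.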
Pinging the Floating IP
Pinging the Floating IP is supported from Socket version 11.0 and higher. If the Azure vSockets are lower than version 11.0, then you can’t ping the Floating IP to test the HA status for the vSockets.
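As an illustration of the version rule above, this short sketch compares a Socket version string against 11.0. It assumes the version is written as dot-separated numbers (for example, 11.0 or 10.9.123), which may not match how your Socket reports it. The function name is hypothetical.

```python
def supports_floating_ip_ping(socket_version: str) -> bool:
    """Return True when the Socket version is 11.0 or higher."""
    parts = socket_version.strip().split(".")
    major = int(parts[0])
    # Treat a missing minor component as 0, so "11" is read as 11.0.
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= (11, 0)


for version in ("10.9", "11.0", "12.1.5"):
    print(version, supports_floating_ip_ping(version))
```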
Troubleshooting HA API Test Failures
The troubleshooting steps in this section explain how to:
- Verify that the floating IP is assigned to the LAN NIC of the primary vSocket
- Verify that the VM identity is "UserAssigned"
- Verify that all the necessary roles are assigned properly
Verifying the Floating IP assignment in NIC
In order to call the HA API, Azure assigns the Floating IP to the NIC of the main vSocket.
Go to the primary vSocket VM LAN NIC > IP Configuration and verify that the Floating IP exists as "Secondary".
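The same check can be sketched in code. The example below assumes that you exported the IP configurations of the LAN NIC as JSON (for example, with az network nic ip-config list --resource-group [RG name] --nic-name [NIC name]), and that each entry has a primary flag and a private IP address field. The address key is spelled privateIPAddress or privateIpAddress depending on the CLI version, so the sketch accepts both. The function name and the sample addresses are hypothetical.

```python
import json


def floating_ip_is_secondary(ip_configs_json: str, floating_ip: str) -> bool:
    """Return True when the floating IP exists on the NIC as a non-primary IP configuration."""
    for config in json.loads(ip_configs_json):
        # The key spelling is an assumption; accept either variant.
        address = config.get("privateIPAddress") or config.get("privateIpAddress")
        if address == floating_ip:
            return not config.get("primary", False)
    # The floating IP is not configured on this NIC at all.
    return False


# Hypothetical sample data, for illustration only.
sample_configs = json.dumps([
    {"name": "ipconfig1", "primary": True, "privateIPAddress": "10.0.2.4"},
    {"name": "floating", "primary": False, "privateIPAddress": "10.0.2.10"},
])

print(floating_ip_is_secondary(sample_configs, "10.0.2.10"))
```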
Verifying the VM identity
To check the VM identity assignment, open the Azure CLI inside the Azure Portal and enter the following command:
az vm identity show --resource-group [RG name] --name [vSocket name]
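To interpret the command output, look for the identity type. The following is a minimal sketch that reads the JSON returned by the command above and confirms that the type includes UserAssigned and that at least one user-assigned identity is listed. It assumes the output has type and userAssignedIdentities fields, and that the command prints nothing when the VM has no identity. The function name and the sample data are hypothetical.

```python
import json


def has_user_assigned_identity(identity_json: str) -> bool:
    """Return True when the VM identity output includes a user-assigned identity."""
    text = identity_json.strip()
    if not text:
        # Assumed behavior: empty output means no identity is attached to the VM.
        return False
    identity = json.loads(text) or {}
    # The type can combine values, for example "SystemAssigned, UserAssigned".
    types = [t.strip() for t in (identity.get("type") or "").split(",")]
    return "UserAssigned" in types and bool(identity.get("userAssignedIdentities"))


# Hypothetical sample data, for illustration only.
sample_identity = json.dumps({
    "type": "UserAssigned",
    "userAssignedIdentities": {
        "/subscriptions/sub-1/resourceGroups/rg-1/providers/Microsoft.ManagedIdentity/userAssignedIdentities/ha-identity": {
            "clientId": "00000000-0000-0000-0000-000000000001",
            "principalId": "00000000-0000-0000-0000-000000000002",
        }
    },
})

print(has_user_assigned_identity(sample_identity))
```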
Verifying the Azure Role Assignment
The Azure HA script (create_ha_settings) validates that the Azure subscription has two valid vSockets and then assigns the Roles that create the HA and failover mechanism.
The script creates the Role, which is stored in Azure Managed Identities.
The script assigns this Role to the different virtual resources (NICs, subnets, and VMs) that are attached to the vSocket. The components in the Azure infrastructure that use the Role are:
- LAN Network Interface (NIC) for each vSocket
- A virtual network that uses the LAN subnet
- Both vSocket VMs
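One way to confirm the assignments for the components listed above is to compare them with an export of the role assignments. The sketch below assumes the export is the JSON from a command such as az role assignment list --all, where each entry has scope and principalId fields, and that you know the principal ID of the managed identity. It only matches assignments made directly on each resource and does not account for inherited assignments. The function name, IDs, and scopes are hypothetical.

```python
import json


def scopes_missing_role(assignments_json: str, principal_id: str,
                        required_scopes: list[str]) -> list[str]:
    """Return the required scopes that have no direct role assignment for the principal."""
    covered = {
        assignment.get("scope", "").lower()
        for assignment in json.loads(assignments_json)
        if assignment.get("principalId") == principal_id
    }
    # Azure resource IDs are case-insensitive, so compare in lowercase.
    return [scope for scope in required_scopes if scope.lower() not in covered]


# Hypothetical sample data, for illustration only.
identity_principal = "00000000-0000-0000-0000-000000000002"
base = "/subscriptions/sub-1/resourceGroups/rg-1/providers/Microsoft.Network"
required = [
    base + "/networkInterfaces/lan-nic-1",
    base + "/networkInterfaces/lan-nic-2",
    base + "/virtualNetworks/vnet-1/subnets/lan",
]
sample_assignments = json.dumps([
    {"principalId": identity_principal, "scope": required[0]},
    {"principalId": identity_principal, "scope": required[2]},
])

print(scopes_missing_role(sample_assignments, identity_principal, required))
```

An empty result means that every listed resource has a direct assignment for the identity. Any scope that is returned is a candidate for the manual check in Access control (IAM).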
Access Control (IAM)
The role assignments for the NIC and the subnet are made using the Access control (IAM) option in Azure. The Role should be assigned to both LAN NICs and their subnets.
The following screenshot shows the Role assigned to the LAN NIC:
vSocket VM (Identity)
For each vSocket VM, the role is assigned in Identity > User assigned as you can see in the screenshot below:
Troubleshooting HA Script Failures
The most common failure reasons for the HA deployment are:
- DNS issue in Azure
- Permissions related to the Azure account
Make sure that the default DNS in Azure is configured for both the LAN subnet and the associated NICs. Otherwise, the HA script fails to assign the Role correctly and the HA deployment fails.
To check that the DNS configuration is set to the default values, go to your Virtual network > DNS servers, and make sure that you're using the Default option.
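The DNS setting can also be checked from exported resource definitions. The sketch below assumes the JSON comes from az network vnet show or az network nic show, where custom DNS servers appear under dhcpOptions.dnsServers for a virtual network and under dnsSettings.dnsServers for a NIC. An empty or missing list is treated as the default setting. The function name and the sample data are hypothetical.

```python
import json


def uses_default_dns(resource_json: str) -> bool:
    """Return True when a virtual network or NIC export lists no custom DNS servers."""
    resource = json.loads(resource_json)
    # Virtual networks use dhcpOptions, NICs use dnsSettings (assumed field names).
    settings = resource.get("dhcpOptions") or resource.get("dnsSettings") or {}
    return not settings.get("dnsServers")


# Hypothetical sample data, for illustration only.
sample_vnet_custom = json.dumps({"name": "vnet-1", "dhcpOptions": {"dnsServers": ["10.0.0.53"]}})
sample_vnet_default = json.dumps({"name": "vnet-1", "dhcpOptions": {"dnsServers": []}})
sample_nic_default = json.dumps({"name": "lan-nic-1", "dnsSettings": {"dnsServers": []}})

print(uses_default_dns(sample_vnet_custom))
print(uses_default_dns(sample_vnet_default))
print(uses_default_dns(sample_nic_default))
```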
Permissions in Azure
To run the HA script successfully, make sure that the Azure user has Owner permissions. Go to Resource Group > Access Control (IAM), and verify that the user account is set to Owner.
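The Owner check can be sketched against a role assignment export as well. This example assumes the JSON comes from a command such as az role assignment list --all, with scope, principalId, and roleDefinitionName fields. It accepts an Owner assignment on the resource group itself or on a parent scope such as the subscription. It does not resolve permissions granted through group membership. The function name and the IDs are hypothetical.

```python
import json


def is_owner_of_group(assignments_json: str, principal_id: str,
                      resource_group_id: str) -> bool:
    """Return True when the principal holds Owner on the resource group or a parent scope."""
    target = resource_group_id.lower().rstrip("/")
    for assignment in json.loads(assignments_json):
        if assignment.get("principalId") != principal_id:
            continue
        if assignment.get("roleDefinitionName") != "Owner":
            continue
        scope = assignment.get("scope", "").lower().rstrip("/")
        # An assignment on a parent scope is inherited by the resource group.
        if scope == target or target.startswith(scope + "/"):
            return True
    return False


# Hypothetical sample data, for illustration only.
user_principal = "00000000-0000-0000-0000-00000000000a"
group_id = "/subscriptions/sub-1/resourceGroups/rg-1"
sample_user_assignments = json.dumps([
    {"principalId": user_principal, "roleDefinitionName": "Contributor", "scope": group_id},
    {"principalId": user_principal, "roleDefinitionName": "Owner", "scope": "/subscriptions/sub-1"},
])

print(is_owner_of_group(sample_user_assignments, user_principal, group_id))
```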
I can never get HA to work properly even after reading through this guide and your other azure guides. you guys need to improve this feature or documentation for your Azure deployment. It is really hard to deploy this on an enterprise level if HA doesn't work. I've followed through the steps and still can't get the secondary virtual nic of the floating IP autocreated on the master socket.
There is a broken link in the "Overview" section. Clicking on the link "Configuring High Availability (HA) for Azure vSockets", attempts to open the non-existing URL, "https://support.catonetworks.com/hc/en-us/articles/360016013938".
The correct URL is: "https://support.catonetworks.com/hc/en-us/articles/4413273480977-Configuring-High-Availability-HA-for-Azure-vSockets"
This link would work if it were set to "https://support.catonetworks.com/hc/en-us/articles/4413273480977"
First of all, my apologies that your comment didn't get responded to before now! Secondly, I think that the problem that you have encountered here is best handled via a support ticket because the issue you are facing could be environment related.
Please let me know if you need any further assistance with getting this raised as a support ticket.