Overview
Azure has a DDoS protection mechanism that limits traffic to a specific public IP (for example, a Cato PoP). This can degrade the performance of a vSocket or an IPsec connection deployed in Azure and cause severe packet loss.
Recently, Cato noticed that a few customers experienced significant packet loss due to Azure’s default infrastructure DDoS Protection. The problem arises when DTLS (UDP/443) tunnel traffic exceeds the threshold of 200k packets per second (PPS) per destination IP. When that happens, Azure throttles the traffic to a limit of 1k PPS. The limit is applied globally, meaning that it aggregates traffic from all Azure sources to a single destination IP.
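To make the aggregation concrete, here is a minimal Python sketch of how traffic from several Azure sources adds up against the per-destination threshold. The 200k and 1k PPS figures come from this article. The flow tuples, source names, function names, and IP addresses (documentation-range addresses, not real Cato PoPs) are hypothetical. Azure's actual detection logic is not public, so treat this as an illustration of the arithmetic only.

```python
from collections import defaultdict

# Values taken from this article; Azure's real detection logic is not public.
TRIGGER_PPS = 200_000   # aggregate PPS per destination IP that triggers mitigation
THROTTLED_PPS = 1_000   # rate limit applied once mitigation is active


def aggregate_pps_by_destination(flows):
    """Sum PPS per destination IP across all sources.

    `flows` is an iterable of (source, destination_ip, pps) tuples,
    a hypothetical input format chosen for this sketch.
    """
    totals = defaultdict(int)
    for _source, destination_ip, pps in flows:
        totals[destination_ip] += pps
    return dict(totals)


def destinations_over_trigger(flows, trigger_pps=TRIGGER_PPS):
    """Return the destinations whose aggregate PPS exceeds the trigger."""
    totals = aggregate_pps_by_destination(flows)
    return {ip: pps for ip, pps in totals.items() if pps > trigger_pps}


# Hypothetical example: three Azure sources sending to two destination IPs.
flows = [
    ("vsocket-a", "203.0.113.10", 90_000),
    ("vsocket-b", "203.0.113.10", 80_000),
    ("vsocket-c", "203.0.113.10", 50_000),
    ("vsocket-a", "203.0.113.20", 40_000),
]
over = destinations_over_trigger(flows)
```

In this example no single source sends more than 200k PPS, yet the three sources together send 220k PPS to 203.0.113.10, so that destination crosses the threshold.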
Frequently Asked Questions (FAQs)
What caused the packet loss issue?
The packet loss was caused by Azure’s default DDoS protection mechanisms, which drop packets when traffic to a single destination IP exceeds 200,000 PPS. This behavior is intended to prevent potential outbound attacks.
How would a customer detect if there’s a problem with Azure?
For Azure vSocket sites, extremely high packet loss might indicate that Azure has activated its DDoS protection. To check for high packet loss, go to Network > Site Monitoring > Network Analytics and review the packet loss metrics.
If you see high packet loss on the last mile between the Azure site and the Cato Cloud, especially in the upstream direction, this might indicate that Azure DDoS protection was triggered. Open a support ticket with Azure to investigate the issue further.
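To get a sense of how severe the loss can be once throttling is active, the following Python sketch estimates the loss fraction from the 1k PPS limit stated in this article. It is a simplified model: it assumes the destination receives exactly the throttled rate and that drops are spread evenly across all traffic. The function name and parameters are hypothetical.

```python
def expected_loss_fraction(offered_pps, delivered_limit_pps=1_000):
    """Fraction of packets dropped if only `delivered_limit_pps` get through.

    Simplified model: assumes the throttle delivers exactly the limit and
    drops everything above it. The 1,000 PPS default is from this article.
    """
    if offered_pps <= 0:
        raise ValueError("offered_pps must be positive")
    if offered_pps <= delivered_limit_pps:
        return 0.0
    return 1.0 - delivered_limit_pps / offered_pps


# Aggregate traffic right at the 200k PPS trigger, throttled to 1k PPS.
loss = expected_loss_fraction(200_000)
print(f"Estimated loss: {loss:.1%}")
```

Under these assumptions, traffic at the 200k PPS trigger level that is throttled to 1k PPS loses roughly 99.5% of its packets, which is why the symptom appears as extremely high packet loss rather than a mild degradation.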
What temporary solutions have been implemented?
Azure has temporarily increased the PPS threshold for the affected IPs to 2 million PPS until April 2025.
Is there a permanent solution to this issue?
Currently, there is no permanent solution. However, Cato is working closely with Azure to provide such a solution. Customers are encouraged to monitor their traffic and work with Azure support to find long-term strategies to mitigate the impact.
What should customers do if they experience similar issues?
Customers should immediately report the issue to Azure support and provide detailed information about their traffic patterns. Customers should also open a support ticket with Cato and share the same information. Cato will work together with Azure to prevent future incidents.
In addition, we recommend that customers consider implementing traffic distribution strategies to avoid exceeding the Azure PPS threshold.
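This article does not prescribe a specific distribution strategy. As one possible way to reason about it, the Python sketch below assumes traffic can be spread evenly across several destination IPs and estimates how many are needed to stay under the 200k PPS threshold. The `headroom` safety margin and the function name are hypothetical. Because the threshold aggregates traffic from all Azure sources, including traffic you cannot observe, the result is a lower bound at best.

```python
import math


def min_destinations(total_pps, trigger_pps=200_000, headroom=0.8):
    """Estimate how many destination IPs keep each one under the threshold.

    Assumes traffic splits evenly across destinations. `headroom` is a
    hypothetical safety margin: 0.8 means aim for at most 80% of the
    trigger per destination. The 200,000 PPS default is from this article.
    """
    if total_pps < 0:
        raise ValueError("total_pps must not be negative")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in the range (0, 1]")
    per_destination_budget = trigger_pps * headroom
    return max(1, math.ceil(total_pps / per_destination_budget))


# Hypothetical example: 500k PPS of aggregate tunnel traffic.
needed = min_destinations(500_000)
```

With an 80% margin, each destination has a budget of 160k PPS, so 500k PPS of aggregate traffic would need at least four destination IPs under these assumptions.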