Configuring the Connection SLA Settings

This article discusses the SLA for Socket link connectivity, and how to customize the settings for the account or specific sites.

Overview of the Connection SLA Settings

The Cato connectivity SLA for the last mile assures optimal performance and resiliency for the site application flows. The Socket and the connected PoP use real-time, SLA-based path selection algorithms to select the optimal link for each flow in the upstream and downstream directions. The algorithm constantly monitors SLA KPIs such as packet loss, latency, congestion, port status, and Internet connectivity status, and the Socket seamlessly moves flows between links if SLA degradation is detected. The Socket optimally distributes traffic between all active links, including links with different bandwidth capacities and asymmetric upstream/downstream bandwidth. The Socket's connectivity SLA mechanism is designed to react to connectivity problems and automatically take actions to overcome them.

We recommend using the active/active configuration for Socket sites for the best resiliency and performance.

Acceptable and Unacceptable SLA

The Connection SLA screen in the Cato Management Application lets you define acceptable and unacceptable SLA thresholds. When the connectivity SLA is within the acceptable SLA thresholds, the Socket remains connected to the same PoP and uses the real-time path-selection algorithms to select the best link for each flow. In addition, it moves flows between the active links as needed to provide the best user experience. The unacceptable SLA thresholds are defined at the account level and can also be customized for specific sites. See Customizing the SLA Thresholds Settings below.

When the connectivity SLA can't meet the thresholds and becomes unacceptable, the Socket and the PoP take actions to repair the connectivity. For example, the Socket activates the passive links. If these actions don't solve the connectivity problem, the Socket connects to a different PoP.

Operating within Acceptable SLA

Within the acceptable SLA, the Socket uses all the active links and selects the best link for each flow based on a health score that is calculated in real time. If the Socket detects an SLA issue or health score degradation for a link, it seamlessly moves flows between the links. The SLA KPI metrics that the Socket monitors include packet loss, latency, jitter, congestion, and more. For more information, see Part 1: The Socket Interfaces and Precedence.

For active/passive configurations, the passive links remain inactive as long as there is at least one active link with acceptable SLA.

Example of Packet Loss within Acceptable SLA

The following examples show Socket site configurations where the unacceptable SLA threshold is set to 10% packet loss. Link 1 is experiencing 3% packet loss and link 2 has 0% packet loss.

AA_Good_SLA.png
  • For new flows, the Socket or PoP chooses the link with the best quality

  • For existing flows, the Socket gradually moves flows to the best quality link

AP_Good_SLA.png
  • Link 2 (the passive link) is not activated because link 1 meets the acceptable SLA threshold. All flows continue to use the active link.
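The two diagrams above can be summarized in a short Python sketch. This is only an illustration under stated assumptions: the Link class and the function names are hypothetical, and packet loss alone stands in for the real-time health score, whose actual calculation isn't described in this article.

```python
from dataclasses import dataclass
from typing import List

UNACCEPTABLE_LOSS_PCT = 10.0  # packet loss threshold used in the examples above


@dataclass
class Link:
    name: str
    packet_loss_pct: float
    active: bool  # False means the link is configured as passive


def meets_sla(link: Link) -> bool:
    """A link is acceptable while its packet loss is not above the threshold."""
    return link.packet_loss_pct <= UNACCEPTABLE_LOSS_PCT


def link_for_new_flow(links: List[Link]) -> Link:
    """Pick the active link with the lowest packet loss (simplified health score)."""
    active = [link for link in links if link.active]
    return min(active, key=lambda link: link.packet_loss_pct)


def passive_link_needed(links: List[Link]) -> bool:
    """Passive links stay inactive while at least one active link meets the SLA."""
    return not any(meets_sla(link) for link in links if link.active)


# Active/active: link 1 has 3% loss, link 2 has 0% loss
active_active = [Link("link 1", 3.0, True), Link("link 2", 0.0, True)]
# Active/passive: link 2 is the passive link
active_passive = [Link("link 1", 3.0, True), Link("link 2", 0.0, False)]
```

With these values, link_for_new_flow(active_active) returns link 2, and passive_link_needed(active_passive) is False, so all flows stay on link 1.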

Operating with Unacceptable SLA

When the Socket determines that none of the active links meet the SLA over the evaluation period, the SLA is considered unacceptable and the Socket automatically takes actions to remediate the connectivity issues. Depending on the link configuration and Connection SLA settings, the Socket activates a lower-precedence passive link or, if none of the links meet the acceptable SLA thresholds, connects all links to a different PoP.

Example of Remedy Actions for Unacceptable SLA

The following examples show Socket site configurations where the unacceptable SLA threshold is set to 10% packet loss. Link 1 is experiencing 15% packet loss and link 2 has 0% packet loss. These examples apply during the evaluation period, while the PoP is using its self-healing mechanisms.

AA_Bad_Link.png
  • Same behavior as above with active/active acceptable SLA

AP_Bad_Link.png
  • The passive link (link 2) is activated

  • Socket now works in active/active configuration

  • New flows use link 2

  • Existing flows gradually move from link 1 to link 2

  • For configurations where link 2 is a Last-Resort link, the Grace-timer starts counting

    The Grace-time gives extra time to resolve connectivity issues before activating the cellular link

    • If acceptable SLA isn't restored on link 1 during the Grace-time, then link 2 (the Last-Resort link) is activated
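The remedy actions described in this section can be sketched as a simple decision function. This is a simplified model and not the Socket's actual logic: the function name and the returned labels are hypothetical, only packet loss is checked, and the evaluation period and the Grace-timer for Last-Resort links are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    name: str
    packet_loss_pct: float
    active: bool  # False means the link is configured as passive


def remedy_action(links: List[Link], loss_threshold_pct: float = 10.0) -> str:
    """Return the next step for a site, based on which links meet the SLA."""
    active = [link for link in links if link.active]
    passive = [link for link in links if not link.active]
    if any(link.packet_loss_pct <= loss_threshold_pct for link in active):
        # At least one active link is acceptable: flows move to the best link
        return "keep_current_links"
    if passive:
        # No active link is acceptable, but a passive link can be activated
        return "activate_passive_link"
    # Every link is already active and none of them meets the SLA
    return "connect_to_different_pop"


# Link 1 has 15% loss, link 2 has 0% loss
active_active = [Link("link 1", 15.0, True), Link("link 2", 0.0, True)]
active_passive = [Link("link 1", 15.0, True), Link("link 2", 0.0, False)]
```

Here remedy_action(active_active) returns "keep_current_links" and remedy_action(active_passive) returns "activate_passive_link". The third result only applies when every link is active and over the threshold.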

Example of Connecting to a Different PoP for Unacceptable Connectivity SLA

If the remedy actions during the evaluation period don't resolve the connectivity issues, then the Socket connects to a different PoP. When a Socket connects to a new PoP, this is the behavior:

  1. The Socket starts the initial connectivity SLA evaluation period of up to 30 seconds.

    1. If the links to the PoP have acceptable SLA, the Socket remains connected to the PoP.

    2. If the links to the PoP have unacceptable SLA, the Socket connects to a different PoP and repeats the initial connectivity SLA evaluation period of up to 30 seconds.

  2. If the Socket can't locate a PoP with acceptable SLA, it returns and connects to the original PoP.
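The steps above can be sketched as a loop over candidate PoPs. This is an illustration only: choose_pop, the PoP names, and the has_acceptable_sla callback are hypothetical, and the callback stands in for the initial evaluation period of up to 30 seconds.

```python
from typing import Callable, List

INITIAL_EVALUATION_SECONDS = 30  # upper bound for each initial evaluation


def choose_pop(
    original_pop: str,
    candidate_pops: List[str],
    has_acceptable_sla: Callable[[str], bool],
) -> str:
    """Try each candidate PoP in order and stay on the first acceptable one."""
    for pop in candidate_pops:
        # Each call represents one evaluation of up to INITIAL_EVALUATION_SECONDS
        if has_acceptable_sla(pop):
            return pop
    # No candidate met the SLA: return and connect to the original PoP
    return original_pop


# Hypothetical case: only "pop-c" provides acceptable SLA
chosen = choose_pop("pop-a", ["pop-b", "pop-c"], lambda pop: pop == "pop-c")
# No candidate is acceptable, so the Socket returns to "pop-a"
fallback = choose_pop("pop-a", ["pop-b", "pop-c"], lambda pop: False)
```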

The following examples show Socket site configurations where the unacceptable SLA threshold is set to 10% packet loss. Link 1 is experiencing 20% packet loss and link 2 has 15% packet loss as a result of tier-1 provider connectivity issues. The second diagram shows how connecting to a different PoP resolves the issue.

T1_Bad_SLA.png
  • After the evaluation period, there is unacceptable SLA (more than 10% packet loss) on all active links

    For example, packet loss related to the tier-1 service provider

T1_Good_SLA.png
  • Socket connects to the next best PoP

  • After 30 seconds, the Socket confirms that the links meet the acceptable SLA

  • A reconnect event is generated

Reconnecting to the Original PoP

For optimal performance and lowest latency, we recommend that the Socket always connects to the nearest physical PoP location. If the Socket moves to a different PoP location due to SLA issues with the primary PoP, it automatically attempts to reconnect to the preferred PoP location (the nearest PoP to the site) after 60 minutes. The Socket verifies that the preferred PoP is available and provides good service before reconnecting to it. You can also choose to manually reconnect the Socket to the preferred PoP. See Defining a Preferred PoP for a Site.
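As a rough illustration, the conditions for returning to the preferred PoP can be written as a single check. The function and parameter names are hypothetical, and how the Socket actually verifies that the preferred PoP provides good service isn't described in this article.

```python
RETRY_AFTER_MINUTES = 60  # the Socket attempts to return after this interval


def should_return_to_preferred(
    minutes_on_other_pop: float,
    preferred_available: bool,
    preferred_sla_ok: bool,
) -> bool:
    """Return True only when the interval passed and the preferred PoP is healthy."""
    return (
        minutes_on_other_pop >= RETRY_AFTER_MINUTES
        and preferred_available
        and preferred_sla_ok
    )
```

For example, should_return_to_preferred(60, True, True) is True, while should_return_to_preferred(60, True, False) is False because the preferred PoP doesn't yet provide good service.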

Defining the Connectivity SLA Thresholds

There are two options to define the Connectivity SLA thresholds:

  1. Cato Smart SLA - automatically set by Cato (this is the default option)

  2. Custom SLA settings - customize the SLA thresholds for the entire account or specific sites

Using Cato Smart SLA

The Cato Smart SLA option automatically sets the recommended SLA settings for the last mile connectivity between the Socket and the PoP. This setting includes a 10-minute SLA evaluation period to decide whether the existing connectivity to the PoP meets the default SLA requirements. If the SLA requirements are not met for 50% of the 10-minute period, the Socket automatically moves the tunnels to a different PoP to restore the connectivity SLA.

The goal of this 10-minute period is to allow the internal mechanisms in the PoP to identify and resolve the connectivity problem, and to avoid moving the site to a different PoP. For example, the PoP automatically identifies a poor quality Tier-1 provider peer and temporarily removes it from the service. Then all traffic from the connected sites uses the remaining Tier-1 provider peers.

These are the default connectivity SLA thresholds for the Cato Smart SLA option:

  • Evaluation period - default value 10 min

  • Packet loss - default value 10%

  • Latency - default value 300 ms

  • % of the time window - default value 50%
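The following Python sketch shows one way to read these thresholds together. It is a model under assumptions, not Cato's implementation: the class and function names are hypothetical, the samples are assumed to be evenly spaced measurements covering one evaluation period, and reaching exactly 50% is assumed to count as unacceptable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SlaThresholds:
    evaluation_period_s: int = 600   # 10 min
    packet_loss_pct: float = 10.0
    latency_ms: float = 300.0
    time_window_pct: float = 50.0


def is_unacceptable(
    samples: List[Tuple[float, float]],
    thresholds: SlaThresholds = SlaThresholds(),
) -> bool:
    """samples holds (packet_loss_pct, latency_ms) pairs for one evaluation period."""
    if not samples:
        return False
    # Count the samples where either KPI is over its threshold
    bad = sum(
        1
        for loss, latency in samples
        if loss > thresholds.packet_loss_pct or latency > thresholds.latency_ms
    )
    return 100.0 * bad / len(samples) >= thresholds.time_window_pct


# One sample per minute: 5 of 10 minutes with 15% packet loss
half_bad = [(15.0, 40.0)] * 5 + [(1.0, 40.0)] * 5
# Only 4 of 10 minutes over the threshold
mostly_good = [(15.0, 40.0)] * 4 + [(1.0, 40.0)] * 6
```

In this model, is_unacceptable(half_bad) is True and is_unacceptable(mostly_good) is False.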

We recommend that you use the Cato Smart SLA option to define the Connection SLA thresholds for your account.

Customizing the SLA Thresholds Settings

You can customize the SLA settings to change the default evaluation period, the packet loss and latency SLA thresholds, and the percentage of the time window during which the Packet Loss and Latency are over the thresholds. For example, you set the connection SLA Evaluation period to 600 seconds. Within that period, you want to make sure that Packet Loss isn't greater than 10% for more than 30% of the time, so you set the % of the time window to 30.
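To see what the % of the time window means in seconds, you can multiply it by the evaluation period. The helper below is a hypothetical illustration of that arithmetic only.

```python
def seconds_over_threshold(evaluation_period_s: float, time_window_pct: float) -> float:
    """Time in seconds that corresponds to the % of the time window."""
    return evaluation_period_s * time_window_pct / 100.0


# The custom example above: 600-second evaluation period, 30% time window
custom_example = seconds_over_threshold(600, 30)  # 180.0 seconds (3 minutes)
# Cato Smart SLA defaults: 10-minute evaluation period, 50% time window
smart_sla = seconds_over_threshold(600, 50)       # 300.0 seconds (5 minutes)
```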

Be aware that if you configure the SLA settings to be too sensitive, for example by reducing the packet loss threshold to 1% and setting the evaluation period to 20 sec, the site can frequently move to different PoPs. Each move can cause application flows to reset and temporarily impact the user experience until the flows are re-established.

These are the default values for the custom SLA thresholds:

  • Evaluation period - 130 sec

  • Packet loss - 10%

  • Latency - 300 ms

  • % of the time window - 100%

You can define the Connection SLA setting as a global setting for the entire account, and different Link SLA settings for specific sites. The Link SLA for a specific site overrides the account settings. 
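The override rule can be sketched as a lookup that falls back to the account settings. The ConnectionSla class, the effective_sla function, and the site names are hypothetical and only model the behavior described above. The class defaults are the custom SLA default values listed in this section.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConnectionSla:
    evaluation_period_s: int = 130
    packet_loss_pct: float = 10.0
    latency_ms: float = 300.0
    time_window_pct: float = 100.0


def effective_sla(
    site: str,
    account_sla: ConnectionSla,
    site_overrides: Dict[str, ConnectionSla],
) -> ConnectionSla:
    """A site-level setting, when defined, overrides the account setting."""
    return site_overrides.get(site, account_sla)


account = ConnectionSla()
# Hypothetical site with a stricter packet loss threshold
overrides = {"branch-1": ConnectionSla(packet_loss_pct=5.0)}
```

Here effective_sla("branch-1", account, overrides) returns the site-specific settings, and any other site falls back to the account settings.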

You can check the current Connection SLA Latency between the Socket and the PoP from the Socket UI. In the Tunnels > SLA Parameters section, the Latency is displayed in near real-time. The latency used in the Connection SLA calculation is the one-way latency and not the Round-Trip Time (RTT). The Distance graph in the Network Analytics page displays the Round-Trip Time. For an approximate analysis of historical latency, halve the Distance (RTT) value in milliseconds that is shown in the Distance graph.
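As a small worked example of this approximation, the sketch below halves an RTT value and compares the result with the default 300 ms latency threshold. The function names are hypothetical, and halving the RTT is only an estimate that assumes the upstream and downstream latency are roughly equal.

```python
LATENCY_THRESHOLD_MS = 300.0  # default latency threshold


def approx_one_way_latency_ms(rtt_ms: float) -> float:
    """Estimate one-way latency as half of the Round-Trip Time."""
    return rtt_ms / 2.0


def exceeds_latency_threshold(
    rtt_ms: float, threshold_ms: float = LATENCY_THRESHOLD_MS
) -> bool:
    """Compare the estimated one-way latency with the SLA latency threshold."""
    return approx_one_way_latency_ms(rtt_ms) > threshold_ms
```

For example, a Distance (RTT) value of 120 ms corresponds to roughly 60 ms of one-way latency, which is well under the default threshold.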

Customizing the SLA Thresholds for the Account

Customize the SLA Thresholds settings for all Socket sites in the account.

connectionsla.png

To customize the SLA Thresholds settings for the account:

  1. From the navigation menu, click Network > Connection SLA. The Connection SLA screen opens.

  2. Expand the SLA Thresholds section.

  3. Click Use custom SLA thresholds for Packet Loss and Latency.

  4. Customize the evaluation period for the links. In the A link is considered as an unacceptable SLA if any of the following thresholds is exceeded for field, enter the number of seconds.

  5. Customize the SLA threshold settings for the Packet Loss and Average Latency.

  6. Set the % of the time window, which is the percentage of the evaluation period during which the Packet Loss or Average Latency is over the SLA thresholds before the SLA is considered unacceptable.

  7. Click Save.

Customizing the SLA Thresholds for a Site

You can customize different SLA Thresholds for the active links for specific Socket sites. The setting for a specific site overrides the account setting.

To customize the SLA Thresholds for a specific site:

  1. From the navigation menu, click Network > Sites and select the site.

  2. From the navigation menu, click Advanced Settings > Connection SLA.

  3. Expand the SLA Thresholds section.

  4. Select Override Account Settings.

  5. Customize the evaluation period for the links. In the A link is considered as an unacceptable SLA if any of the following thresholds is exceeded for field, enter the number of seconds.

  6. Customize the SLA threshold settings for the Packet Loss and Average Latency.

  7. Set the % of the time window, which is the percentage of the evaluation period during which the Packet Loss or Average Latency is over the SLA thresholds before the SLA is considered unacceptable.

  8. Click Save.
