BGP Session is Disconnected - Network Playbook

This playbook describes steps to resolve issues when a BGP session disconnects for a site.

Overview

When a BGP session is disconnected, the connection between two BGP routers is terminated, which can disrupt the exchange of routing information. The impact of the disconnected session depends on the network's redundancy and failover mechanisms. When alternate paths exist, the impact may be minimal. In less resilient setups, however, a disconnection can lead to temporary routing issues and service disruptions.

There are different ways to discover that a BGP session has disconnected for a site:

  • Go to the Stories Workbench page and use the Network XDR preset to find the BGP session disconnected stories.

    bgp-disconnected-story.png

    The story provides information about the incident timeline, current Socket status, and more.

  • Look for a Routing event with the BGP Session sub-type and the Disconnected action

    • Use the BGP peers disconnected preset filter and adjust the time frame if necessary

  • BGP email notification

    • When email notifications are enabled for a BGP peer, emails are sent to the mailing list (which can include non-admin users)

Step 1 - Verifying that the BGP Session is Disconnected

This section discusses different Cato tools that you can use to verify that the BGP session for a site is disconnected, and to identify the possible root cause.

Taking a PCAP on the LAN Interface

Take a packet capture (PCAP) on the Socket LAN interface (the port used for BGP traffic). For more information, see How to Take a Packet Capture on a Socket.

  • Check whether there are ARP replies to the who-has requests for the IP address of the BGP router. If there are no replies, this indicates that there might be a problem between the BGP router and the switch.

  • If there are ARP replies, filter the PCAP for port 179. If the router responded with a SYN-ACK, this indicates that there is a session. At this point, check which side is dropping the connection, and with which error.
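When checking which error closed the session, it can help to decode the BGP message carried in the last packets on port 179. The following is a minimal sketch, assuming you have copied the raw TCP payload bytes out of the PCAP yourself. The function name `describe_bgp_message` is hypothetical; the header layout and error codes follow RFC 4271. It only decodes the first message in the payload and does not interpret subcodes.

```python
import struct

# BGP message types and NOTIFICATION error codes as defined in RFC 4271.
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type


def describe_bgp_message(payload: bytes) -> dict:
    """Decode the first BGP message in a TCP payload (hypothetical helper)."""
    if len(payload) < HEADER_LEN:
        raise ValueError("payload is shorter than a BGP header")
    marker, length, msg_type = struct.unpack("!16sHB", payload[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("BGP marker is not all ones")
    info = {
        "type": MESSAGE_TYPES.get(msg_type, f"unknown ({msg_type})"),
        "length": length,
    }
    if msg_type == 3:
        # A NOTIFICATION carries a 1-byte error code and a 1-byte subcode.
        if len(payload) < HEADER_LEN + 2:
            raise ValueError("NOTIFICATION is truncated")
        code = payload[HEADER_LEN]
        info["error_code"] = code
        info["error"] = ERROR_CODES.get(code, f"unknown ({code})")
        info["subcode"] = payload[HEADER_LEN + 1]
    return info
```

For example, `describe_bgp_message(bytes.fromhex("ff" * 16 + "0015" + "03" + "0400"))` reports a NOTIFICATION with the error Hold Timer Expired. The source IP address of the packet that carries the NOTIFICATION shows which side closed the session.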

Reviewing Last Activity for a Known Host

Use the Known Hosts page for the site to review the most recent time there was activity for a host. This provides more information about the timing of connectivity issues and the BGP session.

Showing the BGP Status

Use the Cato Management Application to show the real-time status of the BGP session. In the BGP page for the site (Network > Sites > {site name} > Site Configuration > BGP), click Show BGP Status.

This is an example of the status for a disconnected BGP session:

BGP_Disconnect_Status.png

Under the Raw Status tab, review the BGP configuration. See if there are any configuration issues that stand out, such as an incorrect VLAN tag.

Pinging the Host from the LAN

You can use the Socket WebUI to ping the BGP peer from the LAN interface. Make sure that the BGP peer allows ICMP traffic. For more information, see Using the Socket WebUI Tools.

  • From the Socket WebUI, ping the host with these settings:

    • Route via - LAN

    • Hostname/IP - IP address of the BGP peer

  • These are example conclusions based on the ping results:

    • Fails - The BGP router is not reachable, so the issue isn't related to the Cato Cloud

    • Succeeds - There is an issue between the PoP and the BGP router
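The conclusions from the PCAP and ping checks in this step can be collected into a short checklist. The following is an illustrative sketch only: the function `playbook_conclusions` and its parameters are hypothetical, and you record each observation by hand (there is no Cato API call here). A value of `None` means the check was not performed.

```python
from typing import List, Optional


def playbook_conclusions(
    arp_reply: Optional[bool] = None,
    syn_ack: Optional[bool] = None,
    ping_ok: Optional[bool] = None,
) -> List[str]:
    """Map manually recorded observations to the conclusions in this playbook."""
    findings = []
    if arp_reply is False:
        findings.append(
            "No ARP replies: possible problem between the BGP router and the switch"
        )
    if arp_reply and syn_ack:
        findings.append(
            "SYN-ACK seen on port 179: check which side drops the connection "
            "and with which error"
        )
    if ping_ok is False:
        findings.append(
            "Ping from the LAN fails: the BGP router is not reachable, "
            "so the issue isn't related to the Cato Cloud"
        )
    if ping_ok:
        findings.append(
            "Ping from the LAN succeeds: the issue is between the PoP "
            "and the BGP router"
        )
    return findings
```

For example, `playbook_conclusions(arp_reply=True, syn_ack=True, ping_ok=True)` returns both the port 179 finding and the PoP finding. A failed ping is only meaningful when the BGP peer allows ICMP traffic.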

Verifying the BGP Disconnected Status for Cross Connect Sites

For Cross Connect sites, BGP is used for connectivity between the cloud environment underlay and the PoPs.

  • In the Cross Connect page for the site (Network > Sites > {site name} > Site Configuration > Cross Connect), click Test Connectivity to show the BGP status of the underlay

  • In the Sites page, review the status of the site

Step 3 - Remediating the BGP Disconnected Status

Once you verify that the standby BGP neighbor is disconnected, you can edit one of the BGP neighbors and click Save. This pushes a new configuration, which can resolve the issue. Then restore the original settings and click Save again.

Reviewing the Audit Trail

Review changes in the Audit Trail page for the Cato Management Application, and see if there is a configuration change that is related to this issue.

Step 4 - Verifying that the BGP Status is Connected

Showing the BGP Session Established Event

After the BGP neighbor is connected to the site, a BGP Session event is generated with the Action Established. In the Events page, you can manually configure the event filter for Action IS Established to show the event.
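If you export the BGP Session events for offline review, pairing each Disconnected event with the next Established event gives the length of each outage. The following is a minimal sketch under stated assumptions: the export is a list of records with the hypothetical fields `time` (ISO 8601 without a time zone suffix), `sub_type`, and `action`, and the site has a single BGP peer. The actual export format may use different field names.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Outage = Tuple[datetime, datetime, timedelta]


def bgp_outages(events: List[dict]) -> Tuple[List[Outage], Optional[datetime]]:
    """Pair Disconnected/Established events (field names are assumptions).

    Returns the closed outages and, if the session is still down,
    the time of the last unmatched Disconnected event.
    """
    # Keep only BGP Session events and order them by parsed timestamp.
    parsed = sorted(
        (datetime.fromisoformat(e["time"]), e["action"])
        for e in events
        if e.get("sub_type") == "BGP Session"
    )
    outages: List[Outage] = []
    down_since: Optional[datetime] = None
    for ts, action in parsed:
        if action == "Disconnected" and down_since is None:
            down_since = ts
        elif action == "Established" and down_since is not None:
            outages.append((down_since, ts, ts - down_since))
            down_since = None
    return outages, down_since
```

If the second return value is not `None`, no Established event followed the last disconnection, so the session was still down at the end of the exported time frame.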

Testing the BGP Status

The real-time status of the BGP session shows the routing status and information. In the BGP page for the site (Network > Sites > {site name} > Site Configuration > BGP), click Show BGP Status.
