LAN Monitoring Host Unreachable - Network Playbook

This playbook describes steps to resolve issues when LAN Monitoring is configured and the Cato Cloud can't reach a host behind a site.

Overview

When you define hosts in LAN Monitoring, a PoP in the Cato Cloud sends ICMP packets to verify that each host is up and running. When a host fails to respond to the defined threshold of consecutive ICMP messages, it is considered down and an event is generated that the host is unreachable.

This playbook contains steps you can take to:

  1. Verify that the host is down.

  2. Remediate the issue.

  3. Verify that the host is restored and the Cato Cloud has resumed monitoring it.

A Cato Management Application admin can verify in any of the following ways that a host can't connect to the Cato Cloud or to the Internet:

  • Go to the Stories Workbench page and use the Network XDR preset to find the LAN monitoring host unreachable stories.

    The story provides information on the current status of the site, an incident timeline, and more.

  • LAN Monitoring event with the action Host Unreachable

    • Use the LAN hosts unreachable preset filter and adjust the time frame if necessary

  • LAN Monitoring email notification

    • When email notifications are enabled for a LAN Monitoring rule, emails are sent to the mailing list (which can include non-admins)

Understanding LAN Monitoring

The LAN Monitoring feature lets you define hosts behind a site by their IP address, and set the Fault Threshold for each host (the maximum number of consecutive failed ICMP tests). A PoP in the Cato Cloud sends ICMP tests to the host. If the host fails to respond to the specified number of ICMP tests, it is considered down, and an event is automatically generated. You can also choose to send an email notification when a host is unreachable.

When the connectivity between the host and the PoP is restored, a new event is generated that the host is reachable.
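
The threshold behavior described above can be sketched as a small state machine. This is only an illustration of the logic, not Cato's implementation: the class and function names are hypothetical, and it assumes that the host is marked down as soon as the count of consecutive failures reaches the Fault Threshold.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class HostMonitor:
    """Hypothetical tracker of consecutive failed ICMP tests for one host."""

    fault_threshold: int           # consecutive failures before the host counts as down
    consecutive_failures: int = 0
    is_down: bool = False

    def record(self, responded: bool) -> Optional[str]:
        """Record one ICMP test result; return an event name on a state change."""
        if responded:
            self.consecutive_failures = 0
            if self.is_down:
                self.is_down = False
                return "Host Reachable"
            return None
        self.consecutive_failures += 1
        # Assumption: the host is considered down when the threshold is reached.
        if not self.is_down and self.consecutive_failures >= self.fault_threshold:
            self.is_down = True
            return "Host Unreachable"
        return None


def replay(results: Iterable[bool], fault_threshold: int) -> List[Tuple[int, str]]:
    """Replay a sequence of test results and collect (test index, event) pairs."""
    monitor = HostMonitor(fault_threshold)
    events = []
    for index, responded in enumerate(results):
        event = monitor.record(responded)
        if event is not None:
            events.append((index, event))
    return events
```

With a Fault Threshold of 3, two failed tests followed by a response do not generate an event, because the failure count resets. Only three failures in a row produce a Host Unreachable event, and the next successful test produces a Host Reachable event.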

For more information, see Working with LAN Monitoring for a Site.

Step 1 - Verifying the Root Cause

This section discusses different Cato tools that you can use to verify the reason that the host is unreachable.

Using Socket WebUI Tools

You can use the Socket WebUI to ping the host from the LAN interface. For more information, see Using the Socket WebUI Tools.

  • From the Socket WebUI, ping the host with these settings:

    • Route via - LAN

    • Hostname/IP - IP address of the unreachable host

    If there is no response to the ping, ping the host from the Socket. If there is a response to the ping from the Socket, the issue might be related to routing.

    • Using the Socket WebUI tools, take a PCAP and see if there is a TCP handshake between the host and the PoP.

    • If there is no handshake, filter the PCAP for ARP messages.

    If the host is responding to ARP messages, this would reinforce the assumption that there is a routing issue.

    If the host is responding to neither ping nor ARP, this would indicate that the host is probably down.
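
The interpretation of these checks can be summarized as a short decision function. This is a minimal sketch of the reasoning in this section only. The function name and parameters are hypothetical, and the results are entered by hand from what you see in the Socket WebUI and the PCAP.

```python
def triage(socket_ping_ok: bool, arp_reply_seen: bool) -> str:
    """Map manual observations to the likely cause suggested in this section.

    socket_ping_ok  -- the host answered a ping sent from the Socket
    arp_reply_seen  -- the PCAP shows the host answering ARP messages
    """
    if socket_ping_ok or arp_reply_seen:
        # The host is alive on the LAN, so the path to it is the likely problem.
        return "possible routing issue"
    # No ping response and no ARP response: the host itself is the likely problem.
    return "host probably down"
```

For example, `triage(socket_ping_ok=False, arp_reply_seen=True)` returns `"possible routing issue"`, which matches the case where the host answers ARP but not ping.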

Reviewing Changes in the Audit Trail

Review changes in the Audit Trail page for the Cato Management Application, and see if there is a configuration change that is related to this issue.

Step 2 - Remediating the Host Unreachable Issue

Once you identify the reason why the Cato Cloud can't reach the host, resolve the issue and restore connectivity. We recommend checking these potential internal causes:

  • Verify the host status and connectivity

  • Verify that there is no planned activity or maintenance that impacts the host

  • Check local connectivity, routing, and configurations that could impact the host

Step 3 - Verifying that the Host is Reachable

After you remediate the issue with the host, verify that it is reachable and has connectivity to the Cato Cloud.

Viewing the Host in the Known Hosts Page

From the Known Hosts page, show the host and verify that the Last Host Activity is showing data for the current time.

Pinging the Host from the Socket WebUI

Use the Socket WebUI to ping the host, first from the LAN interface to verify that the host has connectivity to the site. Then ping the host again from the PoP using a WAN interface to verify that it has connectivity to the Cato Cloud.

Reviewing the Host Reachable Event

After the connectivity between the host and the Cato Cloud is restored, a Host Reachable event is generated. You can manually configure the event filter for Action IS Host Reachable to show the event.
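
If you work with a list of LAN Monitoring events outside the Cato Management Application, the same check can be sketched as follows. The record layout is an assumption made for illustration: the keys `time`, `host_ip`, and `action` are hypothetical and may not match the actual event fields. Only the action values Host Unreachable and Host Reachable come from this playbook.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Mapping


def latest_action_per_host(events: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Return the most recent LAN Monitoring action seen for each host IP."""
    latest: Dict[str, str] = {}
    # Sort by timestamp so that later events overwrite earlier ones.
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["time"]))
    for event in ordered:
        latest[event["host_ip"]] = event["action"]
    return latest


def hosts_still_unreachable(events: Iterable[Mapping[str, str]]) -> List[str]:
    """List hosts whose latest event is still Host Unreachable."""
    latest = latest_action_per_host(events)
    return sorted(ip for ip, action in latest.items() if action == "Host Unreachable")


# Hypothetical sample data using documentation-range IP addresses.
sample_events = [
    {"time": "2024-01-01T10:00:00", "host_ip": "192.0.2.10", "action": "Host Unreachable"},
    {"time": "2024-01-01T10:05:00", "host_ip": "192.0.2.20", "action": "Host Unreachable"},
    {"time": "2024-01-01T10:30:00", "host_ip": "192.0.2.10", "action": "Host Reachable"},
]
```

In this sample, 192.0.2.10 has a Host Reachable event after its Host Unreachable event, so it is restored, while 192.0.2.20 has no Host Reachable event yet and still needs attention.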
