Reviewing XDR Network Stories

This article discusses how you can use the Stories Workbench to review Network XDR stories for connectivity and performance issues on your network.

Overview

Cato Detection & Response (XDR) identifies network issues such as degradation, in addition to potential security threats. The advanced Network XDR engine detects different indications and metrics related to connectivity and performance, and generates stories that correlate data for issues concerning the network. For example, if a WAN link is intermittently experiencing high packet loss, the engine will create a single story with all the relevant data for the link.

The Stories Workbench page shows the details of each story to help you understand and analyze the issues. You can sort and filter the stories to find the most important incidents, and then drill down on a story to investigate the details and resolve the issue.

Network Story Indications

These are the indications of network connectivity and performance issues that are detected by the Network XDR engine to generate stories:

Site down

  • Description: The site disconnected from the Cato Cloud.

  • Threshold for generating a story: All links down for 2.5 minutes

Link is down

  • Description: One of the WAN links for a site disconnected from the Cato Cloud, but the site is still connected.

  • Threshold for generating a story: A link is down for 2.5 minutes, or a link had 5 shorter disconnections in a 10-minute period

BGP session disconnected

  • Description: A BGP session unexpectedly disconnected, which can impact app connectivity and the user experience.

  • Threshold for generating a story: One BGP disconnection event

LAN monitoring - host unreachable

  • Description: A monitored host behind a site isn’t responding to keep-alive packets from the PoP, and is considered unreachable. Requires a LAN Monitoring rule configured for the host.

  • Threshold for generating a story: One LAN Monitoring Unreachable event

Link quality SLA

  • Description: The link SLA quality threshold for a site is exceeded. This can impact user experience. The SLA thresholds are configured for Quality Health Rules.

  • Threshold for generating a story: One Quality Health Rule event

Socket HA Not Ready status

  • Description: There is an issue with the Socket High Availability (HA) configuration, and the status is Not Ready.

  • Threshold for generating a story: One of the following Socket HA Not Ready conditions occurs:

    • Connected is not ready for 2.5 minutes

    • Keepalive is not ready for 2.5 minutes

    For more about these conditions, see What is Socket High Availability (HA)

PoP reconnect to improve connectivity

  • Description: The site was forced to reconnect to the PoP to optimize performance. Reconnecting to the PoP can impact user experience.

  • Threshold for generating a story: One reconnect event with this message:

    Performance issue detected, reconnected to a different service node in the Cato Cloud

    For more about event message fields, see Understanding Socket Connectivity Event Message Fields

LAN port down

  • Description: One of the LAN ports disconnected.

  • Threshold for generating a story: The port is down for 2.5 minutes
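To make the two-part Link is down threshold concrete, here is a minimal Python sketch of how such a check could be evaluated. It is not Cato's implementation. The function name, the representation of outages as (start, end) pairs, the inclusive comparisons, and the choice to measure the 10-minute window between disconnection start times are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Values taken from the threshold description above.
SUSTAINED_OUTAGE = timedelta(minutes=2.5)  # a link is down for 2.5 minutes
FLAP_COUNT = 5                             # 5 shorter disconnections ...
FLAP_WINDOW = timedelta(minutes=10)        # ... in a 10-minute period


def link_down_story_triggered(outages):
    """Return True if the outages meet the 'Link is down' threshold.

    `outages` is a list of (start, end) datetime pairs sorted by start time.
    Hypothetical helper: inclusive comparisons and windowing on start
    times are assumptions, not documented behavior.
    """
    # Condition 1: a single outage lasting at least 2.5 minutes.
    if any(end - start >= SUSTAINED_OUTAGE for start, end in outages):
        return True

    # Condition 2: 5 disconnections whose start times fit in one 10-minute window.
    starts = [start for start, _ in outages]
    for i in range(len(starts) - FLAP_COUNT + 1):
        if starts[i + FLAP_COUNT - 1] - starts[i] <= FLAP_WINDOW:
            return True
    return False
```

With this sketch, one 150-second outage triggers a story, and so do five 10-second drops spaced two minutes apart, while five drops spread over more than 10 minutes do not.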

Example Use Case

This is an example use case for an admin identifying and resolving a network story with the Stories Workbench:

  1. Filtered the Stories Workbench to show open network stories grouped by site

  2. Identified a high criticality story for the New York site, with the indication Link is down

  3. Opened the drill-down page for the story, reviewed the story data and discovered the site's WAN 01 link was disconnected from the Cato Cloud

  4. Reviewed the relevant playbook to investigate and troubleshoot the issue

  5. After checking the physical Socket at the New York site, discovered the WAN 01 link cable was faulty

  6. Replaced the cable, confirmed the link was up and connected, and continued to monitor the story for possible recurrence of the issue

  7. Story automatically closed after two hours with no recurrence

Showing the Stories Workbench Page

The Stories Workbench page shows a summary of the XDR Network and Security stories for your account.

To show the Stories Workbench page:

  • From the navigation menu, click Monitoring > Stories Workbench.

Understanding the Stories Columns

[Screenshot: Stories Workbench page showing Network stories]

Column

Description

ID

Unique Cato ID for this story

Status

The statuses for a Network XDR story represent different stages throughout the story lifecycle, from the initial issue that triggered the story, through the final resolution. The Network XDR engine automatically updates the status when it detects the relevant changes in the network incident. These are the status types:

  • Open - The Network XDR engine detected a network issue that triggered the generation of a story

  • Monitoring - The Network XDR engine detected that the initial issue is resolved, and continues monitoring for a recurrence for two hours. If a recurrence is detected, the status changes back to Open

  • Closed - A story with a status of Monitoring changes to Closed when there is no detected recurrence for two hours.

Created

Date of the first traffic flow for the story

Updated

Date of the most recent traffic flow for the story

Criticality

  • For Network stories - The potential impact of the issue on your network. Values are from 1 (low impact) to 10 (high impact)

  • For Security stories - Cato's risk analysis of the story. Values are from 1 (low risk) to 10 (high risk)

Indication

  • For Network stories - Indication of the network issue for the story

  • For Security stories - Indicator of attack for the story. For more about indications, see Using the Indications Catalog

Source

  • For Network stories - The site where the network issue is occurring

  • For Security stories - IP address, name of device, or SDP user on your network involved in the story

Occurrences

The number of times the issue occurred, including recurrences after a temporary resolution. For example, if a link repeatedly disconnects and reconnects, each disconnection counts as an occurrence

Engine Type

The XDR engine that created the story. For Network stories, the engine is Network XDR
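The status lifecycle described for the Status column (Open, Monitoring, Closed) behaves like a small state machine. The following Python sketch shows one way to model it. The class and method names are hypothetical, and the rule that a Closed story ignores later recurrences is an assumption, because the article does not say what happens after a story closes.

```python
from datetime import datetime, timedelta
from enum import Enum

MONITORING_PERIOD = timedelta(hours=2)  # from the Status description


class Status(Enum):
    OPEN = "Open"
    MONITORING = "Monitoring"
    CLOSED = "Closed"


class Story:
    """Hypothetical model of a Network XDR story's status lifecycle."""

    def __init__(self):
        self.status = Status.OPEN      # a detected issue opens the story
        self.occurrences = 1
        self.monitoring_since = None

    def issue_resolved(self, at):
        # Open -> Monitoring when the initial issue is resolved.
        if self.status is Status.OPEN:
            self.status = Status.MONITORING
            self.monitoring_since = at

    def issue_recurred(self):
        # Assumption: a Closed story is not reopened by a later event.
        if self.status is Status.CLOSED:
            return
        # Monitoring -> Open on recurrence; each recurrence is an occurrence.
        self.occurrences += 1
        self.status = Status.OPEN
        self.monitoring_since = None

    def tick(self, now):
        # Monitoring -> Closed after two hours with no recurrence.
        if (self.status is Status.MONITORING
                and now - self.monitoring_since >= MONITORING_PERIOD):
            self.status = Status.CLOSED
```

In this model, a link that disconnects, reconnects, and disconnects again within two hours stays in a single story with two occurrences instead of generating a second story.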

Grouping the Stories

To provide context when reviewing the stories, you can show the stories in groups defined by details including Sources, Indication, Status, and Type. For example, you can show together all of the stories related to a specific source site, or all of the Link quality SLA stories. This gives you a broader perspective when analyzing the stories, and can help you more quickly understand and resolve issues.

For Network XDR stories, Sources are sites in your network.

As a best practice, we recommend beginning your analysis of Network stories by grouping by Sources.

Each group highlights the criticality levels for the stories in that group, including the number of high, medium, and low criticality stories.

[Screenshot: Stories Workbench with Network stories grouped]

To group the stories in the Stories Workbench:

  1. From the navigation menu, click Monitoring > Stories Workbench.

  2. From the Group By drop-down menu, select the required criterion.

    The stories are shown in expandable groups.
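As an illustration of what grouping produces, the sketch below groups a list of stories by a chosen field and counts the high, medium, and low criticality stories in each group. The dictionary keys and the score boundaries for the three levels are hypothetical. The article states only that Criticality ranges from 1 to 10 and does not define where each level begins.

```python
from collections import Counter, defaultdict


def criticality_level(score):
    # Hypothetical bucket boundaries, chosen only for this example.
    if score >= 7:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


def group_stories(stories, key):
    """Group stories by `key` and count criticality levels per group."""
    groups = defaultdict(Counter)
    for story in stories:
        groups[story[key]][criticality_level(story["criticality"])] += 1
    return {name: dict(counts) for name, counts in groups.items()}


stories = [
    {"source": "New York", "indication": "Link is down", "criticality": 8},
    {"source": "New York", "indication": "Link quality SLA", "criticality": 3},
    {"source": "London", "indication": "Link is down", "criticality": 5},
]
by_source = group_stories(stories, "source")
```

Here `by_source` maps each site to its criticality counts, which mirrors the per-group summary shown in the Stories Workbench. Passing `"indication"` as the key groups the same stories by indication instead.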

Filtering the Stories

There are three ways to filter the data in the Stories Workbench:

  • Select a preset filter

  • Automatically update the filter with a selected item

  • Manually configure the filter

Preset Filters

You can select a preset filter to focus on either Network Operations or Security Operations stories. When you select a preset filter, the story columns most relevant for that type of story are shown by default.

To select a preset filter:

  1. In the filter bar, click the Select Presets drop-down menu.

  2. Select the preset. The Stories Workbench is updated to show the stories that match the preset.

Automatically Filtering for an Item

As you hover over an item or field where a filter option is available, the filter button appears. Click the button to show the filter options:

  • Add to Filter - Adds the item to the filter, and the Stories Workbench now only shows stories that include this item. For example, if you filter for a specific Criticality score, the page only shows stories with that Criticality.

  • Exclude from Filter - Updates the filter to exclude this item, and the Stories Workbench now only shows stories that do NOT include this item.

You can continue to add items to the filter. Click the filter button again to update the filter and drill down further.

Selecting the Time Range

The default time range for the Stories Workbench is the previous two days. You can select a different time range to show a longer or shorter time period. For more information, see Setting the Time Range Filter.

The maximum date range for the Stories Workbench is 90 days.

Manually Configuring the Filter

You can manually configure the story filter for greater granularity to analyze the stories. After you configure the filter, it is added to the stories filter bar and the page is automatically updated to show the stories that match the new filter.

To create a filter:

  1. In the filter bar, click the Add icon.

  2. Start typing or select the Field.

  3. Select the Operator, which determines the relationship between the Field and the Value you are searching for.

  4. Select the Value.

  5. Click Add Filter. The filter is added to the filter bar and the Stories Workbench is updated to show stories based on the filters.
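A filter built from a Field, an Operator, and a Value can be pictured as a predicate applied to every story. The Python sketch below illustrates that idea only. The operator names, the story field names, and the rule that multiple filters are combined with AND are assumptions, and the operators available in the Stories Workbench may differ.

```python
import operator

# Hypothetical operator names; the list offered in the UI may differ.
OPERATORS = {
    "is": operator.eq,
    "is not": operator.ne,
    "greater than": operator.gt,
    "less than": operator.lt,
    "in": lambda value, options: value in options,
}


def matches(story, filters):
    """True if the story satisfies every (field, operator, value) filter."""
    return all(OPERATORS[op](story[field], value) for field, op, value in filters)


def apply_filters(stories, filters):
    # Assumption: filters in the filter bar are combined with AND.
    return [story for story in stories if matches(story, filters)]


stories = [
    {"id": 1, "source": "New York", "criticality": 8, "status": "Open"},
    {"id": 2, "source": "London", "criticality": 3, "status": "Closed"},
    {"id": 3, "source": "New York", "criticality": 5, "status": "Monitoring"},
]
urgent_ny = apply_filters(
    stories,
    [("source", "is", "New York"), ("criticality", "greater than", 6)],
)
```

In this sketch, Add to Filter corresponds to an "is" condition and Exclude from Filter to an "is not" condition on the selected item.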

Clearing the Filter

You can remove each item in the filter separately, or clear the entire filter.

To clear the filters for the Stories Workbench page:

  1. To clear a single filter, click the remove icon next to the filter.

  2. To clear all the filters, click X at the right end of the filter bar.

Drilling-Down and Analyzing Stories

You can click a story in the Stories Workbench to drill down and investigate the details on a separate page. This page contains a number of widgets that help you evaluate the potential issue identified by the Network XDR engine.

Investigating Stories with Playbooks

The Stories Workbench drill-down includes a link to a playbook that provides steps to investigate, troubleshoot, and resolve the issue. Each Network XDR story links to a playbook for the story's specific indication. For example, a playbook for stories with the indication Socket HA Not Ready status.

Generating AI Story Summaries

The Stories Workbench drill-down includes a tool that lets you create a natural language story description generated by AI, which provides rich context and helps you quickly assess the story. The story summary is generated dynamically to reflect the current state of the story. If the story updates with new information, you can regenerate the summary to reflect the changes.

For more about generating AI story summaries, see below.

  • The AI story summary is generated only on demand by the admin

Protecting Sensitive Data with Tokenization

For robust data security during the transmission of story data to third-party AI services, Cato uses tokenization to ensure all sensitive data remains in the Cato XDR platform. This involves replacing sensitive information with unique identifiers, or "tokens," rendering the data meaningless to unauthorized entities. Sensitive data is never exposed to third-party services. This approach ensures the confidentiality of the story's details, aligning with our commitment to robust data privacy and security standards.

Note: Due to the limitations of generative AI, the information provided in story summaries may occasionally contain inaccuracies.
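The general idea of tokenization described above can be shown with a short Python sketch. This is a generic illustration and not Cato's implementation. The `Tokenizer` class, the token format, and the choice of what counts as sensitive (IPv4 addresses plus a caller-supplied list of terms) are all assumptions.

```python
import re

# Simple IPv4 pattern, used here only as an example of sensitive data.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class Tokenizer:
    """Hypothetical tokenizer: the token mapping never leaves this object."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def _token_for(self, value):
        # Reuse the same token for repeated values so context is preserved.
        if value not in self._token_by_value:
            token = f"<TOKEN_{len(self._token_by_value) + 1}>"
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def tokenize(self, text, sensitive_terms=()):
        # Replace IP addresses first, then any explicitly listed terms.
        text = IPV4.sub(lambda m: self._token_for(m.group()), text)
        for term in sensitive_terms:
            text = text.replace(term, self._token_for(term))
        return text

    def detokenize(self, text):
        # Restore the original values in text that comes back.
        for token, value in self._value_by_token.items():
            text = text.replace(token, value)
        return text


tokenizer = Tokenizer()
original = "Site New York link WAN1 at 203.0.113.7 lost connectivity; 203.0.113.7 recovered."
masked = tokenizer.tokenize(original, sensitive_terms=["New York"])
restored = tokenizer.detokenize(masked)
```

Only `masked` would be sent to an external service in this sketch. The mapping needed to produce `restored` stays local, which is the property the tokenization approach relies on.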

Understanding the Story Drill-Down Widgets

[Screenshot: story drill-down page with numbered callouts for each widget]

These are the story drill-down widgets:

Item

Name

Description

1

Story summary

A summary of basic information about the story, including:

  • The story type

  • The name of the site associated with the story

  • The story's criticality

  • The number of times the issue occurred

  • The number of days since the story was generated

  • The story's current status

2

Story timeline

Shows a timeline of changes in the story status

3

Story Details

Basic information for analyzing the story, including a story description, when the story was created and updated with new related network incidents, and information about the site.

  • Click Generate AI Summary for a natural language story description that provides rich context and helps you quickly assess the story

  • Click the Playbook KB article link to open the playbook explaining how to troubleshoot and resolve this type of story

4

Current Site Overview

Information about the site in your network impacted by the story. The widget includes a link to view recent connection logs for the site, and drop-down menus with shortcuts to Site Configuration and Site Monitoring pages. This widget is the same as the Site Information Panel on the Topology page.

5

Incident Timeline

A list of the detected incidents for issues and resolutions in the story. For example, the Incident Timeline for a Link is down story includes these incidents:

  • WAN1 Active link of Primary socket - Disconnected from the Cato Cloud

  • WAN1 Active link of Primary socket - Successfully re-established connectivity to the Cato Cloud

  • No more occurrences of the issue after 120 minutes, story status changed from Monitoring to Closed

These are the columns for the Incident Timeline:

  • Created - When the incident was first detected

  • Validated - When the created incident was confirmed

  • Description - A description of the incident

  • Event - A link to show the Events page pre-filtered for the incident

Using the Response Policy for Network Stories

[Screenshot: XDR Response Policy for Network stories]

The XDR Response Policy helps you monitor XDR stories by defining when email notifications for stories are sent to admins. You can create rules that define the story criteria for when notifications are sent, and can use mailing lists to configure which admins receive the notifications. For example, you can create a rule to send notifications for a Network XDR story with high Criticality, and define the mailing list to include a helpdesk email address to automatically open a support ticket.

For more about creating Response Policy rules, see Creating the Response Policy for XDR Stories
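The relationship between story criteria and mailing lists can be sketched as rule matching. The Python example below is illustrative only. The `ResponseRule` fields, the story keys, and the example email addresses are hypothetical and do not reflect the actual Response Policy configuration options.

```python
from dataclasses import dataclass, field


@dataclass
class ResponseRule:
    """Hypothetical rule: notify a mailing list when a story matches."""
    name: str
    engine_type: str
    min_criticality: int
    mailing_list: list = field(default_factory=list)

    def matches(self, story):
        return (story["engine_type"] == self.engine_type
                and story["criticality"] >= self.min_criticality)


def recipients_for(story, rules):
    """Collect each address once, in rule order, for all matching rules."""
    recipients = []
    for rule in rules:
        if rule.matches(story):
            for address in rule.mailing_list:
                if address not in recipients:
                    recipients.append(address)
    return recipients


rules = [
    ResponseRule("High criticality network", "Network XDR", 8,
                 ["noc@example.com", "helpdesk@example.com"]),
    ResponseRule("All network stories", "Network XDR", 1, ["noc@example.com"]),
]
story = {"engine_type": "Network XDR", "criticality": 9}
notify = recipients_for(story, rules)
```

In this sketch, a high criticality Network XDR story reaches both the operations mailbox and the helpdesk address, which matches the example of using a helpdesk address to open a support ticket automatically.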
