This article discusses the details of event and Data Lake storage for your Cato account.
Cato maintains a Data Lake that contains the data recorded by Cato networking and security functions. Data such as event information is added to the Data Lake in real time and stored for a period defined by the customer’s contract before being discarded.
Cato stores events and data for up to three months, free of charge, as part of the service. Customers may choose to increase the storage and extend the retention period beyond three months. This requires the purchase of Data Lake storage. Customers may also forward their data to a SIEM, using Cato’s Event Integration APIs, or to AWS or Azure datastores using cloud storage integration.
This article applies to all Cato accounts as of January 1st, 2024(*).
Events are stored in real time and can be tracked on the Events page of the Cato Management Application (Monitoring > Events).
- Cato stores a core set of key security and connectivity events for each customer
- Customers can select, within policies, additional events to be recorded
- Customer licenses define the maximum number of events that can be stored per hour
- Events in excess of this number are discarded for the remainder of the hour
The primary unit of measurement for Data Lake storage is the number of events stored per hour.
For each customer, the number of events that were stored in the last hour is tracked by a counter.
- At the start of each hour, the counter is reset
- When the number of events reaches a threshold set for the customer, further events are discarded for the remainder of that hour; however, Cato continues to store system events that are related to Cato's processes
- Cato generally allows headroom above the threshold, to reduce the likelihood of discard
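The counter behaviour described above can be sketched in a few lines of Python. This is an illustrative model only, not Cato's implementation: the class name `HourlyEventCounter`, its fields, and the choice not to count system events against the limit are all assumptions, and the headroom above the threshold is not modelled.

```python
from dataclasses import dataclass


@dataclass
class HourlyEventCounter:
    """Toy model of a per-customer hourly event counter (hypothetical names)."""

    limit: int               # events per hour allowed for this customer
    current_hour: int = -1   # index of the hour currently being counted
    stored: int = 0          # events stored so far in the current hour

    def offer(self, timestamp_s: float, is_system_event: bool = False) -> bool:
        """Return True if the event is stored, False if it is discarded."""
        hour = int(timestamp_s // 3600)
        if hour != self.current_hour:
            # A new hour has started, so the counter is reset.
            self.current_hour = hour
            self.stored = 0
        if is_system_event:
            # Assumption: system events are always stored and are not
            # counted against the customer's limit.
            return True
        if self.stored >= self.limit:
            # Threshold reached: discard for the remainder of the hour.
            return False
        self.stored += 1
        return True


counter = HourlyEventCounter(limit=3)
print([counter.offer(t) for t in (0, 10, 20, 30)])  # [True, True, True, False]
print(counter.offer(3600))                          # True (new hour, counter reset)
```

A tiny limit of 3 is used here only to make the discard visible; a real limit would be in the millions of events per hour.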
The details for the default Cato rate limiting for events are as follows:
- Cato allows up to 2.5 million events per hour, free of charge
- If more than 2.5 million are generated in an hour, the events in excess of 2.5 million are discarded
- Customers have the option to purchase rate limiting for more than 2.5 million events per hour
Customers will generally find that the default event rate limit is sufficient for their needs, unless they choose to follow the best practice of logging all events.
For contracts and renewals starting from January 1st, 2024, the default retention period for events is 3 months.
- After the retention period (i.e. after 3 months), event data is discarded
- Customers may purchase additional data storage if they wish to store event data for more than three months. If a customer chooses to pay for storage, no allowance is made for the free storage that is provided by default: all event storage is chargeable.
- For more about purchasing additional data storage, please contact your Cato representative.
Cato supports the following event storage options:
- Directly in the Cato Management Application (see Analyzing Events in Your Network)
- A high-scale feed to Cloud Storage such as AWS S3 and Azure Blob Storage
- Using the Cato API
Data Lake storage is purchased in units of 2.5 million events per hour. So, for example:
- One unit of Data Lake Storage will allow up to 2.5 million events per hour
- Two units will allow up to 5 million events per hour
Data Lake storage units define the peak number of events that can be stored per hour. A period when fewer events are stored per hour will have no bearing on the number that can be stored in future hours.
Data Lake storage units are available in three variants, according to the storage duration required:
- A three-month unit
- A six-month unit
- A twelve-month unit
The chosen variant applies to all units; it is not possible to mix units of different durations.
The table below illustrates the use of Data Storage Units to cover customer event storage requirements.
Peak number of events per hour that the customer wishes to be able to store | Retention period required | Number of Data Storage Units required | Type of Data Storage Unit required |
---|---|---|---|
Up to 2.5 million | 3 months | None | None |
Up to 2.5 million | 6 months | 1 | 6-month unit |
Up to 5 million | 3 months | 2 | 3-month unit |
Up to 7.5 million | 12 months | 3 | 12-month unit |
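The sizing rule behind this table can be written as a short function. The sketch below is an illustration rather than an official calculator: the function name `storage_units` and the constants are hypothetical, and it assumes that a retention period between two variants is covered by the next longer variant.

```python
from typing import Optional, Tuple

EVENTS_PER_UNIT = 2_500_000   # one unit covers 2.5 million events per hour
FREE_RETENTION_MONTHS = 3     # retention included with the service
VARIANTS = (3, 6, 12)         # unit durations available, in months


def storage_units(peak_events_per_hour: int,
                  retention_months: int) -> Tuple[int, Optional[int]]:
    """Return (units, variant_months); (0, None) means no purchase is needed."""
    if retention_months > max(VARIANTS):
        raise ValueError("retention exceeds the longest unit variant")
    within_free_rate = peak_events_per_hour <= EVENTS_PER_UNIT
    within_free_retention = retention_months <= FREE_RETENTION_MONTHS
    if within_free_rate and within_free_retention:
        return 0, None
    # Once storage is paid for, the free allowance is not deducted,
    # so the full peak is divided by the unit size and rounded up.
    units = max(1, -(-peak_events_per_hour // EVENTS_PER_UNIT))
    # Assumption: choose the shortest variant that covers the retention period.
    variant = min(v for v in VARIANTS if v >= retention_months)
    return units, variant


print(storage_units(2_500_000, 6))   # (1, 6)
print(storage_units(5_000_000, 3))   # (2, 3)
```

Both results match the corresponding rows of the table.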
Customers with a stable history of event storage can inspect the event chart in the Cato Management Application to see how many events are being generated. They can use the peaks in this chart to consider their requirements for storage.
In the example chart below, the peaks reach a maximum of just over 400,000 events per hour. This would be covered by the free storage, if three months’ retention is sufficient.
In the example chart below, the number of events per hour exceeds 2 million in every hour, and the highest peak approaches 3 million. This is more than the free storage can cover. Two paid storage units would cover these requirements, allowing up to 5 million events per hour to be stored.
Note that the exact height of each bar can be inspected by hovering the cursor over the bar, as illustrated in the chart below.
Further points to note:
- These examples cover a small period, for convenience. A longer analysis period would be prudent.
- The time period represented by each bar will change according to the time period covered by the chart. Pay attention to the Time Series Granularity as you change the time period covered.
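The granularity point matters because a bar that spans more than one hour cannot be read directly as an hourly peak. The following sketch assumes that each bar reports the total number of events in its interval; that is an assumption about the chart, not something this article states, and the function names are hypothetical.

```python
def hourly_averages(bar_totals, bar_hours):
    """Average events per hour for each bar, given the bar width in hours."""
    if bar_hours <= 0:
        raise ValueError("bar_hours must be positive")
    return [total / bar_hours for total in bar_totals]


def peak_lower_bound(bar_totals, bar_hours):
    """Highest per-bar hourly average; the busiest single hour may be higher."""
    return max(hourly_averages(bar_totals, bar_hours))


# Three daily bars (24 hours each), using made-up totals.
daily_totals = [36_000_000, 48_000_000, 60_000_000]
print(peak_lower_bound(daily_totals, bar_hours=24))  # 2500000.0
```

Because an average hides the busiest hour within each bar, the result is only a lower bound on the true hourly peak. Sizing is more reliable when the chart is viewed at one-hour granularity.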
Event generation is correlated to both the total bandwidth in use across the network and the number of SDP users supported.
Therefore, customers without a history of event generation can estimate their likely storage requirements by considering first, the sum of the bandwidths in use at each site, and second, the number of their SDP users.
Tables are provided below to assist with estimating the peak events generated per hour. Follow this procedure to calculate requirements from the tables:
- Find the row in the Total Bandwidth table that corresponds to peak bandwidth bought for the network. Read off the estimated peak events per hour that will be generated
- Find the row in the SDP Clients table that corresponds to the number of SDP Clients in use. Read off the estimated peak events per hour that will be generated
- Add the two figures
- Divide the total events per hour by 2.5 million, and round up, to estimate the number of Data Lake Storage Units required.
Use these tables to estimate the peak number of events per hour generated for a customer. They assume that the customer is logging all events.
Total Bandwidth | Estimated peak events per hour | SDP Clients | Estimated peak events per hour |
---|---|---|---|
Up to 2.5Gbps | 1,000,000 | Up to 3K | 1,000,000 |
2.5-6Gbps | 5,000,000 | 3K-7K | 5,000,000 |
6-9Gbps | 7,500,000 | 7K-11K | 7,500,000 |
9-12Gbps | 10,000,000 | 11K-15K | 10,000,000 |
12-15Gbps | 12,500,000 | 15K-19K | 12,500,000 |
15-18Gbps | 15,000,000 | 19K-23K | 15,000,000 |
18-21Gbps | 17,500,000 | 23K-27K | 17,500,000 |
21-24Gbps | 20,000,000 | 27K-31K | 20,000,000 |
24-27Gbps | 22,500,000 | 31K-35K | 22,500,000 |
27-30Gbps | 25,000,000 | 35K-39K | 25,000,000 |
30-33Gbps | 27,500,000 | 39K-43K | 27,500,000 |
For example, using the table above:
- A total of 3 Gbps bandwidth across all sites would generate an estimated peak of five million events per hour
- A total of 5,000 SDP clients would generate an additional estimated peak of five million events per hour
- Therefore, the customer could expect a peak of 5 + 5 = 10 million events per hour
- This could be covered by buying four Data Lake Storage units of the appropriate duration.
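The lookup-and-add procedure can also be expressed as a small script. The sketch below copies the figures from the table and is illustrative only: the names `lookup` and `estimate_units` are hypothetical, and it assumes that a value falling exactly on a range boundary belongs to the lower range.

```python
EVENTS_PER_UNIT = 2_500_000

# (upper bound of range, estimated peak events per hour), copied from the table.
BANDWIDTH_GBPS = [
    (2.5, 1_000_000), (6, 5_000_000), (9, 7_500_000), (12, 10_000_000),
    (15, 12_500_000), (18, 15_000_000), (21, 17_500_000), (24, 20_000_000),
    (27, 22_500_000), (30, 25_000_000), (33, 27_500_000),
]
SDP_CLIENTS = [
    (3_000, 1_000_000), (7_000, 5_000_000), (11_000, 7_500_000),
    (15_000, 10_000_000), (19_000, 12_500_000), (23_000, 15_000_000),
    (27_000, 17_500_000), (31_000, 20_000_000), (35_000, 22_500_000),
    (39_000, 25_000_000), (43_000, 27_500_000),
]


def lookup(table, value):
    """Return the estimate for the first range whose upper bound covers value."""
    for upper, events in table:
        if value <= upper:
            return events
    raise ValueError("value is beyond the last row of the table")


def estimate_units(total_gbps, sdp_clients):
    """Return (estimated peak events per hour, storage units, rounded up)."""
    peak = lookup(BANDWIDTH_GBPS, total_gbps) + lookup(SDP_CLIENTS, sdp_clients)
    units = -(-peak // EVENTS_PER_UNIT)  # integer ceiling division
    return peak, units


# 8 Gbps total bandwidth and 2,000 SDP clients: 7.5M + 1M events per hour.
print(estimate_units(8, 2_000))  # (8500000, 4)
```

As with the tables themselves, the result assumes that the customer is logging all events.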
The unit of measure for Data Lake storage is the number of events stored per hour. The volume of data involved is not used in the calculation or purchase of storage units and it is not reported by the Cato Management Application.
However, customers may wish to estimate the storage implications if they plan to export data to external storage or a SIEM. Customers can make a rough estimate of the volume of data involved by assuming that one unit of Data Lake storage (2.5 million events per hour) is very roughly equivalent to 180 GB per month, as illustrated in the table below.
Note that this is a very rough estimate. Data Lake Storage Units define the maximum number of events that can be stored in an hour. A customer who buys storage units to cope with occasional large peaks in event storage will have a very different external storage requirement from a customer who buys the same number of units to cope with a consistently high number of events stored.
The following table shows a very rough estimate of the total GB according to the retention period:
Events per hour | Storage Units | GB per month (estimated) | Total GB, 3 months | Total GB, 6 months | Total GB, 12 months |
---|---|---|---|---|---|
2.5 million | 1 | 180 | 540 | 1080 | 2160 |
5 million | 2 | 360 | 1080 | 2160 | 4320 |
7.5 million | 3 | 540 | 1620 | 3240 | 6480 |
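The arithmetic behind these estimates is a simple product of units, the 180 GB per month figure, and the retention period. A minimal sketch, with a hypothetical function name:

```python
GB_PER_UNIT_PER_MONTH = 180  # very rough figure quoted in this article


def estimated_gb(units: int, retention_months: int) -> int:
    """Very rough total GB for a number of storage units over a retention period."""
    return units * GB_PER_UNIT_PER_MONTH * retention_months


print(estimated_gb(1, 3))  # 540
print(estimated_gb(2, 6))  # 2160
```

Because storage units define a peak rather than an average, a customer whose event rate only occasionally approaches the limit can expect an actual volume well below these figures.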
(*) Some contracts with Cato may include terms that differ from the information in this article