This article explains how to create a Content Profile for Cato's DLP service. A Content Profile includes one or more DLP Data Types, and you can use the profile in an Application Control policy or a SaaS Security API Data Protection policy.
Note: This is an Early Availability (EA) feature that is only available for limited release. For more information, contact your Cato Networks representative or send an email to ea@catonetworks.com.
Cato's DLP service uses hundreds of pre-defined Data Types to identify sensitive data and content within a traffic flow. The pre-defined Data Types are grouped into categories, most of which are specific to a country. This lets you create a granular policy that applies only to the relevant sensitive data.
The DLP service also supports custom data types including User Defined Data Types and Sensitivity Labels. For more about custom data types, see the following articles:
The DLP Content Profile is a global object for the Cato Management Application which includes one or more Data Types.
This section summarizes the different categories of pre-defined Data Types that you can add to a profile in the Cato Management Application.
- Document classification
- Financial data
- HIPAA - only relevant to the USA
- Health care
- Item identifiers - such as postal codes and license keys
- Payment Card Industry Data Security Standard (PCI DSS) - credit card data
- Personally Identifiable Information - PII
- UK National Health Service
The pre-defined data types in the DLP service include machine learning (ML) based data classifiers trained to identify different types of sensitive documents. Using an advanced data science similarity model, the ML Classifiers offer better adaptability and accuracy in detecting sensitive data, as they can dynamically learn and evolve with changing data patterns. For example, instead of needing to update a custom data type whenever a medical form is updated, you can use the Records ML Classifier to detect all medical records. The ML Classifiers provide comprehensive detection for categories such as medical records, tax forms, patent documents, resumes, immigration forms, and more. For more about ML Classifiers, see below.
- ML Classifier data types support English language documents
- OCR image scanning is not supported for ML Classifier data types
Note: Please contact SaaSecAPI@catonetworks.com or your official Cato reseller for more information about using ML Classifiers for DLP.
You can configure a Content Profile so the DLP engine includes image files in content matching for the profile. The engine uses OCR to extract text that appears in image files and sends the extracted text for content matching. The OCR scanning option appears when configuring a Content Profile.
The DLP service supports OCR scanning for up to 5 languages for your account. By default, only English is configured. When you configure the languages you want to scan, the DLP engine scans image files for content in all of the configured languages. The engine scans for the languages in the priority order that you set when you configure them. Once the DLP engine detects a match for sensitive data in one language, the scan ends, and the image isn't scanned for the other languages.
Setting a language with a high priority means that the engine will scan for that language before lower priority languages, and there is a higher probability that content in that language will be accurately detected. For example, if Japanese is set as the second language and Korean as the third language, the OCR scan will first try to detect Japanese text and it is more likely that Japanese will be accurately detected.
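The priority behavior described above can be sketched in a few lines of Python. This is only an illustration of the documented logic (scan in priority order, stop at the first match, at most 5 languages), not Cato's implementation. The names `scan_image`, `extract_text`, and `has_sensitive_match` are hypothetical, and the OCR step is replaced by a stand-in function.

```python
from typing import Callable, Optional, Sequence

MAX_OCR_LANGUAGES = 5  # per-account limit stated in this article


def scan_image(
    languages: Sequence[str],
    extract_text: Callable[[str], str],
    has_sensitive_match: Callable[[str], bool],
) -> Optional[str]:
    """Scan in priority order and return the first language that yields a match.

    `languages` is ordered from highest to lowest priority.
    `extract_text` and `has_sensitive_match` are hypothetical stand-ins for
    the OCR step and the content-matching step.
    """
    if len(languages) > MAX_OCR_LANGUAGES:
        raise ValueError("at most 5 OCR languages can be configured")
    for language in languages:
        text = extract_text(language)
        if has_sensitive_match(text):
            return language  # the scan ends; lower-priority languages are skipped
    return None


# Stand-in OCR: only the Japanese and Korean passes "see" sensitive text.
tried = []


def fake_ocr(language: str) -> str:
    tried.append(language)
    return {"Japanese": "card 4111", "Korean": "card 4111"}.get(language, "")


matched = scan_image(["English", "Japanese", "Korean"], fake_ocr, lambda t: "4111" in t)
print(matched, tried)  # Japanese ['English', 'Japanese']
```

In this sketch, Korean is never scanned because the Japanese pass already produced a match, which mirrors the early exit described above.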
For more information about defining languages for OCR scans, see Configuring Languages for OCR Scanning below.
Use the DLP Configuration page to create and edit Content Profiles. When you add Data Types to a profile, you can filter the types by a specific country or by Universal (for all countries). In addition, you can sort the Data Types in ascending or descending alphabetical order according to the category, name, or country.
When you add multiple Data Types to a profile, select the relationship between them:
- Any (OR) - Match at least one of the Data Types in the profile
- All (AND) - Match all the Data Types in the profile (otherwise, the rule with this profile is ignored)
A Data Control rule can contain up to 20 Data Types across all Content Profiles.
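As a rough illustration of the Any (OR) and All (AND) relationships, the following Python sketch evaluates a profile against the set of Data Types detected in some content. The function name `profile_matches`, the `relation` values, and the Data Type names are assumptions made for this example and do not reflect the product's internal logic or API.

```python
from typing import Iterable, Set


def profile_matches(
    profile_types: Iterable[str],
    detected_types: Set[str],
    relation: str,
) -> bool:
    """Return True if the detected Data Types satisfy the profile.

    relation="any" models Any (OR); relation="all" models All (AND).
    """
    types = list(profile_types)
    if not types:
        raise ValueError("a profile includes one or more Data Types")
    if relation == "any":
        return any(t in detected_types for t in types)
    if relation == "all":
        return all(t in detected_types for t in types)
    raise ValueError(f"unknown relation: {relation!r}")


# Hypothetical Data Type names, for illustration only.
profile = ["Credit Card Number", "US Social Security Number"]
detected = {"Credit Card Number"}

print(profile_matches(profile, detected, "any"))  # True
print(profile_matches(profile, detected, "all"))  # False
```

With All (AND), the second call returns False because only one of the two Data Types was detected, so a rule using this profile would be ignored for that content.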
When you configure a Content Profile, you can optionally enable OCR scanning for the profile.
To create a DLP Content Profile:
- From the navigation menu, select Security > DLP Configuration, and select or expand Content Profiles.
- Click New. The Add Content Profile panel opens.
- Create the profile and add the Data Types.
- Optionally, select OCR Scan Enabled for the profile.
- Click Apply and then click Save.
The Data Types Catalog page shows all the Data Types that you can add to a profile, and lets you sort the types according to the columns on the page. This lets you research and understand more about specific Data Types that you are using in your organization. The catalog also shows the Threshold for each data type, indicating the minimum number of occurrences to activate the data type. For more about data type thresholds, see Working with Custom Data Types for DLP.
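To make the Threshold idea concrete, here is a minimal Python sketch that counts occurrences of a pattern and activates a data type only when the count reaches the threshold. The function `data_type_activated` and the five-digit pattern are hypothetical. The actual detection logic of the pre-defined Data Types is not described in this article.

```python
import re


def data_type_activated(text: str, pattern: str, threshold: int) -> bool:
    """Return True when the pattern occurs at least `threshold` times in `text`."""
    occurrences = len(re.findall(pattern, text))
    return occurrences >= threshold


# Hypothetical pattern: a standalone five-digit number (for example, a postal code).
FIVE_DIGITS = r"\b\d{5}\b"
sample = "Ship to 10001 and 94105"

print(data_type_activated(sample, FIVE_DIGITS, threshold=2))  # True
print(data_type_activated(sample, FIVE_DIGITS, threshold=3))  # False
```

The sample contains two matches, so a threshold of 2 activates the data type and a threshold of 3 does not.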
The ML Classifiers page shows all the ML Classifiers that you can add to a profile. The page groups the classifiers by category and provides a description for each classifier.
Use the Settings tab in the DLP Configuration page to define which languages the DLP engine scans for in image files. Select up to 5 languages and set the order of priority. By default, only English is configured.
To configure languages for OCR scanning:
- From the navigation menu, select Security > DLP Configuration, and select or expand Settings.
- In the OCR Languages section, select up to 5 languages.
- Drag and drop the languages in the list to define the scanning priority.
- Click Save. The OCR language settings are configured for the account.