Understanding AI-Driven Automated Domain Categorization Correction

Overview

Cato’s domain categorization service assigns internet domains to predefined categories. These categories are used to enforce security policies and control user access.

In some cases, legitimate domains may be categorized in a way that does not accurately reflect their purpose. For example, a valid business service might be placed in a category that is commonly blocked by default security policies. This can unintentionally prevent users from accessing required services.

To address this at scale, Cato developed an AI-driven automated domain categorization correction system. This system continuously identifies and corrects inaccurate domain categorizations to improve security accuracy and user experience.

Categorization corrections follow a controlled and predictable process: the correction process runs every 24 hours, and its results are deployed across PoPs every 24 hours. As a result, once a misclassification is detected or reported, it is typically corrected within 24–48 hours. All corrections are applied globally and consistently across the Cato network, ensuring uniform policy enforcement.
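One way to read the 24–48 hour figure is as two 24-hour cycles in sequence: a report waits for the next correction run, and the correction then waits for the next PoP deployment. The sketch below is illustrative arithmetic only. It assumes both cycles are fixed, independent 24-hour schedules, which this article does not state, and the function names and schedule anchors are hypothetical.

```python
from datetime import datetime, timedelta

CYCLE = timedelta(hours=24)  # both cycles are 24 hours, per the article


def next_tick(after: datetime, anchor: datetime, period: timedelta = CYCLE) -> datetime:
    """Return the first scheduled run strictly after `after`.

    `anchor` is any past or future moment at which the job runs
    (hypothetical; real schedule times are not documented here).
    """
    cycles = (after - anchor) // period + 1
    return anchor + cycles * period


def estimated_live_time(reported_at: datetime,
                        correction_anchor: datetime,
                        deploy_anchor: datetime) -> datetime:
    """Estimate when a correction is live on PoPs under the stated assumptions."""
    corrected_at = next_tick(reported_at, correction_anchor)
    return next_tick(corrected_at, deploy_anchor)


# Example with made-up schedule anchors.
reported = datetime(2024, 1, 1, 10, 0)
live = estimated_live_time(
    reported,
    correction_anchor=datetime(2024, 1, 1, 0, 0),
    deploy_anchor=datetime(2024, 1, 1, 6, 0),
)
delay = live - reported
```

Under these assumptions the delay never exceeds 48 hours, because each of the two waits is at most one full cycle.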

Scope of Automated Recategorization

The automated process focuses on the most common and impactful miscategorization patterns observed across the Cato network.

To ensure safety and consistency:

Automated reevaluations are applied only to domains with demonstrated usage across multiple customers and users
Customer-specific or edge-case domains are not globally overridden
Every domain undergoes a structured reevaluation process before any category change is made

If an inaccuracy is confirmed, a new category is assigned. If the existing category is validated, it remains unchanged.
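The scope rules above can be sketched as a simple eligibility check followed by a keep-or-change decision. This is a minimal illustration, not Cato's implementation: the class, function names, and the threshold values are hypothetical, and the actual usage thresholds are not published in this article.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds: the article only says "multiple customers and users".
MIN_CUSTOMERS = 2
MIN_USERS = 2


@dataclass(frozen=True)
class DomainUsage:
    domain: str
    customer_count: int  # distinct customers that accessed the domain
    user_count: int      # distinct users that accessed the domain


def is_eligible(usage: DomainUsage) -> bool:
    """Only domains with usage across multiple customers and users qualify."""
    return usage.customer_count >= MIN_CUSTOMERS and usage.user_count >= MIN_USERS


def resolve_category(current: str, reevaluated: Optional[str]) -> str:
    """Return the corrected category, or keep the current one if it was validated.

    `reevaluated` is None when the reevaluation confirmed the existing category.
    """
    if reevaluated is None or reevaluated == current:
        return current
    return reevaluated


shared = DomainUsage("example.com", customer_count=40, user_count=900)
single_tenant = DomainUsage("intranet.example.net", customer_count=1, user_count=25)
```

In this sketch a customer-specific domain such as `single_tenant` is never considered for a global change, regardless of what the reevaluation would conclude.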

Supported Recategorizations

Automated Domain Categorization Correction supports these scenarios:

Legitimate and Common Domains Incorrectly Labeled as Risky

Risky categories are those commonly blocked by default security policies. Examples include:

Uncategorized
Parked Domains
Phishing
Malware
Adult Content

If a legitimate domain is incorrectly placed in one of these categories, users may experience unintended access blocks. The automated system identifies such domains when they are commonly accessed by users and reevaluates them.

Domains Reported by End Users as Incorrectly Categorized

Users can report a suspected misclassification directly from the block page using the Report Wrong Category link.

Reported domains are collected and automatically reevaluated by the system. If a misclassification is confirmed, the category is corrected.

How Automated Recategorization Works

When a domain is selected for reevaluation, it is processed through this automated workflow:

Step 1: Context Collection

Multiple signals are gathered to understand the domain’s purpose. Decisions are never based solely on the domain name.

Signals include:

Home page title and content
Usage metrics across Cato customers
Client characteristics (for example, browser, command line, or application access)
HTTP methods
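The signals listed above could be grouped into a single record per domain, with a guard that refuses to proceed when nothing but the domain name is available. The sketch below is an assumption about how such a record might look. The field names and the `has_signals_beyond_name` helper are hypothetical and do not describe Cato's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainContext:
    """Hypothetical container for the signals collected in Step 1."""
    domain: str
    home_page_title: str = ""
    home_page_content: str = ""
    customer_count: int = 0                    # usage metrics across customers
    user_count: int = 0
    client_types: frozenset = frozenset()      # e.g. {"browser", "command line"}
    http_methods: frozenset = frozenset()      # e.g. {"GET", "POST"}


def has_signals_beyond_name(ctx: DomainContext) -> bool:
    """True if at least one signal other than the domain name was collected."""
    return any((
        bool(ctx.home_page_title.strip()),
        bool(ctx.home_page_content.strip()),
        ctx.customer_count > 0,
        ctx.user_count > 0,
        bool(ctx.client_types),
        bool(ctx.http_methods),
    ))


name_only = DomainContext(domain="example.com")
rich = DomainContext(
    domain="example.com",
    home_page_title="Example Invoicing",
    customer_count=12,
    user_count=340,
    client_types=frozenset({"browser"}),
    http_methods=frozenset({"GET", "POST"}),
)
```

A record like `name_only` would be held back from analysis in this sketch, reflecting the rule that decisions are never based solely on the domain name.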

Step 2: AI-Assisted Analysis

An AI model evaluates the collected signals and suggests the most appropriate category. Each suggestion includes:

Confidence score
Supporting rationale

Step 3: Safety-First Validation

Before applying any change, additional safeguards are enforced:

Enhanced validation for high-risk categories – Domains suspected of belonging to a high-risk category, for example Phishing or Malware, are reevaluated by an additional machine learning model trained to detect malicious domains using multiple third-party security augmentations.
Conservative confidence thresholds – Only category changes that meet strict confidence requirements are applied.

This layered approach ensures that corrections improve accuracy without introducing risk.
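Steps 2 and 3 together can be pictured as a suggestion record passing through two gates: a confidence threshold and, for high-risk categories, a second opinion from a malicious-domain detector. The sketch below is a simplified illustration under stated assumptions: the threshold value, the `Suggestion` fields, and the `is_malicious` callback are hypothetical stand-ins, and the second model is represented only by a function that returns a verdict.

```python
from dataclasses import dataclass
from typing import Callable

HIGH_RISK_CATEGORIES = frozenset({"Phishing", "Malware"})
MIN_CONFIDENCE = 0.9  # hypothetical; the real thresholds are not published


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical shape of the Step 2 output."""
    domain: str
    current_category: str
    suggested_category: str
    confidence: float  # assumed to be in the range 0.0 to 1.0
    rationale: str


def should_apply(s: Suggestion, is_malicious: Callable[[str], bool]) -> bool:
    """Decide whether a suggested category change passes the Step 3 safeguards."""
    if s.suggested_category == s.current_category:
        return False  # existing category validated, nothing to change
    if s.confidence < MIN_CONFIDENCE:
        return False  # conservative confidence threshold not met
    if s.current_category in HIGH_RISK_CATEGORIES and is_malicious(s.domain):
        return False  # second model still considers the domain malicious
    return True


suggestion = Suggestion(
    domain="example.com",
    current_category="Phishing",
    suggested_category="Business",
    confidence=0.97,
    rationale="Home page and usage pattern match an invoicing service.",
)
applied = should_apply(suggestion, is_malicious=lambda domain: False)
blocked = should_apply(suggestion, is_malicious=lambda domain: True)
```

The ordering here is a design choice of the sketch: cheap checks run first, and the detector is consulted only for domains leaving a high-risk category, so a confident suggestion alone is never enough to release such a domain.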