How AI spam classification works

In addition to Rspamd's traditional spam detection (Bayesian classification, authentication checks, URL reputation, and crowd-sourced sender reputation), Cleanbox includes an AI content classifier that reads and understands every incoming email.

What it does

The AI classifier analyzes the sender address, subject line, and body of every email and produces three outputs:

Verdict: spam or ham (legitimate)
Confidence: A score from 0.0 to 1.0 indicating how certain the AI is
Reason: A plain-English explanation of why it classified the email that way

This verdict is integrated into Rspamd's scoring system as a CLEANBOX_AI_SPAM or CLEANBOX_AI_HAM symbol. It works alongside all existing spam checks as an additional layer of detection; it does not replace them.
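
As a rough illustration of the three outputs and how a verdict maps to a symbol name, here is a minimal Python sketch. The `Classification` type and the `symbol_for` helper are hypothetical names chosen for this example, not Cleanbox's actual code; only the symbol names come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """Hypothetical container for the three classifier outputs."""
    verdict: str       # "spam" or "ham"
    confidence: float  # 0.0 to 1.0
    reason: str        # plain-English explanation


def symbol_for(result: Classification) -> str:
    """Return the Rspamd symbol name that corresponds to a verdict."""
    if result.verdict == "spam":
        return "CLEANBOX_AI_SPAM"
    if result.verdict == "ham":
        return "CLEANBOX_AI_HAM"
    raise ValueError(f"unexpected verdict: {result.verdict!r}")


example = Classification(
    verdict="spam",
    confidence=0.93,  # illustrative value, not real classifier output
    reason="sender domain paypal-notifications-center.com is not paypal.com",
)
print(symbol_for(example))
```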

What it catches that traditional filters miss

Traditional spam filters are statistical — they recognize patterns from emails they have seen before. The AI understands context and meaning:

Threat type	Why traditional filters miss it	Why AI catches it
Brand impersonation	Content resembles real brand emails (same words, same HTML structure)	AI sees that `paypal-notifications-center.com` is not `paypal.com`
Sextortion	Unique language patterns not in Bayesian training data	AI recognizes the scam structure: threat → Bitcoin demand → deadline
Fake debt collection	Bayes has no data on this domain-brand combination	AI knows `bintopia.com` is not a real debt collection agency
Cold outreach	Proper SPF/DKIM, clean HTML, legitimate-looking infrastructure	AI recognizes unsolicited sales patterns regardless of technical legitimacy
Fake voicemails	Short, clean content with low spam score	AI identifies the phishing pattern: fake notification from impersonated provider

The X-Cleanbox-Explanation header

Every email classified by the AI gets a human-readable explanation injected as a header into the delivered message:

X-Cleanbox-Explanation: legitimate newsletter from official company domain careers.microsoft.com with job vacancy listings and subscription management links

X-Cleanbox-Explanation: classic sextortion scam with threats of webcam recordings, demands for Bitcoin payment, fake malware claims, and password extortion tactics

X-Cleanbox-Explanation: sender domain paypal-notifications-center.com is not paypal.com, urgency + fake dispute link

This header is visible in:

The Headers tab on the message detail page in Cleanbox
Your email client if you view full/raw headers (Gmail: "Show original", Outlook: "View source")
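
If you save a message's raw source, the header can also be read programmatically. The following is a small sketch using Python's standard `email` package; the sample message is made up for illustration, and only the header name and explanation text are taken from the examples above.

```python
from email import message_from_string
from email.policy import default

# A made-up raw message; only the X-Cleanbox-Explanation header matters here.
raw = (
    "From: service@paypal-notifications-center.com\r\n"
    "Subject: Action required\r\n"
    "X-Cleanbox-Explanation: sender domain paypal-notifications-center.com"
    " is not paypal.com, urgency + fake dispute link\r\n"
    "\r\n"
    "Body text\r\n"
)

msg = message_from_string(raw, policy=default)

# .get() returns None when the header is absent.
explanation = msg.get("X-Cleanbox-Explanation")
print(explanation)
```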

How the score is calculated

The AI symbol score is not fixed. It varies with two factors:

AI confidence (0.0 to 1.0) — How certain the AI is
Bayes agreement — Whether Rspamd's Bayesian classifier agrees or disagrees

When the AI and Bayes agree (both say spam, or both say ham), the score is larger in magnitude: more positive for spam, more negative for ham. When they disagree, or Bayes is neutral, the magnitude is smaller. This prevents the AI from overriding Bayes when one of them might be wrong.

AI says	Bayes says	Symbol	Typical score
spam	spam	`CLEANBOX_AI_SPAM`	+3.0 to +4.5
spam	ham or neutral	`CLEANBOX_AI_SPAM`	+2.0 to +3.0
ham	ham	`CLEANBOX_AI_HAM`	-3.5 to -4.5
ham	spam or neutral	`CLEANBOX_AI_HAM`	-2.0 to -3.0
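
The exact formula is not given here, so the following Python sketch is only one way to picture the table. It assumes, purely for illustration, that the score moves linearly through each "typical" range as confidence goes from 0.0 to 1.0. The function name `symbol_score` and the linear mapping are assumptions, not the real implementation.

```python
# Typical ranges copied from the table, stored as
# (score at confidence 0.0, score at confidence 1.0).
# Key: (AI verdict, whether Bayes agrees). A neutral Bayes result counts
# as "does not agree", matching the "ham or neutral" rows of the table.
RANGES = {
    ("spam", True): (3.0, 4.5),
    ("spam", False): (2.0, 3.0),
    ("ham", True): (-3.5, -4.5),
    ("ham", False): (-2.0, -3.0),
}


def symbol_score(verdict: str, confidence: float, bayes_agrees: bool) -> float:
    """Illustrative only: linear interpolation inside the typical range."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    start, end = RANGES[(verdict, bayes_agrees)]
    return round(start + (end - start) * confidence, 2)


print(symbol_score("spam", 1.0, True))   # top of the agreeing spam range
print(symbol_score("ham", 0.5, False))   # middle of the disagreeing ham range
```

Under this assumption, an agreeing verdict always lands further from zero than a disagreeing one at the same confidence, which is the behaviour the table describes.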

In the spam report

The AI symbol appears in your spam report alongside all other Rspamd symbols:

CLEANBOX_AI_SPAM    +3.29    AI content classifier detected spam/phishing patterns
BAYES_SPAM          +0.25    Message probably spam
RDNS_NONE           +0.50    No reverse DNS
---
Total:              +4.04

Without the AI symbol, this email would have scored 0.75 (0.25 + 0.50) and been delivered without issue. With the AI symbol, it scores 4.04, enough to be caught by the quarantine or spam threshold.
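
The arithmetic behind this comparison can be checked with a few lines of Python. The symbol names and scores are copied from the report above; the dictionary layout is just a convenient representation for the example.

```python
# Symbol scores copied from the sample spam report.
report = {
    "CLEANBOX_AI_SPAM": 3.29,
    "BAYES_SPAM": 0.25,
    "RDNS_NONE": 0.50,
}

# Total with every symbol, rounded to two decimals like the report.
total = round(sum(report.values()), 2)

# Total if the AI symbol had not been added.
without_ai = round(
    sum(score for name, score in report.items()
        if not name.startswith("CLEANBOX_AI_")),
    2,
)

print(total, without_ai)  # prints: 4.04 0.75
```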

Using AI symbols in filters

You can create filter rules based on the AI symbols:

Zero-tolerance AI spam blocking: Spam symbol equals CLEANBOX_AI_SPAM → Deny
Move AI-flagged email to a folder: Spam symbol equals CLEANBOX_AI_SPAM → Allow, deliver to "AI Flagged" folder

Caching

Bulk spam campaigns often send identical content from different sender addresses. Cleanbox caches AI classifications by content hash (subject + body). The first email in a campaign triggers the analysis. Every subsequent copy with identical content gets an instant cache hit, with no duplicate processing and no additional cost.

Privacy

The AI classifier processes the sender address, subject, and body text of each email. This processing happens in real time during the SMTP transaction, and the result (verdict, confidence, reason) is cached for future identical content. The full email body is not stored by the classification system. Only the hash (for cache lookup) and the short reason text (max 200 characters) are persisted.

Availability

AI spam classification is active on all accounts. It runs automatically alongside traditional spam detection. No configuration is needed.

For a deeper technical explanation, see How We Built AI-Powered Spam Detection on our blog.