How AI spam classification works
In addition to Rspamd's traditional spam detection (Bayesian classification, authentication checks, URL reputation, and crowd-sourced sender reputation), Cleanbox includes an AI content classifier that reads and understands every incoming email.
What it does
The AI classifier analyzes the sender address, subject line, and body of every email and produces three outputs:
- Verdict: spam or ham (legitimate)
- Confidence: A score from 0.0 to 1.0 indicating how certain the AI is
- Reason: A plain-English explanation of why it classified the email that way
This verdict is integrated into Rspamd's scoring system as a CLEANBOX_AI_SPAM or CLEANBOX_AI_HAM symbol. It works alongside all existing spam checks — not replacing them, but adding an additional layer of detection.
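As a rough illustration, the three outputs and their mapping to a symbol could be modeled like this. The class name `Classification` and the `symbol` property are hypothetical names chosen for this sketch, not Cleanbox's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """Hypothetical container for the three classifier outputs."""

    verdict: str       # "spam" or "ham" (legitimate)
    confidence: float  # 0.0 to 1.0
    reason: str        # plain-English explanation

    def __post_init__(self):
        # Reject values outside the ranges the article describes.
        if self.verdict not in ("spam", "ham"):
            raise ValueError(f"unexpected verdict: {self.verdict!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")

    @property
    def symbol(self) -> str:
        # The verdict decides which Rspamd symbol is added.
        if self.verdict == "spam":
            return "CLEANBOX_AI_SPAM"
        return "CLEANBOX_AI_HAM"


result = Classification("spam", 0.93, "urgency + fake dispute link")
print(result.symbol)  # CLEANBOX_AI_SPAM
```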
What it catches that traditional filters miss
Traditional spam filters are statistical — they recognize patterns from emails they have seen before. The AI understands context and meaning:
| Threat type | Why traditional filters miss it | Why AI catches it |
|---|---|---|
| Brand impersonation | Content resembles real brand emails (same words, same HTML structure) | AI sees that paypal-notifications-center.com is not paypal.com |
| Sextortion | Unique language patterns not in Bayesian training data | AI recognizes the scam structure: threat → Bitcoin demand → deadline |
| Fake debt collection | Bayes has no data on this domain-brand combination | AI knows bintopia.com is not a real debt collection agency |
| Cold outreach | Proper SPF/DKIM, clean HTML, legitimate-looking infrastructure | AI recognizes unsolicited sales patterns regardless of technical legitimacy |
| Fake voicemails | Short, clean content with low spam score | AI identifies the phishing pattern: fake notification from impersonated provider |
The X-Cleanbox-Explanation header
Every email classified by the AI gets a human-readable explanation injected as a header into the delivered message:
X-Cleanbox-Explanation: legitimate newsletter from official company domain careers.microsoft.com with job vacancy listings and subscription management links
X-Cleanbox-Explanation: classic sextortion scam with threats of webcam recordings, demands for Bitcoin payment, fake malware claims, and password extortion tactics
X-Cleanbox-Explanation: sender domain paypal-notifications-center.com is not paypal.com, urgency + fake dispute link
This header is visible in:
- The Headers tab on the message detail page in Cleanbox
- Your email client if you view full/raw headers (Gmail: "Show original", Outlook: "View source")
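If you save a message's raw source, the header can also be read programmatically. The sketch below uses Python's standard `email` package. The sample message (addresses, subject, body) is made up for illustration, and only the header value comes from the examples above.

```python
from email import message_from_string
from email.policy import default
from typing import Optional

# Made-up sample message; only the X-Cleanbox-Explanation value is
# taken from the examples in this article.
RAW = """\
From: sender@example.com
To: you@example.com
Subject: New job vacancies
X-Cleanbox-Explanation: legitimate newsletter from official company domain careers.microsoft.com with job vacancy listings and subscription management links

Hello, here are this week's vacancies.
"""


def cleanbox_explanation(raw: str) -> Optional[str]:
    """Return the explanation header text, or None if it is absent."""
    msg = message_from_string(raw, policy=default)
    value = msg["X-Cleanbox-Explanation"]
    return None if value is None else str(value).strip()


print(cleanbox_explanation(RAW))
```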
How the score is calculated
The AI symbol score is not fixed. It varies with two factors:
- AI confidence (0.0 to 1.0) — How certain the AI is
- Bayes agreement — Whether Rspamd's Bayesian classifier agrees or disagrees
When the AI and Bayes agree (both say spam, or both say ham), the symbol's score is larger in magnitude. When they disagree, or Bayes is neutral, it is smaller. This limits how far the AI can override Bayes when one of them might be wrong.
| AI says | Bayes says | Symbol | Typical score |
|---|---|---|---|
| spam | spam | CLEANBOX_AI_SPAM | +3.0 to +4.5 |
| spam | ham or neutral | CLEANBOX_AI_SPAM | +2.0 to +3.0 |
| ham | ham | CLEANBOX_AI_HAM | -3.5 to -4.5 |
| ham | spam or neutral | CLEANBOX_AI_HAM | -2.0 to -3.0 |
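The exact formula is not given here. As a sketch only, the table can be read as a lookup on (AI verdict, Bayes agreement) followed by scaling with confidence. The linear interpolation and the function name `ai_symbol_score` are assumptions made for illustration; only the range endpoints come from the table.

```python
# (AI verdict, Bayes agrees?) -> (score at confidence 0.0, score at confidence 1.0)
# Endpoints are the "typical score" ranges from the table above.
# ASSUMPTION: confidence moves the score linearly between the two endpoints.
RANGES = {
    ("spam", True): (3.0, 4.5),
    ("spam", False): (2.0, 3.0),
    ("ham", True): (-3.5, -4.5),
    ("ham", False): (-2.0, -3.0),
}


def ai_symbol_score(verdict: str, confidence: float, bayes_verdict: str):
    """Return (symbol, score) for an AI verdict.

    bayes_verdict is "spam", "ham", or "neutral"; anything other than
    an exact match with the AI verdict counts as disagreement.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    agrees = bayes_verdict == verdict
    low, high = RANGES[(verdict, agrees)]
    symbol = "CLEANBOX_AI_SPAM" if verdict == "spam" else "CLEANBOX_AI_HAM"
    return symbol, round(low + (high - low) * confidence, 2)


print(ai_symbol_score("spam", 1.0, "spam"))
print(ai_symbol_score("ham", 0.5, "spam"))
```

Under this reading, a fully confident verdict that Bayes agrees with lands at the outer edge of its range, while the same verdict with Bayes disagreeing stays closer to zero.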
In the spam report
The AI symbol appears in your spam report alongside all other Rspamd symbols:
CLEANBOX_AI_SPAM +3.29 AI content classifier detected spam/phishing patterns
BAYES_SPAM +0.25 Message probably spam
RDNS_NONE +0.50 No reverse DNS
---
Total: +4.04
Without the AI, this email would have scored 0.75 — delivered without issue. With the AI, it scores 4.04 — caught by the quarantine or spam threshold.
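The arithmetic behind those two totals is a plain sum of symbol scores. This short sketch reproduces it from the report above; the list layout and the `total` helper are illustrative, not an Rspamd API.

```python
# Symbol scores copied from the example report above.
report = [
    ("CLEANBOX_AI_SPAM", 3.29),
    ("BAYES_SPAM", 0.25),
    ("RDNS_NONE", 0.50),
]


def total(symbols, skip_prefix=None):
    """Sum symbol scores, optionally skipping symbols with a given prefix."""
    return round(
        sum(
            score
            for name, score in symbols
            if skip_prefix is None or not name.startswith(skip_prefix)
        ),
        2,
    )


print(total(report))                 # 4.04
print(total(report, "CLEANBOX_AI"))  # 0.75
```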
Using AI symbols in filters
You can create filter rules based on the AI symbols:
- Zero-tolerance AI spam blocking: Spam symbol equals CLEANBOX_AI_SPAM → Deny
- Move AI-flagged email to a folder: Spam symbol equals CLEANBOX_AI_SPAM → Allow, deliver to "AI Flagged" folder
Caching
Bulk spam campaigns send identical content from different sender addresses. Cleanbox caches AI classifications by content hash (subject + body). The first email in a campaign triggers the analysis. Every subsequent copy with identical content gets an instant cache hit — no duplicate processing, no additional cost.
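A minimal sketch of the idea, assuming SHA-256 as the hash and an in-memory dictionary as the store (the article specifies neither). `content_key`, `ClassificationCache`, and `stub_classifier` are hypothetical names for illustration.

```python
import hashlib


def content_key(subject: str, body: str) -> str:
    """Hash subject + body; the sender is deliberately left out."""
    h = hashlib.sha256()  # ASSUMPTION: the actual hash function is not documented
    h.update(subject.encode("utf-8"))
    h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    h.update(body.encode("utf-8"))
    return h.hexdigest()


class ClassificationCache:
    """Calls the classifier only for content that has not been seen before."""

    def __init__(self, classify):
        self._classify = classify
        self._store = {}
        self.misses = 0

    def lookup(self, sender: str, subject: str, body: str):
        key = content_key(subject, body)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._classify(sender, subject, body)
        return self._store[key]


def stub_classifier(sender, subject, body):
    # Stand-in for the real AI call.
    return ("spam", 0.97, "example reason")


cache = ClassificationCache(stub_classifier)
cache.lookup("a@example.com", "You won", "Claim your prize")
cache.lookup("b@example.net", "You won", "Claim your prize")
print(cache.misses)  # 1
```

Because the key ignores the sender, the second message from a different address reuses the first result.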
Privacy
The AI classifier processes the sender address, subject, and body text of each email. Processing happens in real time during the SMTP transaction, and the result (verdict, confidence, reason) is cached for future identical content. The classification system does not store the full email body. It persists only the content hash (for cache lookup) and the cached result, whose reason text is limited to 200 characters.
Availability
AI spam classification is active on all accounts. It runs automatically alongside traditional spam detection. No configuration is needed.
For a deeper technical explanation, see How We Built AI-Powered Spam Detection on our blog.