
How AI spam classification works

In addition to Rspamd's traditional spam detection (Bayesian classification, authentication checks, URL reputation, and crowd-sourced sender reputation), Cleanbox includes an AI content classifier that reads and understands every incoming email.

What it does

The AI classifier analyzes the sender address, subject line, and body of every email and produces three outputs:

  • Verdict: spam or ham (legitimate)
  • Confidence: A score from 0.0 to 1.0 indicating how certain the AI is
  • Reason: A plain-English explanation of why it classified the email that way

This verdict is integrated into Rspamd's scoring system as a CLEANBOX_AI_SPAM or CLEANBOX_AI_HAM symbol. It works alongside all existing spam checks: it does not replace them, it adds another layer of detection.
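As an illustration, the three outputs and their mapping to an Rspamd symbol could be modelled as below. This is a minimal sketch, not Cleanbox's actual data model: the class name `AiClassification` and the method `symbol` are hypothetical, and only the field meanings and the two symbol names come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AiClassification:
    """Hypothetical container for the three classifier outputs."""

    verdict: str       # "spam" or "ham"
    confidence: float  # 0.0 to 1.0
    reason: str        # plain-English explanation

    def __post_init__(self):
        # Reject values outside the ranges described in the text.
        if self.verdict not in ("spam", "ham"):
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")

    def symbol(self) -> str:
        """Return the Rspamd symbol name that carries this verdict."""
        return "CLEANBOX_AI_SPAM" if self.verdict == "spam" else "CLEANBOX_AI_HAM"


result = AiClassification("spam", 0.93, "sender domain does not match the brand")
print(result.symbol())
```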

What it catches that traditional filters miss

Traditional spam filters are statistical — they recognize patterns from emails they have seen before. The AI understands context and meaning:

| Threat type | Why traditional filters miss it | Why AI catches it |
|---|---|---|
| Brand impersonation | Content resembles real brand emails (same words, same HTML structure) | AI sees that paypal-notifications-center.com is not paypal.com |
| Sextortion | Unique language patterns not in Bayesian training data | AI recognizes the scam structure: threat → Bitcoin demand → deadline |
| Fake debt collection | Bayes has no data on this domain-brand combination | AI knows bintopia.com is not a real debt collection agency |
| Cold outreach | Proper SPF/DKIM, clean HTML, legitimate-looking infrastructure | AI recognizes unsolicited sales patterns regardless of technical legitimacy |
| Fake voicemails | Short, clean content with low spam score | AI identifies the phishing pattern: fake notification from impersonated provider |

The X-Cleanbox-Explanation header

Every email classified by the AI gets a human-readable explanation injected as a header into the delivered message:

X-Cleanbox-Explanation: legitimate newsletter from official company domain careers.microsoft.com with job vacancy listings and subscription management links
X-Cleanbox-Explanation: classic sextortion scam with threats of webcam recordings, demands for Bitcoin payment, fake malware claims, and password extortion tactics
X-Cleanbox-Explanation: sender domain paypal-notifications-center.com is not paypal.com, urgency + fake dispute link

This header is visible in:

  • The Headers tab on the message detail page in Cleanbox
  • Your email client if you view full/raw headers (Gmail: "Show original", Outlook: "View source")
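If you export a message's raw source, the header can also be read programmatically. The sketch below uses Python's standard `email` package; the sample message (addresses, subject, body) is made up for illustration, and the helper name `explanation` is hypothetical. Only the header name and its value come from the examples above.

```python
from email import message_from_string
from email.policy import default
from typing import Optional

# Made-up sample message; only the X-Cleanbox-Explanation line mirrors the docs.
RAW = """\
From: service@paypal-notifications-center.com
To: user@example.com
Subject: Action required: dispute opened
X-Cleanbox-Explanation: sender domain paypal-notifications-center.com is not paypal.com, urgency + fake dispute link

Please review the dispute.
"""


def explanation(raw_message: str) -> Optional[str]:
    """Return the AI explanation header, or None if the message has none."""
    msg = message_from_string(raw_message, policy=default)
    value = msg["X-Cleanbox-Explanation"]  # header lookup is case-insensitive
    return None if value is None else str(value).strip()


print(explanation(RAW))
```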

How the score is calculated

The AI symbol score is not fixed. It varies with two factors:

  1. AI confidence (0.0 to 1.0) — How certain the AI is
  2. Bayes agreement — Whether Rspamd's Bayesian classifier agrees or disagrees

When the AI and Bayes agree (both say spam, or both say ham), the magnitude of the score is high. When they disagree, or when Bayes is neutral, it is lower. This prevents the AI from overriding Bayes when one of the two might be wrong.

| AI says | Bayes says | Symbol | Typical score |
|---|---|---|---|
| spam | spam | CLEANBOX_AI_SPAM | +3.0 to +4.5 |
| spam | ham or neutral | CLEANBOX_AI_SPAM | +2.0 to +3.0 |
| ham | ham | CLEANBOX_AI_HAM | -3.5 to -4.5 |
| ham | spam or neutral | CLEANBOX_AI_HAM | -2.0 to -3.0 |
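To make the interaction of the two factors concrete, here is one way a score could be derived from the table. The exact formula Cleanbox uses is not documented here, so this sketch makes a loud assumption: confidence is mapped linearly onto the typical range for each row. The function name `ai_symbol_score` and the `RANGES` table layout are hypothetical.

```python
# Typical score ranges from the table, keyed by (AI verdict, Bayes agrees?).
# Each value is (score at confidence 0.0, score at confidence 1.0).
RANGES = {
    ("spam", True): (3.0, 4.5),
    ("spam", False): (2.0, 3.0),
    ("ham", True): (-3.5, -4.5),
    ("ham", False): (-2.0, -3.0),
}


def ai_symbol_score(verdict: str, confidence: float, bayes: str) -> float:
    """Sketch of a variable score; bayes is "spam", "ham", or "neutral"."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    # A neutral Bayes result counts as "not agreeing", as in the table.
    agrees = bayes == verdict
    low, high = RANGES[(verdict, agrees)]
    # ASSUMPTION: linear interpolation within the range; the real formula may differ.
    return round(low + (high - low) * confidence, 2)


print(ai_symbol_score("spam", 0.5, "spam"))     # agreement, mid confidence
print(ai_symbol_score("spam", 0.5, "neutral"))  # no agreement, mid confidence
```

Under this assumption, the same confidence yields a smaller adjustment whenever Bayes does not back the AI's verdict, which matches the behaviour the table describes.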

In the spam report

The AI symbol appears in your spam report alongside all other Rspamd symbols:

CLEANBOX_AI_SPAM    +3.29    AI content classifier detected spam/phishing patterns
BAYES_SPAM          +0.25    Message probably spam
RDNS_NONE           +0.50    No reverse DNS
---
Total:              +4.04

Without the AI, this email would have scored 0.75 — delivered without issue. With the AI, it scores 4.04 — caught by the quarantine or spam threshold.
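The arithmetic behind that comparison is a plain sum of symbol scores. The short sketch below reproduces it from the report above; the helper name `total` and the prefix-based exclusion are illustrative choices, not part of Rspamd or Cleanbox.

```python
# Symbol scores copied from the example report.
report = [
    ("CLEANBOX_AI_SPAM", 3.29),
    ("BAYES_SPAM", 0.25),
    ("RDNS_NONE", 0.50),
]


def total(symbols, exclude_prefix=None):
    """Sum symbol scores, optionally skipping symbols with a given name prefix."""
    kept = (
        score
        for name, score in symbols
        if exclude_prefix is None or not name.startswith(exclude_prefix)
    )
    return round(sum(kept), 2)


with_ai = total(report)
without_ai = total(report, exclude_prefix="CLEANBOX_AI_")
print(without_ai, with_ai)  # 0.75 4.04
```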

Using AI symbols in filters

You can create filter rules based on the AI symbols:

  • Zero-tolerance AI spam blocking: Spam symbol equals CLEANBOX_AI_SPAM → Deny
  • Move AI-flagged email to a folder: Spam symbol equals CLEANBOX_AI_SPAM → Allow, deliver to "AI Flagged" folder

Caching

Bulk spam campaigns send identical content from different sender addresses. Cleanbox caches AI classifications by content hash (subject + body). The first email in a campaign triggers the analysis. Every subsequent copy with identical content gets an instant cache hit — no duplicate processing, no additional cost.
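The idea can be sketched in a few lines. This is an illustration only: the hash algorithm (SHA-256), the separator byte, the in-memory dictionary, and the names `content_key` and `ClassificationCache` are all assumptions, since the text above states only that the key is a hash of subject and body.

```python
import hashlib


def content_key(subject: str, body: str) -> str:
    """Hash subject + body; the sender is deliberately left out of the key."""
    h = hashlib.sha256()
    h.update(subject.encode("utf-8"))
    h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") hash differently
    h.update(body.encode("utf-8"))
    return h.hexdigest()


class ClassificationCache:
    """Hypothetical in-memory cache keyed by content hash."""

    def __init__(self, classify):
        self._classify = classify  # callable(sender, subject, body) -> result
        self._store = {}           # content hash -> cached result
        self.misses = 0

    def lookup(self, sender: str, subject: str, body: str):
        key = content_key(subject, body)
        if key not in self._store:
            # First copy in a campaign: run the (expensive) classifier once.
            self.misses += 1
            self._store[key] = self._classify(sender, subject, body)
        return self._store[key]
```

Because the sender address is not part of the key, two copies of the same campaign sent from different addresses resolve to the same cache entry, and only the hash and the result are kept, not the body.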

Privacy

The AI classifier processes the sender address, subject, and body text of each email. This processing happens in real time during the SMTP transaction, and the result (verdict, confidence, reason) is cached for future identical content. The full email body is not stored by the classification system. Only the content hash (for cache lookup) and the cached result, including the short reason text (max 200 characters), are persisted.

Availability

AI spam classification is active on all accounts. It runs automatically alongside traditional spam detection. No configuration is needed.

For a deeper technical explanation, see How We Built AI-Powered Spam Detection on our blog.