How does Cleanbox detect spam?

Cleanbox uses multiple layers of spam detection to evaluate every incoming email. The result is a numerical spam score — the higher the score, the more likely the message is spam. This article explains each layer and how they work together.

Layer 1: Sender reputation

Before the email content is even scanned, Cleanbox checks the sender's reputation. This is a crowd-sourced system built from feedback across all Cleanbox users:

Feedback aggregation — When users mark messages as spam or not-spam (thumbs down/up), this data is aggregated per sender address and per sender domain
Trust scoring — If 5 or more teams have a sender whitelisted or prioritized, that sender is considered trusted and gets a score reduction
New sender detection — If a sender has never emailed any Cleanbox user before, they are flagged as an uncommon sender with a small score increase

Based on the aggregated feedback, the sender receives a recommendation:

Recommendation	Trigger	Score impact
Accept	Default — no significant spam reports	None
Greylist	3+ users reported, more spam than not-spam reports	+2.0
Quarantine	5+ users reported, 70%+ spam ratio	+5.0
Block	10+ users reported, 90%+ spam ratio	+8.0
Trusted	5+ teams whitelisted/prioritized	-2.0
Uncommon	First contact across all Cleanbox users	+1.0
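As a rough illustration, the table above can be read as a small decision function. This is a minimal sketch, not Cleanbox's actual implementation. The names SenderStats and recommend are hypothetical. Two points are assumptions because the table does not settle them: the order in which rows are checked (Trusted first, then the most severe matching rule), and counting "users reported" as spam plus not-spam reports combined.

```python
from dataclasses import dataclass


@dataclass
class SenderStats:
    """Hypothetical per-sender counters; field names are illustrative."""
    spam_reports: int = 0      # users who marked the sender as spam
    ham_reports: int = 0       # users who marked the sender as not-spam
    trusting_teams: int = 0    # teams that whitelisted or prioritized the sender
    seen_before: bool = True   # False on first contact across all users


# Score impacts copied from the recommendation table.
SCORE_IMPACT = {
    "Accept": 0.0,
    "Greylist": 2.0,
    "Quarantine": 5.0,
    "Block": 8.0,
    "Trusted": -2.0,
    "Uncommon": 1.0,
}


def recommend(stats: SenderStats) -> str:
    """Map aggregated feedback to a recommendation name."""
    # Assumption: "users reported" means all feedback, spam and not-spam.
    reporters = stats.spam_reports + stats.ham_reports
    spam_ratio = stats.spam_reports / reporters if reporters else 0.0

    # Assumed precedence: Trusted wins, then the most severe matching rule.
    if stats.trusting_teams >= 5:
        return "Trusted"
    if reporters >= 10 and spam_ratio >= 0.9:
        return "Block"
    if reporters >= 5 and spam_ratio >= 0.7:
        return "Quarantine"
    if reporters >= 3 and stats.spam_reports > stats.ham_reports:
        return "Greylist"
    if not stats.seen_before:
        return "Uncommon"
    return "Accept"
```

For example, under these assumptions a sender with 4 spam reports and 1 not-spam report (5 reporters, 80% spam) would land on Quarantine, and SCORE_IMPACT["Quarantine"] gives the +5.0 adjustment.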

Layer 2: Rspamd content analysis

The full email (headers + body) is sent to Rspamd, an open-source spam filtering engine. Rspamd performs dozens of checks simultaneously:

Bayes classifier — Machine learning model trained on spam and legitimate emails. Continuously improved by user feedback (thumbs up/down).
Authentication checks — Verifies SPF, DKIM, and DMARC. Failed authentication adds to the spam score.
URL analysis — Checks links against known phishing, malware, and spam URL databases.
Header analysis — Checks for forged or suspicious email headers, missing required fields, and signs of mass mailing software.
Content patterns — Detects common spam phrases, suspicious formatting, and known spam signatures.
Cleanbox custom symbols — The sender reputation data from Layer 1 is injected as custom scoring symbols (CLEANBOX_BLOCK, CLEANBOX_TRUSTED, etc.)

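Each check that matches produces a symbol with a score, which can be positive or negative (CLEANBOX_TRUSTED, for example, lowers the total). The symbol scores are summed into the total spam score. A typical legitimate email scores 0–2. Obvious spam often scores 10 or more.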
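The summing step can be illustrated with a few lines of Python. This is only a sketch: the function name total_score is hypothetical, and the per-symbol scores in the example are made up for illustration rather than taken from any real scan. Only the -2.0 for CLEANBOX_TRUSTED comes from the reputation table.

```python
def total_score(symbols: dict[str, float]) -> float:
    """Sum per-symbol scores into one spam score (rounded for display)."""
    return round(sum(symbols.values()), 2)


# Illustrative symbols: two that raise the score, one that lowers it.
example_symbols = {
    "BAYES_SPAM": 3.5,         # made-up score for a Bayes classifier hit
    "R_SPF_FAIL": 1.0,         # made-up score for a failed SPF check
    "CLEANBOX_TRUSTED": -2.0,  # trusted-sender reduction from Layer 1
}

example_total = total_score(example_symbols)
```

Here the trusted-sender symbol offsets part of the content score, which is how a reputable sender can stay deliverable even when one content check fires.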

Layer 3: Virus scanning (Relay only)

For relay-protected addresses, Cleanbox also runs ClamAV antivirus scanning. If a virus is detected, the message is immediately rejected — regardless of spam score or any other rules. This check runs before the spam threshold evaluation.

Layer 4: IP blacklist checks (Relay only)

For relay addresses, the sending server IP is checked against DNS-based blackhole lists (DNSBL):

Spamhaus — The most comprehensive spam IP database
Barracuda — Enterprise-grade reputation data
SpamCop — Community-reported spam sources

If the IP is blacklisted, the message is rejected at the SMTP level before content is even processed.
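For readers unfamiliar with DNSBLs, a lookup is an ordinary DNS query: the IPv4 octets are reversed and prepended to the list's zone, and an answer means the address is listed. The sketch below shows that convention. The zone names are the ones these providers commonly publish, but which zones Cleanbox actually queries is an assumption, and the helper names are hypothetical.

```python
import ipaddress
import socket

# Assumption: commonly published zones for the three lists named above.
DNSBL_ZONES = ["zen.spamhaus.org", "b.barracudacentral.org", "bl.spamcop.net"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query, e.g. 192.0.2.1 -> 1.2.0.192.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the input
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for this IP (simplified check)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record found (or the lookup failed): treat as not listed.
        return False
```

A production check would also inspect the returned 127.0.0.x code, since list operators use specific values to signal the listing type or a query error. This sketch ignores that detail.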

How the spam score is used

Each alias and relay address has two configurable thresholds:

Threshold	What happens
Quarantine threshold	Score meets or exceeds this value → message is held in quarantine for review
Spam threshold	Score meets or exceeds this value → message is rejected outright

The quarantine threshold is always lower than the spam threshold, creating three zones:

Score below quarantine threshold                        → Deliver normally
Score at or above quarantine, below spam threshold      → Quarantine (hold for review)
Score at or above spam threshold                        → Reject (definite spam)
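The three zones can be expressed as a short function. This sketch follows the "meets or exceeds" wording in the threshold table, so a score exactly equal to a threshold falls into the stricter zone. The function name classify and the threshold values in the example are hypothetical.

```python
def classify(score: float, quarantine_threshold: float, spam_threshold: float) -> str:
    """Return 'deliver', 'quarantine', or 'reject' for a spam score."""
    if quarantine_threshold >= spam_threshold:
        raise ValueError("quarantine threshold must be lower than spam threshold")
    if score >= spam_threshold:
        return "reject"
    if score >= quarantine_threshold:
        return "quarantine"
    return "deliver"


# Example with illustrative thresholds of 5.0 (quarantine) and 10.0 (spam).
decisions = [classify(s, 5.0, 10.0) for s in (1.2, 5.0, 9.9, 10.0)]
```

With those example thresholds, a score of 5.0 is quarantined and a score of 10.0 is rejected, because each threshold is inclusive.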

The feedback loop

Spam detection improves over time through a feedback loop:

User receives a message and marks it as spam (thumbs down) or not-spam (thumbs up)
The raw email is sent to Rspamd for Bayes learning — training the classifier on real examples
The feedback count for that sender is incremented in the reputation database
Future emails from that sender receive adjusted Cleanbox symbols based on the aggregated feedback
The combined effect of Bayes learning + sender reputation makes detection more accurate over time
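The reputation side of this loop (the feedback count step) amounts to keeping counters per sender address and per sender domain. The sketch below shows one way that bookkeeping could look. It is an assumption-laden illustration, not Cleanbox's storage model: the class ReputationStore and its methods are hypothetical, and the Rspamd Bayes learning step is left out.

```python
from collections import Counter


class ReputationStore:
    """Hypothetical in-memory store of spam / not-spam feedback counts."""

    def __init__(self) -> None:
        self.spam = Counter()
        self.ham = Counter()

    def record(self, sender: str, is_spam: bool) -> None:
        """Count one piece of feedback for the address and for its domain."""
        address = sender.strip().lower()
        if "@" not in address:
            raise ValueError("sender must be a full email address")
        domain = address.rsplit("@", 1)[1]
        bucket = self.spam if is_spam else self.ham
        bucket[address] += 1
        bucket[domain] += 1

    def counts(self, key: str) -> tuple[int, int]:
        """Return (spam_count, ham_count) for an address or a domain."""
        key = key.strip().lower()
        return self.spam[key], self.ham[key]


store = ReputationStore()
store.record("Alice@Example.com", is_spam=True)   # thumbs down
store.record("bob@example.com", is_spam=True)     # thumbs down
store.record("alice@example.com", is_spam=False)  # thumbs up
```

After these three calls, the domain example.com has two spam reports and one not-spam report, while alice@example.com has one of each. Counts like these are what the recommendation rules in Layer 1 would be evaluated against.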

Viewing spam details

Every processed message includes a detailed spam report accessible from the message detail page. The report shows:

The total spam score
Every individual rule that triggered, with its name and score contribution
Authentication results (SPF pass/fail, DKIM pass/fail, DMARC pass/fail)
Which Cleanbox reputation symbols were applied
Whether a virus was detected (relay addresses only)

This transparency lets you understand exactly why a message was delivered, quarantined, or rejected — and adjust your thresholds accordingly.