A spam cocktail puts each e-mail message through a series of tests that produce numeric scores showing how likely the message is to be spam. The scores are combined and the message is assigned a probability rating; for example, a message may be rated as having an 85% probability of being spam. E-mail administrators can create rules that govern how messages are handled based on their scores: messages with the highest scores may be deleted, those with medium scores may be quarantined, and those with lower scores may be delivered but marked with a spam warning.
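The score-based handling rules described above can be sketched roughly as follows. This is a minimal illustration, not any particular product's rule engine: the threshold values and the names `DELETE_AT`, `QUARANTINE_AT`, `TAG_AT`, and `handle_message` are assumptions chosen for the example, and real administrators would set their own cut-offs.

```python
# Hypothetical thresholds (assumptions for illustration only).
DELETE_AT = 0.90       # highest scores: delete
QUARANTINE_AT = 0.60   # medium scores: quarantine
TAG_AT = 0.30          # lower scores: deliver, but mark with a warning


def handle_message(spam_probability: float) -> str:
    """Map a spam probability in [0, 1] to a handling action."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    if spam_probability >= DELETE_AT:
        return "delete"
    if spam_probability >= QUARANTINE_AT:
        return "quarantine"
    if spam_probability >= TAG_AT:
        return "deliver_with_warning"
    return "deliver"
```

With these assumed thresholds, the 85% example above would fall in the quarantine band.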
A spam cocktail commonly includes several of the following identification methods, which may be weighted differently for message scoring:
- Machine learning: Using algorithms that improve over time to analyze the subject line and contents of a message and predict, based on past results, the probability that it is spam. A Bayesian filter is one type of machine learning.
- Blacklisting: Subscribing to a blacklist or blackhole list of known spammers and blocking messages from those sources
- Content filtering: Using programs that look for specific words or criteria in the subject line or body of a message
- Spam signatures: Using programs that compare the patterns in new messages to patterns of known spam
- Heuristics: Using heuristic programs that look for known sources, words or phrases, and transmission or content patterns
- Reverse DNS lookup: Checking whether the IP address a message comes from matches the domain name the message claims to come from.
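
To show how differently weighted methods might feed a single message score, here is a small sketch that combines two toy tests with a weighted average. Everything in it is an assumption made for illustration: the message fields, the `BLACKLIST` and `SUSPICIOUS_WORDS` sets, the two test functions, and the weights. Actual spam cocktails use far richer tests and may combine their results differently.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
SpamTest = Callable[[Message], float]  # each test returns a score in [0, 1]

# Hypothetical data for the example.
BLACKLIST = {"spammer.example"}
SUSPICIOUS_WORDS = {"lottery", "winner", "free"}


def blacklist_test(msg: Message) -> float:
    """Blacklisting: 1.0 if the sender's domain is on the list, else 0.0."""
    return 1.0 if msg["sender_domain"] in BLACKLIST else 0.0


def content_test(msg: Message) -> float:
    """Content filtering: score rises with suspicious words, capped at 1.0."""
    words = (msg["subject"] + " " + msg["body"]).lower().split()
    hits = sum(1 for word in words if word in SUSPICIOUS_WORDS)
    return min(1.0, hits / 3)


def cocktail_score(msg: Message, tests: List[Tuple[SpamTest, float]]) -> float:
    """Weighted average of the individual test scores."""
    total_weight = sum(weight for _, weight in tests)
    if total_weight <= 0:
        raise ValueError("at least one test with a positive weight is required")
    return sum(weight * test(msg) for test, weight in tests) / total_weight


# Assumed weights: the blacklist counts three times as much as content.
TESTS = [(blacklist_test, 3.0), (content_test, 1.0)]

message = {
    "sender_domain": "spammer.example",
    "subject": "hello",
    "body": "meeting notes attached",
}
score = cocktail_score(message, TESTS)
```

A weighted average keeps the combined score in the same 0 to 1 range as the individual tests, so it can be compared directly against administrator-defined thresholds.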