Understanding the network-level behavior of spammers
Frequently Asked Questions (15)
Q2. Why are open relays used by spammers?
Originally intended for user convenience (e.g., to let users send mail from a particular relay while they are traveling or otherwise in a different network), open relays have been exploited by spammers due to the anonymity and amplification offered by the extra level of indirection.
Q3. How many blacklists were used to test this hypothesis?
To test this hypothesis, the authors used the results from real-time DNSBL lookups that Mail Avenger performed against eight different blacklists at the time the mail was received.
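As a rough illustration of what a DNSBL lookup involves, the sketch below builds the DNS names that would be queried for one sender address. It relies on the standard DNSBL convention of reversing the IPv4 octets and appending the blacklist zone. The zone names, the function names, and the `is_listed` callback are hypothetical and chosen for illustration; they are not the eight lists or the tooling the authors used, and no network query is made here.

```python
import ipaddress

# Hypothetical blacklist zones, for illustration only.
ZONES = ["bl.example.org", "dnsbl.example.net"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name to query for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def listed_count(ip: str, zones, is_listed) -> int:
    """Count the zones that list `ip`.

    `is_listed` is a caller-supplied function that takes a query name and
    returns True if the name resolves (that is, the address is listed).
    """
    return sum(1 for zone in zones if is_listed(dnsbl_query_name(ip, zone)))
```

For example, `dnsbl_query_name("192.0.2.99", "bl.example.org")` returns `"99.2.0.192.bl.example.org"`. Passing the resolver in as a callback keeps the sketch testable without live DNS.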
Q4. What is the common way of sending spam?
A small portion of spam is sent by sophisticated spammers, who briefly advertise IP prefixes, establish a connection to the victim’s mail relay, and withdraw the route to that IP address space after spam is sent.
Q5. What is the common reason for the large fraction of spam coming from Windows hosts?
Because a very large fraction of spam comes from Windows hosts, the authors hypothesize that many of these machines are infected hosts acting as bots.
Q6. What is the main reason for the skewed distribution of spam?
This heavily skewed distribution suggests that spam filtering efforts might better focus on identifying high-volume, persistent groups of spammers (e.g., by AS number), rather than on blacklisting individual IP addresses, many of which are transient.
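One way to see how skewed such a distribution is would be to aggregate spam counts by AS number and measure the share contributed by the top few ASes. The sketch below assumes a hypothetical input of `(ip, asn)` pairs, one per received spam message; the record format and function names are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def spam_by_as(records):
    """Count spam messages per AS number.

    `records` is an iterable of (ip, asn) pairs, one per message
    (a hypothetical format chosen for this sketch).
    """
    return Counter(asn for _ip, asn in records)


def top_k_share(counts, k):
    """Return the fraction of all messages sent from the k busiest ASes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _asn, n in counts.most_common(k))
    return top / total
```

A `top_k_share` value close to 1 for a small `k` would indicate that a handful of ASes dominate, which is the situation in which filtering by AS becomes more attractive than filtering by individual IP address.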
Q7. How many spamming bots persist in the trace?
The persistence of Bobax-infected hosts appears to be mildly bimodal: although roughly 75% of Bobax drones persist for less than two minutes, the remainder persist for a day or longer, about 50 persist for about six months, and 10 persist for the entire length of the trace.
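A minimal sketch of how per-address persistence could be computed from a trace follows. It assumes persistence means the time between the first and last sighting of an IP address, and that the trace is available as `(timestamp_seconds, ip)` pairs; both are assumptions made for this example rather than details confirmed by the text above.

```python
def ip_persistence(events):
    """Map each IP to the seconds between its first and last sighting.

    `events` is an iterable of (timestamp_seconds, ip) pairs in any order.
    """
    first, last = {}, {}
    for ts, ip in events:
        if ip not in first or ts < first[ip]:
            first[ip] = ts
        if ip not in last or ts > last[ip]:
            last[ip] = ts
    return {ip: last[ip] - first[ip] for ip in first}


def fraction_below(persistence, threshold_seconds):
    """Return the fraction of IPs whose persistence is under the threshold."""
    if not persistence:
        return 0.0
    short = sum(1 for d in persistence.values() if d < threshold_seconds)
    return short / len(persistence)
```

With this definition, an address seen only once has a persistence of zero, so `fraction_below(p, 120)` counts both single-sighting addresses and those active for under two minutes.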
Q8. Why are the authors interested in measuring the persistence of IP addresses?
Since one of their objectives is to study the effectiveness of IP-based filtering (rather than, say, count the total number of hosts), the authors are interested more in measuring the persistence of IP addresses, not hosts.
Q9. What are the properties of network-level spam?
Network-level properties may be observable in the middle of the network, or closer to the source of the spam, which may allow spam to be quarantined or disposed of before it ever reaches a destination mail server.
Q10. What is the effect of the 'short-lived' routing announcements?
As an added benefit, route announcements for shorter IP prefixes (i.e., larger blocks of IP addresses) are less likely to be blocked by ISPs’ route filters than route announcements or hijacks for longer prefixes.
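The relationship between prefix length and block size can be checked with Python's standard `ipaddress` module. The prefixes below are arbitrary examples, not prefixes from the study.

```python
import ipaddress


def block_size(prefix: str) -> int:
    """Return the number of addresses covered by a CIDR prefix."""
    return ipaddress.ip_network(prefix).num_addresses


# A shorter prefix (/8) covers far more addresses than a longer one (/24).
short_prefix_size = block_size("10.0.0.0/8")    # 2**24 addresses
long_prefix_size = block_size("192.0.2.0/24")   # 2**8 addresses
```

Here `short_prefix_size` is 16777216 and `long_prefix_size` is 256, so a single /8 announcement covers as many addresses as 65536 separate /24 announcements.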
Q11. How many ASes appear among the top 10 persistent and voluminous spammers?
Only two ASes, AS 4788 (Telekom Malaysia) and AS 4678 (Canon Network Communications, in Japan), appear among both the top-10 most persistent and the top-10 most voluminous spammers using short-lived BGP routing announcements.
Q12. How many hosts are responsible for the amount of spam the authors receive?
More striking is that, while only about 4% of the hosts from which the authors receive spam are running operating systems other than Windows, this small set of hosts appears to be responsible for at least 8% of the spam the authors receive.
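To make the comparison concrete, the short calculation below turns the two percentages into a per-host ratio. It treats "about 4%" and "at least 8%" as exact values and assumes all remaining hosts and spam are attributed to Windows, so the result is only a rough lower-bound illustration.

```python
def per_host_ratio(host_share: float, spam_share: float) -> float:
    """Ratio of average spam per host inside a group to that outside it.

    `host_share` is the group's fraction of all hosts and `spam_share`
    is its fraction of all spam; both must lie strictly between 0 and 1.
    """
    inside = spam_share / host_share
    outside = (1 - spam_share) / (1 - host_share)
    return inside / outside


# 4% of hosts sending 8% of the spam, per the figures quoted above.
ratio = per_host_ratio(0.04, 0.08)
```

Under these assumptions `ratio` is a little over 2, meaning the average non-Windows host sends roughly twice as much spam as the average Windows host.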
Q13. What are the main characteristics of mail headers?
Although many aspects of mail headers can be forged, the authors base their analysis strictly on properties of the sender that are difficult to forge (e.g., IP addresses that made connections to their mail servers, passive TCP fingerprints, corresponding route announcements, etc.).
Q14. How did the authors determine that the spam was particularly prevalent?
Given the sophistication required to send spam under the protection of short-lived routing announcements (especially compared with the relative simplicity of purchasing access to a botnet), the authors doubted that it was particularly prevalent.
Q15. How do you explain the behavior of the spammers using this technique?
The authors are at a loss to explain certain aspects of this behavior, such as why some of the machines appear to have IP addresses from allocated space, when it would be simpler to “step around” the allocated prefix blocks, but, needless to say, the spammers using this technique appear to be very sophisticated.