INTERNET-DRAFT                                                A. Church
Expires: February 21, 2006                              August 20, 2005

                  DNS Blacklists Considered Harmful
                  draft-church-dnsbl-harmful-00.txt

Status of this Memo

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Copyright Notice

Copyright (C) The Internet Society (2005). All rights reserved.

Abstract

As spam continues to grow throughout the Internet, various countermeasures have been developed. Among these is the "DNS blacklist", a DNS server configured to return a "good" or "bad" response to a query on a given IP address; mail servers can be configured to automatically query such a server and reject messages which are flagged "bad". If the blacklist is accurate, this allows mail servers to reject spam without wasting the time of the human recipient or the resources of the server. However, between delays in responding to environmental changes and arbitrary operational decisions by blacklist operators, such blocking of mail in fact causes significant harm to innocent third parties. This note describes the issues concerning these blacklists and suggests ways to resolve the attendant problems.
Church                  Expires February 21, 2006               [Page 1]

Internet-Draft      DNS Blacklists Considered Harmful        August 2005

The Problem

The past several years have seen an explosive increase in "spam": undesired E-mail such as unsolicited commercial E-mail (UCE), viruses, and fraud schemes. It is not clear exactly how much spam is being sent throughout the Internet, but some estimates [1] suggest that spam accounts for over 70% of all E-mail traffic, and the author has had personal experience with E-mail addresses having a spam-to-nonspam ratio of greater than 20 to 1.

By its nature, spam is sent in bulk. As the number of recipients of any particular message often runs into the millions, the simple act of sending can place a significant burden on the server used to send the messages. If the spammer is using his own server, this is not such a great issue, but many spammers take advantage of others' machines to pass on their messages, either by using open relay servers (mail servers which allow any Internet user to relay mail to any other Internet user) or by using "botnets", networks of remotely-controlled client machines.

The receiving domains must also process the messages. This becomes a particular problem when a spammer generates a large number of random addresses in a single domain and attempts to send to all of them; the flood of E-mail may cause a denial of service if the receiving mail server is not prepared to process it.

For spam that reaches a human user, the user must take time first to confirm that the message is spam and then to delete it. As the ratio of spam to non-spam increases, not only does it take longer to process the spam, but the chance of accidentally deleting desired messages also increases.

Additionally, some Internet users are charged per data unit for connectivity. For such users, receiving and processing spam not only takes time but also imposes a financial burden.
For the above reasons, mail server operators have continually worked on methods to block spam at the mail server level. As documented in RFC 2505 [2], earlier methods included disabling the ability to relay from non-local domains to non-local domains and checking the existence of the sender's purported domain. However, spammers soon circumvented these methods, and since then, server operators and spammers have been engaged in something of a technological war, each side attempting to outthink the other.

The Rhetoric

One concept that emerged from this "war" is that of the DNS blacklist, or DNSBL. This is a DNS [3] server which, rather than mapping "host names" to data such as IP addresses, returns a "good" or "bad" response to a query on a certain piece of data (typically an IP address).

The thinking behind these blacklists was that, by collecting data on IP addresses used by spammers and distributing that data throughout the Internet via DNS, server operators could automatically refuse mail from known and newly discovered spammers without the trouble of manually reconfiguring their servers. Previously, each server operator would have to maintain an individual filter configuration, or at the least periodically download a list of "bad" IP addresses from a central site; both approaches required effort on the part of the operator and caused a delay between the appearance of a new spammer and the filter update.
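The lookup that such a blacklist answers follows a simple convention: the octets of the client's IPv4 address are reversed and prepended to the blacklist's zone, and the presence or absence of an A record is the verdict. The following Python sketch illustrates that convention; it is not taken from any particular mail server. The function names, the reply strings, and the choice to treat every failed lookup as "not listed" are assumptions made for illustration, and dnsbl.example.org is a placeholder zone.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL query name: reversed IPv4 octets, then the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for non-IPv4 input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the blacklist zone has an A record for the address.

    Assumption: any failed lookup counts as "not listed". This covers
    NXDOMAIN but also timeouts and server failures, so a real server
    might want to distinguish those cases.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


def rcpt_reply(ip: str, zone: str, resolve=socket.gethostbyname) -> str:
    """Choose an SMTP reply to RCPT; the reply wording is illustrative only."""
    if is_listed(ip, zone, resolve):
        return "550 Rejected: client address is listed"
    return "250 OK"


if __name__ == "__main__":
    # Shows the query name only; no DNS traffic is generated here.
    print(dnsbl_query_name("10.1.2.3", "dnsbl.example.org"))
```

With the placeholder zone above, dnsbl_query_name("10.1.2.3", "dnsbl.example.org") yields 3.2.1.10.dnsbl.example.org. The resolver is passed as a parameter so that the decision logic can be exercised without network access.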
In order to use these DNS blacklists, a server operator would configure the server to perform a DNS query to the blacklist when receiving a message. For example, if a DNSBL were hosted at dnsbl.example.org, an SMTP server receiving a connection from the IP address 10.1.2.3 might perform a query for an A record on 3.2.1.10.dnsbl.example.org; an NXDOMAIN response would indicate that the IP address was "good", while an A record (perhaps 127.0.0.2) would indicate that the IP address was "bad". If the query returned a "bad" response, the SMTP server might return a 550 error code to any RCPT command from the client, thus preventing the spam from reaching any recipients on that server.

The blacklists were typically run by individuals or small groups, which accepted reports of spam, or in some cases of open relay servers (which had the potential to be used for spamming), and added the offending IP addresses to the blacklist.

However, some blacklist operators argued that simply blocking the offending IP address itself was not enough. They argued that ISPs should take a more proactive role in stopping spammers, and that in order to pressure ISPs into taking action, DNSBLs should list entire subnets, not single IP addresses; certainly, they said, when ISPs started getting complaints from users about not being able to send mail, they would realize the seriousness of the situation.

As spammers realized they could no longer simply rent servers to send spam from, they started using "botnets"--networks of client machines, typically Windows PCs, infected with a virus which allowed remote control of the machine. Spammers would send a command to the botnet to send out a given message to a given list of addresses, and each machine would start sending spam from its own IP address, just as if the proper user of the machine were sending ordinary mail.
With the spam source spread out among thousands of IP addresses, DNSBLs would be unable to stay up to date, much as an IP-based DDoS (distributed denial of service) attack cannot be blocked by simply filtering on IP addresses.

In response to this, blacklist operators began creating lists of IP addresses that appeared to be used by ordinary home users, the users most often targeted by the botnet viruses. Such "dialup" or dynamic IP addresses, they argued, had no business sending mail to any SMTP server other than the one operated by their ISP; blocking such IP addresses would make botnets effectively useless for sending spam.

The Reality

While it cannot be denied that DNS blacklists are effective to an extent in blocking spam, the reality is that not only are they no panacea, but they cause significant collateral damage in the form of false positives: not infrequently, users find themselves unable to send mail because their mail server has been listed on one of these blacklists.

One reason for these false positives is simply the rate of change of the Internet environment. An IP address used for spamming and blacklisted yesterday might today be used by a completely innocuous web server, but it may take a significant amount of time for the server operator's request for removal from the blacklist to be answered. Even if removal requests were handled automatically, there would necessarily be a delay in removing the IP address, to prevent spammers from immediately removing themselves and then continuing to spam.

The second, and far more serious, problem is that of arbitrary policies instituted and decisions made by blacklist operators, such as the policy mentioned above of listing an entire netblock when a single spam source is found, or the addition of IP addresses to a dialup blacklist because they subjectively "look like" dynamic addresses.
It is unlikely that any blacklist operators are intentionally trying to harm innocent users, but whether due to ignorance or to ego, the fact remains that many blacklists' policies result in completely unrelated third parties being unable to send mail to many domains. The author himself has had significant difficulties in getting his personal server (with a static IP address, using a business-class ISP account) removed from a certain dynamic address blacklist [4]. Blacklist operators hold that they are not blocking mail, simply providing lists for mail server operators to use as they wish (see e.g. [5]), but this is no comfort to the user who finds himself unable to send mail.

An analysis of mail received by the author's mail server demonstrates the limited effectiveness of, and collateral damage caused by, these blacklists. The analysis was performed on 1,374 messages (including message send attempts, defined as SMTP connections to the server in which the SMTP client began a transaction with a MAIL command but did not complete the transaction before disconnecting) received over a period of approximately two weeks. The chart below shows the classification of these messages by the mail filter actually used on the server, as well as five of the most commonly used DNS blacklists [6] individually and together.

The classifications are:

   True Pos (true positive): Correctly identified as spam.
   False Pos (false positive): Nonspam misidentified as spam.
   False Neg (false negative): Spam misidentified as nonspam.
   True Neg (true negative): Correctly identified as nonspam.

The filters are:

   Actual: The filter actually used on the mail server: a content-based, manually-updated filter, based on spam which the author has previously seen.
   BL1: sbl-xbl.spamhaus.org [7].
   BL2: dnsbl.sorbs.net [8].
   BL3: cbl.abuseat.org [9].
   BL4: combined.njabl.org [10].
   BL5: list.dsbl.org [11].
   BL-All: The union of BL1, BL2, BL3, BL4, and BL5, treating a positive result from any blacklist as a positive result for the union.

          | Actual | BL1  | BL2  | BL3  | BL4  | BL5  | BL-All
----------+--------+------+------+------+------+------+--------
True Pos  |   1133 |  647 |  550 |  517 |  437 |  257 |    927
False Pos |      0 |    3 |    5 |    3 |    3 |    1 |      6
False Neg |     30 |  516 |  613 |  646 |  726 |  906 |    236
True Neg  |    211 |  208 |  206 |  208 |  208 |  210 |    205

As can be seen from the chart, only one of the blacklists (BL1) managed to correctly flag more than 50% of incoming spam (647 of 1,163 spam messages, or about 56%), and every blacklist flagged at least one of the incoming nonspam messages. Using all five blacklists in combination yields a higher true positive rate, catching about 80% of spam (927 of 1,163), but also flags nearly 3% of nonspam messages (6 of 211) as spam. While some people (perhaps including the blacklist operators) might consider 3% an acceptable rate of loss, many others, including the author, do not.

The custom filter actually used on the server, on the other hand, correctly filtered 97% of the spam (1,133 of 1,163), without any false positives. (The filter has generated false positives on rare occasion, but none were seen during the test period; the long-term average false positive rate is estimated to be less than 0.1%.) The filter consists of regular expressions derived from spam seen in the past, and the author spends an average of about 40 seconds a day classifying unfiltered spam and updating the filter.

What Can Be Done?

While the inherently trusting nature of SMTP makes it difficult to eliminate spam within the SMTP framework, there are some actions that can be taken to reduce the side effects of spam filtering.

The "dialup" or dynamic IP address blacklists mentioned earlier are often created by manually analyzing DNS PTR records for IP address blocks and including those that subjectively "look like" dynamic addresses.
However, an incorrect decision can result in non-dynamic addresses getting listed (as in the author's case), potentially leaving the affected users with no recourse. One solution to this issue would be to include a record in the DNS, perhaps a TXT record, indicating whether an address is dynamically assigned or not; alternatively, a standard format for PTR records of dynamic and static addresses, such as "aaa-bbb-ccc-ddd.dynamic.example.com" or "aaa-bbb-ccc-ddd.static.example.com", could be chosen. Either of these would allow blacklist operators to generate more accurate lists and minimize improper blocking.

However, any such agreement will have little effect unless blacklist operators assume more responsibility for the role they have undertaken in spam control. While legal issues are outside the scope of this document, anyone who publishes a blacklist for the purpose of reducing spam has a moral responsibility to ensure that the blacklist does not affect third parties. In the past, it may have been acceptable to turn a blind eye to inaccuracies and collateral damage, or to disclaim responsibility because the actual blocking was happening elsewhere; today, with the ubiquity and importance of E-mail, it certainly is not. Better behavior on the part of blacklist operators would have a significant positive effect on improper blocking.

As technology advances, it is also necessary to revisit the original assumptions that led to the implementation of "hard" mail filters--that is, mail filters that block mail from reaching downstream recipients, as opposed to e.g. adding or modifying a header--in the first place.
In the mid-to-late 1990s, when spam first became a significant problem, mail client software had only simple filtering abilities; most modern software includes sophisticated spam filters which learn from previous spam and are far more effective than simple server-based blacklists, particularly since they can adjust for the types of spam received by individual users. Metered data connections are also becoming less common as high-bandwidth, always-on links become available to more users, and the increased bandwidth of such connections makes the time spent downloading spam negligible in most cases. Furthermore, web-based mail services such as Hotmail and Gmail, as well as similar systems used by individual service providers, have reduced the necessity for users to download mail in the traditional sense. Given these advances, the necessity of blocking mail at the server has waned, and providers of E-mail services should refrain from implementing such "hard" blocks except in extreme circumstances.

Security Considerations

Security considerations are not directly discussed in this memo. However, (lack of) security on Internet-connected machines, particularly personal computers, is a significant contributing factor in the prevalence of spam, and hence the use of DNS blacklists; improvements in security of such machines would help reduce the impetus for mail server administrators to use such drastic measures against spam.

Author's Address

Andrew Church
achurch@achurch.org
(Please see [4] before sending mail.)

References

[1] Press release from the Office of Fair Trading, United Kingdom, February 22, 2005 (quoting statistics provided by Brightmail, Inc.). http://www.oft.gov.uk/News/Press+releases/2005/34-05.htm

[2] Anti-Spam Recommendations for SMTP MTAs, G. Lindberg, February 1999. RFC 2505.

[3] Domain Name System, P. Mockapetris, November 1987. STD 13.
[4] Personal communication; see http://achurch.org/sorbs.html

[5] njabl.org Information for End Users, http://www.njabl.org/enduser.html

[6] "Blacklists Compared", J. Makey, June 11, 2005. http://www.sdsc.edu/~jeff/spam/cbc.html . The five entities with the most hits (excluding t1.dnsbl.net.au, which is an aggregator of blacklists provided by several other entities) were chosen.

[7] Spamhaus Block List and Exploits Block List. http://www.spamhaus.org/

[8] Spam and Open-Relay Blocking System. http://www.sorbs.net/

[9] Composite Blocking List. http://cbl.abuseat.org/

[10] Not Just Another Bogus List. http://www.njabl.org/

[11] Distributed Sender Blackhole List. http://dsbl.org/

Full Copyright Statement

Copyright (C) The Internet Society (2005).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.