Network Working Group                                      Julian Onions
INTERNET DRAFT                                                 Nexor Ltd
                                                       February 17, 1995


                     How to be a Bad EMail Citizen

1.  Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are valid for a maximum of six months and may be
updated, replaced, or obsoleted by other documents at any time. (The
file 1id-abstracts.txt on nic.ddn.mil describes the current status of
each Internet Draft.) It is not appropriate to use Internet Drafts as
reference material or to cite them other than as a "work in progress".

This draft is known as draft-onions-822-mailproblems-00.

2.  Abstract

The internet consists of many hosts and many implementations of each
protocol suite. There are no formal tests or approval mechanisms
associated with membership of the internet, and therefore levels of
conformance to the various standards vary widely.

This document intends to describe some of the common problems, mistakes
and errors that are made in electronic mail. Most of them are easily
avoidable, and some guidance on what to do in each case is given here.
Some of these guidelines are pragmatic, some are mandated by other
standards, and others are religious.

3.  Introduction

There are various documents around the internet that define the way
mail should behave, what is mandatory, what is optional and what is
forbidden. Adherence to these standards across implementations is at
best patchy, and with no overseeing body the only enforcement of the
standards is peer pressure and possible lack of service.
Onions                    Expires Aug 30, 1995                  [Page 1]

4.  Scope

This document restricts itself to the standards defined in RFC-821
(SMTP), RFC-822, RFC-1123 (Host Requirements), RFC-1521 (MIME) and
RFC-1651 (SMTP Extensions). Currently other documents are not
considered.

5.  Issues concerning SMTP

5.1.  The RSET Command

RFC-821 is not specific about exactly what the RSET verb resets. This
has apparently not been a problem in the past because of the simplicity
of the protocol. The publication of extensions to the SMTP protocol,
with additional commands and state information, makes a more precise
definition desirable. The definition provided should not constrain any
existing RFC-821 implementation since it is consistent with both the
current practice and the only two plausible interpretations.

RSET is to be interpreted by SMTP servers as clearing state information
present in a session. In particular, it eliminates the effect of any
prior MAIL FROM commands, any DATA, and any delivery addresses. It
resets the server's state to "not a mail transaction". This implies
that the server is in the state after the HELO and before the MAIL
verb.

RSET has been interpreted by some SMTP servers as requiring that a new
HELO command be sent after RSET is acknowledged. Other servers assume
that the previous HELO is not reset. Servers SHOULD accept a HELO
command subsequent to RSET without special comment, overriding a
previous one if necessary. Servers MUST NOT require a HELO command
after a RSET.

The description above summarizes the current situation with SMTP
implementations based on a series of experiments. No implementation has
been identified that rejects a second HELO, but it would not be
surprising to find one.

5.2.  Duplication of single state verbs.

Whilst some of the SMTP state-inducing verbs may be repeated an
arbitrary number of times (such as RCPT for multi-destination), other
verbs (such as MAIL) may only be issued once per transaction.
If a second occurrence of such a state-inducing verb is detected, a
server MAY either accept it, overriding earlier information, or reject
it as an out-of-sequence command with a "503 bad sequence of commands"
code. A client sending more than one of these commands within a mail
transaction MUST be prepared to send a RSET and start over, or to send
QUIT and abandon the session, if 503 is received in this case. Clients
SHOULD, if possible, behave in a way that avoids this situation.

The issues above do not arise in the normal case of multiple successful
message transmissions in the same session, since each successful
message completion (i.e., server receipt of DATA, the message,
CR LF . CR LF, and then sending a positive completion reply) results in
terminating a mail transaction. Clients SHOULD NOT send RSET after
receipt of a 250 response after DATA and the message; servers MUST
reset their states after sending that 250 response and MUST NOT require
clients to send RSET before the next MAIL FROM command.

5.3.  Behavior with unrecognized verbs.

While it is not quite explicit, RFC-821 appears to expect that, if a
verb is not recognized by the receiver, it will reject the command with
a "permanent error", 5yz, code, presumably 500 (Syntax error).
Similarly, it appears to specify that, if the sender receives such a
code, it must either abandon the mail message (sending QUIT or RSET,
presumably) or do something else involving the same or a different
verb; it may not simply ignore the 5yz error code and pretend it was a
2yz (or 354) code. This specification depends on that behavioral model.

Consistent with RFC-821, we expect that existing SMTP servers will
reply with a return code of 500 (Syntax error) when any unfamiliar verb
is received.
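
As an illustration only, the server behaviour described in sections 5.1
to 5.3 can be sketched as a small state holder in Python. The class
name SmtpSession, its method names and the reply texts are assumptions
made for this sketch and are not taken from RFC-821. Rejecting a
repeated MAIL with 503 is only one of the two choices a server is
permitted to make; accepting and overriding is equally valid.

```python
class SmtpSession:
    """Toy model of per-connection SMTP server state (hypothetical names)."""

    def __init__(self):
        self.helo = None            # HELO identity; deliberately survives RSET
        self._clear_transaction()

    def _clear_transaction(self):
        # Back to "not a mail transaction": no sender, no recipients.
        self.sender = None
        self.recipients = []

    def handle(self, line):
        """Process one command line and return the reply line."""
        verb, _, arg = line.strip().partition(" ")
        verb = verb.upper()
        if verb == "HELO":
            # A repeated HELO is accepted and overrides the earlier one.
            self.helo = arg
            return "250 hello"
        if verb == "RSET":
            # Clears the transaction only; no new HELO is demanded.
            self._clear_transaction()
            return "250 reset"
        if verb == "MAIL":
            if self.sender is not None:
                # One permitted choice; the other is to override silently.
                return "503 bad sequence of commands"
            self.sender = arg
            return "250 sender ok"
        if verb == "RCPT":
            if self.sender is None:
                return "503 bad sequence of commands"
            self.recipients.append(arg)   # RCPT may repeat freely
            return "250 recipient ok"
        if verb == "QUIT":
            return "221 closing"
        # Unfamiliar verb: permanent error, session state left untouched.
        return "500 syntax error, command unrecognized"

    def message_accepted(self):
        """Call after the message body has been received and stored."""
        self._clear_transaction()   # the client need not send RSET
        return "250 message accepted"
```

A client that receives the 503 above would send RSET and start the
transaction again, or send QUIT.
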
The clarifications above (sections 5.1 to 5.3) should probably have
made it into RFC-1123, but some of the issues -- particularly the fact
that anyone could ever have believed that anything else (such as simply
ignoring 5xx codes) was permitted -- have emerged only in the process
of this investigation. Nonetheless, this clarification is believed to
be consistent with existing usage and implementations of SMTP.

5.4.  Behaviour with eight-bit data

RFC-821 together with RFC-822 is unambiguous in this respect. Unless an
extension to RFC-821 is in force for the mail transaction, eight-bit
data may not be sent. Period.

This point just needs emphasising. It is present in the original
documents, but not spelled out.

5.5.  Error reports with eight-bit data

Some implementations will return the original message as part of a
delivery report. Care needs to be taken when the reason for failure was
that eight-bit data was present. Otherwise it is possible to construct
an illegal eight-bit message as an error report to an eight-bit
message. As error reports and messages cannot be easily distinguished
in RFC-821, all messages (including error messages) appear as standard
messages, and therefore need to be correct RFC-822 messages.

5.6.  Rejection of SMTP connections due to DNS failure.

There are a number of SMTP implementations that either reject, or can
be configured to reject, SMTP connections if the calling host is not
registered in the DNS. This is seen by some as breaking the spirit of
RFC-1123, and by others as a useful get-out-of-jail card.

Regardless of whether this is a good idea or a bad one, the fact
remains that this is practiced by some sites. Implementors are
therefore encouraged to use backup MX routing in the case where a
connection succeeds but no data is received before the connection is
dropped.
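
A sending MTA might implement the suggested fallback roughly as in the
following Python sketch. The names deliver, NoGreeting and the attempt
callable are hypothetical; the DNS lookup that produces the MX list and
the SMTP dialogue itself are deliberately left out, so this shows only
the retry order and the treatment of a silently dropped connection.

```python
class NoGreeting(Exception):
    """The connection was accepted but closed before any data arrived."""


def deliver(mx_records, attempt):
    """Try each MX host in preference order until one takes the message.

    mx_records: iterable of (preference, hostname) pairs; lowest is tried
                first.
    attempt:    hypothetical callable(hostname) that performs the delivery
                and raises OSError or NoGreeting when it fails.
    Returns the hostname that accepted the message, or None if every host
    failed and the message should be queued for a later retry.
    """
    for _preference, host in sorted(mx_records):
        try:
            attempt(host)
            return host
        except (OSError, NoGreeting):
            # Treat a dropped connection like any other failure and
            # move on to the next MX host.
            continue
    return None
```
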
This topic has been debated a number of times on the Internet with both
sides sticking to their views. There is no sense in continuing to try
and standardise this point. What a site will do with any internet
connection from any host eventually comes down to what the
administrator at that site decides. If they don't want to talk to a
given set of hosts, that may be their loss. With the increasing
emphasis on security though, the fact that a site advertises an MX or A
record in the DNS does not imply it will talk to all callers.

5.7.  EHLO commands

There are one or two servers that respond badly to EHLO commands. That
is, they either set themselves into inconsistent states, or else drop
the connection at once. The RFC is fairly clear that unknown commands
should be rejected but otherwise ignored.

A resilient client MAY detect that the EHLO caused the connection to
drop and immediately retry the connection with a HELO verb in its
place. Alternatively, it can be treated as a bad connection and
subsequent MX records tried if available. However, servers SHOULD NOT
drop the connection in response to an unknown verb.

6.  RFC-822 Issues

6.1.  Illegal format RFC-822 messages

Some implementations of RFC-821 check the message for adherence to
RFC-822 minimum requirements as the message is received. These are that
the message contains in the header a From field, a Date field and a
recipient field of some type. However, some user agents use RFC-821 as
a submission protocol and assume that messages will be made legal
RFC-822 as part of the submission process (as some MTAs already do
this). Implementations MAY therefore allow strictly illegal RFC-822
messages as data and make them legal by addition of new headers, or MAY
reject the message as illegal data.
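
The two permitted policies can be sketched with the Python standard
library as follows. The function names missing_fields and
accept_submission are hypothetical, treating To, Cc and Bcc as the
recipient fields is an assumption of this sketch, and the choice to
repair only a missing Date (and reject everything else) is a policy
picked purely for illustration.

```python
from email import message_from_string
from email.utils import formatdate

# Assumption: these count as "a recipient field of some type".
RECIPIENT_FIELDS = ("To", "Cc", "Bcc")


def missing_fields(msg):
    """Return the names of the minimum header fields that msg lacks."""
    missing = [name for name in ("From", "Date") if msg[name] is None]
    if not any(msg[name] is not None for name in RECIPIENT_FIELDS):
        missing.append("recipient")
    return missing


def accept_submission(raw, fix_up=True):
    """Parse raw message text; return a legal message, or None to reject.

    With fix_up=True a message lacking only a Date is made legal by
    adding one; any other defect, or fix_up=False, means rejection.
    """
    msg = message_from_string(raw)
    missing = missing_fields(msg)
    if not missing:
        return msg
    if fix_up and missing == ["Date"]:
        msg["Date"] = formatdate()   # current time with a numeric zone
        return msg
    return None
```
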
Some User Agents, particularly those on PCs, find it difficult to
determine an accurate time to provide a Date field, and therefore leave
it out. It is harmless enough to insert such a field when acting as a
submission channel, but inserting a Date midway through a multi-hop
delivery path is misleading and should be discouraged.

However, in practice it is difficult to distinguish the two modes in
which RFC-821 is used, so usually a blanket decision concerning all
transfers has to be made. What is really required is a submission
protocol tailored for this sort of behaviour that can take a partial
RFC-822 message and add the appropriate envelope bits.

6.2.  Received Lines

The syntax of the Received: lines in RFC-822 messages is reasonably
straightforward. It requires as a minimum a date stamp following a
semi-colon. Unfortunately some implementations cannot seem to generate
this. This can cause problems when gatewaying to other systems that
also have trace fields. This is seen as a good way to cause general
confusion when tracking messages.

When gatewaying or examining these elements, the invalid elements
should either be discarded or else the current time inserted to make
them legal. The illegal Received: lines can be changed to be
Orig-Received: to ensure the relayed message is now legal.

6.3.  Date fields.

Date fields are usually fairly standard, although there are
implementations that strike out with new and novel formats. However,
when it comes to the area of time zones there is little limitation in
the imagination of implementors. Normally time zones should be numeric
as these are unambiguous. It should be down to the user agent to
display the Date in a "pretty" format.

Just say NO to pretty, arbitrary timezones! All UAs should generate
numeric offsets for timezones.

6.4.  Resent- fields

RFC-822 allows the pseudo-forwarding of messages by amending the header
of a message to contain new recipients. This is done by adding headers
such as

    Resent-To: abc@domain.name
    Resent-Date: Sun, 1 Jan 1995 02:24 +0000
    Resent-From: xyz@foo.bar

It is not clear in RFC-822 whether, when resending a message, a
complete set of headers is required. The standard would seem to imply
that it is, but no grammar is present which mandates it. Therefore
implementations vary on how to treat this type of message.

Strict implementations will, on detection of a Resent- field, conclude
that this is a resent message, and therefore should be using the
Resent- versions of the fields as opposed to the standard forms. In
this case a message without a Resent-From, a Resent-Date and a Resent-
recipient field is illegal. It is assumed that the message has been
resent but with only a partially correct header.

Other implementations take the view that a Resent- field is a higher
weighted form of the original field. That is, a Resent-Date should be
used in preference to a Date field, but as long as a Date, From and
recipient field are present, with or without the Resent- prefix, the
message is legal.

The first view treats the Resent- fields as a new overriding SET of
headers, the second as individual replacements for fields. Either case
could be argued, as the original text is unclear.

For pragmatic reasons, and because it seems closer to the intent of
RFC-822 in this case, the Resent- fields should be taken as a set.
However implementations SHOULD allow the individual fields.

In practice this sort of forwarding is not very common, but does arise
from time to time.

7.  MIME issues.

MIME since its inception has allowed implementations of MTAs and UAs to
further the cause of havoc and generally increase entropy. The number
of ways that it is possible to get this specification wrong is truly
astounding!
In general an MTA can treat badly formatted MIME as a text/plain format
and punt the whole problem to the UA. The UA will then take one of a
number of views:

   a) It will crash and burn.

   b) It will complain the message is illegal and refuse to show it.

   c) It won't care and show you the message, warts and all.

   d) It will ignore the message, and you will never even know you
      have received the message.

The best approach is to be able to flag an error and then revert to
action c) above. This may upset some naive mail users (who seem to be
predominantly upper management and therefore dangerous to upset!).

7.1.  Badly formatted Content-Type: fields

Implementations have been known to produce lines of the form

    MIME-version: 1.0
    Content-Type: text

That is, a MIME type without the mandatory subtype. This is illegal as
a MIME header and means the content may be subject to
misinterpretation.

In these cases the most pragmatic course is to treat the message as
text/plain, regardless of what the Content-Type might indicate.
However, outright rejection of the message is also an option. (The
author feels a system that rejects every other such message may have
merits in forcing systems to be upgraded.)

7.2.  Multiple Content-Type fields

Messages may contain multiple Content-Type fields, sometimes containing
contradictory information. Where this happens this may again cause
contents to be misrepresented, or misprocessed. For instance:

    MIME-version: 1.0
    Content-Type: multipart/mixed; boundary="---"
    MIME-version: 1.0
    Content-Type: text/plain

This case should be treated in the same way as a badly formatted
Content-Type. If two Content-Type fields are present, and contain the
same information, that case MAY be treated as just one Content-Type
field.

7.3.  Badly structured multipart messages

Messages that contain fields such as

    Content-Type: multipart/mixed

have some great potential for causing indigestion in mail systems. The
missing boundary string means that although the message is split into
multiple parts, there is no way a process can reconstruct the message
in general. It is charitable to believe that these types of message
start out with good intentions, but lose their boundary markers
somewhere in flight. Whilst an intelligent human can scan the body part
and make an educated guess at what the separator is, this is not
generally possible for a program.

7.4.  Wrapped lines

Another interesting little problem is where a UA or MTA has helpfully
wrapped the text of the field to improve readability. Some interesting
examples are presented here.

    Content-Type: multipart/mixed; boundary="message
        -separator"

    Content-Type: multipart/mixed; boundary="abcdefghijklmno:
    boundary:fixed01"

The first case is debatably correct input, although few MTA/UAs will be
able to reconstruct the correct separator. The second case is illegal,
ambiguous and awkward to treat well. Why do people do this! The road to
hell is paved with good intentions. In both cases little should be done
to try and reconstruct the message without human help.

7.5.  MIME prologue and epilogue text

A number of systems and hand constructed messages put text into the
prologue and epilogue of MIME multipart messages. Whilst this is a neat
trick for allowing non-MIME UAs to inform the user why the message
appears as garbage, the prologue/epilogue does not really exist as part
of a message. Therefore when gatewaying or simply processing such
messages, these components may disappear. Alternatively they may appear
as new body parts after transformation. Therefore whilst you can do it,
don't be surprised if it fails to appear at the other end.

8.  Acknowledgements

This document represents a collection of the experiences and hard-won
battle scars from a large community of people. All implementors of SMTP
mail systems will have had some influence on this document. In
particular there are a number of points taken from the work done in the
SMTP extensions working group. This document is a summary of some of
the discussions, and other experiences. Some of this text is taken from
an earlier draft of the SMTP working group document.

9.  Security Considerations

Security considerations are not discussed in this memo.

10.  Editor's Address

    Julian Onions
    Nexor Ltd.
    PO Box 132,
    Nottingham,
    England.