syslog Working Group                                         R. Gerhards
Internet-Draft                                          December 2, 2003
Expires: June 1, 2004

                          The syslog Protocol
                   draft-ietf-syslog-protocol-00.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on June 1, 2004.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document describes the syslog protocol. The syslog protocol has been used throughout the years to convey event notifications. This document describes a layered architecture for a backwards-compatible and easily extensible syslog protocol.

Gerhards                  Expires June 1, 2004                  [Page 1]

Internet-Draft             The syslog Protocol             December 2003

Table of Contents

   1.      Introduction
   2.      Definitions and Architecture
   3.      Transport Layer Protocol
   4.      Required syslog Format
   4.1     PRI Part
   4.2     HEADER Part
   4.2.1   TIMESTAMP
   4.2.2   HOSTNAME
   4.2.3   TAG
   4.3     MSG
   4.3.1   COOKIE
   4.3.2   PAYLOAD
   4.4     TRAILER
   4.5     Examples
   5.      Security Considerations
   5.1     Packet Parameters
   5.2     Message Authenticity
   5.3     Authentication Problems
   5.4     Message Forgery
   5.5     Sequenced Delivery
   5.5.1   Single Source to a Destination
   5.5.2   Multiple Sources to a Destination
   5.5.3   Multiple Sources to Multiple Destinations
   5.6     Replaying
   5.7     Reliable Delivery
   5.8     Message Integrity
   5.9     Message Observation
   5.10    Message Prioritization and Differentiation
   5.11    Misconfiguration
   5.12    Forwarding Loop
   5.13    Load Considerations
   5.14    Denial of Service
   5.15    Covert Channels
   6.      IANA Considerations
   7.      Authors and Working Group Chair
   8.      Acknowledgements
           References
           Author's Address
           Intellectual Property and Copyright Statements

1. Introduction

The informational document RFC 3164 [19] describes a general format of syslog messages as they have been seen on the wire, and as the original author intended. Over time that format has been modified and extended in several ways, usually to meet new requirements.

This document describes the semantics of the syslog protocol and provides a standard format for all syslog messages that adheres to the original intent of the message format but also contains enhancements that are consistent with many of the innovations put forth through the years.

Some components have been adjusted in this document to allow for backwards compatibility. However, the greatest benefit to automated log message parsers and people reading the log messages will come from adherence to the newly defined fields laid out in this document.

The adherence of syslog messages to the format defined in this document may present problems to older syslog message receivers even though efforts were made to keep the message format similar to the format described in RFC 3164 [19]. People deploying devices that generate messages following the protocol described here should verify that those messages do not present problems to their existing syslog receivers.

2. Definitions and Architecture

The following definitions will be used in this document.

A machine that can generate a message will be called a "device".
A machine that can receive the message and forward it to another machine will be called a "relay".
A machine that receives the message and does not relay it to any other machines will be called a "collector". This has been commonly known as a "syslog server".
Any device or relay will be known as the "sender" when it sends a message.
Any relay or collector will be known as the "receiver" when it receives the message.

There are machines that both receive messages and forward them to another machine AND generate syslog messages themselves. An example of this may be an application that operates as a syslog relay as one service while at the same time running other services. These services may be monitored by the same application, generating new syslog messages. Such a machine acts both as a relay AND a device. This case is specifically mentioned because the role a machine plays has special significance, for example on formatting. A machine as described here may thus have two separate configurations, one for each of the machine's operation modes.

The architecture of the devices may be summarized as follows:

Senders send messages to relays or collectors with no knowledge of whether the receiver is a collector or a relay.

Senders may be configured to send the same message to multiple receivers.

Relays may send all or some of the messages that they receive to a subsequent relay or collector. In the case where they do not forward all of their messages, they are acting as both a collector and a relay. In the following diagram, these devices will be designated as relays.

Relays may also generate their own messages and send them on to subsequent relays or collectors. In that case the relay is acting as a device. These devices will also be designated as relays in the following diagram.

The architectures shown in Diagram 1 are all valid; the first one is known to be the most prevalent. Other arrangements of these examples are also acceptable. As noted above, in the following diagram relays may pass along all or some of the messages that they receive, along with messages that they generate internally.

   +------+         +---------+
   |Device|---->----|Collector|
   +------+         +---------+

   +------+         +-----+         +---------+
   |Device|---->----|Relay|---->----|Collector|
   +------+         +-----+         +---------+

   +------+     +-----+            +-----+     +---------+
   |Device|-->--|Relay|-->--..-->--|Relay|-->--|Collector|
   +------+     +-----+            +-----+     +---------+

   +------+         +-----+         +---------+
   |Device|---->----|Relay|---->----|Collector|
   |      |-\       +-----+         +---------+
   +------+  \
              \      +-----+         +---------+
               \-->--|Relay|---->----|Collector|
                     +-----+         +---------+

   +------+         +---------+
   |Device|---->----|Collector|
   |      |-\       +---------+
   +------+  \
              \      +-----+         +---------+
               \-->--|Relay|---->----|Collector|
                     +-----+         +---------+

   +------+         +-----+            +---------+
   |Device|---->----|Relay|---->-------|Collector|
   |      |-\       +-----+         /--|         |
   +------+  \                     /   +---------+
              \      +-----+      /
               \-->--|Relay|-->--/
                     +-----+

   +------+          +-----+               +---------+
   |Device|---->-----|Relay|---->----------|Collector|
   |      |-\        +-----+            /--|         |
   +------+  \                         /   +---------+
              \      +--------+       /
               \     |+------+|      /
                \-->-||Relay ||->---/
                     |+------+|    /
                     ||Device||->-/
                     |+------+|
                     +--------+

   Diagram 1. Some Possible syslog Architectures

3. Transport Layer Protocol

This document DOES NOT specify or enforce a specific transport layer protocol. Instead, it describes the format of a syslog message in a transport layer independent way.

As long as there are no transport mappings defined, the relevant parts of RFC 3164 should be used for UDP-based transport and the relevant parts of RFC 3195 should be used for TCP-based transport.

Transport mappings being defined MUST ensure that a message formatted according to this document can be transported unaltered over the mapping. If the mapping needs to perform temporary transformations, it must be guaranteed that the message received at the final destination is an exact copy of the message sent from the initial originator.
This is vital because otherwise cryptographic verifiers (like signatures) would be broken.

4. Required syslog Format

The traditional format of a syslog message is defined in RFC 3164. There is a concept in that document that anything delivered to UDP port 514 will be accepted as a valid syslog message. However, this document REQUIRES a defined format for syslog messages.

The full format of a syslog message seen on the wire has three discernable parts. The first part is called the PRI, the second part is the HEADER, and the third part is the MSG. The total length of the packet MUST be 1024 bytes or less. There is no minimum length of the syslog message, although a syslog packet with no contents is worthless and SHOULD NOT be transmitted.

The definitions of the fields are slightly changed in this document from RFC 3164. While the format described in RFC 3164 is correct for packet formation, the Working Group evaluating this work determined that it would be better if the TAG field were to become a part of the HEADER part rather than the CONTENT part. While IETF documentation does not allow the specification of an API, people developing code to adhere to this specification have found it helpful to think about the parts in this format.

The syslog message has the following ABNF [14] definition:

   ; The general syslog message format
   SYSLOG-MSG = PRI HEADER MSG [TRAILER]
   HEADER     = TIMESTAMP SP HOSTNAME SP TAG [SP]
   TRAILER    = [CR] LF
   PRI        = "<" PRIVALUE ">"
   PRIVALUE   = (0..191) / (1*3DIGIT "," 0..7)
         ; the alternate form is based on Albert Mietus comments...
   HOSTNAME   = 1*64PRINTUSASCII ; a FQDN,
         ;adopt international domain names later (too political issue,
         ; takes too long)?
   TAG         = static-id [full-dyn-id] [":"]   ; 64 chars max
   static-id   = 1*VISUAL
   full-dyn-id = "[" proc-id [thread-sep thread-id] "]"
   proc-id     = 1*ALFANUM          ; recommended: number
   thread-sep  = VISUAL / %d58      ; recommended: ",", or ':', or '.'
   thread-id   = 1*ALFANUM          ; recommended: number
   VISUAL      = (%d33-57/%d59-126) ; all but SP and ":"

   TIMESTAMP      = TIMESTAMP-3164 / TIMESTAMP-3339

   TIMESTAMP-3164 = MON-3164 SP DAY-3164 SP TIME-3164
   MON-3164       = %d74.97.110  / ; "Jan"
                    %d70.101.98  / ; "Feb"
                    %d77.97.114  / ; "Mar"
                    %d65.112.114 / ; "Apr"
                    %d77.97.121  / ; "May"
                    %d74.117.110 / ; "Jun"
                    %d74.117.108 / ; "Jul"
                    %d65.117.103 / ; "Aug"
                    %d83.101.112 / ; "Sep"
                    %d79.99.116  / ; "Oct"
                    %d78.111.118 / ; "Nov"
                    %d68.101.99    ; "Dec"
   DAY-3164       = (SP 1..9) / (10..31)
   TIME-3164      = time-hour ":" time-minute ":" time-second-nl

   TIMESTAMP-3339 = full-date "T" full-time

   date-fullyear  = 4DIGIT
   date-month     = 2DIGIT  ; 01-12
   date-mday      = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                            ; month/year
   time-hour      = 2DIGIT  ; 00-23
   time-minute    = 2DIGIT  ; 00-59
   time-second    = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap
                            ; second rules
   time-second-nl = 2DIGIT  ; 00-59 no leap seconds!
   time-secfrac   = "." 1*DIGIT
   time-numoffset = ("+" / "-") time-hour ":" time-minute
   time-offset    = "Z" / time-numoffset

   partial-time   = time-hour ":" time-minute ":" time-second
                    [time-secfrac]
   full-date      = date-fullyear "-" date-month "-" date-mday
   full-time      = partial-time time-offset

   COOKIE        = "@#" (COOKIE-IANA / COOKIE-VENDOR / COOKIE-EXPER)
   COOKIE-IANA   = COOKIE-ID                   ; IANA-Assigned
   COOKIE-EXPER  = "X-" COOKIE-ID              ; experimental
   COOKIE-VENDOR = "V-" VENDORURI "-" COOKIE-ID
   VENDORURI     = 1*64PRINTUSASCII
         ; a valid domain name owned by the vendor
   COOKIE-ID     = 4*6PRINTUSASCII
         ; MUST NOT begin with "V-" or "X-"

   MSG           = (COOKIE SP [COOKIE-PARAMS SP] MSG) / PAYLOAD
   PAYLOAD       = *((%d32-126) / (%d128-254))
         ; VALID UTF-8 String of PRINTABLE characters
   COOKIE-PARAMS = *(PRINTUSASCII / %d32)
         ; parameters defined by the extension using the cookie

   LF           = %d10
   CR           = %d13
   SP           = %d32
   PRINTUSASCII = %d33-126
   ALFANUM      = %d48-57 / %d65-90 / %d97-122

4.1 PRI Part

The PRI part MUST have three, four, or five characters and will be bound with angle brackets as the first and last characters. The PRI part starts with a leading "<" ('less-than' character), followed by a number, which is followed by a ">" ('greater-than' character). The code set used in this part MUST be seven-bit ASCII in an eight-bit field as described in RFC 2234 [14]. These are the ASCII codes as defined in "USA Standard Code for Information Interchange" ANSI.X3-4.1968 [3]. In this, the "<" character is defined as the Augmented Backus-Naur Form (ABNF) %d60, and the ">" character has ABNF value %d62. The number contained within these angle brackets is known as the Priority value and represents both the Facility and Severity as described below. The Priority value consists of one, two, or three decimal integers (ABNF DIGITS) using values of %d48 (for "0") through %d57 (for "9").

The Facilities and Severities of the messages are defined in RFC 3164 and are repeated here.

   Numerical             Facility
      Code

        0             kernel messages
        1             user-level messages
        2             mail system
        3             system daemons
        4             security/authorization messages (note 1)
        5             messages generated internally by syslogd
        6             line printer subsystem
        7             network news subsystem
        8             UUCP subsystem
        9             clock daemon (note 2)
       10             security/authorization messages (note 1)
       11             FTP daemon
       12             NTP subsystem
       13             log audit (note 1)
       14             log alert (note 1)
       15             clock daemon (note 2)
       16             local use 0  (local0)
       17             local use 1  (local1)
       18             local use 2  (local2)
       19             local use 3  (local3)
       20             local use 4  (local4)
       21             local use 5  (local5)
       22             local use 6  (local6)
       23             local use 7  (local7)

   Table 1. syslog Message Facilities

Note 1 - Various operating systems have been found to utilize Facilities 4, 10, 13 and 14 for security/authorization, audit, and alert messages which seem to be similar.

Note 2 - Various operating systems have been found to utilize both Facilities 9 and 15 for clock (cron/at) messages.

Each message Priority also has a decimal Severity level indicator. These are described in the following table along with their numerical values.

   Numerical         Severity
      Code

        0       Emergency: system is unusable
        1       Alert: action must be taken immediately
        2       Critical: critical conditions
        3       Error: error conditions
        4       Warning: warning conditions
        5       Notice: normal but significant condition
        6       Informational: informational messages
        7       Debug: debug-level messages

   Table 2. syslog Message Severities

The Priority value is calculated by first multiplying the Facility number by 8 and then adding the numerical value of the Severity. For example, a kernel message (Facility=0) with a Severity of Emergency (Severity=0) would have a Priority value of 0. Also, a "local use 4" message (Facility=20) with a Severity of Notice (Severity=5) would have a Priority value of 165. In the PRI part of a syslog message, these values would be placed between the angle brackets as <0> and <165> respectively.
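
As a minimal sketch of the calculation just described (an illustration only, not part of the protocol definition), the following Python fragment builds and splits a traditional PRI part. The function names encode_pri and decode_pri are hypothetical; the facility range 0..23 is taken from Table 1 and the upper bound 191 from the ABNF. The alternate comma-separated form is deliberately not handled.

```python
def encode_pri(facility, severity):
    """Build a traditional PRI part such as '<165>'."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be in 0..23 (Table 1)")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in 0..7 (Table 2)")
    # Priority value = Facility * 8 + Severity
    return "<{}>".format(facility * 8 + severity)


def decode_pri(pri):
    """Split a traditional PRI part into (facility, severity)."""
    if not 3 <= len(pri) <= 5 or pri[0] != "<" or pri[-1] != ">":
        raise ValueError("PRI must be 3 to 5 characters in angle brackets")
    digits = pri[1:-1]
    # Only the ASCII digits %d48-57 are allowed between the brackets.
    if not all(ch in "0123456789" for ch in digits):
        raise ValueError("Priority value must consist of ASCII digits")
    value = int(digits)
    if value > 191:
        raise ValueError("Priority value must be in 0..191")
    # Reverse the calculation: quotient is Facility, remainder is Severity.
    return divmod(value, 8)
```

With these helpers, encode_pri(20, 5) yields "<165>" and decode_pri("<165>") yields (20, 5), matching the "local use 4" / Notice example in the text.
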
The only time a value of "0" follows the "<" is for the Priority value of "0". Otherwise, leading "0"s MUST NOT be used.

An alternate form for the PRI part has been recommended by Albert Mietus. It is described in the ABNF above as a basis for discussion. In this form, the facility and severity are split across two fields, with the facility defined to have up to a thousand values (0..999). This would enable newer emitters to provide more detailed information on which subsystem caused the syslog message. In this form, the facility is followed by a comma and then the severity as a single digit. An example of a message from the (traditional) mail subsystem with error severity would be "<2,3>".

4.2 HEADER Part

The HEADER part contains a time stamp, an indication of the hostname or IP address of the device, and a string indicating the source of the message. The HEADER part of the syslog packet MUST contain visible (printing) characters. The code set used MUST also be seven-bit ASCII in an eight-bit field like that used in the PRI part. In this code set, the only allowable characters are the ABNF VCHAR values (%d33-126) and spaces (SP value %d32).

The HEADER contains three fields called the TIMESTAMP, the HOSTNAME, and the TAG fields. The TIMESTAMP immediately follows the trailing ">" from the PRI part, and a single space character MUST follow each of the TIMESTAMP and HOSTNAME fields. HOSTNAME contains the hostname of the device, as the device knows itself. If the device does not have a hostname, then HOSTNAME contains its own IP address. If a device has multiple IP addresses, it has usually been seen to use the IP address from which the message is transmitted. An alternative to this behavior has also been seen. In that case, a device may be configured to send all messages using a single source IP address regardless of the interface from which the message is sent.
This provides a single consistent HOSTNAME for all messages sent from a device.

4.2.1 TIMESTAMP

The TIMESTAMP field is either a timestamp as defined in RFC 3164, denoted as TIMESTAMP-3164, or a formalized timestamp as taken from RFC 3339 [21], denoted as TIMESTAMP-3339. A sender SHOULD format the timestamp as a TIMESTAMP-3339. A receiver MUST accept both formats.

The formal definition for both timestamp formats can be found in the ABNF above.

Note well: RFC 3339 makes allowances for multiple syntaxes for a timestamp to be used in various cases. This document mandates a single syntax. The primary characteristics of TIMESTAMP-3339 used in this document are as follows.

o  the "T" and "Z" characters in this syntax MUST be upper case.

o  usage of the "T" character is mandatory. It MUST NOT be replaced by any other character (like a SP character).

o  the sender SHOULD include time-secfrac (fractional seconds) if its clock accuracy permits.

o  the entire length of the TIMESTAMP-3339 field MUST NOT exceed 32 characters.

Two samples of this format are:

   1985-04-12T23:20:50.52Z
   1985-04-12T19:20:50.52-04:00

The first represents 20 minutes and 50.52 seconds after the 23rd hour of April 12th, 1985 in UTC. The second represents the same time but expressed in the Eastern US timezone (daylight saving time being observed).

A single space character MUST follow the TIMESTAMP field.

Receivers parsing the date format SHOULD check if the TIMESTAMP is a TIMESTAMP-3339. The "T" character at position 11 of the string can be used as a rough indication for this. However, the receiver MUST NOT rely solely on the "T" character but MUST also parse the other data for validity. A receiver SHOULD check for TIMESTAMP-3339 format first and, if unsuccessful, assume a TIMESTAMP-3164.
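
The detection order described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative parser: the function name classify_timestamp and its return convention are hypothetical, the input is assumed to be the text that immediately follows the PRI part, and calendar checks such as the number of days in a given month are left out.

```python
import re

# TIMESTAMP-3339 as restricted by this document: upper-case "T" and "Z",
# optional fractional seconds, mandatory time offset.
_TS_3339 = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})T"
    r"([0-9]{2}):([0-9]{2}):([0-9]{2})(?:\.[0-9]+)?"
    r"(?:Z|[+-]([0-9]{2}):([0-9]{2}))"
)

# TIMESTAMP-3164: "Mmm dd hh:mm:ss"; a day below 10 is padded with a space.
_TS_3164 = re.compile(
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"(?: [1-9]|[12][0-9]|3[01]) "
    r"([0-9]{2}):([0-9]{2}):([0-9]{2})"
)


def classify_timestamp(text):
    """Return ("3339", ts), ("3164", ts) or (None, None).

    `text` is assumed to start right after the closing ">" of the PRI.
    """
    # Try TIMESTAMP-3339 first; it contains no space, so it ends at the
    # first SP and must not be longer than 32 characters.
    token = text.split(" ", 1)[0]
    if len(token) <= 32:
        m = _TS_3339.fullmatch(token)
        if m:
            _year, month, day, hour, minute, second = map(int, m.groups()[:6])
            offset_ok = m.group(7) is None or (
                int(m.group(7)) <= 23 and int(m.group(8)) <= 59
            )
            if (1 <= month <= 12 and 1 <= day <= 31 and hour <= 23
                    and minute <= 59 and second <= 60 and offset_ok):
                return "3339", token
    # Fall back to TIMESTAMP-3164, which is always 15 characters long
    # and does not permit a leap second.
    candidate = text[:15]
    m = _TS_3164.fullmatch(candidate)
    if m:
        hour, minute, second = map(int, m.groups())
        if hour <= 23 and minute <= 59 and second <= 59:
            return "3164", candidate
    # Neither format matched: no further formats are tried.
    return None, None
```

The sketch validates the whole field rather than relying on the "T" at position 11, and it accepts a seconds value of 60 only in the TIMESTAMP-3339 branch.
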
If it is also not a TIMESTAMP-3164 format, the receiver MUST NOT try any other timestamp format but consider the TIMESTAMP to be invalid or missing from the received syslog message.

If a relay receives a TIMESTAMP-3164, it SHOULD forward the message with a TIMESTAMP-3164 but MAY reformat it to a TIMESTAMP-3339 if configured to do so. Relays should be aware that the TIMESTAMP-3339 may be longer than the TIMESTAMP-3164 and that a replacement of the TIMESTAMP-3164 with a TIMESTAMP-3339 may increase the length of the entire packet beyond 1024 bytes.

If a relay receives a TIMESTAMP-3339, it MUST forward the message with a TIMESTAMP-3339. It MUST NOT reformat it to a TIMESTAMP-3164.

There is one minor, but potentially important, difference in the representation of seconds between a TIMESTAMP-3164 and a TIMESTAMP-3339. In a TIMESTAMP-3339, the seconds part may have the value "60" to indicate a leap second. No such value is permitted in the seconds part of a TIMESTAMP-3164. If a relay receives a value of "60" in a TIMESTAMP-3339 AND is configured to rewrite this to a TIMESTAMP-3164 (for whatever reason), it MUST represent the seconds part with the value "59" and otherwise leave the TIMESTAMP time unmodified. The author believes this handling causes the least confusion and the fewest potential code errors. It should occur seldom enough to not cause any issue at all.

4.2.2 HOSTNAME

The HOSTNAME field contains an indication of the originator of the message in one of four formats: only the hostname, the hostname and domainname, the IPv4 address, or the IPv6 address. The preferred value is the hostname and domainname in the format specified in STD 13 [5]. This format will be referred to in this document as HOSTNAME-STD13.

If only the hostname is used, the HOSTNAME field MUST contain the hostname only of the device as specified in STD 13.
This format is discouraged but provides for legacy compatibility with the format described in RFC 3164. This format will be referred to in this document as HOSTNAME-3164. In this format, the Domain Name MUST NOT be included in the HOSTNAME field.

If the IPv4 address is used, it MUST be shown in the dotted decimal notation as used in STD 13 [6], and will be referred to as HOSTNAME-IPV4. If an IPv6 address is used, any valid representation used in RFC 2373 [15] MAY be used and will be referred to as HOSTNAME-IPV6.

A single space character MUST also follow the HOSTNAME field.

4.2.3 TAG

The TAG is a string of visible (printing) characters excluding SP, that MUST NOT exceed 64 characters in length. The first occurrence of a SP (space) will terminate the TAG field, but is not part of it. It is RECOMMENDED to terminate the TAG with a colon (':'), which, if used, is part of the TAG.

The TAG is used to denote the sender of the message. It MUST be in the syntax shown in the ABNF above.

A typical example of a TAG is (without the quotes):

   "/path/to/PROGNAME[123,456]:"

Another example (from VMS) is (without the quotes):

   "DKA0:[MYDIR.SUBDIR1.SUBDIR2]MYFILE.TXT;1[123,456]"

Please note that in this example, "DKA0:[MYDIR.SUBDIR1.SUBDIR2]MYFILE.TXT;1" is the static-id while "[123,456]" is still the full-dyn-id. This shows that a receiver must be prepared for special characters like '[' to be present inside the static part. As a note to implementors: the beginning of the full-dyn-id is not the first but the LAST occurrence of '[' inside the tag, and this ONLY if the tag ends in either "]" or "]:". If these conditions are not met, the '[' is part of the static-id.

Systems that use both process IDs and thread IDs SHOULD fill both the proc-id and the thread-id. For other systems it is RECOMMENDED to use the proc-id only.
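
The note to implementors above can be sketched in Python as follows. This is a minimal illustration, assuming the TAG has already been isolated from the HEADER; the function name split_tag and its return shape are hypothetical. The sketch applies only the "last '[' and only if the tag ends in ']' or ']:'" rule and does not validate the proc-id, thread-sep, or thread-id against the ABNF.

```python
def split_tag(tag):
    """Split a TAG into (static_id, dyn_id, has_colon).

    dyn_id is the text between the brackets of the full-dyn-id,
    or None when the TAG carries no full-dyn-id.
    """
    has_colon = tag.endswith(":")
    body = tag[:-1] if has_colon else tag
    if body.endswith("]"):
        # The full-dyn-id starts at the LAST '[' of the tag.
        start = body.rfind("[")
        if start > 0:  # the static-id must not be empty
            return body[:start], body[start + 1:-1], has_colon
    # Conditions not met: any '[' belongs to the static-id.
    return body, None, has_colon
```

For the VMS example above, split_tag returns the static-id "DKA0:[MYDIR.SUBDIR1.SUBDIR2]MYFILE.TXT;1" and the dynamic part "123,456", because only the last '[' opens the full-dyn-id.
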

Receivers SHOULD, to be consistent with the format described in RFC 3164, accept TAGs that terminate with a single colon, without a space following it. Then the colon is both the last character of that TAG and the field separator with the next field (MSG).

No specific format inside the tag is required. However, an emitter SHOULD use a consistent tag value.

4.3 MSG

The MSG part contains an optional COOKIE and the actual PAYLOAD. If the MSG part contains a COOKIE, optional cookie parameters follow the cookie, and after those comes the original message.

NOTE WELL: MSG is a recursive structure. As such, a MSG may contain a COOKIE and another MSG which in turn also contains a COOKIE and yet another MSG. To clarify things, we call a MSG that does not contain any COOKIE the actual PAYLOAD (see below). There is no hard limit on how many levels of COOKIE/MSG constructs are used inside a single message. The only limit is that the whole construct must fit within the syslog size limitation. Practically, however, it is recommended to limit nesting to those cases where it is absolutely necessary and there is good reasoning for it.

NOTE WELL: there is an inherent risk with the nesting of COOKIES: As specified, a receiver must assume a valid cookie only if it knows the full COOKIE, including COOKIE-ID. If it does not know that specific cookie, the cookie MUST be treated as ordinary data, thus turning the message from MSG to PAYLOAD. As such, no parsing for further COOKIES in PAYLOAD is allowed nor desired. In consequence, COOKIES nested in deeper layers will not be seen and processed. The author believes this potential shortcoming is acceptable. If a receiver tried to parse inner-layer cookies, this would potentially conflict with existing syslog data as well as introduce a number of potential bugs, as the format and thus validity of the outer level cookie is not known.
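
The stop-at-the-first-unknown-cookie rule can be sketched in Python as follows. This is a simplified illustration under loud assumptions: the function name peel_cookies is hypothetical, the set of known cookies is supplied by the caller, cookies are matched by exact string comparison, and every known cookie is assumed to carry no COOKIE-PARAMS (how parameters are delimited depends on the extension that defines the cookie and is not modelled here).

```python
def peel_cookies(msg, known_cookies):
    """Strip known outer cookies from a MSG.

    Returns (cookies, payload): the recognized cookies from the
    outermost layer inwards, and the remaining text, which is
    treated as PAYLOAD and not searched any further.
    """
    cookies = []
    rest = msg
    while rest.startswith("@#"):
        # A cookie contains no SP, so it ends at the first space.
        token, sep, remainder = rest.partition(" ")
        if not sep or token not in known_cookies:
            # Unknown (or malformed) cookie: everything that is left,
            # including any deeper cookies, is ordinary data.
            break
        cookies.append(token)
        rest = remainder
    return cookies, rest
```

For example, with a hypothetical known set {"@#X-TEST1"}, the text "@#X-OTHER @#X-TEST1 hello" is returned unchanged as PAYLOAD, because the unknown outer cookie hides the known inner one.
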
It is assumed that if the outer layer cookie is not known, the receiver will most probably not understand the inner-layer cookie either. To minimize this risk, more generic cookies should be at the outer layers and more specific cookies on the inner layers.

4.3.1 COOKIE

The COOKIE is an optional part of the message. It is used to identify optional features inside a syslog message.

A cookie can either be assigned via IANA (COOKIE-IANA), be experimental but intended to be vendor-neutral (COOKIE-EXPER) or be vendor-specific (COOKIE-VENDOR).

If there is a cookie present, it MUST start with the sequence "@#" at the first character of the payload block. The COOKIE-ID must be at least 4 characters, so that the overall minimum COOKIE size is 6 characters. These requirements make it highly unlikely that a string sequence in an "old-style" syslog message will be misinterpreted as a cookie. However, there is a slight chance that this may happen. It may also be deliberately done as part of a malicious message. As such, an implementation MUST NOT rely solely on the "@#" sequence to judge whether it is a valid cookie or not. It MUST parse the whole cookie to see if it is known or not and then act accordingly. Unknown cookies should be treated as ordinary data and not be acted upon. This implies that an implementation MUST NOT attempt to find further cookies inside the MSG.

4.3.1.1 COOKIE-ID

The COOKIE-ID uniquely identifies the cookie. It is a string of 4 to 6 printable characters. It is case-sensitive. The 4 character minimum size requirement is introduced to reduce the likelihood that a cookie is mistakenly recognized.

The COOKIE-ID alone MUST NOT be used to detect a cookie. It can, however, be handy for human discussion.

The COOKIE-ID MUST NOT begin with "V-" or "X-".

4.3.1.2 IANA and Experimental Cookies

These are vendor-neutral cookies.
IANA-assigned cookie values have undergone the consensus process and are well-defined. Experimental cookies are for vendor-neutral functionality that is currently in development. A syslog extension that is expected to be vendor-specific SHOULD NOT use experimental cookies; it SHOULD use vendor-specific cookies instead. As a rule of thumb, only cookies used for functionalities discussed on IETF mailing lists should be treated as vendor-neutral.

When new experimental cookies are designed, they SHOULD use a COOKIE-ID not yet assigned by IANA. This will facilitate the later transition, as the experimental COOKIE-ID could eventually be used as an IANA COOKIE-ID once consensus has been reached and the discussed functionality is mature enough.

4.3.1.3 Vendor specific Cookies

These cookie values are reserved for vendor extensions.

A general issue with namespaces and vendor extensions is that multiple vendors may (accidentally) decide to use the same value as their extension ID. To avoid this, we prefix each vendor-specific COOKIE-ID with a VENDORURI. This should be a long-lasting Internet domain name that the vendor owns.

An example: "Example Inc" has two software products called "GreatestSyslog" and "EvenGreaterSyslog". It owns the domains "example.com", "GreatestSyslog.example" and "EvenGreaterSyslog.example". Now, "Example Inc" decides to introduce a new cookie for exclusive use by "EvenGreaterSyslog". It is recommended that the company's main domain is used for building the vendor cookie. If they used the COOKIE-ID "MyTag", the complete vendor cookie would look like this: "@#V-example.com-MyTag".

The VENDORURI is case-insensitive. However, it is good practice to send it consistently in the same case. It SHOULD be sent in lower case.

If cookies are nested, vendor cookies MUST be used on the innermost layer only.

4.3.2 PAYLOAD

The PAYLOAD part contains the details of the message.
This has traditionally been a freeform message that gives some detailed information of the event. The PAYLOAD part of the syslog packet MUST contain visible (printing) characters. The code set traditionally and most often used has been seven-bit ASCII in an eight-bit field. In this code set, the only allowable characters are the ABNF VCHAR values (%d33-126) and spaces (SP value %d32). However, no indication of the code set used within the PAYLOAD is required, nor is it expected. Other code sets MAY be used as long as the characters used in the MSG part are exclusively visible characters and spaces similar to those described above. For example, the UTF-8 encoding (RFC 2279 [13]) may be used. The selection of a code set used in the PAYLOAD part SHOULD be made with thoughts of the intended receiver. A message containing characters in a code set that cannot be viewed or understood by a recipient will yield no information of value to an operator or administrator looking at it. As such, it is strongly RECOMMENDED to use a standard mechanism to indicate the code set used to the recipient.

4.4 TRAILER

The trailer is an optional part that is being introduced to preserve compatibility with legacy syslog implementations. It is observed behavior that some emitters send a trailer after the MSG part. Their syslog message is otherwise well-formed. In order to provide backwards compatibility, receivers MUST accept messages with trailers as valid syslog messages. A relay receiving a trailer MUST NOT reformat the message to remove the trailer.

An emitter SHOULD NOT include the trailer inside the syslog message. It MAY be configured to include it if the receiver it is sending to requires a trailer (which is unlikely).

4.5 Examples

The following examples are given.
Example 1

   <34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8

In this example, as it was originally described in RFC 3164, the PRI part is "<34>". In this work, however, the HEADER part consists of the TIMESTAMP, the HOSTNAME, and the TAG fields. The TIMESTAMP is "Oct 11 22:14:15 ", the HOSTNAME is "mymachine ", and the TAG value is "su:". The CONTENT field is " 'su root' failed for lonvick...". The CONTENT field starts with a leading space character in this case.

Example 2

   <165>Aug 24 05:34:00 10.1.1.1 myproc[10]:%% It's time to make the do-nuts. %% Ingredients: Mix=OK, Jelly=OK # Devices: Mixer=OK, Jelly_Injector=OK, Frier=OK # Transport: Conveyer1=OK, Conveyer2=OK # %%

In this example, the PRI part is <165> denoting that it came from a locally defined facility (local4) with a severity of Notice. The HEADER part has a proper TIMESTAMP field in the message. A relay will not modify this message before sending it. The HOSTNAME is an IPv4 address and the TAG field is "myproc[10]:". The MSG part starts with "%% It's time to make the do-nuts. %% Ingredients: Mix=OK, ..." this time without a leading space character.

5. Security Considerations

Many security considerations were described in the informational RFC 3164 [19] and are repeated here for completeness. Additional considerations are also included in this section.

5.1 Packet Parameters

The message length must not exceed 1024 bytes. Various problems may result if a device sends out messages with a length greater than 1024 bytes. In this case, as with all others, it is best to be conservative with what you send but liberal in what you receive, and accept more than 1024 bytes.

Similarly, the fragmentation features introduced in this document may be misused to overrun a receiver or a log analyzer with a gigantic message.
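The size limits discussed in this section might be enforced along the following lines. This is a minimal sketch, not part of the protocol: the names check_outgoing and BoundedReassembler are hypothetical, the 8192-byte receiver cap is an arbitrary assumption, and the fragments are treated as opaque byte strings because the fragment format itself is not modelled here.

```python
from typing import List, Optional

MAX_SEND_BYTES = 1024          # limit stated in Section 5.1
MAX_REASSEMBLED_BYTES = 8192   # assumption: a receiver-chosen cap, not from this document


def check_outgoing(message: bytes) -> bytes:
    """Refuse to emit a message longer than the 1024-byte limit."""
    if len(message) > MAX_SEND_BYTES:
        raise ValueError(
            f"message is {len(message)} bytes; limit is {MAX_SEND_BYTES}"
        )
    return message


class BoundedReassembler:
    """Collects fragments of one message and drops it if it grows past a cap."""

    def __init__(self, limit: int = MAX_REASSEMBLED_BYTES) -> None:
        self.limit = limit
        self.parts: List[bytes] = []
        self.size = 0
        self.dropped = False

    def add(self, fragment: bytes) -> bool:
        """Append a fragment; return False once the message has been dropped."""
        if self.dropped:
            return False
        if self.size + len(fragment) > self.limit:
            # Oversize data is dropped: discard everything buffered so far.
            self.parts.clear()
            self.size = 0
            self.dropped = True
            return False
        self.parts.append(fragment)
        self.size += len(fragment)
        return True

    def result(self) -> Optional[bytes]:
        """Return the reassembled message, or None if it was dropped."""
        return None if self.dropped else b"".join(self.parts)
```

Rejecting an oversize outgoing message, rather than silently truncating it, is one possible design choice; an implementation could equally well truncate or fragment.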
Any process reassembling fragmented messages MUST properly check the maximum re-assembled message size it supports. Oversize data SHOULD be dropped.

Similarly, senders must rigidly enforce the correctness of the message body. It is hoped that all devices adopt the newly defined HOSTNAME-STD13 and TIMESTAMP-3339 formats. However, until that happens, receivers may become upset at the receipt of messages with these fields. Knowledgeable humans should review the senders and receivers to ensure that no problems arise from this.

Finally, receivers must not malfunction if they receive syslog messages containing characters other than those specified in this document.

5.2 Message Authenticity

The syslog delivery mechanism does not strongly associate the message with the message sender. The receiver of that packet will not be able to ascertain that the message was indeed sent from the reported sender, or if the packet was sent from another device. It should be noted here that the message receiver does not need to verify that the HOSTNAME in the HEADER part matches the name of the IP address contained in the Source Address field of the IP packet.

5.3 Authentication Problems

One possible consequence of this behavior is that a misconfigured machine may send syslog messages to a collector representing itself as another machine. The administrative staff may become confused because the status of the supposed sender of the messages may not be accurately reflected in the received messages. The administrators may not be able to readily discern that there are two or more machines representing themselves as the same machine.

It should also be noted that in some cases the value placed in the HOSTNAME field of the HEADER part might have only local significance, and even that may be ephemeral.
If the device had obtained an IP address from a DHCP pool, then any association between an identifier and an actual source would not always hold true. The inclusion of a fully qualified domain name in the CONTENT may give the administrators the best chance of identifying the source of each message if it can always be associated with an IP address or if it can always be associated with a unique machine.

5.4 Message Forgery

Malicious exploits of this behavior have also been noted. An attacker may transmit syslog messages (either from the machine from which the messages are purportedly sent or from any other machine) to a collector. In one case, an attacker may hide the true nature of an attack amidst many other messages. As an example, an attacker may start generating forged messages indicating a problem on some machine. This may get the attention of the system administrators who will spend their time investigating the alleged problem. During this time, the attacker may be able to compromise a different machine, or a different process on the same machine.

Additionally, an attacker may generate false syslog messages to give untrue indications of status or of events. As an example, an attacker may stop a critical process on a machine, which may generate a notification of exit. The attacker may subsequently generate a forged notification that the process had been restarted. The system administrators may accept that misinformation and not verify that the process had indeed been restarted.

5.5 Sequenced Delivery

As a general rule, the forensics of a network anomaly rely upon reconstructing the sequence of events. In a perfect world, the messages would be received on the syslog collector in the order of their generation from the other devices and anyone looking at these records would have an accurate picture of the sequence of events. Unfortunately, the syslog process and protocol do not ensure ordered delivery.
This section details some of the problems that may be encountered from this. Strict adherence to the use of TIMESTAMP-3339 will help administrators to place received messages in their proper order.

5.5.1 Single Source to a Destination

The syslog records are usually presented (placed in a file, displayed on the console, etc.) in the order in which they are received. This is not always in accordance with the sequence in which they were generated. As they are transported across an IP network, some out of order receipt should be expected. This may lead to some confusion, as messages may be received that would indicate that a process has stopped before it was started. This may be somewhat rectified if the originating process had timestamped or numbered each of the messages before transmission. In this, the sending device should utilize an authoritative time source. It should be remembered, however, that not all devices are capable of receiving time updates, and not all devices can timestamp their messages.

5.5.2 Multiple Sources to a Destination

In syslog, there is no concept of unified event numbering. Single devices are free to include a sequence number within the CONTENT but that can hardly be coordinated between multiple devices. In such cases, multiple devices may report that each one is sending message number one. Again, this may be rectified somewhat if the sending devices utilize a timestamp from an authoritative source in their messages. As has been noted, however, even messages from a single device to a single collector may be received out of order. This situation is compounded when there are several devices configured to send their syslog messages to a single collector. Messages from one device may be delayed so the collector receives messages from another device first even though the messages from the first device were generated before the messages from the second.
If there is no timestamp or coordinated sequence number, then the messages may be presented in the order in which they were received, which may give an inaccurate view of the sequence of actual events.

5.5.3 Multiple Sources to Multiple Destinations

The plethora of configuration options available to the network administrators may further skew the perception of the order of events. It is possible to configure a group of devices to send the status messages, or other informative messages, to one collector, while sending messages of relatively higher importance to another collector. Additionally, the messages may be sent to different files on the same collector. If the messages do not contain timestamps from the source, it may be difficult to order the messages if they are kept in different places. An administrator may not be able to determine if a record in one file occurred before or after a record in a different file. This may be somewhat alleviated by placing marking messages with a timestamp into all destination files. If these have coordinated timestamps, then there will be some indication of the time of receipt of the individual messages.

5.6 Replaying

Without any sequence indication or timestamp, messages may be recorded and replayed at a later time. An attacker may record a set of messages that indicate normal activity of a machine. At a later time, that attacker may remove that machine from the network and replay the syslog messages to the collector. Even with a TIMESTAMP field in the HEADER part, an attacker may record the packets and could simply modify them to reflect the current time before retransmitting them. The administrators may find nothing unusual in the received messages and their receipt would falsely indicate normal activity of the machine.
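To illustrate the ordering problem of Section 5.5, the following minimal sketch merges records kept in different files into a single sequence by their RFC 3339 timestamps. It assumes, purely for illustration, that each record has already been split into a (timestamp, text) pair; the function names are hypothetical. As Section 5.6 points out, the result is only as trustworthy as the timestamps themselves, which an attacker may forge.

```python
from datetime import datetime, timezone


def parse_rfc3339(stamp: str) -> datetime:
    """Parse an RFC 3339 timestamp into an aware datetime normalized to UTC."""
    # Older Python versions do not accept a trailing "Z", so map it to +00:00.
    if stamp.endswith(("Z", "z")):
        stamp = stamp[:-1] + "+00:00"
    return datetime.fromisoformat(stamp).astimezone(timezone.utc)


def merge_by_timestamp(*sources):
    """Merge (timestamp, text) records from several files into one ordered list."""
    records = [record for source in sources for record in source]
    # sorted() is stable, so records with equal timestamps keep their input order.
    return sorted(records, key=lambda record: parse_rfc3339(record[0]))


# Two hypothetical destination files on the same collector.
status_file = [("2003-10-11T22:14:17+00:00", "process restarted")]
alert_file = [("2003-10-11T23:14:15+01:00", "process exited")]

for stamp, text in merge_by_timestamp(status_file, alert_file):
    print(stamp, text)
```

Normalizing to UTC before comparing matters here: the second record carries a later wall-clock time but an earlier instant, so a plain string comparison would order the two records incorrectly.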
5.7 Reliable Delivery

As there is no mechanism within either the syslog process or the protocol to ensure delivery, and since the underlying transport is UDP, some messages may be lost. They may either be dropped through network congestion, or they may be maliciously intercepted and discarded. The consequences of the drop of one or more syslog messages cannot be determined. If the messages are simple status updates, then their non-receipt may either not be noticed, or it may cause an annoyance for the system operators. On the other hand, if the messages are more critical, then the administrators may not become aware of a developing and potentially serious problem. Messages may also be intercepted and discarded by an attacker as a way to hide unauthorized activities.

RFC 3195 may be used for the reliable delivery of all syslog messages.

5.8 Message Integrity

Besides being discarded, syslog messages may be damaged in transit, or an attacker may maliciously modify them. In the case of a packet containing a syslog message being damaged, there are various mechanisms built into the link layer as well as into the IP [9] and UDP protocols which may detect the damage. An intermediary router may discard a damaged IP packet [10]. Damage to a UDP packet may be detected by the receiving UDP module, which may silently discard it. In any case, the original contents of the message will not be delivered to the collector. Additionally, if an attacker is positioned between the sender and collector of syslog messages, they may be able to intercept and modify those messages while in-transit to hide unauthorized activities.

5.9 Message Observation

While there are no strict guidelines pertaining to the event message format, most syslog messages are generated in human readable form with the assumption that capable administrators should be able to read them and understand their meaning.
Neither the syslog protocol nor the syslog application has mechanisms to provide confidentiality of the messages in transit. In most cases passing clear-text messages is a benefit to the operations staff if they are sniffing the packets off the wire. The operations staff may be able to read the messages and associate them with other events seen from other packets crossing the wire to track down and correct problems. Unfortunately, an attacker may also be able to observe the human-readable contents of syslog messages. The attacker may then use the knowledge gained from those messages to compromise a machine or do other damage.

5.10 Message Prioritization and Differentiation

While the processes that create the messages may signify the importance of the events through the use of the message Priority value, there is no distinct association between this value and the importance of delivery of the packet. As an example of this, consider an application that generates two event messages. The first is a normal status message but the second could be an important message denoting a problem with the process. This second message would have an appropriately higher Severity value associated with the importance of that event. If the operators had configured that both of these messages be transported to a syslog collector then they would, in turn, be given to UDP for transmission. Under normal conditions, no distinction would be made between them and they would be transmitted in their order.

Again, under normal circumstances, the receiver would accept syslog messages as they are received. If many devices are transmitting normal status messages, but one is transmitting an important event message, there is no inherent mechanism within the syslog protocol to prioritize the important message over the other messages.

On a case-by-case basis, device operators may find some way to associate the different levels with the quality of service identifiers.
As an example, the operators may elect to define some linkage between syslog messages that have a specific Priority value with a specific value to be used in the IPv4 Precedence field [9], the IPv6 Traffic Class octet [11], or the Differentiated Services field [12]. In the above example, the operators may have the ability to associate the status message with normal delivery while associating the message indicating a problem with a high reliability, low latency queue as it goes through the network. This would have the effect of prioritizing the essential messages before the normal status messages. Even with this hop-by-hop prioritization, this queuing mechanism could still lead to head of line blocking on the transmitting device as well as buffer starvation on the receiving device if there are many near-simultaneous messages being sent or received. This behavior is not unique to syslog but is endemic to all operations that transmit messages serially.

There are security concerns for this behavior. Head of line blocking of the transmission of important event messages may relegate the conveyance of important messages behind less important messages. If the queue is cleared appropriately, this may only add seconds to the transmission of the important message. On the other hand, if the queue is not cleared, then important messages may not be transmitted. Also at the receiving side, if the syslog receiver is suffering from buffer starvation due to large numbers of messages being received near-simultaneously, important messages may be dropped indiscriminately along with other messages. While these are problems with the devices and their capacities, the protocol security concern is that there is no prioritization of the relatively more important messages over the less important messages.
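The linkage between Priority values and quality of service identifiers described in Section 5.10 could look roughly like the sketch below. The severity-to-DSCP table is an operator-chosen assumption and is not prescribed by this document; the function names are hypothetical. The PRI arithmetic (facility multiplied by 8, plus severity) follows RFC 3164. Whether the marking is honored depends entirely on the operating system and on the network between sender and collector.

```python
import socket

# Assumption: an operator-chosen mapping from syslog severity (0 = Emergency,
# 7 = Debug) to a DSCP code point. Nothing in this document prescribes it.
DSCP_BY_SEVERITY = {0: 46, 1: 46, 2: 46, 3: 18, 4: 18}
DEFAULT_DSCP = 0  # best effort for Notice, Informational and Debug


def split_pri(pri: int):
    """Split a PRI value into (facility, severity); PRI = facility * 8 + severity."""
    if not 0 <= pri <= 191:
        # Values above 191 are reserved (see Section 6).
        raise ValueError("PRI outside the 0..191 range")
    return divmod(pri, 8)


def tos_byte_for(pri: int) -> int:
    """Return the TOS / Traffic Class octet for a message with this PRI."""
    _, severity = split_pri(pri)
    dscp = DSCP_BY_SEVERITY.get(severity, DEFAULT_DSCP)
    return dscp << 2  # DSCP occupies the upper six bits of the octet


def send_marked(message: bytes, pri: int, address) -> None:
    """Send one UDP datagram over IPv4 with the marking chosen for its severity."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte_for(pri))
        sock.sendto(message, address)
```

With this table, the message of Example 1 (PRI 34, severity Critical) would be marked for expedited handling, while the message of Example 2 (PRI 165, severity Notice) would travel as best effort. As noted above, such marking does not remove head of line blocking or buffer starvation on the end devices.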
5.11 Misconfiguration

Since there is no control information distributed about any messages or configurations, it is wholly the responsibility of the network administrator to ensure that the messages are actually going to the intended recipient. Cases have been noted where devices were inadvertently configured to send syslog messages to the wrong receiver. In many cases, the inadvertent receiver may not be configured to receive syslog messages and it will probably discard them. In certain other cases, the receipt of syslog messages has been known to cause problems for the unintended recipient [13]. If messages are not going to the intended recipient, then they cannot be reviewed or processed.

5.12 Forwarding Loop

As it is shown in Figure 1, machines may be configured to relay syslog messages to subsequent relays before reaching a collector. In one particular case, an administrator found that he had mistakenly configured two relays to forward messages with certain Priority values to each other. When either of these machines either received or generated that type of message, it would forward it to the other relay. That relay would, in turn, forward it back. This cycle did cause degradation to the intervening network as well as to the processing availability on the two devices. Network administrators must take care to not cause such a death spiral.

5.13 Load Considerations

Network administrators must take the time to estimate the appropriate size of the syslog receivers. An attacker may perform a Denial of Service attack by filling the disk of the collector with false messages. Placing the records in a circular file may alleviate this but that has the consequence of not ensuring that an administrator will be able to review the records in the future. Along this line, a receiver or collector must have a network interface capable of receiving all messages sent to it.
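One way to approximate the circular file mentioned above is a set of size-limited, rotating files. The following minimal sketch uses the RotatingFileHandler class from the Python standard library; the logger name, the helper names, and the byte budget are assumptions made for illustration only.

```python
import logging
import logging.handlers
import os


def make_bounded_logger(path: str, max_bytes: int, backups: int) -> logging.Logger:
    """Return a logger that keeps at most (backups + 1) files of about max_bytes."""
    logger = logging.getLogger("collector-sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def disk_usage(directory: str) -> int:
    """Total size in bytes of all files in the directory."""
    return sum(
        os.path.getsize(os.path.join(directory, name))
        for name in os.listdir(directory)
    )
```

A flood of false messages can then consume only a bounded amount of disk space, but, as the text above warns, the oldest records are overwritten and may no longer be available for later review.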
Administrators and network planners must also critically review the network paths between the devices, the relays, and the collectors. Generated syslog messages should not overwhelm any of the network links.

5.14 Denial of Service

As with any system, an attacker may just overwhelm a receiver by sending more messages to it than can be handled by the infrastructure or the device itself. Implementors should attempt to provide features that minimize this threat, such as only receiving syslog messages from known IP addresses.

5.15 Covert Channels

Nothing in this protocol attempts to eliminate covert channels. Indeed, the unformatted message syntax in the packets could be very amenable to sending embedded secret messages. In fact, just about every aspect of syslog messages lends itself to the conveyance of covert signals. For example, a collusionist could send odd and even PRI values to indicate Morse Code dashes and dots.

6. IANA Considerations

This document upholds the Facilities and Severities listed in RFC 3164 [19]. Those values range from 0 to 191. This document also instructs the IANA to reserve all other possible values of the Severities and Facilities above the value of 191 and to distribute them via the consensus process as defined in RFC 2434 [16].

IANA must also maintain a registry of cookie values.

7. Authors and Working Group Chair

The working group can be contacted via the mailing list:

   syslog-sec@employees.org

The current Chair of the Working Group may be contacted at:

   Chris Lonvick
   Cisco Systems
   Email: clonvick@cisco.com

The author of this draft is:

   Rainer Gerhards
   Email: rgerhards@hq.adiscon.com
   Phone: +49-9349-92880
   Fax: +49-9349-928820
   Adiscon GmbH
   Mozartstrasse 21
   97950 Grossrinderfeld
   Germany

8. Acknowledgements

The authors wish to thank Chris Lonvick, Jon Callas, Andrew Ross, Albert Mietus, Anton Okmianski and all other people who commented on various versions of this proposal.

References

[1] National Institute of Standards and Technology, "Digital Signature Standard", FIPS PUB 186-1, December 1998.

[2] National Institute of Standards and Technology, "Secure Hash Standard", FIPS PUB 180-1, April 1995.

[3] American National Standards Institute, "USA Code for Information Interchange", ANSI X3.4, 1968.

[4] Menezes, A., van Oorschot, P. and S. Vanstone, "Handbook of Applied Cryptography", CRC Press, 1996.

[5] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.

[6] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.

[7] Eastlake, D., Crocker, S. and J. Schiller, "Randomness Recommendations for Security", RFC 1750, December 1994.

[8] Malkin, G., "Internet Users' Glossary", RFC 1983, August 1996.

[9] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[10] Oehler, M. and R. Glenn, "HMAC-MD5 IP Authentication with Replay Prevention", RFC 2085, February 1997.

[11] Krawczyk, H., Bellare, M. and R. Canetti, "HMAC: Keyed-Hashing for Message Authentication", RFC 2104, February 1997.
[12] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[13] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 2279, January 1998.

[14] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.

[15] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 2373, July 1998.

[16] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

[17] Callas, J., Donnerhacke, L., Finney, H. and R. Thayer, "OpenPGP Message Format", RFC 2440, November 1998.

[18] Blumenthal, U. and B. Wijnen, "User-based Security Model (USM) for version 3 of the Simple Network Management Protocol (SNMPv3)", RFC 2574, April 1999.

[19] Lonvick, C., "The BSD Syslog Protocol", RFC 3164, August 2001.

[20] New, D. and M. Rose, "Reliable Delivery for syslog", RFC 3195, November 2001.

[21] Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, July 2002.

[22] Schneier, B., "Applied Cryptography Second Edition: protocols, algorithms, and source code in C", 1996.

Author's Address

   Rainer Gerhards
   EMail: rgerhards@hq.adiscon.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11.
Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.