MILE T. Cejka Internet-Draft P. Kacha Intended status: Informational CESNET Expires: May 4, 2016 November 1, 2015 MILE Differences between IODEF and IDEA draft-cejkat-mile-iodef-and-idea-00 Abstract This document summarizes differences between IODEF and IDEA data formats for description of computer security incidents. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on May 4, 2016. Copyright Notice Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Cejka & Kacha Expires May 4, 2016 [Page 1] Internet-Draft IODEF and IDEA November 2015 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1. IODEF . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2. IDEA . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Example Reports Represented in IODEF and IDEA . . . . . . . . 3 2.1. Minimal Example in IODEF . . . . . . . . . . . . . . . . 4 2.2. Minimal Example in IDEA . . . . . . . . . . . . . . . . . 4 2.3. Worm Example in IODEF . . . . . . . . . . . . . . . . . . 4 2.4. Worm Example in IDEA . . . . . . . . . . . . . . . . . . 6 2.5. Reconnaissance Example in IODEF . . . . . . . . . . . . . 7 2.6. Reconnaissance Example in IDEA . . . . . . . . . . . . . 9 2.7. Bot-Net Reporting Example in IODEF . . . . . . . . . . . 12 2.8. Bot-Net Reporting Example in IDEA . . . . . . . . . . . . 13 2.9. Watch List Reporting Example in IODEF . . . . . . . . . . 15 2.10. Watch List Reporting Example in IDEA . . . . . . . . . . 16 3. What is missing in IDMEF and IODEF . . . . . . . . . . . . . 18 4. Real World . . . . . . . . . . . . . . . . . . . . . . . . . 19 5. From XML to JSON . . . . . . . . . . . . . . . . . . . . . . 20 6. General observations . . . . . . . . . . . . . . . . . . . . 20 7. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 21 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 10. Security Considerations . . . . . . . . . . . . . . . . . . . 21 11. Informative References . . . . . . . . . . . . . . . . . . . 21 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 1. Introduction This document is a collection of examples of security incidents represented in Incident Object Description Exchange Format (IODEF) v2 [RFC5070-bis] and Intrusion Detection Extensible Alert (IDEA) [IDEA] data formats. IODEF uses XML technology meanwhile IDEA is listed as a possible data representation that uses JSON. This activity comes from an idea of JSON usage for representation of IODEF documents. IDEA is a data format for representation of network security incidents detected by Intrusion Detection Systems (IDS), honeypots etc. Since IDEA uses JSON, it can be a possible way for JSON-based IODEF. However, the motivation of this document is to show different approaches of representation of similar information. Examples that are listed in this document were taken from RFC5070-bis and coverted into IDEA. The analysis of IODEF and IDEA discovered that the data format is not the only difference between IODEF and IDEA. Cejka & Kacha Expires May 4, 2016 [Page 2] Internet-Draft IODEF and IDEA November 2015 The following sections briefly introduces IODEF and IDEA. The rest of the document contains examples in IODEF and IDEA with comments. 1.1. IODEF IODEF is a human readable and processable data representation for sharing information cmmonly exchanged by Computer Security Incident Response Teams (CSIRTs). IODEF is defined in RFC5070 and by the time of writing this document, new version IODEF v2 is developed by draft- ietf-mile-rfc5070-bis. Existing implementations of IODEF are listed in draft-moriarty-mile- implementreport. IODEF v2 updates information model of IODEF, however, XML remains as the only one supported data format. 1.2. IDEA IDEA stands for Intrusion Detection Extensible Alert. Even though there exists a variety of models for communication between honeypots, agents, detection probes, none of them is really used because of various limitations for general usage. It is an attempt to define nowadays requirements and propose foundations for viable solution for security event model, taking into consideration existing formats, their benefits and drawbacks. IDEA is designed for storage and automatic processing of incident reports/alerts. Since IDEA is based on JSON, it can be used and processed by several available tools. IDEA is developed and used for sharing information about technical aspects of incidents between CSIRTs. It is currently used in Czech National Research and Educational Network (NREN) http://www.ces.net. IDEA messages can be created by detection systems, stored in NoSQL databases and automaticaly processed / data mined for reputation modelling and blacklisting. In comparison to IODEF, IDEA is more strict and tries not to allow multiple ways of incident representation. This kind of explicitness allows machine processing. On the other hand, IDEA describes only a subset of IODEF because it does not need to contain various information that are only informational for human reading. 2. Example Reports Represented in IODEF and IDEA Cejka & Kacha Expires May 4, 2016 [Page 3] Internet-Draft IODEF and IDEA November 2015 2.1. Minimal Example in IODEF 492382 2015-07-18T09:00:00-05:00 contact@csirt.example.com 2.2. Minimal Example in IDEA This example shows minimal set of keys that must contain an IDEA message. { "Format": "IDEA0", "ID": "bc2a3696-a1d2-49cc-9644-34da932085a8", "Category": [ "Test" ], "DetectTime": "2001-09-13T18:11:21+02:00" } 2.3. Worm Example in IODEF 908711 Cejka & Kacha Expires May 4, 2016 [Page 4] Internet-Draft IODEF and IDEA November 2015 2006-08-01T00:00:00-05:00 Watch-list of known bad IPs or networks CSIRT for example.com contact@csirt.example.com
192.0.2.53
Source of numerous attacks
192.0.2.16/28
Source of heavy scanning over past 1-month
192.0.2.241
C2 IRC server
Cejka & Kacha Expires May 4, 2016 [Page 5] Internet-Draft IODEF and IDEA November 2015
2.4. Worm Example in IDEA IDEA can add "Imprecise" into Target to express that not every IP address from the given range was influenced. { "Format": "IDEA0", "ID": "bc2a3696-a1d2-49cc-9644-34da932085a8", "Category": [ "Attempt.Login" ], "AltNames": [ "csirt.example.com-189493" ], "DetectTime": "2001-09-13T18:11:21+02:00", "CreateTime": "2001-09-13T23:19:24+00:00", "Description": "Host sending out Code Red probes", "ConnectionCount": 57, "Source": [ { "IP4": [ "192.0.2.200" ], "Proto": [ "tcp" ], "Note": "We recommend to block the host" } ], "Target": [ { "IP4": [ "192.0.2.16/28" ], "Proto": [ "tcp" ], "Port": [ 80 ], "Imprecise": true } ], Cejka & Kacha Expires May 4, 2016 [Page 6] Internet-Draft IODEF and IDEA November 2015 "Attach": [ { "Note": "Web-server logs", "Content": "192.0.2.1 - - [13/Sep/2001:18:11:21 +0200] \"GET/default.ida?XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\"", "Ref": [ "http://mylogs.example.com/logs/httpd_access" ] } ], "Node": [ { "Name": "com.example.csirt.logdetector", "Ref": [ "urn:arin:example-com", "urn:mailto:contact@csirt.example.com" ], "Note": "Example.com CSIRT log detector" } ] } 2.5. Reconnaissance Example in IODEF 59334 2006-08-02T05:54:02-05:00 nmap http://nmap.toolsite.example.com Cejka & Kacha Expires May 4, 2016 [Page 7] Internet-Draft IODEF and IDEA November 2015 CSIRT for example.com contact@csirt.example.com +1 412 555 12345 Joe Smith smith@csirt.example.com
192.0.2.200
60524,60526,60527,60531
192.0.2.201
137-139,445
192.0.2.240
192.0.2.64/28
445 Cejka & Kacha Expires May 4, 2016 [Page 8] Internet-Draft IODEF and IDEA November 2015
2.6. Reconnaissance Example in IDEA IDEA message has "Category" that defines a type of incident that is described. In comparison to IODEF, success of event/incident is implicitly known according to Category, whereas IODEF needs a "completion" attribute. Note that IDEA event can represent just ONE event (corresponding to Flow concept in IODEF). Therefore, this example can be represented by two events. Related events can reference each other in "RelID". { "Format": "IDEA0", "ID": "5E1BE2FE-7322-11E5-9498-975EA0D99A02", "RelID": "0833E4BE-7326-11E5-AE5A-975EA0D99A02", "AltNames": [ "csirt.example.com-59344" ], "CreateTime": "2006-08-02T05:54:02-05:00", "DetectTime": "2006-08-02T05:54:02-05:00", "Category": [ "Recon.Scanning" ], "Source": [ { "IP4": [ "192.0.2.200" ], "Proto": [ "tcp" ], "Port": [ 60524, 60526, 60527, 60531 ] } ], "Target": [ { Cejka & Kacha Expires May 4, 2016 [Page 9] Internet-Draft IODEF and IDEA November 2015 "IP4": [ "192.0.2.201" ], "Proto": [ "tcp" ], "Port": [ 137, 138, 139, 445 ] } ], "Node": [ { "Name": "com.example.csirt.scandetector", "Ref": [ "urn:mailto:contact@csirt.example.com", "urn:tel:+1 412 555 12345" ], "Note": "Example.com CSIRT scan detector" } ] } Cejka & Kacha Expires May 4, 2016 [Page 10] Internet-Draft IODEF and IDEA November 2015 { "Format": "IDEA0", "ID": "0833E4BE-7326-11E5-AE5A-975EA0D99A02", "RelID": "5E1BE2FE-7322-11E5-9498-975EA0D99A02", "AltNames": [ "csirt.example.com-59344" ], "EventTime": "2006-08-02T05:54:02-05:00", "DetectTime": "2006-08-02T05:54:02-05:00", "Category": [ "Recon.Scanning" ], "Source": [ { "IP4": [ "192.0.2.240" ], "Proto": [ "tcp" ] } ], "Target": [ { "IP4": [ "192.0.2.64/28" ], "Proto": [ "tcp" ], "Port": [ 445 ] } ], "Node": [ { "Name": "com.example.csirt.scandetector", "Ref": [ "urn:mailto:contact@csirt.example.com", "urn:tel:+1 412 555 12345" ], "Note": "Example.com CSIRT scan detector" } ] } Cejka & Kacha Expires May 4, 2016 [Page 11] Internet-Draft IODEF and IDEA November 2015 2.7. Bot-Net Reporting Example in IODEF 908711 2006-06-08T05:44:53-05:00 Large bot-net GT Bot CA-2003-22 http://www.cert.org/advisories/CA-2003-22.html Root compromise via this IE vulnerability to install the GT Bot Joe Smith jsmith@csirt.example.com These hosts are compromised and acting as bots communicating with irc.example.com.
192.0.2.1
10000 Cejka & Kacha Expires May 4, 2016 [Page 12] Internet-Draft IODEF and IDEA November 2015 bot
192.0.2.3
250000 bot
irc.example.com
192.0.2.20
2006-06-08T01:01:03-05:00
IRC server on #give-me-cmd channel
Confirm the source and take machines off-line and remediate
2.8. Bot-Net Reporting Example in IDEA IDEA has the notion of toplevel ByteCount, FlowCount, PacketCount, ConnCount, however some nodes have already added Source/Targed specific counters, and we will incorporate them. Incident in IDEA must be represented in exact time frames (WinStartTime, WinEndTime). Counters are related to the time frame. IDEA keeps notion of the "Source" and the "Target" of the problem. Source and Target do not try to describe semantics of network flows. This approach marks every Source as an evil participant of an incident and it allows to easily recognize attackers and victims. { Cejka & Kacha Expires May 4, 2016 [Page 13] Internet-Draft IODEF and IDEA November 2015 "Format": "IDEA0", "ID": "78335B50-7344-11E5-BD6F-5B49358CC448", "Category": [ "Availability.DDOS", "Intrusion.Botnet" ], "AltNames": [ "csirt.example.com-908711" ], "EventTime": "2006-06-08T01:01:03-05:00", "WinStartTime": "2006-06-08T01:01:02-05:00", "WinEndTime": "2006-06-08T01:01:03-05:00", "CreateTime": "2006-06-08T05:44:53-05:00", "DetectTime": "2006-06-08T05:44:53-05:00", "Description": "Large bot-net", "ByteCount": 260000, "Ref": [ "urn:clamav:GT%20Bot", "urn:certcc:CA-2003-22", "http://www.cert.org/advisories/CA-2003-22.html" ], "Note": "Root compromise via IE vulnerability to install the GT Bot", "Source": [ { "Type": [ "Botnet" ], "IP4": [ "192.0.2.1" ], "ByteCount": 10000, "Note": "Host is compromised and acting as bot communicating with irc.example.com." }, { "Type": [ "Botnet" ], "IP4": [ "192.0.2.3" ], "ByteCount": 250000, "Note": "Host is compromised and acting as bot communicating with irc.example.com." }, { "Type": [ Cejka & Kacha Expires May 4, 2016 [Page 14] Internet-Draft IODEF and IDEA November 2015 "CC" ], "IP4": [ "192.0.2.20" ], "Proto": [ "tcp", "irc" ], "Note": "IRC server on #give-me-cmd channel" } ], "Node": [ { "Name": "com.example.csirt.logdetector", "Ref": [ "urn:arin:example-com", "urn:mailto:contact@csirt.example.com" ], "Note": "Example.com CSIRT log detector" } ] } 2.9. Watch List Reporting Example in IODEF 908711 2006-08-01T00:00:00-05:00 Watch-list of known bad IPs or networks CSIRT for example.com Cejka & Kacha Expires May 4, 2016 [Page 15] Internet-Draft IODEF and IDEA November 2015 contact@csirt.example.com
192.0.2.53
Source of numerous attacks
192.0.2.16/28
Source of heavy scanning over past 1-month
192.0.2.241
C2 IRC server
2.10. Watch List Reporting Example in IDEA Idea has no notion of expected action or reply. However, this example shows specific two-parties agreed on format, as indicated by Cejka & Kacha Expires May 4, 2016 [Page 16] Internet-Draft IODEF and IDEA November 2015 "watch-list-043", which can be represented in freely added and negotiated key. "FormatID" is used as an example here. If both parties agree to act on specific IDEA messages, they can also agree on new keys to indicate/require this, here "Expected" key on the event level and on the Source level is used as an example. { "Format": "IDEA0", "ID": "3411C0F2-734D-11E5-A71C-5B49358CC448", "FormatID": "watch-list-043", "AltNames": [ "csirt.example.com-908711" ], "Category": [ "Recon", "Intrusion.AdminCompromise" ], "CreateTime": "2006-08-01T00:00:00-05:00", "DetectTime": "2006-08-01T00:00:00-05:00", "Note": "Watch-list of known bad IPs or networks", "Expected": "Block", "Source": [ { "IP4": [ "192.0.2.53" ], "Note": "Source of numerous attacks", "Expected": "Notify" }, { "IP4": [ "192.0.2.16/28" ], "Note": "Source of heavy scanning over past 1-month" }, { "Type": [ "CC" ], "IP4": [ "192.0.2.241" ], "Note": "C2 IRC server" } ], "Node": [ { Cejka & Kacha Expires May 4, 2016 [Page 17] Internet-Draft IODEF and IDEA November 2015 "Name": "com.example.csirt.logdetector", "Ref": [ "urn:arin:example-com", "urn:mailto:contact@csirt.example.com" ], "Note": "Example.com CSIRT log detector" } ] } 3. What is missing in IDMEF and IODEF Creation of messages in almost any format is easy. However, most formats are hard to process, query and analyze. As RFC5070 says: "IODEF is a format for representing computer security information commonly exchanged between Computer Security Incident Response Teams (CSIRTs)." That means it does too much of what we need for automated incident exchange (detectors/SIEMs have no use for the whole incident solving lifecycle and organisations involved). However, the subset of information was recognized as usable for data representation. It is very similar in concepts to IDMEF, so mostly the same applies. IODEF has only limited means for extension (ext-type, ext-value, AdditionalData), and extension is only on strictly specified places, or by RFC update, or own nonstandard schema. However attacks evolve and there can be the need to incorporate the new (yet unknown) information in the future. For information, which cannot be represented in IODEF, there is a possibility to incorporate another format (IDMEF, IDEA, Taxii/Styx, OpenIOC). That however subverts use of the one standard format (recipients might not know how to deal with embedded message, or would have to implement all of the existing standards). IODEF is very free in various fields representation (e.g. Assessment). Both types and semantics are dependent on attributes: category of Address, type of Assessment or Impact. That is very unfriendly for internal representation and processing. The more conversions needed (into internal format, db storage, etc.), the less efficient is the data representation for usage. Various pieces of the same information on different places (and depths of the structure) make it hard to index/search. According to RFC5070: "These examples do not necessarily represent the only way to encode a particular incident." Various ways to encode one incident complicate its further processing and comparison, correllation or merging with other similar incidents. Machine would have to know all possible variations/representations. Cejka & Kacha Expires May 4, 2016 [Page 18] Internet-Draft IODEF and IDEA November 2015 Recursive representation of information is allowed for Contact or EvenData (Contact/Contact/Contact..., EventData/EventData/Event/Data. Although it is a better representation of real world relations, it complicates processing and internal representation. In addition, IODEF structure is a tree, however, a graph would be more suitable in real world (example in the following section). 4. Real World If the aim is to describe the real world relations, tree is not enough (possibly digraph may do). The following example shows information duplication/incoherence, where one technical contact is responsible for more company subsidiaries: Example Mother Company **John Doe** Example Subsidiary Company **John Doe** Not every event can be described by means of Flow/System hierarchy. For example a phishing URL and related information (source MTA), extracted from spam message. Does the Flow mean the transfer of the message from the source MTA to the destination? Should be the extracted URL represented within the Flow or within one of two System instances? Or in Observable? How to refer it from Flow? (ObservableReference is usable only within Indicator scope.) Another real-world example is related to a trojan on the web page. User clicks, downloads the page, JavaScript exploits user's webbrowser, user machine gets compromised. Does it mean that Flow goes from the user to the webpage (it has been definitely initiated by user), or the other way around? Is it direction of the attack, or description of technical means? Where does the URL and trojan info belong? A data representation should describe an incident clearly in one way. Cejka & Kacha Expires May 4, 2016 [Page 19] Internet-Draft IODEF and IDEA November 2015 5. From XML to JSON Each JSON key may have one and only one value. General methods of mapping of XML attributes or nested tags to JSON are possible, however, they are usually hairy and nonelegant. Consider the following XML sample: 123 JSON1: Goes against JSON paradigms: orthogonal hierarchy, info encoded in keys. "Counter": 123 "Counter.type": "average" "Counter.unit": "byte" JSON2: Each and every attribute has to be nested object because of extensibility. "Counter": { "value": 123 "type": "average" "unit": "byte" } JSON3: Simple, jsonic, however every combination neads unique name. "AvgBytesCounter": 123 If a designed JSON has deterministic, typed and expectable structure, it can be readily imported into existing tools (document stores such as MongoDB, PostgreSQL), indexed, queried, analyzed. It would be very usage friendly to leverage this possibility. 6. General observations No known format allows for timed/marked modifications (e.g. one party generates the event, the second party does the enrichment with PassiveDNS, the third party with Whois information and finally, receiver has no opportunity to find out who and when modified the event report. That is important, because of issue of trust, and because of time differences (some information like DNS change in time, so information added later, may be out of date). Another issue is related to weak notion of trust (authentication, signing). It can be done by means of signed differences (such as JSON Patch). Cejka & Kacha Expires May 4, 2016 [Page 20] Internet-Draft IODEF and IDEA November 2015 7. Conclusion JSON version of the format should be built from the grounds up and take into consideration JSON specifics. Straightforward XML to JSON translation would lead to cumbersome result. One possible approach might be a representation of all simple objects in a reasonably flat structure ("information for machine"), and to map the real world relations by references ("information for humans"). In our oppinion, IDEA is close to the EventData class of a IODEF schema. It might be a good idea to start with event/attack description format (IDEA or IDMEF based) at first and to add security team aimed information (IODEF based) later. 8. Acknowledgements For review and advices, the authors thank to: Takeshi Takahashi, National Institute of Information and Communications Technology Network Security Research Institute Vaclav Bartos, CESNET 9. IANA Considerations This memo includes no request to IANA. 10. Security Considerations There are no security considerations added in this draft because of the nature of the document. 11. Informative References [IDEA] CESNET, a.l.e., "Intrusion Detection Extensible Alert", . [RFC5070-bis] Danyliw, R. and P. Stoecker, "The Incident Object Description Exchange Format v2", 2015, . Authors' Addresses Cejka & Kacha Expires May 4, 2016 [Page 21] Internet-Draft IODEF and IDEA November 2015 Tomas Cejka CESNET, a.l.e. Zikova 4 Prague, CZ 16000 CZ Email: cejkat@cesnet.cz Pavel Kacha CESNET, a.l.e. Zikova 4 Prague, CZ 16000 CZ Email: ph@cesnet.cz Cejka & Kacha Expires May 4, 2016 [Page 22]