PPSP                                                           K. Wu
Internet Draft                                                Z. Lei
Intended status: BCP                                         D. Chiu
Expires: April 2011                                            ASTRI
                                                    October 26, 2010

Survey of P2P File Downloading and Streaming Protocol
draft-wu-ppsp-survey-of-p2p-protocol-01.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft.

This document may not be modified, and derivative works of it may not be created, except to publish it as an RFC and to translate it into languages other than English.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Wu, et al.                Expires April 26, 2011                [Page 1]
Internet-Draft           P2P Layered Streaming              October 2010

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.
It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This Internet-Draft will expire on April 26, 2011.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Abstract

Most Peer-to-Peer survey papers describe the architecture of P2P applications. Such high-level descriptions make it hard to compare PPSP with those applications. In this survey, we focus on the message design of the well-known P2P file downloading and streaming applications (Bittorrent and eMule) and study the baseline and enhancement approaches in their peer and tracker protocol specifications. From the survey, we summarize a common message design for the P2P streaming protocol standardization.

Table of Contents

1. Introduction
2. Terminology
3. Bittorrent Protocol
   3.1. Mainline Protocol Specification
      3.1.1. BitTorrent File Distribution Process
      3.1.2. Encoded Type
      3.1.3. Metainfo File
      3.1.4. Peer-Tracker Message
      3.1.5. Peer-Peer Message
   3.2. Enhancement Proposal
      3.2.1. DHT Protocol
      3.2.2. Fast Extension
      3.2.3. Multitracker Metadata Extension
      3.2.4. UDP Tracker Protocol
         3.2.4.1. UDP Connections / Spoofing
         3.2.4.2. Time Outs
         3.2.4.3. UDP Peer-Tracker Message
      3.2.5. Superseeding
      3.2.6. HTTP Seeding
         3.2.6.1. Metadata Extension
      3.2.7. Extension for partial seeds
         3.2.7.1. Extension Header
         3.2.7.2. Tracker Scrapes
         3.2.7.3. Tracker Announce
         3.2.7.4. Rationale
      3.2.8. BitTorrent Local Tracker Discovery Protocol
      3.2.9. Tracker Returns External IP
      3.2.10. Private Torrents
      3.2.11. Tracker exchange
      3.2.12. Merkle tree torrent extension
         3.2.12.1. Simple Merkle Hashes
      3.2.13. Tracker Failure Retry Extension
         3.2.13.1. "retry in" extension to "failure reason"
      3.2.14. DHT scrape
4. eMule Protocol
   4.1. Mainline Protocol Specification
   4.2. Enhancement Proposal
5. Messages for PPSP
   5.1. Tracker-Peer Messages
      5.1.1. Baseline Tracker-Peer Messages
         5.1.1.1. Connect Request
         5.1.1.2. Connect Response
         5.1.1.3. Announce Request
         5.1.1.4. Announce Response
         5.1.1.5. Get-Peer Request
         5.1.1.6. Get-Peer Response
         5.1.1.7. Retry Response
         5.1.1.8. Error Response
      5.1.2. Enhancement Tracker-Peer Messages
         5.1.2.1. P2P Layered Streaming Message
   5.2. Peer-Peer Messages
      5.2.1. Baseline Peer-Peer Messages
         5.2.1.1. Interested
         5.2.1.2. Not Interested
         5.2.1.3. Choke
         5.2.1.4. Unchoke
         5.2.1.5. Have Piece
         5.2.1.6. Bitfield Request
         5.2.1.7. Bitfield Response
         5.2.1.8. Piece Request
         5.2.1.9. Piece Response
         5.2.1.10. Piece Cancel
      5.2.2. Enhancement Peer-Peer Messages
         5.2.2.1. Have All/None Bitfield
         5.2.2.2. Suggest Piece
         5.2.2.3. Piece Reject
      5.2.3. DHT Messages
         5.2.3.1. Ping Request
         5.2.3.2. Ping Response
         5.2.3.3. Find-Node Request
         5.2.3.4. Find-Node Response
         5.2.3.5. Get-Peer-List Request
         5.2.3.6. Get-Peer-List Response
         5.2.3.7. Announce Peer Request
         5.2.3.8. Announce Peer Response
   5.3. Peer-CDN(HTTP) Messages
   5.4. Tracker-Tracker Messages
6. Security Considerations
7. Conclusions
8. References
   8.1. Normative References
   8.2. Informative References
9. Acknowledgments

1. Introduction

To inform the design of a standard Peer-to-Peer streaming protocol, we surveyed several popular P2P protocols used in today's P2P file downloading applications, including Bittorrent, eMule, and more. These protocols are widely used around the world.

Unlike other high-level architecture surveys [PPSP Survey], we focus on the message design of the well-known P2P file downloading and streaming applications (Bittorrent and eMule) and study the baseline and enhancement approaches in their peer and tracker protocol specifications. From the survey, we summarize a common message design for the P2P streaming protocol standardization.

Section 2 lists the terminology used. Section 3 describes the Bittorrent baseline and enhancement protocols. Section 4 describes the eMule baseline and enhancement protocols.
Section 5 describes a common message design for the P2P streaming protocol standardization. Section 6 describes the security issues. Section 7 presents the conclusions on message design.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Piece: A piece is a basic unit of partitioned data, which is used by a peer for the purpose of storage, advertisement and exchange among peers.

3. Bittorrent Protocol

BitTorrent is a peer-to-peer file sharing protocol used for distributing large amounts of data. BitTorrent is one of the most common protocols for transferring large files, and it has been estimated that it accounted for roughly 27-55% of all Internet traffic (depending on geographical location) as of February 2009. [BitTorrent in wiki]

Most of the content in this section comes from Bittorrent.org, which lists the BitTorrent mainline and enhancement protocols. [Index of BitTorrent Enhancement Proposals]

3.1. Mainline Protocol Specification

BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load. [Bittorrent_Protocol_Specification]

3.1.1. BitTorrent File Distribution Process

A BitTorrent file distribution consists of these entities:

   o An ordinary web server
   o A static 'metainfo' file
   o A BitTorrent tracker
   o An 'original' downloader
   o The end user web browsers
   o The end user downloaders

There are ideally many end users for a single file.
To start serving, a host goes through the following steps:

   1. Start running a tracker (or, more likely, have one running already).
   2. Start running an ordinary web server, such as Apache, or have one already.
   3. Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).
   4. Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.
   5. Put the metainfo file on the web server.
   6. Link to the metainfo (.torrent) file from some other web page.
   7. Start a downloader which already has the complete file (the 'origin').

To start downloading, a user does the following:

   1. Install BitTorrent (or have done so already).
   2. Surf the web.
   3. Click on a link to a .torrent file.
   4. Select where to save the file locally, or select a partial download to resume.
   5. Wait for download to complete.
   6. Tell downloader to exit (it keeps uploading until this happens).

3.1.2. Encoded Type

Strings are length-prefixed base ten followed by a colon and the string. For example 4:spam corresponds to 'spam'.

Integers are represented by an 'i' followed by the number in base 10 followed by an 'e'. For example i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limitation. i-0e is invalid. All encodings with a leading zero, such as i03e, are invalid, other than i0e, which of course corresponds to 0.

Lists are encoded as an 'l' followed by their elements (also bencoded) followed by an 'e'. For example l4:spam4:eggse corresponds to ['spam', 'eggs'].

Dictionaries are encoded as a 'd' followed by a list of alternating keys and their corresponding values followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}.
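The encoding rules for strings, integers, lists and dictionaries can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the reference implementation: the function name bencode is our own, text strings are assumed to be UTF-8 encoded before their length is counted, and dictionary keys are sorted as raw byte strings.

```python
def bencode(value):
    """Encode an int, str/bytes, list, or dict (hypothetical helper name)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        # 'i' + base-10 digits + 'e'; Python never emits "-0" or leading zeros.
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        # Assumption: text is UTF-8 encoded before the length is counted.
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        # Length in base ten, a colon, then the raw bytes.
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be strings")
            items.append((key, item))
        # Keys are ordered as raw byte strings.
        items.sort(key=lambda pair: pair[0])
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError("unsupported type: %r" % type(value))


print(bencode("spam"))                    # b'4:spam'
print(bencode({"spam": ["a", "b"]}))      # b'd4:spaml1:a1:bee'
```

Sorting the keys inside the encoder, rather than trusting the caller's insertion order, means two dictionaries with the same contents always produce the same byte string.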
Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).

3.1.3. Metainfo File

Metainfo files are bencoded dictionaries with the following keys:

announce

   The URL of the tracker.

info

   This maps to a dictionary, with keys described below.

The name key maps to a UTF-8 encoded string which is the suggested name to save the file (or directory) as. It is purely advisory.

piece length maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one, which may be truncated. piece length is almost always a power of two, most commonly 2^18 = 256 K (BitTorrent prior to version 3.2 uses 2^20 = 1 M as default).

pieces maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.

There is also a key length or a key files, but not both or neither. If length is present then the download represents a single file, otherwise it represents a set of files which go in a directory structure.

In the single file case, length maps to the length of the file in bytes.

For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:

length

   The length of the file, in bytes.

path

   A list of UTF-8 encoded strings corresponding to subdirectory names, the last of which is the actual file name (a zero length list is an error case).

In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory.

All strings in a .torrent file that contain text must be UTF-8 encoded.

3.1.4. Peer-Tracker Message

Tracker GET requests have the following keys:

info_hash

   The 20 byte SHA1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value will almost certainly have to be escaped.

peer_id

   A string of length 20 which this downloader uses as its id. Each downloader generates its own id at random at the start of a new download. This value will also almost certainly have to be escaped.

ip

   An optional parameter giving the IP (or DNS name) which this peer is at. Generally used for the origin if it's on the same machine as the tracker.

port

   The port number this peer is listening on. Common behavior is for a downloader to try to listen on port 6881 and if that port is taken try 6882, then 6883, etc. and give up after 6889.

uploaded

   The total amount uploaded so far, encoded in base ten ASCII.

downloaded

   The total amount downloaded so far, encoded in base ten ASCII.

left

   The number of bytes this peer still has to download, encoded in base ten ASCII. Note that this can't be computed from downloaded and the file length since it might be a resume, and there's a chance that some of the downloaded data failed an integrity check and had to be re-downloaded.

event

   This is an optional key which maps to started, completed, or stopped (or empty, which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using started is sent when a download first begins, and one using completed is sent when the download is complete. No completed is sent if the file was complete when started. Downloaders send an announcement using stopped when they cease downloading.

3.1.5. Peer-Peer Message

choke

   'choke' has no payload.

unchoke

   'unchoke' has no payload.

interested

   'interested' has no payload.

not interested

   'not interested' has no payload.
have

   The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of.

bitfield

   'bitfield' is only ever sent as the first message. Its payload is a bitfield with each index that downloader has sent set to one and the rest set to zero. Downloaders which don't have anything yet may skip the 'bitfield' message. The first byte of the bitfield corresponds to indices 0-7 from high bit to low bit, respectively. The next one corresponds to 8-15, etc. Spare bits at the end are set to zero.

request

   'request' messages contain an index, begin, and length. The last two are byte offsets. Length is generally a power of two unless it gets truncated by the end of the file. All current implementations use 2^15, and close connections which request an amount greater than 2^17.

piece

   'piece' messages contain an index, begin, and piece. Note that they are correlated with request messages implicitly. It's possible for an unexpected piece to arrive if choke and unchoke messages are sent in quick succession and/or transfer is going very slowly.

cancel

   'cancel' messages have the same payload as request messages. They are generally only sent towards the end of a download, during what's called 'endgame mode'. When a download is almost complete, there's a tendency for the last few pieces to all be downloaded off a single hosed modem line, taking a very long time. To make sure the last few pieces come in quickly, once requests for all pieces a given downloader doesn't have yet are currently pending, it sends requests for everything to everyone it's downloading from. To keep this from becoming horribly inefficient, it sends cancels to everyone else every time a piece arrives.

3.2. Enhancement Proposal

BitTorrent technology has improved considerably in recent years. The new features are documented as BitTorrent Enhancement Proposals (BEPs).

3.2.1. DHT Protocol

BitTorrent uses a "distributed sloppy hash table" (DHT) for storing peer contact information for "trackerless" torrents. In effect, each peer becomes a tracker. The protocol is based on Kademlia and is implemented over UDP. The following items are the message types. [DHT Protocol]

ping

   The most basic query is a ping. "q" = "ping". A ping query has a single argument, "id", whose value is a 20-byte string containing the sender's node ID in network byte order. The appropriate response to a ping has a single key "id" containing the node ID of the responding node.

find_node

   Find node is used to find the contact information for a node given its ID. "q" = "find_node". A find_node query has two arguments, "id" containing the node ID of the querying node, and "target" containing the ID of the node sought by the querier. When a node receives a find_node query, it should respond with a key "nodes" and value of a string containing the compact node info for the target node or the K (8) closest good nodes in its own routing table.

get_peers

   Get peers associated with a torrent infohash. "q" = "get_peers". A get_peers query has two arguments, "id" containing the node ID of the querying node, and "info_hash" containing the infohash of the torrent. If the queried node has peers for the infohash, they are returned in a key "values" as a list of strings. Each string contains "compact" format peer information for a single peer. If the queried node has no peers for the infohash, a key "nodes" is returned containing the K nodes in the queried node's routing table closest to the infohash supplied in the query. In either case a "token" key is also included in the return value. The token value is a required argument for a future announce_peer query. The token value should be a short binary string.
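As an illustration of the get_peers rules above, the following Python sketch builds the response a queried node would return. It is a sketch under stated assumptions, not the mainline implementation: the names compact_peer and get_peers_response, the in-memory peer_store dictionary, and the pre-computed closest_nodes and token arguments are all hypothetical. The 6-byte compact peer layout (a 4-byte IPv4 address followed by a 2-byte port, both in network byte order) is assumed from the DHT specification and is not stated in this survey.

```python
import socket
import struct


def compact_peer(ip, port):
    """Pack one IPv4 peer as 6 bytes (assumed compact layout, see lead-in)."""
    return socket.inet_aton(ip) + struct.pack(">H", port)


def get_peers_response(node_id, info_hash, peer_store, closest_nodes, token):
    """Build the response dictionary for a get_peers query.

    peer_store    -- hypothetical dict: infohash -> list of compact peer strings
    closest_nodes -- compact node info for the K closest nodes, computed
                     elsewhere (routing-table lookup is out of scope here)
    token         -- short binary string the querier must echo back in a
                     later announce_peer query
    """
    # "id" and "token" are present in either case.
    response = {"id": node_id, "token": token}
    peers = peer_store.get(info_hash)
    if peers:
        # Peers are known: return them under "values" as a list of strings.
        response["values"] = list(peers)
    else:
        # No peers known: return the closest nodes under "nodes" instead.
        response["nodes"] = closest_nodes
    return response


known_hash = b"\x01" * 20
store = {known_hash: [compact_peer("10.0.0.1", 6881)]}
print(get_peers_response(b"n" * 20, known_hash, store, b"", b"tok"))
```

The response is left as a plain dictionary; a real node would still have to bencode it and wrap it in the DHT message envelope before sending it over UDP.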
announce_peer

   Announce that the peer, controlling the querying node, is downloading a torrent on a port. announce_peer has four arguments: "id" containing the node ID of the querying node, "info_hash" containing the infohash of the torrent, "port" containing the port as an integer, and the "token" received in response to a previous get_peers query. The queried node must verify that the token was previously sent to the same IP address as the querying node. Then the queried node should store the IP address of the querying node and the supplied port number under the infohash in its store of peer contact information.

3.2.2. Fast Extension

The Fast Extension packages several extensions: Have None/Have All, Reject Requests, Suggestions and Allowed Fast. These are enabled by setting the third least significant bit of the last reserved byte in the BitTorrent handshake. The following items are the message types. [Fast Extension]

Have All/Have None

   Have All and Have None specify that the message sender has all or none of the pieces respectively. When present, Have All or Have None replaces the Have Bitfield. Exactly one of Have All, Have None, or Have Bitfield MUST appear and only immediately after the handshake. These messages save bandwidth and, to a lesser degree, remove the idiosyncrasy of sending no message when a peer has no pieces.

   When the fast extension is disabled, if a peer receives Have All or Have None then the peer MUST close the connection.

Suggest Piece

   Suggest Piece is an advisory message meaning "you might like to download this piece." The intended usage is for 'super-seeding' without throughput reduction, to avoid redundant downloads, and so that a seed which is disk I/O bound can upload contiguous or identical pieces to avoid excessive disk seeks.
   In all cases, the seed SHOULD operate to maintain a roughly equal number of copies of each piece in the network. A peer MAY send more than one suggest piece message at any given time. A peer receiving multiple suggest piece messages MAY interpret this as meaning that all of the suggested pieces are equally appropriate.

   When the fast extension is disabled, if a peer receives a Suggest Piece message, the peer MUST close the connection.

Reject Request

   Reject Request notifies a requesting peer that its request will not be satisfied.

   If the fast extension is disabled and a peer receives a reject request then the peer MUST close the connection.

   When the fast extension is enabled:

   o If a peer receives a reject for a request that was never sent then the peer SHOULD close the connection.

   o If a peer sends a choke, it MUST reject all requests from the peer to whom the choke was sent except it SHOULD NOT reject requests for pieces that are in the allowed fast set. A peer SHOULD choke first and then reject requests so that the peer receiving the choke does not re-request the pieces.

   o If a peer receives a request from a peer it is choking, the peer receiving the request SHOULD send a reject unless the piece is in the allowed fast set.

   o If a peer receives an excessive number of requests from a peer it is choking, the peer receiving the requests MAY close the connection rather than reject the request. However, consider that it can take several seconds for buffers to drain and messages to propagate once a peer is choked.

Allowed Fast

   With the BitTorrent protocol specified, new peers take several minutes to ramp up before they can effectively engage in BitTorrent's tit-for-tat. The reason is simple: starting peers have few pieces to trade.

   Allowed Fast is an advisory message which means "if you ask for this piece, I'll give it to you even if you're choked."
   Allowed Fast thus shortens the awkward stage during which the peer obtains occasional optimistic unchokes but cannot sufficiently reciprocate to remain unchoked.

   The pieces that can be downloaded when choked constitute a peer's allowed fast set. The set is generated using a canonical algorithm that produces piece indices unique to the message receiver so that if two peers offer k pieces fast it will be the same k, and if one offers k+1 it will be the same k plus one more. k should be small enough to avoid abuse, but large enough to ramp up tit-for-tat. We currently set k to 10, but peers are free to change this number, e.g., to suit load.

   The message sender MAY list pieces that the message sender does not have. The receiver MUST NOT interpret an Allowed Fast message as meaning that the message sender has the piece. This allows peers to generate and communicate allowed fast sets at the beginning of a connection. However, a peer MAY send Allowed Fast messages at any time.

   A peer SHOULD send Allowed Fast messages to any starting peer unless the local peer lacks sufficient resources. A peer MAY reject requests for already Allowed Fast pieces if the local peer lacks sufficient resources, if the requested piece has already been sent to the requesting peer, or if the requesting peer is not a starting peer. Our current implementation rejects requests for Allowed Fast messages whenever the requesting peer has more than k pieces.

3.2.3. Multitracker Metadata Extension

In addition to the standard "announce" key, in the main area of the metadata file and not part of the "info" section, will be a new key, "announce-list". This key will refer to a list of lists of URLs, and will contain a list of tiers of announces.
If the client is compatible with the multitracker specification, and if the "announce-list" key is present, the client will ignore the "announce" key and only use the URLs in "announce-list". [Multitracker Metadata Extension]

3.2.4. UDP Tracker Protocol

3.2.4.1. UDP Connections / Spoofing

In the ideal case, only 2 packets would be necessary. However, it is possible to spoof the source address of a UDP packet. The tracker has to ensure this doesn't occur, so it calculates a value (connection_id) and sends it to the client. If the client spoofed its source address, it won't receive this value (unless it's sniffing the network). The connection_id will then be sent to the tracker again in packet 3. The tracker verifies the connection_id and ignores the request if it doesn't match. Connection IDs should not be guessable by the client. This is comparable to a TCP handshake, and a SYN-cookie-like approach can be used to avoid storing the connection IDs on the tracker side.

A connection ID can be used for multiple requests. A client can use a connection ID until one minute after it has received it. Trackers should accept the connection ID until two minutes after it has been sent. [UDP Tracker Protocol]

3.2.4.2. Time Outs

UDP is an 'unreliable' protocol. This means it doesn't retransmit lost packets itself. The application is responsible for this. If a response is not received after 15 * 2 ^ n seconds, the client should retransmit the request, where n starts at 0 and is increased up to 8 (3840 seconds) after every retransmission. Note that it is necessary to rerequest a connection ID when it has expired.

3.2.4.3. UDP Peer-Tracker Message

All values are sent in network byte order (big endian). Do not expect packets to be exactly of a certain size. Future extensions could increase the size of packets.

Before announcing or scraping, you have to obtain a connection ID.

connect request

   1. Choose a random transaction ID.
   2. Fill the connect request structure.
   3. Send the packet.

connect response

   1. Receive the packet.
   2. Check whether the packet is at least 16 bytes.
   3. Check whether the transaction ID is equal to the one you chose.
   4. Check whether the action is connect.
   5. Store the connection ID for future use.

announce request

   1. Choose a random transaction ID.
   2. Fill the announce request structure.
   3. Send the packet.

announce response

   1. Receive the packet.
   2. Check whether the packet is at least 20 bytes.
   3. Check whether the transaction ID is equal to the one you chose.
   4. Check whether the action is announce.
   5. Do not announce again until interval seconds have passed or an event has occurred.

scrape request

   Up to about 74 torrents can be scraped at once. A full scrape can't be done with this protocol.

   1. Choose a random transaction ID.
   2. Fill the scrape request structure.
   3. Send the packet.

scrape response

   1. Receive the packet.
   2. Check whether the packet is at least 8 bytes.
   3. Check whether the transaction ID is equal to the one you chose.
   4. Check whether the action is scrape.

error response

   If the tracker encounters an error, it might send an error packet.

   1. Receive the packet.
   2. Check whether the packet is at least 8 bytes.
   3. Check whether the transaction ID is equal to the one you chose.

   Offset  Size            Name            Value
   0       32-bit integer  action          3 // error
   4       32-bit integer  transaction_id
   8       string          message

3.2.5. Superseeding

The super-seed feature is a new seeding algorithm designed to help a torrent initiator with limited bandwidth "pump up" a large torrent, reducing the amount of data it needs to upload in order to spawn new seeds in the torrent.
When a seeding client enters "super-seed mode", it will not act as a standard seed, but masquerades as a normal client with no data. As clients connect, it will then inform them that it received a piece -- a piece that was never sent, or if all pieces were already sent, is very rare. This will induce the client to attempt to download only that piece.

When the client has finished downloading the piece, the seed will not inform it of any other pieces until it has seen the piece it had sent previously present on at least one other client. Until then, the client will not have access to any of the other pieces of the seed, and therefore will not waste the seed's bandwidth.

This method has resulted in much higher seeding efficiencies, by inducing peers into taking only the rarest data, reducing the amount of redundant data sent, and limiting the amount of data sent to peers which do not contribute to the swarm. Prior to this, a seed might have to upload 150% to 200% of the total size of a torrent before other clients became seeds. However, a large torrent seeded with a single client running in super-seed mode was able to do so after only uploading 105% of the data. This is 150-200% more efficient than when using a standard seed.

Super-seed mode is NOT recommended for general use. While it does assist in the wider distribution of rare data, because it limits the selection of pieces a client can download, it also limits the ability of those clients to download data for pieces they have already partially retrieved. Therefore, super-seed mode is only recommended for initial seeding servers. [Superseeding]

3.2.6. HTTP Seeding

HTTP seeding is defined as follows. [HTTP Seeding]

3.2.6.1. Metadata Extension

"httpseeds"

   In the main area of the metadata file and not part of the "info" section, will be a new key, "httpseeds".
This key holds a list of URLs, that is, web addresses from which torrent data can be retrieved. This key may be safely ignored if the client is not capable of using it.

Protocol

The client calls the URL given, in the following format:

   ?info_hash=[hash]&piece=[piece]{&ranges=[start]-[end]{,[start]-[end]}...}

Server-side Implementation Notes

The purpose of the http seed script is to limit access to the data being downloaded so that the web server isn't overwhelmed by clients asking for the data. If it weren't for this limiting, there would be no way to prevent someone from coding a client to try to download continuously or multiply, resulting in a heavy load on the server. Limiting the download rate also allows an http seed script to be run on a web account where the total amount of data downloaded is restricted or may result in extra service charges.

The script must provide three major functions:

   1) Limit its average upload to a reasonable level.
   2) Intelligently tell peers how long they should wait before retrying.
   3) Translate from an info-hash and piece number to a byte range within a file or set of files, and return those bytes.

Another highly desirable function is to check whether peers are retrying too often, and to automatically ban those peers. Other desirable features include a way to monitor the tracker the torrent is using and stop uploading data if sufficient P2P seeds exist, and a way to report back to the tracker that a seed is present.

Client-side Implementation Notes

The prototype code base has a default retry time of 30 seconds; after 3 retries with errors, the time is lengthened with each cycle. The prototype code will not display any errors with contacting http seeds (unless the URL given in the .torrent is incorrect) until it has received data from that seed.
(The prototype code also won't display any errors for any http reply that was actually received.)

Current behavior is: Request the rarest piece you're missing in entirety that you can locate. If you have no pieces that aren't partially downloaded, skip one retry cycle, then start requesting partials. If you receive a 503 response, set the retry time equal to the integer value received in the response.

3.2.7. Extension for Partial Seeds

The purpose of this extension is to allow further optimizations of BitTorrent swarms when peers are partial seeds. A partial seed is a peer that holds an incomplete copy of the torrent but is not downloading anything more. This happens for multi-file torrents where users only download some of the files. [Extension for Partial Seeds]

3.2.7.1. Extension Header

A peer that is a partial seed SHOULD include an extra header in the extension handshake, 'upload_only'. Setting the value of this key to 1 indicates that this peer is not interested in downloading anything.

Example extension handshake message:

   {'m': {'ut_metadata': 3}, 'upload_only': 1}

3.2.7.2. Tracker Scrapes

The tracker scrape convention defines three values per torrent: 'complete', 'incomplete' and 'downloaded'. The purpose of this extension is to let clients distinguish between partial seeds and downloaders, both of which currently would be classified as incomplete.

If the tracker supports this extension, it MUST add a fourth field, 'downloaders'. This field is the number of active downloaders in the swarm; it does not include partial seeds. The number of partial seeds can be calculated as: incomplete - downloaders.

3.2.7.3. Tracker Announce

In order to tell the tracker that a peer is a partial seed, it MUST send an event=paused parameter in every announce while it is a partial seed.

3.2.7.4. Rationale

Allowing peers to scrape a tracker and distinguish between active downloaders and partial seeds makes it more efficient to determine what to seed based on the downloader/seed ratio.

The reason why every announce should contain event=paused is to avoid relying on the state being stored in the tracker. In case there's a failure and a backup tracker is used, it can recover all of the swarm state because the clients are announcing that they are partial seeds.

3.2.8. BitTorrent Local Tracker Discovery Protocol

Some Internet Service Providers (ISPs) may wish to localize traffic to reduce transit costs, reduce internal traffic, and improve user experience by speeding up downloads. [BitTorrent Local Tracker Discovery Protocol]

With this extension, BitTorrent clients are able to discover a tracker nearby on the network, and via this tracker discover nearby caches or peers. A cache may simply be a fast peer in the middle of the network. It might also have substantial disk space. The client communicates with a cache using the normal BitTorrent protocol.

When a cache is present, the user benefits from having a high-capacity peer from which the user's client downloads and to which it can delegate seeding. When a cache inside the user's ISP network seeds on behalf of the client, it frees upstream capacity in the user's access network, benefiting the user and those that share the access network. When subsequent peers transfer from their ISP's cache, the ISP experiences less transit traffic.

3.2.9. Tracker Returns External IP

A BitTorrent client can easily learn the IP address used when sending, but because of intervening Network Address Translators (NATs) the IP address of the client's host seen inside the client's private network may differ from the IP address used to route the client's packets through the public Internet. [Tracker Returns External IP]

3.2.10. Private Torrents

A private tracker restricts access to the torrents it tracks. A torrent with restricted access is called a private torrent. All other torrents are public torrents. To promote sharing, private trackers often maintain statistics about registered users and restrict access to certain or all torrents for users that do not adequately upload. [Private Torrents]

When generating a metainfo file, users denote a torrent as private by including the key-value pair "private=1" in the "info" dict of the torrent's metainfo file.

When a BitTorrent client obtains a metainfo file containing the "private=1" key-value pair, it MUST ONLY announce itself to the private tracker, and MUST ONLY initiate connections to peers returned from the private tracker.

When multiple trackers appear in the announce-list in the metainfo file of a private torrent (see [Multitracker Metadata Extension]), each peer MUST use only one tracker at a time and only switch between trackers when the current tracker fails. When switching between trackers, the peer MUST disconnect from all current peers and connect only to those provided by the new tracker.

3.2.11. Tracker Exchange

This extension makes it possible for BitTorrent peers to learn about new trackers for a swarm they have joined, ideally ending up with every peer knowing about every tracker used for the torrent.

In this extension, every peer has a list of trackers. This list contains only verified trackers. A verified tracker is either a tracker that was in the loaded .torrent file (as without this extension, such trackers are assumed to be good) or a tracker that was received over the tracker exchange (TEX) protocol and has returned a successful response. The tracker list used by this extension is hence different from the tracker list used by the client itself, since it excludes trackers with which the client has never successfully announced.
This list of trackers is the only list of verified trackers referred to in this extension, unless explicitly stated otherwise.

The extension message is used to send changes to the tracker list to other peers. If the peers have different tracker lists on handshake, the first message MUST contain the full list of trackers. Any subsequent message SHOULD only contain added trackers. If the peers have the same tracker list when connecting, the first extension message SHOULD only contain added trackers.

3.2.12. Merkle Tree Torrent Extension

BitTorrent requires a torrent file containing a cryptographic digest of every piece of the content to allow the verification of pieces during the download. Large torrent files put a strain on the Web servers distributing them, and cannot be directly included in RSS feeds or gossiped around. [Merkle tree torrent extension]

A related problem is the use of large piece sizes. To keep the size of a torrent file small (so as not to overload the Web servers), the number of hashes for a content file is kept small. For large files this implies that the piece size over which digests are calculated must go up (up to 2MB pieces are used). The large piece sizes affect the ability of peers to barter pieces. Only when a piece has been completely received and verified using the digest may it be traded with other peers. This means that it may be some time before a node starts bartering with others.

The solution proposed for these two problems is to replace the list of digests with a single Merkle hash [1]. A Merkle hash can be used to verify the integrity of the total content file as well as the individual blocks via a hierarchical scheme. It works by constructing a hash tree of the content and using just the root hash as data integrity protection. The simple root hash value also allows for smaller piece sizes to be used.
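As a rough illustration of the idea, the following Python sketch builds a binary hash tree over the piece hashes and checks one piece against the root using the sibling hashes along its path, in the manner detailed in Section 3.2.12.1. It is not taken from the extension itself: the function names (build_levels, merkle_root, proof_for, verify_piece), the reading of the "filler hash value of 0" as 20 zero bytes, and the toy piece contents are assumptions made for the example.

```python
import hashlib

# Assumption: the "filler hash value of 0" is represented as 20 zero bytes.
FILLER = bytes(20)


def sha1(data):
    return hashlib.sha1(data).digest()


def build_levels(pieces):
    """Return the tree as a list of levels: leaves first, root level last."""
    leaves = [sha1(p) for p in pieces]
    width = 1
    while width < len(leaves):      # smallest power of two holding all leaves
        width *= 2
    leaves += [FILLER] * (width - len(leaves))
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Parent = hash of (left child hash || right child hash).
        levels.append([sha1(prev[i] + prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels


def merkle_root(pieces):
    return build_levels(pieces)[-1][0]


def proof_for(pieces, index):
    """Sibling hash at each level on the path from leaf `index` to the root."""
    proof = []
    for level in build_levels(pieces)[:-1]:
        proof.append(level[index ^ 1])   # sibling of the current node
        index //= 2                      # move up to the parent
    return proof


def verify_piece(piece, index, proof, root):
    """Recompute the root from one piece plus its sibling/uncle hashes."""
    h = sha1(piece)
    for sibling in proof:
        h = sha1(h + sibling) if index % 2 == 0 else sha1(sibling + h)
        index //= 2
    return h == root


pieces = [b"piece-0", b"piece-1", b"piece-2"]
root = merkle_root(pieces)
proof = proof_for(pieces, 2)
print(verify_piece(pieces[2], 2, proof, root))     # True
print(verify_piece(b"tampered", 2, proof, root))   # False
```

Only the 20-byte root has to come from a trusted source; the proof for a piece grows with the height of the tree rather than with the number of pieces.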
A common form of hash trees is the Merkle hash tree, hence the name.

3.2.12.1. Simple Merkle Hashes

The extension proposes a minimalistic design that does not affect the existing BitTorrent protocol and clients very much. The design is backwards compatible in the sense that clients supporting the Simple Merkle Hash extension can still be made to process regular torrent files easily.

From the content a hash tree is constructed as follows. Given a piece size, the hashes of all the pieces in the set of content files are calculated. Next, a binary tree of sufficient height is created. Sufficient height means that the lowest level in the tree has enough nodes to hold all piece hashes in the set. All piece hashes are placed in the tree, starting at the left-most leaf (see figure). The remaining leaves in the tree are assigned a filler hash value of 0 (see the Discussion section of [Merkle tree torrent extension]). Finally, the hash values of the higher levels in the tree are calculated by concatenating the hash values of the two children (again left to right) and computing the hash of that aggregate. This process ends in a hash value for the root node, which is called the root hash. The hashing algorithm used is SHA-1, as in normal torrents.

The root hash, along with the total size of the content-file set and the piece size, is now the only information in the system that needs to come from a trusted source. A client that has only the root hash of a file set can check any piece as follows (see figure). It first calculates the hash of the piece it received. Along with this piece it should have received the hashes of the piece's sibling and of its uncles, that is, the sibling Y of its parent X, and the uncle of that Y, until the root is reached (uncles are marked with * in the figure). Using this information the client recalculates the root hash of the tree, and compares it to the root hash it received from the trusted source.

3.2.13. Tracker Failure Retry Extension

This extension provides a simple backward-compatible addition to the BitTorrent Tracker Protocol that gives a client more details on a failure, specifying whether the failure is permanent or temporary and when the request can be repeated. [Tracker Failure Retry Extension]

There have been reported cases where, intentionally or not, the listed tracker locations (hostnames and their resolved IP addresses, or bare IP addresses) were not actually BitTorrent trackers. When this happens, the HTTP server living at those addresses receives a large number of /announce and /scrape requests which it cannot fulfill, as there is no BitTorrent tracker present on that web server.

Most BitTorrent clients do not check the HTTP error codes provided by the server and thus ignore 404 (File Not Found), which is the usual response when the server is not supposed to be used as a tracker. Clients then keep on retrying until the user finally gives up. With a large enough number of clients this might prevent the web server from serving the content it is actually meant to serve. Some client developers, however, do not want to act on the 404 or any other error code, arguing that the 404 might be temporary and that continuing to retry is more important.

This proposal addresses this problem by defining a BitTorrent tracker response field "retry in", which allows site owners to return a static response for /announce and /scrape telling the client that this server is permanently not acting as a tracker, thus making the client never return. The user or client can then also inform the source of the .torrent that the listed tracker is not a tracker.

This error message can also be used to tell a client that it should retry its request after a certain amount of time. This allows an overwhelmed tracker to spread its load somewhat.
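A client-side reading of such a response might look like the following Python sketch. It assumes the tracker reply has already been bdecoded into a dictionary with the keys "failure reason" and "retry in" as specified in Section 3.2.13.1 (a number of minutes, or the value "never"). The function name next_retry_delay and the 30-minute fallback for failures that carry no "retry in" field are illustrative assumptions, not part of the extension.

```python
from typing import Optional

# Hypothetical fallback for failures without a usable "retry in" field;
# the extension does not define such a value.
DEFAULT_RETRY_MINUTES = 30


def next_retry_delay(response: dict) -> Optional[int]:
    """Return the number of seconds to wait before the next request.

    Returns 0 if the response is not a failure, and None if the tracker
    asked never to be queried again.
    """
    if "failure reason" not in response:
        return 0
    retry_in = response.get("retry in")
    if retry_in == "never":
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(retry_in, int) and not isinstance(retry_in, bool) and retry_in > 0:
        return retry_in * 60          # the field is expressed in minutes
    return DEFAULT_RETRY_MINUTES * 60


print(next_retry_delay({"failure reason": "busy", "retry in": 5}))                 # 300
print(next_retry_delay({"failure reason": "not a tracker", "retry in": "never"}))  # None
```

A client that receives None would drop the tracker from its list instead of scheduling another announce.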
3.2.13.1. "retry in" extension to "failure reason"

The failure response consists of a bencoded "failure reason", which contains the reason and is backward compatible with clients that do not support this extension. The new field "retry in" specifies the number of minutes after which a retry may be attempted for this tracker. This field is either a positive integer or the value "never". The latter specifies that the client should never send this query again.

3.2.14. DHT scrape

4. eMule Protocol

4.1. Mainline Protocol Specification

4.2. Enhancement Proposal

5. Messages for PPSP

We analyze the message design of the above P2P file downloading and streaming applications (BitTorrent and eMule). From the base and enhancement approaches in their peer and tracker protocol specifications, we summarize a common message design for the P2P streaming protocol standardization.

5.1. Tracker-Peer Messages

5.1.1. Baseline Tracker-Peer Messages

The tracker-peer protocol should consist of the following messages. For performance reasons, most trackers are implemented over UDP, so UDP spoofing should be considered: the source address of a UDP packet can be forged, so a packet may appear to come from a different peer. To prevent this, the tracker calculates a connection ID and sends it to the peer when the peer sends a "Connect Request". For example, peer i may spoof peer j's address, but it does not know peer j's connection ID unless it can sniff the network. In subsequent requests, the tracker verifies the connection ID and ignores the request if the connection ID does not match. The connection ID can be generated at random.

5.1.1.1. Connect Request

In the first step, a peer sends a "Connect Request" message to the tracker.

5.1.1.2. Connect Response

When the tracker receives a peer's "Connect Request" message, the tracker returns a "Connect Response" message.
It contains a connection ID.

5.1.1.3. Announce Request

The peer reports its information to the tracker. The following items can be considered:

   Channel ID
   Downloaded size
   Uploaded size
   Event: start, stop, downloading, complete

5.1.1.4. Announce Response

When the tracker receives a peer's "Announce Request" message, the tracker returns an "Announce Response" message. It contains the announce interval in seconds. The peer does not announce again until the interval has passed or a new event has occurred.

5.1.1.5. Get-Peer Request

When the peer wants to get the peer list, it sends a "Get-Peer Request" message with the channel ID.

5.1.1.6. Get-Peer Response

When the tracker receives the "Get-Peer Request" message, the tracker replies with a "Get-Peer Response" message containing the peer list information.

5.1.1.7. Retry Response

When the tracker receives a peer's message while it is busy, the tracker sends a "Retry Response" message to the peer. The message contains the retry interval in seconds.

5.1.1.8. Error Response

When the tracker receives a peer's message and encounters an error, the tracker sends an "Error Response" message to the peer.

5.1.2. Enhancement Tracker-Peer Messages

5.1.2.1. P2P Layered Streaming Message

To support layered P2P streaming, the following messages are considered. [P2P Layered Streaming]

PUT-LAYER (Put Layer Information) into Tracker

When the source puts the layer information, the tracker replies with an ACK. The layer information holds the layer number and the information for each layer.

GET-LAYER (Get Layer Information) from Tracker

When a peer requests the layer information, the tracker replies with the layer information. It holds the layer number and the information for each layer.
LAYER-CHANGE (Layer Change)

A peer can select a suitable active layer according to its current network bandwidth. For example, when a peer's bandwidth is high, the peer can request chunks of all layers. When a peer's bandwidth is low, the peer can request only the lower layers, or just the base layer chunks. When the peer changes its layer state, it sends a message to notify its peers and the tracker so that they update its information.

5.2. Peer-Peer Messages

5.2.1. Baseline Peer-Peer Messages

5.2.1.1. Interested

If, after checking PeerB's bitfield, PeerA wants to request data from PeerB, PeerA sends an "Interested" message to PeerB.

5.2.1.2. Not Interested

If, after checking PeerB's bitfield, PeerA does not want to request data from PeerB, PeerA sends a "Not Interested" message to PeerB.

5.2.1.3. Choke

PeerB has received PeerA's "Interested" message or is sending data to PeerA. If PeerB wants to stop sending data to PeerA, PeerB sends a "Choke" message to PeerA.

5.2.1.4. Unchoke

PeerB has received PeerA's "Interested" message or is not sending data to PeerA. If PeerB wants to start sending data to PeerA, PeerB sends an "Unchoke" message to PeerA.

5.2.1.5. Have Piece

When a peer finishes receiving a new piece, it sends a "Have Piece" message about this piece to the peers in its peer list.

5.2.1.6. Bitfield Request

When PeerA wants to refresh its view of PeerB's bitfield, PeerA sends a "Bitfield Request" message to PeerB.

5.2.1.7. Bitfield Response

When PeerB receives PeerA's "Bitfield Request" message, PeerB sends a "Bitfield Response" message to PeerA with PeerB's bitfield information.

5.2.1.8. Piece Request

When PeerA wants to request data from PeerB, PeerA sends a "Piece Request" message to PeerB.

5.2.1.9. Piece Response

PeerB receives PeerA's "Piece Request" message.
If PeerB has the related data, PeerB replies with a "Piece Response" message containing the data.

5.2.1.10. Piece Cancel

PeerA has sent a "Piece Request" message to PeerB and has not yet received the related data. If PeerA no longer wants to receive that data from PeerB, PeerA sends a "Piece Cancel" message to PeerB.

5.2.2. Enhancement Peer-Peer Messages

5.2.2.1. Have All/None Bitfield

PeerA receives PeerB's "Bitfield Request" message. When PeerA's bitfield is full, PeerA sends a "Have All Bitfield" message to PeerB. When PeerA's bitfield is empty, PeerA sends a "Have None Bitfield" message to PeerB.

5.2.2.2. Suggest Piece

PeerA receives PeerB's "Piece Request" message. If PeerA finds that a piece is rare among the peers in its peer list, PeerA suggests that PeerB download this piece with a "Suggest Piece" message.

5.2.2.3. Piece Reject

PeerA receives PeerB's "Piece Request" message. If PeerA does not want to send this piece to PeerB, PeerA rejects PeerB's request with a "Piece Reject" message.

5.2.3. DHT Messages

5.2.3.1. Ping Request

A peer reports its activity status to the peers in its peer list with a "Ping Request" message.

5.2.3.2. Ping Response

When a peer receives a "Ping Request" message, it replies with a "Ping Response" message.

5.2.3.3. Find-Node Request

When a peer wants to find a node ID, it sends a "Find-Node Request" message to the K closest nodes in its peer list.

5.2.3.4. Find-Node Response

When a peer receives a "Find-Node Request" message, it replies with a "Find-Node Response" message containing the K closest nodes in its peer list.

5.2.3.5. Get-Peer-List Request

When a peer wants to find the peer list for a node ID, it sends a "Get-Peer-List Request" message to the K closest peers found in the find-node process.

5.2.3.6. Get-Peer-List Response

When a peer receives a "Get-Peer-List Request" message, it replies with a "Get-Peer-List Response" message containing the related peer list information.

5.2.3.7. Announce Peer Request

When a peer wants to announce its information, it sends an "Announce Peer Request" message to the K closest peers found in the find-node process.

5.2.3.8. Announce Peer Response

When a peer receives an "Announce Peer Request" message, it replies with an "Announce Peer Response" message indicating success.

5.3. Peer-CDN(HTTP) Messages

The CDN (HTTP) server can support the HTTP "Range" request header. The request that the client sends has the following format:

   GET url HTTP/1.1
   Range: bytes=[start]-[end]

5.4. Tracker-Tracker Messages

Todo: The content of this section needs further input.

6. Security Considerations

Todo: The content of this section needs further input.

7. Conclusions

Todo: The content of this section needs further input.

8. References

[BitTorrent in wiki]
   http://en.wikipedia.org/wiki/BitTorrent_%28protocol%29

[Index of BitTorrent Enhancement Proposals]
   http://www.bittorrent.org/beps/bep_0000.html

[Bittorrent_Protocol_Specification]
   The BitTorrent Protocol Specification
   http://www.bittorrent.org/beps/bep_0003.html

[DHT Protocol]
   http://www.bittorrent.org/beps/bep_0005.html

[Fast Extension]
   http://www.bittorrent.org/beps/bep_0006.html

[Multitracker Metadata Extension]
   http://www.bittorrent.org/beps/bep_0012.html

[UDP Tracker Protocol]
   http://www.bittorrent.org/beps/bep_0015.html

[Superseeding]
   http://www.bittorrent.org/beps/bep_0016.html

[HTTP Seeding]
   http://www.bittorrent.org/beps/bep_0017.html

[Extension for Partial Seeds]
   http://www.bittorrent.org/beps/bep_0021.html

[BitTorrent Local Tracker Discovery Protocol]
   http://www.bittorrent.org/beps/bep_0022.html

[Tracker Returns External IP]
   http://www.bittorrent.org/beps/bep_0024.html

[Private Torrents]
   http://www.bittorrent.org/beps/bep_0027.html

[Merkle tree torrent extension]
   http://www.bittorrent.org/beps/bep_0030.html

[Tracker Failure Retry Extension]
   http://www.bittorrent.org/beps/bep_0031.html

8.1. Normative References

[P2P Layered Streaming]
   P2P Layered Streaming for Heterogeneous Networks in PPSP (drafting).

[Tracker Protocol]
   Yingjie Gu, et al. Tracker Protocol, PPSP (drafting)

[PPSP Survey]
   Yingjie Gu, et al. Survey of P2P Streaming Applications (drafting)

8.2. Informative References

[1]

9. Acknowledgments

Authors' Addresses

Kent Kangheng Wu
Hong Kong Applied Science and Technology Research Institute Company Limited (ASTRI)
3/F, Building 6, 2 Science Park West Avenue,
Hong Kong Science Park, Shatin, New Territories, Hong Kong
Phone: 852-34062908
Email: khwu@astri.org

James Zhibin Lei
Hong Kong Applied Science and Technology Research Institute Company Limited (ASTRI)
3/F, Building 6, 2 Science Park West Avenue,
Hong Kong Science Park, Shatin, New Territories, Hong Kong
Phone: 00852-34062748
Email: lei@astri.org

Dah Ming Chiu
Hong Kong Applied Science and Technology Research Institute Company Limited (ASTRI)
3/F, Building 6, 2 Science Park West Avenue,
Hong Kong Science Park, Shatin, New Territories, Hong Kong
Phone: 00852-34062979
Email: dmchiu@ie.cuhk.edu.hk