PPSP Y.Chen Internet Draft Beijing Jiaotong University Intended status: Informational Y. Zhang China Mobile Expires: April 2010 October 18, 2009 Evaluation of DHT-based Chunk Discovery for P2P Streaming draft-chen-ppsp-dht-chunk-discovery-evaluation-00.txt Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This Internet-Draft will expire on April 18, 2010. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License. Chen Expires April 18, 2010 [Page 1] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 Abstract Recently, some researchers have suggested that DHT should be applied to the P2P streaming system to do the media block search. This draft evaluates this proposal by simulation and modeling analysis, and discloses that it is inappropriate for P2P live streaming system. Chen Expires April 18, 2010 [Page 2] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 Table of Contents 1. Introduction.................................................4 1.1. The basic idea..........................................4 1.2. The extension...........................................4 2. Basic assessment.............................................5 2.1. Characteristics of P2P Live Streaming System............5 2.2. Basic evaluation of the proposal........................5 2.3. Performance Evaluation of the basic program.............6 3. The extended method.........................................10 4. References..................................................11 4.1. Normative References...................................11 4.2. Informative References.................................11 Chen Expires April 18, 2010 [Page 3] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 1. Introduction DHT is a distributed resource discovery algorithm which is widely used in miscellaneous P2P-based application systems. Recently, some researchers have suggested that DHT should be applied to the P2P streaming system to do the media block search[Ning]. We evaluate this proposal by simulation and modeling analysis, and disclose that it is inappropriate for P2P live streaming system. Our results show that: 1) the delay for a peer to publish the chunk availability information to a DHT node hurdles the quick propagation of chunk in the network, and therefore has a negative impact on the real-time performance of system. 2) The registration of the chunk availability information brings huge load on the DHT node and network, which makes the method not scalable. We final conclude the proposed method will face serious challenges in the actual deployment. 1.1. The basic idea The proposed method is based on another IETF Draft: REsource LOcation And Discovery (RELOAD) [ID.ietf-p2psip-base]. It uses RELOAD where the resources is stored and retrieved by DHT to register chunk availability information and search it. Specifically, the method is: the DHT network stores the key-value pair as below: the key is the chunk id which is identified by (Video ID, Chunk-ID), and the value is the list of peer who have already owned the chunk, denoted by "Peer List". When a peer gets a chunk, it sends registration message to the corresponding DHT node to add itself into the peer list; while others want to get the chunk, they will query the DHT node in order to obtain the Peer List, and then request chunks from the peers in the Peer List. We call the above method as basic method. 1.2. The extension To optimize performance, the proposal[Ning] extends the above-mentioned "Peer List" to include the Chunk Lists of peers who are included in the Peer List. As a result, the contents stored in the key-value pair on the DHT node become the chunk lists of peers who have already obtained the chunk, which is denoted by "Peer List with Chunk List". We call this method as extended method. Chen Expires April 18, 2010 [Page 4] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 2. Basic assessment In this section, we give a simple overall assessment on the proposed method mentioned above. We then evaluate its performance based on simulation and modeling in the following section. Before we start our discussion of the design, let us first look at the characteristics of P2P Live Streaming System. 2.1. Characteristics of P2P Live Streaming System An important feature of the P2P live streaming system is that its users watch the program nearly real-time, which is what we call the real-time requirements. In other words, after the media server broadcasts a chunk, all users should be able to obtain it within the shortest possible time. In the current commercial P2P live streaming system, the delay is kept at about 20s in a network with size > 10k. Because the chunk propagation is in such a short time constraints, when the media server upload a new chunk to the network, peer should detect the emergence of this new chunk quickly, find the peers who have already get the chunk and therefore are able to act as the source of this chunk, and then request to peer for the chunk. Moreover, because it is a distributed system, the request may fail because the source node leaves, or the ability of the source node can not meet the needs of all incoming chunk fetching requests and therefore have to ignore partial incoming requests. In this case, the peer need retry the above procedure to get the chunk. Therefore, after a peer obtains a new Chunk, it must quickly register that it has got this chunk, and then other peers can find this information, and then request to download from it. 2.2. Basic evaluation of the proposal We above present the piece propagation in P2P live streaming system as a snowball process. We now turn to evaluate the proposed DHT- based approach in general. First, denote the number of valid pieces in the P2P network by X. With the proposed method, there are at most X chord nodes storing the peer list information. For a large-scale network, it is not reasonable for only X nodes to take this responsibility. In practice, the X is generally 1k-2k. The number of peers in the network, however, can be very large, e.g. it can be 1M at the Olympic Game Opening Play. Chen Expires April 18, 2010 [Page 5] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 We second look at the basic method. The essence of this approach is using the DHT node to act as a centralized tracker of the chunk, because all the peer information of a chunk is stored in a single DHT node. Therefore, the approach faces the same scalability problem of traditional C/S structure when the chunk requests surge. The proposal is aware of this scalability problem, and therefore, goes forward to propose its extended method. The essence of the extended method is that the DHT node for a chunk stores more content about peers' piece availability information, with the expectations that it can reduce the number of requests to the DHT node, and alleviates the scalability issues. The intent of this extension is right. The contents it proposes to store, however, are too excessive. Let us clarify this problem by using the traditional Data-Driven method as example. In this method, a peer only gets the chunk list of their neighbors. For example, in a 10k network, a peer only gets about 30 peers' chunk list information. In the proposed extended method, the DHT node will return the querying peer the chunk list of all 10k peers, which does not make sense. Even if the proposed extension is revised to let the DHT node only return chunk lists of limited number of peers, it is still to be difficult to design an efficient and scalable peer selection algorithm. The final implementation of the problem is it did not give a suitable solution to achieve this functionality, especially the method to update these chunk lists. The implementation effort is undoubtedly considerable. 2.3. Performance Evaluation of the basic program To evaluate the performance of system, we first introduce the exponential backoff algorithm for the piece scheduling of P2P live streaming system [Yishuai]. The proposed method does not take into account an important question, i.e., how quickly a peer will retry to look up the peer list of a chunk when it fails to get a chunk. [Yishuai] proposes applying the well-known exponential backoff algorithm. This method means that, when the request is rejected, a peer retreats an exponentially distributed time delay, and then restart the look up. Chen Expires April 18, 2010 [Page 6] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 We now model and simulate to evaluate the system performance with the proposed method. We first discuss the basic method. In this method, the peer list of a certain chunk is stored in a single DHT node. We simulate the following procedure: after the media server uploads a new chunk, all peer exponential back off and then look up the peer list of the chunk. After a peer gets the peer list, it requests the chunk from these peers. If the source peer is spare, it will get the piece. The peer who has just obtained a new chunk will immediately send its registration request to the corresponding DHT Node to register itself. If the source peer is not spare, the peer will not get the piece and will retry after an exponential back off. The procedure can be modeled as below. Assume it takes a peer b seconds to upload a chunk to another. Assume a source peer uploads to one other peer at a time. Denote the mean number of hops of the registration message by H, and denote the mean network transmission time for it to reach the corresponding DHT node by c. Therefore, the chunk propagation process is as follows: o t = 0s, Peer 1 starts the download from the Media Server for the first chunk. o t = b seconds, Peer 1 gets the chunk, and send the registration message to the DHT node. At the same time, Peer 2 begins to start the download from the Media Server for the 2nd chunk. o t = b + c seconds, Peer 1 completes its registration, and is immediately found by Peer 3. Peer 3 begins downloading from Peer 1 at that time. o t = 2b seconds, Peer 2 gets the chunk, and sends the registration message to the DHT node. At the same time, Peer 4 begins to start the download from the Media Server for the fourth chunk. o . . . Chen Expires April 18, 2010 [Page 7] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 We use discrete event simulation to simulate the effect of registration delay c on the piece propagation procedure. We first estimate the c as below. Consider an N-node Chord network. Assuming the N nodes distributed on the Chord rings uniformly randomly. [Stoica] proposed a model to estimate the average path length of a query in the Chord network, i.e., L = (1/2)log2(N). Denoting the network transmission delay of one hop is T, we have c = (1/2)log2(N)T For each simulation configuration, we have carried out 20 times discrete-time simulations. Each simulation has different random number seed. 1. Piece Propagation Delay We use 99% Piece Propagation Delay (99% PDD) as the main metric to evaluate the piece propagation delay, which means the time required for 99% of peers to obtain the chunk after it is uploaded by the media server. We simulate two T: one is 200ms and the other is 400ms. Our simulation network consists of N=10k nodes. Therefore, the corresponding register and lookup latencies for them are (1/2)log2(N)T = 1.3288s and 2.6575s. The exponential backoff rate lamda is 1. We compare the simulation results with those of the traditional data- driven method [Coolstreaming]. [Yong] and [GridMedia] have proven that the traditional data-driven approach can obtain nearly-optimal piece propagation delay which is close to the result we obtain by setting c=0 in simulation. Our results obtained as follows: Data-Driven Methods: 99% PDD mean: 26.2524; standard deviation of 6.8494 DHT-based Method - T = 200ms: 99% PDD mean: 34.1400; standard deviation of 6.8627 DHT-based Method - T = 400ms: 99% PDD mean: 41.1365; standard deviation of 5.9998 Chen Expires April 18, 2010 [Page 8] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 By comparison, when the single-hop delay is 200ms, the 99% PDD increases by (34.1400-26.2524) / 26.2524 = 30%; When the single-hop delay is 400ms, the 99% PDD increases by (41.1365-26.2524) / 26.2524 = 56.7%, which is a consideration deterioration. 2. Nodes and network load With the basic method, a DHT Node will receive two kinds of signaling messages: a) Register; b) Look up. For the Register message, the evaluation is relatively simple, that is, for each chunk, each peer will register once. As a result, the DHT node will get N messages. Because each message will traverse mean (1/2)log2(N) hops, the total number of registration message in the network is (N/2)log2(N). For the Look up message, the situation is more complex because the failed peers will look up again. We recorded the number of lookup the DHT node received during the simulation. The results are: with T=200ms, the mean number of look up messages is 210,048 with standard deviation 67,544; With T=400ms, the mean number of look up messages is 263,320 with standard deviation 59,755. The load on DHT node and network is quite large. By comparison, in the Data-Driven approach, the chunk list is exchanged between neighbors, and therefore, the above centralized messages handling is avoided. Chen Expires April 18, 2010 [Page 9] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 3. The extended method We now assess the extended method. We focus on the evaluation of the node and network load. First, as we explained before, there are at most X chord nodes storing the peer list information. Assuming a peer owns X/2 chunks on average, it appears in the peer list of X/2 chunks. Therefore, when it gets a new chunk, it should update its chunk lists stored in the X/2 chunks, which requires X/2 registration messages. As a result, for a chunk, N peers generate NX/2 registration messages. Because each registration message traverse (1/2)log2(N) hops, the total number of registration messages in the network is (NX/4)log2(N). To demonstrate the problem, considering the following example: when X=1k, N=1M, each node need process or routing (X/4)log2(N)= 4.9829e+003 message for a chunk, which is considerable. Assuming we duplicate the peer list of a Chord Node to other nodes for load balance in practice. The duplication, however, increases the number of registration messages in the network. For instance, if there are Y nodes to duplicate the peer list of a Chord node, then the total number of registration messages in the network becomes (NXY/4)log2(N). To demonstrate the problem, considering the following example: when X=1k, Y=100, N=1M, each node need process or routing (XY/4)log2(N)= 4.9829e+005 message for a chunk, which is unacceptable. Chen Expires April 18, 2010 [Page 10] Internet-Draft Evaluation of DHT-based Chunk Discovery for P2P Streaming October 2009 4. References 4.1. Normative References [1] [ID.ietf-p2psip-base] Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and H. Schulzrinne, "REsource LOcation And Discovery (RELOAD)Base Protocol", draft-ietf-p2psip-base-02. 4.2. Informative References [2] [Ning] N. Zong, Chunk Discovery for P2P Streaming, draft-zong-ppsp -chunk-discovery-00. [3] [Coolstreaming] X.Zhang, J.Liu, B. Li, and T.P. Yum. "Coolstreaming / donet: a data-driven overlay network for peer- to-peer live media streaming". In Proceedings of IEEE / INFOCOM, Miami, March 2005. [4] [GridMedia] M. Zhang, Q. Zhang, L. Sun, and S. Yang, "Understanding the power of pull-based streaming protocol: can we do better?" IEEE Journal on Selected Areas in Communications, 2007. [5] [Yong] C. Liang, Y. Guo, and Y. Liu. Is Random Scheduling Sufficient in P2P Video Streaming? In Proc. Of ICDCS, 2008 [6] [Yishuai] Y. Chen, C. Chen, Y. Zhao, C. Li, Blind Random Scheduling for P2P Live Streaming, Submitted, 2008 Author's Addresses Yichuai Chen Beijing Jiaotong University Phone: 86 10 51684757 Email: chenyishuai@gmail.com Yunfei Zhang China Mobile Phone: 86 13601032119 Email: zhangyunfei@chinamobile.com Chen Expires April 18, 2010 [Page 11]