Network Working Group L. Bertz Internet-Draft Sprint Intended status: Informational October 19, 2015 Expires: April 16, 2016 Extensions for ALTO Aggregation draft-bertz-alto-aggrimpl-00 Abstract This document defines possible ALTO extensions for Aggregation based upon implementations in the e_alto open source project. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on April 16, 2016. Copyright Notice Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Bertz Expires April 16, 2016 [Page 1] Internet-Draft ALTO-Aggregation October 2015 Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 2 3. Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3.1. Requirement 1 - Origin property . . . . . . . . . . . . . 3 3.2. Requirement 2 - EPCS finegrain constraint . . . . . . . . 4 3.3. Requirement 3 - Demanded Query Recognition . . . . . . . 4 4. Future Work / Areas of Investigation . . . . . . . . . . . . 5 4.1. Cycle Avoidance . . . . . . . . . . . . . . . . . . . . . 5 4.2. Measurement Initiation . . . . . . . . . . . . . . . . . 5 4.3. Writing Data to Aggregators . . . . . . . . . . . . . . . 6 5. Informative References . . . . . . . . . . . . . . . . . . . 6 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 7 1. Introduction Several ALTO documents address the reality that multiple sources exist for ALTO [MULTISOURCE] and ALTO Clients may be required to deal with ALTO servers in multiple domains [CROSSDOMAIN]. It is possible if not probable that multiple ALTO servers exist within a single domain for many reasons including: o query optimization by storing specific sets of indices for a subset of ALTO Clients, e.g. P2P, DMM, CDN Request Routers o scaling just to meet Client demand, i.e. ALTO Server Load o sub-domains within a domain e_alto (http://www.github.com/lylebe/e__alto) is an open source project designed based upon the assumption that it would provide ALTO compliant services and act as a server of information origin and an aggregator. During its implementation several use cases were discovered and some basic ALTO extension were made in order to prepare the software to act as origin and aggregator of ALTO information. 2. Requirements The following was discovered during the construction of the Aggregation features in the ALTO Server: It is entirely possible that multiple aggregation tiers will exist between an ALTO Client and the ALTO Origin Server. How can a Client traverse the sources to find the Server? How can a minimal amount of Bertz Expires April 16, 2016 [Page 2] Internet-Draft ALTO-Aggregation October 2015 data be stored on the Aggregators to support this, especially if horizontal scaling was part of the motivation for aggregation? Requirement 1 - Provide the ability to determine the server of origin for an ALTO data. The Aggregator should be to provide better information as opposed to summarized (coarse grain) information when it exists. This requires a mechanism by which fine grained and coarse grained data can be distinguished. Requirement 2 - Distinguish between first hand knowledge and aggregated information. Aggregators will need to adjust over time to the queries of the Clients it serves. It is entirely possible that Clients will ask about information the Aggregator and its Origin Servers do not have but could acquire if they knew the information was desired. Requirement 3 - Provide the ability for ALTO Origin Servers to know of queries that are missed, i.e. no such data exists in the ALTO Aggregator. 3. Solutions 3.1. Requirement 1 - Origin property An origin property can be added to the various property service functions. This property notes the URI of the ALTO IRD which the Server acquired the data from or the value 'self' indicating the ALTO Server being queried is the server of origin. Aggregated map responses can add the "origins" property to the "meta" property in order to note information. When a Costmap is filtered or, for example, a single pair of endpoints is used in a EPCS query a single value for the origins field can be returned to provide the equivalent of an origin query for individual entries in Costmaps or Endpoint Costmaps. Name collisions are problematic during aggregation and are only currently resolved through configuration. Recursion can be used by the ALTO Client or Aggregator to find the origin of a specific piece of information. For example, using the origin property returned by an Aggregator the Client could then query the ALTO Server referenced in the response to acquire the property service. If the Server is not the source, i.e. it responds with yet Bertz Expires April 16, 2016 [Page 3] Internet-Draft ALTO-Aggregation October 2015 another URI instead of the value 'self', the Client may follow the URI recursively until the originating ALTO Sever is located. The next hop solution seems tedious but it was chosen in order to permit the discovery of ALTO exchanged information. 3.2. Requirement 2 - EPCS finegrain constraint Distinguishing coarse and fine grain information affects only EPCS. A new constraint 'finegrain' was added to EPCS in order to note that if no known finegrain data is available 'coarse' information from EPCS aggregated data or Costmaps should not be provided. This permits an Aggregator to query for only finegrain data and use it in place of coarse grain responses which would be provided by Costmaps. The Aggregator then minimized the amount of stored EPCS data and relied on Costmaps for coarse grain. When an EPCS update is provided using specifications such as [SSE] the ALTO Server provides a "granularity" property in the "meta" field with 3 possible values o finegrain - this response only contains finegrain measurements o coarsegrain - this response is known to contain coarse grain measurements o unknown - granularity cannot be determined and the response should be treated as coarsegrain 3.3. Requirement 3 - Demanded Query Recognition In order to adjust to Client queries the ALTO Aggregator and Servers can implement at Demanded Query recognition by tracking the number of times a specific query could not be satisfied. Once an administratively defined threshold is reached it could the forward the request to ALTO Servers that would be likely sources of the information. Likely sources can be identified by the Aggregator using the Network Map to determine which ALTO Server could provide information on the Endpoints or PIDs. The server would then repeat the queries to the Servers. If those servers also support Demanded Query recognition they can repeat the process described here if they are not the Origin Server. Bertz Expires April 16, 2016 [Page 4] Internet-Draft ALTO-Aggregation October 2015 For missed queries that involve Costmaps, PIDS, Endpoints or Endpoint Costmaps from multiple Origin Servers the Aggregator would send the queries to all ALTO Servers. An ALTO Server would only forward on data it acts as an Aggregator or Origin for if it has the information or forward the request on to the Origin Servers it would use for the queried information. NOTE: This implies that cycles in data exchange should be avoided or a mechanism should be put in place to recognize query cycles. At the Origin Server the constant query miss could be used to adjust what is measured or filtered. Such a mechanism would also, for example, permit two ALTO Servers to realize that initiating measurements between a few endpoints would provide data to common ALTO queries. 4. Future Work / Areas of Investigation Based upon the current work underway several other areas of investigation have been recognized. 4.1. Cycle Avoidance The current architecture assumes no cycles between the ALTO Client and Origin Servers. This should be formalized into a easily detectable mechanism and a part of both the update and query functions in ALTO. 4.2. Measurement Initiation Demanded Query (cache miss) Recognition can be used to recognize when data is not being provided but solutions to initiate the measurement of the data has not been defined. For missed EPCS information the endpoints to be measured can be provided. Properities associated with the endpoint could note a URI where a new type of measurement initiation protocol could be implemented. Such a protocol can be used to provide the needed information. The Aggregator could act as an intermediary and supply the measurement initiation URIs for specified endpoints, PIDs or Network Maps in the "meta" field. In such a case the ALTO Origin Server can pass along this information to underlying Measurement initiation functions without the need for it to act as an ALTO Client to track down other Measurement Initiation Service Points (URIs). Bertz Expires April 16, 2016 [Page 5] Internet-Draft ALTO-Aggregation October 2015 For Costmaps the Aggregator could query for PID properties using extensions [PIDPROPS] or query for Endpoints with Measurement Initiation URIs and then use the Demanded Query Recognition process with those endpoints to acquire finegrain measurements for use in the Costmap. However, this bypasses some of the process by which an Origin Server would compute the specified Costmap metric value and is not recommended. Rather, implementation of [PIDPROPS] and exposing a Measurement Initiation URI property would preferred. 4.3. Writing Data to Aggregators The following could be incorporated into the ALTO update process o Cache techniques common to HTTP o Vector-Clocks and Interval Tree Clocks to support Eventual Consistency o Detecting when sending update history (change log) is more efficient than downloading the entire JSON structure when disconnects occur (resync) o Acquiring update history for a given time period o JSON Updates for Strong Eventual Consistency [SSE] provides a basis for updates. Aggregating the data and providing consistency of replicated JSON data could be explored to provide loser coupling constraints between various ALTO entities. However, such development would focus on JSON and HTTP and is more general than ALTO. 5. Informative References [CROSSDOMAIN] Kiesel, S. and Stiemerling, M., "Application Layer Traffic Optimization (ALTO)Cross-Domain Server Discovery", 2015. [MULTISOURCE] Gao, K. and Yang, Y., "ALTO Extensions for Multi-Source Information Collection", 2015. [PIDPROPS] Roome, W., "Extensible Property Maps for the ALTO Protocol", 2015. [SSE] Roome, W. and Yang, Y., "ALTO Incremental Updates Using Server-Sent Events (SSE)", 2015. Bertz Expires April 16, 2016 [Page 6] Internet-Draft ALTO-Aggregation October 2015 Author's Address Lyle Bertz Sprint 6220 Sprint Parkway Overland Park KS, 66251 USA Email: lyleb551144@gmail.com Bertz Expires April 16, 2016 [Page 7]