ionta-plic-01.txt

Internet DRAFT - draft-ionta-plic
draft-ionta-plic

Last Version:	draft-ionta-plic-01.txt	Tracker Entry
Date:	`17-May-2016`
Disposition:	expired
Previous Versions:	draft-ionta-plic-00.txt (diff) - 17-Nov-2015

Network Working Group                                         T. Ionta
Internet-Draft                                            A. Cappadona
Intended status: Standard                               Telecom Italia
Expires: Nov 15, 2016                                     May 16, 2016


A Performance Layer Inboard Computing method to support active, passive
                    and hybrid measurements
                    draft-ionta-plic-01.txt

Abstract

This document describes an innovative frame, called PLIC,("Performance 
Layer Inboard Computing") able to support active, passive or hybrid 
performance measurements on any kind of unicast and multicast service 
passing through a network. It is based on an algorithm which, through 
a distributed computation, floods the performance measurements of each
node to the rest of the network, without necessity of any external 
topology database or operation support system. It has been invented 
and engineered in the Telecom Italia Core Network.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on May 1, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.







Ionta, et al.           Expires May 16, 2016                   [Page 1]

Internet-Draft Performance Layer Inboard Computing method October 2015

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2. General Overview of the method . . . . . . . . . . . . . . . . . 3
3. Detailed description of the method . . . . . . . . . . . . . . . 3
3.1 Discovery process (Process 1) . . . . . . . . . . . . . . . . . 4
3.2 PLIC-Neighborship building (Process 2). . . . . . . . . . . . . 4
3.3 PLIC Token getting (Process 3). . . . . . . . . . . . . . . . . 6
3.4 Local Performance parameters calculation (Process 4). . . . . . 6
3.5 PLIC Token building (Process 5) . . . . . . . . . . . . . . . . 7
3.5.1 General info put in the PLIC Token. . . . . . . . . . . . . . 7
3.5.2 PLIC Token detailed structure . . . . . . . . . . . . . . . . 8
3.6 Processes sequencing during each single timeslot. . . . . . . . 8
4. Robustness of the method . . . . . . . . . . . . . . . . . . . . 9
5. Implementation and deployment (use case) . . . . . . . . . . . .10
6. Security Considerations. . . . . . . . . . . . . . . . . . . . .10
7. IANA Considerations. . . . . . . . . . . . . . . . . . . . . . .10
8. Acknowledgements.  . . . . . . . . . . . . . . . . . . . . . . .10
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . .11
9.1. Normative References . . . . . . . . . . . . . . . . . . . . .11
9.2. Informative References . . . . . . . . . . . . . . . . . . . .11
Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . .11

1. Introduction

This document provides a framework for performing measurements on data
flows transmitted in a communication network, which may be 
implemented in a distributed way by the nodes themselves, without 
requiring the intervention of any external management server 
collecting measurements from the nodes. It is also capable of 
automatically adapting, without necessity of getting topology from 
any external topology database, to any possible changes of the 
network topology and of the paths followed by the data flow across 
the communication network, thus easing the operations of performance
monitoring and failure detection and management.
This framework is applicable to active, passive or hybrid performance
measurements, since it can support both active and passive probing, or
a mix of them. 
It is also capable to support performance measurements on both 
multicast and unicast services.
It is applicable everywhere across the network (Core, Edge, Access), 
giving the possibility to have the result of the measurements either 
from an "end-to-end" point of view or a "node by node" too.









Ionta, et al.           Expires May 16, 2016                   [Page 2]

Internet-Draft Performance Layer Inboard Computing method October 2015



2. General Overview of the method

The main idea behind this method is to find a way to pass node-by-node
the performance measurements (Packet Loss, Delay, CRC, Bandwith, 
passive/active measurement parameters or whatever else) taken on each 
node in a way that let any other node of the network to:
- identify, automatically and dynamically, the list of all the upstream
  nodes along the reverse path up to the source node of the service, 
  thus avoiding the necessity of an external topology database. The 
  topology discovery is performed and updated frequently (with a period
  equal or less than a single timeslot) using the method described in 
  the following paragraph and based on a discovery method where each 
  node, discovering just its neighbors and passing these discovery info
  to the following node, allows each node to build, in a distributed 
  way, the topology structure of the reverse path up to the source node
  of the service.
- know what happens on the upstream nodes in terms of performance 
  measurements.
The data structure devoted to the information passing is defined as 
PLIC Token.
It is generated by each node during a specific timeslot and passed (in
a push or pull way, indifferently) to the following node during the 
same timeslot, thus allowing all the nodes to update themselves during
the same timeslot.
PLIC Token flows through the network using a multicast flow (belonging
to a probe or to real traffic, indifferently). The method is 
applicable, at the same time, to one or multiple multicast trees, 
regardless to the location of the sources and the receivers. 
Note that the multicast transmission scenario is merely exemplary, 
because the method is also applicable to unicast services. For instance
point-to-point transmission (e.g. MPLSE-TE tunnels, ATM/FR tunnels, 
etc.)).
It is also applicable both to a flow-based traffic (i.e. active 
probing) and to a link-based traffic (i.e. passive probing)
A notable consequence of the method is that no external Operation 
Support System is needed to manage performance and topology data, 
because, according to the method described in this document, 
management is distributed on-board. 
The communication protocol among nodes can be a new ad-hoc one (as 
the one proposed in this document) or an extension of an existing one 
(i.e. PIM).


3. Detailed description of the method

To achieve the above goals, and given a generic multicast tree, each 
node N performs the same set of five processes, detailed below, during 
each single timeslot (whose duration is freely settable by the 
operator depending on the architecture of the network under 
measurement) and then repeated during each following timeslot.

Ionta, et al.           Expires May 16, 2016                   [Page 3]

Internet-Draft Performance Layer Inboard Computing method October 2015


3.1 Discovery process (Process 1)
This discovery process is a local process (no external off-line 
topology database is needed) and it is aimed to identify, for a 
specific multicast flow entering a specific interface, the chain of the
upstream nodes along the reverse path from the current node up to the 
source node.
Thus the current node, during the current timeslot "n" and before 
performing the remaining processes, performs two actions (platform 
dependent):
- Identify the specific interface where the multicast flow is entering 
the node;
- Identify, based on the information inside its IGP database, the set 
of all the upstream nodes along the reverse path from the current node 
up to the source node. This set of nodes is the input for process n. 2.

Note 1: since the topology refreshing frequency rate must be at least 
equal to the duration of a single time-slot, this way of performing 
discovery is much lighter than using an off-line topology database, 
where the impact of nodes polling would be too heavy.
Note 2: with the proposed algorithm, similarly to OSPF approach, the 
whole network topology is distributed (on any single node) instead of 
centralized (on an off-line topology database)


3.2 PLIC-Neighborship building (Process 2) 

Given the set (defined by Process 1) of the upstream nodes along the 
reverse path from the current node up to the source node, the current 
process, starting from the closest reachable upstream node, tries to 
build a "PLIC neighborship" with it, based on the steps defined in 
the current process. If the PLIC-Neighborship building will result 
unsuccessful with the closest upstream node, the current process will 
try to build the PLIC-Neighborship with the following upstream node 
and so on, up along the path toward the source.

As a first step the current node (N(m)) sends the following 
"PLIC_Token_Status_Request message" to its closest upstream node along 
the reverse path from the current node up to the source node:

"PLIC_Token_Status_Request message" definition:
- Node ID (of N(m))
- Request flag 


Ionta, et al.           Expires May 16, 2016                   [Page 4]

Internet-Draft Performance Layer Inboard Computing method October 2015

The upstream node, regarding its own PLIC Token, can be in one of the 
following situations:
1) No PLIC Token Tk(m-1): if the upstream node has never generated the 
PLIC Token since up and running. 
2) PLIC Token Tk(m-1) exists but it is "obsolete", that is generated 
before the previous timeslot (n-2 or older). Thus it is no more useful 
for node N(m).
3) PLIC Token Tk(m-1) has been generated during the previous timeslot 
(n-1), thus the desired PLIC Token will be likely generated during the 
current timeslot (n).
4) PLIC Token Tk(m-1) has been generated during the current timeslot 
(n), thus it is exactly the PLIC Token node N(m) is looking for.

Thus, to let node N(m) distinguish among these situations, it is 
enough, for node N(m-1) while answering to the 
"PLIC_Token_Status_Request message" with the "PLIC_Token_Status_Request
message", to put the Time-Stamp referring to when the PLIC Token has 
been successfully generated by node N(m-1). 

"PLIC_Token_Status message" definition:
- Node ID (di N(m-1))
- Ts(m-1)

If no PLIC_Token_Status message is received by N(m) after timeout, 
PLIC-Neighborship between N(m) and N(m-1) will end. 

Node N(m), depending on the timestamp contained in the 
"PLIC_Token_Status message", will be able to distinguish among the 
above possibilities, thus putting the upstream node under examination 
in one of the following two possible status:
- "PLIC-AWARE": if Tk(m-1) is in the state belonging to instance 1 
or 2 (as stated above)
- "PLIC-UNAWARE": if Tk(m-1) is in the state belonging to instance 3 
or 4 (as stated above).
In case the upstream node under examination is put in "PLIC-AWARE" 
status, it is also defined as "PLIC-AWARE PREDECESSOR node" (N(m-1)) 
of N(m) and a PLIC-Neighborship is built between N(m) and N(m-1).
Instead if the upstream node under examination is put in 
"PLIC-UNAWARE" status, the next upstream node will go under 
examination trying to build a PLIC-neighborship with it by sending a 
new "PLIC Token_Status_Request message" and analyzing the 
corresponding received "PLIC Token_Status message", as stated above.
This process will continue iteratively until one of the following 
conditions occurs: 
- One PLIC-AWARE node is found along the reverse path from the current 
node up to the source node. In this case Process 2 has not succeeded.
- No one PLIC-AWARE node is found along the reverse path from the 
current node up to the source node. In this case N(m) is named "Source 
Node" of the PLIC tree. Note that, following the above algorithm, the 
"Source Node" of the PLIC tree could differ from the multicast source 
node. In other words the "Source Node" of the PLIC tree is the PLIC 
AWARE node more close to the multicast source node. In this case 
Process 2 succeeded.

Ionta, et al.           Expires May 16, 2016                   [Page 5]

Internet-Draft Performance Layer Inboard Computing method October 2015


3.3 PLIC Token getting (Process 3)

As soon as process 2 succeeds, thus designating the 
PLIC-AWARE PREDECESSOR node (called N(m-1)) and building of a 
PLIC-Neighborship with it, the current node N(m) asks to it, for the 
first time, the PLIC Token Tk(m-1) by sending the 
"Get_PLIC_Token message" with the Request Flag set.

"Get_PLIC_Token message" 
Node ID (referred to N(m))
Request Flag

Instead, if no one PLIC-AWARE node will be found along the reverse 
path from the current node up to the source node, the current process 
is skipped by node N(m).

After sending this first request, a "request time counter" is 
switched on by N(m).

Now three possible occurrences occur:
1) Node N(m-1) doesn't answer immediately.
2) Node N(m-1) answers immediately sending its own PLIC Token Tk(m-1) 
(by using the "Send_PLIC Token message", composed by the fields 
detailed in Process 5). In this situation two possible occurrences 
(node N(m) distinguishes between them based on Ts(m-1) contained in 
the "Send_PLIC Token message" from N(m-1):
2.a) PLIC Token Tk(m-1) has been generated during the previous 
timeslot (n-1), thus the desired (cfr occurrence 3 of process 2).
2.b) PLIC Token Tk(m-1) has been generated during the current 
timeslot (n), thus it is exactly the PLIC Token node N(m) is looking 
for (cfr occurrence 4 of process 2).

In occurrence 2.b the current process succeeds and thus process 5 can 
immediately start.
Instead in occurrence 1 or 2.a node N(m), after an predefined 
incremental quantum time, will reiterate again and again Process 4 
until one out the following two occurrences occurs:
- occurrence 2.b occurs, thus process 5 can start, using PLIC Token 
Tk(m-1) to build Tk(m). 
- the "request time counter" overcomes a predefined timeout, thus 
process 5 can start, but without using PLIC Token Tk(m-1) to build 
Tk(m).


3.4 Local Performance parameters calculation (Process 4)

This process is a local process and it is devoted to calculate, on 
node N(m), the data of interest (i.e. PLR, CRS, Delay, etc.), based 
on measurements performed during the current or previous timeslot 
(depending on the kind of parameter measured ). These data will be 
put in PLIC Token Tk(m) during Process 5. Each performance parameter 
is calculated depending on the chosen method.

Ionta, et al.           Expires May 16, 2016                   [Page 6]

Internet-Draft Performance Layer Inboard Computing method October 2015


3.5 PLIC Token building (Process 5) 

This process is a local process and it is devoted to generate a new 
PLIC Token Tk(m), available for node N(m+1).
This process puts in the PLIC Token Tk(m) some kind of information. 
General information and the detailed corresponding fields are 
described in the following paragraphs.


3.5.1 General info put in the PLIC Token

The following are the general info to put in the PLIC Token:
- Info about local node N(m) (i.e. hostname of N(m))
- Locally performed measurements (i.e. number of packets, CRC, etc.) 
calculated by process 4
- If, and only if, process 3 succeeded (PLIC Token Tk(m-1) has been 
got): 
- Data contained in Tk(m-1) (i.e. list of hostnames of the PLIC-AWARE 
nodes along the upstream reverse path from the local node N(m) up to 
the source)
- Results of calculation performed merging the data coming from 
Tk(m-1) and the locally performed measurements calculated by process 4 
(i.e. Packet Loss calculated as difference between number of packets 
measured in N(m-1), and thus contained in Tk(m-1), and number of 
packets measured by the locally performed measurements during 
process 4. 

As detailed in the above PLIC-Neighborship building process, please 
remember that PLIC Token Tk(m-1) has not been got by N(m) if one of 
the following cases occurs:
- case 1: no PLIC-Aware node has been found during the 
PLIC-Neighborship building process 
- case 2: the PLIC-AWARE PREDECESSOR node (called N(m-1)) has not been 
able to generate PLIC Token Tk(m-1), or N(m) has not been able to get 
it, within the timeout.
In this situation no info about the upstream path can be put in PLIC 
Token Tk(m).


Ionta, et al.           Expires May 16, 2016                   [Page 7]

Internet-Draft Performance Layer Inboard Computing method October 2015


3.5.2 PLIC Token detailed structure

The fields of the PLIC Token are:
1.Hostname ID of N(m)
2.Interface ID where the PLIC multicast flow comes in
3.Time-Stamp referred to the successfully creation of the PLIC Token 
by N(m) during the current timeslot.
4.Flow ID 1: Main PLIC multicast flow identifier. i.e. Multicast Group 
under monitoring. It can belong to a customer or to a monitoring 
probe.  There is a different PLIC multicast flow for each Multicast 
group under monitoring by PLIC.
5.(optional) Flow ID 2: Detailed multicast flow identifier. In case 
the Flow ID 1 is not enough to identify a specific multicast flow 
(i.e. different PLIC flows using the same Multicast group need 
different DSCP to be differentiated).
6.Number of Performance parameters, measured by Node N(m), contained 
in the following part of the current PLIC Token Tk(m) (TLV approach)
7.Performance parameter 1 (i.e. PLR): node-level performance measure 
(i.e. between the current node and its predecessor)
8.Performance parameter 1 (i.e. PLR): cumulative performance measure 
(i.e. between the current node and the source)
9.Performance parameter 1 (i.e. PLR): additive info containing the 
node-level measurements of the basic elements composing the 
Performance parameter measurements (i.e. number of incoming packets 
measured on the current Interface). These values will be useful for 
the successor node to calculate its own node-level performance 
measure.
10.(optional) analogue of 7 for Performance parameter 2 (i.e. Delay)
11.(optional) analogue of 8 for Performance parameter 2
12.(optional) analogue of 9 for Performance parameter 2
13.(optional) analogue of 7 for Performance parameter "n" (i.e. CRC)
14.(optional) analogue of 8 for Performance parameter "n" 
15.(optional) analogue of 9 for Performance parameter "n"
16.(if and only if process 3 succeeded) Numbers of records, each 
containing the sets of Performance parameters (analogue of the set 
created by N(m) and put in the above fields 7-15) measured by the 
upstream PLIC-aware nodes and all contained inside the PLIC Token 
Tk(m-1) got by N(m) during process 3
17.- n.: analogue of fields 6-16 for each record referred to each 
of the upstream PLIC-aware node.


Ionta, et al.           Expires May 16, 2016                   [Page 8]

Internet-Draft Performance Layer Inboard Computing method October 2015


3.6 Processes sequencing during each single timeslot

Following there is a time diagram detailing one single timeslot

Process | Process description |            Timeslot "N"               |
  ID    |                     |                      |  PLIC Token    |
                                                     |propagation time|
1       |   Discovery         | XXX | 
2       |PLIC-Neighborsh build|     | XXX |  
3       |   Get_PLIC_Token    |           | XXX | 
4       | Local Performance   | XXX | XXX | XXX |
        | parameters calcul.  |  
5       |PLIC Token building  |                 | XXX|

Process 2 can start only after the ending of process 1 because it needs
to know, as input, the set of all the upstream nodes along the reverse
path from the actual node up to the source node.
Process 3 can start only after the ending of process 2 because the PLIC
Token must be taken only from the upstream PLIC-AWARE node.
Process 4 has not dependencies from process 1,2 and 3 (thus it can 
start asynchronously respect to them) because the local performance 
parameters calculation, depending only on performance parameters 
sampled on the local node, can be performed without necessity of the 
PLIC Token from the upstream PLIC-AWARE node.
Process 5 can start only after the ending of both processes 3 and 4 
because depends on the output of both of them. 
It must finish before the PLIC Token propagation time interval so to 
give enough time for the PLIC Token propagation downstream, that is 
during the PLIC Token propagation time interval in Timeslot N its 
successor node (N(m+1)) must get Tk(m,n) and create its own Tk(m+1,n)
(following the same processes sequence as for N(m)) to be taken, once 
again during the same PLIC Token propagation time interval in 
Timeslot N, by its successor node (N(m+2))  and creates its own 
Tk(m+2,n),and so on.
Following is shown a zoom on the PLIC Token propagation time interval 
during generic Timeslot "N":

Node m+1 |Get Tk(m,n)| 
         |           |Create Tk(m+1,n)|
Node m+2 |           |                |Get Tk(m+1,n)|
         |           |                |             |Create Tk(m+2,n)|


4. Robustness of the method

Based on the above statements, the proposed method is robust 
regarding:
- any loss of measurement samples: even if the PLIC Token is not 
generated by the upstream node N(m-1), node N(m) generates anyway 
its own PLIC Token available for downstream nodes.
- propagation drift of the PLIC Token during a single timeslot.
- network synchronization problems. 

Ionta, et al.           Expires May 16, 2016                   [Page 9]

Internet-Draft Performance Layer Inboard Computing method October 2015



5. Implementation and deployment (use case)

The methodology has been implemented and delivered in Telecom Italia 
by leveraging functions and tools available on IP routers and it's 
currently being used to support monitoring of packet loss, CRC and 
Bandwith in some portions of Telecom Italia's network. The timeslot 
length is 5 minutes.
In particular the packet loss measurement is performed based on the 
method described in [I-D.draft-tempia-ippm-p3m-01.txt] and the 
results are put inside the PLIC Token structure and delivered to the 
following nodes.


6. Security Considerations
Implementation of this method must be mindful of security
and privacy concerns.
There are two types of security concerns: potential harm caused by
the measurements and potential harm to the measurements. For what
concerns the first point, the measurements described in this document
are passive, so there are no packets injected into the network
causing potential harm to the network itself and to data traffic.
Nevertheless, the method implies modifications on the fly to the IP
header of data packets: this must be performed in a way that doesn't
alter the quality of service experienced by packets subject to
measurements and that preserve stability and performance of routers
doing the measurements. The measurements themselves could be harmed
by routers altering the coloring of the packets, or by an attacker
injecting artificial traffic. Authentication techniques, such as
digital signatures, may be used where appropriate to guard against
injected traffic attacks.
The privacy concerns of network measurement are limited because the
method only relies on information contained in the IP header without
any release of user data.


7. IANA Considerations

There are no IANA actions required.


8. Acknowledgments

The authors would like to thank their workmates F. Moretti, A. Soldati 
and A. Barbetti for their helpful collaboration.
Thanks also to F. Benedetti, G. Giammona, G. Mazzola, S. Salamone 
and L. Tomasino for their support while implementing and deploying this 
method, according to the rules stated in the job agreement with 
Telecom Italia.   
A special thank to Daniela for her help while inspiring and 
translating this memo.

Ionta, et al.           Expires May 16, 2016                  [Page 10]

Internet-Draft Performance Layer Inboard Computing method October 2015



9. References

9.1 Normative References

[RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
RFC 5357, October 2008.

[RFC5938] Morton, A. and M. Chiba, "Individual Session Control
Feature for the Two-Way Active Measurement Protocol
(TWAMP)", RFC 5938, August 2010.

9.2 Informative References

[I-D.tempia-opsawg-p3m] Capello, A., Cociglio, M., Castaldelli, L., 
and A. Bonda,"A packet based method for passive performance
monitoring", draft-tempia-opsawg-p3m-04 (work in
progress), February 2014.

[draft-tempia-ippm-p3m-01.txt] Capello, A., Cociglio, M., Castaldelli,
L., Fioccola G., and A. Bonda, "A packet based method for passive 
performance monitoring", September 2015



Authors' Address

Tiziano Ionta (editor)
Telecom Italia Labs
Via Valcannuta 250
00167 Rome
Italy
Phone: +39 06 3688 5600
Email: tiziano.ionta@telecomitalia.it

Antonio Cappadona (editor)
Telecom Italia Labs
Via Valcannuta 250
00167 Rome
Italy
Phone: +39 06 3688 7181
Email: antonio.cappadona@telecomitalia.it

Ionta, et al.           Expires May 16, 2016                  [Page 11]

Internet-Draft Performance Layer Inboard Computing method October 2015