Internet DRAFT - draft-jiang-nmlrg-network-machine-learning

draft-jiang-nmlrg-network-machine-learning







Network Machine Learning Research Group                         S. Jiang
Internet-Draft                              Huawei Technologies Co., Ltd
Intended status: Informational                          October 28, 2016
Expires: May 1, 2017


                        Network Machine Learning
             draft-jiang-nmlrg-network-machine-learning-02

Abstract

   This document introduces background information of machine learning
   briefly, then explores the potential of machine learning techniques
   for networks.  This document is serving as a white paper of the
   (proposed) IRTF Network Machine Learning Research Group.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 1, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.




Jiang                      Expires May 1, 2017                  [Page 1]

Internet-Draft          Network Machine Learning            October 2016


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  Brief Background of Machine Learning  . . . . . . . . . . . .   3
     3.1.  Machine Learning Categories . . . . . . . . . . . . . . .   3
     3.2.  Machine Learning Approaches . . . . . . . . . . . . . . .   3
     3.3.  Successful Applications . . . . . . . . . . . . . . . . .   5
     3.4.  Precondition of Applying Machine Learning Approach  . . .   5
     3.5.  Limitation of Machine Learning Mechanism  . . . . . . . .   5
   4.  Network Machine Learning Research Group in IRTF . . . . . . .   6
   5.  Use Cases Study of Applying Machine Learning in Network . . .   7
     5.1.  Network Traffic . . . . . . . . . . . . . . . . . . . . .   7
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   8
   8.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   8
   9.  Change log [RFC Editor: Please remove]  . . . . . . . . . . .   8
   10. Informative References  . . . . . . . . . . . . . . . . . . .   8
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Introduction

   Machine learning techniques help to make predictions or decisions by
   learning from historical data.  As machine learning mechanism could
   dynamically adapt to varying situations and enhance their own
   intelligence by learning from new data, they are more flexible in
   handling complicated tasks than strictly static program instructions.
   Therefore, machine learning techniques have been widely applied in
   image analysis, pattern recognition, language recognition,
   conversation simulation, and etc.

   With deep exploration, machine learning techniques would cast light
   on studies of autonomic networking, in that they could be well
   adapted to learn the various environments of networks and react to
   dynamic situations.

   The proposed Network Machine Learning Research Group (NMLRG) was
   formed within IRTF (Internet Research Task Force), October, 2015.  As
   a procedure, currently, IRTF requests an one-year provisional period.
   After this period, the proposed research group may become a formal
   research group if there is a steady research community.  The NMLRG
   provides a forum for researchers to explore the potential of machine
   learning techniques for networks.

   This document firstly provides background information of machine
   learning briefly, then explores the potential of machine learning
   techniques for networks functions, such as network control, network
   management, and supplying network data for upper-layer applications.



Jiang                      Expires May 1, 2017                  [Page 2]

Internet-Draft          Network Machine Learning            October 2016


   Author notice: this document is in the primary stage.  It is an
   ongoing document for the proposed Network Machine Learning Research
   Group.  For now, it is not clear whether it would be published or
   not.

2.  Terminology

   The terminology defined in this document.

   Machine Learning  A computational mechanism that analyzes and learns
      from data input, either historic data or real-time feedback data,
      following designed model/pattern.  It can be used to make
      predictions or decision, rather than following strictly static
      program instructions.

3.  Brief Background of Machine Learning

3.1.  Machine Learning Categories

   Machine learning mechanisms are typically classified into three broad
   categories, depending on the nature of the learning "signal" or
   "feedback" available:

   Supervised learning  The machine learning mechanism is given labeled
      inputs and the correspondent desired outputs.  The mechanism could
      learn a general rule that maps inputs to outputs by itself.

   Unsupervised learning  The given input are not labeled.  It leaves
      the machine learning mechanism itself to find structure in its
      input and output.

   Reinforcement learning  The machine learning mechanism interacts with
      dynamic environments in which it performs a certain task and
      receives feedback from its action.

   Between supervised and unsupervised learning, there is semi-
   supervised learning, in which input data are partially labeled.

3.2.  Machine Learning Approaches

   There are a few basic machine learning approaches.  They can be mixed
   together to complete complicated tasks.

   Classification  With the training data that has been labeled into a
      number of classes, the machine learning mechanism could assign new
      unlabeled data into one or more these classes.  An example is SPAM
      filtering, in which emails are classified into "spam" or "not
      spam" classes.



Jiang                      Expires May 1, 2017                  [Page 3]

Internet-Draft          Network Machine Learning            October 2016


   Clustering  Without labeled training data, the machine learning
      mechanism divides data into groups.  It is the learning mechanism
      itself to decide the number or structure of output classes.

   Regression  It estimates the relationships among variables.  The
      outputs are continuous.

   Anomaly detection  It detects specific data which do not conform to
      an expected pattern or other data in a data set.

   Density estimation  The machine learning mechanism needs to identify
      the distribution of input data.

   Dimensionality reduction  The machine learning mechanism could
      simplify inputs by mapping them into a lower-dimensional space.

   Decision tree learning  The learning output is structured into a
      decision tree as a predictive model.

   Association rule learning  The learning delivers potential relations
      between variables.

   Artificial neural networks  also called "neural network".  It is
      inspired by the structure and functions of biological neural
      networks.  It is structured by a number of interconnected
      computational "neurons", each of which has independent deciding
      ability.  The connections have numeric weights that can be tuned
      according to feedback and trends, making neural nets adaptive to
      inputs and capable of learning.

   Reinforcement learning  It is inspired by behaviorist psychology.
      The mechanism take actions in an environment so as to maximize
      cumulative reward.

   Similarity and metric learning  It learns from training data a
      similarity function that measures how similar or related two
      objects are.

   Representation learning  Also called feature learning.  It learns a
      feature - a transformation of raw data input to a representation
      that can be effectively exploited in machine learning tasks.

   This is not a full enumerated list of machine learning approaches.
   Other approaches may include support vector machines, bayesian
   networks, inductive logic programming, sparse dictionary learning,
   genetic algorithms, and etc.





Jiang                      Expires May 1, 2017                  [Page 4]

Internet-Draft          Network Machine Learning            October 2016


   Editor notes: the basic algorithms that machine learning approaches
   use may be listed as a future work.  It may be too detailed and too
   many to be included.

3.3.  Successful Applications

   Machine learning approaches have been successfully applied in many
   areas, such as human behavior analysis, image analysis, nature
   language recognition (including speech and handwriting processing),
   conversation simulation, medical diagnosis, structural health
   monitoring, stock market analysis, biological analysis and
   classifying, loan and insurance evaluation, game playing, and many
   other applications.

   As for network applications, such as search engines, SPAM filtering,
   adaptive website, Internet fraud detection, online advertising, etc.,
   have all been greatly benefited from the machine learning mechanism.
   However, most of those successful stories are in the application
   layer of network perspective.

3.4.  Precondition of Applying Machine Learning Approach

   Although it is different from big data or data mining, machine
   learning does also need data.  However, machine learning can be
   applied with small set of data or dynamic feedback from environment.
   The quality of data decides the efficient and accuracy of machine
   learning.

   There is no generic machine learning mechanism that could suitable
   for all or most of use cases.  For each use case, the developers need
   to design a specific analysis path, which may combine multiple
   approaches or algorithms together.  The feature design and analysis
   path design are the key factor in the machine learning applications.

   To achieve autonomic decision or minimize the human intervention,
   there should be evaluation system for the results of machine learning
   mechanism.  The evaluation system could be the measurement that the
   results of machine learning mechanism are executed.  The evaluation
   system and machine learning mechanism could compose a close decision
   loop for autonomic decision.

3.5.  Limitation of Machine Learning Mechanism

   So far, the machine learning mechanism does not perform very well for
   accurate result.  In most successful cases, it is used as an
   assistant analysis tool.  Its results are usually accepted in fault-
   tolerant environment or with further human confirmation.




Jiang                      Expires May 1, 2017                  [Page 5]

Internet-Draft          Network Machine Learning            October 2016


4.  Network Machine Learning Research Group in IRTF

   The Network Machine Learning Research Group (NMLRG), which was formed
   as a proposed research group of IRTF, October, 2015 (as a procedure,
   a proposed research group may become a formal research group after
   one year provisional period), provides a forum for researchers to
   explore the potential of machine learning techniques for networks.
   In particular, the NMLRG will work on potential approaches that apply
   machine learning techniques in network control, network management,
   and supplying network data for upper-layer applications.

   The initial focus of the NMLRG will be on higher-layer concepts where
   the machine learning mechanism could be applied in order to enhance
   the network establishing, controlling, managing, network applications
   and customer services.  This includes mechanisms to acquire knowledge
   from the existing networks so that new networks can be established
   with minimum efforts; the potential to use machine learning
   mechanisms for routing control and optimization; using machine
   learning mechanisms in network management to predict future network
   status; using machine learning mechanisms to autonomic and dynamical
   network management; using machine learning mechanisms to analyze
   network faults and support recovery; learning network attacks and
   their behaviors, so that protection mechanisms could be self-
   adapted; unifying the data structure and the communication interface
   between network/network devices and customers, so that the upper-
   layer applications could easily obtain relevant network information,
   etc.  The NMLRG is expected to identify and document requirements, to
   survey possible approaches, to provide specifications for proposed
   solutions, and to prove concepts with prototype implementations that
   can be tested in real-world environments.

   The more knowledge we have, the more intelligent we are.  It is the
   same for networks and network management.  Up to now, the only
   available network knowledge is usually the current network status
   inside a given device or relevant current status from other devices.
   However, historic knowledge is very helpful to make correct
   decisions, in particular to reduce network oscillation or to manage
   network resources over time.  Transplantable knowledge from other
   networks can be helpful to initially set up a new network or new
   network devices.  Knowledge of relationships between network events
   and network configuration may help a network to decide the best
   parameters according to real performance feedback.  In addition to
   such historic knowledge, powerful data analytics of current network
   conditions may also be a valuable source of knowledge that can be
   exploited directly.  The machine learning mechanism is the
   correspondent mechanism to learn and apply knowledge intelligently.





Jiang                      Expires May 1, 2017                  [Page 6]

Internet-Draft          Network Machine Learning            October 2016


5.  Use Cases Study of Applying Machine Learning in Network

   In 2016, the NMLRG is focusing on collecting and studying of use
   cases that applies machine learning mechanisms into network area.
   More use cases are still in the collecting process.

5.1.  Network Traffic

   Network traffic is one of the most important objectives that needs to
   be managed in network/Internet area.

   Network traffic meets preconditions of applying Machine Learning
   mechanisms.  It is full of data: the network traffic itself is data
   source, also there are many properties of network traffic are
   measurable, such as latency, number of packets, last period, etc.
   The network traffics are complicated.  Its characteristics are often
   beyond the awareness of human operators.  Machine Learning would
   greatly help to discover knowledge regarding to network traffics.
   The network traffics are always dynamic changing.  There is both
   regularity and irregularity.  Quick response to real-time network
   traffic is a big challenge to network management.  It is beyond the
   ability of human operator.  The rigid management has already become a
   bottleneck of current networks.  Machine Learning could form a quick
   and adaptive auto response managing system.

   There are many different types of network traffic.  In April 2016,
   NMLRG #2 IETF 95 meeting was organized with the theme of network
   traffic.  There are multiple use cases presented: HTTPS traffic
   classification, machine learning in the router - learn from and act
   on network traffics, applications of machine learning to flow-based
   monitoring, malicious domains: automatic detection with DNS traffic
   analysis, machine-learning based policy derivation and evaluation in
   broadband networks, predicting interface failures for better traffic
   management

   NMLRG is currently working on a dedicated document for this theme.
   It is potential this document becomes RG document and is published as
   a RFC in the future.

6.  Security Considerations

   This document is focused on applying machine learning in network,
   including of course applying machine learning in network security, on
   higher-layer concepts.  Therefore, it does not itself create any new
   security issues.






Jiang                      Expires May 1, 2017                  [Page 7]

Internet-Draft          Network Machine Learning            October 2016


7.  IANA Considerations

   This memo includes no request to IANA.

8.  Acknowledgements

   The author would like to acknowledge the valuable comments made by
   participants in the IRTF Network Machine Learning Research Group,
   particular thanks to Lars Eggert, Brian Carpenter, Albert Cabellos,
   Shufan Ji, Panagiotis Demestichas, Jerome Francois, Susan Hares,
   Rudra Saha, Dacheng Zhang and Bing Liu.

   This document was produced using the xml2rfc tool [RFC2629].

9.  Change log [RFC Editor: Please remove]

   draft-jiang-nmlrg-network-machine-learning-01: adding brief
   description of network traffic and ML into use case study, 2016-4-23.

   draft-jiang-nmlrg-network-machine-learning-00: original version,
   2015-10-19.

10.  Informative References

   [RFC2629]  Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
              DOI 10.17487/RFC2629, June 1999,
              <http://www.rfc-editor.org/info/rfc2629>.

Author's Address

   Sheng Jiang
   Huawei Technologies Co., Ltd
   Q14, Huawei Campus, No.156 Beiqing Road
   Hai-Dian District, Beijing, 100095
   P.R. China

   Email: jiangsheng@huawei.com














Jiang                      Expires May 1, 2017                  [Page 8]