ForCES Working Group                                       J. Hadi Salim
Internet-Draft                                             Znyx Networks
Expires: April 25, 2004                                          R. Haas
                                                            IBM Research
                                                                S. Blake
                                                                Ericsson
                                                        October 26, 2003


                      Netlink2 as ForCES Protocol
                  draft-jhsrha-forces-netlink2-02.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 25, 2004.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document describes Netlink2, which is an extension of Linux
   Netlink [RFC3549].  This document is intended as a proposal for the
   ForCES IETF working group protocol.

   ForCES attempts to define a clear separation between the control and
   forwarding entities of the network element (NE) so that they can
   evolve separately, as opposed to the current monolithic evolution.

Conventions used in this document


Hadi Salim, et al.       Expires April 25, 2004                 [Page 1]

Internet-Draft        Netlink2 as ForCES Protocol           October 2003


   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1.    Introduction . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.    Definitions  . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.    Netlink2 Overview  . . . . . . . . . . . . . . . . . . . . .  6
   4.    Summary of Netlink2 Modifications to Netlink . . . . . . . .  7
   4.1   Header Modifications . . . . . . . . . . . . . . . . . . . .  7
   4.2   Addressing and Transport Extensions  . . . . . . . . . . . .  8
   5.    Netlink2 Message Format  . . . . . . . . . . . . . . . . . .  9
   5.1   Netlink2 Message Header  . . . . . . . . . . . . . . . . . .  9
   5.2   Type Length Value  . . . . . . . . . . . . . . . . . . . . . 13
   5.3   Encapsulated TLVs  . . . . . . . . . . . . . . . . . . . . . 14
   5.4   Netlink2-extension TLVs  . . . . . . . . . . . . . . . . . . 14
   6.    Addressing and Transport Extensions  . . . . . . . . . . . . 16
   6.1   Transport Methods  . . . . . . . . . . . . . . . . . . . . . 16
   6.1.1 Why Multicast? . . . . . . . . . . . . . . . . . . . . . . . 16
   6.1.2 Why IP?  . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   6.1.3 Why UDP/TCP/SCTP/DCCP? . . . . . . . . . . . . . . . . . . . 17
   6.2   The Netlink2 wire and bundle . . . . . . . . . . . . . . . . 17
   6.2.1 What wires go in a bundle? . . . . . . . . . . . . . . . . . 18
   6.3   Redefining the Netlink PID Semantics . . . . . . . . . . . . 20
   6.4   Local Scope Addressing and Encapsulation . . . . . . . . . . 21
   6.5   Global Scope Addressing and Encapsulation  . . . . . . . . . 21
   7.    Protocol Architecture  . . . . . . . . . . . . . . . . . . . 23
   7.1   Protocol Phases  . . . . . . . . . . . . . . . . . . . . . . 23
   7.1.1 The Pre-Association Phase  . . . . . . . . . . . . . . . . . 23
   7.1.2 The Association Phase  . . . . . . . . . . . . . . . . . . . 23
   7.1.3 Service Termination  . . . . . . . . . . . . . . . . . . . . 24
   7.2   Protocol Logical Model . . . . . . . . . . . . . . . . . . . 24
   7.3   Service Addressing . . . . . . . . . . . . . . . . . . . . . 25
   7.4   Service Templates  . . . . . . . . . . . . . . . . . . . . . 26
   7.5   Mechanisms for Creating Protocols  . . . . . . . . . . . . . 26
   7.5.1 Building Reliable Protocols  . . . . . . . . . . . . . . . . 26
   7.5.2 Building Availability  . . . . . . . . . . . . . . . . . . . 27
   7.5.3 The ACK Netlink2 Message . . . . . . . . . . . . . . . . . . 27
   7.5.4 Batching . . . . . . . . . . . . . . . . . . . . . . . . . . 28
   7.5.5 Atomicity and Ordering of Transactions . . . . . . . . . . . 29
   8.    Putting together the base protocol for WG charter  . . . . . 30
   8.1   Netlink2-Extension TLVs  . . . . . . . . . . . . . . . . . . 30
   8.1.1 Authentication . . . . . . . . . . . . . . . . . . . . . . . 30
   8.1.2 Checksum . . . . . . . . . . . . . . . . . . . . . . . . . . 30
   8.1.3 Message Priority . . . . . . . . . . . . . . . . . . . . . . 30
   8.1.4 SYN COOKIE . . . . . . . . . . . . . . . . . . . . . . . . . 31
   8.1.5 Name ID  . . . . . . . . . . . . . . . . . . . . . . . . . . 31
   8.2   LFB and FE Attributes and discovery  . . . . . . . . . . . . 31
   8.3   NE creation  . . . . . . . . . . . . . . . . . . . . . . . . 31
   8.3.1 FE State transitions . . . . . . . . . . . . . . . . . . . . 32
   8.3.2 CE view of FE State transitions  . . . . . . . . . . . . . . 34
   8.3.3 SYN Message Format . . . . . . . . . . . . . . . . . . . . . 37
   8.3.4 FIN Message Format . . . . . . . . . . . . . . . . . . . . . 37
   8.3.5 NOOP Message Format  . . . . . . . . . . . . . . . . . . . . 37
   8.4   LFB and FE Service Templates . . . . . . . . . . . . . . . . 37
   8.4.1 Physical Port and Address Functions  . . . . . . . . . . . . 38
   8.4.2 IPv4 and IPv6 L3 Forwarding Functions  . . . . . . . . . . . 41
   8.4.3 Filtering Functions  . . . . . . . . . . . . . . . . . . . . 45
   8.4.4 QoS Functions  . . . . . . . . . . . . . . . . . . . . . . . 45
   8.4.5 IPSEC Functions  . . . . . . . . . . . . . . . . . . . . . . 45
   8.4.6 Packet redirection Functions . . . . . . . . . . . . . . . . 45
   8.4.7 Packet Mirroring Functions . . . . . . . . . . . . . . . . . 45
   8.4.8 Packet Sampling Functions  . . . . . . . . . . . . . . . . . 45
   8.5   Security Considerations  . . . . . . . . . . . . . . . . . . 45
   8.5.1 Denial of Service (DoS) attacks  . . . . . . . . . . . . . . 46
   8.5.2 Authentication and Encryption  . . . . . . . . . . . . . . . 46
         References . . . . . . . . . . . . . . . . . . . . . . . . . 47
         Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 49
   A.    Sample Service Hierarchy . . . . . . . . . . . . . . . . . . 50
   B.    Sample Protocol for the foo IP Service . . . . . . . . . . . 52
   B.1   Interacting with Other IP Services . . . . . . . . . . . . . 52
   C.    Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 53
         Intellectual Property and Copyright Statements . . . . . . . 54


1. Introduction

   The concept of IP control and forwarding separation was first
   introduced in the early 1980s by the BSD 4.4 routing sockets
   [Stevens]. The focus at that time was to provide a simple IP(v4)
   forwarding service and allow the control plane, either via a command
   line configuration tool or a dynamic route daemon, to control
   forwarding tables for that IPv4 forwarding service.

   The IP world has evolved considerably since then.  Linux Netlink
   [RFC3549], when observed from a service provisioning and management
   point of view, takes routing sockets one step further by breaking the
   narrow focus on IPv4 forwarding.  Since the Linux 2.1 kernel, Netlink
   has been providing the IP service abstraction for a few additional
   services other than classical RFC 1812 IPv4 forwarding.

   Netlink was designed with a goal of solving the forwarding and
   control separation.  This means that many of the main issues have
   been thought through and resolved over the years.  In other words
   Netlink is proven as a protocol addressing separation of forwarding
   and control. Netlink is also network-ready because it uses packet
   formatting techniques and concepts (e.g., multicast addressing). This,
   and the availability of publicly running and tested code which is
   widely deployed, form a major motivator to base Netlink2 on Netlink.

   Netlink2 extends Linux Netlink to meet the requirements of the ForCES
   working group charter for a protocol.  Netlink is extended to have a
   distributed addressing and transport scheme, and missing mechanisms
   are added to make Netlink2 meet the ForCES protocol requirements
   [ForCES_REQ].

   Netlink2 operates in a mode where knowledge of the NE, its topology,
   and LFB modeling MAY have already been discovered, or is discovered
   within the Netlink2 protocol.  Netlink2 can operate over a variety of
   link, network, and transport media. The transports and media include,
   but are not limited to:

   o  L2 media such as Ethernet, ATM, and Frame Relay (FR)

   o  bus and I/O interfaces such as PCI, HT, and PCI-Express

   o  L3 protocols such as IPv4, IPv6, and IPX

   o  L4 and above, such as TCP, UDP, SCTP, and DCCP

   In the cases where required mechanisms are missing from the
   underlying media, they are compensated for by Netlink2 extensions
   (refer to Section 8.1).


2. Definitions

   We use the definitions provided in [ForCES_REQ], as well as the
   following:

   Logical Functional Block (LFB): same as the Forwarding Engine
   Component defined in [RFC3549].  This is a forwarding datapath
   component in the FE, driven by the ForCES protocol in order to
   achieve a certain service.

   Control Element Component (CPC): same as the Control Plane Component
   defined in [RFC3549].  This is a component in the CE that drives
   LFB(s) in order to achieve a certain service.


3. Netlink2 Overview

   A datapath packet processing service accomplished by an FE is
   represented as a logical functional block (LFB) in the FE. CE
   components (CPC) in the CE interact with LFBs over Netlink2 wires and
   bundles (described in Section 6.2) to configure and manage a certain
   service. The interactions between LFBs and CPCs are specific to each
   service and are defined using templates as presented in [RFC3549].

   The Netlink2 message is used to communicate between the FE and CPC
   for configuration of LFBs, LFB events to the CPCs, and statistics or
   config querying/gathering (typically by a CPC).  Other activities
   include transfer of control packets between FE and CPC.

   Netlink2 messages travel between the CPC and LFB over Netlink2 wires
   which are part of Netlink2 bundles. Netlink2 wires are abstractions
   similar to GSMP links [RFC3292], albeit without the limitation to ATM
   VP:VC, Ethernet link, or TCP connection only.

   For instance, the IPv4 Forwarding service (called NETLINK_ROUTE)
   defines a message template for handling IP routes and the message
   types to insert, remove, or query a route.  The routing CPC(s) and
   the IPv4 Forwarding LFB(s) interact using these message templates and
   message types over the Netlink2 bundle to execute the IPv4 Forwarding
   service.

   The message types in Netlink2 messages allow the FE to demultiplex
   messages to the appropriate LFB.
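
   As a rough illustration of this demultiplexing, the sketch below maps
   message types to LFB handlers inside an FE. The Fe class, the handler
   function, and its return value are hypothetical. The RTM_* numeric
   values are borrowed from Linux rtnetlink on the assumption that
   Netlink2 keeps them.

```python
# Message type values as defined by Linux rtnetlink (assumed unchanged
# in Netlink2).
RTM_NEWROUTE = 24
RTM_DELROUTE = 25
RTM_GETROUTE = 26


class Fe:
    """Toy FE that hands each message to the LFB registered for its type."""

    def __init__(self):
        self._lfb_by_type = {}

    def register(self, msg_types, lfb):
        # One LFB may serve several message types of the same service.
        for msg_type in msg_types:
            self._lfb_by_type[msg_type] = lfb

    def dispatch(self, msg_type, payload):
        lfb = self._lfb_by_type.get(msg_type)
        if lfb is None:
            raise LookupError(f"no LFB registered for type {msg_type}")
        return lfb(msg_type, payload)


def ipv4_forwarding_lfb(msg_type, payload):
    # Hypothetical stand-in for the IPv4 Forwarding LFB; it only reports
    # what it received.
    return ("ipv4-forwarding", msg_type, payload)


fe = Fe()
fe.register((RTM_NEWROUTE, RTM_DELROUTE, RTM_GETROUTE), ipv4_forwarding_lfb)
```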

   Messages of a certain service destined to an LFB can travel on
   different Netlink2 wires within the same bundle.

   Netlink2 by itself constitutes a base ForCES protocol with a set of
   mechanisms that can be utilized depending on service requirements.
   For example, for certain messages between the FE and CE, reliability
   can be enforced at the transaction level by setting the appropriate
   flags in the Netlink2 message. However, by default, Netlink2
   transactions are not acknowledged.


4. Summary of Netlink2 Modifications to Netlink

   To conform to the ForCES requirements [ForCES_REQ], the Netlink
   protocol [RFC3549] is extended in the following respects:

   1.  Base header modifications, and feature expandability extensions
       by means of optional header TLVs to accommodate current generic
       ForCES requirements and to make it possible to add more in the
       future.  This facilitates adding such features as authentication,
       checksumming, etc., when required.

   2.  IP and Transport encapsulations to carry Netlink messages.

   With these complementary changes to the existing Netlink
   functionality, Netlink2 fulfills the requirements to become the
   ForCES protocol.

4.1 Header Modifications

   1.  PID field redefinition and addition.

       In Netlink, PID 0 referred to the equivalent of the FE (kernel).
       The equivalent of the CE (user process) was referred to by its OS
       process ID.

       In Netlink2, the PID has additional semantics which give it group
       identity, unicast capability, etc. (discussed later in Section
       6.3).

       A PID of the unicastPID type is assigned to each FE and CE in the
       pre-association phase.  In this way the CE uniquely identifies
       the FE and avoids any collision.  We maintain the name PID for
       historical purposes.

       *  Destination PID: the PID field is redefined as the Destination
          PID field.  This field identifies the parties on the wire that
          must process the message.

       *  Source PID: this field is introduced in the header to identify
          the source of the message.

       Different types of PIDs are discussed in Section 6.3.

   2.  The Length field has been reduced to 16 bits, with length 0 being
       reserved.  The rest of the old 32-bit Length field is now split
       between a new version field and a new extended flags field.

   3.  A Version field is introduced in the Netlink2 header. This 8-bit
       field is 4 bits major number and 4 bits minor number in the form
       of major:minor.  For Netlink2, this becomes: 0x20.

   4.  A new Extended Flags field is introduced to take over the
       remaining 8 of the 16 bits freed from the original 32-bit
       Length field in Netlink. Setting individual bits enables
       additional features, such as signaling the presence of
       extended TLVs.

   5.  Netlink2-extension TLVs follow directly after the Netlink2 base
       header. They are optional and their purpose is to extend the
       Netlink2 header. A typical use of Netlink2-specific TLVs is to
       compensate for capabilities lacking in an underlying transport.
       For example, in an IP network not deployed with IPSEC, the
       Netlink2-specific authentication TLV could be used to emulate the
       features provided by IPSEC-AH.

   6.  There can be more than one IP service configuration template
       within a Netlink2 message (as opposed to a single service
       template per Netlink message). Implementation experience (see
       Section 6.3) has shown that embedding multiple service templates
       improves the performance of FE configuration.
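
   The split of the old 32-bit Length word described in items 2 to 4 can
   be sketched as follows. This is a minimal sketch that assumes network
   byte order on the wire, which this section does not specify; the
   function names are illustrative only.

```python
import struct


def encode_version(major, minor):
    """Pack major:minor into the 8-bit Version field (4 bits each)."""
    if not (0 <= major <= 0xF and 0 <= minor <= 0xF):
        raise ValueError("major and minor must each fit in 4 bits")
    return (major << 4) | minor


def pack_first_word(version, flags_e, length):
    """Build the first 32-bit header word: Version | Flags_E | Length."""
    if not 1 <= length <= 0xFFFF:
        raise ValueError("Length must fit in 16 bits; 0 is reserved")
    # Assumption: fields are sent in network byte order ("!").
    return struct.pack("!BBH", version, flags_e, length)


def unpack_first_word(data):
    """Return (major, minor, flags_e, length) from the first header word."""
    version, flags_e, length = struct.unpack_from("!BBH", data)
    return version >> 4, version & 0xF, flags_e, length


# Netlink2 is version 2:0, i.e. 0x20; 20 is the length of a bare header.
word = pack_first_word(encode_version(2, 0), 0, 20)
```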

   Other than these changes, all mechanisms provided by Netlink are
   sufficient to meet the requirements for ForCES.  The reader is
   encouraged to refer to [RFC3549] as a companion to this document.

4.2 Addressing and Transport Extensions

   1.  Support for UDP/TCP/SCTP/DCCP transport over unicast/multicast IP
       (Section 6.1).

   2.  Support for bundles (Section 6.2).

   3.  Message recipient scoping using the Destination PID (Section
       6.3).

   4.  Support for both local scope and global scope addressing (Section
       6.4 and Section 6.5).


5. Netlink2 Message Format

   There are three levels to a Netlink2 message: the general Netlink2
   message header, which is mandatory, and the Netlink2-extension TLV
   and Service template(s), which are optional.


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                   Netlink2 message header                     |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                   Netlink2-extension TLV (optional)           |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                   Service Template(s)  (optional)             |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   Implementation studies [Goutaudier] have shown the above data layout
   to provide easier parsing while allowing for extensibility (via the
   optional Netlink2-extension TLV) and scalability (allowing for
   multiple Service templates).

   The Netlink2 message header is generic for all services and contains
   the command that describes the rest of the message.

   The optional Netlink2-extension TLV supplies general functionality
   that is missing from the Netlink2 message header. Typically, it is
   used to compensate for functionality missing from the underlying
   transport.

   The Service template is specific to a service. As mentioned earlier,
   there can be more than one template per Netlink2 message. Each
   Service template carries configuration parameters or query requests
   (CPC->LFB direction) or query responses (LFB->CPC direction). When a
   message carries multiple Service templates, all of them MUST be used
   to execute the same command, as defined in the Netlink2 message
   header. In some special cases the Service template is not used, for
   example in a Netlink2 SYN, FIN, or NOOP command.

5.1 Netlink2 Message Header


   Each Netlink2 message contains a byte stream with a Netlink2 header
   followed by its associated payload.

   A single PDU may contain more than one Netlink2 message.  This is
   referred to as batching.  Netlink batching is reused in Netlink2 and
   allows for messages with different commands (such as adding routes
   and deleting a QoS policy) to be carried in the same batch PDU.
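
   A receiver can walk a batch PDU by following the Length field of each
   message. The sketch below assumes the 20-byte header layout given in
   this section, network byte order, and Netlink-style padding of each
   message to a 4-byte boundary; none of these three points is stated
   for batches in this text, and iter_batch is a hypothetical name.

```python
import struct

# Version, Flags_E, Length, Type, Flags, Sequence, Source PID, Dest PID.
HEADER = struct.Struct("!BBHHHIII")  # 20 bytes, assumed network order


def iter_batch(pdu):
    """Yield (msg_type, message_bytes) for each message in a batch PDU."""
    offset = 0
    while offset < len(pdu):
        if len(pdu) - offset < HEADER.size:
            raise ValueError("truncated Netlink2 header")
        fields = HEADER.unpack_from(pdu, offset)
        length, msg_type = fields[2], fields[3]
        if length < HEADER.size or offset + length > len(pdu):
            raise ValueError("bad Length field")
        yield msg_type, pdu[offset:offset + length]
        # Assumption: the next message starts on a 4-byte boundary.
        offset += (length + 3) & ~3
```

   Because each message carries its own Type, a batch can mix commands,
   for example a route addition followed by a route deletion.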

   A Netlink2 message may be split across multiple PDUs if it does not
   fit into the PDU.  This is referred to as a multipart Netlink2
   message and is also inherited from Netlink.

   For multipart messages, the first and all following headers have the
   NLM_F_MULTI Netlink header flag set, except for the last header,
   which has the Netlink header type NLMSG_DONE.
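
   Receiver-side reassembly of a multipart message might look like the
   sketch below. It assumes the header layout of this section in network
   byte order and reuses the Linux values for NLM_F_MULTI and
   NLMSG_DONE; the reassemble function is hypothetical and simply
   concatenates payloads.

```python
import struct

HEADER = struct.Struct("!BBHHHIII")  # 20-byte Netlink2 header (assumed)

# Values from Linux netlink.h, assumed unchanged in Netlink2.
NLM_F_MULTI = 0x2
NLMSG_DONE = 0x3


def reassemble(messages):
    """Join the payloads of a multipart sequence.

    Returns the joined payload once NLMSG_DONE is seen, or None if the
    sequence is still incomplete.
    """
    parts = []
    for msg in messages:
        fields = HEADER.unpack_from(msg)
        length, msg_type, flags = fields[2], fields[3], fields[4]
        if msg_type == NLMSG_DONE:
            return b"".join(parts)
        if not flags & NLM_F_MULTI:
            raise ValueError("message lacks NLM_F_MULTI")
        parts.append(msg[HEADER.size:length])
    return None
```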

   The Netlink2 message header is shown below.


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                     0               1               2             3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Version    |   Flags_E     |             Length            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |             Type              |             Flags             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Sequence Number                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          Source PID                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Destination PID                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   The fields in the header are:

      Version: 8 bits

      The version field is split into major:minor (4:4 bits) sub-
      fields.  The value for Netlink2 is 0x20.

      Flags_E: 8 bits

      These are extended flags:

         NLM_F_PRIO: Message priority: 1 for high and 0 for low.
         Additional QoS level set in QoS TLV.


         NLM_F_ASTR: Set the ACK strategy: 1 for partial ACKs and 0 for
         full ACKs.

         NLM_F_MS: Multiple Service templates are present when this flag
         is set to 1.

         NLM_F_EXT: When this flag is set, the optional
         Netlink2-extension TLVs are present.

      Length: 16 bits

      The length of the Netlink2 message in bytes including the header.

      Type: 16 bits

      This field describes the message content.  It can be one of the
      standard message types:

         NLMSG_NOOP: the message is not executed on the LFB.

         NLMSG_ERROR: the message signals an error and the payload
         contains a nlmsgerr structure.  This can be looked at as a NACK
         and typically it is from LFB to CPC.

         NLMSG_DONE: message terminates a multipart message

         NLMSG_SYN: Sent on the first message.  Interpreted as a boot
         message of the sender.

         NLMSG_FIN: Sent on the last message.  Interpreted as a shutdown
         message of the sender.

      Typically, services specify more message types centered around
      transactional operations of adding, deleting or querying a
      command. For example, the NETLINK_ROUTE Service specifies several
      types for manipulating IPv4 or IPv6 routes such as RTM_NEWROUTE,
      RTM_DELROUTE, etc.

      Flags: 16 bits

      The standard flag bits used in Netlink are:

         NLM_F_REQUEST: Must be set on all request messages (typically
         from CE to FE)

         NLM_F_MULTI: Indicates the message is part of a multipart
         message terminated by NLMSG_DONE


         NLM_F_ACK: Request for an acknowledgment on success.  Typical
         direction of request is from CPC to LFB.

         NLM_F_ECHO: Echo this request.  Typical direction of request is
         from CPC to LFB.

      Additional flag bits for GET requests on config information in the
      LFB:

         NLM_F_ROOT: Return the complete table instead of a single
         entry.

         NLM_F_MATCH: Return all entries matching the criteria passed in
         the message content.

         NLM_F_ATOMIC: This is an atomic or part of an atomic operation
         (such as two-phase commit).

      Convenience macros for flag bits:

         NLM_F_DUMP: This is NLM_F_ROOT or'ed with NLM_F_MATCH

      Additional flag bits for NEW requests:

         NLM_F_REPLACE: Replace existing matching config object with
         this request.

         NLM_F_EXCL: Do not replace the config object if it already
         exists.

         NLM_F_CREATE: Create config object if it does not already
         exist.

         NLM_F_APPEND: Add to the end of the object list.

      For readers familiar with BSDish use of such operations in route
      sockets, the equivalent translations are:

      *  BSD ADD operation equates to NLM_F_CREATE or-ed with NLM_F_EXCL

      *  BSD CHANGE operation equates to NLM_F_REPLACE

      *  BSD Check operation equates to NLM_F_EXCL

      *  BSD APPEND equivalent is actually mapped to NLM_F_CREATE

      Sequence Number: 32 bits


      The sequence number of the message.

      Source PID: 32 bits

      The PID of the sender of the message (unicast or logical PID).

      Destination PID: 32 bits

      The PID of the destination of the message (unicast, logical, or
      broadcast PID).
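
   The header fields above can be packed and parsed as in the following
   sketch. It assumes network byte order, which this section does not
   specify, and takes the numeric flag values from Linux netlink.h on
   the assumption that Netlink2 keeps them. The names pack_message,
   unpack_header, BSD_ADD, and BSD_CHANGE are illustrative and not part
   of the protocol.

```python
import struct
from collections import namedtuple

NL2_VERSION = 0x20  # major 2, minor 0
HEADER = struct.Struct("!BBHHHIII")  # 20 bytes, assumed network order

Header = namedtuple(
    "Header", "version flags_e length type flags seq src_pid dst_pid"
)

# Flag values as in Linux netlink.h (assumed unchanged in Netlink2).
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_ROOT = 0x100
NLM_F_MATCH = 0x200
NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH  # convenience macro
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400

# BSD route-socket equivalents listed in the text.
BSD_ADD = NLM_F_CREATE | NLM_F_EXCL
BSD_CHANGE = NLM_F_REPLACE


def pack_message(msg_type, flags, seq, src_pid, dst_pid,
                 payload=b"", flags_e=0):
    """Prefix payload with a Netlink2 header; Length covers both."""
    length = HEADER.size + len(payload)
    if length > 0xFFFF:
        raise ValueError("message does not fit the 16-bit Length field")
    return HEADER.pack(NL2_VERSION, flags_e, length, msg_type, flags,
                       seq, src_pid, dst_pid) + payload


def unpack_header(data):
    """Parse the first 20 bytes of data into a Header tuple."""
    return Header._make(HEADER.unpack_from(data))
```

   For example, a CPC asking an LFB to add a route and acknowledge it
   would combine NLM_F_REQUEST, NLM_F_ACK, and BSD_ADD in the Flags
   field.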


5.2 Type Length Value


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           TLV Type            |          TLV Length           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            Value (Data of size TLV length)                    |
   ~                                                               ~
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


      TLV Type:

      The TLV type field is two octets, and indicates the type of data
      encapsulated within the TLV.

      TLV Length:

      The TLV Length field is two octets, and indicates the length of
      this TLV including the TLV Type, TLV Length, and the TLV data.

      TLV Value:

      The TLV Value field carries the data. For extensibility, the TLV
      Value may be a TLV. In fact, this is the case with the
      Netlink2-extension TLV. The Value encapsulated within a TLV
      depends on the attribute being configured and is opaque to
      Netlink2; it is therefore not restricted to any particular type
      (examples include ASCII strings such as XML, or OIDs).

   TLVs must be 32-bit aligned.
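
   A minimal sketch of TLV encoding and decoding under these rules
   follows. It assumes network byte order, zero-byte padding, and, as
   with Netlink attributes, that the TLV Length excludes the padding;
   the text above does not settle these points, and the function names
   are hypothetical.

```python
import struct


def align4(n):
    """Round n up to the next multiple of 4 (32-bit alignment)."""
    return (n + 3) & ~3


def encode_tlv(tlv_type, value):
    """Encode one TLV; Length covers Type, Length, and the data."""
    length = 4 + len(value)
    if length > 0xFFFF:
        raise ValueError("value too long for a 16-bit TLV Length")
    # Assumption: padding bytes are zero and not counted in Length.
    padding = b"\x00" * (align4(length) - length)
    return struct.pack("!HH", tlv_type, length) + value + padding


def decode_tlv(data, offset=0):
    """Return (type, value, offset of the next aligned TLV)."""
    tlv_type, length = struct.unpack_from("!HH", data, offset)
    if length < 4 or offset + length > len(data):
        raise ValueError("bad TLV Length")
    value = data[offset + 4:offset + length]
    return tlv_type, value, offset + align4(length)
```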


5.3 Encapsulated TLVs

   TLV values can be other TLVs. This gives the flexibility of being
   able to add new attributes when needed. This is important for a
   protocol such as ForCES for which attributes are expected to vary
   over a wide range of configurable blocks (CEs, FEs, LFBs, etc).

   Note that Encapsulated TLVs could be viewed as abstractions that
   represent dynamic lists of attributes.

5.4 Netlink2-extension TLVs

   The Netlink2-Extension and Service TLVs are Encapsulated TLVs. They
   contain their respective TLVs as appropriate in the message being
   sent.


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Outer TLV Type         |       Outer TLV Length        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Inner TLV1 Type        |       Inner TLV1 Length       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   ~ ~~~~~~~~~~~~~~           VALUE1    ~~~~~~~~~~~~~~~~~~~~~~     ~
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   ~                                                               ~
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Inner TLVn Type        |       Inner TLVn Length       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   ~ ~~~~~~~~~~~~~~           VALUEn    ~~~~~~~~~~~~~~~~~~~~~~     ~
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


      Outer TLV Type:

      This is set to NL2_OPTIONS(0) to indicate the TLV is the
      Netlink2-Extension TLV. The rest of the possible value types are
      reserved for future use.

      Outer TLV Length:


      The Outer TLV Length is the length of everything within the TLV,
      including the Outer TLV Type field, the Outer TLV Length field,
      and all the encapsulated TLVs, which are treated as the Outer TLV
      Value.

      Outer TLV Value:

      The Outer TLV Value is all the inner TLVs. The figure above shows
      an outer TLV with n inner TLVs.

      Inner TLV Type, Length, Value:

      These are all just normal TLVs. No assumption is made about their
      data contents.
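
   The nesting of inner TLVs inside the NL2_OPTIONS outer TLV can be
   sketched as below. The sketch assumes network byte order and that
   padding is excluded from each TLV Length. The inner type numbers used
   in the usage line are placeholders, since this section does not
   assign them, and the function names are hypothetical.

```python
import struct

NL2_OPTIONS = 0  # Outer TLV Type of the Netlink2-Extension TLV


def _tlv(tlv_type, value):
    # Length covers Type + Length + data; pad with zeros to 32 bits.
    length = 4 + len(value)
    return (struct.pack("!HH", tlv_type, length) + value
            + b"\x00" * (-length % 4))


def encode_extension(inner):
    """Wrap (type, value) pairs as inner TLVs of one outer TLV."""
    return _tlv(NL2_OPTIONS, b"".join(_tlv(t, v) for t, v in inner))


def decode_extension(data):
    """Return the list of (type, value) pairs inside the outer TLV."""
    outer_type, outer_len = struct.unpack_from("!HH", data)
    if outer_type != NL2_OPTIONS:
        raise ValueError("not a Netlink2-Extension TLV")
    if outer_len < 4 or outer_len > len(data):
        raise ValueError("bad Outer TLV Length")
    inner, offset = [], 4
    while offset < outer_len:
        t, length = struct.unpack_from("!HH", data, offset)
        if length < 4 or offset + length > outer_len:
            raise ValueError("bad Inner TLV Length")
        inner.append((t, data[offset + 4:offset + length]))
        offset += length + (-length % 4)  # skip padding
    return inner


# Placeholder inner types 1 and 2, for illustration only.
blob = encode_extension([(1, b"\x01"), (2, b"abcd")])
```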


6. Addressing and Transport Extensions

   We extend Netlink to make it distributed. The focus is on making
   Netlink2 have a strong local scope view of the world while fitting
   well into a global scope when the hop distance between the FE and CE
   increases.

   If the network interconnecting the FE(s) and CE(s) is completely
   hidden from the outside (black-box view), for instance an internal
   Ethernet segment or a switching fabric in which CE(s) and FE(s) are
   connected within physical proximity, then communications between FE
   and CE are assumed to be of a local scope.  On the other hand, if
   communications between FE and CE cross several hops of the network,
   then the scope is considered global.

6.1 Transport Methods

   The ideal environment for Netlink2 is considered to be a
   multicast-capable medium with IP above it and with UDP/TCP/SCTP/DCCP
   running over IP.

   On the other hand, Netlink2 is also capable of running directly over
   L2 (Ethernet for example).

   In a non-IP, non-multicast-capable environment, the ForCES layer
   would need extra processing and messaging to compensate for services
   that IP already offers (e.g., security, quality of service, and
   fragmentation) whenever the underlying transport does not provide
   them.

6.1.1 Why Multicast?

   Multicast is considered important to facilitate one-to-many/some
   communication.  For example, a single command from a CE can be
   multicast to multiple FEs, which eases the scalability requirements
   mentioned in [ForCES_REQ]. This is discussed in later sections.

   When running Netlink2 over non-multicast-capable media, it is
   expected that mechanisms similar to those used in OSPF NBMA [RFC2328]
   networks will be put in place.

6.1.2 Why IP?

   IP runs on virtually every link layer. Leveraging this fact alone
   helps deploy the protocol more widely and more quickly.

   IP also provides numerous services such as fragmentation and
   reassembly, prioritization, and security, which are inherent
   requirements for the ForCES protocol.  This means that running
   successfully over an alternative to IP requires that similar
   services be provided by whatever is underneath in order to meet the
   requirements.

   Netlink2-specific optional TLVs can be used to compensate for lacking
   functionality if running on a network transport other than IP or
   directly on the link layer.

   Netlink already allows the definition of multipart messages, with IP
   segmenting/reassembling when the path MTU is exceeded.  When running
   on top of non-IP media, the Netlink2 message can be limited so that
   it does not exceed the MTU; the multipart message facility can then
   be used to provide framing for segmenting/reassembling.
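
   The segmentation described above can be sketched as follows.  This
   is a minimal illustration, not the draft's wire format: the flag and
   type values are borrowed from Linux Netlink, and the header_len
   default and the function names are assumptions made for the example.

```python
NLM_F_MULTI = 0x2  # value borrowed from Linux Netlink (assumption)
NLMSG_DONE = 0x3   # value borrowed from Linux Netlink (assumption)


def segment(payload, msg_type, mtu, header_len=16):
    """Split payload into (type, flags, chunk) parts that fit in mtu.

    header_len is a placeholder; the real Netlink2 header size applies.
    """
    room = mtu - header_len
    if room <= 0:
        raise ValueError("MTU leaves no room for payload")
    parts = [(msg_type, NLM_F_MULTI, payload[i:i + room])
             for i in range(0, len(payload), room)]
    parts.append((NLMSG_DONE, 0, b""))  # terminates the multipart run
    return parts


def reassemble(parts):
    """Concatenate chunks up to the NLMSG_DONE terminator."""
    out = bytearray()
    for msg_type, _flags, chunk in parts:
        if msg_type == NLMSG_DONE:
            return bytes(out)
        out += chunk
    raise ValueError("multipart run not terminated by NLMSG_DONE")
```

   Each part, once prefixed with its header, stays within the MTU, so
   no lower-layer fragmentation service is needed.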

   The Netlink2-specific Authentication TLV can be used to carry
   authentication signatures over a transport that does not have this
   capability.

   The Netlink2-specific Checksum TLV can be used to carry checksums
   over a medium that does not have this capability.

   The Netlink2-specific Message Priority TLV can be used to carry a
   message priority if the transport cannot express priorities in its
   own headers.

6.1.3 Why UDP/TCP/SCTP/DCCP?

   On a local scope, it is assumed that multicast UDP over IP is the
   preferred mode of operation.

   On a global scope it is expected that TCP or SCTP would be used for
   enhanced reliability and Internet congestion friendliness.

   All of the protocols mentioned provide 16-bit ports, which are
   further address-demultiplexing points.  They also all provide a
   checksum capability to enhance the integrity of the Netlink2
   message.  In the case of UDP, the checksum is optional (which fits
   the model that the local scope is less error-prone than the global
   scope and hence the integrity check could be turned on only when
   needed).

6.2 The Netlink2 wire and bundle

   A Netlink2 wire displays the same behavior as a Netlink wire.  It
   interconnects FEs and CEs in order to support services they jointly
   offer.

   The only conceptual difference between a Netlink2 wire and a Netlink
   wire is that whereas the Netlink wire is localized, the Netlink2 wire
   is distributed.

   We also introduce the concept of a Netlink2 bundle.  A Netlink2
   bundle interconnects a set of FE(s) and/or CE(s) by means of one or
   more Netlink2 wires.  Note that a Netlink2 bundle does not
   necessarily mean a full-mesh interconnection (see examples later on).

   Parties (FEs and CEs) on a Netlink2 bundle share common
   configuration, provisioning, and event-notification end goals.

   A Netlink2 wire MAY be constructed using a multicast connection, a
   unicast connection, or multiple multicast and unicast connections.
   A wire MUST belong to only one bundle.  A bundle may
   have only a single wire (unicast or multicast).  In most cases we
   believe there will only be one multicast address for a bundle,
   although scalability issues could require the use of unicast
   connections in addition.

   When a multicast IP address is used, a Netlink2 wire MUST run over
   UDP - a UDP port is used to uniquely identify the wire.  There MAY be
   multiple wires using the same multicast address as long as they run
   over different UDP ports.
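
   As an illustration of these rules, a wire can be modelled as an
   (address, port) pair and a registry can enforce that a wire belongs
   to only one bundle.  The class names, bundle names, and addresses
   below are hypothetical and serve only as a sketch.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Wire:
    address: str  # IP unicast or multicast address of the wire
    port: int     # UDP port; tells wires on one multicast group apart

    @property
    def is_multicast(self):
        return ipaddress.ip_address(self.address).is_multicast


class BundleRegistry:
    """Tracks wire membership; a wire may be in only one bundle."""

    def __init__(self):
        self._bundle_of = {}

    def attach(self, bundle_name, wire):
        owner = self._bundle_of.setdefault(wire, bundle_name)
        if owner != bundle_name:
            raise ValueError("wire already belongs to bundle " + owner)

    def wires(self, bundle_name):
        return {w for w, b in self._bundle_of.items() if b == bundle_name}
```

   Two wires that share a multicast address but use different UDP
   ports compare as distinct, matching the rule stated above.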

   When a unicast IP address is used, the description of how to connect
   to an endpoint (CE/FE) is subject to the agreement between the CE and
   FE. The connection could be directly over IP (Note: need an IP
   protocol number) or via transport-layer ports (TCP/UDP/SCTP/DCCP).

   In both unicast and multicast wires, the necessary parameters (such
   as IP address and port numbers) can be discovered by the involvement
   of the FE and CE Managers.

6.2.1 What wires go in a bundle?

   Netlink2 provides flexibility to have a bundle of purely unicast
   wires or multicast wires or a hybrid of both.  The decision of what
   goes into a bundle can be made in the pre-association phase.

   A good analogy is to think of a multicast wire as a broadcast link
   (as is done in Netlink) in which CE(s) and FE(s) are parties attached
   to that broadcast link.

   Depending on the number of FEs and CEs on an NE, a single multicast
   wire in the bundle may be sufficient.  Multicast allows one-to-some
   messaging.  A single message sent by an originator is seen by all
   parties on the wire.  This simplifies synchronization in an HA
   environment as well as implementation of the protocol.


   The fact that multicast messages are seen by all parties could cause
   scalability issues as the number of nodes grows.  Parties need to
   filter out messages not destined to them.  This filtering consumes
   compute resources, or table resources if it is done in hardware.
   The extra messages also consume bandwidth unnecessarily at FE(s) and
   CE(s) not interested in seeing these messages.

   Unicast wires could be used to create point-to-point connections
   between the parties; when every party is connected to every other
   party, then this becomes a full mesh.

   A full unicast mesh topology removes the need to filter the
   unnecessary messages but introduces scalability concerns as the
   number of connections required grows quadratically with the number of
   parties (FEs and CEs) present.  This requires considerably more
   computation and state information to be maintained at each party.
   A pure mesh topology also complicates HA because more state must be
   maintained (for instance, the IP addresses of the CEs and FEs that
   are active and what their backups are), and extra processing is
   therefore needed to achieve failover.  Failover becomes transparent
   if multicast is used among all parties.
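
   To make the quadratic growth concrete: a full mesh of n parties
   needs n(n-1)/2 point-to-point connections, whereas a single
   multicast wire needs one group regardless of n.  The helper name
   below is an assumption for illustration.

```python
def mesh_connections(parties):
    """Point-to-point wires needed to fully mesh the given parties."""
    return parties * (parties - 1) // 2
```

   For example, one CE with three FEs needs 6 unicast connections,
   and ten parties need 45.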

   Netlink2 allows a bundle to have a hybrid of unicast and multicast
   connections.  Note that this model is used by other protocols, such
   as OSPF over broadcast links, where the Hello protocol is multicast
   but responses to LSA updates are unicast.

   We present some examples of Netlink2 bundles:

   1.  A trivial case is a Netlink2 bundle consisting of a single
       unicast wire between the CE and FE it interconnects.

   2.  Multiple FEs and a CE could be interconnected with a Netlink2
       bundle using a single multicast connection.

   3.  In the same example as 2) above, the unicast address of the CE
       could additionally be used, for instance, to deliver
       acknowledgments or notifications from the FEs to the CE without
       their being seen by all other FEs.  The unicast addresses of the
       FEs could also be used, for instance, to deliver certain
       messages only to a specific FE, such as a retransmission of a
       message in a two-phase commit only to an FE that did not
       respond.

   4.  Multiple FEs and CEs could use a wire with two multicast
       connections: one for all FEs, the other for all CEs, so that
       messages only relevant to FEs are not seen by CEs and vice-versa.


6.3 Redefining the Netlink PID Semantics

   We maintain the name PID for historical purposes and introduce a
   Destination PID and a Source PID as mentioned earlier.

   For every message received by each party on the wire, the destination
   PID field indicates the recipient of the message.  The addressed
   party could be an FE or a CE or, at a finer granularity, an LFB
   within an FE or a CPC within a CE.

   In addition to Netlink2 wires (unicast or multicast) defining the
   destination of a particular message delivered, the PID types provide
   further control, namely to define which entity actually has to
   process the message.  So if the bundle uses only a single multicast
   wire, messages will be heard by all parties on the wire, but only
   those with a matching PID will actually process these messages.  We
   introduce special-purpose PIDs addressed to specific listeners on
   the wire.

   The following types of PIDs are defined and can be used in the
   Netlink2 messages.  The actual values for the PID of a FE or CE must
   be the same across all wires of the same bundle and must be
   established during the pre-association phase.

   Default values are given.  PIDs must be unique within a Netlink2
   wire. They may also be unique within the NE. PIDs are subdivided into
   two 16-bit subfields named wire and party in the form wire:party.

   1.  unicastPID: allows one to uniquely address a FE or CE.  Each FE/
       CE must have such a unicast PID.  Only the FE or CE assigned to
       this PID must process an incoming message with such a Destination
       PID.  Other parties MAY silently discard the message. The wire
       subfield is a unique identifier of the FE or CE. The party
       subfield acts as a port number: it can for instance be used to
       further demultiplex a message to the appropriate process in a CE
       (CPC) or the appropriate LFB in an FE.

       Default value: none.

   2.  logicalPID: in addition to unicastPID, a FE/CE MAY have zero or
       more logical PIDs assigned to it.  A logicalPID can be used for
       active-backup pairs of FEs: for instance, the active and the
       backup FE have the same logical PID or at least the same wire
       subfield. The wire subfield is an identifier of the group of FEs
       and/or CEs participating in the group. Pre-association
       configuration ensures that the same party identifier is not
       assigned twice to different CPCs or LFBs on the same wire.

       Default value: none.


   3.  broadcastPID: all parties on all wires must process an incoming
       message with such a Destination PID.  An example of a message
       that might be broadcast is when a CE is brought down for
       maintenance.

       Default value: 0xffffffff

   4.  FEbroadcastPID: all FEs on all wires must process an incoming
       message with such a Destination PID.  A typical example is a
       route update from the CE to all FEs.  Other parties (CEs) can
       silently discard the message.

       Default value: 0xffffefff

   5.  CEbroadcastPID: all CEs on all wires must process an incoming
       message with such a Destination PID.  Other parties (FEs) can
       silently discard the message.

       Default value: 0xffffdfff
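
   The PID types above can be sketched in code as follows.  The
   broadcast constants are the default values listed; placing the wire
   subfield in the upper 16 bits is an assumption suggested by the
   wire:party notation, and the function names are hypothetical.

```python
BROADCAST_PID = 0xFFFFFFFF
FE_BROADCAST_PID = 0xFFFFEFFF
CE_BROADCAST_PID = 0xFFFFDFFF


def make_pid(wire, party):
    """Pack the two 16-bit subfields (wire assumed in the high half)."""
    if not (0 <= wire <= 0xFFFF and 0 <= party <= 0xFFFF):
        raise ValueError("subfields must fit in 16 bits")
    return (wire << 16) | party


def split_pid(pid):
    """Return the (wire, party) subfields of a 32-bit PID."""
    return pid >> 16, pid & 0xFFFF


def must_process(dst_pid, role, own_pids):
    """Decide whether a party handles a message.

    role is "FE" or "CE"; own_pids holds the party's unicast PID and
    any logical PIDs assigned to it.
    """
    if dst_pid == BROADCAST_PID:
        return True
    if dst_pid == FE_BROADCAST_PID:
        return role == "FE"
    if dst_pid == CE_BROADCAST_PID:
        return role == "CE"
    return dst_pid in own_pids
```

   A party for which must_process() returns False may silently discard
   the message.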

   A Netlink2 message must have as its Destination PID one of the PID
   types defined above.  The Source PID of a Netlink2 message must be
   of the unicastPID or logicalPID type.  In addition, if the NLM_F_ACK
   flag is set, then every party processing the message MUST reply
   with an acknowledgment after processing the message, unless the
   NLM_F_ASTR flag is used to prevent ACK implosion.

   Pre-configured translation tables can be used to map a given PID into
   the underlying wire in a bundle, i.e., an IP unicast or multicast
   address.
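
   A pre-configured translation table of this kind could be as simple
   as a dictionary lookup.  The table contents, addresses, ports, and
   the fallback behaviour below are hypothetical and would in practice
   come from pre-association configuration.

```python
# Hypothetical pre-association configuration: PID -> (address, port)
PID_TO_WIRE = {
    0xFFFFEFFF: ("239.0.0.1", 5000),   # FEbroadcastPID -> multicast wire
    0x00010000: ("192.0.2.10", 5001),  # one FE's unicastPID -> unicast wire
}


def resolve_wire(dst_pid, table, fallback=None):
    """Return the (address, port) on which to send to dst_pid."""
    try:
        return table[dst_pid]
    except KeyError:
        if fallback is None:
            raise LookupError("no wire configured for PID 0x%08x" % dst_pid)
        return fallback
```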

6.4 Local Scope Addressing and Encapsulation

   At a local scope, the preferred addressing used for a wire is a UDP
   port on top of a multicast IP address.

   Multiple wires can run on one multicast address, with a further
   level of demultiplexing based on the UDP port.

   The wire addressing parameters MAY be discovered during the
   pre-association phase.

6.5 Global Scope Addressing and Encapsulation

   When addressing a non-local scope, the Netlink2 message is
   encapsulated in a transport header and shuttled to the remote end,
   where it is decapsulated and processed as if it had originated in
   the local scope of that remote end.  Global scope addressing could
   use any
   transport protocol configured (SCTP, UDP, TCP or DCCP) as agreed upon
   in the pre-association phase.

   This can be viewed as an extension of the local scope wires.


7. Protocol Architecture

7.1 Protocol Phases

   ForCES in relation to NEs involves three phases: the Pre-Association
   phase, the association phase where the ForCES protocol operates, and
   a termination phase where a party in the relationship leaves a
   bundle.

7.1.1 The Pre-Association Phase

   In a simple setup, this phase is static.  All the parameters for the
   association phase are well known (for example, the multicast groups
   for each Netlink2 wire in a bundle).

   Vendors may use their own proprietary service discovery protocol.  As
   a minimum, we assume a static configuration. In fact, although ForCES
   mandates a minimal set of capability discovery, Netlink2 will also
   operate in a mode where such capability discovery is done in
   pre-association phase. In that case, the FE Manager and the CE
   Manager agree on all the parameters and clearly articulate topology
   and other information to each other in the pre-association phase.

   On completion of the Service Discovery phase, the FEM will have
   established contact with the appropriate CEM component.
   Initialization and Authentication will be complete at this point.
   Both the FE and CE know how to connect to each other for
   configuration, accounting, identification and authentication
   purposes.  Both sides are also knowledgeable of all necessary
   protocol parameters such as timers, etc. All capabilities may also
   have been discovered at this point.

7.1.2 The Association Phase

   In this phase, the FE and CE components cooperate to deliver the IP
   service. The CE component might be registered (in the pre-association
   phase) to receive FE-specific services (such as link events).
   Essentially, in this phase, the service is provisioned and executing.
   The FE component might continuously get updates from the control
   plane component on how to operate the service (for example, the IPv4
   forwarding route additions or deletions).

   The association phase is where Netlink2 operates as the ForCES
   protocol.

   On startup, the FE connects to the bundle(s) to which the CE is
   connected, using the procedure defined in Section 8.3.1. The
   controlling CE will either admit the FE into the NE or reject it.


   Once granted access into the NE, the FE is continuously updated or
   queried. The FE may also send asynchronous event notifications to
   the CE. This continues until a termination is initiated by either
   the CE or the FE.

7.1.3 Service Termination

   Service termination could be issued by either component of the
   service abstraction.

   The FE or the CE initiating the termination will issue a FIN command.

7.2 Protocol Logical Model

   In the diagram below we show a simple LFB-CPC logical relationship.
   We use the IPv4 Forwarding LFB as an example.


                            CE-----------------------------------
                            |    /^^^^^\       /^^^^^\           |
                            |   |       |     / CPC-2 \          |
                            |   | CPC-1 |     | COPS  |          |
                            |   | ospfd |     |  PEP  |          |
                            |   \       /      \_____/           |
                            |    \_____/           |             |
                            |        |             |             |
                         ****************************************|
                         ************* NETLINK2 BUNDLE ***********
            FE---------- *****************************************.
            |       IPv4 Forwarding|    |           |             |
            |       LFBs           |    |           |             |
            |       --------------/ ----|-----------|--------     |
            |       |            /      |           |       |     |
            |       |     .-------.  .-------.   .------.   |     |
            |       |     |ingress|  | IPv4  |   |Egress|   |     |
            |       |     |police |  |Forward|   | QoS  |   |     |
            |       |     |_______|  |_______|   |Sched |   |     |
            |       |                             ------    |     |
            |        ---------------------------------------      |
            |                                                     |
             -----------------------------------------------------


   Netlink2 logically models LFBs and CPCs in the form of service blocks
   interconnected to each other via a Netlink2 bundle.

   Acknowledgements and responses to messages do not have to be sent
   onto the same wire from which the triggering messages came but
   MUST be sent on the same bundle to the same originating PID. For
   instance, a wire interconnecting a CE with multiple FEs using a
   multicast address could be used to send route updates from the CE.
   On the other hand, independent unicast wires from each FE to the CE
   could be used to send back route events or acknowledgments.  Note
   that sequencing is done per wire and Source PID, and ACKs can travel
   back on any wire of a bundle.
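
   Because sequencing is per wire and Source PID, a receiver needs one
   expected-sequence counter per (wire, Source PID) pair.  The sketch
   below assumes a 32-bit sequence number that increases by one per
   message, as in Netlink; the class and method names are hypothetical.

```python
class SequenceTracker:
    """Checks in-order arrival separately for each (wire, source PID)."""

    def __init__(self):
        self._expected = {}

    def accept(self, wire, src_pid, seq):
        """Return True if seq is the next one expected on this stream."""
        key = (wire, src_pid)
        expected = self._expected.get(key)
        if expected is not None and seq != expected:
            return False  # gap or duplicate on this stream
        self._expected[key] = (seq + 1) & 0xFFFFFFFF  # wrap at 32 bits
        return True
```

   The same sequence number arriving on two different wires is thus
   not treated as a duplicate.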

   The Netlink2 wire can be shared or be specific to a service.  There
   can be multiple Netlink2 wires in a bundle carrying messages of the
   same service.  Additional Netlink2 wires can be used to reduce the
   messaging that a party sees (for example, to avoid extra
   processing) or to restrict it for partitioning or security reasons.
   A possible partitioning is a Netlink2 bundle per service.  In the
   example above the IPv4 Forwarding LFB would be considered a service.

   Assuming capabilities have been discovered during the pre-association
   phase (between the FEM and CEM), blocks (CPCs or LFBs as illustrated
   above) connect to the agreed wires on the Netlink2 bundle, and listen
   to receive specific messages.  CPCs may connect to multiple Netlink2
   wires if it helps them to control the service better.  All blocks
   (CPCs and LFBs) dump packets on the Netlink2 wires.

   LFBs or CPCs join Netlink2 wires and listen to messages of interest
   for processing or monitoring purposes.

   All messages addressed to the LFB (for example the IPv4 forwarding
   LFB illustrated above) will have the FE PID agreed upon by both the
   CE and the FE at the pre-association phase.

   LFBs (as well as CPCs) also process messages with the broadcast PIDs.
   They may also process messages destined to other LFBs (as well as
   CPCs) for availability synchronization purposes.

   A further demultiplexing point is the command type in the Netlink2
   message. Each of the LFBs (e.g., the ingress police LFB above) knows
   how to respond to a specific command-set as defined by the Netlink2
   message type.

7.3 Service Addressing

   Connecting to a service is achieved by connecting to a defined
   Netlink2 bundle by both the CPC and LFB.  This Netlink2 bundle is
   derived in the pre-association phase.

   A service would typically be constrained to a specific Netlink2
   bundle.


   Connecting to a service is followed (at any point during the lifetime
   of the connection) by the CPC issuing service-specific commands to
   the LFB, mostly for configuration purposes or for statistics
   collection. The LFB could also send event announcements to the CPC
   or respond to queries issued by the CPC.

7.4 Service Templates

   LFBs throw events and are configured and queried by using service
   templates.

   Refer to the Netlink document [RFC3549] as well as Section 8.4 for
   the different templates used for different LFBs that fit within the
   current scope of the ForCES charter.

7.5 Mechanisms for Creating Protocols

   Mechanisms for creating reliable or unreliable protocols are
   provided.  In addition, mechanisms for facilitating availability are
   embedded in Netlink2.

7.5.1 Building Reliable Protocols

   By default the Netlink2 header flags NLM_F_PRIO and NLM_F_ACK are not
   set so that Netlink2 messages are sent with a lower priority and do
   not require acknowledgements.

   One could create a reliable protocol between an LFB and a CPC by
   using the combination of sequence numbers, ACKs and retransmit
   timers.  Both sequence numbers and ACKs are provided by Netlink2.
   Timers are provided by the operating system or hardware.
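
   A minimal sketch of such a reliable exchange is shown below.  The
   send and wait_ack callables stand in for the transport and the
   timer facility, and the timeout and retry defaults are arbitrary;
   all of these names are assumptions made for illustration.

```python
def send_reliable(send, wait_ack, seq, payload, timeout=0.5, max_retries=3):
    """Send a message with NLM_F_ACK semantics and retransmit on timeout.

    send(seq, payload) transmits the message; wait_ack(seq, timeout)
    returns True if an ACK for seq arrived before the timer expired.
    Returns the number of transmissions that were needed.
    """
    for attempt in range(1 + max_retries):
        send(seq, payload)
        if wait_ack(seq, timeout):
            return attempt + 1
    raise TimeoutError("no ACK for sequence number %d" % seq)
```

   The sequence number lets the receiver discard duplicates created by
   retransmission and lets the sender match an ACK to its request.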

   Prioritization is an orthogonal mechanism to reliability.  When a
   node runs out of resources, a message sent with a higher priority
   will get preferential treatment.  For instance, if a FE has only
   enough memory to allocate one message in response to a message from
   the CE and it has to choose between one of two messages to respond
   to, then it will use that memory for the request which was sent with
   the higher priority.  This also applies to other resources such as
   computing cycles and bandwidth.  In other words, the NLM_F_PRIO is
   more than only the classical bandwidth prioritization of packets on a
   link.

   Another orthogonal mechanism provided by Netlink2 is the ACK strategy
   which is selected by the NLM_F_ASTR flag.

   We define two types of acknowledgement strategies:


   1.  partial ACKs (using multicast ACK slotting and damping techniques
       [XTP]):  receivers multicast an ACK after a random time if they
       have not yet seen an ACK sent by another receiver.  This limits
       the number of ACKs returned to the source of the message and
       improves performance. For messages that a CE sends to a group of
       FEs, partial ACKs imply that an ACK from any one of the FEs is
       sufficient to deem the message delivered.

   2.  full ACKs: each receiver sends an ACK back to the source.  This
       allows the source to immediately detect problems with receivers.
       In two-phase commits it is important that all FEs respond, so
       the full ACKs strategy should be used.
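
   The two strategies can be contrasted with a small model.  The
   strategy labels, function names, and the simplified damping rule
   (a receiver stays quiet once the earliest ACK has had time to reach
   it) are assumptions for illustration, not a specification of the
   [XTP] techniques.

```python
def is_delivered(strategy, receivers, acked):
    """Sender-side view of whether a message counts as delivered."""
    if strategy == "partial":
        return bool(receivers & acked)   # any one ACK suffices
    if strategy == "full":
        return receivers <= acked        # every receiver must ACK
    raise ValueError("unknown ACK strategy: " + strategy)


def partial_ack_senders(delays, propagation):
    """Receivers that still multicast an ACK under slotting/damping.

    delays maps each receiver to the random back-off it drew.  A
    receiver is suppressed if the earliest ACK, sent at min(delays),
    reaches it (after `propagation`) before its own timer fires.
    """
    first = min(delays.values())
    return {r for r, d in delays.items()
            if d == first or d < first + propagation}
```

   With back-offs spread over a window much larger than the
   propagation delay, most receivers are suppressed and the source
   sees only a few ACKs.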


7.5.2 Building Availability

   A protocol component or an application could passively listen to
   Netlink2 commands and events within one or several Netlink2 wires.
   Doing so allows a very simple way of building complex applications
   which are aware of all service components that affect them for HA
   reasons.

   To ensure transparent CE or FE redundancy for certain services, it is
   sufficient to ensure that the backup CPC/LFB is always attached to
   the same wires to which the active CPC/LFB is attached, so that the
   backup CPC/LFB receives all messages destined to the active CPC/LFB
   (whatever PID they are sent to) as well as all messages originating
   from the active CPC/LFB.

   One could create a heartbeat protocol between the LFB and CPC by
   using the ECHO flags and the NLMSG_NOOP message (Section 8.3.5). The
   heartbeat, in addition to listening to FE or CE events, could be used
   to facilitate takeover.

   This topic is beyond the scope of ForCES and will not be discussed
   further here.  Note, however, that Netlink2 has the mechanisms
   required to enable this when required.

7.5.3 The ACK Netlink2 Message

   This message is actually used to denote both an ACK and a NACK.
   Typically the direction is from LFB to CPC (in response to an ACK
   request message).  However, a CPC should be able to send ACKs back
   to an LFB when requested.  The semantics for this are IP service
   specific.


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Netlink2 message header                 |
    |                       type = NLMSG_ERROR                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          error code                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       OLD Netlink2 message header             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   Error code: integer (typically 32 bits)

   An error code of zero indicates that the message is an ACK response.
   An ACK response message contains the original Netlink2 message header
   that can be used to compare against (sent sequence numbers, etc).

   A non-zero error code message is equivalent to a Negative ACK (NACK).
   In such a situation, the Netlink2 data that was sent in the original
   message is returned appended to the original Netlink2 message header.
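
   The ACK/NACK layout above could be decoded as sketched below.  The
   header length is passed in as a parameter, and network byte order
   and a signed 32-bit error code are assumptions; the function name
   and the returned dictionary keys are hypothetical.

```python
import struct


def parse_ack(body, header_len):
    """Decode what follows the NLMSG_ERROR message header.

    body starts at the error code; header_len is the size of a
    Netlink2 message header (assumed fixed for this sketch).
    """
    if len(body) < 4 + header_len:
        raise ValueError("truncated ACK/NACK message")
    (code,) = struct.unpack_from("!i", body, 0)  # byte order assumed
    return {
        "ack": code == 0,                     # zero means plain ACK
        "error": code,                        # non-zero means NACK
        "old_header": body[4:4 + header_len],
        "data": body[4 + header_len:],        # returned only on NACK
    }
```

   The returned old_header can then be compared against the sequence
   number and PIDs of the outstanding request.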

7.5.4 Batching

   As mentioned earlier (and repeated here for clarity), standard
   Netlink multi-message batching looks as follows:

   NLMSG:NLMSG:NLMSG....

   where NLMSG is a Netlink2 header and its associated payload.

   This has the advantage of allowing multiple commands (for example,
   adds and deletes) to be intermixed, generally in a request from the
   CE to the FE. It is also useful for batching multiple events from
   the FE to the CE.

   Additionally, studies from [Goutaudier] have motivated batching of
   Service Templates within a single Netlink2 message. Recall that a
   Netlink2 message looks like:

   NLMSGHDR:OET:ST

   where NLMSGHDR is a Netlink2 header, OET is the optional extension
   TLVs and ST is the service template.

   The template extension now looks like:

   NLMSGHDR:OET:ST:ST:ST.....

   In other words there are multiple service templates that can fit
   within the same message. There are caveats with such a batching
   scheme: since only one ACK may be sent for a whole batch, it is
   difficult to know which service configuration failed. On a
   close-proximity, low-error-rate link, batching in this mode should
   allow for high configuration throughput while reducing the number
   of ACKs sent back.

7.5.5 Atomicity and Ordering of Transactions

   In a two-phase commit, messages are bound into a relationship. The
   first and all following headers have the NLM_F_MULTI Netlink2 header
   flag set, except for the last header, which has the Netlink2 header
   type NLMSG_DONE.  Typically, in Netlink, the NLMSG_DONE shows up in
   separate PDUs to define a commit.

   Atomicity of a transaction, including that of a batch, is achieved by
   using the NLM_F_ATOMIC flag.  Use of NLM_F_ATOMIC is expensive
   because it may necessitate the locking of access to tables (depending
   on the implementation).


8. Putting together the base protocol for WG charter

   The design approach taken for the Netlink2 protocol is to avoid
   over-featuring the protocol and to focus on the requirements under
   the current WG charter. Although Netlink2 could be used for CE-CE or
   FE-FE communication, this is not discussed in this document to avoid
   complexity. Additionally, although Netlink2 provides the minimal
   required attribute discovery, it will work with existing proprietary
   or open protocols for discovering such attributes.

8.1 Netlink2-Extension TLVs

   Netlink2-Extension TLVs are mostly used to compensate for the
   underlying transport not having mechanisms needed by Netlink2.

8.1.1 Authentication

   [TBD]

8.1.2 Checksum


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    TLV Type = NL2_CSUM      |    TLV Length = 2               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Checksum (16 bits)       |    Alignment Padding (16 bits)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   This TLV is optional.  To compute the correct checksum, an
   implementation MUST add the optional checksum TLV to the Netlink2
   message with the initial checksum value of 0 and compute the checksum
   over such a Netlink2 message.  Refer to [RFC3358] for details on the
   Checksum TLV.
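
   The zero-then-fill procedure can be sketched as follows.  The
   actual algorithm is the one referenced in [RFC3358]; the 16-bit
   Internet checksum is used here only as a stand-in to show the
   procedure.  The NL2_CSUM type value, network byte order, and the
   function names are likewise assumptions.

```python
import struct

NL2_CSUM = 0x0001  # hypothetical TLV type value


def internet_checksum(data):
    """16-bit one's-complement checksum (stand-in algorithm)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def add_checksum_tlv(before, after):
    """Insert a Checksum TLV between two 4-byte-aligned message parts."""
    if len(before) % 4 or len(after) % 4:
        raise ValueError("message parts must be 4-byte aligned")
    zeroed = struct.pack("!HHHH", NL2_CSUM, 2, 0, 0)  # checksum = 0
    csum = internet_checksum(before + zeroed + after)
    return before + struct.pack("!HHHH", NL2_CSUM, 2, csum, 0) + after


def checksum_ok(message):
    """With this stand-in, a valid message sums to zero."""
    return internet_checksum(message) == 0
```

   The receiver-side check relies on a property of the stand-in
   algorithm; a different algorithm would instead zero the field and
   recompute.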

8.1.3 Message Priority


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    TLV Type = NL2_MPRIO     |   TLV Length = 2                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Checksum (16 bits)       |    Alignment Padding (16 bits)  |


Hadi Salim, et al.       Expires April 25, 2004                [Page 30]

Internet-Draft        Netlink2 as ForCES Protocol           October 2003


    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   This TLV is optional.  It is used if the network does not support
   prioritization.  This field is used to indicate priorities to the
   remote end.

8.1.4 SYN COOKIE

   TBF

   TLV_TYPE = NL2_COOKIE.


8.1.5 Name ID


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = NL2_NAMEID           | TLV Length = variable           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         size of name                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   This TLV is optional.  It is used to identify a name that a CE or FE
   wishes to be known as. It is typically exchanged with SYN messages.

8.2 LFB and FE Attributes and discovery

   In the association phase the CE queries the FE to determine its
   capabilities.  These may include the FE-FE topology, the initial LFB
   topology for the FE, constraints on how the LFB topology can be
   modified (if possible), etc.  A schema for representing FE and LFB
   attributes and capabilities is being defined in [ForCES_Model].
   Appropriate Netlink2 TLVs will be defined to convey the identified
   parameters as the model work progresses.

8.3 NE creation

   The FE and CE Managers communicate to decide communication parameters
   and rules that are to be used in the transaction between the CE and
   FE.

   Using the agreed-upon parameters, the FE attempts to join the NE.
   The CE may reject the FE or allow it to join. The FE then
   communicates to
   the FEM to inform it of the decision. Note that we do not discuss the
   FE-FEM or CE-CEM interfaces in this document as it is beyond the
   scope of ForCES.


8.3.1 FE State transitions


                                        SYN retran.
                                           .-->-.
                                           ^    Y
                                           |    |
                                           ^    Y
                                            \   Y
                          send SYN        +---------+  recvd SYN|ACK
              +--->----->----->---------->|SYN_SENT |---->>>----+
              |     +------<---------<----|         |           Y
       +------+--+  |  recvd NACK or      |  state  |     +--------+
       | INIT    |<-+  max retransmit     +---------+     | EST    |
       |         |                                        |  State |
       | State   |<-+                     +---------+     |        |
       +---------+  |   recvd FIN|ACK     |FIN_SENT |     +--------+
          ^         +----<---<----------<-|         |       Y     Y
          |                               | State   |--<-<--+     |
          ^                               +---------+   Send FIN  Y
          |                                ^    Y                 |
          |                                |    |                 |
          |                                +-<--+                 |
          |                                 FIN                   |
          |                                retrans                |
          |                                                       |
          |        recvd FIN|ACK  or recvd SYN broadcast         Y
          +-<---<---------<-------<---------<-------<-------------+


   INIT state:

   When the FE is started (by FE manager or otherwise) it goes into the
   INIT state.  At this point the FE has been informed by the FE Manager
   of the following (based on current implementation):

   o  the bundle to join,

   o  its PID,

   o  the PID of the CE,

   o  the number of retries for the SYN transmission and the SYN timer,

   o  and the number of retries for the FIN transmission and the FIN
      timer value.

   The FE Manager would also instruct the FE to be either active or
   passive. Although this is beyond the ForCES charter, the
   active/passive setup is described here as one way to achieve
   redundancy. Netlink2 does not mandate how redundancy is achieved; it
   only requires that FE redundancy be handled within the FE plane.
   Netlink2 is therefore designed so that the CE has no knowledge of FE
   redundancy, which greatly simplifies the protocol.

   After internal initialization, the FE sends a SYN message with the
   ACK flag on.  The message contains a Netlink2-extension TLV of type
   NL2_NAMEID, which carries the name the FE wishes to be known as. The
   FE then enters the SYN_SENT state.

   An FE could passively monitor one or more FEs and keep itself
   synchronized with their state and with the data they exchange with
   the CE. The end goal of a passive FE is to act as a backup for the FE
   whose activities it is monitoring. The monitoring is trivial to
   achieve if multicast is used.  The synchronization may also happen
   via an FE-FE protocol or via the FE Manager. A passive FE may be
   called on by the FE manager to take over the functionality of the FE
   it is monitoring.

   SYN_SENT state:

   The FE starts the SYN timer and waits for a response from the CE. Two
   events could happen:

   1.  The timer expires. If the number of retries has not reached the
       maximum allowed value, then the SYN is retransmitted and the
       timer restarted. If the maximum number of retries has been
       reached with the last SYN transmission, then the FE notifies the
       FE manager and goes into INIT state.

   2.  A packet is received from the CE:

       *  A NACK packet to the sent SYN packet. Action: cancel the
          timer, inform the FE manager of the rejection reasons and go
          into INIT state.

       *  An ACK packet to the sent SYN packet. Action: update the FE
          manager and go into EST state.

   EST state:


   This is the established state where normal ForCES communication
   starts.

   Several events may force the FE to transition out of the EST state:

   1.  The FE manager requests it to. In this case the FE will issue a
       FIN with an ACK request to the CE and transition to the FIN_SENT
       state.

   2.  The CE asks it to leave. This is considered a reset of the FE.
       The FE receives a FIN from the CE to inform it to leave. The FE
       immediately informs the FE manager, sends a FIN and goes into
       INIT state.

   3.  The CE restarts and sends a broadcast SYN. This may be caused by
       either the CE manager restarting the CE to clear its state or
       the CE dying and being restarted. Control of restarting the CE
       and of its association with the CE manager is out of scope for
       ForCES. Upon receiving the broadcast SYN, the FE assumes the CE
       has no knowledge of any state the FE is in and transitions into
       the INIT state after informing the FE manager.

   Additionally, not discussed here are optional heartbeats from the CE
   to the FE. If the FE does not see heartbeats within a timeout period,
   then the transition to the INIT state will be made.

   FIN_SENT state:

   Two events could happen:

   1.  The timer expires. If the number of retries has not reached the
       maximum allowed value then the FIN is retransmitted and timer
       restarted. If the maximum number of retries has been reached with
       the last FIN transmission then the FE notifies the FE manager and
       goes into INIT state.

   2.  A valid FIN|ACK packet is received from the CE. Action: cancel
       the timer, inform the FE manager and go into INIT state.
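   The FE state transitions described above can be sketched as a small
   event-driven state machine. This is a minimal illustration, not a
   normative implementation: the event names ("start", "timeout",
   "recv_syn_ack", and so on) and the class itself are hypothetical,
   and timers are modelled simply as "timeout" events delivered by the
   caller.

```python
from enum import Enum


class State(Enum):
    INIT = "INIT"
    SYN_SENT = "SYN_SENT"
    EST = "EST"
    FIN_SENT = "FIN_SENT"


class FeStateMachine:
    """FE-side association state, driven by illustrative event names."""

    def __init__(self, max_syn_retries: int, max_fin_retries: int):
        self.state = State.INIT
        self.max_retries = {State.SYN_SENT: max_syn_retries,
                            State.FIN_SENT: max_fin_retries}
        self.retries = 0
        self.sent = []       # messages the FE would transmit
        self.notified = []   # reports made to the FE manager

    def _to_init(self, reason: str) -> None:
        self.notified.append(reason)
        self.state = State.INIT

    def handle(self, event: str) -> State:
        s = self.state
        if s is State.INIT and event == "start":
            self.sent.append("SYN")
            self.retries, self.state = 0, State.SYN_SENT
        elif s in (State.SYN_SENT, State.FIN_SENT) and event == "timeout":
            if self.retries < self.max_retries[s]:
                self.retries += 1          # retransmit and restart the timer
                self.sent.append("SYN" if s is State.SYN_SENT else "FIN")
            else:
                self._to_init("max retransmit")
        elif s is State.SYN_SENT and event == "recv_nack":
            self._to_init("rejected")
        elif s is State.SYN_SENT and event == "recv_syn_ack":
            self.notified.append("joined")
            self.state = State.EST
        elif s is State.EST and event == "manager_leave":
            self.sent.append("FIN")
            self.retries, self.state = 0, State.FIN_SENT
        elif s is State.EST and event == "recv_fin":
            self.notified.append("reset by CE")
            self.sent.append("FIN")
            self.state = State.INIT
        elif s is State.EST and event == "recv_syn_broadcast":
            self._to_init("CE restarted")
        elif s is State.FIN_SENT and event == "recv_fin_ack":
            self._to_init("left")
        return self.state              # unknown events leave the state as is
```

   With a retry limit of one, a "start" followed by two "timeout"
   events sends the SYN twice and then returns the FE to INIT.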


8.3.2 CE view of FE State transitions

   This is per-FE information maintained on the CE side.


        wait
       for FE
        .->-.
        ^    Y
        |    |           recvd SYN       +---------+  setup complete
        ^   Y  +->----->----->---------->|SYN_RCVD |---->>>----+
         \ Y   |                         |         |           Y
       +---------+                       |  state  |      +--------+
       | INIT    |                       +---------+      | EST    |
       |         |                                        |  State |
       | State   |<-+                     +---------+     |        |
       +---------+  |   recvd FIN|ACK     | FIN_SENT|     +--------+
          ^         +----<---<----------<-|         |       Y     Y
          |                               | State   |--<-<--+     |
          ^                               +---------+   Send FIN  Y
          |                                ^    Y                 |
          |                                |    |                 |
          |                                +-<--+                 |
          |                                 FIN                   |
          |                                retrans                |
          |                                                       |
          |        recvd FIN|ACK  or recvd SYN                    Y
          +-<---<---------<-------<---------<-------<-------------+


   INIT state:

   When the CE Manager informs the CE of an FE, basic state information
   is created for the FE and it is placed into the INIT state. At this
   point the CE has been informed by the CE Manager of the following:

   o  the bundle the FE will join,

   o  the PID that the FE is going to use to refer to the CE,

   o  the unicast PID of the FE,

   o  the number of retries for the SYN transmission and the SYN timer,

   o  the number of retries for the FIN transmission and the FIN timer
      value,

   o  the expected timeout before the FE joins and the number of such
      timeouts to wait for the FE,

   o  whether the FE is interested in restart information if available
      (refer to the FIN_SENT state).

   The CE starts a timer waiting for the FE to join. Two things could
   happen:

   1.  The timer expires. If the number of retries for waiting for the
       FE to join has not reached the maximum allowed value then the
       timer is restarted. If the maximum number of retries is reached
       then the CE deletes the FE's state information and informs the
       CE manager.

   2.  A valid SYN packet is received from the FE. The CE transitions
       into the SYN_RCVD state.

   SYN_RCVD state:

   In this state the CE will do any necessary processing to prepare for
   the FE to be admitted into the NE. The CE issues a SYN|ACK and moves
   into the EST state.

   EST state:

   This is the established state where normal ForCES communication
   starts. Several events may force the CE to transition out of the EST
   state:

   1.  The CE manager requests it to. In this case the CE will issue a
       FIN with an ACK request to the FE and transition to the FIN_SENT
       state.

   2.  The FE leaves. This is considered a reset of the FE. The FE sends
       a FIN to the CE to inform it that it is leaving. The CE
       immediately sends a FIN|ACK and notifies the CE manager.
       Transition is made to the INIT state.

   Not discussed here is the use of heartbeats or other events (e.g.,
   link down) to transition to the INIT state on discovery that the FE
   is dead.

   FIN_SENT state:

   The CE starts the FIN timer and waits for a response from the FE.

   Two events could happen:

   1.  The timer expires. If the number of retries has not reached the
       maximum allowed value then the FIN is retransmitted and the timer
       restarted. If the maximum number of retries has been reached with
       the last FIN transmission then the CE notifies the CE manager and
       goes into INIT state.

   2.  A valid FIN|ACK packet is received from the FE. Action: cancel
       the timer, inform the CE manager and go into INIT state.

   For states that transition to the INIT state, observe that if the FE
   comes back and joins before the FE expiry time, its LFB state(s)
   would still be intact and may be resent to it (the restart policy is
   agreed on at pre-association time). On the other hand, the state will
   be garbage collected if no SYNs from the FE are seen within the
   period, or if new SYNs are seen but the FEM-CEM interface indicates
   no interest in the restart data.
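   The CE's per-FE view can be summarized as a transition table. The
   sketch below is illustrative only: the event and action strings are
   hypothetical labels for the transitions described in this section,
   and the "recv_syn" entry for the EST state is taken from the diagram
   rather than from the text.

```python
# (state, event) -> (next state, action); names are illustrative labels
CE_TRANSITIONS = {
    ("INIT", "recv_syn"): ("SYN_RCVD", "prepare to admit FE"),
    ("SYN_RCVD", "setup_complete"): ("EST", "send SYN|ACK"),
    ("EST", "manager_remove"): ("FIN_SENT", "send FIN, start FIN timer"),
    ("EST", "recv_fin"): ("INIT", "send FIN|ACK, notify CE manager"),
    ("EST", "recv_syn"): ("INIT", "per the diagram"),
    ("FIN_SENT", "recv_fin_ack"): ("INIT", "cancel timer, notify CE manager"),
    ("FIN_SENT", "fin_retries_exhausted"): ("INIT", "notify CE manager"),
}


def step(state: str, event: str):
    """Return (next_state, action); unknown events leave the state as is."""
    return CE_TRANSITIONS.get((state, event), (state, "ignore"))


def on_wait_timeout(waits_so_far: int, max_waits: int) -> str:
    """INIT-state timer expiry: keep waiting, or drop the per-FE state."""
    if waits_so_far < max_waits:
        return "restart timer"
    return "delete FE state, notify CE manager"
```

   Keeping the table as data makes it easy to compare against the
   diagram one arrow at a time.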

8.3.3 SYN Message Format

   A SYN message contains a base Netlink2 header (refer to Section 5.1)
   with the appropriate flags followed by the Extension TLV Name ID
   (refer to Section 8.1.5). The Name ID carries the name the FE wishes
   to be known as.


8.3.4 FIN Message Format

   A FIN message contains a base Netlink2 header with the appropriate
   flags (refer to Section 5.1).


8.3.5 NOOP Message Format

   A NOOP message contains a base Netlink2 header with the appropriate
   flags (refer to Section 5.1) set. The NOOP carries no execution
   message and therefore no operations on LFBs are carried out as a
   result of receiving it. The flags of the message are still relevant.

   A standard use of the NOOP message is for heartbeats. A CE may send
   keepalive messages to LFBs using the NLMSG_NOOP command. When
   requesting replies, the CE sets the NLM_F_ECHO flag so that the
   message is sent back to it as is (essentially a loopback of the exact
   same message sans the ECHO flag).
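   The heartbeat exchange can be sketched as below. This is a rough
   illustration under stated assumptions: it uses the 16-byte Netlink
   header layout of [RFC3549] (length, type, flags, sequence, PID) in
   place of the Netlink2 base header of Section 5.1, which may differ;
   the NLMSG_NOOP and NLM_F_ECHO values are those of the Linux Netlink
   headers; network byte order is assumed; and the function names are
   hypothetical.

```python
import struct

NLMSG_NOOP = 0x1    # value from Linux <linux/netlink.h>
NLM_F_ECHO = 0x08   # value from Linux <linux/netlink.h>
HDR = struct.Struct("!IHHII")  # length, type, flags, sequence, PID (assumed)


def make_heartbeat(seq: int, pid: int, want_echo: bool = True) -> bytes:
    """Build a NOOP message, optionally asking the peer to loop it back."""
    flags = NLM_F_ECHO if want_echo else 0
    return HDR.pack(HDR.size, NLMSG_NOOP, flags, seq, pid)


def echo_reply(msg: bytes):
    """Return the loopback copy to send, or None if no echo was requested."""
    length, mtype, flags, seq, pid = HDR.unpack_from(msg)
    if mtype != NLMSG_NOOP or not flags & NLM_F_ECHO:
        return None
    # Same message, with only the ECHO flag cleared.
    return HDR.pack(length, mtype, flags & ~NLM_F_ECHO, seq, pid) + msg[HDR.size:]
```

   Clearing the ECHO flag in the reply keeps the two ends from
   bouncing the same heartbeat back and forth indefinitely.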

8.4 LFB and FE Service Templates

   In this section we describe Service Templates used to configure FEs
   and LFBs as well as for async event notification as required by the
   ForCES WG charter.

   Some of these message templates are already described in the Netlink
   document ([RFC3549]) but are repeated here for clarity.


   A feature of Netlink2 is that the same message template is used for
   configuration, querying and events. In the CE->FE direction,
   configuration commands embedding the Service Templates described in
   this section are used to configure the FE (to add or delete a policy,
   for example). In the FE->CE direction, the templates are used to
   return query responses or to send events to the CE (on a per-LFB
   basis).

   As noted earlier, a single Netlink2 message may carry multiple
   service templates if the NLM_F_MS flag is set. This is not restricted
   to the config (CE->FE) only but also extends to responses or events
   (FE->CE).

8.4.1 Physical Port and Address Functions

   [TBF]


8.4.1.1 Interface Service Template

   This is very close to what the Port LFB is defined to be in the Model
   draft. Its expressive semantics are sufficient to define a physical
   port (regardless of the underlying physical links), virtual
   interface, etc.


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   Family    |   Reserved  |          Device Type              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     Interface Index                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Device Flags                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Change Mask                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   Family: 8 bits  This is always set to AF_UNSPEC.

   Device Type: 16 bits  This defines the type of the link.  The link
   could be Ethernet, PCI, a tunnel, etc.

   Interface Index: 32 bits  Uniquely identifies interface.

   Device Flags: 32 bits

   IFF_UP            Interface is administratively up.


   IFF_BROADCAST     Valid broadcast address set.

   IFF_DEBUG         Internal debugging flag.

   IFF_LOOPBACK      Interface is a loopback interface.

   IFF_POINTOPOINT   Interface is a point-to-point link.

   IFF_RUNNING       Interface is operationally up.

   IFF_NOARP         No ARP protocol needed for this interface.

   IFF_PROMISC       Interface is in promiscuous mode.

   IFF_NOTRAILERS    Avoid use of trailers.

   IFF_ALLMULTI      Receive all multicast packets.

   IFF_MASTER        Master of a load balancing bundle.

   IFF_SLAVE         Slave of a load balancing bundle.

   IFF_MULTICAST     Supports multicast.

   IFF_PORTSEL       Is able to select media type via ifmap.

   IFF_AUTOMEDIA     Auto media selection active.

   IFF_DYNAMIC       Interface was dynamically created.

   Change Mask: 32 bits  Reserved for future use.  Must be set to
   0xFFFFFFFF.


      Applicable attributes:
             IFLA_UNSPEC          Unspecified.
             IFLA_ADDRESS         Hardware address interface L2 address.
             IFLA_BROADCAST       Hardware address L2 broadcast
                                  address.
             IFLA_IFNAME          ASCII string device name.
             IFLA_MTU             MTU of the device.
             IFLA_LINK            ifindex of link to which this device
                                  is bound.
             IFLA_QDISC           ASCII string defining egress root
                                  queuing discipline.
             IFLA_STATS           Interface statistics.


      Netlink message types specific to this service:
      RTM_NEWLINK, RTM_DELLINK, and RTM_GETLINK
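   As an illustration, the fixed part of the Interface Service Template
   can be packed and its Device Flags decoded as follows. The flag bit
   values are those of the Linux interface flags; network byte order
   and the function names are assumptions of this sketch, not part of
   the template definition.

```python
import struct

AF_UNSPEC = 0
IFF_FLAGS = {  # bit values from Linux <linux/if.h>
    "IFF_UP": 0x1, "IFF_BROADCAST": 0x2, "IFF_DEBUG": 0x4,
    "IFF_LOOPBACK": 0x8, "IFF_POINTOPOINT": 0x10, "IFF_NOTRAILERS": 0x20,
    "IFF_RUNNING": 0x40, "IFF_NOARP": 0x80, "IFF_PROMISC": 0x100,
    "IFF_ALLMULTI": 0x200, "IFF_MASTER": 0x400, "IFF_SLAVE": 0x800,
    "IFF_MULTICAST": 0x1000, "IFF_PORTSEL": 0x2000,
    "IFF_AUTOMEDIA": 0x4000, "IFF_DYNAMIC": 0x8000,
}
# family, reserved, device type, interface index, device flags, change mask
IF_TEMPLATE = struct.Struct("!BBHIII")


def pack_interface(dev_type: int, index: int, flag_names) -> bytes:
    """Pack the template; Family is AF_UNSPEC and Change Mask 0xFFFFFFFF."""
    flags = 0
    for name in flag_names:
        flags |= IFF_FLAGS[name]
    return IF_TEMPLATE.pack(AF_UNSPEC, 0, dev_type, index, flags, 0xFFFFFFFF)


def decode_flags(template: bytes):
    """List the names of the Device Flags set in a packed template."""
    flags = IF_TEMPLATE.unpack_from(template)[4]
    return [name for name, bit in IFF_FLAGS.items() if flags & bit]
```

   For example, pack_interface(1, 3, ["IFF_UP", "IFF_RUNNING"])
   describes an Ethernet-type device (type 1 in Linux) with interface
   index 3 that is both administratively and operationally up.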


8.4.1.2 Address Service Template

   The expressive semantics of this template are sufficient to define
   addressing for a port LFB (physical or virtual interfaces) including
   secondary addresses. Although the template could also be used to
   configure other protocols such as IPX, this document focuses only on
   IPv4 and IPv6.


      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Family    |     Length    |     Flags     |    Scope      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Interface Index                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Family: 8 bits
      Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.

      Length: 8 bits
      The length of the address mask.

      Flags: 8 bits
      IFA_F_SECONDARY  For secondary address (alias interface).

      IFA_F_PERMANENT  For a permanent address set by the user.
                       When this is not set, it means the address
                       was dynamically created (e.g., by stateless
                       autoconfiguration).
      IFA_F_DEPRECATED Defines deprecated (IPv6) address.
      IFA_F_TENTATIVE  Defines tentative (IPv6) address (duplicate
                       address detection is still in progress).
      Scope: 8 bits
      The address scope in which the address stays valid.
             SCOPE_UNIVERSE: Global scope.
             SCOPE_SITE (IPv6 only): Only valid within this site.
             SCOPE_LINK: Valid only on this device.
             SCOPE_HOST: Valid only on this host.

   Applicable attributes:

            IFA_UNSPEC      Unspecified.
            IFA_ADDRESS     Raw protocol address of interface.
            IFA_LOCAL       Raw protocol local address.
            IFA_LABEL       ASCII string name of the interface.
            IFA_BROADCAST   Raw protocol broadcast address.
            IFA_ANYCAST     Raw protocol anycast address.
            IFA_CACHEINFO   Cache address information.

      Netlink messages specific to this service: RTM_NEWADDR,
      RTM_DELADDR, and RTM_GETADDR.
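   The following sketch shows how an IPv4 address could be expressed
   with this template followed by two attributes. It assumes the
   attribute encoding of [RFC3549] (16-bit length, 16-bit type, value
   padded to 32 bits), the Linux values for IFA_ADDRESS and IFA_LOCAL,
   and network byte order; the helper names are hypothetical.

```python
import socket
import struct

IFA_ADDRESS, IFA_LOCAL = 1, 2   # attribute types from Linux <linux/if_addr.h>
SCOPE_UNIVERSE = 0              # global scope (RT_SCOPE_UNIVERSE in Linux)


def pack_attr(attr_type: int, payload: bytes) -> bytes:
    """Encode one attribute; the length covers header and value, not padding."""
    length = 4 + len(payload)
    return struct.pack("!HH", length, attr_type) + payload + b"\x00" * (-length % 4)


def pack_ipv4_address(ifindex: int, addr: str, prefix_len: int,
                      flags: int = 0) -> bytes:
    """Template (family, length, flags, scope, index) plus address attributes."""
    raw = socket.inet_aton(addr)
    header = struct.pack("!BBBBI", socket.AF_INET, prefix_len, flags,
                         SCOPE_UNIVERSE, ifindex)
    return header + pack_attr(IFA_LOCAL, raw) + pack_attr(IFA_ADDRESS, raw)
```

   A secondary address would be expressed the same way with the
   IFA_F_SECONDARY bit set in the flags argument.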


8.4.2 IPv4 and IPv6 L3 Forwarding Functions

   In this section we describe two LFB templates necessary for IPv4 and
   IPv6 L3 forwarding control.

8.4.2.1 IPv4 and IPv6 Forwarding LFB Template

   The expressive semantics of this template are sufficient to describe
   any IPv4 or IPv6 route configuration including ability to express
   route entries for virtual routers within a physical router.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Family    |  Src length   |  Dest length  |     TOS       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  Table ID   |   Protocol    |     Scope     |     Type      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          Flags                              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Family: 8 bits
      Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.

      Src length: 8 bits
      Prefix length of source IP address.

      Dest length: 8 bits
      Prefix length of destination IP address.

      TOS: 8 bits
      The 8-bit TOS (should be deprecated to make room for DSCP).

      Table ID: 8 bits
      Table identifier.  Up to 255 route tables are supported.
                    RT_TABLE_UNSPEC    An unspecified routing table.
                    RT_TABLE_DEFAULT   The default table.
                    RT_TABLE_MAIN      The main table.
                    RT_TABLE_LOCAL     The local table.

                    The user may assign arbitrary values between
                    RT_TABLE_UNSPEC(0) and RT_TABLE_DEFAULT(253).

      Protocol: 8 bits
      Identifies what/who added the route.
                    Protocol          Route origin.
                    ..............................................
                    RTPROT_UNSPEC     Unknown.
                    RTPROT_REDIRECT   By an ICMP redirect.
                    RTPROT_KERNEL     By the kernel.
                    RTPROT_BOOT       During bootup.
                    RTPROT_STATIC     By the administrator.

      Values larger than RTPROT_STATIC(4) are not interpreted by the
      kernel; they are just for user information.  They may be used to
      tag the source of routing information or to distinguish between
      multiple routing daemons.

      Scope: 8 bits
      Route scope (valid distance to destination).
                    RT_SCOPE_UNIVERSE   Global route.
                    RT_SCOPE_SITE       Interior route in the
                                        local autonomous system.
                    RT_SCOPE_LINK       Route on this link.
                    RT_SCOPE_HOST       Route on the local host.
                    RT_SCOPE_NOWHERE    Destination does not exist.

      The values between RT_SCOPE_UNIVERSE(0) and RT_SCOPE_SITE(200)
      are available to the user.

      Type: 8 bits
      The type of route.

                    Route type        Description
                    ----------------------------------------------------
                    RTN_UNSPEC        Unknown route.
                    RTN_UNICAST       A gateway or direct route.
                    RTN_LOCAL         A local interface route.
                    RTN_BROADCAST     A local broadcast route
                                      (sent as a broadcast).
                    RTN_ANYCAST       An anycast route.
                    RTN_MULTICAST     A multicast route.
                    RTN_BLACKHOLE     A silent packet dropping route.
                    RTN_UNREACHABLE   An unreachable destination.
                                      Packets dropped and host
                                      unreachable ICMPs are sent to the
                                      originator.
                    RTN_PROHIBIT      A packet rejection route.  Packets
                                      are dropped and communication
                                      prohibited ICMPs are sent to the
                                      originator.
                    RTN_THROW         When used with policy routing,
                                      continue routing lookup in another
                                      table.  Under normal routing,
                                      packets are dropped and net
                                      unreachable ICMPs are sent to the
                                      originator.
                    RTN_NAT           A network address translation
                                      rule.
                    RTN_XRESOLVE      Refer to an external resolver (not
                                      implemented).

      Flags: 32 bits
      Further qualify the route.
                    RTM_F_NOTIFY     If the route changes, notify the
                                     user.
                    RTM_F_CLONED     Route is cloned from another route.
                    RTM_F_EQUALIZE   Allow randomization of next hop
                                     path in multi-path routing
                                     (currently not implemented).

      Attributes applicable to this service:
                    Attribute       Description
                    ---------------------------------------------------
                    RTA_UNSPEC      Ignored.
                    RTA_DST         Protocol address for route
                                    destination address.
                    RTA_SRC         Protocol address for route source
                                    address.
                    RTA_IIF         Input interface index.
                    RTA_OIF         Output interface index.
                    RTA_GATEWAY     Protocol address for the gateway of
                                    the route.
                    RTA_PRIORITY    Priority of route.
                    RTA_PREFSRC     Preferred source address in cases
                                    where more than one source address
                                    could be used.
                    RTA_METRICS     Route metrics attributed to route
                                    and associated protocols (e.g.,
                                    RTT, initial TCP window, etc.).
                    RTA_MULTIPATH   Multipath route next hop's
                                    attributes.
                    RTA_PROTOINFO   Firewall based policy routing
                                    attribute.
                    RTA_FLOW        Route realm.
                    RTA_CACHEINFO   Cached route information.

      Additional Netlink message types applicable to this service:
      RTM_NEWROUTE, RTM_DELROUTE, and RTM_GETROUTE
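   A minimal sketch of packing the fixed part of the forwarding
   template is given below. The field order follows the diagram in this
   section (Src length before Dest length); note that the Linux rtmsg
   structure places the destination length first, so an implementation
   would need to settle which order applies. The RT_TABLE_MAIN and
   RTN_UNICAST values are the Linux ones, network byte order is
   assumed, and the function names are hypothetical.

```python
import socket
import struct

RT_TABLE_MAIN = 254      # Linux value; the text gives RT_TABLE_DEFAULT as 253
RTPROT_STATIC = 4        # value given in the text
RT_SCOPE_UNIVERSE = 0    # value given in the text
RTN_UNICAST = 1          # Linux value

ROUTE_FIELDS = ("family", "src_len", "dst_len", "tos",
                "table", "protocol", "scope", "type", "flags")
ROUTE_TEMPLATE = struct.Struct("!8BI")   # eight 8-bit fields, 32-bit flags


def pack_route(dst_len, src_len=0, tos=0, table=RT_TABLE_MAIN,
               protocol=RTPROT_STATIC, scope=RT_SCOPE_UNIVERSE,
               rtype=RTN_UNICAST, flags=0) -> bytes:
    """Pack an IPv4 route template; attributes such as RTA_DST would follow."""
    return ROUTE_TEMPLATE.pack(socket.AF_INET, src_len, dst_len, tos,
                               table, protocol, scope, rtype, flags)


def unpack_route(data: bytes) -> dict:
    """Decode the fixed part of a route template into named fields."""
    return dict(zip(ROUTE_FIELDS, ROUTE_TEMPLATE.unpack_from(data)))
```

   The defaults describe a statically configured unicast route in the
   main table, which is the common case for a route pushed by the CE.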


8.4.2.2 Neighbor Discovery LFB Template

   The expressive semantics for this config are sufficient to describe
   both IPv4 neighbor resolution via ARP and IPv6 neighbor discovery
   (RFC2461).


      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Family    |    Reserved1  |           Reserved2           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Interface Index                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           State             |     Flags     |     Type      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Family: 8 bits
      Address Family: AF_INET for IPv4; and AF_INET6 for IPv6.

      Interface Index: 32 bits
      The unique interface index.

      State: 16 bits
      A bitmask of the following states:
                    NUD_INCOMPLETE   Still attempting to resolve.
                    NUD_REACHABLE    A confirmed working cache entry.
                    NUD_STALE        An expired cache entry.
                    NUD_DELAY        Neighbor no longer reachable.
                                     Traffic sent, waiting for
                                     confirmation.
                    NUD_PROBE        A cache entry that is currently
                                     being re-solicited.
                    NUD_FAILED       An invalid cache entry.
                    NUD_NOARP        A device which does not do neighbor
                                     discovery (ARP).
                    NUD_PERMANENT    A static entry.

      Flags: 8 bits
                    NTF_PROXY        A proxy ARP entry.
                    NTF_ROUTER       An IPv6 router.


      Attributes applicable to this service:

                    NDA_UNSPEC      Unknown type.
                    NDA_DST         A neighbor cache network
                                    layer destination address.
                    NDA_LLADDR      A neighbor cache link layer
                                    address.
                    NDA_CACHEINFO   Cache statistics.

      Additional Netlink message types applicable to this service:
      RTM_NEWNEIGH, RTM_DELNEIGH, and RTM_GETNEIGH.


8.4.3 Filtering Functions

   TBF

8.4.4 QoS Functions

   TBF

8.4.5 IPSEC Functions

   TBF

8.4.6 Packet redirection Functions

   TBF

8.4.7 Packet Mirroring Functions

   TBF

8.4.8 Packet Sampling Functions

   TBF

8.5 Security Considerations

   CEs may communicate vital and possibly confidential information to
   FEs via the ForCES protocol. For example, such information can be
   filtering rules or secret encryption keys. In addition, the ForCES
   protocol should not open new possibilities for Denial of Service
   attacks. In a single-box environment, the interconnect between CEs
   and FEs can be physically secured. ForCES messages arriving on
   physical ports that are not part of the interconnect are dropped. In
   such an environment, protection is required only against
   data-packet-based DoS attacks. A multi-hop environment places more
   requirements in terms of security. Protection against
   Netlink2-SYN-flood attacks becomes necessary. In addition, some or
   all of the ForCES messages may have to be authenticated or encrypted.

8.5.1 Denial of Service (DoS) attacks

   Preventing DoS attacks resulting from data packets redirected by the
   FE to the CE can be achieved by shaping according to configurable
   parameters such as a maximum rate.

   A data-packet DoS-resistant FE MUST therefore support the necessary
   LFBs that allow placing policers to shape the traffic redirected to
   the CE by an FE.
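   One common way to enforce a configurable maximum rate is a token
   bucket. The sketch below is only an example of such a policer; the
   draft does not prescribe an algorithm, and the class name, the
   parameters and the choice of bytes per second as the unit are
   assumptions of this illustration.

```python
class TokenBucketPolicer:
    """Limit redirected traffic to a sustained rate with a bounded burst."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0             # time of the previous decision

    def allow(self, packet_len: int, now: float) -> bool:
        """Return True if the packet may be redirected to the CE at time now."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False                # over the configured rate: drop
```

   The caller supplies the current time, which keeps the policer easy
   to test and independent of any particular clock source.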

   Preventing DoS attacks at the ForCES protocol level (such as Netlink2
   SYN flood) may be necessary if the underlying transport protocol is
   not resistant to such attacks. This can be the case if UDP is used,
   for instance. In the case of TCP and SCTP, cookie-based mechanisms
   already exist to prevent SYN flood DoS attacks (refer to the
   respective RFCs and [TCP-SYN-COOKIES]).

   A SYN-flood DoS-resistant FE or CE MUST therefore support a
   Netlink2-Extension Cookie TLV (TLV_TYPE = NL2_COOKIE). This Cookie
   TLV is placed in the ACK message that acknowledges a SYN message.
   This Cookie TLV MUST be returned as is in the SYNACK message. (Note:
   content and length of the Cookie TLV remain to be standardized, if
   necessary).
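   Since the content and length of the Cookie TLV are not yet defined,
   the following is purely a hypothetical illustration of how a cookie
   could be generated and verified without keeping per-peer state. The
   HMAC-SHA-256 construction, the 16-byte length, the peer identifier
   and the coarse "epoch" counter are all assumptions of this sketch.

```python
import hashlib
import hmac


def make_cookie(secret: bytes, peer_id: bytes, epoch: int) -> bytes:
    """Derive a cookie from a local secret, the peer identity and an epoch."""
    msg = epoch.to_bytes(8, "big") + peer_id
    return hmac.new(secret, msg, hashlib.sha256).digest()[:16]


def check_cookie(secret: bytes, peer_id: bytes, epoch: int,
                 cookie: bytes) -> bool:
    """Accept cookies minted in the current or the previous epoch."""
    return any(hmac.compare_digest(cookie, make_cookie(secret, peer_id, e))
               for e in (epoch, epoch - 1) if e >= 0)
```

   Because the cookie can be recomputed from the returned message, the
   issuer need not allocate any state until a valid cookie comes back,
   which is what blunts a SYN flood.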

8.5.2 Authentication and Encryption

   To perform authentication, the necessary information, such as shared
   secrets or public and private keys, may be configured statically. In
   a dynamic environment, on the other hand, public keys may have to be
   distributed using certificates. Such certificates must contain names
   that are uniquely and permanently assigned to CEs and FEs. Addresses
   used for routing ForCES messages may change and are therefore not
   suitable for that purpose. ForCES qualified names (Note: these need
   to be defined in a draft of their own) MUST be used, similarly to
   iSCSI qualified names [iSCSI-NAMING].


References

   [Diffserv]
              "Linux Diffserv", <http://diffserv.sourceforge.net>.

   [ForCES_Model]
              Yang, L., Halpern, J., Gopal, R., DeKok, A., Haraszti, Z.
              and S. Blake, "ForCES Forwarding Element Model", October
              2003, <http://www.ietf.org/internet-drafts/
              draft-ietf-forces-model-01.txt>.

   [ForCES_REQ]
              Khosravi, H. and T. Anderson, "Requirements for Separation
              of IP Control and Forwarding", October 2003, <http://
              www.ietf.org/internet-draft/
              draft-ietf-forces-requirements-07.txt>.

   [Goutaudier]
              Goutaudier, G., "Enhancements and Prototype Implementation
              of the ForCES Netlink2 Protocol, IBM Research Report
              RZ3482", September 2003, <http://domino.watson.ibm.com/
              library/cyberdig.nsf/
              papers?SearchView&Query=(goutaudier)>.

   [Netfilter]
              "Linux Netfilter", <http://www.netfilter.org>.

   [RFC1157]  Case, J., Fedor, M., Schoffstall, M. and C. Davin, "Simple
              Network Management Protocol (SNMP)", May 1990, <http://
              www.rfc-editor.org/rfc/rfc1157.txt>.

   [RFC1633]  Braden, R., Clark, D. and S. Shenker, "Integrated Services
              in the Internet Architecture: an Overview", June 1994,
              <http://www.rfc-editor.org/rfc/rfc1633.txt>.

   [RFC1812]  Baker, F., "Requirements for IP Version 4 Routers", June
              1995, <http://www.rfc-editor.org/rfc/rfc1812.txt>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2328]  Moy, J., "OSPF Version 2", April 1998, <http://
              www.rfc-editor.org/rfc/rfc2328.txt>.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Weiss, W.
              and Z. Wang, "An Architecture for Differentiated
              Services", December 1998, <http://www.rfc-editor.org/rfc/
              rfc2475.txt>.


   [RFC2748]  Boyle, J., Cohen, R., Durham, D., Herzog, S., Rajan, R.
              and A. Sastry, "The COPS (Common Open Policy Service)
              Protocol", January 2000, <http://www.rfc-editor.org/rfc/
              rfc2748.txt>.

   [RFC2844]  Przygienda, T., Droz, P. and R. Haas, "OSPF over ATM and
              Proxy-PAR", May 2000, <http://www.rfc-editor.org/rfc/
              rfc2844.txt>.

   [RFC3036]  Andersson, L., Doolan, P., Feldman, N., Fredette, A. and
              B. Thomas, "LDP Specification", January 2001, <http://
              www.rfc-editor.org/rfc/rfc3036.txt>.

   [RFC3292]  Doria, A., "General Switch Management Protocol (GSMP) V3",
              June 2002, <http://www.rfc-editor.org/rfc/rfc3292.txt>.

   [RFC3358]  Przygienda, T., "Optional Checksums in Intermediate System
              to Intermediate System (ISIS)", August 2002, <http://
              www.rfc-editor.org/rfc/rfc3358.txt>.

   [RFC3549]  Hadi Salim, J., Khosravi, H., Kleen, A. and A. Kuznetsov,
              "Linux Netlink as an IP Services Protocol", July 2003,
              <http://www.rfc-editor.org/rfc/rfc3549.txt>.

   [Stevens]  Wright, G. and W. Stevens, "TCP/IP Illustrated Volume 2,
              Chapter 20", June 1995.

   [TCP-SYN-COOKIES]
              Bernstein, D., "SYN cookies", 1997, <http://cr.yp.to/
              syncookies.html>.

   [XTP]      "Xpress Transport Protocol Specification, XTP Revision
              4.0", March 1995.

   [iSCSI-NAMING]
              "iSCSI Naming and Discovery,
              draft-ietf-ips-iscsi-name-disc-10.txt", June 2003,
              <http://www.ietf.org/internet-drafts/
              draft-ietf-ips-iscsi-name-disc-10.txt>.



Authors' Addresses

   Jamal Hadi Salim
   Znyx Networks
   195 Stafford Rd. West
   Ottawa, Ontario
   Canada

   EMail: hadi@znyx.com


   Robert Haas
   IBM Research
   Zurich Research Laboratory
   Saeumerstrasse 4
   CH-8803 Rueschlikon,
   Switzerland

   EMail: rha@zurich.ibm.com


   Steven Blake
   Ericsson
   920 Main Campus Drive, Suite 500
   Raleigh, NC  27606
   USA

   EMail: steven.blake@ericsson.com


Appendix A. Sample Service Hierarchy

   In the diagram below we show a simple IP service, foo, and the
   interaction it has between CP and FE components for the service
   (labels 1-3).

   The diagram is also used to demonstrate CP<->FE addressing.  In
   this section we illustrate only the addressing semantics.  In
   Appendix B, the diagram is referenced again to define the protocol
   interaction between service foo's CPC and LFB (labels 4-10).


     CP
    [--------------------------------------------------------.
    |   .-----.                                              |
    |  |       \                 . --------.                 |
    |  |  CLI   |               /           \                |
    |  |        |              | CP protocol |               |
    |   \      /->> -.         |  component  | <-.           |
    |    \__ _/      |         |   For       |   |           |
    |                |         | IP service  |   ^           |
    |                Y         |    foo      |   |           |
    |                |          \___________/    ^           |
    |                Y   1,4,6,8,9 /  ^ 2,5,10   | 3,7       |
     --------------- Y------------/---|----------|-----------
                     |           ^    |          ^
                   **|***********|****|**********|**********
                   ************* Netlink2 layer ************
                   **|***********|****|**********|**********
           FE        |           |    ^          ^
           .-------- Y-----------Y----|--------- |----.
           |         \           |              /     |
           |          \          Y            /       |
           |          .\ --------^-------.  /         |
           |          |FE component/module|/          |
           |          |  for IP Service   |           |
    --->---|------>---|     foo           |----->-----|------>--
           |           -------------------            |
           |                                          |
           |                                          |
            ------------------------------------------


   The control plane protocol for IP service foo does the following to
   connect to its FE counterpart. The steps below are also numbered in
   the diagram above.

   1.  Connect to IP service foo through a socket connect.  A typical
       connection would be via a call to: socket(AF_NETLINK, SOCK_RAW,
       NETLINK_FOO)

   2.  Bind to listen to specific async events for service foo

   3.  Bind to listen to specific async FE events
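
   On a Linux host, the socket and bind calls in the steps above might
   look like the following sketch.  NETLINK_FOO is not a real Netlink
   protocol number, so NETLINK_ROUTE stands in for it here; the helper
   name and the use of a multicast group bitmask to select
   asynchronous events are assumptions borrowed from Linux Netlink
   [RFC3549], not Netlink2 definitions.

```python
import socket

# Stand-in for the hypothetical NETLINK_FOO protocol number (assumption).
NETLINK_FOO = getattr(socket, "NETLINK_ROUTE", 0)


def open_service_socket(protocol, groups=0):
    """Open a Netlink socket and bind it to a multicast group bitmask.

    groups=0 gives a plain request/response channel (step 1); a non-zero
    bitmask subscribes to asynchronous events (steps 2 and 3).
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, protocol)
    # A PID of 0 asks the kernel to assign a unique address to this socket.
    sock.bind((0, groups))
    return sock
```

   A CPC would typically keep one such socket per channel, for example
   open_service_socket(NETLINK_FOO) for requests and a second socket
   bound to the event groups it is interested in.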

   Note that a wrapper socket can be created on top of the real sockets:
   depending on the destination PID given, it chooses the most
   appropriate socket to send the packet onto (for example, if there are
   two multicast groups, one for all FEs and one for all FEs and CEs, a
   packet from the CE to the FEs will use the first multicast group).
   The wrapper socket basically maps a message to the most appropriate
   wire in the bundle.
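
   A minimal sketch of such a wrapper is given below.  The PID values
   reserved for the two multicast groups, the class name and the
   dictionary-based lookup are assumptions made for illustration; the
   underlying objects only need to provide a send() method.

```python
# Hypothetical PIDs reserved for the two multicast groups (assumptions).
PID_ALL_FES = 0xFFFF0001
PID_ALL_FES_AND_CES = 0xFFFF0002


class WrapperSocket:
    """Maps a destination PID to the most appropriate underlying socket."""

    def __init__(self, unicast, multicast):
        self._unicast = dict(unicast)      # destination PID -> unicast socket
        self._multicast = dict(multicast)  # group PID -> multicast socket

    def select(self, dest_pid):
        """Return the socket ("wire") to use for this destination PID."""
        if dest_pid in self._multicast:
            return self._multicast[dest_pid]
        try:
            return self._unicast[dest_pid]
        except KeyError:
            raise LookupError("no socket for PID %#x" % dest_pid) from None

    def send(self, dest_pid, data):
        """Send data on whichever socket matches the destination PID."""
        return self.select(dest_pid).send(data)
```

   A CE addressing all FEs would call send(PID_ALL_FES, data), and the
   wrapper would pick the FE-only multicast socket rather than the one
   shared with other CEs.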


Appendix B. Sample Protocol for the foo IP Service

   Our proverbial IP service "foo" is used again to demonstrate how one
   can deploy a simple IP service control using Netlink2.

   These steps are continued from Appendix A (hence the numbering).

   4.  query for current config of FE component

   5.  receive response to 4) via channel on 3)

   6.  query for current state of IP service foo

   7.  receive response to 6) via channel on 2)

   8.  register the protocol specific packets you would like the FE to
       forward to you

   9.  send specific service foo commands and receive responses for them
       if needed


B.1 Interacting with Other IP Services

   The diagram in Appendix A shows another control component configuring
   the same service.  In this case, it is a proprietary Command Line
   Interface. The CLI may or may not be using the Netlink protocol to
   communicate with the foo component. If the CLI issues commands that
   affect the policy of the LFB for service "foo", then the "foo" CPC is
   notified.  It could then make algorithmic decisions based on this
   input.  For example, if an FE allowed one service to delete policies
   installed by a different service, and a policy that foo installed
   was deleted by service bar, there might be a need to propagate this
   to all the peers of service "foo".


Appendix C. Examples

   In this example we show a simple Netlink2 configuration message sent
   from a TC CPC to an egress TC FIFO queue.  This queue algorithm is
   based on packet counting and drops packets when the queue length
   exceeds the configured limit (100 packets in the example policy
   below). We assume the queue is in a hierarchical setup with a parent
   of 100:0 and a classid of 100:1, and that it is to be installed on
   the device with an ifindex of 4.


     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                     0               1               2             3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Version    |    Flags_E    |             Length            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Type (RTM_NEWQDISC)           | Flags (NLM_F_EXCL |           |
    |                               |NLM_F_CREATE | NLM_F_REQUEST)  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Sequence Number (arbitrary number)       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           Source PID                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Destination PID                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Type == NL2_SERVICE           |   Outer Length                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Type == NL2_QDISC             |   Inner Length                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Family(AF_INET)|  Reserved1    |         Reserved1             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     Interface Index  (4)                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Qdisc handle  (0x1000001)                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     Parent Qdisc   (0x1000000)                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        TCM Info  (0)                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |            Type (TCA_KIND)    |          Length(4)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Value ("pfifo")                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |            Type (TCA_OPTIONS) |          Length(4)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Value (limit=100)                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
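
   The field layout in the figure above can be sketched in code as
   follows.  This is an illustration under several assumptions:
   network byte order on the wire, TLV and attribute lengths that
   count the 4-byte header plus the value in bytes (the rtnetlink
   convention of [RFC3549], so the computed lengths need not match the
   illustrative values in the figure), and placeholder values for the
   version and for the NL2_SERVICE and NL2_QDISC types, which are not
   assigned here.  The RTM_, NLM_F_, TCA_ and AF_INET values are those
   of Linux rtnetlink.

```python
import struct

# Values from Linux rtnetlink.
RTM_NEWQDISC = 36
NLM_F_REQUEST, NLM_F_EXCL, NLM_F_CREATE = 0x001, 0x200, 0x400
AF_INET = 2
TCA_KIND, TCA_OPTIONS = 1, 2

# Placeholder values, NOT assigned by this document (assumptions).
NL2_VERSION, NL2_SERVICE, NL2_QDISC = 2, 1, 2


def attr(attr_type, value):
    """Type/length/value attribute, padded to a 4-byte boundary."""
    length = 4 + len(value)
    pad = (4 - length % 4) % 4
    return struct.pack("!HH", attr_type, length) + value + b"\0" * pad


def tlv(tlv_type, payload):
    """Netlink2 TLV whose length covers its own header and payload."""
    return struct.pack("!HH", tlv_type, 4 + len(payload)) + payload


def build_pfifo_request(seq, src_pid, dst_pid, ifindex, handle, parent,
                        limit):
    """Build the RTM_NEWQDISC request shown in the figure."""
    # Family, two reserved fields, ifindex, handle, parent, TCM info (0).
    tcmsg = struct.pack("!BBHIIII", AF_INET, 0, 0, ifindex, handle,
                        parent, 0)
    attrs = (attr(TCA_KIND, b"pfifo\0") +
             attr(TCA_OPTIONS, struct.pack("!I", limit)))
    body = tlv(NL2_SERVICE, tlv(NL2_QDISC, tcmsg + attrs))
    flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL
    # Version, Flags_E (0), total length, type, flags, seq, src, dst.
    header = struct.pack("!BBHHHIII", NL2_VERSION, 0, 20 + len(body),
                         RTM_NEWQDISC, flags, seq, src_pid, dst_pid)
    return header + body
```

   For the example policy, the call would be build_pfifo_request(seq,
   src_pid, dst_pid, ifindex=4, handle=0x1000001, parent=0x1000000,
   limit=100), with the sequence number and PIDs chosen by the sender.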


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights. Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11. Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard. Please address the information to the IETF Executive
   Director.


Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assignees.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

