    Network Working Group                                R. R. Stewart
    INTERNET-DRAFT                                       Cisco Systems
                                                                L. Ong
                                                       Nortel Networks
                                                      January 31, 2001

                      SCTP Bakeoff Results and Issues
                <draft-stewart-ong-sctpbakeoff-sigtran-01.txt>

    Status of This Memo

    This document is an Internet-Draft and is in full conformance with
    all provisions of Section 10 of RFC 2026 [RFC2026]. Internet-Drafts
    are working documents of the Internet Engineering Task Force (IETF),
    its areas, and its working groups. Note that other groups may also
    distribute working documents as Internet-Drafts.

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.

    Abstract

    This document captures problems and issues discovered at SCTP
    bakeoffs and on the sigtran mailing list. It will be updated after
    each bakeoff, augmenting the existing draft with any new issues
    discovered during interoperability testing. Two basic sets of
    problems are identified by this draft: first, issues that need to be
    addressed when the next revision of SCTP is created, i.e. issues
    that should be remembered in a BIS document; second, issues that
    were found that are strictly implementation problems.

    Table of Contents

    1.0 Introduction
    2.0 Issues found with the specification
    2.1 Stream negotiation
    2.2 Chunk issues
    2.3 Heart beat issues
    2.4 Initialization Issues
    2.5 Data Transfer Issues
    2.6 Issues with Fast Retransmit
    3.0 Implementation issues found
    3.1 General Philosophy
    3.2 ICMP processing
    3.3 Unrecognized parameters in the INIT-ACK
    3.4 Concerns when you explicitly control the source address
    3.5 Issues with Shutdown
    3.6 What happens when the primary path breaks and is restored
    3.7 Issues with heartbeats
    3.8 Advice on talking with a broken implementation
    4.0 Acknowledgements
    5.0 Authors Addresses
    6.0 References

    1.0 Introduction

    This document captures problems and issues discovered at SCTP
    bakeoffs. It will be updated after each bakeoff, augmenting the
    existing draft with any new issues discovered during
    interoperability testing. Two basic sets of problems will be
    identified in this draft: first, issues that need to be addressed
    when the next revision of SCTP is defined, i.e. issues that should
    be documented in a BIS document; and second, issues that were found
    that are strictly implementation problems and would not be
    documented in the protocol specification.

    It is hoped that by capturing the issues various implementations
    have found, developers wishing to implement SCTP will be able to
    avoid repeating the mistakes of others. It is also hoped that this
    document can be an input into the applicability document for SCTP
    being worked on within the Sigtran working group.

    This document is divided into two parts. Section 2 details issues
    found at the bakeoffs that are clearly specification issues that
    need to be addressed. Section 3 details problems that various
    implementers have encountered in their implementations. Both
    sections use the following format:

    Problem/Issue: A summary description of the problem/issue.

    Description: A detailed description of the problem.

    Advice/Solution: A synopsis of the solution that needs to be applied
    to the specification or implementation.

    Found at: The bakeoff at which this issue arose, or when the issue
    was raised on the mailing list.

    2.0 Issues found with the specification

    This section captures issues that need to be addressed when the next
    revision of SCTP is defined. It is intended to capture each problem
    and, where possible, suggest a starting point for the specification
    changes. All changes here are suggestions that will be subject to
    full working group review at the time work on a BIS document
    begins.

    2.1 Stream negotiation

    Problem/Issue: Rules for filling in MIS and OS values in the INIT
    ACK are not sufficiently complete.

    Description: An implementation during the bakeoff responded to an
    INIT that contained an Outbound Streams value of 5 and a Maximum
    Inbound Streams value of 5 with a request for 65000 streams. The
    following drawing depicts this:

    Endpoint A --------INIT(OS=5,MIS=5)----------> Endpoint Z
            <------INIT-ACK(OS=65000,MIS=65000)---

    Endpoint A, upon receiving this response, refused to set up the
    association, interpreting Endpoint Z's response as confusion on its
    part about how many streams it was allowed to have.

    The specification is silent on this and should be clarified to
    provide guidance on whether this is acceptable or not.

    Advice/Solution: The specification needs to be clarified to prohibit
    this behavior since it is really not a legal response and can
    indicate that the sender misunderstood the limit set by the INIT. A
    possible clarification in section 3.3.3 of RFC2960 [RFC2960] under
    the description of the OS value would be to add a line such as:

    The OS value sent in the INIT-ACK MUST NOT be greater than the value
    found in the MIS of the received INIT to which the INIT-ACK is a
    response.
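
    The proposed rule can be sketched as a simple check. This is a
    minimal illustration only: the function names are hypothetical, and
    the clamping helper is an assumption in the spirit of section 3.1,
    not something RFC2960 or this draft mandates.

```python
def init_ack_os_is_legal(init_mis, init_ack_os):
    """Proposed rule: the OS in the INIT-ACK must not exceed the MIS
    carried in the INIT it answers."""
    return init_ack_os <= init_mis


def clamp_streams(init_os, init_mis, ack_os, ack_mis):
    """Assumed lenient alternative: instead of refusing the association,
    settle on the smaller value in each direction."""
    outbound = min(init_os, ack_mis)  # streams the INIT sender may use
    inbound = min(init_mis, ack_os)   # streams the INIT-ACK sender may use
    return outbound, inbound
```

    With the values from the drawing above, init_ack_os_is_legal(5,
    65000) is false, while a lenient Endpoint A could still settle on 5
    streams in each direction.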

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    2.2 Chunk issues

    Problem/Issue: When responding to an unknown parameter type it may
    not always be possible to determine the corresponding chunk type in
    which this was received.

    Description: Currently a responder to an unrecognized parameter is
    required to send an Operational Error when the upper bits of the
    parameter type indicate to do so. The operational error includes the
    entire TLV of the unrecognized parameter. But since parameter types
    are specific to chunk types (even though so far no duplicates have
    been assigned across chunk types), how does one tell which chunk the
    operational error is associated with if more than one chunk was
    bundled and sent together?

    Advice/Solution: Currently TLVs are only prevalent in the INIT and
    INIT-ACK. However, as the spec is extended this will become a
    problem. The unrecognized parameter error should be extended to
    include the chunk type in the next revision of the document.
    Alternatively, the working group could consider requiring that all
    parameter types be unique across the entire SCTP specification and
    all of its extensions.

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    Problem/Issue: When a Chunk, such as an INIT, is composed of many
    TLVs and terminates in a TLV, what is the chunk length supposed to
    be?

    Description: An implementation sent an INIT with some number of
    TLVs terminating in a non-word-aligned TLV. The padding in between
    all of the TLVs was included in the Chunk length, but the final 2
    bytes of padding (it was a 6 octet TLV) were not included in either
    the parameter length or the chunk length. When the target of this
    INIT received and processed it, it determined that the INIT was
    invalid since it thought the chunk length should reflect the
    padding of the last TLV.

    Advice/Solution: The specification is vague in its description of
    chunk lengths when terminating in a TLV that needs padding. It is
    clear that the padding is NOT included in the chunk length when
    placed on the end of a chunk. It is also clear that the parameter
    length does not include any padding, but when combining multiple
    TLVs in a chunk the padding in between TLVs must be included in
    the chunk length. The specification needs to be clarified for this
    one case. It is recommended that RFC2960 be clarified to state
    clearly that in this case, the padding size of the final parameter
    should be left off of both the chunk length and the parameter
    length. It should also be noted that a robust implementation will
    accept the chunk whether or not the size of this final padding is
    included in the chunk length. This follows the general philosophy
    of being liberal in what you accept and conservative in what you
    send.
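
    The recommended length calculation can be illustrated as follows.
    This is a sketch under stated assumptions: the helper names are
    hypothetical, and header_len stands for the chunk header plus any
    fixed fields (20 octets is assumed for an INIT in the usage note).

```python
def padded(length):
    """Round a length up to the next multiple of 4 octets."""
    return (length + 3) & ~3


def chunk_length(header_len, param_lengths):
    """Chunk Length that counts the padding between TLVs but leaves
    off the padding after the final TLV."""
    if not param_lengths:
        return header_len
    body = sum(padded(n) for n in param_lengths[:-1]) + param_lengths[-1]
    return header_len + body


def acceptable_chunk_lengths(header_len, param_lengths):
    """Lengths a liberal receiver might accept: with or without the
    final padding counted."""
    strict = chunk_length(header_len, param_lengths)
    return {strict, padded(strict)}
```

    For a chunk with 20 octets of header and fixed fields followed by an
    8 octet TLV and a final 6 octet TLV, chunk_length(20, [8, 6]) yields
    34, and a liberal receiver would accept either 34 or 36.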

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    Problem/Issue: Typo in RFC2960 on unrecognized chunk.

    Description: When describing how the upper two chunk type bits are
    processed in section 3.2 of RFC2960 [RFC2960] it states:

    01 - Stop processing this SCTP packet and discard it, do not process
    any further chunks within it, and report the unrecognized parameter
    in an 'Unrecognized Parameter Type' (in either an ERROR or in the
    INIT ACK).
  
    Advice/Solution: It should state (note the change of the word
    Parameter to Chunk):

    01 - Stop processing this SCTP packet and discard it, do not process
    any further chunks within it, and report the unrecognized parameter
    in an 'Unrecognized Chunk Type' (in either an ERROR or in the INIT
    ACK).
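
    For orientation, the handling selected by the upper two bits of an
    unrecognized chunk type can be sketched as below. Only the 01 case
    is quoted in this document; the other three cases are recalled from
    section 3.2 of RFC2960 and should be checked against it. The
    function name is hypothetical.

```python
def unknown_chunk_action(chunk_type):
    """Return (discard_packet, report) for an unrecognized chunk type,
    based on the two high-order bits of the type octet."""
    high = (chunk_type >> 6) & 0x3
    discard_packet = high in (0b00, 0b01)  # 00/01: stop and discard
    report = high in (0b01, 0b11)          # 01/11: report in an ERROR
    return discard_packet, report
```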

    Found at: On the mailing list reported by Fukumoto Atsushi on Nov 1,
    2000, at 7:10 am.

    2.3 Heart beat issues

    Problem/Issue: The text is unclear on whether the overall
    association error count is stroked (incremented) when a Heartbeat
    is not acknowledged.

    Description: An implementation was not stroking the overall
    association error count when Heartbeats were missed. The individual
    destination error counts were being stroked. When the peer "core
    dumped" the association stayed up but both of the destination
    addresses were marked unreachable.

    Advice/Solution: The specification clearly says (in Section 8.1) to
    clear the association error counter when a heartbeat acknowledgement
    arrives. However, the specification fails to say to stroke the
    association error counter when an endpoint misses a heartbeat
    ack. Section 8.2 does detail when to stroke and clear the various
    destination error counters. The specification should be enhanced in
    section 8.1 to specifically state that the error counter for the
    entire association is stroked if a heartbeat is missed.
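
    A minimal sketch of the counter handling suggested above follows.
    The class and attribute names are hypothetical, and the two
    thresholds are illustrative parameters, not values taken from this
    document.

```python
class Destination:
    def __init__(self, address):
        self.address = address
        self.error_count = 0
        self.active = True


class Association:
    def __init__(self, addresses, assoc_max_retrans, path_max_retrans):
        self.destinations = {a: Destination(a) for a in addresses}
        self.error_count = 0
        self.assoc_max_retrans = assoc_max_retrans
        self.path_max_retrans = path_max_retrans
        self.closed = False

    def heartbeat_missed(self, address):
        dest = self.destinations[address]
        dest.error_count += 1              # per-destination counter
        if dest.error_count > self.path_max_retrans:
            dest.active = False
        self.error_count += 1              # the step that was omitted
        if self.error_count > self.assoc_max_retrans:
            self.closed = True

    def heartbeat_acked(self, address):
        dest = self.destinations[address]
        dest.error_count = 0
        dest.active = True
        self.error_count = 0               # cleared on any HEARTBEAT ACK
```

    With both counters stroked, a peer that has "core dumped" eventually
    drives the association counter past its threshold, so the
    association is closed rather than left up with every destination
    marked unreachable.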

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    Problem/Issue: When a Heartbeat arrives and contains a format error,
    should the receiver respond?

    Description: If a heartbeat contains unrecognized TLV parameters or
    a heartbeat contains no TLV parameters, what should a responder do?
    Many implementations just overwrite the Chunk type with the
    Heartbeat-Ack type and echo back whatever was sent without looking
    inside the heartbeat. This was the actual intended design of the
    heartbeat. The description of the heartbeat response in RFC2960
    indicates this, but leaves one confused by referring to "heartbeat
    information". How should this be handled?

    Advice/Solution: The heartbeat was always intended to be echoed back
    by the receiver without the receiver looking at it, by just changing
    the Chunk type to Heartbeat Ack. Somewhere during the working group
    debate the heartbeat information was changed to a TLV. One possible
    solution is that the Heartbeat TLV be removed from the specification
    in the next iteration and that heartbeats be treated as they were
    initially intended, i.e. like an opaque piece of data to the
    receiver. Another alternative would be to clarify the text in the
    document on use of the TLV and provide clearer guidance on TLV use,
    i.e. only this one TLV should be included inside the Heartbeat.
    Implementations may wish to be prepared for any of these possible
    changes by NOT looking inside the heartbeat and instead just
    changing the chunk type and echoing the message back to the sender.
    This will assure compatibility no matter which course is taken
    during the next iteration of the specification.
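
    The echo behavior recommended above amounts to very little code. The
    sketch below assumes the chunk is available as a byte string that
    starts with the chunk type octet; the chunk type values 4
    (HEARTBEAT) and 5 (HEARTBEAT ACK) are those assigned in RFC2960,
    and the function name is hypothetical.

```python
HEARTBEAT = 4
HEARTBEAT_ACK = 5


def heartbeat_ack_from(chunk):
    """Build a HEARTBEAT ACK by changing only the chunk type octet;
    flags, length and payload are echoed back untouched and unparsed."""
    if len(chunk) < 4 or chunk[0] != HEARTBEAT:
        raise ValueError("not a HEARTBEAT chunk")
    return bytes([HEARTBEAT_ACK]) + bytes(chunk[1:])
```

    Because nothing after the first octet is examined, this works the
    same whether the heartbeat carries one TLV, several TLVs, or none.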

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    Problem/Issue: The heartbeat description is unclear on exactly when
    to start and stop heartbeating.

    Description: Some implementations were confused by receiving a
    Heartbeat right away after start up. Other implementations were
    confused by still receiving heartbeats after entering some of the
    shutdown states.

    Advice/Solution: The specification is currently unclear as to when
    heartbeating should begin and end. The specification should be
    clarified to dictate that you should not send a heartbeat after the
    point of sending either a SHUTDOWN chunk or SHUTDOWN-ACK chunk and
    you must not start heartbeating until you have reached the
    established state. The specification should also be clarified to
    state that you MUST respond to a Heartbeat after entering
    COOKIE-SENT state until you reach the closed state.

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    2.4 Initialization Issues

    Problem/Issue: During discussion a potential Denial Of Service
    attack was uncovered.

    Description: During debate and discussion of the restart case
    described in section 5.2.4 of RFC2960, in particular the case (A)
    restart, a denial of service attack was uncovered. Consider the
    following:

    Endpoint A(IPA)                        Endpoint B(IPZ)
        <-------Association Established------->

         Evil Endpoint C (IPQ)
         -----------SRC:IPQ-INIT(IPQ,IPA)----->
         <----------DST:IPQ-INIT-ACK(IPZ)-----
         -----------SRC:IPQ-CookieEcho------->
         <----------Cookie-Ack---------------

    At this point Evil Endpoint C has bridged onto an existing
    association. RFC2960 does not advise an implementation to be aware
    of different IP address configurations during the restart case.

    Advice/Solution: Clause (A) on page 63 of RFC2960 needs to be
    clarified. It should state that if the cookie indicates that one or
    more new addresses have been added to a restarting association then
    the entire cookie MUST be discarded, and a new operational error
    should be sent that states:

    'Restarting association adds an address, restart refused.'

    This error should also list the TLV(s) with the new address(es)
    that was/were not present in the old association. This would allow
    a true restarting association to go through a recovery procedure
    (if it desired) to bring back the association.

    NOTE: it should be possible to restart with a strict 
    subset of the original address list.
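
    The address comparison proposed above can be sketched as a set
    difference. The function names are hypothetical and addresses are
    represented as plain strings for illustration.

```python
def addresses_added_by_restart(old_addresses, cookie_addresses):
    """Return the addresses in the cookie that were not part of the
    existing association; these would be listed in the error."""
    return sorted(set(cookie_addresses) - set(old_addresses))


def accept_restart(old_addresses, cookie_addresses):
    """Accept a restart only if it adds no address. Restarting with a
    subset of the original address list remains possible."""
    return not addresses_added_by_restart(old_addresses, cookie_addresses)
```

    In the attack drawn above, the existing association knows only IPA
    while the cookie lists IPQ and IPA, so IPQ is reported as added and
    the restart is refused.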

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    2.5 Data Transfer Issues

    Problem/Issue: Some implementations would never grow their cwnd.

    Description: Some implementations were having trouble growing their
    cwnd since they were very strict about not allowing the number of
    outstanding bytes to exceed the current cwnd. This resulted in them
    almost never having cwnd bytes outstanding, which is one of the
    qualifications that must be met before you can increase your cwnd
    (i.e. you must be using your full cwnd before you can grow your
    cwnd).

    Advice/Solution: The specification has always allowed an implicit
    "slop over" of cwnd by up to 1 MTU minus 1 byte. In other words, if
    your cwnd is 1 (and P-MTU is 1500) you can still send 1 more P-MTU
    sized piece. This would result in your cwnd being exceeded by 1499
    bytes. The specification should be clarified to capture this
    explicitly.
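
    The difference between the overly strict check and the "slop over"
    rule can be sketched as two predicates. The function names are
    hypothetical, and the sketch deliberately ignores the peer's rwnd
    and every other sending rule.

```python
def strict_may_send(outstanding, cwnd, packet_len):
    """Overly strict: never let the outstanding bytes exceed cwnd."""
    return outstanding + packet_len <= cwnd


def may_send(outstanding, cwnd):
    """Slop-over rule: a packet may be sent whenever fewer than cwnd
    bytes are outstanding, even if sending it overshoots cwnd."""
    return outstanding < cwnd
```

    With the figures from the text, may_send(0, 1) is true, so one
    1500 byte packet is sent and cwnd is exceeded by 1499 bytes; after
    that, may_send(1500, 1) is false until data is acknowledged.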

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    2.6 Issues with Fast Retransmit

    Problem/Issue: A sender of a SACK is NOT required to include
    GAP acks. RFC2960 makes this a SHOULD and NOT a MUST. Moreover,
    when SCTP does a fast retransmit, the wording of the current
    specification does not prevent multiple fast retransmits of
    the same packet. Simulation shows that this leads to an undesirable
    reduction in the window on Long Fat Pipes. Also, the fact that
    SCTP must wait for 1/2 the window to drain before sending the
    fast retransmit impacts recovery and performance.

    Advice/Solution: In the BIS document wording needs to be added
    to A) require GAP ack blocks, i.e. change the SHOULD to a MUST and
    B) prevent multiple retransmissions of a packet being Fast
    Retransmitted. Along with this, the Fast Retransmit should not
    be restricted by the cwnd but be allowed to be retransmitted
    without delay. With these changes, text will be needed to limit
    the maximum burst after a fast retransmit (FR) for the cases where
    during a FR the sender becomes gated by the rwnd until the SACK
    acknowledging the FR arrives. Recommendations should be taken from
    [KF96] to limit this burst, where a "maxburst" is used to limit
    retransmission when exiting Fast Recovery.

    Found at: During ns simulation and discussed on the mailing
    list January 2001.

    3.0 Implementation issues found

    This section presents various implementation issues discovered at
    various bakeoffs. These issues do NOT require or indicate changes
    needed to RFC2960. Instead these issues provide guidance to future
    implementors and provide input to the SCTP applicability document
    where appropriate.

    3.1 General Philosophy

    Problem/Issue: Implementations that refuse to bring up an
    association due to detection of minor inconsistencies.

    Description: Many implementations seem to be overly concerned with
    not proceeding with an association if any slight inconsistency is
    detected (note the MIS/OS problem in section 2.1 or the padding
    issue in section 2.2 as examples).

    Advice/Solution: Robust network design with SCTP/IP should follow
    the guiding principle of being conservative in what you send and
    liberal in what you are willing to receive.

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    3.2 ICMP processing

    Problem/Issue: Reaction to an ICMP message by an SCTP stack.

    Description: One of the early kernel implementations at the bakeoff
    would issue an ICMP "protocol not registered" message to the sender
    of an INIT. A debate arose as to whether an SCTP stack should
    process this ICMP message or continue to retransmit INITs until it
    hit its maximum retry count.

    Advice/Solution: An implementation should be able to safely process
    MOST of the ICMP messages that would be sent, as long as it checks
    the SCTP header returned in the ICMP message for a valid
    Verification Tag. In the case of an INIT, the Verification Tag is
    set to zero. Since ICMP only guarantees that the first 64 bits, i.e.
    8 bytes, of the sender's message are returned, an ICMP message
    indicating protocol not registered may NOT necessarily be usable.
    If the ICMP implementation returns more than 8 bytes of information
    and in this information the Initiation Tag can be found, an
    implementation MAY use the ICMP message after checking the
    Initiation Tag against what the implementation sent in its INIT. If
    only 8 bytes of the INIT are returned an implementation SHOULD NOT
    use the ICMP message, otherwise it will be open to a denial of
    service attack.
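
    The check described above can be sketched as follows. The sketch
    assumes the RFC2960 packet layout (a 12 octet common header with the
    Verification Tag at offset 4, followed by a 4 octet chunk header and
    then the Initiation Tag of an INIT). The function and parameter
    names are hypothetical, and "returned" stands for the portion of the
    original SCTP packet carried inside the ICMP message.

```python
import struct

COMMON_HEADER_LEN = 12                       # ports, tag, checksum
INIT_TAG_OFFSET = COMMON_HEADER_LEN + 4      # after the chunk header
INIT = 1                                     # INIT chunk type


def icmp_is_usable(returned, sent_vtag, sent_init_tag=None):
    """Decide whether an ICMP message may be acted upon."""
    if len(returned) < 8:
        return False
    vtag = struct.unpack_from("!I", returned, 4)[0]
    if vtag != sent_vtag:
        return False
    if sent_vtag != 0:
        return True                          # non-zero tag matched
    # An INIT carries a zero Verification Tag, which proves nothing,
    # so the Initiation Tag must be present and must match.
    if sent_init_tag is None or len(returned) < INIT_TAG_OFFSET + 4:
        return False
    if returned[COMMON_HEADER_LEN] != INIT:
        return False
    init_tag = struct.unpack_from("!I", returned, INIT_TAG_OFFSET)[0]
    return init_tag == sent_init_tag
```

    If only the guaranteed 8 octets of an INIT come back, the function
    returns false and the stack simply continues its normal INIT
    retransmission.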

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    3.3 Unrecognized parameters in the INIT-ACK

    Problem/Issue: Receiver discarded an INIT-ACK with an unrecognized
    parameter and the association did not come up.

    Description: An INIT-ACK included a parameter whose type was that of
    the Supported Address Types parameter of the INIT chunk. This
    parameter type is not defined for the INIT-ACK chunk type. The
    receiver of this INIT-ACK did the correct thing (according to the
    upper bits) and silently discarded the entire packet. The SCTP
    association thus never came up.

    Advice/Solution: An implementation should obey the upper bit
    positions when processing unknown TLVs. The sender of the INIT-ACK
    must only use parameters that are defined for the INIT-ACK. In this
    particular case, where the sender was trying to restrict address
    types, the INIT had already arrived with a particular set of
    addresses. If the address set was unacceptable it should have
    returned the appropriate operational error, i.e. unresolvable
    address. Other than that, it had no need to inform the sender of the
    INIT what address types it could support. This is why the INIT-ACK
    does not have a 'Supported Address Types' parameter defined for it.
    Senders of an INIT-ACK should NOT include a 'Supported Address
    Types' parameter, and when including any undefined parameter should
    expect to be treated as defined by the upper two bits of the
    parameter type.

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    3.4 Concerns when you explicitly control the source address

    Problem/Issue: An implementation was explicitly controlling its
    source addresses and during the loss of one network would lose the
    whole association.

    Description: When controlling the source address an implementation
    would always send with one set source address for both heartbeats
    and data. When that address broke, the Heartbeat Acks no longer
    found their way back to the sender. This affected both destinations
    since the receiver of the Heartbeat always replied to the source
    address. This led the sender to conclude that its whole association
    had broken when it actually still had a good path available.

    Advice/Solution:

    When choosing the source address, it is generally a good idea to
    choose the source address that corresponds to the interface the data
    packet is being sent upon. Many routing networks do ingress route
    filtering so submitting a packet with a foreign source address on a
    network may mean that the SCTP packet will NOT be forwarded.

    When allowing sub-set binding of all of the IP addresses in a
    machine, an SCTP stack should vary the source address if the NIC
    (Network Interface Card) that the packet is being emitted on is not
    one of the allowable source addresses. For example, say a machine
    has 3 IP addresses, IP1, IP2 and IP3, associated with 3 NICs, NIC1,
    NIC2 and NIC3 respectively. If IP1 and IP2 are bound to an
    association but the destination address implies a route out NIC3
    (i.e. IP3), the implementation should alternate the IP source
    address between IP1 and IP2 on each outbound SCTP packet that is
    sent out of NIC3. This will allow a greater chance that even if a
    backward path is broken, transmission will still succeed.

    For some user space implementations it may not be possible to
    determine which NIC a packet will be sent from. In this case,
    the implementation should vary the source address of each send.
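
    One way to vary the source address is a simple rotation over the
    bound addresses, sketched below. The class name and its interface
    are hypothetical; addresses are plain strings for illustration.

```python
import itertools


class SourceAddressRotator:
    def __init__(self, bound_addresses):
        if not bound_addresses:
            raise ValueError("at least one bound address is required")
        self.bound = list(bound_addresses)
        self._cycle = itertools.cycle(self.bound)

    def pick(self, outgoing_interface_address=None):
        """Use the outgoing interface's own address when it is bound;
        otherwise (or when the interface is unknown) rotate."""
        if outgoing_interface_address in self.bound:
            return outgoing_interface_address
        return next(self._cycle)
```

    In the example above, with IP1 and IP2 bound, packets routed out of
    NIC3 would alternate between IP1 and IP2 as their source address.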

    On the receiving side, when duplicate DATA chunks begin to arrive,
    if the sender is multi-homed the receiver should NOT use the source
    address to send back the SACK. Instead it is beneficial to use the
    arrival of duplicate DATA chunks as a clue that the backward path
    may be broken and use an alternate address for the SACK to be sent
    to.

    Choosing the source address may become a kernel space implementation
    problem as well, since existing IP stacks provide one of two choices
    today. Either they set a specific IP address in place (i.e. where
    the user has bound only one address), or they set in the IP address
    of the interface that the packet goes out on (i.e. where the user
    bound INADDR_ANY). If a kernel implementation binds a set of
    addresses but not all of the addresses, the issue of picking the
    "best" source address will also need to be examined closely. The
    same problems discussed above apply and should be considered by the
    kernel implementation.

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    3.5 Issues with Shutdown

    Problem/Issue: Setting of Cumulative TSN when no DATA has been
    transmitted in the association.

    Description: An implementation became confused when it received a
    Shutdown with a Cumulative TSN of 0. This was caused when the peer
    sent Shutdown before any DATA had been sent. It sent 0 as the
    cumulative TSN, not having properly set up its cumulative TSN value
    (the initial cumulative TSN was set to some large positive integer).

    The key issue is what an SCTP stack sets in its cumulative TSN
    value, and thus uses in its subsequent Shutdown message, if it
    receives no DATA chunk from its peer.

    Advice/Solution: The specification is quite clear on this point. It
    requires implementations to set the Cumulative TSN of a new
    association to the initial TSN minus 1. So in our problem case, if
    the Initial TSN was 5294, the Cumulative TSN should have been set to
    5293. Thus the Cumulative TSN that was subsequently sent in the
    SHUTDOWN chunk should also have been 5293.
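
    As a small worked example, the starting value can be computed as
    shown below. The function name is hypothetical; the modulus reflects
    the fact that a TSN is a 32 bit value, so an Initial TSN of 0 wraps
    around.

```python
TSN_MODULUS = 2 ** 32


def initial_cumulative_tsn(peer_initial_tsn):
    """Cumulative TSN to report before any DATA has been received."""
    return (peer_initial_tsn - 1) % TSN_MODULUS
```

    For the case in the text, initial_cumulative_tsn(5294) gives 5293.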

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    Problem/Issue: Some implementations were having trouble processing a
    verification tag in a Shutdown Complete.

    Description: An implementation received a Shutdown Complete with the
    'T bit' set and the verification tag set to the value that was sent
    in the Shutdown-Ack. It rejected the Shutdown-Complete and thus,
    after multiple retransmissions of the Shutdown-Ack and subsequent
    discarding of the Shutdown-Completes, it closed the association.

    Advice/Solution: Since the 'T bit' was set in the Shutdown Complete,
    and the verification tag was set to the proper value, i.e. the value
    that was sent in the Shutdown-Ack, the receiver should have accepted
    the Shutdown Complete. This same issue applies to Aborts when the 'T
    bit' is set. Implementations need to pay attention to the 'T bit' on
    these two particular types of chunks when checking the verification
    tag.
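
    The tag check for these two chunk types can be sketched as below.
    The names are hypothetical: tag_i_expect is the tag an endpoint
    normally requires in packets it receives, and tag_i_send is the tag
    it places in packets it sends (the value that went out in its
    Shutdown-Ack).

```python
def verification_tag_ok(packet_tag, t_bit, tag_i_expect, tag_i_send):
    """Tag check for a received Shutdown Complete or Abort."""
    expected = tag_i_send if t_bit else tag_i_expect
    return packet_tag == expected
```

    With the 'T bit' set, a Shutdown Complete that reflects the tag from
    the Shutdown-Ack passes this check and is accepted.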

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    3.6 What happens when the primary path breaks and is restored

    Problem/Issue: What should happen when the primary path goes down
    (becomes inactive) and subsequently comes back up (becomes active
    again)?

    Description: An implementation had been designed so that after the
    primary destination became inactive and data was sent to an
    alternate address, when the primary destination once again became
    active, data continued to be sent to the alternate rather than
    returning to the primary.

    Advice/Solution: RFC2960 specifically calls for the SCTP stack NOT
    to change the primary address. The reason behind this is that the
    application may have a preference for the primary path (e.g., cost,
    quality) and SCTP should follow application preference. Another
    positive outcome of this behavior is automatic fail-over and
    fail-back. And since the SCTP stack should notify the user
    application, if the application does not want to fail back it is
    capable of taking action and changing the primary upon the failure
    notification.

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000



    3.7 Issues with heartbeats

    Problem/Issue: Which idle destination should a heartbeat be sent to?

    Description: When an implementation has more than 2 idle
    destinations that have not measured an RTT in an RTO time, only one
    is to be sent a heartbeat per heartbeat timer expiration (note:
    there is only one heartbeat timer running at any time). How to
    choose the destination to heartbeat is not defined.

    Advice/Solution: It is recommended but not mandated to use a
    round-robin approach for heartbeating multiple idle
    destinations. This assures over time all destinations are probed for
    reachability.
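
    A round-robin selection could look like the sketch below. The class
    name is hypothetical, and is_idle is an assumed callback supplied by
    the stack that reports whether a destination currently needs a
    heartbeat.

```python
class HeartbeatScheduler:
    def __init__(self, destinations):
        self.destinations = list(destinations)
        self._next = 0

    def next_destination(self, is_idle):
        """Return the next idle destination after the one probed last,
        or None if no destination is idle."""
        count = len(self.destinations)
        for step in range(count):
            index = (self._next + step) % count
            dest = self.destinations[index]
            if is_idle(dest):
                self._next = (index + 1) % count
                return dest
        return None
```

    Calling next_destination() once per heartbeat timer expiration
    visits every idle destination in turn before repeating any of them.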

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    Problem/Issue: What happens when a malicious implementation corrupts
    the heartbeat?

    Description: One of the implementations was playing with changing
    the timestamps in a heartbeat-ack to see if strange behavior would
    ensue in its peer.

    Advice/Solution: There are two possible solutions to this
    problem. An implementation can just handle the possibility that the
    time a heartbeat indicates is negative, in which case it should
    discard the RTT data and NOT calculate a new RTO. In effect the only
    harm this will do is destroy the correct RTO calculation for the
    malicious host. Even if the peer makes the RTT greater, it just
    makes the RTO itself calculate to a larger value, harming only
    the malicious stack's association data transfer. Another approach
    is for an implementation to sign the heartbeat much like it
    does the cookie. This way it could verify that the heartbeat
    information has not been changed. This idea must be weighed against
    the possibility of a peer then using the knowledge of a signature to
    generate a denial of service attack by sending large numbers of
    heartbeat-acks, thus causing CPU to be spent verifying the false
    signatures.
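
    The first approach can be sketched as follows. The function names
    are hypothetical, times are plain numbers in any consistent unit,
    and the list of samples stands in for whatever RTO calculation the
    stack performs.

```python
def rtt_from_heartbeat_ack(echoed_send_time, arrival_time):
    """Return the measured RTT, or None if the echoed timestamp would
    make the RTT negative and must therefore be discarded."""
    rtt = arrival_time - echoed_send_time
    return rtt if rtt >= 0 else None


def record_rtt(samples, echoed_send_time, arrival_time):
    """Add a sample for the RTO calculation only when it is usable."""
    rtt = rtt_from_heartbeat_ack(echoed_send_time, arrival_time)
    if rtt is not None:
        samples.append(rtt)
    return samples
```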

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000

    3.8 Advice on talking with a broken implementation

    Problem/Issue: What do you do when an implementation refuses to
    process data and SACKs correctly but does process the heartbeat
    correctly?

    Description: One of the implementations could not properly SACK
    inbound datagrams. However, heartbeats were handled properly. This
    caused the association to stay up since the heartbeats continually
    cleared all the error counters.

    Advice/Solution: This is an issue of a broken implementation. No
    action was thought needed except to fix the implementation. An
    alternative that some implementations put in place is to limit the
    number of retransmissions a given DATA chunk can have. If this
    limit is ever exceeded, the association is torn down.
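
    The per-chunk limit mentioned above might be sketched like this. The
    names and the limit value are hypothetical; the draft does not say
    what limit the implementations used.

```python
class OutstandingChunk:
    def __init__(self, tsn):
        self.tsn = tsn
        self.retransmissions = 0


def note_retransmission(chunk, max_retransmissions):
    """Count one retransmission of a DATA chunk and return True when
    the association should be torn down."""
    chunk.retransmissions += 1
    return chunk.retransmissions > max_retransmissions
```

    Unlike the error counters, this count is never cleared by a
    heartbeat ack, so a peer that only answers heartbeats cannot keep
    the association up forever.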

    Found at: Bakeoff number 2 in Research Triangle Park 10/23/2000 -
    10/27/2000


    4.0 Acknowledgements

    The authors would like to thank the following people that have
    provided comments and input for this document:

    For their comments on the list, Atsushi Fukumoto.

    For their participation in the RTP Bakeoff number 2 and all of their
    input, Heinz Prantner, Jan Rovins, Renee Revis, Steven Furniss,
    Manoj Solanki, Mike Turner, Jonathan Lee, Peter Butler, Laurent
    Glaude, Jon Berger, Dan Harrison, Sabina Torrente, Tomas Ortí
    Martín, Jeff Waskow, Robby Benedyk, Steve Dimig, Joe Keller, Ben
    Robinson, David Lehmann, John Hebert, Sanjay Rao, Kausar Hassan,
    Melissa Campbell, Sujith Radhakrishnan, Michael Tuexen, Andreas
    Jungmaier, Mitch Miers, Fred Hasle, Oliver Mayor, Cliff Thomas,
    Jonathan Wood, Kacheong Poon, Sverre Slotte, Wang Xiaopeng, Ivan
    Rodriguez, John Townsend, Harsh Bhondwe, Sandeep Mahajan, RCMonee,
    Ken FUJITA, Yuji SUZUKI, Mutsuya IRIE, Sandeep Balani, Biren Patel,
    Qiaobing Xie, Karl Knutson, La Monte Yarroll, Gareth Keily, Ian
    Periam, Nathalie Mouellic, and Stan McClellan.

    For their comments on the list and their detailed analysis and
    simulations of SCTP, Rob Brennan and Thomas Curran.

    5.0 Authors Addresses

    Randall R. Stewart
    24 Burning Bush Trail
    Crystal Lake, IL 60012
    USA

    EMail: rrs@cisco.com

    Lyndon Ong
    Point Reyes Networks
    Santa Clara, CA 95054, USA

    EMail: long@pointreyesnet.com

    6.0 References

    [RFC2960] - Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
    Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L.,
    Paxson, V. - "Stream Control Transmission Protocol", RFC 2960,
    October 2000.

    [KF96] - Fall, K. and Floyd, S. - "Simulation-based Comparisons
    of Tahoe, Reno and SACK TCP", Computer Communication
    Review, 26(3), 1996.

Stewart & Ong                                                  [Page 13]