IP and ARP Over FC Working Group                        Murali Rajagopal
INTERNET-DRAFT                                               Raj Bhagwat
                                                           Wayne Rickard
(Expires September 1, 1998)                           (Gadzoox Networks)


                     IP and ARP over Fibre Channel


Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as Reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

Abstract

   Fibre Channel is a high speed serial interface technology that
   supports several higher layer protocols such as SCSI and IP. Until
   now, SCSI has been the only widely used protocol over Fibre Channel.
   Although Fibre Channel standards support IP, they do not specify how
   IP packets may be transported over Fibre Channel, nor do they specify
   how IP addresses are resolved to FC addresses. The purpose of this
   document is to specify a way of encapsulating IP and ARP over Fibre
   Channel and also to describe a mechanism for IP address resolution.

1. Introduction

   Fibre Channel (FC) is a gigabit networking technology designed to
   primarily support the Storage Industry. FC is standardized under ANSI
   and has specified a number of documents describing its protocols,
   operations, and services.

   Need: Currently, Fibre Channel is predominantly used for
   communication between storage devices and servers using the SCSI
   protocol, with most of the servers still communicating with each
   other over LANs.
   Fibre Channel has architecturally defined support for IP, although
   currently there exists no standard way of using IP over FC. Once such
   a standard method is specified, servers can communicate directly with
   each other using IP, possibly boosting performance in server
   host-to-host communications. This technique will be especially useful
   in a clustering application. Therefore, there is a real need for a
   standard way for implementations using IP to inter-operate.

INTERNET-DRAFT                      IPFC               February 24, 1998

   Objective: The major objective of this specification is to promote
   inter-operable implementations of IP over Fibre Channel. This
   specification describes a method for encapsulating IPv4 and ARP
   packets over Fibre Channel. This specification accommodates any FC
   topology (loop, fabric, or point-to-point) and any desired class of
   service (1, 2 or 3). Use of IEEE 802.2 LLC/SNAP encapsulation for IP
   and ARP as specified in this document shall not preclude the use of
   the same encapsulation technique for other protocol stacks (e.g. IPX,
   AppleTalk).

   Organization: Section 2 states the problem that is solved in this
   specification. Section 3 describes the techniques used for
   encapsulating IP and ARP packets inside an FC frame. Section 4
   discusses the Address Resolution Protocol and the required mappings
   and operation. Sections 5, 6 and 7 describe FC specific Sequence and
   Exchange management. Sections 8, 9, 10 and 11 contain miscellaneous
   information. Appendix A provides a brief overview of the FC Protocols
   and Networks along with a list of Acronyms and a Glossary of FC Terms
   used in this specification. Appendix B contains suggested methods for
   MAC address to FC Port ID mapping.

2. Problem Statement

   This draft addresses two problems:

   - A frame format definition and encapsulation mechanism for IP and
     ARP packets over FC

   - An Address Resolution mechanism.

   The existing FC Standard [3] touches upon the first problem but is
   incomplete.
   However, a solution to both problems has been proposed by the Fibre
   Channel Association (FCA) [1], a consortium of Fibre Channel vendor
   companies. FCA is not a standards body. This draft specification is
   largely based on the proposed solution in [1] and is an attempt to
   provide a standardized solution addressing both of the problems
   stated above.

   Note: Please see Appendix A for Acronyms and Glossary of Terms.

3. IP/ARP Encapsulation

3.1 FC Frame Format

   All FC frames have a standard format, much like LAN 802.x protocols.
   However, the exact size of each frame varies depending on the sizes
   of the variable fields. The FC frame structure is shown in Fig. 1.

   +-------+--------+-----------+----//-------+-----+-----+
   | SOF   |Frame   |Optional   |  Payload    |CRC  | EOF |
   | (4B)  |Header  |Header     | (0-2112B)   |(4B) |(4B) |
   |       |(24B)   |(0-112B)   |             |     |     |
   +-------+--------+-----------+----//-------+-----+-----+

                     Fig. 1 FC Frame Format

   The Start of Frame (SOF) and End of Frame (EOF) are both 4 bytes long
   and act as frame delimiters.

   The CRC is 4 bytes long and uses the same 32-bit polynomial used in
   FDDI, as specified in ANSI X3.139 Fiber Distributed Data Interface.

   The Frame Header is 24 bytes long and has several fields associated
   with identification and control of the payload. The syntax of the
   code points for these fields determines the semantics of the FC
   frame. Code points relevant to the IP and ARP payloads will be
   discussed later.

   An FC Optional Header allows up to 4 optional header fields. The IP
   and ARP FC frames are allowed to carry only the Network_Header
   optional header field. The Network_Header field is 16 bytes long. Its
   use for the IP and ARP payload encapsulation is described below.

   In FC an application level payload is called a Sequence. Typically, a
   Sequence consists of more than one frame. When IP and ARP form the
   payload, only the first frame of the Sequence shall include the FC
   Network_Header.
   This rule is shown in Fig. 2. (Note: the SOF, CRC, EOF control fields
   and other optional headers have been omitted for clarity.)

   First Frame of the Sequence
   --+------------+---------------------------+----------//--------+--
     | FC Header  | FC Network Header         | IP or ARP Payload  |
   --+------------+---------------------------+----------//--------+--

   Subsequent Frames
   --+-----------+--------------//----+--
     | FC Header | IP or ARP Payload  |
   --+-----------+--------------//----+--

   Fig. 2 Network_Header in a Frame Sequence carrying IP or ARP Payload

3.2 FC Network Header Format

   The format of the Network Header is shown in Fig. 3. The FC standards
   defined this field to enable FC networks to be bridged with other FC
   networks or LANs. Source and Destination are each identified by a
   60-bit address field along with a 4-bit Network Address Authority
   (NAA) field. The NAA code point with binary code equal to 0001
   indicates the IEEE 48-bit MAC address and shall be used for both the
   source NAA (S_NAA) and destination NAA (D_NAA).

   +--------+---------------------------------------+
   | D_NAA  |Network_Dest_Address (High-order bits) |
   |(4 bits)|              (28 bits)                |
   +--------+---------------------------------------+
   |      Network_Dest_Address (Low-order bits)     |
   |                    (32 bits)                   |
   +--------+---------------------------------------+
   | S_NAA  |Network_Source_Address(High-order bits)|
   |(4 bits)|              (28 bits)                |
   +--------+---------------------------------------+
   |     Network_Source_Address (Low-order bits)    |
   |                    (32 bits)                   |
   +--------+---------------------------------------+

            Fig. 3 Format of the Network Header Field

3.3 Payload Header Format for IP and ARP packets

   The payload portion of the FC frame carrying an IP packet shall use
   the format shown in Fig. 4. Fig. 5 shows the format when the payload
   is an ARP packet. Both formats make use of the 8-byte LLC/SNAP
   header.
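   As an illustration of the Network Header layout of Section 3.2
   (Fig. 3), the following Python sketch packs a destination and a
   source MAC address into the 16-byte Network_Header. It assumes,
   consistent with the remark in Section 8 that the MAC address sits in
   the 6 lower bytes of the name, that the 48-bit MAC address occupies
   the low-order 48 bits of the 60-bit address field and that the
   remaining 12 bits are zero. The function and constant names are
   hypothetical and are not defined by this specification.

```python
NAA_IEEE_48BIT = 0b0001  # NAA code point required for both D_NAA and S_NAA


def pack_network_address(mac: bytes, naa: int = NAA_IEEE_48BIT) -> bytes:
    """Return one 8-byte half of the Network_Header: 4-bit NAA + 60-bit address."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte IEEE MAC address")
    if not 0 <= naa <= 0xF:
        raise ValueError("NAA is a 4-bit field")
    # Assumption: the MAC fills the low-order 48 bits of the 60-bit field;
    # the 12 bits between the NAA and the MAC are left as zero.
    value = (naa << 60) | int.from_bytes(mac, "big")
    return value.to_bytes(8, "big")


def pack_network_header(dst_mac: bytes, src_mac: bytes) -> bytes:
    """Return the 16-byte Network_Header: destination half, then source half."""
    return pack_network_address(dst_mac) + pack_network_address(src_mac)


if __name__ == "__main__":
    # Example MAC addresses are made up for illustration.
    hdr = pack_network_header(bytes.fromhex("0000c9aabbcc"),
                              bytes.fromhex("0000c9112233"))
    print(hdr.hex())
```

   Under these assumptions the first byte of each 8-byte half is 0x10
   (NAA 0001 followed by four zero bits), and the MAC address can be
   recovered from the last six bytes of each half.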
   +-----------------+-----//------+-----//----+
   | LLC/SNAP Header |  IP Header  |  IP Data  |
   +-----------------+-----//------+-----//----+

         Fig. 4 Format of the FC Payload carrying IP

   +-----------------+-------------------+
   | LLC/SNAP Header |    ARP Packet     |
   +-----------------+-------------------+

         Fig. 5 Format of the FC Payload carrying ARP

   A Logical Link Control (LLC) field along with a SubNetwork Attachment
   Point (SNAP) field is a method used to identify routed and bridged
   non-OSI protocol PDUs. It is defined in IEEE 802.2 and applied to IP
   in [8]. In LLC Type 1 operation (unacknowledged connectionless mode),
   the LLC header is 3 bytes long and consists of a 1-byte Destination
   Service Access Point (DSAP) field, a 1-byte Source Service Access
   Point (SSAP) field, and a 1-byte Control field, as shown in Fig. 6.

   Bytes:   1      1      1
        +------+------+------+
        | DSAP | SSAP | CTRL |
        +------+------+------+

             Fig. 6 LLC Format

   The LLC DSAP and SSAP values of 0xAA indicate that a SNAP header
   follows. The LLC CTRL value equal to 0x03 specifies an Unnumbered
   Information Command PDU. The LLC header value that shall be used is
   0xAA-AA-03.

   The SNAP header is 5 bytes long and consists of a 3-byte
   Organizationally Unique Identifier (OUI) field and a 2-byte Protocol
   Identifier (PID), as shown in Fig. 7.

   Bytes:           3                  2
        +------+------+-------+------+------+
        |         OUI         |     PID     |
        +------+------+-------+------+------+

                  Fig. 7 SNAP Format

   The SNAP OUI value 0x00-00-00 specifies that the PID is an EtherType
   (routed non-OSI protocol). The SNAP PID field specifies the EtherType
   value. In particular, the value 0x08-00 indicates IP and the value
   0x08-06 indicates ARP.

   The complete LLC/SNAP header is shown in Fig. 8.

   Bytes:   1      1      1            3                 2
        +------+------+------+------+------+------+------+------+
        | DSAP | SSAP | CTRL |        OUI         |     PID     |
        +------+------+------+------+------+------+------+------+

                        Fig. 8 LLC/SNAP Header

3.4 ARP Packet Format

   The format of the encapsulated ARP packet is based on [9] and is
   shown in Fig. 9.

   +-------------------------+
   | HW Type                 |  2 bytes
   +-------------------------+
   | Protocol                |  2 bytes
   +-------------------------+
   | HW Addr Length          |  1 byte
   +-------------------------+
   | Protocol Addr Length    |  1 byte
   +-------------------------+
   | Op Code                 |  2 bytes
   +-------------------------+
   | HW Addr of Sender       |  6 bytes
   +-------------------------+
   | Protocol Addr of Sender |  4 bytes
   +-------------------------+
   | HW Addr of Target       |  6 bytes
   +-------------------------+
   | Protocol Addr of Target |  4 bytes
   +-------------------------+

        Fig. 9 ARP Packet Format

   The 'HW Type' field shall be set to 0x00-06, indicating IEEE 802
   networks.

   The 'Protocol' field shall be set to 0x08-00, indicating the IP
   protocol.

   The 'HW Addr Length' field shall be set to 0x06, indicating 6 bytes
   of HW address.

   The 'Protocol Addr Length' field shall be set to 0x04, indicating 4
   bytes of IP address.

   The 'Op Code' field shall be either 0x00-01 for Request or 0x00-02
   for Reply.

   The 'HW Addr of Sender' field shall be the 6-byte IEEE MAC address of
   the sender.

   The 'Protocol Addr of Sender' field shall be the 4-byte IP address of
   the sender.

   The 'HW Addr of Target' field shall be set to zero if the 'Op Code'
   field is set to 1. Otherwise, it shall be set to the 6-byte IEEE MAC
   address of the original sender of the ARP request.

   The 'Protocol Addr of Target' field shall be set to the 4-byte IP
   address of the target.

   The ARP packet is 28 bytes long in this particular application.

   The difference between an ARP Request packet and an ARP Reply packet
   is given below:

   1. ARP Request packet: 'Op Code' field = 0x00-01 and the 'HW Addr of
      Target' is set to 0x00-00-00-00-00-00.

   2. ARP Reply packet: 'Op Code' field = 0x00-02 and the 'HW Addr of
      Target' is set to the 6 bytes of the 'HW Addr of Sender' extracted
      from the ARP Request packet.

   An ARP Request message is defined as an FC broadcast frame carrying
   the ARP Request packet. The exact mechanism used to broadcast an FC
   frame depends on the topology and will be discussed in the next
   section. Compliant ARP broadcast messages shall include Network
   Headers.

   An ARP Reply message is defined as an ARP Reply packet encapsulated
   in an FC frame.

4. Address Resolution

4.1 Problem Description

   Address Resolution is concerned with associating IP addresses with FC
   Port addresses. FC device ports have two addresses:

   - a non-volatile unique 64-bit address called World Wide Port_Name
     (WWP_N) (essentially formed from a unique IEEE 48-bit MAC address
     and other bit-fields)

   - a volatile 24-bit address called a Port_ID

   The Address Resolution mechanism therefore needs two levels of
   mapping:

   1. A mapping from IP address to the WWP_N address (i.e., IEEE 48-bit
      MAC address)

   2. A mapping from WWP_N to the Port_ID

   The address resolution problem is compounded by the fact that the
   Port_ID is volatile and the second mapping has to be validated before
   use. Moreover, this validation process can be different depending on
   the FC network topology used.

   The first level of mapping and control operation is handled by the
   ARP layer. The second level of mapping and control is handled by the
   FC layer.

4.2 ARP Layer Mapping and Operation

   Whenever a source FC port with a designated IP address wishes to send
   IP data to a destination FC port, also with a designated IP address,
   the following steps are taken:

   1. The source port shall consult its local mapping tables to
      determine the WWP_N address that corresponds to the destination IP
      address. (Note: WWP_N address and 48-bit MAC address conceptually
      mean the same thing in this discussion.)

   2. If such a mapping is found, then the source shall send the IP data
      to the port whose WWP_N address was found in the table.

   3. If such a mapping is not found, then the source shall send an ARP
      broadcast message to its connected FC network in order to obtain a
      reply from the correct destination carrying its WWP_N address.

   4. When an ARP broadcast message is received by the destination, it
      shall generate an ARP response. Since the ARP response must be
      addressed to a specific destination Port_ID, the FC layer mapping
      between the MAC address and Port_ID must be valid before the reply
      is sent.

4.2.1 ARP Broadcast in a Point-to-Point Topology

   There is no requirement for ARP since the WWP_N is known after the
   two N_Ports carry out an N_Port Login, that is, a PLOGI (see Appendix
   A).

4.2.2 ARP Broadcast in a Private Loop Topology

   In a private loop, the ARP broadcast message is sent using the
   broadcast method specified in the FC-AL [7] standard.

   1. The source port shall first send an Open Broadcast Replicate
      primitive signal (OPN(fr)), forcing all the ports in the loop
      (except itself) to replicate the frames that they receive while
      examining the frame header's Destination_ID field.

   2. The source port shall remove this OPN(fr) signal when it returns
      to it.

   3. The source shall now send an FC frame containing the ARP Request
      (ARP broadcast message), as a sequence in a Class 3 frame with
      D_ID = 0xFFFFFF, SI=0, LS=1, ES=1. (Note: these F_CTL settings
      apply to single-frame broadcasts, as used in ARP sequences. This
      information is provided to clarify ARP Broadcast usage only, and
      should not be interpreted as prohibiting the use of multiframe
      broadcasts by this specification.)

   4. The destination port recognizing its IP address in the ARP packet
      shall respond with an ARP Reply message.

4.2.3 ARP Broadcast in a Public Loop Topology

   The following steps shall be followed when a port is configured in a
   public loop:

   1. A public loop device attached to a fabric through an FL_Port shall
      not use the OPN(fr) primitive signal. Rather, it shall send the
      broadcast sequence to the FL_Port at AL_PA = 0x00.

   2. A fabric shall propagate the broadcast to all other ports,
      including the FL_Port on which the broadcast arrived.

   3. On each FL_Port, the fabric shall propagate the broadcast by first
      using the primitive signal OPN(fr) and then sending a sequence in
      a Class 3 frame with D_ID = 0xFFFFFF, SI=0, LS=1, ES=1.

4.2.4 ARP Operation in a Fabric Topology

   1. Nodes directly attached to a fabric do not require the OPN(fr)
      primitive signal.

   2. The node shall send the broadcast as a sequence in Class 3, to
      D_ID = 0xFFFFFF, SI=0, LS=1, ES=1.

4.3 FC Layer Mapping and Operation

   FC layer mapping between the MAC address and the Port_ID is
   independent of the ARP mechanism and is more closely associated with
   the details of the FC protocols. This mapping is therefore outside
   the scope of this document. However, several strategies for handling
   MAC address (WW Port Name) to Port_ID mapping are presented in
   Appendix B. The selection of the most appropriate strategy for a
   particular implementation is outside the scope of this document.

4.4 FC Layer Address Validation

   At all times, the mapping has to be valid before it can be used.
   There are many events that can invalidate this mapping. For example,
   when a link interruption occurs, the Port_ID of a port may change.
   After the interruption, the Port_IDs of all other ports that have
   previously performed PLOGI with this port may have changed, and its
   own Port_ID may have changed. Because of this, address validation is
   required after a LIP in a loop topology or after NOS/OLS in a
   point-to-point topology. Port_IDs will not change as a result of Link
   Reset (LR), thus address validation is not required after LR.
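   The two-level mapping of Section 4.1 and the invalidation rule stated
   above can be sketched in Python as follows. This is a minimal
   illustration under stated assumptions, not a prescribed
   implementation: the class and method names are hypothetical, link
   events are represented as plain strings, and the actual validation
   exchange is reduced to a single call that records the (possibly new)
   Port_ID.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Link events after which Section 4.4 requires address validation.
# Link Reset (LR) is deliberately absent: Port_IDs do not change on LR.
EVENTS_REQUIRING_VALIDATION = {"LIP", "NOS", "OLS"}


@dataclass
class PortEntry:
    port_id: int            # volatile 24-bit Port_ID
    validated: bool = True  # cleared by a link event until re-validated


@dataclass
class AddressResolver:
    """Hypothetical table pair: IP -> MAC (ARP layer), MAC -> Port_ID (FC layer)."""
    ip_to_mac: Dict[str, bytes] = field(default_factory=dict)
    mac_to_port: Dict[bytes, PortEntry] = field(default_factory=dict)

    def learn(self, ip: str, mac: bytes, port_id: int) -> None:
        """Record both mapping levels, e.g. after an ARP Reply and a login."""
        self.ip_to_mac[ip] = mac
        self.mac_to_port[mac] = PortEntry(port_id)

    def on_link_event(self, event: str) -> None:
        """Mark every second-level mapping as unvalidated after LIP or NOS/OLS."""
        if event in EVENTS_REQUIRING_VALIDATION:
            for entry in self.mac_to_port.values():
                entry.validated = False

    def revalidate(self, mac: bytes, port_id: int) -> None:
        """Store the Port_ID confirmed by the FC layer validation step."""
        self.mac_to_port[mac] = PortEntry(port_id)

    def resolve(self, ip: str) -> Optional[int]:
        """Return a usable Port_ID, or None if either mapping level is unusable."""
        mac = self.ip_to_mac.get(ip)
        if mac is None:
            return None  # no first-level mapping: an ARP broadcast would be sent
        entry = self.mac_to_port.get(mac)
        if entry is None or not entry.validated:
            return None  # second-level mapping missing or not yet validated
        return entry.port_id
```

   Keeping the two tables separate reflects the split of
   responsibilities described in Section 4.1: a link event invalidates
   only the volatile MAC-to-Port_ID level, while the IP-to-MAC level
   learned through ARP is left untouched.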
   In addition to actively validating devices after a link interruption,
   a port shall send an explicit logout (LOGO) to the sending port upon
   receipt of any FC-4 data frames from a port not currently logged in
   (excluding broadcast frames).

   The level of initialization and subsequent validation and recovery
   reported to the upper (FC-4) layers is implementation-specific.

   In general, an explicit Logout (LOGO) shall be sent whenever the FC
   layer mapping between the Port_ID and WWP_N of a remote port is
   removed.

   The effect of power-up or re-boot on the mapping tables is outside
   the scope of this specification.

   FC Layer Address Validation in a Point-to-Point Topology:

   No validation is required after LR. In a point-to-point topology,
   NOS/OLS causes implicit logout of each port and, after a NOS/OLS,
   each port must perform a PLOGI. (Ref. FC-PH [2], section 23.5.3.)

   FC Layer Address Validation in a Private Loop Topology:

   After a LIP, a port shall not transmit any link data to another port
   until the address of the other port has been validated. The
   validation consists of completing either ADISC or PDISC. For a
   requester, this specification prohibits PDISC and requires ADISC. As
   a responder, an implementation may need to respond to both ADISC and
   PDISC for compatibility with other specifications.

   If the three addresses, Port_ID, WWP_N, and WWN_N, exactly match the
   values prior to the LIP, then any active exchanges may continue. If
   any of the three addresses has changed, then the node must be either
   implicitly or explicitly logged out. (Ref. FC-PLDA [5] and FLA [4],
   Section 5.7.)

   FC Layer Address Validation in a Public Loop Topology:

   After a LIP, each public loop port shall not transmit any frames
   until it receives the FAN ELS from the fabric (Ref. FLA [4]). The
   WWP_N and WWN_N of the fabric FL_Port contained in the FAN ELS must
   exactly match the values before the LIP.
   In addition, the AL_PA obtained by the port must be the same as the
   one before the LIP. If the above conditions are met, the port may
   resume all exchanges. If not, then FLOGI must be performed with the
   fabric and all nodes must be either implicitly or explicitly logged
   out.

   A public loop device will have to perform the private loop
   authentication to any nodes on the local loop which have an Area +
   Domain Address == 0x00-00-xx.

   FC Layer Address Validation in a Fabric Topology:

   No authentication is required after LR (Link Reset). After NOS/OLS, a
   port must perform FLOGI. If, after FLOGI, the S_ID of the port, the
   WWP_N of the fabric, and the WWN_N of the fabric are the same as
   before the NOS/OLS, then the port may resume all exchanges. If not,
   all nodes must be either implicitly or explicitly logged out. (Ref.
   FC-PH [2], section 23.5.3.)

5. Sequence And Exchange Management

5.1 Sequence

5.2 Exchange

5.2.1 Exchange Origination

   FC Exchanges shall be established to transfer data between ports.
   Frames on IP exchanges shall not transfer Sequence Initiative.

5.2.2 Exchange Termination

   With the exception of the recommendations in Appendix C, "Reliability
   in Class 3", the mechanism for aging or expiring exchanges based on
   activity, timeout, or other method is outside the scope of this
   document.

   Exchanges may be terminated by either port. The Exchange Originator
   shall normally terminate Exchanges by setting the LS bit, following
   normal FC-PH rules. This specification prohibits the use of the NOP
   ELS with LS set for Exchange termination.

   Exchanges may be torn down by the Exchange Responder by using the
   ABTS_LS protocol. The use of ABTS_LS for terminating aged exchanges
   or error recovery is outside the scope of this document.

   The termination of IP exchanges by Logout is discouraged, since this
   may terminate active exchanges on other FC-4s.

6. Summary of Supported Features

   Note: 'Required' means the feature support is mandatory, 'Prohibited'
   means the feature support is not allowed, 'Allowed' means the feature
   support is optional, and 'Settable' means support is as specified in
   the relevant standard.

6.1 FC-4 Header (Note 1)

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| Type Code ISO8802-2 LLC/SNAP                    | Required   | 2     |
| Network Headers                                 | Required   | 3     |
| Other Optional Headers                          | Prohibited |       |
+----------------------------------------------------------------------+

   Notes:

   1. This table applies only to FC-4 related data, such as IP and ARP
      packets. This table does not apply to link services and other
      non-FC-4 sequences (PLOGI, for example) that must occur for normal
      operation.

   2. The TYPE field must indicate ISO 8802-2 LLC/SNAP Encapsulation
      (Type 5). This revision of the document focuses solely on the
      issues related to running IP and ARP over FC. All other issues are
      outside the scope of this document, including full support for
      IEEE 802.2 LLC.

   3. The DF_CTL field must indicate the presence of a Network Header
      (0010 0000) on the first frame of FC-4 sequences.
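   Notes 2 and 3 above can be illustrated with a short Python sketch
   that yields the TYPE and DF_CTL values for a frame of an FC-4
   sequence and checks a received pair against them. The names are
   hypothetical. The value 0x00 for DF_CTL on frames after the first is
   an assumption drawn from the table (other optional headers are
   prohibited, and the Network Header appears only on the first frame);
   it is not stated explicitly in the notes.

```python
TYPE_LLC_SNAP = 0x05           # TYPE: ISO 8802-2 LLC/SNAP Encapsulation (Note 2)
DF_CTL_NETWORK_HEADER = 0x20   # DF_CTL: 0010 0000, Network Header present (Note 3)
DF_CTL_NO_OPTIONAL_HDR = 0x00  # assumed value when no optional header is carried


def fc4_header_fields(first_frame: bool) -> dict:
    """Return the TYPE and DF_CTL values expected for an IP or ARP frame."""
    df_ctl = DF_CTL_NETWORK_HEADER if first_frame else DF_CTL_NO_OPTIONAL_HDR
    return {"type": TYPE_LLC_SNAP, "df_ctl": df_ctl}


def is_compliant(type_code: int, df_ctl: int, first_frame: bool) -> bool:
    """Check a received TYPE/DF_CTL pair against the values in Section 6.1."""
    expected = fc4_header_fields(first_frame)
    return type_code == expected["type"] and df_ctl == expected["df_ctl"]


if __name__ == "__main__":
    print(fc4_header_fields(first_frame=True))
    print(is_compliant(0x05, 0x20, first_frame=False))
```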

6.2 R_CTL Routing

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| FC-4 Device Data                                | Required   | 1     |
| Extended Link Data                              | Required   | 2     |
| FC-4 Link Data                                  | Prohibited |       |
| Video Data                                      | Prohibited |       |
| Basic Link Data                                 | Required   | 3     |
| Link Control                                    | Required   | 4     |
| R_CTL information                               |            |       |
|   Uncategorized                                 | Prohibited |       |
|   Solicited Data                                | Prohibited |       |
|   Unsolicited Control                           | Required   | 2     |
|   Solicited Control                             | Required   | 2     |
|   Unsolicited Data                              | Required   | 1     |
|   Data Descriptor                               | Prohibited |       |
|   Unsolicited Command                           | Prohibited |       |
|   Command Status                                | Prohibited |       |
+----------------------------------------------------------------------+

   Notes:

   1. This is required for FC-4 (IP and ARP) packets:
      - The Routing bits of the R_CTL field must indicate Device Data
        frames (0000).
      - The Information Category of the R_CTL field must indicate
        Unsolicited Data (0100).

   2. This is required for Extended Link Services.

   3. This is required for Basic Link Services.

   4. This is required for Link Control frames.

6.3 F_CTL

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| Exchange Context                                | Settable   |       |
| Sequence Context                                | Settable   |       |
| First / Last / End Sequence (FS/LS/ES)          | Settable   |       |
| Chained Sequence                                | Prohibited |       |
| Sequence Initiative (SI)                        | Settable   | 1     |
| X_ID Reassigned / Invalidate                    | Prohibited |       |
| Unidirectional Transmit                         | Settable   |       |
| Continue Sequence Condition                     | Required   | 2     |
| Abort Seq. Condition - continue and single seq. | Required   | 3     |
| Relative Offset - Unsolicited Data              | Settable   | 4     |
| Fill Bytes                                      | Settable   |       |
+----------------------------------------------------------------------+

   Notes:

   1. For FC-4 frames, each N_Port shall have a dedicated X_ID for
      sending data to each N_Port in the network and a dedicated X_ID
      for receiving data from each N_Port as well. Exchanges are used in
      a unidirectional mode, thus setting Sequence Initiative is not
      valid for FC-4 frames. Sequence Initiative is valid when using
      Extended Link Services.

   2. This field is required to be 00 (no information).

   3. The sequence error policy is requested by an exchange originator
      in the F_CTL Abort Sequence Condition bits in the first data frame
      of the exchange. For Classes 1 and 2, the ACK frame is required to
      be "continuous sequence".

   4. Relative offset is prohibited on all other types (Information
      Category) of frames.

6.4 Sequences

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| Class 2 open sequences / exchange               | 1          | 1     |
| Length of seq. not limited by end-to-end credit | Required   | 2     |
| Maximum sequence size - IP sequences            | 65536      | 3     |
| Maximum sequence size - ARP sequences           | 532        | 4     |
| Capability to receive sequence of maximum size  | Allowed    | 5     |
| Sequence Streaming                              | Prohibited | 6     |
| Stop Sequence Protocol                          | Prohibited |       |
| ACK_0 support                                   | Allowed    | 7     |
| ACK_1 support                                   | Required   | 7     |
| ACK_N support                                   | Prohibited |       |
| Class of Service for transmitted sequences      | 1, 2 or 3  | 8     |
| Continuously Increasing Sequence Count          | Allowed    | 9     |
+----------------------------------------------------------------------+

   Notes:

   1. Only one active sequence per exchange is allowed.

   2. A sequence initiator shall be capable of transmitting sequences
      containing more frames than the available credit indicated by a
      sequence recipient at login. FC-PH end-to-end flow control rules
      shall be followed when transmitting such sequences.

   3. Maximum sequence size is 65536 bytes.
      Thus the maximum IP packet size (MTU) is 65280 bytes (65536 - 256
      bytes for header overhead).

   4. The maximum size ARP packet is 532 bytes (including LLC/SNAP
      headers).

   5. Some OS environments may not handle the max MTU of 65536. It is up
      to the administrator to configure the Max MTU for all systems.

   6. All Class 3 sequences are assumed to be non-streamed.

   7. Only applies to Classes 1 and 2. Use of ACK_1 is the default;
      ACK_0 is used if indicated by the sequence recipient at login.

   8. The administrator configured class of service is used, except
      where otherwise specified (e.g. broadcasts are always sent in
      Class 3).

   9. Review Appendix C, "Reliability in Class 3".

6.5 Exchanges

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| X_ID interlock support                          | Allowed    | 1     |
| OX_ID=FFFF                                      | Prohibited |       |
| RX_ID=FFFF                                      | Allowed    | 2     |
| Action if no exchange resources available       | P_RJT      | 3     |
| Long Lived Exchanges                            | Allowed    | 4     |
| Reallocation of Idle Exchanges                  | Allowed    |       |
+----------------------------------------------------------------------+

   Notes:

   1. Only applies to Classes 1 and 2, supported by the exchange
      originator. A Port shall be capable of interoperating with another
      Port that requires X_ID interlock. The exchange originator
      facility within the Port shall use the X_ID Interlock protocol in
      such cases.

   2. An exchange responder is not required to assign RX_IDs. If an
      RX_ID of FFFF is assigned, the responder is identifying exchanges
      based on S_ID / D_ID / OX_ID only.

   3. In Classes 1 and 2, a Port shall reject a frame that would create
      a new exchange with a P_RJT containing reason code "Unable to
      establish exchange". In Class 3, the frame would be dropped.

   4. When an exchange is created between 2 Ports for IP/ARP data, it
      remains active while the ports are logged in with each other.
      An exchange shall not transfer Sequence Initiative (SI).
      Broadcasts and ELS commands may use short-lived exchanges.

6.6 ARP

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| ARP Server Support                              | Prohibited | 1     |
| Response to ARP requests                        | Required   | 2     |
| ARP requests transmitted as broadcast message   | Required   |       |
| Class of Service for ARP requests               | 3          | 3     |
| Class of Service for ARP replies                | 1, 2 or 3  | 4     |
+----------------------------------------------------------------------+

   Notes:

   1. Well-known Address FFFFFC is not used for ARP requests. Frames
      from Well-known Address FFFFFC are not considered to be ARP
      frames. Broadcast support is required for ARP.

   2. The IP Address is mapped to a specific MAC address with ARP.

   3. An ARP request is a broadcast message, thus Class 3 is always
      used.

   4. An ARP reply is a normal sequence, thus the administrator
      configured class of service is used.

6.7 Extended Link Services

+----------------------------------------------------------------------+
| Feature                                         | Support    | Notes |
+----------------------------------------------------------------------+
| Class of service for ELS commands / responses   | 1, 2 or 3  | 1     |
| Explicit N-Port Login                           | Required   |       |
| Explicit F-Port Login                           | Required   |       |
| FLOGI ELS command                               | Required   |       |
| PLOGI ELS command                               | Required   |       |
| ADISC ELS command                               | Required   |       |
| PDISC ELS command                               | Allowed    | 2     |
| FAN ELS command                                 | Required   | 3     |
| LOGO ELS command                                | Required   |       |
| Other ELS command support                       | Allowed    | 4     |
+----------------------------------------------------------------------+

   Notes:

   1. The administrator configured class of service is used.

   2. PDISC is prohibited as requester. ADISC should be used instead.
      As a responder, an implementation may need to respond to both
      ADISC and PDISC for compatibility with other specifications.

   3. FAN is required in a public loop environment.

   4. If other ELS commands are received, an LS_RJT may be sent. NOP is
      not required by this specification, and should not be used as a
      mechanism to terminate exchanges.

7. LOGIN PARAMETERS

   Unless explicitly noted here, a compliant implementation shall use
   the login parameters as described in FLA [4], section 5.

7.1 Common Service Parameters - FLOGI

   - FC-PH Version: the lowest version may be 0x09 to indicate 'minimum
     4.3'.

   - BB_Credit=0 cannot be used for an N_Port on a switched Fabric
     (F_Port).

7.2 Common Service Parameters - PLOGI

   - FC-PH Version: the lowest version may be 0x09 to indicate 'minimum
     4.3'.

   - BB_Credit=0 cannot be used for an N_Port in a Point-to-Point
     configuration.

   - Random Relative Offset is allowed.

7.3 Class 3 Service Parameters - PLOGI

   - Discard error policy only.

8. MISCELLANEOUS

   Note that the 'Receive Data Field Size' fields specified in the PLOGI
   represent both optional headers and payload.

   The MAC Address can be extracted from the 6 lower bytes of the WWP_N
   field (when the IEEE 48-bit Identifier format is chosen as the NAA)
   in the PLOGI or ACC payload exchanged during Fibre Channel Login.
   (Ref. FC-PH [2], Section 23.)

   The MAC Address can also be extracted from the WWP_N field in the
   Network Header during ADISC (and ADISC ACC), or PDISC (and PDISC
   ACC).

9. CONCLUSIONS

10. ACKNOWLEDGEMENT

   This specification is based on FCA IP Profile, Version 2.3. The FCA
   IP Profile was a joint work of the Fibre Channel Association (FCA)
   vendor community.
The following companies and organizations have contributed to the
creation of the FCA IP Profile: Adaptec, Ancor, Brocade, Clarion,
Crossroads, emf Associates, Emulex, Finisar, Gadzoox, Hewlett Packard,
Interphase, Jaycor, LLNL, McData, Migration Associates, Prisa, Q-Logic,
Symbios, Systran, Tektronix, Univ. of Minnesota, Univ. of New
Hampshire.

11. REFERENCES

[1] FCA IP Profile, Revision 2.3, May 15, 1997.

[2] Fibre Channel Physical and Signaling Interface (FC-PH), ANSI
    X3.230-1994.

[3] Fibre Channel Link Encapsulation (FC-LE), Revision 1.1, June 26,
    1996.

[4] Fibre Channel Fabric Loop Attachment (FC-FLA), Rev. 2.4, October 21,
    1996.

[5] Fibre Channel Private Loop SCSI Direct Attach (FC-PLDA), Rev. 1.7,
    October 7, 1996.

[6] Fibre Channel Physical and Signaling Interface-2 (FC-PH-2),
    Rev. 7.4, ANSI X3.297-1996.

[7] Fibre Channel Arbitrated Loop (FC-AL), ANSI X3.272-1996.

[8] Postel, J. and Reynolds, J., "A Standard for the Transmission of IP
    Datagrams over IEEE 802 Networks", RFC 1042, ISI, February 1988.

[9] Plummer, D., "An Ethernet Address Resolution Protocol -or-
    Converting Network Protocol Addresses to 48-bit Ethernet Address
    for Transmission on Ethernet Hardware", STD 37, RFC 826, MIT,
    November 1982.

12. AUTHORS' ADDRESSES

Murali Rajagopal
Gadzoox Networks, Inc.
711 Kimberly Avenue, Suite 100
Placentia, CA 92870
Phone: +1 714 577 6805
Fax: +1 714 524 8508
Email: murali@gadzoox.com

Raj Bhagwat
Gadzoox Networks, Inc.
711 Kimberly Avenue, Suite 100
Placentia, CA 92870
Phone: +1 714 577 6806
Fax: +1 714 524 8508
Email: raj@gadzoox.com

Wayne Rickard
Gadzoox Networks, Inc.
711 Kimberly Avenue, Suite 100
Placentia, CA 92870
Phone: +1 714 577 6803
Fax: +1 714 524 8508
Email: wayne@gadzoox.com

APPENDIX - A FIBRE CHANNEL OVERVIEW

A.1 Brief Tutorial

FC standard [2] defines 5 "levels" for its protocol description: FC-0,
FC-1, FC-2, FC-3, and FC-4. The first three levels (FC-0, FC-1, FC-2)
are largely concerned with the physical formatting and control aspects
of the protocol. FC-3 is architecturally defined but not specified.
FC-4 is meant for support profiles of higher protocols such as IP and
Small Computer System Interface (SCSI), and supports a relatively small
set of higher level protocols compared to LAN protocols such as IEEE
802.3.

FC Nodes communicate using these higher layer protocols, such as SCSI
over FC, and are configured to operate using different networking
topologies. Currently, the FC standards support 4 networking
topologies:

- Point-to-Point
- Private Loop
- Public Loop (attachment to a Fabric)
- Fabric

Point-to-point is the simplest of the four topologies, where only two
nodes communicate with each other. The private loop may connect a
number of devices (max 126) in a logical ring, much like Token Ring,
and is distinguished from a public loop by the absence of a Fabric Node
participating in the loop. The Fabric topology is a switched network
where any attached node can communicate with any other.

A.2 Acronyms and Glossary of FC Terms

It is assumed that the reader is familiar with the terms and acronyms
used in the FC protocol specification [2]. The following is provided
for easy reference.

A.2.1 Acronyms

ABTS_LS: Abort Sequence Protocol - Last Sequence. A protocol for
aborting an exchange based on the ABTS recipient setting the
Last_Sequence bit in the BA_ACC ELS to the ABTS. [Ref. FC-PH [2],
21.2.2.2 and PLDA [5] 9.3]

ADISC: Discover Address. An ELS for discovering the Hard Addresses (the
24-bit NL_Port Identifier) of N_Ports. [Ref.
PH2 [6] 21.19.2 and PLDA [5] 10.3]

D_ID: Destination ID. [Ref. FC-PH [2], 18.3.2]

ES: End sequence. This FCTL bit in the FC header indicates this frame
is the last frame of the sequence. [Ref. FC-PH [2], 18.5 and table 37]

FAN: Fabric Address Notification. An ELS sent by the fabric to all
known previously logged in ports following an initialization event.
[Ref. FLA [4] A.1.1]

LIP: Loop Initialization. A primitive sequence used by a port to detect
if it is part of a loop or to recover from certain loop errors. [Ref.
FC-AL [7], 7.7]

LR: Link reset. A primitive sequence transmitted by a port to initiate
the link reset protocol or to recover from a link timeout. [Ref. FC-PH
[2], 16.4.4]

LS: Last sequence of Exchange. This FCTL bit in the FC header indicates
the sequence is the last sequence of the exchange. [Ref. FC-PH [2],
18.5 and table 37]

NOS: Not Operational. A primitive sequence transmitted to indicate that
the port transmitting this sequence has detected a link failure or is
offline, waiting for OLS to be received. [Ref. FC-PH [2], 16.4.2]

OLS: Off line. A primitive sequence transmitted to indicate that the
port transmitting this sequence is either initiating the link
initialization protocol, receiving and recognizing NOS, or entering the
offline state. [Ref. FC-PH [2], 16.4.3]

PDISC: Discover Port. An ELS for exchanging Service Parameters without
affecting login state. [Ref. PH2 [6], 21.19.1 and PLDA [5] 10.3]

SI: Sequence Initiative. [Ref. FC-PH [2], 18.5 and table 37]

FLOGI: TBD

Primitive Sequence: A primitive sequence is an Ordered Set that is
transmitted repeatedly and continuously. [Ref. FC-PH [2], 16.4]

Private Loop Device: A device that does not attempt fabric login
(FLOGI) and usually adheres to PLDA. The Area and Domain components of
the NL_Port ID must be 0x0000. These devices cannot communicate with
any port not in the local loop. [Ref. PLDA [5] Section 4]
Public Loop Device: A device whose Area and Domain components of the
NL_Port ID cannot be 0x0000. Additionally, to be FLA compliant, the
device must attempt to open AL_PA 0x00 and attempt FLOGI. These devices
communicate with devices on the local loop as well as devices on the
other side of a Fabric. [Ref. PLDA [5] Section 4 and FLA [4] Section 4]

Link: Two unidirectional paths flowing in opposite directions and
connecting two Ports within adjacent Nodes.

LOGO: TBD

Node: A collection of one or more Ports identified by a unique World
Wide Node Name (WW Node Name).

Port: The transmitter, receiver and associated logic at either end of a
link within a Node. There may be multiple Ports per Node. Each Port is
identified by a unique Port_ID, which is volatile, and a unique World
Wide Port Name (WW Port Name), which is unchangeable. In this document,
the term "port" may be used interchangeably with NL_Port or N_Port.

Port_ID: Fibre Channel ports are addressed by unique 24-bit Port_IDs.
In a Fibre Channel frame header, the Port_ID is referred to as S_ID
(Source ID) to identify the port originating a frame, and D_ID to
identify the destination port. The Port_ID of a given port is volatile
(changeable). The mechanisms through which a Port_ID may change in a
Fibre Channel topology are outside the scope of this document.

PLOGI: TBD

World Wide Port_Name (WWP_N): Fibre Channel requires each Port to have
an unchangeable WWP_N. Fibre Channel specifies a Network Address
Authority (NAA) to distinguish between the various name registration
authorities that may be used to identify the WWP_N. A 4-bit NAA
identifier, a 12-bit field set to 0x0 and an IEEE 48-bit MAC address
together make this a 64-bit field. (Ref. FC-PH [2], Section 19.3.)

World Wide Node_Name (WWN_N): Fibre Channel identifies each Node with
an unchangeable WWN_N. In a single port Node, the WWN_N and the WWP_N
may be identical.
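
The 64-bit WWP_N layout described above (4-bit NAA, 12-bit zero field,
48-bit MAC address) can be illustrated with a short sketch. This is not
part of the specification. The function names are hypothetical, and the
NAA code 0x1 for the IEEE 48-bit format is an assumption that should be
checked against FC-PH [2].

```python
# Illustrative sketch only: packs and unpacks a WWP_N laid out as
# 4-bit NAA | 12-bit zero field | 48-bit IEEE MAC address.
# Assumption: NAA code 0x1 selects the IEEE 48-bit format.
NAA_IEEE_48BIT = 0x1


def mac_to_wwpn(mac: bytes) -> bytes:
    """Build the 8-byte WWP_N for a 6-byte MAC address."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    value = (NAA_IEEE_48BIT << 60) | int.from_bytes(mac, "big")
    return value.to_bytes(8, "big")


def wwpn_to_mac(wwpn: bytes) -> bytes:
    """Return the 6 lower bytes of a WWP_N after checking its format."""
    if len(wwpn) != 8:
        raise ValueError("WWP_N must be 8 bytes")
    value = int.from_bytes(wwpn, "big")
    naa = value >> 60                  # upper 4 bits
    zero_field = (value >> 48) & 0xFFF  # next 12 bits
    if naa != NAA_IEEE_48BIT or zero_field != 0:
        raise ValueError("WWP_N is not in IEEE 48-bit format")
    return wwpn[2:]
```

A caller would typically apply wwpn_to_mac to the WWP_N taken from a
PLOGI or ACC payload, and mac_to_wwpn before issuing a Name Server
query keyed on the WW Port Name.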
APPENDIX - B MECHANISMS FOR MAINTAINING FC-LAYER MAPPINGS
(MAC Address to Port Address Tables)

This appendix presents several possible approaches that might be used
to create and maintain MAC Address to Port Address tables. The
preferred method is a configuration/administration issue, and may be
implementation-dependent. Each method should have some mechanism to
ensure PLOGI has completed successfully before data is sent. A related
concern in large networks is limiting concurrent logins to only those
ports with active IP traffic.

B.1 Method 1 - Login on Cached Mapping Information

This method insulates the level performing LOGIN from the level
interpreting ARP. It is more accommodating of non-ARP mechanisms for
building the FC-layer mapping table.

- When any Broadcast Message is received which contains a Network
  Header, cache the S_ID from the FC header and the corresponding
  Network_Source_Address from the Network Header. The cache represents
  a correlation of Port_IDs to WW Port Names. If the received Broadcast
  message is compliant with this specification, the WW Port Name will
  be a MAC Address. This method may also accommodate other NAA types.

- The WW Port Name is "available" if Login has been performed to the
  Port_ID and flagged. If login has not been performed, the WW Port
  Name is "unavailable".

- If an outbound packet is destined for a port that is "unavailable",
  the cached info is used to look up the Port_ID.

- Send an ELS PLOGI command (Port Login) to the Port. By waiting for an
  outbound packet before initiating login, login resources are reserved
  only for those ports which wish to establish communication.

- After Port Login completes (ACC received), the outbound packet can be
  forwarded.

- At this point in time, both ends have the necessary information to
  complete their IP address / MAC Address / Port_ID association.
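
The steps of Method 1 can be sketched as a small cache keyed on the MAC
address. This is only an illustration under stated assumptions: the
class and method names are hypothetical, the two callbacks stand in for
whatever a driver uses to issue a PLOGI and to transmit a frame, and
queuing packets while login is outstanding is one possible policy, not
a requirement of this specification.

```python
# Illustrative sketch of Method 1; all names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Entry:
    port_id: int                 # S_ID learned from a broadcast
    logged_in: bool = False      # "available" once PLOGI ACC is seen
    pending: list = field(default_factory=list)


class MappingCache:
    def __init__(self, send_plogi, send_frame):
        self._by_mac = {}
        self._send_plogi = send_plogi    # callable(port_id)
        self._send_frame = send_frame    # callable(port_id, packet)

    def on_broadcast(self, s_id, src_mac):
        """Cache S_ID against the Network_Source_Address."""
        entry = self._by_mac.get(src_mac)
        if entry is None or entry.port_id != s_id:
            # Port_IDs are volatile: a changed S_ID needs a new login.
            # Packets queued for the stale Port_ID are dropped here.
            self._by_mac[src_mac] = Entry(port_id=s_id)

    def send(self, dst_mac, packet):
        """Forward a packet, logging in first if needed."""
        entry = self._by_mac.get(dst_mac)
        if entry is None:
            return False         # no mapping cached yet
        if entry.logged_in:
            self._send_frame(entry.port_id, packet)
        else:
            first = not entry.pending
            entry.pending.append(packet)
            if first:            # one PLOGI per unavailable port
                self._send_plogi(entry.port_id)
        return True

    def on_plogi_acc(self, port_id):
        """Mark the port available and flush queued packets."""
        for entry in self._by_mac.values():
            if entry.port_id == port_id:
                entry.logged_in = True
                queued, entry.pending = entry.pending, []
                for packet in queued:
                    self._send_frame(port_id, packet)
```

Because PLOGI is sent only from send(), login resources are committed
only for ports that actually have outbound traffic, which matches the
intent stated for this method.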
B.2 Method 2 - Login on ARP Parsing

This method performs LOGIN sooner by parsing ARP before passing it up
to higher levels for IP/MAC Address correlation. It requires a
low-level awareness of the local IP address, and is therefore
protocol-specific.

- When an ARP Broadcast Message is received, extract the S_ID from the
  FC header and the corresponding Network_Source_Address from the
  Network Header.

- Parse the ARP payload to determine whether (a) the local port is the
  target of the ARP request (Target IP Address match), and (b) the
  local port is currently logged in with the port (Port_ID = S_ID)
  originating the ARP broadcast.

- Pass the ARP to the higher level for ARP Response generation.

- If a Port Login is required, an ELS PLOGI command (Port Login) is
  sent immediately to the Port originating the ARP Broadcast.

- After Port Login completes, an ARP response can be forwarded. Note
  that there are two possible scenarios: (1) the ACC to PLOGI returns
  before the ARP reply is processed, and the ARP Reply is immediately
  forwarded; (2) the ARP reply is delayed, waiting for ACC (successful
  Login).

- At this point in time, both ends have the necessary information to
  complete their IP address / MAC Address / Port_ID association.

B.3 Method 3 - Use of Name Server

This method is preferred in environments where a Name Server is
required. FC-FLA [4] compliant topologies require a Name Server, while
FC-PLDA [5] devices may not be able to access the well-known Name
Server address, even if one exists.

- A Name Server may be referenced to resolve unmapped MAC addresses.

- Any upper layer send request for which there is not a Port_ID to MAC
  address mapping can trigger a query to a name server.

- The format of the Name Server query and response is outside the scope
  of this document. See FC-FLA [4] for a typical example.

- A preferred Name Server implementation is described in [ns008.pdf on
  ftp.network.com].
  The MAC address must be re-formatted in the 64-bit WW Port Name
  format before the query is issued.

- The query response from the Name Server must contain the Port_ID
  associated with the MAC Address specified in the query.

- Send an ELS PLOGI command (Port Login) to the Port.

- After Port Login completes, the outbound packet can be forwarded.

- At this point in time, both ends have the necessary information to
  complete their IP address / MAC Address / Port_ID association.

B.4 Method 4 - Login to Everyone

- In Fibre Channel topologies with a limited number of ports, it may be
  efficient to unconditionally log in to each port. This method is
  discouraged in fabric and public loop environments.

- After Port Login completes, the MAC Address to Port_ID Address tables
  can be constructed.

B.5 Method 5 - Static Table

- In some loop environments with a limited number of ports, a static
  mapping from a MAC Address to Port_ID (D_ID or AL_PA) may be
  maintained. The FC layer will always know the destination Port_ID
  based on the table. The table is typically downloaded into the driver
  at configuration time. This method scales poorly, and is therefore
  not recommended.

APPENDIX - C

C.1 RELIABILITY IN CLASS 3

Problem: Sequence ID reuse in Class 3 can conceivably result in missing
frame aliasing with no corresponding detection at the FC-2 level.

Prevention: This specification requires one of the following methods if
Class 3 is used.

- Continuously increasing Sequence Count (new Login Bit); both sides
  must set it. When an N_Port sets the PLOGI login bit for continuously
  increasing SEQ_CNT, it is guaranteeing that it will transmit all
  frames within an exchange using a continuously increasing SEQ_CNT
  (see description below).

- After using all SEQ_IDs (0-255) once, a new Exchange must be started.
  It is recommended that a minimum of 4 Exchanges be used before an
  OX_ID can be reused.
  Note: If an implementation is not checking the OX_ID when
  reassembling sequences, the problem can still occur. Cycling through
  some number of SEQ_IDs, then jumping to a new exchange does not solve
  the problem. SEQ_IDs must still be unique between two N_Ports, even
  across exchanges.

- Use only single-frame Sequences.

C.2 CONTINUOUSLY INCREASING SEQ_CNT

This method allows the recipient to check incoming frames, knowing
exactly what SEQ_CNT value to expect next. Since the SEQ_CNT will not
repeat for 65,536 frames, the aliasing problem is significantly
reduced.

A login bit (PLOGI) is used to indicate that a device always uses a
continuously increasing SEQ_CNT, even across transfers of sequence
initiative. This bit is necessary for interoperability with some
devices, and it provides other benefits as well.

In the FC-PH-3 Rev. 9.1 specification, the following paragraph would go
in section 23.6.3.3:

Word 1, bit 17 - SEQ_CNT (S)
   0 = Normal FC-PH rules apply
   1 = Continuously Increasing SEQ_CNT

Any N_Port that sets Word 1, Bit 17 = 1, is guaranteeing that it will
transmit all frames within an exchange using a continuously increasing
SEQ_CNT. Each exchange shall start with SEQ_CNT = 0 in the first frame,
and every frame transmitted after that shall increment the previous
SEQ_CNT by one, even across transfers of sequence initiative. Any
frames received from the other N_Port in the exchange shall have no
effect on the transmitted SEQ_CNT.

[This INTERNET DRAFT expires on September 1, 1998]
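
The continuously increasing SEQ_CNT rule in C.2 can be sketched as a
pair of per-exchange counters, one on the transmitting side and one on
the receiving side. This is an illustration only. The class names are
hypothetical, and wrapping from 65,535 back to 0 is an assumption based
on SEQ_CNT being a 16-bit field; the text above states only that the
value does not repeat for 65,536 frames.

```python
# Illustrative sketch only; names are hypothetical.
# Assumption: SEQ_CNT is 16 bits wide and wraps to 0 after 65,535.
SEQ_CNT_MODULUS = 1 << 16


class ExchangeTransmitter:
    """Assigns SEQ_CNT to outbound frames of one exchange."""

    def __init__(self):
        self._next = 0           # first frame of the exchange uses 0

    def next_seq_cnt(self):
        value = self._next
        # Incremented by one per frame, regardless of sequence
        # boundaries or transfers of sequence initiative.
        self._next = (self._next + 1) % SEQ_CNT_MODULUS
        return value


class ExchangeReceiver:
    """Checks that inbound frames of one exchange arrive in order."""

    def __init__(self):
        self._expected = 0

    def accept(self, seq_cnt):
        """Return True if seq_cnt is the value expected next."""
        if seq_cnt != self._expected:
            return False         # missing or misordered frame
        self._expected = (self._expected + 1) % SEQ_CNT_MODULUS
        return True
```

A receiver that sees accept() return False has detected a gap at the
FC-2 level, which is the detection that plain SEQ_ID reuse in Class 3
does not provide.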