Network	Working	Group					     Ralph Droms
INTERNET DRAFT					     Bucknell University

							      Greg Rabil
							     Mike Dooley
							      Arun Kapur
						       Quadritek Systems

							     Kim Kinnear
						       American	Internet

							    Steve Gonczi
							     Bernie Volz
							Process	Software

							     August 1998
						      Expires March 1999


			 DHCP Failover Protocol
		    <draft-ietf-dhc-failover-02.txt>


Status of this Memo

   This	document is an Internet-Draft. Internet-Drafts are working
   documents of	the Internet Engineering Task Force (IETF), its	areas,
   and its working groups. Note	that other groups may also distribute
   working documents as	Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to	use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the	current	status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim),	ftp.ietf.org (US East Coast), or
   ftp.isi.edu (US West	Coast).

Abstract

   DHCP	[RFC 2131] allows for multiple servers to be operating on a
   single network. Some	sites are interested in	running	multiple servers
   in such a way so as to provide redundancy in	case of	server failure.
   In order for	this to	work reliably, the cooperating Primary and
   Secondary servers must maintain a consistent	database of the	lease


Droms, et. al.						        [Page 1]

DRAFT							    January 1998


   information.	 This implies that servers will	need to	coordinate any
   and all lease activity so that this information is synchronized in
   case	of failover.

   This	document defines a protocol to provide this synchronization
   between two servers.	One server is designated the "Primary" server,
   the other is	the "Secondary"	server.	Additionally, this document
   describes a protocol for the automatic transfer of control from the
   Primary to the Secondary when the Primary fails (failover) or when a
   network partition occurs.

   This	document is a merge of draft-ietf-dhc-failover-01.txt and
   draft-ietf-dhc-safe-failover-proto-00.txt, along with substantial
   changes to each.  Unfortunately, this merge was not completed with
   sufficient time to allow review by any of the authors of draft-ietf-
   dhc-failover-01.txt,	and so it may well not reflect their views even
   though their	names appear as	authors.  See Section 11, issue	#1 and
   Section 12 for more details.


1.  Introduction

   As the use of DHCP servers in networked environments	grows, the
   dependency of those networks	on the DHCP server increases.  This is
   particularly	true of	the hosts that receive their configuration
   information from the	DHCP server.  Therefore, it is very important to
   be able to provide reliable,	continuous availability	of DHCP	ser-
   vices.

   This	specification describes	a protocol to support automatic	failover
   from	a primary to its secondary server.  The	failover mechanism
   allows the secondary	server to perform DHCP actions while the primary
   is down, or when a network failure prevents the primary and secondary
   from	communicating.	The protocol also specifies how	reintegration is
   achieved when the primary again becomes operational or when the pri-
   mary	and secondary can again	communicate.

   In providing	the specification for the failover, the	protocol speci-
   fies	how to guarantee reliable delivery of changes to the secondary.
   This	is required to synchronize the secondary's lease data with that
   of the primary.  The	protocol further specifies a mechanism to allow
   the secondary to determine if it can	communicate with the primary
   server.  The	secondary will automatically begin to service DHCP
   requests whenever it	cannot communicate with	the primary.  When the
   primary server becomes available again, the secondary will convey any
   changes that	occurred since the time	of failover back to the	primary.

   Through careful control of the difference between the lease times
   offered to DHCP clients and the lease time known by the secondary
   server, the protocol allows the primary to communicate with the
   secondary after the primary has completed communication with the DHCP
   client (a technique known as "lazy" update) while still guaranteeing
   that duplicate IP address allocations do not occur.  Thus, the
   protocol does not directly impact the ability of a DHCP server to
   respond to DHCP client requests.

1.1.  Requirements Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"	in this
   document are	to be interpreted as described in RFC 2119 [RFC	2119].


1.2.  DHCP Terminology

   This	document uses the following terms:

	o "DHCP	client"	or "client"

	  A DHCP client	is an Internet host using DHCP to obtain confi-
	  guration parameters such as a	network	address.

	o "DHCP	server"	or "server"

	  A DHCP server	is an Internet host that returns configuration
	  parameters to	DHCP clients.

	o "binding"

	  A binding is a collection of configuration parameters, includ-
	  ing at least an IP address, associated with or "bound	to" a
	  DHCP client.	Bindings are managed by	DHCP servers.

	o "binding database"

	  The collection of bindings managed by	a primary and secondary.

	o "subnet address pool"

          A subnet address pool is the set of IP addresses that is
          associated with a particular network number and subnet mask.
          In the simple case, there is a single network number and
          subnet mask and a set of IP addresses.  In the more complex
          case (sometimes called "secondary subnets", sometimes
          "superscopes"), several (apparently unrelated) network number
          and subnet mask combinations with their associated IP
          addresses may all be configured together into one subnet
          address pool.

	o "primary server" or "primary"

	  A DHCP server	configured to provide primary service to a set
	  of DHCP clients for a	particular set of subnet address pools.

	o "secondary server" or	"secondary"

	  A DHCP server	configured to act as backup to a primary server
	  for a	particular set of subnet address pools.

	o "stable storage"

	  Every	DHCP server is assumed to have some form of what is
	  called "stable storage".  Stable storage is used to hold
	  information concerning IP address bindings (among other
	  things) so that this information is not lost in the event of a
	  server failure which requires	restart	of the server.


1.3.  Requirements for this protocol

   The following requirements must be (and are) met by this protocol.

	1. Implementations of this protocol must work with existing DHCP
	   client implementations based	on the DHCP protocol [1].

	2. Implementations of the protocol must	work with existing BOOTP
	   relay implementations.

	3. The protocol	must provide failover redundancy between servers
	   that	are not	located	on the same subnet.


1.4.  Goals for	this protocol


	1. Provide for continued service to DHCP clients through an
	   automated mechanism in the event of failure of the Primary
	   Server.

	2. Avoid binding an IP address to a client while that binding is
	   currently valid for another client.	In other words,	don't
	   allocate the	same IP	address	to two clients.

        3. Minimize any need for manual administrative intervention.

        4. Introduce no additional delays in server response time as a
	   result of inter-server communication.

	5. Share IP address ranges between primary and secondary
	   servers; i.e., impose no requirement	that the pool of avail-
	   able	addresses be divided between servers.

	6. Continue to meet the	goals and objectives of	this protocol in
	   the event of	server failure or network partition.

	7. Provide graceful reintegration of full protocol service after
	   server failure or network partition.

	8. Allow for one computer to act as a Secondary	Server for mul-
	   tiple Primary Servers. Other	topologies (e.g.: mesh)	are also
	   possible.  Primary and Secondary Servers SHOULD be viewed as
	   "logical" servers and not necessarily physical computers.

	9. Ensure that an existing client can keep its existing	IP
	   address binding if it can communicate with either the Primary
	   or Secondary	DHCP server implementing this protocol - not
	   just	whichever server that originally offered it the	binding.

        10. Ensure that a new client can get an IP address from some
            server.  Ensure that in the face of partition, where servers
            continue to run but cannot communicate with each other, the
            above goals and requirements may be met.  In addition, when
            the partition condition is removed, allow graceful automatic
            re-integration without requiring human intervention.

        11. If either Primary or Secondary Server loses all of the
            information that it has stored in stable storage, it should
            be able to refresh its stable storage from the other server.


1.5.  Limitations of this Protocol

   The following are explicit limitations of this protocol.

        1. Under normal operation, only one server at a time will
           service DHCP client requests; this protocol provides
           reliability through redundancy but not load balancing.

	2. This	protocol provides only one level of redundancy through a
	   single Secondary Server for each Primary Server.

        3. The protocol provides a way to detect when the primary and
           secondary servers cannot communicate, but once this condition
           has been detected, it does not (indeed, cannot) provide any
           way to further distinguish between network failure and
           failure of one of the servers.

	4. A small number of IP	addresses are reserved for Secondary
	   Server use.	In order to handle the failure case where both
	   servers are able to communicate with	DHCP clients, but unable
	   to communicate with each other, a small number of IP
	   addresses must be set aside as a private address pool for the
	   Secondary Server. The Secondary can use these to service
	   newly arrived DHCP clients during such a period.  The size of
	   this	private	pool SHOULD be based only on the arrival rate of
	   new DHCP clients and	the length of expected downtime, and is
	   not influenced in any way by	the total number of DHCP clients
	   supported by	the server pair.

	5. The Primary and Secondary Servers SHOULD pause normal DHCP
	   transaction processing while	resynchronizing, after a system
	   failure.


2.  Protocol Operations

   The messages needed to provide redundant (failover) servers can be
   grouped into three areas:

	o Messages to keep the Secondary Server's lease	data synchron-
	  ized with that of the	Primary	so that	when failover occurs,
	  there	is no degradation of service.

	o Messages that	allow the Secondary to determine the operational
	  state	of the Primary,	so as to know when to start servicing
	  DHCP traffic.

	o Messages that	are used to coordinate the Primary regaining
	  control when it has become available again.


2.1.  Time synchronization between communicating servers

   Each	Binding	update message carries a "sent time stamp" (the	time
   when	the message was	sent in	GMT). This provides a simple mechanism
   to determine	any "time drift" between communicating servers.

   DISCUSSION:

      If a UDP packet is successfully transmitted (i.e., it does not
      get lost), the packet travel time is negligible in the framework
      of DHCP leases.  By providing a GMT "sent time" stamp, the
      recipient can compare this with its notion of the current GMT time
      at the time it receives the packet.  The difference (which also
      includes the packet travel time, ignored here) is the time drift.
      The recipient can use this time drift value to bias all "absolute
      time" values it receives from the sender.
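
   The drift calculation above can be sketched as follows.  This is a
   minimal illustration, not part of the protocol definition: the
   function names are hypothetical, and the sign convention (drift is
   added to values received from the sender) is an assumption that
   follows from defining drift as arrive time minus sent time.

```python
import time


def time_drift(sent_time_gmt, arrive_time_gmt=None):
    """Return the receiver's arrival time minus the sender's time stamp.

    Packet travel time is ignored, as in the discussion above.
    """
    if arrive_time_gmt is None:
        arrive_time_gmt = int(time.time())
    return arrive_time_gmt - sent_time_gmt


def to_local_time(absolute_time_from_sender, drift):
    """Bias an absolute time value from the sender into the local clock."""
    return absolute_time_from_sender + drift


# The sender stamped the packet at 1000; it arrived when the local
# clock read 1007, so the local clock runs 7 seconds ahead.
drift = time_drift(sent_time_gmt=1000, arrive_time_gmt=1007)
lease_grant_local = to_local_time(5000, drift)
```

   With these assumed values, a lease grant time of 5000 on the
   sender's clock is stored as 5007 on the recipient's clock.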

2.2.  Failover Protocol	Messages

   The Failover Protocol messages are encoded using a packet format
   specific to the Failover Protocol.  To allow easy recognition of
   Failover Protocol messages, BOOTP packet "op" field values 3..14 are
   proposed to mark various Failover Protocol messages.  A Failover
   Protocol message is always unicast from the source to the
   destination.  The sender, never the recipient, is responsible for
   reliable retransmission.

2.3.  Failover Protocol	packet header format


   0		       1		   2		       3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |	 op (1)	   |	 rev (1)   |	    payload offset (2)	   |
   +---------------+---------------+---------------+---------------+
   |				xid (4)				   |
   +---------------------------------------------------------------+
   |	     0 or more additional header bytes	(variable)	   |
   +---------------------------------------------------------------+
   |	       Payload data, formatted as DHCP-style options	   |
   |	       (although using a unique	option number space)	   |
   |			       (variable)			   |
   +---------------------------------------------------------------+


   op -	1 byte

   These values	extend the number space	of the existing	BOOTP message
   type	"Op" field.  The following types are defined:


   3		   DHCPPOOLREQ
   4		   DHCPPOOLRESP
   5		   DHCPBNDUPD
   6		   DHCPBNDACK
   7		   DHCPPOLL
   8		   DHCPPRPL
   9		   DHCPCTLREQ
   10		   DHCPCTLRET
   11		   DHCPCTLACK
   12		   DHCPCTLACKACK
   13		   DHCPREQUEREQ
   14		   DHCPREQUERESP

   rev - 1 byte

   Failover protocol version supported.	Set to 1 for the Failover Proto-
   col described in this draft.

   payload offset - 2 bytes, network byte order

   The byte offset of the Payload area,	from the beginning of the Fail-
   over	packet header. The value for the current protocol version is 8.

   xid - 4 bytes, network byte order

   The sender of a failover protocol packet is responsible for setting
   this	number,	and the	receiver of the	packet copies the number over
   into	any response packet.  To the receiver it is opaque.  The sender
   SHOULD ensure that every packet sent	to a particular	IP address and
   port	combination has	a unique transaction id	unless that packet is a
   re-transmission.
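
   As an illustration of the header layout described above, the
   following sketch packs and parses the fixed 8-byte header.  The
   function and constant names are hypothetical; only the field sizes,
   the byte order, the rev value of 1 and the payload offset of 8 come
   from the text.

```python
import struct

# op (1 byte), rev (1 byte), payload offset (2 bytes), xid (4 bytes),
# all in network byte order.
HEADER_FORMAT = "!BBHI"
HEADER_LEN = struct.calcsize(HEADER_FORMAT)  # 8 for this protocol version


def pack_header(op, xid, rev=1, payload_offset=HEADER_LEN):
    """Build the fixed part of a Failover Protocol packet header."""
    return struct.pack(HEADER_FORMAT, op, rev, payload_offset, xid)


def unpack_header(packet):
    """Return (op, rev, xid, payload) for a received packet.

    The payload is located with the payload offset field, so any
    additional header bytes are skipped.
    """
    op, rev, payload_offset, xid = struct.unpack_from(HEADER_FORMAT, packet)
    if payload_offset < HEADER_LEN or payload_offset > len(packet):
        raise ValueError("payload offset outside packet")
    return op, rev, xid, packet[payload_offset:]


# A DHCPPOLL (op 7) header followed by three bytes of payload.
example_packet = pack_header(op=7, xid=0x01020304) + b"\xe6\x01\x02"
```

   Reading the payload through the offset field, rather than assuming
   a fixed length, lets a receiver skip additional header bytes it
   does not understand.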

2.4.  DHCPPOOLREQ and DHCPPOOLRESP:

   Whenever the	Secondary server transitions into NORMAL mode, it first
   sends a DHCPPOOLREQ message	to initiate a transfer of a small range
   of IP addresses that	will serve as its private address pool.

   This	is necessary, because initially	the Secondary server has no such
   address pool, and its pool gets depleted when it hands out addresses
   in COMMUNICATION-INTERRUPTED	mode. This is why the request is sent
   every time the Secondary server transitions into NORMAL mode.  The
   DHCPPOOLREQ message does not	carry any payload data.	When the Primary
   Server gets a DHCPPOOLREQ message, it computes which	addresses should
   be transferred to the Secondary, and	queues up  DHCPBNDUPD transac-
   tions, setting the Status of	these bindings to "BACKUP".  Having done
   this, it sends a DHCPPOOLRESP message.  The DHCPPOOLRESP message
   carries the "Number of addresses transferred" as its payload.

   The Secondary server keeps sending DHCPPOOLREQ messages until it
   receives a DHCPPOOLRESP with "Number of addresses transferred" = 0,
   or it decides that the partner is not responding.  Each one of these
   messages MUST have the same transaction ID.  If a new transaction ID
   is used in one of these messages, the receiving server will begin the
   transmission of the DHCPBNDUPD messages all over again.  To be clear,
   if the Secondary Server receives a DHCPPOOLRESP message with "Number
   of addresses transferred" > 0, it MUST send another DHCPPOOLREQ
   message.  This mechanism makes it possible for the Primary Server to
   pace the transfer (e.g., it could generate all of the addresses at
   once, or one by one).

   The Primary Server must respond to each DHCPPOOLREQ message it
   receives. If	it has already generated all private addresses,	or it
   has no available addresses, it MUST send  DHCPPOOLRESP with "Number
   of addresses	transferred" = 0.
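
   The request loop described in this section can be sketched as
   follows.  This is a simplified model under stated assumptions: the
   network exchange is replaced by a callable, FakePrimary is a
   hypothetical stand-in for the Primary Server, and the max_requests
   bound is an illustrative safeguard that the text does not define.

```python
def request_private_pool(send_poolreq, xid, max_requests=100):
    """Repeat DHCPPOOLREQ with one xid until the partner reports 0.

    send_poolreq(xid) stands in for the request/response exchange and
    returns the "Number of addresses transferred", or None when the
    partner does not respond.
    """
    total = 0
    for _ in range(max_requests):
        transferred = send_poolreq(xid)
        if transferred is None:
            raise TimeoutError("partner is not responding")
        if transferred == 0:
            return total
        total += transferred
    raise RuntimeError("transfer did not complete")


class FakePrimary:
    """Hypothetical Primary that paces the transfer in small batches."""

    def __init__(self, available, batch):
        self.available = available
        self.batch = batch
        self.xid = None
        self.remaining = 0

    def handle_poolreq(self, xid):
        if xid != self.xid:
            # A new transaction ID restarts the transfer from scratch.
            self.xid = xid
            self.remaining = self.available
        count = min(self.batch, self.remaining)
        self.remaining -= count
        return count


primary = FakePrimary(available=5, batch=2)
pool_size = request_private_pool(primary.handle_poolreq, xid=42)
```

   In this model the Primary answers 2, 2, 1 and then 0, so the
   Secondary sends four requests, all carrying the same transaction ID.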


2.5.  DHCPREQUEREQ and DHCPREQUERESP:

   Whenever either server wishes to be updated with the information that
   the other server knows and has not yet transmitted to it, it sends a
   DHCPREQUEREQ.

   The DHCPREQUEREQ message does not carry any payload data.  When
   either server gets a DHCPREQUEREQ message, it computes which updates
   should be transferred to the requesting server, and queues up
   DHCPBNDUPD transactions as appropriate.  Having done this, it sends a
   DHCPREQUERESP message.  The DHCPREQUERESP message carries the "Number
   of addresses queued up" as its payload.  The set of binding updates
   queued up will depend on the requesting server's state.  (The state
   has already been communicated via prior DHCPPOLL/DHCPPRPL messages.)

   The Secondary server keeps sending DHCPREQUEREQ messages until it
   receives a DHCPREQUERESP with "Number of addresses queued up" = 0,
   or it decides that the partner is not responding.  This is the same
   approach that is used for the DHCPPOOLREQ/DHCPPOOLRESP messages.
   Each one of these DHCPREQUEREQ messages MUST have the same
   transaction ID.  Use of a new transaction ID will cause re-building
   of the outgoing binding update queue.

   The Primary Server must respond to each DHCPREQUEREQ message it
   receives.  If it has already queued up all of the previously unsent
   binding updates, then it MUST send DHCPREQUERESP with "Number of
   addresses queued up" = 0.


2.6.  DHCPBNDUPD

   With this message, the Primary notifies the Secondary (or the other
   way around) of a change in the state and data of a binding.

   In response to a binding update, the	recipient server MUST respond
   with	a  DHCPBNDACK message.	Multiple binding updates can be	batched
   up, and sent	in one Failover	Protocol message.


2.7.  DHCPBNDACK

   This message implements a positive or negative acknowledgment of
   one or more binding updates.

   A binding update (or a batch of binding updates sent as one message)
   is matched with its associated acknowledgment by having the same xid
   field value in the message header.

   The server sending a DHCPBNDACK message MAY include any of the
   options that are acceptable in a DHCPBNDUPD message when the
   DHCPBNDACK message is returned to the sender.  If any of this
   information differs from the information in the DHCPBNDUPD message,
   the receiver SHOULD update its bindings database with that
   information upon receipt of the DHCPBNDACK message.

   The DHCPBNDACK MAY selectively reject one or more updates by
   including one or more IP address - Reject Reason option pairs in the
   message body.

   The DHCPBNDACK implicitly acknowledges any binding updates it replies
   to, except those it enumerates using Reject Reason Codes.
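
   One possible way for the sender of a DHCPBNDUPD to process the
   matching DHCPBNDACK is sketched below.  The data structures are
   assumptions made for illustration: pending updates are kept in a
   dictionary keyed by xid, and the IP address - Reject Reason pairs
   are assumed to have been decoded already into a second dictionary.

```python
def apply_bndack(pending, ack_xid, rejected):
    """Split one acknowledged batch into accepted and rejected updates.

    pending  -- dict mapping xid to the list of IP addresses sent in
                the DHCPBNDUPD with that xid
    ack_xid  -- xid copied into the DHCPBNDACK header
    rejected -- dict mapping IP address to Reject Reason code
    """
    batch = pending.pop(ack_xid, None)
    if batch is None:
        # No outstanding update with this xid (e.g., a duplicate ack).
        return [], {}
    accepted = [ip for ip in batch if ip not in rejected]
    refused = {ip: rejected[ip] for ip in batch if ip in rejected}
    return accepted, refused


pending_updates = {7: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
# Reason code 2: fatal conflict, address in use by another client.
accepted, refused = apply_bndack(pending_updates, 7, {"10.0.0.2": 2})
```

   Every address in the batch that is not listed with a Reject Reason
   is treated as implicitly acknowledged.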


2.8.  DHCPPOLL

   In order to determine the state of a given server, or to communicate
   a critical change in its own status, a participant can use the
   DHCPPOLL message.

   This message inquires about the current state of the recipient, and
   tells the recipient what state the sender is in.

   After sending a DHCPPOLL message, the participant listens for a
   DHCPPRPL message in response.


2.9.  DHCPPRPL

   This	message	replies	to the DHCPPOLL	message	(PRPL=Poll reply). The
   DHCPPRPL also carries server	status information (see	message	payload
   details below).

   After a failover, when the Primary Server is	restarted, the following
   messages are	used to	coordinate the Primary taking control back from
   the Secondary:

   DHCPCTLREQ	  - Request for	control
   DHCPCTLRET	  - Return of control initiated
   DHCPCTLACK	  - Return of control completed
   DHCPCTLACKACK  - Return of control completed	message	acknowledged.

   The Primary Server sends a DHCPCTLREQ message, indicating that it
   would like to take control of the bindings database.	 The Secondary
   Server replies with a DHCPCTLRET message, which serves as a signal to
   the Primary "Stand by to receive binding updates".  This message then
   is followed by a set	of binding updates from	the secondary to the
   primary.  When all updates have been	transmitted (and acknowledged)
   from	Secondary to Primary,  a DHCPCTLACK message is sent from the
   Secondary to	the Primary, to	signal that "all updates from the Secon-
   dary	are now	completed".

   DISCUSSION:

      Note that the DHCPCTLACK message type must be transmitted
      reliably, as the Primary Server will not start servicing clients
      until it has received the DHCPCTLACK message.  To provide this
      reliability, the DHCPCTLACKACK message is provided.  This provides
      an acknowledgment of the DHCPCTLACK message, and the DHCPCTLACK
      message will be periodically re-sent until it is acknowledged.  We
      could just periodically re-send the DHCPCTLACK message until we
      start receiving binding updates from the Primary, but the Primary
      may not have any updates to send at all, hence the need for an
      explicit DHCPCTLACKACK message.

   The Primary Server transitions into NORMAL state upon receiving a
   DHCPCTLACK from the secondary, when the secondary has completed
   sending all of its updates during synchronization.  The DHCPCTLACKACK
   message is needed to prevent the primary from waiting, without
   servicing clients, if the DHCPCTLACK message is lost.  The Secondary
   server will keep re-sending the DHCPCTLACK message until:

        1. It decides that the primary is not responding, so the
           Secondary server goes into COMMUNICATION-INTERRUPTED mode.

        2. It receives a DHCPCTLACKACK or a DHCPBNDUPD message from the
           primary.  The Primary's DHCPBNDUPD messages would start
           arriving at the Secondary server if the Primary did get the
           DHCPCTLACK, but the DHCPCTLACKACK message got lost.
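
   The two stop conditions above can be sketched as a retransmission
   loop on the Secondary.  The callables, the max_attempts bound and
   the "ACKNOWLEDGED" result string are hypothetical; the text does
   not say how the Secondary decides that the primary is not
   responding, so a fixed number of attempts is assumed here.

```python
def resend_ctlack(send_ctlack, wait_for_message, max_attempts=5):
    """Re-send DHCPCTLACK until it is known to have been received.

    send_ctlack()      -- transmits one DHCPCTLACK
    wait_for_message() -- returns the name of the next message from the
                          primary, or None if the wait timed out
    """
    for _ in range(max_attempts):
        send_ctlack()
        reply = wait_for_message()
        # A DHCPBNDUPD shows that the primary got the DHCPCTLACK even
        # though its DHCPCTLACKACK was lost.
        if reply in ("DHCPCTLACKACK", "DHCPBNDUPD"):
            return "ACKNOWLEDGED"
    return "COMMUNICATION-INTERRUPTED"


sent = []
replies = iter([None, None, "DHCPBNDUPD"])
outcome = resend_ctlack(lambda: sent.append("DHCPCTLACK"),
                        lambda: next(replies))
```

   In this run the first two waits time out and the third returns a
   DHCPBNDUPD, so three DHCPCTLACK messages are sent in total.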

3.  Protocol Payload Data Format

   Payload data is encoded as a set of flexible DHCP/BOOTP style
   options (the usual 1 byte option code, 1 byte length, and "length"
   bytes of data).  The options are placed after the header, after
   skipping PayloadOffset bytes.  The payload data options are not
   preceded by a "cookie" value.

   Since the packet is NOT a DHCP/BOOTP protocol packet, the options
   used here do not conflict with any existing "proper" DHCP/BOOTP
   options.  In fact, these options are allocated in relationship to the
   DHCP option space in the following way.  In cases where the syntax
   and semantics of a Failover Payload Option are identical to those of
   a DHCP/BOOTP option, the same option number is used.  For options
   unique to the Failover protocol, option numbers starting at 230 are
   used.

   Thus, all new Failover Protocol option numbers are assigned from a
   continuous range beginning with 230.	 This number is	shown as an X in
   the tables below.

   The protocol	is permissive in allowing various other	DHCP options in
   binding updates.  As	long as	the sender wishes to use an option, it
   MAY include it. On the other	hand, the recipient MUST ignore	any
   option it is	not expecting.

   Multiple DHCPBNDUPD transactions can be batched together in one UDP
   packet.  The option set for each individual transaction MUST always
   begin with the IP address (Option 50).  This is the only restriction
   on payload item ordering; otherwise, payload data items can be
   included in any desired order.
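
   The option encoding and the batching rule can be sketched as
   follows.  The function names are hypothetical.  Two assumptions are
   made that the text does not settle: every option carries a length
   byte (no pad or end options), and options that appear before the
   first Option 50 are treated as applying to the whole packet.

```python
import ipaddress

OPT_ASSIGNED_IP = 50       # same code as DHCP Option 50
OPT_BINDING_STATUS = 231   # X+1, with X = 230
OPT_SENT_TIME_STAMP = 233  # X+3, with X = 230


def encode_option(code, data):
    """Encode one option as code byte, length byte, then data."""
    if not 0 <= code <= 255 or len(data) > 255:
        raise ValueError("code and length must each fit in one byte")
    return bytes([code, len(data)]) + data


def decode_options(payload):
    """Decode a payload into a list of (code, data) tuples."""
    options, i = [], 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated option header")
        code, length = payload[i], payload[i + 1]
        data = payload[i + 2:i + 2 + length]
        if len(data) != length:
            raise ValueError("truncated option data")
        options.append((code, data))
        i += 2 + length
    return options


def split_transactions(options):
    """Group options into transactions, each starting at an Option 50."""
    packet_level, transactions = [], []
    for code, data in options:
        if code == OPT_ASSIGNED_IP:
            transactions.append([(code, data)])
        elif transactions:
            transactions[-1].append((code, data))
        else:
            packet_level.append((code, data))
    return packet_level, transactions


ip1 = ipaddress.IPv4Address("10.0.0.1").packed
ip2 = ipaddress.IPv4Address("10.0.0.2").packed
payload = (encode_option(OPT_SENT_TIME_STAMP, b"\x00\x00\x00\x01")
           + encode_option(OPT_ASSIGNED_IP, ip1)
           + encode_option(OPT_BINDING_STATUS, b"\x02")
           + encode_option(OPT_ASSIGNED_IP, ip2)
           + encode_option(OPT_BINDING_STATUS, b"\x01"))
packet_level, transactions = split_transactions(decode_options(payload))
```

   Because each option set begins with Option 50, a receiver can find
   transaction boundaries without any additional framing.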

   In case an implementation chooses to	use the	DHCPBNDNAK mechanism,
   the DHCPBNDNAK message SHOULD contain one or	more Option 50s	from the
   NAK-ed message, to indicate which specific update items are being
   NAK-ed.

   While synchronization is in progress, the secondary MUST NOT
   accept client requests, and the primary MUST NOT send any updates to
   the secondary.  This is necessary to allow the Primary to be the sole
   arbiter of any conflicting updates.


3.1.  DHCP Server Status

   This	option is used to convey the current state of a	server.


    Code  Len  Type
   +--+---+------+
   | X|	1 | 1-15 |
   +--+---+------+

   Allowed values for this option:

   Value Message Type
   ----- ------------
   1	 UNKNOWN-STATE
   2	 PRIMARY-NORMAL		   Normal state
   3	 BACKUP-NORMAL
   4	 PRIMARY-COMINT		   Communication interrupted (safe)
   5	 BACKUP-COMINT
   6	 PRIMARY-PARTNERDOWN	   Partner down	(unsafe
				   mode)
   7	 BACKUP-PARTNERDOWN
   8	 PRIMARY-CONFLICT	   Synchronizing, after	a
				   "Partner-Down"
				   divergence
   9	 PRIMARY-SYNC		   Synchronizing, after	a
				   "communications-
				   interrupted"
				   divergence.
   10	 BACKUP-SYNC
   11	 PRIMARY-RECOVER	   Recovering ALL
				   bindings from partner
   12	 BACKUP-RECOVER
   13	 FAILOVER-DISABLED	   The server is running
				   with	the failover
				   protocol disabled.
				   (standalone)

   14    SERVER-PAUSED             The server is inactive,
                                   shutting down for a short period.
   15    SERVER-SHUTDOWN           The server is inactive,
                                   shutting down for an extended period.


   When	a server is being re-started, it should	send a DHCPPOLL	message
   to its partner, reporting its status	(SERVER-PAUSED).  In response,
   the recipient SHOULD go into COMMUNICATION-INTERRUPTED mode.

   When a server is being shut down, it should send a DHCPPOLL message
   to its partner, reporting its status	(SERVER-SHUTDOWN).

   In response,	the recipient SHOULD go	into PARTNER-DOWN mode.
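
   The state values in the table above, and the two recommended
   reactions to a partner's status report, can be sketched as follows.
   The enumeration member names mirror the table with hyphens replaced
   by underscores; the function name and the use of plain strings for
   modes are assumptions made for illustration.

```python
from enum import IntEnum


class ServerState(IntEnum):
    """Values of the DHCP Server Status option."""
    UNKNOWN_STATE = 1
    PRIMARY_NORMAL = 2
    BACKUP_NORMAL = 3
    PRIMARY_COMINT = 4
    BACKUP_COMINT = 5
    PRIMARY_PARTNERDOWN = 6
    BACKUP_PARTNERDOWN = 7
    PRIMARY_CONFLICT = 8
    PRIMARY_SYNC = 9
    BACKUP_SYNC = 10
    PRIMARY_RECOVER = 11
    BACKUP_RECOVER = 12
    FAILOVER_DISABLED = 13
    SERVER_PAUSED = 14
    SERVER_SHUTDOWN = 15


def mode_after_partner_report(partner_state, current_mode):
    """Return the mode to enter after the partner reports its state."""
    if partner_state == ServerState.SERVER_PAUSED:
        return "COMMUNICATION-INTERRUPTED"
    if partner_state == ServerState.SERVER_SHUTDOWN:
        return "PARTNER-DOWN"
    # Other reported states are outside the scope of this sketch.
    return current_mode


# The value 15 arrives in a DHCPPOLL from a partner that is shutting down.
new_mode = mode_after_partner_report(ServerState(15), "NORMAL")
```

   Both reactions are SHOULD-level recommendations in the text, so an
   implementation may have reasons to behave differently.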


3.2.  DHCP Binding Status

   This	option is used to convey the current state of a	binding. This
   option is mandatory for DHCPBNDUPD messages.


   Code	  Len  Type
   +-----+-----+-----+
   | X+1 |  1  | 1-7 |
   +-----+-----+-----+

   Legal values	for this option	are:

   Value   Message Type
   -----   ------------
   1	   FREE		  The lease has	never been used
   2	   ACTIVE	  assigned to a	client *
   3	   EXPIRED
   4	   RELEASED	  A client released the	lease
   5	   ABANDONED	  A server or client flagged address
			  as not usable.
   6	   RESET	  Lease	was freed by some
			  external agent.
   7	   BACKUP	  Lease	is set aside for Secondary
			  server's private address pool.


3.3.  Assigned IP address

   Uses	identical code and format to DHCP Option 50 (requested IP
   address).


   Code	  Len	       Address
   +-----+-----+-----+-----+-----+-----+
   |  50 |  4  |  a1 |	a2 |  a3 |  a4 |
   +-----+-----+-----+-----+-----+-----+


3.4.  Lease grant time

   This option carries an absolute GMT time value.  An absolute value
   can be used because time synchronization has already been achieved
   between the source and the target server using the Sent Time Stamp
   option.  The value is represented as seconds since Jan 1, 1970
   (i.e., the ANSI C time_t time value representation).


   Code	  Len		Time
   +------+-----+-----+-----+-----+-----+
   | X+2  |  4	|  t1 |	 t2 |  t3 |  t4	|
   +------+-----+-----+-----+-----+-----+


3.5.  Sent Time	Stamp

   A time stamp using GMT, recording when the packet was sent.  It is
   used to determine the time drift between the sender and the
   recipient.  The time drift is defined as the difference between
   "Arrive Time (GMT)" and "Send Time (GMT)".  The actual packet travel
   time is assumed to be negligible in this context.  All Date-Time
   values contained in Failover messages will be corrected by the time
   drift before being stored by the recipient.


   Code	  Len		Time
   +-----+-----+-----+-----+-----+-----+
   | X+3 |  4  |  t1 |	t2 |  t3 |  t4 |
   +-----+-----+-----+-----+-----+-----+


   The time is a 32 bit	unsigned long in network byte order, in	units of
   seconds (GMT	since EPOCH).


3.6.  Number of	addresses transferred to Secondary Server

   A 32 bit unsigned long in network byte order.  Reports the number of
   addresses transferred by the Primary to the Secondary Server
   (addresses to be used for the Secondary Server's private address
   pool).


   Code   Len           Number
   +-----+-----+-----+-----+-----+-----+
   | X+4 |  4  |  n1 |  n2 |  n3 |  n4 |
   +-----+-----+-----+-----+-----+-----+


3.7.  Lease Duration

   Uses the format and code of the standard DHCP IP Address Lease Time
   option.  It is used in exactly the same way as in the DHCPOFFER
   message of the DHCP protocol.  The time is in units of seconds, and
   is specified as a 32-bit unsigned integer.  A Lease Duration of
   0xFFFFFFFF indicates an infinite lease.


   Code	  Len	      Lease Time
   +-----+-----+-----+-----+-----+-----+
   |  51 |  4  |  t1 |	t2 |  t3 |  t4 |
   +-----+-----+-----+-----+-----+-----+


3.8.  Client Identifier

   The format, code and	conventions used are identical to DHCP option
   61.


   Code	  Len	Type  Client-Identifier
   +-----+-----+-----+-----+-----+---
   |  61 |  n  |  t1 |	i1 |  i2 | ...
   +-----+-----+-----+-----+-----+---


3.9.  Client Hardware Address

   The format is similar to DHCP option 61.  The type field (t1) MUST be
   set to the proper ARP hardware address code (it MUST NOT be zero).
   TBD: Reference the ARP document here.


   Code	  Len	Type  Client-Identifier
   +-----+-----+-----+-----+-----+---
   | X+5 |  n  |  t1 |	i1 |  i2 | ...
   +-----+-----+-----+-----+-----+---


   Either Client Id, Client Hardware Address or	BOTH MAY be present in
   binding update transactions.	At least one of	them MUST be present.
   If both are present,	the Client Id MUST be used to uniquely identify
   the owner of	the binding (exactly as	in RFC 2131).


3.10.  Host Name

   Uses	the format and code of DHCP option 12.


   Code	  Len		      Host Name
   +-----+-----+-----+-----+-----+-----+-----+-----+--
   |  12 |  n  |  h1 |	h2 |  h3 |  h4 |  h5 |	h6 |  ...
   +-----+-----+-----+-----+-----+-----+-----+-----+--


3.11.  Domain Name

   Uses	the format and code of DHCP option 15.


   Code	  Len	Domain Name
   +-----+-----+-----+-----+-----+-----+--
   |  15 |  n  |  d1 |	d2 |  d3 |  d4 |  ...
   +-----+-----+-----+-----+-----+-----+--


3.12.  Reject Reason Code

   This option is used to selectively reject binding updates.  It MAY be
   used in a DHCPBNDACK message, always following an option 50.  (The
   option 50 contains the IP address of the specific update being
   rejected.)


   Code   Len   Reason code
   +-----+-----+-----+
   | X+6 |  1  |  R1 |
   +-----+-----+-----+

   Reason codes	:

   1 Illegal IP	address	(not part of any address pool)
   2 Fatal conflict exists: address in use by other client.


3.13.  MDLI

   Maximum Delta Lease Interval, in seconds.  A 32 bit integer value,
   in network byte order.


   Code	  Len		Time
   +------+-----+-----+-----+-----+-----+
   | X+7  |  4	|  t1 |	 t2 |  t3 |  t4	|
   +------+-----+-----+-----+-----+-----+
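
   The option layouts above all share a one-octet code, a one-octet
   length, and a payload.  A minimal sketch of an encoder for three of
   them follows.  The draft leaves the base value X unassigned, so the
   base_x parameter (and the value 200 used in the usage note) is purely
   an assumption for illustration, as are the function names.

```python
import struct


def encode_option(code: int, payload: bytes) -> bytes:
    """Encode one option as: code octet, length octet, payload."""
    if not 0 <= code <= 255:
        raise ValueError("option code must fit in one octet")
    if len(payload) > 255:
        raise ValueError("payload must fit in a one-octet length")
    return bytes([code, len(payload)]) + payload


def encode_client_hw_address(base_x: int, hw_type: int, hw_addr: bytes) -> bytes:
    """Client Hardware Address (code X+5): type octet, then the address."""
    if hw_type == 0:
        raise ValueError("hardware type MUST NOT be zero")
    return encode_option(base_x + 5, bytes([hw_type]) + hw_addr)


def encode_reject_reason(base_x: int, reason: int) -> bytes:
    """Reject Reason Code (code X+6): a single reason octet."""
    return encode_option(base_x + 6, bytes([reason]))


def encode_mdli(base_x: int, seconds: int) -> bytes:
    """MDLI (code X+7): 32-bit unsigned integer in network byte order."""
    return encode_option(base_x + 7, struct.pack("!I", seconds))
```

   With the hypothetical base of 200, encode_mdli(200, 3600) produces
   the six octets 207, 4, 0, 0, 14, 16.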


4.  Exchange of	control	between	Primary	and Secondary

   The Primary and Secondary Servers coordinate the exchange of control
   over the bindings database through the use of DHCPPOLL and DHCPCTLREQ
   messages.  In normal operation:

   The Primary sends notification of each change to its	bindings data-
   base	to the Secondary, and the Secondary keeps its bindings database
   synchronized	with the Primary's database.

   The Secondary periodically sends DHCPPOLL messages to the Primary,
   and the Primary responds to each DHCPPOLL message with a DHCPPRPL
   message. If the Secondary does not receive a DHCPPRPL response
   message, the Secondary takes control of the bindings database and
   begins answering requests from DHCP clients.  Note that it should be
   possible to configure the Secondary not to perform this automatic
   switch-over.

   The conditions under	which a	Secondary takes	control	of the bindings
   database, e.g., the number of consecutive missing acknowledgments,
   should be configurable in the Secondary by the DHCP administrator.
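
   The polling behavior described above can be sketched as a small
   counter.  This is only an illustration: the class name, the
   max_missed threshold and the auto_takeover flag are assumptions
   standing in for whatever configuration an implementation provides.

```python
class PollMonitor:
    """Tracks DHCPPOLL outcomes on the Secondary (illustrative only)."""

    def __init__(self, max_missed: int, auto_takeover: bool = True):
        self.max_missed = max_missed        # administrator-configured threshold
        self.auto_takeover = auto_takeover  # automatic switch-over may be disabled
        self.missed = 0
        self.in_control = False

    def on_poll_result(self, reply_received: bool) -> bool:
        """Record one poll; return True if the Secondary now has control."""
        if reply_received:
            # A DHCPPRPL arrived, so the run of consecutive misses ends.
            self.missed = 0
        else:
            self.missed += 1
            if self.auto_takeover and self.missed >= self.max_missed:
                self.in_control = True
        return self.in_control
```

   Note that a later DHCPPRPL does not clear in_control in this sketch;
   the Secondary gives up control only when the Primary asks for it.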


   The Secondary records any changes it	makes to the bindings database
   while it has	control. The Secondary continues to send DHCPPOLL mes-
   sages to the	Primary.  The DHCPPOLL messages	also carry information
   on the state	of the Secondary Server.

   To regain control of	the bindings database, e.g., after the Primary
   Server has recovered	from a failure,	or a partitioned network condi-
   tion, the Primary sends a DHCPCTLREQ	message	to the Secondary.  The
   Secondary stops answering DHCP client requests, and responds	to its
   Primary with	a DHCPCTLRET message.  After sending the DHCPCTLRET mes-
   sage, the Secondary sends DHCPBNDUPD	messages for each of the changes
   it has made to the bindings database.

   The Primary sends a DHCPBNDACK for each DHCPBNDUPD message it
   receives.  The Secondary completes the transfer of control by sending
   a DHCPCTLACK message to the Primary as soon as all of its updates
   have been acknowledged.

   Note that the Primary SHOULD NOT send any DHCPBNDUPD messages while
   synchronization with the Secondary is in progress.

   Once the synchronization is completed, the Primary transitions into
   NORMAL state and starts sending DHCPBNDUPD transactions for any
   accumulated binding changes it may have.
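
   As a rough illustration of the return of control described in this
   section, the following sketch lists the messages in the order they
   would be exchanged.  The function name is hypothetical, and the
   strict update/acknowledgment interleaving is an assumption; the text
   does not say whether updates may be sent before earlier ones are
   acknowledged.

```python
def control_return_trace(pending_changes):
    """Return (sender, message) pairs for the Primary regaining control."""
    trace = [
        ("primary", "DHCPCTLREQ"),    # Primary asks for control back
        ("secondary", "DHCPCTLRET"),  # Secondary stops serving clients
    ]
    for change in pending_changes:
        # One update per change the Secondary made while in control,
        # each acknowledged by the Primary.
        trace.append(("secondary", f"DHCPBNDUPD {change}"))
        trace.append(("primary", f"DHCPBNDACK {change}"))
    # Sent only once every update has been acknowledged.
    trace.append(("secondary", "DHCPCTLACK"))
    return trace
```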

5.  Duplicate address assignment scenarios

   In the following two scenarios, the protocol could end up allocating
   duplicate IP addresses unless the measures recommended in Section 6
   are taken:

   Primary Server crash before "lazy" update: In the case where the
   Primary Server sends an ACK to a client for a newly allocated IP
   address and then crashes prior to sending the corresponding update to
   the Secondary Server, the Secondary Server will have no record of the
   IP address allocation.  When the Secondary Server takes over, it may
   well try to allocate that IP address to a different client.  If the
   first client to receive the IP address is not on the net at that time
   (even though time still remains on its lease), an ICMP echo (i.e.,
   ping) will not prevent the Secondary Server from allocating that IP
   address to a different client.

   A more likely and subtle version of this problem is where the Primary
   Server crashes after extending a client's lease time, and before
   updating the Secondary with the new time using a lazy update. After
   the Secondary takes over, if the client is not connected to the
   network, the Secondary will believe the client's lease has expired
   when, in fact, it has not.  In this case as well, the IP address
   might be reallocated to a different client while the first client is
   still using it.

   Network partition where servers can't communicate but each can talk
   to clients: Several conditions are required for this	situation to
   occur. First, due to	a network failure, the Primary and Secondary
   Servers cannot communicate.	As well, some of the DHCP clients must
   be able to communicate with the Primary Server, and some of the
   clients must	now only be able to communicate	with the Secondary
   Server.  When this condition	occurs,	both Primary and Secondary
   Servers could attempt to allocate IP	addresses for new clients from
   the same pool of available addresses. At some point,	then, two
   clients will	end up being allocated the same	IP address. This will
   cause potentially serious problems when the network failure that
   created this	situation is corrected.

   The next section details how	the Failover Protocol prevents either of
   the above scenarios (and other related scenarios) from causing dupli-
   cate	IP address allocation.

6.  Duplicate Address Assignment Control

   There are several ways that the Failover protocol avoids the	possi-
   bility of duplicate address assignment.

6.1.  Control of lease time

   The key problem with	lazy update is that when the primary server
   fails after updating	a client with a	particular lease time and before
   updating the	secondary server, the secondary	server will believe that
   a lease has expired even though the client still retains a valid
   lease on that IP address.

   In order to handle this problem, a period of time known as the
   "maximum delta lease interval" (MDLI) is defined and must be known to
   both the primary and secondary servers.  Proper use of this time
   interval places an upper bound on the difference allowed between the
   lease time provided to a DHCP client and the lease time known by the
   secondary server.  So that the MDLI does not also become the maximum
   lease time that the primary can ever provide to a client, during a
   lazy update the primary typically updates the secondary with a lease
   time that is longer than the lease time previously given to the
   client.

   In the case where the secondary needs to take over from the primary,
   the secondary will not reallocate any IP addresses from one client to
   a different client.  When transitioning to the PARTNER-DOWN state
   (where the secondary is allowed to reallocate IP addresses), the
   secondary will wait the maximum delta lease interval before
   completing the state transition.  Thus, any client which has a lease
   on an IP address with a lease time greater than that known by the
   secondary will either have contacted the secondary during that time
   or its lease will have expired.

   This	protocol requires a DHCP server	to deal	with several different
   lease intervals and places specific restrictions on their relation-
   ships. The purpose of these restrictions is to allow	the other server
   in the pair to be able to make certain assumptions in the absence of
   an ability to communicate between servers.

   The different lease times are:

	o desired client lease interval

	  The desired client lease interval is the lease interval that
	  the DHCP server would	like to	give to	the DHCP client	in the
	  absence of any restrictions imposed by the Failover Protocol.
	  Its determination is outside of the scope of this protocol.
	  Typically this is the	result of external configuration of a
	  DHCP server.

	o actual client	lease interval

          The actual client lease interval is the lease interval that
          the DHCP server gives out to the DHCP client.  It may be
          shorter than the desired client lease interval (as explained
          below).

	o Primary Server lease interval

	  The Primary Server lease interval is the interval after which
	  the Primary Server believes that DHCP	client's lease will
	  expire.

	o desired Secondary Server lease interval

          The desired Secondary Server lease interval is the interval,
          communicated by the Primary Server to the Secondary Server,
          after which the lease will expire.

        o acknowledged Secondary Server lease interval

          The acknowledged Secondary Server lease interval is the
          interval the Secondary Server has most recently acknowledged.

   The key restriction (and guarantee) that the Primary Server makes
   with respect to lease intervals is that the actual client lease
   interval never exceeds the acknowledged Secondary Server lease
   interval (if any) by more than a fixed amount.  This fixed amount is
   called the "maximum delta lease interval" (MDLI).

   The MDLI MAY	be configurable, but for correct server	operation it
   MUST	be known to both the Primary and Secondary Servers.

   The Primary Server MUST record in its state both the	Primary	Server
   lease interval and the most recently	acknowledged Secondary Server
   lease interval. It is assumed that the desired client lease interval
   can be determined through techniques	outside	of the scope of	this
   protocol.

   The above lease time descriptions are written for the case where the
   Primary server is operating and in communication with the Secondary
   server.  In the case where the Secondary server is operating out of
   communication with the Primary server, the relationships must hold
   in the other direction.

   The fundamental relationship	among these times which	MUST be	main-
   tained is:


       actual client lease interval <
       ( acknowledged other server lease interval + MDLI )


   The "acknowledged other server lease	interval" is the acknowledged
   secondary server lease interval for the Primary server, and it would
   be the acknowledged primary server lease interval for the Secondary
   server when it is operating out of contact with the Primary server.
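
   A minimal sketch of how a server might apply the fundamental
   relationship when choosing a lease follows.  It assumes all intervals
   are whole seconds, so the strict inequality is honored by backing off
   one second; the function names are hypothetical and no particular
   algorithm is mandated by this protocol.

```python
def max_actual_client_lease(acked_other: int, mdli: int) -> int:
    """Largest whole-second lease with actual < acked_other + mdli."""
    return max(0, acked_other + mdli - 1)


def actual_client_lease(desired: int, acked_other: int, mdli: int) -> int:
    """Give the desired lease unless the relationship forces a shorter one."""
    return min(desired, max_actual_client_lease(acked_other, mdli))
```

   For example, with nothing yet acknowledged by the other server, a
   desired lease of 3 days and an MDLI of 1 hour, the result is 3599
   seconds, i.e. essentially the MDLI.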

   DISCUSSION:

      This protocol mandates no particular detailed algorithms
      concerning these lease intervals, as long as the above fundamental
      relationship is preserved.

      In the interests of clarity, however, let's examine a specific
      example. The MDLI	in this	case is	1 hour.	 The desired client
      lease interval is	3 days.	 In operation this might work as fol-
      lows:

      When a Primary Server makes an offer for a new lease on an IP
      address to a DHCP client, it determines the desired client lease
      interval (in this case, 3 days).  It then examines the
      acknowledged Secondary lease interval (which in this case is
      zero).  Since the actual client lease interval cannot be allowed
      to exceed the current Secondary lease interval by more than the
      MDLI, the offer made to the DHCP client (the actual client lease
      interval) is for (essentially) the MDLI, 1 hour.

      Once the Primary Server has performed the	ACK to the DHCP	client,
      it will update the Secondary Server with the lease information.
      However, the Secondary Server lease interval will	be composed of
      the current actual client	lease interval + ( 1.5 * desired client
      lease interval). Thus, the Secondary Server is updated with a
      lease interval of	4.5 days + 1 hour.

      When the Primary Server receives an ACK to its update of the
      Secondary	Server's lease interval, it records that as the	ack-
      nowledged	Secondary Server lease interval.  The Primary Server
      MUST ensure that the Secondary Server has	received and recorded in
      its stable storage the Secondary Server lease interval.

      When the DHCP client attempts to renew at T1 (approximately one
      half an hour from the start of the lease), the Primary Server
      again determines the desired client lease time, which is still 3
      days.  It then compares this with the remaining acknowledged
      Secondary Server lease interval (adjusting for the time passed
      since the Secondary Server was last updated), which is 4.5 days +
      1/2 hour.  The actual client lease interval can therefore be set
      to the desired client lease interval, as it is less than the
      acknowledged Secondary lease interval.

      When the Primary DHCP server updates the Secondary DHCP server
      after the DHCP client's renewal ACK is complete, it will calculate
      the Secondary Server lease interval as the actual client lease
      interval (3 days this time) + 0.5 times the desired client lease
      interval (1.5 days).  In this way, the Primary attempts to have
      the Secondary always "lead" the client in its understanding of the
      client's lease interval.

      Once the initial actual client lease interval of the MDLI	is past,
      the protocol operates effectively	like the DHCP protocol does
      today in its behavior concerning lease intervals.	However, the
      guarantee	that the actual	client lease interval will never exceed
      the acknowledged Secondary Server	lease interval by more than the
      MDLI allows full recovery	from failures in lazy update.
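
      The arithmetic of this example can be replayed in a few lines.
      The sketch below works in seconds, treats "essentially the MDLI"
      as exactly the MDLI, and uses the multipliers quoted in the
      example (1.5 for the first update, 0.5 for the renewal); the
      variable names are illustrative only.

```python
HOUR = 3600
DAY = 24 * HOUR

mdli = 1 * HOUR
desired = 3 * DAY

# Initial offer: the Secondary has acknowledged nothing yet.
acked = 0
first_lease = min(desired, acked + mdli)           # capped at the MDLI
first_update = first_lease + (desired * 3) // 2    # actual + 1.5 * desired
acked = first_update                               # once the update is ACKed

# Renewal half an hour into the lease.
elapsed = HOUR // 2
remaining_acked = acked - elapsed                  # adjust for time passed
renewed_lease = min(desired, remaining_acked + mdli)
renewal_update = renewed_lease + desired // 2      # actual + 0.5 * desired

print(first_lease, first_update, remaining_acked, renewed_lease, renewal_update)
```

      The renewal is no longer limited by the MDLI, because the
      remaining acknowledged interval already exceeds the desired
      client lease interval.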


6.2.  Controlled re-allocation of IP addresses

   When the servers cannot communicate, neither server will allow an IP
   address previously used by one client to be offered to a different
   client.  As a corollary, during normal operations the primary server
   must update the secondary server whenever a lease expires or an IP
   address is released, and must receive acknowledgement of that update
   before offering the expired or released IP address to a different
   client.
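
   One way to picture this rule is a small table that remembers the last
   client of each address and whether the partner has acknowledged that
   the address is free again.  The class and method names below are
   hypothetical, and the sketch deliberately ignores which server's pool
   an address belongs to.

```python
class AddressTable:
    """Tracks which addresses may be offered to a different client."""

    def __init__(self):
        self.last_client = {}    # ip -> client that last held the address
        self.free_acked = set()  # ips whose expiry/release the partner ACKed

    def assign(self, ip, client):
        self.last_client[ip] = client
        self.free_acked.discard(ip)

    def partner_acked_free(self, ip):
        """Partner acknowledged the update reporting expiry or release."""
        self.free_acked.add(ip)

    def may_offer(self, ip, client, partner_reachable: bool) -> bool:
        previous = self.last_client.get(ip)
        if previous is None or previous == client:
            return True  # never used, or going back to the same client
        # A different client: only with the partner reachable and the
        # expiry/release already acknowledged.
        return partner_reachable and ip in self.free_acked
```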


7.  Server States

   The following server	states are defined:

   NORMAL State:

   NORMAL state is the state used by a server when it can communicate
   with the other server in the Primary-Secondary Server pair. When in
   this state, the Primary responds to DHCP client requests, while the
   Secondary does not.

   COMMUNICATION-INTERRUPTED state:

   A server goes into this state whenever it is	unable to communicate
   with	the other server. Both the Primary and Secondary Servers can go
   into	this state, although the behavior changes that result are dif-
   ferent. Primary and Secondary Servers cycle automatically (without
   administrative intervention)	between	NORMAL and COMMUNICATION-
   INTERRUPTED state as	the network connection between them fails and
   recovers, or	as the partner server cycles between operational and
   non-operational. No duplicate IP address allocation can occur while
   the servers cycle between these states.  In this state both servers
   may respond to DHCP client requests.	 When allocating new IP
   addresses, each server allocates from a different pool. When	respond-
   ing to renewal requests, each server	will allow continued renewal of
   a DHCP client's current lease on an IP address.

   PARTNER-DOWN	state:

   PARTNER-DOWN state is a state either server can enter. Once a server
   has entered NORMAL state, the PARTNER-DOWN state is entered only on
   command of an external agency (typically an administrator of some
   sort) or after the expiration of an externally configured minimum
   safe-time after the beginning of COMMUNICATION-INTERRUPTED state.
   When in this state, the server no longer assumes that the other
   server could still be operational and servicing a different set of
   clients, but instead assumes that it is the only server operating.
   Only one server should be operating in this state at a time. The
   server in this state will respond to DHCP client requests. It will
   allow renewal of all outstanding leases on IP addresses, and will
   allocate IP addresses from its own pool, and after a fixed period of
   time, it will allocate IP addresses from the set of all available IP
   addresses. The server will transition out of PARTNER-DOWN state after
   automatic re-integration with the companion server is complete.  This
   automatic re-integration will typically be initiated by the restart
   of the server which was down.

   POTENTIAL-CONFLICT state:

   This	state indicates	that the two servers are attempting to rein-
   tegrate with	each other, but	at least one of	them was running in a
   state that did not guarantee	automatic reintegration	would be possi-
   ble.	 In POTENTIAL-CONFLICT state the servers may determine that the
   same	IP address has been offered and	accepted by two	different DHCP
   clients.

   RECOVER state:

   This	state indicates	that the server	has no information in its stable
   storage. A server in	this state will	attempt	to refresh its stable
   storage from	the other server.

   SYNC	state:

   In this state, the Secondary	Server attempts	to synchronize its
   stable storage with the Primary Server.  Both the Primary and Secon-
   dary	may have information that the other lacks.


8.  Primary Server Operation

   This	section	discusses the operation	of the primary server using the
   state transition diagram in Figure 8.2-1.

8.1.  Primary Server Initialization

   When	the Primary Server starts, there are three possibilities:  it
   has never started before and	therefore has no record	of any previous
   state nor of	any client binding information;	it has started before
   and has a record of a previous state	and possibly of	some client
   binding information;	it has started before, but failed catastrophi-
   cally, and now has no record	of any previous	state (nor of any client
   binding information).

   When the Primary Server starts, if it has any record of a previous
   state, then if that state was NORMAL or COMMUNICATION-INTERRUPTED it
   moves to COMMUNICATION-INTERRUPTED state.  If that state was
   PARTNER-DOWN or POTENTIAL-CONFLICT, then it moves to PARTNER-DOWN
   state.  If that state was RECOVER, then the Primary Server moves into
   the RECOVER state.


   If it has no record of any previous state, then either this is an
   initial startup, or a recovery from a catastrophic failure where
   stable storage and all client binding information were lost. The two
   cases are distinguished by an external configuration indication to
   the Primary Server, which signals recovery from a catastrophic
   failure.
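
   The start-up rules of this section can be summarized as a lookup.
   This is a sketch only: the names are hypothetical, and the targets
   for the two no-record cases (RECOVER on an external indication,
   PARTNER-DOWN otherwise) are an interpretation of the top row of
   Figure 8.2-1 rather than something the text above states outright.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    NORMAL = "NORMAL"
    COMMUNICATION_INTERRUPTED = "COMMUNICATION-INTERRUPTED"
    PARTNER_DOWN = "PARTNER-DOWN"
    POTENTIAL_CONFLICT = "POTENTIAL-CONFLICT"
    RECOVER = "RECOVER"


def primary_start_state(previous: Optional[State],
                        catastrophic_recovery: bool = False) -> State:
    """Pick the Primary's state at start-up from its recorded state."""
    if previous in (State.NORMAL, State.COMMUNICATION_INTERRUPTED):
        return State.COMMUNICATION_INTERRUPTED
    if previous in (State.PARTNER_DOWN, State.POTENTIAL_CONFLICT):
        return State.PARTNER_DOWN
    if previous is State.RECOVER:
        return State.RECOVER
    if previous is None:
        # No record: the external indication separates the two cases
        # (target states assumed from Figure 8.2-1).
        return State.RECOVER if catastrophic_recovery else State.PARTNER_DOWN
    raise ValueError(f"no start-up rule for recorded state {previous}")
```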

8.2.  Primary Server State Transitions

   Figure 8.2-1	is the diagram of the Primary Server's state transi-
   tions. The remainder	of this	section	contains information important
   to the understanding	of that	diagram.

   The server stays in the current state until all of the actions
   specified on the state transition are complete.  If communication
   fails during one of the actions, the server simply stays in the
   current state and attempts a transition whenever the conditions for a
   transition are later fulfilled.

   In the state transition diagram below, the "+" or "-" in the upper
   right corner of each state indicates whether communication is
   ongoing with the Secondary Server.  The legend "responsive" or
   "unresponsive" in each state indicates whether the Primary Server is
   responsive to DHCP client requests in the respective state.

   In the state transition diagram below, when communication is
   reestablished between the Primary and Secondary Server, the Primary
   server must record the state of the Secondary Server at the time
   communication was reestablished.

   If the state	of the Secondary Server	changes	 while communicating,
   then	the Primary Server moves through the communications-failed tran-
   sition, and into whatever state results.  It	then immediately moves
   through whatever state transition is	appropriate given the current
   state of the	Secondary Server.

   DISCUSSION:

      The point	of this	technique is simplicity, both in explanation of
      the protocol and in its implementation.  The alternative to this
      technique	of memory of partner state and automatic state transi-
      tion on change of	partner	state is to have every state in	the fol-
      lowing diagram have a state transition for every possible	state of
      the partner.  With the approach adopted, only the	states in which
      communications are reestablished require a state transition for
      each possible partner state.

   All state transitions of the Primary Server must be recorded in its
   stable storage, and thus be available to the server after a server
   restart.


	       Previous	Primary	State:

	 NORMAL	or     RECOVER	       PARTNER DOWN
       COMMUNICATION  <ext. cmd>    POTENTIAL CONFLICT
	INTERRUPTED	  |		<none>
       +---+		  V		   |
       |     +----------------+	+-----------------+
       |     |		    - |	|		- |
       |     |	  RECOVER     |	|  PARTNER DOWN	  |<-----+
       |     | (unresponsive) |	|  (responsive)	  |	 |
       |     +----------------+	+-----------------+	 |
       |       |		 |	 |	 ^	 |
       |   Comm. OK		 |    Comm. OK	 |	 |
       |   Sec.	State:		 |  Sec. State:	Comm.	 |
       |    |	   |		 V  All	Others	Failed	 |
       |    |	RECOVER	    +<---+	 V	 |	 |
       |   All	   |	    |	    +-------------+	 |
       |  Others   |	 Comm. OK   |  POTENTIAL +|	 |
       |    |	  Note	Sec. State: |  CONFLICT	  |	 |
       |    |	  Poss.	 RECOVER    |(responsive) |<---- | --+
       |    V	  Error	  NORMAL    +-------------+	 |   |
       | Sec->Pri   |	 Pri->Sec	    |		 |   |
       |   Sync	    |	  Sync.	      Resolve Conflict	 |   |
       |    |	    |	    V		    V		 |   |
       | Wait MDLI  |	   +-----------------+		 |   |
       | from Fail. |	   |		   + | External	 |   |
       |    V	    V	   |	 NORMAL	     |-Command-->+   |
       |    +-----++------>|  (responsive)   |		 |   |
       |	  ^	   +-----------------+		 |   |
       |	  |		    |			 |   |
       |      Pri<->Sec		  Comm.		    External |
       |	Sync		 Failed		     Command |
       |	  |		    |			or   |
       |      Comm. OK		    |	       "Safe Period" |
       |     Sec. State:	    V		 expiration  |
       |       NORMAL	   +-----------------+		 |   |
       |     COMM. INT.	   |		   - |---------->+   |
       |      RECOVER------| COMMUNICATIONS  |		     |
       |		   |   INTERRUPTED   |	 Comm. OK    |
       +------------------>|  (responsive)   |--Sec. State:--+
			   +-----------------+	All Others

	   Figure 8.2-1:  Primary Server state diagram.


8.3.  Primary Server in	PARTNER-DOWN state

   When	it is in PARTNER-DOWN state, the Primary Server	operates largely
   as does a normal DHCP server, with none of the special algorithms
   described below.  In	PARTNER-DOWN state the Primary Server MUST
   respond to DHCP client requests.

   Any available IP address tagged as belonging	to the Secondary Server
   (at entry to	PARTNER-DOWN state) MUST NOT be	used until the MDLI
   beyond the entry into PARTNER-DOWN state has	elapsed.

   The Primary Server MUST NOT allocate an IP address to a DHCP client
   different from that to which it was allocated at the entrance to
   PARTNER-DOWN state until the MDLI beyond its expiration time has
   elapsed.  If this time would be earlier than the current time plus
   the MDLI, then the current time plus the MDLI is used.
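
   The two timing rules above reduce to simple arithmetic, sketched
   below with times as seconds on a common clock.  The function names
   are hypothetical, and "the current time" is taken here to mean the
   moment at which the calculation is made, which is an assumption.

```python
def partner_pool_usable_at(entry_time: float, mdli: float) -> float:
    """When available addresses tagged as the Secondary's may first be used."""
    return entry_time + mdli


def earliest_reallocation(lease_expiry: float, now: float, mdli: float) -> float:
    """When a bound address may first be given to a different client."""
    # MDLI beyond the lease expiry, but never sooner than now + MDLI.
    return max(lease_expiry + mdli, now + mdli)
```

   For a lease that expired at time 1000, evaluated at time 5000 with an
   MDLI of 3600, the address becomes reallocatable at 8600.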

   Two options exist for lease times, with different ramifications flow-
   ing from each.

   If the Primary Server wishes	the Failover Protocol to protect it from
   loss	of stable storage in any state,	then it	should ensure that the
   MDLI	based lease time restrictions in Section 6.1 are maintained,
   even	in PARTNER-DOWN	state.

   If the Primary Server wishes to forgo the protection of the Failover
   Protocol in the event of loss of stable storage, then it need
   recognize no restrictions on actual client lease times while in
   PARTNER-DOWN state.

   The Primary Server MUST poll	the Secondary Server and attempt to
   establish communications and	synchronization	with it.

   Once	the Primary succeeds in	contacting the Secondary Server, the
   Primary examines the	state of the Secondary Server. If the state of
   the Secondary Server	is RECOVER or NORMAL, then both	servers	have
   been	running	in such	a way that duplicate IP	address	allocations were
   inhibited.  In this case, the Primary Server	updates	the Secondary
   Server with its client binding information, and moves into the NORMAL
   state.

   Once	contact	has been established, if the state of the Secondary
   Server is anything other than RECOVER or NORMAL then	the Primary
   Server moves	into the POTENTIAL-CONFLICT state.

8.4.  Primary Server in	RECOVER	state

   When the Primary Server is initialized in the RECOVER state, it
   expects to refresh its stable storage from an existing Secondary
   Server.  In this state the Primary Server MUST NOT respond to DHCP
   client requests.

   When	the Primary Server succeeds in contacting the Secondary	Server,
   if it determines that the Secondary Server is itself	in the RECOVER
   state (which	indicates that the Secondary Server has	no existing
   client binding information),	the Primary Server will	move directly
   into	NORMAL state after signaling some kind of an error (since some
   person had to explicitly start the Primary Server in	RECOVER	state to
   refresh its lost client binding information from the	Secondary, and
   the Secondary had no	state).

   If the Primary Server determines that the Secondary Server is in any
   state other than RECOVER, then the Secondary	Server has some	client
   binding information that the	Primary	Server needs before it moves
   into	the NORMAL state.  The Primary Server will attempt to refresh
   its state from the Secondary	Server,	and it will remain in the
   RECOVER state until it is successful	in doing so.

   The Primary Server MUST remain in RECOVER state until a period of at
   least the MDLI has passed since the Primary Server was known to have
   failed.  This allows any client holding an IP address that was
   allocated by the Primary Server prior to the loss of the Primary
   Server's client binding information in stable storage either to
   contact the Secondary Server or to have its lease time out.

   DISCUSSION:

      The actual requirement on	this wait period in RECOVER is that it
      start when the Primary Server went down, not necessarily when it
      came back	up.  If	the time when the Primary Server failed	is
      known, then it could be communicated to the recovering server, and
      the wait period could be reduced to the MDLI less	the difference
      between the current time and the time the	server failed. In this
      way, the waiting period could be minimized.
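
      The reduced wait described here is a one-line calculation.  In
      this sketch (names are illustrative, times in seconds) an unknown
      failure time falls back to waiting the full MDLI from now.

```python
from typing import Optional


def recover_wait_remaining(mdli: float, now: float,
                           failure_time: Optional[float] = None) -> float:
    """Seconds the server must still stay in RECOVER state."""
    if failure_time is None:
        return mdli  # failure time unknown: wait the whole MDLI
    # MDLI less the time already elapsed since the failure, never negative.
    return max(0, mdli - (now - failure_time))
```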


8.5.  Primary Server in	NORMAL state

   When	in NORMAL state, the Primary Server takes the following	actions
   to implement	the Safe Failover Protocol:

	o Lease	Time Calculations

          As discussed in Section 6.1, "Control of lease time", the
          lease interval given to a DHCP client can never be more than
          the maximum delta lease interval greater than the acknowledged
          Secondary Server lease interval.

	  As long as the Primary Server	adheres	to this	constraint, the
	  specifics of the lease intervals that	it gives to either the
	  DHCP client or the Secondary DHCP server are implementation
	  dependent. One possible approach is shown in Section 6.1, but
	  that particular approach is in no way	required by this proto-
	  col.

	o Lazy Update of Secondary Server

          After an ACK of an IP address binding, the Primary Server
          attempts to update the Secondary with the binding information.
          The lease time used in the update of the Secondary MUST be at
          least that given to the DHCP client in the DHCPACK.  It MAY,
          however, be longer.

	o Reallocation of IP Addresses Between Clients

	  Whenever a client binding is released, a DHCPBNDUPD message
	  must be sent to the Secondary	Server,	setting	the binding
	  state	to RELEASED. However, until a DHCPBNDACK is received for
	  this message,	the IP address cannot be allocated to another
	  client.

8.6.  Primary Server in	COMMUNICATION-INTERRUPTED Mode

   When in COMMUNICATION-INTERRUPTED state the Primary Server operates
   in such a way that correct operation is ensured even if the Secondary
   Server is still up and operational, but unable to communicate with
   the Primary Server. When communications are reestablished between the
   Primary and Secondary Servers, if both are still in
   COMMUNICATION-INTERRUPTED state, then the re-integration of their
   operation will proceed automatically and without human intervention.
   The protocol is designed to ensure that re-integration will proceed
   in an error-free manner and that no actions taken by either server
   while in COMMUNICATION-INTERRUPTED state will cause problems during
   re-integration.

   The Primary Server operates in COMMUNICATION-INTERRUPTED state as it
   does	in NORMAL state.

   However, since it cannot communicate	with the Secondary in this
   state, the acknowledged-Secondary-lease-time	will not be updated in
   any new bindings. This is likely to eventually cause	the actual-
   client-lease-times to be the	current-time plus the MDLI (unless this
   is greater than the desired-client-lease-time).


   The Primary Server can simply queue updates to the Secondary	on com-
   munication interruption and stay in the NORMAL state. If, at	the time
   communication with the Secondary is reestablished, the Secondary
   remains in the NORMAL state as well,	then the queued	updates	for the
   Secondary will simply be processed.

   COMMUNICATION-INTERRUPTED state for the Primary Server is a signal
   that	it has stopped queuing updates to the Secondary, and is	able to
   respond to a	variety	of possible Secondary states.

   It is anticipated that some alarm condition would be	raised upon the
   transition from NORMAL state	to COMMUNICATION-INTERRUPTED state. Once
   the Primary Server has been in COMMUNICATION-INTERRUPTED state for a
   period equal	to the safe-period, then it can	(if configured to do so)
   transition into the PARTNER-DOWN state.  An external	command	may also
   force a transition to PARTNER-DOWN state.
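
   The exit from COMMUNICATION-INTERRUPTED state toward PARTNER-DOWN can
   be sketched as follows.  The function and parameter names are
   assumptions, and the sketch covers only the two triggers named above;
   it does not model communication being reestablished.

```python
def next_state_when_interrupted(time_in_state: float,
                                safe_period: float,
                                auto_partner_down: bool,
                                external_command: bool = False) -> str:
    """Decide whether to leave COMMUNICATION-INTERRUPTED for PARTNER-DOWN."""
    if external_command:
        return "PARTNER-DOWN"  # an operator may force the transition
    if auto_partner_down and time_in_state >= safe_period:
        return "PARTNER-DOWN"  # safe-period elapsed and server so configured
    return "COMMUNICATION-INTERRUPTED"
```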

9.  Secondary Server Operation

   The Secondary Server	responds to DHCP client	requests only in the
   PARTNER-DOWN	and COMMUNICATION-INTERRUPTED states.


9.1.  Secondary	Server Initialization

   When	the Secondary Server starts, there are three possibilities: it
   has never started before and	therefore has no record	of any previous
   state nor of	any client binding information;	it has started before
   and has a record of a previous state	and possibly of	some client
   binding information;	it has started before, but failed catastrophi-
   cally, and now has no record	of any previous	state (nor of any client
   binding information).

   When	the Secondary Server starts, if	it has any record of a previous
   state, then if that state was NORMAL, COMMUNICATION-INTERRUPTED, or
   SYNC, it moves to COMMUNICATION-INTERRUPTED state. If that state was
   PARTNER-DOWN	or POTENTIAL-CONFLICT, then it moves to	PARTNER-DOWN
   state. In all other cases (both other previous states and the cases
   where there is no record of a previous state), the Secondary	Server
   moves into the RECOVER state.
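
   The Secondary's start-up rule is simpler than the Primary's, since
   every case not listed explicitly leads to RECOVER.  A minimal sketch,
   using plain strings for the state names and None for "no record" (the
   function name is hypothetical):

```python
from typing import Optional


def secondary_start_state(previous: Optional[str]) -> str:
    """Pick the Secondary's state at start-up from its recorded state."""
    if previous in ("NORMAL", "COMMUNICATION-INTERRUPTED", "SYNC"):
        return "COMMUNICATION-INTERRUPTED"
    if previous in ("PARTNER-DOWN", "POTENTIAL-CONFLICT"):
        return "PARTNER-DOWN"
    # Any other recorded state, or no record at all.
    return "RECOVER"
```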


9.2.  Secondary	Server State Transitions

   The server stays in the current state until all of the actions
   specified on the state transition are complete.  If communication
   fails during one of the actions, the server simply stays in the
   current state and attempts a transition whenever the conditions for a
   transition are later fulfilled.

   In the state transition diagram below, the "+" or "-" in the upper
   right corner of each state is a notation about whether communication
   is ongoing with the Primary Server. The legend "responsive" or
   "unresponsive" in each state indicates whether the Secondary Server
   is responsive to DHCP client requests in the respective state.

   In the state transition diagram below, when communication is
   reestablished between the Secondary and Primary Server, the
   Secondary Server must record the state of the Primary Server at the
   time communication was reestablished. If the state of the Primary
   Server changes while the servers are communicating, then the
   Secondary Server moves through the communications-interrupted
   transition, and into whatever state results.  At that time, it
   immediately moves through whatever state transition is appropriate
   for the current state of the Primary Server.

   All state transitions of the	Secondary Server must be recorded in its
   stable storage, and thus be available to the	server after a server
   restart.


	       Previous	Secondary State:

	 NORMAL	   RECOVER	  PARTNER DOWN
       COMM. INT.   <none>	POTENTIAL CONFLICT
	  SYNC	      |		       |
       +---+	      V		       V
       |     +----------------+	+-----------------+
       |     |	  RECOVER   - |	|  PARTNER DOWN	- |<-----+
       |     | (unresponsive) |	|  (responsive)	  |	 |
       |     +----------------+	+-----------------+	 |
       |       |		 |	|	 ^	 |
       |   Comm. OK		 |   Comm. OK	 |	 |
       |   Pri.	State:		 |  Pri. State:	Comm.	 |
       |    |	   |		 V  All	Others	Failed	 |
       |    |	RECOVER	    +<---+	V	 |	 |
       |    |	   |	    |	    +--------------+	 |
       |    |	   |	 Comm. OK   |  POTENTIAL + |	 |
       |   All	   |	Pri. State: |  CONFLICT	   |	 |
       |  Others   |	 RECOVER    |(unresponsive)|<--- | --+
       |    |	  Note	    |	    +--------------+	 |   |
       |    |	  Poss.	 Sec->Pri	    |		 |   |
       |    V	  Error	  Sync.	      Resolve Conflict	 |   |
       | Pri->Sec  |	    V		    V		 |   |
       |   Sync	   |	   +-----------------+		 |   |
       |    V	   V	   |	 NORMAL	   + |-External->+   |
       |    +-----++------>| (unresponsive)  | Command	 |   |
       |	  ^	   +-----------------+		 |   |
       |      Pri<->Sec	      |	       ^		 |   |
       |	Sync	      |	 Start Alloc Timer	 |   |
       |	  |	      |	    Sec->Pri		 |   |
       |  +--------------+    |	      Sync		 |   |
       |  |	       + |--->+	       |	    External |
       |  |	SYNC	 |  Comm.   Comm. OK	     Command |
       |  | unresponsive | Failed  Pri.	State:		or   |
       |  +--------------+    |	     RECOVER   "Safe Period" |
       |	  ^	      V	       |	 expiration  |
       |	  |	  +------------------+		 |   |
       |      Comm. OK	  | COMMUNICATIONS - |---------->+   |
       |     Pri. State:  |    INTERRUPTED   |	 Comm. OK    |
       |       NORMAL-----|   (responsive)   |--Pri. State:--+
       |     COMM. INT.	  +------------------+	All Others
       |		     ^
       +---------------------+


	  Figure 9.2-1:	 Secondary Server State	Diagram.


9.3.  Secondary	Server in RECOVER state

   The Secondary DHCP server comes up in the RECOVER state when	it has
   no record of	any previous state (or that previous state was RECOVER).

   It stays in this state until	it establishes communication with the
   Primary Server, and is unresponsive to DHCP client requests in this
   state. Essentially it is idle until it can contact the Primary
   Server.

   When	it establishes communication with the Primary Server, it
   attempts to load its	client binding database	from that of the Primary
   Server using	the techniques specified in section 6.

   Once	the Secondary Server's client binding database is refreshed from
   that	of the Primary,	the Secondary Server moves into	NORMAL state.


9.4.  Secondary	Server in NORMAL state

   In NORMAL state, the Secondary Server receives state updates from the
   Primary Server in DHCPBNDUPD	messages.  It records these in its
   client binding database in stable storage and then sends the
   corresponding DHCPBNDACK message to the Primary Server.
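
   The ordering in this paragraph (record first, acknowledge second)
   can be sketched as follows. This is a minimal sketch under stated
   assumptions: handle_bndupd and the store and send_ack callables are
   hypothetical names, and the update is modeled as a plain dictionary
   rather than the actual message format.

```python
from typing import Callable, Dict

Update = Dict[str, object]


def handle_bndupd(update: Update,
                  store: Callable[[Update], None],
                  send_ack: Callable[[Update], None]) -> None:
    """Record a binding update in stable storage, then acknowledge it.

    `store` is expected to return only once the update is durable.
    If it raises, no DHCPBNDACK is sent for this update.
    """
    store(update)      # write to the client binding database first
    send_ack(update)   # DHCPBNDACK goes out only after the write
```

   Because the acknowledgment follows the write, the Primary Server
   never sees a DHCPBNDACK for an update that the Secondary Server
   could lose in a restart.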

   While in NORMAL state, the Secondary Server MUST also acquire a
   series of IP addresses from the Primary Server to be used to satisfy
   DHCPDISCOVER requests from DHCP clients when in
   COMMUNICATION-INTERRUPTED state. See Section 2.2.2 for details of
   this acquisition process.

   The Secondary Server	periodically polls the Primary Server with the
   DHCPPOLL message. If	it fails to receive a DHCPPRPL message in reply
   after a configured number of	retries	or some	administratively deter-
   mined time, the Secondary Server transitions	into COMMUNICATION-
   INTERRUPTED state. Both the DHCPPOLL	and DHCPPRPL messages carry the
   current status of the sender.
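
   A minimal sketch of this polling rule follows. The names
   poll_primary, state_after_poll, and send_poll are hypothetical, and
   the sketch assumes the retry-count variant; an implementation could
   equally use an administratively determined timeout.

```python
from typing import Callable, Optional


def poll_primary(send_poll: Callable[[], Optional[str]],
                 max_retries: int) -> Optional[str]:
    """Poll the Primary Server, retrying up to `max_retries` times.

    `send_poll` sends one DHCPPOLL and returns the Primary's status as
    carried in the DHCPPRPL reply, or None if no reply arrived in time.
    Returns the Primary's status, or None when every attempt failed.
    """
    for _ in range(1 + max_retries):  # first attempt plus the retries
        primary_state = send_poll()
        if primary_state is not None:
            return primary_state
    return None


def state_after_poll(current: str, primary_state: Optional[str]) -> str:
    """Apply only the poll-failure rule of NORMAL state.

    Transitions driven by the Primary's reported status are not
    modeled here; in those cases `current` is returned unchanged.
    """
    if current == "NORMAL" and primary_state is None:
        return "COMMUNICATION-INTERRUPTED"
    return current
```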

   If an external command is received by the Secondary Server, it can
   move from NORMAL to PARTNER-DOWN state directly.  Such a command
   might be sent when the Primary Server has been removed from service
   and an operator wants the Secondary Server to take over immediately
   and completely from the Primary Server.  (Note that the Secondary
   Server also takes over from the Primary Server when in
   COMMUNICATION-INTERRUPTED state, but less completely than in
   PARTNER-DOWN state.)


9.5.  Secondary	Server in COMMUNICATION-INTERRUPTED state

   When	in COMMUNICATION-INTERRUPTED state the Secondary Server	operates
   in such a way that correct operation	is ensured even	if the Primary
   Server is still up and operational, but unable to communicate to the
   Secondary Server. When communications are reestablished between the
   Primary and Secondary Servers, if both are still in COMMUNICATION-
   INTERRUPTED state, then the re-integration of their operation will
   proceed automatically and without human intervention.  The protocol
   is designed to ensure that reintegration will proceed in an error
   free	manner and that	no actions taken by either server while	in
   COMMUNICATION-INTERRUPTED state will	cause any conflicts to occur
   during re-integration.

   In COMMUNICATION-INTERRUPTED	state, the Secondary Server responds to
   DHCP	client requests.

   When processing a DHCPREQUEST from a DHCP client, the Secondary
   Server MUST ensure that the client-lease-time is never more than the
   maximum-delta-lease-interval from the current-time, independent of
   the desired-client-lease-time.
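
   The lease-time limit above amounts to taking the smaller of two
   values. The sketch below assumes all times are expressed in seconds;
   the function names are hypothetical.

```python
def granted_lease_seconds(desired_client_lease_time: int,
                          max_delta_lease_interval: int) -> int:
    """Lease duration granted while in COMMUNICATION-INTERRUPTED state."""
    return min(desired_client_lease_time, max_delta_lease_interval)


def lease_expiration(current_time: int,
                     desired_client_lease_time: int,
                     max_delta_lease_interval: int) -> int:
    """Absolute expiration time, never later than now plus the MDLI."""
    return current_time + granted_lease_seconds(
        desired_client_lease_time, max_delta_lease_interval)
```

   With a desired lease of one day (86400) and an MDLI of one hour
   (3600), the client receives a lease of 3600 seconds.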

   When	processing a DHCPRELEASE request from a	DHCP client or the
   expiration of a lease, the Secondary	Server must not	reallocate the
   IP address to a different client.  If the same client subsequently
   performs a DHCPDISCOVER request, the	Secondary Server SHOULD	offer it
   the previously used IP address.

   When processing a DHCPDISCOVER request from a DHCP client, the
   Secondary Server MUST allocate IP addresses from the list of IP
   addresses that it acquired from the Primary Server while in NORMAL
   state (see section 9.4).  When it exhausts this list, it MUST stop
   responding to DHCPDISCOVER requests (except those it can satisfy by
   offering expired or released IP addresses to their previously bound
   clients).
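
   The DHCPDISCOVER handling described above can be sketched as
   follows. The data structures are assumptions made for illustration:
   private_pool is the list of addresses acquired from the Primary
   Server and not yet handed out, and previous_binding maps a client
   identifier to its expired or released address.

```python
from typing import Dict, List, Optional


def offer_for_discover(client_id: str,
                       private_pool: List[str],
                       previous_binding: Dict[str, str]) -> Optional[str]:
    """Pick an address to offer, or None to stay silent."""
    # An expired or released address may be offered to the client that
    # previously held it, and to no other client.
    if client_id in previous_binding:
        return previous_binding[client_id]
    # Otherwise draw from the addresses acquired from the Primary.
    if private_pool:
        return private_pool.pop(0)
    # List exhausted: do not respond to this DHCPDISCOVER.
    return None
```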

   The Secondary Server	MUST continue to send DHCPPOLL messages	to the
   Primary Server when in COMMUNICATION-INTERRUPTED state.  If it
   receives a DHCPPRPL message in reply, the Secondary Server determines
   the state of	the Primary Server.  If	the Primary Server is in NORMAL
   or COMMUNICATION-INTERRUPTED	state, then the	Secondary Server moves
   into	the SYNC state.

   If, however,	the Primary Server is in RECOVER state,	then the Secon-
   dary	Server updates the Primary Server with its known client	binding
   information,	and moves into NORMAL state upon completion of that
   update.

   If instructed to do so by an outside agency (e.g., an
   administrator), the Secondary Server SHOULD move into PARTNER-DOWN
   state.  Once the Secondary Server has been in
   COMMUNICATION-INTERRUPTED state for a period equal to the
   safe-period, then it may (if configured to do so) transition into
   the PARTNER-DOWN state in the absence of an external command.
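
   The exits from COMMUNICATION-INTERRUPTED state can be summarized in
   one function. This is a sketch under stated assumptions: the
   function and parameter names are hypothetical, the POTENTIAL-CONFLICT
   branch is taken from the "All Others" arc of Figure 9.2-1 rather
   than from the text of this section, and evaluating a poll reply
   before an operator command is an ordering chosen for the example.

```python
from typing import Optional


def comm_interrupted_next_state(primary_state: Optional[str],
                                seconds_in_state: float,
                                safe_period: Optional[float],
                                operator_command: bool = False) -> str:
    """Next Secondary state while in COMMUNICATION-INTERRUPTED.

    `primary_state` is the status from a DHCPPRPL reply, or None if
    the Primary did not answer. `safe_period` is None when the
    automatic transition is not configured.
    """
    if primary_state in ("NORMAL", "COMMUNICATION-INTERRUPTED"):
        return "SYNC"
    if primary_state == "RECOVER":
        # Entered once the Primary has been updated with the
        # Secondary's client binding information.
        return "NORMAL"
    if primary_state is not None:
        return "POTENTIAL-CONFLICT"   # "All Others" in Figure 9.2-1
    if operator_command:
        return "PARTNER-DOWN"
    if safe_period is not None and seconds_in_state >= safe_period:
        return "PARTNER-DOWN"
    return "COMMUNICATION-INTERRUPTED"
```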


9.6.  Secondary Server in SYNC state

   The Secondary Server does not respond to DHCP client requests when in
   SYNC state.

   DISCUSSION:

      This is the entire reason for this state's existence; otherwise
      the activities specified for this state could happen as part of a
      state transition from the COMMUNICATION-INTERRUPTED state to the
      NORMAL state. However, in the COMMUNICATION-INTERRUPTED state the
      Secondary Server responds to DHCP client requests. Having the
      Secondary Server respond to DHCP client requests during the
      synchronization process (and thus take actions requiring further
      synchronization) seemed like a bad idea.

   The Secondary Server synchronizes its information with the Primary
   Server while in SYNC state.  Both Primary and Secondary Servers may
   have information the other lacks because of operations performed
   while communications were interrupted.

   During the synchronization process, the Secondary Server continues to
   poll	the Primary Server with	DHCPPOLL messages.  If it fails	to
   receive a reply, it moves back into COMMUNICATION-INTERRUPTED state.

   When	synchronization	is complete, the Secondary Server moves	into
   NORMAL state.


9.7.  Secondary	Server in PARTNER-DOWN state

   The Secondary Server	responds to DHCP client	requests when in
   PARTNER-DOWN	state.

   Any available IP address which does not belong to the private pool
   established by the Secondary	Server (at entry to PARTNER-DOWN state)
   MUST	NOT be used until the MDLI beyond the entry into PARTNER-DOWN
   state has elapsed.

   The Secondary Server MUST NOT allocate an IP address to a DHCP client
   different from that to which it was allocated at the entrance to
   PARTNER-DOWN state until the MDLI beyond its expiration time has
   elapsed. If this time would be earlier than the current time plus the
   MDLI, then the current time plus the MDLI is used.
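
   The two waiting rules above reduce to simple arithmetic on
   timestamps. The sketch below assumes that the computation is made
   once, at entry to PARTNER-DOWN state, so that "current time" means
   the entry time; that reading, and the function names, are
   assumptions rather than statements of the protocol.

```python
def free_address_usable_at(partner_down_entry: float, mdli: float) -> float:
    """Earliest time an available address outside the private pool
    may be used: one MDLI after entry into PARTNER-DOWN state."""
    return partner_down_entry + mdli


def earliest_reallocation_time(lease_expiration: float,
                               partner_down_entry: float,
                               mdli: float) -> float:
    """Earliest time an address bound at entry to PARTNER-DOWN state
    may be given to a different client."""
    # MDLI beyond the lease expiration, but never earlier than MDLI
    # beyond the (assumed) time of entry into PARTNER-DOWN state.
    return max(lease_expiration + mdli, partner_down_entry + mdli)
```

   For a lease that expired at time 100, an entry time of 200, and an
   MDLI of 50, the address may be reallocated at time 250. If the lease
   instead expires at time 300, reallocation waits until time 350.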

   Two options exist for lease times, with different ramifications flow-
   ing from each.

   If the Secondary Server wishes the Failover Protocol	to protect it
   from	loss of	stable storage in any state, then it should ensure that
   the MDLI based lease	time restrictions in Section 6.1 are maintained,
   even	in PARTNER-DOWN	state.

   If the Secondary Server wishes to forego the	protection of the safe
   Failover Protocol in	the event of loss of stable storage, then it MAY
   recognize no	restrictions on	actual client lease times while	in
   PARTNER-DOWN	state.

   The Secondary Server	continues to poll the Primary Server with
   DHCPPOLL messages.  If the Secondary	Server receives	a reply, and the
   Primary Server is in	the RECOVER state, the Secondary Server	updates
   the Primary Server with all of the Secondary's client binding infor-
   mation, and then moves into the NORMAL state.

   If communications with the Primary Server are reestablished,	and the
   Primary Server is in	any other state	but RECOVER, the Secondary
   Server moves	into the POTENTIAL-CONFLICT state (as does the Primary
   Server).

9.8.  Secondary	Server in POTENTIAL-CONFLICT state

   The Secondary Server enters POTENTIAL-CONFLICT state when the
   combination of its state and that of the Primary Server indicates
   that a potential conflict of IP address allocation has occurred.
   There is no guarantee that such a conflict has occurred -- just the
   possibility.  In this state each server compares its client binding
   information with that of the other server and any conflicts are
   resolved in an implementation dependent manner.

   When	(and if) the resolution	process	completes, each	server moves
   into	the NORMAL state.


10.  Safe Period

   Due to the restrictions imposed on each server while in
   COMMUNICATION-INTERRUPTED state, long-term operation in this state is
   not feasible for either server. One reason that this state exists at
   all is to allow the servers to easily survive transient network
   communications failures of a few minutes to a few days (although the
   actual time periods will depend a great deal on the DHCP activity of
   the network in terms of arrival and departure of DHCP clients on the
   network).

   Eventually, when the servers are unable to communicate, they will
   have to move into a state where they can no longer re-integrate
   without some possibility of a duplicate IP address allocation.
   There are two ways that they can move into this state (known as
   PARTNER-DOWN).

   The first is to be informed by external command that, indeed, the
   partner server is down. In this case, there is no difficulty in
   moving into the PARTNER-DOWN state since it is an accurate
   reflection of reality and the protocol has been designed to operate
   correctly (even during reintegration) if, when in PARTNER-DOWN state,
   the partner is, indeed, down.

   The second applies when the servers are running unattended for
   extended periods. For this case the option is provided to configure
   a "safe-period" into each server. This OPTIONAL safe-period is the
   period after which either the Primary or Secondary Server will
   automatically transition to PARTNER-DOWN from
   COMMUNICATION-INTERRUPTED state.  If this transition is completed and
   the partner is not down, then the possibility of duplicate IP address
   allocations will exist.

   The goal of the "safe-period" is to allow network operations	staff
   some	time to	react to a server moving into COMMUNICATION-INTERRUPTED
   state.  During the safe-period the only requirement is that the net-
   work	operations staff determine if both servers are still running --
   and if they are, to either fix the network communications failure
   between them, or to take one	of the servers down before the	expira-
   tion	of the safe-period.

   The length of the safe-period is installation dependent, and	depends
   in large part on the	number of unallocated IP addresses within the
   subnet address pool and the expected	frequency of arrival of	previ-
   ously unknown DHCP clients requiring	IP addresses.  Many environments
   should be able to support safe-periods of several days.

   During this safe period, either server will allow renewals from any
   existing client.  The only limitation concerns the need for IP
   addresses for the DHCP server to hand out to	new DHCP clients and the
   need	to re-allocate IP addresses to different DHCP clients.

   The number of "extra" IP addresses required is equal	to the expected
   total number of new DHCP clients encountered during the safe period.
   This is dependent only on the arrival rate of new DHCP clients, not
   the total number of outstanding leases on IP	addresses.
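
   As a worked example of this sizing rule, the sketch below computes
   the number of "extra" addresses for a given arrival rate, and the
   longest safe period a given reserve can support. It assumes a
   constant average arrival rate, and the function names are
   hypothetical.

```python
import math


def extra_addresses_needed(new_clients_per_hour: float,
                           safe_period_hours: float) -> int:
    """Expected number of new DHCP clients during the safe period."""
    return math.ceil(new_clients_per_hour * safe_period_hours)


def longest_safe_period_hours(spare_addresses: int,
                              new_clients_per_hour: float) -> float:
    """Longest safe period the spare addresses can cover."""
    return spare_addresses / new_clients_per_hour
```

   At two new clients per hour, a two day (48 hour) safe period needs
   96 spare addresses, whatever the number of outstanding leases.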

   In the unlikely event that a relatively short safe period of an hour
   is all that can be used (given a dearth of IP addresses or a very
   high arrival rate of new DHCP clients), even that can provide
   substantial benefits in allowing the DHCP subsystem to ride through
   minor problems that could occur and be fixed within that hour.  In
   these cases, no possibility of duplicate IP address allocation
   exists, and re-integration after the failure is solved will be
   automatic and require no operator intervention.

11.  Open Issues

   A number of details remain to be worked out.  They are as follows:

     1.	Level of Agreement and Completion

	This draft is incomplete in two	senses.	 First,	none of	the
	authors	agree with everything written, and quite a number of
	issues remain to be worked out among the various authors (to say
	nothing	about the rest of the community).  Second, this	draft is
	not yet	complete enough	to support creation of inter-operable
	implementations.

        However, we believe that even though this draft is very much a
        work in progress, there is value in sharing it with the rest
        of the DHCP community in its current form.

     2.	Failover Port

        We need to resolve whether the Failover protocol uses the same
        port as the DHCP protocol or a different one.  In the interests
        of allowing implementation of the Failover protocol by a
        different process or sub-process, having it use a different port
        seems reasonable.

     3.	High Level Operations

        While the detailed operations are beginning to come together,
        the higher level operations (like reintegration) are, as yet,
        incompletely specified.  This will be rectified in a later
        revision.

     4.	Option Spaces

        The draft currently reflects some rather fuzzy goals of using
        DHCP options where they apply but also defining new options.  It
        uses the "user defined option space" for this, which is probably
        not a good idea.  Perhaps the DHCP Panel will produce a larger
        option space in which all of these options can be defined, or
        perhaps (as is written in the draft) this protocol will just
        have to define entirely unique options.

     5.	Subnet Level Granularity

        This protocol talks about a server being in one state or
        another; however, the desire is for this protocol to operate
        independently in each address pool for which a primary and
        secondary server are defined.  In this way, the "server" state
        really refers to the "subnet" state.  Once the protocol is
        validated, the editing work to make it operate at subnet
        granularity will be performed.

     6.	Secondary Server Communications	with DHCP Clients

	There are two situations where we may want to allow the	secon-
	dary server to communicate with	DHCP clients even though the
	secondary can communicate with the primary and would normally be
	unresponsive to	DHCP client requests.

	The first situation which deserves consideration is where the
	secondary has given a DHCP client a lease on an	IP address when
	it was not able	to communicate with the	primary, and then subse-
	quently	the secondary becomes able to communicate with the pri-
	mary.  When the	client unicasts	its DHCPREQUEST	to the secondary
	to renew its lease, the	secondary will not be able to communi-
	cate with the client (as this protocol is defined).  Should we
        allow the Secondary to extend the lease for the DHCP client and
        then inform the primary of that extension using the DHCPBNDUPD
        message in the same way as the Primary uses that message?

	The second situation arises where a client can only communicate
	with the secondary due to some network failure,	but the	primary
	and secondary server can communicate.  As written, the protocol
	will not allow the secondary to	offer a	lease to the DHCP
	client,	but it would be	straightforward	to modify the protocol
	to allow the secondary to do so.  The only difficult part of
	this change to the protocol would be to	suggest	how the	secon-
	dary would know	that the DHCP client could talk	only to	the
	secondary.  But, given that if the DHCP	primary	could talk to
	the DHCP client, the secondary would expect to hear about it in
	DHCPBNDUPD messages at some point, the absence of such messages
	could be used as a signal to communicate to the	DHCP client in
	question.


     7.	UDP or TCP

	There has been much debate about the utility of	using UDP for
	the failover protocol, since it	doesn't	supply guaranteed
	delivery.  Certainly rebuilding	TCP out	of UDP would be	a mis-
	take.  Some factors to consider	in this	debate are as follows:

        First, it is important to recognize that mere receipt of a
        packet by the other server in the pair (e.g., receipt of a
        DHCPBNDUPD packet by the secondary server) is not sufficient for
        the primary to update its own bindings database with new
        information about what the secondary knows.  In all cases of
        transfers of bindings information, the receiver of a DHCPBNDUPD
        message MUST update its own stable storage prior to replying
        with a DHCPBNDACK message (except in the marginal case where all
        of the updates are rejected).  An action is required by the
	receiving server and an	explicit ACK is	needed by the sending
	server to ensure the integrity of the protocol.	 So, just know-
	ing that the other server has received a Failover protocol
	packet is not intrinsically interesting.

	Second,	the DHCP protocol, both	the client and server side, is
	being implemented in progressively smaller and smaller machines.
	While this progression is most evident in DHCP clients,	there
	exist implementations today of DHCP servers embedded in	devices
	that are by no stretch of the imagination traditional "servers"
	running	mainstream operating systems.  In many ways, the Fail-
	over protocol is very well suited to such devices.  Adding addi-
	tional protocol	infrastructure requirements to implement the
	Failover protocol could	easily prevent its implementation in
	devices	that in	some ways need it most.

        Third, there are only a few cases where the Failover protocol
        requires guaranteed delivery of packets.  In particular, the
        normal Primary to Secondary DHCPBNDUPD message does not have to
        be delivered reliably.  The consequences of lost DHCPBNDUPD
        messages are handled by the use of the MDLI, for the simple
        reason that since these messages are "lazy", they may not get
        delivered because of a server failover prior to their
        transmission.  Given that the protocol is robust in the face of
        loss of either a DHCPBNDUPD message or a DHCPBNDACK message, a
        technique known as "fire and forget" may be used with this
        protocol and two cooperating implementations.  If the DHCPBNDACK
        message contains all of the information originally in the
        DHCPBNDUPD message, then the DHCPBNDUPD message may be
        transmitted and forgotten by the sending server (typically the
        primary).  When and if the secondary receives the DHCPBNDUPD and
        replies with a DHCPBNDACK message and the primary receives it,
        the primary will update its stable storage with a new picture of
        what the secondary knows about the lease time.  If either of
        these messages is lost, the only downside is that the DHCP
        client associated with the binding in question may receive a
        shorter lease for one lease period than it would otherwise.
        This "fire and forget" technique could substantially ease both
        the complexity of implementation and memory requirements of an
        implementation of the Failover protocol, especially where two
        servers are communicating over a very slow link.
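
        A rough sketch of the "fire and forget" idea follows.  It is
        illustrative only: the message is modeled as a dictionary, and
        the field names ip_address and lease_expiration, like the
        function names, are hypothetical rather than taken from the
        message format.

```python
from typing import Callable, Dict

Message = Dict[str, object]


def send_update(binding: Message,
                transmit: Callable[[Message], None]) -> None:
    """Transmit a DHCPBNDUPD and keep no per-message state."""
    transmit(dict(binding))  # nothing is queued for retransmission


def handle_ack(ack: Message, partner_view: Dict[str, object]) -> None:
    """Record what the partner acknowledged, using only the ACK.

    This works because the DHCPBNDACK is assumed to echo all of the
    information that was in the DHCPBNDUPD.
    """
    partner_view[str(ack["ip_address"])] = ack["lease_expiration"]
```

        If either message is lost, handle_ack is simply never called
        for that binding, and the sender's view of what the partner
        knows stays at its previous, more conservative value.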

12.  Acknowledgments

   Ralph Droms started it all, by sketching out	an initial interserver
   draft that embodied ideas from several past IETF meetings.  In that
   draft, he acknowledged contributions	by Jeff	Mogul, Greg Minshall,
   Rob Stevens,	Walt Wimer, Ted	Lemon, and the DHC working group.

   Kim Kinnear and Bob Cole each extended that draft, separately and
   then together, until they created an interserver draft that supported
   any number of servers.  The complexity of that approach was just too
   great, and led to a much simpler approach embodied in the first
   Failover draft by Greg Rabil, Mike Dooley, Arun Kapur, and Ralph
   Droms.  This draft posited only two servers -- a primary and a
   secondary.  Kim Kinnear then wrote the Safe Failover draft to layer
   on top of the Failover Draft and increase its robustness in the face
   of certain rare network failures. At the spring 1998 IETF meeting in
   LA, the DHC working group said that they wanted a merged Failover and
   Safe Failover draft.  Steve Gonczi and Bernie Volz stepped up and
   produced the raw material for such a merged draft, along with a new
   message format designed around DHCP options and other extensions and
   clarifications.  Kim Kinnear edited their work into draft format and
   made other changes, and that is what you have in your hands.

   Many	people have reviewed the various drafts	that went into this
   result.  At American	Internet, ideas	have been contributed by Mark
   Stapp, Brad Parker, and Ellen Garvey.  Glenn	Waters of Bay Networks
   contributed ideas and enthusiasm to make a Failover protocol	that was
   both	"safe" and "lazy".


13.  References


	[1] Droms, R., "Dynamic	Host Configuration Protocol", RFC 2131,
	    March 1997.

        [2] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
            Extensions", RFC 2132, March 1997.


	[3] Rabil, G., Dooley, M., Kapur, A., Droms, R., "DHCP Failover
	    Protocol", draft-ietf-dhc-failover-00.txt.

	[4] Gudmundsson, Olafur, "Security Architecture	for DHCP",
	    draft-ietf-dhc-security-arch-00.txt.

14.  Authors' Information

      Ralph Droms
      323 Dana Engineering
      Bucknell University
      Lewisburg, PA  17837

      Phone: (717) 524-1145
      EMail: droms@bucknell.edu


      Greg Rabil, Mike Dooley, Arun Kapur
      Quadritek	Systems, Inc.
      10 Valley	Stream Parkway,	Suite 240
      Malvern, PA 19355

      Phone: (800) 208-2747

      EMail: grabil@quadritek.com
	     mdooley@quadritek.com
	     akapur@quadritek.com


      Kim Kinnear
      American Internet	Corporation
      4	Preston	Ct.
      Bedford, MA  01730-2334

      Phone: (781) 276-4587
      EMail: kinnear@american.com


      Steve Gonczi, Bernie Volz
      Process Software Corporation
      959 Concord St.
      Framingham, MA  01701

      Phone: (508) 879-6994

      EMail: gonczi@process.com
	     volz@process.com

