Network Working Group                                        M. Schwartz
Internet-Draft                                     Code On The Road, LLC
Expires: April 7, 2002                                   October 7, 2001

       The ANTACID Replication Service: Protocol and Algorithms
                  draft-schwartz-antacid-protocol-00
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026 except that the right to
produce derivative works is not granted. (If this document becomes
part of an IETF working group activity, then it will be brought into
full compliance with Section 10 of RFC 2026.)
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 7, 2002.
Copyright Notice
Copyright (C) The Internet Society (2001). All Rights Reserved.
Abstract
This memo specifies the protocol and algorithms of the ANTACID
Replication Service, designed to replicate hierarchically named
repositories of XML documents for business-critical, internetworked
applications.
ASCII and HTML versions of this document are available at
http://www.codeontheroad.com/papers/draft-schwartz-antacid-
protocol.txt and http://www.codeontheroad.com/papers/draft-schwartz-
antacid-protocol.html, respectively.
Schwartz Expires April 7, 2002 [Page 1]
Internet-Draft ANTACID Protocol October 2001
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 4
2. Walk-Through of Example ARS Interactions . . . . . . . . . 5
2.1 ARS Commit-and-Propagate Protocol (ars-c) . . . . . . . . 7
2.2 ARS Submission-Propagation Protocol (ars-s) . . . . . . . 12
2.3 ARS Encoding Negotiation Protocol (ars-e) . . . . . . . . 18
2.4 ARS Service Implementing All Three Sub-Protocols . . . . . 19
3. ARS Syntax and Semantics . . . . . . . . . . . . . . . . . 24
3.1 Identifiers, Data Representation, and Error Signaling . . 24
3.1.1 ARS Server Identification . . . . . . . . . . . . . . . . 24
3.1.2 Sequence Numbers . . . . . . . . . . . . . . . . . . . . . 26
3.1.3 DataWithOps Encoding . . . . . . . . . . . . . . . . . . . 27
3.1.4 ARSError . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 ARS Message Semantics . . . . . . . . . . . . . . . . . . 32
3.2.1 ARS Commit-and-Propagate Protocol (ars-c) . . . . . . . . 33
3.2.1.1 SubmitUpdate . . . . . . . . . . . . . . . . . . . . . . . 33
3.2.1.2 SubmittedUpdateResultNotification . . . . . . . . . . . . 36
3.2.1.3 PushCommittedUpdates . . . . . . . . . . . . . . . . . . . 36
3.2.1.4 PullCommittedUpdates . . . . . . . . . . . . . . . . . . . 37
3.3 ARS Submission-Propagation Protocol (ars-s) . . . . . . . 38
3.3.1 PropagateSubmittedUpdate . . . . . . . . . . . . . . . . . 38
3.3.2 SubmittedUpdateResultNotification Extended Semantics . . . 39
3.4 ARS Encoding Negotiation Protocol (ars-e) . . . . . . . . 39
4. Algorithms and Implementation Details . . . . . . . . . . 41
4.1 ARS Meta-Data Management . . . . . . . . . . . . . . . . . 41
4.1.1 Document State . . . . . . . . . . . . . . . . . . . . . . 41
4.1.2 Committed Update State Management . . . . . . . . . . . . 41
4.1.3 Committed Update Collapsing . . . . . . . . . . . . . . . 42
4.1.4 Per Server Sequence Number State . . . . . . . . . . . . . 45
4.1.5 Locking . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.1.6 Server Configuration Data . . . . . . . . . . . . . . . . 47
4.1.6.1 Replication Topology (Normative) . . . . . . . . . . . . . 47
4.1.6.2 Local Implementation Settings (Non-Normative) . . . . . . 50
4.2 Protocol Processing . . . . . . . . . . . . . . . . . . . 51
4.2.1 ARS Commit-and-Propagate Protocol (ars-c) . . . . . . . . 51
4.2.1.1 SubmitUpdate Processing . . . . . . . . . . . . . . . . . 52
4.2.1.2 SubmittedUpdateResultNotification Processing . . . . . . . 55
4.2.1.3 PushCommittedUpdates Processing . . . . . . . . . . . . . 56
4.2.1.4 PullCommittedUpdates Processing . . . . . . . . . . . . . 56
4.2.1.5 Submitted Update Collapsing for Infrequently Synchronized
Peers . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 ARS Submission-Propagation Protocol (ars-s) Processing . . 58
4.2.2.1 Non-Primary SubmitUpdate Processing . . . . . . . . . . . 58
4.2.2.2 Non-Primary PropagateSubmittedUpdate Processing . . . . . 59
4.2.2.3 Primary PropagateSubmittedUpdate Processing . . . . . . . 61
4.2.2.4 PushCommittedUpdates and PullCommittedUpdates Scheduling . 62
4.2.2.5 PullCommittedUpdates Synchronization . . . . . . . . . . . 63
4.2.2.6 SubmittedUpdateResultNotification Synchronization . . . . 64
4.2.2.7 Submitted Update Reordering Details . . . . . . . . . . . 66
4.2.3 ARS Encoding Negotiation Protocol (ars-e) Processing . . . 68
4.3 Example State Transition Diagrams . . . . . . . . . . . . 69
5. Security Considerations . . . . . . . . . . . . . . . . . 72
References . . . . . . . . . . . . . . . . . . . . . . . . 73
Author's Address . . . . . . . . . . . . . . . . . . . . . 73
A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . 74
B. Future Enhancements and Investigations . . . . . . . . . . 75
C. ANTACID Replication Service Registration . . . . . . . . . 77
D. ARS Top-Level DTD . . . . . . . . . . . . . . . . . . . . 78
E. ars-c DTD . . . . . . . . . . . . . . . . . . . . . . . . 80
F. ars-s DTD . . . . . . . . . . . . . . . . . . . . . . . . 86
G. ars-e DTD . . . . . . . . . . . . . . . . . . . . . . . . 88
H. ARS Topology Configuration DTD . . . . . . . . . . . . . . 90
I. Current Encodings and Registration Procedures . . . . . . 93
I.1 Currently Defined Encodings . . . . . . . . . . . . . . . 93
I.2 Encoding Registration Procedures . . . . . . . . . . . . . 96
Full Copyright Statement . . . . . . . . . . . . . . . . . 97
1. Introduction
This document specifies the protocol and algorithms used to implement
the ANTACID Replication Service (ARS). Readers are referred to [1]
for a motivation of the problem addressed, the replication
architecture, and terminology used in the current document. The
current document assumes the reader has already read that document,
and that the reader is familiar with XML [2]. Moreover, since the
ARS protocol is defined in terms of a BEEP [3] profile, readers are
referred to that document for background.
We begin (Section 2) by walking through example ARS interactions, to
give the reader a concrete flavor for how the protocol works. We
then (Section 3) present the ARS syntax and semantics, and finally
(Section 4) provide algorithms and implementation details.
2. Walk-Through of Example ARS Interactions
ARS updates follow a simple pattern, with Submit Sequence Numbers
(SSN's) assigned by each submission server flowing up the DAG and
Commit Sequence Numbers (CSN's) assigned by the primary flowing back
down after a submission has committed at the primary. As an example,
consider the DAG illustrated below:
             svr3
             |  |
            \|/ \|/
          svr2<--svr4
           |       |
          \|/     \|/
          svr1    svr5
In this diagram, arc directions indicate the "is a downstream server
from" relationship. Thus, svr3 is the zone primary, svr1, svr2,
svr4, and svr5 are non-primaries, svr2 and svr4 are downstream from
svr3, svr2 is downstream from svr4, svr1 is downstream from svr2, and
svr5 is downstream from svr4.
Given this DAG, an update submitted at svr1 might be assigned SSN 1
by svr1, and then be propagated by svr1 to svr2, and then from svr2
to svr4, and then from svr4 to svr3. svr3 serializes the update
submission, commits the update, and assigns it a CSN of, say, 2. At
this point the committed update propagates back down the DAG, for
example first to svr4 and svr2 (from svr3), and then in parallel from
svr4 to svr5 and from svr2 to svr1. As this example illustrates, the
path by which committed updates propagate down the DAG may differ
from the path by which submissions are propagated up the DAG.
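The flow described above can be sketched in code. The following is an
illustrative simulation only (the Server and Primary classes and their
methods are assumptions of this sketch, not protocol elements): a
submission receives an SSN from the submission server, travels up the
DAG to the primary, and the primary serializes it and assigns the CSN
that later flows back down.

```python
class Server:
    """A non-primary ARS server that assigns SSNs to submissions."""
    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream   # next hop toward the zone primary
        self.next_ssn = 1

    def submit(self, update):
        ssn = self.next_ssn        # assign the Submit Sequence Number
        self.next_ssn += 1
        # Propagate the submission up the DAG until it reaches the primary.
        hop = self.upstream
        while hop.upstream is not None:
            hop = hop.upstream
        return hop.commit(self.name, ssn, update)

class Primary(Server):
    """The zone primary serializes submissions and assigns CSNs."""
    def __init__(self, name):
        super().__init__(name)
        self.next_csn = 2          # CSN 1 is the implicit pre-replication state
        self.committed = []

    def commit(self, submit_svr, ssn, update):
        csn = self.next_csn
        self.next_csn += 1
        self.committed.append((csn, submit_svr, ssn, update))
        return csn                 # flows back down with the committed update

# The DAG from the example: svr3 is primary; svr1 submits via svr2 and svr4.
svr3 = Primary("svr3")
svr4 = Server("svr4", upstream=svr3)
svr2 = Server("svr2", upstream=svr4)
svr1 = Server("svr1", upstream=svr2)

csn = svr1.submit("create doc-a")
```

As in the text, the first update submitted at svr1 gets SSN 1 there and
a CSN of 2 at the primary.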
This DAG represents a set of ARS servers that implement ars-c as well
as ars-s, which supports updates being submitted to non-primary
servers and propagated up to the primary. In an ARS service that
implements only ars-c all updates must be submitted to the primary.
For that case, the only propagation that occurs is when committed
updates propagate from the primary to all downstream servers.
Given this basic understanding of how submitted and committed updates
propagate across the DAG, we now walk through examples of the
protocol content exchanged between a set of ARS peers. We start with
a server that implements the minimal required ARS protocol elements
(ars-c). We then show the additional functionality of ars-s and ars-
e, each in turn.
The examples in this section are based on a pair of servers
configured as follows:
         /->svr1 (primary)
        /      |
  client       |
        \     \|/
         \->svr2 (non-primary)
In some of the examples the client makes requests of the primary. In
other examples the client makes requests of the non-primary. Here
the DAG between the servers is just a single edge, but in general
there could be many servers upstream and downstream from each server
(except for the primary, which never has upstream servers unless it
is also a non-primary for other zones).
In the examples we list the communication endpoints flush left on
the page, with the transmitted content indented, like so:
client->svr1:
...
The "client->svr1:" above is only for labeling the flow as going from
the client to svr1, and is not part of the transmitted ARS content.
The indented text is the transmitted ARS content.
The examples here all use the Blocks [4] name space.
2.1 ARS Commit-and-Propagate Protocol (ars-c)
For the simplest case, the interaction begins when the client
performs a "SubmitUpdate" request to the zone primary:
client->svr1:
...
...
Here the client generates and passes a request number that can be
used to correlate the response with the request, to support
concurrent requests. The client makes a SubmitUpdate request,
passing the host name and port number to which notification should be
sent when the update completes or fails, as well as a single
UpdateGroup containing all updates to be performed. The request uses
the DataWithOps encoding, since in this basic example the ARS client
and server do not support any other encodings. The DataWithOps
encoding contains a set of (in this case 2) documents, each of which
has an associated operation (in this case "create") to be performed
and attributes containing the document's name and CSN. Because the
data in this example come from the Blocks name space, the name and
CSN information are also contained as attributes within the Blocks.
This redundancy happens because Blocks require these attributes to be
present in the root XML element, as additional structure beyond that
imposed by ARS. Since ARS only assumes documents and not the more
constrained structure of Blocks, the name and CSN need to be included
in the ARS encoding. Finally, note that because the documents have
not yet been created in the datastore, the CSN is not meaningful.
The CSN value is only meaningful once the content has been created in
the datastore.
The server responds as follows:
svr1->client:
The ARSAnswer element contains the server's host name, port,
incarnation stamp, and a 64 bit Submit Sequence Number assigned by
the submission server. Together, these four pieces of data
constitute an identification of the update submission that is
globally unique for all time, called the GlobalSubmitID.
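The identifier structure just described can be sketched as follows.
This is a hedged illustration (the Python field names are assumptions;
the draft defines the GlobalSubmitID as the submission server's host,
port, and incarnation stamp, plus the assigned SSN):

```python
from typing import NamedTuple

class GlobalServerID(NamedTuple):
    host: str         # DNS name of the server
    port: int
    incarnation: int  # 64 bit incarnation stamp

class GlobalSubmitID(NamedTuple):
    server: GlobalServerID
    ssn: int          # 64 bit Submit Sequence Number

# Two submissions from the same server incarnation differ only in SSN,
# so the combined identifiers remain unique for all time.
svr1 = GlobalServerID("svr1.example.com", 10288, 1002477734)
a = GlobalSubmitID(svr1, 1)
b = GlobalSubmitID(svr1, 2)
```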
This ARSAnswer indicates that the submission was successfully
received, and that the server has entered into the client-server
promise described in [1]. If an error had occurred the response
would have contained an ARSError (Section 3.1.4) instead of an
ARSAnswer.
At some later time, the server performs a
SubmittedUpdateResultNotification request to notify the client that
the update has been successfully committed, and the client
acknowledges receipt of this notification:
svr1->client:
client->svr1:
The ReqNum here is 5 because the server happens to have performed 4
other requests before this one. The
SubmittedUpdateResultNotification element contains the four
attributes that constitute the GlobalSubmitID, as well as two other
attributes: the 64 bit CSN that was assigned by the primary when it
committed this update, and the URI [5] of the top node in the zone
within which this update occurred. The URI in effect names the zone.
It is needed because servers can handle multiple zones, and CSN's are
allocated per zone. Together, the URI and CSN constitute an
identification of the update commit event that is globally unique for
all time.
This ARSAnswer indicates that the submission was successfully
committed. If it had failed the SubmittedUpdateResultNotification
would have contained an ARSError element describing the error, and
the CSN would have been 0.
At a time determined by the local implementation's configuration
settings, the primary performs a PushCommittedUpdates request to
suggest to the non-primary that new committed updates are available
to be pulled. The non-primary acknowledges this PushCommittedUpdates
with an ARSResponse:
svr1->svr2:
svr2->svr1:
The PushCommittedUpdates request specifies the host and port from
which the request was initiated. This is done rather than relying on
looking up this information from the underlying transport service
(BEEP) because the transmission could arrive on a different port than
the advertised port on which the server accepts requests. In fact, a
local implementation may choose to split receiving and sending onto
separate machines to distribute load and failure modes, similar to
how some commercial email services split processing for POP [6] and
SMTP [7].
At this point, the non-primary performs a PullCommittedUpdates
request to request newly available updates:
svr2->svr1:
blocks:test.schwartz
0
Similar to the PushCommittedUpdates request, the PullCommittedUpdates
request specifies the host and port from which the request was
initiated. The PullCommittedUpdates request names the URI of the
zone for which it wants updates, and the last CSN it has seen for
that zone. By specifying a LastSeenCSN of 0, the non-primary is
requesting the entire zone content (the first valid CSN is defined to
be 1).
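The selection rule for answering a PullCommittedUpdates request can be
sketched as below. This is a minimal illustration (the function and
variable names are assumptions of this sketch): the responder returns
every committed update with a CSN greater than the requester's
LastSeenCSN, so LastSeenCSN=0 yields the entire zone.

```python
def pull_committed_updates(committed, last_seen_csn):
    """committed: mapping of CSN -> update payload for one zone."""
    return [(csn, committed[csn])
            for csn in sorted(committed)
            if csn > last_seen_csn]

# Toy zone state: three committed updates (CSNs start at 2).
zone = {2: "write doc-a", 3: "write doc-b", 4: "delete doc-a"}

everything = pull_committed_updates(zone, 0)   # full zone transfer
recent = pull_committed_updates(zone, 3)       # only updates after CSN 3
```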
The primary responds with the requested updates:
svr1->svr2:
...
...
Note that the documents have their CSN's set, per the value assigned
by the primary at commit time. Also, the operations sent are "write"
(rather than the "create" specified when the update was submitted) in
order to ensure that the operation succeeds in the case where update
collapsing (see [1]) is performed. Collapsing will be discussed in
more detail later.
2.2 ARS Submission-Propagation Protocol (ars-s)
We begin with the client submitting an update request to the non-
primary server:
client->svr2:
...
...
The content of this request is identical to that discussed in the
earlier example (Section 2.1). Only the destination of the request
has changed.
The server responds, noting that it has successfully received the
request:
svr2->client:
Again the content is identical to that shown in the earlier example
(Section 2.1), but with a different source (and SubmisSvrHost) for
the response.
At this point, the non-primary server relays the request by making a
PropagateSubmittedUpdate request to the primary server:
svr2->svr1:
...
...
The PropagateSubmittedUpdate request contains the GlobalSubmitID of
the request (indicating that the request was submitted at svr2), but
has re-written the NotifyHost and NotifyPort to refer to svr2, so
that it will find out when the request completes or fails.
Otherwise, the content of the request is identical to what svr2
received from the client.
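The relay step just described can be sketched as follows. This is an
illustrative sketch only (the dict representation and helper name are
assumptions; the attribute names mirror the protocol): the non-primary
leaves the GlobalSubmitID fields untouched but rewrites NotifyHost and
NotifyPort to itself before propagating upstream.

```python
def relay_submission(request, my_host, my_port):
    """Rewrite notification target while preserving the GlobalSubmitID."""
    relayed = dict(request)           # GlobalSubmitID fields pass through
    relayed["NotifyHost"] = my_host   # completion notices now come to us
    relayed["NotifyPort"] = my_port
    return relayed

# Toy request as received by svr2 from the client.
req = {"SubmisSvrHost": "svr2.example.com", "SubmisSvrPort": 10288,
       "SubmisSvrIncarn": 1002477734, "SSN": 1,
       "NotifyHost": "client.example.com", "NotifyPort": 4660}

upstream_req = relay_submission(req, "svr2.example.com", 10288)
```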
The primary then responds, acknowledging that it has successfully
received the PropagateSubmittedUpdate request and has entered into
the client-server promise, providing a chain of responsibility from
client to svr2 to svr1:
svr1->svr2:
At some later time, the primary commits the update and performs a
SubmittedUpdateResultNotification to inform the non-primary that the
request has completed successfully. The non-primary acknowledges
this SubmittedUpdateResultNotification with an ARSResponse:
svr1->svr2:
svr2->svr1:
Note that ars-s is re-using the SubmittedUpdateResultNotification
element defined by ars-c, for informing a downstream server about the
completion status of a pending update.
At some later time, the primary performs a PushCommittedUpdates, the
non-primary follows with a PullCommittedUpdates, and the primary
responds with the requested updates:
svr1->svr2:
svr2->svr1:
svr2->svr1:
blocks:test.schwartz
0
svr1->svr2:
...
...
At this point, the non-primary performs a
SubmittedUpdateResultNotification, to notify the client that its
update submission has successfully committed, and the client
acknowledges receipt of this notification:
svr2->client:
client->svr2:
Note that this SubmittedUpdateResultNotification indicates that the
update has now committed at the non-primary. This is important
because it means the client can now interact with the non-primary
copy and expect to see the committed update. The client can
correlate this response to the submission it had made based on the
GlobalSubmitID information (host, port, incarnation stamp, and SSN)
contained in the SubmittedUpdateResultNotification attributes.
2.3 ARS Encoding Negotiation Protocol (ars-e)
ContentEncodingNegotiation can be performed between any pair of ARS
peers, to determine if an expanded set of encodings is available
beyond the default DataWithOps encoding. As an example, the non-
primary server might perform a ContentEncodingNegotiation with the
primary as follows:
svr2->svr1:
DataWithOps
AllZoneData
EllipsisNotation
svr1->svr2:
DataWithOps
AllZoneData
The ContentEncodingNegotiation element contains a ZoneTopNodeName
attribute specifying the URI of the top node in the zone to which
this encoding is to apply, because the set of encodings supported may
vary by zone. The ContentEncodingNegotiation also contains one or
more ContentEncodingName elements corresponding to content encodings
the initiator supports. The responder sends back the subset of the
requested encodings that it supports.
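The negotiation rule amounts to an intersection, as sketched below.
(Preserving the initiator's proposal order in the response is an
assumption of this sketch, not a requirement stated by the protocol.)

```python
def negotiate_encodings(offered, supported):
    """Return the subset of the initiator's encodings the responder supports."""
    supported = set(supported)
    return [enc for enc in offered if enc in supported]

# Matches the example exchange: svr2 offers three encodings,
# svr1 supports only two of them.
offered = ["DataWithOps", "AllZoneData", "EllipsisNotation"]
svr1_supports = ["DataWithOps", "AllZoneData"]

agreed = negotiate_encodings(offered, svr1_supports)
```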
2.4 ARS Service Implementing All Three Sub-Protocols
Below we put together all of the protocol pieces discussed in the
last three sub-sections, showing how a system supporting all three
ARS sub-protocols might function:
client->svr2:
...
...
svr2->client:
svr2->svr1:
DataWithOps
AllZoneData
EllipsisNotation
svr1->svr2:
DataWithOps
AllZoneData
svr2->svr1:
...
...
svr1->svr2:
svr1->svr2:
svr2->svr1:
svr1->svr2:
svr2->svr1:
blocks:test.schwartz
0
svr1->svr2:
...
...
svr2->svr1:
svr2->client:
client->svr2:
Several subtleties of the protocol can be observed from this example:
o When it first connects to the primary, the non-primary performs a
ContentEncodingNegotiation, and finds that the primary supports
fewer content encodings than it does.
o The non-primary uses the DataWithOps encoding to propagate the update
submission, but the primary uses a different encoding
(AllZoneData) to propagate the committed updates. The AllZoneData
encoding is used because the non-primary's PullCommittedUpdates
request asked for all updates performed on the zone
(LastSeenCSN=0). (The use of the AllZoneData encoding is
discussed in more detail later (Appendix I.1).)
o The primary happened to perform a PushCommittedUpdates to inform
the non-primary that new committed updates are available BEFORE
the primary performs the SubmittedUpdateResultNotification for
this update. This can happen because these two types of messages
are asynchronous and decoupled from one another, for reasons that
will be discussed later (Section 4). For this reason, the non-
primary cannot perform the SubmittedUpdateResultNotification to
the client until it has received the
SubmittedUpdateResultNotification from the primary AND performed
the PullCommittedUpdates request (so that it knows the CSN
corresponding to the SSN it had assigned).
o The non-primary's PullCommittedUpdates request overlaps with the
non-primary's receipt of the SubmittedUpdateResultNotification
from the primary. This can happen because, again, result
notification and the scheduling of
PushCommittedUpdates/PullCommittedUpdates requests are
asynchronous processes.
o Once both the SubmittedUpdateResultNotification and the
PullCommittedUpdates requests have completed, the non-primary
detects that it is now time to notify the client of the committed
update's success -- which it does, providing both the original
GlobalSubmitID that it had assigned and the CSN that the primary
assigned for this update.
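The gating behavior in the last two bullets can be sketched as a small
state machine. This is an illustrative sketch (class and method names
are assumptions): the non-primary notifies the client only once it has
both received the primary's result notification and pulled committed
updates at least up through the assigned CSN; the two events may
arrive in either order.

```python
class PendingSubmission:
    def __init__(self, ssn):
        self.ssn = ssn
        self.result_csn = None      # from SubmittedUpdateResultNotification
        self.last_pulled_csn = 0    # progress of PullCommittedUpdates
        self.client_notified = False

    def on_result_notification(self, csn):
        self.result_csn = csn
        self._maybe_notify()

    def on_pull_complete(self, last_pulled_csn):
        self.last_pulled_csn = max(self.last_pulled_csn, last_pulled_csn)
        self._maybe_notify()

    def _maybe_notify(self):
        # Both conditions must hold before the client is told of success.
        if (self.result_csn is not None
                and self.last_pulled_csn >= self.result_csn):
            self.client_notified = True  # send notification with SSN + CSN

p = PendingSubmission(ssn=1)
p.on_pull_complete(2)            # pull finished first: CSN still unknown
assert not p.client_notified
p.on_result_notification(2)      # now both conditions hold
```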
3. ARS Syntax and Semantics
In this section we present the ARS syntax and semantics. We begin
with how ARS identifies and encodes information within its messages:
server identification, submitted and committed update sequence
numbers, default data encodings, and error signaling between ARS
peers. We then describe the structure and meaning of messages
exchanged between ARS peers.
3.1 Identifiers, Data Representation, and Error Signaling
3.1.1 ARS Server Identification
Each ARS server has a global server identifier (GlobalServerID),
which consists of a Domain Name System (DNS [8]) name, server
incarnation stamp, and port number. The GlobalServerID must be
unique for all time. If the server moves to a machine with a
different DNS name, its GlobalServerID changes. A level of naming
indirection can be used to minimize operational problems from this
(e.g., a DNS CNAME called ars.example.com that points to
host3.example.com).
A GlobalServerID-identified server must never use the same SSN for
two different update submissions. The incarnation stamp provides a
way for a server that loses track of its last assigned SSN (e.g., due
to a disk crash) to assign a new incarnation stamp and restart its
SSN allocation sequence. If not for the incarnation stamp, a server
losing its SSN state would be forced to move to a different host name
or port number, which would be an ARS peer-visible change. Note that
ARS peers contact each other using only the host and port
information. The incarnation stamp is only used as part of
GlobalServerID's, which in turn provide a key for looking up
replication state (such as the last seen SSN from a particular
server).
Note that if a new server incarnation is established, no ordering
constraints are defined with respect to the previous server
incarnation. For example, an update submitted to the newly
incarnated server might be serialized before an update that had been
submitted chronologically earlier at the previous server incarnation.
At present there is no recovery mechanism if a primary server loses
track of its last assigned CSN. Primary servers must therefore be
run with more failure-resilient technology than non-primary servers
-- for example using RAID-5 plus hot backups. Note that an
incarnation stamp approach would be problematic for primary servers
because it would mean that updates committed after server re-
incarnation would have no defined serialization relationship with
those committed before re-incarnation, which in turn violates
convergent consistency requirements.
The incarnation stamp is a 64 bit number generated from the time-of-
day clock on the server for which the incarnation stamp is being
generated. There is no clock synchronization requirement, since the
stamp for any particular server is always generated by a single
machine. Nor is there a requirement that the time stamp be formed
according to any particular clock format (e.g., the UNIX seconds-
since-midnight-1970 epoch -- although the examples in this document
use that format). The only requirement is that a newly generated
incarnation stamp must be at least one greater than the previously
assigned incarnation stamp for that server.
The reason for using a timestamp rather than a simple counter is that
using a timestamp reduces the chances for an administrative error
that would assign an incarnation number that had already been
assigned. In particular, the only state needed to generate a new
incarnation stamp is the current time-of-day clock, which is readily
available without access to any previous replication server state
(which may have been completely destroyed by a disk crash).
Incarnation stamp 0 is defined to be invalid, and thus can be used by
the server implementation as a pre-initialized value to ensure a
valid incarnation stamp has been received during later processing.
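A minimal sketch of stamp generation under these rules follows (the
helper name is an assumption of this sketch): the stamp comes from the
time-of-day clock, stamp 0 is reserved as invalid, and a new stamp
must be at least one greater than the previous one, which protects
against a clock that is stuck or has stepped backwards.

```python
import time

INVALID_INCARNATION = 0   # usable as a pre-initialized sentinel value

def new_incarnation_stamp(previous=INVALID_INCARNATION, clock=time.time):
    """Return a 64 bit stamp at least one greater than `previous`."""
    stamp = int(clock())                  # e.g. Unix seconds-since-1970
    return max(stamp, previous + 1) & 0xFFFFFFFFFFFFFFFF

first = new_incarnation_stamp()
# Even with a stuck (or backwards) clock, the stamp still advances:
second = new_incarnation_stamp(previous=first, clock=lambda: 0)
```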
3.1.2 Sequence Numbers
ARS uses 64 bit unsigned integer sequence numbers to provide unique-
for-all-time identification of submitted and committed updates being
processed by individual servers. For example, this counter size
would allow one million updates per second to a particular zone for
585,000 years without wrapping. There are two types of sequence
numbers:
1. SSN: used by ars-s, the Submit Sequence Number (SSN) is allocated
per submission server per zone to serialize all update
submissions to a zone/server pair. The SSN plus GlobalServerID
constitutes a GlobalSubmitID that uniquely identifies a
submission for all time. The SSN imposes a total ordering over
all updates submitted at that server, and a partial ordering over
all updates globally.
2. CSN: used by ars-c, the Commit Sequence Number (CSN) is allocated
per update per zone by the zone primary server after an entire
update submission has been received and checked for various
problems (discussed below). Each successfully committed
UpdateGroup is assigned a CSN (the value for which is
subsequently associated with all documents in that UpdateGroup),
which in turn serializes all update submissions to a zone so that
updates are committed in the same order globally. More formally,
the CSN imposes a global ordering on all updates that respects
the partial orderings imposed by the SSN's from all submission
servers for the zone.
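The wrap-around figure quoted above is easy to verify: a 64 bit
unsigned counter consumed at one million updates per second lasts
roughly 585,000 years.

```python
# Check the claim: 2**64 sequence numbers at 1,000,000 updates/second.
SECONDS_PER_YEAR = 3600 * 24 * 365
updates_per_second = 1_000_000

years_until_wrap = 2**64 // updates_per_second // SECONDS_PER_YEAR
```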
Note that there is no need for logical clocks [9] for sequence
numbers because updates are not applied at database replicas until
they have been serialized at the primary. In fact, logical clocks
must not be used because that would cause gaps in the SSN sequence,
which would appear to the primary as missing update submissions.
In the case of an UpdateGroup a single CSN must be assigned to the
entire update (rather than one CSN per document within the update
submission).
Sequence number 0 (for both SSN's and CSN's) is defined to be
invalid. It is used in three cases:
1. as the value of the CSN field in the
SubmittedUpdateResultNotification response for a failed update;
2. for documents that are not replicated (e.g., if local state
information about the replication system is stored in a Blocks
datastore, the CSN values for each of those documents should be
0); and,
3. as the value of LastSeenCSN when requesting an entire zone in the
PullCommittedUpdates request.
CSN 1 is defined to be the first valid Commit Sequence Number, and is
used only for the case of a data item that lacks a current CSN
attribute (i.e., CSN value 1 is the value used as the default for
this IMPLIED attribute). The first CSN assigned by an ARS server in
response to a successfully committed update is 2. This definition is
specifically used to allow a datastore not previously replicated by
ARS to be replicated without requiring a special tool to add CSN's
(see the SubmitUpdate Processing section (Section 4.2.1.1)).
Instead, ARS interprets a missing CSN attribute as '1', in effect
treating all previous updates applied to a non-replicated datastore
as being rolled up into a starting state with CSN=1. From then on,
ARS assigns CSN's for successfully committed updates starting at CSN
value 2.
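These CSN conventions can be sketched as follows (the attribute-dict
representation and helper name are assumptions of this sketch): CSN 0
is invalid, a missing CSN attribute defaults to 1 (the rolled-up
pre-replication state), and the first CSN assigned to a committed
update is 2.

```python
FIRST_ASSIGNED_CSN = 2    # CSN 1 is reserved for the pre-replication state

def effective_csn(doc_attrs):
    """Return a document's CSN, treating a missing attribute as 1."""
    return int(doc_attrs.get("CSN", 1))

legacy_doc = {"Name": "blocks:test.schwartz"}               # never replicated
replicated_doc = {"Name": "blocks:test.schwartz", "CSN": "7"}
```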
When a zone is divided/delegated, the newly created zone initializes
its CSN to the highest CSN value from the zone from which it has
been delegated. Doing this (rather than restarting the counting
sequence) preserves the monotonicity of CSN's and avoids the need for
renumbering sequence numbers assigned to documents within the new
zone. The original zone also continues allocating CSN's from this
high-water mark CSN. Note that once a zone is delegated, the fact
that the original and new zone have the same CSN implies nothing
about the relative orderings of updates applied in each. ARS defines
no ordering of updates across zones.
3.1.3 DataWithOps Encoding
ARS requires all clients and servers to support the DataWithOps
encoding. DataWithOps is used by ARS servers that do not support the
ars-e sub-protocol. It is also used in cases where ars-e is
supported but has not been performed between a pair of ARS peers.
Each DataWithOps element contains zero or more DatumAndOp elements
describing a set of update operations to be performed, such that
either all operations succeed or all operations fail (per the ANTACID
semantics defined in [1]). Each DatumAndOp element contains a set of
attributes concerning the update to be performed and the content of
the document being updated. The attributes are:
Name: the URI of the document to be updated;
CSN: the CSN of the document to be updated;
Action: one of:
create: verifies that the documents do not exist in the datastore
before creating them;
write: creates or overwrites the documents in the datastore (the
default);
update: verifies that the documents exist in the datastore before
overwriting them; or,
delete: removes the documents from the datastore.
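The four actions and the all-or-nothing UpdateGroup semantics can be
sketched against a toy datastore (a dict keyed by document URI). This
is a hedged illustration, not the normative processing rules: it
models atomicity by validating every operation before applying any,
and it assumes delete requires the document to exist.

```python
def apply_update_group(datastore, ops):
    """ops: list of (action, name, content); all succeed or none do."""
    for action, name, _ in ops:                 # validation pass
        if action == "create" and name in datastore:
            raise ValueError("create: %s already exists" % name)
        if action in ("update", "delete") and name not in datastore:
            raise ValueError("%s: %s does not exist" % (action, name))
    for action, name, content in ops:           # apply pass
        if action == "delete":
            del datastore[name]
        else:                                   # create, write, update
            datastore[name] = content
    return datastore

store = {}
apply_update_group(store, [("create", "doc-a", "<a/>"),
                           ("write", "doc-b", "<b/>")])
apply_update_group(store, [("update", "doc-b", "<b2/>"),
                           ("delete", "doc-a", None)])
```

Note how "write" succeeds whether or not the document exists, which is
why committed updates are propagated with "write" rather than
"create" when update collapsing may have occurred.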
3.1.4 ARSError
The ARSError element provides an error-signaling structure for
exchanging ARS profile-specific errors, providing specific detail
beyond BEEP error handling. The ARSError element contains three
attributes:
1. OccurredAtSvrHost specifies the DNS name or IP address of the
server that flagged the error;
2. OccurredAtSvrPort specifies the port number of the server that
flagged the error; and,
3. OccurredAtSvrIncarn specifies the incarnation stamp of the server
that flagged the error.
This can provide useful information when an update propagates up
several hops in a DAG, with multiple choices at each hop.
The ARSError element contains three elements:
1. ARSErrorCode, which must be filled in with values as enumerated
below. ARSErrorCode 0 is defined to be invalid. It can be used
by a server implementation as a pre-initialized value to ensure a
valid code was received during later processing.
2. ARSErrorText, which must be filled in.
3. ARSErrorSpecificsText, which may be filled in to provide
additional detail. The error code enumeration below provides
recommendations of what additional information should be filled
in for the ARSErrorSpecificsText in cases where additional detail
is warranted.
Non-zero ARSErrorCode's use a positional structure encoded in
unsigned 32-bit numbers, as follows:
First digit:
1: client problem
2: server problem
Second digit:
1: service failure
2: service refusal
Third digit:
1: security
2: timeout
3: mis-configuration
4: too expensive
5: implementation-specific failure
6: data conflict
7: protocol/format error
8: request for unimplemented feature
9: resource overload
0: other
Fourth-Sixth digits: three-digit enumeration of errors. For
example, error code 114001 is a client problem that caused a
service failure because the request was too expensive.
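The positional encoding can be decoded mechanically.  The following
non-normative Python sketch (the helper and table names are ours)
splits a six-digit code into its fields:

```python
# Non-normative sketch: decode the positional ARSErrorCode structure.
# Field names follow the enumeration above.

WHO = {1: "client problem", 2: "server problem"}
KIND = {1: "service failure", 2: "service refusal"}
CAUSE = {1: "security", 2: "timeout", 3: "mis-configuration",
         4: "too expensive", 5: "implementation-specific failure",
         6: "data conflict", 7: "protocol/format error",
         8: "request for unimplemented feature",
         9: "resource overload", 0: "other"}

def decode_ars_error(code):
    """Split a six-digit ARSErrorCode into its positional fields."""
    assert 100000 <= code <= 299999, "codes are six-digit values"
    digits = str(code)
    return (WHO[int(digits[0])], KIND[int(digits[1])],
            CAUSE[int(digits[2])], int(digits[3:]))
```

For example, decoding 114001 yields the "client problem / service
failure / too expensive" reading given above, plus enumeration 1.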
The error codes listed below are referenced throughout this document.
These error codes cover more failure conditions than those
specifically mentioned in the protocol and algorithm discussions in
this document, such as disk space exhaustion. Moreover, a variety of
local implementation failures are possible (such as data validity
assertion failures built into the code), which also are represented
in the ARSError list below.
The currently defined ARSErrorCode's are:
116001: Attempt to delete non-existent document. [ErrorSpecificsText
should specify the non-existent document URI.]
116002: Attempt to update non-existent document. [ErrorSpecificsText
should specify the non-existent document URI.]
117001: Missing URI in document store request.
121001: Authentication failure.
121002: Access denied.
123001: Request was made to submission server that does not hold zone
being requested.
123002: Request was made to upstream server that does not hold zone
being requested.
123003: Attempt to update documents spanning zone boundaries within a
single UpdateGroup.
123004: Request to update data in unknown name space.
126001: Write-write conflict detected. [ErrorSpecificsText should
show ID of server that last updated document before this conflict
was detected.]
126002: Request violates datastore operation semantics.
[ErrorSpecificsText should specify more details.]
126003: General datastore error. [ErrorSpecificsText should specify
more details.]
127001: Malformed client-server ARS protocol transmission.
[ErrorSpecificsText should show XML parser error output.]
210001: Unable to propagate update submission to any upstream
servers. [ErrorSpecificsText should provide some details about
how many attempts were made, over how long of a duration.]
212001: Timeout at zone primary waiting for submitted update re-
ordering.
212002: Timeout while waiting for zone lock.
212003: Timeout while waiting for upstream server to propagate
update.
212004: Timeout while trying to respond to request.
213001: Content encoding from upstream server not understood.
[ErrorSpecificsText should name the encoding.]
213002: No appropriate content encoding was available for the
requested operation. [ErrorSpecificsText should name the server
where the problem occurred, and the operation for which no content
encoding could be found.]
213003: Malformed ARS protocol transmission (client or server).
[ErrorSpecificsText should describe parse error (note: used for
cases where the underlying service can't tell whether it's a
client-to-server or server-to-server ARS parsing error).]
213004: General parsing error. [ErrorSpecificsText should describe
parse error (note: used for cases where the underlying service
can't determine whether it's server-to-server parsing or config
file parsing).]
213005: BEEP connection attempt to remote ARS end point relayed on
behalf of current request failed. [ErrorSpecificsText should
describe more detail about the nature of the failure.]
219001: Server resource overload. [ErrorSpecificsText should contain
detail about what resource(s) overloaded.]
223001: No content encodings available for full zone transfer.
223002: PropagateSubmittedUpdate request received from server not
configured as a downstream server.
223003: PushCommittedUpdates request received from server not
configured as an upstream server.
223004: PullCommittedUpdates request received from server not
configured as a downstream server.
223005: Requested ARS sub-protocol not supported.
223006: Update submission received at non-primary that does not
support ars-s.
225001: Implementation-specific failure. [ErrorSpecificsText should
provide detail.]
226001: Duplicate update submission detected -- could be a server
retransmitting after update submission has already been
successfully received, or a server configuration loop.
226002: Request for CSN before log truncation point. Full zone
transfer should be requested. [ErrorSpecificsText should show CSN
& truncation point.]
227001: Malformed server-server ARS protocol transmission.
[ErrorSpecificsText should show XML parser error output.]
3.2 ARS Message Semantics
ARS consists of three sub-protocols, only the first of which must be
implemented by all ARS servers: the Commit-and-Propagate Protocol
(ars-c), the Submission-Propagation Protocol (ars-s), and the
Encoding Negotiation Protocol (ars-e). The protocol syntax for a
server supporting any subset of these protocols is defined by a DTD
whose contents are constructed based on the top-level definition and
inclusion content (Appendix D). Here the operations to be supported
are defined in the "ARSREQUESTS" ENTITY, the DTD's for the supported
sub-protocol(s) is/are included, and, if ars-e is not supported, the
"UpdateGroup" ELEMENT is set to define the single required default
encoding for all ARS servers (DataWithOps).
The "ARSRequest" element contains a "ReqNum" attribute and one of a
subset of the following elements, the subset being defined by the
"ARSREQUESTS" ENTITY: a "SubmitUpdate" element, a
"SubmittedUpdateResultNotification" element, a "PushCommittedUpdates"
element, a "PullCommittedUpdates" element, a
"ContentEncodingNegotiation" element, and a
"PropagateSubmittedUpdate" element.
The "ReqNum" attribute (an integer in the range 1..4294967295) is
used to correlate "ARSRequest" elements sent by a BEEP peer acting in
the client role with the "ARSResponse" elements sent by a BEEP peer
acting in the server role. Request number 0 is defined to be
invalid, and thus can be used by the server implementation as a pre-
initialized value to ensure a request was received during later
processing.
The semantics of each of the elements within the ARSRequest are
defined in the following subsections.
3.2.1 ARS Commit-and-Propagate Protocol (ars-c)
ars-c defines four request elements: SubmitUpdate,
SubmittedUpdateResultNotification, PushCommittedUpdates, and
PullCommittedUpdates. For the time being we assume clients submit to
the primary server; submissions to non-primary servers are discussed
later (Section 3.3). For the time being we also assume that all
submitted and committed updates are transmitted between all ARS peers
using the DataWithOps encoding.  Support for other encodings is
discussed later (Section 3.4).
3.2.1.1 SubmitUpdate
Clients submit groups of documents and their associated operation
names to be performed in an ANTACID (see [1]) fashion using the
SubmitUpdate request.
The SubmitUpdate element contains three optional attributes:
NotifyHost specifies the DNS name or IP address to which asynchronous
notification is to be sent after the commit fails or succeeds;
NotifyPort specifies the port number for asynchronous notification;
and,
NotifyOkOnCurrentChannel specifies whether it is acceptable for the
server to send notification on the same channel that was used for
submitting the update, if that channel is still open at the time
the notification is ready to be sent. This flag allows the server
to avoid the overhead of opening a new BEEP channel for updates
that commit relatively quickly. The flag is needed because it is
possible that the submission arrives on a different host and port
than that specified by NotifyHost and NotifyPort, and different
applications may or may not want to allow notification to arrive
on the original submission channel. The default (IMPLIED) value
is "no", meaning that the server must open a new channel for
notification.
If a NotifyHost is specified then a NotifyPort must also be included.
If only one of these attributes is included the update must be
rejected with an ARSError containing ARSErrorCode=127001. The peer
that receives notification may differ from the original submitting
client, for example allowing a mobile client to perform update
submissions and an always-connected server to receive the
SubmittedUpdateResultNotification and convert it to an email message
for the user to pick up later.
If NotifyOkOnCurrentChannel='yes', then NotifyHost and NotifyPort
must also be specified. If NotifyOkOnCurrentChannel='yes' and
NotifyHost or NotifyPort is not specified, the update must be
rejected with an ARSError containing ARSErrorCode=127001. The
semantics when NotifyOkOnCurrentChannel='yes' are:
o If the submission channel is still open at the time notification
is ready to be sent, the server may send notification on the
submission channel or it may open a channel to the specified
NotifyHost and NotifyPort and send the notification on that
channel.
o If the submission channel is no longer open at the time
notification is ready to be sent, the server must open a channel
to the specified NotifyHost and NotifyPort and send the
notification on that channel. The server must not attempt to
determine the host and port for the original channel and open a
new connection to that host and port.
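The validity rules above -- NotifyHost and NotifyPort must be given
together, and NotifyOkOnCurrentChannel='yes' requires both -- can be
sketched non-normatively in Python (the function name is ours):

```python
# Non-normative sketch of the SubmitUpdate notification-attribute
# validity rules.  Returns None if valid, else the ARSErrorCode with
# which the submission must be rejected (always 127001 here).

def validate_notify_attrs(notify_host=None, notify_port=None,
                          notify_ok_on_current_channel="no"):
    # NotifyHost and NotifyPort must appear together or not at all.
    if (notify_host is None) != (notify_port is None):
        return 127001
    # NotifyOkOnCurrentChannel='yes' requires both to be present.
    if notify_ok_on_current_channel == "yes" and notify_host is None:
        return 127001
    return None
```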
The SubmitUpdate element also contains an UpdateGroup element. The
UpdateGroup contains one or more DataWithOps elements, structured as
noted in the DataWithOps Encoding section (Section 3.1.3). Although
the DTD allows for zero or more DataWithOps, if zero elements are
included in a SubmitUpdate request the update must be rejected with
an ARSError containing ARSErrorCode=127001. (The case of zero
elements is used elsewhere in the protocol.)
The response to a failed SubmitUpdate request contains an ARSError
describing the failure.  For example, an update submission requesting
deletion of a non-existent document might receive a response such as
the following:
   <ARSError>
       <ARSErrorCode>126002</ARSErrorCode>
       <ARSErrorText>Request violates datastore operation
           semantics</ARSErrorText>
       <ARSErrorSpecificsText>Request #1 [BlockNameAndStoreOp:
           name=test.schwartz.blk01, StoreOp=delete]
           failed</ARSErrorSpecificsText>
   </ARSError>
The response to a successful SubmitUpdate request contains an
ARSAnswer element, which in turn contains a GlobalSubmitID element.
The GlobalSubmitID contains four attributes:
SubmisSvrHost specifies the DNS name of the submission server (note
that unlike some other parts of ARS, the GlobalSubmitID allows
only DNS names, not IP addresses, in the host component);
SubmisSvrPort specifies the port number of the submission server;
SubmisSvrIncarn specifies the incarnation stamp of the submission
server; and,
SSN specifies the Submit Sequence Number assigned to this update
submission.
A success response to a SubmitUpdate request means that the server
has accepted the update and will begin processing it at some time in
the future. If a client wishes to be informed of success/failure of
the update commit operation it may request asynchronous notification,
as noted earlier.
3.2.1.2 SubmittedUpdateResultNotification
The SubmittedUpdateResultNotification element is used to notify the
client of success/failure of its submitted update. A
SubmittedUpdateResultNotification is sent to the client when this
status becomes known at the submission server (as opposed to when the
update has committed at the primary).
A SubmittedUpdateResultNotification for a successfully committed
update contains six attributes:
SubmisSvrHost specifies the DNS name of the submission server;
SubmisSvrPort specifies the port number of the submission server;
SubmisSvrIncarn specifies the incarnation stamp of the submission
server;
SSN specifies the Submit Sequence Number assigned to this update
submission;
CSN specifies the Commit Sequence Number that was assigned by the
primary for this update; and,
ZoneTopNodeName specifies the URI [5] of the top node in the zone
within which this update occurred.
A SubmittedUpdateResultNotification for an update that failed
contains the same six attributes above, except that the CSN number is
set to 0. In addition, the SubmittedUpdateResultNotification for a
failed update contains a single ARSError element describing the error
that occurred.
3.2.1.3 PushCommittedUpdates
A PushCommittedUpdates request is made from an upstream server to a
downstream server to suggest that the downstream server perform a
PullCommittedUpdates request from the upstream server. It provides a
means of propagating updates quickly without the downstream servers'
needing to poll the upstream server.
The PushCommittedUpdates element contains two attributes:
UpstreamHost specifies the DNS name or IP address of the upstream
server making the request; and,
UpstreamPort specifies the port number of the upstream server making
the request.
3.2.1.4 PullCommittedUpdates
The PullCommittedUpdates request specifies one or more ReplState
elements, corresponding to the zones that the downstream server
replicates from the upstream server and for which it is making the
PullCommittedUpdates request.  Each ReplState element contains two
elements:
TopNodeOfZoneToReplicate specifies the top node in the name tree for
the current zone being replicated; and,
LastSeenCSN specifies the last CSN the downstream server has seen.
The semantics are that the upstream server is to send committed
update content (discussed shortly) for each operation that has
occurred since that CSN (i.e., not including that CSN), optionally
using the collapsing notion defined in [1]. A request specifying
LastSeenCSN='0' indicates that the entire zone is to be
transferred.
The response to a failed PullCommittedUpdates request contains a
ARSError describing the failure.
The response to a successful PullCommittedUpdates request contains
zero or more UpdateGroup's:
o zero UpdateGroup's are sent in the case of a PullCommittedUpdates
request to an upstream server that has committed no new updates
since the last PullCommittedUpdates request performed by the
downstream server.
o one UpdateGroup is sent if a single UpdateGroup has been committed
by the upstream since the specified CSN.
o multiple UpdateGroup's are sent if multiple UpdateGroup's have been
committed by the upstream since the specified CSN.
Each UpdateGroup contains a set of committed updates, encoded in the
default DataWithOps encoding.
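A non-normative sketch of the upstream server's selection of
committed updates for one ReplState, assuming a per-zone log of
(CSN, operation) pairs in commit order (the log representation and
function name are ours):

```python
# Non-normative sketch: committed updates returned for one ReplState.

def updates_since(log, last_seen_csn):
    """Return committed updates with CSN strictly greater than
    LastSeenCSN; LastSeenCSN=0 therefore yields the entire zone."""
    return [(csn, op) for (csn, op) in log if csn > last_seen_csn]

log = [(2, "write a"), (3, "write b"), (4, "delete a")]
assert updates_since(log, 3) == [(4, "delete a")]
assert updates_since(log, 0) == log       # full zone transfer
```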
3.3 ARS Submission-Propagation Protocol (ars-s)
If the primary and a non-primary server both support ars-s, updates
may also be submitted to the non-primary server.
ars-s adds two new protocol requests to those defined by ars-c:
1. PropagateSubmittedUpdate, which is used by a non-primary to
forward an update submission up the replication Directed Acyclic
Graph (DAG) towards the primary; and,
2. SubmittedUpdateResultNotification (which is used by ars-s for
client notification) is used in an additional way, namely, to
provide asynchronous success/failure notification to a downstream
server of a request it had earlier submitted.
3.3.1 PropagateSubmittedUpdate
The PropagateSubmittedUpdate element contains six attributes:
SubmisSvrHost specifies the DNS name of the submission server;
SubmisSvrPort specifies the port number of the submission server;
SubmisSvrIncarn specifies the incarnation stamp of the submission
server;
SSN specifies the Submit Sequence Number assigned by the submission
server for this update submission;
NotifyHost specifies the DNS name or IP address to which asynchronous
notification is to be sent after the commit fails or succeeds;
and,
NotifyPort specifies the port number for asynchronous notification.
The PropagateSubmittedUpdate element also contains one of two
possible elements:
1. UpdateGroup containing the submitted update content, which
contains one or more DataWithOps elements.
2. FailedUpdateSubmission, which is used to indicate that all
attempts to perform a PropagateSubmittedUpdate request to
upstream servers have failed (after timing out/retrying a
configurable number of times).  A FailedUpdateSubmission can also
be generated by an administrative tool run to fail updates that
were submitted to a server that was not brought down cleanly, in
violation of the Client-Server Promise (see [1]).
3.3.2 SubmittedUpdateResultNotification Extended Semantics
Unlike its use in ars-c, with ars-s the notification destination
(host/IP and port) is required, so that downstream servers always
receive notification of update results. Note also that the
GlobalSubmitID contained in a SubmittedUpdateResultNotification
always specifies the globally unique identifier for the submission
server (including the unique SSN it generated), which should be used
by each server along the submission path as a key into a local state
table of in-progress update submissions (e.g., to find where to
propagate the response back down to the previous server on the
submission path).
As in the ars-c case, the SubmittedUpdateResultNotification includes
the ARSError if an error occurred, or the CSN that was assigned by
the primary for the given SSN if no error occurred.
3.4 ARS Encoding Negotiation Protocol (ars-e)
The ContentEncodingNegotiation element is optionally initiated by an
ARS peer that wishes to determine if an expanded set of encodings is
available beyond the default DataWithOps encoding.
The currently defined encodings and procedures for registering new
encodings are provided in an appendix (Appendix I.1).
The ContentEncodingNegotiation element contains a ZoneTopNodeName
attribute specifying the URI of the top node in the zone to which
this encoding is to apply, and one or more ContentEncodingName
elements corresponding to content encodings the initiator supports.
Each ContentEncodingName element contains an NMTOKEN specifying the
name of a defined encoding (such as "DataWithOps").
The responder sends back the subset of the requested encodings that
it supports.
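The responder's computation is a simple order-preserving
intersection; a non-normative sketch follows (the encoding name
"SomeDiffEncoding" is hypothetical, used only to show an unsupported
encoding being dropped):

```python
# Non-normative sketch of the responder side of
# ContentEncodingNegotiation: keep the initiator's encodings that this
# server also supports, preserving the initiator's order.

def negotiate(offered, supported):
    return [name for name in offered if name in supported]

# "SomeDiffEncoding" is a hypothetical encoding name.
assert negotiate(["DataWithOps", "SomeDiffEncoding"],
                 {"DataWithOps"}) == ["DataWithOps"]
```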
After the ContentEncodingNegotiation has completed, each ARS peer may
cache the list of ContentEncodingName's supported by the given peer
and for the given zone, for the duration of the ARS channel's
lifetime. Given a list of supported ContentEncodingName's, each ARS
peer may select an appropriate encoding in future message exchanges.
If no ContentEncodingNegotiation has taken place before an operation,
the DataWithOps encoding must be used. See Current Encodings section
(Appendix I.1) about cases where the DataWithOps may fail to meet the
needs of the current transmission.
4. Algorithms and Implementation Details
Below we discuss the basic state management needed to implement an
ARS server. We then discuss algorithm and implementation details for
each of the three sub-protocols.
4.1 ARS Meta-Data Management
A variety of meta-data must be managed to implement an ARS service.
This section discusses possible implementation approaches for
managing this meta-data.
4.1.1 Document State
ARS requires two pieces of meta-data to be associated with each
document: the name of the document and its current CSN. It is a
local implementation matter how these meta-data are stored. One
approach would be to store these meta-data in the datastore itself,
as attributes in the root element of each document. Another approach
would be to maintain a separate repository mapping document name to
the pair (physical address for document, CSN), where the physical
address might be a disk block address or a database row ID. This
approach is similar to how a UNIX file system uses a directory file
to map from hierarchical name to flat (inode) name plus protection
attributes.
4.1.2 Committed Update State Management
To be able to respond to PullCommittedUpdates requests, an ARS server
needs to track the set of operations that have committed on each
document, and the corresponding CSN's. Some type of index is needed
to locate all operations and these associated meta-data for which the
CSN is larger than a given CSN. It is a local implementation matter
how these meta-data are to be managed. We discuss two possibilities
here.
Similar to the case noted in the previous section, one approach would
be to store the meta-data as attributes in the datastore itself, in
each document. An additional complication with doing this for
managing committed update state is that there must be a way to track
deleted documents (so that "delete" operations can be returned in
response to a PullCommittedUpdates request after a delete has
committed at the ARS server). To do this, at "delete" time the
datastore could use another root element attribute to mark documents
as deleted, rather than physically removing them from the datastore.
Additionally, the datastore will need to provide a way for each
profile (SEP, ARS, etc.) that uses the datastore to choose whether to
retrieve deleted documents. For example, it must be possible to
service SEP queries such that deleted documents are not returned, but
it must be possible for ARS to retrieve the document name and
"delete" operations that have committed since a given CSN.
In addition to the above complication, there is a potential
performance problem with tracking deleted documents in the datastore.
Specifically, the "greater than CSN" lookup needs to be very
efficient, potentially retrieving millions of results. If the
underlying datastore supports only a text-based index (e.g., designed
primarily to support SEP textual queries), "greater than" queries
will probably be slow. In this case, it would be preferable to
implement a more specialized indexing structure to track committed
updates. That leads to the second approach, namely, tracking
committed updates in some type of log. The log could be implemented
as a flat file with a corresponding numeric index, or perhaps in a
relational database table.
If a log implementation is chosen, a local implementation decision
needs to be made about how far back in history to keep update logs.
Generally speaking, the larger the content held by an ARS server and
the more expensive the network links, the longer back in history the
server should retain logs.  Note also that systems supporting mobile
clients should provision for more log data to be kept, to accommodate
more clients, longer-running transactions, etc.
4.1.3 Committed Update Collapsing
Regardless of whether committed update state is tracked inside the
datastore or in an auxiliary log, ARS servers may choose to implement
"collapsing" updates as defined in [1]. Doing so could yield
significant savings in network transmissions as well as space
required for committed update state. For the sake of simplicity
below we describe only how to implement update collapsing assuming a
log-based implementation of committed update state management.
To implement update collapsing, the ARS server does as follows:
o The log stores a single entry per document that has been updated.
o If the most recent operation on the document was 'create',
'update' or 'write', the operation saved with this entry is
'write' (which is the datastore operation that overwrites the
document if it exists and creates it otherwise).
o If the most recent operation on the document was 'delete', the
operation saved with this entry is 'delete'.
o When transmitting a collapsed update, the upstream server sends a
null operation for each sequence number that has been elided by
the above create/update/write/delete substitution algorithm, so
that a complete up-counting sequence is transmitted to the
downstream server, allowing it to detect missing operations in an
up-counting set of sequence numbers.
In this fashion, for example, the following update sequence run at
the primary:
o write blk2 csn=1
o write blk2 csn=2
o delete blk2 csn=3
o create blk2 csn=4
o update blk2 csn=5
will be "played back" for the downstream server that requests all
operations since csn=1 as:
o noop csn=1
o noop csn=2
o noop csn=3
o noop csn=4
o write blk2 csn=5
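The collapsing steps above can be sketched non-normatively in
Python, assuming the log is a list of (CSN, operation, document)
triples in commit order (the representation and function name are
ours):

```python
# Non-normative sketch of committed-update collapsing.

def collapse(log, since_csn):
    """Return the collapsed playback for CSN's strictly after
    since_csn: one real operation per document (its most recent),
    'write' substituted for create/update/write, noops for the
    elided CSN's."""
    pending = [e for e in log if e[0] > since_csn]
    # The most recent CSN per document; later entries overwrite earlier.
    last_for_doc = {doc: csn for (csn, op, doc) in pending}
    playback = []
    for csn, op, doc in pending:
        if last_for_doc[doc] != csn:
            playback.append((csn, "noop", None))      # elided operation
        elif op == "delete":
            playback.append((csn, "delete", doc))
        else:                                         # create/update/write
            playback.append((csn, "write", doc))
    return playback

log = [(1, "write", "blk2"), (2, "write", "blk2"), (3, "delete", "blk2"),
       (4, "create", "blk2"), (5, "update", "blk2")]
# collapse(log, 0) reproduces the playback shown above:
# four noops, then a single write at csn=5.
```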
ARS servers are not required to perform update collapsing when
responding to a PullCommittedUpdates request. However, ARS servers
must be prepared to process PullCommittedUpdates responses that have
been collapsed. Specifically:
o The collapsing algorithm requires that downstream servers ignore
datastore "non-existent document deletion" failures, since, for
example, the sequence:
* create blk2 csn=4
* write blk2 csn=5
* update blk2 csn=6
* delete blk2 csn=7
will be replayed as:
* noop csn=4
* noop csn=5
* noop csn=6
* delete blk2 csn=7
but the downstream server does not have a blk2 at the time this
delete is performed (because it was created and deleted after the
last PullCommittedUpdates request the downstream server handled).
Note that datastore invalid document deletions ARE correctly
detected at the primary, and hence ARS does not negatively impact
datastore semantics for replicated datastores. It is only the
downstream servers that ignore document deletion failures.
o Upstream servers must transmit a complete up-counting sequence of
CSN's, starting with the "last seen" CSN. Downstream servers must
check that they always receive a complete up-counting sequence of
CSN's, starting with the "last seen" CSN.
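The downstream checks can be sketched non-normatively: verify a
complete up-counting CSN sequence and tolerate deletes of documents
that collapsing elided.  Here playback is assumed to be a list of
(CSN, operation, document) triples whose sequence starts just after
the last-seen CSN; the function name and placeholder content are
ours:

```python
# Non-normative sketch of downstream processing of a (possibly
# collapsed) PullCommittedUpdates playback.

def apply_playback(datastore, playback, last_seen_csn):
    expected = last_seen_csn + 1
    for csn, op, doc in playback:
        if csn != expected:                   # must be a complete sequence
            raise ValueError("gap in CSN sequence at %d" % csn)
        expected += 1
        if op == "delete":
            datastore.pop(doc, None)          # ignore delete of missing doc
        elif op == "write":
            datastore[doc] = "content@%d" % csn   # placeholder content
        # 'noop' entries only advance the sequence check
    return expected - 1                       # new last-seen CSN

ds = {}
new_csn = apply_playback(
    ds, [(4, "noop", None), (5, "noop", None),
         (6, "noop", None), (7, "delete", "blk2")], 3)
# blk2 never existed downstream; the delete is silently ignored.
```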
4.1.4 Per Server Sequence Number State
Sequence numbers are tracked as follows.  For each zone it handles,
an ARS server tracks the last SSN it has assigned for that zone.  In
addition,
o if it is a non-primary for the zone it tracks the last CSN it has
seen for the zone; or,
o if it is a primary for the zone it tracks:
* The last assigned CSN for the zone; and,
* The last seen SSN for each submission server for the zone.
If a site's ARS service is implemented by multiple physical servers
(all identified by a single DNS name at the site), those servers must
coordinate assignment of sequence numbers among each other to meet
the uniqueness requirement, for example by retrieving the SSN from a
shared backend database.
Note that per-server sequence number state need not be saved in the
datastore; in fact, for the sake of efficiency it should be saved to
a lighter-weight storage system such as flat files.  (The datastore
implements ACID semantics, which is overkill for managing individual
data items.)
4.1.5 Locking
A zone-wide lock is obtained in the process of committing an update.
This is accomplished by invoking 'lock' and 'release' primitives at
the top-level node for the zone before and after (respectively)
performing the individual document writes.  These primitives
implement the following semantics:
lock: specifies the URI of the document defining a subset of a zone
to which the requesting user instance is requesting exclusive
write access. The zone subset to be locked consists of the named
document and all documents beneath that document in the subtree,
down to but not including any zone delegation cut points in the
subtree. (If there are no zone delegation points, the zone subset
consists of the entire subtree under the specified node, down to
and including the leaves.) A lock must be performed successfully
before any document writes may be performed. While a zone subset
is locked, no other user instance may lock or write documents
successfully within the zone subset, and any document write
operations are journaled until a subsequent release operation.
release: specifies whether to commit or rollback any journaled
document write operations. All document write operations
performed while a zone subset is locked have atomic update
semantics -- either they all succeed or they all fail. If they
all succeed, they must all become visible to other clients of the
local datastore atomically.
Note: for performance reasons it may be preferable to implement a
more optimistic concurrency control technique so that write
operations from multiple updates can be overlapped and conflicts
cause rollback/replay. For simplicity we talk about zone-wide
locking in the current document.
If the ARS implementation is threaded, additional synchronization is
required, because datastore lock semantics disallow a single process
from locking nested subtrees (e.g., locking "a.b.c" when "a.b" is
already locked).  Threaded implementations therefore need to maintain
a table of threads currently holding or waiting for a lock, listing
the thread identifier and the tree node locked / to be locked. When
a new lock request is to be performed, this table needs to be checked
to see if any other threads currently hold locks on tree nodes above
or below the current request in the tree, and if so to create a queue
of such requests. When a lock is released, this table again needs to
be checked to see if any threads are currently waiting for a lock
that may now be allowed to issue the datastore lock request.
Finally, appropriate synchronization is needed around accesses to the
above table.
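The lock-table conflict test itself can be sketched non-normatively:
two requests conflict when one node is an ancestor of (or equal to)
the other in the dotted name hierarchy (the function name is ours):

```python
# Non-normative sketch of the thread lock-table conflict test.

def conflicts(node_a, node_b):
    """True if locking node_a and node_b would nest or collide,
    e.g. 'a.b' vs 'a.b.c', or 'a.b' vs itself."""
    return (node_a == node_b
            or node_b.startswith(node_a + ".")
            or node_a.startswith(node_b + "."))
```

A new lock request would be queued whenever conflicts() is true
against any entry already in the table of held or pending locks.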
4.1.6 Server Configuration Data
4.1.6.1 Replication Topology (Normative)
Appendix H provides the DTD for configuring the replication topology
of an ARS server.  While the storage management mechanism for this
configuration data (local file, database table, etc.) is a local
implementation matter, the document structure is defined here for two
reasons:
o This syntax provides a standard format for exporting server
information, which may be used to support server location (e.g.,
through export to DNS service location (SRV) records); and
o Having a uniform syntax allows for easier discussion, e.g., in
outside documents or email discussions.
Each server specifies the set of zones it handles, whether it is
primary for each zone, and the immediate upstream and downstream
servers for each zone it serves. The configuration data also
specifies the frequency of PushCommittedUpdates and
PullCommittedUpdates requests, as well as preferences for the order
that servers are to be contacted when propagating submitted updates.
As an example, the following is the configuration data for a primary
server running on host s1.example.com and port 5682, which replicates
content in the Blocks name space:
This is the primary server for the global name tree root, delegating
at cut points "doc.rfc" and "doc.edgar". It is replicated by two
downstream servers, running on s2.example.com and s3.example.com. It
pushes updates to those servers every 10 minutes.
Here is a configuration file for a non-primary server running on host
s2.example.com and port 5682, which also replicates content in the
Blocks name space:
<ZoneTopNode Name='blocks:.'/>
<TopNodeOfZoneToReplicate Name='blocks:.'/>
This server replicates the "." zone from two upstream servers
(s1.example.com and s6.example.com). It does not schedule any
periodic update pull requests from the upstream servers, because in
this set of servers only pushes are scheduled. The server specifies
preference weights for each upstream server, used to determine the
order that the upstream servers are tried when attempting to
propagate update submissions. Finally, this server is replicated by
two downstream servers, running on s4.example.com and s5.example.com,
respectively.
4.1.6.2 Local Implementation Settings (Non-Normative)
In addition to replica topology information, ARS servers will also
need various local configuration data. What follows is not part of
the normative specification for ARS, but rather is included to
provide a concrete example to implementors, based on the author's
server implementation. The author's ARS implementation has the
following local configuration data:
HomeDirectory: Root directory under which data, logs, and
configuration information are stored.
ValidateARSMessages: Whether to validate ARS protocol messages
against DTD. Note that this setting can adversely affect server
performance.
DetectWriteWriteConflicts: Whether to detect write-write conflicts.
This setting only matters at the zone primary. The specification
requires this check to be enabled, but the option to turn it off is
included to allow experimentation (since the check adds overhead)
and easier testing (since otherwise the client must present the
correct CSN before sending an update).
OutOfOrderTimeoutInSecs: How long to wait (in seconds) for out-of-
order update submissions while earlier submissions are propagated
before timing out the update for the current attempt period.
LockWaitTimeoutInSecs: Number of seconds to allow update submissions
to wait for the real subtree lock while trying to apply a
committed update before timing out.
SingleARSRequestAttemptTimeoutInSecs: How long to wait (in seconds)
for PropagateSubmittedUpdate requests to complete before timing
out the update for the current attempt period.
ServiceFailedTransmitRetryPeriodInSecs: How long to wait (in
seconds) after a retryable request fails due to service failure
before retrying.
ServiceFailedTransmitMaxAttempts: Number of times to retry a service
failed request before giving up and reporting failure to the client.
LogicallyIndentBlocks: If true, logical indentation is added to XML
document start and end elements (not the character content) as they
are written out; otherwise they are left-margin aligned. Note that
this setting is only meaningful for XML documents that are parsed.
CacheSeqNumBlocks: Enable/disable SeqNumBlock caching.
ARSDTDFileName: Location of ARS DTD.
ARSContentEncodingsFileName: File in which to find the ARS content
encodings DTD. This file can be edited locally to add new (non-
standardized) content encodings, and is included here so that the
ARS runtime can validate content encodings if ValidateARSMessages
is enabled.
4.2 Protocol Processing
An ARS server must implement ars-c, and may implement one or both of
ars-s and ars-e. As part of the required protocol handling support,
ARS servers must reject requests for a non-supported sub-protocol
with an ARSError containing ARSErrorCode=223005.
4.2.1 ARS Commit-and-Propagate Protocol (ars-c)
4.2.1.1 SubmitUpdate Processing
Upon receipt of a SubmitUpdate request, an ARS server performs the
following steps:
1. If a non-primary ARS server that does not support ars-s receives
an update submission (either via a SubmitUpdate or
PropagateSubmittedUpdate request), it must reject the request by
responding with an ARSError containing ARSErrorCode=223006.
2. If an ARS server receives an update submission specifying an
unsupported name space it must reject the request by responding
with an ARSError containing ARSErrorCode=123004.
3. The server parses the DataWithOps encoding, saves the enclosed
documents and their associated CSN's and update operations to
temporary stable storage (using temporary identifiers guaranteed
not to clash with other concurrently arriving updates), and
performs the following checks:
* access control denial;
* update request to a document in a zone not served by the
current ARS server; and,
* update request that spans more than one zone.
Note that the temporary document copies need not be saved in the
datastore, and in fact for the sake of efficiency should be saved
to a lighter weight storage system such as a journaling file
system. (The datastore implements ACID semantics, which is
overkill for managing temporary data.)
4. If no failures occurred during the above checks, the server
allocates a new GlobalSubmitID for the UpdateGroup, for the zone
within which the submission falls.
5. At this point the server responds to the client either with an
ARSError describing the error that occurred or an ARSAnswer
containing the GlobalSubmitID to indicate that the submission has
been successfully received. It also saves the
OptionalNotificationDest information provided (if any), for use
in asynchronous notification once the update has completed.
6. Now that the update has been completely received, the server
enqueues it for commit processing. The server processes elements
in this queue one at a time, as follows:
* If the local implementation uses log-based committed update
state management (Section 4.1.2), create a temporary list into
which document names, operation names and CSN's can be stored.
* Acquire a zone-wide lock, setting an implementation-specified
timeout period that will result in an ARSError containing
ARSErrorCode=212002 being sent to the client if the lock is
not acquired before the timeout expires. Note that this
coarse-grained locking is required to implement zone-wide
serialization, and can become a source of contention if the
operations performed while locking are not implemented
efficiently.
* Allocate a new CSN for this zone.
* Loop on all DatumAndOp elements within the UpdateGroup and
perform the following steps in the order the operations occur
in the UpdateGroup:
+ load data and operation from saved temporary state.
+ If no 'csn' attribute is currently set in the document,
treat that document as having CSN=1. Doing this allows a
datastore not previously replicated by ARS to be replicated
without running a special tool to add CSN's.
+ At this point the local implementation may perform write-
write conflict detection by comparing the value of the
'CSN' attribute contained in the DatumAndOp against the
corresponding value stored in the primary's local
datastore. If any of these values differ, the
implementation may reject the update by responding with a
ARSError containing ARSErrorCode=126001.
+ Update the 'csn' attribute in each document per the
assigned CSN, and then perform the needed datastore
operation, trapping any errors/exceptions that arise.
(Note that datastore lock/release semantics do not make the
operation visible until the corresponding release occurs).
+ If the local implementation uses log-based committed update
state management (Section 4.1.2), save the document name,
datastore operation name, and CSN in the temporary list.
+ Continue to the next operation.
* If an error/exception arises during the above loop:
+ Release the zone-wide lock, requesting that all contained
updates be aborted.
+ If the local implementation uses log-based committed update
state management (Section 4.1.2), discard the temporary
document name/operation list.
+ Reset the CSN counter so that this CSN will be used for the
next commit attempt. (Each CSN must represent a successful
update.)
+ Generate an ARSError to be transmitted to the client (if
notification was requested).
* Else, if the local implementation uses log-based committed
update state management (Section 4.1.2), append the temporary
document name/operation list to the log of all operations
performed on the datastore (which is used by the
PullCommittedUpdates request; details about this log are
discussed later (Section 4.1.3)). This log is only written
when the zone lock is held, and therefore the log will be
serialized in the same update order as applied to the local
datastore.
* Finally, release the zone-wide lock, requesting that all
contained updates be committed.
7. Upon completion (either successful or not), if notification was
requested the server performs a SubmittedUpdateResultNotification
operation (discussed below).
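As a non-normative illustration, the commit processing of step 6 can
be sketched as follows. The dictionary-based zone, datastore, and
operation log are stand-ins invented for this example; zone locking,
timeouts, and client notification are elided:

```python
class ConflictError(Exception):
    """Write-write conflict (reported as ARSErrorCode=126001)."""

def commit_update_group(zone, ops, datastore, oplog):
    """Commit one queued UpdateGroup atomically.  `ops` is a list of
    (doc_name, op_name, submitted_csn); `datastore` maps doc name to
    its stored CSN; `zone` holds the zone's CSN counter.  Returns
    True on commit, False on rollback."""
    csn = zone["next_csn"]            # allocate a new CSN for this zone
    staged = {}                       # journaled writes: invisible until release
    pending_log = []                  # temporary list for log-based state mgmt
    try:
        for name, op, submitted_csn in ops:
            stored = datastore.get(name, 1)  # docs without a 'csn' act as CSN=1
            if submitted_csn != stored:      # write-write conflict detection
                raise ConflictError(name)
            staged[name] = csn               # stamp the assigned CSN
            pending_log.append((name, op, csn))
    except ConflictError:
        return False                  # abort: discard staged writes; CSN reused
    datastore.update(staged)          # release: all writes visible atomically
    oplog.append(pending_log)         # appended only while the zone lock is held
    zone["next_csn"] = csn + 1        # each CSN represents a successful update
    return True
```

A failed commit leaves the datastore, the log, and the CSN counter
untouched, so the same CSN is used for the next commit attempt.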
Note: as an optimization for step 3 above, incoming documents can be
written directly to the datastore (rather than saving first to
temporary storage), and the update simply aborted if an error is
detected. However, we recommend against this approach because:
o It would require holding the zone-wide lock potentially an
arbitrarily long time, for example while a large update is
submitted across a congested link. By saving to temporary storage
first a bound is placed on how long the zone lock is held.
o This approach will not work if the ars-c implementation is
extended to support ars-s, since ars-s needs to propagate
submissions upstream before they are committed.
4.2.1.2 SubmittedUpdateResultNotification Processing
SubmittedUpdateResultNotification must be implemented as a timeout-
and-retry style of operation, so that if the client is temporarily
unreachable the server will retry over a period of time. The number
of retries and the timeout period are determined by the local
implementation.
For failed updates, the SubmittedUpdateResultNotification contains
the ARSError that occurred and the GlobalServerID where the error
occurred.
For successful updates, the SubmittedUpdateResultNotification
contains an empty ARSError element, as well as the CSN that was
assigned by the primary for the given SSN.
4.2.1.3 PushCommittedUpdates Processing
The processing of PushCommittedUpdates requests is implementation-
dependent. The downstream server may ignore the request, or may use
it to schedule (Section 4.2.2.4) a PullCommittedUpdates request.
4.2.1.4 PullCommittedUpdates Processing
Downstream servers must synchronize PullCommittedUpdates requests so
that at most one request/response is in progress for a given zone at
any time.
If a server maintains committed update state in a log (Section 4.1.2)
and a request is received for updates further back in history than
are stored in that log, the upstream server responds with an ARSError
containing ARSErrorCode=226002. In response the downstream server
may either re-issue the same request at a different ARS server, or
(if both servers support an appropriate ars-e encoding (Section 3.4))
request a full zone transfer. Note in particular that if a server
performs committed update log truncation it will be unable to support
new ARS replicas' requests to join the replication network (since
they will need to perform a request for all updates since CSN 0)
unless both servers also support an appropriate ars-e encoding. As a
consequence, an implementation that does not support ars-e and that
wishes to allow new replicas to join over time must not perform
committed update log truncation.
The upstream server must lock the requested zone while processing a
PullCommittedUpdates request, so that the underlying datastore
contents are not changed while the content is being sent (which could
result in inconsistent content being transmitted to the downstream
server). Since this may cause the zone to be locked for a long time,
an alternative implementation would be to lock the zone, make a copy
of the documents to be sent, and unlock, before transmitting those
documents. Copy-on-write implementations are also possible.
The upstream server must send the UpdateGroup's in increasing order
of CSN for that zone.
When a downstream server receives a committed set of UpdateGroup's
from an upstream server (in response to the PullCommittedUpdates
request) the downstream ARS server:
o should check that all CSN's contained within each UpdateGroup are
monotonically increasing.
o must apply a (possibly complete) prefix of the CSN's. Resource
overrun during acceptance of the updates must not leave the
downstream server in a state where a non-prefix subset of the
responses has been committed.
o must apply each UpdateGroup atomically, applying steps 1-3, 6 and
7 listed under SubmitUpdate Processing (Section 4.2.1.1), with one
exception: in step 6 rather than allocating a new CSN the
downstream server should use the CSN contained in each document
(which was allocated and written into the documents by the
primary).
o must update its local state based on the last CSN received and
committed from the committed UpdateGroup(s).
The downstream server must ignore datastore delete failures, in
order to function correctly with upstream servers that implement
update collapsing (Section 4.2.1.5).
4.2.1.5 Submitted Update Collapsing for Infrequently Synchronized Peers
If an ARS server performs write-write conflict detection, clients
cannot submit two updates in a row to a document without getting a
commit response after each submission. That can be an annoying
limitation for infrequently synchronized nodes, such as mobile PDAs.
To mitigate this problem ARS peers may collapse updates as follows.
If a pending update submission has not yet been propagated up the
DAG, the ARS server may choose to replace the pending submission with
another update to the same document, reusing the SSN. To maintain
the correct submitted update ordering, the SSN's for all updates
between the previous and the new submission must be reordered whenever
this algorithm is applied, by dropping the original submission,
shifting each of the following SSN's back by one, and decrementing
the current not-yet-assigned SSN at that ARS server. For example,
consider the update submission sequence: a.b.c (SSN 4), a.b.d (SSN
5), a.b.e (SSN 6), a.b.c (SSN 7). If none of these updates has yet
been propagated up the DAG, this update sequence can be replaced with
the sequence a.b.d (SSN 4), a.b.e (SSN 5), a.b.c (SSN 6), and then
reusing SSN 7 for the next update that is submitted.
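The collapsing algorithm above can be sketched as follows; this is a
non-normative illustration with invented names, representing the
pending (not yet propagated) queue as an ordered list:

```python
def submit_with_collapse(pending, next_ssn, doc, update):
    """Queue `update` to `doc`, collapsing any pending (not yet
    propagated) submission to the same document.  `pending` is an
    ordered list of [ssn, doc, update] entries.  Returns the next
    unassigned SSN at this ARS server."""
    for i, entry in enumerate(pending):
        if entry[1] == doc:
            pending.pop(i)                   # drop the original submission
            for later in pending[i:]:
                later[0] -= 1                # shift following SSN's back by one
            pending.append([next_ssn - 1, doc, update])  # reuse the freed SSN
            return next_ssn                  # unassigned SSN is unchanged
    pending.append([next_ssn, doc, update])  # nothing to collapse
    return next_ssn + 1
```

Applied to the example in the text, the queue a.b.c (SSN 4), a.b.d
(SSN 5), a.b.e (SSN 6) plus a second update to a.b.c becomes a.b.d
(SSN 4), a.b.e (SSN 5), a.b.c (SSN 6), with SSN 7 still unassigned.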
4.2.2 ARS Submission-Propagation Protocol (ars-s) Processing
ars-s requires more complex synchronization for performing the ars-c
SubmittedUpdateResultNotification operation. Each of these
operations is discussed below.
4.2.2.1 Non-Primary SubmitUpdate Processing
If a non-primary ARS server that supports ars-s receives a
SubmitUpdate request, it performs the following steps:
1. Steps 1-5 listed under SubmitUpdate Processing (Section 4.2.1.1).
Note: a SSN should not be used in place of a temporary identifier
in step 3 because if a failure occurs during these steps a
FailedUpdateSubmission request will have to be propagated
upstream (discussed below), adding additional load to all
upstream servers and delaying other update submissions until this
FailedUpdateSubmission has completed at the primary.
2. The server generates a PropagateSubmittedUpdate request,
consisting of the same content as the received submission, but
filling in the GlobalSubmitID attribute with the server's
SubmisSvrHost, SubmisSvrPort, SubmisSvrIncarn, and the SSN.
3. The server attempts to send this PropagateSubmittedUpdate request
to each upstream server in turn, until one successfully receives
it.
4. If the PropagateSubmittedUpdate request cannot be successfully
forwarded to any upstream server, a timer must be set to retry
the sequence of upstream servers again later (because of the
client-server promise discussed in [1]). The timeout duration
and number of attempts is determined by the local implementation.
5. Once the PropagateSubmittedUpdate transmission has completed, the
server saves stable state to indicate that the update has been
propagated, so that it can look up this state when the update
later completes (successfully or not) and notify the client if
notification was requested. The PropagateSubmittedUpdate request
must not be transmitted again once it has been successfully
received by an upstream server.
6. If all attempts to send/timeout/re-send the
PropagateSubmittedUpdate request upstream fail and notification
was requested, the server sends an ARSError containing
ARSErrorCode=210001. If all attempts fail the server always
generates a PropagateSubmittedUpdate request containing a
FailedUpdateSubmission element, which it attempts to send
upstream using the same timeout-and-retry logic as noted in step
(4), with the exception that it never stops trying until it
succeeds. The reason is that the primary must learn of the
failed update submission, else all future submissions from the
submission server will fail because of the requirement to
serialize updates by SSN (see below).
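The try-each-upstream-then-retry logic of steps 3 and 4 can be
sketched as follows; this is a non-normative illustration, with
`send` standing in for the actual PropagateSubmittedUpdate transport:

```python
import time

def propagate_upstream(upstreams, send, max_attempts, retry_secs):
    """Try each upstream server in preference order until one accepts
    the PropagateSubmittedUpdate request; on a full pass of failures,
    wait and retry the sequence.  `send(server)` returns True once a
    server has accepted the request."""
    for _ in range(max_attempts):
        for server in upstreams:      # preference-weighted order
            if send(server):
                return server         # accepted: never retransmit (step 5)
        time.sleep(retry_secs)        # all failed: retry later (step 4)
    return None                       # caller reports ARSErrorCode=210001
```

For the FailedUpdateSubmission case of step 6, the same loop would be
run with an unbounded number of attempts, since the primary must
eventually learn of the failed submission.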
4.2.2.2 Non-Primary PropagateSubmittedUpdate Processing
If a non-primary ARS server that supports ars-s receives a
PropagateSubmittedUpdate request (which came either from a non-
primary that received a SubmitUpdate request and generated a
corresponding PropagateSubmittedUpdate request, or from a server
propagating a PropagateSubmittedUpdate request it received), it does
the following:
1. If the request contains a "FailedUpdateSubmission" element, it
responds to the downstream server with an ARSAnswer containing
the GlobalSubmitID to indicate that it has successfully received
the request. It then attempts to send this
PropagateSubmittedUpdate request to each upstream server in turn,
until one successfully receives it. Note that at this point the
responsibility for completing the FailedUpdateSubmission
transmission has passed from the previous server to the current
server, so the current server must retry transmitting the request
indefinitely until an upstream server has accepted it.
2. Otherwise, the server performs steps 1-6 listed under Non-Primary
(Section 4.2.2.1), with four changes:
* In addition to the other checks performed during step 1 (more
specifically, during step 3 of SubmitUpdate Processing
(Section 4.2.1.1)), it checks whether the GlobalSubmitID
duplicates one already seen. This check is done in two places:
1. During step 3 of SubmitUpdate Processing (Section 4.2.1.1)
a check is made that the SSN contained within the given
GlobalSubmitID is greater than the SSN of the last
successfully committed update for the given zone &
submission GlobalServerID; and,
2. After step 3 of SubmitUpdate Processing (Section 4.2.1.1)
a check is made that the given GlobalSubmitID is not
currently being processed (which could happen if duplicate
submissions arrive so close together that one has started
processing and not yet completed).
This check ensures that:
+ DAG cycles (caused by configuration errors) cannot result
in infinite loops or deadlocks; and,
+ PropagateSubmittedUpdate operations are idempotent, which
provides greater resilience in dealing with partitions.
* It rewrites the NotifyHost with its own GlobalServerID, so
that the SubmittedUpdateResultNotification from the upstream
server to which the submission was propagated will be sent to
the current server.
* Upon completion (successful or not) the server sends a
SubmittedUpdateResultNotification to the server from which the
PropagateSubmittedUpdate request was received. See also the
discussion of SubmittedUpdateResultNotification
Synchronization (Section 4.2.2.6).
* If all attempts to send the PropagateSubmittedUpdate request
to upstream servers fail (step 6 of Non-Primary
PropagateSubmittedUpdate Processing (Section 4.2.2.2)) the
server sends the appropriate ARSError to the downstream server
from which the PropagateSubmittedUpdate request was received,
but does not generate the PropagateSubmittedUpdate request
containing a FailedUpdateSubmission element. That request
must be generated by the submission server.
Note that each server in the submission path assumes responsibility
for the client-server promise (see [1]) as the update submission is
passed up the tree. This promise allows an ARS server never to
retransmit a submission once it has been accepted by an upstream
server (step 5 under "Non-Primary SubmitUpdate Processing").
4.2.2.3 Primary PropagateSubmittedUpdate Processing
If a primary ARS server that supports ars-s receives a
PropagateSubmittedUpdate request, it performs the following steps:
o If the request does not contain a "FailedUpdateSubmission"
element, it performs the following steps:
1. Steps 1-3 listed under SubmitUpdate Processing (Section
4.2.1.1).
2. If the GlobalSubmitID for the submission is not one greater
than the last seen SSN from the submission server, enqueue the
submission to await receipt of the missing submission(s),
starting a timer to detect excessive delays. This ensures
that update serialization preserves the correct partial
ordering of updates. Note that this hold-and-re-order
mechanism is required because submissions transmitted up
varying replication DAG paths could arrive out of order. If a
timeout occurs, send an ARSError containing
ARSErrorCode=212001 and abort the update.
3. Steps 6-7 listed under SubmitUpdate Processing (Section
4.2.1.1), but sending the SubmittedUpdateResultNotification to
the downstream server from which the submission was received.
4. If the update completed successfully, it updates its local
state that tracks the last seen SSN from the given submission
server.
o Otherwise (FailedUpdateSubmission):
1. It responds to the downstream server with an ARSAnswer
containing the GlobalSubmitID to indicate that it has
successfully received the request.
2. Step 2 above, but without setting a timeout.
3. Step 4 above.
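The primary's hold-and-re-order mechanism of step 2 can be sketched
as follows; this is a non-normative illustration with invented names,
tracking one submission server's SSN sequence:

```python
def on_propagated_submission(state, ssn, commit):
    """Commit a propagated submission only when its SSN is exactly one
    greater than the last SSN committed for this submission server;
    queue later SSNs until the gap fills.  (A timeout, not shown,
    would instead send ARSErrorCode=212001 and abort the update.)
    Returns the list of SSNs committed by this call."""
    if ssn != state["last_ssn"] + 1:
        state["queued"][ssn] = commit      # hold: an earlier SSN is missing
        return []
    committed = []
    commit()
    state["last_ssn"] = ssn
    committed.append(ssn)
    # Drain any queued successors that are now in order.
    while state["last_ssn"] + 1 in state["queued"]:
        nxt = state["last_ssn"] + 1
        state["queued"].pop(nxt)()         # commit the held submission
        state["last_ssn"] = nxt
        committed.append(nxt)
    return committed
```

If SSN's 4 and 5 arrive before SSN 3 (as can happen when submissions
travel different DAG paths), they are held, then committed in order
once SSN 3 arrives.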
4.2.2.4 PushCommittedUpdates and PullCommittedUpdates Scheduling
ARS does not specify how PushCommittedUpdates and
PullCommittedUpdates operations are to be scheduled. As a local
implementation matter, ARS servers may schedule PushCommittedUpdates
and PullCommittedUpdates operations a variety of different ways,
perhaps offering configuration options that can support any/all of
the following:
1. Periodic PullCommittedUpdates requests.
2. PushCommittedUpdates requests down the submission path
immediately following any update that was propagated up that
path, to minimize committed update propagation latency back down
to the submitting client.
3. PushCommittedUpdates requests down to other servers immediately
following an update, to minimize committed update propagation
latency for servers that need to keep in close synchronization.
4. PullCommittedUpdates requests only upon new replica join, server
re-boot, mobile device reconnection, or partition repair, to
"catch up".
If a server implements (2) and/or (3) above, care should be taken to
prevent backlogging the downstream server with many
PushCommittedUpdates requests. For example, if the primary is
experiencing high update rates and performs a PushCommittedUpdates
each time it completes an update, it may not be possible to process
the ensuing PullCommittedUpdates requests that the downstream
server(s) make as fast as new PushCommittedUpdates requests are being
made. This can create excess network traffic and lock contention at
the primary, at precisely the worst time. To avoid this problem, the
following algorithm (reminiscent of delayed acknowledgements and
Nagle's algorithm used by TCP [10]) should be used:
o The upstream server uses a flag that tracks whether the downstream
server has run a PullCommittedUpdates request since the last
PushCommittedUpdates request from the upstream server.
o After sending a PushCommittedUpdates request to the downstream
server, the upstream server sets the flag to true.
o After processing a PullCommittedUpdates request, the upstream
server sets the flag to false.
o Each time the upstream server's PushCommittedUpdates code is
triggered it checks this flag and only performs the
PushCommittedUpdates request if the flag is false.
o The code that performs the PushCommittedUpdates and sets the flag
must be run within a critical section that guarantees that a
PullCommittedUpdates request cannot begin until the
PushCommittedUpdates completes (otherwise a deadlock can ensue
that stops any future PushCommittedUpdates requests from running).
With this approach, updates can be propagated when they complete, but
during times of high update submission load many PushCommittedUpdates
operations will be batched together.
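The flag-based batching algorithm above can be sketched as follows;
this is a non-normative illustration, with `send_push` standing in
for the actual PushCommittedUpdates transmission to one downstream
server:

```python
import threading

class PushScheduler:
    """One flag per downstream server: suppress further
    PushCommittedUpdates until that server has run a
    PullCommittedUpdates request."""

    def __init__(self, send_push):
        self._lock = threading.Lock()   # critical section: flag + push together
        self._push_since_pull = False   # True once a push has gone unanswered
        self._send_push = send_push

    def on_update_committed(self):
        with self._lock:
            if not self._push_since_pull:
                self._send_push()       # suggest a PullCommittedUpdates
                self._push_since_pull = True
            # else: batch -- the pending pull will pick this update up too

    def on_pull_processed(self):
        with self._lock:                # a pull cannot begin mid-push
            self._push_since_pull = False
```

Under a burst of commits, only the first triggers a push; the
downstream server's eventual pull collects the whole batch and clears
the flag.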
4.2.2.5 PullCommittedUpdates Synchronization
The downstream server must perform synchronization to ensure that at
most one PullCommittedUpdates request can be running at a time for a
given zone. For example, a server configured with two different
upstream servers for a zone must not run concurrent
PullCommittedUpdates requests from the two upstream servers. (This
synchronization requirement is one reason why PushCommittedUpdates is
simply a suggestion for a PullCommittedUpdates request to be
performed. If PushCommittedUpdates actually transmitted data, it
would be difficult to synchronize because the PushCommittedUpdates
and PullCommittedUpdates data transfers would be initiated by
different servers. Instead, the downstream server controls the
scheduling of committed update transmissions.) This is important not
only because concurrent data transfers for PullCommittedUpdates' for
the same zone would waste traffic and server load, but also because
this concurrency could result in incorrect committed state. For
example, consider the sequence:
o Start PullCommittedUpdates request from upstream server #1 with
LastSeenCSN=2.
o Start PullCommittedUpdates request from upstream server #2 with
LastSeenCSN=2.
o Upstream server #1 has last seen CSN 4, and begins sending content
for CSN's 2,3,4.
o Upstream server #2 has last seen CSN 5, and begins sending content
for CSN's 2,3,4,5.
o Upstream server #2 happens to finish first, committing updates 2-
5.
o Upstream server #1 then finishes, leaving the last-seen CSN=4 but
CSN 5 already applied. This is an incorrect state.
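The at-most-one-pull-per-zone rule can be enforced with a per-zone
lock; the following is a non-normative sketch with invented names,
where `do_pull` stands in for the actual PullCommittedUpdates
request/response processing:

```python
import threading

_pull_locks = {}                   # zone name -> lock serializing pulls
_registry = threading.Lock()       # guards the lock registry itself

def run_pull(zone, upstream, do_pull):
    """Run `do_pull(upstream)` only if no PullCommittedUpdates request
    is already in progress for `zone`; returns False if one is."""
    with _registry:
        lock = _pull_locks.setdefault(zone, threading.Lock())
    if not lock.acquire(blocking=False):
        return False               # at most one pull per zone at a time
    try:
        do_pull(upstream)          # transfers and applies UpdateGroup's
        return True
    finally:
        lock.release()
```

With this guard, a pull from upstream server #2 is refused while the
pull from upstream server #1 is still in progress, avoiding the
incorrect last-seen-CSN state shown in the example above.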
4.2.2.6 SubmittedUpdateResultNotification Synchronization
The scheduling of SubmittedUpdateResultNotification requests to a
downstream server is complicated by two factors:
1. Because the response to a PullCommittedUpdates request can
contain more than one UpdateGroup, receipt of a
PullCommittedUpdates request in a server that supports ars-s may
trigger multiple SubmittedUpdateResultNotification's to be
generated to downstream servers and/or clients.
2. It is possible that the committed update content for an update
reaches the downstream server before the
SubmittedUpdateResultNotification from its upstream server
reaches that downstream server.
To illustrate the second case above, consider the following
replication topology:
svr3
(primary)
| |
\|/ \|/
svr2 svr4
| |
\|/ \|/
svr1
Given this topology, consider the following event ordering sequence:
1. client->svr1: SubmitUpdate
2. svr1->svr2: PropagateSubmittedUpdate
3. link between svr1 and svr2 goes down
4. svr2->svr3: PropagateSubmittedUpdate
5. svr3->svr2: SubmittedUpdateResultNotification
6. svr3->svr2: PushCommittedUpdates
7. svr2->svr3: PullCommittedUpdates
8. svr3->svr4: PushCommittedUpdates
9. svr4->svr3: PullCommittedUpdates
10. svr4->svr1: PushCommittedUpdates
11. svr1->svr4: PullCommittedUpdates
12. link between svr1 and svr2 comes back up
13. svr2->svr1: SubmittedUpdateResultNotification
14. svr2->svr1: PushCommittedUpdates
15. svr1->svr2: PullCommittedUpdates
16. svr1->client: SubmittedUpdateResultNotification
Because the link between svr1 and svr2 goes down after the submitted
update has been propagated, the committed update content reaches svr1
via an alternate path through the DAG (svr3->svr4->svr1, completing
in event 11) before the SubmittedUpdateResultNotification reaches it
(event 13).
Because of these complications, SubmittedUpdateResultNotification (as
well as scheduling of PushCommittedUpdates operations to propagate
the newly arrived committed content downstream) should be triggered
as follows:
o Each server maintains a submitted update state table keyed by SSN
and sequentially ordered (in a secondary data structure such as a
tree) by CSN. It also maintains a counter of the last seen CSN
for the given zone.
o When a SubmittedUpdateResultNotification arrives from an upstream
server, the downstream server looks up the state table entry for
the given SSN, and sets the CSN for that table entry (adjusting
tree order accordingly).
o A thread runs periodically (say, once per second), scanning the
CSN-ordered tree for submitted updates whose CSN is less than the
last seen CSN for the given zone. For each entry, the downstream
server generates a SubmittedUpdateResultNotification to the next
server down the submission path.
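The state table and periodic scan above can be sketched as follows;
this is a non-normative illustration with invented names, assuming a
notification is releasable once its CSN is at or below the last
committed CSN applied locally:

```python
import bisect

class NotificationTracker:
    """Submitted-update state table ordered by CSN, plus a
    last-seen-CSN counter for the zone; scan() models the periodic
    thread that releases notifications downstream."""

    def __init__(self):
        self._csn_order = []     # sorted list of (csn, ssn) pairs
        self.last_seen_csn = 0   # advanced as committed updates are applied

    def record_notification(self, ssn, csn):
        # A SubmittedUpdateResultNotification arrived from upstream:
        # bind the SSN to its assigned CSN, keeping CSN order.
        bisect.insort(self._csn_order, (csn, ssn))

    def scan(self):
        """Return SSNs whose committed content has already been
        applied locally; their notifications may now be sent down the
        submission path."""
        ready = []
        while self._csn_order and self._csn_order[0][0] <= self.last_seen_csn:
            _, ssn = self._csn_order.pop(0)
            ready.append(ssn)
        return ready
```

This handles both complications: notifications arriving before or
after the committed content, in any order, are released only once the
content itself has been applied.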
Implementations should not attempt to simplify the synchronization
requirements here by forcing the SubmittedUpdateResultNotification to
complete before the committed content propagates, because doing so
could mean that a single unavailable downstream server would hold up
transmissions of committed updates to all servers in the network.
4.2.2.7 Submitted Update Reordering Details
Non-primary servers must not hold-and-re-order update submissions.
They simply forward all updates up the DAG, and the primary performs
any needed re-ordering. Non-primary servers need not hold-and-re-
order committed updates coming back down the DAG, because all ARS
servers are required to send committed updates in order and without
gaps in the numbering sequence since the requested CSN.
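Because committed updates are required to arrive in CSN order and
without gaps, the receiver-side check reduces to a counter comparison.
A minimal sketch (the function name is illustrative, not from the
spec):

```python
def check_committed_stream(requested_csn, received_csns):
    """Verify that a stream of committed updates is in CSN order with
    no gaps after the requested CSN; return the new last-seen CSN."""
    expected = requested_csn + 1
    for csn in received_csns:
        if csn != expected:
            raise ValueError(
                "committed update stream out of order or gapped: "
                f"expected CSN {expected}, got {csn}")
        expected += 1
    return expected - 1
```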
The following figure provides an example of the dynamics that can
result from the update submission re-ordering/time-out mechanism.
primary:E2 (E4, E5 queued)
/ \
/ \
\|/ \|/
repA repB
/ \
~ \
\|/ \|/
repC:E3 repD:E4,E5
~ /
| /
\|/ \|/
repE
In this figure, the last update to be serialized by the primary from
replica server E is E's SSN number 2 (denoted by "primary:E2").
After E2 committed, E propagated three more update submissions: it
propagated SSN 3 to replica server C, and SSNs 4 and 5 to replica
server D. Replica server C became partitioned from the network after
it accepted submission E3 (indicated by the tildes in the figure),
but SSNs 4 and 5 made it to the primary via repD->repA->primary.
Because it has not yet seen E3, the primary queues E4 and E5, waiting
for E's SSN 3 to arrive. If replica server C stays partitioned for a
long time, the primary will time out SSNs 4 and 5 (sending an
ARSError containing ARSErrorCode=127001 for each).
Replica server C might then repair its partition and propagate E3
upstream, at which point the primary will serialize and pass the
corresponding committed update back down. Replica server E could
therefore see E4 and E5 fail and then see E3 succeed. It is up to E
to decide whether and when to resubmit the failed submissions.
Possibilities include:
o Reflecting the state to the user;
o Pausing and resubmitting E4 and E5 some configurable number of
times; and,
o Re-propagating E3 (and E4 and E5), so the primary receives it
without waiting for C to repair its partition.
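The primary's hold-and-reorder queue in this example might be
structured as in the following sketch. The class and callback names
are hypothetical; only the behavior (commit in-order runs, time out
held submissions with ARSErrorCode=127001) comes from the text above.

```python
import time

ARS_ERR_REORDER_TIMEOUT = 127001  # error code from the example above

class PrimaryReorderQueue:
    """Sketch of the primary's per-replica hold-and-reorder queue."""

    def __init__(self, timeout_secs, commit, fail):
        self.timeout_secs = timeout_secs
        self.next_ssn = {}   # replica id -> next SSN expected
        self.held = {}       # replica id -> {ssn: (arrival_time, update)}
        self.commit = commit # callback: serialize and commit an update
        self.fail = fail     # callback: send an ARSError back down

    def submit(self, replica, ssn, update, now=None):
        now = time.monotonic() if now is None else now
        self.next_ssn.setdefault(replica, 1)
        held = self.held.setdefault(replica, {})
        held[ssn] = (now, update)
        # Commit any in-order run of submissions now available.
        while self.next_ssn[replica] in held:
            _, upd = held.pop(self.next_ssn[replica])
            self.commit(replica, self.next_ssn[replica], upd)
            self.next_ssn[replica] += 1

    def expire(self, now=None):
        # Time out submissions held too long (e.g. E4/E5 while E3 is
        # stuck behind replica server C's partition).
        now = time.monotonic() if now is None else now
        for replica, held in self.held.items():
            for ssn in sorted(held):
                arrived, _ = held[ssn]
                if now - arrived > self.timeout_secs:
                    held.pop(ssn)
                    self.fail(replica, ssn, ARS_ERR_REORDER_TIMEOUT)
```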
Submission re-ordering is not performed in the downstream direction.
Instead, updates are only propagated in the downstream direction in
CSN order. Note that this happens naturally because the primary
generates updates and sends them in CSN order, and its downstream
servers likewise send updates only up through the last CSN they have
seen, so all ARS servers will always see updates in complete CSN
order.
4.2.3 ARS Encoding Negotiation Protocol (ars-e) Processing
In response to a ContentEncodingNegotiation request, the responder
makes a zone-specific decision (e.g., different zones can have
different underlying databases, supporting correspondingly different
proprietary encoding formats). The local implementation may also
consider other issues (e.g., source IP address to decide if
encryption allowed based on country's export control restrictions).
Encoding negotiation results may be cached as long as a BEEP channel
is open to the remote server. Thus, to change the set of encodings
it supports a server must first close any open channels.
If a server that does not support the ars-e protocol receives a
ContentEncodingNegotiation request, it responds with an ARSError
containing ARSErrorCode=223005.
After receiving a response to the ContentEncodingNegotiation request,
the initiator should check that the returned set is indeed a subset
of the originally offered encodings.
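That subset check is a one-liner over sets. A sketch, with an
illustrative function name:

```python
def validate_negotiated_encodings(offered, returned):
    """Initiator-side check: the responder's encoding set must be a
    subset of the encodings originally offered."""
    extra = set(returned) - set(offered)
    if extra:
        raise ValueError(
            f"responder returned unoffered encodings: {sorted(extra)}")
    return list(returned)
```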
A request specifying LastSeenCSN='0' indicates that the entire zone
is to be transferred. This case may be used by the upstream server
to trigger a special full-zone encoding, if ars-e is supported by
both servers.
The basic algorithm used for 'plumbing' into a content encoding is to
define an API which the encoding can upcall to save documents to
their stable (temporary) storage, passing the document name, content,
CSN, and operation to be performed. On the reverse side (sending a
set of documents from stable storage out through an encoding), the
encoding upcalls to get a list of document names needing
transmission, and then upcalls to get the document data content for
each. The encoding can then perform whatever transformations are
needed on the way to/from stable storage. Importantly, the whole
process must be implemented as a pipeline so as not to assume an
entire update will fit in memory: documents should be saved to
stable storage as they arrive, and read back only as they are about
to be sent.
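The upcall API described above might be expressed as an abstract
interface like the following sketch. All names are illustrative; the
point is that each side handles one document at a time, so the
pipeline never needs a whole update in memory.

```python
from abc import ABC, abstractmethod
from typing import Iterator, Tuple

class EncodingPlumbing(ABC):
    """Sketch of the API an encoding upcalls into (hypothetical names)."""

    @abstractmethod
    def save_document(self, name: str, content: bytes,
                      csn: int, op: str) -> None:
        """Receiving side: persist one document to stable (temporary)
        storage as it arrives, with its CSN and operation."""

    @abstractmethod
    def documents_to_send(self) -> Iterator[str]:
        """Sending side: names of documents needing transmission."""

    @abstractmethod
    def read_document(self, name: str) -> Tuple[bytes, int, str]:
        """Sending side: fetch one document's content, CSN, and
        operation just before it is transmitted."""
```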
4.3 Example State Transition Diagrams
The state transitions needed to implement ARS will depend on which
subset of ARS sub-protocols is implemented, and what scheduling and
synchronization mechanisms are implemented. The following three
figures provide a set of state transition diagrams that could be used
to implement all three ARS sub-protocols (ars-c, ars-s, and ars-e),
with support for PushCommittedUpdates requests down the submission
path immediately following an update that was propagated up that
path. In these state diagrams the paths running straight down
represent the transitions taken when the current state completes
successfully, while the paths to the left represent the transitions
when a failure occurs.
The first state transition diagram can be used for handling
submitted updates arriving at a non-primary from a client (via
SubmitUpdate) or from a downstream ARS server (via
PropagateSubmittedUpdate):
|
\|/
/<---Incomplete
/ |
/ \|/
/ /<-CompletelyReceived
/ / |
/ / \|/
/ /<-PropagatingUpstream<------\ timeout + retrans
| | | \______/ N times, then infinite
| f | \|/ FailedSubmissionPropagation
| a |<--AwaitingCommitNotif
| i | |
| l | \|/
| u |<--AwaitingLocalCommit
| r | |
| e | \|/
| s |<-KickingDownstreamScheds
| | |
| \ \|/
\ \-->NotifyingSubmitter<------\ timeout + retrans
\ | \______/ N times
\ \|/
\------->CleaningUp
|
\|/
Done
The second state transition diagram can be used for handling
submitted updates arriving at the primary from a client or from a
downstream ARS server:
|
\|/
/<---Incomplete
/ |
/ \|/
/ /<-CompletelyReceived
/ / |
/ / \|/
/ /<-QueuedForReordering
| | |
| f | \|/
| a |<---WaitingForLock
| i | |
| l | \|/
| u |<--ApplyingToLocalDatastore
| r | |
| e | \|/
| s |<-KickingDownstreamScheds
| | |
| \ \|/
\ \-->NotifyingSubmitter<------\ timeout + retrans
\ | \______/ N times
\ \|/
\------->CleaningUp
|
\|/
Done
The third state transition diagram can be used for handling
committed updates arriving at a non-primary from an upstream ARS
server:
|
\|/
/Incomplete
/ |
/ \|/
/<-CompletelyReceived
f / |
a / \|/
i /<---WaitingForLock
l | |
u | \|/
r |<--ApplyingToLocalDatastore
e | |
s \ \|/
\--->CleaningUp
|
\|/
Done
The AwaitingCommitNotif state is used to represent the case where a
submitted update has been propagated upstream and the local server
has not yet received notification that the update has committed
(along with the CSN). The AwaitingLocalCommit state is used to
represent the case where the commit notification has been received
but the committed update content has not yet been propagated back
down the DAG to the local server. (See the discussion of
SubmittedUpdateResultNotification Processing (Section 4.2.1.2).)
The KickingDownstreamScheds state is used to represent the case where
PushCommittedUpdates operations are scheduled to run periodically at
the upstream server, and when a new update arrives the schedules need
to be changed such that a PushCommittedUpdates runs immediately and
then the normal schedule period is re-started. Again, this is needed
for the case of an implementation that performs PushCommittedUpdates
requests down the submission path immediately following an update
that was propagated up that path.
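The first diagram's states and success-path transitions can be
transcribed directly; failure paths all converge on NotifyingSubmitter
and then CleaningUp. This sketch is a transcription of the diagram
only (enum and helper names are illustrative), not a full state
machine with the retransmission behavior.

```python
from enum import Enum, auto

class NonPrimarySubmissionState(Enum):
    INCOMPLETE = auto()
    COMPLETELY_RECEIVED = auto()
    PROPAGATING_UPSTREAM = auto()      # retransmits on timeout
    AWAITING_COMMIT_NOTIF = auto()
    AWAITING_LOCAL_COMMIT = auto()
    KICKING_DOWNSTREAM_SCHEDS = auto()
    NOTIFYING_SUBMITTER = auto()       # retransmits on timeout
    CLEANING_UP = auto()
    DONE = auto()

S = NonPrimarySubmissionState
SUCCESS_NEXT = {
    S.INCOMPLETE: S.COMPLETELY_RECEIVED,
    S.COMPLETELY_RECEIVED: S.PROPAGATING_UPSTREAM,
    S.PROPAGATING_UPSTREAM: S.AWAITING_COMMIT_NOTIF,
    S.AWAITING_COMMIT_NOTIF: S.AWAITING_LOCAL_COMMIT,
    S.AWAITING_LOCAL_COMMIT: S.KICKING_DOWNSTREAM_SCHEDS,
    S.KICKING_DOWNSTREAM_SCHEDS: S.NOTIFYING_SUBMITTER,
    S.NOTIFYING_SUBMITTER: S.CLEANING_UP,
    S.CLEANING_UP: S.DONE,
}

def on_failure(state):
    # The diagram's left-hand paths: failures converge on
    # NotifyingSubmitter, whose own failure path goes to CleaningUp.
    if state in (S.NOTIFYING_SUBMITTER, S.CLEANING_UP):
        return S.CLEANING_UP
    return S.NOTIFYING_SUBMITTER
```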
5. Security Considerations
See [1]'s Section 10 for a discussion of ARS security issues.
References
[1] Schwartz, M., "The ANTACID Replication Service: Rationale and
Architecture", draft-schwartz-antacid-service-00 (work in
progress), October 2001.
[2] World Wide Web Consortium, "Extensible Markup Language (XML)
1.0", W3C XML, February 1998.
[3] Rose, M., "The Blocks Extensible Exchange Protocol Framework",
draft-mrose-blocks-protocol-04 (work in progress), May 2000.
[4] Rose, M., Gazzetta, M. and M. Schwartz, "The Blocks Datastore
Model", Draft Technical Memo, January 2001.
[5] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform
Resource Identifiers (URI): Generic Syntax", RFC 2396, August
1998.
[6] Reynolds, J., "Post Office Protocol", RFC 918, Oct 1984.
[7] Postel, J., "Simple Mail Transfer Protocol", RFC 788, Nov 1981.
[8] Mockapetris, P., "Domain names - concepts and facilities", RFC
1034, STD 13, Nov 1987.
[9] Lamport, L., "Time, Clocks, and the Ordering of Events in a
Distributed System", Communications of the ACM Vol. 21, No. 7,
July 1978.
[10] Stevens, W., "TCP/IP Illustrated, Volume 1 - The Protocols",
Addison-Wesley Professional Computing Series , 1994.
[11] Borenstein, N. and N. Freed, "MIME (Multipurpose Internet Mail
Extensions) Part One: Mechanisms for Specifying and Describing
the Format of Internet Message Bodies", RFC 1521, September
1993.
Author's Address
Michael F. Schwartz
Code On The Road, LLC
EMail: schwartz@CodeOnTheRoad.com
URI: http://www.CodeOnTheRoad.com
Appendix A. Acknowledgements
The author would like to thank the following people for their reviews
of this specification: Marco Gazzetta, Carl Malamud, Darren New, and
Marshall Rose.
Appendix B. Future Enhancements and Investigations
A possible future enhancement to the protocol and implementation
would be to use an attribute that specifies payload length for update
content. By doing this, an implementation could copy the payload
directly to stable storage instead of first parsing it. This could
provide a significant performance improvement, and would also allow
the update content to be saved in exactly the format it was sent in
(as opposed to the rewriting/reindenting/etc. that happen when XML
content is parsed and then output).
Another possible future enhancement to the protocol and
implementation would be to allow serialization-only primary servers,
whose only job is to serialize update submissions and distribute the
work for applying and propagating serialized updates among the first
tier of zone replica servers. That would offload query and update
processing from the inherently centralized serialization server.
A possible future enhancement to the protocol would be to allow a
replication topology containing cycles, rather than requiring DAGs.
This generalization would provide resilience to network partitions
with fewer servers than a DAG requires. For example, consider the
following cyclic replication topology:
s1
| \
| \
s2---s3
In this figure updates can flow to s2 if s3 is down and vice versa.
With ARS's DAG-based topology an additional server would be required
to achieve the same level of redundancy:
s1------>s4
| \ /
| \ /
| \ /
| / \
| / \
\|/\|/ \|/
s2----->s3
Another possible future enhancement to the protocol would be to allow
batching of submitted updates before propagating up the DAG.
An area for further work is defining SNMP-based monitoring/management
interfaces.
An area for further work is automating the approach to laying out
replication topology.
Appendix C. ANTACID Replication Service Registration
Profile Identification: http://xml.resource.org/profiles/ARS
Messages exchanged during Channel Creation: none
Messages in "REQ" frames: "ARSRequest"
Messages in positive "RSP" frames: "ARSResponse"
Messages in negative "RSP" frames: "ARSError"
Message Syntax: c.f., Appendix D, Appendix E, Appendix F, and
Appendix G.
Message Semantics: c.f., Section 3.2
Appendix D. ARS Top-Level DTD
%ARSC;
%ARSC;
%ARS;
%ARSC;
%ARSE;
%ARSC;
%ARS;
%ARSE;
Appendix E. ars-c DTD
Appendix F. ars-s DTD
Appendix G. ars-e DTD
Appendix H. ARS Topology Configuration DTD
Appendix I. Current Encodings and Registration Procedures
ARS encodings are defined as MIME [11] Content-Type
"application/ars", with the single parameter "encoding_name" naming
which encoding is being used (e.g., DataWithOps). ars-e is NOT a
MIME Content Transfer Encoding, since it is not application-
independent.
As is the case with MIME primary types, encodings being used
privately (that is, between peers that understand the encoding by
mutual prior arrangement) must be given names that begin with "X-" to
indicate the encodings' non-standard status and to avoid a potential
conflict with a future official name. Following the "X-" must be a
URI [5] that identifies the encoding uniquely (for example, X-
http://xml.resource.org/encodings/mysqlRaw.html). This URI should
refer to a document that describes the encoding (whether formally or
informally), but the existence of a document is not required. The
only requirement is that the URI must provide a globally unique
identification of the encoding, to prevent clashes in the name space
of privately defined encodings.
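The "X-" plus URI naming rule lends itself to a shallow syntactic
check. A sketch (the function name is illustrative, and only the
scheme/rest structure of the URI is checked; full URI validation per
[5] is out of scope here):

```python
def is_valid_private_encoding_name(name: str) -> bool:
    """Check that a private ARS encoding name is 'X-' followed by
    something shaped like a URI (scheme ':' remainder)."""
    if not name.startswith("X-"):
        return False
    uri = name[2:]
    scheme, sep, rest = uri.partition(":")
    return bool(sep) and scheme.isalnum() and bool(rest)
```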
ARS Encodings are afforded official status when they have been
registered with the Internet Assigned Numbers Authority (IANA), using
the template provided below. The currently defined ARS encodings are
also listed below, for convenience.
Note that ARS references the encoding_name within the
ContentEncodingsSupported and UpdateGroup elements, without the MIME
"Content-Type:" syntax.
I.1 Currently Defined Encodings
The AllZoneData encoding is used to send (and receive) all documents
within a datastore. It is used in two cases: (a) starting up a new
replica; and (b) updating a downstream replica from an upstream
server that uses log-based committed update state management, when
the downstream server's last seen CSN is earlier than the upstream
replica's log truncation point (Section 4.1.2). The encoding is
similar to that used for DataWithOps, with the following differences:
o with AllZoneData the receiving ARS server must delete the existing
documents for the zone before applying the updates; and,
o with AllZoneData the requestor must specify the zone for which
they want all zone data.
The EllipsisNotation encoding may be used during committed update
propagation when transmitting to a downstream server on the DAG path
along which an update was originally submitted. Instead of sending
the documents to be updated inside the UpdateGroup, the upstream
server sends the GlobalUpdateSubmitID that was assigned when the
update was originally submitted. The downstream server then commits
the content that it had saved in temporary stable storage. This
encoding avoids transmitting the update content down the same link(s)
along which it was originally submitted. When using this encoding it
is the responsibility of the upstream server to track where it
received updates in order to determine when the ellipsis notation may
be applied. It is a local implementation matter whether the state
needed for tracking this information is kept on stable storage or as
in-memory current-server-incarnation state.
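The upstream bookkeeping for ellipsis notation amounts to remembering
which downstream link each submission arrived on. A sketch with
illustrative names (an in-memory variant; a durable one would persist
the map):

```python
class EllipsisTracker:
    """Remember the downstream link each submission came up on, so
    committed-update propagation down that same link can send just
    the GlobalUpdateSubmitID instead of the full content."""

    def __init__(self):
        self.origin_link = {}   # GlobalUpdateSubmitID -> link id

    def note_submission(self, submit_id, link):
        self.origin_link[submit_id] = link

    def may_elide(self, submit_id, link):
        # Ellipsis notation applies only on the link the update was
        # originally submitted along.
        return self.origin_link.get(submit_id) == link
```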
I.2 Encoding Registration Procedures
Similar to the MIME IANA Registration Procedures, this appendix
provides an email template for registering new ARS encodings. Note
that this template has not yet been registered with the IANA.
To: IANA@isi.edu
Subject: Registration of new ARS Encoding (MIME Content-Type:
application/ars)
Encoding name:
Dependence on proprietary formats:
Security considerations:
Published specification:
(The published specification must be an Internet RFC or
RFC-to-be if a new top-level type is being defined, and must be
a publicly available specification in any case.)
Person & email address to contact for further information:
Full Copyright Statement
Copyright (C) The Internet Society (2001). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Code On The Road, LLC expressly disclaims any and all warranties
regarding this contribution including any warranty that (a) this
contribution does not violate the rights of others, (b) the owners,
if any, of other rights in this contribution have been informed of
the rights and permissions granted to IETF herein, and (c) any
required authorizations from such owners have been obtained. This
document and the information contained herein is provided on an "AS
IS" basis and CODE ON THE ROAD, LLC DISCLAIMS ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
IN NO EVENT WILL CODE ON THE ROAD, LLC BE LIABLE TO ANY OTHER PARTY
INCLUDING THE IETF AND ITS MEMBERS FOR THE COST OF PROCURING
SUBSTITUTE GOODS OR SERVICES, LOST PROFITS, LOSS OF USE, LOSS OF
DATA, OR ANY INCIDENTAL, CONSEQUENTIAL, INDIRECT, OR SPECIAL DAMAGES
WHETHER UNDER CONTRACT, TORT, WARRANTY, OR OTHERWISE, ARISING IN ANY
WAY OUT OF THIS OR ANY OTHER AGREEMENT RELATING TO THIS DOCUMENT,
WHETHER OR NOT SUCH PARTY HAD ADVANCE NOTICE OF THE POSSIBILITY OF
SUCH DAMAGES.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.