Internet-Draft                                       Grenville Armitage
                                                                Bellcore
                                                         June 13th, 1996


                 Redundant MARS architectures and SCSP


Status of this Memo

   This document was submitted to the IETF Internetworking over NBMA
   (ION) WG. Publication of this document does not imply acceptance by
   the ION WG of any ideas expressed within. Comments should be
   submitted to the ion@nexen.com mailing list. Distribution of this
   memo is unlimited.

   This memo is an internet draft. Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups. Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   "working draft" or "work in progress". Please check the
   1id-abstracts.txt listing contained in the internet-drafts shadow
   directories on ds.internic.net (US East Coast), nic.nordu.net
   (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
   Rim) to learn the current status of any Internet Draft.

Abstract

   The Server Cache Synchronisation Protocol (SCSP) has been proposed
   as a general mechanism for synchronising the databases of NHRP Next
   Hop Servers (NHSs), MARSs, and MARS Multicast Servers (MCSs). All
   these entities are different parts of the IETF's ION solution. This
   document is INFORMATIONAL, RAMBLING, and REALLY HACKY. It is
   intended as a catalyst for discussions aimed at identifying the
   realistic MARS scenarios to which the SCSP may find itself applied.
   This document does not deal with NHS and MCS scenarios.

1. Introduction.

   SCSP [1] was proposed to the ROLC and IP over ATM working groups as
   a general solution for synchronizing distributed databases such as
   distributed Next Hop Servers [2] and MARSs [3]. It is now being
   developed within the newly formed Internetworking over NBMA (ION)
   working group. This document attempts to describe possible
   redundant/distributed MARS architectures, and how SCSP would aid
   their implementation.

1.1 MARS Client support for backup MARSs.

   The current MARS draft already specifies a set of MARS Client
   behaviours associated with MARS-failure recovery (Section 5.4 of
   [3]). MARS Clients expect to regularly receive a MARS_REDIRECT_MAP
   message on ClusterControlVC, which lists the current and backup
   MARSs. When a MARS Client detects a failure of its MARS, it steps
   to the next member of this list and attempts to re-register. If the
   re-registration fails, the process repeats until a functional MARS
   is found.

   Sections 5.4.1 and 5.4.2 of [3] describe how a MARS Client, after
   successfully re-registering with a MARS, re-issues all the
   MARS_JOIN messages that it had sent to its previous MARS. This
   causes the new MARS to build a group membership database reflecting
   that of the failed MARS prior to the failure. (This behaviour is
   required for the case where there is only one MARS available and it
   suffers a crash/reboot cycle. Cluster members represent a
   distributed cache 'memory' that imposes itself onto the newly
   restarted MARS.)
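   As an illustration only (this sketch is not part of [3]), the
   client-side recovery behaviour just described might be expressed in
   Python-like pseudocode as follows. The helper names try_register
   and send_join are hypothetical placeholders for the real
   registration and MARS_JOIN exchanges.

      # Hypothetical sketch of MARS Client failover (section 5.4 of [3]).
      # try_register and send_join stand in for real MARS exchanges.

      def recover_from_mars_failure(redirect_map, joined_groups,
                                    try_register, send_join):
          """Step through the MARS list from the last MARS_REDIRECT_MAP,
          re-register with the first MARS that responds, then re-issue
          MARS_JOIN for every group this client is a member of."""
          while True:
              for mars_addr in redirect_map:       # e.g. [M1, M2, M3, M4]
                  cmi = try_register(mars_addr)    # returns a CMI, or None
                  if cmi is not None:
                      for group in joined_groups:  # rebuild new MARS's view
                          send_join(mars_addr, group)
                      return mars_addr, cmi
              # No MARS in the list answered; keep cycling through it.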
1.2 Structure of this document.

   This document is currently structured in a semi-rambling fashion.
   I've put together sequences of ideas to see if I can lead people to
   certain conclusions, highlighting my reasoning along the way so that
   the issues (or lack thereof) may be evident to readers. As of the
   first release there are few conclusions or solutions.

2. Why a distributed database?

   In the current MARS model [3] a Cluster consists of a number of MARS
   Clients (IP/ATM interfaces in routers and/or hosts) utilizing the
   services of a single MARS. This MARS is responsible for tracking the
   IP group membership information across all Cluster members, and
   providing on-demand associations between IP multicast group
   identifiers (addresses) and multipoint ATM forwarding paths. It is
   also responsible for allocating Cluster Member IDs (CMIs) to Cluster
   members (inserted into outgoing data packets, to allow reflected
   packet detection when Multicast Servers are placed in the data
   path).

   Two different, but significant, goals motivate the distribution of
   the MARS functionality across a number of physical entities. These
   might be summarized as:

   Fault tolerance
             If a client discovers the MARS it is using has failed, it
             can switch to another MARS and continue operation where it
             left off.

   Load sharing
             The component MARSs of a distributed, logically single
             MARS handle a subset of the control VCs from the clients
             in the Cluster.

   Each goal has some characteristics that it does not share with the
   other, so it would be wrong to believe that any solution to one is a
   solution to the other. However, a general solution to the Load
   sharing model may well provide fault tolerance as a by-product.

   Some additional terminology is introduced to describe the
   distributed MARS options. These terms reflect the differing
   relationships the MARSs have with each other and the Cluster members
   (clients).

   Fault tolerant model:

   Active MARS
             The single MARS serving the clients, which allocates CMIs
             and tracks group membership changes by itself. It is the
             sole entity that constructs replies to MARS_REQUESTs.

   Backup MARS
             An additional MARS that tracks the information being
             generated by the Active MARS. Cluster members may
             re-register with a Backup MARS if the Active MARS fails,
             and they'll assume the Backup has sufficiently up to date
             knowledge of the Cluster's state to take the role of
             Active MARS.

   Load sharing model:

   Active Sub-MARS
             At its most basic, load sharing involves breaking the
             Active MARS into a number of simultaneously active MARS
             entities that each manage a subset of the Cluster.
             Sub-MARS entities must co-ordinate their activities so
             that they appear to be interchangeable to cluster members
             - each one capable of allocating CMIs and tracking group
             membership information within the Cluster. Together they
             act as a distributed, logically single Active MARS.
             MARS_REQUESTs sent to a single Active Sub-MARS return
             information covering the entire Cluster.

   Backup Sub-MARS
             A MARS entity that tracks the activities of an Active
             Sub-MARS, and is able to become a member of the active
             Sub-MARS group when failure occurs.

   The next two sections discuss the Fault tolerance and Load sharing
   models in further detail. (Editorial note: it is not yet clear how
   to map these to the Server Group concept in SCSP, which appears to
   consist solely of what I would term 'active sub-servers'.
   Terminology will be cleaned up as this becomes clearer.)
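   Purely to make concrete the data that a single MARS tracks, the
   following Python-style sketch shows the state described above. The
   class and method names are illustrative only and are not drawn from
   [3].

      # Illustrative state held by a single, non-distributed MARS.

      class SingleMars:
          def __init__(self):
              self.cmi_of = {}       # cluster member ATM address -> CMI
              self.members_of = {}   # IP group -> set of ATM addresses
              self.next_cmi = 1

          def register(self, atm_addr):
              # Allocate a cluster-wide unique CMI to a new member.
              if atm_addr not in self.cmi_of:
                  self.cmi_of[atm_addr] = self.next_cmi
                  self.next_cmi += 1
              return self.cmi_of[atm_addr]

          def join(self, atm_addr, group):
              # Track a group membership change signalled by MARS_JOIN.
              self.members_of.setdefault(group, set()).add(atm_addr)

          def request(self, group):
              # Answer a MARS_REQUEST: the ATM leaf nodes for this group.
              return sorted(self.members_of.get(group, set()))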
3. Architectures for fault tolerance.

   This is the simpler of the two models. The Active MARS is a single
   entity, and only requires a one-way flow of information to the one
   or more Backup MARSs to keep their databases up to date. The
   relationship between cluster members, an Active MARS and 3 backup
   MARSs might be represented as:

         C1            C2            C3
          |             |             |
          ------------- M1 ------------
                        |
                    M2--M3--M4

   In this case the Cluster members (C1, C2, and C3) use M1, the Active
   MARS. M2, M3, and M4 are the Backup MARSs. The communication between
   M1, M2, M3, and M4 is completely independent of the communication
   between M1 and C1, C2, and C3. The Backup MARSs are essentially
   slaved off M1. (The lines represent associations, rather than actual
   VCs. M1 has pt-pt VCs between itself and the cluster members, in
   addition to ClusterControlVC spanning out to the cluster members.)

   As noted in section 1.1, M1 would be regularly transmitting a
   MARS_REDIRECT_MAP on ClusterControlVC specifying {M1, M2, M3, M4} as
   the set of MARSs for the cluster. If M1 were to fail, and M2 was
   fully operational, the cluster would rebuild itself to look like
   this:

         C1            C2            C3
          |             |             |
          ------------- M2 ------------
                        |
                      M3--M4

   As noted in section 1.1, each cluster member re-issues its
   outstanding MARS_JOINs to M2.

   If M2 had also failed, clients would then have tried to re-register
   with M3, then M4, then cycled back to M1. This sequence would repeat
   until one of the MARSs listed in the last heard MARS_REDIRECT_MAP
   allowed the clients to re-register. A further complication is that
   transient failures of 2 or more of the backup MARSs may lead,
   through race conditions in client re-registration, to C1
   re-registering with a different MARS than C2 and C3. It is clear
   that the backup MARSs must elect their own notion of the Active
   MARS, and redirect clients to this Active MARS if clients attempt to
   re-register with a MARS that considers itself not to be the Active
   MARS for the Cluster. (This needs to be clarified further.)

3.1 MARS_REDIRECT_MAPs and post-recovery reconfiguration.

   Cluster members assume that the members of the MARS_REDIRECT_MAP are
   capable of taking on the role of Active MARS. Any inter-MARS
   protocol for dynamically adding and removing Backup MARSs must
   ensure this is true. In the preceding example, once M2 takes over as
   the Active MARS it should start sending MARS_REDIRECT_MAPs that
   carry the reduced list {M2, M3, M4} until such time as M1 has
   recovered.

   A couple of options exist once M1 recovers, and these must be
   addressed by a distributed MARS protocol. A simple approach
   relegates M1 to be a Backup MARS. Thus M2 might begin issuing
   MARS_REDIRECT_MAPs with the list {M2, M3, M4, M1} once M1 is known
   to be available again. The picture might eventually look like:

         C1            C2            C3
          |             |             |
          ------------- M2 ------------
                        |
                    M3--M4--M1

   (If M1 has some characteristics that make it more desirable than M3
   or M4, then M2 might instead start sending {M2, M1, M3, M4}.)

   However, it is possible that M1 has characteristics that make it
   preferable to any of the Backup MARSs whenever it is available.
   (This might include throughput, attachment point in the ATM network,
   fundamental reliability of the underlying hardware, etc.) Ideally,
   once M1 has recovered from whatever problem caused the move to M2,
   M2 will force the cluster members to shift back to M1.
   This functionality is also already included in the cluster member
   behaviour defined by the MARS draft (Section 5.4.3 of [3]). Once M1
   was known to be available and synchronised with M2, M2 would stop
   sending MARS_REDIRECT_MAPs with {M2, M3, M4}. It would then start
   sending MARS_REDIRECT_MAPs listing {M1, M2, M3, M4}, with bit 7 of
   the mar$redirf flag reset. Cluster members would compare the
   identity of their Active MARS (M2) with the first one listed in the
   MARS_REDIRECT_MAP (M1) and initiate a redirect.

   Bit 7 of mar$redirf being reset indicates a soft redirect. Cluster
   members re-register with M1, but do not re-join the multicast groups
   they are members of - by indicating a soft redirect, M2 is claiming
   that M1 has a current copy of M2's database. This reduces the amount
   of MARS signalling traffic associated with redirecting the cluster
   back to M1. (If synchronization of M1 with M2's database is not
   available, a hard redirect back to M1 can be performed - with a
   consequent burst of MARS control traffic as the clients leave M2 and
   re-join all their groups with M1.)

3.2 Impact of cluster member re-registration.

   As noted earlier, the MARS draft requires that cluster members
   re-registering after an Active MARS failure MUST re-issue MARS_JOINs
   for all groups of which they consider themselves members. This has
   an interesting implication - it may not be necessary for an
   inter-MARS protocol to ensure that Backup MARSs have up to date
   group membership maps. Take the preceding example. During the
   transition from M1 to M2, cluster members C1, C2, and C3 will
   re-issue to M2 a sequence of MARS_JOINs. This would result in M2
   building a group membership database that reflected M1's just before
   the failure, even if M2's database was initially empty.

   One piece of information that is not supplied by cluster members
   during re-registration/re-joining is their CMI - this must be
   supplied by the new Active MARS. It is highly desirable that when a
   cluster member re-registers with M2 it be assigned the same CMI that
   it obtained from M1. To ensure this, the Active MARS MUST ensure
   that the Backup MARSs are aware of the ATM addresses and CMIs of
   every cluster member.

   (If the CMIs are not re-assigned to the same cluster members, data
   packets flowing out of a given cluster member will suddenly have a
   different CMI embedded in them. During the transition from M1 to M2,
   some cluster members may transition earlier than others. If they are
   assigned the same CMI as a pre-transition cluster member to whom
   they are currently sending IP packets, the recipient will discard
   these packets as though they were reflections from an MCS. Once all
   cluster members have transitioned to M2 this problem will go away,
   but it represents a short period where some data packets might fall
   into a black hole.)
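   To illustrate (and only to illustrate) the CMI requirement above,
   the following sketch shows an Active MARS's {ATM address, CMI} table
   being copied to a Backup, and the Backup handing back the same CMI
   when a known member re-registers after takeover. The class and
   method names are invented for this sketch and do not come from [3]
   or [1].

      # Sketch of CMI preservation across an Active -> Backup transition.
      # "Synchronisation" here is reduced to copying two data structures.

      class BackupMars:
          def __init__(self):
              self.cmi_of = {}         # ATM address -> CMI, learnt from Active
              self.mcs_groups = set()  # groups known to be MCS supported

          def sync_from_active(self, cmi_table, mcs_groups):
              # The two sub-caches the Active MARS must propagate
              # (see section 3.4).
              self.cmi_of = dict(cmi_table)
              self.mcs_groups = set(mcs_groups)

          def reregister(self, atm_addr):
              # On takeover, give a re-registering member the CMI it
              # already held, so its outgoing packets keep the same CMI.
              if atm_addr in self.cmi_of:
                  return self.cmi_of[atm_addr]
              # Unknown member: allocate a CMI not currently in use.
              new_cmi = max(self.cmi_of.values(), default=0) + 1
              self.cmi_of[atm_addr] = new_cmi
              return new_cmi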
3.3 Multicast Servers.

   For the purposes of this document we look at Multicast Servers
   (MCSs) as clients of the Active MARS. They adhere to the same rules
   as cluster members do - listen to MARS_REDIRECT_MAP, and redirect to
   a Backup MARS when the Active MARS fails. In the same way that
   cluster members re-join their groups after re-registration, MCSs
   also re-register for groups that they are configured to serve.
   Unlike Cluster members there is no equivalent of the CMI for MCSs.
   However, it is important for the Active MARS to keep Backup MARSs
   informed of what groups are MCS supported.

   The reason for this can be understood by considering what would
   happen if the Backup MARS had no knowledge of what groups had
   members, and which of those groups were MCS supported, when the
   Active MARS failed. Consider the following sequence:

   -  The Active MARS fails.

   -  Cluster members and MCSs gradually detect the failure, and begin
      re-registering with their first available Backup MARS.

   -  Cluster members re-MARS_JOIN all groups they were members of. As
      the Backup (now Active) MARS receives these MARS_JOINs it
      propagates them on its new ClusterControlVC.

   -  Simultaneously, each MCS re-MARS_MSERVs all groups it was
      configured to support. If a MARS_MSERV arrives for a group that
      already has cluster members, the Backup (now Active) MARS
      transmits an appropriate MARS_MIGRATE on its new
      ClusterControlVC.

   Assume that group X was MCS supported prior to the Active MARS's
   failure. Each cluster member had a pt-mpt VC out to the MCS (a
   single leaf node). MARS failure occurs, and each cluster member
   re-registers with the Backup MARS. The pt-mpt VC for group X is
   unchanged. Now cluster members begin re-issuing MARS_JOINs to the
   Backup (now Active) MARS. If the MCS for group X has not yet
   re-MARS_MSERVed for group X, the Backup MARS thinks the group is VC
   Mesh based, so it propagates the MARS_JOINs on ClusterControlVC.
   Other cluster members then update their pt-mpt VC for group X to add
   the (apparently) new leaf nodes. This results in cluster members
   forwarding their data packets to the MCS and some subset of the
   cluster members directly. This is not good.

   When the MCS finally re-registers, and re-MARS_MSERVs group X, the
   MARS will issue a MARS_MIGRATE, which will fix every cluster
   member's pt-mpt VC for group X. But the transient period is
   potentially dangerous. If the Backup MARSs are aware of what groups
   are MCS supported, they can appropriately suppress the cluster
   members' MARS_JOINs for a period of time while waiting for the MCS
   to explicitly re-register and re-MARS_MSERV. This would avoid the
   transient period where cluster members are reacting to MARS_JOINs
   erroneously sent across the new ClusterControlVC.

3.4 Inter-MARS protocol requirements.

   For the purely fault-tolerant model, the requirements are:

   -  For the architecture discussed in this section, the key pieces of
      information (or sub-caches, described in Appendix B.4 of [1])
      that must be propagated by the Active MARS to Backup MARSs are
      the CMI to Cluster Member mapping table and the list of groups
      currently MCS supported.

   -  It is valuable to enable a previous Active MARS to be returned to
      the group of MARSs listed in a MARS_REDIRECT_MAP after it
      recovers from its failure.

   -  If a failed Active MARS restarts, and is preferable to any of the
      Backup MARSs for long term cluster operation, then it is
      desirable that some mechanism exists for synchronising the entire
      database of the current Active MARS with the restarted MARS. This
      allows a transition back to the restarted MARS using a soft
      redirect.

   -  No special additions are required to handle client requests (e.g.
      MARS_REQUEST or MARS_GROUPLIST_QUERY), since there is only a
      single Active MARS.
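   As a purely illustrative sketch of the suppression behaviour
   suggested in section 3.3 (holding back re-issued MARS_JOINs for
   groups that the backup knows to be MCS supported, until the MCS has
   had a chance to re-issue MARS_MSERV), something like the following
   could be imagined. The hold-down interval and the function names are
   assumptions for the sketch; neither [1] nor [3] specifies them.

      import time

      # Rough sketch of the MARS_JOIN hold-down suggested in section 3.3.
      # HOLDDOWN is an assumed value; [3] does not specify one.
      HOLDDOWN = 10.0   # seconds to wait for an MCS to re-issue MARS_MSERV

      def propagate_join(group, mcs_groups, mserv_seen, takeover_time,
                         send_on_ccvc):
          """Decide whether a re-issued MARS_JOIN may go out on the new
          ClusterControlVC during the takeover transient."""
          if group in mcs_groups and group not in mserv_seen:
              if time.time() - takeover_time < HOLDDOWN:
                  return False      # still waiting for the MCS; suppress
              # Hold-down expired with no MARS_MSERV seen: treat the
              # group as VC mesh based from here on.
          send_on_ccvc(group)
          return True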
4. Architectures for load sharing.

   Creating a physically distributed, but logically single MARS is a
   non-trivial task. A number of issues arise:

   -  ClusterControlVC is partitioned into a number of sub-CCVCs, one
      hanging off each Active Sub-MARS. Their leaf nodes are those
      cluster members that make up the cluster partition served by an
      Active Sub-MARS.

   -  MARS_JOIN/LEAVE traffic to one Active Sub-MARS must propagate out
      on each and every sub-CCVC to ensure Cluster wide distribution.
      This propagation must occur immediately.

   -  Allocation of CMIs across the cluster must be co-ordinated
      amongst the Active Sub-MARSs to ensure no CMI conflicts within
      the cluster.

   -  Each sub-CCVC must carry MARS_REDIRECT_MAP messages with an
      appropriate MARS list that perpetuates the illusion to cluster
      members that there is only a single MARS.

   -  Each Active Sub-MARS must be capable of answering a MARS_REQUEST
      or MARS_GROUPLIST_QUERY with information covering the entire
      Cluster.

   Load sharing configurations take on a range of forms. At the
   simplest end multiple MARS entities are simultaneously operational,
   and subdivide the Cluster. No fault tolerance is provided - if a
   MARS fails, its clients are 'off air' until the MARS restarts. A
   more complex model would allow each partition of the cluster to be
   supported by a MARS with its own dedicated set of Backup MARSs.
   Finally, the most complex model requires a set of MARS entities from
   which a subset may at any one time be actively supporting the
   cluster, while the remaining entities wait as Backups. The
   partitioning of the cluster is ideally dynamic and variable. The
   following subsections touch on these different models.

4.1 Simple load sharing.

   In a simple load sharing model each Active Sub-MARS has no backups,
   and clients only know of one Sub-MARS. Consider a cluster with 4
   MARS Clients, and 2 Active Sub-MARSs. The following picture shows
   one possible configuration, where the cluster members are split
   evenly between the sub-MARSs:

         C1    C2           C3    C4
          |     |            |     |
         ----- M1 ------    ----- M2 -----
                |                  |
                --------------------

   C1, C2, C3, and C4 all consider themselves to be members of the same
   Cluster. M1 manages a sub-CCVC with {C1, C2} as leaf nodes, while M2
   manages a sub-CCVC with {C3, C4} as leaf nodes. M1 and M2 must have
   some means to exchange cluster co-ordination information.

   When C1 issues MARS_JOIN/LEAVE messages they must be sent to
   {C1, C2} and also {C3, C4} via M2. When C3 issues MARS_JOIN/LEAVE
   messages they must be sent to {C3, C4} and also {C1, C2} via M1. One
   side-effect is that M1 and M2 are forced to be aware of group
   membership changes from all parts of the cluster (through the
   exchange of MARS messages needing cluster wide propagation).

   M2 must be able to answer a MARS_REQUEST from C3 or C4 that covers
   its own database and that of M1. Conversely, M1 must be able to draw
   upon M2's knowledge and its own when answering a MARS_REQUEST from
   C1 or C2. Two solutions exist - either M1 and M2 attempt to ensure
   they both share complete knowledge of the cluster's membership
   lists, or they query each other 'on demand' when building the
   answers to a client's MARS_REQUEST. Given that each Active Sub-MARS
   will see the MARS_JOIN/LEAVE messages generated by clients of other
   Active Sub-MARSs, it seems more effective for each Active Sub-MARS
   to keep its own view of the Cluster using this message flow, and so
   build replies to MARS_REQUESTs from local knowledge.

   When new cluster members register with either M1 or M2 there must be
   some mechanism to ensure CMI allocation is unique within the scope
   of the entire cluster. There must be some element to the inter-MARS
   protocol that allows them to detect the possible loss of messages
   from the other MARS(s).

   If no backups exist, and no mechanism for dynamically re-arranging
   the partitioning of the cluster, the MARS_REDIRECT_MAP message from
   M1 lists {M1}, and from M2 lists {M2}.
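   Neither [1] nor [3] says how cluster wide CMI uniqueness across
   Active Sub-MARSs would actually be achieved. One obvious, purely
   hypothetical approach is to carve the CMI space into disjoint
   blocks, one per Sub-MARS, so that no per-registration handshake is
   required; the class name and block size below are arbitrary
   assumptions made for the sketch.

      # Hypothetical CMI allocation for load-sharing Sub-MARSs: each
      # Sub-MARS owns a disjoint block of the CMI space. This scheme is
      # NOT part of [1] or [3]; it only illustrates the requirement.

      class SubMarsCmiAllocator:
          BLOCK_SIZE = 4096                 # assumed block size

          def __init__(self, sub_mars_index):
              self.low = sub_mars_index * self.BLOCK_SIZE + 1
              self.high = self.low + self.BLOCK_SIZE - 1
              self.next_cmi = self.low
              self.cmi_of = {}              # ATM address -> CMI

          def allocate(self, atm_addr):
              if atm_addr in self.cmi_of:
                  return self.cmi_of[atm_addr]
              if self.next_cmi > self.high:
                  raise RuntimeError("CMI block exhausted; "
                                     "inter-MARS coordination needed")
              self.cmi_of[atm_addr] = self.next_cmi
              self.next_cmi += 1
              return self.cmi_of[atm_addr]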
4.2 Simple load sharing with backups.

   A slightly more complex model would evolve if each Active Sub-MARS
   had its own list of one or more Backup Sub-MARSs. The picture might
   become:

         C1    C2           C3    C4
          |     |            |     |
         ----- M1 ------    ----- M2 -----
        /       |                  |      \
       M3       --------------------       M4

   In this case M3 is a Backup for M1, and M4 is a Backup for M2.
   Initially we'll assume that there is no requirement for M3 and M4 to
   be shareable between M1 and M2. The MARS_REDIRECT_MAP from M1 would
   list only {M1, M3}, and from M2 would list only {M2, M4}.

   This situation implies the fault-tolerant model (section 3) between
   each Active Sub-MARS and its local group of Backup Sub-MARSs.
   However, it also implies that when a Backup Sub-MARS is promoted to
   Active Sub-MARS it must have some means to know who the other Active
   Sub-MARSs are. Thus the protocol managing load sharing among the
   Active Sub-MARSs needs augmentation to support Backup Sub-MARSs.

   For example, if M1 failed, the picture might become:

         C1    C2           C3    C4
          |     |            |     |
         ----- M3 ------    ----- M2 -----
                |                  |      \
                --------------------       M4

   The MARS_REDIRECT_MAP from M3 would list only {M3}, and from M2
   would continue to list {M2, M4}. (Assuming M1 never recovers. If M1
   recovers, a number of options exist for M1 and M3 to decide who will
   continue supporting their part of the cluster.)

4.3 Load sharing with dynamic reconfiguration.

   The preceding examples are significantly limited. Ideally the set of
   individual sub-MARSs should be capable of managing a variable sized
   partition, all the way up to the full cluster. The size of each
   MARS's partition should be dynamically changeable. If such
   flexibility exists, each Active Sub-MARS can effectively become each
   other's Backup Sub-MARS. Shifting clients from a failed Active
   Sub-MARS to another Active Sub-MARS is load reconfiguration from the
   perspective of the Sub-MARSs, but is fault tolerant MARS service
   from the perspective of the clients.

   For example, assume this initial configuration:

         C1    C2           C3    C4
          |     |            |     |
         ----- M1 ------    ----- M2 -----
                |                  |
                --------------------

   M1 lists {M1, M2} in its MARS_REDIRECT_MAPs, and M2 lists {M2, M1}.
   The cluster members neither know nor care that the Backup MARS
   listed by their Active MARS is actually an Active MARS for another
   subset of the Cluster.

   If M1 failed, its partition of the cluster should collapse. C1 and
   C2 should re-register with M2, and the picture becomes:

         C1    C2           C3    C4
          |     |            |     |
         --------------------------- M2 -----

   All cluster members start receiving MARS_REDIRECT_MAPs from M2,
   listing {M2} as the sole MARS. Currently missing from this model is
   a mechanism for re-partitioning the cluster once M1 has recovered.
   M2 needs to get C1 and C2 to perform a soft-redirect (or hard, if
   appropriate) to M1, without losing C3 and C4.

   One way of avoiding this scenario is to provision enough Active
   Sub-MARSs for the desired load sharing, and then provide a pool of
   shared Backup Sub-MARSs such that the number of Active Sub-MARSs
   never changes and the cluster partitions never alter.
   The picture from section 4.2 might be redrawn:

         C1    C2           C3    C4
          |     |            |     |
         ----- M1 ------    ----- M2 -----
                |                  |
                --------------------
                |                  |
                M3                 M4

   In this case M1 lists {M1, M3, M4} in its MARS_REDIRECT_MAPs, and M2
   lists {M2, M3, M4}. If M1 fails, the cluster configures to:

         C1    C2           C3    C4
          |     |            |     |
         ----- M3 ------    ----- M2 -----
                |                  |
                --------------------
                                   |
                                   M4

   Now, if M3 stays up while M1 is recovering from its failure, there
   will be a period within which M3 lists {M3, M4} in its
   MARS_REDIRECT_MAPs, and M2 lists {M2, M4}. This implies that the
   failure of M1, and the promotion of M3 into the Active Sub-MARS set,
   causes M2 to re-evaluate the list of available Backup Sub-MARSs too.

   Then, when M1 is detected to be available again, M1 might be placed
   on the list of Backup Sub-MARSs. The cluster would be configured as:

         C1    C2           C3    C4
          |     |            |     |
         ----- M3 ------    ----- M2 -----
                |                  |
                --------------------
                |                  |
                M1                 M4

   M3 lists {M3, M1, M4} in its MARS_REDIRECT_MAPs, and M2 lists
   {M2, M4, M1}.

   Alternatively, as discussed in section 3, the failed MARS M1 may
   have some characteristics that make it preferred any time it is
   alive. So, M3 should only manage {C1, C2} until such time as M1 is
   detected alive again. M3 and M1 should then swap places, and inform
   the other Active Sub-MARSs.

   The difference between this scheme, and that described in section
   4.2, is that M3 and M4 are actually available to support either M1
   or M2's partitions. For example, if M1 and M2 failed simultaneously
   the cluster should rebuild itself to look like:

         C1    C2           C3    C4
          |     |            |     |
         ----- M3 ------    ----- M4 -----
                |                  |
                --------------------

   M1 and M2 must be careful to list a different sequence of Backup
   Sub-MARSs in their MARS_REDIRECT_MAPs. For example, if M1 listed
   {M1, M3, M4} and M2 listed {M2, M3, M4} the cluster would look like
   this after a simultaneous failure of M1 and M2:

         C1    C2           C3    C4
          |     |            |     |
         --------------------------- M3 -----
                                     |
                                     M4

   This is a bad situation, since (as noted earlier) we have no obvious
   mechanism to re-partition the cluster between the two available
   Sub-MARSs.

   (Another solution that is not entirely foolproof would be for the
   Active MARS to issue specifically targeted MARS_REDIRECT_MAP
   messages on the pt-pt VCs that each client has open to it. If C1 and
   C2 still had their pt-pt VCs open, e.g. after re-registration, M3
   could send them private MARS_REDIRECT_MAPs listing {M4, M3} as the
   list, forcing only C1 and C2 to re-direct. This approach requires
   further thought.)

4.4 Multicast Server interactions?

   One of the more complex aspects of a single MARS is its filtering of
   MARS_JOIN/LEAVE messages on ClusterControlVC in the presence of MCS
   supported groups (Section 6 of [3]). For an Active Sub-MARS to
   correctly filter the MARS_JOIN/LEAVE messages it may want to
   transmit on its local Sub-CCVC, it MUST know what groups are,
   cluster wide, being supported by an MCS. Since the MCS in question
   may have registered with another Active Sub-MARS, this implies that
   the Active Sub-MARSs must exchange timely information on MCS
   registrations and supported groups.
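   To make the dependency in section 4.4 concrete, the following sketch
   (illustrative only; all names are invented here) has each Active
   Sub-MARS keep a cluster wide set of MCS supported groups, updated
   both by its own MCS registrations and by those relayed from peer
   Sub-MARSs, and consult that set before propagating a MARS_JOIN on
   its local Sub-CCVC.

      # Illustrative Sub-MARS filtering of MARS_JOIN traffic (section 4.4).
      # Peer relaying is reduced to calling a method on the other Sub-MARSs.

      class ActiveSubMars:
          def __init__(self, peers=None):
              self.peers = list(peers or [])  # other Active Sub-MARSs
              self.mcs_groups = set()         # cluster wide MCS supported groups

          def local_mserv(self, group):
              # An MCS registered with *this* Sub-MARS for 'group'.
              self.mcs_groups.add(group)
              for peer in self.peers:
                  peer.remote_mserv(group)    # timely propagation to peers

          def remote_mserv(self, group):
              # Learnt from a peer Sub-MARS.
              self.mcs_groups.add(group)

          def handle_join(self, group, send_on_sub_ccvc):
              # Per Section 6 of [3], joins for MCS supported groups are
              # not propagated to cluster members on the (Sub-)CCVC.
              if group not in self.mcs_groups:
                  send_on_sub_ccvc(group)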
4.5 Key issues?

   Since MARS_JOIN/LEAVE traffic must propagate through every Active
   Sub-MARS, the 'load' being shared across the set of Active Sub-MARSs
   is VCC load rather than message processing load.

   Re-partitioning that involves increasing the number of Active
   Sub-MARSs has no obvious solution at this point.

   Since MARS_JOIN/LEAVE traffic must propagate through every Active
   Sub-MARS, a separate server cache synchronisation protocol covering
   group membership changes is probably not needed between Active
   Sub-MARSs.

   As for the purely fault tolerant models in section 3, CMI
   information needs to be propagated amongst Active and Backup
   Sub-MARSs.

   To ensure each Active Sub-MARS can filter the JOIN/LEAVE traffic it
   propagates on its Sub-CCVC, information on what groups are MCS
   supported MUST be distributed amongst them.

   Active Sub-MARSs should be aware at all times what the cluster wide
   group membership is for any given group, so they can answer
   MARS_REQUESTs from locally held information.

XX. Tradeoffs and simplifications.

   TBD. [i.e. why do one or the other; summarize the difficulties in
   doing both. Is there value in doing only one? Is fault tolerance
   more important than load sharing?]

XX. So how does SCSP help?

   TBD.

XX. The relationship between MARS and NHS entities.

   TBD. [e.g. they're not required to be co-resident; don't restrict
   your architecture to assume they will be, even if NHSs exist in your
   LIS for unicast. MARS has _no_ IP level visibility (except perhaps
   for SNMP access - not clear on how this would work).]

XX. Open Issues.

Security Considerations

   Security considerations are not addressed in this document.

Acknowledgments

   Jim Rubas and Anthony Gallo of IBM have helped clarify some points
   in this initial release, and will be co-authors on future releases.

Author's Address

   Grenville Armitage
   Bellcore, 445 South Street
   Morristown, NJ, 07960
   USA

   Email: gja@thumper.bellcore.com
   Ph. +1 201 829 2635

References

   [1] J. Luciani, G. Armitage, J. Halpern, "Server Cache
       Synchronization Protocol (SCSP) - NBMA", INTERNET DRAFT,
       draft-luciani-rolc-scsp-02.txt, April 1996.

   [2] J. Luciani, et al., "NBMA Next Hop Resolution Protocol (NHRP)",
       INTERNET DRAFT, draft-ietf-rolc-nhrp-08.txt, June 1996.

   [3] G. Armitage, "Support for Multicast over UNI 3.0/3.1 based ATM
       Networks.", Bellcore, INTERNET DRAFT,
       draft-ietf-ipatm-ipmc-12.txt, February 1996.