Shivkumar Kannan
Internet Draft
Novell, Inc.
Document: draft-shivkumar-ldapext-dirtxn-00.txt
July 5, 2000
Expires: January 5, 2001

An approach to enable directory transactions

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026 [1].

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This internet draft will expire on January 5, 2001.

1. Abstract

The directory is becoming an important part of the organization. As the
data in the directory becomes more critical, it needs to be consistent.
With more applications storing information in the directory, there is a
need for transaction (atomic commit/rollback) services to be offered by
the directory to ensure consistency. This document describes a mechanism
to implement a transaction service in a distributed replicated
directory. It also describes the LDAP APIs that can be used to provide a
standard interface for the transaction service.

2. Conventions used in this document

Shivkumar              Expires January 5, 2001                        1
An approach to enable directory transactions                 July, 2000

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [2].

3. Overview

The directory now plays an important role for enterprise applications as
well as Internet applications. More and more application developers will
use the directory as the repository not only for user and network
information but also for objects specific to the application. There is a
need for the directory to provide atomic update functionality for a
group of objects across multiple update requests. This is required to
preserve the logical integrity of the directory and also to free the
applications from having to manage this complexity themselves.

The developer should be able to specify the beginning of the
transaction, the sequence of "modification" operations on the objects,
and finally the commit for this transaction. The directory should be
able to provide an atomic commit. That is, either all of the updates in
the transaction are applied or none of them is.

The following section examines the need for transactions in a directory.

4. Need for transactions in a directory

The directory was traditionally used as a mechanism to look up
information. For example, the telephone directory is used to find the
addresses and telephone numbers of people and organizations. In this
case, the directory is updated and maintained by the administrator from
a central location. In the case of the network directory, the network
administrator maintained the information in the directory, which
consisted of the users in the network, the servers, and the printers in
the organization. Again, this information was controlled centrally, and
updates to it were infrequent.

However, more and more of the objects represented in the directory are
related to other objects. For example, the Group object is closely
related to the User object. This increases the need to maintain
consistency in the directory.
To illustrate this, let us examine the "User Group" relation, which
states that "a user is a member of a group" and "a group consists of
users", and the way that it is represented in the directory. When we say
that a user A is a member of group G, there is an implicit relation that
is set up between these objects. User A's "Group membership" attribute
is set to G and group G's "Group members" attribute is set to A. In
addition to this, user A's "Security Equals" attribute is set to G and
group G's "Equivalent to Me" attribute is set to A. The relation between
the User and the Group is said to be complete only when all these
attributes are updated. If any of these update operations is not
successful, then the data in the directory is inconsistent.

5. SDK for transaction services through LDAP

The problem of atomic updates across multiple update requests can be
overcome by providing database-like transaction semantics to the
directory. The directory should make provision for beginning a
transaction and for committing or aborting it. A transaction is specific
to a session with the directory. Not every update needs a transaction
defined for it. A transaction is commenced by the begin transaction
operation, and it then becomes the active transaction of the session.

The different APIs, specified in C, needed to accomplish the atomic
update across update requests are:

i. int LDAP_Begin_Transaction(LDAP *ld,
                              ENUM_Transactions TransactionType,
                              int MaxWaitTime)

Begins a transaction in this session and returns the success or error
code.

ld - Connection handle to the LDAP structure, which holds the
connection information.

TransactionType - An enumeration of the types of transaction.
These are
   NO_TRANSACTION    - 0 (no transaction)
   READ_TRANSACTION  - 1 (read-only operations)
   WRITE_TRANSACTION - 2 (read-write operations)

MaxWaitTime - Maximum time to wait, in milliseconds, to obtain a lock;
0 could indicate an indefinite wait.

Alternately, this API could have an additional parameter by which the
client can specify the objects involved in the transaction right at the
beginning.

ii. int LDAP_Commit(LDAP *ld)

This commits the active transaction in this session and returns the
success or error code.

iii. int LDAP_Abort(LDAP *ld)

This aborts the active transaction in this session and returns the
success or error code.

iv. BOOL LDAP_IsTransactional(LDAP *ld)

This tells the user whether an active transaction exists in this
session. It returns TRUE if one exists and FALSE otherwise.

v. ENUM_Transactions LDAP_GetTransactionType(LDAP *ld)

If there is an active transaction in this session, this returns the type
of that transaction.

Providing transaction services through the existing standard LDAP client
APIs

The solution described above requires the addition of a few APIs to the
client, and these may not be available across all implementations unless
they become a standard. An alternative way of supporting the transaction
services for universal access is by means of the extended operations
defined by the LDAPv3 protocol. Here the client checks whether the LDAP
server offers the transaction service by checking the rootDSE of the
server. The LDAP extended operation

   int ldap_extended_operation_s(
       LDAP           *ld,
       const char     *requestoid,
       struct berval  *requestdata,
       LDAPControl   **serverctrls,   /* NULL */
       LDAPControl   **clientctrls,   /* NULL */
       char          **retoidp,
       struct berval **retdatap );

could be used. We do not need any controls for this operation.
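
As an illustration only, one of these primitives could be carried as the
request value of an extended operation along the following lines. This
is a minimal sketch, not a proposed wire format: the OIDs are
placeholders (the draft assigns none), the function names
txn_encode_begin and txn_decode_begin are hypothetical, and the fixed
5-byte layout is an assumption made for brevity. A real definition would
more likely use a BER-encoded value, as other LDAP messages do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder request OIDs: the draft assigns none, so these values
 * are purely illustrative. */
#define TXN_OID_BEGIN  "1.3.6.1.4.1.32473.1.1"
#define TXN_OID_COMMIT "1.3.6.1.4.1.32473.1.2"
#define TXN_OID_ABORT  "1.3.6.1.4.1.32473.1.3"

/* Transaction types, with the values listed in section 5. */
typedef enum {
    NO_TRANSACTION    = 0,
    READ_TRANSACTION  = 1,
    WRITE_TRANSACTION = 2
} ENUM_Transactions;

/* Assumed layout: 1 byte of type + 4 bytes of wait time, big-endian. */
#define TXN_BEGIN_LEN 5

/* Encode a begin-transaction request value into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t txn_encode_begin(ENUM_Transactions type, uint32_t max_wait_ms,
                        unsigned char *buf, size_t buflen)
{
    assert(buf != NULL);
    if (buflen < TXN_BEGIN_LEN)
        return 0;
    buf[0] = (unsigned char)type;
    buf[1] = (unsigned char)(max_wait_ms >> 24);
    buf[2] = (unsigned char)(max_wait_ms >> 16);
    buf[3] = (unsigned char)(max_wait_ms >> 8);
    buf[4] = (unsigned char)max_wait_ms;
    return TXN_BEGIN_LEN;
}

/* Decode a begin-transaction request value.
 * Returns 0 on success, -1 if the input is malformed. */
int txn_decode_begin(const unsigned char *buf, size_t len,
                     ENUM_Transactions *type, uint32_t *max_wait_ms)
{
    assert(buf != NULL && type != NULL && max_wait_ms != NULL);
    if (len != TXN_BEGIN_LEN || buf[0] > WRITE_TRANSACTION)
        return -1;
    *type = (ENUM_Transactions)buf[0];
    *max_wait_ms = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
                   ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];
    return 0;
}
```

A client would place the encoded bytes in the requestdata berval and
pass the matching OID as requestoid. Commit and abort requests would
need no request value under this assumed scheme.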
We just have to define the request and response message formats for
primitives such as the ones defined above (begin_transaction, Commit,
Abort, IsTransactional and GetTransactionType). This way, we can support
transaction services for the directory with the existing client APIs.

6. Challenges in providing a transaction service for the directory

i. The directory is usually partitioned and replicated across locations.
The speed of the read/query operations should not be diminished by the
introduction of transaction services.

ii. If any updates are made to a set of objects in one replica, then any
read operation on these objects should not read old values from any
other replica.

iii. While a transaction is in progress, other operations should not be
able to access these objects - that is, isolation should be provided.

iv. The two-phase commit protocol is not an appropriate solution for
this problem because it is very message intensive and would degrade
performance.

v. The solution should address multi-master replica deployment of the
directory (NDS) as well as the master-slave scenario.

7. Assumptions

i. In OLTP transactions, there would be a high level of concurrency
among the write operations, as OLTP transactions are "write" intensive.
To achieve high throughput, interleaving the operations is imperative.
This leads to complex locking mechanisms to maintain consistency in that
scenario. In a directory, by contrast, it is assumed that less than 10%
of all requests are write requests and less than 3% will be transaction
requests (read and write). This will make the implementation of the
locking mechanism simpler.

ii. The underlying data store of a directory need not be a database.

8. Solution for a scenario where the objects that belong to the
transaction exist on a single partition

i. How locking would be done

To achieve tight consistency, locking is essential.
However, a locking mechanism transparent to the user is desirable. A
mechanism for achieving locking is given here.

The transaction service sets the lock on the objects under a transaction
by updating a flag, which is a hidden attribute on the object. This
attribute is meant only for the internal use of the transaction service
and the directory server. During the course of a normal directory
operation that is not under this transaction, if the directory server
finds that this attribute has been set on the object, then it returns an
"Object locked - transaction in progress" error.

ii. Primary replica

There will be a primary replica for any set of objects in a multi-master
replica scenario. The primary replica is where the transaction takes
place on these objects (within a particular partition). This primary
replica need not be the same as the master replica of this partition. A
transaction can originate on any write replica. If this is not the
primary replica, a redirect flag is set on this set of objects and the
transaction is directed to the primary replica, transparently to the
user.

iii. Transaction request on objects present in the same partition

When the transaction request arrives from the client

- Objects are locked in the primary replica and the redirect flag is set
on these objects on all the other replicas. This is a synchronous
operation.

- All operations from here on are performed on the primary replica. The
Commit / Rollback and the read/write operations are now handled by the
Transaction Service on the primary replica. The transaction service
maintains atomicity and isolation, and also includes a log manager and a
recovery manager. The log manager logs the operations that fall within a
transaction's Begin-End pair. This helps the recovery manager to undo
operations.
The Recovery Manager takes care of a transaction rollback operation, a
system crash, and shutdown/restart of the directory server. As all the
objects are now present in the primary replica, two-phase commit is not
required in this scenario.

- After the operation is complete, the lock attribute of these objects
is reset in the master replica.

- The transaction ends here. The redirect attribute on these objects in
the other replicas is reset when the replication is complete according
to the replication policies of the directory servers. During replication
the current values of these objects will get updated on the replicas.

The redirect operation could be implemented as a referral.

An object that is locked during the course of a transaction will not be
replicated until the transaction is complete and the lock is released.
Also, replication will not take place for objects while the referrals
are being set.

This solution requires all the replicas to be available to achieve tight
consistency. Depending on the consistency requirements (if eventual
consistency is sufficient across the replicas), the atomicity of the
transaction can still be achieved when replicas are disconnected or a
replica is down.

9. Solution for a scenario where the objects that belong to the
transaction exist in multiple partitions

In this case, the solution will be similar to that of a single
partition. The differences are listed here.

In each partition, the primary replicas of the objects under this
transaction are identified. The locking mechanism remains the same. A
two-phase commit between the primary replicas of these partitions is
required. This two-phase commit would be coordinated by the transaction
service at the primary replica of the partition where this transaction
originated.
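
The coordination described above could be sketched roughly as follows.
This is a simplified in-memory model and not the draft's specified
behaviour: the primary_replica type, the can_commit vote flag and the
two_phase_commit function are hypothetical names introduced for
illustration, and logging, timeouts and crash recovery are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* State of the transaction at one primary replica (hypothetical). */
typedef enum { TXN_IDLE, TXN_PREPARED, TXN_COMMITTED, TXN_ABORTED } txn_state;

/* The primary replica of one partition taking part in the transaction. */
typedef struct {
    const char *partition;   /* partition this primary replica serves */
    int         can_commit;  /* simulated vote: 1 = yes, 0 = no */
    txn_state   state;
} primary_replica;

/* Phase 1 for a single replica: record the prepare and return the vote. */
static int prepare(primary_replica *r)
{
    if (!r->can_commit)
        return 0;
    r->state = TXN_PREPARED;
    return 1;
}

/* Run two-phase commit across n primary replicas, as the coordinating
 * transaction service might.
 * Returns 1 if every replica committed, 0 if the transaction aborted. */
int two_phase_commit(primary_replica *replicas, size_t n)
{
    size_t i;
    int all_yes = 1;

    assert(replicas != NULL || n == 0);

    /* Phase 1 (voting): stop asking at the first "no" vote. */
    for (i = 0; i < n; i++) {
        if (!prepare(&replicas[i])) {
            all_yes = 0;
            break;
        }
    }

    /* Phase 2 (completion): commit everywhere or abort everywhere. */
    for (i = 0; i < n; i++)
        replicas[i].state = all_yes ? TXN_COMMITTED : TXN_ABORTED;

    return all_yes;
}
```

The point of the sketch is that no replica commits until every primary
replica has voted yes, so a single refusal aborts the transaction in
all partitions.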
The resetting of the lock attribute and the redirect attribute would
follow the same process as defined in the previous section.

10. Security Considerations

No special security is required for the transaction service of the
directory as it will be a part of the directory server.

11. References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9,
    RFC 2026, October 1996.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", BCP 14, RFC 2119, March 1997.

[3] Wahl, M., Kille, S., and Howes, T., "Lightweight Directory Access
    Protocol (v3)", RFC 2251, December 1997.

12. Acknowledgments

I thank Dr. Dinkar Sitaram and the members of the LDAP mailing list, and
I acknowledge the discussions held there and the Internet draft "LDAP
Extensions for Simple Transactions" submitted by David Boreham, Satoshi
Kikuchi and Michiyasu Odaki in April 1998. I also thank Vidula and
Haripriya from Novell for their contributions.

13. Author's Addresses

Shivkumar Kannan
Novell, Inc.
Novell product Group, Bangalore
7th Mile, Hosur Road, Garvebhavipalya,
Bangalore, India - 560068
Phone: 91-80-5721858
Email: kshivkumar@novell.com, kshivkumar@altavista.net

Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are included
on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by
removing the copyright notice or references to the Internet Society or
other Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be followed,
or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property Notice

The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to pertain
to the implementation or use of the technology described in this
document or the extent to which any license under such rights might or
might not be available; neither does it represent that it has made any
effort to identify any such rights. Information on the IETF's procedures
with respect to rights in standards-track and standards-related
documentation can be found in BCP-11. Copies of claims of rights made
available for publication and any assurances of licenses to be made
available, or the result of an attempt made to obtain a general license
or permission for the use of such proprietary rights by implementors or
users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary rights
which may cover technology that may be required to practice this
standard. Please address the information to the IETF Executive Director.