Internet-Draft                                Leslie L. Daigle
Category: Informational                       Thinking Cat Enterprises
Expires: December 25, 1999                    Thommy Eklof
                                              Ericsson
                                              June 25, 1999

        An Architecture for Integrated Directory Services
                draft-daigle-arch-ids-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 25, 1999.


Abstract

A single, unified, global whitepages directory service remains
elusive.  Nonetheless, there is increasing call for participation of
widely-dispersed directory servers (i.e., across multiple 
organizations) in large-scale directory services.  These services
range from national whitepages services, to multi-national
indexes of WWW resources, and beyond.  Drawing from experiences
with the TISDAG ([TISDAG]) project, this document outlines
an approach to providing the necessary infrastructure for 
integrating such widely-scattered servers into a single service,
rather than attempting to mandate a single protocol and schema set
for all participating servers to use. 

The proposed architecture inserts a coordinated set of modules
between the client access software and participating servers.  While
the client software interacts with the service at a single entry
point, the remaining modules are called upon (behind the scenes)
to provide the necessary application support.  This may come in
the form of modules that provide query proxying, schema translation,
lookups, referrals, security infrastructure, etc.

1.0 Introduction

This document does not propose a protocol to end all protocols.
Rather, it outlines an architecture for a directory service
that permits coordinated modules to support existing protocols
while providing service functionality not necessarily natively
supported in those protocols.  In the TISDAG ([TISDAG]) project,
this was called the "Directory Access Gateway" or "DAG".  Here,
we outline the underlying principles of that architecture and
make some proposals for extensions.  

Part of this architecture is an "internal protocol" -- called the
"DAG/IP" in the TISDAG project.  This document also outlines the
perceived requirements for this protocol in the extended DAG.

CAVEAT!:  this document lives up to the description "work in progress"!
It is not the presentation of an answer; its major goal is to get some 
ideas out so that community discussion can help shape things towards
an answer.

We start by having a look at the TISDAG project architecture, and
attempting to extract some architectural principles.

2.0  TISDAG -- a first implementation

The Swedish TISDAG project (described in detail in [TISDAG], with
some experiences reported in [DAGEXP]) was designed to fulfill the
requirements of a particular national directory service.   The
experience of developing a component-based system for providing
a directory service through a uniform interface (client access point)
provided valuable insight into the possibilities of extending the
system architecture so that services with different base requirements
can benefit from many of the same advantages.

2.1 Deconstructing the TISDAG architecture

The proposed service architecture rests on 4 major principles:
	
	1. end-user client software is serviced from client access points
	   (CAPs)
	2. CAPs access (internal or remote) services via 
	   well-defined, advertised service access points (SAPs)
	3. there is an "inside" and an "outside" to the service -- 
	   maintained in terms of what gets advertised to servers
	   at large, how security is maintained, etc
	4. internally, there is a single protocol framework for 
	   all communications -- this facilitates service support
	   functions (e.g., security of transmission), ensures 
	   distributability, and provides the base mechanism
	   for allowing/ascertaining interoperability of components.

Client access points are responsible for providing contact to the
client software attaching to the service.  Any given service may
have several CAPs, to support clients of different protocols or
service level.  
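
As a rough illustration of principles 1, 2 and 4, the following Python
sketch models a CAP that reaches services only through SAPs sharing one
internal interface.  All class and method names here (Sap, EchoSap, Cap,
handle, serve) are hypothetical and are not taken from the TISDAG
specification.

```python
from abc import ABC, abstractmethod


class Sap(ABC):
    """Service access point: the only way a CAP reaches a service."""

    @abstractmethod
    def handle(self, request: dict) -> dict:
        """Accept one internal-protocol request and return one response."""


class EchoSap(Sap):
    """Stand-in SAP (hypothetical) that returns the query it was given."""

    def handle(self, request: dict) -> dict:
        return {"status": "ok", "data": [request["query"]]}


class Cap:
    """Client access point for one client protocol or service level."""

    def __init__(self, protocol: str, saps: dict):
        self.protocol = protocol
        self.saps = saps  # SAP name -> Sap instance

    def serve(self, sap_name: str, query: str) -> dict:
        # The CAP never talks to a server directly, only through a SAP.
        request = {"from": self.protocol, "query": query}
        return self.saps[sap_name].handle(request)


cap = Cap("LDAPv2", {"echo": EchoSap()})
reply = cap.serve("echo", "cn=Daigle")
```

Because every SAP presents the same handle() interface, a second CAP
for another client protocol could reuse the same SAP instances without
change.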

2.2 Perceived benefits 

Among the benefits of this approach are:

	. distribution of development
	. distribution of operation
	. eventual possibilities of hooking together different
	  systems (of different backgrounds)
	. separation of 
		. architectural principles
		. implementation to a specific application
		. configuration for a given service

The goal is not to claim that a standardized system architecture
would allow single components to be built for all possible
applications.  However, this approach in general permits the decoupling
of access protocols from specific applications, and facilitates
the integration of necessary infrastructure independently of
access protocol (e.g., referrals, security, lookup services, 
distribution etc).


3.0 Some generalizations

3.1 Definitions

For the purposes of this document, important distinctions
and relationships are defined between services, applications,
systems and servers.  These are defined as follows:

Service:  an operational system providing (controlled) access to
  fulfill a particular application's needs.

  A service may be reconfigured (location, access controls, etc.) and
  still be the same service.  Changing the application means changing
  the service.

Application:  a solution to a particular (set of) user need(s).

  The definition of an application includes the type(s) of information
  to be exchanged, expected behaviour, etc.  Thus, a whitepages
  (search) application may expect to receive a name as input to 
  a query engine, and will return all information associated with
  the name.  By contrast, a specific security application might
  use the same input name to verify access controls.

System:  a set of components with established interconnections.

Server:  a single component offering access through a dedicated 
  protocol, without regard to a specific service (or services) it
  may be supporting in a given configuration.


3.2 Proposed architecture

Pictorially, the DAG architecture is as follows:

      +-------------------------------------------+
  "a" |         |                +--------+       |
<----->  CAP a  |                | SAP A  |       |
      |         |                |        |       |
      |---------+                +-+------+---+   |
      |                            |(Internal)|   |
      |           "DAG/IP"         | Server i |   |
      |                            +----------+   |
      |                                           |
      |                                           |       
      |                          +--------+       | "B"   
      |                          | SAP B  <-------------->           
      |                          |        |       |      
      |                          +--------+       |       
      |                                           |
      +-------------------------------------------+

Note that the bounding box is conceptual -- all components may or may 
not reside on one server, or a set of servers governed by the provider
of the service.

As we saw in the TISDAG project, the provider of this service may
be only loosely affiliated with the services that are drawn on
(Whitepages Directory Service Providers, or WDSPs, in this case).


In the TISDAG project, the above could be mapped as follows:

	CAP a       LDAPv2 CAP
	SAP A       the Referral Index (RI) interface
	Server i    the Referral Index (RI)
	SAP B       LDAPv3 SAP

Note that, in the TISDAG project specification, the designation SAP 
referred exclusively to proxy components designed to deal with external
servers.  The Referral Index was considered an entity in its own
right.  However, generalizing the concepts of the TISDAG experience
leads to the proposal of regarding all DAG/IP-supporting service 
components as SAPs, each designed to carry out a particular type of 
service functionality; whether the server is managed internally to 
the DAG system or not is immaterial.

Building a service on this architecture requires:

System architecture:
	1. definition of the overall application to be supported by
	   the system -- whitepages, web resource indexing, medical
	   information
	2. identification of necessary CAPs -- in terms of access
	   protocols to be supported, different service levels to
	   be provided (e.g., secure and unsecure connections)
	3. identification of necessary services -- e.g., proxying to 
	   remote information search services, lookup services, AAA
	   servers, etc
	4. definition of the transaction process for the service: 
	   insofar as the CAPs represent the service to client software,
	   CAP modules manage the necessary transactions with other
	   service modules

Data architecture:
	1. selection of schemas to be used (in each protocol)
	2. definition of schema and protocol mappings -- into and
	   out of some DAG/IP representation


In the case of the TISDAG project, for example:

System architecture:
	1. whitepages lookups, with specific query types supported, on
	   a national scale
	2. publicly accessible CAPs in HTTP, SMTP, Whois++, LDAPv2,
	   and LDAPv3
	3. referral proxies to Whois++, LDAPv2 and LDAPv3 WDSPs, as
	   well as a referral query service
	4. the basic transaction process, uniform across all CAPs, is:
		. query the RI for relevant referrals
		. where necessary, chain referrals through SAPs of 
		  appropriate protocol
		. return, in the native protocol, all remaining referrals
		  and data

Data architecture:  see the spec.
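
The basic transaction process listed under item 4 above can be sketched
as follows.  This is a simplified reading, not the TISDAG
implementation: it assumes that referrals already in the client's
native protocol are returned unfollowed, that all others are chained,
and that a SAP is configured for every other protocol.  The function
and parameter names are hypothetical.

```python
def run_transaction(query, native_protocol, referral_index, saps):
    """Return (data, remaining_referrals) for one client query.

    referral_index: callable, query -> list of (protocol, server) referrals
    saps: dict, protocol -> callable(server, query) returning a list of results
    """
    data, remaining = [], []
    # Step 1: query the RI for relevant referrals.
    for protocol, server in referral_index(query):
        if protocol == native_protocol:
            # Assumed: the client can follow this referral itself.
            remaining.append((protocol, server))
        else:
            # Step 2: chain the referral through a SAP of that protocol.
            data.extend(saps[protocol](server, query))
    # Step 3: the CAP would now encode both lists in its native protocol.
    return data, remaining


def fake_ri(query):
    """Stand-in Referral Index with two fixed referrals."""
    return [("LDAPv3", "ldap.example.se"), ("Whois++", "whois.example.se")]


saps = {"LDAPv3": lambda server, query: [f"{query}@{server}"]}
data, referrals = run_transaction("Eklof", "Whois++", fake_ri, saps)
```

Here the LDAPv3 referral is chained and yields data, while the Whois++
referral is handed back to the Whois++ client.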

Should the CAPs be able to access external servers in their native
protocol?  Conceptually, no.  In practice -- who cares if the
SAP is part of the code, or accessed remotely?

So, why use a protocol for DAG/IP, and isn't it all obvious anyway?
Yes, no, maybe.  The advantage of the protocol is that you can
have components distributed far & wide (for free).  By providing
a standard, you can build at least semi-reusable CAPs.  At the
very least, you don't reinvent the wheel every time you want to
build a new service.   We define some requirements of the protocol
below.


3.3 Requirements for the future DAG/IP

The role of the DAG/IP is less as a query protocol, and more as
a framework or structure for carrying basic query-response transactions
of different (configurable) types.

Whatever the syntax or grammar, the basic requirements for the
DAG/IP include that it be:

	. lightweight; CAPs, SAPs should be able to be quite small
	. flexible enough to carry queries of different paradigms,
	  results of different types
	. able to support authentication, authorization, accounting and
	  audit mechanisms -- not necessarily native to the protocol
	. able to support encryption and end-to-end security within the
	  DAG system
	. sophisticated enough to allow negotiation of capabilities --
	  querying & identifying application type supported (e.g.,
	  whitepages vs. service location vs. URN resolution), query
	  types supported, results types supported
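
A minimal sketch of the last requirement, assuming each component
advertises its capabilities as plain sets.  The field names and the
rule of intersecting the sets are assumptions made for illustration;
this document does not define a negotiation mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    application: str          # e.g. "whitepages"
    query_types: frozenset    # query paradigms the component accepts
    result_types: frozenset   # result types the component can handle


def negotiate(cap: Capabilities, sap: Capabilities):
    """Return the shared capabilities, or None if the two cannot interoperate."""
    if cap.application != sap.application:
        return None
    queries = cap.query_types & sap.query_types
    results = cap.result_types & sap.result_types
    if not queries or not results:
        return None
    return Capabilities(cap.application, queries, results)


cap_side = Capabilities("whitepages", frozenset({"token", "phrase"}),
                        frozenset({"data", "referrals"}))
sap_side = Capabilities("whitepages", frozenset({"phrase"}),
                        frozenset({"data"}))
agreed = negotiate(cap_side, sap_side)
```

In this example the two sides agree on phrase queries returning data;
a SAP built for a different application type would be rejected
outright.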
	
This also means:

Better support for query-passing/other query semantics (need to
balance that against the fact that you don't want DAG-CAPs/SAPs to have
to know a multiplicity of semantic possibilities).

Security infrastructure -- ability to establish security credentials,
maintain a secure transaction, and propagate the security information
forward in the transaction (don't want to reinvent the wheel, just
want to be able to use it!).

Ability to do lookups, instead of searches -- might mean connecting
to different services than the RI and/or presenting things in a 
slightly different light -- e.g., lookup <blat> in the <foo> space,
as opposed to search for all things concerning <blat>.
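
The difference between the two can be shown with a toy in-memory store.
The records and function names below are hypothetical; the point is
only that a lookup is an exact match within a named space, while a
search returns everything that mentions the term.

```python
# Hypothetical records, each filed under a named space.
RECORDS = [
    {"space": "foo", "key": "blat", "value": "entry-1"},
    {"space": "bar", "key": "blat", "value": "entry-2"},
    {"space": "foo", "key": "other", "value": "about blat"},
]


def lookup(space: str, key: str):
    """Exact match on (space, key): at most one answer."""
    for record in RECORDS:
        if record["space"] == space and record["key"] == key:
            return record["value"]
    return None


def search(term: str):
    """Loose match: everything that mentions the term anywhere."""
    return [r["value"] for r in RECORDS
            if any(term in str(field) for field in r.values())]


exact = lookup("foo", "blat")
loose = search("blat")
```

The lookup yields the single entry filed under blat in the foo space;
the search returns all three records, since each mentions blat
somewhere.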

Ability to access other services -- e.g., NDD ([NDD]) -- beyond just
for specific characteristics of the service (e.g., security).

In short, the model that seems to stand out from these requirements
is one of a protocol framework that looks after establishing secure
and authenticated (authorized, accountable, auditable...) connections,
with transaction negotiation facilities.  Within that framework,
it must be possible to identify transaction types, provide suitable
input information (negotiation?) for those transactions, and accept
transaction result objects back.
	

4.0 Revisiting TISDAG

In the light of the above proposals, we can revisit the way
the TISDAG CAPs would be defined.

The whitepages-application service known as TISDAG would have
CAPs and SAPs that supported 2 types of query, and 2 types
of result sets:

	query types:
		. token-based
		. phrase-based

	result types:
		. result data
		. referrals

The Whois++ CAP would be configured to contact LDAPv2 and LDAPv3
SAPs because they are identified as providing that kind of service
(i.e., if referral protocol == LDAPv2 connect to a particular
service).  The query paradigm will be phrase-oriented -- NOT
because the Whois++ CAP understands LDAP, but because that is
one of the defined query types.  
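
The configuration-driven choice described above might look like the
following sketch.  The addresses, table name and function name are
invented for illustration; the only ideas taken from the text are that
the component to contact is selected by referral protocol and that the
query type is one of those defined for the service.

```python
# Hypothetical configuration for one CAP: referral protocol -> service address.
WHOISPP_CAP_ROUTES = {
    "LDAPv2": "ldapv2.dag.example:2001",
    "LDAPv3": "ldapv3.dag.example:2002",
}

# Query types defined for the service, independent of any access protocol.
QUERY_TYPES = {"token", "phrase"}


def route(referral_protocol: str, query_type: str) -> str:
    """Pick the service to contact for a referral, purely from configuration."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"undefined query type: {query_type}")
    try:
        return WHOISPP_CAP_ROUTES[referral_protocol]
    except KeyError:
        raise LookupError(f"no service configured for {referral_protocol}") from None


target = route("LDAPv2", "phrase")
```

Nothing in route() knows anything about LDAP itself; adding support for
a further referral protocol is a configuration change only.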


5.0 Other Applications

For the medical application discussed in [DAGEXP], a security SAP
could be included that fit within the context of the existing
medical application security model.  Then, CAPs built for this
medical application would have a transaction model that included
obtaining the required AAA from that SAP, and query transactions
would include the propagation of all necessary security credentials.
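
One way to picture that transaction model is sketched below.  It is
purely illustrative: the class and function names are hypothetical, the
"credential" is an unsigned dictionary standing in for whatever the
medical application's security model actually issues, and no real AAA
protocol is implied.

```python
class SecuritySap:
    """Hypothetical security SAP wrapping an application's own AAA model."""

    def __init__(self, users: dict):
        self._users = users  # user -> set of permitted actions

    def authorize(self, user: str, action: str) -> dict:
        if action not in self._users.get(user, set()):
            raise PermissionError(f"{user} may not {action}")
        # A real credential would be signed; this one is a plain dict.
        return {"user": user, "action": action}


def query_with_credentials(user, query, security_sap, data_sap):
    """Obtain AAA first, then carry the credential along with the query."""
    credential = security_sap.authorize(user, "read")
    return data_sap({"query": query, "credential": credential})


def record_sap(request: dict) -> list:
    """Stand-in data SAP that refuses requests lacking a credential."""
    if "credential" not in request:
        raise PermissionError("missing credential")
    user = request["credential"]["user"]
    return [f"record for {request['query']} (seen by {user})"]


security = SecuritySap({"dr.a": {"read"}})
result = query_with_credentials("dr.a", "patient-42", security, record_sap)
```

The CAP obtains authorization before any query leaves it, and the
credential travels with the query so that downstream SAPs can check it
for themselves.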


7.0 Applicability Limitations

Although very general in nature, this architecture is still
fundamentally focused on providing a "middleware" service structure
for "read-only" systems.  That could change, as the security
model improves and is properly implemented.

This model is also closely tied to single query-response transaction
types; it would be interesting to consider extending it to more
complex interactions.


8.0 Acknowledgements

In discussing this perspective on the evolution of DAG/IP, it seemed
to us that the requirements for DAG/IP are falling into line with
the proposed text-based directory access protocol that has variously
been discussed.  Whether it survives in a recognizable form or not 
:-) some of the above has been drawn from discussions of that
protocol with Michael Mealling and Patrik Faltstrom.

9.0 Authors' Addresses

Leslie L. Daigle
Thinking Cat Enterprises
Email:  leslie@thinkingcat.com

Thommy Eklof
Ericsson
S-126 25 STOCKHOLM
Sweden
Email: thommy.eklof@ericsson.com


10.0 References

Request For Comments (RFC) and Internet Draft documents are available
from numerous mirror sites.

	[TISDAG]	L. Daigle, R. Hedberg, "Technical Infrastructure for
			Swedish Directory Access Gateways (TISDAG)",
			Internet Draft (work in progress), June 1999

	[DAGEXP]	T. Eklof, L. Daigle, "Wide Area Directory Deployment
			Experiences", Internet Draft (work in progress),
			June 1999

	[NDD]		R. Hedberg, H. Alvestrand, "Technical Specification,
			The Norwegian Directory of Directories (NDD)",
			Internet Draft (work in progress), May 1999