                                                              S. Reddy
INTERNET-DRAFT                                   Microsoft Corporation
draft-reddy-dasl-requirements-01.txt                   January 5, 1998
Expires July , 1998

             Requirements for DAV Searching and Locating

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working information as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and can be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress".

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts shadow
   directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East coast), or
   ftp.isi.edu (US West coast).

   Further information about the IETF can be found at URL:
   http://www.ietf.org/

   Distribution of this document is unlimited. Editorial comments
   should be sent to the author (saveenr@microsoft.com).

Abstract

   The Distributed Authoring and Versioning protocol [WEBDAV] defines
   simple mechanisms to assign and retrieve values for properties.
   This document presents a list of features, in the form of
   requirements, for a DAV Searching and Locating (DASL) protocol: an
   extension that improves the efficiency and utility of searching for
   resources whose properties or content meet client-defined criteria.

INTERNET DRAFT   Requirements for DAV Searching and Locating         1

Requirements for DAV Searching and Locating              November 1997

1 Introduction

   The DAV methods INDEX and PROPFIND, together with the HTTP/1.1
   method GET, are sufficient to allow a client to locate those
   resources that meet a set of conditions on their properties or
   content. However, these methods are inefficient for some simple,
   common search scenarios.

   For example, in a typical publishing environment a client may wish
   to find "all the text documents modified within the last week." DAV
   clients must repeatedly invoke the INDEX and PROPFIND methods to
   traverse the server namespace, retrieve property values, and then
   determine which resources meet the criteria.

   This procedure is a functional solution. However, it has several
   limitations.

   First, this procedure makes inefficient use of network resources.
   The client must repeatedly invoke the INDEX method to recurse the
   server namespace. Likewise, repeated calls to PROPFIND are required
   for all of the resources that are being examined, resulting in the
   transmission of data even for resources that will fail to meet the
   criteria.

   Second, it makes inefficient use of server intelligence. Servers
   capable of supporting a criteria-based search for resources can use
   well-defined mechanisms to expedite the generation of the results.
   These techniques include caching of intermediate search results and
   the use of indices. If the logic is left solely to the client,
   neither client nor server can take advantage of these features.

   Third, this simple DAV search procedure cannot efficiently search
   the content of resources. To search content would require a DAV
   client to retrieve the entire content of each resource that is to
   be examined.

   These limitations are severe enough for even simple search
   scenarios that DAV needs extensions to specifically address them.

2 Terminology

   Search Criteria - a set of conditions that must be true for a
   resource to be included in the search result.

   Result Set - the set of result records transmitted to the client as
   the response for a search request.

   Result Record - a unit of information appearing in the result set.
   Each Result Record corresponds to a specific resource that meets
   the search criteria.

   Search Scope - the set of resources to be searched.

   In addition to the terms defined above, this document uses
   terminology consistent with the HTTP/1.1 specification [HTTP] and
   the WEBDAV specification [WEBDAV].

3 Requirements

3.1 Search Criteria

3.1.1 Boolean Expressions

   It must be possible to use Boolean operators (AND, OR, NOT) in the
   search criteria.

   Criteria often involve the evaluation of several conditions
   simultaneously. For example, a stereotypical query might ask for
   "those documents modified by user X within some period of time Y."
   Boolean operations are necessary to provide support for these
   common queries.

3.1.2 Relative Comparisons

   It must be possible to specify criteria on ordered relations such
   as "less than" or "greater than" for property values.

   Many common searches involve relative comparisons. For example, a
   stereotypical query might ask for "those documents under 10K in
   size". Relational operations are necessary to provide support for
   these common queries.

3.1.3 Simple Searches on Content

   It must be possible to perform simple searches on content of any
   media type.

   Searching for specific content inside a resource is a common
   operation. Examining resource content is generally less efficient
   than examining only the resource's properties, because the content
   is generally much larger than the properties. DASL must provide a
   mechanism for searching the content of a resource to support this
   scenario.

3.1.4 Variants

   It must be possible for searches to occur across multiple variants
   of a resource and to target specific variants.

   The WEBDAV working group is addressing the standardization of
   mechanisms for authors to use when submitting variants to the
   server. DASL must provide mechanisms that can intelligently query
   on those variants.
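
   As an illustration of the criteria requirements in sections 3.1.1
   and 3.1.2, the following minimal sketch shows how Boolean and
   relational conditions on property values compose. It is not a
   proposed DASL syntax. The property names ("modified-by", "size"),
   the sample resources, and all helper names are assumptions made
   for this example only.

```python
from typing import Any, Callable, Dict, List

# A resource is modelled as a plain dict of property values.
# All property names used here are hypothetical, not defined by DASL.
Resource = Dict[str, Any]
Criterion = Callable[[Resource], bool]


def prop_eq(name: str, value: Any) -> Criterion:
    """Exact match on a property value."""
    return lambda r: name in r and r[name] == value


def prop_lt(name: str, value: Any) -> Criterion:
    """Relative comparison: the property value is less than `value`."""
    return lambda r: name in r and r[name] < value


def all_of(*criteria: Criterion) -> Criterion:
    """Boolean AND over any number of criteria."""
    return lambda r: all(c(r) for c in criteria)


def any_of(*criteria: Criterion) -> Criterion:
    """Boolean OR over any number of criteria."""
    return lambda r: any(c(r) for c in criteria)


def negate(criterion: Criterion) -> Criterion:
    """Boolean NOT of a single criterion."""
    return lambda r: not criterion(r)


# Hypothetical sample data standing in for a server namespace.
RESOURCES: List[Resource] = [
    {"href": "/docs/a.txt", "modified-by": "X", "size": 4 * 1024},
    {"href": "/docs/b.txt", "modified-by": "X", "size": 64 * 1024},
    {"href": "/docs/c.txt", "modified-by": "Y", "size": 2 * 1024},
]

# "Documents modified by user X that are under 10K in size."
small_and_by_x = all_of(prop_eq("modified-by", "X"),
                        prop_lt("size", 10 * 1024))

# "Documents under 10K in size that were NOT modified by user X."
small_not_by_x = all_of(prop_lt("size", 10 * 1024),
                        negate(prop_eq("modified-by", "X")))

matches_by_x = [r["href"] for r in RESOURCES if small_and_by_x(r)]
matches_not_by_x = [r["href"] for r in RESOURCES if small_not_by_x(r)]
```

   Representing each condition as a predicate keeps the operators
   composable: any criterion can be nested inside AND, OR, or NOT
   without special cases.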

3.1.5 Exact Matching

   It must be possible to specify exact content matches, and the
   absence of an exact content match.

3.1.6 Regular Expression Matching

   It must be possible to specify a search with matching operators
   that have the expressive power of regular expressions.

   The power and frequent use of the Unix utility "grep" highlight the
   value of regular expressions for searching large bodies of content.

3.1.7 NEAR operator

   It must be possible to specify searches for content matches of
   terms that are near each other within a document.

3.2 Results

3.2.1 Result Record Definition

   The client must be able to identify the properties or content of
   interest for the result records.

   The properties named in the search criteria and those returned in
   the result records need not overlap. For example, a query might ask
   for "the authors of those documents under 10K in size". In this
   case, the criterion relates to the size, but the desired result
   record relates only to the author.

3.2.2 Standardized Results Format

   DASL must define a standard format for search results.

   For the sake of interoperability, it is desirable that server
   result formats be standardized so that, regardless of the query
   syntax used, clients are guaranteed to understand the results of a
   query.

3.2.3 Paged Search Results

   DASL search results must be conducive to paged retrieval.

   Paged retrieval is necessary if result sets are very large and
   clients must also present a responsive interface to a user. In this
   scenario clients need to access portions of the search result at
   specific times. DASL search results must be defined so that paged
   search results are possible.

3.3 Search Qualifiers

3.3.1 Search Scope

   It must be possible for the client to specify a number of
   different, unrelated URIs over which the search is to range.
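
   Sections 3.2.1, 3.2.3, and 3.3.1 together imply that a search
   request names a scope, a criterion, the properties wanted in each
   result record, and the portion of the result set to return. The
   following is a rough sketch of that combination under stated
   assumptions, not a proposed protocol design. The function name,
   its parameters, the offset/limit style of paging, the prefix test
   used for scope membership, and the sample data are all
   hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence

Resource = Dict[str, Any]


def search(resources: Sequence[Resource],
           scopes: Sequence[str],
           criterion: Callable[[Resource], bool],
           select: Sequence[str],
           offset: int = 0,
           limit: int = 2) -> List[Dict[str, Any]]:
    """Return one page of result records.

    scopes    -- unrelated URIs to search under (assumed prefix match)
    criterion -- condition a resource must satisfy
    select    -- properties to include in each result record
    offset, limit -- hypothetical paging controls
    """
    in_scope = [r for r in resources
                if any(r["href"].startswith(s) for s in scopes)]
    hits = [r for r in in_scope if criterion(r)]
    page = hits[offset:offset + limit]
    # Each result record carries only the requested properties,
    # which need not be the ones the criterion examined.
    return [{"href": r["href"], **{p: r.get(p) for p in select}}
            for r in page]


# Hypothetical sample data.
RESOURCES = [
    {"href": "http://foo/bar/a.txt", "size": 2048, "author": "alice"},
    {"href": "http://foo/bar/b.txt", "size": 20480, "author": "bob"},
    {"href": "http://blah/mumble/c.txt", "size": 512, "author": "carol"},
    {"href": "http://blah/mumble/d.txt", "size": 4096, "author": "dave"},
    {"href": "http://other/e.txt", "size": 100, "author": "eve"},
]
SCOPES = ["http://foo/bar/", "http://blah/mumble/"]


def under_10k(resource: Resource) -> bool:
    return resource["size"] < 10 * 1024


# "The authors of those documents under 10K in size", two per page.
first_page = search(RESOURCES, SCOPES, under_10k, ["author"], offset=0)
second_page = search(RESOURCES, SCOPES, under_10k, ["author"], offset=2)
```

   The criterion examines "size" while the result records carry only
   "author", mirroring the example in section 3.2.1. A real server
   would more likely page with a cursor or similar state than
   re-evaluate the query for every page.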

3.3.2 Search Depth

   It must be possible for the client to specify the "depth" of a
   search for a search scope URI.

   Users often intend either to restrict a search to the immediate
   children of a container or to extend it recursively to all of the
   container's descendants. Furthermore, depth control is needed to
   prevent servers from performing unnecessary work.

3.3.3 Search References

   It must be possible for the server to refer the client to other
   resources in order to continue a search.

   For example, a client may ask the resource http://ren/stimpy to
   perform a search over http://foo/bar and http://blah/mumble.
   However, http://ren/stimpy may not be able to perform the search
   itself, and so will need to be able to inform the client that it
   should submit its search request directly to http://foo/bar and
   http://blah/mumble.

3.4 Search Query Syntax

3.4.1 Simple Query Syntax

   The DASL extensions must define a query syntax that provides simple
   searching functionality.

   For the sake of interoperability, DASL servers must be expected to
   offer a basic set of searching capabilities. Likewise, clients need
   a standard, simple syntax by which to access those capabilities.

3.4.2 Extensible Query Syntax

   DASL extensions must support the extensible use of alternate query
   syntaxes.

   Servers that support searching capabilities may wish to expose
   those capabilities through DASL. This may be the case if the simple
   query syntax is not expressive enough to support the server's
   capabilities.

3.4.3 Query Syntax Discovery

   It must be possible for clients to discover which syntaxes a server
   supports.

   If a server is capable of supporting several search syntaxes, the
   client needs to determine which syntaxes are supported.
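
   To illustrate why syntax discovery matters to a client, the sketch
   below picks a query syntax once the server's supported list is
   known. How that list is obtained is not specified by this
   document. The function name and the syntax identifiers ("simple",
   "vendor-sql", "vendor-fulltext") are hypothetical.

```python
from typing import Optional, Sequence


def choose_syntax(client_preferences: Sequence[str],
                  server_supported: Sequence[str]) -> Optional[str]:
    """Return the client's most preferred syntax that the server also
    supports, or None if the two lists have nothing in common."""
    supported = set(server_supported)
    for syntax in client_preferences:
        if syntax in supported:
            return syntax
    return None


# Hypothetical identifiers; the client prefers a richer syntax but
# falls back to the simple one required by section 3.4.1.
preferred = ["vendor-sql", "simple"]

rich_server = choose_syntax(preferred, ["simple", "vendor-sql"])
basic_server = choose_syntax(preferred, ["simple"])
other_server = choose_syntax(preferred, ["vendor-fulltext"])
```

   Because every DASL server is expected to offer the simple syntax,
   a client that lists it last always has a fallback against a
   conforming server.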

3.5 Authentication

   The DASL specification should state how the DASL extensions to
   WEBDAV interoperate with existing authentication schemes, and
   should make recommendations for using those schemes.

3.6 Access Control

   The DASL specification should state how the DASL extensions to
   WEBDAV interoperate with the ACL mechanisms supported by WEBDAV,
   and should make recommendations for using those mechanisms.

3.7 Internationalization

   DASL extensions must describe how to perform searches on
   internationalized content and properties. Information intended for
   user comprehension must conform to the IETF Character Set Policy
   [CHAR].

4 References

   [CHAR]    H. T. Alvestrand, "IETF Policy on Character Sets and
             Languages", June 1997, internet-draft, work-in-progress,
             draft-alvestrand-charset-policy-02.txt.

   [HTTP]    R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and
             T. Berners-Lee, "Hypertext Transfer Protocol --
             HTTP/1.1", RFC 2068, U.C. Irvine, DEC, MIT/LCS,
             January 1997.

   [WEBDAV]  Y. Y. Goland, E. J. Whitehead, Jr., A. Faizi,
             S. R. Carter, and D. Jensen, "Extensions for Distributed
             Authoring and Versioning on the World Wide Web",
             October 1997, internet-draft, work-in-progress,
             draft-ietf-webdav-protocol-04.txt.

5 Author's Address

   Saveen Reddy
   Microsoft Corporation
   One Microsoft Way
   Redmond WA, 9085-6933

   EMail: saveenr@microsoft.com

Expires May 24, 1998