Internet DRAFT - draft-wilper-semantic-content-pkgs

draft-wilper-semantic-content-pkgs



Network Working Group
Internet-Draft                                                 C. Wilper
Intended status: Experimental                                  DuraSpace
Expires: October 15, 2012                                 April 13, 2012

                    Semantic Content Packages (SCP)
                 draft-wilper-semantic-content-pkgs-00

Abstract

   This document specifies Semantic Content Packages, an experimental
   data structure and associated format for storing and transmitting a
   named set of RDF statements with a set of content streams.  Packages
   can be arranged as a set of files in a directory hierarchy or
   serialized into a single stream for transmission and archival
   storage.  For the latter, a new ZIP-based media type,
   application/scp, is defined.


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 15, 2012.


Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
 


C. Wilper               Expires October 15, 2012                [Page 1]

Internet Draft           semantic-content-pkgs            April 13, 2012


Table of Contents

   1. Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Motivation . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.2.  Notational Conventions . . . . . . . . . . . . . . . . . .  3
   2. The SCP Data Structure  . . . . . . . . . . . . . . . . . . . .  4
   3. The SCP Vocabulary  . . . . . . . . . . . . . . . . . . . . . .  4
     3.1. Bytestream  . . . . . . . . . . . . . . . . . . . . . . . .  4
     3.2. ContentLocation . . . . . . . . . . . . . . . . . . . . . .  4
       3.2.1. ResolvableURI . . . . . . . . . . . . . . . . . . . . .  5
       3.2.2. PackagePath . . . . . . . . . . . . . . . . . . . . . .  5
     3.3. Package . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     3.4. PackageType . . . . . . . . . . . . . . . . . . . . . . . .  5
   4. The SCP Format  . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.1.  Directory Structure  . . . . . . . . . . . . . . . . . . .  5
     4.2.  ZIP-based Serialization  . . . . . . . . . . . . . . . . .  6
     4.3.  Examples . . . . . . . . . . . . . . . . . . . . . . . . .  6
       4.3.1.  A Minimal Package  . . . . . . . . . . . . . . . . . .  6
       4.3.2.  A Typed Package with an Image and Statements . . . . .  7
   5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . .  8
   6. Security Considerations . . . . . . . . . . . . . . . . . . . .  8
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     7.1.  Normative References . . . . . . . . . . . . . . . . . . .  8
     7.1.  Informative References . . . . . . . . . . . . . . . . . .  8
   Appendix A.  SCP-OWL-Ontology.ttl  . . . . . . . . . . . . . . . .  9
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . .  9






















 


C. Wilper               Expires October 15, 2012                [Page 2]

Internet Draft           semantic-content-pkgs            April 13, 2012


1. Introduction

1.1.  Motivation

   In data management, the primitives used to describe structured and
   semi-structured data are usually determined by the kind of storage
   technology at hand.  For example, data stored in a relational
   database is modeled using the well-known entity-relationship
   paradigm, but data stored in a key-value store is modeled using a
   different set of primitives.

   Over time, high-value data will necessarily be migrated from one
   storage technology to another.  This is often complicated by the need
   to translate the data model to the target storage technology while
   retaining the original semantics.

   The Resource Description Framework (RDF) provides a core data model
   capable of expressing detailed descriptions about any kind of
   resource in a way that is independent of any storage technology or
   protocol.  With this simple, but powerful capability, RDF can form
   the basis of a more durable kind of content modeling paradigm.

   Semantic Content Packages build upon the core RDF model by providing
   a way to logically and physically bundle a set of statements with a
   set of content streams, and to include a unique identifier within
   each bundle.

1.2.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
















 


C. Wilper               Expires October 15, 2012                [Page 3]

Internet Draft           semantic-content-pkgs            April 13, 2012


2. The SCP Data Structure

   A Semantic Content Package is a logical container consisting of:

      o  A set of finite-length octet stream values, also known as
         content, each with a distinct location within the package, a
         path.

      o  An id which is a Uniform Resource Identifier (URI).  Every
         package MUST contain a content stream at path '.scpi/id' whose
         value is an ASCII-encoded URI identifying the package.

      o  A set of Resource Description Framework (RDF) triples, also
         known as statements.  The path '.scpi/graph.ttl' is reserved
         for an RDF Turtle [W3C.SUBM-turtle-20110328] serialization of
         the statements, if any.


3. The SCP Vocabulary

   SCP defines a minimal RDF vocabulary in order to:

      1.  Provide a uniform, but OPTIONAL way to refer to content within
         a package.  This allows assertions to be made about the content
         using other, more appropriate vocabularies such as Dublin Core,
         SKOS, PREMIS-OWL, and OAI-ORE [TBD: spec refs under Informative
         References]

      2.  Allow packages to declare the compound data models, if any, to
         which they conform.

   All terms in this vocabulary are defined as Web Ontology Language
   (OWL) classes or properties, and are described below.  See Appendix A
   for the official, machine-readable ontology.

3.1. Bytestream

   A Bytestream is defined as a specific sequence of octets that exists
   conceptually regardless of where the content might be found.  The
   "location" property refers to a ContentLocation.  A Bytestream may
   refer to any number of ContentLocations.

3.2. ContentLocation

   A ContentLocation indicates where the content of Bytestream is
   expected to be found.  Two subclasses are defined by this
   specification:

 


C. Wilper               Expires October 15, 2012                [Page 4]

Internet Draft           semantic-content-pkgs            April 13, 2012


3.2.1. ResolvableURI

   A ResolvableURI is a kind of ContentLocation with a URI that can be
   deferenced.

3.2.2. PackagePath

   A PackagePath is a kind of ContentLocation that points to a path
   within a Package.  The "path" property refers to a Plain Literal
   value, such as "path/to/file.txt", and the "inPackage" property
   refers to the Package.

3.3. Package

   A Package is a container for content and statements.  The "type"
   property refers to a PackageType.

3.4. PackageType

   A PackageType is a particular kind of package.  Some sort of data
   model is usually associated with a PackageType, but the means by
   which package types are described or represented is not expected to
   be uniform, nor is a method suggested by this specification.


4. The SCP Format

   Packages that adhere to the SCP data model MAY be stored in a variety
   of ways.  This specification defines two.

4.1.  Directory Structure

   The SCP model is designed to fit naturally into a filesystem
   directory structure.  Each content stream is available from some root
   directory path, and the 'path' of that content stream in the package
   is the relative path from the root directory of the package.

   For example, if the root directory of a package is /tmp/mypackage,
   there MUST be a directory whose full path is '/tmp/mypackage/.scpi'
   with at least one file in it named 'id', whose content is the id of
   the package.







 


C. Wilper               Expires October 15, 2012                [Page 5]

Internet Draft           semantic-content-pkgs            April 13, 2012


4.2.  ZIP-based Serialization

   A new ZIP-based Internet Media Type, application/scp (extension .scp)
   is defined by this specification as a way to serialize packages.

   The ZIP file is structured such that unzipping it will instantiate
   the package in a directory structure as described in the section
   above.  It follows that it MUST contain an item named '.scpi/id'
   whose content is the id of the package.

4.3.  Examples

   The following examples illustrate packages in a directory structure.

4.3.1.  A Minimal Package

   In order to represent a package, a directory is only required to have
   one file in a particular place to declare id, as shown below:

   example-package-1/
   |
   -- .scpi/
       |
       | id
       |  (http://example.org/pkg1)
       -






















 


C. Wilper               Expires October 15, 2012                [Page 6]

Internet Draft           semantic-content-pkgs            April 13, 2012


4.3.2.  A Typed Package with an Image and Statements

   This example shows a package containing a single image bundled with
   several statements, including:

      o  An assertion of the "type" of package.

      o  An assertion from the Dublin Core vocabulary that the format of
         the image is "image/jpeg".

      o  An assertion that the image Bytestream is expected to be
         available in three known locations, one being a path within the
         package itself.


   example-package-2/
   |
   | image.jpg
   |  (binary data)
   |
   -- .scpi/
       |
       | id
       |  (http://example.org/pkg2)
       |
       | graph.ttl (
       |  @prefix dc:  <http://purl.org/dc/elements/1.1/> .
       |  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
       |  @prefix scp: <http://scpproject.org/ontology#> .
       |
       |  <http://example.org/pkg2> rdf:type scp:Package ;
       |        scp:type     <http://example.org/pkgtype-1> .
       |  _img  rdf:type     scp:Bytestream ;
       |        dc:format    "image/jpeg" ;
       |        scp:location _path ,
       |                     <http://example.org/pkg2/img1.jpg> ,
       |                     <http://example.com/images/foo.jpg> .
       |  _path rdf:type     scp:PackagePath ;
       |        scp:path     "image1.jpg" ;
       |        scp:inPackage <http://example.org/pkg2> .
       | )
       -






 


C. Wilper               Expires October 15, 2012                [Page 7]

Internet Draft           semantic-content-pkgs            April 13, 2012


5. IANA Considerations

   A new media type shall be registered for this format.

   Type name:
       application
   Subtype name:
       scp
   Required parameters:
       none
   Optional parameters:
       none


6. Security Considerations

   The security considerations for this specification are the union of
   those that apply to processing Turtle RDF and ZIP files.


7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66, RFC
              3986, January 2005.

   [W3C.SUBM-turtle-20110328]
              Beckett, T. and Berners-Lee, T., "Turtle - Terse RDF
              Triple Language", W3C Team Submission
              SUBM-turtle-20110328, March 2011,
              <http://www.w3.org/TeamSubmission/2011/SUBM-turtle-
              20110328/>

              Latest version available at
              <http://www.w3.org/TeamSubmission/turtle/>.

   [TBD: Refs to RDF, OWL, ZIP, etc.]

7.1.  Informative References

   [TBD: Refs to DC, SKOS, OAI-ORE, PREMIS-OWL, etc.]


 


C. Wilper               Expires October 15, 2012                [Page 8]

Internet Draft           semantic-content-pkgs            April 13, 2012


Appendix A.  SCP-OWL-Ontology.ttl

   # Namespaces
   @prefix owl:  <http://www.w3.org/2002/07/owl#> .
   @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix scp:  <http://scpproject.org/ontology#> .

   # Classes
   scp:Bytestream      rdf:type        owl:Class .
   scp:ContentLocation rdf:type        owl:Class .
   scp:ResolvableURI   rdfs:subClassOf scp:ContentLocation .
   scp:PackagePath     rdfs:subClassOf scp:ContentLocation .
   scp:Package         rdf:type        owl:Class .
   scp:PackageType     rdf:type        owl:Class .

   # Properties
   scp:location  rdf:type    owl:ObjectProperty ;
                 rdfs:domain scp:Bytestream ;
                 rdfs:range  scp:ContentLocation .
   scp:path      rdf:type    owl:DatatypeProperty ,
                             owl:FunctionalProperty ;
                 rdfs:domain scp:PackagePath ;
                 rdfs:range  rdf:PlainLiteral .
   scp:inPackage rdf:type    owl:ObjectProperty ,
                             owl:FunctionalProperty ;
                 rdfs:domain scp:PackagePath ;
                 rdfs:range  scp:Package .
   scp:type      rdf:type    owl:ObjectProperty ;
                 rdfs:domain scp:Package ;
                 rdfs:range  scp:PackageType .


Authors' Addresses


   Chris Wilper
   West Henrietta, NY
   USA

   Email: cwilper@gmail.com










C. Wilper               Expires October 15, 2012                [Page 9]