
These requirements are derived from KeyEvents.

Much of the terminology on this page still needs to be settled. Where possible, we should use terminology that already exists in other standards.

Earlier work:,,,

Requirements engineering overview:

Object level functionality

Lookup object given id

Create objects

  • Create an empty object
  • Create an object, given some data (may be a synthesis of create empty and addDatatoObject)
    (optional; possibly beyond the minimal requirements)
  • Create an empty object, given an identifier
    • needs to check for existence of object (see LookupObject)
  • Create an object, given some data and an identifier

Modify objects

  • Modify object, given an identifier, the element to change, and the new contents of the element

Remove object

  • Remove an object, given an identifier
    (to support functionality like OAI and provenance, it is probably best not to remove the entire object. Best to hold on to provenance info and at least the original identifier as a placeholder.)
    (Under what circumstances does it make sense to remove the entire object?)
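
The object-level operations above could be sketched roughly as follows. This is only an illustration under stated assumptions: ObjectStore, StoredObject, and every method name are hypothetical, not taken from any existing repository API. Removal is modeled as a tombstone so that the identifier and provenance survive, as suggested above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    identifier: str
    elements: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)
    removed: bool = False  # tombstone flag; the identifier stays as a placeholder


class ObjectStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self):
        self._objects = {}

    def lookup(self, identifier):
        # Returns None when the identifier is unknown.
        return self._objects.get(identifier)

    def create(self, identifier=None, data=None):
        # Mint an identifier when the caller does not supply one.
        if identifier is None:
            identifier = str(uuid.uuid4())
        # Existence check; tombstoned identifiers also count as taken.
        if identifier in self._objects:
            raise KeyError(f"object already exists: {identifier}")
        obj = StoredObject(identifier, dict(data or {}), ["created"])
        self._objects[identifier] = obj
        return obj

    def modify(self, identifier, element, contents):
        obj = self._live(identifier)
        if element not in obj.elements:
            raise KeyError(f"no such element: {element}")
        obj.elements[element] = contents
        obj.provenance.append(f"modified {element}")

    def remove(self, identifier):
        # Drop the content but keep the identifier and provenance.
        obj = self._live(identifier)
        obj.elements.clear()
        obj.removed = True
        obj.provenance.append("removed")

    def _live(self, identifier):
        obj = self._objects.get(identifier)
        if obj is None or obj.removed:
            raise KeyError(f"no live object: {identifier}")
        return obj
```

Whether a tombstone is always the right behavior is the open question raised above; the sketch simply never deletes the entry.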

Object element level functionality

Add element to object

  • Add element to object, given an identifier, the element to add, and the new contents of the element

Remove element from object

  • Remove element from object, given identifier and the element to remove

Get element of an object

  • Extract and return designated element from an object
    (consider case of object already fetched v. fetching element directly)
    (consider need to fetch segment of content/metadata – read slice of data – from object)
    (consider issue of repository system metadata v. user-provided metadata)
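
A minimal sketch of the element-level operations, assuming elements are named byte strings. RepositoryObject and its methods are hypothetical names. It keeps repository system metadata apart from user-provided elements and lets a caller read a slice of an element rather than the whole thing.

```python
class RepositoryObject:
    """Hypothetical object whose elements are named byte strings."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.user_elements = {}
        # System metadata is maintained by the repository, not by callers.
        self.system_elements = {"identifier": identifier.encode("utf-8")}

    def add_element(self, name, contents):
        if name in self.user_elements:
            raise KeyError(f"element already exists: {name}")
        self.user_elements[name] = bytes(contents)

    def remove_element(self, name):
        # Raises KeyError when the element is absent.
        del self.user_elements[name]

    def get_element(self, name, offset=0, length=None, system=False):
        # Return the whole element, or only the slice [offset, offset + length).
        source = self.system_elements if system else self.user_elements
        data = source[name]
        end = len(data) if length is None else offset + length
        return data[offset:end]
```

For example, after obj.add_element("content", b"hello world"), the call obj.get_element("content", offset=6, length=5) reads only the last five bytes. A real implementation would presumably stream the slice from storage instead of holding the element in memory.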

Collection/container issues

Find all members of a collection.

(this is an interesting problem, depending on how collections are modeled)
(I'm not sure there is a way to do this generically)

Other functionalities

Bulk ingest

(is there a reasonable way to package a SIP generically?)
(see Transactions, how do we handle these nicely, especially for large transactions)

Backup and restore

  • need to provide a consistent serialized expression of repository content to support backup and restore. For example, DSpace has a database and one or more content stores that are not in sync at all times. The only way to guarantee consistency in this environment is to stop the repository (or at least disable writes) during backups.

Session management

(authentication: application to repository; user credentials passed through application? Third-party mechanisms like Shib?)

Concurrency and Locking

  • need to support simultaneous use by multiple processes. A couple of different modes:
    • process attempts to gather appropriate locks and is notified of failure if locks cannot be obtained
    • process attempts to gather appropriate locks and waits until they are available
    • repository reports failure when process attempts to store object modified by another process since fetch
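
The three modes above might look like this in a single-process sketch. VersionedStore, ConflictError, and the method names are hypothetical. The first two modes use Python's threading.Lock; the third is a version check at store time, often called optimistic concurrency.

```python
import threading


class ConflictError(Exception):
    """Raised when an object changed between fetch and store."""


class VersionedStore:
    def __init__(self):
        self._data = {}   # identifier -> (version, contents)
        self._locks = {}  # identifier -> threading.Lock
        self._guard = threading.Lock()

    def _lock_for(self, identifier):
        with self._guard:
            return self._locks.setdefault(identifier, threading.Lock())

    def try_lock(self, identifier):
        # Mode 1: return False immediately if the lock is unavailable.
        return self._lock_for(identifier).acquire(blocking=False)

    def wait_lock(self, identifier, timeout=-1):
        # Mode 2: block until the lock is available (or the timeout expires).
        return self._lock_for(identifier).acquire(timeout=timeout)

    def unlock(self, identifier):
        self._lock_for(identifier).release()

    def fetch(self, identifier):
        # Unknown identifiers report version 0 and no contents.
        return self._data.get(identifier, (0, None))

    def store(self, identifier, contents, fetched_version):
        # Mode 3: reject the write if another process stored since the fetch.
        with self._guard:
            current_version, _ = self._data.get(identifier, (0, None))
            if current_version != fetched_version:
                raise ConflictError(f"{identifier} was modified since fetch")
            self._data[identifier] = (current_version + 1, contents)
            return current_version + 1
```

A networked repository would need leases or lock timeouts so that a crashed client cannot hold a lock forever; the sketch ignores that.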


  • need mechanism to designate some calls as atomic
  • need mechanism to create rollbackable transaction wrapper around a series of calls
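
One possible shape for a rollbackable wrapper, sketched as a Python context manager over a plain dict standing in for the repository. The name transaction is hypothetical. It snapshots the whole store up front, which would not scale to the large transactions mentioned under bulk ingest; a real system would more likely use a write-ahead log or copy-on-write.

```python
import copy
from contextlib import contextmanager


@contextmanager
def transaction(store):
    # Take a deep snapshot so nested objects can be restored too.
    snapshot = copy.deepcopy(store)
    try:
        yield store
    except Exception:
        # Roll back every change made inside the block, then re-raise.
        store.clear()
        store.update(snapshot)
        raise
```

Calls made inside a "with transaction(store):" block either all take effect or, if any of them raises, none do.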


(at what level do we apply versioning? object? element of an object?)
(note for Tim: how does versioning tie into digital provenance, in general?)

  • Need ability to freeze an object so it can no longer be modified.
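
A minimal sketch of freezing, assuming it is a one-way switch on the object. FreezableObject and FrozenObjectError are hypothetical names. Reads still succeed after the freeze; only writes are rejected.

```python
class FrozenObjectError(Exception):
    """Raised on any attempt to modify a frozen object."""


class FreezableObject:
    def __init__(self, identifier):
        self.identifier = identifier
        self._elements = {}
        self._frozen = False

    def freeze(self):
        # One-way: this sketch deliberately offers no unfreeze.
        self._frozen = True

    def set_element(self, name, contents):
        if self._frozen:
            raise FrozenObjectError(f"{self.identifier} is frozen")
        self._elements[name] = contents

    def get_element(self, name):
        return self._elements[name]
```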


  • Applications may like to be notified on object updates.
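
Update notification could be as simple as a callback registry. The sketch below assumes in-process subscribers; UpdateNotifier and its methods are hypothetical names. Remote applications would need a delivery mechanism (polling, messaging, or similar) that is out of scope here.

```python
from collections import defaultdict


class UpdateNotifier:
    def __init__(self):
        self._subscribers = defaultdict(list)  # identifier -> callbacks

    def subscribe(self, identifier, callback):
        self._subscribers[identifier].append(callback)

    def publish(self, identifier, event):
        # Call every subscriber registered for this object, in order.
        for callback in list(self._subscribers.get(identifier, ())):
            callback(identifier, event)
```

The repository would call publish from its modify, add-element, and remove-element paths after the change has been committed.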


  • Scalability
  • Security
    • Ensure data is always in valid state
    • Ensure access is controlled as desired
  • Interoperability
    • Remote administration and access through well defined API