This page explains the different sections of the repository API evaluations.
Each evaluation follows the schema below.
This section gives some background on the repository API being evaluated.
This section identifies the underlying technology used by the repository API.
This section describes the data model used by the repository API.
A data model describes in an abstract way how data is represented.
For example, in a relational database, data is represented by mathematical
relations. The base data type is a tuple. A tuple is an ordered multi-set of
domain, value pairs. The repository API will have its own data model.
The features described below were derived from scenarios.
A scenario is a brief narrative that describes the hypothetical use
of a system. The system should contain a repository.
The scenarios are used to rough out the boundary of the repository in the
system. These boundaries were analyzed to derive a set of repository features
used by the various systems. Then repository APIs were evaluated with
respect to the derived features.
The most difficult part of the analysis was deriving features from scenarios.
As an example, consider SharingContentMultipleInstructors. While this scenario
does not explicitly mention versioning, it is reasonable to assert that this
repository feature would be helpful, if not essential, to support the activity.
In the scenario, two users create a new course in Sakai, a collaboration and
learning environment, out of resources from an old course. The resources may need
to be changed for the new course. If both users edit the same resource, one of
the users' changes will be lost. Depending on the underlying repository and
the timing of the edits, a user may not realize data has been lost.
Suppose the repository used by Sakai has versioning support. Each time a resource
is edited, the repository automatically creates a new revision of the resource.
Sakai could then give users the ability to examine all the revisions of a resource
and, if needed, roll back to a previous revision. No data would be lost.
Users mentioned below are users of the repository API.
The purpose of this section is to explain how the API is used to perform basic
storage functions such as add, remove, access, and modify. The API encapsulates
data with container objects from the data model.
Data is added to a container object from the data model. The container object
might have to be created. The container will have a unique identifier to
distinguish it among the containers. The data will have a unique identifier in
the container.
The container holding the data is looked up with an identifier.
Then the desired data is looked up in the container.
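The add and access steps above can be sketched as follows. This is a minimal in-memory illustration, not the API of any evaluated repository; the names `Repository`, `Container`, `container`, `add`, and `get` are hypothetical.

```python
class Container:
    """Hypothetical container object holding data keyed by identifier."""

    def __init__(self, container_id):
        self.container_id = container_id
        self._items = {}

    def add(self, item_id, data):
        # Identifiers only need to be unique within this container.
        if item_id in self._items:
            raise KeyError(f"{item_id!r} already exists in {self.container_id!r}")
        self._items[item_id] = data

    def get(self, item_id):
        return self._items[item_id]


class Repository:
    """Hypothetical repository that looks containers up by identifier."""

    def __init__(self):
        self._containers = {}

    def container(self, container_id, create=False):
        # The container might have to be created before data can be added.
        if container_id not in self._containers:
            if not create:
                raise KeyError(container_id)
            self._containers[container_id] = Container(container_id)
        return self._containers[container_id]


repo = Repository()
# Add: create the container on demand, then store the data under its own id.
repo.container("course-101", create=True).add("syllabus", b"Week 1: ...")
# Access: look up the container first, then the data inside it.
syllabus = repo.container("course-101").get("syllabus")
```

The two-level lookup mirrors the text: one identifier selects the container, a second selects the data within it.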
Removing data might mean removing data from a container object or removing the
entire container object. The data might be purged or just marked withdrawn.
Purged data is removed from the underlying storage used by the repository.
If the repository knows about object relationships and removal breaks a
relationship, the repository API must provide the user with some mechanism
for dealing with the conflict.
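The difference between withdrawing and purging, and one possible way to surface a broken relationship, might look like the sketch below. It assumes an in-memory store; `Store`, `RelationshipConflict`, and the `purge` and `force` parameters are hypothetical names chosen for illustration.

```python
class RelationshipConflict(Exception):
    """Raised when removal would break a relationship to the object."""


class Store:
    def __init__(self):
        self._data = {}          # object id -> stored bytes
        self._withdrawn = set()  # ids marked withdrawn but still stored
        self._links = {}         # object id -> ids of objects that refer to it

    def add(self, obj_id, data):
        self._data[obj_id] = data

    def link(self, source_id, target_id):
        self._links.setdefault(target_id, set()).add(source_id)

    def get(self, obj_id):
        if obj_id in self._withdrawn:
            raise KeyError(obj_id)  # withdrawn data is hidden from access
        return self._data[obj_id]

    def is_stored(self, obj_id):
        return obj_id in self._data

    def remove(self, obj_id, purge=False, force=False):
        referrers = self._links.get(obj_id, set())
        if referrers and not force:
            # Let the caller decide how to deal with the conflict.
            raise RelationshipConflict(
                f"{obj_id!r} is referenced by {sorted(referrers)}")
        if not purge:
            self._withdrawn.add(obj_id)  # keep the bytes, hide the object
            return
        del self._data[obj_id]           # remove from underlying storage
        self._withdrawn.discard(obj_id)
        self._links.pop(obj_id, None)
        for others in self._links.values():
            others.discard(obj_id)       # a purged object refers to nothing


store = Store()
store.add("image", b"...")
store.add("page", b"...")
store.link("page", "image")  # "page" refers to "image"

try:
    store.remove("image")
except RelationshipConflict as conflict:
    print("refused:", conflict)

store.remove("page")                # withdrawn: hidden but still stored
store.remove("page", purge=True)    # purged: gone from storage
store.remove("image", purge=True)   # no referrers remain, so this succeeds
```

Raising an exception is only one option; an API could equally cascade the removal or return the list of affected objects.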
Add, remove, and change metadata associated with objects.
Metadata is user-supplied information about objects.
Examples include a name or a relationship with another object.
An aggregation groups related data in the repository.
Aggregations that can contain other aggregations as members allow data to
be organized in a hierarchy. An object might also be a member of more than one
aggregation.
The API might directly support aggregations, possibly as an object type in the
data model, or the burden of managing aggregations might be placed on the user.
If the API has direct support for aggregations, there might be an aggregation
object from the data model to create.
In order to remove an aggregation, the aggregation object, if any, is removed
and the relationship which links members to the aggregation is deleted.
In some cases, removing an aggregation might also cause aggregation members
to be removed. See "Remove data" above.
Change aggregation membership
The API might provide support for changing which aggregations contain which
objects.
Find aggregation members
A user should be able to efficiently find the members of an aggregation.
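The aggregation operations described above (create, remove, change membership, find members) can be illustrated with a small membership index. This is a sketch under the assumption that aggregations are plain identifier-to-members links; the `Aggregations` class and its method names are hypothetical.

```python
from collections import defaultdict


class Aggregations:
    """Hypothetical index mapping each aggregation to its direct members."""

    def __init__(self):
        self._members = defaultdict(set)  # aggregation id -> member ids

    def add_member(self, aggregation_id, member_id):
        self._members[aggregation_id].add(member_id)

    def remove_member(self, aggregation_id, member_id):
        self._members[aggregation_id].discard(member_id)

    def remove_aggregation(self, aggregation_id):
        # Only the membership links are deleted; the members themselves stay.
        self._members.pop(aggregation_id, None)

    def members(self, aggregation_id, recursive=False):
        direct = set(self._members.get(aggregation_id, ()))
        if not recursive:
            return direct
        # Walk nested aggregations; `found` also guards against cycles.
        found, stack = set(), list(direct)
        while stack:
            member = stack.pop()
            if member in found:
                continue
            found.add(member)
            stack.extend(self._members.get(member, ()))
        return found


aggs = Aggregations()
aggs.add_member("courses", "biology")     # an aggregation inside an aggregation
aggs.add_member("biology", "lecture-1")
aggs.add_member("archive", "lecture-1")   # one object, two aggregations
everything = aggs.members("courses", recursive=True)
```

Keeping the index keyed by aggregation makes finding direct members a single lookup, which is the efficiency concern raised above.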
The management features below deal with issues a systems administrator might
face when migrating data from one repository to another or recovering a
repository after a hardware failure. The API may not directly support these
features.
Bulk ingestion is the addition of a large amount of data to the repository.
A bulk ingest might take a long time, which makes error recovery important.
Ideally, after an error occurs, the ingestion can be restarted without
unnecessarily duplicating work.
Bulk ingestion can likely be supported indirectly on top of regular ingestion.
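One way to make an ingest restartable is to record which items have already succeeded and skip them on the next run. The sketch below assumes each item has a stable identifier and that the caller persists the `done` set between runs; `bulk_ingest` and `ingest_one` are hypothetical names.

```python
def bulk_ingest(items, ingest_one, done):
    """Ingest (item_id, data) pairs, skipping ids already recorded in `done`."""
    for item_id, data in items:
        if item_id in done:
            continue  # finished in an earlier run; do not repeat the work
        ingest_one(item_id, data)
        done.add(item_id)  # record progress only after the item succeeds
    return done


# Simulate a failure on the first attempt to ingest item "b".
stored, attempts = {}, []

def flaky_ingest(item_id, data):
    attempts.append(item_id)
    if item_id == "b" and attempts.count("b") == 1:
        raise IOError("simulated failure")
    stored[item_id] = data

items = [("a", 1), ("b", 2), ("c", 3)]
done = set()
try:
    bulk_ingest(items, flaky_ingest, done)
except IOError:
    pass                                 # "a" is recorded as done; "b" is not
bulk_ingest(items, flaky_ingest, done)   # restart: "a" is skipped
```

Here the regular single-item ingest does the real work, and the bulk layer only adds bookkeeping.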
Bulk export is the export of a large amount of data from the repository.
The export format should be well documented or left up to the user.
As in bulk ingestion, error recovery is important.
Bulk export can be indirectly supported if a user is able to iterate over all of the data stored in the repository.
Authentication verifies the identity of a user. The user presents credentials
to the repository and the repository checks the credentials. The credentials
should be communicated through a secure channel. Authentication support might
be provided by an external system.
Access control is the capacity to permit or deny actions.
An action might be reading data from an object or removing an object.
Access control based on user authentication is called authorization.
Role-based access control (http://csrc.nist.gov/rbac) is one way
of implementing authorization.
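As a rough illustration of role-based authorization, permissions can be attached to roles and roles to users, so that a check never names a user's permissions directly. The role names, user names, and `is_permitted` function below are invented for this sketch.

```python
# Hypothetical policy: which actions each role may perform.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "editor": {"read", "write"},
    "admin": {"read", "write", "remove"},
}

# Hypothetical assignment of roles to authenticated users.
USER_ROLES = {
    "alice": {"editor"},
    "bob": {"reader"},
}


def is_permitted(user, action):
    """Permit the action if any of the user's roles grants it."""
    return any(
        action in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

With this policy, `is_permitted("alice", "write")` is true, while `is_permitted("bob", "write")` and any check for an unknown user are false.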
User management is the ability to add, remove, and modify user records
stored in the repository. User management might not be a programmatic feature
of the repository and could be part of an external system.
The API may provide the ability to change the security policy. For example, a
user might be able to change the permissions associated with an object or
control what actions a group of users can take.
When multiple users have write access to shared data, conflicts can occur.
If two users write to the same data, typically the last write wins and the
user whose changes were overwritten is not told. If a repository supports
locking, a user can lock a piece of data and gain exclusive access.
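Exclusive locking can be sketched as a table that records which user holds each lock and rejects writes from anyone else. This is a single-process illustration that is not thread-safe; `LockTable`, `LockError`, and the method names are hypothetical.

```python
class LockError(Exception):
    """Raised when a user acts on data locked by someone else."""


class LockTable:
    def __init__(self):
        self._owners = {}  # object id -> user holding the lock

    def lock(self, obj_id, user):
        # setdefault stores `user` only if nobody holds the lock yet.
        owner = self._owners.setdefault(obj_id, user)
        if owner != user:
            raise LockError(f"{obj_id!r} is locked by {owner!r}")

    def unlock(self, obj_id, user):
        if self._owners.get(obj_id) != user:
            raise LockError(f"{user!r} does not hold the lock on {obj_id!r}")
        del self._owners[obj_id]

    def check_write(self, obj_id, user):
        # Unlocked data is writable by anyone; locked data only by the owner.
        owner = self._owners.get(obj_id)
        if owner is not None and owner != user:
            raise LockError(f"{obj_id!r} is locked by {owner!r}")


locks = LockTable()
locks.lock("syllabus", "alice")
locks.check_write("syllabus", "alice")    # allowed: alice holds the lock
try:
    locks.check_write("syllabus", "bob")  # rejected instead of silently lost
except LockError as error:
    print(error)
locks.unlock("syllabus", "alice")
```

The rejected write is the point: the second user learns about the conflict instead of overwriting data or being overwritten.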
Virtual object representation
The physical representation of an object is the data it contains.
A virtual representation of an object is data generated dynamically
from the object data. A virtual representation of an image might be
the image in another format.
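The same idea applies to non-image data. As a small illustration using only the Python standard library, the sketch below treats stored CSV text as the physical representation and generates a JSON representation on request; the function name `as_json` and the sample data are invented for the example.

```python
import csv
import io
import json


def as_json(csv_text):
    """Generate a JSON view of stored CSV text without changing what is stored."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)


physical = "name,size\nsyllabus,12\n"   # what the repository actually stores
virtual = as_json(physical)             # produced dynamically on each request
```

Nothing new is stored; the JSON exists only for as long as the caller holds it.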
A transaction is a sequence of updates to the repository.
The updates making up the transaction happen if and only if the transaction
is committed. If the transaction is aborted, none of the updates take place.
This is valuable when a user wants to ensure consistent relationships between
a set of objects. A transaction that updates each object as needed ensures
that the relationships between the objects are consistent at all times.
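The commit-or-abort behavior can be sketched by staging updates separately and applying them only on commit. This is a single-threaded illustration over a plain dictionary, with no durability or isolation guarantees; the `Transaction` class is hypothetical.

```python
class Transaction:
    """Stage updates and apply them to the store only when committed."""

    def __init__(self, store):
        self._store = store   # dict: object id -> data
        self._pending = {}    # staged updates, invisible until commit

    def update(self, obj_id, data):
        self._pending[obj_id] = data

    def commit(self):
        self._store.update(self._pending)  # all staged updates land together
        self._pending.clear()

    def abort(self):
        self._pending.clear()              # none of the updates take place


store = {"course": "links to syllabus-v1", "syllabus": "v1"}

tx = Transaction(store)
tx.update("syllabus", "v2")
tx.update("course", "links to syllabus-v2")
tx.abort()     # store is unchanged

tx = Transaction(store)
tx.update("syllabus", "v2")
tx.update("course", "links to syllabus-v2")
tx.commit()    # both objects change together
```

Because the course and syllabus are updated in one transaction, a reader of `store` in this sketch never sees the course pointing at a syllabus version that does not exist.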
Versioning is the ability to manage multiple revisions of objects.
When a user modifies a versionable object, the previous state of the object
is saved as a revision. Revisions of the same object are linked together.
The API defines how revisions are created, accessed, and removed.
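A minimal sketch of a versionable object keeps every revision in an ordered list. The `VersionedObject` class and its method names are hypothetical, and rolling back by appending a copy is one design choice among several.

```python
class VersionedObject:
    """Keep every revision of an object, oldest first."""

    def __init__(self, data):
        self._revisions = [data]  # the last entry is the current state

    @property
    def current(self):
        return self._revisions[-1]

    def modify(self, data):
        # The previous state stays in the list as an earlier revision.
        self._revisions.append(data)

    def revision(self, number):
        return self._revisions[number]

    def history(self):
        return list(self._revisions)

    def roll_back(self, number):
        # Append a copy of the old revision so that no history is lost.
        self._revisions.append(self._revisions[number])


resource = VersionedObject("draft")
resource.modify("edited by first user")
resource.modify("edited by second user")
resource.roll_back(1)   # restore the first user's edit as a new revision
```

After the rollback, the second user's edit is still available through `history()`, so neither user's work is lost.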
Searching is the selection of data matching certain criteria.
The criteria might be supplied through a query language.
The API provides a means to construct queries, evaluate the queries, and
iterate over the results.
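In place of a full query language, the construct, evaluate, and iterate steps can be shown with simple equality criteria over metadata. The `search` function and the sample objects below are invented for this sketch.

```python
def search(objects, **criteria):
    """Yield (object id, metadata) pairs whose metadata matches every criterion."""
    for obj_id, metadata in objects.items():
        if all(metadata.get(key) == value for key, value in criteria.items()):
            yield obj_id, metadata


objects = {
    "lecture-1": {"course": "biology", "type": "slides"},
    "lecture-2": {"course": "biology", "type": "video"},
    "essay-1": {"course": "history", "type": "text"},
}

# Construct the query as keyword criteria, evaluate it lazily, iterate results.
matches = [obj_id for obj_id, _ in search(objects, course="biology")]
```

Because `search` is a generator, results are produced one at a time, which matters when a query matches a large part of the repository.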