
See Features for background on the evaluation.

Introduction

Fedora is an open source, web-services-based
framework for managing and delivering digital content. Fedora can be
used as a component in the creation of applications such as institutional
repositories and course management systems.

Sites using Fedora include http://www.encyclopedia.chicagohistory.org and
http://www.lib.virginia.edu/digital/collections/image.

This evaluation is based on Fedora 2.0.

Technology

Fedora is written in Java and runs as a web application in Tomcat.
Fedora provides SOAP and REST interfaces for access, management, and searching.
Fedora also provides graphical and command-line tools built on top of the web
services. The storage layer uses an RDBMS and the file system.

Fedora supports FOXML (an XML format specific to Fedora) and a METS extension
as object ingest and export formats.

Fedora also comes with optional local web services for XSLT, PDF
transformation, and image manipulation. The REST API provides a subset
of the methods in the SOAP API. The REST API methods can return both XML
and HTML.

See http://www.arxiv.org/abs/cs.DL/0501012 for detailed information on
Fedora's architecture.

Data Model

Fedora has three types of digital objects: data, behavior definition,
and behavior mechanism. See
http://www.fedora.info/download/2.0/userdocs/digitalobjects/objectModel.html.

Data objects contain an ID, Dublin Core metadata, relationship metadata,
an audit trail, datastreams, and disseminators.
The relationship metadata of an object uses RDF/XML to make assertions about
object relationships.
A datastream represents a piece of content like an image. The content
may be stored in the repository or externally.

Disseminators define virtual representations or disseminations of objects.
A dissemination is a view of an object produced by a web service which takes
as input one or more datastreams of the object.
A disseminator is specified by a behavior definition object and a behavior
mechanism object. Behavior definition objects are abstract
descriptions of web services. Each object describes a set of methods, each
of which operates on one or more datastreams.
Behavior mechanism objects are concrete implementations of web services.
Behavior mechanism objects define a data contract. The data contract specifies
what kinds of datastreams an object must have in order to be operated on by
the behavior mechanism object.

Features

Storage

Add data

Objects are created by calling the ingest method of the management
API. The method takes an XML document in FOXML or a Fedora METS format
as input.

The ingest method returns an identifier for the object.
The identifier may be set in the XML or generated by the repository.

A Fedora PID is the unique persistent identifier assigned to every object.
A PID consists of a namespace and an identifier string.
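
As a minimal sketch of that structure, the helper below splits a PID into its
two parts. It assumes the namespace and the identifier string are separated by
a colon, as in "demo:5"; the function name split_pid and the example PID are
illustrative and are not part of Fedora.

```python
def split_pid(pid: str) -> tuple[str, str]:
    """Split a PID such as 'demo:5' into (namespace, identifier)."""
    # Assumption: the first colon separates the namespace from the identifier.
    namespace, sep, identifier = pid.partition(":")
    if not sep or not namespace or not identifier:
        raise ValueError(f"not a namespace:identifier PID: {pid!r}")
    return namespace, identifier


namespace, identifier = split_pid("demo:5")
print(namespace, identifier)
```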

Datastreams are added with the addDatastream method.
The datastream contents may be stored in the repository or externally.

Access data

An object's datastreams are accessed by calling getDatastreamDissemination
with an object PID and a datastream ID.
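
Through the REST interface this call reduces to fetching a URL. The sketch
below only builds that URL. It assumes the REST access interface exposes
datastreams at <base>/get/<PID>/<datastream ID>; the base URL, the PID, and the
datastream ID shown are hypothetical, and datastream_url is an illustrative
helper, so check the pattern against the installed release before relying on it.

```python
from urllib.parse import quote


def datastream_url(base_url: str, pid: str, ds_id: str) -> str:
    """Build a REST-style URL for one datastream of one object."""
    # Keep the colon in the PID readable; escape anything else that is
    # unsafe inside a single path segment.
    pid_part = quote(pid, safe=":")
    ds_part = quote(ds_id, safe="")
    # Assumed URL pattern: <base>/get/<PID>/<datastream ID>
    return f"{base_url.rstrip('/')}/get/{pid_part}/{ds_part}"


print(datastream_url("http://localhost:8080/fedora", "demo:5", "DS1"))
```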

Remove data

Objects are permanently removed by calling purgeObject.

The purgeObject method takes a force argument. The force argument determines
whether or not referential integrity is enforced.

Datastreams are removed with the purgeDatastream method. A force parameter
specifies whether or not to force the change if it would break a data contract.

Manage metadata

Object metadata is modified with modifyObject.

Other object information can be accessed with
getObjectProfile, listMethods, and listDatastreams.

Datastream metadata is modified with modifyDatastreamByReference
and modifyDatastreamByValue. The former can change the location a datastream
points to and the latter can change the actual content of the datastream.
Both can change the metadata associated with a datastream.

Aggregation

Fedora has a flexible data model and can aggregate related content in a
number of ways. Consider using an object to represent the collection and an
object for each child. The collection needs to know what members it contains
and each member needs to know its collection. These "is member of" and
"has member" relationships are supported in the Fedora relationship ontology
and can be stored in the relationships metadata of an object.

Object relationships are indexed and can be searched on.
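
To make the relationship metadata concrete, the sketch below generates an
RDF/XML fragment asserting that one object is a member of a collection. The
relationship namespace URI, the info:fedora/ URI prefix, and both PIDs are
assumptions made for illustration; membership_rdf is a hypothetical helper,
not a Fedora API.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: namespace of the Fedora relationship ontology.
REL_NS = "info:fedora/fedora-system:def/relations-external#"


def membership_rdf(member_pid: str, collection_pid: str) -> str:
    """Return RDF/XML stating that member_pid 'is member of' collection_pid."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("rel", REL_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # The subject of the assertion is the member object.
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{member_pid}"},
    )
    # The object of the assertion is the collection.
    ET.SubElement(
        description,
        f"{{{REL_NS}}}isMemberOf",
        {f"{{{RDF_NS}}}resource": f"info:fedora/{collection_pid}"},
    )
    return ET.tostring(root, encoding="unicode")


print(membership_rdf("demo:5", "demo:collection1"))
```

The inverse "has member" assertion on the collection object would follow the
same shape with the subject and object swapped.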

Create aggregation

An aggregation object is created with the ingest method of the management API
just like a normal object.

Remove aggregation

Objects representing aggregations and aggregation members can be removed
normally with purgeObject. If objects may be members of more than one
collection, then removing a member requires checking to make sure the
object is not also a member of another collection. If collections
may be nested, then removing a collection may require updating the
relationship metadata of a parent collection.
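
The membership check described above can be sketched over an in-memory copy of
the relationship data. The dictionary layout and the name can_purge_member are
illustrative assumptions; in practice the memberships would come from the
relationship metadata or the resource index.

```python
def can_purge_member(member: str, removing_from: str,
                     memberships: dict[str, set[str]]) -> bool:
    """True if the member belongs to no collection other than removing_from."""
    # memberships maps a member PID to the set of collection PIDs it is in.
    return memberships.get(member, set()) <= {removing_from}


# Hypothetical data: demo:6 is shared by two collections.
memberships = {
    "demo:5": {"demo:collection1"},
    "demo:6": {"demo:collection1", "demo:collection2"},
}
print(can_purge_member("demo:5", "demo:collection1", memberships))
print(can_purge_member("demo:6", "demo:collection1", memberships))
```

An object that fails the check would only have its relationship to the removed
collection deleted rather than being purged.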

Change aggregation membership

Aggregation members are added and removed by modifying the relationship
metadata of the objects.

Find aggregation members

The resource index search API supports searching on the relationship metadata.
This can be used to find all the members of an aggregation and the aggregations
an object belongs to.
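
A member lookup might be expressed as a query URL like the one built below.
This is a sketch under several assumptions: that the search service lives at
<base>/risearch, that it accepts type, lang, format, and query parameters, and
that the iTQL query and relationship URI shown match the installed resource
index. The base URL and collection PID are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: URI of the "is member of" relationship.
IS_MEMBER_OF = "info:fedora/fedora-system:def/relations-external#isMemberOf"


def member_query_url(base_url: str, collection_pid: str) -> str:
    """Build a resource index query URL listing a collection's members."""
    # iTQL query: every $member that asserts membership in the collection.
    query = (
        "select $member from <#ri> "
        f"where $member <{IS_MEMBER_OF}> <info:fedora/{collection_pid}>"
    )
    params = {"type": "tuples", "lang": "itql", "format": "CSV", "query": query}
    return f"{base_url.rstrip('/')}/risearch?{urlencode(params)}"


print(member_query_url("http://localhost:8080/fedora", "demo:collection1"))
```

Swapping the positions of the variable and the fixed URI in the query would
instead list the collections a given object belongs to.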

Management

Bulk ingest

There is a command-line tool, fedora-batch-ingest, and a GUI tool, fedora-admin.
Both use the same code, which is built on top of the management API.
The bulk ingest tools take a directory of Fedora METS or FOXML files as input.
Fedora has tools for producing the XML input files from templates.

The bulk ingest produces a map file which maps each ingested file to a PID.
The work done is not lost when an error occurs, but there appears to be no
built-in way to resume an ingest. Error recovery could easily be added to the
tool.
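
One way such recovery could work is to skip every input file that the map file
already lists. The sketch below shows only that filtering step; reading the
map file is left out because its format is not described here, and
pending_files and the file names are hypothetical.

```python
def pending_files(input_files: list[str], already_ingested: set[str]) -> list[str]:
    """Return the input files that still need ingesting, in original order."""
    # already_ingested holds the file names recorded in the map file
    # of the interrupted run.
    return [name for name in input_files if name not in already_ingested]


# Hypothetical run: the first two of three files were ingested before the error.
remaining = pending_files(
    ["obj1.xml", "obj2.xml", "obj3.xml"],
    {"obj1.xml", "obj2.xml"},
)
print(remaining)
```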

Bulk export

Both the command-line tool, fedora-export, and the GUI tool, fedora-admin,
can export all of the objects in a repository to the file system.
The objects can be written out in FOXML or Fedora METS.

Security

Release 1.2 includes a simple form of access control to provide access
restrictions based on IP address. IP range restriction is supported in both
the Management and Access APIs. In addition, the Management API is protected
by HTTP Basic Authentication.

Note that Fedora 2.1 will have significantly enhanced security features.
See http://www.fedora.info/download/2.1b/userdocs/server/security/securingrepo.html

Authentication

A user is either the administrator or an anonymous user. Anonymous users are
restricted to the access API and are not authenticated. Administrators are
authenticated by Tomcat with HTTP basic authentication and can use the management API.

Access control

The administrator can use the management API. Anonymous users
are restricted to the access API. In addition, each API can be restricted by
IP address.

User management

A user is either the administrator or an anonymous user.
Users are managed by editing the configuration file fedora.fcfg.

Policy management

Policies are changed by editing the configuration file fedora.fcfg.

Other

Locking

There is no support for locking.

Virtual object representation

Virtual representations are obtained by calling the getDissemination method
of the access API. The method takes a data object PID, a behavior definition
PID, a method name, an array of parameters, and a timestamp as arguments.
The timestamp indicates which revision of the data object the dissemination
is produced from. The view is returned as a MIME-typed byte stream.
The data object must have a disseminator with the indicated behavior
definition.
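
The arguments map naturally onto a REST-style URL. The sketch below assembles
one, assuming the pattern
<base>/get/<PID>/<behavior definition PID>/<method>[/<timestamp>][?parameters];
the base URL, PIDs, method name, parameter, and timestamp format are all
hypothetical, and dissemination_url is an illustrative helper.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def dissemination_url(base_url: str, pid: str, bdef_pid: str, method: str,
                      params: Optional[dict[str, str]] = None,
                      as_of: Optional[str] = None) -> str:
    """Build a REST-style URL for one dissemination of a data object."""
    parts = [quote(pid, safe=":"), quote(bdef_pid, safe=":"), quote(method, safe="")]
    if as_of is not None:
        # Optional timestamp selecting a revision of the data object.
        parts.append(quote(as_of, safe=":"))
    url = f"{base_url.rstrip('/')}/get/" + "/".join(parts)
    if params:
        # Method parameters travel in the query string.
        url += "?" + urlencode(params)
    return url


print(dissemination_url("http://localhost:8080/fedora", "demo:5", "demo:1",
                        "getThumbnail", params={"width": "100"}))
```

Leaving out as_of requests the current revision in this sketch.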

Virtual representations are created by adding new behavior definition and
behavior mechanism objects to the repository.

Behavior mechanism and definition objects can be added, removed, and queried
just like data objects. A disseminator is a behavior definition/mechanism pair
which indicates that a data object supports a certain set of views.
Disseminators can be added and removed from data objects.

Transactions

There is no support for transactions.

Versioning

Every change made to an object's datastream creates a new revision of the
object. Revisions are timestamped and accessed through that timestamp.

Methods like getDatastreamDissemination and getDissemination have an optional
timestamp argument which can be used to select a certain revision of an object.

The purgeObject, purgeDatastream, and purgeDisseminator methods can
remove old revisions.

Searching

Searching happens through two APIs, the access API and the
resource index search API.

Both the REST and SOAP access APIs have a findObjects method.
The method returns all objects matching a set of criteria.
The criteria consist of Dublin Core metadata elements and
other required elements such as label, PID, and state.
The SOAP API returns results as a Fedora-defined search result type.
The REST API returns results as HTML or XML.
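
A REST search request might be built as follows. This is a sketch that assumes
a <base>/search endpoint taking a terms parameter, a maxResults parameter, one
<field>=true flag per result field wanted, and xml=true to select XML output;
these parameter names, the base URL, and the search term are assumptions to
verify against the installed release.

```python
from urllib.parse import urlencode


def find_objects_url(base_url: str, terms: str,
                     fields: tuple[str, ...] = ("pid", "title"),
                     max_results: int = 20, as_xml: bool = True) -> str:
    """Build a REST-style findObjects URL for a simple keyword search."""
    params = [("terms", terms), ("maxResults", str(max_results))]
    # Each requested result field is switched on with <field>=true.
    params += [(field, "true") for field in fields]
    if as_xml:
        # Without this flag the results are assumed to come back as HTML.
        params.append(("xml", "true"))
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


print(find_objects_url("http://localhost:8080/fedora", "chicago"))
```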

The resource index search API supports searching on the relationship metadata
which can be defined for each object.
The following RDF query languages are supported: SPO, iTQL, and RDQL.
Search results can be returned in a few different formats such as XML and CSV.
See
http://www.fedora.info/download/2.0/userdocs/server/webservices/risearch for
detailed information.
