
A Technology Analysis of Repositories and Services

With funding from the Mellon Foundation, the Sheridan Libraries at Johns Hopkins University has conducted an analysis of repositories and services based on a methodology for connecting user requirements with repository programmatic features. The Sheridan Libraries considered a diverse range of content types and end user services by developing and gathering numerous scenarios from multiple institutions, and collaborating particularly with MIT, UVA, and ProQuest to evaluate DSpace 1.3.2, Fedora 2.0, and Digital Commons. In all cases, we worked with the "out of the box" system and documented APIs.

During the Mellon Foundation's Research and Instructional Technology (RIT) Retreat in 2006, MacKenzie Smith described three aspects of interoperability: semantic, protocol, and functional. This analysis examined the protocol aspects by assessing the existing protocols of JSR-170, DR OSID, and ECL, and the functional aspects by testing the documented APIs from the aforementioned systems that can interface readily with applications.

While the specific results from this analysis are noteworthy, it is worthwhile to affirm the importance of the methodology and the recommendations for next steps. Different audiences often refer to different concepts when using the term "repository." In order to bridge the different perspectives, we proposed a methodology that included scenarios, use cases, and repository features. Our initial idea rested upon the premise that a scenario, an "individual instance of use cases that traverse a specific path using specific data", represents the most accessible description of needs from the end user perspective. Faculty, students, collection managers, etc. can most readily describe what they need to do with various content types in a story format, rather than by defining technical requirements (or speaking the language of developers or programmers).

From these scenarios, we attempted to draw an explicit connection between elements defined in the scenario and specific repository features, which would be mapped to documented APIs. This connection would allow different individuals to understand repository needs in different contexts. For example, an end user might focus on scenarios to identify or articulate particular needs whereas a developer or programmer might focus on the repository features that relate to the scenarios. Initially, we felt that moving from scenarios to use cases to repository features would provide an explicit path for mapping between end user needs and technical specifications. However, our experience over the course of the project led us to alter this approach. We ultimately identified a set of repository features that encompasses a broad range of content types and service requirements, though the connection between the scenarios and repository features is implicit, reflecting the tacit knowledge of the project team gained through this analysis and previous repository-based projects such as the Archive Ingest Handling Test.

The set of repository features was used to conduct the analysis of DSpace, Fedora, and Digital Commons, and the repository API specifications JSR-170, DR OSID, and ECL. It is important to note that our analysis focused on the ability of each of these systems to support specific functionality through documented APIs. Future work should include additional analysis of other means for supporting functionality (e.g., user interface or application based import or access), and of additional systems (e.g., ePrints).


Also see our portal site, which includes our original proposal, interim report, final report, and presentations at CNI and DLF.

Repository interface evaluations

The following is a list of repository interfaces we have evaluated.

See Features for a summary of the repository features.
See ResultsSummary for a summary of the results.
See Related Documents for links to related information.

Our Initial Approach
  • Start with scenarios, or "stories"
  • Record sequence of key events
  • Cluster events across scenarios
  • Develop use cases from event clusters
    • (e.g., fetch a digital object, given an identifier)
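
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's actual tooling: the scenario names, the event labels, and the `min_scenarios` threshold are all hypothetical, and it assumes that events have already been normalized to a shared vocabulary so that identical labels mean the same event.

```python
from collections import defaultdict

# Hypothetical scenarios: each maps a story name to its sequence of key events.
# These names and events are invented for illustration only.
scenarios = {
    "faculty_course_reading": [
        "search collection",
        "fetch object by identifier",
        "annotate object",
    ],
    "student_citation": [
        "fetch object by identifier",
        "export citation",
    ],
    "curator_ingest": [
        "deposit object",
        "assign identifier",
        "fetch object by identifier",
    ],
}


def cluster_events(scenarios):
    """Group identical events and record which scenarios contain each one."""
    clusters = defaultdict(set)
    for name, events in scenarios.items():
        for event in events:
            clusters[event].add(name)
    return clusters


def candidate_use_cases(clusters, min_scenarios=2):
    """Return events that recur in at least `min_scenarios` scenarios."""
    return sorted(
        event for event, names in clusters.items() if len(names) >= min_scenarios
    )


clusters = cluster_events(scenarios)
use_cases = candidate_use_cases(clusters)
print(use_cases)
```

With this sample data, the only event shared across scenarios is "fetch object by identifier", which corresponds to the example use case given in the list above.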

A similar approach is being taken by the JISC-funded ASK project, in which we are a partner. See their repository needs analysis wiki for more information on this related activity.

Applying Use Cases
  • Develop functional requirements from use cases
  • Map capabilities of repositories to these functional requirements
  • Map interface specifications to functional requirements
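
One way to picture this mapping is as a coverage matrix of functional requirements against systems. The sketch below is an assumption about how such a matrix could be represented, not a record of our results: the requirement labels are hypothetical, and `repository_a` and `interface_spec_b` are placeholders rather than any of the systems or specifications we evaluated.

```python
# Hypothetical functional requirements derived from use cases.
requirements = [
    "fetch object by identifier",
    "deposit object",
    "search metadata",
]

# Placeholder capability sets. These are NOT findings about any real
# repository or interface specification.
capabilities = {
    "repository_a": {"fetch object by identifier", "deposit object"},
    "interface_spec_b": {"fetch object by identifier", "search metadata"},
}


def coverage_matrix(requirements, capabilities):
    """Map each system to a {requirement: supported?} row."""
    return {
        system: {req: req in supported for req in requirements}
        for system, supported in capabilities.items()
    }


def unmet_requirements(matrix):
    """List, per system, the requirements it does not support."""
    return {
        system: [req for req, ok in row.items() if not ok]
        for system, row in matrix.items()
    }


matrix = coverage_matrix(requirements, capabilities)
gaps = unmet_requirements(matrix)
for system, missing in gaps.items():
    print(system, "is missing:", missing)
```

Keeping repositories and interface specifications in the same structure makes it possible to compare both kinds of system against one list of requirements.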




  1. Anonymous

    Are you considering adding CDSware to your list of repository applications?
    Best regards,

    1. Jean-Blaise,

      Thank you for submitting your question about our analysis. We have not included CDSware in our repository list. We don't anticipate adding additional repositories in this phase of our analysis, but we welcome the chance to learn more about other repositories and applications. Could you provide a reference or website that describes CDSware in more detail?


      1. Anonymous

        CDS ware can be found at the CERN in Switzerland.

        more info on:, or look at the repository the CERN built with it on

        Wichor Bramer

  2. Anonymous

    Like the above post I was curious to see if you included VDC, Virtual Data Centre, which could be seen as a repository application for datasets, at least social science quantitative datasets. I see you are not adding new ones but you might want to consider if it fits within your scope? Developed by Harvard-MIT Datacentre, at .

    Another similar application used by many European Data Archives, but commercial rather than free, is Nesstar,
    a subsidiary of the UK Data Archive and the Norwegian Social Science Data Services.

    I see you include e-learning, are research datasets within your scope? Both seem equally distant from the published paper.

    Good luck with the project,
    Robin Rice, R.Rice @
    Edinburgh University Data Library

    1. Hello Robin,

      Thanks for posting your comment. We have heard of the VDC, but have not included it in this current phase of our analysis. I have heard and read good things about VDC, so please don't assume we're not interested in it! We simply had to choose a few repositories at this point to manage the scope and tractability of our analysis. We believe that our current work can be extended, including the addition of other repository software.

      I have not heard about Nesstar, so I very much appreciate the reference and pointer. As for datasets, we are considering them primarily through our interactions with the Virtual Observatory Project. We have communicated often with the PI of this project, Alex Szalay at Johns Hopkins, about research datasets and possible connections to electronic publications. We would welcome other examples and cases of dataset repository topics or issues to consider.