OAI-PMH
General
Rosetta supports the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) for both harvesting and publishing metadata (or, in OAIS terminology, for creating SIPs and DIPs). If your legacy digital repository publishes data in OAI-PMH format, Rosetta can harvest and preserve your data and files. To expose your Rosetta collections to your patrons and the world, use Primo or another OAI-PMH-compliant discovery system to harvest metadata from Rosetta. For more information on the protocol, refer to the OAI-PMH website.
Harvesting
The end-to-end process of harvesting records from an external repository into Rosetta consists of three stages:
- Rosetta harvests the records.
- Rosetta attempts to match the records to existing records in Rosetta and transforms them.
- Based on the matching results, Rosetta generates either a Submission Job folder (for new records) or a Metadata Update folder (for existing records).
Scheduled Submission and/or Metadata Update Jobs run independently of the Harvesting Job.
A description of how to set up a full ingest workflow based on OAI-PMH harvesting (including examples for several common digital repositories) is available here.
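The three harvesting stages listed above can be illustrated with a minimal Python sketch. It is not Rosetta's implementation: the function names, the sample response, and the idea of matching on the OAI identifier alone are assumptions made for illustration, and the metadata transformation step is left out entirely.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_identifiers(list_records_xml):
    """Stage 1 (hypothetical helper): read the identifier of each harvested record."""
    root = ET.fromstring(list_records_xml)
    path = ".//oai:record/oai:header/oai:identifier"
    return [el.text.strip() for el in root.findall(path, OAI_NS) if el.text]


def route_records(identifiers, existing_ids):
    """Stages 2 and 3 (hypothetical helper): match each record, then route it.

    Records already known go to a metadata update; all others go to a submission.
    Matching on the identifier alone is an assumption of this sketch.
    """
    routed = {"submission": [], "metadata_update": []}
    for identifier in identifiers:
        target = "metadata_update" if identifier in existing_ids else "submission"
        routed[target].append(identifier)
    return routed


# Invented ListRecords response, trimmed to the record headers.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids = harvest_identifiers(SAMPLE_RESPONSE)
    print(route_records(ids, existing_ids={"oai:example.org:1"}))
```

In this sketch, a record whose identifier is already known is routed to the metadata update list, and every other record is routed to the submission list.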
Publishing
The Rosetta OAI-PMH server is fully compliant with OAI-PMH requirements and guidelines. Harvesters can connect to this server using the standard OAI-PMH verbs. The base URL of your server is http://{delivery_load_balancer_host:port}/oaiprovider/request. For example:
https://rosetta.exlibrisgroup.com/oaiprovider/request?verb=Identify
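As a rough sketch of how a harvester might build requests against this base URL, the Python below assembles URLs for the standard verbs and follows a resumption token when a list response is split across pages. The helper names and the sample response are assumptions for illustration; `oai_dc` is the metadata prefix that the OAI-PMH specification requires every repository to support, and the metadata formats a given Rosetta server exposes can be checked with the `ListMetadataFormats` verb.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Base URL taken from the example above.
BASE_URL = "https://rosetta.exlibrisgroup.com/oaiprovider/request"


def build_request_url(base_url, verb, params=None):
    """Hypothetical helper: join the base URL, the verb, and optional arguments."""
    query = {"verb": verb}
    query.update(params or {})
    return base_url + "?" + urlencode(query)


def next_page_url(base_url, verb, response_xml):
    """Hypothetical helper: return the URL of the next page, or None when done.

    Per the protocol, resumptionToken is an exclusive argument, so the
    follow-up request carries only the verb and the token.
    """
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    if token is None or not (token.text or "").strip():
        return None
    return build_request_url(base_url, verb, {"resumptionToken": token.text.strip()})


# Invented partial ListRecords response that ends with a resumption token.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_request_url(BASE_URL, "Identify"))
    print(build_request_url(
        BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc", "from": "2020-01-01"}
    ))
    print(next_page_url(BASE_URL, "ListRecords", SAMPLE_PAGE))
```

The resulting URLs can be fetched with any HTTP client, for example `urllib.request.urlopen`. An empty or missing `resumptionToken` element signals that the list is complete.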
Rosetta OAI-PMH publishing is based on the plug-in infrastructure. Staff can either publish through Rosetta’s built-in OAI-PMH server or export OAI-PMH records to a file.
Publishing configuration is institution-based, and contains:
- A set of IEs to be published (incrementally)
- One or more profiles that include processing instructions. The profiles determine:
  - Whether to transform the metadata before publishing (and, if so, how)
  - The publishing target (Rosetta’s built-in OAI-PMH server or a file on the NFS)
Institutions can run any number of publishing configurations. All configurations of all institutions are synchronized by a global system job, but each configuration can also be synchronized by the owning institution manually on demand.