
Ingesting

Introduction

Alma provides built-in functionality to ingest digital resources in bulk or one at a time. In addition, ingests can be prepared outside of Alma using a third-party tool, such as those listed here, and processed by an MD import job when ready.

The image below depicts the flow for ingesting digital materials into Alma.

[Figure: alma-d storage]

When using the tools built into Alma, such as Add Representation and the Digital Uploader, ingests are prepared by Alma. Alternatively, ingests can be prepared outside of Alma and passed off to Alma’s metadata import for processing. Use cases that might benefit from preparing ingests outside of Alma include:

  • Migrating from legacy digital asset management systems
  • Processing the output from digitization projects
  • Providing a custom end-user deposit workflow

This section describes how to prepare an ingest to be processed by Alma.

MD Import Profile

Alma uses metadata import profiles to define how a metadata import job processes files. MD import profiles can be created for different types of imports. For digital materials, a digital metadata import profile is used. For information on how to configure a metadata import profile, see “Resource Management -> Managing Profiles for Record Imports” in the online help.

Several fields from the import profile configuration impact the preparation of ingests outside of Alma.

  • MD import profile ID: used as the name of the upload folder. See information on the ingest folder below.
  • MD file name: the name of the file(s) containing the metadata to be processed
  • Source format type: the metadata format expected.
  • Representations: determines the pattern used to match files to create different representations. For example, if both high resolution and low resolution files are prepared, they can be provided in different directories which can be matched by the representation configuration.

To add digital representations to existing BIB records, make sure your metadata files contain records that will be matched with the existing ones by the MD import process. For information on using matching profiles, see “Resource Management -> Managing Profiles for Record Imports -> Configuring New Import Profiles -> Match Methods- Explanations and Examples” in the online help.

In order to prepare and process ingests outside of Alma, the following APIs may be helpful:

  • MD Import Profile List: Returns a list of import profiles that can be used for digital materials. The list includes the default collection to which BIBs processed by the import job will be added.
  • Run MD Import Job: Runs an import job based on the specified import profile. Useful to kick off an import job on-demand rather than on a scheduled basis.
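
As a rough illustration of how these two APIs might be called from a script, the sketch below only builds the HTTP requests and does not send them. The host name, the endpoint paths, the `op=run` parameter, the empty JSON body, and the `apikey` authorization scheme are assumptions that do not come from this section; verify them against the Alma API reference before use. The function names are hypothetical.

```python
import urllib.parse
import urllib.request

# ASSUMPTIONS: host, paths, the "op=run" parameter, the empty JSON body and the
# "apikey" header scheme are not given in this section; confirm them in the
# Alma API reference.
BASE_URL = "https://api-na.hosted.exlibrisgroup.com/almaws/v1"


def _request(method, path, api_key, params=None, body=None):
    """Build (without sending) an authenticated request object."""
    query = urllib.parse.urlencode(params or {})
    url = f"{BASE_URL}{path}" + (f"?{query}" if query else "")
    headers = {"Authorization": f"apikey {api_key}", "Accept": "application/json"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=body, headers=headers, method=method)


def list_profiles_request(api_key):
    """Request for the MD Import Profile List API (assumed path)."""
    return _request("GET", "/conf/md-import-profiles", api_key)


def run_import_request(api_key, profile_id):
    """Request for the Run MD Import Job API for one profile (assumed path)."""
    path = "/conf/md-import-profiles/" + urllib.parse.quote(str(profile_id), safe="")
    return _request("POST", path, api_key, params={"op": "run"}, body=b"{}")
```

A request built this way could then be sent with `urllib.request.urlopen(request)`, for example right after the .lock file of a finished ingest has been removed.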

Metadata files

Each ingest must contain at least one metadata file and the files which are meant to be ingested into Alma. Records in the metadata files are matched or created in Alma in accordance with the configuration of the MD import profile. The expected metadata filename pattern is also set in the MD import profile configuration.

For Dublin Core, the following conventions are used:

  • A dc.xml file may contain a single DC record in a <record> tag, or one or more DC records wrapped by a <collection> tag.
  • If a single dc.xml file with a single DC record is provided and the file order is of no importance, the files do not need to be referenced. Any files found in the folder will be added to the matched or created BIB.
  • If a single dc.xml file with a single DC record is provided and the file order is of importance, files should be referenced using the dc:identifier property and a file:// prefix. File order will be preserved.
  • If multiple DC records are provided, the files must be enumerated in the metadata, using the dc:identifier property and a file:// prefix.
<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://purl.org/dc/terms/1.1/ http://dublincore.org/schemas/xmls/qdc/2008/02/11/dcterms.xsd http://purl.org/dc/elements/1.1/ http://dublincore.org/schemas/xmls/qdc/2008/02/11/dc.xsd">
    <record>
        <dc:title>Towards paperless information systems</dc:title>
        <dc:creator>Lancaster, Frederick Wilfrid</dc:creator>
        <dc:subject>Information Transfer and Management</dc:subject>
        <dc:subject>theology</dc:subject>
        <dc:description>179 pp.</dc:description>
        <dc:publisher>New York, NY : Academic Press</dc:publisher>
        <dc:date>1978</dc:date>
        <dc:type>Text</dc:type>
        <dc:identifier>file://paperless.pdf</dc:identifier>
        <dc:language>eng</dc:language>
    </record>
    <record>
        <dc:title>Special Collections at the Cusp of the Digital Age: A Credo</dc:title>
        <dc:creator>Lynch, Clifford A.</dc:creator>
        <dc:description>Research Library Issues, no. 267</dc:description>
        <dc:publisher>Association of Research Libraries</dc:publisher>
        <dc:date>December 2009</dc:date>
        <dc:type>Text</dc:type>
        <dc:identifier>file://lynch.pdf</dc:identifier>
        <dc:language>eng</dc:language>
        <dcterms:bibliographicCitation>Clifford A. Lynch, “Special Collections at the Cusp of the Digital Age: A Credo,” Research Library Issues, no. 267 (December 2009).</dcterms:bibliographicCitation>
    </record>
</collection>
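
When the metadata comes from another system, a file like the one above can be generated rather than written by hand. The following is a minimal sketch using Python's standard library; the function names and the `files` key are hypothetical, and only the plain `dc:` namespace is handled (qualified `dcterms:` elements are left out for brevity).

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_collection(records):
    """Return a <collection> element with one <record> per input dict.

    Each dict maps a DC element name (e.g. "title") to a string or a list of
    strings. The special key "files" (hypothetical) lists file paths relative
    to the ingest folder.
    """
    collection = ET.Element("collection")
    for rec in records:
        record = ET.SubElement(collection, "record")
        for name, values in rec.items():
            if name == "files":
                continue
            for value in ([values] if isinstance(values, str) else values):
                ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
        # File references use dc:identifier with a file:// prefix, in list order.
        for path in rec.get("files", []):
            ET.SubElement(record, f"{{{DC_NS}}}identifier").text = f"file://{path}"
    return collection


def write_dc_file(records, path="dc.xml"):
    """Write the collection as UTF-8 XML with an XML declaration."""
    tree = ET.ElementTree(build_dc_collection(records))
    tree.write(path, encoding="UTF-8", xml_declaration=True)
```

For example, `write_dc_file([{"title": "Towards paperless information systems", "files": ["paperless.pdf"]}])` would produce a one-record collection that references paperless.pdf.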

 

For MARCXML, the following conventions are used:

  • A single MARC record in a <record> tag, or one or more MARC records wrapped by a <collection> tag.
  • If a single MARCXML file with a single MARC record is provided, the files do not need to be referenced. Any files found in the folder will be added to the matched or created BIB. The following MARCXML file will create a BIB with all files found in the folder:
<?xml version="1.0" encoding="UTF-8" ?>
<collection>
   <record>
      <leader>     aas          a     </leader>
      <datafield tag="100" ind1="1" ind2=" ">
        <subfield code="a">AUTHOR</subfield>
      </datafield>
      <datafield tag="245" ind1="1" ind2="2">
        <subfield code="a">TITLE</subfield>
      </datafield>
      <datafield tag="260" ind1=" " ind2=" ">
        <subfield code="c">DATE</subfield>
      </datafield>
   </record>
 </collection>
  • If multiple MARC records are provided, the files must be enumerated in the metadata. The field and subfield can be configured in the MD import profile configuration. For example:
<collection>
   <record>
      <leader>     aas          a     </leader>
      <datafield tag="100" ind1="1" ind2=" ">
        <subfield code="a">AUTHOR1</subfield>
      </datafield>
      <datafield tag="245" ind1="1" ind2="2">
        <subfield code="a">TITLE1</subfield>
      </datafield>
      <datafield tag="260" ind1=" " ind2=" ">
        <subfield code="c">DATE1</subfield>
      </datafield>
      <datafield tag="856" ind1=" " ind2=" ">
        <subfield code="u">TITLE1/image.jpg</subfield>
      </datafield>
   </record>
   <record>
      <leader>     aas          a     </leader>
      <datafield tag="100" ind1="1" ind2=" ">
        <subfield code="a">AUTHOR2</subfield>
      </datafield>
      <datafield tag="245" ind1="1" ind2="2">
        <subfield code="a">TITLE2</subfield>
      </datafield>
      <datafield tag="260" ind1=" " ind2=" ">
        <subfield code="c">DATE2</subfield>
      </datafield>
      <datafield tag="856" ind1=" " ind2=" ">
        <subfield code="u">TITLE2/image.jpg</subfield>
      </datafield>
   </record>
 </collection>

Note that UTF-8 encoding is expected.
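
Before submitting a multi-record MARCXML file, it can help to check that every record actually references at least one file. The sketch below is one possible pre-flight check, assuming un-namespaced MARCXML as in the examples above and the 856 $u field/subfield pair used there; the real pair is whatever the MD import profile is configured to use, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def file_references(marcxml_text, tag="856", code="u"):
    """Return one list of referenced file paths per <record>.

    Accepts either a bare <record> or a <collection> of records.
    """
    root = ET.fromstring(marcxml_text)
    records = [root] if root.tag == "record" else root.findall("record")
    xpath = f"datafield[@tag='{tag}']/subfield[@code='{code}']"
    return [[sf.text for sf in record.findall(xpath) if sf.text] for record in records]


def records_missing_files(marcxml_text, tag="856", code="u"):
    """Return indexes of records that reference no file.

    A file with a single record is never reported, because in that case all
    files found in the folder are attached to the BIB.
    """
    refs = file_references(marcxml_text, tag, code)
    if len(refs) <= 1:
        return []
    return [index for index, paths in enumerate(refs) if not paths]
```

An empty list from `records_missing_files` means every record in a multi-record file enumerates its files.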

Alternatively, a CSV file can be submitted. Mapping depends on the target format (DC or MARCXML). The following table lists the supported fields and how they are mapped.

| Source CSV field | Target mapping | Notes |
|---|---|---|
| group_id | No mapping - functional field for grouping representations together under the same BIB | |
| Collection fields | | |
| collection_name (R) | Assign to collection by collection name | |
| collection_id (R) | Assign to collection by collection ID | |
| collection_external (R) | Assign to collection by collection external system and ID, formatted as (system)ID | |
| BIB fields (MARC21 / DC) | | |
| mms_id (NR) | No mapping - for matching purposes only | |
| originating_system_id | 035 ##$a / dc:identifier | |
| contributor | 700 ##$a / dc:contributor | |
| coverage | 651 #4$a / dc:coverage | |
| creator | 100 1#$a / dc:creator | MARC: NR |
| date | 008/07-10, 264  #0c / dc:date | If null, current date is used; MARC: NR |
| description | 500 ##$a / dc:description | |
| format | 340 ##$a / dc:format | |
| identifier | 024 8#$a / dc:identifier | DC: Match existing BIB record using alma:{INST_CODE}/bibs/{MMS_ID} syntax |
| ISBN | 020 ##$a / dc:identifier xsi:type="dcterms:URI" | MARC: NR. DC: 'urn:ISBN:' prefix is added |
| ISSN | 022 ##$a / dc:identifier xsi:type="dcterms:URI" | MARC: NR. DC: 'urn:ISSN:' prefix is added |
| language | 008/35-37, 041 ##$a / dc:language | MARC: Mandatory, NR. DC: Recommended; use ISO-639-2/3 codes |
| publisher | 264 ##$b / dc:publisher | |
| relation | 530 ##$a / dc:relation | DC: Assign to collection using alma:{INST_CODE}/bibs/collections/{COLLECTION_ID} syntax |
| rights | 506 ##$a / dc:rights | |
| source | 786 0#$a / dc:source | |
| subject | 650 #4$a / dc:subject | |
| title | 245 00$a / dc:title | MARC: if creator exists, mapped to 245 10$a |
| type | Leader06, Leader07 / dc:type | MARC: Mandatory, material type controlled list ('Book', 'Map', etc.). If null, uses “mixed material”. DC: use DCMI types |
| any other field (with no reserved prefix) | 500 ##$a / no DC support | MARC: mapped as key:value |
| Representation fields | | |
| rep_label (NR) | Label | |
| rep_public_note (NR) | Public Note | |
| rep_access_rights (NR) | AR Policy Name | Default can be set in MD import profile |
| rep_usage_type (NR) | Usage Type | Master or Derivative; default can be set in MD import profile |
| rep_library (NR) | Library | Default can be set in MD import profile |
| rep_note (R) | Note | |
| any other field with rep_ prefix | Note | Mapped as key:value |
| File fields | | |
| file_name_{1…n} (NR) | File name with relative path to ingest folder | |
| file_label_{1…n} (NR) | Label | If not provided, filename without extension is used |

All fields are optional, except where otherwise noted.

Only one CSV file per ingest should be submitted. A CSV template is available for downloading from here.
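
The sketch below shows one way to produce such a CSV file from Python dictionaries. It assumes that the column headers are exactly the source field names from the table above, which should be confirmed against the downloadable template. It does not address how repeatable (R) fields are expressed. The sample values and the `build_csv` helper are hypothetical.

```python
import csv
import io


def build_csv(rows):
    """Serialize row dicts to CSV text with a single header row.

    The header is the union of keys across all rows, in first-seen order.
    Values missing from a row are written as empty cells.
    """
    header = []
    for row in rows:
        for key in row:
            if key not in header:
                header.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample: two BIBs, the second with two files.
rows = [
    {"group_id": "1", "title": "TITLE1", "creator": "AUTHOR1",
     "rep_usage_type": "Master", "file_name_1": "TITLE1/image.jpg"},
    {"group_id": "2", "title": "TITLE2", "creator": "AUTHOR2",
     "rep_usage_type": "Master", "file_name_1": "TITLE2/page1.jpg",
     "file_name_2": "TITLE2/page2.jpg"},
]
csv_text = build_csv(rows)
```

When saving the result, open the output file with `encoding="utf-8"`, since UTF-8 encoding is expected for metadata files.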

Ingest folder

Each ingest is prepared in a separate folder. The directory structure for ingest folders is as follows:

INSTITUTION_CODE/upload/MD_IMPORT_PROFILE_ID/INGEST_ID
  • INSTITUTION_CODE: The code of the institution, for example 01UNI_INST
  • upload: Hardcoded for the upload folder
  • MD_IMPORT_PROFILE_ID: The ID of the relevant MD import profile. It can be retrieved from the Import Profile UI (by clicking the 'i' icon in the upper right corner of the screen) or by using the MD Import Profile List API (see above)
  • INGEST_ID: A random unique identifier for the ingest. This folder name has no significance

Alma will process files in any subfolder within the ingest folder, but the metadata file must be in the root of the ingest folder.

While preparing the ingest, a .lock file should be placed in the root of the ingest folder. This will indicate to Alma that the ingest is not ready to be processed. When ready, the .lock file should be removed. The next time the MD import job is run, the ingest will be processed by Alma.
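
The folder layout and the .lock convention can be sketched as follows. This example stages an ingest in a local directory tree only; transferring that tree to Alma's storage is outside its scope, and the `stage_ingest` function and its parameters are hypothetical. With remote storage, the same ordering would presumably apply: create .lock first, upload everything else, and delete .lock last.

```python
import shutil
import uuid
from pathlib import Path


def stage_ingest(root, institution_code, profile_id, metadata_file, files):
    """Stage one ingest under root and return the ingest folder.

    files maps each source path to a destination path relative to the
    ingest folder (subfolders are allowed for content files).
    """
    # INGEST_ID is an arbitrary unique name; a random UUID is used here.
    ingest_dir = (Path(root) / institution_code / "upload"
                  / str(profile_id) / str(uuid.uuid4()))
    ingest_dir.mkdir(parents=True)

    lock = ingest_dir / ".lock"
    lock.touch()  # marks the ingest as not ready to be processed

    for source, relative_dest in files.items():
        target = ingest_dir / relative_dest
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)

    # The metadata file must sit in the root of the ingest folder.
    shutil.copy2(metadata_file, ingest_dir / Path(metadata_file).name)

    lock.unlink()  # ingest is complete; the next MD import job run can process it
    return ingest_dir
```

Removing the .lock file is the last step on purpose, so that a scheduled import job never sees a half-copied ingest.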

Thumbnails

When creating digital inventory, Alma will automatically attempt to generate thumbnails for most common image, document, presentation and video file formats. 

Customized thumbnails can be provided for any file in the ingest. Images should be in jpg, png, or gif format and not exceed 100K. The naming convention used is the name of the file with a .thumb extension, for example:

upload/991234567890/abcd-efgh-ijkl-mnop/myfile.doc

upload/991234567890/abcd-efgh-ijkl-mnop/myfile.doc.thumb

upload/991234567890/abcd-efgh-ijkl-mnop/data/myfile.ppt

upload/991234567890/abcd-efgh-ijkl-mnop/data/myfile.ppt.thumb
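
The naming rule and the format and size limits can be checked before upload with a small helper such as the one below. This is a sketch with hypothetical function names; it reads "100K" as 100 × 1024 bytes, which is an assumption, and it identifies image formats by their leading bytes rather than by file extension, because the thumbnail itself always ends in .thumb.

```python
from pathlib import Path

MAX_THUMB_BYTES = 100 * 1024  # "100K" interpreted as 100 KiB (assumption)

# Leading bytes that identify the accepted image formats.
SIGNATURES = {
    "jpg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
}


def thumbnail_path(file_path):
    """Return the custom thumbnail path: the full file name plus '.thumb'."""
    path = Path(file_path)
    return path.with_name(path.name + ".thumb")


def check_thumbnail(data):
    """Return a list of problems found in thumbnail bytes (empty if none)."""
    problems = []
    if len(data) > MAX_THUMB_BYTES:
        problems.append("larger than 100K")
    if not any(data.startswith(signature) for signature in SIGNATURES.values()):
        problems.append("not a jpg, png, or gif image")
    return problems
```

For example, `thumbnail_path("data/myfile.ppt")` yields `data/myfile.ppt.thumb`, matching the convention shown above.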