Metadata Update Job

Updating IE Metadata with the Update Metadata Job

You can streamline IE-level descriptive metadata updates from an external source without developing an external application. Instead of submitting UpdateMD SOAP requests, you can simply place the UpdateMD files on a designated NFS location and schedule a job to process them in bulk. The job is conceptually similar to the Submission Job (which consumes a SIP).

The updateMD XML file must contain:

  • A valid IE PID
  • One or more metadata elements containing:
    • A type element (value must be “descriptive” or “source”)
    • A subType element (value must be “dc” or a sourceMD subtype)
    • An mid element (for sourceMD update only)
    • A content element containing a metadata record wrapped in CDATA (metadata will not be stored in CDATA)
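
The structure listed above can be assembled with a small helper. The following is a minimal Python sketch and not part of Rosetta: the function name build_update_md and its parameters are hypothetical, the namespace and element names are copied from the examples at the end of this section, and the element order is an assumption (the two examples order the elements differently).

```python
from typing import Optional
from xml.sax.saxutils import escape

# Namespace copied from the example files in this document.
UPDATE_MD_NS = "http://com/exlibris/digitool/repository/api/xmlbeans"


def build_update_md(pid: str, md_type: str, sub_type: str,
                    record_xml: str, mid: Optional[str] = None) -> str:
    """Return the text of an updateMD file holding one metadata record."""
    if md_type not in ("descriptive", "source"):
        raise ValueError("type must be 'descriptive' or 'source'")
    if md_type == "source" and mid is None:
        raise ValueError("a sourceMD update needs an mid")
    if "]]>" in record_xml:
        # A CDATA section cannot contain its own terminator.
        raise ValueError("record must not contain ']]>'")

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<updateMD xmlns="{UPDATE_MD_NS}">',
        f"<PID>{escape(pid)}</PID>",
        "<metadata>",
    ]
    if mid is not None:
        lines.append(f"<mid>{escape(mid)}</mid>")
    lines += [
        f"<type>{escape(md_type)}</type>",
        f"<subType>{escape(sub_type)}</subType>",
        # The record is wrapped in CDATA so it needs no further escaping.
        f"<content><![CDATA[{record_xml}]]></content>",
        "</metadata>",
        "</updateMD>",
    ]
    return "\n".join(lines) + "\n"
```

For instance, build_update_md("IE3818", "descriptive", "dc", record) returns text that can be written to a file such as IE3818-dc.xml in the job's NFS folder.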

Informative file names are recommended but not required (e.g. IE1234-source-mods-1436.xml). Note: Files submitted by the OAI-PMH Harvester Job have a prefix consisting of the job name, a timestamp, and the IE PID.

The submitted record will overwrite the existing record and generate an appropriate event.

Configuration

Like the Submission Job, the Update Metadata job requires a user name, which is recorded in the modifiedBy field of the IE's Object Characteristics DNX section.

The NFS location is an absolute NFS path and should be set under the operational_shared area. Each job should have its own NFS location. To avoid conflicts with other institutions, it is recommended to create separate folders for each institution, for example:

/operational_shared/md_update/INS01/updateMdJob1
/operational_shared/md_update/INS01/updateMdJob2
/operational_shared/md_update/INS02/updateMdJob1
/operational_shared/md_update/INS02/updateMdJob2

Folders must be created on the NFS prior to job execution.
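
Because the folders must exist before the job runs, they can be created in advance with a short script. This is a minimal Python sketch: the helper name create_job_folders is hypothetical, and the root path, institution codes, and job names simply mirror the sample layout above.

```python
from pathlib import Path


def create_job_folders(root: Path, institution: str, job_names: list[str]) -> list[Path]:
    """Create one folder per job under <root>/<institution>/ and return them."""
    created = []
    for job in job_names:
        folder = root / institution / job
        # parents=True builds the institution folder too; exist_ok makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

A call such as create_job_folders(Path("/operational_shared/md_update"), "INS01", ["updateMdJob1", "updateMdJob2"]) would produce the first two paths of the sample layout.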

If the submitted record is identical to the existing record, no update is performed. The job can be configured to be sensitive to the record field order by clearing the “Ignore differences in field sequence” field.
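
The effect of the “Ignore differences in field sequence” setting can be illustrated with a small comparison. Rosetta's actual comparison logic is not described here; the sketch below is only an illustration for flat records such as dc, and the names records_match and ignore_field_sequence are hypothetical.

```python
import xml.etree.ElementTree as ET


def _fields(record_xml: str) -> list[tuple[str, str]]:
    """Return (tag, text) for each top-level field of a flat record."""
    root = ET.fromstring(record_xml)
    return [(child.tag, (child.text or "").strip()) for child in root]


def records_match(existing: str, submitted: str,
                  ignore_field_sequence: bool = True) -> bool:
    """Compare two flat records, optionally ignoring the order of fields."""
    a, b = _fields(existing), _fields(submitted)
    if ignore_field_sequence:
        # Sorting removes the effect of field order but keeps duplicates.
        return sorted(a) == sorted(b)
    return a == b
```

With the option enabled, two records that differ only in the order of their fields compare as identical, so no update would be expected; with it cleared, the same pair counts as a change.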

Files are processed in order of creation date. If two files are submitted for the same IE metadata, both files are processed, but the later file overwrites the record provided in the earlier file.
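
To preview the order in which a folder's files would likely be picked up, the files can be sorted by timestamp. This is a rough Python sketch with a hypothetical helper name; it uses the modification time as a stand-in for the create date, since creation time is not available portably, so the result may differ from what the job actually does.

```python
from pathlib import Path


def files_in_processing_order(folder: Path) -> list[Path]:
    """List the XML files directly inside a job folder, oldest first."""
    files = [p for p in folder.glob("*.xml") if p.is_file()]
    # st_mtime approximates the create date; the file name breaks ties.
    return sorted(files, key=lambda p: (p.stat().st_mtime, p.name))
```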

Scheduling works the same way as for other scheduled jobs in Rosetta.

Error Handling

Upon execution, the Metadata Update job generates two sub-folders: done (where files are moved if update is successful) and error (if unsuccessful). The following exceptions are handled:

Reason                                                                  Result
User insufficient role                                                  Job aborts
Invalid XML                                                             Error
IE does not exist or deleted                                            Error
IE is still in SIP processing stage                                     Skip; file remains in folder for processing by a future job execution
Unsupported metadata type                                               Error
IE is locked                                                            Error; this is in order to prevent accidental overwriting of changes in progress
Insufficient permission to update IE (IE owned by another institution)  Error
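
Two of the conditions listed above, invalid XML and an unsupported metadata type, can be checked locally before files are placed in the job folder. The following Python sketch is an assumption-laden pre-check and not Rosetta's own validation: the function name precheck is hypothetical, it tests only well-formedness (not schema validity), and the remaining conditions depend on repository state that only Rosetta knows.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the example files in this document.
NS = {"u": "http://com/exlibris/digitool/repository/api/xmlbeans"}
SUPPORTED_TYPES = {"descriptive", "source"}


def precheck(xml_text: str) -> str:
    """Return 'ok' or a short reason the file would likely end up in error."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError:
        return "Invalid XML"
    for md in root.findall("u:metadata", NS):
        md_type = md.findtext("u:type", default="", namespaces=NS).strip()
        if md_type not in SUPPORTED_TYPES:
            return "Unsupported metadata type"
    return "ok"
```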

Logging

A metadata job can complete with one of three statuses:
1. Success
2. Complete with warnings – one or more IEs were not updated because no changes were found
3. Error – one or more IEs could not be updated because of an error (see Error Handling above)

The log states the failure reason and references the relevant updateMD file for each IE that was not updated:

Mon Dec 28 05:44:00 IST 2015   INFO   IE IE3129 has been successfully updated
Mon Dec 28 05:44:01 IST 2015   WARN   IE IE3135 was not updated. Reason: No change in metadata. File name is updateMD-IE23423-dc.xml
Mon Dec 28 05:44:01 IST 2015   ERROR   IE could not be updated, XML is not well-formed or invalid. File name is updateMD-IE32485-dc.xml
Mon Dec 28 05:44:02 IST 2015   WARN   IE IE3138 was not updated. Reason: No change in metadata. File name is updateMD-IE3224-source-marc-234532.xml
Mon Dec 28 05:44:02 IST 2015   INFO   Job completed with Errors
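
A log in this layout can be summarized with a short script, for example to list the files that need attention after a run. The Python sketch below assumes every line follows the sample above (timestamp, level, message, and an optional "File name is ..." suffix); the function name summarize_log is hypothetical, and real logs may contain other line formats, which the sketch skips.

```python
import re
from collections import Counter

LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \w+ \d{4})\s+"
    r"(?P<level>INFO|WARN|ERROR)\s+(?P<message>.*)$"
)
FILE_RE = re.compile(r"File name is (?P<file>\S+)")


def summarize_log(log_text: str):
    """Count lines per level and collect files named in WARN/ERROR lines."""
    counts = Counter()
    problem_files = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the sample layout
        level = match.group("level")
        counts[level] += 1
        file_match = FILE_RE.search(match.group("message"))
        if level in ("WARN", "ERROR") and file_match:
            problem_files.append((level, file_match.group("file")))
    return counts, problem_files
```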

Examples

An example of a dc update file:

<?xml version="1.0" encoding="UTF-8"?>
<updateMD xmlns="http://com/exlibris/digitool/repository/api/xmlbeans">
<PID>IE3818</PID>
<metadata>
<type>descriptive</type>
<subType>dc</subType>
<content>
<![CDATA[<dc:record xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/">
<dc:title>A new title</dc:title>
<dc:description>Lorem ipsum dolor sit amet, consectetur adipiscing elit...</dc:description>
</dc:record>]]>
</content>
</metadata>
</updateMD>

An example of a sourceMD update:

<?xml version="1.0" encoding="UTF-8"?>
<updateMD xmlns="http://com/exlibris/digitool/repository/api/xmlbeans">
<PID>IE3138</PID>
<metadata>
<mid>1830</mid>
<subType>mods</subType>
<type>source</type>
<content>
<![CDATA[<mods:mods xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<mods:name>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
</mods:role>
<mods:namePart>HUSSAIN, JAMAN</mods:namePart>
</mods:name>
<mods:extension>
<mods:dateAccessioned encoding="iso8601">2015-02-16T04:38:59Z</mods:dateAccessioned>
</mods:extension>
<mods:extension>
<mods:dateAvailable encoding="iso8601"/>
</mods:extension>
<mods:originInfo>
<mods:dateIssued encoding="iso8601">2003-02</mods:dateIssued>
</mods:originInfo>
<mods:identifier  type="uri">http://hdl.handle.net/10673/27</mods:identifier>
<mods:language>
<mods:languageTerm authority="rfc3066">en_US</mods:languageTerm>
</mods:language>
<mods:accessCondition  type="useAndReproduction"/>
<mods:subject>
<mods:topic>AUTOMATION</mods:topic>
</mods:subject>
<mods:titleInfo>
<mods:title>AUTOMATION</mods:title>
</mods:titleInfo>
<mods:genre />
</mods:mods>]]></content>
</metadata>
</updateMD>