Tech Blog

SharePoint 2010 to Primo

NUI Galway Taught Thesis Submission (SharePoint 2010) and Discovery (Exlibris Primo) – http://library.nuigalway.ie

ATOM2OAI-PMH: an OAI-PMH server that acts as a real-time SharePoint 2010 to Primo data connector

SharePoint 2010 as a document repository with OAI-PMH provider support (using REST, oData, OAI-PMH, PHP, XML, XSLT, cURL).

I recently created a system to manage the submission, storage, approval, and discovery of taught thesis documents. I used SharePoint 2010 as the document repository and Exlibris Primo as the discovery tool.
Primo is already our discovery tool, so that part was an easy choice.
The decision to use SharePoint was more challenging. In the end SharePoint was chosen because it provides (nearly) out-of-the-box approval workflows, document libraries, and unique identifiers, and because central IT already provides its storage, backups, secure authentication, and remote access (in the form of UAG 2010).
The final solution looked something like this:
[Figure: OAI-PMH server connecting SharePoint to Primo]

Primo needs to be able to talk to SharePoint to stay informed of changes (additions, modifications, deletions).
By default, Primo does not speak to SharePoint. The two systems use different standards: SharePoint exposes its data as ATOM XML, while Primo expects OAI-PMH.
I created a connector to let Primo communicate with SharePoint, and named it ATOM2OAI-PMH.

ATOM2OAI-PMH

Built using PHP, together with XML, XSLT, cURL, and the SharePoint REST API (oData).
Uses the ATOM and OAI-PMH standards.
Assumes oai:dc.
Two OAI-PMH verbs are supported: Identify and ListRecords.
Primo harvests from this OAI-PMH server every day, and each harvest reads records from the SharePoint ATOM interface in real time.
The server returns the records that fall within the requested date range, including new, modified, and deleted records.
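
To illustrate the two-verb design, here is a minimal sketch of how a request could be routed. The function name `dispatchVerb` and the returned labels are assumptions made for this example and are not taken from the connector's actual code; `badVerb` is the error code that the OAI-PMH specification defines for a missing or unsupported verb.

```php
<?php
// Hypothetical sketch: route an OAI-PMH request to one of the two supported verbs.
// Function and label names are illustrative, not the original connector's code.

function dispatchVerb(array $query): string
{
    $verb = $query['verb'] ?? '';

    switch ($verb) {
        case 'Identify':
            return 'identify';      // describe the repository
        case 'ListRecords':
            return 'listRecords';   // harvest records within a date range
        default:
            // OAI-PMH error code for a missing or unsupported verb.
            return 'badVerb';
    }
}

// In a web context this would typically be called as dispatchVerb($_GET).
echo dispatchVerb(['verb' => 'ListRecords']), "\n";
```

Verbs such as GetRecord or ListIdentifiers would fall through to `badVerb` in this sketch, which matches the statement that only two verbs are supported.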

The data provider is exposed as SERVER/?verb=ListRecords&metadataPrefix=oai:dc&from=DATE&until=DATE
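
The `from` and `until` dates have to be turned into a query against the SharePoint REST API. A minimal sketch of one way to do that follows. It assumes day-granularity dates (YYYY-MM-DD), filtering on the `Modified` field, and a hypothetical service root and list name (`Theses`); none of these are confirmed by the connector's code. The `datetime'...'` literal is the OData date syntax used by SharePoint 2010's ListData.svc. The restriction to approved and deleted records is left out because the field names involved are not given here.

```php
<?php
// Hypothetical sketch: map OAI-PMH from/until dates to an OData $filter URL.
// The service root, list name, and choice of the Modified field are assumptions.

function buildODataUrl(string $serviceRoot, string $list, string $from, string $until): string
{
    // Accept only well-formed YYYY-MM-DD dates.
    foreach ([$from, $until] as $date) {
        $parsed = DateTime::createFromFormat('!Y-m-d', $date);
        if ($parsed === false || $parsed->format('Y-m-d') !== $date) {
            throw new InvalidArgumentException("Bad date: $date");
        }
    }

    // Cover the whole of the first and last day in the range.
    $filter = sprintf(
        "Modified ge datetime'%sT00:00:00' and Modified le datetime'%sT23:59:59'",
        $from,
        $until
    );

    return rtrim($serviceRoot, '/') . '/' . rawurlencode($list)
        . '?$filter=' . rawurlencode($filter);
}

echo buildODataUrl(
    'http://sharepoint.example/_vti_bin/ListData.svc',
    'Theses',
    '2014-07-01',
    '2014-07-11'
), "\n";
```

Validating the dates before building the filter keeps malformed harvester input out of the SharePoint query string.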

Overview

First, records are read from the SharePoint oData API (only approved and deleted records are shown).
Then this data is cleaned up to cut out anything that breaks the transform (xmlns declarations, atom references, odd characters – all done in PHP).
Finally, the cleaned data is transformed to OAI-PMH XML and returned to Primo.

All this happens on the fly in real-time.
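
The read and clean-up steps could be sketched roughly as follows. This is an assumption about what the clean-up involves: it removes the default Atom namespace declaration (so that an XSLT 1.0 stylesheet can match unprefixed `feed` and `entry` elements) and strips control characters that are not legal in XML 1.0. The function names `fetchFeed` and `cleanFeed` are hypothetical, and authentication is omitted because it depends on the SharePoint configuration.

```php
<?php
// Hypothetical sketch of the "read" and "clean" steps. Names are illustrative;
// the original connector's code is not shown in this post.

// Fetch the ATOM feed with cURL (authentication deliberately left out).
function fetchFeed(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Accept: application/atom+xml']);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Could not read the SharePoint feed');
    }
    return $body;
}

// Remove the parts of the feed that get in the way of the transform.
function cleanFeed(string $xml): string
{
    // Drop only the default Atom namespace; the prefixed d: and m:
    // declarations are still needed by the stylesheet.
    $xml = str_replace(' xmlns="http://www.w3.org/2005/Atom"', '', $xml);

    // Strip control characters that XML 1.0 forbids (tab, LF and CR are kept).
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $xml) ?? $xml;
}

// Example with an inline feed instead of a live SharePoint call.
$sample = '<feed xmlns="http://www.w3.org/2005/Atom" xmlns:d="urn:example"><entry/></feed>';
echo cleanFeed($sample), "\n";
```

With the default namespace removed, paths such as `feed/entry` in the stylesheet match without an Atom prefix.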

The transform (XSLT)

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns:dcterms="http://purl.org/dc/terms/"
    >

  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <xsl:template match="/">
    <OAI-PMH>
      <responseDate><xsl:value-of select="feed/updated"/></responseDate>
      <request verb="ListRecords">[The Server location]</request>
      <ListRecords>
        <xsl:apply-templates select="feed/entry"/>

      </ListRecords>
    </OAI-PMH>
  </xsl:template>
  <xsl:template match="entry">
    <xsl:choose>
      <xsl:when test="m:properties/d:Deleted=1">
        <record>
          <header status="deleted"><identifier><xsl:value-of select="m:properties/d:DocumentIDValue"/></identifier>
            <datestamp><xsl:value-of select="m:properties/d:DateModified"/></datestamp>
            <setSpec></setSpec>
          </header>
        </record>
      </xsl:when>
      <xsl:otherwise>
        <record>
          <header><identifier><xsl:value-of select="m:properties/d:DocumentIDValue"/></identifier>
            <datestamp><xsl:value-of select="m:properties/d:DateModified"/></datestamp>
            <setSpec></setSpec>
          </header>
          <metadata>
            <oai_dc:dc>
              <dc:title><xsl:value-of select="m:properties/d:Title"/></dc:title>
              <dc:creator><xsl:value-of select="m:properties/d:StudentLastName"/>, <xsl:value-of select="m:properties/d:StudentFirstName"/></dc:creator>
              <dc:subject><xsl:value-of select="m:properties/d:Subject"/></dc:subject>
              <dc:subject><xsl:value-of select="m:properties/d:Keywords"/></dc:subject>
              <dc:description><xsl:value-of select="m:properties/d:Comments"/></dc:description>
              <dc:contributor><xsl:value-of select="m:properties/d:Contributor"/></dc:contributor>
              <dc:contributor><xsl:value-of select="link/m:inline/entry/content/m:properties/d:Title"/></dc:contributor>
              <dc:contributor>[University name]</dc:contributor>
              <dc:date><xsl:value-of select="m:properties/d:StudentYearOfGraduationValue"/></dc:date>
              <dc:modified><xsl:value-of select="m:properties/d:Modified"/></dc:modified>
              <dc:created><xsl:value-of select="m:properties/d:Created"/></dc:created>
              <dc:type><xsl:value-of select="m:properties/d:ThesisTypeValue"/></dc:type>
              <dc:format><xsl:value-of select="m:properties/d:Format"/></dc:format>
              <dc:identifier><xsl:value-of select="m:properties/d:StudentLastName"/>, <xsl:value-of select="substring(m:properties/d:StudentFirstName,1,1)"/> (<xsl:value-of select="m:properties/d:StudentYearOfGraduationValue"/>). <xsl:value-of select="m:properties/d:Title"/>, (<xsl:value-of select="m:properties/d:ThesisTypeValue"/>)</dc:identifier>
              <dc:identifier><xsl:value-of select="substring-before(m:properties/d:DocumentID, ',')"/></dc:identifier>
              <dc:source><xsl:value-of select="m:properties/d:Source"/></dc:source>
              <dc:source>[University name and Library collection]</dc:source>
              <dc:language><xsl:value-of select="m:properties/d:LanguageValue"/></dc:language>
              <dc:relation><xsl:value-of select="m:properties/d:Relation"/></dc:relation>
              <dc:coverage><xsl:value-of select="m:properties/d:Coverage"/></dc:coverage>
              <dcterms:bibliographicCitation><xsl:value-of select="m:properties/d:StudentLastName"/>, <xsl:value-of select="substring(m:properties/d:StudentFirstName,1,1)"/> (<xsl:value-of select="m:properties/d:StudentYearOfGraduationValue"/>). <xsl:value-of select="m:properties/d:Title"/>, (<xsl:value-of select="m:properties/d:ThesisTypeValue"/>)</dcterms:bibliographicCitation>
            </oai_dc:dc>
          </metadata>
          <about>
            <oai_dc:dc>
              <dc:publisher><xsl:value-of select="m:properties/d:Publisher"/></dc:publisher>
              <dc:rights><xsl:value-of select="m:properties/d:RightsManagement"/></dc:rights>
            </oai_dc:dc>

          </about>

        </record>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
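
For readers who want to try the transform step, here is a minimal sketch using PHP's DOM and XSL extensions (`DOMDocument` and `XSLTProcessor`). To keep it self-contained it uses a cut-down inline stylesheet that only produces the record header, and an invented two-entry feed that has already had its default Atom namespace removed. The function name `transformFeed`, the sample identifiers, and the use of `xsl:if` for the deleted status are all assumptions for illustration; the full stylesheet above would be loaded the same way.

```php
<?php
// Hypothetical sketch: apply a cut-down version of the stylesheet in PHP.
// Requires the dom and xsl extensions. Sample data is invented.

$xsl = <<<'XSL'
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    exclude-result-prefixes="d m">
  <xsl:output method="xml" indent="no" encoding="utf-8"/>
  <xsl:template match="/">
    <ListRecords><xsl:apply-templates select="feed/entry"/></ListRecords>
  </xsl:template>
  <xsl:template match="entry">
    <record>
      <header>
        <xsl:if test="m:properties/d:Deleted=1">
          <xsl:attribute name="status">deleted</xsl:attribute>
        </xsl:if>
        <identifier><xsl:value-of select="m:properties/d:DocumentIDValue"/></identifier>
      </header>
    </record>
  </xsl:template>
</xsl:stylesheet>
XSL;

$feed = <<<'XML'
<feed xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry><m:properties><d:DocumentIDValue>THESIS-1</d:DocumentIDValue><d:Deleted>0</d:Deleted></m:properties></entry>
  <entry><m:properties><d:DocumentIDValue>THESIS-2</d:DocumentIDValue><d:Deleted>1</d:Deleted></m:properties></entry>
</feed>
XML;

function transformFeed(string $feedXml, string $xslXml): string
{
    $xslDoc = new DOMDocument();
    $xslDoc->loadXML($xslXml);

    $feedDoc = new DOMDocument();
    $feedDoc->loadXML($feedXml);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($xslDoc);

    $result = $processor->transformToXml($feedDoc);
    if (!is_string($result)) {
        throw new RuntimeException('XSLT transform failed');
    }
    return $result;
}

echo transformFeed($feed, $xsl);
```

In this sample the second entry is flagged as deleted, so its header carries `status="deleted"`, which is how OAI-PMH signals deletions to a harvester.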

Reference

http://expressionsinweb.com/2014/07/11/atom2oai-pmh-an-oai-pmh-server-that-acts-as-a-sharepoint-2010-to-primo-data-connector/
