Tech Blog

Transforming simple CSV to Rosetta CSV

The tool (available on GitHub) helps Rosetta customers create CSV files with the structure Rosetta requires. This Java tool is particularly useful for digitization projects in which each entity has a unique ID, and this ID is used as the base name (prefix) of all files belonging to that entity.

The tool expects a simple source CSV with exactly one line per intellectual entity (IE), containing the metadata fields at IE level. In addition, the first column must contain the base name (prefix) of the related files.

Example:

Simple CSV

Generated Rosetta CSV (differs according to configuration)

The tool includes the following functions:
– search NFS folders for related files
– add required lines for representation and file level
– add columns on representation and file level
– extract the file label from the file name via regular expression
– retain the order of IEs from the source CSV and sort the files by their name
– OS-specific preparation of the material for deposit:
Windows: create a ZIP file of the stream file folders for upload
UNIX: create the SIP directory and add the full path of the files to the CSV, so that the files don’t need to be copied to SIP directories but can be deposited from their original location
– flexible configuration of source and target locations
– log file (with debug information, when configured)
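
The file search, label extraction and sorting steps listed above can be illustrated with a minimal Java sketch. This is not the tool's actual code: the class name `RelatedFiles`, both method names and the file name pattern `<baseName>_<label>.<extension>` are assumptions made for this example, and in the tool itself the regular expression comes from the configuration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RelatedFiles {

    // Assumed naming scheme: "<baseName>_<label>.<extension>", e.g. "ID001_0001.tif".
    // Group 1 captures the label between the first underscore and the last dot.
    static final Pattern LABEL_PATTERN = Pattern.compile("^[^_]+_(.+)\\.[^.]+$");

    /**
     * Returns the regular files directly inside dir whose names start with
     * baseName, sorted by file name. A plain prefix test also matches longer
     * IDs ("ID1" matches "ID10_..."), so real IDs should have a fixed length
     * or the test should include the separator.
     */
    static List<Path> findRelatedFiles(Path dir, String baseName) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().startsWith(baseName))
                    .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
    }

    /** Extracts the label from a file name; falls back to the full name if the pattern does not match. */
    static String extractLabel(String fileName) {
        Matcher m = LABEL_PATTERN.matcher(fileName);
        return m.matches() ? m.group(1) : fileName;
    }
}
```

For each base name read from the first column of the source CSV, `findRelatedFiles` would supply the files for the file-level lines, and `extractLabel` the label for each of them.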
