Removing minor changes to headings in Authority Control Task List report

The Authority Control Task List in Alma can be exported to an Excel file for advanced data manipulation.

Carleton and St. Olaf Colleges recently turned on the Preferred Term Correction job. As part of the cleanup, we wanted to remove entries from the report where the “BIB Heading Before” differed from the “BIB Heading After” only by minor changes to punctuation, spacing, and diacritics, so that the final report reviewed by catalogers would contain only substantive updates.

We wrote a Python script that takes a CSV version of the report, normalizes the data, and creates a new CSV file with those “minor change” records removed. I’m sure there are other refinements you could make (for example, if you didn’t care about capitalization changes), but we chose to start by removing every character that isn’t a word character (that is, anything not matched by \w in regular expressions, which covers letters, digits, and the underscore) before comparing the two headings.
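To give a sense of the approach, here is a minimal sketch of that kind of filter. It is not the script itself, just an illustration under some assumptions: the column headers are taken to be exactly “BIB Heading Before” and “BIB Heading After” (your export may label them differently), the function names are made up for this example, and diacritics are handled by decomposing accented letters with Unicode NFD so the combining marks fall outside \w and get stripped along with punctuation and spacing.

```python
import csv
import re
import unicodedata

# Assumed column headers, taken from the labels quoted in this post.
# Check them against the header row of your own export.
BEFORE_COL = "BIB Heading Before"
AFTER_COL = "BIB Heading After"

# One or more characters that are NOT word characters (\w).
NON_WORD = re.compile(r"\W+")


def normalize(heading):
    """Reduce a heading to its word characters only (hypothetical helper)."""
    # NFD splits an accented letter into a base letter plus a combining mark.
    # The combining mark is not a word character, so the regex removes it.
    decomposed = unicodedata.normalize("NFD", heading)
    return NON_WORD.sub("", decomposed)


def is_minor_change(before, after):
    """True when the headings differ only in punctuation, spacing, or diacritics."""
    return normalize(before) == normalize(after)


def filter_report(in_path, out_path):
    """Copy the report to out_path, skipping minor-change rows.

    Returns the number of rows kept.
    """
    # utf-8-sig tolerates the byte order mark that Excel often adds to CSV files.
    with open(in_path, newline="", encoding="utf-8-sig") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        kept = 0
        for row in reader:
            if not is_minor_change(row[BEFORE_COL], row[AFTER_COL]):
                writer.writerow(row)
                kept += 1
    return kept
```

You would call it with something like filter_report("task_list.csv", "task_list_substantive.csv"), where both file names are placeholders. Capitalization is deliberately left alone in this sketch, so a heading that changes only in case still counts as substantive; lowercasing both strings inside normalize would be the place to change that.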

The script is available on GitHub.

