Body

I attended the Association for Library Collections & Technical Services webinar Introduction to Python and PyMARC: Session 1. The presenter, Lauren Magnusson (California State University San Marcos), gave a clear introduction to the Python programming language and how it can help library staff who work with bibliographic data.

Python is a general-purpose language well suited to writing scripts that clean up, transform, and analyze data. PyMARC is the Python module for working with MARC records specifically. Why not just use MarcEdit to transform MARC records? Python is not a replacement for MarcEdit, but it can supplement MarcEdit workflows. For example, if you often find yourself processing the same data in both MarcEdit and Excel, that process may be a good candidate for a Python/PyMARC workflow.

What are people using PyMARC for? Here are a few PyMARC projects the instructor mentioned:

  • Extract and transform OCLC numbers from a set of MARC records. For example, perhaps you need to do a homegrown OCLC reclamation project and update your holdings in WorldCat. You could write a Python script to strip out all the OCLC prefix or suffix text from your MARC records so that you are left with just a list of OCLC numbers to use for updating holdings in WorldCat.
  • Transform a file of MARC records for bulk DSpace metadata ingest. Transform MARC records into Dublin Core records, and prepare a file for ingest into DSpace.
  • Extract and transform ISBNs for a holdings upload to GOBI. Extract ISBNs from a set of MARC records and pair them with your GOBI account number so that GOBI knows what you hold.
  • Generate a KBART file from MARC records for custom WorldCat Knowledge Base collections. Take a vendor file of MARC records, run a Python script on it, and generate a custom KBART file (the file format required for ingest into the WorldCat Knowledge Base).
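
To make the first project above more concrete, here is a minimal sketch of the OCLC number cleanup. It is not the script shown in the webinar. The function names are hypothetical, and the prefix forms handled ("(OCoLC)", "ocm", "ocn", "on"), the choice to drop leading zeros, and the choice to read numbers from the 035 $a subfield are all assumptions about how the records are coded. The two cleanup helpers use only the standard library; the last function assumes PyMARC is installed.

```python
import re

# Assumed prefix forms: "(OCoLC)", "ocm", "ocn", "on" (case-insensitive).
# Leading zeros are dropped; only trailing whitespace is tolerated as a suffix.
_OCLC_PATTERN = re.compile(
    r"^\s*(?P<org>\(OCoLC\))?\s*(?P<prefix>ocm|ocn|on)?\s*0*(?P<number>\d+)\s*$",
    re.IGNORECASE,
)


def normalize_oclc_number(value):
    """Return the bare OCLC number from a raw field value, or None."""
    match = _OCLC_PATTERN.match(value)
    # Require at least one OCLC marker so other control numbers are skipped.
    if match is None or not (match.group("org") or match.group("prefix")):
        return None
    return match.group("number")


def unique_oclc_numbers(values):
    """Normalize many raw values, dropping non-OCLC values and duplicates."""
    seen = set()
    numbers = []
    for value in values:
        number = normalize_oclc_number(value)
        if number is not None and number not in seen:
            seen.add(number)
            numbers.append(number)
    return numbers


def oclc_numbers_from_marc(path):
    """Collect OCLC numbers from the 035 $a subfields of a binary MARC file."""
    # Third-party import kept local so the helpers above work without PyMARC.
    from pymarc import MARCReader

    raw_values = []
    with open(path, "rb") as handle:
        for record in MARCReader(handle):
            if record is None:  # the reader can yield None for an unparseable record
                continue
            for field in record.get_fields("035"):
                raw_values.extend(field.get_subfields("a"))
    return unique_oclc_numbers(raw_values)
```

Calling oclc_numbers_from_marc on a file of records would return a de-duplicated list of bare numbers that could then be written out one per line. Records that carry other suffix text, or that store the OCLC number in the 001 field, would need the pattern and the field lookup adjusted.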

The instructor spent considerable time explaining the installation process, which can be cumbersome for a new user. There are currently two major versions of Python, and Magnusson recommended Python 3 because Python 2 is nearing the end of support (2020). All of her examples used Python 3. You will also need a good text editor for writing scripts. For macOS, Magnusson uses BBEdit. For Windows, Notepad++ was recommended. Here are the instructions shared for installing Python and PyMARC.
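
Once installation is done, a quick check like the one below can confirm that the setup worked. This is a small sketch and not part of the webinar materials: the function name check_environment is hypothetical, and it assumes PyMARC was installed under the module name pymarc (for example with pip install pymarc).

```python
import importlib.util
import sys


def check_environment(module_name="pymarc", minimum=(3, 0)):
    """Return (python_ok, module_ok) for the interpreter running this script."""
    # True when the running interpreter is at least the requested version.
    python_ok = sys.version_info >= minimum
    # find_spec returns None when a top-level module cannot be located.
    module_ok = importlib.util.find_spec(module_name) is not None
    return python_ok, module_ok


if __name__ == "__main__":
    python_ok, module_ok = check_environment()
    print("Python version:", sys.version.split()[0])
    print("Python 3 or later:", "yes" if python_ok else "no")
    print("pymarc importable:", "yes" if module_ok else "no")
```

If the last line reports "no", the module was most likely installed for a different Python interpreter than the one running the script.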

Here are the resources shared:

Written by

Sara Ring
Continuing Education Librarian

Strengthening the knowledge, skills, and efficiency of staff in libraries throughout the Minitex region