How to Get from Here to BIBFRAME is the fourth webinar in the From MARC to BIBFRAME series, presented by Carl Stahmer from UC-Davis and offered by the Association for Library Collections & Technical Services (ALCTS). Since each session in the series builds on the one before it, consider reading the other summaries we have published before reading this article.
- Recording available: Library of Congress BIBFRAME developments
- Highlights from ALCTS webinar: Putting the Link in Linked Data
- Highlights from ALCTS webinar: Embedding URIs in MARC using MarcEdit
This webinar continued the discussion of how to prepare our MARC data and cataloging departments for the transition to BIBFRAME. Stahmer spoke of the transition in two phases: “Have your MARC and eat it too,” in which we work within the MARC format to enhance our data and begin to use Linked Data; and “Native Linked Data,” in which we operate in a Linked Data environment from start to finish, without converting data to and from MARC.
PHASE 1: HAVE YOUR MARC AND EAT IT TOO
The first step of the transition from MARC to BIBFRAME has already begun, with the insertion of URIs into legacy data. There has been much work in this area (see previous webinars in this series), including the formation and development of a Program for Cooperative Cataloging Task Group on URIs in MARC. This group has produced a series of reports on their efforts, the latest of which is freely available online.
The next piece of the puzzle is what Stahmer calls a “Linked Data enabled workbench,” a cataloging interface or suite of tools that makes it straightforward to add Linked Data markup (including URIs) to MARC data at the time of creation or editing. This would include functionality to automatically look up and disambiguate headings using resources like the Library of Congress, OCLC, and Getty databases. This workbench would also need to include processes for ILS synchronization, much of which would be performed by APIs. Though it will take quite a bit of work to develop APIs robust enough for what Stahmer envisions, this work will fall not on the heads of catalogers but on IT departments.
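To make the lookup-and-disambiguate step concrete, here is a minimal sketch in Python of what such a workbench might do for a single heading. Everything in it is an assumption made for illustration: the `AUTHORITY_INDEX` dictionary stands in for a real lookup against a remote authority source, the URIs are placeholders, the `MarcField` class is a simplified stand-in for a real MARC library, and storing the URI in subfield $0 follows common practice rather than anything specified in the webinar.

```python
from dataclasses import dataclass

# Hypothetical local stand-in for a remote authority lookup service.
# The URIs are placeholders, not real identifiers.
AUTHORITY_INDEX = {
    "twain, mark, 1835-1910": ["http://example.org/authorities/names/n0001"],
    "smith, john": [
        "http://example.org/authorities/names/n0002",
        "http://example.org/authorities/names/n0003",
    ],
}


@dataclass
class MarcField:
    """Simplified MARC field: a tag plus an ordered list of (code, value) subfields."""
    tag: str
    subfields: list


def normalize(heading):
    """Lowercase, drop trailing punctuation, and collapse whitespace."""
    return " ".join(heading.lower().rstrip(".,").split())


def heading_of(fld):
    """Build a lookup key from subfields $a and $d (an assumed, simplified rule)."""
    parts = [value for code, value in fld.subfields if code in ("a", "d")]
    return normalize(" ".join(parts))


def add_uri(fld):
    """Append a $0 URI when exactly one candidate matches.

    Returns the list of candidates that still need a cataloger's decision
    (empty when the field already has a URI or a single match was applied).
    """
    if any(code == "0" for code, _ in fld.subfields):
        return []
    candidates = AUTHORITY_INDEX.get(heading_of(fld), [])
    if len(candidates) == 1:
        fld.subfields.append(("0", candidates[0]))
        return []
    return candidates


twain = MarcField("100", [("a", "Twain, Mark,"), ("d", "1835-1910.")])
smith = MarcField("100", [("a", "Smith, John.")])
unresolved_twain = add_uri(twain)   # single match: URI is added automatically
unresolved_smith = add_uri(smith)   # two matches: left for the cataloger to choose
```

The design choice worth noting is that ambiguous headings are never resolved automatically; the function hands the candidates back so that a person can choose, which is the kind of disambiguation step the workbench is meant to support.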
PHASE 2: NATIVE LINKED DATA
This phase has two main stages: the conversion of legacy MARC data, and the switch to cataloging entirely in a Linked Data environment. For each of these stages, Stahmer describes several available tools.
The bulk of the MARC-to-BIBFRAME conversion will, like the APIs above, fall mainly on IT departments, which may need to devote full-time personnel to the task. The tools described include:
- Library of Congress transformation tool. Because it is provided by the Library of Congress itself, the quality of the BIBFRAME descriptions it produces is very high. At the moment it still uses BIBFRAME vocabulary version 1.0; version 2.0 has been published but is not yet available in this tool. A possible drawback is that no other output formats, such as schema.org, are available.
- MarcEdit MarcNext. This tool (described in more detail in the previous webinar) uses XSLT for scripting. It can handle multiple instruction sets and draws on a community-generated library of transformations. This means that it is a very flexible tool, but it requires XSLT expertise. Because it is not designed to have multiple endpoints, though, MarcEdit is not the most efficient tool for processing large and frequent batches of MARC data that need to be managed by several people.
- Homegrown scripts. UC-Davis’ project BIBFLOW is a good example of this. If there is in-house expertise, a library can create whatever transformation scripts are needed. They can be built in whatever languages are most convenient or most efficient, and thus can be extremely fast and focused. The main drawback of using homegrown scripts is that they require high levels of expertise and can be very expensive to produce and maintain.
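As a rough illustration of the homegrown-script option, the following Python sketch turns a drastically simplified record into a handful of BIBFRAME-style triples and serializes them as N-Triples. It is not the BIBFLOW code or any existing converter. The input shape (a dictionary with `001` and `245a` keys), the `#Work`/`#Instance`/`#Title` URI pattern, and the `example.org` base are all assumptions, and the small set of BIBFRAME 2.0 terms used here covers only a tiny part of a real mapping.

```python
from typing import NamedTuple

BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Lit(NamedTuple):
    """Marks a triple object as a plain literal rather than a URI."""
    value: str


def marc_to_triples(record, base):
    """Map a simplified record (hypothetical shape) to (subject, predicate, object) triples."""
    rid = record["001"]
    work = f"{base}{rid}#Work"
    instance = f"{base}{rid}#Instance"
    title = f"{base}{rid}#Title"
    # Strip trailing ISBD punctuation such as " /" or " :" from the title proper.
    main_title = record["245a"].rstrip(" /:")
    return [
        (work, RDF_TYPE, BF + "Work"),
        (instance, RDF_TYPE, BF + "Instance"),
        (instance, BF + "instanceOf", work),
        (instance, BF + "title", title),
        (title, RDF_TYPE, BF + "Title"),
        (title, BF + "mainTitle", Lit(main_title)),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples, escaping backslashes and quotes in literals."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Lit):
            escaped = o.value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


record = {"001": "123", "245a": "Moby Dick /"}
triples = marc_to_triples(record, "http://example.org/")
ntriples = to_ntriples(triples)
```

Even a sketch this small hints at the maintenance cost mentioned above: every additional MARC field, indicator, and punctuation convention needs its own mapping rule, and those rules have to be kept current as the BIBFRAME vocabulary changes.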
Although there are many tools available, the presenter notes that there are not many vendors active in this space. There is some work being done with conversion, especially by Zepheira, but there is not yet a service akin to WorldCat – a database of BIBFRAME graphs that can be searched, copied, imported, and so on.
The final step in the transition away from MARC will be cataloging in completely non-MARC formats, natively part of the Linked Data world. There are a few ideas to consider when preparing for this cloud-based (or at least cloud-friendly) world.
- Model vs. storage. Although BIBFRAME is often discussed and presented in RDF markup, catalogers need not worry about having to be familiar with the framework. This is similar to the current situation: although ILSes display what looks like MARC to the human eye, they store the data in completely separate formats. Catalogers know the relevant model – that is, what information goes in which places – and rely on the underlying systems to prepare the data for storage.
- Cataloging interfaces. Stahmer predicts that unlike our current world (in which the majority of catalogers use a single interface for their work), there may never be just one BIBFRAME interface. This is partly because the BIBFRAME model is more complicated and far more flexible than the MARC format.
  - Library of Congress BIBFRAME Editor. This is a good indication of what future cataloging interfaces may look like. The fields are labeled not with technical code but with RDA Elements (which are linked to the RDA Toolkit).
  - Zepheira Scribe. This is a similar interface to the LC BIBFRAME Editor with a few key differences. This tool relies more on lookups and selections from dropdown menus, and the labels for each field are not explicitly tied to any content standard.
- Automated copy cataloging. We are already seeing tools that can scan the barcode on the back of a book, search WorldCat Works, and suggest possible matches. These types of workflows will also change and evolve in the coming years.
- Discovery layers. Some discovery layers (like Blacklight) are compatible with MARC but are also natively Linked Data-ready. Several available options are ready to be connected to triple stores rather than MARC databases.
- ILS integration. There are many functions that are supported in the typical ILS (including acquisitions, circulation, etc.) that may not need to be connected to the catalog/discovery layer. Stahmer questions whether we will in fact need integrated library systems – perhaps we will be moving to a world of several different products working together.
- Authorities. The most dramatic changes will occur with the workflow and systems for dealing with authorities. Stahmer stresses that we will need some sort of reconciliation for all the URIs that will be minted as catalogers create new “authorities”. This can be solved with one of two main approaches:
  - Peer-to-peer collaboration. This would work well, but would be very resource-heavy in terms of time and network traffic necessary.
  - Reconciliation service. This has been explored in a few pilot programs, and OCLC reports that their reconciliation pilot worked quite well.
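The reconciliation idea described under Authorities can be sketched in a few lines of Python. This is a toy model, not a description of OCLC's pilot or of any real service: it assumes that institutions submit pairs of URIs they believe identify the same entity, groups those URIs with a union-find structure, and picks the lexicographically smallest URI in each group as the canonical one. That selection policy is arbitrary and chosen only because it is deterministic. The `Reconciler` class and all URIs are hypothetical.

```python
class Reconciler:
    """Groups equivalent URIs and reports one canonical URI per group."""

    def __init__(self):
        self.parent = {}

    def find(self, uri):
        """Return the canonical URI for the group containing `uri`."""
        self.parent.setdefault(uri, uri)
        root = uri
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every URI on the walked path at the root.
        while self.parent[uri] != root:
            next_uri = self.parent[uri]
            self.parent[uri] = root
            uri = next_uri
        return root

    def same_as(self, a, b):
        """Record that two URIs identify the same entity."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Arbitrary but deterministic policy: the smaller string stays canonical.
            keep, drop = sorted((root_a, root_b))
            self.parent[drop] = keep


reconciler = Reconciler()
reconciler.same_as("http://lib-b.example/agent/9", "http://lib-a.example/agent/1")
reconciler.same_as("http://lib-c.example/agent/5", "http://lib-b.example/agent/9")
canonical = reconciler.find("http://lib-c.example/agent/5")
```

A central service of this kind only has to store one pointer per URI, whereas the peer-to-peer approach would require each institution to exchange and compare assertions with every other, which is where the time and network cost noted above comes from.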
Watch for the next summary in the ALCTS From MARC to BIBFRAME series, Modeling and Encoding Serials in BIBFRAME (webinar 5 of 6).