Let’s Get Technical — Subject Heading Prediction
By Tris Shores (Developer, PredictiveBIB project) <trishores@gmail.com> and Alisha Taylor (Monograph & Media Cataloging Coordinator, University of Illinois at Urbana-Champaign, and former cataloger at Ingram Books) <alisha@illinois.edu>
Column Editors: Kyle Banerjee (Sr. Implementation Consultant, FOLIO Services) <kbanerjee@ebsco.com> www.ebsco.com www.folio.org and Susan J. Martin (Chair, Collection Development and Management, Associate Professor, Middle Tennessee State University) <Susan.Martin@mtsu.edu>
Introduction
One of the most labor-intensive aspects of original cataloging is Library of Congress subject heading (LCSH) assignment, as it requires familiarity with subject area terminology, interpretation of subject authority records, and construction of compound subject headings with subdivisions. Since LCSH terms number in the hundreds of thousands and can be quite esoteric, specific, or dated, catalogers may find it easier to categorize a book’s content using terms (aka descriptors) drawn from a simpler and more contemporary controlled vocabulary relevant to a collection’s subject areas. Descriptors (from a custom vocabulary) and LCSH for two books are shown below:
The following Map fragment, portrayed as a concept map, shows the descriptors ‘Combat’, ‘Paranormal’, and ‘Outer space’ individually mapped to various LCSH. A Boolean AND search for LCSH associated with the descriptors ‘Combat’ and ‘Paranormal’ returns ‘Women superheroes’ and ‘Yoda (Fictitious character from Lucas)’.

Another Map fragment, portrayed as a Venn diagram, shows the descriptors ‘People’, ‘Young’, ‘Animals’, ‘Relationship’, ‘Paranormal’, and ‘Religion’ individually mapped to various LCSH. A Boolean AND search for LCSH associated with the descriptors ‘People’, ‘Animals’, and ‘Relationship’ returns ‘Human-animal relationships’ and ‘Aviculture’.
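The Boolean AND searches described above amount to intersecting the sets of LCSH mapped to each descriptor. The Python sketch below is illustrative only: the names `DESCRIPTOR_TO_LCSH` and `predict_lcsh` are assumptions made for this example, and the ‘Placeholder heading’ entries are invented stand-ins for the other headings in the figures. Only the headings returned by the two searches are taken from the text.

```python
# Illustrative Map fragment (assumption: a plain dict of descriptor -> set of LCSH).
# The "Placeholder heading" entries are hypothetical stand-ins, not real mappings.
DESCRIPTOR_TO_LCSH = {
    "Combat": {
        "Women superheroes",
        "Yoda (Fictitious character from Lucas)",
        "Placeholder heading A",
    },
    "Paranormal": {
        "Women superheroes",
        "Yoda (Fictitious character from Lucas)",
        "Placeholder heading B",
    },
    "People": {"Human-animal relationships", "Aviculture", "Placeholder heading C"},
    "Animals": {"Human-animal relationships", "Aviculture", "Placeholder heading D"},
    "Relationship": {"Human-animal relationships", "Aviculture", "Placeholder heading E"},
}


def predict_lcsh(descriptors, mapping=DESCRIPTOR_TO_LCSH):
    """Return the LCSH mapped to every one of the given descriptors (Boolean AND)."""
    if not descriptors:
        return set()
    # An unknown descriptor contributes an empty set, so the AND result is empty.
    heading_sets = [mapping.get(d, set()) for d in descriptors]
    return set.intersection(*heading_sets)


if __name__ == "__main__":
    print(sorted(predict_lcsh(["Combat", "Paranormal"])))
    print(sorted(predict_lcsh(["People", "Animals", "Relationship"])))
```

Each additional descriptor can only narrow the result, which is why a handful of broad descriptors can isolate a small number of specific headings.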
Assuming it is faster for catalogers to choose descriptors than to choose LCSH terms directly, this article describes a technique for automated prediction of LCSH based on a cataloger’s selection of book descriptors. At first glance, the extra step of assigning descriptors appears to slow down the cataloging process, but in reality the extra step is akin to taking the time to chop up a potato before eating it. This technique is especially relevant to organizations that create original bibliographic records in a production environment where time savings and reduced complexity are important considerations.
Predicting LCSH using a Descriptor-LCSH Map
At the core of this technique is a descriptor-LCSH map (Map), which associates book descriptors with LCSH. Descriptors should be drawn from a controlled vocabulary, and each descriptor should have a one-to-many relationship with LCSH (in other words, the descriptor vocabulary should be more general than LCSH).
Against the Grain / December 2021 - January 2022
A mature Map is likely to contain hundreds of descriptors mapped to thousands of LCSH, but a computer can query it rapidly to extract the LCSH associated with a given set of descriptors (using a Boolean AND search). One implementation strategy is to incorporate the Map in a cloud API service. Catalogers would use the service by sending an API request containing the descriptors for a book and an LCSH type (main topic, subtopic, geographic, or chronological), and in return would receive a list of auto-suggested LCSH of the requested type, ranked by LCSH usage popularity. Ideally, cataloging software would make the cloud API calls on behalf of catalogers.
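The request handling described above might look roughly like the following Python sketch. It is not the PredictiveBIB implementation: the `MappedHeading` record, the `suggest_lcsh` function, the usage counts, the LCSH types assigned to each heading, and the ‘Placeholder place name’ entry are all assumptions made for illustration. The sketch filters by the requested LCSH type, applies the Boolean AND over descriptors, and ranks the result by a usage count.

```python
from dataclasses import dataclass

# The four LCSH types named in the article.
LCSH_TYPES = {"main topic", "subtopic", "geographic", "chronological"}


@dataclass(frozen=True)
class MappedHeading:
    """One Map entry (hypothetical structure)."""
    heading: str
    lcsh_type: str          # one of LCSH_TYPES
    usage_count: int        # hypothetical popularity figure
    descriptors: frozenset  # descriptors mapped to this heading


def suggest_lcsh(records, descriptors, lcsh_type):
    """Return headings of the requested type that are mapped to all of the
    given descriptors, most frequently used first."""
    if lcsh_type not in LCSH_TYPES:
        raise ValueError(f"unknown LCSH type: {lcsh_type!r}")
    wanted = frozenset(descriptors)
    if not wanted:
        return []
    matches = [
        r for r in records
        if r.lcsh_type == lcsh_type and wanted <= r.descriptors
    ]
    # Sort by descending usage count; break ties alphabetically for stable output.
    matches.sort(key=lambda r: (-r.usage_count, r.heading))
    return [r.heading for r in matches]


# Sample data: the counts and types are invented for this example.
_SHARED = frozenset({"People", "Animals", "Relationship"})
SAMPLE_MAP = [
    MappedHeading("Aviculture", "main topic", 45, _SHARED),
    MappedHeading("Human-animal relationships", "main topic", 120, _SHARED),
    MappedHeading("Placeholder place name", "geographic", 300, _SHARED),
]

if __name__ == "__main__":
    print(suggest_lcsh(SAMPLE_MAP, ["People", "Animals", "Relationship"], "main topic"))
```

In a deployed service, a function of this kind would sit behind an HTTP endpoint, and the cataloging software would supply the descriptors and LCSH type as request parameters.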
<https://www.charleston-hub.com/media/atg/>