Reader’s Roundup: Monographic Musings & Reference Reviews
Column Editor: Corey Seeman (Director, Kresge Library Services, Stephen M. Ross School of Business, University of Michigan; Cell Phone: 734-717-9734) <cseeman@umich.edu> Twitter @cseeman
Column Editor’s Note: Sometimes you are watching a movie or a television show and everything seems settled, but you know it is not done. Maybe you know a spoiler about the show. Or maybe you realize that the show is only halfway over, so the resolution is not really the ending. Or if you are reading a print book and you see a whole bunch of pages left, you likely are not at the ending. That seems to be where we are right now in the world of libraries. We seem to have a way to move forward in the Fall, and plans are being announced and shared. But we have many more pages left in this book.
So what is the future going to look like? No one knows. Regardless of where we end up for Fall, we are in all likelihood still in the beginning of this pandemic, which may stretch well into 2021. While there are health and economic considerations for sure, for many of us in the library field this will mean more remote work and more reliance on digital collections alone. Even the Charleston Conference will be celebrating its 40th anniversary via Zoom (or similar software) rather than jammed into the Charleston Gaillard Center. So it is very likely that this fall will be another term where we are spending more time with screens than spines (book spines, that is).
While getting our hands on print books might be more challenging during the pandemic, their value to our community does not change. As this pandemic races on, my hope is that publishers open up, in a more permanent fashion, access to ebooks, even if they might be used in a course from time to time. Now more than ever, libraries need to be able to utilize and leverage the books that are in our collections.
Now more than ever, libraries need to make good purchasing decisions to ensure that we are good stewards of our community’s dollars. And hopefully, these reviews from our talented team of reviewers will help you make educated purchases and stretch your collection dollars just when you need it the most. After all, there still is more of this story yet to play out.
Thanks to my great reviewers for getting items for this column. My new reviewers are Ellie Dworak, Joshua Hutchinson, Mary Catherine Moeller, Rossana Morriello, John Novak, and Benjamin Riesenberg. They are joined by returning reviewers Janet Crum, Jennifer Matthews, and Sally Ziph. Thank you all!
If you would like to be a reviewer for Against the Grain (and if I can ever get back into my office), please write me at <cseeman@umich.edu>. If you are a publisher and have a book you would like to see reviewed in a future column, please also write me directly.
Happy reading and be nutty! — CS
Banerjee, Kyle. The Data Wrangler’s Handbook: Simple Tools for Powerful Results. Chicago: ALA Neal-Schuman, 2019. 9780838919095, 176 pages. $67.99 Reviewed by Ellie Dworak (Associate Professor & Information Design & Data Visualization Librarian, Boise State University) <elliedworak@boisestate.edu>
Data wrangling encompasses, broadly speaking, the processes involved in data transformation. One such task is cleaning messy data: for example, the chat transcripts I download from the software that my library uses for reference interactions are full of HTML codes and long links. Another is restructuring data, as when eBook usage statistics are arranged by title but you want to analyze usage by subject heading. Combining data sets is another wrangling task, one that might first require data cleaning and restructuring (think ejournal statistics across vendors, some of which aren’t COUNTER compliant). Those of us who work in libraries may hardly notice that we’re wrangling data until the tools we’ve come to rely upon fall short.
The introduction to this book gives us several examples of times when software with a graphical user interface (GUI) is not the best tool for the job. Excel, for example, automatically converts numeric data in disadvantageous ways and crashes when confronted with large amounts of data. However, the reader is reassured, we can learn to use the Command Line Interface (CLI) capabilities of our computers to solve these problems, and it isn’t even hard.
This book was written to be a toolkit of data wrangling fundamentals in the context of librarianship. Author Kyle Banerjee is the Collections and Services Technology librarian at Oregon Health & Science University (OHSU), the co-author (with Terry Reese) of Building Digital Libraries: A How-To-Do-It Manual, and co-editor (with Bonnie Parks) of Migrating Library Data: A Practical Manual. The chapter on understanding formats is authored by David Forero, the OHSU Library’s technical director.
The book opens with an introduction to the command line, followed by a series of tools used to modify text files. I followed along to create a brief tab-delimited file, to which I successfully appended and modified data.
I learned to use grep to search the text within my file using the logical operators AND and OR. Everything was going swimmingly for me until the end of chapter two, when we learned to use grouping and regular expressions with a stream editor (think find and replace on steroids). I was able to complete the exercise in the book, but got stuck when I tried to apply the information to my own example. Throughout the book I ran across these hitches, where the examples leapt from straightforward to convoluted without enough context to help me bridge the divide. It may be that page limits hindered the provision of more practice exercises to scaffold learning. Despite this limitation, I was able to absorb and benefit from most of the content.
The material covered in this book is well chosen to suit the purpose of providing a cookbook of techniques useful to librarians. After the command line concepts, the book introduces the basics of formats (HTML, XML, JSON, Linked Data, and MARC) and how to transfer data between them. Further chapters delve into more detail on how to troubleshoot data wrangling problems and on working with delimited text files, XML, JSON, script
continued on page 39
38 Against the Grain / September 2020
<http://www.against-the-grain.com>