Collecting from the Web: Collection Development Policy in the Born-Digital Universe
By Carol A. Mandel (Dean Emerita, New York University Division of Libraries, and Distinguished Presidential Fellow, Council on Library and Information Resources) <carol.mandel@nyu.edu>
If, before the events of the last two years, there were any lingering doubts that the wild west of the web, despite all its click-bait and conspiracy theories, held a treasure chest of critical documentary content, those concerns have surely been dispelled. The global and personal stories of the COVID-19 pandemic, painful videos exposing racial injustice, and the stunning evidence presented to the House select committee investigating the January 6, 2021 insurrection at the U.S. Capitol are but the most prominent examples of the historical, social, and cultural evidence that web archiving will carry to the immediate and long-term future. We, as librarians and as individuals in society, are indebted to the vision and foresight of the Internet Archive and of many national libraries for taking up the challenge of capturing, managing, and providing enduring access to segments of that otherwise ephemeral evidence. What we as librarians have not yet done is fully consider what the vast, complex cornucopia of born-digital content in the web can mean for the concept of our own library collecting and collections. If libraries have a responsibility to ensure society’s access to a world of knowledge, this is not content to be ignored. Librarians have been leaders in the transition from print to electronic resources; in transitioning collecting from “just in case” to “just in time”; in creating libraries that are service-based rather than collection-based;1 in moving to new models of open access; in working to de-colonize our approach to access and collections. The benefits of this work have been strikingly evident in the last two years, when access to the fullest possible range and depth of online content has meant survival for a locked-in world. Yet even as we are immersed in a universe of born-digital content that far expands the bounds of traditional electronic publishing, we have not yet stretched our arms fully around the potential role of that content in libraries.
(The idiom “getting arms around” is apt here in all its meanings: embracing, understanding, managing.) It’s time, as the old Apple ads used to say, to think different.
In a paper issued in late 2019, I discussed the need to frame the nature and challenges of born-digital content into issues that, in turn, can lead to strategies for access, collection, and preservation.2 One significant component of that task is to match the opportunities and complexities of web content to areas relevant to the mission, priorities, and capacities of a range of different types of libraries. This issue of Against the Grain devoted to web archiving is a welcome opportunity to begin that considered look. A key step is to consider whether the concept of “archiving the web” has created a too daunting connotation, one that implies a vast archival task that falls only into the realm of a national library. “The web” can be viewed as an overarching, if messy, entity. But as my friend and colleague, Clifford Lynch, describes so well in his companion article in this journal, “the web” is not what most of us casually think it is, and not what it used to be. It is not a monolith, but rather more than a billion disparate sites, many containing ever more items in ever more forms, accessed through the portal of a web browser. From a content perspective, a web portal can be viewed as the world’s largest third-party distributor. It is the pointer to most networked born-digital content: the good, the bad, and the ugly. And much of the good content is worthy of consideration for library collecting. For libraries, web harvesting is an important tool by which valuable (in fact, essential) born-digital content can be targeted and collected.3

20 Against the Grain / December 2021 - January 2022

“Collecting strategies have never remained stagnant as new forms of content and new distribution models have emerged. It is time to embrace a new phase.”

A small number of libraries have begun to add web collecting to their development of special collections strength.4 If a library has deep strength in, say, 20th century student protest movements, collecting from the web is essential to continue that collection into the 21st century. But how “special” need a collection be to merit the addition of digital-only content? In a century when so much substantive and valuable material is available by harvesting it from the web, the red-line distinction between “general” and “special” is not meaningful as a collecting determination. Special collections grew up as separate entities because of the need to physically segregate and safeguard rare material.5 Brilliant and active special collections librarians have expanded the reach and function of their work, but web collecting need not, and should not, be bounded by the special collections realm. This is not to say that collecting from the web is, as yet, as efficient as core collecting streamlined by processes such as approval plans and demand-driven acquisitions. Web collecting requires selector time and expertise to establish and monitor profiles (as do many approval plans), along with (relatively modest) operational support to participate in a program such as Archive-It. But for any library that serves its community by shaping collections, at least around the edges of a “readymade” plan, selector expertise is always part of the picture. And on the support side, libraries have triumphed in the 21st century through their ability to retool operations. Over the last 50 years, libraries have adapted to many new forms of collecting, adding images, video, databases. Selectors have grappled with, to use some now dated terminology, grey literature, technical reports, government documents, and more recently open access publications.
Shared content has been expanded through microfilming and then digitization, and through
<https://www.charleston-hub.com/media/atg/>