
Generative AI Tools Transforming the Library? Rethinking Possibilities and Questions

By Raymond Pun (Academic and Research Librarian, Alder Graduate School of Education, California)

In the past few months, there have been intense discussions and anxiety (rightfully) over the use, impact, and effects of ChatGPT on library services and academic learning. ChatGPT is a chatbot built on large language models (LLMs), which generate text by repeatedly predicting the next word in a sequence. The tool draws on textual data from across the Internet, including Wikipedia, academic journals, magazines, and books, to produce an essay, poem, joke, or story based on the user’s request.
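The next-word prediction at the heart of an LLM can be illustrated with a toy bigram model. This is a deliberate, hypothetical simplification (real LLMs use neural networks trained on billions of subword tokens), but it shows the basic mechanic of "predict the most likely next word":

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus; a real model trains on billions of words.
corpus = "the cat sat on the mat and the cat slept".split()

# Count which word follows which (a bigram model).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" — it follows "the" most often here
```

Chaining such predictions word by word is how generated text emerges; fluency comes from the statistics of the training data, not from any check that the output is true, which is why fabricated citations can read so plausibly.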

ChatGPT has become an increasingly popular tool even though it is an inherently biased system, since its sources all come from the internet. Artificial intelligence (AI) is also not going away; some libraries and academic publishers have already started integrating AI programs into their practices and services, and some applications can be benign.1 For example, Stanford University Libraries’ Research Data Services Division is experimenting with HCR (Handwritten Character Recognition), which turns handwritten documents such as manuscripts into text and data. This has enabled researchers, especially digital humanists, to engage with and analyze archival materials as data.2 Another example is Adam Matthew (AM) Digital, which uses handwritten text recognition (HTR) to digitize content at several university libraries.3

For ChatGPT, the impact on learning is immediately felt, and it is not a surprise. As my colleague Andrew Carlos, head of research, outreach, and inclusion at Santa Clara University Library, observed and shared on Twitter in February 2023: “A student asked me for help finding the full text for two articles both of which had the same exact title. Immediately alarm bells were ringing and yup, they were ChatGPT references.”4

As he observed, ChatGPT can generate false citations and information, a phenomenon referred to as “hallucination,” a flaw that developers are working to rectify.

Peter Bae, assistant university librarian for scholarly collections access, fulfillment & resource sharing, described in a social media post how these fake citations (including fabricated digital object identifiers, or DOIs) create additional labor for his Resource Sharing and Interlibrary Services department. Peter noted that the issue is global: “It is an issue not only in English-speaking countries. I have heard the same story from colleagues in Korea.” These “hallucination citations” add work for any library’s high-demand services and staff, from reference and research services to technical services such as cataloging. Students are already using ChatGPT to identify and recommend sources for their writing and research assignments, without realizing that the information can be inaccurate or falsified. As these examples show, the impact on public services work in libraries is already overt.
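One lightweight way staff can triage suspect citations is to screen DOI strings for valid syntax before spending interlibrary-loan time on them. This hypothetical sketch (the helper name `looks_like_doi` is my own) uses the standard “10.prefix/suffix” DOI pattern; note that a well-formed DOI can still be fabricated, so the real test is whether it resolves at doi.org or in a registry such as Crossref:

```python
import re

# DOIs take the form "10.<registrant prefix>/<suffix>".
# Syntactic validity does NOT prove the DOI exists — a hallucinated
# citation can carry a perfectly well-formed but unregistered DOI,
# so resolution against doi.org is the decisive check.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(s: str) -> bool:
    """Screen a citation's DOI string for plausible syntax."""
    return bool(DOI_PATTERN.match(s.strip()))

print(looks_like_doi("10.5860/crl.79.6.726"))   # True: well-formed
print(looks_like_doi("doi:fake-citation-123"))  # False: malformed
```

A check like this only filters out the obviously garbled strings; the labor Peter describes comes precisely from the plausible-looking fakes that pass it.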

In the world of cataloging, ChatGPT can also create authority records. Janelle Zetty, head of cataloging at the University of Louisiana at Lafayette, tweeted the following ChatGPT query: “Create MARC authority record for Zydeco musician Jeremy Fruge.” As Janelle posted, “The person doesn’t have a name authority record in LC yet,” yet ChatGPT produced an example of a MARC authority record for him.5 With these kinds of ChatGPT-generated outputs, we need to pause, rethink, and reflect on this tool and its implications for many aspects of our work.
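For readers outside cataloging, a personal name authority record of the kind ChatGPT was asked to fabricate looks roughly like the hand-written sketch below. This is my own hypothetical illustration, not the tweeted output and not an actual Library of Congress record; bracketed values stand in for real data:

```
100 1# $a Fruge, Jeremy
400 1# $a [variant form of the name, as a see-from reference]
670 ## $a [source consulted that justifies the heading, e.g., an album or website]
```

The danger Zetty highlights is that a generated record can mimic this structure convincingly even though no vetted record exists in the LC/NACO authority file, so a cataloger must still verify every heading against the authority file itself.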

Across the United States, many pre-K-12 schools have banned ChatGPT, fearing that students might evade schoolwork, fail to develop their critical thinking, writing, and reading skills, and ultimately “cheat” and become overly dependent on the tool in their academics. (As an aside, while I wrote this post myself, it very well could have been composed by a similar ChatGPT request.) In higher education, there has been no consensus or consistent policy on ChatGPT, nor on its place in academic honesty policies or honor codes, such as whether students must disclose their use of ChatGPT for an assignment, if that use is permitted at all. There are also tools such as GPTZero, developed by Princeton University student Edward Tian, which can identify and spot-check passages that may have been generated by ChatGPT.6 Faculty, for their part, might use ChatGPT to write recommendation letters, reports, and even their own scholarship.

This is our moment to rethink what plagiarism and cheating mean for both students and faculty. It is also an opportunity to ask what it would take to harness a tool like ChatGPT in a reasonable and ethically responsible way, one that reflects the parameters of its application and impact in the library world, on the profession at large, and for the communities we serve. There are other factors to consider as well, based on my own experiences speaking with library workers about the tool’s impact on our services.

Looking Back to Look Forward

When I worked at the Hoover Institution Library & Archives at Stanford University as the education and outreach manager in early 2022, a chatbot was being piloted to make connections among archival collections. It was programmed by Dr. Kirill O. Kalinin, a political methodologist and comparativist with expertise in statistical analysis, machine learning, and natural language processing. I remember multiple conversations with Kirill about the chatbot’s potential to support research and access for special collections. In my role, I uploaded many lines of content into the chatbot, sourced from research and subject guides that I managed and co-created. For example, if researchers asked the chatbot about African American collections at the Hoover Institution Library & Archives, it would recommend the Dr. Condoleezza Rice Papers (the 66th United States secretary of state and current director of the Hoover Institution), the Civil Rights Movements in Alabama Pamphlet Collection (1965-1966), and other relevant collections. The tool showed great potential for generating discovery, connections, and access across the extensive and disparate archival collections of a major institution, even before the launch of ChatGPT.

On Privacy

For students, especially those from marginalized backgrounds, questions of privacy arise. In February 2023, I presented a webinar for Lifelong Learning Information Literacy’s (LILi) Show & Tell series entitled “Using ChatGPT to Engage in Library Instruction? Challenges and Opportunities.” Many of the 300 registrants asked about privacy. Attendees asked whether students should create ChatGPT accounts to experiment with the tool, which raises potential issues around tracking what users ask, collecting usage data, and building data profiles. This could lead to privacy surveillance and data brokering, reminiscent of the incident in which LexisNexis provided personal information and data to U.S. Immigration and Customs Enforcement.7 The concern for privacy is valid.

In a blog post, Autumm Caines, an instructional designer at the University of Michigan-Dearborn, recommended ways to mitigate potential harms the tool may cause.8 Caines suggested that students not create their own ChatGPT accounts; instead, the instructor can demo the tool live or include a copy/paste prompt in a presentation slide for students to observe. Caines also suggested that students could use burner email accounts, which would reduce personal data collection and surveillance when using the tool.

On Information Privilege

In March 2023, Annie Pho, head of instruction and outreach at the University of San Francisco’s Gleeson Library, invited me to facilitate a session on ChatGPT in connection to teaching and learning. Dr. Shawn P. Calhoun, university library dean, raised an important question about information privilege and who can actually access this tool for academic success. ChatGPT now has a free version and a fee-based premium version, which may stratify users into those who can afford it and those who cannot. Information privilege is an LIS concept that focuses on “the affordance or opportunity to access information that others cannot.”9 It is a critical concept in this context: well-resourced major research universities are better able to apply the tool in teaching and learning settings than less-resourced institutions. As with many costly library databases and tools, paid versions can be difficult to sustain because of ongoing subscription costs. We know that there will be new versions of ChatGPT (an updated model, GPT-4, is already in limited release and has performed successfully on the bar exam, the Law School Admission Test, and the Graduate Record Examinations, though not on the Advanced Placement English Literature and Composition and English Language and Composition exams)10 as well as alternatives to this tool, but we need to acknowledge that there are cost implications for both individuals and institutions.

As an academic/school librarian, I work in a unique setting where I collaborate with teacher educators/faculty on curriculum development and teach and support preservice teachers/graduate students. One area I have been exploring is how to use this tool to engage critically with its many applications. There are ways to foster critical thinking by interrogating ChatGPT’s responses, especially its hallucinated citations. Another activity is to identify ways to improve a generated passage, such as by searching for appropriate scholarly sources and questioning the information ChatGPT produces through an ethical framework grounded in core values such as transparency, privacy, and data governance.

This is also an opportunity to teach algorithmic literacy: the awareness and understanding of how algorithms are used in online systems, and the ability to critically analyze and evaluate algorithms and their biases. For example, Ashley Shea, head of instruction initiatives at Cornell University Library, created a LibGuide highlighting this learning outcome for the library instruction program: “Students,” she wrote, “will be able to articulate how the automation and embedded biases of algorithms lead to personalization, sorting and discrimination.”11 There are many future opportunities to consider, but it is also important to note how ChatGPT is already affecting library services, personnel, and our researchers.

I will conclude this piece by quoting Dr. Ruha Benjamin, professor of African American studies at Princeton University, on emerging technologies and questions of ethics: “The key thing for me is it’s not simply about making the technologies better at doing what they say they’re supposed to do, but it’s also widening the lens to think about how they’re being used, what kinds of systems they’re being used in, and bring the question back to society, not just the designers of technology.”12 There are lingering questions to ask, but it is critical to acknowledge both the immediate impact on us and the long-term changes ahead.

Endnotes

1. Loida Garcia-Febo, “Exploring AI,” American Libraries. Last modified March 1, 2019. https://americanlibrariesmagazine.org/2019/03/01/exploring-ai/

2. Catherine Nicole Coleman and Michael A. Keller, “AI in the Research Library Environment,” in Artificial Intelligence in Libraries and Publishing, eds. Ruth Pickering and Matthew Ismail (Ann Arbor: The Charleston Conference-ATG Media, 2022), 27-32. https://doi.org/10.3998/mpub.12669942

3. AM Digital, “Universities and Colleges.” Last modified March 17, 2023. https://www.amdigital.co.uk/create/universities-and-colleges

4. Andrew Carlos (@infoglut), “A student asked me for help finding the full text for two articles, both of which had the same exact title. Immediately alarm bells were ringing and yup. They were chatgpt references,” Twitter, February 27, 2023, 7:08 AM. https://twitter.com/infoglut/status/1630223245273346049

5. Janelle Zetty (@jazetty), “Here is an example of an authority record created with ChatGPT. The person doesn’t have a name authority record in LC yet,” Twitter, March 6, 2023, 6:06 AM., https://twitter.com/jazetty/status/1632744388798586880

6. Emma Bowman, “A college student created an app that can tell whether AI wrote an essay.” NPR. Last modified January 9, 2023, https://www.npr.org/2023/01/09/1147549845/gptzero-ai-chatgpt-edward-tian-plagiarism

7. Sam Biddle, “LexisNexis to Provide Giant Database of Personal Information to ICE.” The Intercept. Last modified April 2, 2021. https://theintercept.com/2021/04/02/ice-database-surveillance-lexisnexis/

8. Autumm Caines, “ChatGPT and Good Intentions in Higher Ed.” Is a Liminal Space Blog. Last modified December 29, 2022. https://autumm.edtech.fm/2022/12/29/chatgpt-and-good-intentions-in-higher-ed/

9. Sarah Hare and Cara Evanson, “Information Privilege Outreach for Undergraduate Students,” College & Research Libraries 79, no. 6 (September 2018): 726. https://doi.org/10.5860/crl.79.6.726

10. Rumman Chowdhury (@ruchowdh), “Many nuggets of insights into this GPT-4 paper but this is one of the most compelling — across the board GPT performs poorly at AP English — it’s incapable of abstract creativity. Same with complex leetcode which is ultimately an abstraction codified. Humans aren’t replaceable,” Twitter, March 14, 2023, 11:47 AM. https://twitter.com/ruchowdh/status/1635714392372449282

11. Ashley Shea, “Mann Instruction Materials for Faculty, Staff and Librarians,” Cornell University LibGuide. Last Modified August 10, 2022. https://guides.library.cornell.edu/InstructionResources

12. Ruha Benjamin, “Princeton University’s Ruha Benjamin on Bias in Data and AI,” The Data Chief. Last Modified October 20, 2022. https://www.thoughtspot.com/data-chief/ep25/princeton-university-ruja-benjamin-on-bias-in-data-and-ai
