Developing FRAME, EMMA, and ARM: Systems and Metadata for Sharing Resources Remediated for Accessibility
By Bill Kasdorf (Principal, Kasdorf & Associates, LLC; Co-Founder, Publishing Technology Partners)
Despite the significant progress on making publications accessible in the past few years, the unfortunate truth is that the majority are not fully accessible. Many trade, scholarly, and educational publishers have made great strides in developing “born accessible” publications, with help from the production services and hosting platforms that create or disseminate books and journals. This means that new books and journals are increasingly accessible, at least in one of their available formats (ideally EPUB 3 or HTML) when they’re initially published. But these items are a small fraction of the resources in the scholarly and educational ecosystem.
This problem is especially acute in colleges and universities, many of which are legally required to provide accessible versions of course materials to any student who needs them due to a disability that prevents reasonable access to standard print or digital publications. Most institutions have disability services offices, known as DSOs, whose staff must obtain a digital file of a required publication — often a PDF that is inaccessible or only partially accessible — and fix it, a process known as remediation. That remediation is done in a particular way for a particular individual student, depending on the nature of that student’s disability.
Remediation Is Labor-Intensive and Wasteful
Publishers, understandably wary of piracy, have not typically made this process easy. Because digital files are so easy to duplicate, publishers usually stipulate that the DSO to which they’re sending a file for remediation may not share that remediated file; it can only be provided to the specific individual for whom it has been requested, who must also agree not to share it.
Increasingly, students can obtain an EPUB of the resource they need, but even an EPUB may not be sufficiently accessible and may still need remediation. Sometimes the DSO staff has to create the digital file by buying a print copy, scanning it, and using optical character recognition (OCR) to convert the resulting image into machine-readable text. Although OCR has become much more sophisticated over the past two decades, it still produces files that contain typographical errors and lack the structure a user needs to navigate them properly. So the DSO staff must proofread the scanned file, correct the errors, and add tags for things like headings, lists, and links. You can imagine how much work this involves. And that’s just making it readable, not fully accessible (more on this below).
The result is a shockingly inefficient system. The disabled student may have to wait weeks for the resources they need to be remediated, thus falling behind the other students in the class. The staff at DSOs are chronically overworked, and their work comes in unevenly, with peak demand at the beginning of each semester. Finally, the same book or article may be remediated by dozens, scores, or even hundreds of colleges and universities, each working in their own silo, never sharing the files — some not even with other students at the same institution.
Fixing the System: The FRAME Project
When John Unsworth, now Dean of Libraries and Professor of English at the University of Virginia, was in a similar role at Brandeis University, he worked with a student majoring in film who was blind. Remediating movies is far more difficult than remediating books and articles. For every movie that student was required to “view,” the university had to provide audio descriptions, separate narrative audio tracks describing the visual content — and that narrative must not interfere with the dialog of the film itself. This work is fiendishly difficult and expensive.
It came as a shock to Unsworth to learn that those files would not be shared with other similarly disabled students, either at Brandeis or any other university, not to mention all the remediated books and articles that his student and so many others like her needed. Finding this situation intolerable, he resolved to act. He obtained an IMLS grant that funded a study, “Repository Services for Accessible Course Content,”1 which scoped out this problem, including not only the issues and their extent but also the prevailing perceptions and misperceptions.
This study resulted in both an article, “Toward Accessible Course Content: Challenges and opportunities for libraries and information systems,” published in the Proceedings of the Association for Information Science and Technology in December 2016,2 and an extensive white paper, “Libraries: Take AIM! Accessible Instructional Materials and Higher Education,” published in March 2017.3 That project documented, among other things, that it is widely but incorrectly believed by DSOs that they are legally prevented from sharing the remediated files that they create. (Sharing may be contractually prevented, depending on the source of the file that was remediated, but it is not illegal.)
A four-year Mellon-funded project, FRAME: Federating Repositories of Accessible Materials in Higher Education, followed.4 FRAME involved the academic libraries and DSOs of eight universities (George Mason University, the University of Illinois Urbana-Champaign, Northern Arizona University, Ohio State University, Texas A&M University, Vanderbilt University, the University of Virginia, and the University of Wisconsin) and three leading repositories of content: Benetech’s Bookshare, HathiTrust, and the Internet Archive. (A later participant was Ace, the Accessible Content ePortal from OCUL, the Ontario Council of University Libraries.)
FRAME’s goal was to develop a unified search across the participating repositories to improve the ability of the DSOs to obtain resources their staff were required to remediate, and also to create a new repository in which remediated resources from whatever source could be deposited (not limited to the repositories in the project) — thus enabling other participating DSOs to obtain those already-remediated resources. Since book remediation is consistently the main challenge for DSOs, it was decided to focus the initial development of the system on books, although the system would also accommodate journal articles.
FRAME’s first priority was to clearly establish the legal foundation for its work in order to clarify the confusion across the DSO community about sharing their outputs. A group of legal experts was convened at the Association of Research Libraries headquarters in Washington, D.C., for a meeting focusing on “The Law and Accessible Texts,” in January of 2019. That meeting and the subsequent work by those experts resulted in a white paper, “Reconciling Civil Rights and Copyrights: The Law and Accessible Texts,”5 which clearly established that it is not a violation of copyright to provide an accessible version of a resource to a person who has a disability that impairs their ability to fully consume the published version. This so-called “copyright exception” is based on both U.S. law (e.g., the Chafee Amendment6) and international law (e.g., the Marrakesh Treaty,7 to which the U.S. became a signatory on February 8, 2019).
Developing EMMA: Educational Materials Made Accessible
The primary tangible result of the FRAME project was the development of EMMA, a technical infrastructure and repository at the University of Virginia.8 Initially based on the technology stack used by Bookshare, it was highly customized and refined by the UVA developers to address the specific needs expressed by the DSOs of the participating institutions.
For example, where Bookshare’s search function is designed to help people with print disabilities find books of interest (e.g., “find me books on the development of the Panama Canal,” or, even more generally, “find me a book about animals”), EMMA requires a “known item search”: the exact edition of the exact book specified by a professor on a student’s syllabus. And where Bookshare is intended to be used directly by a print-disabled person, EMMA is designed to be used by the staff of a given DSO in service to the disabled student, faculty member, or staff person. End users do not have access to EMMA’s interface; it is the DSO’s responsibility to ensure that the recipient of the remediated resource is qualified to use it by virtue of a print disability. Finally, EMMA needed to be able to search across all three repositories (the Internet Archive and HathiTrust in addition to Bookshare), no two of which are exactly the same in design. This search enhancement involved very significant re-engineering of the Bookshare stack, though far less work than starting from scratch.
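To make the idea of a known-item search across several repositories concrete, here is a minimal Python sketch. It is not EMMA's code: the `Hit` record, the `CATALOGS` dictionary, and the function names are all hypothetical, the catalogs are tiny in-memory stand-ins rather than real repository APIs, and the ISBNs are placeholders. It assumes the exact edition is identified by ISBN.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hit:
    """One catalog record returned by a repository (hypothetical shape)."""
    repository: str
    isbn: str
    title: str
    file_format: str  # e.g. "PDF" or "EPUB"


def normalize_isbn(raw: str) -> str:
    """Drop hyphens and spaces so ISBNs from different sources compare equal."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


# In-memory stand-ins for the three repositories; all records are invented.
CATALOGS: Dict[str, List[Hit]] = {
    "Bookshare": [
        Hit("Bookshare", "9780000000002", "Example Text, 3rd ed.", "EPUB"),
    ],
    "HathiTrust": [
        Hit("HathiTrust", "978-0-00-000000-2", "Example Text, 3rd ed.", "PDF"),
    ],
    "Internet Archive": [
        Hit("Internet Archive", "9780000000019", "Example Text, 2nd ed.", "PDF"),
    ],
}


def known_item_search(isbn: str, catalogs: Dict[str, List[Hit]]) -> List[Hit]:
    """Return every record, from any repository, for exactly this edition."""
    wanted = normalize_isbn(isbn)
    hits: List[Hit] = []
    for records in catalogs.values():
        hits.extend(r for r in records if normalize_isbn(r.isbn) == wanted)
    return hits


if __name__ == "__main__":
    for hit in known_item_search("978-0-00-000000-2", CATALOGS):
        print(hit.repository, hit.file_format)
```

The second edition held by the third catalog is deliberately excluded: a known-item search matches the edition on the syllabus, not books on the same topic.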
EMMA also needed to address issues that none of the three repositories needed to address individually. Although all three require a combination of bibliographic metadata and administrative metadata for general functioning, the DSO staff needed additional metadata that expressed specific information to help select a resource to remediate, and they needed to provide specific information about the remediation tasks they performed when depositing a remediated resource into EMMA. Finding that no appropriate metadata model existed to support these descriptions, the project developed the EMMA remediation metadata model based on the requirements and vocabularies used by the participating DSOs.
DSO staff are particularly concerned with understanding the technical complexity of a required book’s content and design and the challenges it presents for remediation. For example, a book that contains significant math, tables, or many images is likely to be much more difficult to remediate than one that does not. Remediators need to know the formats in which the book is available, such as PDF or EPUB, and the quality of the text: an image-based PDF requires conversion to text using OCR and subsequent cleanup, whereas a published text-based PDF or EPUB has text that can generally be trusted to be accurate. What is the nature of the equations, if there are any? Assistive technology will properly read MathML, but images of equations will need a lot of work to help a blind reader understand them properly. Speaking of images, do the images have alternative text (alt text), descriptive text that conveys the meaning and context of a visual item in a digital setting? Knowing the answers to questions like these in advance of remediation work will have a significant bearing on which of multiple alternative available resources the DSO will select.
Additionally, “accessible” is not a binary concept: remediation is done to address the needs of a given student with particular disabilities. A DSO staff member remediating an EPUB for a dyslexic student will not need to provide alt text for images, but another DSO staff member needing the same book for a blind student will need to provide it. On the other hand, the blind student may be satisfied with the PDF (with alt text, of course), whereas the dyslexic student may need EPUB so that they can substitute the font and change the line spacing, and a low-vision student may need the EPUB so they can enlarge the font and so it can be made to reflow. Each of these students needs their reading experience improved, but in slightly different ways.
The EMMA metadata also needed to enable DSO staff, when depositing a remediated resource, to convey other information that will be of value to another DSO — since the whole point of EMMA is the sharing of remediated resources. Did they fix the heading structure? Did they tag an untagged PDF? If there’s math, did the DSO create MathML, or just alt text? Did they remediate the whole book, or just chapters one through six? Did they convert it to Word?
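The questions above suggest the kind of record a depositing DSO might fill in. The following Python sketch is purely illustrative: the class, field, and enum names are invented here and are not the actual EMMA or ARM property names or vocabularies. The gap check for a screen-reader user is likewise an assumption drawn from the examples in this article, not a rule from either model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class TextQuality(Enum):
    OCR_UNCORRECTED = "ocr_uncorrected"  # scanned and OCRed, not proofread
    OCR_CORRECTED = "ocr_corrected"      # scanned, OCRed, and proofread
    PUBLISHER_TEXT = "publisher_text"    # text-based file from the publisher


class MathHandling(Enum):
    NOT_ADDRESSED = "not_addressed"
    ALT_TEXT = "alt_text"
    MATHML = "mathml"


@dataclass
class RemediationRecord:
    """Hypothetical description of one remediated file."""
    isbn: str
    file_format: str                     # e.g. "PDF", "EPUB", "Word"
    text_quality: TextQuality
    headings_fixed: bool = False
    pdf_tagged: bool = False
    images_have_alt_text: bool = False
    math: MathHandling = MathHandling.NOT_ADDRESSED
    # None means the whole book was remediated; otherwise chapter numbers.
    chapters_remediated: Optional[List[int]] = None

    def covers_whole_book(self) -> bool:
        return self.chapters_remediated is None


def gaps_for_screen_reader_user(rec: RemediationRecord) -> List[str]:
    """List work a second DSO may still need to do for a blind student."""
    gaps: List[str] = []
    if not rec.images_have_alt_text:
        gaps.append("alt text")
    if not rec.headings_fixed:
        gaps.append("heading structure")
    if rec.file_format == "PDF" and not rec.pdf_tagged:
        gaps.append("PDF tags")
    return gaps


if __name__ == "__main__":
    rec = RemediationRecord(
        isbn="9780000000002",            # placeholder ISBN
        file_format="PDF",
        text_quality=TextQuality.OCR_CORRECTED,
        headings_fixed=True,
        chapters_remediated=[1, 2, 3, 4, 5, 6],
    )
    print(rec.covers_whole_book(), gaps_for_screen_reader_user(rec))
```

A record like this lets a second DSO see at a glance whether a deposited file already meets a particular student's needs or only saves part of the work.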
All of this description of design and work done was critical to the development of the EMMA metadata model. EMMA’s search function is designed to require only the minimum such metadata to find a resource; but it is essential that the redeposited remediated resource has much more complete, and of course accurate, bibliographic and administrative metadata. But the staff of the DSO is not necessarily expert in bibliographic and administrative metadata. This is where the academic libraries at the participating institutions come in.
Librarians are metadata experts, and as they typically work with information systems that are far more sophisticated than anything the DSO deals with, they play an essential role in the EMMA workflow. EMMA’s design enables the DSOs to supply the metadata that they’re expert in, but it also has features that can generate authoritative bibliographic metadata based on an ISBN, for example. And because the DSO staff may accumulate metadata in spreadsheets in the course of their work, the EMMA system is designed to enable bulk upload of resources, which is more commonly done at the end of a semester than individually as each book is remediated. Librarians at participating institutions are responsible for the upload process as a whole, including ensuring that all the administrative metadata is present and proper, and that the bibliographic metadata is complete and accurate.
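As a rough illustration of the checking a librarian might do before a bulk upload, the Python sketch below validates a spreadsheet exported as CSV. The column names in `REQUIRED_COLUMNS` and the function names are assumptions made for this example, not EMMA's actual upload format, and the ISBNs are placeholders. The ISBN-13 check-digit rule itself is standard: the first twelve digits are weighted alternately by 1 and 3, and the check digit brings the total to a multiple of 10.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical column names for a DSO's end-of-semester spreadsheet.
REQUIRED_COLUMNS = ("isbn", "title", "file_format")


def is_valid_isbn13(raw: str) -> bool:
    """Check length, digits, and the standard ISBN-13 check digit."""
    cleaned = raw.replace("-", "").replace(" ", "")
    if len(cleaned) != 13 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(cleaned[:12])
    )
    return (10 - total % 10) % 10 == int(cleaned[12])


def check_manifest(csv_text: str) -> List[Tuple[int, str]]:
    """Return (row_number, problem) pairs; an empty list means no problems."""
    problems: List[Tuple[int, str]] = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for column in REQUIRED_COLUMNS:
            if not (row.get(column) or "").strip():
                problems.append((row_number, f"missing {column}"))
        isbn = (row.get("isbn") or "").strip()
        if isbn and not is_valid_isbn13(isbn):
            problems.append((row_number, "invalid ISBN-13"))
    return problems


if __name__ == "__main__":
    sample = (
        "isbn,title,file_format\n"
        "978-0-00-000000-2,Example Text,EPUB\n"
        "9780000000003,Other Text,PDF\n"
        "9780000000019,,PDF\n"
    )
    for row_number, problem in check_manifest(sample):
        print(f"row {row_number}: {problem}")
```

Catching a mistyped ISBN at this stage matters because, as described above, the ISBN is the key from which authoritative bibliographic metadata can be generated.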
Building on EMMA to Create NISO’s ARM
When the grant-funded pilot phase of the FRAME project was complete, the EMMA metadata model was offered to NISO for standardization. NISO formed a working group in late 2023 to develop an ANSI/NISO standard model called Accessibility Remediation Metadata, or ARM.9
One of the primary tasks of this NISO working group is to review the existing EMMA model in order to refine and expand it to address use cases that were out of scope for EMMA. EMMA very deliberately focused only on what the libraries and DSOs in the pilot phase required, and on making the resulting system no more complicated than necessary.
The NISO ARM model has a much broader scope. People representing — and having expertise in — a wide variety of areas were recruited to form the working group. That working group has since formed six subgroups, each focusing on an area that may need to be better addressed in the ARM model. They are the following:
• The Resource Sharing subgroup focuses on resource sharing between libraries or other organizations, as with interlibrary loan. What kinds of resources are shared, and to what extent are they remediated by the borrower when needed by the requesting party? Does the lender get the remediated resource back? What does the associated metadata interchange need to express?
• The Non-Print Disabilities / Access Barriers subgroup focuses on types of disabilities not addressed by EMMA, such as neurodiversity and motor disabilities. This group also addresses access barrier issues in addition to access consumption issues.
• The Non-Text Media subgroup focuses on media such as audio, video, animations, and interactivity, all of which play an increasingly important role in scholarship and education.
• The Primary and Secondary Education subgroup looks beyond EMMA’s university focus to assess the needs of K-12 education, such as workbooks, elementary math, and complex fixed layout and color usage common in school textbooks.
• The Higher Education subgroup takes a deeper dive into issues in colleges and universities, such as access issues like paywalls, institutional access, and open access.
• The Internationalization subgroup will focus on languages and scripts as well as the needs in developing countries, such as issues of low or no bandwidth.
The work of these subgroups will inform the development of ARM, including not only identifying issues not addressed by the EMMA model but also the incorporation of new properties and vocabularies needed to address them.
The NISO Accessibility Remediation Metadata (ARM) Working Group officially began work in January 2024 and currently plans to complete its work, resulting in ANSI/NISO standardization, by October 2025.
Meanwhile, EMMA is currently transitioning to a membership model, which entails moving away from HathiTrust and Bookshare (each having their own membership models and unable to permit resources to be deposited in the EMMA repository), and incorporating OpenAlex10 and Unpaywall,11 which will expand its scope with extensive open access journal content — resulting in an update of the EMMA metadata model that ideally will align with ARM.
Endnotes
1. IMLS National Leadership Grants – Libraries, 2015. Log Number: LG-72-15-0009-15. Proposal, “Repository Services for Accessible Course Content.” https://www.imls.gov/grants/awarded/lg-72-15-0009-15
2. Fenlon, Katrina, and others. “Toward accessible course content: Challenges and opportunities for libraries and information systems,” Proceedings of the Association for Information Science and Technology, 27 December 2016. https://doi.org/10.1002/pra2.2016.14505301027, Accessed 26 March 2024.
3. Wood, Laura C., and others. “Libraries: take AIM!: accessible instructional materials and higher education.” 2017. http://hdl.handle.net/10427/010667, Accessed 26 March 2024.
4. Full disclosure: I served as a consultant to the FRAME project all four years and continue to participate in the work. Over this time, I have written and spoken extensively about the project. Some of the content in this article was also covered in my article “FRAME and EMMA: Enabling responsible sharing of resources remediated for accessibility,” Information Services & Use, December 2022, https://doi.org/10.3233/isu-220166, which covered a presentation I gave at NISO Plus 2022.
5. Butler, Brandon, and others. “The Law and Accessible Texts: Reconciling Civil Rights and Copyrights,” Association of Research Libraries (ARL) and the University of Virginia (UVA) Library, 22 July 2019. https://www.arl.org/resources/the-law-and-accessible-texts-reconciling-civil-rights-and-copyrights/, Accessed 26 March 2024.
6. 17 U.S.C. 121 & 121A, The Chafee Amendment, 2018. https://www.loc.gov/nls/about/organization/laws-regulations/copyright-law-amendment-1996-pl-104-197/, Accessed 26 March 2024.
7. World Intellectual Property Organization (WIPO). Marrakesh Treaty to Facilitate Access to Published Works for Persons Who Are Blind, Visually Impaired or Otherwise Print Disabled, 27 June 2013. https://www.wipo.int/treaties/en/ip/marrakesh/, Accessed 26 March 2024.
8. https://emma.uvacreate.virginia.edu/