ATG Interviews Clifford Lynch
Executive Director, Coalition for Networked Information (CNI)
Interview conducted by Michael Upshall (ConsultMU)
Transcript edited by Leah Hinds (Executive Director, Charleston Hub)
The following is an excerpt of an interview that was conducted for ATG the Podcast that aired on September 8, 2024.
Michael Upshall: Well, Clifford, welcome to ATG the podcast. We’re very pleased to have you on the podcast today. You’ve been involved with the Coalition for Networked Information for many years. What we’d like to do today is to talk a little bit about your background, how you got to the Coalition, and a little bit about your role there. So thank you for joining. It’s great to be here with you.
When you started with your studies, you actually started out studying maths before moving to computing. Why did you switch from one to the other?
Clifford Lynch: Well, a couple reasons, and you have to remember how long ago that was. I mean, we’re talking I started college in 1972, gosh, and I really had no exposure to computers until my second year in college. You know, they just weren’t a thing for most students in high school at that time. So, it was a mix of just discovering something that was really interesting and also realizing that much as I enjoyed mathematics, I wasn’t really sure I wanted to spend my life as a research mathematician.
MU: It sounds like very soon after getting involved in computing, you found yourself on a very interesting library project in California doing online catalogs.
CL: Yeah, actually, I had started about the same time, shortly after I started getting involved in computers. I ran into a professor at Columbia’s library school, Theodore Hines, who was doing a lot of work with a language called SNOBOL4, a symbolic text and computing language. And through him, I got very interested in a lot of the sort of textual and related applications of computing, which at that time were very esoteric and connected up with a fellow named Ed Brownrigg, who was at New York University. I went to work for him when I was still in college doing some library applications there. Then in 1980, give or take, I went with Ed Brownrigg to the University of California Office of the President, which was the headquarters for what was at the time a nine-campus system, to build a very large scale library automation system, basically a union catalog for all of the collections of the University of California’s roughly 100 libraries, under the slogan, “One university, one library.” The idea being anyone at any of the campuses could search the entire collection. And that was, of course, backed up with a whole series of interlibrary loan arrangements and things to physically get material to people. So that was, for the time, an enormous project, and really quite innovative. One of the things that’s really striking to me about it, even looking back, is that this was about the same time as automatic teller machines were coming into fairly wide deployment. If you think about the public’s interaction with computers, the first places they were seeing it were ATMs and library catalogs, which was just fascinating. So it was a whole adventure trying to make this kind of thing user-accessible. And as we went down the path from there, we moved on to many other things. 
One of the things we did very early was to make the system accessible not just from in-library terminals but first from dial-up terminals and then later on from Internet access and to open that up, really because it was a public institution in California, to the world, which was quite unheard of at the time. So we actually had people using the system from all over the globe. It is really quite interesting. As things progressed through the ’80s, we also started putting up abstracting and indexing databases. One of the early ones we did, which was very high impact, was to add the National Library of Medicine’s Medline database to the system, which again was striking, in that it really opened up health information to the public in a way that had never been done before. And at a scale that had never been done before. Let us say it was met with mixed reactions by physicians around California when patients walked in with stacks of article abstracts asking, “Well, what about this?” But that was a wonderful experience. We had a great team there.
The last thing I’ll just mention about the University of California, because it’s relevant for a lot of things, is that in order to really deploy that online catalog at that scale, we needed a pretty sophisticated computer network. And we actually worked with Bolt, Beranek and Newman (BBN), the original contractor on the ARPANET, to put in an ARPANET-like packet-switched network to support this, which became the basis for a lot of the inter-campus networking at the University of California until I left the university in ’97. Inter-campus networking was essentially a library function that reported to me, a library automation function. So, I was really very deeply involved in networks.
MU: It sounds amazing that you had this first involvement with libraries and you got a chance to work with one of the biggest and, still today, one of the largest catalogs. The California e-scholarship system is one of the biggest institutions and repositories. It’s quite amazing and sort of fortuitous. They’ve continued to be a leader throughout. So, do you think that early experience sort of stood you in good stead for your later involvement in CNI?
CL: Yeah, the California Digital Library (CDL) and the UC Libraries collectively are wonderful. Absolutely remarkable. Fabulous work. Yeah, because a lot of what played out through my time at CNI really was an outgrowth, extensions and evolutions of things that got their start back when I was at UC. Network-based computing, the shift to digital content, and then the shift to largely centralized digital content. All of the policy issues around government information on the Internet, scholarly publishing on the Internet, many of those things had their roots back then.
MU: Let’s talk about the Coalition for Networked Information. You weren’t actually the founder, but you were there pretty early on. I think you joined around 1997, but I think it was Paul Evan Peters who started it. What was the basis? What was the idea behind it?
CL: So actually, while I was certainly not the founder, there was a sort of a complex prehistory to the Coalition that I was very heavily involved in. The person for whom I worked for a long time at the University of California was a man named Richard West, who was chief information officer for administration and had IT and a lot of other things under him. He had a number of colleagues around the country who were either chief librarians or chief information officers, who shared his feeling that there was just a wealth of opportunity that was being opened up in the 1990s by the large-scale rollout of the Internet that was going to affect every facet of higher education activities: teaching, learning, research, libraries. And there were a number of informal groups among those people who got together and ultimately decided to found CNI. And they recruited Paul in as the founding director.
MU: And how did you get involved?
CL: Well, I was part of a lot of those discussions in that prehistory. And when Paul came in, I ran one of the sub-projects, certainly. He was a good friend, and I worked very closely with him throughout his time there. Well, really I knew him before CNI, but I worked with him very closely through his tenure at CNI. And like everybody else, I was quite devastated by his sudden and untimely death.
MU: Why the word coalition, which is still not common for groups today?
CL: I think... I don’t really know exactly the source of that, other than to, you know, sort of characterize it along the lines of coalitions of the willing. We didn’t want to have what I’d characterize as a membership organization, you know, which was like about representing a sector or something. The way, for example, some library organizations are, but rather to bring together a diverse group of organizations bound by a common interest or a common vision. And that’s very much what happens. You see, certainly the predominant membership in CNI is research universities, but once you start looking beyond that, you very quickly see foundations, government agencies, publishers, technology providers, really quite an assortment of organizations. And if you really go back to the membership as it was in the ’90s, it was even more diverse, I would say, because you know since then we’ve had an enormous amount of consolidation in the industries if you think about library automation systems, if you think about publishing. But you’ve also had some really interesting structural changes. So, for example, if you look in the very early days, companies like Sun, DEC, and IBM were represented among our members, and Apple. The way that those companies engaged with the higher-ed sector started changing really dramatically in the 2000s. There were a number of joint projects that were done between various libraries and companies like DEC and Sun back in the ’90s or early 2000s, and we just don’t see things like that any more.
MU: So that’s what you mean by networked information, by working across borders, as it were, with technology providers, institutions, and bringing them all together.
CL: So that’s one piece of it. Another piece of it, though, which is very much a product of its times, was if you go back and you look at the early ’90s particularly, there was still a lot of excitement around standalone PCs. We had phenomena like CD-ROMs. Remember the New Papyrus notions of, you know, publishing things on physical media that could be plugged into standalone machines. The whole idea of networking was still very much in flux at that time. And we wanted to put a firm stake in the ground that we believed the future was the network, was network access, using the network to share information and also using the network to interconnect information. That makes sense. Very much the kind of ideas that subsequently played out in things like Gopher and then ultimately the World Wide Web.
MU: You’re preaching to the converted here as the publisher of the first British Encyclopedia on CD-ROM. I remember the excitement of just getting all this stuff on screen. It’s just amazing. Fascinating. So, you mentioned that it was rather different in the early days, and you certainly talked about consolidation. How do you think CNI today compares with that organization from the ’90s?
CL: Well, I mean, if you look at what we were doing at CNI in the ’90s and into maybe the 2005 or so time frame, a tremendous amount of it was, I think you used the term in a prior conversation with me about sort of evangelistic consultation, something like that, which really I thought characterized it beautifully. The amount of time that Paul spent, that I spent, and that other people involved in CNI spent talking to scholarly societies, to research groups, to foundations, to government agencies, just getting them to understand there was this network out there and here are the kinds of things you can do with it and you’re going to need to think about how this affects your strategies for disseminating information, collecting information, sharing information, and collaboration. I mean, it was just a huge lift. The number of organizations that needed to be brought up to speed on that was just stunning.
And, you know, every one of them had a slightly different situation, and so it was really very much a question of going out and you know talking it through with people wherever they were, and you know how it affected the things they most cared about at the time. There was also a whole enormous subtext of threat to economic models as well, particularly for some of the scholarly societies which have always had fragile economic models. Those sorts of threats still exist today, but perhaps we can come back to that in a slightly different way.
MU: So today, CNI is very much a discussion body, a little bit less of the evangelism. How do you facilitate the dialogue and discussion among your members, which is rather different from the initial evangelism mode?
CL: Well, you know, I think we’re a little bit more than a discussion body. We certainly are a discussion body; I think we are sometimes very effective as an incubator for projects that are launched among our members. A place where our members can connect with other like-minded members, find a common problem, set out to do some work on it, and then subsequently serve as a reporting out vehicle for that, a showcase for that. I think another function we play is very much one of an early warning radar. I see a significant part of my role as identifying trends, developments, technologies, issues that need to be on our members’ radar for five-year strategic planning, which is very, very different than almost all of the other organizations I know, which emphasize heavily, you know, stuff you can take home and use right now.
MU: I remember reading about you described as an oracle in the culture of networked information. So clearly there’s a view of CNI as providing some sort of authoritative guidance, of awareness, and understanding of current trends in an informed way. Do you think that’s the case?
CL: We try, we try. I like to think we’ve got it. You know, we have a history of getting it mostly right. If you look at a number of the really important developments, you will see those reflected as featured plenaries or similar things at CNI very, very early on. And there are some that are still coming to roost. And I would also say, as time has gone on, particularly in the last seven or eight years, I think we have been focusing a bit more on the research enterprise broadly.
You know, CNI had a lot of its roots in collaborations between information technology and libraries. But that conversation now has gotten much broader and much more complicated. And in particular, you know, the whole nature of how we do research is changing in really significant ways. And that’s affecting, of course, not just how we do research, but how we disseminate the results, the collaboration tools around the research, trends like research data management, which is something that CNI has been all over since the turn of the century, and which was regarded as a little bit loony on the topic till around 2015. I can remember telling people that in 2002, 2003, that this was going to be a major piece of budgetary investment for them. And they’re like, no, couldn’t happen. But it seems to have. So anyway, one that I’m watching very, very closely now is the trend towards automated labs on a large scale. When we first brought CNI back together in person, as the pandemic diminished in December 2021, the plenary session was a look at something called the Cloud Lab that Carnegie Mellon had just sunk an enormous amount of money into. Basically, they were working with a company called Emerald Cloud Lab, which was founded by a bunch of CMU graduates, and had set up a commercial facility out near San Francisco Airport. And, basically, they duplicated that roughly in collaboration with Emerald in a building slightly off campus in Pittsburgh for Carnegie Mellon and other educational institutions in the area, other startups. And that’s gone operational now on a large scale. They were just starting to build it in 2021.
Now, if you think about the implications of this for everything, from the way it changes the balance of capital and operating expenses for faculty in the sciences, all the way through issues around productivity and utilization of equipment, it’s a real game changer. And of course, it interfaces very neatly on one side into research data management infrastructure, the whole set of trends around scientific reproducibility and clear documentation of scientific research. And it’s very amenable, if you want to put machine learning things in the loop, to semi-autonomous robotic research. So, we’ve had that very strongly on our radar screen since ’21. And the reactions were everything from “I gotta go back and talk to my provost about this right now,” to “We need to be thinking about what this means,” all the way through to “I have no idea what this has to do with my institution and my library.” But I feel like this is exactly what we should be doing, getting carefully selected key developments like this on people’s radar screens early.
MU: Absolutely. Do you think that the relatively small size of CNI, you have around 200 members, has enabled that kind of focused development to take place? If you compare it with major events like the Charleston Conference that we’re both familiar with, with thousands of attendees, much larger scale, but very different in sort of tone and focus. Do you think that relatively small number of members facilitates the kind of joint development that you’ve described?
CL: Yes, but let me say a couple of particular things about that which are important to understand. Membership in CNI is institutional. Institutions make a commitment to CNI and are typically represented by a very small number of people, most commonly two, in leadership roles in the institution. Typically, one of them is the head of the library operations. The other might be a CIO, a head of research computing, conceivably, a chief research officer. We are finding we’re getting more engaged with those folks as time goes on. And we really deliberately limit the attendance at CNI meetings to a very small number of reps per institution, two plus speakers, if any. We keep the meetings small, exactly to facilitate that sort of thing. Backing that up, we make a great effort to capture a lot of what goes on in the meetings on video, and make that available to our members in the broader community. So what we do gets a good deal of dissemination to all the people at our member institutions and also the broader community.
MU: Sounds like a good formula. Clearly it seems to be working.
You’ve mentioned this Carnegie Mellon collaboration, can we move on to some of the current issues that your members are facing? And of course, you’ll know all these topics. One of the big ones is open access. And you mentioned before that open access, with societies and their fragile economics, open access seems to some societies to sort of be like a dagger to the heart to the kind of operation that they’ve traditionally carried on. So, what’s your view looking at the position of open access in the U.S.? You’re familiar with the OSTP/Nelson memo, the move towards more immediate deposit and availability. Do you think the guidance is going to have the desired effect?
CL: This is a complicated area, as I think you correctly suggest there; it’s one in which the specifics are playing out very differently in different geographical regions. So the situation in Europe, in the UK, in Latin America, is very different. And as far as publications, journal articles and things like this, this is really not on the top of our list; a lot of other organizations are involved in this now, advocating for it, trying to sort out the economic models. And it’s not clear to me that we add an enormous amount of value there. We have certainly disseminated information about the policies and talked through some of the very real mechanical issues of trying to get in compliance with those. For example, this whole question of how federal funders actually keep track of whether the papers were indeed deposited where they were supposed to be is a real nuisance for a lot of institutions and investigators. And the more we can figure out ways to just make that run smoothly, the better, obviously, it is for everybody. I would say that there is quite a lot of momentum at this point behind the embargo-free federal deposit kind of models. And as long as the funders remain resolute on that, which ultimately is a political process that I’m not going to try and call the outcome of at this point, I think we will see a steady move towards more and more material being deposited embargo-free. I think it is important to note that at least in the U.S., I don’t think that repositories other than the repositories that are specifically identified by the federal agencies as depository repositories, if you will, are going to be playing a very big role in that. I think that the situation is very different in some other countries, but I think that for whatever reasons in the U.S., they’ve chosen not to emphasize institutional repositories as a key part in that.
I think that preprint repositories are kind of an underestimated factor here as well. I mean, I continue to be astounded by the reach and impact of arXiv at Cornell, for example; it’s just an essential sort of thing at this point. So those are a few thoughts on that.
Going back to the longer view of CNI and its role, you will find that 20 years ago when open access was a very novel idea in most quarters, we were paying a good deal more attention to that and getting our members up to speed. But a number of more specialized and more focused organizations have come in to do that. We’re always happy to relinquish the territory to those kinds of organizations. For data sets and research data management, I would say that we are still more engaged and we’re watching quite carefully the policy pronouncements coming out of OSTP and the federal agencies in that area as well as other developments, but even there, we are, I would say, certainly less heavily involved, particularly in the details, than we were ten years ago. And we’ve seen, you know, a number of organizations come in who have a very specific focus there.
MU: So perhaps you’re saying that in some respects, open access is kind of a battle won. But when you look at something like research integrity, which is almost like a sort of side effect of open access, that seems to be a growing and rather disturbing problem when you look at the number of retracted articles. What’s your take on that?
CL: I think we’ve clearly got several different problems going on here, and they are serious. I think there’s an adjacent set of issues around research security, which often are mentioned in the same breath where, you know, people speak of research integrity and security. But I think that they actually are rather different issues and it’s helpful to separate them out. Having said that, I can’t help but wonder at times, at least in my more paranoid moments, whether some of these eruptions of large-scale problems in research integrity don’t have some roots in attempts to more deliberately undermine the functioning of the research enterprise. You just have to wonder sometimes with things at this scale. I think we’ve got a major challenge right now around the research integrity issue, in that nobody is eager to take responsibility for it. The universities want the publishers to do it. The publishers want the universities to do it. The government isn’t sure what it wants: some days it wants to do it. Other days it wants the universities to do it. So I think one of the first prerequisites to really getting this cleaned up is going to have to be some consensus on locus of responsibility and roles. And I do believe that there are players, at least in the U.S. federal government, who are making some effort to get those conversations happening, which I think just can’t happen soon enough.
MU: That’s good news, because it’s always alarming when you read a succession of posts that all talk about it not being my problem.
CL: Yeah, and I read the same ones you do. It’s just ridiculous. I mean, I have seen some of these flow charts, you probably have too, of how a research misconduct investigation proceeds and the various handoffs between government, university and publisher. And the constraints on those handoffs. It’s unbelievable.
MU: But I think there are some obvious candidates for rapid changes. I think that the journal special issue seems to me a very, very suspect kind of concept, which played into the hands of many open access publishers because it was just so easy to implement. And when you read about thousands of special issues coming from one publisher in the space of one year, this is just getting ridiculous.
CL: Although, let me share the very connected dilemma that is very much on my mind, and I don’t think has fully surfaced yet, although we got a good foretaste of it in the pandemic. On the one hand, we have had a growing commitment for at least 25 years now to the use of preprints, which I think are a wonderful thing, in that they vastly accelerate the growth of knowledge, the speed of discovery. They really do, in my view, make a difference. Certainly, we saw in the pandemic, when there was a great, great need for rapid dissemination, rapid discovery, rapid breakthroughs, a huge reliance on preprint servers, even in disciplines and sub-disciplines and contexts where, before the pandemic, they were exceedingly reluctant to adopt preprints, in part because they were preprints in areas that affected the decisions of policymakers and of practitioners who were not in a good position to independently make judgments about an un-refereed manuscript. So here we are now. We’ve got this research integrity problem, but we’ve also got a huge commitment to preprints, which I don’t see reversing. And we have a whole set of issues around junk science and similar things, which are not exactly research integrity problems. They’re somewhat related to it, but not exactly the same. And we’re going to need to address it, as a society, and this is not something that the scholars can work out all by themselves. As a society, we’re going to need to come to some kind of understanding about this situation.
Editor’s Note: Thank you to Michael Upshall for conducting a series of podcast interviews for ATG the Podcast, and to Cliff Lynch for agreeing to speak with us. Check out ATG the Podcast for more!