SD Times September 2021

www.sdtimes.com

GitHub Copilot sparks debates

BY JENNA SARGENT

A few weeks ago GitHub released its Copilot solution, which uses AI to suggest code to developers. Developers can write a comment in their code and Copilot will automatically write the code it thinks is appropriate. It’s an impressive example of the power of AI, but it has many developers and members of the open-source community upset and worried about what it means for the future of open source.

One issue is that there have been many examples of the program copying an existing function verbatim, rather than using AI to create something new. For example, Armin Ronacher, director of engineering at Sentry and the creator of Flask, tweeted a GIF of himself using Copilot where it reproduced the famous fast inverse square root function from the video game Quake.

Leonora Tindall, a free software enthusiast and co-author of Programming Rust, reached out to GitHub asking if her GPL code was used in the training set, and the company’s support team responded, saying, “All public GitHub code was used in training. We don’t distinguish by license type.” When SD Times reached out to GitHub to confirm what code Copilot was trained on, the company declined to comment.

“I, like many others, have shared work on GitHub under the General Public License, which as you may know is a copyright-based license that allows anyone to use the shared code for whatever they want, so long as, 1) they give credit to the original author and 2) anything based on it is also shared, publicly, under the GPL. Microsoft (through GitHub) has fulfilled neither of these requirements,” Tindall said. “Their argument is that copying things is fair use as long as the thing you’re copying it into is a machine learning dataset, and subsequently a machine learning model. It’s clear that copying is happening, since people have been able to get Copilot to emit, verbatim, very novel and unique code (for instance, the Quake fast inverse square root function).”

AI-powered code authoring solution found to be copying GPL code verbatim

According to Tobie Langel, an open-source and web standards consultant, the GPL was largely created to prevent things like Copilot from happening. “I understand why people are upset that it’s legal to use, or considered legally acceptable of a risk of using, GPL content to train a model of that nature. I understand why this is upsetting. It’s upsetting because of the intent of what the GPL is about,” said Langel.

Ronacher believes that Copilot is largely in the clear based on current copyright laws, but that there’s an argument to be made that a lot of elements of copyright law ought to be revisited. Langel also feels that legally Copilot is fine based on the conversations he’s had with IP lawyers so far.

The issue of what’s copyrightable, and what isn’t, is complicated because different people have different opinions about it. “I think a lot of programmers think there’s a difference between taking one small function and using it without attribution or taking a whole file and using it without attribution. Even from a copyright perspective, there are differences about what’s the minimum level of creation that actually falls under copyright,” said Ronacher.

For example, a+b wouldn’t be copyrightable, but something more complicated and unique could be. Stripped of all its comments and reduced to its minimum, the fast inverse square root function in Quake is only two lines. “But it’s such a memorable and well-known function that it’s hard to argue that this is not copyrighted because it’s also very complex and into the creation of this a lot of thought went,” said Ronacher. There is a threshold on what is copyrightable versus what isn’t, but it’s hard for humans to determine where that line is, and even harder for a machine to do that, Ronacher said.

“I don’t think being upset about Copilot is synonymous with being for a hardline stance on copyright. A lot of us free software types are pretty anti-copyright, but since we have to play by those rules, we think the big companies should have to obey them also,” said Tindall.

Langel believes that Copilot won’t be the final breaking point for addressing some of the issues in open source, just another drop in the bucket. “I think these issues have been increasingly brought up, whether it be ICE using open-source software, there’s lots that

