OPEN DATA & GOVERNMENT
Isaac Hinman
The University of Edinburgh
MSc Design & Digital Media 2015
CONTENTS
An overview of the current state of open data policy
ABSTRACT
DEFINING OPEN DATA
BENEFITS OF OPEN DATA
VALUE CREATION
BENEFICIARIES
TRANSPARENCY
CURRENT STATE
PRIVATE VS. PUBLIC
GOVERNMENT OPEN DATA
FEEDBACK LOOPS
DOES IT WORK?
APPENDICES
THE AUTHOR
ISAAC HINMAN Isaac is a developer, primarily focused on front-end design and JavaScript technologies. This report documents his final project of an MSc in Design & Digital Media at the University of Edinburgh.
vis.space
github.com/isaachinman
isaachinman.com
THE PROCESS
This project was a combination of practical and academic work over the course of several months. The practical component consisted of a continual cycle of prototyping, researching, and designing. The academic component was undertaken after the practical component to allow for a real-world perspective.
STEP 1
PROTOTYPING From small prototype JavaScript functions to full-fledged AngularJS SPAs, planning out approaches and best coding practices was an extremely important step in all practical work undertaken.
STEP 2
RESEARCH As is the case any time a new technology is utilised, extensive research into different frameworks, libraries, APIs, and languages was necessary and continual.
STEP 3
DESIGNING Throughout the design process, a deliberate effort was made to maintain a clean and cohesive aesthetic, in both visual elements and coding practices.
STEP 4
ACADEMIA Considerable academic research into the fields of open data, big data, and open-data governmental policy was undertaken to ensure a full understanding of the field and academic climate.
STEP 5
DOCUMENTATION The documentation process included comments within source code, video walkthroughs of each web application, and this document itself.
ABSTRACT
This paper will approach the following questions: (1) what value does open data create, (2) who benefits from potential value creation, (3) how do governments use open data, and (4) does the use of open data by governments lead to transparency. Theoretical and academic reasoning will be complemented by first-hand experiences as an open-data developer, primarily working with NASA datasets (a live demo can be found at vis.space). The general conclusion made is that open data does create value in several different ways, but who actually benefits from that value creation is unclear. An overview of how governments use open data is given, and while there is still a crucial need for quantitative research, early indications seem to suggest that open data policies do indeed lead to governmental transparency.
COUNTRIES WITH OPEN DATA POLICIES
2013: 60
2014: 97
ESTIMATED VALUE OF PUBLIC SECTOR DATA
USA: $823 billion
Europe: €68 billion
OPEN DATA VALUE CREATION
Economic, Social, Political
CASE STUDY: Economic Value
In many cities across the US, people can dial 311, a non-emergency phone line, for general information – this line is frequently used to get bus times. In San Francisco, open data on transit saves the city more than a million dollars on 311 calls each year (a 21.7% reduction in all 311 calls).
“If governments did not gain a net benefit from collecting and using data then they would not collect it in the first place” - The World Bank
DEFINING OPEN DATA
OPEN DATA DEFINED
The entirety of this document presupposes the following definition of “open data”: open data is data (see the definition of Data in Appendix I) that can be freely used, modified, and shared by anyone for any purpose.1 There are several lists of “principles of open data”, the most notable of which is the Sebastopol List. It's not necessary to commit to any definition as specific as these for the discussion at hand, but it is important to at least be aware of such lists, as they indicate the current state of open data and the direction in which it is heading. The Sebastopol List gives eight principles (and seven additional principles), the conjunction of which forms a necessary and sufficient condition for what open government data should be (see below).
1 As defined by “The Open Definition”, an open source project that seeks to make precise the meaning of “open data” (can be found at opendefinition.org).
SEBASTOPOL LIST: CONDITIONS FOR OPEN DATA
1. Complete
2. Primary
3. Timely
4. Accessible
5. Machine processable
6. Non-discriminatory
7. Non-proprietary
8. License-free
9. Online and free
10. Permanent
11. Trusted
12. A presumption of openness
13. Documented
14. Safe to open
15. Designed with public input
The Sebastopol meeting was coordinated by Tim O’Reilly of O’Reilly Media and Carl Malamud of Public.Resource.Org, with sponsorship from the Sunlight Foundation, Google, and Yahoo.
BENEFITS OF OPEN DATA
I
"Opening a system typically requires a shift from mechanistic control to an evolutionary perspective"
A BRIEF INTRODUCTION:
The most basic premise behind the concept of open data is this: open data itself creates or generates more value than the selling of datasets. Of course, data has no inherent value; it only becomes valuable through use, and allowing unrestricted use of a dataset to every human on the planet seems to be the best way to maximise use and thus maximise value. It's a simple enough concept, akin to crowdsourcing in its “tapping into the intelligence of the collective public”.2
There is a somewhat different argument for open data that stems from the “right to information” civil movement, which proposes a public right of access to information from a human rights perspective. While it's an interesting debate in its own right, tangible and measurable economic, social, and political benefit is what will drive more governments to adopt open data policies, not lobbying for “right of access to information” to be considered a human right. Although I will not deal with this issue further, and we need not commit either way for the following discussion, I think that while “human right” may be a bit over the top, there is very good reason to think that each citizen deserves access to data produced by their government (within reason and within privacy concerns), as this data was produced via public funds.
2 Janssen, Marijn, Yannis Charalabidis, and Anneke Zuiderwijk. “Benefits, Adoption Barriers and Myths of Open Data and Open Government” Information Systems Management 29.4 (2012): 258-268.
VALUE CREATION:
There are several ways in which open data is thought to create value. Distinguishing between the types of value creation is important because various governments and public organisations adopt open data for different reasons, expecting different kinds of value creation. There are three main areas of open data value creation: economic, social, and political.
ECONOMIC VALUE CREATION:
Economic value creation from open data is perhaps the best empirically-backed of the three, and has therefore frequently been used to legitimise open data policies. The Australian government quoted Tasman's 2008 report on the value of spatial data; Denmark referred to a 2010 study by Gartner that estimated the publishing of Danish government data could create new services worth 600 million Danish kroner (~£57m); Spain and the US quoted the 2000 PIRA and 2006 MEPSIR studies that estimated the value of public sector information at 750 billion euros (~£531b) in the US and 68 billion euros (~£41b) in Europe; the UK has cited a 2008
Cambridge University study which estimated the value of government data in the UK at £6 billion, and so on. These studies are all quite outdated now. A more recent study in 2013 by the McKinsey Global Institute estimated open data could generate from $3 trillion (~£1.9t) to $5.4 trillion (~£3.5t) globally per year in additional value.3 So clearly open data has significant value – a conclusion that comes as no surprise, as private data is uncontroversially valuable.
Just how ubiquitous is data? The projected growth in global data generated per year is 40%; we are literally generating so much data that it is physically impossible to store all of it.4 How valuable is data? The short answer is “very”. The World Bank provides a concise and convincing argument for the value of data: “if governments did not gain a net benefit from collecting and using... data then they would not collect it in the first place”.5 While governments primarily seek economic value in data through efficiency gains, the private sector seeks economic value in data by a wide variety of means.
An interesting example that has provided multiple cities with significant economic savings is public transit. In many cities across the US, people can dial 311, a non-emergency phone line, for general information – this line is frequently used to get bus times. The city of Albuquerque launched an open data initiative in 2012 which included transit times, and in just one year it saved the city approximately $180,000 in calls to its 311 call centre.6 In San Francisco, open data on transit saves the city more than a million dollars on 311 calls each year (a 21.7% reduction in all 311 calls).7
The use of open data in automating administrative tasks is another common money-saver. Alameda County (in California) saves more than $565,000 annually via an app that automates invoicing.8 The app was born out of an apps challenge attended by the county's 9,000 employees.
SOCIAL VALUE CREATION:
The evidence for open data's economic impact is sparse, and evidence for its social and political impact is sparser still.9 Moreover, the results of the few studies that have been done are in some cases contradictory. In studying the relationship between government transparency (an alleged result of open data) and general public trust in government, some studies found that increased transparency increases trust (as people perceive more control over government) while others found that increased transparency decreases trust (as more governmental failures are revealed).10 Open data's relationship to governmental transparency will be covered in more detail in a subsequent section.
Nonetheless, open data has unquestionable social impact. Public datasets on healthcare provider performance help individual citizens make more informed choices about their health and who is best equipped to serve them.11 ESPC in Edinburgh uses open data (provided by the city) on school catchments to help individual citizens make more informed choices about real estate locations.12 San Francisco, New York, Los Angeles, Seattle, Chicago, and dozens of other US cities use the CrimeMapping platform to display real-time 911 calls, allowing citizens to see – and presumably avoid – crimes that are happening in their area (and in the long run reduce crime via a better-informed citizenry).13
3 Manyika, J., et al. “Open Data: Unlocking Innovation and Performance With Liquid Information”. McKinsey & Co Website. (2013).
4 Manyika, James, et al. “Big Data: The Next Frontier for Innovation, Competition, and Productivity” (2011).
5 Stott, Andrew. “Open Data for Economic Growth” The World Bank, Transport & ICT Global Practice. 25 June 2014.
6 Daly, Jimmy. “Albuquerque's Open Data Efforts Are Delivering ROI for the City” StateTech. 3 Dec 2013. Web. 31 July 2015.
7 Nath, Jay. “Can #opendata save gov $?” 23 June 2012, 10:30AM. Tweet.
8 “Alameda County Saves More Than $565 with Open Data” Socrata. Web. 1 Aug 2015.
9 Huijboom, Noor, and Tijs Van den Broek. “Open Data: An International Comparison of Strategies” European Journal of ePractice 12.1 (2011): 4-16.
10 Curtin, Deirdre, and Albert Meijer. “Does Transparency Strengthen Legitimacy? A Critical Analysis of European Union Policy Documents” Information Polity (2006). Rothstein, Bo. “Social Capital in the Social Democratic Welfare State” Politics & Society 29.2 (2001): 207-241.
11 Williams, Oscar. “Open Data Could Save the NHS Hundreds of Millions, Says Top UK Scientist” The Guardian. 29 May 2015. Web. 26 July 2015.
12 “Search by School Catchments” ESPC.com. ESPC (UK) Ltd. 15 May 2015. Web. 26 July 2015.
13 “About Us: Who We Are and What We Do” Web. 1 Aug 2015.
In my own work, I have created applications that provide social value (albeit in a less life-changing way). In my work with NASA APIs, I discovered that many are abandoned or ill-kept. This is because quite a few of them have been created at hackathons and other sorts of one-off events, with longevity not even a consideration (more on this in a subsequent section). One such abandoned API is the Predict the Sky API, which I became interested in immediately.14 The premise of the API is to combine “global weather data with a comprehensive library of space events”, allowing you to send a lat/lon location and receive back both the current cloud cover and a list of astral bodies overhead. It’s essentially a utility for amateur astronomers/hobbyists. Upon realising the API was deprecated and no longer functional, I built a new client-side application from scratch that delivers the same social utility (a more detailed account can be found in the technical appendix).
POLITICAL VALUE CREATION:
The political impact of open data is strongly tied to its social impact, and as explained in the previous section, the effect of increased government transparency brought about by open data is unclear. As with the other two value creation areas (economic and social), the proposed value creation by open data in the political sphere has to do with efficiency. Lack of competition is frequently cited as a reason for lower productivity in the public sector (as compared to the private sector). The opening of government data not only allows greater scrutiny by the public of the efficiency of the public sector, it can also drive internal competition via comparison engines that allow measurement and ranking of departments.15
Beyond a straightforward increase in efficiency, open data necessarily seems to weaken government via transparency. In a 2012 study published in Information Systems Management about open data and open government, Marijn Janssen (et al) write that governments “have to accept that they inevitably give up some level of control when opening their data to the public... [they] should expect or actively solicit feedback and be able to make sense of this feedback... Opening a system typically requires a shift from mechanistic control to an evolutionary perspective”.16 This is where the “crowdsourcing” of open data comes in; if governments merely dump their data with no intention of opening a dialogue, the political system gains nothing. Feedback loops (as per system theory) become incredibly important for political change by way of open data to take place (more on this in a subsequent section).
Whether the political impact of open data is actually net positive is still unclear. A 2014 study by Martin Lodge of the Department of Government & Centre for Analysis of Risk and Regulation, London School of Economics found that the UK's recent “Red Tape Challenge”, an initiative meant to facilitate a crowdsourced rework of regulatory legislation, actually had little effect. The resulting changes in legislation from the programme came almost entirely from within government, and the procedure – which was meant to be “cost-lite” by freely crowdsourcing information – actually turned out to be quite high in cost, due to expenditures like running and monitoring the website, analysing comments, etc.17 Similarly, Obama's open data initiative has had little effect on the actual inner workings of most American government agencies (more on this in a subsequent section).
VALUE CREATION OF MY WORK:
If the use of open data creates value, did the practical work I undertook create any value? I believe the web applications I built did create value, insofar as a prototype application can. As far as I know, the StarFinder app I built is the only app of its kind. There are no other utilities for searching individual stars and receiving key parameters back, let alone a utility that accurately animates the star with correct colour. Again, as far as I know, the Earth/Landsat app is the only app of its kind – no other 3D Earth populates with tiles that have all been taken within the past two weeks.
NASA's Earth API (which I used to develop the Earth/Landsat app) is incredibly simple: you request a latitude/longitude, and you get back a URL of an image of that location. There are only a few other parameters (date of image, dimensions of image, and cloud cover probability). The images are all provided by the Landsat 8 satellite, which is in an orbital pattern that causes it to cross over (almost) every point on Earth roughly once every 16 days – meaning these images are by far the most current satellite images available to the public. One suggested use by NASA is the monitoring of deforestation. A recent industry report estimated the value of the Landsat data at $2.19 billion annually, yet there wasn't (to my knowledge) a single web app that utilised the Earth API.18 The basic idea behind the application was to populate a 3D Earth with Landsat images dynamically, essentially recreating a Google Earth with extremely recent imagery. Feedback on the Landsat app from NASA has been largely positive, with one official saying it “suggests some very powerful applications” – so the potential for economic value has been created as well. More information on all practical work undertaken can be found in Appendix II.
PROPOSED BENEFICIARIES:
While it's not completely straightforward who actually benefits from open data, three distinct groups of individuals/organisations are thought to benefit: governments, citizens, and the private sector.19 Governments are meant to benefit through improved overall efficiency, more personalised public services, increased interaction between government and the public, and greater accountability through transparency.
14 predictthesky.org/developers.html
15 Manyika, James, et al. “Big Data: The Next Frontier for Innovation, Competition, and Productivity” McKinsey & Co Website. (2011).
16 Janssen, 266.
17 Lodge, Martin, and Kai Wegrich. “Crowdsourcing and Regulatory Reviews: A New Way of Challenging Red Tape in British Government?” Regulation & Governance 9.1 (2015): 30-46.
18 “The Value Proposition for Landsat Applications – 2014 Update” National Geospatial Advisory Committee – Landsat Advisory Group. 2014. Web. 1 Aug 2015.
19 Ubaldi, Barbara. “Open Government Data” OECD Working Papers on Public Governance. 27 May 2013.
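As a concrete illustration of the Earth API request described under the value creation of my own work (a latitude/longitude goes in, an image URL comes back), the sketch below only builds the request URL and makes no network call. It is written in Python for brevity rather than the JavaScript used in the project. The endpoint path and the query parameter names (`lat`, `lon`, `date`, `dim`, `cloud_score`, `api_key`) are assumptions that should be checked against NASA's current API documentation, and the helper name `build_earth_imagery_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against NASA's current API documentation.
EARTH_IMAGERY_ENDPOINT = "https://api.nasa.gov/planetary/earth/imagery"


def build_earth_imagery_url(lat, lon, date=None, dim=None,
                            cloud_score=False, api_key="DEMO_KEY"):
    """Build a request URL for one Landsat image (hypothetical helper).

    lat, lon     -- location of interest, in decimal degrees
    date         -- optional 'YYYY-MM-DD' string for the image date
    dim          -- optional image width/height, in degrees
    cloud_score  -- if True, also ask for the cloud cover estimate
    """
    params = {"lat": lat, "lon": lon}
    if date is not None:
        params["date"] = date
    if dim is not None:
        params["dim"] = dim
    if cloud_score:
        params["cloud_score"] = "True"
    params["api_key"] = api_key
    return f"{EARTH_IMAGERY_ENDPOINT}?{urlencode(params)}"


# Example: a tile over Edinburgh on an illustrative date.
url = build_earth_imagery_url(55.95, -3.19, date="2015-07-01")
```

Populating a 3D globe would then amount to repeating such a request over a grid of coordinates, which is why request limits (discussed later) matter for this kind of application.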
Citizens' lives are meant to be improved through intelligent use of data, saving time and money, and allowing individuals to make more informed personal choices. The private sector is meant to benefit by pursuing commercial exploitation of open government data, and the profit incentive present will supposedly drive innovation.
ACTUAL BENEFICIARIES:
In theory, it sounds like nearly everyone should benefit from open data, and the initial intuitive feeling is that open data is an equaliser that will level the playing field and promote egalitarianism. But there are undeniable barriers to entry when it comes to the actual use of datasets, and for open data to actually have an egalitarian effect, governments need to take proactive steps to mitigate barriers to use as much as possible, else the already empowered will only become more empowered. Mere “data dumps” are actually irresponsible in many cases.
BARRIERS TO USE AND MARGINALISED POPULATIONS:
A running theme of lack of empirical evidence is undoubtedly forming, and however little research there is on the impact of open data, there is even less on its impact across socioeconomic groups. There are good reasons to believe, though, that in many cases open data most benefits the individuals already quite high on the socio-economic ladder, and there are individual cases that make good examples. The primary reason for a lack of “even spread” benefit across the socio-economic spectrum is a significant barrier to entry in the practical use of data. (I don't mean here barriers to entry in adopting open data governmental policies; these are discussed in a subsequent section.) Open data is open in theory, but rarely in practice. My own work with open data was only made possible by knowledge of various web languages and technologies. In the current system, the private sector (or individual developers) acts as an intermediary between open datasets and users. As Michael Gurstein wrote in his 2011 paper “Open Data: Empowering the Empowered or Effective Data Use for Everyone”, exciting new outcomes from open data are “only available to those who are already reasonably well provided for technologically and with other resources”.20 Gurstein also gives a good concrete example of open data use “against the poor” (as Tim O'Reilly tweeted) in Bangalore, where digitised land records were primarily used by middle/upper income individuals and corporations to gain ownership of land from marginalised and poor populations, who had little-to-no access to the records, and even less ability to make use of them.21
An interesting example of leveraging technology to specifically benefit marginalised populations can be found in Question Box. The project was born out of the premise that 775 million people in the world are illiterate, and less than half of the global population has access to the internet.22 Clearly this means that more than half of the world cannot benefit from open data (or any information available on the internet), at least directly. Question Box is a callbox network (in India and Uganda) of public boxes.23 Each box has a green and red button, and a speaker/microphone. Users can simply press the green button, which connects the user (via a cellphone inside the box) to a dispatcher sitting in an office at a computer. The dispatcher can then verbally answer any questions the user may have by using online resources and search engines. When the user has acquired the desired information, they press the red button to end the interaction.
David Robinson (et al), of the Information Society Project at Yale Law, makes the argument that the current model of governments providing interactive websites as “data portals” is misguided and counter-productive – governments should focus on “providing reusable data, rather than providing Web sites, as the core of its online publishing responsibility” and in fact the private sector, “either nonprofit or commercial” is “better suited to deliver government information to citizens”.24 The private sector is more capable of the constant reworking that modern web applications require in the extremely fast-paced contemporary web development environment. A good example of such a private sector provider is Socrata, which provides a software-as-a-service (SaaS) open data portal (and API engine) to governments and municipalities. The primary benefit of private sector SaaS providers is competition; Socrata received $25 million in venture capital funding in 2008 and 2013, and now has a direct competitor, OpenDataSoft (based in France).25
TRANSPARENCY:
Is governmental transparency good in its own right? Should open data policies be pursued simply for transparency, with any additional economic, social, or political value a mere bonus? Barack Obama seems to think so. On his first day in office, he announced a transparency strategy to increase openness in the US government: “We will work together to ensure the public trust and establish a system of transparency, public participation, and collaboration”.26 Despite this, in the 2014 fiscal year (with similar figures in three previous years), the US government spent nearly four times as much on protecting its data as it did on sharing it ($562m on IT security vs $148m on information sharing).27 Whether or not open data policy actually leads to open government will be discussed in a subsequent section (4, III) – here I'd like to discuss the concept of governmental transparency and open government on its own.
20 Gurstein, Michael. “Open Data: Empowering the Empowered or Effective Data Use for Everyone?” First Monday 16.2 (2011). 21 Gurstein, 2. Solomon, Benjamin, R. Bhuvaneswari, and P. Rajan. “Bhoomi: 'E-Governance', or, an Anti-Politics Machine Necessary to Globalise Bangalore?” CASUM-m Working Paper. (2014) 22 “Internet Users” Internet Live Stats. Web. 27 July 2015. “Statistics on Literacy” United Nations Educational, Scientific and Cultural Organisation. Web. 27 July 2015. 23 “Overview” Question Box. Web. 27 July 2015.
24 Robinson, David, et al. “Government Data and The Invisible Hand” Yale Journal of Law & Technology 11 (2009): 160. 25 Stott, 13. 26 Obama, Barack. “Transparency and Open Government” The White House. Web. 27 July 2015. 27 Usaspending.gov, results for FY2014 “Information Technology Security” and FY2014 “Information Sharing”
WHAT IS OPEN/TRANSPARENT GOVERNMENT?:
As Harlan Yu notes in his 2012 paper “The New Ambiguity of 'Open Government'”, the term “open government” used to have a somewhat negative connotation, and referred specifically to the “politically sensitive disclosures of government information”.28 The term was first used in the 1950s, and its meaning has changed significantly, even in the three years since the writing of Yu's paper.
Instead of seeking definition via academia, we can find more accurate definition in real-world policy and governmental agreements – the most important of which is the Open Government Partnership (OGP). The OGP was launched in 2011 by eight countries, and has since grown to 66 participating countries around the world, making it by far the largest open government initiative ever seen.29 The actual agreement is rather idealistic and vague, but it does provide a baseline for what “open government” means and entails, and the sheer number of participating countries shows an enormous global interest. The most integral components of open government are public access to government data and civic participation. The OGP calls for an increase in the availability of information about government activities, the support of civic participation, professional integrity, and increased access to new technologies.30
IS OPEN GOVERNANCE GOOD?:
Interestingly, instead of beginning with an explanation of why open governance should be sought after, the OGP opens with this acknowledgement: “We acknowledge that people all around the world are demanding more openness in government... greater civic participation in public affairs, and seeking ways to make their governments more transparent, responsive, and effective”. Maybe the issue isn't whether open government is inherently good, but merely that populations around the globe are demanding it, and it is their governments' responsibility to deliver. The mention of increased responsiveness, effectiveness, innovation, and progress in the OGP seems to imply some kind of objective and measurable intrinsic value in open government, though.
CAUSAL DIRECTION:
Is open government data a result of transparency, or the other way around? Transparency is the more fundamental of the two concepts, and the fact that open data policy results in transparency is specifically intentional; open data is a vehicle for transparency. Moreover, open government data does not necessarily entail or result in governmental transparency (more on this in a subsequent section).
28 Yu, Harlan and David G. Robinson. “The New Ambiguity of 'Open Government'” UCLA Law Review Discourse. (2012).
29 “Participating Countries” The Open Government Partnership. Web. 29 July 2015.
30 “Open Government Declaration” Open Government Partnership. Web. 28 July 2015.
CURRENT STATE OF OPEN DATA
II
"Datasets themselves are political objects, and policies to open up datasets are the product of politics"
EMPIRICAL EVIDENCE & RESEARCH:
As made clear in previous sections, there is a significant lack of empirical evidence for the benefit (economic, social, political, or otherwise) of open data. There are, however, quite a few well-backed estimates of the value of open data both globally and in various sectors, most notably the 2013 McKinsey report and the 2014 World Bank report.
WIDESPREAD BIAS:
Janssen (et al) write that “a conceptually simplistic view is often adopted with regard to open data, which automatically correlates the publicizing of data with use and benefits”.31 Moreover, the US and the UK are the only countries to have formally evaluated their open data policies, but only insofar as evaluating how well the policies were executed in comparison to legislation, not actual efficacy or impact itself.32 I have found that outside the small academic circle of individuals researching open data in a scrutinising way, nearly all other stakeholders (individual citizens, politicians, policy-makers) have a pro-open data bias.
GLOBAL ADOPTION:
Despite a lack of empirical data and research, 97 countries currently have governmental open data policies in some form or another (up from 60 in 2013, an increase of roughly 62%).33 These 97 countries collectively host nearly 400 open data portals. The rapid proliferation of governmental open data and the multi-national nature of the data and portals have led to a current situation wherein there are no universal metrics for tracking and comparing data use – the exact things needed for quantitative research.
31 Janssen, 258.
32 Huijboom, 9.
33 “Tracking the State of Government Open Data” Global Open Data Index. Web. 28 July 2015.
IS OPEN DATA APOLITICAL?:
Is open data apolitical, or are its components and proponents inherently political? This area of the open data discussion is a bit more abstract, and perhaps less practically useful than other questions, but nonetheless important in a broader consideration. It is important to remember that datasets are not mind-independent, impartial things. The parameters of every dataset were at some point set by a human mind, and any time a human mind creates anything it leaves behind traces of its bias. For example, in the international classification of
disease, tropical diseases have been historically under-represented, thus reducing the visibility of tropical disease mortality and affecting (even if in a minor way) the direction of global health policy.34 Classification spawns from preference. Tim Davies, a researcher at the University of Southamption, gives a good account of this phenomenon: “datasets themselves are political objects, and policies to open up datasets are the product of politics... The practical and political decisions that went into constructing a dataset do not disappear when that dataset is opened, but are instead carried with it”.35 Dr. Nishant Shah, co-founder and DirectorResearch at the Centre for Internet and Society, claims that open data initiatives are symptomatic of the “politics of the benign”, wherein certain stakeholders “seek to neuter the radical nature of demands made by the Openness movements while retaining the vocabulary of political change”.36 That is to say that the open data movements seek to maintain an image of political benignness and commonsense, ignoring what the data actually consist of, who they benefit most, and how they can actually be used to make society more democratic and open.37 In reality, open data initiatives are not entirely neutral, benign, or commonsensical – they are underpinned by existing political and economic ideology. P R I VAT E V S P U B L I C O P E N D ATA : The private sector opened datasets long before governments, and governments have only become involved in open data via models set forward by the private sector. Competition has driven many products and services to be free, and to be expected to be free by the consumer. This has resulted in a private sector environment where many services must be offered for free, 34 Bowker, Geoffrey and Susan Leigh Star. “Sorting Things Out: Classification and Its Consequences” MIT Press. (2000).
else they will become completely irrelevant and unused. So, the private sector has led the public. Jo Bates, of Manchester Metropolitan University, gives the specific example of the UK, where the open data movement in government had little traction until private sector businesses started to actively campaign for its adoption.38

MOTIVATION: Private sector data is almost entirely concerned with selling a product or service, while public data is not. Despite this fundamental difference, the use of open data can benefit both sectors. Stuart Coleman, commercial director of the Open Data Institute, says that in addition to governmental open data, “we're also starting to see businesses look to release some of their data openly” and that while it may not be right for all businesses, “there are increasingly more use cases”.39

PRIVATE OPEN DATA, CLOSED DATA, AND GREY AREAS: Unlike governmental open data, which is intentionally held to specific standards meant to protect its “openness” and integrity, the private sector is free to do as it pleases, and often the line between “open” and “closed” data is quite blurry. In the private sector, companies frequently benefit from individual APIs, and an excellent example of blurred openness can be found in API request limits. The Google Maps API, one of the most used APIs in the world, has a request limit of 100k requests per 24-hour period. If someone exceeds this limit, they will receive 429 (Too Many Requests) errors. Any use of the API within this limit is entirely free and open; however, if an individual or organisation wants a higher limit, they must pay. Nearly all APIs work in this way – free to individuals or small organisations, but at a cost for large-scale commercial use.
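The free-tier behaviour described above can be illustrated with a minimal sketch of a fixed-window request quota. This is not how Google meters its APIs: the function name `createQuota`, the fixed-window strategy, and the tiny limit used in the example are assumptions made purely for illustration.

```javascript
// Fixed-window request quota: a sketch of how a free tier might be metered.
// createQuota is a hypothetical helper, not part of any real API client.
function createQuota(limit, windowMs) {
  let windowStart = null; // timestamp at which the current window opened
  let used = 0;           // requests counted in the current window

  // Returns the HTTP status a caller would see for a request made at nowMs.
  return function statusFor(nowMs) {
    if (windowStart === null || nowMs - windowStart >= windowMs) {
      windowStart = nowMs; // open a fresh window
      used = 0;
    }
    if (used >= limit) return 429; // Too Many Requests
    used += 1;
    return 200;
  };
}

// Illustrative limit of 3 requests per 24 hours.
const statusFor = createQuota(3, 24 * 60 * 60 * 1000);
console.log([0, 1, 2, 3].map((t) => statusFor(t))); // [ 200, 200, 200, 429 ]
```

Everything below the limit is served as if the data were fully open; only the request that crosses the threshold reveals that access is conditional.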
35 Davies, Tim. “The Messy Reality of Open Data and Politics” The Guardian. 8 April 2013. Web. 1 Aug 2015.
36 Shah, Nishant. “Big Data, People's Lives, and the Importance of Openness” DML Central. 24 June 2013. Web. 1 Aug 2015.
37 Kitchin, Rob. “Four Critiques of Open Data Initiatives” The London School of Economics and Political Science. 27 Nov 2013. Web. 1 Aug 2015.
38 Bates, Jo. “Co-Optation and Contestation in the Shaping of the UK's Open Government Data Initiative” The Journal of Community Informatics Vol 2, No 2 (2012). 39 Say, Mark. “Can Open Data Work for the Private Sector?” CIO UK. 20 Dec 2013. Web. 1 Aug 2015.
At the other end of the spectrum, some private sector online businesses have gone in the exact opposite direction, and put their content behind a paywall. Notable examples of this in online journalism are the Wall Street Journal and The Times (UK). In a modern culture where online content is expected to be free at all levels (with revenue in most cases coming by way of advertising), the implementation of a paywall is a risky move.
When The New York Times introduced a paywall in 2011, most readers planned not to pay, and ultimately did not.40 However, enough readers did pay to keep the business viable, and The New York Times now generates 52% of its revenue directly from readers, most of it coming by way of the paywall.41 Despite this, The New York Times has reported an annual increase in sales of 1% or less for the past three years. In a 2012 study for Cyberpsychology, Behavior, and Social Networking, Jonathan Cook and Shahzeen Attari reported that despite predominantly offering content for free, providers on the internet “increasingly charge for access”, and that their users' response is typically one of “strong psychological reactance, particularly those in the inequity cluster”.42 In the case of The New York Times, users generally did not purchase a digital subscription when the paywall was implemented, and “decreased their visits, devalued the NYT, and frequently planned to exploit loopholes to bypass the paywall or switch providers altogether”.43

The Postcode Address File (PAF) in the UK has not been released openly despite numerous calls from citizens and MPs. The dataset is monetised: under the Postal Services Act of 2000, the Royal Mail has “stewardship for the time being” of the PAF, effectively giving it a monopoly on this sort of data. PAF gives access to 1.8 million UK postcodes and over 29 million residential and business addresses, which are constantly updated and verified.44 The Royal Mail is a strange example, as it's somewhere in between public and private, and the PAF exemplifies the privatisation of previously public sectors, and the selling off of data in the process.45

40 Cook, Jonathan, and Shahzeen Z. Attari. “Paying for What Was Free: Lessons from the New York Times Paywall” Cyberpsychology, Behavior, and Social Networking Vol 15, No 12 (2012).
41 Doctor, Ken. “The New York Times and Its Big 'Zero'” CNN Money. 6 Feb 2015. Web. 1 Aug 2015.
42 Cook, 685.
43 Cook, 686.
44 “Postcode Address File” Royal Mail. Web. 31 July 2015.
45 Arthur, Charles. “MPs and Open-Data Advocates Slam Postcode Selloff” The Guardian. 17 Mar 2014. Web. 1 Aug 2015.
GOVERNMENTAL OPEN DATA
III
"A government can provide open data on politically neutral topics even as it remains deeply opaque and unaccountable"
THEORY VS PRACTICE, BARRIERS AND DRIVERS: An interesting observation about the barriers and drivers of open data policy adoption by government is that while the drivers lie predominantly outside government, the barriers lie almost exclusively within. The story of open data policy adoption is a story of governmental inertia and resistance to change. There are some genuine concerns (e.g. privacy and limited networks), but the majority of the current barriers are structural in nature. It's worth noting that governmental open data is of course readable, but is very rarely writable.

BARRIERS: In 2011, Noor Huijboom and Tijs Van den Broek published a study for the Dutch Ministry of the Interior and Kingdom Relations, TNO, entitled “Open Data: An International Comparison of Strategies”, which gives an unbiased and comprehensive look at governmental open data strategies. Their account of barriers to open data adoption is particularly lucid. Of the barriers they list, these are the most important: closed government culture, privacy legislation, limited user-friendliness, lack of standardisation, security threats, existing charging models (the income of some government organisations is based on the selling of datasets), uncertain economic impact, and network overload.46 Huijboom and Van den Broek polled policy-makers and experts; these barriers come directly from surveys and are listed in descending order, with “closed government culture” being the most-cited barrier.

DRIVERS: While the proposed economic, social, and political benefits (outlined in previous sections) provide a reasonable backing for open data policies, in reality hype and self-perpetuation may be just as important. After all, the quick propagation of open data policies was largely due to examples set by the US and the UK – some policy-makers have even stated that their respective countries have “a track record of being an advanced information society”, and that they “wanted to maintain that image”.47

FEEDBACK LOOPS: Open government data without the possibility of collaboration and feedback from the public would merely serve the public sector's exploitation of datasets, and would do nothing to better the lives of the citizenry meant to be empowered. Feedback loops are crucial in creating a dialogue between a government and its citizens.

46 Huijboom, 7.
47 Huijboom, 8.
The basic premise is that open data itself creates or generates more value than the selling of datasets
Img6
MY INTERACTION WITH NASA: In my practical work, I made frequent use of NASA's open data APIs, which deliver data ranging from star luminosity, to the weather on Mars, to URLs of satellite images. The NASA data portal is a collection of a few dozen APIs, most of which aren't actually developed or maintained by NASA itself, but do utilise NASA's data. The data portal, run by a recently created Open Innovation team, is NASA's response to Obama's transparency mandate. Surprisingly, there is only a very small group of web developers actively using NASA's APIs, and I believe that some of the services I made use of had never actually been put to practical use. In this way, I became something of a beta tester, and developed a dialogue with two NASA officials, reporting dead and deprecated APIs and server errors, suggesting potential use cases, and developing prototypes, in many cases helping to improve the services provided. Besides email interaction with NASA, I also engaged in bug reporting for Cesium, LightSlider, and various APIs through Google Forums and GitHub.

GITHUB AS A PLATFORM FOR DIALOGUE: GitHub is a social coding site that “uses Git as its distributed revision control and source code management system”.48 Basically, developers can contribute to “repositories” by using Git (typically through the command line or a GUI). These repositories can be private, but most are public, and anyone with a GitHub account can raise issues, push commits for approval, etc. At the time of writing, GitHub has a community of over 10 million people contributing to over 25 million projects – the largest community of its kind in the world. Empirical academic studies have shown that the mechanisms within GitHub and its particular structure and workflow contribute to improving collaboration among developers, resulting in a small value of the average shortest path (of repositories), indicating high productivity.49 I believe that GitHub is an excellent collaborative tool, partially because of the way the actual site is set up, and partially because its foundation in Git for version control means it provides incredible utility. I believe it will be integral in future collaboration between government agencies and citizens (developers in this case), and the process has already begun. In 2009, the New York Senate was the first government organisation to upload code to GitHub.50 Now approximately 44 countries use GitHub to varying degrees, with the US, UK, Canada, Brazil, and Australia having the most repositories by far, and there are over 10,000 active government users.51 52

HACKATHONS AND THEIR BYPRODUCTS: Hackathons have become increasingly popular in recent years, and are attractive to participants and benefactors alike in their ability to facilitate the creation of new code, software, hardware, etc, in a very short space of time – typically ranging from a day to a week or so.53 NASA hosts its own two-day hackathon called "Space Apps", which is responsible for many of the NASA Data Portal APIs.54 While hackathons are a great way to generate new technologies rapidly, the technologies that do come out of them are, in most cases, intrinsically tied to poor support and early death. In speaking with NASA employees and past participants of the Space Apps hackathon, it seems that most individuals involved are aware of this.

48 Thung, Ferdian, et al. “Network Structure of Social Coding in GitHub” Software Maintenance and Reengineering (CSMR), 2013 17th European Conference on. IEEE, 2013.
49 Thung, 326.
50 Schacon. “NY State Senate Code on GitHub” GitHub Blog. 20 June 2009. Web. 1 Aug 2015.
51 “Who's Using GitHub?” GitHub. Web. 1 Aug 2015.
52 Balter, Ben. “Government Opens Up: 10k Active Government Users on GitHub” GitHub. 14 Aug 2014. Web. 1 Aug 2015.
53 Leckart, Steven. “The Hackathon is On: Pitching and Programming the Next Killer App” Wired. 17 Feb 2012. Web. 2 Aug 2015.
54 2015.spaceappschallenge.org
DOES OPEN DATA ACTUALLY LEAD TO OPEN GOVERNMENT?: As Harlan Yu notes, a government can “provide open data on politically neutral topics even as it remains deeply opaque and unaccountable”.55 For example, Budapest and Szeged both provide open data on transit schedules (allowing Google Maps to provide routes to users). This transit data is both open and governmental, but has no impact on the actual transparency of the Hungarian government.56

In the US, most federal agencies have taken a “passive-aggressive attitude” toward Obama's open data programme, and only a tiny group of agencies are proactively involved in data.gov (which was launched in 2009 at the behest of former Chief Information Officer of the US, Vivek Kundra). Together, the USGS and the Census Bureau account for 92.43% of uploads (both data and applications) to data.gov.57 NASA, NOAA, and the EPA are other notable contributors. After these five agencies, the remaining 164 agencies jointly make up only 0.63% of all uploads to data.gov. The data on US open data adoption show that most agencies have effectively ignored Obama's programme, as it directly conflicts with existing interagency collaboration practices.58 This, though, is just an example of structural barriers to open data within government.

What happens when a government does release open data that is nontrivial and in some cases political? John Bertot, Paul Jaeger, and Justin Grimes, all of the University of Maryland, published a comprehensive and exemplary report in 2010 on the use of ICTs to promote transparency. They found that the term “transparency” is used “with great liberty” but “little evaluation criteria”; however, they also found that open data policy can be transformative in general, and particularly in regard to transparency and anticorruption.59 They maintain a cautious outlook, and explicitly state that “the extent to which ICTs can create a culture of transparency and openness is unclear”, but that “initial indications are that ICTs can in fact create an atmosphere of openness that identifies and stems corrupt behavior”.60 Bertot et al. note that the governmental transparency movement is a rare alignment of policy, technology, practice, and citizen demand – it's almost universally supported.61 However, technology access and literacy may be the most pressing barrier to a positive effect on transparency – in the US, for example, over 35% of households still do not have access to the internet.62

Even though the movement is still in its infancy, small examples of effect can be found. Ethan Zuckerman, senior researcher at the Berkman Center for Internet and Society, gives the example of Ghana – one of many African countries whose citizens are demanding the opening of data. Zuckerman says that in Ghana, there is “an extremely engaged citizenry, and a sceptical and technically competent press starting to demand certain critical pieces of information”.63 Echoed in Zuckerman's example is the integral requirement of technical competence.

CONCLUSIONS: So open data policies can indeed create value in at least three distinct ways: (1) economic, (2) social, and (3) political. However, barriers to use may cause the benefits of open data to be skewed towards those already empowered if counter-measures are not taken. The adoption rates of open data policies by governments across the globe are increasing exponentially, and now nearly half the countries in the world utilise open data in some way or another. Understandably for a phenomenon less than a decade old, the quantitative data available is not strong enough to support firm conclusions, but early indicators seem to show that open data policies do generally increase governmental transparency.

55 Yu, 181.
56 “List of Publicly-Accessible Transit Data Feeds” GoogleTransitDataFeed. 25 July 2015. Web. 29 July 2015.
57 Peled, Alon. “When Transparency and Collaboration Collide: The USA Open Data Program” Journal of the American Society for Information Science and Technology 62.11 (2011): 2085-2094.
58 Peled, 2085.
59 Bertot, John, Paul Jaeger, and Justin Grimes. “Using ICTs To Create a Culture of Transparency: E-Government and Social Media as Openness and Anti-Corruption Tools for Societies” Government Information Quarterly 27.3 (2010): 264-271.
60 Bertot, 269.
61 Bertot, 268.
62 File, Thom, and Camille Ryan. “Computer and Internet Use In the United States: 2013” United States Census Bureau. Nov 2014.
63 Hogge, Becky. “Open Data Study” Transparency and Accountability Initiative (2010).
APPENDICES
IV
APPENDIX I: GLOSSARY OF TERMS
BIG DATA: Datasets (see definition for Data) whose size is beyond the ability of typical database software tools to capture, store, manage, and analyse. The threshold size of big data can vary by sector, but typically it ranges from a few dozen terabytes to multiple petabytes.

DATA: The lowest level abstraction from which information, and then knowledge, can be derived.1

HACKATHONS: A gathering wherein programmers collaboratively code over a short period of time (generally no longer than a week). Hackathons are almost exclusively goal-oriented, with teams aiming to produce a functional product, service, API, program, etc, in the time allotted.

ICT: Information and communications technology – in this case specifically the use of certain technologies to open up datasets by facilitating access. ICT is concerned with the storage, retrieval, manipulation, transmission, or receipt of digital data.

OPEN GOVERNMENT DATA: Data (see definition for Data) which is:
1. Governmental: produced or commissioned by public bodies
2. Open (see definition for Open Data)
These two conditions in conjunction form a necessary and sufficient requirement for Open Government Data.

OPEN SOURCE COMMUNITY: A broad, internet-based community of mostly programmers and web developers (in both the public and private sector) who contribute to various open source projects through more specific communities like GitHub, StackOverflow, and Reddit.
1 Ubaldi, Barbara. “Open Government Data” OECD Working Papers on Public Governance. 27 May 2013.
Data is the lowest level abstraction from which information, and then knowledge, can be derived
Img7
APPENDIX II: ACCOUNT OF TECHNICAL WORK
LIVE LINK: vis.space

SPACEVIS: This is the “container” single-page app (SPA) that holds all of the individual web apps. I used AngularJs for its routing, templating, and controllers, and Bootstrap for a basic layout and grid system. Angular isn’t being used here to its full potential – it's mostly being used for templating and running JavaScript through controllers. Essentially, when “changing” pages, the only thing that actually changes is the HTML markup inside a div. There is no actual page reloading, just different bits of HTML being injected on request (these bits are kept in separate HTML files for ease of use). This means navigating through different pages and content is extremely fast, and URLs are actually “virtual URLs”. Similarly, when navigating to a different virtual URL, the appropriate controller is called on load and runs a predetermined bit of JavaScript.

STARFINDER: This application was the first I undertook. One of NASA's largest APIs is/was the Star API, which is actually based on a dataset belonging to the American Museum of Natural History (AMNH), and contains 15 variables on over 100k stars (mostly Hipparcos). The idea was to build a web application that could search and retrieve information on any of the stars, and display this information in a useful way. Practically, the application has just two different (rather large) functions:

1. Database Function: This happens just once on page load, and fetches the label parameter for all the stars. These labels are then put into an array, which is then scanned by a jQuery UI autocomplete search input for live autocomplete. As a default, the sun is loaded on page load via a search function call. This is a prototype application, after all, and the initial database XMLHttpRequest could be modified to allow for searching across any of the 15 variables, as well as filtering, etc.

2. Search Function: This happens every time the search button is pressed. The search string (in this case a star's label parameter) is added to a request URL and sent. When the response is returned, a number of things are checked. If the star is the sun, some sun-specific actions are performed (applying an image-based texture, etc). If the star is not the sun, the function will run the normal procedure: create a sphere, add a mesh, parse the star's colour from JSON, interpolate it in HSL space (across a spectrum with predetermined colour stops), convert the colour back to hex, calculate the star's approximate temperature from its B-V colour value, etc.

I was pleased to be able to successfully calculate star temperatures from B-V colour values, but from early on, I also wanted to be able to calculate radii. It is possible to calculate a star's radius if you know the luminosity and temperature:
$$L = 4\pi R^2 \sigma T^4$$

Where L is the luminosity in watts, R is the radius in metres, σ is the Stefan-Boltzmann constant, and T is the star's surface temperature in kelvin. Rearranged for the radius, this gives $R = \sqrt{L / (4\pi\sigma T^4)}$. Obviously I had the temperature, and the API delivers a luminosity parameter, so I should have been able to find radii. However, when I dug into the luminosity parameter, I found something quite strange. In the Digital Universe software used to generate the API's data, luminosity can be expressed relative to a constant, e.g. I can set lum to the sun, and all other stars will have a lum variable as a proportion of the sun's luminosity. Strangely, the luminosity variable wasn't set to the sun – the sun's lum parameter was 0.8913. It wasn't scaled to Vega either. In fact, even going back to the dataset and searching specifically for a star with a lum parameter of 1, I didn't find anything. So I got in touch with Brian Abbott, assistant director
Img8
of the Hayden Planetarium at the AMNH. Brian was very helpful, but oddly informed me that he had no idea what the lum parameter had been set to, and moreover that AMNH wasn't aware NASA was using their data. The API was eventually pulled, and I had to revert to an identical API created at a NASA hackathon (HackTheUniverse Star API).1 So unfortunately radii are not being calculated, but it's certainly possible.

Animation is done via the ThreeJs library, and is quite straightforward.2 ThreeJs is a 3D JavaScript library that produces animations via canvas, SVG, CSS3D, and WebGL renderers – I primarily use it to render via canvas. ThreeJs was created by MrDoob, and has remarkably good documentation and support.

EARTH / LANDSAT 8: The second web app I built was one that utilises NASA’s Earth API, which is powered by the Google Earth Engine (for mapping). The API is incredibly simple: you request a latitude/longitude, and you get back a URL of an image of that location. There are only a few other parameters (date of image, dimensions of image, and cloud cover probability). The images are all provided by the Landsat 8 satellite, which is in an orbital pattern that causes it to

1 star-api.herokuapp.com
2 threejs.org
cross over (almost) every point on earth roughly once every 16 days – meaning these images are by far the most current satellite images available to the public. One suggested use by NASA is the monitoring of deforestation. A recent industry report estimated the value of the Landsat data at $2.19 billion annually, yet there wasn't (to my knowledge) a single web app that utilised the Earth API.3 The basic idea behind the application I wanted to create was to populate a 3D Earth with Landsat images dynamically, essentially recreating a Google Earth with extremely recent imagery.

I originally set out to build the application based on the simple WebGL Earth API in an attempt to keep things as light as possible. The API supports Leaflet mapping, but I found its capabilities to be too simple for what was needed – laying 0.025° by 0.025° tiles onto an Earth required a very precise mapping utility. Obviously the number of tiles needed to cover the entire Earth is massive, as the tiles are quite small – it would take approximately 100 million tiles, equating to terabytes' worth of data, so tile population would need to be dynamic, and only once a user reached a certain zoom level (with a simpler mesh being used at zoomed-out levels).
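The tile estimate above can be checked with a few lines of arithmetic. The sketch below assumes a regular latitude/longitude grid of 0.025° tiles covering the full 360° by 180°; the average tile size of 50 KB is a made-up figure used only to show the order of magnitude, and `tileBounds` assumes each tile is centred on the requested lat/lon.

```javascript
const TILE_DEG = 0.025;          // tile width and height in degrees (from the text)
const ASSUMED_TILE_BYTES = 50e3; // hypothetical average image size, not a measured value

// Number of tiles needed to cover the whole globe on a regular lat/lon grid.
function globalTileCount(tileDeg) {
  const across = Math.round(360 / tileDeg); // columns around the equator
  const down = Math.round(180 / tileDeg);   // rows from pole to pole
  return { across, down, total: across * down };
}

// Bounding box of one tile, assuming the requested lat/lon is its centre.
function tileBounds(lat, lon, tileDeg) {
  const half = tileDeg / 2;
  return { west: lon - half, south: lat - half, east: lon + half, north: lat + half };
}

const grid = globalTileCount(TILE_DEG);
const terabytes = (grid.total * ASSUMED_TILE_BYTES) / 1e12;
console.log(grid.total);           // 103680000
console.log(terabytes.toFixed(1)); // 5.2 (TB, under the assumed tile size)
```

Roughly 104 million tiles is consistent with the "approximately 100 million" figure, which is why tiles are only worth requesting for the small area a user has zoomed into.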
3 “The Value Proposition for Landsat Applications – 2014 Update” National Geospatial Advisory Committee – Landsat Advisory Group. 2014. Web. 1 Aug 2015.
Eventually I discovered Cesium, a JavaScript library for creating 3D globes and 2D maps in the browser, importantly without a plugin.4 It draws via WebGL as well, but is much richer in its features. CORS errors were an early problem, as the use of cross-domain images in canvas/WebGL “pollutes” the canvas. My original intent was to create a server-side PHP proxy (an idea which Dan Hammer at NASA was eager to collaborate on), but happily Cesium has a built-in proxy mechanism. The application is fully functional and populates a Cesium-based 3D globe with the most recent Landsat 8 imagery. Dan Hammer confirmed that the dim parameter is indeed the width and height in degrees of the tile/image, with the centroid being the requested lat/lon, and the projection being EPSG:4326. Currently, however, Dan has been unable to successfully configure the server, and despite being told my API key had its request limit raised to 100k/hr, I get 429 errors after just 9 rapid requests. The application is fully functional, though, and users can find a spot on the globe, click a button, and watch as a small area is populated with Landsat 8 tiles which are curved and placed on top of the globe. I've chosen to use the Bing Maps Platform as my macro-level map, because the API is well supported and the imagery is excellent. The guys at NASA were happy to see someone put the API to use, and said it “suggests some very powerful applications”.

APOD SLIDER: NASA’s Astronomy Picture of the Day (APOD) is one of the highest-traffic US government websites. It typically ranks within the top 5 US government sites based on traffic, but it looks extremely outdated, and the UI is nonexistent. As it happens, NASA provides an APOD API that serves APOD images. It’s about as simple as an API can be: you send an XMLHttpRequest with a year, month, and day, and you receive back a URL for an image (or in some cases a video).
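A request of the kind described above can be sketched as follows. The endpoint `https://api.nasa.gov/planetary/apod` and the `date` and `api_key` query parameters are assumptions based on NASA's public API documentation rather than details given in this report, and `DEMO_KEY` stands in for a real key.

```javascript
// Build the request URL for one day's APOD entry.
// Endpoint and parameter names are assumptions; check them against api.nasa.gov.
function apodUrl(year, month, day, apiKey) {
  const pad = (n) => String(n).padStart(2, "0"); // 8 -> "08"
  const url = new URL("https://api.nasa.gov/planetary/apod");
  url.searchParams.set("date", year + "-" + pad(month) + "-" + pad(day));
  url.searchParams.set("api_key", apiKey);
  return url.toString();
}

console.log(apodUrl(2015, 8, 1, "DEMO_KEY"));
// https://api.nasa.gov/planetary/apod?date=2015-08-01&api_key=DEMO_KEY
```

In the browser this URL would be passed to an XMLHttpRequest (or `fetch`), and the media URL read from the JSON response.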
researching various image slider solutions, I ultimately decided to use LightSlider, partially because it is very light and utilises jQuery, and partially because it allows refresh calls that rescan for new DOM elements and append them to the slider, obviously a crucial component of a dynamic slider.5 As I worked with the plugin, I happily discovered a number of other features I was pleased about: CSS transitions with jQuery fallback, support for swipe and mouseDrag (i.e. mobile-friendly), and it’ll slide nearly anything (Vimeo, YouTube, iframes, etc).

After a bit of thinking, I realised that the slider only needed a single JavaScript function to work – a function that, when called, added 10 new slides. I set up an array of image ids, and set up a for loop inside the function, with the start and stop values fixed before the loop begins:

    var start = imgIds.length, stop = start + 10;
    for (var i = start; i < stop; i++) {
      // generate a slide
    }

Inside the loop lie the real working bits, which include setting up the appropriate DOM structure for a new slide, sending an XMLHttpRequest inside a callback, and (once the response is returned) assigning an src to the appropriate img tag. The function runs on page load, as obviously 0 + 10 equals 10, resulting in 10 initial slides (the default of which is today's APOD). Each time the user goes to a new slide, a conditional is checked: if the current slide’s id is equal to (imgIds.length-3), the main function is called, and 10 new slides are generated. This was the simplest way I could think of to achieve “infinite scrolling”.

I realised rather late on in the project that sometimes APODs aren’t actually images; now and then they are YouTube or Vimeo videos. Luckily, the API returns “media_type”, so it was just a matter of adding in a conditional wherein if media_type == “image”, create an img DOM node, else if media_type == “video”, create an iframe DOM node.
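The batching and media-type logic described above can be sketched without any DOM or network code. This is an illustrative reconstruction, not the application's source: the function names are hypothetical, slides are assumed to run backwards one day at a time from today, and the `url` field on the API response is an assumption (only `media_type` is named in the text).

```javascript
const BATCH_SIZE = 10;     // slides added per call (from the text)
const PREFETCH_OFFSET = 3; // load more when the user is 3 slides from the end

// Dates (YYYY-MM-DD, UTC) for the next batch: slide i shows the APOD from i days ago.
function nextBatchDates(today, loadedCount) {
  const stop = loadedCount + BATCH_SIZE; // fixed before the loop, so it cannot drift
  const dates = [];
  for (let i = loadedCount; i < stop; i++) {
    const day = new Date(today.getTime());
    day.setUTCDate(day.getUTCDate() - i);
    dates.push(day.toISOString().slice(0, 10));
  }
  return dates;
}

// True when the slide being viewed is close enough to the end to fetch another batch.
function shouldLoadMore(currentIndex, loadedCount) {
  return currentIndex === loadedCount - PREFETCH_OFFSET;
}

// Describe the DOM node a slide needs: img for images, iframe for videos.
function slideNode(apod) {
  if (apod.media_type === "image") return { tag: "img", src: apod.url };
  if (apod.media_type === "video") return { tag: "iframe", src: apod.url };
  return null; // unknown media type: skip the slide
}
```

Keeping these three decisions as pure functions makes them easy to check in isolation; the slider plugin and the XMLHttpRequest callbacks then only need to act on their results.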
Over the course of a few days, I built a dynamic image slider that delivers APOD content. After
Interestingly, after completing the work and contacting Robert Nemiroff (the author/editor of APOD and professor at MTU), I was told that the APOD team was “generally not interested in
4 cesiumjs.org
5 github.com/sachinchoolur/lightslider
'updating' the NASA APOD's appearance”, but that they do encourage mirrors with different appearances. I have no idea why they wouldn't want to update their UI and web design, as the page clearly appears to have been built in the 1990s. Although the APOD team isn't interested in modern web design, the slider itself has value as a plugin to be used easily and modularly in any website (and would be particularly useful as a WordPress plugin).

SKYFINDER: In working with the NASA APIs, I discovered that many are abandoned or poorly maintained. This is because quite a few of them were created at hackathons and other sorts of one-off events, with longevity not even a consideration. One such abandoned API is the Predict the Sky API, which I became interested in immediately.6 The premise is that the API combines “global weather data with a comprehensive library of space events”, allowing you to send a lat/lon location and receive back both the current cloud cover and a list of astral bodies overhead. It’s essentially a utility for amateur astronomers/hobbyists. After setting up some basic structure and sending out initial XMLHttpRequests to the API, I discovered the project had been dead for several months. This was initially quite a let-down, but I soon realised that what the API itself does is nothing spectacular. To reproduce it, all you need is current weather data for a location, and current astronomy data for a location. So I set about reproducing the API (albeit client-side via calls to several other APIs).

Weather was extremely easy – I set up an API key at Weather Underground’s API and was parsing cloud cover in a matter of minutes. The nice thing about WU’s API is that it also provides expected cloud cover over the next 6 hours or so (based on radar, I presume), which adds the further utility of being able to plan out a night of observing.
Surprisingly, I was unable to find any astronomical API that would be of use in determining overhead bodies. A man named Mark Casazza built a webpage that serves a similar purpose, but after speaking with him via email, I discovered that he is getting a lot of his data via automated form-filling of an International Astronomical Union Minor Planet Center webpage, and JavaScript calculations based on offline data. I have no interest in either of these methods, as they’re rather error-prone and unnecessarily laborious. After lots and lots of searching, I eventually found an almost entirely undocumented “Planets-API”, which provides planets overhead for a given lat/lon. The API was linked to me by a Redditor who claimed to be the creator. I sent some requests to the API, realised CORS headers weren’t set, and messaged the Redditor back, asking if they could set the correct headers. I never got a response, so for the time being I have proxied my requests through a third-party service to get around CORS issues. (On a side note, in all the recent API work I've done, it has been shocking how many APIs haven't even had their headers set up correctly.)

As it turns out, Weather Underground’s API also provides very basic astronomy data – the current Moon phase, sunrise and sunset, etc. So I’ve put all this together to create an app that fetches a lat/lon (off a Google Map via the v3 API), and returns current cloud cover, visible planets, moon phase, and sunrise/sunset, effectively recreating the deprecated NASA Predict the Sky API.

WEATHER ON MARS: This was the last application built, and is very simple in nature. NASA's Mars Atmospheric Aggregation System (MAAS) API delivers the REMS weather data transmitted by the Curiosity Rover on Mars.7 The Weather on Mars app simply fetches data via a single API call to MAAS, parses the returned JSON string, and makes minor edits to formatting to make the data more readable.

6 predictthesky.org/developers.html
7 marsweather.ingenology.com
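The parse-and-format step for Weather on Mars can be sketched as below. The field names (`report`, `sol`, `terrestrial_date`, `min_temp`, `max_temp`, `pressure`, `atmo_opacity`), the units, and the sample values are all assumptions made for illustration; they are not taken from this report, and the sample is not real REMS data.

```javascript
// Show a value with its unit, or a placeholder when nothing was reported.
function show(value, unit) {
  return value === null || value === undefined ? "n/a" : value + " " + unit;
}

// Turn the raw JSON string into readable lines (field names are assumed).
function formatMarsReport(jsonString) {
  const report = JSON.parse(jsonString).report;
  return [
    "Sol " + report.sol + " (" + report.terrestrial_date + ")",
    "Low: " + show(report.min_temp, "°C"),
    "High: " + show(report.max_temp, "°C"),
    "Pressure: " + show(report.pressure, "Pa"),
    "Sky: " + (report.atmo_opacity || "n/a"),
  ];
}

// Invented sample payload, for illustration only.
const sample = JSON.stringify({
  report: {
    sol: 1, terrestrial_date: "2015-01-01",
    min_temp: -80, max_temp: -20, pressure: null, atmo_opacity: "Sunny",
  },
});
console.log(formatMarsReport(sample).join("\n"));
```

Handling missing readings explicitly matters here, since a remote instrument will not always report every field.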
CREDITS
INDESIGN TEMPLATE:
graphicriver.net/item/annual-report/11224010
Item ID: 11224010
License: 3c8629c9-356b-4048-976c-ce9dde17eb24

IMAGES:
Img1: CC0 License pexels.com/photo/black-and-white-city-man-people-1984
Img2: CC0 License www.pexels.com/photo/apple-desk-office-technology-996
Img3: CC0 License www.pexels.com/photo/abstract-glare-visual-art-3582
Img4: CC0 License www.pexels.com/photo/black-and-white-city-flight-5490
Img5: CC0 License www.pexels.com/photo/close-up-notebook-keyboard-6189
Img6: CC0 License www.pexels.com/photo/berlin-eu-european-union-federal-chancellery-4666
Img7: CC0 License www.pexels.com/photo/black-and-white-lights-abstract-1952
Img8: CC0 License www.pexels.com/photo/books-magazines-building-school-2757
Img9: CC0 License www.pexels.com/photo/time-train-station-clock-deadline-4090

FONTS:
Aileron: Public Domain (Free for Commercial Use) fontsc.com/font/aileron
INFORMATION

AUTHOR & PROGRAMME:
Isaac Hinman (s1463054)
MSc Design & Digital Media
Edinburgh College of Art
The University of Edinburgh
2015

LIVE SITE: vis.space
Mirror: playground.eca.ed.ac.uk/~s1463054/spacevis/index.html

DOCUMENTATION BLOG: isaachinman.wordpress.com

INCLUDED ON SC CARD: Documentation Open Data & Government PDF InDesign Video Walkthroughs SpaceVis (overview) StarFinder Earth/Landsat APOD SkyFinder MarsWeather Practical Work Source Code GitHub Repositories Hosted Versions vis.space playground (mirror) Prototypes AngularJs Prototype Formulas Outline (Written Portion)
OPEN DATA & GOVERNMENT Isaac Hinman