TELLING A STORY THROUGH DATA: – OPEN DATA
Intermediate Project – Main Topic, Anne Schirner, Summer Semester 2012
Anne Schirner, Matrikelnummer 11075789 Intermediate Project, Main Topic Supervised by Prof. Philipp Heidkamp (Interface Design) FH Köln, Fakultät für Kulturwissenschaften Köln International School of Design, BA Integrated Design Summer Term 2012
CONTENT
Topic Exploration 05
Open Data 07
Visualisations 08
Applications 10
“Open Data for Africa” 15
Critical Discourse 17
Open Data: Telling a story through data 19
Data Collection 21
Visualising Quantities 23
Data Mining / Colour Coding 27
Data Analysis 29
Data Interpretation 31
Visualising Changes Over Time 33
Prospects 35
Appendix 38
left: World passenger airline routes. Data from Google Maps, Airlineroutemaps, and individual airline websites.
Telling a Story Through Data
INTRODUCTION / TOPIC EXPLORATION
4/5
MOTIVATION
Fascination with compelling and meaningful data visualisations, both historical and recent, together with my love of numbers and my fondness for spreadsheet applications, made me want to try it myself: telling a story through data. But which story would be worth telling? After exploring possible topics from different backgrounds, among them the Arab Spring and education, the idea took shape to tackle a topic closely related to data visualisation itself, since it is the source on which most of the work in the field is based: Open Data.
In recent years the amount of data generated has been exploding. One reason is the rapid advancement of technology. The possibility of storing huge amounts of data at affordable prices is only one factor, but the most profound one:
• $600 buys a disk drive on which all of the world’s music can be stored
• 5 billion mobile phones were in use in 2010
• 30 billion pieces of content are shared on Facebook every month
• 235 terabytes of data had been collected by the US Library of Congress by April 2010
• 40% projected annual growth in global data generated, versus
• 5% growth in global IT spending¹
One could get the impression that Moore’s law has been proven correct. In 1965 the American engineer Gordon Moore predicted that the number of transistors per silicon chip would double every year.²
Another reason for the tremendous increase in the amount of data is that more and more people are taking part, simply because they are gaining access to the means needed to interact in our linked world. There is an incredible amount of data out there and it is growing every day. As Ben Fry, an American expert in data visualisation, puts it: “This is only going one way: there is no trend towards less data”³
A new development is that non-governmental organisations (NGOs) and governments themselves are making this data accessible to the public on web portals, so that it becomes Open Data. The burning question now is how to put that data into context. How can we make use of it?
1 list cf. McKinsey (2011): Big data: The next frontier for innovation, competition, and productivity, pg. 6
2 cf. Encyclopædia Britannica online
3 Simon Rogers (2010), The Guardian, Government data from around the world, available at: http://www.guardian.co.uk/news/datablog/2010/jan/07/government-data-world
“… the best way to get value from data is to give it away.” Neelie Kroes, Vice-President of the European Commission responsible for the Digital Agenda, opening speech, press conference on Open Data Strategy, Dec 12, 2011
OPEN DATA / OVERVIEW
6/7
OPEN DATA
WHAT DOES “OPEN” MEAN?
What stands behind the concept of “open”? Open implies that something is both legally and technically open. Everyone should be able to access and use it freely, reuse it for a new purpose, and redistribute part or all of it. It should be available to everyone in an easily accessible, standard and editable electronic format.⁴ The data can be accessed on the web by anyone, as raw or geospatial data, and is available either for download or via an Application Programming Interface (API). Open Data is closely linked to the terms Linked Data, Open Government Data and Semantic Web.
The Semantic Web is an evolution of the World Wide Web that, rather than just linking from one document to another, focuses on their meaning in relation to each other. Linked Data is a set of technologies to achieve this for data, creating a web of data.⁷
KEY PLAYERS
In May 2009 the US government led the way by putting the open data portal Data.gov online, closely followed by the UK and New Zealand later that year. These websites can be seen as the answer to the call for “Freeing our data”⁵ made by several organisations, among them newspapers like The Guardian and NGOs like the Open Knowledge Foundation (OKF). One argument against keeping the data locked up is that it belongs to the people, since the people fund its collection by paying taxes.⁶ By February 2012 a significant number of countries around the world, among them Kenya, South Korea and Uruguay, had joined the Open Data movement. Even some federal states and cities such as Paris and Berlin have started to make data sets accessible. This movement is part of a bigger initiative that tackles topics like education, entertainment, libraries and services. The Open Knowledge Foundation is one of the leading actors in this field.
VISION
When it comes to Open Data, one keeps hearing great ideas about what can be achieved by building applications and visualisations based on that data: increased transparency, accountability, innovation, efficiency, collaboration and participation. Only time will tell whether this becomes reality.
4 cf. Dr. Tariq Khokhar (2012): Open data – the new revolution, talk held at the Developing the Caribbean Conference, [video file], retrieved from: http://www.ustream.tv/recorded/20017987
5, 6 cf. Tim Berners-Lee (2009): The next web of open, linked data [video file], retrieved from http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
Screenshots of Open Data portals (USA, Kenya)
7 cf. FAQ catalogue, http://data.gov.uk/faq
VISUALISATIONS
BASED ON RAW DATA
The Jobless Rate for People Like You
Shan Carter, Amanda Cox and Kevin Quealy / The New York Times
This interactive visualisation shows how the recession in 2009 affected the unemployment rate of different groups of people in the United States of America, with filters for race, gender, age and education level. It is based on a number of time-series plots, which are considered a very strong and efficient way of interpreting time-based data because of the “natural ordering of the time scale”. This visualisation method can handle big data sets and enables the viewer to compare different parts of the data easily.⁸
The interactive feature of this chart extends the comparability even further: not only can the change of one entity over time be investigated, it is also possible to compare a wide variety of demographic groups with regard to these changes.
http://www.nytimes.com/interactive/2009/11/06/business/economy/unemployment-lines.html
November 6, 2009
Data source: U.S. Bureau of Labor Statistics
8 cf. Edward R. Tufte (2009) [2001], The Visual Display of Quantitative Information, pg. 28
OPEN DATA / VISUALIZATIONS
8/9
BASED ON GEOSPATIAL DATA
Flight Patterns - altitudes and types
Aaron Koblin
An animation of North American flight paths monitored by the Federal Aviation Administration on August 12, 2008. The different colours stand for different makes and models of aircraft, the different colour saturations for the altitudes at which the aircraft are flying. The big picture shows the geo-referenced flight pattern above the entire country, the one on the right the pattern above Atlanta. This work was created with Processing, an open source programming environment developed by Ben Fry and Casey Reas.
Since networks can be found in almost every aspect of our interconnected world, a number of visualisation experts have become interested in tackling the topic through visualisations, “aiming to uncover the inherent principles and behaviours that regulate a variety of natural and artificial systems, normally characterised by a multitude of interconnecting elements.”⁹ Including time as a variable in a network visualisation increases its information content, because networks are usually evolving systems.¹⁰
http://www.aaronkoblin.com/work/flightpatterns/index.html
June 28, 2009
Data source: FAA
9 Manuel Lima (2011): Visual Complexity. Mapping Patterns of Information, pg. 73
10 ibid.
APPLICATIONS
BASED ON RAW DATA
Where does my money go?
Jonathan Gray / Open Knowledge Foundation
This interactive application shows how public money is being spent in the UK. At the launch of the UK Open Data platform data.gov.uk, the British Prime Minister David Cameron announced: “Each month each government department will publish every item of spending over £25,000 online.” This illustrates the movement towards more transparency and accountability. It is an honourable promise and an example of what a modern democracy ought to look like, but at the same time these goals are hard to meet, which may explain the publication of incorrect data on the equivalent US website: a 2009 study revealed that US$1.3 billion had been displayed wrongly.¹¹ The visualisation methods implemented at data.gov.uk include bubble visualisations, choropleth maps and tree maps.
http://wheredoesmymoneygo.org/
launched in autumn 2009
Data source: HM Treasury (UK economics and finance ministry)
11 cf. Alexander Schellong / Ekaterina Stepanets (2011): Unbekannte Gewässer. Zum Stand von Open Data in Europa, pg. 28
OPEN DATA / APPLICATIONS
BASED ON GEOSPATIAL DATA
Augmented Reality (AR)
Augmented Reality describes the enrichment of reality through additional information and virtual content. Georeferenced Points of Interest (POI) and their metadata are accessed and shown as an overlaid image. Smartphones offer the features needed for AR, which links GPS positioning to available geospatial data. AR applications already exist, but it is a fairly new technology that will have to prove its value.
illustration: Oliver Uberti, National Geographic (http://ngm.nationalgeographic.com/big-idea/14/augmented-reality)
10 / 11
BASED ON DATA ACCESSED THROUGH A WEB API
Haiti Crisis Map
Crowdsourcing / OpenStreetMap
OpenStreetMap is a free, editable map of the entire world. Inspired by the collaborative internet encyclopedia Wikipedia, it lets every registered user improve the map by adding entities through an API. The pictures below show three stages of the map of Port-au-Prince in Haiti. In January 2010 Haiti was shaken by an earthquake of magnitude 7.0 that left thousands of people dead and many buildings destroyed. The image in the upper left corner shows the map of Port-au-Prince just before the earthquake, when it was not very well developed. The picture on the right shows a map of the same city only days after the earthquake, with the locations of refugee camps, blocked roads and damaged buildings added. This editing, done by people around the world, was based on satellite imagery provided by GeoEye, a company providing geospatial information. It helped rescue teams, who accessed the edited map on their Garmin GPS devices, to do their work. The last image (bottom left) shows Port-au-Prince after the refugee camps had been removed and life had returned to normal.¹² It is a powerful way to make a difference and to help without being on site, perhaps from thousands of kilometres away in front of a computer.
12 cf. Tim Berners-Lee: The year open data went worldwide, [video file], retrieved from http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html, February 2010
13 Encyclopædia Britannica online
An API (in full: application programming interface) is a set of standardised requests that allows different computer programs to communicate with each other.¹³ An interface is a common boundary between two separate systems; it is the means or medium via which these two systems communicate. The most popular APIs are web APIs such as the Twitter API, the Google Maps API and the Flickr API. The websites of the World Bank and The Guardian also offer APIs for accessing data.
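To make the idea of a standardised request more concrete, the following minimal Python sketch builds a query URL and decodes a JSON reply. The endpoint `https://example.org/api/datasets`, its parameters `country` and `format`, and the sample reply are hypothetical stand-ins chosen for illustration; they are not the interface of any of the services named above, and no network call is made.

```python
import json
from typing import List
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://example.org/api/datasets"


def build_request_url(country: str, fmt: str = "json") -> str:
    """Encode the query parameters into a request URL."""
    return BASE_URL + "?" + urlencode({"country": country, "format": fmt})


def parse_response(body: str) -> List[str]:
    """Extract the dataset titles from a JSON reply of the assumed shape."""
    payload = json.loads(body)
    return [entry["title"] for entry in payload["datasets"]]


# Made-up reply in the assumed format.
SAMPLE_REPLY = (
    '{"datasets": [{"title": "Population by county"}, '
    '{"title": "Road network"}]}'
)

if __name__ == "__main__":
    print(build_request_url("Kenya"))
    print(parse_response(SAMPLE_REPLY))
```

The point of the sketch is the division of labour: the client only needs to know the agreed request format and the agreed reply format, not how the data is stored on the other side.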
http://www.ted.com/talks/lang/en/tim_berners_lee_the_year_open_data_went_worldwide.html
12 / 13
Respiratory Illnesses vs. Use of Fuel Wood, 2005/6 – two graphs, generated by an online-application on http://opendata.go.ke
OPEN DATA / POTENTIAL
14 / 15
OPEN DATA FOR AFRICA
With Kenya Open Data, Kenya joined the Open Data movement in July 2011 as the first Sub-Saharan country and the second African country after Morocco. “The goal of opendata.go.ke is to make core government development, demographic, statistical and expenditure data available in a useful digital format for researchers, policymakers, ICT developers and the general public.”¹⁴ The launch of the open data portal received significant media attention, accompanied by debates about why Open Data can become particularly important for this region of the world and for countries at a similar stage of development. But first I would like to go into a bit more detail on the appearance and functionality of the web site. Compared to other data portals it is one of the more compelling examples: it is well structured, visually appealing, provides meaningful categories and offers the possibility to create visualisations on site based on the filed data sets. It is obvious that the creators of the site put some preliminary thought into it. The portal is powered by Socrata, a US-based start-up providing Open Data solutions.¹⁵ It is managed by the Kenyan Ministry of Information and Communications in partnership with the World Bank.
Where does the potential of Open Data lie for developing countries? When I screened the Brazilian Open Data portal, the nature of the released data sets indicated an obvious intention: the fight against procurement fraud and corruption. The federal government in Brazil ran a programme that audited and published data on irregularities associated with fraud and procurement in public finances, and shared this data over the internet and in local media. The data exposed 373 municipalities and had a dramatic effect on local politics: it resulted in a 30% drop in the same people being re-elected.¹⁶ In Brazil the government has released only 33 data sets to date, but the majority carry titles such as “Suppliers of the Federal Executive” or “National Register of Enterprises with a bad reputation”.¹⁷
In Kenya a similar situation can be observed. One goal is to hold the people in charge accountable for what they are doing. But on the African continent another opportunity created by opening up the data stores is gaining relevance: the increase of internal African trade. “Saying that a 5% increase in internal African trade would bring ten times the amount of money as currently is being received in foreign aid. Lack of openness of information is making more internal trade difficult. If African nations would release more information, such as on goods and products available for export, more transactions would be possible, while at the same time cutting out various middle men that now drive up the cost of trade enormously.” This is how Dr. Bitange Ndemo, Permanent Secretary of the Ministry of Information and Communications in Kenya, explained how the data can be used to enhance economic growth.¹⁸
14 cf. http://opendata.go.ke/page/about
15 cf. Hanif Rahemtulla, Jeff Kaplan, Björn-Sören Gigler, Samantha Cluster, Johannes Kiess, Charles Brigham (2011): Open Data Kenya. Case Study of the Underlying Drivers, Principal Objectives and Evolution of one of the first Open Data Initiatives in Africa, pg. 16
16 cf. Dr. Tariq Khokhar (2012): Open data – the new revolution. [video online] Available at: http://www.ustream.tv/recorded/20017987 [Accessed on 25 February 2012]
17 cf. http://beta.dados.gov.br/dados/dataset
18 cf. Dr. Bitange Ndemo (2012): Open Data: Africa’s Opportunity. [video online] Available at: http://epsiplatform.eu/content/africa-needs-open-data-most [Accessed on 14 March 2012]
“By overemphasizing a truth, you can create a lie.” Deroy Peraza, creative director at Hyperakt, a design firm in NYC interviewed by Josh Smith, graphic designer and co-founder of IDSGN-blog, NYC, July 19, 2011
OPEN DATA / DISCOURSE
16 / 17
CRITICAL DISCOURSE
Even though Open (Government) Data is currently surrounded by a lot of hype, and some activists even tend to treat it as a philosophy¹⁹, the topic needs to be reviewed critically. Where can problems or difficulties be seen?
As the German saying goes, “I only trust a statistic that I faked myself”; the term raw data ought to be questioned. Some hold the view that raw data do not exist, and that even the most elementary perception is already influenced by potential uses, expectations, context, and theoretical constructs.²⁰ Hence Tuomi suggests reversing the hierarchy data-information-knowledge. According to Tuomi, data is collected on the basis of knowledge and information (in this order).²¹ Is data still being used to prove a hypothesis, or does it generate questions by itself?
Secondly, Open Data lacks definitions that differentiate what actually stands behind the term data. The focus of the discourse is rather on the openness and reusability of data.²² This could explain why the data portals installed so far differ so much in appearance. The categories under which data is filed on these websites vary tremendously in naming, content and variety. This leads to the next deficiency: the need for data directories that enable the user to search well-structured archives and hence to access the data easily. In many cases this precondition for free access is not met, as I realised while searching the respective portals. Some do not even offer the possibility to search data by category or tag.
Another point that undermines reliability is the not yet clearly outlined field of data journalism. Data sets are the source on which data visualisations are based. Those visualisations are published in print or web media and ideally provide additional information on, or are the starting point for investigating, a certain “story”. This field is called data journalism, a newly emerging area in journalism. Since this development started fairly recently, there are only few experts in the field. Some of them work with The New York Times and The Guardian.
19, 22 Alexander Schellong, Ekaterina Stepanets (2011), Unbekannte Gewässer. Zum Stand von Open Data in Europa, pg. 4
20 Ilkka Tuomi (1999), Data is More Than Knowledge. Implications of the Reversed Knowledge Hierarchy for Knowledge Management and Organizational Memory, pg. 4
21 cf. ibid.
The difficulty that becomes obvious is that data journalism involves essentially three professions: the journalist, who tells the story; the statistician, who takes care of the collection, organisation, analysis and interpretation of data; and the designer, who is concerned with the visual and perceptual aspects. As Hal Varian, chief economist at Google, emphasises: “The ability to take that data, to be able to understand it, to process it, to extract value from it, to visualize it and to communicate it, that is going to be a hugely important skill in the next decades”²³ There is an urgent need for experts who are able to cover all three areas that form the basis of a meaningful, relevant, “true” data visualisation. This is a call for change in the educational system towards a more integrated approach.
Also of concern is the technology-centred character of Open Data. One principle of Open Data is to provide data in machine-readable formats. Leaving it exclusively to machines to analyse and interpret data should not be the goal. A machine helps to look for patterns and trends in the data, but it is unable to extract meaning. In addition, there should also be data that is readable by humans only.²⁴
Two more critical aspects concerning Open Data in general are financing and privacy. It is very cost-intensive to scrape, gather and sort huge amounts of data. The argument that the taxpayer is already paying for the data collection is not entirely true: a huge amount of work is involved in formatting and sorting the data so that it makes sense to the public. The US data portal data.gov, for instance, is currently at risk due to massive budget cuts.²⁵ The violation of privacy is an argument often brought up against Open Data, in particular in Germany. Since Open Data focuses on non-personalised data, the argument does not apply to it. Another question worth asking is how to ensure that Open Data meets a quality standard, since the nature of “open” implies vulnerability to manipulation.
Time will reveal whether Open Data is going to prove to be a relevant and change-making resource.
23 Hal Varian, chief economist at Google, interviewed by James Manyika, director of McKinsey’s San Francisco office, Napa, California, September 2008
24 cf. Alexander Schellong, Ekaterina Stepanets (2011), Unbekannte Gewässer. Zum Stand von Open Data in Europa, pg. 10
25 cf. http://sunlightfoundation.com/press/releases/2011/04/01/editorial-memo-proposed-budget-cuts/
“The ability to take that data, to be able to understand it, to process it, to extract value from it, to visualize it and to communicate it, that is going to be a hugely important skill in the next decades” Hal Varian, chief economist at Google interviewed by James Manyika, director McKinsey’s San Francisco office, Napa, California, September 2008
OPEN DATA / MOTIVATION
OPEN DATA: TELLING A STORY THROUGH DATA
MOTIVATION
Since the goal of this paper is to tell a story through data, I am now going to stop tackling the topic through words. Let us see what can be discovered with the help of the visualisations on the following pages. My intention was certainly not to create impressive spider-web-like images, but rather to come up with a couple of meaningful, decodable visualisations. As the saying goes, “a picture is worth a thousand words”, so my work consists mainly of visualisations, plus a few explanatory side notes and an excursion into several perceptual facets. To start off, I wrote down all the questions that came to my mind while looking at the different open data portals installed by governments. Keeping these questions at the back of my mind, I started the actual work by undertaking the following steps: data collection, data mining, data analysis and interpretation.
1 How many datasets have been published by different countries up to now?
2 When – on a timeline – did the publication of different data sets by different countries take place? Which countries were pioneers, which followed their example and when?
3 Which categories are the datasets filed under?
4 Is there a difference in focus, concerning which government data is being released, set by different nations?
5 Which formats are more popular? Which ones less popular?
6 Open Data in Africa - What is the current status?
18 / 19
[Table: spreadsheet containing the raw data. Columns in the original: country, institution, launch date, url, number of data sets available on website, public sector bodies/categories, additional counties/cities, comment. The rows are listed below with the cells that can be assigned to a country unambiguously.]
Australia: http://data.gov.au/
Bahrain: http://www.bahrain.bh/pubportal/wps/portal/data/ (251 data sets)
Belgium: http://data.gov.be/
Brazil: http://beta.dados.gov.br/, http://www.portaltransparencia.gov.br/
Canada: http://www.data.gc.ca/default.asp?lang=En&n=F9B7A1E3-1 (11743 data sets); additional counties/cities: Edmonton, Mississauga, Nanaimo, Ottawa, Toronto, Vancouver
Chile: http://datos.gob.cl/
Denmark: http://digitaliser.dk/resource/432461
Estonia: opendata.riik.ee, http://pub.stat.ee/px-web.2001/Dialog/statfile
Finland: http://opengov.fi/data/, http://www.suomi.fi/suomifi/tyohuone/yhteiset
France: http://www.data.gouv.fr/; additional counties/cities: Paris, Montpellier, Rennes
Germany: http://www.portalu.de/portal/default-page.psml; additional counties/cities: Berlin, Bavaria
Greece: http://geodata.gov.gr/geodata/
Hong Kong: http://www.gov.hk/en/theme/psi/welcome/
Ireland: http://data-gov.ie/.html, http://www.statcentral.ie/; additional counties/cities: county Fingal (Nov 16 2010)
Italy: http://www.dati.gov.it; additional counties/cities: Udine, Piedmont
Kenya: http://opendata.go.ke/; comment: Open Kenya is powered by Socrata, the Seattle-b
Moldova: http://data.gov.md/
Morocco: http://data.gov.ma/Pages; launch date: before Kenya
Netherlands: http://data.overheid.nl/
New Zealand: http://www.data.govt.nz/
Norway: http://data.norge.no/
Peru: http://www.datosperu.org/
Russia: http://opengovdata.ru/
Saudi Arabia: http://www.saudi.gov.sa/wps/portal/!ut/p/c4/04_SB8K8xLLM9MSSzPy8xBz9CP0os3iTMGenYE8TIwODUEsLA89QU69g11A_YwMTQ_3g1Dz9gmxHRQCO1nwy (217 data sets)
Singapore: http://data.gov.sg/
South Korea: http://www.data.go.kr/Main.do
Spain: http://datos.gob.es/datos/, http://www.aporta.es/web/guest/index; additional counties/cities: Badalona, Catalunya, Euskadi
Sweden: http://www.opengov.se/
Timor-Leste: http://www.transparency.gov.tl/
UAE: http://www.government.ae/web/guest/uae-data (63 data sets)
UK: http://data.gov.uk/
USA: http://www.data.gov/
Uruguay: http://datos.gub.uy/
OPEN DATA / DATA COLLECTION
DATA COLLECTION
Collecting the required data took quite some time. The first step was to find out which countries had already installed an open data portal or were in the process of doing so. The second step was to create my own data set in the form of a spreadsheet and to fill it with the data I found on the respective websites. The information I was interested in included, for example, the number of data sets available per country, the dates on which the various countries started to open up their data stores, the categories under which data sets are filed, and which cities and states within those countries had started to publish data as well. Since this work was done manually, it cannot be guaranteed that every single number is correct, but I made every effort to obtain a data set that sticks as closely as possible to the facts. Because Open Data is currently a hot topic, a lot is happening all the time. New data sets are being uploaded every day. The spreadsheet on page 20 shows the status recorded before February 26, 2012. On March 12, 2012 I had another look at some of the data portals and discovered that more than 350 000 data sets had been added to the French web site since that date.²⁶ This illustrates the importance currently attached to the topic. Unfortunately I was unable to include this data in my work, because it would have been too time-consuming at this stage of the project. This means that my work was out of date before it had even been published.
26 cf. http://www.data.gouv.fr
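Because the spreadsheet was filled in by hand, the launch dates appear in several notations (for example "Nov 1, 2010", "June 1, 2009" and "5/15/2011"), alongside placeholders such as "?". The following Python sketch shows one possible way to normalise such cells. It is an illustration only, not the procedure used in this project; the function name `parse_launch_date` is hypothetical, and numeric dates are assumed to be month/day/year.

```python
from datetime import date, datetime
from typing import Optional

# Notations tried in order; numeric dates are assumed to be month/day/year.
FORMATS = ("%b %d, %Y", "%B %d, %Y", "%m/%d/%Y")


def parse_launch_date(raw: str) -> Optional[date]:
    """Return the date for a spreadsheet cell, or None if it is not a full date."""
    text = raw.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # try the next notation
    return None  # e.g. "?", a bare year, or a free-text note


if __name__ == "__main__":
    for cell in ["Nov 1, 2010", "June 1, 2009", "5/15/2011", "?"]:
        print(cell, "->", parse_launch_date(cell))
```

Cells that cannot be parsed are returned as `None` rather than guessed, so that uncertain entries remain visible and can be checked by hand.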
20 / 21
[Figure: choropleth world map, amount of data sets available per country. Legend classes: coming soon, 0-50, 50-100, 100-250, 250-500, 500-1000, 1000-2500, 2500-5000, 5000-10000, >10000]
OPEN DATA / VISUALISATION
22 / 23
VISUALISING QUANTITIES
through saturation
The intention of the first visualisation (page 22) produced for this project was to provide an overview of which countries have released data sets so far, and how many. This is achieved through colour saturation: if only a small number of data sets has been released, the geographical area of the respective country is covered with a less saturated colour. The darker the colour, the bigger the amount of data: a choropleth map. The graph on the right side shows how the data amounts are clustered. I decided to differentiate more finely among small numbers of released data sets; the bigger the amount of data, the wider the classes become. This affects the appearance of the visualisation. The decision was based on the fact that many countries provide only 30 or 80 data sets, while it does not really matter in this visualisation whether there are 8000 or 9500. As a first insight into the topic this visualisation works fine, but as soon as one looks for more in-depth information it does not work quite so well. What are the reasons behind this? First, the degree of saturation provides only a rough idea of the number of data sets it represents. Secondly, it is difficult to identify the actual colour of a country, because in most cases the areas do not border each other directly. The size of the individual countries also matters, and the surrounding background colour, which represents the ocean areas, affects the appearance of the colour. The smaller the area a country covers, the lighter it appears. In addition, the smallest countries on the map that provide data sets (e.g. Hong Kong, Singapore) do not appear at all because of the small print size. Conclusion: good enough for getting a first impression of the spread of the open data movement.
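The clustering described above can be sketched in a few lines of Python. The class boundaries are the ones shown in the map legend; everything else is an assumption made for illustration: a value that falls exactly on a boundary is placed in the lower class, the grey level is spread linearly over the classes, and the function names `classify` and `grey_level` are hypothetical.

```python
from bisect import bisect_left
from typing import Optional

# Upper bounds of the classes shown in the legend (assumed to be inclusive).
BREAKS = [50, 100, 250, 500, 1000, 2500, 5000, 10000]
LABELS = ["0-50", "50-100", "100-250", "250-500", "500-1000",
          "1000-2500", "2500-5000", "5000-10000", ">10000"]


def classify(count: Optional[int]) -> str:
    """Return the legend class for a number of data sets."""
    if count is None:
        return "coming soon"  # portal announced, no data sets yet
    return LABELS[bisect_left(BREAKS, count)]


def grey_level(count: int) -> float:
    """Darkness between 0 (white) and 1 (black), one equal step per class."""
    return (bisect_left(BREAKS, count) + 1) / len(LABELS)


if __name__ == "__main__":
    for n in (30, 80, 8000, 9500):
        print(n, classify(n), round(grey_level(n), 2))
```

With these classes, 30 and 80 data sets fall into different classes, while 8000 and 9500 share one, which is the finer differentiation among small numbers that the text describes.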
through area
The visualisation on pages 24/25 is the second attempt to create a visualisation based on the exact same numbers, but with a different approach: the aim was a more meaningful visualisation that includes more detailed information. The size of the area covered by a shaded circle represents the corresponding number of data sets. The focus is on ensuring that “the visual representation of the data is consistent with the numeric representation”.²⁷ What does this mean? Using area to translate numbers can be misleading because human visual perception distorts it. For example, the perceived area of a circle probably grows somewhat more slowly than the actual (physical, measured) area: the reported perceived area = (actual area)^x, where x = 0.8 ± 0.3 (depending on the size of the circle). This was found in psychological experiments. Different people also perceive areas differently.²⁸
How to do it right? Edward Tufte advises using absolute scaling regardless of possible perceptual failings in order to “tell the truth about data”. If feasible, the actual numbers represented can be included in the visualisation.²⁹ In contrast to Tufte, James Flannery came up with the method of apparent scaling. He wrote his dissertation on the topic “Scaling of proportional map symbols” (1956) and found that when people were asked to pick a circle twice the size of another circle, they would most often choose a circle that was 1.8 times the size. As mentioned before, this differs depending on the size of the circle.³⁰ For the visualisation shown on the following two pages I decided to use the absolute scaling method, because the reader has been made aware of this matter. For the visualisations on pages 22 and 24/25 I decided to use shades of grey, in order not to create confusion with the colour coding used for the following ones.
27 Edward R. Tufte, The Visual Display of Quantitative Information, pg. 55
28, 29 ibid.
30 cf. John Krygier, Perceptual Scaling of Map Symbols, available at: http://makingmaps.net/2007/08/28/perceptualscaling-of-map-symbols
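The difference between the two scaling methods can be made concrete with a small Python sketch. Absolute scaling makes the circle area proportional to the value, so the radius grows with the square root. For apparent scaling, the sketch simply inverts the perception formula quoted above (perceived area = actual area to the power of x, with x = 0.8 as the central value); this inversion is an assumption made for illustration and not necessarily the exact procedure Flannery proposed. The function names and the reference circle are hypothetical.

```python
import math


def radius_absolute(value, ref_value, ref_radius):
    """Absolute scaling: circle area is proportional to the value."""
    return ref_radius * math.sqrt(value / ref_value)


def radius_apparent(value, ref_value, ref_radius, x=0.8):
    """Apparent scaling (assumed form): enlarge the drawn area so that the
    perceived area, modelled as (actual area) ** x, is proportional to the value."""
    return ref_radius * (value / ref_value) ** (1.0 / (2.0 * x))


# Legend values of the circle map, with a reference circle of radius 1 for 100 data sets.
for count in (100, 500, 1000, 5000, 10000):
    print(count,
          round(radius_absolute(count, 100, 1.0), 2),
          round(radius_apparent(count, 100, 1.0), 2))
```

With x = 1 both functions give the same radius; with x below 1 the apparent method draws larger circles for larger values than the absolute method does.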
[Figure, page 22: choropleth map. Legend (amount of data sets): >10000, 5000-10000, 2500-5000, 1000-2500, 500-1000, 250-500, 100-250, 50-100, 0-50, coming soon.]
[Figure, pages 24/25: map with shaded circles. Legend (amount of data sets): 100, 500, 1000, 5000, 10000.]
[Figure, page 26: CATEGORIES. Radial diagram of the category names used on the national data portals (among them Australia, Bahrain, Canada, Chile, Denmark, France, Kenya, New Zealand, Saudi Arabia, Spain and the USA), mapped onto the congruent categories GEOSPATIAL, OTHER, COMMUNICATION, ADMINISTRATIVE, HOUSING, GOVERNMENT, SECURITY, JUSTICE, INFRASTRUCTURE, ENVIRONMENT, POPULATION, ECONOMY, FINANCE, EMPLOYMENT, CULTURE, EDUCATION, HEALTH.]
DATA MINING
Defining categories
Most of the data portals file their data sets under categories, which facilitates the search for specific data. The names and the number of categories vary from country to country, though. The visualisation on the left helped me to extract congruent categories and to define the colour representing each category. For the colour coding I proceeded as follows: I typed the category term into a search engine and ran an image search. Looking at the results, I went with the colour that came up most often. If that colour was already taken, I chose the one “ranked” second.
Categories on the Chilean data portal
c0 m35 y100 k0       culture
c30 m30 y0 k0        infrastructure
c100 m85 y30 k0      population
c0 m75 y100 k0       justice
c0 m10 y95 k0        security
c60 m90 y0 k0        administration
c0 m100 y100 k0      employment
c50 m0 y100 k0       geospatial
c100 m100 y25 k25    other
c25 m100 y60 k0      finance
c85 m20 y50 k0       environment
c70 m50 y40 k0       economy
c15 m100 y90 k10     communication
c100 m0 y0 k0        health
c0 m0 y0 k85         government
c30 m50 y75 k10      housing
c85 m50 y0 k0        education
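For use on screen, CMYK values like the ones listed above can be converted to approximate RGB values. The following Python sketch uses the simple profile-free conversion formula; this is an assumption for illustration, since a colour-managed workflow with ICC profiles would give somewhat different results, and the function names are hypothetical.

```python
def cmyk_to_rgb(c, m, y, k):
    """Convert CMYK percentages (0-100) to 8-bit RGB with the naive formula
    channel = 255 * (1 - ink) * (1 - black), ignoring colour profiles."""
    def channel(ink):
        return round(255 * (1 - ink / 100) * (1 - k / 100))
    return channel(c), channel(m), channel(y)


def to_hex(rgb):
    """Format an (r, g, b) tuple as a web colour string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Three of the CMYK values from the colour list.
for cmyk in ((100, 0, 0, 0), (0, 100, 100, 0), (0, 0, 0, 85)):
    print(cmyk, to_hex(cmyk_to_rgb(*cmyk)))
```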
[Figure, page 28: data sets per category for a selection of countries. Countries: Australia, Bahrain, Canada, Chile, France, Kenya, Singapore, UK, USA. Categories: Culture, Economy, Education, Employment, Environment, Finance, Health, Infrastructure, Population.]
DATA ANALYSIS
Once the data collection or scraping is done, the next step is to figure out what to do with the data, i.e. which story is worth telling. In order to make sense of data we look for two things: patterns and relationships.³¹ Generating the visualisation on page 28 was my way of analysing the data set with regard to patterns and relationships. The information included in this transformation is a selection of countries and the available data sets per category, which I set against each other. The number of data sets is represented as a percentage, not as a total number. I had the idea of transforming the data in this particular way from the moment I started working on the visualisations, not knowing whether the outcome would be satisfying or whether I would end up with a nice, colourful image that contains no relevant information at all. I consider the outcome to lie somewhere between those two poles. It is possible to read some information; relationships and patterns are visible, but not in a very convincing way. This more experimental approach works only partially. It is difficult to read the correlations because information overlaps. To increase readability, a few proportions need to be adjusted: the space between the individual countries/categories needs to be bigger, for instance, and the opacity of the coloured areas matters as well. Even so, the misleading effect of wrongly presented amounts remains, because the sizes (2-D areas) are not shown correctly. The only area in this visualisation that comes close to “telling the truth” about the amounts is the area right next to the column displaying the categories.
With this data visualisation I intend to cite developments in the field during recent years, while being far from “competing” with them. I am thinking of work by data visualisers such as Moritz Stefaner and Manuel Lima, who translate very complex data sets, containing a number of dimensions and levels, into maps using techniques like arc diagrams or flow charts. There is a fine line between producing impressive but difficult-to-decode images and functional network visualisations that communicate “and ask interesting questions about society through its data traces”.³² Succeeding with this requires a high level of understanding in a wide range of fields, including perception, statistics and narrative skills. The weakness of my approach encouraged me to come up with a different way to visualise this matter on the following pages.
31 cf. Nathan Yau (2011): Visualize This. The FlowingData Guide to Design, Visualization, and Statistics, pg. 8
32 Manuel Lima (2011): Visual Complexity. Mapping Patterns of Information, pg. 12
[Table: data set that is the basis for the visualisations on pages 28 and 30. For each country the table lists the number of data sets per category and the corresponding percentage of the country's total. Categories: Health, Education, Culture, Employment, Finance, Economy, Population, Environment, Infrastructure, Government, Justice, Security, Housing, Administration, Communication, Other. Total number of data sets per country: Australia 2062, Bahrain 493, Belgium 115, Brasil 33, Canada 11605, Chile 47, Denmark 244, Finland 16, France 1043, Greece 169, Hong Kong 3, Italy 109, Kenya 367, Moldova 414, Morocco 32, New Zealand 520.]
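The percentages in this data set are simply each category count divided by the country's total. A minimal Python sketch, using Canada's counts as given in the thesis's data table (the assignment of the small counts to Security and Communication follows the percentages quoted in the treemap discussion); the function name is hypothetical:

```python
# Canada's data sets per category (total 11605), as read from the data table.
canada = {
    "Health": 1053, "Education": 587, "Culture": 779, "Employment": 666,
    "Economy": 6330, "Population": 551, "Environment": 876,
    "Infrastructure": 419, "Justice": 6, "Security": 1,
    "Government": 103, "Communication": 191, "Other": 43,
}


def shares(counts):
    """Return each category's share of the total in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 2) for category, n in counts.items()}


for category, percent in sorted(shares(canada).items(), key=lambda item: -item[1]):
    print(f"{category:15s} {percent:6.2f} %")
```

Working with shares rather than totals is what makes countries with very different numbers of data sets comparable in the visualisations on pages 28 and 30.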
[Figure, page 30: treemaps of the data sets per category for Kenya, UK, Canada and Australia.]
DATA INTERPRETATION
through treemapping
The visualisation on page 30 shows how the focus differs between the governments of the exemplary countries Australia, Canada, Kenya and the UK, measured by how many data sets each of them files under each category. The underlying data set is the same as the one used for the visualisation on page 28. A treemap is an area-based visualisation in which the size of each rectangle represents a metric. In this case each rectangle represents the amount of data sets filed under one category. The colour coding is applied as shown on page 27. The area covered by a rectangle equals the percentage of data sets filed under that category out of the total number of data sets published by the country. For example, 9% of Canada’s data sets are filed under Health. The other categories take up the following shares: Education (5%), Culture (6.7%), Employment (5.7%), Economy (54.5%), Population (4.7%), Environment (7.5%), Infrastructure (3.6%), Justice (0.05%), Security (0.01%), Government (0.89%), Communication (1.65%), Other (0.37%).
A treemap is used for displaying hierarchical data. The tree has one stem that forks into big branches and then into small branches. The “big branches” are the categories and the “small branches” the sub-categories. The visual transformation on page 30 includes only one layer. To add another layer, one option would be to combine all the individual countries (“World Open Data”); another would be to include additional sub-categories. “During 1990, in response to the common problem of a filled hard disk, I became obsessed with the idea of producing a compact visualization of directory tree structures”³³ is how Ben Shneiderman explains the origin of the treemap idea. He had first tried a tree diagram, but it got too big and complex.
A positive aspect of this kind of transformation is that the percentage relationship between the individual categories of one country is easy to decode. It is not difficult to read where a country puts its focus. A negative aspect is that comparing two countries in terms of total numbers is not that easy. Comparing the focus of different countries is still possible, but if total numbers are of interest, the total area of the rectangle representing each country should represent the actual number of data sets published by that country. Those numbers vary enormously (e.g. Canada: 11605, Kenya: 367). Another factor that complicates reading the graphic is the differing number of categories shown (Australia: 16 categories, UK: 10 categories). The perceptual misjudgement of amounts that affects circles does not apply to squares: the human eye is able to estimate the areas of squares correctly.³⁴
33 Ben Shneiderman (1998): Treemaps for space-constrained visualization of hierarchies, available at: http://www.cs.umd.edu/hcil/treemap-history/index.shtml
34 cf. John Krygier (2007): Perceptual Scaling of Map Symbols, available at: http://makingmaps.net/2007/08/28/perceptualscaling-of-map-symbols
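How a treemap turns numbers into rectangles can be sketched with the simple "slice and dice" layout, in which a rectangle is cut into strips proportional to the values and the cutting direction alternates with every hierarchy level. The Python sketch below is an illustration under stated assumptions: it is not the tool used for the visualisation on page 30, the function names are hypothetical, and all values are assumed to be positive. The example values are Canada's percentages as quoted in the text.

```python
def weight(node):
    """Total weight of a node: a number for a leaf, the sum of children for a dict."""
    if isinstance(node, dict):
        return sum(weight(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, horizontal=True, path=()):
    """Return a list of (path, x, y, width, height) for all leaves of the tree."""
    if not isinstance(node, dict):
        return [(path, x, y, w, h)]
    total = weight(node)
    rects, offset = [], 0.0
    for name, child in node.items():
        share = weight(child) / total
        if horizontal:  # cut the rectangle into vertical strips, left to right
            rects += slice_and_dice(child, x + offset, y, w * share, h,
                                    False, path + (name,))
            offset += w * share
        else:           # cut the rectangle into horizontal strips, top to bottom
            rects += slice_and_dice(child, x, y + offset, w, h * share,
                                    True, path + (name,))
            offset += h * share
    return rects


# Canada's shares in percent, one hierarchy level only (as on page 30).
canada = {"Economy": 54.5, "Health": 9, "Environment": 7.5, "Culture": 6.7,
          "Employment": 5.7, "Education": 5, "Population": 4.7,
          "Infrastructure": 3.6, "Communication": 1.65, "Government": 0.89,
          "Other": 0.37, "Justice": 0.05, "Security": 0.01}

for path, x, y, w, h in slice_and_dice(canada, 0, 0, 100, 60):
    print(path[-1], round(w * h, 1))
```

Passing a nested dictionary (for example countries containing categories) adds the second layer discussed above; each country's rectangle is then proportional to its total, which would also make totals comparable across countries.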
“Time is one of the hardest variables to map in any complex system. It is also one of the richest.” Manuel Lima, Interaction Designer and Information Architect
VISUALISING CHANGES OVER TIME
Animation
Animation is a powerful tool for visualising changes over time, as it makes it possible to add another dimension. Hans Rosling’s famous animation “Wealth and Health of Nations” reveals the correlation between income per person (GDP per capita) and life expectancy around the world between 1800 and 2010.³⁵ These two measures are plotted for all countries in a simple bubble chart with an x- and a y-axis. A third dimension is added by scaling the circles according to the total population of each country. The fourth dimension is time: the bubbles move between the two axes while changing their size. The outcome can be either a time-based animation that is watched at a fixed animation “speed”, as is the case with Rosling’s animation, or an interactive web application that can be explored at a “speed” defined by the user. Both options allow more information to be displayed in the limited space of a screen than would be possible with a static graph of similar size. More meaning can be conveyed by effectively showing how entities change over a period of time.
With the topic of Open Data, the time dimension is also an interesting factor. Possible questions worth asking would be: When did the opening up of data stores take place in certain countries? Where did it start? Which countries joined this movement only recently? When were the data sets uploaded over time? Did the uploads happen right after the launch of the portal, with nothing happening since?
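The last two questions could be approached by counting uploads cumulatively per month. The Python sketch below assumes that a portal exposes an upload date for each data set; the dates in the example are purely hypothetical and the function name is illustrative.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_by_month(upload_dates):
    """Return [((year, month), cumulative number of data sets), ...] in time order.

    Only months with at least one upload appear in the result.
    """
    per_month = Counter((d.year, d.month) for d in upload_dates)
    months = sorted(per_month)
    totals = accumulate(per_month[m] for m in months)
    return list(zip(months, totals))


# Hypothetical upload dates for one portal (not real data).
uploads = [date(2011, 7, 8), date(2011, 7, 20), date(2011, 8, 2), date(2012, 1, 15)]

for (year, month), total in cumulative_by_month(uploads):
    print(f"{year}-{month:02d}: {total}")
```

A curve that rises steeply at the launch and then stays flat would indicate that uploading stopped soon after the portal opened.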
World health and wealth visualised over the past 200 years
An impressive example of the interactive display of time-based information is the Arab Spring timeline launched by The Guardian in March 2011, just after the start of the numerous uprisings in the Middle East. It lists key events, organised by country, on a navigable timeline. Timelines are usually laid out horizontally (2D); this one is navigated vertically, which gives the impression of entering a 3D space.³⁶
Arab spring: an interactive timeline of Middle East protests
35 Hans Rosling (2010): Wealth & Health of Nations, available at: http://www.gapminder.org
36 Garry Blight, Sheila Pulham and Paul Torpey (2011): Arab spring: an interactive timeline of Middle East protests, available at: http://www.guardian.co.uk/world/interactive/2011/mar/22/middle-east-protest-interactive-timeline
“I think we’re actually in pretty primitive stages really. For the most part we’ve just taken an artifact from the print world and inserted it in the digital world.” Deroy Peraza, creative director at Hyperakt, a design firm in NYC interviewed by Josh Smith, graphic designer and co-founder of IDSGN-blog, NYC, July 19, 2011
PROSPECTS
Since this is my first visualisation project in 2D, I decided to proceed cautiously and start by producing basic transformations based on one data set. Working on the visualisations increased my awareness of the fact that even a small change of parameters may result in a completely different outcome. The challenge lies in sticking as closely as possible to the “reality” behind the data set. While collecting the data I often had to think of Nathan Yau’s description of his internship at The New York Times: “There was one day when my only goal was to verify three numbers of a dataset (...). Only after we knew the data was reliable did we move on to the presentation.”³⁷ I believe this attention to detail is sometimes missing; probably not everyone working in the field has the capacity to invest that much time. That makes me wonder about the integrity of the data sets being generated around the world. The same concern applies to the actual visualisation part. The big hype around data visualisation in recent years has resulted in an explosion of work in the field, and not only for the good: “Whenever a new trend in information technology captures the interest of enough people to become popular, a great deal of confusion is created as everyone rushes to embrace it with little understanding of what it is and how it works.”³⁸ I think it is important to be aware of this and to look at visualisations accordingly, but I am also positive that once the hype fades, the field of data visualisation will have taken a step forward. In the end one also learns through mistakes.
A recognisable trend is that we are heading towards bigger data sets being visualised in smaller spaces. One reason can be seen in the ongoing “production” of data that is now also becoming available through the Open Data movement: huge amounts of data are waiting to be made sense of, in order to shape the future. Another reason behind this development is the increasing number of tools becoming available that are designed to handle this task. Big data has become manageable. Despite my basic approach I have to admit that I quite enjoyed the work on “transforming the data”, and I will surely stay involved, by following future developments as well as through hands-on learning.
37 Nathan Yau (2011): Visualize This. The FlowingData Guide to Design, Visualization, and Statistics, pg. 2
38 Stephen Few (2007): Data Visualization. Past, Present, and Future, pg. 8
page 36/37: Open Government Data plus “Open City Data”
[Figure, pages 36/37: open data in states and cities looked at globally. Legend: cities with open data available; countries by amount of data sets (100, 500, 1000, 5000, 10000).]
REFERENCES
LITERATURE
Brown, Brad/Bughin, Jacques/Chui, Michael/Dobbs, Richard/Hung Byers, Angela/Manyika, James/Roxburgh, Charles (2011): Big data: The next frontier for innovation, competition, and productivity, San Francisco
Few, Stephen (2007): Data Visualization. Past, Present, and Future, Berkeley, USA
Few, Stephen (2010): Data Visualization for Human Perception. In: Soegaard, Mads and Dam, Rikke Friis (eds.): “Encyclopedia of Human-Computer Interaction”, Aarhus, Denmark
Lima, Manuel (2011): Visual Complexity. Mapping Patterns of Information, New York, USA
Rahemtulla, Hanif/Kaplan, Jeff/Gigler, Bjorn-Soren/Cluster, Samantha/Kiess, Johannes/Brigham, Charles (2011): Open Data Kenya. Case Study of the Underlying Drivers, Principal Objectives and Evolution of one of the first Open Data Initiatives in Africa, Washington DC
Schellong, Alexander/Stepanets, Ekaterina (2011): Unbekannte Gewässer. Zum Stand von Open Data in Europa, Wiesbaden, Germany
Stone, Maureen (2006): Choosing Colors for Data Visualization, Woodinville, USA
Tufte, Edward R. (2009)[2001]: The Visual Display of Quantitative Information, Cheshire, USA
Tufte, Edward R. (1994)[1990]: Envisioning Information, Cheshire, USA
Tuomi, Ilkka (1999): Data is more than knowledge: implications of the reversed knowledge hierarchy for knowledge management and organizational memory, Journal of Management Information Systems, 16, 3, 107-121, Helsinki, Finland
Yau, Nathan (2011): Visualize This. The FlowingData Guide to Design, Visualization, and Statistics, Indianapolis, USA
WEB
data.gov/opendatasites opendata.go.ke epsiplatform.eu
data.gov.au www.bahrain.bh/pubportal/wps/portal data.gov.be beta.dados.gov.br www.data.gc.ca datos.gob.cl digitaliser.dk/resource/432461 opendata.riik.ee opengov.fi/data data.gouv.fr portalu.de/portal/default-page geodata.gov.gr/geodata gov.hk/en/theme/psi/welcome data-gov.ie/.html dati.gov.it opendata.go.ke data.gov.md data.gov.ma/Pages data.overheid.nl data.govt.nz data.norge.no datosperu.org opengovdata.ru saudi.gov.sa/wps/portal data.gov.sg data.go.kr/Main.do datos.gob.es/datos opengov.se transparency.gov.tl government.ae/web/guest/uae-data data.gov.uk data.gov datos.gub.uy
datacatalogs.org gapminder.org visualcomplexity.com cs.umd.edu/hcil/treemap-history guardian.co.uk/world/interactive/2011/mar/22/middle-east-protest-interactive-timeline atlas.media.mit.edu/
APPENDIX / REFERENCES
MEDIA
Berners-Lee, Tim (2009): The next web of open, linked data [TED talk], available at: http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html [Accessed on 24 February 2012]
Berners-Lee, Tim (2010): The year open data went worldwide [TED talk], available at: http://www.ted.com/talks/lang/en/tim_berners_lee_the_year_open_data_went_worldwide.html [Accessed on 24 February 2012]
Blight, Garry/Pulham, Sheila/Torpey, Paul (2011): Arab spring: an interactive timeline of Middle East protests, available at: http://www.guardian.co.uk/world/interactive/2011/mar/22/middle-east-protest-interactive-timeline [Accessed on 27 February 2012]
McGhee, Geoff (2010): Journalism in the Age of Data [video online], available at: http://datajournalism.stanford.edu/
Ndemo, Bitange (2012): Open Data: Africa’s Opportunity [video online], available at: http://epsiplatform.eu/content/africa-needs-open-data-most [Accessed on 14 March 2012]
Khokhar, Tariq (2012): Open data - the new revolution [talk held at the “Developing the Caribbean Conference”], available at: http://www.ustream.tv/recorded/20017987 [Accessed on 25 February 2012]
Rogers, Simon (2010): Government data from around the world. Welcome to our single gateway, The Guardian, available at: http://www.guardian.co.uk/news/datablog/2010/jan/07/government-data-world
Rosling, Hans (2010): Wealth & Health of Nations [video online], available at: http://www.gapminder.org [Accessed on 16 March 2012]
Varian, Hal (2009): How the Web challenges managers [video online], available at: https://www.mckinseyquarterly.com/wrapper.aspx?ar=2286&story=true&url=http%3a%2f%2fwww.mckinseyquarterly.com
IMAGES
page 2: a_trotskyite (flickr) (2009): World passenger airline routes, available at: http://upload.wikimedia.org/wikipedia/commons/0/0d/World_airline_routes.png
page 4: Brett Ryder (2010): The data deluge [The Economist], available at: http://www.economist.com/node/15579717
page 7: screenshots of http://www.data.gov (top) and http://opendata.go.ke
page 8: Amanda Cox (2009): The Jobless Rate for People Like You [New York Times], available at: http://madigitalmedia.wordpress.com/2011/09/01/the-joblessrate-for-people-like-you-study-case-2/
page 9: Aaron Koblin (2009): Flight patterns, available at: http://sandbox.aaronkoblin.com/projects/flightpaths/wallpaper/atlanta.png
page 10: Jonathan Gray/Open Knowledge Foundation (2009): Where does my money go [online application], available at: http://www.lem.sssup.it/WPLem/odos/odos.html
page 11: Oliver Uberti: Augmented Reality [National Geographic], available at: http://ngm.nationalgeographic.com/big-idea/14/augmented-reality
page 12/13: Maps Port-au-Prince, available at: http://tanconectados.com/wp-content/uploads/2011/04/puertoprincipe.gif (page 12), http://sync.nl/de-vrijeopen-kaart-en-de-plaats-van-het-ondernemerschap (page 13)
page 14: Open Data Kenya: Respiratory Illnesses vs. Use of Fuel Wood, 2005/6, available at: https://open.umich.edu/blog/wp-content/uploads/2012/01/Kenya-Open-Data-Visualization
Images and graphs not listed are created by myself.
I hereby declare that I have produced this work independently and have used no sources or aids other than those indicated. I have marked quotations as such.
Köln, 19 March 2012