Introduction to Information Visualisation
Alan Dix1,2
1 Talis, 43 Temple Row, Birmingham, B2 5LS, UK
2 University of Birmingham, School of Computer Science, Birmingham, UK alan@hcibook.com http://alandix.com/academic/teaching/Promise2012/
Abstract. This is a short introduction to information visualisation, which is increasingly important in many fields as information expands faster than our ability to comprehend it. Visualisation makes data easier to understand through direct sensory experience (usually visual), as opposed to more linguistic/logical reasoning. This chapter examines reasons for using information visualisation both for professional data analysts and also end-users. It will also look at some of the history of visualisation (going back 4,500 years), classic examples of information visualisations, and some current challenges for visualisation research and practice. Design of effective visualisation requires an appreciation of human perceptual, cognitive and also organisational and social factors, and the chapter discusses some of these factors and the design issues and principles arising from them.
Keywords: information visualisation, human–computer interaction, HCI, visual analytics.
1 Introduction
Information assails us in business, in science, in government and in day-to-day life, from the processing of massive scientific streams at CERN to sentiment analysis of millions of Twitter messages to gauge the popularity of a party, or to locate power outages. Information retrieval is about selecting out of this morass of data, relevant documents, images, and audio. Visualisation is about helping people make sense of either the original data sources or the subsets of data obtained through information retrieval. Both can operate independently, but also they have great power together. This chapter is a short introduction to information visualisation. In it we will look at a number of areas. First, in the next section, we will look at the definition and scope of visualisation in general and information visualisation in particular. Most critically, despite the term being 'visualisation', it may in fact involve other senses and is centrally about the use of these senses to make sense of data. Visualisation has various purposes and users; section 3 considers these, in particular the different ways in which information visualisation is used by data analysts compared with data consumers (whether a company CEO or newspaper reader).
While visualisation seems like a modern phenomenon, and indeed interactive computer visualisation is comparatively recent, in fact the roots of static visualisation
M. Agosti et al. (Eds.): PROMISE Winter School 2012, LNCS 7757, pp. 1–27, 2013. © Springer-Verlag Berlin Heidelberg 2013
can be traced back at least 4500 years. Section 4 presents a brief history of visualisation from Mesopotamian financial tables and 10th century line graphs to current spreadsheet graphics, data journalism and visual analytics. This is followed in section 5, by an overview of some of the kinds of visualisation that might be particularly useful in the context of information retrieval.
The chapter concludes with a discussion of some of the human-centred design principles that can be applied to visualisation choice and creation. We will consider detailed design issues; in particular the way interaction can soften the trade-offs that are inherent in making (static) visualisation choices. However, we will also consider the way visualisation (and for that matter information retrieval) fits into a larger social and organisational context.
2 What Is (Information) Visualisation
Defining Visualisation
Visualisation is perhaps easier to recognise than define. In his textbook "Information Visualisation" [1], Bob Spence refers to the dictionary definition: visualize: to form a mental mode or mental image of something [1]
He emphasises that visualisation is critically about in sight, what happens in your head, not a computer. Often the most powerful mental images are formed from words alone, but that would not correspond to the common notion of visualisation, which is often about the design of media (computer, paper) to help people.
So, for this chapter we shall adopt a slightly different definition of (information) visualisation:
making data easier to understand using direct sensory experience
Note this is still about insight and understanding, but also about the perception ('sensory experience') and deliberate design ('making').
Note also that this definition says 'sensory', not simply 'visual', as the inner visualisation that makes you say "I see" can also be engendered by other senses. Although less common, you can have aural and tactile 'visualisation' – think of the click of a Geiger counter – faster clicks mean more radiation. These non-visual forms are particularly valuable for those with visual disability, but also in contexts when the eyes need to be elsewhere, for example while flying a plane. This all said, the vast majority of visualisation is, as the name suggests, visual. The visual cortex accounts for around 50% of our brain, and so it makes sense to use it.
Also note that the word 'direct' is in the definition to exclude purely rich textual descriptions, no matter how sharply they focus the mind. Although you use your eyes to read words or even tables of numbers, they are processed linguistically and logically, rather than the more instant feeling you get when you see a rising graph.
With many caveats to beware of pseudo-science, you can think of this as a form of left brain / right brain distinction. Not that one is better than the other. In statistics, one is taught never to start by calculating means, t-tests, etc., but instead always to
Introduction to Information Visualisation
start off by drawing graphs are often obvious by simp what one thinks is true of t calculate the statistics (very
, trying to get a feel for the data (very right brain). Thi ly glancing at a graph. However, having got an idea he data, one does not trust that intuition, but then start left brain) to verify the insight. The two work together
Visualising Numbers
This does not mean that te Good layout can create dir columns of numbers in Fig which is the biggest numb hand one. This is because points aligned, so the bigge left of the decimal point. Ef
xt and numbers are not an integral part of visualisati ect visual (or other sensory) impressions. Look at the t ure 1. In each column try to see, as quickly as possib er. This is harder in the left-hand column than the rig the numbers in the right-hand column have their decim st numbers are also the ones that stick out furthest to fectively the line of figures acts like a miniature bar gra
Of course one may no alignment doesn't matter. purpose" – work out what y and use that to determine th A more complex form o found in Table Lens [2]. T be collapsed down to a few be read (and will be aligned too small to show the numb up a sort of mini-histogram that the column to the left i column far less so. When t more miniature (columns 2 at a glance view of the tiny
This is also an example or fisheye view [3]. The e (the focus), whilst the colla big picture (the context).
t care about the biggest numbers, in which case As a first heuristic for information visualisation: "th ou want the viewer to be able to do with the visualisat e form.
f visualisation, where the numbers are still central, can his is like a spreadsheet except that columns and rows pixels each. Where a cell is not collapsed the numbers properly!). However, when the height of the cell beco m er it is reduced to a line of pixels so that the column e
. By sorting the middle column, it is immediately obvi s correlated to some extent with it, but that the right h he column width is collapsed the histogram becomes e and 3 in figure 2), giving less detail, but still allowing columns.
of a general visualisation technique called 'focus+cont xpanded cells allow one to look at certain values in de psed cells allow one to get an idea of how that fits into
3 ings a of s to r. ion. two ble, ghtmal the aph. the hink tion n be can can mes ends ious hand even g an text' etail the
Fig. 1. Visualising in numbers
Information Visualisation
The term 'information' visu used to contrast it with 'scie that have a direct connectio example the airflow around of fields of numbers of vect
In contrast, information complex structurally, but Furthermore, the data of inf gender male/female) as wel
The two are not entirely (GIS) involve data over 2D for marketing purposes will patterns for climate modellin to understand patterns of glo
alisation, as opposed to 'visualisation' in general, is usu ntific' visualisation. In science there are many phenom n to the physical world, but are in some way invisible, an aircraft wing. This scientific data is often in the fo ors defined continuously over a 2D or 3D space. visualisation is often concerned with data sometimes m almost always discrete: hierarchies, tables, point d ormation visualisation often includes categorical data ( l as continuous data (e.g. height). distinct, for example, geographical information syste maps. Methods used to display regional petrol consumpt not be so different from those showing average temperat g, and of course one might want to use both these data bal warming.
3
Why Use Visual
isation and Who Is It for?
When creating visualisation
First there is the data a academic interpreting exp anomalies in a bank's accou cycleway, or the intelligenc to prevent a terrorist attack
The other group is the reader, or the CEO. These peasant, but have in comm even be highly numerate be
s there are two kinds of target audience. nalyst, whose job it is to sift through data whether erimental results, the forensic accountant looking nts, the city planner working out the best route for a n e officer piecing together emails, tweets and passport d
eventual data consumer, the client, audience, newspa may range from a time-strapped manager to an illiter on that they are not experts at data analysis, and may yond what they recall of basic school mathematics. ally mena for orm more data. (e.g. ems tion ture sets the for new data aper rate not
Fig. 2. Table Lens [2]
Given you are reading t the first group, the scientis you need to work harder to not have as intuitive a grasp
For each of these two visualisations.
his book, it is likely that you have more in common w t, statistician or professional, than the second. This me design visualisations for the data-consumer, as you w of what is good.
groups we'll look at reasons why you may want to
For the Data Consumer
For the data consumer, the representations, that can be when using a visualisation audience to a more comple glance or not at all.
There are two main rea understanding and rhetoric understanding – This i already seen. For example want a graph to show the nu against time. For the gen graphics to form 'infograph appealing (if the readers do partly to point out particul Guardian Datablog [4], sh Thatcher administration) un (blue=Conservative, red=L of the Exchequer at the time
focus usually needs to be on simple, well understo grasped at first time of looking. Sometimes, for examp as part of a presentation, it is possible to introduce x graphic, but often the visualisation has to work at f
sons for providing visualisations to the data consum
s when we want to help others see what the analyst , as part of teaching a course on mobile internet, you m mber of people accessing the internet via a mobile plot eral public, graphs are often augmented with text ics'. This is partly to make the visualisations more visu not look at the visualisation they will learn nothing), ar features. For example, the page in figure 3, from ows the UK budget deficit from 1979 (the start of til 2011. The colours denote the dominant party in po w abour), the pictures at the top are the various Chancell and actual numbers included in the figure.
ig. 3. UK Deficit and Borrowing [4]
with eans will use ood, ple, e an first mer: has may tted and ally and the the wer lors
rhetoric – Visualisation whether valid or not. For graph (see Fig 4.), that plo eventually taking off, sho w which to invest. When we are comprehensible. They comprehensible they are, t must be clever!). Rhetorica used to convince others of political pamphlet, or PhD t
can also be used to persuade readers of a particular po example, every business plan includes a 'hockey st ts projected users / income over time, starting slow, ing the potential investor that this is a good business see graphs it is easy to be impressed, whether or not t seem professional, scientific, and, sadly, often the l he more people are impressed (if they are difficult t l use of numbers or graphs can be misleading1, or can the truth, whether in an academic paper, newspaper, pa hesis.
Fig. 4. A 'hoc
ckey stick' graph, as found in most business plans
It would be nice to say not like that. The good guy data certainly helps. Note t the download of the raw d 'data journalism'. This mea and manipulate or visualise and democracy in an age wh Often small changes in take away. Consider figure in this graph). However, un obtained from the Guardian size of the economy and h relative to your income.) In the world credit crisis. Figu have risen and fallen, with also suggests a tendency fo clear that the overall level administration actually sub compared with 3%), counte
that the answer to rhetoric is facts, but sadly the world s have to tell as good a story as the bad guys! Howev hat as well as infographics, the Guardian Datablog allo ata behind the visualisations; a part of the practice cal ns that a more experienced reader can download the d it in any way that they wish, crucial for informed deb ere information is power.
a diagram can make a big difference to the lesson peo 5, this shows UK deficit again (up means deficit, so b like figure 3, this has been corrected for GDP (figures a Datablog [4, 6]), so that deficit is shown relative to ence its affordability (like looking at your credit card both figures it is clear that the deficit shot up in 2009 w re 3 also shows that with successive governments defi even the occasional foray into surplus. However, the gr r it to increase with time. Correcting for GDP, make in real terms has remained pretty steady, with the stantially lower than the long term average (2% G r to the popular narrative of all political parties.
1 As Winston Churchill said, " or as has been attributed to D and statistics" [5].
the only statistics you can trust are those you falsified yours israeli and others "there are three kinds of lies: lies, damned oint, tick' but s in they less they n be arty d is ver, ows lled data bate ople bad, also the bill with icits raph es it last GDP elf", lies,
Fig. 5. U
UK Deficit relative to GDP (data from [4, 6])
For the Data Analyst
While many of the same les the opportunity for training more complex and powerf must still in a sense be 'firs perception; if you have to then probably you are bette 'first glance' after you have have many specialised gra cybernetic systems. Some t master, but once mastered o For the data analyst we c exploration. understanding – Like something' at least in abstr example, we may be expec academic, the graphics use published in an article as th example, figure 6 shows a [7]. For publication purpo believe) what you have s experiment, the purpose is outliers.
sons are true for the experienced data analyst, there is a , or growing experience, so visualisations can afford to ul, potentially including novel techniques. Visualisat t glance', as the purpose is to use the power of our sens spend too long figuring out what a visualisation mea r looking straight at the numbers. However, this can b extensive experience and training. For example, engine phs used to understand fluid flows, electric circuits ake several years of training during undergraduate study ffer an instant overview. an also consider two kinds of purpose: understanding
the data consumer, the analyst may in a sense 'kn act terms, but wish to make it salient to themselves. ting a power law, so plot points on a log-log scale. For d to help oneself understand may well be similar to th e audiences (oneself and other scientists) are similar. box plot from the famous neutrinos faster than light pa ses the aim is to help others see (or convince others een in the paper. For the scientist who performed to confirm/disconfirm hypotheses, highlight exceptions
exploration – The other haven't even been considere of data or new phenomeno chaos of billions of intercep of visualisation is driven b pattern. Here the aim is to s visualisations that avoid th multidimensional plot!). T of visualisation, each of wh or maybe present several vi
purpose for the data analyst is to find new things t d before. This may be a scientist encountering a new k n, or an intelligence officer seeking patterns amongst ted emails. In the previous cases, the design or select y the desire to expose and clarify a previously kno eek the unknown, indeed maybe deliberately try to des e obvious, exploring new angles (maybe literally i ypically this may involve flipping between different ki ich may emphasise one aspect of the data, but hide oth sualisations at once (see Figure. 7).
also o be tion sory ans, be a eers s or y to and now For the hose For aper s to the s or that kind the tion own sign in a inds hers,
Box
. 7. Multiple parallel visualisations [8]
The human sensory syste in visualisation for data an patterns where there are no more likely one will appe particularly those for exp distinguish happenstance fr
m is tuned to find patterns, and this is exploited to the alysis, especially exploratory. However, we may also ne. The more different ways you look at something, ar to have a pattern, purely by chance. Visualisatio loratory analysis, therefore need to help the ana om real underlying patterns.
4 A (Very) Brief H
4.1 Static Visualisation
The computer-driven visu visualisations of various fo tablet in Figure 8 is aroun information. We may think tablet writing is of an adm of numbers.
istory of Visualisation
(From 2500 BC to 1990 AD)
alisations shown so far are comparatively recent, rms date back many millennia. The Mesopotamian c d 4500 years old, and contains a table of administrat bureaucracy is new, but the vast majority of early c inistrative / financial nature, often including simple tab
full see the ons, alyst but clay tive clay bles
6.
plot of Neutrino transit times [7]
8. Mesopotamian table o
n a clay tablet
Moving on 3500 years, planetary movements. The heavenly body, where the y the 19th century and shows x-axis hours of the day (fro distance along the route (P clearly as the steeper lines. These early visualisation of computing it became po easily, or to create new o n days this was done using v printer, but now it is simply print! Fig
This use of computers t page or PDF, or printed in drawn illustrations. Thes particularly when commu
Fig. 9. 10th Century time line
Figure 9 shows an early line graph of solar, lunar x-axis is days in the month and the lines track e -axis is their height in the sky. Figure 10 skips forward a visualisation of the Paris–Lyon train timetable, with m 6am to 6am the next day) and the y-axis showing aris at the top Lyon at the bottom). Fast trains stand
s were created painstakingly by hand, but with the adv ssible to create the same visualisations more quickly es that would have been impossible before. In the ea ery slow x-y pen plotters or character-graphics on a l a matter of selecting a few options in Excel and press
. 10. 1855 Paris-Lyon train timetable
o create fixed visualisations whether on screen, on a w a newspaper, is in many ways similar to the older ha e static visualisations are still of great importan nicating with others. Furthermore, understanding
and each d to the the out vent y or arly line sing web andnce, the
Fig.
effective design and qualities of static visualisations is an essential first step to creating more complex interactive visualisations. For static visualisation, the core texts are undoubtedly Tufte's beautifully illustrated books [9–11].
4.2 Interactive Visualisation
Examples of interactive visualisation can be traced back to early scanning vector graphics displays, or the seaside information boards where tiny lights were illuminated when you pressed buttons for different kinds of features. However, it was in the early 1990s when growing graphics power made it possible, for the first time, to create rich 3D graphics, complex visualisations and real-time interaction. This led to a blossoming of information visualisation (and other graphics) research notably in the groups at Xerox PARC and University of Maryland. Not all the ideas were good, just like with gloriously multi-fonted documents during the desktop publication revolution in the 1980s, there were many examples of gratuitous 3D, most of which are deservedly forgotten. However, despite this, most of the core kinds of visualisations in use today were introduced at that time (see selection in Figure 11), several of which will be discussed in the next section.
Fig. 11. Interactive Visualisations from the early 1990s: clockwise from top left: Cone Trees [12], TreeMaps [13], FilmFinder [14], Buttefly Browser [15], and Pixel Plotting [16] in centre (note how use of 3D distorts text in Butterfly Browser)
4.3 Current Directions
We have already seen examples of data journalism where rich, but simple to understand, infographics have made their way into mainstream media. Furthermore the web has increased the public expectations of high quality, often interactive, visualisations. These web visualisations are sometimes 'authored', that is created by
Introduction to Information Visualisation
the individual or institution a number of data sharing a your own data; figure 12 sh open data initiatives by go data on many aspects of li crime to concert venues. O the data in citizens’ own int
responsible for the article or blog. However, there are a nd analysis sites that make it easy to upload and visua ows one example, IBM's "Many Eyes" [17]. Furtherm vernments and corporations across the world are mak fe available to all from the environment to employm ften this comes with an invitation to mashup and visua erfaces.
Various factors including volume of available data, stations, to the trivia of Tw and a major challenge. H been a rise in processing p is now relatively easy to computational frameworks computation on massive dat Where this computation growing [19, 20] (see the learning and other data pro interactive problem solving
eScience and the web itself have led to an increase in from the scientific data streams of ozone monitor itter streams. Analysing this data has become big busin appily, in parallel with the rise in big data there ower, both on the desktop and also in the cloud wher spin up significant computational power as needed , notably MapReduce [18], for dealing with distribu a.
meets visualisation, the nascent field of visual analytic chapter later in this volume). This combines mach cessing algorithms with interactive visualisation to ena .
5 Classic Visualisa
There are at least as many section we will look at a retrieval.
5.1 Hierarchical Data
Trees, taxonomies and hie structure after the humble biological taxonomies, (p hierarchies. After file brow of tree visualisation is some
tions for Information Retreival
kinds of visualisation as there are kinds of data. In few classes that are particularly relevant for informat rarchies are perhaps the most ubiquitous form of d table; we all encounter trees whether organisation cha arts of) formal ontologies, XML data, or file syst ser/outliner style textual layouts, the most common fo sort of box and line 'organisation' chart as in Figure 13
11 also alise more king ment, alise the ring ness has re it and uted cs is hine able this tion data arts, tem orm 3.
Fig. 12. IBM Many Eyes [17]
However, these simple v 14 shows a relatively small we can see a number of pro to fit the labels in the boxe lower levels of the tree, the scrolling, and consequent shows the Cone Tree [12 horizontally (Cam Tree) c selects a node the relevant r of attention.
isualisations tend to breakdown as trees get bigger. Fig tree (part of a task structure), but even on this small tr blems of scale. The boxes are labelled, but it is often h s without them overlapping. Furthermore, as one look width grows very rapidly leading to extensive horizon loss of context. One option is to use 3D, and figure ], which lays out the tree nodes in rings vertically onnected to their parents, creating 'cones'. When a u ing swings round so that the selected node is in the cen
While the use of 3D ap view these in 2D, so the oc the natural essence of the sense of the trees in figur particular problems with te right – small differences ma
parently increases the amount of space available we clusion is still there, just more acceptable as we view i 3D world. Note how the shadows are used to help m e 15. The vertical layout on the left of Figure 15 xt labels, solved partially by the horizontal layout on tter in visualisation.
gure tree, hard ks at ntal e 15 y or user ntre still it as make had the
Fig. 13. Simple tree visualisation
Fig. 14. Simple tree visualisation
Fig. 15. Using 3D
: the Cone Tree, vertical and horizontal variants [12]
Another approach is Tre particularly suitable for tree utilisation in a file system, divides the space horizontal subtree to determine the s proportional to its size (see
eMaps, which dissect and fill 2D space [13]. TreeMaps s where there is some notion of size or volume (e.g. d staffing or spending in organisational units). The TreeM ly and vertically on successive steps, using the 'size' of pace allocated. In the end each smallest level has a Fig. 16).
Fig. 16. TreeMap of two l
evel hierarchical data: data on the left, TreeMap on the right
Fig. 17. TreeMap vari
ants for images at leaves and large numbers of nodes [13]
Early versions of Tree layout algorithm. However artefacts such as many thin
Maps applied the simple alternating horizontal/vert , later variants have divided space differently to av rectangles when a node has a large number of childr
are disk Map f the area tical void ren,
particularly important when displaying images in the leaf nodes (Figure 17, left). The TreeMap is in many ways quite simple, but is one of the more heavily used 'complex' visualisations, proving itself able to manage vast trees (Figure 17, right) and yet still be relatively comprehensible.
Finally for here, although not the end of the visualisation of hierarchical data by any means, are methods that distort space in order to show a tree. The most well known (but not most well used or understood) of these is the Hyperbolic Browser [21]. This began with the failure of simple circular layouts to deal with larger trees. If a tree has a constant branching factor, say on average 3 nodes per parent, then the number of subnodes increase exponentially at each level down: 3, 9, 27, 81, 243, 729, ... However, when we layout in a circle, then the circumference of successively larger circles only grows linearly – there is never enough space!
Mathematicians deal with a kind of curved space called hyperbolic geometry. This had theoretical beginnings, but now turns out to have applications in cosmological physics. The important feature of hyperbolic space is that the circumference of 'circles' in this (rather odd) geometry increases exponentially with the diameter of the circle – perfect for trees. Unfortunately we do not see in hyperbolic geometry, so this is then projected back down into 2D (see Fig. 18) leading to an effect rather like earlier Fish Eye visualisations [3].
5.2 Clustered Data
Quite frequently in information retrieval there is no fixed structure, instead, we have large sets of search results, with common attributes, but no given hierarchical structure. Although there is no given structure, sometimes a form of structure is induced using clustering, whether at a single level to give groups of related nodes, or at multiple levels with clusters of clusters leading to tree structure.
Hierarchies are clearly centred on our linguistic/logical understanding of the world, and to some extent need to be made more immediate to our sensory perception. In contrast, clusters correspond closely to human perception, we see a group of sheep and, without consciously thinking, "they are all similar" they become a flock in our minds. However, the fuzzier concept has its own challenges.
Fig. 18. The Hyperbolic Browser visualising the web [21]
Where the data is numer obvious ways to show a clu identified as a cluster. Th maybe by the user choosin visualised on a 2D plane li extent, perhaps by drawing area included in the cluster there is an obvious region, clustering algorithm create ellipse as in Fig. 19, but per and if there is not an obviou
(a) scattered dat
The other obvious way
Introduction to Information Visualisation
ic and can be shown on some sort of scatter plot, there ster. Figure 19.a shows a group of elements that have b is might have been done by an automatic algorithm, g or encircling elements interactively. If they are be ke this, then we may visualise the cluster by showing a border around the cluster (Fig. 19.a) or shading (Fig. 19.b, shading). This is particularly appropriate w for example, if the user has lassoed the elements, or s a segmentation of space. Of course this may not be haps an area delimited by lines, or a more complex sha s region we might just use the convex hull of the points
a elements (b) show average or extent ig. 19. Visualising numeric clusters
to show a cluster is using some form of average positi 9.b is not one of the original points, but an average of e like this is an advantage if we want to reduce the clu ng the number of points displayed by just showing r. Of course, this can be odd for some sorts of data, erage of "2.2 children" per family in the 1970s. Howev form of representation.
The large point in Figure 1 the points. Using an averag of the display [22], reduci centre point of each cluste example, the (in)famous av numbers at least admit this Things get far more diffi text, or even categorical da comprehensible than 55% space, for example with numeric data can be used t average value of such der interpret, and you are still selects the 'average' elemen
Where the data is categ most or all of the cluster, i images, sound, rather than archetype, it is often better
cult when the clusters represent non-numeric data, imag ta such as gender – 0.2 of a child is at least more ea female. Sometimes this kind of data is mapped into multi-dimensional scaling, in which case this deri o display in the same way as numeric data. However, ived statistics is likely to become increasingly hard left with the problem of what details to show if the u t.
These examples may be (Fig. 20.a). For example, w co-occurrence of words a
orical there may be some attributes that are commo n n which case these may be used, but for rich media: t trying to compute some form of average, or genera to give one or more real examples. deliberately chosen to be 'typical' using some meas ith text one may compute similarity measures based nd then use examples that are central based on t
are been , or eing g its the when the e an ape, s ion. f all utter the for ver, ges, asily 2D ived the d to user to text, ated sure d on this.
(a) show typical Fig
elements
(b) show spread of elements
. 20. Visualising non-numeric clusters
Alternatively one may delib elements in the cluster (Fi these, one can simply rando
This use of example data visualisation of clustering f the Scatter–Gather browser regions in each. Each of th up of one of these cluster re
erately look for examples that are spread widely over g. 20.b). When there is no systematic way of choos mly choose a number of examples. points can be seen in the Scatter-Gather browser, a cla or text documents. Figure 21 shows the main windo w (bottom left), which consists of two columns with f ese regions represents a cluster. The inset shows a clo presentations.
. 21. The Scatter–Gather Browser [23]
The Scatter–Gather brow hard to state or 'recognised of documents, perhaps the e then uses an automatic algo selects one or more clusters set, and then repeats the p clusters. Eventually, when t conventional view for final
The problem here is tha millions of documents so sets. In the inset image in number of keywords. These
ser is designed to help users find documents based when seen' criteria. The system starts with some collect ntire library or perhaps the results of a keyword search rithm to cluster the documents into 10 clusters. The us that look interesting, the system gathers these into a n rocess clustering the chosen document set into 10 n he clusters are small enough, the user can swop to a m selection.
t the original library may consist of many thousands that the early clusters themselves consist of very la Figure 21 there are two main regions. On the top ar are commonly occurring words in the cluster – a form
the sing ssic of five osed on tion h. It sers new new more s or arge re a m of
Fig
the 'common attribute' visu represented by a short title typical elements.
alisation. Below that area are three individual docume or snippet; that is, a form of visualisation by choos
5.3 Multi-attribute Dat
a
Data items have many attri authors, date and place o keywords and taxonomic cl
The earliest approach t command-line interface (F traditional SQL databases SPARQL or noSQL databa > new query > type=‘jou query proce list all (Y > N
Fig.
butes, for example a bibliographic record may have ti f publication, number of pages, references, citatio assification.
o such data was some form of boolean query usin ig. 22). This form of querying is still possible both and more recent databases such as RDF data us ses such as MongoDB's command line mode [24].
rnal’ and keyword=‘visualisation’ ssing complete - 2175 results /N)
22. Boolean queries at a command line
Of course ordinary user however, the most common sort of search form allowin example, Figure 23 is the G
s are not expected to use this form of query langua interfaces are really only one step beyond, giving so g target values to be entered against different fields. mail advanced search form. Boolean queries dressed up
23. Goo
gle Gmail search form (http://mail.google.com/)
More sophisticated inte selection attributes are sho
rfaces allow faceted browsing. This is where seve wn simultaneously then, as one makes selections aga
ents sing itle, ons, ng a for sing age, ome For p! eral ainst
Fig.
one, the options on others shows three document attri selected 'interaction' and document type. The count – effectively documents wh ( "interaction" AND type = "j
Note that against each aut selected documents that als list 39+157>173 as there ar
are narrowed correspondingly. For example, Figure butes: keywords, author and document type. The user 'visualisation' from the keywords and 'journal' as '173' shows the total number of documents satisfying b ere:
IN keywords OR "visualisation" IN keywords ournal"
hor and keyword is a count. This shows the number o include the relevant attribute. Note that in the keywo e 23 documents with both keywords.
Fig. 24. F
aceted browsing, similar to HiBrowse [25]
There are many example and also alternative data in based on one of the earliest including large document re
The dynamic counts are line query interfaces it was results. In Figure 24 it is ob you will have no results – t what will happen after you visual interaction heuristic.
s of faceted browsing, both for conventional tabular d cluding Semantic Web data [26]. The images above , HiBrowse [25], which was used for various applicati positories and hotel selection.
a critical feature of the HiBrowse. With early comm easy to refine a search and then find it ended up with vious that if you decide to choose the author 'smith', t hat is the counts give a sort of peek over the horizon a r next interaction. This is a very powerful, but underus
24 has the both s ) r of ords data are ions mand h no then s to sed,
Fig. 2
5. Dynamic filtering in FilmFinder [14]
In the above example, the interactions were discrete, selections of attributes leading to updated values. However, sometimes interactions can be made more rapid and continuous. Figure 25 shows two screenshots of FilmFinder [14], another early faceted browsing interface. Within each screen there is a larger area to the left which shows a scatter plot of films, coloured by genre and plotted against date (x-axis) and popularity (y-axis). On the right are a number of sliders, which allow the setting of maximum and minimum values for various attributes. As the user moves these sliders, the points on the scatter graph are filtered in real time giving instant feedback. Note that in the right-hand screen shot the filtering has reduced the number of points and so the titles of the films are also shown.
Another example of faceted browsing is the Influence Explorer [27, 28]. This was designed to allow exploration of complex engineering problems including simulations. An example problem was light bulb design choices. There are various input parameters that can be chosen (e.g. material, thickness and length of filament), and various output measures (e.g. cost, lifetime). Large numbers of simulations are run to create a large set of multi-dimensional data points, each corresponding to a single simulation run. The engineer can then use dynamic sliders to either select 'input parameter' ranges (e.g. choose range of thicknesses), or 'output' parameters (e.g. maximum cost). So far, this is like the FilmFinder interface, except above each slider is a small histogram showing the way the currently selected items (simulation runs) are distributed over the relevant values. This is effectively like the counts in HiBrowse making it possible to see whether the sliders are hovering near critical values.
5.4 Big Data
One of the trends noted in section 4.3 is the vast data sets that now need to be analysed. Many visualisations fail when dealing with large data. Some problems are computational, simply too many points to perform calculations on, especially for real time interactive visualisation. Some problems are more intrinsic to the visualisation, for example if there are too many points on a scatter plot it becomes unreadable, just solid colour.
One approach is to simply use less space to visualise each item. Figure 27 shows VisD, an example of pixel plotting [16], which uses a single pixel for each data point and then packs these densely filling the available space. In Figure 27 the
Fig. 26. 'Peek over the horizon' histograms in Influence Explorer [27, 28]
pixels are plotted in circles starting in the centre and then spiralling outwards. Similar techniques are also used for filling square areas. The colour represents a single attribute of the data, and some other attribute is used to order the plotting. For example, if the data is ordered by time then trends in the data would appear as changes in the average colour between centre and periphery, and periodicity would show up as segments or swirling patterns.
Pixel plotting allows 100s of thousands or millions of data points to be plotted on an ordinary display, but still this does not help with many current datasets, for example the tens of billions of web pages on a typical crawl.
For these vast datasets there needs to be some form of data reduction. This may take the form of some sort of pre-programmed or emergent aggregation. For example, detecting clusters and displaying the cluster averages, as described earlier – that is visualising groups not individual elements. Example data points can also be used, effectively reducing the number of points displayed. These examples might be selected using some systematic technique, or simply using random sampling [29].
6 Designing for Visualisation
6.1
Perception and Purpose
As we have seen there are many different forms of visualisation. When choosing a visualisation or designing a new one there are several factors to take into account: visual ‘affordances’ – what we can see – Our eyes are better at some things than others. For example, they are much better at discriminating levels of darkness, than hues of colour, and are much better comparing lengths of lines when the lines share a common base.
objectives, goals and tasks – what we need to see – Recall the lists of numbers in Figure 1, if the purpose is to compare sizes or find the biggest/smallest, then aligning the decimal points helps you to do this. If you can understand the purpose of a visualisation, you are in a better position to ensure that the visual affordances make this purpose achievable.
Fig. 27. Pixel plotting [16]
aesthetics – what we l functional, but often they infographics intended for p systems as we all work bett
These different visualisa often use 3D charts as they visual discrimination of (sta 3D pie charts. Pie charts ar angles, but when the pie ch compare the resulting differ
For some purposes the fu most important. For others, may be as or even more purposes, which may each design the art is in choosing
ike to see – Sometimes visualisations simply need to also need to be attractive. This is certainly true of ublic consumption. However, it is also true of professio er when things are pleasant to look at. tion factors often conflict. For example, business repo look impressive, even though it is usually harder to m tic) 3D images. Particularly common, and problematic e difficult anyway as our eyes are not good at compar art is put in perspective it becomes nearly impossible ently shaped segments. nctional aspects, the fit between perception and purpose including the persuasive use of visualisation, the aesthe important. Furthermore there may be several inten be optimally suited for different visualisations. As in an appropriate trade-off between these conflicting goals
6.2 Interaction
One of the advantages of in in visualisation by allowin used in complex visualisat parameter ranges for filterin a static visualisation. Howe powerful by just a little inte
As an example, let's con 28. This representation is q total over time (the overall overall sales. It is also easy bottom category. However are the sales of bananas inc all fruits can be equally ea choices and trade-offs.
teraction is that it can relax some of the trade-offs intrin g some choices to be altered dynamically. This can ions, for example, the FilmFinder allowed the setting g which would have to be chosen beforehand and fixed ver, very simple visualisations can be made surprisin raction.
sider the simple stacked histogram of fruit sales in Fig uite good at giving one a sense of the overall trend of height of the bars) and of the breakdown of fruits wit to see the trend over time of apple sales, as they are it is very hard to see the trends of other fruits, for examp reasing or decreasing? With a static stacked histogram, sy to visually analyse and so the designer needs to m
21
o be the onal orts make are ring e to e, is etics nded n all s nsic n be g of d in ngly gure the thin the ple, not make
Fig. 28. Stacked Histogram
However, if the histogr relaxed. Figure 29 shows (a called 'dancing histograms made. Figure 29.a shows ho given cell of the histogram how selecting a particular f trends in the chosen fruit. W a slight, but steady increasin
am is augmented with interaction, this trade-off can static screen shot of) interactive stacked histograms (a ') [30]. Two very small interactive additions have b w the area at the bottom right changes to show details o as the user floats their mouse over it. Figure 29.b sho ruit makes the histogram bars drop so it is possible to e can now answer the question; in fact banana sales h g trend.
Fig. 29. Interactive Stacked Hi
stogram [30] (a) counts for individual cells (b) changing the b
You can probably think this, perhaps allow the use maybe completely swoppin
There are various differ previous examples: highlighting and focus of the nodes causing it and the view. This highlightin examples such as this, wh spread across the visualisa highlighting a line in a char drill down and hyperlin the cat has been selected a items of interest are expand when a folder is opened. window or page, so being m overview and context –overview of all the data. Th collection (on the left) or f may have to hide or reduce we are looking at is importa level view, for example, wh show a thumbnail of the w
of other simple interactive modifications to a chart l r to re-order the fruits by dragging the labels in the k g the visualisation to show side-by-side bars. ent kinds of interaction, most of which we have seen
– In the Cone Tree in figure 15, the user has selected its parents to be highlighted and also rotated to the fron g is useful even for simple elements, but more so en the focus is not just a single point, but in some w tion, as in this case with the node and its parents, t such as the Paris–Lyon timetable in Figure 10. ks – In the TreeMap of images in Figure 17, the image nd expanded so that we can see more details. Sometim ed in place, as in this case, or an outliner or file choo Alternatively drilling into an element may open a fr ore like a hyperlink.
As well as seeing details it is often important to get e TreeMaps in Figure 17 do this showing an entire ph ile system (on the right). When we are seeing details, this overview, but having some idea of the context of w nt. This is sometimes achieved by having a separate hi en you zoom in to a part of a picture, image editors of hole image with the currently selected portion mark n be also been of a ows see have base like key, n in one nt of for way , or e of mes oser resh t an hoto we what ighften ked.
Another random document with no related content on Scribd:
IN CAMP ON THE AROOSTOOK RIVER.
CHAPTER VII.
“Now, my co-mates, and brothers in exile, Hath not old custom made this life more sweet Than that of painted pomp? are not these woods More free from peril than the envious court?”
REDEEMED FROM STARVATION.—THE FIRST HABITATION ON THE AROOSTOOK.—MR. BOTTING’S HOUSE.—THE TOUROGRAPH ASTONISHES THE NATIVES.—PURCHASING SUPPLIES AT MASARDIS.— HOMEWARD BOUND.—AU REVOIR!
When I turned out the next morning the first thing I heard was an exclamation from the Colonel.
“What a jolly place for trout!” “Trout!” we echoed. “You don’t mean it?”
A WAITING BREAKFAST.
“I do, every time, my hearties,” responded the Colonel, as he cast his line far out on the surface of a dark foam-flecked pool at the junction of the two rivers. The next instant we saw his rod bend like a whiplash, and as the speckled prize which weighed above two pounds shot up out of the stream five hungry men fastened their eyes on it with ravenous fascination, and smacked their jaws in anticipation of a breakfast.
“Bravo, Colonel! Do it again!” we cried, as the trout was landed; and verily he did it again and again, while we did them all to a brown in the frying-pan.
During a few days rest here we secured a number of views, hunted partridges, and captured four fine beaver. Aside from the value of the
pelts of the latter animals, they placed us once more beyond the chance of starvation; and having lived for a month almost entirely on their flesh, we had learned by experience that it was better than nothing.
We still retained the “shoes” on our canoes, for although each day the Aroostook River grew deeper and wider, we were obliged to repeat the experiences of Mansungun Stream.
On we paddled, day after day. Soon we passed the junction of the Mooseleuk and Aroostook Rivers, and great was our joy when at last we caught sight of the first house since leaving Chamberlin Lake.
From an architectural point of view it would hardly have interested the humblest carpenter, but to our longing eyes it was the assurance of perils over and the hardest part of the tour accomplished.
A rough log cabin, with barn adjoining, and a few acres of cleared land constituted the farm of one Philip Painter. Here, as I was focussing the camera for a picture, a mother and three children gazed on me from the window, and viewed my operations with astonishment.
THE FIRST HOUSE ON THE AROOSTOOK RIVER.
But being still over one hundred miles from the end of our voyage, the tarry was of short duration.
The Colonel, however, in prowling about the farm, found time to fill his pockets with a quantity of small apples, no larger than nutmegs, and about as digestible. He distributed them among the party as we
were returning to the boats, imagining that he had made a glorious capture.
“Splendid, aren’t they?” he said, as we began to munch them.
“Anything for a change from beaver stews,” I replied. “I feel that I could take to boot-leg cheerfully.”
A mile further on another farm appeared, perched upon a high bluff.
“We must take this place by storm!” cried the Colonel. “We must find a straight North American meal if we perish in the attempt,” and he led a gallant advance toward the farm house.
Mr. Botting, the proprietor of the place, appeared in answer to our hail and greeted us with a stare of open-eyed wonder. The first words he spoke were in company with a jerking action of his thumb toward the Tourograph.
“What kind of a machine do ye call that?” he asked, eyeing the instrument with a profound glance.
“This,” said the Colonel, hastening to explain, “is the improved Gatling gun.”
“An’ ye’ve come all the way to this God-forsaken hole to sell it?” said the man. “What’s it fur, anyhow?”
“Cats,” replied the Colonel, with the gravest expression in the world.
“Wal, we ain’t got no cats round here,” said the man. “Haint seen the ghost o’ one in years.”
“Don’t believe him,” I said, interposing, “It’s not a Gatling gun; it is a camera—an instrument for taking pictures—likenesses.”
“Oh!” drawled the man, “I see! He-he! Queer lookin’ affair, ain’t it? Looks like one o’ these patent coffee-grinders I seed down at ’Guster (Augusta) when I was there last.”
“Sir, you insinuate,” said the Colonel. “We have had neither sight nor taste of coffee in weeks, and we don’t sport a coffee-grinder for bare admiration’s sake, we can tell you.”
“Which brings us to our business,” said I. “We have just come from Moosehead Lake. Can you get up a dinner for the crowd?”
“Wal, yes, I guess so,” said the man in a half-dubious tone, as he took in the calibre of the party.
Then, beckoning us to follow, he hobbled back into the house, where after an hour’s tarry we were served with a dinner that hardly paid for the time lost in eating it. It consisted of bread, potatoes, and tea sweetened with molasses; but, like the apples, even this was “a change” from beaver stews.
“CAN YOU GET UP A DINNER FOR THE CROWD?”
“Must a-had a dry time, gen’lmen,” he said, as he busied himself attending to us. “Didn’t find much water, I guess. Never did see the ’Roostook run down so low in all my life, an’ I’ve lived on this ’ere river now nigh on thirty-seven year. I’m seventy odd year old, but
only for a lame hip I’ve got I could tramp through the woods with the best o’ ye.”
“You must have some trouble in working your farm,” remarked the Colonel, surveying the fields in front of the door.
“Oh, no; not much. I raise sons to do it. I’ve got eleven as likely boys as you ever did see; but I lost one in the war—poor feller!” as in a husky tone of voice he pointed to a framed certificate of his son’s war services.
Sixteen miles more of vigorous paddling brought us to the town of Masardis, the post-office of the county, and landing on the shore among a number of dug-outs and batteaux, we entered the village.
“Where is the store?” inquired the Colonel, as he crossed the street and rapped at the door of one of the houses.
“Don’t have any,” said the lady who answered his call, surprised at her visitor.
“Well, can you sell us some flour, potatoes and coffee?” and then the Colonel unrolled his memorandum of much needed camp supplies. At this house we purchased flour, at another potatoes, at another coffee, no two articles being had at the same place, while chickens at twenty-five cents each were sold “on the run,” the Colonel and Hiram securing them after an energetic race.
BIRD TRAPPING MADE EASY.
An old lady of seventy summers, who sold me a box of honey and was very communicative, said during a short but delightful conversation—“I suppose you have heaps more people down in Connecticut than we have in this town; but I don’t believe they are half so happy as our townsfolks. Oh, no! they can’t be near so happy —except, well—except on election days;” and a sad expression came over her wrinkled countenance, for the smaller the town, the greater is the feeling on politics in Maine.
The river now widens to a distance of over one hundred and fifty feet, and day after day shows a gradual increase in its depth and power
The current sweeps us swiftly onward through rapids innumerable in the full excitement of canoe life, but occasionally we are forced to disembark and drag our canoes over a rocky beach, which obliges us to retain the “shoes.”
At our various camps we are visited by the inhabitants along the route, who in return for the history of our tour entertain us with news of the outside world, from which we have been separated for so many weeks. Then we begin to realize that we are homeward bound.
“SEVENTY SUMMERS.”
“Hello, stranger!” yelled the Colonel, as rounding a bend in the stream he spied a man standing in one of the loghouses that dot the banks; “can you tell us how far it is to the next town?”
“Dunno, friend; but its nigh on ten miles by the road.”
An invitation to one of these callers, requesting the honor of his company at breakfast was accepted (with avidity), although, as he remarked, “the old woman was waiting to serve that meal for him on yonder hill.”
On passing the towns of Ashland and Washburn, the foamy and discolored appearance of the stream gave evidence of the potato starch manufactories in the vicinity.
The strangest peculiarity of the inhabitants was their utter ignorance of the country and its surroundings.
These people, living on the river, could not give us the faintest idea of distances to points along the shore.
A PEEP AT THE STRANGERS.
Another gave the same answer, while a third did not know the name of the next town, although he had lived five years in the
country—a parallel to the Virginian woodsman who stalked forth from his native pines one day to learn that there had been such a catastrophe in the history of his country as the war of the Rebellion.
“Wake up, boys,” yelled the Colonel, arousing the party (4 . .) at our last camp near Washburn, where we turned out in the dark to partake of a hasty breakfast before embarking.
PRESQUE ISLE—CIVILIZATION IN FOCUS.
“If we are going to make forty-five miles to Caribou to day, we must make hay while the sun shines,—or while it doesn’t shine,” he added, as he took notice of the darkness.
Soon we were gliding down the swift stream, avoiding the huge rocks dimly appearing through the mist, until at last the rising sun dispelled the darkness.
At Presque Isle we landed, and while the guides were preparing dinner, I climbed a neighboring hill with my Tourograph and secured a picture of the scene.
Hour after hour we labored at the paddles, until they seemed almost a part of ourselves; the “shoes” on our canoes retarded us not a
little.
The sun was creeping down the western sky, and the tall pines on the bluffs above us threw their lengthening shadows across the stream, as doubling the last bend we shot the canoes along side the wharf at Caribou, and completed our tour of over four hundred miles from Moosehead Lake to the Aroostook River.
Here we took the cars.[B]
[B]Since this canoe tour was completed the railroad has been extended to the town of Presque Isle, at which point tourists can leave the Aroostook River, saving themselves a tedious paddle of about twenty-two miles to Caribou.
A delegation of the “big people” of the vicinity saw us off.
VALEDICTORY.
At the parting moment they seemed visibly affected, as our sketch shows.
As we crossed the line at Fort Fairfield the following day on our way to Woodstock, New Brunswick, the custom house officer found
nothing in our kit to reward his examination, although he displayed much curiosity in the leather case containing the camera.
“You must have had a fine time,” he remarked.
“Yes,” was the reply, “save building dams and shoeing canoes.”
While the Indian ejaculated—
“Me think so, too; yes!”
In the whirl of the outside world the weeks fleet by as with the swiftness of a day, but in the solitude of the wilds it seems a longer lease of time.
It is like an age since we took leave of civilization and plunged into the heart of the forests. Now, out of the depths, with a bound we are again in the noise of the busy world.
Mighty trees, primeval rocks with draperies of vine and moss and lichen, tumbling cascades, rushing streams, and all the forest’s wealth of color, form and music disappear like magic.
Presto! what a change!
From the sigh and rustle of the grand old pines list to the rattle of rail cars, the shriek of whistles, and hum of machinery in the mills and factories.
From the croon of the night-bird, that with the distant star has often been my only company in the dark hours while my comrades slept, list to the bark of dogs and crow of cocks, as we rush past town and hamlet through the night and early morn. We are out of the wilds. Farewell, Nature! Welcome, Home!
“There is a pleasure in the pathless wood, There is a rapture on the lonely shore, There is society where none intrude—
To sit on rocks, to muse o’er flood and fell, To slowly trace the forest’s shady scene, Where things that own not man’s dominion dwell, And mortal foot hath ne’er or rarely been; To climb the trackless mountain all unseen,
Alone o’er steep and foaming falls to lean— This is not solitude; ’tis but to hold Converse with Nature’s charms and view her stores unrolled.”
THE MOST ARTISTIC AND CHARMING BOOK OF THE SEASON.