THE PHYSICAL WORLD AS AN ABSTRACT INTERFACE
Darren Edge, Alan Blackwell & Lorisa Dubuc
University of Cambridge Computer Laboratory, William Gates Building, 15 JJ Thomson Avenue, Cambridge CB3 0FD
We describe an approach to understanding the way in which people can use physical objects as if they were components of an abstract language. This arose from our study of common devices that cannot support direct manipulation, because users are controlling future operations such as VCR recording. We apply an adaptation of the Cognitive Dimensions of Notations framework, a vocabulary for the analysis and design of information devices, to the digitally augmented physical objects of tangible user interfaces. Our specific focus is on assisting collaboration, awareness and communication between individuals. We describe three case studies: rhetorical structure for children, shared information spaces within organizations, and assistive technology for individuals with dementia.
Introduction
Whenever we arrange physical objects or manipulate mechanisms, we rely on immediate perceptual feedback in order to adjust our actions and confirm their effects. For over 20 years, this experience of “direct manipulation” has inspired the design of graphical user interfaces (GUIs), in which visual elements of the computer screen are made to behave as if they were arrangements of physical objects. However, physical systems themselves do not always support direct manipulation. As soon as the earliest electromechanical controls enabled physical action at a distance, it became necessary to design physical indicators for the human operator. Even more challenging is the situation in which physical systems include internal state, such that their future behaviour may vary. An operator then needs to anticipate that future behaviour, taking additional precautions to specify, test and review the effects of current actions. This has been described as the irony of automation (Bainbridge, 1987). We describe it in terms of an “attention investment” equation in which the mental effort of specification, test and review must be offset against the expected convenience of automatic operation (Blackwell, 2002).
The relationship between physical action, perceptual feedback, system state and future behaviour has become more critical in computer science as we develop technical infrastructures for pervasive or ubiquitous computing. In these fields, we anticipate design scenarios in which many physical objects may be augmented with computation and communication facilities. In some scenarios, system users may interact with pervasive network facilities via a conventional screen interface (perhaps made more portable via mobile phones, PDAs, wrist displays or compact head-mounted displays). However, we are particularly interested in those circumstances in which the augmented physical objects themselves can be manipulated in order to control the system. This is described as a tangible user interface or TUI (Ishii and Ullmer, 1997).
A critical question for the designers of TUIs is whether they are able to apply the design principles of direct manipulation. At first sight, it seems that this might be the primary advantage of a TUI. Even the best GUI is only a simulation of object behaviour in the physical world, whereas a TUI is composed of real objects. However, this fails to account for the centrality of internal state to augmentation. An augmented object, almost by definition, is one that is associated with additional state information. The benefit of this additional state is that manipulating one object may have effects on others, or that a manipulation may have some effect in the future. In systems terms, these can be considered varieties of automation. In computational terms, they can be considered varieties of programming. In order to specify multiple actions or future actions, it is necessary to describe those actions using some abstract representation. This is the precise opposite of direct manipulation, in which there is no abstract intermediary between action and result.
In our analysis of TUIs, we therefore distinguish direct actions on the world (perhaps mediated by a physical control device) from abstract actions that manipulate representations of internal state. The former can be analysed using conventional usability approaches, whereas the latter should be analysed using approaches from the psychology of programming. Our design approach emphasises the manipulation of abstract notational systems (of which programming languages are one example). In the case of TUIs, this leads us to think of the ways in which the physical world can express an abstract information structure, and an arrangement of physical objects can become a notation, perhaps expressed as a manipulable solid diagram (Blackwell et al, 2005).
The Cognitive Dimensions Framework
The Cognitive Dimensions (CDs) framework (Blackwell and Green, 2003) is a tool for the design and evaluation of notations, where a notation is the means of representation and control of an underlying information structure. Notations are evaluated against a set of interrelated criteria – cognitive dimensions – with the result being compared against the desired profile of the activities to be performed with the notation. This procedure can also be followed in reverse to aid the design of notations, by looking for solutions whose profiles fit closely with that of the intended activities. The CDs framework was designed to be applicable to any notation – virtual (as in GUIs) or otherwise – and as such the names of the dimensions reflect high level concepts that can often be interpreted in a number of ways. By restricting the domain of analysis to TUIs, some of these dimensions have interpretations that are particularly salient and worthy of more role expressive names. We call such interpretations the tangible correlates of the cognitive dimensions.
We apply the cognitive dimensions and their tangible correlates at the earliest possible stage of TUI design, as a precursor to creating prototypes with modelling materials. Although prototyping allows ‘hands-on’ creativity, an inherent problem is the vast number of alternative prototypes which can be produced when the mind is allowed to wander freely. The benefit of potentially finding novel and surprising solutions is balanced against the cost of creating numerous unsuitable designs. A ‘suitable design’ is one whose profile closely matches the profiles of the activities to be performed with the design. A preliminary CDs analysis can therefore be beneficial in placing constraints on the design space, such that only suitable designs are considered in the prototyping phase. The rigour of CDs analysis ensures that more of the possible design space is considered, and the constraints it imposes ensure that more of the suitable design space is explored creatively during prototyping.
To illustrate the kind of reasoning typical of a CDs analysis we will present examples from our past and current work, using the typographic convention of cognitive dimension<CD> to refer to the traditional CDs, and tangible correlate<TC>[cognitive dimension] to refer to their tangible correlates. For a full description of the Cognitive Dimensions framework, see http://www.cl.cam.ac.uk/~afb21/CognitiveDimensions.
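The idea of comparing a design's profile with the profile of an activity can be made concrete with a small sketch. This is an illustration only: the CDs framework is qualitative, so the numeric encoding (-1 where a low level is wanted, +1 where a high level is wanted), the function name profile_mismatch, and the candidate designs design_a and design_b are all assumptions introduced here, not part of the framework.

```python
# Hypothetical sketch: ranking candidate designs against an activity profile.
# The numeric levels and every name below are illustrative assumptions.

def profile_mismatch(activity, design):
    """Count the dimensions on which a design departs from the desired level.

    Dimensions the activity does not mention are ignored. Dimensions the
    design does not rate count as mismatches, since they are unassessed.
    """
    return sum(1 for dim, wanted in activity.items() if design.get(dim) != wanted)


# Desired profile for a collaborative activity: -1 means low, +1 means high.
collaboration = {"viscosity": -1, "hidden dependencies": -1, "error proneness": -1}

candidates = {
    "design_a": {"viscosity": -1, "hidden dependencies": -1, "error proneness": +1},
    "design_b": {"viscosity": +1, "hidden dependencies": +1, "error proneness": -1},
}

# Designs with fewer mismatches come first and are kept for prototyping.
ranked = sorted(candidates, key=lambda name: profile_mismatch(collaboration, candidates[name]))
```

A simple count treats every dimension as equally important. In practice the dimensions are interrelated and trade off against one another, so a ranking like this could at most suggest which designs to prototype first.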
Case Studies
Design of TUIs to support collaboration between collocated users
In the WEBKIT project, we designed a physical representation of the abstract structure of arguments (Stringer et al, 2004). Our goal was to help schoolchildren make a critical assessment of material they had found during research on the Web, as they selected and presented evidence in support of some proposition. We used tangible tokens (with embedded RFID tags) to maintain the relationship between specific claims formulated by the students, and the evidence from which those claims were derived. The tokens could be ordered into physical structures that both illustrated and enforced the abstract argumentation structures of classical rhetoric. This process was carried out in teams, where all children could see the emerging abstract structure, and collaborate in constructing and refining it. When the argument was eventually presented to the whole class, the ordered tokens provided private cues for a speaker, while also offering a simple control interface to the projection of multimedia illustration material for the talk.
Each stage of an argument is represented by a token holder that provides two parallel racks for the linear arrangement of tokens. Tokens can slide freely within these linear constraints and be moved freely from one rack to another. This is an example of low viscosity<CD>, or resistance to change, in terms of both the manipulation of tokens within a rack, and the movement of tokens between racks. We say that the interface has both low rigidity<TC>[viscosity] and low rootedness<TC>[viscosity] respectively – desirable characteristics for a collaboration interface. These correlates commonly trade off against the shakiness<TC>[error proneness] of the physical notation, or its proneness to accidental and unrecoverable damage. This is also the case with the WEBKIT interface – tokens can be easily knocked out of the shallow racks. However, the consequences of this are partially offset by the use of LEDs on each token to indicate detection status – this feedback, which eliminates hidden augmentations<TC>[hidden dependencies], makes it easier for users to detect and correct such errors.
Design of TUIs to support awareness within distributed teams
One of our current projects is a tangible interface to support shared awareness within distributed teams. In this situation, rather than having a single focal interface to support active collaboration, each user needs to have their own personal interface for passively communicating information to other users, and monitoring the changing state of other users’ information structures in return. Furthermore, as users’ focus is on their existing work, such an interface needs to operate on the periphery of users’ attention and complement the traditional monitor, mouse and keyboard setup.
The structural approach we have devised involves a division of labour between the two sides of the workspace – tokens representing items of interest, or information entities, are arranged and manipulated on an interactive surface positioned by the side of the keyboard not occupied by the mouse or other pointing device. The opposite side – where the pointing device is located – is used to group together all of the tools of the interface that are used to navigate and set attributes of information entities. The positions and identities of tokens are detected using computer vision techniques, and attributes of the associated information entity are dynamically displayed around the token on the interactive surface. The types of information entity envisaged include people, documents, tasks, reminders, timers, and so on. Users will be able to use their interface to ‘physically’ pass information entities between interactive surfaces, and associate tokens with others’ entities to monitor their evolution, e.g. progress on a task. A shared web space will be used to set up token-entity associations and publish in real time the changing information structures of all team members.
In this kind of TUI, the role expressiveness<CD> of physical tokens must be traded off for increased adaptability<TC>[abstraction] in terms of the different information entities they can be used to represent. In accordance with the ‘peripheral’ requirement, the interactive surface in our TUI has sufficiently low bulkiness<TC>[diffuseness] that it can be relocated by the user to their home office, client site, etc. – the interface as a whole has low rootedness<TC>[viscosity]. The physical tools also have a degree of permanence<TC>[visibility] in their location, which means that the user can learn to operate them by touch alone. Combined with the parallel, bimanual method of interaction, this decreases the overall rigidity<TC>[viscosity] of the notation.
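The token-entity associations described above can be pictured with a minimal sketch. Everything in it is hypothetical: the class names Entity and SharedSpace, the method names, the user names and the token numbers are assumptions made for illustration, and the in-memory dictionary merely stands in for the shared web space.

```python
# Hypothetical sketch of token-entity association; all names are assumptions.
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An information entity such as a person, document, task or timer."""
    kind: str
    owner: str
    attributes: dict = field(default_factory=dict)


class SharedSpace:
    """In-memory stand-in for the shared web space holding associations."""

    def __init__(self):
        self._bindings = {}  # (user, token_id) -> Entity

    def associate(self, user, token_id, entity):
        # Tokens are generic, so rebinding one to another entity is allowed.
        self._bindings[(user, token_id)] = entity

    def attributes_for(self, user, token_id):
        """Return the attributes to display around a token, or None if unbound."""
        entity = self._bindings.get((user, token_id))
        return None if entity is None else dict(entity.attributes)


space = SharedSpace()
task = Entity(kind="task", owner="alice", attributes={"progress": "20%"})
space.associate("alice", 7, task)    # the owner's token for the task
space.associate("bob", 3, task)      # a colleague monitors the same entity
task.attributes["progress"] = "60%"  # the owner updates the task
```

Because both tokens refer to the same entity, the monitoring user's surface would show the updated progress without any action on their part. Allowing a token to be rebound is what gives up role expressiveness in exchange for adaptability.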
Design of TUIs to support communication for individuals with dementia
Another current project we are working on is a TUI to assist older individuals with dementia in maintaining a conversation through the virtual management of conversational topics. TUIs are a good candidate for use by this user group, as older people are often more comfortable working with concrete physical representations than abstract virtual ones. In the interface we are developing, physical objects are used to denote individual conversational topics and placed on a shared tabletop. The spatial arrangement of objects may be used to represent planned conversational topics and their anticipated flow, but this arrangement can be altered dynamically at any point during the conversation. A camera on a mobile phone is used by a person with dementia to select a physical object (and thereby a conversational topic); as topics are selected using the phone, the software tracks which topics have been discussed and relays this information back to the individual when repetition occurs.
Externalising information in this way reduces hard mental operations<CD> by reminding users of what has been discussed previously, and providing visual cues for possible future topics. As individuals with dementia may be easily distracted by their environment and have a limited attention capacity, the interface also requires a high level of structural correspondence<TC>[closeness of mapping] for ease of understanding, and needs to keep hidden augmentations<TC>[hidden dependencies] to a minimum so as not to surprise users. This is achieved in our interface by using special tags that have a one-to-one mapping with conversational topics; the user with dementia has control over which tag designs to use and how they should be attached to familiar, role expressive<CD> physical objects. This control is important for maintaining individuals’ sense of identity, the loss of which can become a barrier to communication (Dubuc and Blackwell, 2005).
These tags are detected and identified by computer vision software on the mobile phone. The phone is packaged in a casing designed to reduce unwieldy operations<TC>[hard mental operations] for users with impaired motor control, and has good purposeful affordances<TC>[role expressiveness] for pointing due to the real-time camera display on the screen of the phone. The tag recognition software only functions when the camera is held pointed directly at a tagged object, making it easy to intentionally capture a conversational topic, and difficult to accidentally capture one. Hence both the shakiness<TC>[error proneness] and rigidity<TC>[viscosity] of the capture mechanism are minimised.
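The topic tracking described in this case study might be sketched as follows. This is not the software from the project: the class name TopicTracker, its methods, the tag numbers and the topic names are all invented for illustration, and tag recognition itself (the computer vision step) is assumed to have already produced a tag identifier.

```python
# Hypothetical sketch of topic tracking; tag IDs and topic names are invented.

class TopicTracker:
    """Remember which conversational topics have already been selected."""

    def __init__(self, tag_to_topic):
        # Enforce the one-to-one mapping: no two tags may denote one topic.
        if len(set(tag_to_topic.values())) != len(tag_to_topic):
            raise ValueError("each topic must have exactly one tag")
        self._tag_to_topic = dict(tag_to_topic)
        self._discussed = []  # topics in the order they were first selected

    def select(self, tag_id):
        """Record a captured tag and return (topic, is_repeat)."""
        topic = self._tag_to_topic[tag_id]
        is_repeat = topic in self._discussed
        if not is_repeat:
            self._discussed.append(topic)
        return topic, is_repeat

    def remaining(self):
        """Topics not yet discussed, usable as cues for future topics."""
        return [t for t in self._tag_to_topic.values() if t not in self._discussed]


tracker = TopicTracker({1: "garden", 2: "grandchildren", 3: "holiday"})
first = tracker.select(2)   # a new topic
again = tracker.select(2)   # the same topic selected a second time
```

The is_repeat flag corresponds to the point at which the software would relay a reminder to the individual, and remaining() corresponds to the cues for possible future topics. How such feedback is presented is a design question that this sketch leaves open.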
Conclusion
Conventional GUI design guidelines, based on direct manipulation, are satisfied trivially by TUIs. As a result, they give little assistance for TUI design. We have therefore outlined a new approach to help designers analyse the needs of diverse user populations, in order to construct innovative digital augmentations of the physical world.
Acknowledgments
This work is sponsored by Boeing Corporation. The WEBKIT project was funded by European Union grant IST-2001-34171.
References
Bainbridge, L. (1987). Ironies of automation. In J. Rasmussen, K. Duncan and J. Leplat (Eds) New Technology and Human Error. Chichester: Wiley, 271–284.
Blackwell, A.F. (2002). First Steps in Programming: A Rationale for Attention Investment Models. In Proc. IEEE Symposia on Human-Centric Computing Languages and Environments, 2–10.
Blackwell, A.F. and Green, T.R.G. (2003). Notational systems - the Cognitive Dimensions of Notations framework. In J.M. Carroll (Ed.) HCI Models, Theories and Frameworks: Toward a multidisciplinary science. San Francisco: Morgan Kaufmann, 103–134.
Blackwell, A.F., Edge, D., Dubuc, L., Rode, J.A., Stringer, M. and Toye, E.F. (2005). Using Solid Diagrams for Tangible Interface Prototyping. IEEE Pervasive Computing, 4(4): 74–77.
Dubuc, L. and Blackwell, A. (2005). Opportunities for augmenting conversation through technology for persons with dementia. In Proc. Accessible Design in the Digital World (ADDW), Dundee, Scotland.
Ishii, H. and Ullmer, B. (1997). Tangible bits: towards seamless interfaces between people, bits and atoms. In Proc. CHI '97 Conference on Human Factors in Computing Systems. New York, NY: ACM Press, 234–241.
Stringer, M., Toye, E.F., Rode, J.A. and Blackwell, A.F. (2004). Teaching rhetorical skills with a tangible user interface. In Proc. ACM Interaction Design and Children.