TMA1115 Final Major Project - Another Reality
Stephen Hibbert - 2014
Acknowledgement
Thanks to Anneke Pettican, Rowan Bailey and Eddie Dawson Jones for their support during the production of this project.
Created at: School of Art, Design and Architecture, University of Huddersfield
Submitted in part-fulfilment of the requirements of MA Digital Media Design
Created by Stephen Hibbert - u1371079
Contents

Introduction

Section 1 - Prior Work
  Term 1 - Short Projects
  Term 2 - Processing & Visualisation
  Understanding Cinema4D
  The inception of Xuni AR

Section 2 - Research and Testing
  The theoretical setup
  Kinect Sensor testing
  Augmented Reality Glasses Testing
  Gesture and UI research
  Gesture Recognition
  AR and VR user interface design

Section 3 - Visualisation
  3.1. Capturing the source footage
  3.2. Marker-less Motion Tracking
  3.3. Storyboard Development
  3.4. Compositing video footage
  3.5. Developing the CG environment

Conclusion
Appendix
References
Introduction
What does the future look like? After investigating multiple strands of emerging technology and information delivery, this project now enters its final stages of production. In attempting to communicate a new type of Augmented Reality interface, 'Xuni AR', I have researched and compiled a number of different visualisation techniques that attempt to mimic what a potential user might see when wearing the Xuni system. This document, 'Another Reality', records the process by which I came to produce the project's visual assets, used both in the testing of interface-related devices and in the resulting animated visualisation. It serves as a companion piece to the journal paper 'What does the future look like? An introduction to a potential future augmented reality system', which describes the research and testing that went into producing this project.
"Meaningful information is not a given fact, and particularly now, when our cultural artefacts are being measured in terabytes and petabytes, organising, sorting, and displaying information in an efficient way is a crucial measure for intelligence, knowledge, and ultimately wisdom." -- Andrew Vande Moere (2010) Data Flow 2: Visualising Information in Graphic Design. Gestalten
Section 1 - Prior Work
During the first two terms my work investigated various techniques that might be used to help visualise information. I was interested in how data visualisation techniques might be taken further, to develop real-time environments within which to 'immerse the user in data'. The idea behind this supported my overall aim of conveying understanding of a given topic (depending on the data used) to an audience in new ways.
Term 1 - Short Projects
Short Project 1 - City Generation
The principal research aim in the first term was to generate original content using freely available public-license datasets. Using Unity3D, Autodesk Maya and Maxon Cinema4D, I intended to implement unique visualisations, both pre-rendered and interactive. However, I first needed to better understand each software package's strengths and experiment with how I might implement them. I therefore embarked on a series of short projects, each designed to assess a specific technique or attribute I understood to be of relevance to later work. Many software vendors now allow their communities to create and distribute scripts, tools and assets which support the main package functionality. Wherever used, in whichever package, these assets were referenced and credited as appropriate. All assets were non-commercial public-license versions, unless otherwise stated.
For this project I decided to implement some scripts in order to test procedural generation techniques, something that may prove useful when generating abstract data-driven objects. Although it is not necessarily my intention to generate real-world locations for data visualisation, real-world references are useful when exploring how data and scripts can be used to generate objects quickly. The script used here, 'KludgeCity', is a public-license MEL (Maya Embedded Language) script written by Ed Whetstone. It allows generation of various building types dependent on user-specified parameters. Although the models are detailed, they were quite quick to generate; however, there are still issues with overall scene lighting and texturing. While these assets are useful and quick to generate, they can lack fine detail and control, and therefore their implementation was limited.
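As an aside, the principle of parameter-driven procedural generation can be sketched outside Maya. The minimal Processing sketch below is not the KludgeCity script itself (whose MEL source is not reproduced here); it simply builds a 'city' of boxes whose heights are driven by a handful of user-specified parameters, with the grid size, height range and seed as arbitrary example values.

```processing
// Minimal Processing (Java mode) sketch: parameter-driven block 'city'.
// Illustrative only, not the KludgeCity MEL script; gridSize, maxHeight
// and seed are arbitrary parameters chosen for the example.
int gridSize = 12;       // number of buildings along each axis
float plotSize = 40;     // footprint of each building plot
float maxHeight = 180;   // tallest building allowed
int seed = 42;           // fixed seed so the same parameters give the same city

void setup() {
  size(800, 600, P3D);
  noStroke();
}

void draw() {
  background(30);
  lights();
  randomSeed(seed);                       // regenerate the same layout each frame
  translate(width / 2, height * 0.75, -200);
  rotateX(-PI / 6);
  rotateY(frameCount * 0.005);            // slow orbit around the city
  for (int x = 0; x < gridSize; x++) {
    for (int z = 0; z < gridSize; z++) {
      float h = random(20, maxHeight);    // building height driven by the parameters
      pushMatrix();
      translate((x - gridSize / 2) * plotSize, -h / 2, (z - gridSize / 2) * plotSize);
      fill(120 + random(80));
      box(plotSize * 0.7, h, plotSize * 0.7);
      popMatrix();
    }
  }
}
```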
Short Project 2 - Import/Export
Unity3D appears to have the flexibility I need in order to produce a bespoke interactive piece, using rendering methods normally found only in pre-rendered imagery. Its use throughout the video game industry, from small independent products through to large-scale multinational releases, clearly proves its ability to create a consistent and content-rich experience. After generating some initial test assets for a generic environment, my first task in Unity3D was to develop an understanding of how imported geometry might look, and also how factors such as lighting and scale might be affected during the transfer process. The screenshots seen here show a development environment with a building interior built in Maya using KludgeCity and then transplanted into a basic Unity3D landscape. Initial impressions are positive once the best use of file types is researched and geometry requirements are met. Further investigation will be needed into shading and texturing transfer.

Short Project 3 - Satellite Data
To understand how data might be mapped to objects I decided to test my theory on a readily available large dataset, accessing freely available geospatial data through Google Earth using a custom overlay from King's College London. After choosing a sample dataset I transferred it into a DEM (Digital Elevation Model) viewer. I then exported a black and white height map and, after further adjustments in Adobe Photoshop, imported this image as the dataset for a 3D map in Cinema4D.
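The displacement step of this workflow can be approximated in a few lines of Processing. The sketch below is a minimal illustration only: it loads a grayscale height map (the filename 'heightmap.png' is a placeholder for the image exported from the DEM viewer) and displaces a wireframe grid by pixel brightness, much as the Cinema4D 3D map does.

```processing
// Minimal Processing sketch: displace a grid of points using a grayscale height map.
// 'heightmap.png' is a placeholder filename for the black and white image
// exported from the DEM viewer; the scale values are arbitrary.
PImage heightMap;
int step = 8;             // sample every 8th pixel to keep the mesh light
float heightScale = 120;

void setup() {
  size(900, 600, P3D);
  noFill();
  heightMap = loadImage("heightmap.png");   // grayscale: white = high, black = low
}

void draw() {
  background(0);
  stroke(90, 200, 120);
  translate(width / 2, height / 2, -300);
  rotateX(PI / 3);
  translate(-heightMap.width / 2, -heightMap.height / 2);
  // Build the terrain as triangle strips, row by row
  for (int y = 0; y < heightMap.height - step; y += step) {
    beginShape(TRIANGLE_STRIP);
    for (int x = 0; x < heightMap.width; x += step) {
      float h1 = brightness(heightMap.get(x, y))        / 255.0 * heightScale;
      float h2 = brightness(heightMap.get(x, y + step)) / 255.0 * heightScale;
      vertex(x, y, h1);
      vertex(x, y + step, h2);
    }
    endShape();
  }
}
```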
Short Project 4 - Interactive Map
This project involved integrating the object data created in Cinema4D using the UK map information. Importing this object into Unity3D was relatively straightforward, with a generic texture added for the grass and a flat plane used for the water area. Further assets were then added to test the import of third-party objects such as trees and the skybox. Finally, basic camera effects such as lens flare were added for effect. Further scene optimisations would be necessary to improve overall performance, but this project helped to demonstrate the power of Unity3D and the ease with which a basic environment can be created.

Short Project 5 - Real-time Lighting Techniques
Effective lighting is key when creating realistic-looking renders of assets; the cost of employing high-end lighting techniques, however, becomes much more prohibitive if the creator wants to generate something in real time at more than 30 frames per second. A possible solution is to bake light maps. This process involves using a 3D package alongside a capable render package to produce a light-based texture map, which can then be composited and mapped onto a real-time asset. Done effectively, this can result in real-time objects that appear almost pre-rendered but can be drawn as quickly as a normal low-resolution textured object. Simulating realistic light in 3D is technically challenging, and various methods and algorithms have been developed to mimic natural light characteristics.
Light mapping involves breaking each light characteristic down. The key is to give as much flexibility as possible at the composite stage. Compositing each light map layer allows finer adjustment to each attribute, for example in order to change its colour or saturation, and therefore tailor the look of the resulting texture to the object and overall scene. Multiple light map layers or 'passes' can be generated separately by creating a new 'bake set' for each light characteristic necessary to achieve a final high-definition texture. Ambient Occlusion, for example, simulates how light radiates on non-reflective surfaces, giving a surface a more natural light distribution. These passes can then be edited together in Photoshop using various blending modes and opacity levels.
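A rough sketch of this pass-compositing step is shown below. It multiplies a hypothetical ambient occlusion pass over a diffuse bake with an adjustable opacity, mimicking what the Photoshop blending modes achieve; the filenames are placeholders and both images are assumed to share the same resolution.

```processing
// Minimal Processing sketch: composite an ambient occlusion 'pass' over a diffuse
// 'pass' using a multiply blend with adjustable opacity, standing in for layer
// blending modes in Photoshop. 'diffuse_bake.png' and 'ao_bake.png' are
// placeholder filenames for baked maps of identical resolution.
PImage diffuse, ao;
float aoOpacity = 0.8;   // 0 = ignore the AO pass, 1 = full multiply

void setup() {
  size(1024, 1024);
  diffuse = loadImage("diffuse_bake.png");
  ao = loadImage("ao_bake.png");
  PImage composite = multiplyPass(diffuse, ao, aoOpacity);
  image(composite, 0, 0, width, height);
  composite.save("lightmap_composite.png");   // final texture for the realtime asset
}

PImage multiplyPass(PImage base, PImage pass, float opacity) {
  PImage out = base.copy();
  out.loadPixels();
  pass.loadPixels();
  for (int i = 0; i < out.pixels.length; i++) {
    color b = out.pixels[i];
    color p = pass.pixels[i];
    // multiply each channel, then mix with the original according to opacity
    float r  = lerp(red(b),   red(b)   * red(p)   / 255.0, opacity);
    float g  = lerp(green(b), green(b) * green(p) / 255.0, opacity);
    float bl = lerp(blue(b),  blue(b)  * blue(p)  / 255.0, opacity);
    out.pixels[i] = color(r, g, bl);
  }
  out.updatePixels();
  return out;
}
```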
Short Project 6 - Introduction to Shading Networks
A shading network generally relates to the assets constructed in a 3D package to describe how the surface of an object will appear when rendered to screen, whether real-time or pre-rendered. Balancing computational efficiency with high quality in rendering is often where advanced knowledge of shading can set one piece of work apart from another. Two geometrically identical objects can appear distinctly different with two different networks applied. This short project introduced some basic concepts regarding the interaction of different shading components (also known as layers or nodes).
Term 2 - Processing & Visualisation
Exploring the possibilities in using data
The primary structure of my work in term 2 involved the exploration and development of data-driven artwork in communication. Data visualisation exists to explain complex data-driven subjects to an audience in an engaging and informative way. To do this requires the 'data visualiser' to understand both the information given and the potential audience requirement. This bespoke method can be seen being utilised in many different industries; what makes this design field interesting is the ability to effectively enlighten an audience's understanding of a concept, and in some cases offer a potentially paradigm-shifting view of a subject. Nasa.gov offers a rich source of data-related scientific information, with data visualisation seen as a key method in communicating research findings to a wide audience. The illustration used in the NASA 'Perpetual Ocean' project gathered ocean current data over a number of years, which was then enhanced and overlaid onto a three-dimensional motion graphic of the planet. Whilst the scientific data could be immensely useful to certain areas of the scientific community, it also exists as a beautiful example of data-based artwork, the ocean current flow markings evoking the brush strokes of a Vincent Van Gogh style painting. NASA's work is both technically and artistically accomplished, but how do they go about delivering such impressive work? After looking at various data visualisation professionals in my previous term, my work coming into this term has shifted toward the practical application of data visualisation. This has meant I have had to first evaluate the fundamental technical aspects of what might go into such visualisation. In recent years many institutions and governments have made steps to promote open data initiatives, allowing anyone to download and analyse the data gathered. The United Nations is one such agency promoting this initiative, in the hope that other public and private agencies can use the data productively.
Although the data is simply formatted for straightforward access, intuitively interpreting and visualising the data on offer proved decidedly challenging. We first have to understand how this data needs to be read, and also the context in which the data has been gathered. The book Data Points by Nathan Yau informs the reader in the art of statistical visualisation. As with any discipline, an understanding of the associated rules and conventions allows the user to better realise and clearly communicate their ambition. Data Points succeeds in breaking down the conventions of statistical visualisation, noting key questions for the creator to contemplate when attempting to clearly convey information.
A still image from the NASA 'Perpetual Ocean' animation. Retrieved from www.nasa.gov
Journalism in the Age of Data - Geoff McGhee
This YouTube documentary from 2010 discusses the impact and importance of data visualisation in journalism, and also the benefits and limitations currently facing this rapidly evolving area of research. Journalist Geoff McGhee sought to understand what constituted good data visualisation from bad by talking to various industry sources about the techniques, skills and tools needed to provide effective visualisations to a wide audience. The documentary provides a foundation with which to understand how data can be used effectively. It discusses the importance of narrative and the definition of context in attempting to communicate to an audience. Without these considerations the visualisation seriously risks being misunderstood, or simply overlooked by the recipient as the data fails to communicate its meaning. This documentary brought home to me the reality of data visualisation whilst also illustrating how much this field has continued to evolve from the documentary's creation to the present, with innovation continuing to deliver new ways of delivering information. The work of various luminaries such as David McCandless, Jer Thorp, Fernanda Viegas and Martin Wattenberg continues to inspire what might be possible, with my own work hopefully building on this by adding an immersive aspect that might further enlighten.
"...(in order for a data visualisation to achieve its aims) the individual must make the connection between data and real life." -- Nathan Yau (2012)'Data Points'
Learning to process the data
Data visualisation exists to explain complex data-driven subjects to an audience in an engaging and informative way. To do this requires the data visualiser to understand both the information given and the potential audience requirement. This bespoke method can be seen in all areas of design; what makes this field interesting is the potential to teach and enlighten the user, offering a potentially paradigm-shifting view of a subject. With this in mind, I had to investigate not only methods of acquiring data, but also a means to visualise that data. Although approximations of data display could be made using purely design-related software to construct an image, this would not be practical if the amount of data to be processed were too large, or if the data were likely to be revised and updated. Artist, developer and author Ben Fry clearly understands the issues surrounding practical data visualisation, and his founding involvement in developing the Processing programming language is testament to this. Fry's book 'Visualising Data' goes into great detail guiding the coder/reader through the sometimes complex methods and skills needed to interpret the ever-increasing amounts of available information. Coming from a design (non-programming) background, this has at times proved difficult for me, as this method of assimilating data and directing the visualisation parameters is difficult without a fully formed realisation of what needs to be communicated to whom. Fry discusses the method of interaction with datasets, illustrating the methodology with the sequence: acquire, parse, filter, mine, represent, refine, interact. This reminds the designer that there are a number of stages which need to be addressed to deliver a successful result. In attempting to learn the Processing language in such a short time frame, I have also found that utilising a variety of learning resources has given me a broader understanding of what the Processing development environment is capable of.
Above and opposite: Interactive Output Displays using Processing 2.1
The book Generative Design (2012), in addition to Visualising Data (2008), was used to gain an alternative perspective on harnessing the design and data-driven aspects of the language. Further practice-based learning was accumulated by joining a MOOC run by the online learning portal Coursera. 'An Introduction to Processing' gave users the chance to learn the basic structure of the Processing language and then carry out assignments which were subsequently peer-reviewed with feedback.
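As a small worked example of the early stages of Fry's sequence (acquire, parse, filter, represent), the Processing sketch below loads a CSV file and draws one bar per row. The filename 'undata_sample.csv' and its 'value' column are placeholder assumptions for illustration, not an actual UN dataset.

```processing
// Minimal Processing sketch following part of Fry's sequence
// (acquire > parse > filter > represent) on a hypothetical CSV file.
// 'undata_sample.csv' and its 'value' column are placeholder names.
Table table;

void setup() {
  size(800, 400);
  table = loadTable("undata_sample.csv", "header");   // acquire + parse
  noLoop();
}

void draw() {
  background(255);
  fill(40, 90, 160);
  float maxValue = 0;
  for (TableRow row : table.rows()) {                  // filter/mine: find the range
    maxValue = max(maxValue, row.getFloat("value"));
  }
  float barWidth = (float) width / table.getRowCount();
  int i = 0;
  for (TableRow row : table.rows()) {                  // represent: one bar per row
    float h = map(row.getFloat("value"), 0, maxValue, 0, height - 40);
    rect(i * barWidth + 2, height - h, barWidth - 4, h);
    i++;
  }
}
```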
Understanding Cinema4D
In attempting to visualise information I needed to fully grasp what the modern 3D package is capable of. During the first term I investigated real-time 3D techniques such as light mapping, and before embarking on 3D visualisation during this term, certain techniques would have to be researched before I could implement them effectively. Early in the term I investigated the 3D package Maxon Cinema4D. Following tutorials published by GreyScaleGorilla.com, I embarked on a project integrating modelling and dynamics simulation, followed by basic post-processing techniques in Adobe After Effects. Completing this project gave me a much fuller understanding of the Cinema4D workflow in comparison to that offered by Autodesk Maya. For me, its differences lie in the inherent power of the effector-based modelling and the greater emphasis on the nested object-based structure, in conjunction with the better designed attribute editor. In addition to the central tutorial series I also conducted a series of additional independent experiments. These experiments included:
• comparison of different surface deformation techniques - bump, normal and displacement mapping;
• introduction of High Dynamic Range lighting systems;
• addition of royalty-free sound effects and use of After Effects modifiers;
• network-based rendering in Cinema4D and After Effects.
Additional experiment - rendering a high-resolution oil lamp
The inception of Xuni AR
The RSA Student Design Awards
After researching and developing the different real-time techniques, data visualisation, and the potential for implementing a virtual environment, the opportunity to focus these different strands into a combined project presented itself in the form of the RSA Design Awards - Tomorrow's Workplace brief. The brief seemed to allow for my ideas to be developed within a more formal structure, with the additional requirement of a short business case statement. This meant elements of technology I had already taken an interest in researching could be developed into a theoretical concept. Initially, when conceiving the idea for the RSA project, I realised I would need to develop the background information in order to better ground the project within some sense of reality. In addition, the prototype hardware appeared to be sufficiently developed to handle the requirements of the concept, but in order to develop a real-world business proposition I would need to understand how the user might also develop their understanding of the virtual environment. Workers use technology all the time to improve productivity and refine ideas. How would our concept move the working environment forward, and what benefit might it bring to the end user?
These questions (alongside many others) needed to be researched and addressed in presenting a convincing concept. At this point I decided the project needed additional support, provided by recruiting specially selected group members who could provide feedback, background workplace research and design iteration. Consequently this would allow me to develop the concept much more quickly. My work covered project management, 3D visualisation and environment modelling. Wujia (aka Milka) Min provided expertise in researching the background issues and insights surrounding historical, current and potential future workplace designs, and co-wrote the initial business plan. Memunah Hussain generated the 2D concept illustrations for the devices, and also worked with me on a suitable colour palette and effective layout of the different submission boards. Memunah also worked alongside both myself and Wujia to develop the statistic-based information visualisations. The result was the invention of the Xuni AR system.
Fig. 12. 'Iteration of Mannequin' by Stephen Hibbert
Iterating a visual style
My three-dimensional visualisation work took a number of forms over the course of the project, gradually being refined and simplified in an attempt to reflect Memunah's two-dimensional drafts. In doing so I also learnt to utilise Memunah's actual Adobe Illustrator path files within Cinema4D to create accurate 3D object versions. In using 3D assets I was able to retain greater control over how I could communicate the wearable device concept. By manipulating the three-dimensional mannequin and environments we could then accentuate or focus in on any details necessary. This would not have been easily achievable using a conventional 2D system, as each image would have had to be fully re-drawn each time a change was made. My next exploration was in attempting to understand what the user of the 'Xuni AR' concept might actually see when looking through the Augmented Reality glasses. Although we continued to iterate the layout and background research to support the submission, I still had not communicated this key viewpoint to my group colleagues.
Iteration of shader used in Xuni AR illustrations
Although various concept visualisations could be found on the internet, we would still need to develop our own concept in order to distinguish our overall project from other efforts. I set about trying to develop a rudimentary environment combining the use of my own assets with some of the standard visualisation assets included with the Cinema4D package. Alongside this I began to delve deeper into the shader network system within Cinema4D. There are many elements to any shading network, each of which contributes to the appearance and perception of a scene. Cinema4D's shader capability, although initially daunting (though no more so than any other 3D package), was quickly becoming intuitive to use. The layout of the Cinema4D shader composition window allows users who may be familiar with the Adobe style of layered composition to partially understand the underlying concepts at work.
An early concept image for a virtual glass interface
Environment scanning
One of the techniques I wanted to communicate within the 'Xuni' RSA submission was real-time 3D environment scanning. Various methods and technologies now exist in this field, although until recently most solutions utilised expensive and time-consuming equipment and software. Various alternative techniques could be used, one of which is photogrammetry. In 2013 Autodesk released an Apple iPad app entitled '123D Catch'. By taking a number of photos around a central object, photogrammetry techniques can be used to generate a 3D virtual object of the photographed subject(s). This particular software is impressive, but can be time consuming and rather limiting in that it can only really deal with objects, rather than full environments. Professional implementation is much more refined, as can be seen in work such as 'The Vanishing of Ethan Carter' created by The Astronauts. Innovative devices are beginning to be brought to the wider commercial market, with a number of different vendors offering portable three-dimensional environment scanners. The 'Structure Sensor' designed by Occipital Inc attaches to an iOS device, allowing the user to scan an environment and view and edit the resulting 3D object. Independent developer Nigel Choi (formerly a Google web programmer) has been experimenting with a prototype Structure Sensor. After contacting Choi to ask how he had managed to develop this technique, he explained: "By taking this (sensor) offline. I coded the real-time point cloud viewer myself using the SDK they provided. I have been fortunate enough to get hands on a prototype unit and have met the Occipital guys." He also provided feedback on the 'Xuni AR' project concept: "I think your goal of enabling deeper collaboration for a remote workforce is very noble. There are lots of enabling technologies for these." He further suggested that I might be able to achieve a similar result using software developed by Occipital called Skanect, used with the Kinect 1.0 sensor.
Top: An example of the photogrammetry technique using Autodesk 123D Catch. Bottom: Real-time scan using the Structure Sensor, created by Nigel Choi.
"The heroes of twentieth century technology will be the teams of designers, anthropologists, psychologists, artists, and (a few) engineers who integrate the power of technology with the complexity of the human interface to create a world that makes us smarter, with us, in the end, barely noticing." -- Mark Weiser (1999) 'How computers will be used differently in the next twenty years'
Section 2 - Research and Testing
This section discusses the devices used as initial inspiration and their potential connection to each other. To understand how this final major project might work in practice, various software development tools were used to carry out experiments. The original hypothesis for this final major project - of using separate hardware devices in a unified software environment - was investigated with each device tested in isolation to demonstrate the sensor input required.
The theoretical setup
The Structure Sensor created by Occipital Inc is a portable depth sensor that can create a 3D map of its surroundings or of a focus object. A mobile smart device would be used to synchronise all the sensor data.
Unity® by Unity Technologies would be used to create and control a CG environment which incorporates the sensor data. The Myo armband created by Thalmic Labs incorporates sensors to read hand movement. It sends the sensor data wirelessly to a connected device.
During development, an Oculus Rift DK2 virtual reality head-mounted display would be used as an alternative to the smartglasses.
Smartglasses would be used to display an environment created on the smart device. This artist's impression incorporates the depth sensor into the smartglasses.
Theoretically, a depth sensor could be attached to the Oculus Rift; this would allow a basic environment to be developed and used for testing.
Kinect Sensor testing
Depth sensor testing was carried out using the Microsoft Kinect for Windows® (Version 1) depth sensor hardware. The Kinect sensor has a wide range of possible application fields, with Microsoft Research in particular publishing an extensive collection of materials dealing with environment capture and individual object recognition, as seen through the implementation of Kinect Fusion on a Windows-based operating system (Izadi, Kim, & Hilliges, 2011; Newcombe & Davison, 2011). Although not a mobile device (without custom modification), the Kinect sensor is a good indicator of the possible outputs that could be achieved.
Test One – Zigfu plug-in for Unity software
The Unity software plugin 'Zigfu ZDK' enabled direct display access to the Kinect sensor captures within Unity. However, this software is primarily concerned with individual skeletal-tracking integration, and its functionality regarding environment scanning is limited.

Test Two – Processing for Kinect software
Given the lower-level language access Processing for Kinect provides, a greater amount of control over individual Kinect sensor functionality could be implemented. If the Kinect continues to be used, the Processing integration may be extended in future testing, particularly if Unity® software proves unsuitable for use in this project over the longer term. However, further investigation of the Processing language support will be necessary.
Above: Screenshot of a basic Processing for Kinect sensor display output. Left: Zigfu Unity plug-in showing Kinect sensor outputs.
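A minimal version of the Test Two depth display can be written in a few lines, assuming the 'Open Kinect for Processing' library (org.openkinect.processing) and a Kinect for Windows v1 sensor; the API names below are taken from that library and may differ between versions.

```processing
// Minimal depth display sketch, assuming the 'Open Kinect for Processing'
// library (org.openkinect.processing) and a Kinect v1 sensor.
import org.openkinect.processing.*;

Kinect kinect;

void setup() {
  size(640, 480);
  kinect = new Kinect(this);
  kinect.initDepth();                    // start streaming depth frames
}

void draw() {
  background(0);
  // The library converts the raw depth buffer into a greyscale image,
  // similar to the basic display output described in the caption above.
  image(kinect.getDepthImage(), 0, 0);
}
```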
Test Three – Skanect software
'Skanect' (Occipital Inc, 2014) environment capture software was used to create a three-dimensional mesh of the physical environment immediately surrounding the author's test space. By capturing depth and point information using the Kinect sensor, a high-density 50,000-point mesh was calculated within the Skanect software. The limitations of the free software license meant a restricted low-density 5,000-point mesh was then calculated for export into the Cinema4D program for rendering and display. Although basic in export complexity, the Skanect Free software proved the viability of generating a 3D mesh from depth sensor data. The creators of Skanect, Occipital Inc., have since produced a much-improved portable depth sensor entitled the 'Structure Sensor' and associated 'Structure SDK', which allows for much higher fidelity and potentially near-real-time dense-mesh generation. It is therefore assumed that such a sensor might theoretically be included in other portable hardware in the future. This would offer the possibility of real-time environment scanning which could then be topologically reduced, refined or amended by a central processing unit's in-built software algorithms, such as those used in the Kinect Fusion application (Izadi et al., 2011). In addition, contemporary research using other devices has demonstrated the potential for automatic surface reconstruction using various scanning techniques, including light-field generation (Kim, Zimmer, & Pritch, 2013) and automatic large-scale reconstruction of scenes using LiDAR data (Lin et al., 2013). Extensive research also exists for Vision Based Tracking systems (Appendix 4.4), which utilise point cloud data similar to that gathered by the Kinect sensor device. This research further demonstrates the potential for a portable depth-sensing device. A depth and motion sensing mobile hardware project currently in development, entitled 'ATAP-Project Tango' (Google Inc, 2014a), has also demonstrated the potential viability of and interest in this field with the introduction of an Android-based tablet featuring integrated depth and motion sensing technology (Google Inc, 2014b).
Top: Cinema4D render of a low-density room scan taken from Skanect software. Bottom: Skanect software capturing the author's office environment.
Augmented Reality Glasses Testing
Due to the limited availability of augmented reality devices to test at the time of writing, an alternative device was used in testing, which would allow similar stereoscopic display features to be evaluated within the Unity environment. Furthermore, use of this device could also be tested on third-party subjects, gauging their overall perception of stereoscopic visual overlay. The Oculus Rift DK2 display is a recently released (July 2014), relatively low-cost implementation of stereoscopic head-mounted display (HMD) hardware, designed with a number of key enhancements over currently available and pre-existing counterparts. Internally it integrates numerous sensors, including a gyroscope, accelerometer and magnetometer, allowing momentum and rotation tracking. Externally it has the addition of a near-infrared CMOS sensor to track infrared LED points integrated into the external casing of the HMD, which allows the system to track lateral head movement. Together, this allows the unit to translate the on-board sensor information and feed it into an update of an internal view of a virtual environment, displayed on the lightweight, low-persistence OLED screen inside the HMD. In addition, the Oculus Rift has provision for an additional input on the HMD itself, which could provide the option of a 'video see-through' AR device at some point. Testing of the Oculus Rift setup was conducted using version 0.3.2 of the Oculus SDK for OS X, although the positional tracking enabled by version 0.4.1 could not be used at the time of testing due to incomplete drivers for the OS X operating system. This meant lateral movement was not detected, although an acceptable, immersive experience was still possible to implement due to the Unity 0.3.0 integration support. A range of third-party beta-state (non-positional tracking) software programs, which had been in development prior to this project's test phase, was experienced. Three subjects were used to test the HMD, alongside the author, with short test sequences recorded for overall impression feedback. Various third-party beta software titles were used in the testing phase to gauge experiences both from a fixed point and in reaction to fast movement within an environment.
Top: Test subject experiencing the Oculus Rift Radial-G demo © (Tammeka Games, 2014). Bottom: Wireframe of the Creative Arts Building.
Responses were overall very positive, with the unique visual enhancement offered by the hardware overriding any shortcomings inherent in the pre-release beta-state software. In addition, a to-scale replica environment of the Creative Arts Building (CAB), University of Huddersfield was integrated into a Unity software scene file to simulate a real-world location. Only basic un-optimised geometry was displayed in order to maintain a viable average frame rate of 60fps (Oculus VR Inc, 2014a). Despite the scene file's sparse decoration, a straightforward immersive experience was created with no scripting or programming required. This offers the future possibility of further enhancement, with various graphical optimisations and accoutrements being added to further enhance the test subjects' feeling of immersion. Finally, for contrast to the full dynamic immersion tests, a pre-rendered stereoscopic image sequence was created using various specialist render settings within Cinema4D (Maxon Computer GmbH, 2014). Using Cinema4D's standard camera asset, various test renders were initiated to determine a best-use setting. A merged stereoscopic image in 'side-by-side' mode with a 64-pixel additional parallax was noted to be the most effective in generating a convincing three-dimensional effect, and was used in gauging test subject responses. It should be noted that additional stereoscopic camera settings available within Cinema4D, such as 'Placement' and 'Zero Parallax', were not used in this experiment. With this in mind, although various artefacts were present during testing, including chromatic aberration and 'texture swim', the author can realistically assume these would be eradicated with more development time. The brief test period allowed all test subjects to familiarise themselves with a full virtual reality environment for the first time. Contrary to widespread media reports regarding the Oculus Rift and similar HMDs, no issues regarding nausea due to motion-sickness-like effects were reported at any stage.
Top: A stereoscopic screenshot taken from the Unity player, showing the 'in-engine' view of the Creative Arts Building. Bottom: Stereoscopic image sequence test.
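The general principle behind the side-by-side output can be illustrated with a short Processing sketch: the same scene is rendered from two horizontally offset cameras and merged into a single frame. The eye separation used here is an arbitrary value for demonstration, not a reconstruction of the Cinema4D parallax settings described above.

```processing
// Illustrative sketch of the side-by-side stereo principle: one scene rendered
// from two horizontally offset cameras and displayed in a single merged frame.
// The eye separation is an arbitrary example value.
PGraphics leftEye, rightEye;
float eyeSeparation = 12;   // world-space offset between the two cameras

void setup() {
  size(1280, 480, P3D);
  leftEye  = createGraphics(640, 480, P3D);
  rightEye = createGraphics(640, 480, P3D);
}

void drawScene(PGraphics pg, float eyeOffset) {
  pg.beginDraw();
  pg.background(20);
  pg.lights();
  // shift the camera horizontally by half the eye separation
  pg.camera(eyeOffset, 0, 400, 0, 0, 0, 0, 1, 0);
  pg.noStroke();
  pg.fill(200, 120, 60);
  pg.rotateY(frameCount * 0.01);
  pg.box(120);
  pg.endDraw();
}

void draw() {
  drawScene(leftEye,  -eyeSeparation / 2);
  drawScene(rightEye,  eyeSeparation / 2);
  image(leftEye, 0, 0);       // left half of the merged frame
  image(rightEye, 640, 0);    // right half of the merged frame
}
```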
Gesture and UI research
The proposed Xuni AR interface must take into consideration an individual user's mixed-reality view of the world. The proposed system works by interlinking the spatial data from the depth sensor device with the HMD orientation sensor measurement, and outputting to the 3D engine displaying the computer graphics within the HMD device. Therefore a virtual interface could theoretically be constructed which uses this spatial data to set the bounds within which to project content. In addition, given the additional sensor arrangement included with many commercial depth sensors, including the Kinect tested here, the possibility exists that the CG interface might be able to use colour data gathered from the physical space to influence the colour, lighting and texture of the CG objects. Potentially this would lead to greater visual harmony and integration between the two spaces when mixed together. In attempting to visualise how the Xuni AR system might look, a design was extrapolated from findings gathered in the research phase alongside references to other AR and user interface designs. These designs were adapted into a short animation sequence by the author using Cinema4D. This serves the purpose of communicating to the audience both the idea behind the project and the considerations for its practical implementation.
Gesture Recognition
The design for the Xuni AR system incorporates the use of a pair of Myo armbands to detect and automate control of content displayed on the AR HMD. The Myo armbands combine custom-designed, battery-operated electromyographic (EMG) sensors with other built-in orientation-based sensors (Attenberger & Buchenrieder, 2014). They use these sensors to accurately read changes in arm muscle tension, orientation and acceleration, and wirelessly communicate this information to a central control device using Bluetooth Low Energy 4.0 technology. In brief, the Myo armbands work by detecting all this information and matching it to a pre-defined set of gestures, using specially designed algorithms to filter out random noise generated by unwanted gestures (Nuwer, 2013). In future these pre-defined gestures could be amended or added to using scripting by the Myo developer, tailoring the device's use to the intended requirements. Incorporating these factors into the design of the Xuni AR system should allow for measured, responsive motion feedback when viewing content within the AR HMD display. Using the Myo armbands for reference, further research was gathered regarding gesture recognition, and in particular the work of HitLabNZ at the University of Canterbury, Christchurch, New Zealand. Their study 'User-defined gestures for augmented reality' (Piumsomboon et al., 2013) records extensive blind testing of various hand poses or 'tasks' that might be implemented within an AR-related interface. Using this information, three distinct tasks were chosen to be implemented in the visualisation of the Xuni system, shown in Table 1.
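The matching step described above can be sketched conceptually as follows. This is an illustrative assumption of how smoothed sensor readings might be compared against a small pre-defined gesture set, not the Myo SDK or Thalmic Labs' actual algorithms; the thresholds, gesture names and single 'tension' value are invented for the example.

```processing
// Conceptual sketch only: matching smoothed sensor readings against a small
// set of pre-defined gestures. The thresholds, gesture names and the single
// 'muscle tension' value are illustrative assumptions, not the Myo SDK.
float smoothedTension = 0;
float smoothedRollRate = 0;
float smoothing = 0.9;          // simple low-pass filter to suppress sensor noise

String classifyGesture(float tension, float rollRate) {
  // Require a firm grip before any gesture registers, filtering out
  // incidental arm movement ('unwanted gestures').
  if (tension < 0.5)   return "rest";
  if (rollRate >  0.8) return "rotate-clockwise";     // e.g. spin interface forward
  if (rollRate < -0.8) return "rotate-anticlockwise"; // e.g. spin interface back
  return "grab";                                      // firm grip, arm still
}

void setup() {
  size(400, 200);
}

void draw() {
  background(255);
  // Placeholder inputs standing in for live EMG and gyroscope streams
  float rawTension = noise(frameCount * 0.02);
  float rawRollRate = sin(frameCount * 0.05);
  smoothedTension  = lerp(rawTension,  smoothedTension,  smoothing);
  smoothedRollRate = lerp(rawRollRate, smoothedRollRate, smoothing);
  fill(0);
  text("gesture: " + classifyGesture(smoothedTension, smoothedRollRate), 20, height / 2);
}
```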
AR and VR user interface design
View management involves computing what a user sees within a 3D environment when it is projected onto a display screen. Doing this interactively to create effectively laid out user interfaces for augmented reality, virtual reality, or any other kind of 3D interface requires representing processing and screen space. – P.110 Emerging Technologies of Augmented Reality: Interface and Design (B. Bell & Feiner, 2000, 2001; Haller et al., 2007)
As briefly referenced above, there are numerous physical and aesthetic considerations required when designing a virtual interface, not all of which can be covered in this paper. Therefore various key components were reviewed by the author and integrated into a short visualisation animation, with a view to amending the design in a future publication. The paper 'Lessons Learned in Designing Ubiquitous Augmented Reality User Interfaces' (Haller et al., 2007; Sandor, 2008) identifies three classes of 'criteria pertaining to the task(s) that need to be executed':
• Task-specific criteria – for specific tasks carried out within an application.
• System-specific criteria – relating to the unique capabilities of the designed-for AR system.
• User-specific criteria – dependent on ergonomic human factors and anthropology.
As the Xuni AR has a unique combination of inputs and sensory information to draw on, both task- and system-specific criteria were considered in the prototype design of the interface. Initially the author investigated two current high-profile, well-tested, two-dimensional implementations of navigation, one AR and one traditional screen-based.
1. Glass by Google Inc.
Although not directly comparable, Glass would provide useful insight into what to consider when controlling an interface without a physical device held in the hand. Glass is one of a number of products recently announced, either in development or commercially released, which have come to be categorised as 'smartglasses'. Other examples include Epson Moverio, Recon Jet, Oakley Airwave, and Technical Illusions castAR. Each offers different in-built variations on augmented reality interface design and control, and a number of these are also beginning to be supported by Myo armbands (Thalmic Labs Inc, 2014). The Google Glass interface is built on the Glass development platform, which has a comprehensive design reference resource available to developers (Google Inc, 2014c) intending to develop for the Glass interface. Control of the Glass interface is primarily through a voice-activated invocation model, with simplified touch control available by touching the side of the hardware unit itself, providing access to the 'timeline' side-scrolling interface. Although different in its applied control to the proposed Xuni system, the Glass interface guidelines offer some useful principles that could potentially be transferred to other application control interfaces in the AR space. For example, the tiled 'card' visual display framework offers a unified design approach developers should consider, providing a unique, simple style along with resources including 'card templates, a colour palette, typography and writing guidelines to follow whenever possible' (Google Inc, 2014c).
2. 'Quick Look' and 'Cover Flow' by Apple Inc
As shown in Table 1, the hypothetical Xuni AR interface incorporates a potential design for navigating multiple items using a rotary-styled movement. This design shares some similarities with the Apple Inc. patented 'Quick Look' and 'Cover Flow' interfaces used as optional navigation tools in various applications within the current OS X interface (Apple Inc., 2014b), particularly within the central 'Finder' file navigation service. Apple provide extensive User Experience guidelines (Apple Inc., 2014a) for developers of products using the Cocoa and Cocoa Touch frameworks on both OS X and iOS. In the publicly viewable Apple WWDC 14 session video 'Designing Intuitive User Experiences', Apple User Experience Evangelist Mike Stern presents some key points to consider in ensuring these experiences are consistent, so that users positively engage with the app and associated device.
The more apps behave as we expect them to, the more intuitive they are to us. The more intuitive apps are, the easier it is for us to concentrate on our true objective. The very best user interfaces are so intuitive, so natural, that they just sort of disappear and allow us to focus on what truly matters, that's what we call intuitive. – Mike Stern, Apple Inc. (2014)
Clockwise from top: Google visualisation of Glass interface operation; Myo integrated with smartglasses; Apple Cover Flow layout; Glass card design guidelines.
Section 3 - Visualisation
Visualisation takes the findings from the research study undertaken previously and documents the construction of a theoretical interactive interface system using the 3D visualisation software Cinema4D and the compositing software Adobe After Effects CC. The resulting animation is called 'Another Reality'.
3.1. Capturing the source footage
3.1.1. Camera Equipment and settings
Camera Body: Canon 5D MkII 35mm SLR
Camera Lens: Canon EF 24-70mm f/2.8 Ultrasonic
Settings (dependent on shot to be tracked):
ISO: 100 - 200
F-Stop: 6.3 - 9.0
Shutter Speed: 1/30 – 1/60 second
Lens: 24mm wide angle
Focus depth: Manual - long distance
White Balance: Auto (centre weighted)
Colour space: sRGB
Movie settings: QuickTime MOV; Res. 1920x1080 px at 25 fps
Above: Camera settings manually fixed for all video footage taken. Opposite page: A selection of the reference images taken.
3.1.2. Importance of fixed capture settings
The video element of the visualisation sequence required a number of elements to be carefully considered prior to the addition of the computer graphic elements. To capture the source footage various cameras were tested, with the Canon 5D MkII eventually selected. In order to create a consistent motion track, various settings would need to be manually set and fixed. If these settings were automatically controlled and adjusted by the camera, various issues would have arisen when attempting to calculate the motion track in Adobe After Effects, with any changes in focus depth in particular resulting in failed analysis of motion track data. With the settings listed above, the intention was to capture footage that was well light-balanced and noise-free, although issues with motion blur were unavoidable given the intended handheld motion required. In addition to these settings, the sun position was noted for area lighting strength and direction, with subsequent footage purposely shot at similar times of day (approximately 1430 hrs) to maintain consistent overall lighting conditions. In a larger production these settings might be compensated for with the use of a light meter and/or colour correction in post-production.
3.2. Marker-less Motion Tracking
In order to visualise the Xuni Augmented Reality system I decided to integrate full motion video with computer graphics. This involved the use of marker-less motion tracking in order for the computer graphic elements to be positioned correctly within the video scene. By generating virtual tracking points within the scene using analysis tools built into Adobe After Effects, certain data, such as the floor level, was approximated by analysing elements within the footage and their movement relative to the camera view. This data was then exported into Cinema4D, allowing the user to construct computer-generated geometry with relatively correct position and orientation. Initial testing was mixed, with various technical issues resulting in unstable CG geometry appearing on top of the video track. This was due to the unstable motion quality of the video track overall producing a less than satisfactory result when generating Track Camera data, at an average error of 1.48 pixels (a result of <0.5px is generally seen as very good). This was mitigated somewhat by re-processing a tighter selection of video footage using the Track Camera, and then selecting a greater number of resulting capture points with which to generate a null ground plane object. After a number of attempts the motion track average error fell below 1.0 pixel. Even with the increased motion track accuracy, numerous other issues had to be overcome in order to generate a satisfactory match between the video and CG assets. In particular, the way in which After Effects and Cinema4D calculate their relative geometry co-ordinates differs significantly, with After Effects calculating from the top left of the 2D frame and Cinema4D calculating from the 3D scene space centre. This can be mitigated somewhat by setting the ground plane and origin within the motion track to match the Cinema4D scene. Object sizing is also interpreted differently in each software package. The size of 3D objects calculated in an After Effects project translates into scale attributes in Cinema4D, rather than the preferred 'object size' attributes. This means objects built to scale in Cinema4D can often fail to match the expected scale imported from After Effects.
Further scale adjustment therefore needs to be applied. Compounding this issue, each motion track calculates a scale unique to each video clip, so each CG scene needs further adjustment to compensate. Although this can be catered for in most situations, when using a handheld POV (point of view) motion camera it can lead to difficulties in geometry scale matching between different shots, requiring a tailored scale for each scene. Finally, the camera motion track exported from After Effects into Cinema4D often generates a number of errors if subsequently moved or scaled (using a Null parent or otherwise) to fit a scene. The only way to compensate for this is to import the computer graphic objects into the 'Motion Camera scene file' and build around the motion track position, scale and rotation key frames.
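The origin difference described above can be illustrated with a small helper function. This is not the actual export maths used by either package, only a sketch of the top-left, Y-down convention being re-centred into a Y-up, centre-origin space; the composition dimensions are example values.

```processing
// Illustrative helper only: the origin difference described above, not the
// actual export maths used by After Effects or Cinema4D. After Effects
// measures layer positions from the top-left of the composition with Y
// increasing downwards; Cinema4D measures from the scene centre with Y up.
float compWidth = 1920;
float compHeight = 1080;

// Convert an After Effects style position into a centred, Y-up position.
PVector aeToCentred(float aeX, float aeY, float aeZ) {
  float x = aeX - compWidth / 2;    // shift origin from left edge to centre
  float y = compHeight / 2 - aeY;   // shift origin to centre and flip Y
  float z = aeZ;                    // depth conventions also differ per export
  return new PVector(x, y, z);
}

void setup() {
  // A point at the top-left corner of the comp maps to (-960, 540, 0)
  println(aeToCentred(0, 0, 0));
  // The comp centre maps to the scene origin
  println(aeToCentred(compWidth / 2, compHeight / 2, 0));
}
```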
The motion track process, clockwise from top left: original footage; points calculated; solid objects created for reference; Cinema4D assets correctly positioned using acquired motion track assets.
3.3. Storyboard Development
In order to aid the visualisation process, a storyboard was created using information gathered from the Motion Tracking development process. Certain shots were subsequently cut from the final sequence based on continuing evaluation and refinement. The Adobe Ideas app for Apple iPad was used to digitally sketch each frame, with motion indicators added to the composition where additional description was needed. This process helped enormously in giving direction to the project visualisation. By estimating timing for each shot, further progress could also be made with the computer graphic pre-visualisation.
3.4. Compositing video footage
The 'Controllable' shot seen in this animation utilises a number of different techniques layered together to create the illusion of a controllable interface. Among the elements in developing this shot was the requirement to illustrate a physical hand movement on-screen. This needed to synchronise with the movement of a computer-generated animation of a virtual interface rotating in the background. In order to successfully sync these two elements, the decision was made to superimpose the hand movement so that individual timing adjustments could be made separately to the background motion track. Various methods exist to composite areas of footage together into the same video piece. During this project the two main options available within Adobe After Effects CC were used in order to discover the most suitable aesthetic. In total, after using both techniques, the masking stage of this project took around 30 hours to fully complete, and still contains some artefacts.
Top: An early attempt to cut out the hand using multiple masks. Bottom: This method requires careful manipulation of mask points.
3.4.1. Traditional Rotoscoping
Initially a traditional rotoscoping method was used. This involved creating masks for each component that would require 'cutting out' of its current background and 'pasting' into a new background. These masks could then have their individual vertices altered to match the movement displayed on-screen. Although historically this is the primary method used in compositing footage elements, it is notoriously time consuming to implement, and it is difficult to achieve a realistic, compelling result. The compositor has to consistently balance a number of different elements, including vertex position, tangent length and tangent angle, over multiple frames of movement. This means understanding the movement contained within the whole shot being worked on and carefully judging the required use of each element, as each change is carried forward into the later frames. If ill-judged, this can lead to unwanted flickering or unintentional movement, leading the audience to notice a 'cut-out' effect when watching the film back. This method also suffers from difficulty in implementing a 'feather' effect on the mask edge, which can lead to either too much transparency along the mask edge or not enough, resulting in a 'hard edge' cutout.

3.4.2. The 'Rotobrush' tool
The second method employed uses the Rotobrush tool within Adobe After Effects. This recently developed tool has been created primarily to alleviate the difficulties with traditional rotoscoping described above. Various algorithms within the tool help to track pixel shift between each frame, with the user required to make moderate adjustments where miscalculation has occurred. The tool uses information gathered from prior frames to aid better calculation, so as more frames are processed, the tool becomes more accurate in its prediction.
Pink outline of hand position created using the rotobrush tool
3.4.3. Repeating Animations
After each hand movement section had been rotoscoped, it was then duplicated, offset and time-reversed to create a mirrored animation sequence which could be replicated, as seen in the picture below. This screenshot is an example of one hand movement repeated three times. All of the key frames created in the original sequence 'MVI_9761.MOV Original Slow 2' oscillate back and forth up the layers to 'MVI_9761.MOV Original Slow 7', creating the illusion of the hand moving forward and then back into its original position. The difficulty with this technique lies in transitioning from one animation pose to another, which involved morphing the foreground footage to compensate.
Keyframed sequences are layered and offset to create repeat hand moves
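The 'forward then reversed' repeat can be expressed as a simple ping-pong index, sketched below; the sequence length is a placeholder value and the sketch only visualises the oscillation rather than driving actual footage layers.

```processing
// Small sketch of the 'forward then reversed' repeat described above:
// a ping-pong index maps a continuously increasing frame count onto a
// sequence that plays forwards, then backwards, then forwards again.
int sequenceLength = 50;   // placeholder number of rotoscoped frames

int pingPongIndex(int frame, int length) {
  int cycle = 2 * (length - 1);            // one full there-and-back pass
  int i = frame % cycle;
  return (i < length) ? i : cycle - i;     // reverse on the way back
}

void setup() {
  size(400, 120);
}

void draw() {
  background(255);
  int idx = pingPongIndex(frameCount, sequenceLength);
  fill(0);
  text("source frame: " + idx, 20, 40);
  // visualise the oscillation as a moving marker
  float x = map(idx, 0, sequenceLength - 1, 20, width - 20);
  ellipse(x, 80, 10, 10);
}
```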
3.5. Developing the CG environment
3.5.1. Designing the CG elements
Each shot – 'Wearable', 'Controllable', and 'Immersive' – incorporated geometry designed and created especially for this project. The 'Wearable' shot features three separate objects that were developed out of some initial visualisation work in the previous 'Exploration In Practice' module. Each object was modified from some original reference objects using Adobe Illustrator paths as the initial curves. These curves were initially extruded in various ways to create a 3D object, and then modified with deformers and tools in Cinema4D. This allowed for 'non-destructive' deformation of the objects, making further adjustments to any early deformation actions much easier to perform. The object shading was achieved through both modification of pre-built shaders available with Cinema4D (Studio Edition) and custom-built shaders using various different channel attributes. Object tags were then used to control the lighting, scene display and rendering attributes. The 'Controllable' shot used similar techniques in object construction, with the addition of a MoGraph cloner to generate the 'spin wheel' effect. The 'Immersive' shot expanded the use of these object effectors, incorporating seven different types to achieve the final animation (see 3.5.2).
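The 'spin wheel' arrangement produced by the MoGraph cloner can be illustrated generically: a fixed number of copies of one element placed at equal angles around a centre and rotated as a group. The sketch below is only an analogy for what the cloner does inside Cinema4D; the clone count and radius are arbitrary.

```processing
// Illustrative sketch of what a radial cloner produces: a fixed number of
// copies of one element arranged around a centre, here spun slowly to echo
// the 'spin wheel' interface. Count and radius are arbitrary example values.
int cloneCount = 12;
float wheelRadius = 180;

void setup() {
  size(600, 600, P3D);
  noStroke();
}

void draw() {
  background(25);
  lights();
  translate(width / 2, height / 2);
  rotateZ(frameCount * 0.01);                 // the 'spin' of the wheel
  for (int i = 0; i < cloneCount; i++) {
    float angle = TWO_PI * i / cloneCount;    // evenly spaced around the circle
    pushMatrix();
    rotateZ(angle);
    translate(wheelRadius, 0, 0);
    fill(80 + 10 * i, 140, 220);
    box(60, 40, 10);                          // stand-in for one interface card
    popMatrix();
  }
}
```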
'Wearable' Shot
1. Initial camera track test
2. Basic furniture added
3. Main assets inserted and lit
4. Furniture change to match video
5. Mask for 2D overlay created
6. Mask position linked to track
7. Mask layered over CG assets
8. CG and text added to video

'Controllable' Shot
1. Initial camera track test
2. Early test 'spinner' interface
3. Effector added to new 'spinner'
4. Assets merged with motion track
5. Rotoscope of hand movement
6. Hand move matched to CG animation
7. Textures added to spinner
8. Text added to video

'Immersive' Shot
1. Initial camera track calculated
2. Tracked assets exported to C4D
3. Early matrix-based effector test
4. Matrix & tracked assets merged
5. Track matched with matrix animation
6. Matrix replaced with spheres
7. Effectors attached to camera path
8. CG added to video footage
3.5.2. Environment lighting
Integrating the computer-generated assets was helped by the inclusion of a spherically distorted image of the filmed environment setting. Two photos of the Atrium space used for filming were edited in Adobe Photoshop. Using various effects filters they were first distorted spherically. They were then offset horizontally to allow for the construction of a seamless edge. By attaching this image to a spherical skybox and adjusting its shading properties to emit light, the affected objects could simulate the reflection and refraction of light evident in the video footage. In addition, the same image was used as an environmental reflection texture, seen in the coloured orbs' reflections as the camera passes through. Because this photo would not be seen directly by the viewer, certain extra elements were omitted to conserve both development and render times. This spherical map construction technique was only partly implemented during this project, with further stages of refinement required if the image were to be used as an actual background for the CG objects. Also, the images, although of high quality, were not true layered HDR images - these would have required a minimum of three bracketed-exposure passes per still image to be composited together before the distortion took place, and this proved to be unnecessary in this particular project. A true spherical HDR image is created using a number of other features and items of equipment not available for use in this project, although the techniques involved were investigated during development. Techniques involving the use of fisheye or rectilinear camera lenses, light probes, and a spherically adjustable panoramic tripod head would normally be used.
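The horizontal offset step is straightforward to express in code. The sketch below shifts an equirectangular-style image sideways with wrap-around so the seam falls in the middle of the frame where it can be retouched; 'atrium_pano.jpg' is a placeholder filename and no other part of the Photoshop process is reproduced.

```processing
// Minimal sketch of the horizontal offset step described above: shifting a
// panoramic image sideways with wrap-around so that the seam can be inspected
// and retouched in the middle of the frame. 'atrium_pano.jpg' is a placeholder.
PImage pano;

void setup() {
  size(1200, 600);
  pano = loadImage("atrium_pano.jpg");
  PImage shifted = wrapOffset(pano, pano.width / 2);   // move the seam to the centre
  image(shifted, 0, 0, width, height);
}

PImage wrapOffset(PImage src, int offset) {
  PImage out = createImage(src.width, src.height, RGB);
  src.loadPixels();
  out.loadPixels();
  for (int y = 0; y < src.height; y++) {
    for (int x = 0; x < src.width; x++) {
      int srcX = (x + offset) % src.width;             // wrap around the right edge
      out.pixels[y * out.width + x] = src.pixels[y * src.width + srcX];
    }
  }
  out.updatePixels();
  return out;
}
```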
3.5.3. The Creative Arts Building
Although the motion track and environment lighting described above helped to integrate the computer graphic assets into the scene, it was nevertheless difficult to visualise the scale of the surrounding environment when trying to fully match the virtual with the physical. In addition, the associated work on environment scanning meant there might be a requirement for a CG replica of a real environment to use as an asset in any real-time extension of the project. With all of this in mind, the decision was made to re-create the Creative Arts Building Atrium area.
3.5.4. Using Cinema4D 'Effectors'
Various 'Effectors' available within Cinema4D were used to displace animated geometry in both the 'Controllable' and 'Immersive' shots. Creating the animation in the final 'Immersive' shot required the use of displacement effectors. By attaching these effectors to the camera motion path and then animating various properties within them, the geometry within the scene could be changed or displaced. For example, the 'Falloff' property limited the area of effect of each effector, allowing the related geometry to expand and contract its position, rotation and scale attributes dependent on distance from, and strength of, the effector. By combining a number of effectors at various points, the sphere-shaped cluster appears to dynamically avoid contact with the user walking through, subsequently immersing the Xuni AR user in the CG content alongside the physical environment. A simplified sketch of this falloff behaviour is given below.
Illustration of multiple effectors (orange highlight) used to create displacement of geometry (top) as the camera passes through the space (bottom).
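As a simplified sketch of that falloff idea (plain Python, not Cinema4D's effector system; names and values are illustrative assumptions), each sphere can be pushed away from an effector point with a strength that fades linearly to zero at the edge of the falloff radius:

    # Plain-Python sketch of a linear falloff displacement; not Cinema4D code.
    import math

    def falloff(distance, radius):
        """Full strength at the effector centre, fading to zero at 'radius'."""
        return max(0.0, 1.0 - distance / radius)

    def displace(sphere_pos, effector_pos, radius, strength):
        """Push a sphere directly away from the effector, weighted by falloff."""
        dx, dy, dz = (s - e for s, e in zip(sphere_pos, effector_pos))
        dist = math.sqrt(dx * dx + dy * dy + dz * dz) or 1e-6
        w = strength * falloff(dist, radius)
        return (sphere_pos[0] + dx / dist * w,
                sphere_pos[1] + dy / dist * w,
                sphere_pos[2] + dz / dist * w)

    # An effector riding the camera path nudges a nearby sphere out of the way.
    print(displace((0.3, 1.5, 2.4), (0.0, 1.6, 2.0), radius=2.0, strength=0.5))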
3.5.5. Simulating Global Illumination
After experimenting with various shading properties to create a satisfying 'particle' effect for the 'Immersive' shot, a tutorial published by Nick Campbell at Greyscalegorilla.com appeared to offer an interesting aesthetic, based on 'Boolean Taxidermy', a work by the renowned art studio Zeitguised. By implementing a colour-based effector to vary colour with distance from the sphere centre, and activating the 'Chan Lum' effect in the shader network, an interesting style of shading could be achieved when animating an object, without the distortion and render cost often associated with animating objects using Global Illumination. The assets used in the 'Immersive' shot differ from Campbell's work in many other areas, omitting elements such as dynamics-based evaluation, sub-surface scattering and post-render effects. The 'Immersive' render also differentiates itself through its use of various deformation techniques using effectors, including inheritance, plain and target effectors, and adds environmental lighting and the Physical renderer.
3.5.6. Physical Renderer
The physical render settings within Cinema4D enable certain features that imitate those of a real camera. Although at first this appears to offer the designer a perfect opportunity to create photo-realistic renders, especially when the intention is to composite over still images or live footage, the physical renderer actually requires very careful calibration to be used effectively. As the Cinema4D help file states, "In most cases you should use the normal CINEMA 4D renderer (Standard), which itself is very fast and stable." The reason for this is that the physical renderer, by its very nature, degrades the fidelity of the images it creates; for example, image artefacts present in physical cameras, such as chromatic aberration caused by lens distortion, appear in the final image. It also requires much greater levels of calculation due to the additional effects being evaluated; for example, the user cannot render with hard shadows enabled, so processor-intensive area shadows have to be calculated instead. It is therefore advisable to use the physical renderer only in the correct context, and only then if sufficient results cannot be achieved using the standard renderer. For this reason the project used the Physical renderer only for the final 'Immersive' shot, with render times increasing roughly ten-fold on average, from around 7 minutes per frame to over 1.5 hours per frame at certain points.
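A rough back-of-the-envelope check of those render times, using the per-frame figures quoted above and the 1000-frame sequence length referred to in 3.5.9 (approximate values only):

    # Approximate totals implied by the per-frame render times quoted above.
    frames = 1000                   # 40 seconds at 25 fps (see 3.5.9)
    standard_min_per_frame = 7      # standard renderer, approximate
    physical_min_per_frame = 90     # physical renderer at its slowest (1.5 hours)

    print(f"Standard renderer:  ~{frames * standard_min_per_frame / 60:.0f} hours total")
    print(f"Physical renderer:  ~{frames * physical_min_per_frame / 60 / 24:.1f} days at worst-case speed")
    print(f"Worst-case slowdown: {physical_min_per_frame / standard_min_per_frame:.1f}x")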
3.5.7. Correcting Visual Artefacts
The physical renderer was primarily employed for its ability to render physically correct motion blur. Although post-processing effects would have been quicker to implement and would have approximated a similar result, they suffer from an inability to accurately calculate depth-related motion and volume space correction, especially with complex compositions (such as the overlapping spheres used here). However, including motion blur calculation is not without its potential to create further problems. When rendering this project, for example, various artefacts started to appear when motion blur was applied that had not appeared in the non-blur render. The solution was found through close analysis of the camera's motion path keyframes. During import from the After Effects motion track this motion path had been interpreted realistically aside from one key area: the rotation values had been translated into the positive 0°-360° range only, meaning that if a rotation crossed from 0° into negative numbers, for example -1°, the system interpreted this as 359°, causing a near-full 360° rotation to occur within a single frame. As a result the motion blur was calculated incorrectly, resulting in severe distortion to the image. A small corrective sketch is given below.
Timeline displaying misinterpreted keyframes, resulting in a severe motion blur calculation error (top).
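A minimal sketch of the corrective step (plain Python; the keyframe values are illustrative, not taken from the project file): consecutive rotation values are 'unwrapped' so that a jump from 1° to 359° is read as a small negative rotation rather than a near-full spin within one frame:

    def unwrap_degrees(keys):
        """Adjust keyframes so consecutive values never jump by more than 180 degrees."""
        out = [keys[0]]
        for value in keys[1:]:
            delta = value - out[-1]
            # Wrap the step into the (-180, 180] range before accumulating.
            while delta > 180:
                delta -= 360
            while delta <= -180:
                delta += 360
            out.append(out[-1] + delta)
        return out

    # A track crossing zero: the importer stored -1 and -2 degrees as 359 and 358.
    print(unwrap_degrees([3, 2, 1, 359, 358]))   # -> [3, 2, 1, -1, -2]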
3.5.8. Linear workflow
By default Cinema4D (since release 12) has employed a linear workflow in its scene creation and rendering. However, working within a linear colour space is a relatively recent addition to many other programs; it was only integrated into After Effects and other Adobe layer-based products with the CS6 and CC versions, having until then been implemented primarily in higher-end environments such as The Foundry's node-based compositing software 'Nuke'. The reason for its existence is the mathematically correct calculation of computer graphics and interpretation of physical light when displayed on a screen. If used properly, the linear workflow allows for greater realism in rendering and a 'truer' interpretation of light when compositing layers of images together, because in a non-linear workflow each asset has its gamma level automatically adjusted for 'correct' display.

The difficulty in using a linear workflow is that many devices and programs use different default colour profiles to describe and display their colour range when recording or generating footage. This footage needs to be linearized during the 'input' stages of project development in order for the linear calculations to be correctly evaluated. If all of the necessary footage is not linearized, various issues arise with colour reproduction, and colours may be adjusted incorrectly prior to output for final display. This issue is often not noticed until the footage is output through a projector, monitor or television using a non-linear colour profile, and the resulting images tend to appear either too bright or too dark. The solution is to linearize footage as it comes into a workspace, prior to any additional effects being added. This allows the designer to correctly evaluate the source footage and ensure the resulting output profile (which depends on the output device) is faithfully reproduced. If linear settings are not available, a gamma 2.2 RGB colour profile is recommended as a close alternative.
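As a rough illustration of the gamma arithmetic involved (plain Python; the simple 2.2 power curve stands in for the full sRGB transfer function, which also has a small linear segment), the following sketch shows why the same compositing maths gives different results in encoded and linear space:

    def to_linear(value, gamma=2.2):
        return value ** gamma

    def to_display(value, gamma=2.2):
        return value ** (1.0 / gamma)

    # Averaging two pixels: doing the maths on gamma-encoded values gives a
    # darker result than averaging in linear light and re-encoding afterwards.
    a, b = 0.2, 0.8
    encoded_average = (a + b) / 2
    linear_average = to_display((to_linear(a) + to_linear(b)) / 2)
    print(f"encoded-space: {encoded_average:.3f}, linear-space: {linear_average:.3f}")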
Two images demonstrating the difference between a non-linear PNG workflow (top) and a linear EXR workflow (bottom) when compositing separate render passes onto film footage.
"In practical terms, Linear workflow refers to a rendering workflow in which image gamma is carefully taken into account in order to assure proper light computations in a render." -- Renderman University

3.5.9. Using 32-bit Images
Using a linear workflow most accurately requires the use of image files which support 16-bit floating point colour components. Although 32-bit TIFF provides for this, a more versatile (and less memory-intensive) option is the openEXR image format. Originally developed as an open source initiative by Industrial Light and Magic (ILM), the EXR file format provides for a linear workflow, includes multiple enhancements useful when compositing layers of image data, and has become the de facto standard in many high-end digital film post-production studios. After experimenting with various image file formats when rendering multi-pass images, openEXR was found to offer the most reliable replication of the standard single-layer render when compositing on top of the film footage. Unfortunately Adobe After Effects CC does not natively support the multi-layer characteristics built into the EXR format, and therefore each frame had to be rendered as five separate pass image sequences (see following page), resulting in a five-fold increase in frame quantities and a two-fold increase in per-frame storage. All of this meant a standard 25 fps 1080p sequence of 1000 frames (40 seconds) used approximately 20 gigabytes (GB) of storage space during the post-production phase. In contrast, a 1080p 8-bit PNG sequence used only 2 GB.
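A rough back-of-the-envelope check of those storage figures (plain Python; the per-frame sizes are derived from the totals quoted above and are approximate, as real EXR frame sizes vary with content and compression):

    frames = 1000          # 40 seconds at 25 fps
    passes = 5             # specular, AO, shadow, RGBA, object buffer

    exr_total_gb = 20.0    # figure quoted above for the EXR multi-pass workflow
    png_total_gb = 2.0     # figure quoted above for the 8-bit PNG sequence

    exr_mb_per_frame = exr_total_gb * 1024 / (frames * passes)
    png_mb_per_frame = png_total_gb * 1024 / frames
    print(f"~{exr_mb_per_frame:.1f} MB per EXR pass frame, {frames * passes} files in total")
    print(f"~{png_mb_per_frame:.1f} MB per PNG frame, {frames} files in total")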
The five passes rendered out using Maxon Cinema4D and composited on top of one another in Adobe After Effects, together with the final result. From left to right: 1. Specular; 2. Ambient Occlusion; 3. Shadow; 4. RGBA; 5. Object Buffer; 6. Final composite.
Conclusion
In summary, the introductory testing and theoretical implementation of the Xuni AR service suggest that such a system is theoretically viable. Some might argue this invention is an inevitable conception of current developments in wearable technology. Nevertheless, the amalgamation of these synchronous devices requires careful thought into the means by which they can be fully harnessed, exploiting their individual characteristics in search of a truly immersive and productive whole. Alongside the many technical hurdles that need to be overcome in getting the Xuni service to work properly, challenges remain in terms of reliability, efficiency and integration. Foremost, however, must be the user-centred, iterative refinement of both the operational layout and the control of such a system. The device needs not just to work; it needs to 'disappear' if it is to be truly effective in its operational goals.
Appendix

4.1 Hardware references and implementation
The hardware devices used in the invention of the Xuni system are a pair of smart armbands (Fig. 1), a pair of smart glasses with integrated AR projection and external facing depth sensing camera (Fig. 3), and a portable smart device controlling the central operation of the whole system (Fig. 4). Each of these devices has a real world counterpart or parts, which have been referenced or used in designing, testing and developing the Xuni system.

4.1.1 The primary devices referenced in designing Xuni AR:
Armbands – Myo Armbands® (Alpha Version) developed by Thalmic Labs, Waterloo, Ontario, Canada
Depth Sensor – Structure Sensor® developed by Occipital Inc., San Francisco, California, USA
Augmented Reality Glasses – Glass™ developed by Google Inc., Mountain View, California, USA
Smart Device – Android™ or iOS™ based mobile telephone or tablet

4.1.2 Devices used in testing Xuni AR features:
Armbands – traditional mouse and keyboard (awaiting shipment of Myo Armband Development Kit at time of writing)
Depth Sensor – Kinect for Windows™ sensor developed by Microsoft Corporation
Augmented Reality Glasses – Oculus Rift™ DK2 developed by Oculus VR, Irvine, California, USA
Smart Device – Apple MacBook Pro with Retina Display™ (Feb. 2013 model) and Apple iMac™ (2014 model - ME089B)

4.2 Software references and implementation

4.2.1 Software used in development and testing isolated Xuni AR features:
Unity® and Unity® Pro (Version 4.5.1f2) – (Unity Technologies, 2014)
Oculus SDK for Mac OS (Version 0.3.2 Preview 2) – (Oculus VR Inc, 2014b)
Oculus Rift plug-in for Unity® 4 Pro Integration – (Oculus VR Inc, 2014b)
Oculus Unity Tuscany Demo – (Oculus VR Inc, 2014b)
Zigfu ZDK for Unity plug-in (Version 1.1) – (Motion Arcade Inc, 2012)
Skanect Free – (Occipital Inc, 2014)
OpenNI 2 SDK (Version 2.2.0.33) – open source software hosted by Occipital Inc.
OpenKinect for Processing – (Shiffman, 2014)
Cinema4D® – (Maxon Computer GmbH, 2014)

4.2.2 Software used in testing the Virtual Reality headset:
Radial G: Racing Revolved (Version 1.2) – (Tammeka Games, 2014)
Fireworks (Version 1.0) – (Yuji Shimada, 2014)

4.2.3 Links to the Visualisation Animations created by the Author

4.3 The Internet of Things
The 'internet of things', originally envisaged by Kevin Ashton as a new phrase in radio-frequency identification (RFID) research and development at M.I.T. (Schoenberger & Ashton, 2002), has now come to take on the wider definition of the digital connection of the myriad objects that surround us. These objects can be man-made or naturally occurring, connecting to the same network using tags containing micro computing architecture and on-board sensors, enabling devices to accumulate and sort relevant data and communicate information to other connected objects/devices, independent of any human interaction, also known as machine-to-machine (M2M) communication ("Machine to machine (M2M)," 2014).

4.4 Depth-sensor and vision-based tracking
Vision-based tracking concerns the application of computer algorithms to live camera footage in order to decipher the three-dimensional position of the objects seen. Much of the research found stems from initial work on Simultaneous Localization and Mapping (SLAM) conducted in part by NASA into remote robotic control (Riisgaard & Blas, 2003). As research work has advanced, recent years have seen this technique being developed and integrated into portable smart devices (13th Lab, n.d.; Castle, n.d.; Google Inc, 2014a). For example, the Swedish company 13th Lab has developed a 'Monocular Visual SLAM' application which uses the smart device's 'single camera to create a 3D map of reality' (13th Lab, n.d.). This research field continues to evolve, with the recently publicised involvement of Google and its mobile phone based 'Tango' project attracting widespread media coverage. All of this development allows the potential for Xuni AR to be further researched and developed.
References

13th Lab. (n.d.). Point Cloud SDK. Retrieved August 30, 2014, from http://13thlab.com/about/
Apple Inc. (2014a). Designing Great Apps - Apple Developer. Retrieved August 19, 2014, from https://developer.apple.com/design/
Apple Inc. (2014b). Quick Look Programming Guide: Quick Look and the User Experience. Mac Developer Library. Retrieved August 19, 2014, from https://developer.apple.com/library/mac/documentation/UserExperience/Conceptual/Quicklook_Programming_Guide/Articles/QLUserExperience.html#//apple_ref/doc/uid/TP40005020-CH3-SW3
Attenberger, A., & Buchenrieder, K. (2014). Human-Computer Interaction. Advanced Interaction Modalities and Techniques. (M. Kurosu, Ed.) (Vol. 8511). Cham: Springer International Publishing. doi:10.1007/978-3-319-07230-2
Bell, B., & Feiner, S. (2000). Dynamic space management for user interfaces. … the 13th Annual ACM Symposium on User Interface …. Retrieved from http://dl.acm.org/citation.cfm?id=354790
Bell, B., & Feiner, S. (2001). View Management for Virtual and Augmented Reality. UIST 2001 ACM Symposium on User Interface Software and Technology, 2001, 101–110.
Bell, G., & Dourish, P. (2006). Yesterday's tomorrows: notes on ubiquitous computing's dominant vision. Personal and Ubiquitous Computing, 11(2), 133–143. doi:10.1007/s00779-006-0071-x
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. London; Hillsdale, N.J: L. Erlbaum Associates.
Castle, R. (n.d.). Robert Castle Consulting. Retrieved August 30, 2014, from http://robertcastle.com/portfolio/
Chen, X., & Grossman, T. (2014). Duet: exploring joint interactions on a smart phone and a smart watch. Proceedings of the 32nd Conference on Human Factors in Computing Systems, CHI 2014, 159–168. Retrieved from http://dl.acm.org/citation.cfm?id=2556955
Google Inc. (2014a). ATAP Project Tango – Google. Retrieved August 17, 2014, from https://www.google.com/atap/projecttango/#project
Google Inc. (2014b). ATAP Project Tango Development Kit. Retrieved August 21, 2014, from https://www.google.com/atap/projecttango/#devices
Google Inc. (2014c). Design for Glass. Retrieved August 18, 2014, from https://developers.google.com/glass/design/index
Haller, M., Billinghurst, M., & Thomas, B. H. (2007). Emerging Technologies of Augmented Reality. (M. Haller, M. Billinghurst, & B. Thomas, Eds.) Emerging Technologies of Augmented Reality: Interfaces and Design. Hershey, PA; London: IGI Global. doi:10.4018/978-1-59904-066-0
Izadi, S., Kim, D., & Hilliges, O. (2011). KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera. Proceedings of the 24th …. Retrieved from http://dl.acm.org/citation.cfm?id=2047270
Kim, C., Zimmer, H., & Pritch, Y. (2013). Scene reconstruction from high spatio-angular resolution light fields. ACM Trans. …, 1–12. Retrieved from http://graphics.ethz.ch/~hzimmer/papers/kim-sig13/kim-sig13-paper-low.pdf
Kuniavsky, M., Moed, A., & Goodman, E. (2012). Observing the user experience: a practitioner's guide to user research. Amsterdam; London: Morgan Kaufmann.
Lin, H., Gao, J., Zhou, Y., Lu, G., Ye, M., & Zhang, C. (2013). Semantic decomposition and reconstruction of residential scenes from LiDAR data. ACM Trans. …, 1–10. Retrieved from http://vis.uky.edu/~gravity/Research/ResidentialReconstruction/assets/sig_webpage/LiDAR_House_Recon_SIG13_lowres.pdf
Machine to machine (M2M). (2014). Wikipedia. Retrieved August 17, 2014, from http://en.wikipedia.org/wiki/Machine_to_machine
Maxon Computer GmbH. (2014). Cinema 4D. Friedrichsdorf, Germany: MAXON Computer GmbH. Retrieved from http://www.maxon.net
Milgram, P., & Kishino, F. (1994, December 25). A Taxonomy of Mixed Reality Visual Displays. IEICE TRANSACTIONS on Information and Systems. The Institute of Electronics, Information and Communication Engineers. Retrieved from http://search.ieice.org/bin/summary.php?id=e77-d_12_1321&category=D&year=1994&lang=E&abst=
Motion Arcade Inc. (2012). Zigfu - ZDK (for Unity®). San Francisco, USA: Motion Arcade Inc. Retrieved from http://zigfu.com
Newcombe, R., & Davison, A. (2011). KinectFusion: Real-time dense surface mapping and tracking. ISMAR 2011 - The Tenth IEEE International Symposium on Mixed and Augmented Reality. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6162880
Nielsen, J. (1994). Guerrilla HCI: Using Discount Usability Engineering to Penetrate the Intimidation Barrier. Retrieved August 19, 2014, from http://www.nngroup.com/articles/guerrilla-hci/
Nuwer, R. (2013). Armband adds a twitch to gesture control - tech - 25 February 2013 - New Scientist. New Scientist. Retrieved August 19, 2014, from http://www.newscientist.com/article/dn23210-armband-adds-a-twitch-to-gesture-control.html#.U_Mn6ksSOkQ
Occipital Inc. (2014). Skanect. Occipital Inc. Retrieved from http://skanect.occipital.com
Oculus VR Inc. (2014a). Oculus Developer Guide: SDK Version 4.0. Retrieved from http://static.oculusvr.com/sdk-downloads/documents/Oculus_Developer_Guide_0.4.1.pdf
Oculus VR Inc. (2014b). Oculus SDK. Irvine, USA: Oculus VR, Inc. Retrieved from http://www.oculusvr.com
Passion Pictures, Unity Technologies, & Nvidia Corporation. (2013). Unity - The Butterfly Effect. Retrieved August 20, 2014, from http://unity3d.com/pages/butterfly
Piumsomboon, T., Clark, A., Billinghurst, M., & Cockburn, A. (2013). User-defined gestures for augmented reality. Human-Computer Interaction – INTERACT 2013. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-40480-1_18
Riisgaard, S., & Blas, M. (2003). SLAM for Dummies. A Tutorial Approach to Simultaneous …, 1–127. Retrieved from http://burt.googlecode.com/svn-history/r106/trunk/1aslam_blas_repo.pdf
Sandor, C. (2008). Human Computer Interaction. (C. S. Ang & P. Zaphiris, Eds.). IGI Global. doi:10.4018/978-1-60566-052-3
Schoenberger, C. R., & Ashton, K. (2002). The internet of things. Forbes. Retrieved August 16, 2014, from http://www.forbes.com/global/2002/0318/092.html
Shiffman, D. (2014). OpenKinect for Processing.
Tammeka Games. (2014). Radial G (Demo). Retrieved from http://radial-g.com