AiEngines AiLibrary and Advanced Tech Center


The Artificial Technology Center and the Digital Library

Author: Daniel Delgado, Gordon Kraft
Aug. 31, 1999
(c) COPYRIGHT 1999 ARTIFICIAL, Inc.


SECTION I: The Artificial Technology Center

What is the Artificial Technology Center?

The Artificial Technology Center is to be a new and unique research and development laboratory. Its purpose is to advance and promote the technology of high-speed Internet applications. The Center will advance the technology through innovative research and development. It will promote the technology by demonstrating to the public, and educating it about, the new possibilities offered by a high-speed Internet.

Why is it needed?

The Center is needed for one primary reason - to develop profitable products for the Internet of tomorrow. All businessmen are aware of the difficulties in developing new products. Developing products for the Internet is even more challenging because no one knows how the Internet will grow and what it will grow into. We do know that its amazing technological advances have changed the rules of commerce, but we do not yet know what those rules have changed into. Because of this, developing creative and profitable products is much more difficult on the Internet than it was in the past.

The Center will be an environment designed to quickly produce new products and technologies. It takes advantage of the fact that advances in broadband technology will soon turn the Internet into a high bandwidth network. These changes will open the door for a host of previously impossible applications. The Center solves the problem of how to create these applications, and it also determines whether they will be profitable. It does this by building an environment where an Internet user can use the new applications in a meaningful and productive way, and where, at the same time, we can assess their usefulness and profitability. The Center’s environment is called the Digital Library.

SECTION II: The Digital Library

What is the Digital Library?

The Digital Library is conceived as an integration, both physically and in the abstract, of a library with the World Wide Web. The goal is to have an environment consisting of library and web which fulfills the functions of both in the most useful and human centered manner possible.

We are building a library because a library, seen as a physical container of knowledge, has the same function as the Internet - today’s most successful and powerful container of knowledge. The Internet has the enormity and timeliness of its information, hyperlinks, and the well-honed excellence of the various search engines. The library, on the other hand, has the advantage of human comforts such as windows, chairs, desks, and space. Both, however, have the same function, which is to serve us humans in our ability to acquire, manipulate, and create information.

Proprietary and confidential Artificial, Inc.

Aug. 1999



As the Center’s main tool for product development, the Digital Library will serve multiple roles as think tank, R&D facility, and technology showcase. By building it we hope not only to demonstrate the advanced capabilities offered by a high bandwidth Internet, but also to act as a bootstrap and catalytic agent for the development of commercially successful products.

The Digital Library will require innovative hardware and software I/O devices and solutions. The library must make one’s physical environment a powerful tool for accessing and manipulating the multitude of information streams present on the Internet, such as HDTV video, surround sound, hypertext search engines, and 3D data sets. Conversely, it must also allow one to control the physical aspects of the library in such a way as to facilitate one’s control of the Internet and its multimedia. Solving these problems, and in general building the library, will hopefully result in the profitable Internet applications we need for commercial success.

SECTION III: The Components

What will the library consist of?

The Digital Library will have a variety of software and hardware components. They are named the following:

The Physical Library
The Library Web Site
The Library Query Engine
The Library Server
Advanced Concept Tools

The Physical Library

The Physical Library is an actual library. The present one is located in Florida. The library is comfortable, well lit, and spacious, with computer access and a private collection of books, records, tapes, etc. It also contains media players such as a multimedia capable computer, an HDTV capable large screen monitor, and a stereo system.

Regardless of how we construct the Digital Library, we will make sure it remains comfortable, spacious, and well lit. The Digital Library is not meant to be a replacement for a traditional library, nor an attempt to create a ‘super’ high tech system with lots of bells and whistles. These may be laudable goals, but they miss the point of the Digital Library. The goal is to produce a melding of a library with its Internet counterpart. To do so successfully, we must maintain those qualities of a traditional library that make it enjoyable and easy to use. These include, in no small part, those that make it comfortable, spacious, and well lit.

The Library Web Site

As with the Physical Library, the Library Web Site (version alpha) is active (http://www.ghklibrary.com/aidl). The Web Site will serve as the main conduit for the interactions between the user, the Internet, and the Physical Library. Its most important page is a well-done QuickTime VR rendition of the Physical Library. This page will have controls and gateways to other control pages, which will allow the manipulation of the Digital Library. For example, if one double clicks on the image of the stereo system, a new page comes up with information about the stereo system as well as controls to manipulate it. Likewise, if one double clicks on the image of the wide screen TV, information about the video database and its controls becomes available.

Using a web site to control the library gives certain advantages. We envision the Digital Library as containing a variety of different displays placed at various locations in it. Using the web in this fashion allows the user to control any number of library Internet devices with the same interface, from any of the displays. As such, the web site becomes a universal controller for the user.

The current Physical Library uses software from Crestron Home Control Systems (http://www.crestron.com/website/index.html) for controlling the home. This software will also be used for the control of the library from the web site. Its use will save us development time and assure compatibility with the house. Most importantly, the system is ideal for web hardware interfacing since it has been specifically designed for Internet control. In addition, it is a Java based system, which makes programming for the library much easier.

The Library Web Site user interface, although doable in Java, will require the use of Shockwave and Director. Shockwave was used for the construction of the current web site’s interface, and continuing to do so allows us to reuse previous work. Furthermore, Shockwave works with QuickTime VR graphics. Currently, QuickTime VR graphics are an important part of the web site since the main control page is a QuickTime VR graphic. Many similar pages will be needed for future development. Fortunately, both Apple and Macromedia (http://www.macromedia.com/shockwave/) have publicly available Software Development Kits (SDKs). Apple released theirs years ago and Macromedia has just released theirs.
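The hot-spot behavior described for the main control page can be sketched in a few lines. This is only an illustration under stated assumptions: the rectangle coordinates, the page names, and the function page_for_double_click are hypothetical, and the real site would implement this inside Shockwave and QuickTime VR rather than in Python.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HotSpot:
    """A rectangular clickable region on the main control page image."""
    name: str           # device shown in the image, e.g. "stereo"
    left: int
    top: int
    right: int
    bottom: int
    control_page: str   # hypothetical relative URL of the device's control page

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


# Hypothetical layout; real coordinates would come from the authored image.
HOT_SPOTS: List[HotSpot] = [
    HotSpot("stereo", 100, 300, 220, 380, "controls/stereo.html"),
    HotSpot("wide_screen_tv", 400, 150, 700, 330, "controls/tv.html"),
]


def page_for_double_click(x: int, y: int,
                          hot_spots: List[HotSpot] = HOT_SPOTS) -> Optional[str]:
    """Return the control page for the hot spot under (x, y), or None."""
    for spot in hot_spots:
        if spot.contains(x, y):
            return spot.control_page
    return None
```

Because every display in the library would load the same page, one table of hot spots is enough to give the same controls everywhere, which is the sense in which the web site acts as a universal controller.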

The Library Query Engine

In a regular library a user is able to arrange his books, music, etc. in whatever arrangement he desires (provided that it’s not the public library!). In addition, he can and probably will continuously add new material to his library. He could store these in any fashion he desires, perhaps even randomly. What he most likely will do is arrange them in categories and store them in various locations, such as shelves, to help him find them in the future. Essentially, he wants space to store his materials and a method to find them whenever he wants to.

The Library Query Engine is the Digital Library’s method for selecting and storing the Internet information that the user wants to keep. In effect, it allows a user to do the same as he would in a regular library, but with all the types of information on the Internet. As one can imagine, developing a query engine is not trivial because of the diversity of the media on the Internet and the complexity of categorizing the information.


A good example is the situation with photo and movie formats. There are many different forms of media and variations of them, such as the picture formats JPG, GIF, and TIFF, and the movie formats MOV, AVI, MPG, etc. A query engine must be sophisticated enough to decode all these formats and also learn about any new ones.

A much more difficult problem is how the query engine should categorize information in a manner pertinent to the user. Consider the example of a GIF image of a painting. One could categorize it by its colors, its history, and even its meaning. One analysis provides seven ways of making a suitable search engine (http://www.gils.net/showcase/). Some methods are very sophisticated and require the use of intelligent agents and/or content analysis.

Fortunately, there has been a large amount of government and industrial cooperation in the making of a universally accepted search engine (e.g., http://www.dlib.org/projects.html). The result has been a concerted international effort to develop a world standard for searching for information. The standard is known internationally as ISO 23950 (its earlier Search and Retrieve form was published as ISO 10162/10163) and in the US as ANSI/NISO Z39.50. This standard handles text information and also large complex collections such as those found in libraries, universities, and museums (http://www.gils.net/webz3950.html).

The US government has established the Global Information Locator Service (GILS) to assist in the development and dissemination of Z39.50 to the World Wide Web (http://www.gils.net/webz3950.html). The result has been the easy transfer of the standard to the commercial sector. Among other efforts, GILS makes available code samples and programming utilities for companies to implement search engines using Z39.50. One such company, Blue Angel Technology, uses GILS to provide a large amount of the functionality required by our engine (http://www.blueangeltech.com/). The Digital Library’s query engine could be made in conjunction with this or other such developers so as to save time and not duplicate efforts.
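As a first step toward handling many formats and learning about new ones, the query engine could keep a simple, extensible table of known formats. The sketch below is an assumption about how that might look, not a description of Z39.50 or of any vendor's product; the names MEDIA_TYPES, register_format, and classify are hypothetical, and it covers only the easy part of the problem (recognizing a format), not categorization by content.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats named in the text; the table can be extended at run time.
MEDIA_TYPES = {
    "image": {".jpg", ".jpeg", ".gif", ".tif", ".tiff"},
    "movie": {".mov", ".avi", ".mpg", ".mpeg"},
}


def register_format(media_type: str, extension: str) -> None:
    """Teach the table a new file extension for a (possibly new) media type."""
    MEDIA_TYPES.setdefault(media_type, set()).add(extension.lower())


def classify(url: str) -> str:
    """Return the media type of a URL based on its file extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    for media_type, extensions in MEDIA_TYPES.items():
        if suffix in extensions:
            return media_type
    return "unknown"
```

For example, classify("http://example.com/art/painting.GIF") reports an image, and a call such as register_format("document", ".pdf") adds a format the table did not know before.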

The Library Server

In a regular library, the user has shelves on which to store his information. The Library Server will allow the user to do the same, not just with text information as found in books, but with all different sorts of media. The Library Server will work in conjunction with the Query Engine to store multimedia and other information in the library. Furthermore, just as in a regular library where one never has enough shelves, the Library Server should have as much disk space as possible to hold all the information the user will store.

The Library Server must, just like the Query Engine, categorize and select information in an intelligent fashion. Think of what would happen otherwise. All of us who surf the web invariably save some information. Consequently, all of us who surf the web will most likely run out of disk space. What prevents this from occurring all the time is that we save only those bits of information meaningful to us. The Library Server needs to do the same thing.

In addition to selecting what to save, the Library Server must deal with the format in which to save it. The PDF format from Adobe appears to be the best to use because PDF will be the most common interface output for the display panels. We might have used Z39.50 since it makes use of a common internal data format. However, on the World Wide Web, the PDF format is increasingly becoming the most favored method of storing documents that need to be accessed from a web site. In addition, there are a large number of skilled programmers available with expertise in the PDF format.
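The idea that the server should save only what is meaningful, within the disk space it has, can be illustrated with a simple greedy policy. This is a sketch under assumptions: the relevance score (0 to 1) is hypothetical and would have to come from the Query Engine or from user feedback, and a real server might use a more careful selection rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    name: str
    size_mb: float
    relevance: float  # hypothetical score in [0, 1]; higher means more meaningful


def select_items_to_keep(items: List[Item], budget_mb: float) -> List[Item]:
    """Greedy sketch: keep the most relevant items that still fit the budget."""
    kept: List[Item] = []
    used = 0.0
    for item in sorted(items, key=lambda i: i.relevance, reverse=True):
        if used + item.size_mb <= budget_mb:
            kept.append(item)
            used += item.size_mb
    return kept
```

With a 650 MB budget, a highly relevant 600 MB news clip would be kept, a less relevant 500 MB lecture would be skipped, and small images would fill the remaining space.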

SECTION IV: Advanced Concept Tools

Advanced Concept Tools are possible candidates for commercialization. Each tool will be carefully examined as to its potential and feasibility. If acceptable, the concept will then be turned into a working prototype for use in the Digital Library. The library will be shown to the public and their reaction to the tool will be assessed. Based on this feedback and other more rigorous assessments (such as usability testing and human factors studies), the tool will either be rejected, or added permanently to the Digital Library and developed for the marketplace.

Building a tool is the most central part of the Library because it directly addresses the issue of profits. Building a tool uses all aspects of the Library. When ideas are proposed and critiqued, the library functions as a think tank. When a tool is built, the library functions as a rapid prototype and development lab. Finally, when the tool is demonstrated in the Digital Library, the library acts as a marketing agency and a usability-testing laboratory.

A potential tool must be assessed very carefully and meet certain criteria to be successful. As previously mentioned, the Digital Library can be thought of as the integration of the Internet with one’s personal environment. It has at least two functions. First, it allows one to use one’s physical actions to work with the Internet. Secondly, it allows one to use the Internet to control one’s physical environment. The library should be composed of tools that accomplish these functions. This is a loose criterion, but it is essential for maintaining the focus on the Digital Library.

Currently, most Internet users explore the Web with a browser on a PC. With the Digital Library, we want this capability and much more. We want much more because now, instead of accessing the Internet via one PC, we have an entire room that we can transform into tools for interacting with the Internet. Some of these tools will use the room’s features as control devices for manipulating the Internet. Others will use the room as a large-scale addition to the web browser. The following is a list with descriptions of tools that have been proposed as of 9/19/99:

Tool: The Internet Book
Description: A flat LCD panel display that allows control of the Digital Library Home Page anywhere in the Library. The device would be very flat, lightweight, and rugged (encased in a cushion or leather padding). At the top left and right hand corners there would be large arrow buttons. These correspond to the Forward and Backward icons in the web browser. The user goes wherever he wants in the library and picks up the book, and right away what he sees is the Digital Library page. As he manipulates the web site icons, appropriate responses occur in the various items in the real library. The interface is the web browser, so there is little learning curve for the user.

Technically this is easy to implement. The quickest approach would be to use an industrial LCD panel with a long cable for connection to the network. If a cable is unwieldy, we might want to go with a portable PC with an IR link to the net. Of course, the IR link speed would not allow the display to be used for video media; however, that would not be its main use. The user would lift his head and view the movie on the wide screen monitor. The book would serve mostly as a room/Internet controller and use only the local pages.

Tool: Visible Mouse Focus
Description: Mouse focus refers to the object that is currently selected by a mouse. In the real world we can also think about having a mouse focus that mimics the position of the mouse in the web site. This can be implemented in several ways. One is that as the user moves the cursor to a hot spot (e.g. the stereo), a spotlight focused on the real object lights up. Another possibility is a servo-controlled light. This light would change its position as the mouse moves on the screen. This may be nicer because a moving light is a closer analogue to a moving mouse than a spotlight turning on.

This tool also serves another function. We can use it to let the browser select a book from your library. If the browser knew your books and their locations, it could point to a particular volume whenever the search engine’s hits include information found in your library’s books.
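The servo-controlled light could, in the simplest case, follow the cursor with a linear mapping from screen position to pan and tilt angles. The sketch below assumes exactly that; the angle ranges and the function name cursor_to_servo are hypothetical, and a real installation would need calibration against the room image rather than a straight linear map.

```python
from typing import Tuple


def cursor_to_servo(x: float, y: float, screen_w: float, screen_h: float,
                    pan_range: Tuple[float, float] = (-60.0, 60.0),
                    tilt_range: Tuple[float, float] = (-20.0, 20.0)
                    ) -> Tuple[float, float]:
    """Map a cursor position in pixels to (pan, tilt) angles in degrees."""
    # Normalize to [0, 1] and clamp so the light never leaves its range.
    fx = min(max(x / screen_w, 0.0), 1.0)
    fy = min(max(y / screen_h, 0.0), 1.0)
    pan = pan_range[0] + fx * (pan_range[1] - pan_range[0])
    # Screen y grows downward, so the top of the screen maps to maximum tilt.
    tilt = tilt_range[1] - fy * (tilt_range[1] - tilt_range[0])
    return pan, tilt
```

With these assumed ranges, the center of an 800 by 600 screen points the light straight ahead, and the top left corner points it fully left and up.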

Tool: Reality Fusion
Description: Without a doubt this can be a tremendous enhancement to the library. Using any common computer video camera, it allows you to use the subject’s position and posture to control things in the library. The company has a web site with a demo version that one can download (http://www.realityfusion.com). The product is very low cost and has an SDK, and the company, being local, can perhaps be a valuable resource. Reality Fusion makes possible a tool called “User Focus”.

Tool: User Focus
Description: This is an extension of the Visible Mouse Focus concept. We can think of the user as having a ‘mouse focus’ that we want the Internet browser to be aware of. For example, suppose that the user is choosing and adjusting his music system. Most likely, he will be standing next to the audio system so as to adjust the controls. The position next to the music system is a hot zone, just as the stereo’s image on the web site is a hot spot for double clicking. Knowing the hot zone can be very useful for readying the equipment in preparation for its use. This is where Reality Fusion’s product is so helpful. We can use it to easily identify where the user is.
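Given a user position, deciding which hot zone the user occupies is a small nearest-zone test. The sketch below assumes the position arrives as floor coordinates in meters from some tracking software; it does not use or imply any Reality Fusion API, and the zone coordinates and the name user_focus are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical floor positions (in meters) of the library's devices.
HOT_ZONES: Dict[str, Point] = {
    "stereo": (1.0, 0.5),
    "wide_screen_tv": (4.0, 0.5),
    "bookshelf": (2.5, 5.0),
}


def user_focus(position: Point,
               zones: Dict[str, Point] = HOT_ZONES,
               radius_m: float = 1.0) -> Optional[str]:
    """Return the nearest device within radius_m of the user, or None."""
    best_name: Optional[str] = None
    best_distance = radius_m
    for name, centre in zones.items():
        distance = math.dist(position, centre)
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name
```

A user standing at (1.2, 0.8) would be reported as focused on the stereo, while a user in the middle of the room would have no focus.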


Tool: Voice Recognition
Description: This is an obvious one and must be included. This tool is an example of a class of tools characterized by the fact that they are not marketable, yet are still needed for the Library. It is also a very important one because a poor implementation would be worse than none at all. People get enormously frustrated with voice recognition systems that do not work. A careful choice among the many available systems must be made to avoid public dissatisfaction with the Digital Library as a whole.

Tool: Web Integrated Video Editor for Wide Screen TV
Description: The wide screen TV offers lots of possibilities that will be looked at in another paper. Here, the tool gives a non-linear editing capability for movies. It doesn’t have to be a complete set of controls, just sufficient to allow a user to stop, review, clip, and save. A user could make small pieces of multimedia that serve as notes, study guides, etc. For example, suppose the viewer just saw a news report about something pertinent to his job. The user should be able to freeze the video, select segments, and create a small video clip. The browser will let the user manipulate the clips. For example, when the user wants to select clips to review, he selects their icons and drags them onto the Screen icon.

Tool: Virtual Headphones
Description: As the user moves about, the library notes his location and moves the audio “sweet spot” so that it follows the user. We can use Reality Fusion technology to obtain the user’s position. The movement of the sweet spot is possible with PC sound cards that have 3D imaging capability, such as the SoundBlaster and Gravis cards.
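One simple way to think about moving the sweet spot is level and time alignment: attenuate and delay the nearer speaker so that both channels reach the listener equally loud and at the same moment. The sketch below uses that simpler technique as a stand-in; it is not the 3D imaging method of any particular sound card, and the function name, the inverse-distance loudness model, and the two-speaker layout are assumptions.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in room-temperature air


def sweet_spot_compensation(listener: Point, left_speaker: Point,
                            right_speaker: Point) -> Dict[str, Tuple[float, float]]:
    """Return per-channel (left, right) gains and delays for the listener."""
    d_left = math.dist(listener, left_speaker)
    d_right = math.dist(listener, right_speaker)
    d_max = max(d_left, d_right)
    if d_max == 0.0:
        raise ValueError("listener and both speakers are at the same point")
    # Assumed inverse-distance law: turn the nearer speaker down so that
    # both channels arrive at the same level.
    gains = (d_left / d_max, d_right / d_max)
    # Delay the nearer speaker so that both channels arrive together.
    delays = ((d_max - d_left) / SPEED_OF_SOUND_M_S,
              (d_max - d_right) / SPEED_OF_SOUND_M_S)
    return {"gain": gains, "delay_s": delays}
```

A listener centered between the speakers needs no correction; as he walks toward the left speaker, the left channel is progressively attenuated and delayed.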

Tool: Video Conferencing
Description: Again an obvious one, not marketable, yet needed. To save development time, we will use an upper-end turnkey system of the kind offered to large corporations. The magazine AV Video has reviews of these. For public relations, we might want to engage a well-known, appropriate celebrity to be at the other end of the conversation during important demonstrations.

Tool: Keyed Information
Description: Keeping in mind the library metaphor, which connotes an ability to do research, we should offer some method of keying various parts of the multimedia stream. For example, during video playback the user should be able to mark the sequence (as previously described), but he should also be able to attach a text string to be used as a key for accessing the information. The most natural way of doing this would use voice recognition. Envision a user looking at a news clip. He sees something of interest and says “Stop”, “Rewind”, “Play for 10 seconds”, “Save Segment as Conference Example”. The result is a video clip saved to the server under that key.
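Once a voice recognition system has turned speech into text, interpreting the spoken commands in the example is a small parsing task. The sketch below assumes the recognizer delivers one utterance at a time as a plain string; the command names and the function parse_command are hypothetical, and only the four phrases from the example are handled.

```python
import re
from typing import Any, Dict, Optional, Tuple

Command = Tuple[str, Dict[str, Any]]

# Each entry pairs a pattern with a builder for the parsed command.
PATTERNS = [
    (re.compile(r"^stop$", re.IGNORECASE),
     lambda m: ("stop", {})),
    (re.compile(r"^rewind$", re.IGNORECASE),
     lambda m: ("rewind", {})),
    (re.compile(r"^play for (\d+) seconds?$", re.IGNORECASE),
     lambda m: ("play", {"seconds": int(m.group(1))})),
    (re.compile(r"^save segment as (.+)$", re.IGNORECASE),
     lambda m: ("save_segment", {"key": m.group(1).strip()})),
]


def parse_command(utterance: str) -> Optional[Command]:
    """Turn a recognized utterance into (command, arguments), or None."""
    text = utterance.strip()
    for pattern, build in PATTERNS:
        match = pattern.match(text)
        if match:
            return build(match)
    return None
```

The text captured by the last pattern, “Conference Example” in the example above, is the key under which the clip would be stored on the server.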


SECTION V: The Portable Digital Library

The Portable Digital Library is an Advanced Concept Tool. It was not mentioned in the previous section because it merits a detailed description by itself. It will be the first tool to be built and the most complex to date. The Portable Digital Library will be designed to give a person as much of the functionality of the real Digital Library as possible.

This tool is needed for a variety of reasons. In order to market successfully any product arising from the Digital Library, we have to take it to potential customers and tradeshows. The Portable Digital Library is designed for this purpose. In addition, because its size makes it substantially different from the Digital Library, it will most likely yield product concepts beyond those of the original.

The Portable Digital Library will have three separate LCD panels (not counting the PC display). Each one will be thin, lightweight, and mounted on a swiveling armature similar to that found in some light-stands. The display panels are mounted so that one can easily pick one up and reposition it by hand. The displays have several features. They have touch screens. They have the Dimension-X overlay (http://www.dti3d.com) that allows 3D imaging. They have sensors that give the position of the screen with respect to the user. The display panels form the basis of a visualization ‘dome’ that can be rearranged depending on the type of multimedia to be presented. Here is how we can use them:

Functions of the Portable Digital Library

Panoramic images and movies: The user can put the displays side by side to form one large display. This can be used to show movies and images in the wide screen format.

Movie editing: One can place a movie clip on each display and use the third as a touch screen control for the editor. The editor could be a video-editing page on the library web site.

QuickTime VR viewing and control: If the user arranges the displays in a half-circle facing him, he would have an excellent viewer for QuickTime VR images. The displays arranged like this could give up to about a 120-degree wraparound screen. Viewing a QuickTime VR image, e.g. that of the Digital Library web site, would be impressive. The touch screens would allow the user to sweep his hand across the display and literally pull the graphics to where he wants.
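To drive three displays as one wraparound screen, each display must show its own slice of the panorama. The arithmetic can be sketched as follows, assuming the roughly 120-degree total field of view is split evenly among the displays; the function name display_yaw_ranges is hypothetical, and no QuickTime VR API is used or implied.

```python
from typing import List, Tuple


def display_yaw_ranges(centre_deg: float, total_fov_deg: float = 120.0,
                       displays: int = 3) -> List[Tuple[float, float]]:
    """Return (start, end) yaw angles in degrees for each display, left to right.

    Angles are wrapped into [0, 360), so a slice may cross the 0-degree seam.
    """
    per_display = total_fov_deg / displays
    start = centre_deg - total_fov_deg / 2
    return [((start + i * per_display) % 360,
             (start + (i + 1) * per_display) % 360)
            for i in range(displays)]
```

When the user sweeps his hand across a touch screen, only centre_deg changes, and all three slices shift together.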


GIS analysis: If a screen is placed below the user, facing up, it can act as a 3D terrain viewer for Geographic Information Systems (GIS). Applications could include petrochemical 3D mapping and satellite analysis.

Flight, 3D, and game simulations: If the displays are placed in a front, left, and right formation, they can be used for car and flight simulators. Used with a standard VRML plug-in, a browser (Cosmo is the typical one) can act as a ‘spaceship’ that one can use to fly through the VRML space. This is done using the Dimension-X screen coating and the position sensors on the displays. The Dimension-X coating allows stereoscopic 3D display without the encumbrance of glasses, and the position sensors allow virtual windowing.

Book and sheet music viewer: One could display three pages of sheet music, or three consecutive pages from a book, one on each display. For the sheet music viewer, we can use Reality Fusion technology so that the user can turn the page using physical motions.

Videoconferencing: If we add a video camera to each display, we would have a videophone capable of a conference with multiple views of the participants (maybe useful) or of anything that they put in front of the cameras (very useful).

Distance learning: Capturing live videos of lectures to a server will give students in remote classrooms, distant cities, and other far parts of the world the ability to attend classes at secondary schools, colleges, and universities at their convenience. The server allows not only distance learning but also ‘time’ learning, in the sense that students can view the lessons at a time of their choosing. This gives company employees who are working on undergraduate or graduate degrees, and who must travel on company business or work during critical times, the ability to keep up with their lessons while still carrying out their regular duties. Other markets are corporate training, military training, and any other application where important information needs to be retrieved when the viewer has time available.

3D-scanner station: If we add a video camera to each display, we could use the cameras to produce real-time avatars of any object placed in front of the displays. Several companies make software that takes multiple views of an object, such as pictures, and recreates it as a 3D volumetric model. What has not been produced is a real-time 3D volumetric model! This is not necessarily because it is technically infeasible, but rather because, without some sort of multiple camera arrangement, there is no use for such software. The Portable Digital Library will have multiple cameras and so can make use of such software. One company makes a product that wraps a picture of a person onto a head model in a smart manner so that it looks realistic. However, it does this on a generic head model. Combining this technology with real-time 3D volumetric rendering has great potential, and at the least would be impressive.

Networking: There is no reason apart from cost not to have the displays be addressable from the Internet. This would allow the useful feature of automatic configuration. A user could just put one of these things close to another, and they would sense each other's presence and configure the visual displays accordingly.

SECTION VI: Potential Business Associates

This project will require a variety of technologies, and thus will also require business relationships with the companies that make those technologies. This is viewed as another benefit derived from the building of the Digital Library. Here we identify those businesses.

BroadBand Networks Corporation
http://www.bbnc.com
BroadBand Networks Corporation, or BBNC, is a company specializing in a variety of hardware for making broadband transmission possible, primarily for the cable industry. BBNC is a prime example of the kind of company that the Center was designed to help. As such, we provide new product concepts and marketing by using their devices in the Digital Library.

Macromedia
http://www.macromedia.com/shockwave/
Macromedia makes Shockwave, which allows multimedia-like effects on the web. It is well suited for use with QuickTime VR. They freely give out their SDK, which we will use.

Reality Fusion
http://www.realityfusion.com/
Reality Fusion makes a computer video camera application that allows one to use the body space of the user for control of the computer. They supply an SDK and also a demo version of the software. This software is highly suited for the library because it makes use of the space around the user for control.

Cardiff Software
http://www.cardiff.com/
Cardiff Software makes a product that allows one to create forms using the PDF format. The library uses PDF extensively, both for the display devices and as the storage type for the library server. This product would be ideal for use on the library web site for this purpose. There seem to be no licensing difficulties, except that they require a one-week training session.

Blue Angel Technology
http://www.blueangeltech.com/
Blue Angel Technology makes software that implements the ISO standard for a universal query engine. Although we will be using PDF as the library’s native storage format, we may still take advantage of any existing code that we can modify for our purposes.

SFS Software
http://www.siteforum.com/
SFS Software has several products useful for the library. SiteEater, useful for the server, allows one to have local copies of web sites. InternalWeb creates a full text search index for any downloaded sites (via SiteEater). We most likely will need a developer's license because the products, as is, have their own interfaces that may be incompatible with our purposes.

D-Lib Magazine
http://www.dlib.org/projects.html
Library science research on digital libraries is extensive and diverse. D-Lib Magazine is a good conduit for possible cooperation with the many organizations, such as government agencies and universities, that are studying digital libraries.

Dimension Technologies Inc. (DTI)
http://www.dti3d.com
DTI makes a lenticular coating for LCD monitors that allows a user to view 3D without the use of glasses. This is important for the library because, although we want the user to view and use 3D content on the Internet, we do not want to constrict him in any way. He must be free to use the entire space available to him in the library.

SECTION VII: Tasks

How do we begin?

Building a sophisticated system such as the Digital Library and the Portable Digital Library will require several major tasks. This section looks at the problem of construction and specifies the various tasks needed. Some of these will be to build the hardware network backbone that will connect the Internet and other broadband information channels to the library. Others will be to create the software required both to connect the network to the Digital Library and to build the Advanced Concept Tools. A large task will be to devise a query engine or integrate an existing one into the Library. Another will be to develop the Library Server software. The server software will need to select and store information from what streams across the network. The first task will be the creation of the Library Server software.


The Server Software: Description: Task Description: Programmer Requirements:

SECTION VIII: Other Issues

Concerning the Oxygen Project

This is a large (est. $10 million) project from the AI group at MIT whose purpose is to come up with the most integrated use of transmission technology, resulting in a variety of household objects that a person could query in a natural language context. In many ways this can be viewed as a generalized Digital Library. However, in this case the environment is expanded to include every possible space a human may interact in. The project is devised around technology projected to be available in about ten years. Obviously it is not pertinent to our effort to make commercially successful Internet products now. One thing that may be useful to us is their work on command language parsing. This might be available under a technology transfer program.

Profitability

As designed, the Digital Library in itself has a limited market because of its cost. Some such markets are the government, petrochemical companies, and medical research groups. What it was designed for was to provide features that could be marketed independently of the Digital Library. This means that while we hope that the Digital Library by itself will be marketable, we expect that some Advanced Concept Tools will be marketable.



Appendix A: White Paper Comments

This document is the combined effort of a variety of people. The writer (Daniel Delgado) served as editor of the ideas contributed by everyone involved. The creator of the concept of a Digital Library, and its major designer and supporter, is Gordon Kraft. Chris King also provided major input, as did Ralph P. Manfredo of BroadBand Networks Corporation. This document is our blueprint for the Digital Library, and also the conclusion of the serious discussions and careful thought that went into designing it. As such, it is continuously being revised and expanded as new input arrives and more decisions are made. This appendix lists various recommendations about the white paper for discussion. It is in an appendix because it is intended to be seen only by those involved in its creation.

Section on Business Relationships: Our connections to various individuals and companies are very important. It would help us to clarify with whom we need such relationships and why. This has been implemented as Section VIII: Potential Business Associates.

Digital Presentation of the white paper: This document is intended to be as complete and detailed as possible so as to serve as the blueprint for the construction of the Center and the Digital Library. Its detail makes it unsuitable as a marketing tool. We can resolve this with a high-level digital presentation. It would be high-level so as to get the essential concepts across to our audience without losing them in the detail. It would be digital because this would reach the widest possible audience. The presentation could be done as a PowerPoint demonstration and/or as a well-done multimedia CD-ROM.

Revision and editing by a professional technical writer: Without question, this document must be as clear and as well written as possible, because it will be shown to potential business partners and investors. Its presentation must be excellent for the same reason. It is important enough to warrant having a professional technical writer edit and revise it.

Library research expert (digital): Considering the difficulties mentioned in building a universal library query engine, we should identify a partner or affiliate with that expertise. In addition to the benefits derived from a technology transfer, we could also use this relationship for marketing and other support. Current candidates for such partners are companies that build search engine software (e.g. Blue Angel Technology: http://www.blueangletech.com/). The other choice is an affiliation with a research group such as those found in a university or government agency.

Internet use of Library for surveys and usability testing: Would it be feasible to give free access to the basic library, in exchange for surveys and usability testing, in order to build a large user email list? If so, then levels should be created for whatever functionality is offered. This is where the profitability begins: at what level, at what point, and at what cost? Use of the basic library would represent the user's initial acceptance and would tell us when the library has become useful and important.

Top ten site access is needed: There are lists available that give the top ten web sites visited on the Internet. These sites should be included in the basic library so we can be assured of user market share. Each of the top ten should be available in perhaps a richer and more useful environment.



[Figure: Portable Digital Library Hardware/Software Configuration. Labeled components: Internet; Server Computer: Fred junior; AI Server; AI Query Engine; Reality Fusion; AiARS; Touch Screen; Keyboard; Cont. Speech Recogniser; Web Browser; DimensionX LCD Positional Monitor (three); Video Camera; Microphone; Position Sensors; Audio. Key labels: Network, Audio/Visual, Serial, Hardware, Software.]



Portable Library Specifications

These specifications are for the Portable Digital Library, which is being built as the main showcase for the Artificial Technology Center. Both the Center and the Library are explained in detail in the white paper "The Artificial Technology Center and the Digital Library", available on the Artificial Technology web site: http://www.ghklibrary/aidl. The purpose of this document is to describe how to build the Portable Library. It comprises three sections. The first section, Methods, describes the desired features and the procedures for making them. The second section, Materials, is a list of items to be purchased. The third section, Notes, contains information pertinent to the library as a whole. In addition, a schematic overview of the library's hardware and software components is shown in Figure 1.

Methods

Feature: Moveable Display Screens

Description: The ability of a user to place the displays in almost any position or orientation desired. This is a fundamental component of the Portable Library. The Portable Library has three main LCD panels for its displays. By positioning the displays in various configurations, the user can use the Portable Library as a wide screen system, a wraparound QTVR viewer, or an out-the-window view such as is used for flight and car simulators (these configurations are described separately as individual features). It also allows the user to use the entire space around him for the Windows Desktop. A user can, for example, place file icons for movie clips above his head and sound clips next to his left ear.

Procedure: A modified "Magnifier Reading" lamp provides the mechanical arm that supports each of the Library's displays. These lamps have a large magnifying lens and are used by people who do precision miniature work as well as by the sight impaired. Because of their needs, the mechanical support linkage has been optimized for holding the light and lens at a position of the viewer's choice. This is the capability the library needs for its displays. Three lamps are needed. Each is modified by removing the head/light assembly and substituting an LCD display. Depending on the weight of the lamp, we may also replace various springs with ones of higher tension to compensate for the greater weight of the display.

To obtain each display's location and orientation in real time, we use a commercially available infrared position tracker system (described in the Materials section). One sensor is attached to each display so as to provide continuous feedback on its position and orientation. This information is then passed via a serial port to the Portable Library's computer.
Written with the tracker's SDK, a program on the Portable Library computer first initializes the tracker and then monitors it through the serial port. The Portable Library can then access the tracker data from the serial acquisition program. The access can be done in one of two ways: by using Windows event messages or by using a shared memory location. The conventional method is to use Windows event messages; if this is not feasible, we will resort to a shared memory location. In either case, any program that requires positional information, such as a Java applet activated from the browser, must make use of the tracker's SDK to access it.

For the out-the-window simulators, the program will change the display mode to one viewport per display. The different views can then be produced either by configuring the software for this arrangement or by running multiple instances of the executable with different viewing positions (e.g. left, right, and front views). For the QTVR and Panoramic viewers, the program will change the display mode to one large viewport with each display mapping into a different portion of the view. This configuration will also be used to enlarge the Desktop to surround the user.

Feature: QTVR Viewer

Description: The QTVR Viewer is a configuration of the library that is optimized for the wraparound images of QuickTime VR. The LCD panels are positioned side by side in a semicircle. Such an arrangement can increase the field of view (FOV) to as much as 120 degrees. The aim is to give an experience similar to that of a 'Circle-rama' theater such as is found in Disneyland. Since QTVR images are used extensively by the library, this will be an important part of the overall effect of the Portable Library.

Procedure: A program will be written that monitors the position of the three displays. When the user places them side by side in a semicircle, the program will send a Windows event message to all active programs, and also change the display mode to one large viewport with each display mapping into a different portion of the view.
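The monitoring step described in this procedure can be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of a single yaw angle per display, and the 10-degree tolerance are hypothetical and are not taken from the tracker's SDK. The 120-degree total FOV is the figure given in the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Viewport:
    display: int          # index of the display, left to right
    yaw_start_deg: float  # left edge of this display's slice of the view
    yaw_end_deg: float    # right edge of this display's slice of the view


def split_view(total_fov_deg: float, n_displays: int = 3) -> List[Viewport]:
    """Divide one large viewport into equal angular slices, one per display."""
    slice_deg = total_fov_deg / n_displays
    start = -total_fov_deg / 2
    return [
        Viewport(i, start + i * slice_deg, start + (i + 1) * slice_deg)
        for i in range(n_displays)
    ]


def is_semicircle(yaws_deg: List[float],
                  total_fov_deg: float = 120.0,
                  tolerance_deg: float = 10.0) -> bool:
    """Return True when each display faces roughly the center of its slice.

    yaws_deg holds one yaw angle per display as reported by the position
    tracker (hypothetical input; 0 means facing straight ahead).
    """
    slices = split_view(total_fov_deg, len(yaws_deg))
    expected = [(v.yaw_start_deg + v.yaw_end_deg) / 2 for v in slices]
    return all(abs(actual - target) <= tolerance_deg
               for actual, target in zip(sorted(yaws_deg), expected))
```

With three displays and a 120-degree view, each display would cover a 40-degree slice. When `is_semicircle` becomes true, the monitoring program would notify the active programs and switch the display mode.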
In addition, the size and aspect ratio of each viewport will be changed to match the FOV of the combined monitors, and a system message will be added that reminds the user to reposition the displays when viewing a QTVR image.

Feature: 3D Viewer

Description: The Dimension-X lenticular coating allows a user to see volumetric 3D data without encumbrances such as goggles or LCD shutter glasses. With it the user will be able to explore 3D data sets on the Internet using a VRML browser. He will also use it with car and flight simulations and for watching 3D movies.

Procedure: To use the Dimension-X 3D coating, we must use an LCD panel display, because the effect depends on the characteristic of liquid crystals to polarize light. For the Portable Library, we must also make the displays as lightweight as possible. Weight is an important factor because the support arms are optimized for the weight of the original lamp and lens assembly. Our displays must be the same weight or lighter.



We will require three displays. They will be bought off the shelf, adapted from laptop displays, or built by us. The most convenient way would be to purchase them; however, it is unlikely that we will find off-the-shelf displays that meet our needs. The next easiest solution is to buy a portable computer and remove its display, power supply, and video circuit board. This may be made more convenient by finding a manufacturer that will sell us just the video components from its laptop. In either case, we then substitute long cable assemblies and put only the LCD panel on the arm; everything else will be placed at the base of the arm. If obtaining a manufactured display is not possible, we will buy an LCD panel, a video interface board, and a power supply, and build the display ourselves. This is not very difficult, but it is a more involved process than the other methods. On the other hand, it lets us select the best and most appropriate parts.

After obtaining the displays, we send them to Dimension-X to have the lenticular coating needed for 3D applied. Since this may take some time (not under our control), we will send the displays to the manufacturer as soon as possible. Development will then continue using a standard monitor. To speed development, an effective yet costly option would be to obtain an off-the-shelf 3D monitor from Dimension-X in addition to the coatings for the library's three displays. The 3D monitor can then be used for development until the library's displays are modified.

Since VRML models are the most common type of 3D object found on the web, we will have a VRML browser with 3D capability. The Windows web browser, Internet Explorer, calls its own VRML browser, but others are easily substituted. Hopefully, the Dimension-X people know of or sell a web browser with built-in 3D support. Otherwise, we will use a well-known VRML browser such as Cosmo Player and adapt it for use in the Library.

Many simulator games now support 3D viewing.
The Portable Library will be able to work with all such games and programs. These are meant to be used with LCD 'flicker' glasses. The lenses go dark and then light, allowing only one eye to see an image at any one time, usually at the display's sync rate (e.g. 60 hertz). By presenting to each eye one image of a stereoscopic pair, a fairly effective 3D effect is produced. The Portable Library's 3D coating is compatible with this method. We should verify this with MS Flight Simulator, a popular network flight simulator, and some other popular programs.

Feature: Panoramic Viewer

Description: The displays are placed side by side for viewing videos with wide aspect ratios, such as panoramic movies. Unlike the QTVR arrangement, the displays are not arranged in a semicircle but as one large flat display.

Procedure: Although this is a simple concept, achieving a workable view will be difficult. When watching movies and looking at images, one should not see a border along each portion of the image mapped by a display. Yet just about all displays have a border along all edges, somewhat like a picture frame (in the case of an LCD panel, this is probably the display circuit board). The ideal solution is to find a display with no borders along at least one dimension. Without this, we may be able to achieve the same effect with real-time optical correction and some type of mirror arrangement. Please see the Addendum: Concerning making a flat display with no gaps (Gordon, I think we might have some patents here).

Regardless of how it is achieved, speech, hand motion, and the touch screen will control the Panoramic Viewer. As with the QTVR viewer, it will make use of the IR position sensors on the library's three display support arms. A program will be written that changes the display mode to one large viewport with each display mapping into a different portion of the view. Again as with the QTVR viewer, the size and aspect ratio of each viewport will be changed to match the FOV of the combined monitors, and a system message will be added that reminds the user to reposition the displays when he is viewing a QTVR image.

A user will be able to use the Panoramic Viewer for editing and interacting with video (such as is done with enhanced DVDs). While watching a news clip, the user could say "stop" followed by "save as example one", which would result in a clip of video being saved onto the hard drive. He could also use hand motions or the touch screen to change the speed of the clip. The user will also be able to manipulate static images. He will be able to save an image to disk by using his finger to trace around a frozen video segment (as one would with a mouse in MS Photo Editor).

Feature: A Friendly Librarian

Description: The user can easily search both the Library's material and the Internet without needing to know browser commands. The Librarian then presents the material in a manner appropriate to the media; for example, video is sent to the Wide Screen TV and audio is channeled to the music system.

Procedure: This is a very challenging feature. It must make use of the software routines of every SDK used in the library.
Software must be written to allow control by voice, hand gestures, touch screen, and keyboard entry. By voice, the user will be able to address the librarian, ask a question, and receive a packaged "PowerPoint"-type presentation of the information. The initial version will try to simplify the problem by limiting the information and media to that obtained from the Internet via the web site of Margaret Anderson. This allows the program that parses the information to rely on a common data format. A Java/C++ programmer will be needed.

Even using Margaret's web site as a test site, various programs must be written and integrated. The first is a control program that monitors all I/O, meaning keyboard, voice, gesture, and desktop commands. Upon activation by the command 'Librarian' (given by voice, for example, by calling out "Librarian"), the program must parse the user's question and then output one or more browser commands to obtain the information. Having received it, the program must then select the appropriate medium and present the information. The difficulty lies, of course, in programming the presentation.

Feature: Hand Motion Control



Description: The user can control various aspects of the Library by using hand gestures. These are the following:
  QTVR viewer
  Video
  Photographic images
  Music
Hopefully this will be one of the more natural ways of controlling the library. A person could sweep his hands across a QuickTime VR image and have it rotate around him. He could point to a display icon and launch it by literally making a throwing motion or other such gesture. With photos or other graphics, a user can use his hands to pan an image larger than the viewport. For video, the user can use gestures for editing commands such as stop, fast forward, and reverse. For music, the user can interact with any of the numerous programs available for creating, conducting, and playing music. For example, a user could change tempo, switch or add instruments, change the rhythm, or 'turn' a page while playing.

Procedure: Software from Reality Fusion will be used for recognizing gestures. This software analyzes the video input in real time so as to respond to specific hand and body gestures by the user. To use it, a small lightweight video camera facing the user will be attached to each display panel. Using the Software Developer's Kit (SDK) provided by Reality Fusion, we will write software to interface it with the Library. The software will send the appropriate user I/O (most likely emulating ASCII from the keyboard) when it senses a command gesture. Reality Fusion will be used for the following:
  QuickTime VR Viewer
  DrumBox (shareware)
  Microsoft Photo Editor (GIF, JPG and others)
  QuickTime Viewer (MOV)
  Microsoft Active Player (MPG)

Feature: Speech Control

Description: The user controls various functions of the Library by voice. Voice commands will be used for the following:
  Windows commands
  Web browser
  Library web pages
  Display settings (brightness, aspect ratio, contrast, etc.)
  Video (editing, stop, start, rewind, etc.)
  Music (loudness, bass, treble, etc.)

Procedure: The software ViaVoice will be used for speech control. Like most voice recognition systems, it comes with a command set for controlling Windows (such as file commands) and with procedures for setting up command sets for programs. With this we can give the user control over the display settings, since they are adjustable by the operating system. Command sets will be written for the other functions. The first command set will be written for the browser, since it will be the primary user interface. By doing so we add voice commands to the Library's web pages. To add voice capability for video, music, and the display mode, additional command sets will be set up for the most commonly used video and music editor programs, as well as for changing the display. The programs that will have command sets written are the following:
  Microsoft Photo Editor
  QuickTime Viewer
  QuickTime VR Viewer
  Microsoft Active Player
Additional command sets will also be written for the audio player and other multimedia programs that were provided with the computer when purchased. For show demonstrations, we need to make sure we can use voice-independent commands for at least a subset of those needed. If ViaVoice does not supply this, then we will obtain an additional software package that does, to be used in conjunction with ViaVoice.

Feature: Touch Screen Control

Description: The user can control the Library using touch screens on its three displays. Using the software typically supplied with the screens, the user will be able to select icons, move scroll bars, etc., just as if using a mouse.

Procedure: The touch screen will use an active optical matrix keypad and interface with the Library computer through the serial port. These work by placing infrared emitters and detectors around the perimeter of the display. Unlike the pressure-sensitive kind, they do not obscure the image. We will also have fewer problems with vibration because they do not require an actual press on the screen. We may have to use the pressure-sensitive kind if the optical system's infrared interferes with the tracking system that is used for display and other positional information. If this is the case, we will resort to a touch membrane interface.
Touch screens typically come with software that lets a user control the screen as if with a mouse. We will use this for most of the functions in the library that make use of the touch screen (those that are mouse-like). If the library is using the displays as a group (such as with QTVR), the touch screen will be active across all three screens with one shared viewport.

Feature: Three-Camera Viewport

Description: Each of the three LCD displays has a lightweight video camera attached to it. Used with a good user interface, this feature could be one of the more marketable aspects of the library. The cameras have the added benefit of providing three viewpoints of the participants in a conference. In addition, because a user of the Portable Library can reposition the displays and cameras, he can move them so as to interact with other participants. We could greatly enhance the effectiveness of this type of videoconferencing by designing utilities that help the user work with the three-viewport arrangement. These could include such things as a 3D global display of all participants in one picture, or a tactile response when remote displays 'bump' into each other.

Another possibility is stereoscopic videoconferencing. This is possible in the Portable Library because all the displays are 3D capable. If the user points two of the cameras/displays towards a view, the two images would be used to produce stereoscopic pairs that result in a three-dimensional view of the scene. These enhanced video features are well suited for distance learning and for teleoperations such as remote surgery.

Feature: Three-dimensional volumetric reconstruction

Description: The user of the library can use the cameras to reconstruct a 3D volumetric image of whatever he places in front of the library. If he does this to himself, he is also provided with a 3D analogue suitable for use as an avatar.

Procedure: Several commercial software companies make products that construct a 3D object from pictures of the object taken from several viewpoints. There is one company (TBN) that makes software to wrap a picture of a person intelligibly onto a generic 3D model of a head. We will obtain these products, and hopefully their SDKs, so as to do this in real time in the library.

Materials

Hardware:
  Windows 95 machine optimized for graphics and as fast as possible (700 MHz), with monitor
  Three PCI graphics cards
  Three display support and position systems (modified lamp arm)
  Three very lightweight LCD displays
  Lenticular stereoscopic coating for LCD displays
  Three very lightweight video cameras
  A multi-sensor IR tracking system
  A high-speed Internet connection

Software:
  ViaVoice (IBM)
  Reality Fusion's SDK
  SDK for IR tracking system
  SDK for the touch screen
  Microsoft Photo Editor
  QuickTime Viewer
  QuickTime VR Viewer
  Microsoft Active Player
  Friendly Librarian control interfaces

Cost:
  Windows PC             3000.00
  Graphics cards         3 @ 350.00
  Monitor arms           3 @ 275.00
  LCD coatings           3 @ 13000.00 (approximate)
  Video cameras          3 @ 100.00
  IR tracking system     15000.00
  Internet connection    TBD
  Software Librarian     TBD
  All Software           TBD
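The priced items in the cost list can be totaled with a short script. This is a minimal sketch that uses only the figures listed above; items marked TBD are excluded from the subtotal, and the variable and function names are illustrative.

```python
# Priced line items from the cost list: (item, quantity, unit price).
PRICED_ITEMS = [
    ("Windows PC", 1, 3000.00),
    ("Graphics cards", 3, 350.00),
    ("Monitor arms", 3, 275.00),
    ("LCD coatings (approximate)", 3, 13000.00),
    ("Video cameras", 3, 100.00),
    ("IR tracking system", 1, 15000.00),
]

# Items whose cost is still to be determined; they are not part of the subtotal.
TBD_ITEMS = ["Internet connection", "Software Librarian", "All Software"]


def subtotal(items):
    """Sum quantity times unit price over all priced items."""
    return sum(qty * unit for _, qty, unit in items)


if __name__ == "__main__":
    for name, qty, unit in PRICED_ITEMS:
        print(f"{name:<28}{qty} @ {unit:>9.2f} = {qty * unit:>9.2f}")
    print(f"Subtotal excluding TBD items: {subtotal(PRICED_ITEMS):.2f}")
    print("Still to be determined: " + ", ".join(TBD_ITEMS))
```

The priced items come to 59175.00, of which the LCD coatings (39000.00) and the IR tracking system (15000.00) make up the bulk. The Internet connection and software costs remain to be added.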

Notes:

Concerning a flat display with no gaps: As was pointed out in describing the Panoramic Viewer, building one out of multiple displays requires a display with no border. At this point, I am not aware of any available. This is surprising, since there must be a great demand for this characteristic in a display. Stadiums and large events could use it to make extremely large configurable displays of any type they need (a display is an analog of a pixel: more pixels, more display). We should investigate further to see if this may be a profitable product. Unfortunately, as of right now I do not know of a way to solve this problem for such a market. In the more limited case of the Portable Library, we have only three monitors that can be placed side by side. We may be able to use mirrors and a rearrangement of the viewport to accomplish this. I will present a small article if the interest is there.

Concerning the serial ports: As specified, almost all the I/O is done via the serial port. We should, however, explore the use of other, faster communication methods, for example FireWire. The serial ports are specified to ensure a baseline level of communication so as to reduce unknowns and ensure success. A lighter, perhaps wireless, high-resolution video transmission would be the most desirable.

Concerning the IR Tracker: This is an expensive item, and one may think it overkill for tracking the positions of the displays. However, other methods, such as mechanical resistance measures, would require the use of a contractor to build the tracker. This would most likely be more expensive and take more time than using an IR tracker. Furthermore, by placing an additional sensor on the user (typically a small IR-reflective sticker), we gain the capability of producing Virtual Windows in the simulation. A Virtual Window is a close cousin to my device, the Portal Display (and its basis). Virtual Windows work by monitoring the head position of the user so that the graphics seen through a window, such as that of an airplane, car, or other vehicle, are accurately corrected for the optical path between the user and the scene.
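One common way to apply this kind of head-position correction is an off-axis (asymmetric) viewing frustum. The sketch below shows the idea only; it is not the Portal Display method. It assumes the tracker reports the head position in the display's own coordinate frame (display centered at the origin in the plane z = 0, x to the right, y up, head at distance head_z in front of it), and every name in it is hypothetical.

```python
from typing import NamedTuple


class Frustum(NamedTuple):
    left: float
    right: float
    bottom: float
    top: float
    near: float


def window_frustum(head_x: float, head_y: float, head_z: float,
                   width: float, height: float,
                   near: float = 0.1) -> Frustum:
    """Compute near-plane frustum bounds for a viewer looking through a display.

    The display edges are projected from the head position onto the near
    plane, so the rendered view shifts as the head moves, as it would when
    looking through a real window.
    """
    if head_z <= 0:
        raise ValueError("head must be in front of the display (head_z > 0)")
    scale = near / head_z  # similar triangles: display plane -> near plane
    return Frustum(
        left=(-width / 2 - head_x) * scale,
        right=(width / 2 - head_x) * scale,
        bottom=(-height / 2 - head_y) * scale,
        top=(height / 2 - head_y) * scale,
        near=near,
    )
```

With the head centered, the bounds are symmetric and this reduces to an ordinary perspective view. As the head moves to the right, both the left and right bounds shift to the left, which reveals more of the scene on the left side of the window. The resulting bounds have the same form as the parameters of a glFrustum-style projection call.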
