TECHNOLOGY WATCH REPORT Human-enhanced time-aware multimedia search
CUbRIK Project IST-287704 Deliverable D11.1 WP11
Deliverable Version 1.0 – 30 September 2012
Document ref.: cubrik.D111.ATN.WP11.V1.0
Programme Name: IST
Project Number: 287704
Project Title: CUBRIK
Partners: Coordinator: ENG (IT); Contractors: UNITN, TUD, QMUL, LUH, POLMI, CERTH, NXT, MICT, ATN, FRH, INN, HOM, CVCE, EIPCM
Document Number: cubrik.D111.ATN.WP11.V1.0
Work-Package: WP11
Deliverable Type: Document
Contractual Date of Delivery: 30 September 2012
Actual Date of Delivery: 30 September 2012
Title of Document: Technology Watch Report
Author(s): Ralph Traphöner
Approval of this report:
Summary of this report:
History:
Keyword List:
Availability: This report is public
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. This work is partially funded by the EU under grant IST-FP7-287704.
Disclaimer
This document contains confidential information in the form of the CUbRIK project findings, work and products and its use is strictly regulated by the CUbRIK Consortium Agreement and by Contract no. FP7-ICT-287704. Neither the CUbRIK Consortium nor any of its officers, employees or agents shall be responsible or liable in negligence or otherwise howsoever in respect of any inaccuracy or omission herein. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7-ICT-2011-7) under grant agreement n° 287704. The contents of this document are the sole responsibility of the CUbRIK consortium and can in no way be taken to reflect the views of the European Union.
Table of Contents

Executive Summary
1. Multimedia Search
   1.1 Motivation
       1.1.1 Requirements
   1.2 State of the Art and Challenges
   1.3 Platforms and Components for Multimedia Search
   1.4 Content Processes
       1.4.1 Content Processing Components
   1.5 Query Process
   1.6 MIR Metadata Standards
   1.7 Commercial MIR Engines
2. Human Computation
   2.1 Games with a Purpose (GWAP)
   2.2 Crowdsourcing
   2.3 Human Computation Challenges
   2.4 Platforms and Components for Human Computation
3. General Purpose Search Systems and Process Engines
   3.1 Framework Comparison
   3.2 Assessment of General Integration Frameworks
       3.2.1 SOPERA, Swordfish
       3.2.2 SAP Enterprise SOA, NetWeaver
       3.2.3 Oracle Fusion Middleware
   3.3 Assessment of Search Frameworks
       3.3.1 Apache Hadoop, Nutch and Solr
       3.3.2 UIMA
       3.3.3 SMILA
       3.3.4 Summary
4. Bibliography
   4.1 Multimedia Information Retrieval and Human Computation
   4.2 General Purpose Search and Process Engines
   4.3 General References
Executive Summary
This report contains an overview of the commercial market and of the most practical academic realizations in the domain of human-enhanced multimedia search. The report starts with an overview of the achievements in the two pillar disciplines at the basis of CUbRIK: Multimedia Information Retrieval (MIR) and Human Computation (HC). For these sectors, which are more recent and less consolidated in terms of commercial products and open source frameworks, the technology watch overviews the main architectural principles and challenges, and then surveys examples of applications and, when available, of products. The last section of the report concentrates on enterprise search systems and business process engines, defined as broad-spectrum solutions, either open source or from private companies, which can be applied to address the information needs of enterprises. This part of the technology watch addresses a more consolidated sector, and thus is organized in a more conventional way, based on the identification and comparison of product features. The list of reviewed products comprises both commercial and open source frameworks. It concludes with an overview of SMILA, the open source framework at the base of the CUbRIK platform, which is thereby put in context with respect to its main competitors.
1. Multimedia Search
The Web is progressively becoming a multimedia content delivery platform. This trend poses severe challenges to information retrieval theories, techniques and tools. We start this report with an introduction that defines the problem of multimedia information retrieval with its challenges and application areas, overviews its major technical issues, and shows the reference architecture unifying the aspects of content processing and querying. The report then exemplifies next-generation platforms for multimedia search.
1.1 Motivation
The growth of digital content has reached impressive rates in the last decade, fueled by the advent of the so-called “Web 2.0” and the emergence of user-generated content. At the same time, the convergence of the fixed-network Web, mobile access, and digital television has boosted the production and consumption of audio-visual materials, making the Web a truly multimedia platform. This trend challenges search as we know it today, due to the more complex nature of multimedia with respect to text, in all the phases of the search process: from the expression of the user’s information need to the indexing of content and the processing of queries by search engines. We start the technology watch report by giving a concise overview of Multimedia Information Retrieval (MIR), the long-standing discipline at the base of audio-visual search engines.
1.1.1 Requirements The requirements of a MIR application go beyond text information retrieval in a variety of ways [56]:
Opacity of content: whereas in text IR the query and the content use the same medium, MIR content is opaque, in the sense that the knowledge necessary to verify if an item is relevant to a user’s query is deeply embedded in it and must be extracted by means of a complex pre-processing (e.g., extracting speech text from a video).
Query formulation paradigm: the user’s information need can be formulated not only by means of keywords, as in traditional search engines, but also by analogy, e.g., by providing a sample of content “similar” to what the user is searching for.
Relevance computation: in text search, the relevance of documents to the user’s query is computed as the degree of similarity between the vectors of words appearing in the document and in the query (modulo lexical transformations); a small similarity sketch is given after the application list below. In MIR, the comparison must be done on a wide variety of features, characteristic not only of the specific medium in which the content and the query are expressed, but even of the application domain (e.g., two audio files can be deemed similar in a music similarity search context, but dissimilar in a topic-based search application).
MIR application requirements have been extensively addressed in the last three decades, both in industry and academia. As a consequence, MIR is now a consolidated discipline, adopted in a wide variety of domains [67]. The following is a non-exhaustive list of recent applications:
Architecture, real estate, and interior design (e.g., searching for ideas).
Broadcast media selection (e.g., radio channel [41], TV channel).
Cultural services (history museums [94], art galleries, etc.).
Digital libraries (e.g., image catalogues [36], musical dictionaries, bio-medical imaging catalogues [11], film, video and radio archives).
E-Commerce (e.g., personalized advertising, on-line catalogues [29], directories of e-shops).
Education (e.g., repositories of multimedia courses, multimedia search for support material).
Home Entertainment (e.g., systems for the management of personal multimedia collections [36], including manipulation of content, e.g. home video editing [4], searching a game, karaoke).
Investigation services (e.g., human characteristics recognition [19], forensics [20]).
Journalism (e.g. searching speeches of a certain politician [35] using his name, his voice or his face [19]).
Multimedia directory services (e.g. yellow pages, Tourist information, Geographical information systems).
Multimedia editing (e.g., personalized electronic news service, media authoring [57]).
Remote sensing (e.g., cartography, ecology [104], natural resources management).
Social (e.g. dating services, podcast).
Surveillance (e.g., traffic control, surface transportation, non-destructive testing in hostile environments).
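As a concrete illustration of the textual relevance computation mentioned above, the following minimal sketch (plain Python, with toy documents that are purely illustrative) computes the cosine similarity between term-frequency vectors; MIR engines replace these word counts with audio-visual feature vectors, but the matching principle is the same.

```python
from collections import Counter
from math import sqrt

def tf_vector(text):
    # Term-frequency "bag of words" vector for a toy document.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

doc = tf_vector("jazz concert recorded at the museum")
query = tf_vector("jazz museum")
print(cosine(doc, query))   # higher values mean higher textual relevance
```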
1.2 State of the Art and Challenges
Multimedia search engines and their applications operate on a very heterogeneous spectrum of content, ranging from home-made content created by users to high-value premium content, like feature film video. The quality of content largely determines the kind of processing that is possible for extracting information and the kind of queries that can be answered.
Challenge 1: Content Acquisition
In text search engines, content comes either from a closed collection (as, e.g., in a digital library) or is crawled from the open Web. In MIR, content is acquired from many sources and in multiple ways:
From media-production devices (e.g., scanners, digital cameras, smartphones, etc.).
By crawling the Web or local document repositories.
By user’s contribution.
By syndicated contribution from content aggregators.
Via broadcast capture (e.g., from air/cable/satellite broadcast, IPTV, Internet TV multicast, etc.).
The most important factors affecting what can be done with multimedia assets (apart from their editorial value) are their intrinsic quality (e.g., the definition of an image or the encoding format of a video) and the quality of the metadata associated with them. Metadata are textual descriptions that accompany a content element; they can range in quantity and quality, from no description (e.g., Web cam content) to multilingual data (e.g., closed captions and production metadata of motion pictures). Metadata can be found:
1. Embedded within content (e.g., closed captions).
2. In surrounding Web pages or links (e.g., HTML content, link anchors, etc.).
3. In domain-specific databases (e.g., IMDB [137] for feature films).
4. In ontologies (e.g., those listed in the DAML Ontology Library [138]).
Challenge 2: Content Normalization
In textual search engines, content is subjected to a pipeline of operations that prepare it for indexing [10]; such pre-processing includes parsing, tokenization, lemmatization, and stemming. With text, the elements of the index are of the same nature as the constitutive elements of the content: words. Multimedia content needs a more sophisticated pre-processing phase, because the elements to be indexed (called “features” or “annotations”) are numerical and textual metadata that must be extracted from raw content by means of complex algorithms. Even prior to processing content for metadata extraction, it may be necessary to submit it to a normalization step, with a twofold purpose: 1) translating the source media items, represented in different native formats, into a common “internal” representation format (e.g., MPEG-4 [15] for video files), to ease the development and execution of the metadata extraction algorithms; 2) producing alternative variants of native content items, e.g., to provide freebies (free sample copies) of copyrighted elements or low-resolution copies for distribution on mobile or low-bandwidth delivery channels (e.g., making a 3GP version [14] of video files for consumption on mobile phones).
Challenge 3: Content Indexing
Indexes are a concise representation of the content of an object collection. Multimedia content cannot be indexed as it is: features must be extracted from it, and such features must be both sufficiently representative of the content and compact enough to optimize storage and retrieval. Features are traditionally grouped into two categories:
Low level features: concisely describe physical or perceptual properties of a media element (e.g., the colour or edge histogram of an image).
High level features: domain concepts characterizing the content (e.g., extracted objects and their properties, geographical references, content categorizations, etc.).
As in text, where the retrieved keywords can be highlighted in the source document, in MIR there is also the need to locate the occurrences of matches between the user’s query and the content. This requirement implies that features must be extracted from a continuous medium, and that the coordinates in space and time of their occurrence must be extracted as well (e.g., the time stamp at which a word occurs in a speech audio file, the bounding box where an object is located in an image, or both pieces of information to denote the occurrence of an object in a video). Feature detection may even require a change of medium with respect to the original file, e.g., a speech-to-text transcription.
Challenge 4: Content Querying
Text IR starts from a user’s query, formulated as a set of keywords, possibly prefixed by logical operators (AND, OR, NOT). The semantics of query processing is text similarity: both the text files and the query are represented in a common logical model (e.g., the word vector model [82]), which supports some form of similarity measure (e.g., cosine similarity between word vectors). In MIR, the expression of the user’s information need allows for alternative query representation formats and matching semantics. Examples of queries can be:
Textual: one or more keywords, to be matched against textual metadata extracted from multimedia content.
Mono-media: a content sample in a single media (e.g., an image, a piece of audio) to be matched against an item of the same kind (e.g., query by music or image similarity, query by humming) or of a different medium (e.g., finding the movies whose soundtrack is similar to an input audio file).
Multi-media: a content sample in a composite medium, e.g., a video file to be matched using audio similarity, image similarity, or a combination of both.
Accepting input queries expressed by means of non-textual samples requires real-time content analysis capabilities, which pose severe scalability requirements on MIR
architectures. Another implication of non-textual queries is the need for the MIR architecture to coordinate query processing across multiple dedicated search engines: for example, an image similarity query may be answered by coordinating an image similarity search engine specialized in low-level feature matching and a text search engine matching high-level concepts extracted from the query (e.g., object names, music genre, etc.).
Challenge 5: Content Browsing
Unlike data queries, IR queries are approximate and thus results are presented in order of relevance, and often in a number that exceeds the user’s capacity for selection. Typically, a text search engine summarizes and pages the ranked results, so that the user can quickly understand the most relevant items. In MIR applications, understanding whether a content element is relevant poses additional challenges. On one side, content summarization is more complex [5]: for example, summarizing a video may be done in several alternative ways: by means of textual metadata, with a selection of key frames, with a preview (e.g., the first 10 seconds), or even by means of another correlated item (e.g., the free trailer of a copyrighted feature film). The interface must also permit users to quickly inspect continuous media and locate the exact point where a match has occurred. This can be done in many ways, e.g., by means of annotated time bars that permit one to jump into a video where a match occurs, with VCR-like commands, and so on. Figure 1 shows a portion of the user interface of the PHAROS multimedia search platform [15] for accessing video results of a query: two time bars (labeled “what we hear”, “what we see”) allow one to locate the instant where the matches for a query occur in the video frames and in the audio, inspect the metadata that support the match, and jump directly to the point of interest.
Figure 1: Visual and aural time bars in the interface of the PHAROS search platform
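To make the notion of time-coded matches concrete, the sketch below uses a simplified, hypothetical data model (it is not the PHAROS API): annotations carry the time interval in which they occur, so a UI can render them on a time bar and let the user jump to the matching segment.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    label: str         # high-level concept, e.g. "speech: economy" or "face: politician X"
    start: float       # seconds from the beginning of the stream
    end: float
    confidence: float  # analysis operators are probabilistic

def locate(annotations, keyword):
    # Return the time intervals whose annotations match the query keyword.
    return [(a.start, a.end, a.confidence)
            for a in annotations
            if keyword.lower() in a.label.lower()]

video_annotations = [
    Annotation("speech: economy", 12.0, 34.5, 0.81),
    Annotation("face: politician X", 30.0, 41.0, 0.66),
]
print(locate(video_annotations, "politician"))  # -> [(30.0, 41.0, 0.66)]
```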
1.3 Platforms and Components for Multimedia Search
The architecture of a MIR system [32] can be described as a platform for composing, verifying and executing search processes, defined as complex workflows made of atomic blocks, called search services, as illustrated in Figure 2. The most important categories of MIR processes are Content Processes, which have the objective of acquiring multimedia content from external sources (e.g., from the user or from a video portal) and extracting features from it, and Query Processes, which have the objective of acquiring a user’s information need and computing the best possible answer to it. Accordingly, the most important categories of search services are content services, which embody functionality relevant to content acquisition, analysis, enrichment, and adaptation, and query services, which implement all the steps for answering a query and computing the ranked list of results.
1.4 Content Processes
A content process (such as the one schematized in Figure 2) aims at gathering multimedia content and processing it to make it ready for information retrieval. A MIR platform may host multiple content processes, as required for processing content of different nature, in different domains, for different access devices, for different business goals, etc. The input to the process is twofold:
Multimedia Content (image, audio, video).
Information about the content, which may include publication metadata (HTML, podcast [139], RSS [140], MediaRSS [141], MPEG-7, etc.), quality information (encoding, user’s rating, owner’s rating, classification data), access rights (DRM data, licensing, user’s subscriptions), and network information (type and capacity of the link between the MIR platform and the content source site; e.g., access can be local disk-based, remote through a LAN/SAN, a fixed WAN, a wireless WAN, etc.).
Figure 2: Example of a MIR Content Process
The output of the process is the textual representation of the metadata that capture the knowledge extracted from the multimedia content via content processing operations. Metadata vocabularies and analysis techniques for multimedia content are discussed in Section 1.6 and Section 1.4.1 respectively. A content process can be designed so as to dynamically adapt to the external context, e.g.,
as follows:
By analyzing the content metadata (e.g., manual annotations) to dynamically decide the specific analysis operators to apply to a media element (e.g., if the collection denotes indoor content, a heuristic rule may decide to skip the execution of outdoor object detection).
By analyzing the access rights metadata to decide the derived artefacts to extract (e.g., if content has limited access, it may be summarized in a freebie version for preview).
By analyzing the geographical region where the content comes from (e.g., inferring the location of the publisher may allow the process to apply better heuristic rules for detecting the language of the speech and call the proper speech-to-text transcriptor).
By understanding the content delivery modality (e.g., a real-time stream of a live event may be indexed with a faster, even if less precise, process for reducing the time-to-search delay interval).
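Such context-driven adaptation can be sketched as a rule that selects analysis operators from the content metadata. The snippet below is a minimal illustration with hypothetical metadata keys and operator names; it is not CUbRIK code.

```python
def plan_content_process(metadata):
    """Pick analysis steps for one media item based on its metadata."""
    steps = ["normalize_format"]
    if metadata.get("setting") != "indoor":          # heuristic from manual annotations
        steps.append("outdoor_object_detection")
    if metadata.get("access") == "restricted":       # rights-aware adaptation
        steps.append("generate_low_res_freebie")
    if metadata.get("live_stream", False):           # trade precision for time-to-search
        steps.append("fast_speech_to_text")
    else:
        steps.append("accurate_speech_to_text")
    return steps

print(plan_content_process({"setting": "indoor", "access": "restricted", "live_stream": True}))
```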
1.4.1 Content Processing Components
Content processing is the activity performed over a content item with the aim of creating a representation suitable for indexing and retrieval purposes. The way contents are processed is application dependent, as it relates to the nature of the processed items. Information Retrieval (IR) systems typically deal with textual contents. MIR systems are not an exception, as information is often represented in a textual format. Therefore, textual processing is a common activity in MIR applications, and its specification embodies a well-defined set of standard operations for text analysis. The goal of such operations is to deal with basic language units such as morphemes, roots, stems, words, phrases, sentences, etc., in order to fine-tune the retrieval performance of the system. Typical examples of such operations are:
Language Detection: determines the language of the textual content to be indexed. Such an operation is often used to discriminate or filter text before further analysis operations are applied. The aim is to improve the retrieval performance by addressing texts to the proper analysis flow (e.g., English text must be processed considering English dictionaries and grammatical rules, which are different, for instance, from those of the German language).
Spell Checking, Spelling Correction, and Spelling Variants: textual terms can be misspelled or can use an uncommon spelling. These operations help in identifying, correcting or substituting incorrect spellings (e.g., those produced by speech-to-text algorithms).
Lemmatization: consists of substituting words with their stems. Lemmatization is often used to tune retrieval performance because it reduces variants of the same root word to a common concept, while containing the size of the index structure.
Stopword Removal: languages have words that do not carry semantic meaning and, hence, do not convey a concept that is useful for retrieval purposes (e.g., articles, conjunctions, etc.). Such words are named stopwords and they are usually removed before indexing.
MIR systems, with respect to typical IR systems, must also process audio, video or images in order to produce annotations. A Mono-Annotation analysis is defined as an analysis operation where a single combination of file type and content type (e.g., the audio track of a video file) is represented by a single annotation set. On the contrary, a Multi-Annotation analysis provides multiple viewpoints over the same content, in order to produce more valuable annotations: for instance, the audio track of a movie can be first analyzed to identify the speaking actors and then to segment it according to speakers’ turns. Multiple annotations can be considered separately, as independent descriptions of the analyzed content, or jointly, in order, for instance, to raise the overall confidence in the produced metadata. In the former case, the annotations associated with the managed contents are defined as Mono-Modal; otherwise, we talk about Multi-Modal annotations. Using the example of the movie file, the fact that in a single scene both the face and the voice
of a person are identified as belonging to an actor “X” can be considered as a correlated event, so as to describe the scene as “scene where actor X appears” with high confidence. Multi-modality is typically achieved by means of annotation fusion techniques: media processing operations are probabilistic processes, where the result is characterized by a confidence value; multiple features extracted from media data can be fused to yield more robust classification and detection [66]. For instance, multiple content segmentation techniques (e.g., shot detection and speaker’s turn segmentation) can be combined in order to achieve better video splitting; voice identification and face identification techniques can be fused in order to obtain better person identification. Typically, multi-annotation and multi-modality can be expensive goals to achieve, thus limiting their adoption in domains where fast indexing performance must be traded against accuracy of the content descriptions. Multimedia processing operations can be roughly classified in three macro categories: transformation, feature extraction and classification.
Transformation: to convert the format of media items. For instance, a video transformer can modify a movie file in DVD quality to a format more suitable for analysis (e.g., MPEG); likewise, an audio transcoder can transform music tracks encoded in MP3 to WAV, so to enable subsequent analysis operations.
Feature extraction: to calculate low-level representations of media contents, i.e. feature vectors, in order to derive a compact, yet descriptive, representation of a pattern of interest [86]. Such representations can be used to enable content based search, or as input for classification tasks. Examples of visual features are colour, texture, shape, etc. [38]; examples of music features are loudness, pitch, tone (brightness and bandwidth), Mel-filtered Cepstral Coefficients [93], and more.
Classification: to extract and assign conceptual labels to content elements by analyzing their raw representations; the techniques required to perform these operations are commonly known as machine learning. For instance, an image classifier can assign to image files annotations expressing the subject of the pictures (e.g., mountains, city, sky, sea, people, etc.), while an audio file can be analyzed in order to discriminate segments containing speech from those containing music.
Arbitrary combinations of transformation, feature extraction and classification operations can result in several analysis algorithms. Table 1 presents a list of typical audio, image and video analysis techniques; the list is not intended to be complete, but rather to give a glimpse of the analysis capabilities currently available for MIR systems; to provide the reader with a hook to recent advancements in the respective fields, each analysis technique is referenced with a recent survey on the topic.

Audio Analysis:
- Audio segmentation [67]: to split an audio track according to the nature of its content. For instance, a file can be segmented according to the presence of noise, music, speech, etc.
- Audio event identification [76]: to identify the presence of events like gunshots and screams in an audio track.
- Music genre (mood) identification [84]: to identify the genre (e.g., rock, pop, jazz, etc.) or the mood of a song.
- Speech recognition [6]: to convert words spoken in an audio file into text. Speech recognition is often associated with speaker identification [74], that is, assigning an input speech signal to one person of a known group.

Image Analysis:
- Semantic concept extraction [60]: the process of associating high-level concepts (like sky, ground, water, buildings, etc.) to pictures.
- Optical character recognition [13]: to translate images of handwritten, typewritten or printed text into editable text.
- Face recognition and identification [91,105]: to recognize the presence of a human face in an image, possibly identifying its owner.
- Object detection and identification [103]: to detect and possibly identify the presence of a known object in a picture.

Video Analysis:
- Scene detection [78]: detection of scenes in a video clip; a scene is one of the subdivisions of a play in which the setting is fixed, or that presents continuous action in one place [79].
- Video text detection and segmentation [70]: to detect and segment text in videos in order to apply image OCR techniques.
- Video summarization [40]: to create a shorter version of a video by picking important segments from the original.
- Shot detection [23]: detection of transitions between shots. Often shot detection is performed by means of key-frame segmentation [53] algorithms that segment a video track according to the key frames produced by the compression algorithm.

Table 1: Content analysis techniques and components in MIR systems.
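As a minimal example of the low-level feature extraction listed in Table 1, the sketch below computes a normalized grey-level histogram of an image with Pillow and NumPy and compares two images by histogram intersection. The file names are placeholders, and the sketch is not tied to any specific MIR component.

```python
import numpy as np
from PIL import Image

def grey_histogram(path, bins=16):
    # Low-level feature: normalized grey-level histogram of an image.
    pixels = np.asarray(Image.open(path).convert("L"))
    hist, _ = np.histogram(pixels, bins=bins, range=(0, 256))
    return hist / hist.sum()

def similarity(h1, h2):
    # Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones.
    return float(np.minimum(h1, h2).sum())

# Hypothetical file names; any two images will do.
print(similarity(grey_histogram("query.jpg"), grey_histogram("indexed.jpg")))
```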
1.5 Query Process
A MIR Query process (like the one schematized in Figure 4) accepts in input an information need and formulates the best possible answer from the content indexed in the MIR platform, and possibly from content indexed by other external search engines or published on the Web.
Figure 3: Example of a MIR Query Process
The input of the query process is an information need, which can be a keyword or a content sample. The output of the query process is a result set, which contains information on the objects (typically content elements) that match the input query. The description of the objects in the result set can be enriched with metadata coming from sources external to the MIR platform (e.g., additional metadata on a movie taken from IMDB, or a map showing the position of the object taken from a Geographical Information System). A collateral source of input is the query context, which expresses additional circumstances about the information need, often implicit. Well-known examples of query context are: user preferences, past user queries and their responses, access device, location, access rights, and so on. The query context is used to adapt the query process, e.g., to expand the original information need of the user with additional keywords reflecting her preferences, or to disambiguate a query term based on the application domain where the query process is embedded. Queries are classified as mono-modal, if they are represented in a single medium (e.g., a
text keyword, a music fragment, an image) or multi-modal, if they are represented in more than one medium (e.g., a keyword AND an image). As in Search Computing, MIR queries can also be classified as mono-domain, if they are addressed to a single search engine (e.g., a general-purpose image search engine like Google Images [121] or a special-purpose search service like the Empora [122] garment search), or multi-domain, if they target different, independent search services (e.g., a face search service like Facesaerch [123] and a video search service like Blinkx [64]). Table 2 exemplifies the domain and mode classification of queries:

Mono Modal, Mono Domain: Find all the results that match a given keyword; find all the images similar to a given image.
Mono Modal, Multi Domain: Find theatres playing movies acted by an actor having a voice similar to a given one.
Multi Modal, Mono Domain: Find all videos that contain a given keyword and that contain a person with a face similar to a given one.
Multi Modal, Multi Domain: Find all CDs in Amazon with a cover similar to a given image.

Table 2: Examples of Mono/Multi modal, Mono/Multi domain queries in a MIR system.
Multi-modal queries use the different modes to retrieve information from one (possibly composite, distributed) search service; the result set is determined as the intersection of the items that satisfy the conditions associated with each query mode. Multi-domain queries address information needs that may span different independent search services (e.g., a MIR platform instance and a text or vertical search engine). The global result is obtained by combining the partial results computed by the independent search services.
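The combination of partial results can be illustrated with a small sketch: given the ranked lists returned by two independent search services (the item identifiers and scores below are hypothetical), a multi-modal query keeps the intersection of the two result sets and fuses the per-mode scores, e.g. with a weighted sum.

```python
def fuse(results_a, results_b, weight_a=0.5):
    """Combine two {item_id: score} result sets.

    Multi-modal semantics: keep only items matching both modes,
    and rank them by a weighted sum of the per-mode scores.
    """
    common = results_a.keys() & results_b.keys()
    fused = {item: weight_a * results_a[item] + (1 - weight_a) * results_b[item]
             for item in common}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

keyword_hits = {"video42": 0.9, "video07": 0.4, "video13": 0.7}   # text index
face_hits = {"video42": 0.6, "video13": 0.8, "video99": 0.5}      # face similarity engine
print(fuse(keyword_hits, face_hits))   # -> video13 and video42, ranked by fused score
```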
1.6 MIR Metadata Standards
The current state of the practice in content management presents a number of metadata vocabularies dealing with the description of multimedia content [34]. Many vocabularies allow the description of high-level features (e.g., title, description) or low-level features (e.g., color histogram, file format), while some enable the representation of administrative information (e.g., copyright management, authors, date). In a MIR system, the adoption of a specific metadata vocabulary depends on its intended usage, especially concerning the type of content to describe. In the following we list a set of common vocabularies currently adopted in MIR applications; the list is not intended to be complete, but rather to show how the portability, completeness and extensibility of a metadata format can affect its usage in MIR systems.
MPEG-7 [65] represents the attempt by ISO to standardize a core set of audio-visual features and structures of descriptors and their (spatial/temporal) relationships. By trying to abstract from all the possible application domains, MPEG-7 results in an elaborate and complex standard that merges both high-level and low-level features, with multiple ways of structuring annotations. MPEG-7 is also extensible, so as to allow the definition of application-based or domain-based metadata.
Dublin Core [24] is a 15-element metadata vocabulary (created by domain experts in the field of digital libraries) intended to facilitate the discovery of electronic resources, with no fundamental restriction on the resource type. Dublin Core holds just a small set of high-level metadata and relations (e.g., title, creator, language, etc.), but its simplicity has made it a common annotation scheme across different domains.
MXF (Material Exchange Format) [9] is an open file format aimed at the interchange of audio-visual material, along with associated data and metadata, for various applications used in the television production chain. MXF metadata address both high-level
and administrative information, like the file structure, key words or titles, subtitles, editing notes, location, etc. Though it offers a complete vocabulary, MXF is intended primarily as an exchange format for audio and video rather than as a description format for metadata storage and retrieval.
Exchangeable Image File Format (Exif) [49] is a vocabulary adopted by digital camera manufacturers to encode high-level metadata like date and time information, the image title and description, the camera settings (e.g., exposure time, flash), the image data structure (e.g., height, width, resolution), a preview thumbnail, etc. Being embedded in the raw picture content, Exif metadata is now a de facto standard for image management software; to support extensibility, Exif enables the definition of custom, manufacturer-dependent additional terms.
ID3 [45] is a tagging system that enriches audio files by embedding metadata information. ID3 includes a large set of high-level (such as title, artist, album, genre) and administrative information (e.g., the license, ownership, recording dates), but a very small set of low-level information (e.g., BPM). ID3 is a worldwide standard for audio metadata, adopted in a wide range of applications and hardware devices. However, the ID3 vocabulary is fixed, thus hindering its extensibility and its usage as a format for low-level features.
Other examples of multimedia-specific metadata formats are SMEF [124] for video, IPTC [125] for images and MusicXML [126] for music. In addition, several communities created some domain-specific vocabularies like LSCOM [127] for visual concepts, IEEE LOM [128] for educational resources and NewsML [129] for news objects.
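As an illustration of how embedded metadata such as Exif can be consumed by a content process, the following sketch reads the Exif tags of an image; it assumes a recent version of Pillow, and the file name is a placeholder.

```python
from PIL import Image, ExifTags

def read_exif(path):
    # Map numeric Exif tag ids to human-readable names (e.g. DateTime, Model).
    img = Image.open(path)
    exif = img.getexif()
    return {ExifTags.TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

# "photo.jpg" is a placeholder; cameras and phones embed Exif in their JPEG output.
for name, value in read_exif("photo.jpg").items():
    print(name, value)
```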
1.7 Commercial MIR Engines
Midomi (www.midomi.com) is an example of audio processing technologies applied to a music search engine. The interface allows users to upload voice recordings of public songs, and then to query such music files by humming or whistling. Another similar application is Shazam [130]. Shazam is a commercial music search engine that enables users to identify tunes using their mobile phone. The principle consists in using one’s mobile phone to record a sample of a few seconds of a song from any source (even with bad sound quality); the system returns the identified song with the necessary details: artist, title, album, etc. Similar systems for music search are also provided by BMAT [131]. Voxalead™ [132] is an audio search technology demonstrator implemented by Exalead to search TV news, radio news and VOD programs by content. The system uses a third-party speech-to-text transcription module to transcribe political speeches in several languages. The field of image search technologies also appears to be mature. Google Images [133] and Microsoft Bing [134], for instance, now offer a “show similar images” functionality, thus proving the scalability of content-based image search on the Web. Another notable example of an image MIR engine is Tiltomo [135], which also performs search according to the image theme. Blinkx [136] is another example of a search engine for videos and audio streams. Blinkx, like Voxalead™, uses speech recognition to match the text query to the video or audio speech content. Blinkx represents an example of a mature video MIR solution, as it claims to have over 30 million hours of video indexed.
2. Human Computation
In CUbRIK, multimedia processing is enhanced with the contribution of humans at all levels: content, query and feedback processing, according to the principles and techniques of Human Computation (HC), which is the discipline that aims at harmonizing the contribution of humans and computers in the resolution of complex tasks. The general principle of HC is to use computer and network architectures to organize the distributed allocation of work to a crowd of human performers, who contribute their skills in solving problems where algorithms fail or produce uncertain output, like object recognition in images or text translation. HC solutions assume a variety of forms, from crowdsourcing labor markets, to data collection or early-alerting mobile applications, to games with a purpose and crowd search. This chapter overviews the most notable applications in the market, discusses a conceptual framework that abstracts the common aspects of the existing approaches, pins down the dimensions that characterize an HC solution, and highlights some open research questions and the projects addressing them. The 2011 US Digital Consumer Report by Nielsen [70] reports that, in the US, social network/blog usage and gaming are respectively the first and second busiest online activities performed by fixed-network users, surpassing email. In the same period, mobile usage also registered a quantum leap (+28%) in the share of online time spent on social networking platforms. The rise of gaming and social network usage sets the background for the diffusion of Human Computation [93], applied in business, entertainment and science, where the online time spent by users is harnessed to help in the cooperative solution of tasks.
2.1 Games with a Purpose (GWAP)
Games with a Purpose (GWAPs) focus on exploiting the billions of hours that people spend online playing computer games to solve complex problems that involve human intelligence [92]. The emphasis is on embedding a problem-solving task into an enjoyable user experience, which can be conducted by individual users or by groups. GWAPs, and more generally useful applications where the user solves perceptive or cognitive problems without knowing it, address tasks such as adding descriptive tags and recognizing objects in images, checking the output of Optical Character Recognition (OCR) for correctness, and helping protein folding and multiple sequence alignment algorithms in molecular biology and comparative genomics research [21]. Several game design patterns have been proposed [95], which exploit different input-output templates, and the mechanics of users’ involvement has begun to be modelled formally [20]. A first game template is based on the output agreement pattern. In this case, two randomly chosen players share the same input (typically, an image or a tune) and are requested to produce some description of their common input: the objective is to reach an agreement as quickly as possible, by outputting the same descriptive label. Output agreement games are useful for annotating multimedia content assets, as humans are induced to produce semantic annotations that describe the input as accurately as possible. An example of an output agreement game is the Extra Sensorial Perception (ESP) game [94], shown in Figure 4 (left). A second game template, illustrated in Figure 4 (right), is based on the input agreement pattern, in which two players are given inputs (e.g., a tune) that are known by the game, but not by them, to be the same or different. They are requested to produce descriptions of their input, so that their partners can guess whether the inputs are the same or different. Players see only each other’s outputs and the play terminates when both players correctly determine whether they have been given the same input or not.
Figure 4: Examples of GWAPs: output agreement (left), input agreement (right)
Both input and output agreement games assume that the two players have equivalent roles, whereas the inversion-problem template differentiates between them: at each round, one player assumes the role of the “describer” and the other one that of the “guesser”. The describer receives an input (e.g., an image, or a word) and, based on it, sends suggestions to the guesser to help her make a guess that reproduces the original input.
Figure 5: Examples of GWAPs: inversion problem (left), state space exploration (right)
Figure 5 (left) shows an example of an inversion-problem game, Verbosity, where the describer gives clues about a secret word to the guesser, with the aim of producing common sense descriptions of concepts. A different example of an inversion-problem game is Peekaboom [96], used to help detect objects in images. Here the guesser (Peek) starts out with a blank screen, while the describer (Boom) receives an image and a word related to it. The goal of the game is for the describer to progressively reveal parts of the image to the guesser, so that the latter can guess the associated word. In doing so, the game extracts a bounding box where the object associated with the image is most probably located. GWAPs have been applied not only to the simple tasks illustrated above, but also to the massively cooperative resolution of highly complex problems, which are too expensive to be addressed with state-of-the-art computer architectures and where human wisdom is used to quickly search and reduce the space of possible solutions. An exemplary case is the Foldit game [21], where crowds of online users compete and cooperate in addressing one of the hardest computational problems in biology: protein folding, that is, the prediction of biologically relevant low-energy conformations of proteins. The game has the form of a 3D puzzle, where players are given an initial structure and a set of manipulation operators that they can use to produce a folding that respects a number of physical and chemical constraints. Players can save partially folded structures, which may be taken up by other players, making the resolution effort cooperative. Several structure prediction problems have been solved
successfully by Foldit players, often outperforming automated predictor tools. As an example, in 2012, Foldit players modelled in just three weeks the structure of an enzyme that had eluded researchers for a decade, which will afford new insights for the design of antiretroviral drugs [30].
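The output-agreement mechanics described above can be sketched in a few lines: two players independently label the same input, and a label is accepted as an annotation only when both players produce it. The data is hypothetical, and scoring, taboo words and timing are ignored.

```python
def output_agreement(labels_player1, labels_player2):
    """Return the agreed label for a shared input, or None.

    In an output-agreement game (e.g. the ESP game) a label is accepted
    as a human-validated annotation only when both players produce it
    independently for the same image or tune.
    """
    proposed_1, proposed_2 = set(), set()
    for l1, l2 in zip(labels_player1, labels_player2):  # labels in the order they were typed
        proposed_1.add(l1.lower())
        proposed_2.add(l2.lower())
        agreed = proposed_1 & proposed_2
        if agreed:
            return agreed.pop()
    return None

print(output_agreement(["sky", "beach", "sea"], ["water", "sea", "sand"]))  # -> "sea"
```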
2.2 Crowdsourcing
Crowdsourcing, a term coined by Jeff Howe in a Wired magazine article about the rise of new forms of online labour organizations [41], is the outsourcing of work, traditionally performed by employees, to an open community of online workers. A first form of crowdsourcing is found in vertical markets, where a community of professional or amateur contributors is called in to supply specific products or services. An example is iStockPhoto (http://istockphoto.com), a content marketplace where photos made both by professionals and amateurs are sold at affordable prices. Other examples are found in graphic design, e.g., the 99designs design contest web site (http://99designs.com) where customers can post requirements for web page or logo creation and designers compete by submitting proposals; in fashion design, e.g., the Threadless community and marketplace for T-shirt and garment design (http://www.threadless.com/); and even in highly technical and specialized services, like collaborative innovation, e.g., Innocentive (http://www.innocentive.com/), where technical challenges for product innovation are addressed to a community of problem solvers. A different form of crowdsourcing is provided by horizontal platforms, which broker the execution of microtasks in different domains, like speech and handwriting transcription, data collection and verification. A typical microtask brokerage platform has a Web interface that can be used by two kinds of people: work providers can enter into the system the specification of a piece of work they need (e.g., collecting addresses of businesses, classifying products by category, geo-referencing location names, etc.); work performers can enrol, declare their skills, and take up and perform a piece of work. The application manages the work life cycle: performer assignment, time and price negotiation, result submission and verification, and payment. In some cases, the application is also able to split complex tasks into micro-tasks that can be assigned independently [43], e.g., breaking a complex form into sub-forms that can be filled in by different workers. In addition to the web interface, some platforms offer Application Programming Interfaces (APIs), whereby third parties can integrate the distributed work management functionality into their custom applications. Examples of horizontal microtask crowdsourcing markets are Amazon Mechanical Turk and Microtask.com. Figure 6 shows the worker interface of Amazon Mechanical Turk. Tasks, called Human Intelligence Tasks (HITs), are packaged in groups offered as a bundle by the same requester, and groups are displayed in order of number of HITs. HITs have a descriptive title, an expiration date, a time slot for completing them, and the amount paid per solved HIT. A survey on the demographics of Amazon Mechanical Turk conducted in 2010 by Panos Ipeirotis [45] revealed that the population of workers is mainly located in the United States (46.80%), followed by India (34.00%) and the rest of the world (19.20%), with a higher percentage of young workers; most workers spend a day or less per week on Mechanical Turk, and complete 20-100 HITs per week, which generates a weekly income of less than 20 USD.
Figure 6: worker’s interface in Mechanical Turk
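To illustrate how such a platform exposes tasks programmatically, the sketch below models a microtask in the spirit of a HIT; the class and field names are hypothetical and do not reproduce the Mechanical Turk API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class MicroTask:
    title: str
    instructions: str
    reward_usd: float          # amount paid per solved task
    time_allotted: timedelta   # time slot for completing the task
    expires_at: datetime       # after this date the task is withdrawn
    redundancy: int = 3        # how many independent workers receive it
    answers: list = field(default_factory=list)

    def is_open(self):
        return datetime.utcnow() < self.expires_at and len(self.answers) < self.redundancy

task = MicroTask(
    title="Classify product category",
    instructions="Pick the best category for the product shown.",
    reward_usd=0.05,
    time_allotted=timedelta(minutes=5),
    expires_at=datetime.utcnow() + timedelta(days=7),
)
print(task.is_open())   # True until enough answers are collected or the task expires
```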
2.3 Human Computation Challenges
The design of Human Computation systems must take into consideration a number of issues that arise when humans are included in the loop [46].
Human factors in computing and decision making: the human cognitive processes interplay with affective states and may exhibit systematic and predictable bias, which, if overlooked, may threaten the validity of human-assigned tasks.
Human factors in decision making have been studied separately in neuroscience, behavioral economics and cognitive psychology; recently, the interdisciplinary field of neuroeconomics has addressed the integration of multiple techniques to examine decision-making by considering cognitive and neural constraints, used in psychology and neuroscience, in the construction of mathematical decision models, typical of economics [81]. Anchoring and sequential effects (the dependency of responses on prior information) have been studied (e.g., the ‘anchoring bias’ studied by [89] and well known in the environmental decision-making literature) and mitigation measures have been proposed that could be applied to crowdsourcing and online judgement elicitation [68].
Quality of human output and malicious behaviour: a major issue in human computation design is measuring and improving the quality of human contributions, which requires estimating the quality of the task performer and combining different (and potentially conflicting) results produced by different performers. The simplest technique for quality improvement is, when possible and sensible, submitting the same task to multiple performers and taking the majority response as the true outcome, thus assimilating the quality measure to the probability that a response is correct [55]; a minimal majority-vote sketch is given at the end of this list. Refined approaches rely on the estimation of the user’s quality and combine the responses taking such estimation into account. As an extreme case, performers may intentionally cheat, producing incorrect results, which requires suitable techniques for detecting spammers [47], or they may simply be negligent, an attitude that can be detected by pre-task screening [25].
Market Design: some human computation applications, e.g., task crowdsourcing for data validation, can be regarded as a work market, where work sellers and buyers meet [46]. As in any market, efficiency heavily depends on the reciprocal trust between buyers and sellers, and can be hampered by opportunistic or fraudulent behaviours. Trust and reputation systems used in e-commerce and recommender systems [49] apply to human computation too, but need to be adapted to the specificity of human computation environments.
Ethical and legal aspects: human computation applications question the ethical and legal framework at the base of the human condition [103]. Many applications of human computation (e.g., expert finding and task-to-worker matching) require processing user’s data, e.g., profiles and friendship links extracted from social networks. These requirements may collide with the user’s expectation about data protection and privacy. Furthermore, the disconnection of workers from employers implied by crowdsourcing is also changing workplace relationships, with such potential consequences as worker’s alienation, incapacity of judgement about the moral valence of tasks, and over-monitoring by the commissioner.
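A minimal sketch of the redundancy-plus-majority strategy mentioned above (plain Python, with hypothetical worker answers):

```python
from collections import Counter

def majority_answer(answers, min_ratio=0.5):
    """Aggregate redundant answers for one task by majority vote.

    Returns (winning_answer, agreement_ratio); the ratio can serve as a
    rough proxy for the probability that the answer is correct.
    """
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    ratio = votes / len(answers)
    return (answer, ratio) if ratio >= min_ratio else (None, ratio)

print(majority_answer(["cat", "cat", "dog", "cat", "cat"]))   # -> ('cat', 0.8)
```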
2.4 Platforms and Components for Human Computation
As discussed in the previous sections, human computation assumes a variety of forms, which range from paid work in an online labour market to the management of early alerts using mobile social networks. Despite such a variety of manifestations, a common framework can be identified and used to better understand the challenges in the design, usage, and evaluation of a human computation solution. Figure 7 conceptualizes the actors, information, and steps that are involved in human computation.
Figure 7: General architecture of a human computation platform
Making a parallel with traditional machine-based computation, a human computation framework takes as input a problem-solving scenario, formalized as a program executable cooperatively by both humans and machines. Such a program can be expressed as a workflow of elementary instructions (tasks), which leads to the accomplishment of the desired goal. These multiple tasks must be executed respecting precedence constraints and input-output dependencies; some of them are executed by machines (Machine Tasks) and some by a community of performers (Crowd Tasks). The community of human performers may not be completely determined a priori, and the task assignment rules are dynamic, based on the task specifications and on the characteristics of the potential members of the crowd. The human computation framework produces as output some results for the problem-solving goal, which may be non-deterministic, because machine tasks are solved probabilistically (as, for example, in image similarity matching) and/or because the human performers make errors or even cheat. The different applications of the general Human Computation framework illustrated above can be classified according to the dimensions of their design, which we summarize below:
Humans involved: to attain its problem-solving goal, an application may require only one individual at a time (e.g., the reCaptcha system for OCR correction [97]), a small group of users (e.g., a multi-player GWAP) or an open community (e.g., people
mobilisation [74] and human sensing applications [26]).
Human ability involved: applications can exploit different human abilities: perception and/or emotion, like in content appraisal and user interface usability assessment, where human feedback is collected with techniques like physiological signal analysis and eye movement tracking to estimate the impact of content or of interface design [52]; judgment, when the discriminating power of users is exploited, like in user studies, where people are requested to judge alternative solutions [38]; social behaviour when cooperation and inter-personal communication are prominent, like people mobilisation [74] and social business process management, where communities of external stakeholders are involved in enterprise business processes [16].
Task Interface Style: the interface of a Crowd Task can be based on different interaction styles; a game engages individuals or small groups in a challenge, where the resolution of the problem is a collateral result [20]; instead, a task is an explicit unit of work, with well-defined input/output and explicitly submitted to a recipient, like in crowdsourcing markets and query & answer platforms; an intermediate style is the implicit task, in which a user performs a useful activity without necessarily knowing (like, e.g., in reCaptcha-enabled login forms [97]).
Control: task definition, assignment and progress can be controlled centrally, as is the case in work automation platforms like Amazon Mechanical Turk; alternatively, task activation may be managed in a distributed way, as in the DARPA Network Challenge strategy [74], where a recursive incentive mechanism was used, which encouraged people to recruit effective workers so as to improve their expected success rate and reward.
Motivation mechanism: users may engage in executing tasks for different reasons: purely for fun, like in GWAPs; for ethical reasons, like in volunteer work; or for an economical reward. Recent studies have examined the correlation between incentives, task definition policies and the resulting quality of task execution [43].
Time: human computation applications may have critical time requirements that make the execution of tasks and the propagation of actions or decisions along social links important, like in the DARPA Network Challenge [74]. The design of time-critical human computation applications can be addressed at the task and at the communication level. In the former case, the amount of time allotted for the execution of the task is set explicitly, so that potential performers can better self-evaluate their adequacy as problem solvers and not claim a task that they will not be able to perform at the requested speed (this solution is adopted by most crowdsourcing applications). In the latter case, a suitable incentive mechanism is defined that favours the rapid spread of information among potential performers. As an example of this latter approach, the winners of the DARPA Network Challenge adopted a reward mechanism by which the user gained a prize not only if he spotted one balloon personally, but also if a person recruited by him spotted one or in turn recruited a third person that directly or transitively contributed to the balloon identification. Such an incentive fostered the rapid creation of geographically distributed teams, because each user had a quantifiable advantage at selecting from his acquaintances the most active and geographically well-positioned members to add to the team. Table 3 shows several examples of Human Computation applications and correlates them with the most relevant dimensions in their design.
Table 3: Design dimensions in human computation applications
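As an illustration of the incentive dimension discussed above, the following sketch computes geometrically decreasing rewards along a recruitment chain, in the style of the scheme used by the winning team of the DARPA Network Challenge; the figures and names are illustrative, not the exact prize breakdown.

```python
def recursive_rewards(chain, finder_reward=2000.0):
    """Split a reward along a recruitment chain.

    `chain` lists people from the balloon finder up to the root recruiter;
    each level up the chain receives half of the reward of the level below,
    which makes recruiting active, well-positioned team members profitable.
    """
    rewards, amount = {}, finder_reward
    for person in chain:
        rewards[person] = amount
        amount /= 2
    return rewards

print(recursive_rewards(["Alice (spotted balloon)", "Bob (recruited Alice)", "Carol (recruited Bob)"]))
# -> {'Alice (spotted balloon)': 2000.0, 'Bob (recruited Alice)': 1000.0, 'Carol (recruited Bob)': 500.0}
```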
3. General Purpose Search Systems and Process Engines
This section overviews the state of the art in the more consolidated market of search and business process execution platforms for enterprises. A recent study of the enterprise search market [Forrester 2011] [107] still postulates that “What Search Buyers Need Is Not What Search Vendors Sell”. This statement has more than one facet. Sometimes users just want enterprise search results, but get sophisticated dashboards instead. Other cases require tailored and context-driven information supply, i.e., they require search-driven business applications. Thus, the search engine market has different groups of vendors serving different user needs with a broad variety of products and architectures. The following is a brief categorization as provided by [Forrester 2011] [107]:
Specialized search vendors position search as a strategic technology to solve specific problems. Examples in this category are Attivio, Coveo, Endeca, Exalead, Sinequa, and Vivisimo. Empolis' IAS product fits best into this category as well, although it is not marketed as a search engine.
Integrated search vendors sell separate search products, but have integrated them with other information management technology. Their aim is to deeply penetrate the workplace with all their technologies. Examples are Autonomy, Microsoft, and IBM.
Detached vendors have a slightly different positioning: Google differs because of its simple deployment strategy, ISYS offers its search componentized to resellers and OEMs, and Fabasoft focuses solely on the public sector.
Differentiation between search products on the level of functional features has become difficult, i.e. with increasing maturity the products become more feature-complete and similar to each other. Even the most pessimistic market forecasts for enterprise search foresee annual growth rates of 10% [JRC 2010] [114].
3.1 Framework Comparison

For each system there is a brief verbal description of the framework and its basic architecture. The following characteristics are then used to evaluate the frameworks:
Criteria to evaluate the business model, i.e. which institutional interests stand behind the framework and whether there is potential for cooperation and for the integration of existing systems into a new framework.
Characteristics to describe the software architecture and the supported technical standards, in order to determine the sustainability and future-proofness of a framework. Here we check whether a framework supports the technical standards expected of integration platforms, so that functions or components from other systems can be integrated, and thereby assess its ability to integrate with other systems.
Characteristics to describe the functions provided for the actual search, administration, monitoring and data import, as well as the options for scaling and performance, in order to determine the functionality of the framework. All characteristics are rated on a three-point scale: 0 for poor, 1 for satisfactory and 2 for good.
Business model related features
1) Partners, collaborations: With this feature, important partnerships and partners are captured. The Apache Software Foundation and the Eclipse Foundation attach great importance to the use of other project results and therefore foster cooperation with other projects, so that these always have "powerful" partners. Rating: 0 = no cooperation; 1 = few collaborations; 2 = many collaborations, or at least one "powerful" cooperation.
2) Market impact: With this feature, the market impact is captured. Rating: 0 = no impact, insignificant; 1 = market influence, but not a leader; 2 = (co-)leader in sectors that are not in the focus of this study.
3) Marketing activities: With this feature, the marketing activities are recorded in order to identify potential new leaders. Rating: 0 = no marketing activities; 1 = few marketing activities; 2 = high marketing activities.
4) Licensing model: The license model is described, without rating. [142]
5) Number of developers: If information can be found on the websites of the provider, it is briefly described. Rating: 0 = 1-9 developers; 1 = 10-20 developers; 2 = more than 20 developers.
6) Maturity, proliferation of the framework: If information can be found on the websites of the provider, it is briefly described. Rating: 0 = no proliferation and maturity, beta version; 1 = low, or not installed in commercial failsafe applications, last release published over a year ago; 2 = high, or installed in commercial failsafe applications.
7) Community: With this feature, the activity of the community is described, without rating.
Architecture and standards related features
8) Summary of software support for standards: The standards listed below are checked for whether or not they are supported by the system. Rating: 0 = no standards; 1 = one of the following standards is supported: Web services standards (WSDL, SOAP), SCA, OSGi, BPEL, Java plug-in technology; 2 = either all of the following standards are supported: Web services standards (WSDL, SOAP), SCA, BPEL, OSGi, Java plug-in technology, or there is conformity to grid standards [120].
Standards in architecture
SOA: Although there is no standard for a service-oriented architecture, it should nevertheless be noted whether the functions are implemented as services and can be flexibly combined.
SCA: Service Component Architecture [http://www.osoa.org/display/Main/Service+Component+Architecture+Specifications]
SDO: Service Data Objects [http://www.osoa.org/display/Main/Service+Data+Objects+Specifications]
JBI: Java Business Integration
J2EE: Java Platform, Enterprise Edition [http://java.sun.com/javaee/technologies/javaee5.jsp]
Java SE: Java Platform, Standard Edition [http://java.sun.com/javase/]
.NET: [http://msdn.microsoft.com/de-de/netframework/default.aspx]
J2C: Java EE Connector Architecture (JCA) [http://www.jcp.org/en/jsr/detail?id=112]
JMX: Java Management Extensions [http://java.sun.com/javase/6/docs/technotes/guides/jmx/index.html]
OSGi: OSGi Alliance (formerly "Open Services Gateway initiative") [http://www.osgi.org/Specifications/HomePage]
Standards for the integration of components and infrastructure
WSDL: Web Services Description Language [http://www.w3.org/TR/wsdl]
UDDI: Universal Description, Discovery and Integration [http://www.oasis-open.org/committees/uddi-spec/doc/tcspecs.htm]
BPEL: Business Process Execution Language [http://docs.oasis-open.org/wsbpel/2.0/wsbpel-v2.0.pdf]
Standards for external data sources
JDBC: Java Database Connectivity [http://java.sun.com/javase/6/docs/technotes/guides/jdbc/]
LDAP: Lightweight Directory Access Protocol [http://tools.ietf.org/html/rfc4510]
Standards for security
SAML: Security Assertion Markup Language [http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security]
WS-Security: Web Services Security [http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wss]
XKMS: XML Key Management Specification [http://www.w3.org/TR/xkms2/]
JAAS: Java SE Security [http://java.sun.com/javase/technologies/security/]
XML Sig: XML Signature [http://www.w3.org/TR/2001/PR-xmldsig-core-20010820/]
XML Encryption: XML Encryption [http://www.w3.org/TR/xmlenc-core/]
Standards for messaging
SOAP: Simple Object Access Protocol [http://www.w3.org/TR/soap/]
HTTP: Hypertext Transfer Protocol [http://tools.ietf.org/html/rfc2616]
JMS: Java Message Service [http://www.jcp.org/en/jsr/detail?id=914]
SSL / TLS: Secure Sockets Layer (SSL), Transport Layer Security (TLS) [http://web.archive.org/web/20080208141212/http://wp.netscape.com/eng/ssl3/]
Other standards
JAXB: Java Architecture for XML Binding [https://jaxb.dev.java.net/]
9) Integration of the provider's own components: Description (where possible) of how the provider's own components are integrated. Rating: 0 = no extension possible; 1 = proprietary integration of components or Java plug-ins; 2 = integration with service-oriented standards such as Web services, SCA and BPEL.
10) Integration of foreign components: Description (where possible) of how external components can be integrated and how the framework can be integrated into other applications. Rating: 0 = no extension possible; 1 = proprietary integration of third-party components or vendor Java plug-ins; 2 = integration of third-party components with service-oriented standards such as Web services, SCA and BPEL.
Functional features
11) Functions for information retrieval: Where possible, the functions offered are cited. Rating: 0 = no function; 1 = few functions: search with minimal indexing, search and ranking; 2 = search capabilities with indexing, search and ranking and a concept-oriented knowledge model.
12) Functions for administration: Where possible, the functions offered are cited. Rating: 0 = no function; 1 = few functions; 2 = many functions.
13) Functions for monitoring: Where possible, the functions offered are cited. Rating: 0 = no function; 1 = few functions; 2 = many functions.
14) Functions for data import: Where possible, the functions offered are cited. Rating: 0 = no function; 1 = few functions, for at least two of the main data formats (HTML, PDF, TXT, JDBC, RTF, XML); 2 = many functions, for at least five of the main data formats (HTML, PDF, TXT, JDBC, RTF, XML).
15) Scaling functions: Where possible, information found on scalability is cited, without rating.
16) Performance features: Where possible, information found about performance is cited. Rating: 0 = low performance; 1 = average performance, or high performance not yet proven in practice; 2 = high performance, proven in practice or in benchmarks.
3.2 Assessment of General Integration Frameworks
In the following, the detailed system assessments are given. The findings are based on information provided by the respective vendors, on publications in professional journals and, in part, on commercial market studies. If no information was found for a feature, this is indicated by "no information found". Unclear information and assumptions are indicated as follows:
"*": feature planned by the vendor
"(…)": assumption based on common knowledge about the vendor and product
3.2.1 SOPERA, Swordfish

General Information
Provider: SOPERA GmbH, Eclipse Foundation
Branch: Germany, global activities
Name of the framework: "Sopera Advanced Services Framework" (SOPERA ASF); Swordfish SOA runtime framework of the Eclipse Foundation
URL: http://sopera.de / http://www.eclipse.org/swordfish/
Product Description
SOPERA, or the Swordfish platform, is an extensible framework for service-oriented architecture (SOA). As open source it offers an Enterprise Service Bus (ESB) in the strict sense and basic messaging and integration capabilities. The "Sopera Advanced Services Framework" (SOPERA ASF) is a complete framework for creating a full SOA setup. The scope of this framework is illustrated in the following figure and includes the following components: SOPERA Service Editor, SOPERA Policy Editor, SOPERA Intelligent Deployment and Administration.
Figure 8: SOPERA Advanced Services Framework
SOPERA was developed by Deutsche Post over five years and released as open source in 2007. Maintenance, support, training and implementation services are provided by SOPERA GmbH. At the Eclipse Foundation, the framework is still in the incubation phase.
Architecture
The architecture of the framework is service-oriented and addresses the following challenges:
- A uniform interface (API) between the technical components of an SOA runtime platform is intended to support the exchange of components.
- The framework supports interoperability between SOA environments through the implementation of adapters to existing platforms.
- The framework is intended to support not only enterprise environments but also resource-restricted embedded and mobile devices through the use of lightweight components.
At the core of this framework is the Distributed Service Bus, which represents an Enterprise Service Bus. Furthermore, the framework provides a mechanism for the run-time deployment of components. The following figure shows the component architecture of Swordfish. The framework is implemented in Java and can be extended with additional components.
Figure 9: Swordfish Architecture
The broad support of the major architectural standards, especially the Service Component Architecture (SCA), Java Business Integration (JBI) and OSGi, is a key feature of this framework.
Assessment
Business Model
1) Partners, collaborations: Powerful partners (e.g. Eclipse, Microsoft, Novell, Oracle) are mentioned, but their role is not detailed; only the smaller but active partners are taken into account in the assessment. Rating: 1
2) Market impact: In recent years SOPERA has seen take-up, in particular in the finance and insurance industry. SOPERA was recently taken over by Talend. Rating: 1
3) Marketing activities: Only the commercial supplier SOPERA does active marketing. Since the release as open source in 2007, Swordfish has been widely reported on in scientific journals and its visibility has improved. Rating: 1
4) License model: Since the framework is in the incubation phase at the Eclipse Foundation, it can be used under the Eclipse open source license. The web pages of SOPERA offer open source as well as commercial extensions.
5) Number of developers: Approx. 6 core developers. Rating: 1
6) Maturity, distribution of the framework: The latest release (3.3.1) was in 2010. Rating: 0
7) Community: Approx. 100 comments in the official newsgroup.
Architecture and Standards
8) Summary of supported standards: The following account of supported standards refers both to the open source version and to the commercial version, as splitting the information is not possible. In summary, integration using Web services, BPEL and SCA is possible. Rating: 2
Standards in architecture: SCA: Yes, planned; SOA: Yes; SDO: No information found; JBI: Yes, 1.0, 2.0 planned; J2EE: Yes, 1.4, 5; Java SE: Yes, 5, 6 planned; .NET: Yes, 3.0; J2C: Yes, 1.5; JMX: Yes, 1.2; OSGi: Yes, planned
Standards for the integration of components and infrastructure: WSDL: Yes, 1.1; UDDI: Yes, 3.0.2 scheduled; BPEL: Yes, 2.0
Standards for external data sources: JDBC: Yes, 2.0; LDAP: Yes (v3)
Standards for security: SAML: Yes, 1.1; WS-Security: Yes, 1.0; XKMS: Yes, 2.0; JAAS: Yes; XML Sig: Yes; XML Encryption: Yes
Standards for messaging: SOAP: Yes, 1.1; HTTP: Yes, 1.1; JMS: Yes, 1.1; SSL / TLS: Yes, 3.0
Other standards: JAXB: Yes, 2.0
9) Integration of the provider's own components: Integration using Web services and BPEL is possible. Rating: 2
10) Integration of other components: Integration using Web services and BPEL is possible. Rating: 2
Functional Features
11) Functions for information retrieval: None. Rating: 0
12) Functions for administration: Tools from different vendors can be integrated; SOPERA offers a tool for administration and monitoring. There is no information about the scope of this feature, so it is assumed that at least some functions are available. Rating: 1
13) Functions for monitoring: See functions for administration. Rating: 1
14) Functions for data import: No information found. Rating: 0
15) Scaling functions: No information found.
16) Performance features: No information found. Rating: (1)
3.2.2 SAP Enterprise SOA, NetWeaver

General Information
Provider: SAP AG
Branch: Germany, global activities
Name of the framework: Enterprise service-oriented architecture (enterprise SOA), SAP NetWeaver
URL: http://www.sap.com/platform/esoa/index.epx
Product Description
SAP NetWeaver is the open integration and application platform of SAP AG. NetWeaver is based on the concept of the enterprise service architecture and addresses the challenges related to the development and optimization of flexible IT environments. The SAP NetWeaver platform is the basis for all solutions from SAP AG and is the successor of SAP R/3. The complex and monolithic R/3 system is divided into many small building blocks, which are assembled to support any demanded business process. The integration system provides web services as the main transport layer between components; therefore, old and non-SAP applications are easily integrated, and an incremental shift to a modern SOA architecture is possible. In addition to NetWeaver, an integrated development environment, the SAP Composite Application Framework (SAP CAF), is available.
Architecture
Since release 7.1, NetWeaver offers an enterprise service architecture that implements an enterprise service bus. There is a strong Web service focus. The following figure indicates the architecture layers and supported standards.
Figure 10: Netweaver Architecture
Assessment
Business Model
1) Partners, collaborations: SAP AG has many cooperation partners and, above all, a large installation base. Rating: 2
2) Market impact: SAP is the market leader in business software solutions. According to Gartner [Gartner-SAP 2007] [113], all SAP R/3 customers will use SAP Enterprise SOA in the foreseeable future, because the benefits are obvious and no other manufacturer can offer an SOA with third-party integration for SAP R/3. Rating: 2
3) Marketing activities: Strong marketing. Rating: 2
4) License model: Commercial.
5) Number of developers: No information found; it can, however, be assumed that SAP has a high number of developers. Rating: 2
6) Maturity, distribution of the framework: SAP has achieved a high level of maturity and the market is continuously supplied with new releases. Rating: 2
7) Community: Yes.
Architecture and Standards
8) Summary of software engineering standards: SAP has contributed to many international standardization efforts for technologies applied in the field of service-oriented architectures. Rating: (2)
Standards in architecture: SOA: Yes; SCA: Yes; SDO: No information found; JBI: No information found; J2EE: Yes; Java SE: Yes; .NET: No information found; J2C: No information found; JMX: No information found; OSGi: No
Standards for the integration of components and infrastructure: WSDL: Yes; UDDI: Yes; BPEL: Yes
Standards for external data sources: JDBC: (Yes); LDAP: (Yes)
Standards for security: SAML: Yes; WS-Security: Yes; XKMS: No information found; JAAS: Yes; XML Sig: Yes; XML Encryption: Yes
Standards for messaging: SOAP: Yes; HTTP: Yes; JMS: No information found; SSL / TLS: (Yes)
Other standards: JAXB: No information found
9) Integration of the provider's own components: Yes, with service-oriented standards (BPEL, Web services); however, no detailed information was found. Rating: (2)
10) Integration of other components: Yes, with service-oriented standards (BPEL, Web services). Rating: (2)
Functional Features
11) Functions for information retrieval: SAP NetWeaver has an integrated, feature-rich search engine. Rating: 1
12) Functions for administration: There is potential for improvement; often third-party products are integrated. Rating: 1
13) Functions for monitoring: There is potential for improvement; often third-party products are integrated. Rating: 1
14) Functions for data import: No explicit information. However, SAP R/3 provides import functions for many formats. Rating: 2
15) Scaling functions: SAP has the expertise to deal with huge amounts of data.
16) Performance features: No information found. It is expected that performance is at a level suitable for industrial use. Rating: (2)
3.2.3 Oracle Fusion Middleware

General Information
Provider: Oracle
Branch: USA, Germany, global activities
Name of the framework: Oracle Fusion Middleware
URL: http://www.oracle.com/lang/de/products/middleware/index.html
Product Description
Oracle Fusion Middleware is a portfolio of middleware products that supports all kinds of business processes. The portfolio ranges from portals and process management, application infrastructure (e.g. an enterprise service bus) and developer applications to business intelligence (see figure below); an example is the Oracle Application Server. Under the name Oracle Fusion Middleware, these modules have been combined and delivered since mid-2008. Of particular note (according to Oracle) is that this middleware is the world's first unified middleware for grid computing, SOA and event-driven architecture (see figure below).
Figure 11: Oracle Fusion Middleware
Furthermore, Oracle Fusion Middleware not only supports the integration of its own components, but positions itself as a cross-platform solution for the integration of different (even competing) platforms such as BEA, IBM or Microsoft.
Architecture
Oracle supports new-generation middleware, i.e. a service infrastructure based on the Service Component Architecture (SCA) standard, in whose development Oracle was involved. The service infrastructure contains all components for the construction, deployment and management of SOAs. The following figure shows both the architecture of the Oracle middleware and the corresponding supported standards.
Figure 12: Architecture and supported standards of Oracle Fusion Middleware
The core of the implemented Java architecture is the Fusion service bus, which is an implementation of an Enterprise Service Bus. The Oracle SOA Suite includes many features, which are shown in the following figure.
Figure 13: Oracle SOA Suite
Assessment
Business Model
1) Partners, collaborations: Oracle has many collaborations, partners and customers. Rating: 2
2) Market impact: Oracle has great influence because of its product range, in particular the Oracle databases. With WebLogic, a similar impact may be achieved in the middleware market. Rating: 2
3) Marketing activities: High marketing activity. Rating: 2
4) License model: Commercial.
5) Number of developers: No information found. Rating: 2
6) Maturity, distribution of the framework: Well-established market player. Rating: 2
7) Community: Oracle has a very active community in a number of forums, blogs and mailing lists.
Architecture and Standards
8) Summary of software engineering standards: All important standards are supported. Rating: 2
Standards in architecture: SOA: Yes; SCA: Yes; SDO: Yes; JBI: Yes; J2EE: Yes; Java SE: Yes; .NET: No information found; J2C: Yes; JMX: Yes; OSGi: No
Standards for the integration of components and infrastructure: WSDL: Yes; UDDI: Yes; BPEL: Yes
Standards for external data sources: JDBC: Yes; LDAP: Yes
Standards for security: SAML: Yes; WS-Security: Yes; XKMS: No information found; JAAS: No information found; XML Sig: Yes; XML Encryption: Yes
Standards for messaging: SOAP: Yes; HTTP: Yes; JMS: Yes; SSL / TLS: Yes
Other standards: JAXB: No information found
9) Integration of the provider's own components: As already mentioned above, Oracle positions itself as a cross-platform solution. Rating: 2
10) Integration of other components: As already mentioned above, Oracle positions itself as a cross-platform solution. Rating: 2
Functional Features
11) Functions for information retrieval: Oracle provides a search engine with conventional functionality. Rating: 1
12) Functions for administration: Oracle provides tools for the administration of the SOA Suite (see figure above). Rating: 2
13) Functions for monitoring: Yes. Rating: 2
14) Functions for data import: The Oracle Product Information Management Data Hub (PIM Data Hub) offers extensive functionality for data import. Rating: 2
15) Scaling functions: Linear scalability is offered in combination with in-memory computing.
16) Performance features: See scaling functions. Rating: 2
3.3 Assessment of Search Frameworks
3.3.1 Apache Hadoop, Nutch and Solr

General Information
Provider: Apache Software Foundation
Branch: U.S., global activities
Name of the framework: Apache Hadoop, Apache Nutch, Apache Solr
URL: http://hadoop.apache.org/, http://lucene.apache.org/nutch/, http://lucene.apache.org/solr/
Product Description
Hadoop
The Hadoop framework is capable of dealing with huge amounts of data in the range of petabytes by distributing applications over standard computers. The technology is similar to Google's Map-Reduce and the Google File System (GFS), but, unlike GFS, it is open source. Hadoop started in 2006 as part of the Apache Lucene project, which includes three components: the full-text search library Lucene, Hadoop, and, positioned above both, the Internet search engine Nutch. In this sense Hadoop is an essential component of a search engine, but it can also be used in other applications, which is a main reason for its spin-off as a top-level project of the Apache Software Foundation in 2008. Today, Hadoop is a key technology for many big data applications. The Hadoop framework is implemented entirely in Java; shell scripts are available for administration, e.g. to start and stop computer clusters. Communication takes place via TCP/IP. The framework is in constant development and many organizations are involved in the development process. The maturity level can by now be considered high, as many installations are listed on the web site: at Yahoo, Hadoop is distributed across 15,000 computers, and the IBM Blue Cloud initiative uses this framework, too.
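To illustrate the programming model that Hadoop implements, the following is a minimal word-count job written against the standard Hadoop MapReduce API. It is a sketch for illustration only; input and output paths are assumed to be passed as command-line arguments.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Minimal MapReduce job: counts word occurrences in the input documents. */
public class WordCount {

    /** Map phase: emit (word, 1) for every token of every input line. */
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    /** Reduce phase: sum the counts emitted for each word. */
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The framework distributes the map and reduce tasks over the cluster nodes and handles data placement and fault tolerance, which is exactly what makes the model attractive for petabyte-scale document collections.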
Nutch
Nutch is open source software for a complete Inter-/intranet search engine that provides crawling (automatic collection of documents), indexing of these documents and ranking with respect to search queries. It is completely implemented in Java, uses Apache Solr for indexing, integrates the Hadoop platform, and adds web-specific features such as a crawler, a link database for link analysis, and parsers for HTML, text, PDF, MS Word and RTF. The Nutch project aims particularly at the scalability and robustness of a search engine for several billion documents, and at offering a transparent alternative to commercial search engines. In June 2003, the developers successfully built a demo system with about 100 million indexed documents; due to its resource requirements, it had to be taken off the network [see Fog 2005]. The project is still alive: about 10 developers continue to work on it, and the latest release (2.1) was published in 2012.
Solr
Solr is also open source software, for an enterprise search server based on Apache Lucene; unlike Nutch it does not include a crawler and does not use the Hadoop framework. Solr includes advanced search features with high-level APIs that allow developers to build their own applications on the Apache web technology very flexibly. It offers the following functions:
- Documents can be added to, deleted from or changed in the index in XML format via HTTP. Queries are carried out in the same way; the result output is in XML/XSLT.
- Advanced search functions: hit highlighting, faceted search, caching.
- Simple administration interface.
- Easy monitoring and logging of indexing and search.
- Replication mechanism for the creation of distributed indices following a master/slave principle (Collection Distribution).
- Prefabricated class modules in various languages (Perl, PHP, Java, Ruby, etc.) for direct use of Solr functions in various programming environments (see also the C++ integration).
Both Nutch and Solr offer a Java plug-in architecture, can be extended easily with plug-ins, and can be integrated into other Java applications. The latest Solr release (4.0) was published in October 2012.
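The XML-over-HTTP interface mentioned in the first item above can be exercised with nothing more than the Java standard library. The following sketch indexes one document and then issues a query; the host, port, core name ("collection1") and field names are assumptions for illustration and depend on the concrete Solr installation.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/**
 * Minimal sketch of Solr's HTTP/XML interface: a document is added by POSTing
 * an XML <add> command to the update handler, and queries are plain HTTP GETs.
 */
public class SolrHttpExample {

    // Assumed base URL of a local Solr core; adjust for the actual installation.
    private static final String SOLR_BASE = "http://localhost:8983/solr/collection1";

    public static void main(String[] args) throws Exception {
        // Index one document and commit immediately (for demonstration purposes).
        String addXml = "<add commit=\"true\">"
                + "<doc><field name=\"id\">doc-1</field>"
                + "<field name=\"title\">Human-enhanced multimedia search</field></doc>"
                + "</add>";
        HttpURLConnection update =
                (HttpURLConnection) new URL(SOLR_BASE + "/update").openConnection();
        update.setRequestMethod("POST");
        update.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        update.setDoOutput(true);
        try (OutputStream out = update.getOutputStream()) {
            out.write(addXml.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Update response code: " + update.getResponseCode());

        // Query the index; hit highlighting is requested via the hl parameter.
        URL query = new URL(SOLR_BASE + "/select?q=title:multimedia&hl=true");
        try (Scanner in = new Scanner(query.openStream(), StandardCharsets.UTF_8.name())) {
            while (in.hasNextLine()) {
                System.out.println(in.nextLine()); // XML response by default
            }
        }
    }
}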
Architecture
See product description.
Assessment
Business Model
1) Partners, collaborations: Through the Apache Software Foundation, there are many collaborations with other Apache projects. Rating: 2
2) Market impact: The Apache web platform has an increasing market impact and is further pushed by the big data hype. Rating: 2
3) Marketing activities: No direct marketing activities; community driven. Rating: 1
4) License model: Apache License Version 2.0.
5) Number of developers: No precise information. However, there is strong community activity. Rating: (2)
6) Maturity, distribution of the framework: Mature and established technology with many applications in industry. Rating: 2
7) Community: Large community.
Architecture and standards
8) Summary of software engineering standards: In summary, there is good standards support. Although no service-oriented standards are supported, Nutch demonstrates the integration of frameworks via Java plug-ins, and the highly scalable Hadoop framework can be used in grid architectures. Rating: 1
Standards in architecture: SOA: No; SCA: No; SDO: No information found; JBI: No information found; J2EE: (Yes); Java SE: Yes; .NET: No; J2C: No information found; JMX: No information found; OSGi: No
Standards for the integration of components and infrastructure: WSDL: No; UDDI: No; BPEL: No
Standards for external data sources: JDBC: (Yes); LDAP: No information found
Standards for security: SAML: No information found; WS-Security: No information found; XKMS: No information found; JAAS: No information found; XML Sig: No information found; XML Encryption: No information found
Standards for messaging: SOAP: No; HTTP: Yes; JMS: No information found; SSL / TLS: No information found
Other standards: JAXB: No information found
9) Integration of the provider's own components: Yes, as Java plug-ins. Rating: 1
10) Integration of other components: Yes, as Java plug-ins. Rating: 1

Functional Features
11) Functions for information retrieval: Crawling and indexing are offered by Nutch. No ETL in Solr. Rating: 1
12) Functions for administration: Restricted; some shell scripts are available with Hadoop for administration. Nutch provides additional simple administration and monitoring functions. Rating: 1
13) Functions for monitoring: See administration. Rating: 1
14) Functions for data import: Few document formats can be imported. Rating: 1
15) Scaling functions: Solr does not offer true linear scalability.
16) Performance features: Performance is very much configuration-dependent and tricky to maintain. Rating: 1
3.3.2 UIMA

General Information
Provider: Apache Software Foundation
Branch: U.S., global activities
Name of the framework: Unstructured Information Management Architecture (UIMA)
URL: http://uima.apache.org/
Product Description
UIMA provides an extensible component-based architecture and a framework with a runtime environment for the analysis of unstructured information such as text, audio, video or pictures, in order to determine latent meanings, relationships and relevant facts. The software includes the UIMA Software Development Kit (SDK) and tools for the configuration of components, which are written in Java or C++; the framework itself is implemented in Java. IBM developed the framework and released it under the Apache 2.0 license, with the aim of fostering a community developing innovative methods for the automatic indexing of unstructured information. The latest SDK release is 2.4 from December 2011.
Architecture
UIMA is a component-based software architecture. The following figure shows the architecture of the UIMA framework, which consists of the following components:
- "Aggregate Analysis Engine": the components of analysis are called Analysis Engines (AEs) and include one or more so-called Annotators. Annotators are algorithms for the analysis of documents, such as sentence recognition, part-of-speech tagging or text summarization. The contents of documents and the results of the Annotators are stored in so-called Common Analysis Structure (CAS) objects. Each Annotator must declare its capabilities to the framework, that is, which types of analysis results it requires and produces, for example results for a certain predefined language. Annotators can be connected in so-called pipelines (a minimal annotator sketch is given below, after the architecture figure).
- "Collection Processing Engine" (CPE): consists of a Collection Reader, AE pipelines and CAS Consumers. The Collection Reader reads the data to be processed, either from an external data source or from other CPEs, and feeds it to the AE pipeline, which adds the appropriate CAS types. In the last step of the CPE, the resulting objects are fed to external knowledge bases or search systems, so as to enable semantic searches.
Figure 14: UIMA Architecture
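The following is a minimal sketch of a UIMA Annotator as described above: it scans the CAS document text for four-digit years and adds one annotation per match to the CAS indexes. A real component would additionally declare its capabilities and custom types in an XML descriptor; the generic Annotation type and the class name used here are assumptions chosen to keep the example self-contained.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;

/**
 * Minimal UIMA Annotator: marks every four-digit year found in the document
 * text by adding a generic Annotation to the CAS indexes.
 */
public class YearAnnotator extends JCasAnnotator_ImplBase {

    private static final Pattern YEAR = Pattern.compile("\\b(1[0-9]{3}|20[0-9]{2})\\b");

    @Override
    public void process(JCas jcas) throws AnalysisEngineProcessException {
        String text = jcas.getDocumentText();
        if (text == null) {
            return;
        }
        Matcher m = YEAR.matcher(text);
        while (m.find()) {
            // Store the analysis result in the Common Analysis Structure (CAS).
            Annotation year = new Annotation(jcas, m.start(), m.end());
            year.addToIndexes();
        }
    }
}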
Assessment
Business Model
1) Partners, collaborations: UIMA was initially developed by IBM alphaWorks and released in late 2006 as an open source project at SourceForge. Since 5 October 2006 it has continued as an Apache project. The standardization of UIMA at OASIS is under way. Components for the textual analysis of data are developed and offered by the OpenNLP and GATE communities. IBM's Content Analytics platform is based on UIMA. Rating: 2
2) Market impact: Take-up is mostly in the scientific communities, but it is gaining momentum through IBM's activities, e.g. Watson. Rating: 1
3) Marketing activities: Marketing by IBM alphaWorks always refers to the base UIMA framework. Rating: 1
4) License model: Apache License Version 2.0.
5) Number of developers: Currently 12 committers. With additional developers from the OpenNLP and GATE communities producing complementary components, the size of the community has increased. Rating: 1
6) Maturity, distribution of the framework: UIMA is utilized commercially by IBM in a number of products. Current stable version 2.4. Rating: 2
7) Community: Rather big community of users and commercial support.
Architecture and Standards
8) Summary of support for standards: The component architecture requires effort to be deployed with Web services. BPEL is not supported; in general, service-oriented standards are only partially supported. Nevertheless, a simple extension with UIMA-compliant components (Java and C++) is possible. Rating: 1
Standards in architecture: SOA: No; SCA: No; SDO: No information found; JBI: No information found; J2EE: No information found; Java SE: Yes; .NET: No; J2C: No information found; JMX: No information found; OSGi: No
Standards for the integration of components and infrastructure: WSDL: (Yes); UDDI: (Yes); BPEL: No
Standards for external data sources: JDBC: (Yes); LDAP: No
Standards for security: SAML: No information found; WS-Security: No information found; XKMS: No information found; JAAS: No information found; XML Sig: No information found; XML Encryption: No information found
Standards for messaging: SOAP: Yes; HTTP: Yes; JMS: Yes; SSL / TLS: No information found
Other standards: JAXB: No information found
9) Integration of the provider's own components: Yes, with UIMA components. Rating: 1
10) Integration of other components: Extensions are possible, but must adhere to the standards specified by UIMA; it is not an integration framework for arbitrary services. Rating: 1
Functional Features
11) Functions for information retrieval: UIMA provides only a technical framework; search functionality is not included. Rating: 0
12) Functions for administration: There are a couple of support tools that facilitate the initialization of a UIMA application, but no console-like application. Rating: 1
13) Functions for monitoring: No monitoring tools. Rating: 0
14) Functions for data import: There are import options for PDF, HTML and a couple of further formats. Rating: 1
15) Scaling functions: Scale-out has been introduced with UIMA Asynchronous Scaleout (UIMA AS), an extension to UIMA based on Apache ActiveMQ. A UIMA Analysis Engine can be wrapped as a service and hence executed in parallel processes.
16) Performance features: No information. Rating: (1)
3.3.3 SMILA

General Information
Provider: The Eclipse Foundation (protagonist), Empolis (protagonist), brox IT-Solutions GmbH
Branch: Worldwide activities
Name of the framework: SMILA (Unified Information Access Architecture)
URL: http://www.eclipse.org/smila/
Product Description
SMILA is an extensible framework for building big data and/or search solutions to access unstructured information in the enterprise. Besides providing essential infrastructure components and services, SMILA also delivers ready-to-use add-on components, like connectors to the most relevant data sources. Using the framework as their basis enables developers to concentrate on the creation of higher-value solutions, like semantically driven applications. SMILA has been designed from the ground up for scale-out, scale-up and robustness, and targets industry-scale applications. The latest release is 1.1 from July 2012.
Architecture
A SMILA system consists of two distinct parts:
- First, data has to be imported into the system and processed, to build a search index, extract an ontology, or derive whatever else can be learned from the data.
- Second, the learned information is used to answer retrieval requests from users, for example search or ontology exploration requests.
In the first process, usually some data source is crawled, or an external client pushes the data from the source into the SMILA system using the HTTP ReST API. Often, the data consists of a large number of documents (e.g. a file system, web site, or content management system). To be processed, each document is represented in SMILA by a record describing the metadata of the document (name, size, access rights, authors, keywords, ...) and the original content of the document itself. To process large amounts of data, SMILA must be able to distribute the work to be done over multiple SMILA nodes (computers). Therefore, the bulkbuilder separates the incoming data into bulks of records of a configurable size and writes them to an ObjectStore. For each of these bulks, the JobManager creates tasks for workers to process them and produce other bulks containing the result of their operation. When such a worker is available, it asks the TaskManager for tasks to be done, does the work and finally notifies the TaskManager about the result. Workflows define which workers should process a bulk in what sequence. Whenever a worker finishes a task for a bulk successfully, the JobManager can create follow-up tasks based on such a workflow definition. In case a worker fails to process a task (because the process or machine crashes, or because of a network problem), the JobManager can decide to retry the task later and thus ensure that the data is processed even under problematic conditions. The processing of the complete data set using such a workflow is called a job run. The current state of such a job run can be monitored via the HTTP ReST API, as sketched below.
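As an illustration of such monitoring, the following sketch polls the job status over HTTP from Java. The host, port, resource path and job name are assumptions for illustration only; the actual ReST resource layout is defined by the SMILA documentation.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/**
 * Sketch of a client monitoring a SMILA job run over the HTTP ReST API.
 * The endpoint path ("/smila/jobmanager/...") and the job name "indexUpdate"
 * are hypothetical and must be adapted to the concrete installation.
 */
public class SmilaJobMonitor {

    public static void main(String[] args) throws Exception {
        URL jobUrl = new URL("http://localhost:8080/smila/jobmanager/jobs/indexUpdate/");
        HttpURLConnection conn = (HttpURLConnection) jobUrl.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");

        System.out.println("HTTP status: " + conn.getResponseCode());
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON describing job runs, task counts, errors, etc.
            }
        }
    }
}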
Figure 15: SMILA Architecture
Assessment
Business Model
1) Partners, collaborations: There is close and extensive cooperation with many partners in the private sector and in research (e.g. Fraunhofer Society, DFKI). Rating: 2 *
2) Market impact: SMILA aims to establish a widely accepted standard for information access management, but the community is still small. Rating: 1
3) Marketing activities: Many marketing activities (including road shows and events) are carried out. Rating: 2 *
4) License model: Eclipse License Version 3.0.
5) Number of developers: 10 developers in the core team. Rating: 1
6) Maturity, distribution of the framework: Following the Eclipse standard maturity model, it has been a stable, production-ready framework since version 1.0. Rating: 2
7) Community: Still a small community that is currently gaining momentum due to the availability of version 1.
Architecture and Standards
8) Summary of support for standards: See architecture description. Rating: 2 *
Standards in architecture: SOA: Yes; SCA: Yes; SDO: Yes; JBI: No details; J2EE: No details; Java SE: Yes; .NET: No details; J2C: No details; JMX: No details; OSGi: Yes
Standards for the integration of components and infrastructure: WSDL: Yes; UDDI: Yes; BPEL: Yes
Standards for external data sources: JDBC: Yes; LDAP: No
Standards for security: SAML: No details; WS-Security: No details; XKMS: No details; JAAS: No details; XML Sig: No details; XML Encryption: No details
Standards for messaging: SOAP: Yes; HTTP: Yes; JMS: Yes; SSL / TLS: Yes
Other standards: JAXB: No details
9) Integration of the provider's own components: Yes, comfortably with service-oriented standards. Rating: 2 *
10) Integration of other components: Yes, comfortably with service-oriented standards. Rating: 2 *
Functional Features
11) Functions for information retrieval: Solr integration. Rating: 2 *
12) Functions for administration: At least a few tools for administration are offered. In Empolis IAS, a feature-complete console is included. Rating: (1) *
13) Functions for monitoring: Monitoring tools for the status of the complete SMILA-based system and of individual components are available. Rating: 2 *
14) Functions for data import: Self-scaling ETL framework with many connectors included. Rating: 2 *
15) Scaling functions: OSGi-based and designed for linear scalability. As the basis of Empolis IAS, the largest current installation holds 2 billion documents on a cluster with 32 quad-core nodes.
16) Performance features: Known installations are high-performance systems. Rating: 2 *
3.3.4 Summary

The results can be interpreted as follows:
It turns out that service-oriented architectures are established, as these are supported by all major systems with big market shares (Oracle, SAP).
The open source frameworks UIMA and Apache (Hadoop, Solr, and Nutch) lack service-oriented standards and hence the ability to easily integrate services implemented in different programming languages.
The scalability and performance of the systems cannot be analyzed without actual comparable experiments. Both factors depend strongly on the particular implementation of an application. Providers do not publish appropriate data. This also applies to some software engineering standards.
Among the analyzed frameworks, only SMILA was designed with search applications in mind. Only dedicated search products offer extensive functions for information retrieval. The combination of service-oriented standards and extensive functions for Information Retrieval is best met by SMILA.
The analysis of the open source offerings in general shows that they are ready and mature for solutions in enterprise-wide applications.
4. Bibliography

4.1 Multimedia Information Retrieval And Human Computation
1. 2.
3.
4. 5.
6. 7. 8. 9. 10.
11.
12.
13.
14. 15.
16.
17.
M. A. MALLET: A Machine Learning for Language Toolkit. http://mallet.cs.umass.edu/,Last accessed: September 2012. T. Abdelzaher, Y. Anokwa, P. Boda, J. Burke, D. Estrin, L. Guibas, A. Kansal, S. Madden, and J. Reich. Mobiscopes for human spaces. IEEE Pervasive Computing, 6:20–29, 2007. F. Abel, C. Hauff, G.-J. Houben, R. Stronkman, and K. Tao. Twitcident: fighting fire with information from social web streams. In WWW (Companion Volume), pages 305– 308, 2012. Adobe Inc. Adobe Premiere. http://www.adobe.com/products/premiere, last accessed: September 2012. L. Agnihotri, N. Dimitrova, J. R. Kender, and J. Zimmerman. Study on requirement specifications for personalized multimedia summarization. In Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, ICME 2003, 6-9 July 2003, Baltimore,MD, USA, pages 757–760, 2003. M. A. Anusuya and S. K. Katti. Speech recognition by machine, a review. CoRR, abs/1001.2267, 2010. Aulov and M. Halem. Assimilation of Real-time Satellite and Human Sensor Networks for modeling Natural Disasters. AGU Fall Meeting Abstracts, 2011. D. B. andW. J. File Interchange Handbook, chapter The Material Exchange Format. Elsevier Inc., Focal Press, 2004. R. A. Baeza-Yates and B. A. Ribeiro-Neto. Modern Information Retrieval the concepts and technology behind search, Second edition. Pearson Education Ltd., Harlow, England, 2011. D. E. M. M. G. O. B. S. Baldi A., Murace R. and G. L. Definition of an automated content based image retrieval (CBIR) system for the comparison of dermoscopic images of pigmented skin lesions. Biomed Eng Online., 8(18), 2009. N. Bansal and B. Srivastava. On using crowd for measuring traffic at aggregate level for emerging countries. In Proceedings of the 8th International Workshop on Information Integration on the Web: in conjunction with WWW 2011, pages 5:1–5:6, 2011. S. M. Beitzel, E. C. Jensen, and D. A. Grossman. Retrieving ocr text: A survey of current approaches. In Symposium on Document Image Understanding Technologies (SDUIT, 2003 Bozzon, M. Brambilla, and S. Ceri. Answering search queries with crowdsearcher. In WWW, pages 1009–1018, 2012 Bozzon, M. Brambilla, P. Fraternali, F. S. Nucci, S. Debald, E. Moore, W. Nejdl, M. Plu, P. Aichroth, O. Pihlajamaa, C. Laurier, S. Zagorac, G. Backfried, D.Weinland, and V. Croce. Pharos: an audiovisual search platform. In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009, Boston, MA, USA, July 19-23, 2009, page 841, 2009. Bozzon, I. Catallo, E. Ciceri, P. Fraternali, D. Martinenghi, and M. Tagliasacchi. A framework for crowdsourced multimedia processing and querying. In R. A. BaezaYates, S. Ceri, P. Fraternali, and F. Giunchiglia, editors, CrowdSearch, volume 842 of CEUR Workshop Proceedings,pages 42–47. CEUR-WS.org, 2012. M. Brambilla, P. Fraternali, and C. Vaca. BPMN and design patterns for engineering social BPM solutions. In Proceedings of the 4th International Workshop on BPM and
18.
19.
20.
21.
22.
23.
24. 25.
26.
27.
28. 29. 30.
31.
32.
33. 34.
Social Software (BPMS 2011), 2011. W. L. Buntine, J. L¨ofstr¨om, J. Perki¨o, S. Perttu, V. Poroshin, T. Silander, H. Tirri, A. J. Tuominen, and V. H. Tuulos. A scalable topic-based open source search engine. In 2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2004), 20-24 September 2004, Beijing, China, pages 228–234, 2004. J. Cacho, D. Spring, S. Hester, and R. M. Nally. Allocating surveillance effort in the management of invasive species: A spatially-explicit model. Environmental Modelling & Software, 25(4):444–454, 2011. T. Campbell, S. B. Eisenman, N. D. Lane, E. Miluzzo, R. A. Peterson, H. Lu, X. Zheng, M. Musolesi, K. Fodor, and G.-S. Ahn. The rise of people-centric sensing. IEEE Internet Computing, 12(4):12–21, 2008. K. T. Chan, I. King, and M.-C. Yuen. Mathematical modeling of social games. Computational Science and Engineering, 2009. CSE ’09. International Conference on, 4:1205–1210, 2009. S. Cooper, F. Khatib, A. Treuille, J. Barbero, J. Lee, M. Beenen, A. Leaver-Fay, D. Baker, and Z. Popovic. Predicting protein structures with a multiplayer online game. Nature, 466(7307):756–760, 2010. Cotsaces, N. Nikolaidis, and I. Pitas. Video shot detection and condensed representation. a review. Signal Processing Magazine, IEEE, 23(2):28 –37, march 2006. DCMI. Dublin Core Metadata Initiative. http://dublincore.org, Lasta accessed: Septembr 2012. Diou, C. Papachristou, P. Panagiotopoulos, A. Delopoulos, G. Stephanopoulos, N. Dimitriou, H. Rode, A. P. de Vries, T. Tsikrika, and R. Aly. Vitalas at trecvid-2008. In TRECVID 2008 workshop participants notebook papers, Gaithersburg, MD, USA, November 2008, 2008. J. S. Downs, M. B. Holbrook, S. Sheng, and L. F. Cranor. Are your participants gaming the system? Screening mechanical turk workers. In Proceedings of the 28th International Conference on Human Factors in Computing Systems, pages 2399– 2402, 2010. P. Dutta, P. M. Aoki, N. Kumar, A. Mainwaring, C. Myers, W. Willett, and A. Woodruff. Common Sense: Participatory urban sensing using a network of handheld air quality monitors. In Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems, pages 349–350, 2009. Eclipse Foundation. SMILA SeMantic Information Logistics Architecture. http://www.eclipse.org/smila, Last accessed: September 2012. Eyealike, Inc. Eyealike platform for online shopping. http://www.eyealike.com/home, last accessed: September 2012. L. S. Fiorella De Cindio, Ewa Krzatala-Jaworska. Problems&proposals: A tool for collecting citizens intelligence. In ACM, editor, Proc. of the Second International Workshop on Collective Intelligence at ACM CSCW Conference, Seattle, Washington, February 2012. Firas Khatib Frank DiMaio et al,. Crystal structure of a monomeric retroviral protease solved by protein folding game players. Nature Structural & Molecular Biology, 18(10):1175–1177, October 2011. P. Fraternali, M. Brambilla, and A. Bozzon. Model-driven design of audiovisual indexing processes for search-based applications. In Seventh International Workshop on ContentBased Multimedia Indexing, CBMI ’09, Chania, Crete, June 3-5, 2009, pages 120–125, 2009. H. Gao, G. Barbier, and R. Goolsby. Harnessing the crowdsourcing power of social media for disaster relief. IEEE Intelligent Systems, 26(3):10–14, 2011. v. O. J. Geurts J. and H. L. Requirements for practical multimedia annotation. In Proc.
35. 36. 37.
38. 39.
40.
41.
42. 43.
44.
45. 46. 47.
48.
49.
50. 51.
52.
53.
of the Int. Workshop on Multimedia and the Semantic Web, pages 4–11, Heraklion, Crete, 2005. Google Inc. Google Election Video Search. http://googleblog.blogspot.com/2008/07/intheirown-words-political-videos.html, 2009. Google Inc. Google Picasa. http://picasa.google.com, last accessed: September 2012. M. Hamilton, F. Salim, E. Cheng, and S. Choy. Transafe: a crowdsourced mobile platform for crime and safety perception management. ACM SIGCAS Computers and Society, 41(2):32– 37, 2011. Hanbury. A survey of methods for image annotation. J. Vis. Lang. Comput., 19(5):617– 627, 2008. J. Heer and M. Bostock. Crowdsourcing graphical perception: using mechanical turk to assess visualization design. In Proceedings of the 28th International Conference on Human Factors in Computing Systems, pages 203–212, New York, NY, 2010. J. Heer and M. Bostock. Crowdsourcing graphical perception: using mechanical turk to assess visualization design. In E. D. Mynatt, D. Schoner, G. Fitzpatrick, S. E. Hudson,W. K. Edwards, and T. Rodden, editors, CHI, pages 203–212. ACM, 2010. W. Heeren, L. van der Werff, R. Ordelman, A. van Hessen, and F. de Jong. Radio oranje: searching the queen’s speech(es). In W. Kraaij, A. P. de Vries, C. L. A. Clarke, N. Fuhr, and N. Kando, editors, SIGIR, page 903. ACM, 2007. J. Howe. The rise of crowdsourcing. Wired, 14(6), 2006. X. Hu, M. Stalnacke, T. B. Minde, R. Carlsson, and S. Larsson. A mobile game to collect and improve position of images. In Proceedings of the 3rd International Conference on Next Generation Mobile Applications, Services and Technologies, 2009. Huang, H. Zhang, D. C. Parkes, K. Z. Gajos, and Y. Chen. Toward automatic task design: A progress report. In Proceedings of the ACM SIGKDD workshop on Human Computation, pages 77–85, 2010. ID3.org. ID3v2. Technical report, ID3.org, Last accessed: September 2012. P. Ipeirotis. The new demographics of Mechanical Turk. March 2010. P. G. Ipeirotis and P. K. Paritosh. Managing crowdsourced human computation: a tutorial. In Proceedings of the 20th international conference companion on World wide web, pages 287–288, 2011. P. G. Ipeirotis, F. Provost, and J.Wang. Quality management on amazon mechanical turk. In Proceedings of the ACM SIGKDDWorkshop on Human Computation (KDDHCOMP 2010), pages 64–67, 2010. Japan Electronics and Information Technology Industries Association. Exchangeable image file format for digital still cameras: Exif (Version 2.2). Technical report, JEITA, 2002. Jøsang, R. Ismail, and C. Boyd. A survey of trust and reputation systems for online service provision. Decision Support Systems, 43(2):618–644, 2007. S. Kim, C. Robson, T. Zimmerman, J. Pierce, and E. M. Haber. Creek watch: pairing usefulness and usability for successful citizen science. In Proceedings of the 29th International Conference on Human Factors in Computing Systems, pages 2125– 2134, New York, NY, 2011. M. Klein. The mit deliberatorium enabling large-scale deliberation about complex systemic problems. In J. Filipe and A. L. N. Fred, editors, ICAART (1), pages 15–24. SciTePress, 2011. S. Koelstra, A. Yazdani, M. Soleymani, C. M¨uhl, J.-S. Lee, A. Nijholt, T. Pun, T. Ebrahimi, and I. Patras. Single trial classification of EEG and peripheral physiological signals for recognition of emotions induced by music videos. In Proceedings of the 2010 international conference on Brain informatics, pages 89–100, 2010.
54. 55. 56. 57. 58. 59.
60. 61. 62.
63.
64.
65. 66.
67.
68. 69.
70. 71. 72.
73.
74.
Koprinska and S. Carrato. Temporal video segmentation: A survey. Sig. Proc.: Image Comm., 16(5):477–500, 2001. L. I. Kuncheva, C. J. Whitaker, and C. A. Shipp. Limits on the majority vote accuracy in classifier fusion. Pattern Analysis & Applications, 6(1):22–31, 2003. M. S. Lew, N. Sebe, C. Djeraba, and R. Jain. Content-based multimedia information retrieval: State of the art and challenges. TOMCCAP, 2(1):1–19, 2006. Limelight Networks. Limelight Video Platform. http://www.limelightvideoplatform.com/, last accessed: September 2012. Y. Liu, P. Piyawongwisal, S. Handa, L. Yu, Y.Xu, and A. Samuel. Going beyond citizen data collection with Mapster: A Mobile+Cloud citizen science experiment. In Proceedings of the IEEE Workshop on Computing for Citizen Science (eScience 2001), 2011. Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma. A survey of content-based image retrieval with high-level semantics. Pattern Recogn., 40(1):262–282, Jan. 2007. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004. Madan, M. Cebri´an, D. Lazer, and A. Pentland. Social sensing for epidemiological behavior change. In Proceedings of the 12th ACM international conference on Ubiquitous Computing, pages 291–300, 2010. N. Maisonneuve, M. Stevens, M. E. Niessen, P. Hanappe, and L. Steels. Citizen noise pollution monitoring. In Proceedings of the 10th Annual International Conference on Digital Government Research: Social Networks: Making Connections between Citizens, Data and Government, pages 96–103, 2009. Manasseh, K. Ahern, and R. Sengupta. The connected traveler: using location and personalization on mobile devices to improve transportation. In Proceedings of the 2nd International Workshop on Location and the Web (LOCWEB09), pages 1–4, 2009. S. T. Manjunath B. S., Salembier P. Introduction to MPEG-7: Multimedia Content Description Interface. Wiley, 2002. P. Maragos P., Potamianos A. Multimodal Processing and Interaction, Audio, Video, Text Series: Multimedia Systems and Applications, volume 33. Springer Verlag, first ed. edition, July 2008. Marsden, A. Mackenzie, A. Lindsay, H. Nock, J. Coleman, and G. Kochanski. Tools for searching, annotation and analysis of speech, music, film and video a survey. LLC, 22(4):469–488, 2007. J. M. Mart´ınez. Mpeg-7 overview (version 10). ISO/IEC JTC1/SC29/WG11N6828, 2004. M. Mozer, H. Pashler, M. H. Wilder, R. A. Lindsey, M. Jones, and M. Jones. Improving human judgments by decontaminating sequential dependencies. In Advances in Neural Information Processing Systems, pages 1705–1713, 2010. C.-W. Ngo and C.-K. Chan. Video text detection and segmentation for optical character recognition. Multimedia Systems, 10:261–272, 2005. 10.1007/s00530-004-0157-0. Nielsen. What americans do online: Social media and games dominate activity. Technical report, Nielsen, August 2011. N. NJUGUNA. Are smart phones better than paper-based questionnaires for surveillance data collection? a comparative evaluation using influenza sentinel surveillance sites in kenya, 2011. In A. S. for Microbiology, editor, Proc. International Conference on Emerging Infectious Deseases, ICEID 2012, March 2012. Parameswaran, A. D. Sarma, H. Garcia-Molina, N. Polyzotis, and J. Widom. Humanassisted graph search: It’s okay to ask questions. Technical report, Stanford University, Palo Alto, CA, 2010. D. Petrovska-Delacrtaz, A. El Hannani, and G. Chollet. Text-independent speaker
75.
76.
77.
78.
79. 80. 81.
82. 83. 84. 85.
86. 87. 88.
89. 90. 91.
92. 93.
verification: State of the art and challenges. In Y. Stylianou, M. Faundez-Zanuy, and A. Esposito, editors, Progress in Nonlinear Speech Processing, volume 4391 of Lecture Notes in Computer Science, pages 135–169. Springer Berlin / Heidelberg, 2007. Pickard, I. Rahwan, W. Pan, M. Cebri´an, R. Crane, A. Madan, and A. Pentland. Time critical social mobilization: The darpa network challenge winning strategy. CoRR, abs/1008.3172, 2010. Potamitis and T. Ganchev. Generalized recognition of sound events: Approaches and applications. In G. Tsihrintzis and L. Jain, editors, Multimedia Services in Intelligent Environments,volume 120 of Studies in Computational Intelligence, pages 41–79. Springer Berlin / Heidelberg, 2008. J. Quinn and B. B. Bederson. Human computation: a survey and taxonomy of a growing field. In Proceedings of the 29th International Conference on Human Factors in Computing Systems, pages 1403–1412, 2011. R. J. Radke, S. Andra, O. Al-Kofahi, and B. Roysam. Image change detection algorithms: a systematic survey. IEEE Transactions on Image Processing, 14(3):294– 307, 2005. Z. Rasheed and M. Shah. Scene detection in hollywood movies and tv shows. In Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, volume 2, pages II – 343–8 vol.2, june 2003. T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of the 19th International Conference on World Wide Web (WWW10)), pages 851–860, 2010. G. Salton, A.Wong, and C. S. Yang. A vector space model for automatic indexing. Commun. ACM, 18(11):613–620, 1975. G. Sanfey. Social decision-making: Insights from game theory and neuroscience. Science, 318(5850):598–602, 2007. N. Scaringella, G. Zoia, and D. Mlynek. Automatic genre classification of music content: a survey. Signal Processing Magazine, IEEE, 23(2):133 –141, march 2006. S. Shah, F. Bao, C. Lu, and I. Chen. Crowdsafe: crowd sourcing of crime incidents and safe routing on mobile devices. In Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 521–524. ACM, 2011. G. M. Snoek and M. Worring. Concept-based video retrieval. Foundations and Trends in Information Retrieval, 2(4):215–322, 2009. 46 2 Human Computation and Crowd Search Stothard, J. Sousa-Figueiredo, M. Betson, E. Seto, and N. Kabatereine. Investigating the spatial micro-epidemiology of diseases within a point-prevalence sample: a field applicable method for rapid mapping of households using low-cost gps-dataloggers. Transactions of the Royal Society of Tropical Medicine and Hygiene, 2011. The Quaero Consortium. The Quaero Program. http://http://www.quaero.org, Last accessed: September 2012. Theseus. The Theseus programme. http://theseus-programm.de, Last accessed: September 2012. P. K. Turaga, R. Chellappa, V. S. Subrahmanian, and O. Udrea. Machine recognition of human activities: A survey. IEEE Trans. Circuits Syst. Video Techn., 18(11):1473– 1488, 2008. Tversky and D. Kahneman. Judgment under uncertainty: Heuristics and biases. Science, 185:1124–1131, 1974. R. Typke, F. Wiering, and R. C. Veltkamp. A survey of music information retrieval systems. In ISMIR 2005, 6th International Conference on Music Information Retrieval, London, UK, 11-15 September 2005, Proceedings, pages 153–160, 2005.
CUbRIK Technology Watch Report
Page 52
D11.1 Version 1.0
94. 95. 96. 97. 98. 99.
100.
101.
102.
103. 104.
105. 106.
University of Twente. Buchenwald demonstrator. http://vuurvink.ewi.utwente.nl, Last accessed: September 2012. von Ahn. Human Computation. PhD thesis, CMU, December 2005. CMU-CS-05-193. von Ahn. Human computation. In CIVR, 2009. von Ahn and L. Dabbish. Labeling images with a computer game. In E. DykstraErickson and M. Tscheligi, editors, CHI, pages 319–326. ACM, 2004. von Ahn and L. Dabbish. Designing games with a purpose. Commun. ACM, 51(8):58– 67, 2008. von Ahn, R. Liu, and M. Blum. Peekaboom: a game for locating objects in images. In Proceedings of the SIGCHI conference on Human Factors in computing systems, CHI ’06, pages 55–64, New York, NY, USA, 2006. ACM. von Ahn, B. Maurer, C. McMillen, D. Abraham, and M. Blum. recaptcha: Human-based character recognition via web security measures. Science, 321(5895):1465–1468, 2008. T. Yan, V. Kumar, and D. Ganesan. Crowdsearch: exploiting crowds for accurate realtime image search on mobile phones. In S. Banerjee, S. Keshav, and A.Wolman, editors, MobiSys, pages 77–90. ACM, 2010. Z. Yang, B. Li, Y. Zhu, I. King, G.-A. Levow, and H. M. Meng. Collection of user judgments on spoken dialog system with crowdsourcing. In D. Hakkani-T¨ur and M. Ostendorf, editors, SLT, pages 277–282. IEEE, 2010. Yilmaz, O. Javed, and M. Shah. Object tracking: A survey. ACM Comput. Surv., 38(4), 2006. G.-J. Yu, Y.-S. Chen, and K.-P. Shih. A content-based image retrieval system for outdoor ecology learning: A firefly watching system. In AINA (2), pages 112–115. IEEE Computer Society, 2004. W.-Y. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld. Face recognition: A literature survey. ACM Comput. Surv., 35(4):399–458, 2003. J. Zittrain. Ubiquitous human computing. Philosophical Transactions of the Royal Society, (366):3813–3821, 2008.
CUbRIK Technology Watch Report
Page 53
D11.1 Version 1.0
4.2
General Purpose Search And Process Engines
107. [Forrester 2011] Owen, Leslie: Market Overview: Enterprise Search. Published by Forrester Research, Inc.; 2011 108. [Forrester 2008] Fulton, Larry: The Forrester Wave™: SOA Service Life-Cycle Management, 2008. Herausgegeben von Forrester Research, Inc. 109. [Forrester 2006] Brown, Matthew: The Forrester Wave™: Enterprise Search Platforms, Q2 2006, Herausgegeben von Forrester Research, Inc.; 2006. 110. [Forrester 2005] Vollmer, Ken and Peyret, Henry: The Forrester Wave™: Integration Suites, Q3 2005, Herausgegeben von Forrester Research, Inc.; 2005. 111. [Dohmen 2008] Dohmen, Oliver: Java a la SAP, in javamagazin 02/2008. 112. [Gartner 2007a] Andrews, Whit: Magic Quadrant for Information Access Technology, herausgegeben von Gartner Research, 2007. 113. [Gartner-SAP 2007] Malinverno, Paolo: How to Get Value Now (and in the Future) From SAP’s Enterprise SOA, Gartner RAS Core Research, 2007. 114. [JRC 2010] Benghozi, Pierre-Jean and Chamaret, Cecile: Economic Trends in Enterprise Search Solutions, European Joint Research Center, EUR 24383 EN, 2012 115. [Gartner 2007b] Kenney, Frank: Evaluating IBM, Microsoft, Oracle and SAP Commitment to SOA Governanc, Gartner Research, 2007. 116. [Nebel 2005] Nebel, Michael: Ausgegugelt Freie Suchsoftware: Aspseek und Nutch, in IX 09/2005. 117. [Nebel 2007] Nebel, Michael: Faltplan – Arbeiten mit großen verteilten Datenmengen, in IX 03/2007. 118. [Novakovic 2008] Novakovic, Igor, Schmidt, August Georg: Presentation to SMILA Creation Review, (http://www.eclipse.org/proposals/eilf/SMILA_Creation_Review.pdf ); 2008 119. [Wang Yu 2008] Scaling your Java EE Application – Part 2; 2008. http://www.theserverside.com/tt/articles/article.tss?l=ScalingYourJavaEEApplicationsP art2.
4.3 120. 121. 122. 123. 124. 125. 126. 127. 128. 129. 130. 131. 132.
General References Open Grid Forum http://www.ggf.org/ Google Images http://images.google.com Empora Online Shop www.empora.com Facesaerch www.facesaerch.com http://www.bbc.co.uk/guidelines/smef/ International Press Telecommunications Council: http://www.iptc.org/IPTC4XMP/ Recordare: http://www.recordare.com/xml.html LSCOM Lexicon Definitions and Annotations, http://www.ee.columbia.edu/ln/dvmm/lscom/ Learning Object Metadata, http://ltsc.ieee.org/wg12/ International Press Telecommunications Council, http://www.iptc.org http://www.shazam.com/ http://www.bmat.com/ http://voxaleadnews.labs.exalead.com/?l=en
CUbRIK Technology Watch Report
Page 54
D11.1 Version 1.0
133. 134. 135. 136. 137. 138. 139. 140. 141. 142.
images.google.com http://www.bing.com/images http://www.tiltomo.com/ http://www.blinkx.com The Internet Movie Database: http://www.imdb.com The DAML Ontology Library: http://www.daml.org/ontologies Podcast, http://en.wikipedia.org/wiki/Podcasting RSS, http://en.wikipedia.org/wiki/RSS_(file_format) Media RSS, http://en.wikipedia.org/wiki/Media_RSS An overview of the most important licenses [http://www.ifross.de/ifross_html/lizenzcenter.html ]
CUbRIK Technology Watch Report
Page 55
can
be
found
at
D11.1 Version 1.0