How Do We Deep-Link? Leveraging User-Contributed Time-Links for Non-Linear Video Access Raynor Vliegendhart, Babak Loni, Martha Larson, and Alan Hanjalic Multimedia Information Retrieval Lab, Delft University of Technology
Introduction
● Problem: How do users deep-link? (i.e., refer to points in a video by explicitly mentioning time-codes)
● Motivation: Leverage time-codes within deep-link comments for enabling non-linear video access
● Dataset: MSRA-MM 2.0 / YouTube comments

Contributions
● Notion of Viewer Expressive Reaction (VER): reflects viewers' perceptions of noteworthiness (but extends beyond depicted content and induced affect)
● Viewer Expressive Reaction Variety taxonomy (VERV): captures how users deep-link; shown to be appropriate for automatic filtering

Envisioned Future Retrieval System
[Figure: mockup of the envisioned retrieval system. Search results for the query "cats at play +surprise" show videos such as "Funny cats in water" (by NekoTV), "Wrestling kittens" (by Rizzzalie), and "Stalking cat" (by lowdope), each annotated with deep-links (▼) mined from user comments, e.g. "0:44 omg so cute", "what's the breed of the cat at 2:14?", "The song at 3:33 is called 'The eye of the tiger'", "epic failure at 0:23", "2:59 That didn't go too well", "5:17 Damn!", "i liked it till 2:11 then it became boring", "1:12 omg, that's impossible!", and "O_o now I didn't expect that at all! 0:33". These deep-link comments occur unprompted on social video sharing platforms.]
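Deep-link comments can be spotted by detecting explicit time-code mentions; a minimal sketch of such a detector (the regex and function names are ours, not the authors' filtering code):

import re

# Matches time-codes such as "2:14", "0:44", or "1:02:33" inside a comment.
TIMECODE = re.compile(r'\b(?:(\d{1,2}):)?(\d{1,2}):([0-5]\d)\b')

def extract_timecodes(comment):
    """Return all time-codes mentioned in a comment, in seconds."""
    points = []
    for hours, minutes, seconds in TIMECODE.findall(comment):
        h = int(hours) if hours else 0
        points.append(h * 3600 + int(minutes) * 60 + int(seconds))
    return points

print(extract_timecodes("what's the breed of the cat at 2:14?"))  # [134]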
Approach
● Taxonomy elicitation via crowdsourcing (Amazon Mechanical Turk):
● Given: 3 deep-link comments per video (e.g., "whoa. unreal creepy eyes at 1:02")
● Task: Describe why a comment was posted (2–4 sentences)
● Post-processing: Card-sorting technique
● Annotation crowdsourcing task, for:
● Validating the elicited VERV taxonomy
● Annotating 3,359 deep-link comments:
● Whether a comment contains a true deep-link (VER/non-VER)
● VERV class (if and only if VER comment)
● Linear SVM comment classification experiment (unigram features); see the sketch below

Results
● Annotation agreement:
● VER/non-VER: 2,842 comments (84.6%)
● VERV: 2,140 comments (63.7%)
● Automatic classification: misclassification challenges:
● "Funny" comments often labeled as here by humans, but classified as love by the classifier
● Comments with multiple interpretations
● Comments with multiple sentences
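A minimal sketch of the unigram linear-SVM experiment referenced above, assuming scikit-learn; the comments and class labels below are placeholders, not the annotated dataset:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labeled deep-link comments with VERV-style classes
# (illustrative labels only; the real taxonomy and data are the poster's).
comments = ["0:44 omg so cute", "epic failure at 0:23",
            "5:17 Damn!", "The song at 3:33 is called Eye of the Tiger"]
labels = ["love", "here", "here", "info"]

# Unigram bag-of-words features feeding a linear SVM.
clf = make_pipeline(CountVectorizer(ngram_range=(1, 1)), LinearSVC())
clf.fit(comments, labels)
print(clf.predict(["that bit at 1:12 was hilarious"]))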
Future Work
● Improve automatic classification by adding content features
● Develop the envisioned deep-link retrieval system

Contact: R.Vliegendhart@tudelft.nl @ShinNoNoir
21st ACM international conference on Multimedia, Barcelona, Spain, 2013
CONTEXT-BASED PEOPLE RECOGNITION in CONSUMER PHOTO COLLECTIONS Markus Brenner, Ebroul Izquierdo MMV Research Group, School of Electronic Engineering and Computer Science Queen Mary University of London, UK {markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk
Aim
• Resolve identities of people primarily by their faces
• Perform recognition by considering all contextual information at the same time (unlike traditional approaches that usually train a classifier and then predict identities independently)
• Incorporate rich contextual cues of personal photo collections, where few individual people frequently appear together
Face Detection and Basic Recognition
• Initial steps: image preprocessing, face detection and face normalization
• Descriptor-based: Local Binary Pattern (LBP) texture histograms, computed for each block of the normalized face and concatenated
• Similarity metric: chi-square statistics (all samples are treated as independent)
• Basic face recognition: k-Nearest-Neighbor

Graph-based Recognition
• Model: pairwise Markov Network (graph nodes represent faces)
• Unary potentials: the likelihood of each face belonging to particular people, derived from face similarities
• Pairwise potentials: encourage spatial smoothness, encode an exclusivity constraint, and incorporate the temporal domain:

  ψ(w_n, w_m) = τ,              if w_n = w_m and i_n ≠ i_m
                0,              if w_n = w_m and i_n = i_m
                ψo(w_n, w_m),   otherwise

  where w_n, w_m are the person labels of faces n and m, i_n, i_m the photos they appear in, τ a smoothness constant, and ψo a co-occurrence term.
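The case analysis above maps directly to code; a minimal sketch (function and parameter names are ours; τ and the co-occurrence term ψo would come from the trained model):

def pairwise_potential(w_n, w_m, i_n, i_m, tau, psi_o):
    """Pairwise potential between two face nodes.

    w_n, w_m: person labels; i_n, i_m: photo indices;
    tau: smoothness reward; psi_o: learned co-occurrence function.
    """
    if w_n == w_m and i_n != i_m:
        return tau          # same person in different photos: encourage
    if w_n == w_m and i_n == i_m:
        return 0.0          # same person twice in one photo: forbidden
    return psi_o(w_n, w_m)  # otherwise: co-appearance likelihood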
• Topology: only the most similar faces are connected with edges
• Inference: maximum a posteriori (MAP) solution of Loopy Belief Propagation (LBP)
[Figure: graph over training (Tr) and test (Te) face nodes; every node carries a unary potential and edges are weighted by face similarity.]
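A minimal sketch of the basic recognition pipeline above (block-wise LBP histograms, chi-square distance, k-NN), assuming NumPy and scikit-image; the block count and k are illustrative:

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(face_gray, blocks=4, P=8, R=1.0):
    """Concatenate per-block LBP histograms of a grayscale face crop."""
    lbp = local_binary_pattern(face_gray, P, R, method='uniform')
    n_bins = P + 2
    h, w = lbp.shape
    hist = []
    for by in range(blocks):
        for bx in range(blocks):
            block = lbp[by*h//blocks:(by+1)*h//blocks,
                        bx*w//blocks:(bx+1)*w//blocks]
            bh, _ = np.histogram(block, bins=n_bins, range=(0, n_bins),
                                 density=True)
            hist.append(bh)
    return np.concatenate(hist)

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def knn_label(query, train_hists, train_labels, k=5):
    """Basic k-NN face recognition under the chi-square distance."""
    d = [chi_square(query, h) for h in train_hists]
    nearest = np.argsort(d)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)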
Social Semantics
• Individual appearance: for a more effective graph topology (used to regularize the number of edges)
• Unique People Constraint models exclusivity: a person cannot appear more than once in a photo
• Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again
• Groups of people: use data mining to discover frequently appearing social patterns

Body Detection and Recognition (for when faces are obscured or invisible)
• Detect upper and lower body parts
• Bipartite matching of faces and bodies
• Graph-based fusion of faces and clothing
Experiments
• Public Gallagher Dataset: ~600 photos, ~800 faces, 32 distinct people
• Our dataset: ~3,300 photos, ~5,000 faces, 106 distinct people
• All photos shot with a typical consumer camera
• Considering only correctly detected faces (87%)
[Chart: accuracy gain at 3% training data (y-axis 0–25%), growing as the graphical model, social semantics, and body parts are added in turn.]
[Figure: fused graph in which training (Tr) and test (Te) nodes are connected by face, upper-body, and lower-body similarity edges, with a unary potential on every node.]
PicAlert!: A System for Privacy-Aware Image Classification and Retrieval
Sergej Zerr, Stefan Siersdorfer, Jonathon Hare
E-mail: {zerr,siersdorfer}@L3S.de, jsh2@ecs.soton.ac.uk

A large portion of the images published in social Web 2.0 applications are of a highly sensitive nature, disclosing many details of the users' private lives. We have developed a web service which can detect private images within a user's photo stream and provide support in making privacy decisions in the sharing context. In addition, we present a privacy-oriented image search application which automatically identifies potentially sensitive images in the result set and separates them from the remaining pictures.
Acquiring the Ground Truth Using a Social Annotation Game
81 users annotated 37,535 recent images: 27,405 public, 4,701 private
Common notion of “privacy”: “Private are photos which have to do with the private sphere (like self portraits, family, friends, your home) or contain objects that you would not share with the entire world (like a private email). The rest are public. In case no decision can be made, the picture should be marked as undecidable.”
[Figure: GUI of the game.]
Feature Extraction
Colors, edges, SIFT, faces

Classifier Training
BEP (precision/recall break-even point): Visual: 0.74, Text: 0.78, Combination: 0.80
[Figure: Top-50 stemmed terms according to their Mutual Information values for "public" vs. "private" photos in Flickr.]
[Figure: Evaluation: P/R curves for the features and their combination.]
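A minimal sketch of one way to combine the two evidence types and read off the break-even point, assuming scikit-learn and placeholder feature matrices (the poster's actual classifier setup is not specified here):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

# Hypothetical pre-extracted feature matrices for 100 labeled photos:
# X_visual could hold color/edge/SIFT/face features, X_text tag/title terms.
rng = np.random.default_rng(0)
X_visual = rng.random((100, 64))
X_text = rng.random((100, 300))
y = rng.integers(0, 2, 100)          # 1 = private, 0 = public (from the game)

X = np.hstack([X_visual, X_text])    # simple early fusion of both views
clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]  # probability of "private"

# Precision/recall break-even point (BEP), as reported on the poster.
prec, rec, _ = precision_recall_curve(y, scores)
bep = prec[np.argmin(np.abs(prec - rec))]
print(f"BEP: {bep:.2f}")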
Search & Web Service
[Figure: Search results for the query "cristiano ronaldo" (06/06/12).]
[Figure: Web service GUI for privacy-oriented image classification.]
www.cubrikproject.eu
Map to Humans and Reduce Error - Crowdsourcing for Deduplication Applied to Digital Libraries
Mihai Georgescu, Dang Duc Pham, Claudiu S. Firan, Julien Gaugaz, Wolfgang Nejdl
• Find duplicate entities based on metadata
• Focus on scientific publications in the Freesearch system
Crowdsourcing:
• An automatic method and human labelers work together to improve their performance at identifying duplicate entities
• Actively learn how to deduplicate from the crowd by optimizing the parameters of the automatic method
• HIT setup: 1 HIT = 5 pairs; 5 ct per HIT; 3–5 assignments per HIT
[Example HIT: a publication pair shown side by side.
Left record: [Show Diff] [Full Text] Title: Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling; Authors: Soraya Rana, Adele E. Howe, L. Darrell, Whitley Keith Mathias; Venue: Proceedings of the Third International Conference on Artificial Intelligence Planning Systems, Menlo Park, CA; Publisher: The AAAI Press; Year: 1996; Language: English; Type: conference.
Right record: [Show Diff] Title: Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling.; Authors: Soraya B. Rana, Adele E. Howe, L. Darrell Whitley, Keith E. Mathias; Book: AIPS, pp. 174–181 [Contents]; Year: 1996; Language: English; Type: conference (inproceedings); Abstract: "The choice of search algorithm can play a vital role in the success of a scheduling application. In this paper, we investigate the contribution of search algorithms in solving a real-world warehouse scheduling problem. We compare performance of three types of scheduling algorithms: heuristic, genetic algorithms and local search."
Question: "After carefully reviewing the publication metadata presented to you, how would you classify the 2 publications referred?" Judgment for publication pair: o Duplicates o Not Duplicates]
Automatic Method
• The DuplicatesScorer produces an Automatic Duplicates Score (ADS)
• DSParams = {(fieldName, fieldWeight)} and a threshold
• Compare the ADS to the threshold => AD ∈ {0, 1}

Crowd Decision
• The aggregated decision of all workers for a pair produces a Crowd Soft Decision (CSD)
• A worker's contribution to the CSD is proportional to the confidence c_k we have in that worker
• Compare the CSD to 0.5 => CD ∈ {0, 1}

Active learning loop:
1. Initialize DSParams and the threshold; Pcand = ∅
2. Identify pairs with ADS = threshold ± ε; sample and add them to Pcand
3. Get crowd labels for Pcand
4. Compute crowd decisions and worker confidences
5. Move high-confidence pairs to Ptrain; Pcand = Pcand − Ptrain
6. Optimize DSParams and the threshold to fit the data in Ptrain, yielding better DSParams and threshold
7. Identify duplicate pairs, Pdupl; repeat from step 2
Results: accuracy (%) per duplicate detection strategy (columns) and optimization strategy (rows):

             | sign  | sign+DS/m | sign+DS/o | DS/m  | DS/o  | CD-MV
Accuracy     | 79.19 | 80.00     | 79.73     | 80.00 | 78.92 | 79.73
Sum-Err      | 76.49 | 79.46     | 79.46     | 79.46 | 79.46 | 79.19
Sum-log-err  | 71.89 | 78.11     | 78.38     | 78.92 | 80.27 | 76.76
Pearson      | 73.24 | 79.46     | 79.46     | 80.54 | 79.46 | 81.08

Crowd decision quality (P = precision, R = recall, A = accuracy) per crowd decision strategy:

  | MV (3 workers) | MV (5 workers) | Iter | Manual | Boost | Heur
P | 0.95           | 0.95           | 1.00 | 0.48   | 0.66  | 0.63
R | 0.20           | 0.20           | 0.20 | 0.67   | 0.56  | 0.97
A | 0.77           | 0.77           | 0.77 | 0.70   | 0.79  | 0.83
Worker Confidence
• Assess how reliable the individual workers are, compared to the overall performance of the crowd
• Simple measure: the proportion of a worker's pairs that carry the same label as the one assigned by the crowd
• Use an EM algorithm to iteratively compute the worker confidences: compute the CSDs, then update each c_k

Crowd Soft Decision
• Aggregation of all individual votes W_ij(k) ∈ {−1, 1} of workers k for pair (i, j):

  CSD_ij = ( Σ_k weight_ij(k) · W_ij(k) ) / ( Σ_k weight_ij(k) )

  where weight_ij(k) grows with the confidence c_k of worker k; the result is mapped to the [0, 1] range and compared to 0.5.

Crowd Decision Strategies:
• MV: Majority Voting; all users are equal, c_k = 1
• Iter: c_k computed using the EM algorithm
• Boost: c_k computed using the EM algorithm, using boosted weights in the computation of the CSD
• Heur: heuristic, 3/3 or 4/5 agreement

Duplicate Detection Strategies
• Just signatures: sign
• Just the DuplicatesScorer: DS/m, DS/o
• First compute signatures and then base the decision on the DuplicatesScorer: sign+DS/m, sign+DS/o
• Directly use the Crowd Decision obtained via Majority Voting: CD-MV

Optimization Strategies
• Compare the ADS to the CSD and optimize DSParams: minimize the sum of errors; minimize the sum of the logs of errors; or maximize the Pearson correlation
• Compare the CD to the AD and optimize the threshold to maximize Accuracy
• Compare the CD to the AD and optimize DSParams and the threshold to maximize Accuracy

Experiment Setup
• 3 batches:
  o 60 HITs with qualification test
  o 60 HITs without qualification test
  o 120 HITs without qualification test
• MTurk HITs to get labeled data, while tackling the quality issues of the crowdsourced work

Contact: Mihai Georgescu, email: georgescu@L3S.de
L3S Research Center / Leibniz Universität Hannover, Appelstrasse 4, 30167 Hannover, Germany, phone: +49 511 762-19715
dblp.kbs.uni-hannover.de · www.cubrikproject.eu
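A minimal sketch of the confidence-weighted aggregation and the iterative confidence update described above (data structures, starting confidences, and the stopping rule are our own; the poster's exact EM formulation may differ):

def crowd_soft_decision(votes, confidence):
    """votes: {worker_id: +1 (duplicate) or -1 (not duplicate)} for one pair.
    Weighted mean of votes, mapped from [-1, 1] to a CSD in [0, 1]."""
    num = sum(confidence[k] * v for k, v in votes.items())
    den = sum(confidence[k] for k in votes)  # assumes positive confidences
    return 0.5 * (1.0 + num / den)

def worker_confidences(all_votes, iters=10):
    """all_votes: {pair_id: {worker_id: vote}}. EM-style loop: threshold the
    CSDs at 0.5 to get crowd decisions (CD), then rate each worker by the
    proportion of their votes that agree with the CD."""
    confidence = {k: 1.0 for votes in all_votes.values() for k in votes}
    for _ in range(iters):
        cd = {p: 1 if crowd_soft_decision(v, confidence) >= 0.5 else -1
              for p, v in all_votes.items()}
        agree, total = {}, {}
        for p, votes in all_votes.items():
            for k, v in votes.items():
                total[k] = total.get(k, 0) + 1
                agree[k] = agree.get(k, 0) + (v == cd[p])
        confidence = {k: agree[k] / total[k] for k in total}
    return confidence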
Swarming to Rank for Recommender Systems
Ernesto Diaz-Aviles, Mihai Georgescu, and Wolfgang Nejdl

Overview
• Address the item recommendation task in the context of recommender systems
• An approach to learning ranking functions that exploits collaborative latent factors as features
• Instead of manually creating an item feature vector, factorize a matrix of user-item interactions
• Use these collaborative latent factors as input to the Swarm Intelligence (SI) ranking method SwarmRank

SI for Recommender Systems
SwarmRankCF:
• a collaborative learning-to-rank algorithm based on SI
• while learning-to-rank algorithms typically use hand-picked features to represent items, we learn such features from user-item interactions and apply a PSO-based optimization algorithm that directly maximizes Mean Average Precision
Evaluation
Dataset: real-world data from Internet radio, the 5-core of the Last.fm Dataset - 1K Users:
  Transactions: 242,103
  Unique users: 888
  Items (artists): 35,315
Evaluation Methodology: all-but-one protocol (leave-one-out holdout). For each user u, one item is hidden and we check whether it appears in the Top-N recommendations:

  Recall@N = (1 / |U|) · Σ_u hit(u)

where hit(u) = 1 if the hidden item i is present in u's Top-N list of recommendations, and 0 otherwise.
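A minimal sketch of the all-but-one evaluation loop; the recommender interface (top_n) is hypothetical:

def hit_rate_at_n(recommender, hidden_items, n=10):
    """hidden_items: {user: item held out from that user's history}.
    Returns the fraction of users whose hidden item appears in the Top-N."""
    hits = sum(1 for user, hidden in hidden_items.items()
               if hidden in recommender.top_n(user, n))
    return hits / len(hidden_items)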
Contact: Ernesto Diaz-Aviles, Mihai Georgescu email: {diaz, georgescu}@L3S.de L3S Research Center / Leibniz Universität Hannover Appelstrasse 4, 30167 Hannover, Germany phone: +49 511 762-19715
www.cubrikproject.eu
LikeLines: Collecting Timecode-level Feedback for Web Videos through User Interactions Raynor Vliegendhart, Martha Larson, and Alan Hanjalic Multimedia Information Retrieval Lab, Delft University of Technology
Problem
● Problem: Providing users with a navigable heat map of interesting regions of the video they are watching.
● Motivation: Conventional time sliders do not make the inner structure of the video apparent, making it hard to navigate to the interesting bits.

Approach
A Web video player component with a navigable heat map that:
● Uses multimedia content analysis to seed the heat map.
● Captures implicit and explicit user feedback at the timecode level to refine the heat map.
[Figure: system overview. Viewers interact with the LikeLines player (play, pause, seek, like, ...), which reports interaction sessions to a back-end server; the server also holds multimedia content analysis results.]

System Overview
● Video player component, augmented with:
● Navigable heat map that allows users to jump directly to "hot" areas;
● Time-sensitive "like" button that allows users to explicitly like particular points in the video.
● Captures user interactions:
● Implicit feedback such as playing, pausing and seeking;
● Explicit "likes" expressed by the user.
● Combines content analysis and captured user interactions to compute a video's heat map.
[Figure: a video's heat map over time t (s) is the sum of the content analysis curve and the user feedback curves 1 ... n.]
● Back-end interaction session server stores and aggregates per video:
● All interaction sessions between each user and player;
● Initial multimedia content analysis of the video.

Implementation
● Video player component implemented in JavaScript and HTML5.
● Out-of-the-box support for YouTube and HTML5 videos:

<script type="text/javascript">
var player = new LikeLines.Player('playerDiv', {
  video: 'http://www.youtube.com/watch?v=wPTilA0XxYE',
  backend: 'http://backend:9090/'
});
</script>

● Video player component communicates with a back-end server using JSON(P).
● Back-end server reference implementation is written in Python.

Source code: https://github.com/delftmir/likelines-player
Contact: R.Vliegendhart@tudelft.nl @ShinNoNoir
Future Work
● For what kinds of video is timecode-level feedback useful?
● How should user interactions be interpreted?
● How to fuse timecode-level feedback with content analysis without encouraging snowball effects?
● Can timecode-level data be linked to queries to recommend relevant jump points?
● How to collect a critical mass of timecode-level data by incentivizing users to interact with the system?
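A minimal sketch of the heat-map computation shown in the system overview (content analysis curve plus user feedback), with binning and weights of our own choosing rather than the reference implementation's:

import numpy as np

def heat_map(duration, content_scores, sessions, likes, n_bins=100):
    """Aggregate evidence into a per-bin heat map for one video.

    content_scores: array of length n_bins seeded by content analysis.
    sessions: list of (start, end) played intervals in seconds.
    likes: list of time points (seconds) where users pressed "like".
    """
    heat = np.asarray(content_scores, dtype=float).copy()
    bin_size = duration / n_bins
    for start, end in sessions:           # implicit feedback: played regions
        lo = min(int(start / bin_size), n_bins - 1)
        hi = min(int(end / bin_size), n_bins - 1)
        heat[lo:hi + 1] += 1.0
    for t in likes:                       # explicit feedback: weighted higher
        heat[min(int(t / bin_size), n_bins - 1)] += 5.0
    peak = heat.max()
    return heat / peak if peak > 0 else heat  # normalize for display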
20th ACM international conference on Multimedia, Nara, Japan, 2012
One of These Things is Not Like the Other: Crowdsourcing Semantic Similarity of Multimedia Files Raynor Vliegendhart*, Martha Larson*, and Johan Pouwelse** Multimedia Information Retrieval Lab* Delft University of Technology
Parallel and Distributed Systems Group** Delft University of Technology
Problem
● Problem: What constitutes a near duplicate? For example: Are these two files the same? Why (not)?
[Example: two YouTube videos, "Chrono Cross - 'Dream of the Shore Near Another World' Violin/Piano Cover" (IQYNEj51EUI) and "Chrono Cross Dream of the Shore Near Another World Violin and Piano" (Iuh3YrJtK3M). Yes: it's the same song. No: these are different performances by different performers.]
● Definition: Functional near-duplicate multimedia items are items that fulfill the same purpose for the user. Once the user has one of these items, there is no additional need for another.
● Task: Discovering new notions of user-perceived similarity between multimedia files in a file-sharing setting.
● Motivation: Clustering items in search results. [Screenshots from Tribler 5.4 (tribler.org), e.g. "Harry Potter and the Sorcerers Stone Audio Book" (478 MB), "Harry Potter and the Sorcerer s Stone (2001)(ENG GER NL) 2Lions-" (4.36 GB), and "Harry Potter.And.The.Sorcerer.Stone.DVDR.NTSC.SKJACK.Universal.S" (4.46 GB).]

HIT Design
Amazon Mechanical Turk (AMT) is a crowdsourcing platform to which Human Intelligence Tasks (HITs) can be submitted. Phrasing in our HIT is important in order to elicit serious judgments:
● "Imagine that you download the three items in the list and that you view them."
● Don't force workers to make a contrast, and
● Explain the definition of functional similarity.
Answer options:
o The items are comparable. They are for all practical purposes the same. Someone would never really need all three of these.
o Each item can be considered unique. I can imagine that someone might really want to download all three of these items.
o One item is not like the other two. (Please mark that item in the list.) The other two items are comparable.
Approach
● Idea: Point the odd one out, inspired by Sesame Street's "one of these things is not like the other".
● Crowdsourcing Task:
● 3 multimedia files displayed as search results;
● Worker points the odd one out and justifies why.
● Challenge: Eliciting serious judgments.

Experiments
● Dataset:
● Popular file-sharing site: The Pirate Bay (thepiratebay.se).
● 75 queries derived from the Top 100 list.
● 32,773 filenames and metadata.
● 1,000 random triads sampled from search results (see the sketch below).
● Crowdsourcing Experiment:
● Recruitment HIT and Main HIT run concurrently on AMT.
● 8 out of 14 qualified workers produced free-text judgments for 308 triads within 36 hours.
● Card Sort:
● Group similar judgments into piles, merge piles iteratively, and finally label each pile.
● End result: 44 user-perceived dimensions of similarity discovered.
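A minimal sketch of the triad sampling step referenced above; we assume each triad is drawn from a single query's result list so that the three items plausibly co-occur (names and this constraint are our own reading of the setup):

import random

def sample_triads(results_per_query, n_triads=1000, seed=42):
    """results_per_query: {query: [filenames]}. Returns (query, triad) pairs,
    each triad being 3 distinct filenames from that query's results."""
    rng = random.Random(seed)
    queries = [q for q, files in results_per_query.items() if len(files) >= 3]
    triads = []
    for _ in range(n_triads):
        q = rng.choice(queries)
        triads.append((q, tuple(rng.sample(results_per_query[q], 3))))
    return triads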
Contact: R.Vliegendhart@tudelft.nl @ShinNoNoir
Conclusion ● Wealth of user-perceived dimensions of similarity discovered. ● Quick results due to interesting crowdsourcing task.
ICT.OPEN 2012, Rotterdam, The Netherlands, 2012
Mining Emotions in Short Films: User Comments or Crowdsourcing?
Claudia Orellana-Rodriguez (orellana@L3S.de), Ernesto Diaz-Aviles (diaz@L3S.de), Wolfgang Nejdl (nejdl@L3S.de)
Motivation
Emotions are everywhere. Many applications and diverse disciplines can benefit from mining emotions.

Task
Extract emotions in short films. Exploit film criticism expressed through YouTube comments.
Emotion lexicon
Human-provided word-emotion association ratings, annotated according to Plutchik's psychoevolutionary theory (NRC Emotion Lexicon - EmoLex) [1]

Emotion detection approach [2]
1. Create a profile for each short film
2. Extract the terms from the profile
3. Associate to each term an emotion and polarity
4. Compute the emotion and polarity vector
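A minimal sketch of steps 2–4 above, assuming EmoLex is available as a mapping from a term to its associated emotions and polarities (the three entries below are placeholders, not the real lexicon):

from collections import Counter

# Hypothetical EmoLex slice: term -> list of associated emotions/polarities.
EMOLEX = {
    "happy": ["joy", "positive"],
    "scary": ["fear", "negative"],
    "love":  ["joy", "positive"],
}

def emotion_vector(profile_terms):
    """Aggregate EmoLex associations of a film profile's terms into a
    normalized emotion-and-polarity vector."""
    counts = Counter()
    for term in profile_terms:
        counts.update(EMOLEX.get(term, []))
    total = sum(counts.values()) or 1
    return {emo: c / total for emo, c in counts.items()}

print(emotion_vector(["happy", "love", "scary"]))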
[Figure: Plutchik's Wheel of Emotions.]
[Figure: pipeline. Comments c1, c2, ..., cn on a short film (entries from Tropfest and Your Film Festival) are aggregated into a short film profile; its nouns and adjectives are looked up in EmoLex to produce an emotion and polarity vector; parallel emotion and polarity vectors are elicited from Amazon Mechanical Turk workers (developed on the AMT Sandbox).]
[Chart: cosine similarity between the emotion vectors built from expert judgments (moviegoers) and the ones built (i) through crowdsourcing using AMT workers and (ii) automatically from YouTube comments; y-axis from 0.50 to 0.80.]
[1] S. M. Mohammad and P. D. Turney. Crowdsourcing a word-emotion association lexicon. Computational Intelligence, 2011.
[2] E. Diaz-Aviles, C. Orellana-Rodriguez, and W. Nejdl. Taking the Pulse of Political Emotions in Latin America Based on Social Web Streams. In LA-WEB, 2012.
Claudia Orellana-Rodriguez L3S Research Center e-mail: orellana@L3S.de
Crowdsourcing Social Fashion Images: a Pilot Study
María Menéndez*, Babak Loni†, Martha Larson†, Claudio Massari‡, Davide Martinenghi¥, Raynor Vliegendhart†, Luca Galli¥, Mark Melenhorst†, Marco Tagliasacchi¥ and Piero Fraternali¥
*Department of Computer and Information Science, University of Trento, Italy; †Multimedia Information Retrieval Lab, Delft University of Technology, Netherlands; ‡Innovation Engineering, Italy; ¥Dipartimento di Elettronica e Informazione, Politecnico di Milano, Italy

Motivation
As part of the requirements from industrial and technical partners, a collection of fashion items needs to be gathered for further use in the vertical domains. This data can be used for several tasks related to social fashion such as: - Multimedia Content Analysis tasks including: recognizing different types of fashion items and predicting user appeal of fashion images -“Multimedia crowdsourcing” tasks: Algorithms for combining many noisy human decisions into a single decision
Semantic Relation: Does the tag designate a fashion image?
As defined in Wikipedia, zari is "an even thread traditionally made of fine gold or silver used in traditional Indian, Pakistani, and Persian garments". [Figure: some Flickr pictures tagged with "zari" show garments; however, other Flickr pictures tagged with the same term do not.]

CUbRIK
A research project financed by the European Union:
• Advance the architecture of multimedia search
• Exploit the human contribution in multimedia search
• Use open-source components provided by the community
• Start up a search business ecosystem
This dataset is used by "Open Innovation: Fashion Industry", one of the frameworks of the CUbRIK project.
Image Retrieval
Using Wikipedia's index for fashion topics, 470 topics were selected (only topics related to categories containing the text "fashion" or "cloth"). Using these 470 topics, 323,507 images were collected from Flickr, selected simply by matching the Wikipedia topic to the image tags given by Flickr's users. Each picture carries one matching tag.
In order to clean the dataset and to collect more information about these images from users, we use a crowdsourcing approach.
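A minimal sketch of this collection step, assuming the flickrapi package and placeholder credentials; pagination, retries, and tag normalization are omitted:

import flickrapi

flickr = flickrapi.FlickrAPI('API_KEY', 'API_SECRET', format='parsed-json')

def images_for_topic(tag, per_page=100):
    """Fetch Flickr images whose user-given tags match a Wikipedia topic."""
    resp = flickr.photos.search(tags=tag, per_page=per_page)
    return resp['photos']['photo']

# e.g., one of the 470 fashion topics:
photos = images_for_topic('zari')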
Refining the Dataset with Crowdsourcing
Considering the retrieved images, a further refinement of the data is needed to filter out non-fashion images. Crowdsourcing is used to achieve this goal.

Crowdsourcing
Crowdsourcing is the combined effort of a large number of human contributors. Narrowly defined, it is micro-outsourcing of small tasks on a crowdsourcing platform such as Amazon's Mechanical Turk (AMT):
• Micro-tasks called Human Intelligence Tasks (HITs)
• HITs are carried out by MTurk workers (turkers)
• Typically used for tasks that lend themselves well to piecemeal work (multiple people make small contributions)
• Requesters can assign qualifications to turkers

We use crowdsourcing on this dataset to:
• Filter out non-fashion-related images
• Investigate whether existing tags match what appears in the picture
• Obtain information on what is liked/disliked in the picture
• Categorize images according to different variables
• Investigate whether picture-composition features related to classic and expressive aesthetics (e.g., clean, sophisticated, clear, aesthetic) influence turkers' choices.
Crowdsourcing Approach and Results
To create a prototype of our crowdsourcing task, we further filtered the fashion categories and chose the 14 most common categories (i.e., dress, shirt, sleeveless shirt, suit, trousers, jeans, skirt, coat, necktie, leather glove, handbag, and jewelry).
The list of common fashion categories was created by five people with no special knowledge of fashion. The top 6 images, according to Flickr's relevance ranking, were selected for each category (72 images in total). For the pilot crowdsourcing task, each HIT was performed by 3 different turkers. Turkers contributed by tagging, rating, describing, and categorizing different aspects of the pictures, such as the kind of model, the number of people in the image, and the photographer's expertise.
contact: m.a.larson@tudelft.nl
Results
- In total, 50 turkers performed 210 HITs (5 turkers and their 6 HITs were discarded because of meaningless answers)
- 10 turkers performed some 70% of the total number of HITs; the maximum number of HITs per turker was 45 and the minimum 1
- On average, turkers spent 4 min 42 s per HIT (SD = 2 min 58 s)
- Some 72% of the turkers were female and some 28% were male
- The mean age was 32.21 years (SD = 7.564)
- Most of the turkers came from India (41%), followed by the USA (40%)
Learning to Rank for Joy
Claudia Orellana-Rodriguez (orellana@L3S.de), Ernesto Diaz-Aviles (e.diaz-aviles@ie.ibm.com), Ismail Sengor Altingovde (altingovde@ceng.metu.edu.tr), Wolfgang Nejdl (nejdl@L3S.de)
Motivation
Emotions are everywhere. Users' information needs are complex and depend on their context (e.g., time of day, mood, location).

Task
Leverage social feedback for affective video ranking.
Emotion lexicon
Human-provided word-emotion association ratings, annotated according to Plutchik's psychoevolutionary theory (NRC Emotion Lexicon - EmoLex) [1]

Emotion detection approach [2]
1. Create a profile for each video
2. Extract the terms from the profile
3. Associate to each term an emotion and polarity
4. Compute the emotion and polarity vector
[Figure: pipeline. Video comments c1, c2, ..., cn form a video profile; its nouns and adjectives are looked up in EmoLex to produce an emotion and polarity vector.]
Features
• Basic: created by the uploader of the video without any other user interaction (e.g., video title, tags, description) [3]
• Social: product of the interaction between YouTube users and the video (e.g., likes, dislikes, views, comments, favorites) [3]
• Sentic: emotions and polarity extracted from the video comments [4]

Eliciting judgements
• Relevance judgements: ask users to indicate how relevant each video is with respect to a given query [3]
• Affective judgements: ask users to annotate all the videos according to the emotions they experience while watching them
After eliciting judgements, we label with 1 the videos which are both relevant and associated with the emotion Joy (our affective context of interest), and with 0 otherwise. With these labels and using RankSVM, we learn three different ranking functions:
• Basic: using only basic video features
• Social and Sentic: trained using social and sentic features
• All: using all the features
[Chart: ranking performance (P@10, MAP, NDCG@5, NDCG@10, MeanNDCG) of the basic, social and sentic, and all-features rankers; y-axis from 0.00 to 0.55.]
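The poster does not publish its RankSVM setup; a minimal scikit-learn approximation via the standard pairwise transform (features, labels, and query ids below are placeholders):

import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, y, qid):
    """Turn a ranking problem into binary classification on
    within-query feature differences (the RankSVM reduction)."""
    Xp, yp = [], []
    for q in np.unique(qid):
        idx = np.where(qid == q)[0]
        for i in idx:
            for j in idx:
                if y[i] > y[j]:            # i should rank above j
                    Xp.append(X[i] - X[j]); yp.append(1)
                    Xp.append(X[j] - X[i]); yp.append(-1)
    return np.array(Xp), np.array(yp)

# X: basic/social/sentic features; y: 1 = relevant and joyful, 0 = otherwise.
rng = np.random.default_rng(0)
X = rng.random((40, 10))
y = rng.integers(0, 2, 40)
qid = np.repeat(np.arange(4), 10)          # 4 hypothetical queries
Xp, yp = pairwise_transform(X, y, qid)
ranker = LinearSVC().fit(Xp, yp)
scores = X @ ranker.coef_.ravel()          # higher score = ranked higher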
[1] S. M. Mohammad and P. D. Turney. Crowdsourcing a word-emotion association lexicon. Computational Intelligence, 2011.
[2] E. Diaz-Aviles, C. Orellana-Rodriguez, and W. Nejdl. Taking the Pulse of Political Emotions in Latin America Based on Social Web Streams. In LA-WEB, 2012.
[3] S. Chelaru, C. Orellana-Rodriguez, and I. S. Altingovde. How Useful is Social Feedback for Learning to Rank YouTube Videos? WWWJ, 1-29, in press.
[4] E. Diaz-Aviles, C. Orellana-Rodriguez, and W. Nejdl. Mining Emotions in Short Films: User Comments or Crowdsourcing? In WWW Companion, 2013.
Claudia Orellana-Rodriguez L3S Research Center e-mail: orellana@L3S.de