How Do We Deep-Link? Leveraging User-Contributed Time-Links for Non-Linear Video Access Raynor Vliegendhart, Babak Loni, Martha Larson, and Alan Hanjalic Multimedia Information Retrieval Lab, Delft University of Technology
Introduction
● Problem: How do users deep-link? (i.e., refer to points in a video by explicitly mentioning time-codes)
● Motivation: Leverage time-codes within deep-link comments for enabling non-linear video access
● Dataset: MSRA-MM 2.0 / YouTube comments

Contributions
● Notion of Viewer Expressive Reaction (VER): reflects viewers' perceptions of noteworthiness (but extends beyond depicted content and induced affect)
● Viewer Expressive Reaction Variety taxonomy (VERV): captures how users deep-link; shown to be appropriate for automatic filtering

Envisioned Future Retrieval System
[Figure: mockup of the envisioned retrieval system. Search results for the query "cats at play +surprise" show videos such as "Funny cats in water" (by NekoTV), "Wrestling kittens" (by Rizzzalie), and "Stalking cat" (by lowdope), each annotated with deep-links (▼) mined from user comments, e.g. "0:44 omg so cute", "what's the breed of the cat at 2:14?", "The song at 3:33 is called 'The eye of the tiger'", "epic failure at 0:23", "2:59 That didn't go too well", "5:17 Damn!", "i liked it till 2:11 then it became boring", "1:12 omg, that's impossible!", and "O_o now I didn't expect that at all! 0:33". These deep-link comments occur unprompted on social video sharing platforms.]
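Deep-link comments can be spotted by detecting explicit time-code mentions; a minimal sketch of such a detector (the regex and function names are ours, not the authors' filtering code):

import re

# Matches time-codes such as "2:14", "0:44", or "1:02:33" inside a comment.
TIMECODE = re.compile(r'\b(?:(\d{1,2}):)?(\d{1,2}):([0-5]\d)\b')

def extract_timecodes(comment):
    """Return all time-codes mentioned in a comment, in seconds."""
    points = []
    for hours, minutes, seconds in TIMECODE.findall(comment):
        h = int(hours) if hours else 0
        points.append(h * 3600 + int(minutes) * 60 + int(seconds))
    return points

print(extract_timecodes("what's the breed of the cat at 2:14?"))  # [134]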
Approach
● Taxonomy elicitation via crowdsourcing (Amazon Mechanical Turk):
● Given: 3 deep-link comments per video (e.g., "whoa. unreal creepy eyes at 1:02")
● Task: Describe why a comment was posted (2–4 sentences)
● Post-processing: Card-sorting technique
● Annotation crowdsourcing task, for:
● Validating the elicited VERV taxonomy
● Annotating 3,359 deep-link comments:
● Whether a comment contains a true deep-link (VER/non-VER)
● VERV class (if and only if VER comment)
● Linear SVM comment classification experiment (unigram features); see the sketch below

Results
● Annotation agreement:
● VER/non-VER: 2,842 comments (84.6%)
● VERV: 2,140 comments (63.7%)
● Automatic classification: misclassification challenges:
● "Funny" comments often labeled as here by humans, but classified as love by the classifier
● Comments with multiple interpretations
● Comments with multiple sentences
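A minimal sketch of the unigram linear-SVM experiment referenced above, assuming scikit-learn; the comments and class labels below are placeholders, not the annotated dataset:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labeled deep-link comments with VERV-style classes
# (illustrative labels only; the real taxonomy and data are the poster's).
comments = ["0:44 omg so cute", "epic failure at 0:23",
            "5:17 Damn!", "The song at 3:33 is called Eye of the Tiger"]
labels = ["love", "here", "here", "info"]

# Unigram bag-of-words features feeding a linear SVM.
clf = make_pipeline(CountVectorizer(ngram_range=(1, 1)), LinearSVC())
clf.fit(comments, labels)
print(clf.predict(["that bit at 1:12 was hilarious"]))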
Future Work
● Improve automatic classification by adding content features
● Develop the envisioned deep-link retrieval system

Contact: R.Vliegendhart@tudelft.nl @ShinNoNoir
21st ACM international conference on Multimedia, Barcelona, Spain, 2013
CONTEXT-BASED PEOPLE RECOGNITION in CONSUMER PHOTO COLLECTIONS Markus Brenner, Ebroul Izquierdo MMV Research Group, School of Electronic Engineering and Computer Science Queen Mary University of London, UK {markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk
Aim
• Resolve identities of people primarily by their faces
• Perform recognition by considering all contextual information at the same time (unlike traditional approaches that usually train a classifier and then predict identities independently)
• Incorporate rich contextual cues of personal photo collections, where few individual people frequently appear together
Face Detection and Basic Recognition
• Initial steps: image preprocessing, face detection and face normalization
• Descriptor-based: Local Binary Pattern (LBP) texture histograms, computed for each block of the normalized face and concatenated
• Similarity metric: chi-square statistics (all samples are treated as independent)
• Basic face recognition: k-Nearest-Neighbor

Graph-based Recognition
• Model: pairwise Markov Network (graph nodes represent faces)
• Unary potentials: the likelihood of each face belonging to particular people, derived from face similarities
• Pairwise potentials: encourage spatial smoothness, encode an exclusivity constraint, and incorporate the temporal domain:

  ψ(w_n, w_m) = τ,              if w_n = w_m and i_n ≠ i_m
                0,              if w_n = w_m and i_n = i_m
                ψo(w_n, w_m),   otherwise

  where w_n, w_m are the person labels of faces n and m, i_n, i_m the photos they appear in, τ a smoothness constant, and ψo a co-occurrence term.
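The case analysis above maps directly to code; a minimal sketch (function and parameter names are ours; τ and the co-occurrence term ψo would come from the trained model):

def pairwise_potential(w_n, w_m, i_n, i_m, tau, psi_o):
    """Pairwise potential between two face nodes.

    w_n, w_m: person labels; i_n, i_m: photo indices;
    tau: smoothness reward; psi_o: learned co-occurrence function.
    """
    if w_n == w_m and i_n != i_m:
        return tau          # same person in different photos: encourage
    if w_n == w_m and i_n == i_m:
        return 0.0          # same person twice in one photo: forbidden
    return psi_o(w_n, w_m)  # otherwise: co-appearance likelihood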
• Topology: only the most similar faces are connected with edges
• Inference: maximum a posteriori (MAP) solution of Loopy Belief Propagation (LBP)
[Figure: graph over training (Tr) and test (Te) face nodes; every node carries a unary potential and edges are weighted by face similarity.]
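A minimal sketch of the basic recognition pipeline above (block-wise LBP histograms, chi-square distance, k-NN), assuming NumPy and scikit-image; the block count and k are illustrative:

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(face_gray, blocks=4, P=8, R=1.0):
    """Concatenate per-block LBP histograms of a grayscale face crop."""
    lbp = local_binary_pattern(face_gray, P, R, method='uniform')
    n_bins = P + 2
    h, w = lbp.shape
    hist = []
    for by in range(blocks):
        for bx in range(blocks):
            block = lbp[by*h//blocks:(by+1)*h//blocks,
                        bx*w//blocks:(bx+1)*w//blocks]
            bh, _ = np.histogram(block, bins=n_bins, range=(0, n_bins),
                                 density=True)
            hist.append(bh)
    return np.concatenate(hist)

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def knn_label(query, train_hists, train_labels, k=5):
    """Basic k-NN face recognition under the chi-square distance."""
    d = [chi_square(query, h) for h in train_hists]
    nearest = np.argsort(d)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)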
Social Semantics
• Individual appearance: for a more effective graph topology (used to regularize the number of edges)
• Unique People Constraint models exclusivity: a person cannot appear more than once in a photo
• Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again
• Groups of people: use data mining to discover frequently appearing social patterns

Body Detection and Recognition (for when faces are obscured or invisible)
• Detect upper and lower body parts
• Bipartite matching of faces and bodies
• Graph-based fusion of faces and clothing
Experiments
• Public Gallagher Dataset: ~600 photos, ~800 faces, 32 distinct people
• Our dataset: ~3,300 photos, ~5,000 faces, 106 distinct people
• All photos shot with a typical consumer camera
• Considering only correctly detected faces (87%)
[Chart: accuracy gain at 3% training data (y-axis 0–25%), growing as the graphical model, social semantics, and body parts are added in turn.]
[Figure: fused graph in which training (Tr) and test (Te) nodes are connected by face, upper-body, and lower-body similarity edges, with a unary potential on every node.]
PicAlert!: A System for Privacy-Aware Image Classification and Retrieval
Sergej Zerr, Stefan Siersdorfer, Jonathon Hare
E-mail: {zerr,siersdorfer}@L3S.de, jsh2@ecs.soton.ac.uk

A large portion of the images published in social Web 2.0 applications are of a highly sensitive nature, disclosing many details of the users' private lives. We have developed a web service which can detect private images within a user's photo stream and provide support in making privacy decisions in the sharing context. In addition, we present a privacy-oriented image search application which automatically identifies potentially sensitive images in the result set and separates them from the remaining pictures.
Acquiring the Ground Truth Using a Social Annotation Game
81 users annotated 37,535 recent images: 27,405 public, 4,701 private
Common notion of “privacy”: “Private are photos which have to do with the private sphere (like self portraits, family, friends, your home) or contain objects that you would not share with the entire world (like a private email). The rest are public. In case no decision can be made, the picture should be marked as undecidable.”
[Figure: GUI of the game.]
Feature Extraction
Colors, edges, SIFT, faces

Classifier Training
BEP (precision/recall break-even point): Visual: 0.74, Text: 0.78, Combination: 0.80
[Figure: Top-50 stemmed terms according to their Mutual Information values for "public" vs. "private" photos in Flickr.]
[Figure: Evaluation: P/R curves for the features and their combination.]
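A minimal sketch of one way to combine the two evidence types and read off the break-even point, assuming scikit-learn and placeholder feature matrices (the poster's actual classifier setup is not specified here):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

# Hypothetical pre-extracted feature matrices for 100 labeled photos:
# X_visual could hold color/edge/SIFT/face features, X_text tag/title terms.
rng = np.random.default_rng(0)
X_visual = rng.random((100, 64))
X_text = rng.random((100, 300))
y = rng.integers(0, 2, 100)          # 1 = private, 0 = public (from the game)

X = np.hstack([X_visual, X_text])    # simple early fusion of both views
clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]  # probability of "private"

# Precision/recall break-even point (BEP), as reported on the poster.
prec, rec, _ = precision_recall_curve(y, scores)
bep = prec[np.argmin(np.abs(prec - rec))]
print(f"BEP: {bep:.2f}")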
Search & Web Service
[Figure: Search results for the query "cristiano ronaldo" (06/06/12).]
[Figure: Web service GUI for privacy-oriented image classification.]
www.cubrikproject.eu
Map to Humans and Reduce Error - Crowdsourcing for Deduplication Applied to Digital Libraries
Mihai Georgescu, Dang Duc Pham, Claudiu S. Firan, Julien Gaugaz, Wolfgang Nejdl
• Find duplicate entities based on metadata
• Focus on scientific publications in the Freesearch system
Crowdsourcing:
• An automatic method and human labelers work together to improve their performance at identifying duplicate entities
• Actively learn how to deduplicate from the crowd by optimizing the parameters of the automatic method
• HIT setup: 1 HIT = 5 pairs; 5 ct per HIT; 3–5 assignments per HIT
[Example HIT: a publication pair shown side by side.
Left record: [Show Diff] [Full Text] Title: Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling; Authors: Soraya Rana, Adele E. Howe, L. Darrell, Whitley Keith Mathias; Venue: Proceedings of the Third International Conference on Artificial Intelligence Planning Systems, Menlo Park, CA; Publisher: The AAAI Press; Year: 1996; Language: English; Type: conference.
Right record: [Show Diff] Title: Comparing Heuristic, Evolutionary and Local Search Approaches to Scheduling.; Authors: Soraya B. Rana, Adele E. Howe, L. Darrell Whitley, Keith E. Mathias; Book: AIPS, pp. 174–181 [Contents]; Year: 1996; Language: English; Type: conference (inproceedings); Abstract: "The choice of search algorithm can play a vital role in the success of a scheduling application. In this paper, we investigate the contribution of search algorithms in solving a real-world warehouse scheduling problem. We compare performance of three types of scheduling algorithms: heuristic, genetic algorithms and local search."
Question: "After carefully reviewing the publication metadata presented to you, how would you classify the 2 publications referred?" Judgment for publication pair: o Duplicates o Not Duplicates]
Automatic Method
• The DuplicatesScorer produces an Automatic Duplicates Score (ADS)
• DSParams = {(fieldName, fieldWeight)} and a threshold
• Compare the ADS to the threshold => AD ∈ {0, 1}

Crowd Decision
• The aggregated decision of all workers for a pair produces a Crowd Soft Decision (CSD)
• A worker's contribution to the CSD is proportional to the confidence c_k we have in that worker
• Compare the CSD to 0.5 => CD ∈ {0, 1}

Active learning loop:
1. Initialize DSParams and the threshold; Pcand = ∅
2. Identify pairs with ADS = threshold ± ε; sample and add them to Pcand
3. Get crowd labels for Pcand
4. Compute crowd decisions and worker confidences
5. Move high-confidence pairs to Ptrain; Pcand = Pcand − Ptrain
6. Optimize DSParams and the threshold to fit the data in Ptrain, yielding better DSParams and threshold
7. Identify duplicate pairs, Pdupl; repeat from step 2
Results: accuracy (%) per duplicate detection strategy (columns) and optimization strategy (rows):

             | sign  | sign+DS/m | sign+DS/o | DS/m  | DS/o  | CD-MV
Accuracy     | 79.19 | 80.00     | 79.73     | 80.00 | 78.92 | 79.73
Sum-Err      | 76.49 | 79.46     | 79.46     | 79.46 | 79.46 | 79.19
Sum-log-err  | 71.89 | 78.11     | 78.38     | 78.92 | 80.27 | 76.76
Pearson      | 73.24 | 79.46     | 79.46     | 80.54 | 79.46 | 81.08

Crowd decision quality (P = precision, R = recall, A = accuracy) per crowd decision strategy:

  | MV (3 workers) | MV (5 workers) | Iter | Manual | Boost | Heur
P | 0.95           | 0.95           | 1.00 | 0.48   | 0.66  | 0.63
R | 0.20           | 0.20           | 0.20 | 0.67   | 0.56  | 0.97
A | 0.77           | 0.77           | 0.77 | 0.70   | 0.79  | 0.83
Worker Confidence
• Assess how reliable the individual workers are, compared to the overall performance of the crowd
• Simple measure: the proportion of a worker's pairs that carry the same label as the one assigned by the crowd
• Use an EM algorithm to iteratively compute the worker confidences: compute the CSDs, then update each c_k

Crowd Soft Decision
• Aggregation of all individual votes W_ij(k) ∈ {−1, 1} of workers k for pair (i, j):

  CSD_ij = ( Σ_k weight_ij(k) · W_ij(k) ) / ( Σ_k weight_ij(k) )

  where weight_ij(k) grows with the confidence c_k of worker k; the result is mapped to the [0, 1] range and compared to 0.5.

Crowd Decision Strategies:
• MV: Majority Voting; all users are equal, c_k = 1
• Iter: c_k computed using the EM algorithm
• Boost: c_k computed using the EM algorithm, using boosted weights in the computation of the CSD
• Heur: heuristic, 3/3 or 4/5 agreement

Duplicate Detection Strategies
• Just signatures: sign
• Just the DuplicatesScorer: DS/m, DS/o
• First compute signatures and then base the decision on the DuplicatesScorer: sign+DS/m, sign+DS/o
• Directly use the Crowd Decision obtained via Majority Voting: CD-MV

Optimization Strategies
• Compare the ADS to the CSD and optimize DSParams: minimize the sum of errors; minimize the sum of the logs of errors; or maximize the Pearson correlation
• Compare the CD to the AD and optimize the threshold to maximize Accuracy
• Compare the CD to the AD and optimize DSParams and the threshold to maximize Accuracy

Experiment Setup
• 3 batches:
  o 60 HITs with qualification test
  o 60 HITs without qualification test
  o 120 HITs without qualification test
• MTurk HITs to get labeled data, while tackling the quality issues of the crowdsourced work

Contact: Mihai Georgescu, email: georgescu@L3S.de
L3S Research Center / Leibniz Universität Hannover, Appelstrasse 4, 30167 Hannover, Germany, phone: +49 511 762-19715
dblp.kbs.uni-hannover.de · www.cubrikproject.eu
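A minimal sketch of the confidence-weighted aggregation and the iterative confidence update described above (data structures, starting confidences, and the stopping rule are our own; the poster's exact EM formulation may differ):

def crowd_soft_decision(votes, confidence):
    """votes: {worker_id: +1 (duplicate) or -1 (not duplicate)} for one pair.
    Weighted mean of votes, mapped from [-1, 1] to a CSD in [0, 1]."""
    num = sum(confidence[k] * v for k, v in votes.items())
    den = sum(confidence[k] for k in votes)  # assumes positive confidences
    return 0.5 * (1.0 + num / den)

def worker_confidences(all_votes, iters=10):
    """all_votes: {pair_id: {worker_id: vote}}. EM-style loop: threshold the
    CSDs at 0.5 to get crowd decisions (CD), then rate each worker by the
    proportion of their votes that agree with the CD."""
    confidence = {k: 1.0 for votes in all_votes.values() for k in votes}
    for _ in range(iters):
        cd = {p: 1 if crowd_soft_decision(v, confidence) >= 0.5 else -1
              for p, v in all_votes.items()}
        agree, total = {}, {}
        for p, votes in all_votes.items():
            for k, v in votes.items():
                total[k] = total.get(k, 0) + 1
                agree[k] = agree.get(k, 0) + (v == cd[p])
        confidence = {k: agree[k] / total[k] for k in total}
    return confidence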
Swarming to Rank for Recommender Systems
Ernesto Diaz-Aviles, Mihai Georgescu, and Wolfgang Nejdl

Overview
• Address the item recommendation task in the context of recommender systems
• An approach to learning ranking functions that exploits collaborative latent factors as features
• Instead of manually creating an item feature vector, factorize a matrix of user-item interactions
• Use these collaborative latent factors as input to the Swarm Intelligence (SI) ranking method SwarmRank

SI for Recommender Systems
SwarmRankCF:
• a collaborative learning-to-rank algorithm based on SI
• while learning-to-rank algorithms typically use hand-picked features to represent items, we learn such features from user-item interactions and apply a PSO-based optimization algorithm that directly maximizes Mean Average Precision
Evaluation
Dataset: real-world data from Internet radio, the 5-core of the Last.fm Dataset - 1K Users:
  Transactions: 242,103
  Unique users: 888
  Items (artists): 35,315
Evaluation Methodology: all-but-one protocol (leave-one-out holdout). For each user u, one item is hidden and we check whether it appears in the Top-N recommendations:

  Recall@N = (1 / |U|) · Σ_u hit(u)

where hit(u) = 1 if the hidden item i is present in u's Top-N list of recommendations, and 0 otherwise.
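A minimal sketch of the all-but-one evaluation loop; the recommender interface (top_n) is hypothetical:

def hit_rate_at_n(recommender, hidden_items, n=10):
    """hidden_items: {user: item held out from that user's history}.
    Returns the fraction of users whose hidden item appears in the Top-N."""
    hits = sum(1 for user, hidden in hidden_items.items()
               if hidden in recommender.top_n(user, n))
    return hits / len(hidden_items)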
Contact: Ernesto Diaz-Aviles, Mihai Georgescu email: {diaz, georgescu}@L3S.de L3S Research Center / Leibniz Universität Hannover Appelstrasse 4, 30167 Hannover, Germany phone: +49 511 762-19715
www.cubrikproject.eu
LikeLines: Collecting Timecode-level Feedback for Web Videos through User Interactions Raynor Vliegendhart, Martha Larson, and Alan Hanjalic Multimedia Information Retrieval Lab, Delft University of Technology
Problem
● Problem: Providing users with a navigable heat map of interesting regions of the video they are watching.
● Motivation: Conventional time sliders do not make the inner structure of the video apparent, making it hard to navigate to the interesting bits.

Approach
A Web video player component with a navigable heat map that:
● Uses multimedia content analysis to seed the heat map.
● Captures implicit and explicit user feedback at the timecode level to refine the heat map.
[Figure: system overview. Viewers interact with the LikeLines player (play, pause, seek, like, ...), which reports interaction sessions to a back-end server; the server also holds multimedia content analysis results.]

System Overview
● Video player component, augmented with:
● Navigable heat map that allows users to jump directly to "hot" areas;
● Time-sensitive "like" button that allows users to explicitly like particular points in the video.
● Captures user interactions:
● Implicit feedback such as playing, pausing and seeking;
● Explicit "likes" expressed by the user.
● Combines content analysis and captured user interactions to compute a video's heat map.
[Figure: a video's heat map over time t (s) is the sum of the content analysis curve and the user feedback curves 1 ... n.]
● Back-end interaction session server stores and aggregates per video:
● All interaction sessions between each user and player;
● Initial multimedia content analysis of the video.

Implementation
● Video player component implemented in JavaScript and HTML5.
● Out-of-the-box support for YouTube and HTML5 videos:

<script type="text/javascript">
var player = new LikeLines.Player('playerDiv', {
  video: 'http://www.youtube.com/watch?v=wPTilA0XxYE',
  backend: 'http://backend:9090/'
});
</script>

● Video player component communicates with a back-end server using JSON(P).
● Back-end server reference implementation is written in Python.

Source code: https://github.com/delftmir/likelines-player
Contact: R.Vliegendhart@tudelft.nl @ShinNoNoir
Future Work
● For what kinds of video is timecode-level feedback useful?
● How should user interactions be interpreted?
● How to fuse timecode-level feedback with content analysis without encouraging snowball effects?
● Can timecode-level data be linked to queries to recommend relevant jump points?
● How to collect a critical mass of timecode-level data by incentivizing users to interact with the system?
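A minimal sketch of the heat-map computation shown in the system overview (content analysis curve plus user feedback), with binning and weights of our own choosing rather than the reference implementation's:

import numpy as np

def heat_map(duration, content_scores, sessions, likes, n_bins=100):
    """Aggregate evidence into a per-bin heat map for one video.

    content_scores: array of length n_bins seeded by content analysis.
    sessions: list of (start, end) played intervals in seconds.
    likes: list of time points (seconds) where users pressed "like".
    """
    heat = np.asarray(content_scores, dtype=float).copy()
    bin_size = duration / n_bins
    for start, end in sessions:           # implicit feedback: played regions
        lo = min(int(start / bin_size), n_bins - 1)
        hi = min(int(end / bin_size), n_bins - 1)
        heat[lo:hi + 1] += 1.0
    for t in likes:                       # explicit feedback: weighted higher
        heat[min(int(t / bin_size), n_bins - 1)] += 5.0
    peak = heat.max()
    return heat / peak if peak > 0 else heat  # normalize for display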
20th ACM international conference on Multimedia, Nara, Japan, 2012
One of These Things is Not Like the Other: Crowdsourcing Semantic Similarity of Multimedia Files Raynor Vliegendhart*, Martha Larson*, and Johan Pouwelse** Multimedia Information Retrieval Lab* Delft University of Technology
Parallel and Distributed Systems Group** Delft University of Technology
Problem
● Problem: What constitutes a near duplicate? For example: Are these two files the same? Why (not)?
[Example: two YouTube videos, "Chrono Cross - 'Dream of the Shore Near Another World' Violin/Piano Cover" (IQYNEj51EUI) and "Chrono Cross Dream of the Shore Near Another World Violin and Piano" (Iuh3YrJtK3M). Yes: it's the same song. No: these are different performances by different performers.]
● Definition: Functional near-duplicate multimedia items are items that fulfill the same purpose for the user. Once the user has one of these items, there is no additional need for another.
● Task: Discovering new notions of user-perceived similarity between multimedia files in a file-sharing setting.
● Motivation: Clustering items in search results. [Screenshots from Tribler 5.4 (tribler.org), e.g. "Harry Potter and the Sorcerers Stone Audio Book" (478 MB), "Harry Potter and the Sorcerer s Stone (2001)(ENG GER NL) 2Lions-" (4.36 GB), and "Harry Potter.And.The.Sorcerer.Stone.DVDR.NTSC.SKJACK.Universal.S" (4.46 GB).]

HIT Design
Amazon Mechanical Turk (AMT) is a crowdsourcing platform to which Human Intelligence Tasks (HITs) can be submitted. Phrasing in our HIT is important in order to elicit serious judgments:
● "Imagine that you download the three items in the list and that you view them."
● Don't force workers to make a contrast, and
● Explain the definition of functional similarity.
Answer options:
o The items are comparable. They are for all practical purposes the same. Someone would never really need all three of these.
o Each item can be considered unique. I can imagine that someone might really want to download all three of these items.
o One item is not like the other two. (Please mark that item in the list.) The other two items are comparable.
Approach
● Idea: Point the odd one out, inspired by Sesame Street's "one of these things is not like the other".
● Crowdsourcing Task:
● 3 multimedia files displayed as search results;
● Worker points the odd one out and justifies why.
● Challenge: Eliciting serious judgments.

Experiments
● Dataset:
● Popular file-sharing site: The Pirate Bay (thepiratebay.se).
● 75 queries derived from the Top 100 list.
● 32,773 filenames and metadata.
● 1,000 random triads sampled from search results (see the sketch below).
● Crowdsourcing Experiment:
● Recruitment HIT and Main HIT run concurrently on AMT.
● 8 out of 14 qualified workers produced free-text judgments for 308 triads within 36 hours.
● Card Sort:
● Group similar judgments into piles, merge piles iteratively, and finally label each pile.
● End result: 44 user-perceived dimensions of similarity discovered.
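A minimal sketch of the triad sampling step referenced above; we assume each triad is drawn from a single query's result list so that the three items plausibly co-occur (names and this constraint are our own reading of the setup):

import random

def sample_triads(results_per_query, n_triads=1000, seed=42):
    """results_per_query: {query: [filenames]}. Returns (query, triad) pairs,
    each triad being 3 distinct filenames from that query's results."""
    rng = random.Random(seed)
    queries = [q for q, files in results_per_query.items() if len(files) >= 3]
    triads = []
    for _ in range(n_triads):
        q = rng.choice(queries)
        triads.append((q, tuple(rng.sample(results_per_query[q], 3))))
    return triads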
Contact: R.Vliegendhart@tudelft.nl @ShinNoNoir
Conclusion ● Wealth of user-perceived dimensions of similarity discovered. ● Quick results due to interesting crowdsourcing task.
ICT.OPEN 2012, Rotterdam, The Netherlands, 2012
Mining Emotions in Short Films: User Comments or Crowdsourcing?
Claudia Orellana-Rodriguez (orellana@L3S.de), Ernesto Diaz-Aviles (diaz@L3S.de), Wolfgang Nejdl (nejdl@L3S.de)
Motivation
Emotions are everywhere. Many applications and diverse disciplines can benefit from mining emotions.

Task
Extract emotions in short films. Exploit film criticism expressed through YouTube comments.
Emotion lexicon
Human-provided word-emotion association ratings, annotated according to Plutchik's psychoevolutionary theory (NRC Emotion Lexicon - EmoLex) [1]

Emotion detection approach [2]
1. Create a profile for each short film
2. Extract the terms from the profile
3. Associate to each term an emotion and polarity
4. Compute the emotion and polarity vector
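A minimal sketch of steps 2–4 above, assuming EmoLex is available as a mapping from a term to its associated emotions and polarities (the three entries below are placeholders, not the real lexicon):

from collections import Counter

# Hypothetical EmoLex slice: term -> list of associated emotions/polarities.
EMOLEX = {
    "happy": ["joy", "positive"],
    "scary": ["fear", "negative"],
    "love":  ["joy", "positive"],
}

def emotion_vector(profile_terms):
    """Aggregate EmoLex associations of a film profile's terms into a
    normalized emotion-and-polarity vector."""
    counts = Counter()
    for term in profile_terms:
        counts.update(EMOLEX.get(term, []))
    total = sum(counts.values()) or 1
    return {emo: c / total for emo, c in counts.items()}

print(emotion_vector(["happy", "love", "scary"]))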
[Figure: Plutchik's Wheel of Emotions.]
[Figure: pipeline. Comments c1, c2, ..., cn on a short film (entries from Tropfest and Your Film Festival) are aggregated into a short film profile; its nouns and adjectives are looked up in EmoLex to produce an emotion and polarity vector; parallel emotion and polarity vectors are elicited from Amazon Mechanical Turk workers (developed on the AMT Sandbox).]
[Chart: cosine similarity between the emotion vectors built from expert judgments (moviegoers) and the ones built (i) through crowdsourcing using AMT workers and (ii) automatically from YouTube comments; y-axis from 0.50 to 0.80.]
[1] S. M. Mohammad and P. D. Turney. Crowdsourcing a word-emotion association lexicon. Computational Intelligence, 2011.
[2] E. Diaz-Aviles, C. Orellana-Rodriguez, and W. Nejdl. Taking the Pulse of Political Emotions in Latin America Based on Social Web Streams. In LA-WEB, 2012.
Claudia Orellana-Rodriguez L3S Research Center e-mail: orellana@L3S.de
Crowdsourcing Social Fashion Images: a Pilot Study
María Menéndez*, Babak Loni†, Martha Larson†, Claudio Massari‡, Davide Martinenghi¥, Raynor Vliegendhart†, Luca Galli¥, Mark Melenhorst†, Marco Tagliasacchi¥ and Piero Fraternali¥
*Department of Computer and Information Science, University of Trento, Italy; †Multimedia Information Retrieval Lab, Delft University of Technology, Netherlands; ‡Innovation Engineering, Italy; ¥Dipartimento di Elettronica e Informazione, Politecnico di Milano, Italy

Motivation
As part of the requirements from industrial and technical partners, a collection of fashion items needs to be gathered for further use in the vertical domains. This data can be used for several tasks related to social fashion such as: - Multimedia Content Analysis tasks including: recognizing different types of fashion items and predicting user appeal of fashion images -“Multimedia crowdsourcing” tasks: Algorithms for combining many noisy human decisions into a single decision
Semantic Relation: Does the tag designate a fashion image?
As defined in Wikipedia, zari is "an even thread traditionally made of fine gold or silver used in traditional Indian, Pakistani, and Persian garments". [Figure: some Flickr pictures tagged with "zari" show garments; however, other Flickr pictures tagged with the same term do not.]

CUbRIK
A research project financed by the European Union:
• Advance the architecture of multimedia search
• Exploit the human contribution in multimedia search
• Use open-source components provided by the community
• Start up a search business ecosystem
This dataset is used by "Open Innovation: Fashion Industry", one of the frameworks of the CUbRIK project.
Image Retrieval
Using Wikipedia's index for fashion topics, 470 topics were selected (only topics related to categories containing the text "fashion" or "cloth"). Using these 470 topics, 323,507 images were collected from Flickr, selected simply by matching the Wikipedia topic to the image tags given by Flickr's users. Each picture carries one matching tag.
In order to clean the dataset and to collect more information about these images from users, we use a crowdsourcing approach.
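A minimal sketch of this collection step, assuming the flickrapi package and placeholder credentials; pagination, retries, and tag normalization are omitted:

import flickrapi

flickr = flickrapi.FlickrAPI('API_KEY', 'API_SECRET', format='parsed-json')

def images_for_topic(tag, per_page=100):
    """Fetch Flickr images whose user-given tags match a Wikipedia topic."""
    resp = flickr.photos.search(tags=tag, per_page=per_page)
    return resp['photos']['photo']

# e.g., one of the 470 fashion topics:
photos = images_for_topic('zari')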
Refining the Dataset with Crowdsourcing
Considering the retrieved images, a further refinement of the data is needed to filter out non-fashion images. Crowdsourcing is used to achieve this goal.

Crowdsourcing
Crowdsourcing is the combined effort of a large number of human contributors. Narrowly defined, it is micro-outsourcing of small tasks on a crowdsourcing platform such as Amazon's Mechanical Turk (AMT):
• Micro-tasks called Human Intelligence Tasks (HITs)
• HITs are carried out by MTurk workers (turkers)
• Typically used for tasks that lend themselves well to piecemeal work (multiple people make small contributions)
• Requesters can assign qualifications to turkers

We use crowdsourcing on this dataset to:
• Filter out non-fashion-related images
• Investigate whether existing tags match what appears in the picture
• Obtain information on what is liked/disliked in the picture
• Categorize images according to different variables
• Investigate whether picture-composition features related to classic and expressive aesthetics (e.g., clean, sophisticated, clear, aesthetic) influence turkers' choices.
Crowdsourcing Approach and Results
To create a prototype of our crowdsourcing task, we further filtered the fashion categories and chose the 14 most common categories (i.e., dress, shirt, sleeveless shirt, suit, trousers, jeans, skirt, coat, necktie, leather glove, handbag, and jewelry).
The list of common fashion categories was created by five people with no special knowledge of fashion. The top 6 images, according to Flickr's relevance ranking, were selected for each category (72 images in total). For the pilot crowdsourcing task, each HIT was performed by 3 different turkers. Turkers contributed by tagging, rating, describing, and categorizing different aspects of the pictures, such as the kind of model, the number of people in the image, and the photographer's expertise.
contact: m.a.larson@tudelft.nl
Results
- In total, 50 turkers performed 210 HITs (5 turkers and their 6 HITs were discarded because of meaningless answers)
- 10 turkers performed some 70% of the total number of HITs; the maximum number of HITs per turker was 45 and the minimum 1
- On average, turkers spent 4 min 42 s per HIT (SD = 2 min 58 s)
- Some 72% of the turkers were female and some 28% were male
- The mean age was 32.21 years (SD = 7.564)
- Most of the turkers came from India (41%), followed by the USA (40%)
Learning to Rank for Joy
Claudia Orellana-Rodriguez (orellana@L3S.de), Ernesto Diaz-Aviles (e.diaz-aviles@ie.ibm.com), Ismail Sengor Altingovde (altingovde@ceng.metu.edu.tr), Wolfgang Nejdl (nejdl@L3S.de)
Motivation
Emotions are everywhere. Users' information needs are complex and depend on their context (e.g., time of day, mood, location).

Task
Leverage social feedback for affective video ranking.
Emotion lexicon
Human-provided word-emotion association ratings, annotated according to Plutchik's psychoevolutionary theory (NRC Emotion Lexicon - EmoLex) [1]

Emotion detection approach [2]
1. Create a profile for each video
2. Extract the terms from the profile
3. Associate to each term an emotion and polarity
4. Compute the emotion and polarity vector
[Figure: pipeline. Video comments c1, c2, ..., cn form a video profile; its nouns and adjectives are looked up in EmoLex to produce an emotion and polarity vector.]
Features
• Basic: created by the uploader of the video without any other user interaction (e.g., video title, tags, description) [3]
• Social: product of the interaction between YouTube users and the video (e.g., likes, dislikes, views, comments, favorites) [3]
• Sentic: emotions and polarity extracted from the video comments [4]

Eliciting judgements
• Relevance judgements: ask users to indicate how relevant each video is with respect to a given query [3]
• Affective judgements: ask users to annotate all the videos according to the emotions they experience while watching them
After eliciting judgements, we label with 1 the videos which are both relevant and associated with the emotion Joy (our affective context of interest), and with 0 otherwise. With these labels and using RankSVM, we learn three different ranking functions:
• Basic: using only basic video features
• Social and Sentic: trained using social and sentic features
• All: using all the features
[Chart: ranking performance (P@10, MAP, NDCG@5, NDCG@10, MeanNDCG) of the basic, social and sentic, and all-features rankers; y-axis from 0.00 to 0.55.]
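The poster does not publish its RankSVM setup; a minimal scikit-learn approximation via the standard pairwise transform (features, labels, and query ids below are placeholders):

import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, y, qid):
    """Turn a ranking problem into binary classification on
    within-query feature differences (the RankSVM reduction)."""
    Xp, yp = [], []
    for q in np.unique(qid):
        idx = np.where(qid == q)[0]
        for i in idx:
            for j in idx:
                if y[i] > y[j]:            # i should rank above j
                    Xp.append(X[i] - X[j]); yp.append(1)
                    Xp.append(X[j] - X[i]); yp.append(-1)
    return np.array(Xp), np.array(yp)

# X: basic/social/sentic features; y: 1 = relevant and joyful, 0 = otherwise.
rng = np.random.default_rng(0)
X = rng.random((40, 10))
y = rng.integers(0, 2, 40)
qid = np.repeat(np.arange(4), 10)          # 4 hypothetical queries
Xp, yp = pairwise_transform(X, y, qid)
ranker = LinearSVC().fit(Xp, yp)
scores = X @ ranker.coef_.ravel()          # higher score = ranked higher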
[1] S. M. Mohammad and P. D. Turney. Crowdsourcing a word-emotion association lexicon. Computational Intelligence, 2011.
[2] E. Diaz-Aviles, C. Orellana-Rodriguez, and W. Nejdl. Taking the Pulse of Political Emotions in Latin America Based on Social Web Streams. In LA-WEB, 2012.
[3] S. Chelaru, C. Orellana-Rodriguez, and I. S. Altingovde. How Useful is Social Feedback for Learning to Rank YouTube Videos? WWWJ, 1-29, in press.
[4] E. Diaz-Aviles, C. Orellana-Rodriguez, and W. Nejdl. Mining Emotions in Short Films: User Comments or Crowdsourcing? In WWW Companion, 2013.
Claudia Orellana-Rodriguez L3S Research Center e-mail: orellana@L3S.de