Center for Language Technology
An Analysis of Discourse Connectives in SMT
Jakob Elming, Center for Language Technology, University of Copenhagen
Anders Johannsen, Center for Language Technology, University of Copenhagen
Daniel Hardt, Department of IT Management, Copenhagen Business School
Workshop on Big Data and Language Technology, Copenhagen Business School, May 31, 2013
What is a discourse connective?
• Discourse connectives explicitly indicate a discourse relation between two sentences or clauses, e.g. CONTRAST:
  John paid $5, but Mary paid $10.
  John paid $5. But Mary paid $10.
• Most of these words also have non-discourse functions, e.g. All but one student passed the test.
• We base our work on the 100 discourse connectives described in the Penn Discourse Treebank (Prasad et al., 2008)
Motivation
The presence of a discourse connective (DC) in the source sentence seems to affect the quality of the translation
[Figure: BLEU for sentences with DC and sentences without DC, English to French and English to Danish]
Motivation
Additionally, the number of DCs in a sentence seems to be a relevant factor
[Figure: BLEU by number of DCs in sentence (0, 1, 2, 3, 4, >4), English to French and English to Danish]
Sentence lengths
Longer sentences will be more likely to contain a DC
[Figure: % of English sentences with DCs by sentence length interval in words (1-10 to 71-80)]
Is the effect we are seeing a result of longer sentences getting lower BLEU scores?
Sentence lengths
There does not seem to be any correlation between BLEU and sentence length
[Figure: BLEU against sentence length in words (0 to 80), R² = 0.00222]
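The R² reported on this slide is the coefficient of determination of a straight-line fit of BLEU against sentence length. A minimal sketch of how such a value could be computed is given here, assuming BLEU scores and sentence lengths are available as two equally long lists. The function name `r_squared` and the toy numbers are illustrative assumptions, not data or code from the study.

```python
from math import fsum


def r_squared(xs, ys):
    """Coefficient of determination for a simple linear fit of ys on xs.

    With a single predictor this equals the squared Pearson correlation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = fsum(xs) / len(xs)
    mean_y = fsum(ys) / len(ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # No variation in one variable: length explains none of the variance.
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Hypothetical sentence lengths and BLEU scores, for illustration only.
lengths = [5, 12, 20, 33, 41, 58]
bleu = [28.0, 31.5, 27.2, 30.1, 26.8, 29.9]
print(round(r_squared(lengths, bleu), 4))
```

A value close to 0, as on the slide, means that a linear trend in sentence length accounts for almost none of the variation in BLEU.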
The effect of DCs across sentence lengths
[Figure: two panels, English to French and English to Danish, showing BLEU for sentences without DC and with DC by sentence length interval in words (1-10 to 71-80)]
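The comparison on this slide requires splitting the test sentences by length interval and by whether they contain a DC before scoring each group. A minimal sketch of that grouping step follows, assuming each sentence is already reduced to a word count and a boolean flag from the DC identifier. The names `length_interval` and `group_by_length_and_dc` are hypothetical, and the BLEU scoring of each group is not shown.

```python
from collections import defaultdict


def length_interval(n_words, width=10, max_len=80):
    """Map a sentence length to an interval label such as '11-20'."""
    if n_words < 1 or n_words > max_len:
        return None  # outside the range plotted on the slide
    low = ((n_words - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def group_by_length_and_dc(sentences):
    """Group sentence indices by (length interval, has_dc).

    `sentences` is a list of (n_words, has_dc) pairs; has_dc is assumed
    to come from a separate DC identification step.
    """
    groups = defaultdict(list)
    for idx, (n_words, has_dc) in enumerate(sentences):
        interval = length_interval(n_words)
        if interval is not None:
            groups[(interval, bool(has_dc))].append(idx)
    return dict(groups)


# Hypothetical input: (sentence length in words, contains a DC).
example = [(8, False), (9, True), (15, True), (95, True)]
print(group_by_length_and_dc(example))
```

Each resulting group of sentence indices could then be scored separately with a corpus-level BLEU tool.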
System specifications
SMT system
• Phrase-based SMT system (Moses)
• Trained on ~50 million parallel words from Europarl v7
• Final 100K sentences held back for tuning and evaluation
• Average over 5 MERT optimizations
Discourse connective identification
• POS-based features
• Maximum entropy model
• Trained on WSJ sections 2-22
• POS features handle domain shift better than syntactic features
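The slides do not list the actual feature set of the DC identifier, so the following is only a sketch of what POS-based features for a candidate connective might look like. The function `pos_features`, the feature names, and the hand-assigned tags in the example are all assumptions made for illustration. The resulting feature dictionaries would be the input to a maximum entropy (multinomial logistic regression) classifier, which is not shown.

```python
def pos_features(tokens, pos_tags, start, end):
    """Build illustrative POS-based features for tokens[start:end].

    The span is a candidate connective such as 'but' or 'so that'.
    Feature names are hypothetical, not those used in the study.
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    if not 0 <= start < end <= len(tokens):
        raise ValueError("invalid candidate span")

    def tag_at(i):
        # Use a padding symbol beyond the sentence boundaries.
        return pos_tags[i] if 0 <= i < len(pos_tags) else "<PAD>"

    return {
        "connective": " ".join(tokens[start:end]).lower(),
        "self_pos": "+".join(pos_tags[start:end]),
        "prev_pos": tag_at(start - 1),
        "next_pos": tag_at(end),
        "sentence_initial": start == 0,
    }


# Example from the slides, with hand-assigned illustrative tags.
tokens = ["John", "paid", "$", "5", ",", "but", "Mary", "paid", "$", "10", "."]
tags = ["NNP", "VBD", "$", "CD", ",", "CC", "NNP", "VBD", "$", "CD", "."]
print(pos_features(tokens, tags, 5, 6))
```

Features of this kind depend only on a POS tagger, not on a full parse, which is consistent with the slide's remark about robustness to domain shift.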
Optional presence
• Cartoni et al. (2011) note that DCs are “almost always optional”
• This leaves the translator the option of translating:
  • DC to corresponding DC
  • DC to implicit discourse relation
  • Implicit discourse relation to DC
• This allows for variation in translation and may be a relevant factor in the lower BLEU score:
  • The higher variation makes it harder for the system to learn
  • In evaluation, the probability of choosing the same strategy as in the reference translation may be smaller
Document origin
• Cartoni et al. (2011) analyze the use of DCs in Europarl
• They find great variation depending on whether a document is in its original language or is a translation
• To test this, we created two English to French systems, original-to-translation and translation-to-original, and compared their decrease in BLEU in DC sentences
[Figure: Δ BLEU by sentence length interval in words (1-10 to 71-80), original to translation and translation to original]
Discourse dependent lexical choice
• Meyer et al. (2012) show that the discourse function of the DC can be essential to translating it correctly
• E.g. “The Champions League has become a source of income for clubs since it started in 1992.”
• Here we need to know that since marks a temporal relation between the clauses and should therefore be translated as depuis (temporal) instead of car (causal).
• Increasing the accuracy of lexical choice does not, however, result in a BLEU increase in their experiments
• This suggests that lexical selection is not a major factor in the BLEU drop we are seeing
Word order differences
Another possibility is that DCs are accompanied by word order differences in the language pairs
A and B although C
There is an area where progress has not been made on harmonisation, and that is forms of protection complementary to the Geneva convention, although it has been included as an item in the council's programme for several years.
A although C and B
Der er et område, hvor harmoniseringen ikke har gjort fremskridt, skønt dette punkt har været med i Rådets program i flere år, og det er spørgsmålet om andre former for beskyttelse end flygtningestatus i henhold til Geneve-konventionen.
Word order differences
Word alignments also indicate more reordering in sentences with DCs
[Figure: crosses per link for sentences with DC and without DC, English to Danish and English to French]
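The slides do not define "crosses per link" precisely. One plausible reading, assumed here, is the number of crossing pairs of alignment links divided by the number of links in the sentence pair. The sketch below implements that assumed definition; the function name `crosses_per_link` is hypothetical.

```python
from itertools import combinations


def crosses_per_link(links):
    """Crossing link pairs divided by number of links (assumed definition).

    `links` is an iterable of (source_index, target_index) word alignment
    pairs. Two links cross when their source order and target order
    disagree.
    """
    unique_links = sorted(set(links))
    if not unique_links:
        return 0.0
    crossings = sum(
        1
        for (s1, t1), (s2, t2) in combinations(unique_links, 2)
        if (s1 - s2) * (t1 - t2) < 0
    )
    return crossings / len(unique_links)


# A monotone alignment has no crossings; a fully reversed one has the most.
print(crosses_per_link([(0, 0), (1, 1), (2, 2)]))
print(crosses_per_link([(0, 2), (1, 1), (2, 0)]))
```

Under this reading, a clause that moves as a block, as in the "A and B although C" example, adds one crossing for every pair of links between the reordered spans.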
Focus on individual DCs
        English to French       English to Danish
Rank    DC            BLEU      DC            BLEU
1       then          23.37     although      21.54
2       so            24.11     for+example   22.42
3       however       24.57     however       23.24
4       although      24.61     in+fact       23.51
5       when          24.91     while         23.69
6       so+that       24.94     then          24.19
7       in+fact       25.23     since         24.25
8       if            25.27     so+that       25.05
9       but           25.99     before        25.11
10      because       26.27     when          25.28
11      and           26.46     as            25.34
12      indeed        26.46     also          25.65
13      for+example   26.71     so            25.66
14      also          26.75     finally       25.76
15      therefore     26.96     if            25.79
16      while         27.26     indeed        25.97
17      since         27.39     therefore     26.09
18      before        28.14     because       26.51
19      as            28.21     but           27.17
20      finally       31.60     and           27.21
Future directions
Analysis
• Reordering patterns related to DCs
• Individual DCs
• Corresponding implicit relations
• The effect of “free” translation choices
Experiments
• Can we close some of this BLEU gap?
• Can this help e.g. confidence estimation?
References
• B Cartoni, S Zufferey, T Meyer, and A Popescu-Belis. 2011. How comparable are parallel corpora? Measuring the distribution of general vocabulary and connectives. In Proceedings of BUCC.
• T Meyer and A Popescu-Belis. 2012. Using Sense-labeled Discourse Connectives for Statistical Machine Translation. In Proceedings of HyTra.
• T Meyer, A Popescu-Belis, N Hajlaoui, and A Gesmundo. 2012. Machine Translation of Labeled Discourse Connectives. In Proceedings of AMTA.
• R Prasad, N Dinesh, A Lee, E Miltsakaki, L Robaldo, A Joshi, and B Webber. 2008. The Penn Discourse TreeBank 2.0. In Proceedings of LREC.