Dependency length minimization hypothesis revisited
Himanshu Yadav1, Shubham Mittal2, Samar Husain2
1 University of Potsdam
2 Indian Institute of Technology Delhi
[Figure: dependency tree for the sentence "John gave an old book to Mary".]
Arrows show dependencies going from head to dependents. Each dependency arc is labeled with dependency length: the distance between a head and its dependent, measured in words.
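The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the head assignment in `heads` is an assumption made for the example (it is not read off the figure), and `dependency_lengths` is a hypothetical helper name.

```python
# Hypothetical head assignment for the example sentence (an assumption,
# not taken from the original figure). heads[i] is the index of the head
# of word i; None marks the root.
words = ["John", "gave", "an", "old", "book", "to", "Mary"]
heads = [1, None, 4, 4, 1, 1, 5]


def dependency_lengths(heads):
    """Map each (head, dependent) pair to |head position - dependent position|."""
    return {
        (h, d): abs(h - d)
        for d, h in enumerate(heads)
        if h is not None
    }


lengths = dependency_lengths(heads)
for (h, d), dl in sorted(lengths.items()):
    print(f"{words[h]} -> {words[d]}: DL={dl}")
print("total dependency length:", sum(lengths.values()))
```

Under this assumed analysis, adjacent words have a dependency length of 1, and the longest dependency in the sentence is the one from "gave" to "to".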
Dependency length minimization hypothesis
• The distance between syntactically linked words is minimized in natural languages (Futrell et al., 2015; Liu et al., 2017).
• This 'distance' is typically measured by counting the number of words intervening in a dependency. See the figure above.
[Figure: Structure (a) and Structure (b), each over the words Xd Xi Xj Xk Xh. The dependency Xh → Xd has DL=4 in both.]
Motivation
• The current formulation of dependency length assumes that the dependency Xh → Xd is equally complex in structures (a) and (b).
  ▸ This is because the dependency length of Xh → Xd is 4 in both (a) and (b).
• But this formulation ignores the syntactic nature of the material that intervenes in a dependency.
  ▸ For example, structure (b) has more embeddings than (a).
[Figure: Structure (a), with DL=4 and IC=1, and Structure (b), with DL=4 and IC=3, each over the words Xd Xi Xj Xk Xh.]
Our proposal
• We propose a new measure of syntactic complexity: intervener complexity.
• We operationalize intervener complexity as the number of intervening heads.
  ▸ For example, 3 heads intervene in the Xh → Xd dependency in structure (b), while 1 head intervenes in (a). Thus, the intervener complexity (IC) of Xh → Xd is 1 in (a) and 3 in (b).
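Counting intervening heads can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a "head" is taken to be any word with at least one dependent, the head assignment in `heads` is a hypothetical analysis of an example sentence, and `intervener_complexity` is a hypothetical helper name.

```python
# Hypothetical head assignment (an assumption made for illustration).
# heads[i] is the index of the head of word i; None marks the root.
words = ["John", "gave", "an", "old", "book", "to", "Mary"]
heads = [1, None, 4, 4, 1, 1, 5]


def intervener_complexity(heads, h, d):
    """Count the words strictly between positions h and d that are heads.

    Assumption: a word counts as a head if it has at least one dependent.
    """
    head_positions = {x for x in heads if x is not None}
    lo, hi = sorted((h, d))
    return sum(1 for i in range(lo + 1, hi) if i in head_positions)


# The dependency gave -> to spans "an old book"; only "book" has dependents.
ic_gave_to = intervener_complexity(heads, 1, 5)
# The dependency gave -> book spans "an old"; neither word has dependents.
ic_gave_book = intervener_complexity(heads, 1, 4)
print("IC(gave -> to) =", ic_gave_to)
print("IC(gave -> book) =", ic_gave_book)
```

Two dependencies of similar length can therefore differ in intervener complexity, depending on how many of the intervening words carry dependents of their own.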
Hypotheses
• H1: Intervener complexity is minimized in natural languages independently of the constraint on dependency length.
• H2: Dependency length is minimized independently of the constraint on intervener complexity.
Sentence: Have they, however, actually weakened bank security?
[Figure: dependency trees, with arcs labeled by dependency length, for a DL-matched random linear arrangement (DL-matched RLA), "they weakened bank security? Have, however, actually", and an IC-matched random linear arrangement (IC-matched RLA), "security? bank weakened they, Have however, actually".]
Method
• To test H1, we compare the distribution of intervener complexity in real (language) trees with that in baseline trees matched with the real trees in dependency length and topological structure. We call this baseline DL-matched RLAs.
• To test H2, we compare the distribution of dependency length in real trees with that in baseline trees matched with the real trees in intervener complexity and topological structure. We call this baseline IC-matched RLAs.
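A matched baseline of this kind could be generated by rejection sampling, as in the sketch below. The poster does not describe the sampling procedure, so everything here is an assumption: matching is done on the sentence-level total of the measure, word orders are sampled uniformly at random, and the names `total_dl`, `total_ic`, and `matched_rla` are hypothetical. The topology is preserved because `heads` is never changed; only word positions are permuted.

```python
import random

# Hypothetical head assignment (an assumption made for illustration).
words = ["John", "gave", "an", "old", "book", "to", "Mary"]
heads = [1, None, 4, 4, 1, 1, 5]


def total_dl(heads, pos):
    """Sum of |pos[head] - pos[dependent]| over all dependencies."""
    return sum(abs(pos[h] - pos[d]) for d, h in enumerate(heads) if h is not None)


def total_ic(heads, pos):
    """Sum over dependencies of the number of heads lying strictly between."""
    head_positions = {pos[h] for h in heads if h is not None}
    total = 0
    for d, h in enumerate(heads):
        if h is None:
            continue
        lo, hi = sorted((pos[h], pos[d]))
        total += sum(1 for p in range(lo + 1, hi) if p in head_positions)
    return total


def matched_rla(heads, measure, rng, max_tries=100_000):
    """Sample a reordering whose `measure` equals that of the real order.

    pos[i] is the new position of word i. Returns None if no match is found.
    """
    identity = list(range(len(heads)))
    target = measure(heads, identity)
    for _ in range(max_tries):
        pos = identity[:]
        rng.shuffle(pos)
        if pos != identity and measure(heads, pos) == target:
            return pos
    return None


rng = random.Random(0)
dl_rla = matched_rla(heads, total_dl, rng)  # same total DL, IC free to vary
ic_rla = matched_rla(heads, total_ic, rng)  # same total IC, DL free to vary
for name, pos in (("DL-matched", dl_rla), ("IC-matched", ic_rla)):
    order = sorted(range(len(words)), key=lambda i: pos[i])
    print(name, [words[i] for i in order],
          "DL =", total_dl(heads, pos), "IC =", total_ic(heads, pos))
```

Comparing the free measure in the sampled arrangements with its value in the real order is then the comparison that H1 and H2 call for. Rejection sampling is chosen here only for simplicity; it becomes slow for long sentences.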
[Figure: intervener complexity (y-axis) against sentence length (x-axis) for real language and DL-matched RLAs.]
Results
• Intervener complexity grows significantly more slowly (w.r.t. sentence length) in real trees than in DL-matched RLAs.
• This suggests that intervener complexity minimization is an independent constraint on natural languages.
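One way to read "grows more slowly" is as a difference in slopes. The following is a sketch only: the poster does not state the statistical model, so the linear form, the coefficients \beta_0 to \beta_3, and the indicator R (1 for DL-matched RLAs, 0 for real trees) are all assumptions, with n standing for sentence length.

```latex
\mathrm{IC} = \beta_0 + \beta_1\, n + \beta_2\, R + \beta_3\, (n \times R) + \varepsilon
```

Under this reading, the slope is \beta_1 for real trees and \beta_1 + \beta_3 for DL-matched RLAs, so the result above corresponds to \beta_3 > 0.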
[Figure: dependency length (y-axis) against sentence length (x-axis) for real language and IC-matched RLAs.]
Results
• Dependency length does not grow more slowly (w.r.t. sentence length) in real trees than in IC-matched RLAs.
• This suggests that dependency length minimization arises as a consequence of the constraint on intervener complexity (number of intervening heads) and of topological properties of the trees, such as arity.
[Figure: two panels, "Dependency length" and "Intervener complexity", each plotted against sentence length.]
Intervener complexity shows less variability across languages and across sentence length compared to dependency length.
Implication
• Intervener complexity (operationalized as the number of intervening heads) captures syntactic complexity across languages better than dependency length does.
[Figure: structure over the words Xd Xi Xj Xk Xh.]
Why do intervening heads matter?
• Structural integrations and temporary storage of linguistic items consume the same pool of limited resources (Gibson, 1998; Just & Carpenter, 1992).
• Consider the dependency Xh → Xd. Comprehension requires that node Xd be actively maintained in memory until node Xh is encountered.
• Several integrations have to be performed in the region between Xd and Xh, i.e., dependencies such as Xk → Xj have to be resolved.
• The integrations in the intervening region and the maintenance of Xd draw from the same pool of limited resources.
• Resolving Xh → Xd becomes harder as the resource demand increases. The resource demand increases with the number of integrations in the intervening region, which is reflected in the number of intervening heads.