Highlights

• A new total uncertainty measure in evidence theory is proposed.
• The new measure is directly defined in the evidential framework.
• The new measure is not a generalization of those in the probabilistic framework.
• Belief intervals and a distance metric are used in the new measure's design.
• The new measure avoids the drawbacks of traditional measures and has desired properties.
A new distance-based total uncertainty measure in the theory of belief functions

Yi Yang (a), Deqiang Han (b,*)

(a) SKLSVMS, School of Aerospace, Xi'an Jiaotong University, Xi'an, China 710049
(b) Institute of Integrated Automation, Xi'an Jiaotong University, Xi'an, China 710049
Abstract
The theory of belief functions is an important and effective tool for uncertainty modeling and reasoning, in which measures of uncertainty are crucial for evaluating the degree of uncertainty in a body of evidence. Several uncertainty measures in the theory of belief functions have been proposed. However, the existing measures are generalizations of measures in the probabilistic framework, and the inconsistency between the two frameworks causes limitations to these measures. To avoid these limitations, in this paper a new total uncertainty measure is proposed directly in the framework of belief functions theory, without changing the theoretical framework. The average distance between the belief interval of each singleton and the most uncertain case is used to represent the total uncertainty degree of the given body of evidence. Numerical examples, simulations, applications and related analyses are provided to verify the rationality of the new measure.
Keywords: belief functions, evidence theory, uncertainty measure, belief interval, distance of interval

* Corresponding author: Deqiang Han. Email addresses: jiafeiyy@mail.xjtu.edu.cn (Yi Yang), deqhan@gmail.com (Deqiang Han).
1. Introduction

Dempster-Shafer evidence theory (DST) [1], also called the theory of belief functions, has been widely used in many applications related to uncertainty modeling and reasoning, e.g., information fusion [2], pattern classification [3, 4], clustering analysis [5], fault diagnosis [6], and multiple attribute decision-making (MADM) [7, 8].

In the theory of belief functions, there are two types of uncertainty, the discord (or conflict, or randomness) [9] and the non-specificity [10], which together constitute the ambiguity [11]. Various measures for these two types of uncertainty, and for the total uncertainty covering both types, have been proposed. The measures of discord include the discord measure [9], the strife [9], and the confusion [12], etc.; the measures of non-specificity include Dubois and Prade's definition [10], generalized from the Hartley entropy [13] in classical set theory, Yager's definition [14], and Korner's definition [15]. The most representative total uncertainty measures are the aggregated measure (AU) [16] and the ambiguity measure (AM) [11].

In essence, both AU and AM are generalizations of Shannon entropy [17] in probability theory rather than definitions given directly in the framework of the theory of belief functions. That is, they pick a probability distribution according to some criteria or constraints established from the given body of evidence (BOE), and then use the Shannon entropy of that distribution to indirectly depict the degree of uncertainty of the BOE. As noted in the related references [11, 18], AU and AM have their own limitations; for example, they are insensitive to changes of the BBA. These limitations are to some extent related to the inconsistency [19, 20] between the framework of the theory of belief functions and that of probability theory. Therefore, to avoid the limitations of the traditional definitions, in this work a new total uncertainty measure is proposed directly in the framework of the theory of belief functions, without switching between different frameworks. We analyze the belief interval and conclude that belief intervals carry both the randomness part and the imprecision part (non-specificity) of the uncertainty incorporated in a BOE; thus it is feasible to define a total uncertainty measure for a BOE from them. Given a BOE, the distance between the belief interval of each singleton and the most uncertain interval [0, 1] is used to construct the degree of total uncertainty: the larger the average distance, the smaller the degree of uncertainty. Since there is no switch of theoretical frameworks, the new definition has desired properties including the ideal value range and monotonicity. Furthermore, the new measure provides more rational results than the traditional ones, which is supported by experimental results and related analyses.

The rest of this paper is organized as follows. Section 2 provides a brief introduction to the theory of belief functions. Commonly used uncertainty measures in the theory of belief functions are introduced in Section 3, where some drawbacks of the available total uncertainty measures, including AM and AU, are also pointed out. In Section 4, the new total uncertainty measure is proposed, together with its desired properties and related analyses. Experiments and simulations are provided in Section 5 to show the rationality of the proposed measure. An application of the new total uncertainty measure to feature evaluation is provided in Section 6. Section 7 concludes this work.
2. Basics of the theory of belief functions

The basic concept in the theory of belief functions [1] is the frame of discernment (FOD), whose elements are mutually exclusive and exhaustive and represent what we are concerned with. m : 2^Θ → [0, 1] is called a basic belief assignment (BBA) defined over an FOD Θ if it satisfies
$$\sum_{A\subseteq\Theta} m(A) = 1,\qquad m(\emptyset) = 0 \qquad(1)$$
When m(A) > 0, A is called a focal element. A BBA is also called a mass function. The set of all the focal elements, denoted by F, together with their corresponding mass assignments, constitutes a body of evidence (BOE) (F, m). The belief function (Bel) and the plausibility function (Pl) are defined as
$$Bel(A) = \sum_{B\subseteq A} m(B) \qquad(2)$$
$$Pl(A) = \sum_{A\cap B\neq\emptyset} m(B) \qquad(3)$$
The belief function Bel(A) represents the justified specific support for the focal element (or proposition) A, while the plausibility function Pl(A) represents the potential specific support for A. The length of the belief interval [Bel(A), Pl(A)] is used to represent the degree of imprecision for A.
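To make these definitions concrete, the following minimal Python sketch (our own illustration, not part of the original paper) represents a BBA as a dictionary mapping frozensets of singletons to masses and evaluates Bel and Pl for any subset.

```python
def bel(m, A):
    """Belief of A: total mass of the focal elements contained in A (Eq. (2))."""
    A = frozenset(A)
    return sum(v for B, v in m.items() if B <= A)

def pl(m, A):
    """Plausibility of A: total mass of the focal elements intersecting A (Eq. (3))."""
    A = frozenset(A)
    return sum(v for B, v in m.items() if B & A)

# The BBA used later in the illustrative example of Section 4.2
m = {frozenset({'t1'}): 0.3, frozenset({'t2', 't3'}): 0.5, frozenset({'t1', 't2', 't3'}): 0.2}
print(bel(m, {'t1'}), pl(m, {'t1'}))   # 0.3 0.5
```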
Two independent BBAs m1(·) and m2(·) can be combined using Dempster's rule of combination as [1]
$$m(A)=\begin{cases}0, & A=\emptyset\\[4pt] \dfrac{\sum_{A_i\cap B_j=A} m_1(A_i)\,m_2(B_j)}{1-\sum_{A_i\cap B_j=\emptyset} m_1(A_i)\,m_2(B_j)}, & A\neq\emptyset\end{cases}\qquad(4)$$
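A compact sketch of Eq. (4) in the same dictionary representation (again our own illustration; it assumes the two BBAs are not totally conflicting):

```python
from collections import defaultdict

def dempster_combine(m1, m2):
    """Combine two BBAs with Dempster's rule of combination (Eq. (4))."""
    joint = defaultdict(float)
    conflict = 0.0
    for A, v1 in m1.items():
        for B, v2 in m2.items():
            inter = A & B
            if inter:
                joint[inter] += v1 * v2
            else:
                conflict += v1 * v2          # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting BBAs cannot be combined")
    return {A: v / (1.0 - conflict) for A, v in joint.items()}
```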
There are also other alternative combination rules; see [21] for details. The theory of belief functions has been criticized regarding its validity [19, 20, 22, 23, 24, 25]: it is not a successful generalization of probability theory, i.e., there exists an inconsistency [20] between the framework of the theory of belief functions and that of probability theory.
3. Uncertainty measures in the theory of belief functions

In the theory of belief functions, there are two types of uncertainty, the discord (or conflict, or randomness) and the non-specificity, which together constitute the ambiguity [11].

3.1. Measures of discord in belief functions

Measures of discord describe the randomness (or discord, or conflict) in a BOE [11]. The available definitions are listed below. Although they carry various names, they all address the discord part of the uncertainty in a BOE.

1) Confusion measure [12]
$$Conf(m) = -\sum_{A\subseteq\Theta} m(A)\log_2 Bel(A) \qquad(5)$$

2) Dissonance measure [14]
$$Diss(m) = -\sum_{A\subseteq\Theta} m(A)\log_2 Pl(A) \qquad(6)$$

3) Discord measure [26]
$$Disc(m) = -\sum_{A\subseteq\Theta} m(A)\log_2\left[1-\sum_{B\subseteq\Theta} m(B)\frac{|B-A|}{|B|}\right] \qquad(7)$$

4) Strife measure [9]
$$Stri(m) = -\sum_{A\subseteq\Theta} m(A)\log_2\left[1-\sum_{B\subseteq\Theta} m(B)\frac{|A-B|}{|A|}\right] \qquad(8)$$

Note that all these definitions can be considered Shannon entropy-like measures. More detailed information on these measures can be found in [9].
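As an illustration of how these entropy-like measures are evaluated in practice, the following sketch (our own, using the dictionary BBA representation introduced above) computes the dissonance measure Diss of Eq. (6), which is the discord measure reported in the experiments of Section 5.

```python
import math

def diss(m):
    """Dissonance measure Diss(m) = -sum_A m(A) * log2(Pl(A)) (Eq. (6))."""
    total = 0.0
    for A, v in m.items():
        pl_A = sum(w for B, w in m.items() if A & B)   # plausibility of the focal element A
        total -= v * math.log2(pl_A)                   # Pl(A) > 0 whenever m(A) > 0
    return total
```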
3.2. Measures of non-specificity in belief functions

Non-specificity [27, 15] means that two or more alternatives are left unspecified; it represents a degree of imprecision and only concerns the focal elements with cardinality larger than one. Compared with the probabilistic framework, non-specificity is a type of uncertainty peculiar to the framework of belief functions theory. Several non-specificity measures [10, 14, 15] have been proposed; the most commonly used definition is [10]:
$$NS(m) = \sum_{A\subseteq\Theta} m(A)\log_2 |A| \qquad(9)$$
It can be regarded as a generalized Hartley measure [13] from classical set theory. When the BBA m(·) is a Bayesian BBA, i.e., it only has singleton focal elements, NS reaches its minimum value 0. When m(·) is a vacuous BBA, i.e., m(Θ) = 1, NS reaches its maximum value log2(|Θ|). This definition was proved to be unique by Ramer [28] and satisfies all the requirements of a non-specificity measure.
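A one-line counterpart for the non-specificity of Eq. (9), in the same sketch style and dictionary representation as above:

```python
import math

def non_specificity(m):
    """NS(m) = sum_A m(A) * log2(|A|) (Eq. (9)); 0 for a Bayesian BBA, log2(|Theta|) for the vacuous BBA."""
    return sum(v * math.log2(len(A)) for A, v in m.items())
```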
3.3. Measures of total uncertainty in belief functions theory

1) Aggregated Uncertainty (AU) [16]
$$AU(m)=\max_{p}\Big\{-\sum_{\theta\in\Theta}p_{\theta}\log_{2}p_{\theta}\Big\}
\quad\text{s.t.}\quad
\begin{cases}
p_{\theta}\in[0,1],\ \forall\theta\in\Theta\\
\sum_{\theta\in\Theta}p_{\theta}=1\\
Bel(A)\le\sum_{\theta\in A}p_{\theta}\le 1-Bel(\bar{A}),\ \forall A\subseteq\Theta
\end{cases}\qquad(10)$$

In fact, for AU, given a BBA, the probability distribution with the maximum Shannon entropy under the constraints established from the BBA is selected, and its Shannon entropy is defined as the value of AU. It is therefore also called the "upper entropy" [29]. It is an aggregated total uncertainty (ATU) measure, which captures both non-specificity and discord. AU satisfies all the requirements [29] for an uncertainty measure, including probability consistency, set consistency, value range, monotonicity, and sub-additivity and additivity for the joint BBA in Cartesian space.

2) Ambiguity Measure (AM) [11]
$$AM(m) = -\sum_{\theta\in\Theta} BetP_m(\theta)\log_2 BetP_m(\theta) \qquad(11)$$
where $BetP_m(\theta) = \sum_{\theta\in B\subseteq\Theta} m(B)/|B|$ is the pignistic probability [30] of the BBA. In fact, AM uses the Shannon entropy of the pignistic probability of a given BBA to represent its uncertainty.
135
3.4. Drawbacks of available total uncertainty measures for belief functions
θ∈B⊆Θ
AC
CE
132
136
The traditional total uncertainty measures have their own drawbacks.
137
AM cannot satisfy the sub-additivity (for joint BBA in Cartesian space)
138
which has been pointed out by Klir [18]. Moreover, AM is criticized due to
139
its logical non-monotonicity [29] as shown in Example 1. 8
ACCEPTED MANUSCRIPT
141
3.4.1. Example 1

Suppose that the FOD is {θ1, θ2, θ3}. Two BBAs are given as follows:
m1({θ1, θ2}) = 1/3, m1({θ1, θ3}) = 1/2, m1({θ2, θ3}) = 1/6;
m2({θ1, θ2, θ3}) = 1/3, m2({θ1, θ3}) = 1/2, m2({θ2, θ3}) = 1/6.
Obviously, for the two BBAs, Bel2(A) ≤ Bel1(A) and Pl1(A) ≤ Pl2(A), ∀A ⊆ Θ. That is to say, every belief interval of m1 is contained in the corresponding belief interval of m2, which means that m2 has a higher level of uncertainty. Their AM and AU values are
1.5546 = AM(m1) > AM(m2) = 1.5100;  AU(m1) = AU(m2) = 1.5850 = log2 3.
We can see that the result of AM is counter-intuitive: AM violates monotonicity, i.e., it decreases the total quantity of uncertainty in an example where there is a clear increment of uncertainty from m1 to m2. Although AU does not yield a decrease of the quantity of uncertainty, it produces two equal values, which is also counter-intuitive, because m2 should have a higher level of uncertainty. We provide further analyses on this example. The belief and plausibility functions of m1 and m2 are

Bel1({θ1}) = 0.0000, Pl1({θ1}) = 0.8333;
Bel1({θ2}) = 0.0000, Pl1({θ2}) = 0.5000;
Bel1({θ3}) = 0.0000, Pl1({θ3}) = 0.6667;
Bel1({θ1, θ2}) = 0.3333, Pl1({θ1, θ2}) = 1.0000;
Bel1({θ1, θ3}) = 0.5000, Pl1({θ1, θ3}) = 1.0000;
Bel1({θ2, θ3}) = 0.1667, Pl1({θ2, θ3}) = 1.0000;
Bel1(Θ) = 1.0000, Pl1(Θ) = 1.0000;

Bel2({θ1}) = 0.0000, Pl2({θ1}) = 0.8333;
Bel2({θ2}) = 0.0000, Pl2({θ2}) = 0.5000;
Bel2({θ3}) = 0.0000, Pl2({θ3}) = 1.0000;
Bel2({θ1, θ2}) = 0.0000, Pl2({θ1, θ2}) = 1.0000;
Bel2({θ1, θ3}) = 0.5000, Pl2({θ1, θ3}) = 1.0000;
Bel2({θ2, θ3}) = 0.1667, Pl2({θ2, θ3}) = 1.0000;
Bel2(Θ) = 1.0000, Pl2(Θ) = 1.0000.
Since AU seeks a probability distribution maximizing the Shannon entropy, and the uniform p.m.f. (probability mass function) P(θ1) = P(θ2) = P(θ3) = 1/3, which has the maximum Shannon entropy, satisfies all the constraints of Eq. (10) established from Bel1, Pl1 as well as those established from Bel2, Pl2, both AU(m1) and AU(m2) reach the maximum value log2 3 = 1.5850.
3.4.2. Example 2

Suppose that the FOD is Θ = {θ1, θ2}. A BBA over Θ is m({θ1}) = a, m({θ2}) = b, m(Θ) = 1 − a − b, where a, b ∈ [0, 0.5]. Here we calculate the AU and AM values corresponding to different values of a and b. The changes of the AU and AM values with a and b are illustrated in Figure 1.

Figure 1: The change of total uncertainty measures in Example 2.

As shown in Figure 1, AM reaches its maximum whenever a = b, because when a = b (no matter whether their common value is large or small) the pignistic probability is uniformly distributed. This is counter-intuitive, because AM cannot discern the effect of the mass assigned to the total set on the uncertainty degree. For example, for m1({θ1}) = m1({θ2}) = 0.5 and m2({θ1}) = m2({θ2}) = 0.25, m2(Θ) = 0.5, we obtain AM(m1) = AM(m2). Obviously, it is irrational to say that m1 and m2 have the same degree of total uncertainty.

AU never changes with a and b; its value is always log2 2 = 1. The reason is analyzed as follows. The constraints for calculating AU in this example are

Bel({θ1}) = a ≤ P(θ1) ≤ 1 − b = Pl({θ1});
Bel({θ2}) = b ≤ P(θ2) ≤ 1 − a = Pl({θ2});
P(θ1) + P(θ2) = 1.

Since AU seeks a p.m.f. with the maximum Shannon entropy, and the uniform distribution P(θ1) = P(θ2) = 0.5 always satisfies these constraints (because a, b ∈ [0, 0.5]), the uniform distribution is picked no matter how a and b change, and AU always equals log2 2 = 1. Intuitively, however, the degree of uncertainty should change with a and b, so AU cannot properly describe the degree of total uncertainty here.
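The two specific BBAs mentioned above can be checked numerically with the am() helper sketched in Section 3.3 (our own verification, assuming that helper):

```python
m1 = {frozenset({'t1'}): 0.5, frozenset({'t2'}): 0.5}
m2 = {frozenset({'t1'}): 0.25, frozenset({'t2'}): 0.25, frozenset({'t1', 't2'}): 0.5}
print(am(m1), am(m2))   # both print 1.0, so AM cannot distinguish the two BBAs
```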
As we can see above, both AM and AU borrow the uncertainty measure (Shannon entropy) from the probability theory framework, and both have drawbacks. Moreover, as mentioned before, the theory of belief functions is not a successful generalization of probability theory; there exists an inconsistency between the two frameworks. We therefore prefer not to switch between frameworks, since this may cause problems in representing the uncertainty in belief functions. In our work, the total uncertainty measure is designed directly in the framework of belief functions theory, as introduced in the next section.
4. Belief interval distance-based total uncertainty measure

4.1. Belief interval and uncertainty

The belief interval [1] [Bel(A), Pl(A)] of a given focal element A in a BBA m can represent its uncertainty degree, as analyzed below.

Case I: The most uncertain case is Bel(A) = 0 and Pl(A) = 1, i.e., the belief interval is [0, 1].

Case II: The clearest case is Bel(A) = Pl(A) = 1 (A is sure to occur) or Bel(A) = Pl(A) = 0 (A is sure never to occur).

Case III: When Bel(A) = a, Pl(A) = b, a, b ∈ (0, 1), the belief interval is [a, b]. For A, the degree of imprecision can be represented by b − a, and the probability of whether A occurs or not (the randomness) is larger than a and smaller than b.

Case IV: When Bel(A) = Pl(A) = a, a ∈ (0, 1), the belief interval is [a, a]. It means that A has no imprecision, yet whether A occurs or not cannot be clearly determined (there remains randomness, i.e., the probability a).

Therefore, the information related to the uncertainty carried by a belief interval includes both the randomness part and the imprecision part (non-specificity). Thus, given a BBA, we can fully utilize the information of the belief intervals to measure the total uncertainty of the BBA. Then, how should the information of belief intervals be used?

Given the belief interval of A, i.e., [Bel(A), Pl(A)], if the belief interval is farther from the most uncertain case [0, 1], then A has smaller uncertainty; if the belief interval is nearer to the most uncertain case [0, 1], then A has larger uncertainty. Then, how should the distance between belief intervals be described? Here we use the distance between interval numbers [31, 32]. Given two interval numbers [a1, b1] and [a2, b2], a strict distance is defined as
$$d^{I}([a_1,b_1],[a_2,b_2])=\sqrt{\left(\frac{a_1+b_1}{2}-\frac{a_2+b_2}{2}\right)^{2}+\frac{1}{3}\left(\frac{b_1-a_1}{2}-\frac{b_2-a_2}{2}\right)^{2}}\qquad(12)$$
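A direct sketch of Eq. (12) (our own illustration):

```python
import math

def interval_distance(a1, b1, a2, b2):
    """Strict distance d^I between the interval numbers [a1, b1] and [a2, b2] (Eq. (12))."""
    mid_diff = (a1 + b1) / 2 - (a2 + b2) / 2         # difference of the interval midpoints
    half_width_diff = (b1 - a1) / 2 - (b2 - a2) / 2  # difference of the interval half-widths
    return math.sqrt(mid_diff ** 2 + half_width_diff ** 2 / 3)

print(round(interval_distance(0.3, 0.5, 0.0, 1.0), 4))   # 0.2517, as in the worked example of Section 4.2
```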
The belief interval [Bel(A), Pl(A)] can be considered an interval number. If we replace [a1, b1] by [Bel(A), Pl(A)] and [a2, b2] by [0, 1], the resulting distance represents how far A is from the most uncertain case. As shown in Eq. (12), in d^I the squared difference of the interval midpoints and the squared difference of the interval half-widths are combined, with the weight 1/3 on the latter. Generally speaking, an interval number is a kind of fuzzy data [31]. When defining the distance between two fuzzy data, the value of this weight depends on the shape [31] of the fuzzy data model (which could be symmetric triangular, normal, parabolic, square-root fuzzy data, or an interval number). Here we use belief intervals (interval numbers), so it is a rational and natural assumption that the interval carries a uniform distribution. Under this assumption the weight should be 1/3, and d^I corresponds to Mallows' distance between two distributions, hence a strict distance metric for interval numbers. Using this strict distance metric for interval numbers, we can define a total uncertainty measure for a BOE as follows.
4.2. New total uncertainty measure based on belief intervals

Suppose that m is a BBA over the FOD Θ = {θ1, ..., θn}. First, calculate the belief interval [Bel({θi}), Pl({θi})], i = 1, ..., n, for each singleton θi. Then, calculate the distance between each [Bel({θi}), Pl({θi})] and [0, 1], which represents the degree of departure from the most uncertain case for each singleton. Since a larger distance (a larger departure from the most uncertain case) means smaller total uncertainty, one minus the normalized average of the belief interval distances over all singletons is used as the total uncertainty measure of the BBA m, as shown in Eq. (13):
$$TU^{I}(m)=1-\frac{\sqrt{3}}{n}\sum_{i=1}^{n} d^{I}\big([Bel(\{\theta_i\}),Pl(\{\theta_i\})],\,[0,1]\big)\qquad(13)$$
where √3 is the normalization factor.¹

The newly proposed TU^I does not exhibit the drawbacks of AU and AM pointed out in Examples 1 and 2; see Examples 3 and 4 in Section 5 for details.

¹ The distance d^I([Bel({θi}), Pl({θi})], [0, 1]) reaches its maximum value when the belief interval is [0, 0] or [1, 1]. Therefore, the normalization factor is 1/d^I([0, 0], [0, 1]) = 1/d^I([1, 1], [0, 1]) = √3.
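Putting the pieces together, a minimal sketch of Eq. (13) (our own illustration, self-contained so the interval distance is repeated inside the function):

```python
import math

def tu_i(m, frame):
    """Belief interval-based total uncertainty TU^I(m) of Eq. (13).
    m: dict mapping frozensets (focal elements) to masses; frame: list of singletons."""
    frame = list(frame)

    def dist_to_most_uncertain(bel, pl):
        # d^I([bel, pl], [0, 1]) from Eq. (12)
        mid_diff = (bel + pl) / 2 - 0.5
        half_width_diff = (pl - bel) / 2 - 0.5
        return math.sqrt(mid_diff ** 2 + half_width_diff ** 2 / 3)

    total = 0.0
    for theta in frame:
        bel = m.get(frozenset({theta}), 0.0)            # Bel of a singleton is its own mass
        pl = sum(v for B, v in m.items() if theta in B)  # Pl of a singleton
        total += dist_to_most_uncertain(bel, pl)
    return 1 - math.sqrt(3) * total / len(frame)

m = {frozenset({'t1'}): 0.3, frozenset({'t2', 't3'}): 0.5, frozenset({'t1', 't2', 't3'}): 0.2}
print(round(tu_i(m, ['t1', 't2', 't3']), 4))   # 0.6547, matching the worked example below
```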
Here we provide an illustrative example to show how TU^I is calculated. Suppose that the FOD is Θ = {θ1, θ2, θ3} and a BBA over Θ is m({θ1}) = 0.3, m({θ2, θ3}) = 0.5, m(Θ) = 0.2.

First, calculate the belief and plausibility functions of the singletons:
Bel({θ1}) = m({θ1}) = 0.3, Pl({θ1}) = m({θ1}) + m(Θ) = 0.5,
Bel({θ2}) = m({θ2}) = 0.0, Pl({θ2}) = m({θ2, θ3}) + m(Θ) = 0.7,
Bel({θ3}) = m({θ3}) = 0.0, Pl({θ3}) = m({θ2, θ3}) + m(Θ) = 0.7.

Then, calculate the distance between each singleton's belief interval and the interval [0, 1]:
$$d^{I}([0.3,0.5],[0,1])=\sqrt{\left(\tfrac{0.3+0.5}{2}-\tfrac{0+1}{2}\right)^{2}+\tfrac{1}{3}\left(\tfrac{0.5-0.3}{2}-\tfrac{1-0}{2}\right)^{2}}=0.2517;$$
$$d^{I}([0,0.7],[0,1])=\sqrt{\left(\tfrac{0+0.7}{2}-\tfrac{0+1}{2}\right)^{2}+\tfrac{1}{3}\left(\tfrac{0.7-0}{2}-\tfrac{1-0}{2}\right)^{2}}=0.1732$$
for both θ2 and θ3. Then TU^I = 1 − (√3/3) · (0.2517 + 0.1732 + 0.1732) = 0.6547.
241
4.3.1. Range
242
The range of T U I is [0, 1], which is desired for practical use. For a vacuous
243
BBA m(Θ) = 1, all the belief intervals for singletons are [0, 1]. Therefore, by 15
ACCEPTED MANUSCRIPT
using Eq. (13), its degree of total uncertainty is 1, which is the maximum.
245
For a categorical BBA m(θi ) = 1, the belief interval for θi is [1, 1] and those
246
of other singletons are [0, 0]. Therefore, by using Eq. (13), its degree of total
247
uncertainty is 0, which is the minimum.
CR IP T
244
Proof of the range: For a vacuous BBA, when using Eq. (13), T U I = 1.
249
On the other hand, if T U I = 1, i.e., all the belief intervals for singletons are
250
[0, 1], the corresponding BBA must be a vacuous one. So the maximum value
251
is unique. If a BBA is not a vacuous one, then some singletons’ belief intervals
252
can be strictly contained by [0, 1]. Therefore, according to Eq. (13), the
253
corresponding T U I will be smaller than 1. Therefore, the unique maximum
254
value of T U I is 1.
AN US
248
For a categorical BBA m(θi ) = 1, when using Eq. (13), T U I = 0. On
256
the other hand, if T U I = 0, the only possible belief interval for singletons
257
are either [0, 0] or [1, 1]. For a BBA, one and only one belief interval for
258
singleton could be [1, 1]. So, the corresponding BBA should be m({θi }) =
259
1, m({θj }) = 0, ∀j 6= i, which represents the clearest case. If a BBA is not
260
m({θi }) = 1, m({θj }) = 0, ∀j 6= i, some singletons’ belief intervals are neither
PT
ED
M
255
[0, 0] nor [1, 1]. Therefore, according to Eq. (13), the corresponding T U I will
262
be larger than 0. So, the minimum value of T U I is 0.
263
End of Proof
4.3.2. Monotonicity
AC
264
CE
261
Given an uncertainty measure U N and two BBAs m1 and m2 , if ∀A ∈ P(Θ) : P l1 (A) 6 P l2 (A), Bel1 (A) > Bel2 (A) or ∀A ∈ P(Θ) : [Bel1 (A), P l1 (A)] ⊆ [Bel2 (A), P l2 (A)] 16
ACCEPTED MANUSCRIPT
there exists U N (m1 ) 6 U N (m2 ), then it is said that U N satisfies the mono-
266
tonicity [29]. The physical meaning of the monotonicity is that an uncertainty
267
measure in the theory of belief functions must not decrease the total quan-
268
tity of uncertainty in situations where there is a clear decrease in information
269
(increment of uncertainty).
CR IP T
265
Proof of the monotonicity: For two BBAs defined over the same FOD,
271
if it holds that ∀A ∈ P(Θ) : [Bel1 (A), P l1 (A)] ⊆ [Bel2 (A), P l2 (A)], then,
272
there exists ∀θi ∈ Θ : [Bel1 ({θi }), P l1 ({θi })] ⊆ [Bel2 ({θi }), P l2 ({θi })]. Ac-
273
cording to the property of the distance of interval numbers in Eq. (12), there
274
exists dI ([Bel1 ({θi }), P l1 ({θi })], [0, 1]) > dI ([Bel2 ({θi }), P l2 ({θi })], [0, 1]).
277
278
279
280
holds. End of Proof 4.4. On other properties
M
276
Then, according to Eq. (13), T U I (m1 ) 6 T U I (m2 ), i.e., the monotonicity
ED
275
AN US
270
It is declared in [29, 11] that AU and AM satisfy the probability consistency and set consistency.
PT
Probability consistency: When m is a Bayesian BBA (all focal ele-
CE
ments are singletons), AU reduces to Shannon entropy in probability theory: AU (m) = AM (m) = −
X
m({θ})log2 (m({θ}))
θ∈Θ
AC
Set consistency: When m focuses on a single set (i.e., m(A) = 1, ∀A ⊆
Θ), AU reduces to a Hartley measure in classical theory: AU (m) = AM (m) = log2 (|A|)
17
ACCEPTED MANUSCRIPT
It should be noted that it is meaningless to apply these two requirements
282
or properties to our new measure. First, our new T U I is designed directly
283
in the framework of belief functions theory, but not a generalization of the
284
measures in probability framework or those in classical set theory. Second,
285
the theory of belief functions is not a successful generalization of the prob-
286
ability theory. It cannot degenerate back to the probability framework. For
287
example, a Bayesian BBA is not a probability but still a special BBA, which
288
cannot satisfy the additivity for mutually exclusive events and cannot satisfy
289
the unity for the total set in Kolmogorov probability axioms [33]. Therefore,
290
the validity and strictness of such a consistency is doubtful.
AN US
CR IP T
281
Furthermore, AU also satisfies the subadditivity and additivity, which
292
defined for the joint BBA in Cartesian space. In our work, we do not con-
293
cern the joint BBA in Cartesian space, therefore, we will not analyze the
294
subadditivity and additivity in this paper. We will further verify our new total uncertainty measure by examples and
ED
295
M
291
simulations in the next section.
297
5. Experiments and simulations
298
5.1. Example 3
This Example 3 is a revisiting of Example 1. Using Eq. (13), we can
obtain that T U I (m1 ) = 0.6667 and T U I (m2 ) = 0.7778, i.e., m2 has a higher
AC
300
CE
299
PT
296
301
level of uncertainty, which is intuitive. It can be proved that T U I satisfy the
302
monotonicity. The proof can be found at Section 4.3.2.
18
ACCEPTED MANUSCRIPT
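These two values can be checked with the tu_i() sketch from Section 4.2 (our own verification, assuming that helper):

```python
m1 = {frozenset({'t1', 't2'}): 1/3, frozenset({'t1', 't3'}): 1/2, frozenset({'t2', 't3'}): 1/6}
m2 = {frozenset({'t1', 't2', 't3'}): 1/3, frozenset({'t1', 't3'}): 1/2, frozenset({'t2', 't3'}): 1/6}
print(round(tu_i(m1, ['t1', 't2', 't3']), 4))   # 0.6667
print(round(tu_i(m2, ['t1', 't2', 't3']), 4))   # 0.7778
```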
5.2. Example 4

Example 4 revisits Example 2. Our new total uncertainty measure TU^I, the non-specificity measure NS, and the dissonance measure Diss are also calculated. The results are shown in Figure 2.
Figure 2: The change of total uncertainty measures in Example 4.
AU never changes, as analyzed in Example 2, which is counter-intuitive. AM never changes as long as a = b holds, which is also counter-intuitive, as discussed in Example 2. As we can see in Figure 2, TU^I yields rational results. It reaches its maximum when a = b = 0, i.e., m(Θ) = 1. When a + b is fixed, the maximum is reached at a = b. For example, suppose that a + b = 0.3; the corresponding TU^I values are shown in Figure 3. This makes sense: since m(Θ) = 1 − (a + b) is fixed, the non-specificity part is fixed, and when a = b the conflict part reaches its maximum value.
Figure 3: TU^I for BBAs with fixed a + b (here a + b = 0.3).
Note that NS reaches 0 (its minimum) when a = b = 0.5 (m(Θ) = 0) and reaches 1 (its maximum) when a = b = 0 (m(Θ) = 1). Diss reaches 1 (its maximum) when a = b = 0.5, and reaches 0 (its minimum) when a = 0 or b = 0 (no conflict).

5.3. Example 5

Suppose that the FOD is Θ = {θ1, θ2, θ3} and the initial BBA over Θ is the vacuous BBA m(Θ) = 1. We change the BBA step by step: in each step, m(Θ) decreases by ∆ = 0.05 and each singleton mass m({θi}), i = 1, 2, 3, increases by ∆/3. Finally, m(Θ) reaches 0 and m({θi}) = 1/3. We calculate the total uncertainty measures TU^I, AM and AU, the non-specificity NS, and the dissonance Diss at each step. The changes of the uncertainty values are shown in Figure 4.
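A sketch of this sweep (our own illustration, assuming the tu_i(), non_specificity() and diss() helpers sketched in Sections 3 and 4):

```python
frame = ['t1', 't2', 't3']
theta = frozenset(frame)
for step in range(21):                               # m(Theta) goes from 1.0 down to 0.0
    m_theta = 1.0 - 0.05 * step
    m = {}
    if m_theta > 1e-12:                              # keep only focal elements with positive mass
        m[theta] = m_theta
    if m_theta < 1.0 - 1e-12:
        for t in frame:                              # each singleton receives (1 - m(Theta))/3
            m[frozenset({t})] = (1.0 - m_theta) / 3
    print(step, round(tu_i(m, frame), 4), round(non_specificity(m), 4), round(diss(m), 4))
```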
Figure 4: The change of total uncertainty measures in Example 5.
As we can see in Figure 4, with the change of the BBA at each step, the non-specificity decreases from its maximum to zero. This is intuitive, because mass is transferred from focal elements with larger cardinality to those with smaller cardinality. The measure of conflict, i.e., Diss here, becomes larger and reaches its maximum at the end. This is also intuitive, because the BBA changes from a vacuous BBA (no conflict) to a Bayesian BBA (with a uniform distribution over all singletons). The traditional total uncertainty measures AU and AM never change and always stay at the maximum value. This is counter-intuitive, because the total uncertainty degree of a vacuous BBA should be the largest and thus larger than that of a Bayesian BBA (even a Bayesian BBA with a uniform distribution over all singletons). The reason for the irrational behavior of AU and AM is analyzed as follows. Both are defined through a probabilistic transformation of the BBA. In this example, the probabilistic transformations used in AM and AU always yield the uniformly distributed probability mass function (p.m.f.) P(θi) = 1/3, i = 1, 2, 3, so AM and AU never change here. Our new measure TU^I provides rational results: it becomes smaller and smaller as the BBA changes from a vacuous one to a Bayesian one, as shown in Figure 4. Therefore, in our opinion, it is inappropriate to define a total uncertainty measure in the theory of belief functions by indirectly using the uncertainty measure of the probability framework, i.e., Shannon entropy. It is better not to switch frameworks but to design the measure directly in the framework of belief functions theory, which is also the motivation of our work.
5.4. Example 6

The FOD is Θ = {θ1, θ2, θ3} and the initial BBA is the vacuous BBA m(Θ) = 1. We change the BBA step by step: in each step, m(Θ) decreases by ∆ = 0.05 and the mass of one singleton, m({θ1}), increases by ∆ = 0.05. In the final step, m(Θ) = 0 and m({θ1}) = 1. We calculate the total uncertainty measures TU^I, AM and AU, the non-specificity measure NS, and the dissonance Diss of the BBA at each step. The changes of these uncertainty values are shown in Figure 5.
Figure 5: The change of total uncertainty measures in Example 6.
The BBA changes from a vacuous one to a categorical one. As shown in Figure 5, the non-specificity NS becomes smaller and smaller, which is intuitive. The conflict part Diss is always 0, which is also intuitive, because at most two focal elements are available in each step (namely {θ1} and Θ), so there is no conflict. AU, AM and TU^I all decrease, as expected. However, AU is not sensitive to the change of the BBA in the beginning stage (it is nearly unchanged), which has already been criticized in [11].

5.5. Example 7
Examples 7-9 are adapted from [11]. Suppose that the FOD is Θ = {θ1, ..., θ8} and the initial BBA is the vacuous BBA m(Θ) = 1. We change the BBA step by step: in each step, m(Θ) decreases by ∆ = 0.05 and the mass of one focal element B with cardinality s < 8 increases by ∆ = 0.05. In the final step, m(B) = 1. We set s = 2, 4, 6, respectively, and for each value of s the whole procedure of BBA change is repeated. For the different s, the changes of the uncertainty measures TU^I, AM, AU, NS, and Diss are shown in Figure 6.
Figure 6: The change of total uncertainty measures in Example 7 (panels: |B| = s = 2, s = 4, s = 6).
As shown in Figure 6, when mass is transferred to a smaller focal element, the non-specificity NS intuitively decreases, and the total uncertainty measures AU, AM and TU^I all decrease. The smaller the value of s, the faster the total uncertainty and the non-specificity decrease, because the mass is transferred to a smaller focal element. Comparatively, AU is not sensitive to the change of the BBA. The conflict part (Diss) is always 0. The reason is as follows: at the beginning the BBA is vacuous; in the intermediate steps the BBA has two focal elements, including the total set; and in the final step it has only one focal element. Therefore, there is no conflict.

5.6. Example 8

Suppose that the FOD is Θ = {θ1, ..., θ10}. A BBA over Θ is m(Θ) = 1 − a and m(A) = a, with |A| = s ≤ 10. Given a, and starting from |A| = 1, we change the BBA step by step: |A| increases by 1 in each step, and in the final step the BBA becomes vacuous. We set a = 0.3, 0.5, 0.8, respectively, and for each value of a the whole procedure of BBA change is repeated. For the different a, the changes of the uncertainty measures TU^I, AM, AU, NS, and Diss are shown in Figure 7.
As shown in Figure 7, the non-specificity NS and the total uncertainty measures AU, AM and our TU^I all intuitively increase with the change of the BBA at each step. The larger the value of a, the faster they increase, because relatively more mass is transferred to a larger focal element. Comparatively, AU is not sensitive to the change of the BBA. The conflict part (Diss) is zero and never changes. The reason is as follows: before the final step, the BBA has two focal elements, including the total set; in the final step it has only one focal element. Therefore, there is no conflict.
Figure 7: The change of total uncertainty measures in Example 8 (panels: a = 0.3, a = 0.5, a = 0.8).
5.7. Example 9

Suppose that the size of the FOD is |Θ| = 5. We randomly generate 10 BBAs with k (1 ≤ k ≤ 2^5 − 1) focal elements using the algorithm from [11] shown in Table 1. In this simulation, the number of focal elements is set to 15. The generated BBAs are then combined one by one using Dempster's rule of combination in Eq. (4). At each instant t of the evidence combination, the values of NS, AU, AM, Diss, and TU^I are calculated. The whole procedure is repeated 100 times, and the average values of these uncertainty measures at each t are shown in Figure 8.
Table 1: Algorithm 1: Random generation of a BBA

Input: Θ: FOD; Nmax: maximum number of focal elements
Output: m: a BBA
1. Generate the power set of Θ: P(Θ);
2. Generate a random permutation of P(Θ) → R(Θ);
3. Generate an integer between 1 and Nmax → k;
4. For each of the first k elements A_j of R(Θ), generate a value within [0, 1] → mv(j), j = 1, ..., k;
5. Normalize the vector mv = [mv(1), ..., mv(k)] → mv′;
6. Set m(A_j) = mv′(j), j = 1, ..., k.
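A sketch of Algorithm 1 (our own rendering of Table 1):

```python
import random
from itertools import combinations

def random_bba(frame, n_max):
    """Randomly generate a BBA with at most n_max focal elements (Algorithm 1 in Table 1)."""
    power_set = [frozenset(s) for r in range(1, len(frame) + 1)
                 for s in combinations(frame, r)]        # all non-empty subsets of the FOD
    random.shuffle(power_set)                            # random permutation R(Theta)
    k = random.randint(1, n_max)                         # number of focal elements
    masses = [random.random() for _ in range(k)]
    total = sum(masses)                                  # normalize the mass vector
    return {A: v / total for A, v in zip(power_set[:k], masses)}
```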
uncertainty decreases in the information fusion procedure such as the evide-
M
419
ED
420
421
nce combination. The non-specificity measure drops faster and more sig-
423
nificantly than the total uncertainty measures. This is because that in the
424
evidence combination based on Dempster’s rule, the focal elements are split
425
into focal elements with smaller cardinality. Also, since the focal elements
426
are split into smaller size focal elements, the conflict in the BBA will sig-
AC
CE
PT
422
427
nificantly increase at the beginning steps. At the final steps, since there is
428
less further split of focal elements and the consensus effect in combination
429
becomes more dominant, the conflict in the BBA decreases slightly as shown
430
in Figure 8. 27
ACCEPTED MANUSCRIPT
Figure 8: The change of total uncertainty measures in Example 9.
6. Application of the new total uncertainty measure

Here we use the new total uncertainty measure in feature evaluation for pattern classification to further show its rationality. Three classes of samples are artificially generated, with 200 samples in each class. Each sample has four dimensions (features). In each class, each feature of the samples is Gaussian distributed, with the means and standard deviations (Std for short) shown in Figure 9 and Table 2.

As shown in Figure 9 and Table 2, since the three Gaussian probability density functions (PDFs) of feature 2 are well separated, the class discriminability of feature 2 is the best; the class discriminability of feature 4 is the worst (the PDFs totally overlap), and that of feature 3 is also very poor. The class discriminability of feature 1 is in the middle.
Figure 9: Probability density functions (PDFs) of the different features of the samples in the three classes.
Table 2: Gaussian distribution parameters of the samples.

Features   | Parameter | Class 1 | Class 2 | Class 3
Feature 1  | Mean      | 0       | 4       | 5.5
           | Std       | 1       | 1.2     | 1.2
Feature 2  | Mean      | 0       | 5       | 8
           | Std       | 1       | 1.2     | 1.2
Feature 3  | Mean      | 4.8     | 5       | 5.2
           | Std       | 1.2     | 1.2     | 1.2
Feature 4  | Mean      | 5       | 5       | 5
           | Std       | 1.2     | 1.2     | 1.2
This can also be verified using the discrimination criterion [34]
$$J = \frac{tr(S_w)}{tr(S_b)} \qquad(14)$$
where tr denotes the trace of a matrix. Suppose that there are C classes and each class C_i has N_i samples. The degree of inner-class cohesion S_w and the degree of inter-class separability S_b are defined as [34]
$$S_w=\sum_{i=1}^{C}P(C_i)\,E\Big[\Big(X-\tfrac{1}{N_i}\textstyle\sum_{X\in C_i}X\Big)\Big(X-\tfrac{1}{N_i}\textstyle\sum_{X\in C_i}X\Big)^{T}\Big],\qquad
S_b=\sum_{i=1}^{C}P(C_i)\Big(\tfrac{1}{N_i}\textstyle\sum_{X\in C_i}X-M\Big)\Big(\tfrac{1}{N_i}\textstyle\sum_{X\in C_i}X-M\Big)^{T}\qquad(15)$$
where X is the feature (vector) of a sample and
$$M=\frac{1}{C}\sum_{i=1}^{C}\Big(\frac{1}{N_i}\sum_{X\in C_i}X\Big)\qquad(16)$$
is the mean of all the classes' centroids. If the J value of some feature (or set of features) in Eq. (14) is smaller, then that feature (or set of features) is more crisp and discriminable.

For the four dimensions of the artificially generated samples, we obtain
J(1) = 0.2666, J(2) = 0.1336, J(3) = 488.9156, J(4) = 981.3525.
This means that, in terms of discriminability, feature 2 is the best, feature 4 is the worst, feature 3 is also very poor, and feature 1 is in the middle, which accords with the judgement based on the Gaussian parameters.

Now we use the different total uncertainty measures (AM, AU, and our new TU^I) for feature evaluation. Intuitively, the feature with the smaller total uncertainty measure should be the better one (higher discriminability).
First, we generate a BBA from each sample x_q of the three-class artificial data set on each feature, according to [35]:
$$m^{i}_{x_q}(A_j)=\frac{|A_j|^{-\alpha/(\beta-1)}\, d_{qj}^{-2/(\beta-1)}}{\sum_{A_k\neq\emptyset}|A_k|^{-\alpha/(\beta-1)}\, d_{qk}^{-2/(\beta-1)}+\delta^{-2/(\beta-1)}}\qquad(17)$$
with parameters α = 1 and β = 2, as suggested in [34]. An illustrative example is given in Figure 10.
Figure 10: Illustration of BBA generation.

In Figure 10, three different colors represent the three different classes; c1 denotes the centroid of the samples in class 1, c1,2 denotes the centroid of the samples in classes 1 and 2, and c1,2,3 denotes the centroid of the samples in classes 1, 2, and 3. We calculate the distance d between x_q and the centroids of the single classes and of the compound classes; then, according to Eq. (17), the BBA is generated.
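A sketch of this BBA construction for one sample on one feature (our own illustration; `centroids` maps each non-empty subset of class labels to the corresponding class or compound-class centroid, and `delta` is our name for the δ term of Eq. (17)):

```python
def bba_from_sample(x, centroids, alpha=1.0, beta=2.0, delta=10.0):
    """Sketch of the BBA generation of Eq. (17) for one scalar feature value x."""
    weights = {}
    for A, c in centroids.items():                     # A: frozenset of class labels, c: centroid
        d = max(abs(x - c), 1e-12)                     # distance from the sample to the centroid
        weights[A] = len(A) ** (-alpha / (beta - 1.0)) * d ** (-2.0 / (beta - 1.0))
    denom = sum(weights.values()) + delta ** (-2.0 / (beta - 1.0))
    # any mass not assigned below corresponds to the delta (outlier) term in the denominator
    return {A: w / denom for A, w in weights.items()}
```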
Second, we calculate AU, AM, and the proposed TU^I for all the generated BBAs (one per sample per feature). Then we calculate the average values of AU, AM and TU^I for each feature. The results are shown in Table 3.
Table 3: Average total uncertainty measures

Measures (Ave) | Feature 1 | Feature 2 | Feature 3 | Feature 4
AU             | 1.4022    | 1.1968    | 1.5850    | 1.5850
AM             | 1.1855    | 0.9860    | 1.5844    | 1.5847
TU^I           | 0.4496    | 0.3787    | 0.6218    | 0.6221
We consider the average total uncertainty measures as scores for the different features; the feature with the smaller score is the better one (higher discriminability). As shown in Table 3, the feature evaluation based on the average TU^I values accords with the results obtained using the discrimination criterion: feature 2 is the best, feature 4 is the worst, feature 3 is also very poor, and feature 1 is in the middle.

AM also provides the correct feature evaluation result. Note that feature 4 is slightly more ambiguous than feature 3: the difference of the average TU^I between features 3 and 4 is 0.0003, and the difference of the average AM is also 0.0003. However, the value range of TU^I is [0, 1], while the value range of AM here is [0, log2 3]. Therefore, the proposed TU^I is relatively more sensitive when measuring the ambiguity of different features, which agrees with the analysis in the examples above.

The average AU values of features 3 and 4 are equal; they are both the maximum value log2 3 = 1.5850. The reason is analyzed below.
According to the BBA generation approach used, since for features 3 and 4 all the centroids (7 centroids: 3 singleton classes and 4 compound classes) are very close, all the distances between x_q and these centroids are almost equal when generating the BBAs on features 3 and 4. Therefore, each focal element of the BBA generated for x_q on features 3 and 4 has a mass value approximating 1/7. Over all the BBAs generated, for feature 3, m(A) = 1/7 ± 0.010, and for feature 4, m(A) = 1/7 ± 0.007, for every A ⊆ {c1, c2, c3}. According to Eq. (10), AU is calculated by maximizing the Shannon entropy subject to the constraints established from the belief functions:
$$\tfrac{1}{7}\approx m(\{c_1\}) \le P(c_1) \le m(\{c_1\})+m(\{c_1,c_2\})+m(\{c_1,c_3\})+m(\{c_1,c_2,c_3\})\approx\tfrac{4}{7}$$
$$\tfrac{1}{7}\approx m(\{c_2\}) \le P(c_2) \le m(\{c_2\})+m(\{c_1,c_2\})+m(\{c_2,c_3\})+m(\{c_1,c_2,c_3\})\approx\tfrac{4}{7}$$
$$\tfrac{1}{7}\approx m(\{c_3\}) \le P(c_3) \le m(\{c_3\})+m(\{c_1,c_3\})+m(\{c_2,c_3\})+m(\{c_1,c_2,c_3\})\approx\tfrac{4}{7}$$
where the lower bounds are Bel({c_i}) and the upper bounds are Pl({c_i}). The uniform distribution P(c1) = P(c2) = P(c3) = 1/3, which has the maximum Shannon entropy, always satisfies these constraints, so the AU value is fixed to log2 3 = 1.5850. Based on AU, we cannot judge which of features 3 and 4 is better.

In summary, the proposed TU^I can be effectively used in feature evaluation for pattern classification, and it performs better than AM and AU in this application.
7. Conclusions

In this paper, a new total uncertainty measure based on the distance between belief intervals is proposed within the framework of belief functions theory; there is no switch between frameworks in the new measure. It is based on a new perspective: for a body of evidence, the farther it is from the most uncertain case, the less uncertainty it carries. It has been shown experimentally that the new measure can rationally represent the total uncertainty in BOEs. The new measure also has desired properties and behaves more rationally than traditional measures such as AU and AM in several cases, as shown in our experiments and simulations. Furthermore, it can provide more rational results in practical applications such as feature evaluation.

To date, the theory of belief functions is still not fully settled and needs further refinement. The available works mainly focus on uncertainty modeling and reasoning in belief functions theory, and many valuable results have emerged. However, performance evaluation in belief functions theory is far from mature, which has become a bottleneck for its further development. In future work, we will focus on performance evaluation in the theory of belief functions and try to use our new total uncertainty measure for both theoretical and application-based evaluation.
Acknowledgements

This work was supported by the Grant for State Key Program for Basic Research of China (973) (No. 2013CB329405), the National Natural Science Foundation of China (Nos. 61203222, 61573275), the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (No. 61221063), the Science & Technology Project of Shaanxi Province (No. 2013KJXX-46), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20120201120036), and the Fundamental Research Funds for the Central Universities (No. xjj2014122).
20120201120036), and Fundamental Research Funds for the Central Univer-
521
sities (No. xjj2014122).
522
References
523
524
CR IP T
519
[1] G. Shafer, A mathematical theory of evidence, Princeton university press Princeton, 1976.
[2] G. Lin, J. Liang, Y. Qian, An information fusion approach by combining
526
multigranulation rough sets and evidence theory, Information Sciences
527
314 (2015) 184–199.
AN US
525
[3] G. Dong, G. Kuang, Target recognition via information aggregation
529
through Dempster-Shafer’s evidence theory, Geoscience and Remote
530
Sensing Letters, IEEE 12 (6) (2015) 1247–1251.
532
[4] Z.-g. Liu, Q. Pan, J. Dezert, Evidential classifier for imprecise data based
ED
531
M
528
on belief functions, Knowledge-Based Systems 52 (2013) 246–257. [5] Z.-g. Liu, Q. Pan, J. Dezert, G. Mercier, Credal c-means clustering
534
method based on belief functions, Knowledge-Based Systems 74 (2015)
535
119–132.
CE
[6] O. Basir, X. Yuan, Engine fault diagnosis based on multi-sensor informa-
AC
536
PT
533
537
tion fusion using Dempster-Shafer evidence theory, Information Fusion
538
8 (4) (2007) 379–386.
539
[7] D. Han, J. Dezert, J.-M. Tacnet, C. Han, A fuzzy-cautious owa ap-
540
proach with evidential reasoning, in: Proceedings of the 15th Interna35
ACCEPTED MANUSCRIPT
541
tional Conference on Information Fusion (FUSION), ISIF, Singapore,
542
2012, pp. 278–285.
544
[8] R. R. Yager, N. Alajlan, Dempster-Shafer belief structures for decision
CR IP T
543
making under uncertainty, Knowledge-Based Systems 80 (2015) 58–66.
[9] G. J. Klir, B. Parviz, A note on the measure of discord, in: Proceed-
546
ings of the Eighth international conference on Uncertainty in artificial
547
intelligence, Morgan Kaufmann Publishers Inc., 1992, pp. 138–141.
548
[10] D. Dubois, H. Prade, A note on measures of specificity for fuzzy sets,
549
AN US
545
International Journal of General System 10 (4) (1985) 279–283. ´ Boss´e, Measuring ambiguity in [11] A.-L. Jousselme, C. Liu, D. Grenier, E.
551
the evidence theory, Systems, Man and Cybernetics, Part A: Systems
552
and Humans, IEEE Transactions on 36 (5) (2006) 890–903.
M
550
[12] U. H¨ohle, Entropy with respect to plausibility measures, in: Proceedings
554
of the 12th IEEE International Symposium on Multiple-Valued Logic,
555
1982, pp. 167–169.
557
PT
nal 7 (3) (1928) 535–563.
[14] R. R. Yager, Entropy and specificity in a mathematical theory of evi-
AC
558
[13] R. V. Hartley, Transmission of information, Bell System Technical Jour-
CE
556
ED
553
559
560
561
dence, International Journal of General System 9 (4) (1983) 249–260.
[15] R. K¨orner, W. N¨ather, On the specificity of evidences, Fuzzy sets and systems 71 (2) (1995) 183–196.
36
ACCEPTED MANUSCRIPT
[16] D. Harmanec, G. J. Klir, Measuring total uncertainty in Dempster-
563
Shafer theory: A novel approach, International Journal of General Sys-
564
tems 22 (4) (1994) 405–419.
565
566
CR IP T
562
[17] C. E. Shannon, The mathematical theory of communication, Bell System Technical Journal 27 (1949) 379–423, 623–656.
[18] G. Klir, I. Lewis, H.W., Remarks on “measuring ambiguity in the ev-
568
idence theory”, IEEE Transactions on Systems, Man and Cybernetics,
569
Part A: Systems and Humans 38 (4) (2008) 995–999.
570
571
AN US
567
[19] J. Pearl, Reasoning with belief functions: an analysis of compatibility, International Journal of Approximate Reasoning 4 (5) (1990) 363–389. [20] J. Dezert, A. Tchamova, On the validity of Dempster’s fusion rule and its
573
interpretation as a generalization of bayesian fusion rule, International
574
Journal of Intelligent Systems 29 (3) (2014) 223–252.
ED
M
572
[21] F. Smarandache, J. Dezert, Advances and applications of DSmT for
576
information fusion-Collected works-Volume 3, American Research Press,
577
2009.
PT
575
[22] J. F. Lemmer, Confidence factors, empiricism and the Dempster-Shafer
579
theory of evidence, in: Proceedings of the 1st Conference on Uncertainty in Artificial Intelligence (UAI-85), 1985, pp. 160–176.
AC
580
CE
578
581
[23] L. A. Zadeh, A simple view of the dempster-shafer theory of evidence
582
and its implication for the rule of combination, AI Magazine 7 (2) (1986)
583
85–90. 37
ACCEPTED MANUSCRIPT
[24] P. Wang, A defect in Dempster-Shafer theory, in: Proceedings of the
585
Tenth international conference on Uncertainty in artificial intelligence,
586
Morgan Kaufmann Publishers Inc., 1994, pp. 560–566.
CR IP T
584
587
[25] A. Gelman, The boxer, the wrestler, and the coin flip: A paradox of
588
bayesian inference, robust bayes, and belief functions, American Statis-
589
tician 60 (2) (2005) 146–150.
[26] G. J. Klir, A. Ramer, Uncertainty in the Dempster-Shafer theory: a
591
critical re-examination, International Journal of General System 18 (2)
592
(1990) 155–166.
AN US
590
[27] D. Dubois, H. Prade, Properties of measures of information in evidence
594
and possibility theories, Fuzzy sets and systems 100 (1999) 35–49.
595
[28] A. Ramer, Uniqueness of information measure in the theory of evidence, Fuzzy Sets and Systems 24 (2) (1987) 183–196.
ED
596
M
593
[29] J. Abell´an, A. Masegosa, Requirements for total uncertainty measures
598
in Dempster-Shafer theory of evidence, International journal of general
599
systems 37 (6) (2008) 733–747.
gence 66 (2) (1994) 191–234.
AC
601
[30] P. Smets, R. Kennes, The transferable belief model, Artificial intelli-
CE
600
PT
597
602
[31] A. Irpino, R. Verde, Dynamic clustering of interval data using a
603
Wasserstein-based distance, Pattern Recognition Letters 29 (11) (2008)
604
1648–1658.
38
ACCEPTED MANUSCRIPT
607
608
609
610
611
[33] A. Papoulis, S. U. Pillai, Probability, random variables, and stochastic processes, 4th Edition, Tata McGraw-Hill Education, 2002.
[34] R. O. Duda, P. E. Hart, D. G. Stork, Pattern classification, John Wiley & Sons, 2012.
[35] M.-H. Masson, T. Denoeux, Ecm: An evidential version of the fuzzy c-means algorithm, Pattern Recognition 41 (4) (2008) 1384–1397.
AC
CE
PT
ED
M
612
tance measure, Fuzzy 130 (3) (2002) 331–341.
CR IP T
606
[32] L. Tran, L. Duckstein, Comparison of fuzzy numbers using a fuzzy dis-
AN US
605
39