A new distance-based total uncertainty measure in the theory of belief function

Page 1

Accepted Manuscript

A new distance-based total uncertainty measure in the theory of belief functions Yi Yang, Deqiang Han PII: DOI: Reference:

S0950-7051(15)00457-8 10.1016/j.knosys.2015.11.014 KNOSYS 3343

To appear in:

Knowledge-Based Systems

Received date: Revised date: Accepted date:

27 July 2015 22 October 2015 20 November 2015

Please cite this article as: Yi Yang, Deqiang Han, A new distance-based total uncertainty measure in the theory of belief functions, Knowledge-Based Systems (2015), doi: 10.1016/j.knosys.2015.11.014

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


ACCEPTED MANUSCRIPT

Highlights • A new total uncertainty measure in evidence theory is proposed.

CR IP T

• The new measure is directly defined in the evidential framework.

• The new measure is not a generalization of those in the probabilistic framework.

AN US

• The belief intervals and distance metric are used for the new measure’s design.

• The new measure has no drawbacks in traditional ones and has desired

AC

CE

PT

ED

M

properties.

1


ACCEPTED MANUSCRIPT

3

Yi Yanga , Deqiang Hanb,∗ a

4 5

b

CR IP T

2

A new distance-based total uncertainty measure in the theory of belief functions

1

SKLSVMS, School of Aerospace, Xi’an Jiaotong University, Xi’an, China 710049 Institute of Integrated Automation, Xi’an Jiaotong University, Xi’an, China 710049

Abstract

7

The theory of belief functions is a very important and effective tool for un-

8

certainty modeling and reasoning, where measures of uncertainty are very

9

crucial for evaluating the degree of uncertainty in a body of evidence. Several

10

uncertainty measures in the theory of belief functions have been proposed.

11

However, existing measures are generalizations of measures in the proba-

12

bilistic framework. The inconsistency between different frameworks causes

13

limitations to existing measures. To avoid these limitations, in this paper,

14

a new total uncertainty measure is proposed directly in the framework of

15

belief functions theory without changing the theoretical frameworks. The

16

average distance between the belief interval of each singleton and the most

17

uncertain case is used to represent the total uncertainty degree of the given

18

body of evidence. Numerical examples, simulations, applications and related

19

analyses are provided to verify the rationality of our new measure.

CE

PT

ED

M

AN US

6

Keywords: belief functions, evidence theory, uncertainty measure, belief

21

interval, distance of interval

AC 20

Corresponding author: Deqiang Han Email addresses: jiafeiyy@mail.xjtu.edu.cn (Yi Yang), deqhan@gmail.com (Deqiang Han ) Preprint submitted to Knowledge-Based Systems

November 26, 2015


ACCEPTED MANUSCRIPT

22

1. Introduction Dempster-Shafer evidence theory (DST) [1], also called the theory of be-

24

lief functions, has been widely used in many applications related to uncer-

25

tainty modeling and reasoning, e.g., information fusion [2], pattern classifica-

26

tion [3, 4], clustering analysis [5], fault diagnosis [6], and multiple attribute

27

decision-making (MADM) [7, 8].

CR IP T

23

In the theory of belief functions, there are two types of uncertainty in-

29

cluding the discord (or conflict or randomness) [9] and the non-specificity

30

[10], hence the ambiguity [11]. Various kinds of measures for these two types

31

of uncertainty and the total uncertainty including both two types were pro-

32

posed. The measures of discord include the discord measure [9], the strife [9],

33

the confusion [12], etc; the measures of non-specificity include Dubois and

34

Prade’s definition [10] generalized from the Hartley entropy [13] in the clas-

35

sical set theory, Yager’s definition [14], and Korner’s definition [15], etc. The

36

most representative total uncertainty measures are the aggregated measure

37

(AU ) [16] and the ambiguity measure (AM ) [11].

ED

M

AN US

28

In essential, no matter AU or AM , they are the generalization of Shan-

39

non entropy [17] in probability theory, but not the direct definition in the

40

framework of the theory of belief functions. That is, they pick up a probabil-

41

ity according to some criteria or constraints established based on the given

42

body of evidence (BOE), and then calculate the corresponding Shannon en-

AC

CE

PT

38

43

tropy of the probability to indirectly depict the degree of uncertainty for the

44

given body of evidence. As mentioned in the related references [11][18], AU

45

and AM have their own limitations. For example, they are insensitive to

46

the change of BBA. These limitations to some extent are related to the in3


ACCEPTED MANUSCRIPT

consistency [19, 20] between the framework of the theory of belief functions

48

and that of the probability theory. Therefore, to avoid the limitations of

49

the traditional definitions for the uncertainty measure, in our work, a new

50

total uncertainty measure is proposed directly in the framework of the theory

51

of belief functions without the switching between different frameworks. We

52

analyze the belief interval and conclude that belief intervals carry both the

53

randomness part and the imprecision part (non-specificity) in the uncertainty

54

incorporated in a BOE. Thus, it is feasible to define a total uncertainty mea-

55

sure for a BOE. Given a BOE, the distance between the belief interval of each

56

singleton and the most uncertain interval [0, 1] is used for constructing the

57

degree of total uncertainty. The larger the average distance, the smaller the

58

degree of uncertainty. Since there is no switch of theoretical frameworks, our

59

new definition has desired properties including the ideal value range and the

60

monotonicity. Furthermore, the uncertainty measure can provide more ratio-

61

nal results when compared with the traditional ones, which can be supported

62

by experimental results and related analyses.

ED

M

AN US

CR IP T

47

The rest of this paper is organized as follows. Section 2 provides the brief

64

introduction of the theory of belief functions. Commonly used uncertainty

65

measures in the theory of belief functions are introduced in Section 3. Some

66

drawbacks of the available total uncertainty measures including AM and

67

AU are also pointed out in Section 3. In Section 4, a new total uncertainty

CE

PT

63

measure is proposed. Some desired properties and related analyses on the

69

new proposed definition are also provided. Experiments and simulations are

70

provided in Section 5 to show the rationality of our proposed total uncertainty

71

measure. An application of the new total uncertainty measure on feature

AC 68

4


ACCEPTED MANUSCRIPT

evaluation is provided in Section 6. Section 7 concludes this work.

73

2. Basics of the theory of belief functions

CR IP T

72

74

The basic conception in the theory of belief functions [1] is the frame

75

of discernment (FOD), whose elements are mutually exclusive and exhaus-

76

tive, representing what we concern. m : 2Θ → [0, 1] is called a basic belief

77

assignment (BBA) defined over an FOD Θ if it satisfies

A⊆Θ

m(A) = 1, m(∅) = 0

AN US

X

(1)

78

When m(A) > 0, A is called a focal element. A BBA is also called a mass

79

function. The set of all the focal elements denoted by F and their corre-

80

sponding mass assignments constitute a body of evidence (BOE) (F, m). The belief function (Bel) and plausibility function ((P l)) are defined as

M

81

ED

Bel(A) = 82

P l(A) =

X

B⊆A

X

A∩B6=∅

m(B)

(2)

m(B)

(3)

The belief function Bel(A) represents the justified specific support for the

84

focal element (or proposition) A, while the plausibility function P l(A) rep-

85

resents the potential specific support for A. The length of the belief interval

86

[Bel(A), P l(A)] is used to represent the degree of imprecision for A.

CE

Two independent BBAs m1 (·) and m2 (·) can be combined using Demp-

AC

87

PT

83

88

ster’s rule of combination as [1]     0, AP= ∅ m1 (Ai )m2 (Bj ) m(A) = Ai ∩Bj =A  P , A 6= ∅  m1 (Ai )m2 (Bj )  1− Ai ∩Bj =∅

5

(4)


ACCEPTED MANUSCRIPT

There are still some other alternative combination rules. See [21] for details.

90

The theory of belief functions has been criticized for its validity [19, 20, 22,

91

23, 24, 25]. It is not a successful generalization of the probability theory, i.e.,

92

there exists inconsistency [20] between the framework of the theory of belief

93

functions and that of the probability theory.

94

3. Uncertainty measures in the theory of belief functions

CR IP T

89

In the theory of belief functions, there are two types of uncertainty includ-

96

ing the discord (or the conflict or the randomness) and the non-specificity,

97

hence ambiguity [11].

98

3.1. Measure of discord in belief function

AN US

95

Measures of discord are for describing the randomness (or discord or con-

100

flict) in a BOE [11]. Available definitions are listed below. Although with

101

various names, they are all for the discord part of the uncertainty in a BOE.

ED

102

M

99

1) Confusion measure [12]

AC 104

X

m(A) log2 (Bel(A))

(5)

X

m(A) log2 (P l(A))

(6)

A⊆Θ

2) Dissonance measure [14]

CE

103

PT

Conf (m) = −

Diss(m) = −

A⊆Θ

3) Discord measure [26] Disc(m) = −

X

m(A) log2

A⊆Θ

6

"

X

|B − A| 1− m(B) |B| B⊆Θ

#

(7)


ACCEPTED MANUSCRIPT

4) Strife measure [9] Stri(m) = −

X

m(A) log2

A⊆Θ

"

X

|A − B| 1− m(B) |A| B⊆Θ

#

(8)

CR IP T

105

106

Note that all these definitions can be considered as Shannon entropy-alike

107

measures. More detailed information on these measures can be found in [9].

108

3.2. Measures for non-specificity in belief function

Non-specificity [27, 15] means two or more alternatives are left unspecified

110

and represents an imprecision degree. It only focuses on those focal elements

111

with cardinality larger than one. Non-specificity is a special uncertainty

112

type in the framework of belief functions theory when compared with the

113

probabilistic framework. Some non-specificity measures [10, 14, 15] were

114

proposed. The most commonly used non-specificity definition is [10]:

M

AN US

109

ED

N S(m) =

X

A⊆Θ

m(A) log2 |A|

(9)

It can be regarded as a generalized Hartley measure [13] from the classical

116

set theory. When the BBA m(·) is a Bayesian BBA, i.e., it only has singleton

117

focal elements, it reaches the minimum value 0. When BBA m(·) is a vacuous

118

BBA, i.e., m(Θ) = 1, it reaches the maximum value log2 (|Θ|). This definition

119

was proved to have the uniqueness by Ramer [28] and it satisfies all the

120

requirements of the non-specificity measure.

AC

CE

PT

115

7


ACCEPTED MANUSCRIPT

122

3.3. Measures for total uncertainty in belief functions theory 1) Aggregated Uncertainty (AU ) [16] P AU (m) = max − θ∈Θ pθ log2 pθ   pθ ∈ [0, 1], ∀θ ∈ Θ     P pθ = 1 s.t. θ∈Θ   P   ¯ ∀A ⊆ Θ  Bel(A) ≤ pθ ≤ 1 − Bel(A),

CR IP T

121

θ∈A

(10)

In fact for AU , given a BBA, the probability with the maximum Shannon

124

entropy under the constraints established using the given BBA is selected

125

and its corresponding Shannon entropy value is defined as the value for AU .

126

Therefore, it is also called as ”upper entropy” [29]. It is an aggregated total

127

uncertainty (ATU) measure, which captures both non-specificity and discord.

128

AU satisfies all the requirements [29] for uncertainty measure including prob-

129

ability consistency, set consistency, value range, monotonicity, sub-additivity

130

and additivity for the joint BBA in Cartesian space.

M

ED

2) Ambiguity Measure (AM ) [11] X BetPm (θ) log2 (BetPm (θ)) AM (m) = − P

PT

131

AN US

123

(11)

θ∈Θ

m(B)/ |B| is the pignistic probability [30] of a

where BetPm (θ) =

133

BBA. In fact AM uses the Shannon entropy of the pignistic probability of a

134

given BBA to represent the uncertainty.

135

3.4. Drawbacks of available total uncertainty measures for belief functions

θ∈B⊆Θ

AC

CE

132

136

The traditional total uncertainty measures have their own drawbacks.

137

AM cannot satisfy the sub-additivity (for joint BBA in Cartesian space)

138

which has been pointed out by Klir [18]. Moreover, AM is criticized due to

139

its logical non-monotonicity [29] as shown in Example 1. 8


ACCEPTED MANUSCRIPT

141

3.4.1. Example 1 Suppose that the FOD is {θ1 , θ2 , θ3 }. There are two BBAs as follows. m1 ({θ1 , θ2 }) = 1/3, m2 ({θ1 , θ2 , θ3 }) = 1/3,

142

m1 ({θ1 , θ3 }) = 1/2, m2 ({θ1 , θ3 }) = 1/2,

Obviously, from the two BBAs, there exists Bel2 (A) 6 Bel1 (A),

m1 ({θ2 , θ3 }) = 1/6;

CR IP T

140

m2 ({θ2 , θ3 }) = 1/6.

P l1 (A) 6 P l2 (A), ∀A ⊆ Θ.

That is to say, all the belief intervals of m1 can be contained by those

144

corresponding belief intervals of m2 , which means that m2 has a higher level

145

of uncertainty. We can calculate their corresponding AM and AU as follows.

AN US

143

1.5546 = AM (m1 ) > AM (m2 ) = 1.5100; AU (m1 ) = AU (m2 ) = 1.5850 = log2 3.

We can see that the results of AM are counter-intuitive, because AM violates

147

the monotonicity, i.e., AM decreases the total quantity of uncertainty in this

148

example where a clear increment of uncertainty of m2 compared with that

149

of m1 . Although AU does not bring out a decrease of the quantity of uncer-

150

tainty, it generates two equal values, which is also counter-intuitive, because

151

m2 should have higher level of uncertainty. We provide further analyses on

ED

PT

this example. The belief function and plausibility function of m1 and m2 are Bel1 ({θ1 }) = 0.0000, P l1 ({θ1 }) = 0.8333;

AC

CE

152

M

146

Bel1 ({θ2 }) = 0.0000,

P l1 ({θ2 }) = 0.5000;

Bel1 ({θ1 , θ2 }) = 0.3333,

P l1 ({θ1 , θ2 }) = 1.0000;

Bel1 ({θ3 }) = 0.0000,

P l1 ({θ2 }) = 0.6667;

Bel1 ({θ1 , θ3 }) = 0.5000,

P l1 ({θ1 , θ3 }) = 1.0000;

Bel1 ({θ2 , θ3 }) = 0.1667,

P l1 ({θ1 , θ3 }) = 1.0000;

Bel1 (Θ) = 1.0000,

P l1 (Θ) = 1.0000; 9


ACCEPTED MANUSCRIPT

Bel2 ({θ1 }) = 0.0000,

P l2 ({θ1 }) = 0.8333;

Bel2 ({θ2 }) = 0.0000,

P l2 ({θ2 }) = 0.5000;

Bel2 ({θ3 }) = 0.0000,

P l2 ({θ1 , θ2 }) = 1.0000;

CR IP T

Bel2 ({θ1 , θ2 }) = 0.0000,

P l2 ({θ2 }) = 0.6667;

Bel2 ({θ1 , θ3 }) = 0.5000,

P l2 ({θ1 , θ3 }) = 1.0000;

Bel2 ({θ2 , θ3 }) = 0.1667,

P l2 ({θ1 , θ3 }) = 1.0000;

Bel2 (Θ) = 1.0000,

P l2 (Θ) = 1.0000;

Since AU tries to find a probability maximizing the Shannon entropy,

154

and the uniform p.m.f.(probability mass function) P (θ1 ) = P (θ2 ) = P (θ3 ) =

155

1/3, which has the maximum Shannon entropy, satisfies all the constraints

156

(shown in Eq. (10)) established using Bel1 , P l1 and those using Bel2 and P l2 .

157

Therefore, both AU1 and AU2 reach the maximum value log2 3 = 1.5850.

158

3.4.2. Example 2

M

AN US

153

Suppose that the FOD Θ = {θ1 , θ2 }. A BBA over Θ is m({θ1 }) = a,

160

m({θ2 }) = b, m({Θ}) = 1 − a − b, where a, b ∈ [0, 0.5]. Here, we calculate

161

AU , AM values corresponding to different values of a and b. The change of

162

AU and AM values with the change of a and b are illustrated in Figure 1.

PT

CE

163

ED

159

AC

As shown in Figure 1, AM reaches its maximum when a = b, because when a = b (no matter a or b’s value is large or small), the pignistic probabil-

ity is uniformly distributed. This is counter-intuitive, because AM cannot

discern the effect of total set’s mass assignment values to the uncertainty degree. For example, m1 ({θ1 }) = m1 ({θ2 }) = 0.5 and m2 ({θ1 }) = m2 ({θ2 }) = 10


AN US

CR IP T

ACCEPTED MANUSCRIPT

Figure 1: The change of total uncertainty measures in Example 2.

0.25, m2 (Θ) = 0.5, there exists AM (m1 ) = AM (m2 ). Obviously, it is ir-

M

rational to say that m1 and m2 have the same degree of total uncertainty. AU never changes with the change of a and b. The value of AU is always

ED

log2 2 = 1. The reason is analyzed as follows. The constraints for calculating AU in this example are

PT

Bel({θ1 }) = a 6 P (θ1 ) 6 1 − b = P l({θ1 });

164

CE

Bel({θ2 }) = b 6 P (θ2 ) 6 1 − a = P l({θ2 }); Bel(Θ) = 1 − a − b 6 P (θ1 ) 6 1 = P l(Θ);

Since AU tries to find a p.m.f. with the maximum Shannon entropy, and

the uniformly distributed P (θ1 ) = P (θ2 ) = 0.5 always satisfies the constraints

166

above (because a, b ∈ [0, 0.5]), no matter how a and b change, P (θ1 ) =

167

P (θ2 ) = 0.5 is always picked up when calculating AU and thus, AU always

168

equals to log2 2 = 1. However, intuitively, the degree of uncertainty should

169

change with the change of a and b. So, AU cannot well describe the degree

AC

165

11


ACCEPTED MANUSCRIPT

170

of total uncertainty here. As we can see above, both AM and AU borrow the uncertainty measure

172

(Shannon entropy) in the probability theory framework and they both have

173

drawbacks. However, as aforementioned, the belief functions theory is not a

174

successful generalization of the probability theory. There exists inconsistency

175

between the two frameworks. We do not prefer such a switch between differ-

176

ent frameworks, which might cause problems in representing the uncertainty

177

in belief functions. Therefore, in our work, we design the total uncertainty

178

measure directly in the framework of belief functions theory as introduced in

179

the next section.

180

4. Belief interval distance-based total uncertainty measure

181

4.1. Belief interval and uncertainty

185

186

187

188

AN US

M

Case I: The most uncertain case is Bel(A) = 0 and P l(A) = 1, i.e., the belief interval is [0,1].

Case II: The clearest case is Bel(A) = 1 and P l(A) = 1 (A is assure to occur) or Bel(A) = 0 and P l(A) = 0 (A is assure to never occur). Case III: When Bel(A) = a, P l(A) = b, ∀a, b ∈ (0, 1), the belief interval

is [a, b]. For A, the degree of imprecision can be represented by b − a, the

AC

189

ED

184

m can represent its uncertainty degree as analyzed below.

PT

183

The belief interval [1] [Bel(A, P l(A)] of a given focal elements A in a BBA

CE

182

CR IP T

171

190

probability of whether A occurs or not (randomness) is large than a and less

191

than b.

192

Case IV: When Bel(A) = P l(A) = a, ∀a ∈ (0, 1), the belief interval

193

is [a, a]. It means that A has no imprecision and whether A occurs or not 12


ACCEPTED MANUSCRIPT

194

cannot be clearly determined (it is with the randomness, i.e., the probability

195

a). Therefore, the information related to the uncertainty carried by a belief

197

interval both include the randomness part and the imprecision part (non-

198

specificity). Thus, given a BBA, we can fully utilize the information of the

199

belief intervals to measure the total uncertainty of the BBA. Then, how to

200

use the information of belief intervals?

CR IP T

196

Given the belief interval of A, i.e., [Bel(A), P l(A)], if the belief interval is

202

farther from the most uncertain case [0,1], then A has smaller uncertainty; if

203

the belief interval is nearer to the most uncertain case [0,1], then A has larger

204

uncertainty. Then, how to describe the distance between belief intervals?

205

Here we use the distance between interval numbers [31, 32] as follows. Given two interval numbers [a1 , b1 ] and [a2 , b2 ], a strict distance is defined as s 2 2 a1 + b 1 a2 + b 2 1 b 1 − a1 b 2 − a2 I d ([a1 , b1 ], [a2 , b2 ]) = − − + 2 2 3 2 2 (12)

ED

M

206

AN US

201

The belief interval [Bel(A), P l(A)] can be considered as an interval num-

208

ber. If we replace [a1 , b1 ] by [Bel(A), P l(A)] and [a2 , b2 ] by [0, 1], then,

209

the distance obtained can be used to represent how far it is from A to

210

the most uncertain case. As shown in Eq. (12), in dI , the proportion of

211

[(Bel({θi }) + P l({θi })) /2 − 1]2 and [(P l({θi }) − Bel({θi })) /2 − 1]2 is 1/3.

212

Generally speaking, an interval number is a kind of fuzzy data [31]. To define

213

the distance between two fuzzy data, the value of such a proportion depends

214

on the shape [31] (could be the symmetric triangular, normal, parabolic,

215

square root fuzzy data, and interval number) of the fuzzy data model. Here,

216

we use the belief intervals (interval numbers), therefore, there is a rational

AC

CE

PT

207

13


ACCEPTED MANUSCRIPT

and natural assumption that the interval is with a uniform distribution. Un-

218

der such an assumption, the proportion should be 1/3 and dI corresponds

219

to Mallows’ distance between two distributions, hence a strict distance met-

220

ric for interval numbers. By using such a strict distance metric for interval

221

numbers, we can further define a total uncertainty measure for a BOE as

222

follows.

223

4.2. New total uncertainty measure based on belief intervals

CR IP T

217

Suppose that m is a BBA over the FOD Θ = {θ1 , ..., θn }. First, calcu-

225

late the belief interval [Bel({θi }), P l({θi )}] , i = 1, ..., n for each singleton

226

θi .Then, calculate the distance between each [Bel({θi }), P l({θi )}] and [0, 1],

227

which represents the degree of departure from the most uncertain case for

228

each singleton. Since larger distance (larger departure from the most uncer-

229

tain case) means smaller total uncertainty, one minus the normalized and

230

averaged belief interval distances (for all singletons) is used as the total un-

231

certainty measure for the BBA m as shown in Eq. (13):

ED

M

AN US

224

233

235

(13)

√ 3 is the normalization factor 1 .

The newly proposed T U I has no drawbacks as those of AU and AM

pointed out in Examples 1 and 2. See the Examples 3 and 4 in Section 5 for

AC

234

where

n 1 √ X I · 3· d ([Bel({θi }), P l({θi })], [0, 1]) n i=1

CE

232

PT

T U I (m) = 1 −

details. 1

The distance dI ([Bel({θi }), P l({θi })], [0, 1]) reaches its maximum value when the be-

lief interval is [0,0] or [1,1]. Therefore, the normalization factor is 1/dI ([0, 0], [0, 1]) = √ 1/dI ([1, 1], [0, 1]) = 3.

14


ACCEPTED MANUSCRIPT

Here, we provide an illustrative example to show how T U I is calculated.

237

Suppose that FOD is Θ = {θ1 , θ2 , θ3 }. A BBA over Θ is m({θ1 }) =

238

0.3, m({θ2 , θ3 }) = 0.5, m(Θ) = 0.2.

CR IP T

236

First calculate the belief function and plausibility function for singletons: Bel({θ1 }) = m({θ1 }) = 0.3, Bel({θ2 }) = m({θ2 }) = 0.0, Bel({θ3 }) = m({θ3 }) = 0.0,

AN US

P l({θ1 }) = m({θ1 }) + m(Θ) = 0.5,

P l({θ2 }) = m({θ2 , θ3 }) + m(Θ) = 0.7, P l({θ3 }) = m({θ2 , θ3 }) + m(Θ) = 0.7.

Then, calculate the distance between each belief interval of singleton and

M

the interval [0, 1]:

ED

dI ([Bel({θ1 }), P l({θ1 })], [0, 1]) = dI ([0.3, 0.5], [0, 1]) q 2 1 0.5−0.3 1−0 2 0.3+0.5 = − 0+1 − 2 +3 = 0.2517; 2 2 2

PT

dI ([Bel({θ2 }), P l({θ2 })], [0, 1]) = dI ([0, 0.7], [0, 1]) q 0+1 2 1 0.7−0 1−0 2 0+0.7 = − + − = 0.1732; 2 2 3 2 2

Then T U I = 1 −

√ 3/3 · (0.2517 + 0.1732 + 0.1732) = 0.6547.

AC

239

CE

dI ([Bel({θ3 }), P l({θ3 })], [0, 1]) = dI ([0, 0.7], [0, 1]) q 2 1 0.7−0 1−0 2 0+0.7 = − 0+1 +3 2 − 2 = 0.1732; 2 2

240

4.3. Desired properties of the new total uncertainty measure

241

4.3.1. Range

242

The range of T U I is [0, 1], which is desired for practical use. For a vacuous

243

BBA m(Θ) = 1, all the belief intervals for singletons are [0, 1]. Therefore, by 15


ACCEPTED MANUSCRIPT

using Eq. (13), its degree of total uncertainty is 1, which is the maximum.

245

For a categorical BBA m(θi ) = 1, the belief interval for θi is [1, 1] and those

246

of other singletons are [0, 0]. Therefore, by using Eq. (13), its degree of total

247

uncertainty is 0, which is the minimum.

CR IP T

244

Proof of the range: For a vacuous BBA, when using Eq. (13), T U I = 1.

249

On the other hand, if T U I = 1, i.e., all the belief intervals for singletons are

250

[0, 1], the corresponding BBA must be a vacuous one. So the maximum value

251

is unique. If a BBA is not a vacuous one, then some singletons’ belief intervals

252

can be strictly contained by [0, 1]. Therefore, according to Eq. (13), the

253

corresponding T U I will be smaller than 1. Therefore, the unique maximum

254

value of T U I is 1.

AN US

248

For a categorical BBA m(θi ) = 1, when using Eq. (13), T U I = 0. On

256

the other hand, if T U I = 0, the only possible belief interval for singletons

257

are either [0, 0] or [1, 1]. For a BBA, one and only one belief interval for

258

singleton could be [1, 1]. So, the corresponding BBA should be m({θi }) =

259

1, m({θj }) = 0, ∀j 6= i, which represents the clearest case. If a BBA is not

260

m({θi }) = 1, m({θj }) = 0, ∀j 6= i, some singletons’ belief intervals are neither

PT

ED

M

255

[0, 0] nor [1, 1]. Therefore, according to Eq. (13), the corresponding T U I will

262

be larger than 0. So, the minimum value of T U I is 0.

263

End of Proof

4.3.2. Monotonicity

AC

264

CE

261

Given an uncertainty measure U N and two BBAs m1 and m2 , if ∀A ∈ P(Θ) : P l1 (A) 6 P l2 (A), Bel1 (A) > Bel2 (A) or ∀A ∈ P(Θ) : [Bel1 (A), P l1 (A)] ⊆ [Bel2 (A), P l2 (A)] 16


ACCEPTED MANUSCRIPT

there exists U N (m1 ) 6 U N (m2 ), then it is said that U N satisfies the mono-

266

tonicity [29]. The physical meaning of the monotonicity is that an uncertainty

267

measure in the theory of belief functions must not decrease the total quan-

268

tity of uncertainty in situations where there is a clear decrease in information

269

(increment of uncertainty).

CR IP T

265

Proof of the monotonicity: For two BBAs defined over the same FOD,

271

if it holds that ∀A ∈ P(Θ) : [Bel1 (A), P l1 (A)] ⊆ [Bel2 (A), P l2 (A)], then,

272

there exists ∀θi ∈ Θ : [Bel1 ({θi }), P l1 ({θi })] ⊆ [Bel2 ({θi }), P l2 ({θi })]. Ac-

273

cording to the property of the distance of interval numbers in Eq. (12), there

274

exists dI ([Bel1 ({θi }), P l1 ({θi })], [0, 1]) > dI ([Bel2 ({θi }), P l2 ({θi })], [0, 1]).

277

278

279

280

holds. End of Proof 4.4. On other properties

M

276

Then, according to Eq. (13), T U I (m1 ) 6 T U I (m2 ), i.e., the monotonicity

ED

275

AN US

270

It is declared in [29, 11] that AU and AM satisfy the probability consistency and set consistency.

PT

Probability consistency: When m is a Bayesian BBA (all focal ele-

CE

ments are singletons), AU reduces to Shannon entropy in probability theory: AU (m) = AM (m) = −

X

m({θ})log2 (m({θ}))

θ∈Θ

AC

Set consistency: When m focuses on a single set (i.e., m(A) = 1, ∀A ⊆

Θ), AU reduces to a Hartley measure in classical theory: AU (m) = AM (m) = log2 (|A|)

17


ACCEPTED MANUSCRIPT

It should be noted that it is meaningless to apply these two requirements

282

or properties to our new measure. First, our new T U I is designed directly

283

in the framework of belief functions theory, but not a generalization of the

284

measures in probability framework or those in classical set theory. Second,

285

the theory of belief functions is not a successful generalization of the prob-

286

ability theory. It cannot degenerate back to the probability framework. For

287

example, a Bayesian BBA is not a probability but still a special BBA, which

288

cannot satisfy the additivity for mutually exclusive events and cannot satisfy

289

the unity for the total set in Kolmogorov probability axioms [33]. Therefore,

290

the validity and strictness of such a consistency is doubtful.

AN US

CR IP T

281

Furthermore, AU also satisfies the subadditivity and additivity, which

292

defined for the joint BBA in Cartesian space. In our work, we do not con-

293

cern the joint BBA in Cartesian space, therefore, we will not analyze the

294

subadditivity and additivity in this paper. We will further verify our new total uncertainty measure by examples and

ED

295

M

291

simulations in the next section.

297

5. Experiments and simulations

298

5.1. Example 3

This Example 3 is a revisiting of Example 1. Using Eq. (13), we can

obtain that T U I (m1 ) = 0.6667 and T U I (m2 ) = 0.7778, i.e., m2 has a higher

AC

300

CE

299

PT

296

301

level of uncertainty, which is intuitive. It can be proved that T U I satisfy the

302

monotonicity. The proof can be found at Section 4.3.2.

18


ACCEPTED MANUSCRIPT

303

5.2. Example 4 This Example 4 is a revisiting of Example 2. Our new total uncertainty

305

measure T U I , non-specificity measure N S, dissonance measure Diss are also

306

calculated. The results are shown in Figure 2.

AU

CR IP T

304

TU I

AM

1

1

2

0.95 0.9 0.85 0 0.5

0.8 0.5

0.5 a

0 0

b

0.8

AN US

1

0.6

0.5

0 0

a

a

Diss

M

NS

b

1

0.4 0.5

0

0

0.5 b

1 0.8

1

0.6

ED

0.5

0.5

PT

0 0.5

0 0

b

0 0.5

0.5 a

0 0

b

0.2 0

CE

a

0.5

0.4

AC

Figure 2: The change of total uncertainty measures in Example 4.

307

308

AU never changes as analyzed in Example 2, which is counter-intuitive.

309

AM values never changes, so far as a = b holds, which is also counter-intuitive

310

as aforementioned in Example 2. As we can see in Figure 2, T U I brings out 19


ACCEPTED MANUSCRIPT

rational results. It reaches the maximum when a = b = 0, i.e., m(Θ) = 1.

312

When a + b is fixed, the maximum values are reached if a = b holds. For

313

example, suppose that a + b = 0.3. The T U I values are shown in Figure 3.

314

This makes sense, because m(Θ) = 1 − (a + b) is fixed, the non-specificity

315

part is fixed. When a = b, the conflict part reaches its maximum value.

CR IP T

311

AN US

0.9

TU I

0.8

X: 0.15 Y: 0.15 Z: 0.85

0.7 0.6

M

0.5 0.6

ED

0.4

a+b=0.3

0.2

b

0

0.2

0.1

0.3

0.4

0.5

a

PT

0

316

CE

Figure 3: T U I for those with fixed a + b

Note that N S reaches 0 (minimum), when a = b = 0.5 (m(Θ) = 0); it reaches

318

1 (maximum), when a = b = 0 (m(Θ) = 1). Diss reaches 1 (maximum),

319

when a = b = 0.5; it reaches 0 ( minimum), when a = 0 or b = 0 (no conflict).

AC

317

20


ACCEPTED MANUSCRIPT

320

5.3. Example 5 Suppose that the FOD is Θ= {θ1 , θ2 ,θ3 } and a BBA over Θ is m(A) =

322

1, ∀A = Θ. We make changes to the BBA step by step. In each step, m(Θ)

323

has a decrease of ∆ = 0.05 and each singleton mass m({θi }), i = 1, 2, 3 has an

324

increase of ∆/3. Finally, m(Θ) reaches 0 and m({θi }) = 1/3. We calculate

CR IP T

321

total uncertainty measures including T U I , AM , AU , non-specificity N S, and

326

dissonance (Diss) at each step. The changes of uncertainty values are shown

327

in Figure 4.

AN US

325

Change of Uncertainty values 1.6

NS AU AM

1 0.8 0.6

PT

0.4

TU I Diss

M

1.2

ED

Uncertainty values

1.4

AC

CE

0.2

0 0

5

10

Steps

15

20

Figure 4: The change of total uncertainty measures in Example 5.

328

329

As we can see in Figure 4, with the change of the BBA in each step, the

330

non-specificity changes from the maximum to zero. This is intuitive, because 21


ACCEPTED MANUSCRIPT

the mass assignments are transferred from the focal elements with larger

332

cardinality to those with smaller cardinality. The measure for conflict, i.e.,

333

Diss here, becomes larger and reaches the maximum in the final. This is

334

also intuitive, because the BBA changes from a vacuous BBA (no conflict)

335

to a Bayesian BBA (with uniform distribution on all singletons) in the fi-

336

nal. The traditional total uncertainty measures AU and AM never change

337

and they are always at the maximum value. This is counter-intuitive be-

338

cause the total uncertainty degree of a vacuous BBA should be the largest,

339

and thus, it should be larger than that of a Bayesian BBA (even if, it is

340

a Bayesian BBA with uniform distribution on all singletons). The reasons

341

for the irrational results of AU and AM are analyzed as follows. They are

342

both defined based on some probabilistic transformation from BBAs. In this

343

example, for the probabilistic transformation used in AM and AU , the re-

344

sults are always a uniformly distributed probability mass function (p.m.f.):

345

P (θi ) = 1/3, i = 1, 2, 3, therefore, AM and AU will never change here. Our

346

new measure T U I provides rational results. It becomes smaller and smaller

347

when the BBA changes from a vacuous one to a Bayesian one as shown in

348

Figure 2. Therefore, according to our opinion, it is inappropriate to define

349

total uncertainty measure in the theory of belief functions by indirectly using

350

the uncertainty measure in probability framework, i.e., Shannon entropy. It

351

should be better not to switch the framework but to directly design in the

CE

PT

ED

M

AN US

CR IP T

331

framework of belief functions theory. This is also the motivation of our work

353

in this paper.

AC 352

22


ACCEPTED MANUSCRIPT

354

5.4. Example 6 The FOD is Θ = {θ1 , θ2 , θ3 }. A BBA over Θ is m(A) = 1, ∀A = Θ. We

356

make changes to the BBA step by step. In each step, m(Θ) has a decrease

357

of ∆ = 0.05 and mass assignment of one singleton m({θ1 }) has an increase

358

of ∆ = 0.05. In the final step, m(Θ) = 0 and m({θ1 }) = 1. We calculate

CR IP T

355

total uncertainty measures inlcuding T U I , AM , AU , non-specificity measure

360

N S, and dissonance (Diss) of the BBA at each step. The changes of these

361

uncertainty values are shown in Figure 5.

AN US

359

Change of Uncertainty values

1.6

NS AU AM

1.4

M

0.8

0.4

PT

0.2

ED

Uncertainty values

1

0.6

I

TU Diss

1.2

0

CE

−0.2 0

5

10

Steps

15

20

AC

Figure 5: The change of total uncertainty measures in Example 6.

362

363

The BBA changes from a vacuous one to a categorical one. As shown in

364

Figure 5, non-specificity N S becomes smaller and smaller, which is intuitive. 23


ACCEPTED MANUSCRIPT

365

The conflict part Diss is always 0, which is also intuitive, because at most

366

two focal elements are available for the BBA in each step (including {θ1 } and

Θ), therefore, there is no conflict. AU , AM , T U I all decrease intuitively.

368

However, AU is not sensitive to the BBA’s change at the beginning stage

369

(nearly unchanged). This has already been criticized by Jousselme [11].

370

5.5. Example 7

CR IP T

367

Examples 7-9 are used for reference from [11]. Suppose that the FOD is

372

Θ = {θ1 , ..., θ8 }. A BBA over Θ is m(A) = 1, ∀A = Θ. We make changes

373

to the BBA step by step. In each step, m(Θ) has a decrease of ∆ = 0.05,

374

and the mass assignment for one focal element B with cardinality s < 8 has

375

an increase of ∆ = 0.05. In the final step, m(B) = 1. Here, we can set

376

s = 2, 4, 6, respectively. Given an s value, we repeat the whole procedure

377

of BBA change. Under the different s, the changes of values for uncertainty

378

measures including T U I , AM , AU , N S, and Diss are shown in Figure 6. 3

|B| = s = 2

3

3 2.5

2

2

2

1.5

1.5

1.5

1

1

1

0.5

0.5

0.5

CE AC

|B| = s = 4

2.5

PT

2.5

Uncertainty values

ED

M

AN US

371

0 0

10

20

0 0

10

Steps

20

0 0

|B| = s = 6

Diss NS AU AM TU I

10

Figure 6: The change of total uncertainty measures in Example 7.

379

24

20


ACCEPTED MANUSCRIPT

As shown in Figure 6, when the mass assignments are transferred to a smaller

381

size focal element, the non-specificity N S intuitively decreases. The total

382

uncertainty AU , AM , T U I all decrease. If s has a smaller value, the total

383

uncertainty and the non-specificity will decrease faster, because the mass

384

assignments are transferred to a smaller size focal element. Comparatively,

385

AU is not sensitive to the change of the BBA. The conflict part (Diss) is

386

always 0. The reason is analyzed as follows. At the beginning, the BBA is a

387

vacuous one; in the middle steps, the BBA has two focal elements including

388

the total sets; in the final step, it has only one focal element. Therefore,

389

there is no conflict.

390

5.6. Example 8

AN US

391

CR IP T

380

Suppose that the FOD is Θ = {θ1 , ..., θ10 }. A BBA over Θ is m(Θ) = 1−a,

and m(A) = a, |A| = s 6 10. Given an a, and the initial |A| = 1, we make

393

changes to the BBA step by step. |A| increases by 1 in each step. In the

394

final, the BBA becomes a vacuous one. We set a = 0.3, 0.5, 0.8, respectively.

395

Given an a value, we repeat the whole procedure of BBA change. Under the

396

different a, the changes of values for uncertainty measures including T U I ,

397

AM , AU , N S, and Diss are shown in Figure 7.

ED

PT

As shown in Figure 7, non-specificity N S, total uncertainty measures in-

CE

398

M

392

cluding AU , AM , and our T U I all intuitively increase with the change of

400

BBA at each step. If a has a larger value, they will increase faster, because

401

relatively more mass assignments are transferred to a larger size focal ele-

402

ment. Comparatively, AU is not sensitive to the BBA’s change. The conflict

403

part (Diss) is zero and never changes. The reason is analyzed as follows.

404

Before the final step, the BBA has two focal elements including the total

AC

399

25


ACCEPTED MANUSCRIPT

405

sets; in the final step, it has only one focal element. Therefore, there is no

406

conflict. 3.5

a=0.5

3.5

3

3

3

2.5

2.5

2.5

2

2

2

1.5

1.5

1.5

1

1

1

0.5

0.5

0 0

5

10

0 0

a=0.8

CR IP T

a=0.3

AN US

Uncertainty values

3.5

Diss NS AU AM TU I

0.5

5

Steps

10

0 0

5

10

M

Figure 7: The change of total uncertainty measures in Example 8.

409

410

5.7. Example 9

Suppose that the size of an FOD is |Θ| = 5. Randomly generate 10 BBAs

with k, 1 6 k 6 25 − 1 focal elements using the algorithm [11] in Table 1.

PT

408

ED

407

In this simulation, the number of focal elements is set to 15. Those

412

generated BBAs are then combined one by one using Dempster’s rule of

413

combination in Eq. (4). At each instant t of the evidence combination, the

414

values of N S, AU , AM , Diss, and T U I are calculated. The whole procedure

AC

CE

411

415

is repeated for 100 times and the averaging values of these different uncer-

416

tainty measures at different t are shown in Figure 8. It is shown that all

417

the uncertainty measures, except for Diss, decrease with the increase of the

418

combination steps. This makes sense, because it is intuitive that the total 26


ACCEPTED MANUSCRIPT

Table 1: Algorithm 1: Random generation of BBA

Input: Θ: FOD; Nmax : Maximum number of focal elements;

Generate the power set of Θ: P(Θ);

CR IP T

Output: m: a BBA

Generate a random permutation of P(Θ) → R(Θ); Generate a integer between 1 and Nmax → k; FOReach First k elements of R(Θ) do

AN US

Generate a value within [0, 1] → mv(i), i = 1, ..., k; END

Normalize the vector mv = [mv(1), ..., mv(2)] → mv 0 m(Aj ) = mv 0 (j), j = 1, ..., k.

uncertainty decreases in the information fusion procedure such as the evide-

M

419

ED

420

421

nce combination. The non-specificity measure drops faster and more sig-

423

nificantly than the total uncertainty measures. This is because that in the

424

evidence combination based on Dempster’s rule, the focal elements are split

425

into focal elements with smaller cardinality. Also, since the focal elements

426

are split into smaller size focal elements, the conflict in the BBA will sig-

AC

CE

PT

422

427

nificantly increase at the beginning steps. At the final steps, since there is

428

less further split of focal elements and the consensus effect in combination

429

becomes more dominant, the conflict in the BBA decreases slightly as shown

430

in Figure 8. 27


ACCEPTED MANUSCRIPT

Change of Uncertainty values

2.5

CR IP T

Uncertainty values

2

1.5

NS AU AM

1

TU I Diss

0 0

AN US

0.5

2

4

Steps

6

8

10

Figure 8: The change of total uncertainty measures in Example 8.

432

433

M

6. Application of the new total uncertainty measure Here we use our new total uncertainty measure in feature evaluation for

ED

431

pattern classification to further show its rationality. Three classes of samples are artificially generated. There are 200 samples

435

in each class. Each sample has four dimensions. In each class, each dimen-

436

sion of samples is Gaussian distributed with different mean and standard

437

deviation (Std for short) as shown in Figure 9 and Table 2.

CE

438

PT

434

As shown in Figure 9 and Table 2, since the three Gaussian probability

density functions (PDFs) in feature 2 are quite well separated, the class

440

discriminability of feature 2 is the best; the class discriminability of feature

441

4 is the worst (totally overlapped), and that of the feature 3 is also very bad.

442

The class discriminability of feature 1 is in the middle. This can also be

AC 439

28


AN US

CR IP T

ACCEPTED MANUSCRIPT

Figure 9: Probability density functions (PDFs) of different features of the samples in three

M

classes.

Table 2: Gaussian distribution parameters of the samples.

ED

Features

PT

Feature 1

CE

Feature 2

AC

Feature 3

Feature 4

Class1

Class2 Class3

Mean

0

4

5.5

Std

1

1.2

1.2

Mean

0

5

8

Std

1

1.2

1.2

Mean

4.8

5

5.2

Std

1.2

1.2

1.2

Mean

5

5

5

Std

1.2

1.2

1.2

29


ACCEPTED MANUSCRIPT

verified by using the discrimination criterion [34] as follows.

J=

tr(Sw ) tr(Sb )

(14)

CR IP T

443

444

where tr denotes the trace of a matrix. Suppose that there are C classes and

445

each class Ci has Ni samples. The degree of inner-class cohesion Sw and the degree of inter-class separability Sb are as follows [34] "  T # C P P P    P (Ci )E X − N1i X X − N1i X  Sw = i=1 X∈Ci X∈Ci T C  P P P  1 1  X −M X −M P (Ci ) Ni  Sb = Ni i=1

447

X∈Ci

AN US

446

(15)

X∈Ci

where X is a feature (vector) of a sample and

M

C 1 X M= C i=1

1 X X Ni X∈C i

!

(16)

is the mean of all the classes’ centroids. If J of some feature (or set of

449

features) in Eq. (14) is smaller, then such a feature (or set of features) is

450

more crisp and discriminable.

ED

448

PT

For the four dimensions of the artificially generated samples, there exists

451

CE

J(1) = 0.2666, J(2) = 0.1336, J(3) = 488.9156, J(4) = 981.3525 It means that in terms of discriminability, feature 2 is the best; feature 4

is the worst; feature 3 is also very bad, and feature 1 is in the middle, which

453

is accordant to the judgement based on the Gaussian Parameters.

AC

452

454

Now, we use different total uncertainty measures (including AM, AU, and

455

our new T U I ) for the feature evaluation. Intuitively, the feature with smaller

456

total uncertainty measure should be better (higher discriminability). 30


ACCEPTED MANUSCRIPT

458

First, we generate BBA from each sample xq in the three-class artificial samples on different feature i ∈ {1, 2, 3} according to [35] mixq (Aj )

−2/(β−1)

|Aj |−α/(β−1) dqj

=P

Ak 6=∅

459

−2/(β−1)

|Ak |−α/(β−1) dqk

(17)

CR IP T

457

+ δ −2/(β−1)

where parameters α = 1, β = 2 as suggested in [34]. Here we give an

PT

ED

M

AN US

illustrative example as in Figure 10

460 461

In Figure 10, three different colors represent three different classes. c1

denotes the centroid of samples in class 1; c1,2 denotes the centroid of samples

AC

462

CE

Figure 10: Illustration of BBA generation.

463

in class 1 and class 2; c1,2,3 denotes the centroid of samples in classes 1, 2,

464

and 3. Calculate the distance d between xq and those centroids of single

465

classes and compound classes. Then according to Eq. (17), the BBA can be

466

generated. 31


ACCEPTED MANUSCRIPT

Second, we calculate AU, AM, and our proposed T U I for all the generated

468

BBAs (corresponding to each sample on different features). Then, calculate

469

the average values of AU, AM and T U I for different features. The results

470

are shown in Table 3.

CR IP T

467

Table 3: Average total uncertainty measures

Feature1

Featue 2

Feature 3

Feature 4

AU

1.4022

1.1968

1.5850

1.5850

AM

1.1855

0.9860

1.5844

1.5847

TU I

0.4496

AN US

Measures (Ave)

0.3787

0.6218

0.6221

We consider the average total uncertainty measures as scores for different

472

features. The feature with smaller score is better (with high discriminability)

473

As shown in Table 3 the feature evaluation based on the average T U I

474

values is accordant to the results obtained using discrimination criterion.

475

That is, feature 2 is the best; feature 4 is the worst; feature 3 is also very

476

bad, and feature 1 is in the middle.

ED

M

471

AM can also provide the correct feature evaluation result. Note that

478

feature 3 is more ambiguous than feature 4. The difference of average T U I

479

for features 3 and 4 is 0.0003; The difference of average AM for features 3

480

and 4 is also 0.0003. However, the value range of T U I is [0, 1], while the value

481

range of AM is [0, log2 3]. Therefore, our proposed T U I is more sensitive

482

when measuring the ambiguity in different features, which agrees with the

483

analysis in above examples.

AC

CE

PT

477

484

485

The average AU values for both features 3 and 4 are equal. They are both the maximum value log2 3 = 1.5850. The reason is analyzed below. 32


ACCEPTED MANUSCRIPT

According to the BBA generation approach used, because in features 3 and 4, all the centroids (7 centroids, 3 singleton classes, and 4 compound

CR IP T

classes) are very close, all the distances between xq and those centroids are almost equal when generating BBAs for xq on features 3 and 4. Therefore, each focal element of the BBA generated for xq on features 3 and 4 has the

mass value approximating to 1/7. According to all the BBAs generated, for feature 3, m(A) = (1/7) ± 0.010; for feature 4, m(A) = (1/7) ± 0.007, A ⊆

AN US

{c1 , c2 , c3 }. According to Eq. (10), since AU is calculated based on the maximization of Shannon entropy given the constraints established using belief functions as

1 4 ≈ m({c1 }) ≤ P (c1 ) ≤ m({c1 }) + m({c1 , c2 }) + m({c1 , c3 }) + m({c1 , c2 , c3 }) ≈ 7 7} | {z } | {z Bel({c1 })

P l({c1 })

M

1 4 ≈ m({c2 }) ≤ P (c2 ) ≤ m({c2 }) + m({c1 , c2 }) + m({c2 , c3 }) + m({c1 , c2 , c3 }) ≈ 7 7} | {z } | {z Bel({c2 })

P l({c2 })

Bel({c3 })

ED

4 1 ≈ m({c3 }) ≤ P (c3 ) ≤ m({c3 }) + m({c1 , c3 }) + m({c2 , c3 }) + m({c1 , c2 , c3 }) ≈ 7 7} | {z } {z | P l({c3 })

P (c1 ) = P (c2 ) = P (c3 ) = 1/3, which corresponds to the maximum Shannon

487

entropy, always satisfies the constraints, so, the AU value is fixed to log2 3 =

488

1.5850. Based on AU, we cannot judge which one is better for features 3 and

489

4.

In summary, our proposed T U I can be well used in feature evaluation for

pattern classification, and it performs better than AM and AU.

AC

491

CE

490

PT

486

492

7. Conclusions

493

In this paper, a new total uncertainty measure based on the distance

494

of belief intervals is proposed in the framework of belief functions theory. 33


ACCEPTED MANUSCRIPT

There is no switch between frameworks in the new measure. It is based on a

496

new perspective that for a body of evidence, the farther it is from the most

497

uncertain case, the less uncertainty it carries. It has been experimentally

498

shown that our new measure can rationally represent the total uncertainty

499

in BOEs. The new measure also has some desired properties and behaves

500

more rational than traditional measures like AU and AM in some cases as

501

shown in our experiments and simulations. Furthermore, our new measure

502

can provide more rational results in practical applications such as the feature

503

evaluation.

AN US

CR IP T

495

Till now, the theory of belief functions is still not so solid and needs fur-

505

ther refining. The available works mainly focus on the uncertainty modeling

506

and reasoning in belief functions theory and there have emerged many valu-

507

able results. However, the performance evaluation in belief functions theory

508

is far from mature, which has become its bottle neck for the further develop-

509

ment. In future work, we will focus on performance evaluation in the theory

510

of belief functions and try to use our new total uncertainty measure to im-

511

plement the theoretical evaluation and the application-based evaluation in

512

the theory of belief functions.

513

Acknowledgements

ED

PT

CE

This work was supported by the Grant for State Key Program for Ba-

AC

514

M

504

515

sic Research of China (973) (No. 2013CB329405), National Natural Science

516

Foundation (Nos. 61203222, 61573275), Foundation for Innovative Research

517

Groups of the National Natural Science Foundation of China (No. 61221063),

518

Science & technology project of Shaanxi Province (No.2013KJXX-46), Spe34


ACCEPTED MANUSCRIPT

cialized Research Fund for the Doctoral Program of Higher Education (No.

520

20120201120036), and Fundamental Research Funds for the Central Univer-

521

sities (No. xjj2014122).

522

References

523

524

CR IP T

519

[1] G. Shafer, A mathematical theory of evidence, Princeton university press Princeton, 1976.

[2] G. Lin, J. Liang, Y. Qian, An information fusion approach by combining

526

multigranulation rough sets and evidence theory, Information Sciences

527

314 (2015) 184–199.

AN US

525

[3] G. Dong, G. Kuang, Target recognition via information aggregation

529

through Dempster-Shafer’s evidence theory, Geoscience and Remote

530

Sensing Letters, IEEE 12 (6) (2015) 1247–1251.

532

[4] Z.-g. Liu, Q. Pan, J. Dezert, Evidential classifier for imprecise data based

ED

531

M

528

on belief functions, Knowledge-Based Systems 52 (2013) 246–257. [5] Z.-g. Liu, Q. Pan, J. Dezert, G. Mercier, Credal c-means clustering

534

method based on belief functions, Knowledge-Based Systems 74 (2015)

535

119–132.

CE

[6] O. Basir, X. Yuan, Engine fault diagnosis based on multi-sensor informa-

AC

536

PT

533

537

tion fusion using Dempster-Shafer evidence theory, Information Fusion

538

8 (4) (2007) 379–386.

539

[7] D. Han, J. Dezert, J.-M. Tacnet, C. Han, A fuzzy-cautious owa ap-

540

proach with evidential reasoning, in: Proceedings of the 15th Interna35


ACCEPTED MANUSCRIPT

541

tional Conference on Information Fusion (FUSION), ISIF, Singapore,

542

2012, pp. 278–285.

544

[8] R. R. Yager, N. Alajlan, Dempster-Shafer belief structures for decision

CR IP T

543

making under uncertainty, Knowledge-Based Systems 80 (2015) 58–66.

[9] G. J. Klir, B. Parviz, A note on the measure of discord, in: Proceed-

546

ings of the Eighth international conference on Uncertainty in artificial

547

intelligence, Morgan Kaufmann Publishers Inc., 1992, pp. 138–141.

548

[10] D. Dubois, H. Prade, A note on measures of specificity for fuzzy sets,

549

AN US

545

International Journal of General System 10 (4) (1985) 279–283. ´ Boss´e, Measuring ambiguity in [11] A.-L. Jousselme, C. Liu, D. Grenier, E.

551

the evidence theory, Systems, Man and Cybernetics, Part A: Systems

552

and Humans, IEEE Transactions on 36 (5) (2006) 890–903.

M

550

[12] U. H¨ohle, Entropy with respect to plausibility measures, in: Proceedings

554

of the 12th IEEE International Symposium on Multiple-Valued Logic,

555

1982, pp. 167–169.

557

PT

nal 7 (3) (1928) 535–563.

[14] R. R. Yager, Entropy and specificity in a mathematical theory of evi-

AC

558

[13] R. V. Hartley, Transmission of information, Bell System Technical Jour-

CE

556

ED

553

559

560

561

dence, International Journal of General System 9 (4) (1983) 249–260.

[15] R. K¨orner, W. N¨ather, On the specificity of evidences, Fuzzy sets and systems 71 (2) (1995) 183–196.

36


ACCEPTED MANUSCRIPT

[16] D. Harmanec, G. J. Klir, Measuring total uncertainty in Dempster-

563

Shafer theory: A novel approach, International Journal of General Sys-

564

tems 22 (4) (1994) 405–419.

565

566

CR IP T

562

[17] C. E. Shannon, The mathematical theory of communication, Bell System Technical Journal 27 (1949) 379–423, 623–656.

[18] G. Klir, I. Lewis, H.W., Remarks on “measuring ambiguity in the ev-

568

idence theory”, IEEE Transactions on Systems, Man and Cybernetics,

569

Part A: Systems and Humans 38 (4) (2008) 995–999.

570

571

AN US

567

[19] J. Pearl, Reasoning with belief functions: an analysis of compatibility, International Journal of Approximate Reasoning 4 (5) (1990) 363–389. [20] J. Dezert, A. Tchamova, On the validity of Dempster’s fusion rule and its

573

interpretation as a generalization of bayesian fusion rule, International

574

Journal of Intelligent Systems 29 (3) (2014) 223–252.

ED

M

572

[21] F. Smarandache, J. Dezert, Advances and applications of DSmT for

576

information fusion-Collected works-Volume 3, American Research Press,

577

2009.

PT

575

[22] J. F. Lemmer, Confidence factors, empiricism and the Dempster-Shafer

579

theory of evidence, in: Proceedings of the 1st Conference on Uncertainty in Artificial Intelligence (UAI-85), 1985, pp. 160–176.

AC

580

CE

578

581

[23] L. A. Zadeh, A simple view of the dempster-shafer theory of evidence

582

and its implication for the rule of combination, AI Magazine 7 (2) (1986)

583

85–90. 37


ACCEPTED MANUSCRIPT

[24] P. Wang, A defect in Dempster-Shafer theory, in: Proceedings of the

585

Tenth international conference on Uncertainty in artificial intelligence,

586

Morgan Kaufmann Publishers Inc., 1994, pp. 560–566.

CR IP T

584

587

[25] A. Gelman, The boxer, the wrestler, and the coin flip: A paradox of

588

bayesian inference, robust bayes, and belief functions, American Statis-

589

tician 60 (2) (2005) 146–150.

[26] G. J. Klir, A. Ramer, Uncertainty in the Dempster-Shafer theory: a

591

critical re-examination, International Journal of General System 18 (2)

592

(1990) 155–166.

AN US

590

[27] D. Dubois, H. Prade, Properties of measures of information in evidence

594

and possibility theories, Fuzzy sets and systems 100 (1999) 35–49.

595

[28] A. Ramer, Uniqueness of information measure in the theory of evidence, Fuzzy Sets and Systems 24 (2) (1987) 183–196.

ED

596

M

593

[29] J. Abell´an, A. Masegosa, Requirements for total uncertainty measures

598

in Dempster-Shafer theory of evidence, International journal of general

599

systems 37 (6) (2008) 733–747.

gence 66 (2) (1994) 191–234.

AC

601

[30] P. Smets, R. Kennes, The transferable belief model, Artificial intelli-

CE

600

PT

597

602

[31] A. Irpino, R. Verde, Dynamic clustering of interval data using a

603

Wasserstein-based distance, Pattern Recognition Letters 29 (11) (2008)

604

1648–1658.

38


ACCEPTED MANUSCRIPT

607

608

609

610

611

[33] A. Papoulis, S. U. Pillai, Probability, random variables, and stochastic processes, 4th Edition, Tata McGraw-Hill Education, 2002.

[34] R. O. Duda, P. E. Hart, D. G. Stork, Pattern classification, John Wiley & Sons, 2012.

[35] M.-H. Masson, T. Denoeux, Ecm: An evidential version of the fuzzy c-means algorithm, Pattern Recognition 41 (4) (2008) 1384–1397.

AC

CE

PT

ED

M

612

tance measure, Fuzzy 130 (3) (2002) 331–341.

CR IP T

606

[32] L. Tran, L. Duckstein, Comparison of fuzzy numbers using a fuzzy dis-

AN US

605

39


Turn static files into dynamic content formats.

Create a flipbook
Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.