A General Procedure of Estimating Population Mean Using Information on Auxiliary Attribute by Florentin Smarandache

Sampling Strategies for Finite Population Using Auxiliary Information

A General Procedure of Estimating Population Mean Using Information on Auxiliary Attribute 1Sachin

Malik, †1Rajesh Singh and 2Florentin Smarandache 1

Department of Statistics, Banaras Hindu University Varanasi-221005, India

Chair of Department of Mathematics, University of New Mexico, Gallup, USA † Corresponding author, rsinghstat@gmail.com

Abstract This paper deals with the problem of estimating the finite population mean when some information on auxiliary attribute is available. It is shown that the proposed estimator is more efficient than the usual mean estimator and other existing estimators. The results have been illustrated numerically by taking empirical population considered in the literature.

Keywords

Simple random sampling, auxiliary attribute, point bi-serial correlation, ratio estimator, efficiency.

1. Introduction The use of auxiliary information can increase the precision of an estimator when study variable y is highly correlated with auxiliary variable x. There are many situations when auxiliary information is available in the form of attributes, e.g. sex and height of the persons, amount of milk produced and a particular breed of cow, amount of yield of wheat crop and a particular variety of wheat (see Jhajj et. al. (2006)). Consider a sample of size n drawn by simple random sampling without replacement (SRSWOR) from a population of size N. Let y i and  i denote the observations on variable y and  respectively for i th unit ( i =1, 2,......, N). Let  i =1; if the i th unit of the population possesses attribute  = 0; otherwise. N

i 1

Let A=   i and a=   i , denote the total number of units in the population and sample respectively possessing attribute  . Let P=A/N and p=a/n denote the proportion of units in the population and sample respectively possessing attribute  . Naik and Gupta (1996) introduced a ratio estimator t NG when the study variable and the auxiliary attribute are positively correlated. The estimator t NG is given by

Rajesh Singh ■ Florentin Smarandache (editors)

t NG  y

P p

(1.1)

with MSE



MSE(t NG )  f1 S2y  R 2S2  2RS y

where



(1.2)





f1 

Nn , Nn

S2 

1 N 1 N S  ,     P  i  P y i  Y .  i y N  1 i 1 N  1 i 1

R

2 1 N Y , S2y   yi  Y , P N  n i 1 2





(for details see Singh et al. (2008)) Jhajj et. al. (2006) suggested a family of estimators for the population mean in single and two phase sampling when the study variable and auxiliary attribute are positively correlated. Shabbir and Gupta (2007), Singh et. al. (2008) and Abd-Elfattah et. al. (2010) have considered the problem of estimating population mean Y taking into consideration the point biserial correlation coefficient between auxiliary attribute and study variable. The objective of this article is to suggest a generalised class of estimators for population mean Y and analyse its properties. A numerical illustration is given in support of the present study.

2. Proposed Estimator Let *i   i  mA , m being a suitably chosen scalar, that takes values 0 and 1. Then

q  p  mA  p  NmP , and

Q  ( Nm  1)P, where q 

N n b B , Q  , B   i and b   i . n N i 1 i 1

Motivated by Bedi (1996), we define a family of estimators for population mean Y as





q t  w1 y  w 2 bP  p    Q



(2.1)

where w1 , w2 and  are suitably chosen scalars. To obtain the Bias and MSE of the estimator t, we write

Sampling Strategies for Finite Population Using Auxiliary Information

y  Y1  e 0  , p  P1  e1  , s 2  S2 1  e 2 , s y S y 1  e 3  , b  1  e 3 1  e 2 1

such that

E(e i )  0 , i=0,1,2,3 and

1 1  E(e 02 )    C 2y , n N

1 1  E(e12 )    C 2p , n N

1 1  E(e 0 e1 )     pb C y C p , n N

1 1  E(e1e 2 )    C p  03 , n N

1 1  E(e1e 3 )    C p n N

12 ,  pb

Expressing (2.1) in terms of e’s , we have e1     t  Y  w 1 1  e 0   w 2 e1 1  e 3 1  e 2 1 1   R Nm  1   



(2.2) 

e1  e1  We assume that e2  1 and  1 , so that ( 1  e2 ) 1 and 1   are expandable. Nm  1  Na  1  Expanding the right hand side of (2.2) and retaining terms up to second powers of e’s ,we have

 e 0 e1  e1 e12   1 t  Y  Y[ w1 1  e 0     Nm  1 2  Nm  12 Nm  1 e12     w 2 e1  e1e 3  e1e 2    1] R Nm  1  

(2.3)

Taking expectation of both sides of (2.3) , we get the bias of t to the first degree of approximation as :

      1 B( t )  Y w1  1  w1  f1C py  f1C 2p   Nm  1   2Nm  12  w2

  f1 C p R 

 12   C p  03  C 2p   pb Nm  1  

(2.4)

Squaring both sides of (2.3) and neglecting terms of e’s having power greater than two, we have 23

Rajesh Singh ■ Florentin Smarandache (editors)

t  Y 

  2  1e12 4e 0 e1  2  e  2 2 1  Y  w 1 1  2e 0   e0    2   Nm  1 Nm  1    Nm  1    2

  w 22   R 



e12

2e12      1  2w1w 2 e1  e1e 3  e 0 e1  e1e 2   R Nm  1  

2  e 0 e1   1e1  e1  2w1 1  e 0     Nm  1 Nm  1 2Nm  12    e12     2w 2 e1  e1e 3  e1e 2   R  Nm  1 (2.5) Taking expectation of both sides of (2.5), we get the MSE of t to the first degree of approximation as:



 m  m  m  2 MSE ( t )  Y 1  w12 A1m    w 2 A 2  2w1w 2 A 3   2w1A 4   2w 2 A 5  2











(2.6)



C p  2  1  2     A1m    1  f1 C y  Nm  1  Nm  1  4k 

where,



 A 2    f1C 2p R

 A 3m   

   2  2  Cp f1 C p   k  12  C p  03  R   Nm  1   pb 

   1   A 4m   1  f1   k    Nm  1  2Nm  1      f  C p  C p   C   A 5m 12 p 03   R 1  Nm  1   pb 2



where , k   pb

Cy Cp



The MSE(t) is minimised for w1

 A A      A    A       A  A  A       w A   A     A  A      w  A  A  A      m 4

m 1

m 3

m 1

m 3

m 5

m 3

m 4

m 1

m 3

(2.7)

m 5

(2.8)

Sampling Strategies for Finite Population Using Auxiliary Information

3. Members of the family of estimator of t and their Biases and MSE Table 3.1:

Different members of the family of estimators of t

Choice of scalars w2 w1



Estimator m

t1  y



-1

t 2  w1 y Searls (1964) type estimator q t 3  w 1 y  Q p t 4  w y  P P t 5  y   p ,



Naik and Gupta (1996) estimator P t 6  y  bP  p  p





Singh et. al. (2008) estimator





t 7  w 1 y  w 2 bP  p 

t 8  w 1 y  bP  p 

t 9  w y  bP  p 









t10  y  bP  p  Regression estimator

The estimator t1  y is an unbiased estimator of the population mean Y and has the variance Var t1   f1S2y

(3.1)

Rajesh Singh ■ Florentin Smarandache (editors)

To, the first degree of approximation the biases and MSE’s of t i ' s , i=1,2,.......,10 are respectively given by Bt 2   Yw1  1

(3.2)

     1 2  Bt 3   Y w1  1  f1w1   pb C y C p  C p  2Nm  1   Nm  1 

(3.3)

   1 2   Bt 4   Y w1  1  w1f1 pb C y C p  C p  2   

(3.4)



Bt 5   Yf1 C 2p   pb C y C p



(3.5)

  Bt 6   Yf1 (C 2p   pb C y C p )   C p R        Bt 7   Y w1  1  12 f1 C p R   

   Bt 8   Y w1  1  f1 C p R  

 12  C p  03    pb 

 12   C p  03   pb  

 12  C p  03   pb 

     Bt 9   Y w  1  wf 1 C p 12  C p  03  R   pb   Bt10   Y

    f1 C p 12  C p  03  R   pb 

(3.6)

(3.7)

(3.8)

(3.9)

(3.10)

The corresponding MSE’s will be

  1  w A    2w A    

MSE t 2   Y 1  w 12 A100  2w 1A 400  2

MSE t 3   Y

2 m 1 1

m 4



 MSE t 4   Y 1  w 12 A10   2w 1A 40  2



MSE t 5   Y 1  A101  2A 401 2

(3.11)

(3.12)



(3.13)



(3.14) 26

Sampling Strategies for Finite Population Using Auxiliary Information



MSE t 6   Y 1  A101  A 2  2A 30 1  2A 401  2A 50 1 2





MSE t 7   Y 1  w12 A100  w 22 A 2  2w1w 2 A 300   2w1A 400   2w 2 A 500  2







MSE t 8   Y 1  w12 A100  A 2  2w1 A 300   A 400   2A 500  2



 





MSE t 9   Y 1  w 2 A100  A 2  2A 300   2w A 400   A 500  2





MSE t 10   Y 1  A100  A 2  2 A 300   A 400   A 500  2



(3.15)

(3.16)

(3.17)



(3.18) (3.19)

The MSE’s of the estimaors of ti, i=2,3,4,7,8,9 will be minimised respectively, for



w1 

A 400 



A100 

(3.20)

 

A 4m 

w1 

 

A1m  

(3.21)

 

w1 

A 40 

 

A10 

(3.22)

 A A     A    A                 A A    A             A A A A     A A  A 0 40

0 30

0 2 10

( 0) 3(0)

( 0) 4( 0)

0 50 0 2 30 ( 0) 1(0)

( 0) 2 1(0)

( 0) 5(0) ( 0) 2 3(0)

(3.23)

 A      A     

w

0 30

A 100 

0 40

(3.24)

A 40(0 )  A 500 

A100   A 2  2A 300 

(3.25)

Thus the resulting minimum MSE of ti , i= 2,3,4,7,8,9 are, respectively given by

Rajesh Singh ■ Florentin Smarandache (editors)





 0  2  A 2 40   min . MSE t 2   Y 1   A100    

(3.26)





 m  2  A 2 4   min . MSE t 3   Y 1    A1m     

(3.27)

  A      1



2

min . MSE t 4   Y   

0 2 4  A10  





(3.28)





  0  2 0  0  0  0   A 2 A 40   2A 30 A 50   A10  A 50  2 min . MSE t 7   Y 1   2  A 2 A100  A 300   







 A 300   A 400  2  0 min . MSE t 8   Y 1  A 2  2A 50   A100  

   2

   

(3.29)

  2

  

(3.30)





 0   A 0  2  A  2 40  50  min . MSE t 9   Y 1  0  0    A10   A 2  2A 30    

(3.31)

4. Empirical study The data for the empirical study is taken from natural population data set considered by Sukhatme and Sukhatme (1970): y = Number of villages in the circles and  = A circle consisting more than five villages

N  89, Y  3.36, P  0.1236,  pb  0.766, C y  0.6040, C p  2.190  04  6.1619,  40  3.810, 12  146.475,  03  2.2744

In the Table 4.1 percent relative efficiencies (PRE’s) of various estimators are computed with respect to y . 28

Sampling Strategies for Finite Population Using Auxiliary Information

Table 4.1: PRE of different estimators of Y with respect to y . Estimator t1  y

PRE’s 100.00

t2 t3

101.41

t4 t5

6.92 11.64

7.38

100.44

243.39

243.42

t10

241.98

90.35

Conclusion The MSE values of the members of the family of the estimator t have been obtained using (2.6). These values are given in Table 4.1. When we examine Table 4.1, we observe the superiority of the proposed estimators t2, t7, t8, t9 and t10 over usual unbiased estimator t1, t3, t4, Naik and Gupta (1996) estimator t5 and Singh et. al. (2008) estimator t6. From this result we can infer that the proposed estimators t8 and t9 are more efficient than the rest of the estimators considered in this paper for this data set. We would also like to remark that the value of the min. MSE(t10), which is equal to the value of the MSE of the regression estimator is 241.98. From Table 4.1 we notice that the value of MSE of the estimators t8 and t9 are less than this value, as shown in Table 4.1. Finally, we can say that the proposed estimators t8 and t9 are more efficient than the regression estimator for this data set.

References 1. Abd-Elfattah, A.M. El-Sherpieny, E.A. Mohamed, S.M. Abdou, O. F. (2010): Improvement in estimating the population mean in simple random sampling using information on auxiliary attribute. Appl. Mathe. and Compt. doi:10.1016/j.amc.2009.12.041 2. Bedi, P. K. (1996). Efficient utilization of auxiliary information at estimation stage. Biom. Jour, 38:973–976. 3. Jhajj, H.S., Sharma, M.K. and grover, L.K. (2006) : A family of estimators of population mean using information on auxiliary attribute. Pak. Journ. of Stat., 22(1), 43-50. 4. Naik,V.D. and Gupta,P.C.(1996): A note on estimation of mean with known population proportion of an auxiliary character. Journ. of the Ind. Soc. of Agr. Stat., 48(2), 151-158. 5. Searls, D.T. (1964): The utilization of known coefficient of variation in the estimation procedure. Journ. of the Amer. Stat. Assoc., 59, 1125-1126. 29

Rajesh Singh â&#x2013; Florentin Smarandache (editors) 6. Singh, R. Chauhan, P. Sawan, N. Smarandache, F. (2008): Ratio estimators in simple random sampling using information on auxiliary attribute. Pak. J. Stat. Oper. Res. 4(1) 47â&#x20AC;&#x201C;53. 7. Shabbir, J. and Gupta, S.(2007): On estimating the finite population mean with known population proportion of an auxiliary variable. Pak. Journ. of Stat., 23 (1), 1-9. 8. Sukhatme, P.V. and Sukhatme, B.V. (1970): Sampling theory of surveys with applications. Iowa State University Press, Ames, U.S.A.