Institute of Management & Technical Studies
NUMERICAL & STATISTICAL METHODS
500
Master in Computer Application www.imtsinstitute.com
IMTS (ISO 9001-2008 INTERNATIONALLY CERTIFIED)
NUMERICAL & STATISTICAL METHODS
FOR MORE DETAILS VISIT US ON WWW.IMTSINSTITUTE.COM OR CALL ON +91-9999554621
CONTENTS:
UNIT-01
Introduction, Definition Of Error, Types Of Error, Accumulated Error, Sources Of Error, Effects Of Errors, Absolute, Relative, And Percentage Errors, Relation Between Relative Error And The Number Of Significant Figures, Theorem I, Theorem II, The General Formula For Errors, Propagation Of Error, Effect Of An Error In A Tabular Value, Accuracy And Precision, Approximate Numbers And Significant Figures, Approximation, Rounding Of Numbers, Numerical Stability, Summary, Keywords, Answer To Check Your Progress, Exercise Questions, Further Readings

UNIT-II STATISTICAL ANALYSIS
Learning objectives, Introduction, Analysis of variance (ANOVA), Definition and assumption, One-way classified data, Two-way classified data, ANOVA table, Summary, Keywords, Answer for self assessment, Exercise Questions, Reference Books
CHAPTER - 1 ERRORS AND APPROXIMATION

Structure
Learning Objectives
1.0 Introduction
1.1 Definition Of Error
1.1.1 Types Of Error
1.1.2 Accumulated Error
1.2 Sources Of Error
1.2.1 Effects Of Errors
1.2.2 Absolute, Relative, And Percentage Errors
1.2.3 Relation Between Relative Error And The Number Of Significant Figures
1.2.4 Theorem I
1.2.5 Theorem II
1.2.6 The General Formula For Errors
1.3 Propagation Of Error
1.4 Effect Of An Error In A Tabular Value
1.5 Accuracy And Precision
1.6 Approximate Numbers And Significant Figures
1.6.1 Approximation
1.7 Rounding Of Numbers
1.8 Numerical Stability
1.9 Summary
1.10 Keywords
1.11 Answer To Check Your Progress
1.12 Exercise Questions
1.13 Further Readings

Learning Objectives
Students completing this unit will be able to
Acquire knowledge of errors and types of errors in numerical mathematics.
Understand the propagation of error.
Gain knowledge of approximation and its effects.
Learn to achieve numerical stability in computations.
1.0 INTRODUCTION

The numerical data used in solving the problems of everyday life are usually not exact, and the numbers expressing such data are therefore not exact. They are merely approximations, true to two, three, or more figures. Not only are the data of practical problems usually approximate, but sometimes the methods and processes by which the desired result is to be found are also approximate. An approximate calculation is one which involves approximate data, approximate methods, or both.

It is therefore evident that the error in a computed result may be due to one or both of two sources: errors in the data and errors of calculation. Errors of the first type cannot be remedied, but those of the second type can usually be made as small as we please. Thus, when such a number as $\pi$ is replaced by its approximate value in a computation, we can decrease the error due to the approximation by taking $\pi$ to as many figures as desired, and similarly in most other cases. We shall therefore assume in this unit that the calculations are always carried out in such a manner as to make the errors of calculation negligible.

Nearly all numerical calculations are in some way approximate, and the aim of the person computing should be to obtain results consistent with the data with a minimum of labour. The object of the present unit is to set forth some basic ideas and methods relating to approximate calculations and to give methods for estimating the accuracy of the results obtained.

Measurements and calculations can be characterized with regard to their accuracy and precision. Accuracy refers to how closely a value agrees with the true value. Precision refers to how closely values agree with each other. The error represents the imprecision and inaccuracy of a numerical computation. In any newly developed method it becomes necessary to identify the error and its nature and to analyse the order of the error. Without error analysis the newly developed method becomes useless.

The effect of errors on the laws of arithmetic is significant. In floating point arithmetic, the associative and distributive laws of arithmetic are not always satisfied. For example, $x + (y + z) \neq (x + y) + z$ due to the presence of rounding error. The following illustration, in which every intermediate result is rounded to six significant digits, shows this. Let

$$x = 0.456732 \times 10^{-2}, \qquad y = 0.243451, \qquad z = -0.248000.$$

Here $x + y = 0.24801832$ is rounded to $0.248018$ before $z$ is added, whereas $y + z = -0.454900 \times 10^{-2}$ is exact. Hence

$$(x + y) + z = 0.180000 \times 10^{-4}$$

$$x + (y + z) = 0.183200 \times 10^{-4}$$
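The illustration can be reproduced with a short sketch. It assumes Python's standard `decimal` module as a stand-in for a six-digit floating point machine; the helper name `grouped_sums` is hypothetical and not part of the original text.

```python
from decimal import Context, Decimal


def grouped_sums(x, y, z, digits=6):
    """Return ((x + y) + z, x + (y + z)).

    Every addition is rounded to `digits` significant digits,
    imitating a machine with a short mantissa.
    """
    ctx = Context(prec=digits)
    left = ctx.add(ctx.add(x, y), z)    # (x + y) + z
    right = ctx.add(x, ctx.add(y, z))   # x + (y + z)
    return left, right


x = Decimal("0.456732E-2")
y = Decimal("0.243451")
z = Decimal("-0.248000")

left, right = grouped_sums(x, y, z)
print("(x + y) + z =", left)
print("x + (y + z) =", right)
print("associative law holds:", left == right)
```

The two groupings differ because the intermediate sum x + y needs more than six digits and is rounded before z is added.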
An approximation is an inexact representation of something that is still close enough to be useful. It is applied not only to numbers but also to shapes and physical laws. Approximations may be used because incomplete information prevents the use of an exact representation. Many problems in physics are either too complex to solve analytically or impossible to solve. An approximation may yield a sufficiently accurate solution while reducing the complexity of the problem significantly, even where the exact representation is known. For instance, physicists often approximate the shape of the earth as a sphere, even though more accurate representations are possible, because many physical behaviours (e.g. gravity) are much easier to calculate for a sphere than for less regular shapes.

1.1 DEFINITION OF ERROR

Error is the difference between the true value and the measured value of a physical quantity.

1.1.1 Types Of Error
(a) Absolute error
(b) Relative error
(c) Percentage error
(d) Truncation error
(e) Round off error
(f) Inherent error
(g) Accumulated error

(a) Absolute Error ($E_{abs}$)
The error between two values is defined as

$$E_{abs} = \| \tilde{x} - x \|$$

where $x$ denotes the exact value and $\tilde{x}$ its approximation.

(b) Relative Error ($E_{rel}$)
It is the ratio of the absolute error to the exact value.

$$E_{rel} = \frac{\| \tilde{x} - x \|}{\| x \|}$$

(c) Percentage Error

$$E_p = 100 \times E_{rel}$$

(d) Inherent Error
It is an error present at the beginning of the process itself.
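The three definitions translate directly into code. The following is a minimal sketch for scalar values; the function names and the sample values (exact value 4, approximation 3.9) are hypothetical and chosen only for illustration. `Fraction` is used so that the results are exact.

```python
from fractions import Fraction


def absolute_error(exact, approx):
    """E_abs = |approx - exact|."""
    return abs(approx - exact)


def relative_error(exact, approx):
    """E_rel = |approx - exact| / |exact| (undefined when exact is zero)."""
    if exact == 0:
        raise ValueError("relative error is undefined for an exact value of zero")
    return absolute_error(exact, approx) / abs(exact)


def percentage_error(exact, approx):
    """E_p = 100 * E_rel."""
    return 100 * relative_error(exact, approx)


exact_value = Fraction(4)          # hypothetical exact value
approx_value = Fraction(39, 10)    # hypothetical approximation 3.9

print("absolute  :", float(absolute_error(exact_value, approx_value)))
print("relative  :", float(relative_error(exact_value, approx_value)))
print("percentage:", float(percentage_error(exact_value, approx_value)))
```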
1.1.2 Accumulated Error

An error in one value may affect the computation of the next value, so that the error accumulates over a sequence of computations. It is called the accumulated error.

Example: Consider the following procedure:

$$Y_{i+1} = 100\,Y_i$$

That is,
Y1 = 100 Y0
Y2 = 100 Y1
Y3 = 100 Y2
..............................

Let the exact value of Y0 be 9.98, and suppose we start with Y0 = 10. Here there is an inherent error of 0.02.
Y1 = 100 Y0 = 100 x 10 = 1000
Y2 = 100 Y1 = 100 x 1000 = 100000
Y3 = 100 Y2 = 100 x 100000 = 10000000
..............................

The following table shows the exact values and computed values.

Variable   Exact value   Computed value   Error
Y0         9.98          10               0.02
Y1         998           1000             2
Y2         99800         100000           200
Y3         9980000       10000000         20000

We can notice in the table how the error gets accumulated. A small error of 0.02 at Y0 leads to an error of 20000 in Y3. This is called accumulation of error. The relative accumulated error is the ratio of the accumulated error to the exact value of that iteration. In the example we have seen, the relative accumulated error is shown below:

Variable   Exact value   Computed value   Accumulated error   Relative accumulated error
Y0         9.98          10               0.02                0.02/9.98 = 0.002004
Y1         998           1000             2                   2/998 = 0.002004
Y2         99800         100000           200                 200/99800 = 0.002004
Y3         9980000       10000000         20000               20000/9980000 = 0.002004
Notice that the relative accumulated error is the same for all the values.
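The procedure Y(i+1) = 100 Y(i) can be traced with a few lines of code. This is a sketch using exact rational arithmetic (`fractions.Fraction`) so that no additional round-off enters; the function name `accumulate` is hypothetical.

```python
from fractions import Fraction


def accumulate(exact_start, approx_start, factor=100, steps=3):
    """Apply Y(i+1) = factor * Y(i) to the exact and the approximate start value.

    Returns rows of (i, exact, computed, accumulated error, relative error).
    """
    rows = []
    exact, approx = Fraction(exact_start), Fraction(approx_start)
    for i in range(steps + 1):
        error = abs(approx - exact)
        rows.append((i, exact, approx, error, error / exact))
        exact, approx = factor * exact, factor * approx
    return rows


rows = accumulate("9.98", "10")
for i, exact, approx, error, relative in rows:
    print(f"Y{i}: exact={float(exact)} computed={float(approx)} "
          f"error={float(error)} relative={float(relative):.6f}")
```

The absolute error is multiplied by 100 at every step, while the relative error stays at 1/499 (about 0.002004), because the exact and computed values are scaled by the same factor.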
1.2 SOURCES OF ERROR

In a numerical computation, error may arise for the following reasons.
i. Truncation error
ii. Round off error

i. Truncation Error
This is an error inherent in a method. It occurs when an infinite (or long finite) series is truncated to a smaller number of terms. Such errors are essentially algorithmic errors.

ii. Round off error
Round off errors are due to the inability of a computing device to represent certain numbers exactly. Such numbers are necessarily rounded off to their nearest approximation. The size of this error depends on the word size the device uses to represent numbers.
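Both sources of error can be observed in a small experiment. The sketch below is an illustration chosen here, not taken from the text: it truncates the series e^x = 1 + x + x^2/2! + ... after a given number of terms, and it shows a round-off effect of binary floating point. The function name `exp_taylor` is hypothetical.

```python
import math


def exp_taylor(x, terms):
    """Sum the first `terms` terms of the series e^x = 1 + x + x^2/2! + ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)   # next term x^(k+1) / (k+1)!
    return total


# Truncation error: shrinks as more terms of the series are kept.
truncation_errors = [abs(math.exp(1.0) - exp_taylor(1.0, n)) for n in range(1, 11)]
for n, err in enumerate(truncation_errors, start=1):
    print(f"{n:2d} terms: truncation error = {err:.3e}")

# Round-off error: 0.1, 0.2 and 0.3 have no exact binary representation.
roundoff_gap = (0.1 + 0.2) - 0.3
print("(0.1 + 0.2) - 0.3 =", roundoff_gap)
```

The truncation error is controlled by the method (the number of terms kept), whereas the round-off gap is controlled by the machine's number representation.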
1.2.1 Effects of Errors

The measurement of data and calculations with data are not always precise, owing to errors in the measuring instruments and in the method of calculation. In the field of numerical analysis, numerical stability is affected by errors and by the propagation of errors.

1.2.2 Absolute, Relative, And Percentage Errors

The absolute error of a number, measurement, or calculation is the numerical difference between the true value of the quantity and its approximate value as given, or obtained by measurement or calculation. The relative error is the absolute error divided by the true value of the quantity. The percentage error is 100 times the relative error. For example, let $Q$ represent the true value of some quantity. If $\Delta Q$ is the absolute error of an approximate value of $Q$, then

$\Delta Q / Q$ = Relative error of the approximate quantity
$(\Delta Q / Q) \times 100$ = Percentage error of the approximate quantity

If a number is correct to n significant figures, it is evident that its absolute error cannot be greater than half a unit in the nth place. For example, if the number 4.629 is correct to four figures, its absolute error is not greater than 0.001 x ½ = 0.0005.

Remark: It is to be noted that relative and percentage errors are independent of the unit of measurement, whereas absolute errors are expressed in terms of the unit used.
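The half-unit rule can be sketched as follows. The helper `half_unit_bound` is hypothetical, and it assumes that every digit written in the input string is a correct significant figure (so trailing placeholder zeros in an integer such as 46300 would be misread).

```python
from decimal import Decimal


def half_unit_bound(number_text):
    """Largest absolute error of a number whose written digits are all correct.

    The bound is half a unit in the last written place.
    """
    last_place = Decimal(number_text).as_tuple().exponent   # -3 for "4.629"
    return Decimal(5).scaleb(last_place - 1)                # 5 x 10^(last_place - 1)


bound = half_unit_bound("4.629")
relative_bound = bound / Decimal("4.629")
percentage_bound = 100 * relative_bound

print("absolute error   <=", bound)
print("relative error   <=", relative_bound)
print("percentage error <=", percentage_bound)
```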
1.2.3 Relation Between Relative Error And The Number Of Significant Figures

It is commonly believed that the accuracy of a measurement or of a computed result is indicated by the number of decimals required to express it. This belief is erroneous, for the accuracy of a result is indicated by the number of significant figures required to express it. The true index of the accuracy of a measurement or of a calculation is the relative error. For example, if the diameter of a 2-inch steel shaft is measured to the nearest thousandth of an inch, the result is less accurate than the measurement of a mile of railroad track to the nearest foot. For although the absolute errors in the two measurements are 0.0005 inch and 6 inches, respectively, the relative errors are 0.0005/2 = 1/4000 and 6/63,360, or about 1/10,500. Hence in the measurement of the shaft we make an error of one part in 4000, whereas in the case of the railroad we make an error of one part in 10,500. The latter measurement is clearly the more accurate, even though its absolute error is 12,000 times as great.

The relation between the relative error and the number of correct figures is given by the following fundamental theorem:

1.2.4 Theorem I

If the first significant figure of a number is $k$, and the number is correct to $n$ significant figures, then the relative error is less than $1/(k \times 10^{n-1})$.
Before giving a literal proof of this theorem we shall first show that it holds for several numbers picked at random. Henceforth we shall denote absolute and relative errors of numbers by the symbols $E$ and $E_r$, respectively.

1.2.4.1 Example 1

Let us suppose that the number 864.32 is correct to five significant figures. Then $k = 8$, $n = 5$, and $E \le 0.01 \times \tfrac12 = 0.005$. For the relative error we have

$$E_r \le \frac{0.005}{864.32 - 0.005} = \frac{5}{864320 - 5} = \frac{1}{2 \times 86432 - 1} = \frac{1}{2(86432 - 1/2)} < \frac{1}{2 \times 8 \times 10^4} < \frac{1}{8 \times 10^4}.$$

Hence the theorem holds here.

1.2.4.2 Example 2

Consider the number 369,230. Assuming that the last digit (the zero) is written merely to fill the place of a discarded digit and is therefore not a significant figure, we have $k = 3$, $n = 5$, and $E \le 10 \times \tfrac12 = 5$. Then

$$E_r \le \frac{5}{369230 - 5} = \frac{1}{2 \times 36923 - 1} = \frac{1}{2(36923 - 1/2)} < \frac{1}{2 \times 3 \times 10^4} < \frac{1}{3 \times 10^4}.$$

1.2.4.3 Example 3

Suppose the number 0.0800 is correct to three significant figures. Then $k = 8$, $n = 3$, $E \le 0.0001 \times \tfrac12 = 0.00005$, and

$$E_r \le \frac{0.00005}{0.0800 - 0.00005} = \frac{5}{8000 - 5} = \frac{1}{1600 - 1} = \frac{1}{2(800 - 1/2)} < \frac{1}{8 \times 10^2}.$$

It is to be noted that in this example the relative error is not necessarily less than $1/(2k \times 10^{n-1})$, as was the case in Examples 1 and 2 above.
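The three examples can be checked mechanically. The sketch below assumes positive numbers written as decimal strings; the function name `theorem_one_check` is hypothetical. It computes the worst-case relative error E/(N - E), with E half a unit in the nth significant place, and compares it with the limit of Theorem I and with the sharper limit 1/(2k x 10^(n-1)).

```python
from decimal import Decimal
from fractions import Fraction


def theorem_one_check(number_text, n):
    """Return (worst-case relative error, Theorem I limit, sharper limit).

    number_text : positive approximate number as a string
    n           : number of correct significant figures
    """
    number = Decimal(number_text)
    k = number.as_tuple().digits[0]                       # first significant figure
    unit = Fraction(10) ** (number.adjusted() - n + 1)    # one unit in the nth place
    e_abs = unit / 2                                      # E <= half a unit
    worst_relative = e_abs / (Fraction(number) - e_abs)   # E / (N - E)
    theorem_limit = Fraction(1, k * 10 ** (n - 1))
    sharper_limit = Fraction(1, 2 * k * 10 ** (n - 1))
    return worst_relative, theorem_limit, sharper_limit


for text, n in [("864.32", 5), ("369230", 5), ("0.0800", 3)]:
    worst, limit, sharper = theorem_one_check(text, n)
    print(f"{text}: E_r <= {float(worst):.3e}, "
          f"below 1/(k*10^(n-1)): {worst < limit}, "
          f"below 1/(2k*10^(n-1)): {worst < sharper}")
```

For 0.0800 the worst case is 1/1599, which exceeds 1/1600; this is why the sharper limit cannot be claimed in Example 3.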
To prove the theorem generally, let
N = any number (exact value),
n = number of correct significant figures,
m = number of correct decimal places.
Three cases must be distinguished, namely $m < n$, $m = n$, and $m > n$.

Case 1: $m < n$. Here the number of digits in the integral part of N is $n - m$. Denoting the first significant figure of N by $k$, as before, we have

$$E \le \frac{1}{10^m} \times \frac12, \qquad N \ge k \times 10^{n-m-1} - \frac{1}{10^m} \times \frac12.$$

Hence

$$E_r \le \frac{\frac{1}{10^m} \times \frac12}{k \times 10^{n-m-1} - \frac{1}{10^m} \times \frac12} = \frac{10^{-m}}{2k \times 10^{n-1} \times 10^{-m} - 10^{-m}} = \frac{1}{2k \times 10^{n-1} - 1} = \frac{1}{2(k \times 10^{n-1} - 1/2)}.$$

Remembering now that $n$ is a positive integer and that $k$ stands for any one of the digits from 1 to 9 inclusive, we readily see that $2k \times 10^{n-1} - 1 > k \times 10^{n-1}$ in all cases except $k = 1$ and $n = 1$. But this is the trivial case where N = 1, 0.01, etc.; that is, where N contains only one digit different from zero and this digit is 1, a case which would never occur in practice. Hence for all other cases we have $2k \times 10^{n-1} - 1 > k \times 10^{n-1}$, and therefore

$$E_r < \frac{1}{k \times 10^{n-1}}.$$

Case 2: $m = n$. Here N is a decimal and $k$ is the first decimal figure. We then have

$$E \le \frac{1}{10^m} \times \frac12, \qquad N \ge k \times 10^{-1} - \frac{1}{10^m} \times \frac12.$$

$$E_r \le \frac{10^{-m} \times \frac12}{k \times 10^{-1} - 10^{-m} \times \frac12} = \frac{10^{-m}}{2k \times 10^{-1} - 10^{-m}} = \frac{1}{2k \times 10^{m-1} - 1} = \frac{1}{2k \times 10^{n-1} - 1} < \frac{1}{k \times 10^{n-1}}.$$

Case 3: $m > n$. In this case $k$ occupies the $(m - n + 1)$th decimal place and therefore

$$N \ge k \times 10^{-(m-n+1)} - \frac{1}{10^m} \times \frac12, \qquad E \le \frac{1}{10^m} \times \frac12.$$

$$E_r \le \frac{10^{-m} \times \frac12}{k \times 10^{-m} \times 10^{n-1} - 10^{-m} \times \frac12} = \frac{10^{-m}}{2k \times 10^{-m} \times 10^{n-1} - 10^{-m}} = \frac{1}{2k \times 10^{n-1} - 1} < \frac{1}{k \times 10^{n-1}}.$$
The theorem is therefore true in all cases.

Corollary 1: Except in the case of approximate numbers of the form $k(1.000\ldots) \times 10^p$, in which $k$ is the only digit different from zero, the relative error is less than $1/(2k \times 10^{n-1})$.

Corollary 2: If $k \ge 5$ and the given approximate number is not of the form $k(1.000\ldots) \times 10^p$, then $E_r < 1/10^n$; for in this case $2k \ge 10$ and therefore $2k \times 10^{n-1} \ge 10^n$.

To find the number of correct figures corresponding to a given relative error we cannot take the converse of the theorem stated at the beginning of this article, for the converse theorem is not true. In proving the formula for the relative error we took the lower limit for N in order to obtain the upper limit for $E_r$. Thus, for the lower limit of N we took its first significant figure multiplied by a power of 10. In the converse problem of finding the number of correct figures corresponding to a given relative error we must find the upper limit for N. This upper limit will be $k + 1$ times a power of 10, where $k$ is the first significant figure in N. For example, if the approximate value of N is 6895, the lower limit to be used in finding the relative error is $6 \times 10^3$, whereas the upper limit to be used in finding the absolute error is $7 \times 10^3$.

To solve the converse problem utilize Theorem II, given below:
1.2.5 Theorem II

If the relative error in an approximate number is less than $1/[(k+1) \times 10^{n-1}]$, the number is correct to $n$ significant figures, or at least is in error by less than a unit in the nth significant figure.

To prove this theorem let
N = the given number (exact value),
n = number of correct significant figures in N,
k = first significant figure in N,
p = number of digits in the integral part of N.

Then $n - p$ = number of decimals in N, and

$$N \le (k+1) \times 10^{p-1}.$$

Let

$$E_r < \frac{1}{(k+1) \times 10^{n-1}}.$$

Then

$$E < (k+1) \times 10^{p-1} \times \frac{1}{(k+1) \times 10^{n-1}} = \frac{1}{10^{n-p}}.$$

Now $1/10^{n-p}$ is one unit in the $(n-p)$th decimal place, or in the nth significant figure. Hence the absolute error E is less than a unit in the nth significant figure.

If the given number is a pure decimal, let p = number of zeros between the decimal point and the first significant figure. Then $n + p$ = number of decimals in N, and

$$N \le \frac{k+1}{10^{p+1}}.$$

Hence if

$$E_r < \frac{1}{(k+1) \times 10^{n-1}},$$

we have

$$E < \frac{k+1}{10^{p+1}} \times \frac{1}{(k+1) \times 10^{n-1}} = \frac{1}{10^{n+p}}.$$

But $1/10^{n+p}$ is one unit in the $(n+p)$th decimal place, or in the nth significant figure. Hence the absolute error E is less than a unit in the nth significant figure.
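Theorem II can be applied mechanically: given the first significant figure k and a relative error, find the largest n for which the relative error is still less than 1/((k+1) x 10^(n-1)). The sketch below does this by direct search. The function name `guaranteed_figures` and the sample relative error of 1/10000 are assumptions made for illustration; only the number 6895 (first figure k = 6) comes from the text.

```python
from fractions import Fraction


def guaranteed_figures(k, relative_error):
    """Largest n with relative_error < 1 / ((k + 1) * 10**(n - 1)).

    k              : first significant figure of the approximate number (1..9)
    relative_error : positive relative error
    Returns 0 when not even the first figure is guaranteed.
    """
    if relative_error <= 0:
        raise ValueError("relative_error must be positive")
    n = 0
    # Test the condition of Theorem II for n + 1 figures and advance while it holds.
    while relative_error < Fraction(1, (k + 1) * 10 ** n):
        n += 1
    return n


# Hypothetical case: N is approximately 6895 (k = 6), relative error 1/10000.
figures = guaranteed_figures(6, Fraction(1, 10000))
print("significant figures guaranteed by Theorem II:", figures)
```

With these values the absolute error is about 0.69, which is indeed less than one unit in the fourth significant figure of 6895.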
Remark The absolute error is connected with the number of decimal places, whereas the relative error is connected with the number of significant figures.
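1.2.6 The General Formula for Errors

Let

$$N = f(u_1, u_2, u_3, \ldots, u_n) \tag{1}$$

denote any function of several independent quantities $u_1, u_2, \ldots, u_n$, which are subject to the errors $\Delta u_1, \Delta u_2, \ldots, \Delta u_n$, respectively. These errors in the u's will cause an error $\Delta N$ in the function N, according to the relation

$$N + \Delta N = f(u_1 + \Delta u_1, u_2 + \Delta u_2, \ldots, u_n + \Delta u_n). \tag{2}$$

To find an expression for $\Delta N$ we must expand the right-hand member of (2) by Taylor's theorem for a function of several variables. Hence we have

$$f(u_1 + \Delta u_1, \ldots, u_n + \Delta u_n) = f(u_1, \ldots, u_n) + \Delta u_1 \frac{\partial f}{\partial u_1} + \Delta u_2 \frac{\partial f}{\partial u_2} + \cdots + \Delta u_n \frac{\partial f}{\partial u_n} + \frac12\left[(\Delta u_1)^2 \frac{\partial^2 f}{\partial u_1^2} + \cdots + (\Delta u_n)^2 \frac{\partial^2 f}{\partial u_n^2} + 2\,\Delta u_1 \Delta u_2 \frac{\partial^2 f}{\partial u_1 \partial u_2} + \cdots\right] + \cdots$$

Now since the errors $\Delta u_1, \Delta u_2, \ldots, \Delta u_n$ are always relatively small, we may neglect their squares, products, and higher powers and write

$$N + \Delta N = f(u_1, u_2, \ldots, u_n) + \Delta u_1 \frac{\partial f}{\partial u_1} + \Delta u_2 \frac{\partial f}{\partial u_2} + \cdots + \Delta u_n \frac{\partial f}{\partial u_n}. \tag{3}$$

Note: A quantity P is said to be relatively small in comparison with a second quantity Q when the ratio P/Q is small in comparison with unity. The squares and products of such small ratios are negligible.

Subtracting (1) from (3), we get

$$\Delta N = \frac{\partial f}{\partial u_1}\Delta u_1 + \frac{\partial f}{\partial u_2}\Delta u_2 + \cdots + \frac{\partial f}{\partial u_n}\Delta u_n,$$

or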
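$$\Delta N = \frac{\partial N}{\partial u_1}\Delta u_1 + \frac{\partial N}{\partial u_2}\Delta u_2 + \frac{\partial N}{\partial u_3}\Delta u_3 + \cdots + \frac{\partial N}{\partial u_n}\Delta u_n. \tag{6.1}$$

This is the general formula for computing the error of a function, and it includes all possible cases. It will be observed that the right-hand member of (6.1) is merely the total differential of the function N.

For the relative error of the function N we have

$$E_r = \frac{\Delta N}{N} = \frac{\partial N}{\partial u_1}\frac{\Delta u_1}{N} + \frac{\partial N}{\partial u_2}\frac{\Delta u_2}{N} + \cdots + \frac{\partial N}{\partial u_n}\frac{\Delta u_n}{N}. \tag{6.2}$$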
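Formulas (6.1) and (6.2) can be evaluated numerically when the partial derivatives are inconvenient to write down. The sketch below estimates each partial derivative by a central difference, which is an approximation technique chosen here and not part of the text. The function `total_differential`, the sample function `area`, and the sample values are hypothetical.

```python
def total_differential(f, u, du, h=1e-6):
    """Approximate Delta N = sum_i (dN/du_i) * Delta u_i   (formula 6.1).

    f  : function of several variables, called as f(*u)
    u  : values u_1..u_n at which the derivatives are taken
    du : errors Delta u_1..Delta u_n
    h  : step of the central-difference derivative estimate (an assumption)
    """
    delta_n = 0.0
    for i, step in enumerate(du):
        up, down = list(u), list(u)
        up[i] += h
        down[i] -= h
        partial = (f(*up) - f(*down)) / (2 * h)   # estimate of dN/du_i
        delta_n += partial * step
    return delta_n


def area(length, width):
    """Hypothetical example function N = length * width."""
    return length * width


u = (2.0, 3.0)       # measured values
du = (0.01, 0.02)    # their errors

delta_area = total_differential(area, u, du)
relative_error = delta_area / area(*u)           # formula (6.2)
print("Delta N =", delta_area)
print("E_r     =", relative_error)
```

For this product the exact total differential is 3 x 0.01 + 2 x 0.02 = 0.07, which the estimate reproduces up to floating point noise.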
When N is a function of the form

$$N = \frac{K a^m b^n c^p}{d^q e^r}$$

then by (6.2) the relative error is

$$E_r = \frac{\Delta N}{N} = m\frac{\Delta a}{a} + n\frac{\Delta b}{b} + p\frac{\Delta c}{c} - q\frac{\Delta d}{d} - r\frac{\Delta e}{e}.$$

But since the errors $\Delta a, \ldots, \Delta e$ are just as likely to be negative as positive, we must take all the terms with the positive sign in order to be sure of the maximum error in the function N. Hence we write

$$E_r \le m\left|\frac{\Delta a}{a}\right| + n\left|\frac{\Delta b}{b}\right| + p\left|\frac{\Delta c}{c}\right| + q\left|\frac{\Delta d}{d}\right| + r\left|\frac{\Delta e}{e}\right|.$$

Check Your Progress
1. Which is important in an error: its magnitude or its sign?
2. The concept of relative error is the ___ absolute error.
3. The omission of certain digits from a number results in an error known as ____.

1.3 PROPAGATION OF ERROR

1.3.1 Explanation: Propagation Of Errors

Suppose that $E(n)$ represents the growth of error after $n$ steps, where $\varepsilon$ is the initial error. If $|E(n)| \approx n\varepsilon$, the growth of error is said to be linear. If $|E(n)| \approx K^n \varepsilon$, the growth of error is called exponential. If $K > 1$, the exponential error grows without bound as $n \to \infty$, and if $0 < K < 1$, the exponential error diminishes to zero as $n \to \infty$.
Let us investigate how error might be propagated in successive computations. Consider the addition of two numbers $p$ and $q$ (the true values) with the approximate values $\tilde{p}$ and $\tilde{q}$, which contain errors $\varepsilon_p$ and $\varepsilon_q$, respectively. Starting with $p = \tilde{p} + \varepsilon_p$ and $q = \tilde{q} + \varepsilon_q$, the sum is

$$p + q = (\tilde{p} + \varepsilon_p) + (\tilde{q} + \varepsilon_q) = (\tilde{p} + \tilde{q}) + (\varepsilon_p + \varepsilon_q). \tag{1}$$
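Hence, for addition, the error in the sum is the sum of the errors of the addends.

The propagation of error in multiplication is more complicated. The product is

$$pq = (\tilde{p} + \varepsilon_p)(\tilde{q} + \varepsilon_q) = \tilde{p}\tilde{q} + \tilde{p}\varepsilon_q + \tilde{q}\varepsilon_p + \varepsilon_p\varepsilon_q. \tag{2}$$

Hence, if $\tilde{p}$ and $\tilde{q}$ are larger than 1 in absolute value, the terms $\tilde{p}\varepsilon_q$ and $\tilde{q}\varepsilon_p$ show that there is a possibility of magnification of the original errors $\varepsilon_p$ and $\varepsilon_q$. Insights are gained if we look at the relative error. Rearrange the terms in (2) to get

$$pq - \tilde{p}\tilde{q} = \tilde{p}\varepsilon_q + \tilde{q}\varepsilon_p + \varepsilon_p\varepsilon_q. \tag{3}$$

Suppose that $p \neq 0$ and $q \neq 0$; then we can divide (3) by $pq$ and obtain

$$\frac{pq - \tilde{p}\tilde{q}}{pq} = \frac{\tilde{p}\varepsilon_q + \tilde{q}\varepsilon_p + \varepsilon_p\varepsilon_q}{pq} = \frac{\tilde{p}\varepsilon_q}{pq} + \frac{\tilde{q}\varepsilon_p}{pq} + \frac{\varepsilon_p\varepsilon_q}{pq}.$$

Furthermore, suppose that $\tilde{p}/p \approx 1$, $\tilde{q}/q \approx 1$, and $(\varepsilon_p/p)(\varepsilon_q/q) = R_p R_q \approx 0$. Then making these substitutions yields the simplified relationship

$$\frac{pq - \tilde{p}\tilde{q}}{pq} \approx \frac{\varepsilon_q}{q} + \frac{\varepsilon_p}{p} + 0 = R_q + R_p. \tag{4}$$

This shows that the relative error in the product $pq$ is approximately the sum of the relative errors in the approximations $\tilde{p}$ and $\tilde{q}$.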
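The two propagation rules can be verified on concrete numbers. The sketch below uses exact rational arithmetic; the function name `propagation` and the sample values (p = 100 approximated by 99, q = 50 approximated by 49) are hypothetical.

```python
from fractions import Fraction


def propagation(p, p_approx, q, q_approx):
    """Compare the errors of a sum and a product with the errors of the operands."""
    eps_p, eps_q = p - p_approx, q - q_approx        # absolute errors of the operands
    r_p, r_q = eps_p / p, eps_q / q                   # relative errors R_p and R_q
    return {
        "eps_p": eps_p,
        "eps_q": eps_q,
        "r_p": r_p,
        "r_q": r_q,
        "sum_error": (p + q) - (p_approx + q_approx),
        "product_relative": (p * q - p_approx * q_approx) / (p * q),
    }


result = propagation(Fraction(100), Fraction(99), Fraction(50), Fraction(49))

print("error of the sum          :", result["sum_error"])
print("eps_p + eps_q             :", result["eps_p"] + result["eps_q"])
print("relative error of product :", float(result["product_relative"]))
print("R_p + R_q                 :", float(result["r_p"] + result["r_q"]))
```

The error of the sum equals the sum of the errors exactly. For the product, the relative error and R_p + R_q differ only by the small term R_p R_q, which is the quantity neglected in the simplified relationship.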
Often an initial error will be propagated in a sequence of calculations. A quality which is desirable for any numerical process is that a small error in the initial conditions will produce small changes in the final result. An algorithm with this feature is called stable; otherwise, it is called unstable. Whenever possible we shall choose methods that are stable. The definition of linear and exponential growth of error given at the beginning of this section is used to describe the propagation of error.

Check Your Progress
4. What are the components of inherent errors?
5. Will the round-off errors accumulate with the increasing number of computations?
6. Is relative error independent of the unit?
7. Are absolute errors expressed in terms of units?
1.4 EFFECT OF AN ERROR IN A TABULAR VALUE

Let y0, y1, y2, . . . yn be the true values of a function, and suppose the value y5 to be affected with an error ε, so that its erroneous value is y5 + ε. Then the successive differences of the y's are as shown below:

Table Showing The Effect Of An Error In The Tabular Value

y          Δy          Δ²y          Δ³y          Δ⁴y
y0
           Δy0
y1                     Δ²y0
           Δy1                      Δ³y0
y2                     Δ²y1                      Δ⁴y0
           Δy2                      Δ³y1
y3                     Δ²y2                      Δ⁴y1 + ε
           Δy3                      Δ³y2 + ε
y4                     Δ²y3 + ε                  Δ⁴y2 - 4ε
           Δy4 + ε                  Δ³y3 - 3ε
y5 + ε                 Δ²y4 - 2ε                 Δ⁴y3 + 6ε
           Δy5 - ε                  Δ³y4 + 3ε
y6                     Δ²y5 + ε                  Δ⁴y4 - 4ε
           Δy6                      Δ³y5 - ε
y7                     Δ²y6                      Δ⁴y5 + ε
           Δy7                      Δ³y6
y8                     Δ²y7                      Δ⁴y6
           Δy8                      Δ³y7
y9                     Δ²y8
           Δy9
y10

This table shows that the effect of an error increases with the successive differences, that the coefficients of the ε's are the binomial coefficients with alternating signs, and that the algebraic sum of the errors in any difference column is zero. It shows also that the maximum error in the differences is in the same horizontal line as the erroneous tabular value.

The following table shows the effect of an error in a horizontal difference table:

Table 6

y          Δ₁y          Δ₂y           Δ₃y           Δ₄y
y0
y1         Δ₁y1
y2         Δ₁y2         Δ₂y2
y3         Δ₁y3         Δ₂y3          Δ₃y3
y4         Δ₁y4         Δ₂y4          Δ₃y4          Δ₄y4
y5 + ε     Δ₁y5 + ε     Δ₂y5 + ε      Δ₃y5 + ε      Δ₄y5 + ε
y6         Δ₁y6 - ε     Δ₂y6 - 2ε     Δ₃y6 - 3ε     Δ₄y6 - 4ε
y7         Δ₁y7         Δ₂y7 + ε      Δ₃y7 + 3ε     Δ₄y7 + 6ε
y8         Δ₁y8         Δ₂y8          Δ₃y8 - ε      Δ₄y8 - 4ε
y9         Δ₁y9         Δ₂y9          Δ₃y9          Δ₄y9 + ε
y10        Δ₁y10        Δ₂y10         Δ₃y10         Δ₄y10
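Because differencing is a linear operation, the contribution of the error ε to each difference column does not depend on the true values. It can therefore be found by differencing a table that contains only the error. The sketch below does this for a unit error (ε = 1) in y5; the function name `difference_columns` is hypothetical.

```python
def difference_columns(values, max_order):
    """Return [values, first differences, second differences, ...] up to max_order."""
    columns = [list(values)]
    for _ in range(max_order):
        previous = columns[-1]
        columns.append([b - a for a, b in zip(previous, previous[1:])])
    return columns


# A table y0..y10 holding only the error: epsilon = 1 in y5, zero elsewhere.
error_only = [0] * 11
error_only[5] = 1

columns = difference_columns(error_only, 4)
for order, column in enumerate(columns):
    print(f"order {order}: {column}")
```

The nonzero entries of each column are the binomial coefficients with alternating signs (1, -1; 1, -2, 1; 1, -3, 3, -1; 1, -4, 6, -4, 1), and every difference column sums to zero.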
Here, again, the effect of the error is the same as in the preceding table, but in this table the first erroneous difference of any order is in the same horizontal line as the erroneous tabular value.

The law according to which an error is propagated in a difference table enables us to trace such an error to its source and correct it. As an illustration of the process of detecting and correcting an error in a tabulated function, let us consider the following table, in which the differences are given in units of the fifth decimal place:

x      y         Δ₁y     Δ₂y     Δ₃y    Δ₄y
0.10   0.09983
0.15   0.14944   4961
0.20   0.19867   4923    -38
0.25   0.24740   4873    -50     -12
0.30   0.29552   4812    -61     -11      1
0.35   0.34290   4738    -74     -13     -2
0.40   0.38945   4655    -83      -9      4
0.45   0.43497   4552   -103     -20    -11
0.50   0.47943   4446   -106      -3     17
0.55   0.52269   4326   -120     -14    -11
0.60   0.56464   4195   -131     -11      3
0.65   0.60519   4055   -140      -9      2
0.70   0.64422   3903   -152     -12     -3

Here the third differences are quite irregular near the middle of the column, and the fourth differences are still more irregular. The irregularity begins in each column on the horizontal line corresponding to x = 0.40. Since the algebraic sum of the fourth differences is 0, their average value is zero; hence the fourth differences found in this example are mostly accumulated errors. Referring now to Table 6, the fourth differences -11, 17, -11 correspond to -4ε, 6ε, -4ε, so that we have

-4ε = -11, 6ε = 17, etc.

Hence, ε = 3 to the nearest unit. The true value of y corresponding to x = 0.40 is therefore 0.38945 - 0.00003 = 0.38942, since (yk + ε) - ε = yk. The columns of differences can now be corrected, and it will be found that the third differences are practically constant.

1.5 ACCURACY AND PRECISION

In measurements and calculations accuracy and precision are the two important characteristics. Accuracy refers to the agreement of the measured or calculated value with the true value. Precision refers to how closely values agree with each other.
In numerical analysis it is found in practice that computed solutions are not exact mathematical solutions. Accuracy in this respect can be achieved in several ways. Hence, it becomes necessary to understand error, the analysis of error, and approximation for the development of numerical methods.

1.6 APPROXIMATE NUMBERS AND SIGNIFICANT FIGURES

(a) Approximate Numbers
In the discussion of approximate computation, it is convenient to make a distinction between numbers which are absolutely exact and those which express approximate values. Such numbers as 2, 1/3, 100, etc. are exact numbers because there is no approximation or uncertainty associated with them. Although such numbers as $\pi$, $\sqrt{2}$, $e$, etc. are exact numbers, they cannot be expressed exactly by a finite number of digits. When expressed in digital form, they must be written as 3.1416, 1.4142, etc. Such numbers are therefore only approximations to the true values and in such cases are called approximate numbers. An approximate number is therefore defined as a number which is used as an approximation to an exact number and differs only slightly from the exact number for which it stands.

(b) Significant figures
A significant figure is any one of the digits 1, 2, 3, ... 9; and 0 is a significant figure except when it is used to fix the decimal point or to fill the places of unknown or discarded digits. Thus, in the number 0.00263 the significant figures are 2, 6, 3; the zeros are used merely to fix the decimal point and are therefore not significant. In the number 3809, however, all the digits, including the zero, are significant figures. In a number like 46300 there is nothing in the number as written to show whether or not the zeros are significant figures. The ambiguity can be removed by writing the number in the powers-of-ten notation as $4.63 \times 10^4$, $4.630 \times 10^4$, or $4.6300 \times 10^4$, the number of significant figures being indicated by the factor at the left.

1.6.1 Approximation

Approximation is usually applied to numbers. Its use can also be extended to mathematical functions, shapes and physical laws. Approximation occurs in numerical methods and analysis when an exact form or an exact numerical value is not known. By applying approximation it is possible to represent the real form so closely that no significant deviation can be found.
1.7 ROUNDING OF NUMBERS

If we attempt to divide 27 by 13.1, we get 27/13.1 = 2.061068702 . . ., a quotient which never terminates. In order to use such a number in a practical computation, we must cut it down to a manageable form, such as 2.06, or 2.061, or 2.06107, etc. This process of cutting off superfluous digits and retaining as many as desired is called rounding off.

To round off or simply round a number is to retain a certain number of digits, counted from the left, and drop the others. Thus, to round off $\pi$ to three, four, five and six figures, respectively, we have 3.14, 3.142, 3.1416, 3.14159. Numbers are rounded off so as to cause the least possible error. This is attained by rounding according to the following rule:

To round off a number to n significant figures, discard all digits to the right of the nth place. If the discarded number is less than half a unit in the nth place, leave the nth digit unchanged; if the discarded number is greater than half a unit in the nth place, add 1 to the nth digit. If the discarded number is exactly half a unit in the nth place, leave the nth digit unaltered if it is an even number, but increase it by 1 if it is an odd number; in other words, round off so as to leave the nth digit an even number in such cases.

When a number has been rounded off according to the rule just stated, it is said to be correct to n significant figures. The following numbers are rounded off correctly to four significant figures:

29.63243    becomes   29.63
81.9773     becomes   81.98
4.4995001   becomes   4.500
11.64489    becomes   11.64
48.365      becomes   48.36
67.495      becomes   67.50
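The rounding rule stated above (round to the nearest value, and to the even digit when the discarded part is exactly half a unit) is available in Python's standard `decimal` module as `ROUND_HALF_EVEN`. The sketch below applies it to n significant figures; the helper name `round_significant` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_significant(number_text, n):
    """Round a decimal string to n significant figures, halves going to the even digit."""
    number = Decimal(number_text)
    last_place = number.adjusted() - n + 1   # exponent of the nth significant place
    return number.quantize(Decimal(1).scaleb(last_place), rounding=ROUND_HALF_EVEN)


examples = ["29.63243", "81.9773", "4.4995001", "11.64489", "48.365"]
rounded = [str(round_significant(text, 4)) for text in examples]

for text, result in zip(examples, rounded):
    print(f"{text:>10} becomes {result}")
```

Numbers are passed as strings so that the digits are taken exactly as written; converting through binary floating point first could change a trailing 5 into a slightly smaller or larger value.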
When the above rule is followed consistently, the errors due to rounding are largely cancelled by one another. Such is not the case, however, if the computer follows an old rule which is sometimes advocated. The old rule says that when a 5 is dropped the preceding digit should always be increased by1. This is bad advice and is conducive to an accumulation of rounding errors and therefore to inaccuracy in computation. It should be obvious to any thinking person that when a 5 is cut off the preceding digit should be increased by 1 in only half the cases and should be left unchanged in the other half.
Since even and odd digits occur with equal
frequency, on the average, the rule that the odd digits be increased by 1 when a 5 is dropped is logically sound. The case where the number to be discarded is exactly half a unit in the nth place deserves further comment.
From purely logical considerations the digit preceding the
discarded 5000... might just as well be left odd, but there is a practical aspect to the matter. Rounded numbers must often be divided by other numbers, and it is highly desirable from the standpoint of accuracy that the division be exact as often as possible. An even number is
FOR MORE DETIALS VISIT US ON WWW.IMTSINSTITUTE.COM OR CALL ON +91-9999554621
NUMERICAL & STATISTICAL METHODS always divisible by 2, it may be divisible by other even numbers, and it may also be divisible by several odd numbers; whereas an odd number is not divisible by any even number and it may not be divisible by any odd number. Hence, in general, even numbers are exactly divisible by many more numbers than are odd numbers, and therefore there will be fewer leftover errors in a computation when the rounded numbers are left even. The rule that the last digit be left even rather than odd is thus conducive to accuracy in computation. In certain rare instances the rule for cutting off 50000 . . . should be modified. For example, if a 5 is to be cut off from two or more numbers in a column that is to be added, the preceding digit should be increased by 1 in half the cases and left unchanged in the other half, regardless of whether the preceding digit is even or odd. Other cases might arise where common sense should be the guide in making the errors neutralize one another. 1.8 NUMERICAL STABILITY In numerical analysis, numerical stability is a desirable property of numerical algorithms. Explanation: Sometimes a single calculations can be achieved in several ways, all of which are algebraically equivalent in terms of ideal real or complex numbers, but in practice when performed an digital computers yield different results. Some calculations might damp out approximation errors that occur, others might magnify such errors. Calculations that do not magnify approximation errors are called numerically stable. The definition of stability precisely depends on the context, but it is related to the accuracy of the algorithm.
The phenomenon that numerical stability guards against is 'instability'. A calculation can be swamped by errors even though the mathematics behind it has been applied correctly. Even if there were no round-off or truncation errors, the specific computational method employed can magnify small errors instead of damping them, leading to an enormous error in the final result. This phenomenon is called instability. One of the most important tasks of numerical analysis is therefore to select algorithms that are robust, that is to say, that have good numerical stability among other desirable properties.
EXAMPLE
Consider adding an array of 100 numbers using a program that keeps only two significant digits. Suppose that one element of the array is 1.0 and the other 99 elements are each 0.01. An unstable algorithm adds the elements in the order given, starting from 1.0. Each addition of 0.01 to 1.0 gives 1.01, which rounds back to 1.0, so the additions have no effect and the final answer is 1.0, a poor approximation of the true sum 1.99. A stable algorithm would first sort the array by the absolute values of the elements in ascending order. That ensures that the numbers closest to zero are taken into consideration first. Once that change is made, all of the 0.01 elements are added first, giving 0.99, and then the 1.0 element is added, yielding a rounded result of 2.0, a much better approximation of the true sum.
Check Your Progress
8. "The absolute error of a sum of approximate numbers is equal to the algebraic sum of their absolute errors." Is the statement true or false?
9. The percentage error is ___ times the relative error.
10. Can you express $\sqrt{2}$ exactly by a finite number of digits?
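The summation example of Section 1.8 can be reproduced with a short sketch. It assumes that "two-digit precision" means two significant decimal digits and emulates that with Python's decimal module; the helper name rounded_sum is illustrative and not part of the text.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumption: "two-digit precision" = two significant decimal digits.
ctx = Context(prec=2, rounding=ROUND_HALF_EVEN)

def rounded_sum(values):
    """Add the values left to right, rounding after every addition."""
    total = Decimal(0)
    for v in values:
        total = ctx.add(total, v)
    return total

# One element equal to 1.0 followed by 99 elements equal to 0.01.
values = [Decimal("1.0")] + [Decimal("0.01")] * 99

naive = rounded_sum(values)                    # large element first
stable = rounded_sum(sorted(values, key=abs))  # smallest magnitudes first

print(naive, stable)
```

Sorting by absolute value lets the small terms accumulate to 0.99 before the large term is added, which is the design choice the example describes.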
1.9 SUMMARY
In any newly developed method it is necessary to identify the error, determine its nature and analyse its order. Without error analysis a newly developed method is of little use. An approximation is an inexact representation of something that is still close enough to be useful. It applies not only to numbers but also to shapes and physical laws. In numerical analysis, practice shows that computed solutions are not exact mathematical solutions, and acceptable accuracy can be achieved in several ways. Hence it becomes necessary to understand errors, error analysis and approximation for the development of numerical methods.
1.10 KEYWORDS
Approximation - The action of estimating something fairly accurately.
Conversion error - Also known as representation error; it arises from the limitation of the computer in storing data exactly.
Error propagation - In a process, an error is communicated successively through every step of the computation and finally affects the total error.
Numerical - Having to do with a number or numbers.
Significant figure - Each of the digits of a number that are used to express it to the required degree of accuracy.
Stability - Firmly fixed or not likely to change.
Theorem - A general proposition or rule that can be proved by reasoning.
The converse - The opposite of a fact or statement.
1.11 ANSWERS TO CHECK YOUR PROGRESS
1. Magnitude
2. Normalised
3. Round-off error
4. i) Data errors and ii) Conversion errors
5. Yes
6. Yes
7. Yes
8. True
9. 100
10. Cannot
1.12 EXERCISE QUESTIONS
1. The correct value and the computed value of certain quantities are given below. Find the error, absolute error, relative error and percentage error.

        Correct value    Computed value
(i)     1.5782           1.6782
(ii)    -18.1792         -18.01000
(iii)   15.000           15.8172
2. Explain the concept of significant digits.
3. Describe the relationship between significant digits and the following:
a) round-off errors
b) accuracy
c) precision
4. What are inherent errors? How do they arise? 5. Estimate the relative error of the final result in evaluation of (i) w1 = (x+y) /2
and
2
(ii) w2 = x +y/2
Given that x = 1.2, y = 25.6 and z = 4.5.
6. Find how many figures of the quotient 4.89π/6.7 are trustworthy, assuming that the denominator is true to only two figures.
7. The hypotenuse and a side of a right triangle are found by measurement to be 75 and 32, respectively. If the possible error in the hypotenuse is 0.2 and that in the side is 0.1, find the possible error in the computed angle A.
8. Distinguish between round-off errors and truncation errors.
9. Find the value of
10 –. correct to five significant figures.
10. Prove that the procedure w1 = (y+z)+x is better than the procedure w2 = (x+y)+z when |x| > |y| > |z|.
1.13 FURTHER READINGS
Text Books
1. Introductory Methods of Numerical Analysis, S.S. Sastry, Prentice-Hall of India Pvt. Ltd., New Delhi.
2. Numerical Methods for Scientific and Engineering Computation, M.K. Jain, S.R.K. Iyengar and R.K. Jain, Wiley Eastern Ltd., New Delhi.
3. Numerical Methods, E. Balagurusamy, Tata McGraw Hill Pub. Co. Ltd. (Third Print 2000), New Delhi-110008.
4. Fortran 77 and Numerical Methods, C. Xavier, New Age International (P) Ltd., Publishers, New Delhi.
References
1. Numerical Mathematical Analysis, James B. Scarborough, Oxford University Press (1950).
2. Numerical Methods in Engineering, M.G. Salvadori and M.L. Baron, Prentice-Hall of India, New Delhi.
3. Numerical Methods: Principles, Analyses and Algorithms, Srimanta Pal, Oxford University Press, India.
UNIT –II
STATISTICAL ANALYSIS
STRUCTURE:
Learning objectives
2.0 Introduction
2.1 Analysis of variance (ANOVA)
2.2 Definition and assumption
2.3 One-way classified data
2.4 Two-way classified data
2.5 ANOVA table
2.6 Summary
2.7 Keywords
2.8 Answer for self assessment
2.9 Exercise Questions
2.10 Reference Books
LEARNING OBJECTIVES:
After studying this unit you should be able to:
implement the basic statistical methods and models;
explain one-way and two-way classification, with examples;
describe the different tests used in statistics for analysing data;
apply these statistical methods and models in statistical computation.
2.0 INTRODUCTION
Before data can be analysed they must first be classified; a sound classification is what makes the subsequent tests meaningful. The analysis of variance (ANOVA) is a tool for tests of significance. It considers several samples at once, and its purpose is to test the homogeneity of several means. The variation in the data is classified into two kinds according to its causes: assignable causes and chance causes.
In the ANOVA method the data are classified one-way or two-way, and the results are summarised in an ANOVA table. To apply these tests we must also know the basic t-test, F-test and chi-square test, which are used to analyse significance in different situations.
2.1 ANALYSIS OF VARIANCE (ANOVA)
The analysis of variance is a powerful statistical tool for tests of significance. The test of significance based on the t-distribution is an adequate procedure only for testing the significance of the difference between two sample means. When we have three or more samples to consider at a time, an alternative procedure is needed for testing the hypothesis that all the samples are drawn from the same population, i.e., that they have the same mean.
For example, suppose five fertilizers are applied to four plots each of wheat, and the yield of wheat on each plot is given. We may be interested in finding out whether the effects of these fertilizers on the yields are significantly different or, in other words, whether the samples have come from the same normal population. The answer to this problem is provided by the technique of analysis of variance.
Thus the basic purpose of the analysis of variance is to test the homogeneity of several means. The total variation in any set of numerical data is due to a number of causes, which may be classified as
(i) assignable causes
(ii) chance causes
The variation due to assignable causes can be detected and measured, whereas the variation due to chance causes is beyond human control and cannot be traced separately.
2.1.1 DEFINITION:
According to Professor R.A. Fisher, analysis of variance (ANOVA) is the separation of the variance ascribable to one group of causes from the variance ascribable to other groups.
2.1.2 APPLICATIONS OF ANALYSIS OF VARIANCE:
In addition to testing the homogeneity of several means, the ANOVA technique is now frequently applied in testing the linearity of the fitted regression line or the significance of the correlation ratio η.
2.2 DEFINITION AND ASSUMPTION:
For the validity of the F-test in ANOVA, the following assumptions are made:
(i) The observations are independent.
(ii) The parent population from which the observations are taken is normal.
(iii) The various treatment and environmental effects are additive in nature.
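Cochran's theorem: [important for classification]
Let $x_1, x_2, \ldots, x_n$ denote a random sample from a normal population $N(0, \sigma^2)$. Let the sum of the squares of these values be written in the form
$\sum_{i=1}^{n} x_i^2 = Q_1 + Q_2 + \cdots + Q_k$,
where $Q_j$ is a quadratic form in $x_1, x_2, \ldots, x_n$ with rank (degrees of freedom) $r_j$, $j = 1, 2, \ldots, k$. Then the random variables $Q_1, Q_2, \ldots, Q_k$ are mutually independent and $Q_j/\sigma^2$ is a $\chi^2$ variate with $r_j$ degrees of freedom if and only if $\sum_{j=1}^{k} r_j = n$.
One-way classification:
Let us suppose that N observations $x_{ij}$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$) of a random variable x are grouped, on some basis, into k classes of sizes $n_1, n_2, \ldots, n_k$ respectively ($N = \sum_{i=1}^{k} n_i$), as exhibited below:

Class   Observations                               Mean              Total
1       $x_{11}$  $x_{12}$  ...  $x_{1n_1}$        $\bar{x}_{1.}$    $T_{1.}$
2       $x_{21}$  $x_{22}$  ...  $x_{2n_2}$        $\bar{x}_{2.}$    $T_{2.}$
...
i       $x_{i1}$  $x_{i2}$  ...  $x_{in_i}$        $\bar{x}_{i.}$    $T_{i.}$
...
k       $x_{k1}$  $x_{k2}$  ...  $x_{kn_k}$        $\bar{x}_{k.}$    $T_{k.}$
                                                                     G (grand total)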
The total variation in the observations $x_{ij}$ can be split into two components:
(i) The variation between the classes, or the variation due to the different bases of classification, commonly known as treatments.
(ii) The variation within the classes, i.e., the inherent variation of the random variable among the observations of a class.
The first type of variation is due to assignable causes, which can be detected and controlled by human endeavour, and the second type is due to chance causes, which are beyond human control.
The main objective of ANOVA is to examine whether there is a significant difference between the class means in view of the inherent variability within the separate classes. In particular, let us consider the effect of k different rations on the milk yield of N cows (of the same breed and stock) divided into k classes of sizes $n_1, n_2, \ldots, n_k$ respectively, $N = \sum_{i=1}^{k} n_i$. Here the sources of variation are
(i) the effect of the ration (treatment) $t_i$; $i = 1, 2, \ldots, k$;
(ii) the error due to chance.
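2.2.1 MATHEMATICAL MODEL:
The linear mathematical model will be
$x_{ij} = \mu + \alpha_i + \varepsilon_{ij}$,
where
(i) $x_{ij}$ is the yield from the jth cow fed on the ith ration ($i = 1, 2, \ldots, k$);
(ii) $\mu$ is the general mean effect, $\mu = \sum_{i} n_i \mu_i / N$;
(iii) $\alpha_i$ is the effect of the ith ration, given by $\alpha_i = \mu_i - \mu$, $i = 1, 2, \ldots, k$, where $\mu_i$ is the mean yield under the ith ration;
(iv) $\varepsilon_{ij}$ is the error effect due to chance.
Assumptions in the model:
(i) All the observations $x_{ij}$ are independent.
(ii) The different effects are additive in nature.
(iii) The $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma_e^2)$.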
2.2.2 NULL HYPOTHESIS:
We want to test the equality of the population means, i.e., the homogeneity of the different rations. Hence the null hypothesis is given by
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k = \mu$, or equivalently $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$.
Statistical analysis of the model:
Let $\bar{x}_{i.}$ = mean of the ith class $= \frac{1}{n_i}\sum_{j} x_{ij}$ ($i = 1, 2, \ldots, k$)
and $\bar{x}_{..}$ = overall mean $= \frac{1}{N}\sum_i\sum_j x_{ij} = \frac{1}{N}\sum_i n_i \bar{x}_{i.}$.
Consider
$\sum_i\sum_j (x_{ij} - \bar{x}_{..})^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{i.} + \bar{x}_{i.} - \bar{x}_{..})^2$
$= \sum_i\sum_j (x_{ij} - \bar{x}_{i.})^2 + \sum_i n_i (\bar{x}_{i.} - \bar{x}_{..})^2 + 2\sum_i \left[ (\bar{x}_{i.} - \bar{x}_{..}) \sum_j (x_{ij} - \bar{x}_{i.}) \right]$.
But $\sum_j (x_{ij} - \bar{x}_{i.}) = 0$, since the algebraic sum of the deviations of the observations from their mean is zero. Therefore
$\sum_i\sum_j (x_{ij} - \bar{x}_{..})^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{i.})^2 + \sum_i n_i (\bar{x}_{i.} - \bar{x}_{..})^2$.
$S_T^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{..})^2$ is known as the total sum of squares,
$S_E^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{i.})^2$ is called the within sum of squares or error sum of squares (S.S.E), and
$S_t^2 = \sum_i n_i (\bar{x}_{i.} - \bar{x}_{..})^2$ is called the S.S. due to treatments (S.S.T). Then
Total S.S. = S.S.E + S.S.T
Degrees of freedom for various S.S.:
$S_T^2$, the total S.S., which is computed from the N quantities of the form $(x_{ij} - \bar{x}_{..})$, will carry (N-1) degrees of freedom (d.f.), one d.f. being lost because of the linear constraint $\sum_i\sum_j (x_{ij} - \bar{x}_{..}) = 0$. Similarly, the treatment sum of squares $S_t^2 = \sum_i n_i (\bar{x}_{i.} - \bar{x}_{..})^2$ will have (k-1) d.f., since $\sum_i n_i (\bar{x}_{i.} - \bar{x}_{..}) = 0$, and the error S.S. $S_E^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{i.})^2$ will have (N-k) d.f., since it is based on N quantities which are subject to the k linear constraints $\sum_j (x_{ij} - \bar{x}_{i.}) = 0$; $i = 1, 2, \ldots, k$. Hence we see that the d.f. for the various S.S. are also additive, since N-1 = (N-k) + (k-1).
2.2.3 MEAN SUM OF SQUARES:
The sum of squares divided by its degrees of freedom gives the corresponding variance or mean sum of squares (M.S.S.). Thus
$s_t^2 = S_t^2/(k-1)$ = S.S.T/(k-1) is the M.S.S. due to treatments and
$s_E^2 = S_E^2/(N-k)$ = S.S.E/(N-k) is the M.S.S. due to error.
2.2.4 EXPECTATION OF TREATMENT SUM OF SQUARES:
$E\left[ S_t^2/(k-1) \right] = \sigma_e^2 + \frac{1}{k-1}\sum_i n_i \alpha_i^2$
Expectation of error sum of squares:
$E\left[ S_E^2/(N-k) \right] = \sigma_e^2$,
i.e., the error mean sum of squares always gives an unbiased estimate of $\sigma_e^2$. Hence the test statistic for $H_0$ is provided by the variance ratio
$F = s_t^2 / s_E^2$, where $s_t^2 = S_t^2/(k-1)$ and $s_E^2 = S_E^2/(N-k)$.
Under $H_0$, by Cochran's theorem, $S_t^2/\sigma_e^2$ and $S_E^2/\sigma_e^2$ are independently distributed as $\chi^2$ variates with (k-1) and (N-k) d.f. respectively. Hence the statistic
$F = \dfrac{S_t^2 / [\sigma_e^2 (k-1)]}{S_E^2 / [\sigma_e^2 (N-k)]} = \dfrac{s_t^2}{s_E^2}$
follows Snedecor's (central) F-distribution with (k-1, N-k) d.f.
2.3 ANOVA TABLE FOR ONE-WAY CLASSIFIED DATA

Source of variation   Sum of squares   d.f.   Mean sum of squares        Variance ratio
Treatment (ration)    $S_t^2$          k-1    $s_t^2 = S_t^2/(k-1)$      $F = s_t^2/s_E^2 \sim F_{k-1,\,N-k}$
Error                 $S_E^2$          N-k    $s_E^2 = S_E^2/(N-k)$
Total                 $S_T^2$          N-1
Example: Four salesmen were posted in different areas by a company. The numbers of units of commodity 'x' sold by them are as follows:

A    B    C    D
20   25   23   15
23   32   28   21
28   30   35   19
29   21   18   25
Is there a significant difference in the performance of these salesmen?
Procedure: Take H0: there is no significant difference in the performance of the salesmen. First the correction factor (c.f.) is calculated:
c.f. = $G^2/N$, where G = grand total and N = number of observations.
Total sum of squares (T.S.S.) = Raw sum of squares - c.f., where the raw sum of squares is the sum of the squares of the sales of the commodity by all four salesmen.
Between sum of squares (salesman sum of squares) = $\sum_i T_i^2/n_i$ - c.f., where $T_i$ is the total sales of the ith salesman.
Error sum of squares = T.S.S. - Between sum of squares.
Salesman   Sales               $T_i$   $T_i^2$
A          20   23   28   29   100     10000
B          25   32   30   21   108     11664
C          23   28   35   18   104     10816
D          15   21   19   25   80      6400
Total                          392     38880

c.f. = $G^2/N$ = $392^2/16$ = 153664/16 = 9604
Total sum of squares = Raw sum of squares - c.f. = 10058 - 9604 = 454
Between sum of squares (B.S.S.) = $\sum_i T_i^2/4$ - c.f. = 38880/4 - 9604 = 9720 - 9604 = 116
Error sum of squares = T.S.S. - B.S.S. = 454 - 116 = 338
ANOVA TABLE:

Source of variation      d.f.   S.S.   M.S.S.   Variance ratio   F at 5%
Between samples          3      116    38.67    F = 1.373        3.49
Error (within samples)   12     338    28.17
Total                    15     454

The calculated value of F is less than the table value. Hence there is no significant difference in the performance of the four salesmen.
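The correction-factor procedure used in the salesmen example can be sketched in a few lines of Python. The function name one_way_anova and the dictionary keys are illustrative choices, not notation from the text; the critical value 3.49 is still read from the F table rather than computed.

```python
def one_way_anova(groups):
    """One-way ANOVA by the correction-factor method of the worked example."""
    values = [x for g in groups for x in g]
    n_total = len(values)                      # N, number of observations
    k = len(groups)                            # number of classes
    cf = sum(values) ** 2 / n_total            # correction factor G^2 / N
    tss = sum(x * x for x in values) - cf      # total sum of squares
    bss = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # between classes
    ess = tss - bss                            # error (within classes)
    f_ratio = (bss / (k - 1)) / (ess / (n_total - k))
    return {"cf": cf, "tss": tss, "bss": bss, "ess": ess, "f": f_ratio}

# Sales of the four salesmen A, B, C, D from the example.
sales = [
    [20, 23, 28, 29],
    [25, 32, 30, 21],
    [23, 28, 35, 18],
    [15, 21, 19, 25],
]
result = one_way_anova(sales)
print(result)
```

Comparing result["f"] with the tabulated value for (3, 12) d.f. gives the same decision as the table above.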
Two-way classification: Let us now suppose that the N cows are divided into h different groups or classes according to their breed and stock, each group containing k cows, and let us consider the effect of k treatments on the yield of milk.
Let the suffix i refer to the treatments and the suffix j refer to the varieties. Then the yields of milk $x_{ij}$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, h$) of the N = h × k cows furnish the data for the comparison of the treatments. The yields may be expressed as variate values in the following k × h two-way table:
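Treatment   Varieties                                                  Means             Totals
            1          2          ...   j          ...   h
1           $x_{11}$   $x_{12}$   ...   $x_{1j}$   ...   $x_{1h}$      $\bar{x}_{1.}$    $T_{1.}$
2           $x_{21}$   $x_{22}$   ...   $x_{2j}$   ...   $x_{2h}$      $\bar{x}_{2.}$    $T_{2.}$
...
i           $x_{i1}$   $x_{i2}$   ...   $x_{ij}$   ...   $x_{ih}$      $\bar{x}_{i.}$    $T_{i.}$
...
k           $x_{k1}$   $x_{k2}$   ...   $x_{kj}$   ...   $x_{kh}$      $\bar{x}_{k.}$    $T_{k.}$
Means       $\bar{x}_{.1}$  $\bar{x}_{.2}$  ...  $\bar{x}_{.j}$  ...  $\bar{x}_{.h}$     $\bar{x}_{..}$
Totals      $T_{.1}$   $T_{.2}$   ...   $T_{.j}$   ...   $T_{.h}$                        G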
1. Define ANOVA with example.
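Mathematical model:
Let $x_{ij}$ be the yield from the cow of the jth variety fed on the ith ration ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, h$). The linear model is
$x_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$,
where
(i) $\mu$ is the general mean effect, given by $\mu = \sum_i\sum_j \mu_{ij}/N$;
(ii) $\alpha_i$ ($i = 1, 2, \ldots, k$) is the effect due to the ith ration;
(iii) $\beta_j$ ($j = 1, 2, \ldots, h$) is the effect due to the jth variety;
(iv) $\varepsilon_{ij}$ is the error effect.
Statistical analysis of this classification:
Let us write
$\bar{x}_{i.}$ = mean yield of the ith ration $= \frac{1}{h}\sum_j x_{ij}$ ($i = 1, 2, \ldots, k$),
$\bar{x}_{.j}$ = mean yield of the jth variety $= \frac{1}{k}\sum_i x_{ij}$ ($j = 1, 2, \ldots, h$),
$\bar{x}_{..}$ = overall mean $= \frac{1}{hk}\sum_i\sum_j x_{ij}$.
Consider
$\sum_i\sum_j (x_{ij} - \bar{x}_{..})^2 = \sum_i\sum_j \left[ (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..}) + (\bar{x}_{i.} - \bar{x}_{..}) + (\bar{x}_{.j} - \bar{x}_{..}) \right]^2$
$= \sum_i\sum_j (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2 + h\sum_i (\bar{x}_{i.} - \bar{x}_{..})^2 + k\sum_j (\bar{x}_{.j} - \bar{x}_{..})^2$
$\quad + 2\sum_i\sum_j (\bar{x}_{i.} - \bar{x}_{..})(x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..}) + 2\sum_i\sum_j (\bar{x}_{.j} - \bar{x}_{..})(x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..}) + 2\sum_i\sum_j (\bar{x}_{i.} - \bar{x}_{..})(\bar{x}_{.j} - \bar{x}_{..})$.
Now
$\sum_i\sum_j (\bar{x}_{i.} - \bar{x}_{..})(x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..}) = \sum_i \left[ (\bar{x}_{i.} - \bar{x}_{..}) \left\{ \sum_j (x_{ij} - \bar{x}_{i.}) - \sum_j (\bar{x}_{.j} - \bar{x}_{..}) \right\} \right] = 0$,
since the algebraic sum of the deviations of a set of observations about their mean is zero.
Similarly, it can easily be seen that the other product terms also vanish. Hence
$\sum_i\sum_j (x_{ij} - \bar{x}_{..})^2 = h\sum_i (\bar{x}_{i.} - \bar{x}_{..})^2 + k\sum_j (\bar{x}_{.j} - \bar{x}_{..})^2 + \sum_i\sum_j (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$,
or $S_T^2 = S_t^2 + S_v^2 + S_E^2$.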
2.3.1 NULL HYPOTHESIS:
We set up the null hypothesis that the treatments as well as the varieties are homogeneous. In other words, the null hypotheses for treatments and varieties are respectively
$H_t: \mu_{1.} = \mu_{2.} = \cdots = \mu_{k.} = \mu$ and $H_v: \mu_{.1} = \mu_{.2} = \cdots = \mu_{.h} = \mu$,
or equivalently
$H_t: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$ and $H_v: \beta_1 = \beta_2 = \cdots = \beta_h = 0$.
Degrees of freedom for various S.S.:
The total S.S., $S_T^2$, being computed from N = hk quantities $(x_{ij} - \bar{x}_{..})$ which are subject to one linear constraint $\sum_i\sum_j (x_{ij} - \bar{x}_{..}) = 0$, will carry (N-1) d.f. Similarly, $S_t^2$ will have (k-1) d.f., since $\sum_i (\bar{x}_{i.} - \bar{x}_{..}) = 0$; $S_v^2$ will have (h-1) d.f., since $\sum_j (\bar{x}_{.j} - \bar{x}_{..}) = 0$; and $S_E^2$ will carry (N-1) - (k-1) - (h-1) = (h-1)(k-1) d.f.
Thus the partition of the d.f. is as follows: (hk-1) = (k-1) + (h-1) + (h-1)(k-1), which implies that the d.f. are additive.
2.3.2 TEST STATISTIC:
In order to obtain appropriate test statistics to test the hypotheses $H_t$ and $H_v$, we need the expectations of the various mean S.S. due to each of the independent factors. Using the same notation for the mean S.S. as in the case of one-way classified data, we get
Mean S.S. due to treatments = $S_t^2/(k-1) = s_t^2$ (say)
Mean S.S. due to varieties = $S_v^2/(h-1) = s_v^2$ (say)
Error mean S.S. = $S_E^2/[(h-1)(k-1)] = s_E^2$ (say)
Expectations of the various mean sums of squares:
$E(s_t^2) = \sigma_e^2 + \frac{h}{k-1}\sum_i \alpha_i^2$
$E(s_v^2) = \sigma_e^2 + \frac{k}{h-1}\sum_j \beta_j^2$
$E(s_E^2) = E\left[ S_E^2/((h-1)(k-1)) \right] = \sigma_e^2$
By Cochran's theorem, under the null hypotheses, $S_t^2/\sigma_e^2$, $S_v^2/\sigma_e^2$ and $S_E^2/\sigma_e^2$ are mutually independent $\chi^2$ variates with (k-1), (h-1) and (h-1)(k-1) d.f. respectively. Hence under $H_t$ and $H_v$ respectively we get
$F_t = \dfrac{S_t^2/[\sigma_e^2 (k-1)]}{S_E^2/[\sigma_e^2 (h-1)(k-1)]} = \dfrac{s_t^2}{s_E^2} \sim F_{(k-1),\,(h-1)(k-1)}$
and
$F_v = \dfrac{S_v^2/[\sigma_e^2 (h-1)]}{S_E^2/[\sigma_e^2 (h-1)(k-1)]} = \dfrac{s_v^2}{s_E^2} \sim F_{(h-1),\,(h-1)(k-1)}$.
2.4 ANOVA TABLE FOR TWO-WAY CLASSIFIED DATA

Source of variation   Sum of squares                                                                     d.f.         M.S.S.                            Variance ratio
Treatments            $S_t^2 = h\sum_i (\bar{x}_{i.} - \bar{x}_{..})^2$                                  k-1          $s_t^2 = S_t^2/(k-1)$             $F_t = s_t^2/s_E^2 \sim F_{k-1,\,(h-1)(k-1)}$
Varieties             $S_v^2 = k\sum_j (\bar{x}_{.j} - \bar{x}_{..})^2$                                  h-1          $s_v^2 = S_v^2/(h-1)$             $F_v = s_v^2/s_E^2 \sim F_{h-1,\,(h-1)(k-1)}$
Residual (or error)   $S_E^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$     (h-1)(k-1)   $s_E^2 = S_E^2/[(h-1)(k-1)]$
Total                 $S_T^2 = \sum_i\sum_j (x_{ij} - \bar{x}_{..})^2$                                   hk-1
Example: The following data give the number of units produced by 4 different workers using 3 different machine types.

Workers   Machine type
          A     B     C
1         20    28    26
2         22    30    32
3         26    32    18
4         32    20    24
Test whether (i) the 4 workers differ with respect to mean productivity and (ii) the mean productivity is the same for the different machine types.
Procedure:
H0 (workers): The mean productivity is the same for the 4 workers.
H0 (machines): The mean productivity is the same for the 3 different machine types.
Correction factor (C.F.) = $G^2/N$ = $310^2/12$ = 96100/12 = 8008.33
Total sum of squares (T.S.S.) = Raw S.S. - C.F. = 8292 - 8008.33 = 283.67
Workers   A       B       C       $T_i$   $T_i^2$
1         20      28      26      74      5476
2         22      30      32      84      7056
3         26      32      18      76      5776
4         32      20      24      76      5776
$T_j$     100     110     100     310     24084
$T_j^2$   10000   12100   10000   32100

Sum of squares between machines (S.S.M.) = $\sum_j T_j^2/4$ - C.F. = 32100/4 - 8008.33 = 16.67
Sum of squares between workers (S.S.W.) = $\sum_i T_i^2/3$ - C.F. = 24084/3 - 8008.33 = 19.67
Error sum of squares = T.S.S. - S.S.M. - S.S.W. = 283.67 - 16.67 - 19.67 = 247.33
2.5 ANOVA TABLE:

Source of variation   d.f.   S.S.     M.S.S.   Variance ratio   F at 5%
Between machines      2      16.67    8.335    0.2022           5.14
Between workers       3      19.67    6.557    0.1591           4.76
Error                 6      247.33   41.22
Total                 11     283.67
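The sums of squares and variance ratios of the workers-and-machines example can be recomputed with the sketch below. It takes 26 as worker 1's output on machine C, the value consistent with that worker's row total of 74; the function name two_way_anova and its dictionary keys are illustrative only.

```python
def two_way_anova(table):
    """Two-way ANOVA, one observation per cell; table[i][j] = row i, column j."""
    n_rows, n_cols = len(table), len(table[0])
    n_total = n_rows * n_cols
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / n_total                       # correction factor
    tss = sum(x * x for row in table for x in row) - cf   # total S.S.
    ss_rows = sum(sum(row) ** 2 for row in table) / n_cols - cf
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / n_rows - cf
    ss_error = tss - ss_rows - ss_cols
    ms_error = ss_error / ((n_rows - 1) * (n_cols - 1))
    return {
        "tss": tss,
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "f_rows": (ss_rows / (n_rows - 1)) / ms_error,
        "f_cols": (ss_cols / (n_cols - 1)) / ms_error,
    }

# Rows are workers 1-4, columns are machine types A, B, C.
units = [
    [20, 28, 26],
    [22, 30, 32],
    [26, 32, 18],
    [32, 20, 24],
]
result = two_way_anova(units)
print(result)
```

Here the rows play the role of workers and the columns the role of machines, so f_rows and f_cols are compared with the tabulated values for (3, 6) and (2, 6) d.f. respectively.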
Since the calculated value of F is less than the table value for both machines and workers, there is no significant difference between the machine types or between the workers.
2.5.1 t-TEST BASED ON MEAN
Student's t distribution: Let $x_i$ ($i = 1, 2, \ldots, n$) be a random sample of size n from a normal population with mean $\mu$ and variance $\sigma^2$. Then Student's t is defined by
$t = \dfrac{\bar{x} - \mu}{S/\sqrt{n}}$,
where $\bar{x} = \frac{1}{n}\sum x_i$ is the sample mean and $S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ is an unbiased estimate of the population variance $\sigma^2$.
t-test for single mean: Suppose we want to test
(i) if a random sample $x_i$ ($i = 1, 2, \ldots, n$) of size n has been drawn from a normal population with a specified mean $\mu_0$, or
(ii) if the sample mean differs significantly from the hypothetical value $\mu_0$ of the population mean.
Under the null hypothesis $H_0$ that
(i) the sample has been drawn from the population with mean $\mu_0$, or
(ii) there is no significant difference between the sample mean $\bar{x}$ and the population mean $\mu_0$,
the statistic
$t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}}$, where $\bar{x} = \frac{1}{n}\sum x_i$ and $S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$,
follows Student's t-distribution with (n-1) d.f.
Applications of the t-distribution: The t-distribution has a wide range of applications in statistics, some of which are
(i) To test if the sample mean ($\bar{x}$) differs significantly from the hypothetical value of the population mean.
(ii) To test the significance of the difference between two sample means.
(iii) To test the significance of an observed sample correlation coefficient and sample regression coefficient.
(iv) To test the significance of an observed partial correlation coefficient.
Example of the t-test for a single mean: A random sample of 10 boys had the following IQs: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. (i) Do these data support the assumption of a population mean IQ of 100? (ii) Find a reasonable range in which most of the mean IQ values of samples of 10 boys lie.
Solution: n = 10.
Take $H_0: \mu = 100$, $H_1: \mu \ne 100$ (two-tailed).
Under $H_0$, the test statistic is $t = \dfrac{\bar{x} - \mu}{S/\sqrt{n}} \sim t_{n-1}$, where $\bar{x} = \sum x/n$ and $S^2 = \frac{1}{n-1}\left[\sum x^2 - n\bar{x}^2\right] = \frac{1}{n-1}\sum (x - \bar{x})^2$.
$\bar{x}$ = 972/10 = 97.2
$S^2$ = 1833.6/9 = 203.73, so S = 14.27
t = (97.2 - 100)/(14.27/$\sqrt{10}$) = -2.8/4.514 = -0.62, so |t| = 0.62
d.f. = n - 1 = 10 - 1 = 9
Let $\alpha$ = 5% = 0.05; the table value is $t_{0.05}$ = 2.262.
Since |t| < $t_{0.05}$, $H_0$ is accepted at the 5% level of significance (l.o.s.).
(ii) The 95% confidence limits for $\mu$ are given by $\bar{x} \pm t_{0.05}\,S/\sqrt{n}$ = 97.2 ± 2.262 (4.514) = 97.2 ± 10.21, i.e., (86.99, 107.41).
The data support the assumption, and the 95% confidence limits are 86.99 and 107.41.
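The arithmetic of the single-mean example can be checked with the following sketch. It takes the ninth IQ value as 107, which is what the stated total of 972 requires, and it reads the critical value 2.262 from the t table instead of computing it; the variable names are illustrative.

```python
import math
import statistics

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
mu0 = 100        # hypothetical population mean under H0
t_table = 2.262  # tabulated two-tailed 5% value of t for 9 d.f.

n = len(iq)
mean = statistics.mean(iq)
# statistics.stdev divides by n - 1, matching the unbiased estimate S.
std_err = statistics.stdev(iq) / math.sqrt(n)

t_stat = (mean - mu0) / std_err
lower = mean - t_table * std_err
upper = mean + t_table * std_err

print(round(t_stat, 2), round(lower, 2), round(upper, 2))
```

H0 is accepted when abs(t_stat) is smaller than t_table, and (lower, upper) gives the 95% confidence limits for the population mean.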
Critical values of t: The critical or significant values of t at level of significance $\alpha$ and d.f. $\nu$ for a two-tailed test are given by the equation
$P[\,|t| > t_\nu(\alpha)\,] = \alpha$, i.e., $P[\,|t| \le t_\nu(\alpha)\,] = 1 - \alpha$.
The values $t_\nu(\alpha)$ have been tabulated in Fisher and Yates' tables for different values of $\alpha$ and $\nu$. Since the t-distribution is symmetric about t = 0,
$P[t > t_\nu(\alpha)] + P[t < -t_\nu(\alpha)] = \alpha$, so $2P[t > t_\nu(\alpha)] = \alpha$, i.e., $P[t > t_\nu(\alpha)] = \alpha/2$, and hence $P[t > t_\nu(2\alpha)] = \alpha$.
Thus $t_\nu(2\alpha)$ gives the significant value of t for a single-tail test at level of significance $\alpha$ and $\nu$ d.f. Hence the significant values of t at level of significance $\alpha$ for a single-tailed test can be obtained from those of the two-tailed test by looking up the values at level of significance $2\alpha$. For example, $t_8(0.05)$ for a single-tail test = $t_8(0.10)$ for a two-tailed test.
t-test for difference of means: Suppose we want to test if two independent samples $x_i$ ($i = 1, 2, \ldots, n_1$) and $y_j$ ($j = 1, 2, \ldots, n_2$) of sizes $n_1$ and $n_2$ have been drawn from two normal populations with means $\mu_x$ and $\mu_y$ respectively. Under the null hypothesis ($H_0$) that the samples have been drawn from normal populations with means $\mu_x$ and $\mu_y$, and under the assumption that the population variances are equal, i.e., $\sigma_x^2 = \sigma_y^2 = \sigma^2$ (say), the statistic
$t = \dfrac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$,
where $\bar{x} = \frac{1}{n_1}\sum x_i$, $\bar{y} = \frac{1}{n_2}\sum y_j$ and $S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (x_i - \bar{x})^2 + \sum (y_j - \bar{y})^2\right]$ is an unbiased estimate of the common population variance $\sigma^2$, follows Student's t-distribution with $(n_1 + n_2 - 2)$ d.f.
Note: 1. An important deduction which is of much practical utility is the following. Suppose we want to test if
a) two independent samples $x_i$ ($i = 1, 2, \ldots, n_1$) and $y_j$ ($j = 1, 2, \ldots, n_2$) have been drawn from populations with the same mean, or
b) the two sample means $\bar{x}$ and $\bar{y}$ differ significantly or not.
Under the null hypothesis $H_0$ that (a) the samples have been drawn from populations with the same mean, i.e., $\mu_x = \mu_y$, or (b) the sample means $\bar{x}$ and $\bar{y}$ do not differ significantly, the statistic becomes
$t = \dfrac{\bar{x} - \bar{y}}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$.
2.5.2 ASSUMPTIONS OF T-TEST FOR DIFFERENCE OF MEANS:
(i) The parent populations from which the samples have been drawn are normally distributed.
(ii) The population variances are equal and unknown, i.e., $\sigma_x^2 = \sigma_y^2 = \sigma^2$, where $\sigma^2$ is unknown.
(iii) The two samples are random and independent of each other.
2.5.3 PAIRED T-TEST FOR DIFFERENCE OF MEANS:
Let us now consider the case when (i) the sample sizes are equal, i.e., $n_1 = n_2 = n$ (say), and (ii) the two samples are not independent but the sample observations are paired together, i.e., the pair of observations $(x_i, y_i)$ ($i = 1, 2, \ldots, n$) corresponds to the same (ith) sample unit. The problem is to test whether the sample means differ significantly or not.
For example, suppose we want to test the efficacy of a particular drug, say for inducing sleep. Let $x_i$ and $y_i$ ($i = 1, 2, \ldots, n$) be the readings, in hours of sleep, on the ith individual before and after the drug is given, respectively. Here, instead of applying the difference-of-means test, we apply the paired t-test. We consider the increments $d_i = x_i - y_i$ ($i = 1, 2, \ldots, n$). Under the null hypothesis $H_0$ that the increments are due to fluctuations of sampling, i.e., that the drug is not responsible for these increments, the statistic
$t = \dfrac{\bar{d}}{S/\sqrt{n}}$, where $\bar{d} = \frac{1}{n}\sum d_i$ and $S^2 = \frac{1}{n-1}\sum (d_i - \bar{d})^2$,
follows Student's t-distribution with (n-1) d.f.
Example: The scores of 10 candidates prior to and after training are given below:

X      Y      d = x - y   $d^2$
84     90     -6          36
48     58     -10         100
36     56     -20         400
37     49     -12         144
54     62     -8          64
69     81     -12         144
83     84     -1          1
96     86     10          100
90     84     6           36
65     75     -10         100
Total         -63         1125

Let x = scores prior to training and y = scores after training; n = 10.
Take $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 < \mu_2$.
Under $H_0$, the test statistic is $t = \dfrac{\bar{d}}{S/\sqrt{n}} \sim t_{n-1}$, where $\bar{d} = \sum d/n$, d = x - y and $S^2 = \frac{1}{n-1}\left[\sum d^2 - n\bar{d}^2\right]$.
$\bar{d}$ = -63/10 = -6.3
$S^2$ = $\frac{1}{9}\left[1125 - 10(-6.3)^2\right]$ = 728.1/9 = 80.9, so S = 8.99
t = -6.3/(8.99/$\sqrt{10}$) = -6.3/2.84 = -2.2, so |t| = 2.2
d.f. = n - 1 = 10 - 1 = 9
Let $\alpha$ = 5% = 0.05; the table value (one-tailed) is t = 1.83.
Since |t| exceeds the table value, $H_0$ is rejected at the 5% l.o.s. and $H_1$ is accepted: the training is effective.
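The paired t statistic of the training example can be recomputed directly from the two score columns, which also serves as a check on the signs of the differences. The critical value 1.83 is taken from the t table, and the variable names are illustrative.

```python
import math
import statistics

before = [84, 48, 36, 37, 54, 69, 83, 96, 90, 65]  # scores prior to training
after = [90, 58, 56, 49, 62, 81, 84, 86, 84, 75]   # scores after training
t_table = 1.83  # tabulated one-tailed 5% value of t for 9 d.f.

# Differences d = x - y for each candidate.
d = [x - y for x, y in zip(before, after)]
n = len(d)
d_bar = statistics.mean(d)
# statistics.stdev divides by n - 1, as in the formula for S.
t_stat = d_bar / (statistics.stdev(d) / math.sqrt(n))

print(sum(d), sum(v * v for v in d), round(t_stat, 2))
```

Because the alternative is one-sided, H0 is rejected when t_stat falls below -t_table.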
2.5.4 F-TEST FOR THE EQUALITY OF VARIANCES:
Suppose we want to test (i) whether two independent samples $x_i$ ($i = 1, 2, \ldots, n_1$) and $y_j$ ($j = 1, 2, \ldots, n_2$) have been drawn from normal populations with the same variance $\sigma^2$ (say), or (ii) whether two independent estimates of the population variance are homogeneous or not. Under the null hypothesis ($H_0$) that (i) $\sigma_x^2 = \sigma_y^2 = \sigma^2$, i.e., the population variances are equal, or (ii) the two independent estimates of the population variance are homogeneous, the statistic F is given by
$F = S_x^2/S_y^2$, where $S_x^2 = \frac{1}{n_1 - 1}\sum (x_i - \bar{x})^2$ and $S_y^2 = \frac{1}{n_2 - 1}\sum (y_j - \bar{y})^2$
are unbiased estimates of the common population variance $\sigma^2$ obtained from the two independent samples. F follows Snedecor's F-distribution with $(\nu_1, \nu_2)$ d.f., where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.
2.5.5 CRITICAL VALUES OF F-DISTRIBUTION:
F tables give the critical values of F for the right-tailed test, i.e., the critical region is determined by the right-tail areas. Thus the significant value $F_\alpha(n_1, n_2)$ at level of significance $\alpha$ and $(n_1, n_2)$ d.f. is determined by
$P[F > F_\alpha(n_1, n_2)] = \alpha$.
The reciprocal relation between the upper and lower significant points of the F-distribution is
$F_\alpha(n_1, n_2) = \dfrac{1}{F_{1-\alpha}(n_2, n_1)}$, i.e., $F_\alpha(n_1, n_2) \times F_{1-\alpha}(n_2, n_1) = 1$.
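The critical values of F for the left-tailed test $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 < \sigma_2^2$ are given by $F < F_{n_1-1,\,n_2-1}(1-\alpha)$, and for the two-tailed test $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$ they are given by $F > F_{n_1-1,\,n_2-1}(\alpha/2)$ and $F < F_{n_1-1,\,n_2-1}(1-\alpha/2)$.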
Example: The following two samples are from two normal populations.
Sample 1: 60 65 71 74 76 82 85 87
Sample 2: 61 66 67 85 78 63 85 86 88 91
Test whether the two populations have the same variance.
Solution: Let x = values of sample 1 and y = values of sample 2, so $n_1 = 8$, $n_2 = 10$.
Take $H_0: \sigma_x^2 = \sigma_y^2$, $H_1: \sigma_x^2 \ne \sigma_y^2$ (two-tailed).
Under $H_0$, the test statistic is F = greater variance / smaller variance $\sim F(\nu_1, \nu_2)$, where
$s_x^2 = \sum (x - \bar{x})^2/(n_1 - 1)$, $s_y^2 = \sum (y - \bar{y})^2/(n_2 - 1)$, $\bar{x} = \sum x/n_1$, $\bar{y} = \sum y/n_2$.
$\bar{x}$ = 600/8 = 75, $\bar{y}$ = 770/10 = 77
$s_x^2$ = 636/7 = 90.86, $s_y^2$ = 1200/9 = 133.33
Since $s_y^2 > s_x^2$, F = $s_y^2/s_x^2$ = 133.33/90.86 = 1.47.
Let $\alpha$ = 5% = 0.05; the d.f. are $(\nu_1, \nu_2)$, where $\nu_1 = n_2 - 1 = 9$ and $\nu_2 = n_1 - 1 = 7$.
The table value of F for (9, 7) d.f. at the 5% l.o.s. is 3.68. Since the calculated F is less than the table value, $H_0$ is accepted at the 5% l.o.s. The two populations have the same variance.
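The variance ratio in this example can be verified with a few lines of Python. The sketch relies on statistics.variance, which uses the divisor n - 1 and so matches the unbiased estimates in the text; the critical value 3.68 is taken from the F table, and the variable names are illustrative.

```python
import statistics

sample1 = [60, 65, 71, 74, 76, 82, 85, 87]
sample2 = [61, 66, 67, 85, 78, 63, 85, 86, 88, 91]
f_table = 3.68  # tabulated 5% value of F for (9, 7) d.f.

var1 = statistics.variance(sample1)  # unbiased estimate, divisor n1 - 1
var2 = statistics.variance(sample2)  # unbiased estimate, divisor n2 - 1

# Put the greater variance in the numerator, as the example does.
f_ratio = max(var1, var2) / min(var1, var2)

print(round(var1, 2), round(var2, 2), round(f_ratio, 2))
```

H0 is accepted when f_ratio is below f_table.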
2.5.6 CHI-SQUARE TEST FOR INDEPENDENCE OF ATTRIBUTES:
Chi-square test: The chi-square test is applied in statistics to test goodness of fit, that is, to verify whether the distribution of the observed data agrees with an assumed theoretical distribution. It is therefore a measure of the divergence between actual and expected frequencies. If there is no difference between the actual and expected frequencies, $\chi^2$ is zero.
Characteristics of the $\chi^2$-test:
1. The test is based on events or frequencies, whereas in a theoretical distribution the test is based on the mean and standard deviation.
2. The test is applied to draw inferences, specifically for testing hypotheses; it is not useful for estimation.
3. The test can be used between the entire set of observed and expected frequencies.
4. For every increase in the number of degrees of freedom, a new $\chi^2$ distribution is formed.
5. It is a general-purpose test and as such is highly useful in research.
Chi-square function: The square of a standard normal variate is known as a chi-square variate with 1 degree of freedom (d.f.). Thus if $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$ and $Z^2 = \left(\dfrac{X - \mu}{\sigma}\right)^2$ is a chi-square variate with 1 d.f. In general, if $X_i$ (i = 1, 2, …, n) are n independent normal variates with means $\mu_i$ and variances $\sigma_i^2$ (i = 1, 2, …, n), then
$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$
is a chi-square variate with n d.f.
Applications of chi-square distributions:
(i) To test if the hypothetical value of the population variance is $\sigma^2 = \sigma_0^2$ (say).
(ii) To test the ‘goodness of fit’.
(iii) To test the independence of attributes.
(iv) To test the homogeneity of independent estimates of the population variance.
(v) To combine various probabilities obtained from independent experiments to give a single test of significance.
(vi) To test the homogeneity of independent estimates of the population correlation coefficient.
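The definition of the chi-square variate as a sum of squared standardised normal variates can be illustrated by simulation. The sketch below is only an illustration: the means, standard deviations and seed are hypothetical, and the comparison with n relies on the standard result (not derived in this unit) that a chi-square variate with n d.f. has mean n.

```python
import random

def chi_square_variate(means, sigmas, rng):
    """Draw one chi-square variate as the sum of squared standardised normals."""
    total = 0.0
    for mu, sigma in zip(means, sigmas):
        x = rng.gauss(mu, sigma)            # X_i ~ N(mu_i, sigma_i^2)
        total += ((x - mu) / sigma) ** 2    # square of a standard normal variate
    return total

rng = random.Random(0)            # fixed seed so the run is repeatable
means = [10.0, -2.0, 0.5, 3.0]    # hypothetical means
sigmas = [2.0, 1.0, 0.5, 4.0]     # hypothetical standard deviations
draws = [chi_square_variate(means, sigmas, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

print(f"average of simulated values = {sample_mean:.2f}, n = {len(means)} d.f.")
```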
2. Explain the Chi-square test with example.
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………

2.5.7 CHI-SQUARE TEST OF GOODNESS OF FIT:
Through this test we can find the deviations between the observed values and the expected values. Here we are concerned not with the parameters but with the form of the distribution. Karl Pearson developed a method to test the difference between the theoretical values and the observed values. The test is done by comparing the computed value with the table value of $\chi^2$ for the desired degrees of freedom. The Greek letter $\chi^2$ is used to describe the magnitude of the difference between fact and theory. It may be defined as
$$\chi^2 = \sum \frac{(O - E)^2}{E},$$
where O denotes the observed frequencies and E the expected frequencies.
Steps:
1. Establish a hypothesis along with the significance level.
2. Compute the deviations between the observed and expected values, (O - E).
3. Square the deviations to obtain $(O - E)^2$.
4. Divide each $(O - E)^2$ by its expected frequency.
5. Add all the values obtained in step 4.
6. Find the value of $\chi^2$ from the table at a certain level of significance, usually the 5% level.
If the calculated value of $\chi^2$ is greater than the table value of $\chi^2$ at the chosen level of significance, we reject the hypothesis. If the computed value of $\chi^2$ is zero, the observed and expected values coincide completely. If the computed value of $\chi^2$ is less than the table value at the chosen level of significance, it is said to be non-significant. This implies that the discrepancy between the observed and expected frequencies may be due to fluctuations in simple sampling.
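The six steps above can be sketched in Python as follows. The die-throwing data are hypothetical and serve only to exercise the function; the table value 11.07 is the 5% value for 5 d.f. used later in this unit.

```python
def chi_square_statistic(observed, expected):
    """Steps 2 to 5: sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def decide(chi2, table_value):
    """Step 6: compare the computed value with the table value."""
    if chi2 > table_value:
        return "reject H0"
    return "accept H0 (non-significant)"

# Hypothetical data: 60 throws of a die, each face expected 10 times under H0.
observed = [8, 12, 9, 11, 13, 7]
expected = [10] * 6

chi2 = chi_square_statistic(observed, expected)
TABLE_VALUE = 11.07   # chi-square table value for 5 d.f. at the 5% level

print(f"chi-square = {chi2:.2f}: {decide(chi2, TABLE_VALUE)}")
```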
Example: For the data in the following table, test for independence between a person’s ability in mathematics and interest in economics.

Interest in        Ability in mathematics
economics          Low     Average    High    Total
Low                 63        42       15      120
Average             58        61       31      150
High                14        47       29       90
Total              135       150       75      360
Solution: Take $H_0$: ability in mathematics and interest in economics are independent. $H_1$: ability in mathematics and interest in economics are not independent.
Under $H_0$, the test statistic is
$$\chi^2 = \sum \frac{(O - E)^2}{E} \sim \chi^2 \text{ with } (r-1)(s-1) \text{ d.f.}$$
Expected frequency = (row total x column total)/N:
E(63) = 120 x 135/360 = 45,       E(58) = 150 x 135/360 = 56.25,
E(14) = 90 x 135/360 = 33.75,     E(42) = 120 x 150/360 = 50,
E(61) = 150 x 150/360 = 62.5,     E(47) = 90 x 150/360 = 37.5,
E(15) = 120 x 75/360 = 25,        E(31) = 150 x 75/360 = 31.25,
E(29) = 90 x 75/360 = 18.75.

O        E        (O-E)^2/E
63      45.00      7.200
58      56.25      0.054
14      33.75     11.557
42      50.00      1.280
61      62.50      0.036
47      37.50      2.406
31      31.25      0.002
15      25.00      4.000
29      18.75      5.603
Total             32.14

d.f. = (r-1)(s-1) = (3-1)(3-1) = 2 x 2 = 4. Let $\alpha$ = 5%; the table value is $\chi^2(\alpha) = 9.488$. Since the calculated $\chi^2 = 32.14 > \chi^2(\alpha)$, $H_0$ is rejected and $H_1$ is accepted at the 5% l.o.s.
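The expected frequencies and the value of $\chi^2$ for this contingency table can be reproduced with a short Python sketch. The function names are illustrative, and the table value 9.488 is the one quoted in the solution.

```python
def expected_frequencies(table):
    """E_ij = (row total x column total) / N for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi-square, d.f.) for an r x s contingency table."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: interest in economics (low, average, high).
# Columns: ability in mathematics (low, average, high).
observed = [
    [63, 42, 15],
    [58, 61, 31],
    [14, 47, 29],
]
chi2, df = chi_square_independence(observed)
TABLE_VALUE = 9.488   # chi-square table value for 4 d.f. at the 5% level

print(f"chi-square = {chi2:.2f} with {df} d.f.")
print("H0 rejected" if chi2 > TABLE_VALUE else "H0 accepted")
```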
Ability in mathematics and interest in economics are not independent.

2.5.8 GOODNESS OF FIT FOR BINOMIAL DISTRIBUTION:
Definition: A random variable X is said to follow the binomial distribution if it assumes only non-negative values and its probability mass function is given by
$$P(X = x) = p(x) = \begin{cases} \dbinom{n}{x} p^x q^{n-x}, & x = 0, 1, 2, \ldots, n;\ q = 1 - p \\ 0, & \text{otherwise} \end{cases}$$
The two independent constants n and p in the distribution are known as the parameters of the distribution. ‘n’ is also sometimes known as the degree of the binomial distribution. The binomial distribution is a discrete distribution, as X can take only the integral values 0, 1, 2, …, n. Any random variable which follows the binomial distribution is known as a binomial variate. We shall use the notation X ~ B(n, p) to denote that the random variable X follows the binomial distribution with parameters n and p.
Example: Ten coins are thrown simultaneously. Find the probability of getting at least seven heads.
Solution: p = probability of getting a head = 1/2; q = probability of not getting a head = 1/2.
The probability of getting x heads in a random throw of 10 coins is
$$p(x) = \binom{10}{x} \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{10-x} = \binom{10}{x} \left(\tfrac{1}{2}\right)^{10}; \quad x = 0, 1, 2, \ldots, 10.$$
The probability of getting at least seven heads is given by
$$P(X \geq 7) = p(7) + p(8) + p(9) + p(10) = \left(\tfrac{1}{2}\right)^{10} \left[\binom{10}{7} + \binom{10}{8} + \binom{10}{9} + \binom{10}{10}\right] = \frac{120 + 45 + 10 + 1}{1024} = \frac{176}{1024}.$$
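The coin example can be verified with exact fractions. This is a minimal sketch; `binomial_pmf` is an illustrative helper name, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

p = Fraction(1, 2)   # probability of a head
at_least_seven = sum(binomial_pmf(x, 10, p) for x in range(7, 11))

print(at_least_seven, float(at_least_seven))
```

The result is 11/64, which is 176/1024 in lowest terms, or about 0.172.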
2.5.9 MOMENTS OF BINOMIAL DISTRIBUTION:
The first four moments about the origin of the binomial distribution are obtained as follows:
$$\mu_1' = E(X) = \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x} = np \sum_{x=1}^{n} \binom{n-1}{x-1} p^{x-1} q^{n-x} = np\,(q+p)^{n-1} = np.$$
Thus the mean of the binomial distribution is np.
$$\mu_2' = E(X^2) = \sum_{x=0}^{n} x^2 \binom{n}{x} p^x q^{n-x} = \sum_{x=0}^{n} \{x(x-1) + x\} \binom{n}{x} p^x q^{n-x}$$
$$= n(n-1)p^2 \sum_{x=2}^{n} \binom{n-2}{x-2} p^{x-2} q^{n-x} + np = n(n-1)p^2 (q+p)^{n-2} + np = n(n-1)p^2 + np.$$
$$\mu_3' = E(X^3) = \sum_{x=0}^{n} x^3 p(x) = \sum_{x=0}^{n} \{x(x-1)(x-2) + 3x(x-1) + x\} \binom{n}{x} p^x q^{n-x}$$
$$= n(n-1)(n-2)p^3 \sum_{x=3}^{n} \binom{n-3}{x-3} p^{x-3} q^{n-x} + 3n(n-1)p^2 \sum_{x=2}^{n} \binom{n-2}{x-2} p^{x-2} q^{n-x} + np$$
$$= n(n-1)(n-2)p^3 (q+p)^{n-3} + 3n(n-1)p^2 (q+p)^{n-2} + np = n(n-1)(n-2)p^3 + 3n(n-1)p^2 + np.$$
$$\mu_4' = E(X^4) = \sum_{x=0}^{n} x^4 \binom{n}{x} p^x q^{n-x} = n(n-1)(n-2)(n-3)p^4 + 6n(n-1)(n-2)p^3 + 7n(n-1)p^2 + np.$$
2.5.10 CENTRAL MOMENTS OF BINOMIAL DISTRIBUTION:
$$\mu_2 = \mu_2' - (\mu_1')^2 = n^2p^2 - np^2 + np - n^2p^2 = np(1-p) = npq.$$
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = \{n(n-1)(n-2)p^3 + 3n(n-1)p^2 + np\} - 3\{n(n-1)p^2 + np\}\,np + 2(np)^3$$
$$= np\{3np(1-p) + 2p^2 - 3p + 1 - 3npq\} = np(2p^2 - 3p + 1) = np(2p^2 - 2p + q) = npq(1 - 2p) = npq\{q + p - 2p\} = npq(q - p).$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = npq\{1 + 3(n-2)pq\}.$$
Hence,
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{n^2p^2q^2(q-p)^2}{n^3p^3q^3} = \frac{(q-p)^2}{npq} = \frac{(1-2p)^2}{npq},$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{npq\{1 + 3(n-2)pq\}}{n^2p^2q^2} = \frac{1 + 3(n-2)pq}{npq} = 3 + \frac{1 - 6pq}{npq},$$
$$\gamma_1 = \sqrt{\beta_1} = \frac{q-p}{\sqrt{npq}} = \frac{1-2p}{\sqrt{npq}}, \qquad \gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}.$$
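The closed forms for the central moments and for $\beta_1$, $\beta_2$ can be checked numerically by summing directly over the probability mass function. The sketch below assumes hypothetical parameters n = 12 and p = 0.3; any other valid pair could be substituted.

```python
from math import comb, isclose

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def central_moment(r, n, p):
    """r-th moment about the mean, summed directly over x = 0..n."""
    mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
    return sum((x - mean) ** r * binomial_pmf(x, n, p) for x in range(n + 1))

n, p = 12, 0.3   # hypothetical parameters
q = 1 - p

mu2 = central_moment(2, n, p)
mu3 = central_moment(3, n, p)
mu4 = central_moment(4, n, p)

checks = {
    "mu2 = npq": isclose(mu2, n * p * q),
    "mu3 = npq(q - p)": isclose(mu3, n * p * q * (q - p)),
    "mu4 = npq{1 + 3(n-2)pq}": isclose(mu4, n * p * q * (1 + 3 * (n - 2) * p * q)),
    "beta1 = (q-p)^2/npq": isclose(mu3 ** 2 / mu2 ** 3, (q - p) ** 2 / (n * p * q)),
    "beta2 = 3 + (1-6pq)/npq": isclose(mu4 / mu2 ** 2, 3 + (1 - 6 * p * q) / (n * p * q)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```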
Recurrence relation for the moments of the binomial distribution:
$$\mu_r = E\{X - E(X)\}^r = \sum_{x=0}^{n} (x - np)^r \binom{n}{x} p^x q^{n-x}.$$
Differentiating with respect to p (with q = 1 - p), we get
$$\frac{d\mu_r}{dp} = \sum_{x=0}^{n} \binom{n}{x} \left[-nr(x-np)^{r-1} p^x q^{n-x} + (x-np)^r \{x p^{x-1} q^{n-x} - (n-x) p^x q^{n-x-1}\}\right]$$
$$= -nr \sum_{x=0}^{n} \binom{n}{x} (x-np)^{r-1} p^x q^{n-x} + \sum_{x=0}^{n} \binom{n}{x} (x-np)^r p^x q^{n-x} \left(\frac{x}{p} - \frac{n-x}{q}\right)$$
$$= -nr \sum_{x} (x-np)^{r-1} p(x) + \frac{1}{pq} \sum_{x} (x-np)^{r+1} p(x) = -nr\,\mu_{r-1} + \frac{1}{pq}\,\mu_{r+1},$$
since $\dfrac{x}{p} - \dfrac{n-x}{q} = \dfrac{x - np}{pq}$. Hence
$$\mu_{r+1} = pq\left(nr\,\mu_{r-1} + \frac{d\mu_r}{dp}\right).$$
Putting r = 1, 2, 3 in turn (with $\mu_0 = 1$ and $\mu_1 = 0$), we get
$$\mu_2 = pq\left(n\mu_0 + \frac{d\mu_1}{dp}\right) = npq,$$
$$\mu_3 = pq\left[2n\mu_1 + \frac{d\mu_2}{dp}\right] = pq\,\frac{d(npq)}{dp} = npq\,\frac{d}{dp}\{p(1-p)\} = npq\,\frac{d}{dp}(p - p^2) = npq(1-2p) = npq(q-p),$$
$$\mu_4 = pq\left[3n\mu_2 + \frac{d\mu_3}{dp}\right] = pq\left[3n \cdot npq + \frac{d}{dp}\{npq(q-p)\}\right] = pq\left[3n^2pq + n\frac{d}{dp}\{p(1-p)(1-2p)\}\right]$$
$$= pq\left[3n^2pq + n\frac{d}{dp}(p - 3p^2 + 2p^3)\right] = pq\left[3n^2pq + n(1 - 6p + 6p^2)\right] = pq\left[3n^2pq + n(1 - 6pq)\right]$$
$$= npq\,[3npq + 1 - 6pq] = npq\,[1 + 3pq(n-2)].$$

Example: A survey of 320 families with 5 children each revealed the following distribution:

No. of boys          5     4     3     2     1     0
No. of girls         0     1     2     3     4     5
No. of families     14    56   110    88    40    12

Is this result consistent with the hypothesis that male and female births are equally probable?
Solution: Take $H_0$: male and female births are equally probable. $H_1$: male and female births are not equally probable.
p = probability of a male birth = 1/2, q = 1 - p = 1 - 1/2 = 1/2.
p(x) = probability of x male births in a family of 5 children:
$$p(x) = \binom{5}{x} p^x q^{5-x} = \binom{5}{x} \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{5-x} = \binom{5}{x} \left(\tfrac{1}{2}\right)^5.$$
$$\text{Expected frequency} = N\,p(x) = 320 \binom{5}{x} \left(\tfrac{1}{2}\right)^5 = 320 \binom{5}{x} \frac{1}{32} = 10 \binom{5}{x}.$$
E(14) = $\binom{5}{5}$ x 10 = 10,      E(56) = $\binom{5}{4}$ x 10 = 50,
E(110) = $\binom{5}{3}$ x 10 = 100,    E(88) = $\binom{5}{2}$ x 10 = 100,
E(40) = $\binom{5}{1}$ x 10 = 50,      E(12) = $\binom{5}{0}$ x 10 = 10.
Under $H_0$, $\chi^2 = \sum \dfrac{(O - E)^2}{E} \sim \chi^2$ with (n - 1) d.f., where n = 6 is the number of classes.

O        E       (O-E)^2/E
14       10       1.600
56       50       0.720
110     100       1.000
88      100       1.440
40       50       2.000
12       10       0.400
Total             7.16

d.f. = n - 1 = 5. Let $\alpha$ = 5% = 0.05; the table value is $\chi^2(0.05) = 11.07$. Since $\chi^2 = 7.16 < \chi^2(0.05)$, $H_0$ is accepted. Male and female births are equally probable.
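The binomial goodness-of-fit calculation for the 320 families can be reproduced with a few lines of Python. This is a sketch under the hypothesis p = 1/2 stated in the solution; the variable names are illustrative.

```python
from math import comb

# Number of boys -> observed number of families (from the survey table).
observed = {5: 14, 4: 56, 3: 110, 2: 88, 1: 40, 0: 12}
N = sum(observed.values())   # total number of families
n_children, p = 5, 0.5       # H0: male and female births equally probable

# Expected frequency E(x) = N * C(5, x) * p^x * q^(5 - x).
expected = {
    x: N * comb(n_children, x) * p ** x * (1 - p) ** (n_children - x)
    for x in observed
}
chi2 = sum((observed[x] - expected[x]) ** 2 / expected[x] for x in observed)
df = len(observed) - 1       # number of classes minus one
TABLE_VALUE = 11.07          # chi-square table value for 5 d.f. at the 5% level

for x in sorted(observed, reverse=True):
    print(f"boys = {x}: O = {observed[x]:3d}, E = {expected[x]:5.1f}")
print(f"chi-square = {chi2:.2f} with {df} d.f.")
print("H0 accepted" if chi2 < TABLE_VALUE else "H0 rejected")
```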
2.5.11 POISSON DISTRIBUTION:
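Definition: A random variable X is said to follow a Poisson distribution if it assumes only non-negative values and its probability mass function is given by
$$p(x, \lambda) = P(X = x) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!}, & x = 0, 1, 2, \ldots;\ \lambda > 0 \\ 0, & \text{otherwise} \end{cases}$$
Here $\lambda$ is known as the parameter of the distribution.
Moments of the Poisson distribution:
$$\mu_1' = E(X) = \sum_{x=0}^{\infty} x\,p(x, \lambda) = \sum_{x=0}^{\infty} x\,\frac{e^{-\lambda}\lambda^x}{x!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = \lambda e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \cdots\right) = \lambda e^{-\lambda} e^{\lambda} = \lambda.$$
Thus the mean is $\lambda$.
$$\mu_2' = E(X^2) = \sum_{x=0}^{\infty} x^2 p(x, \lambda) = \sum_{x=0}^{\infty} \{x(x-1) + x\}\frac{e^{-\lambda}\lambda^x}{x!} = \lambda^2 e^{-\lambda} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} + \lambda = \lambda^2 e^{-\lambda} e^{\lambda} + \lambda = \lambda^2 + \lambda.$$
$$\mu_3' = E(X^3) = \sum_{x=0}^{\infty} \{x(x-1)(x-2) + 3x(x-1) + x\}\frac{e^{-\lambda}\lambda^x}{x!}$$
$$= e^{-\lambda}\lambda^3 \sum_{x=3}^{\infty} \frac{\lambda^{x-3}}{(x-3)!} + 3e^{-\lambda}\lambda^2 \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} + \lambda = \lambda^3 + 3\lambda^2 + \lambda.$$
$$\mu_4' = E(X^4) = \sum_{x=0}^{\infty} \{x(x-1)(x-2)(x-3) + 6x(x-1)(x-2) + 7x(x-1) + x\}\frac{e^{-\lambda}\lambda^x}{x!}$$
$$= \lambda^4 (e^{-\lambda}e^{\lambda}) + 6\lambda^3 (e^{-\lambda}e^{\lambda}) + 7\lambda^2 (e^{-\lambda}e^{\lambda}) + \lambda = \lambda^4 + 6\lambda^3 + 7\lambda^2 + \lambda.$$
The four central moments:
$$\mu_2 = \mu_2' - (\mu_1')^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.$$
Thus the mean and variance of the Poisson distribution are each equal to $\lambda$.
$$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = (\lambda^3 + 3\lambda^2 + \lambda) - 3\lambda(\lambda^2 + \lambda) + 2\lambda^3 = \lambda.$$
$$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = (\lambda^4 + 6\lambda^3 + 7\lambda^2 + \lambda) - 4\lambda(\lambda^3 + 3\lambda^2 + \lambda) + 6\lambda^2(\lambda^2 + \lambda) - 3\lambda^4 = 3\lambda^2 + \lambda.$$
The coefficients of skewness and kurtosis are given by
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{\lambda^2}{\lambda^3} = \frac{1}{\lambda}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\lambda^2 + \lambda}{\lambda^2} = 3 + \frac{1}{\lambda},$$
and
$$\gamma_1 = \sqrt{\beta_1} = \frac{1}{\sqrt{\lambda}}, \qquad \gamma_2 = \beta_2 - 3 = \frac{1}{\lambda}.$$
The Poisson distribution is a skewed distribution. Proceeding to the limit as $\lambda \to \infty$, $\beta_1 \to 0$ and $\beta_2 \to 3$.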
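The Poisson moments derived above can be checked numerically by truncating the infinite sums. The sketch assumes a hypothetical parameter $\lambda$ = 2.5 and cuts the sums off at x = 100, beyond which the probabilities are negligible for this value of $\lambda$.

```python
from math import exp, isclose

def poisson_pmf_table(lam, x_max):
    """P(X = x) for x = 0..x_max, built with p(x) = p(x-1) * lam / x."""
    probs = [exp(-lam)]
    for x in range(1, x_max + 1):
        probs.append(probs[-1] * lam / x)
    return probs

def central_moment(r, probs, mean):
    """r-th moment about the mean for the truncated distribution."""
    return sum((x - mean) ** r * p for x, p in enumerate(probs))

lam = 2.5                              # hypothetical parameter
probs = poisson_pmf_table(lam, 100)    # truncation point is an assumption
mean = sum(x * p for x, p in enumerate(probs))
mu2 = central_moment(2, probs, mean)
mu3 = central_moment(3, probs, mean)
mu4 = central_moment(4, probs, mean)

checks = {
    "mean = lambda": isclose(mean, lam),
    "mu2 = lambda": isclose(mu2, lam),
    "mu3 = lambda": isclose(mu3, lam),
    "mu4 = 3 lambda^2 + lambda": isclose(mu4, 3 * lam ** 2 + lam),
    "beta2 = 3 + 1/lambda": isclose(mu4 / mu2 ** 2, 3 + 1 / lam),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```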
Example: Fit a Poisson distribution to the following data and test the goodness of fit.

x:     0     1     2     3     4     5     6
f:   275    72    30     7     5     2     1
Solution: Mean $\bar{x} = \sum f_i x_i / N = 189/392 = 0.482$.
Take $H_0$: the fit is good. $H_1$: the fit is not good.
Expected frequencies = N p(x), where $p(x) = e^{-\lambda}\lambda^x / x!$, x = 0, 1, 2, …, 6, with $\lambda$ estimated by $\bar{x} = 0.482$.
E(275) = 392 x $e^{-0.482}$ x $(0.482)^0/0!$ = 242.08
E(72) = 392 x $e^{-0.482}$ x $(0.482)^1/1!$ = 116.68
E(30) = 392 x $e^{-0.482}$ x $(0.482)^2/2!$ = 28.12
E(7) = 392 x $e^{-0.482}$ x $(0.482)^3/3!$ = 4.52
E(5) = 392 x $e^{-0.482}$ x $(0.482)^4/4!$ = 0.54
E(2) = 392 x $e^{-0.482}$ x $(0.482)^5/5!$ = 0.05
E(1) = 392 x $e^{-0.482}$ x $(0.482)^6/6!$ = 0.004
Under $H_0$, the test statistic is $\chi^2 = \sum \dfrac{(O - E)^2}{E} = 40.94$.
d.f. = n - 1 - 4 = 7 - 1 - 4 = 2. The reduction by 4 accounts for the one parameter (the mean) estimated from the data and for pooling the last four classes, whose expected frequencies are small.
Let $\alpha$ = 5% = 0.05; the table value is $\chi^2(\alpha) = 5.991$.
Since $\chi^2 > \chi^2(\alpha)$, $H_0$ is rejected and $H_1$ is accepted at the 5% l.o.s. The Poisson distribution is not a good fit to the given data.
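The Poisson fit can be sketched in Python as follows. Two points are assumptions rather than statements of the text: the classes x = 3 to 6 are pooled into one class because their expected frequencies are small (this is what the 2 degrees of freedom in the solution imply), and the mean is kept at full precision instead of being rounded to 0.482. The computed statistic is therefore close to, but not exactly, the printed 40.94.

```python
from math import exp, factorial

x_values = [0, 1, 2, 3, 4, 5, 6]
freqs = [275, 72, 30, 7, 5, 2, 1]

N = sum(freqs)
mean = sum(x * f for x, f in zip(x_values, freqs)) / N   # estimate of lambda

# Expected frequencies E(x) = N * e^(-lambda) * lambda^x / x!
expected = [N * exp(-mean) * mean ** x / factorial(x) for x in x_values]

# Assumption: pool the classes x >= 3, whose expected frequencies are small.
obs_pooled = freqs[:3] + [sum(freqs[3:])]
exp_pooled = expected[:3] + [sum(expected[3:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1 - 1   # one d.f. for the total, one for the estimated mean
TABLE_VALUE = 5.991            # chi-square table value for 2 d.f. at the 5% level

print(f"N = {N}, mean = {mean:.3f}")
print("expected:", [round(e, 2) for e in expected])
print(f"chi-square = {chi2:.2f} with {df} d.f.")
print("H0 rejected" if chi2 > TABLE_VALUE else "H0 accepted")
```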
Check your progress:
3. What is known as median?
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
4. What is variance?
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
2.6 SUMMARY: ANOVA is a powerful tool for tests of significance. It is based on the F-distribution. It tests the differences among sample means for significance. Variations are classified into two kinds, based on their causes. These causes can be detected and measured. ANOVA is the separation of the variance ascribable to one group of causes from the variance ascribable to another group. For the test to be valid, certain assumptions are made. The variance in the observations can be split into two components, namely the variance between the classes and the variance within the classes. The sum of squares divided by its degrees of freedom gives the corresponding variance, called the mean sum of squares. The model is formulated as $x_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$, where $\varepsilon_{ij}$ is the error term. The t-distribution has a number of applications in statistics: testing whether a sample mean differs significantly from a hypothetical mean, testing the difference between two sample means, and testing the significance of an observed sample correlation coefficient, partial correlation coefficient and sample regression coefficient. The chi-square test is applied in statistics to test goodness of fit, that is, to check how well the distribution of observed data agrees with an assumed theoretical distribution.
2.7 KEYWORDS
ANOVA: ANOVA is the separation of the variance ascribable to one group of causes from the variance ascribable to another group.
T-test: Let $x_i$ (i = 1, 2, …, n) be a random sample of size n from a normal population with mean $\mu$ and variance $\sigma^2$. Then Student's t is defined by $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$.
F-test: The F table gives the critical values of F for the right-tailed test, i.e. the critical region is determined by the right-tail areas.
Mean: The average or mean is a value which is typical or representative of a set of data. An average is only a short way of expressing an arithmetical result.
Median: The median can be defined as the value of that item which divides the series into two equal parts, one half containing values greater than it and the other half containing values less than it.
Variance: The square of the standard deviation is called the variance. Symbolically, variance = $\sigma^2$, where $\sigma$ = standard deviation = $\sqrt{\text{variance}}$.
Chi-square test: The chi-square test is applied in statistics to test goodness of fit, that is, to check how well the distribution of observed data agrees with an assumed theoretical distribution.
Binomial distribution: A random variable X is said to follow the binomial distribution if it assumes only non-negative values and its probability mass function is given by $P(X = x) = p(x) = \binom{n}{x} p^x q^{n-x}$; x = 0, 1, 2, …, n; q = 1 - p.
2.8 ANSWER FOR SELF-ASSESSMENT:
1. Analysis of variance (ANOVA) is the separation of the variance ascribable to one group of causes from the variance ascribable to another group.
2. The chi-square test is applied in statistics to test goodness of fit, that is, to check how well the distribution of observed data agrees with an assumed theoretical distribution. It is therefore a measure of the divergence between actual and expected frequencies. If there is no difference between the actual and expected frequencies, $\chi^2$ is zero.
3. The median can be defined as the value of that item which divides the series into two equal parts, one half containing values greater than it and the other half containing values less than it.
4. The square of the standard deviation is called the variance. Symbolically, variance = $\sigma^2$, where $\sigma$ = standard deviation = $\sqrt{\text{variance}}$.
2.9 EXERCISE QUESTIONS:
1. What is ANOVA?
2. Write about the ANOVA table for one-way classified data.
3. Write about the ANOVA table for two-way classified data.
4. What is the Chi-square test?
5. Write about goodness of fit for the Binomial distribution.
-------------------------------------------------------THE END--------------------------------------------------------
Published by
Institute of Management & Technical Studies
Address: E41, Sector 3, Noida (U.P)
www.imtsinstitute.com | Contact: 91+9210989898