Institute of Management & Technical Studies
APPLIED OPERATIONS RESEARCH & STATISTICS 500
EXECUTIVE MBA www.imtsinstitute.com
IMTS (ISO 9001-2008 Internationally Certified) APPLIED OPERATIONS RESEARCH & STATISTICS
FOR MORE DETAILS VISIT US ON WWW.IMTSINSTITUTE.COM OR CALL ON +91-9999554621
APPLIED OPERATIONS RESEARCH AND STATISTICS
CONTENTS:
Chapter 01: Linear Programming Problem (LPP) – Meaning – Formulation – Graphical method – Simplex method – Big-M method
Chapter 02: Transportation problem – Degeneracy in transportation problem – North-West Corner rule – Least cost method – Vogel's Approximation method
Chapter 03: Assignment problem – Formulations of the assignment problem – Traveling salesman problem – Hungarian method
Chapter 04: Sequencing problem – Processing n jobs through two machines – Processing n jobs through three machines – Processing two jobs through m machines
Chapter 05: Game theory – Meaning – Saddle point – Two person zero sum game – Dominance property
Chapter 06: Network project scheduling – Network and basic components – Rules – CPM – PERT for project scheduling
Chapter 07: Sampling distribution – Concepts – Normal distribution – Sample size – Standard error – Point and interval estimation – Estimation of proportion and mean for small and large samples
Chapter 08: Testing of hypothesis – Hypothesis testing of proportion and mean for single and two tail test – Errors in hypothesis testing and measuring the hypothesis testing
Chapter 09: Correlation and Regression – Meaning – Types – Methods of studying correlation – Regression equations
Chapter 10: Chi-Square Analysis – Introduction – Estimation of Chi-square and interpolation – Chi-square distribution – Chi-square tests for independence and goodness of fit
Chapter 11: F-test and Analysis of Variance (ANOVA) – Introduction – Estimation and interpretation of F-test – One way classification and two way classification of ANOVA – Assumptions of the analysis of variance
Chapter 12: Time series analysis – Variation in time series analysis – Trend analysis – Cyclical, seasonal and irregular variation
CHAPTER – I
LINEAR PROGRAMMING PROBLEM (LPP)
STRUCTURE
1.1 INTRODUCTION
1.2 TERMINOLOGY
1.3 FORMULATION OF LPP
1.4 GRAPHICAL METHOD
1.5 SIMPLEX METHOD
1.6 BIG M-METHOD
1.7 QUESTIONS
1.8 SUGGESTED READINGS
LINEAR PROGRAMMING PROBLEMS (LPP)
1.1 INTRODUCTION
Linear Programming deals with the optimization of a function of variables, known as the objective function, subject to a set of linear equalities/inequalities known as constraints. The objective function may be profit, loss, cost, production capacity or any other measure of effectiveness which is to be obtained in the best possible or optimal manner. The constraints may be imposed by different sources such as market demand, production processes and equipment, storage capacity, raw material availability, etc. By linearity is meant a mathematical expression in which the variables have unit power only.
Linear Programming is used for optimization problems that satisfy the following conditions:
1. There is a well defined objective function to be optimized and which can be expressed as a linear function of decision variables.
2. There are constraints on the attainment of the objective and they are capable of being expressed as linear equalities/inequalities in terms of variables.
3. There are alternative courses of action.
4. The decision variables are interrelated and non-negative.
5. Resources are in limited supply.
1.2 Terminology.
The function to be maximized or minimized is called the objective function.
A vector, x for the standard maximum problem or y for the standard minimum problem, is said to be feasible if it satisfies the corresponding constraints.
The set of feasible vectors is called the constraint set.
A linear programming problem is said to be feasible if the constraint set is not empty; otherwise it is said to be infeasible.
A feasible maximum (resp. minimum) problem is said to be unbounded if the objective function can assume arbitrarily large positive (resp. negative) values at feasible vectors; otherwise, it is said to be bounded. Thus there are three possibilities for a linear programming problem. It may be bounded feasible, it may be unbounded feasible, and it may be infeasible.
The value of a bounded feasible maximum (resp. minimum) problem is the maximum (resp. minimum) value of the objective function as the variables range over the constraint set.
A feasible vector at which the objective function achieves the value is called optimal.
1.3 FORMULATION OF LINEAR PROGRAMMING PROBLEM
A linear programming problem was defined as maximizing or minimizing a linear function subject to linear constraints. All such problems can be converted into the form of a standard maximum problem by the following techniques.
(1) A minimum problem can be changed to a maximum problem by multiplying the objective function by −1.
(2) Some variables may not be restricted to be nonnegative. An unrestricted variable xj may be replaced by the difference of two nonnegative variables, xj = uj − vj, where uj ≥ 0 and vj ≥ 0. This adds one variable and two nonnegativity constraints to the problem.
Any theory derived for problems in standard form is therefore applicable to general problems. However, from a computational point of view, the enlargement of the number of variables and constraints in (2) is undesirable and, as will be seen later, can be avoided.
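The two conversion techniques can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_standard_max` and its argument layout (a coefficient list, a list of constraint rows, and a list of indices of unrestricted variables) are hypothetical and do not come from the text.

```python
def to_standard_max(c, A, free=(), minimize=False):
    """Return (c2, A2) for an equivalent problem whose variables are all >= 0.

    c        : objective coefficients
    A        : constraint rows (left-hand sides only; right-hand sides are unchanged)
    free     : indices of variables that are unrestricted in sign
    minimize : True if the original problem is a minimum problem
    """
    # Technique (1): a minimum problem becomes a maximum problem
    # when the objective function is multiplied by -1.
    sign = -1 if minimize else 1
    c2 = [sign * cj for cj in c]
    A2 = [list(row) for row in A]
    # Technique (2): x_j = u_j - v_j. The existing column plays the role of u_j;
    # a new column with negated coefficients is appended for v_j.
    for j in free:
        c2.append(-c2[j])
        for row in A2:
            row.append(-row[j])
    return c2, A2


# Hypothetical data: minimize x1 - 3x2 + 3x3 with x2 unrestricted in sign.
c2, A2 = to_standard_max([1, -3, 3], [[3, -1, 2]], free=[1], minimize=True)
print(c2, A2)
```

Each unrestricted variable adds exactly one column, which is the enlargement of the problem that the paragraph above describes as undesirable for computation.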
Example 1.1
A firm produces three products. These products are processed on three different machines. The time required to manufacture one unit of each of the three products and the daily capacity of the three machines are given in the table below.
Time per unit (minutes)
Machine    Product 1    Product 2    Product 3    Machine capacity (minutes/day)
M1         2            3            2            440
M2         4            -            3            470
M3         2            5            -            430
It is required to determine the daily number of units to be manufactured for each product. The profit per unit for products 1, 2 and 3 is Rs.4, Rs.3 and Rs.6 respectively. It is assumed that all the amounts produced are consumed in the market. Formulate the mathematical model for the problem.
Solution
Step 1: From the study of the situation find the key decision to be made. In this connection, looking for variables helps considerably. In the given situation the key decision is to decide the extent of products 1, 2 and 3, as the extents are permitted to vary.
Step 2: Assume symbols for the variable quantities noticed in step 1. Let the extents (amounts) of products 1, 2 and 3 manufactured daily be x1, x2 and x3 respectively.
Step 3: Express the feasible alternatives mathematically in terms of variables. Feasible alternatives are those which are physically, economically and financially possible. In the given situation feasible alternatives are sets of values of x1, x2 and x3, where x1, x2, x3 ≥ 0, since negative production has no meaning and is not feasible.
Step 4: Mention the objective quantitatively and express it as a linear function of variables. In the present situation, the objective is to maximize the profit, i.e., maximize Z = 4x1 + 3x2 + 6x3.
Step 5: Put into words the influencing factors (or constraints). These occur generally because of constraints on availability (resources) or requirements (demands). Express these constraints also as linear equalities/inequalities in terms of variables. Here, constraints are on the machine capacities and can be mathematically expressed as
2x1 + 3x2 + 2x3 ≤ 440,
4x1 + 0x2 + 3x3 ≤ 470,
2x1 + 5x2 + 0x3 ≤ 430.
The complete linear programming problem is then given by
Maximize Z = 4x1 + 3x2 + 6x3,
subject to
2x1 + 3x2 + 2x3 ≤ 440,
4x1 + 0x2 + 3x3 ≤ 470,
2x1 + 5x2 + 0x3 ≤ 430,
x1, x2, x3 ≥ 0.
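As a quick check on the formulation of Example 1.1, the model can be written down as plain data and a candidate production plan tested against it. The sketch below is illustrative only; the helper name `evaluate_plan` and the trial plans are assumptions, not part of the example.

```python
# Data of Example 1.1.
profit = [4, 3, 6]                 # Rs. per unit of products 1, 2, 3
usage = [[2, 3, 2],                # minutes per unit on M1
         [4, 0, 3],                # minutes per unit on M2
         [2, 5, 0]]                # minutes per unit on M3
capacity = [440, 470, 430]         # minutes available per day


def evaluate_plan(plan):
    """Return the profit Z of a plan (x1, x2, x3), or None if it is infeasible."""
    if any(x < 0 for x in plan):
        return None                # negative production is not feasible
    for row, cap in zip(usage, capacity):
        if sum(a * x for a, x in zip(row, plan)) > cap:
            return None            # a machine capacity is exceeded
    return sum(p * x for p, x in zip(profit, plan))


print(evaluate_plan((50, 60, 50)))   # a hypothetical feasible plan
print(evaluate_plan((200, 0, 0)))    # exceeds the capacity of M2
```

The function only tests feasibility and evaluates Z; finding the best plan is the job of the solution methods in the following sections.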
1.4 GRAPHICAL METHOD
A linear programming problem with only two variables presents a simple case, for which the solution can be derived using a graphical method. This method consists of the following steps:
1. Represent the given problem in mathematical form, i.e., formulate an L.P. model for the given problem.
2. Represent the given constraints as equalities on the x1, x2 co-ordinate plane and find the convex region formed by them.
3. Plot the objective function.
4. Find the vertices of the convex region and also the value of the objective function at each vertex. The vertex that gives the optimum value of the objective function gives the optimal solution to the problem.
In general, a linear programming problem may have
(i) a definite and unique optimal solution,
(ii) an infinite number of optimal solutions,
(iii) an unbounded solution, and
(iv) no solution.
Example: A firm manufactures two products A & B on which the profits earned per unit are Rs.3 and Rs.4 respectively. Each product is processed on two machines M1 and M2. Product A requires one minute of processing time on M1 and two minutes on M2, while B requires one minute on M1 and one minute on M2. Machine M1 is available for not more than 7 hrs. 30 mins., while machine M2 is available for not more than 10 hrs. during any working day. Find the number of units of products A and B to be manufactured to get maximum profit.
Formulation of LPP
Step 1: Key decision is to determine the extent of manufacturing the products A and B.
Step 2: Let these extents be x1 and x2 respectively.
Step 3: Feasible alternatives are sets of values of x1, x2 where x1 ≥ 0 and x2 ≥ 0.
Step 4: Objective is to maximize the profit, i.e., maximize Z = 3x1 + 4x2.
Step 5: Constraints are on the time available (in minutes) for machines M1 and M2, i.e.,
for machine M1, 1x1 + 1x2 ≤ 450,
for machine M2, 2x1 + 1x2 ≤ 600.
Solution: The two constraints are x1 + x2 ≤ 450 and 2x1 + x2 ≤ 600. We plot the line x1 + x2 = 450 by joining two convenient points, say (0,450) and (450,0), and the line 2x1 + x2 = 600 by joining points, say (0,600) and (300,0). Then any point lying on or below the line x1 + x2 = 450 satisfies the constraint x1 + x2 ≤ 450. Similarly, any point lying on or below the line 2x1 + x2 = 600 satisfies the constraint 2x1 + x2 ≤ 600. This is clearly indicated by the direction of arrow heads in the figure. The shaded area in the figure satisfies both the constraints x1 + x2 ≤ 450 and 2x1 + x2 ≤ 600 and also the non-negativity restrictions x1 ≥ 0, x2 ≥ 0. This area is called the solution space or the region of feasible solutions. Any point in this shaded region is a feasible solution to the given problem.
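[Figure: the lines x1 + x2 = 450 and 2x1 + x2 = 600 plotted on the x1, x2 plane, both axes marked from 100 to 600, with the feasible region shaded.]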
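The corner points of the shaded region can also be found numerically by intersecting the boundary lines in pairs and keeping the intersections that satisfy every constraint. The following is a minimal sketch of that idea for the example above; the function name `corner_points` and the way the constraints are encoded are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, rhs), meaning a1*x1 + a2*x2 <= rhs.
# The non-negativity restrictions are written as -x1 <= 0 and -x2 <= 0.
constraints = [(1, 1, 450), (2, 1, 600), (-1, 0, 0), (0, -1, 0)]


def corner_points(cons):
    """Intersect every pair of boundary lines and keep the feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                      # parallel lines never meet
        # Cramer's rule, with exact fractions to avoid rounding trouble.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in cons):
            points.append((x, y))
    return points


vertices = corner_points(constraints)
best = max(vertices, key=lambda p: 3 * p[0] + 4 * p[1])   # Z = 3x1 + 4x2
print(vertices)
print(best, 3 * best[0] + 4 * best[1])
```

Evaluating Z at each of these points is exactly the vertex comparison used in the graphical method; the approach is practical only for small two-variable problems.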
Method 2: The four vertices of the convex region OCDE are O(0,0), C(0,450), D(150,300) and E(300,0). Values of the objective function Z = 3x1 + 4x2 at these vertices are Z(O) = 0, Z(C) = 1800, Z(D) = 450 + 1200 = 1650, Z(E) = 900. Thus the maximum value of Z is Rs.1800 and it occurs at the vertex C(0,450). Hence the solution to the problem is x1 = 0, x2 = 450 and Z = Rs.1800.

1.5 SIMPLEX METHOD
The simplex method or technique is an iterative procedure for solving linear programming problems.
Example 1: Maximize Z = 3x1 + 4x2
subject to
x1 + x2 ≤ 450,
2x1 + x2 ≤ 600,
x1, x2 ≥ 0.
Solution: Introducing slack variables s1, s2, the standard form of the problem is
Maximize Z = 3x1 + 4x2 + 0s1 + 0s2
subject to
x1 + x2 + s1 + 0s2 = 450,
2x1 + x2 + 0s1 + s2 = 600,
x1, x2, s1, s2 ≥ 0.
Put x1 = 0 and x2 = 0 to find the initial basic feasible solution s1 = 450, s2 = 600.

Table-1
           cj       3      4      0      0
cB    Basis        x1     x2     s1     s2       b
0     s1            1      1      1      0     450
0     s2            2      1      0      1     600

Table-2
           cj       3      4      0      0
cB    Basis        x1     x2     s1     s2       b
0     s1            1      1      1      0     450
0     s2            2      1      0      1     600
      Ej            0      0      0      0
      Cj-Ej         3      4      0      0
Since Cj- Ej is positive under x1, x2- columns, Table- 2 is not optimal.
x2 and s1 are the incoming and outgoing variables respectively and 1 is the key element. In Table-3, s1 is replaced by x2 in the basis.

Table-3
           cj       3      4      0      0
cB    Basis        x1     x2     s1     s2       b
4     x2            1      1      1      0     450
0     s2            1      0     -1      1     150
      Ej            4      4      4      0    1800
      Cj-Ej        -1      0     -4      0
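The pivoting that leads from Table-2 to Table-3 can be reproduced with a small tableau routine. The sketch below is a bare-bones illustration, not a production solver: it assumes a maximization problem with "≤" constraints and non-negative right-hand sides (so the slack variables form the starting basis), it has no safeguard against cycling, and the name `simplex_max` is hypothetical.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0."""
    m, n = len(A), len(c)
    # Each row holds [coefficients | slack columns | b], as in Table-1.
    rows = [[Fraction(v) for v in A[i]]
            + [Fraction(int(i == k)) for k in range(m)]
            + [Fraction(b[i])] for i in range(m)]
    cost = [Fraction(v) for v in c] + [Fraction(0)] * m
    basis = [n + i for i in range(m)]          # start with the slacks in the basis
    while True:
        # Net evaluation Cj - Ej for every column.
        net = [cost[j] - sum(cost[basis[i]] * rows[i][j] for i in range(m))
               for j in range(n + m)]
        col = max(range(n + m), key=lambda j: net[j])     # incoming variable
        if net[col] <= 0:
            break                                          # table is optimal
        ratios = [(rows[i][-1] / rows[i][col], i)
                  for i in range(m) if rows[i][col] > 0]
        if not ratios:
            raise ValueError("unbounded solution")
        _, r = min(ratios)                                 # outgoing variable
        key = rows[r][col]                                 # key element
        rows[r] = [v / key for v in rows[r]]
        for i in range(m):
            if i != r:
                f = rows[i][col]
                rows[i] = [a - f * p for a, p in zip(rows[i], rows[r])]
        basis[r] = col
    x = [Fraction(0)] * (n + m)
    for i, j in enumerate(basis):
        x[j] = rows[i][-1]
    return x[:n], sum(cost[j] * x[j] for j in range(n + m))


solution, z = simplex_max([3, 4], [[1, 1], [2, 1]], [450, 600])
print(solution, z)
```

For Example 1 the routine performs a single pivot (x2 enters, s1 leaves, key element 1) and stops, in agreement with Table-3.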
Since Cj-Ej is either negative or zero under all columns, Table-3 is optimal and the optimal basic feasible solution is given by x1 = 0, x2 = 450 and Z = Rs.1800.

1.6 BIG M-METHOD
Example 2: Use the penalty (Big M) method to
Maximize Z = 6x1 + 4x2
subject to
2x1 + 3x2 ≤ 30,
3x1 + 2x2 ≤ 24,
x1 + x2 ≥ 3,
x1, x2 ≥ 0.
Is the solution unique? If not, give two different solutions.

Example 3: Minimize Z = x1 - 3x2 + 3x3
subject to
3x1 - x2 + 2x3 ≤ 7,
2x1 + 4x2 ≥ -12,
-4x1 + 3x2 + 8x3 ≤ 10,
x1, x2, x3 ≥ 0.
Solution: Multiplying the second constraint throughout by -1, it can be written as -2x1 - 4x2 ≤ 12.
Introducing slack variables s1, s2, s3, the problem can be expressed in the standard form as
Minimize Z = x1 - 3x2 + 3x3 + 0s1 + 0s2 + 0s3
subject to
3x1 - x2 + 2x3 + s1 + 0s2 + 0s3 = 7,
-2x1 - 4x2 + 0x3 + 0s1 + s2 + 0s3 = 12,
-4x1 + 3x2 + 8x3 + 0s1 + 0s2 + s3 = 10,
x1, x2, x3, s1, s2, s3 ≥ 0.
Table 1
           cj       1     -3      3      0      0      0
cB    Basis        x1     x2     x3     s1     s2     s3       b
0     s1            3     -1      2      1      0      0       7
0     s2           -2     -4      0      0      1      0      12
0     s3           -4      3      8      0      0      1      10
      Ej            0      0      0      0      0      0       0
      Cj-Ej         1     -3      3      0      0      0
Since Cj-Ej is negative under the x2 column, Table 1 is not optimal. In Table 1, x2 is the incoming variable, s3 is the outgoing variable and 3 is the key element. This element is made unity in Table 2 by dividing the s3 row by 3.

Table 2
           cj       1     -3      3      0      0      0
cB    Basis        x1     x2     x3     s1     s2     s3       b
0     s1          5/3      0   14/3      1      0    1/3    31/3
0     s2        -22/3      0   32/3      0      1    4/3    76/3
-3    x2         -4/3      1    8/3      0      0    1/3    10/3
      Ej            4     -3     -8      0      0     -1     -10
      Cj-Ej        -3      0     11      0      0      1
Since Cj-Ej is negative under the x1 column, Table-2 is not optimal. In Table-2, x1 is the incoming variable, s1 is the outgoing variable and 5/3 is the key element. This element is made unity in Table-3.

Table-3
           cj       1     -3      3      0      0      0
cB    Basis        x1     x2     x3     s1     s2     s3       b
1     x1            1      0   14/5    3/5      0    1/5    31/5
0     s2            0      0  156/5   22/5      1   14/5   354/5
-3    x2            0      1   32/5    4/5      0    3/5    58/5
      Ej            1     -3  -82/5   -9/5      0   -8/5  -143/5
      Cj-Ej         0      0   97/5    9/5      0    8/5
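The b column of Table-3 can be checked directly against the original constraints of Example 3 using exact fractions. The short sketch below is only a verification aid; the variable names are illustrative.

```python
from fractions import Fraction as F

# Values of the basic variables read from the b column of Table-3 (x3 is non-basic).
x1, x2, x3 = F(31, 5), F(58, 5), F(0)

# Objective of Example 3: Z = x1 - 3x2 + 3x3.
z = x1 - 3 * x2 + 3 * x3

# Left-hand sides of the three "<=" constraints (second one already multiplied by -1).
lhs = (3 * x1 - x2 + 2 * x3,
       -2 * x1 - 4 * x2,
       -4 * x1 + 3 * x2 + 8 * x3)
rhs = (7, 12, 10)

# Slack in each constraint: s1, s2, s3.
slack = tuple(r - l for l, r in zip(lhs, rhs))
print(z, slack)
```

The slacks come out as s1 = 0, s2 = 354/5 and s3 = 0, which agrees with s2 being the only slack variable left in the basis of Table-3.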
Since Cj-Ej is non-negative under all variable columns, Table-3 is optimal. The optimal basic feasible solution is x1 = 31/5, x2 = 58/5, x3 = 0 and Z = -143/5.

1.7 QUESTIONS
PART-A (ONE MARK)
1. If a decision variable is unrestricted in sign, it is replaced by the difference of two ……………….. variables.
2. An n-tuple (x1, x2, ……, xn) of real numbers which satisfies the constraints of the general LPP is called a ……………………….. to the general LPP.
3. Any solution to a general LPP which also satisfies the non-negativity restrictions of the problem is called a …………………….. to the general LPP.
4. Any feasible solution which optimizes the objective function of a general LPP is called an …………………….. solution to the general LPP.
5. The set of feasible solutions to an LPP is …………….. set.
True or False
6. Linear Programming is probabilistic in nature.
7. Decision variables can be unrestricted in the context of an LPP.
8. Objective function specifies the dependent relationship between the decision variables and the objective function.
9. Linear Programming is a mathematical technique used to solve the problem of allocating limited resources among the competing activities.
10. The feasible solution of an LPP is independent of the objective function.
Answers: 1. non-negative 2. solution 3. feasible solution 4. optimum 5. a convex 6. False 7. False 8. True 9. True 10. True
PART-B
11. Define LPP. What are the assumptions made in LPP and the components of LPP?
12. Define constraint, feasible solution, unbounded solution and optimum solution.
13. Define feasible region.
14. Write the characteristics of the Standard Form and the Canonical Form of LPP.
15. Define slack variable, surplus variable and artificial variable. What is the purpose of using an artificial variable?
16. What are the assumptions made in the Simplex method and in the Big-M method?
17. Convert into maximization type: Minimize z = 10x1 + x2 + 2x3.
18. Write the standard form of the following LPP: Maximize z = x1 + 3x2 + 5x3 + 6x4 subject to the constraints x1 + 2x2 = 15, 5x1 + 10x2 + 5x3 + 4x4 25, 3x1 - 8x2 - 9x3 + x4 30, -x3 + x4 -20, x1, x2, x3, x4 ≥ 0.
PART-C
19. A person wants to decide the constituents of a diet which will fulfill his daily requirements of proteins (Pro), fats and carbohydrates (Carbo) at the minimum cost. The choice is to be made from four different types of foods. The yields per unit of these foods are given in the following table. Formulate the problem as an LPP and solve by graphical method.

                       Yield per unit
Food type              Pro     Fats    Carbo    Cost / unit
1                      3       2       6        45
2                      4       2       4        40
3                      8       7       7        85
4                      6       5       4        65
Minimum requirement    800     200     700
20) A firm manufactures headache pills in two sizes A and B. Size A contains 2 grains of Aspirin, 5 grains of Bicarbonate and 1 grain of Codeine. Size B contains 1 grain of Aspirin, 8 grains of Bicarbonate and 6 grains of Codeine. It is found by users that it requires at least 12 grains of Aspirin, 74 grains of Bicarbonate and 24 grains of Codeine for providing immediate effect. It is required to determine the least number of pills to cure a headache. Formulate the problem as an LPP and solve by graphical method.
21) A T.V. company operates two assembly sections, Section A and Section B. Each section is used to assemble the components of three types of T.V.: Colour, Standard and Economy. The expected daily production is

TV Model    Section A    Section B
Colour      3            1
Standard    1            1
Economy     2            6

The daily running costs for the two sections average Rs.6000 for Section A and Rs.4000 for Section B. It is given that the company must produce at least 24 Colour, 16 Standard and 40 Economy TV sets. Formulate this as an LPP.
22) A company makes two types of leather products A & B. Product A is of high quality and Product B is of low quality. The respective profits are Rs.4 and Rs.3 per product. Each product A requires twice as much time as product B and if all products were of type B, the company could make 1000 per day (both A and B combined). Product A requires a special spare part and only 400 per day are available. There are only 700 special spare parts a day available for product B. Formulate this as an LPP.
1.8 SUGGESTED READINGS
1. Taha, H.A., "Operations Research - An Introduction", Macmillan Publishing Co., (1982).
2. Gupta, P.K. and Hira, D.S., "Operations Research", S. Chand & Co., New Delhi, (1999).
3. Ochi, M.K., "Applied Probability and Stochastic Processes", John Wiley & Sons, (1992).
CHAPTER – II
TRANSPORTATION PROBLEM
STRUCTURE
2.1 INTRODUCTION
2.2 DEFINITIONS OF TRANSPORTATION MODEL
2.3 NORTH-WEST CORNER RULE
2.4 LEAST COST METHOD OR MATRIX MINIMA METHOD
2.5 VOGEL'S APPROXIMATION METHOD
2.6 QUESTIONS
2.7 SUGGESTED READINGS
TRANSPORTATION PROBLEMS
2.1 Introduction
As stated in previous sections, the simplex algorithm can be used to solve any linear programming model. But this algorithm is laborious. For this reason, wherever possible, we try to simplify the calculations. One such model requiring simplified calculations is called the transportation model.
Transportation problems are expressed in matrix form, with columns running vertically and rows horizontally. The cell located at the intersection of a row and a column is designated by its row and column headings. Thus the cell located at the intersection of row A and column 3 is called cell (A, 3). Unit costs are placed in each cell.
2.2 Definition of Transportation Model
Transportation models deal with problems concerning what happens to the effectiveness function when we associate each of a number of origins (sources) with each of a possibly different number of destinations (jobs). The total movement from each origin and the total movement to each destination are given and it is desired to find how the associations should be made subject to the limitations on totals.
In such problems, sources can be divided among the jobs or jobs may be done with a combination of sources. The distinct feature of transportation problems is that sources and jobs must be expressed in terms of only one kind of unit.
Suppose that there are m sources and n destinations. Let ai be the number of supply units available at source i (i = 1, 2, 3, …, m) and let bj be the number of demand units required at destination j (j = 1, 2, 3, …, n). Let cij represent the unit transportation cost for transporting the units from source i to destination j. The objective is to determine the number of units to be transported from source i to destination j so that the total transportation cost is minimum. In addition, the supply limits at the sources and the demand requirements at the destinations must be satisfied exactly.
If xij (xij ≥ 0) is the number of units shipped from source i to destination j, then the equivalent linear programming model will be: Find xij (i = 1, 2, 3, …, m; j = 1, 2, 3, …, n) in order to

minimize $z = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$

subject to

$\sum_{j=1}^{n} x_{ij} = a_i$, i = 1, 2, 3, …, m,
$\sum_{i=1}^{m} x_{ij} = b_j$, j = 1, 2, 3, …, n,

where xij ≥ 0.
The two sets of constraints will be consistent, i.e., the system will be in balance, if

$\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j$

This restriction causes one of the constraints to be redundant (and hence it can be deleted) so that the problem will have (m + n - 1) constraints and (m * n) unknowns.
2.2.1 Methods for Obtaining Initial Basic Feasible Solution
A feasible solution to a transportation problem is a set of non-negative allocations that satisfies the rim (row and column) restrictions. A feasible solution is called a basic feasible solution if it contains no more than m + n - 1 non-negative allocations, where m is the number of rows and n is the number of columns of the transportation problem.
A feasible solution (not necessarily basic) that minimizes (maximizes) the transportation cost (profit) is called an optimal solution. There are several methods for finding an initial basic feasible solution, out of which three are described here.
2.3 North-West Corner Rule
This method consists of the following steps:
(i) Start with the north-west (upper-left) corner cell of the transportation matrix. Compare the supply of source 1 (S1) with the demand of destination 1 (D1).
(a) If S1 > D1, set x11 = D1 and proceed horizontally to cell (1,2),
(b) if S1 = D1, set x11 = D1 and proceed diagonally to cell (2,2), and
(c) if S1 < D1, set x11 = S1 and proceed vertically to cell (2,1).
(ii) Continue the procedure, step by step, away from the north-west corner cell till an allocation is made in the south-east corner cell.
2.4 Least Cost Method or Matrix Minima Method
This method consists in allocating as much as possible in the lowest cost cell/cells; further allocation is then done in the cell/cells with the second lowest cost, and so on.
2.5 Vogel's Approximation Method
Vogel's approximation method yields a very good initial solution, which sometimes may be the optimal solution. This method takes into account not only the least cost cij, but also the costs that just exceed cij. It consists of the following steps:
(i) Enter the difference between the smallest and second smallest elements in each column below the corresponding column, and the difference between the smallest and second smallest elements in each row to the right of the row. Each such difference is the unit penalty cost for not allocating in the lowest cost cell. Put these differences in brackets.
(ii) Select the row or column with the greatest difference and allocate as much as possible, within the constraints of the rim conditions, to the lowest cost cell in that row or column so as to either exhaust the source supply or satisfy the destination demand. In case of a tie, allocate to the cell associated with the lowest cost.
(iii) Reduce the supply/demand units by the amount assigned to the cell and cross out the column/row completely satisfied.
(iv) Write down the reduced transportation table omitting rows or columns crossed out in step (iii). Repeat steps (i) through (iii) until all allocations have been made.
Example (Transportation Problem)
A dairy firm has three plants located throughout a state. Daily milk production at each plant is as follows:
Plant 1: 6 million litres,
Plant 2: 1 million litres, and
Plant 3: 10 million litres.
Each day the firm must fulfill the needs of its four distribution centres. Minimum requirement at each centre is as follows:
Distribution centre 1: 7 million litres,
Distribution centre 2: 5 million litres,
Distribution centre 3: 3 million litres, and
Distribution centre 4: 2 million litres.
Cost of shipping one million litres of milk from each plant to each distribution centre is given in the following table in hundreds of rupees:

Table 2.1
                  Distribution centres
             1       2       3       4
Plant 1      2       3       11      7
Plant 2      1       0       6       1
Plant 3      5       8       15      9
Formulate the problem as an L.P. problem and find its initial basic feasible solution by
(i) North-west corner rule,
(ii) Least cost method,
(iii) Vogel's approximation method,
if the object is to minimize the transportation cost.
Formulation
Step 1: Key decision to be made is to find how much quantity of milk should be shipped from which plant to which distribution centre so as to satisfy the constraints and minimize the cost. Thus the variables in the situation are x11, x12, x13, x14, x21, x22, x23, x24, x31, x32, x33 and x34. These variables represent the quantities of milk to be shipped from the different plants to the different distribution centres and can be represented in the form of a matrix as shown below.

Table 2.2
                  Distribution centres
             1       2       3       4
Plant 1      x11     x12     x13     x14
Plant 2      x21     x22     x23     x24
Plant 3      x31     x32     x33     x34
In general, we can say that the key decision to be made is to find the quantity of units from each origin to each destination. Thus if there are m origins and n destinations, then xij are the decision variables (quantities to be found), where i = 1, 2, 3, …, m and j = 1, 2, 3, …, n.
Step 2: Feasible alternatives are sets of values of xij, where xij ≥ 0.
Step 3: Objective is to minimize the cost of transportation, i.e., minimize
2x11 + 3x12 + 11x13 + 7x14 + x21 + 0x22 + 6x23 + x24 + 5x31 + 8x32 + 15x33 + 9x34.
In general, we can say that if cij is the unit cost of shipping from the ith source to the jth destination, the objective is

Minimize $\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$
Step 4: Constraints are
(i) Because of availability or supply:
x11 + x12 + x13 + x14 = 6, (for milk plant 1)
x21 + x22 + x23 + x24 = 1, (for milk plant 2)
and x31 + x32 + x33 + x34 = 10. (for milk plant 3)
Thus, in all, there will be 3 constraints (equal to the number of plants). In general, there will be m constraints if the number of origins is m, which can be expressed as

$\sum_{j=1}^{n} x_{ij} = a_i$, i = 1, 2, 3, …, m

(ii) Because of requirements or demand:
x11 + x21 + x31 = 7, (for distribution centre 1)
x12 + x22 + x32 = 5, (for distribution centre 2)
x13 + x23 + x33 = 3, (for distribution centre 3)
and x14 + x24 + x34 = 2. (for distribution centre 4)
In general, there are n constraints if the number of destinations is n, which can be expressed as

$\sum_{i=1}^{m} x_{ij} = b_j$, j = 1, 2, 3, …, n

Thus we find that the given situation involves (3 x 4 = 12) variables and (3 + 4 = 7) constraints. In general, such a situation will involve (m x n) variables and (m + n) constraints. It can be easily seen that in this model the objective function as well as the constraints are linear functions of the variables and therefore the model can be solved by the simplex method. However, as a large number of variables are involved, the computation becomes very lengthy and for large problems may strain the available computing capacity.
Again, in transportation situations the general requirement is minimization of the objective function, whereas the simplex method was presented mainly for maximization problems. Note that the coefficients of xij in the constraints are either zero or unity. Further, each xij occurs only once in the supply constraints and only once in the demand constraints.
Such a model, called the transportation model, can be solved by the transportation technique, which is easier and shorter than the simplex technique.
(i) Initial Basic Feasible Solution by North-West Corner Rule
Following the N-W corner method explained in section 2.3, one proceeds as follows (refer to Table 2.3, where allocations are shown in brackets next to the unit costs):

Table 2.3
                  Distribution centres
             1        2        3        4        Supply
Plant 1      2 (6)    3        11       7        6/0
Plant 2      1 (1)    0        6        1        1/0
Plant 3      5        8 (5)    15 (3)   9 (2)    10/5/2/0
Demand       7/1/0    5/0      3/0      2/0

(i) Set x11 equal to 6, namely the smaller of the amount available at S1 (6) and that needed at D1 (7), and
(ii) proceed to cell (2,1) (rule c). Compare the number of units available at S2 (namely 1) with the amount required at D1 (1) and accordingly set x21 = 1.
(iii) Proceed to cell (3,2) (rule b). Now supply from plant S3 is 10 units while the demand for D2 is 5 units; accordingly set x32 equal to 5.
(iv) Proceed to cell (3,3) (rule a) and allocate 3 there.
(v) Proceed to cell (3,4) (rule a) and allocate 2 there.
It can be easily seen that the proposed solution is a feasible solution since all the supply and requirement constraints are fully satisfied. In this method, allocations have been made to various cells without any consideration of the cost of transportation associated with them. The transportation cost associated with this solution is
Z = Rs.(2 * 6 + 1 * 1 + 8 * 5 + 15 * 3 + 9 * 2) * 100 = Rs.(12 + 1 + 40 + 45 + 18) * 100 = Rs.11,600.
Note that for any cell in which no allocation is made, the corresponding xij is equal to zero. A cell which gets an allocation is called a basic cell.
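The north-west corner rule is mechanical enough to be written as a short routine. The sketch below follows rules (a), (b) and (c) of section 2.3 and is applied to the dairy data; the function name `north_west_corner` and the dictionary used to hold the allocations are illustrative choices, and the problem is assumed to be balanced.

```python
def north_west_corner(supply, demand):
    """Return {(row, column): allocation} using the north-west corner rule.

    Rows and columns are numbered from 1, as in the tables of the text.
    Assumes a balanced problem (total supply equals total demand).
    """
    s, d = list(supply), list(demand)
    i = j = 0
    allocation = {}
    while i < len(s) and j < len(d):
        q = min(s[i], d[j])            # allocate as much as possible
        allocation[(i + 1, j + 1)] = q
        s[i] -= q
        d[j] -= q
        if s[i] == 0 and d[j] == 0:    # rule (b): move diagonally
            i += 1
            j += 1
        elif s[i] == 0:                # rule (c): supply exhausted, move down
            i += 1
        else:                          # rule (a): demand met, move right
            j += 1
    return allocation


# Data of the dairy example (costs in hundreds of rupees).
cost = [[2, 3, 11, 7],
        [1, 0, 6, 1],
        [5, 8, 15, 9]]
plan = north_west_corner([6, 1, 10], [7, 5, 3, 2])
total = sum(cost[r - 1][c - 1] * q for (r, c), q in plan.items())
print(plan, total * 100)
```

The routine never looks at the cost matrix while allocating, which is why the cost of the resulting solution is usually higher than that of the methods that follow.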
(iii) Initial Basic Feasible Solution by Least cost Method Following the procedure explained in section 2.3, one proceeds as follows: Here , the lowest cost cell is (2,2) and maximum possible allocation (meeting supply and requirement positions) is made here. Evidently, maximum feasible allocation in cell (2,2) is (1) .This meets the supply position of plant 2. Therefore, row 2 is crossed out, indicating that no allocations are to be made in cells (2,1),(2,3) and (2,4).
FOR MORE DETAILS VISIT US ON WWW.IMTSINSTITUTE.COM OR CALL ON +91-9999554621
APPLIED OPERATIONS RESEARCH & STATISTICS
18
The next lowest cost cell (excluding the cells in row 2) is (1,1); the maximum possible allocation of (6) is made here and row 1 is crossed out. The next lowest cost cell in row 3 is (3,1), and an allocation of (1) is made here. Likewise, allocations of (4), (2) and (3) are made in cells (3,2), (3,4) and (3,3) respectively. The transportation cost associated with this solution is
Z = Rs. (2 * 6 + 0 * 1 + 5 * 1 + 8 * 4 + 15 * 3 + 9 * 2) * 100 = Rs. (12 + 0 + 5 + 32 + 45 + 18) * 100 = Rs. 11,200,
which is less than the cost associated with the solution obtained by the N-W corner method.
Example 2.4 Use the least cost method to find the i.b.f.s. and then find the minimum transportation cost for the following transportation problem:
Table 2.4
                        Destinations
Origins       D1        D2        D3        D4        Availability
O1            1 (20)    2         1 (10)    4         30/10/0
O2            3         3 (20)    2 (20)    1 (10)    50/40/20/0
O3            4         2 (20)    5         9         20/0
Requirement   20/0      40/20/0   30/20/0   10
(Figures in parentheses are allocations; the entries after each slash show the availability or requirement remaining after successive allocations.)
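The least cost rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `least_cost` and the tie-breaking order (lowest cost first, then row, then column) are not from the text, and the data are those of Table 2.4.

```python
def least_cost(cost, supply, demand):
    """Allocate greedily to the cheapest cell that still has supply and demand."""
    supply, demand = list(supply), list(demand)   # work on copies
    alloc = [[0] * len(demand) for _ in supply]
    # Visit cells in increasing cost; ties fall back to (row, column) order,
    # which is an assumption of this sketch.
    cells = sorted((cost[i][j], i, j)
                   for i in range(len(supply)) for j in range(len(demand)))
    for _, i, j in cells:
        qty = min(supply[i], demand[j])
        if qty > 0:
            alloc[i][j] = qty
            supply[i] -= qty
            demand[j] -= qty
    return alloc

# Data of Table 2.4 (rows O1..O3, columns D1..D4).
cost = [[1, 2, 1, 4],
        [3, 3, 2, 1],
        [4, 2, 5, 9]]
supply = [30, 50, 20]
demand = [20, 40, 30, 10]

alloc = least_cost(cost, supply, demand)
total = sum(cost[i][j] * alloc[i][j] for i in range(3) for j in range(4))
print(alloc)
print(total)
```

For this data the sketch gives six allocations and a total cost of 180, the figure obtained in the worked solution of this example.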
Solution:
Step I. Find the Initial Basic Feasible Solution
The i.b.f.s. obtained by applying the least cost method is shown in table 2.5.
Step II. Perform Optimality Test
Required number of allocations = m + n – 1 = 3 + 4 – 1 = 6. Actual number of allocations = 6, and these 6 allocations are in independent positions. The optimality test can, therefore, be performed. It consists of steps (1) through (5), details of which are given in example 2.2.
Table 2.5
         D1    D2    D3    D4    µi
O1        1     -     1     -     1
O2        -     3     2     1     2
O3        -     2     -     -     1
vj        0     1     0    -1
µi + vj matrix for occupied cells
Table 2.6
         D1    D2    D3    D4    µi
O1        -     2     -     0     1
O2        2     -     -     -     2
O3        1     -     1     0     1
vj        0     1     0    -1
µi + vj matrix for vacant cells
Table 2.7
         D1    D2    D3    D4
O1        -     0     -     4
O2        1     -     -     -
O3        3     -     4     9
Cell evaluation matrix
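The optimality test can be reproduced numerically. The sketch below is illustrative only: the function name `modi_evaluations`, the choice µ1 = 1 as the arbitrary starting multiplier, and the 0-based indexing are assumptions. It presumes a non-degenerate basic solution (m + n - 1 independent allocations); otherwise some multipliers stay undefined.

```python
def modi_evaluations(cost, alloc, u_first=1):
    """Return (u, v, evaluations of vacant cells) for a non-degenerate basic solution."""
    m, n = len(cost), len(cost[0])
    u, v = [None] * m, [None] * n
    u[0] = u_first                      # one multiplier is fixed arbitrarily
    occupied = [(i, j) for i in range(m) for j in range(n) if alloc[i][j] > 0]
    progress = True
    while progress:                     # propagate u_i + v_j = c_ij over occupied cells
        progress = False
        for i, j in occupied:
            if u[i] is not None and v[j] is None:
                v[j] = cost[i][j] - u[i]
                progress = True
            elif u[i] is None and v[j] is not None:
                u[i] = cost[i][j] - v[j]
                progress = True
    # Cell evaluation = c_ij - (u_i + v_j) for every vacant cell.
    evals = {(i, j): cost[i][j] - (u[i] + v[j])
             for i in range(m) for j in range(n) if alloc[i][j] == 0}
    return u, v, evals

cost = [[1, 2, 1, 4],
        [3, 3, 2, 1],
        [4, 2, 5, 9]]
alloc = [[20, 0, 10, 0],
         [0, 20, 20, 10],
         [0, 20, 0, 0]]

u, v, evals = modi_evaluations(cost, alloc)
is_optimal = all(value >= 0 for value in evals.values())
print(u, v, evals, is_optimal)
```

No evaluation is negative, so the solution passes the test; an evaluation of exactly zero signals an alternative solution with the same cost.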
Since all cell evaluations are non-negative, the i.b.f.s. obtained in Step I is optimal. (The zero evaluation for cell (1,2) indicates that an alternative optimal solution exists.) The minimum transportation cost is given by
Zmin = [1 x 20 + 1 x 10 + 3 x 20 + 2 x 20 + 1 x 10 + 2 x 20] = 180.
2.6 QUESTIONS
PART-A
1. A degenerate solution may or may not be optimum.
2. A feasible solution involving exactly (m+n-1) positive variables is known as a non-degenerate basic feasible solution. Otherwise it is a degenerate basic feasible solution.
3. Degeneracy can occur in the initial solution or it may arise in some subsequent iterations.
4. A basic feasible solution for the general transportation problem must consist of (m+n-1) occupied cells.
5. The cost elements in a dummy row/column shall always be taken equal to zero.
Ans: 1 to 5 - true.
6. A transportation problem is a special case of a -------------.
7. Every loop has an --------------------- of cells.
8. Each row and column in the transportation table should have only -----------------------.
9. Closed loops may or may not be ------------- in shape.
10. Mathematical formulation of the transportation problem is _______________.
Ans: 6. LPP 7. even number 8. one plus and minus sign 9. square 10. assignment.
PART-B
11. Define Transportation problem.
12. Write the necessary and sufficient condition for the existence of a feasible solution to the transportation problem.
13. Define unbalanced Transportation Problem.
14. Write the mathematical formulation of the Transportation Problem.
15. Define Traveling Salesman Problem.
16. When does degeneracy occur in a Transportation Problem? How will you resolve it?
PART-C
17. Solve the following transportation problem, starting with Vogel's solution:
7    3    2
2    1    3
3    4    6
18. Write a short note on the traveling salesman problem.
19. A company has three plants at locations A, B and C, which supply to warehouses located at D, E, F, G and H. Monthly plant capacities are 800, 500 and 900 units respectively. Monthly warehouse requirements are 400, 400, 500, 400 and 800 units respectively. Unit transportation costs (in rupees) are given below:
               To
From     D    E    F    G    H
A        5    8    6    6    3
B        4    7    7    6    5
C        8    4    6    6    4
Determine an optimum distribution for the company in order to minimize the total transportation cost.
20. Solve by MODI Method
           A     B     C     Supply
P1         8     7     3     60
P2         3     8     9     70
P3        11     3     5     80
Demand    50    80    80
2.7 SUGGESTED READINGS
1. Taha, H.A., "Operations Research - An Introduction", Macmillan Publishing Co. (1982).
2. Ochi, M.K., "Applied Probability and Stochastic Processes", John Wiley & Sons (1992).
3. Ross, S., "A First Course in Probability", Fifth edition, Pearson Education, Delhi, 2002.
CHAPTER – III
ASSIGNMENT PROBLEM
STRUCTURE
3.1 INTRODUCTION
3.2 DEFINITION
3.3 MATHEMATICAL REPRESENTATION OF ASSIGNMENT MODEL
3.4 ASSIGNMENT ALGORITHM
3.5 TRAVELLING SALESMAN METHOD
3.6 HUNGARIAN METHOD
3.7 QUESTIONS
3.8 SUGGESTED READINGS
3.1 INTRODUCTION
The assignment problem is a special type of transportation problem in which a number of jobs are allocated to different machines or operators. Each operator/machine performs only one job (or task, or operation). The objective is to maximize the overall profit or minimize the overall cost for a given assignment schedule.
3.2 DEFINITION
The assignment problem may be defined as follows: Given n facilities and n jobs, and given the effectiveness of each facility for each job, the problem is to assign each facility to one and only one job so that the given measure of effectiveness is optimized. The assignment problem stated above can be translated into problems in many decision fields. As an example, consider the following situation: The municipal committee of a city has a fleet of n tractors located at different places in the city. There are also n trailers lying at different places in the same city, and it is desired to pick up and haul the trailers to the centralized depot. The problem is to assign each of the n tractors to the corresponding trailers in such a way that a given measure of effectiveness (e.g., the total cost involved, the total distance travelled, or the total time of travel for tractors) is optimized.
During our discussion of degeneracy we found that a transportation problem is degenerate if, while deriving a feasible solution, an allocation to any cell satisfies the column as well as the row requirement simultaneously. We also know that in the assignment problem each resource can be assigned to only one job and each job requires only one resource. Hence the assignment problem is a completely degenerate form of the transportation problem.
The assignment model may be regarded as a special case of the transportation model. Here the facilities represent the sources while the jobs represent the destinations. The supply available at each source is 1, i.e., ai = 1 for all i. Similarly, the demand at each destination is 1, i.e., bj = 1 for all j. The cost of transporting (assigning) facility i to job j is Cij. The resulting transportation model can be represented as in table 3.1.
Table 3.1
                          Jobs
                  1      2      ...    n      Supply ai
Facilities   1    C11    C12    ...    C1n    1
             2    C21    C22    ...    C2n    1
             .    .      .             .      .
             m    Cm1    Cm2    ...    Cmn    1
Demand bj         1      1      ...    1
3.3 Mathematical Representation of Assignment Model
Mathematically, the assignment model can be expressed as follows: Let
\[ x_{ij} = \begin{cases} 0, & \text{if the } i\text{th facility is not assigned to the } j\text{th job;} \\ 1, & \text{if the } i\text{th facility is assigned to the } j\text{th job.} \end{cases} \]
Then the model is given by
\[ \text{Minimize } Z = \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} x_{ij} \]
subject to the constraints
\[ \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, 3, \ldots, n, \]
\[ \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, 2, 3, \ldots, n, \]
and \( x_{ij} = 0 \text{ or } 1 \).
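The model can be made concrete with a small check of the constraints and the objective. This is a minimal sketch: the helper names `is_assignment` and `total_cost` and the 2 x 2 cost matrix are hypothetical and serve only to illustrate the formulation.

```python
def is_assignment(x):
    """True if x is a 0-1 matrix with exactly one 1 in every row and every column."""
    n = len(x)
    binary = all(value in (0, 1) for row in x for value in row)
    rows_ok = all(sum(row) == 1 for row in x)
    cols_ok = all(sum(x[i][j] for i in range(n)) == 1 for j in range(n))
    return binary and rows_ok and cols_ok

def total_cost(c, x):
    """Objective Z = sum over i and j of c[i][j] * x[i][j]."""
    n = len(c)
    return sum(c[i][j] * x[i][j] for i in range(n) for j in range(n))

c = [[4, 2],
     [3, 5]]                 # hypothetical cost matrix
x_good = [[0, 1],
          [1, 0]]            # facility 1 to job 2, facility 2 to job 1
x_bad = [[1, 1],
         [0, 0]]             # facility 1 gets two jobs, so it is infeasible

print(is_assignment(x_good), total_cost(c, x_good))
print(is_assignment(x_bad))
```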
Example 3.1 A machine tool company decides to make four subassemblies through four contractors.
Table 3.2
                      Contractors
Subassemblies     1     2     3     4
1                15    13    14    17
2                11    12    15    13
3                13    12    10    11
4                15    17    14    16
Each contractor is to receive only one subassembly. The cost of each subassembly is determined by the bids submitted by each contractor and is shown in table 3.2 in hundreds of rupees. Assign the different subassemblies to contractors so as to minimize the total cost. Also formulate the problem as an L.P. problem.
3.4 The Assignment Algorithm
Step I The key decision is what to whom, i.e., which subassembly is to be assigned to which contractor, or what are the n = 4 optimum assignments on a 1-1 basis.
Step II Feasible alternatives are the n! possible arrangements for an n x n assignment situation. In the given situation there are 4! different arrangements.
Step III The objective is to minimize the total cost involved, i.e., minimize
\[ \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} x_{ij}. \]
In the given situation the objective is
\[ \text{Minimize } z = \sum_{i=1}^{4} \sum_{j=1}^{4} C_{ij} x_{ij}. \]
Step IV Constraints:
a) Constraints on subassemblies are
x11 + x12 + x13 + x14 = 1,
x21 + x22 + x23 + x24 = 1,
x31 + x32 + x33 + x34 = 1,
x41 + x42 + x43 + x44 = 1,
b) Constraints on contractors are
x11 + x21 + x31 + x41 = 1,
x12 + x22 + x32 + x42 = 1,
x13 + x23 + x33 + x43 = 1,
x14 + x24 + x34 + x44 = 1.
Comparing this model with the transportation model, we find that ai = 1 and bj = 1. Thus the assignment model can be represented as in Table 3.3. Therefore, the assignment model is a special case of the transportation model in which
i) all right-hand-side constants in the constraints are unity, i.e., ai = 1, bj = 1;
ii) all coefficients of xij in the constraints are unity;
iii) m = n.
Table 3.3
             Contractors (facilities, agents or means)
              1     2     3     4     Supply ai
1            15    13    14    17     1
2            11    12    15    13     1
3            13    12    10    11     1
4            15    17    14    16     1
Demand bj     1     1     1     1
Solution of the problem
Step I Prepare a Square Matrix: Since the situation involves a square matrix, this step is not necessary.
Step II Reduce the Matrix: This involves the following sub steps:
Sub step 1: In the effectiveness matrix, subtract the minimum element of each row from all the elements of the row. See if there is at least one zero in each row and in each column. If it is so, stop here. If not, proceed to sub step 2.
Sub step 2: Now subtract the minimum element of each column from all the elements of the column.
In the given situation, the minimum element in the first row is 13. So, we subtract 13 from all the elements of the first row. Similarly, we subtract 11, 10 and 14 from all the elements of rows 2, 3 and 4 respectively. This gives at least one zero in each row, as shown in table 3.4.
Table 3.4
                      Contractors
Subassemblies     1     2     3     4
1                 2     0     1     4
2                 0     1     4     2
3                 3     2     0     1
4                 1     3     0     2
Since column 4 contains no zero entry, we go to sub step 2, giving the following matrix:
Table 3.5
2    0    1    3
0    1    4    1
3    2    0    0
1    3    0    1
Initial basic feasible solution
Step III Check if Optimal Assignment can be made in the Current Feasible Solution or not
The basis for making this check is that if the minimum number of lines crossing all zeros is less than n (in our example n = 4), then an optimal assignment cannot be made in the current solution. If it is equal to n (= 4), then an optimal assignment can be made in the current solution.
The approach for obtaining the minimum number of lines crossing all zeros consists of the following sub steps:
Sub step 1: Examine rows successively until a row with exactly one unmarked zero is found. Mark this zero with a square (□), indicating that an assignment will be made there. Mark (x) all other zeros in the same column, showing that they cannot be used for making other assignments. Proceed in this manner until all rows have been examined.
In the given situation, row 1 has a single unmarked zero in column 2; make an assignment as shown. Row 2 has a single unmarked zero in column 1; make an assignment. Row 3 has two unmarked zeros, so it is passed over for the moment. Row 4 has a single unmarked zero in column 3; make an assignment and cross the 2nd zero in column 3. Now row 3 has a single unmarked zero in column 4; make an assignment here. This is shown in the matrix below.
Table 3.6
 2    [0]    1     3
[0]    1     4     1
 3     2     0x   [0]
 1     3    [0]    1
([0] denotes an assigned zero; 0x denotes a crossed-out zero.)
Sub step 2: Next examine columns for single unmarked zeros, marking them (□) and also marking (x) any other zeros in their rows. Repeat the process till no unmarked zero is left in the cost matrix.
Sub step 3: Repeat sub steps 1 and 2 successively till one of two things occurs:
(a) There may be no row and no column without assignment, i.e., there is one assignment in each row and in each column. In such a case the optimal assignment can be made in the current solution, i.e., the current feasible solution is an optimal solution. The minimum number of lines crossing all zeros will be equal to n.
(b) There may be some row and/or column without assignment. Hence optimal assignment cannot be made in the current solution, and the minimum number of lines crossing all zeros has to be obtained in this case.
In the present example sub steps 2 and 3 are not necessary since there is no column left unmarked. Since there is one assignment in each row and in each column, the optimal assignment can be made in the current solution. The minimum total cost is
Rs. (13 x 1 + 11 x 1 + 11 x 1 + 14 x 1) x 100 = Rs. 4,900,
and the optimal assignment policy is
Subassembly 1 – contractor 2,
Subassembly 2 – contractor 1,
Subassembly 3 – contractor 4,
Subassembly 4 – contractor 3.
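Because a 4 x 4 problem has only 4! = 24 feasible arrangements, the result of Example 3.1 can be cross-checked by plain enumeration. The sketch below is a verification aid, not the Hungarian method itself; the variable names are illustrative, and indices are 0-based, so contractor 2 appears as index 1.

```python
from itertools import permutations

# Cost matrix of Table 3.2 in hundreds of rupees (rows: subassemblies, columns: contractors).
cost = [[15, 13, 14, 17],
        [11, 12, 15, 13],
        [13, 12, 10, 11],
        [15, 17, 14, 16]]
n = len(cost)

def arrangement_cost(perm):
    """Total cost when subassembly i goes to contractor perm[i]."""
    return sum(cost[i][perm[i]] for i in range(n))

# Enumerate all n! one-to-one arrangements and keep the cheapest.
best = min(permutations(range(n)), key=arrangement_cost)
best_cost = arrangement_cost(best) * 100     # convert hundreds of rupees to rupees

print(best, best_cost)
```

Enumeration grows as n!, so it is practical only for small matrices; the reduction steps of the algorithm avoid that growth.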
Example 3.2 Four different jobs can be done on four different machines. The set up and take down time costs are assumed to be prohibitively high for change over. The matrix below gives the cost in rupees of producing job i on machine j.
Table 3.7
              Machines
Jobs     M1    M2    M3    M4
J1        5     7    11     6
J2        8     5     9     6
J3        4     7    10     7
J4       10     4     8     3
How should the jobs be assigned to the various machines so that the total cost is minimized? Represent the problem as an L.P. problem.
Formulation as an L.P. problem
Step I The key decision is to find which job is to be assigned to which machine, i.e., what are the n optimum assignments on a 1-1 basis.
Step II Feasible alternatives are the 4! possible arrangements for the given 4 x 4 assignment situation.
Step III The objective is to minimize the total cost involved, i.e., minimize
\[ \sum_{i=1}^{4} \sum_{j=1}^{4} C_{ij} x_{ij}. \]
Step IV Constraints are
(a) Due to jobs:
x11 + x12 + x13 + x14 = 1,
x21 + x22 + x23 + x24 = 1,
x31 + x32 + x33 + x34 = 1,
x41 + x42 + x43 + x44 = 1,
(b) Due to machines:
x11 + x21 + x31 + x41 = 1,
x12 + x22 + x32 + x42 = 1,
x13 + x23 + x33 + x43 = 1,
x14 + x24 + x34 + x44 = 1.
Also xij = 0 or 1.
Solution of the Problem
Step I Prepare a square matrix: Since the situation involves a square matrix, this step is not necessary.
Step II Reduce the matrix:
Table 3.8
         M1    M2    M3    M4
J1        0     2     6     1
J2        3     0     4     1
J3        0     3     6     3
J4        7     1     5     0
Matrix after sub step 1 (contains no zero in column 3)
Table 3.9
         M1    M2    M3    M4
J1        0     2     2     1
J2        3     0     0     1
J3        0     3     2     3
J4        7     1     1     0
Matrix after sub step 2 (first feasible solution)
Step III Check if Optimal Assignment can be made in the Current Feasible Solution or not
In the present example, after following sub steps 1 and 2 of example 3.1 we find that their repetition is unnecessary and also that row 3 and column 3 are without any assignments (table 3.10). Hence we proceed as follows to find the minimum number of lines crossing all zeros:
Table 3.10
         M1    M2    M3    M4
J1        0     2     2     1
J2        3     0     0     1
J3        0     3     2     3
J4        7     1     1     0
Sub step 1: Mark (√) the rows for which no assignment has been made. In our problem it is the third row.
Sub step 2: Mark (√) columns (not already marked) which have zeros in marked rows. Thus column 1 is marked (√).
Sub step 3: Mark (√) rows (not already marked) which have assignments in marked columns. Thus row 1 is marked (√).
Sub step 4: Repeat sub steps 2 and 3 until no more marking is possible. In the present case this repetition is not necessary.
Table 3.11
         M1    M2    M3    M4
J1        0     2     2     1
J2        3     0     0     1
J3        0     3     2     3
J4        7     1     1     0
Sub step 5: Draw lines through all unmarked rows and through all marked columns. This gives the minimum number of lines crossing all zeros. If the procedure is correct, there will be as many lines as the number of assignments. In this example, the number of such lines is 3 (through rows J2 and J4 and column M1), which is less than n (n = 4 here). Hence optimal assignment is not possible in the current solution.
Step IV Iterate Towards Optimality
Examine the elements that do not have a line through them. Select the smallest of these elements and subtract it from all the elements that do not have a line through them. Add this smallest element to every element that lies at the intersection of two lines. Leave the remaining elements of the matrix unchanged. Here the smallest uncovered element is 1. Proceeding in this manner we get the following matrix:
Table 3.12
         M1    M2    M3    M4
J1        0     1     1     0
J2        4     0     0     1
J3        0     2     1     2
J4        8     1     1     0
Second feasible solution
Step V Check if Optimal Assignment can be made in the Current Feasible Solution or not
Repeating step III, i.e., sub steps 1 through 5, we get table 3.13.
Table 3.13
         M1    M2    M3    M4
J1        0     1     1     0
J2        4     0     0     1
J3        0     2     1     2
J4        8     1     1     0
Since the minimum number of lines passing through all zeros is 3 (less than 4), optimal assignment cannot be made in the current solution.
Step VI Iterate Towards Optimality
Table 3.14
         M1    M2    M3    M4
J1        0     0     0     0
J2        5     0     0     2
J3        0     1     0     2
J4        8     0     0     0
Step VII Check if Optimal Assignment can be made in the Current Feasible Solution or not
Repeat step III, i.e., sub steps 1 through 5 therein. Since there is no row with exactly one unmarked zero, we start considering the columns directly.
Make an assignment in cell (J1,M1) and delete the remaining zeros in row 1 and column 1. Make an assignment in cell (J3,M3) and delete the other zeros in column 3. Make an assignment in cell (J2,M2) and delete the other zero in column 2. Make an assignment in cell (J4,M4). As there is an assignment in each row and in each column, the optimal assignment can be made in the current solution. Hence the optimal assignment policy is:
Job J1 should be assigned to Machine M1,
Job J2 should be assigned to Machine M2,
Job J3 should be assigned to Machine M3,
Job J4 should be assigned to Machine M4,
and optimum cost = Rs. (5 + 5 + 10 + 3) = Rs. 23.
Example 3.3 Solve the following assignment problem:
Table 3.15
        I     II    III    IV     V
1      11     17      8    16    20
2       9      7     12     6    15
3      13     16     15    12    16
4      21     24     17    28    26
5      14     10     12    11    13
Solution
Step I. Reduce the Matrix
Subtract the minimum element of each row from all the elements of the row. Then subtract the minimum element of each column from all the elements of the column. Thus tables 3.16 and 3.17 are obtained.
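The two reduction passes can be sketched as follows, using the data of Table 3.15. The function names `row_reduce` and `col_reduce` are illustrative. Subtracting the column minimum from every column is equivalent to treating only the columns that lack a zero, because a column that already contains a zero has minimum 0.

```python
def row_reduce(matrix):
    """Subtract each row's minimum from every element of that row."""
    return [[value - min(row) for value in row] for row in matrix]

def col_reduce(matrix):
    """Subtract each column's minimum from every element of that column."""
    col_min = [min(col) for col in zip(*matrix)]
    return [[value - col_min[j] for j, value in enumerate(row)] for row in matrix]

# Cost matrix of Table 3.15 (rows 1..5, columns I..V).
cost = [[11, 17,  8, 16, 20],
        [ 9,  7, 12,  6, 15],
        [13, 16, 15, 12, 16],
        [21, 24, 17, 28, 26],
        [14, 10, 12, 11, 13]]

after_rows = row_reduce(cost)          # at least one zero in every row
after_cols = col_reduce(after_rows)    # at least one zero in every column as well

for row in after_rows:
    print(row)
print()
for row in after_cols:
    print(row)
```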
Table 3.16
        I     II    III    IV     V
1       3      9      0     8    12
2       3      1      6     0     9
3       1      4      3     0     4
4       4      7      0    11     9
5       4      0      2     1     3
Table 3.17
        I     II    III    IV     V
1       2      9      0     8     9
2       2      1      6     0     6
3       0      4      3     0     1
4       3      7      0    11     6
5       3      0      2     1     0
Matrix containing zero in every row
City A should supply the vehicle to city 2,
City B should supply the vehicle to city 6,
City C should supply the vehicle to city 3,
City D should supply the vehicle to city 1,
City E should supply the vehicle to city 4, and
Minimum distance traveled = (10 + 12 + 3 + 6 + 7) km = 38 km.
3.5 TRAVELING SALESMAN PROBLEM
Example 3.6 A company has a team of four salesmen and there are four districts where the company wants to start its business. After taking into account the capabilities of salesmen and the nature of districts, the company estimates that the profit per day in rupees for each salesman in each district is as below.
Table 3.18
                 Districts
Salesman     1     2     3     4
A           16    10    14    11
B           14    11    15    15
C           15    15    13    12
D           13    12    14    15
Find the assignment of salesmen to various districts which will yield maximum profit.
Solution: As the given problem is of the maximization type, it has to be reduced to the minimization type before solving it by the Hungarian method. This is achieved by subtracting all the elements of the matrix from the highest element in it. The highest element is 16. Subtracting all the elements from 16, the problem reduces to minimization of "loss" given by table 3.19.
Table 3.19
         1     2     3     4
A        0     6     2     5
B        2     5     1     1
C        1     1     3     4
D        3     4     2     1
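The conversion from maximization to minimization can be sketched as below, using the profits of Table 3.18. The enumeration at the end is only a cross-check on this small 4 x 4 case and is not the Hungarian method; the variable names are illustrative and indices are 0-based.

```python
from itertools import permutations

# Profit per day (rows: salesmen A..D, columns: districts 1..4), Table 3.18.
profit = [[16, 10, 14, 11],
          [14, 11, 15, 15],
          [15, 15, 13, 12],
          [13, 12, 14, 15]]
n = len(profit)

# Subtract every element from the highest element to obtain the "loss" matrix.
highest = max(max(row) for row in profit)
loss = [[highest - value for value in row] for row in profit]

# Minimizing total loss is equivalent to maximizing total profit,
# because every assignment picks exactly n elements.
best = min(permutations(range(n)),
           key=lambda perm: sum(loss[i][perm[i]] for i in range(n)))
max_profit = sum(profit[i][best[i]] for i in range(n))

for row in loss:
    print(row)
print(best, max_profit)
```

Subtracting from the highest element keeps all entries non-negative, which is what the reduction steps of the Hungarian method expect.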
The Hungarian method can now be applied, which consists of the following steps:
Step I Prepare a Square Matrix: This step is not required here.
Step II Reduce the Matrix: Proceeding as in example 3.2, we get table 3.20.
Table 3.20
         1     2     3     4
A        0     6     2     5
B        1     4     0     0
C        0     0     2     3
D        2     3     1     0
Matrix after sub step 1 (initial feasible solution)
Step III Check if Optimal Assignment can be made in the Current Feasible Solution or not
Proceeding as in example 3.2 we get
Table 3.21
          1      2      3      4
A        [0]     6      2      5
B         1      4     [0]     0x
C         0x    [0]     2      3
D         2      3      1     [0]
([0] denotes an assigned zero; 0x denotes a crossed-out zero.)
As there is one assignment in each row and in each column, the optimal assignment can be made in the current feasible solution. The assignment policy is:
salesman A should be assigned district 1,
salesman B should be assigned district 3,
salesman C should be assigned district 2,
salesman D should be assigned district 4.
Maximum profit per day = Rs. (16 + 15 + 15 + 15) = Rs. 61.
3.6 Hungarian method
The Hungarian method (also called the reduced matrix method or Flood's technique) is used for solving assignment problems. It involves a rapid reduction of the original matrix and finding a set of n independent zeros, one in each row and column, which results in an optimal solution. The method consists of the following steps:
Step-1. Prepare a square matrix. This step is not required for n x n assignment problems. For m x n (m ≠ n) problems, a dummy column or a dummy row, as the case may be, is added to make the matrix square.
Step-2. Reduce the matrix. Subtract the smallest element of each row from all the elements of the row. Examine if there is at least one zero in each row and in each column. If not, subtract the minimum element of the column(s) not containing a zero from all the elements of that column(s).
Step-3. Check whether an optimal assignment can be made in the reduced matrix or not.
(a) Examine rows successively until a row with exactly one unmarked zero is obtained. Make an assignment to this single zero by making a square (□) around it. Cross (x) all other zeros in the same column, as they will not be considered for making any more assignments in that column. Proceed in this way until all rows have been examined.
(b) Now examine columns successively until a column with exactly one unmarked zero is found. Make an assignment there by making a square (□) around it and cross (x) any other zeros in the same row.
In case there is no row or column containing a single unmarked zero (they contain more than one unmarked zero), mark with a square (□) any unmarked zero arbitrarily and cross (x) all other zeros in its row as well as its column. Proceed in this manner till there is no unmarked zero left in the cost matrix.
Repeat sub-steps (a) and (b) till one of the following two things occurs:
(i) There is one assignment in each row and in each column. In this case the optimal assignment can be made in the current solution, i.e., the current feasible solution is an optimal solution. The minimum number of lines crossing all zeros is n, the order of the matrix.
(ii) There is some row and/or column without assignment. In this case optimal assignment cannot be made in the current solution. Further reduction is necessary. The minimum number of lines crossing all zeros has to be obtained in this case by following step 4.
Step-4. Find the minimum number of lines crossing all zeros. This consists of the following sub-steps:
(a) Mark (√) the rows that do not have assignments.
(b) Mark (√) the columns (not already marked) that have zeros in marked rows.
(c) Mark (√) the rows (not already marked) that have assignments in marked columns.
(d) Repeat sub-steps (b) and (c) till no more rows or columns can be marked.
(e) Draw straight lines through all unmarked rows and marked columns. This gives the minimum number of lines crossing all zeros. If this number is equal to the order of the matrix, then it is an optimal solution; otherwise go to step 5.
Step-5. Iterate towards the optimal solution. Examine the uncovered elements. Select the smallest element and subtract it from all the uncovered elements. Add this smallest element to every element that lies at the intersection of two lines. Leave the remaining elements of the matrix as such. This yields the second basic feasible solution.
Step-6. Repeat steps 3 through 5 successively until the number of lines crossing all zeros becomes equal to the order of the matrix. In such a case every row and column will have one assignment. It indicates that an optimal solution has been obtained.
3.7 Questions:
PART-A
1. Define an assignment problem.
2. What is meant by restricted assignments?
3. What is the prerequisite to solve an assignment problem?
PART-B
1. Differentiate between the transportation problem and the assignment problem.
2. Define Assignment problem.
3. Write the mathematical formulation of the Assignment Problem.
4. Write the difference between the Transportation and Assignment Problems.
5. When does an Assignment problem become unbalanced?
6. Write the Hungarian Algorithm for solving the Assignment Problem.
PART-C
1. A firm plans to begin production of three new products. They own three plants and wish to assign one new product to each plant. The unit cost of producing product i at plant j is cij, as given by the following matrix. Find the assignment that minimizes the total unit cost.
P1 P2 A
10
Product B C
18 6
4
P3 (Plant) 8
12
6
14
2
2. A company has 4 machines on which to do 3 jobs. Each job can be assigned to one and only one machine. The cost of each job on each machine is given below. Determine the optimum assignment.
              (Machine)
Job      A     B     C     D
1       15    13    14    17
2       11    12    15    13
3       18    12    10    11
4       15    17    14    16
3. Five jobs are to be processed and five machines are available. Any machine can process any job with the resulting profit (in Rs.) as follows:
         A     B     C     D     E
1       32    38    40    28    40
2       40    24    28    21    36
3       41    27    33    30    37
4       22    38    41    36    36
5       29    33    40    35    39
What is the maximum profit that may be expected if an optimum assignment is made?
6. Solve the following Assignment Problem:
                  (Zone)
Salesman     A     B     C     D
P           42    35    28    21
Q           30    25    20    15
R           30    25    20    15
S           24    20    16    12
7. A company has a team of four salesmen and there are four districts where the company wants to start its business. After taking into account the capabilities of salesmen and the nature of districts, the company estimates that the profit per day in rupees for each salesman in each district is as below.
                 Districts
Salesmen     1     2     3     4
A           16    10    14    11
B           14    11    15    15
C           15    15    13    12
D           13    12    14    15
Find the assignment of salesmen to various districts which will yield maximum profit.
3.8 SUGGESTED READINGS
1. Taha, H.A., "Operations Research - An Introduction", Macmillan Publishing Co. (1982).
2. Gupta, P.K. and Hira, D.S., "Operations Research", S. Chand & Co., New Delhi (1999).
3. Ochi, M.K., "Applied Probability and Stochastic Processes", John Wiley & Sons (1992).
CHAPTER – IV
SEQUENCING PROBLEM
STRUCTURE
4.1 INTRODUCTION
4.2 DEFINITIONS
4.3 PROCESSING N JOBS THROUGH TWO MACHINES
4.4 PROCESSING N JOBS THROUGH THREE MACHINES
4.5 PROCESSING TWO JOBS THROUGH M MACHINES
4.6 QUESTIONS
4.7 SUGGESTED READINGS
4.1 Introduction
The selection of an appropriate order for a series of different jobs to be done on a finite number of service facilities is called sequencing. In a sequencing problem, we have to determine the optimal order (sequence) of performing the jobs in such a way that the total time (or cost) involved is minimum. The following simplifying assumptions are usually made while dealing with sequencing problems:
(i) Only one operation is carried out on a machine at a particular time.
(ii) Each operation, once started, must be completed.
(iii) An operation must be completed before its succeeding operations can start.
(iv) Only one machine of each type is available.
(v) A job is processed as soon as possible, but only in the order specified.
(vi) Processing times are independent of the order of performing the operations.
(vii) The transportation time, i.e., the time required to transport jobs from one machine to another, is negligible.
(viii) Jobs are completely known and are ready for processing when the period under consideration starts.
4.2 Definitions
A general sequencing problem may be defined as follows: Let there be n different jobs (1, 2, 3, ..., n), each of which has to be processed, one at a time, on each of m machines (A, B, C, ...). The order of processing each job through the machines is given (for example, job 1 is processed on machines A, C, B in this order). Also, the time required for processing each job on each machine is given. The problem is to find, among the (n!)^m possible sequences, that technologically feasible sequence for processing the jobs which gives the minimum total elapsed time for all the jobs.
Symbolically, let Ai = time required for job i on machine A, Bi = time required for job i on machine B, etc., and T = total elapsed time for jobs 1, 2, ..., n, i.e., the time from the start of the first job to the completion of the last job. The problem is to determine a sequence (i1, i2, ..., in), where (i1, i2, ..., in) is a permutation of the integers (1, 2, ..., n), which will minimize T.
4.3 Processing n Jobs Through Two Machines
This sequencing problem is completely described as follows:
(i)
Only two machines are involved, A and B,
(ii) Each job is processed in the order AB,
(iii) The actual or expected processing times A1, A2, ..., An; B1, B2, ..., Bn are known and represented by a table of the type shown below.
Table 1 Machine times for n jobs and two machines
Job i     A      B
1         A1     B1
2         A2     B2
3         A3     B3
.         .      .
i         Ai     Bi
.         .      .
n         An     Bn
The problem is to determine the sequence (order) of jobs which minimizes T, the total elapsed time from the start of the first job to the completion of the last job. The solution procedure consists of the following steps:
Step-1: Examine the columns for processing times on machines A and B and find the smallest value [Min (Ai, Bi)].
Step-2: If this value falls in column A, schedule this job first on machine A. If this value falls in column B, schedule this job last on machine A (because of the given order AB). If there are equal minimal values (there is a tie), one in each column, schedule the one in the first column first on machine A, and the one in the second column last on machine A. If both equal values are in the first column (A), select the one with the lowest entry in column B first. If the equal values are in the second column (B), select the one with the lowest entry in column A first.
Step-3: Cross out the job assigned and continue the process (repeat steps 1 and 2), placing the jobs next to first or next to last till all the jobs are scheduled. The resulting sequence will minimize T.
4.4 Processing N Jobs Through Three Machines
This sequencing problem is completely described as follows:
(i)
only two machines are involved,A and B,
(ii)
each job processed in the prescribed order ABC,
(iii) no passing of jobs is permitted (I,e.,the same order over each machine is maintained),and (iv) the actual or expected processing times A1,A2,…,An;B1,B2,…,Bn and C1,C2,…,Cn are known and represented by a table of the type shown(table12.2). Table 2 Machine times for n jobs and machines Job
A
B
C
1
A1
B1
C1
2
A2
B2
C2
3
A3
B3
C3
.
.
.
.
.
.
.
.
i
Ai
Bi
Ci
.
.
.
.
.
.
.
.
n
An
Bn
Cn
The problem, again, is to find the optimum sequence of jobs which minimizes T. No general solution is available at present for such a case. However , the method of section 4.3 can be extended to cover the special cases where either one or both of the following conditions hold good (if neither of the conditions holds good , the method fails): (1)
the minimum time on machine A is ≥ maximum time on machine B,and
(2)
the minimum time on machineC is ≥ maximum time on machine B. The method, described here without proof, is to replace the problem by an equivalent
problem involving n jobs, and two machines. These two (fictitious)machines are denoted by G and H and their corresponding processing times are given by Gi = Ai + Bi, Hi = Bi + Ci, If this new problem with the prescribed order GH is solved by the method of section 4.3, the resulting optimal optimal sequence will also be optimal for the original problem.
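The two-machine procedure of section 4.3 and the three-machine reduction of section 4.4 can be sketched in Python as follows. This is an illustrative sketch only: the function names `johnson_sequence` and `reduce_three_machines` and the three-job data set are assumptions made for the example, and ties are broken simply by job label rather than by the detailed tie rules of Step-2. The two-machine data are the turning and threading times of Example 1 of this chapter.

```python
def johnson_sequence(times):
    """Order jobs for two machines A then B.

    times: dict mapping job label -> (time on A, time on B).
    Ties are broken by job label (a simplification of Step-2).
    """
    remaining = dict(times)
    front, back = [], []
    while remaining:
        # Step-1: find the job holding the smallest remaining time.
        job = min(remaining, key=lambda j: (min(remaining[j]), j))
        a, b = remaining.pop(job)
        # Step-2: smallest value in column A -> next free front slot,
        # smallest value in column B -> next free back slot.
        if a <= b:
            front.append(job)
        else:
            back.append(job)
    # Jobs sent to the back were filled from the last position inwards.
    return front + back[::-1]


def reduce_three_machines(times3):
    """Build the fictitious machines G = A + B and H = B + C.

    Raises ValueError when neither condition of section 4.4 holds.
    """
    a_min = min(a for a, _, _ in times3.values())
    b_max = max(b for _, b, _ in times3.values())
    c_min = min(c for _, _, c in times3.values())
    if not (a_min >= b_max or c_min >= b_max):
        raise ValueError("neither condition holds; the method fails")
    return {job: (a + b, b + c) for job, (a, b, c) in times3.items()}


# Turning and threading times (minutes) from Example 1.
example1 = {1: (3, 8), 2: (12, 10), 3: (5, 9), 4: (2, 6), 5: (9, 3), 6: (11, 1)}
print(johnson_sequence(example1))

# Hypothetical three-machine data (A, B, C), chosen so that min A >= max B.
hypothetical = {"J1": (5, 2, 6), "J2": (7, 1, 4), "J3": (6, 3, 8)}
print(johnson_sequence(reduce_three_machines(hypothetical)))
```

Filling a front list and a back list mirrors the "next to first or next to last" placement of Step-3.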
4.5 PROCESSING TWO JOBS THROUGH m MACHINES
Let us consider the following situation:
(a) There are m machines, denoted by A, B, C, …, K.
(b) Only two jobs are to be performed: job 1 and job 2.
(c) The technological ordering of each of the two jobs through the m machines is known. This ordering may not be the same for both jobs. Alternative ordering is not permissible for either job.
(d) The actual or expected processing times A1, B1, C1, …, K1; A2, B2, C2, …, K2 are known, and
(e) Each machine can work on only one job at a time, and storage space for in-process inventory is available.
The problem is to minimize the total elapsed time T, i.e., to minimize the time from the start of the first job to the completion of the last job. Such a problem can be solved by the graphic method, which is simple and provides good (though not necessarily optimal) results.

PROCESSING n JOBS THROUGH m MACHINES
This sequencing problem is described as follows:
(i) There are n jobs to be performed, denoted by 1, 2, 3, …, i, …, n.
(ii) There are m machines, denoted by A, B, C, …, K.
(iii) Each job is to be processed in the prescribed order ABC…K.
(iv) No passing of jobs is permitted (i.e., the same order over each machine is maintained).
(v) The actual or expected processing times A1, A2, …, An; B1, B2, …, Bn; C1, C2, …, Cn; …; K1, K2, …, Kn are known and represented by a table of the type shown below.

Table 3 Machine times for n jobs and m machines
Job    A     B     C     …    K
1      A1    B1    C1    …    K1
2      A2    B2    C2    …    K2
3      A3    B3    C3    …    K3
:      :     :     :          :
i      Ai    Bi    Ci    …    Ki
:      :     :     :          :
n      An    Bn    Cn    …    Kn
The problem, as before, is to find the optimum sequence of jobs which minimizes T. No general solution is available at present for such a case. However, the method of section 4.4 can be extended to cover the special cases where either one or both of the following conditions hold good (if neither of the conditions holds good, the method fails):
(i) the minimum time on machine A is ≥ the maximum time on machines B, C, …, K-1,
(ii) the minimum time on machine K is ≥ the maximum time on machines B, C, …, K-1.
The method is to replace the m machine problem by an equivalent two machine problem. These two (fictitious) machines are denoted by a and b and their corresponding processing times are given by
ai = Ai + Bi + … + (K-1)i
bi = Bi + Ci + … + (K-1)i + Ki
If this new problem with the prescribed order ab is solved by the method of section 4.3, the resulting optimal sequence will also be optimal for the original problem. Further, if Bi + Ci + … + (K-1)i = k, where k is a fixed positive constant for all jobs (i = 1, 2, 3, …, n), then the given problem can be solved simply as an n job two machine problem (where the two machines are A and K in the order AK) as per the method of section 4.3.

EXAMPLE.1 A machine operator has to perform two operations, turning and threading, on a number of different jobs. The time required to perform these operations (in minutes) for each job is known. Determine the order in which the jobs should be processed in order to minimize the total time required to turn out all the jobs.

Table 4
Job    Time for turning (minutes)    Time for threading (minutes)
1      3                             8
2      12                            10
3      5                             9
4      2                             6
5      9                             3
6      11                            1
Solution: By examining the columns, we find the smallest value. It is the threading time of 1 minute for job 6 in the second column. Thus we schedule job 6 last, as shown below.
_   _   _   _   _   6
The reduced set of processing times becomes
Job    Turning time (minutes)    Threading time (minutes)
1      3                         8
2      12                        10
3      5                         9
4      2                         6
5      9                         3
The smallest value is the turning time of 2 minutes for job 4 in the first column. Thus we schedule job 4 first, as shown below.
4   _   _   _   _   6
The reduced set of processing times becomes
Job    Turning time (minutes)    Threading time (minutes)
1      3                         8
2      12                        10
3      5                         9
5      9                         3
There are two equal minimal values: the turning time of 3 minutes for job 1 in the first column and the threading time of 3 minutes for job 5 in the second column. According to the rules, job 1 is scheduled next to job 4 and job 5 next to job 6, as shown below.
4   1   _   _   5   6
The reduced set of processing times becomes
Job    Turning time (minutes)    Threading time (minutes)
2      12                        10
3      5                         9
The smallest value is the turning time of 5 minutes for job 3 in the first column. Therefore, we schedule job 3 next to job 1, job 2 takes the remaining position, and we get the optimal sequence as
4   1   3   2   5   6
Now we can calculate the elapsed time corresponding to the optimal sequence, using the individual processing times given in the problem. The details are shown in the table below.

         Turning operation        Threading operation
Job      Time in    Time out      Time in    Time out
4        0          2             2          8
1        2          5             8          16
3        5          10            16         25
2        10         22            25         35
5        22         31            35         38
6        31         42            42         43
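The time-in and time-out entries for this example can be recomputed with a short Python sketch. It assumes, as in the worked table, that each job starts threading as soon as both its turning is finished and the threading machine is free; the function name `two_machine_schedule` is chosen only for illustration.

```python
def two_machine_schedule(sequence, times):
    """Return rows (job, A in, A out, B in, B out) for a fixed job sequence."""
    rows = []
    a_free = 0  # time at which the first machine becomes free
    b_free = 0  # time at which the second machine becomes free
    for job in sequence:
        a, b = times[job]
        a_in, a_out = a_free, a_free + a
        b_in = max(a_out, b_free)  # wait for the job and for the machine
        b_out = b_in + b
        rows.append((job, a_in, a_out, b_in, b_out))
        a_free, b_free = a_out, b_out
    return rows


# Turning and threading times (minutes) from Example 1.
times = {1: (3, 8), 2: (12, 10), 3: (5, 9), 4: (2, 6), 5: (9, 3), 6: (11, 1)}
rows = two_machine_schedule([4, 1, 3, 2, 5, 6], times)

elapsed = rows[-1][4]                              # completion of the last job
idle_turning = elapsed - rows[-1][2]               # first machine waits at the end
idle_threading = elapsed - sum(b for _, b in times.values())
for row in rows:
    print(row)
print(elapsed, idle_turning, idle_threading)
```

Idle time on the threading machine is obtained here as the elapsed time minus the total threading work, which equals the sum of the gaps in its timetable.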
Thus the minimum elapsed time is 43 minutes. Idle time for the turning operation (m/c) is 1 minute (from the 42nd minute to the 43rd minute) and for the threading operation (m/c) is 2 + 4 = 6 minutes (from 0-2 and 38-42 minutes).

4.6 QUESTIONS
PART-A
1. A sequencing problem involving five jobs and two machines requires evaluation of
(a) 5 x 5 sequences (b) 5 + 5 sequences (c) 5! + 2! sequences (d) (5!)^2 sequences
2. In sequencing problems, which of the following assumptions is not true:
(i) All jobs are completely known and are ready for processing.
(ii) The time involved in moving a job from one service facility to another is negligibly small.
(iii) All jobs are processed on the first service facility and then on the second service facility.
(iv) Processing times are independent of the order of processing the jobs.
3. Effectiveness of the sequencing technique increases when a company has a large number of unprocessed orders on hand - True
PART-B
1. What are the assumptions underlying a Sequencing Problem?
2. State the 'No Passing Rule' in a Sequencing Problem.
3. Define Sequencing Problem, Processing time and Idle time.
PART-C
1. We have five jobs, each of which must go through the two machines A and B in the order AB. Processing times in hours are given below in the table:
Job          1    2    3    4    5
Machine A    5    1    9    3    10
Machine B    2    6    7    8    4
Determine a sequence for the five jobs that will minimize the elapsed time.
2. Time required (in minutes) by each of the six persons in a company is given below.
Position          P1     P2     P3     P4     P5     P6
Aptitude Test     140    180    150    200    170    100
Job Interview     70     120    110    80     100    90
In order to minimize the waiting time of the management personnel, in what order should the position filling be handled?
3. Find the sequence that minimizes the total time required in performing the following jobs on three machines in the order ABC.
Job          1    2    3    4    5     6
Machine A    8    3    7    2    5     1
Machine B    3    4    5    2    1     6
Machine C    8    7    6    9    10    9
4. Determine the optimum sequence of jobs that minimizes the total elapsed time based on the following information. Processing time on the machines is given in hours and passing is not allowed.
Job           A    B    C    D     E    F    G
Machine M1    3    8    7    4     9    8    7
Machine M2    4    3    2    5     1    4    3
Machine M3    6    7    5    11    5    6    12
5. A firm works 40 hours a week and has a capacity of overtime work to the extent of 20 hours in a week. It has received seven orders to be processed on three machines A, B and C in the order ABC, to be delivered in a week's time from now. The process times in hours are recorded in the following table. Find the optimum sequence of jobs.
Job          1    2    3    4    5    6    7
Machine A    7    8    6    6    7    8    5
Machine B    2    2    1    3    3    2    4
Machine C    6    5    4    4    2    1    5
6. A book binder has one printing press, one binding machine, and the manuscripts of a number of different books. The times required to perform the printing and binding operations for each book are shown below. Determine the order in which the books should be processed in order to minimize the total time required to turn out all the books.
Book                   1     2      3     4     5     6
Printing Time (Hrs)    30    120    50    20    90    110
Binding Time (Hrs)     80    100    90    60    30    10
7. Solve the following sequencing problem when passing is not allowed.
          Machine
Item      A     B    C    D    E
I         9     7    4    5    11
II        8     8    6    7    12
III       7     6    7    8    10
IV        10    5    5    4    8

4.7 SUGGESTED READINGS
1. Taha, H.A., "Operations Research - An Introduction", Macmillan Publishing Co. (1982).
2. Gupta, P.K. and Hira, D.S., "Operations Research", S. Chand & Co., New Delhi (1999).
3. Sharma, J.K., "Operations Research - Problems & Solutions", Macmillan India Ltd. (2004).
CHAPTER – V
GAME THEORY
STRUCTURE
5.1 INTRODUCTION
5.2 DEFINITIONS
5.3 TWO PERSONS ZERO SUM GAME
5.4 DOMINANCE PROPERTY
5.5 QUESTIONS
5.6 SUGGESTED READINGS

5.1 INTRODUCTION
A competitive game has the following characteristics:
(a) There is a finite number of participants or competitors. If the number is two, the game is called a two-person game; for a number > 2, it is called an n-person game.
(b) Each participant has a finite number of possible courses of action.
(c) Each participant must know all the courses of action available to the others but must not know which of these will be chosen.
(d) A play of the game occurs when each player chooses one of his courses of action. The choices are assumed to be made simultaneously, so that no participant knows the choice of the other until he has decided his own.
(e) After all participants have chosen a course of action, their respective gains are finite.
(f) The gain of a participant depends upon his own actions as well as those of the others.

5.2 DEFINITIONS
1. Game: It is an activity between two or more persons involving actions by each one of them (according to a set of rules) which results in some gain (+ve, -ve or zero) for each. A game in which the sum of payments to all the players, after the play of the game, is zero is called a zero-sum game. Here, the gain of the players that win is exactly equal to the loss of the players that lose. If the number of players in a zero-sum game is two, it is known as a two-person zero-sum game or rectangular game.
2. Strategy: It is the predetermined rule by which a player decides his course of action.
3. Pure strategy: It is the predetermined rule to always select a particular course of action.
4. Mixed strategy: It is the decision rule, fixed in advance of all plays, to use all or some of the available courses of action in some fixed proportion. Thus a mixed strategy is a selection among pure strategies with some fixed probabilities (proportions).
5. Payoff: It is the outcome of the game. The payoff (gain or game) matrix is the table showing the amounts received by the player named at the left-hand side after all possible plays of the game. The payment is made by the player named at the top of the table.

EXAMPLE.1 The table illustrates a game where competitors A and B are assumed to be equal in ability and intelligence. A has a choice of strategy 1 or strategy 2, while B can select strategy 3 or 4.

Table
                              Competitor B
                              Strategy 3    Strategy 4    Minimum of row
Competitor A    Strategy 1    +4            +6            4
                Strategy 2    +3            +5            3
Maximum of column             4             6
Both competitors know the payoffs for every possible strategy. It should be noted that the game favours competitor A since all values are positive. Values that favour B would be negative. Based upon these conditions, the game is biased against B. However, since B must play the game, he will play to minimize his losses. The game value must be 4, since A wins 4 points while B loses 4 points each time the game is played. The 'game value' is the average winnings per play over a long number of plays. The game illustrated in the table is a two-person zero-sum game, since A wins 4 points in each play while B loses the same amount.
A game is solved when the following have been determined:
(a) The average amount per play that A will win in the long run if A and B use their best strategies. As explained earlier, it is called the value of the game.
(b) The strategy that A should use to ensure that his average gain per play is at least equal to the value of the game.
(c) The strategy that B should use to ensure that his average loss per play is not more than the value of the game.

Maximin and Minimax Criteria of Optimality
They state that if a player lists the worst possible outcomes of all his potential strategies, he will choose the strategy that corresponds to the best of these worst outcomes. Such a strategy is called an optimum strategy.
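The maximin and minimax criteria can be checked mechanically. The following Python sketch is illustrative only: the function name `saddle_points` is an assumption, and the payoff rows (4, 6) and (3, 5) are the values of Example 1 read as A's strategies 1 and 2 against B's strategies 3 and 4.

```python
def saddle_points(payoff):
    """Return (maximin, minimax, saddle cells) for a payoff matrix.

    payoff[i][j] is the amount won by the row player when the row player
    uses strategy i and the column player uses strategy j.
    """
    row_minima = [min(row) for row in payoff]
    col_maxima = [max(col) for col in zip(*payoff)]
    maximin = max(row_minima)  # best of the row player's worst outcomes
    minimax = min(col_maxima)  # best of the column player's worst outcomes
    cells = [
        (i, j)
        for i, row in enumerate(payoff)
        for j, value in enumerate(row)
        if value == row_minima[i] and value == col_maxima[j]
    ]
    return maximin, minimax, cells


example1 = [[4, 6],
            [3, 5]]
maximin, minimax, cells = saddle_points(example1)
print(maximin, minimax, cells)
```

When the two values coincide, the cells returned are the saddle points and the common value is the value of the game; an empty list means that no pure-strategy solution exists.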
5.3 TWO PERSONS ZERO SUM GAME
EXAMPLE-2 Consider the game G with the following payoff table.
Table
                   Player B
                   B1     B2
Player A    A1     2      6
            A2     -2     h
(a) Show that G is strictly determinable, whatever h may be.
(b) Determine the value of G.
Solution
(a) Ignoring whatever the value of h may be, the given payoff matrix represents
Table
                   Player B
                   B1     B2     Row minima
Player A    A1     2      6      2
            A2     -2     h      -2
Column maxima      2      6
Maximin value = 2 and minimax value = 2. The game G is strictly determinable, whatever h may be.
(b) Value of the game = 2. Strategies: A, row 1; B, column 1.

EXAMPLE-3 For what value of h is the game with the following payoff matrix strictly determinable?
Table
                   Player B
                   B1     B2     B3
Player A    A1     h      6      2
            A2     -1     h      -7
            A3     -2     4      h
Solution
Ignoring whatever the value of h may be, the given payoff matrix represents
Table
                   Player B
                   B1     B2     B3     Row minima
Player A    A1     h      6      2      2
            A2     -1     h      -7     -7
            A3     -2     4      h      -2
Column maxima      -1     6      2
Maximin value = 2 and minimax value = -1. The value of the game lies between -1 and 2, i.e., -1 ≤ V ≤ 2. For a strictly determinable game, since maximin value = minimax value, we must have -1 ≤ h ≤ 2.

EXAMPLE-4
Find the ranges of values of p and q which will render the entry (2,2) a saddle point for the game.
Table
                   Player B
                   B1     B2     B3
Player A    A1     2      4      5
            A2     10     7      q
            A3     4      p      6
Solution
First ignoring the values of p and q, we determine the maximin and minimax values of the payoff matrix as follows:
Table
                   Player B
                   B1     B2     B3     Row minima
Player A    A1     2      4      5      2
            A2     10     7      q      7
            A3     4      p      6      4
Column maxima      10     7      6
Maximin value = 7, minimax value = 7. For the entry 7 at (2,2) to be the maximum of its column and the minimum of its row, this imposes the condition p ≤ 7 on p and q ≥ 7 on q. Hence the range of p and q will be p ≤ 7, q ≥ 7.

EXAMPLE-5 The payoff matrix of a game is given below. Find the solution of the game to A and B.
Table
                         B
                I     II    III     IV    V     Row minima
A       I       -4    -2    -2      3     1     -4
        II      1     0     -1      0     0     (-1)
        III     -6    -5    -2      -4    4     -6
        IV      3     1     -6      0     -8    -8
Column maxima   3     1     (-1)    3     4
Solution
At the right of each row, write the row minimum and ring the largest of them. Similarly, at the bottom of each column, write the column maximum and ring the smallest of them. The matrix has a saddle point in cell (II, III), and the value of the game is -1 to A and +1 to B. The optimum strategies for players A and B are II and III respectively.

5.4 DOMINANCE PROPERTY
If no saddle point exists, the payoff matrix can be reduced if it is possible to eliminate certain strategies by dominance. The resulting reduced game can then be solved by some mixed strategy.

EXAMPLE-6 Two players P and Q play a game. Each of them has to choose one of three colours, white (W), black (B) and red (R), independently of the other. Thereafter the colours are compared. If both P and Q have chosen white (W,W), neither wins anything. If player P selects white and player Q black (W,B), player P loses Rs. 2 or player Q wins the same amount, and so on. The complete payoff table is shown below. Find the optimum strategies for P and Q and the value of the game.
Table
                              Colour chosen by Q
                              W      B      R
Colour chosen by P     W      0      -2     7
                       B      2      5      6
                       R      3      -3     8
Solution
This matrix has no saddle point. Evidently, player Q will not play strategy R, since this will result in the heaviest losses to him and gains to player P. He can do better by playing strategies W or B. Thus column R is to be deleted, and strategy R is called a dominated strategy. The dominance rule for columns is: every value in the dominating column(s) must be less than or equal to the corresponding value of the dominated column. The resulting matrix is
Table
                              Colour chosen by Q
                              W      B
Colour chosen by P     W      0      -2
                       B      2      5
                       R      3      -3
Table B
-1 Player A
1
3
1 2 -1 -1
2
3 (-1)
-1 3 -1
-1 -1 2
(-1) (-1)
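The dominance rule of section 5.4 can be sketched in Python as below, using the colour game of Example 6 as data. This is a hedged illustration, not the text's own worked solution: the function names are assumptions, the row rule (a row is dominated when another row is at least as large in every column) is the usual counterpart of the column rule stated above, and the closing step uses the standard closed-form solution of a 2 x 2 game without a saddle point.

```python
from fractions import Fraction


def reduce_by_dominance(matrix, rows, cols):
    """Repeatedly delete dominated rows and columns of a payoff matrix."""
    m = [list(r) for r in matrix]
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # Row player maximises: drop row i if another row is >= it everywhere.
        for i in range(len(m)):
            if any(k != i and all(x >= y for x, y in zip(m[k], m[i]))
                   for k in range(len(m))):
                del m[i], rows[i]
                changed = True
                break
        if changed:
            continue
        # Column player minimises: drop column j if another column is <= it everywhere.
        for j in range(len(cols)):
            if any(k != j and all(r[k] <= r[j] for r in m)
                   for k in range(len(cols))):
                for r in m:
                    del r[j]
                del cols[j]
                changed = True
                break
    return m, rows, cols


def solve_2x2(m):
    """Mixed-strategy solution of a 2 x 2 game that has no saddle point."""
    (a, b), (c, d) = m
    denom = Fraction(a + d - b - c)
    p1 = (d - c) / denom            # probability of the first row
    q1 = (d - b) / denom            # probability of the first column
    value = (a * d - b * c) / denom
    return p1, q1, value


payoff = [[0, -2, 7],   # P plays W
          [2, 5, 6],    # P plays B
          [3, -3, 8]]   # P plays R
reduced, rows, cols = reduce_by_dominance(payoff, "WBR", "WBR")
p1, q1, value = solve_2x2(reduced)
print(rows, cols, reduced)
print(p1, q1, value)
```

Exact fractions are used so that the mixed-strategy probabilities are not blurred by floating-point rounding.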
5.5 QUESTIONS
PART-A
1) ………………… is the predetermined rule by which a player decides his course of action.
2) ……………… is the predetermined rule to always select a particular course of action.
3) A mixed strategy is a selection among pure strategies with some fixed ………………
Ans: 1) strategy 2) pure strategy 3) probabilities (proportions).
PART-B
1. Define the saddle point.
2. Define the payoff matrix.
3. What is a rectangular game?
4. Explain the dominance property.
5. What are bidding problems?
6. How is game theory applied in LPP?
PART-C
1. Find the range of values of p and q so that the entry (2,2) is a saddle point in the following games:
12
q
4
0
2
3
p
6
11
8
5
q
17
3
4
2
p
4
2. Two players A and B match coins. If the coins match, then A wins one unit of value; if the coins do not match, then B wins one unit of value. Determine optimum strategies for the players and the value of the game.
3. Solve the game whose payoff matrix is
5
2
3
4
by the arithmetic method and verify the results by the algebraic method. Calculate the game value.
4. In a game of matching coins with two players, suppose A wins one unit of value when there are two heads, wins nothing when there are two tails, and loses ½ unit of value when there are one head and one tail. Determine the payoff matrix, the best strategies for each player and the value of the game to A.
5. Use the dominance property to solve the following game between two players A and B:
6
8
6
4
12
2
6. Solve, by using the dominance property, the following game:
1
7
2
6
2
7
6
1
6
7. Reduce the following game to a 2 x 2 game and solve it.
-2
-4
3
4
-6
-5
2
1
8. Find the optimum strategies for X and Y and the value of the game:
-6
10
11
-1
-2
-3
-1
-2
-4
5.6 SUGGESTED READINGS
1. Sharma, J.K., "Operations Research - Problems & Solutions", Macmillan India Ltd. (2004).
2. Gupta, P.K. and Man Mohan, "Problems in Operations Research", S. Chand & Co., New Delhi (2006).
3. Vohra, N.D., "Quantitative Techniques in Management", Tata McGraw-Hill, New Delhi (2007).
CHAPTER – VI
NETWORK PROJECT SCHEDULING
STRUCTURE
6.1 INTRODUCTION
6.2 DEFINITIONS
6.3 NETWORK CONSTRUCTION PRACTICE
6.4 CRITICAL PATH METHOD (CPM)
6.5 PROCEDURE FOR DETERMINING THE CRITICAL PATH
6.6 PROGRAM EVALUATION AND REVIEW TECHNIQUE (PERT)
6.7 QUESTIONS
6.8 SUGGESTED READINGS

6.1 INTRODUCTION
In a large and complex project involving a number of interrelated activities, requiring a number of men, machines and materials, it is not possible for the management to make and execute an optimum schedule just by intuition based on organizational capabilities and work experience. PERT (Programme Evaluation and Review Technique) and CPM (Critical Path Method) are two of the many network techniques which have been widely used for planning, scheduling and controlling large and complex projects.

6.2 DEFINITIONS
Network: A network is a graphical representation of a project's operations and is composed of all the events and activities in sequence, along with their interrelationships and interdependencies.
Event: An event (node) is the beginning or end of a job or an activity. It represents a specific point in time and consumes no time, money or resources.
Activity: An activity represents a job or an individual operation of a project. It consumes time, money and resources in doing the work.
Dummy Activity: An activity which does not consume time, money or resources but merely depicts the technical dependence.
Predecessor Activity: Activities that must be completed immediately prior to the start of another activity.
Successor Activity: An activity that cannot be started until one or more of the other activities are completed, but immediately succeeds them.
Concurrent Activities: Activities which can be accomplished concurrently are known as concurrent activities.
Work Breakdown Structure (W.B.S.): A project is a combination of interrelated activities which must be performed in a certain order for its completion. The process of dividing the project into these activities is called the Work Breakdown Structure (W.B.S.). The activity or unit of work, also called work content, is a clearly identifiable and manageable work unit.

6.3 NETWORK CONSTRUCTION PRACTICE
Let us consider a very simple situation to illustrate the W.B.S. A group of students is given the project of designing, fabricating and testing a small centrifugal pump. The project can be broken down into the following sub-parts: (i) design, (ii) fabrication, (iii) testing. The network at this level of detail will look as shown in figure 20.1.
[Figure 20.1: Design -> Fabrication -> Testing]
The work units can be further broken down into smaller work contents as shown below.
A. Design.
B. Make drawings.
C. Make patterns.
D. Make moulds.
E. Do casting of parts.
F. Do machining of parts.
G. Assemble.
H. Design test rig.
I. Fabricate test rig.
J. Perform the test.
The network at this level of detail may look like the one shown in figure 20.2.
[Figure 20.2: network of activities A to J]
Some of these activities can still be further sub-divided, and the work can be carried out in parallel on a number of activities. The various activities can be listed as
A. Design.
B. Make drawings.
C. Make pattern of impeller.
D. Make pattern of casing.
E. Make mould for impeller.
F. Make mould for casing.
G. Do casting of all the parts.
H. Machine impeller.
I. Machine casing and other parts.
J. Assemble the pump.
K. Design test rig.
L. Fabricate test rig.
M. Perform test.
N. Compute results.
At this level of detail the network for the project may look as shown in the figure. The level of detail depends upon the objective of the management, the extent of control desired and the availability of computational aids. The greater the detail, the better will be the control and the more involved will be the computations.
Example.1 In a boiler overhauling project the following activities are to be performed:
A. Inspection of boiler by boiler engineer and preparation of list of parts to be replaced/repaired.
B. Collecting quotations for the parts to be purchased.
C. Placing the orders and purchasing.
D. Dismantling of the defective parts from the boiler.
E. Preparation of necessary instructions for repairs.
F. Repair of parts in the workshop.
G. Cleaning of the various mountings and fittings.
H. Installation of the repaired parts.
I. Installation of the purchased parts.
J. Inspection.
K. Trial run.
Assuming that the work is assigned to the boiler engineer, who has one boiler mechanic and one boiler attendant at his disposal, draw a network showing the precedence relationships.
Solution. When we look at the list of activities, we note that activity A (inspection of boiler) is to be followed by dismantling (D), and only after that can it be decided which parts can be repaired and which will have to be replaced. Now the repairing and purchasing can go side by side. But the instructions for repairs may be prepared after sending the letters for quotations. Note that this becomes a partial constraint. Also, the cleaning of the boiler, which is to be done by the attendant, can be started after activity D. Now we assume that repairing will take less time than purchasing. But the installation of repaired parts can be started only when the cleaning is complete. This results in the use of a dummy activity. After the installation of repaired parts, installation of purchased parts can be taken up. This will be followed by inspection and trial run. The network showing the precedence relationships will look as shown in the figure.
[Figure: network for the boiler overhauling project, with activity B split into B1 and B2, a partial constraint and dummy activities]
Example.2 Following are the activities which are to be performed for a building site preparation. Determine the precedence relationship and draw the network.
A. Clear the site.
B. Survey and layout.
C. Rough grade.
D. Excavate for sewer.
E. Excavate for electrical manholes.
F. Install sewer and backfill.
G. Install electrical manholes.
H. Construct the boundary wall.
Solution. Looking at the list of activities we can fix the following precedence order: B succeeds A and C succeeds B, i.e., B>A; C>B. D and E can start together after the completion of C, i.e., D,E>C. F will follow D and G will follow E, i.e., F>D; G>E. H can start after C, i.e., H>C. Thus the precedence relationship is: B>A; C>B; D,E,H>C; F>D and G>E. The project can be represented in the form of a network as shown in the figure.

Example.3 A project consists of a series of tasks labeled A, B, …, H, I with the following relationships (W<X,Y means X and Y cannot start until W is completed; X,Y<W means W cannot start until both X and Y are completed). With this notation construct the network diagram having the following constraints: A<D,E; B,D<F; C<G; B<H; F,G<I.
[Figure: project network for tasks A to I]
Find also the minimum time of completion of the project, when the time (in days) of completion of each task is as follows:
Task:    A     B    C     D     E     F     G     H    I
Time:    23    8    29    16    24    18    19    4    10
Solution For the given precedence relationships, the project network shown in the figure is obtained. The activity durations are given below the activity arrows. The events have been numbered by employing Fulkerson's rule. The earliest occurrence time (ET) of the events, determined by the forward pass computations, and the latest occurrence time (LT) of the events, determined by the backward pass computations, are written along the node circles. The critical path of the network is 1-3-5-6-7, and consists of the critical activities A, D, F and I. The minimum time of completion of the project is 67 days.
6.4 CRITICAL PATH METHOD (CPM)
CPM is a technique used for planning and controlling the most logical and economic sequence of operations for accomplishing a project. The critical path is the longest path through the network, and its length gives the minimum time needed to complete the entire project.

6.5 PROCEDURE FOR DETERMINING THE CRITICAL PATH
Step-1: List all the tasks and draw a network diagram. Each task is indicated by an arrow, with the direction of the arrow showing the sequence of tasks. Treat each task as an activity and place the activities on the diagram according to the precedence relationships between them.
Step-2: Indicate the deterministic activity times above the arrows in the network diagram.
Step-3: Calculate the earliest start time and earliest finish time for each event and put them above the event. Also calculate the latest start time and latest finish time and write them in the box.
Step-4: Tabulate the various times, namely the activity normal time, earliest time and latest time, from the network diagram.
Step-5: Calculate the total float, free float and independent float for each activity by taking the earliest and latest event times.
Step-6: Find the critical path by identifying the critical activities, and connect them with the beginning and ending events in the arrow diagram by double line arrows.
Step-7: Calculate the overall project time duration.

6.6 PROGRAM EVALUATION AND REVIEW TECHNIQUE (PERT)
The activity time duration in a CPM network is deterministic. But in practical cases, due to changes in technology, it is difficult to estimate a realistic time. For such cases, where the activity time duration is uncertain, the PERT technique is used.
In the PERT technique, the activity time duration is estimated by three times, namely
(a) Optimistic time estimate (to)
(b) Most likely time estimate (tm)
(c) Pessimistic time estimate (tp)
(a) Optimistic time: It is the shortest possible time to perform the activity, assuming that everything goes well.
(b) Most likely time: It is the most often occurring duration of the activity. Statistically, it is the modal value of the duration of the activity.
(c) Pessimistic time: This is the maximum time that is required to perform the activity under extremely bad conditions. However, such conditions do not include acts of nature like earthquakes, floods, etc. It is the longest of the three estimates.
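The three estimates are usually combined into an expected activity time and a variance. The formulas are not stated in this section, so the Python sketch below assumes the standard PERT (beta distribution) approximations te = (to + 4 tm + tp) / 6 and variance = ((tp - to) / 6)^2. The function name and the sample estimates of 3, 6 and 15 days are chosen only for illustration.

```python
def pert_estimate(t_o, t_m, t_p):
    """Expected time and variance of one activity from three time estimates.

    t_o: optimistic time, t_m: most likely time, t_p: pessimistic time.
    Uses the standard PERT approximations (an assumption, see lead-in).
    """
    expected = (t_o + 4 * t_m + t_p) / 6
    variance = ((t_p - t_o) / 6) ** 2
    return expected, variance


# Illustrative estimates in days.
expected, variance = pert_estimate(3, 6, 15)
print(expected, variance)
```

The most likely time carries four times the weight of either extreme, so the expected time stays close to the mode while still reflecting the spread between the optimistic and pessimistic estimates.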
6.7 QUESTIONS
PART – A
1. An activity which is started immediately after one or more other activities are completed is known as a predecessor activity – False
2. A dummy activity is needed when two or more parallel activities in a project have the same head and tail events – True
3. A dummy activity is introduced to avoid dangling – True
4. For network construction each activity is represented by two arrows – False
5. Network scheduling is a technique used for planning and scheduling large projects in fields.
6. A network is a graphic representation of project operations.
7. An event represents the start or completion of an activity.
8. An event is also known as a node.
PART – B
9. Define Network.
10. Define CPM and PERT.
11. State any four differences between CPM and PERT.
12. Define Optimistic time and Pessimistic time.
13. Define Most Likely Time.
14. What are the rules of network construction?
15. What are the rules followed in numbering the network events?
16. How will you find the expected time and variance in PERT?
PART – C
Problems in CPM:
17. The following table gives the activities in a construction project and time duration in hours:
Activity    1-2   1-3   2-4   3-4   3-5   4-9   5-6   5-7   6-8   7-8   8-10   9-10
Duration    4     1     1     1     6     5     4     8     1     2     5      7
Find the Critical Path and calculate Total float, Free float and Independent float.
18. The following table gives the activities in a construction project and time duration in hours:
Activity    1-2   1-3   1-4   2-6   3-7   3-5   4-5   5-9   6-8   7-8   8-9
Duration    2     2     1     4     5     8     3     5     1     4     3
Find the Critical Path and calculate Total float, Free float and Independent float.
19. Draw the network for the following project:

Activity  1-2   1-3   2-4   3-4   4-5   4-6   5-7   6-7   7-8
Duration  5     4     6     2     1     7     8     4     3

Find (i) Earliest Start and Finish Time (ii) Latest Start and Finish Time (iii) Critical Path (iv) All the types of slack.
20. The following table gives the activities in a construction project and their durations in hours:

Activity  1-2   1-3   2-4   2-5   3-4   3-6   4-5   4-6   5-7   6-7
Duration  4     5     2     12    3     8     10    6     8     10

(i) Draw the arrow diagram (ii) Identify the critical path and find the total project duration (iii) Find the total float, free float and independent float.
21. A project is given below:

Task  Optimistic time  Pessimistic time  Most likely time
1-2   5                10                8
1-3   18               22                20
1-4   26               40                33
2-5   16               20                18
2-6   15               25                20
3-6   6                12                9
4-7   7                12                10
5-7   7                9                 8
6-7   3                5                 4

Determine (i) Expected task time and variance (ii) Earliest and latest times to each node (iii) Critical Path (iv) Probability of a node occurring at the proposed completion date if the original contract time of completing the project is 41.5.
22. A project consists of the following activities and time estimates:

Activity  to    tm    tp
1-2       3     6     15
1-3       2     5     14
1-4       6     12    30
2-5       2     5     8
2-6       5     11    17
3-6       3     6     15
4-7       3     9     27
5-7       1     4     7
6-7       2     5     8

a) Draw the network. b) Find the critical path. c) What is the probability that the project will be completed in 27 days?
23. Tasks A, B, C, D, E, F, G, H, I constitute a project. The notation x < y means that task x must be finished before task y can begin. With this notation A < D, A < E, B < F, D < F, C < G, C < H, F < I, G < I. Draw a graph to represent the sequence of tasks and find the minimum time of completion of the project when the time (in days) of completion of each task is given as follows:

Task  A     B     C     D     E     F     G     H     I
Time  8     10    8     10    16    17    18    14    9

6.7 SUGGESTED READINGS
1. J. K. Sharma, "Operations Research – Problems & Solutions", Macmillan India Ltd., 2004.
2. P. K. Gupta and Man Mohan, "Problems in Operations Research", S. Chand & Co., New Delhi, 2006.
3. N. D. Vohra, "Quantitative Techniques in Management", Tata McGraw-Hill, New Delhi, 2007.
CHAPTER – VII
SAMPLING DISTRIBUTIONS
STRUCTURE
7.1 INTRODUCTION
7.2 SAMPLING DISTRIBUTIONS
7.3 SAMPLING ERROR
7.4 SAMPLING FROM NORMAL POPULATIONS
7.5 SAMPLING DISTRIBUTION OF THE DIFFERENCE OF TWO PROPORTIONS
7.6 TYPES OF ESTIMATES
7.7 INTERVAL ESTIMATION FOR MEAN OF LARGE SAMPLES
7.8 DETERMINING SAMPLE SIZE
7.9 STUDENT’S t-DISTRIBUTION
7.10 QUESTIONS
7.11 SUGGESTED READINGS
SAMPLING DISTRIBUTIONS
7.1 INTRODUCTION
A small section selected from the population is called a sample, and the process of drawing a sample is called sampling. A sample must be selected at random, so that each member of the population has the same chance of being included in the sample. Thus the fundamental assumption underlying the theory of sampling is random sampling.
A special case of random sampling, in which each event has the same probability of success and the chances of success of different events are independent of whether previous trials have been made or not, is known as simple sampling.
The statistical constants of the population, such as the mean (µ) and the standard deviation (σ), are called parameters. Similarly, the constants for a sample drawn from the given population, i.e. the mean ($\bar{x}$), the standard deviation (S) etc., are called statistics. The population parameters are in general not known, and their estimates given by the corresponding sample statistics are used. We use Greek letters to denote population parameters and Roman letters for sample statistics.
Objectives of sampling. Sampling aims at gathering the maximum information about the population with the minimum effort, cost and time. A further object of sampling is to determine the reliability of these estimates. The logic of sampling theory is the logic of induction, in which we pass from the particular (sample) to the general (population). Such a generalization from sample to population is called statistical inference.
In business, situations often arise in which managers have to make quick estimates, and these estimates have an important bearing on the success or failure of their enterprises.
Estimates are made without complete information and with a great deal of uncertainty about the eventual outcome.
Estimation is the procedure for assigning a value to a population parameter based on the data collected from a sample.
Estimation, which is used to draw conclusions about population characteristics, is one of the fundamental applications of statistical inference in business.
7.2 SAMPLING DISTRIBUTIONS
If the same statistic is computed for each of several samples, its value is likely to vary from sample to sample. Theoretically it is therefore possible to construct a frequency table showing the values assumed by the statistic and their frequencies of occurrence. This distribution of the values of a statistic is called its sampling distribution.
A sampling distribution is the probability distribution of all possible values of a given statistic over all the distinct possible samples of equal size drawn from a population. The sampling distribution of $\bar{X}$ is the probability distribution of all the values that $\bar{X}$ may take when a sample of size n is taken from a specified population.
When sampling is done from a normal distribution with mean µ and standard deviation σ, the sample mean $\bar{X}$ has a normal sampling distribution. When the sample size is large enough, the sampling distribution of $\bar{X}$ is approximately normal even if the population itself is not normal.
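The definition can be made concrete by listing every possible sample from a very small population. The Python sketch below does this for samples of size 2 drawn without replacement; the population values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8]  # hypothetical population, N = 4
n = 2                      # sample size

# Every distinct sample of size n and its mean
sample_means = [mean(s) for s in combinations(population, n)]

# Sampling distribution of the mean: value -> probability
distribution = Counter(sample_means)
for value in sorted(distribution):
    print(value, distribution[value] / len(sample_means))

# The mean of the sampling distribution equals the population mean
print(mean(sample_means) == mean(population))  # True
```

The six possible samples give the means 3, 4, 5, 5, 6 and 7, so the value 5 has probability 2/6 and every other value has probability 1/6.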
7.3 SAMPLING ERROR
Sampling error is due to a faulty process of selection, faulty work during collection, faulty methods of analysis and the variability of the population.
Different samples selected from the same population give different results, because the elements included in each sample are different. This gives rise to sampling errors.
Error in a sample arises from two principal sources: random error and non-random error. Random error results from taking a sample from a population instead of measuring the entire population. It is predictable using probability theory. It is the reason that sample statistics provide only estimates of population parameters, but the likely size of the random error can be calculated.
7.3.1 Non Sampling Error
It is due to human factors and varies from one investigator to another. It arises from carelessness of the investigator and faulty planning of the sampling.
It is due to incomplete investigation and sample survey.
It is due to wrong editing, coding and presentation of the responses received through the questionnaire.
It arises when the interviewers are not properly trained.
It includes errors committed during the presentation and printing of tabulated results.
Non-random error results from bias being introduced into the sample by some flaw in the design or implementation of the sample. For example, using a telephone book as the sampling frame for all the residents of a city will result in some bias, because some people are not listed in the directory or do not have telephones. People who refuse to take part in a study (which is their right) may also introduce bias into the sample. Some people may provide erroneous information, which also biases the results. Finally, mistakes in computing the required sample size, in identifying the actual units to be included in the sample, or other errors can introduce bias into the sample.
To assess whether an adequate sample was used in a piece of research, ask the following questions:
Size: was the size adequate for the purpose of the study, especially if there were many sub-groups included in the analysis, or many variables used simultaneously?
Representativeness: was the sample selected randomly from the population?
Implementation: was the sampling plan carried out carefully, was it adequately supervised, and was there some quality control plan?
7.3.2 STANDARD ERROR OF STATISTIC
The sampling distribution describes how the values of a statistic, say the mean, are scattered around its mean µ. Its standard deviation $\sigma_{\bar{x}}$ is called the standard error, to distinguish it from the standard deviation σ of the population.
Example - 1
A random sample of size 5 is drawn without replacement from a finite population consisting of 41 units. If the S.D of the population is 6.25, find the S.E of the sample mean.
Solution:
Given n = 5, N = 41, σ = 6.25.
The sample is drawn without replacement, so the finite population correction applies.
The standard error of $\bar{X}$ is
$$S.E = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = \frac{6.25}{\sqrt{5}} \sqrt{\frac{41-5}{41-1}} = 2.65$$
Example - 2
A simple random sample of size 16 is drawn without replacement from a finite population consisting of 50 units. If the number of defective units in the population is 5, find the S.E of the sample proportion of defectives.
Solution:
Given n = 16, N = 50, $p = \frac{5}{50} = 0.1$, q = 1 − p = 0.9.
The standard error of the sample proportion is
$$S.E = \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{0.1 \times 0.9}{16}} \sqrt{\frac{50-16}{50-1}} = 0.0625$$
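The two standard-error formulas with the finite population correction can be checked numerically. The Python sketch below is illustrative; the function names `fpc`, `se_mean` and `se_proportion` are assumptions and do not come from the text.

```python
import math


def fpc(n, N):
    """Finite population correction sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))


def se_mean(sigma, n, N=None):
    """Standard error of the sample mean; pass N for sampling
    without replacement from a finite population."""
    se = sigma / math.sqrt(n)
    return se * fpc(n, N) if N is not None else se


def se_proportion(p, n, N=None):
    """Standard error of the sample proportion; pass N for sampling
    without replacement from a finite population."""
    se = math.sqrt(p * (1 - p) / n)
    return se * fpc(n, N) if N is not None else se


print(round(se_mean(6.25, 5, 41), 2))        # Example 1: 2.65
print(round(se_proportion(0.1, 16, 50), 4))  # Example 2: 0.0625
```

Leaving out `N` gives the standard error for sampling with replacement or from a very large population.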
7.4 SAMPLING FROM NORMAL POPULATIONS
For samples of size n taken from a population with mean µ and standard deviation σ, the parameters of the distribution of sample means are defined by:
Mean of the distribution of sample means: $\mu_{\bar{x}} = \mu$
Standard deviation of the distribution of sample means (or standard error of the mean): $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Example – 3
The mean length of life of a certain bearing is 41.5 hours with a standard deviation of 2.5 hours. What is the probability that a simple random sample of size 50 drawn from this population will have a mean between 40.5 hours and 42 hours?
Solution:
Given n = 50, μ = 41.5 hours, σ = 2.5 hours.
The parameters of the sampling distribution are
$$\mu_{\bar{x}} = \mu = 41.5, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{50}} = 0.3536$$
Since the sample size n = 50 (≥ 30), we apply the normal distribution.
$$P(40.5 \le \bar{x} \le 42) = P\left(\frac{x_1 - \mu_{\bar{x}}}{\sigma_{\bar{x}}} \le Z \le \frac{x_2 - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) = P\left(\frac{40.5 - 41.5}{0.3536} \le Z \le \frac{42 - 41.5}{0.3536}\right)$$
$$= P(-2.8281 \le Z \le 1.4140) = 0.9184 \quad \text{(using the normal table)}$$
Example – 4
A continuous manufacturing process produces items whose weights are normally distributed with a mean weight of 800 gms and a standard deviation of 300 gms. A random sample of 16 items is to be drawn from the process. What is the probability that the arithmetic mean of the sample exceeds 900 gms?
Solution: Given
n = 16, μ = 800 gms, σ = 300 gms.
Since the population is normally distributed, the mean and S.D of the distribution of the sample mean are
$$\mu_{\bar{x}} = \mu = 800, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{16}} = 75$$
$$P(\bar{X} > 900) = P\left(Z > \frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) = P\left(Z > \frac{900 - 800}{75}\right) = P(Z > 1.33)$$
$$= 0.5 - P(0 \le Z \le 1.33) = 0.5 - 0.4082 = 0.0918 \quad \text{(using the table of area under the normal curve)}$$
7.4.1 Sampling Distribution of Difference between Two Sample Means
Consider comparing a population of size N1 having mean μ1 and standard deviation σ1 with another population of size N2 having mean μ2 and standard deviation σ2. Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means from the two populations respectively.
Let
$$Z = \frac{\bar{X} - \mu}{\sigma}$$
where $\bar{X} = \bar{X}_1 - \bar{X}_2$, $\mu = \mu_1 - \mu_2$, and the standard error of the sampling distribution is
$$\sigma = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
[n1 and n2 are the sizes of the independent random samples drawn from the first and second populations.]
Example - 5
The tyres of manufacturer I have a mean lifetime of 1400 hours with a standard deviation of 200 hours, while those of manufacturer II have a mean lifetime of 1200 hours with a standard deviation of 100 hours. If random samples of 125 tyres from each manufacturer are tested, what is the probability that the tyres of manufacturer I will have a mean lifetime which is at least 160 hours more than that of manufacturer II?
Solution:
Given μ1 = 1400, σ1 = 200 hours, n1 = 125; μ2 = 1200, σ2 = 100 hours, n2 = 125.
Thus μ = μ1 − μ2 = 200 and
$$\sigma = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{200^2}{125} + \frac{100^2}{125}} = 20$$
The probability that this difference is more than or equal to 160 hours is
$$P(\bar{x}_1 - \bar{x}_2 \ge 160) = P\left(Z \ge \frac{\bar{X} - \mu}{\sigma}\right) = P\left(Z \ge \frac{160 - 200}{20}\right) = P(Z \ge -2) = 0.5 + 0.4772 = 0.9772$$
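Probabilities of this kind can also be computed without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names are illustrative assumptions. Because the Z values are not rounded to two decimals, the results may differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo <= sample mean <= hi) for samples of size n."""
    se = sigma / sqrt(n)
    return Z.cdf((hi - mu) / se) - Z.cdf((lo - mu) / se)


def prob_diff_at_least(d, mu1, sigma1, n1, mu2, sigma2, n2):
    """P(mean1 - mean2 >= d) for two independent samples."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return 1 - Z.cdf((d - (mu1 - mu2)) / se)


# Example 3: bearing life, n = 50
print(round(prob_mean_between(40.5, 42, 41.5, 2.5, 50), 4))

# Example 5: two tyre manufacturers, 125 tyres each
print(round(prob_diff_at_least(160, 1400, 200, 125, 1200, 100, 125), 4))
```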
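7.5 SAMPLING DISTRIBUTION OF THE DIFFERENCE OF TWO PROPORTIONS
Suppose two populations of size N1 and N2 are given. For each sample of size n1 from the first population, compute the sample proportion P1 and its standard deviation $\sigma_{P_1}$; similarly, for each sample of size n2 from the second population, compute the sample proportion P2 and $\sigma_{P_2}$.
The mean and S.D of the distribution of the difference are
$$\mu_{P_1 - P_2} = P_1 - P_2$$
$$\sigma_{P_1 - P_2} = \sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}$$
Example - 6
A motor manufacturer has determined from experience that 3 percent of the motors he produces are defective. If a random sample of 300 motors is examined, what is the probability that the proportion defective is between 0.02 and 0.035?
Solution:
Given $\mu_P = p = 0.03$, P1 = 0.02, P2 = 0.035 and n = 300.
The standard error of the proportion is
$$\sigma_P = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{0.03 \times 0.97}{300}} = 0.0098$$
$$P(0.02 \le P \le 0.035) = P\left(\frac{P_1 - \mu_P}{\sigma_P} \le Z \le \frac{P_2 - \mu_P}{\sigma_P}\right) = P\left(\frac{0.02 - 0.03}{0.0098} \le Z \le \frac{0.035 - 0.03}{0.0098}\right)$$
$$= P(-1.02 \le Z \le 0.51) = 0.3461 + 0.1950 = 0.5411 \quad \text{(from the normal table)}$$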
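The same calculation for a sample proportion can be sketched in Python. The function name `prob_proportion_between` is an illustrative assumption; since Z is not rounded to two decimals, the result is close to, but not identical with, the table-based value.

```python
from math import sqrt
from statistics import NormalDist


def prob_proportion_between(lo, hi, P, n):
    """P(lo <= sample proportion <= hi) under the normal approximation,
    for population proportion P and sample size n."""
    se = sqrt(P * (1 - P) / n)
    z = NormalDist()
    return z.cdf((hi - P) / se) - z.cdf((lo - P) / se)


# Example 6: 3 percent defective, sample of 300 motors
print(round(prob_proportion_between(0.02, 0.035, 0.03, 300), 3))
```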
7.6 TYPES OF ESTIMATES
Estimate
The value of a sample statistic used to estimate a population parameter is called an estimate.
Estimator
An estimator is a sample statistic that is used to estimate a population parameter.
There are two types of estimates:
i. Point Estimate
ii. Interval Estimate
7.6.1 Point Estimation:
If an estimate of a population parameter is given by a single value, then the estimate is called a point estimate of the parameter. Most statistics can be used as point estimators; for example, the sample mean $\bar{X}$ is the point estimator of the population mean μ.
A point estimate is a single sample statistic, such as $\bar{X}$, calculated from the sample to provide a best estimate of the true value of the corresponding population parameter.
7.6.2 Interval Estimation:
It is the range of values used in estimating the population parameter: the estimate is given by two different values between which the parameter may be considered to lie. In interval estimation, we first find a point estimate. Then we use this estimate to construct an interval on both sides of the point estimate, within which we can be reasonably confident that the true parameter will lie. Interval estimation establishes an interval consisting of a lower limit and an upper limit in which the true value of the population parameter is expected to fall.
Example
If the height of a student is measured as 170 cm, the measurement gives a point estimate. An interval estimate states instead that the height lies between 165 and 175 cm.
7.6.3 QUALITIES FOR SELECTION OF GOOD ESTIMATOR
1. Unbiased estimator
2. Consistent estimator
3. Efficient estimator
4. Sufficient estimator
1. Unbiased Estimator
The expected value of the estimator must be equal to the value of the parameter.
The sample mean $\bar{X}$ is an unbiased estimator of the population mean μ. When the expected value (or mean) of a sample statistic is equal to the value of the corresponding population parameter, the sample statistic is said to be an unbiased estimator.
A statistic is said to be an unbiased estimator of the corresponding parameter if the mean of the sampling distribution of the statistic is equal to the corresponding population parameter.
2. Consistent Estimator
The value of the estimator approaches the value of the parameter as the sample size increases.
An estimator is said to be consistent if it gives values closer to the population parameter as the sample size increases.
Consistency refers to the effect of sample size on the accuracy of the estimator. A statistic is said to be consistent estimator of the population parameter, if it approaches the parameter as the sample size increases.
3. Efficient Estimator
The estimator has the smallest variance of all estimators which could be used.
Efficiency is measured in terms of size of the standard error of the statistic.
An efficient estimator is one that has a smaller standard error than other estimators of the population parameter.
For the same population, out of two unbiased point estimators an unbiased estimator with smaller standard deviation is said to be efficient.
4. Sufficient Estimator
An estimator is said to be sufficient if it uses all the information about the population parameter contained in the sample.
7.6.4 Confidence Interval
A confidence interval is an interval estimate with a specific level of confidence.
It is the range within which a population parameter is expected to lie with a specified level of confidence. Its half-width is the margin of error.
The level of confidence expresses how confident we are that the true population parameter lies within the confidence interval.
In interval estimation, the population parameter is estimated to lie between two limits, i.e. within the interval.
7.6.5 Confidence Level
This is the probability associated with an interval estimate of a population parameter. It indicates how confident we are that the population parameter will fall in the interval.
The significance level, which is the complement of the confidence level, reflects the amount of evidence you require to be sure that your conclusions are correct. Social science typically uses a significance level of .05, so that you are 95 percent certain of your results.
7.6.6 Confidence Limit
Confidence limits are the upper and lower boundaries of a confidence interval.
The probability that we can associate with an interval is called the confidence level. With confidence limits C1 and C2 for a parameter θ,
$$P(C_1 \le \theta \le C_2) = 1 - \alpha$$
7.6.7 Degrees Of Freedom:
Degrees of freedom indicate how many of the sample values are free to vary.
The degrees of freedom is the number of unrestricted (independent) values, out of the given sample size, that are free to vary.
7.6.8 Table for Confidence Interval

Confidence level (1 − α)   α      Zα
90%                        10%    1.64
95%                        5%     1.96
98%                        2%     2.33
99%                        1%     2.58
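Example
Let t1 and t2 be two quantities based on the sample observations drawn from the population, such that the unknown parameter θ is included in the interval (t1, t2) in a specified percentage of cases. Then the interval (t1, t2) is called a confidence interval for the parameter θ.
7.7 INTERVAL ESTIMATION FOR MEAN OF LARGE SAMPLES
TYPE − I (Population standard deviation σ given)
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{\sigma}{\sqrt{n}}$$
If the population size N and the sample size n are given,
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
TYPE − II (Sample standard deviation S given)
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{S}{\sqrt{n-1}}$$
If the population size N and the sample size n are given,
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{S}{\sqrt{n-1}} \sqrt{\frac{N-n}{N-1}}$$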
Example - 7
The mean height obtained from a random sample of size 100 is 164 cm. If the S.D of the height distribution of the population is 3 cm, find the 95% confidence limits for the mean height of the population.
Solution:
Given sample size (n) = 100, sample mean ($\bar{X}$) = 164 cm, S.D of the population (σ) = 3 cm.
From the table, Zα for 95% = 1.96.
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{\sigma}{\sqrt{n}} = 164 \pm 1.96 \times \frac{3}{\sqrt{100}} = 164 \pm (0.3 \times 1.96) = 164 \pm 0.588 = (163.412,\ 164.588)$$
Example – 8
A civil engineer is analyzing the strength of concrete, which is approximately normally distributed with variance σ² = 1000 psi². A random sample of 12 specimens has a mean strength of $\bar{X}$ = 3250 psi. Construct a 95% confidence interval.
Solution:
Given sample size (n) = 12, sample mean ($\bar{X}$) = 3250, S.D of the sample = $\sqrt{1000}$, i.e. sample S.D (S) = 31.623.
From the table, Zα for 95% = 1.96.
$$\text{Standard error, } SE = \frac{S}{\sqrt{n-1}} = \frac{31.623}{\sqrt{11}} = \frac{31.623}{3.317} = 9.53$$
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{S}{\sqrt{n-1}} = 3250 \pm 1.96 \times \frac{31.623}{\sqrt{12-1}} = 3250 \pm 18.69 = (3231.31,\ 3268.69)$$
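Example - 9
In a random selection of 64 of the 600 road crossings in a town, the mean number of automobile accidents per year was found to be 4.2 and the sample S.D was 0.8. Construct a 95% confidence interval for the mean number of accidents.
Solution:
Given sample size (n) = 64, sample mean ($\bar{X}$) = 4.2, S.D of the sample (S) = 0.8, population size N = 600.
The coefficient of Zα is the standard error.
$$\text{Confidence interval} = \bar{X} \pm Z_\alpha \frac{S}{\sqrt{n-1}} \sqrt{\frac{N-n}{N-1}} = 4.2 \pm 1.96 \times \frac{0.8}{\sqrt{63}} \sqrt{\frac{600-64}{600-1}}$$
$$= 4.2 \pm 1.96 \times \frac{0.8}{7.94} \sqrt{\frac{536}{599}} = 4.2 \pm 1.96 \times \frac{0.8}{7.94} \times 0.946 = 4.2 \pm 0.187 = (4.013,\ 4.387)$$
$$\text{Standard error, } SE = \frac{S}{\sqrt{n-1}} \sqrt{\frac{N-n}{N-1}} = 0.1008 \times 0.946 = 0.0953$$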
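7.7.1 Interval Estimation for Proportion of Large Samples
$$\text{Population proportion, } P = \frac{\text{Number of elements possessing the characteristic}}{\text{Total number of elements in the population}}$$
$$\text{Sample proportion, } p = \frac{\text{Number of elements possessing the characteristic}}{\text{Total number of elements in the sample}}$$
TYPE III
Confidence interval for the proportion, with q = 1 − p:
$$p \pm Z_\alpha \times S.E = p \pm Z_\alpha \sqrt{\frac{pq}{n}}$$
If the population size N is given,
$$p \pm Z_\alpha \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}}$$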
Example - 10
A random sample of 800 units from a large consignment showed that 200 were damaged. Find the 95% confidence limits for the population proportion of damaged units in the consignment.
Solution:
Given sample size (n) = 800, $p = \frac{200}{800} = 0.25$, q = 1 − p = 0.75.
From the table, Zα for 95% = 1.96.
$$CI = p \pm Z_\alpha \sqrt{\frac{pq}{n}} = 0.25 \pm 1.96 \sqrt{\frac{0.25 \times 0.75}{800}} = 0.25 \pm 1.96 \times \frac{0.433}{\sqrt{800}} = 0.25 \pm 0.03 = (0.22,\ 0.28)$$
7.7.2 Interval Estimation for Difference of Two Means
If samples of large sizes n1 and n2 are drawn from two different populations, then the sampling distribution of the difference between the two means $\bar{X}_1$ and $\bar{X}_2$ is approximately normal with mean $\mu_1 - \mu_2$ and standard deviation
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$\text{Confidence interval} = (\bar{X}_1 - \bar{X}_2) \pm Z_\alpha \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Example – 11
The strength of the wire produced by company A has a mean of 4500 kg and a standard deviation of 200 kg. The wire of company B has a mean of 4000 kg and a standard deviation of 300 kg. A sample of 50 wires of company A and 100 wires of company B are selected at random for testing the strength. Find the 99 percent confidence limits for the difference in the average strength of the populations of wires produced by the two companies.
Solution:
Given, Company A: $\bar{X}_1$ = 4500, σ1 = 200, n1 = 50
Company B: $\bar{X}_2$ = 4000, σ2 = 300, n2 = 100
$\bar{X}_1 - \bar{X}_2$ = 500, and Zα = 2.58
$$\text{Confidence interval} = (\bar{X}_1 - \bar{X}_2) \pm Z_\alpha \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (4500 - 4000) \pm 2.58 \sqrt{\frac{40{,}000}{50} + \frac{90{,}000}{100}}$$
$$= 500 \pm (2.58 \times 41.23) = 500 \pm 106.37 = (393.63,\ 606.37)$$
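The interval formulas for a proportion and for a difference of two means can be evaluated directly, which is a convenient way to check the arithmetic of worked examples such as Examples 10 and 11. The Python sketch below uses illustrative function names that are not taken from the text.

```python
from math import sqrt


def ci_proportion(p, n, z):
    """Large-sample interval p ± z * sqrt(p * q / n), with q = 1 - p."""
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def ci_diff_means(x1, sigma1, n1, x2, sigma2, n2, z):
    """Interval (x1 - x2) ± z * sqrt(sigma1^2 / n1 + sigma2^2 / n2)."""
    margin = z * sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin


# Inputs of Example 10: 200 damaged units out of 800, 95% confidence
print(ci_proportion(200 / 800, 800, 1.96))

# Inputs of Example 11: companies A and B, 99% confidence
print(ci_diff_means(4500, 200, 50, 4000, 300, 100, 2.58))
```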
7.8 DETERMINING SAMPLE SIZE
Given a confidence level c and standard deviation σ, drawing a larger sample will decrease the margin of error. If a desired margin of error and level of confidence are known, and if an estimate of the standard deviation can be made, then the necessary sample size can be determined by solving the error formula for n. We know:
$$E = z_c \frac{\sigma}{\sqrt{n}}$$
Multiplying by $\sqrt{n}$ yields:
$$E \sqrt{n} = z_c \sigma$$
Dividing by E we obtain:
$$\sqrt{n} = \frac{z_c \sigma}{E}$$
Finally, squaring both sides yields:
$$n = \left(\frac{z_c \sigma}{E}\right)^2$$
Example – 12
Suppose we want to estimate the mean weight of Indian men, and we want to be 95% confident that our estimate is within 2 lbs of the actual mean. If the standard deviation of men’s weights is estimated to be 18.4 pounds, we can use the formula above to determine the appropriate sample size.
Solution:
Given c = .95, so $z_c = 1.96$; the maximum desired error is E = 2, and we estimated that σ = 18.4, so
$$n = \left(\frac{z_c \sigma}{E}\right)^2 = \left(\frac{1.96 \times 18.4}{2}\right)^2 = 325.15$$
Thus a sample of 326 men should give the desired level of accuracy.
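The sample-size formula always rounds up, because a fractional observation cannot be drawn and rounding down would exceed the permitted error. A minimal Python sketch (the function name is an illustrative assumption):

```python
from math import ceil


def sample_size_for_mean(z, sigma, E):
    """Smallest n for which z * sigma / sqrt(n) does not exceed E."""
    return ceil((z * sigma / E) ** 2)


# Example 12: 95% confidence, sigma = 18.4 lbs, error within 2 lbs
print(sample_size_for_mean(1.96, 18.4, 2))  # 326
```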
Determination of Sample Size ‘n’:
Applying the central limit theorem to the mean,
$$P\left(|\bar{X} - \mu| \le Z_\alpha \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha = \text{Confidence level}$$
so that
$$P\left(|\bar{X} - \mu| \le 1.64 \frac{\sigma}{\sqrt{n}}\right) = 0.90$$
$$P\left(|\bar{X} - \mu| \le 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$$
$$P\left(|\bar{X} - \mu| \le 2.33 \frac{\sigma}{\sqrt{n}}\right) = 0.98$$
$$P\left(|\bar{X} - \mu| \le 2.58 \frac{\sigma}{\sqrt{n}}\right) = 0.99$$
Similarly, for a proportion we can write
$$P\left(|P - p| \le 1.64 \sqrt{\frac{pq}{n}}\right) = 0.90$$
$$P\left(|P - p| \le 1.96 \sqrt{\frac{pq}{n}}\right) = 0.95$$
$$P\left(|P - p| \le 2.33 \sqrt{\frac{pq}{n}}\right) = 0.98$$
$$P\left(|P - p| \le 2.58 \sqrt{\frac{pq}{n}}\right) = 0.99$$
Determination of Sample Size ‘n’ From Mean
$$n = \frac{Z^2 \sigma^2}{E^2}$$
Here,
Z = Zα (value for the desired confidence level)
σ = Population S.D
E = $\bar{X} - \mu$ (sampling error)
Determination of Sample Size ‘n’ From Proportion
$$n = \frac{Z^2 pq}{E^2}$$
Here,
Z = Zα (value for the desired confidence level)
E = P − p (sampling error)
p = Estimated true proportion
q = 1 − p
Example - 13
A manufacturing concern wants to estimate the average amount of purchase of its product in a month by its customers. The standard deviation of the purchase amount is Rs. 10. Find the sample size ‘n’ if the maximum error is not to exceed Rs. 3 with a probability of 0.99.
Solution:
Given P = 0.99, i.e.
$$P(|\bar{X} - \mu| \le 3) = 0.99 \qquad (1)$$
By the central limit theorem,
$$P\left(|\bar{X} - \mu| \le 2.58 \frac{\sigma}{\sqrt{n}}\right) = 0.99 \qquad (2)$$
Comparing (1) and (2),
$$2.58 \frac{\sigma}{\sqrt{n}} = 3 \;\Rightarrow\; \sqrt{n} = \frac{2.58 \times 10}{3} = 8.6 \;\Rightarrow\; n = 8.6^2 = 73.96$$
so n ≈ 74.
Example - 14
Suppose a university is performing a survey of the annual earnings of last year's graduates from its business school. From past experience, the S.D of the annual earnings of the entire population (1000) of these graduates is $1500. Find the sample size the university should take in order to estimate the mean annual earnings of last year's class within $500 at a 95% confidence level.
Solution:
Given
$$P(|\bar{X} - \mu| \le 500) = 0.95 \qquad (1)$$
By the central limit theorem,
$$P\left(|\bar{X} - \mu| \le 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95 \qquad (2)$$
Comparing (1) and (2),
$$1.96 \frac{\sigma}{\sqrt{n}} = 500 \;\Rightarrow\; \sqrt{n} = \frac{1.96 \times 1500}{500} = \frac{2940}{500} = 5.88 \;\Rightarrow\; n = 5.88^2 = 34.57$$
so n ≈ 35.
7.9 STUDENT’S t − DISTRIBUTION
When the sample is small (n < 30), the distribution of the standardized statistic is far from normal, and as a result the normal test cannot be applied. Student's t-distribution is used instead.
It is used when the sample size is less than 30 and the population SD is unknown.
It is a probability distribution.
It is similar to structure of normal distribution except that it depends on degrees of freedom.
7.9.1 Definition
If X1, X2, ..., Xn is a random sample of size ‘n’ drawn from a normal population with mean µ and variance σ², then Student’s ‘t’ statistic is
$$t = \frac{\bar{X} - \mu}{\text{S.E of mean}} = \frac{\bar{X} - \mu}{\sqrt{\dfrac{S^2}{n-1}}}$$
with (n − 1) degrees of freedom.
7.9.2 Properties of t – Distribution:
‘t’ distribution is asymptotic to X – axis.
t varies with the degrees of freedom.
‘t’ distribution is symmetrical distribution with mean zero.
It depends only on degrees of freedom.
‘t’ distribution has greater spread than normal distribution.
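The t statistic of 7.9.1 can be computed from raw data as follows. This is a sketch assuming that S² is the sample variance with divisor n, which is what `statistics.pvariance` returns; the data values and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def t_statistic(sample, mu):
    """Return (t, degrees of freedom) for testing the mean mu.

    Uses t = (xbar - mu) / sqrt(S^2 / (n - 1)), where S^2 is the
    sample variance computed with divisor n.
    """
    n = len(sample)
    xbar = mean(sample)
    s2 = pvariance(sample)  # divisor n
    return (xbar - mu) / sqrt(s2 / (n - 1)), n - 1


# Hypothetical small sample, hypothesised mean 10
t, df = t_statistic([10, 12, 14], 10)
print(round(t, 4), df)  # 1.7321 2
```

Dividing S² (divisor n) by n − 1 gives the same standard error as dividing the usual sample variance (divisor n − 1) by n, so the two common ways of writing the statistic agree.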
7.10 QUESTIONS
PART – A
1. Which of the following is a necessary condition for using a t-distribution?
a. Small sample size
b. Unknown population standard deviation
c. Both (a) and (b)
d. Infinite population
2. If $\bar{x} = 85$, S.D = 8 and n = 64, then the standard error of the sample mean is equal to:
a. 1
b. 1.96
c. 2.576
d. None of the above
3. Sampling distribution is usually the distribution of:
a. Parameter
b. Statistic
c. Mean
d. Variance
4. An unbiased estimator is necessarily
a. Consistent
b. Not consistent
c. Efficient
d. None of the above
5. The standard deviation of the sampling distribution of a statistic is referred to as
a. Sampling error
b. Standard error
c. Mean error
d. None of the above
Answers: 1. (c), 2. (d), 3. (b), 4. (c), 5. (b)
PART - B
1. Differentiate between
a. Estimate and Estimator
b. Point estimate and interval estimate
2. Define degrees of freedom.
3. What are the advantages of using an interval estimate over a point estimate?
4. What are the criteria of a good estimator?
5. What is the confidence coefficient? How is it useful in determining the confidence limits in interval estimation?
PART - C
1. If a random sample of size 5 is drawn from a finite population of 41 units without replacement, find the standard error of the sample mean if the population S.D is 10. (Ans: 4.24)
2. A random sample of 500 oranges was taken from a large consignment and 65 were found to be defective. Show that the S.E of the proportion of bad ones in a sample of this size is 0.015.
3. A simple random sample of size 16 is drawn without replacement from a finite population consisting of 50 units. If the number of defective units in the population is 8, find the S.E of the sample proportion of defectives. (Ans: 0.0763)
4. The mean of a sample of size 16 from a normal population is 20. If it is known that the variance of the population is 4, find the standard error of the sample mean and the 95% confidence interval for the population mean. (Ans: 0.5, 19.02 to 20.98)
5. In a random sample of 100 articles taken from a large batch of articles, 10 are found to be defective. Obtain a 95% confidence interval for the true proportion of defectives in the batch. (Ans: 0.0412 to 0.1588)
6. A random sample of 160 people is taken and 120 were in favour of liberalizing licensing regulations. With 95% confidence, what proportion of all people are in favour? (Ans: 0.683 to 0.817)
7. A random sample of 100 articles taken from a batch of 2000 articles shows that the average diameter of the articles is 0.354 with S.D 0.048. Find the 95% confidence interval. (Ans: [0.3448, 0.3632])
8. Find the sample size such that the probability of the sample mean differing from the population mean by not more than 1/8th of the standard deviation is 0.9545. (Ans: 256)
9. A vendor claims that no more than 8% of parts shipped to a manufacturer fail to meet specifications. The manufacturer selects at random 200 parts from a large batch just received from the vendor and finds 19 defective parts. Determine the extent to which the current sample contradicts the vendor's claim. (Ans: 3.31 to 6.19)
7.11 SUGGESTED READINGS
1. Sheldon M. Ross, “Introductory Statistics”, Academic Press, London, 2005.
2. Dr. Parimal Mukhopadhyay, “Applied Statistics”, Arunabha Son Books & Allied Pvt Ltd, Kolkata, 2005.
3. Richard I. Levin & David S. Rubin, “Statistics for Management”, Pearson Education, Singapore, 2004.
CHAPTER – VIII
TESTING OF HYPOTHESIS
STRUCTURE
8.1 INTRODUCTION
8.2 TESTING OF HYPOTHESIS
8.3 TEST OF SIGNIFICANCE OF A SINGLE MEAN
8.4 TEST OF SIGNIFICANCE OF DIFFERENCE BETWEEN TWO MEANS
8.5 TEST OF SIGNIFICANCE FOR THE DIFFERENCE OF THE TWO STANDARD DEVIATIONS
8.6 TEST OF SIGNIFICANCE OF SINGLE PROPORTIONS
8.7 TEST OF SIGNIFICANCE OF DIFFERENCE BETWEEN TWO SAMPLES PROPORTIONS
8.8 TEST OF SIGNIFICANCE OF SINGLE MEAN (SMALL SAMPLES N < 30)
8.9 PAIRED T - TEST FOR DIFFERENCE OF MEANS
8.10 QUESTIONS
8.11 SUGGESTED READINGS
TESTING OF HYPOTHESIS
8.1 INTRODUCTION
Many problems in engineering require that we decide whether to accept or reject a statement about some parameter. The statement is called a hypothesis, and the decision-making procedure about the hypothesis is called hypothesis testing. This is one of the most useful aspects of statistical inference, since many types of decision-making problems, tests, or experiments in the engineering world can be formulated as hypothesis-testing problems.
Symbols for Population and Samples

POPULATION (PARAMETER)               SAMPLE (STATISTIC)
Population size = N                  Sample size = n
Population mean = µ                  Sample mean = $\bar{X}$
Population standard deviation = σ    Sample standard deviation = s
Population proportion = p            Sample proportion = $\hat{p}$
8.2 TESTING A HYPOTHESIS
On the basis of sample information, we make certain decisions about the population. In taking such decisions we make certain assumptions. These assumptions are known as statistical hypotheses. These hypotheses are tested.
Null Hypothesis (H0):
The null hypothesis is the basis for analyzing the problem. It is the hypothesis of no difference. Thus, we presume that there is no significant difference between the observed value and the expected value.
We then test whether this hypothesis is supported by the data or not.
If the hypothesis is rejected, the difference is considered to be significant.
If the hypothesis is accepted, the difference is described as being due to sampling fluctuation.
Alternative Hypothesis (H1):
Any hypothesis which is complementary to the null hypothesis (H0) is called an alternative hypothesis, denoted by H1.
Types of Tests
There are three basic types of hypothesis tests:

Left-tailed Test – used when the null hypothesis being tested is a claim that the population parameter is at least ($\geq$) a given value. Note that the alternative hypothesis then claims that the parameter is less than (<) the value.
Example:
$H_0: \mu \geq 35{,}000$
$H_a: \mu < 35{,}000$
We would reject $H_0$ in the case above if our sample mean was significantly less than 35,000. That is, if our sample mean was in the left tail of the distribution of all sample means.

Right-tailed Test – used when the null hypothesis being tested is a claim that the population parameter is at most ($\leq$) a given value. Note that the alternative hypothesis then claims that the parameter is greater than (>) the value.
Example:
$H_0: \mu \leq 35{,}000$
$H_a: \mu > 35{,}000$
We would reject $H_0$ in this case if our sample mean was significantly more than 35,000. That is, if our sample mean was in the right tail of the distribution of all sample means. (See Diagrams on Page 326 in the text)

Two-tailed Test – used when the null hypothesis being tested is a claim that the population parameter is equal to (=) a given value. Note that the alternative hypothesis then claims that the parameter is not equal to ($\neq$) the value.
Example: The Census Bureau claims that the percentage of Erode Area residents with a bachelor’s degree or higher is 24.4%. We would write the null and alternative hypotheses for this claim as:
$H_0: p = .244$
$H_a: p \neq .244$
In this case, we would have to reject $H_0$ if our sample percentage was either significantly more than 24.4%, or significantly less than 24.4%. That is, if our sample proportion was in either tail of the distribution of all sample proportions.
8.2.1 Types of Error
Whenever sample data is used to make an estimate of a population parameter, there is always a probability of error due to drawing an unusual sample. There are two main types of error that occur in hypothesis tests.
Type I Error – A sample is chosen whose sample data leads to the rejection of the null hypothesis when, in fact, $H_0$ is true.
Type II Error – A sample is chosen whose sample data leads to not rejecting the null hypothesis when, in fact, $H_0$ is false.

| | $H_0$ True | $H_0$ False |
|---|---|---|
| $H_0$ Rejected | Type I Error | Correct Decision |
| $H_0$ Not Rejected | Correct Decision | Type II Error |
8.2.2 Level of Significance
In hypothesis tests, a conservative approach is usually taken toward the rejection of the null hypothesis. That is, we want the probability of making a Type I Error to be small.
The maximum acceptable probability is usually chosen at the beginning of the hypothesis test, and is called the level of significance for the test. The level of significance is denoted by $\alpha$, and the most commonly used values are .10, .05, and .01.
The probability of making a Type II Error in a hypothesis test is denoted by $\beta$. Once $\alpha$ is determined, the value of $\beta$ is also fixed (for a given sample size and true parameter value), but the calculation of this value is beyond the scope of this course.

8.2.3 Critical Region
The region of the standard normal curve corresponding to a pre-determined level of significance $\alpha$. If the test statistic falls in this region, the null hypothesis is rejected.

8.2.4 Procedure for Testing Hypothesis

STEP − 1 Setting up of Hypothesis
Null Hypothesis (H0): Set up for testing of hypothesis
H0: $\bar{X} = \mu$ or $\mu = \mu_0$
Alternate Hypothesis (H1): Negation of the null hypothesis
H1: $\mu \neq \mu_0$ [For two tailed test]
H1: $\mu > \mu_0$ [Right tailed test]
H1: $\mu < \mu_0$ [Left tailed test]

STEP − 2 Computation of test statistic
The test statistic is the sample statistic used to decide whether to reject or fail to reject the null hypothesis.
Z − Test: [n > 30 (large sample)]
$$Z = \frac{t - E(t)}{S.E.(t)}$$
where t is the sample statistic.
t − Test: [n < 30 (small sample)]
$$t = \frac{\bar{X} - \mu}{S.E.(\bar{X})}$$

STEP − 3 Level of Significance
The level of significance is the probability of rejecting the null hypothesis when it is true. $\alpha = 0.05$ and $\alpha = 0.01$ are common. If no level of significance is given, use $\alpha = 0.05$. The level of significance is the complement of the level of confidence in estimation. The critical value of Z can be obtained using the table of areas under the normal curve.

STEP − 4 Decisions:
The decision is a statement based upon the null hypothesis. It is either "reject the null hypothesis" or "fail to reject the null hypothesis"; strictly speaking, we never accept the null hypothesis. The conclusion is a statement which indicates the level of evidence (sufficient or insufficient), at what level of significance, and whether the original claim is rejected (null) or supported (alternative).
If $|Z| < Z_\alpha$ then we fail to reject H0 (the worked examples in this chapter describe this as accepting H0).
If $|Z| > Z_\alpha$ then we reject H0.
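The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function name `z_test_single_mean` and its parameters are assumptions, and the critical value is taken from the standard library's `statistics.NormalDist` rather than from a printed table. The figures used are those of Example 1 of this chapter.

```python
from math import sqrt
from statistics import NormalDist


def z_test_single_mean(sample_mean, mu0, sd, n, alpha=0.05, tail="two"):
    """Hypothetical helper: Z test for a single mean (large sample)."""
    # STEP 2: test statistic Z = (sample mean - mu0) / (sd / sqrt(n))
    z = (sample_mean - mu0) / (sd / sqrt(n))

    # STEP 3: critical value from the standard normal distribution
    std_normal = NormalDist()
    if tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)  # a negative number
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")

    # STEP 4: decision (True means "reject H0")
    return z, critical, reject


# Example 1: n = 400, sample mean 99, hypothesised mean 100, S.D. 8
z, critical, reject = z_test_single_mean(99, 100, 8, 400)
print(f"Z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

Passing `tail="left"` or `tail="right"` switches the rejection region to a single tail, matching the one-tailed alternatives listed in STEP − 1.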
8.2.5 Test of Significance
An important aspect of sampling theory is the study of tests of significance, which enable us to decide, on the basis of the sample results, whether
i. the deviation between the observed sample statistic and the hypothetical parameter value, or
ii. the deviation between two sample statistics
is significant, or might be attributed to chance or the fluctuations of sampling.
If n is large, distributions such as the Binomial, Poisson, Chi-square, t and F distributions can be approximated by a normal curve.
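8.3 TEST OF SIGNIFICANCE OF A SINGLE MEAN
Case − 1 (σ given)
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Case − 2 (σ not given)
$$Z = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

8.4 TEST OF SIGNIFICANCE OF DIFFERENCE BETWEEN TWO MEANS
Case − 1 (σ1 & σ2 given)
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Case − 2 (σ1 & σ2 not given)
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

8.5 TEST OF SIGNIFICANCE FOR THE DIFFERENCE OF THE TWO STANDARD DEVIATIONS
Case − 1 (σ given)
$$Z = \frac{S_1 - S_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}}$$
Case − 2 (σ not given)
$$Z = \frac{S_1 - S_2}{\sqrt{\dfrac{S_1^2}{2n_1} + \dfrac{S_2^2}{2n_2}}}$$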
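As a rough illustration of the Case − 2 formulas in sections 8.4 and 8.5, the sketch below computes both statistics from sample figures. The function names `z_two_means` and `z_two_sds` are assumptions made for this sketch; the input figures are those of Examples 5 and 8 later in this chapter.

```python
from math import sqrt


def z_two_means(mean1, mean2, sd1, sd2, n1, n2):
    """Z for the difference between two sample means (sample S.D.s used)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def z_two_sds(sd1, sd2, n1, n2):
    """Z for the difference between two sample standard deviations."""
    standard_error = sqrt(sd1 ** 2 / (2 * n1) + sd2 ** 2 / (2 * n2))
    return (sd1 - sd2) / standard_error


# Example 5: average sales in two states, 400 observations each
z_sales = z_two_means(2500, 2200, 400, 550, 400, 400)

# Example 8: S.D. of heights in two countries
z_heights = z_two_sds(2.58, 2.50, 1000, 1200)

print(f"Z (means) = {z_sales:.2f}")
print(f"Z (standard deviations) = {z_heights:.2f}")
```

Each value of |Z| is then compared with the table value for the chosen level of significance, exactly as in STEP − 4 of the procedure.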
8.6 TEST OF SIGNIFICANCE OF SINGLE PROPORTIONS
Here p is the sample proportion, P is the population proportion under H0 and Q = 1 − P.
For S.E of proportion
$$Z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}}$$
For S.E of percentage
$$Z = \frac{p - P}{\sqrt{\dfrac{P(100 - P)}{n}}}$$

8.7 TEST OF SIGNIFICANCE OF DIFFERENCE BETWEEN TWO SAMPLE PROPORTIONS
Case − 1 (Population proportion P given)
$$Z = \frac{p_1 - p_2}{\sqrt{\dfrac{PQ}{n_1} + \dfrac{PQ}{n_2}}}$$
Case − 2 (Sample proportion given)
$$Z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}}$$

8.8 TEST OF SIGNIFICANCE OF SINGLE MEAN (SMALL SAMPLES n < 30)
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n - 1}}$$
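The statistics of sections 8.6 to 8.8 can be computed along the following lines. This is a sketch under stated assumptions: the function names are invented for illustration, and all sample figures are hypothetical (only the claimed proportion 0.244 comes from the Census example in section 8.2).

```python
from math import sqrt


def z_single_proportion(p_sample, p_null, n):
    """Z = (p - P) / sqrt(PQ / n), with Q = 1 - P."""
    q_null = 1 - p_null
    return (p_sample - p_null) / sqrt(p_null * q_null / n)


def z_two_proportions(p1, p2, n1, n2):
    """Z for two sample proportions (Case 2: only sample proportions known)."""
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / standard_error


def t_single_mean(sample_mean, mu0, s, n):
    """Small-sample t = (mean - mu0) / (S / sqrt(n - 1)), with n - 1 d.f."""
    return (sample_mean - mu0) / (s / sqrt(n - 1))


# Hypothetical sample: 140 of 500 residents hold a degree; H0: P = 0.244
z_one = z_single_proportion(140 / 500, 0.244, 500)

# Hypothetical samples: proportions 0.5 and 0.4, 100 observations each
z_two = z_two_proportions(0.5, 0.4, 100, 100)

# Hypothetical small sample: n = 17, mean 52, S = 4; H0: mu = 50
t_value = t_single_mean(52, 50, 4, 17)

print(f"Z (single proportion) = {z_one:.2f}")
print(f"Z (two proportions) = {z_two:.2f}")
print(f"t (single mean) = {t_value:.2f}")
```

The t value must be compared with the t-table entry for n − 1 degrees of freedom; the Python standard library has no t distribution, so that lookup is left to the printed table.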
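8.9 PAIRED t - TEST FOR DIFFERENCE OF MEANS
$$t = \frac{\bar{D}}{S / \sqrt{n}}$$
Where
$$S = \sqrt{\frac{\sum D^2}{n - 1} - \frac{\left(\sum D\right)^2}{n(n - 1)}}$$
D = Difference X − Y, and
$$\bar{D} = \frac{\sum D}{n}$$

Test of Significance of Difference between Two Means
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Where
$$S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}$$
The statistic has $n_1 + n_2 - 2$ degrees of freedom.

Data are in Series Form
$$t = \frac{\bar{X} - \bar{Y}}{S}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
Where
$$S = \sqrt{\frac{\sum d_1^2 + \sum d_2^2}{n_1 + n_2 - 2}}, \qquad d_1 = X - \bar{X}, \quad d_2 = Y - \bar{Y}$$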
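A minimal sketch of the paired t statistic of section 8.9 follows. The helper name `paired_t` and the two short data series are hypothetical and serve only to show the arithmetic; S is computed with the same formula as given above.

```python
from math import sqrt


def paired_t(x, y):
    """Return (t, degrees of freedom) for paired observations x and y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    n = len(x)
    d = [xi - yi for xi, yi in zip(x, y)]  # D = X - Y
    d_bar = sum(d) / n
    # S = sqrt( sum(D^2)/(n-1) - (sum D)^2 / (n(n-1)) )
    s = sqrt(sum(di ** 2 for di in d) / (n - 1) - sum(d) ** 2 / (n * (n - 1)))
    t = d_bar / (s / sqrt(n))
    return t, n - 1


# Hypothetical scores of five subjects before and after a treatment
before = [10, 12, 9, 11, 13]
after = [8, 11, 9, 10, 10]
t, dof = paired_t(before, after)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The computed t is compared with the tabulated t value for n − 1 degrees of freedom at the chosen level of significance.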
Example – 1
A sample of size 400 was drawn and the sample mean was found to be 99. Test whether this sample could have come from a normal population with mean 100 and standard deviation 8 at 5% level of significance.
Solution:
Given, $n = 400$, $\bar{X} = 99$, $\mu = 100$, $\sigma = 8$, $\alpha = 5\% = 0.05$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\mu = 100$
Alternate Hypothesis: H1: $\mu \neq 100$
It is a two tailed test.
STEP − 2 Test statistic
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{99 - 100}{8/\sqrt{400}} = \frac{-1}{8/20} = \frac{-1}{0.4} = -2.5$$
STEP − 3 Critical Value
Level of significance = 5% = 0.05
$Z_\alpha = 1.96$
STEP − 4 Decision
Calculated value is greater than critical value, (i.e.) $|Z| > Z_\alpha$ (i.e.) 2.5 > 1.96
Null hypothesis is rejected (i.e.) The sample has not been drawn from a normal population with mean 100 and S.D 8.

Example – 2
A sample of 900 members has a mean 3.4cms and S.D 2.61cms. Can the sample be regarded as one drawn from a population with mean 3.25cms? Use the level of significance 0.05.
Solution:
Given $n = 900$, $\bar{X} = 3.4$, $\mu = 3.25$, $S = 2.61$, $\alpha = 5\% = 0.05$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\mu = 3.25$
Alternate Hypothesis: H1: $\mu \neq 3.25$
It is a two tailed test.
STEP − 2 Test statistic:
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{3.4 - 3.25}{2.61/\sqrt{900}} = \frac{0.15}{2.61/30} = \frac{0.15}{0.087} = 1.72$$
STEP − 3 Critical Value:
Level of significance = 5% = 0.05
$Z_\alpha = 1.96$
STEP − 4 Decision
Calculated value is less than critical value, (i.e.) $|Z| < Z_\alpha$ (i.e.) 1.72 < 1.96
Null hypothesis is accepted (i.e.) The sample has been drawn from a population with mean 3.25cms.

Example – 3
A manufacturer of bulbs claims that a certain bulb he manufactures has a mean life of 400 days with a standard deviation of 20 days. A purchasing agent selects a sample of 100 bulbs and puts them to test. The mean life for the sample was 390 days. Should the purchasing agent reject the manufacturer's claim at 5% level? The table value of z at 5% level is 1.96 for two tail test.
Solution:
Given $n = 100$, $\bar{X} = 390$, $\mu = 400$, $\sigma = 20$, $\alpha = 5\% = 0.05$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\mu = 400$
Alternate Hypothesis: H1: $\mu \neq 400$
It is a two tailed test.
STEP − 2 Test statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{390 - 400}{20/\sqrt{100}} = \frac{-10}{2} = -5$$
$|Z| = 5$
STEP − 3 Critical Value
Level of significance = 5% = 0.05
From table, $Z_\alpha = 1.96$
STEP − 4 Decision
$|Z| = 5 > 1.96$ (i.e.) $|Z| > Z_\alpha$
Null hypothesis is rejected (i.e.) The manufacturer's claim is rejected.

Example – 4
An insurance agent has claimed that the average age of policy holders who insure through him is less than the average for all agents, which is 30.5 years. A random sample of 100 policy holders who had insured through him reveals that the mean and S.D are 28.8 years and 6.35 years respectively. Test his claim at 5% level of significance.
Solution:
Given, $n = 100$, $\bar{X} = 28.8$, $\mu = 30.5$, $S = 6.35$, $\alpha = 5\% = 0.05$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\mu = 30.5$
Alternate Hypothesis: H1: $\mu < 30.5$
It is a left tailed test.
STEP − 2 Test statistic:
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{28.8 - 30.5}{6.35/\sqrt{100}} = \frac{-1.7 \times 10}{6.35} = -2.68$$
$|Z| = 2.68$
STEP − 3 Critical Value
Level of significance = 5% = 0.05
From table, $Z_\alpha = 1.645$
STEP − 4 Decisions:
$|Z| = 2.68 > 1.645$ (i.e.) $|Z| > Z_\alpha$
Null hypothesis is rejected (i.e.) The agent's claim that the average age of his policy holders is less than 30.5 years is supported.

Example – 5
The sales manager of a large company conducted a sample survey in states A and B taking 400 samples in each case. The results were

| | STATE A | STATE B |
|---|---|---|
| AVERAGE SALES | Rs. 2500 | Rs. 2200 |
| S.D | Rs. 400 | Rs. 550 |

Test whether the average sales is the same in the 2 states at 1% level.
Solution:
Given $n_1 = 400$, $n_2 = 400$, $\bar{X}_1 = 2500$, $\bar{X}_2 = 2200$, $S_1 = 400$, $S_2 = 550$, $\alpha = 1\% = 0.01$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\mu_1 = \mu_2$
Alternate Hypothesis: H1: $\mu_1 \neq \mu_2$
STEP − 2 Test statistic
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{2500 - 2200}{\sqrt{\dfrac{400^2}{400} + \dfrac{550^2}{400}}} = \frac{300}{\sqrt{400 + 756.25}} = \frac{300}{34.003} = 8.82$$
$|Z| = 8.82$
STEP − 3 Critical Value
Level of significance = 1% = 0.01
From table, $Z_\alpha = 2.58$
STEP − 4 Decisions:
$|Z| = 8.82 > 2.58$ (i.e.) $|Z| > Z_\alpha$
Null hypothesis is rejected
(i.e.) The average sales within two states differ significantly.
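The arithmetic of the single-mean examples above can be checked with a short script. This is only a convenience sketch; the dictionary layout and variable names are assumptions, while the figures are those stated in Examples 1 to 4.

```python
from math import sqrt

# (sample mean, hypothesised mean, standard deviation, sample size)
examples = {
    "Example 1": (99, 100, 8, 400),
    "Example 2": (3.4, 3.25, 2.61, 900),
    "Example 3": (390, 400, 20, 100),
    "Example 4": (28.8, 30.5, 6.35, 100),
}

# Z = (sample mean - hypothesised mean) / (S.D. / sqrt(n)) for each example
z_values = {
    name: (x_bar - mu) / (sd / sqrt(n))
    for name, (x_bar, mu, sd, n) in examples.items()
}

for name, z in z_values.items():
    print(f"{name}: Z = {z:.2f}")
```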
Example – 6
In a company there are two independent processes manufacturing the same item. The average weight in a sample of 250 items produced from one process is found to be 120 kg, with a standard deviation of 12 kg, while the corresponding figures in a sample of 400 items from the other process are 124 and 14. Obtain the standard error of difference between the two sample means. Is this difference significant? Also find the 99% confidence limits for the difference in the average weights of items produced by the two processes respectively.
Solution:
Given, $n_1 = 250$, $n_2 = 400$, $\bar{X}_1 = 120$, $\bar{X}_2 = 124$, $S_1 = 12$, $S_2 = 14$, $\alpha = 1\% = 0.01$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\mu_1 = \mu_2$
Alternate Hypothesis: H1: $\mu_1 \neq \mu_2$
It is a two tailed test.
STEP − 2 Test statistic
$$S.E.(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} = \sqrt{\frac{12^2}{250} + \frac{14^2}{400}} = \sqrt{0.576 + 0.49} = 1.0325$$
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{S.E.(\bar{X}_1 - \bar{X}_2)} = \frac{120 - 124}{1.0325} = -3.87$$
$|Z| = 3.87$
STEP − 3 Critical Value
Level of significance = 1% = 0.01
From table, $Z_\alpha = 2.58$
STEP − 4 Decisions:
$|Z| = 3.87 > 2.58$ (i.e.) $|Z| > Z_\alpha$
Null hypothesis is rejected
(i.e.) There is significant difference between the sample means.
99% confidence limits for $\mu_1 - \mu_2$, i.e., for the difference in the average weights of items produced by two processes, are
$$(\bar{X}_1 - \bar{X}_2) \pm 2.58 \; S.E.(\bar{X}_1 - \bar{X}_2) = (120 - 124) \pm 2.58\,(1.0325) = -4 \pm 2.66$$
i.e. −6.66 and −1.34, so that
$$1.34 \leq |\mu_1 - \mu_2| \leq 6.66$$

Example – 7
A manufacturer of electric bulbs, by some process, finds the S.D of the life of lamps to be 100hrs. He wants to change the process if the new process results in even smaller variation in the life of lamps. In adopting the new process a sample of 150 bulbs gave the S.D of 95 hrs. Is the manufacturer justified in changing the process?
Solution:
Given $n = 150$, $S = 95$, $\sigma = 100$, $\alpha = 5\% = 0.05$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $S = \sigma$
Alternate Hypothesis: H1: $S \neq \sigma$
STEP − 2 Test statistics:
$$Z = \frac{S - \sigma}{\sigma/\sqrt{2n}} = \frac{95 - 100}{100/\sqrt{300}} = \frac{-5 \times 17.32}{100} = -0.866$$
$|Z| = 0.866$
STEP − 3 Critical Value
Level of significance = 5% = 0.05
From table, $Z_\alpha = 1.96$
STEP − 4 Decisions:
$|Z| = 0.866 < 1.96$ (i.e.) $|Z| < Z_\alpha$
Null hypothesis is accepted
(i.e.) The manufacturer finds no justification in changing the process on this evidence alone.

Example – 8
Random samples drawn from two countries gave the following data relating to the heights of adult males.

| | COUNTRY A | COUNTRY B |
|---|---|---|
| MEAN HEIGHT (in inches) | 67.42 | 67.25 |
| S.D (in inches) | 2.58 | 2.50 |
| NUMBER IN SAMPLES | 1000 | 1200 |

Is the difference between the standard deviations significant?
Solution:
Given $n_1 = 1000$, $n_2 = 1200$, $\bar{X}_1 = 67.42$, $\bar{X}_2 = 67.25$, $S_1 = 2.58$, $S_2 = 2.50$, $\alpha = 5\% = 0.05$
STEP − 1 Setting up of Hypothesis
Null Hypothesis: H0: $\sigma_1 = \sigma_2$
Alternate Hypothesis: H1: $\sigma_1 \neq \sigma_2$
STEP − 2 Test statistic:
$$Z = \frac{S_1 - S_2}{\sqrt{\dfrac{S_1^2}{2n_1} + \dfrac{S_2^2}{2n_2}}} = \frac{2.58 - 2.50}{\sqrt{\dfrac{2.58^2}{2000} + \dfrac{2.50^2}{2400}}} = \frac{0.08}{0.0770} = 1.04$$
STEP − 3 Critical Value
Level of significance = 5% = 0.05
From table, $Z_\alpha = 1.96$
STEP − 4 Decision:
$|Z| = 1.04 < 1.96$ (i.e.) $|Z| < Z_\alpha$
We accept H0.
(i.e.) The sample standard deviations do not differ significantly.
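The confidence limits asked for in Example 6 can be reproduced with the sketch below. The function name `diff_means_confidence_limits` is an assumption, and the critical value is computed from `statistics.NormalDist` instead of being read from a table, so it is 2.576 rather than the rounded 2.58; the limits may therefore differ from the hand calculation in the second decimal place.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_confidence_limits(mean1, mean2, sd1, sd2, n1, n2, confidence=0.99):
    """Return (lower limit, upper limit, standard error) for mu1 - mu2."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # Two-sided critical value, e.g. about 2.576 for 99% confidence
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    difference = mean1 - mean2
    margin = z_critical * standard_error
    return difference - margin, difference + margin, standard_error


# Example 6: two manufacturing processes
lower, upper, se = diff_means_confidence_limits(120, 124, 12, 14, 250, 400)
print(f"S.E. = {se:.4f}")
print(f"99% limits for mu1 - mu2: {lower:.2f} to {upper:.2f}")
```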
8.10 QUESTIONS
PART – A
1) Critical region is a region of
a. Large
b. sufficiently large
c. small
d. none of these
2) The probability of Type II error is
a. $\alpha$
b. $\beta$
c. $1 - \alpha$
d. $1 - \beta$
3) The usual notation the standard error of the sampling distribution is a.
b.
n
n
c.
2
d. None of these
4) The asymptotic distribution of t-statistic with n degrees of freedom is
a. F
b. normal
c. z
d. t
5) If we reject the null hypothesis, we might be making
a. A Type II error
b. A Type I error
c. A correct decision
d. either (i) or (ii)
Answers: 1. (a), 2. (b), 3. (a), 4. (b), 5. (b)
PART - B
1. What is a hypothesis? What steps are involved in statistical testing of a hypothesis?
2. Distinguish between
i. One tail and two tail tests
ii. Critical region and acceptance region
iii. Null hypothesis and alternate hypothesis
iv. Type I error and Type II error
3. Explain how hypothesis testing is useful to management.

PART - C
1. An insurance agent has claimed that the average age of policy holders who insure through him is less than the average for all agents, which is 30.5 years. A random sample of 100 policy holders who had insured through him reveals that the mean and S.D are 28.8 years and 6.35 years respectively. Test his claim at 5% level of significance.
2. The sales manager of a large company conducted a sample survey in states A and B taking 400 samples in each case. The results were

| | STATE A | STATE B |
|---|---|---|
| AVERAGE SALES | Rs. 2500 | Rs. 2200 |
| S.D | Rs. 400 | Rs. 550 |

Test whether the average sales is the same in the 2 states at 1% level.
3. In a company there are two independent processes manufacturing the same item. The average weight in a sample of 250 items produced from one process is found to be 120 ozs, with a standard deviation of 12 ozs, while the corresponding figures in a sample of 400 items from the other process are 124 and 14. Obtain the standard error of difference between the two sample means. Is this difference significant? Also find the 99% confidence limits for the difference in the average weights of items produced by the two processes respectively.
4. A manufacturer of electric bulbs, by some process, finds the S.D of the life of lamps to be 100hrs. He wants to change the process if the new process results in even smaller variation in the life of lamps. In adopting the new process a sample of 150 bulbs gave the S.D of 95 hrs. Is the manufacturer justified in changing the process?
5. Random samples drawn from two countries gave the following data relating to the heights of adult males.

| | COUNTRY A | COUNTRY B |
|---|---|---|
| MEAN HEIGHT (in inches) | 67.42 | 67.25 |
| S.D (in inches) | 2.58 | 2.50 |
| NUMBER IN SAMPLES | 1000 | 1200 |

Is the difference between the standard deviations significant?

8.11 SUGGESTED READINGS
1. T. Veerarajan, “Probability & Statistics”, Tata McGraw Hill Publishing Company Ltd, New Delhi, 2008.
2. Ronald E. Walpole & Sharon L. Myers, “Probability & Statistics for scientists”, Pearson Education, 2005.
3. J.N. Kapur & H.C. Saxena, “Mathematical statistics”, S.Chand & Company Ltd, New Delhi, 2001.
CHAPTER – IX
CORRELATION AND REGRESSION
STRUCTURE
9.1 INTRODUCTION
9.2 DEFINITIONS
9.3 CO-EFFICIENT OF CORRELATION
9.4 LINES OF REGRESSION
9.5 QUESTIONS
9.6 SUGGESTED READINGS
CORRELATION AND REGRESSION
9.1 INTRODUCTION
So far we have confined our attention to the analysis of observations on a single variable. There are, however, many phenomena where the changes in one variable are related to the changes in another variable. For instance, the yield of a crop varies with the amount of rainfall, the price of a commodity increases with a reduction in its supply, and so on. In multiple correlation we are dealing with situations that involve two or more variables. The variable whose value we are trying to estimate is called the dependent variable and the other variables on which our estimates are based are known as independent variables. In problems of multiple correlation we always have three or more variables (one dependent and the others independent).

9.2 DEFINITIONS
Correlation: When the changes in one variable are associated with or followed by changes in the other, the relationship is called correlation. Data connecting two such variables is called a bivariate population.
Positive & Negative correlation: If an increase (or decrease) in the values of one variable corresponds to an increase (or decrease) in the other, the correlation is said to be positive. If the increase (or decrease) in one corresponds to the decrease (or increase) in the other, the correlation is said to be negative.
If there is no relationship indicated between the variables, they are said to be independent or uncorrelated.
To obtain a measure of relationship between the two variables, we plot their corresponding values on the graph, taking one of the variables along the x-axis and the other along the y-axis.
Let the origin be shifted to $(\bar{x}, \bar{y})$, where $\bar{x}, \bar{y}$ are the means of the x’s and y’s, so that the new coordinates are given by $X = x - \bar{x}$, $Y = y - \bar{y}$.
Now the points (X, Y) are so distributed over the four quadrants of the XY-plane that the product XY is positive in the first and third quadrants but negative in the second and fourth quadrants.
The algebraic sum of the products, $\sum XY$, can be taken as describing the trend of the dots in all the quadrants.
(i) If $\sum XY$ is positive, the trend of the dots is through the first and third quadrants,
(ii) If $\sum XY$ is negative, the trend of the dots is in the second and fourth quadrants, and
(iii) If $\sum XY$ is zero, the points indicate no trend, i.e. the points are evenly distributed over the four quadrants.
The sum $\sum XY$, or better still $\frac{1}{n}\sum XY$, i.e. the average of the n products, may be taken as a measure of correlation. If we put X and Y in their units, i.e. take $\sigma_x$ as the unit for x and $\sigma_y$ for y, then
$$\frac{1}{n}\sum \frac{X}{\sigma_x}\cdot\frac{Y}{\sigma_y}, \quad \text{i.e.} \quad \frac{\sum XY}{n\,\sigma_x \sigma_y}$$
is the measure of correlation.
9.3 CO-EFFICIENT OF CORRELATION
The numerical measure of correlation is called the co-efficient of correlation and is defined by the relation
$$r = \frac{\sum XY}{n\,\sigma_x \sigma_y}$$
where X = deviation from the mean $\bar{x}$ = $x - \bar{x}$, Y = deviation from the mean $\bar{y}$ = $y - \bar{y}$, $\sigma_x$ = S.D. of x-series, $\sigma_y$ = S.D. of y-series and n = number of values of the two variables.
Methods of calculation:
(a) Direct method. Substituting the values of $\sigma_x$ and $\sigma_y$ in the above formula, we get
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} \qquad \ldots (1)$$
Another form of the formula (1) which is quite handy for calculation is
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
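Both forms of the direct method can be written out as a short Python sketch. The function names below are assumptions made for illustration; the data are the intelligence ratio and engineering ratio figures of Example 1 in this chapter, and the two functions should agree up to rounding error.

```python
from math import sqrt


def correlation_from_deviations(x, y):
    """Formula (1): r = sum(XY) / sqrt(sum(X^2) * sum(Y^2))."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dev_x = [a - x_bar for a in x]  # X = x - mean of x
    dev_y = [b - y_bar for b in y]  # Y = y - mean of y
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    return sum_xy / sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))


def correlation_from_raw_sums(x, y):
    """Alternative form using the raw sums of x, y, xy, x^2 and y^2."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


intelligence = [105, 104, 102, 101, 100, 99, 98, 96, 93, 92]
engineering = [101, 103, 100, 98, 95, 96, 104, 92, 97, 94]

r_dev = correlation_from_deviations(intelligence, engineering)
r_raw = correlation_from_raw_sums(intelligence, engineering)
print(f"r (deviation form) = {r_dev:.3f}")
print(f"r (raw-sum form)   = {r_raw:.3f}")
```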
(b) Step-deviation method. The direct method becomes very lengthy and tedious if the means of the two series are not integers. In such cases, use is made of assumed means. If $d_x$ and $d_y$ are step-deviations from the assumed means, then
$$r = \frac{n\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{\left\{n\sum d_x^2 - \left(\sum d_x\right)^2\right\}\left\{n\sum d_y^2 - \left(\sum d_y\right)^2\right\}}}$$
Obs. A change of origin and units does not alter the value of the correlation co-efficient, since r is a pure number.
(c) Co-efficient of correlation for grouped data. When x and y series are both given as frequency distributions, these can be represented by a two-way table known as the correlation-table. It is a double-entry table with one series along the horizontal and the other along the vertical. The co-efficient of correlation for such a bivariate frequency distribution is calculated by the formula
$$r = \frac{n\sum f d_x d_y - \left(\sum f d_x\right)\left(\sum f d_y\right)}{\sqrt{\left\{n\sum f d_x^2 - \left(\sum f d_x\right)^2\right\}\left\{n\sum f d_y^2 - \left(\sum f d_y\right)^2\right\}}}$$
Where $d_x$ = deviation of the central values from the assumed mean of x-series, $d_y$ = deviation of the central values from the assumed mean of y-series, f is the frequency corresponding to the pair (x, y) and n (= $\sum f$) is the total number of frequencies.

EXAMPLE-1
Psychological tests of intelligence and of engineering ability were applied to 10 students. Here is a record of ungrouped data showing intelligence ratio (I.R.) and engineering ratio (E.R.). Calculate the co-efficient of correlation.

| Student | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| I.R. | 105 | 104 | 102 | 101 | 100 | 99 | 98 | 96 | 93 | 92 |
| E.R. | 101 | 103 | 100 | 98 | 95 | 96 | 104 | 92 | 97 | 94 |
Sol. We construct the following table:

| Student | Intelligence ratio x | $x - \bar{x} = X$ | Engineering ratio y | $y - \bar{y} = Y$ | $X^2$ | $Y^2$ | XY |
|---|---|---|---|---|---|---|---|
| A | 105 | 6 | 101 | 3 | 36 | 9 | 18 |
| B | 104 | 5 | 103 | 5 | 25 | 25 | 25 |
| C | 102 | 3 | 100 | 2 | 9 | 4 | 6 |
| D | 101 | 2 | 98 | 0 | 4 | 0 | 0 |
| E | 100 | 1 | 95 | -3 | 1 | 9 | -3 |
| F | 99 | 0 | 96 | -2 | 0 | 4 | 0 |
| G | 98 | -1 | 104 | 6 | 1 | 36 | -6 |
| H | 96 | -3 | 92 | -6 | 9 | 36 | 18 |
| I | 93 | -6 | 97 | -1 | 36 | 1 | 6 |
| J | 92 | -7 | 94 | -4 | 49 | 16 | 28 |
| Total | 990 | 0 | 980 | 0 | 170 | 140 | 92 |

From this table, mean of x, i.e. $\bar{x}$ = 990/10 = 99 and mean of y, i.e. $\bar{y}$ = 980/10 = 98. Substituting these values in the formula (1), we have
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{92}{\sqrt{170 \times 140}} = \frac{92}{154.3} = 0.596$$

9.4 LINES OF REGRESSION
It frequently happens that the dots of the scatter diagram tend to cluster along a well defined direction, which suggests a linear relationship between the variables x and y. Such a line of best fit for the given distribution of dots is called the line of regression (Fig.). In fact there are two such lines, one giving the best possible mean values of y for each specified value of x and the other giving the best possible mean values of x for given values of y. The former is known as the line of regression of y on x and the latter as the line of regression of x on y.
Consider first the line of regression of y on x. Let the straight line satisfying the general trend of n dots in a scatter diagram be
$$y = a + bx \qquad \ldots (1)$$
We have to determine the constants a and b so that (1) gives, for each value of x, the best estimate for the average value of y in accordance with the principle of least squares. Therefore, the normal equations for a and b are
$$\sum y = na + b\sum x$$
$$\sum xy = a\sum x + b\sum x^2$$
Thus the line of best fit becomes
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$
which is the equation of the line of regression of y on x. Its slope is called the regression co-efficient of y on x. Interchanging x and y, we find that the line of regression of x on y is
$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$
Thus the regression co-efficient of y on x = $r\,\sigma_y/\sigma_x$ and the regression co-efficient of x on y = $r\,\sigma_x/\sigma_y$.
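The least-squares line of y on x can be obtained numerically as sketched below. The helper name `regression_y_on_x` is an assumption; it uses the equivalent slope formula b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², which equals r σy/σx. The data are the test scores and weekly sales of Example 3 in this chapter.

```python
def regression_y_on_x(x, y):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # regression co-efficient of y on x
    a = y_bar - b * x_bar    # the line passes through (x_bar, y_bar)
    return a, b


scores = [40, 70, 50, 60, 80, 50, 90, 40, 60, 60]
sales = [2.5, 6.0, 4.5, 5.0, 4.5, 2.0, 5.5, 3.0, 4.5, 3.0]

a, b = regression_y_on_x(scores, sales)
predicted = a + b * 70
print(f"y = {a:.3f} + {b:.4f} x")
print(f"Estimated sales for a score of 70: {predicted:.2f}")
```

Because the slope is kept unrounded here, the estimate differs slightly from a hand calculation that rounds the slope to 0.06 before substituting.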
Example-2
The two regression equations of the variables x and y are x = 19.13 − 0.87y and y = 11.64 − 0.50x. Find (i) mean of x’s, (ii) mean of y’s and (iii) the correlation co-efficient between x and y.
Sol. Since the mean of x’s and the mean of y’s lie on the two regression lines, we have
$\bar{x} = 19.13 - 0.87\,\bar{y}$ ...(i)
$\bar{y} = 11.64 - 0.50\,\bar{x}$ ...(ii)
Multiplying (ii) by 0.87 and subtracting from (i), we have
$[1 - (0.87)(0.50)]\,\bar{x} = 19.13 - (11.64)(0.87)$
or $0.565\,\bar{x} = 9.00$ or $\bar{x} = 15.93$
$\bar{y} = 11.64 - (0.50)(15.93) = 3.67$
Regression co-efficient of y on x is −0.50 and that of x on y is −0.87. Now since the co-efficient of correlation is the geometric mean between the two regression co-efficients,
$r = -\sqrt{(-0.50)(-0.87)} = -\sqrt{0.435} = -0.66$
[−ve sign is taken since both the regression co-efficients are −ve]

EXAMPLE-3
In the following table are recorded data showing the test scores made by salesmen on an intelligence test and their weekly sales:

| Salesmen | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Test scores | 40 | 70 | 50 | 60 | 80 | 50 | 90 | 40 | 60 | 60 |
| Sales (000) | 2.5 | 6.0 | 4.5 | 5.0 | 4.5 | 2.0 | 5.5 | 3.0 | 4.5 | 3.0 |
Calculate the regression line of sales on test scores and estimate the most probable weekly sales volume if a salesman makes a score of 70.
Sol. With the help of the table below, we have
$\bar{x}$ = mean of x (test scores) = 60 + 0/10 = 60
$\bar{y}$ = mean of y (sales) = 4.5 + (−4.5)/10 = 4.05
Regression line of sales (y) on scores (x) is given by
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$
where
$$r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \dfrac{\left(\sum d_x\right)^2}{n}} = \frac{140}{2400} \approx 0.06$$
The required regression line is
y − 4.05 = 0.06(x − 60) or y = 0.06x + 0.45
For x = 70, y = 0.06 × 70 + 0.45 = 4.65
Thus the most probable weekly sales volume for a score of 70 is 4.65.

| Test scores x | Sales y | Deviation of x from assumed mean (= 60), $d_x$ | Deviation of y from assumed average (= 4.5), $d_y$ | $d_x \times d_y$ | $d_x^2$ | $d_y^2$ |
|---|---|---|---|---|---|---|
| 40 | 2.5 | -20 | -2 | 40 | 400 | 4 |
| 70 | 6.0 | 10 | 1.5 | 15 | 100 | 2.25 |
| 50 | 4.5 | -10 | 0 | 0 | 100 | 0 |
| 60 | 5.0 | 0 | 0.5 | 0 | 0 | 0.25 |
| 80 | 4.5 | 20 | 0 | 0 | 400 | 0 |
| 50 | 2.0 | -10 | -2.5 | 25 | 100 | 6.25 |
| 90 | 5.5 | 30 | 1 | 30 | 900 | 1.00 |
| 40 | 3.0 | -20 | -1.5 | 30 | 400 | 2.25 |
| 60 | 4.5 | 0 | 0 | 0 | 0 | 0 |
| 60 | 3.0 | 0 | -1.5 | 0 | 0 | 2.25 |
| Total | | $\sum d_x = 0$ | $\sum d_y = -4.5$ | $\sum d_x d_y = 140$ | $\sum d_x^2 = 2400$ | $\sum d_y^2 = 18.25$ |
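Example-4
Calculate r from the following data:

| X | 21 | 23 | 30 | 54 | 57 | 58 | 72 | 78 | 87 | 90 |
|---|---|---|---|---|---|---|---|---|---|---|
| Y | 60 | 71 | 72 | 83 | 110 | 84 | 100 | 92 | 113 | 135 |

| x | X = x − 54 | $X^2$ | y | Y = y − 100 | $Y^2$ | y − x | $(x - y)^2$ |
|---|---|---|---|---|---|---|---|
| 21 | -33 | 1089 | 60 | -40 | 1600 | 39 | 1521 |
| 23 | -31 | 961 | 71 | -29 | 841 | 48 | 2304 |
| 30 | -24 | 576 | 72 | -28 | 784 | 42 | 1764 |
| 54 | 0 | 0 | 83 | -17 | 289 | 29 | 841 |
| 57 | 3 | 9 | 110 | 10 | 100 | 53 | 2809 |
| 58 | 4 | 16 | 84 | -16 | 256 | 26 | 676 |
| 72 | 18 | 324 | 100 | 0 | 0 | 28 | 784 |
| 78 | 24 | 576 | 92 | -8 | 64 | 14 | 196 |
| 87 | 33 | 1089 | 113 | 13 | 169 | 26 | 676 |
| 90 | 36 | 1296 | 135 | 35 | 1225 | 45 | 2025 |
| Total | 30 | 5936 | | -80 | 5328 | 350 | 13596 |

Using the formula
$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\,\sigma_x \sigma_y}$$
with $\sigma_x^2 = 5936/10 - (30/10)^2 = 584.6$, $\sigma_y^2 = 5328/10 - (-80/10)^2 = 468.8$ and $\sigma_{x-y}^2 = 13596/10 - (350/10)^2 = 134.6$, we get
$$r = \frac{584.6 + 468.8 - 134.6}{2\sqrt{584.6 \times 468.8}} = \frac{918.8}{1047.0} \approx 0.88$$

Example-5
While calculating correlation coefficient between two variables x and y from 25 pairs of observations,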
the following results were obtained: n = 25, $\sum x = 125$, $\sum x^2 = 650$, $\sum y = 100$, $\sum y^2 = 460$, $\sum xy = 508$. Later it was discovered at the time of checking that the pairs of values (x, y) = (8, 12) and (6, 8) were copied down as (6, 14) and (8, 6). Obtain the correct value of the correlation coefficient.
Sol. To get the correct results, we subtract the incorrect values and add the corresponding correct values.
n = 25
$\sum x = 125 - 6 - 8 + 8 + 6 = 125$
$\sum x^2 = 650 - 6^2 - 8^2 + 8^2 + 6^2 = 650$
$\sum y = 100 - 14 - 6 + 12 + 8 = 100$
$\sum y^2 = 460 - 14^2 - 6^2 + 12^2 + 8^2 = 436$
$\sum xy = 508 - 6 \times 14 - 8 \times 6 + 8 \times 12 + 6 \times 8 = 520$
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}}} = \frac{25 \times 520 - 125 \times 100}{\sqrt{625 \times 900}} = \frac{500}{25 \times 30} = 0.67$$
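The correction in Example 5 amounts to removing the contribution of the wrongly copied pairs from each total and adding that of the correct pairs. A sketch of this bookkeeping follows; the dictionary keys and the helper name `apply_pairs` are assumptions introduced for illustration.

```python
from math import sqrt

n = 25
totals = {"x": 125, "x2": 650, "y": 100, "y2": 460, "xy": 508}
wrong_pairs = [(6, 14), (8, 6)]    # pairs as copied down
right_pairs = [(8, 12), (6, 8)]    # pairs as they should have been


def apply_pairs(sums, pairs, sign):
    """Add (sign=+1) or remove (sign=-1) the contribution of each pair."""
    updated = dict(sums)
    for x, y in pairs:
        updated["x"] += sign * x
        updated["x2"] += sign * x * x
        updated["y"] += sign * y
        updated["y2"] += sign * y * y
        updated["xy"] += sign * x * y
    return updated


corrected = apply_pairs(apply_pairs(totals, wrong_pairs, -1), right_pairs, +1)

numerator = n * corrected["xy"] - corrected["x"] * corrected["y"]
denominator = sqrt(
    (n * corrected["x2"] - corrected["x"] ** 2)
    * (n * corrected["y2"] - corrected["y"] ** 2)
)
r = numerator / denominator
print(corrected)
print(f"corrected r = {r:.2f}")
```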
9.5 QUESTIONS
PART – A
1. The degrees of freedom for standard error of estimate are $n - k - 1$. What does the k stand for?
a. Number of observations in the sample
b. Number of independent variables
c. Mean of the sample values of dependent variable.
d. None of these.
2. Given a regression equation $Y = 25.26 + 4.78X_1 + 3.09X_2 - 1.98X_3$. The value of $b_2$ for this equation is
a. 25.26
b. 4.78
c. 3.09
d. -1.98
3. In the regression equation $Y = a + b_1X_1 + b_2X_2$, Y is independent of X when
a. $b_2 = 0$
b. $b_2 = 1$
c. $b_2 = -1$
d. None of these.
4. Since $r^2 = 1 - \dfrac{\sum (Y - Y_c)^2}{\sum (Y - \bar{Y})^2}$, then $r^2$ is equal to
a. $1 - \dfrac{SSR}{SST}$
b. $1 - \dfrac{SSE}{SSR}$
c. $1 - \dfrac{SST}{SSE}$
d. $1 - \dfrac{SSE}{SST}$
5. In regression analysis, the explained deviation of the dependent variable y is given by
a. $\sum (Y - \bar{Y})^2$
b. $\sum (Y_c - \bar{Y})^2$
c. $\sum (Y - Y_c)^2$
d. None of these
PART-B
1. Regression coefficient of y on x is 0.7 and that of x on y is 3.2. Is the correlation coefficient r consistent?
2. The equations of regression lines are y = 0.5x + 9 and x = 0.4y + 5. Find the correlation coefficient.

PART-C
1. Calculate the correlation coefficient for the following data

| X | 65 | 66 | 67 | 67 | 68 | 69 | 70 | 72 |
|---|---|---|---|---|---|---|---|---|
| Y | 67 | 68 | 65 | 68 | 72 | 72 | 69 | 71 |

2. The two regression lines are 4x − 5y + 33 = 0 and 20x − 9y − 107 = 0 and variance of x = 25. Find (i) the means of x and y, (ii) the value of r, (iii) the standard deviation of y.
3. Find the correlation coefficient between x and y from the given data:

| X | 78 | 89 | 97 | 69 | 59 | 79 | 68 | 57 |
|---|---|---|---|---|---|---|---|---|
| Y | 125 | 137 | 156 | 112 | 107 | 138 | 123 | 108 |

4. Find the correlation co-efficient between x and y for the given values. Find also the two regression lines.

| X | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Y | 10 | 12 | 16 | 28 | 25 | 36 | 41 | 49 | 40 | 50 |

5. Find the co-efficient of correlation between industrial production and export using the following data and comment on the result.

| Production (in crore tons) | 55 | 56 | 58 | 59 | 60 | 60 | 62 |
|---|---|---|---|---|---|---|---|
| Exports (in crore tons) | 35 | 38 | 38 | 39 | 44 | 43 | 45 |

6. Ten people of various heights as under, were requested to read the letters on a car at 25 yards distance. The number of letters correctly read is given below:

| Height (in feet) | 5.1 | 5.3 | 5.6 | 5.7 | 5.8 | 5.9 | 5.10 | 5.11 | 6.0 | 6.1 |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of letters | 11 | 17 | 19 | 14 | 8 | 15 | 20 | 60 | 18 | 12 |

Is there any correlation between heights and visual power?

9.6 SUGGESTED READINGS
1. Ross, S., “A First Course in Probability”, Pearson Education, Delhi, 2002.
2. Gupta, S.C and Kapoor, V.K., “Fundamentals of Mathematical Statistics”, Sultan Chand and Sons, New Delhi, 1999.
3. P.N. Arora & S. Arora, “Statistics for management”, S.Chand & Company Ltd, New Delhi, 2003.
4. Amir D. Aczel, “Business statistics”, Tata McGraw Hill, Edition 2002.
CHAPTER – X CHI – SQUARE ANALYSIS STRUCTURE 10.1 INTRODUCTION 10.2 CHI – SQUARE DISTRIBUTION 10.3 CHI – SQUARE TESTS FOR GOODNESS OF FIT 10.4 CHI – SQUARE TESTS FOR INDEPENDENCE OF ATTRIBUTES 10.5 QUESTIONS 10.6 SUGGESTED READINGS
10.1 INTRODUCTION
The Chi – square test ($\chi^2$ test) is a useful measure for comparing experimentally obtained results with those expected theoretically under a hypothesis. It is used as a test statistic in testing a hypothesis that provides a set of theoretical frequencies with which observed frequencies are compared. Chi – square analysis is widely used in research studies for testing hypotheses involving nominal data.
Statistical theoretical distributions have a great role in decision making. Especially during periods of uncertainty, managers would presume one of the theoretical models to estimate the parameters of a likely event.
Similarly, research scientists in the area of management need to choose a certain probability distribution for the data under consideration. In such a case chi – square analysis is very useful for finding out whether there is a significant difference between the observed and theoretical frequency distributions.
10.2 CHI – SQUARE DISTRIBUTION
DEFINITION ($\chi^2$ Test)
If $O_i$ $(i = 1, 2, \ldots, n)$ are a set of observed (experimental) frequencies and $E_i$ $(i = 1, 2, \ldots, n)$ is the corresponding set of expected (theoretical or hypothetical) frequencies, then the statistic $\chi^2$ is defined as
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
and the degrees of freedom of this statistic are $\nu = n - 1$.
10.2.1 Grouping of Small Frequencies: One or more observations with frequencies less than 5 may be grouped together to represent a single category before calculating the difference between observed and expected frequencies.
For example, the figures given below are the actual (observed) and expected frequencies (based on the Poisson distribution) having the same mean value and an equal number of total frequencies.
Observed frequencies: 305 365 210 80 28 9 3
Expected frequencies: 301 361 217 88 26 6 1
These 7 classes can be reduced to 6 by combining the last two frequencies in both the cases as follows:
Observed frequencies: 305 365 210 80 28 12
Expected frequencies: 301 361 217 88 26 7
Since the original 7 classes have been reduced to 6 by grouping, the revised degrees of freedom are df = 6 − 2 = 4, due to two restraints.
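The grouping step can be sketched in Python. This is only a minimal illustration, assuming that small classes sit at the tail of the distribution and that the check is made on the expected frequencies; the function name `pool_tail_classes` and the `min_expected` parameter are hypothetical and do not come from the text.

```python
def pool_tail_classes(observed, expected, min_expected=5):
    """Merge trailing classes until the last expected frequency is at least min_expected."""
    obs = list(observed)
    exp = list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        # Remove the last class and fold its frequencies into the class before it.
        last_obs = obs.pop()
        last_exp = exp.pop()
        obs[-1] += last_obs
        exp[-1] += last_exp
    return obs, exp


# Frequencies from the grouping example above (7 classes).
obs, exp = pool_tail_classes(
    [305, 365, 210, 80, 28, 9, 3],
    [301, 361, 217, 88, 26, 6, 1],
)
print(obs)
print(exp)
```

With the figures above, the last two classes are merged once, leaving 6 classes, which agrees with the grouped table in the text.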
10.2.2 Conditions For Applying the $\chi^2$ Test:
For the validity of the chi – square test of “goodness of fit” between theory and experiment, the following conditions must be satisfied.
The total frequency must be reasonably large; N should be at least 50.
The constraints must be linear.
The sample observations drawn from a population must be independent and random.
The data must be in frequency (counting) form. If the original data are in percentages, they must be converted into frequency.
No frequency in any cell must be less than 5. If the frequency is less than 5 for a category, you have to do some regrouping.
10.2.3 Characteristics of Chi – Square:
The chi – square distribution has only one parameter, called the degrees of freedom. The shape of a particular chi – square distribution depends on the number of degrees of freedom.
The test is based on events or frequencies rather than on distribution parameters.
The test can be used between the entire set of observed and expected frequencies.
It is used to draw inferences; the test is applied to test a hypothesis but is not used for estimation.
The $\chi^2$ test is one of the simplest and most widely used non – parametric tests in statistical work.
10.2.4 Degrees of Freedom:
When the observed frequencies are listed along one dimension (row or column) there are (n-1) degrees of freedom, where ‘n’ refer to the number of column (or row) or there are (n-1) degrees of freedom where ‘n’ refers to the number of observed frequencies.
While comparing the calculated value of $\chi^2$ with the table value, we have to determine the degrees of freedom.
The number of degrees of freedom is the total number of observations less the number of independent constraints imposed on the observations. Thus d.f. $\nu = n - k$, where $k$ is the number of independent constraints in a set of data of $n$ observations.
For fitting the Binomial distribution, $\nu = n - 1$
For fitting the Poisson distribution, $\nu = n - 2$
For fitting the Normal distribution, $\nu = n - 3$
10.3 CHI – SQUARE TEST OF GOODNESS OF FIT
It is a very powerful test for testing the significance of the discrepancy between theory and experiment. It enables us to find if the deviation of the experiment from theory is just by chance or is it really due to the inadequacy of the theory to fit the observed data.
The $\chi^2$ test for goodness of fit enables us to determine the extent to which theoretical probability distributions coincide with the empirical sample distribution. To apply this test, a particular theoretical distribution is first hypothesized for a given population, and then the test is carried out to determine whether or not the sample data could have come from the population of interest with the hypothesized theoretical distribution.
Working Procedure:
The general steps to conduct a goodness of fit test for any hypothesized population distribution are summarized as follows:
STEP − 1: State the null and alternative hypothesis
$H_0$: The population characteristics coincide with the theoretical probability distribution.
$H_1$: The population characteristics do not coincide with the theoretical probability distribution.
STEP − 2: Test statistic
Select a random sample and record the observed frequencies (O values) for each category. Calculate the expected frequencies (E values) in each category by multiplying the category probability by the sample size. Compute the value of the test statistic
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
STEP − 3: Critical value
Using a level of significance $\alpha$ and df $= n - 1$, provided that the expected frequencies are 5 or more for all categories, find the critical (table) value of $\chi^2$.
STEP − 4: Decision
Compare the table and calculated values of $\chi^2$. Decide whether the sample data fit the hypothesized distribution, using the decision rule:
Accept $H_0$ if $\chi^2_{cal}$ is less than its table value $\chi^2_{\alpha,\, n-1}$.
Otherwise reject $H_0$.
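The four steps above can be sketched in Python. This is a minimal illustration rather than a full implementation: the function names are hypothetical, and the critical value is assumed to be read from a printed $\chi^2$ table and passed in by the user. The data are the digit frequencies used in Example 1.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def goodness_of_fit_decision(observed, expected, table_value):
    """Apply the decision rule: accept H0 when the calculated value is below the table value."""
    chi2 = chi_square_statistic(observed, expected)
    decision = "accept H0" if chi2 < table_value else "reject H0"
    return chi2, decision


# Digit frequencies from Example 1; under H0 each digit is expected 200 / 10 = 20 times.
observed = [18, 19, 23, 21, 16, 25, 22, 20, 21, 15]
expected = [sum(observed) / len(observed)] * len(observed)

# 16.92 is the 5% table value for 9 degrees of freedom, as quoted in Example 1.
chi2, decision = goodness_of_fit_decision(observed, expected, table_value=16.92)
print(round(chi2, 2), decision)
```

Keeping the table lookup outside the function mirrors the manual procedure, where the critical value is read from a table after the degrees of freedom are known.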
Example – 1
200 digits were chosen at random from a set of tables. The frequencies of the digits were
Digit: 0 1 2 3 4 5 6 7 8 9
Frequency: 18 19 23 21 16 25 22 20 21 15
Use the chi – square test to assess the correctness of the hypothesis that the digits were distributed in equal number in the tables from which they were chosen.
Solution:
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0$: The digits were distributed in equal number in the tables from which they were chosen.
Alternate Hypothesis: $H_1$: The digits were not distributed in equal number.
Under $H_0$, the expected frequency of each digit $= \frac{200}{10} = 20$.
STEP − 2 Test statistic:
Now the data becomes
Digit: 0 1 2 3 4 5 6 7 8 9
Observed Frequency $O_i$: 18 19 23 21 16 25 22 20 21 15
Expected Frequency $E_i$: 20 20 20 20 20 20 20 20 20 20
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{4 + 1 + 9 + 1 + 16 + 25 + 4 + 0 + 1 + 25}{20} = \frac{86}{20} = 4.3$$
We choose the L.O.S. $\alpha = 0.05$.
Degrees of freedom = 10 − 1 = 9
STEP − 3 Critical value
The table value of $\chi^2$ for 9 degrees of freedom = 16.92
STEP − 4 Decision:
Since the calculated value of $\chi^2$ is less than the table value, $H_0$ is accepted.
The hypothesis that the digits are distributed in equal number holds good.
Example – 2
The following table shows the distribution of goals in a football match.
No. of goals: 0 1 2 3 4 5 6 7
No. of matches: 95 158 108 63 40 9 5 2
Fit a Poisson distribution and test the goodness of fit.
Solution:
Fitting of Poisson distribution
x: 0 1 2 3 4 5 6 7
f: 95 158 108 63 40 9 5 2
$\sum fx = 812$ and $\sum f = 480$
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{812}{480} \approx 1.7$$
The expected frequencies are computed by
$$480 \times \frac{e^{-1.7} (1.7)^r}{r!}, \quad r = 0, 1, 2, 3, 4, 5, 6, 7$$
which gives 88, 150, 126, 72, 30, 10, 3, 1.
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0$: The Poisson distribution is a good fit.
Alternate Hypothesis: $H_1$: The Poisson distribution is not a good fit.
STEP − 2 Test statistic:
The test statistic $\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$ is calculated as follows. The last three classes (5, 6 and 7 goals) have small frequencies and are grouped together, giving O = 9 + 5 + 2 = 16 and E = 10 + 3 + 1 = 14.
O      E      $(O_i - E_i)^2 / E_i$
95     88     0.56
158    150    0.43
108    126    2.57
63     72     1.12
40     30     3.33
16     14     0.29
$\chi^2 = 8.30$
STEP − 3 Critical value
The number of degrees of freedom = 6 − 2 = 4
The table value of $\chi^2$ at 5% level of significance for 4 d.f. = 9.488
STEP − 4 Decision:
Since the calculated value of $\chi^2$ is less than the table value, $H_0$ is accepted at 5% level. Hence the fit is good.
10.4 CHI – SQUARE TEST FOR INDEPENDENCE OF ATTRIBUTES
It is used if there are two categorical variables and our interest is to find out whether these two variables are associated with each other.
The cross tabulation is called a contingency table containing frequency data corresponding to the categorical variable in the row and the column.
The expected frequency $e_{ij}$ is obtained by the rule
$$e_{ij} = \frac{(\text{row total } R_i) \times (\text{column total } C_j)}{N}$$
where $i = 1, 2, 3, \ldots, m$ and $j = 1, 2, 3, \ldots, n$.
10.4.1 Degrees Of Freedom:
The number of degrees of freedom associated with an $s \times t$ contingency table is $(s - 1)(t - 1)$.
Chi – square value for a $2 \times 2$ contingency table: In a $2 \times 2$ contingency table wherein the frequencies are
a   b
c   d
the value of $\chi^2$ is
$$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(a + c)(c + d)(b + d)}$$
Working Procedure:
The working rule to test the association between two categorical variables, where the sample data is presented in the form of a contingency table with r rows and c columns, is summarized as follows:
STEP − 1: State the null and alternative hypothesis
$H_0$: No relationship or association exists between the two variables, that is, they are independent.
$H_1$: A relationship exists, that is, they are dependent.
STEP − 2: Test statistic
Select a random sample and record the observed frequencies (O values) in each cell of the contingency table and calculate the row, column, and grand totals. Calculate the expected frequencies (E values) for each cell:
$$E = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$
Compute the value of the test statistic
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
STEP − 3: Critical value
Calculate the degrees of freedom. The degrees of freedom for the chi – square test of independence are given by the formula
df = (Number of rows − 1)(Number of columns − 1) = $(r - 1)(c - 1)$
Using a level of significance $\alpha$ and df, find the critical (table) value of $\chi^2$. This value of $\chi^2$ corresponds to an area in the right tail of the distribution.
STEP − 4: Decision
Compare the table and calculated values of $\chi^2$. Decide whether the variables are independent or not, using the decision rule:
Accept $H_0$ if $\chi^2_{cal}$ is less than its table value $\chi^2_{\alpha,\, (r-1)(c-1)}$.
Otherwise reject $H_0$.
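The working procedure for a contingency table can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the table is given as a list of rows of observed counts, and the function names are hypothetical. The 2 × 2 shortcut formula from Section 10.4.1 is included so the two routes can be compared; the counts used are those of Example 3.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi_square_2x2(a, b, c, d):
    """Shortcut formula for a 2 x 2 table with cells a, b (first row) and c, d (second row)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (a + c) * (c + d) * (b + d))


# Observed counts from Example 3 (rows: Rural, Urban; columns: A, B).
votes = [[620, 480], [380, 520]]
chi2, df = chi_square_independence(votes)
print(round(chi2, 1), df)
print(round(chi_square_2x2(620, 480, 380, 520), 1))
```

Both routes should give the same statistic for a 2 × 2 table, which is a convenient check on hand calculations.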
Example – 3
Examine whether the nature of area is related to voting preference in the election for which the data are tabulated below.
Area \ Vote for    A       B       Total
Rural              620     480     1100
Urban              380     520     900
Total              1000    1000    2000
Solution:
STEP − 1 Setting up of Hypothesis
$H_0$: Voting preference and the nature of the area are independent.
STEP − 2 Test statistic:
The table of expected frequencies is
         A      B
Rural    550    550
Urban    450    450
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{70^2}{550} + \frac{70^2}{550} + \frac{70^2}{450} + \frac{70^2}{450} = 39.6$$
Degrees of freedom = (2 − 1)(2 − 1) = 1
STEP − 3 Critical value
The table value of $\chi^2$ for 1 degree of freedom at 5% level = 3.84
STEP − 4 Decision:
Since the calculated value of $\chi^2$ is greater than the table value, $H_0$ is rejected.
The nature of area is related to voting preference in the election.
Example – 4
From the following data, test whether there is any association between intelligence and economic conditions.
Economic Conditions    Intelligence
                       Excellent   Good   Medium   Dull   Total
Good                   48          200    150      80     478
Not Good               52          180    190      100    522
Total                  100         380    340      180    1000
Solution:
STEP − 1 Setting up of Hypothesis
$H_0$: There is no association between intelligence and economic conditions.
STEP − 2 Test statistic
Under $H_0$, the expected frequencies are obtained as follows.
E(48) = 478 × 100 / 1000 = 47.8;  E(200) = 478 × 380 / 1000 = 181.64
E(150) = 478 × 340 / 1000 = 162.52;  E(80) = 478 × 180 / 1000 = 86.04
E(52) = 522 × 100 / 1000 = 52.2;  E(180) = 522 × 380 / 1000 = 198.36
E(190) = 522 × 340 / 1000 = 177.48;  E(100) = 522 × 180 / 1000 = 93.96
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
$\chi^2$ = 0.0008 + 1.8558 + 0.9645 + 0.424 + 0.0008 + 1.6994 + 0.8832 + 0.3883 = 6.2168
Degrees of freedom = (2 − 1)(4 − 1) = 3
STEP − 3 Critical value
The table value of $\chi^2$ for 3 degrees of freedom at 5% level = 7.815
STEP − 4 Decision:
Since the calculated value of $\chi^2$ is less than the table value, $H_0$ is accepted.
Hence we conclude there is no association between intelligence and economic conditions.
Example – 5
A company keeps records of accidents. During a safety review, a random sample of 60 accidents was selected and classified by the day of the week on which they occurred.
Day: Mon Tue Wed Thur Fri
No. of accidents: 8 12 9 14 17
Test whether there is any evidence that accidents are more likely on some days than on others.
Solution:
STEP − 1 Setting up of Hypothesis
$H_0$: Accidents are equally likely to occur on any day of the week.
$H_1$: Accidents are not equally likely to occur on any day of the week.
STEP − 2 Test statistic:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
On the assumption $H_0$, the expected number of accidents on any day $= \frac{60}{5} = 12$.
Let O denote the observed frequency and E denote the expected frequency.
O      E      O − E    $(O - E)^2$
8      12     −4       16
12     12     0        0
9      12     −3       9
14     12     2        4
17     12     5        25
60     60              54
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{54}{12} = 4.5$$
STEP − 3 Critical value
Degrees of freedom = n − 1 = 5 − 1 = 4
The table value of $\chi^2$ for 4 degrees of freedom at 5% level = 9.488
STEP − 4 Decision:
Since the calculated value of $\chi^2$ is less than the table value, $H_0$ is accepted.
This means that the accidents are equally likely to occur on any day of the week.
Example – 6
A sample analysis of examination results of 500 students was made. It was found that 220 students have failed, 170 have secured a third class, 90 have secured a second class and the rest, a first class. Do these figures support the general belief that the above categories are in the ratio 4 : 3 : 2 : 1 respectively?
Solution:
Given: n = 4
STEP − 1 Setting up of Hypothesis
$H_0$: The results in the four categories are in the ratio 4 : 3 : 2 : 1.
$H_1$: The results in the four categories are not in the ratio 4 : 3 : 2 : 1.
STEP − 2 Test statistic:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
On the assumption $H_0$, the expected frequencies of the 4 classes are
$\frac{4}{10} \times 500,\ \frac{3}{10} \times 500,\ \frac{2}{10} \times 500,\ \frac{1}{10} \times 500$, (i.e.) 200, 150, 100, 50.
Category    O      E      O − E    $(O - E)^2$    $(O - E)^2 / E$
Failures    220    200    20       400            2.000
III         170    150    20       400            2.667
II          90     100    −10      100            1.000
I           20     50     −30      900            18.000
Total       500    500                            23.667
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 23.667$$
STEP − 3 Critical value
Degrees of freedom = n − 1 = 4 − 1 = 3
The table value of $\chi^2$ for 3 degrees of freedom at 5% level = 7.815.
STEP − 4 Decision:
Since the calculated value of $\chi^2$ is greater than the table value, $H_0$ is rejected.
The results of the four categories are not in the ratio 4 : 3 : 2 : 1.
10.5 QUESTIONS
PART – A
1. Which of the following is a ‘non-parametric’ test?
a. $\chi^2$ test
b. t-test
c. z-test
d. none of these.
2. For any given level of significance, the $\chi^2$ table value:
a. increases as sample size increases
b. decreases as degrees of freedom decrease
c. increases as degrees of freedom decrease
d. decreases as sample size increases
3. A contingency table:
a. Always involves two degrees of freedom
b. Always involves two dependent variables
c. Always involves two variables
d. All of these
4. The degrees of freedom for a contingency table:
a. Equal $n - 1$
b. Equal $rc - 1$
c. Equal $(r - 1)(c - 1)$
d. cannot be determined.
5. The $\chi^2$ test requires that:
a. Data be measured on a nominal scale
b. Data conform to a normal distribution
c. Expected frequencies are equal in all cells
d. All of these
Answers: 1. (a), 2. (b), 3. (c), 4. (c), 5. (a)
PART – B
1. Give the definition of the Chi – square test.
2. Give the characteristics of the Chi – square test.
3. Define the $\chi^2$ test of goodness of fit.
4. Write down the formula of the test statistic ‘t’ to test the significance of the difference between the means (large samples).
5. Give the main use of the $\chi^2$ test.
6. Write the conditions for the application of the $\chi^2$ test.
7. What is the assumption of the t – test?
8. What do you mean by test of hypothesis – two tailed test?
PART – C
1. The following table gives the number of aircraft accidents that occurred during the various days of the week. Find whether the accidents are uniformly distributed over the week. (Ans: $\chi^2$ = 4.17; Accept $H_0$. Uniformly distributed)
Days: Sun Mon Tue Wed Thu Fri Sat Total
No. of accidents: 14 16 8 12 11 9 14 84
2. The following table shows the number of electricity failures in a town for a period of 180 days.
Failures: 0 1 2 3 4 5 6 7
No. of days: 12 39 47 40 20 17 3 2
Using the $\chi^2$ test, examine whether the data are Poisson distributed. (Ans: $\chi^2$ = 4.359; Accept $H_0$. Poisson distributed)
3. In a locality 100 persons were randomly selected and asked about their educational achievements. The results are given below. Can you say that education depends on sex? (Ans: $\chi^2$ = 9.93; Reject $H_0$)
Education    Middle    High School    College
Male         10        15             25
Female       25        10             15
4. Test whether the income and type of school are independent from the following data. (Ans: $\chi^2$ = 77.783; not independent)
Income    Private Schools    Govt. Schools    Total
Low       506                494              1000
High      438                162              600
Total     944                656              1600
5. Find out whether the new treatment is comparatively superior to the conventional one from the following data. (Ans: $\chi^2$ = 18.18; Reject $H_0$: treatment is dependent)
                Favourable    Non Favourable    Total
Conventional    40            70                110
New             60            30                90
Total           100           100               200
10.6 SUGGESTED READINGS
1. P.N. Arora & S. Arora, “Statistics for Management”, S. Chand & Company Ltd, New Delhi, 2003.
2. Richard I. Levin & David S. Rubin, “Statistics for Management”, Pearson Education, Singapore, 2004.
3. Amir D. Aczel, “Business Statistics”, Tata McGraw Hill, Edition 2002.
CHAPTER – XI
F – TEST AND ANALYSIS OF VARIANCE (ANOVA)
STRUCTURE
11.1 INTRODUCTION
11.2 THE F – DISTRIBUTION
11.3 ANALYSIS OF VARIANCE APPROACH
11.4 ONE WAY CLASSIFICATION ANOVA
11.5 TWO WAY CLASSIFICATION OF ANOVA
11.6 QUESTIONS
11.7 SUGGESTED READINGS
11.1 INTRODUCTION
In this chapter, we develop a method for comparing several population means at the same time. This method is known as analysis of variance, though its abbreviated form, ANOVA, is more frequently used. The analysis of variance tests are performed using the F – distribution.
11.2 THE F – DISTRIBUTION
Suppose that two independent normal populations are of interest, where the population means and variances, say $\mu_1, \sigma_1^2, \mu_2$ and $\sigma_2^2$, are unknown. We wish to test hypotheses about the equality of the two variances, say $H_0: \sigma_1^2 = \sigma_2^2$. Assume that two random samples, of size $n_1$ from population 1 and of size $n_2$ from population 2, are available, and let $s_1^2$ and $s_2^2$ be the sample variances. We wish to test the hypotheses
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
The development of a test procedure for these hypotheses requires a new probability distribution, the F – distribution. If $s_1^2$ and $s_2^2$ are the variances of two samples of sizes $n_1$ and $n_2$ respectively, the estimates of the population variance based on these samples are respectively
$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} \quad \text{and} \quad S_2^2 = \frac{n_2 s_2^2}{n_2 - 1}.$$
The quantities $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ are called the degrees of freedom of these estimates. We want to test whether these estimates $S_1^2$ and $S_2^2$ are significantly different, or whether the samples may be regarded as drawn from the same population or from two populations with the same variance $\sigma^2$.
Let
$$F = \frac{S_1^2}{S_2^2} = \frac{n_1 s_1^2 / (n_1 - 1)}{n_2 s_2^2 / (n_2 - 1)}$$
11.2.1 Applications:
The square of the t – variate with n degrees of freedom follows an F – distribution with 1 and n degrees of freedom.
1. The mean of the F – distribution is $\dfrac{\nu_2}{\nu_2 - 2}$ $(\nu_2 > 2)$.
2. The variance of the F – distribution is $\dfrac{2 \nu_2^2 (\nu_1 + \nu_2 - 2)}{\nu_1 (\nu_2 - 2)^2 (\nu_2 - 4)}$ $(\nu_2 > 4)$.
3. We should always make F > 1. This is done by taking the larger of the two estimates of $\sigma^2$ as $S_1^2$ and by taking the corresponding degrees of freedom as $\nu_1$.
4. The F – test is used to test
i. Whether two independent samples have been drawn from normal populations with the same variance $\sigma^2$, or
ii. Whether the two independent estimates of the population variance are homogeneous or not.
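The computation of the F ratio can be sketched in Python. This is a minimal illustration, assuming the sample variances $s^2$ are computed with divisor n, as in the estimates defined above; the function names are hypothetical. The larger estimate is placed in the numerator so that F is at least 1, and the degrees of freedom are returned in the matching order. The figures used are those of the two worked examples in this section.

```python
def variance_estimate(n, s_squared):
    """Estimate of the population variance: S^2 = n * s^2 / (n - 1)."""
    return n * s_squared / (n - 1)


def f_ratio(n1, s1_squared, n2, s2_squared):
    """Return (F, numerator d.f., denominator d.f.) with the larger estimate on top."""
    est1 = variance_estimate(n1, s1_squared)
    est2 = variance_estimate(n2, s2_squared)
    if est1 >= est2:
        return est1 / est2, n1 - 1, n2 - 1
    return est2 / est1, n2 - 1, n1 - 1


# Samples of sizes 11 and 9 with standard deviations 0.8 and 0.5.
f_value, v1, v2 = f_ratio(11, 0.8 ** 2, 9, 0.5 ** 2)
print(round(f_value, 2), v1, v2)

# Samples of sizes 10 and 8 with sample variances 10.24 and 10.25.
f_value2, v1b, v2b = f_ratio(10, 10.24, 8, 10.25)
print(round(f_value2, 4), v1b, v2b)
```

In the second call the estimate from the smaller sample is the larger one, so the degrees of freedom come back as (7, 9) rather than (9, 7); the table value should be looked up in that order.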
11.2.2 Assumptions
For the validity of the F – test in ANOVA, the following assumptions are made:
i. The observations are independent,
ii. The parent population from which the observations are taken is normal, and
iii. Various treatment and environmental effects are additive in nature.
The following abbreviations are used:
S.S.C - Between sum of squares (Column)
T.S.S - Total sum of squares
S.S.E - Error sum of squares (or) within sum of squares
C.F - Correction factor
S.S.R - Sum of squares between rows
M.S.C - Mean sum of squares (between columns)
M.S.E - Mean sum of squares (within columns)
M.S.R - Mean sum of squares (between rows)
N - Number of observations
N1 - Number of elements in each column
N2 - Number of elements in each row
Example – 1
Water melons were grown under two experimental conditions. Two random samples of 11 and 9 water melons show the sample standard deviations of their weights as 0.8 and 0.5 respectively. Assuming that the weight distributions are normal, test the hypothesis that the true variances are equal, against the alternative that they are not, at the 10% level. [Assume that $P(F_{10,8} \geq 3.35) = 0.05$ and $P(F_{8,10} \geq 3.07) = 0.05$]
Solution:
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$
Alternate Hypothesis: $H_1: \sigma_1^2 \neq \sigma_2^2$
STEP − 2 Test statistic:
$n_1 = 11$, $n_2 = 9$, $s_1 = 0.8$, $s_2 = 0.5$
$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{11 \times (0.8)^2}{10} = 0.704$$
$$S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{9 \times (0.5)^2}{8} = 0.28125$$
$$F = \frac{S_1^2}{S_2^2} = \frac{0.704}{0.28125} = 2.5$$
STEP − 3 Critical Value
Level of Significance, $\alpha = 0.10$
Degrees of freedom, $\nu_1 = n_1 - 1 = 11 - 1 = 10$ and $\nu_2 = n_2 - 1 = 9 - 1 = 8$
$F_{10,8}(0.05) = 3.35$ and $F_{10,8}(0.95) = \dfrac{1}{F_{8,10}(0.05)} = \dfrac{1}{3.07} = 0.33$ [From F table]
STEP − 4 Decision:
Since 2.5 lies between 0.33 and 3.35, we accept $H_0$ at 10% level of significance.
Example – 2
A group of 10 rats fed on diet A and another group of 8 rats fed on diet B recorded the following increase in weight.
Diet A: 5 6 8 1 12 4 3 9 6 10
Diet B: 2 3 6 8 10 1 2 8
Find if the variances are significantly different.
Solution:
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$
Alternate Hypothesis: $H_1: \sigma_1^2 \neq \sigma_2^2$
STEP − 2 Test statistic:
$n_1 = 10$, $n_2 = 8$
$x_1$: 5 6 8 1 12 4 3 9 6 10; $\sum x_1 = 64$
$x_1^2$: 25 36 64 1 144 16 9 81 36 100; $\sum x_1^2 = 512$
$x_2$: 2 3 6 8 10 1 2 8; $\sum x_2 = 40$
$x_2^2$: 4 9 36 64 100 1 4 64; $\sum x_2^2 = 282$
$$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{64}{10} = 6.4, \qquad \bar{x}_2 = \frac{\sum x_2}{n_2} = \frac{40}{8} = 5$$
$$s_1^2 = \frac{\sum x_1^2}{n_1} - \bar{x}_1^2 = \frac{512}{10} - (6.4)^2 = 10.24$$
$$s_2^2 = \frac{\sum x_2^2}{n_2} - \bar{x}_2^2 = \frac{282}{8} - 25 = 10.25$$
$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{10 \times 10.24}{9} = 11.3777$$
$$S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{8 \times 10.25}{7} = 11.7143$$
Since $S_2^2$ is the larger estimate, it is taken as the numerator:
$$F = \frac{S_2^2}{S_1^2} = \frac{11.7143}{11.3777} = 1.02958$$
STEP − 3 Critical Value
Level of Significance, $\alpha = 0.05$
Degrees of freedom: numerator $n_2 - 1 = 8 - 1 = 7$, denominator $n_1 - 1 = 10 - 1 = 9$
The table value of F for (7, 9) degrees of freedom at 5% level = 3.29
STEP − 4 Decision:
Since 1.02958 is less than 3.29, we accept $H_0$ at 5% level of significance.
Hence we conclude that the two samples have come from populations with equal variances.
11.3 ANALYSIS OF VARIANCE APPROACH
The analysis of variance is a powerful statistical tool for tests of significance. The term “Analysis of Variance” was introduced by Prof. R.A. Fisher in the 1920s to deal with problems in the analysis of agronomical data. Variation is inherent in nature. The total variation in any set of numerical data is due to a number of causes, which may be classified as
i. Assignable causes and
ii. Chance causes
The first step in the analysis of variance is to partition the total variation in the sample data into the following two component variations, in such a way that it is possible to estimate the contribution of the factors that may cause variation.
1. The amount of variation among the sample means, or the variation attributable to the difference among sample means. This variation is due to assignable causes.
2. The amount of variation within the sample observations. This difference is considered due to chance causes or experimental (random) errors.
The observations in the sample data may be classified according to one factor (criterion) or two factors (criteria). The classifications according to one factor and two factors are called one-way classification and two-way classification, respectively. The calculations for total variation and its components may be carried out in each of the two types of classifications by
i. direct method
ii. short-cut method
iii. coding method
11.3.1 Assumptions for Analysis of Variance Test
1. The sampled population is normal.
2. The treatment effects are additive.
3. The individuals have been randomly selected from the population.
4. The variance between the samples is constant. It is supposed that, apart from affecting the mean of the samples, different treatments do not change the variances of the samples.
5. Experimental errors should be homogeneous in their variance and independent.
11.3.2 Techniques of Analysis of Variance
The technique of analysis of variance is discussed under three types:
1. One way classification (CRD)
2. Two way classification (RBD)
3. Three way classification (LSD)
11.4 ONE WAY CLASSIFICATION
Here the data are classified on the basis of one criterion. Suppose a sample of N values of a given variate x is subdivided into k classes according to some criterion of classification. Let the $i$-th class consist of $n_i$ members and let the $j$-th member of the $i$-th class be denoted by $x_{ij}$.
Example – 3
The following are the numbers of mistakes made in 5 successive days by 4 technicians working for a photographic laboratory:
Technician I (X1)    Technician II (X2)    Technician III (X3)    Technician IV (X4)
6                    14                    10                     9
14                   9                     12                     12
10                   12                    7                      8
8                    10                    15                     10
11                   14                    11                     11
Test at the level of significance $\alpha = 0.01$ whether the differences among the 4 sample means can be attributed to chance.
Solution:
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0$: There is no significant difference between the technicians.
Alternate Hypothesis: $H_1$: There is a significant difference between the technicians.
STEP − 2 Test statistic:
Subtracting 10 from each observation (coding method), the data become
         X1     X2     X3     X4     Total    $X_1^2$   $X_2^2$   $X_3^2$   $X_4^2$
         −4     4      0      −1     −1       16        16        0         1
         4      −1     2      2      7        16        1         4         4
         0      2      −3     −2     −3       0         4         9         4
         −2     0      5      0      3        4         0         25        0
         1      4      1      1      7        1         16        1         1
Total    −1     9      5      0      13       37        37        39        10
N = 20, T = 13
$$\frac{T^2}{N} = \frac{(13)^2}{20} = 8.45$$
$$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - \frac{T^2}{N} = 37 + 37 + 39 + 10 - 8.45 = 114.55$$
$$SSC = \frac{(\sum X_1)^2}{N_1} + \frac{(\sum X_2)^2}{N_1} + \frac{(\sum X_3)^2}{N_1} + \frac{(\sum X_4)^2}{N_1} - \frac{T^2}{N}$$
where $N_1$ = number of elements in each column.
$$SSC = \frac{(-1)^2}{5} + \frac{9^2}{5} + \frac{5^2}{5} + 0 - 8.45 = 0.2 + 16.2 + 5 - 8.45 = 12.95$$
SSE = TSS − SSC = 114.55 − 12.95 = 101.6
STEP − 3 ANOVA
Source of Variation    Sum of squares    d.f.                   Mean square                          Variance ratio
Between columns        SSC = 12.95       C − 1 = 4 − 1 = 3      MSC = SSC/(C − 1) = 12.95/3 = 4.317    $F_c$ = MSE/MSC = 6.35/4.317 = 1.471
Error                  SSE = 101.6       N − C = 20 − 4 = 16    MSE = SSE/(N − C) = 101.6/16 = 6.35
Total                  114.55
Since MSE is greater than MSC, the ratio is taken as MSE/MSC so that it exceeds 1.
Table value at 1% level: $F_c(3, 16) = 5.29$
STEP − 4 Decision:
Cal $F_c$ < Tab $F_c$
Hence we accept $H_0$.
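The one-way computation above can be sketched in Python. This is a minimal sketch, assuming each treatment is supplied as a list of raw observations; the function name `one_way_anova` and the dictionary keys are hypothetical. Because sums of squares are unchanged when a constant is subtracted from every observation, the raw technician data are used directly instead of the coded values.

```python
def one_way_anova(columns):
    """Sums of squares and mean squares for a one-way classification.

    columns: one list of observations per treatment (column).
    """
    n_total = sum(len(col) for col in columns)
    grand_total = sum(sum(col) for col in columns)
    correction = grand_total ** 2 / n_total                  # T^2 / N
    tss = sum(x * x for col in columns for x in col) - correction
    ssc = sum(sum(col) ** 2 / len(col) for col in columns) - correction
    sse = tss - ssc
    df_between = len(columns) - 1                            # C - 1
    df_error = n_total - len(columns)                        # N - C
    return {
        "TSS": tss,
        "SSC": ssc,
        "SSE": sse,
        "MSC": ssc / df_between,
        "MSE": sse / df_error,
        "df": (df_between, df_error),
    }


# Mistakes made by the four technicians in Example 3 (one list per technician).
technicians = [
    [6, 14, 10, 8, 11],
    [14, 9, 12, 10, 14],
    [10, 12, 7, 15, 11],
    [9, 12, 8, 10, 11],
]
result = one_way_anova(technicians)
print({key: round(value, 3) for key, value in result.items() if key != "df"})
```

The sketch returns both mean squares rather than a single ratio, so the reader can form MSC/MSE or, as in the worked example where MSE is the larger, MSE/MSC.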
Example – 4
There are three main brands of a certain powder. A set of 110 sample values is examined and found to be allocated among four groups (A, B, C and D) and three brands (I, II, III) as shown hereunder:
Brands    Groups
          A     B     C     D
I         0     4     8     15
II        5     8     13    6
III       8     19    11    13
Is there any significant difference in brand preference? Answer at 5% level.
Solution:
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0$: There is no significant difference in brands.
Alternate Hypothesis: $H_1$: There is a significant difference in brands.
STEP − 2 Test statistic:
Brands      A (X1)   B (X2)   C (X3)   D (X4)   Total   $X_1^2$   $X_2^2$   $X_3^2$   $X_4^2$
I (Y1)      0        4        8        15       27      0         16        64        225
II (Y2)     5        8        13       6        32      25        64        169       36
III (Y3)    8        19       11       13       51      64        361       121       169
Total       13       31       32       34       110     89        441       354       430
N = 12, T = 110
$$\frac{T^2}{N} = \frac{(110)^2}{12} = 1008.3$$
$$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - \frac{T^2}{N} = 89 + 441 + 354 + 430 - 1008.3 = 305.7$$
$$SSR = \frac{(\sum Y_1)^2}{N_2} + \frac{(\sum Y_2)^2}{N_2} + \frac{(\sum Y_3)^2}{N_2} - \frac{T^2}{N}$$
where $N_2$ = number of elements in each row.
$$SSR = \frac{27^2}{4} + \frac{32^2}{4} + \frac{51^2}{4} - 1008.3 = 80.2$$
SSE = TSS − SSR = 305.7 − 80.2 = 225.5
STEP − 3 ANOVA
Source of Variation    Sum of squares    d.f.                  Mean square                        Variance ratio
Between rows           SSR = 80.2        r − 1 = 3 − 1 = 2     MSR = SSR/(r − 1) = 80.2/2 = 40.1    $F_R$ = MSR/MSE = 40.1/25.06 = 1.60
Error                  SSE = 225.5       N − r = 12 − 3 = 9    MSE = SSE/(N − r) = 225.5/9 = 25.06
Total                  305.7
Table value at 5% level: $F_R(2, 9) = 4.26$
STEP − 4 Decision:
Cal $F_R$ < Tab $F_R$
Hence we accept $H_0$.
11.5 TWO WAY CLASSIFICATION OF ANOVA
Example – 5
Perform two – way ANOVA for the table given below:
Plots of Land    Treatment
                 A     B     C     D
I                38    40    41    39
II               45    42    49    36
III              40    38    42    42
Use the coding method, subtracting 40 from the given numbers.
Solution:
Subtract 40 from all the numbers. Doing so leaves the F ratio unaffected and reduces the numbers to smaller values.
            A (X1)   B (X2)   C (X3)   D (X4)   Total   $X_1^2$   $X_2^2$   $X_3^2$   $X_4^2$
I (Y1)      −2       0        1        −1       −2      4         0         1         1
II (Y2)     5        2        9        −4       12      25        4         81        16
III (Y3)    0        −2       2        2        2       0         4         4         4
Total       3        0        12       −3       12      29        8         86        21
STEP − 1 Setting up of Hypothesis
Null Hypothesis: $H_0$: There is no significant difference between column means as well as row means.
Alternate Hypothesis: $H_1$: There is a significant difference between column means or the row means.
STEP − 2 Test statistic:
N = 3 + 3 + 3 + 3 = 12
T = 3 + 0 + 12 − 3 = 12
$$\frac{T^2}{N} = \frac{(12)^2}{12} = 12$$
$$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - \frac{T^2}{N} = 29 + 8 + 86 + 21 - 12 = 132$$
$$SSC = \frac{(\sum X_1)^2}{N_1} + \frac{(\sum X_2)^2}{N_1} + \frac{(\sum X_3)^2}{N_1} + \frac{(\sum X_4)^2}{N_1} - \frac{T^2}{N}$$
where $N_1$ = number of elements in each column.
$$SSC = \frac{9}{3} + \frac{0}{3} + \frac{144}{3} + \frac{9}{3} - 12 = 3 + 0 + 48 + 3 - 12 = 42$$
$$SSR = \frac{(\sum Y_1)^2}{N_2} + \frac{(\sum Y_2)^2}{N_2} + \frac{(\sum Y_3)^2}{N_2} - \frac{T^2}{N}$$
where $N_2$ = number of elements in each row.
$$SSR = \frac{4}{4} + \frac{144}{4} + \frac{4}{4} - 12 = 26$$
SSE = TSS − SSC − SSR = 132 − 42 − 26 = 64
STEP − 3 ANOVA
Source of Variation    Sum of squares    d.f.                 Mean square                               Variance ratio                       Table value at 5% level
Between columns        SSC = 42          C − 1 = 4 − 1 = 3    MSC = SSC/(C − 1) = 42/3 = 14             $F_c$ = MSC/MSE = 14/10.67 = 1.31      $F_c(3, 6) = 4.76$
Between rows           SSR = 26          r − 1 = 3 − 1 = 2    MSR = SSR/(r − 1) = 26/2 = 13             $F_R$ = MSR/MSE = 13/10.67 = 1.22      $F_R(2, 6) = 5.14$
Error                  SSE = 64          N − C − r + 1 = 6    MSE = SSE/(N − C − r + 1) = 64/6 = 10.67
Total                  TSS = 132         11
STEP − 4 Decision:
In both cases the calculated value of F < the table value of F. Hence we accept $H_0$: there is no significant difference between column means as well as row means.
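The two-way computation can be sketched in Python in the same spirit. This is a minimal sketch, assuming one observation per cell and a table given as a list of rows; the function name `two_way_anova` and the dictionary keys are hypothetical. The raw plot data of Example 5 are used, since subtracting 40 from every value does not change any of the sums of squares.

```python
def two_way_anova(table):
    """Sums of squares and variance ratios for a two-way classification.

    table: list of rows, with exactly one observation per cell.
    """
    r = len(table)
    c = len(table[0])
    n_total = r * c
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n_total                      # T^2 / N
    tss = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction   # between rows
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction  # between columns
    sse = tss - ssc - ssr
    mse = sse / ((r - 1) * (c - 1))
    return {
        "TSS": tss,
        "SSC": ssc,
        "SSR": ssr,
        "SSE": sse,
        "F_columns": (ssc / (c - 1)) / mse,
        "F_rows": (ssr / (r - 1)) / mse,
    }


# Rows are plots I, II, III; columns are treatments A, B, C, D (Example 5).
plots = [
    [38, 40, 41, 39],
    [45, 42, 49, 36],
    [40, 38, 42, 42],
]
anova = two_way_anova(plots)
print({key: round(value, 2) for key, value in anova.items()})
```

The error degrees of freedom are written as (r − 1)(c − 1), which equals N − C − r + 1 when there is one observation per cell.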
11.6 QUESTIONS
PART – A
1. The number of parts into which the total variance in a two-way analysis of variance is partitioned is
a. 2
b. 3
c. 4
d. None of these
2. Any difference among the population means in the analysis of variance will inflate the expected value of
a. MSE
b. MSB
c. MSW
d. All of these
3. If data in a two-way classification is displayed in r rows and c columns, then the degrees of freedom will be
r 1
c.
b.
c 1
c. r
c 1
d.
r 1c 1
4. The degrees of freedom associated with the denominator of F-test in the analysis of variance are
k n 1
d.
b.
nk 1
c.
nk 1
d. none of these
5. The error sum of squares can be obtained from the equation:
a. SSE = SST + SSR + SSC
b. SSE = SSR + SSC – SST
c. SSE = SST – SSR – SSC
d. None of these
Answers: 1. (b), 2. (b), 3. (d), 4. (a), 5. (c)
PART – B
1. What are the major assumptions of ANOVA?
2. How is the analysis of variance technique helpful in solving business problems? Illustrate your answer with suitable examples.
3. Distinguish between one-way and two-way classifications to test the equality of population means.
4. What is meant by the critical value used in the analysis of variance? How is it found?
5. How is the F-distribution related to Student's t-distribution and the chi-square distribution? What important hypothesis can be tested by the F-distribution?
PART – C
1. Two random samples drawn from two normal populations are:
Sample I: 20, 16, 26, 27, 23, 22, 18, 24, 25, 19
Sample II: 27, 33, 42, 35, 32, 34, 38, 28, 41, 43, 30, 37
Obtain estimates of the variances of the populations and test whether the populations have the same variance.
2. Two random samples of sizes 8 and 7 had the following values of the variables.
Sample A: 9, 11, 13, 11, 15, 9, 12, 14
Sample B: 10, 12, 10, 14, 9, 8, 10
Do the estimates of population variance differ significantly?
3. Weekly sales (in Rs.) in small shops in 3 towns A, B and C are as follows.
A: 620, 600, 740
B: 410, 380, 350
C: 920, 870, 1040, 800, 1030, 1010
Can we conclude that the shops in the 3 towns have the same average sales? Given $F_{0.05}(2, 9) = 4.26$.
4. Three different machines are used for a production. On the basis of the outputs, set up a one-way ANOVA table and test whether the machines are equally effective.

| Machine I | Machine II | Machine III |
| --- | --- | --- |
| 10 | 9 | 20 |
| 15 | 7 | 16 |
| 11 | 5 | 10 |
| 10 | 6 | 14 |

Given that the value of F at the 5% level of significance for (2, 9) d.f. is 4.26.
5. Perform a two-way classification analysis of variance on the following data.

| Varieties | Block 1 | Block 2 | Block 3 | Block 4 |
| --- | --- | --- | --- | --- |
| A | 4 | 8 | 6 | 8 |
| B | 5 | 5 | 7 | 8 |
| C | 6 | 7 | 9 | 5 |

Given $F_{0.05}(2, 6) = 5.14$, $F_{0.05}(6, 2) = 19.33$, $F_{0.05}(3, 6) = 4.76$.
6. The following data represent the number of units of production per day turned out by 5 different workers using 4 different types of machines.

| Workers | Machine type A | Machine type B | Machine type C | Machine type D |
| --- | --- | --- | --- | --- |
| 1 | 44 | 38 | 47 | 36 |
| 2 | 46 | 40 | 52 | 43 |
| 3 | 34 | 36 | 44 | 32 |
| 4 | 43 | 38 | 46 | 33 |
| 5 | 38 | 42 | 49 | 39 |

a. Test whether the 5 men differ with respect to mean productivity.
b. Test whether the mean productivity is the same for the four different machine types.
11.7 SUGGESTED READINGS
1. P.N. Arora & S. Arora, “Statistics for Management”, S. Chand & Company Ltd, New Delhi, 2003.
2. Sheldon M. Ross, “Introductory Statistics”, Academic Press, London, 2005.
3. Dr. Parimal Mukhopadhyay, “Applied Statistics”, Arunabha Son Books & Allied Pvt Ltd, Kolkata, 2005.
CHAPTER – XII TIME SERIES ANALYSIS
STRUCTURE
12.1 INTRODUCTION
12.2 COMPONENTS OF TIME SERIES
12.3 SECULAR TREND
12.4 METHODS OF MEASURING TREND
12.5 SEASONAL VARIATIONS
12.6 CYCLICAL VARIATIONS
12.7 FORECASTING
12.8 QUESTIONS
12.9 SUGGESTED READINGS

TIME SERIES ANALYSIS
12.1 INTRODUCTION
“A time series is a set of statistical observations arranged in chronological order.”
“A time series may be defined as a collection of readings belonging to different time periods, of some economic variable or composite of variables.” In other words, a time series is a set of observations of a variable, usually recorded at equal intervals of time. The interval may be a year, a month, a week, a day or even an hour.
Hourly temperature readings, daily sales and monthly production are examples of time series. A number of factors affect the observations of a time series continuously, some at regular intervals and others erratically. Studying, interpreting and analysing these factors is called the analysis of time series.
The primary purpose of the analysis of time series is to discover and measure all types of variations which characterize a time series. The central objective is to decompose the series into the various elements present in it and to use them in business decision making.
12.2 COMPONENTS OF TIME SERIES: The components of a time series are the various elements which can be segregated from the observed data. The following is the broad classification of these components.
[Diagram: classification of the components of a time series into long-term and short-term movements, covering secular trend, cyclical, seasonal (regular) and irregular (or erratic) variations.]
In time series analysis, it is assumed that there is a multiplicative relationship between these four components. Symbolically,
$$Y = T \times S \times C \times I$$
where Y denotes the result of the four elements; T = trend; S = seasonal component; C = cyclical component; I = irregular component.
In the multiplicative model it is assumed that the four components are due to different causes, but they are not necessarily independent and they can affect one another.
Another approach is to treat each observation of a time series as the sum of these four components. Symbolically,
$$Y = T + S + C + I$$
The additive model assumes that all the components of the time series are independent of one another.
The four components are:
1) Secular trend, or long-term movement, or simply trend
2) Seasonal variations
3) Cyclical variations
4) Irregular, erratic or random movements (fluctuations)
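As a small numerical illustration of the two models stated above, the following sketch combines one set of component values in each way. All the figures are hypothetical and are chosen only for illustration. It is assumed here, as is usual for the multiplicative model, that S, C and I are expressed as ratios around 1, while in the additive model they are expressed in the same units as the trend.

```python
def multiplicative(t, s, c, i):
    """Y = T x S x C x I, with S, C and I given as ratios around 1 (assumption)."""
    return t * s * c * i


def additive(t, s, c, i):
    """Y = T + S + C + I, with all components in the units of the series."""
    return t + s + c + i


# Hypothetical components for a single period.
y_mult = multiplicative(200.0, 1.10, 0.95, 1.02)   # trend 200, season +10%, cycle -5%, irregular +2%
y_add = additive(200.0, 20.0, -10.0, 4.0)          # trend 200, season +20, cycle -10, irregular +4

print(round(y_mult, 2), y_add)
```

In the multiplicative case, dividing an observed value by its seasonal ratio removes the seasonal effect; in the additive case the seasonal amount is subtracted instead.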
12.3 SECULAR TREND: It is the long-term movement in a time series. The general tendency of a time series to increase, decrease or stagnate over a long period of time is called the secular trend, or simply the trend. Population growth, technological progress and changes in consumers' tastes are among the factors behind an upward trend. A downward trend may be noticed in deaths and epidemics, owing to improved medical facilities and sanitation. Thus a time series shows fluctuations in the upward or downward direction in the long run.
12.4 METHODS OF MEASURING TREND: Trend is measured by the following mathematical methods.
1. Graphical method
2. Method of semi-averages
3. Method of moving averages
4. Method of least squares
12.4.1 GRAPHICAL METHOD: This is the easiest and simplest method of measuring trend. In this method, the given data are plotted on a graph, taking time on the horizontal axis and the values on the vertical axis. A smooth curve is then drawn to show the direction of the trend. While fitting a trend line, the following points should be noted in order to get a good trend line.
(i) The curve should be smooth.
(ii) As far as possible, there must be an equal number of points above and below the trend line.
(iii) The sum of the squares of the vertical deviations from the trend should be as small as possible.
(iv) If there are cycles, an equal number of cycles should lie above and below the trend line.
(v) In the case of cyclical data, the areas of the cycles above and below the trend line should be nearly equal.
Example 1: Fit a trend line to the following data by the graphical method.

| Year | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sales (in Rs ’000) | 60 | 72 | 75 | 65 | 80 | 85 | 95 |
Merits:
1. It is the simplest and easiest method. It saves time and labour.
2. It can be used to describe all kinds of trends.
3. It can be used widely in applications.
4. It helps to understand the character of a time series and to select an appropriate trend.
Demerits:
1. It is highly subjective. Different persons will obtain different trend curves from the same set of data.
2. It is dangerous to use a freehand trend for forecasting purposes.
3. It does not enable us to measure trend in precise quantitative terms.
12.4.2 METHOD OF SEMI-AVERAGES: In this method, the given data are divided into two parts, preferably with the same number of years. For example, if we are given data from 1981 to 1998, i.e., over a period of 18 years, the two equal parts will be the first 9 years, 1981 to 1989, and the last 9 years, 1990 to 1998. In the case of an odd number of years such as 5, 7, 9 or 11, two equal parts can be made simply by omitting the middle year. For example, if the data are given for 7 years from 1991 to 1997, the two equal parts would be 1991 to 1993 and 1995 to 1997, and the middle year 1994 would be omitted.
After the data have been divided into two parts, the average of each part is obtained. Thus we get two points. Each point is plotted at the mid-point of the interval covered by the respective part, and the two points are then joined by a straight line, which gives the required trend line. The line can be extended downwards and upwards to get intermediate values or to predict future values.
Example 2: Draw a trend line by the method of semi-averages.

| Year | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 |
| --- | --- | --- | --- | --- | --- | --- |
| Sales (in Rs ’000) | 60 | 75 | 81 | 110 | 106 | 120 |

Divide the data into two parts by taking three values in each part.

| Year | Sales (Rs ’000) | Semi-total | Semi-average | Trend value |
| --- | --- | --- | --- | --- |
| 1991 | 60 | | | 58.67 |
| 1992 | 75 | 216 | 72 | 72 |
| 1993 | 81 | | | 85.33 |
| 1994 | 110 | | | 98.67 |
| 1995 | 106 | 336 | 112 | 112 |
| 1996 | 120 | | | 125.33 |

Difference in middle periods = 1995 – 1992 = 3 years
Difference in semi-averages = 112 – 72 = 40
Annual increase in trend = 40/3 = 13.33
Trend of 1991 = Trend of 1992 – 13.33 = 72 – 13.33 = 58.67
Trend of 1993 = Trend of 1992 + 13.33 = 72 + 13.33 = 85.33
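The semi-average calculation can be sketched in a few lines of code. The function name `semi_average_trend` is an assumption made for this illustration, and the sketch assumes equally spaced years. It uses the sales figures of Example 2 as printed; since 110 + 106 + 120 = 336, the second semi-average computed from those figures is 112 and the annual increase is 40/3.

```python
def semi_average_trend(years, values):
    """Trend values by the method of semi-averages (equally spaced years assumed)."""
    n = len(values)
    half = n // 2
    # With an odd number of years the middle year is left out of both halves.
    avg1 = sum(values[:half]) / half
    avg2 = sum(values[n - half:]) / half
    # Each semi-average is placed at the mid-point of the years it covers.
    mid1 = sum(years[:half]) / half
    mid2 = sum(years[n - half:]) / half
    annual_change = (avg2 - avg1) / (mid2 - mid1)
    return [avg1 + annual_change * (year - mid1) for year in years]


years = [1991, 1992, 1993, 1994, 1995, 1996]
sales = [60, 75, 81, 110, 106, 120]          # Rs '000, as printed in Example 2
trend = semi_average_trend(years, sales)
print([round(t, 2) for t in trend])
```

Extending the same straight line beyond 1996 gives the predicted values mentioned in the description of the method.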
Similarly, we can find all the other trend values. The trend line is obtained by plotting these trend values on a graph.
12.5 SEASONAL VARIATIONS: Seasonal variations are fluctuations that occur within a year, season after season. The factors that cause seasonal variations are
i) Climate and weather conditions.
ii) Customs and traditional habits.
For example, the sale of ice cream increases in summer, the sale of umbrellas increases in the rainy season, the sale of woollen clothes increases in winter, and agricultural production depends upon the monsoon. Similarly, the price of gold rises in the marriage season, and the sale of crackers and new clothes increases at festival times. Seasonal variations are therefore of great importance to businessmen, producers and sellers in planning for the future. The main objective of the measurement of seasonal variations is to study their effect and isolate them from the trend.
12.5.1 MEASUREMENT OF SEASONAL VARIATION: The following are some of the methods more popularly used for measuring seasonal variations.
1. Method of simple averages.
2. Ratio to trend method.
3. Ratio to moving average method.
4. Link relative method.
Of these four methods, the method of simple averages is the easiest for computing seasonal variations.
12.5.2 METHOD OF SIMPLE AVERAGES: The steps of the calculation are:
i) Arrange the data season-wise.
ii) Compute the average for each season.
iii) Calculate the grand average, which is the average of the seasonal averages.
iv) Obtain the seasonal indices by expressing each seasonal average as a percentage of the grand average.
The total of these indices would be 100n, where ‘n’ is the number of seasons in the year.
Example 8: Find the seasonal variations by the simple average method for the data given below.

| Year | Quarter I | Quarter II | Quarter III | Quarter IV |
| --- | --- | --- | --- | --- |
| 1989 | 30 | 40 | 36 | 34 |
| 1990 | 34 | 52 | 50 | 44 |
| 1991 | 40 | 58 | 54 | 48 |
| 1992 | 54 | 76 | 68 | 62 |
| 1993 | 80 | 92 | 86 | 82 |

Solution:

| Year | Quarter I | Quarter II | Quarter III | Quarter IV |
| --- | --- | --- | --- | --- |
| 1989 | 30 | 40 | 36 | 34 |
| 1990 | 34 | 52 | 50 | 44 |
| 1991 | 40 | 58 | 54 | 48 |
| 1992 | 54 | 76 | 68 | 62 |
| 1993 | 80 | 92 | 86 | 82 |
| Total | 238 | 318 | 294 | 270 |
| Average | 47.6 | 63.6 | 58.8 | 54 |
| Seasonal indices | 85 | 113.6 | 105 | 96.4 |

$$\text{Grand average} = \frac{47.6 + 63.6 + 58.8 + 54}{4} = \frac{224}{4} = 56$$

$$\text{Seasonal index for I quarter} = \frac{\text{First quarterly average}}{\text{Grand average}} \times 100 = \frac{47.6}{56} \times 100 = 85$$

$$\text{Seasonal index for II quarter} = \frac{\text{Second quarterly average}}{\text{Grand average}} \times 100 = \frac{63.6}{56} \times 100 = 113.6$$

$$\text{Seasonal index for III quarter} = \frac{\text{Third quarterly average}}{\text{Grand average}} \times 100 = \frac{58.8}{56} \times 100 = 105$$

$$\text{Seasonal index for IV quarter} = \frac{\text{Fourth quarterly average}}{\text{Grand average}} \times 100 = \frac{54}{56} \times 100 = 96.4$$
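The four steps of the simple average method can also be written as a short function. This is a minimal sketch; the function name `seasonal_indices` and the dictionary layout (year mapped to its four quarterly values) are assumptions made for the illustration. The data are those of Example 8.

```python
def seasonal_indices(table):
    """Seasonal indices by the method of simple averages.

    `table` maps each year to its list of seasonal values (here, four quarters).
    """
    rows = list(table.values())
    n_seasons = len(rows[0])
    # Step ii: average for each season across the years.
    averages = [sum(row[k] for row in rows) / len(rows) for k in range(n_seasons)]
    # Step iii: grand average = average of the seasonal averages.
    grand_average = sum(averages) / n_seasons
    # Step iv: each seasonal average as a percentage of the grand average.
    return [100 * a / grand_average for a in averages]


example_8 = {
    1989: [30, 40, 36, 34],
    1990: [34, 52, 50, 44],
    1991: [40, 58, 54, 48],
    1992: [54, 76, 68, 62],
    1993: [80, 92, 86, 82],
}
indices = seasonal_indices(example_8)
print([round(x, 1) for x in indices])
```

The indices add up to 400, that is 100n with n = 4 seasons, which is a convenient check on the arithmetic.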
Example 9: Calculate the seasonal indices from the following data using the simple average method.

| Quarter | 1974 | 1975 | 1976 | 1977 | 1978 |
| --- | --- | --- | --- | --- | --- |
| I | 72 | 76 | 74 | 76 | 74 |
| II | 68 | 70 | 66 | 74 | 74 |
| III | 80 | 82 | 84 | 84 | 86 |
| IV | 70 | 74 | 80 | 78 | 82 |

Solution:

| Year | Quarter I | Quarter II | Quarter III | Quarter IV |
| --- | --- | --- | --- | --- |
| 1974 | 72 | 68 | 80 | 70 |
| 1975 | 76 | 70 | 82 | 74 |
| 1976 | 74 | 66 | 84 | 80 |
| 1977 | 76 | 74 | 84 | 78 |
| 1978 | 74 | 74 | 86 | 82 |
| Total | 372 | 352 | 416 | 384 |
| Average | 74.4 | 70.4 | 83.2 | 76.8 |
| Seasonal indices | 97.6 | 92.4 | 109.2 | 100.8 |

$$\text{Grand average} = \frac{74.4 + 70.4 + 83.2 + 76.8}{4} = \frac{304.8}{4} = 76.2$$

$$\text{Seasonal index for I quarter} = \frac{\text{First quarterly average}}{\text{Grand average}} \times 100 = \frac{74.4}{76.2} \times 100 = 97.6$$

$$\text{Seasonal index for II quarter} = \frac{\text{Second quarterly average}}{\text{Grand average}} \times 100 = \frac{70.4}{76.2} \times 100 = 92.4$$

$$\text{Seasonal index for III quarter} = \frac{\text{Third quarterly average}}{\text{Grand average}} \times 100 = \frac{83.2}{76.2} \times 100 = 109.2$$

$$\text{Seasonal index for IV quarter} = \frac{\text{Fourth quarterly average}}{\text{Grand average}} \times 100 = \frac{76.8}{76.2} \times 100 = 100.8$$
12.6 CYCLICAL VARIATIONS: The term cycle refers to the recurrent variations in a time series that extend over a longer period of time, usually two or more years. Most time series relating to economics and business show some kind of cyclical variation. A business cycle consists of the recurrence of the up and down movements of business activity. It is a four-phase cycle, namely
1. Prosperity
2. Decline
3. Depression
4. Recovery
Each phase changes gradually into the following phase. The following diagram illustrates a business cycle.
The study of cyclical variation is extremely useful in framing suitable policies for stabilizing the level of business activity. Businessmen can take timely steps to maintain business during booms and depressions.
12.6.2 IRREGULAR VARIATION: Irregular variations are also called erratic variations. These variations are not regular and do not repeat in a definite pattern. They are caused by wars, earthquakes, strikes, floods, revolutions and the like. This variation is a short-term one, but it affects all the components of the series. There are no statistical techniques for measuring or isolating erratic fluctuations. Therefore the residual that remains after eliminating the systematic components is taken as representing the irregular variations.
12.7 FORECASTING
A very important use of time series data is in forecasting the likely value of a variable in the future. In most cases the forecast is a projection of the trend fitted to the values of the variable over a sufficiently long period by any of the methods discussed in Section 12.4. Adjustments for the seasonal and cyclical character of the series further improve forecasts based on the simple projection of the trend. The importance of forecasting in business and economic fields lies in its role in planning and evaluation. If suitably interpreted, after consideration of other forces such as political and social factors and governmental policies, this statistical technique can be of immense help in decision making.
The success of any business depends on its estimates of the future. On the basis of these estimates a businessman plans his production, stocks, selling markets, arrangement of additional funds and so on. Forecasting is different from prediction and projection. Regression analysis, time series analysis and index numbers are some of the techniques through which predictions and projections are made, whereas forecasting is a method of foretelling the course of business activity based on the analysis of past and present data, combined with a consideration of the ensuing economic policies and circumstances. In particular, forecasting means fore-warning. Forecasts based on statistical analysis are much more reliable than guesswork. According to T.S. Lewis and R.A. Fox, “Forecasting is using the knowledge we have at one time to estimate what will happen at some future moment of time.”
Methods of business forecasting
There are three methods of forecasting:
1. Naïve method
2. Barometric methods
3. Analytical methods
1. Naïve method: It contains only the economic rhythm theory.
2. Barometric methods: These cover
(i) Specific historical analogy
(ii) Lead-lag relationship
(iii) Diffusion method
(iv) Action-reaction theory
3. Analytical methods: These contain
(i) The factor listing method
(ii) Cross-cut analysis theory
(iii) Exponential smoothing
(iv) Econometric methods
12.7.1 THE ECONOMIC RHYTHM THEORY
In this method the manufacturer analyses the time series data of his own firm and forecasts on the basis of the projections so obtained. This method is applicable only to the individual firm for which the data are analysed. The forecasts under this method are not very reliable, as no subjective matters are considered.
12.7.2 DIFFUSION METHOD OF BUSINESS FORECASTING
The diffusion index method is based on the principle that the different factors affecting business do not attain their peaks and troughs simultaneously. There is always a time lag between them. This method has the convenience that one does not have to identify which series leads and which lags. The diffusion index depicts the movement of a broad group of series as a whole, without regard to the individual series. The diffusion index shows the percentage of a given set of series that is expanding in a time period. It should be carefully noted that the peaks and troughs of the diffusion index are not the peaks and troughs of the business cycle. All series do not expand or contract concurrently. Hence if more than 50% of the series are expanding at a given time, business is taken to be in the process of booming, and vice versa. The graphic method is usually employed to work out the diffusion index. The diffusion index can be constructed for a group of business variables such as prices, investments and profits.
12.7.3 CROSS-CUT ANALYSIS THEORY OF BUSINESS FORECASTING
In this method a thorough analysis of all the factors under the present situation is carried out, and an estimate of the composite effect of all the factors is made. This method takes into account the views of managerial staff, economists, consumers and others prior to the forecasting. The forecasts about the future state of the business are made on the basis of an overall assessment of the effect of all the factors.
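The diffusion index described in Section 12.7.2 is simply the percentage of series that are expanding in a period, so it can be sketched as follows. The series names and values are hypothetical, the function name `diffusion_index` is an assumption, and a series is treated as "expanding" when its current value exceeds its previous value, which is one simple reading of the text.

```python
def diffusion_index(previous, current):
    """Percentage of series whose current value is above the previous value."""
    expanding = sum(1 for name in current if current[name] > previous[name])
    return 100.0 * expanding / len(current)


# Hypothetical business series for two consecutive periods.
previous = {"prices": 102.0, "investments": 480.0, "profits": 65.0, "output": 310.0}
current = {"prices": 104.5, "investments": 495.0, "profits": 63.0, "output": 322.0}

index = diffusion_index(previous, current)
phase = "expanding" if index > 50 else "not expanding"
print(index, phase)
```

Here three of the four series are rising, so the index is 75 and, being above 50%, it is read as a sign that business is expanding.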
12.8 QUESTIONS:
PART – A:
1. A time series is a set of values arranged in ____________ order.
2. Quarterly fluctuations observed in a time series represent ________ variation.
3. Periodic changes in a business time series are called ________.
4. A complete cycle passes through __________ stages of phenomenon.
PART – B:
1. What is a time series?
2. Write briefly about seasonal variation.
3. What is cyclic variation?
4. Discuss irregular variation in the context of time series.
5. What do you understand by business forecasting?
PART – C:
1. With the help of graph paper obtain the trend values.

| Year | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Value | 65 | 85 | 95 | 75 | 100 | 80 | 130 |

2. From the following data calculate the 4-yearly moving average and determine the trend values. Find the short-term fluctuations. Plot the original data and the trend on a graph.

| Year | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Value | 50 | 36.5 | 43 | 44.5 | 38.9 | 38.1 | 32.6 | 41.7 | 41.1 | 33.8 |

3. Obtain the seasonal fluctuations from the following time series of the quarterly output of coal for four years.

| Quarter | 2000 | 2001 | 2002 | 2003 |
| --- | --- | --- | --- | --- |
| 1 | 65 | 58 | 70 | 60 |
| 2 | 58 | 63 | 59 | 55 |
| 3 | 56 | 63 | 56 | 51 |
| 4 | 61 | 67 | 52 | 58 |
12.9 SUGGESTED READINGS
1. Montgomery, D.C. and Johnson, L.A., “Forecasting and Time Series”, McGraw Hill.
2. Anderson, O.D., “Time Series Analysis: Theory and Practice I”, North-Holland, Amsterdam, 1982.
3. Gupta, S.C. and Kapoor, V.K., “Fundamentals of Mathematical Statistics”, Sultan Chand and Sons, New Delhi, 1999.
APPLIED OPERATIONS RESEARCH & STATISTICS
Published by
Institute of Management & Technical Studies
Address: E41, Sector 3, Noida (U.P)
www.imtsinstitute.com | Contact: +91 9210989898