QUANTITATIVE RESEARCH METHODOLOGY
NOR IDAYU MAHAT CENTRE FOR UNIVERSITY-INDUSTRY COLLABORATION (CUIC) UNIVERSITI UTARA MALAYSIA 04-928 4098 / noridayu@uum.edu.my
Contents
• Basic concepts
• Exploring your data
  o Statistics and research
• Sampling, techniques and procedures
• Measurements:
  o Scale
  o Adequacy, validity, reliability and sensitivity
• Statistical inference
• Hypothesis testing
• Analysis of difference
• Complex analyses
Basic concept
Research Activity Types
• Pure Basic: Experimental and theoretical work undertaken to acquire new knowledge for the advancement of knowledge.
• Strategic Basic: Experimental and theoretical work undertaken to acquire new knowledge for specified broad areas in the expectation of useful discoveries.
• Applied: Original work undertaken to acquire new knowledge with a specific application in view, e.g. to determine possible uses for the findings of basic research.
• Experimental: Systematic work, using existing knowledge for the purpose of creating new or improved products/processes.
Example
Research Example: Modification of an existing control chart (Nor Idayu Mahat & Sharipah Soaad, 2011)
This study discusses the problem of constructing control charts for multiple quality characteristics when the traditional Hotelling T² fails to detect shifts in the mean or in the relationship among the measured quality characteristics. Alternative control charts based on a modified one-step M-estimator, which is robust towards outliers, are proposed to overcome this weakness... Results from simulation studies showed that the proposed robust control charts offer better performance... when the variables are independent or dependent.
Example
Research Example: The use of Principal Component Analysis in monitoring gear faults (Li et al., 2003)
This paper presents a study that uses principal component analysis to reduce the dimensionality of the feature space and to get an optimal subspace for machine fault classification... The experimental results indicate that the method extracts diagnostic information effectively for gear fault classification and has good potential for application in practice.
Basic concept
Quantitative Research
Scientific application of mathematical principles to the collection, analysis and presentation of numerical data.
Mathematical principles (?)
• Collection – knowledge of the design of surveys and experiments used to obtain information.
• Analysis – processing and analysing the collected information to answer research questions.
• Presentation – interpreting the results obtained from the analysis in a meaningful way.
Basic concept
When is quantitative analysis needed?
• There is a need to present and interpret numerical data.
• There is a need to test defined statements mathematically.
• The aim is to classify variables, count them, and construct statistical models in an attempt to explain what is observed.
• Precise prediction is a major concern.
Basic concept: What is data?
• Words
• Numbers
• Measurements
• Figures
Basic concept: Types of data
o Secondary data
  • Data that has already been collected.
  • It could be raw data or compiled data.
o Secondary sources:
  • Hardcopies – books, articles, directories, conference papers, newspapers, magazines, research reports and market reports.
  • Electronic resources – CD-ROM, on-line databases, internet, videos and broadcasts.
Basic concept: Types of data
o Primary data: the researcher collects the data herself.
o Methods
  • Observation
  • Experiment
  • Interviews: face-to-face interview, focus group, panels
  • Questionnaire
  • Diaries
  • Portfolios
Basic concept: Types of data
Secondary data
• May not match your need.
• Access may be difficult or costly.
• May save some costs and time.
• Allows for longitudinal studies.
• Concern: validity of some secondary data (e.g. internet sources).
Primary data
• Commonly matches your need.
• Original.
• Sometimes involves some costs and time.
• May not be appropriate for longitudinal studies.
• Concern: validity of the process used in collecting the data.
Population and sample Where can we get the data?
Population – all entities (people or items) with the characteristics one wishes to study. Population structure describes the relative numbers of entities with similar characteristics.
Sample – a subset of the entities in the population, used to answer questions about the population as a whole.
Population and sample
Principle of Sampling
o Entities in a sample must be
  • taken from the target population following some standard procedures.
  • able to represent the actual population.
  • adequate in number for the analysis.
  • adequate to supply the information needed to answer the research questions.
Population and sample
[Figure: a population from which Sample A, Sample B and Sample C are drawn]
Basic concept of statistical tools
Before we decide whether to use a population or a sample, let us focus on statistical tools...
• Descriptive statistics: procedures to summarise and describe the important characteristics of a set of measurements. The art of statistics.
• Inferential statistics: procedures to make inferences about population characteristics from information contained in a sample drawn from the target population.
Basic concept of statistical tools
o Probability sampling
  • Every object in the population has a known chance (in the simplest case, an equal chance) of being chosen for the sample.
  • Less biased sampling procedure.
o Nonprobability sampling
  • Objects in a sample are usually selected on the basis of accessibility.
  • Biased sampling procedure.
Sampling methods
Probability sampling
1. Simple random
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling (and multi-stage)
Nonprobability sampling
5. Quota
6. Snow-ball
7. Convenience (opportunity)
8. Purposive
9. Self-selection
Probability sampling
o The researcher must ensure that every object has an equal opportunity for selection.
o Randomisation is a must.
o The techniques are free of systematic selection bias, although random sampling error remains.
Sampling methods: example
In the early stages of planning a school restructuring effort, school district board members are considering a year-round schooling program. For the moment, the board is interested in the degree to which parents/legal guardians favor such a change. A simple random sample (n = 300) of parents/legal guardians was drawn from 1,850 families (only one adult per household) and given a questionnaire.
Sampling methods
1. Simple random sampling
• The ideal choice for obtaining objects at random.
• Every object in the sample must
  o be chosen at random from the population list.
  o have the same chance of being selected.
• Drawbacks
  o A population list is hard to obtain.
  o It is sometimes hard to reach the objects that have been identified.
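A minimal sketch of simple random sampling, using the figures from the school-district example (300 of 1,850 families). The family IDs, the function name and the seed are assumptions made for illustration only.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct objects; every object has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame: family IDs 1..1850.
families = list(range(1, 1851))
sample = simple_random_sample(families, 300, seed=1)
print(len(sample), "families selected")
```

The approach presupposes a complete population list, which is exactly the drawback noted on the slide.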
Sampling methods
2. Systematic sampling
• Sampling procedure
  1. Prepare a list of all objects in the population.
  2. Select the first object at random from the population list.
  3. Select each subsequent object at an interval of k from the previous selection.
  4. Repeat step 3 until the number of objects obtained meets the required sample size.
List of students in Class A (5 students are needed, one at every 7th position)
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
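The Class A example can be sketched as follows. Choosing the random start among the first k positions is an assumption (a common convention), as are the function name and seed.

```python
import random

def systematic_sample(frame, k, n, seed=None):
    """Random start among the first k positions, then every k-th object."""
    rng = random.Random(seed)
    start = rng.randrange(k)        # 0-based index of the first pick
    picks = frame[start::k][:n]
    if len(picks) < n:
        raise ValueError("frame too short for this k and n")
    return picks

students = list(range(1, 41))       # positions 1..40 in Class A
chosen = systematic_sample(students, k=7, n=5, seed=0)
print(chosen)
```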
Sampling methods
3. Stratified sampling
• Sampling procedure
  1. Arrange every object in the population into groups (strata) according to a chosen attribute (e.g. gender, socio-economic status and income).
  2. Select a number of objects at random from each stratum, using either
     • the same percentage for every stratum, or
     • a different percentage for each stratum.
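A short sketch of the two options in step 2 (same percentage for every stratum, or a different percentage per stratum). The population of 60 women and 40 men, the sampling fractions and the function name are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(objects, stratum_of, fraction, seed=None):
    """fraction is one number for all strata, or a dict of stratum -> fraction."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for obj in objects:                       # step 1: group objects into strata
        strata[stratum_of(obj)].append(obj)
    sample = {}
    for name, members in strata.items():      # step 2: random draw within each stratum
        frac = fraction[name] if isinstance(fraction, dict) else fraction
        sample[name] = rng.sample(members, round(frac * len(members)))
    return sample

people = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
equal = stratified_sample(people, lambda p: p[0], 0.10, seed=1)
unequal = stratified_sample(people, lambda p: p[0], {"F": 0.10, "M": 0.25}, seed=1)
print({k: len(v) for k, v in equal.items()}, {k: len(v) for k, v in unequal.items()})
```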
Sampling methods
4. Cluster sampling
• Cluster sampling
  o closely resembles stratified sampling.
  o Clusters are selected at random from the population, then all objects in the selected clusters form the study sample.
• Multi-stage sampling is suitable for cases that involve a geographical structure.
Sampling methods
5. Quota sampling
• Closely resembles stratified sampling, but it is not random.
• Commonly used
  o in studies that involve interviews.
  o when the population size is infinite.
Sampling methods
The researcher fixes how many objects to take for each combination of traits; these numbers are the quotas. Example:

Gender   Age (year)   Quota
Male     20 – 29      56
Male     30 – 44      104
Female   20 – 29      50
Female   30 – 44      110
Sampling methods
6. Snowball sampling
• This method is suitable when objects in the population are hard to locate.
• Sampling strategy:
  1. The researcher first finds an initial object that is suitable for the study.
  2. The second and subsequent objects are identified with the help of the objects already identified.
  3. The objects in the sample are not random.
Sampling methods
7. Convenience: objects are chosen because they are easy to obtain.
8. Purposive: the researcher selects only objects that are suited to achieving the objectives of the study.
9. Self-selection: the sample consists of objects that take part voluntarily.
More sampling methods
• Line-intersect sampling: an element in a region is sampled if a chosen line segment intersects it.
• Panel sampling: a sampling group is chosen (usually at random) and is asked for the same information repeatedly over a period of time.
• Event sampling: the behaviour of interest is recorded at specified intervals.
More sampling methods: Hypothetical data
A set of data that is generated randomly from some known distribution(s).
When can a hypothetical data set be used?
• To test the performance of a new model/approach under in-control conditions.
• To help a researcher identify possible problems with the proposed model/approach.
Hypothetical data: Example
Phase I: construction of control chart
Step 1: Generate 5000 samples of observations, X_ij, i = 1, 2, ..., p and j = 1, 2, ..., n, from N_p(0, I_p).
Step 2: Compute the robust location and scale estimates for each sample.
Step 3: Randomly generate a new observation, X_i, from N_p(0, I_p).
Step 4: Compute the respective T².
Step 5: Identify the UCL at the 95th (99th) percentile of the 5000 T² values in Step 4.
Step 6: Generate 1000 samples of observations, X_ij, i = 1, 2, ..., p and j = 1, 2, ..., n, from the contaminated model.
Step 7: Compute the robust location and scale estimates for each sample.
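Steps 1 to 5 can be sketched in a few lines. This is only an illustration under stated assumptions: p = 2 and n = 20 are arbitrary choices, and the classical sample mean and covariance are swapped in for the robust estimators, which the slides do not specify in detail.

```python
import random

def mean_and_cov(sample):
    """Classical mean vector and 2x2 covariance of a list of (x, y) pairs."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in sample) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in sample) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def t_squared(obs, centre, cov):
    """T² = (x - m)' S^-1 (x - m), written out for p = 2."""
    dx, dy = obs[0] - centre[0], obs[1] - centre[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def simulate_ucl(reps=5000, n=20, level=0.95, seed=0):
    rng = random.Random(seed)

    def draw():
        return (rng.gauss(0, 1), rng.gauss(0, 1))     # one draw from N_2(0, I_2)

    stats = []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]           # Step 1
        centre, cov = mean_and_cov(sample)            # Step 2 (classical, not robust)
        stats.append(t_squared(draw(), centre, cov))  # Steps 3 and 4
    stats.sort()
    return stats[int(level * reps) - 1]               # Step 5: empirical percentile

ucl = simulate_ucl()
print(round(ucl, 2))
```

Replacing `mean_and_cov` with a robust estimator would bring the sketch closer to the procedure described on the slide.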
Checklist...
• Population vs. Sample
• Objects/respondents
• Variables vs. Constant value
• Parameter vs. Estimator
• Randomness
• Types of data:
  • Cross-sectional
  • Time series
  • Functional series
  • Spatial data
Errors in research activities
• Sampling error – caused by sampling design
  o Selection error
  o Estimation error
• Non-sampling error – caused by mistakes in data processing
  o Over / under coverage
  o Processing error
  o Non-response
  o Measurement error
Measurement
Constant value – an actual value or a specific character whose value does not change.
Variable – a character with values that may vary.
Level of measurement:
• Nominal
• Ordinal
• Interval
• Ratio
Nominal
Which of the following daily newspapers have you read during the past month?

                        Read   Not read   Don't know
The Star                ____   ____       ____
The New Straits Times   ____   ____       ____
Berita Harian           ____   ____       ____
Ordinal
• One can ask respondents to place things in rank order.
  Example: Please number each of the factors listed in order of importance in your choice of a new car.
  a. Price ____
  b. Fuel economy ____
  c. Acceleration ____
  d. Safety features ____
• Otherwise, one may use common scales such as the Likert scale, semantic differential scale, Guttman scale and Thurstone scale.
Measurement
Data
• Categorical
  o Nominal
  o Ordinal
• Quantifiable
  o Interval
  o Ratio
Precision increases from nominal to ratio.
Exploring your data
• It is good practice to understand your data before any complex analysis is performed.
• Objectives:
  o To identify any strange behaviour.
  o To determine a suitable technique to apply to the data.
  o For validation purposes.
  o To interpret the obtained results better.
Exploring your data
• Missing value: objects with no value in some variables.
• Some strategies to handle missing values:
  o Exclude objects with missing values.
  o Replace a missing value by the mean of all available values for the relevant variable.
  o Imputation: missing values can be replaced by some suitable numerical entries.
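The first two strategies can be sketched as below. The scores are hypothetical, and marking a missing value with `None` is an assumption of this example.

```python
from statistics import mean

def drop_missing(values):
    """Strategy 1: exclude objects with a missing value."""
    return [v for v in values if v is not None]

def impute_mean(values):
    """Strategy 2: replace each missing value by the mean of the available values."""
    fill = mean(drop_missing(values))
    return [fill if v is None else v for v in values]

scores = [4.0, None, 6.0, 8.0, None]    # hypothetical; None marks a missing value
print(drop_missing(scores))
print(impute_mean(scores))
```

Mean replacement keeps the sample size but shrinks the variance of the variable, which is one reason to compare it with other imputation methods.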
Exploring your data
• Outliers: values that are distinctly different from other values.
• Outliers can bias estimates and so lead to misleading results.
• Strategies to handle outliers:
  o Outliers due to recording errors should be corrected.
  o If the values are genuine, then some thought must be given as to whether or not they should be retained.
Exploring your data
The effect of an outlier in computing the average value.

          Sales (RM)   Sales (RM)
          70.63        70.63
          56.28        56.28
          70.98        70.98
          7.00         70.00
          68.42        68.42
          56.74        56.74
          60.04        60.04
Average   55.73        64.73
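A quick check of the arithmetic, assuming the last figure in each column (55.73 and 64.73) is the average of the seven sales values above it. The variable names are illustrative.

```python
from statistics import mean

with_outlier = [70.63, 56.28, 70.98, 7.00, 68.42, 56.74, 60.04]
corrected = [70.63, 56.28, 70.98, 70.00, 68.42, 56.74, 60.04]

mean_with = mean(with_outlier)
mean_corrected = mean(corrected)
print(round(mean_with, 2), round(mean_corrected, 2))
```

A single mis-recorded value (7.00 instead of 70.00) moves the average by 9 ringgit.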
How to explore your data?
• Tabular display
• Plot (e.g. histogram, bar chart etc.)
  o Often more informative than summary statistics, but limited to 2 or 3 variables at a time.
• Statistical values
  o Common statistics such as the mean and variance can be used.
• Map the data
  o Can be done using e.g. Principal Component Analysis, Factor Analysis, Multidimensional Scaling etc.
Tabular display
Frequency table

Sex      Frequency   Percent   Valid Percent   Cumulative Percent
Female   87          32.2      32.2            32.2
Male     183         67.8      67.8            100.0
Total    270         100.0     100.0

Cross tabulation

         Number of repeated exams
Sex      1     2    3    4    Total
Female   59    12   12   4    87
Male     101   46   21   15   183
Total    160   58   33   19   270

What information can be extracted from these tables?
Tabular display: bad presentation
Pie chart
[Figure: two pie charts of years of experience, one labelled GOOD and one labelled BAD. Categories: 5 or less, 6-10, 11-15, 16-20, 21-35, 36 or more. Slice labels: 7, 8, 16, 18, 25, 26.]
Line chart / series plot
• For continuous measurements.
• Often used to highlight patterns or behaviour of the target variable.
Scatter plot
Bar chart
• An alternative presentation for a table.
• For categorical measurements only.
• Can sometimes be useful for identifying the distribution of the data.
Box and Whiskers
• Suitable for numerical values.
• This plot summarises some important statistics (and features), which include:
  o Median
  o Quartiles
  o Potential outliers
Histogram
Numerical values
• The centre (middle) of the distribution of measurements.
• Some measurements:
  o Mode
  o Median
  o Sum
  o Arithmetic mean
  o Trimmed mean
  o Robust mean
Numerical values
• Represent how the data scatter around the centre point, i.e. around the central tendency value.
• Some measurements:
  o Range
  o Percentile
  o Quartiles; interquartile range (IQR)
  o Variance
  o Standard deviation; coefficient of variation (CV)
  o Standard error of the mean
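Most of the measures of centre and spread listed on these slides are available in Python's `statistics` module. The data below are hypothetical, and the quartiles use the module's default ("exclusive") method, which may differ slightly from other software.

```python
from statistics import mean, median, mode, quantiles, stdev, variance

data = [2, 4, 4, 5, 7, 9, 11]    # hypothetical measurements

centre = {"mode": mode(data), "median": median(data), "mean": mean(data)}

q1, q2, q3 = quantiles(data, n=4)   # quartiles
spread = {
    "range": max(data) - min(data),
    "iqr": q3 - q1,
    "variance": variance(data),                     # sample variance (n - 1)
    "std_dev": stdev(data),
    "cv": stdev(data) / mean(data),                 # coefficient of variation
    "std_error": stdev(data) / len(data) ** 0.5,    # standard error of the mean
}
print(centre)
print(spread)
```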
Weakness of descriptive tools
• Descriptive statistics cannot support broader statements about differences and relationships in the data.
• They cannot be used to draw conclusions or make predictions about the properties of a population when the information comes from a sample.
Statistical inference
• Why is inference about a population necessary?
  o Sometimes relevant facts are abundant.
  o Plots may lead decision makers to conflicting conclusions.
  o Humans are incapable of utilising large amounts of data.
• So, information contained in a sample is used to make inferences about a population. Common methods are
  o estimation.
  o statistical hypothesis testing.
Statistical inference
• Estimation: a process that estimates the value of a parameter of interest. It answers the following question:
  • What is the value of the population parameter?
  • Example: What is the average salary of Malaysians?
• Statistical hypothesis testing: a procedure that tests a hypothesis about the value of a parameter of interest. It answers the following question:
  • Is the parameter value equal to this specific value?
  • Example: Is it true that Malaysians earn RM2200 monthly?
Hypothesis testing
Step 1: Formulate hypotheses.
Step 2: Identify an appropriate test statistic to assess the hypotheses.
Step 3: Compute the test statistic (or the p-value).
Step 4: Compare the test statistic with the critical value of the related distribution (or the p-value with the chosen alpha, α).
Step 5: Make a decision and draw a conclusion.
Hypothesis testing
• Null hypothesis (H0): the hypothesis of no effect, e.g. the process change makes no difference.
• Alternative hypothesis (H1): the choice that can be considered if H0 can be ruled out, e.g. the process change has an effect.
Hypothesis testing
Hypothesis testing: identifying test statistics
• Test statistic: a quantity computed from the sample data.
• Test statistic vs. distribution value (e.g. normal dist., chi-square dist. etc.)
• p-value: the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the one observed.
• Also known as the observed level of significance.
• p-value vs. identified value of α.
Hypothesis testing: decision making
Choose either one:
• If the p-value is less than or equal to α, we have enough evidence to reject H0.
• If the p-value is greater than α, we do not have enough evidence to reject H0 (but it doesn't mean that H0 is true).
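The decision rule can be written as a small function. The default α = 0.05 and the function name are assumptions of this sketch.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when the p-value is at most alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "do not reject H0"

print(decide(0.03))    # p-value below alpha
print(decide(0.20))    # p-value above alpha
```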
ANALYSIS OF DIFFERENCE
• One population comparison
• Two populations comparison
• Multiple populations comparison
One Population Comparison
To test the central value (CT) of a target population.
Various hypothesis tests:
Two-tail test
$H_0: \mathrm{CT} = \mu_0 \qquad H_1: \mathrm{CT} \neq \mu_0$
One-tail tests
$H_0: \mathrm{CT} \le \mu_0 \qquad H_1: \mathrm{CT} > \mu_0$
or
$H_0: \mathrm{CT} \ge \mu_0 \qquad H_1: \mathrm{CT} < \mu_0$
The value $\mu_0$ is known. It might be obtained from previous studies, experts' opinion etc.
One Population Comparison
• Parametric methods
  o Appropriate if the population is normally distributed.
  o Strategy:
    1. Write a research hypothesis.
    2. Choose an appropriate test statistic (either the Z-statistic or the t-statistic) and calculate its value (or p-value) based on the obtained sample.
    3. Check the rejection region. Reject H0 if the p-value is less than the fixed probability of a type I error, α.
    4. Draw conclusions.
One Population Comparison
• Non-parametric methods
  o May be the best choice when the population distribution is highly skewed or heavy-tailed.
  o Often, the median is used.
  o Example methods: sign test and binomial test.
  o Strategy:
    1. Identify the hypothesised value of the population median.
    2. Order the values from the smallest to the largest.
    3. Calculate the sample median, M̂.
    4. Compare the sample median with the population median.
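A minimal sketch of the sign test mentioned above, using the exact binomial distribution. The data and the hypothesised median of 50 are hypothetical, and dropping values equal to the hypothesised median is one common convention, assumed here.

```python
from math import comb

def sign_test(values, median0):
    """Two-sided exact sign test of H0: population median = median0."""
    above = sum(v > median0 for v in values)
    below = sum(v < median0 for v in values)
    n = above + below                  # values equal to median0 are dropped
    k = min(above, below)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

values = [52, 55, 47, 60, 58, 61, 49, 57, 63, 54]   # hypothetical measurements
p_value = sign_test(values, 50)
print(p_value)
```

Here 8 of the 10 values lie above 50, which gives a two-sided p-value of about 0.11, so H0 would not be rejected at α = 0.05.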
Example
Suppose that, normally, the average number of passengers flying with a local airline during school breaks is 270 thousand. We may want to check whether this figure (270) still holds in the current situation.
Mode = 229.00
Median = 265.50
Mean = 280.30
Example Parametric test’s result:
Non-parametric test’s result:
Two Populations Comparison
Aim: to compare the central values of two different populations. (We need to consider whether both populations have a homogeneous variance.)
Inferences about $\mu_1 - \mu_2$: independent samples, with three different cases:
o Both population distributions are normal with equal variance.
o Both sample sizes are large.
o The sample sizes are small and the population distributions are non-normal.
Two Populations Comparison
Two-tail test:
$H_0: \mu_1 = \mu_2 \qquad H_1: \mu_1 \neq \mu_2$
One-tail tests:
$H_0: \mu_1 \le \mu_2 \qquad H_1: \mu_1 > \mu_2$
or
$H_0: \mu_1 \ge \mu_2 \qquad H_1: \mu_1 < \mu_2$
Parametric tests:
- Independent samples t-test with equal variances.
- Independent samples t-test with unequal variances.
Non-parametric tests:
- Mann-Whitney U test
- Wilcoxon Rank Sum test
Example
An experiment was conducted to evaluate the effectiveness of a treatment for tapeworm in the stomachs of sheep. A random sample of 24 worm-infected lambs of approximately the same age and health was randomly divided into two groups: drug-treated sheep and untreated sheep.
Example: initial data analysis
[Figure: initial comparison of the Untreated and Drug treated groups]
What is your expected result?
Parametric test’s result
Non-parametric test’s result:
Two Populations Comparison
Inferences about $\mu_1 - \mu_2$: Paired data
Appropriate for studies in which each measurement in one sample is matched or paired with a particular measurement in the other sample.
Hypothesis
Two-tail test:
$H_0: \mu_1 - \mu_2 = D_0 \qquad H_1: \mu_1 - \mu_2 \neq D_0$
One-tail tests:
$H_0: \mu_1 - \mu_2 \le D_0 \qquad H_1: \mu_1 - \mu_2 > D_0$
or
$H_0: \mu_1 - \mu_2 \ge D_0 \qquad H_1: \mu_1 - \mu_2 < D_0$
Example
To compare the wearing qualities of two automobile tires, A and B, one tire of type A and one of type B are randomly assigned and mounted on the rear wheels of each of five automobiles. The automobiles are then operated for a specified number of miles, and the amount of wear is recorded for each tire.

Automobile   Tire A   Tire B
1            10.6     10.2
2            9.8      9.4
3            12.3     11.8
4            9.7      9.1
5            8.8      8.3

Mean (A) = 10.24        Mean (B) = 9.76
Std. dev (A) = 1.32     Std. dev (B) = 1.33
Example
Independent Samples Test (wear)
Levene's Test for Equality of Variances: F = .003, Sig. = .960
t-test for Equality of Means:

                              t      df      Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       .574   8       .582              .4800             .8362                   -1.4482        2.4082
Equal variances not assumed   .574   7.999   .582              .4800             .8362                   -1.4483        2.4083

Paired Samples Test (Pair 1: wearA - wearB)
Paired Differences: Mean = .4800, Std. Deviation = .0837, Std. Error Mean = .0374
95% Confidence Interval of the Difference: Lower = .3761, Upper = .5839
t = 12.829, df = 4, Sig. (2-tailed) = .000
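The two t statistics in the output can be reproduced by hand from the tire data, which shows why pairing matters here. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

tire_a = [10.6, 9.8, 12.3, 9.7, 8.8]
tire_b = [10.2, 9.4, 11.8, 9.1, 8.3]
n = len(tire_a)

# Independent-samples t (equal variances assumed): ignores the pairing.
pooled_var = (variance(tire_a) + variance(tire_b)) / 2    # valid for equal sample sizes
t_independent = (mean(tire_a) - mean(tire_b)) / sqrt(pooled_var * 2 / n)

# Paired t: based on the within-automobile differences.
diffs = [a - b for a, b in zip(tire_a, tire_b)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(n))

print(round(t_independent, 3), round(t_paired, 3))
```

The mean difference (0.48) is the same in both tests. The paired test removes the large automobile-to-automobile variation, so its standard error is much smaller and the difference becomes clearly detectable.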
Multi-Populations Comparison
To check whether k populations share the same central tendency value.
Multi-Populations Comparison
A factory produces disc brakes for high-performance automobiles. The following table summarises the disc brake diameters produced by four machines. The target diameter for the brake is 322 mm.

Disc Brake Diameter (mm)
                 Machine Number
                 1          2          3          4
Mean             321.9985   322.0143   321.9983   321.9954
Std. Deviation   .0111568   .0106913   .0104812   .0069883
Multi-Populations Comparison
Total variation = variation within groups + variation between groups
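The decomposition above is what one-way ANOVA is built on. The sketch below checks it numerically and forms the F statistic; the three groups are hypothetical data, not the disc brake measurements.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SS_total, SS_within, SS_between, F) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ss_total = sum((v - grand) ** 2 for v in all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_total, ss_within, ss_between, f_stat

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]    # hypothetical measurements
ss_total, ss_within, ss_between, f_stat = one_way_anova(groups)
print(ss_total, ss_within, ss_between, f_stat)
```

A large F means the variation between groups is large relative to the variation within groups, which is evidence against equal population means.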
Multi-Populations Comparison
Hypothesis testing:
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1$: at least two populations are different
Parametric test
o One-way ANOVA
Nonparametric tests
o Kruskal-Wallis H
o Median test
Parametric test’s result:
Nonparametric test’s result:
Think!!
Job satisfaction was investigated in two different factories, A and B. In factory A the employees are on a fixed shift system, while in factory B the workers have a rotating shift system. In factory A, a worker always works the same shift, while in factory B, a worker rotates through the three shifts. A satisfaction score was collected from each employee, and the aim is to identify any difference in job satisfaction between the two groups of workers.
Q: What information is needed in order to determine the choice of test?
MEASUREMENT ADEQUACY
o Validity
  • Does the instrument measure what it is supposed to?
o Reliability
  • Does the instrument consistently measure what it is supposed to?
o Sensitivity
  • How good is the instrument at detecting the smallest amount that it can measure?
Validity
• In general, there are two types: internal and external validity.
• Internal validity refers to the rigor with which the study was performed.
  o Design of the study
  o Measurements chosen
  o Factors involved, especially in a study of causal relationships
• External validity refers to the extent to which the results of a study are generalisable or transferable (authenticity).
Internal validity
• Face validity
• Content validity
• Criterion-related validity
  o Predictive validity occurs when the criterion measures are obtained at a time after the test, e.g. career tests.
  o Concurrent validity occurs when the criterion measures are obtained at the same time as the test scores, e.g. level of depression.
• Construct validity
  o Convergent
  o Discriminant
1. Face validity
• It is the basic and minimal index of validity.
• It is concerned with how a measure or procedure appears to, and is understood by, the respondents. Does it seem well designed? Does it seem as though it will work reliably?
• Testing strategy: a questionnaire is given to a sample of respondents to gauge their reaction to the items.
2. Criterion-Related Validity
• Also known as instrumental validity.
• It demonstrates the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.
• Example: suppose we have a hands-on driving test that has been shown to be an accurate test of driving skills, and a new written driving test is then proposed. The written test can be validated using a criterion-related strategy in which it is compared with the hands-on driving test.
2. Criterion-Related Validity
• Predictive validity: indicates the ability of the measuring instrument to differentiate among individuals on a future criterion. Example: employees' ability test.
• Concurrent validity: indicates the ability of the measuring instrument to differentiate among individuals who are known to be different (they should score differently on the instrument). Example: work ethic among welfare recipients.
3. Construct validity
• Construct validity testifies to the agreement between a theoretical concept and a specific measuring device or procedure.
• Example: A doctor would like to test the effectiveness of painkillers on chronic back pain sufferers. Every day, he asks the test subjects to rate their pain level on a scale of one to ten. In this case, construct validity would test whether the doctor was actually measuring pain and not numbness, discomfort, anxiety or any other factor.
3. Construct validity
• Convergent validity is the actual general agreement among ratings, gathered independently of one another, where measures should be theoretically related. The scores obtained by two different instruments measuring the same concept are highly correlated.
• Discriminant validity is the lack of a relationship among measures which theoretically should not be related. Two variables are predicted to be uncorrelated, and the scores obtained by measuring them are indeed empirically found to be so.
3. Construct validity
Strategies to achieve construct validity:
• Literature review
• Confirmatory factor analysis
• Correlation analysis
• Some multivariate analyses
4. Content validity
• Content validity ensures that measures include an adequate and representative set of items that tap the concept.
• Examples:
  1. A researcher needing to measure an attitude like self-esteem must decide what constitutes a relevant domain of content for that attitude.
  2. In socio-cultural studies, content validity forces the researchers to define the very domains they are attempting to study.
4. Content validity
Strategies to achieve content validity:
• Existing literature
• Qualitative research
• Judgment of a panel of experts
Reliability
• Reliability is defined as the extent to which an instrument consistently measures what it is supposed to.
• Classical test theory – reliability is the ratio of the variation in the true score to the variation in the observed score.
• The true-score model: observed score = true score + error.
Approaches to estimate reliability
1. Equivalency reliability
2. Stability
   o Test-retest reliability
   o Parallel-form reliability
3. Internal consistency
   o Inter-item consistency reliability
   o Split-half reliability
4. Inter-rater reliability
1. Equivalency reliability
o Equivalency reliability is the extent to which two items measure identical concepts at an identical level of difficulty.
o Equivalency reliability is determined by relating two sets of test scores to one another to highlight the degree of relationship or association.
2. Stability
• A set of measures is considered stable if it stays the same over time despite uncontrollable conditions or the state of the respondents themselves.
• Example: The method of maintaining weights used by the U.S. Bureau of Standards. Platinum objects of fixed weight (one kilogram, one pound, etc.) are kept locked away. Once a year they are taken out and weighed, allowing scales to be reset so they are "weighing" accurately. Keeping track of how much the scales are off from year to year establishes a stability reliability for these instruments. In this instance, the platinum weights themselves are assumed to have a perfectly fixed stability reliability.
2. Stability
Test-retest reliability is the correlation between two successive measurements with the same test.
Example: you can give your test in the morning to your pilot sample and then again in the afternoon. The two sets of data should be highly correlated if the test is reliable.
2. Stability
Parallel-form reliability is the successive administration of two parallel forms of the same test.
Examples:
• There are two versions that measure Verbal and Math skills in the SAT. Two forms for measuring Math should be highly correlated, and that would document reliability.
• In an exam, two groups of students are given questions having similar items and the same response format, differing only in the wording and the ordering of questions.
3. Internal consistency
• It indicates the homogeneity of the items in the measure that tap the construct.
• Example: a questionnaire was designed to find out about college students' dissatisfaction with a particular textbook. The researcher then needs to analyse the internal consistency of the survey items dealing with dissatisfaction, which reveals the extent to which items on the questionnaire focus on the notion of dissatisfaction.
3. Internal consistency
• Inter-item consistency tests the consistency of respondents' answers to all items in a measure.
• In other words, it ensures that the items are homogeneous, i.e. all measuring the same construct.
• Statistical procedures like KR-20 (Kuder-Richardson) or Cronbach's Alpha are commonly used for these purposes.
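As an illustration, Cronbach's Alpha can be computed from item scores with the usual formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The 3-item, 5-respondent data set and the function name are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]     # total score per respondent
    item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Values close to 1 indicate that the items move together, which is what internal consistency requires.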
3. Internal consistency
• Split-half reflects the correlation between two halves of an instrument.
• Example: you have the SAT Math test and divide the items on it into two parts. If you correlate the first half of the items with the second half, the two should be highly correlated if the test is reliable.
4. Inter-rater reliability
• Inter-rater reliability reflects the consistency of the judgment of several raters on how they interpret the responses. In other words, it is the extent to which two or more individuals (coders or raters) agree.
• Scenario: Two or more researchers are observing a high school classroom. The class is discussing a movie that they have just viewed as a group. The researchers have a sliding rating scale (1 being most positive, 5 being most negative) with which they are rating the students' oral responses. Inter-rater reliability assesses the consistency of how the rating system is implemented.
Power of a statistical test
It is the probability of rejecting the null hypothesis when the null hypothesis is false.
Power also represents the sensitivity of the undertaken analysis.
Factors influencing power: (i) the statistical significance criterion (alpha value), (ii) the magnitude of the effect under the alternative hypothesis (effect size) and (iii) the sample size.
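The three factors can be seen at work in the simplest case, a one-sided, one-sample z test with known standard deviation. That setting, the function name and the numbers used are assumptions of this sketch.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z test of H0: mu = mu0 against H1: mu > mu0.

    effect is the true difference mu - mu0; sigma is the known standard deviation.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)        # rejection threshold on the z scale
    shift = effect * n ** 0.5 / sigma      # standardised true difference
    return 1 - z.cdf(critical - shift)

print(round(z_test_power(0.5, 1.0, 25), 3))      # baseline
print(round(z_test_power(0.5, 1.0, 50), 3))      # larger sample
print(round(z_test_power(0.5, 1.0, 25, alpha=0.01), 3))   # stricter alpha
```

Power rises with a larger effect, a larger sample or a larger alpha, and equals alpha when the true effect is zero.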
Complex analysis
Choosing the right statistical tool
• Number of variables
  o 1 variable
  o 2 variables
  o More than 2 variables
• Homogeneous sample?
Bivariate analysis
Bivariate analysis studies two variables simultaneously.
Common studies
• Correlation – measuring the relationship between two continuous variables.
• Cross tabulation – measuring the relationship between two categorical (or binary) variables.
• Simple modelling – finding the curve (e.g. a straight line) that best explains how one variable (the independent variable) influences the other (the dependent variable).
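A short sketch of the first and third studies: the Pearson correlation and a least-squares straight line. The variables (study hours and exam score) and their values are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two continuous variables."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares straight line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

hours = [1, 2, 3, 4, 5]          # independent variable
score = [52, 55, 61, 64, 68]     # dependent variable
r = pearson_r(hours, score)
intercept, slope = fit_line(hours, score)
print(round(r, 3), round(intercept, 1), round(slope, 1))
```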
MULTIVARIATE ANALYSIS
Multivariate data arise when more than one variable is measured on each object.
Data arrangement (n objects in rows, p variables in columns):
$\begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$
Types of studies:
o Descriptive multivariate studies
o Inferential studies
o Modelling and prediction
Multivariate analysis
• Interdependence methods
  o Involve only one kind of variable: either independent variables or dependent variables.
  o Aim: to look for patterns or any hidden information.
  o Methods: principal component analysis, factor analysis, multidimensional scaling, cluster analysis, projection pursuit etc.
• Dependence methods
  o Both independent variables and dependent variable(s) are measured.
  o Methods: multiple regression, discriminant analysis, MANOVA, canonical analysis, SEM etc.
~: The End :~