Testing for Difference - Parametric and Non-Parametric Tests


Understanding Your Data 2: Testing for Difference

BML224: Data Analysis for Research


Aims:
•  To understand the rationale for the use of parametric and non-parametric tests and to be able to relate this to significance testing
•  To construct appropriate null and alternative hypotheses
•  To apply the procedure for calculating parametric and non-parametric tests in SPSS
•  To interpret computer-generated SPSS output relating to parametric and non-parametric tests and produce an accurate write-up of the results


‘The one where it all makes sense!’


Where are we?

Types of Data

Normal Distribution

Descriptive Statistics

Dispersion


Statistical Tests
•  Used to make deductions about a particular data set or relationships between different data sets
•  Random sample of 50 households in two rural villages in West Sussex:
•  Village A: mean income £17,650
•  Village B: mean income £22,220
•  A test can be used to determine whether there is a ‘real difference’ or whether the difference occurred ‘purely by chance’


Statistical Tests
•  Parametric Tests: data conforms to a normal distribution and is interval or ratio in nature
•  Non-Parametric Tests: data does not conform to a normal distribution; typically used with ordinal data


Testing for Difference: Hypothesis Building


Hypotheses
•  Before conducting a test, establish a hypothesis or statement which the test then challenges:
•  Null Hypothesis (H0):
•  There is no significant difference between the incomes of households in village A and the incomes of households in village B; household income is not influenced by geographical location
•  Null Hypothesis (H0): µa = µb


Hypotheses
•  Before conducting a test, establish a hypothesis or statement which the test then challenges:
•  Alternative Hypothesis (H1):
•  There is a significant difference between the incomes of households in village A and the incomes of households in village B; household income is influenced by geographical location
•  Alternative Hypothesis (H1): µa ≠ µb


One or Two Tailed Tests
Example 1:
•  Null Hypothesis (H0):
•  There is no significant difference between the income of households in village A and the income of households in village B; household income is not influenced by geographical location
•  Alternative Hypothesis (H1):
•  There is a significant difference between the income of households in village A and the income of households in village B; household income is influenced by geographical location


One or Two Tailed Tests
Example 1:
•  Two Tailed Test
•  No statement indicating direction has been given
•  Therefore we have to allow for the average income in village B to be either larger or smaller than in village A


One or Two Tailed Tests
Example 2:
•  Null Hypothesis (H0):
•  There is no significant difference between the income of households in village A and the income of households in village B; household income is not influenced by geographical location
•  Alternative Hypothesis (H1):
•  There is a significant difference in incomes between Village A and Village B; household income is greater in Village A; household income is influenced by geographical location


One or Two Tailed Tests
Example 2:
•  One Tailed Test
•  A statement indicating direction has been given (e.g. reduces/increases)
•  Therefore we only have to allow for the average income in village A to be larger than in village B


Testing for Difference: Understanding Significance Testing


Significance Testing
•  For each test we produce a test statistic based on our sample
•  For each test, and for a given sample size, there is a known distribution of the statistic if the samples were drawn from the same population (critical values are taken from this distribution)
•  This is the distribution of the statistic when the null hypothesis is true (i.e. there is no significant difference)


Significance Testing
•  Our test statistic is then compared to the critical value at a given significance level:
•  If the test statistic is greater than the critical value the null hypothesis is rejected
•  If the test statistic is less than the critical value the null hypothesis is accepted


Visualising Significance Testing
•  When does the difference become significant? (What is our significance level?)
•  When can we become confident the differences have not occurred by chance?


Visualising Significance Testing
•  Test statistic = 3.82 (df=4)
•  Critical value at the 0.05 significance level (95% confidence level, df=4) = 2.13
•  Test statistic > critical value, therefore significant, therefore reject the null hypothesis
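The comparison above can be checked numerically. The following is a minimal sketch, assuming a one-tailed test and using only the Python standard library; the helper names `t_pdf` and `upper_tail` are illustrative and are not part of SPSS or of the original slides. It integrates the t distribution to find the probability of a value beyond the test statistic.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the distribution lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

# Values taken from the slide above.
test_statistic, critical_value, df = 3.82, 2.13, 4

significant = test_statistic > critical_value
p_one_tailed = upper_tail(test_statistic, df)
print("Reject null hypothesis:", significant)
print("One-tailed p-value:", round(p_one_tailed, 4))
```

The same helper shows where the critical value comes from: the area beyond 2.13 with 4 degrees of freedom is close to 0.05.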


•  We don’t have to calculate the values by hand or refer to statistical tables
•  In SPSS we have to turn our attention to the interpretation of the significance value (Sig.), which is the p-value (probability)


Significance Testing and SPSS Output
•  What does all this mean?
•  The Sig. value is the probability of a difference this large occurring by chance if the null hypothesis is true


Visualising Significance Testing
•  Let’s go back to our understanding of the Critical Value and Rejection Region and make a few changes


Visualising Significance Testing
[Figure: the rejection region, labelled ‘Our significance level’]


A Word on Significance Levels
•  Significance levels allow the researcher to state whether or not they believe a null hypothesis to be true with a given level of confidence
•  Significance values are normally expressed in decimal terms:
•  0.05 (5% or p = 0.05; 1 in 20)
•  0.01 (1% or p = 0.01; 1 in 100)
•  The value 0.05 corresponds to the 95% confidence level and represents the minimum threshold for deciding whether or not a particular result is significant


A Word on Significance Levels
•  Anything lower than the 95% confidence level (p > 0.05) means the null hypothesis is normally accepted and the result is not regarded as significant
•  Anything higher than the 95% confidence level (p < 0.05) means the null hypothesis is normally rejected and the result is regarded as significant
•  SPSS reports the calculated test statistic in terms of a probability value (p):
•  Where p < 0.05 this would indicate a significant result at the 0.05 (5%) level
•  Where p < 0.01 this would indicate a significant result at the 0.01 (1%) level


Visualising Significance Testing
[Figure: t distribution divided into the 95% confidence level (accept the null hypothesis) and the 5% critical region (reject the null hypothesis; 5% chance occurrence, 0.05)]
•  If the p value < 0.05 there is a significant difference; therefore reject the null hypothesis


Visualising Significance Testing
[Figure: t distribution with the 5% critical region, as above]
•  p = 0.03 (3% occurrence by chance)
•  Therefore lies within the 5% critical region we set
•  Reject the null hypothesis


Visualising Significance Testing
[Figure: t distribution with the 5% critical region, as above]
•  If the p value > 0.05 there is not a significant difference; therefore accept the null hypothesis


Visualising Significance Testing
[Figure: t distribution with the 5% critical region, as above]
•  p = 0.08 (8% occurrence by chance)
•  Therefore lies outside the 5% critical region we set
•  Accept the null hypothesis
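The decision rule illustrated on these slides can be written as a short function. This is a sketch only: the function name `decide` and its wording are illustrative, and the two p-values are the ones used in the examples above.

```python
def decide(p_value, alpha=0.05):
    """Apply the slide's decision rule for a chosen significance level (alpha)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if p_value < alpha:
        # The result lies inside the critical region.
        return "reject the null hypothesis"
    # The result lies outside the critical region.
    return "accept the null hypothesis"

for p in (0.03, 0.08):
    print(f"p = {p}: {decide(p)}")
```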


Testing for Difference: Significance Testing and One and Two Tailed Tests


Significance Testing: One-Tailed Tests
Let’s go back to our initial one-tailed hypothesis:
•  Null Hypothesis (H0):
•  There is no significant difference between the income of households in village A and the income of households in village B; household income is not influenced by geographical location
•  Alternative Hypothesis (H1):
•  There is a significant difference in incomes between Village A and Village B; household income is greater in Village A; household income is influenced by geographical location


Significance Testing: One-Tailed Tests


Significance Testing: Two-Tailed Tests
Let’s go back to our initial two-tailed hypothesis:
•  Null Hypothesis (H0):
•  There is no significant difference between the income of households in village A and the income of households in village B; household income is not influenced by geographical location
•  Alternative Hypothesis (H1):
•  There is a significant difference between the income of households in village A and the income of households in village B; household income is influenced by geographical location


Significance Testing: Two-Tailed Tests


A Word on Significance Levels
•  With a two-tailed test, both tails of the known distribution are of interest, as the unknown distribution could lie at either end
•  However, if we set the significance level to take 5% in each tail we increase the risk of making a mistake, as we would end up with a 10% chance of error
•  We therefore retain our 95% confidence level and share the 5% risk between the two tails of the known distribution
•  We set our significance level at 2.5% (0.025) in each tail; if the score falls into one of the 2.5% tails then we can say it comes from a different population


Significance Testing: Two-Tailed Tests


A Word on Significance Levels
•  Therefore, when undertaking a two-tailed test, the result has to fall within a smaller area of the tail than in a one-tailed test before we can claim the distributions are different
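The effect of sharing the 5% between two tails can be seen by comparing cut-off values. The sketch below uses the standard normal distribution from Python's `statistics` module purely for illustration (an assumption; the slides themselves work with the t distribution, whose cut-offs depend on the degrees of freedom).

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-tailed: the whole 5% sits in a single tail.
one_tailed_cutoff = z.inv_cdf(1 - alpha)

# Two-tailed: 2.5% in each tail, so the cut-off moves further out.
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)

print("One-tailed cut-off:", round(one_tailed_cutoff, 3))
print("Two-tailed cut-off: +/-", round(two_tailed_cutoff, 3))
```

Because the two-tailed cut-off is further from the centre, the same test statistic can be significant in a one-tailed test and not significant in a two-tailed test.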


A Word on Hypotheses
•  There is much debate about the wording of hypotheses, particularly around ‘rejecting’ the null hypothesis
•  When we do not reject the null hypothesis, what we are actually saying is that we have not found a difference in our experiment
•  Therefore we have not found a big enough difference for us to reject the possibility that the difference arose by chance
•  …or the probability of the difference arising by chance is too large to claim a genuine difference in the distributions


A Word on Type I and Type II Errors
•  Type I Error: stating that a difference exists when it does not, i.e. rejecting the null hypothesis when it is true
•  In a one-tailed test, if a test score is beyond the significance level then we conclude that it belongs to the unknown distribution
•  A score beyond the significance level is more likely to come from the unknown distribution, as more of that distribution lies beyond the significance level
•  However, area ‘α’ denotes the size of the risk that such a score comes from the known distribution
•  By setting the significance level at 0.05 we are saying that only 5% of the known distribution lies beyond the significance level
[Figure: known and unknown distributions, with areas α and β marked]


A Word on Type I and Type II Errors
•  Type II Error: stating that there is no difference when there is one, i.e. failing to reject a null hypothesis that is false
•  If scores fall below the significance level then we accept the null hypothesis: the scores come from the known distribution
•  However, there is a risk that the scores come from the unknown distribution (β)
•  The risk of a Type II error is the amount of the unknown distribution below the significance level
[Figure: known and unknown distributions, with areas α and β marked]


A Word on Type I and Type II Errors

|                        | Null Hypothesis (H0) is true                  | Alternative Hypothesis (H1) is true            |
| Accept Null Hypothesis | Right decision                                | Wrong decision: Type II error (false negative) |
| Reject Null Hypothesis | Wrong decision: Type I error (false positive) | Right decision                                 |

•  Generally we don’t want to make false claims, so we are happier to make Type II errors
•  Increasing sample size can reduce the potential for Type I and Type II errors
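The areas α and β can be computed for a simple case. The following is a sketch under stated assumptions: a one-tailed z-test, a hypothetical true effect of 0.5 standard deviations, and illustrative sample sizes of 16 and 64. None of these numbers come from the slides, and `error_rates` is an illustrative name. In this sketch α is fixed by the chosen significance level, so it is β that shrinks as the sample grows.

```python
from statistics import NormalDist

def error_rates(effect, n, alpha=0.05):
    """Return (alpha, beta) for a one-tailed z-test.

    effect: hypothetical true difference, in standard deviation units
    n:      sample size
    """
    known = NormalDist(0, 1)                 # distribution of z if the null is true
    cutoff = known.inv_cdf(1 - alpha)        # edge of the critical region
    unknown = NormalDist(effect * n ** 0.5, 1)  # distribution of z if the alternative is true
    beta = unknown.cdf(cutoff)               # part of the unknown distribution below the cut-off
    return alpha, beta

for n in (16, 64):
    a, b = error_rates(effect=0.5, n=n)
    print(f"n = {n}: alpha = {a:.2f}, beta = {b:.3f}")
```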


An Alternative Perspective on the P Value


Testing for Difference: Parametric and Non-Parametric Tests


Choosing the Right Test
Refer back to your A3 template sheet!


Choosing the Right Test
Parametric Tests
•  Independence of observations (except where the data is paired)
•  Random sampling
•  Interval scale measurement for the dependent variable
•  A minimum sample size of 30 per group is recommended
•  Equal variances of the populations from which the data is drawn
•  Hypotheses are usually made about the mean of the population


Choosing the Right Test
Non-Parametric Tests
•  Independence of randomly selected observations, except when paired
•  Few assumptions concerning the distribution of the population
•  Ordinal or nominal scale of measurement
•  Ranks or frequencies of data are the focus of tests
•  A minimum sample size of 30 per group is recommended
•  Hypotheses are posed regarding ranks, medians or frequencies
•  Sample size requirements are less stringent than for parametric tests


Testing for Difference: Student T-Test


Student T-Test
•  The t-test compares two unrelated data sets by inspecting the amount of difference between their means and the variability in each data set
•  The larger the difference, the more likely it is that a real difference exists and that the samples come from different populations
•  The independent samples t-test is undertaken when the samples are unrelated
•  We examine a ratio (dependent) variable against a nominal, categorical (independent) variable
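To make the calculation concrete, here is a minimal sketch of the independent samples t statistic in its equal-variances (pooled) form, using only the Python standard library. The two lists are hypothetical values in arbitrary units, not data from the module, and `independent_t` is an illustrative name; SPSS performs this calculation for you.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two unrelated samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Combine the variability of both samples into one pooled estimate.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    # Difference between the means, scaled by the variability.
    return (mean(a) - mean(b)) / standard_error, df

group_a = [1, 2, 3, 4, 5]   # hypothetical sample
group_b = [3, 4, 5, 6, 7]   # hypothetical sample

t, df = independent_t(group_a, group_b)
print(f"t({df}) = {t:.3f}")
```

A larger gap between the means, or less spread within each group, gives a larger absolute value of t.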


Testing for Difference: Student T-Test - Worked Example


Student T-Test Scenario
•  As part of the bidding process to Tourism South East for future tourism funding, local tourism officers have to demonstrate whether there is a significant difference in profit levels between businesses in the Arun and Chichester Districts
•  Variables: We are therefore going to examine whether Profit10 differs by Area


Student T-Test – Initial Analysis


Student T-Test
•  The Null Hypothesis:
•  There is no significant difference in profit levels between businesses in the Arun and Chichester Districts; profit is not influenced by location
•  The Alternative Hypothesis:
•  There is a significant difference in profit levels between businesses in the Arun and Chichester Districts; profit levels are higher in Chichester District than Arun District; profit is influenced by location [Note that this is a one-tailed test]


Student T-Test – Interpreting the Output

Profit10 – dependent variable

Area – independent variable


Student T-Test – Interpreting the Output

•  N: number of businesses in each sample
•  Mean turnover is higher in Chichester District
•  S.D. for Chichester is also higher, indicating a wider range of values


Student T-Test – Interpreting the Output

•  Before we start, remember that one of the criteria for using a parametric t-test is that both populations have equal variances
•  We use Levene’s Test for Equality of Variances to check whether the variances differ significantly. If they do, we use the bottom line


Student T-Test – Interpreting the Output

If p > 0.05, then the homogeneity of variance assumption has not been violated and the normal t-test based on equal variances (Equal variances assumed) is used (the top line)

If p < 0.05, then the homogeneity of variance assumption has been violated and the normal t-test based on equal variances should be replaced by one based on separate variance estimates (Equal variances not assumed) (the bottom line)


Student T-Test

•  Test statistic
•  p value: this is given for a two-tailed hypothesis; if your hypothesis is one-tailed, divide this figure by 2
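The two reading rules above (which line of the output to use, and how to obtain a one-tailed p value) can be sketched as follows. The function names are illustrative and are not SPSS features. Two assumptions are made explicit in the comments: the slides do not say what to do when Levene's p is exactly 0.05, and halving the two-tailed p value is only appropriate when the observed difference is in the direction the hypothesis predicted.

```python
def choose_row(levene_p, alpha=0.05):
    """Pick the line of the Independent Samples Test table to read."""
    if levene_p > alpha:
        return "Equal variances assumed"      # top line
    # Assumption: the boundary case p == alpha is treated as a violation.
    return "Equal variances not assumed"      # bottom line

def one_tailed_p(two_tailed_p, effect_in_predicted_direction):
    """Convert a two-tailed p value to a one-tailed p value."""
    half = two_tailed_p / 2
    # If the difference runs against the prediction, the one-tailed
    # p value is large rather than small.
    return half if effect_in_predicted_direction else 1 - half

print(choose_row(0.20))
print(choose_row(0.01))
print(one_tailed_p(0.08, effect_in_predicted_direction=True))
```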


Student T-Test

Reporting the Output

‘A student t-test was conducted to determine if a significant difference in profit levels existed between the Arun and Chichester Districts. A null hypothesis of no significant difference and an alternative hypothesis of a significant difference were established, and a 95% confidence level was assumed. The difference was significant: t = 6.533, p (<0.0005) < 0.05. Therefore the null hypothesis can be rejected and we can assume that there is a significant difference in profit between the two areas, and that profit is influenced by location.’


Student T-Test

Reporting Probability Values
•  Note that in the above we have reported the probability value as <0.0005. You cannot have a probability value of 0.000.
•  SPSS displays the probability value rounded to three decimal places (.000), and therefore for accuracy we would report this as p<0.0005.
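The reporting convention can be sketched as a small helper. The name `format_p` is illustrative; the logic simply reflects the point above that a value shown as .000 at three decimal places must be smaller than 0.0005.

```python
def format_p(p, decimals=3):
    """Format a probability value for a write-up."""
    # Anything below half a unit in the last displayed place rounds to zero.
    threshold = 0.5 * 10 ** -decimals
    if p < threshold:
        return f"p < {threshold:.{decimals + 1}f}"
    return f"p = {p:.{decimals}f}"

print(format_p(0.0001))
print(format_p(0.03))
```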


Student T-Test – refer to your template sheet

Grouping Variables [Independent Variables]

Test Variables [Dependent Variables]


Testing for Difference: Mann Whitney - Worked Example


Choosing the Right Test


Mann Whitney Scenario
•  Tourism South East are developing a new e-tourism strategy and they want to establish if there is any relationship between e-strategy (e-commerce adopters and non-adopters) and business attitudes to the value of the internet
•  Variables: We are therefore going to examine if there is a relationship between EStrategy and Webqual10


Mann Whitney – Initial Analysis


Mann Whitney
•  The Null Hypothesis:
•  There is no significant difference in the perceived value of the internet between e-commerce adopters and non-adopters; attitudes towards the internet are not influenced by e-strategy
•  The Alternative Hypothesis:
•  There is a significant difference in the perceived value of the internet between e-commerce adopters and non-adopters; e-commerce adopters value the internet more highly; attitudes towards the internet are influenced by e-strategy [Note that this is a one-tailed test]


Mann Whitney – Interpreting the Output


Mann Whitney – Interpreting the Output
Reporting the Output
‘A Mann Whitney test was conducted to determine if a significant difference in business attitudes to the value of the internet existed between e-commerce adopters and non-adopters. A null hypothesis of no significant difference and an alternative hypothesis of a significant difference were established, and a 95% confidence level was assumed. The difference was significant: U = 6508, p (<0.0005) < 0.05. Therefore the null hypothesis can be rejected and we can assume that there is a significant difference in business attitudes towards the internet between e-commerce adopters and non-adopters, and that e-commerce adopters value the internet more highly; attitudes towards the internet are influenced by e-commerce strategy.’
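For readers who want to see where a U value comes from, the sketch below counts, for every pair of scores drawn from the two groups, how often one group scores higher (ties count as half). The scores are hypothetical ordinal ratings, not the module data set, and `mann_whitney_u` is an illustrative name. Reporting the smaller of the two U values is a common textbook convention and is assumed here.

```python
def mann_whitney_u(a, b):
    """Return (U, U_a, U_b) for two unrelated samples of ordinal scores."""
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1      # group a wins this pair
            elif x == y:
                u_a += 0.5    # a tie is shared between the groups
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b), u_a, u_b

adopters = [3, 4, 4, 5]       # hypothetical attitude scores
non_adopters = [1, 2, 3, 3]   # hypothetical attitude scores

u, u_a, u_b = mann_whitney_u(adopters, non_adopters)
print("U =", u)
```

A U value close to 0 means the two groups barely overlap; a U value close to half the number of pairs means the groups are thoroughly mixed.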


Mann Whitney Test – refer to your template sheet

Grouping Variables [Independent Variables]

Test Variables [Dependent Variables]


Testing for Difference: Related or Paired Samples - Worked Example


Related or Paired Samples T-Test
•  The Paired Sample T-test is undertaken when the samples are related or paired (often with the same participants in each sample)
•  The test uses parametric data


Related Samples T-Test Scenario
•  In 2009 Tourism South East ran a series of courses in conjunction with the Green Tourism Business Scheme to help GTBS members progress to the next stage of accreditation (e.g. bronze to silver; silver to gold). As part of the monitoring process, Tourism South East want to establish if these courses have had any impact on GTBS scores.
•  Variables: We are going to examine differences in GTBS scores in 2008 and 2010.


Related Samples T-Test
[SPSS output: GTBS08 and GTBS10]


Related Samples T-Test
•  The Null Hypothesis:
•  There is no significant difference in GTBS scores between 2008 and 2010.
•  The Alternative Hypothesis:
•  There is a significant improvement in GTBS scores between 2008 and 2010.

[Note that this is a one-tailed test]


Related Samples T-Test
[SPSS output: Paired Samples Test, GTBS08 - GTBS10]

The key elements of the Paired Samples Test include:
(a)  The test statistic: this is denoted as t; in this case the value of t = -11.386
(b)  The degrees of freedom: the degrees of freedom equal the size of the sample (300) minus 1. Only 1 is subtracted because you have only asked one set of respondents. The degrees of freedom value is placed in brackets between the t and the = sign (e.g. t(299) = -11.386)
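The two elements above can be reproduced on a small scale. This is a minimal sketch using the Python standard library and four hypothetical pairs of scores (not the GTBS data); `paired_t` is an illustrative name. Differences are taken as first score minus second score, matching the GTBS08 - GTBS10 ordering, so an improvement gives a negative t.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic and degrees of freedom for paired (related) samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1   # one set of respondents, so df = n - 1

scores_before = [10, 12, 14, 16]   # hypothetical
scores_after = [12, 15, 15, 20]    # hypothetical

t, df = paired_t(scores_before, scores_after)
print(f"t({df}) = {t:.3f}")
```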


Related Samples T-Test
[SPSS output: Paired Samples Test, GTBS08 - GTBS10]

•  As can be seen from the SPSS output, the difference between the two means is significant. This is specifically reported as:
•  There is a significant difference in GTBS scores between 2008 and 2010, t(299) = -11.386, p (<0.0005) < 0.05.


Related Samples T-Test

Reporting the Output
‘A related samples t-test was conducted to determine if a significant difference in GTBS scores existed between 2008 and 2010. A null hypothesis of no significant difference and an alternative hypothesis of a significant difference were established, and a 95% confidence level was assumed. The difference was significant: t(299) = -11.386, p (<0.0005) < 0.05. Therefore the null hypothesis can be rejected and we can assume that there is a significant improvement in GTBS scores between 2008 and 2010.’


Paired Samples T-Test – refer to your template sheet

Appropriate Related/Paired Variables


Testing for Difference: Wilcoxon - Worked Example


Wilcoxon Test Scenario
•  Between 2008 and 2010, Tourism South East ran a series of e-commerce workshops across the South East region. As part of the monitoring process, Tourism South East want to establish if these workshops have had any impact on business attitudes to the value of the internet.
•  Variables: We are going to examine differences in Webqual08 and Webqual10.


Wilcoxon Test


Wilcoxon Test
•  The Null Hypothesis:
•  There is no significant difference in business attitudes towards the value of the internet between 2008 and 2010.
•  The Alternative Hypothesis:
•  There is a significant improvement in business attitudes towards the value of the internet between 2008 and 2010. [Note that this is a one-tailed test]


Wilcoxon Test – Interpreting the Output

Negative Ranks:
•  The number of cases where Webqual08 was larger than Webqual10; the value is 0, which suggests that attitude scores have increased


Wilcoxon Test – Interpreting the Output

Positive Ranks:
•  The number of cases where Webqual08 was smaller than Webqual10; the value is 259


Wilcoxon Test – Interpreting the Output

Ties:
•  The number of cases where Webqual08 and Webqual10 are the same; the value is 41
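The three counts in the Ranks table can be illustrated with a short sketch. The five pairs of scores below are hypothetical, not the Webqual data, and `rank_summary` is an illustrative name; the sketch only counts the direction of each paired change and does not compute the Wilcoxon z statistic.

```python
def rank_summary(first, second):
    """Count negative ranks, positive ranks and ties for paired scores."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    pairs = list(zip(first, second))
    return {
        "negative": sum(1 for a, b in pairs if b < a),  # second score is lower
        "positive": sum(1 for a, b in pairs if b > a),  # second score is higher
        "ties": sum(1 for a, b in pairs if b == a),     # no change
    }

scores_2008 = [2, 3, 3, 4, 1]   # hypothetical
scores_2010 = [4, 3, 5, 5, 2]   # hypothetical

summary = rank_summary(scores_2008, scores_2010)
print(summary)

# The three counts always add up to the number of pairs; on the slides
# 0 + 259 + 41 gives the 300 businesses in the sample.
slide_total = 0 + 259 + 41
print(slide_total)
```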


Wilcoxon Test
Reporting the Output
‘A Wilcoxon test was conducted to determine if a significant difference in business attitudes towards the internet existed between 2008 and 2010. A null hypothesis of no significant difference and an alternative hypothesis of a significant difference were established, and a 95% confidence level was assumed. The difference was significant: z = -16.093, p (<0.0005) < 0.05. Therefore the null hypothesis can be rejected and we can assume that there is a significant difference in business attitudes towards the internet between 2008 and 2010.’


Wilcoxon Test – refer to your template sheet

Appropriate Related/Paired Variables


Learning Outcomes
At the end of this session, you should be able to:
•  Understand the rationale for the use of parametric and non-parametric tests and be able to relate this to significance testing
•  Construct appropriate null and alternative hypotheses
•  Apply the procedure for calculating parametric and non-parametric tests in SPSS
•  Interpret computer-generated SPSS output relating to parametric and non-parametric tests and produce an accurate write-up of the results

