Section 4


Student T-Test, Paired Samples T-Test, Mann-Whitney and Wilcoxon

Learning Outcomes

At the end of this session, you should be able to:

Understand the rationale for the use of parametric and non-parametric tests

Examine the relationship between variables using parametric and non-parametric tests, constructing suitable null and alternative hypotheses

Apply the procedure for conducting parametric and non-parametric tests in SPSS in relation to the Student T-Test, the Paired Samples T-Test, Mann-Whitney and Wilcoxon

Interpret computer-generated SPSS output in relation to the above tests



Statistical Tests: Introduction

Data Analysis for Research

4.0 Introduction

Statistical tests are used to make deductions about a particular data set or about relationships between different data sets. For example, you might have interviewed a random sample of 50 households from two rural villages in West Sussex to compare whether income levels are different. In village A, you calculate the mean income to be £17,650 and for village B, £22,200. In this instance, a statistical test can be used to determine whether we have a real difference or whether the difference could have occurred purely by chance.

There is a wide variety of statistical tests, each designed to take account of the different characteristics of the data sets you may wish to examine. The choice of test can prove overwhelming, and indeed frightening, at first. At the most basic level, the principal distinction drawn between statistical tests is whether they are ‘parametric’ or ‘non-parametric’. Parametric tests can only be performed where the data conform to a normal distribution and are of an interval or ratio nature. In contrast, non-parametric tests involve less rigorous conditions and can be used on data at a lower level of measurement that do not conform to a normal frequency distribution.

4.1 Null and Alternative Hypotheses

Before conducting a statistical test it is first necessary to establish a hypothesis or statement which the test then challenges. These hypotheses are referred to as the null hypothesis (H0) and the alternative or research hypothesis (H1). The null hypothesis is usually expressed as H0: μ1 = μ2, where μn is the mean for each group and the subscript n denotes the group. When stating a null hypothesis, the normal procedure is to start by assuming that there is no real difference between your data sets. A statistical test effectively helps the researcher to decide whether or not the null hypothesis is true, or more precisely, whether or not it should be accepted. If the result of the test shows that the null hypothesis should be rejected, we can then go on to say, with some degree of confidence, that a difference does exist or a change has occurred (Riley et al., 1998, p. 203).

It is important that you express both H0 and H1 in the context of your own research problem before collecting your data and before starting your analysis. In reference to the rural income example quoted above, we could formulate the following hypotheses:

H0: μa = μb

There is no significant difference between the mean income of households in village A as compared with the mean income of households in village B; mean household income is not influenced by geographical location.

H1: μa≠ μb

There is a significant difference in the mean household income for households in village A as compared with village B; mean household income is influenced by geographical location.

To determine whether or not your sampled data sets are consistent with the null hypothesis or the alternative hypothesis, we need to perform a probability based significance test. However, before such a test is conducted we must determine how big any difference has to be, to be considered real beyond that expected due to chance.

© Dr Andrew Clegg


4.2 Hypothesis Testing

Most tests follow the same basic logic, in that a research hypothesis (your alternative hypothesis) predicts a difference in distributions, whereas a null hypothesis predicts that they are the same. For each significance test, we can produce a probability distribution of a test statistic, termed a sampling distribution under the null hypothesis, calculated on the basis that the null hypothesis is true. A simple example relating to the probability distribution curve of the Student’s t-statistic is shown in Figure 4.1.

Figure 4.1:

Rejection Region for a Probability Region

A large visible difference between data sets corresponds to a probability towards the tails of the distribution, meaning that a difference of that size is unlikely to have occurred by chance. We can determine whether any difference between the data sets is too large to have plausibly occurred by chance by determining whether it falls in the tails of the distribution. We can define a critical or rejection region as that part of the probability distribution beyond a critical value of a test statistic at a certain probability (see Figure 4.1). We compare this critical value with the calculated test statistic. If the calculated statistic is greater than the critical value and therefore falls within the rejection region, the differences in the data are unlikely to have occurred by chance. Consequently, we can reject the null hypothesis and accept the alternative hypothesis. However, if the calculated value does not fall within the rejection region, this does not prove the truth of the null hypothesis; we have merely failed to reject it.

The size of the rejection region is determined by the significance level. Significance levels allow the researcher to state whether or not they believe a null hypothesis to be true with a given level of confidence. Significance levels are presented in statistical tables as a probability value normally expressed in decimal terms, i.e. 0.05 (5%, or 1 in 20) and 0.01 (1%, or 1 in 100). The value 0.05 corresponds to the 95 per cent confidence level and represents the minimum limit for deciding whether or not a particular result is significant and whether the null hypothesis should be accepted or rejected. Where the confidence level achieved is lower than 95 per cent (that is, p is greater than 0.05), the null hypothesis is normally accepted and the result is regarded as not significant. Where a confidence level of 95 per cent or more has been achieved (p is 0.05 or less), we say the observed change or difference is significant.

By selecting either a 5% or a 1% significance level, we are saying that we are willing to accept either a 5% or a 1% chance of making an error in rejecting the null hypothesis when it is in fact true; this is known as a Type I error. A Type II error is the failure to reject the null hypothesis when it is in fact false. SPSS will report the significance of the calculated test statistic in terms of a probability value p. Where p<0.05 this would indicate a significant result at the 0.05 (5%) level, and p<0.01 would indicate a significant result at the 0.01 (1%) level, often termed ‘highly significant’.
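The link between the significance level and the chance of a Type I error can be checked with a small simulation. The sketch below is illustrative only and is not part of the SPSS procedure: it assumes two groups drawn from the same normal population with a known standard deviation of 1, so that a simple z statistic can stand in for the t-test, and it places the rejection region in both tails. The function name `type_i_error_rate` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Estimate how often a true null hypothesis is rejected.

    Both groups are simulated from the same population (mean 0, sd 1),
    so every rejection counted here is a Type I error.
    """
    rng = random.Random(seed)
    # Cut-off beyond which the test statistic falls in the rejection region
    # (alpha is split between the two tails of the distribution).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        mean_difference = sum(group_a) / n - sum(group_b) / n
        # Standard error of the difference when both sds are known to be 1.
        z = mean_difference / (2 / n) ** 0.5
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


print(type_i_error_rate(alpha=0.05))
print(type_i_error_rate(alpha=0.01))
```

Because the null hypothesis is true by construction, the proportion of rejections should settle close to the chosen significance level, which is what accepting "a 5% or a 1% chance of making an error" means in practice.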

4.3 One- and Two-Tailed Tests

When conducting a statistical test to a given significance level it is important to consider how the hypothesis is worded, as this will create the conditions for either a one- or a two-tailed test. Any statement including terms such as reduces or increases, no lower or no higher implies a specific direction in the hypothesis and consequently forms the basis of a one-tailed test. In contrast, any statement indicating no direction (no different/no effect) forms the basis of a two-tailed test. Therefore, in relation to the rural income example stated above, we would perform a two-tailed test. This is because we would have to allow for the average income for village B to be either larger or smaller than that for village A. We could, however, have chosen a slightly different alternative hypothesis, for example:

H1: μa > μb

The mean household income for households in Village A is significantly larger as compared with Village B; mean household income is influenced by geographical location.

This is termed a one-tailed test as we are only interested in a difference in one direction, in this case positive differences (larger). As a result, the rejection region must be concentrated at one end of the distribution (hence the term one-tailed). For the mean of village A to be larger than the mean of village B, the rejection region must lie at the positive end of the x-axis. The choice of a two-tailed or one-tailed test determines where the rejection region lies. This is discussed in the following section.


4.3.1 Significance Levels and One and Two-Tailed Predictions

The relationship between significance levels and one and two-tailed predictions is explained by Hinton (2004) in the following extract:

When we undertake a one-tailed test we argue that if the test score has a probability lower than the significance level then it falls within the tail-end of the known distribution we are interested in. We interpret this as indicating that the score is unlikely to have come from a distribution the same as the known distribution but from a different distribution. If the score arises anywhere outside this part of the tail cut off by the significance level we reject the alternative hypothesis. This is shown in Figure 4.2. Notice that this shows a one-tailed projection that the unknown distribution is higher than the known distribution.

Figure 4.2:

A One-Tailed Prediction and the Significance Level

With a two-tailed prediction, unlike the one-tailed, both tails of the known distribution are of interest, as the unknown distribution could be at either end. However, if we set our significance level so that we take the 5 per cent at the end of each tail we increase the risk of making an error. Recall that we are arguing that, when the probability is less than 0.05 that a score arises from the known distribution, then we conclude that the distributions are different. In this case the chance that we are wrong, and the distributions are the same, is less than 5 per cent. If we take 5 per cent at either end of the distribution, as we are tempted to do in a two-tailed test, we end up with a 10 per cent chance of an error, and we have increased the chance of making a mistake. We want to keep the risk of making an error down to 5 per cent overall, as otherwise there will be an increase in our false claims of differences in distributions which can undermine our credibility with other researchers, who might stop taking our findings seriously. When we gamble on the unknown distribution being at either tail of the known distribution, to keep the overall error risk to 5 per cent, we must share our 5 per cent between the two tails of the known distribution, so we set our confidence level at 2.5 per cent at each end. If the score falls into one of the 2.5 per cent tails we then say it comes from a different distribution. Thus, when we undertake a two-tailed prediction the result has to fall within a smaller area of the tail compared to a one-tailed prediction, before we claim that the distributions are different, to compensate for hedging our bets in our prediction. This is shown in Figure 4.3.

Figure 4.3:

A Two-Tailed Prediction and the Significance Level

[Extract taken from Hinton, P. (2004), Statistics Explained, Routledge, London]

The changes in the critical values between one and two-tailed tests have important consequences because it is possible for H0 to be accepted if the test is two-tailed but rejected if it is one-tailed. At the 0.05 level this happens with z values between 1.645 and 1.96: a test statistic of, say, 1.75 falls outside the two-tailed rejection region but within the one-tailed one. Consequently the phrasing and justification of the alternative hypothesis should be formulated with considerable care. Although the actual method for calculating the test statistic is not influenced by the nature of the null hypothesis, the effect of stating a direction is to impose a more rigorous test, which in turn affects the significance level that can be quoted. By stating a direction we are effectively establishing a more precise test.

Table 4.1:

Critical z Values for the 0.01 and 0.05 Rejection Regions for One- and Two-tailed Tests

Tailedness         Critical values, 0.05 level    Critical values, 0.01 level
One-tailed test    -1.645 or +1.645               -2.33 or +2.33
Two-tailed test    -1.96 or +1.96                 -2.58 or +2.58
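The critical values in Table 4.1 can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the helper names `critical_z` and `rejects_null` are hypothetical, and the one-tailed check assumes the alternative hypothesis predicts a positive difference.

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Return the positive critical z value for a one- or two-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test shares alpha equally between the two tails.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)


def rejects_null(z, alpha, tails):
    """True if z falls inside the rejection region.

    For a one-tailed test this sketch assumes H1 predicts a positive z.
    """
    if tails == 2:
        return abs(z) > critical_z(alpha, 2)
    return z > critical_z(alpha, 1)


for alpha in (0.05, 0.01):
    print(alpha, round(critical_z(alpha, 1), 3), round(critical_z(alpha, 2), 3))

# The test statistic of 1.75 discussed in the text, at the 0.05 level.
print(rejects_null(1.75, 0.05, tails=1))
print(rejects_null(1.75, 0.05, tails=2))
```

To three decimal places the 0.01 values come out as 2.326 and 2.576, which the table rounds to 2.33 and 2.58. The last two lines show the case described above: 1.75 lies inside the one-tailed rejection region but outside the two-tailed one.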


4.4 Choosing the Right Test

The choice of a statistical test to apply to a set of data has to be driven ultimately by the objectives of your research project. Indeed your project should have been designed and data sampled with a certain test or set of tests in mind (Kitchin and Tate, 1999). When deciding upon a particular test, you need to consider the nature and characteristics of the data sets that you are investigating and, in particular, whether they will allow the use of parametric or non-parametric tests. The common characteristics of both parametric and non-parametric tests are listed in Table 4.2. Table 4.3 also provides a useful framework to help you choose the correct test.

Table 4.2:

Common Characteristics of Parametric and Non-parametric Tests

Parametric Tests

* Independence of observations, except where the data are paired
* Random sampling of observations from a normally distributed population
* Interval scale measurement (at least) for the dependent variable
* A minimum sample size of about 30 per group is recommended
* Equal variances of the population from which the data is drawn
* Hypotheses are usually made about the mean (μ) of the population

Non-Parametric Tests

* Independence of randomly selected observations except when paired
* Few assumptions concerning the distribution of the population
* Ordinal or nominal scale of measurement
* Ranks or frequencies of data are the focus of tests
* Hypotheses are posed regarding ranks, medians or frequencies
* Sample size requirements are less stringent than for parametric tests

[Kitchin and Tate, 1999, p. 113]


Table 4.3:

Identifying the Right Test

Question 1: What combination of variables have you?
Question 2: Should your continuous data be used with parametric tests or non-parametric tests?
Question 3: How many levels has your categorical data?

* Two categorical: use Chi-Square
* Two separate continuous: go to Question 2
    * Parametric: Pearson
    * Non-parametric: Spearman
* Two continuous which is the same measure administered twice: go to Question 2
    * Parametric: Related t-test
    * Non-parametric: Wilcoxon signed-ranks
* Two continuous which is the same measure administered on three occasions or more: go to Question 2
    * Parametric: ANOVA (within subjects)
    * Non-parametric: Friedman test
* One categorical and one continuous: go to Question 2
    * Parametric: go to Question 3
        * 2 levels: Independent-samples t-test
        * 3 or more levels: ANOVA (between subjects)
    * Non-parametric: go to Question 3
        * 2 levels: Mann-Whitney U
        * 3 or more levels: Kruskal-Wallis

[Source: Maltby & Day, 2002]
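The three questions in Table 4.3 amount to a small decision procedure, which can be written down as a function. This is only a sketch of the table's logic: the function name `choose_test`, its argument names and the short labels used for the variable combinations are hypothetical conveniences, not terms used by Maltby and Day or by SPSS.

```python
def choose_test(combination, parametric=None, levels=None):
    """Walk through Questions 1 to 3 of Table 4.3 and return the test name.

    combination: one of "two categorical", "two separate continuous",
        "same measure twice", "same measure three or more times",
        "one categorical and one continuous" (labels invented for this sketch).
    parametric: True or False (Question 2); not needed for two categorical.
    levels: number of levels of the categorical variable (Question 3).
    """
    # Question 1: two categorical variables need no further questions.
    if combination == "two categorical":
        return "Chi-Square"

    # Question 2 applies to every other combination.
    if parametric is None:
        raise ValueError("state whether the data suit parametric tests")

    answers = {
        "two separate continuous": ("Pearson", "Spearman"),
        "same measure twice": ("Related t-test", "Wilcoxon signed-ranks"),
        "same measure three or more times": (
            "ANOVA (within subjects)",
            "Friedman test",
        ),
    }
    if combination in answers:
        parametric_test, non_parametric_test = answers[combination]
        return parametric_test if parametric else non_parametric_test

    # Question 3: only reached for one categorical and one continuous variable.
    if combination == "one categorical and one continuous":
        if levels is None or levels < 2:
            raise ValueError("the categorical variable needs at least 2 levels")
        if levels == 2:
            return "Independent-samples t-test" if parametric else "Mann-Whitney U"
        return "ANOVA (between subjects)" if parametric else "Kruskal-Wallis"

    raise ValueError(f"unknown combination: {combination!r}")


print(choose_test("one categorical and one continuous", parametric=True, levels=2))
print(choose_test("same measure twice", parametric=False))
```

The first call corresponds to the situation covered in the rest of this section: one continuous variable, one categorical variable with two levels, and data suitable for parametric testing.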


4.5 Parametric Tests

4.5.1 T-Test or Student’s T-test

Statistical Tests: Student T-Test

The t-test is most useful for testing whether or not a significant difference exists between the means of two samples, or alternatively, whether or not two samples come from one population. There are two principal versions of the t-test. One relates to samples involving independent data sets and the other to samples which involve paired comparisons. In both cases, the data must be of an interval or ratio nature, randomly chosen and normally (or near normally) distributed. The variances of the two data sets should also be similar. Where doubt over the frequency distribution or the values of the variances may jeopardise the accuracy of the test, alternative and less refined non-parametric tests should be used.

4.5.2 T-Test for Independent Samples

In this instance, the t-test compares two unrelated data sets by inspecting the amount of difference between their means and taking into account the variability of each data set. The larger the difference in the means, the more likely that a real, significant difference exists, and our samples come from different populations (see Figure 4.5).

Figure 4.5:

Differences in Means and Populations
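To make concrete how the difference in means is weighed against the variability of each data set, the sketch below computes the t statistic by hand for two small groups. It is illustrative only: the function name `t_statistic` and the numbers are hypothetical and are not taken from the Dataset file, and the code shows the standard textbook formulae rather than SPSS's own implementation. Two versions are given, one pooling the variances (appropriate when they are similar) and one keeping separate variance estimates.

```python
from statistics import mean, variance


def t_statistic(group_a, group_b, equal_variances=True):
    """Return (t, degrees of freedom) for two independent samples."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = mean(group_a), mean(group_b)
    v1, v2 = variance(group_a), variance(group_b)  # sample variances (n - 1)

    if equal_variances:
        # Pooled variance: a weighted average of the two sample variances.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        standard_error = (pooled * (1 / n1 + 1 / n2)) ** 0.5
        df = n1 + n2 - 2
    else:
        # Separate variance estimates with the Welch-Satterthwaite
        # approximation for the degrees of freedom.
        standard_error = (v1 / n1 + v2 / n2) ** 0.5
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )

    # Difference in means divided by its standard error.
    return (m1 - m2) / standard_error, df


# Hypothetical scores for two unrelated groups.
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [1.0, 2.0, 3.0, 4.0]

print(t_statistic(group_a, group_b, equal_variances=True))
print(t_statistic(group_a, group_b, equal_variances=False))
```

With equal group sizes the two versions give the same t value and differ only in their degrees of freedom; with unequal sizes and unequal variances the t values differ as well. A larger difference in means or a smaller spread within the groups both push t further from zero.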

The following section will illustrate how to use SPSS to conduct a student t-test using variables from the Dataset file.


4.6 Using SPSS to Calculate the Student T-Test

The aim of the following section is to demonstrate how to use SPSS to perform the unrelated and related t-test. As already mentioned in this section, the t-test is most useful for testing whether or not a significant difference exists between the means of two samples, or alternatively, whether or not two samples come from one population. There are two principal versions of the t-test. One relates to samples involving independent data sets, and the other to samples which involve paired comparisons. In both cases, the data must be of an interval or ratio nature, randomly chosen and normally (or near normally) distributed. The variances of the two data sets should also be similar. Where doubt over the frequency distribution or the values of the variances may jeopardise the accuracy of the test, alternative and less refined non-parametric tests should be used.

To begin, open SPSS and open the Dataset file that you have used in previous sessions. We are going to use the Student T-test to examine the relationship between different variables. Let us consider a potential research scenario to help you place the use of the Student t-test in context.

Scenario:

As part of the bidding process to Tourism South East for future tourism funding, local tourism officers have to demonstrate if there is a significant difference in turnover between businesses in the Arun and Chichester Districts.

Variables:

We are therefore going to examine if there is a relationship between Area and Turnover08.

Before we start we first need to establish a null and an alternative hypothesis. In this case:

The Null Hypothesis: H0: μa = μb

There is no significant difference in Turnover between Areas; business turnover is not influenced by location.

The Alternative Hypothesis: H1: μa ≠ μb

There is a significant difference in Turnover between Areas; business turnover is influenced by location.


4.6.1 T-Test for Independent Samples

To perform the unrelated t-test for two independent samples, first move the mouse over Analyse and press the left mouse button. Move the mouse over Compare Means and then over Independent-Samples T Test.

The Independent-Samples T Test dialog box appears.


Move the mouse over the variable Turnover08 and press the left mouse button. Move the mouse over the centre arrow and press the left mouse button so that the variable Turnover08 appears in the Test Variable(s) box.

Select the variable Area and press the lower arrow so that Area appears in the Grouping Variable box.

Move the mouse over Define Groups and press the left mouse button. The Define Groups dialog box appears. In the box beside Group 1: type 1 and in the box beside Group 2: type 2. Note in this case the groups have been defined in terms of their two codes (1=Chichester District and 2=Arun District). The values can also be used as a cut-off point, at or above which all the values constitute one group while those below form the other group. In this instance the cut-off point is two, which would be placed in parentheses after Area.

Move the mouse over Continue and press the left mouse button. This will return you to the Independent-Samples T Test dialog box. Move the mouse over OK and press the left mouse button. SPSS performs the test and displays the results in the Output window.

In this case the following output is produced:

[SPSS output for Turnover08: Group Statistics and Independent Samples Test tables]

You are now wondering what this all means. Let us start by referring back to our null and alternative hypotheses. In this case:

The Null Hypothesis: H0: μa = μb

There is no significant difference in Turnover between Areas; business turnover is not influenced by location.

The Alternative Hypothesis: H1: μa ≠ μb

There is a significant difference in Turnover between Areas; business turnover is influenced by location.

The second subtable in the output provides the information we need by tabulating the value of t and its p-value (Sig. (2-tailed)) together with the 95% Confidence Interval of the Difference, for both Equal variances assumed and Equal variances not assumed. The key to which line to use lies in the first two columns, labelled Levene’s Test for Equality of Variances, which test the homogeneity of variance assumption of a valid t-test. One of the criteria for using a parametric t-test is the assumption that both populations have equal variances. If the test statistic F is significant, Levene’s test has found that the two variances do differ significantly, in which case we must use the bottom values. Provided the test is not significant (p>0.05), the variances can be assumed to be homogeneous and the Equal variances assumed line of values for the t-test can be used. As Kinnear and Gray (1999) point out:

If p > 0.05, then the homogeneity of variance assumption has not been violated and the normal t-test based on equal variances (Equal variances assumed) is used (the top line).

If p < 0.05, then the homogeneity of variance assumption has been violated and the normal t-test based on equal variances should be replaced by one based on separate variance estimates (Equal variances not assumed)(the bottom line).


In this example, the Levene Test is significant (p = 0.041, which is less than 0.05), so the t value calculated with separate variance estimates (Equal variances not assumed) is appropriate.
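The rule for choosing between the two lines of the SPSS output can be expressed in a few lines of code. This is a sketch of the decision rule only; the function name `t_test_row` is hypothetical. Kinnear and Gray's rule as quoted covers p > 0.05 and p < 0.05, so treating a value of exactly 0.05 as significant is an assumption made here.

```python
def t_test_row(levene_p, alpha=0.05):
    """Return which line of the Independent Samples Test table to read.

    levene_p is the Sig. value reported for Levene's Test for Equality
    of Variances. A value of exactly alpha is treated as significant
    (an assumption of this sketch).
    """
    if not 0 <= levene_p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    if levene_p > alpha:
        # Homogeneity of variance has not been violated: top line.
        return "Equal variances assumed"
    # The variances differ significantly: bottom line.
    return "Equal variances not assumed"


print(t_test_row(0.041))  # the Levene result from the worked example
print(t_test_row(0.20))   # a hypothetical non-significant Levene result
```

For the worked example, with a Levene significance of 0.041, the function points to the bottom line of the table.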

The results are relatively straightforward. The table includes the t-statistic, the degrees of freedom, and the two-tailed probability of the former being equalled or exceeded by chance alone (Sig.). This form of output does not give the critical t-value that must be exceeded for the null hypothesis to be rejected, so Sig. is of great importance in this and other tests in whose output it is listed. It allows us to dispense with tables of critical values: if this probability value is equal to or less than the selected significance level, the null hypothesis must be rejected.

In this case, the test produces a two-tailed p-value of 0.000; this value is significant. Remember that for the p-value not to be significant at the 0.05 level, it would have to be greater than 0.05. The null hypothesis is therefore rejected at the 0.05 significance level. In other words, we would conclude that there is a significant difference in mean turnover between areas, and that turnover is influenced by location.

It is important to write up the results clearly and fully. In this instance we could write:

A Student t-test was conducted to determine whether there was a significant difference in turnover between areas. A null hypothesis of no significant difference and an alternative hypothesis of a significant difference were established, and a 95% confidence level was assumed. The difference was significant (t = 6.354, p<0.0005). Therefore the null hypothesis can be rejected and we can assume that there is a significant difference in turnover between areas, and that turnover is influenced by location.

Note that in the above we have reported the probability value as <0.0005. You cannot have a probability value of 0.000. The reported probability value has been rounded to three decimal places, and therefore for accuracy we would report this as p<0.0005.
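The reporting convention for a displayed value of 0.000 can be captured in a small helper. This is a sketch under the assumption that the output shows probabilities rounded to three decimal places, as in the example above; the function name `report_p` is hypothetical.

```python
def report_p(displayed_p, decimals=3):
    """Format a p-value for a write-up.

    A value that displays as zero at the given number of decimal places
    must be smaller than half a unit in the last place, so it is reported
    as an inequality rather than as p = 0.000.
    """
    if round(displayed_p, decimals) == 0:
        smallest = 0.5 * 10 ** -decimals  # 0.0005 for three decimal places
        return f"p < {smallest:.{decimals + 1}f}"
    return f"p = {displayed_p:.{decimals}f}"


print(report_p(0.000))
print(report_p(0.041))
```

The first call gives the form used in the write-up above; the second shows that values which do not round to zero are reported as they stand.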


A note on Significance Testing taken from Maltby and Day (2002):

‘Significance testing is a criterion, based on probability, that researchers use to decide whether two variables are related. Remember, as researchers always use samples, and because of the possible error, they use significance testing to decide whether the relationships observed are real, or not. Researchers are then able to use a criteria level (significance testing) to decide whether or not their findings are probable (confident of their findings) or not probable (not confident of their findings).

This criterion is expressed in terms of percentages, and their relationship to probability values. If we accept that we can never be 100 per cent sure of our findings, we have to set a criterion of how certain we want to be of our findings. Traditionally, two criteria are used. The first is that we are 95 per cent confident of our findings, the second is that we are 99 per cent confident of our findings. This is often expressed in another way. Rather, there is only a 5 per cent (95 per cent confidence) or 1 per cent (99 per cent confidence) probability that we have made an error. In terms of significance testing these two criteria are often termed the 0.05 (5 per cent) and 0.01 (1 per cent) significance levels.

Throughout this handbook, you will be using a number of tests to determine whether there is a significant association/relationship between two variables. These tests always provide a probability statistic, in the form of a value; e.g. 0.75, 0.40, 0.15, 0.04, 0.03 and 0.002. Here, the notion of significance testing is essential. This probability statistic is compared against the criteria of 0.05 and 0.01 to decide whether the findings are significant. If the probability value (p) is less than 0.05 (p<0.05) or less than 0.01 (p<0.01) then we conclude that the finding is significant. If the probability value is more than 0.05 (p>0.05) then we decide that the finding is not significant.

Therefore we can use this information in relation to our research idea and we can determine whether our variables are significantly related, or not. Therefore, for the probability values stated above:

* The probability values of 0.75, 0.40 and 0.15 are greater than 0.05 (p>0.05) and these probability values are not significant at the 0.05 level (p>0.05).
* The probability values of 0.04 and 0.03 are less than 0.05 (p<0.05) and these probability values are significant at the 0.05 level (p<0.05).
* The probability value of 0.002 is less than 0.01 (p<0.01) therefore this probability value is significant at the 0.01 level (p<0.01).’
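The comparison of a probability value against the 0.05 and 0.01 criteria can be sketched as a short function and applied to the six example values in the extract. The function name `significance_label` is hypothetical, and treating a value exactly equal to a criterion as significant is an assumption, since the extract only covers values strictly above or below it.

```python
def significance_label(p):
    """Classify a probability value against the 0.01 and 0.05 criteria."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    if p <= 0.01:
        return "significant at the 0.01 level"
    if p <= 0.05:
        return "significant at the 0.05 level"
    return "not significant"


# The probability values listed in the extract.
for p in (0.75, 0.40, 0.15, 0.04, 0.03, 0.002):
    print(p, significance_label(p))
```

The output groups the values in the same way as the extract: the first three are not significant, the next two are significant at the 0.05 level, and the last is significant at the 0.01 level.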


4.6.2 One or Two-Tailed Tests

The above test has been based on a two-tailed test as the null and alternative hypotheses did not specify any direction. If we were going to perform a one-tailed test we would first need to look at the mean values of the data and then rewrite our hypotheses accordingly. Remember that when applying a one-tailed test it is first necessary to establish whether the difference in the samples corresponds to the direction outlined in the alternative hypothesis. For example, if the alternative hypothesis is that the mean of sample Y is greater than the mean of sample X, the null hypothesis can only be rejected if the mean of sample Y is greater than the mean of sample X and if the difference is significant at the chosen level.

If we use Descriptive Statistics in SPSS to look at the mean turnovers for businesses in the Chichester and Arun Districts we would find that the mean turnover in the Chichester District is £43,968.47 and in the Arun District is £37,591.69. The mean turnover is higher in Chichester, which suggests that turnover may be influenced by location. We can therefore conduct a one-tailed t-test to test whether there is actually a significant difference between the two mean scores. In this case:

The Null Hypothesis: H0: μa = μb

There is no significant difference in Turnover between Areas; business turnover is not influenced by location.

The Alternative Hypothesis: H1: μa > μb (where a denotes the Chichester District and b the Arun District)

There is a significant difference in Turnover between Areas; business turnover is higher in the Chichester District than in the Arun District.

To calculate the one-tailed level of significance, divide the two-tailed significance value by 2 (0.000/2). The resultant one-tailed value would also be displayed as 0.000, which would still be significant (p<0.05).
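The conversion from a two-tailed to a one-tailed significance value, including the check that the observed difference lies in the predicted direction, can be sketched as follows. The function name `one_tailed_p` is hypothetical. Returning 1 minus half the two-tailed value when the difference runs against the prediction is the usual convention for symmetric test statistics, and is an addition to what the text states. The two-tailed value of 0.08 in the second and third calls is a made-up figure for illustration.

```python
def one_tailed_p(two_tailed_p, observed_difference, predicted_positive=True):
    """Convert a two-tailed p-value to a one-tailed one.

    observed_difference is mean of group a minus mean of group b.
    predicted_positive is True when H1 states that a is larger than b.
    """
    in_predicted_direction = (observed_difference > 0) == predicted_positive
    if in_predicted_direction:
        return two_tailed_p / 2
    # The difference runs against the prediction, so H0 cannot be rejected.
    return 1 - two_tailed_p / 2


# Mean turnovers quoted in the text: Chichester District and Arun District.
difference = 43968.47 - 37591.69

print(one_tailed_p(0.000, difference))  # SPSS displays 0.000 for this test
print(one_tailed_p(0.08, difference))   # hypothetical two-tailed value
print(one_tailed_p(0.08, -difference))  # same value, opposite direction
```

The second call shows why the choice matters: a two-tailed value of 0.08 is not significant at the 0.05 level, but the corresponding one-tailed value of 0.04 is, provided the direction was predicted in advance.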


4.6.3 Choosing the Correct Data for a T-Test

SPSS will not tell you if you are using the wrong data in a test, and it is therefore imperative that you are capable of selecting the right variables to use in a t-test. This will be central to your assessment in this module and it is vital that you get it right. Let us first refer back to Table 4.3. This table clearly shows that a t-test requires a combination of one continuous variable and one categorical variable (with two levels).

In the worked example provided, Turnover08 was the continuous variable and Area was the categorical variable. Note that Area has two levels (i.e. 1 - Chichester District and 2 - Arun District). You can only use categorical variables that have two levels in a t-test. The Independent-Samples T Test dialog box itself provides a clue here, as you are only able to define two groups (levels) within the Grouping Variable.


In this case also note that the continuous variable (Turnover08) goes in the Test Variable box.
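Since SPSS will not warn you about unsuitable variables, it can help to spell out the two requirements as an explicit check. The sketch below is illustrative only: the function name `check_t_test_variables` is hypothetical and the turnover figures are made up, not taken from the Dataset file. It only checks that the test variable is numeric and that the grouping variable has exactly two levels; it does not check normality or equality of variances.

```python
from numbers import Real


def check_t_test_variables(test_values, grouping_values):
    """Return a list of problems; an empty list means the pair is usable."""
    problems = []
    if len(test_values) != len(grouping_values):
        problems.append("each case needs both a test value and a group code")
    # The Test Variable must be continuous, so every value must be a number.
    if not all(
        isinstance(value, Real) and not isinstance(value, bool)
        for value in test_values
    ):
        problems.append("test variable must be continuous (numeric)")
    # The Grouping Variable must be categorical with exactly two levels.
    levels = set(grouping_values)
    if len(levels) != 2:
        problems.append(
            f"grouping variable must have exactly two levels, found {len(levels)}"
        )
    return problems


# Hypothetical turnover figures with area codes (1 and 2).
turnover = [43000.0, 37000.0, 41000.0, 39000.0]
area = [1, 2, 1, 2]
print(check_t_test_variables(turnover, area))

# A grouping variable with three levels is rejected.
print(check_t_test_variables(turnover, [1, 2, 3, 1]))
```

An empty list for the first call indicates that the pair has the right shape for an independent-samples t-test; the second call reports the grouping variable as unsuitable.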


Activity 16: Referring to the variables in the Dataset file and your accompanying data set guide, attempt to complete the following diagram listing Test Variables and Grouping Variables that would be suitable for use in a series of t-tests.

Test Variables

Grouping Variables


Activity 17: From the list of potential relationships that you have identified overleaf, please conduct 3 separate t-tests and record your results in the following tables. For each test, identify a research scenario that you are using the test to explore.

Table 18: Student T-Test 1

Research Scenario:

Test Variable:

Grouping Variable:

Null Hypothesis:

Alternative Hypothesis:

SPSS Output

* Record the p-value of the Levene Test:
* Is this significant (yes/no)?
* Is your test based on Equal variances assumed or Equal variances not assumed?
* Record the value of p (Sig. 2-tailed):
* Is the value of p significant (yes/no)?

Your conclusions (with full reference to the null and alternative hypotheses):


Table 19: Student T-Test 2

Research Scenario:

Test Variable:

Grouping Variable:

Null Hypothesis:

Alternative Hypothesis:

SPSS Output

* Record the p-value of the Levene Test:
* Is this significant (yes/no)?
* Is your test based on Equal variances assumed or Equal variances not assumed?
* Record the value of p (Sig. 2-tailed):
* Is the value of p significant (yes/no)?

Your conclusions (with full reference to the null and alternative hypotheses):


Table 20: Student T-Test 3

Research Scenario:

Test Variable:

Grouping Variable:

Null Hypothesis:

Alternative Hypothesis:

SPSS Output

Record the p-value of the Levene Test:

Is this significant: yes/no?

Is your test based on: Equal variances assumed / Equal variances not assumed

Record the value of p (Sig. 2-tailed):

Is the value of p significant: yes/no?

Your conclusions (with full reference to the null and alternative hypotheses):


4.7

Statistical Tests: Related T-Test

Using SPSS to Calculate the T-Test for Related Samples

The t-test can also be used to examine the means of the same participants in two conditions or at two points in time. The advantage of using the same participants or matched participants is that the amount of error deriving from differences between participants is reduced. The difference between a related and an unrelated t-test lies essentially in the fact that two scores from the same person are likely to vary less than two scores from two different people. For example, if you were to weigh the same person on two occasions, the difference between those two weights is likely to be less than the difference between the weights of two separate individuals.

The standard error for the related t-test is therefore smaller than that for the unrelated one. Indeed, the standard error of the difference in means for the related t-test depends on the extent to which the pairs of scores are similar or related. The more similar they are, the smaller the estimated standard error will be.

In the following example we are going to look at paired data from the Dataset file. Let us consider a potential research scenario to help you place the use of the related t-test in context.

Scenario:

Between 2008 and 2010, Tourism South East ran a series of courses in conjunction with the Green Tourism Business Scheme to help GTBS members progress to the next stage of accreditation (e.g. bronze to silver; silver to gold). As part of the monitoring process, Tourism South East want to establish if these courses have had any impact on GTBS scores.

Variables:

We are going to examine differences in GTBS scores in 2008 and 2010.

As always, we need to start by defining our hypotheses. In this instance, the null and alternative hypotheses have been stated as:

Ho: μa= μb

There is no significant difference between the GTBS scores in 2008 and 2010.

H1: μa≠ μb

There is a significant difference in the GTBS scores in 2008 and 2010.

To perform the related t-test, first move the mouse over Analyse and press the left mouse button.

Move the mouse over Compare Means and then over Paired-Samples T-Test.


The Paired-Samples T-Test dialog box appears.

Move the mouse over GTBS08 and press the left mouse button. GTBS08 is selected. Now move the mouse over GTBS10 and press the left mouse button. GTBS10 is selected. Move the mouse over the central button and press the left mouse button.

GTBS08 and GTBS10 now appear in the Paired Variables box. Click OK.


The procedure produces the following results in the output window:

The first table in the SPSS output is Paired Samples Statistics, which reports the descriptive statistics. By observing the mean scores we can see that mean GTBS scores were higher in 2010 than in 2008. This difference appears to support our alternative hypothesis. To establish whether this result is significant or has merely occurred by chance, we refer to the Paired Samples Test table. The key elements of the Paired Samples Test include:

(a) The test statistic - this is denoted as t; in this case the value of t = -11.386.

(b) The degrees of freedom - the degrees of freedom equal the size of the sample (300) minus 1. Only 1 is subtracted because there is a single set of respondents, each measured twice. The degrees of freedom value is placed in brackets between the t and the = sign (e.g. t(299) = -11.386).

(c) The probability value - as in all tests we also have to report the probability value. Note that the value of p = .000 (which, remember, we report as p<.0005) is less than 0.05, which means that there has been a significant change in GTBS scores between 2008 and 2010.

Let us bring these different elements together. As can be seen from the SPSS output, the difference between the two means is significant. This is specifically reported as:

There is a significant difference in GTBS scores between 2008 and 2010, t(299) = -11.386, p (<.0005) < 0.05.
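To see where the t value and the degrees of freedom come from, the calculation can be sketched in Python using only the standard library. The scores below are made up for five businesses and are not from the Dataset file, so the result will not match the SPSS output above; the function name paired_t is also just an illustrative choice. The sketch assumes the paired differences are not all identical (otherwise the standard error would be zero).

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a related-samples t-test on first minus second."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n))
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1


# Made-up scores for five businesses (illustration only)
gtbs08 = [50, 55, 60, 52, 58]
gtbs10 = [54, 60, 63, 57, 61]
t, df = paired_t(gtbs08, gtbs10)
print(f"t({df}) = {t:.3f}")  # t(4) = -8.944
```

The t value is negative because the 2008 scores are entered first and are lower than the 2010 scores, which is the same reason the SPSS value above is negative.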


However, we can be more specific and in our alternative hypothesis look for an improvement in GTBS scores. As a result our alternative hypothesis would be:

H1: μa < μb (where a denotes 2008 and b denotes 2010)

There is a significant improvement in the GTBS scores between 2008 and 2010.

This means we are now conducting a one-tailed test, as we have specified the direction in which to examine change. To adapt the output to a one-tailed test we simply divide the two-tailed p-value by 2, which is valid here because the observed difference is in the hypothesised direction (mean scores were higher in 2010). The resultant value (.000) is still significant (p < .0005 < 0.05). As a result we can reject the null hypothesis and conclude that there has been a significant improvement in GTBS scores between 2008 and 2010, at the 95% confidence level. Specifically:

There has been a significant improvement in GTBS scores between 2008 and 2010, t(299) = -11.386, p (<.0005) < 0.05.
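The halving rule only applies when the test statistic falls in the direction named in the alternative hypothesis. The following Python sketch makes that condition explicit for a symmetric test statistic such as t; the function name one_tailed_p and the example values are hypothetical.

```python
def one_tailed_p(two_tailed_p, statistic, expect_negative):
    """Convert a two-tailed p-value to a one-tailed one for a symmetric statistic."""
    in_expected_direction = (statistic < 0) == expect_negative
    half = two_tailed_p / 2
    # If the result points the wrong way, the one-tailed p-value is large, not small
    return half if in_expected_direction else 1 - half


# GTBS08 minus GTBS10 is expected to be negative if scores have improved
print(one_tailed_p(0.04, -2.1, expect_negative=True))   # 0.02
print(round(one_tailed_p(0.04, 2.1, expect_negative=True), 2))
```

In the worked example t is negative (2008 scores minus 2010 scores), which matches the hypothesised improvement, so dividing the two-tailed value by 2 is appropriate.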


Activity 18: We are now going to use the Dataset file to conduct a number of additional related t-tests. Please complete the following tables, making clear reference to the SPSS output. You have been provided with research scenarios for each table to place the test in context.

Table 21: Related T-Test: Turnover08 Against Turnover10
[Tourism South East want to establish if regional marketing strategies implemented between 2008 and 2010 have had an impact on business turnover.]

Null Hypothesis:

Alternative Hypothesis:

Comment on the SPSS Output:

Table 22: Related T-Test: Green08 Against Green10
[Tourism South East want to establish if support given to the use of local produce has impacted on how much businesses spend on local produce.]

Null Hypothesis:

Alternative Hypothesis:

Comment on the SPSS Output:

Note that the tests conducted here relate to the entire sample. If we used the Split File option as we have done previously, we could conduct Related T-tests to provide comparisons between selected variables such as Area, Town or G-Strategy. Attempt to apply the Split File option and repeat one of the tests above. Cut and paste the output into your log book.



Statistical Tests: Mann Whitney U Test


4.8

Non Parametric Tests

4.8.1

The Mann-Whitney U Test (Independent Samples)

When comparing samples of geographical data, the assumptions of normality which underpin the accuracy of parametric tests, such as the t-test, are often quite unrealistic. In these cases, a non-parametric test such as the Mann Whitney U test provides a convenient alternative. The Mann Whitney U test is the non-parametric counterpart of the t-test for unrelated (independent) data. The test is used to determine whether ordinal data collected in two different samples differ significantly. As a non-parametric test it does not require the population from which the sample was taken to be normally distributed, and it is applicable to ordinal (ranked) data. In addition, the sample sizes of the data sets need not be equal. The test calculates whether there is a significant difference in the distribution (based on the median) of the data by comparing the ranks of each data set.

Within the Mann Whitney U test the null hypothesis is that the two samples are taken from a common population, so that there should be no consistent difference between the two sets of values. Any observed differences are due entirely to chance in the sampling process.

To begin, open SPSS and open the Dataset file that you have used in previous sessions. We are going to use Mann Whitney to examine the relationship between different variables. Let us consider a potential research scenario to help you place the use of the Mann Whitney test in context.

Scenario:

Tourism South East are developing a new e-tourism strategy and they want to establish if there is any relationship between e-strategy (e-commerce adopters and non-adopters) and business attitudes to the value of the internet.

Variables:

We are therefore going to examine the relationship between EStrategy and the perceived value of the internet in 2008 (Webqual08).

4.8.2

Writing Null and Alternative Hypotheses

Before we start, we first need to establish a null and an alternative hypothesis. In this case:

The Null Hypothesis: Ho:

There is no significant difference between the two groups in terms of their perceived value of the internet; e-strategy does not influence attitudes towards the internet.

The Alternative Hypothesis: H1:

There is a significant difference between the two groups in terms of their perceived value of the internet; e-strategy does influence attitudes towards the internet.



4.9

Using SPSS to Calculate Mann Whitney

To perform the Mann Whitney U test, first move the mouse over Analyse and press the left mouse button. Move the mouse over Nonparametric Tests and then over Legacy Dialogs. Select 2 Independent Samples. The Two-Independent Samples Tests dialog box appears.

Select the variable labelled Webqual08. Move the mouse over the central arrow and press the left mouse button so Webqual08 appears in the Test Variable List.


Select the variable EStrategy and press the lower arrow so that EStrategy appears in the Grouping Variable box.

Move the mouse over Define Groups and press the left mouse button.

The Define Groups dialog box appears. In the box beside Group 1: type 1 and in the box beside Group 2: type 2. Note that in this case the groups have been defined in terms of their two codes (1 = E-Commerce Adopter and 2 = E-Commerce Non-Adopter). The values can also be used as a cut-off point, at or above which all the values constitute one group while those below form the other group. In this instance the cut-off point is two, which would be placed in parentheses after EStrategy.

Move the mouse over Continue and press the left mouse button. This will return you to the Two-Independent Samples Tests dialog box. Move the mouse over OK and press the left mouse button. SPSS performs the test and displays the results in the Output window.


The first subtable, Ranks, shows the number of businesses in each group and the total number of businesses. The Mean Rank indicates the mean rank of scores within each group and the Sum of Ranks indicates the total sum of all ranks within each group. If our null hypothesis of no significant difference were true, we would expect the mean ranks to be roughly similar across the two groups. As we can see, the mean rank for E-Commerce Adopters is 178.82 and for E-Commerce Non-Adopters is 116.35. There is a clear difference between the two, and to determine whether this difference is significant we refer to the Test Statistics table below.

This tells us that the Mann Whitney U value is 6508.000 and that the probability value (p), ascertained by examining the Asymp. Sig. (2-tailed), is .000. In this case, the p-value (.000) (reported as p<.0005) is less than 0.05, so we can reject the null hypothesis and conclude that there is a significant difference between e-commerce adopters and non-adopters in their attitudes towards the internet.

Our Mann Whitney test was two-tailed, but again we could be more specific by indicating a direction in our alternative hypothesis. In this case the alternative hypothesis would be:

H1:

There is a significant difference between the two groups in terms of their perceived value of the internet. E-commerce adopters rank the value of the internet higher than e-commerce non-adopters.

Note that an initial examination of the mean ranks would support our alternative hypothesis. As before, for a one-tailed test, the p-value needs to be halved (.000/2 = .000), which is appropriate because the observed difference is in the hypothesised direction. In this case the test would still be significant as the p-value (.000) (reported as p<.0005) is less than 0.05, so we can again reject the null hypothesis and conclude that there is a significant difference between the two EStrategy groups in their attitudes towards the internet, and that e-commerce adopters rank the value of the internet higher than e-commerce non-adopters.
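The Mean Rank and U values in the output can be reproduced by hand by ranking all scores together and then summing the ranks in each group. The Python sketch below illustrates this with made-up attitude scores (not from the Dataset file). It reports the smaller of the two possible U values, which is a common convention and is assumed here rather than confirmed against SPSS; the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return mean ranks for both groups and the smaller U value."""
    ranks = average_ranks(list(group1) + list(group2))
    n1, n2 = len(group1), len(group2)
    sum1, sum2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = sum1 - n1 * (n1 + 1) / 2
    u2 = sum2 - n2 * (n2 + 1) / 2
    return {"mean_rank_1": sum1 / n1, "mean_rank_2": sum2 / n2, "U": min(u1, u2)}


# Made-up attitude scores (illustration only)
adopters = [5, 4, 5, 3]
non_adopters = [2, 3, 1, 2]
result = mann_whitney_u(adopters, non_adopters)
print(result)  # {'mean_rank_1': 6.375, 'mean_rank_2': 2.625, 'U': 0.5}
```

As in the SPSS output, a large gap between the two mean ranks goes together with a small U value, which points towards rejecting the null hypothesis.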


4.9.1

Choosing the Correct Data for a Mann Whitney Test

SPSS will not tell you if you are using the wrong data in a test, so it is imperative that you are able to select the right variables to use in a Mann Whitney test. This will be central to your assessment in this module and it is vital that you get it right. Let us first refer back to Table 3 on page 4-133. This table shows that a Mann Whitney test is non-parametric and requires a combination of one ordinal or continuous variable and one categorical variable (with two levels).

In the worked example provided, Webqual08 was the test variable and EStrategy was the categorical variable. Note that EStrategy has two levels (i.e. 1 - E-Commerce Adopter and 2 - E-Commerce Non-Adopter). You can only use categorical variables that have two levels in a Mann Whitney test. The Mann Whitney dialog box itself provides a clue here, as you are only able to define two groups (levels) within the Grouping Variable.

In this case also note that the test variable (Webqual08) goes in the Test Variable box.


Activity 19: Referring to the variables in the Dataset file, attempt to complete the following diagram listing Test Variables and Grouping Variables that would be appropriate for use in a series of Mann Whitney Tests.

Test Variables

Grouping Variables


Activity 20: From the list of potential relationships that you have identified overleaf, please conduct 3 separate Mann Whitney tests and record your results in the following tables. For each test, identify a research scenario that you are using the test to explore.

Table 23: Mann Whitney Test 1

Research Scenario:

Test Variable:

Grouping Variable:

Null Hypothesis:

Alternative Hypothesis:

SPSS Output

Record the Mann Whitney U value:

Record the value of p (Asymp. Sig. (2-tailed)):

Is the value of p significant: yes/no?

Your conclusions (with full reference to the null and alternative hypotheses, and your test statistics):


Table 24: Mann Whitney Test 2

Research Scenario:

Test Variable:

Grouping Variable:

Null Hypothesis:

Alternative Hypothesis:

SPSS Output

Record the Mann Whitney U value:

Record the value of p (Asymp. Sig. (2-tailed)):

Is the value of p significant: yes/no?

Your conclusions (with full reference to the null and alternative hypotheses, and your test statistics):


Table 25: Mann Whitney Test 3

Research Scenario:

Test Variable:

Grouping Variable:

Null Hypothesis:

Alternative Hypothesis:

SPSS Output

Record the Mann Whitney U value:

Record the value of p (Asymp. Sig. (2-tailed)):

Is the value of p significant: yes/no?

Your conclusions (with full reference to the null and alternative hypotheses, and your test statistics):


4.10

Statistical Tests: Wilcoxon Signed Ranks

Using SPSS to Calculate Wilcoxon Signed Ranks Test (Related Data Sets)

The Wilcoxon signed ranks test is the non-parametric counterpart of the t-test for related data, or paired t-test. The basic assumptions for the test are that the data are paired across conditions or time, and that the differences between the pairs are symmetrically distributed, although they need not be normal or any other shape. The data should also be of at least ordinal level, which makes the test very useful for analysing data based on ranked scores.

The test examines the differences between data collected in two different conditions or at two different times by ranking the differences in values between the two conditions. For example, you may want to know whether a village’s fertility or mortality rate changes significantly between dates, or whether the conditions under which a questionnaire or interview is conducted significantly influence the findings of a study. In this case, the test calculates whether there is a significant difference by examining whether the ranks of individual phenomena differ between conditions or times.

To begin, open SPSS and open the Dataset file that you have used in previous sessions. We are going to use the Wilcoxon test to examine the relationship between different variables. Let us consider a potential research scenario to help you place the use of the Wilcoxon test in context.

Scenario:

Between 2008 and 2010, Tourism South East have been running E-Commerce workshops across the South East region. As part of the monitoring process, Tourism South East want to establish if these workshops have had any impact on business attitudes to the value of the internet.

Variables:

Therefore we are going to examine the relationship between Webqual08 and Webqual10.

In this instance, the null and alternative hypotheses have been stated as:

Ho:

There is no difference in business attitudes towards the value of the internet between 2008 and 2010

H1:

There is a difference in business attitudes towards the value of the internet between 2008 and 2010.

The significance level has been set at 0.05 (i.e. a 95% confidence level). Note that this is also a two-tailed test as no direction has been specified in the alternative hypothesis.


To perform the Wilcoxon Test, first move the mouse over Analyse and press the left mouse button.

Move the mouse over Nonparametric Tests and then Legacy Dialogs and then over 2 Related Samples and press the left mouse button.

The Two-Related Samples Tests dialog box appears.

Move the mouse over Webqual08 and press the left mouse button. Webqual08 is selected. Now move the mouse over Webqual10 and press the left mouse button. Webqual10 is selected. You will notice that in the Current Selections area in the dialog box, Webqual08 is now beside Variable 1 and Webqual10 is beside Variable 2.


Move the mouse over the central button and press the left mouse button. Webqual08 and Webqual10 now appear in the Paired Variables box.

Click OK. The procedure produces the following in the output window.


The first subtable, Ranks, shows the number of negative, positive and tied ranks, along with the Mean Rank and the Sum of Ranks. Let us explore this in additional detail. Key observations:

Webqual10 has been entered into the equation first, therefore the calculation is based on the attitudes scores in 2010 minus the attitude scores in 2008.

The Negative Ranks indicate how many ranks of Webqual08 were larger than Webqual10. Here the value is 0, which would initially suggest that attitude scores have increased.

The Positive Ranks indicate how many ranks of Webqual08 were smaller than Webqual10. The value here is 259.

The Tied Ranks indicate how many of the rankings of Webqual08 and Webqual10 are the same. The value here is 41.

The Total is the total number of ranks, which is equal to the number of attitude scores in the sample (in this case 300).

From the second subtable, Test Statistics, it can be seen that the value of z = -16.093, which is significant as the value of p (.000) is less than 0.05. We can therefore reject the null hypothesis and conclude that there is a significant difference in business attitudes towards the value of the internet between 2008 and 2010. The findings of the Wilcoxon test should be reported as:

z = -16.093, p (<.0005) < 0.05

The Wilcoxon test was two-tailed, but again we could be more specific by indicating a direction in our alternative hypothesis. In this case the alternative hypothesis would be:

H1: There is a significant difference in attitudes towards the value of the internet between 2008 and 2010; business attitudes have improved.

Note that an initial examination of the data in the Ranks table supports our alternative hypothesis. As before, for a one-tailed test, the p-value needs to be halved (.000/2 = .000), which is appropriate because the observed change is in the hypothesised direction. In this case the test would still be significant as the p-value (.000) (reported as p<.0005) is less than 0.05, so we can again reject the null hypothesis and conclude that there is a significant difference in attitudes towards the value of the internet between 2008 and 2010, and that business attitudes have improved.
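The counts in the Ranks table and the z value can be illustrated with a short Python sketch. The scores are made up (not from the Dataset file) and the function name wilcoxon_summary is hypothetical. The z value here uses the basic normal approximation without a correction for tied ranks, so it should be read as an approximation; SPSS may report a slightly different figure for the same data.

```python
import math


def wilcoxon_summary(first, second):
    """Summarise second minus first as negative, positive and tied counts plus z."""
    diffs = [s - f for f, s in zip(first, second)]
    nonzero = [d for d in diffs if d != 0]  # tied pairs are left out of the ranking
    magnitudes = sorted(abs(d) for d in nonzero)

    def rank_of(size):
        # Equal magnitudes share the mean of the ranks they occupy
        first_position = magnitudes.index(size) + 1
        return first_position + (magnitudes.count(size) - 1) / 2

    sum_positive = sum(rank_of(abs(d)) for d in nonzero if d > 0)
    sum_negative = sum(rank_of(abs(d)) for d in nonzero if d < 0)
    n = len(nonzero)
    expected = n * (n + 1) / 4
    spread = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(sum_positive, sum_negative) - expected) / spread if n else float("nan")
    return {
        "negative": sum(1 for d in diffs if d < 0),
        "positive": sum(1 for d in diffs if d > 0),
        "ties": len(diffs) - n,
        "sum_positive": sum_positive,
        "sum_negative": sum_negative,
        "z": z,
    }


# Made-up attitude scores for six businesses (illustration only)
webqual08 = [2, 3, 3, 4, 2, 3]
webqual10 = [4, 4, 3, 5, 3, 5]
summary = wilcoxon_summary(webqual08, webqual10)
print(summary["negative"], summary["positive"], summary["ties"])  # 0 5 1
print(round(summary["z"], 3))  # -2.023
```

As in the worked example, no negative ranks and many positive ranks mean that scores in 2010 were rarely lower than in 2008, and the negative z value reflects how far the smaller rank sum falls below its expected value.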


Activity 21: We are now going to use the Dataset file to conduct a number of additional Wilcoxon Tests. Please complete the following tables, making clear reference to the SPSS output. You have been provided with research scenarios for each table to place the test in context.

Table 26: Wilcoxon Test: BLINK08-BLINK10
[Following a complete review of their business advisory services, instigated by poor industry feedback in 2007, Business Link need to establish if business attitudes towards their advisory services have improved between 2008 and 2010.]

Null Hypothesis:

Alternative Hypothesis:

Comment on the SPSS Output:

Table 27: Wilcoxon Test: WEBVALUE08-WEBVALUE10
[Tourism South East want to establish if business attitudes to destination management systems have changed following the change of DMS platform and a complete relaunch of booking systems.]

Null Hypothesis:

Alternative Hypothesis:

Comment on the SPSS Output:

Note that the tests conducted here relate to the entire sample. If we used the Split File option as we have done previously, we could conduct Wilcoxon Tests to provide comparisons between selected cases such as Area, Town or G-Strategy. Attempt to apply the Split File option and repeat one of the tests above. Cut and paste the output into your log book.


Notes:
