Research Design and Data Collection [revised] 2014


Research Design & Data Collection: 2

BML224: Data Analysis for Research


Research Design and Data Collection: Previously


Types of Data - Summary

NOIR
NOMINAL, ORDINAL: NON-PARAMETRIC
INTERVAL, RATIO: PARAMETRIC


Quantitative Research Design
Nature of the Question
Type of Data: NOMINAL, ORDINAL, INTERVAL, RATIO
Type of Analysis: DESCRIPTIVE, INFERENTIAL


Quantitative Research Design

General Purpose: Explore Relationship Between Variables
  Specific Purpose: Compare Groups
    Type of Question/Hypothesis: Difference
    General Type of Statistic: e.g. t-test, Mann Whitney
  Specific Purpose: Find Strength of Association, Relate Variables
    Type of Question/Hypothesis: Associational
    General Type of Statistic: e.g. correlation

General Purpose: Description (only)
  Specific Purpose: Summarise Data
    Type of Question/Hypothesis: Descriptive
    General Type of Statistic: Descriptive Statistics (e.g. mean, percentage, range)

[Source: Morgan, G. et al. (2011), IBM SPSS for Introductory Statistics, Routledge, London, p. 6]




Basic Descriptive Statistics

Quantitative Research Design: Type of Analysis (Tabular)

Expected Grade for BML224
What grade do you expect to get for the module?

Grade     2010/2011   %        2011/2012   %
Grade A   6           8.5%     3           6%
Grade B   17          23.9%    16          32%
Grade C   37          52.1%    19          38%
Grade D   11          15.5%    11          22%
Grade E   0           0%       1           2%
Total     71          100%     50          100%

Quantitative Research Design: Type of Analysis (Graphical)

[Bar chart: Expected Grade for BML224. x-axis: Expected Grade (A (70%+), B (60-69%), C (50-59%), D (40-49%), F (<40%)); y-axis: Percentage (%), 0% to 40%]
[Pie chart: A (70%+): 6%; B (60-69%): 32%; C (50-59%): 38%; D (40-49%): 22%; F (<40%): 2%]
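The percentage columns in a frequency table like this one are each count divided by the column total. A minimal Python sketch of that arithmetic follows, using the counts from the Expected Grade table; the variable and function names are illustrative only.

```python
# Counts taken from the Expected Grade table (students per expected grade).
counts_2010 = {"A": 6, "B": 17, "C": 37, "D": 11, "E": 0}
counts_2011 = {"A": 3, "B": 16, "C": 19, "D": 11, "E": 1}


def percentages(counts):
    """Return each category's share of the total as a percentage (1 d.p.)."""
    total = sum(counts.values())
    return {grade: round(100 * n / total, 1) for grade, n in counts.items()}


pct_2010 = percentages(counts_2010)
pct_2011 = percentages(counts_2011)
print(pct_2010)
print(pct_2011)
```

Because the two cohorts differ in size (71 and 50 students), the percentages rather than the raw counts are what make the two years comparable.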


Basic Descriptives

Quantitative Research Design: Type of Analysis (Tabular)

How confident are you about starting this module?

Level                       2010/2011   %         2011/2012   %
7 - Very Confident          0           0.0%      0           0.0%
6 - Quite Confident         4           5.60%     1           2.0%
5 - Confident               16          22.50%    10          20.0%
4 - Uncertain               28          39.40%    26          52.0%
3 - Anxious                 14          19.70%    7           14.0%
2 - Quite Anxious           4           5.60%     4           8.0%
1 - Very Anxious            5           7.00%     2           4.0%
Uncertain to very anxious   51          72%       39          78%
Sample (n)                  71                    50

Quantitative Research Design: Type of Analysis (Graphical)

[Pie chart: Student Confidence Levels 2011. Quite Confident: 2%; Confident: 20%; Uncertain: 52%; Anxious: 14%; Quite Anxious: 8%; Very Anxious: 4%]

Quantitative Research Design: Type of Analysis (Tabular)

Attitudes Towards Statistics

Statement                                     Strongly Agree [5]   Agree [4]   No Opinion [3]   Disagree [2]   Strongly Disagree [1]
This is my first ever statistics class        32%                  37%         4%               14%            13%
I am worried about this module                17%                  26%         18%              32%            7%
If I could avoid taking this module I would   14%                  32%         25%              20%            9%
I've never enjoyed maths                      10%                  24%         24%              30%            13%
Passing is my main goal for this module       32%                  38%         17%              10%            1%
I do not see the relevance of this module     9%                   24%         39%              21%            7%

Quantitative Research Design: Type of Analysis (Graphical)

[Stacked bar chart: Student Attitudes to Statistics. One bar per statement; segments: Strongly Agree, Agree, No Opinion, Disagree, Strongly Disagree; axis: Percentage, 0% to 100%]


Basic Descriptives

Quantitative Research Design: Type of Analysis (Analytical)

Descriptive Statistics – Turnover 2010

Mean                 £41,311.40
Median               £44,640.00
Mode                 £44,760.00
Standard Deviation   £9,191.03

Quantitative Research Design: Type of Analysis (Graphical)

[Charts: Distribution of the Data; Box plot]
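The four summary measures reported for turnover can be reproduced with Python's standard `statistics` module. The sketch below uses a small set of hypothetical turnover values (not the survey data), so its results will not match the slide figures.

```python
from statistics import mean, median, mode, stdev

# Hypothetical annual turnover values in pounds, for illustration only.
turnover = [32000, 38000, 44000, 44000, 52000]

summary = {
    "mean": mean(turnover),          # arithmetic average
    "median": median(turnover),      # middle value once sorted
    "mode": mode(turnover),          # most frequent value
    "std_dev": stdev(turnover),      # sample standard deviation (n - 1 denominator)
}
print(summary)
```

Here the mean sits below the median, which, as in the slide figures, hints at a few low values pulling the average down.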


Research Design and Data Collection 2: Learning Outcomes

Aims:
• To map out different types of advanced statistical analysis and demonstrate how the choice of statistical analysis is influenced by the type of data
• To map and log opportunities for advanced statistical analysis and statistical tests against the different variables within the dataset guide




Research Design and Data Collection:

Exploratory Data Analysis: Crosstabulations


Crosstabulations

Definition
• A crosstabulation is a joint frequency distribution of cases based on two or more categorical variables
• Displaying a distribution of cases by their values on two or more variables is known as contingency table analysis
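A joint frequency distribution can be built by counting how often each pair of category values occurs. The following is a minimal sketch using only the standard library; the cases and the response labels ("Proactive", "Reactive") are hypothetical and are not taken from the dataset guide.

```python
from collections import Counter

# Hypothetical cases: (area, response to recession). Labels are illustrative only.
cases = [
    ("Chichester", "Proactive"), ("Chichester", "Reactive"),
    ("Chichester", "Proactive"), ("Arun", "Reactive"),
    ("Arun", "Reactive"), ("Arun", "Proactive"),
]


def crosstab(pairs):
    """Joint frequency distribution as {row value: {column value: count}}."""
    joint = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: joint[(r, c)] for c in cols} for r in rows}


def row_percentages(table):
    """Analysis by row: each cell as a percentage of its row total."""
    return {
        r: {c: round(100 * n / sum(cells.values()), 1) for c, n in cells.items()}
        for r, cells in table.items()
    }


table = crosstab(cases)
by_row = row_percentages(table)
print(table)
print(by_row)
```

Analysis by column works the same way with the roles of the two variables swapped, so each cell is expressed as a share of its column total.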


Crosstabulations
• Examples: Area by Response to Recession
  • Analysis by Row
  • Analysis by Column
• Examples: Size by Response to Recession
  • Analysis by Row
  • Analysis by Column


Crosstabulations: Variables for Analysis

Identify potential variables that could form the basis of four separate crosstabulations


Research Design and Data Collection:

Planning the Journey! Basic to Advanced Statistical Analysis


The Role of Statistical Tests in Advanced Analysis
• Used to make deductions/inferences about a particular data set or relationships (differences/associations) between different data sets
• Random sample of 50 households in two rural villages in West Sussex:
  • Village A: mean income £17,650
  • Village B: mean income £22,220
• A test can be used to determine if there is a ‘real difference’ or whether the difference occurred ‘purely by chance’


Statistical Tests: Parametric Tests

Parametric Tests: data conforms to a normal distribution and is interval or ratio in nature
• Independence of observations (except where the data is paired)
• Random sampling
• Interval scale measurement for the dependent variable
• A minimum sample size of 30 per group is recommended
• Equal variances of the populations from which the data is drawn
• Hypotheses are usually made about the mean of the population


Statistical Tests: Non-Parametric Tests

Non-Parametric Tests: data does not conform to a normal distribution; use ordinal data
• Independence of randomly selected observations except when paired
• Few assumptions concerning the distribution of the population
• Ordinal or nominal scale of measurement
• Ranks or frequencies of data are the focus of tests
• A minimum sample size of 30 per group is recommended
• Hypotheses are posed regarding ranks, medians or frequencies
• Sample size requirements are less stringent than for parametric tests


Basic to Advanced Statistical Analysis
• Scenario 1
• As part of a review of tourism competitiveness along the South Coast, local tourism officers have been asked to look at respective profit levels between businesses in the Arun and Chichester Districts, drawing on the results of the business survey.
• Where would you begin your analysis?
• Start with your descriptive analysis

This analysis suggests that there is a difference in profit levels between Chichester District and Arun District

• What would be a suitable test?


Choosing the Right Test



One Categorical and One Continuous


Student T-Test – Data Requirements

Grouping Variable [Independent Variable]
• A variable that stands alone and isn't changed by the other variables you are trying to measure
• Nominal (Categorical) [2 Levels]

Test Variable [Dependent Variable]
• A variable that depends on other factors
• Ratio or Interval (Continuous)


Basic to Advanced Statistical Analysis
• Scenario 1: variables for the Student T-Test
  • Test Variable: Profit, Ratio (continuous)
  • Grouping Variable: Area Code, Nominal (categorical), 2 Levels [Chichester District 1 / Arun District 2]


Student T-Test – Data Requirements

Identify potential variables for use in a Student T-Test from the dataset guide
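The Student T-Test compares the mean of a continuous test variable across the two levels of a grouping variable. As a rough illustration, the sketch below computes the pooled-variance t statistic and its degrees of freedom with the standard library. The profit values are hypothetical and far smaller than the recommended sample size; converting the statistic into a significance value is left to a statistics package such as SPSS.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Independent-samples t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical profit figures (in thousands of pounds) for the two area codes.
chichester = [12, 15, 14, 10, 14]
arun = [9, 11, 10, 8, 12]

t_stat, df = student_t(chichester, arun)
print(round(t_stat, 3), df)
```

The larger the absolute value of t for a given number of degrees of freedom, the less plausible it is that the difference in means arose purely by chance.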


Basic to Advanced Statistical Analysis
• Scenario 2
• Tourism South East are developing a new e-tourism strategy and they want to establish if there is any difference between e-strategy motives (e-commerce adopters and non-adopters) and business attitudes to the value of the internet in 2010.
• Where would you start?
• Start with your descriptive analysis

This analysis suggests that there is a difference in attitudes towards the internet between e-strategy adopters and non-adopters

• What would be a suitable test?


Choosing the Right Test



One Categorical and One Continuous


Mann Whitney – Data Requirements

Grouping Variable [Independent Variable]
• Nominal (Categorical) [2 Levels]

Test Variable [Dependent Variable]
• Ordinal


Basic to Advanced Statistical Analysis
• Scenario 2: variables for the Mann Whitney Test
  • Test Variable: WEBQUAL10, Ordinal
  • Grouping Variable: E-Strategy, Nominal (categorical), 2 Levels [E-Commerce Adopters 1 / E-Commerce Non-Adopters 2]


Mann Whitney – Data Requirements

Identify potential variables for use in a Mann Whitney Test from the dataset guide
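The Mann Whitney test works on ranks rather than raw values, which is why it suits an ordinal test variable. The sketch below computes the U statistic for two independent groups, giving tied values the average of their ranks. The ratings are hypothetical, and obtaining a significance value from U is assumed to be done in a statistics package.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ordered = sorted(values)
    # ordered.index(v) counts the values below v; ties occupy the next count(v) ranks.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical attitude ratings for the two e-strategy groups.
adopters = [5, 4, 5, 3, 4]
non_adopters = [2, 3, 1, 3, 2]

u_stat = mann_whitney_u(adopters, non_adopters)
print(u_stat)
```

A U close to zero means the two groups' ranks barely overlap; a U near half the product of the group sizes means they are thoroughly mixed.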


Basic to Advanced Statistical Analysis
• Scenario 3
• Between 2008 and 2010, Tourism South East ran a series of courses in conjunction with the Green Tourism Business Scheme to help GTBS members progress to the next stage of accreditation (e.g. bronze to silver; silver to gold). As part of the monitoring process, Tourism South East want to establish if these courses have had an impact on GTBS scores.
• Where would you start your analysis?
• Start with your descriptive analysis

This analysis suggests that there is a difference in GTBS scores between 2008 and 2010

• What would be a suitable test?


Choosing the Right Test

Two continuous variables: the same measure administered twice


Basic to Advanced Statistical Analysis
• Scenario 3: variables for the test
  • Paired Values: GTBS08 / GTBS10


Related or Paired Samples T-Test

Appropriate Related/Paired Variables (Ratio or Interval)

Identify potential variables for use in a Related or Paired Samples T-Test from the dataset guide
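A paired samples t-test looks at the difference within each pair and asks whether the mean difference is distinguishable from zero. The sketch below computes that t statistic with the standard library. The scores are hypothetical stand-ins for a before and after measure, and the significance value is assumed to come from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired-samples t statistic and degrees of freedom for matched measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical scores for the same five businesses in 2008 and 2010.
gtbs08 = [50, 55, 60, 65, 70]
gtbs10 = [54, 57, 66, 68, 75]

t_stat, df = paired_t(gtbs08, gtbs10)
print(round(t_stat, 3), df)
```

Because each business acts as its own control, the test depends only on the within-pair differences, not on how much the businesses differ from one another.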


Basic to Advanced Statistical Analysis
• Scenario 4
• Between 2008 and 2010, Tourism South East ran a series of e-commerce workshops across the South East region promoting e-commerce. As part of the monitoring process, Tourism South East want to establish if these workshops have had an impact on business attitudes to the value of the internet.
• Where would you start your analysis?
• Start with your descriptive analysis

This analysis suggests that there is a difference in attitudes towards the internet between 2008 and 2010

• What would be a suitable test?


Choosing the Right Test

Two continuous variables: the same measure administered twice


Basic to Advanced Statistical Analysis
• Scenario 4: variables for the test
  • Paired Values: Webqual08 / Webqual10


Wilcoxon

Appropriate Related/Paired Variables (Ordinal)

Identify potential variables for use in a Wilcoxon Test from the dataset guide
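The Wilcoxon test is the rank-based counterpart for paired ordinal data: it ranks the absolute within-pair differences and compares the rank totals of the increases and the decreases. The sketch below computes the signed-rank statistic. The ratings are hypothetical values on an assumed 1 to 7 scale, and pairs with no change are dropped, which is one common convention.

```python
def wilcoxon_w(before, after):
    """Wilcoxon signed-rank statistic: the smaller of the positive and negative rank sums."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop unchanged pairs
    abs_diffs = [abs(d) for d in diffs]
    ordered = sorted(abs_diffs)
    # Average ranks of the absolute differences (ties share the mean rank).
    ranks = [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in abs_diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


# Hypothetical ratings from the same six businesses in 2008 and 2010.
webqual08 = [3, 2, 4, 3, 5, 2]
webqual10 = [5, 3, 4, 2, 7, 5]

w_stat = wilcoxon_w(webqual08, webqual10)
print(w_stat)
```

A small statistic indicates that almost all of the change runs in one direction.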


Basic to Advanced Statistical Analysis
• Scenario 5
• A review of research literature conducted by the University of Chichester indicates that the length of business ownership influences business response to recession, and the longer the length of business ownership, the more proactive businesses are in terms of their overall business strategy and their response to recession. In this instance, the University would like to establish if there is a significant difference between the length of business ownership and the business response to recession.
• Where would you start your analysis?
• Start with your descriptive analysis

This analysis suggests that there is a difference between length of business ownership and response to the recession

• What would be a suitable test?


Choosing the Right Test

Two categorical


Chi-Squared
• A test to examine difference between data that is grouped into independent and mutually exclusive groups
• Data must be in the form of contingency tables, showing frequency of observations in different categories (h) for one or more samples (k)
• Chi-Squared Test uses nominal data


Chi-Squared
• Scenario 5: variables for the test
  • Nominal Variables: Lengthcat v Response


Chi-Squared

Nominal Variables

Identify potential variables for use in a Chi-Squared Test from the dataset guide
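The Chi-Squared statistic compares the observed frequencies in a contingency table with the frequencies expected if the two variables were unrelated. The sketch below shows that calculation on a hypothetical 2 x 2 table; the counts are invented for illustration and the significance value is assumed to come from a statistics package.

```python
def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows are two ownership-length categories,
# columns are two categories of response to recession.
observed = [[20, 10],
            [10, 20]]

chi2, df = chi_squared(observed)
print(round(chi2, 3), df)
```

The calculation assumes every expected count is greater than zero; very small expected counts make the statistic unreliable.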


Basic to Advanced Statistical Analysis
• Scenario 6
• Tourism South East is in the process of developing a new Sustainable Tourism Strategy for the region, as part of which they are investigating factors influencing the uptake of local goods and services. Tourism South East would like to establish if there is any relationship/association between GTBS score and the use of local goods and services.
• Where would you start your analysis?
• Start with your descriptive analysis

The scatterplot shows evidence of a linear relationship between Green10 and GTBS10

• What would be a suitable test?


Choosing the Right Test

Two separate continuous


Types of Correlation
• When variables are parametric in nature (e.g. ratio/interval data), the most common measure of correlation is the Pearson's Product Moment Correlation Coefficient
• Where data is ordinal, or not normally distributed, or when other assumptions of the Pearson correlation coefficient are violated, we use the Spearman Rank Correlation Coefficient


Correlation
• Correlation is a means to measure the degree of association between two variables, that is, the extent to which changes in values of one variable are matched by changes in another variable
  • Positive Correlation: measures the extent to which higher values of one variable are matched with higher values of the other
  • Negative Correlation: measures the extent to which higher values of one variable are matched with lower values of the other


Correlation
• Scenario 6: variables for the test
  • Ratio/Ratio Variables: GTBS10 v Green10


Correlation

Pearson's Product Moment Correlation Coefficient (Ratio/Interval)

Spearman Correlation Coefficient (Ordinal)

Identify potential variables for use in Correlation Tests from the dataset guide
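Both coefficients can be sketched in a few lines: Pearson works on the raw values, and Spearman is the same calculation applied to their ranks. The values below are hypothetical and are not taken from the GTBS10 or Green10 variables.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product moment correlation coefficient for two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical scores for five businesses.
gtbs_scores = [40, 50, 60, 70, 80]
local_goods_use = [10, 30, 20, 40, 50]

r = pearson_r(gtbs_scores, local_goods_use)
rho = spearman_rho(gtbs_scores, local_goods_use)
print(round(r, 3), round(rho, 3))
```

Values near +1 indicate a strong positive correlation, values near -1 a strong negative correlation, and values near 0 little linear (or, for Spearman, monotonic) association.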


Learning Outcomes

At the end of this session you should be able to:
• Map out different types of advanced statistical analysis and demonstrate how the choice of statistical analysis is influenced by the type of data
• Map and log opportunities for advanced statistical analysis and statistical tests against the different variables within the dataset guide
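One way to log the mapping between data types and tests is a simple lookup table. The sketch below collects the pairings used in the six scenarios of this session; the key labels and the function name are illustrative choices, not terms from the dataset guide.

```python
# Pairings of (design, type of data) to test, as used in the session's scenarios.
# The design labels are hypothetical shorthand chosen for this sketch.
TEST_GUIDE = {
    ("two independent groups", "interval/ratio"): "Student T-Test",
    ("two independent groups", "ordinal"): "Mann Whitney",
    ("paired measurements", "interval/ratio"): "Related or Paired Samples T-Test",
    ("paired measurements", "ordinal"): "Wilcoxon",
    ("two categorical variables", "nominal"): "Chi-Squared",
    ("association between two variables", "interval/ratio"):
        "Pearson's Product Moment Correlation Coefficient",
    ("association between two variables", "ordinal"):
        "Spearman Rank Correlation Coefficient",
}


def suggest_test(design, data_type):
    """Look up the test paired with a design and data type in this session."""
    return TEST_GUIDE.get((design, data_type), "no pairing covered in this session")


print(suggest_test("two independent groups", "ordinal"))
print(suggest_test("paired measurements", "interval/ratio"))
```

A table like this is only a starting point: the assumptions listed for parametric tests still need checking before a parametric test is chosen over its non-parametric counterpart.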


Self-Directed Activity

To do:
• Please complete self-directed Activity 6 – Scenario Quiz
• Please complete self-directed Activity 7 using the Dataset Guide Analysis Template

