Research Design and Data Collection 2 - 2014

Research Design & Data Collection: 2

BML224: Data Analysis for Research


Research Design and Data Collection:

Previously


Types of Data - Summary

NOIR
•  NOMINAL, ORDINAL: NON-PARAMETRIC
•  INTERVAL, RATIO: PARAMETRIC


Quantitative Research Design

Nature of the Question
•  Type of Data: NOMINAL, ORDINAL, INTERVAL, RATIO
•  Type of Analysis: DESCRIPTIVE, INFERENTIAL


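Quantitative Research Design

Type of Question/Hypothesis: Difference
•  General Purpose: Explore Relationships Between Variables
•  Specific Purpose: Compare Groups
•  General Type of Statistic: e.g. t-test, Mann Whitney

Type of Question/Hypothesis: Associational
•  General Purpose: Explore Relationships Between Variables
•  Specific Purpose: Find Strength of Association, Relate Variables
•  General Type of Statistic: e.g. correlation

Type of Question/Hypothesis: Descriptive
•  General Purpose: Description (only)
•  Specific Purpose: Summarise Data
•  General Type of Statistic: Descriptive Statistics (e.g. mean, percentage, range)

[Source: Morgan, G. et al (2011), IBM SPSS for Introductory Statistics, Routledge, London, p. 6]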


Research Design and Data Collection 2: Learning Outcomes

Aims:
•  To map out different types of advanced statistical analysis and demonstrate how the choice of statistical analysis is influenced by the type of data
•  To map and log opportunities for advanced statistical analysis and statistical tests against the different variables within the dataset guide


Research Design and Data Collection:

Exploratory Data Analysis: Crosstabulations


Crosstabulations

Definition
•  A crosstabulation is a joint frequency distribution of cases based on two or more categorical variables
•  Displaying a distribution of cases by their values on two or more variables is known as contingency table analysis
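
The definition above can be illustrated with a minimal sketch that builds a joint frequency table by hand. The records and category labels are hypothetical and are not taken from the module dataset; `crosstab` and `row_percentages` are illustrative helper names, not SPSS functions.

```python
from collections import Counter

# Hypothetical records: (length-of-ownership category, response to recession).
# Labels are made up for illustration only.
records = [
    ("0-5 years", "Reactive"), ("0-5 years", "Reactive"),
    ("0-5 years", "Proactive"), ("6+ years", "Proactive"),
    ("6+ years", "Proactive"), ("6+ years", "Reactive"),
    ("6+ years", "Proactive"),
]

def crosstab(pairs):
    """Return {row: {col: count}} for a list of (row, col) pairs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

def row_percentages(table):
    """Analysis by row: each cell as a percentage of its row total."""
    out = {}
    for r, cells in table.items():
        total = sum(cells.values())
        out[r] = {c: 100 * n / total for c, n in cells.items()}
    return out

table = crosstab(records)
for row, cells in table.items():
    print(row, cells)
```

Analysis by column would follow the same pattern, dividing each cell by its column total instead of its row total.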


Crosstabulations

Examples: Length of Ownership by Response to Recession
•  Analysis by Row
•  Analysis by Column

Examples: Size by Response to Recession
•  Analysis by Row
•  Analysis by Column

Examples: E-Strategy by Value of the Internet
•  Combining nominal (E-Strategy) with ordinal (Webvalue)


Crosstabulations: Variables for Analysis

Identify potential variables that could form the basis of four separate crosstabulations


Research Design and Data Collection:

Linking Data Types to Advanced Statistical Analysis


Statistical Tests

•  Used to make deductions/inferences about a particular data set or relationships (differences/associations) between different data sets
•  Random sample of 50 households in two rural villages in West Sussex:
   Village A: mean income £17,650
   Village B: mean income £22,220
•  A test can be used to determine if there is a ‘real difference’ or whether the difference occurred ‘purely by chance’


Statistical Tests

•  Parametric Tests: data conforms to a normal distribution and is interval or ratio in nature
•  Non-Parametric Tests: data does not conform to a normal distribution; use ordinal data


Statistical Tests: Parametric Tests

•  Independence of observations (except where the data is paired)
•  Random sampling
•  Interval scale measurement for the dependent variable
•  A minimum sample size of 30 per group is recommended
•  Equal variances of the population from which the data is drawn
•  Hypotheses are usually made about the mean of the population


Statistical Tests: Non-Parametric Tests

•  Independence of randomly selected observations except when paired
•  Few assumptions concerning the distribution of the population
•  Ordinal or nominal scale of measurement
•  Ranks or frequencies of data are the focus of tests
•  A minimum sample size of 30 per group is recommended
•  Hypotheses are posed regarding ranks, medians or frequencies
•  Sample size requirements are less stringent than for parametric tests
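
The pairing of data types to tests that the remaining sections walk through can be sketched as a small lookup. This is a simplified illustration, not a rule from the module: the function name `choose_test` and the design labels are assumptions, and the sketch only looks at the level of measurement, not at whether the data is normally distributed.

```python
def choose_test(design, level):
    """Suggest a test from the study design and the level of measurement.

    design: "two_groups", "paired", "two_categorical" or "two_continuous"
            (hypothetical labels chosen for this sketch)
    level:  "nominal", "ordinal", "interval" or "ratio" (of the test variable)
    """
    if design == "two_categorical":
        return "Chi-Squared"
    if level == "nominal":
        raise ValueError("a nominal test variable is only handled by Chi-Squared here")
    parametric = level in ("interval", "ratio")
    if design == "two_groups":
        return "Student T-Test" if parametric else "Mann Whitney"
    if design == "paired":
        return "Paired Samples T-Test" if parametric else "Wilcoxon"
    if design == "two_continuous":
        return "Pearson correlation" if parametric else "Spearman correlation"
    raise ValueError(f"unknown design: {design!r}")

print(choose_test("two_groups", "ratio"))
print(choose_test("paired", "ordinal"))
```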


Research Design and Data Collection:

Testing for Difference


Choosing the Right Test

One Categorical and One Continuous


Research Design and Data Collection:

Student T-Test


Statistical Tests

Scenario
•  As part of the bidding process to Tourism South East for future tourism funding, local tourism officers have to demonstrate if there is a difference in profit levels between businesses in the Arun and Chichester Districts


Student T-Test: Data Requirements

•  Grouping Variable [Independent Variable]: a variable that stands alone and isn't changed by the other variables you are trying to measure. Nominal (Categorical) [2 Levels]
•  Test Variable [Dependent Variable]: a variable that depends on other factors. Ratio or Interval (Continuous)


Student T-Test: Data Requirements

Scenario
•  As part of the bidding process to Tourism South East for future tourism funding, local tourism officers have to demonstrate if there is a difference in profit between businesses in the Arun and Chichester Districts
•  Test Variable: Profit. Ratio (continuous)
•  Grouping Variable: Area Code. Nominal (categorical), 2 Levels [Chichester District 1 / Arun District 2]
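
To show what the test computes for this scenario, here is a minimal sketch of the independent-samples t statistic with pooled variance. The profit figures are hypothetical, not values from the dataset, and `student_t` is an illustrative helper. The sketch stops at the statistic and degrees of freedom; the significance level would normally be read from the SPSS output or a t table.

```python
from math import sqrt
from statistics import mean, variance

def student_t(group_a, group_b):
    """Independent-samples t statistic assuming equal population variances."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2  # statistic and degrees of freedom

# Hypothetical profit figures (thousands of pounds), split by Area Code.
chichester = [42.0, 55.5, 38.2, 61.0, 47.3]  # Area Code 1
arun = [35.1, 40.8, 29.9, 44.0, 37.6]        # Area Code 2

t, df = student_t(chichester, arun)
print(f"t = {t:.3f}, df = {df}")
```

The grouping variable decides which list a business falls into; the test variable supplies the numbers inside each list.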


Student T-Test: Data Requirements

Identify potential variables for use in a Student T-Test from the dataset guide
•  Grouping Variable [Independent Variable]: Nominal (Categorical) [2 Levels]
•  Test Variable [Dependent Variable]: Ratio or Interval (Continuous)


Choosing the Right Test

One Categorical and One Continuous


Research Design and Data Collection:

Mann Whitney


Statistical Tests

Scenario
•  Tourism South East are developing a new e-tourism strategy and they want to establish if there is any difference between e-strategy motives (e-commerce adopters and non-adopters) and business attitudes to the web-based customer relationship management systems TSE offer


Mann Whitney: Data Requirements

•  Grouping Variable [Independent Variable]: Nominal (Categorical) [2 Levels]
•  Test Variable [Dependent Variable]: Ordinal


Statistical Tests

Scenario
•  Tourism South East are developing a new e-tourism strategy and they want to establish if there is any difference between e-strategy motives (e-commerce adopters and non-adopters) and business attitudes to the web-based customer relationship management systems TSE offer
•  Test Variable: TSECMS. Ordinal
•  Grouping Variable: E-Strategy. Nominal (categorical), 2 Levels [E-Commerce Adopters 1 / E-Commerce Non-Adopters 2]
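
A minimal sketch of the Mann Whitney U statistic for a scenario of this shape, working on ranks rather than raw values. The attitude scores are hypothetical, and `average_ranks` and `mann_whitney_u` are illustrative helpers; the sketch returns both U values and leaves the significance lookup to SPSS or a table.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); the smaller of the two is usually reported as U."""
    ranks = average_ranks(list(group_a) + list(group_b))
    na, nb = len(group_a), len(group_b)
    u_a = sum(ranks[:na]) - na * (na + 1) / 2
    return u_a, na * nb - u_a

# Hypothetical TSECMS attitude scores on a 1-5 ordinal scale.
adopters = [4, 5, 3, 5, 4]       # E-Strategy = 1
non_adopters = [2, 3, 1, 3, 2]   # E-Strategy = 2

print(mann_whitney_u(adopters, non_adopters))
```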


Mann Whitney: Data Requirements

Identify potential variables for use in a Mann Whitney Test from the dataset guide
•  Grouping Variable [Independent Variable]: Nominal (Categorical) [2 Levels]
•  Test Variable [Dependent Variable]: Ordinal


Choosing the Right Test

Two continuous variables: the same measure administered twice


Research Design and Data Collection:

Related or Paired Samples T-Test


Related or Paired Samples T-Test

•  The Paired Sample T-test is undertaken when the samples are related or paired (often with the same participants in each sample)
•  Parametric Test (ratio or interval data)

Scenario
•  Between 2008 and 2010, Tourism South East ran a series of courses in conjunction with the Green Tourism Business Scheme to help GTBS members progress to the next stage of accreditation (e.g. bronze to silver; silver to gold). As part of the monitoring process, Tourism South East want to establish if these courses have had an impact on GTBS scores
•  Paired Values: GTBS08/GTBS10
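
For paired values such as GTBS08/GTBS10, the test works on the difference within each pair. The following is a minimal sketch with hypothetical scores; `paired_t` is an illustrative helper and returns only the statistic and degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired-samples t statistic computed on the per-business differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1  # statistic and degrees of freedom

# Hypothetical GTBS scores for the same six businesses in 2008 and 2010.
gtbs08 = [52, 61, 47, 70, 58, 64]
gtbs10 = [58, 66, 49, 74, 57, 71]

t, df = paired_t(gtbs08, gtbs10)
print(f"t = {t:.3f}, df = {df}")
```

Because each business is compared with itself, the order of the two lists must match business by business.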


Related or Paired Samples T-Test

Appropriate Related/Paired Variables (Ratio or Interval)

Identify potential variables for use in a Related or Paired Samples T-Test from the dataset guide


Choosing the Right Test

Two continuous variables: the same measure administered twice


Research Design and Data Collection:

Wilcoxon


Wilcoxon

•  The Wilcoxon Test is undertaken when the samples are related or paired (often with the same participants in each sample)
•  Non-Parametric Test (ordinal data)

Scenario
•  Between 2008 and 2010, Tourism South East ran a series of e-commerce workshops across the South East region supporting the implementation of their new CMS system. As part of the monitoring process, Tourism South East want to establish if these workshops have had an impact on business attitudes to the value of their CMS systems.
•  Paired Values: TSECMS08/TSECMS10
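
A minimal sketch of the Wilcoxon signed-rank calculation for paired ordinal scores such as TSECMS08/TSECMS10. The scores are hypothetical and `wilcoxon_signed_rank` is an illustrative helper. It follows the common convention of dropping pairs with no change and sharing ranks among tied differences, which is an assumption about the procedure rather than something stated on the slides.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus) for paired ordinal scores.

    Zero differences are dropped; tied absolute differences share the
    mean of the ranks they span.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank(value):
        first = abs_sorted.index(value) + 1         # lowest rank held by value
        last = first + abs_sorted.count(value) - 1  # highest rank held by value
        return (first + last) / 2

    w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus

# Hypothetical attitude scores (1-5) for the same seven businesses.
tsecms08 = [2, 3, 3, 1, 4, 2, 3]
tsecms10 = [4, 3, 4, 2, 3, 5, 4]

print(wilcoxon_signed_rank(tsecms08, tsecms10))
```

The smaller of the two sums is the value usually compared against a critical value.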


Wilcoxon Appropriate Related/Paired Variables (Ordinal)

Identify potential variables for use in a Wilcoxon Test from the dataset guide


Choosing the Right Test

Two Categorical


Research Design and Data Collection:

Chi-Squared


Chi-Squared

•  A test to examine difference between data that is grouped into independent and mutually exclusive groups
•  Data must be in the form of contingency tables, showing frequency of observations in different categories (h) for one or more samples (k)
•  Chi-Squared Test uses nominal data


Chi-Squared

Scenario
•  A review of research literature conducted by the University of Chichester indicates that the length of business ownership influences business response to recession, and the longer the length of business ownership, the more proactive businesses are in terms of their overall business strategy and their response to recession.
•  In this instance, the University would like to establish if there is a significant difference between the length of business ownership and the business response to recession.
•  Nominal Variables: Lengthcat v Response
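
A minimal sketch of how the chi-squared statistic is built from a contingency table: each observed frequency is compared with the frequency expected if the two variables were unrelated. The 2 x 2 table is hypothetical, not the Lengthcat v Response output, and `chi_squared` is an illustrative helper.

```python
def chi_squared(observed):
    """Pearson chi-squared statistic and degrees of freedom for a table
    given as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical counts: rows = length-of-ownership category,
# columns = response to recession.
observed = [
    [20, 10],
    [10, 20],
]

chi2, df = chi_squared(observed)
print(f"chi-squared = {chi2:.3f}, df = {df}")
```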


Crosstabulations

Examples: Length of Ownership by Response to Recession


Chi-Squared

Nominal Variables

Identify potential variables for use in a Chi-Squared Test from the dataset guide


Research Design and Data Collection:

Tests for Association


Choosing the Right Test

Two Separate Continuous


Research Design and Data Collection:

Correlation


Correlation

•  Correlation is a means to measure the degree of association between two variables, that is, the extent to which changes in values of one variable are matched by changes in another variable
•  Positive Correlation: measures the extent to which higher values of one variable are matched with higher values of the other
•  Negative Correlation: measures the extent to which higher values of one variable are matched with lower values of the other


Types of Correlation

•  When variables are parametric in nature (e.g. ratio/interval data), the most common measure of correlation is Pearson's Product Moment Correlation Coefficient
•  Where data is ordinal or not normally distributed, or when other assumptions of the Pearson correlation coefficient are violated, we use the Spearman Rank Correlation Coefficient
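
The two coefficients can be sketched side by side. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the data, which is a standard way to define it when ties are given average ranks. The scores are hypothetical and `pearson`, `ranks` and `spearman` are illustrative helpers.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Average ranks starting at 1 (tied values share the mean rank)."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical paired scores for six businesses.
gtbs_score = [45, 52, 60, 66, 71, 80]
local_goods = [20, 25, 24, 35, 40, 55]

print(f"Pearson r = {pearson(gtbs_score, local_goods):.3f}")
print(f"Spearman rho = {spearman(gtbs_score, local_goods):.3f}")
```

A value near +1 indicates a positive correlation, a value near -1 a negative correlation, and a value near 0 little linear (Pearson) or monotonic (Spearman) association.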


Correlation

Scenario
•  Tourism South East is in the process of developing a new Sustainable Tourism Strategy for the region, as part of which they are investigating factors influencing the uptake of local goods and services. Tourism South East would like to establish if there is any relationship/association between GTBS score and the use of local goods and services.
•  Ratio/Interval Variables: GTBS10 v Green10


Correlation

Identify potential variables for use in Correlation Tests from the dataset guide
•  Pearson's Product Moment Correlation Coefficient (Ratio/Interval)
•  Spearman Correlation Coefficient (Ordinal)


Learning Outcomes

At the end of this session you should be able to:
•  Map out different types of advanced statistical analysis and demonstrate how the choice of statistical analysis is influenced by the type of data
•  Map and log opportunities for advanced statistical analysis and statistical tests against the different variables within the dataset guide


Self-Directed Activity

To do:
•  Please complete self-directed Activity 7 using the Dataset Guide Analysis Template

