Table of Contents
How to Identify Variables in Your Dataset
1.1 What is a variable?
1.2 Identifying variables from an experiment
1.3 Identifying variables from a questionnaire
1.4 Arranging data in a spreadsheet
Additional Resources
Created by ASK (2012)
How to Identify Variables in Your Dataset
1.1 What is a variable?
A variable is a measurement of:
- A characteristic (e.g., Gender, Age, Height, Weight…)
- An activity or task (e.g., time to complete a task, 6 minute walk test…)
- Time points (e.g., pre-test, post-test, T0, T1, T2…)
- Experimental condition (e.g., Condition, Experimental group…)
- Opinion/belief (e.g., a survey question which asks for a respondent’s level of agreement with a statement)
- Etc…
You will have multiple variables in your dataset. It is important to identify your variables in order to correctly arrange your data in Excel or SPSS.
1.2 Identifying variables from an experiment
Look at your dataset and ask the following questions:
1. Do I have any independent categorisations or groupings? (Independent here means that each subject can be categorised into only one of several groups.) Figure 1 shows the examples given below.
EXAMPLES:
- I have male and female subjects. The variable here is Gender. Under Gender, the data for each subject is recorded as either male or female, but cannot be both.
- I have a control group and one or more experimental groups. The variable here is Experimental condition. Under this variable, the data for each subject is recorded as the group they have been assigned to for the duration of the experiment. Each subject is assigned to exactly one condition.
- I have categorised each subject as either underweight, normal weight, overweight or obese, based on their BMI. The variable here is Weight group. Under this variable, the data for each subject is recorded as the weight group corresponding to their BMI. Each subject is categorised into exactly one group.
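The "exactly one group" rule above can be checked mechanically before analysis. The following is a minimal sketch, not part of the original guide: the category labels in ALLOWED, the Subject column and the sample rows are all made up for illustration.

```python
# Hypothetical allowed categories for each grouping variable (illustrative labels only).
ALLOWED = {
    "Gender": {"male", "female"},
    "Experimental condition": {"control", "experimental"},
    "Weight group": {"underweight", "normal weight", "overweight", "obese"},
}

def invalid_entries(rows):
    """Return (subject, variable, value) for every entry that is not one allowed category."""
    problems = []
    for row in rows:
        for variable, allowed in ALLOWED.items():
            value = row.get(variable)
            if value not in allowed:
                problems.append((row["Subject"], variable, value))
    return problems

# Made-up data: one row per subject, one column per variable.
rows = [
    {"Subject": 1, "Gender": "female", "Experimental condition": "control",
     "Weight group": "obese"},
    {"Subject": 2, "Gender": "male", "Experimental condition": "experimental",
     "Weight group": "normal weight"},
    # Two groups in one cell breaks the "exactly one group" rule.
    {"Subject": 3, "Gender": "male", "Experimental condition": "control/experimental",
     "Weight group": "overweight"},
]

problems = invalid_entries(rows)
print(problems)
```

Any entry reported here would need to be corrected so that the subject belongs to a single group for that variable.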
Figure 1. Independent categorisations or groupings.
2. Do I have any quantitative measurements taken of my subjects? (Quantitative here means that the measurements are numbers and not groups, categories, words or text.) Figure 2 shows the examples given below.
EXAMPLES:
- I have the weight (in kg, stones, lbs, etc…) of each subject. The variable here is Weight. Under this variable, the data for each subject is recorded as their measured weight. NOTE: this is different from Weight group in the example above, because here you have not grouped or categorised their weights; you are recording the actual weight for each subject.
- I have the height (in cm, m, inches, feet, etc…) of each subject. The variable here is Height. Under this variable, the data for each subject is recorded as their actual measured height. Always use only one unit of measure and use the same unit of measure for each subject. E.g., instead of 1 metre and 53 cm, write 1.53 metres.
- I have the age (in days, months, years, etc…) of each subject. The variable here is Age. Under this variable, the data for each subject is recorded as their actual age (not age group). Always use only one unit of measure and use the same unit of measure for each subject. E.g., instead of 23 years and 6 months, write all in years (23.5 yrs) or all in months (282 months).
- I gave each subject a test and recorded their total score. The variable here is Score. Under this variable, the data for each subject is recorded as their test score.
- Each subject walked for 6 minutes and I recorded how far (in metres) they were able to walk. The variable here is Distance. Under this variable, the data for each subject is recorded as the distance that the subject walked in 6 minutes. Always use only one unit of measure and use the same unit of measure for each subject.
- I counted “how many _______” for each subject. For example, “how many cells die/live after treatment A”, “how many failures in 1 hour”, “how many hours worked per week”, etc… The variable here is Frequency. Under this variable, the data for each subject is recorded as the “number of _______” (fill in the blank with your measure).
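The single-unit rule in the height and age examples amounts to simple arithmetic. As an illustrative sketch (the function names are hypothetical, not from the guide), the two conversions mentioned above could be written as:

```python
def to_metres(metres, centimetres):
    """Combine a metres-and-centimetres height into metres only."""
    return metres + centimetres / 100

def to_years(years, months):
    """Combine a years-and-months age into years only."""
    return years + months / 12

def to_months(years, months):
    """Combine a years-and-months age into months only."""
    return years * 12 + months

height_m = to_metres(1, 53)    # 1 metre and 53 cm
age_years = to_years(23, 6)    # 23 years and 6 months
age_months = to_months(23, 6)  # the same age, all in months

print(f"{height_m:.2f} m, {age_years:.1f} yrs, {age_months} months")
```

Whichever unit is chosen, the same conversion should be applied to every subject so that the whole column is in one unit.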
Figure 2. Quantitative measurements.
3. Do I have any repeated measures data? That is, have I taken the same measurements from all subjects at several time points or under several conditions? In this case, each time point or each condition is its own variable. Figure 3 shows two of the examples given below.
EXAMPLES:
- Each subject walked for 6 minutes and I recorded how far (in metres) they were able to walk, both pre-test and post-test (after 6 months of rehabilitation). The variables here are Pre distance and Post distance. Record each of these as quantitative data as described in (2) above.
- I gave each subject a test before they started the study. Each subject was then subjected to condition 1 and afterwards took the test again. Then, each subject was subjected to condition 2 and took the test a third time. The variables here are Pre score, Cond1 score and Cond2 score. Record each of these variables as quantitative data as described in (2) above.
- I have categorised each subject as either underweight, normal weight, overweight or obese, based on their BMI at baseline and then 6 months after starting an exercise regime. The variables here are Baseline weight group and Post weight group. Record the data for each of these as independent groupings or categorisations as described in (1) above.
Figure 3. Repeated measurements of all subjects.
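If repeated measurements were first noted as one record per measurement, they need to be rearranged so that each time point becomes its own variable. The sketch below assumes a hypothetical list of (subject, time point, distance) records with made-up values, and builds one row per subject with Pre distance and Post distance columns.

```python
# Made-up measurements: (subject, time point, distance in metres).
records = [
    (1, "Pre", 310),
    (1, "Post", 365),
    (2, "Pre", 280),
    (2, "Post", 300),
]

# Build one row per subject; each time point becomes its own variable (column).
wide = {}
for subject, time_point, distance in records:
    row = wide.setdefault(subject, {"Subject": subject})
    row[f"{time_point} distance"] = distance

rows = list(wide.values())
for row in rows:
    print(row)
```

Each resulting row holds all measurements for a single subject, which matches the layout described in section 1.4.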
1.3 Identifying variables from a questionnaire
Look at your questionnaire and ask the following questions:
1. Do I have any single response questions? (Single response here means that participants select one response out of the options given.)
EXAMPLES:
Example 1
Treat this question as an independent categorisation or grouping as described in (1) of section 1.2. The variable here is Role. Under this variable, the data for each participant is recorded as staff, student or visitor.
Example 2
How important are the following when considering where to live?

                    Very Important   Important   Unimportant   Very Unimportant
Cost
Distance to Uni
Distance to work
I feel safe
Although this is presented as one question in matrix format, each of the items listed in column 1 is actually a separate question. All 4 of these questions use Likert scales and thus should be treated as independent categorisations or groupings as described in (1) of section 1.2. The variables here are Cost, Distance to Uni, Distance to work and I feel safe. For each variable, the data for each participant is recorded as “very important”, “important”, “unimportant” or “very unimportant”.
2. Do I have any multiple response questions? (Multiple response here means that participants select one or more responses out of the options given.)
EXAMPLES:
Example 1
Because you can enter only 1 piece of information per participant under each variable in a spreadsheet, you cannot treat question 3 as one variable. Instead, each response option is a variable. The variables here are Hybrid/electric, Foot, Cycle, Public transport, Car/taxi and Other.
Under each variable (except Other), the data for each participant is recorded as either Yes (it was ticked) or No (it wasn’t ticked). Treat each variable as an independent categorisation or grouping as described in (1) of section 1.2 (see Figure 4 for an example).
If a participant ticked Other, simply enter the response they gave. You should not create a new variable for each distinct response given by participants under Other. If a participant gives more than one Other response, then you will need to create multiple Other variables and enter one response in each (e.g., Other1, Other2, etc…).
Figure 4. “Tick all that apply” question.
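The Yes/No coding of a “tick all that apply” question can be sketched as follows. This is an illustration only: the function name expand_ticks, the choice to leave unused Other columns empty, and the sample responses are assumptions rather than anything prescribed by the guide.

```python
# Response options from the example question (excluding Other).
OPTIONS = ["Hybrid/electric", "Foot", "Cycle", "Public transport", "Car/taxi"]

def expand_ticks(ticked, other_responses, max_other):
    """Turn one participant's ticks into one Yes/No value per option.

    Free-text Other responses go into Other1, Other2, ... (one response each);
    unused Other columns are left empty, which is an assumption of this sketch.
    """
    row = {option: ("Yes" if option in ticked else "No") for option in OPTIONS}
    for i in range(max_other):
        row[f"Other{i + 1}"] = other_responses[i] if i < len(other_responses) else ""
    return row

# Made-up participant: ticked Foot and Cycle and wrote one Other response.
row = expand_ticks({"Foot", "Cycle"}, ["Scooter"], max_other=2)
print(row)
```

Here max_other would be set to the largest number of Other responses given by any single participant.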
Example 2
This is a ranked response question. Again, because only 1 piece of information per participant can be entered under each variable in a spreadsheet, question 4 cannot be treated as one variable. Instead, each of the items being ranked is a variable. The data for each variable is the rank given by the participant (i.e., 1 (Most important), 2, 3, 4 or 5 (Least important)). Treat each variable as an independent categorisation or grouping as described in (1) of section 1.2 (see Figure 5 for an example). The variables here are Never been, Weather, Surroundings, Cost and Accommodation. Under each variable, the data for each participant is recorded as Most important, 2nd, 3rd, 4th or Least important.
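A ranked response can be sanity-checked once it is entered as one rank per item. The sketch below assumes that each participant uses every rank from 1 to 5 exactly once (the guide does not state this explicitly), and the function name and sample rows are hypothetical.

```python
# Items being ranked; each one is its own variable (column).
ITEMS = ["Never been", "Weather", "Surroundings", "Cost", "Accommodation"]

def ranks_are_valid(row):
    """Check that the ranks 1..5 each appear exactly once across the items."""
    ranks = [row[item] for item in ITEMS]
    return sorted(ranks) == list(range(1, len(ITEMS) + 1))

# Made-up participants: the second one used rank 1 twice and never used rank 5.
good = {"Never been": 3, "Weather": 1, "Surroundings": 2, "Cost": 5, "Accommodation": 4}
bad = {"Never been": 1, "Weather": 1, "Surroundings": 2, "Cost": 3, "Accommodation": 4}

print(ranks_are_valid(good), ranks_are_valid(bad))
```

A row that fails this check usually points to a data entry slip or an incompletely answered question.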
Figure 5. Ranked response question.
3. Do I have any numeric open response questions? That is, questions in which participants write in a numeric response rather than tick a category. Treat each variable as a quantitative measurement as described in (2) of section 1.2.
EXAMPLES:
- I asked participants to write their age in years. The variable here is Age. Under this variable, the data for each participant is recorded as their age in years (see Figure 2).
- I asked participants to write how many years they have worked at their current job. The variable here is Years worked. Under this variable, the data for each participant is recorded as the number of years they have worked at their job.
- I asked participants the age at which they plan to retire. The variable here is Retirement Age. Under this variable, the data for each participant is recorded as the age (in years) at which they plan to retire.
1.4 Arranging data in a spreadsheet
As shown in the examples above, each variable is a column heading. Each row represents a subject/participant. The data for each participant should go in the corresponding row for each variable. E.g., all data for subject/participant 1 should go in the row labelled “1”. For more detail regarding how to arrange data in a spreadsheet so that you can do analysis in SPSS or Excel, please see How to arrange data in an Excel file on Blackboard.
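As a rough illustration of this layout, the sketch below writes a small table with one column per variable and one row per subject to CSV text using Python's standard csv module. The variable names and values are made up, and CSV is used only because it is a plain format that spreadsheet programs can open.

```python
import csv
import io

# One column per variable; the first column labels the subject/participant.
columns = ["Subject", "Gender", "Age", "Weight"]

# Made-up data: one row per subject.
rows = [
    {"Subject": 1, "Gender": "female", "Age": 23.5, "Weight": 61.0},
    {"Subject": 2, "Gender": "male", "Age": 31.0, "Weight": 78.5},
]

# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The header line holds the variable names, and every following line holds all the data for one subject.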
Additional Resources
In the Getting Started folder under the SPSS resources section, you may be interested in the following:
1. How to code categorical variables (check this out if you have data from a questionnaire)
2. Levels of measurement (nominal, ordinal and scale variables)
3. How to enter your data into SPSS
4. How to create value labels for categorical variables
5. How to code, replace and define missing values in SPSS
6. How to arrange data in an Excel file (so it can be imported into SPSS)
* If you are unsure about which variables are categorical, have a look at the Levels of measurement guide mentioned above.