1.4 Data, measurement and error
KEY IDEAS
In this topic, you will learn that:
✚ raw data should be organised and presented so it is easy to interpret
✚ continuous and discrete data are represented differently
✚ data can be analysed in terms of accuracy, precision, repeatability, reproducibility, validity and true value
✚ there can be different types of errors, uncertainty and outliers.

Scientific investigations are important. They aim to develop explanations for natural phenomena. Evidence that is collected needs to be organised and presented in an appropriate manner and then analysed to consider the quality of the data.

Presentation and analysis of data
Raw data can be difficult to interpret. Therefore, raw data must be presented in a way that makes it easy to analyse so that you can draw conclusions.

Tables
Tables can be used to present quantitative and qualitative data. All tables should have a heading that states what the table is showing. The heading usually indicates what the independent and dependent variables are. Each column must have a heading and, if you are using numerical data, the units (e.g. minutes, seconds, grams) must also be included.
Qualitative data may show trends and so it is often useful to present qualitative data in tables before comparing or contrasting these results in the discussion.
Quantitative data is displayed as the values of each of the related variables, but this may not clearly show the relationship between the variables. Displaying quantitative data in a table is usually the first step in recording information and allows you to decide on the most appropriate way to graph the data. Once the data is in a table, you can apply various mathematical applications.

Graphs
You can represent your data in a graph. When graphing your data, you must consider the following (shown in Figure 1).
• The information on the graph should be easily identified. Make sure you include the heading, axis titles and numbers.
• Include a title that is a descriptive statement and contains the independent variable and the dependent variable.
• Start each axis at zero and make the points on each axis equal in unit size (scaled).
• Clearly label each axis and include the unit of measurement.
• Do not plot the data beyond the axes.
• If there are two sets of data on a single graph, use two different symbols or a coloured key.
FIGURE 1 This graph displays all the key features of a scientific graph. (Title: The change in temperature of honey and distilled water over time; x-axis: Time (minutes); y-axis: Temperature (°C); key: Distilled water, Honey)
DRAFT ONLY - NOT FOR SALE

raw data
measurements or observations of the dependent variable
table
a form of organising data systematically into columns
qualitative data
data that tends to be non-numerical and is subjective (e.g. hair colour, choice of clothing)
quantitative data
data expressed as a number (e.g. concentration of solutions, temperature)
graph
a way of representing data to visually identify the relationship between the variables
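The table guidelines in this section (a heading for the table, a heading for every column, and units for numerical data) can be sketched in code. The following is a minimal illustration only: the function name `format_table` is hypothetical, and the temperature readings are made up for the example rather than taken from Figure 1.

```python
def format_table(title, headings, rows):
    """Return a plain-text table with a title and one heading per column."""
    # Convert every cell to text so that widths can be measured.
    cells = [list(headings)] + [[str(value) for value in row] for row in rows]
    # Each column is as wide as its widest cell (heading included).
    widths = [max(len(row[i]) for row in cells) for i in range(len(headings))]
    lines = [title]
    for row in cells:
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(" | ".join(padded).rstrip())
    return "\n".join(lines)


# The title names the dependent variable (temperature) and the
# independent variable (time); each column heading carries its unit.
title = "The change in temperature of honey and distilled water over time"
headings = ["Time (minutes)", "Distilled water (°C)", "Honey (°C)"]
# Made-up readings for illustration only.
rows = [(0, 20, 20), (2, 35, 30), (4, 50, 41)]

table_text = format_table(title, headings, rows)
print(table_text)
```

Keeping the units in the column headings, rather than repeating them in every cell, leaves the cells as plain numbers that can be graphed or used in calculations later.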
Graphing continuous data
The type of graph used is determined by the type of data present. Continuous data refers to data that can be measured on a scale; for example, time or temperature.
Line graphs
A line graph forms a line when there is a relationship or correlation between the independent and dependent variables. If the line slopes upwards (Figure 2a), it means the independent and dependent variables increase together. This is called a positive correlation. If the line slopes downwards (Figure 2b), the independent variable increases, while the dependent variable decreases. This is known as a negative correlation.
FIGURE 2 a An example of a positive correlation in a line graph: the effect of enzyme concentration on reaction rate (independent variable: concentration of enzyme, arbitrary units; dependent variable: rate of enzymatic reactions, grams of substrate converted per minute); b an example of a negative correlation in a line graph: the effect of humidity on transpiration rate (independent variable: humidity, % water vapour in air; dependent variable: transpiration rate, mL water lost from leaves)
Scatterplots
If data points on a graph are not in a line, then a scatterplot is a better choice of graph (Figures 3 and 4). You can draw a line of best fit by eye, or generate one using Microsoft Excel, to show the general trend of the data.
FIGURE 3 Two different lines of best fit in scatterplots: a height (cm) against mass (kg); b number of branches against height of tree (m)
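A line of best fit can also be calculated rather than drawn by eye. The sketch below uses the least-squares method, which is one common way of fitting a straight line and is not prescribed by this text. The function name `line_of_best_fit` is hypothetical, and the mass and height values are made up for illustration.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: mass (kg) as x and height (cm) as y.
masses = [55, 60, 65, 70, 75]
heights = [152, 158, 161, 167, 172]

slope, intercept = line_of_best_fit(masses, heights)
print(f"height = {slope:.2f} x mass + {intercept:.1f}")
```

A positive slope corresponds to a line that rises from left to right, and a negative slope to one that falls.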
The amount of scatter on either side of a line of best fit indicates the closeness of the variables. The closer the points are to the line of best fit, the stronger the correlation between the variables. When the points are so scattered that you cannot draw a line of best fit, there is no correlation between the two variables.
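The strength of a correlation can also be expressed as a number. The sketch below calculates Pearson's correlation coefficient, r, a standard statistic that this text does not itself introduce. Values close to +1 or −1 mean the points lie close to a straight line, and values near 0 mean there is little or no linear correlation. The function name `pearson_r` is hypothetical and the data are made up.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Points exactly on a rising line: the strongest positive correlation.
r_perfect = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])
# The same trend with some scatter: a weaker positive correlation.
r_scattered = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

print(r_perfect, r_scattered)
```

The second data set still rises overall, but its points sit further from any straight line, so its coefficient is smaller than that of the first set.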
FIGURE 4 Examples of different correlations represented by the amount of scatter in the data: high positive correlation, low positive correlation, high negative correlation and low negative correlation (each panel plots y against x)

Interpreting line graphs
When describing a graph, you should consider the:
• independent and dependent variables
• type of correlation shown by the graph (e.g. positive, negative or neutral)
• shape of the graph (i.e. linear or curved).
Although there may be a correlation between the two variables, this does not mean that the independent variable caused the change in the dependent variable. Correlation does not mean causation.

Graphing discrete data
When the data is discrete, you can use several types of graph. Discrete data consists of separate values or categories that are not related to one another on a continuous scale (e.g. the energy content in different food types or individual recovery rates after exercise).
• A column graph shows the distribution of a distinct characteristic within a population (e.g. human blood groups).
• A histogram represents continuous values of the independent variable that are grouped into classes of equal width (e.g. the recovery time after exercise divided into 2-minute intervals, or pulse rates of a population).
• A pie graph is useful for showing the relationship of all the parts of a whole.
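Grouping values into classes of equal width, as a histogram requires, can be sketched as follows. The function name `group_into_classes` is hypothetical and the pulse rates are made up. The class boundaries (53–57 through to 78–82, each 5 bpm wide) follow the histogram labels in Figure 5.

```python
def group_into_classes(values, start, width, n_classes):
    """Count how many whole-number values fall into each equal-width class."""
    counts = [0] * n_classes
    for value in values:
        index = (value - start) // width
        # Values outside the chosen classes are ignored in this sketch.
        if 0 <= index < n_classes:
            counts[index] += 1
    labels = [
        f"{start + i * width}–{start + (i + 1) * width - 1}"
        for i in range(n_classes)
    ]
    return dict(zip(labels, counts))


# Made-up pulse rates (bpm) for ten people.
pulse_rates = [54, 60, 61, 66, 70, 71, 72, 75, 80, 59]

frequency = group_into_classes(pulse_rates, start=53, width=5, n_classes=6)
for label, count in frequency.items():
    print(label, count)
```

Each count becomes the height of one histogram column, and because the classes are the same width, the columns can be compared directly.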
Different sets of data are often drawn on the same graph, with different colours or symbols used to compare them.
Data and measurement
When analysing and discussing quantitative data, the accuracy, precision, repeatability, reproducibility, true value and validity need to be considered.
• Accuracy describes how close the experimental data is to the ‘true’ value of the measurement. This can be improved by carefully calibrating the equipment before each experiment.
• Precision describes how close the set of data values are to one another. An experimenter can improve the precision by repeating an experiment or by increasing the sample size.
accuracy
how close the experimental data is to the true value
precision
how close a set of data values are to each other
repeatability
a measure of achieving the same set of data if the experiment was repeated under the same conditions
reproducibility
a measure of achieving the same set of data if the experiment was repeated with a different experimenter in a different laboratory
true value
the value that accurately represents the measurement had the experiment been conducted perfectly
validity
a measure of whether the investigation is sound
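FIGURE 5 Different graphs for discontinuous data: a column graph (blood group phenotypes O, A, B and AB against percentage in population); b histogram (pulse rate in bpm, in classes from 53–57 to 78–82, against frequency as number of people); c pie graph (beetle 35%, earthworm 22%, field mouse 18%, other prey 13%, caterpillar 12%)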
FIGURE 6 Examples of accuracy and precision: high accuracy, high precision; low accuracy, high precision; high accuracy, low precision; low accuracy, low precision. Accuracy is not the same as precision.
• Repeatability describes the ability for the same data to be produced again by the same experimenter in the same laboratory under the same conditions. Repeatability relies on a detailed and informative method with well-defined variables.
• Reproducibility refers to the ability for the same data to be produced by a different experimenter in a different laboratory. Reproducibility also relies on a detailed and informative method with well-defined variables.
• The true value is the value that would be obtained had the quantity been measured perfectly. The experimental data is compared with the true value to determine the accuracy of the data.
• The validity of the measurement describes whether the experiment will actually answer the scientific question that was asked. When discussing validity, the experimental design and its implementation should be considered.
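The difference between accuracy and precision can be illustrated with a short calculation. As an assumption made for this sketch only, accuracy is summarised as the distance between the mean reading and the true value, and precision as the sample standard deviation of the readings. Neither summary is prescribed by this text, the function names are hypothetical, and the readings are made up.

```python
import statistics


def accuracy_error(readings, true_value):
    """Distance between the mean reading and the true value (smaller = more accurate)."""
    return abs(statistics.mean(readings) - true_value)


def spread(readings):
    """Sample standard deviation of the readings (smaller = more precise)."""
    return statistics.stdev(readings)


TRUE_VALUE = 50.0  # made-up true value

# High accuracy, high precision: tightly grouped around the true value.
set_a = [49.9, 50.1, 50.0, 50.0]
# Low accuracy, high precision: tightly grouped, but around the wrong value.
set_b = [52.0, 52.1, 51.9, 52.0]

for name, readings in (("A", set_a), ("B", set_b)):
    print(name, accuracy_error(readings, TRUE_VALUE), spread(readings))
```

Both sets are similarly precise, because their readings sit close to one another, but only set A is accurate.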
Experimental errors, uncertainty and outliers
Experimental errors should not be confused with human error. An error is defined as the difference between the measurement and the true value. It is also important not to confuse errors with uncertainty. Uncertainty is when a measurement seems unreliable and is associated with doubt.
There are two types of errors to consider in scientific investigations: random errors and systematic errors.

Study tip
A mistake caused by incorrect measurements is not the same as experimental error, which relates to problems with the experimental design.

Random errors
Random errors are unpredictable. They are present in all measurements because they are caused by an error in the measurement process. Random errors reduce the precision of the data. Parallax error is an example of a random error. Parallax error occurs when an observer views an object (e.g. a measuring cylinder containing water) from the wrong angle. The measurement will differ from the true value. You can reduce random errors by doing multiple trials.

Systematic errors
Systematic errors are consistent and repeatable. This type of error reduces the accuracy of the data. Systematic errors are usually caused by faulty equipment or uncalibrated measuring instruments. These errors cause readings to consistently differ from the true value every time they are measured. This means that repeating the experiment does not reduce systematic errors. An example of a systematic error is if you did not zero scales at the beginning of an experiment, and repeated this uncalibrated measurement across all tests.

Study tip
Data presentation, measurement and errors associated with different forms of data can be found in the Biology for VCE Units 1 & 2 Student Workbook.

Outliers
A data point that is outside the rest of the data set is called an outlier. This abnormal data point may be caused by mistakes made by the experimenter or equipment during the experiment. Always plot outliers in your graph, but they may not be included in the line of best fit. You should attempt to explain the cause of the unexpected data in the discussion section of the scientific investigation. Outliers cannot be simply dismissed, but should be investigated and accounted for. Conducting multiple trials can be a useful way to examine outliers.

random error
an error that affects the precision of the data due to an unknown and unpredictable error in the experimental process
systematic error
an error that affects the accuracy of the data by causing the reading to differ from the true value
outlier
any value that sits outside the data set
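The claim that multiple trials reduce random errors but not systematic errors can be explored with a small simulation. This is a sketch under stated assumptions: random error is modelled as normally distributed noise and systematic error as a constant offset, such as scales that were not zeroed. The function name `simulate_readings` and all numbers are hypothetical.

```python
import random
import statistics


def simulate_readings(true_value, n_trials, random_sd, systematic_offset, seed=0):
    """Simulate repeated readings with random noise and a constant offset."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        true_value + systematic_offset + rng.gauss(0, random_sd)
        for _ in range(n_trials)
    ]


TRUE_MASS = 10.0  # grams, made-up true value

# Random error only: each reading is off by an unpredictable amount.
random_only = simulate_readings(TRUE_MASS, 1000, random_sd=0.5, systematic_offset=0.0)
# Random error plus scales that always read 2.0 g too high.
unzeroed = simulate_readings(TRUE_MASS, 1000, random_sd=0.5, systematic_offset=2.0)

mean_random_only = statistics.mean(random_only)
mean_unzeroed = statistics.mean(unzeroed)

print("mean with random error only:", round(mean_random_only, 2))
print("mean with unzeroed scales:", round(mean_unzeroed, 2))
```

Averaging many trials brings the first mean close to the true value, while the second mean stays about 2.0 g too high no matter how many trials are averaged.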
CHECK YOUR LEARNING 1.4
Describe and explain
1 State the three types of graphs that can be used to represent discrete data.
2 Identify the error (random or systematic) that affects the accuracy of the data. Explain your answer.
3 Use an example to describe an outlier.
Apply, analyse and compare
4 Compare (discuss the similarities and differences between):
a accuracy and precision
b repeatability and reproducibility
c continuous and discrete data
d uncertainty and error.
5 Explain which type of graph would best represent continuous data that measures the heart rate of an aquatic organism in solutions of different salinities.
Design and discuss
6 Discuss the importance of organising raw data into tables and/or graphs.