Validating Models of Cognition

Terrence C. Stewart, Centre for Theoretical Neuroscience, University of Waterloo


Cognitive Modelling
● Computer simulation of cognition
● How do we know when we're right?
  – Some sort of match between the output of the model and the empirical data
  – What sort of match?


Matching to Empirical Data
● Lots of different data
  – Many different tasks
  – Many different conditions in each task
● Example: ACT-R
  – Mental arithmetic, driving a car, learning word pairs, parsing English sentences, playing rock-paper-scissors, dialing a phone, air traffic control
  – Same components, same parameter values (different background knowledge, different sensory data)


Matching to Empirical Data
● Lots of different types of data
  – Accuracy, reaction time
  – fMRI, spike recording, neural connectivity, etc.


Wait a second...
● What do we mean by a match?
● How do we say how good it is?
● How do we handle these large numbers of different kinds of measures?


What not to do
● Correlation
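A quick numerical sketch (hypothetical values) of why correlation alone is a poor match score: Pearson's r is invariant to scale and offset, so a model whose predictions are systematically far from the data can still correlate perfectly.

```python
import numpy as np

# Hypothetical empirical means and badly mis-scaled model predictions.
data  = np.array([0.2, 0.4, 0.5, 0.7, 0.9])
model = 2.0 * data + 3.0   # wrong scale and wrong offset

r = np.corrcoef(data, model)[0, 1]
print(r)  # 1.0 -- correlation cannot see scale or offset errors
```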


What not to do
● Mean squared error
● Need to account for confidence intervals!
● What can we safely conclude about this model?
  – Given what we know, the model is unlikely to be wrong by more than this amount (see the sketch below)
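To make "unlikely to be wrong by more than this amount" concrete, here is a minimal sketch under assumptions of my own (hypothetical data, a normal-approximation 95% CI): the largest likely difference between a model prediction and the empirical mean, scaled by the CI so that values below 1 mean the prediction lies inside the empirical CI. The exact formulation is given in Stewart & West (2011).

```python
import numpy as np

def scaled_error(model_pred, samples, z=1.96):
    """Largest likely model-data difference, in empirical-CI units.

    Sketch only: uses a normal-approximation 95% CI for the mean.
    Values < 1 mean the prediction falls inside the empirical CI.
    """
    mean = samples.mean()
    half = z * samples.std(ddof=1) / np.sqrt(len(samples))
    return (abs(model_pred - mean) + half) / (2 * half)

# Hypothetical human accuracies in one condition.
human = np.array([0.55, 0.71, 0.60, 0.66, 0.58, 0.69, 0.62, 0.64])
print(scaled_error(0.60, human))  # ~0.91: < 1, inside the empirical CI
```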


Multiple measures
● How do we handle the many different conditions?
  – Data from my entry in the 2009 Technion Choice Prediction Tournament
  – Behavioural economics model


Multiple measures
● Worst-case scenario
  – Do not take the mean! Then you could improve the apparent fit just by adding conditions the model is good at! (see the sketch below)
  – Take the worst over all conditions
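A toy illustration (hypothetical per-condition scores) of why the mean is gameable and the worst case is not:

```python
import numpy as np

# Hypothetical CI-scaled errors for five conditions; one is clearly bad.
errors = np.array([0.4, 0.9, 0.7, 1.8, 0.3])

print(errors.mean())  # 0.82 -- padding with easy conditions would lower this
print(errors.max())   # 1.8  -- the worst case still flags the failure
```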


Multiple measures
● Instead, remove measures it is bad at
● Forces you to be explicit about what your model can and cannot account for
  – Extreme conditions
    ● Something missing from the model (strategy shift)
  – Anomalous data
    ● Might not be something missing from the model
    ● Might be incorrect empirical data!
    ● With 60 measures, ~3 will actually be outside the 95% confidence intervals (5% of 60 = 3; see the sketch below)
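A quick simulation (hypothetical measures, normal-approximation CIs) of that base rate: even a perfect model "fails" about 5% of the measures by chance.

```python
import numpy as np

rng = np.random.default_rng(2)

# 60 measures of 30 trials each, where the model's prediction (0.5)
# is exactly the true value.  Count empirical means whose 95% CI
# excludes it anyway, purely through sampling noise.
misses = 0
for _ in range(60):
    s = rng.normal(0.5, 0.1, size=30)
    half = 1.96 * s.std(ddof=1) / np.sqrt(len(s))
    if abs(s.mean() - 0.5) > half:
        misses += 1
print(misses)  # typically around 3 (5% of 60)
```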


Implications for parameter fitting
● RMSE approach
  – Would not have won
● Equivalence approach (scaled so <1 is inside empirical CI range)
  – Won the competition
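A minimal sketch (toy one-parameter model and hypothetical data, both my own assumptions) of how the two fitting criteria can disagree: RMSE minimizes the average gap to the empirical means, while the equivalence approach minimizes the worst CI-scaled gap, so conditions with tight CIs weigh more heavily.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical task: five conditions with very different sample sizes,
# so the empirical CIs differ in width from condition to condition.
true_means = np.array([0.3, 0.5, 0.6, 0.7, 0.8])
sizes = [10, 100, 20, 80, 15]
samples = [rng.normal(m, 0.1, size=n) for m, n in zip(true_means, sizes)]
emp_means = np.array([s.mean() for s in samples])
halves = np.array([1.96 * s.std(ddof=1) / np.sqrt(len(s)) for s in samples])

def predict(p):
    return p * np.arange(1, 6) / 5.0   # toy model (assumption)

def rmse(p):
    return np.sqrt(np.mean((predict(p) - emp_means) ** 2))

def max_scaled_error(p):
    # Worst CI-scaled gap over conditions; < 1 everywhere means the
    # model's predictions sit inside every empirical CI.
    return np.max((np.abs(predict(p) - emp_means) + halves) / (2 * halves))

params = np.linspace(0.1, 1.5, 141)
print(params[np.argmin([rmse(p) for p in params])])             # RMSE optimum
print(params[np.argmin([max_scaled_error(p) for p in params])]) # equivalence optimum
```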


Importance of Confidence Intervals
● Dynamic Stocks and Flows task


Measures other than the mean



Conclusion
● Use many conditions, tasks, and types of measure
● Largest likely difference between model and empirical data
  – Scaled by size of empirical CI
  – So <1 means no statistical difference
● Maximum across all measures
  – Explicitly remove measures the model is bad at
    ● Guides future research
● Consider median, s.d., and others as well

Stewart & West, 2011. Testing for Equivalence: A Methodology for Computational Cognitive Modelling. Journal of Artificial General Intelligence, 2(2).

