Evaluating the Creation and Interpretation of Causal Influence Models

Dapeng Cao1, Theresa Guarrera1, Michael Jenkins1, Priyadarshini Pennathur1, Ann Bisantz1, Richard Stone2, Michael Farry3, Jonathan Pfautz3, and Emilie Roth4
1University at Buffalo, 2Iowa State University, 3Charles River Analytics, 4Roth Cognitive Engineering
Abstract
Bayesian networks (BNs) are probabilistic models that support reasoning under uncertainty by graphically expressing domain knowledge about states, causes, and effects. While BNs have many advantages, their complexity can hamper knowledge elicitation and encoding. For example, BNs require a priori conditional probabilities to be defined: as models grow, the number of complex probabilities that must be elicited increases exponentially. Multiple “canonical modeling” approaches, such as Causal Influence Models (CIMs), have been developed to address these complexities. However, little progress has been made toward human-in-the-loop evaluation of such approaches - specifically, their accessibility and usability, their associated user interfaces, and how well they enable users to correctly create and interpret variables and probabilistic relationships. In this study, we evaluated the CIM approach (implemented in a software application) to determine its effect on user task performance. Results indicate that model complexity adversely affects performance when users interpret an existing model; that the semantics of a model may affect performance; and that users were generally successful in creating new models of different situations.
Background

A Bayesian network (BN)1,2 is a graphical representation of a set of variables and their probabilistic relationships to one another.
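For example (an illustrative sketch; the poster’s original example figure is not reproduced here): consider a two-node network Rain → WetGrass with Boolean variables. The network encodes the factorization

    P(Rain, WetGrass) = P(Rain) × P(WetGrass | Rain)

so the model is fully specified by P(Rain = true) plus a conditional probability table (CPT) giving P(WetGrass = true | Rain = true) and P(WetGrass = true | Rain = false). CPT size grows exponentially with the number of parents: a Boolean node with n Boolean parents requires 2^n conditional probabilities, which is the elicitation burden that canonical modeling approaches aim to reduce.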
Causal Influence Models

Causal influence models allow the relationship between a parent and a child to be expressed through a single value, as seen here: [figure: example causal influence model]
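To make the single-value idea concrete, here is a minimal sketch assuming a noisy-OR-style combination rule, a common canonical formulation; the poster does not state whether the evaluated CIM software uses exactly this rule, so treat the function below as illustrative only.

    # Illustrative noisy-OR sketch: each parent-child link carries a single
    # causal strength in [0, 1] instead of a full conditional probability table.
    # Assumed formulation; the CIM software evaluated in the study may differ.
    def noisy_or(strengths, parent_states, leak=0.0):
        """P(child = true) given per-link causal strengths and Boolean parent states."""
        p_child_false = 1.0 - leak
        for strength, parent_active in zip(strengths, parent_states):
            if parent_active:
                p_child_false *= 1.0 - strength
        return 1.0 - p_child_false

    # A 1:2 model (one child, two parents) needs only two elicited values,
    # versus 2**2 = 4 rows of a full conditional probability table.
    print(noisy_or([0.8, 0.6], [True, True]))   # ≈ 0.92
    print(noisy_or([0.8, 0.6], [True, False]))  # ≈ 0.80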
Objective

Investigate users’ abilities to create and interpret Bayesian networks of varying complexity, with varying types of relationships, using a causal influence model.

Experimental Variables

Model Complexity – the number of nodes and relationships presented in the models, or of variables in the vignettes. The following levels of complexity (child:parent) were used: 1:1, 1:2, 2:1, and 2:2.

Framing – the presence or absence of a “negatively phrased” node, included to investigate the effect of semantic properties on task performance.

Experimental Tasks

Tasks 1 & 2: Participants were shown models at the four levels of complexity and were asked to describe each model and to answer questions that required interpreting the model given different states of its variables.
• Task 1 only: participants were NOT allowed to interact with the models to assist them in describing and answering questions about the models.
• Task 2 only: participants were allowed to interact with the software and models to assist them in describing and interpreting the models.
Task 3: Participants were provided with short situation descriptions and created models representing those situations using the software.

Example Experimental Models & Conditions

Tasks 1 & 2 – examples of models presented to users: [figure]
Task 3 – examples of models created by users: [figure]
User Interface

To evaluate how participants would interpret and create causal influence models of simple situations, a basic point-and-click interface was developed, shown here: [screenshot of the interface]
Results

The major findings are shown in the figures below.

[Figure: Effect of Model Complexity on Accuracy and Confusion for Tasks 1 & 2. Bar chart; x-axis: Model Type (Child:Parent) = 1:1, 1:2, 2:1, 2:2; y-axis: Percentage Observed; conditions: Task 1 (No Interaction) and Task 2 (Interaction), with Accuracy and Confusion measures. Accuracy values shown include 100%, 100%, 90%, 83%, 84%, 84%, 87%, and 92%; Confusion values shown include 8.0%, 24.0%, 30.0%, and 38.0%.]

[Figure: Framing Condition Effect on Accuracy for Tasks 1 & 2. Bar chart; x-axis: Framing Condition; y-axis: Percentage of Accurate Responses. Values shown: Positive 98.1%, Negative 90.0%.]

[Figure: Framing Condition Effect on Accuracy for Task 3. Bar chart; x-axis: Framing Condition; y-axis: Percentage of Accurate Responses. Values shown: Positive 90.0%, Negative 83.0%.]

[Figure: Effect of Interaction on Accuracy. Bar chart comparing the No Interaction and Interaction conditions; values shown include 87.0%, 81.2%, 87.7%, and 93.1%.]
Additional Findings:
• Node-link relationship: negative links decreased accuracy by roughly 9 percentage points (from 90.2% to 80.9%)
• Level of belief: participants did not consistently express quantitative or qualitative belief values associated with a node (belief was expressed in 18.5% of cases)
• When allowed, participants interacted with the model while answering 50.7% of questions
• Of 205 interaction instances, 30% involved moving a node slider and 24% involved hovering over a relationship
• Interaction accounted for 60% of the qualifiers expressed during Task 2
Discussion
• The highest accuracy and the lowest rate of confusion occurred when constructing and interpreting the simplest model (1:1)
• The majority of confusion was displayed with the most complicated model (2:2)
• Difficulty increased with model complexity and with negative causal relationships
• Participants’ use of qualifiers was low when describing nodes under the given conditions
• Overall, participants did well creating their own models in Task 3
• Participants avoided negative language when constructing models, but this did not improve performance (perhaps because of complexity)

Limitations: (1) the models used in the experiment were simple, with a maximum of 2 parent and 2 child nodes, and (2) only Boolean nodes were used across tasks.

Future Work: Collect data on how participants interpret their self-created models, to better understand model interpretation and application.
Analysis

Transcriptions of participants’ comments throughout the sessions were synchronized with their interactions with the software while describing and creating models, using Transana© transcription software. The specific dependent variables analyzed are shown in the Results section of the poster. All data were coded by at least two researchers to ensure accuracy.
Acknowledgements

This research was funded by Charles River Analytics.
References
1. Jensen, F.V. (2001). Bayesian Networks and Decision Graphs. New York: Springer-Verlag.
2. Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.