Towards context-aware data fusion: Incorporating situationally qualified human observations into a fusion process for intelligence analysis
Michael Jenkins1, Geoff Gross2, Ann Bisantz1 (Advisor), Rakesh Nagi2 (Advisor)
1 Dept. of Industrial Systems and Engineering – Human Factors Focus, 2 Dept. of Industrial Systems and Engineering – Operations Research Focus
Research Objectives
Develop a methodology for incorporating human observations into a hard & soft information fusion process for counterinsurgency intelligence analysis
• Develop a method for characterizing soft data sources (human observations) in terms of qualifying contexts & error characteristics
• Represent these observations in a unified framework to cover both quantitatively and qualitatively reported observations
• Develop a consistent and quantitative methodology for assessing similarity of referenced entities which incorporates the qualified human observations
• Utilize uncertainty in data association and situation assessment processes and provide an indication of the uncertainty present in overall situation assessment
Framing the Research Problem
Generation of Context-Aware Models of Human Observation:
An error model for each exemplar (Age Estimation shown) was generated based on existing empirical literature which:
• Incorporates qualifying variables & error/bias based on variable states
• Provides qualified error ranges as model outputs
Model output consists of sub-models for both the bias and variance representation components
• Distribution family combines bias and variance terms to form the uncertainty representation
Graph Background
A graphical representation is created for incoming data
• Nodes represent entities (people, locations, organizations, etc.)
• Edges represent inter-entity relationships
• Both can have attribute values
• Supports both data association and situation assessment fusion tasks
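The graphical representation of incoming data can be sketched as an attributed graph. The Python below is a minimal illustration, not the poster's implementation: the class names (`Entity`, `Relationship`, `ObservationGraph`), the identifiers, and the sample attribute values are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node: a person, location, organization, etc."""
    entity_id: str
    kind: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """An edge: a relationship between two entities."""
    source: str
    target: str
    kind: str
    attributes: dict = field(default_factory=dict)


@dataclass
class ObservationGraph:
    """Attributed graph built from incoming observations."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_entity(self, entity):
        self.nodes[entity.entity_id] = entity

    def add_relationship(self, rel):
        # Both endpoints must already exist as nodes.
        if rel.source not in self.nodes or rel.target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(rel)

    def neighbors(self, entity_id):
        """Ids of entities directly related to entity_id (either direction)."""
        out = set()
        for e in self.edges:
            if e.source == entity_id:
                out.add(e.target)
            elif e.target == entity_id:
                out.add(e.source)
        return out


# Hypothetical content loosely based on the sample surveillance message.
graph = ObservationGraph()
graph.add_entity(Entity("p1", "person", {"age": "18-20"}))
graph.add_entity(Entity("h23", "location", {"street": "Dhubat Street"}))
graph.add_relationship(Relationship("p1", "h23", "visited", {"time": "0932"}))
```

Keeping attributes on both nodes and edges lets the same structure serve data association (comparing node attributes) and situation assessment (matching relationship patterns).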
Research Motivation:
[Age Estimation model flowchart labels: Environmental Factors; Quantitative Estimation?]
Intelligence analysis is a context-dependent, time-sensitive, dynamic, and complex series of tasks requiring human analysts to perceive, and often predict, the past, present, and/or future states of situations based on multiple sources of information characterized by varying levels of uncertainty. The goal of intelligence analysts is to provide actionable intelligence to military commanders.
Data Fusion for Counterinsurgency Intelligence Analysis:
Fusion systems combine data from multiple sources, utilizing various mathematical techniques, to exploit the core competencies of different data sources. It is believed that fusion systems for intelligence analysis can assist analysts in tasks such as:
• Identifying insurgent actions, developing plans, and relationships
• Reducing the complexity and heterogeneity of large volumes of data
• Allowing for consideration of a more robust set of information
Methodology: Exemplar Category Selection
Identification & Selection of Categories of Human Observation:
67 categories of observations relevant to the COIN domain were selected based on:
• Military field manuals on COIN Operations & Field Reporting
• Human Terrain Team (HTT) guidebook
• A 200-message set from a simulated COIN scenario
A literature review was employed to identify categories with empirical evidence of human capabilities
• Over 300 citations were found to be relevant to at least one of the 67 categories
Exemplar categories were selected to define the modeling process based on relevancy & empirical literature:
• Age Estimation – age of another individual
• Egocentric Distance Estimation – distance from self to target
• Temporal Estimation – time of occurrence of past event
• Numerosity Estimation – number of objects in a scene
[Figure: Age Estimation error model, drawn as a decision flowchart. The Yes/No branch structure is not recoverable from the extracted text.]
Panel labels: Observer Factors; Target Factors & Error Characterizations; Situation Assessment (SA)
Decision nodes:
• Observer Race Known?
• Target Race Known?
• Observer Race = Target Race?
• Estimated AT <32 Yrs
• Target Age estimate 8+ yrs older than Ao?
• Target Age estimate 8+ yrs younger than Ao?
Error characterizations shown:
• ME Range: 7-8 Yrs, SD: 3-5 Yrs, Bias: Unknown
• ME: 4.3 Yrs, SD: 6.3 Yrs, Bias: Overestimation
• ME: 7.9 Yrs, SD: 7.6 Yrs, Bias: Underestimation
• ME: 7.39 Yrs, SD: 3.12 Yrs, Bias: Unknown
• ME: 8.21 Yrs, SD: 5.41 Yrs, Bias: Overestimation
• ME: 6.79 Yrs, SD: 4.80 Yrs, Bias: Underestimation
• All ME Estimates reduced by 1 Yr.
• No hard data available: Ss 16-25 were more accurate for judging targets ages 5-20 than Ss 51-60; Ss 51-60 were more accurate for judging targets ages 45-70 than Ss 16-25 (George & Hole, 1995)
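A context-qualified error model of this kind can be read as a small decision procedure: qualifying variables are checked in turn, and the leaf that is reached supplies the error characterization. The sketch below is illustrative only. The function and field names, the `same_race` and `observer_trained` context keys, the branch order, and all numbers are assumptions made for the example and do not reproduce the flowchart; in particular, tying the 1-year reduction in ME to observer training is a guess.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ErrorCharacterization:
    mean_error_yrs: float   # ME, in years
    sd_yrs: float           # SD of the error, in years
    bias: str               # "overestimation", "underestimation", or "unknown"


# Placeholder characterizations: illustrative numbers, NOT the poster's values.
SAME_RACE = ErrorCharacterization(5.0, 4.0, "overestimation")
OTHER_RACE = ErrorCharacterization(8.0, 6.0, "underestimation")
UNQUALIFIED = ErrorCharacterization(7.0, 5.0, "unknown")


def select_error_model(context):
    """Pick an error characterization from the qualifying variables.

    context is a dict of whatever is known about the observation;
    missing keys are treated as "unknown".
    """
    same_race = context.get("same_race")   # True, False, or None if unknown
    if same_race is True:
        result = SAME_RACE
    elif same_race is False:
        result = OTHER_RACE
    else:
        result = UNQUALIFIED
    if context.get("observer_trained"):
        # Assumed adjustment: a trained observer lowers ME by one year.
        result = replace(result, mean_error_yrs=result.mean_error_yrs - 1.0)
    return result


model = select_error_model({"same_race": True, "observer_trained": True})
```

Treating missing keys as "unknown" mirrors the way the flowchart falls back to less specific error ranges when observer or target meta-data is unavailable.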
Methodology: Uncertainty Alignment
Application of Models for Uncertainty Alignment:
Messages received from various sources may refer to the same or related entities
• Example has 2 messages referring to 2 groups of males with seemingly different ages
Data association attempts to combine common entities from incoming and previously processed observations
• Attribute-attribute similarity scores are aggregated to form node-node (entity-entity) similarity
• Node-node scores are used to determine if observations refer to the same entity
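The aggregation and merge decision can be sketched as follows. This is a minimal illustration under stated assumptions: the poster does not say how attribute-attribute scores are aggregated, so a weighted mean is used here as a stand-in, and the function names, the 0.7 threshold, and the sample scores are hypothetical.

```python
def node_similarity(attr_scores, weights=None):
    """Aggregate attribute-attribute scores (each in [0, 1]) into one
    node-node score using a weighted mean (an assumed aggregation rule)."""
    if not attr_scores:
        raise ValueError("need at least one attribute score")
    if weights is None:
        weights = {name: 1.0 for name in attr_scores}
    total = sum(weights[name] for name in attr_scores)
    return sum(weights[name] * s for name, s in attr_scores.items()) / total


def same_entity(attr_scores, threshold=0.7, weights=None):
    """Merge rule: the node-node score must exceed the threshold."""
    return node_similarity(attr_scores, weights) > threshold


# Illustrative attribute-attribute scores for two observed persons.
scores = {"height": 0.8, "age": 0.6, "weight": 1.0}
merged = same_entity(scores)
```

Weights allow more reliable attributes to count for more; with no weights given, every attribute contributes equally.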
[Figure: Attribute Types and Values. A Node (entity) carries attributes such as Age: 20-22 years, Weight: 200-215 lbs, Height: 6’1”; an Arc/Edge (relationship) can carry attributes as well. Additional labels: Attribute Similarities; Aggregated Attribute; Observed Person 1]
Situation assessment provides an indication of the existence of situations of interest (e.g. developing bomb threats) within the domain of interest
• Graph matching is utilized as the situation assessment process
• Graph matching attempts to locate a template graph (situation of interest) in the data graph (observed data)
• A Truncated Search Tree (TruST) [3] is built to attempt to locate the optimal template graph-data graph match
• Potential matches are ranked in decreasing score order
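The idea of searching for a template graph inside a data graph with a truncated search tree can be sketched with a simple beam search. This is not the TruST algorithm of [3]; it is a simplified stand-in that keeps only the best few partial matches at each level. The function name, the `beam_width` parameter, the node labels, and the scores are all hypothetical.

```python
def match_template(template_nodes, template_edges, data_edges,
                   node_scores, beam_width=3):
    """Beam-limited search for template-to-data node assignments.

    template_nodes: list of template node ids, assigned in this order
    template_edges: set of (t_a, t_b) pairs that must be present in the data
    data_edges:     set of (d_a, d_b) pairs (treated as undirected)
    node_scores:    dict {(t, d): node-node similarity in [0, 1]}
    Returns complete matches as (score, mapping), best first.
    """
    undirected = data_edges | {(b, a) for a, b in data_edges}
    data_nodes = sorted({d for _, d in node_scores})
    frontier = [(0.0, {})]                  # (score so far, partial mapping)
    for t in template_nodes:
        expanded = []
        for score, mapping in frontier:
            for d in data_nodes:
                if d in mapping.values() or (t, d) not in node_scores:
                    continue                # keep the mapping one-to-one
                # Each template edge joining t to an already-mapped node
                # needs a counterpart edge in the data graph.
                ok = all(
                    (mapping[a] if a != t else d,
                     mapping[b] if b != t else d) in undirected
                    for a, b in template_edges
                    if t in (a, b)
                    and all(x == t or x in mapping for x in (a, b))
                )
                if ok:
                    expanded.append((score + node_scores[(t, d)],
                                     {**mapping, t: d}))
        # Truncation step: keep only the best partial matches.
        expanded.sort(key=lambda item: item[0], reverse=True)
        frontier = expanded[:beam_width]
    return frontier


matches = match_template(
    template_nodes=["T1", "T2", "T3"],
    template_edges={("T1", "T2"), ("T2", "T3")},
    data_edges={("DA", "DB"), ("DB", "DC"), ("DC", "DD")},
    node_scores={("T1", "DA"): 0.9, ("T1", "DB"): 0.4,
                 ("T2", "DB"): 0.8, ("T2", "DC"): 0.5,
                 ("T3", "DC"): 0.7, ("T3", "DD"): 0.6},
)
```

A small `beam_width` trades optimality for speed: the best full match can be missed if its early partial assignments score poorly and are truncated.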
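[Figure: Data Association (DA). Attribute values reported for two observed persons are compared attribute by attribute (Attribute-Attribute Similarity), and the results are combined into a Node-to-Node Similarity. Node labels: DA, DB, D1.]
Attribute values shown (Observed Person 1 / Observed Person 2):
• Height: 6’1” / Tall
• Age: 22 / 18-20
• Weight: 215 lbs / 210-220 lbs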
[Age Estimation model flowchart, additional labels. Decision nodes: Estimated AT >50 Yrs; Observer Age (Ao) Known?; Observer trained in Age Estimation? Outcomes: ME Range: 60 to 350% of true target age; No empirical evidence found, attempt to convert]
Domain Overview – Intelligence Analysis:
Domain Overview – Military Counterinsurgency:
Military counterinsurgency (COIN) is characterized by [1]:
• A focus on achieving an end-state characterized by
  1. Self-Sufficient Security
  2. Stable Governance
  3. Functioning Economy
• Presence of highly organized insurgent forces
  1. Seeking to undermine military/political authority and efforts
  2. Utilizing subversive techniques and violence to accomplish their goals
[Age Estimation model flowchart, additional decision nodes: Facial cues significantly reduced?; Observer Meta-Data Available?]
Data fusion systems traditionally have only integrated hard (sensor) data sources
• Stems from difficulty in integrating soft (human) observations
• Presents a significant drawback to fusion systems in the counterinsurgency domain
• Human observations provide unique information, unobtainable by hard sensors
• Soft data integration difficulties due to:
  • Unknown and Contextually Dependent Error Characteristics
  • Inconsistent and Often Qualitative Observation Format
Methodology: Fusion Processing
Methodology: Model Generation
[Figure: Fusion processing. A Template Graph (Situation of Interest) with nodes T1, T2, T3 is matched against the Data Graph (Observed Data), whose node labels include DB and DN.]
Data association step:
• if (node-node score > threshold), merge entities
Situation assessment steps:
• User inputs template graph to initiate search
• Node-node similarities are calculated as in data association
• Fuzzy score ranking method indicates node-node score precedence
• Heuristic search scans data graph for possible matches based on node-node scores
Messages received…
Message 1: Surveillance of house #23 on Dhubat Street found that Sufian Mashhadan entered at 0700. Two unknown males, approximately 18 to 20 years old, visited the house at 0932.
Message 2: Two teenagers, both males, about 13 or 14-years old, were seen walking towards the warehouse by the abandoned railroad station in Ramadi.
Based on the category & context of observation, the fusion system selects the appropriate error model to apply
• Here we assume the observer is trained in age estimation
• Model describes bias and variance
Utilizing the model, the observation is represented as a probabilistic distribution
• Bias model defines the observational bias under the observation conditions
• Variance model describes the spread around the translated mode within which the true value lies
The observation is then transformed to a possibilistic representation, the uncertainty theory utilized throughout further fusion processing
• Possibilistic representations are well suited for linguistic observations
An attribute-attribute similarity calculation [2] is performed, indicating the similarity of the two groups of people based solely on age
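The alignment pipeline for the two age reports can be sketched end to end. Everything specific here is an assumption: the poster does not state which distribution family or probability-to-possibility transformation is used, so a normal distribution and the transformation pi(x) = P(|X - mode| >= |x - mode|), one standard choice for symmetric unimodal densities, are used for illustration. The similarity is the height of the intersection of the two possibility distributions, a simple stand-in for the fuzzy similarity measure of [2]. The bias and SD values are hypothetical.

```python
import math


def to_possibility(reported_low, reported_high, bias_yrs, sd_yrs):
    """Return pi(x), a possibility distribution for the true value.

    The report is first modelled as Normal(mode, sd_yrs), where the mode is
    the midpoint of the reported range shifted by the bias (positive
    bias_yrs means the observer tends to overestimate).
    """
    mode = (reported_low + reported_high) / 2.0 - bias_yrs

    def pi(x):
        z = abs(x - mode) / sd_yrs
        # 2 * (1 - Phi(z)): probability of lying at least this far from the mode.
        return math.erfc(z / math.sqrt(2.0))

    return pi


def overlap_similarity(pi_a, pi_b, lo=0.0, hi=100.0, step=0.01):
    """Height of the intersection: max over a grid of min(pi_a(x), pi_b(x))."""
    n = int(round((hi - lo) / step))
    return max(min(pi_a(lo + i * step), pi_b(lo + i * step))
               for i in range(n + 1))


# Hypothetical error parameters (no bias, SD of 3 years) for both reports.
group_1 = to_possibility(18, 20, bias_yrs=0.0, sd_yrs=3.0)  # "18 to 20 years old"
group_2 = to_possibility(13, 14, bias_yrs=0.0, sd_yrs=3.0)  # "about 13 or 14"
age_similarity = overlap_similarity(group_1, group_2)
```

A wider SD flattens each possibility distribution and raises the similarity, which is how a less reliable observation context makes two seemingly different age reports more compatible.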
[Figure: Uncertainty alignment steps]
• Observation uncertainties aligned based on error model and transformed for processing
• Probabilistic representation generated
• Transformation to possibilistic representation
• Attribute-attribute similarity calculated
[Situation assessment figure caption: Possible matches ranked]
Conclusions
We provided a methodology comprising three components that allows fusion systems to more effectively integrate soft data streams:
• Component 1: A method for characterizing soft data sources in terms of qualifying contexts and error characteristics
• Component 2: A method for representing quantitative and qualitative data in a unified framework for consistent processing
• Component 3: A method for assessing similarity of referenced entities to support data association and situation assessment tasks
Bibliography
[1] Elan Freedy, Col Lou Lartigue, et al., Decision Infrastructure for Counterinsurgency Operational Planning (DICOP). CogSIMA 2011, Miami Beach, FL.
[2] D. Guha and D. Chakraborty, A new approach to fuzzy distance measure and similarity measure between two generalized fuzzy numbers. Applied Soft Computing, Vol. 10,
[3] Kedar Sambhoos, Rakesh Nagi, Moises Sudit and Adam Stotz, Enhancements to High Level Data Fusion using Graph Matching and State Space Search, Information Fusion, Vol. 11, No. 4, pp. 351-364. 2010.
This work was supported by Army Research Office MURI grant W911NF-09-1-0392 for “Unified Research on Network-based Hard/Soft Information Fusion”.