Audio Machine Learning for Mobility
Laurent Pilati - Engineering Manager - Voice Audio Solution
November 2018
Company Public – NXP, the NXP logo, and NXP secure connections for a smarter world are trademarks of NXP B.V. All other product or service names are the property of their respective owners. © 2018 NXP B.V.
AI for Mobility: Myth and reality of Machine Learning
AI for Mobility: Myth and reality of Machine Learning
https://mapr.com/blog/demystifying-ai-ml-dl/
AI for Mobility: AI is missing common sense
Figure labels: frequent words; proper names; segmentation of words from sentences; typical word stress; language prosody; segmentation into large clauses; typical vowels; typical consonants; segmentation into small clauses; phonotactics; phonotactic illusions; loss of non-native contrasts.
Source: Emmanuel Dupoux - EHESS, Paris, 15 January 2018
AI for Mobility: Importance of Common Sense
1. Speech Recognition in Car
AI for Mobility: Importance of Common Sense
Speech Recognition in Car: Lombard effect
Source: The Lombard effect with Thai lexical tones: an acoustic analysis of articulatory modifications in noise - Benjawan Kasisopa, Virginie Attina, Denis - INTERSPEECH 2014
AI for Mobility: Importance of Common Sense
Speech Recognition in Car: Lombard effect
Conditions compared: Non-Lombard, Lombard, Compensated Lombard
Source: The impact of the Lombard effect on audio and visual speech recognition systems - Ricard Marxer, Jon Barker, Najwa Alghamdi, Steve Maddock
AI for Mobility: Importance of Common Sense
2. Source separation through Neuro Evolution
Diagram labels: Raw signal, Input, Bias, Evolved network, Output, Enhanced signal
1. End-to-end approach: let the neural network find the best discriminative features
2. Fitness score
3. Small networks
Source: Evolving recurrent neural networks that process and classify raw audio in a streaming fashion - Adrien Daniel - INTERSPEECH 2017
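The three points on this slide (raw audio in, a fitness score to rank candidates, small networks) can be illustrated with a minimal sketch. It is not the method of the cited paper: it only mutates the three weights of a fixed one-unit recurrent network, whereas neuroevolution approaches may also evolve the network structure. The function names, the toy dataset and all parameter values are assumptions made for illustration.

```python
import math
import random

# Toy data (made up for illustration): short "raw" streams with a target label.
TOY_DATASET = [
    ([0.5, 0.6, 0.4, 0.5], 1.0),
    ([0.7, 0.5, 0.6, 0.8], 1.0),
    ([-0.5, -0.4, -0.6, -0.5], -1.0),
    ([-0.8, -0.6, -0.7, -0.5], -1.0),
]

def forward(weights, samples):
    """Process samples one at a time (streaming) with a single recurrent unit."""
    w_in, w_rec, bias = weights
    state = 0.0
    for x in samples:
        state = math.tanh(w_in * x + w_rec * state + bias)
    return state

def fitness(weights, dataset):
    """Fitness score, higher is better: negative mean squared error."""
    errors = [(forward(weights, s) - label) ** 2 for s, label in dataset]
    return -sum(errors) / len(errors)

def evolve(dataset, generations=50, population_size=20, sigma=0.3, seed=0):
    """Keep the fittest quarter unchanged, refill with mutated copies of it."""
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 1.0) for _ in range(3)]
                  for _ in range(population_size)]
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, dataset), reverse=True)
        elite = population[: population_size // 4]
        history.append(fitness(elite[0], dataset))
        children = [[w + rng.gauss(0.0, sigma) for w in rng.choice(elite)]
                    for _ in range(population_size - len(elite))]
        population = elite + children
    best = max(population, key=lambda w: fitness(w, dataset))
    return best, history

if __name__ == "__main__":
    best, history = evolve(TOY_DATASET)
    print("best weights:", best)
    print("fitness, first vs last generation:", history[0], history[-1])
```

Because the elite is carried over unchanged, the best fitness score never decreases from one generation to the next, which is what makes a plain fitness score sufficient to drive the search without gradients.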
AI for Mobility: Importance of Common Sense
2. Source separation through Neuro Evolution
Speech vs. Music
AI for Mobility: Importance of Common Sense
2. Audio source separation through Neuro Evolution
Figure labels: Talker A, Talker B
AI for Mobility: Importance of Common Sense
3. Audio Context Recognition
Figure axes: Actual class, Predicted class
AI for Mobility: Importance of Common Sense
3. Audio Context Recognition: Machine Hit-rate vs. time
Figure: Total Accuracy vs. Time Analysis (x-axis: time analysis [s], 0 to 30; y-axis: total accuracy [%], 70 to 88)
AI for Mobility: Importance of Common Sense
3. Audio Context Recognition: Human Hit-rate vs. time
AES 110th Convention: Recognition of Everyday Auditory Scenes: Potentials, Latencies and Cues – Peltonen et al.
NXP vision on AI: Video, Voice, Vibration
i.MX 8QuadMax – Edge Computing Node for Vision
Block diagram labels: ICS, GPU0, GPU1, CPU; Collect Video, Machine Vision Object Detection, CNN Neural Net Classifier, Machine Decision
i.MX 8QuadMax: a complete Machine Vision and Neural Net Processing Edge Node
NXP vision on AI: Video, Voice, Vibration
1. Voice Activity Detection for Speech Enhancement
NXP vision on AI: Video, Voice, Vibration
1. Multi modal: Voice Activity Detection for Speech Processing
NXP vision on AI: Video, Voice, Vibration
1. Multi modal: Voice Activity Detection for Speech Processing
NXP vision on AI: Video, Voice, Vibration
1. Multi modal: Voice Activity Detection for Speech Processing: Wind Noise Reduction
Samsung S7 output
WNR processing combining Audio + Video
Note: same recording context, but not the same microphone
NXP vision on AI: Video, Voice, Vibration
2. Multi modal: Voice Activity Detection for Speech Recognition in Car
NXP vision on AI: Video, Voice, Vibration
2. Multi modal: Voice Activity for Speech Recognition
DOA (direction of arrival)
Source: DOA algorithm module from Interview Mode 2.0 - Thomas Camier
NXP vision on AI: New AI / IoT era will create new industry leaders
“The 4th Tectonic Shift in Computing”
Security, Edge Processing, Machine Learning, Connected Data
Source: Jeffer
A Position of Strength to Better Serve Our Customers
7th largest semiconductor company (2)
11,000 engineers
Operations in 33 countries
Headquarters: Eindhoven, Netherlands
60+ year history
$9.3B annual revenue (3)
31,000 employees
9,000 patent families
#1 positions: Automotive; Broad-based MCUs (1); Secure Identification; Communications Processors; RF Power Transistors
Sources: IHS, ABI Research, Strategy Analytics, The Linley Group. (1) MCU market excluding Automotive. (2) Excludes memory. (3) Posted revenue for 2017.
BACK-UP
False Positive with Face recognition algorithm
SophIA – November 7-8-9
• Technical Session
• Master Class
• Keynote
• Talk