NetCom Learning Machine Learning & AI Foundations: A Guide to Predictive Modeling
Tom Goodheart NetCom Learning www.netcomlearning.com | info@netcomlearning.com | (888) 563 8266
©1998-2019 NetCom Learning
Agenda
• Evaluating the Proper Amount of Data
• Assessing the Data Quality and Quantity
• Data Preparation and Modeling Challenges
• Scoring Machine Learning Models
• Deploying Models and adjusting data prep / scoring
• Monitor and Maintenance
Evaluating the Proper Amount of Data
As the data sets used in machine learning grow bigger and more complex, we enter the realm of big data: massive amounts of data, flowing from a variety of sources, that grow exponentially.
Information needs to be stored efficiently and accessed rapidly in order to provide business value.
Just because we have data doesn’t mean it’s quality data or that it will provide the types of answers we need.
The Proper Amount of Data
1) It Depends, No One Can Really Tell You
   1) Complexity of the Problem
   2) Complexity of the Algorithms chosen
2) Reason by Analogy – Learning Curves
   1) Study how your problem changes as data scales as well as your algorithm. Averaging over many iterations of size can give you a good enough idea
3) Domain Expertise
   1) Delphi Technique
4) Statistical Heuristics
   1) Factor the number of classes
   2) Factor the number of input features
   3) Factor the number of model parameters
Evaluating the Proper Amount of Data
Learning Curves
These are statistical tools built on heuristics and domain expertise. As the number of training examples increases, you can see the model's scores approach an asymptote. At this point, no matter how much more data you throw at your model, performance will not meaningfully improve.
http://acl.ldc.upenn.edu/P/P01/P01-1005.pdf
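A learning curve can be sketched with nothing but the standard library. The example below is illustrative only: the data are synthetic (two overlapping Gaussian classes), the "model" is a simple nearest-centroid classifier chosen for brevity, and every function name is hypothetical. It trains on growing sample sizes, averages held-out accuracy over several repeats per size, and lets you see where the scores level off.

```python
import random

def make_blobs(n, rng):
    """Draw n labeled 2-D points from two overlapping Gaussian classes."""
    points = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 2.0
        points.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return points

def fit_centroids(train):
    """Nearest-centroid 'model': the mean point of each class seen in training."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in train if y == label]
        if members:  # a tiny sample may miss a class entirely
            centroids[label] = (
                sum(p[0] for p in members) / len(members),
                sum(p[1] for p in members) / len(members),
            )
    return centroids

def accuracy(centroids, test):
    """Fraction of test points whose nearest centroid carries the true label."""
    def predict(x):
        return min(
            centroids,
            key=lambda c: (x[0] - centroids[c][0]) ** 2 + (x[1] - centroids[c][1]) ** 2,
        )
    return sum(predict(x) == y for x, y in test) / len(test)

def learning_curve(sizes, repeats=30, seed=0):
    """Mean held-out accuracy per training-set size, averaged over repeats."""
    rng = random.Random(seed)
    test = make_blobs(2000, rng)  # fixed held-out set shared by all sizes
    curve = []
    for n in sizes:
        scores = [accuracy(fit_centroids(make_blobs(n, rng)), test)
                  for _ in range(repeats)]
        curve.append((n, sum(scores) / repeats))
    return curve

if __name__ == "__main__":
    for n, score in learning_curve([5, 20, 80, 320, 1280]):
        print(f"{n:>5} training examples -> mean accuracy {score:.3f}")
```

On a real project the same loop would wrap your own model and data. The point where adjacent sizes score about the same is a hint that more data of the same kind is unlikely to help much.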
Evaluating the Proper Amount of Data
Domain Expertise
We reach out to people we know who have more information than we do, we search Google, and we continue our growth toward domain expertise by exposing ourselves to experts and information in the field.
https://proxy.duckduckgo.com/iu/?u=https%3A%2F%2Ftse4.mm.bing.net%2Fth%3Fid%3DOI P.LHEhOiJ-QBM7ZhVhNxrO6wHaHa%26pid%3DApi&f=1
Evaluating the Proper Amount of Data
Statistical Heuristics
The “Ten Times” Rule
A rule of thumb, along with many others, that can help set the tone for your exploratory analysis.
https://proxy.duckduckgo.com/iu/?u=https%3A%2F%2Fwww.designheuristics.com%2Fwpcontent%2Fuploads%2F2012%2F07%2Fhomepage.png&f=1
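The slide names the rule without spelling it out. One commonly quoted formulation is to have at least ten training examples for every model parameter. The sketch below assumes that formulation and, purely for illustration, counts parameters as for a linear model (one weight per input feature plus a bias, repeated per class when there are more than two classes). Both function names are hypothetical, and the result is a starting estimate, not a guarantee.

```python
def linear_parameter_count(n_features, n_classes=2):
    """Weights plus bias for a linear model (illustrative assumption).

    Binary problems need one weight vector; with more than two classes
    we assume one weight vector per class (one-vs-rest style).
    """
    per_output = n_features + 1
    outputs = 1 if n_classes <= 2 else n_classes
    return per_output * outputs

def ten_times_minimum(n_parameters, factor=10):
    """Rule-of-thumb lower bound on training examples: factor x parameters."""
    if n_parameters < 1:
        raise ValueError("n_parameters must be at least 1")
    return factor * n_parameters

if __name__ == "__main__":
    # 20 input features, binary target: (20 + 1) * 10 examples
    print(ten_times_minimum(linear_parameter_count(20)))
    # 20 input features, 5 classes: 5 * (20 + 1) * 10 examples
    print(ten_times_minimum(linear_parameter_count(20, n_classes=5)))
```

The estimate grows with the number of classes, input features, and model parameters, which is why each of those factors belongs in the heuristic.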
Assessing Quantity and Quality of Data
One major drawback of the abundance of data available today is that you have to sift through copious amounts of questionable data sets.
The Data Quality Cheat Sheet
1) Clarify your objectives and collected data
   1) Is my question a categorical or numerical prediction?
   2) Have I collected the right data source?
2) Take the time to execute proper collection methods
   1) Machine Learning is the best example of Garbage In, Garbage Out you can compute expensively.
3) Maintain a rigorous audit trail
   1) git blame
   2) Scoring Engines/Scoring Files; Logging
   3) Proper CI/CD Pipelines
4) Designate a Data Quality Team
   1) Their responsibility is the maintenance, testing, and assurance of the data.
5) Independent Quality Assurance
   1) Can someone else reproduce your results?
Timoelliott.com
https://hbr.org/2018/04/if-your-data-is-bad-your-machine-learning-tools-are-useless
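One small, concrete piece of an audit trail is a record of exactly which data and settings produced a result, so that someone else can try to reproduce it. The sketch below is only one possible approach. It assumes the data fit in memory as a list of dictionaries, and the function names, field names, and the `code_version` value are all hypothetical placeholders.

```python
import hashlib
import json

def dataset_fingerprint(rows):
    """SHA-256 digest of the rows in a canonical JSON form.

    sort_keys makes the digest independent of dictionary key order,
    so only a change in the actual values changes the fingerprint.
    """
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def audit_record(rows, seed, code_version):
    """Bundle what a reviewer needs in order to rerun the same experiment."""
    return {
        "n_rows": len(rows),
        "data_sha256": dataset_fingerprint(rows),
        "random_seed": seed,
        "code_version": code_version,  # e.g. a commit hash (placeholder here)
    }

if __name__ == "__main__":
    rows = [{"age": 34, "churned": False}, {"age": 51, "churned": True}]
    print(json.dumps(audit_record(rows, seed=42, code_version="abc1234"), indent=2))
```

If an independent reviewer computes a different fingerprint, the data changed between runs, and any difference in results should be investigated there first.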
Data Preparation and Modeling Challenges
“To create a model, then, we make choices about what’s important enough to include, simplifying the world into a toy version that can be easily understood and from which we can infer important facts and actions. We expect it to handle only one job and accept that it will occasionally act like a clueless machine, one with enormous blind spots.”
― Cathy O'Neil
Scoring Machine Learning Models
Scoring is also called prediction, and is the process of generating values based on a trained machine learning model, given some new input data.
Scoring is widely used in machine learning to mean the process of generating new values, given a model and some new input. The generic term "score" is used, rather than "prediction," because the scoring process can generate so many different types of values:
• A list of recommended items and a similarity score.
• Numeric values, for time series models and regression models.
• A probability value, indicating the likelihood that a new input belongs to some existing category.
• The name of a category or cluster to which a new item is most similar.
• A predicted class or outcome, for classification models.
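To make two of these output types concrete, here is a minimal sketch of scoring one new input with an already-trained binary classifier. The weights, bias, feature names, and class labels are invented for illustration and do not come from any real model. The same call returns both a probability value and a predicted class.

```python
import math

# Hypothetical coefficients standing in for an already-trained logistic model.
WEIGHTS = {"tenure_months": -0.08, "support_tickets": 0.6}
BIAS = -0.5

def score(record, threshold=0.5):
    """Return a probability and a predicted class for one new input."""
    z = BIAS + sum(WEIGHTS[name] * record[name] for name in WEIGHTS)
    probability = 1.0 / (1.0 + math.exp(-z))  # logistic function
    label = "churn" if probability >= threshold else "stay"
    return {"probability": probability, "predicted_class": label}

if __name__ == "__main__":
    print(score({"tenure_months": 24, "support_tickets": 1}))
    print(score({"tenure_months": 2, "support_tickets": 4}))
```

Note that scoring only applies the stored model to new input. No training happens at this stage.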
Deploying Machine Learning Models
https://proxy.duckduckgo.com/iu/?u=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fazure%2Fmachine-learning%2Fdesktop-workbench%2Fmedia%2Fmodel-managementoverview%2Fmodelmanagement.png&f=1
Monitor and Maintenance
Microsoft Azure ML Studio has a built-in tool that lets you glean insights from your running models, such as how they score over time and how fast they respond to customer inputs.
Remember: If It Isn’t Logged, It Doesn’t Exist
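Outside of a managed tool, the same idea can be approximated with Python's standard `logging` module. The sketch below is a generic illustration, not Azure ML Studio's API: the logger name, the decorator, and the toy model are all hypothetical. It wraps a scoring function so that every call logs its input, its score, and how long it took.

```python
import functools
import logging
import time

logger = logging.getLogger("model_monitoring")  # hypothetical logger name

def logged_scoring(score_fn):
    """Wrap a scoring function so each call logs input, score, and latency."""
    @functools.wraps(score_fn)
    def wrapper(record):
        start = time.perf_counter()
        result = score_fn(record)
        latency_ms = (time.perf_counter() - start) * 1000.0
        logger.info("input=%r score=%r latency_ms=%.3f", record, result, latency_ms)
        return result
    return wrapper

@logged_scoring
def toy_score(record):
    """Placeholder model: any callable mapping a record to a score would do."""
    return 0.1 * record["support_tickets"]

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(name)s %(message)s")
    toy_score({"support_tickets": 3})
```

With timestamps on every line, the log can later be aggregated to see how scores drift over time and how response times behave under load.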
Recorded Webinar Video
To watch the recorded webinar video for live demos, please access the link: http://bit.ly/2GIpZYc
About NetCom Learning
Recommended Courses & Marketing Assets
Courses:
» 20774: Perform Cloud Data Science with Azure Machine Learning – Class scheduled on June 03
» Artificial Intelligence (AI) for Beginners
» AI-100T01: Designing and Implementing an Azure AI Solution
» DP-200T01: Implementing an Azure Data Solution (Data Scientist)
» Master Program - Artificial Intelligence
Marketing Assets:
» Blog - Game Changers of 2019: Top 8 In-Demand IT Skills
» Free On-Demand Training - Preparing and Architecting for Machine Learning
» Free On-Demand Training - Develop Your AI Strategy with These Trends in Mind
• Google Cloud Fundamentals - Core Infrastructure
• Autodesk Inventor: How to Organize and Reuse Your Data
• Project Management Essentials for Non-project Managers
• How to Build Effective Data Communications with Tableau Desktop
• Critical Thinking: Developing Problem-Solving & Decision-Making Skills
• Mastering Microsoft Teams - The Future of Teamwork
• Microsoft Azure: Managing Subscriptions and Resources
• Explore Photoshop CC for the Web Designers
• Explore Data Warehousing and Business Intelligence
& More…
Promotions
It’s time for a SALEbration! NetCom Learning is headed for its next milestone – 21 years of nonstop training and learning. To commemorate, we will kick off the best SALEbration of the year – Data & AI Courses at 21% OFF!
Follow Us On:
THANK YOU !!!