Official Google Cloud Certified Professional Machine Learning Engineer Study Guide Mona
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at www.wiley.com/go/permission.
Trademarks: WILEY and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Google Cloud is a trademark of Google, Inc. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and authors have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our website at www.wiley.com.
To my late father, grandparents, mom, and husband (Pratyush Ranjan), mentor (Mark Smith), and friends. Also to anyone trying to study for this exam. Hope this book helps you pass the exam with flying colors!
—Mona Mona
To my parents, wonderful wife (Swetha), and two fantastic children: Rishab and Riya.
—Pratap Ramamurthy
Acknowledgments
Although this book bears my name as author, many other people contributed to its creation. Without their help, this book wouldn't exist, or at best would exist in a lesser form. Pratap Ramamurthy, my co‐author, contributed a third of the content of this book. Kim Wimpsett, the development editor, Christine O'Connor, the managing editor, and Saravanan Dakshinamurthy, the production specialist, oversaw the book as it progressed through all its stages. Arielle Guy was the book's proofreader and Judy Flynn was the copyeditor. Last but not least, thanks to Hitesh Hinduja for being an amazing reviewer throughout the book writing process.
I'd also like to thank Jim Minatel and Melissa Burlock at Wiley, and Dan Sullivan, who helped connect me with Wiley to write this book.
—Mona Mona
This book is the product of hard work by many people, and it was wonderful to see everyone come together as a team, starting with Jim Minatel and Melissa Burlock from Wiley and including Kim Wimpsett, Christine O'Connor, Saravanan Dakshinamurthy, Judy Flynn, Arielle Guy, and the reviewers.
Most importantly, I would like to thank Mona for spearheading this huge effort. Her knowledge from her previous writing experience and leadership from start to finish was crucial to bringing this book to completion.
—Pratap Ramamurthy
About the Author
Mona Mona is an AI/ML specialist at Google Public Sector. She is the author of the book Natural Language Processing with AWS AI Services and a speaker. She was a senior AI/ML specialist Solution Architect at AWS before joining Google. She has 14 certifications and has created courses for AWS AI/ML Certification Specialty Exam readiness. She has authored 17 blogs on AI/ML and also co‐authored a research paper on AWS CORD‐19 Search: A neural search engine for COVID‐19 literature, which won an award at the Association for the Advancement of Artificial Intelligence (AAAI) conference. She can be reached at monasheetal3@gmail.com.
Pratap Ramamurthy loves to solve problems using machine learning. Currently he is an AI/ML specialist at Google Public Sector. Previously he worked at AWS as a partner solution architect where he helped build the partner ecosystem for Amazon SageMaker. Later he was a principal solution architect at H2O.ai, a company that works on machine learning algorithms for structured data and natural language. Prior to that he was a developer and a researcher. To his credit he has several research papers in networking, server profiling technology, genetic algorithms, and optoelectronics. He holds three patents related to cloud technologies. In his spare time, he likes to teach AI using modern board games. He can be reached at pratap.ram@gmail.com.
About the Technical Editors
Hitesh Hinduja is an ardent artificial intelligence (AI) and data platforms enthusiast currently working as a senior manager in Azure Data and AI at Microsoft. He worked as a senior manager in AI at Ola Electric, where he led a team of 30+ people in the areas of machine learning, statistics, computer vision, deep learning, natural language processing, and reinforcement learning. He has filed 14+ patents in India and the United States and has numerous research publications under his name. Hitesh has been associated in research roles at India's top B‐schools: Indian School of Business, Hyderabad, and the Indian Institute of Management, Ahmedabad. He is also actively involved in training and mentoring and has been invited as a guest speaker by various corporations and associations across the globe. He is an avid learner and enjoys reading books in his free time.
Kanchana Patlolla is an AI innovation program leader at Google Cloud. Previously she worked as an AI/ML specialist in Google Cloud Platform. She has architected solutions with major public cloud providers in financial services industries on their quest to the cloud, particularly in their Big Data and machine learning journey. In her spare time, she loves to try different cuisines and relax with her kids.
About the Technical Proofreader
Adam Vincent is an experienced educator with a passion for spreading knowledge and helping people expand their skill sets. He is multi‐certified in Google Cloud, is a Google Cloud Authorized Trainer, and has created multiple courses about machine learning. Adam also loves playing with data and automating everything. When he is not behind a screen, he enjoys playing tabletop games with friends and family, reading sci‐fi and fantasy novels, and hiking.
Google Technical Reviewer
Wiley and the authors wish to thank the Google Technical Reviewer Emma Freeman for her thorough review of the proofs for this book.
Introduction
When customers have a business problem, say to detect objects in an image, sometimes it can be solved very well using machine learning. Google Cloud Platform (GCP) provides an extensive set of tools to be able to build a model that can accomplish this and deploy it for production usage. This book will cover many different use cases, such as using sales data to forecast for next quarter, identifying objects in images or videos, and even extracting information from text documents. This book helps an engineer build a secure, scalable, resilient machine learning application and automate the whole process using the latest technologies.
The purpose of this book is to help you pass the latest version of the Google Cloud Professional ML Engineer (PMLE) exam. Even after you've taken and passed the PMLE exam, this book should remain a useful reference as it covers the basics of machine learning, BigQuery ML, the Vertex AI platform, and MLOps.
Google Cloud Professional Machine Learning Engineer Certification
A Professional Machine Learning Engineer designs, builds, and productionizes ML models to solve business challenges using Google Cloud technologies and knowledge of proven ML models and techniques. The ML engineer considers responsible AI throughout the ML development process and collaborates closely with other job roles to ensure the long‐term success of models. The ML engineer should be proficient in all aspects of model architecture, data pipeline interaction, and metrics interpretation. The ML engineer needs familiarity with foundational concepts of application development, infrastructure management, data engineering, and data governance. Through an understanding of training, retraining, deploying, scheduling, monitoring, and improving models, the ML engineer designs and creates scalable solutions for optimal performance.
Why Become Professional ML Engineer (PMLE) Certified?
There are several good reasons to get your PMLE certification.
Provides proof of professional achievement Certifications are quickly becoming status symbols in the computer service industry. Organizations, including members of the computer service industry, are recognizing the benefits of certification.
Increases your marketability According to Forbes (www.forbes.com/sites/louiscolumbus/2020/02/10/15-toppaying-it-certifications-in-2020/?sh=12f63aa8358e), jobs that require GCP certifications are the highest‐paying jobs for the second year in a row, paying an average salary of $175,761/year. So, there is a demand from many engineers to get certified. Of the many certifications that GCP offers, the AI/ML certified engineer is a new certification and is still evolving.
Provides an opportunity for advancement IDC's research (www.idc.com/getdoc.jsp?containerId=IDC_P40729) indicates that while AI/ML adoption is on the rise, the cost, lack of expertise, and lack of life cycle management tools are among the top three inhibitors to realizing AI and ML at scale.
This book is the first in the market to talk about Google Cloud AI/ML tools and the technology covering the latest Professional ML Engineer certification guidelines released on February 22, 2022.
Recognizes Google as a leader in open source and AI Google is the main contributor to much of the path‐breaking open source software that dramatically changed the landscape of AI/ML, including TensorFlow, Kubeflow, Word2vec, BERT, and T5. Although these algorithms are in the open source domain, Google has the distinct ability of bringing these open source projects to the market through the Google Cloud Platform (GCP). In this regard, the other cloud providers are frequently seen as trailing Google's offering.
Raises customer confidence As the IT community, users, small business owners, and the like become more familiar with the PMLE certified professional, more of them will realize that the PMLE professional is more qualified to architect secure, cost‐effective, and scalable ML solutions on the Google Cloud environment than a noncertified individual.
How to Become Certified
You do not have to work for a particular company. It's not a secret society. There is no prerequisite to take this exam. However, there is a recommendation to have 3+ years of industry experience, including one or more years designing and managing solutions using Google Cloud.
This exam is 2 hours long and has 50–60 multiple‐choice questions. You can register for this exam in two ways:
Take the online‐proctored exam from anywhere or sitting at home. You can review the online testing requirements at www.webassessor.com/wa.do?page=certInfo&branding=GOOGLECLOUD&tabs=13.
Take the on‐site, proctored exam at a testing center.
We usually prefer to go with the on‐site option as we like the focus time in a proctored environment. We have taken all our certifications in a test center. You can find and locate a test center near you at www.kryterion.com/Locate-Test-Center.
Who Should Buy This Book
This book is intended to help students, developers, data scientists, IT professionals, and ML engineers gain expertise in the ML technology on the Google Cloud Platform and take the Professional Machine Learning Engineer exam. This book intends to take readers through the machine learning process starting from data and moving on through feature engineering, model training, and deployment on the Google Cloud. It also walks readers through best practices for when to pick custom models versus AutoML or pretrained models. Google Cloud AI/ML technologies are presented through real‐world scenarios to illustrate how IT professionals can design, build, and operate secure ML cloud environments to modernize and automate applications.
Anybody who wants to pass the Professional ML Engineer exam may benefit from this book. If you're new to Google Cloud, this book covers the updated machine learning exam course material, including the Google Cloud Vertex AI platform, MLOps, and BigQuery ML. This is the only book on the market to cover the complete Vertex AI platform, from bringing your data to training, tuning, and deploying your models.
Since it's a professional‐level study guide, this book is written with the assumption that you know the basics of the Google Cloud Platform, such as compute, storage, networking, databases, and identity and access management (IAM) or have taken the Google Cloud Associate‐level certification exam. Moreover, this book assumes you understand the basics of machine learning and data science in general. In case you do not understand a term or concept, we have included a glossary for your reference.
How This Book Is Organized
This book consists of 14 chapters plus supplementary information: a glossary, this introduction, and the assessment test after the introduction. The chapters are organized as follows:
Chapter 1: Framing ML Problems This chapter covers how you can translate business challenges into ML use cases.
Chapter 2: Exploring Data and Building Data Pipelines
This chapter covers visualization, statistical fundamentals at scale, evaluation of data quality and feasibility, establishing data constraints (e.g., TFDV), organizing and optimizing training datasets, data validation, handling missing data, handling outliers, and data leakage.
Chapter 3: Feature Engineering
This chapter covers topics such as encoding structured data types, feature selection, class imbalance, feature crosses, and transformations (TensorFlow Transform).
Chapter 4: Choosing the Right ML Infrastructure
This chapter covers topics such as evaluation of compute and accelerator options (e.g., CPU, GPU, TPU, edge devices) and choosing appropriate Google Cloud hardware components. It also covers choosing the best solution (ML vs. non‐ML, custom vs. pre‐packaged [e.g., AutoML, Vision API]) based on the business requirements. It discusses defining how the model output should be used to solve the business problem. It also covers deciding how incorrect results should be handled and identifying data sources (available vs. ideal). Finally, it discusses AI solutions such as CCAI, DocAI, and Recommendations AI.
Chapter 5: Architecting ML Solutions
This chapter explains how to design reliable, scalable, and highly available ML solutions. Other topics include how you can choose appropriate ML services for a use case (e.g., Cloud Build, Kubeflow), component types (e.g., data collection, data management), automation, orchestration, and serving in machine learning.
Chapter 6: Building Secure ML Pipelines This chapter describes how to build secure ML systems (e.g., protecting against unintentional exploitation of data/model, hacking). It also covers the privacy implications of data usage and/or collection (e.g., handling sensitive data such as personally identifiable information [PII] and protected health information [PHI]).
Chapter 7: Model Building This chapter describes the choice of framework and model parallelism. It also covers modeling techniques given interpretability requirements, transfer learning, data augmentation, semi‐supervised learning, model generalization, and strategies to handle overfitting and underfitting.
Chapter 8: Model Training and Hyperparameter Tuning This chapter focuses on the ingestion of various file types into training (e.g., CSV, JSON, IMG, parquet or databases, Hadoop/Spark). It covers training a model as a job in different environments. It also talks about unit tests for model training and serving and hyperparameter tuning. Moreover, it discusses ways to track metrics during training and retraining/redeployment evaluation.
Chapter 9: Model Explainability on Vertex AI This chapter covers approaches to model explainability on Vertex AI.
Chapter 10: Scaling Models in Production This chapter covers scaling prediction service (e.g., Vertex AI Prediction, containerized serving), serving (online, batch, caching), Google Cloud serving options, testing for target performance, and configuring trigger and pipeline schedules.
Chapter 11: Designing ML Training Pipelines This chapter covers identification of components, parameters, triggers, and compute needs (e.g., Cloud Build, Cloud Run). It also talks about orchestration framework (e.g., Kubeflow Pipelines/Vertex AI Pipelines, Cloud Composer/Apache Airflow), hybrid or multicloud strategies, and system design with TFX components/Kubeflow DSL.
Chapter 12: Model Monitoring, Tracking, and Auditing Metadata This chapter covers the performance and business quality of ML model predictions, logging strategies, organizing and tracking experiments, and pipeline runs. It also talks about dataset versioning and model/dataset lineage.
Chapter 13: Maintaining ML Solutions This chapter covers establishing continuous evaluation metrics (e.g., evaluation of drift or bias), understanding the Google Cloud permission model, and identification of appropriate retraining policies. It also covers common training and serving errors (TensorFlow), ML model failure, and resulting biases. Finally, it talks about how you can tune the performance of ML solutions for training and serving in production.
Chapter 14: BigQuery ML This chapter covers BigQuery ML algorithms, when to use BigQuery ML versus Vertex AI, and the interoperability with Vertex AI.
Chapter Features
Each chapter begins with a list of the objectives that are covered in the chapter. The book doesn't cover the objectives in order. Thus, you shouldn't be alarmed at some of the odd ordering of the objectives within the book.
At the end of each chapter, you'll find several elements you can use to prepare for the exam.
Exam Essentials This section summarizes important information that was covered in the chapter. You should be able to perform each of the tasks or convey the information requested.
Review Questions Each chapter concludes with 8+ review questions. You should answer these questions and check your answers against the ones provided after the questions. If you can't answer at least 80 percent of these questions correctly, go back and review the chapter, or at least those sections that seem to be giving you difficulty.