
MACHINE LEARNING Q AND AI

30 Essential Questions and Answers on Machine Learning and AI

San Francisco

MACHINE LEARNING Q AND AI. Copyright © 2024 by Sebastian Raschka.

All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.

First printing

28 27 26 25 24 1 2 3 4 5

ISBN-13: 978-1-7185-0376-2 (print)

ISBN-13: 978-1-7185-0377-9 (ebook)

Published by No Starch Press, Inc.

245 8th Street, San Francisco, CA 94103

phone: +1 415 863 9900; www.nostarch.com; info@nostarch.com

Publisher: William Pollock

Managing Editor: Jill Franklin

Production Manager: Sabrina Plomitallo-González

Production Editor: Miles Bond

Developmental Editor: Abigail Schott-Rosenfield

Cover Illustrator: Gina Redman

Interior Design: Octopod Studios

Technical Reviewer: Andrea Panizza

Copyeditor: George Hale

Proofreader: Audrey Doyle

Indexer: BIM Creatives, LLC

Library of Congress Control Number: 2023041820

For customer service inquiries, please contact info@nostarch.com. For information on distribution, bulk sales, corporate sales, or translations: sales@nostarch.com. For permission to translate this work: rights@nostarch.com. To report counterfeit copies or piracy: counterfeit@nostarch.com.

No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The information in this book is distributed on an “As Is” basis, without warranty. While every precaution has been taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it.

To my partner, Liza; my family; and the global community of creators who have motivated and influenced my journey as an author

About the Author

Sebastian Raschka, PhD, is a machine learning and AI researcher with a strong passion for education. As Lead AI Educator at Lightning AI, he is excited about making AI and deep learning more accessible and teaching people how to utilize these technologies at scale. Before dedicating his time fully to Lightning AI, Sebastian held a position as assistant professor of statistics at the University of Wisconsin–Madison, where he specialized in researching deep learning and machine learning. You can find out more about his research on his website (https://sebastianraschka.com). Moreover, Sebastian loves open source software and has been a passionate contributor for over a decade. Next to coding, he also loves writing and authored the bestselling books Python Machine Learning and Machine Learning with PyTorch and Scikit-Learn (both from Packt Publishing).

About the Technical Reviewer

Andrea Panizza is a principal AI specialist at Baker Hughes, leveraging cutting-edge AI/ML techniques to accelerate engineering design, automate information retrieval and extraction from large collections of documents, and use computer vision to power unmanned asset inspection. He has a PhD in computational fluid dynamics. Before joining Baker Hughes, he worked as a CFD researcher at CIRA (the Italian Center for Aerospace Research).

BRIEF CONTENTS

Foreword

Acknowledgments

Introduction

PART I: NEURAL NETWORKS AND DEEP LEARNING

Chapter 1: Embeddings, Latent Space, and Representations

Chapter 2: Self-Supervised Learning

Chapter 3: Few-Shot Learning

Chapter 4: The Lottery Ticket Hypothesis

Chapter 5: Reducing Overfitting with Data

Chapter 6: Reducing Overfitting with Model Modifications

Chapter 7: Multi-GPU Training Paradigms

Chapter 8: The Success of Transformers

Chapter 9: Generative AI Models

Chapter 10: Sources of Randomness

PART II: COMPUTER VISION

Chapter 11: Calculating the Number of Parameters

Chapter 12: Fully Connected and Convolutional Layers

Chapter 13: Large Training Sets for Vision Transformers

PART III: NATURAL LANGUAGE PROCESSING

Chapter 14: The Distributional Hypothesis

Chapter 15: Data Augmentation for Text

Chapter 16: Self-Attention

Chapter 17: Encoder- and Decoder-Style Transformers

Chapter 18: Using and Fine-Tuning Pretrained Transformers

Chapter 19: Evaluating Generative Large Language Models

PART IV: PRODUCTION AND DEPLOYMENT

Chapter 20: Stateless and Stateful Training

Chapter 21: Data-Centric AI

Chapter 22: Speeding Up Inference

Chapter 23: Data Distribution Shifts

PART V: PREDICTIVE PERFORMANCE AND MODEL EVALUATION

Chapter 24: Poisson and Ordinal Regression

Chapter 25: Confidence Intervals

Chapter 26: Confidence Intervals vs. Conformal Predictions

Chapter 27: Proper Metrics

Chapter 28: The k in k-Fold Cross-Validation

Chapter 29: Training and Test Set Discordance

Chapter 30: Limited Labeled Data

Afterword

Appendix: Answers to the Exercises

Index

CONTENTS IN DETAIL

FOREWORD

ACKNOWLEDGMENTS

INTRODUCTION

Who Is This Book For?

What Will You Get Out of This Book?

How to Read This Book

Online Resources

PART I

NEURAL NETWORKS AND DEEP LEARNING

1 EMBEDDINGS, LATENT SPACE, AND REPRESENTATIONS

Embeddings

Latent Space

Representation

Exercises

References

2 SELF-SUPERVISED LEARNING

Self-Supervised Learning vs. Transfer Learning

Leveraging Unlabeled Data

Self-Prediction and Contrastive Self-Supervised Learning

4 THE LOTTERY TICKET HYPOTHESIS

The Lottery Ticket Training Procedure

Practical Implications and Limitations

Exercises

References

5 REDUCING OVERFITTING WITH DATA

Common Methods

Collecting More Data

Data Augmentation

Pretraining

Other Methods

Exercises

References

6 REDUCING OVERFITTING WITH MODEL MODIFICATIONS

Common Methods

Regularization

Smaller Models

Caveats with Smaller Models

Ensemble Methods

Other Methods

Choosing a Regularization Technique

Exercises

References

7 MULTI-GPU TRAINING PARADIGMS

The Training Paradigms

Model Parallelism

Data Parallelism

Tensor Parallelism

Pipeline Parallelism

Sequence Parallelism

Recommendations

Exercises

References

8 THE SUCCESS OF TRANSFORMERS

The Attention Mechanism

Pretraining via Self-Supervised Learning

Large Numbers of Parameters

Easy Parallelization

Exercises

References

9 GENERATIVE AI MODELS

Generative vs. Discriminative Modeling

Types of Deep Generative Models

Energy-Based Models

Variational Autoencoders

Generative Adversarial Networks

Flow-Based Models

Autoregressive Models

Diffusion Models

Consistency Models

Recommendations

Exercises

References

10 SOURCES OF RANDOMNESS

Model Weight Initialization

Dataset Sampling and Shuffling

Nondeterministic Algorithms

Different Runtime Algorithms

Hardware and Drivers

Randomness and Generative AI

Exercises

References

PART II

COMPUTER VISION

11 CALCULATING THE NUMBER OF PARAMETERS

How to Find Parameter Counts

Convolutional Layers

Fully Connected Layers

Practical Applications

Exercises

12 FULLY CONNECTED AND CONVOLUTIONAL LAYERS

When the Kernel and Input Sizes Are Equal

When the Kernel Size Is 1

Recommendations

Exercises

13 LARGE TRAINING SETS FOR VISION TRANSFORMERS

Inductive Biases in CNNs

ViTs Can Outperform CNNs

Inductive Biases in ViTs

Recommendations

Exercises

References

PART III

NATURAL LANGUAGE PROCESSING

14 THE DISTRIBUTIONAL HYPOTHESIS

Word2vec, BERT, and GPT

Does the Hypothesis Hold?

Exercises

References

15 DATA AUGMENTATION FOR TEXT

Synonym Replacement

Word Deletion

Word Position Swapping

Sentence Shuffling

Noise Injection

Back Translation

Synthetic Data

Recommendations

Exercises

References

16 SELF-ATTENTION

Attention in RNNs

The Self-Attention Mechanism

Exercises

References

17 ENCODER- AND DECODER-STYLE TRANSFORMERS

The Original Transformer

Encoders

Decoders

Encoder-Decoder Hybrids

Terminology

Contemporary Transformer Models

Exercises

References

18 USING AND FINE-TUNING PRETRAINED TRANSFORMERS

Using Transformers for Classification Tasks

In-Context Learning, Indexing, and Prompt Tuning

Parameter-Efficient Fine-Tuning

Reinforcement Learning with Human Feedback

Adapting Pretrained Language Models

Exercises

References

19 EVALUATING GENERATIVE LARGE LANGUAGE MODELS

Evaluation Metrics for LLMs

Perplexity

BLEU Score

ROUGE Score

BERTScore

Surrogate Metrics

Exercises

References

PART IV

PRODUCTION AND DEPLOYMENT

20 STATELESS AND STATEFUL TRAINING

Stateless (Re)training

Stateful Training

Exercises

21 DATA-CENTRIC AI

Data-Centric vs. Model-Centric AI

Recommendations

Exercises

References

22 SPEEDING UP INFERENCE

Parallelization

Vectorization

Loop Tiling

Operator Fusion

Quantization

Exercises

References

23 DATA DISTRIBUTION SHIFTS

Covariate Shift

Label Shift

Concept Drift

Domain Shift

Types of Data Distribution Shifts

Exercises

References

PART V

PREDICTIVE PERFORMANCE AND MODEL EVALUATION

24 POISSON AND ORDINAL REGRESSION

Exercises

25 CONFIDENCE INTERVALS

Defining Confidence Intervals

The Methods

Method 1: Normal Approximation Intervals

Method 2: Bootstrapping Training Sets

Method 3: Bootstrapping Test Set Predictions

Method 4: Retraining Models with Different Random Seeds

Recommendations

Exercises

References

26 CONFIDENCE INTERVALS VS. CONFORMAL PREDICTIONS

Confidence Intervals and Prediction Intervals

Prediction Intervals and Conformal Predictions

Prediction Regions, Intervals, and Sets

Computing Conformal Predictions

A Conformal Prediction Example

The Benefits of Conformal Predictions

28 THE k IN k-FOLD CROSS-VALIDATION

Trade-offs in Selecting Values for k

Determining Appropriate Values for k

Exercises

30 LIMITED LABELED DATA

Improving Model Performance with Limited Labeled Data

Labeling More Data

Bootstrapping the Data

Transfer Learning

Self-Supervised Learning

Active Learning

Few-Shot Learning

Meta-Learning

Weakly Supervised Learning

Semi-Supervised Learning

Self-Training

Multi-Task Learning

Multimodal Learning

Inductive Biases

Recommendations

Exercises

References

AFTERWORD

APPENDIX: ANSWERS TO THE EXERCISES

INDEX

FOREWORD

There are hundreds of introductory texts on machine learning. They come in a vast array of styles and approaches, from theory-first perspectives for graduate students to business-focused viewpoints for C-suites. These primers are invaluable resources for individuals taking their first steps into the field, and they will remain so for decades to come.

However, the path to expertise is not solely about the beginnings. It also encompasses the intricate detours, the steep ascents, and the nuances that aren’t evident initially. Put another way, after studying the basics, learners are left asking, “What’s next?” It is here, in the realm beyond the basics, that this book finds its purpose.

Within these pages, Sebastian walks readers through a broad spectrum of intermediate and advanced topics in applied machine learning that they are likely to encounter on their journey to expertise. One could hardly ask for a better guide than Sebastian, who is, without exaggeration, the best machine learning educator currently in the field. On each page, Sebastian not only imparts his extensive knowledge, but also shares the passion and curiosity that mark true expertise.

To all the learners who have crossed the initial threshold and are eager to delve deeper, this book is for you. You will finish this book as a more skilled machine learning practitioner than when you began. Let it serve as the bridge to your next phase of rewarding adventures in machine learning.

Good luck.

San Francisco
August 2023

ACKNOWLEDGMENTS

Writing a book is an enormous undertaking. This project would not have been possible without the help of the open source and machine learning communities who collectively created the technologies that this book is about.

I want to extend my thanks to the following people for their immensely helpful feedback on the manuscript:

Andrea Panizza for being an outstanding technical reviewer, who provided very valuable and insightful feedback.

Anton Reshetnikov for suggesting a cleaner layout for the supervised learning flowchart in Chapter 30.

Nikan Doosti, Juan M. Bello-Rivas, and Ken Hoffman for sharing various typographical errors.

Abigail Schott-Rosenfield and Jill Franklin for being exemplary editors. Their knack for asking the right questions and improving language has significantly elevated the quality of this book.

INTRODUCTION

Thanks to rapid advancements in deep learning, we have seen a significant expansion of machine learning and AI in recent years.

This progress is exciting if we expect these advancements to create new industries, transform existing ones, and improve the quality of life for people around the world. On the other hand, the constant emergence of new techniques can make it challenging and time-consuming to keep abreast of the latest developments. Nonetheless, staying current is essential for professionals and organizations that use these technologies.

I wrote this book as a resource for readers and machine learning practitioners who want to advance their expertise in the field and learn about techniques that I consider useful and significant but that are often overlooked in traditional and introductory textbooks and classes. I hope you’ll find this book a valuable resource for obtaining new insights and discovering new techniques you can implement in your work.

Who Is This Book For?

Navigating the world of AI and machine learning literature can often feel like walking a tightrope, with most books positioned at either end: broad beginner’s introductions or deeply mathematical treatises. This book illustrates and discusses important developments in these fields while staying approachable and not requiring an advanced math or coding background.

This book is for people with some experience with machine learning who want to learn new concepts and techniques. It’s ideal for those who have taken a beginner course in machine learning or deep learning or have read an equivalent introductory book on the topic. (Throughout this book, I will use machine learning as an umbrella term for machine learning, deep learning, and AI.)

What Will You Get Out of This Book?

This book adopts a unique Q&A style, where each brief chapter is structured around a central question related to fundamental concepts in machine learning, deep learning, and AI. Every question is followed by an explanation, with several illustrations and figures, as well as exercises to test your understanding. Many chapters also include references for further reading. These bite-sized nuggets of information provide an enjoyable jumping-off point on your journey from machine learning beginner to expert.

The book covers a wide range of topics. It includes new insights about established architectures, such as convolutional networks, that allow you to utilize these technologies more effectively. It also discusses more advanced techniques, such as the inner workings of large language models (LLMs) and vision transformers. Even experienced machine learning researchers and practitioners will encounter something new to add to their arsenal of techniques.

While this book will expose you to new concepts and ideas, it’s not a math or coding book. You won’t need to solve any proofs or run any code while reading. In other words, this book is a perfect travel companion or something you can read on your favorite reading chair with your morning coffee or tea.

How to Read This Book

Each chapter of this book is designed to be self-contained, offering you the freedom to jump between topics as you wish. When a concept from one chapter is explained in more detail in another, I’ve included chapter references you can follow to fill in gaps in your understanding.

However, there’s a strategic sequence to the chapters. For example, the early chapter on embeddings sets the stage for later discussions on self-supervised learning and few-shot learning.
