Machine Learning Q and AI: 30 Essential Questions and Answers on Machine Learning and AI
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.
For customer service inquiries, please contact info@nostarch.com. For information on distribution, bulk sales, corporate sales, or translations: sales@nostarch.com. For permission to translate this work: rights@nostarch.com. To report counterfeit copies or piracy: counterfeit@nostarch.com.
No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The information in this book is distributed on an “As Is” basis, without warranty. While every precaution has been taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it.
To my partner, Liza; my family; and the global community of creators who have motivated and influenced my journey as an author
About the Author
Sebastian Raschka, PhD, is a machine learning and AI researcher with a strong passion for education. As Lead AI Educator at Lightning AI, he is excited about making AI and deep learning more accessible and teaching people how to utilize these technologies at scale. Before dedicating his time fully to Lightning AI, Sebastian held a position as assistant professor of statistics at the University of Wisconsin–Madison, where he specialized in researching deep learning and machine learning. You can find out more about his research on his website (https://sebastianraschka.com). Moreover, Sebastian loves open source software and has been a passionate contributor for over a decade. Next to coding, he also loves writing and authored the bestselling books Python Machine Learning and Machine Learning with PyTorch and Scikit-Learn (both from Packt Publishing).
About the Technical Reviewer
Andrea Panizza is a principal AI specialist at Baker Hughes, leveraging cutting-edge AI/ML techniques to accelerate engineering design, automate information retrieval and extraction from large collections of documents, and use computer vision to power unmanned asset inspection. He has a PhD in computational fluid dynamics. Before joining Baker Hughes, he worked as a CFD researcher at CIRA (the Italian Center for Aerospace Research).
BRIEF CONTENTS
Foreword
Acknowledgments
Introduction
PART I: NEURAL NETWORKS AND DEEP LEARNING
Chapter 1: Embeddings, Latent Space, and Representations
Chapter 2: Self-Supervised Learning
Chapter 3: Few-Shot Learning
Chapter 4: The Lottery Ticket Hypothesis
Chapter 5: Reducing Overfitting with Data
Chapter 6: Reducing Overfitting with Model Modifications
Chapter 7: Multi-GPU Training Paradigms
Chapter 8: The Success of Transformers
Chapter 9: Generative AI Models
Chapter 10: Sources of Randomness
PART II: COMPUTER VISION
Chapter 11: Calculating the Number of Parameters
Chapter 12: Fully Connected and Convolutional Layers
Chapter 13: Large Training Sets for Vision Transformers
PART III: NATURAL LANGUAGE PROCESSING
Chapter 14: The Distributional Hypothesis
Chapter 15: Data Augmentation for Text
Chapter 16: Self-Attention
Chapter 17: Encoder- and Decoder-Style Transformers
Chapter 18: Using and Fine-Tuning Pretrained Transformers
Chapter 19: Evaluating Generative Large Language Models
PART IV: PRODUCTION AND DEPLOYMENT
Chapter 20: Stateless and Stateful Training
Chapter 21: Data-Centric AI
Chapter 22: Speeding Up Inference
Chapter 23: Data Distribution Shifts
PART V: PREDICTIVE PERFORMANCE AND MODEL EVALUATION
Chapter 24: Poisson and Ordinal Regression
Chapter 25: Confidence Intervals
Chapter 26: Confidence Intervals vs. Conformal Predictions
Chapter 27: Proper Metrics
Chapter 28: The k in k-Fold Cross-Validation
Chapter 29: Training and Test Set Discordance
Chapter 30: Limited Labeled Data
Afterword
Appendix: Answers to the Exercises
Index
CONTENTS IN DETAIL
FOREWORD
ACKNOWLEDGMENTS
INTRODUCTION
Who Is This Book For?
What Will You Get Out of This Book?
How to Read This Book
Online Resources
PART I
NEURAL NETWORKS AND DEEP LEARNING
1 EMBEDDINGS, LATENT SPACE, AND REPRESENTATIONS
Embeddings
Latent Space
Representation
Exercises
References
2 SELF-SUPERVISED LEARNING
Self-Supervised Learning vs. Transfer Learning
Leveraging Unlabeled Data
Self-Prediction and Contrastive Self-Supervised Learning
4
THE LOTTERY TICKET HYPOTHESIS
The Lottery Ticket Training Procedure
Practical Implications and Limitations
Exercises
References
5
REDUCING OVERFITTING WITH DATA
Common Methods
Collecting More Data
Data Augmentation
Pretraining
Other Methods
Exercises
References
6
REDUCING OVERFITTING WITH MODEL MODIFICATIONS
Common Methods
Regularization
Smaller Models
Caveats with Smaller Models
Ensemble Methods
Other Methods
Choosing a Regularization Technique
Exercises
References
7 MULTI-GPU TRAINING PARADIGMS
The Training Paradigms
Model Parallelism
Data Parallelism
Tensor Parallelism
Pipeline Parallelism
Sequence Parallelism
Recommendations
Exercises
References
8
THE SUCCESS OF TRANSFORMERS
The Attention Mechanism
Pretraining via Self-Supervised Learning
Large Numbers of Parameters
Easy Parallelization
Exercises
References
9
GENERATIVE AI MODELS
Generative vs. Discriminative Modeling
Types of Deep Generative Models
Energy-Based Models
Variational Autoencoders
Generative Adversarial Networks
Flow-Based Models
Autoregressive Models
Diffusion Models
Consistency Models
Recommendations
Exercises
References
10
SOURCES OF RANDOMNESS
Model Weight Initialization
Dataset Sampling and Shuffling
Nondeterministic Algorithms
Different Runtime Algorithms
Hardware and Drivers
Randomness and Generative AI
Exercises
References
PART II
COMPUTER VISION
11
CALCULATING THE NUMBER OF PARAMETERS
How to Find Parameter Counts
Convolutional Layers
Fully Connected Layers
Practical Applications
Exercises
12
FULLY CONNECTED AND CONVOLUTIONAL LAYERS
When the Kernel and Input Sizes Are Equal
When the Kernel Size Is 1
Recommendations
Exercises
13
LARGE TRAINING SETS FOR VISION TRANSFORMERS
Inductive Biases in CNNs
ViTs Can Outperform CNNs
Inductive Biases in ViTs
Recommendations
Exercises
References
14
THE DISTRIBUTIONAL HYPOTHESIS
Word2vec, BERT, and GPT
Does the Hypothesis Hold?
Exercises
References
15 DATA AUGMENTATION FOR TEXT
Synonym Replacement
Word Deletion
Word Position Swapping
Sentence Shuffling
Noise Injection
Back Translation
Synthetic Data
Recommendations
Exercises
References
16
SELF-ATTENTION
Attention in RNNs
The Self-Attention Mechanism
Exercises
References
17
ENCODER- AND DECODER-STYLE TRANSFORMERS
The Original Transformer
Encoders
Decoders
Encoder-Decoder Hybrids
Terminology
Contemporary Transformer Models
Exercises
References
18
USING AND FINE-TUNING PRETRAINED TRANSFORMERS
Using Transformers for Classification Tasks
In-Context Learning, Indexing, and Prompt Tuning
Parameter-Efficient Fine-Tuning
Reinforcement Learning with Human Feedback
Adapting Pretrained Language Models
Exercises
References
19
EVALUATING GENERATIVE LARGE LANGUAGE MODELS
Evaluation Metrics for LLMs
Perplexity
BLEU Score
ROUGE Score
BERTScore
Surrogate Metrics
Exercises
References
PART IV
PRODUCTION AND DEPLOYMENT
20
STATELESS AND STATEFUL TRAINING
Stateless (Re)training
Stateful Training
Exercises
21
DATA-CENTRIC AI
Data-Centric vs. Model-Centric AI
Recommendations
Exercises
References
22 SPEEDING UP INFERENCE
Parallelization
Vectorization
Loop Tiling
Operator Fusion
Quantization
Exercises
References
23
DATA DISTRIBUTION SHIFTS
Covariate Shift
Label Shift
Concept Drift
Domain Shift
Types of Data Distribution Shifts
Exercises
References
PART V
PREDICTIVE PERFORMANCE AND MODEL EVALUATION
24 POISSON AND ORDINAL REGRESSION
Exercises
25
CONFIDENCE INTERVALS
Defining Confidence Intervals
The Methods
Method 1: Normal Approximation Intervals
Method 2: Bootstrapping Training Sets
Method 3: Bootstrapping Test Set Predictions
Method 4: Retraining Models with Different Random Seeds
Recommendations
Exercises
References
26
CONFIDENCE INTERVALS VS. CONFORMAL PREDICTIONS
Confidence Intervals and Prediction Intervals
Prediction Intervals and Conformal Predictions
Prediction Regions, Intervals, and Sets
Computing Conformal Predictions
A Conformal Prediction Example
The Benefits of Conformal Predictions
28
THE k IN k-FOLD CROSS-VALIDATION
Trade-offs in Selecting Values for k
Determining Appropriate Values for k
Exercises
30
LIMITED LABELED DATA
Improving Model Performance with Limited Labeled Data
Labeling More Data
Bootstrapping the Data
Transfer Learning
Self-Supervised Learning
Active Learning
Few-Shot Learning
Meta-Learning
Weakly Supervised Learning
Semi-Supervised Learning
Self-Training
Multi-Task Learning
Multimodal Learning
Inductive Biases
Recommendations
Exercises
References
AFTERWORD
APPENDIX: ANSWERS TO THE EXERCISES
INDEX
FOREWORD
There are hundreds of introductory texts on machine learning. They come in a vast array of styles and approaches, from theory-first perspectives for graduate students to business-focused viewpoints for C-suites. These primers are invaluable resources for individuals taking their first steps into the field, and they will remain so for decades to come.
However, the path to expertise is not solely about the beginnings. It also encompasses the intricate detours, the steep ascents, and the nuances that aren’t evident initially. Put another way, after studying the basics, learners are left asking, “What’s next?” It is here, in the realm beyond the basics, that this book finds its purpose.
Within these pages, Sebastian walks readers through a broad spectrum of intermediate and advanced topics in applied machine learning they are likely to encounter on their journey to expertise. One could hardly ask for a better guide than Sebastian, who is, without exaggeration, the best machine learning educator currently in the field. On each page, Sebastian not only imparts his extensive knowledge, but also shares the passion and curiosity that mark true expertise.
To all the learners who have crossed the initial threshold and are eager to delve deeper, this book is for you. You will finish this book as a more skilled machine learning practitioner than when you began. Let it serve as the bridge to your next phase of rewarding adventures in machine learning.
Good luck.
Chris Albon
Director of Machine Learning, the Wikimedia Foundation
San Francisco
August 2023
ACKNOWLEDGMENTS
Writing a book is an enormous undertaking. This project would not have been possible without the help of the open source and machine learning communities who collectively created the technologies that this book is about.
I want to extend my thanks to the following people for their immensely helpful feedback on the manuscript:
Andrea Panizza for being an outstanding technical reviewer, who provided very valuable and insightful feedback.
Anton Reshetnikov for suggesting a cleaner layout for the supervised learning flowchart in Chapter 30.
Nikan Doosti, Juan M. Bello-Rivas, and Ken Hoffman for sharing various typographical errors.
Abigail Schott-Rosenfield and Jill Franklin for being exemplary editors. Their knack for asking the right questions and improving language has significantly elevated the quality of this book.
INTRODUCTION
Thanks to rapid advancements in deep learning, we have seen a significant expansion of machine learning and AI in recent years.
This progress is exciting if we expect these advancements to create new industries, transform existing ones, and improve the quality of life for people around the world. On the other hand, the constant emergence of new techniques can make it challenging and time-consuming to keep abreast of the latest developments. Nonetheless, staying current is essential for professionals and organizations that use these technologies.
I wrote this book as a resource for readers and machine learning practitioners who want to advance their expertise in the field and learn about techniques that I consider useful and significant but that are often overlooked in traditional and introductory textbooks and classes. I hope you’ll find this book a valuable resource for obtaining new insights and discovering new techniques you can implement in your work.
Who Is This Book For?
Navigating the world of AI and machine learning literature can often feel like walking a tightrope, with most books positioned at either end: broad beginner’s introductions or deeply mathematical treatises. This book illustrates and discusses important developments in these fields while staying approachable and not requiring an advanced math or coding background.
This book is for people with some experience with machine learning who want to learn new concepts and techniques. It’s ideal for those who have taken a beginner course in machine learning or deep learning or have read an equivalent introductory book on the topic. (Throughout this book, I will use machine learning as an umbrella term for machine learning, deep learning, and AI.)
What Will You Get Out of This Book?
This book adopts a unique Q&A style, where each brief chapter is structured around a central question related to fundamental concepts in machine learning, deep learning, and AI. Every question is followed by an explanation, with several illustrations and figures, as well as exercises to test your understanding. Many chapters also include references for further reading. These bite-sized nuggets of information provide an enjoyable jumping-off point on your journey from machine learning beginner to expert.
The book covers a wide range of topics. It includes new insights about established architectures, such as convolutional networks, that allow you to utilize these technologies more effectively. It also discusses more advanced techniques, such as the inner workings of large language models (LLMs) and vision transformers. Even experienced machine learning researchers and practitioners will encounter something new to add to their arsenal of techniques.
While this book will expose you to new concepts and ideas, it’s not a math or coding book. You won’t need to solve any proofs or run any code while reading. In other words, this book is a perfect travel companion or something you can read on your favorite reading chair with your morning coffee or tea.
How to Read This Book
Each chapter of this book is designed to be self-contained, offering you the freedom to jump between topics as you wish. When a concept from one chapter is explained in more detail in another, I’ve included chapter references you can follow to fill in gaps in your understanding.
However, there’s a strategic sequence to the chapters. For example, the early chapter on embeddings sets the stage for later discussions on self-supervised learning and few-shot learning.