Speaker 1: Jeremy Howard
Speaker 2: John
Speaker 3: Audience Member
0:02 | Lesson 7: Inside a Neural Net
1:06 | Paddy, Rice Paddy Competition
2:47 | Scaling Up Models
4:12 | CUDA Out of Memory Error
8:13 | Gradient Accumulation
15:18 | Gradient Accumulation Clarifications
17:33 | Learning Rate Scaling
17:57 | Gradient Accumulation in fastai
19:13 | Training Different Models
22:04 | Ensemble of Models
25:52 | K-Fold Cross-Validation
28:01 | Drawbacks of Gradient Accumulation
28:57 | GPU Recommendations
30:56 | Teacher-Student Models
31:37 | Road to the Top Conclusion
31:47 | Multi-Target Model
32:24 | Data Loader with Two Dependent Variables
35:47 | get_variety Function
38:00 | Model Predicting Two Things
39:10 | Metrics and Loss Functions
41:12 | Cross-Entropy Loss
46:18 | Softmax
49:00 | Cross-Entropy Loss Calculation
52:20 | Binary Cross-Entropy
53:04 | Loss Function Versions
54:26 | Multi-Target Model
57:03 | Error Rate for Disease and Variety
58:15 | Multi-Target Model Performance
1:00:04 | Reasons to Learn Multi-Target Models
1:01:14 | Break
1:01:25 | Collaborative Filtering Deep Dive
1:02:03 | MovieLens Data Set
1:04:04 | Collaborative Filtering Data
1:05:31 | Filling in the Gap
1:06:02 | Predicting User Preferences
1:08:40 | Latent Factors
1:09:16 | Latent Factors in Excel
1:11:02 | Matrix Product and Dot Product
1:11:54 | Stochastic Gradient Descent
1:12:41 | Optimizing the Loss Function
1:15:03 | Matrix Completion
1:15:22 | Cosine Similarity and Correlation
1:16:08 | PyTorch Implementation
1:16:36 | Excel Implementation with PyTorch Format
1:17:59 | Dot Product in Excel
1:18:39 | Embedding
1:19:15 | Embedding in PyTorch
1:20:03 | Learning Latent Factors in PyTorch
1:20:16 | Data Loaders
1:21:45 | User and Movie Factors
1:22:19 | Choosing the Number of Factors
1:23:19 | Training Speed
1:23:49 | Embedding as Matrix Multiplication
1:25:51 | One-Hot Encoded Vector
1:26:26 | Embedding as a Computational Shortcut
1:27:12 | Collaborative Filtering Model
1:27:19 | Creating a Model from Scratch
1:27:41 | Creating a Class in Python
1:28:03 | Magic Methods
1:29:03 | Object-Oriented Programming in PyTorch
1:29:12 | Super Class
1:29:28 | Module Super Class
1:29:39 | dunder init Method
1:30:11 | Treating a Model as a Function
1:30:24 | forward Method
1:31:39 | Training the Model
1:32:30 | Model Limitations
1:33:01 | Movie Enthusiasts
1:33:17 | Sigmoid Function
1:33:53 | Sigmoid Range
1:34:49 | Improving the Model
1:35:06 | User Bias
1:36:00 | Adding User Bias to the Model
1:36:38 | Movie Bias
1:37:02 | Training with Bias
1:38:13 | Overfitting
1:38:56 | Weight Decay
1:39:16 | Weight Decay in the Loss Function
1:41:37 | Weight Decay in PyTorch
1:43:06 | Reducing Overfitting
1:43:49 | Regularization
1:44:35 | Next Time
1:44:45 | Questions
1:45:01 | Hyperparameter Search
1:45:39 | Recommendation Systems Based on Averages
1:46:32 | Conclusion