Dima Bykhovsky
ML
Lectures, Deep learning course, 2024
Notes: link icons in this list mark interactive visualizations, static visualizations, videos, books, papers, and articles.
Preface and motivation
History of the Netflix Prize (Wikipedia)
History of the ImageNet competition (Wikipedia); the AlexNet paper
History of neural networks (link)
FP8 (floating-point 8-bit) number format (announcement) and technical description
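A minimal sketch of the idea behind FP8 (illustrative, not from the linked materials), assuming the E4M3 variant: 1 sign bit, 4 exponent bits with bias 7, and 3 mantissa bits.

```python
def decode_fp8_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte: 1 sign, 4 exponent (bias 7), 3 mantissa bits."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0:                    # subnormal: no implicit leading 1
        return sign * (man / 8) * 2.0 ** -6
    if exp == 0xF and man == 0x7:   # E4M3 reserves only this bit pattern for NaN
        return float("nan")
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)

print(decode_fp8_e4m3(0b0_0111_000))  # 1.0 (exponent 7 -> 2^0, mantissa 0)
print(decode_fp8_e4m3(0b0_1111_110))  # 448.0, the largest finite E4M3 value
```

Note how few values the format can represent: 3 mantissa bits give only 8 steps per binade, which is why FP8 training relies on per-tensor scaling.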
Learning by Gradient Descent (GD)
GD algorithm by RealPython
GD visualization at deeplearning.ai
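The core update rule is only a few lines; here is a minimal sketch (illustrative, not from the linked materials) of plain GD on a one-dimensional quadratic loss:

```python
# Minimal gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def gradient_descent(lr=0.1, steps=100):
    w = 0.0                   # arbitrary starting point
    for _ in range(steps):
        grad = 2 * (w - 3)    # analytic gradient of the loss at w
        w -= lr * grad        # step opposite the gradient direction
    return w

print(gradient_descent())  # converges to the minimizer w = 3
```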
Single- and multi-layer neural networks
Overview
A Quick Introduction to Vanilla Neural Networks (link)
On-line visualization of neural networks at TensorFlow Playground, with some explanations from Google
Hidden representation (learned features) visualization (link)
Understanding Deep Learning book by Simon J.D. Prince: Chapters 3 and 4 (download page)
DNN chapter in Hebrew
But what is a neural network? 3Blue1Brown video
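A multi-layer network is just alternating affine maps and non-linearities; a minimal NumPy forward pass (illustrative shapes, randomly initialized weights):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# One hidden layer: y = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # 3 inputs -> 4 hidden units
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # 4 hidden units -> 2 outputs

def forward(x):
    h = relu(W1 @ x + b1)   # hidden representation (the "learned features")
    return W2 @ h + b2

y = forward(np.array([1.0, -0.5, 2.0]))
print(y.shape)  # (2,)
```

The intermediate vector h is exactly what the "hidden representation" visualization linked above shows.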
Activation functions
Comprehensive synthesis of the pros and cons of the main activation functions (link)
Commonly used activation functions on stackexchange
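For reference, the most common activation functions are one-liners in NumPy (a sketch, not taken from the linked discussions):

```python
import numpy as np

def sigmoid(x):   # squashes to (0, 1); saturates for large |x|
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):      # zero-centered variant, range (-1, 1)
    return np.tanh(x)

def relu(x):      # cheap and non-saturating for x > 0, but "dies" for x < 0
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):  # keeps a small gradient for x < 0
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 2.0])
for f in (sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, f(x))
```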
Backpropagation and initialization
Visualization of backpropagation and (Xavier) initialization by Katanforoosh & Kunin, “Initializing neural networks”, deeplearning.ai, 2019
GD modifications and stochastic GD by Katanforoosh, Kunin et al., “Parameter optimization in neural networks”, deeplearning.ai, 2019
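Xavier (Glorot) initialization itself is a one-liner; a sketch of the normal-distribution variant, assuming the usual variance formula 2 / (fan_in + fan_out):

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    """Xavier/Glorot initialization: weight variance scaled by layer width so
    activations and gradients keep a comparable scale across layers."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

W = xavier_init(256, 128)
print(W.std())  # close to sqrt(2 / (256 + 128)) ≈ 0.072
```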
Regularization (Dropout, batch normalization)
Ridge (L2) regression (StatQuest)
Paperswithcode review
StatQuest video
Where should one place Dropout, Batch Normalization, and Activation layers?
Intro to Optimization in Deep Learning: Busting the Myth About Batch Normalization (link)
Hands-On Batch Normalization
Group Normalization (paper)
Scaled Exponential Linear Unit (self-normalizing networks): theory (paperswithcode). Note: requires the special AlphaDropout layer and the LecunNormal initializer.
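Both regularizers above are short enough to write from scratch; a NumPy sketch of inverted dropout and batch normalization at training time (illustrative, not the framework implementations):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.5, training=True):
    """Inverted dropout: zero units with probability p, rescale the rest
    by 1/(1-p) so the expected activation is unchanged at inference."""
    if not training:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch dimension, then scale and shift
    by the learnable parameters gamma and beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = rng.normal(loc=5.0, scale=3.0, size=(64, 10))  # batch of 64, 10 features
y = batch_norm(x)
print(y.mean(), y.std())  # roughly 0 and 1 after normalization
```

At inference, dropout is a no-op and batch norm uses running statistics instead of the batch statistics; the sketch shows training-time behavior only.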
Training
GD for NN
Gradient descent, how neural networks learn: 3Blue1Brown video
Bigger batches are not necessarily better: Keskar et al., On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, ArXiv, 2017.
CNN and Conv2D
CNN Explainer online visualization
Convolution arithmetic
Understanding 1D, 2D and 3D Convolution Network on Kaggle
Intuitive understanding of 1D, 2D, and 3D convolutions in convolutional neural networks (report)
Hyper-parameter calculation: Convolutional Neural Networks, Explained
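The hyper-parameter bookkeeping from the article above boils down to two formulas; a sketch using the standard output-size rule floor((n + 2p - k) / s) + 1 (function names are my own):

```python
def conv2d_output_size(h, w, kernel=3, stride=1, padding=0):
    """Spatial output size of a Conv2D layer: floor((n + 2p - k) / s) + 1."""
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return out_h, out_w

def conv2d_params(in_channels, out_channels, kernel=3):
    """Learnable parameters: one k*k*C_in filter plus a bias per output map."""
    return out_channels * (kernel * kernel * in_channels + 1)

print(conv2d_output_size(28, 28, kernel=5))                      # (24, 24)
print(conv2d_output_size(32, 32, kernel=3, stride=2, padding=1)) # (16, 16)
print(conv2d_params(3, 16, kernel=3))                            # 448
```

Checking a layer's parameter count this way is a quick sanity test against a framework's model summary.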