Dima Bykhovsky
ML
Lectures, Deep learning course, 2024
Notes: link icons in this list mark interactive visualizations, static visualizations, videos, books, papers, and articles.
Preface and motivation
History of the Netflix Prize (Wikipedia)
History of the ImageNet competition (Wikipedia); the AlexNet paper
History of neural networks (link)
FP8 (floating-point 8-bit) number format (announcement) and technical description
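A minimal sketch of the idea behind FP8 (illustrative, not from the linked materials), assuming the E4M3 variant: 1 sign bit, 4 exponent bits with bias 7, and 3 mantissa bits.

```python
def decode_fp8_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte: 1 sign, 4 exponent (bias 7), 3 mantissa bits."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0:                    # subnormal: no implicit leading 1
        return sign * (man / 8) * 2.0 ** -6
    if exp == 0xF and man == 0x7:   # E4M3 reserves only this bit pattern for NaN
        return float("nan")
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)

print(decode_fp8_e4m3(0b0_0111_000))  # 1.0 (exponent 7 -> 2^0, mantissa 0)
print(decode_fp8_e4m3(0b0_1111_110))  # 448.0, the largest finite E4M3 value
```

Note how few values the format can represent: 3 mantissa bits give only 8 steps per binade, which is why FP8 training relies on per-tensor scaling.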
Learning by Gradient Descent (GD)
GD algorithm by RealPython
GD visualization at deeplearning.ai
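The core update rule is only a few lines; here is a minimal sketch (illustrative, not from the linked materials) of plain GD on a one-dimensional quadratic loss:

```python
# Minimal gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def gradient_descent(lr=0.1, steps=100):
    w = 0.0                   # arbitrary starting point
    for _ in range(steps):
        grad = 2 * (w - 3)    # analytic gradient of the loss at w
        w -= lr * grad        # step opposite the gradient direction
    return w

print(gradient_descent())  # converges to the minimizer w = 3
```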
Single- and multi-layer neural networks
Overview
A Quick Introduction to Vanilla Neural Networks (link)
On-line visualization of neural networks at TensorFlow Playground, with some explanations from Google
Hidden representation (learned features) visualization (link)
Understanding Deep Learning book by Simon J.D. Prince: Chapters 3 and 4 (download page)
DNN chapter in Hebrew
But what is a neural network? 3Blue1Brown video
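A multi-layer network is just alternating affine maps and non-linearities; a minimal NumPy forward pass (illustrative shapes, randomly initialized weights):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# One hidden layer: y = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # 3 inputs -> 4 hidden units
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # 4 hidden units -> 2 outputs

def forward(x):
    h = relu(W1 @ x + b1)   # hidden representation (the "learned features")
    return W2 @ h + b2

y = forward(np.array([1.0, -0.5, 2.0]))
print(y.shape)  # (2,)
```

The intermediate vector h is exactly what the "hidden representation" visualization linked above shows.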
Activation functions
Comprehensive synthesis of the pros and cons of the main activation functions (link)
Commonly used activation functions on stackexchange
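For reference, the most common activation functions are one-liners in NumPy (a sketch, not taken from the linked discussions):

```python
import numpy as np

def sigmoid(x):   # squashes to (0, 1); saturates for large |x|
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):      # zero-centered variant, range (-1, 1)
    return np.tanh(x)

def relu(x):      # cheap and non-saturating for x > 0, but "dies" for x < 0
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):  # keeps a small gradient for x < 0
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 2.0])
for f in (sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, f(x))
```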
Backpropagation and initialization
Visualization of backpropagation and (Xavier) initialization by Katanforoosh & Kunin, “Initializing neural networks”, deeplearning.ai, 2019
GD modifications and stochastic GD by Katanforoosh, Kunin et al., “Parameter optimization in neural networks”, deeplearning.ai, 2019
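Xavier (Glorot) initialization itself is a one-liner; a sketch of the normal-distribution variant, assuming the usual variance formula 2 / (fan_in + fan_out):

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    """Xavier/Glorot initialization: weight variance scaled by layer width so
    activations and gradients keep a comparable scale across layers."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

W = xavier_init(256, 128)
print(W.std())  # close to sqrt(2 / (256 + 128)) ≈ 0.072
```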
Regularization (Dropout, batch normalization)
Ridge (L2) regression (StatQuest)
Paperswithcode review
StatQuest video
Where should one place Dropout, Batch Normalization, and Activation layers?
Intro to Optimization in Deep Learning: Busting the Myth About Batch Normalization (link)
Hands-On Batch Normalization
Group Normalization (paper)
Scaled Exponential Linear Unit (self-normalizing networks): theory (paperswithcode). Note: requires the special AlphaDropout layer and the LecunNormal initializer.
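Both regularizers above are short enough to write from scratch; a NumPy sketch of inverted dropout and batch normalization at training time (illustrative, not the framework implementations):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.5, training=True):
    """Inverted dropout: zero units with probability p, rescale the rest
    by 1/(1-p) so the expected activation is unchanged at inference."""
    if not training:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch dimension, then scale and shift
    by the learnable parameters gamma and beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = rng.normal(loc=5.0, scale=3.0, size=(64, 10))  # batch of 64, 10 features
y = batch_norm(x)
print(y.mean(), y.std())  # roughly 0 and 1 after normalization
```

At inference, dropout is a no-op and batch norm uses running statistics instead of the batch statistics; the sketch shows training-time behavior only.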
Training
GD for NN
Gradient descent, how neural networks learn: 3Blue1Brown video
Bigger batches are not necessarily better: Keskar et al., On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, ArXiv, 2017.
CNN and Conv2D
CNN Explainer online visualization
Convolution arithmetic
Understanding 1D, 2D and 3D Convolution Network on Kaggle
Intuitive understanding of 1D, 2D, and 3D convolutions in convolutional neural networks (report)
Hyper-parameter calculation: Convolutional Neural Networks, Explained
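The hyper-parameter bookkeeping from the article above boils down to two formulas; a sketch using the standard output-size rule floor((n + 2p - k) / s) + 1 (function names are my own):

```python
def conv2d_output_size(h, w, kernel=3, stride=1, padding=0):
    """Spatial output size of a Conv2D layer: floor((n + 2p - k) / s) + 1."""
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return out_h, out_w

def conv2d_params(in_channels, out_channels, kernel=3):
    """Learnable parameters: one k*k*C_in filter plus a bias per output map."""
    return out_channels * (kernel * kernel * in_channels + 1)

print(conv2d_output_size(28, 28, kernel=5))                      # (24, 24)
print(conv2d_output_size(32, 32, kernel=3, stride=2, padding=1)) # (16, 16)
print(conv2d_params(3, 16, kernel=3))                            # 448
```

Checking a layer's parameter count this way is a quick sanity test against a framework's model summary.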