Lectures, Deep learning course, 2024

Notes:

:desktop_computer: = interactive visualization :film_strip: = static visualization :film_projector: = video :book: = book :page_with_curl: = paper :page_facing_up: = article

Preface and motivation (Lecture 1)

Logistic regression (Lecture 2)

Definition

  • :page_facing_up:Logistic Regression in Python by RealPython
  • :page_facing_up:Coding interpretation of entropy (Wikipedia)
  • :film_projector:Understanding Binary Cross-Entropy / Log Loss in 5 minutes (YouTube)
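
To make the definition concrete, here is a minimal NumPy sketch of the logistic (sigmoid) model and the binary cross-entropy (log) loss covered by the links above; the data, weights, and variable names are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real score to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Log loss for binary labels; eps guards against log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Toy example: 3 samples, 2 features, illustrative weights
X = np.array([[0.5, 1.2], [-0.3, 0.8], [1.1, -0.7]])
w = np.array([0.4, -0.2])
b = 0.1
y = np.array([1, 0, 1])

p = sigmoid(X @ w + b)  # predicted probabilities
print("loss:", binary_cross_entropy(y, p))
```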

Learning by Gradient Descent (GD)
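
A minimal sketch of plain (full-batch) gradient descent applied to the logistic-regression loss from the previous section; the learning rate and step count are arbitrary illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, lr=0.1, n_steps=500):
    # Full-batch gradient descent on the binary cross-entropy loss
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)      # forward pass
        grad_w = X.T @ (p - y) / n  # dL/dw for the logistic loss
        grad_b = np.mean(p - y)     # dL/db
        w -= lr * grad_w            # step against the gradient
        b -= lr * grad_b
    return w, b

X = np.array([[0.5, 1.2], [-0.3, 0.8], [1.1, -0.7]])
y = np.array([1.0, 0.0, 1.0])
w, b = gradient_descent(X, y)
print("learned weights:", w, "bias:", b)
```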

Single- and multi-layer neural networks

Overview

Activation functions

  • Comprehensive synthesis of the pros and cons of the main activation functions (link)
  • Commonly used activation functions (Stack Exchange)
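
As a quick companion to the links above, a short sketch of a few commonly used activation functions; the sample inputs are illustrative only.

```python
import numpy as np

def relu(z):
    # Rectified linear unit: zero for negative inputs, identity otherwise
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Small slope alpha for negative inputs avoids "dead" units
    return np.where(z > 0, z, alpha * z)

def tanh(z):
    # Squashes inputs to (-1, 1); zero-centered, unlike the sigmoid
    return np.tanh(z)

def sigmoid(z):
    # Squashes inputs to (0, 1); saturates for large |z|
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-3, 3, 7)
for f in (relu, leaky_relu, tanh, sigmoid):
    print(f.__name__, np.round(f(z), 3))
```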

Backpropagation and initialization
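
A compact sketch of backpropagation through a single hidden layer, with small random Gaussian initialization to break symmetry between units; all shapes and constants here are illustrative, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 samples, 3 features, binary targets
X = rng.normal(size=(4, 3))
y = np.array([[1.0], [0.0], [1.0], [0.0]])

# Initialization: small random weights break symmetry between hidden units
W1 = rng.normal(scale=0.1, size=(3, 5))
b1 = np.zeros((1, 5))
W2 = rng.normal(scale=0.1, size=(5, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(200):
    # Forward pass
    h = np.tanh(X @ W1 + b1)           # hidden activations
    p = sigmoid(h @ W2 + b2)           # output probabilities

    # Backward pass: apply the chain rule layer by layer
    d_out = (p - y) / len(X)           # dL/d(logits) for BCE + sigmoid
    dW2 = h.T @ d_out
    db2 = d_out.sum(axis=0, keepdims=True)
    d_h = (d_out @ W2.T) * (1 - h**2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0, keepdims=True)

    # Gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final predictions:", p.ravel().round(3))
```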

Regularization (Dropout, batch normalization)
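
A minimal Keras sketch combining the two regularizers named in this section; the layer sizes and dropout rate are arbitrary choices (assumes TensorFlow 2.x).

```python
import tensorflow as tf

# Small fully connected classifier regularized with batch
# normalization and dropout; sizes and rates are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(64),
    tf.keras.layers.BatchNormalization(),  # normalize pre-activations per batch
    tf.keras.layers.Activation("relu"),
    tf.keras.layers.Dropout(0.5),          # randomly zero 50% of units during training
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```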

Scaled Exponential Linear Unit (self-normalizing networks)

  • Theory (Papers with Code)
  • Note: requires the special AlphaDropout layer and the LecunNormal initializer; see the sketch below.
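
Following the note above, a hedged sketch of a self-normalizing stack in Keras (assuming TensorFlow 2.x, where tf.keras.layers.AlphaDropout and the "lecun_normal" initializer string are available); layer sizes are illustrative.

```python
import tensorflow as tf

# Self-normalizing stack: SELU only keeps activations normalized when
# paired with LeCun-normal initialization and AlphaDropout (not plain Dropout).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="selu",
                          kernel_initializer="lecun_normal"),
    tf.keras.layers.AlphaDropout(0.1),  # preserves mean and variance, unlike Dropout
    tf.keras.layers.Dense(64, activation="selu",
                          kernel_initializer="lecun_normal"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```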

Training

GD for NN

  • :film_projector:Gradient descent, how neural networks learn (3Blue1Brown video)
  • Bigger batches are not necessarily better: Keskar et al., On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, arXiv, 2017; see the batch-size sketch below.
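
Related to the Keskar et al. paper, a small sketch of how the batch size enters a mini-batch gradient-descent loop; the linear-regression toy problem and the batch sizes are illustrative only. Small batches add gradient noise, which the paper argues helps avoid the sharp minima that hurt generalization with large batches.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=1000)

def minibatch_gd(X, y, batch_size, lr=0.05, epochs=20):
    # Linear regression by mini-batch gradient descent; the batch size
    # trades gradient noise (small batches) against per-step cost (large ones).
    w = np.zeros(X.shape[1])
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)  # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = Xb.T @ (Xb @ w - yb) / len(batch)
            w -= lr * grad
    return w

for bs in (8, 64, 512):
    w = minibatch_gd(X, y, bs)
    print(f"batch={bs:4d}  error={np.linalg.norm(w - true_w):.4f}")
```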

CNN and Conv2D
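
As a starting point for this topic, a minimal Keras Conv2D classifier; the 28x28 grayscale input shape (e.g. MNIST digits) and the filter counts are illustrative (assumes TensorFlow 2.x).

```python
import tensorflow as tf

# Minimal convolutional classifier: Conv2D learns local filters slid
# across the image; pooling reduces spatial resolution between blocks.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),  # e.g. grayscale MNIST digits
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Conv2D(64, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```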