First Two Lessons of From Deep Learning Foundations to Stable Diffusion

Four videos from Practical Deep Learning for Coders Part 2, 2022, have been released as a special early preview of the new course.
Author: Jeremy Howard

Published: October 19, 2022

The University of Queensland has opened the course to late registrations; if you want to join the rest of the course live, register here.

We started teaching our new course, From Deep Learning Foundations to Stable Diffusion, a couple of weeks ago. The experience of developing and teaching these first lessons has been amazing. Course contributors have included some brilliant folks from Hugging Face, Stability.ai, and fast.ai, and we’ve already had inspiring contributions from people at Deoldify, Lambda Labs, and more.

Some important new papers have come out in the last two weeks, and we’ve covered them already in the course. Because this field is moving so quickly, and there’s so much interest, we’ve decided to release our first two lessons early. In fact, we’re releasing them right now!

In total, we’re releasing four videos, with around 5.5 hours of content, covering the following topics (the lesson numbers start at “9”, since this is a continuation of Practical Deep Learning for Coders part 1, which had 8 lessons):

These videos will make the most sense if you’ve already completed part 1 of the course, or already have some experience with training and deploying deep learning models (preferably in PyTorch).

Lesson 9—Pipelines and concepts

This lesson starts with a tutorial on how to use pipelines in the Diffusers library to generate images. Diffusers is (in our opinion!) the best library available at the moment for image generation. It is flexible and full-featured; we explain how to use its key features, and discuss options for accessing the GPU resources needed to use the library.
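
In outline, generating an image with a pipeline takes only a few lines. Here is a sketch, assuming the `diffusers` library, a CUDA GPU, and the Stable Diffusion v1.4 checkpoint on the Hugging Face Hub (imports are kept inside the function so the sketch stands alone):

```python
# Sketch of image generation with a Diffusers pipeline.
# Assumes `diffusers` is installed, a CUDA GPU is available, and the
# "CompVis/stable-diffusion-v1-4" checkpoint is accessible on the Hub.
def generate(prompt, guidance_scale=7.5, num_inference_steps=50):
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")  # move the model to the GPU
    return pipe(
        prompt,
        guidance_scale=guidance_scale,            # how strongly to follow the prompt
        num_inference_steps=num_inference_steps,  # denoising steps
    ).images[0]                                   # a PIL image
```

Usage would be e.g. `generate("an astronaut riding a horse")`; the lesson covers where to run this if you don’t have a GPU of your own.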

We talk about some of the nifty tweaks available when using Stable Diffusion in Diffusers, and show how to use them:

  • Guidance scale, for varying the amount the prompt is used
  • Negative prompts, for removing concepts from an image
  • Image initialisation, for starting with an existing image
  • Textual inversion, for adding your own concepts to generated images
  • Dreambooth, an alternative approach to textual inversion
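
The guidance scale tweak boils down to one line of arithmetic: the model makes two noise predictions, one with the prompt and one without, and the final prediction exaggerates the difference between them. A minimal sketch with toy scalar values (the real computation runs on tensors from the unet):

```python
def guided_prediction(noise_uncond, noise_text, guidance_scale):
    # Classifier-free guidance: start from the unconditional prediction
    # and push towards the prompted one, scaled by guidance_scale.
    return noise_uncond + guidance_scale * (noise_text - noise_uncond)

# Toy scalar stand-ins for the two unet outputs:
uncond, text = 0.1, 0.3

# guidance_scale=1.0 reproduces the prompted prediction exactly;
# larger values exaggerate the prompt's influence.
assert abs(guided_prediction(uncond, text, 1.0) - text) < 1e-9
print(guided_prediction(uncond, text, 7.5))   # 0.1 + 7.5 * 0.2 = 1.6
```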

The second half of the lesson covers the key concepts involved in Stable Diffusion:

  • CLIP embeddings
  • The VAE (variational autoencoder)
  • Predicting noise with the unet
  • Removing noise with schedulers
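
The last two bullets revolve around one blend: a noisy latent is a mix of the clean latent and pure noise, weighted by how far through the schedule we are. A pure-Python sketch of that mix (real schedulers, such as those in Diffusers, apply this to tensors with a per-timestep `alpha_bar`):

```python
import math

def add_noise(x, noise, alpha_bar):
    # alpha_bar runs from ~1 (start of schedule, mostly clean signal)
    # down to ~0 (end of schedule, mostly noise).
    return math.sqrt(alpha_bar) * x + math.sqrt(1 - alpha_bar) * noise

x, noise = 0.5, -1.2   # a "clean" latent value and a noise sample
print(add_noise(x, noise, alpha_bar=0.9))   # early step: mostly signal
print(add_noise(x, noise, alpha_bar=0.1))   # late step: mostly noise
```

The unet is trained to predict `noise` given the blended value; the scheduler then uses that prediction to step back towards `x`.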

You can discuss this lesson, and access links to all notebooks and resources from it, at this forum topic.

Lesson 9A—Deep dive

In this video Jonathan Whitaker shows us what is happening behind the scenes when we create an image with Stable Diffusion, looking at the different components and processes and how each can be modified for further control over the generation process.

He shows how to replicate the sampling loop from scratch, and explains each of the steps involved in more detail:

  • The auto-encoder
  • Adding noise and image-to-image
  • The text encoding process
  • The UNET and classifier-free guidance
  • Sampling
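
The overall shape of that sampling loop is simple, even though the real components are not. A toy sketch with a stand-in noise predictor (pure Python; the real loop runs a unet over latent tensors and calls a scheduler's step function):

```python
import random

def fake_noise_pred(x, t):
    # Stand-in for the unet: pretend everything left in x is noise.
    return x

def sample(steps=50):
    x = random.gauss(0, 1)            # 0. start from pure noise
    for t in range(steps):
        eps = fake_noise_pred(x, t)   # 1. predict the noise in x
        x = x - eps / steps           # 2. remove a fraction of it (the scheduler's job)
    return x

print(sample())   # the "image": shrinks towards 0 as noise is removed
```

Swapping in a real unet, latents, and scheduler turns this skeleton into the actual generation process the video walks through.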

Lesson 9B—Math of diffusion

Wasim Lorgat and Tanishq Abraham walk through the math of diffusion models from the ground up. They assume no prerequisite knowledge beyond what you covered in high school.
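
For orientation, the central object in such a walkthrough is the forward (noising) process. A sketch in standard DDPM notation (the lesson's own notation may differ), where $\beta_t$ is the noise schedule:

```latex
% One noising step, and its closed form over t steps:
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(\sqrt{1-\beta_t}\, x_{t-1},\; \beta_t \mathbf{I}\right)
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(\sqrt{\bar\alpha_t}\, x_0,\; (1-\bar\alpha_t)\mathbf{I}\right),
\quad \bar\alpha_t = \prod_{s=1}^{t} (1 - \beta_s)
```

The closed form is what makes training practical: a noisy sample at any timestep can be produced in one step from the clean data, rather than by iterating.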

Lesson 10—Custom pipeline

This lesson creates a complete Diffusers pipeline from the underlying components: the VAE, unet, scheduler, and tokeniser. Putting them together manually gives you the flexibility to fully customise every aspect of the inference process.
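
In outline, gathering those pieces looks something like this sketch, which assumes the `diffusers` and `transformers` libraries and a checkpoint laid out in the usual Hugging Face Hub subfolders (imports are kept inside the function so the sketch stands alone):

```python
def load_components(model_id="CompVis/stable-diffusion-v1-4"):
    # Assumes `diffusers` and `transformers` are installed, and that the
    # checkpoint uses the standard Hub subfolder layout.
    from transformers import CLIPTextModel, CLIPTokenizer
    from diffusers import AutoencoderKL, UNet2DConditionModel, LMSDiscreteScheduler

    tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
    text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
    vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")            # latents <-> pixels
    unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")   # noise prediction
    scheduler = LMSDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
    return tokenizer, text_encoder, vae, unet, scheduler
```

With the components in hand, inference is a loop: encode the prompt, run the unet over noisy latents at each scheduler timestep, then decode the final latents with the VAE.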

We also discuss three important new papers that have been released in the last week, which improve inference performance by over 10x, and allow any photo to be “edited” by just describing what the new picture should show.

The second half of the lesson begins the “from the foundations” stage of the course, developing a basic matrix class and random number generator from scratch, as well as discussing the use of iterators in Python.
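
As a flavour of that "from the foundations" style, here is a sketch of the two pieces: a tiny matrix wrapper over nested lists, and a simple linear congruential RNG (the course's own implementations may differ):

```python
class Matrix:
    """Minimal matrix over nested lists, indexed as m[row, col]."""
    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, idx):
        r, c = idx
        return self.rows[r][c]

class LCG:
    """Simple linear congruential generator (glibc-style constants)."""
    def __init__(self, seed=42):
        self.state = seed

    def rand(self):
        # x_{n+1} = (a * x_n + c) mod m, then scale into [0, 1)
        self.state = (1103515245 * self.state + 12345) % 2**31
        return self.state / 2**31

m = Matrix([[1, 2], [3, 4]])
print(m[1, 0])      # → 3

rng = LCG(seed=1)
print(rng.rand())   # a reproducible pseudo-random float in [0, 1)
```

Building these by hand, instead of reaching for numpy or `random`, is the point of the exercise: every later abstraction in the course rests on components you have written yourself.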

You can discuss this lesson, and access links to all notebooks and resources from it, at this forum topic.