Providing a Good Education in Deep Learning

Paul Lockhart, a Columbia math PhD, former Brown professor, and K-12 math teacher, writes in the influential essay A Mathematician’s Lament of a nightmare world where children are not allowed to listen to or play music until they have spent over a decade mastering music notation and theory, spending classes transposing sheet music into a different key. In art class, students study colors and applicators, but aren’t allowed to actually paint until college. Sound absurd? This is how math is taught–we require students to spend years doing rote memorization, and learning dry, disconnected “fundamentals” that we claim will pay off later, long after most of them quit the subject.

Unfortunately, this is where many of the few existing resources on deep learning begin–asking learners to follow along with the definition of the Hessian and theorems for the Taylor approximation of the loss function, without ever giving examples of actual working code. I’m not knocking calculus. I love calculus and have even taught it at the college level, but I don’t think it’s a good or helpful introduction to deep learning.

So many students are turned off to math because of the dull way it’s taught, with tedious, repetitive exercises, and a curriculum that saves the fun parts (such as graph theory, counting and permutations, and group theory) for so late that everyone except math majors has abandoned the subject. And the gate-keepers of deep learning are doing something similar whenever they ask that you be able to derive the multivariate chain rule or give the theoretical underpinnings of KL divergence before they’ll teach you how to use a neural net on your own projects.

We’ll be leveraging the best available research on teaching methods to try to fix these problems with technical teaching, including:

  • Teaching “the whole game”–starting off by showing how to use a complete, working, very usable, state of the art deep learning network to solve real world problems, by using simple, expressive tools. And then gradually digging deeper and deeper into understanding how those tools are made, and how the tools that make those tools are made, and so on…
  • Always teaching through examples: ensuring that there is a context and a purpose that you can understand intuitively, rather than starting with algebraic symbol manipulation
  • Simplifying as much as possible: we’ve spent months building tools and teaching methods that make previously complex topics very simple
  • Removing barriers: deep learning has, until now, been a very exclusive game. We’re breaking it open, and ensuring that everyone can play
  • …and many more. I’ve discussed some of our approaches to teaching in more detail below.

In the end, what we’re talking about is good education. That’s what we most care about. Here are more of our thoughts on good education:

Good education starts with “the whole game”

Just as kids have a sense of what baseball is before they start batting practice, we want you to have a sense of the big picture of deep learning well before you study calculus and the chain rule. We’ll move from the big picture down to the details (the opposite of most education, which tries to teach all the individual elements before putting them together). For a good example of how this works, watch Jeremy’s talk on recurrent neural networks: he starts with a 3-line RNN built on a full-featured library, then removes the library and builds his own architecture using a GPU framework, and then removes the framework and builds everything from scratch in gritty detail using just basic Python.
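
To give a flavor of that first step, here is a minimal sketch of what a complete RNN in a few lines of a high-level library can look like. This uses Keras rather than the exact stack from the talk, and the vocabulary size and layer widths are illustrative placeholders, not values from the course:

```python
# A minimal sketch of the "whole game" starting point: a complete
# character-level RNN model in three layers of Keras.
from tensorflow import keras

vocab_size = 64  # illustrative vocabulary size for a character-level model
model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 32),        # map characters to vectors
    keras.layers.SimpleRNN(128),                   # a basic recurrent layer
    keras.layers.Dense(vocab_size, activation="softmax"),  # next-char scores
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```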

In a book that inspires us, David Perkins, a Harvard education professor with a PhD from MIT in Artificial Intelligence, calls the approach of teaching all the individual elements first, before attempting anything meaningful, a disease: “elementitis”. It’s like batting practice without knowing what the game of baseball is. The elements can seem boring or pointless when you don’t know how they fit into the big picture. And it’s hard to stay motivated when you’re not able to work on problems you care about, or have a sense of how the technical details fit into the whole. Perhaps this is why studies have shown that the intrinsic motivation of school children steadily declines from 3rd grade to 8th grade (the only range of years studied).

Good education equips you to work on the questions you care about

Whether you’re excited to identify if plants are diseased from pictures of their leaves, auto-generate knitting patterns, diagnose TB from x-rays, or determine when a raccoon is using your cat door, we will get you using deep learning on your own problems (via pre-trained models from others) as quickly as possible, and then will progressively drill into more details. You’ll learn how to use deep learning to solve your own problems at state-of-the-art accuracy within the first 30 minutes of the first lesson! There is a pernicious myth out there that you need to have computing resources and datasets the size of those at Google to be able to do deep learning, and it’s not true.
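
As one hedged sketch of what “via pre-trained models from others” can look like in practice, Keras ships ImageNet-trained weights for common architectures; the file name below is a hypothetical stand-in for your own image:

```python
# A sketch of borrowing someone else's pre-trained model: classify an image
# with ImageNet weights that Keras downloads for you. "leaf.jpg" is hypothetical.
import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

model = ResNet50(weights="imagenet")                 # fetch pre-trained weights
img = image.load_img("leaf.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = model.predict(x)
print(decode_predictions(preds, top=3))              # the model's best guesses
```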

Good education is not overly complicated.

Have you watched Jeremy implement modern deep learning optimization methods in Excel? If not, go watch it (starting at 4:50 in the video) and come back. This is often considered a complex topic, yet after weeks of work Jeremy figured out how to make it so easy it seems obvious. If you truly understand something, you can explain it in an accessible way, and maybe even implement it in Excel! Complicated jargon and obtuse technical definitions arise out of laziness, or when a speaker is unsure of the meat of what they’re saying and hides behind peripheral knowledge.
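
The same point holds in code: a method like SGD with momentum, which sounds intimidating, fits in a few lines. This is a minimal sketch on a toy one-dimensional loss; the learning rate and momentum values are illustrative, not recommendations:

```python
# SGD with momentum on loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def grad(w):
    return 2 * (w - 3)

w, v = 0.0, 0.0          # start far from the minimum, with zero velocity
lr, beta = 0.1, 0.9      # illustrative step size and momentum
for _ in range(100):
    v = beta * v + grad(w)   # accumulate a running "velocity" of gradients
    w -= lr * v              # step along the smoothed direction
print(w)                     # ends up very close to the minimum at w = 3
```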

Good education is inclusive.

It doesn’t put up any unnecessary barriers. It doesn’t make you feel bad if you didn’t start coding at age 12, if you have a non-traditional background, if you can’t afford a Mac, if you’re working on a non-traditional problem, or if you didn’t go to an elite college. We want our course to be as accessible as possible. I care deeply about inclusion: I spent months researching and writing each of my widely read articles with practical tips on how we can increase diversity in tech, and spent a year and a half teaching full-stack software development to women full-time. Currently deep learning is even more homogeneous than tech in general, which is scary for such a powerful and impactful field. We are going to change this.

Good education motivates the study of underlying technical concepts.

Having a big picture understanding gives you more of a framework to place the fundamentals in. Seeing what deep learning is capable of and how you can use it is the best motivation for the more dry or tedious parts.

“Playing baseball is more interesting than batting practice, playing pieces of music more interesting than practicing scales, and engaging in some junior version of historical or mathematical inquiry more interesting than memorizing dates or doing sums,” writes Perkins. Building a working model for a problem that interests you is more interesting than writing a proof (for most people!).

Good education encourages you to make mistakes.

In the most viewed TED talk of all time, education expert Sir Ken Robinson argues that by stigmatizing mistakes, our school systems destroy children’s innate creative capacity. “If you’re not prepared to be wrong, you’ll never end up with anything original,” says Robinson.

Teaching deep learning with a code-heavy approach in interactive Jupyter notebooks is a great setup for trying lots of things, making mistakes, and easily changing what you’re doing.

Good education leverages existing resources

There is no need to reinvent teaching materials where good ones already exist. If you need to brush up on matrix multiplication, we’ll refer you to Khan Academy. If you’re fascinated by X and want to go deeper, we’ll recommend you read Y. Our goal is to help you achieve your deep learning goals, not to be the sole resource in getting you there.

Good education encourages creativity

Lockhart argues that it would be better not to teach math at all than to teach such a mangled form of it, one that alienates most of the population from the beauty of math. He describes math as a “rich and fascinating adventure of the imagination” and defines it as “the art of explanation”, although it is rarely taught that way.

The biggest wins for deep learning will come when you apply it to the outside domains you’re an expert in and the problems you’re passionate about. This will require you to be creative.

Good education teaches you to ask questions, not just to answer them

Even those who seem to thrive under traditional education methods are still poorly served by them. I received a mostly traditional approach to education (although I had a few exceptional teachers at all stages and particularly at Swarthmore). I excelled at school, aced exams, and generally enjoyed learning. I loved math, going on to earn a math PhD at Duke University. While I was great at problem sets and exams, this traditional approach did me a huge disservice when it came to preparing me for doctoral research and my professional career. I was no longer being given well-formulated, appropriately scoped problems by teachers. I could no longer learn every incremental building block before setting to work on a task. As Perkins writes about his struggles with finding a good dissertation topic, I too had learned how to solve problems I was given, but not how to find and scope interesting problems on my own. I now see my previous academic successes as a weakness I’ve had to overcome professionally. When I began studying deep learning, I enjoyed reading the math theorems and proofs, but this didn’t actually help me build deep learning models.

Good education is evidence-based

We love data and the scientific method, and we are interested in techniques that have been supported by research.

Spaced repetition learning is one such evidence-backed technique, where learners revisit a topic periodically, just before they would forget it. Jeremy used this technique to obtain impressive results in teaching himself Chinese. The whole game method of learning dovetails nicely with spaced repetition learning in that we will revisit topics, going into more and more low level details each time, but always returning to the big picture.

What We Will Cover in the First Deep Learning Certificate

For those of you considering joining our deep learning certificate, I’m sure you’d like to hear more about what we will be covering. This first course is part 1 of a two-part series with the following high-level goals:

  • Part 1: Get you to the point where you can successfully implement and debug best practice deep learning techniques in the most widely used current areas, such as computer vision and natural language processing
  • Part 2: Take you right up to the cutting edge of current research, and beyond, including applications in robotics and self-driving cars, time series analysis (such as for financial, marketing, and production applications), and large-scale imaging (including 3D imaging for medicine, and analysis of satellite images).

Here’s what we’re planning to cover in part 1 of the course:

  1. The opportunities and constraints in applying deep learning to solving a wide range of problems, including how deep learning is being applied today
  2. How to quickly get up and running using popular deep learning libraries such as Keras (see the sketch after this list)
  3. How to test that a model is working correctly
  4. Just enough linear algebra, probability theory, and calculus to understand how deep learning works
  5. The role of each key component of deep learning: input, architecture, output, loss function, optimization, regularization, and testing
  6. The key techniques used for each of these components, why they are used, and how to apply them using popular deep learning libraries
  7. How each of these techniques is applied to achieve state of the art results in computer vision and natural language processing
  8. Recent advances in deep learning for improving model training outcomes
  9. Techniques for getting good results even with smaller datasets
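
To make item 2 concrete, here is a hedged sketch of what “up and running with Keras” can look like: a small classifier trained end to end on MNIST. The architecture and hyperparameters are illustrative, not course material:

```python
# A minimal end-to-end Keras example: load data, define a model, train it.
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),     # 28x28 image -> 784 vector
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),   # one score per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
```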

We’ll be covering these topics in a very different way from what you’ll be used to if you’ve taken any university level math or CS courses in the past. We’ll be telling you all about our teaching philosophy in our next post. Our approach will be code-heavy and math-light, so we do ask that participants already have at least a year or two of solid coding experience. We’ll be using Python (via the wonderful Jupyter Notebook) for our examples, so if you’re not already familiar with Python, we’d strongly suggest going through a quick introduction to Python and to Jupyter (formerly known as IPython).

More Details

Here’s some more detail on what topics we will be covering. For convolutional neural networks (CNNs), primarily used for image classification, we will teach:

  • Basics of image convolutions (see the sketch after this list)
  • Introduction to the CNN architecture
  • Going beyond basic SGD
  • Regularization with dropout and weight decay
  • Image classification in Theano
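
To illustrate the first bullet, here is a hedged sketch of an image convolution from first principles: slide a small kernel over a grayscale image and sum the elementwise products. (As deep learning libraries do, this computes the un-flipped form, technically cross-correlation.)

```python
# A from-scratch 2-D convolution with plain numpy loops. Illustrative only;
# real libraries use heavily optimized implementations of the same idea.
import numpy as np

def conv2d(img, kernel):
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # each output pixel is a weighted sum of a small input patch
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

edge_kernel = np.array([[-1, -1, -1],   # a classic edge-detection kernel
                        [-1,  8, -1],
                        [-1, -1, -1]], dtype=float)
img = np.random.rand(8, 8)              # stand-in for a real image
print(conv2d(img, edge_kernel).shape)   # (6, 6): a "valid" convolution
```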

To learn more, you may be interested in this great visual explanation of image kernels.

For recurrent neural networks (RNNs), used for natural language processing (NLP) and time series data, we will cover:

  • Basics of NLP
  • Introduction to RNNs
  • Introduction to the LSTM architecture (see the sketch after this list)
  • Char-rnn in Theano
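
As a preview of the LSTM bullet, here is a hedged sketch of a single LSTM step in numpy, showing what the gates compute. The weight shapes and dimensions are illustrative; real frameworks fuse and optimize these operations:

```python
# One step of an LSTM cell: three gates decide what to write to, erase from,
# and read out of the cell state c. Dimensions here are toy values.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    # W: (4n, d) input weights, U: (4n, n) recurrent weights, b: (4n,) biases
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    c_new = f * c + i * np.tanh(g)                # erase some memory, write some
    h_new = o * np.tanh(c_new)                    # expose a gated view of it
    return h_new, c_new

d, n = 8, 16                                      # toy input and state sizes
x, h, c = np.random.randn(d), np.zeros(n), np.zeros(n)
W, U, b = np.random.randn(4 * n, d), np.random.randn(4 * n, n), np.zeros(4 * n)
h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)                           # (16,) (16,)
```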

To find out more now, you can read this excellent post by Andrej Karpathy.

One of our primary goals for this course is to teach you practical techniques for training better models such as:

  • Batch normalization (see the sketch after this list)
  • Resnets
  • Testing and Visualization
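
For a taste of the first of these, here is a hedged sketch of what batch normalization computes at training time: normalize each feature over the batch, then rescale with learned parameters (gamma and beta below are initialized to the identity transform):

```python
# Batch normalization, forward pass only: zero-mean, unit-variance per feature,
# followed by a learned scale (gamma) and shift (beta).
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    mu = x.mean(axis=0)                   # per-feature mean over the batch
    var = x.var(axis=0)                   # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(32, 4) * 10 + 5       # a poorly scaled batch of activations
gamma, beta = np.ones(4), np.zeros(4)
y = batch_norm(x, gamma, beta)
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))  # ~0 means, ~1 stds
```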

Check out this helpful advice on babysitting your learning process, and Chris Olah’s illuminating visualizations of language representations.

There is a dangerous myth that you need huge data sets to effectively use deep learning. This is false, and we will teach you to deal with data shortages, such as through:

  • Data augmentation (see the sketch after this list)
  • Unsupervised learning and autoencoders
  • Semi-supervised learning
  • Transfer learning
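
To show the flavor of the first technique, here is a hedged sketch of data augmentation with Keras: generate randomly rotated, shifted, and flipped variants of each image so a small dataset goes further. The parameter values are illustrative:

```python
# Data augmentation: each pass over the data sees a slightly different,
# randomly perturbed version of every image.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=15,        # random rotations of up to 15 degrees
    width_shift_range=0.1,    # random horizontal shifts
    height_shift_range=0.1,   # random vertical shifts
    horizontal_flip=True,     # randomly mirror images left/right
)
images = np.random.rand(8, 64, 64, 3)     # stand-in for real training images
batch = next(datagen.flow(images, batch_size=8))
print(batch.shape)                        # same shape, perturbed pixels
```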

Background & Preparation

To participate, you should either have some familiarity with matrix multiplication, basic differentiation, and the chain rule, or be willing to study them before the course starts. If you need a refresher on these concepts, we recommend the Khan Academy videos on matrix multiplication and the chain rule.

We will make significant use of list comprehensions in Python - here is a useful introduction. It would also be very helpful to know your way around the basic Python data science tools: numpy, scipy, scikit-learn, pandas, Jupyter Notebook, and matplotlib. The best guide I know of to these tools is Python for Data Analysis. For those with no Python experience, you may want to prepare by reading Learn Python The Hard Way.
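
If you haven’t met list comprehensions before, this is the kind of one-liner we mean: build a new list by transforming and filtering another in a single expression:

```python
# Squares of the even numbers below 10, written as a list comprehension.
squares_of_evens = [n * n for n in range(10) if n % 2 == 0]
print(squares_of_evens)   # [0, 4, 16, 36, 64]

# The equivalent explicit loop, for comparison:
result = []
for n in range(10):
    if n % 2 == 0:
        result.append(n * n)
```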

Read the official USF Data Institute description of our upcoming deep learning course on Monday evenings and send your resume to datainstitute@usfca.edu by Oct 12 to apply.

The First Certificate in Deep Learning

Update: The deadline has been extended to 10/17

We’ve previously discussed why fast.ai’s first goal is to provide a way for any coder to become a deep learning expert. Until we deal with the huge shortage of deep learning expertise, it will be very difficult to fix all of the other problems in deep learning that hold it back from helping solve society’s most challenging problems.

For coders who wish to learn to use deep learning effectively, there is no obvious path available. Doing a PhD requires many years, and you can’t even start until you have an appropriate CV to get admitted. Programs like the Deep Learning Summer School and the Insight Data Science Fellows require a PhD to even get accepted. Most blog posts assume that you’re already an expert, and those that don’t, do little to make you an expert.

Those who make it through all of those obstacles then have to deal with the difficulty that deep learning is generally taught as a mathematical discipline – and, as we’ll discuss in our next post, mathematical disciplines have a particularly impractical learning path. For instance, Oxford University’s graduate level course (available online) requires a high level of mathematical proficiency to understand the material, and does very little to teach the important practical skills involved in deep learning coding. The deep learning book by Ian Goodfellow et al. has similar issues. (Extraordinarily, the book contains no code whatsoever, and very few mentions of practical computing issues.) Given that these are considered perhaps the strongest existing deep learning training materials, you can imagine what the average quality ones look like! (To clarify - for those people looking to enter academia, or experts looking to better understand research issues, these are excellent resources.) Rachel found that even for a mathematician who wants to build practical tools with deep learning, these resources aren’t very helpful.

In 2013, Rachel heard Ilya Sutskever (then a newly minted PhD working at Google, now director of OpenAI) speak at a meetup. She was less interested in the theory, and primarily wanted to be able to implement a neural net at home (Caffe, the first open source deep learning framework, wasn’t released until Jan 2014). During the Q&A at the end, she asked how he initialized his network, and he said that was part of a dirty bag of tricks that nobody published. How could anyone do this at their own organization when nobody was sharing the practical info? In this course, we want to give you practical tips on how to preprocess your data, which architecture to use when, and yes, how to initialize your weights.
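
As it happens, that particular trick did eventually get published. Here is a hedged sketch of one now-standard recipe, Glorot/Xavier initialization (Glorot & Bengio, 2010), which scales random weights by the layer sizes so activations neither explode nor vanish; the layer dimensions are illustrative:

```python
# Glorot/Xavier uniform initialization: sample weights from a range chosen
# so that signal variance is roughly preserved through the layer.
import numpy as np

def glorot_uniform(fan_in, fan_out):
    limit = np.sqrt(6.0 / (fan_in + fan_out))   # variance-preserving bound
    return np.random.uniform(-limit, limit, size=(fan_in, fan_out))

W = glorot_uniform(784, 128)   # e.g. a dense layer from 784 inputs to 128 units
print(W.std())                 # roughly sqrt(2 / (784 + 128)), about 0.047
```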

Our first step towards resolving these issues is to provide a series of courses designed to bring coders all the way to the cutting-edge of deep learning research. On October 24, we will begin part one of the Data Institute deep learning certificate. This course will be (as far as we are aware) the first university accredited, open access, in person deep learning certificate in the world.

Applications to attend need to be in by October 17 (extended from October 12) – so if you are interested, and are based in the San Francisco Bay Area, please apply right away by emailing your resume to datainstitute@usfca.edu! If you want to get a sense of the teaching style, take a look at the link above - about 30 minutes into the talk I provide an introduction to convolutional neural networks. (The actual course will of course be paced and run very differently - the talk above was a brief introduction as part of the launch of the Data Institute.)

To learn more about what will be covered in the certificate, please see our article What We Will Cover.

Read the official USF Data Institute description of our upcoming deep learning course on Monday evenings and send your resume to datainstitute@usfca.edu by Oct 17 (extended from Oct 12) to apply.

Launching fast.ai

Jeremy is the past president of Kaggle, founder of Enlitic, FastMail.FM, and Optimal Decisions Group, and is on the faculty at Singularity University. See fast.ai’s About page for a brief bio.

About six months ago, I resigned from my position as CEO of Enlitic, the company that I founded to bring medical diagnostics and treatment planning into the data-driven world. I created Enlitic because there are 4 billion people in the world without access to modern medical diagnostics, and it will take about 300 years to train enough doctors to fill this gap – but with deep learning, we can make doctors 10 times more productive, and therefore bring modern medicine to these people within 5 to 10 years. This is only possible thanks to the power of deep learning and neural networks, a technology which I have been using for over 20 years, but which just in the last couple of years has reached a point where it can help solve many previously unsolved problems. I discussed the implications of this a couple of years ago in my TED.com talk, back when I was first launching Enlitic. And indeed, many of the predictions I made then have since come to pass. Deep learning is now becoming embedded in products such as Apple’s Siri, Google Photos, and self-driving cars.

(Some people are even claiming that deep learning is “overhyped”. This is as ridiculous a claim as somebody in the early 90s claiming that the Internet was overhyped. Deep learning is clearly going to be even more widely used and far-reaching and transformative than the Internet.)

But for all the successes, I discovered during my two years at Enlitic that deep learning has a very long way to go before it can help most people. Creating a deep learning model is, ironically, a highly manual process. Training a model takes a long time, and even for the top practitioners, it is a hit-or-miss affair where you don’t know whether it will work until the end. No mature tools exist to ensure models train successfully, or to ensure that the original setup is done appropriately for the data.

Therefore, Dr. Rachel Thomas (a math PhD with past experience as a quant, Uber data scientist, full-stack developer, and educator) and I decided to create fast.ai, a research lab dedicated to doing everything necessary to allow deep learning to meet its enormous potential. We believe that this requires allowing domain experts to leverage the technology themselves, rather than leaving it in the hands of a small and exclusive group of mathematicians. Only domain experts fully understand and appreciate the most important problems in their field, have access to the data necessary to solve those problems, and understand the opportunities and constraints of implementing data-driven solutions.

Consider this chart shared by Jeff Dean, leader of Google Brain:

Growth of Deep Learning at Google

At the start of 2012, deep learning was not being used at Google outside of Google Brain research. Since then, its use has grown exponentially, and it was being used in approximately 1,200 different projects by late 2015. Now imagine the impact deep learning can have as it spreads beyond the Bay Area tech elite, and we see this exponential growth in every organization around the world. The impact will be greatest in the two-thirds world, where resources are most constrained. For instance: there are only 14 pediatric radiologists for the entire continent of Africa (and half of those are in a single country, South Africa); many African countries have none! What if medical technology could read x-rays? Tens of millions of children would have access to medical image diagnostics for the first time. And the value of this technology in automating the identification of tuberculosis - a disease that receives little research attention in the West but is the leading cause of death from infectious disease worldwide, killing almost 4,000 people daily - could be even higher. In India, Indonesia, and China, there are over 3 billion people, most of whom live in areas with similarly poor access to medical image diagnostics.

During my eight years in management consulting I worked with hundreds of domain experts across dozens of fields and industries. I saw people who were highly creative in figuring out how to solve their problems, given the tools that they were familiar with. Nowadays we receive requests for help nearly every day, from people who want to use deep learning to solve everything from treating mental illness, to increasing agricultural yields in the developing world, to identifying and treating plant disease, to developing adaptive educational materials. The best way we can help these people is by giving them the tools and knowledge to solve their own problems, using their own expertise and experience.

We believe that the steps necessary to meet our goal of democratising deep learning are as follows:

  1. fix the shortage of data scientists with deep learning expertise
  2. create highly automated tools for training deep learning models
  3. build software to provide deep insight into the training of, and results from, deep learning models
  4. develop a range of “role model” applications, in areas where deep learning is currently being poorly utilized.

So we’re starting at the start! In our next post, we’ll talk about how we’re trying to deal with step 1 - fixing the shortage of data scientists with deep learning expertise.

If you can’t wait, check out the official USF Data Institute description of our upcoming deep learning course on Monday evenings and send your resume to datainstitute@usfca.edu to apply.