Where is AI/ML actually adding value at your company?

An interesting thread came up over at Hacker News: Ask HN: Where is AI/ML actually adding value at your company? The folks at High Scalability were good enough to summarize the answers, but the summary was somewhat buried in a lengthy blog post, so we wanted to highlight it here. Without further ado, here is the list:

  • Predicting if a part scanned with an acoustic microscope has internal defects
  • Find duplicate entries in a large, unclean data set
  • Product recommendations
  • Course recommendations
  • Topic detection
  • Pattern clustering
  • Understand the 3D spaces scanned by customers
  • Dynamic selection of throttle threshold
  • EEG interpretation
  • Predict which end users are likely to churn for our customers
  • Automatic data extraction from web pages
  • Model complex interactions in electrical grids in order to make decisions that improve grid efficiency
  • Sentiment classification
  • Detecting fraud
  • Credit risk modeling
  • Spend prediction
  • Loss prediction
  • Fraud and AML detection
  • Intrusion detection
  • Email routing
  • Bandit testing
  • Optimizing planning/ task scheduling
  • Customer segmentation
  • Face and document detection
  • Search/analytics
  • Chat bots
  • Topic analysis
  • Churn detection
  • Phenotype adjudication in electronic health records
  • Asset replacement modeling
  • Lead scoring
  • Semantic segmentation to identify objects in the user’s environment (to build better recommendation systems) and to identify planes (floor, wall, ceiling) for better localization of the camera pose for height estimates
  • Classify BitTorrent filenames into media categories
  • Predict how effective a given CRISPR target site will be
  • Check volume, average ticket $, credit score and things of that nature to determine the quality and lifetime of a new merchant account
  • Anomaly detection
  • Identify available space in kit from images
  • Optimize email marketing campaigns
  • Investigate & correlate events, initially for security logs
  • Moderate comments
  • Building models of human behavior to provide interactive intelligent agents with a conversational interface
  • Automatically grading kids’ essays
  • Predict probability of car accidents based on the sensors of your smartphone
  • Predict how long JIRA tickets are going to take to resolve
  • Voice keyword recognition
  • Produce digital documents in legal proceedings
  • PCB autorouting
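
Many of these items boil down to a handful of standard techniques. As a tiny illustration of one of them, finding duplicate entries in an unclean data set, here is a minimal sketch using only Python’s standard library (the sample records and the 0.75 similarity threshold are illustrative choices, not from the thread):

```python
# Minimal sketch: flag near-duplicate records with fuzzy string matching.
from difflib import SequenceMatcher
from itertools import combinations

records = [
    "ACME Corp, 12 Main St",
    "Acme Corporation, 12 Main Street",
    "Widgets Inc, 9 Elm Ave",
]

def similarity(a, b):
    """Similarity ratio in [0, 1] after lowercasing; 1.0 means identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_duplicates(rows, threshold=0.75):
    """Return index pairs whose similarity meets the threshold."""
    return [(i, j) for (i, a), (j, b) in combinations(enumerate(rows), 2)
            if similarity(a, b) >= threshold]

print(find_duplicates(records))  # the first two records are flagged
```

In practice you would use blocking or approximate nearest-neighbour search rather than the O(n²) pairwise comparison shown here, but the core idea is the same.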

The Deep Learning MOOC is now available!

We’re very excited and proud to announce the launch of the fast.ai Deep Learning MOOC. It contains all the lessons from the in-person course we’ve been discussing here over the last few months, along with extra online material to help students understand the content and complete the assignments. All MOOC participants are invited to participate in the fast.ai deep learning community, including through the forums and the wiki.

For me, the most gratifying part of putting the course online was going through all the wonderful testimonials we’ve received from our students. Thank you all for your inspiring words!

(Update - problem resolved!) Azure and AWS's 'GPU general availability' lies


Huge thanks to Boyd Mcgeachie from AWS for reaching out to us and organizing a (nearly) frictionless AWS onboarding experience for our MOOC participants. He couldn’t have been more gracious in accepting the criticisms and concerns laid out below, and explained that AWS is aware of them and working hard to fix them for all customers. I’m thrilled that we have a solution to this that allows our students to use AWS, since it’s a great service and we invested a lot of time in automating and simplifying the management of AWS instances.

Original post:

Both Microsoft and AWS have, with great fanfare, recently announced the general availability of their deep learning capable GPU instances. Unfortunately, they are far less “available” than they claim, and they have not even bothered to tell their own support teams about these limitations, let alone telling their potential customers.

The problem is that for both companies, the so-called “available” GPUs cannot actually be purchased by new users. This is not mentioned anywhere, and in the case of AWS they let you go through the entire onboarding process before giving a totally obscure error (“You have requested more instances (1) than your current instance limit of 0 allows for the specified instance type”). Azure is at least a little better (it greys out the GPU instance types and writes “not available” over the top of them).

We have a major deep learning MOOC launching tomorrow, and we think it may be pretty popular (it’s the first course that shows how to create state-of-the-art models using a code-centric approach). Many students will be learning how to use cloud-based machines for the first time. But, as it stands, there is nowhere they can pay for the privilege of renting a GPU-based machine, unless they have an existing established account with Azure or AWS. Trying to resolve this with Azure and AWS has been a rather bemusing experience, as I have to repeat myself again and again to explain this limitation to support staff who have not been briefed on it. I’ve had to explain that no, it’s not user error (the 100 students of the in-person course that the MOOC is based on are not likely to have all made the exact same error!), and yes, we are using the correct region, and no, we’re not trying to use spot instances, etc, etc, etc…

To be clear, I understand that for capacity planning reasons it may be necessary to limit access to new instance types. I also understand that there are fraudsters around and that companies want to protect themselves. But none of this excuses or explains:

  • Not telling your customers about the limitation
  • Not telling your own support staff about the limitation
  • Allowing customers to complete the entire onboarding process, including selecting a GPU instance
  • Making a PR fanfare about your product being available, but in practice (and in secret) only making it available to your established customers (indeed, why make a marketing splash about something that those people that see the marketing can’t actually use?!?)
  • The totally bizarre responses that requests received. For instance, my co-instructor (who included a link to the course and her LinkedIn in her request, and who has a math PhD from Duke, worked as a quant, and was a data scientist at Uber) was denied, whereas some students who provided no justification were accepted on the same day!
  • Why some of our students, who were fully paid-up, suddenly found their access cut off in the middle of the course

I should also say that the support and capacity planning folks at both AWS and Azure have been tenacious in trying to find a way to solve this problem. Although neither company responded to my tweets informing them about the issue, both companies did respond to support tickets (although in both cases it required me to educate them about their own system’s limitations). They’re looking for a solution as I post this. Hopefully with broader awareness of this issue, and of the impact it has on those looking to get into deep learning for the first time, they will get the resources they need to fix it.

A plea: If you are from Amazon or Microsoft, or know anyone in a position of power there, could you please pass this on to them and ask them to help us? We’re looking for a way that our students can pay them money for GPU access! Our email address is info@fast.ai

So you are interested in deep learning

This was inspired by a bright high school student that emailed me for advice about his interest in deep learning.

Q: Hello Dr. Thomas! I’ve been trying to find good resources for deep learning, but the field does seem rather cryptic and a bit technically prohibitive for me at this point. If you wouldn’t mind, I had a couple of questions I’d love to ask you about learning deep learning:

  • Is there a single book or a set that you’ve found that explains deep learning well? I’ve looked at ones like deeplearning.net or MIT’s free book, but all resources I’ve found are either too brief an introduction or wonderfully mathematically engaged but not applicable at all
  • Do you think it’s a good idea for me to frontload mathematical rigor at this point, or should I wait until I’m further down the path to try to get the technical details down?
  • When you take on a data science problem, how do you answer the classic “what to try next” question? For instance, sometimes on Kaggle problems, I’ll hit a wall where I don’t know what the best next move is.

A: Your assessment that most deep learning resources are either too brief or too mathematical is spot-on! My partner Jeremy Howard and I feel the same way, and we are working to create more practical resources. We will soon be producing a MOOC based on the in-person course we taught this autumn in collaboration with the Data Institute at USF. Until then, here are my recommendations:

In my opinion, the best existing resource is the Stanford CNN course. I recommend working through all the assignments.

Below are some of my favorite tutorials, blog posts, and videos for those getting started with Deep Learning:


Gradient Descent

As for your question about whether to front-load mathematical rigor, I think it’s good to focus on practical coding, since that way you can experiment and develop a good intuition and understanding of what you’re doing. Math is best learned on an as-needed basis: if you can’t understand something you’re trying to learn because unfamiliar math concepts keep popping up, jump over to Khan Academy or to the absolutely beautiful 3Blue1Brown Essence of Linear Algebra videos (great for visual thinkers) and get to work! Jeremy’s RNN tutorial above is a nice example of a code-oriented approach to deep learning, although I know this can be hard given the existing resources.
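
To make the code-first point concrete, gradient descent itself fits in a dozen lines of plain Python. This is a minimal sketch (the toy data, learning rate, and iteration count are illustrative choices):

```python
# Fit y = w*x + b by gradient descent on mean squared error.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # exactly y = 2x + 1

w, b = 0.0, 0.0                     # start from an arbitrary guess
lr = 0.05                           # learning rate

for _ in range(2000):
    n = len(xs)
    # Gradients of mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w                # step downhill along each gradient
    b -= lr * grad_b

print(round(w, 3), round(b, 3))     # converges toward w=2, b=1
```

Once this makes sense, the jump to a deep learning framework is mostly a matter of letting the library compute the gradients for you.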

It’s great that you’re doing Kaggle competitions. That is a fantastic way to learn, and to see whether you understand the theory you’re reading about. I’d have to know more about what you’re trying to do before I could suggest what to try next.
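
That said, one habit that helps with the “what to try next” question is to turn it into a loop: hold out a validation set, score every candidate change against it, and keep whatever improves the score. A minimal sketch (the toy data and the two hypothetical candidate models are for illustration only):

```python
# Compare candidate models on a held-out validation split.
def mean_baseline(train_y):
    """Predict the training mean regardless of input."""
    m = sum(train_y) / len(train_y)
    return lambda x: m

def linear_fit(train_x, train_y):
    """Closed-form ordinary least squares for y = w*x + b."""
    n = len(train_x)
    mx = sum(train_x) / n
    my = sum(train_y) / n
    w = (sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
         / sum((x - mx) ** 2 for x in train_x))
    b = my - w * mx
    return lambda x: w * x + b

def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = list(range(10))
ys = [2 * x + 1 for x in xs]
train_x, val_x = xs[:7], xs[7:]     # simple hold-out split
train_y, val_y = ys[:7], ys[7:]

candidates = {
    "mean baseline": mean_baseline(train_y),
    "linear fit": linear_fit(train_x, train_y),
}
best = min(candidates, key=lambda name: mse(candidates[name], val_x, val_y))
print(best)
```

The same loop scales up: each Kaggle idea (a new feature, a different model, a changed hyperparameter) becomes one more entry in `candidates`, judged by validation score rather than gut feel.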

How should you structure your Data Science and Engineering teams?

I sometimes receive emails asking for guidance related to data science, and I’m going to start sharing my answers here as a data science advice column. If you have a data science related quandary, email me at rachel@fast.ai. Note that questions may be edited for clarity or brevity.

Q: Hello Rachel, I’m VP of Engineering at a start-up that is increasingly seeing our data & ML algorithms as our core asset. In thinking about the next few technical hires we want to make, we want to target engineers that will be able to accelerate the efforts of our Data Science team, so I’m trying to do some pre-recruiting research to understand how engineering teams focused on support of production ML pipelines are structured. Some of what I’m wondering about:

  • How are the Data Science & Engineering teams structured? e.g. Is there a notion of a “Data Engineering” team that is paired with the Data Science team? Or are Data Scientists & Engineers just integrated together in “vertical” product teams?

  • How do the Data Science & Engineering teams interact? How is the roadmap for Data Science coordinated with the roadmap for Data Engineering?

  • How are responsibilities split between Data Scientists vs. Engineers? Is there a notion of a hybrid role, and what does it look like if so?

A: This answer is based on my experience as a data scientist, my experience interviewing for data science roles, and conversations with a number of data scientists. I’ve watched employers go through multiple data science re-orgs.

There are a lot of potential pitfalls related to data science and org structure (no matter what you choose). I’m going to take the liberty of expanding your question to cover the relationship between data science and other teams, as well as data engineering. Consider these scenarios:

  • The data science team interviews a candidate with impressive math modeling and engineering skills. Once hired, the candidate is embedded in a vertical product team that needs simple business analytics. The data scientist is bored and not utilizing their skills.
  • The data science team is separate (not embedded within other teams). They build really cool stuff that never gets used. There’s no buy-in from the rest of the org for what they’re working on, and some of the data scientists don’t have a good sense of what can realistically be put into production.
  • A backlog builds up because data scientists produce models much faster than there is engineering support to put them into production.
  • The data infrastructure engineers are separate from the data scientists. The pipelines don’t have the data the data scientists are asking for now, and the data scientists are under-utilizing the data sources the infrastructure engineers have collected.
  • The company has definitely decided on feature/product X. They need a data scientist to gather some data that supports this decision. The data scientist feels like the PM is ignoring data that contradicts the decision; the PM feels that the data scientist is ignoring other business logic.


Having data scientists all on a separate team makes it nearly impossible for their work to be appropriately integrated with the rest of the company. Have your data scientists distributed throughout the company, but also have a team doing data science evangelism within the company. Vertical product teams need to know what is possible and how to best utilize data science. It’s too hard for a lone data scientist to advocate for the role of data-driven decisions within the team they’re embedded in.

Data scientists should report to both a data science manager and a manager within the product team. You need a lot of communication: to make sure that the team is getting the most value and that the data scientist is finding fulfilling work. One approach that can work really well is to have half the data scientists switch to a different group each year (or even more often).

While it’s common to have machine learning, software engineering, and data/pipeline/infrastructure engineering as separate roles, try to avoid this as much as possible. It leads to a lot of duplicated or unused work, particularly when these roles sit on separate teams. You want people who have some of all of these skills: who can build the pipelines for the data they need, create models with that data, and put those models into production. You’re not going to be able to hire many people who can do all of this, so you’ll need to provide them with training.

In general, the most underused resource of most companies is their own employees, and the situation is even worse with data scientists (since “data science” encompasses such a wide variety of possible skills). Tech companies waste their employees’ potential by not offering enough opportunities for on-the-job learning, training, and mentoring. Your people are smart and eager to learn. Be prepared to offer training, pair-programming, or seminars to help your data scientists fill in skills gaps. I always tell students who are interested in both data science and engineering that the more you know about software development, the better a data scientist you will be.

Even when you have people who are both data scientists and engineers (that is, they can create machine learning models and put those models into production), you still need to have them embedded in other teams and not cordoned off together. Otherwise, there won’t be enough institutional understanding and buy-in of what they’re doing, and their work won’t be as integrated as it needs to be with other systems.

The term data scientist refers to at least 5 distinct jobs, so communication and clarity are key. Companies need to be clear on what their needs are and what they’re hiring for. I can tell you from firsthand experience as a job applicant that lots of companies want to hire a data scientist but aren’t sure why, or how they will use data science. You want to hire someone who is interested in the role they’d be working in. You probably won’t find a candidate who’s both interested in writing machine learning implementations in C and in extensively using Google Analytics, although that is a real job description I’ve encountered. Note that I say “interested in” and not “already has the skills”: assume that any applicant will be learning lots of new skills on the job (if not, they will soon grow bored).

Further reading: After drafting this post, I came across an excellent article called The Data Science Delusion which details several additional problems companies may encounter when incorporating data science into their org.