Please Don't Say "It used to be called big data and now it's called deep learning"

At the Financial Times-Nikkei conference on The Future of AI, Robots, and Us a few weeks ago, Andreessen Horowitz partner Chris Dixon spoke just before Jeremy Howard and I were on stage. Dixon said many totally reasonable things in his talk–but it’s no fun to comment on those, so I’m going to focus on something rather unreasonable that he said: “A few years ago it was called big data and then it was machine learning and now it’s called deep learning”. It was not entirely clear whether he was saying that these are all terms for the same thing (they most definitely are not!) or suggesting that the “in” data-driven approach changes from year to year. Either way, this obscures what a complete game-changer deep learning is. It is not just the 2016 version of “big data” (which has always been an empty buzzword). It is going to have an impact the size of the impact of the internet, or, as Andrew Ng suggests, of electricity. It is going to affect every industry, and leaders of every type of organization are going to wish they had looked into it sooner.

First, to clear up some terms:

Big Data: This was an empty marketing term that falsely convinced many people that the size of your data is what matters. It also cost companies huge sums of money on Hadoop clusters they didn’t actually need. Vendors of these clusters did everything they could to maintain momentum on this nonsense because when CEOs believe it’s the size of your hardware that counts, it’s a very profitable situation if you make, sell, install, or service that hardware…

In fact, deep learning can deliver excellent results with surprisingly little data. Francois Chollet, creator of the popular deep learning library Keras and now at Google Brain, has an excellent tutorial entitled Building powerful image classification models using very little data, in which he trains an image classifier on only 2,000 training examples. At Enlitic, Jeremy Howard led a team that used just 1,000 examples of lung CT scans with cancer to build an algorithm that was more accurate at diagnosing lung cancer than a panel of 4 expert radiologists. The C++ library Dlib has an example (Face Recognition with Dlib) in which a face detector is accurately trained using only 4 images, containing just 18 faces!
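As a concrete illustration of the small-data approach (a minimal sketch of our own, not code from Chollet’s tutorial; the directory layout and image counts are assumptions), training a small Keras convnet on a couple of thousand labeled images looks roughly like this:

    # Assumes TensorFlow is installed and data/train holds one subfolder of
    # images per class (e.g. cats/ and dogs/), roughly 2,000 images in total.
    from tensorflow.keras import layers, models
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_batches = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
        'data/train', target_size=(150, 150), batch_size=32, class_mode='binary')

    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(1, activation='sigmoid'),   # binary classifier
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(train_batches, epochs=10)          # small data can still give a useful model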

Machine Learning: Machine learning is the science of getting computers to act without being explicitly programmed. For instance, instead of coding rules and strategies of chess into a computer, the computer can watch a number of chess games and learn by example. Machine learning encompasses a wide variety of algorithms.
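To make “learn by example” concrete, here is a minimal sketch (our own toy illustration with made-up features and labels, assuming scikit-learn is installed): instead of hand-coding rules, we fit a model to labeled examples and let it generalize.

    # Hypothetical features for a chess position: [material balance, king safety].
    from sklearn.ensemble import RandomForestClassifier

    X = [[3, 0.9], [-5, 0.2], [2, 0.7], [-4, 0.1]]  # made-up example positions
    y = [1, 0, 1, 0]                                # 1 = winning, 0 = losing

    model = RandomForestClassifier(n_estimators=10).fit(X, y)
    print(model.predict([[1, 0.8]]))  # predicts from learned patterns, not coded rules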

Deep Learning: Deep learning refers to many-layered neural networks, one specific class of machine learning algorithms. Deep learning is achieving unprecedented state-of-the-art results, often improving on previous approaches by an order of magnitude, in nearly every field to which it has been applied so far, including image recognition, voice recognition, and language translation. I personally think deep learning is an unfortunate name, but that’s no reason to dismiss it. If you studied neural networks in the 80s and are wondering what has changed since then, the answer is the development of:

  • Using multiple hidden layers instead of just one. (Even though the Universal Approximation Theorem shows that a single hidden layer is theoretically sufficient, it can require exponentially more hidden units, which means exponentially more parameters to learn.)
  • GPGPU: programmable libraries for GPUs that allow them to be used for applications other than video games, resulting in orders-of-magnitude faster training and inference for deep learning
  • A number of algorithmic tweaks (especially the Adam optimizer, ReLU activation functions, batch normalization, and dropout) that have made training faster and more resilient (see the sketch after this list)
  • Larger datasets–although this has been a driver of progress, its value is often over-emphasized, as the “little data” examples above show.
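A minimal sketch (ours, not from the post) of how several of those algorithmic tweaks fit together in a Keras model, assuming TensorFlow’s bundled Keras:

    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Dense(256, activation='relu', input_shape=(784,)),  # ReLU trains faster than sigmoid/tanh
        layers.BatchNormalization(),   # normalizes activations to stabilize and speed up training
        layers.Dropout(0.5),           # randomly zeroes units during training to reduce overfitting
        layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',    # Adam adapts the learning rate per parameter
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])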

Another misconception Chris Dixon voiced is that deep learning talent is incredibly scarce, and that it will take years for graduate programs at the top schools to catch up to the demand. Although in the past a graduate degree from one of just a handful of schools was practically necessary to become a deep learning expert, that was always an artificial barrier, and it is no longer the case. As Josh Schwartz, chief of engineering and data science at Chartbeat, writes in the Harvard Business Review, “machine learning is not just for experts”. There has been a proliferation of cutting-edge, commercially usable machine learning frameworks; machine-learning-specific services from the major cloud providers Amazon and Google; tutorials; publicly released code; and publicly released datasets.

We are currently in the middle of teaching 100 students deep learning from scratch, with the only prerequisite being one year of programming experience. This will be turned into a MOOC shortly after the in-person class finishes. We’re in the 4th week of the course, and already the students are building world-class image recognition models in Python.

It is generally far better to take a domain expert within your organization and teach them deep learning than it is to take a deep learning expert and throw them into your organization. Deep learning PhD graduates are very unlikely to have the wide range of relevant experience that you value in your most effective employees, and are much more likely to be interested in solving fun engineering problems instead of keeping a razor-sharp focus on the most commercially important ones. In our experience across many industries and many years of applying machine learning to a range of problems, we’ve consistently seen organizations under-appreciate and under-invest in their existing in-house talent. In the days of the big data fad, this meant companies spent their money on external consultants. And in these days of the false “deep learning exclusivity” meme, it means searching for those unicorn deep learning experts, often including paying vastly inflated sums for failing deep learning startups.

Additional Diversity Fellowship, New International Fellowships, and Deadline Extended to 10/17

We have been getting a lot of interest in our upcoming deep learning course over the last couple of days. With applications closing today, we’ve heard from USF that they will be able to extend the application deadline. The revised deadline is October 17. This will mean some late nights from the team at USF to ensure that all enrollments are processed in time for the start of the course–so many thanks to them! USF’s page with logistical details about the course is available here.

We are also excited to announce an additional diversity fellowship. After learning about our decision to sponsor a diversity fellowship, USF has generously said that they will match us by sponsoring a second diversity fellowship for our course! Women, people of color, and LGBTQ people are all invited to apply. To apply, please email datainstitute@usfca.edu a copy of your resume, a sentence stating that you are interested in the diversity fellowship, and a paragraph describing the problem you want to use deep learning to solve. We are excited to start addressing the diversity crisis in artificial intelligence.

Finally, we’d like to announce that we have launched an International Fellowship program for up to five people, who will be able to fully participate in the course remotely, for free. We are very excited to introduce our first successful applicant, Samar Haider from Pakistan. Samar first taught himself machine learning using online resources like Andrew Ng’s Coursera class, and is now a researcher applying natural language processing to his native language of Urdu at the Center for Language Processing in Lahore. Pakistan has a rich heritage of 70 different spoken languages, many of which have not been well studied. At fast.ai, this is exactly the type of project that we want to equip people to work on–domains outside of mainstream deep learning research, meaningful but low-resource areas, problems that smart people from a wide variety of backgrounds are passionate about. And Samar is exactly the kind of passionate person that we want to support–as well as teaching himself machine learning, he has even invested his own money in GPU time on Amazon so that he can train models. We hope to see Samar’s fellowship benefit the Pakistani community more widely, by making Urdu deep learning resources available for the first time.

To apply to join Samar as an international fellow, please email rachel@fast.ai a copy of your resume, a sentence stating that you are interested in the international fellowship, and a paragraph describing the problem you want to use deep learning to solve. The program is open to people anywhere in the world (including the US) who cannot attend the course in person in San Francisco.

International fellowship winners:

  • Must be willing to attend the course via Skype in real time, even if the class time is inconvenient in their home time zone
  • Must be willing to participate remotely with group members (who will be based in the San Francisco Bay Area)
  • Will not be eligible to receive an official certificate.

Andrew Ng says deep learning is the “new electricity”; what this means for your organization

I’ve been saying for some time that deep learning is going to be even more transformative than the internet. This view is shared by the always insightful Andrew Ng (Chief Scientist at Baidu, co-founder of Coursera, and founder of Google Brain–and perhaps the only person I’m aware of who understands both business strategy and deep learning). This month’s Fortune magazine has deep learning as its cover story, and in it Ng is quoted as saying: “In the past a lot of S&P 500 CEOs wished they had started thinking sooner than they did about their Internet strategy. I think five years from now there will be a number of S&P 500 CEOs that will wish they’d started thinking earlier about their AI strategy.”

In fact, Ng goes even further, saying “AI is the new electricity. Just as 100 years ago electricity transformed industry after industry, AI will now do the same.” Fortune discusses this in a commentary titled The AI Revolution: Why You Need to Learn About Deep Learning, which is a most timely reminder, given that applications for the first deep learning certificate close in two days!

I remember how many of my colleagues and clients reacted when I was at McKinsey & Co in the early nineties, telling everyone I could that the internet was going to impact every part of every industry. At that time, as a very new consultant, I had very little success in getting heard. (In hindsight, I clearly should have left consulting and started a company based on my conclusions!) I hope that this time around I am in a better position to help organizations understand why they need to invest in deep learning as soon as possible.

I have had many opportunities to discuss this issue with the S&P 500 executives who have attended my data science classes as part of the executive program at Singularity University. Many execs have gone on to develop data-driven initiatives at their companies–but from those who don’t, these are some of the excuses I’ve heard:

  1. As a big company, we focus on competing, and our competitors aren’t doing this now
  2. We run on expertise–we don’t trust models, but trust our instincts
  3. Our data is too messy; our data projects aren’t ready yet
  4. We can’t hire the right experts

Let’s look at each of these in turn.

1. You need to lead, not follow, on massive industry transformations

The lesson of the internet shows us the danger of being a follower when a massive industry transformation is under way. Whether it is Kodak vs Instagram, Borders vs Amazon, or any of the other pre-internet companies that were destroyed by new competitors, there are more than enough examples of the danger of waiting until you see your competitors complete their transformation projects. You won’t know about your new competitors until it is far, far too late. And it’s much easier to get started early, when there’s time to build up the infrastructure and capabilities you need.

History also shows that companies that are amongst the first into a space are the ones that win in the long term. Look at some of these examples:

  • Thomas Edison’s original electricity company today is GE
  • The company created for the first punch card data collection (in the 19th century for the US census) today is IBM
  • The first significant e-commerce company was Amazon.

2. Data and instinct work together

There is nothing wrong with trusting your instincts as an industry leader–for most execs, it’s your instincts that have gotten you where you are today. But today’s data-driven companies are powering ahead on every metric that matters; the best role model is surely Google, which has filled nearly all of its leadership positions with computer science and math PhDs, and has used data to become one of the world’s largest companies in a remarkably short time.

Data and models should not be used to make decisions on their own, and neither should instinct. The best execs use a combination of both. Deep learning models provide deeper insight and greater accuracy, make existing products better, improve operations (e.g. Google used deep learning to reduce data center cooling requirements by 40%!) and make new classes of product available.

3. Using the data and infrastructure you have now is the best way to start

Every large organization I’ve ever worked with has had a major data infrastructure project going on at all times. If you wait until your data infrastructure is perfect, you’ll never start actually using that data to create value! There are a lot of benefits to using the infrastructure you already have to start creating value now:

  • You learn quickly what data is the most valuable in practice, and can focus your development efforts there
  • You create role model project results you can use to evangelize data-driven projects throughout the organization
  • Your further data infrastructure work can be funded by the value from your initial projects
  • You find out which of your team is most effective at delivering value from data, and can identify your recruiting needs more accurately

Deep learning is particularly effective at handling noise in data, and in handling unstructured data - so if your data infrastructure is not in a good state, it’s even more important that you invest in deep learning.

4. Rather than hiring experts, develop them internally

The people who best understand your business are the people already in your business. Looking externally for deep learning experts, rather than developing deep learning expertise within your existing staff, creates a gap between your domain experts and your new data experts. This gap can be nearly impossible to bridge, and can lead to many organizational problems.

Furthermore, deep learning experts are like unicorns at the moment–there are very few available, and they are very expensive ($5m-$10m acqui-hire value, according to VC Steve Jurvetson). But any reasonably numerate coder can develop deep learning skills within a few months; in fact, we’re trying to teach the best practices in just seven weeks in our deep learning certificate!

The best approach, of course, is to do both: hire existing deep learning experts if you can, whilst developing your own team’s skills at the same time.

In closing

If you think that your organization should heed Andrew Ng’s advice, please send this article to the manager of every team that you think could benefit. My talk Deep Learning Changes Everything is included on USF’s deep learning certificate site, and has more information, including a sample deep learning lesson.

The Diversity Crisis in AI, and fast.ai Diversity Fellowship

Update: The deadline has been extended to 10/17. Read more here

At fast.ai, we want to do our part to help make deep learning more inclusive, and we are beginning by offering one full-tuition fellowship for our deep learning certificate course at the Data Institute at USF, to be held on Monday evenings starting 10/24. Women, people of color, and LGBTQ people are invited to apply. To apply, please send your resume to datainstitute@usfca.edu before 10/17 (extended from the original 10/12 deadline), along with a note that you are interested in the diversity fellowship and a brief paragraph on how you want to use deep learning. You can read more here about what we’ll cover and our approach to teaching.

Why are we doing this? Artificial intelligence is an incredibly exciting field to be working in right now, with new breakthroughs occurring almost daily. I personally feel so lucky to be able to work in this field, and I want everyone to have access to such fascinating and creative work. Furthermore, artificial intelligence is missing out because of its lack of diversity. A study of 366 companies found that ethnically diverse companies are 35% more likely to perform well financially, and teams with more women perform better on collective intelligence tests. Scientific papers written by diverse teams receive more citations and have higher impact factors.

As big as the diversity crisis is in tech generally, it’s even worse in the field of artificial intelligence, which includes deep learning. Immensely powerful algorithms are being created by a very narrow and homogeneous slice of the population. Only 3 of the 35 people on the Google Brain team are women; only 1 of the 15 AI researchers at Stanford is a woman; and in 2015, only 14% of the attendees at one of the largest AI conferences (NIPS) were women. An analysis of the language in job postings found that ads for machine intelligence roles were significantly more masculine-biased than postings for all other types of software engineering roles.

We’ve already seen a number of sad (yet unintentional) reflections of bias in AI, and the opportunity for biased algorithms to have negative real-world consequences will only increase as the role of machine learning continues to grow in coming years.

Olga Russakovsky, a research fellow at the CMU Robotics Institute (and soon-to-be CS professor at Princeton), wrote that the field of AI is in a rut: “We’ve tended to breed the same style of researchers over and over again–people who come from similar backgrounds, have similar interests, read the same books as kids, learn from the same thought leaders, and ultimately do the same kinds of research.”

Jeff Dean, the legendary head of Google Brain, said that he is not worried about an AI apocalypse, but he is very concerned by the lack of diversity in the field of AI.

Mathematics is a field notorious for its sexism, and when we make advanced mathematics an unnecessary barrier to entry for deep learning (a follow-up post expanding on this is in the works), we greatly reduce the number of women who will be eligible, since so many have already been weeded out by hostile and biased environments. Note that this is due to cultural factors in the US, and doesn’t hold true in all countries.

Please email rachel@fast.ai with any questions about the diversity fellowship or comments about this article.

A unique path to deep learning expertise

If you’re considering investing time in studying deep learning with fast.ai, we want to give you all the information you need to decide whether you could benefit from this content. We believe that nearly everybody who can code can benefit. For a start, the financial opportunities are extraordinary–deep learning experts carry a price tag of $5 million-$10 million each in Bay Area acquisitions. Much more importantly, deep learning experts can help to solve some of the world’s biggest challenges in nearly every area.

New deep learning content

The best way to make teaching easier is to make the material being taught easier. Therefore, we have spent a lot of time writing new deep learning libraries that can provide state-of-the-art results with 10 to 100 times less code than the best existing libraries. All participants will be given access to these libraries, and will even learn how they are built from scratch. The libraries are very flexible, supporting a wide range of real-world problems, not just those discussed in the course. Here is an example that will be introduced within the first half-hour of the first lesson, which shows how to use fast.ai’s library to approach state-of-the-art computer vision results in 7 lines of code:

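A sketch of what that example plausibly looks like (our reconstruction, not code taken from this post; the Vgg16 wrapper class, data path, and batch sizes are assumptions about the course setup):

    # Assumes the course's vgg16.py wrapper and a data directory with train/
    # and valid/ subfolders, one subfolder per class.
    from vgg16 import Vgg16

    path = 'data/dogscats/'                         # hypothetical dataset location
    vgg = Vgg16()                                   # VGG16 with pre-trained ImageNet weights
    batches = vgg.get_batches(path + 'train', batch_size=64)
    val_batches = vgg.get_batches(path + 'valid', batch_size=128)
    vgg.finetune(batches)                           # swap the final layer for our classes
    vgg.fit(batches, val_batches, nb_epoch=1)       # one epoch of fine-tuning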

We will also be teaching you how to use the most recent and effective deep learning tools. In particular, we will spend a lot of time looking at Keras, a deep learning library that utilises TensorFlow and Theano to flexibly and concisely build any kind of deep learning model that you can think of. Furthermore, we will be teaching you how to use the deep-learning-optimised Amazon P2 GPU instances, released within the last two weeks. These instances are more complex to set up than most, which is why we will be providing you with tools designed to make setup extremely easy.

Many people have incorrectly claimed that effective deep learning requires vast resources of data and compute power. This is totally incorrect, and the claim comes partially from a lack of understanding (most researchers are at organisations that have access to these resources, so are not aware of the alternatives), and partially because it is helpful for big organisations’ recruiting efforts to make these claims. A particular focus of the course will be showing you how to create great models using moderate computing resources (specifically, a single machine) and only whatever data you are able to find for your particular project. We will be showing many real-world examples of extraordinary results that we will create together using these resources. For instance, here’s a snippet from our workbook explaining how data augmentation can increase the effective amount of data available for training a model:

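A minimal sketch of the idea (our own illustration using Keras’s ImageDataGenerator, not the actual workbook code; the data directory is hypothetical): each training epoch sees randomly transformed variants of the same images, multiplying the effective size of the dataset.

    # Random rotations, shifts, zooms, and flips generate new plausible training
    # images from existing ones (assumes TensorFlow's bundled Keras).
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    augmenting_gen = ImageDataGenerator(
        rotation_range=15,         # rotate up to 15 degrees
        width_shift_range=0.1,     # shift horizontally up to 10%
        height_shift_range=0.1,    # shift vertically up to 10%
        zoom_range=0.1,            # zoom in or out up to 10%
        horizontal_flip=True)      # mirror images left-right

    batches = augmenting_gen.flow_from_directory(
        'data/train', target_size=(224, 224), batch_size=64)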

For more information about the course content, take a look at this more detailed discussion of what we’ll be studying.

New deep learning community

The course will not be just about developing skills, but also about developing a community. We will be working hard to help you form connections with your fellow participants, including providing forums and Slack channels which will be deeply integrated into the course logistics. Every week we will be providing interesting, in-depth exercises, designed for group work – and will be helping you to form these groups. Even after the course is finished, we will continue to maintain the community channels, and will invite all participants to be involved in future projects through the Data Institute.

Helping to create the community is one of the key reasons why we decided to create this course.

We have been very disappointed by how exclusive deep learning currently is–especially in the Bay Area. We believe that progress will be fastest when the community is as inclusive as possible, since that will increase both the diversity and the quantity of people working in the field. Rachel has spent the last couple of years teaching programming to women, and is a popular writer on diversity in tech. She is working to ensure that the course is inclusive.

We are also aware that most programmers around the world still use Windows, even as MacBooks seem to be absolutely everywhere in the Bay area. Our course will be accessible for those running Windows, Linux, or Mac OS. We believe that this is the first time that any deep learning course has ensured that all content can be accessed and utilised by Windows users. Since Windows computers are much cheaper than Macs, and are much more widely used in the developing world, we hope that we will be able to bring deep learning to many people for whom it was previously a closed field.

After the course is complete, we will be making all materials available online–although, of course, the benefits of the in-person teaching, community development, and group projects will not be available to those who cannot attend the classes.

New deep learning teaching methods

We have worked particularly hard to incorporate the most pragmatic and up-to-date research into effective teaching methods. Dr Rachel Thomas’s years of teaching experience include everything from college calculus to full-stack web development for first-time programmers. Jeremy Howard’s data science teaching at Singularity University has reached many of the world’s top scientists and business executives. Rachel is well aware of the ways in which technical teaching generally fails to meet the needs of most students, and has written about our teaching philosophy and approach.

Beyond the 2 1/2 hours of teaching time each Monday night, the course will continue throughout the week. The exercises will be very pragmatic programming projects, where you will develop the confidence that you can create deep learning models from scratch and solve real-world problems with them. Jeremy and Rachel will be online throughout the week, along with your fellow students, helping you to resolve any challenges that you face. Furthermore, the exercises will provide opportunities for further study – those with the time and interest will have the scope to take their studies right up to the edge of the state-of-the-art, and even beyond.

All of the classes will be recorded, and the recordings will be made available online the day after the class, to help you with your review during the week. As well as these recordings, we will be providing you with a map of external resources that you can use to go further in your studies, or fill in gaps in your foundational knowledge. During our research for this class we have looked at numerous data science and deep learning courses, books, and videos; from this research, we now know which materials to recommend in each area. We believe that it is best for course participants when best of breed external resources are incorporated into the curriculum.

A personal note: Even as deep learning allows computers to see for the first time, there are still millions of people around the world who cannot themselves see, because of cataract blindness, a condition that almost exclusively impacts the developing world and is closely tied to poverty. We are teaching this course because we care deeply about our mission; we also care deeply about the mission of eliminating cataract blindness, which is why we are donating all of our teaching fees to the Fred Hollows Foundation. Each $25 donated cures one person of cataract blindness! We would encourage everyone reading this to consider whether they could spare $25 to give someone their life back in this way.