Deep learning is one of the most important fields of artificial intelligence. In this blog, I explain what it is, why it matters, and why you should know about it.
Deep learning is a special form of machine learning. However, as you will see, it is also a huge step towards genuine artificial intelligence. Many of the coolest and most impressive applications of machine learning are actually deep learning. These include computers that can beat Go masters and systems that can detect cancer.
Deep learning uses deep neural networks to learn directly from huge datasets, in many cases without supervision. So, before I explain how deep learning works, I need to introduce you to artificial neural networks.
Artificial neural networks are designed as extremely simplified replicas of our brains. Layers of artificial neurons are connected together in order to process data in a non-linear manner. The image below shows a simple artificial neuron.
This artificial neuron takes one or more inputs, multiplies each by a weight, sums the results, then applies a transfer function to give the output. Some artificial neurons use an activation function that only “fires” once the weighted sum passes a certain threshold value.
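The neuron described above can be sketched in a few lines of Python. This is only a minimal illustration: I am assuming a sigmoid as the transfer function and a threshold of 0.5 for “firing”. The bias term and the names `neuron` and `fires` are my own additions and do not come from any particular library.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, passed through a sigmoid transfer function.

    The bias is an assumed extra term; set it to 0.0 to ignore it.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def fires(output, threshold=0.5):
    """Assumed threshold rule: the neuron 'fires' once its output reaches the threshold."""
    return output >= threshold


# Three inputs, each with its own weight (values made up for illustration).
out = neuron([1.0, 0.5, -1.0], [0.4, 0.6, 0.2])
print(out, fires(out))
```

Here the weighted sum is 0.5, so the sigmoid gives an output of roughly 0.62 and the neuron fires.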
In an artificial neural network (ANN), you combine many artificial neurons together into a layer. Then you combine multiple layers together, typically an input layer, output layer, and one or more hidden layers. In the image below, you can see there are 3 inputs, 1 hidden layer, and 2 outputs.
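To make the layering concrete, here is a rough sketch of a network with 3 inputs, 1 hidden layer, and 2 outputs. The hidden layer size of 4, the sigmoid transfer function, and the random starting weights are all assumptions on my part, and the helper names (`layer`, `random_layer`, `forward`) are purely illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One layer: every neuron takes a weighted sum of all inputs, then applies sigmoid."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Untrained layer with random weights and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)
hidden_w, hidden_b = random_layer(3, 4, rng)  # 3 inputs -> 4 hidden neurons (size assumed)
output_w, output_b = random_layer(4, 2, rng)  # 4 hidden neurons -> 2 outputs


def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


outputs = forward([0.2, 0.7, 0.1])
print(outputs)
```

Because the weights are random, the two outputs are meaningless until the network has been trained. The point is only to show how data flows from layer to layer.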
The key advantage of artificial neural networks is that they allow you to perform non-linear calculations. In turn, this means you can use an ANN for machine learning. One of the earliest such examples was Yann LeCun’s pioneering work on recognizing hand-written numerals.
From neural networks to deep learning
Artificial neural networks were revolutionary. They provided us with the ability to perform feats like image recognition and voice recognition. Over time, we have extended ANNs with more and more hidden layers and larger numbers of inputs and outputs. We describe these neural networks as ‘deep’ because of the number of hidden layers.
In the simple ANN above, each output is already the result of a complex calculation. As you add layers, the number of connections, and with it the amount of computation, grows rapidly. This means you need ever-more-powerful computers to run your models. But the advantages are huge. As you add complexity, the neural network can make more and more connections.
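One rough way to see how quickly a network grows is to count its weights and biases. The sketch below assumes fully connected layers, and the layer sizes are made up for illustration. The hidden layer of 4 neurons in the small network is my assumption.

```python
def count_parameters(layer_sizes):
    """Weights plus biases in a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


shallow = count_parameters([3, 4, 2])          # 3 inputs, 1 small hidden layer, 2 outputs
deep = count_parameters([3, 64, 64, 64, 2])    # same inputs and outputs, 3 wider hidden layers
print(shallow, deep)
```

The small network has 26 adjustable values, while the deeper one has 8,706, and every one of them has to be stored and updated during training.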
A traditional neural network is typically trained using supervised learning and backpropagation. You start by identifying the features you want to find and performing feature engineering on your dataset. You then need to train the model to find those features.
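As a minimal sketch of supervised learning with backpropagation, the code below trains a tiny network on four labelled examples. Everything here is an assumption chosen for illustration: the XOR task, the 4 hidden neurons, the learning rate, and the number of passes over the data. Real projects would use a library rather than hand-written loops.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_network(n_in, n_hidden, rng):
    """Untrained network: one hidden layer and a single output neuron."""
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(net, x):
    """Return the hidden activations and the final output for one input."""
    hidden = [sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
              for w, b in zip(net["w_hidden"], net["b_hidden"])]
    out = sigmoid(sum(h * w for h, w in zip(hidden, net["w_out"])) + net["b_out"])
    return hidden, out


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs=5000, rate=0.5):
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(net, x)
            # Error signal at the output, scaled by the slope of the sigmoid.
            delta_out = (out - y) * out * (1.0 - out)
            # Propagate that error backwards to each hidden neuron.
            delta_hidden = [delta_out * w * h * (1.0 - h)
                            for w, h in zip(net["w_out"], hidden)]
            # Nudge every weight against its share of the error.
            for j, h in enumerate(hidden):
                net["w_out"][j] -= rate * delta_out * h
                net["b_hidden"][j] -= rate * delta_hidden[j]
                for i, xi in enumerate(x):
                    net["w_hidden"][j][i] -= rate * delta_hidden[j] * xi
            net["b_out"] -= rate * delta_out


# Labelled training data: each input is paired with the answer we want (XOR).
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

net = make_network(2, 4, random.Random(1))
initial_loss = mean_squared_error(net, data)
train(net, data)
final_loss = mean_squared_error(net, data)
print(initial_loss, final_loss)
```

The key point is that the labels drive the learning: each weight is adjusted in proportion to how much it contributed to the gap between the network's output and the known answer.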
In deep learning, the machine identifies interesting features for itself, often using unsupervised learning. It then teaches itself to classify these and gives you the output. Unlike traditional machine learning, it is very hard to see what is going on inside a deep learning system. This means that, in practice, the main way to verify it works is by testing it. In effect, this is much like the way a child is taught at school. Their teacher doesn’t understand how the child’s brain is learning, but they can test it with an exam.
Why is deep learning so powerful?
Deep learning gives computers the ability to learn and hypothesize in a way that resembles how humans do. Let me give you a simple analogy. Imagine you see a dirty car drive into a building, then a few minutes later it emerges clean. Of course, you know this means the car was washed inside the building. And so, you can draw the inference that the building is a carwash. However, for a conventional computer program, this is extremely difficult. For a start, it has to understand the concepts of dirty vs. clean. For another, it lacks the ability to draw inferences. However, deep learning systems can perform this sort of task.
Teaching a machine to play Go
AlphaGo is one of the classic examples of deep learning. DeepMind (now part of Google) trained AlphaGo to play the board game Go. Go is notoriously difficult to learn and master. DeepMind taught AlphaGo the rules of the game. Initially, the team “introduced AlphaGo to numerous amateur games to help it develop an understanding of reasonable human play.”
They then set it to play games against different versions of itself. Each time it played, the system was able to learn new strategies. This is known as reinforcement learning. Over time, AlphaGo got better and better at playing. Eventually, it was good enough to beat a world champion Go player.
DeepMind says “AlphaGo [is] a computer program that combines advanced search tree with deep neural networks. These neural networks take a description of the Go board as an input and process it through a number of different network layers containing millions of neuron-like connections. One neural network, the “policy network”, selects the next move to play. The other neural network, the “value network”, predicts the winner of the game.”
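To give a feel for the two roles DeepMind describes, here is a toy sketch of a “policy” function that scores possible moves and a “value” function that scores the whole position. This is emphatically not AlphaGo's architecture: the 3x3 board, the single layer of random untrained weights, and all of the function names are my own assumptions, chosen only to show the shape of the inputs and outputs.

```python
import math
import random

BOARD_SIZE = 3  # toy board for illustration; real Go is played on a 19x19 board


def dense(inputs, weights):
    """One fully connected layer without a transfer function."""
    return [sum(x * w for x, w in zip(inputs, row)) for row in weights]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
n_cells = BOARD_SIZE * BOARD_SIZE
# Random, untrained weights: a trained system would have learned these.
policy_weights = [[rng.uniform(-1, 1) for _ in range(n_cells)] for _ in range(n_cells)]
value_weights = [[rng.uniform(-1, 1) for _ in range(n_cells)]]


def policy_network(board):
    """Return a probability for each empty cell (occupied cells get 0)."""
    scores = dense(board, policy_weights)
    legal = [i for i, cell in enumerate(board) if cell == 0]
    probs = softmax([scores[i] for i in legal])
    result = [0.0] * len(board)
    for i, p in zip(legal, probs):
        result[i] = p
    return result


def value_network(board):
    """Return a score between -1 and 1; positive favours the player encoded as +1."""
    return math.tanh(dense(board, value_weights)[0])


# Board description: +1 = our stone, -1 = opponent's stone, 0 = empty.
board = [1, 0, 0,
         0, -1, 0,
         0, 0, 0]
move_probs = policy_network(board)
best_move = max(range(n_cells), key=lambda i: move_probs[i])
position_value = value_network(board)
print(best_move, position_value)
```

With random weights the suggested move is arbitrary. The sketch only shows that one function maps a board to a choice of move, while the other maps the same board to a single estimate of who is ahead.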
How does Saibre use deep learning?
Sonasoft Saibre is our AI bot factory. It is a universal AI platform that can create machine learning models from raw data. Saibre is based on deep learning. This gives it the unique ability to create and test hypotheses from your raw data. In turn, these hypotheses are then used to train machine learning models to perform key tasks.
Forecasting. Many businesses rely on accurately forecasting their future demand. Forecasting allows them to plan better and streamline their business. Forecasting requires analyzing and modeling historical data and then extrapolating the model into the future. This can also be done in reverse: given a future demand, when do you need to get resources into place to meet it? Saibre is able to create forecast models by analyzing your raw historical data using deep learning.
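The basic idea of modeling historical data and extrapolating it, in both directions, can be sketched with a simple straight-line trend. To be clear, this is not how Saibre works: it is a plain least-squares fit standing in for a far richer model, and the demand figures and function names are made up for illustration.

```python
def fit_trend(history):
    """Least-squares straight line through (period index, demand) points."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(history, periods_ahead):
    """Extrapolate the fitted line a number of periods past the last observation."""
    slope, intercept = fit_trend(history)
    return intercept + slope * (len(history) - 1 + periods_ahead)


def periods_until(history, target):
    """The reverse question: how many periods until demand reaches the target?"""
    slope, intercept = fit_trend(history)
    if slope <= 0:
        return None  # demand is flat or falling, so the target is never reached
    return (target - intercept) / slope - (len(history) - 1)


monthly_demand = [100, 110, 120, 130, 140]  # made-up historical data
in_three_months = forecast(monthly_demand, 3)
months_to_200 = periods_until(monthly_demand, 200)
print(in_three_months, months_to_200)
```

With this made-up data, demand rises by 10 units a month, so the forecast three months out is 170 and demand reaches 200 in six months.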
Anomaly detection. Often, it is critical to identify anomalies in data. We looked at this in detail in a recent blog. Anomaly detection allows banks to identify fraudulent card transactions. You can also use it to spot when a piece of machinery is starting to fail. In both cases, Saibre creates anomaly detection bots from your raw data. One of the key techniques it uses is deep learning. This allows it to determine which values are anomalous without needing to be given detailed knowledge about the data.
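To show what “determining which values are anomalous” means in the simplest case, here is a small statistical sketch. It is not deep learning and it is not Saibre's method: it flags values that sit far from the median, using the median absolute deviation. The threshold of 3.5, the sensor readings, and the function name are assumptions for illustration.

```python
import statistics


def find_anomalies(values, threshold=3.5):
    """Flag values whose distance from the median is large relative to the typical spread."""
    median = statistics.median(values)
    # Median absolute deviation: the typical distance of a value from the median.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to compare against, so nothing can be flagged
    # 0.6745 rescales the deviation so scores are comparable to standard deviations.
    return [(i, v) for i, v in enumerate(values)
            if 0.6745 * abs(v - median) / mad > threshold]


# Made-up machine temperature readings with one obvious spike.
readings = [20.1, 19.8, 20.3, 20.0, 35.7, 19.9, 20.2]
anomalies = find_anomalies(readings)
print(anomalies)
```

This flags only the reading of 35.7 at position 4. A rule like this needs you to choose the threshold yourself, which is exactly the kind of detailed knowledge a learning-based approach aims to do without.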
In part 2 of this blog, we will look at deep learning in more detail. In particular, we will see how it helps unlock useful insights into your data.