There has been much hype surrounding deep learning in recent times, and one of the cornerstones of deep learning is the neural network. In this article, we will look at what a neural network is and get familiar with the relevant terminologies.
In simplest terms, a neural network is an interconnection of neurons. Now the question arises, what is a neuron? To understand neurons in deep learning, we first need to look at neurons from a biological perspective, from which the concept of neural network has actually been taken.
A neuron is the primary component of our central nervous system (which includes the brain) and is a cell that can receive, process and transmit information when excited. A neuron transmits signals to other neurons via connections called synapses. Based on these synapses, the brain makes the decisions. An interconnection of such neurons forms what is known as, the neural network.
Using this concept from the biological neural network, an Artificial Neural Network (ANN), also simply known as the neural network, is formed. A biological neuron receives signals via the dendrites, the nucleus processes these signals and the axon transmits these signals to other neurons. In the same manner, an artificial neuron input takes an input, does some kind of processing and pass the result to the other neurons in the network. If you’d like an animated version of how a neural network works, check out this video from 3Blue1Brown.
Let’s start with the simplest possible neural network – the perceptron.
A perceptron is a single layer neural network. As we can see from the above figure, a perceptron consists of:
- Transfer function
- Activation function
Let’s understand this via an example. Suppose that we have a 4×4 image and we want the perceptron to recognize whether this image consists of a vertical line (as shown by dark boxes in the figure above). The inputs (x1,x2,x3, …, x16) are the intensity values for each of the pixel. Now, the weights indicate the strength about how much we care about a particular pixel. As shown in the figure, we are interested in two sets of vertical pixels, so these 8 pixels are assigned a higher weight that the other 8 pixels. This means even if some of the middle pixels are lit up, our perceptron cares less about those pixels. After all the intensity values have been acquired, the weighted sum of the intensities is calculated in the transfer function. Thus, the pixels of our interest have more influence on the result of the transfer function than the others. Finally, the result of this transfer function is passed into an activation function. If the result of the activation function is above a provided threshold, then the perceptron fires, otherwise it doesn’t. In other words, the activation function tells us whether there is a vertical line in our desired locations or not. Since single layer perceptrons can be used to answer these yes/no question, they are also called a linear (binary) classifier.
Besides these, a perceptron also consists of a bias. This bias can be used to tweak the values of the pixels towards or away from the decision boundary.
Two layered Neural Network
Now, that we’ve seen what a single layer neural network looks like, let’s take it up a notch. The two-layered neural network, as seen in the above diagram, is a more common representation of a neural network in literature. The concept of what a layer is, may not have been completely clear in case of a perceptron, so this section covers that portion.
In the case of a perceptron, all we had was an input and an output – a single layer. In a two-layered neural network, we have an additional hidden layer. Now, every node of this neural network is basically a perceptron. So, all we’re doing is, we’re stacking a bunch of perceptrons that can make a more complex decision than just the presence of a vertical line.
Taking the above example again, this time we are dealing with multiple perceptrons that have different functions. The goal here is to determine whether the image consists of a digit ‘4’. Here, one perceptron is determining the presence of a vertical line, another determines the presence of a horizontal line and yet another is looking for a vertical line where half of the pixels are lit up and the other half is not.
So, again we give it the inputs for intensities of all the pixels in the input layer. The individual perceptrons fire using the concept already explained above and provide that output to the second layer, the hidden layer. Now, the hidden layer takes these inputs for the vertical line, horizontal line and edge and makes its own sorts of calculations. This hidden layer also has its own transfer function and activation function. What is interesting is that we don’t define the hidden layer explicitly. The hidden layer generates these functions by a process called supervised learning. The hidden layer then passes its decision on to the output layer, which finally tells us whether the image consists of a ‘4’ or not.
So, what is this supervised learning and how does the hidden layer learn? Supervised learning is a process where a neural network is given a set of inputs (like the handwritten digits image) and in addition, we also tell it what the label of that set of inputs is (what number is contained in the image). In the example given above, we only gave 4×4 image as the input for simplification. The actual handwritten digits dataset, called the MNIST dataset consists of images of size 28×28 (784 pixels in total) for images of handwritten digits 0 through 9. It also consists of a label for each image.
In the beginning, we give the input layer all the intensities for the pixels and as for the weights, we assign them randomly. Then the hidden layer generates its results and the output layer decides which of the 10 digits it thinks the given image is. As the weights are randomly assigned, the first result (and in fact for many iterations), the result will be incorrect. This is when we give the network feedback using the label we have, that what we gave it was a 4 but it had incorrectly classified it as 9, for instance. The network then re-adjusts its weights and biases by using this feedback so that for the next iteration, it can better predict the given image. This process of re-adjusting the weights and biases using the feedback is known as backpropagation, and the whole process where the network learns using the labeled data to later classify images that don’t have labels is known as supervised learning.
Now the network uses this process of backpropagation to refine its transfer and activation functions. We don’t define these functions explicitly, neither can we program it ourselves. The network finds the parameters to tweak and decide what’s best for it. As the functionality of how this layer adjusts remains hidden from the user, hence the name hidden layer.
Here, we presented only a single hidden layer. In reality, there can be multiple hidden layers and all the layers work similar to the methodology explained above. Such a network is known as a multilayer neural network. There can be as many hidden layers as the problem requires. One such network is shown below.
Each neuron of each layer is just a simple perceptron that takes input from the previous layer, does some sort of calculation and provides the output to the next layer. So even though the above figure may look scary at first, if you break it down, everything is just a perceptron that trains through backpropagation and becomes better at making decisions. And that is the closest we have gotten to modeling how a human brain works, by creating an artificial brain.