Neural networks are a broad family of machine learning algorithms. What do they consist of, and how do they work? Let’s figure it out.
A neural network is a basic building block of machine learning and deep learning. It is loosely modeled on the human brain, since it is based on the concept of biological neural networks.
Neural networks can solve many kinds of problems. They typically consist of the following components:
- input layer (receiving and transmitting data);
- hidden layer (calculation);
- output layer.
To implement a neural network, you need to understand how neurons behave. A neuron simultaneously receives several inputs, processes this data, and produces a single output.
Simply put, a neural network is made up of input and output blocks joined by connections, and each connection has a weight. The weight is the strength of the connection: the greater the weight, the more strongly one neuron affects the other. Each input is multiplied by its weight:
- x → x·w1;
- y → y·w2.
The weighted inputs are then summed, and the bias c (also called the threshold value) is added:
x·w1 + y·w2 + c
The resulting value is passed through the activation function (here, the sigmoid), which turns the sum into a single output:
z = ƒ(x·w1 + y·w2 + c)
The sigmoid function is defined as ƒ(x) = 1 / (1 + e^(-x)); its graph is an S-shaped curve.
Its output always lies between 0 and 1. Large negative inputs give values close to 0, and large positive inputs give values close to 1.
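To make the shape of the sigmoid more concrete, here is a minimal sketch that evaluates it at a few sample points. The sample inputs are arbitrary and chosen only for illustration.

```python
import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

# Arbitrary sample inputs, from strongly negative to strongly positive
for x in (-10, -1, 0, 1, 10):
    print(f"sigmoid({x:>3}) = {sigmoid(x):.5f}")
```

The value at 0 is exactly 0.5, and the outputs approach 0 and 1 as the input moves away from zero in either direction.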
Suppose the neuron has the following parameters: w = [0, 1], c = 4.
Input layer: x = 2, y = 3.
(x·w1) + (y·w2) + c = 2·0 + 3·1 + 4 = 7
z = ƒ(7) ≈ 0.999
How to write your own neuron
We will use the Python library NumPy to write the neuron code.
import numpy as np

def sigmoid(x):
    # Activation function: f(x) = 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))

class Neuron:
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def feedforward(self, inputs):
        # Weight the inputs, add the bias, then apply the activation function
        total = np.dot(self.weights, inputs) + self.bias
        return sigmoid(total)

weights = np.array([0, 1])  # w1 = 0, w2 = 1
bias = 4                    # c = 4
n = Neuron(weights, bias)

x = np.array([2, 3])        # x = 2, y = 3
print(n.feedforward(x))     # 0.9990889488055994
We used the values from the example above, and the result matches the manual calculation: approximately 0.999.
How to assemble a neural network from neurons
A neural network consists of many interconnected neurons.
An example of a simple neural network:
- x1, x2 – input layer;
- h1, h2 – hidden layer with two neurons;
- o1 – output layer.
Note: a neural network can have any number of layers, and each layer can have any number of neurons.
Suppose every neuron in the network above has the weights w = [0, 1] and the bias (threshold value) b = 0, and that all of them use the same sigmoid activation function.
With the input data x=[2, 3] we get:
h1 = h2 = ƒ(w·x + b) = ƒ((0·2) + (1·3) + 0) = ƒ(3) ≈ 0.95
o1 = ƒ(w·[h1, h2] + b) = ƒ((0·h1) + (1·h2) + 0) = ƒ(0.95) ≈ 0.72
The input data is passed through the neurons until the output values are obtained.
Neural network code
import numpy as np

class OurNeuralNetwork:
    '''
    Neural network with:
      - two inputs
      - a hidden layer with two neurons (h1, h2)
      - an output layer with one neuron (o1)
    All neurons have identical weights and biases:
      - w = [0, 1]
      - b = 0
    '''
    def __init__(self):
        weights = np.array([0, 1])
        bias = 0

        # Neuron class from the previous section
        self.h1 = Neuron(weights, bias)
        self.h2 = Neuron(weights, bias)
        self.o1 = Neuron(weights, bias)

    def feedforward(self, x):
        out_h1 = self.h1.feedforward(x)
        out_h2 = self.h2.feedforward(x)

        # The inputs for o1 are the outputs of h1 and h2
        out_o1 = self.o1.feedforward(np.array([out_h1, out_h2]))
        return out_o1

network = OurNeuralNetwork()
x = np.array([2, 3])
print(network.feedforward(x))  # 0.7216325609518421
Running the network on the same input gives an output of about 0.72, which matches the manual calculation.
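Since a network can have any number of layers and neurons, the same computation can also be written with one weight matrix per layer instead of one object per neuron. The sketch below is one possible way to do that, not the article's own code; the `feedforward` helper and the `(weights, bias)` layer format are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def feedforward(x, layers):
    """Pass x through a list of (weights, bias) layers (hypothetical format).

    weights has shape (n_neurons, n_inputs); bias has shape (n_neurons,).
    """
    activation = x
    for weights, bias in layers:
        # Each row of weights is one neuron: weighted sum, plus bias, then sigmoid
        activation = sigmoid(weights @ activation + bias)
    return activation

# The same network as above: 2 inputs -> 2 hidden neurons -> 1 output neuron,
# every neuron with w = [0, 1] and b = 0
hidden = (np.array([[0.0, 1.0], [0.0, 1.0]]), np.zeros(2))
output = (np.array([[0.0, 1.0]]), np.zeros(1))

result = feedforward(np.array([2.0, 3.0]), [hidden, output])
print(result)  # a one-element array, approximately 0.7216
```

Adding a layer then only means appending another `(weights, bias)` pair to the list.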
Neural network training
Training a neural network means finding the weights (and biases) that map the inputs to the desired outputs for the task at hand.
Neural network class:
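import numpy as np

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        # One weight per (input feature, hidden neuron) pair; 4 hidden neurons
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        # One weight per hidden neuron for the single output
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(y.shape)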
Each step of the learning process consists of:
- forward propagation (computing the predicted output);
- backpropagation (updating the weights and biases).
The output of a simple two-layer neural network is:
ŷ = σ(w2·σ(w1·x + b1) + b2)
For a given input x, only the weights w and the biases b affect the output ŷ.
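As a rough illustration of this formula, the sketch below evaluates it with scalar values. The `predict` helper and all numbers are made up for the example; they only show that changing a weight changes the output for the same input.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(x, w1, b1, w2, b2):
    # y_hat = sigmoid(w2 * sigmoid(w1 * x + b1) + b2)
    hidden = sigmoid(w1 * x + b1)
    return sigmoid(w2 * hidden + b2)

x = 1.0  # made-up input

# Same input, two different values of w2
y_a = predict(x, w1=0.5, b1=0.0, w2=1.0, b2=0.0)
y_b = predict(x, w1=0.5, b1=0.0, w2=-1.0, b2=0.0)
print(y_a, y_b)
```

Flipping the sign of w2 moves the prediction from above 0.5 to below 0.5, even though the input is unchanged.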
Fine-tuning the weights and biases from the input data is what is meant by training a neural network.
As the formula shows, forward propagation is a straightforward calculation:
ŷ = σ(w2·σ(w1·x + b1) + b2)
Next, we need to add a forward propagation function to the code. For simplicity, assume that the biases are 0.
import numpy as np

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        # sigmoid is the activation function defined earlier; biases are assumed to be 0
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
To measure the prediction error, we use a loss function. For this example, the sum-of-squares error is a suitable choice: the sum of the squared differences between the actual and predicted values.
Loss(y, ŷ) = Σ (y − ŷ)²
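A sum-of-squares error can be sketched in a few lines of plain Python. The function name and the sample values below are made up for illustration.

```python
def sum_of_squares_error(y_true, y_pred):
    # Sum of the squared differences between actual and predicted values
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

y_true = [0, 1, 1, 0]          # made-up actual values
y_pred = [0.1, 0.8, 0.9, 0.2]  # made-up predictions

loss = sum_of_squares_error(y_true, y_pred)
print(loss)  # close to 0.10 (0.01 + 0.04 + 0.01 + 0.04)

perfect = sum_of_squares_error(y_true, y_true)
print(perfect)  # 0, since every prediction is exact
```

The smaller the loss, the closer the predictions are to the actual values.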
Backpropagation computes the derivatives in reverse order, from the output back to the input, and uses them to correct the weights and biases. To do this, you need the derivative of the loss function, that is, its slope.
Once the derivative of the loss function with respect to the weights and biases is known, they can be updated step by step in the direction that reduces the loss. This procedure is called gradient descent.
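The idea of following the slope downhill is easiest to see on a one-variable function. The sketch below is a toy example and not part of the network: the loss (w − 3)², the starting point, the learning rate, and the number of steps are all assumptions chosen for illustration.

```python
def loss(w):
    # Toy loss with its minimum at w = 3
    return (w - 3.0) ** 2

def loss_derivative(w):
    # Slope of the toy loss at w
    return 2.0 * (w - 3.0)

w = 0.0              # arbitrary starting point
learning_rate = 0.1  # arbitrary step size

for step in range(50):
    # Move against the slope, so that the loss decreases
    w -= learning_rate * loss_derivative(w)

print(w, loss(w))
```

After 50 steps, w ends up very close to 3, where the loss is smallest.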
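The loss function does not contain the weights and biases directly, so the chain rule is needed to compute its derivative with respect to them:
∂Loss/∂w = (∂Loss/∂ŷ) · (∂ŷ/∂z) · (∂z/∂w) = −2(y − ŷ) · σ′(z) · x, where z = w·x + b
With this derivative, we can adjust the weights.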
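One way to check a chain-rule derivative is to compare it with a numerical estimate. The sketch below does this for a single sigmoid neuron with a squared-error loss. The helper names and the values of w, b, x, and y are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    y_hat = sigmoid(w * x + b)
    return (y - y_hat) ** 2

def loss_gradient_w(w, b, x, y):
    # Chain rule: dLoss/dw = dLoss/dy_hat * dy_hat/dz * dz/dw
    y_hat = sigmoid(w * x + b)
    dloss_dyhat = -2.0 * (y - y_hat)
    dyhat_dz = y_hat * (1.0 - y_hat)  # derivative of the sigmoid
    dz_dw = x
    return dloss_dyhat * dyhat_dz * dz_dw

w, b, x, y = 0.5, 0.1, 2.0, 1.0  # made-up values

analytic = loss_gradient_w(w, b, x, y)

# Central finite difference as an independent estimate of the slope
eps = 1e-6
numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)

print(analytic, numeric)
```

The two values agree closely. The gradient is negative here, meaning that increasing w would reduce the loss for this sample.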
We add a backpropagation function to the Python code:
import numpy as np

def sigmoid_derivative(s):
    # Assumed helper (not defined in this article): derivative of the sigmoid,
    # written in terms of the sigmoid output s = sigmoid(z)
    return s * (1 - s)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # Chain rule: slope of the loss with respect to weights2 and weights1.
        # Each term uses (y - output), so it is the negative gradient.
        d_weights2 = np.dot(self.layer1.T, 2 * (self.y - self.output) * sigmoid_derivative(self.output))
        d_weights1 = np.dot(self.input.T, np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1))

        # Adding the negative gradient moves the weights downhill on the loss
        self.weights1 += d_weights1
        self.weights2 += d_weights2
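The forward and backward passes can be combined into a training loop. The sketch below is a self-contained version written with plain functions rather than the class above, so it is an illustration and not the article's exact code. The toy dataset, the random seed, the 1500 iterations, and the `sigmoid_derivative` helper are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(s):
    # Assumed helper: expects s = sigmoid(z), an already-activated value
    return s * (1.0 - s)

rng = np.random.default_rng(0)  # fixed seed so that runs are repeatable

# Made-up toy dataset: 4 samples with 3 features each, and 1 target per sample
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

weights1 = rng.random((X.shape[1], 4))  # input -> 4 hidden neurons
weights2 = rng.random((4, 1))           # hidden -> 1 output

losses = []
for _ in range(1500):
    # Forward propagation (biases assumed to be 0)
    layer1 = sigmoid(X @ weights1)
    output = sigmoid(layer1 @ weights2)
    losses.append(float(np.sum((y - output) ** 2)))

    # Backpropagation: negative gradient of the sum-of-squares loss
    delta_out = 2 * (y - output) * sigmoid_derivative(output)
    d_weights2 = layer1.T @ delta_out
    d_weights1 = X.T @ ((delta_out @ weights2.T) * sigmoid_derivative(layer1))

    # Gradient descent step with an implicit learning rate of 1
    weights1 += d_weights1
    weights2 += d_weights2

print(losses[0], losses[-1])
```

Comparing the first and last recorded loss shows whether training has reduced the error. The exact numbers depend on the seed and the number of iterations.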
Neural networks are built on specific algorithms and mathematical functions, which can seem difficult to understand at first. However, there are ready-made machine learning libraries for building and training neural networks, so you can use them without implementing every detail yourself.