Neural networks are among the most popular techniques and tools in machine learning.
Neural networks were inspired by the human brain as early as the 1940s. Researchers studied neuroscience to understand how the brain learns and solves problems, and then applied the same ideas to computers.
In particular, if we look at the structure of a biological neural network, it looks something like this:


Scientists observed that these neurons are connected and communicate with one another by sending electrical signals: each neuron processes its input signals and transmits a signal on to other neurons.
This biological idea can be applied to machines as well. We can design an Artificial Neural Network (ANN), which is a mathematical model for learning.
An Artificial Neural Network is a computational model that learns a mathematical function mapping inputs to outputs, based on the structure and parameters of the network.
Like biological neurons, an Artificial Neural Network is built from units, which can be represented as nodes in a graph (also called neurons) and connected to one another.
Suppose we have two units connected to each other; let's treat these artificial neurons as an input unit and an output unit.
Now suppose we want to solve a problem by modeling a mathematical function. For example, we want to predict whether it will rain, using two variables x₁ and x₂ that represent two factors such as temperature and humidity. Our task is a Boolean classification: will it rain or not?
0 = No Rain
1 = Rain
To solve this task, let's define a hypothesis function H that takes the inputs x₁ and x₂ and, based on them, determines whether it is going to rain. We can use a linear combination of these inputs to model the function:
H(x₁, x₂) = w₀ + w₁x₁ + w₂x₂
Here, H(x₁, x₂) is the hypothesis function with input variables x₁ and x₂, and w₀, w₁, and w₂ are weights (numbers).
The inputs x₁ and x₂ are multiplied by the weights w₁ and w₂, respectively, and we add a bias weight w₀, which shifts the output value up or down.
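As a quick numerical illustration of this linear combination (the weight values here are arbitrary, just to show the arithmetic; real weights would be learned):

```python
def linear(x1, x2, w0, w1, w2):
    # The hypothesis before any activation: w0 + w1*x1 + w2*x2
    return w0 + w1 * x1 + w2 * x2

# Illustrative inputs and assumed weights:
print(linear(2, 3, w0=-1, w1=1, w2=1))  # -1 + 1*2 + 1*3 = 4
```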
So, to turn the hypothesis function's output into a classification — raining or not raining — we need to decide the values of these weights and apply some kind of threshold at the end. The functions that define these thresholds are called activation functions.
There are many activation functions. Whichever one we use, the inputs are always multiplied by weights, a bias weight is added, and the resulting sum is passed to the activation function to produce the output.
H(x₁, x₂) = g (w₀ + w₁x₁ + w₂x₂)
Mathematically, we can view the activation function as a function g applied to the linear combination inside the hypothesis function.
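A minimal sketch of this in code, using a step function as the activation g (the specific weight values are assumptions for illustration, not learned values):

```python
def step(x):
    # Step activation: output 1 at or above the threshold 0, otherwise 0.
    return 1 if x >= 0 else 0

def hypothesis(x1, x2, w0, w1, w2):
    # H(x1, x2) = g(w0 + w1*x1 + w2*x2)
    return step(w0 + w1 * x1 + w2 * x2)

# Example: temperature and humidity scaled to 0..1, with assumed weights.
print(hypothesis(0.8, 0.9, w0=-1.0, w1=1.0, w2=1.0))  # 1 -> "Rain"
```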
The neural network structure is a graphical representation of this idea: the inputs are multiplied by weights, a bias weight is added, and the sum is passed to the activation function to produce the output.


In the figure above, the two units on the left represent the input variables x₁ and x₂, and the unit on the right represents the output. The edges connecting the inputs to the output carry the associated weights w₁ and w₂, respectively. The output unit computes its value by multiplying each input by its respective weight, adding the bias weight, and applying the activation function.
This graphical representation of the idea discussed above is what we call a neural network structure. The neural network will learn what the values of the weights should be so that, together with the activation function, it produces the correct output.
For example, an OR logical function can be represented by such a network.
Now the question arises: how can we use this function to train the neural network and figure out the values of the weights?
Suppose we take w₁ and w₂ to be 1 and w₀ to be -1, and use the step function as the activation (output 1 when the sum is at or above the threshold 0, otherwise 0). Let's check the results for all the combinations of 0 and 1.
Here we can see that the input and output values match the OR function.
Output when x₁ = 0 and x₂ = 1:
g(w₀ + w₁x₁ + w₂x₂)
= g(-1 + 1·0 + 1·1)
= g(-1 + 0 + 1)
= g(0)
g(0) lies at the threshold of the step function, so the output is 1.
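The same check can be run for all four input combinations in a short Python sketch, using the weights and step function described above:

```python
def step(x):
    # Step activation: 1 at or above the threshold 0, otherwise 0.
    return 1 if x >= 0 else 0

# Weights from the text: w1 = w2 = 1, bias w0 = -1.
w0, w1, w2 = -1, 1, 1
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, step(w0 + w1 * x1 + w2 * x2))
# Prints the OR truth table: only (0, 0) yields 0.
```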
We can also try the AND function.
The truth table of the AND function is shown above. Now we have to decide the values of w₀, w₁, and w₂ so that the AND function is satisfied.
Suppose w₁ and w₂ are 1 and w₀ is -2.
Then,
In the table above, we can see that the output matches the AND function.
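As with OR, we can verify the AND weights over all four input combinations (same step function as before, with the bias changed to -2):

```python
def step(x):
    # Step activation: 1 at or above the threshold 0, otherwise 0.
    return 1 if x >= 0 else 0

# Weights from the text: w1 = w2 = 1, bias w0 = -2.
w0, w1, w2 = -2, 1, 1
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, step(w0 + w1 * x1 + w2 * x2))
# Prints the AND truth table: only (1, 1) yields 1.
```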
So, the neural network should be able to determine what the weights should be in order to produce the correct output. We can also compose more complex networks, with as many input units as we need.
Growing the size of the neural network in this way allows us to represent problems with a larger number of inputs.
In the case of the OR and AND functions, it was easy to determine the weights by hand. But how do we calculate the weights for complex problems, such as predicting whether it will rain or estimating the price of a house? The method for doing these complex calculations is called Gradient Descent.
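A minimal sketch of gradient descent, under assumptions not spelled out in the text: a sigmoid activation (so the output is differentiable, unlike the step function), a squared-error loss, and a learning rate chosen for the example. It learns the OR weights automatically instead of picking them by hand:

```python
import math

def sigmoid(z):
    # Smooth, differentiable stand-in for the step function.
    return 1 / (1 + math.exp(-z))

# OR training data: ((x1, x2), target)
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w0, w1, w2 = 0.0, 0.0, 0.0
lr = 0.5  # learning rate (an assumed value for this sketch)

for _ in range(5000):
    for (x1, x2), y in data:
        out = sigmoid(w0 + w1 * x1 + w2 * x2)
        # Derivative of the squared error (out - y)^2 through the sigmoid.
        grad = (out - y) * out * (1 - out)
        # Step each weight downhill along its gradient.
        w0 -= lr * grad
        w1 -= lr * grad * x1
        w2 -= lr * grad * x2

# Thresholding the learned sigmoid at 0.5 reproduces the OR truth table.
for (x1, x2), y in data:
    pred = 1 if sigmoid(w0 + w1 * x1 + w2 * x2) >= 0.5 else 0
    print(x1, x2, pred)
```

The key difference from the hand-picked weights above is that here the weights start at zero and are adjusted repeatedly in the direction that reduces the error.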
