I have designed a CNN in MATLAB with the next set of layers:
So, if I'm correct making my calculations, the output size of the last ReLU layer should be 7x7x32, which seems to be correct because that size contains 1568 values, and if we look at the Fully Connected Layer, we can see that the vector of weights has size 10x1568:
Now, I'm making a project where I'm coding the same CNN by hand. But when I have to code the Fully Connected Layer, I don't know how its vector of weights is related to the previous output. For example, I guess that a hypothetical output(1,1,1) value is connected to the weights Weights(:,1). But what about the others?.
My question therefore is, how should I loop through the output to match the weights from the first one (1) to the last one (1568)?.
Related
So I’m trying to make a CNN and so far I think I understand all of the forward propagation and the back propagation in the fully connected layers. However, I’m having some issues with the back prop in the convolutional layers.
Basically I’ve written out the dimensions of everything at each stage in a CNN with two convolutional layers and two fully connected layers, with the input having a depth of 1(as it is black and white) and only one filter being applied at each convolutional layer. I haven’t bothered to use pooling at this stage as to my knowledge it shouldn’t have any impact on the calculus, just to where it is assigned, so the dimensions should still fit as long as I also don’t include any uppooling in my backprop. I also haven’t bothered to write out the dimensions after the application of the activation functions as they would be the same as that as their input and I would be writing the same values twice.
The dimensions, as you will see, vary slightly in format. For the convolutional layers I’ve written them as though they are images, rather than in a matrix form. Whilst for the fully connected layers I’ve written the dimensions as that of the size of the matrices used(will hopefully make more sense when you see it).
The issue is that in calculating the delta for the convolutional layers, the dimensions don’t fit, what am I doing wrong?
Websites used:
http://cs231n.github.io/convolutional-networks/
http://neuralnetworksanddeeplearning.com/chap2.html#the_cross-entropy_cost_function
http://www.jefkine.com/general/2016/09/05/backpropagation-in-convolutional-neural-networks/
Calculation of dimensions:
I was reading this interesting article on convolutional neural networks. It showed this image, explaining that for every receptive field of 5x5 pixels/neurons, a value for a hidden value is calculated.
We can think of max-pooling as a way for the network to ask whether a given feature is found anywhere in a region of the image. It then throws away the exact positional information.
So max-pooling is applied.
With multiple convolutional layers, it looks something like this:
But my question is, this whole architecture could be build with perceptrons, right?
For every convolutional layer, one perceptron is needed, with layers:
input_size = 5x5;
hidden_size = 10; e.g.
output_size = 1;
Then for every receptive field in the original image, the 5x5 area is inputted into a perceptron to output the value of a neuron in the hidden layer. So basically doing this for every receptive field:
So the same perceptron is used 24x24 amount of times to construct the hidden layer, because:
is that we're going to use the same weights and bias for each of the 24×24 hidden neurons.
And this works for the hidden layer to the pooling layer as well, input_size = 2x2; output_size = 1;. And in the case of a max-pool layer, it's just a max() function on an array.
and then finally:
The final layer of connections in the network is a fully-connected
layer. That is, this layer connects every neuron from the max-pooled
layer to every one of the 10 output neurons.
which is a perceptron again.
So my final architecture looks like this:
-> 1 perceptron for every convolutional layer/feature map
-> run this perceptron for every receptive field to create feature map
-> 1 perceptron for every pooling layer
-> run this perceptron for every field in the feature map to create a pooling layer
-> finally input the values of the pooling layer in a regular ALL to ALL perceptron
Or am I overseeing something? Or is this already how they are programmed?
The answer very much depends on what exactly you call a Perceptron. Common options are:
Complete architecture. Then no, simply because it's by definition a different NN.
A model of a single neuron, specifically y = 1 if (w.x + b) > 0 else 0, where x is the input of the neuron, w and b are its trainable parameters and w.b denotes the dot product. Then yes, you can force a bunch of these perceptrons to share weights and call it a CNN. You'll find variants of this idea being used in binary neural networks.
A training algorithm, typically associated with the Perceptron architecture. This would make no sense to the question, because the learning algorithm is in principle orthogonal to the architecture. Though you cannot really use the Perceptron algorithm for anything with hidden layers, which would suggest no as the answer in this case.
Loss function associated with the original Perceptron. This notion of Peceptron is orthogonal to the problem at hand, you're loss function with a CNN is given by whatever you try to do with your whole model. You can eventually use it, but it is non-differentiable, so good luck :-)
A sidenote rant: You can see people refer to feed-forward, fully-connected NNs with hidden layers as "Multilayer Perceptrons" (MLPs). This is a misnomer, there are no Perceptrons in MLPs, see e.g. this discussion on Wikipedia -- unless you go explore some really weird ideas. It would make sense call these networks as Multilayer Linear Logistic Regression, because that's what they used to be composed of. Up till like 6 years ago.
Neural Networks are mostly used to classify. So, the activation of a neuron in the output layer indicates the class of whatever you are classifying.
Is it possible (and correct) to design a NN to get 3D coordinates? This is, three output neurons with values in ranges, for example [-1000.0, 1000.0], each one.
Yes. You can use a neural network to perform linear regression, and more complicated types of regression, where the output layer has multiple nodes that can be interpreted as a 3-D coordinate (or a much higher-dimensional tuple).
To achieve this in TensorFlow, you would create a final layer with three output neurons, each corresponding to a different dimension of your target coordinates, then minimize the root mean squared error between the current output and the known value for each example.
I am making 8 x 8 tiles of Images and I want to train a RBF Neural Network in Matlab using those tiles as inputs. I understand that I can convert the matrix into a vector and use it. But is there a way to train them as matrices? (to preserve the locality) Or is there any other technique to solve this problem?
There is no way to use a matrix as an input to such a neural network, but anyway this won't change anything:
Assume you have any neural network with an image as input, one hidden layer, and the output layer. There will be one weight from every input pixel to every hidden unit. All weights are initialized randomly and then trained using backpropagation. The development of these weights does not depend on any local information - it only depends on the gradient of the output error with respect to the weight. Having a matrix input will therefore make no difference to having a vector input.
For example, you could make a vector out of the image, shuffle that vector in any way (as long as you do it the same way for all images) and the result would be (more or less, due to the random initialization) the same.
The way to handle local structures in the input data is using convolutional neural networks (CNN).
I am trying to classify portions of time series data using a feed forward neural network using 20 neurons in a single hidden layer, and 3 outputs corresponding to the 3 events I would like to be able to recognize. There are many other things that I could classify in the data (obviously), but I don't really care about them for the time being. Neural network creation and training has been performed using Matlab's neural network toolbox for pattern recognition, as this is a classification problem.
In order to do this I am sequentially populating a moving window, then inputting the window into the neural network. The issue I have is that I am obviously not able to classify and train every possible shape the time series takes on. Due to this, I typically get windows filled with data that look very different from the windows I used to train the neural network, but still get outputs near 1.
Essentially, the 3 things I trained the ANN with are windows of 20 different data sets that correspond to shapes that would correspond to steady state, a curve that starts with a negative slope and levels off to 0 slope (essentially the left half side of a parabola that opens upwards), and a curve corresponding to 0 slope that quickly declines (right half side of a parabola that opens downwards).
Am I incorrect in thinking that if I input data that doesn't correspond to any of the items I trained the ANN with it should output values near 0 for all outputs?
Or is it likely due to the fact that these basically cover all the bases of steady state, increasing and decreasing, despite large differences in slope, and therefore something is always classified?
I guess I just need a nudge in the right direction.
Neural network output values
A neural network may not guarantee specific output values if these input values / expected output values were presented during the training period.
A neural network will not consistently output 0 for untrained input values.
A solution is to simply present the network with an array of input values that should result in the network outputting 0.