For any Keras layer (Layer class), can someone explain how to understand the difference between input_shape, units, dim, etc.?
For example the doc says units specify the output shape of a layer.
In the image of the neural net below hidden layer1 has 4 units. Does this directly translate to the units attribute of the Layer object? Or does units in Keras equal the shape of every weight in the hidden layer times the number of units?
In short how does one understand/visualize the attributes of the model - in particular the layers - with the image below?
Units:
The amount of "neurons", or "cells", or whatever the layer has inside it.
It's a property of each layer, and yes, it's related to the output shape (as we will see later). In your picture, except for the input layer, which is conceptually different from other layers, you have:
Hidden layer 1: 4 units (4 neurons)
Hidden layer 2: 4 units
Last layer: 1 unit
Shapes
Shapes are consequences of the model's configuration. Shapes are tuples representing how many elements an array or tensor has in each dimension.
Ex: a shape (30,4,10) means an array or tensor with 3 dimensions, containing 30 elements in the first dimension, 4 in the second and 10 in the third, totaling 30*4*10 = 1200 elements or numbers.
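As a quick check of that arithmetic, here is a minimal sketch. NumPy is my own choice for illustration; the text above does not depend on any particular library.

```python
import numpy as np

# An array with 3 dimensions: 30 x 4 x 10
arr = np.zeros((30, 4, 10))

shape = arr.shape      # number of elements along each dimension
n_dims = arr.ndim      # how many dimensions the array has
n_elements = arr.size  # total number of elements (30 * 4 * 10)

print(shape, n_dims, n_elements)
```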
The input shape
What flows between layers are tensors. Tensors can be seen as matrices, with shapes.
In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.
Example: if you have 30 images of 50x50 pixels in RGB (3 channels), the shape of your input data is (30,50,50,3). Your input layer tensor must then have this shape (see details in the "shapes in keras" section).
Each type of layer requires the input with a certain number of dimensions:
Dense layers require inputs as (batch_size, input_size)
or (batch_size, optional,...,optional, input_size)
2D convolutional layers need inputs as:
if using channels_last: (batch_size, imageside1, imageside2, channels)
if using channels_first: (batch_size, channels, imageside1, imageside2)
1D convolutions and recurrent layers use (batch_size, sequence_length, features)
Details on how to prepare data for recurrent layers
Now, the input shape is the only one you must define, because your model cannot know it. Only you know that, based on your training data.
All the other shapes are calculated automatically based on the units and particularities of each layer.
Relation between shapes and units - The output shape
Given the input shape, all other shapes are results of layers calculations.
The "units" of each layer will define the output shape (the shape of the tensor that is produced by the layer and that will be the input of the next layer).
Each type of layer works in a particular way. Dense layers have output shape based on "units", convolutional layers have output shape based on "filters". But it's always based on some layer property. (See the documentation for what each layer outputs)
Let's show what happens with "Dense" layers, which is the type shown in your graph.
A dense layer has an output shape of (batch_size,units). So, yes, units, the property of the layer, also defines the output shape.
Hidden layer 1: 4 units, output shape: (batch_size,4).
Hidden layer 2: 4 units, output shape: (batch_size,4).
Last layer: 1 unit, output shape: (batch_size,1).
Weights
Weights will be entirely automatically calculated based on the input and the output shapes. Again, each type of layer works in a certain way. But the weights will be a matrix capable of transforming the input shape into the output shape by some mathematical operation.
In a dense layer, weights multiply all inputs. In Keras, it's a matrix with one row per input and one column per unit, i.e. shape (input_size, units), but this is often not important for basic work.
In the image, if each arrow had a multiplication number on it, all numbers together would form the weight matrix.
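To make the link between units, weights and output shapes concrete, here is a rough NumPy sketch of the 3 -> 4 -> 4 -> 1 network from the picture. The helper `dense` is hypothetical (it is not the Keras layer) and activations are left out; it only assumes the (inputs, units) kernel layout that Keras uses for Dense layers.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 5  # arbitrary; any batch size works


def dense(x, units):
    """Hypothetical stand-in for a Dense layer without activation."""
    # One row per input, one column per unit
    kernel = rng.normal(size=(x.shape[-1], units))
    bias = np.zeros(units)
    return x @ kernel + bias, kernel


x = rng.normal(size=(batch_size, 3))  # input: 3 elements per sample
h1, k1 = dense(x, 4)                  # hidden layer 1
h2, k2 = dense(h1, 4)                 # hidden layer 2
out, k3 = dense(h2, 1)                # last layer

kernel_shapes = [k1.shape, k2.shape, k3.shape]
output_shapes = [h1.shape, h2.shape, out.shape]
print(kernel_shapes)
print(output_shapes)
```

Each kernel has as many entries as there are arrows between the two layers it connects.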
Shapes in Keras
Earlier, I gave an example of 30 images, 50x50 pixels and 3 channels, having an input shape of (30,50,50,3).
Since the input shape is the only one you need to define, Keras will demand it in the first layer.
But in this definition, Keras ignores the first dimension, which is the batch size. Your model should be able to deal with any batch size, so you define only the other dimensions:
input_shape = (50,50,3)
#regardless of how many images I have, each image has this shape
Optionally, or when it's required by certain kinds of models, you can pass the shape containing the batch size via batch_input_shape=(30,50,50,3) or batch_shape=(30,50,50,3). This limits your training possibilities to this unique batch size, so it should be used only when really required.
Either way you choose, tensors in the model will have the batch dimension.
So, even if you used input_shape=(50,50,3), when keras sends you messages, or when you print the model summary, it will show (None,50,50,3).
The first dimension is the batch size, it's None because it can vary depending on how many examples you give for training. (If you defined the batch size explicitly, then the number you defined will appear instead of None)
Also, in advanced works, when you actually operate directly on the tensors (inside Lambda layers or in the loss function, for instance), the batch size dimension will be there.
So, when defining the input shape, you ignore the batch size: input_shape=(50,50,3)
When doing operations directly on tensors, the shape will be again (30,50,50,3)
When keras sends you a message, the shape will be (None,50,50,3) or (30,50,50,3), depending on what type of message it sends you.
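A small sketch of the two cases, assuming a TensorFlow installation that exposes Keras as `tensorflow.keras`. Here the batch size is fixed through the `batch_size` argument of `Input`, which I am using as a stand-in for passing the full batch shape.

```python
from tensorflow import keras

# Batch size left open: only the per-sample shape is given
inp = keras.Input(shape=(50, 50, 3))
open_shape = tuple(inp.shape)

# Batch size fixed to 30
inp_fixed = keras.Input(shape=(50, 50, 3), batch_size=30)
fixed_shape = tuple(inp_fixed.shape)

print(open_shape)   # batch dimension shows up as None
print(fixed_shape)  # batch dimension shows up as 30
```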
Dim
And in the end, what is dim?
If your input shape has only one dimension, you don't need to give it as a tuple, you give input_dim as a scalar number.
So, in your model, where your input layer has 3 elements, you can use any of these two:
input_shape=(3,) -- The comma is necessary when you have only one dimension
input_dim = 3
But when dealing directly with the tensors, often dim will refer to how many dimensions a tensor has. For instance a tensor with shape (25,10909) has 2 dimensions.
Defining your image in Keras
Keras has two ways of doing it, Sequential models, or the functional API Model. I don't like using the sequential model, later you will have to forget it anyway because you will want models with branches.
PS: here I ignored other aspects, such as activation functions.
With the Sequential model:
from keras.models import Sequential
from keras.layers import *
model = Sequential()
#start from the first hidden layer, since the input is not actually a layer
#but inform the shape of the input, with 3 elements.
model.add(Dense(units=4,input_shape=(3,))) #hidden layer 1 with input
#further layers:
model.add(Dense(units=4)) #hidden layer 2
model.add(Dense(units=1)) #output layer
With the functional API Model:
from keras.models import Model
from keras.layers import *
#Start defining the input tensor:
inpTensor = Input((3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=4)(inpTensor)
hidden2Out = Dense(units=4)(hidden1Out)
finalOut = Dense(units=1)(hidden2Out)
#define the model's start and end points
model = Model(inpTensor,finalOut)
Shapes of the tensors
Remember you ignore batch sizes when defining layers:
inpTensor: (None,3)
hidden1Out: (None,4)
hidden2Out: (None,4)
finalOut: (None,1)
Input Dimension Clarified:
Not a direct answer, but I just realized that the term "Input Dimension" could be confusing, so be wary:
The word "dimension" alone can refer to:
a) The dimension of the input data (or stream), such as the number N of sensor axes carrying the time series signal, or the RGB color channels (3). Suggested term: "Input Stream Dimension"
b) The total number / length of input features (the input layer): 28 x 28 = 784 for an MNIST grayscale image, or 3000 for FFT-transformed spectrum values. Suggested term: "Input Layer / Input Feature Dimension"
c) The dimensionality (number of dimensions) of the input, typically 3D as expected by a Keras LSTM: (# of Rows of Samples, # of Sensors, # of Values..), so 3 is the answer. Suggested term: "N Dimensionality of Input"
d) The SPECIFIC input shape, e.g. (30,50,50,3) for the input image data in this post, or (30,2500,3) if each image is unwrapped into a single axis
Keras:
In Keras, input_dim refers to the Dimension of Input Layer / Number of Input Features
model = Sequential()
model.add(Dense(32, input_dim=784)) #or 3 in the current posted example above
model.add(Activation('relu'))
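In a Keras LSTM, input_dim refers to the number of features per time step; the total number of time steps is given separately (input_length, or the first entry of input_shape)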
The term has been very confusing, we live in a very confusing world!!
I find that one of the challenges in Machine Learning is dealing with different languages, dialects and terminologies (as if you had 5-8 highly different versions of English, and needed very high proficiency to converse with different speakers). This is probably the same in programming languages too.
Added this answer to elaborate on the input shape at the first layer.
I created two variations of the same layers
Case 1:
model =Sequential()
model.add(Dense(15, input_shape=(5,3),activation="relu", kernel_initializer="he_uniform", kernel_regularizer=None,kernel_constraint="MaxNorm"))
model.add(Dense(32,activation="relu"))
model.add(Dense(8))
Case 2:
model1=Sequential()
model1.add(Dense(15,input_shape=(15,),kernel_initializer="he_uniform",kernel_constraint="MaxNorm",kernel_regularizer=None,activation="relu"))
model1.add(Dense(32,activation="relu"))
model1.add(Dense(8))
plot_model(model1,show_shapes=True)
Now if we plot these and print the summaries:
Case 1
[![Case1 Model Summary][2]][2]
[2]: https://i.stack.imgur.com/WXh9z.png
Case 2
[Image: Case 2 model summary]
Now if you look closely, in the first case the input is two-dimensional (5 rows of 3 features). A Dense layer acts on the last axis, so the first layer produces one output per row per unit: shape (None,5,15).
Case two is simpler: the input is one-dimensional, and each unit produces one output after activation, giving shape (None,15).
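The difference between the two cases can be mimicked with plain matrix products. This is only a NumPy sketch of what a Dense layer does to the shapes (no bias, activation or constraints), with an arbitrary batch size of 2:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 2  # arbitrary

kernel_case1 = rng.normal(size=(3, 15))   # last input axis has 3 features
kernel_case2 = rng.normal(size=(15, 15))  # last input axis has 15 features

x_case1 = rng.normal(size=(batch_size, 5, 3))  # input_shape=(5,3)
x_case2 = rng.normal(size=(batch_size, 15))    # input_shape=(15,)

# The matrix product acts on the last axis only; other axes are carried through
out_case1 = x_case1 @ kernel_case1
out_case2 = x_case2 @ kernel_case2

print(out_case1.shape)
print(out_case2.shape)
```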
My input data is a 101x22 array (101 samples and 22 features).
These 101 samples should be divided into 3 groups (L1, L2 and L3).
I want to use a MATLAB neural network as the classifier.
What will the target array be?
What other classifier do you recommend?
Thanks
The target data should be the classes of the input data. In your case you have 3 classes. You can use a binary (one-of-N) coding: one row per class, with a 1 marking the class each sample belongs to.
More details about the input and target data can be found at the end of the linked page.
Other resources:
first
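To illustrate the one-of-N target coding described above, here is a short sketch. It is written in Python/NumPy rather than MATLAB purely for illustration; the labels are random placeholders, and the layout (one row per class, one column per sample) is the one I would assume the MATLAB pattern recognition tools expect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 101, 3

# Hypothetical class labels: 0 -> L1, 1 -> L2, 2 -> L3
labels = rng.integers(0, n_classes, size=n_samples)

# One column per sample, one row per class; a single 1 marks the class
targets = np.zeros((n_classes, n_samples), dtype=int)
targets[labels, np.arange(n_samples)] = 1

print(targets.shape)
print(targets[:, :5])
```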
A simple example can be the following:
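% this is the INPUT data that you have (one column per sample: 22 features x 101 samples)
X = randi([0 10], 22, 101);
% this is the TARGET data (one column per sample: 3 classes x 101 samples, one-of-N coded)
labels = randi([1 3], 1, 101);
y = zeros(3, 101);
y(sub2ind(size(y), labels, 1:101)) = 1;
% define hidden layer size
hiddenLayerSize = 10;
% create the neural net
my_net = patternnet(hiddenLayerSize);
% train it
[my_net,tr] = train(my_net,X,y);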
Then you should see something like the following:
Then explore this window.
E.g. select Confusion to see the confusion matrix.
In convolutional Neural Networks, How to know the output of a specific conv layer? (I am using keras to build a CNN model)
For example, if I am using a one-dimensional conv layer, where number_of_filters=20, kernel_size=10, and input_shape=(500,1):
cnn.add(Conv1D(20,kernel_size=10,strides=1, padding="same",activation="sigmoid",input_shape=(Dimension_of_input,1)))
and if I am using a two-dimensional conv layer, where number_of_filters=64, kernel_size=(5,100), input_shape=(5,720,1) (height, width, channel):
Conv2D(64, (5, 100),
padding="same",
activation="sigmoid",
data_format="channels_last",
input_shape=(5,720,1))
What is the number of outputs in the above two conv layers? Is there an equation that can be used to work out the number of outputs of a conv layer in a convolutional neural network?
Yes, there are equations for it; you can find them on the CS231n course website. But as this is a programming site: Keras provides an easy way to get this information programmatically, by using the summary function of a Model.
model = Sequential()
# fill model with layers
model.summary()
This will print in terminal/console all the layer information, such as input shapes, output shapes, and number of parameters for each layer.
Actually, the model.summary() function might not be what you are looking for if you want to do more than just look at the model.
If you want to access the layers of your Keras model, you can use model.layers, which returns all of the layers as a list (the assignment below stores it in a variable). If you then want to look at a specific layer, simply index the list:
list_of_layers = model.layers
list_of_layers[5] # gives you the 6th layer
These are still just objects, so you probably want specific values. For that, specify the attribute you want to look at:
list_of_layers[-1].output_shape # returns output_shape of last layer
This gives you back the output_shape tuple of the last layer in the model.
You can even skip the list assignment if you already know that you only want the output_shape of a certain layer, and just do:
model.layers[-1].output_shape # equivalent to the above method without storing in a list
This might be useful if you want to use these values while building the model to guide the execution in a certain way (adding a pooling layer or doing the padding etc.).
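As a sketch of that idea applied to the Conv1D layer from the question: the pooling layer is a hypothetical addition of mine, there only so the list has more than one entry, and the shapes are read through `layer.output.shape`, an alternative spelling that I would expect to give the same information as `output_shape`. It assumes TensorFlow with Keras available as `tensorflow.keras`.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(500, 1)),
    keras.layers.Conv1D(20, kernel_size=10, strides=1,
                        padding="same", activation="sigmoid"),
    keras.layers.MaxPooling1D(pool_size=2),  # hypothetical extra layer
])

# Collect (layer name, output shape) for every layer in the model
layer_shapes = [(layer.name, tuple(layer.output.shape))
                for layer in model.layers]
for name, shape in layer_shapes:
    print(name, shape)
```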
When I first worked with CNNs in TensorFlow, I found it very difficult to deal with dimensions. Below is the general scheme for calculating them.
Consider an image of dimension (n x n) and a filter of dimension (f x f).
With no padding and no stride (i.e. stride 1):
output dims are (n-f+1, n-f+1)
With padding p:
output dims are (n+2p-f+1, n+2p-f+1)
With padding = "SAME" (and stride 1) the output dims equal the input dims, so the equation becomes: n+2p-f+1 = n
so from here p = (f-1)/2
With "VALID" padding there is no padding, so p = 0
In computer vision f is usually odd; if f is even, the padding has to be asymmetric.
With stride = s:
output dims are (floor((n+2p-f)/s)+1, floor((n+2p-f)/s)+1)
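The formulas above can be sketched as a small helper. The function names are made up for this example, and the padding is passed as the total 2p so that even filter sizes (where p is not a whole number) still work. The numbers are taken from the Conv1D and Conv2D layers asked about earlier in the thread.

```python
def conv_output_size(n, f, total_padding=0, s=1):
    """floor((n + 2p - f) / s) + 1, with total_padding standing for 2p."""
    return (n + total_padding - f) // s + 1


def same_total_padding(f):
    """Total padding (2p) that keeps the size unchanged at stride 1: f - 1."""
    return f - 1


# Conv1D example: input length 500, kernel size 10
valid_len = conv_output_size(500, 10)                         # no padding
same_len = conv_output_size(500, 10, same_total_padding(10))  # "same", stride 1
strided_len = conv_output_size(500, 10, s=2)                  # no padding, stride 2

# Conv2D example: 5 x 720 input, 5 x 100 kernel, "same" padding, stride 1
same_hw = (conv_output_size(5, 5, same_total_padding(5)),
           conv_output_size(720, 100, same_total_padding(100)))

print(valid_len, same_len, strided_len, same_hw)
```

The number of filters does not enter these formulas; it only sets the size of the last (channel) axis of the output.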
I wrote this script (Matlab) for classification using Softmax. Now I want to use same script for regression by replacing the Softmax output layer with a Sigmoid or ReLU activation function. But I wasn't able to do that.
X=houseInputs ;
T=houseTargets;
%Train an autoencoder with a hidden layer of size 10 and a linear transfer function for the decoder. Set the L2 weight regularizer to 0.001, sparsity regularizer to 4 and sparsity proportion to 0.05.
hiddenSize = 10;
autoenc1 = trainAutoencoder(X,hiddenSize,...
'L2WeightRegularization',0.001,...
'SparsityRegularization',4,...
'SparsityProportion',0.05,...
'DecoderTransferFunction','purelin');
%%
%Extract the features in the hidden layer.
features1 = encode(autoenc1,X);
%Train a second autoencoder using the features from the first autoencoder. Do not scale the data.
hiddenSize = 10;
autoenc2 = trainAutoencoder(features1,hiddenSize,...
'L2WeightRegularization',0.001,...
'SparsityRegularization',4,...
'SparsityProportion',0.05,...
'DecoderTransferFunction','purelin',...
'ScaleData',false);
features2 = encode(autoenc2,features1);
%%
softnet = trainSoftmaxLayer(features2,T,'LossFunction','crossentropy');
%Stack the encoders and the softmax layer to form a deep network.
deepnet = stack(autoenc1,autoenc2,softnet);
%Train the deep network on the wine data.
deepnet = train(deepnet,X,T);
%Estimate the deep network, deepnet.
y = deepnet(X);
Regression is a different problem from classification. You have to change your loss function to something that fits regression, e.g. mean squared error, and of course change the number of neurons in the last layer to one (you will only output 1 value on your last layer).
It is possible to use a neural network to perform a regression task, but it might be overkill for many tasks. True regression means mapping one set of continuous inputs to another set of continuous outputs:
f: x -> ý
Changing the architecture of a neural network to make it perform a regression task is usually fairly simple. Instead of mapping the continuous input data to a specific class as it is done using the Softmax function as in your case, you have to make the network use only a single output node.
This node computes a weighted sum of the outputs of the previous layer (the last hidden layer) and applies a linear activation, i.e. it multiplies the summed activations by 1. During the training process this output ý will be compared to the correct ground-truth value y that comes with your dataset. As a loss function you may use the root-mean-squared error (RMSE).
Training such a network will result in a model that maps an arbitrary number of independent variables x to a dependent variable ý, which basically is a regression task.
To come back to your Matlab implementation: it would be incorrect to change the current Softmax output layer to an activation function such as a Sigmoid or ReLU. Instead you would have to implement a custom RMSE output layer for your network, which is fed with the sum of activations coming from the last hidden layer of your network.
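A rough numerical sketch of such an output stage, in Python/NumPy rather than MATLAB and with made-up values throughout: a single linear output node on top of the last hidden layer, scored with RMSE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical activations of the last hidden layer: 4 samples, 10 units
hidden = rng.normal(size=(4, 10))

# Single linear output node: weighted sum plus bias, no squashing function
w = rng.normal(size=(10, 1))
b = 0.0
y_pred = hidden @ w + b  # shape (4, 1), unbounded real values


def rmse(y_true, y_hat):
    """Root-mean-squared error between targets and predictions."""
    return float(np.sqrt(np.mean((y_true - y_hat) ** 2)))


y_true = np.array([[1.0], [2.0], [3.0], [4.0]])  # placeholder targets
print(y_pred.shape, rmse(y_true, y_pred))

# Worked check with known numbers: errors 0, 0, 2 give sqrt(4/3)
check = rmse(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0]))
print(check)
```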
I have a question about using Keras (with Theano as my backend) to which I'm rather new. I'm using a many to one RNN (takes in a time series as the input, computes one number as the output) as my first set of layers. So far, this is trivial with Keras using the recurrent layer IO.
Here is where I'm having trouble:
Now I like to pass the output of this RNN (the one number) to a separate function (lets call this f) and then do some computation with it.
What I would like to do is take this computed output (after the function f) and train it against the expected output (via some loss such as mse).
I'd like some advice on how to feed the output post computation from the function f and still train it via model.fit feature in Keras.
My pseudo code is as follows:
X = input
Y = output
#RNN layer
model.add(LSTM(....))
model.add(Activation(...)) %%Returns W*X
#function f %%Returns f(W*X)
(Needs to take in output from final RNN layer to generate a new number)
model.fit(X,Y,....)
In above, I'm not sure how to write code to include the output from function f while it is training for weights in the RNN (i.e. train f(W*x) against Y).
Any help is appreciated, thanks!!
It is not clear from your question if the RNN's weights should update with the training of f.
1st option - They should
As Matias said - a simple Dense layer is probably what you are looking for:
X = input
Y = output
#RNN layer
model.add(LSTM(....))
model.add(Activation(...))  # returns W*X
model.add(Dense(...))
model.fit(X,Y,....)
2nd option - They should not
Your f function would still be a Dense layer but you will iteratively train f and the RNN separately.
Assuming you have an rnn_model that you defined as above, define a new model f:
X = input
Y = output
#RNN layer
rnn_model = Sequential()
rnn_model.add(LSTM(....))
rnn_model.add(Activation(...))  # returns W*X
f_model = Sequential()
f_model.add(rnn_model)
f_model.add(Dense(...))
Now you can train them separately by doing:
# Code to train rnn_model
rnn_model.trainable = False
# Code to compile and train f_model (compile after changing trainable, so the change takes effect)
rnn_model.trainable = True
The simplest way is to add a layer to your model that does the exact computation you want. From your comments it seems you just want f(W*X), and that is exactly what a Dense layer does, minus the bias term.
So I believe adding a dense layer with the appropriate activation function is everything you need to do. If you don't need an activation at the output then just use "linear" as activation.
Just note that function f needs to be specified as a symbolic function using methods from keras.backend and it should be a differentiable function.
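If f is not simply a Dense layer, one possible way to keep it inside the trained graph is a Lambda layer. The sketch below is only an assumption about how this could look: the function `f`, the input shape (10 time steps, 1 feature) and the random data are all made up, and it assumes TensorFlow with Keras available as `tensorflow.keras`.

```python
import numpy as np
from tensorflow import keras


def f(t):
    # Hypothetical f: any differentiable tensor expression could go here
    return 2.0 * t + 1.0


inputs = keras.Input(shape=(10, 1))      # assumed: 10 time steps, 1 feature
rnn_out = keras.layers.LSTM(1)(inputs)   # many-to-one: one number per sample
f_out = keras.layers.Lambda(f)(rnn_out)  # f applied inside the model
model = keras.Model(inputs, f_out)
model.compile(optimizer="adam", loss="mse")

# Placeholder data: 8 sequences and their expected outputs
X = np.random.rand(8, 10, 1).astype("float32")
Y = np.random.rand(8, 1).astype("float32")

# The loss compares f(RNN(X)) with Y, so gradients flow through f into the LSTM
history = model.fit(X, Y, epochs=1, verbose=0)
pred_shape = model.predict(X, verbose=0).shape
print(pred_shape)
```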