Keras explanation: number of nodes in input layer - neural-network

I'm trying to understand the relationship between a simple perceptron and the neural network one gets when using the Keras Sequential class.
I learned that the neural network perceptron looks as such:
Each "node" in the first layer is one of the features of a sample x_1, x_2,...,x_n
Could somebody explain the jump to the neural network I find in the Keras package below?
Since the input layer has four nodes, does that mean that the network consists of four of the perceptron networks?

There seems to be a misunderstanding about what a perceptron is. A perceptron is a single unit that multiplies the inputs by weights, sums them up and applies an activation function.
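As a minimal sketch of that definition, the unit below computes a weighted sum plus a bias and passes it through a sigmoid. The names `perceptron` and `sigmoid`, the bias term and the example numbers are illustrative assumptions, not part of Keras.

```python
import math

def sigmoid(z):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias=0.0):
    """Single unit: weighted sum of the inputs plus a bias, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Three features x_1..x_3 feed ONE unit; the features themselves are not perceptrons.
output = perceptron([1.0, 0.5, -1.0], [0.2, 0.4, 0.1])
```

Read this way, the nodes drawn in an input layer are just the feature values handed to the unit, which is why they carry no weights of their own.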
Now the diagrams you have are called multi-layer perceptrons (MLPs) and consist of a stack of perceptrons organised in layers (see the Wikipedia article on multilayer perceptrons). In Keras, there is no explicit notion of a single perceptron. Instead there is a layer of perceptrons, implemented as a Dense layer, so called because the layers are densely connected, i.e. every output is connected to every input between layers. The second diagram would correspond to:
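from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(4, activation='sigmoid', input_dim=3))  # first hidden layer, fed by 3 input features
model.add(Dense(4, activation='sigmoid'))  # second hidden layer
model.add(Dense(1, activation='sigmoid'))  # output layer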
assuming sigmoid activations. Here the input layer is implicit: it is specified by input_dim=3, and the final layer is the output layer.
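One way to see that each Dense layer is a row of perceptrons is to count its parameters: every unit holds one weight per input plus one bias. The sketch below does that arithmetic for the 3-4-4-1 stack above, assuming the Keras default of `use_bias=True`; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Each of the n_units perceptrons has n_inputs weights and one bias."""
    return n_inputs * n_units + n_units

# Input features, then the three Dense layers from the model above.
layer_sizes = [3, 4, 4, 1]
counts = [dense_param_count(n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)  # counts is [16, 20, 5], so total is 41
```

The input layer contributes no parameters, which matches the idea that its nodes are plain feature values rather than perceptrons.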

Related

What can you say about two identical neural networks after training on the same dataset?

Suppose you have two convolutional neural networks implemented in Matlab and composed of these layers:
imageInputLayer
ConvolutionalLayer
maxPoolinglayer
relulayer
softmaxlayer
fullyconnectedlayer
classification layer
Both of these networks have exactly the same architecture.
I apply the same training method to both networks, with the same hyperparameters.
Both of these networks have exactly the same weights in their corresponding layers.
That is, the two networks are replicas of each other.
Both of these networks are trained using exactly the same training set and validation set, without shuffling.
I am wondering:
Will the scores (training error and validation error) and trained weights be different for both?
Does it depend upon the method for training?
In short: Yes to both - because initial weights are usually initialized using random numbers.
A tad less short: A neural network is simply an algorithm. If no noise (i.e. randomness) is introduced in any function along the way, the 2 networks will end up being completely the same.
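A toy illustration of that point, written in plain Python rather than Matlab: a single linear unit trained by gradient descent on a fixed, unshuffled data set. The `train` function, the seeds and the data are all made up for the sketch. The only randomness is the weight initialisation, so equal seeds give identical results and different seeds do not.

```python
import random

def train(seed, data, epochs=5, lr=0.1):
    """Fit y = w * x + b by per-sample gradient descent, without shuffling."""
    rng = random.Random(seed)      # the only source of randomness
    w = rng.uniform(-1.0, 1.0)     # random initial weight
    b = rng.uniform(-1.0, 1.0)     # random initial bias
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
same_a = train(seed=0, data=data)
same_b = train(seed=0, data=data)   # identical start, identical updates
other = train(seed=1, data=data)    # different start, different end point
```

Real frameworks can add further sources of randomness (dropout, data shuffling, non-deterministic GPU kernels), so fixing the initial weights alone may not be enough there.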

How to add two output classification layers in keras?

I have a neural network whose job is to classify 10 classes. Further, I want these 10 classes to be classified into 2 classes (positive -> 3 , negative -> 7). How can I achieve this in keras?
It sounds like you are trying to solve two different, but closely related problems. I recommend that you train your first model to predict 10 classes, and then you create a copy of the first model (including weights) except with a different output layer to support binary classification. At this point you can either:
Train only your final dense layer and new output layer, or
Train the entire model with a low learning rate
For more information you can read about Transfer Learning.
Example code:
from keras.layers import Dense

model.save('model_1')  # load this to retrieve your original model
model.pop()  # pop output activation layer and associated params
model.pop()  # pop final dense layer
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))  # new binary output layer
# Freeze everything except the last two layers
for layer in model.layers[:-2]:
    layer.trainable = False
model.compile(loss='binary_crossentropy', optimizer='nadam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=50, batch_size=32)
If you want to retrain the whole model then you can omit the loop setting all but the last two layers to untrainable, and choose an optimizer such as SGD with a low learning rate.
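The binary model needs binary targets, so the labels passed to fit have to be 0/1 rather than the original 10 class ids. A minimal sketch of that mapping, assuming integer class ids 0-9 and a hypothetical choice of which three classes count as positive (the question does not say which):

```python
# Hypothetical: the 3 class ids treated as "positive"; the other 7 are "negative".
POSITIVE_CLASSES = {0, 1, 2}

def to_binary_labels(class_ids, positive=POSITIVE_CLASSES):
    """Map 10-class integer labels to 1 (positive) or 0 (negative)."""
    return [1 if c in positive else 0 for c in class_ids]

y_train_10 = [0, 5, 2, 9, 1, 7]                   # made-up 10-class labels
y_train_binary = to_binary_labels(y_train_10)     # [1, 0, 1, 0, 1, 0]
```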

Are fully connected layers really required in Deep neural networks?

Can I have a neural network classifier with a large number of layers but without any fully connected layers?
Yes, you can make a fully convolutional classifier. One example is SqueezeNet.
The basic working principle is that at the end of the network you insert a convolutional layer with C output channels, where C is the number of classes. Then you proceed to apply global average pooling, which will produce a 1D vector of C elements (independent of input feature map width/height), and you can apply the softmax function to that vector to produce output class probabilities.
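The pooling and softmax steps can be sketched in plain Python, assuming the last convolution has already produced C feature maps stored as nested lists; `global_average_pool` and `softmax` here are illustrative helpers, not a framework API.

```python
import math

def global_average_pool(feature_maps):
    """feature_maps: list of C channels, each an H x W list of lists.
    Returns one average per channel, so the result always has C elements."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

def softmax(logits):
    """Turn C scores into C probabilities that sum to 1."""
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# C = 3 classes, each feature map is 2 x 2 (made-up values).
maps = [[[1.0, 1.0], [1.0, 1.0]],
        [[0.0, 0.0], [0.0, 0.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
probs = softmax(global_average_pool(maps))
```

Because the average is taken over whatever height and width the maps have, the same code works unchanged for larger inputs.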

CNN feed forward or back propagation model

Is a convolutional neural network (CNN) a feed forward model or a back propagation model? I got confused comparing Dr. Yann's blog with the Wikipedia definition of a CNN.
A convolutional neural net is a structured neural net where the first several layers are sparsely connected in order to process information (usually visual).
A feed forward network is defined as having no cycles contained within it. If it has cycles, it is a recurrent neural network. For example, imagine a three layer net where layer 1 is the input layer and layer 3 the output layer. In a feed forward network, layer 1 takes the inputs and feeds them to layer 2, layer 2 feeds layer 3, and layer 3 outputs. A recurrent neural net would take inputs at layer 1 and feed them to layer 2, but then layer 2 might feed both layer 1 and layer 3. Since a "higher" layer feeds its outputs back into a "lower" layer, this creates a cycle inside the neural net.
Back propagation, however, is the method by which a neural net is trained. It doesn't have much to do with the structure of the net, but rather implies how input weights are updated.
When training a feed forward net, the info is passed into the net, and the resulting classification is compared to the known training sample. If the net's classification is incorrect, the weights are adjusted backward through the net in the direction that would give it the correct classification. This is the backward propagation portion of the training.
So a CNN is a feed-forward network, but is trained through back-propagation.
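To make the distinction concrete, here is a toy sketch with a single sigmoid unit: `forward` is the network (structure), while `backward_step` is one training update (method). The squared-error loss, the learning rate and all names are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w, b):
    """Feed forward: information flows from input to output only."""
    return sigmoid(w * x + b)

def loss(x, target, w, b):
    return 0.5 * (forward(x, w, b) - target) ** 2

def backward_step(x, target, w, b, lr=0.5):
    """One back-propagation update: push the output error back to w and b."""
    y = forward(x, w, b)
    delta = (y - target) * y * (1.0 - y)   # dLoss/dz via the chain rule
    return w - lr * delta * x, b - lr * delta

w, b = 0.5, 0.0
loss_before = loss(1.0, 1.0, w, b)
w, b = backward_step(1.0, 1.0, w, b)
loss_after = loss(1.0, 1.0, w, b)          # smaller than loss_before
```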
In short,
A CNN is a feed forward neural network.
Backward propagation is a technique that is used for training neural networks.
Similar to tswei's answer but perhaps more concise.
A convolutional neural network is a feed forward NN architecture that uses multiple sets of weights (filters) that "slide" or convolve across the input space to analyze distance-pixel relationships, as opposed to individual node activations.
Backward propagation is a method to train neural networks by "back propagating" the error from the output layer to the input layer (including hidden layers).
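The "sliding" of a filter can be sketched in one dimension. This is a minimal illustration with made-up numbers; like most deep learning libraries it does not flip the kernel, so strictly speaking it computes a cross-correlation. `conv1d_valid` is a hypothetical helper name.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal, one dot product per position.
    The same weights are reused at every position (no padding)."""
    n_positions = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n_positions)]

# A simple difference filter applied to a ramp.
edges = conv1d_valid([1, 2, 3, 4], [1, 0, -1])   # [-2, -2]
```

Nothing in this computation feeds back on itself, which is why a convolutional layer is still a feed forward layer.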

How do I take a trained neural network and implement it in another system?

I have trained a feedforward neural network in Matlab. Now I have to implement this neural network in C language (or simulate the model in Matlab using mathematical equations, without using direct functions). How do I do that? I know that I have to take the weights and bias and activation function. What else is required?
There is no point in representing it as a mathematical function because it won't save you any computations.
Indeed, all you need is the weights, biases, activation function and your architecture. Assuming it is a simple feedforward network as you said, you need to implement matrix multiplication and addition in C, plus the activation function. After that, your feedforward NN is ready to run. If the C code will not be used for training, there is no need to implement the backpropagation algorithm in C.
A feedforward layer would be implemented as follows:
Output = Activation_function(Input * weights + bias)
Where,
Input: (1 x number_of_input_parameters_for_this_layer)
Weights: (number_of_input_parameters_for_this_layer x number_of_neurons_for_this_layer)
Bias: (1 x number_of_neurons_for_this_layer)
Output: (1 x number_of_neurons_for_this_layer)
The output of a layer is the input to the next layer.
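The layer formula above can be sketched without any library, using explicit loops that translate almost line for line to C. The ReLU activation and the numbers are placeholders chosen for illustration; in practice they would be the activation and the exported weights of the trained network.

```python
def dense_forward(inputs, weights, bias, activation):
    """inputs: n_in values; weights: n_in x n_out; bias: n_out values.
    Returns Activation_function(Input * weights + bias) as n_out values."""
    outputs = []
    for j in range(len(bias)):                 # one output per neuron
        z = bias[j]
        for i in range(len(inputs)):           # weighted sum over the inputs
            z += inputs[i] * weights[i][j]
        outputs.append(activation(z))
    return outputs

def relu(z):
    return max(0.0, z)

x = [1.0, 2.0]                                 # 1 x 2 input
W = [[1.0, -1.0, 0.5],
     [0.5, 1.0, -1.0]]                         # 2 x 3 weights
b = [0.0, 1.0, 0.0]                            # 1 x 3 bias
hidden = dense_forward(x, W, b, relu)          # [2.0, 2.0, 0.0]
```

Chaining layers is then just a matter of passing `hidden` as the `inputs` of the next call.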
After some days of searching, I have found the following webpage to be very useful: http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
The picture below shows a simple feedforward neural network. Picture taken from the above website.
In this figure, the circles denote the inputs to the network. The circles labeled “+1” are called bias units, and correspond to the intercept term. The leftmost layer of the network is called the input layer, and the rightmost layer the output layer (which, in this example, has only one node). The middle layer of nodes is called the hidden layer, because its values are not observed in the training set. In this example, the neural network has 3 input units (not counting the bias unit), 3 hidden units, and 1 output unit.
The mathematical equations representing this feedforward network are
This neural network has parameters (W, b) = (W(1), b(1), W(2), b(2)), where we write W_ij(l) to denote the parameter (or weight) associated with the connection between unit j in layer l, and unit i in layer l+1. (Note the order of the indices.) Also, b_i(l) is the bias associated with unit i in layer l+1.
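Written out with this notation for the 3-3-1 network described above, the forward pass would read as follows. Here f stands for the transfer function and a_i^{(2)} for the activation of hidden unit i; both symbols are assumed from the linked tutorial rather than defined in the text.

```latex
\begin{aligned}
a_1^{(2)} &= f\big(W_{11}^{(1)} x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)}\big) \\
a_2^{(2)} &= f\big(W_{21}^{(1)} x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)}\big) \\
a_3^{(2)} &= f\big(W_{31}^{(1)} x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)}\big) \\
h_{W,b}(x) &= a_1^{(3)} = f\big(W_{11}^{(2)} a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)}\big)
\end{aligned}
```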
So, from the trained model, as Mido mentioned in his answer, we have to take the input weight matrix W(1), the layer weight matrix W(2), the biases, the hidden layer transfer function and the output layer transfer function. After this, use the above equations to compute the output h_W,b(x). A popular choice for a regression problem is a tan-sigmoid transfer function in the hidden layer and a linear transfer function in the output layer.
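A minimal sketch of that recipe for a 3-3-1 network, using the index convention above (W[i][j] connects unit j of the previous layer to unit i of the next). The tan-sigmoid is mathematically the same function as tanh. The weight values are placeholders; real ones would be exported from the trained Matlab model.

```python
import math

def layer(W, b, a, f):
    """One layer: W[i][j] links unit j of the previous layer to unit i here."""
    return [f(sum(W[i][j] * a[j] for j in range(len(a))) + b[i])
            for i in range(len(b))]

def predict(x, W1, b1, W2, b2):
    hidden = layer(W1, b1, x, math.tanh)          # tan-sigmoid hidden layer
    return layer(W2, b2, hidden, lambda z: z)[0]  # single linear output unit

# Placeholder parameters (hypothetical, not from a trained network).
W1 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 1.0, 1.0]]
b2 = [0.5]
y = predict([0.0, 0.0, 0.0], W1, b1, W2, b2)      # tanh(0) terms vanish, leaving the bias 0.5
```

If the network was trained on normalised inputs and outputs, the same scaling would also have to be reproduced around `predict`.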
For those who use Matlab, these links are highly useful:
try to simulate neural network in Matlab by myself
Neural network in MATLAB
Programming a Basic Neural Network from scratch in MATLAB