I tried to get and set the weights and biases of a trained LSTM in MATLAB, but I failed.
Does anyone know how I can get and set the weights and biases of an LSTM? Functions like getwb(net) did not work.
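In case it is useful, here is a minimal sketch assuming the LSTM was trained with trainNetwork from the Deep Learning Toolbox (getwb/setwb only apply to the older shallow network objects). The layer index and the scaling below are purely illustrative:

idx   = 2;                                 % position of the lstmLayer in net.Layers (adjust for your architecture)
W_in  = net.Layers(idx).InputWeights;      % 4*numHiddenUnits-by-inputSize
W_rec = net.Layers(idx).RecurrentWeights;  % 4*numHiddenUnits-by-numHiddenUnits
b     = net.Layers(idx).Bias;              % 4*numHiddenUnits-by-1

% To set new values, edit a copy of the layer array and reassemble it:
layers = net.Layers;
layers(idx).InputWeights = 0.5 * W_in;     % illustrative modification
net2 = assembleNetwork(layers);            % same architecture with the modified weights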
Related
What is the bias weight in a MATLAB perceptron?
I implemented the OR-gate perceptron using nntool. It works OK, but what is the contribution of the bias weight in the case of the nntool perceptron?
If you interpret the perceptron as a linear classifier, then you can interpret the perceptron itself as an algorithm that says whether your input lies on one side of a line or the other. The bias parameter then controls the offset of that line from the origin: without a bias the line would always pass through the origin, and for a unit-norm weight vector |b| is exactly the line's distance from the origin.
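A toy sketch of that interpretation; the weights, bias and input below are made up rather than taken from nntool:

w = [2; 1];               % weight vector, i.e. the normal of the decision line
b = -3;                   % bias
x = [1; 2];               % an input point
y = hardlim(w'*x + b);    % 1 if x lies on the positive side of the line w'*x + b = 0

% With b = 0 the line w'*x = 0 has to pass through the origin; the bias shifts
% it away from the origin by |b|/norm(w).
offset = abs(b) / norm(w);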
I am working on a regression problem with the following sample training data.
As shown, I have an input of only 4 parameters, with only one of them (Z) actually changing, so the rest carry no real information, and an output of 124 parameters denoted O1 to O124.
Note that O1 changes at a constant rate of 20 (1000, then 1020, then 1040, ...), while O2 changes at a different but still constant rate of 30, and the same applies to all 124 outputs: everything changes linearly at a constant rate.
I believed this was a trivial problem and that a very simple neural network model would reach 100% accuracy on the test data, but the results were the opposite.
I reached 100% test accuracy using a linear regressor and 99.99997% test accuracy using a KNN regressor.
I reached 41% test accuracy with a 10-layer neural network using ReLU activation, while all the other activation functions failed, and a shallow ReLU network failed as well.
Using a simple neural network with a linear activation function and no hidden layers, I reached 92% on the test data.
My question is: how can I get the neural network to reach 100% on the test data, like the linear regressor does?
A shallow network with linear activation is supposed to be equivalent to the linear regressor, but the results are different. Am I missing something?
If you use linear activations, a deep model is in principle the same as a linear regression / an NN with one layer. E.g., for a deep NN with linear activations the prediction is given as y = W_3(W_2(W_1 x)), which can be rewritten as y = (W_3 W_2 W_1) x; writing W_4 = W_3 W_2 W_1, this is y = W_4 x, which is a linear regression.
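A quick numerical check of that collapse (the matrix sizes here are arbitrary):

rng(0);
W1 = randn(5, 4);  W2 = randn(3, 5);  W3 = randn(2, 3);
x  = randn(4, 1);

y_deep   = W3 * (W2 * (W1 * x));   % forward pass through the "deep" linear net
W4       = W3 * W2 * W1;           % the equivalent single layer
y_single = W4 * x;

max(abs(y_deep - y_single))        % zero up to floating-point round-off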
Given that, check whether your NN without a hidden layer converges to the same parameters as your linear regression. If it does not, your implementation is probably wrong. If it does, then your larger NN probably converges to some other solution of the problem for which the test accuracy is simply worse; in that case, try different random seeds.
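A sketch of that check, assuming the training inputs and targets are stored row-wise in matrices X (N-by-4) and Y (N-by-124); the no-hidden-layer network should converge to roughly these least-squares parameters:

Xa    = [X, ones(size(X, 1), 1)];   % append a ones column to absorb the bias term
Wb    = Xa \ Y;                     % 5-by-124: 4 rows of weights plus 1 row of biases
Y_hat = Xa * Wb;                    % predictions to compare with the network's output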
Here is my scenario
I have used the EMNIST database of capital letters of the English language.
My neural network is as follows:
The input layer has 784 neurons, which are the pixel values of a 28x28 greyscale image divided by 255, so each value is in the range [0, 1].
The hidden layer has 49 neurons, fully connected to the previous 784.
The output layer has 9 neurons denoting the class of the image.
The loss function is defined as the cross-entropy of the softmax of the output layer.
All weights are initialized as random real numbers from [-1, +1].
I then trained with 500 fixed samples for each class.
I simply passed the 500x9 images to a train function, which uses backpropagation and runs 100 iterations, changing each weight by learning_rate * derivative_of_loss_wrt_corresponding_weight.
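For reference, a sketch of that update for the output layer, using made-up variable names (z = W*h + b are the 9 logits, h the hidden activations, t the one-hot target):

p  = exp(z - max(z)) / sum(exp(z - max(z)));   % numerically stable softmax
dz = p - t;                                    % dLoss/dz for softmax + cross-entropy
dW = dz * h';                                  % gradient for the output-layer weights
db = dz;                                       % gradient for the output-layer biases
W  = W - learning_rate * dW;                   % the update rule described above
b  = b - learning_rate * db;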
I found that when I use tanh activation on the neurons, the network learns faster than with ReLU at a learning rate of 0.0001.
I concluded that because the accuracy on a fixed test dataset was higher for tanh than for ReLU. Also, the loss value after 100 epochs was slightly lower for tanh.
Isn't ReLU expected to perform better?
In general, no. ReLU will perform better on many problems, but not all problems.
Furthermore, if you use an architecture and set of parameters that is optimized to perform well with one activation function, you may get worse results after swapping in a different activation function.
Often you will need to adjust the architecture and parameters like learning rate to get comparable results. This may mean changing the number of hidden nodes and/or the learning rate in your example.
One final note: in the MNIST example architectures I have seen, hidden layers with ReLU activations are typically followed by Dropout layers, whereas hidden layers with sigmoid or tanh activations are not. Try adding dropout after the hidden layer and see if that improves your results with ReLU. See the Keras MNIST example here.
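Since the rest of this thread is MATLAB-centric, here is the suggested variant sketched with Deep Learning Toolbox layers instead of Keras; the 0.5 dropout rate is a common default, not something taken from the question:

layers = [
    imageInputLayer([28 28 1])
    fullyConnectedLayer(49)
    reluLayer
    dropoutLayer(0.5)          % dropout only after the ReLU hidden layer
    fullyConnectedLayer(9)
    softmaxLayer
    classificationLayer];      % softmax + cross-entropy, as in the question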
Suppose I am using the MATLAB probabilistic neural network example for a classification problem, as given (click here).
The weights and the biases for the network can be determined as follows:
weights = net.LW    % cell array of layer weight matrices (the input weights are in net.IW)
biases = net.b      % cell array of bias vectors, one per layer
My question is: how do I get an equation describing the model? I am a beginner in neural networks; any detailed explanation/sample code would be most helpful.
Thanks!
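For what it is worth, here is a sketch of how those stored parameters define the model, assuming the standard two-layer structure created by newpnn (a radial-basis layer followed by a competitive layer); x is one input column vector and spread is the value passed to newpnn:

IW = net.IW{1,1};                 % stored training patterns, one per row
b1 = net.b{1};                    % 0.8326/spread, which sets the radial-basis width
LW = net.LW{2,1};                 % 0/1 matrix mapping each stored pattern to its class

a1 = radbas(dist(IW, x) .* b1);   % exp(-(b1*||x - pattern_i||)^2) for every stored pattern
a2 = compet(LW * a1);             % per-class sums; compet marks the largest with a 1

In equation form, the predicted class is the k that maximizes the sum, over the training patterns x_i of class k, of exp(-(0.8326/spread)^2 * ||x - x_i||^2).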
I wrote a multilayer perceptron and am trying to approximate the sine function.
My network contains only a single hidden layer with 50 neurons (the input layer and output layer each have only 1 neuron, of course). The activation function used in the hidden layer is tanh, and the output layer is linear. The learning rate is set to 0.0001, with momentum 0.9 (standard momentum, not Nesterov momentum). Training is done online (one sample at a time), since the data is generated without noise. Weights and biases are initialized randomly with mean 0.
After 10000 epochs, my network's result is plotted below (the upper image is the real sine function, the lower image is my network's output). Although it is not too bad, I cannot reproduce the sine function exactly.
Can anyone give me advice on a better configuration for better error convergence?
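For comparison, a minimal baseline sketch using fitnet, which builds the same 1-50-1 tanh/linear architecture and trains it with Levenberg-Marquardt by default; the sample range below is illustrative:

x = linspace(-pi, pi, 200);     % illustrative noise-free training data
y = sin(x);

net = fitnet(50);               % one hidden layer: 50 tansig neurons, linear output
net = train(net, x, y);         % trains with trainlm by default

y_hat = net(x);
max(abs(y_hat - y))             % residual error over the training range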