Closed. This question is opinion-based. It is not currently accepting answers. Closed 4 years ago.
I am implementing a neural network model for text classification, and I am trying different RNN and LSTM configurations.
My question: how should I compare these configurations? Should I use training set accuracy, validation accuracy, or test set accuracy?
I will explain how I finally compared my different RNN models.
First, I used my CPU for model training. This ensures that I get the same model parameters on each run, since GPU computations are known to be non-deterministic.
Second, I used the same TensorFlow seed for each run, so that the random variables generated in each run are the same.
Finally, I used validation accuracy to optimize my hyper-parameters: on each run I tried a different combination of parameters, and I chose the model with the highest validation accuracy as my best model.
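The seeding idea above can be sketched in plain Python. With TensorFlow you would also call `tf.random.set_seed(seed)` (and `numpy.random.seed(seed)`); here `fake_weight_init` is a hypothetical stand-in for a model's random weight initialisation, just to show that identical seeds give identical runs:

```python
import random

SEED = 1234

def set_seeds(seed):
    # In a real TF setup you would also seed TensorFlow and NumPy here;
    # this sketch seeds only Python's built-in RNG.
    random.seed(seed)

def fake_weight_init(n):
    # Hypothetical stand-in for a model's random weight initialisation.
    return [random.gauss(0.0, 1.0) for _ in range(n)]

set_seeds(SEED)
run1 = fake_weight_init(5)
set_seeds(SEED)
run2 = fake_weight_init(5)
# run1 == run2: same seed, same "initialisation" on every run
```

With seeding in place, any accuracy difference between two runs can be attributed to the hyper-parameters rather than to random initialisation.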
Closed. This question is opinion-based. It is not currently accepting answers. Closed 2 years ago.
I am working on a small computer vision project and I'm using convolutional nets for classification. I have already used dropout, L1/L2 regularization, and data augmentation to reduce overfitting. Are there any other techniques and algorithms for improving model accuracy and reducing overfitting?
There could be a hundred solutions; a few to try:
Use a pretrained model (transfer learning).
Try a smaller network.
Use a bigger data set.
Try different hyper-parameters (learning rate, batch size, ...).
Use grid search over these parameters.
Try data augmentation on your training data set.
...
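The grid-search suggestion above can be sketched in a few lines of Python. Here `validation_accuracy` is a hypothetical stand-in for training a conv net with the given hyper-parameters and returning its validation accuracy; the learning rates and dropout values are illustrative assumptions:

```python
import itertools

def validation_accuracy(lr, dropout):
    # Hypothetical stand-in for training a model with these
    # hyper-parameters and returning validation accuracy.
    return 0.8 - abs(lr - 0.001) * 50 - abs(dropout - 0.5) * 0.1

# Exhaustively try every combination and keep the best one.
grid = itertools.product([0.01, 0.001, 0.0001], [0.3, 0.5])
best = max(grid, key=lambda cfg: validation_accuracy(*cfg))
```

In practice each call to `validation_accuracy` is a full training run, so grid search gets expensive quickly; random search over the same ranges is a common cheaper alternative.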
Closed. This question needs to be more focused. It is not currently accepting answers. Closed 4 years ago.
I'm still not sure how clustering can be used for predictive analytics. Can someone tell me how to make predictions from extracted clusters?
Generally, clustering isn't used for prediction but for labeling or analyzing an existing set of data points.
After you use clusters to label your data points and divide them into groups based on common traits, you can run other prediction algorithms on that labeled data to get predictions.
I don't think clustering leads directly to predictions, other than in cases where the clusters are well separated and can be used to make inferences about new data points from the properties of the clusters.
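One minimal way to turn clusters into predictions, as the answer hints, is to assign a new point to its nearest cluster centroid. This sketch assumes the clustering step has already produced two labeled groups (the labels and coordinates are made up for illustration):

```python
# Sketch: predict group membership of a new point via nearest cluster centroid.
def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearest_centroid(point, centroids):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(point, centroids[label]))

# Hypothetical output of a prior clustering step.
clusters = {
    "low_spend": [(1.0, 2.0), (1.5, 1.8), (0.9, 2.2)],
    "high_spend": [(8.0, 9.0), (7.5, 8.8), (8.2, 9.1)],
}
centroids = {label: centroid(pts) for label, pts in clusters.items()}
prediction = nearest_centroid((7.9, 9.0), centroids)  # -> "high_spend"
```

This only works well when the clusters are compact and well separated, which matches the caveat above.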
Closed. This question needs to be more focused. It is not currently accepting answers. Closed 3 years ago.
I am trying to understand the word2vec (word embedding) architecture, and I have a few questions about it:
First, why is the word2vec model considered a log-linear model? Is it because it uses a softmax at the output layer?
Second, why does word2vec remove the hidden layer? Is it just because of computational complexity?
Third, why does word2vec not use an activation function (compared to NNLM, the Neural Network Language Model)?
First, why is the word2vec model a log-linear model? Because it uses a softmax at the output layer?
Exactly: softmax is a log-linear classification model. The intent is to obtain output values that can be interpreted as a posterior probability distribution.
Second, why does word2vec remove the hidden layer? Is it just because of computational complexity?
Third, why does word2vec not use an activation function (compared to NNLM, the Neural Network Language Model)?
I think your second and third questions are linked, in the sense that an extra hidden layer and an activation function would make the model more complex than necessary. Note that while no activation is explicitly formulated, we could consider it a linear classification function. It appears that the dependencies the word2vec models try to capture can be achieved with a linear relation between the input words.
Adding a non-linear activation function allows the neural network to map more complex functions, which could in turn fit the input onto something more complex that doesn't retain the dependencies word2vec seeks.
Also note that linear outputs don't saturate, which facilitates gradient-based learning.
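The "softmax is log-linear" point can be checked numerically: taking the log of a softmax output gives the input score minus a shared normalising constant, so log-probabilities are linear in the scores. A small sketch (the score values are arbitrary):

```python
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5]
probs = softmax(scores)

# Log-linearity: log p_i = s_i - log(sum_j exp(s_j)),
# i.e. linear in the score s_i up to one shared constant.
log_norm = math.log(sum(math.exp(s) for s in scores))
```

So the model's log-probabilities differ from the raw (linear) scores only by a constant, which is exactly the log-linear form.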
Closed. This question needs details or clarity. It is not currently accepting answers. Closed 8 years ago.
I have a CSV data set consisting of one movie's details per line.
The fields are: name, budget, revenue, popularity, runtime, rating, votes, date released.
I'm wondering how to split the data set into training, validation, and test sets?
Then, of course, how to get some results?
It would be nice to get a brief step-by-step intro on where and how I should begin.
You should use nntool. In your case I guess curve fitting is appropriate, so use nftool.
Define your input and output in nftool; then you can randomly divide your data into training, validation, and test sets. In the nftool GUI you can choose how to divide your data (80-10-10 or anything else). Then you follow the interface and set the specifics of the network (e.g. the number of hidden neurons), and train it. After training you can plot the training performance, and depending on the result you can retrain or change the number of hidden neurons, the percentage of training data, and so on.
You can also check this :
http://www.mathworks.com/help/toolbox/nnet/gs/f9-35958.html
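The answer above uses MATLAB's tooling, but the 80-10-10 split itself is easy to do by hand. A sketch in Python (the `movies` rows are made-up placeholders for the CSV lines; shuffling with a fixed seed keeps the split reproducible):

```python
import random

def train_val_test_split(rows, val_frac=0.1, test_frac=0.1, seed=0):
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # reproducible shuffle
    n = len(rows)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]       # remaining ~80%
    return train, val, test

# Hypothetical stand-in for rows parsed from the movie CSV.
movies = [{"name": f"movie_{i}", "rating": i % 10} for i in range(100)]
train, val, test = train_val_test_split(movies)  # 80 / 10 / 10 rows
```

Train on `train`, tune any model settings against `val`, and report final numbers on `test` only once.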
Closed. This question needs to be more focused. It is not currently accepting answers. Closed 3 years ago.
I need to design a neural network which has the following behavior:
p(1)={0,1,1,1} outputs a(1)={0,1,0,0}
p(2)={1,1,0,1} outputs a(2)={0,0,1,0}
p(3)={0,0,1,0} outputs a(3)={0,0,0,1}
p(4)={0,0,1,1} outputs a(4)={1,1,0,1}
How can I do so? Which type of neural network should I use? Which learning method can be used here?
Thanks.
At first glance it seems as though you could use a simple feedforward neural network with one input layer, one hidden layer, and one output layer. You can use your training data to train the neural network with the backpropagation algorithm.
See this page for more details:
http://en.wikipedia.org/wiki/Backpropagation
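A minimal sketch of such a feedforward network, trained with plain backpropagation on the four patterns from the question. The layer sizes (4-6-4), learning rate, and epoch count are illustrative assumptions, not the only workable choices:

```python
import math
import random

# The four input/target pairs from the question.
PATTERNS = [
    ([0, 1, 1, 1], [0, 1, 0, 0]),
    ([1, 1, 0, 1], [0, 0, 1, 0]),
    ([0, 0, 1, 0], [0, 0, 0, 1]),
    ([0, 0, 1, 1], [1, 1, 0, 1]),
]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNet:
    """4-6-4 feedforward network trained with backpropagation."""

    def __init__(self, n_in=4, n_hid=6, n_out=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
        self.b1 = [0.0] * n_hid
        self.w2 = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Output deltas for squared error with sigmoid units.
        d_out = [(yi - ti) * yi * (1 - yi) for yi, ti in zip(y, target)]
        # Hidden deltas, backpropagated through the output weights.
        d_hid = [hi * (1 - hi) * sum(d * self.w2[k][j] for k, d in enumerate(d_out))
                 for j, hi in enumerate(h)]
        for k, d in enumerate(d_out):
            for j in range(len(h)):
                self.w2[k][j] -= lr * d * h[j]
            self.b2[k] -= lr * d
        for j, d in enumerate(d_hid):
            for i in range(len(x)):
                self.w1[j][i] -= lr * d * x[i]
            self.b1[j] -= lr * d
        return sum((yi - ti) ** 2 for yi, ti in zip(y, target))

net = TinyNet()
losses = []
for epoch in range(2000):
    losses.append(sum(net.train_step(x, t) for x, t in PATTERNS))
# The total squared error should drop substantially from losses[0] to losses[-1].
```

With only four patterns this is pure memorisation rather than generalisation, but it demonstrates the feedforward-plus-backpropagation setup the answer describes.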