I am using the Stanford CRF classifier in Java. When I train the classifier on small data sets of up to 40,000 words, it works fine, but when I increase the training data and try to train it on 170,000 words, the program gets stuck after only two or three iterations. The same thing happens even if I give the program up to 4 GB of heap space. I am using the edu.stanford.nlp.ie.crf.CRFClassifier class.
This is my first post so please ask me if something is not clear.
I am currently training a neural network on a custom dataset that I have created. This dataset consists of 1000 folders, each containing 81 images (512x512 px) that are loaded, processed, and used as input. My issue is that my computer cannot handle such a large dataset, and I have to find a way to use the whole dataset.
The neural network that I am working on can be found here https://github.com/chshin10/epinet.
In EPINET_train.py you can see the data generator that is being used.
The neural network uses the RMSProp optimizer.
What I did to deal with this issue is split the data into two folders, one for training and one for testing, with an 80%-20% split. Then I load 10% of the data from each folder in order to train the neural network (the data was not chosen randomly). I train the neural network for 100 epochs, then I load the next set of data, until all of the sets have been used for training. Then I repeat the whole procedure.
After 3 iterations it seems that the loss is no longer decreasing for each set of data. Is this approach used in similar scenarios? Is there something I can do better?
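For clarity, the procedure above boils down to something like this (a sketch; load_chunk is a hypothetical helper standing in for my loading code, not part of the EPINET repository):

    # Sketch of the chunked-training procedure described above.
    # load_chunk(i) is a hypothetical helper that returns the i-th 10% slice
    # of the training folders as (inputs, targets) NumPy arrays.
    def train_in_chunks(model, num_chunks=10, epochs_per_chunk=100, rounds=3):
        for _ in range(rounds):            # repeat the whole procedure
            for i in range(num_chunks):    # load the next set until all sets are used
                x_chunk, y_chunk = load_chunk(i)
                model.fit(x_chunk, y_chunk, epochs=epochs_per_chunk)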
Is it possible to use TensorFlow or a similar library to build a model that you can efficiently train and use at the same time?
An example use case for this would be a chatbot that you give feedback to, somewhat like how pets learn (i.e., repeating what they just did for a reward), or being able to add new entries or new responses it can use.
I think what you are asking is whether a model can be trained continuously without having to retrain it from scratch each time new labelled data comes in.
The answer to that is: online models.
There are models that can be trained continuously on new data without having to train them from scratch. As per the Wikipedia definition:
Online machine learning is a method of machine learning in which data becomes available in sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques which generate the best predictor by learning on the entire training data set at once.
Some examples of such algorithms are listed below (a short scikit-learn partial_fit sketch follows the list):
BernoulliNB
GaussianNB
MiniBatchKMeans
MultinomialNB
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
SGDClassifier
SGDRegressor
DNNs
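As a minimal sketch of what online updating looks like in practice with scikit-learn's partial_fit API (the toy arrays here are made up purely for illustration):

    # Toy sketch of incremental training with scikit-learn's partial_fit.
    import numpy as np
    from sklearn.linear_model import SGDClassifier

    clf = SGDClassifier()
    classes = np.array([0, 1])                 # all classes must be declared on the first call

    # First batch of labelled data
    X1, y1 = np.array([[0.0, 1.0], [1.0, 0.0]]), np.array([0, 1])
    clf.partial_fit(X1, y1, classes=classes)

    # Later, when new labelled data arrives, update the same model in place
    X2, y2 = np.array([[0.2, 0.9], [0.9, 0.1]]), np.array([0, 1])
    clf.partial_fit(X2, y2)

    print(clf.predict([[0.1, 0.8]]))           # uses everything learned so far

Each call to partial_fit makes only an incremental update, so the model never needs to see the full data set at once.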
As the title says, how can I determine the architecture or build a reasonable model for training a neural network with regard to the number of examples?
For example, assuming that I have roughly 50 thousand images and have successfully converted all of the data into the format the model expects, so it is ready for training, how can I choose a model that is suitable for training a neural network? I am sometimes a little confused when I have the data but do not know how to choose an initial model for training a neural network.
Fine-tuning is the way to go.
Sometimes you have a pre-trained CNN that you can use as a starting point for your domain. For more about fine-tuning, you can check here.
With that in mind, my advice is to fine-tune a pre-trained neural network that you can find in Keras (this page, under "Available models") or TensorFlow. You can go deeper as long as you are confident in your training set!
In any case, you should look at the number of samples per class rather than the absolute number of images in your training set. If you are confident, you can choose a state-of-the-art deep learning architecture and try to train it from scratch.
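As a rough sketch of the fine-tuning workflow in Keras (MobileNetV2 is just one of the "Available models"; num_classes, train_data and val_data are placeholders you would replace with your own):

    # Rough fine-tuning sketch: freeze a pre-trained backbone, train a new head,
    # then optionally unfreeze and continue with a small learning rate.
    import tensorflow as tf

    num_classes = 10   # number of classes in your own dataset

    base = tf.keras.applications.MobileNetV2(
        weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False                       # freeze the pre-trained features first

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),  # new classification head
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # model.fit(train_data, validation_data=val_data, epochs=5)

    # To go deeper, unfreeze the backbone and fine-tune with a small learning rate
    base.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # model.fit(train_data, validation_data=val_data, epochs=5)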
I'm trying to build a neural machine translation model that translates LaTeX code into English.
An example for this would be: "\frac{1^{n}}{12}" -> "One to the power of n divided by 12".
The problem is that I don't have nearly enough labeled training data to produce a good result.
Is there a way to train the model with a small data set, or to artificially increase the amount of training data for this problem?
I have found approaches for machine translation without parallel data that build a dictionary "by aligning monolingual word embedding spaces in an unsupervised way". These approaches seem to rely on the fact that human languages are very similar in nature, but LaTeX code is very different from human languages, and I don't think that is going to yield great results.
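To illustrate what I mean by artificially increasing the training data: since a LaTeX fragment and an English rendering can be produced together programmatically, synthetic pairs could be generated from templates. A rough sketch (the templates below are only illustrative, not a complete grammar):

    # Rough sketch of generating synthetic LaTeX/English training pairs from templates.
    import random

    NUMBERS = [str(n) for n in range(1, 20)]
    VARIABLES = ["n", "m", "k", "x"]

    def make_pair():
        a, b = random.choice(NUMBERS), random.choice(NUMBERS)
        v = random.choice(VARIABLES)
        templates = [
            (r"\frac{%s^{%s}}{%s}" % (a, v, b),
             "%s to the power of %s divided by %s" % (a, v, b)),
            (r"%s + %s" % (a, b), "%s plus %s" % (a, b)),
            (r"\sqrt{%s}" % a, "the square root of %s" % a),
        ]
        return random.choice(templates)

    synthetic_corpus = [make_pair() for _ in range(10000)]
    print(synthetic_corpus[0])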
I'm new to deep learning and I was using MATLAB's deep learning toolbox.
I wanted to run "test_example_SAE.m", which builds a stacked auto-encoder and trains and tests it on the MNIST dataset, but I couldn't because of this error:
    Error using horzcat
    Out of memory. Type HELP MEMORY for your options.
How much memory does this job need? I mean, can I run the deep learning toolbox code on an average PC with 4 GB of RAM, or should I learn to run the code on a GPU?
It happened to me as well. If you can, decrease the number of samples.
Also, if you are running 'for' loops, try to replace them with vectorized operations.
One more hint is to divide your data into pieces (if possible) and run it piece by piece.
Those are just my suggestions; running it on a GPU will make no difference for the memory problem.
You can also try using cluster computing if it is available at your job or university.