Neural network data for training and testing

I have a question about training and testing data for my ANN.
Should the test data go through a feature extraction process before it can be classified?
I am new to this field. Is what I am doing right?
I split the dataset into 80% training and 20% test data and extract features from both sets. The training features are fed into the network for training; the test features are not. Then I move on to classification. Is this correct? I ask because my supervisor said the test data should not go through the feature extraction process. I am wondering how the ANN can recognize the input if no specific features are extracted. Apologies for my bad English.
If anyone has a link or journal article that I can refer to, please provide it.
Thanks a lot.

Both the training and the test data need to be in the same format, so your training data and test data should go through the same pre-processing steps; otherwise the network cannot classify the test data correctly.

You are doing it right (as far as I understand your question).
Example: If you were to show me 10 images of faces (training data) on paper and then present 2 people (test data) by their name only (a different feature representation), I wouldn't be able to classify them, because I can't classify what I didn't learn. You can't train the network with images and then test it with audio or any representation other than the one you used for training. I can't link any papers for that as it's just common sense.
You can modify the training set, e.g. by adding noise. But whatever you do, the representation format has to be the same.
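As a minimal sketch of "same pre-processing for both sets", the snippet below uses standardization as a stand-in for the feature extraction step (an assumption; the question does not say which features are extracted). The helper names `fit_scaler` and `transform` and the toy numbers are hypothetical. The parameters are computed from the training data only and then applied unchanged to the test data.

```python
from statistics import mean, pstdev

def fit_scaler(rows):
    """Compute per-feature mean and standard deviation from the training rows only."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Replace a zero standard deviation with 1.0 so the division below is defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def transform(rows, scaler):
    """Apply an already fitted scaler to any set of rows (training or test)."""
    means, stds = scaler
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Toy data, made up for illustration.
train_raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
test_raw = [[2.5, 25.0]]

scaler = fit_scaler(train_raw)                  # fitted on training data only
train_features = transform(train_raw, scaler)   # same steps ...
test_features = transform(test_raw, scaler)     # ... applied to the test data
```

Both sets end up with the same number of features on the same scale, which is what the network expects at classification time.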

Related

Loading a dataset in parts for training a neural network

This is my first post so please ask me if something is not clear.
I am currently working on training a neural network on a custom dataset that I have created. This dataset consists of 1000 folders which contain 81 images (512x512 px) each that are going to be loaded, processed and used as an input. My issue is that my computer cannot handle such a large dataset and I have to find a way to use the whole dataset.
The neural network that I am working on can be found here https://github.com/chshin10/epinet.
On the EPINET_train.py file you can see the data generator that is being used.
The neural network uses the RMSProp optimizer.
What I did to deal with this issue is split the data into 2 folders, one for training and one for testing, with an 80%-20% split. Then I load 10% of the data from each folder in order to train the neural network (the data was not chosen randomly). I train the neural network for 100 epochs and then I load the next set of data, until all of the sets have been used for training. Then I repeat the procedure.
After 3 iterations it seems to me that the loss is no longer decreasing for each set of data. Is this approach used in similar scenarios? Is there something I can do better?
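The loading scheme described above can be sketched roughly as follows. This is only an illustration: `iter_chunks` is a hypothetical helper, the folder ids are stand-ins for the real folders, and the shuffling is an added assumption (the post says the data was not chosen randomly). Actually loading the images and training on each chunk is left out.

```python
import random

def iter_chunks(sample_ids, chunk_size, seed=None):
    """Yield the sample ids in shuffled chunks so only one chunk is held at a time."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    for start in range(0, len(ids), chunk_size):
        yield ids[start:start + chunk_size]

# 800 training folders, i.e. 80% of the 1000 folders in the split described above.
folder_ids = list(range(800))

# Chunks of 10% of the training folders each.
chunks = list(iter_chunks(folder_ids, chunk_size=80, seed=0))
```

Each chunk would then be loaded, used for training, and released before the next one is read.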

Predict a number with a given image (0 to 1)

I am a total beginner to ML and Neural networks. I am currently working on a project where I have a lot of pictures stored in a MongoDB database. Each one of those pictures has a number from 0 to 1. For example "picture 1" 0.71.
I want to train my model on the database. The main goal of the project is that, once the model is trained, it will be able to return (predict) a number from 0 to 1 for a given image. After doing some research and asking a few people, I figured out that some libraries that would be useful for the project are TensorFlow and Keras. Some people told me that it is impossible, but I'm not sure, so I came to ask here.
So my questions are: Is it possible? If so, how can I implement it? Are there any specific tools you recommend? If you specify a way that I should use for my project do I need to export my MongoDB database in a certain form? Since I am a beginner maybe there are some tutorials that you think that can help?
I'm sorry if this question is a bit too general. If anything is unclear, please comment and I will try to answer.
Thanks in advance!
What you want to do is totally feasible. This kind of problem is called regression. Since you are working with image data, the best-suited models are convolutional neural networks (CNNs); you'll need some understanding of them if you want to build your own model. I've done a project where I had to predict the number of bacterial colonies from an image, much like your problem except that my predicted values were not bounded.
What is a CNN? Here is a link
Basically, a CNN learns the features in the images and uses those features to predict a value.
You won't need to design your own model; most people just use a well-designed one from the scientific literature.
Go for Keras; it's the easiest framework out there and works like a charm. Here is how to implement VGG16 (an architecture that is probably the best for your problem): link
You should follow this tutorial to get going with Keras.
Last hint: don't use the same last layer as the one in the VGG16 implementation; use a Dense layer with one neuron and a sigmoid, linear, or leaky ReLU activation.
i.e.:
#model.add(Dense(1000, activation='softmax'))
model.add(Dense(1, activation='sigmoid'))
This means: predict one number (sigmoid will bound it between 0 and 1, but leaky ReLU or linear may work better).
Also, I guess you could use MongoDB to read the images as arrays, but I would just put the images in a folder.
Edit: When compiling the model, use mean squared error as the loss, as in
adam = keras.optimizers.Adam(learning_rate=1e-4)  # the argument was called `lr` in older Keras versions
model.compile(optimizer=adam, loss='mse')
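To make the role of the sigmoid output and the mean squared error loss concrete, here is a small framework-free sketch. It is not Keras code; the raw outputs and target values are made up for illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

# Made-up raw outputs of the last neuron before the activation is applied.
raw_outputs = [-3.0, 0.0, 3.0]
predictions = [sigmoid(z) for z in raw_outputs]

# Made-up image labels in [0, 1], like the 0.71 from the question.
targets = [0.10, 0.71, 0.95]
loss = mean_squared_error(targets, predictions)
```

Whatever the raw output is, the prediction stays strictly between 0 and 1, and the loss shrinks as the predictions move toward the labels.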
Here you have the "hello world" program of neural networks: digit classification. It is worth studying, because I think you will end up with a similar architecture for your NN. What you should focus on is the output of your model: in this example they perform classification over 10 classes (digits 0 to 9), but you are trying to predict a real number. You could try a single neuron with a sigmoid or linear activation at the end of your model.

OCR software or homemade CNN for document processing?

I have a dilemma. Suppose you have only one type of invoice/document and a specific field that you want to extract from that invoice and use somewhere else (that field happens to be a handwritten number, sometimes written with dashes or slashes). Would you use some OCR software or build your own CNN for recognizing the digits? What accuracy would you expect from OCR? Would your CNN be more accurate, given that you are only interested in a specific style of digit writing, with specific image dimensions, etc.? Which would be better in this situation?
Keep in mind that you would not use it anywhere else for handwritten digit recognition, and that you already have 100k or more documents that were transcribed into a computer by a human, which you can use for training and testing.
Thank you.
I would definitely go for a CNN-based solution. Since the structure of your document is consistent:
1. Extract the desired portion of the document with a standard computer vision approach.
2. Train a CNN on an annotated set of a few thousand documents. You should even be able to fine-tune an existing CNN trained on MNIST, which would require fewer training images.
This approach should give you >99% accuracy without much effort. The accuracy of the OCR solution really depends on which library you use and the preprocessing you implement.
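Because the document layout is consistent, the extraction step can be as simple as cropping a fixed rectangle. The sketch below illustrates this on a tiny stand-in image stored as a list of pixel rows; `crop_field` and the field coordinates are hypothetical, and real scans may first need alignment.

```python
def crop_field(image, top, left, height, width):
    """Return the rectangular region of a 2D image given as a list of pixel rows."""
    return [row[left:left + width] for row in image[top:top + height]]

# A tiny 4x6 stand-in for a scanned document; real scans would be far larger.
page = [[r * 10 + c for c in range(6)] for r in range(4)]

# Hypothetical field location: rows 1-2, columns 2-4.
field = crop_field(page, top=1, left=2, height=2, width=3)
```

The cropped region is what would be passed on to the digit classifier.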

How do I determine the architecture for deep NN training according to the number of examples?

As the title says, how can I determine the architecture or build a reasonable model for training a neural network with regards to the number of examples?
For example, assuming that I have roughly 50 thousand images and I have successfully converted all data to fit the model which means they are ready for training, how can I choose a model that is suitable for training a neural network? I am a little bit confused sometimes when I have data but I did not know how to initiate a model for training NN.
Fine-tuning is the way
Sometimes you have a pre-trained CNN that you can use as a starting point for your domain. For more about fine-tuning, you can check here.
Accordingly, my advice is to fine-tune a pre-trained neural network that you can find in Keras (this page, under "Available models") or TensorFlow. The more confident you are in your training set, the deeper the network you can use!
In any case, you need to look at the number of samples per class rather than the absolute number of images in your training set. If you are confident in your data, you can choose a state-of-the-art deep learning architecture and try to train it from scratch.
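Checking the number of samples per class takes only a few lines. A minimal sketch, assuming the labels are available as a plain list (the label names and counts here are made up):

```python
from collections import Counter

def samples_per_class(labels):
    """Count how many training samples carry each class label."""
    return Counter(labels)

# Made-up labels for illustration; in practice there is one label per image.
labels = ["cat"] * 5 + ["dog"] * 3 + ["bird"] * 1

counts = samples_per_class(labels)
# The rarest class is often the limiting factor, not the total number of images.
rarest_class, rarest_count = min(counts.items(), key=lambda item: item[1])
```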

Why do we need training and test datasets in research?

I'm a newbie in the research area of data mining (text clustering) and I have a couple of questions regarding training and test datasets.
Does clustering need training and test datasets?
Why do we need to separate data into training and test datasets?
Sorry for the rookie question; I hope the experts in this group can help me.
As your question is on clustering:
In cluster analysis, there usually is no split into training and test data, because you do cluster analysis when you do not have labels, so you cannot "train".
Training is a concept from machine learning, and train-test splitting is used to detect (and thus avoid) overfitting.
But if you are not learning labels, you cannot overfit.
Properly used cluster analysis is a knowledge discovery method. You want to discover some new structure in your data, not rediscover something that is already labeled.
To train your algorithm you need a set of relevant data that is similar but not identical to your testing data. For example, you could split up your data so that 70% is used for training and the rest for testing. The training portion lets your algorithm get a feel for what it should be looking for. The remaining 30% can be used for testing, as it is (hopefully) a distinct set of information on which the algorithm can be evaluated.
Why split it up?
Well, if you train your algorithm on data A and then test it on data A, it will be able to identify all the information correctly, because that is what it was trained on.
For example, if when learning addition you were given the sums 3+4, 4+5 and 6+9 and solved them correctly, it would be pointless to test your knowledge of addition using the same sums.
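A minimal sketch of such a 70/30 split is shown below. The helper name `train_test_split`, the fixed seed and the document ids are assumptions made for illustration; libraries such as scikit-learn ship their own function for this.

```python
import random

def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into a training and a test part."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Made-up document ids for illustration.
documents = [f"doc_{i}" for i in range(10)]
train_docs, test_docs = train_test_split(documents)
```

No document appears in both parts, so the test part contains only data the algorithm has not seen during training.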
further information:
http://en.wikipedia.org/wiki/Natural_language_processing
http://www.nltk.org/book
Hope this helps.