How much time does it take to train a classifier in Watson? I uploaded around 500 images, it has been 48 hours, and the classifier is still in training.
I am trying to differentiate plant leaves, so I provided images of plant leaves. The total file size is around 50 MB.
Training a visual classifier can take some time, due to the upload speeds most people have and the size of the images being used to train the classifier. Think about how long it would take to transfer the data from the environment you are working in to a data center - that transfer time is the absolute minimum your training will take.
With that being said, I can't imagine the training taking anywhere near that long. With 50 MB of data and 7-9 classes, training should take no more than an hour.
There might have been an error during training; this has happened to me many times. Cancel the training and try retraining the images.
We have a customer requirement to search similar images in a collection using Watson Visual Recognition. The documentation mentions that each collection can contain 1 million images. Thus, I have the following questions:
a) What is the maximum size of the image?
b) Each image upload takes up to 1 second, and the Standard plan has a limit of 25,000 images per day. So can only 25k images be added to the collection per day?
c) The customer has about 2 million images. How can we upload the images faster?
d) Is there a separate plan available for bulk volumes?
This information comes from the Visual Recognition documentation at the following url:
https://www.ibm.com/watson/developercloud/doc/visual-recognition/customizing.html
Size limitations
There are size limitations for training calls and data:
The service accepts a maximum of 10,000 images or 100 MB per .zip file.
The service requires a minimum of 10 images per .zip file.
The service accepts a maximum of 256 MB per training call.
The minimum recommended size of an image is 32x32 pixels.
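These limits are easy to check locally before uploading. Here is a minimal MATLAB sketch (the folder name is just a placeholder) that verifies one class folder against the per-.zip limits before zipping it:

imgFiles = dir(fullfile('leaves_positive', '*.jpg'));   % hypothetical folder with one class's example images
numImages = numel(imgFiles);
totalMB   = sum([imgFiles.bytes]) / 1e6;
if numImages < 10
    warning('Need at least 10 images per .zip (found %d).', numImages);
elseif numImages > 10000 || totalMB > 100
    warning('Over the 10,000 image / 100 MB per-.zip limit (%d images, %.1f MB).', numImages, totalMB);
else
    zip('leaves_positive_examples.zip', {imgFiles.name}, 'leaves_positive');
end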
Guidelines for good training
The following guidelines are not enforced by the API. However, the service tends to perform better when the training data adheres to them:
A minimum of 50 images is recommended in each .zip file, as fewer than 50 images can decrease the quality of the trained classifier.
If the quality and content of training data is the same, then classifiers that are trained on more images will generally be more accurate than classifiers that are trained on fewer images. The benefits of training a classifier on more images plateau at around 5000 images, and this can take a while to process. You can train a classifier on more than 5000 images, but it may not significantly increase that classifier's accuracy.
Uploading a total of 150-200 images per .zip file gives you the best balance between the time it takes to train and the improvement to classifier accuracy. More than 200 images increases the time, and it does increase the accuracy, but with diminishing returns for the amount of time it takes.
Include approximately the same number of images in each examples file. Including an unequal number of images can cause the quality of the trained classifier to decline.
The accuracy of your custom classifier can be affected by the kinds of images you provide to train it. Provide example images that are similar to the images you plan to analyze. For example, if you are training the classifier "tiger", your classifier might be less accurate if you provide only images of tigers in a zoo taken by a mobile phone to train the classifier, but you want to test the classifier on images of tigers in the wild taken by professional photographers.
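As a rough illustration of the 150-200 images and equal-counts guidance above, here is a small MATLAB sketch (the class folder names are made up) that trims every class to the size of the smallest class, capped at about 200 images per .zip:

classes = {'oak', 'maple', 'birch'};                 % hypothetical class folders
counts  = zeros(1, numel(classes));
for k = 1:numel(classes)
    counts(k) = numel(dir(fullfile(classes{k}, '*.jpg')));
end
nPerClass = min([counts, 200]);                      % equal counts across classes, capped near 200
for k = 1:numel(classes)
    files = dir(fullfile(classes{k}, '*.jpg'));
    keep  = {files(1:nPerClass).name};               % or pick a random subset instead
    zip([classes{k} '_examples.zip'], keep, classes{k});
end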
Guidelines for high volume classifying
If you want to classify many images, submitting one image at a time can take a long time. You can maximize efficiency and performance of the service in the following ways:
Resize images to be no larger than 320 pixels in either width or height. Images do not need to be high resolution.
Submit images in batches as compressed (.zip) files.
Specify only the classifiers you want results for in the classifier_ids parameter. If you do not specify a value for this parameter, the service classifies the images against the default classifier and takes longer to return a response.
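For the first two points above, here is a minimal MATLAB sketch (folder and file names are placeholders) that shrinks each image to at most 320 pixels on its longer side and packs the results into a .zip for a single batch classify call:

files = dir(fullfile('to_classify', '*.jpg'));       % hypothetical folder of images to classify
mkdir('resized');
for k = 1:numel(files)
    img   = imread(fullfile('to_classify', files(k).name));
    scale = 320 / max(size(img, 1), size(img, 2));
    if scale < 1                                     % only shrink, never enlarge
        img = imresize(img, scale);
    end
    imwrite(img, fullfile('resized', files(k).name));
end
zip('batch_001.zip', '*.jpg', 'resized');            % one batch per classify call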
Ravi, I see you posted your question on developerWorks too - please see my answer here: https://developer.ibm.com/answers/questions/379227/similarity-search-api-of-watson-visual-recognition/
Recently, I've been playing with MATLAB's R-CNN deep learning example here. In this example, MATLAB has designed a basic 15-layer CNN with an input size of 32x32. They use the CIFAR-10 dataset to pre-train this CNN; the CIFAR-10 training images are also 32x32. Later they use a small dataset of stop signs to fine-tune this CNN to detect stop signs. This dataset contains only 41 images, so they use these 41 images to fine-tune the CNN, i.e. to train an R-CNN network. This is how they detect a stop sign:
As you can see, the bounding box almost covers the whole stop sign, except for a small part at the top.
Playing with the code, I decided to fine-tune the same network, pre-trained on the CIFAR-10 dataset, on the PASCAL VOC dataset, but only for the "aeroplane" class.
These are some results I get:
As you can see, the detected bounding boxes barely cover the whole airplane, so the precision ends up being 0 when I evaluate them later. I understand that in the original R-CNN paper mentioned in the MATLAB example the input size is 227x227 and their CNN has 25 layers. Could this be why the detections are not accurate? How does the input size of a CNN affect the end result?
Almost surely, yes!
When you pass an image through a network, the network keeps reducing the data taken from the image until it is left with only the most relevant features; during this process the input shrinks again and again. If, for example, you feed the network an image smaller than the size it expects, most of the information in the image may be lost on its way through the network.
In your case, one possible reason for your results is that the network "looks for" features at a limited resolution, and a large airplane may simply have too much resolution for it.
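To make the "shrinking" concrete, here is a small illustrative calculation (not the exact architecture from the example, just an assumed stack of three 2x2, stride-2 pooling layers) showing how little spatial resolution is left from a 32x32 input compared with a 227x227 one:

inputSizes    = [32 227];        % CIFAR-10-style input vs. the input size in the R-CNN paper
numPoolStages = 3;               % assume three 2x2, stride-2 pooling layers
for s = inputSizes
    featMaps = floor(s ./ 2 .^ (1:numPoolStages));
    fprintf('%dx%d input -> feature map sizes %s after each pooling stage\n', s, s, mat2str(featMaps));
end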
I am trying to train a robot for specific actions, such as grasping or pointing, using an RNN.
The robot consists of one arm and a head with a camera in it. The workspace is a small table on which the arm and the objects are located.
The input to the recurrent neural network will be the camera's image frame at every time step, and the output will be the target motor angles of the robot arm for the next frame.
When the current image frame is fed to the network, the network outputs the arm's motor values for the next frame. When the arm reaches that next position, the image frame from that position is fed to the network again, and it yields the next motor output.
However, when creating the training data, I have to produce (image, motor angle) pairs for every position in the workspace. Even though the network can generalize to some extent by itself, the amount of data needed is still too large, and collecting it takes a lot of time because there are too many trajectories.
Generalizing my problem: collecting training data for the network takes too much time. Is there any way or method to train a network with a small dataset? Or to build a huge dataset with relatively little human intervention?
Your question is very broad and encompasses more than one field of study. It cannot be fully answered on this platform; however, I suggest you check out this compilation of Machine Learning Resources on GitHub, specifically the Data Analysis section.
A more specific resource related to your question is DeepNeuralClassifier.
I searched for more papers and found some that are related to the subject. The main points of my question were to:
find a way to train the network efficiently with a small dataset
find a way to build a huge dataset with little human effort
There were several papers, and two of them helped me a lot. Here are the links:
Explanation-Based Neural Network Learning for Robot Control
Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours
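Beyond those papers, one generic way to stretch a small (image, motor angle) dataset with very little extra human effort is simple image augmentation. This is not from the papers above, just a common trick; here is a minimal MATLAB sketch (the file name is hypothetical) that turns one recorded frame into several training examples:

img = imread('frame_0001.png');                      % one recorded camera frame (hypothetical file)
augmented = {img};
augmented{end+1} = im2uint8(im2double(img) * 0.9);   % slightly darker frame
augmented{end+1} = im2uint8(im2double(img) * 1.1);   % slightly brighter frame (values saturate)
augmented{end+1} = imtranslate(img, [3, -2]);        % small pixel shift
% each augmented frame reuses the motor-angle label of the original frame,
% which is only reasonable for perturbations that do not move the target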
I have a question regarding the MATLAB NN toolbox. As part of a research project I decided to create a MATLAB script that uses the NN toolbox for some fitting solutions.
I have a data stream that is being loaded into my system. The input data consists of 5 input channels and 1 output channel. I train on this configuration for a while and try to fit the output (over a certain period of time) as new data streams in. I retrain my network constantly to keep it updated.
So far everything works fine, but after a certain period of time the results get bad and no longer represent the desired output. I really can't explain why this happens, but I could imagine that there is some kind of memory issue, since everything is fine while the data set is still small.
Only when it gets bigger does the quality of the simulation drop. Is there some kind of memory that fills up, or is the poor simulation just a result of the huge data sets? I'm a beginner with this tool and would really appreciate your feedback. Best regards and thanks in advance!
Please elaborate on your method of retraining with new data. Do you run further iterations? What do you mean by "time"? Do you mean epochs?
At first glance, assuming time means epochs, I would say that you're overfitting the data. Neural networks are supposed to be trained for a limited number of epochs with early stopping. You could try regularization, different gradient descent methods (if you're using a GD method), or GD momentum. Also, depending on the values of your first few training datasets, you may have trained your data using an incorrect normalization range. You should check these issues if my assumptions are correct.
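If the toolbox's built-in early stopping and regularization are not already switched on, this is roughly how it is done. The 5-input/1-output layout comes from the question; the network size, split ratios and regularization value are only assumptions:

X = rand(5, 1000);                          % placeholder for the 5-channel input stream
T = sum(X, 1) + 0.05 * randn(1, 1000);      % placeholder for the 1-channel target
net = fitnet(10);                           % small fitting network; 10 hidden neurons is an assumption
net.divideFcn  = 'dividerand';              % random train/validation/test split
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;          % the validation set drives early stopping
net.divideParam.testRatio  = 0.15;
net.trainParam.max_fail    = 6;             % stop after 6 consecutive validation failures
net.performParam.regularization = 0.1;      % mild weight regularization
[net, tr] = train(net, X, T);               % tr.best_epoch shows where early stopping kicked in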
I have a training dataset with 60,000 images and a testing dataset with 10,000 images. Each image represents an integer number from 0 to 9. My goal was to use libsvm which is a library for Support Vector Machines in order to learn the numbers from the training dataset and use the classification produced to predict the images of the testing dataset.
Each image is 28x28, which means it has 784 pixels, or features. While 784 features may seem like a lot, it took only 5-10 minutes to run the SVM application and learn the training dataset. The testing results were very good, giving me a 93% success rate.
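For context, the libsvm MATLAB interface calls look roughly like this; the variable names, the data layout (one image per row) and the RBF/C settings are my assumptions, not necessarily what was actually used:

% train_images_rows: 60000x784 (one image per row, scaled), train_labels: 60000x1
model = svmtrain(train_labels, double(train_images_rows), '-s 0 -t 2 -c 1');
[predicted, accuracy, ~] = svmpredict(test_labels, double(test_images_rows), model);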
I decided to try and use PCA from matlab in order to reduce the amount of features while at the same time not losing too much information.
[coeff scores latent] = princomp(train_images,'econ');
I played with latent a little and found that keeping only the first 90 components results in roughly 10% information loss, so I decided to use only the first 90.
In the above code, train_images is an array of size [60000x784].
From this code I get scores, and from scores I simply took the number of features I wanted, so for the training images I ended up with a [60000x90] array.
Question 1: What's the correct way to project the testing dataset to the coefficients => coeff?
I tried using the following:
test_images = test_images' * coeff;
Note that test_images is accordingly an array of size [784x10000], while coeff is an array of size [784x784].
Then from that again I took only the 90 features by doing the following:
test_images = test_images(:,(1:number_of_features))';
which seemed to be correct. However, after running the training and then the prediction, I got a 60% success rate, which is way lower than the success rate I got when I didn't use any PCA at all.
Question 2: Why did I get such low results?
After PCA I scaled the data as always, which I guess is the correct thing to do. Not scaling is generally a bad idea according to the libsvm website, so I don't think that's the issue here.
Thank you in advance
Regarding your first question, I believe MarkV has already provided you with an answer.
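For reference, princomp centers the training data before computing the components, so the test set has to be centered with the training mean before it is multiplied by coeff. A minimal sketch using the variable names from the question (test_images as the [784x10000] matrix described there, keeping 90 components):

mu            = mean(train_images, 1);                 % per-pixel mean of the training set (1 x 784)
train_scores  = scores(:, 1:90);                       % princomp already returns the centered projection
test_centered = bsxfun(@minus, test_images', mu);      % 10000 x 784, centered with the TRAINING mean
test_scores   = test_centered * coeff(:, 1:90);        % 10000 x 90
% any scaling applied afterwards should use ranges computed on train_scores only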
As for the second question: PCA indeed conserves most of the variance of your data, but that does not necessarily mean it preserves 90% of the information in your data. Sometimes the information required for successful classification is actually located in the 10% of variance you knocked off. A good example of this can be found here, especially figure 1 there.
So, if you have nice results with the full features, why reduce the dimension?
You might want to try and play with different principal components. What happens if you take components 91:180? That might be an interesting experiment...