I am trying to compute the mAP of YOLOv2 and SSD on the VOC2007 test set using VOCevaldet.m, but in both cases I get lower mAPs than the values reported in the papers.
To produce the detection txt files in VOC format (one file per class), I use the command ./darknet detector valid cfg/voc.data cfg/yolo-voc.cfg weights/yolo-voc.weights for YOLO in darknet, and the python script score_ssd_pascal.py for SSD in Caffe.
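For reference, each per-class file produced this way is expected to follow the standard VOC detection format: one detection per line, with the image ID, the confidence score, and the box coordinates (the values below are made up for illustration):

  000004 0.870 187.3 112.0 310.5 262.1
  000006 0.654 25.0 48.2 140.7 199.9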
What am I missing? Why don't I get the same results as in the papers?
Thanks.
I am running a classification script in GEE and I have about 2,100 training points, since my AOI is a region in Italy and I have many classes. I get the following error when I try to save my script:
Script error File too large (larger than 512KB).
If I delete some of the training data, it saves. I thought there was no limit in GEE on the number of training points. How can I find out what the limit is so I can adjust my training points, or is there a way to save the script without deleting any points?
Here is the link to my code
The Earth Engine Code Editor “drawing tools” are a convenient, but not very scalable, way to create geometry. The error you're getting is because “under the covers” they actually create additional code that is part of your script file. Not only is this fairly verbose (hence the error you received), it's not very efficient to run, either.
In order to use large training data sets, you will need to create your point data in another tool and upload it (using CSV or SHP files) to become one or more Earth Engine “table” assets, and use those from your script.
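For example, once the upload finishes, loading the table asset from a script is a one-liner. Here it is via the Python API (the Code Editor JavaScript is analogous; the asset ID below is a placeholder for your own upload):

  import ee

  ee.Initialize()
  # load the uploaded table asset instead of drawing geometries in the script;
  # 'users/yourname/training_points' is a placeholder asset ID
  training_points = ee.FeatureCollection('users/yourname/training_points')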
I am trying to run the CNTK object detection example with the model pretrained on Pascal VOC. I ran all the required scripts in fastrcnn and got the visual output for the test data defined in the dataset. Now I want to test the network on my own image. How can I do that?
For Fast R-CNN you need a library that generates candidate ROIs (regions of interest) for your test images, e.g. selective search.
If you want to evaluate a batch of images, you can follow the description in the tutorial to generate the test mapping file and the ROI coordinates (see test.txt and test.rois.txt in the corresponding proc subfolder). If you want to evaluate a single image, you would need to pass the image and the candidate ROI coordinates as inputs to cntk eval, similar to this example:
# compute model output for a single image;
# hwc_format is the image data in the layout the model expects
arguments = {loaded_model.arguments[0]: [hwc_format]}
# for Fast R-CNN, the candidate ROI coordinates would be bound as a second input
output = loaded_model.eval(arguments)
For FastRCNN you need to first run your custom image through the Selective Search algorithm to generate ROIs (regions of interest) and then feed them to your model with something like this:
output = frcn_eval.eval({image_input: image_file, roi_proposals: roi_proposals})
You can find more details here: https://github.com/Microsoft/CNTK/tree/release/latest/Examples/Image/Detection/FastRCNN
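If you need a way to generate those proposals in Python, a minimal sketch using the Selective Search implementation from OpenCV's contrib modules (this assumes opencv-contrib-python is installed; the image path and the cap of 500 proposals are placeholders) could look like this:

  import cv2

  img = cv2.imread("my_image.jpg")
  ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
  ss.setBaseImage(img)
  ss.switchToSelectiveSearchFast()  # the faster, lower-recall variant
  rects = ss.process()  # proposals as (x, y, w, h)
  # keep the first few hundred and convert to (x1, y1, x2, y2) boxes
  roi_proposals = [(x, y, x + w, y + h) for (x, y, w, h) in rects[:500]]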
Anyway, FastRCNN is not the most efficient way to do it because of its use of Selective Search (which is a real bottleneck here). If you want to improve performance, you can try FasterRCNN, as it gets rid of the SS algorithm and replaces it with a Region Proposal Network, which performs much, much better.
If you're interested, you can check my repo on GitHub: https://github.com/karolzak/CNTK-Hotel-pictures-classificator
This question may come across as too broad, but I will try to make every sub-topic as specific as possible.
My setting:
Large binary input (2-4 KB per sample) (no images)
Large binary output of the same size
My target: Using Deep Learning to find a mapping function from my binary input to the binary output.
I have already generated a large training set (> 1,000,000 samples), and can easily generate more.
In my (admittedly limited) knowledge of neural networks and deep learning, my plan was to build a network with 2,000 or 4,000 input nodes, the same number of output nodes, and try different numbers of hidden layers.
Then I would train the network on my data set (waiting several weeks if necessary) and check whether there is a correlation between input and output.
Would it be better to feed my binary data into the net as single bits, or as larger entities (like 16 bits at a time, etc.)?
For bit-by-bit input:
I have tried "Neural Designer", but the software crashes when I try to load my data set (even on small ones with 6 rows), and I had to edit the project save files to set Input and Target properties. And then it crashes again.
I have tried OpenNN, but it tries to allocate a matrix of size (hidden_layers * input_nodes)^2, which, of course, fails (sorry, no 117 GB of RAM available).
Is there a suitable open-source framework available for this kind of binary mapping function regression? Do I have to implement my own?
Is deep learning the right approach?
Does anyone have experience with this kind of task?
Sadly, I could not find any papers on deep learning + binary mapping.
I will gladly add further information, if requested.
Thank you for providing guidance to a noob.
You have a dataset containing pairs of binary-valued vectors, with a max length of 4,000 bits. You want to create a mapping function between the pairs. On the surface, that doesn't seem unreasonable: imagine a 64x64 image with binary pixels; this only contains 4,096 bits of data and is well within the reach of modern neural networks.
As you're dealing with binary values, a multi-layered Restricted Boltzmann Machine would seem like a good choice. How many layers you add to the network really depends on the level of abstraction in the data.
You don’t mention the source of the data, but I assume you expect there to be a decent correlation. Assuming the location of each bit is arbitrary and is independent of its near neighbours, I would rule out a convolutional neural network.
A good open source framework to experiment with is Torch - a scientific computing framework with wide support for machine learning algorithms. It has the added benefit of utilising your GPU to speed up processing thanks to its CUDA implementation. This would hopefully avoid you waiting several weeks for a result.
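If you just want a quick baseline for the feed-forward plan you describe before committing to RBMs, a minimal sketch in PyTorch (Torch's Python descendant; the layer sizes and hyperparameters are arbitrary placeholders) might look like this:

  import torch
  import torch.nn as nn

  # a plain feed-forward net mapping 4096 input bits to 4096 output bits
  model = nn.Sequential(
      nn.Linear(4096, 1024),
      nn.ReLU(),
      nn.Linear(1024, 1024),
      nn.ReLU(),
      nn.Linear(1024, 4096),
      nn.Sigmoid(),  # per-bit probabilities in [0, 1]
  )
  loss_fn = nn.BCELoss()  # binary cross-entropy, one term per output bit
  optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

  # one training step on a dummy batch of 64 random bit vectors
  x = torch.randint(0, 2, (64, 4096)).float()
  y = torch.randint(0, 2, (64, 4096)).float()
  optimizer.zero_grad()
  loss = loss_fn(model(x), y)
  loss.backward()
  optimizer.step()

If the loss on held-out pairs never drops below that of random guessing, that would be a strong hint there is no learnable correlation.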
If you provide more background, then maybe we can home in on a solution…
I am using Caffe to extract features with the MATLAB wrapper. I have 5011 images as a test data set. I chopped off all the layers after 'relu7' in 'deploy.prototxt'. I found that if you take the same image as input to matcaffe_demo.m and matcaffe_batch.m, you get different 4096-dim features.
Could someone tell me why?
What is the difference between extracting features from these images one by one with matcaffe_demo.m and extracting features by listing all of them with matcaffe_batch.m?
You can find the answer to this question on the Caffe GitHub.
Basically, matcaffe_demo is used for classification and it averages the results over 10 crops of the input image, while matcaffe_batch uses only a single input.
Moreover, note that these m-files are no longer available in recent Caffe versions.
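To make the difference concrete, here is a rough sketch of the 10-crop averaging that matcaffe_demo performs (center plus four corner crops, each with its horizontal mirror); net_forward stands in for whatever forward function you use, and the 256/227 sizes follow the CaffeNet defaults:

  import numpy as np

  def ten_crop_features(net_forward, img, crop=227):
      # img: a 256 x 256 x 3 array, as produced by prepare_image
      h, w = img.shape[:2]
      offsets = [(0, 0), (0, w - crop), (h - crop, 0),
                 (h - crop, w - crop), ((h - crop) // 2, (w - crop) // 2)]
      crops = []
      for y, x in offsets:
          c = img[y:y + crop, x:x + crop]
          crops += [c, c[:, ::-1]]  # the crop and its horizontal mirror
      # average the 10 feature vectors, which is why the result differs
      # from a single-input forward pass
      return np.mean([net_forward(c) for c in crops], axis=0)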
I tried to train my own neural net using my own image database, as described in
http://caffe.berkeleyvision.org/gathered/examples/imagenet.html
However, when I want to check the neural net after training on some standard images using the MATLAB wrapper, I get the following output / error:
Done with init
Using GPU Mode
Done with set_mode
Elapsed time is 3.215971 seconds.
Error using caffe
Invalid input size
I used the MATLAB wrapper before to extract CNN features based on a pretrained model. It worked. So I don't think the input size of my images is the problem (they are converted to the correct size internally by the function "prepare_image").
Does anyone have an idea what the error could be?
Found the solution: I was referencing the wrong ".prototxt" file (it's a little bit confusing because the files are quite similar).
So for computing features using the MATLAB wrapper, one needs to reference the following two files in "matcaffe_demo.m":
models/bvlc_reference_caffenet/deploy.prototxt
models/bvlc_reference_caffenet/MyModel_caffenet_train_iter_450000.caffemodel
where "MyModel_caffenet_train_iter_450000.caffemodel" is the only file needed which is created during training.
In the beginning I was accidentally referencing
models/bvlc_reference_caffenet/MyModel_train_val.prototxt
which was the ".prototxt" file used for training.