Predict a number with a given image (0 to 1) - mongodb

I am a total beginner to ML and Neural networks. I am currently working on a project where I have a lot of pictures stored in a MongoDB database. Each one of those pictures has a number from 0 to 1. For example "picture 1" 0.71.
I want to train a model on this database. The main goal is that, once the model is trained, it can take an image and return (predict) a number from 0 to 1. After doing some research and asking a few people, I figured out that two libraries that would be useful for the project are TensorFlow and Keras. Some people told me that it is impossible, but I'm not sure, so I came to ask here.
So my questions are: Is it possible? If so, how can I implement it? Are there any specific tools you recommend? If you specify a way that I should use for my project do I need to export my MongoDB database in a certain form? Since I am a beginner maybe there are some tutorials that you think that can help?
I'm sorry if this question is a bit too general, if there are any misunderstandings please comment and I will try to answer.
Thanks in advance!

What you want to do is totally feasible. This kind of task is called regression. Since you are working with image data, the best-suited type of model is a convolutional neural network (CNN); you'll need some understanding of CNNs if you want to build your own model. I've done a project where I had to predict the number of bacterial colonies from an image, much like your problem except that my predicted values were not bounded.
What is a CNN ? Here is a link
Basically a CNN will understand the features in the images and will use those features to predict a value.
You won't need to design your own model; most people just use a well-designed architecture from the scientific literature.
Go for Keras: it's the easiest framework out there and works like a charm. Here is how to implement VGG16 (an architecture that is probably the best for your problem) : link
You should follow this tutorial to get going on developing with keras.
Last hint: don't use the same last layer as the one in the VGG16 implementation; use a Dense layer with one neuron and a sigmoid, linear, or leaky ReLU activation.
i.e.:
# model.add(Dense(1000, activation='softmax'))  # original 1000-class output, removed
model.add(Dense(1, activation='sigmoid'))  # single output, bounded between 0 and 1
This means: predict one number (sigmoid will bound it between 0 and 1, but maybe leaky ReLU or linear works better).
Also, I guess you could read the images as arrays straight from MongoDB, but I would just put the images in a folder.
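If you go the folder route, one possible way to export is sketched below: each image is written as a file and its number goes into a labels.csv next to it. The field names `filename`, `image_bytes` and `score` are assumptions for illustration, as is the helper name `export_records`; adapt them to however your documents are actually stored. `records` can be any iterable of dicts, such as what a MongoDB cursor yields.

```python
import csv
import os


def export_records(records, out_dir):
    """Write each image into out_dir and record its label in labels.csv.

    The field names used here ("filename", "image_bytes", "score") are
    assumptions, not something taken from the question.
    """
    os.makedirs(out_dir, exist_ok=True)
    csv_path = os.path.join(out_dir, "labels.csv")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "score"])
        for rec in records:
            score = float(rec["score"])
            if not 0.0 <= score <= 1.0:
                raise ValueError(f"score out of range for {rec['filename']}: {score}")
            # Store the raw image bytes as a regular file in the folder.
            with open(os.path.join(out_dir, rec["filename"]), "wb") as img:
                img.write(rec["image_bytes"])
            writer.writerow([rec["filename"], score])
    return csv_path
```

A folder of images plus one CSV of labels is a layout that most image-loading utilities can work with, which is why I would prefer it over reading from the database during training.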
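Edit: When compiling the model, use mean squared error as the loss, as in
import keras
# learning_rate is the current name of the argument; lr is the older, deprecated spelling
adam = keras.optimizers.Adam(learning_rate=1e-4)
model.compile(optimizer=adam, loss='mse')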
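To see what the sigmoid output and the mean squared error actually compute, here is a small framework-free sketch in plain Python. The raw outputs and targets are made-up numbers, and the function names are only illustrative; Keras does the same arithmetic internally on whole batches.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


raw_outputs = [-3.0, 0.0, 2.5]   # made-up values of the last neuron before activation
targets = [0.10, 0.50, 0.71]     # made-up labels between 0 and 1
predictions = [sigmoid(z) for z in raw_outputs]
loss = mse(targets, predictions)
```

Every prediction ends up strictly between 0 and 1 whatever the raw output is, and the loss shrinks as predictions approach the targets, which is what you want for this kind of regression.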

Here you have the "hello world" program of neural networks: digit classification. You can start by studying it, because I think you will end up with a similar architecture for your NN. What you should focus on is the output of your model: in this example they perform classification over 10 classes (the digits 0 to 9), whereas you are trying to predict a real number. You could try using a single neuron with a sigmoid or linear activation at the end of your model.

Related

how to improve TensorFlow object detection model?

I need to recognize CAPTCHAs for a project. I did this using the object_detection API provided by TensorFlow.
I also added 500 CAPTCHA samples, annotating the images as XML with LabelImg and then converting them to TFRecord.
In addition, I used "faster_rcnn_inception_v2_coco_2018_01_28".
The problem is that the accuracy of the model is very low.
My questions are:
Can the problem be solved by increasing the number of training data?
Should I change my algorithm?
How effective would YOLOv3 be compared with the object detection provided by TensorFlow?
Q. Can the problem be solved by increasing the number of training data?
A. It depends on how much more data you can get. I don't think that only increasing the amount of training data is a good approach.
Consider fine-tuning an existing trained model to detect your object class. If you fine-tune a model, be careful with the class label assignment, because existing trained models such as YOLOv3, Faster R-CNN, etc. have no "captcha" label in their training datasets.
I recommend referring to this website, which can help you fine-tune the model.
Q. Should I change my algorithm?
A. Do as you wish.
Q. How effective would YOLOv3 be compared with the object detection provided by TensorFlow?
A. In my opinion, the two models perform much the same if you don't need to consider inference time.

Design of a Neural Network for Emotion Classification using Tweet Data

I have a dataset of tweets labelled with four emotions (anger, joy, fear, sadness). I transformed each tweet into a vector; for instance, the input vector for anger looks like this:
Mean of frequency distribution to anger tokens
word2vec similarity to anger
Mean of anger in emotion lexicon
Mean of anger in hashtag lexicon
Is that vector valid to train a neural network?
Your input vector looks fine to start with. Of course, you might later make it more advanced with statistical and derived data from Twitter or other relevant APIs or datasets.
Your network has four outputs, just like you mentioned:
Joy: [1,0,0,0]
Sadness: [0,1,0,0]
Fear: [0,0,1,0]
Anger: [0,0,0,1]
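The target encoding above can be sketched in a few lines of Python. The helper names `one_hot` and `decode` are only illustrative; the label order follows the list above.

```python
EMOTIONS = ["joy", "sadness", "fear", "anger"]  # same order as the targets above


def one_hot(label):
    """Return the target vector for an emotion label, e.g. 'fear' -> [0, 0, 1, 0]."""
    vec = [0] * len(EMOTIONS)
    vec[EMOTIONS.index(label.lower())] = 1
    return vec


def decode(outputs):
    """Return the emotion whose output unit has the highest activation."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return EMOTIONS[best]
```

At prediction time the four network outputs will not be exactly 0 or 1, so picking the largest one, as `decode` does, is a common way to turn them back into a label.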
You may also consider adding multiple hidden layers to make it a deep network, if you wish, to increase the stability of your neural network prototype.
As your question also shows, it is best to have a good preprocessing and feature extraction system in place before training and testing on your data, which you certainly seem to know, given where the project is going.
Great project, best wishes, thank you for your good question and welcome to stackoverflow.com!
Playground Tensorflow

neural network for sudoku solver

I recently started learning about neural networks, and I thought that creating a sudoku solver would be a nice application for one. I started with backpropagation networks, but later I found out that there are dozens of kinds of neural networks. At this point, I find it hard to learn all of them and then pick an appropriate one for my purpose. Hence, I am asking what would be a good choice for creating this solver. Can a backpropagation NN work here? If not, can you explain why and tell me which one can work?
Thanks!
Neural networks don't really seem to be the best way to solve sudoku, as others have already pointed out. I think a better (but also not really good or efficient) way would be to use a genetic algorithm. Genetic algorithms don't directly relate to NNs, but it's very useful to know how they work.
Better ideas (by better I mean more likely to be successful and probably better for learning something new) would include:
If you use a library:
Play around with the networks: try to train them on different datasets, maybe random numbers, and see what you get and how you have to tune the parameters to get better results.
Try to write an image generator. I wrote a few of them and they are still my favourite projects. In one of them I used backprop to teach a NN which color belongs to each x/y coordinate of the image; the other approach combines randomly generated images with one another (GAN/NEAT).
Try to create a movie (a series of images) of the network learning to create a picture. It will show you very well how backprop works, what parameter tuning does to the results, and how it changes the way the network gets to the result.
If you are not using a library:
Try to solve easy problems, one after the other. Use backprop or a genetic algorithm for training (whatever you have implemented).
Try to improve your implementation and change some things that nobody else cares about and see how it changes the results.
List of 'tasks' for your Network:
XOR (basically the hello world of NN)
Pole balancing problem
Simple games like pong
More complex games like flappy bird, agar.io etc.
Choose more problems that you find interesting: maybe you are into image recognition, maybe text, audio, who knows. Think of something you would like to be able to do and find a way to make your computer do it for you.
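For the XOR task in the list above, it helps to see first that a tiny network with one hidden layer can represent XOR at all. The sketch below uses hand-picked weights and a step activation; nothing is trained here, so it only shows the target that backprop or a genetic algorithm would have to find on its own.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_net(x1, x2):
    """2-2-1 network with hand-picked (not learned) weights and biases."""
    h_or = step(x1 + x2 - 0.5)       # hidden unit 1: fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)      # hidden unit 2: fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # output: "OR but not AND", which is XOR


# Truth table of the network for all four input combinations.
table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
```

A single neuron cannot produce this truth table, which is why XOR is the usual first test of whether your hidden layer and training loop work.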
It's not advisable to use only your own NN implementation, since it will probably not work properly the first few times and you'll get frustrated. Experiment with both libraries and your own implementation.
Good way to find almost endless resources:
Use Google search and add 'filetype:pdf' at the end to show only PDF files. Search for neural network, genetic algorithm, evolutionary neural network.
Neither neural nets nor GAs are close to ideal solutions for Sudoku. I would advise looking into Constraint Programming (e.g. the Choco or Gecode solvers). See https://gist.github.com/marioosh/9188179 for an example. It should solve any 9x9 sudoku in a matter of milliseconds (the daily Sudokus of the newspaper "Le Monde" are created using this type of technology, BTW).
There is also Knuth's famous "Dancing Links" algorithm for this problem, which works very well: https://en.wikipedia.org/wiki/Dancing_Links
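To give a feel for the constraint-based idea without any solver library, here is a minimal sketch in plain Python. It is ordinary backtracking that always branches on the cell with the fewest legal digits; it is not Choco, Gecode, or Dancing Links, just the same row/column/box constraints applied by hand. The puzzle is the widely reproduced example grid from the Wikipedia Sudoku article.

```python
def candidates(grid, r, c):
    """Digits that do not clash with row r, column c, or the enclosing 3x3 box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Fill grid (9x9 list of lists, 0 = empty) in place; return True if solved."""
    best = None
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                opts = candidates(grid, r, c)
                if best is None or len(opts) < len(best[2]):
                    best = (r, c, opts)  # most constrained empty cell so far
    if best is None:
        return True  # no empty cell left
    r, c, opts = best
    for d in opts:
        grid[r][c] = d
        if solve(grid):
            return True
    grid[r][c] = 0  # undo and backtrack
    return False


PUZZLE = [
    "530070000", "600195000", "098000060",
    "800060003", "400803001", "700020006",
    "060000280", "000419005", "000080079",
]
grid = [[int(ch) for ch in row] for row in PUZZLE]
solved = solve(grid)
```

Real constraint solvers add much stronger propagation on top of this, which is where the millisecond solve times come from.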
Just as was mentioned in the comments, you probably want to take a look at convolutional networks. You basically input the sudoku board as a two-dimensional 'image'. I think using a receptive field of 3x3 would be quite interesting, and I don't really think you need more than one filter.
The harder thing is normalization: the numbers 1-9 don't have an underlying relation in sudoku; you could easily replace them with A-I, for example. So they are categories, not numbers. However, one-hot encoding every cell would mean a lot of inputs, so I'd stick to numerical normalization (1 = 0.1, 2 = 0.2, etc.).
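The trade-off between the two encodings can be made concrete with a short sketch. The function names are illustrative, and treating an empty cell as 0 is an assumption on my part.

```python
def normalize(board):
    """Scale digits to 0.1 .. 0.9; empty cells (0) stay 0.0. One input per cell."""
    return [cell / 10 for row in board for cell in row]


def one_hot(board):
    """Nine indicator inputs per cell (all zeros if the cell is empty)."""
    flat = []
    for row in board:
        for cell in row:
            flat.extend(1 if cell == d else 0 for d in range(1, 10))
    return flat


# A 9x9 board therefore gives 81 inputs when normalized and 729 when one-hot encoded.
empty_board = [[0] * 9 for _ in range(9)]
n_normalized = len(normalize(empty_board))
n_one_hot = len(one_hot(empty_board))
```

The normalized form is nine times smaller, at the cost of suggesting to the network that 4 is "between" 3 and 5, which has no meaning in sudoku.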
The output of your network should be a softmax of some kind: if you don't use softmax and instead output just an x and y coordinate, you can't ensure that the chosen square has not been filled in yet.
A numerical value should be passed along with the output, to show what number the network wants to fill in.
As PLEXATIC mentioned, neural nets aren't really well suited to this kind of task. A genetic algorithm sounds good indeed.
However, if you still want to stick with neural nets, you could have a look at https://github.com/Kyubyong/sudoku. As Thomas W answered, 3x3 looks nice.
If you don't want to deal with CNNs, you could find some answers here as well: https://www.kaggle.com/dithyrambe/neural-nets-as-sudoku-solvers

General Questions about NeuralNetworks

I have some general questions about NNs and their training in hope that you can answer them:
Let's suppose that I've got an untrained NN with n hidden layers and m neurons in it. I want to train the network to, e.g., recognize voice and thus words. How can I make this possible when my sound input doesn't always have the same length (e.g. one clip is 1 second, another is 5)? How many layers should my NN have and what type should it be (recurrent, LSTM, CNN, etc.)? Are there any training algorithms other than normal backpropagation (I thought about having a NN with just one neuron in each layer and then letting it grow new ones until the problem can be solved)? And finally, is it recommended or helpful to make connections between the neurons of, e.g., layer 2 and layer 4?
Thank you for your help!
This is a perfectly valid question, for the record.
You should definitely use a recurrent network for voice recognition. That means you feed the input in small slices, say 1/100 of a second each, one by one. So for one second of data, you activate the network 100 times.
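As a rough sketch of that slicing, the function below cuts a list of audio samples into frames of 1/100 of a second. The 16000 Hz sample rate and the function name are assumptions for illustration; a 1-second clip then gives 100 steps and a 5-second clip gives 500, which is how a recurrent network copes with inputs of different lengths.

```python
def split_into_frames(samples, sample_rate, frame_seconds=0.01):
    """Cut a sequence of audio samples into equal-length frames."""
    frame_len = int(round(sample_rate * frame_seconds))
    # Drop a trailing partial frame so every step has the same size.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


SAMPLE_RATE = 16000                     # assumed sample rate in Hz
one_second = [0.0] * SAMPLE_RATE        # silent placeholder clips
five_seconds = [0.0] * (5 * SAMPLE_RATE)
short_steps = split_into_frames(one_second, SAMPLE_RATE)
long_steps = split_into_frames(five_seconds, SAMPLE_RATE)
```

Each frame is one input to the network, and the recurrent state carries information from one frame to the next.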
Using an LSTM will make sure that patterns over large time lags are remembered, so the network will essentially remember (useful) parts of previous inputs.
How many layers you should use depends on what exactly you want to recognize. But because voice recognition is not one of the easiest classification tasks, it will have to be a large, deep network (combining convolutional layers with LSTM).
What you proposed, evolving the network one node at a time, is basically called neuroevolution. Libraries such as Neataptic support the evolution of networks towards a certain solution.
Yes, connections between non-adjacent layers (e.g. layer 2 to layer 4) could definitely help, but whether they do can only be found out by trial and error.
PS: I strongly recommend starting with an easier task to develop an understanding of neural networks.

Training a model for Latent-SVM

GOOD MORNING COLLEAGUES!
I am very keen to train a new model from my own data set of faces!
I have found no information about this topic, so I hope my notes can help people and that I can get some answers as well.
I will try to explain the steps I needed to take to train my own model, followed by some questions...
I have downloaded the Latent code from: http://cs.brown.edu/~pff/latent-release4/
I have downloaded the PASCAL VOC 2008 code (devkit) from: http://host.robots.ox.ac.uk/pascal/VOC/voc2008/index.html
I have emulated the PASCAL VOC file/folder structure with my own data set:
Annotations. I have created a .xml file where I have defined one object, face (each image contains only one face). I didn't define difficulties or poses...
JPEGImages where I have stored all the images
ImageSets where I have defined three files:
test.txt, where I wrote the file name of my positive samples
train.txt, where I wrote the file name of my negative samples
trainval.txt, where I wrote the file names of my positive samples (exactly the same file as test.txt).
I have changed some things in globals.m and VOCinit.m (to tell the algorithm the paths and locations of some files...)
Then I run the training with the command: pascal('face', 1);
Following these steps, the training runs to completion without failing and I get my own model, BUT I have some doubts...
Can you see anything weird in my explanation? Could it work?
Must the files test.txt/trainval.txt be equal? Why... What does it mean?
Do I have to choose the number of parts I want in the model INSIDE the function?
Suppose I have two kinds of samples (frontal faces and side faces) and I want to detect both... How can I address this? I thought I would have to train a model with two components... but how can I tell the training code which samples are frontal and which are side views? With the pose label in the annotations? (I don't think so...) Is there another way to handle this?
Thank you for your time!!
I hope you can solve my doubts :)
I think test.txt should contain samples (images) that will be used to estimate how good the system is after learning the faces. However, trainval.txt is used during the learning stage (training) to fine-tune the parameters of the model; it is an essential part of supervised learning.
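If you want test.txt and trainval.txt to hold different images, a split along these lines is one option. This is only a sketch: the helper names, the 20% test fraction, and the one-identifier-per-line file format are my assumptions, so check them against what the devkit and the Latent-SVM code actually expect.

```python
import random


def split_ids(image_ids, test_fraction=0.2, seed=0):
    """Shuffle the image identifiers and return (trainval_ids, test_ids)."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_test = max(1, int(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


def write_image_set(path, ids):
    """Write one identifier per line (assumed ImageSets file format)."""
    with open(path, "w") as f:
        f.write("\n".join(ids) + "\n")


all_ids = [f"face_{i:03d}" for i in range(10)]  # hypothetical image names
trainval_ids, test_ids = split_ids(all_ids)
```

Keeping the two sets disjoint means the score you get on test.txt reflects images the model has never seen during training.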
Also, it is very hard to have one single SVM to classify faces that are both frontal and sideways. Here is my suggestion:
Train one SVM to detect if the input image is a frontal face or a sideways face. Call this something like SVM-0.
Train another SVM for frontal faces. This SVM will classify all your individuals. Note, however, that an SVM is usually a binary classifier, so make sure you choose the right SVM, one that has a multiclass architecture. Call this SVM-F.
Train a final SVM for sideways faces. Again, use a multiclass SVM. Call it SVM-S.
Present the input image to SVM-0 and if it detects it is a frontal face, present the input again to SVM-F; otherwise, give the input to SVM-S.
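The routing between the three classifiers can be sketched as below. The three arguments are stand-ins for trained models: any callable that takes an image and returns a label will do, and the "frontal" / "side" return values of SVM-0 are an assumed convention.

```python
def classify_face(image, svm_0, svm_f, svm_s):
    """Route the image through SVM-0, then through SVM-F or SVM-S."""
    pose = svm_0(image)  # assumed to return "frontal" or "side"
    if pose == "frontal":
        return pose, svm_f(image)
    return pose, svm_s(image)


# Stand-in classifiers for illustration only; real ones would be trained SVMs.
fake_svm_0 = lambda img: "frontal" if img["yaw"] < 30 else "side"
fake_svm_f = lambda img: "person_A"
fake_svm_s = lambda img: "person_B"

result_frontal = classify_face({"yaw": 5}, fake_svm_0, fake_svm_f, fake_svm_s)
result_side = classify_face({"yaw": 80}, fake_svm_0, fake_svm_f, fake_svm_s)
```

Note that a mistake by SVM-0 sends the image to the wrong second-stage classifier, so its accuracy caps the accuracy of the whole pipeline.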
In my experience, you should expect very low performance from SVM-S. It is a hard problem to solve. Frontal faces are not a big deal, unless you are working with faces that vary in pose, illumination, and expression (PIE). Face recognition is greatly affected by PIE variations in the images.
I recommend this website; it contains very good information and tutorials for beginners, with or without experience.