Train a neural network with image pixels as input and get screen coordinate values as output in MATLAB - matlab

I am a newbie to neural networks. In my project I need to implement a neural network that takes image pixels as input and gives a screen coordinate value as output. I have a dataset which I collected by running an experiment with many volunteers. In detail, I need to feed only the cropped eye pixels into the neural network for training; each cropped eye is about 30*30 pixels after resizing. I created the dataset from users looking at different specific points on the screen, and the coordinate of each point is known to me. It is basically an implementation of a research paper.
If you can suggest how I should proceed to create the neural network, that would be a great help.

If you want to feed the image itself into the neural network, you can use a convolutional neural network (CNN); a CNN takes the raw image as input.
If you want to extract features from the image first, you can feed those features into an MLP neural network instead.
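Assuming you take the CNN route, here is a minimal MATLAB (Deep Learning Toolbox) sketch of a regression network that maps a 30*30 grayscale eye crop to an (x, y) screen coordinate. The layer sizes and training options are illustrative assumptions, not values from the paper.

```matlab
% XTrain: 30x30x1xN array of cropped eye images
% YTrain: Nx2 matrix of known screen coordinates (x, y)
layers = [
    imageInputLayer([30 30 1])                    % grayscale eye crop
    convolution2dLayer(3, 16, 'Padding', 'same')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 32, 'Padding', 'same')
    reluLayer
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(2)                        % (x, y) output
    regressionLayer];                             % mean-squared-error loss

options = trainingOptions('adam', ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 64, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress');

net = trainNetwork(XTrain, YTrain, layers, options);

% Predict the gaze point for a new 30x30 eye crop
xy = predict(net, eyeCrop);
```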

Related

How to create a neural network that receives multiple input images in Matlab

I'd like to know if it's possible to create a neural network that receives multiple input images (multiple imageInputLayer objects).
For example, a Siamese architecture for computing the disparity (stereo correspondence) from two image patches: the network input is two images and the output is a scalar that represents the disparity.
Currently MATLAB supports only a single imageInputLayer per neural network.
I'd also like to classify a 3D object by projecting it from 3 angles, therefore converting the problem into the classification of 3 images.
I'm trying to create a network that looks like the attached image.
Please let me know what you think and how to work things out with the network input.
This is simply not possible in MATLAB R2018b.

Convolutional Neural Network for image detection/classification

So here is the setup: I have a set of images (labeled train and test) and I want to train a conv net that tells me whether or not a specific object is present in an image.
To do this, I followed the TensorFlow tutorial on MNIST and trained a simple conv net on crops reduced to the area of interest (the object), which are images of size 128x128. The architecture is as follows: 3 successive blocks, each consisting of 2 conv layers and 1 max-pool down-sampling layer, followed by one fully connected softmax layer (with two classes, 0 and 1, for whether the object is present or not).
I implemented it using TensorFlow, and this works quite well, but since I have enough computing power I was wondering how I could increase the complexity of the classifier:
- adding more layers?
- adding more channels at each layer? (currently 32, 64, 128, and 1024 for the fully connected layer)
- anything else?
But the most important part is that now I want to detect this same object in larger images (roughly 600x600, whereas the size of the object should be around 100x100).
I was wondering how I could use the previously trained "small" network for small images in order to pretrain a larger network on the large images. One option could be to classify patches with a sliding window of size 128x128 and scan the whole image, but if possible I would like to try training a whole network on it.
Any suggestions on how to proceed? Or an article/resource tackling this kind of problem? (I am really new to deep learning, so sorry if this is a stupid question...)
Thanks!
I suggest that you continue reading on the field overall. Your search keywords include CNN, image classification, neural net, AlexNet, GoogLeNet, and ResNet. These will return many articles, online classes and lectures, and other materials to help you learn about classification with neural nets.
Don't just add layers or filters: the complexity of the topology (net design) must be fitted to the task; a net that's too complex will overfit the training data. The one you've been using is probably LeNet; the three I cited above were designed for the ImageNet image classification contest.
Since you are working on images, I would suggest you use a pretrained image classification network (like VGG, AlexNet, etc.) and fine-tune it with your 128x128 image data. In my experience, unless you have a very large dataset, a fine-tuned network will give better accuracy and also save training time. After building a good image classifier on your dataset, you can use any popular algorithm to generate region proposals from the image. Then pass each region proposal to the classification network one by one and check whether the network classifies it as positive or negative. If it is classified as positive, your object is most probably present in that region; otherwise it is not. If the classifier marks many region proposals as containing the object, you can use non-maximum suppression to reduce the number of positive proposals.
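The asker is using TensorFlow, but since this page is otherwise MATLAB-centric, here is a hedged MATLAB sketch of the fine-tuning step with a pretrained AlexNet (requires the Deep Learning Toolbox Model for AlexNet support package). The folder name, split ratio, and training options are illustrative assumptions.

```matlab
% Assumed folder layout: trainData/object/*.png and trainData/background/*.png
imds = imageDatastore('trainData', 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.8, 'randomized');

net = alexnet;                              % pretrained on ImageNet
inputSize = net.Layers(1).InputSize;        % [227 227 3]

% Replace the last three layers for the 2-class (object / no object) task
layers = [
    net.Layers(1:end-3)
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];

% Resize the 128x128 crops to the network's expected input size
% ('gray2rgb' only has an effect if the crops are grayscale)
augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain, 'ColorPreprocessing', 'gray2rgb');
augVal   = augmentedImageDatastore(inputSize(1:2), imdsVal,   'ColorPreprocessing', 'gray2rgb');

options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...
    'MaxEpochs', 10, ...
    'ValidationData', augVal);

netFineTuned = trainNetwork(augTrain, layers, options);
```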

Deconvolution with Caffe

I was wondering if it is possible to perform deconvolution of images in Caffe using the point spread function of the objective at a given focal point. Something along the lines of this approach.
If yes, what would be the best way to proceed?
It is possible to deconvolve images using Caffe (and CNNs in general), but the approach may not be as general as you hope it to be.
CNNs can take a blurry image as input and output a sharp image. Because the networks are convolutional, the input can be of any size. This can easily be done in Caffe using convolution layers and a Euclidean loss layer. Optionally, you can experiment with adding some pooling and deconvolution layers.
CNNs can be trained to deconvolve images for a specific blur PSF, as in your link (see [Xu et al.: Deep Convolutional Neural Network for Image Deconvolution. NIPS 2014]). This works well, but you have to re-train the CNN for each new PSF, which takes a lot of time.
I have tried to train CNNs to do blind deconvolution (where the PSF is not known) and it works very well for text documents. You can get trained nets and Python-Caffe scripts at [Hradiš et al.: Convolutional Neural Networks for Direct Text Deblurring. BMVC 2015]. This approach could work for other types of images, but it would not work for unrestricted photographs and diverse blurs. For general photos, I would guess it could work for a small range of blurs.
Another possibility is to do inverse filtering (e.g. using a Wiener filter) and post-process the output with a CNN. The advantage is that you can compute the inverse filter for a new PSF very quickly and the CNN stays the same. [Schuler et al.: A Machine Learning Approach for Non-blind Image Deconvolution. CVPR 2013]
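As a side note, the Wiener-filter step in that last suggestion is easy to prototype in MATLAB's Image Processing Toolbox before handing the result to a CNN. The file name, Gaussian PSF, and noise-to-signal ratio below are illustrative assumptions.

```matlab
% Wiener deconvolution of a blurred image with a known (here assumed Gaussian) PSF
blurred = im2double(imread('blurred.png'));   % hypothetical input file
psf = fspecial('gaussian', 15, 2.5);          % assumed point spread function
nsr = 0.01;                                   % assumed noise-to-signal power ratio

restored = deconvwnr(blurred, psf, nsr);      % inverse (Wiener) filtering

% 'restored' can now be fed to a CNN that suppresses the remaining artifacts
imshowpair(blurred, restored, 'montage');
```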

How to choose the number of nodes for using BP network in face recognition?

I have read some books but still cannot figure out how I should organize the network. For example, I have a PGM image of size 120*100; what should the input look like (e.g. a one-dimensional array of size 120*100)? And how many nodes should I use?
It's typically best to organize your input image as a 2D matrix. The reason is that the layers at the lower levels of the neural networks used in machine perception tasks are typically locally connected. For example, each neuron of the first layer of such a neural net will only process the pixels of a small NxN patch of the input image. This naturally leads to a 2D structure which can be more easily described with 2D matrices.
For a detailed explanation I'll refer you to the DeepFace paper, which describes the state of the art in face recognition systems.
A 120*100 one-dimensional vector is fine. The locations of the pixel values within that vector do not matter, because all nodes are fully connected to the nodes in the next layer anyway. But you must keep the locations consistent between training, validation, and testing.
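A minimal MATLAB sketch of this flattened-vector approach, using patternnet from the Neural Network Toolbox; the single hidden layer of 100 nodes and the split ratios are illustrative assumptions, not recommendations.

```matlab
% images: 120x100xN array of face images; labels: 1xN vector of class indices
N = size(images, 3);
X = reshape(double(images), 120*100, N);   % each column is one flattened image
X = X / 255;                               % simple normalization
T = full(ind2vec(labels));                 % one-hot target matrix

net = patternnet(100);                     % one hidden layer with 100 nodes (assumed)
net.divideParam.trainRatio = 0.7;          % train/validation/test split
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;

net = train(net, X, T);
predicted = vec2ind(net(X));               % predicted class indices
```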
The most successful approach so far has been to go with a convolutional neural network with 2D input, just as @benoitsteiner stated. For a far simpler example I'd refer you to LeNet-5, a small neural network developed for MNIST hand-written digit recognition. It is used in EBLearn for face recognition with quite good results.

Image processing with a neural network

I am working on the topic of brain tumor segmentation. I have used the "Bounding Box Method Using Symmetry" algorithm to find and segment the tumor. Following is the output.
As you can see, I have successfully segmented the tumor and now want to apply a neural network to it. I know the workings and mathematics behind a simple neural network, but I don't know how to train a neural network to work with my algorithm. In short, I want to know how to begin neural network training. Any simple code or direction, preferably using MATLAB, would be highly appreciated.
Image recognition
Neural networks are generally used for image recognition, not for pinpointing details in an image. You may design a neural network to tell you whether or not there is a tumor in the image, but it is not a trivial task for a neural network to tell you where it is located.
If you do decide to let your network determine whether or not a given image contains a tumor, you will need a huge number of images without tumors as well. The ratio of "images with tumor" to "images without tumor" should be close to the actual ratio observed in the real world. If you deviate from this ratio, the network will be prone to false positives, as it learns that a majority of images should contain tumors.
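Since the question asks for simple MATLAB code to get started, here is a minimal sketch of that tumor/no-tumor classifier; the folder layout, the 128*128 grayscale image size, and the layer sizes are all illustrative assumptions.

```matlab
% Assumed folder layout: mriData/tumor/*.png and mriData/noTumor/*.png,
% with all slices already resized to 128x128 grayscale
imds = imageDatastore('mriData', 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsTest] = splitEachLabel(imds, 0.8, 'randomized');

layers = [
    imageInputLayer([128 128 1])
    convolution2dLayer(3, 16, 'Padding', 'same')
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 32, 'Padding', 'same')
    reluLayer
    fullyConnectedLayer(2)                 % tumor / no tumor
    softmaxLayer
    classificationLayer];

options = trainingOptions('adam', 'MaxEpochs', 15, 'Shuffle', 'every-epoch');
net = trainNetwork(imdsTrain, layers, options);

% Estimate accuracy on the held-out set
accuracy = mean(classify(net, imdsTest) == imdsTest.Labels);
```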
In your case
If you input an MR image which contains a tumor and want to receive a segmented tumor image, you would probably need 500*500 input signals and 500*500 output signals, and train the network to draw a border around the tumor present in the image.
If you extend your question to explain why you want the neural network to behave like this, there might be someone here on SO who can help you!