How to create a neural network that receives multiple input images in MATLAB

I'd like to know if it's possible to create a neural network that receives multiple input images (imageInputLayer).
For example, consider a Siamese architecture for computing the disparity (stereo correspondence) from two image patches: the network input is two images and the output is a scalar that represents the disparity.
Currently MATLAB supports only a single imageInputLayer per neural network.
I'd like to classify a 3D object by projecting it from 3 angles, thereby converting the problem into the classification of 3 images.
I'm trying to create a network that looks like the attached image.
Please let me know what you think and how to handle the network input.

This is simply not possible in MATLAB R2018b.

Related

Can we make a convolutional network that uses more than one image to make a prediction?

I cropped the following image from a tutorial.
This diagram shows the rough structure of a standard neural network: it takes one image as input and makes a prediction.
What I am thinking about is some kind of parallel structure, something like the following image.
It is not exactly as in the image above, but you can see that I am trying to use two images to make one prediction. The image is only there to give you an idea of what I am asking.
Is it possible to use more than one image (two, three, ...) like this, or in any other way, to make one prediction? This is not meant for actual photo classification, but I think such a technique could be used in a field like audio classification, where a graphical representation of the data is used with image classification techniques.
Any advice, guidance or opinion on this?
If we consider implementing exactly what is in the diagram with a high-level API like Keras (keras.models.Sequential), all we can do is keep adding one layer after another.
So what kind of technology can I use to implement the parallel structure?
Yes, you can use more than one image as input. See, for example, the Siamese neural network, which takes 2 images as input and passes them through a shared network architecture.
If you instead want an arbitrary and variable number of images as input, you can use a recurrent architecture such as a Convolutional LSTM, which applies convolutional processing to every image of the input sequence while an LSTM-style recurrent state carries information from one image to the next.
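As a rough illustration of the shared-architecture idea, here is a minimal plain-Python sketch of a Siamese forward pass. It is not MATLAB or Keras code, and everything in it is an assumption made for illustration: the names (make_dense, dense_tanh, siamese_forward), the 4x4 patch size, and the single dense layer standing in for what would normally be a convolutional branch. It only shows that both inputs pass through the same weights and are merged into one scalar; no training is performed.

```python
import math
import random


def make_dense(n_in, n_out, rng):
    """Create random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense_tanh(layer, x):
    """Apply a fully connected layer followed by tanh."""
    weights, biases = layer
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def siamese_forward(shared_layer, head_weights, image_a, image_b):
    """Send both images through the SAME layer, then map |difference| to a scalar."""
    feat_a = dense_tanh(shared_layer, image_a)
    feat_b = dense_tanh(shared_layer, image_b)
    diff = [abs(a - b) for a, b in zip(feat_a, feat_b)]
    return sum(w * d for w, d in zip(head_weights, diff))


rng = random.Random(0)
n_pixels = 16  # a flattened 4x4 patch, chosen only for illustration
shared = make_dense(n_pixels, 8, rng)
head = [rng.uniform(-0.5, 0.5) for _ in range(8)]
patch_a = [rng.random() for _ in range(n_pixels)]
patch_b = [rng.random() for _ in range(n_pixels)]

score_ab = siamese_forward(shared, head, patch_a, patch_b)
score_ba = siamese_forward(shared, head, patch_b, patch_a)
score_aa = siamese_forward(shared, head, patch_a, patch_a)
```

Because the two branches share weights and the merge uses an absolute difference, swapping the inputs gives the same score, and two identical patches give a score of 0.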

Train Neural network with image pixels as input and get the screen coordinate value as output in MATLAB

I am new to neural networks. In my project I need to implement a neural network that takes image pixels as input and gives a screen coordinate as output. I have a dataset that I collected by running an experiment with many volunteers. In detail, I need to feed only the cropped eye pixels into the neural network for training. The cropped eye is of size 30*30 (approx) after resizing. I have created a dataset of users looking at different specific points on the screen. Each point is at a specific coordinate that is known to me. It is basically an implementation of a research paper.
If you could suggest how I should proceed to create the neural network, that would be a great help.
If you want to use the image itself as input to the neural network, you can use a Convolutional Neural Network (CNN). A CNN can take an image as input.
If you instead extract features from the image first, you can feed those features to an MLP neural network.
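Whichever network type is chosen, the task is a regression from eye pixels to two numbers. The following plain-Python sketch shows only the shapes involved, under assumptions that are not in the question: a flattened 30*30 crop, one hidden tanh layer of 16 units, a linear 2-unit output for (x, y), screen coordinates normalised to [0, 1], and hypothetical names (dense, predict_gaze, squared_error). The weights are random, so the prediction is meaningless until the network is trained.

```python
import math
import random


def dense(weights, biases, x):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def predict_gaze(params, eye_pixels):
    """Hidden tanh layer followed by a linear 2-unit output: (x, y) on screen."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(h) for h in dense(w1, b1, eye_pixels)]
    return dense(w2, b2, hidden)


def squared_error(predicted, target):
    """Mean squared error between predicted and true coordinates."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


rng = random.Random(0)
n_in, n_hidden, n_out = 30 * 30, 16, 2
params = (
    [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_hidden)],
    [0.0] * n_hidden,
    [[rng.uniform(-0.05, 0.05) for _ in range(n_hidden)] for _ in range(n_out)],
    [0.0] * n_out,
)

eye = [rng.random() for _ in range(n_in)]  # stand-in for a flattened 30x30 eye crop
target = [0.25, 0.75]                      # hypothetical normalised screen coordinate
prediction = predict_gaze(params, eye)
loss = squared_error(prediction, target)
```

Training would adjust the weights to reduce this loss over all (eye crop, screen point) pairs in the dataset.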

How to train a Matlab Neural Network using matrices as inputs?

I am making 8 x 8 tiles of images and I want to train an RBF neural network in MATLAB using those tiles as inputs. I understand that I can convert the matrix into a vector and use it. But is there a way to train on them as matrices (to preserve locality)? Or is there any other technique to solve this problem?
There is no way to use a matrix as an input to such a neural network, but anyway this won't change anything:
Assume you have a fully connected neural network with an image as input, one hidden layer, and the output layer. There will be one weight from every input pixel to every hidden unit. All weights are initialized randomly and then trained using backpropagation. The development of these weights does not depend on any local information; it depends only on the gradient of the output error with respect to each weight. Using a matrix input will therefore make no difference compared with using a vector input.
For example, you could make a vector out of the image, shuffle that vector in any way (as long as you do it the same way for all images) and the result would be (more or less, due to the random initialization) the same.
The way to handle local structures in the input data is using convolutional neural networks (CNN).
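The shuffling argument can be checked on a single fully connected layer. The sketch below (plain Python, hypothetical names, random rather than trained weights) permutes the pixels of a flattened 8 x 8 tile and permutes the weight columns in the same way. The layer output is unchanged up to floating-point rounding, so for every network on the original pixel order there is an equivalent one on the shuffled order.

```python
import random


def dense(weights, x):
    """Fully connected layer without bias: one weighted sum per hidden unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


rng = random.Random(1)
n_pixels, n_hidden = 64, 5  # an 8 x 8 tile flattened to 64 values
x = [rng.random() for _ in range(n_pixels)]
weights = [[rng.uniform(-1, 1) for _ in range(n_pixels)] for _ in range(n_hidden)]

# One fixed permutation, applied identically to the input and to every weight row.
perm = list(range(n_pixels))
rng.shuffle(perm)
x_shuffled = [x[i] for i in perm]
weights_shuffled = [[row[i] for i in perm] for row in weights]

out_original = dense(weights, x)
out_shuffled = dense(weights_shuffled, x_shuffled)
max_gap = max(abs(a - b) for a, b in zip(out_original, out_shuffled))
```

Here max_gap is at the level of rounding error, which is why a fully connected layer cannot exploit pixel neighbourhoods on its own.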

How to choose the number of nodes for using BP network in face recognition?

I have read some books but I am still not sure how I should organize the network. For example, I have a PGM image of size 120*100. What should the input look like (a one-dimensional array of size 120*100)? And how many nodes should I use?
It's typically best to organize your input image as a 2D matrix. The reason is that the layers at the lower levels of the neural networks used in machine perception tasks are typically locally connected. For example, each neuron of the first layer of such a neural net will only process the pixels of a small NxN patch of the input image. This naturally leads to a 2D structure which can be more easily described with 2D matrices.
For a detailed explanation I'll refer you to the DeepFace paper, which describes the state of the art in face recognition systems.
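To make the "small NxN patch" idea concrete, here is a minimal plain-Python sketch of one filter sliding over a 2D image, so that each output value depends only on a 3x3 neighbourhood. The function name conv2d_valid, the 5x5 image, and the 3x3 averaging kernel are illustrative assumptions; they are not taken from any library or from the DeepFace paper. In a convolutional layer the same kernel is reused at every position, as it is here.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over the image without padding (cross-correlation)."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = 0.0
            # Only the k x k patch whose top-left corner is (r, c) contributes.
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


# A 5x5 test image whose pixel at (r, c) has the value 5*r + c.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # 3x3 averaging filter
feature_map = conv2d_valid(image, kernel)
```

The result is a 3x3 feature map. Keeping the input as a 2D matrix is what makes the "patch at (r, c)" indexing natural.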
A one-dimensional vector of 120*100 values is fine. The locations of the pixel values in that vector do not matter, because all nodes are fully connected to the nodes in the next layer anyway. But you must be consistent with their locations between training, validation, and testing.
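A small sketch of what "consistent locations" can mean in practice: always flatten in the same order, for example row by row. The code assumes the 120*100 image has 120 rows and 100 columns (the question does not say which is which), and flatten_row_major is a hypothetical helper name.

```python
def flatten_row_major(image):
    """Concatenate the rows of a 2D image into one vector."""
    return [pixel for row in image for pixel in row]


height, width = 120, 100  # assumed orientation: 120 rows, 100 columns
# Synthetic grey values standing in for a real PGM image.
image = [[(r * width + c) % 256 for c in range(width)] for r in range(height)]

vector = flatten_row_major(image)
# With this ordering, pixel (r, c) always lands at index r * width + c.
index_of_3_7 = 3 * width + 7
```

Using the same function for the training, validation, and test images guarantees that each input node always sees the same pixel position.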
The most successful approach so far has been a convolutional neural network with 2D input, just as @benoitsteiner stated. For a far simpler example I'd refer you to LeNet-5, a small neural network developed for MNIST handwritten digit recognition. It is used in EBLearn for face recognition with quite good results.

Can I emulate an image manipulation using a neural network?

I have an image represented as a two-dimensional array of floats. I have a function that I apply to this image, which gives me back a new image also represented as a two-dimensional array of floats. This function is time consuming to run, so I am wondering if I can emulate it using a neural network. My initial thought is to use a set of random images, run the function on these and use the outputs to train a neural network that has an input node for each pixel and an output node for each pixel. The images are always 200 * 200 pixels. Does this sound like something that can be done with a neural network? Is there a better way to do it?
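The data-collection step described above could be sketched as follows. This is plain Python under stated assumptions: slow_function is a hypothetical stand-in for the real, expensive function (here it simply inverts the image), and the helper names and the sample count of 4 are made up for illustration. The sketch only builds input/target pairs with one value per pixel; it does not train anything.

```python
import random


def slow_function(image):
    """Hypothetical stand-in for the expensive image operation (here: invert)."""
    return [[1.0 - p for p in row] for row in image]


def random_image(rng, size):
    """Generate a size x size image of floats in [0, 1)."""
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def build_training_pairs(n_samples, size, seed=0):
    """Return (input_vector, target_vector) pairs with one value per pixel."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        image = random_image(rng, size)
        target = slow_function(image)
        flat_in = [p for row in image for p in row]
        flat_out = [p for row in target for p in row]
        pairs.append((flat_in, flat_out))
    return pairs


pairs = build_training_pairs(n_samples=4, size=200)
```

Each pair has 40000 input values and 40000 target values, matching a network with one input node and one output node per pixel of a 200 * 200 image.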