Why does distorting images improve training on a neural network? - neural-network

I cannot understand why distorting an image, e.g. flipping it or increasing the gamma intensity, would somehow increase the accuracy of a neural network.
In my situation, I am using a CNN to detect whether dogs are present in an image, and I was recommended to add distortion.

Actually, @Ash is right. You don't distort all the images. If you add distortion to some of them, the neural network is able to generalize better and therefore becomes more robust and less sensitive to noise.
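For concreteness, here is a minimal NumPy sketch of the two distortions mentioned in the question (a flip and a gamma change). The parameter ranges are arbitrary choices for illustration, not values taken from the question.

```python
import numpy as np

def augment(image, rng=None):
    """Randomly flip and gamma-adjust an image of shape (H, W, C) with values in [0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    # Horizontal flip half of the time: a dog is still a dog when mirrored.
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    # A random gamma in [0.7, 1.3] simulates varying exposure/lighting.
    gamma = rng.uniform(0.7, 1.3)
    return np.clip(image, 0.0, 1.0) ** gamma
```

Applying this on the fly during training effectively shows the network many slightly different versions of each dog photo, which is where the improved generalization comes from.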

Related

Train Neural network with image pixels as input and get the screen coordinate value as output in MATLAB

I am a newbie with neural networks. In my project I need to implement a neural network that takes image pixels as input and gives a screen coordinate value as output. I have a dataset which I collected by running an experiment with many volunteers. In detail, I only need to feed the cropped eye pixels into the neural network for training; the cropped eye is of size 30*30 (approx.) after resizing. I created the dataset from users looking at different specific points on the screen. Each point is at a specific coordinate that is known to me. It is basically an implementation of a research paper.
If you can suggest how I should proceed to create the neural network, that would be a great help.
If you want to use the image itself as input to the neural network, you can use a Convolutional Neural Network; a CNN can take an image directly as input.
If you instead extract features from the image first, you can feed those features to an MLP.
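As a rough illustration (tf.keras is my own assumption here; the question does not name a framework and the research paper is not reproduced), a small CNN that maps a 30x30 grayscale eye crop directly to an (x, y) screen coordinate could look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Minimal sketch: 30x30 grayscale eye crop in, (x, y) screen coordinate out.
model = tf.keras.Sequential([
    layers.Input(shape=(30, 30, 1)),         # cropped eye image
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2),                         # regressed (x, y) coordinate
])

# Predicting coordinates is a regression problem, so train with a
# mean-squared-error loss rather than a classification loss.
model.compile(optimizer="adam", loss="mse")
# model.fit(eye_crops, screen_coords, epochs=..., validation_split=0.1)
```

Normalizing the target coordinates to [0, 1] (dividing by the screen width and height) usually makes such a regression easier to train.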

Convolutional neural network back propagation - delta calculation in convolutional layer

So I’m trying to make a CNN and so far I think I understand all of the forward propagation and the back propagation in the fully connected layers. However, I’m having some issues with the back prop in the convolutional layers.
Basically I’ve written out the dimensions of everything at each stage in a CNN with two convolutional layers and two fully connected layers, with the input having a depth of 1 (as it is black and white) and only one filter being applied at each convolutional layer. I haven’t bothered to use pooling at this stage since, to my knowledge, it shouldn’t have any impact on the calculus, only on where each delta is assigned, so the dimensions should still fit as long as I also don’t include any unpooling in my backprop. I also haven’t written out the dimensions after the application of the activation functions, as they are the same as those of their input and I would be writing the same values twice.
The dimensions, as you will see, vary slightly in format. For the convolutional layers I’ve written them as though they are images, rather than in matrix form, whilst for the fully connected layers I’ve written the dimensions as the sizes of the matrices used (this will hopefully make more sense when you see it).
The issue is that in calculating the delta for the convolutional layers, the dimensions don’t fit. What am I doing wrong?
Websites used:
http://cs231n.github.io/convolutional-networks/
http://neuralnetworksanddeeplearning.com/chap2.html#the_cross-entropy_cost_function
http://www.jefkine.com/general/2016/09/05/backpropagation-in-convolutional-neural-networks/
Calculation of dimensions:
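The asker's dimension calculations are not reproduced here, but as a general illustration of how the convolutional-layer deltas keep their dimensions consistent, here is a small NumPy/SciPy sketch of the standard result: the delta flowing back to the layer's input is the "full" convolution of the output delta with the filter (equivalently, a full cross-correlation with the 180-degree-rotated filter). The shapes below are arbitrary examples.

```python
import numpy as np
from scipy.signal import correlate2d, convolve2d

x = np.random.randn(8, 8)        # input feature map (depth 1, as in the question)
k = np.random.randn(3, 3)        # a single 3x3 filter

# Forward pass: CNNs typically compute a 'valid' cross-correlation,
# so the output is (8-3+1) x (8-3+1) = 6x6.
y = correlate2d(x, k, mode="valid")

# Suppose delta_y = dL/dy arrives from the next layer (same 6x6 shape as y).
delta_y = np.random.randn(*y.shape)

# dL/dx is the *full* convolution of delta_y with the filter. convolve2d
# flips the kernel internally, so this is the cross-correlation with the
# 180-degree-rotated filter that the maths requires, and mode="full"
# restores the original 8x8 input shape.
delta_x = convolve2d(delta_y, k, mode="full")
assert delta_x.shape == x.shape

# dL/dk is the 'valid' cross-correlation of the input with delta_y (3x3).
grad_k = correlate2d(x, delta_y, mode="valid")
assert grad_k.shape == k.shape
```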

Deconvolution with caffe

I was wondering if it is possible to perform a deconvolution of images in Caffe using the point spread function of the objective at a given focal point, something along the lines of this approach.
If yes, what would be the best way to proceed?
It is possible to deconvolve images using Caffe (and CNNs in general), but the approach may not be as general as you hope.
CNNs can take a blurry image as input and output a sharp image. Since the networks are convolutional, the input can be of any size. This can easily be done in Caffe using Convolution layers and a Euclidean loss layer. Optionally, you can experiment with adding some pooling and deconvolution layers.
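As a rough sketch of that setup (assuming pycaffe's NetSpec API; the layer sizes, fillers and 64x64 patch size are illustrative choices, not taken from the answer):

```python
import caffe
from caffe import layers as L

n = caffe.NetSpec()
# Blurry input patch and the corresponding sharp target (grayscale, 64x64 here).
n.blurry = L.Input(shape=dict(dim=[1, 1, 64, 64]))
n.sharp = L.Input(shape=dict(dim=[1, 1, 64, 64]))

n.conv1 = L.Convolution(n.blurry, num_output=64, kernel_size=5, pad=2,
                        weight_filler=dict(type="gaussian", std=0.01))
n.relu1 = L.ReLU(n.conv1, in_place=True)
n.conv2 = L.Convolution(n.relu1, num_output=64, kernel_size=5, pad=2,
                        weight_filler=dict(type="gaussian", std=0.01))
n.relu2 = L.ReLU(n.conv2, in_place=True)
# Final layer predicts the sharp patch: one channel, same spatial size.
n.deblurred = L.Convolution(n.relu2, num_output=1, kernel_size=5, pad=2,
                            weight_filler=dict(type="gaussian", std=0.01))
# Euclidean (L2) loss between the prediction and the ground-truth sharp patch.
n.loss = L.EuclideanLoss(n.deblurred, n.sharp)

with open("deblur_train.prototxt", "w") as f:
    f.write(str(n.to_proto()))
```

In a real training net you would replace the Input layers with a data layer reading (blurry, sharp) pairs from your own LMDB or HDF5 source.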
CNNs can be trained to deconvolve images for a specific blur PSF, as in your link (see [Xu et al.: Deep Convolutional Neural Network for Image Deconvolution. NIPS 2014]). This works well, but you have to re-train the CNN for each new PSF, which takes a lot of time.
I've tried to train CNNs to do blind deconvolution (where the PSF is not known) and it works very well for text documents. You can get trained nets and python-Caffe scripts at [Hradiš et al.: Convolutional Neural Networks for Direct Text Deblurring. BMVC 2015]. This approach could work for other types of images, but it would not work for unrestricted photographs and diverse blurs. For general photos, I would guess it could work for a small range of blurs.
Another possibility is to do inverse filtration (e.g. using a Wiener filter) and process the output with a CNN. The advantage of this is that you can compute the inverse filter for a new PSF very quickly, and the CNN stays the same. [Schuler et al.: A machine learning approach for non-blind image deconvolution. CVPR 2013]
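A minimal sketch of that two-stage pipeline, assuming scikit-image for the Wiener step (the answer does not prescribe a particular library, and cnn_refine stands for whatever trained network you use for the clean-up):

```python
from skimage import restoration

def wiener_then_cnn(blurry, psf, cnn_refine):
    """blurry: 2-D float image; psf: 2-D blur kernel; cnn_refine: any trained
    model that maps an image to a cleaned-up image."""
    # Non-blind inverse filtration: cheap to recompute for every new PSF.
    estimate = restoration.wiener(blurry, psf, balance=0.1)
    # The CNN only ever sees Wiener outputs, so it does not need retraining
    # when the PSF changes.
    return cnn_refine(estimate)
```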

image processing with neural network

I am working on the topic of brain tumor segmentation. I have used the "Bounding Box Method Using Symmetry" algorithm to find and segment the tumor. Following is the output.
As you can see, I have successfully segmented the tumor and now want to apply a neural network to it. I know the workings and mathematics behind a simple neural network but don't know how to train a neural network to work with my algorithm. In short, I want to know how to begin neural network training. Any simple code or direction, preferably using Matlab, would be highly appreciated.
Image recognition
Neural networks are generally used for image recognition, not for pinpointing details in an image. You may design a neural network to tell you whether or not there is a tumor in the image, but it is not a trivial task for a neural network to tell you where it is located.
If you do decide to let your network determine whether or not a given image contains a tumor, you will need a huge number of images without tumors as well. The ratio of "images with tumor" to "images without tumor" should be close to the ratio actually observed in the real world. If you deviate from this ratio, the network will be prone to false positives, as it learns that a majority of the images should contain tumors.
In your case
If you input an MR image which contains a tumor and want to receive a segmented tumor image, you should probably have 500*500 input signals and 500*500 output signals, and train the network to create a border around the tumor present in the image.
If you extend your question to explain why you want the neural network to behave like this, then there might be someone here on SO who could help you!
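As one possible starting point (in Python/tf.keras rather than the Matlab the question asks for, and using a small fully convolutional network instead of the dense 500*500-to-500*500 mapping sketched above), training a network to emit a per-pixel tumor mask could look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Minimal sketch: one grayscale MR slice in, one per-pixel tumor mask out.
model = tf.keras.Sequential([
    layers.Input(shape=(500, 500, 1)),                       # MR slice
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.Conv2D(1, 1, activation="sigmoid"),               # per-pixel tumor probability
])

# Each output pixel is a binary "tumor / not tumor" decision, so a per-pixel
# binary cross-entropy is a natural training loss.
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(mr_slices, tumor_masks, epochs=..., batch_size=...)
```

The training targets (tumor_masks) would be the segmentations your bounding-box method already produces, thresholded to 0/1 per pixel.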

How do neural networks handle large images where the area of interest is small?

If I've understood correctly, when training neural networks to recognize objects in images it's common to map a single pixel to a single input layer node. However, sometimes we might have a large picture with only a small area of interest. For example, if we're training a neural net to recognize traffic signs, we might have images where the traffic sign covers only a small portion, while the rest is taken up by the road, trees, sky etc. Creating a neural net which tries to find a traffic sign at every position seems extremely expensive.
My question is: are there any specific strategies to handle these sorts of situations with neural networks, apart from preprocessing the image?
Thanks.
Using 1 pixel per input node is usually not done. What enters your network is the feature vector, and as such you should input actual features, not raw data. Inputting raw data (with all its noise) will not only lead to bad classification, but training will also take longer than necessary.
In short: preprocessing is unavoidable. You need a more abstract representation of your data. There are hundreds of ways to deal with the problem you're asking about. Let me give you some popular approaches.
1) Image processing to find regions of interest. When detecting traffic signs, a common strategy is to use edge detection (i.e. convolution with some filter), apply some heuristics, use a threshold filter and isolate regions of interest (blobs, strongly connected components etc.) which are taken as input to the network. A small code sketch of this step follows at the end of this answer.
2) Applying features without any prior knowledge or image processing. Viola and Jones use a specific image representation from which they can compute features very quickly. Their framework has been shown to work in real time. (I know their original work doesn't use NNs, but I applied their features to multilayer perceptrons in my thesis, so you can use them with any classifier, really.)
3) Deep Learning.
Learning better representations of the data can be incorporated into the neural network itself. These approaches are among the most actively researched at the moment. Since this is a very large topic, I can only give you some keywords so that you can research it on your own. Autoencoders are networks that learn efficient representations; it is possible to use them with conventional ANNs. Convolutional neural networks may seem a bit sophisticated at first sight, but they are worth checking out: before the actual classification stage, they apply alternating layers of subwindow convolution (edge detection) and resampling. CNNs are currently able to achieve some of the best results in OCR.
In every scenario you have to ask yourself: am I 1) giving my ANN a representation that has all the data it needs to do the job (a representation that is not too abstract), and 2) keeping too much noise away (and thus staying abstract enough)?
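Here is the sketch promised under 1): a minimal region-of-interest proposal step using edge detection, a threshold and connected components. scikit-image is my own choice here; the answer does not prescribe a library.

```python
from skimage import filters, measure

def propose_regions(gray_image, min_area=100):
    """Return bounding boxes of candidate regions in a 2-D grayscale image."""
    edges = filters.sobel(gray_image)               # convolution-based edge map
    mask = edges > filters.threshold_otsu(edges)    # keep only the strongest edges
    labels = measure.label(mask)                    # connected components
    return [r.bbox for r in measure.regionprops(labels) if r.area >= min_area]

# Each returned box (min_row, min_col, max_row, max_col) can be cropped,
# resized to the network's input size and classified as sign / not-sign.
```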
We usually don't use a fully connected network to deal with images because the number of units in the input layer would be huge. For images there is a specific kind of neural network: the convolutional neural network (CNN).
However, the CNN plays the role of a feature extractor. The encoded features are finally fed into a fully connected network which acts as a classifier. In your case, I don't know how small your object is compared to the full image, but if the object of interest is really small, even with a CNN the image-classification performance won't be very good. Then we probably need to use object detection (which uses a sliding window) to deal with it.
If you want to recognize small objects in a large image, you should use a "scanning window".
For the "scanning window" you can apply dimension-reduction methods (a small sketch follows after these links):
DCT (http://en.wikipedia.org/wiki/Discrete_cosine_transform)
PCA (http://en.wikipedia.org/wiki/Principal_component_analysis)
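A compact sketch of the scanning-window plus dimension-reduction idea from these last two answers (scikit-learn's PCA is my own assumption; the answers only name PCA and the DCT in general):

```python
import numpy as np
from sklearn.decomposition import PCA

def scanning_windows(image, window=64, stride=32):
    """Yield every window-sized patch of a 2-D image, flattened to a vector."""
    h, w = image.shape[:2]
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            yield image[r:r + window, c:c + window].ravel()

# Fit PCA on training patches so each 64*64 = 4096-dimensional window becomes
# a short feature vector the network can digest:
# pca = PCA(n_components=50).fit(np.array(list(scanning_windows(train_image))))
# features = pca.transform(np.array(list(scanning_windows(test_image))))
```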