I have images of different sizes (e.g. 50x100, 20x90). At the moment the input to my Convolutional Neural Network (CNN) is 28x28, so I just use the imresize function in MATLAB to resize them to 28x28. But I think that will increase the noise in the image. Is there any other way to first make the images equal in size and then resize them to 28x28?
I am trying to use different low resolution images for my work. Recently, I was reading LOW RESOLUTION CONVOLUTIONAL NEURAL NETWORK FOR AUTOMATIC TARGET RECOGNITION, in which they didn't mention how they made the low resolution images.
Resolution adaptation for feature computation: To show the influence of resolution on the performances of these image representations, we focus on seven specific resolutions ranging from 200 × 200 to 10 × 10 pixels.
Here are the example images from the paper.
Can anyone please help me implement this method in MATLAB?
Currently, I am making the low resolution images this way:
img = im2double(imread('cameraman.tif'));
conv_mat = ones(6) / 36;
img_low = convn(img,conv_mat,'same');
figure, imshow(img), title('Original');
figure, imshow(img_low), title('Low Resolution')
You have a good start there. The convolution makes each pixel contain the average of its 6x6 neighborhood. Now all that is left is to keep only one pixel in each 6x6 neighborhood; that pixel then carries the average of the information that is discarded:
img = im2double(imread('cameraman.tif'));
conv_mat = ones(6) / 36;                    % 6x6 averaging kernel
img_low = convn(img, conv_mat, 'same');     % each pixel = mean of its 6x6 neighborhood
img_low = img_low(3:6:end, 3:6:end);        % keep one pixel per 6x6 block
figure, imshow(img), title('Original');
figure, imshow(img_low), title('Low Resolution')
The 3:6:end simply indicates which rows and which columns to keep. I start the subsampling at 3 to avoid the border pixels, which were averaged with the zero padding outside the image.
Judging from the images you posted, they used this averaging method. Alternatives are to take the maximum in each neighborhood (as is done in the max-pooling layers of a convolutional neural network), or to subsample without any filtering (this introduces aliasing, so I don't recommend it).
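For reference, this is roughly how the two alternatives could look in MATLAB; the 6x6 block size matches the code above, but the blockproc call and the variable names are my own choices, not something from the paper:

img = im2double(imread('cameraman.tif'));
% Block maximum, analogous to a max-pooling layer: keep the max of each
% non-overlapping 6x6 block (image cropped to a multiple of 6 first).
sz = 6 * floor(size(img) / 6);
img_max = blockproc(img(1:sz(1), 1:sz(2)), [6 6], @(b) max(b.data(:)));
% Plain subsampling without any filtering (prone to aliasing):
img_sub = img(3:6:end, 3:6:end);
figure, imshow(img_max), title('Block max');
figure, imshow(img_sub), title('Subsampled only');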
I am trying to train my own network in Caffe, similar to the ImageNet model, but I am confused about the crop layer. As far as I understand the crop in the ImageNet model, during training it takes random 227x227 crops of the image and trains the network, while during testing it takes the center 227x227 crop. Don't we lose information from the image when we crop the center 227x227 patch out of the 256x256 image? And a second question: how can we define the number of crops to be taken during training?
Also, I trained the same network twice (same number of layers, same convolution sizes; the FC neurons obviously differ), first taking a 227x227 crop from the 256x256 image, and the second time taking a 255x255 crop from the 256x256 image. According to my intuition, the model with the 255x255 crop should give the better result, but I am getting higher accuracy with the 227x227 crop. Can anyone explain the intuition behind this, or am I doing something wrong?
Your observations are not specific to Caffe.
The sizes of the cropped images during training and testing need to be the same (227x227 in your case), because the subsequent network layers (convolutions, etc.) need their inputs to be the same size. Random crops are done during training because you want data augmentation. During testing, however, you want to evaluate against a standard dataset; otherwise the accuracy reported during testing would also depend on a shifting test set.
The crops are made dynamically at each iteration. All images in a training batch are randomly cropped. I hope this answers your second question.
Your intuition is not complete: with the smaller crop (227x227), you get more data augmentation, because there are many more distinct crop positions. Data augmentation essentially creates "new" training samples out of nothing, which is vital to prevent overfitting during training. With the larger crop (255x255), you should expect better training accuracy but lower test accuracy, since the model is more likely to overfit.
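To make the amount of augmentation concrete, here is a rough count of the distinct crop positions inside a 256x256 image (counting every integer offset; this is my own back-of-the-envelope check, not something from the Caffe documentation):

n_227 = (256 - 227 + 1)^2    % 900 possible positions for a 227x227 crop
n_255 = (256 - 255 + 1)^2    % 4 possible positions for a 255x255 crop

So the 227x227 crop exposes the network to far more distinct training samples than the 255x255 crop does.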
Of course, cropping can be overdone. Too much cropping and you lose too much information from an image. For image categorization, the ideal crop size is one that does not alter the category of an image (i.e., only background is cropped away).
I am training a CNN. Many authors mention randomly cropping images from the center of the original image, with a factor of 2048 data augmentation. Can anyone please elaborate on what this means?
I believe you are referring to the data augmentation scheme from ImageNet Classification with Deep Convolutional Neural Networks. The 2048x aspect of that scheme goes as follows:
First, all images are rescaled down to 256x256.
Then, for each image, they take random 224x224 crops.
For each random 224x224 crop, they additionally augment by taking its horizontal reflection.
So my guess as to how they get to the 2048x data augmentation factor:
There are 32*32 = 1024 possible 224x224 sized image crops of a 256x256 image. To see this simply observe that 256-224=32, so we have 32 possible horizontal indices and 32 possible vertical indices for our crops.
Taking the horizontal reflection of each crop doubles this count.
1024 * 2 = 2048.
The center crop aspect of your question stems from the fact that the original images are not all the same size. So what the authors did was rescale each rectangular image so that its shortest side was of size 256, and then take the center crop, thereby rescaling the entire dataset to 256x256. Once all the images are 256x256, they can perform the (up to) 2048x data augmentation scheme described above.
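As a rough MATLAB sketch of that pipeline (not the authors' code; the example image, the imresize calls and the crop indexing are my own assumptions):

img = imread('peppers.png');                    % stand-in for an ImageNet image
% Rescale so the shorter side becomes 256, then take the 256x256 center crop.
if size(img, 1) < size(img, 2)
    img = imresize(img, [256 NaN]);
else
    img = imresize(img, [NaN 256]);
end
[h, w, ~] = size(img);
r0 = floor((h - 256) / 2);  c0 = floor((w - 256) / 2);
img256 = img(r0 + (1:256), c0 + (1:256), :);
% One random 224x224 crop plus its horizontal reflection.
r = randi(256 - 224 + 1);  c = randi(256 - 224 + 1);
patch = img256(r + (0:223), c + (0:223), :);
mirrored = flip(patch, 2);
figure, imshow(patch), title('Random 224x224 crop');
figure, imshow(mirrored), title('Horizontal reflection');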
Design a 2-D averaging filter that can decrease the effect of noise on the image “waves_noise.jpg”. You can use the Image Processing Toolbox in MATLAB to read images, convert them from RGB to gray level, and do 2-D convolution using MATLAB functions.
Look into the imfilter command. This question is easy.
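A minimal sketch of how that could look, assuming a 5x5 averaging kernel (the kernel size is arbitrary, and “waves_noise.jpg” is the file named in the assignment):

img = imread('waves_noise.jpg');
if size(img, 3) == 3
    img = rgb2gray(img);                         % RGB to gray level
end
h = fspecial('average', [5 5]);                  % 5x5 averaging kernel
img_smooth = imfilter(img, h, 'replicate');      % 2-D filtering with border replication
figure, imshow(img), title('Noisy');
figure, imshow(img_smooth), title('Averaged');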
I have an image with uniform intensity everywhere (gray value = 100), to which I added additive zero-mean independent Gaussian noise with standard deviation = 5:
I = ones(100,100)*100;
I_n=I+(5*randn(100,100));
I think that the mean and standard deviation of the pixel intensity in the noisy image will be 100 and 5, respectively.
Then I want to reduce the noise in the noisy image with a 2x2 averaging mask.
What is the effect of the averaging mask on the mean and standard deviation of the pixel intensity in the image?
Is it better to increase the size of the mask?
For a uniform original image and spatially uniform noise, averaging won't change the mean. It will reduce the variation between pixels, but also make the noise between adjacent pixels correlated.
If you calculated the standard deviation, you would find that the value is about 2.5: it is reduced by a factor of 2 = sqrt(4), because you averaged 4 independent values.
Using a larger mask will reduce the noise further, but correlate it over more pixels. It will also blur any structure in the underlying image more (not an issue in this case, because the image is uniform).
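A quick numerical check of that claim, using conv2 with a 2x2 averaging mask (the 'valid' option is my choice, just to avoid border effects):

I = ones(100,100) * 100;
I_n = I + 5 * randn(100,100);          % zero-mean Gaussian noise, sigma = 5
mask = ones(2) / 4;                    % 2x2 averaging mask
I_f = conv2(I_n, mask, 'valid');
mean(I_f(:))                           % still about 100
std(I_f(:))                            % about 5/sqrt(4) = 2.5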
Standard averaging techniques will not work well in these situations. Use a Wiener filter if you have the autocorrelation matrices; otherwise, use Gaussian Process Regression with a suitable kernel matrix.
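If you want to try the Wiener approach in MATLAB, wiener2 implements a pixel-wise adaptive variant based on local statistics rather than full autocorrelation matrices; a minimal sketch on the noisy image above (the 5x5 neighborhood is arbitrary):

I_n = ones(100,100) * 100 + 5 * randn(100,100);
I_w = wiener2(I_n, [5 5], 5^2);        % known noise variance = 25
std(I_w(:))                            % noticeably below the original sigma of 5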