Caffe bvlc_googlenet minimum accepted dimensions - neural-network

What is the minimum image input size accepted by bvlc_googlenet model implemented by Caffe?
I'm using 50 x 50 images with crop_size = 36, and I get the following error when running the solver:
caffe::Blob<>::Reshape() - Floating point exception
I have to resize my images to 256 x 256 (default input size of the bvlc_googlenet model) with crop_size = 224 to avoid the error.
Does this model only accept its default sizes, or do I have to hack around a bit to make it happen?
Thanks!!

After several hours of trying to fix the problem, I figured out why I was facing it.
GoogleNet accepts 224x224 images as input by default. Because the network is so deep, passing a 50x50 image (or 36x36 after cropping) through its successive convolution and pooling layers produces a feature map that eventually becomes smaller than the kernel size of the next layer. This causes a Reshape exception similar to the one I faced here.
Solution:
Although it's not preferred to edit the kernel_size param of the layer causing the exception (so that you keep to the network's specifications), doing so will fix the problem: choose a smaller kernel size and then test the results until it works.
Alternatively, follow GoogleNet's default specifications by resizing your input images to 256x256 (keeping crop_size at 224), or resize directly to 224x224 and remove the crop_size param.
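For intuition, here is a rough arithmetic sketch in Python (not Caffe code) that traces how the spatial size shrinks through GoogleNet's strided stages; the kernel/stride/pad values below are taken from the public bvlc_googlenet prototxt, and the helper function is only illustrative:
import math
# Output spatial size of one conv/pool stage. Caffe rounds down for convolutions
# and up for pooling, hence the ceil_mode flag.
def out_size(in_size, kernel, stride=1, pad=0, ceil_mode=False):
    span = in_size + 2 * pad - kernel
    steps = math.ceil(span / stride) if ceil_mode else span // stride
    return steps + 1
for start in (224, 36):
    s = out_size(start, kernel=7, stride=2, pad=3)  # conv1/7x7_s2
    for _ in range(4):  # pool1..pool4, each 3x3 with stride 2
        s = out_size(s, kernel=3, stride=2, ceil_mode=True)
    print(f"{start}x{start} crop -> {s}x{s} before pool5/7x7_s1")
# 224x224 -> 7x7: the final 7x7 average pool fits exactly.
# 36x36   -> 1x1: smaller than pool5's 7x7 kernel, hence the Reshape exception.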

Related

How to train an FCN network when the sizes of the images are not fixed and vary?

I have already trained the FCN model with fixed-size 256x256 images. Could I ask the experts how I can train the same model when the image size changes from one image to another?
I really appreciate your advice.
Thanks
You can choose one of these strategies:
1. Batch = 1 image
By training each image as a different batch, you can reshape the net in the forward() (rather than in reshape()) of the data layer, thus changing the net at each iteration.
+ Write the reshape once in the forward method and you no longer need to worry about input shapes and sizes.
- Reshaping the net often requires allocation/deallocation of CPU/GPU memory, and therefore it takes time.
- You might find a single image per batch to be too small a batch.
For example (assuming you are using a "Python" layer for input):
def reshape(self, bottom, top):
    pass  # you do not reshape here.

def forward(self, bottom, top):
    # reshape the top blobs - this propagates the new shape to the rest of the net at each iteration
    top[0].reshape( ... )  # image blob
    top[1].reshape( ... )  # label blob
    # feed the data to the net (current_img/current_label are whatever you loaded for this iteration)
    top[0].data[...] = current_img
    top[1].data[...] = current_label
2. Random crops
You can decide on a fixed input size and then randomly crop all input images (and the corresponding ground truths).
+ No need to reshape every iteration (faster).
+ Control over the model size during training.
- Need to implement random crops for images and labels (see the sketch after this list).
3. Fixed size
Resize all images to the same size (like in SSD).
+ Simple
- Images are distorted if not all of them have the same aspect ratio.
- You are not invariant to scale.
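Here is a minimal Python/NumPy sketch of option 2 (the helper name random_crop_pair is made up for illustration); the only real requirement is that the image and its label map are cropped with the same window:
import numpy as np
def random_crop_pair(img, label, crop_h, crop_w):
    # Crop the same random window from an image and its per-pixel label map.
    h, w = img.shape[:2]
    assert label.shape[:2] == (h, w), 'image and label must share spatial size'
    y = np.random.randint(0, h - crop_h + 1)
    x = np.random.randint(0, w - crop_w + 1)
    return (img[y:y + crop_h, x:x + crop_w],
            label[y:y + crop_h, x:x + crop_w])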

Rectified images of the same size as the initial ones

I want to rectify a stereo image pair in MATLAB. To rectify, I use the following call:
[J1,J2] = rectifyStereoImages(I1,I2, cameraParamsStereo);
If I do this, then I only get the so-called valid part of each image, which is smaller than the initial image size. If I specify the argument OutputView as full, then I get rectified images which are larger than the original ones.
Is there a way to get rectified images that have the same size as the original ones?
It is possible in principle, but rectifyStereoImages does not support this.

Fully Convolution Networks with Varied inputs

I have a fully convolutional neural network, U-Net, described in the paper below.
https://arxiv.org/pdf/1505.04597.pdf
I want to use it to do pixelwise classification of images. I have my training images available in two sizes: 512x512 and 768x768. I am using reflection padding of size (256,256,256,256) for the former in the initial step, and (384,384,384,384) for the latter. I do successive padding before the convolutions to get an output of the same size as the input.
But since my padding depends on the image/input size, I can't build a generalised model (I am using Torch).
How is the padding done in such cases?
I am new to deep learning, any help would be great. Thanks.
Your model will only accept images of the size of the first layer. You have to pre-process all of them before forwarding them to the network. In order to do so, you can use:
image.scale(img, width, height, 'bilinear')
img is the image to scale, width and height are the input size of the first layer of your model (if I'm not mistaken it is 572*572), and 'bilinear' is the algorithm used to scale the image.
Keep in mind that it might be necessary to subtract the mean of the image or to convert it to BGR (depending on how the model was trained).
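The thread uses Torch, but for reference here is an illustrative Python/OpenCV sketch of the same preprocessing idea (the 572x572 target size and the per-channel mean are placeholders; use whatever the trained model actually expects):
import cv2
import numpy as np
def preprocess(img_rgb, size=(572, 572), mean_bgr=(104.0, 117.0, 123.0)):
    # bilinear scaling to the network's expected input size
    img = cv2.resize(img_rgb, size, interpolation=cv2.INTER_LINEAR)
    # RGB -> BGR, then subtract the per-channel mean
    img = img[:, :, ::-1].astype(np.float32)
    img -= np.array(mean_bgr, dtype=np.float32)
    return img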
The first thing to do is to process all of your images to be the same size. The CONV layer input requires all images to be of the specified dimensions.
Caffe allows you a reshape within the prototxt file; in Torch, I think there's a comparable command you can drop at the front of createModel, but I don't recall the command name. If not, then you'll need to do it outside the model flow.
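If you do end up on the Caffe side, one way to sketch the reshape is through the Python interface rather than the prototxt (the file names below are placeholders; net.blobs['data'].reshape and net.reshape are standard pycaffe calls):
import caffe
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1, 3, 768, 768)  # batch, channels, height, width
net.reshape()  # propagate the new input shape through the net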

Form a single image from multiple blocks without getting the chessboard pattern

I'm using the Hopfield neural network to process a 400x400 satellite image.
However due to hardware issues I'm unable to process the entire image as a single image. Hence I've divided it into blocks of 50x50 each.
However after processing these blocks and combining them to form a single image, the borders of the blocks show up. How can I avoid this?
Maybe you can run the same algorithm on your image twice: do it once normally, then slightly offset your blocks and do it again, then average the two together. You can still see the "checkerboard", but it's not as noticeable. You may have to play with the offset to get more desirable results. Also, you can probably make the code smarter so that it doesn't change the image size, but this was just a quick proof of concept.
I used histogram equalization as my algorithm. You can see the "avg of blocks" looks less "chessboard-like". I even did a difference between processing the whole image and processing the blocks; you can see the difference between the average and the whole image is much smaller than for either of the two block results.
offset = 25;
fun = @(block_struct) histeq(block_struct.data)
%processes entire image, this is the baseline
a = histeq(im);
%does original block processing
b = blockproc(im,[125,125],fun);
%offsets the blocks and does processing again, please notice this
%changes the size of the image
c = blockproc(im(offset:end,offset:end),[125,125],fun);
%averages the two together (using the smaller image)
d = b(offset:end,offset:end)*.5 + .5*c;
%the original image shows what processing the entire image looks like
figure(1)
subplot(3,2,1:2);imshow(im);title('original')
subplot(3,2,3);imshow(a);title('operation on entire image')
subplot(3,2,4);imshow(d);title('avg of blocks')
subplot(3,2,5);imshow(b);title('blocks')
subplot(3,2,6);imshow(c);title('offset block')
figure(2);suptitle('difference between operation on entire image and block images')
subplot(2,2,1);imshow(a);title('operation on entire image')
subplot(2,2,2);imshow(abs(a(offset:end,offset:end)-d));title('avg of blocks')
subplot(2,2,3);imshow(abs(a-b));title('blocks')
subplot(2,2,4);imshow(abs(a(offset:end,offset:end)-c));title('offset block')

How to determine a projected (if 3D) aspect ratio (if set) of a figure in Matlab to specify a proper paper size?

I saw many Q&A here about squeezing space out of Matlab figures. However, I want to squeeze out the space resulting from a possibly fixed aspect ratio, i.e. to choose a proper paper size for figure printing when the aspect is fixed.
Quite often I work with DEM/map/image thus I use axis image. Now if I want to produce a high resolution image I do something like
set(gcf,'PaperUnits','inches','PaperPosition',[0 0 4 3])
print('-dpng','-r300','somefile.png')
as described in Matlab KB.
The problem here is to determine a proper aspect such that I can specify proper paper size that would leave no white/background stripes on either sides.
Apparently if I have a map (let's say 1000x2000 cells) with aspect ratio of 0.5, and I'm printing it on 4"x3" paper, I'll get background stripes on the sides. This is quite annoying as I'd prefer 1.5"x3" paper + axes & labels or so. Right now I have to manually adjust paper size.
This is inconvenient as I'd like a universal solution. For instance I may print a plot into file that I expect to occupy 4"x3" as well that has no fixed aspect. Or I may want to print a 3D figure. I'm aware of daspect and pbaspect, but how can I know how it is currently drawn?
Perhaps I can derive the current 2D aspect from get(gca,'Position') and then scale it to my maximum allowed desired size (e.g., 4"x3") while respecting whether the DataAspectRatioMode (?) property is set to manual. Is that the way to proceed, or is there a better way?
I am not sure I understand your problem exactly, but I have used the following commands to create PDF images that are sized exactly to the size of the figure. I have used this for both 2D and 3D figures. The "handle" variable is simply your figure handle.
set(handle,'Units','inches');                                  % work in physical units
set(handle,'PaperUnits','Inches','PaperPositionMode','auto');  % printed size follows the on-screen size
P = get(handle,'Position');                                    % [left bottom width height] in inches
set(handle,'PaperSize', [P(3),P(4)]);                          % paper exactly as wide and tall as the figure