How to train an FCN network when the image sizes are not fixed and vary? - neural-network

I have already trained the FCN model with fixed-size images (256x256). Could I ask the experts how I can train the same model when the image size changes from one image to the next?
I really appreciate your advice.
Thanks

You can choose one of these strategies:
1. Batch = 1 image
By training each image as a different batch, you can reshape the net in the forward() (rather than in reshape()) of the data layer, thus changing the net at each iteration.
+Write the reshape once in the forward method and you no longer need to worry about input shapes and sizes.
-Reshaping the net often requires allocation/deallocation of CPU/GPU memory, and therefore it takes time.
-You might find a single image per batch to be too small a batch.
For example (assuming you are using a "Python" layer for input):
def reshape(self, bottom, top):
    pass  # you do not reshape here

def forward(self, bottom, top):
    top[0].reshape( ... )  # reshape the blob - this propagates the reshape to the rest of the net at each iteration
    top[1].reshape( ... )
    # feed the data to the net
    top[0].data[...] = current_img
    top[1].data[...] = current_label
2. Random crops
You can decide on a fixed input size and then randomly crop all input images (and the corresponding ground truths).
+No need to reshape every iteration (faster).
+Control over the model size during training.
-You need to implement random cropping for both images and labels (see the sketch after this list).
3. Fixed size
Resize all images to the same size (like in SSD).
+Simple
-Images are distorted if not all images have the same aspect ratio.
-You are not invariant to scale.
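A minimal sketch of option 2 (random crops) in Python, assuming the image and its label map are numpy arrays with matching spatial dimensions; the function name and crop sizes are placeholders, not tied to any particular framework:

import numpy as np

def random_crop(img, label, crop_h, crop_w):
    # crop the same random window from an image and its label map
    h, w = img.shape[:2]
    assert h >= crop_h and w >= crop_w, 'crop must fit inside the image'
    y = np.random.randint(0, h - crop_h + 1)
    x = np.random.randint(0, w - crop_w + 1)
    return (img[y:y + crop_h, x:x + crop_w],
            label[y:y + crop_h, x:x + crop_w])

You would call something like this inside your data layer / data loader for every training sample, so each iteration sees a differently positioned crop.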

Related

If the size of the input image is different from the size of the images used for training, does that impact the final segmentation quality/accuracy?

I am doing a project for uni where I detect an object with U-Net and then calculate the width of the object. I trained my U-Net on images of size 300x300. Now I have reached a point where I want to improve the accuracy of the width measurement, and for that reason I want to feed larger images (say, 600x600) into the model. Does this difference in size (training on 300x300, predicting on 600x600) impact the overall segmentation quality?
I'm guessing it does, but I am not sure.

MLModel style transfer prediction: scale effect (brushstrokes)

I have a style transfer model which was trained with PyTorch and converted to an mlmodel via ONNX. The style image was 1500x2000. Using coremltools I set two input sizes: 256x256 and 1500x2000.
Now I can pass two image sizes to the prediction process. Here are the results:
On the left is the 1500x2000 image, and on the right is the 256x256 one (scaled up after processing).
Is it possible to pass the big image but get the bigger brushstrokes you can see in the image on the right? In other words, I want to keep the image size and quality (1500x2000) but change the scale of the style (brushstrokes). Or is that not possible, and does it depend entirely on the size of the style image used to train the model?
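For reference, enumerating the accepted input sizes on the converted model is typically done with coremltools' flexible-shape utilities; a rough sketch, assuming the model's image input feature is named 'image' and the file paths are placeholders:

import coremltools
from coremltools.models.neural_network import flexible_shape_utils

spec = coremltools.utils.load_spec('style_transfer.mlmodel')  # hypothetical path

# allow the model to accept either 256x256 or 1500x2000 inputs
sizes = [flexible_shape_utils.NeuralNetworkImageSize(height=256, width=256),
         flexible_shape_utils.NeuralNetworkImageSize(height=2000, width=1500)]
flexible_shape_utils.add_enumerated_image_sizes(spec, feature_name='image', sizes=sizes)

coremltools.utils.save_spec(spec, 'style_transfer_flexible.mlmodel')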

Fully Convolutional Networks with varied inputs

I have a fully convolutional neural network, U-Net, which can be read below.
https://arxiv.org/pdf/1505.04597.pdf
I want to use it for pixelwise classification of images. My training images are available in two sizes: 512x512 and 768x768. In the initial step I use reflection padding of size (256,256,256,256) for the former and (384,384,384,384) for the latter, and I pad again before each convolution so that the output has the same size as the input.
But since my padding depends on the image/input size, I can't build a generalised model (I am using Torch).
How is the padding done in such cases?
I am new to deep learning, any help would be great. Thanks.
Your model will only accept images of the size of the first layer. You have to pre-process all of them before forwarding them to the network. In order to do so, you can use:
image.scale(img, width, height, 'bilinear')
img is the image to scale, width and height are the size of the first layer of your model (if I'm not mistaken it is 572x572), and 'bilinear' is the algorithm used to scale the image.
Keep in mind that it might also be necessary to subtract the mean of the image or to convert it to BGR (depending on how the model was trained).
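A rough Python equivalent of that preprocessing, using Pillow and numpy; the 572x572 size, the mean value, and the BGR flag are assumptions that depend on how your model was trained:

import numpy as np
from PIL import Image

def preprocess(path, size=(572, 572), mean=None, to_bgr=False):
    # resize with bilinear interpolation to the network's input size
    img = Image.open(path).convert('RGB').resize(size, Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32)
    if mean is not None:
        arr -= mean            # e.g. a per-channel mean computed on the training set
    if to_bgr:
        arr = arr[..., ::-1]   # swap RGB channels to BGR
    return arr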
The first thing to do is to process all of your images to be the same size. The CONV layer input requires all images to be of the specified dimensions.
Caffe allows you to reshape within the prototxt file; in Torch, I think there's a comparable command you can drop at the front of createModel, but I don't recall the command name. If not, then you'll need to do it outside the model flow.

find accumulated frame difference energy image

I have a gait recognition system in MATLAB. I want to find the accumulated frame difference energy image (AFDEI) from the frame difference images. The AFDEI is obtained by a weighted-average method, which can reflect the temporal characteristics. The following formula shows how to calculate the accumulated frame difference image:
A(x,y) = (1/N) * Σ_{t=1}^{N} F(x,y,t)
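For illustration, this average can be sketched in Python/numpy as follows (the list frames, holding N equally sized grayscale frame-difference images, is an assumption):

import numpy as np

def afdei(frames):
    # A(x,y) = (1/N) * sum over t of F(x,y,t)
    stack = np.stack([f.astype(np.float64) for f in frames], axis=0)
    return stack.mean(axis=0)  # convert back to uint8 only for display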
These are my frame difference images (5 images):
frame difference images
I want to find the accumulated frame difference energy image (AFDEI), like this:
result image
I tried to sum the 5 images and take the average, but it gives me a very different image.
So how do I find the AFDEI?
I gave this a shot:
There is some sort of post processing filtering after averaging the images.
This is the result from averaging:
And this is after applying a mode filter with a 3x3 window to the previous image:
So I'd say that your target image is using some sort of smarter colouring algorithm. I'm not sure about this, but it looks like it overlays the borders of the original frames, then fills the resulting zones with the mode/modal value of the AFDEI.
EDIT: the mode filter used above:
function target = modeFilter(origin)
% origin is a monochrome image matrix
% being lazy with the margins: you may pad/resize the image first if you
% want to filter the borders without out-of-bounds errors
target = origin;
[h, w] = size(origin);
for x = 2:w-1
    for y = 2:h-1
        window = origin(y-1:y+1, x-1:x+1);   % 3x3 neighbourhood
        target(y, x) = mode(window(:));      % modal value of the window
    end
end
end

Form a single image from multiple blocks without getting the chessboard pattern

I'm using the Hopfield neural network to process a 400x400 satellite image.
However due to hardware issues I'm unable to process the entire image as a single image. Hence I've divided it into blocks of 50x50 each.
However after processing these blocks and combining them to form a single image, the borders of the blocks show up. How can I avoid this?
Maybe you can run the same algorithm on your image twice: do it once normally, then slightly offset your blocks and do it again, and average the two results. You can still see the "checkerboard", but it's not as noticeable. You may have to play with the offset to get more desirable results. You could probably also make the code smarter so that it doesn't change the image size, but this was just a quick proof of concept.
I used histogram equalization as my algorithm. You can see the "avg of blocks" result looks less "chessboard-like". I even computed the difference against processing the whole image: the difference is much smaller between the average and the whole image than for either of the two block versions.
offset = 25;
fun = @(block_struct) histeq(block_struct.data);
%processes entire image, this is the baseline
a = histeq(im);
%does original block processing
b = blockproc(im,[125,125],fun);
%offsets the blocks and does processing again, please notice this
%changes the size of the image
c = blockproc(im(offset:end,offset:end),[125,125],fun);
%averages the two together (using the smaller image)
d = b(offset:end,offset:end)*0.5 + 0.5*c;
%the original image shows what processing the entire image looks like
figure(1)
subplot(3,2,1:2);imshow(im);title('original')
subplot(3,2,3);imshow(a);title('operation on entire image')
subplot(3,2,4);imshow(d);title('avg of blocks')
subplot(3,2,5);imshow(b);title('blocks')
subplot(3,2,6);imshow(c);title('offset block')
figure(2);suptitle('difference between operation on entire image and block images')
subplot(2,2,1);imshow(a);title('operation on entire image')
subplot(2,2,2);imshow(abs(a(offset:end,offset:end)-d));title('avg of blocks')
subplot(2,2,3);imshow(abs(a-b));title('blocks')
subplot(2,2,4);imshow(abs(a(offset:end,offset:end)-c));title('offset block')