I like blockproc, it makes working with large (very large) images easily. However, as far as I understand, it is limited to working with functions that output a matrix of the same size as the input they take.
So I was wondering if there is a way of replicating/simulating what blockproc does but for functions that output a cell array. We can either assume that the output array from the processing function is of the same dimensions of the input matrix, or that it just outputs one cell element, in which case the final output from the total processing would be a cell array with M x N elements, with M and N specifying the tiling for the processing.
I believe I can build this myself using cellfun, but I was wondering if there is are any other builtins or libraries (maybe third-party?) that I can use for this, and maybe even completely avoid reinventing the wheel.
More specifically, I am looking for something that has the same strengths as blockproc:
Can load a large image from disk progressively tile-by-tile to minimize the memory footprint of the processing
Takes care of the final concatenation of results for building the final cell array
Has an interface similar to blockproc (e.g. # of tiles, etc.)
Below is a solution that satisfies your criteria except for the first point
Use the IM2COL function to arrange distinct image blocks from the image into columns, then apply your function to each column storing the result in a cell array.
Of course this only works if all blocks fit into memory, otherwise you would have to manually write code that extracts one block at a time and process it in that way...
%# read image
img = im2double(imread('tire.tif'));
%# blocks params
sizBlk = [8 8];
numBlk = ceil( size(img) ./ sizBlk );
%# extract blocks
B = im2col(img, sizBlk, 'distinct');
B = reshape(B, [sizBlk size(B,2)]); %# put blocks on the 3rd dimension
B = squeeze( num2cell(B,[1 2]) ); %# convert to cell array
B = reshape(B, numBlk); %# reshape as blocks overlayed on image
%# process blocks
myFcn = #(blk) [mean2(blk) std2(blk)]; %# or any other processing function
I = cellfun(myFcn, B, 'UniformOutput',false);
%# in this example, we can show each component separately
subplot(121), imshow( cellfun(#(c)c(1),I) ), title('mean')
subplot(122), imshow( cellfun(#(c)c(2),I) ), title('std')
Alternatively, you could still use the BLOCKPROC function, but you have to call it multiple times, each time computing a single feature:
%# compute one feature at a time
b1 = blockproc(img, sizBlk, #(b)mean2(b.data), 'PadPartialBlocks',true);
b2 = blockproc(img, sizBlk, #(b)std2(b.data), 'PadPartialBlocks',true);
%# combine into cellarray of features
II = arrayfun(#(varargin)[varargin{:}], b1, b2, 'UniformOutput',false);
%# compare to previous results
isequal(I,II)
I've been doing something similar, although with numeric values rather than cell.
Something like this should work :
I = imread('pout.tif');
G = blockproc(I, [8 8], #(b) shiftdim(imhist(b.data)', -1), 'PadPartialBlocks', true);
G = reshape(G, size(G, 1) * size(G, 2), size(G, 3));
pout.tif is a greyscale image but I'm sure this can be changed up for RGB.
Also take care when using shiftdim, imhist returns a row vector so I transpose it to a column.
Related
Suppose I have 4D matrix:
>> A=1:(3*4*5*6);
>> A=reshape(A,3,4,5,6);
And now I want to cut given number of rows and columns (or any given chunks at known dimensions).
If I would know it's 4D I would write:
>> A1=A(1:2,1:3,:,:);
But how to write universally for any given number of dimensions?
The following gives something different:
>> A2=A(1:2,1:3,:);
And the following gives an error:
>> A2=A;
>> A2(3:3,4:4)=[];
It is possible to generate a code with general number of dimension of A using the second form of indexing you used and reshape function.
Here there is an example:
Asize = [3,4,2,6,4]; %Initialization of A, not seen by the rest of the code
A = rand(Asize);
%% This part of the code can operate for any matrix A
I = 1:2;
J = 3:4;
A1 = A(I,J,:);
NewSize = size(A);
NewSize(1) = length(I);
NewSize(2) = length(J);
A2 = reshape(A1,NewSize);
A2 will be your cropped matrix. It works for any Asize you choose.
I recommend the solution Luis Mendo suggested for the general case, but there is also a very simple solution when you know a upper limit for your dimensions. Let's assume you have at most 6 dimensions. Use 6 dimensional indexing for all matrices:
A1=A(1:2,1:3,:,:,:,:);
Matlab will implicit assume singleton dimensions for all remaining dimension, returning the intended result also for matrices with less dimensions.
It sounds like you just want to use ndims.
num_dimensions = ndims(A)
if (num_dimensions == 3)
A1 = A(1:2, 1:3, :);
elseif (num_dimensions == 4)
A1 = A(1:2, 1:3, :, :);
end
If the range of possible matrix dimensions is small this kind of if-else block keeps it simple. It seems like you want some way to create an indexing tuple (e.g. (1:2,:,:,:) ) on the fly, which I don't know if there is a way to do. You must match the correct number of dimensions with your indexing...if you index in fewer dimensions than the matrix has, matlab returns a value with the unindexed dimensions collapsed into a single array (similar to what you get with
A1 = A(:);
Assuming i have a series of column-vectors with different length, what would be the best way, in terms of computation time, to join all of them into one matrix where the size of it is determined by the longest column and the elongated columns cells are all filled with NaN's.
Edit: Please note that I am trying to avoid cell arrays, since they are expensive in terms of memory and run time.
For example:
A = [1;2;3;4];
B = [5;6];
C = magicFunction(A,B);
Result:
C =
1 5
2 6
3 NaN
4 NaN
The following code avoids use of cell arrays except for the estimation of number of elements in each vector and this keeps the code a bit cleaner. The price for using cell arrays for that tiny bit of work shouldn't be too expensive. Also, varargin gets you the inputs as a cell array anyway. Now, you can avoid cell arrays there too, but it would most probably involve use of for-loops and might have to use variable names for each of the inputs, which isn't too elegant when creating a function with unknown number of inputs. Otherwise, the code uses numeric arrays, logical indexing and my favourite bsxfun, which must be cheap in the market of runtimes.
Function Code
function out = magicFunction(varargin)
lens = cellfun(#(x) numel(x),varargin);
out = NaN(max(lens),numel(lens));
out(bsxfun(#le,[1:max(lens)]',lens)) = vertcat(varargin{:}); %//'
return;
Example
Script -
A1 = [9;2;7;8];
A2 = [1;5];
A3 = [2;6;3];
out = magicFunction(A1,A2,A3)
Output -
out =
9 1 2
2 5 6
7 NaN 3
8 NaN NaN
Benchmarking
As part of the benchmarking, we are comparing our solution to #gnovice's solution that was mostly based on using cell arrays. Our intention here to see that after avoiding cell arrays, what speedups we are getting if there's any. Here's the benchmarking code with 20 vectors -
%// Let's create row vectors A1,A2,A3.. to be used with #gnovice's solution
num_vectors = 20;
max_vector_length = 1500000;
vector_lengths = randi(max_vector_length,num_vectors,1);
vs =arrayfun(#(x) randi(9,1,vector_lengths(x)),1:numel(vector_lengths),'uni',0);
[A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19,A20] = vs{:};
%// Maximally cell-array based approach used in linked #gnovice's solution
disp('--------------------- With #gnovice''s approach')
tic
tcell = {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19,A20};
maxSize = max(cellfun(#numel,tcell)); %# Get the maximum vector size
fcn = #(x) [x nan(1,maxSize-numel(x))]; %# Create an anonymous function
rmat = cellfun(fcn,tcell,'UniformOutput',false); %# Pad each cell with NaNs
rmat = vertcat(rmat{:});
toc, clear tcell maxSize fcn rmat
%// Transpose each of the input vectors to get column vectors as needed
%// for our problem
vs = cellfun(#(x) x',vs,'uni',0); %//'
[A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19,A20] = vs{:};
%// Our solution
disp('--------------------- With our new approach')
tic
out = magicFunction(A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,...
A11,A12,A13,A14,A15,A16,A17,A18,A19,A20);
toc
Results -
--------------------- With #gnovice's approach
Elapsed time is 1.511669 seconds.
--------------------- With our new approach
Elapsed time is 0.671604 seconds.
Conclusions -
With 20 vectors and with a maximum length of 1500000, the speedups are between 2-3x and it was seen that the speedups have increased as we have increased the number of vectors. The results to prove that are not shown here to save space, as we have already used quite a lot of it here.
If you use a cell matrix you won't need them to be filled with NaNs, just write each array into one column and the unused elements stay empty (that would be the space efficient way). You could either use:
cell_result{1} = A;
cell_result{2} = B;
THis would result in a size 2 cell array which contains all elements of A,B in his elements. Or if you want them to be saved as columns:
cell_result(1,1:numel(A)) = num2cell(A);
cell_result(2,1:numel(B)) = num2cell(B);
If you need them to be filled with NaN's for future coding, it would be the easiest to find the maximum length you got. Create yourself a matrix of (max_length X Number of arrays).
So lets say you have n=5 arrays:A,B,C,D and E.
h=zeros(1,n);
h(1)=numel(A);
h(2)=numel(B);
h(3)=numel(C);
h(4)=numel(D);
h(5)=numel(E);
max_No_Entries=max(h);
result= zeros(max_No_Entries,n);
result(:,:)=NaN;
result(1:numel(A),1)=A;
result(1:numel(B),2)=B;
result(1:numel(C),3)=C;
result(1:numel(D),4)=D;
result(1:numel(E),5)=E;
I got a satellite image of size [17935 10968] pixels, I want to cut image equally and process my required algorithm on individual parts (eg: I need to cut above pixel range into 4 equal parts).
How can I split image without loosing intermediate pixels? My requirement is like (1 to 5600 and 5601 to the end pixel).
And anybody got any idea how to split images that are this big in MATLAB?
Method 1
If you have the Image Processing Toolbox, this is the preferred and most efficient method. It utilizes the extremely useful blockproc function which is designed exactly for processing large image in blocks. For instance, it takes care of padding when your image does not divide equally into same size blocks and concatenates results from the processing of blocks into one result matrix.
Best you take a look at the official documentation, but here's how it would look like in your case:
vSize = [17935 10968];
imBig = rand([vSize 3]);
nParts = [2 2]; %means divide into 4 parts, 2 horizontal, 2 vertical
blockproc(imBig, ceil(vSize ./ nParts), #yourAlgorithm);
function res = yourAlgorithm(blockStruct)
%do your processing of the block here and
%optionally return a result in 'res'
%for example, just return the RGB vector of the first pixel
res = blockStruct.data(1,1,:);
end
Method 2
If you don't have the Image Processing Toolbox you can use the mat2cell function instead. Fisrt you figure out the required block sizes and then you get a cell array containing the different blocks. For such large images though, speed and memory may become an issue. The code is borrowed from this Matlab Central answer.
vSize = [17935 10968];
imBig = rand([vSize 3]);
nParts = [2 2]; %means divide into 4 parts, 2 horizontal, 2 vertical
%figure out the size of "regular" block and the last block
vRegBlockSize = ceil(vSize ./ nParts);
vLastBlockSize = vSize - vRegBlockSize .* (nParts - 1);
%put the sizes into a vector
vSplitR = [vRegBlockSize(1)*ones(1,nParts(1)-1), vLastBlockSize(1)];
vSplitC = [vRegBlockSize(2)*ones(1,nParts(2)-1), vLastBlockSize(2)];
%split the image
C = mat2cell(imBig, vSplitR, vSplitC, 3);
%access RGB pixel (x,y) in top left {1,1} block
p = C{1,1}(x, y, :);
upperLeft = theImage(1:5600, 1:5600, :);
upperRight = theImage(1:5600, 5601:end, :);
lowerLeft = theImage(5601:end, 1:5600, :);
lowerLeft = theImage(5601:end, 1:5601:end, :);
you can use reshape to make 4 matrices from the image:
A=reshape(Img, 17935 , 10968/4,[]);
then process A(:,:,1) , etc...
Use the following code to divide the image into 4 different images:
A=reshape(Img, 17935 , 10968/4, 3, []);
then A(:,:,:,1) is the first image.
Suppose A is your 17935x10968x3 matrix, I think you can do:
B = reshape(A, 17935, 10968 / 4, 4, 3);
In this way the last dimension still represents RGB. Only difference is that it becomes 4-D array.
I was trying to resample (with replacement) my database using 'bootstrap' in Matlab as follows:
D = load('Data.txt');
lead = D(:,1);
depth = D(:,2);
X = D(:,3);
Y = D(:,4);
%Bootstraping to resample 100 times
[resampling100,bootsam] = bootstrp(100,'corr',lead,depth);
%plottig the bootstraping result as histogram
hist(resampling100,10);
... ... ...
... ... ...
Though the script written above is correct, I wonder how I would be able to see/load the resampled 100 datasets created through bootstrap? 'bootsam(:)' display the indices of the data/values selected for the bootstrap samples, but not the new sample values!! Isn't it funny that I'm creating fake data from my original data and I can't even see what is created behind the scene?!?
My second question: is it possible to resample the whole matrix (in this case, D) altogether without using any function? However, I know how to create random values from a vector data using 'unidrnd'.
Thanks in advance for your help.
The answer to question 1 is that bootsam provides the indices of the resampled data. Specifically, the nth column of bootsam provides the indices of the nth resampled dataset. In your case, to obtain the nth resampled dataset you would use:
lead_resample_n = lead(bootsam(:, n));
depth_resample_n = depth(bootsam(:, n));
Regarding the second question, I'm guessing what you mean is, how would you just get a re-sampled dataset without worrying about applying a function to the resampled data. Personally, I would use randi, but in this situation, it is irrelevant whether you use randi or unidrnd. An example follows that assumes 4 columns of some data matrix D (as in your question):
%# Build an example dataset
T = 10;
D = randn(T, 4);
%# Obtain a set of random indices, ie indices of draws with replacement
Ind = randi(T, T, 1);
%# Obtain the resampled data
DResampled = D(Ind, :);
To create multiple re-sampled data, you can simply loop over the creation of random indices. Or you could do it in one step by creating a matrix of random indices and using that to index D. With careful use of reshape and permute you can turn this into a T*4*M array, where indexing m = 1, ..., M along the third dimension yields the mth resampled dataset. Example code follows:
%# Build an example dataset
T = 10;
M = 3;
D = randn(T, 4);
%# Obtain a set of random indices, ie indices of draws with replacement
Ind = randi(T, T, M);
%# Obtain the resampled data
DResampled = permute(reshape(D(Ind, :)', 4, T, []), [2 1 3]);
Let's assume we have two arrays of the same size - A and B.
Now, we need a filter that, for a given mask size, selects elements from A, but removes the central element of the mask, and inserts there corresponding element from B.
So the 3x3 "pseudo mask" will look similar to this:
A A A
A B A
A A A
Doing something like this for averaging filter is quite simple. We can compute the mean value for elements from A without the central element, and then combine it with a proper proportion with elements from B:
h = ones(3,3);
h(2,2) =0;
h = h/sum(h(:));
A_ave = filter2(h, A);
C = (8/9) * A_ave + (1/9) * B;
But how to do something similar for median filter (medfilt2 or even better for ordfilt2)
The way to solve this is to find a way to combine the information from A and B so that the filtering itself becomes easy.
The first thing I thought of was to catenate A and B along the third dimension and to pass with a filter mask that would take 8 elements from the 'A-slice' and the center element from the 'B-slice'. This is, unfortunately, not supported by Matlab.
While nlfilter only works on 2D images, it does allow you to specify any function for filtering. Thus, you could create a function that somehow is able to look up the right values of A and B. Thus I came to my first solution.
You create a new array, C, that contains the element index at each element, i.e. the first element is 1, the second element is 2, etc. Then, you run nlfilter, which takes a 3x3 sliding window and passes the values of C inside the window to the filtering function, ffn. ffn is an anonymous function, that calls crazyFilter, and that has been initialized so that A and B get passed at each call. CrazyFunction takes the values from the sliding window of C, which are nothing but indices into A and B, and collects the values from A and B from them.
The second solution is exactly the same, except that instead of moving a sliding window, you create a new array that, in every column, has the contents of the sliding window at every possible location. With an overlapping window, the column array gets larger than the original array. Again, you then just need to use the values of the column array, C, which are indices into A and B, to look up the values of A and B at the relevant locations.
EDIT
If you have enough memory, im2col and col2im can speed up the process a lot
%# define A,B
A = randn(100);
B = rand(100);
%# pad A, B - you may want to think about how you want to pad
Ap = padarray(A,[1,1]);
Bp = padarray(B,[1,1]);
#% EITHER -- the more more flexible way
%# create a pseudo image that has indices instead of values
C = zeros(size(Ap));
C(:) = 1:numel(Ap);
%# convert to 'column image', where each column represents a block
C = im2col(C,[3,3]);
%# read values from A
data = Ap(C);
%# replace centers with values from B
data(5,:) = Bp(C(5,:));
%# OR -- the more efficient way
%# reshape A directly into windows and fill in B
data = im2col(Ap,[3,3]);
data(5,:) = B(:);
% median and reshape
out = reshape(median(data,1),size(A));
Old version (uses less memory, may need padding)
%# define A,B
A = randn(100);
B = rand(100);
%# define the filter function
ffun = #(x)crazyFilter(x,A,B);
%# create a pseudo image that has indices instead of values
C = zeros(size(A));
C(:) = 1:numel(A);
%# filter
filteredImage = nlfilter(C,[3,3],ffun);
%# filter function
function out = crazyFilter(input,A,B)
%#CRAZYFILTER takes the median of a 3x3 mask defined by input, taking 8 elements from A and 1 from B
%# read data from A
data = A(input(:));
%# replace center element with value from B
data(5) = B(input(5));
%# return the median
out = median(data);
Here's a solution that will work if your data is an unsigned integer type (like a typical grayscale image of type uint8). You can combine your two matrices A and B into a single matrix of a larger integer type, with the data from one matrix stored in the lower bits and the data from the other matrix stored in the higher bits. You can then use NLFILTER to apply a filtering function that extracts the appropriate bits of data in order to collect the necessary matrix values.
The following example applies a median filter of the form you describe above (a 3-by-3 array of elements from A with the center element from B) to two unsigned 8-bit matrices of random values:
%# Initialize some variables:
A = randi([0 255],[3 3],'uint8'); %# One random matrix of uint8 values
B = randi([0 255],[3 3],'uint8'); %# Another random matrix of uint8 values
C = uint16(A)+bitshift(uint16(B),8); %# Convert to uint16 and place the values
%# of A in the lowest 8 bits and the
%# values of B in the highest 8 bits
C = padarray(C,[1 1],'symmetric'); %# Pad the array edges
%# Make the median filtering function for each 3-by-3 block:
medFcn = #(x) median([bitand(x(1:4),255) ... %# Get the first four A values
bitshift(x(5),-8) ... %# Get the fifth B value
bitand(x(6:9),255)]); %# Get the last four A values
%# Perform the filtering:
D = nlfilter(C,[3 3],medFcn);
D = uint8(D(2:end-1,2:end-1)); %# Remove the padding and convert to uint8
Here are additional links for some of the key functions used above: PADARRAY, BITAND, BITSHIFT.