Averaging Across All Matrices Stored in Structure - matlab

I am currently using the following code to get an average across 8766 matrices stored in a structure, matData, but when I look inside Mcell (1x8766 cell) all of the values stored in each cell are duplicates of that in cell 1x1. I would like to know what I am doing wrong, since I will then take the nanmean of all the matrices in this structure.
Mcell = arrayfun(@(x) matData(sprintf('(%d)',x)).shape, 1:8766, 'uni', 0);
M = nanmean( reshape(cell2mat(Mcell), 192, 144, []), 3 );
Extra notes: matData is a 1x8766 struct with one field. The matrices are accessed as matData(i).shape, where i=1:8766, and each one is a 192x144 double.
Thank you for all of your input and help.

You just need a combination of struct2cell, cell2mat and nanmean:
matData = cell2struct(num2cell(randn(192,144,8766),[1,2]), 'shape', 1); % Sample input
result = nanmean(cell2mat(struct2cell(matData)),3);
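A likely cause of the duplicates in the question: sprintf('(%d)',x) returns a character vector, and MATLAB treats characters used as subscripts as their character codes. Since '(' has code 40, every call starts by indexing element 40, and the anonymous function keeps only the first value of the resulting list. The sketch below indexes with the number itself; the three small matrices are made-up stand-ins for the real 192x144 data, and mean with 'omitnan' is used in place of nanmean on the assumption that a reasonably recent MATLAB release is available.

```matlab
% Small stand-in for the real data (sizes and values are illustrative only)
matData = struct('shape', {ones(3,2), 2*ones(3,2), NaN(3,2)});

% double('(1)') is [40 49 41], so matData('(1)') indexes elements 40, 49 and 41.
% Index with the number itself instead of a formatted string:
Mcell = arrayfun(@(k) matData(k).shape, 1:numel(matData), 'UniformOutput', false);

% Stack the matrices along the third dimension and average, ignoring NaNs
M = mean(cat(3, Mcell{:}), 3, 'omitnan');
disp(M)
```

Here the all-NaN matrix is skipped, so every entry of M is the mean of 1 and 2.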

Related

How can I reference a matrix in a loop

I have a series of 11 large matrices corresponding to earthquake data. I want to draw data from individual matrices in a loop. I will use dummy matrices data1 and data2 to illustrate my problem.
load('data1');
load('data2'); %data1 and data2 are large matrices
matrixname={data1 data2};
for j=1:2
matrix=matrixname(j);
latitude=matrix(:,1);
longitude=matrix(:,2);
geoshow(latitude,longitude);
end
So in my loop I want to extract columns from different matrices depending on the index. But I cannot figure out how to do this - I get error messages saying that the index exceeds the matrix dimensions.
Appreciate the help!
As mentioned, use {} (curly braces) for cell arrays. Also, if you want to keep the latitude and longitude of every matrix, index them with j so that you don't overwrite the data on each pass through the for loop.
Also, when writing for loops, it's better to use size/length/numel instead of a fixed number in case the number of files you're analysing changes.
So taking your code;
for j=1:size(matrixname,2)
matrix = []; % reformat matrix in case of size differences
matrix = matrixname{j}; % gets the jth matrix
latitude{j} = matrix(:,1); % stores column 1 (in a cell, since the lengths may differ)
longitude{j} = matrix(:,2); % stores column 2
geoshow(latitude{j},longitude{j}); % runs function on jth set of data
end
Remember, [] brackets are for matrices. Using the wrong brackets is what gave you the error, as pointed out above.
To get the matrix you need from a cell array (which is what {data1 data2} is), you need to use cell2mat():
matrix=cell2mat(matrixname(j));
Since data1 and data2 are matrices, when you do:
matrixname={data1 data2};
You don't get an array of doubles as you're expecting, but a cell array. That's fine, and it is what you want, since data1 and data2 don't have the same size (number of rows or columns).
Then, inside the for loop, when accessing one of the original matrices (data1 or data2) from matrixname, you should convert it back to an array of doubles. The simplest and fastest way of doing it is by:
matrix = matrixname{j};
Note the difference from your code: I'm using curly braces {} instead of parentheses (). That way, matrix is an array of doubles. With matrix=matrixname(j), matrix is a cell array.
Finally, it's good practice to clear temporary variables inside a loop. Your code should then look like:
load('data1');
load('data2'); %data1 and data2 are large matrices
matrixname = {data1 data2};
for j=1:2
matrix=matrixname{j};
latitude=matrix(:,1);
longitude=matrix(:,2);
geoshow(latitude,longitude);
clear matrix latitude longitude
end
For example, if:
data1 = [1 1; 2 2; 3 3];
data2 = [10 10; 20 20; 30 30; 40 40];
matrixname = {data1 data2};
matrixname{1} gives you exactly data1 and matrixname{2} gives you data2.
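To make the difference concrete, here is a minimal sketch using the same dummy matrices. The variable names asCell and asMatrix are only for illustration.

```matlab
data1 = [1 1; 2 2; 3 3];
data2 = [10 10; 20 20; 30 30; 40 40];
matrixname = {data1, data2};

asCell   = matrixname(2);   % parentheses: a 1x1 cell that wraps data2
asMatrix = matrixname{2};   % curly braces: the 4x2 double itself

disp(class(asCell))         % cell
disp(class(asMatrix))       % double

latitude = asMatrix(:,1);   % column indexing works on the matrix
% asCell(:,2) would fail with an index-out-of-bounds error, because the
% 1x1 cell has only one column. That matches the error in the question.
```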

Extract data from a Cell Array using a vector and converting into an array

I have a [5x1] cell array in which every cell is a column vector, such as:
extInt =
[46x1 double]
[54x1 double]
[40x1 double]
[51x1 double]
[ 9x1 double]
I need a vector (vec) that lists which cells of extInt to extract, and then I have to combine those cells into a single column array. For example:
vec = [1,3];
Output = cell2mat(extInt{vec})
Output should become an array such as [86x1 double].
The way I have coded I get:
Error using cell2mat
Too many input arguments.
If possible, I would like to have a solution not using a loop.
The best approach here is to use cat along with a comma-separated list created by {} indexing to yield the expected column vector. We specify 1 as the first argument (the dimension to concatenate along) since all of your cells hold column vectors and we want the output to be a column vector as well.
out = cat(1, extInt{vec})
Given your input, cell2mat attempts to concatenate along the second dimension which will fail for your data since all of the data have different number of rows. This is why (in your example) you had to transpose the data prior to calling cell2mat.
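As a small illustration of the comma-separated list, the sketch below uses made-up column vectors in place of the real data and shows that cat(1, extInt{vec}) is the same call as listing the selected cells by hand.

```matlab
% Hypothetical data: a 5x1 cell array of column vectors with different lengths
extInt = {(1:3)'; (4:5)'; (6:9)'; 10; (11:12)'};
vec = [1, 3];

% extInt{vec} expands to the comma-separated list extInt{1}, extInt{3}
viaList     = cat(1, extInt{vec});
viaExplicit = cat(1, extInt{1}, extInt{3});

disp(isequal(viaList, viaExplicit))   % both calls build the same vector
disp(size(viaList))                   % 3 + 4 = 7 rows, 1 column
```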
Update
Here is a benchmark to compare execution times between the cat and cell2mat approaches.
function benchit()
nRows = linspace(10, 1000, 100);
[times1, times2] = deal(zeros(size(nRows)));
for k = 1:numel(nRows)
rows = nRows(k);
data = arrayfun(@(x)rand(randi([10, 50], 1), 1), 1:rows, 'uni', 0);
vec = 1:2:numel(data);
times1(k) = timeit(@()cat_method(data, vec));
data = arrayfun(@(x)rand(randi([10, 50], 1), 1), 1:rows, 'uni', 0);
vec = 1:2:numel(data);
times2(k) = timeit(@()cell2mat_method(data, vec));
end
figure
hplot(1) = plot(nRows, times1 * 1000, 'DisplayName', 'cat');
hold on
hplot(2) = plot(nRows, times2 * 1000, 'DisplayName', 'cell2mat');
ylabel('Execution Times (ms)')
xlabel('# of Cell Array Elements')
legend(hplot)
end
function out = cat_method(data, vec)
out = cat(1, data{vec});
end
function out = cell2mat_method(data, vec)
out = cell2mat(data(vec)');
end
The reason for the constant offset between the two is that cell2mat calls cat internally but adds some additional logic on top of it. If you just use cat directly, you circumvent that additional overhead.
You have a small error in your code
Change
Output = cell2mat(extInt{vec});
to
Output = cell2mat(extInt(vec));
For cells, both curly braces and parentheses can be used to get information. You can read some more about it here, but to summarize:
Use curly braces {} for setting or getting the contents of cell arrays.
Use parentheses () for indexing into a cell array to collect a subset of cells together in another cell array.
In your example, using curly braces with the index vector vec will produce 2 separate outputs (I've made a shorter version of extInt below)
extInt = {[1],[2 3],[4 5 6]};
extInt{vec}
ans =
1
ans =
4 5 6
As these are 2 separate outputs, they also become 2 separate inputs to the function cell2mat. As this function only takes one input, you get an error.
One alternative is in your own solution. Take the two outputs and place them inside a new (unnamed) cell
{extInt{vec}}
ans =
[1] [1x3 double]
Now, this (single) result goes into cell2mat without a problem.
(Note, though, that you might need to transpose the result first, depending on whether your cell holds column or row vectors. The sizes of the vectors (or matrices) to be combined need to match.)
Another way is to use parentheses (as in my solution above). Here a subset of the original cell array is returned, so it goes directly into the cell2mat function.
extInt(vec)
ans =
[1] [1x3 double]
I have been messing around and I got this working by converting this entry into a new cell array and transposing it so the dimensions remained equivalent for the concatenating process
Output = cell2mat({extInt{vec}}')
use
Output = cell2mat(extInt(vec))
Since you want to address the cells in extInt not the content of the cells
extInt(vec)
extInt{vec}
Try both to see what's going on.
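A quick way to compare the variants discussed in these answers is to run them side by side. This sketch reuses the shortened row-vector example from above; the names a, b and c are only for illustration.

```matlab
extInt = {[1], [2 3], [4 5 6]};   % shortened example with row vectors
vec = [1, 3];

a = cell2mat(extInt(vec));        % parentheses: pass a sub-cell-array
b = cell2mat({extInt{vec}});      % braces, re-wrapped in a new cell
c = [extInt{vec}];                % braces, concatenated directly

disp(a)                           % 1 4 5 6
disp(isequal(a, b, c))            % all three agree for this input
```

With column vectors stored in a column cell array, as in the question, the second form needs the transpose shown earlier and the third would use vertcat or cat(1, ...) instead of square brackets.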

Bitwise or over an array in Matlab?

I have a large array of binary numbers, and I want to do a bitwise OR over one dimension of the array:
X = [ 192, 96, 96, 2, 3
12, 12, 128, 49, 14
....
];
union_of_bits_on_dim2 = [
bitor(X(:,1), bitor(X(:,2), bitor(X(:,3), ... )))
];
ans =
[ 227
191
... ]
Is there a simple way of doing this? I'm actually working on an n-dimensional array. I tried bi2de but it flattens out my array and so the subscripting becomes complicated.
I could do it easily if matlab had a fold function but I don't think it does.
OK @Divakar asked for runnable code so to make it clear here is a long-winded version that might work for a 2D array.
function U=union_of_bits_on_dim2(X)
U=zeros(size(X,1),1);
for i=1:size(X,2)
U=bitor(U,X(:,i));
end
Surely it can be done without looping? I was of course hoping that bitor could take arbitrary numbers of arguments. Then it could have been done with mat2cell.
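For reference, the loop can be checked against the two sample rows from the question. This is the same fold written as a script, so it can be pasted straight into the command window.

```matlab
% The two complete rows from the question
X = [192 96 96 2 3;
      12 12 128 49 14];

% Fold bitor over the columns, starting from all zeros
U = zeros(size(X,1), 1);
for i = 1:size(X,2)
    U = bitor(U, X(:,i));
end

disp(U')   % 227 191, matching the expected output in the question
```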
One vectorized approach -
[m,n] = size(X) %// Get size of input array
bd = dec2bin(X)-'0' %// Get binary digits
%// Get cumulative "OR-ed" version with ANY(..,1)
cum_or = reshape(any(permute(reshape(bd,m,n,[]),[2 3 1]),1),8,[])
%// Finally convert to decimals
U = 2.^(7: -1:0)*cum_or
I don't know any function that can do that automatically. However you can loop over the dimension you are interested in:
function result = bitor2d(A)
result = A(1,:);
for i=2:size(A,1)
result = bitor(result,A(i,:));
end
end
If your array has more than 2 dimensions, then you need to prepare it to have only 2.
function result = bitornd(A,whichdimension)
nd = max(ndims(A),whichdimension);
order = [whichdimension, 1:whichdimension-1, whichdimension+1:nd]; % target dimension first
B = permute(A,order); % bring the target dimension to the front
s = size(B);
B = reshape(B,s(1),[]); % flatten the remaining dimensions into columns
result = bitor2d(B);
s(1) = 1;
result = reshape(result,s); % back to N-D, with the reduced dimension now of size 1
result = ipermute(result,order); % back to the original dimension order
end
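A different way to handle n-dimensional input, not taken from the answers above, is to loop over bit positions instead of array elements: a bit is set in the OR if any element along the chosen dimension has it set. The sketch below assumes non-negative integer values that fit in 8 bits; nBits, dim and the sample array are illustrative choices.

```matlab
X = cat(3, [192 96; 12 12], [2 3; 49 14]);   % 2x2x2 example array
dim = 3;                                     % dimension to OR over
nBits = 8;                                   % assumption: values fit in 8 bits

sz = size(X);
sz(dim) = 1;
U = zeros(sz);
for b = 1:nBits
    % any(...) is true where at least one element has bit b set
    U = U + 2^(b-1) * any(bitget(X, b), dim);
end

disp(U)   % [194 99; 61 14]
```

The loop runs nBits times regardless of the array size, which may be preferable when the reduced dimension is long.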

Mathworks say comparison of unequal vectors by kruskal wallis is possible but doesn't work

I observed the length of 5 types of nursing care, but I now have 5 groups of differing sample sizes, simply because type 1 care took place more often.
So when I run [P,ANOVATAB,STATS]=kruskalwallis([rand(10,1) rand(30,1)])
I get: Error using horzcat CAT arguments dimensions are not consistent.
Why do unequal sample sizes matter and what should I do instead?
According to the kruskalwallis documentation, you need to pass it two vectors of the same length: one containing all the data, and another of the same length containing the group indices of each data point. So, if you were to call the two sets of data in your example groups 1 and 2, you could do:
data1 = rand(10,1);
data2 = rand(30,1);
% Concatenation with a ; in between because these are **column** vectors
allData = [data1; data2];
groups = [ones(size(data1)); 2 * ones(size(data2))];
[P,ANOVATAB,STATS] = kruskalwallis(allData, groups);
Please have a read of the documentation for Creating and Concatenating Matrices as well.
If you want to get a bit more fancy and a bit more general (e.g., in the case where you don't know how many groups you have until run time), you could use a cell array to initially store your data groups, like so:
% Initialise cell array with differently dimensioned data
xc{1} = rand(100, 1);
xc{2} = rand(1, 30);
% Reshape it all to column vectors and concatenate
allData = cellfun(@(x)x(:), xc, 'UniformOutput', false);
allData = vertcat(allData{:});
% Generate group indices for each set of data as column vectors and
% concatenate
groups = arrayfun(@(x, y)y * ones(numel(x{:}), 1), xc, 1:length(xc), 'UniformOutput', false);
groups = vertcat(groups{:});
As mentioned in the comments, this will also work if each of your data sets has different dimensions (i.e., in this example one is a row vector and one is a column vector).
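The grouping vector can also be built from the group sizes alone. The sketch below uses repelem, which is an assumption about the MATLAB release (it is not available in older versions), and compares the result with the ones-based construction from the first example. The kruskalwallis call is left as a comment because it needs the Statistics Toolbox.

```matlab
data1 = rand(10,1);
data2 = rand(30,1);
allData = [data1; data2];

% One group label per observation: ten 1s followed by thirty 2s
sizes  = [numel(data1); numel(data2)];
groups = repelem((1:numel(sizes))', sizes);

% Same labels as the ones-based construction
reference = [ones(size(data1)); 2 * ones(size(data2))];
disp(isequal(groups, reference))

% [P,ANOVATAB,STATS] = kruskalwallis(allData, groups);
```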

Blockproc-like function for cell array output

I like blockproc, it makes working with large (very large) images easy. However, as far as I understand, it is limited to working with functions that output a matrix of the same size as the input they take.
So I was wondering if there is a way of replicating/simulating what blockproc does but for functions that output a cell array. We can either assume that the output array from the processing function is of the same dimensions of the input matrix, or that it just outputs one cell element, in which case the final output from the total processing would be a cell array with M x N elements, with M and N specifying the tiling for the processing.
I believe I can build this myself using cellfun, but I was wondering if there are any other builtins or libraries (maybe third-party?) that I can use for this, and maybe even completely avoid reinventing the wheel.
More specifically, I am looking for something that has the same strengths as blockproc:
Can load a large image from disk progressively tile-by-tile to minimize the memory footprint of the processing
Takes care of the final concatenation of results for building the final cell array
Has an interface similar to blockproc (e.g. # of tiles, etc.)
Below is a solution that satisfies your criteria except for the first point
Use the IM2COL function to arrange distinct image blocks from the image into columns, then apply your function to each column storing the result in a cell array.
Of course this only works if all blocks fit into memory, otherwise you would have to manually write code that extracts one block at a time and process it in that way...
%# read image
img = im2double(imread('tire.tif'));
%# blocks params
sizBlk = [8 8];
numBlk = ceil( size(img) ./ sizBlk );
%# extract blocks
B = im2col(img, sizBlk, 'distinct');
B = reshape(B, [sizBlk size(B,2)]); %# put blocks on the 3rd dimension
B = squeeze( num2cell(B,[1 2]) ); %# convert to cell array
B = reshape(B, numBlk); %# reshape as blocks overlayed on image
%# process blocks
myFcn = @(blk) [mean2(blk) std2(blk)]; %# or any other processing function
I = cellfun(myFcn, B, 'UniformOutput',false);
%# in this example, we can show each component separately
subplot(121), imshow( cellfun(@(c)c(1),I) ), title('mean')
subplot(122), imshow( cellfun(@(c)c(2),I) ), title('std')
Alternatively, you could still use the BLOCKPROC function, but you have to call it multiple times, each time computing a single feature:
%# compute one feature at a time
b1 = blockproc(img, sizBlk, @(b)mean2(b.data), 'PadPartialBlocks',true);
b2 = blockproc(img, sizBlk, @(b)std2(b.data), 'PadPartialBlocks',true);
%# combine into cellarray of features
II = arrayfun(@(varargin)[varargin{:}], b1, b2, 'UniformOutput',false);
%# compare to previous results
isequal(I,II)
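If the Image Processing Toolbox functions are not needed, the same tile-then-cellfun idea can be sketched with mat2cell from base MATLAB. This is a swapped-in alternative rather than the approach above, it still holds the whole image in memory, and it assumes the image size is an exact multiple of the block size. The magic(6) matrix is just a stand-in for an image.

```matlab
img = magic(6);            % stand-in for an image
blk = [2 3];               % block size; must divide size(img) exactly here

% Row and column extents of every tile
rowSizes = blk(1) * ones(1, size(img,1) / blk(1));
colSizes = blk(2) * ones(1, size(img,2) / blk(2));

% 3x2 cell array, one 2x3 tile per cell
tiles = mat2cell(img, rowSizes, colSizes);

% Any per-tile function with non-scalar output works with UniformOutput=false
out = cellfun(@(t) [mean(t(:)) std(t(:))], tiles, 'UniformOutput', false);
disp(size(out))            % 3 2: one cell per tile
```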
I've been doing something similar, although with numeric values rather than cell.
Something like this should work :
I = imread('pout.tif');
G = blockproc(I, [8 8], @(b) shiftdim(imhist(b.data)', -1), 'PadPartialBlocks', true);
G = reshape(G, size(G, 1) * size(G, 2), size(G, 3));
pout.tif is a greyscale image but I'm sure this can be changed up for RGB.
Also take care when using shiftdim: imhist returns a column vector, so I transpose it to a row before shifting it into the third dimension.
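The effect of shiftdim with a negative shift can be seen on a tiny vector. Here h is a made-up 4-element column standing in for a histogram, so that each block contributes a 1x1xN result that blockproc can tile into an MxNxN-bins array.

```matlab
h = (1:4)';               % column vector standing in for a histogram
v = shiftdim(h', -1);     % row vector pushed into the third dimension

disp(size(h'))            % 1 4
disp(size(v))             % 1 1 4
```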