Average across matrices within a structures - matlab

I have a structure P with 20 matrices. Each matrix is 53x63x46 double. The names of the matrices are fairly random, for instance S154, S324, S412, etc. Is there any way I can do an average across these matrices without having to type out like this?
M=(P.S154 + P.S324 + P.S412 + ...)/20
Also, does it make sense to use structure for computation like this. According to this post, perhaps it should be converted to cell array.

struct2cell(P)
is a cell array each of whose elements is one of your structure fields (the field names are discarded). Then
cell2mat(struct2cell(P))
is the result of concatenating these matrices along the first axis. You might reasonably ask why it does that rather than, say, making a new axis and giving you a 4-dimensional array, but expecting sensible answers to such questions is asking for frustration. Anyway, unless I'm getting the dimensions muddled,
reshape(cell2mat(struct2cell(P)),[53 20 63 46])))
will then give you roughly the 4-dimensional array you're after, with the "new" axis being (of course!) number 2. So now
mean(reshape(cell2mat(struct2cell(P)),[53 20 63 46]),2)
will compute the mean along that axis. The result will have shape [53 1 63 46], so now you will need to fix up the axes again:
reshape(mean(reshape(cell2mat(struct2cell(P)),[53 20 63 46]),2),[53 63 46])

If you are using structures, and by your question, you have fieldnames for each matrix.
Therefore, you need to:
1 - use function fieldnames to extract all the matrix names inside your structure. - http://www.mathworks.com/help/matlab/ref/fieldnames.html
2- then you can access it by doing like:
names = fieldnames(P);
matrix1 = P.names{1}
Using a for loop you can then make your calculations pretty fast!

Related

How to vectorize conditional triple nested for loop - MATLAB

I have two 3D arrays:
shape is a 240 x 121 x 10958 array
area is a 240 x 1 x 10958 array
The values of the arrays are of the type double. Both have NaN as fill values where there is no relevant data.
For each [240 x 121] page of the shape array, there are several elements filled by the same number. For example, there will be a block of 1s, a block of 2s, etc. For each corresponding page of the area array there is a single column of numeric values 240 rows long. What I need to do is progressively go through each page of the shape array (moving along the 3rd, 10958-long axis) and replace each numbered element in that page with the number that fills the row of the matching number in the area array.
For example, if I'm looking at shape(:,:,500), I want to replace all the 8s on that page with area(8,1,500). I need to do this for numbers 1 through 20, and I need to do it for all 10958 pages of the array.
If I extract a single page and only replace one number I can get it to work:
shapetest = shape(:,:,500);
shapetest(shapetest==8)=area(8,1,500);
This does exactly what I need for one page and for one number. Going through numbers 1-20 with a for loop doesn't seem like an issue, but I can't find a vectorized way to do this for all the pages of the original 3D array. In fact, I couldn't even get it work for a single page without extracting that page as its own matrix like I did above. I tried things like this to no avail:
shape(shape(:,:,500)==8)=area(8,1,500);
If I can't do it for one page, I'm even more lost as to how to do it for all at once. But I'm inexperienced in MATLAB, and I think I am just ignorant of the proper syntax.
Instead, I ended up using a cell array and the following very inefficient nested for loops:
MyCell=num2cell(shape,[2 1]);
shapetest3=reshape(MyCell,1,10958);
for w=1:numel(shapetest3)
test_result{1,w}=zeros(121,240)*NaN;
end
for k=1:10958
for i=1:29040 % 121 x 240
for n=1:20
if shapetest3{1,k}(i)==n
test_result{1,k}(i)=area(n,1,k);
end
end
end
end
This gets the job done, and I can easily turn it back to an array, but it is very slow, and I am confident there is a much better vectorized way. I'd appreciate any help or tips. Thanks in advance.
To vectorize the mapping operation, we can use shape as an index into area. But because the mapping is different for each plane, we need to loop over the planes to accomplish this. In short, it'll look like this:
test_result = zeros(size(shape)); % pre-allocate output
for k=1:size(area,3) % loop over planes
lut = area(:,1,k);
test_result(:,:,k) = lut(shape(:,:,k));
end
The above only works if shape only contains integer values in the range [1,N], where N = size(area,1). That is, for other values in shape we'll be doing wrong indexing. We will need to fix shape to avoid this. The question here is, what do we want to do with those out-of-range values?
As an example, preparing shape to deal with NaN values:
code = size(area,1) + 1; % this is an unused code word
shape(isnan(shape)) = code;
area(code,1,:) = NaN;
This replaces all NaN values in shape with the value code, which is one larger than any code value we were mapping. Then, we extend area to have one more value, a value for the input code. The value we fill in here is the value that the output test_result will have where shape is NaN. In this case, we write NaN, such that NaN in the input maps to NaN in the output.
Something similar can be done with values below 0 and above 240 (shape(shape<1 | shape>240) = code), or with non-integer values (shape(mod(shape,1)~=0) = code).

Interpolation on 4D data

I am trying to perform an interpolation/fit (preferably non-linear, but linear should also be fine) on 4D data. My data has a form of:
[a,b,c] = func(input)
obviously, func is unknown and ultimately data looks like (input, a, b, c):
0 -0.1253 0.0341 0.01060
35 -0.0985 0.0176 0.02060
50 -0.0315 -0.0533 0.1118
60 -0.0518 -0.0327 0.03020
80 0.2939 -0.0713 0.05670
100 0.3684 -0.0765 0.06740
I take observations at e.g. input = [0, 35, 50, 60, 80, 100] (0 being min and 100 being max; I take 6 samples in between min and max) and then I get corresponding a, b and c values (I understand that 6 sample points are a bad design of experiment so I will extend it in future).
I am trying to guess the value of a, b and c at say input = 19? Any pointers?
How to estimate goodness of fit in such scenario?
This is not 4D interpolation, this is 3 times 1D interpolation. You just interpolate interp1([0 35],[-0.1253 -0.0985],19) and the same for b and c. (interp1(intput,a,19))
Note that for the most basic 1D interpolation in a mesh grid (not what you have), you need 2 data points in general. For the most basic 2D interpolation, you need 4 data points. For 3D interpolation, 8 minimum, 4D, 16.... (2^d in general).
Also note that 1D interpolation uses 2 "dims". Because you use one to guide the interpolation, the other one is interpolated. General, with [v,a,b,c] data you would use 3D interpolation.
all that said, you do are nto in this case. You have scattered data, not a grid, thus the problem becomes considerably more complicated.
In case you can generate a few more points (not necessarily 16) you can use the function griddatan for interpolating scattered data. Note that you can not just say "give me [a,b,c] for input=19, there could be infinite amount of a,b,cs that have that condition. In any case, you always need to give dim-1 amount of sample points, and get the last one interpolated. Just an advice: this function is computationally and memory-wise very expensive. Do not use for big data points because it will crash your PC.
In the case you want to find a set of parameters that make input=19 then you are getting to more complicated area. You want to minimise a function f(x), where x=[a,b,c] for f(x)=input
In math terms:
argmin_x |f(x)-input|^2= \vec{input}
this is a harder problem and arguably more mathematics than a programming question. Perhaps a ND bspline fitting of your data would be a good f

Changing numbers for given indices between matrices

I'm struggling with one of my matlab assignments. I want to create 10 different models. Each of them is based on the same original array of dimensions 1x100 m_est. Then with for loop I am choosing 5 random values from the original model and want to add the same random value to each of them. The cycle repeats 10 times chosing different values each time and adding different random number. Here is a part of my code:
steps=10;
for s=1:steps
for i=1:1:5
rl(s,i)=m_est(randi(numel(m_est)));
rl_nr(s,i)=find(rl(s,i)==m_est);
a=-1;
b=1;
r(s)=(b-a)*rand(1,1)+a;
end
pert_layers(s,:)=rl(s,:)+r(s);
M=repmat(m_est',s,1);
end
for k=steps
for m=1:1:5
M_pert=M;
M_pert(1:k,rl_nr(k,1:m))=pert_layers(1:k,1:m);
end
end
In matrix M I am storing 10 initial models and want to replace the random numbers with indices from rl_nr matrix into those stored in pert_layers matrix. However, the last loop responsible for assigning values from pert_layers to rl_nr indices does not work properly.
Does anyone know how to solve this?
Best regards
Your code uses a lot of loops and in this particular circumstance, it's quite inefficient. It's better if you actually vectorize your code. As such, let me go through your problem description one point at a time and let's code up each part (if applicable):
I want to create 10 different models. Each of them is based on the same original array of dimensions 1x100 m_est.
I'm interpreting this as you having an array m_est of 100 elements, and with this array, you wish to create 10 different "models", where each model is 5 elements sampled from m_est. rl will store these values from m_est while rl_nr will store the indices / locations of where these values originated from. Also, for each model, you wish to add a random value to every element that is part of this model.
Then with for loop I am choosing 5 random values from the original model and want to add the same random value to each of them.
Instead of doing this with a for loop, generate all of your random indices in one go. Since you have 10 steps, and we wish to sample 5 points per step, you have 10*5 = 50 points in total. As such, why don't you use randperm instead? randperm is exactly what you're looking for, and we can use this to generate unique random indices so that we can ultimately use this to sample from m_est. randperm generates a vector from 1 to N but returns a random permutation of these elements. This way, you only get numbers enumerated from 1 to N exactly once and we will ensure no repeats. As such, simply use randperm to generate 50 elements, then reshape this array into a matrix of size 10 x 5, where the number of rows tells you the number of steps you want, while the number of columns is the total number of points per model. Therefore, do something like this:
num_steps = 10;
num_points_model = 5;
ind = randperm(numel(m_est));
ind = ind(1:num_steps*num_points_model);
rl_nr = reshape(ind, num_steps, num_points_model);
rl = m_est(rl_nr);
The first two lines are pretty straight forward. We are just declaring the total number of steps you want to take, as well as the total number of points per model. Next, what we will do is generate a random permutation of length 100, where elements are enumerated from 1 to 100, but they are in random order. You'll notice that this random vector uses only a value within the range of 1 to 100 exactly once. Because you only want to get 50 points in total, simply subset this vector so that we only get the first 50 random indices generated from randperm. These random indices get stored in ind.
Next, we simply reshape ind into a 10 x 5 matrix to get rl_nr. rl_nr will contain those indices that will be used to select those entries from m_est which is of size 10 x 5. Finally, rl will be a matrix of the same size as rl_nr, but it will contain the actual random values sampled from m_est. These random values correspond to those indices generated from rl_nr.
Now, the final step would be to add the same random number to each model. You can certainly use repmat to replicate a random column vector of 10 elements long, and duplicate them 5 times so that we have 5 columns then add this matrix together with rl.... so something like:
a = -1;
b = 1;
r = (b-a)*rand(num_steps, 1) + a;
r = repmat(r, 1, num_points_model);
M_pert = rl + r;
Now M_pert is the final result you want, where we take each model that is stored in rl and add the same random value to each corresponding model in the matrix. However, if I can suggest something more efficient, I would suggest you use bsxfun instead, which does this replication under the hood. Essentially, the above code would be replaced with:
a = -1;
b = 1;
r = (b-a)*rand(num_steps, 1) + a;
M_pert = bsxfun(#plus, rl, r);
Much easier to read, and less code. M_pert will contain your models in each row, with the same random value added to each particular model.
The cycle repeats 10 times chosing different values each time and adding different random number.
Already done in the above steps.
I hope you didn't find it an imposition to completely rewrite your code so that it's more vectorized, but I think this was a great opportunity to show you some of the more advanced functions that MATLAB has to offer, as well as more efficient ways to generate your random values, rather than looping and generating the values one at a time.
Hopefully this will get you started. Good luck!

How to take the mean of columns from an array within an array?

I am currently working on a project in Matlab where I have a cell array of cell arrays. The first cell array is 464 columns long and 1 row deep. Each of these cells is another cell array that is 96 columns and 365 rows. I need to be able to get the mean of the 96 columns for each of the 464 arrays and place each of the 464 arrays on a different row in a new array called mean. I have tried to write code to just do one column as follow:
mean = Homes{1,1}(1:)
But I when ever I try to run this code I got the follow error:
mean = Homes{1,1}(1:)
|
Error: Unbalanced or unexpected parenthesis or bracket.
Basically my final array name mean needs to be 96 columns by 464 rows. I am stuck and could really use your help.
Thank you.
I suggest you to try the following code on a smaller matrix. See if it gives you the desired results.
a=cell(1,4); %for you it will be a=cell(1,464)
for i=1:4
a{i}=randi(10,[5 10]); %for you it will be a{i}=randi(10,[365 96]);
end
a1=cell2mat(a); %just concatenating
a1=mean(a1); %getting the mean for each column. in your case, you should get the mean for 96*464
a2=reshape(a1,[10 4]); %now what reshape does it it takes first 10 elements and arranges it into first column.
%Therefore, since I finally want a 4x10 matrix (in your case 464x96), i.e. mean of 10 elements in first cell array on first row and so on...
%Thus, 1st 10 elements should go to first column after doing reshape (since you want to keep them together). Therefore, instead of directly making matrix as 4x10, first make it as 10x4 and then take transpose (which is the next step).
a2=a2'; %thus you get a 4x10 matrix.
In your case specifically, the code will be
a=cell(1,464);
for i=1:464
a{i}=randi(10,[365 96]);
end
a1=cell2mat(a);
a1=mean(a1);
a2=reshape(a1,[96 365]);
a2=a2';

For loop inside another for loop to make new set of vectors

I would like to use a for loop within a for loop (I think) to produce a number of vectors which I can use separately to use polyfit with.
I have a 768x768 matrix and I have split this into 768 separate cell vectors. However I want to split each 1x768 matrix into sections of 16 points - i.e. 48 new vectors which are 16 values in length. I want then to do some curve fitting with this information.
I want to name each of the 48 vectors something different however I want to do this for each of the 768 columns. I can easily do this for either separately but I was hoping that there was a way to combine them. I tried to do this as a for statement within a for statement however it doesn't work, I wondered if anyone could give me some hints on how to produce what I want. I have attached the code.
Qne is my 768*768 matrix with all the points.
N1=768;
x=cell(N,1);
for ii=1:N1;
x{ii}=Qnew(1:N1,ii);
end
for iii = 1:768;
x2{iii}=x{iii};
for iv = 1:39
N2=20;
x3{iii}=x2{iii}(1,(1+N2*iv:N2+N2*iv));
%Gx{iv}=(x3{iv});
end
end
Use a normal 2D matrix for your inner split. Why? It's easy to reshape, and many of the fitting operations you'll likely use will operate on columns of a matrix already.
for ii=1:N1
x{ii} = reshape(Qnew(:, ii), 16, 48);
end
Now x{ii} is a 2D matrix, size 16x48. If you want to address the jj'th split window separately, you can say x{ii}(:, jj). But often you won't have to. If, for example, you want the mean of each window, you can just say mean(x{ii}), which will take the mean of each column, and give you a 48-element row vector back out.
Extra reference for the unasked question: If you ever want overlapping windows of a vector instead of abutting, see buffer in the signal processing toolbox.
Editing my answer:
Going one step further, a 3D matrix is probably the best representation for equal-sized vectors. Remembering that reshape() reads out columnwise, and fills the new matrix columnwise, this can be done with a single reshape:
x = reshape(Qnew, 16, 48, N1);
x is now a 16x48x768 3D array, and the jj'th window of the ii'th vector is now x(:, jj, ii).