I have a huge amount of data in MATLAB (350695x5).
An example is like this:
z = [
1.79 0.16 0.16 21.39 21.50
1.83 0.16 0.16 21.39 22.40
1.92 0.16 0.16 21.39 22.00
2.07 0.16 0.16 21.39 22.00
2.36 0.15 0.15 21.39 21.08
2.96 0.13 0.13 21.39 21.04
3.21 0.13 0.13 21.39 23.00
3.72 0.12 0.12 21.39 24.00
3.87 0.11 0.11 21.39 21.39
4.14 0.10 0.10 21.39 22.00
4.14 0.10 0.10 21.39 21.50
4.16 0.10 0.10 21.39 21.39]
I need to group the rows by the value in column 1 (ranges 0-1, 1-2, 2-3, 3-4)
and find the mean of the other columns (2, 3, 4, ...) within each range.
The result should look like this:
1 0.16 0.16 21.39 21.97
2 0.15 0.15 21.39 21.49
3 0.12 0.12 21.39 22.68
4 0.10 0.10 21.39 21.63
The problem is that I cannot group it properly.
Part of a solution might look like this:
[ii jj] = ndgrid(z(:,1)+1,1:size(z,2)-1) %should sort first column from 0-1,1-2, 2-3, 3-4
z23 = z(:,2:end)
out = [unique(z(:,1)),accumarray([ii(:),jj(:)],z23(:),[],@mean)], %find mean value
Try this:
idx = floor(z(:, 1));
sub = [idx z(:, 2:5)];
[xx, yy] = ndgrid(idx, 1:size(sub, 2));
out = accumarray([xx(:) yy(:)], sub(:), [], @mean)
out =
1.0000 0.1600 0.1600 21.3900 21.9667
2.0000 0.1467 0.1467 21.3900 21.3733
3.0000 0.1200 0.1200 21.3900 22.7967
4.0000 0.1000 0.1000 21.3900 21.6300
The results don't exactly match yours. I'm not sure I understand exactly what you wanted, but the code I wrote calculates the average over the ranges 1 <= x < 2, 2 <= x < 3, and so on.
Use logical indexing to find the rows of z whose first column falls in a certain range, e.g.:
i01 = (z(:,1) >= 0) & (z(:,1) < 1); % Logical index of rows with column 1 in [0, 1)
z01 = z(i01, 2:end); % Remaining columns of those rows
Then, calculation of the mean is easy: mu_z01 = mean(z01, 1);. Of course, the same method can be applied to the other ranges from 1 to 2, 2 to 3, et cetera.
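To avoid repeating the logical-indexing step by hand for every range, it can be wrapped in a loop over bin edges. This is only a minimal sketch using the sample rows from the question; the names edges, in_bin and out are my own, and the edges 1:5 are an assumption that happens to fit the sample data.

```matlab
% Sample rows from the question (column 1 is the value used for grouping)
z = [1.79 0.16 0.16 21.39 21.50
     1.83 0.16 0.16 21.39 22.40
     1.92 0.16 0.16 21.39 22.00
     2.07 0.16 0.16 21.39 22.00
     2.36 0.15 0.15 21.39 21.08
     2.96 0.13 0.13 21.39 21.04
     3.21 0.13 0.13 21.39 23.00
     3.72 0.12 0.12 21.39 24.00
     3.87 0.11 0.11 21.39 21.39
     4.14 0.10 0.10 21.39 22.00
     4.14 0.10 0.10 21.39 21.50
     4.16 0.10 0.10 21.39 21.39];

edges = 1:5;                                  % assumed bin edges: [1,2), [2,3), [3,4), [4,5)
out = zeros(numel(edges) - 1, size(z, 2));
for k = 1:numel(edges) - 1
    % rows whose first column lies in [edges(k), edges(k+1))
    in_bin = z(:, 1) >= edges(k) & z(:, 1) < edges(k + 1);
    % lower edge of the bin, followed by the column-wise mean of the other columns
    out(k, :) = [edges(k), mean(z(in_bin, 2:end), 1)];
end
disp(out)
```

A bin with no rows would come out as NaN in this sketch, so the edges should be chosen to match the data.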
I am trying to optimise the running time of my code by getting rid of some for loops. However, I have a variable that is incremented in each iteration in which sometimes the index is repeated. I provide here a minimal example:
a = [1 4 2 2 1 3 4 2 3 1]
b = [0.5 0.2 0.3 0.4 0.1 0.05 0.7 0.3 0.55 0.8]
c = [3 5 7 9]
for i = 1:10
c(a(i)) = c(a(i)) + b(i)
end
Ideally, I would like to compute it by writing:
c(a) = c(a) + b
but obviously this would not give the same result: when an index is repeated, the value at that index has to be updated several times, so this way of vectorising does not work.
Also, I am working in Matlab or Octave, in case that is important.
Thank you very much for any help. I am not sure it is even possible to vectorise this.
Edit: thank you very much for your answers so far. I have discovered accumarray, which I did not know before, and I also understood why changing the for loop between Matlab and Octave was giving me such different times. I also understand my problem better now. I gave an overly simple example which I thought I could extend. However, what if b was a matrix?
(Let's forget about c at the moment):
a = [1 4 2 2 1 3 4 2 3 1]
b =[0.69 -0.41 -0.13 -0.13 -0.42 -0.14 -0.23 -0.17 0.22 -0.24;
0.34 -0.39 -0.36 0.68 -0.66 -0.19 -0.58 0.78 -0.23 0.25;
-0.68 -0.54 0.76 -0.58 0.24 -0.23 -0.44 0.09 0.69 -0.41;
0.11 -0.14 0.32 0.65 0.26 0.82 0.32 0.29 -0.21 -0.13;
-0.94 -0.15 -0.41 -0.56 0.15 0.09 0.38 0.58 0.72 0.45;
0.22 -0.59 -0.11 -0.17 0.52 0.13 -0.51 0.28 0.15 0.19;
0.18 -0.15 0.38 -0.29 -0.87 0.14 -0.13 0.23 -0.92 -0.21;
0.79 -0.35 0.45 -0.28 -0.13 0.95 -0.45 0.35 -0.25 -0.61;
-0.42 0.76 0.15 0.99 -0.84 -0.03 0.27 0.09 0.57 0.64;
0.59 0.82 -0.39 0.13 -0.15 -0.71 -0.84 -0.43 0.93 -0.74]
I now understand that what I want is a row sum per group, and given that I am using Octave I cannot use "splitapply". I tried to generalise your answers, but accumarray would not work for matrices, and I could not generalise @rahnema1's solution either. The desired output would be:
[0.34 0.26 -0.93 -0.56 -0.42 -0.76 -0.69 -0.02 1.87 -0.53;
0.22 -1.03 1.53 -0.21 0.37 1.54 -0.57 0.73 0.23 -1.15;
-0.20 0.17 0.04 0.82 -0.32 0.10 -0.24 0.37 0.72 0.83;
0.52 -0.54 0.02 0.39 -1.53 -0.05 -0.71 1.01 -1.15 0.04]
that is "equivalent" to
[sum(b([1 5 10],:))
sum(b([3 4 8],:))
sum(b([6 9],:))
sum(b([2 7],:))]
Thank you very much. If you think I should post this as a separate question instead of adding the edit, I will do so.
Original question
It can be done with accumarray:
a = [1 4 2 2 1 3 4 2 3 1];
b = [0.5 0.2 0.3 0.4 0.1 0.05 0.7 0.3 0.55 0.8];
c = [3 5 7 9];
c(:) = c(:) + accumarray(a(:), b(:));
This sums the values from b in groups defined by a, and adds that to the original c.
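As a quick sanity check, the accumarray version can be compared against the original loop. This is just a sketch with the data from the question; c_loop and c_vec are names I made up, and passing the explicit size [numel(c0) 1] is my own addition so the result still has one entry per element of c if some index never appears in a.

```matlab
a  = [1 4 2 2 1 3 4 2 3 1];
b  = [0.5 0.2 0.3 0.4 0.1 0.05 0.7 0.3 0.55 0.8];
c0 = [3 5 7 9];

% Reference: the original loop, which handles repeated indices one at a time
c_loop = c0;
for i = 1:numel(a)
    c_loop(a(i)) = c_loop(a(i)) + b(i);
end

% Vectorised: sum b per group defined by a, then add the sums to c
c_vec = c0;
c_vec(:) = c_vec(:) + accumarray(a(:), b(:), [numel(c0) 1]);

disp(max(abs(c_loop - c_vec)))   % should be (numerically) zero
```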
Edited question
If b is a matrix, you can use
full(sparse(repmat(a, 1, size(b,1)), repelem(1:size(b,2), size(b,1)), b))
or
accumarray([repmat(a, 1, size(b,1)).' repelem(1:size(b,2), size(b,1)).'], b(:))
Matrix multiplication and implicit expansion can be used (Octave):
nc = numel(c);
c += b * (1:nc == a.');
For large inputs it may be more memory efficient to use a sparse matrix:
nc = numel(c);
nb = numel(b);
c += b * sparse(1:nb, a, 1, nb, nc);
Edit: When b is a matrix you can extend this solution as:
nc = numel(c);
na = numel(a);
out = sparse(a, 1:na, 1, nc, na) * b;
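To see what the indicator-matrix product does when b is a matrix, here is a small sketch on made-up data (a 10x3 b filled with 1:30, which is purely hypothetical and not the matrix from the question). It compares the sparse product with an accumarray call that uses ndgrid to build one (row, column) subscript pair per element of b. The names out_sparse, out_acc, rows and cols are my own.

```matlab
a  = [1 4 2 2 1 3 4 2 3 1];
b  = reshape(1:30, 10, 3);        % hypothetical 10x3 data, one row per entry of a
nc = max(a);                      % number of groups
na = numel(a);

% Indicator matrix: entry (g, i) is 1 when row i of b belongs to group g
out_sparse = full(sparse(a, 1:na, 1, nc, na) * b);

% Same result with accumarray: rows(i,j) = a(i), cols(i,j) = j, matching b(:)
[rows, cols] = ndgrid(a, 1:size(b, 2));
out_acc = accumarray([rows(:) cols(:)], b(:));

disp(out_sparse)
disp(isequal(out_sparse, out_acc))
```

Row g of either result is sum(b(a == g, :)), i.e. the per-group row sum described in the question.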
I have this matrix in MATLAB:
x = [NaN -2 -1 0 1 2;
1 0.21 0.15 0.34 0.11 0.32;
2 0.14 0.10 0.16 0.31 0.11];
The first row represents the location of the values following X coordinates.
I shift the first row by -0.63, so x becomes:
New_x = [NaN -2.63 -1.63 -0.63 0.37 1.37;
1 0.21 0.15 0.34 0.11 0.32;
2 0.14 0.10 0.16 0.31 0.11];
How can I use interpolation to get the values at specific coordinates of the New_x matrix that we have in the x matrix? ([-2 -1 0 1 2] points)
New_xInterp = [NaN -2.63 .. -2 .. -1.63 .. -1 .. -0.63 .. 0 .. 0.37 .. 1 .. 1.37 .. 2;
1 0.21 .. ? .. 0.15 .. ? .. 0.34 .. ? .. 0.11 .. ? .. 0.32 .. ?;
2 0.14 .. ? .. 0.10 .. ? .. 0.16 .. ? .. 0.31 .. ? .. 0.11 .. ?];
I want to get the '?' values. I tried to use interp2 function but I don't know which step or 2^k-1 interpolated points between coordinates values I have to have in order to get the points like -2, -1, 0, 1, 2.
Thanks !
Since you do not have 2D data, you are only interpolating along one dimension, so you only need the function interp1.
This function can work on vectors or matrices if necessary, but it requires a slight reorganisation of your data.
%% Input
M = [NaN -2 -1 0 1 2;
1 0.21 0.15 0.34 0.11 0.32;
2 0.14 0.10 0.16 0.31 0.11];
%% Demultiplex inputs
x = M(1,2:end).' ; % extract X values, reorder in column
y = M(2:end,2:end).' ; % extract Y values, reorder in columns
%% Interpolate
xn = sort( [x-0.63 ; x] ) ; % Generate the new_x target values
yn = interp1( x-0.63 , y , xn ,'linear','extrap') ; % Interpolate the full matrix in one go
At this point you have your new xn and yn values in columns:
xn= yn=
-2.63 0.21 0.14
-2 0.1722 0.1148
-1.63 0.15 0.1
-1 0.2697 0.1378
-0.63 0.34 0.16
0 0.1951 0.2545
0.37 0.11 0.31
1 0.2423 0.184
1.37 0.32 0.11
2 0.4523 -0.016
I would keep them like that if you have more operations to do on them later on. However, if you want it back into the format you had at the beginning, we can simply rebuild the new full matrix:
%% Rebuild global matrix
Mout = [ M(:,1) , [xn.' ; yn.'] ]
Mout =
NaN -2.63 -2 -1.63 -1 -0.63 0 0.37 1 1.37 2
1 0.21 0.1722 0.15 0.2697 0.34 0.1951 0.11 0.2423 0.32 0.4523
2 0.14 0.1148 0.1 0.1378 0.16 0.2545 0.31 0.184 0.11 -0.016
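To make the linear interpolation less of a black box, here is a small check of a single value, done once with interp1 and once by hand with the straight-line formula between the two neighbouring points. It only uses the first data row from the question; the variable names are my own.

```matlab
x_shifted = [-2 -1 0 1 2] - 0.63;          % shifted X coordinates
y1        = [0.21 0.15 0.34 0.11 0.32];    % first data row

% Value at x = -2 from interp1 (linear is the default method)
v_interp = interp1(x_shifted, y1, -2);

% Same value by hand: -2 lies between -2.63 (y = 0.21) and -1.63 (y = 0.15)
x0 = -2.63; y0 = 0.21;
x1 = -1.63; y1b = 0.15;
v_manual = y0 + (y1b - y0) * (-2 - x0) / (x1 - x0);

disp([v_interp v_manual])
```

Both should agree with the 0.1722 shown in the table above.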
Maybe you can try interp1 + arrayfun like below
r = sort([x(1,2:end),New_x(1,2:end)]);
New_xInterp = [New_x(:,1),cell2mat(arrayfun(@(k) interp1(New_x(1,2:end),New_x(k,2:end),r),1:size(New_x,1),'UniformOutput',false).')];
which gives
New_xInterp =
NaN -2.63000 -2.00000 -1.63000 -1.00000 -0.63000 0.00000 0.37000 1.00000 1.37000 NA
1.00000 0.21000 0.17220 0.15000 0.26970 0.34000 0.19510 0.11000 0.24230 0.32000 NA
2.00000 0.14000 0.11480 0.10000 0.13780 0.16000 0.25450 0.31000 0.18400 0.11000 NA
The code above used linear interpolation. If you want other options, you can type help interp1 to see more.
I have a couple of matrices (1800 x 27) that represent subjects and their recordings (3 minutes equivalent for each of 27 subjects). Each column represents a subject.
I need to do intercorrelation between subjects, let's say to correlate F to G, G to H, and H to F for all 27 subjects.
I use the corr command, corr(B), where B is a matrix, and it returns something like the following:
1 0.07 -0.05 0.10 0.04 0.12
0.07 1 -0.02 -0.08 0.17 0.03
-0.05 -0.02 1 0.04 0.16 0.13
0.10 -0.08 0.04 1 -0.04 0.34
0.04 0.18 0.16 -0.04 1 0.13
How can I adjust the code to exclude self-correlation (eg F to F) so I won't get "1" numerals?
(it's present in each row/column)
I have to perform some transformations afterwards, like Fisher Z-Transformation, which returns inf for each "1", and as result, I can't use further calculations.
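One possible way around the inf values, offered only as a sketch and not as the definitive approach: leave the correlation call as it is and mask the diagonal with NaN before the Fisher z-transformation (atanh). The data below is random and purely hypothetical, corrcoef is used here in place of corr(B) from the question, and the names R, Z, is_pair and z_pairs are my own.

```matlab
B = randn(100, 5);                    % hypothetical data: 100 samples x 5 subjects

R = corrcoef(B);                      % correlation matrix, ones on the diagonal
R(logical(eye(size(R)))) = NaN;       % mask the self-correlations

Z = atanh(R);                         % Fisher z-transformation; diagonal stays NaN

% Alternatively, keep each pair once as a vector (upper triangle, no diagonal)
is_pair = triu(true(size(R)), 1);
z_pairs = Z(is_pair);

disp(size(z_pairs))
```

Functions that cannot handle NaN would need the z_pairs vector (or an explicit ~isnan filter) rather than the full matrix.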
Here is a sample of data
Time Data
0.32 1.5
0.45 0.6
0.68 2.1
0.91 0.8
1.23 1.3
1.54 1.0
1.68 2.0
1.92 2.3
1.95 0.7
1.98 1.6
2.12 1.9
2.34 0.3
My problem is that I want to put all the data in each time range (0-0.3, 0.3-0.6, and so on) into its own nx2 matrix. The time always continues to increase. It would also be nice to define the bins as multiples of a fixed step, to avoid writing out 0.3, 0.6, 0.9, 1.2, etc.
I can split the time into the relevant ranges no problem but I do not know how to keep the relevant data with its accompanying time.
I would then want to go on and plot this once I can do the above.
Thanks in advance for any help/suggestions :)
Assuming the first column contains non-decreasing values, a small modification of my answer to your previous question will work. Let data denote your input matrix and s be the step used for defining groups:
data = [ 0.32 1.5
0.45 0.6
0.68 2.1
0.91 0.8
1.23 1.3
1.54 1.0
1.68 2.0
1.92 2.3
1.95 0.7
1.98 1.6
2.12 1.9
2.34 0.3 ]; %// example data
s = .3; %// step to define groups
Then
result = mat2cell(data, diff([0; find(diff([floor(data(:,1)/s); NaN]))]) , size(data,2));
gives
result{1} =
0.3200 1.5000
0.4500 0.6000
result{2} =
0.6800 2.1000
result{3} =
0.9100 0.8000
result{4} =
1.2300 1.3000
result{5} =
1.5400 1.0000
1.6800 2.0000
result{6} =
1.9200 2.3000
1.9500 0.7000
1.9800 1.6000
result{7} =
2.1200 1.9000
2.3400 0.3000
Note that if some group is not present in the input it will simply be skipped in the result. For example,
data = [ 0.32 1.5
0.45 0.6
0.68 2.1
2.12 1.9
2.34 0.3 ]; %// example data
s = .3; %// step to define groups
will produce
result{1} =
0.3200 1.5000
0.4500 0.6000
result{2} =
0.6800 2.1000
result{3} =
2.1200 1.9000
2.3400 0.3000
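Since the one-liner is rather dense, here is the same computation split into named steps. This is only an unpacking of the expression above with the first example data; group_id, last_row and rows_per_group are names I introduced for illustration.

```matlab
data = [0.32 1.5; 0.45 0.6; 0.68 2.1; 0.91 0.8; 1.23 1.3; 1.54 1.0; ...
        1.68 2.0; 1.92 2.3; 1.95 0.7; 1.98 1.6; 2.12 1.9; 2.34 0.3];
s = 0.3;                                   % step to define groups

% 1. Group number of each row: 0 for [0,s), 1 for [s,2s), ...
group_id = floor(data(:, 1) / s);

% 2. Index of the last row of each group: positions where the group number
%    changes. The appended NaN makes the final row count as a change too.
last_row = find(diff([group_id; NaN]));

% 3. Number of rows in each group
rows_per_group = diff([0; last_row]);

% 4. Cut the matrix into consecutive blocks with those heights
result = mat2cell(data, rows_per_group, size(data, 2));

disp(rows_per_group.')
```

Step 2 is also why the time column has to be non-decreasing: groups are detected as runs of consecutive rows.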
If you would like to define custom bin edges for binning rows of input array, here's one approach with histcounts and arrayfun -
bin_edges = [0.3,0.6,1,12 15]; %// Define bin edges here
[~,~,bins] = histcounts(A(:,1),bin_edges);
groups = arrayfun(@(n) A(bins==n,:),1:max(bins),'Uni',0);
Sample input, output -
>> A
A =
0.32 1.5
0.45 0.6
0.68 2.1
0.91 0.8
1.23 1.3
1.54 1
1.68 2
1.92 2.3
1.95 0.7
1.98 1.6
12.12 1.9
12.34 0.3
>> celldisp(groups) %// Display cells of output
groups{1} =
0.32 1.5
0.45 0.6
groups{2} =
0.68 2.1
0.91 0.8
groups{3} =
1.23 1.3
1.54 1
1.68 2
1.92 2.3
1.95 0.7
1.98 1.6
groups{4} =
12.12 1.9
12.34 0.3
I have a matrix which is 36 x 2, but I want to separate the elements to give me 18 2 x 2 matrices from top to bottom. E.g. if I have a matrix:
1 2
3 4
5 6
7 8
9 10
11 12
13 14
... ...
I want to split it into separate matrices:
M1 = 1 2
3 4
M2 = 5 6
7 8
M3 = 9 10
11 12
..etc.
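Maybe the following sample code could be useful:
a=rand(36,2);
b=permute(reshape(a.',2,2,18),[2 1 3]) % transpose first so that each pair of consecutive rows of a ends up in one slice
Then with the 3rd index of b you can access your 18 2x2 matrices, e.g. b(:,:,2) gives the second 2x2 matrix, which is a(3:4,:). (A plain reshape(a,2,2,18) would not work here, because reshape fills column by column and would mix rows from different blocks.)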
I think that the direct answer to your question is:
sampledata = [...
0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18; ...
0.19 0.20 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.30 0.31 0.32 0.33 0.34 0.35 0.36 1.19 1.20 1.21 1.22 1.23 1.24 1.25 1.26 1.27 1.28 1.29 1.30 1.31 1.32 1.33 1.34 1.35 1.36];
for ix = 1:(size(sampledata,2)/2)
assignin('base',['M' sprintf('%02d',ix)], sampledata(:,ix*2+[-1 0]))
end
This creates 18 variables, named 'M01' through 'M18', with pieces of the sampledata matrix.
However, please don't use dynamic variable names like this. It will complicate every other piece of code that it touches. Use a cell array, a 3D array (as suggested by @Johannes_Endres, +1 BTW), or a structure. Anything that removes the need for you to write something like this later on:
%PLEASE DO NOT USE THIS
%ALSO DO NOT BACK YOURSELF INTO A CORNER WHERE YOU HAVE TO DO IT IN THE FUTURE
varNames = who('M*');
for ix = 1:length(varNames)
str = ['result(' num2str(ix) ') = some_function(' varNames{ix} ');'];
eval(str);
end
I've seen code like this, and it is slow and extremely cumbersome to maintain, not to mention the headache and pain to your internal beauty-meter.
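For completeness, a minimal sketch of the cell-array alternative recommended above, assuming the 36 x 2 layout from the question (the test matrix x and the name blocks are made up for illustration):

```matlab
x = reshape(1:72, [2 36]).';            % 36x2 test matrix: [1 2; 3 4; 5 6; ...]

% Cut x into 18 consecutive blocks of 2 rows each, keeping both columns
blocks = mat2cell(x, 2 * ones(1, 18), 2);

disp(blocks{3})                          % third 2x2 block

% Later processing is a plain loop or cellfun over the blocks, no eval needed
block_sums = cellfun(@(m) sum(m(:)), blocks);
```

Here blocks{k} plays the role of the variable Mk, but it can be indexed, looped over and passed to functions like any other value.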
x = reshape(1:36*2,[2 36])'
k = 1
for i = 1:2:35
eval(sprintf('M%d = x(%d:%d,:);',k,i,i+1));
k = k+1;
end