matlab: sorting and random - matlab

I need to split one large raw matrix into a few smaller matrices according to the value in the 1st column (the 1st column contains either 1, 2, or 3).
If the 1st column is 1, randomly save 75% of those rows in file A1 and the other 25% in file A2.
If the 1st column is 2, randomly save 75% of those rows in file B1 and the other 25% in file B2.
If the 1st column is 3, randomly save 75% of those rows in file C1 and the other 25% in file C2.
How should I write the code?
Example:
a raw matrix has 15 rows x 6 columns:
7 rows are 1 in 1st column, 5 rows are 2 in 1st column, and 3 rows are 3 in 1st column.
1 -0.05 -0.01 0.03 0.07 0.11
1 -0.4 -0.36 -0.32 -0.28 -0.24
1 0.3 0.34 0.38 0.42 0.46
1 0.75 0.79 0.83 0.87 0.91
1 0.45 0.49 0.53 0.57 0.61
1 0.8 0.84 0.88 0.92 0.96
1 0.05 0.09 0.13 0.17 0.21
2 0.5 0.54 0.58 0.62 0.66
2 0.4 0.44 0.48 0.52 0.56
2 0.9 0.94 0.98 1.02 1.06
2 0.85 0.89 0.93 0.97 1.01
2 0.75 0.79 0.83 0.87 0.91
3 0.36 0.4 0.44 0.48 0.52
3 0.6 0.64 0.68 0.72 0.76
3 0.4 0.44 0.48 0.52 0.56
7 rows have 1 in the 1st column: randomly take 75% of the 7 rows (7*0.75=5.25, rounded to 5) as a new matrix (5 rows x 6 columns); the remaining 25% become another new matrix.
5 rows have 2 in the 1st column: randomly take 75% of the 5 rows (5*0.75=3.75, rounded to 4) as a new matrix (4 rows x 6 columns); the remaining 25% become another new matrix.
3 rows have 3 in the 1st column: randomly take 75% of the 3 rows (3*0.75=2.25, rounded to 2) as a new matrix (2 rows x 6 columns); the remaining 25% become another new matrix.
Result:
A1=
1 -0.4 -0.36 -0.32 -0.28 -0.24
1 0.3 0.34 0.38 0.42 0.46
1 0.75 0.79 0.83 0.87 0.91
1 0.8 0.84 0.88 0.92 0.96
1 -0.05 -0.01 0.03 0.07 0.11
B1=
2 0.9 0.94 0.98 1.02 1.06
2 0.85 0.89 0.93 0.97 1.01
2 0.5 0.54 0.58 0.62 0.66
2 0.75 0.79 0.83 0.87 0.91
C1=
3 0.36 0.4 0.44 0.48 0.52
3 0.4 0.44 0.48 0.52 0.56

Here is one possible solution to your problem, using the function randperm:
% Create matrices
firstcol=ones(15,1);
firstcol(8:12)=2;
firstcol(13:15)=3;
mat=[firstcol rand(15,5)];
% Sort according to first column
A=mat(mat(:,1)==1,:);
B=mat(mat(:,1)==2,:);
C=mat(mat(:,1)==3,:);
% Randomly rearrange lines
A=A(randperm(size(A,1)),:);
B=B(randperm(size(B,1)),:);
C=C(randperm(size(C,1)),:);
% Select the first 75% of the lines (rounded) for *1, the rest for *2
A1=A(1:round(0.75*size(A,1)),:);
A2=A(round(0.75*size(A,1))+1:end,:);
B1=B(1:round(0.75*size(B,1)),:);
B2=B(round(0.75*size(B,1))+1:end,:);
C1=C(1:round(0.75*size(C,1)),:);
C2=C(round(0.75*size(C,1))+1:end,:);
Hope it helps.
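The same steps can also be written as a loop over the labels, so that nothing is hard-coded for exactly three classes. This is only a sketch: the names frac and parts are my own, and the input matrix is generated to match the 7/5/3 layout of the example.

```matlab
% Sketch: split each class label into a random 75% / 25% pair of matrices.
% 'frac' and 'parts' are illustrative names, not part of the original answer.
mat = [[ones(7,1); 2*ones(5,1); 3*ones(3,1)] rand(15,5)];  % same layout as the example
frac = 0.75;
labels = unique(mat(:,1));            % here: [1; 2; 3]
parts = cell(numel(labels), 2);       % parts{k,1} = 75% part, parts{k,2} = 25% part
for k = 1:numel(labels)
    rows = mat(mat(:,1) == labels(k), :);      % all rows of this class
    rows = rows(randperm(size(rows,1)), :);    % shuffle them
    n1 = round(frac * size(rows,1));           % 7 -> 5, 5 -> 4, 3 -> 2
    parts{k,1} = rows(1:n1, :);
    parts{k,2} = rows(n1+1:end, :);
end
```

Each part can then be written to its own file, for example with dlmwrite('A1.txt', parts{1,1}); the file name here is just an example.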

Related

vectorise for loop with a variable that is incremented in each iteration

I am trying to optimise the running time of my code by getting rid of some for loops. However, I have a variable that is incremented in each iteration in which sometimes the index is repeated. I provide here a minimal example:
a = [1 4 2 2 1 3 4 2 3 1]
b = [0.5 0.2 0.3 0.4 0.1 0.05 0.7 0.3 0.55 0.8]
c = [3 5 7 9]
for i = 1:10
c(a(i)) = c(a(i)) + b(i)
end
Ideally, I would like to compute it by writing:
c(a) = c(a) + b
but obviously that does not give the same result: the value for the same index has to be updated several times, so this way of vectorising does not work.
Also, I am working in Matlab or Octave, in case that is important.
Thank you very much for any help. I am not sure whether it is possible to vectorise this.
Edit: thank you very much for your answers so far. I have discovered accumarray, which I did not know before and also understood why changing the for loop between Matlab and Octave was giving me such different times. I also understood my problem better. I gave a too simple example which I thought I could extend, however, what if b was a matrix?
(Let's forget about c at the moment):
a = [1 4 2 2 1 3 4 2 3 1]
b =[0.69 -0.41 -0.13 -0.13 -0.42 -0.14 -0.23 -0.17 0.22 -0.24;
0.34 -0.39 -0.36 0.68 -0.66 -0.19 -0.58 0.78 -0.23 0.25;
-0.68 -0.54 0.76 -0.58 0.24 -0.23 -0.44 0.09 0.69 -0.41;
0.11 -0.14 0.32 0.65 0.26 0.82 0.32 0.29 -0.21 -0.13;
-0.94 -0.15 -0.41 -0.56 0.15 0.09 0.38 0.58 0.72 0.45;
0.22 -0.59 -0.11 -0.17 0.52 0.13 -0.51 0.28 0.15 0.19;
0.18 -0.15 0.38 -0.29 -0.87 0.14 -0.13 0.23 -0.92 -0.21;
0.79 -0.35 0.45 -0.28 -0.13 0.95 -0.45 0.35 -0.25 -0.61;
-0.42 0.76 0.15 0.99 -0.84 -0.03 0.27 0.09 0.57 0.64;
0.59 0.82 -0.39 0.13 -0.15 -0.71 -0.84 -0.43 0.93 -0.74]
I understand now that what I would be doing is a row sum per group, and given that I am using Octave I cannot use "splitapply". I tried to generalise your answers, but I could not get accumarray to work for matrices, and I also could not generalise @rahnema1's solution. The desired output would be:
[0.34 0.26 -0.93 -0.56 -0.42 -0.76 -0.69 -0.02 1.87 -0.53;
0.22 -1.03 1.53 -0.21 0.37 1.54 -0.57 0.73 0.23 -1.15;
-0.20 0.17 0.04 0.82 -0.32 0.10 -0.24 0.37 0.72 0.83;
0.52 -0.54 0.02 0.39 -1.53 -0.05 -0.71 1.01 -1.15 0.04]
that is "equivalent" to
[sum(b([1 5 10],:))
sum(b([3 4 8],:))
sum(b([6 9],:))
sum(b([2 7],:))]
Thank you very much, If you think I should include this in another question instead of adding the edit I will do so.
Original question
It can be done with accumarray:
a = [1 4 2 2 1 3 4 2 3 1];
b = [0.5 0.2 0.3 0.4 0.1 0.05 0.7 0.3 0.55 0.8];
c = [3 5 7 9];
c(:) = c(:) + accumarray(a(:), b(:));
This sums the values from b in groups defined by a, and adds that to the original c.
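As a quick check, here is a small sketch that compares the accumarray version with the original loop on the data from the question. The variable names c_loop, c_vec, s and max_diff are mine; the size argument passed to accumarray is an extra precaution so the result stays as long as c even if the largest index never appears in a.

```matlab
a = [1 4 2 2 1 3 4 2 3 1];
b = [0.5 0.2 0.3 0.4 0.1 0.05 0.7 0.3 0.55 0.8];
c = [3 5 7 9];

% Reference result from the original loop
c_loop = c;
for i = 1:numel(a)
    c_loop(a(i)) = c_loop(a(i)) + b(i);
end

% Vectorised version: per-group sums of b, one entry per element of c
s = accumarray(a(:), b(:), [numel(c) 1]);
c_vec = c + s.';

max_diff = max(abs(c_loop - c_vec));   % should be at rounding-error level
```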
Edited question
If b is a matrix (one row per entry of a), you can use
full(sparse(repmat(a, 1, size(b,2)), repelem(1:size(b,2), size(b,1)), b))
or
accumarray([repmat(a, 1, size(b,2)).' repelem(1:size(b,2), size(b,1)).'], b(:))
Matrix multiplication and implicit expansion can also be used (Octave):
nc = numel(c);
c += b * (1:nc == a.');
For large inputs it may be more memory-efficient to use a sparse matrix:
nc = numel(c);
nb = numel(b);
c += b * sparse(1:nb, a, 1, nb, nc);
Edit: When b is a matrix you can extend this solution as:
nc = numel(c);
na = numel(a);
out = sparse(a, 1:na, 1, nc, na) * b;
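To see what the indicator matrix does, here is a small sketch with made-up numbers (not the data from the question). It assumes the labels in a run from 1 to nc; S and expected are names I introduced.

```matlab
a = [1 3 2 2 1];                  % group label for each row of b (made-up example)
b = [1 2; 3 4; 5 6; 7 8; 9 10];
nc = max(a);                      % number of groups, assuming labels are 1..nc
na = numel(a);
S = sparse(a, 1:na, 1, nc, na);   % S(g,i) = 1 when row i of b belongs to group g
out = full(S * b);                % row g of out is the sum of the rows in group g
expected = [sum(b([1 5],:), 1); sum(b([3 4],:), 1); sum(b(2,:), 1)];
```

Each row of S picks out the rows of b that belong to one group, so the product is the row sum per group.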

Calculate the mean values if the numbers are same

I want to calculate the mean of the rows that have the same number in the first column, e.g.:
M = [ 2 0.99 0.15 0.60 0.12 0.76 0.16 0.81 0.02 0.75 0.32
2 0.17 0.38 0.34 0.02 0.74 0.67 0.75 0.92 0.23 0.81
2 0.26 0.16 0.30 0.29 0.74 0.89 0.12 0.65 0.06 0.79
3 0.40 0.76 0.45 0.32 0.11 0.52 0.53 0.93 0.77 0.85
3 0.07 0.87 0.42 0.65 0.68 0.70 0.33 0.16 0.67 0.51
3 0.68 0.35 0.36 0.96 0.46 0.15 0.55 0.92 0.72 0.64
3 0.40 0.69 0.56 0.94 0.21 0.95 0.40 0.79 0.64 0.95
4 0.98 0.29 0.74 0.46 0.10 0.54 0.42 0.58 0.42 0.44
4 0.40 0.53 0.42 0.24 0.82 0.68 0.18 0.44 0.39 0.06
4 0.62 0.83 0.43 0.76 0.18 0.04 0.26 0.26 0.82 0.87 ]
Out=[ 2 0.47 0.23 0.41 0.15 0.75 0.57 0.56 0.53 0.35 0.64
3 0.39 0.67 0.45 0.72 0.37 0.58 0.45 0.70 0.70 0.74
4 0.67 0.55 0.53 0.49 0.37 0.42 0.28 0.43 0.54 0.46 ]
G = findgroups(M(:,1));
Out = [unique(M(:,1)) splitapply(@mean, M(:,2:end), G)]
G = findgroups(A) returns G, a vector of group numbers created from the grouping variable A. Here, G = findgroups(M(:,1)) builds the groups from the first column of the matrix.
Y = splitapply(func,X,G) splits X into groups specified by G and applies the function func to each group.
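If findgroups and splitapply are not available (for example in Octave), a rough equivalent can be built from unique and a group indicator matrix. This is only a sketch on a small made-up matrix; keys, S, counts, sums and means are names I chose.

```matlab
M = [2 1 2; 2 3 4; 3 5 6; 3 7 8; 3 9 10; 4 11 12];   % made-up small input
[keys, ~, G] = unique(M(:,1));          % G plays the role of the findgroups output
n = numel(G);
S = sparse(G, (1:n).', 1, numel(keys), n);   % S(g,i) = 1 when row i is in group g
counts = full(sum(S, 2));               % number of rows in each group
sums = full(S * M(:,2:end));            % column sums per group
means = bsxfun(@rdivide, sums, counts); % divide each group's sums by its row count
Out = [keys means];
```

Because the division is done per group explicitly, a group that contains only one row is still averaged column by column.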

How can I find neighbors of specific value in a matrix via Matlab?

I have a 256*256 matrix in which some values are 0 (close to each other), and I have found the coordinates of the 0 values.
% finding missing rows and cols: xi, yi
[row,col]=find(~X);
MIS=[row,col];
MISWO=[MIS zeros(size(MIS,1),1) ];
MISWO
...
168 224 0
169 224 0
170 224 0
171 224 0
172 224 0
173 224 0
174 224 0
Part of the X matrix:
0.57 0.58 0.00 0.55 0.54
0.55 0.54 0.00 0.55 0.52
0.56 0.55 0.00 0.55 0.53
0.56 0.55 0.00 0.53 0.52
0.56 0.00 0.00 0.53 0.54
0.55 0.00 0.00 0.53 0.52
0.55 0.00 0.00 0.55 0.51
0.55 0.00 0.00 0.53 0.51
0.56 0.00 0.00 0.51 0.53
0.55 0.00 0.00 0.51 0.51
0.55 0.00 0.00 0.51 0.49
0.55 0.00 0.00 0.52 0.49
0.56 0.00 0.53 0.51 0.48
My goal is to find the 5-10 nearest neighbors of the zero values, with their coordinates and values.
Can anybody help me?
All the best
To find all neighbors in a 5x5 box around each zero pixel, we can use 2-D convolution:
X1=conv2(double(~X),ones(5),'same')>0;
This yields a binary matrix with 1 at the positions of ALL the neighbors around zero pixels (the zero pixels themselves included). Finding the rows and columns of all the neighbors, without the zeros, is just:
[row2,col2]=find(X1.*X);
Then the matrix that you want is (sub2ind is needed because X(row2,col2) would return a whole submatrix rather than one value per coordinate pair):
MIS2=[row2 col2 X(sub2ind(size(X),row2,col2))];
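Here is a small sketch of the convolution approach on a made-up 6x6 matrix with a single zero, so the result is easy to check by hand. The names near, r, c and vals are mine.

```matlab
X = [0.5 0.6 0.7 0.8 0.9 0.4;
     0.5 0.6 0.7 0.8 0.9 0.4;
     0.5 0.6 0   0.8 0.9 0.4;
     0.5 0.6 0.7 0.8 0.9 0.4;
     0.5 0.6 0.7 0.8 0.9 0.4;
     0.5 0.6 0.7 0.8 0.9 0.4];      % made-up example, one zero at (3,3)
near = conv2(double(X == 0), ones(5), 'same') > 0;   % inside the 5x5 box of a zero
[r, c] = find(near & X ~= 0);        % neighbours that are not zero themselves
vals = X(sub2ind(size(X), r, c));    % one value per (r,c) pair
MIS2 = [r c vals];
```

With one zero in the middle of a 5x5 box, this gives 24 neighbor rows, and nothing from row 6 or column 6.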

Use matlab to arrange excel data

I have data in excel like this:
id date value
a 9/17/2012 0.25
a 9/18/2012 0.48
a 9/19/2012 0.29
a 9/20/2012 0.46
a 9/21/2012 0.17
a 9/24/2012 0.89
a 9/25/2012 0.20
a 9/26/2012 0.65
a 9/27/2012 0.26
b 9/17/2012 0.83
b 9/18/2012 0.87
b 9/19/2012 0.40
b 9/20/2012 0.33
b 9/21/2012 0.71
b 9/24/2012 0.13
b 9/25/2012 0.91
b 9/26/2012 0.73
b 9/27/2012 0.87
c 9/17/2012 0.47
c 9/18/2012 0.15
c 9/19/2012 0.73
c 9/20/2012 0.47
c 9/21/2012 0.03
c 9/24/2012 0.23
c 9/25/2012 0.21
c 9/26/2012 0.39
c 9/27/2012 0.77
and I would like to use Matlab to re-arrange to:
date a b c
9/17/2012 0.25 0.83 0.47
9/18/2012 0.48 0.87 0.15
9/19/2012 0.29 0.40 0.73
9/20/2012 0.46 0.33 0.47
9/21/2012 0.17 0.71 0.03
9/24/2012 0.89 0.13 0.23
9/25/2012 0.20 0.91 0.21
9/26/2012 0.65 0.73 0.39
9/27/2012 0.26 0.87 0.77
What's the easiest way to do this?
Use:
importdata or xlsread to read the data;
join (Statistics Toolbox) to combine it, or see "MATLAB Combine matrices of different dimensions, filling values of corresponding indices".
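If the toolbox is not available, the re-arrangement itself can be sketched with unique and accumarray once the three columns are in memory. This is an assumption-laden sketch: it uses a made-up subset of the data typed in directly (no file is read), and the names ids, dates, row, col and wide are mine.

```matlab
% Made-up subset of the table, as it might look after reading the sheet
id    = {'a';'a';'a';'b';'b';'b'};
date  = {'9/17/2012';'9/18/2012';'9/19/2012';'9/17/2012';'9/18/2012';'9/19/2012'};
value = [0.25; 0.48; 0.29; 0.83; 0.87; 0.40];

[ids, ~, col]   = unique(id);     % one output column per id
[dates, ~, row] = unique(date);   % one output row per date (sorted as text)
% Place each value at (date, id); combinations that never occur stay NaN
wide = accumarray([row col], value, [numel(dates) numel(ids)], [], NaN);
```

Note that unique sorts the date strings as text, which happens to be the right order here but is not chronological in general; converting the dates with datenum first would avoid that.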

finding out the scaling factors to match two curves with fmincon in matlab

This is a follow-up to the question "how to find out the scaling factors to match two curves in matlab?"
I use the following code to find the scaling factors that match two curves:
function err = sqrError(coeffs, x1, y1, x2, y2)
y2sampledInx1 = interp1(coeffs(1)*x2,y2,x1);
err = sum((coeffs(2)*y2sampledInx1-y1).^2);
end
and I use fmincon to optimize the result:
options = optimset('Algorithm','active-set','MaxFunEvals',10000,'TolCon',1e-7)
A0(1)=1; A0(2)=1; LBA1=0.1; UBA1=5; LBA2=0.1; UBA2=5;
LB=[LBA1 LBA2]; UB=[UBA1 UBA2];
coeffs = fmincon(@(c) sqrError(c,x1, y1, x2, y2),A0,[],[],[],[],LB,UB,[],options);
When I test the function with my data,
x1=[-0.3
-0.24
-0.18
-0.12
-0.06
0
0.06
0.12
0.18
0.24
0.3
0.36
0.42
0.48
0.54
0.6
0.66
0.72
0.78
0.84
0.9
0.96
1.02
1.08
1.14
1.2
1.26
1.32
1.38
1.44
1.5
1.56
1.62
1.68
1.74
1.8
1.86
1.92
1.98
2.04 ];
y1=[0.00
0.00
0.00
0.01
0.03
0.09
0.13
0.14
0.14
0.16
0.20
0.22
0.26
0.34
0.41
0.52
0.62
0.72
0.81
0.91
0.95
0.99
0.98
0.96
0.90
0.82
0.74
0.66
0.58
0.52
0.47
0.40
0.36
0.32
0.27
0.22
0.19
0.15
0.12
0.10 ];
x2=[-0.3
-0.24
-0.18
-0.12
-0.06
0
0.06
0.12
0.18
0.24
0.3
0.36
0.42
0.48
0.54
0.6
0.66
0.72
0.78
0.84
0.9
0.96
1.02
1.08
1.14
1.2
1.26
1.32
1.38
1.44
1.5
1.56
1.62
1.68
1.74
1.8
1.86
1.92
1.98
2.04 ];
y2=[0.00
0.00
0.00
0.00
0.05
0.15
0.15
0.13
0.11
0.11
0.13
0.18
0.24
0.33
0.43
0.54
0.66
0.76
0.84
0.90
0.93
0.94
0.94
0.91
0.87
0.81
0.75
0.69
0.63
0.55
0.49
0.43
0.37
0.32
0.27
0.23
0.19
0.16
0.13
0.10 ];
the following error message shows up:
??? Error using ==> interp1 at 172
NaN is not an appropriate value for X.
Error in ==> sqrError at 2
y2sampledInx1 = interp1(coeffs(1)*x2,y2,x1);
Error in ==> @(c)sqrError(c,x1,y1,x2,y2)
Error in ==> nlconst at 805
f = feval(funfcn{3},x,varargin{:});
Error in ==> fmincon at 758
[X,FVAL,LAMBDA,EXITFLAG,OUTPUT,GRAD,HESSIAN]=...
Error in ==> coeffs = fmincon(@(c) sqrError(c,x1, y1, x2, y2),A0,[],[],[],[],LB,UB,[],options);
What is wrong with the code, and how can I get around it?
Thanks for the help.
Your scaling is likely pushing the interpolated axis out of range of the x-axis of the data, i.e.
x1 < min(x2*coeffs(1)) or x1 > max(x2*coeffs(1)) for at least one x1 and the value of coeffs(1) chosen by the fitting algorithm.
By default interp1 returns NaN for query points outside the range, so the squared error becomes NaN, and that NaN can then end up in the coefficients that the optimizer tries next. You can fix this by giving a fill value for data outside the range. Alternatively, you can use extrapolation to guess at these values. So try one of these:
y2sampledInx1 = interp1(coeffs(1)*x2,y2,x1,'Linear', 'Extrap');
y2sampledInx1 = interp1(coeffs(1)*x2,y2,x1,'Linear', Inf);
y2sampledInx1 = interp1(coeffs(1)*x2,y2,x1,'Linear', 1E18); %if Inf messes with the algorithm
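To make the effect visible, here is a small sketch with made-up data (not the curves from the question). The scale factor s stands in for coeffs(1); y_default, y_extrap, y_filled and err_default are names I introduced.

```matlab
x2 = (0:0.5:2).';          % made-up sample grid
y2 = x2.^2;
x1 = (0:0.5:2).';          % query points
s  = 0.8;                  % a scale factor below 1 shrinks the x-range to [0, 1.6]

y_default = interp1(s*x2, y2, x1);                      % NaN where x1 > 1.6
y_extrap  = interp1(s*x2, y2, x1, 'linear', 'extrap');  % finite everywhere
y_filled  = interp1(s*x2, y2, x1, 'linear', 0);         % constant outside the range

err_default = sum(y_default.^2);   % any NaN makes the whole sum NaN
```

Only the last query point (x1 = 2) falls outside the scaled range, but that single NaN is enough to make the summed error NaN.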