Transpose 1D to 2D by Group in Matlab

I need to transpose a vector into a 2D matrix according to a group of values that are equal in another column of a matrix. For example:
1 x1
1 x2
1 x3
1 x4
2 x5
2 x6
2 x7
2 x8
Should look like:
x1 x2 x3 x4;
x5 x6 x7 x8;
This is the same procedure you would do in SAS using proc tabulate. Reshape didn't work for me because it doesn't transpose the result, and I tried permute with no luck either. Is there any built-in command that does this, so that I don't have to program it using find, transpose, and vertcat?

If for some reason you want to avoid reshape (although the solution in the comments will work), you can use sub2ind to get the linear indices into the new matrix V, given that your first column always provides the row subscript:
X = [[1,1,1,1,2,2,2,2]' (1:8)'];
subs = X(:,1);
M = length(unique(subs)); % count unique ids
N = size(X,1)/M; % Problem assumption: M groups of size N (M*N = number of rows of X)
V = zeros(M, N);
i = sub2ind([M, N], subs, repmat(1:N,1,M)');
V(i) = X(:,2);
The above, according to your specs, will work as long as the ids in the first column are the integers 1 to M, every id occurs the same number of times (N), and the rows are ordered by group, so that you can form an MxN matrix.
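For comparison, here is a minimal sketch of a reshape-based route on the same example data. It assumes, as above, that the rows are already ordered by group and that every group has the same size; it is one possible way to do it, not necessarily the one suggested in the comments.

```matlab
X = [[1,1,1,1,2,2,2,2]' (1:8)'];      % same example data as above
M = numel(unique(X(:,1)));            % number of groups
N = size(X,1) / M;                    % elements per group (assumed equal)
% reshape fills column by column, so build an N-by-M matrix first
% (one column per group) and then transpose it to get one row per group.
V = reshape(X(:,2), N, M).';
disp(V)
```

The transpose at the end is the step that reshape alone does not perform.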

Related

How to specifically fprintf an array on a file

Let's say for example we have this array:
x =
0.5920 0.4635
0.6451 0.2118
-0.1206 -0.6036
0.2417 0.4773
0.3029 0.5172
What code would I need to write in order to print in such a way that it looks like this:
coords
x1 0.5920 y1 0.4635
x2 0.6451 y2 0.2118
x3 -0.1206 y3 -0.6036
x4 0.2417 y4 0.4773
x5 0.3029 y5 0.5172
I've tried this:
x = gallery('uniformdata',[1,10],0);
y = gallery('uniformdata',[1,10],1);
[v,c] = voronoin([x(:) y(:)]); %returns an array V with vertices and a cell array C with a matrix for each cell of the diagram.
c
for k = 1 : numel(c)
c{k} = c{k}(c{k} ~= 1)
end
fileID = fopen('cords.txt' , 'w');
for i=1:10
coord = v(c{i},:);
fprintf(fileID,'shape %d:\nx \t y\n', i);
fprintf(fileID,'%.4f %.4f\n', coord(:,1), coord(:,2));
end
fclose(fileID);
But I'm getting an output like this:
shape 10:
x y
0.5920 0.6451 %notice how the .6451 is on the right side when it should be on the bottom
-0.1206 0.2417
0.3029 0.4635
0.2118 -0.6036
0.4773 0.5172
The fprintf function consumes its input arguments in column-major order, one after the other, and sends each value to the next format specifier in the string. So even though you pass two separate vectors, one per %.4f, Matlab does not pair them up. It puts the first value of coord(:, 1) in the first %.4f and the second value of coord(:, 1) in the second %.4f, then breaks the line. It then puts the third value of coord(:, 1) in the first %.4f, and so on. It only takes values from coord(:, 2) once all values of the first vector are exhausted.
The simplest fix is to transpose the coord matrix and pass it to fprintf like this:
fprintf(fileID,'%.4f %.4f\n', coord.'); % .' transposes the matrix
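The difference between the two calls can be checked without a file by using sprintf, which applies the same formatting rules. This is only a small sketch using the first three rows of the example array; the variable names mixed and paired are made up for illustration.

```matlab
coord = [0.5920 0.4635; 0.6451 0.2118; -0.1206 -0.6036];  % first three rows of the example
% Columns passed separately: all of column 1 is consumed before column 2
mixed = sprintf('%.4f %.4f\n', coord(:,1), coord(:,2));
% Transposed matrix: each column of coord.' is one [x y] pair
paired = sprintf('%.4f %.4f\n', coord.');
disp(mixed)
disp(paired)
```

In mixed the second number on the first line is the x value of the second point, while paired keeps each x next to its own y.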
Edit:
To get the format x1 0.5920 y1 0.4635, we can make use of the column-major order in which Matlab reads the elements of a variable:
% First we make a new matrix that has each of the required elements for the desired format
% The index of x, the value of x, the index of y and the value of y
tempCoord = [1:size(coord, 1); coord(:, 1).'; 1:size(coord, 1); coord(:, 2).'];
% Now we change the string specification for fprintf
fprintf(fileID,'x%d %.4f y%d %.4f\n', tempCoord);
Why does this work?
If you look at tempCoord, you will see that each of its columns has the format needed for the string specifier, i.e., the index of x, the value of x, the index of y and the value of y
tempCoord =
1.000000000000000 2.000000000000000 3.000000000000000 4.000000000000000 5.000000000000000
0.592000000000000 0.645100000000000 -0.120600000000000 0.241700000000000 0.302900000000000
1.000000000000000 2.000000000000000 3.000000000000000 4.000000000000000 5.000000000000000
0.463500000000000 0.211800000000000 -0.603600000000000 0.477300000000000 0.517200000000000
Now each column becomes each row of the printed file and you get the following output:
x1 0.5920 y1 0.4635
x2 0.6451 y2 0.2118
x3 -0.1206 y3 -0.6036
x4 0.2417 y4 0.4773
x5 0.3029 y5 0.5172

Draw the vector w as well as the projection of another vector onto w

How can I plot the vector w with the projected data onto this vector?
Here is the code - and my trials to plot the weight vector with y1 and y2.
x1=[1 2;2 3;3 3;4 5;5 5] % the first class 5 observations
x2=[1 0;2 1;3 1;3 2;5 3;6 5]
m1 = mean(x1);
m2 = mean(x2);
m = m1 + m2;
d1=x1-repmat(m1,5,1);
d2=x2-repmat(m2,6,1);
c = 0.5.*m;
Sw1 = d1'*d1;
Sw2 = d2'*d2;
Sw = Sw1 + Sw2;
invSw = inv(Sw);
w= invSw*(m1-m2)' %this is my vector projected
scatter(x1(:,1), x1(:,2), 10, 'ro');
hold on;
scatter(x2(:,1), x2(:,2), 10,'bo');
%this is how i plot the decision boundary, but it doesn't seems correct.
quiver(c(1,1), c(1,2), 1, -w(1,1)/w(2,1));
quiver(c(1,1), c(1,2), -1, w(1,1)/w(2,1));
auxw= w/norm(w);
plot([0 auxw(1)], [0 auxw(2)])
hold off;
figure;
y1 = x1*w;
y2 = x2*w;
hist([y1' y2'])
You are very close. You have only calculated (or tried to calculate) the scalar projection, that is, the scale factor needed to project each vector in x1 and x2 onto w, and even that calculation is incomplete. Recall from linear algebra that the scalar projection of a vector b onto a vector a is:
$$\operatorname{comp}_{\mathbf{a}} \mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert}$$
Source: Oregon State Mathematics: Calculus for Undergraduates
In our case, a would be w and b would be each of the vectors in x1 and x2. I'm assuming each row of these matrices is a vector. The scalar projections are what you store in y1 and y2. What you need is the vector projection, which is the scalar projection multiplied by the unit vector in the direction of a:
$$\operatorname{proj}_{\mathbf{a}} \mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert} \, \frac{\mathbf{a}}{\lVert \mathbf{a} \rVert}$$
Source: Oregon State Mathematics: Calculus for Undergraduates
Therefore, the scalar projections in y1 and y2 are incorrect, because they are computed with w itself rather than with the normalized vector w/norm(w). First compute the scalar projections using the normalized w, then multiply each of these scalars by the normalized w to get the vector projections. Plotting all of these on one graph would be confusing, because the projected vectors all lie along w and overlap. Instead, I loop over the data and plot w, one vector from either x1 or x2, and the corresponding projected vector. Each time we loop, we pause to show the data, then clear the figure and start again.
As such, I've added and changed the following to your code.
%// Your data
w = [-0.7936; 0.8899];
x1 = [1 2; 2 3; 3 3; 4 5; 5 5];
x2 = [1 0; 2 1; 3 1; 3 2; 5 3; 6 5];
%// Compute scalar projection
auxw = w/norm(w);
s1 = x1*auxw;
s2 = x2*auxw; %// Change for correctness
%// Compute the vector projection
y1 = bsxfun(@times, s1, auxw.');
y2 = bsxfun(@times, s2, auxw.');
%// Place the original vectors and corresponding projections
%// in one matrix
y = [y1; y2];
x = [x1; x2];
%// Loop through and plot w, a point in either x1 or x2
%// and the corresponding projection
for ii = 1 : size(y,1)
plot([0 w(1)], [0 w(2)]);
hold on;
plot([0 y(ii,1)], [0 y(ii,2)], 'r');
plot([0 x(ii,1)], [0 x(ii,2)], 'g');
pause(0.5);
clf;
end
The function bsxfun lets us multiply the unit vector by each scalar projection. Specifically, with auxw transposed to a 1 x 2 row vector, bsxfun expands the column vectors s1 and s2 against it, so that row k of y1 is s1(k) times the unit vector, i.e. the vector projection of row k of x1 onto w; y2 is built the same way from s2 and x2.
The loop at the end steps through the data one vector at a time, plotting w, a vector from either x1 or x2 and the corresponding projected vector, and pauses for 0.5 seconds each time so we can see the result. The vector w is in blue, the projected vector is in red and the original vector from either x1 or x2 is in green.
We get this series of figures:
In each figure, the red line is the projection of a vector from either x1 or x2 onto w, and the green line is the original vector.
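As a quick sanity check on the vector projections, the residual x - proj should be perpendicular to w. The sketch below verifies this for x1 with the same w; the names r1, maxDot and yOuter are introduced here for illustration only.

```matlab
w  = [-0.7936; 0.8899];               % weight vector used above
x1 = [1 2; 2 3; 3 3; 4 5; 5 5];       % one vector per row
auxw = w / norm(w);                   % unit vector along w
s1 = x1 * auxw;                       % scalar projections, 5-by-1
y1 = bsxfun(@times, s1, auxw.');      % vector projections, 5-by-2
r1 = x1 - y1;                         % residuals after removing the projection
maxDot = max(abs(r1 * auxw));         % dot products with w's direction, ~0 up to round-off
yOuter = s1 * auxw.';                 % same matrix written as an outer product
disp(maxDot)
```

The outer product s1 * auxw.' gives the same result as the bsxfun call, which may be easier to read when only one unit vector is involved.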

Correlation dependent on samples

I have a variable y that depends on some variables x1 ∈ [x1_min,x1_max], x2 ∈ [x2_min,x2_max], x3 ∈ [x3_min,x3_max] and y can be a matrix as well, i.e. y=y(x1,x2,x3). I want to detect which of x1, x2, x3 is least relevant in determining the value of y.
I am using the following code in Matlab:
x = rand(1000,3); % x1, x2, x3 are the columns of x
y = fct(x); % A generic function of x1, x2, x3
[corr_mat, p_val] = corrcoef(x,y);
[i,j] = find(p_val > 0.5);
disp([i,j])
The problem is that the resulting indices strongly depend on the random samples (even if I increase the number of samples). How can I get a more precise measure?
As a simple alternative example, y=x1+x2+x3, with x1∈[50,80], x2∈[0,1], x3∈[0,1]. Clearly, the value of y depends much more on x1 than the other 2 variables. How do I quantify this dependence?
Thank you in advance.
EDIT: Here is what I mean by "quantification" or "relevance". I want to detect which variables produce only very small changes in y, i.e. in the previous example x2 and x3 make y vary less than x1 does.
You need to use the covariance and not the correlation coefficient. The correlation coefficient is normalized by the standard deviations of the two variables, which gives the same weight to all variables even when they have different ranges, and this is exactly what you want to avoid.
x1 = 50+30*rand(1000,1);
x2 = rand(1000,1);
x3 = rand(1000,1);
y = x1+x2+x3;
c=cov([x1 x2 x3 y]);
c(1:3,4) % Covariances of x[1-3] and y
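To turn these covariances into relative numbers, one option is to divide them by var(y). This is only a sketch under the assumption that y is a plain sum of the inputs, as in the example; in that case the covariances add up to var(y), so the ratios add up to 1. The name share and the larger sample count are choices made here, not part of the original code.

```matlab
n  = 1e4;                      % sample count (assumed; the code above uses 1000)
x1 = 50 + 30*rand(n,1);        % x1 in [50,80]
x2 = rand(n,1);                % x2 in [0,1]
x3 = rand(n,1);                % x3 in [0,1]
y  = x1 + x2 + x3;
c  = cov([x1 x2 x3 y]);        % 4-by-4 covariance matrix
covXY = c(1:3,4);              % cov(x_i, y) for i = 1..3
share = covXY / c(4,4);        % fraction of var(y) associated with each x_i
disp(share.')
```

Here the first entry of share should be close to 1 and the other two close to 0, matching the intuition that x1 dominates y. For a y that is not additive in the inputs, this ratio is no longer a clean decomposition.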

draw graph with n node in matlab

I have a vector with x and y positions.
If n=3, I have an array of length 6. Each value is a coordinate of a position in space:
A= [x1 y1 ,x2 y2 ,x3 y3]
// As example A = [2 3.122 , 1.3 6, 2.1 5.6]
How can I draw a complete graph of these positions?
I appreciate any help.
gplot(A,Coordinates) plots a graph of the nodes defined in Coordinates according to the n-by-n adjacency matrix A, where n is the number of nodes. Coordinates is an n-by-2 matrix, where n is the number of nodes and each coordinate pair represents one node.
For two-dimensional data, Coordinates(i,:) = [x(i) y(i)] denotes node i, and Coordinates(j,:) = [x(j) y(j)] denotes node j. If node i and node j are connected, A(i,j) or A(j,i) is nonzero; otherwise, A(i,j) and A(j,i) are zero.
See doc gplot for more info.
For your example, with the trivial all ones adjacency matrix, you'll get:
A = [2 3.122 , 1.3 6, 2.1 5.6]; % # where A= [x1 y1 ,x2 y2 ,x3 y3]
gplot(ones(3),[A(1:2:end)',A(2:2:end)'],'-*')
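The same idea can be written for any number of nodes. The sketch below reshapes the interleaved vector into an n-by-2 coordinate matrix and uses an adjacency matrix without self-loops; the names coords, adj, xe and ye are introduced here for illustration.

```matlab
A = [2 3.122, 1.3 6, 2.1 5.6];       % [x1 y1, x2 y2, x3 y3] from the question
coords = reshape(A, 2, []).';        % n-by-2, one [x y] row per node
n = size(coords, 1);
adj = ones(n) - eye(n);              % every node connected to every other node
[xe, ye] = gplot(adj, coords);       % NaN-separated edge coordinates, nothing drawn yet
plot(xe, ye, '-*');                  % draw all edges with node markers
```

Calling gplot with two outputs returns the line data instead of plotting, which makes it easy to style the edges with plot afterwards.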
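You could also split A into an X vector and a Y vector like this:
X = A(1:2:end); % x1, x2, x3
Y = A(2:2:end); % y1, y2, y3
and then simply use plot, repeating the first point to close the path:
plot(X([1:end 1]), Y([1:end 1]), '-*');
Note that plot only joins consecutive points, so this gives the complete graph only for n = 3.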

3N linear Equations

Given the following equation:
$$\sum_{j=1}^{N} A_{ij} X_j = b_i, \qquad i = 1, \dots, N$$
It will be 3N linear equations.
Each Aij is a 3x3 matrix, each Xj is a 3x1 vector of unknowns, and each bi is a known 3x1 vector.
How can I combine the 3x3 matrices to build a 3Nx3N matrix?
I'm trying to find a method to work out this question.
If you have created all of your matrices Aij and vectors bi as variables in MATLAB, you can put them all into one large system of equations AX = b by simple concatenation using square brackets and semicolons. For example, when N = 3, you can do the following:
A = [A11 A12 A13; A21 A22 A23; A31 A32 A33]; %# A 9-by-9 matrix
b = [b1; b2; b3]; %# A 9-by-1 vector
Then, once you solve your system of equations (using X = A\b; or some other method), you can break X up into its individual 3-by-1 parts. For the above example of N = 3, you can do the following:
X1 = X(1:3);
X2 = X(4:6);
X3 = X(7:9);
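For a general N, typing out every block by hand gets tedious. A possible sketch, assuming the blocks are kept in cell arrays, uses cell2mat to assemble the system. The names Ablocks, bcells and Xparts are hypothetical, and the block values are made up (diagonally dominant so that the system has a unique solution).

```matlab
N = 4;                                   % number of 3-by-1 unknowns (example value)
Ablocks = cell(N, N);                    % Ablocks{i,j} plays the role of Aij
bcells  = cell(N, 1);                    % bcells{i} plays the role of bi
for i = 1:N
    for j = 1:N
        % Made-up blocks; the large diagonal keeps the full matrix invertible
        Ablocks{i,j} = 0.1*magic(3) + (i == j)*20*eye(3);
    end
    bcells{i} = i*[1; 2; 3];
end
A = cell2mat(Ablocks);                   % 3N-by-3N matrix
b = cell2mat(bcells);                    % 3N-by-1 vector
X = A \ b;                               % solve the combined system
Xparts = reshape(X, 3, N);               % column j holds Xj
```

With this layout, Xparts(:,j) replaces the manual slices X(1:3), X(4:6) and so on.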