Subselecting matrix and use logical selection (matlab) - matlab

I have a line of code in MATLAB in which I select a subset of a matrix:
A(3:5,1:3);
Now I want to adapt this line to select only the rows for which all three values are larger than zero:
(A(3:5,1:3) > 0);
But apparently I am not doing this right. How do I select part of the matrix and also make sure that only the rows for which all three values are larger than zero are selected?
EDIT: To clarify: let's say that I have a matrix of coordinates called A that looks like this:
Matrix A [5,3]
3 4 0
0 1 0
0 3 1
0 0 0
4 8 7
Now I want to select only part [3:5,1:3], and of that part I only want to select rows 3 and 5. How do I do that?

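The expressions:
B = A(3:5,1:3);
B(sum(B,2)~=0,:)
will return only the rows of A(3:5,1:3) which have a row-sum not equal to zero. Note that the row test has to be applied to the submatrix B: find(sum(A(3:5,:),2)~=0) returns positions relative to rows 3:5, so using it to index A directly would pick rows 1 and 3 of A instead of rows 3 and 5.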
If you had posted syntactically correct Matlab it would have been easier for me to cut and paste your test data into my Matlab session.
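For the example matrix in the question, the selection can be sketched as below. This is only a sketch: the names B, keepAllPositive, keepNonZero and rowsInA are made up for illustration, and it assumes that "rows 3 and 5" means the rows of the selected part that are not entirely zero. The stricter "all three values larger than zero" test is shown next to it for comparison.

```matlab
A = [3 4 0; 0 1 0; 0 3 1; 0 0 0; 4 8 7];   % example matrix from the question
B = A(3:5, 1:3);                           % the part of interest

keepAllPositive = all(B > 0, 2);           % true where every value in the row is > 0
keepNonZero     = any(B ~= 0, 2);          % true where the row is not entirely zero

rowsAllPositive = B(keepAllPositive, :);   % only [4 8 7]
rowsNonZero     = B(keepNonZero, :);       % [0 3 1; 4 8 7]

% Row numbers in the original A: B starts at row 3, so shift by 2.
rowsInA = 2 + find(keepNonZero);           % [3; 5]
```

Indexing B with the logical vector directly avoids the need for find unless the row numbers themselves are wanted.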

I'm modelling this answer off of A(find( A > 0 ))
distances = pdist(find( pdist(medoidContainer(i,1:3)) > 0 ));
This will give you a vector of values in the distances variable. The reason pdist(medoidContainer(i,1:3) > 0) does not work is that it first finds the elements specified by i,1:3 in medoidContainer, and then finds the elements of medoidContainer(i,1:3) that are greater than 0. However, since medoidContainer(i,1:3) and the output of pdist likely have different dimensions, the comparison does not give the right indices.

Related

How to highlight or identify a column with a specific value in matrix?

I would like to highlight or identify a column in a matrix with a specific value in MATLAB. Suppose I have a matrix A = [1 0 1 1 0 1; 1 1 0 0 0 1; 1 0 1 1 0 1; 0 1 1 0 0 1].
The result of the above matrix is a 5th column, as it contains all zeroes. I am also wondering if I could highlight the resulting column for identification. Please help me. I have a very large matrix to work on applying this principle.
How about combining find and all to get the column index of the all-zero column like this?
A = [1 0 1 1 0 1; 1 1 0 0 0 1; 1 0 1 1 0 1; 0 1 1 0 0 1];
ind = find(all(A==0,1))
ind =
5
The second input argument to all specifies that it works along the first dimension, i.e. down the rows, so you get one result per column. It's not really necessary here, since that is the default for a matrix, but I find that it's good practice because you're always sure it's the right dimension. This is especially important if there are scenarios where you might get a 1xn vector instead of mxn.
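To see why the explicit dimension matters, here is a small sketch comparing the matrix case with a 1xn vector. The vector v and the variable names are made up for illustration.

```matlab
A = [1 0 1 1 0 1; 1 1 0 0 0 1; 1 0 1 1 0 1; 0 1 1 0 0 1];
indMatrix = find(all(A == 0, 1));   % 5: the only all-zero column

v = [0 0 1 0];                      % a single row, as might come out of a larger pipeline
withoutDim = all(v == 0);           % scalar false: all() collapses the whole row vector
withDim    = all(v == 0, 1);        % [1 1 0 1]: still one result per column
indVector  = find(withDim);         % [1 2 4]
```

Without the second argument, the 1xn case silently changes meaning from "which columns are all zero" to "is the whole vector zero".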
Create a colored matrix:
This is a hack, and I don't necessarily recommend it, but if you really want to do this in MATLAB, this is an alternative. Also, I think you might learn quite a lot about MATLAB when doing this, so it might be worth the time.
You can create a colored plot with all values 1 except those in column 5 that will be 0 (or the other way around, doesn't matter) using imagesc. This will give a plot with only two colors, one for those values that are 1, and one for those that are 0. You can select which colors you want with colormap. Then you create a mesh to determine the location of all the values you want to show, convert the matrix to strings using num2str, and combine it all. You need to experiment some to get the correct locations, as you probably want less padding between the rows than the columns. You can use this answer as a guide. In the end, remove the axes. It should be fairly simple to adapt if you read and try to understand each line of the referenced answer.
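A rough sketch of that recipe is below. It is not the referenced answer's code: the colours, the centred alignment and the variable names are my own assumptions, and the spacing will probably still need tuning for a large matrix.

```matlab
A = [1 0 1 1 0 1; 1 1 0 0 0 1; 1 0 1 1 0 1; 0 1 1 0 0 1];
zeroCols = all(A == 0, 1);                  % true for the all-zero column(s)
mask = repmat(zeroCols, size(A, 1), 1);     % 1 in cells to highlight, 0 elsewhere

figure;
imagesc(double(mask));                      % two distinct values give a two-colour image
colormap([1 1 1; 1 0.6 0.6]);               % white for 0, light red for 1 (arbitrary choice)

% One text label per matrix element, placed at the centre of its cell.
[X, Y] = meshgrid(1:size(A, 2), 1:size(A, 1));
labels = cellstr(num2str(A(:)));            % column-major order, same as X(:) and Y(:)
text(X(:), Y(:), labels, 'HorizontalAlignment', 'center');
axis off;                                   % remove the axes at the end
```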
The simple approach:
"I have a very large matrix...". Such matrices are often not a good idea to include in a report. However, if you really want to, I actually suggest you copy paste it from the variable explorer and into MS Excel (or use xlswrite if you're doing this more than once). Since you know which column you want to color, it should be fairly simple to click the "color button".
The following displays the matrix in the command window with the matching columns in boldface. There may be several matching columns, and arbitrary column values can be matched.
A = [1 0 1 0 0 1; 1 1 0 1 0 1; 1 0 1 0 0 1; 0 1 1 1 0 1]; %// matrix
c = [0;1;0;1]; %// column to be matched
nn = find(all(bsxfun(@eq, A, c),1)); %// indices of matching columns
s = cellstr(num2str(A)); %// cell array of strings, one for each row; all same length
for n = nn %// for each matching column, with index n
s = regexprep(s, '\S+', '<strong>$0</strong>', n); %// make bold n-th value of each cell
end
s = vertcat(s{:}); %// convert back into a char array; all strings have the same length
disp(s); %// display
In this example, columns 2 and 4 match c and are displayed in boldface.
Highlighting with red (stderr)
Just for proof of concept, you could highlight some of your data in the command window, although I wouldn't suggest actually doing this. Consider the following code:
A=randi(10,8);
%ind = find(all(A==0,1),1) %for actual data
ind = 5; %manual choice for demonstration
for k=1:size(A,1)
fprintf('%5d ',A(k,1:ind-1));
fprintf(2,'%5d ',A(k,ind));
fprintf('%5d ',A(k,ind+1:end));
fprintf('\n');
end
First we create a dummy matrix for demonstration purposes, and select column ind to highlight. Then we go along from line to line in A, we use fprintf(...) to write the non-highlighted values with a given format, then use fprintf(2,...) to write to stderr in red, then write the rest of the line, then newline. Note that for some reason fprintf(2,...) will not highlight the final character, I guess because usually this is \n and nobody noticed that highlighting is missing there.
Also, you can play around with the formats inside fprintf to suit your needs. If you need to print floating points, something like '%10.8f' might work. Or '%g'. The main point is to have a fixed width+precision for your print in order to get pretty columns.
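As a small illustration of the fixed width and precision point, here is a sketch for a non-integer matrix. The matrix values and the choice of '%10.4f' are arbitrary.

```matlab
A = [1.5 22.25; 333.125 4];          % made-up floating point data

for k = 1:size(A, 1)
    fprintf('%10.4f ', A(k, :));     % every field is 10 characters wide, 4 decimals
    fprintf('\n');
end

% sprintf uses the same format rules, which makes the layout easy to inspect.
firstRow = sprintf('%10.4f ', A(1, :));
```

Because each field has the same width, the columns line up regardless of how many digits each number has before the decimal point.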
For the sake of completeness, you can make it even a bit more messy to treat multiple highlightable columns:
A=randi(10,8);
%ind = find(all(A==0,1)) %for actual data
ind=[5 2];
fprintf('A = \n\n');
for k1=1:size(A,1)
for k2=1:size(A,2)
if ismember(k2,ind)
fprintf(2,'%5d ',A(k1,k2));
else
fprintf('%5d ',A(k1,k2));
end
end
fprintf('\n');
end
fprintf('\n');
I also added some extra printouts to make it prettier.
Highlighting with blue (links)
As an afterthought, after some discussion with Luis Mendo, I decided that it's worth overdoing a bit while we're at it. You can turn your numbers into blue-and-underlined hyperlinks, making use of the built-in parsing of the link HTML tag implemented both in disp and in fprintf. Here's the corresponding code:
A=randi(10,8);
ind=[5 2];
fieldlen=5; %width of output fields, i.e. 5 in '%5d'
fprintf('A = \n\n');
for k1=1:size(A,1)
for k2=1:size(A,2)
if ismember(k2,ind)
fprintf([repmat(' ',1,fieldlen-length(num2str(A(k1,k2)))) '<a href="matlab:">%d</a> '],A(k1,k2));
else
fprintf('%5d ',A(k1,k2));
end
end
fprintf('\n');
end
fprintf('\n');
This will turn the elements of the highlighted column(s) into strings of the form '<a href="matlab:">3</a>' for an example value of 3.
Another trick here is that hyperlinks starting with matlab: are parsed as proper matlab commands, which are activated when you click the link. You can try it by typing disp('link') in your command window. By setting ... we make sure that nothing happens when someone clicks on the now-link-valued highlighted numbers.
And on a technical note: we only want to include the actual number in the links (and not the preceding spaces), so we have to manually check the length of the string we are about to print (using length(num2str(A(k1,k2)))) and manually include the rest of the spaces before the number. This is done via the parameter fieldlen which I set at the beginning: this specifies the total width of each printing field, i.e. if we originally had fprintf('%5d',...) then we need to set fieldlen=5; for the same effect.

Graphing different sets of data on same graph within a ‘for’ loop MATLAB

I just have a problem with graphing different plots on the same graph within a ‘for’ loop. I hope someone can point me in the right direction.
I have a 2-D array, with discrete chunks of data in and amongst zeros. My data is the following:
A=
0 0
0 0
0 0
3 9
4 10
5 11
6 12
0 0
0 0
0 0
0 0
7 9.7
8 9.8
9 9.9
0 0
0 0
A chunk of data is defined as contiguous set of data, without interruptions of a [0 0] row. So in this example, the 1st chunk of data would be
3 9
4 10
5 11
6 12
And 2nd chunk is
7 9.7
8 9.8
9 9.9
The first column is x and the second column is y. I would like to plot y as a function of x (x on the horizontal axis, y on the vertical axis). I want to plot these data sets on the same graph as a scatter graph, and put a line of best fit through the points of each chunk of data. In this case, I will have 2 sets of points and 2 lines of best fit (because I have 2 chunks of data). I would also like to calculate the R-squared value.
The code that I have so far is shown below:
fh1 = figure;
hold all;
ah1 = gca;
% plot graphs:
for d = 1:max_number_zeros+num_rows
if sequence_holder(d,1)==0
continue;
end
c = d;
while sequence_holder(c,1)~=0
plot(ah1,sequence_holder(c,1),sequence_holder(c,num_cols),'*');
%lsline;
c =c+1;
continue;
end
end
Sequence holder is the array with the data in it. I can only plot the first set of data, with no line of best fit. I tried lsline, but that didn't work.
Can anyone tell me how to
- plot both sets of graphs
- draw a line of best fit and get the regression coefficient?
The first part could be done in a number of ways. I would test the second column for zeroness
zerodata = A(:,2) == 0;
which will give you a logical array of ones and zeros like [1 1 1 0 1 0 0 ...]. Then you can use this to split up your input. You could look at the diff of that array and test it for positive or negative sign. Your data starts on 0 so you won't get a transition for that one, so you'd need to think of some way to deal with that or the opposite case, unless you know for certain that it will always be one way or the other. You could just test the first element, or you could insert a known value at the start of your input array.
You will then have to store your chunks. As they may be of variable length and variable number you wouldn't put them into a big matrix, but you still want to be able to use a loop. I would use either a cell array, where each cell in a row contains the x or y data for a chunk, or a struct array where say structarray(1).x and structarray(1).y hold your data values.
Then you can iterate through your struct array and call plot on each chunk separately.
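The splitting step described above could look roughly like this. It is a sketch under a few assumptions: a row counts as data when its second column is nonzero, the input is padded with a zero at both ends so that a chunk at the very start or end still produces a transition, and the names isData, edges, starts, stops and chunks are made up.

```matlab
A = [0 0; 0 0; 0 0; 3 9; 4 10; 5 11; 6 12; 0 0; 0 0; 0 0; 0 0; ...
     7 9.7; 8 9.8; 9 9.9; 0 0; 0 0];        % data from the question

isData = A(:, 2) ~= 0;                      % true for rows that belong to a chunk
edges  = diff([0; double(isData); 0]);      % +1 at a chunk start, -1 just after a chunk end
starts = find(edges == 1);                  % first row of each chunk
stops  = find(edges == -1) - 1;             % last row of each chunk

chunks = struct('x', {}, 'y', {});          % empty struct array with fields x and y
for k = 1:numel(starts)
    rows = starts(k):stops(k);
    chunks(k).x = A(rows, 1);
    chunks(k).y = A(rows, 2);
end
```

With the chunks stored this way, plotting is a loop over chunks with hold on, calling for example plot(chunks(k).x, chunks(k).y, '*') for each one.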
As for fitting you can use the fit command. It's complex and has lots of options so you should check out the help first (type doc fit inside the console to get the inline help, which is the same as the website help in content). The short version is that you can do a simple linear fit like this
[fitobject, gof] = fit(x, y, 'poly1');
where 'poly1' specifies you want a first order polynomial (i.e. straight line) and the output arguments give you a fit object, which you can do various things with like plot or interpolate, and the second gives you a struct containing among other things the r^2 and adjusted r^2. The fitobject also contains your fit coefficients.
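If the Curve Fitting Toolbox is not available, a straight-line fit and its R-squared can also be computed with polyfit and polyval from base MATLAB. Note that this swaps in a different technique from the fit call above. The data is the first chunk from the question and the variable names are illustrative.

```matlab
x = [3; 4; 5; 6];                     % first chunk from the question
y = [9; 10; 11; 12];

p    = polyfit(x, y, 1);              % p(1) is the slope, p(2) the intercept
yFit = polyval(p, x);                 % fitted values at the data points

ssRes = sum((y - yFit).^2);           % residual sum of squares
ssTot = sum((y - mean(y)).^2);        % total sum of squares
rSquared = 1 - ssRes / ssTot;         % coefficient of determination
```

The fitted line can be drawn on the same axes with plot(x, yFit, '-') after plotting the points.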

What does the command A(~A) really do in matlab

I was looking to find the most efficient way to find the non zero minimum of a matrix and found this on a forum :
Let the data be a matrix A.
A(~A) = nan;
minNonZero = min(A);
This is very short and efficient (at least in number of code lines) but I don't understand what happens when we do this. I can't find any documentation about this since it's not an operation on matrices like +,-,\,... would be.
Could anyone explain this to me, or give me a link or something that could help me understand what is done?
Thank you !
It uses logical indexing
~ in Matlab is the not operator. When used on a double array, it finds all elements equal to zero. e.g.:
~[0 3 4 0]
Results in the logical matrix
[1 0 0 1]
i.e. it's a quick way to find all the zero elements
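So if A = [0 3 4 0] then ~A is the logical array [1 0 0 1], and A(~A) indexes A with that logical mask. Logical indexing only affects the elements where the mask is true, so in this case elements 1 and 4. (The mask has to be of type logical; the numeric array [1 0 0 1] would be treated as a list of positions and fail on the 0.)
Finally A(~A) = NaN will replace all the elements in A that were equal to 0 with NaN, which min ignores, and thus you find the smallest non-zero element. For a matrix, min(A) works column by column, so use min(A(:)) if you want the overall minimum.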
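The steps can be tried out on a small example. This is just a sketch: the matrix M is made up, and a copy B is used so that the original vector stays intact.

```matlab
A = [0 3 4 0];
mask = ~A;                 % logical [1 0 0 1]: true where A is zero

B = A;                     % work on a copy so A itself is not changed
B(mask) = NaN;             % B is now [NaN 3 4 NaN]
minNonZero = min(B);       % 3, because min skips NaN values

% For a matrix, min works column by column.
M = [0 7; 2 0];
M(~M) = NaN;               % M is now [NaN 7; 2 NaN]
perColumn = min(M);        % [2 7]
overall   = min(M(:));     % 2
```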
The code you provided:
A(~A) = NaN;
minNonZero = min(A);
Does the following:
Create a logical index
Apply the logical index on A
Change A, by assigning NaN values
Get the minimum of all values, while not including NaN values
Note that this leaves you with a changed A, which may be undesirable. But more importantly this has some inefficiencies, as you spend time changing A and possibly also because you take the minimum of a large matrix.
Therefore you could speed things up (and even reduce one line) by doing:
minNonZero = min(A(logical(A)))
Basically you have now skipped step 3 and possibly reduced step 4.
Furthermore, you seem to get an additional small speedup by doing:
minNonZero = min(A(A~=0))
I don't have any good reason for this, but it seems like step 1 is now done more efficiently.
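A quick check of both one-liners on a made-up matrix, including the edge case where there are no non-zero elements at all (variable names are illustrative):

```matlab
A = [0 7 3; 2 0 9];

viaLogical = min(A(logical(A)));   % 2: A(mask) is a vector, so min returns one scalar
viaCompare = min(A(A ~= 0));       % 2: same result, A is left unchanged

% Edge case: with no non-zero elements the selection is empty, and so is the minimum.
Z = zeros(2, 2);
noNonZero = min(Z(Z ~= 0));        % empty result
hasMinimum = ~isempty(noNonZero);  % false
```

Since the masked selection is always a vector, both versions return the overall minimum even when A is a matrix.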

Creating a path on coordinate system without repeating values (Matlab)

I'm trying to create a random "path" on a coordinate system on Matlab. I am doing this by creating a for loop where for each iteration it fills in a new value on a matrix that has initial values of zeros.
For example, I have 5 points so I have an initial matrix a=[0 0 0 0 0; 0 0 0 0 0] (row1 = x values, row2 = y values).
The path can move right/left or up/down (no diagonals). In my for loop, I call randi(4) and say something like "if randi(4)=1, then move 1 point to the left (x-1). if randi(4)=2, then move to the right (x+1), etc."
The problem is that you cannot visit a specific point more than once. For example, the path can start at (0,0), then go to (0,1), then (1,1), then (1,0), and then it CANNOT go back to (0,0).. in my current code I don't have this restriction so I was hoping I could get some suggestions..
Since in this example the matrix would look something like a=[0 0 1 1 0; 0 1 1 0 0].
I was thinking of maybe subtracting each new coordinate (here (0,0)) from each column on the matrix a and if any of the columns give me values of zero for both rows (since it's the same coordinate subtracted from itself), then go back one step and let randi(4) run again.. but
How could I tell it to "go back one step" (or two or three)?
How do you compare one column against each column of the already established matrix?
This was just an idea.. are there any functions in Matlab that would let me do this? or maybe compare if two columns are the same within a matrix?
To your questions.
to go back - I suppose this means just throwing away the rightmost columns in your matrix.
to find if it is present you could use ismember
unfortunately it only takes rows so you will need to transpose. Snippet:
a = [1:10; repmat(1:2,1,5)]'
test = ismember(a,[3,2],'rows')
any(test) % not found
test = ismember(a,[3,1],'rows')
any(test) % found
Of course your idea would also work.
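Putting the ismember check into the loop could look like the sketch below. It takes a slightly different route from "go back one step": instead of retrying after a collision, it only picks among the neighbours that have not been visited yet, and stops early if there are none. The names pts, moves and candidates are made up, and points are stored as rows, so the result is transposed at the end to match the 2-by-n layout from the question.

```matlab
nPoints = 5;
moves = [-1 0; 1 0; 0 1; 0 -1];            % left, right, up, down
pts = zeros(nPoints, 2);                   % one point per row; the path starts at (0,0)

k = 1;
while k < nPoints
    candidates = bsxfun(@plus, pts(k, :), moves);            % the four neighbours
    isFree = ~ismember(candidates, pts(1:k, :), 'rows');     % not visited so far
    if ~any(isFree)
        break;                             % dead end: no unvisited neighbour left
    end
    options = find(isFree);
    pick = options(randi(numel(options))); % random choice among the free neighbours
    pts(k + 1, :) = candidates(pick, :);
    k = k + 1;
end

pts = pts(1:k, :);                         % drop unused rows if the walk stopped early
a = pts';                                  % row 1 = x values, row 2 = y values
```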
I can answer this:
How do you compare one column against each column of the already
established matrix?
Use two different matrices. Compare them using the setdiff() function: http://www.mathworks.com/help/matlab/ref/setdiff.html
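As a sketch of how setdiff could be used here: with the 'rows' option it returns the rows of the first matrix that do not appear in the second. The visited points are the ones from the question, and the candidate list is made up as the four neighbours of (1,0).

```matlab
visited    = [0 0; 0 1; 1 1; 1 0];         % path so far, one point per row
candidates = [0 0; 2 0; 1 -1; 1 1];        % the four neighbours of the current point (1,0)

% Rows of candidates that are not in visited; the result comes back sorted by row.
freeMoves = setdiff(candidates, visited, 'rows');   % [1 -1; 2 0]
```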

Find groups with high cross correlation matrix in Matlab

Given a lower triangular matrix (100x100) containing cross-correlation
values, where entry 'ij' is the correlation value between signal 'i'
and 'j' and so a high value means that these two signals belong to
the same class of objects, and knowing there are at most four distinct
classes in the data set, does someone know of a fast and effective way
to classify the data and assign all the signals to the 4 different
classes, rather than search and cross check all the entries against
each other? The following 7x7 matrix may help illustrate
the point:
1 0 0 0 0 0 0
.2 1 0 0 0 0 0
.8 .15 1 0 0 0 0
.9 .17 .8 1 0 0 0
.23 .8 .15 .14 1 0 0
.7 .13 .77 .83 .11 1 0
.1 .21 .19 .11 .17 .16 1
there are three classes in this example:
class 1: rows <1 3 4 6>,
class 2: rows <2 5>,
class 3: rows <7>
This is a good problem for hierarchical clustering. Using complete linkage clustering you will get compact clusters; all you have to do is determine the cutoff distance at which two clusters should be considered different.
First, you need to convert the correlation matrix to a dissimilarity matrix. Since correlation is between 0 and 1, 1-correlation will work well - high correlations get a score close to 0, and low correlations get a score close to 1. Assume that the correlations are stored in an array corrMat
%# remove diagonal elements
corrMat = corrMat - eye(size(corrMat));
%# and convert to a vector (as pdist)
dissimilarity = 1 - corrMat(find(corrMat))';
%# decide on a cutoff
%# remember that 0.4 corresponds to corr of 0.6!
cutoff = 0.5;
%# perform complete linkage clustering
Z = linkage(dissimilarity,'complete');
%# group the data into clusters
%# (cutoff is at a correlation of 0.5)
groups = cluster(Z,'cutoff',cutoff,'criterion','distance')
groups =
2
3
2
2
3
2
1
To confirm that everything is great, you can visualize the dendrogram
dendrogram(Z,0,'colorthreshold',cutoff)
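One thing to watch in the conversion step: find(corrMat) skips any correlation that happens to be exactly zero, which would make the vector too short. A variant that selects the below-diagonal entries by position avoids that. This is a sketch using the 7x7 example from the question; the name below is made up, and the linkage call is left as a comment because it needs the Statistics Toolbox.

```matlab
corrMat = [ 1    0    0    0    0    0   0
           .2    1    0    0    0    0   0
           .8   .15   1    0    0    0   0
           .9   .17  .8    1    0    0   0
           .23  .8   .15  .14   1    0   0
           .7   .13  .77  .83  .11   1   0
           .1   .21  .19  .11  .17  .16  1];

n = size(corrMat, 1);
below = tril(true(n), -1);                 % logical mask of the entries below the diagonal
dissimilarity = 1 - corrMat(below)';       % row vector in the same order pdist uses

% Z = linkage(dissimilarity, 'complete');  % as in the answer above
```

The vector always has n*(n-1)/2 entries, here 21, regardless of the values in the matrix.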
You can use the following method instead of creating the dissimilarity matrix.
Z = linkage(corrMat,'complete','correlation')
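Note that with this syntax linkage treats each row of corrMat as an observation and uses the correlation distance (one minus the correlation) between rows, rather than reading the entries as precomputed distances. You can then plot the dendrogram as follows: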
dendrogram(Z);
One way to verify if your dendrogram is right or not is by checking its maximum height which should correspond to 1-min(corrMat). If the minimum value in corrMat is 0 then the maximum height of your tree should be 1. If the minimum value is -1 (negative correlation), the height should be 2.
Since it is given that there are going to be 4 groups, I'd start with a pretty simplistic two stage approach.
In the first stage you find the maximum correlation among any two elements, place those two elements in a group, then zero out their correlation in the matrix. Repeat, finding the next highest correlation among two elements and either adding those to an existing group or creating a new one until you have the correct number of groups.
Finally, check which elements aren't in a group, go to their column, and identify the highest correlation they have with any other group. If that element is in a group already, place them in that group as well, otherwise skip to the next element and come back to them later.
If there is interest or anything isn't clear I can add code later. Like I said, the approach is simplistic but if you don't need to verify the number of groups I think it should be effective.