Pull out values and associated strings of at least a certain value - matlab

I am trying to write code to pull out of a vector of data, 1x1000, the values that are a fold change of 2 or more. A fold change of 2 is equivalent to -1. I want to pull out the names of my genes (coded in vector C) and the values (coded in vector fcsites). This is what I have come up with so far but one of my issues is that I don't know what to specify as the length of the new vector. Does anyone know a better way to approach this?
atleast = {C,fcsites}
Z = zeros(length(C),1);
for i2=1:length(C)
Z(i2)=C(fcsites<=-1);
end
I get the error:
atleast =
{602x1 cell} [602x1 double]
The following error occurred converting from cell to double:
Error using double
Conversion to double from cell is not possible.

Find the desired elements in fcsites, and use their corresponding indices as a subscript for C:
idx = (fcsites <= -1);
X = C(idx)
or shorter:
X = C(fcsites <= -1)
Now X contains all the names from C that correspond to values in fcsites that are less than or equal to -1.
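As a minimal sketch of pulling out both the names and the values, the same logical index can be applied to each vector. The sample data below is hypothetical and only stands in for the real C and fcsites; the variable names names and values are also just for illustration. No preallocation is needed, because logical indexing returns a result of the right length.

```matlab
% Hypothetical sample data standing in for the real C (cell) and fcsites (double)
C = {'geneA'; 'geneB'; 'geneC'; 'geneD'};
fcsites = [-1.5; 0.3; -1.0; 2.1];

idx = fcsites <= -1;        % logical mask, true where the threshold is met
names = C(idx);             % cell array of the matching gene names
values = fcsites(idx);      % the matching values, same order as names

% Optional: pair names and values in one cell array, one row per gene
atleast = [names, num2cell(values)];
```

The original loop failed because Z was preallocated as a double array while C is a cell array, so the assignment tried to convert cells to doubles.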

Matlab find function throws size difference error when there is no apparent one

My MATLAB code is attempting to find the indices in a 601-by-1 matrix that correspond to given values, but MATLAB says the left and right sides have a different number of elements.
pH_fine = pH(1):0.01:pH(end);
pH_labvals = [7.72,9.87,7.4,7.63,7.06,6.85,8.29,9.37,11.1];
index_labvals = [];
a = find(pH_fine == 8); %This works perfectly
for i = 1:length(pH_labvals)
index_labvals(i) = find(pH_fine == pH_labvals(i)); %This throws an error
end
Your problem is that the find(pH_fine == pH_labvals(i)) on the right side sometimes doesn't find any match, and returns an empty result for an index, specifically a 1-by-0 row vector. This doesn't match the size of the left side, which is indexing a 1-by-1 element from your vector index_labvals.
You need to check first if the result of find is empty, and decide what you will put in the index vector in that case, like a 0 or NaN. You will also need to deal with find giving you a vector of indices if pH_labvals has the same value repeated. If you simply want to remove repeated values, you could use unique like so:
pH_labvals = unique(pH_labvals, 'stable');
If you're wondering why you're getting an empty result from find, you should read through this post about the perils of floating-point comparison. One possible solution, assuming pH_labvals contains non-repeated values with 2 decimal places of precision, is to first round your pH_fine vector to 2 decimal places:
pH_fine = round(pH(1):0.01:pH(end), 2);
This should allow you to avoid the errors from floating-point comparison.
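If you would rather keep the loop, here is a rough sketch of the "check for empty" idea combined with a tolerance instead of exact equality. The pH range, the tolerance tol, and the choice of NaN as the "not found" marker are all assumptions made for the example.

```matlab
pH_fine = 1:0.01:14;                 % assumed range, for illustration only
pH_labvals = [7.72, 9.87, 7.4];
tol = 1e-6;                          % assumed tolerance, well below the 0.01 step

index_labvals = nan(size(pH_labvals));   % NaN marks "no match found"
for i = 1:numel(pH_labvals)
    % Ask find for at most one match so the result is never a vector
    k = find(abs(pH_fine - pH_labvals(i)) < tol, 1);
    if ~isempty(k)
        index_labvals(i) = k;
    end
end
```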
An alternative approach is to use interp1 for table lookup:
pH = [1,14]; % Not sure what values you use here, it doesn't matter for the example.
pH_fine = pH(1):0.01:pH(end);
pH_labvals = [7.72,9.87,7.4,7.63,7.06,6.85,8.29,9.37,11.1];
index_labvals = interp1(pH_fine,1:numel(pH_fine),pH_labvals,'nearest')
Here, we're finding the nearest index within pH_fine that matches each of the values in pH_labvals. 1:numel(pH_fine) are the indices into pH_fine.
Note that there's no need for a loop, as interp1 will lookup all pH_labvals at once.
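One thing to watch for with this approach, sketched below under an assumed pH range: with the 'nearest' method and no extrapolation argument, interp1 returns NaN for query values outside the range of pH_fine, so it is worth checking for NaN before using the result as an index. The out-of-range value 20 is made up for the example.

```matlab
pH_fine = 1:0.01:14;                      % assumed range, for illustration only
pH_labvals = [7.72, 9.87, 20];            % 20 is deliberately out of range

index_labvals = interp1(pH_fine, 1:numel(pH_fine), pH_labvals, 'nearest');

valid = ~isnan(index_labvals);            % false where no lookup was possible
matched = pH_fine(index_labvals(valid));  % grid values for the valid lookups
```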

Finding Location of matrices within a structure in matlab

I am importing an RGB image U of the stars and doing the following:
im=rgb2gray(U);
img=(im>200);
BW=im2bw(img,0);
L=bwlabeln(BW,18);
b=regionprops(L,'PixelList');
The goal of this program is to find the largest and most prominent stars in this picture of hundreds of stars. b is a 2566x1 struct array that contains all the points with a value greater than 200. If a certain connected region within the image contains multiple values over 200, b will store a coordinate matrix of these points. Otherwise, it will only store a single coordinate pair.
I need a way to find all the rows within b that contain matrices. If possible, I would also like a way to find all the rows within b whose matrices contain 30 or more points.
You can use the arrayfun function to apply a function to each element in an array. Note that this is just a shorter way of writing a loop.
In this case you'd need to apply the function size(b(i).PixelList, 1) >= 30 to each element i of the struct array b:
m = arrayfun(@(x) size(x.PixelList, 1) >= 30, b)
This is identical to:
m = false(size(b));
for i=1:numel(b)
m(i) = size(b(i).PixelList, 1) >= 30;
end
end
The result m is a logical array, so you can use it to index as b(m). You can also get the indices using find(m).
If you also include 'Area' in the properties calculated by regionprops, you'll already have the number of pixels in each component:
b=regionprops(L,'PixelList','Area');
idx = [b.Area] >= 30;
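To try the idea without the Image Processing Toolbox, here is a small sketch that builds a hypothetical struct array by hand in place of the regionprops output. The three PixelList matrices are made up; only their row counts matter.

```matlab
% Hypothetical stand-in for the output of regionprops(L,'PixelList')
b = struct('PixelList', {ones(1,2); ones(40,2); ones(30,2)});

numPts = arrayfun(@(s) size(s.PixelList, 1), b);  % points per region
m = numPts >= 30;                                 % regions with 30 or more points

bigRegions = b(m);     % struct array of the large regions
bigIdx = find(m);      % their positions within b
```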

Find column-count inbetween integers row-wise in matrix (matlab)

I have a stupid problem that I can't find the answer to ^^.
I have a 100x10000 double matrix containing integers from 1 to 4 and want to find, row-wise, the column count between every occurrence of a given integer.
My first idea was to use:
storage_ones = cell(100,1);
for n = 1:100;
[row col] = find(matrix(n,:)==1);
storage_ones{n,1} = col;
end
And then subtract them in another loop. But with find I get the following result:
Empty matrix: 1-by-0
Does anybody have an idea how I can solve this problem?
Thanks in advance!!
Your issue is potentially due to one of two things:
1. Since you're using a double datatype, it is possible that you're encountering floating-point errors where the values aren't exactly 1. If this is the case, consider not checking for exact equality and instead check whether the value is very close to 1 using a small epsilon (here I used 1e-12).
[row, col] = find(abs(matrix(n,:) - 1) < 1e-12);
If your data really are integers, consider using uint8 to store them rather than double, and then you can perform exact comparisons.
matrix = uint8(matrix);
% Then for your comparison
find(matrix(n,:) == 1)
2. You may just not have any 1's in that row. If find can't find any matches, it returns an empty array.
find([1 2 3] == 4)
% Empty matrix: 1-by-0
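For the original goal of counting columns between occurrences, a minimal sketch is to apply diff to the column indices returned by find, which also copes with rows that have no matches. The small matrix and the name gaps are hypothetical.

```matlab
matrix = [1 2 1 4 1; 3 3 3 3 3];      % hypothetical data; row 2 has no 1's

gaps = cell(size(matrix, 1), 1);
for n = 1:size(matrix, 1)
    col = find(matrix(n,:) == 1);     % column positions of the 1's in row n
    gaps{n} = diff(col);              % distances between consecutive 1's
end
% gaps{n} is empty when row n has fewer than two 1's
```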

Maximum of a subset of array (MATLAB)

Suppose in MATLAB I have a real matrix A which is n x m and a binary matrix B of the same size. The latter matrix defines the optimization set (all indices for which the element of B equals one): over this set I would like to find the maximal element of A. How can I do this?
The first idea I had is that I consider C = A.*B and look for the maximal element of C. This works fine for all matrices A which have at least one positive element, however it does not work for matrices with all negative elements.
You can do
C = A(B==1);
to give you an array of just the values of A corresponding to a value of 1 in B. And
max( C )
will give you the maximum value of A where B is 1.
With this method you don't run into a problem when all values of A are negative as the zeros don't appear in C.
Obviously you can condense this to
desiredValue = max(A(B(:)==1));
The colon operator is not strictly required here: for an n x m matrix A, logical indexing with a mask of the same size returns a column vector whether or not the mask is flattened with B(:), so max sees a single vector either way.
Update: to get the index of the maximum value, you can do:
f = find(B==1);
[m mi] = max(A(f));
maxIndex = f(mi);
And to convert that linear index back to row and column subscripts:
[i j] = ind2sub(size(A), maxIndex);

Using find on non-integer MATLAB array values

I've got a huge array of values, all of which are much smaller than 1, so using a round up/down function is useless. Is there any way I can use the 'find' function on these non-integer values?
e.g.
ind=find(x,9.5201e-007)
FWIW all the values are in ascending sequential order in the array.
Much appreciated!
The syntax you're using isn't correct.
find(X,k)
returns the indices of the first k non-zero elements of X, which is why k must be an integer. You want
find(x==9.5021e-007);
%# x==9.5021e-007 is a logical index: ones where the condition is true, zeros elsewhere.
%# The single-argument form of find returns the indices of all non-zero elements,
%# which are the locations of your value of interest.
Note that this needs to be an exact representation of the floating point number, otherwise it will fail. If you need tolerance, try the following example:
tol = 1e-9; %# or some other value
val = 9.5021e-007;
find(abs(x-val)<tol);
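Because the values here are all much smaller than 1, a fixed tolerance can easily be too loose. One possible variation, shown as a sketch on made-up data, is to scale the tolerance to the value being searched for; the factor 1e-6 is an arbitrary assumption.

```matlab
x = (1:10) * 1e-7;            % hypothetical ascending data, all much smaller than 1
val = 5e-7;                   % value to look for

tol = 1e-6 * abs(val);        % relative tolerance (assumed factor), here 5e-13
ind = find(abs(x - val) < tol);
```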
When I want to find real numbers within some range of tolerance, I usually round them all to that level of tolerance and then do my finding, sorting, whatever.
If x is my real numbers, I do something like
xr = 0.01 * round(x/0.01);
then xr are all multiples of .01, i.e., rounded to the nearest .01. I can then do
t = find(xr==9.22)
and then x(t) will be every value of x between 9.215 and 9.225.
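A variation on the same rounding idea, sketched here with made-up data, is to compare the integer multiples rather than the rescaled values. This avoids an exact equality test against a decimal such as 9.22, which may not be exactly representable in floating point. The name k is hypothetical.

```matlab
x = [9.214 9.216 9.22 9.2249 9.226];   % hypothetical data

k = round(x / 0.01);     % number of 0.01 steps, stored as whole numbers
t = find(k == 922);      % 922 steps of 0.01 corresponds to 9.22
near = x(t);             % the values of x that round to 9.22
```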
It sounds from your comments like what you want is
[b,m,n] = unique(x,'first');
then b will be a sorted version of the elements in x with no repeats, and
x = b(n);
So if there are 4 '1's in n, it means the value b(1) shows up in x 4 times, and its locations in x are at find(n==1).
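A short illustration of those outputs on made-up data; the accumarray call for counting repeats is an addition for the example, not part of the answer above.

```matlab
x = [3 1 3 2 1 3];                 % hypothetical data with repeats

[b, m, n] = unique(x, 'first');    % b: sorted unique values, n: map from x into b

rebuilt = b(n);                    % reproduces x from b and n
locsOfThree = find(n(:) == 3);     % positions in x holding the value b(3)
counts = accumarray(n(:), 1);      % how many times each b(k) occurs in x
```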