How to replace values in specific lines and on determined conditions - matlab

I'm in trouble: I have a dataset with 128597 lines and 10 columns.
I need to change the values in column 10 from line 10276 to line 128597.
The change has to respect some conditions, like:
If the value is between 11 and 33, the value will become 1
If the value is between 34 and 56, the value will become 5
And so on...
I tried the code below, but it didn't work:
m(10276:128597,10) > 11 & m(10276:128597,10)< 33=1;
Can anyone help me please!!! :)

I think this may help:
a=[1,2,3;2,3,4;3,2,1;2,4,1];
amask=false(size(a));
amask(2:3,3)= a(2:3,3)>3;
a(amask)=9999;
The matrix before and after the replacement is:
a =
1 2 3
2 3 4
3 2 1
2 4 1
a =
1 2 3
2 3 9999
3 2 1
2 4 1
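The same masking idea can be applied to the ranges from the question. This is only a minimal sketch on a small made-up matrix: the row range `rows`, the column `col`, and the sample values are assumptions for illustration, and "between" is assumed to be inclusive at both ends.

```matlab
% Small stand-in for the real 128597x10 dataset (values are made up)
m = [(1:6)', [5; 12; 20; 40; 33; 60]];
rows = 3:6;   % stand-in for 10276:128597
col  = 2;     % stand-in for column 10

% Work on a copy of the selected block so every condition is
% tested against the original values, not already-replaced ones
vals    = m(rows, col);
newvals = vals;
newvals(vals >= 11 & vals <= 33) = 1;
newvals(vals >= 34 & vals <= 56) = 5;

% Write the block back; rows outside the range are untouched
m(rows, col) = newvals;
```

Here row 2 keeps its value 12 because it lies outside `rows`, and 60 is kept because it matches no condition.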

If your bins are consecutive, i.e. there are no gaps between the categories, you can use discretize. This way you can change all values at once and don't have to use an insane number of logical operators if you have a lot of bins.
% edges of the bins. As an example there are currently 3 bins ranging from
% 11 to 34, 34 to 57, 57 to 77. The upper values are included in the next bin, i.e. bin 2 goes from 34.0 to 56.999999
edges = [11 34 57 77];
% Example values you want to give to the bins. Length has to be one shorter than edges
replacementvalues = [1 5 9];
% subset of the data you want to change (column 10 only, as asked in the question)
subset = m(10276:128597,10);
% bin the subset into categories
binnumber = discretize(subset,edges);
% replace the values in subset that are inside those bins
subset(~isnan(binnumber)) = replacementvalues(binnumber(~isnan(binnumber)));
% replace the values in the original matrix with the updated subset
m(10276:128597,10) = subset;
If there are gaps between the bins, the code can be expanded by making a second set of "do not change" bins.
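One possible way to handle such gaps is sketched here. It is an assumption about how the "do not change" bins could be set up, not the only option: a gap gets its own bin, and its replacement value is set to NaN as a marker meaning "leave unchanged". The sample values in `col` and the gap from 34 to 40 are made up.

```matlab
% Made-up column of values to recode
col = [5; 11; 20; 33.5; 36; 45; 56; 60];

% Bins: [11,34) -> 1, [34,40) is a gap, [40,57] -> 5
edges = [11 34 40 57];
replacementvalues = [1 NaN 5];   % NaN marks the "do not change" bin

% Bin the values; anything outside all edges gets NaN as bin number
binnumber = discretize(col, edges);
inBin = ~isnan(binnumber);

% Look up the replacement for every binned value
newvals = nan(size(col));
newvals(inBin) = replacementvalues(binnumber(inBin));

% Only overwrite where a real replacement value exists
change = ~isnan(newvals);
col(change) = newvals(change);
```

Values outside all edges and values inside the gap bin both end up with NaN in `newvals`, so they keep their original value.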

Related

How to split a matrix based on how close the values are?

Suppose I have a matrix A:
A = [1 2 3 6 7 8];
I would like to split this matrix into sub-matrices based on how relatively close the numbers are. For example, the above matrix must be split into:
B = [1 2 3];
C = [6 7 8];
I understand that I need to define some sort of criteria for this grouping so I thought I'd take the absolute difference of the number and its next one, and define a limit upto which a number is allowed to be in a group. But the problem is that I cannot fix a static limit on the difference since the matrices and sub-matrices will be changing.
Another example:
A = [5 11 6 4 4 3 12 30 33 32 12];
So, this must be split into:
B = [5 6 4 4 3];
C = [11 12 12];
D = [30 33 32];
Here, the matrix is split into three parts based on how close the values are. So the criteria for this matrix is different from the previous one though what I want out of each matrix is the same, to separate it based on the closeness of its numbers. Is there any way I can specify a general set of conditions to make the criteria dynamic rather than static?
I'm afraid, my answer comes too late for you, but maybe future readers with a similar problem can profit from it.
In general, your problem calls for cluster analysis. Nevertheless, maybe there's a simpler solution to your actual problem. Here's my approach:
First, sort the input A.
To find a criterion to distinguish between "intraclass" and "interclass" elements, I calculate the differences between adjacent elements of A, using diff.
Then, I calculate the median over all these differences.
Finally, I find the indices of all differences that are greater than or equal to three times the median, with a minimum difference of 1. (Depending on the actual data, this might be modified, e.g. using mean instead.) These are the indices where you will have to "split" the (sorted) input.
At last, I set up two vectors with the start and end indices of each "sub-matrix", and use arrayfun to get a cell array with all desired "sub-matrices".
Now, here comes the code:
% Sort input, and calculate differences between adjacent elements
AA = sort(A);
d = diff(AA);
% Calculate median over all differences
m = median(d);
% Find indices with "significantly higher difference",
% e.g. greater or equal than three times the median
% (minimum difference should be 1)
idx = find(d >= max(1, 3 * m));
% Set up proper start and end indices
start_idx = [1 idx+1];
end_idx = [idx numel(A)];
% Generate cell array with desired vectors
out = arrayfun(@(x, y) AA(x:y), start_idx, end_idx, 'UniformOutput', false)
Due to the unknown number of possible vectors, I can't think of a way to "unpack" these to individual variables.
Some tests:
A =
1 2 3 6 7 8
out =
{
[1,1] =
1 2 3
[1,2] =
6 7 8
}
A =
5 11 6 4 4 3 12 30 33 32 12
out =
{
[1,1] =
3 4 4 5 6
[1,2] =
11 12 12
[1,3] =
30 32 33
}
A =
1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3
out =
{
[1,1] =
1 1 1 1 1 1 1
[1,2] =
2 2 2 2 2 2
[1,3] =
3 3 3 3 3 3 3
}
Hope that helps!
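Regarding the "unpacking": the groups can be used straight from the cell array, and if the number of groups happens to be known, a comma-separated list assignment gives individual variables. A small sketch using the second example; the variable names `B`, `C`, `D` and `sizes` are only illustrative.

```matlab
A = [5 11 6 4 4 3 12 30 33 32 12];

% Same steps as above: sort, diff, threshold on the median difference
AA = sort(A);
d = diff(AA);
idx = find(d >= max(1, 3 * median(d)));
start_idx = [1 idx+1];
end_idx = [idx numel(AA)];
out = arrayfun(@(x, y) AA(x:y), start_idx, end_idx, 'UniformOutput', false);

% Access the k-th group directly via curly braces
for k = 1:numel(out)
    fprintf('group %d: %s\n', k, mat2str(out{k}));
end

% Number of elements in each group
sizes = cellfun(@numel, out);

% Only valid when exactly three groups were found
if numel(out) == 3
    [B, C, D] = out{:};
end
```

In most cases, keeping the cell array and indexing it with `out{k}` is easier to work with than a varying number of separate variables.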

Choose multiple combination (matlab)

I have 3 arrays, each containing 12 elements of the same type:
a=[1 1 1 1 1 1 1 1 1 1 1 1];
b=[2 2 2 2 2 2 2 2 2 2 2 2];
c=[3 3 3 3 3 3 3 3 3 3 3 3];
I have to find how many ways there are to pick 12 items at a time.
Here, 1 1 2 is the same as 2 1 1.
I found this link Generating all combinations with repetition using MATLAB.
How can this be done in MATLAB within a reasonable time?
Is this approach correct?
abc=[a b c];
allcombs=nmultichoosek(abc,12);
combs=unique(allcombs,'rows');
If you only need to find the number of ways to select the items, then using generating functions is a way to very efficiently compute that, even for fairly large values of N and k.
If you are not familiar with generating functions, you can read up on the math background here:
http://mathworld.wolfram.com/GeneratingFunction.html
and here:
http://math.arizona.edu/~faris/combinatoricsweb/generate.pdf
The solution hinges on the fact that the number of ways to choose k items from 36, with each of 3 items repeated 12 times, can be determined from the product of the generating function:
g(x) = 1 + x + x^2 + x^3 + ... + x^12
with itself 3 times. The 12 comes from the fact that the elements are repeated 12 times (NOT from the fact that you are choosing 12), and multiplying by itself 3 times is because there are three different sets of elements. The number of ways to choose 12 elements is then just the coefficient of x^12 in this product of polynomials (try it for smaller examples if you want to prove to yourself that it works).
The great thing about that is that MATLAB has a simple function conv for multiplying polynomials:
>> g = ones(1,13); %% array of 13 ones, for a 12th degree polynomial with all `1` coefficients
>> p = conv(g, conv(g, g)); %% multiply g by itself 3 times, as a polynomial (named p to avoid shadowing the built-in prod)
>> p(13) %% coefficient of x^12 (index 1 holds the x^0 term)
ans =
91
So there are 91 ways to select 12 elements from your list of 36. If you want to select 11 elements, that's p(12) = 78. If you want to select 13 elements, that's p(14) = 102.
Finally, if you had different numbers of elements in the sets, say 10 1's, 12 2's, and 14 3's, then you would have 3 distinct polynomials of the same form, 1 + x + x^2 + ..., with degrees 10, 12 and 14 respectively. Inspecting the coefficient of the degree k term again gives you the number of ways to choose k elements from this set.
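As a rough sketch of that last case (10 1's, 12 2's and 14 3's), each set gets its own all-ones polynomial and the three are multiplied with conv. The names `g10`, `g12`, `g14`, `p` and `nWays` are just illustrative.

```matlab
% One polynomial per set: degree = number of copies available
g10 = ones(1, 11);   % 1 + x + ... + x^10
g12 = ones(1, 13);   % 1 + x + ... + x^12
g14 = ones(1, 15);   % 1 + x + ... + x^14

% Product of the three polynomials
p = conv(g10, conv(g12, g14));

% Coefficient of x^k sits at index k+1, because index 1 is x^0
k = 12;
nWays = p(k + 1);
```

The product has degree 10 + 12 + 14 = 36, so `p` has 37 coefficients and any `k` from 0 to 36 can be read off the same way.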

Group matrix values into separate matrices based on values of another matrix

I am reading in images with imread which results in 768x1024x3 matrix with R,G,B values of each pixel.
I have a function that takes in an image and returns a matrix of segment labels for each pixel, so this matrix is 768x1024. The labels are just numbers 1,2,3,4 depending on how many different segments the function finds.
Now I want to calculate the average Red, Green and Blue value in each segment of the image. So I want to use the indices from the segment label matrix to group all R,G,B values into separate arrays and then be able to calculate the mean.
Is there any smart way to do this? Can I use the indices of each 1 value in the segment matrix to get the values from the imread matrix and group the segments into different arrays? I thought of using for loops and brute-forcing through this, but is there a better way?
Here's code that will get you everything without looping.
Code
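%// img is your input RGB image (NxMx3), assumed to be of class double
%// (imread usually returns uint8, which cannot store NaN; use double(img))
%// L is your label matrix (NxM)
t1 = bsxfun(@eq,L,permute(unique(L),[3 2 1]));
t2 = bsxfun(@times,permute(img,[ 1 2 4 3]),t1);
%// note: pixels whose true value is 0 are also turned into NaN here
t2(t2==0)=nan;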
out = squeeze(nanmean(nanmean(t2)))
%// out is the desired output matrix that is (NLx3),
%// where NL is the number of labels. Thus, the mean of labels is
%// along the rows and the corresponding values for R, G and B are in the three
%// columns of it.
Explanation
Let's test out with some random values for img -
img = randi(9,3,4,3)
Giving us -
img(:,:,1) =
9 7 5 3
7 7 2 4
1 6 7 9
img(:,:,2) =
8 6 6 4
4 9 3 9
3 9 8 1
img(:,:,3) =
5 4 4 5
7 2 5 3
2 3 1 3
Some assumed values for L that go from 1 to 8 -
L = [1 3 3 4;
4 5 8 8;
5 6 7 2]
The code output is -
out =
9 8 5
9 1 3
6 6 4
5 4 6
4 6 2
6 9 3
7 8 1
3 6 4
Let's see how to make sense of the output.
Looking at the input, let's choose the label 8, which is at locations (2nd row,3rd col) and (2nd row,4th col). The corresponding R values at these locations in img are [2 4], and thus the R mean/average value must be 3. Similarly for G it must be from [3 9], that is 6 and again for B would be from [5 3], that is 4.
Let's look at the 8th row of out that represents the label-8, we have [3 6 4], which are the mean values as calculated earlier. Similarly other mean values could be interpreted from out.
Edited to handle all channels at once.
Let img be your RGB image and labels the labels array.
You can mask the RGB image with the labels like this:
% create a 3-channels mask:
labelsRGB=repmat(labels, 1, 1, 3);
Segment1=img.*(labelsRGB==1);
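The average values in the segment labeled as 1 are then (dividing by the number of pixels in the segment, since a plain mean would also count the zeros outside it):
avg=sum(sum(Segment1, 1), 2) / nnz(labels==1);
Get the average for red in avg(1), the average for green in avg(2), etc.
The same applies to the other segments.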
Here goes a general alternative.
In this case you do not need to loop over the different segments to get the average of each.
%simulated image and label
img=rand(10,12,3);
labeled=[ones(10,3),ones(10,3)*2,ones(10,3)*3,ones(10,3)*4];
% actual code for the mean
red_mean = regionprops(labeled, img(:,:,1), 'MeanIntensity')
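If the Image Processing Toolbox is not available, a similar per-label mean can be sketched with accumarray from base MATLAB. This is an alternative technique, not the regionprops approach itself. It assumes the labels are positive integers and uses the small img and L from the worked example above; `nLabels`, `channel` and `out` are illustrative names.

```matlab
% Example image (3x4x3) and label matrix from the explanation above
img = cat(3, [9 7 5 3; 7 7 2 4; 1 6 7 9], ...
             [8 6 6 4; 4 9 3 9; 3 9 8 1], ...
             [5 4 4 5; 7 2 5 3; 2 3 1 3]);
L = [1 3 3 4; 4 5 8 8; 5 6 7 2];

nLabels = max(L(:));
out = zeros(nLabels, 3);   % one row per label, columns are R, G, B

% Loop over the three channels only, not over pixels or segments
for c = 1:3
    channel = img(:, :, c);
    % Group the pixel values by label and average each group
    out(:, c) = accumarray(L(:), channel(:), [nLabels 1], @mean);
end
```

Unlike the NaN-based masking, this keeps pixels whose value is exactly 0 in the average. Labels that never occur in L get a row of zeros.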

Creating combinations without repetitions

I have so far the following code:
data = xlsread('filename');
% 1000 samples without replacement
% each element of y contains 10 values without repetition
y = cell(1,1000);
for i = 1:1000
y{i} = datasample(data,10,'Replace',false);
end
Now I don't want to have the same vector twice in the cell y, and by twice I also mean vectors like [1 2 3 4 5 6 7 8 9 10] and [1 2 3 4 5 6 7 8 10 9], i.e. the ordering of the elements does not matter, but if 2 vectors contain the same elements I want one to be deleted. How do I do that? Alternatively, is there a way to sample some of the combinations without replacement from data? Data contains 171 values, and the number of all combinations without replacement is far too large to enumerate, whereas I basically only need around 1000 of them. Thanks
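One possible approach, sketched under some assumptions: draw index sets with randperm, sort each one so that ordering no longer matters, and let unique(...,'rows') drop repeats until enough distinct sets are collected. The stand-in `data` vector replaces the xlsread call, and `nWanted`, `k`, `samples` and `batch` are illustrative names. If data itself contains repeated values, two different index sets could still hold the same values.

```matlab
data = (1:171)';     % stand-in for data = xlsread('filename')
nWanted = 1000;      % number of distinct combinations needed
k = 10;              % elements per combination

samples = zeros(0, k);
while size(samples, 1) < nWanted
    % Draw as many new index sets as are still missing
    nMissing = nWanted - size(samples, 1);
    batch = zeros(nMissing, k);
    for i = 1:nMissing
        % k distinct indices, sorted so that order does not matter
        batch(i, :) = sort(randperm(numel(data), k));
    end
    % Drop any index set that was drawn more than once
    samples = unique([samples; batch], 'rows');
end

% Each row of y is one combination of k values from data
y = data(samples);
```

With 171 values and 10 picks, collisions are extremely unlikely, so the loop normally runs once. Note that unique returns the rows in sorted order; shuffle them with `samples(randperm(nWanted), :)` if the order should be random.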

Find the increasing and decreasing trend in a curve MATLAB

a=[2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2].
Here is an array; I need to extract the exact values where each increasing or decreasing trend starts.
The output for the array a will be [2 2 6 9] (the first 2 is the first element of a).
a=[2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2].
   ^       ^        ^               ^
   |       |        |               |
Kindly help me to get the result in MATLAB for any similar type of array..
You just have to find where the sign of the difference between consecutive numbers changes.
With some common sense and the functions diff, sign and find, you get this solution:
a = [2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2];
sda = sign(diff(a));
idx = [1 find(sda(1:end-1)~=sda(2:end))+2 ];
result = a(idx);
EDIT:
The sign function messes things up when there are two consecutive numbers which are the same, because sign(0) = 0, which is falsely identified as a trend change. You'd have to filter these out. You can do this by first removing the consecutive duplicates from the original data. Since you only want the values where the trend change starts, and not the position where it actually starts, this is easiest:
a(diff(a)==0) = [];
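Putting the duplicate removal and the sign-change search together, a small sketch on a made-up array with repeated neighbours could look like this (the array is an assumption for illustration):

```matlab
% Made-up data with consecutive duplicates (3 3, 7 7, 6 6)
a = [2 3 3 6 7 7 2 1 6 6 8];

% Remove consecutive duplicates so sign(diff(a)) is never 0
a(diff(a) == 0) = [];

% Sign of the slope between neighbours: +1 rising, -1 falling
sda = sign(diff(a));

% A trend change is where the sign flips; +2 moves from the position
% in sda to the first element of the new trend in a
idx = [1 find(sda(1:end-1) ~= sda(2:end)) + 2];
result = a(idx);
```

After the duplicates are removed, a is [2 3 6 7 2 1 6 8], and result holds the first element plus the first value of each new trend.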
This is a great place to use the diff function.
Your first step will be to do the following:
B = [0 diff(a)]
The reason we add the 0 there is to keep the matrix the same length because of the way the diff function works. It will start with the first element in the matrix and then report the difference between that and the next element. There's no leading element before the first one, so it just truncates the matrix by one element. We add a zero because there is no change there as it's the starting element.
If you look at the results in B now, it is quite obvious where the inflection points are (where the differences switch between positive and negative numbers).
To pull this out programmatically there are a number of things you can do. I tend to use a little multiplication and the find command.
Result = find(B(1:end-1).*B(2:end)<0)
This will return the indices where you are on the cusp of the inflection, i.e. the last element before the trend changes. In this case it will be:
Result =
4 7 13
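To get from these indices to the values asked for in the question, one option is to index a with Result (the peaks and troughs) or with Result + 1 (the first element of each new trend). A short sketch; the names `turning_values` and `trend_starts` are illustrative.

```matlab
a = [2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2];

% Differences, padded with a leading 0 to keep the length of a
B = [0 diff(a)];

% Indices of the last element before each sign change
Result = find(B(1:end-1) .* B(2:end) < 0);

% Values at the peaks and troughs themselves
turning_values = a(Result);

% First element of a plus the first value of each new trend
trend_starts = [a(1) a(Result + 1)];
```

Here `trend_starts` is [2 2 6 9], the output requested in the question, while `turning_values` is [7 0.01 18].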