adding the values in a matrix based on the corresponding unique values of another matrix - matlab

e=[40 19 18 20 30 34 65 97 155 160];
If the difference between two consecutive values is small (e.g. (19, 18), (30, 34) and (155, 160)), then merge these values.
Similar values should be merged as well. Any condition that solves this is fine; any help is appreciated.

Iteratively,
e = [ 40 19 18 20 30 34 65 97 155 160];
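threshold = 5; % largest gap that still counts as "close" (taken from the example pairs)
current = e; % init
prev = []; % differs from current, so the loop runs at least once
while ~isequal( current, prev )
    prev = current;
    % keep a value only if the next one is far away; always keep the last one
    d = [ abs( diff( prev ) ) > threshold true];
    current = prev( d );
end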
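If "merging" is taken to mean replacing each run of close consecutive values by a single representative, a non-iterative variant is also possible. The sketch below rests on that assumption and is not the loop above: it labels the runs with cumsum and averages each run with accumarray. The name tol and the choice of the mean are illustrative.

```matlab
e = [40 19 18 20 30 34 65 97 155 160];
tol = 5;                                   % assumed: gaps up to tol count as "close"
startsRun = [true, abs(diff(e)) > tol];    % true where a new run of close values begins
runId = cumsum(startsRun);                 % run label for every element of e
merged = accumarray(runId(:), e(:), [], @mean).';   % one mean per run
disp(merged)
```

Swapping @mean for @min, @max or @(x) x(end) changes which representative is kept for each run.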

Related

Dividing a matrix into two parts

I am trying to classify my dataset using its 4th column. If the 4th column of a row is equal to 1, that row should be added to a new matrix called Q1. If it is equal to 2, the row should be added to a matrix called Q2.
My code:
i = input('Enter a start row: ');
j = input('Enter a end row: ');
search = importfiledataset('search-queries-features.csv',i,j);
[n, p] = size(search);
if j>n
disp('Please enter a smaller number!');
end
for s = i:j
class_id = search(s,4);
if class_id == 1
Q1 = search(s,1:4)
elseif class_id ==2
Q2 = search(s,1:4)
end
end
This computes Q1 and Q2, but each of them is only 1x4: whenever a new Q1 is assigned, the old one is overwritten. I need to append the new row instead, so that Q1 grows to 2x4 (and so on) each time the condition is true.
In short, I am trying to split my dataset into two parts using for loops and if statements.
Dataset:
I need outcome like:
Q1 = [30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 58 30 1
34 60 1 1
34 61 10 1]
Q2 = [34 59 0 2
34 66 9 2]
How can I prevent my code from deleting previous rows of Q1 and Q2 and obtain the entire matrices?
The main problem in your calculation is that you overwrite Q1 and Q2 each loop iteration. Best solution: get rid of the loops and use logical indexing.
You can use logical indexing to quickly determine where a column is equal to 1 or 2:
search = [
30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 59 0 2
34 66 9 2
34 58 30 1
34 60 1 1
34 61 10 1
];
Q1 = search(search(:,4)==1,:) % == compares each entry in the fourth column to 1
Q2 = search(search(:,4)==2,:)
Q1 =
30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 58 30 1
34 60 1 1
34 61 10 1
Q2 =
34 59 0 2
34 66 9 2
Warning: Slow solution!
If you are hell bent on using loops, make sure to not overwrite your variables. Either extend them each iteration (which is very, very slow):
Q1=[];
Q2=[];
for ii = 1:size(search,1) % loop over all rows
if search(ii,4)==1
Q1 = [Q1;search(ii,:)];
end
if search(ii,4)==2
Q2 = [Q2;search(ii,:)];
end
end
MATLAB will put orange wiggles beneath Q1 and Q2, because it's a bad idea to grow arrays in-place. Alternatively, you can preallocate them as large as search and strip off the excess:
Q1 = zeros(size(search)); % Initialise to be as large as search
Q2 = zeros(size(search));
Q1kk = 1; % Initialise counters
Q2kk = 1;
for ii = 1:size(search,1) % loop over all rows
if search(ii,4)==1
Q1(Q1kk,:) = search(ii,:); % store
Q1kk = Q1kk + 1; % Increase row counter
end
if search(ii,4)==2
Q2(Q2kk,:) = search(ii,:);
Q2kk = Q2kk + 1;
end
end
Q1 = Q1(1:Q1kk-1,:); % strip off excess rows
Q2 = Q2(1:Q2kk-1,:);
Another option using accumarray, if Q is your original matrix:
Q = accumarray(Q(:,4),1:size(Q,1),[],@(x){Q(x,:)});
You can access the result with Q{1} (for class_id = 1), Q{2} (for class_id = 2) and so on...
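A small worked sketch of the accumarray idea, using a few rows from the example data. The names rowIdx and parts are illustrative. The sort inside the function handle is a precaution: the accumarray documentation notes that the order of the values passed to the function is not guaranteed when the subscripts are unsorted, so sorting the row numbers keeps the rows in their original order.

```matlab
Q = [30 64 1 1; 34 59 0 2; 30 62 3 1; 34 66 9 2];   % a few rows of the example data
rowIdx = (1:size(Q,1)).';                            % row numbers as a column vector
% Collect, per class label, the rows carrying that label (one cell per class)
parts = accumarray(Q(:,4), rowIdx, [], @(x) {Q(sort(x), :)});
Q1 = parts{1};                                       % rows with class_id == 1
Q2 = parts{2};                                       % rows with class_id == 2
```

This only works directly when the class labels are positive integers, since they are used as subscripts.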

Checking if value exists in a matrix and getting its columns

I have a 500x500 matrix with values ranging from 1-100.
I need to look at 5 rows at a time and see if those 5 rows contain values that are greater than 75. I then need to get the index of the first column where the value is greater than 75 and the index of the last column where the value is greater than 75.
So far, I have the following:
i = 1;
while i < size(data,1)
if (i + 5) <= size(data,1)
if any(envNoClutterscansV(i:i + 5, 1:500) > 75)
% do something
end
end
i = i + 5;
end
The idea here is that I am looking at 5 rows at a time. For every 5 rows, I'm looking through all the columns to see if there are values that meet my criteria. So far, this doesn't find any values, even though I'm sure that my dataset contains the values. Additionally, I am not sure what to do from here.
I think the trouble is that the result of any in the above code is a vector of 500 true and false values (one per column), and if only runs its body when every element is true. Sum the vector if you want to respond every time there are values larger than 75:
if sum(any(envNoClutterscansV(i:i + 5, 1:500) > 75))
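To see why the original condition never fires, here is a minimal sketch on a made-up 2x3 block (the values and variable names are assumptions for illustration). The column-wise result of any has to be reduced to a scalar before it is used as a condition:

```matlab
block = [11 76 25; 11 75 25];        % hypothetical 2x3 block of rows
perColumn = any(block > 75);         % 1x3 logical, one entry per column
ranWithVector = false;
if perColumn                         % body runs only if ALL entries are true
    ranWithVector = true;
end
anyHit = any(perColumn);             % scalar: does any column contain a hit?
```

Here perColumn is [0 1 0], so the first if is skipped even though the block contains a value above 75, while anyHit is true.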
If you want to speed it up, you can avoid the loop and vectorize it, for example like this:
data = [
11 76 25 44 55 75;
11 75 95 44 85 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 90 25 44 55 88;
11 0 25 44 55 0;
91 0 25 44 55 80;
];
% Get the number of rows
nRows=size(data,1);
% Get a logical matrix marking all the cells that are above the threshold
cellsOverTreshold=data>75;
% Get a logical index of all the rows that contain values above the
% threshold
matchingRows=any(cellsOverTreshold,2);
% In the next line of code, "reshape" rearranges the data so that each
% column holds the values associated with one group of 5 rows:
% column 1 holds group one, corresponding to data rows 1,2,3,4,5,
% column 2 holds group two, corresponding to data rows 6,7,8,9,10,
% and so on.
% Now we can get all the row groups that have values above the threshold
matchingRowGroups=find(any(reshape(matchingRows,5,[])));
% Now we put each row in a cell array to be able to operate row-wise
cellRows = num2cell(cellsOverTreshold, 2);
% We now get the first and last column over the threshold for each row
firstColumOfRow = cellfun(@(x)find(x,1,'first'), cellRows,'UniformOutput',false);
lastColumOfRow = cellfun(@(x)find(x,1,'last'), cellRows,'UniformOutput',false);
% We replace the empty cells with NaNs so we can convert them to vectors
% without losing the indexing
firstColumOfRow(~matchingRows)={NaN};
lastColumOfRow(~matchingRows)={NaN};
% We rearrange the data as above and get the minimum of the first columns
% of each group, that is, the first column of the group above the threshold
firstColInGroup=nanmin(reshape([firstColumOfRow{:}]',5,[]));
% With the maximum of the last columns we get the last column of each group
lastColInGroup=nanmax(reshape([lastColumOfRow{:}]',5,[]));
% Finally, we keep only the data of the groups that have at least one
% element above the threshold
firstColInGroup=firstColInGroup(matchingRowGroups);
lastColInGroup=lastColInGroup(matchingRowGroups);
In this way, the variable "matchingRowGroups" holds the indices of the groups of 5 rows that match. The variable "firstColInGroup" holds the first matching column of each group and "lastColInGroup" the last one.
In addition to my previous answer, here is another vectorized option. It avoids converting the data into cell arrays and avoids cellfun, so it is probably faster:
data = [
11 76 25 44 55 75;
11 75 95 44 85 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 90 25 44 55 88;
11 0 25 44 55 0;
91 0 25 44 55 80;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 84 55 0;
11 0 25 44 55 0;
];
% Get the number of rows and columns
[nRows, nCols]=size(data);
% Get a logical matrix marking all the cells that are above the threshold
cellsOverTreshold=data>75;
% Get a logical index of all the rows that contain values above the
% threshold
matchingRows=any(cellsOverTreshold,2);
% In the next line of code, "reshape" rearranges the data so that each
% column holds the values associated with one group of 5 rows:
% column 1 holds group one, corresponding to data rows 1,2,3,4,5,
% column 2 holds group two, corresponding to data rows 6,7,8,9,10,
% and so on.
% Now we can get all the row groups that have values above the threshold
matchingRowGroups=find(any(reshape(matchingRows,5,[])))
% We find the rows and columns of the first and last columns of each row
% that have values above the threshold
[firstRow, firstCol]=find(cumsum(cumsum(cellsOverTreshold,2),2)==1);
[lastRow, lastCol]=find(cumsum(cumsum(cellsOverTreshold,2,'reverse'),2,'reverse')==1);
% Store this data in vectors with one value per row, leaving NaNs for rows
% with no element above the threshold
firstColumOfRow=NaN(nRows,1);
lastColumOfRow=NaN(nRows,1);
firstColumOfRow(firstRow)=firstCol;
lastColumOfRow(lastRow)=lastCol;
% We rearrange the data as above and get the minimum of the first columns
% of each group, that is, the first column of the group above the threshold
firstColInGroup=nanmin(reshape(firstColumOfRow,5,[]));
% With the maximum of the last columns we get the last column of each group
lastColInGroup=nanmax(reshape(lastColumOfRow,5,[]));
% Finally, we keep only the data of the groups that have at least one
% element above the threshold
firstColInGroup=firstColInGroup(matchingRowGroups)
lastColInGroup=lastColInGroup(matchingRowGroups)
This code looks at 5 rows at a time. It uses find to locate the values > 75 and ind2sub to convert the linear indices returned by find into rows (ignored) and columns (cols).
data = [
11 76 25 44 55 78;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 0 25 44 55 88;
11 0 25 44 55 0;
11 0 25 44 55 0;
];
for row = 1:5:size(data, 1)
fprintf('Row %d - %d\n', row, row+4);
indices = find(data(row:row+4,:) > 75);
if ~isempty(indices)
[~, cols] = ind2sub([5 size(data, 2)], indices);
col_min = min(cols);
col_max = max(cols);
fprintf('Column: %d and %d\n', col_min, col_max);
end
end
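As a possible simplification of the loop body, find can return row and column subscripts directly when called with two outputs, which makes the ind2sub step unnecessary. A minimal sketch on one hypothetical block of rows (the name block is illustrative):

```matlab
block = [11 76 25 44 55 78; 11 75 25 44 55 75];   % two rows taken from the data above
[~, cols] = find(block > 75);                      % column subscript of every hit
col_min = min(cols);                               % first column with a value > 75
col_max = max(cols);                               % last column with a value > 75
fprintf('Column: %d and %d\n', col_min, col_max);
```

If the block contains no value above 75, cols is empty, so the isempty check from the loop above is still needed before printing.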
After thinking a bit more, here is yet another solution that is simpler, faster and more compact. See my first solution for more details on the naming of the variables, although they are quite self-explanatory:
data = [
11 76 25 44 55 75;
11 75 95 44 85 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 44 55 0;
11 0 25 44 55 0;
11 90 25 44 55 88;
11 0 25 44 55 0;
91 0 25 44 55 80;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 75 25 44 55 75;
11 0 25 84 55 0;
11 0 25 44 55 0;
];
% Get the number of rows and columns
[nRows, nCols]=size(data);
% We create arrays with the row and column numbers of each element
[colNum,rowNum]=meshgrid(1:nCols,1:nRows);
% Set to NaN the column numbers that do not exceed the threshold
colNum(data<=75)=NaN;
% Get the group number of each element
groupNum=ceil(rowNum/5);
% The matching groups are those that have at least one non-NaN element
matchingRowGroups = accumarray(groupNum(:),colNum(:),[],@(x)any(~isnan(x)))
% We get the minimum of the column numbers matching the threshold in each group
firstColumOfGroup = accumarray(groupNum(:),colNum(:),[],@nanmin)
% We get the maximum of the column numbers matching the threshold in each group
lastColumOfGroup = accumarray(groupNum(:),colNum(:),[],@nanmax)
The only difference from the previous solutions is that matchingRowGroups is a logical index, and firstColumOfGroup and lastColumOfGroup have one entry per group instead of entries only for groups with elements above the threshold. Groups with no entry above the threshold have NaN values.
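As far as I know, nanmin and nanmax belong to the Statistics and Machine Learning Toolbox. If that toolbox is not available, the same grouping should work with the 'omitnan' flag of the built-in min and max. The sketch below uses a small made-up matrix and a group size of 2 (both are assumptions chosen to keep the example short):

```matlab
data = [11 76 25; 11 0 95; 11 0 25; 11 0 25];   % hypothetical 4x3 data
groupSize = 2;                                   % assumed group size for this example
[nRows, nCols] = size(data);
[colNum, rowNum] = meshgrid(1:nCols, 1:nRows);   % column and row number of each element
colNum(data <= 75) = NaN;                        % keep column numbers only for hits
groupNum = ceil(rowNum / groupSize);             % group index of each element
firstCol = accumarray(groupNum(:), colNum(:), [], @(x) min(x, [], 'omitnan'));
lastCol  = accumarray(groupNum(:), colNum(:), [], @(x) max(x, [], 'omitnan'));
```

Groups without any value above the threshold still come out as NaN, because min and max return NaN when every input is NaN.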

overlapping feature values and efficient features calculation [duplicate]

If I have a matrix
F=[ 24 3 17 1;
28 31 19 1;
24 13 25 2;
47 43 39 1;
56 41 39 2];
The first three columns hold feature values and the fourth column holds the class labels. My problem is to get rid of rows that share a feature value with a row that has a different class label.
For example, for the matrix F I have to remove rows 1, 3, 4 and 5: in the first column the value 24 appears with two different labels in column four, and the same holds for the third column (39 and 39), where the class label also changes.
so output should look like
F=[28 31 19 1];
The straightforward approach would be iterating over the columns, counting the number of different classes for each value, and removing the rows for values associated to more than one class.
Example
F = [24 3 17 1; 28 31 19 1; 24 13 25 2; 47 43 39 1; 56 41 39 2];
%// Iterate over columns
for col = 1:size(F, 2) - 1
%// Count number of different classes for each value
[vals, k, idx] = unique(F(:, col));
count = arrayfun(@(x)length(unique(F(F(:, col) == x, end))), vals);
%// Remove values associated to more than one class
F(count(idx) > 1, :) = [];
end
This results in:
F =
28 31 19 1
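A variant of the same idea without arrayfun, offered as a sketch rather than a drop-in replacement: it builds a single keep mask from the original matrix instead of deleting rows column by column. The names keep, pairs and nClasses are illustrative. Because every column is checked against the full matrix, the result can differ from the loop above on inputs where an earlier deletion would have removed a conflict in a later column.

```matlab
F = [24 3 17 1; 28 31 19 1; 24 13 25 2; 47 43 39 1; 56 41 39 2];
keep = true(size(F, 1), 1);
for col = 1:size(F, 2) - 1
    [~, ~, idx] = unique(F(:, col));             % value index of every row
    pairs = unique([idx, F(:, end)], 'rows');    % distinct (value, class) pairs
    nClasses = accumarray(pairs(:, 1), 1);       % number of classes per value
    keep = keep & (nClasses(idx) == 1);          % drop values seen with 2+ classes
end
result = F(keep, :);
```

For the example matrix this leaves only the row 28 31 19 1.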
Another take at the problem, without arrayfun (edited)
F = [24 3 17 1; 28 31 19 1; 24 13 25 2; 47 43 39 1; 56 41 39 2];
Separate both classes:
A1 = F(F(:,4)==1,1:3);
A2 = F(F(:,4)==2,1:3);
Replicate them to a 3D matrix to compare each line of class1 with each line of class2:
B2 = repmat(shiftdim(A2',-1),size(A1,1),1);
B1 = repmat(A1,[1,1,size(A2,1)]);
D4 = squeeze(sum(B1 == B2,2));
Remove the duplicated rows:
A1(logical(sum(D4,2)),:) = [];
A2(logical(sum(D4,1)),:) = [];
Reconstruct the original matrix:
R = [A1 ones(size(A1,1),1);A2 2*ones(size(A2,1),1)];

Matlab: find mode in range

I have a matrix like:
A=
10 31 32 22
32 35 52 77
68 42 84 32
I need a function like mode but over ranges, for example mymode(A,10) that returns 30: it should count the values that fall into the ranges 0-10, 10-20, 20-30, ... and return the range that contains the most values.
You can use histc to bin your data into the desired ranges and then find the bin with the most members by applying max to the output of histc:
ranges = 0:10:50; % your desired ranges
[n, bins] = histc(A(:), ranges); % bin the data
[v,i] = max(n); % find the bin with most occurrences
[ranges(i) ranges(i+1)] % edges of the most frequent bin
For your specific example this returns
ans =
30 40
which matches your required output, as most of the values in A lie between 30 and 40.
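Recent MATLAB documentation marks histc as not recommended in favour of histcounts. A sketch of the same idea with histcounts follows; the names step, edges and binRange are illustrative, and the edges are extended so that they cover every value in A. Edge handling differs slightly between the two functions, since the last histcounts bin includes its right edge.

```matlab
A = [10 31 32 22; 32 35 52 77; 68 42 84 32];
step = 10;                                            % assumed bin width
edges = 0:step:(floor(max(A(:)) / step) + 1) * step;  % 0, 10, ..., 90
counts = histcounts(A(:), edges);                     % number of values per bin
[~, k] = max(counts);                                 % index of the fullest bin
binRange = [edges(k) edges(k+1)];                     % edges of that bin
disp(binRange)
```

If several bins tie for the highest count, max returns the first of them.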
[M,F] = mode( A((A>=2) & (A<=5)) ) %//only interested in range 2 to 5
...where M gives you the mode and F gives you the frequency of occurrence.
> A = [10 31 32 22; 32 35 52 77; 68 42 84 32]
A =
10 31 32 22
32 35 52 77
68 42 84 32
> lo = 10
lo = 10
> hi = 40
hi = 40
> mode(A(A >= lo & A <= hi))
ans = 32
I guess by the number of different answers that we may be missing your goal. Here is my interpretation.
If you want to have many ranges and you want to output most frequent number for every range, create a cell containing all desired ranges (they could overlap) and use cellfun to run mode() for every range. You can also create a cell with desired ranges using arrayfun in a similar manner:
A = [10 31 32 22; 32 35 52 77; 68 42 84 32];
% create ranges
range_step = 10;
range_start=[0:range_step:40];
range=arrayfun(@(r)([r r+range_step]), range_start, 'UniformOutput', false)
% analyze ranges
o = cellfun(@(r)(mode(A(A>=r(1) & A<=r(2)))), range, 'UniformOutput', false)
o =
[10] [10] [22] [32] [42]