Sorting (with conditions) using Matlab - matlab

Using Matlab, I would like to sort the following wireless sensor readings in descending order of the received signal strength (RSS) values in Column 2. I would then like to find the average of the coordinates corresponding to the three highest RSS values. Column 3 holds the coordinates of each sensor, while Column 4 lists the wireless sensors that are visible to the sensor in Column 1.
However, there is a condition that must be met: the three sensors selected must all be visible to each other. For instance, if sensors A, D and F are selected, sensors D and F must be visible to A, sensors A and D must be visible to F, and sensors A and F must be visible to D.
Column 1   Column 2   Column 3   Column 4 (sensors visible to the sensor in Column 1)
A          -45        1,1        B,C,D,E,H
B          -90        1,5        A,D,C,E,H
C          -50        3,9        A,B,E,H,G
D          -54        4,2        A,C,B,F,G
E          -70        4,6        C,D,H,G
F          -57        7,2        B,D,H,I
G          -75        7,6        D,B,I,E
H          -64        6,9        E,D,G,I
I          -23        9,9        H,G,F,B
Looking forward to any form of assistance. Grateful

Here is what I got for the first part, sorting the data in descending order:
data = cell(9,4);
col1 = ['A','B','C','D','E','F','G','H','I'];
col2 = [-45,-90,-50,-54,-70,-57,-75,-64,-23];
col3 = [{'1,1'},{'1,5'},{'3,9'},{'4,2'},{'4,6'},{'7,2'},{'7,6'},{'6,9'},{'9,9'}];
col4 = [{'B,C,D,E,H'},{'A,D,C,E,H'},{'A,B,E,H,G'},{'A,C,B,F,G'},{'C,D,H,G'},{'B,D,H,I'},{'D,B,I,E'},{'E,D,G,I'},{'H,G,F,B'}];
for i = 1:length(data)
data{i,1} = col1(i);
data{i,2} = col2(i);
data{i,3} = col3(i);
data{i,4} = col4(i);
end
[trash idx] = sort([data{:,2}],'descend');
newdata = data(idx,:);
For the second part, the following finds the average of the coordinates corresponding to the three highest RSS values, but it ignores your condition that the three selected sensors must be visible to each other.
coords = zeros(3,2);
for i = 1:3
coords(i,:) = str2num(cell2mat(newdata{i,3})); % parse 'x,y' into [x y]
end
coord_ave = mean(coords,1); % average x and average y over the three sensors
I'll see if I can figure out the visibility condition and post something if I do. Right now I think strfind can be used to compare columns 1 and 4 with each other, as below, but some additional steps will be needed to find the three strongest mutually visible sensors across all of the data:
current_max_sens = newdata{i,1};
cell2mat(strfind(cellstr(newdata{3,4}),current_max_sens))
I hope this could at least give you a starting point.
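One possible way to handle the visibility condition is sketched below. It assumes that "the three highest" means the mutually visible triple with the largest total RSS, and that two sensors only count as visible to each other when each one lists the other in Column 4. The variable names (names, rss, xy, vis) are illustrative and hold the table as plain arrays rather than the cell array used above.

```matlab
% Table from the question, stored as plain arrays (illustrative names)
names = 'ABCDEFGHI';
rss   = [-45 -90 -50 -54 -70 -57 -75 -64 -23];
xy    = [1 1; 1 5; 3 9; 4 2; 4 6; 7 2; 7 6; 6 9; 9 9];
vis   = {'BCDEH','ADCEH','ABEHG','ACBFG','CDHG','BDHI','DBIE','EDGI','HGFB'};

n = numel(names);

% V(i,j) is true when sensor j appears in sensor i's visibility list
V = false(n);
for i = 1:n
    V(i,:) = ismember(names, vis{i});
end

% Assumption: a pair is mutually visible only if both directions hold
M = V & V';

% Enumerate all triples and keep those whose three pairs are all mutual
triples = nchoosek(1:n, 3);
ok = M(sub2ind([n n], triples(:,1), triples(:,2))) & ...
     M(sub2ind([n n], triples(:,1), triples(:,3))) & ...
     M(sub2ind([n n], triples(:,2), triples(:,3)));
valid = triples(ok,:);

% Assumption: "highest" means the largest total RSS among valid triples
[~, best] = max(sum(rss(valid), 2));
chosen = names(valid(best,:));
avg_xy = mean(xy(valid(best,:),:), 1);
```

For this data only the triples A,B,C and A,B,D are mutually visible, so the sketch selects A,B,C. Enumerating triples with nchoosek is fine for nine sensors, but the number of triples grows cubically with the sensor count.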

Related

Sensor time distributions from arrays and cell arrays

I have a cell array and a numeric array in MATLAB which are inherently linked. The numeric array (A) contains a series of times from several data sources, e.g. the time of each measurement. The array is m measurements (rows) by n sensors (columns). The array is filled with -1 by default since 0 is a valid time.
A = [ [ 100 110 -1 -1 ] ; ...
[ -1 200 180 -1 ] ; ...
[ -1 200 210 240 ] ; ...
[ 400 -1 -1 450 ] ];
The cell array contains the sensors, in chronological order, for each row of the numeric array. Each cell element contains a vector listing the sensors in the order in which they made the measurements.
C = { [1 2] [3 2] [2 3 4] [1 4]};
I want to see the distribution of times relative to each sensor, e.g. what is the distribution of times from sensors 2/3/4 (when they are present), relative to sensor 1?
For example...
Sensor 1 is involved in the first and fourth measurements and the other detectors were +10 (100 -> 110) and +50 (400 -> 450). In this case I'm looking to return an array such as [10 50].
Sensor 2 is involved in the first three events, one of which is a three-way event. In this case sensor 2 isn't always the first to trigger, so some values will be negative. Here I'm looking to return [-10 -20 +10 +40].
Using the same logic sensor3 should return [20 -10 30] and sensor4 [-40 -30 -50].
I'm sure there should be an easy way to do this but I can't get my head round it. Of course the example I've given is a very simple one. Normally I'm dealing with tens of sensors and hundreds of thousands of measurements, so looping over every column and row will take a long time and often yield few results if only two (or so) of the sensors trigger in each measurement. For this reason I was hoping to use the elements in the cell array to access only the relevant elements in the numeric array.
Any thoughts?
If I have understood the problem well enough for solving, it seems you don't need to worry about C for the output. Here's the code -
num_sensors = size(A,2); %// No. of sensors
A = A'; %//' The tracking goes row-wise, so transpose the input array
A(A==-1)=nan; %// set minus 1's to NaNs to exclude those elements
out = cell(num_sensors,1); %// storage for output
for k1 = 1:num_sensors
%// Per sensor subtractions
per_sensor_subt = bsxfun(@minus,A,A(k1,:));
%// Set all elements of its own row to NaNs to exclude own subtractions
per_sensor_subt(k1,:)=nan;
%// Get all the non-nans that correspond to the valid output
out{k1} = per_sensor_subt(~isnan(per_sensor_subt));
end
Output -
>> celldisp(out)
out{1} =
10
50
out{2} =
-10
-20
10
40
out{3} =
20
-10
30
out{4} =
-40
-30
-50
As you have confirmed that the order of the output for each cell isn't important, you can employ a simplified approach that could be faster -
num_sensors = size(A,2); %// No. of sensors
A(A==-1)=nan; %// set minus 1's to NaNs to exclude those elements
out = cell(num_sensors,1); %// storage for output
for k1 = 1:num_sensors
%// Per sensor subtractions
per_sensor_subt = bsxfun(@minus,A,A(:,k1));
%// Set all elements of its own column to NaNs to exclude own subtractions
per_sensor_subt(:,k1)=nan;
%// Get all the non-nans that correspond to the valid output
out{k1} = per_sensor_subt(~isnan(per_sensor_subt));
end
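On MATLAB R2016b or later (and in Octave), implicit expansion lets the same per-sensor subtraction be written without bsxfun. The following is a minimal sketch of the simplified loop using the example A from the question; the temporary name d is illustrative.

```matlab
% Example data from the question; -1 marks "no measurement"
A = [100 110  -1  -1;
      -1 200 180  -1;
      -1 200 210 240;
     400  -1  -1 450];
A(A == -1) = NaN;                 % exclude missing entries

num_sensors = size(A, 2);
out = cell(num_sensors, 1);
for k1 = 1:num_sensors
    d = A - A(:, k1);             % implicit expansion replaces bsxfun(@minus, ...)
    d(:, k1) = NaN;               % drop the sensor's difference with itself
    out{k1} = d(~isnan(d));       % keep only valid differences
end
```

The values in each cell match the expected results from the question, although their order within a cell may differ.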
Fully vectorized solution if memory permits -
[m,n] = size(A); %// No. of measurements (m) and sensors (n)
A(A==-1)=nan; %// set minus 1's to NaNs to exclude those elements
%// Per sensor subtractions: element (i,j,k) is A(i,j)-A(i,k)
per_sensor_subt = bsxfun(@minus,A,permute(A,[1 3 2]));
%// Set each sensor's subtractions with itself to NaNs to exclude them
own_idx = bsxfun(@plus,bsxfun(@plus,[1:m]',[0:n-1]*numel(A)),[0:n-1]*m);%//'
per_sensor_subt(own_idx)=nan;
%// Linear and row-col-dim3 indices of valid subtractions
idx = find(~isnan(per_sensor_subt));
[x,y,z] = ind2sub(size(per_sensor_subt),idx);
%// Get per sensor output
out = arrayfun(@(k) per_sensor_subt(idx(z==k)),1:n,'un',0);
If you would like to calculate C, use this approach -
%// Note: A here must be the original array with -1 placeholders, not NaNs
%// Sort A row-wise
[sortedA,sorted_idx] = sort(A,2);
%// Set all invalid indices to zeros, so that later on we can use `nonzeros`
%// to extract out the valid indices
valid_sorted_idx = sorted_idx.*(sortedA~=-1);
%// Convert to a cell array
valid_sorted_idx_cell = mat2cell(valid_sorted_idx,ones(1,size(A,1)),size(A,2));
%// Extract the valid ones (nonzero indices) for the final output, C
C = cellfun(@(x) nonzeros(x), valid_sorted_idx_cell,'un',0);

Find the combination that minimizes a cost function

I am facing a problem and I would be grateful to anyone that could help. The problem is the following:
Consider that we have a vector D = [D1;D2;D3;...;DN] and a set of time instances TI = {t1,t2,t3,...,tM}. Each element of vector D, Di, corresponds to a subset of TI. For example D1 could correspond to time instances {t1,t2,t3} and D2 to {t2,t4,t5}.
I would like to find the combination of elements of D that corresponds to all elements of TI, without any of these being taken into account more than once, and at the same time minimizes the cost function sum(Dj). Dj are elements of vector D and each one corresponds to a set of time instances.
Let me give an example. Let us consider a vector
D = [15;10;5;2;35;15;25;25;25;30;45;5;1;40]
and a set
TI={5,10,15,20,25,30}
Each of D elements corresponds to
{[5 15];[5 20];[5 25];[5 30];[5 15 20];[5 20 25];[5 15 30];[5 20 25 30];[10 15];[10 20];[10 25];[10 15 20];[10 15 20 25];[10 30]}
respectively, e.g. D(1)=15 corresponds to time instance [5 15].
The solution that the procedure has to come up with is that the combination of D(4) and D(13), i.e. 2 and 1 respectively, has the minimum sum and corresponds to all time instances.
I have to mention that the procedure has to be able to work with large vectors.
Thanks for every attempt to help!
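Let x be a binary weight vector that places a weight on each D_i (x_i is 1 if D_i is selected, 0 otherwise).
Let f=[D1;D2;...;DN].
Let A be an M-by-N binary matrix whose column j, A_j, describes D_j:
A(k,j) is 1 if D_j corresponds to t_k, else it is zero.
The problem is:
min f^T*x s.t. A*x=1 (a vector of M ones), with x binary;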
Then use bintprog to solve.
x = bintprog(f,[],[],A,ones(M,1))
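To make the formulation concrete, here is a hedged sketch that builds A for the example in the question and confirms the optimum by exhaustive enumeration. Enumeration is only viable for tiny N and stands in for the solver so that the sketch runs without the Optimization Toolbox; S, best_x, best_cost and selected are illustrative names. If bintprog is not available in your release, intlinprog should accept the same data, as shown in the trailing comment.

```matlab
D  = [15;10;5;2;35;15;25;25;25;30;45;5;1;40];
TI = [5 10 15 20 25 30];
% Time instances covered by each element of D (from the question)
S  = {[5 15],[5 20],[5 25],[5 30],[5 15 20],[5 20 25],[5 15 30], ...
      [5 20 25 30],[10 15],[10 20],[10 25],[10 15 20],[10 15 20 25],[10 30]};
N = numel(D);
M = numel(TI);

% A(k,j) = 1 when D(j) covers TI(k)
A = zeros(M, N);
for j = 1:N
    A(:, j) = ismember(TI, S{j})';
end

% Exhaustive check over all non-empty subsets (2^14 - 1 candidates here)
best_cost = Inf;
best_x = [];
for code = 1:2^N - 1
    x = bitget(code, 1:N)';            % binary selection vector
    if all(A*x == 1) && D'*x < best_cost
        best_cost = D'*x;              % every instance covered exactly once
        best_x = x;
    end
end
selected = find(best_x)';

% With the Optimization Toolbox, the same problem could be posed as, e.g.:
% x = intlinprog(D, 1:N, [], [], A, ones(M,1), zeros(N,1), ones(N,1));
```

For this example the only exact covers are D(4) with D(13) and D(8) with D(9), and the first pair has the lower cost.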

Matlab: Cannot plot timeseries with repeated x values. How to get rid of repeated rows?

so I have a matrix Data in this format:
Data = [Date Time Price]
Now what I want to do is plot the Price against the Time, but my data is very large and has lines where there are multiple Prices for the same Date/Time, e.g. 1st, 2nd lines
29 733575.459548611 40.0500000000000
29 733575.459548611 40.0600000000000
29 733575.459548612 40.1200000000000
29 733575.45954862 40.0500000000000
I want to take an average of the prices with the same Date/Time and get rid of any extra lines. My goal is to do linear interpolation on the values, which is why I must have only one Price value for each Time.
How can I do this? I wrote the following, which reduces the matrix so that it keeps only the first line of each group of repeated date/times, but I don't know how to take the average:
function [ C ] = test( DN )
[Qrows, cols] = size(DN);
C = DN(1,:);
for i = 1:(Qrows-1)
if DN(i,2) == DN(i+1,2)
%n = 1;
%while DN(i,2) == DN(i+n,2) && i+n<Qrows
% n = n + 1;
%end
% somehow take average;
else
C = [C;DN(i+1,:)];
end
end
[C,ia,ic] = unique(A,'rows') also returns index vectors ia and ic
such that C = A(ia,:) and A = C(ic,:)
If you pass as input A only the columns you do not want to average over (here: date and time), ic holds one value for every row, and rows that you want to combine share the same value.
Getting from there to the means you want is probably more intuitive for MATLAB beginners with a for loop: with logical indexing, e.g. DN(ic==n,3), you get a vector of all values you want to average (where n is the index of the date-time row they belong to). You need to do this for every distinct date-time combination.
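The loop described above could look like the following sketch, which uses the four sample rows from the question. The names keys, avgPrice and result are illustrative.

```matlab
% Sample rows from the question: [Date Time Price]
DN = [29 733575.459548611 40.05;
      29 733575.459548611 40.06;
      29 733575.459548612 40.12;
      29 733575.45954862  40.05];

% One key per distinct date/time pair; ic maps every row of DN to its key
[keys, ~, ic] = unique(DN(:,1:2), 'rows');

avgPrice = zeros(size(keys,1), 1);
for n = 1:size(keys,1)
    avgPrice(n) = mean(DN(ic == n, 3));   % logical indexing selects the group
end

result = [keys avgPrice];                 % one row per distinct date/time
```

Here the first two rows share a time stamp, so result has three rows and the first price is the mean of 40.05 and 40.06.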
A more vector-oriented way would be to use accumarray, which leads to a solution of your problem in two lines:
[DateAndTime,~,idx] = unique(DN(:,1:2),'rows');
Price = accumarray(idx,DN(:,3),[],@mean);
I'm not quite sure what you want the result to look like, but [DateAndTime Price] gives you the three-column format of the input again.
Note that if your input contains something like:
1 0.1 23
1 0.2 47
1 0.1 42
1 0.1 23
then applying unique(...,'rows') to the input before the above lines will give a different result for 1 0.1 than using the above lines directly: the latter calculates the mean of 23, 23 and 42, while in the former case one 23 is eliminated as a duplicate beforehand, so the differing row with 42 carries a greater weight in the average.
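The difference can be checked with a short sketch on the four example rows above. The names direct and dedup are illustrative.

```matlab
DN = [1 0.1 23;
      1 0.2 47;
      1 0.1 42;
      1 0.1 23];

% Averaging directly: the duplicate 23 counts twice
[~, ~, idx] = unique(DN(:,1:2), 'rows');
direct = accumarray(idx, DN(:,3), [], @mean);

% Removing fully duplicated rows first: 23 counts once
DNu = unique(DN, 'rows');
[~, ~, idxu] = unique(DNu(:,1:2), 'rows');
dedup = accumarray(idxu, DNu(:,3), [], @mean);
```

For the key 1 0.1, direct holds (23+23+42)/3 while dedup holds (23+42)/2, so which variant is appropriate depends on whether repeated identical rows are genuine observations.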
Try the following:
[Qrows, cols] = size(DN);
% C is your result matrix
C = DN;
% this will give you the indexes where DN(i,2)==DN(i+1,2)
i = find(diff(DN(:,2))==0);
% replace C(i,:) with the average
C(i,:) = (DN(i,:)+DN(i+1,:))/2;
% delete the C(i+1,:) rows
C(i+1,:) = [];
Hope this works.
This should work if the repeated time values come in pairs (the average is calculated between rows i and i+1). If a time value repeats three or more times, these steps will need to be adapted.
Something like this would work, but I did not run the code so I can't promise there are no bugs.
newX = unique(DN(:,2));
newY = zeros(1,length(newX));
for ix = 1:length(newX)
% all rows whose time equals the current unique time
allOcurrences = find(DN(:,2)==newX(ix));
% If there are duplicates, take their mean
if numel(allOcurrences)>1
newY(ix) = mean(DN(allOcurrences,3));
else
% If not, use the only Y value
newY(ix) = DN(allOcurrences,3);
end
end

re-formatting a matrix in matlab

This is a simplistic example of a problem I am facing:
depth = [0:1:20]';
data = rand(1,length(depth))';
d = [depth,data];
d = [d;d;d];
Consider the matrix 'd'. Here we have depth in the first column, followed by the temperature measurements recorded at that depth in column 2 (in this example we have 3 days of data). How could I alter this matrix so that each column represents a specific depth and each row represents time? In the end I should have 3 rows with 21 columns.
If I understand correctly your array d has the data for day 1 in rows 1:21, for day 2 in rows 22:42, and so on. Column 1 of d holds the depths (3 times), and column 2 holds the measurements.
One way to get the results in the form you want is to execute:
d2 = reshape(d(:,2),21,3)'; % note the ' for transposition here
This leaves you with an array with 3 rows and 21 columns. Each column represents the measurements for one depth, each row the measurements for one day.
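As a quick check of the reshape, here is a small sketch that replaces rand with deterministic values so the result can be verified, and derives the sizes from the data instead of hard-coding 21 and 3. The names ndepth and ndays are illustrative, and the sketch assumes every day contains every depth.

```matlab
depth = (0:20)';
data  = (1:numel(depth))' / 100;      % deterministic stand-in for rand
d = [depth, data];
d = [d; d; d];                        % three days stacked vertically

ndepth = numel(unique(d(:,1)));       % 21 distinct depths
ndays  = size(d,1) / ndepth;          % 3 days, assuming no missing rows

% reshape fills column by column, so each column is one day; transpose
% to get one row per day and one column per depth
d2 = reshape(d(:,2), ndepth, ndays)';
```

Each row of d2 then holds one day's measurements ordered by depth.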

reformatting a matrix in matlab with nan values

This post follows a previous question regarding the restructuring of a matrix:
re-formatting a matrix in matlab
An additional problem I face is demonstrated by the following example:
depth = [0:1:20]';
data = rand(1,length(depth))';
d = [depth,data];
d = [d;d(1:20,:);d];
Here I would like to alter this matrix so that each column represents a specific depth and each row represents time, so eventually I will have 3 rows (i.e. days) and 21 columns (i.e. a measurement at each depth). However, we cannot simply reshape this because the number of measurements is not the same for every day, i.e. some are missing. This can be seen with:
dd = sortrows(d,1);
for i = 1:length(depth);
e(i) = length(dd(dd(:,1)==depth(i),:));
end
From 'e' we find that the number of depth is different for different days. How could I insert a nan into the matrix so that each day has the same depth values? I could find the unique depths first by:
unique(d(:,1))
From this, if a depth (from unique) is missing for a given day I would like to insert the depth to the correct position and insert a nan into the respective location in the column of data. How can this be achieved?
You were right that unique comes in handy here. You also need the third output argument, which maps the unique depths onto the positions in the first column of the original d array. Have a look at this code; the comments explain what I do:
% find unique depths and their mapping onto the d array
[depths, ~, j] = unique(d(:,1));
% find the start of every day of measurements
% the assumption here is that the depths for each day are in increasing order
days_data = [1; diff(d(:,1))<0];
% count the number of days
ndays = sum(days_data);
% map every entry in d to the correct day
days_data = cumsum(days_data);
% construct the output array full of nans
dd = nan(numel(depths), ndays);
% assign the existing measurements using linear indices
% Where data does not exist, NaN will remain
dd(sub2ind(size(dd), j, days_data)) = d(:,2)
dd =
0.5115 0.5115 0.5115
0.8194 0.8194 0.8194
0.5803 0.5803 0.5803
0.9404 0.9404 0.9404
0.3269 0.3269 0.3269
0.8546 0.8546 0.8546
0.7854 0.7854 0.7854
0.8086 0.8086 0.8086
0.5485 0.5485 0.5485
0.0663 0.0663 0.0663
0.8422 0.8422 0.8422
0.7958 0.7958 0.7958
0.1347 0.1347 0.1347
0.8326 0.8326 0.8326
0.3549 0.3549 0.3549
0.9585 0.9585 0.9585
0.1125 0.1125 0.1125
0.8541 0.8541 0.8541
0.9872 0.9872 0.9872
0.2892 0.2892 0.2892
0.4692 NaN 0.4692
You may want to transpose the matrix.
It's not entirely clear from your question what your data looks like exactly, but the following might help you towards an answer.
Suppose you have a column vector
day1 = (1:21)';
and, initially, all the values are NaN
day1(:) = NaN
Suppose next that you have a 2d array of measurements, in which the first column represents depths, and the second the measurements at those depths. For example
msrmnts = [1,2;2,3;4,5;6,7] % etc
then the assignment
day1(msrmnts(:,1)) = msrmnts(:,2)
will set values in only those rows of day1 whose indices are found in the first column of msrmnts. This second statement uses Matlab's capabilities for using one array as a set of indices into another array, for example
d([9 7 8 12 4]) = 1:5
would set elements [9 7 8 12 4] of d to the values 1:5. Note that the indices of the elements do not need to be in order. You could even repeat the same index several times in the index array, e.g. [4 4 5 6 3 4], though it's not terribly useful.