I have a cell array and a numeric array in MATLAB which are inherently linked. The numeric array (A) contains a series of times from several data sources, e.g. the time of each measurement. The array has one column per sensor and one row per measurement. The array is filled with -1 by default since 0 is a valid time.
A = [ [ 100 110 -1 -1 ] ; ...
[ -1 200 180 -1 ] ; ...
[ -1 200 210 240 ] ; ...
[ 400 -1 -1 450 ] ];
The cell array contains the sensors, in chronological order, for each row of the numeric array. Each cell element contains a vector showing the sensors in the order they made the measurements.
C = { [1 2] [3 2] [2 3 4] [1 4]};
I want to see the distribution of times relative to each sensor, e.g. what is the distribution of times from sensors 2/3/4 (when they are present), relative to sensor 1?
For example...
Sensor 1 is involved in the first and fourth measurements and the other detectors were +10 (100 -> 110) and +50 (400 -> 450). In this case I'm looking to return an array such as [10 50].
Sensor 2 is involved in the first three events, one of which is a three-way event. In this case sensor 2 isn't always the first to trigger, so some values will be negative. Here I'm looking to return [-10 -20 +10 +40].
Using the same logic sensor3 should return [20 -10 30] and sensor4 [-40 -30 -50].
I'm sure there should be an easy way to do this but I can't get my head round it. Of course the example I've given is a very simple one. Normally I'm dealing with tens of sensors and hundreds of thousands of measurements, so looping over each and every column and row will take a long time, and often yield few results if only two (or so) of the sensors trigger in each measurement. For this reason I was hoping to use the elements in the cell array to access only the correct elements in the numeric array.
Any thoughts?
If I have understood the problem well enough for solving, it seems you don't need to worry about C for the output. Here's the code -
num_sensors = size(A,2)%// No. of sensors
A = A'; %//' The tracking goes row-wise, so transpose the input array
A(A==-1)=nan; %//set minus 1's to NaNs as excluding elements
out = cell(num_sensors,1); %// storage for output
for k1 = 1:num_sensors
%// Per sensor subtractions
per_sensor_subt = bsxfun(@minus,A,A(k1,:));
%// Set all elements of its own row to NaNs to exclude own subtractions
per_sensor_subt(k1,:)=nan;
%// Get all the non-nans that correspond to the valid output
out{k1} = per_sensor_subt(~isnan(per_sensor_subt));
end
Output -
>> celldisp(out)
out{1} =
10
50
out{2} =
-10
-20
10
40
out{3} =
20
-10
30
out{4} =
-40
-30
-50
As you have confirmed that the order of the output for each cell isn't important, you can employ a simplified approach that could be faster -
num_sensors = size(A,2)%// No. of sensors
A(A==-1)=nan; %//set minus 1's to NaNs as excluding elements
out = cell(num_sensors,1); %// storage for output
for k1 = 1:num_sensors
%// Per sensor subtractions
per_sensor_subt = bsxfun(@minus,A,A(:,k1));
%// Set all elements of its own column to NaNs to exclude own subtractions
per_sensor_subt(:,k1)=nan;
%// Get all the non-nans that correspond to the valid output
out{k1} = per_sensor_subt(~isnan(per_sensor_subt));
end
Fully vectorized solution if memory permits -
[m,n] = size(A)%// No. of measurements (rows) and sensors (columns)
A(A==-1)=nan; %//set minus 1's to NaNs as excluding elements
%// Per sensor subtractions
per_sensor_subt = bsxfun(@minus,A,permute(A,[1 3 2]))
%// Set each sensor's subtractions with itself to NaNs to exclude them
own_idx = bsxfun(@plus,bsxfun(@plus,[1:m]',[0:n-1]*numel(A)),[0:n-1]*m);%//'
per_sensor_subt(own_idx)=nan;
%// Linear and row-col-dim3 indices of valid subtractions
idx = find(~isnan(per_sensor_subt))
[x,y,z] = ind2sub(size(per_sensor_subt),idx)
%// Get per sensor output
out = arrayfun(@(n) per_sensor_subt(idx(z==n)),1:n,'un',0)
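The linear-index arithmetic behind own_idx can be hard to read. As a sketch of an equivalent formulation (own_mask is a name introduced here for illustration, not part of the answer above), the same entries can be selected with a logical mask that is true wherever the second and third dimensions coincide:

```matlab
A = [100 110 -1 -1; -1 200 180 -1; -1 200 210 240; 400 -1 -1 450];
[m,n] = size(A);
A(A==-1) = nan;
per_sensor_subt = bsxfun(@minus, A, permute(A,[1 3 2]));   % m-by-n-by-n

% own_mask(i,j,k) is true exactly when j == k (a sensor minus itself)
own_mask = repmat(permute(logical(eye(n)),[3 1 2]), [m 1 1]);
per_sensor_subt(own_mask) = nan;

% Same extraction step as in the answer above
idx = find(~isnan(per_sensor_subt));
[~,~,z] = ind2sub(size(per_sensor_subt), idx);
out = arrayfun(@(k) per_sensor_subt(idx(z==k)), 1:n, 'UniformOutput', false);
```

The mask costs one extra m-by-n-by-n logical array, so with memory already tight the index arithmetic may still be preferable.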
If you would like to calculate C, use this approach -
%// Sort A row-wise
[sortedA,sorted_idx] = sort(A,2)
%// Set all invalid indices to zeros, so that later on we can use `nonzeros`
%// to extract out the valid indices
valid_sorted_idx = sorted_idx.*(sortedA~=-1)
%// Convert to a cell array
valid_sorted_idx_cell = mat2cell(valid_sorted_idx,ones(1,size(A,1)),size(A,2))
%// Extract the valid ones(nonzero indices) for the final output, C
C = cellfun(@(x) nonzeros(x), valid_sorted_idx_cell,'un',0)
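If C is already available, as in the question, the idea of using it to touch only the valid entries of A could look roughly like the following. This is a sketch rather than a tuned solution: it loops over measurements, and growing out{...} inside the loop will be slow for hundreds of thousands of rows.

```matlab
A = [100 110 -1 -1; -1 200 180 -1; -1 200 210 240; 400 -1 -1 450];
C = {[1 2], [3 2], [2 3 4], [1 4]};

num_sensors = size(A,2);
out = cell(num_sensors,1);
for r = 1:numel(C)
    s = C{r};        % sensors that fired in measurement r
    t = A(r,s);      % their times (no -1 entries are touched)
    for k = 1:numel(s)
        others = [1:k-1, k+1:numel(s)];
        % times of the other sensors relative to sensor s(k)
        out{s(k)} = [out{s(k)}; (t(others) - t(k)).'];
    end
end
```

For the example data this gives the same values as in the question, e.g. [10; 50] for sensor 1.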
Suppose I have 2 input vectors x and reset of the same size
x = [1 2 3 4 5 6]
reset = [0 0 0 1 0 0]
and an output y which is the cumulative sum of the elements in x. Whenever the value of reset is 1, the cumulative sum resets and starts all over again, just like below
y = [1 3 6 4 9 15]
How would I implement this in Matlab?
One approach with diff and cumsum -
%// Setup few arrays:
cx = cumsum(x) %// Continuous Cumsumed version
reset_mask = reset==1 %// We want to create a logical array version of
%// reset for use as logical indexing next up
%// Setup ID array of same size as input array and with differences between
%// cumsumed values of each group placed at places where reset==1, 0s elsewhere
%// The groups are the islands of 0s and bordered at 1s in reset array.
id = zeros(size(reset))
diff_values = x(reset_mask) - cx(reset_mask)
id(reset_mask) = diff([0 diff_values])
%// "Under-compensate" the continuous cumsumed version cx with the
%// "grouped diffed cumsum version" to get the desired output
y = cx + cumsum(id)
Here's a way:
result = accumarray(1+cumsum(reset(:)), x(:), [], @(t) {cumsum(t).'});
result = [result{:}];
This works because if the first input to accumarray is sorted, the order within each group of the second input is preserved (more about this here).
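To make the grouping explicit, here is the same computation on the example from the question with the group labels pulled out into their own variable (groups is a name introduced here for illustration):

```matlab
x = [1 2 3 4 5 6];
reset = [0 0 0 1 0 0];

% Every 1 in reset starts a new group: labels are [1 1 1 2 2 2].'
groups = 1 + cumsum(reset(:));

% One cumulative sum per group, each returned as a row vector in a cell
result = accumarray(groups, x(:), [], @(t) {cumsum(t).'});

% Concatenate the per-group results back into a single row
y = [result{:}];
```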
I have 3 vectors, v1, v2, v3. What I want to get is the difference between every possible pair of them, that is, v1-v2, v1-v3, v2-v3. How can I do this without looping in matlab?
Thank you.
Just use nchoosek to generate the combinations first and then use them to index into your array of row-vectors:
Test case:
numVectors = 3;
dim = 5;
Vs = rand(numVectors, dim);
Actual computation:
combs = nchoosek(1:size(Vs,1), 2);
differences = Vs(combs(:,1),:) - Vs(combs(:,2),:);
The above creates 3 random row vectors of dimension 5. So in your case, you may want to replace the creation of the random matrix with Vs = [v1; v2; v3]; if your vectors are row vectors; or transpose the vectors using Vs = [v1, v2, v3].'; if your data are column vectors.
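As a small worked example with fixed values instead of random ones (the vectors below are made up for illustration), each row of differences corresponds to one row of combs:

```matlab
v1 = [1 2];
v2 = [10 20];
v3 = [0 0];
Vs = [v1; v2; v3];                    % one row per vector

combs = nchoosek(1:size(Vs,1), 2);    % [1 2; 1 3; 2 3]
differences = Vs(combs(:,1),:) - Vs(combs(:,2),:);
% Row 1 is v1-v2, row 2 is v1-v3, row 3 is v2-v3
```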
Using bsxfun:
clear
clc
%// Sample vectors.
v1 = [1 2];
v2 = [10 20];
v3 = [0 0];
Out = bsxfun(@minus,[v1 v2 v3], [v1 v2 v3].')
Out =
0 1 9 19 -1 -1
-1 0 8 18 -2 -2
-9 -8 0 10 -10 -10
-19 -18 -10 0 -20 -20
1 2 10 20 0 0
1 2 10 20 0 0
Reasoning: Each difference is computed starting from the 1st element of the 1st vector until the 2nd element of the last vector.
The 1st column contains all the differences for the 1st element of the 1st vector, i.e. (1 -1), (1-2), (1-10), (1 - 20), (1 - 0), (1 - 0).
Then 2nd column, same thing but this time with the 2: (2 - 1), (2 - 2), (2 - 10), and so on.
Sorry if my explanations are unclear haha I don't know the right terms in english. Please ask for more details.
Code
%// Concatenate all vectors to form a 2D array
V = cat(2,v1(:),v2(:),v3(:),v4(:),v5(:))
N = size(V,2) %// number of vectors
%// Find all IDs of all combinations as x,y
[y,x] = find(bsxfun(@gt,[1:N]',[1:N])) %//'
%// OR [y,x] = find(tril(true(size(V,2)),-1))
%// Use matrix indexing to collect vector data for all combinations with those
%// x-y IDs from V. Then, perform subtractions across them for final output
diff_array = V(:,x) - V(:,y)
Few points about the code
bsxfun with find gets us the IDs for forming pairwise combinations.
We use those IDs to index into the 2D concatenated array and perform subtractions between them to get the final output.
Bonus Stuff
If you look closely into the part where it finds the IDs of all combinations, that is basically nchoosek(1:..,2).
So, basically one can have alternatives to nchoosek(1:N,2) as:
[y,x] = find(bsxfun(@gt,[1:N]',[1:N]))
[y,x] = find(tril(true(N),-1))
with [x y] forming those pairwise combinations. It might be interesting to benchmark them!
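A quick way to convince yourself of that equivalence is to compare the outputs directly for a small N (N = 4 here is an arbitrary choice for illustration):

```matlab
N = 4;

% Indices of the strictly lower triangle, found two ways
[y1,x1] = find(bsxfun(@gt, (1:N).', 1:N));
[y2,x2] = find(tril(true(N), -1));

% find walks the matrix column by column, so [x y] comes out in the
% same order as the rows of nchoosek(1:N,2)
pairs_bsxfun = [x1 y1];
pairs_tril   = [x2 y2];
pairs_ref    = nchoosek(1:N, 2);
```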
For my experiment I have 20 categories which contain 9 pictures each. I want to show these pictures in a pseudo-random sequence where the only constraint to randomness is that one image may not be followed directly by one of the same category.
So I need something similar to
r = randi([1 20],1,180);
just with an added constraint of two numbers not directly following each other. E.g.
14 8 15 15 7 16 6 4 1 8 is not legitimate, whereas
14 8 15 7 15 16 6 4 1 8 would be.
An alternative way I was thinking of was naming the categories A,B,C,...T, have them repeat 9 times and then shuffle the bunch. But there you run into the same problem I think?
I am an absolute Matlab beginner, so any guidance will be welcome.
The following uses modulo operations to make sure each value is different from the previous one:
m = 20; %// number of categories
n = 180; %// desired number of samples
x = [randi(m)-1 randi(m-1, [1 n-1])];
x = mod(cumsum(x), m) + 1;
How the code works
In the third line, the first entry of x is a random value between 0 and m-1. Each subsequent entry represents the change that, modulo m, will give the next value (this is done in the fourth line).
The key is to choose that change between 1 and m-1 (not between 0 and m-1), to assure consecutive values will be different. In other words, given a value, there are m-1 (not m) choices for the next value.
After the modulo operation, 1 is added to transform the range of resulting values from 0,...,m-1 to 1,...,m.
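The two properties claimed above (values stay in 1,...,m and no value equals its predecessor) can be checked directly. The variable steps is introduced here only to keep the increments separate from the final sequence:

```matlab
m = 20;     % number of categories
n = 180;    % desired number of samples

% First entry: start value in 0..m-1; remaining entries: steps in 1..m-1
steps = [randi(m)-1, randi(m-1, [1 n-1])];
x = mod(cumsum(steps), m) + 1;

% A step of 1..m-1 modulo m can never land on the same value again
no_repeats = all(diff(x) ~= 0);
in_range   = all(x >= 1 & x <= m);
```

Note that this only enforces the no-repeat constraint. It does not make each category appear exactly 9 times, so if that is required a different construction is needed.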
Test
Take all (n-1) pairs of consecutive entries in the generated x vector and count occurrences of all (m^2) possible combinations of values:
count = accumarray([x(1:end-1); x(2:end)].', 1, [m m]);
imagesc(count)
axis square
colorbar
The following image has been obtained for m=20; n=1e6;. It is seen that all combinations are (more or less) equally likely, except for pairs with repeated values, which never occur.
You could look for the repetitions in an iterative manner and put new set of integers from the same group [1 20] only into those places where repetitions have occurred. We continue to do so until there are no repetitions left -
interval = [1 20]; %// interval from which the random integers are drawn
r = randi(interval,1,180); %// create the first batch of numbers
idx = diff(r)==0; %// logical array, where 1s denote repetitions
while any(idx)
rN = randi(interval,1,nnz(idx)); %// new set of random integers to be placed
%// at the positions where repetitions have occurred
r(find(idx)+1) = rN; %// place random integers at their respective positions
idx = diff(r)==0; %// re-check for repetitions in the updated sequence
end
I have a matrix with constant consecutive values randomly distributed throughout the matrix. I want the indices of the consecutive values, and further, I want a matrix of the same size as the original matrix, where the number of consecutive values are stored in the indices of the consecutive values. For Example
original_matrix = [1 1 1;2 2 3; 1 2 3];
output_matrix = [3 3 3;2 2 0;0 0 0];
I have struggled mightily to find a solution to this problem. It has relevance for meteorological data quality control. For example, if I have a matrix of temperature data from a number of sensors, and I want to know what days had constant consecutive values, and how many days were constant, so I can then flag the data as possibly faulty.
temperature matrix is number of days x number of stations and I want an output matrix that is also number of days x number of stations, where the consecutive values are flagged as described above.
If you have a solution to that, please provide! Thank you.
For this kind of problem, I made my own utility function runlength:
function RL = runlength(M)
% calculates length of runs of consecutive equal items along columns of M
% work along columns, so that you can use linear indexing
% find locations where items change along column
jumps = diff(M) ~= 0;
% add implicit jumps at start and end
ncol = size(jumps, 2);
jumps = [true(1, ncol); jumps; true(1, ncol)];
% find linear indices of starts and stops of runs
ijump = find(jumps);
nrow = size(jumps, 1);
istart = ijump(rem(ijump, nrow) ~= 0); % remove fake starts in last row
istop = ijump(rem(ijump, nrow) ~= 1); % remove fake stops in first row
rl = istop - istart;
assert(sum(rl) == numel(M))
% make matrix of 'derivative' of runlength
% don't need last row, but needs same size as jumps for indices to be valid
dRL = zeros(size(jumps));
dRL(istart) = rl;
dRL(istop) = dRL(istop) - rl;
% remove last row and 'integrate' to get runlength
RL = cumsum(dRL(1:end-1,:));
It only works along columns since it uses linear indexing. Since you want to do something similar along rows, you need to transpose back and forth, so you could use it for your case like so:
>> original = [1 1 1;2 2 3; 1 2 3];
>> original = original.'; % transpose, since runlength works along columns
>> output = runlength(original);
>> output = output.'; % transpose back
>> output(output == 1) = 0; % see hitzg's comment
>> output
output =
3 3 3
2 2 0
0 0 0
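For comparison, a plain loop over rows that produces the same matrix for this example may be easier to follow, although it is likely to be slower than runlength on large data. This is only a sketch; starts, stops and len are names introduced here:

```matlab
original = [1 1 1; 2 2 3; 1 2 3];
output = zeros(size(original));

for r = 1:size(original,1)
    row = original(r,:);
    starts = [1, find(diff(row) ~= 0) + 1];     % first index of each run
    stops  = [starts(2:end) - 1, numel(row)];   % last index of each run
    for k = 1:numel(starts)
        len = stops(k) - starts(k) + 1;
        if len > 1                              % runs of length 1 stay 0
            output(r, starts(k):stops(k)) = len;
        end
    end
end
```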
I have a set of data points in a vector. For example,
[NaN, NaN, NaN, -1.5363, NaN -1.7664, -1.7475];
These data result from code which selects 3 points within a specified range (specifically, -0.6 and 0.6). If three points from the column do not exist in this range, the range is incrementally expanded until three points are found. In the above example, the range was increased to -1.8 to 1.8. However, the data we are analyzing is erratic, and has random peaks and troughs, leading to points which are non-contiguous being accepted into the range (element 3 is chosen to be valid, but not element 4).
What would be the best way to go about this? I already have a code to incrementally increase the range to find three points, I just need to modify it to not stop at any three points, but to increase the range until it finds three CONTIGUOUS points. If that were done for the above example, I would just evaluate slopes to remove the 3rd element (since between 3 and 4, the slope is negative).
Thanks.
Assuming your data as provided in the example is in the variable x, you can use isnan and findstr like so:
x = [NaN, NaN, NaN, -1.5363, NaN -1.7664, -1.7475, 123];
~isnan(x)
ans =
0 0 0 1 0 1 1 1
pos = findstr(~isnan(x), [1 1 1]);
The reason for using findstr like this is that we would like to find the sequence [1 1 1] within the logical array ~isnan(x), and findstr will return the indices of the positions in the input array where this sequence appears.
For your example data, this will return [], but if you change it to the data in the example I have given, it will return 6, and you can extract the contiguous region with x(pos:pos+2). You will have to be a bit careful about cases where there are more than 3 contiguous values (if there were 4, it would return [6 7]) and the cases where there is more than one contiguous region. If you don't need to do anything meaningful with these cases then just use pos(1).
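Newer MATLAB releases flag findstr as not recommended. One alternative that avoids string functions altogether is a sliding-window sum with conv; this is a swapped-in technique, not what the answer above uses, and valid and window_sum are names introduced here:

```matlab
x = [NaN, NaN, NaN, -1.5363, NaN, -1.7664, -1.7475, 123];

valid = ~isnan(x);                                   % 0 0 0 1 0 1 1 1
% Sum of each window of 3 consecutive flags; 3 means all three are valid
window_sum = conv(double(valid), ones(1,3), 'valid');
pos = find(window_sum == 3);                         % start of each valid triple

if ~isempty(pos)
    first_triple = x(pos(1):pos(1)+2);
end
```

As with findstr, pos holds one entry per window, so a run of four valid points yields two consecutive start positions.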
If you want to extract the entirety of the first contiguous region whose length is greater than or equal to 3, you could do something like:
x = [NaN, NaN, NaN, -1.5363, NaN -1.7664, -1.7475, 123, 456, 789];
startPos = [];
stopPos = [];
pos = findstr(~isnan(x), [1 1 1]);
if ~isempty(pos)
startPos = pos(1);
stopPos = startPos + 2;
% Find any cases where we have consecutive numbers in pos
if length(pos) > 1 && any(diff(pos) == 1)
% We have a contiguous section longer than 3 elements
% Find the NaNs
nans = find(isnan(x));
% Find the first NaN after pos(1), or the index of the last element
stopPos = nans(nans > startPos);
if ~isempty(stopPos)
stopPos = stopPos(1) - 1; % Don't want the NaN
else
stopPos = length(x);
end
end
end
x(startPos:stopPos)