mean squared displacement from multiple trajectories - matlab

I have a matrix of multiple particle trajectories that I would like to analyze separately. The trajectory number is one of the columns of the matrix, so I am trying to sort based on that number. I am using some of the code from this answer: MSD with matlab (which was very helpful, thank you!) to calculate MSD, but I am having difficulty parsing out the individual trajectories.
To explain in more detail: my trajectory outputs are in matrix format, with one column for trajectory number, one column for x-position, one column for y-position, etc. I want to calculate the mean-squared displacement for each trajectory, so I need a way to distinguish data points by trajectory number (which is listed in column 7 of mymatrix). This seems to be where I am having trouble. The important columns in this matrix are 1: x-position, 2: y-position and 7: trajectory number. So far I have
total_rows=size(mymatrix,1);
max_trajectory_number=mymatrix(total_rows,7);
nData=0;
msd=zeros(total_rows, 4)
for i=0:max_trajectory_number
trajectornumber= mymatrix(i,7);
if trajectorynumber.equals(i)
nData=nData+1; %counts the number of instances of this trajectory number, which is the number of data points in the trajectory
for dt = 1:nData
deltaCoords = mymatrix(1+dt:end,1:2) - traj0mat(1:end-dt,1:2); %calculates time-averaged MSD based on y and y positions in colums 1 and 2 respectively
squaredDisplacement = sum(deltaCoords.^2,2); %# dx^2+dy^2+dz^2
msd(dt,1) = trajectorynumber; %trajectory number
msd(dt,2) = mean(squaredDisplacement); %# average
msd(dt,3) = std(squaredDisplacement); %# std
msd(dt,4) = length(squaredDisplacement); %# n
end
end
Unfortunately, when I run this on mymatrix, the resulting msd matrix remains all zeros. I think this is likely due to an error in sorting based on the trajectory number. I do not get an error, just not the results I was looking for.
If anyone has any suggestions on how to fix this, it would be greatly appreciated.

It looks like you want to bundle all rows identified by the same trajectory number. I assume that they show up in chronological order as you continue down a column. Then try something like
tnumbs = unique(mymatrix(:,7)); % identify unique trajectory numbers
dt = 1; % time lag in rows; not defined in the original snippet, so 1 is an assumed value
msd = zeros(length(tnumbs), 4); % one row per trajectory
for i=1:length(tnumbs) % loop through trajectories
    icurr = find(mymatrix(:,7)==tnumbs(i)); % find indices to entries for current trajectory
    % perform your averaging
    deltaCoords = mymatrix(icurr(1+dt:end),1:2) - mymatrix(icurr(1:end-dt),1:2); % displacements over lag dt from the x and y positions in columns 1 and 2
    squaredDisplacement = sum(deltaCoords.^2,2); % dx^2+dy^2
    msd(i,1) = tnumbs(i); % trajectory number
    msd(i,2) = mean(squaredDisplacement); % average
    msd(i,3) = std(squaredDisplacement); % std
    msd(i,4) = length(squaredDisplacement); % n
end
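To get a full MSD curve (every time lag) for each trajectory rather than a single lag, the two ideas can be combined: select the rows of one trajectory, then loop over dt inside it. The following is only a sketch under assumptions: the small mymatrix built here is synthetic test data, and msdAll is a hypothetical name for a cell array holding one result table per trajectory.

```matlab
% Synthetic stand-in for mymatrix: columns 1-2 are x/y, column 7 is the trajectory number.
mymatrix = zeros(7, 7);
mymatrix(:,1) = [0; 1; 2; 3; 0; 0; 0];   % x positions
mymatrix(:,2) = [0; 0; 0; 0; 0; 2; 4];   % y positions
mymatrix(:,7) = [1; 1; 1; 1; 2; 2; 2];   % trajectory numbers

tnumbs = unique(mymatrix(:,7));          % unique trajectory numbers
msdAll = cell(numel(tnumbs), 1);         % one result table per trajectory (hypothetical name)

for i = 1:numel(tnumbs)
    xy = mymatrix(mymatrix(:,7) == tnumbs(i), 1:2);  % rows of this trajectory only
    nLags = size(xy, 1) - 1;                         % largest usable time lag
    msd = zeros(nLags, 4);
    for dt = 1:nLags
        deltaCoords = xy(1+dt:end, :) - xy(1:end-dt, :);  % displacements over lag dt
        squaredDisplacement = sum(deltaCoords.^2, 2);     % dx^2 + dy^2
        msd(dt, :) = [tnumbs(i), mean(squaredDisplacement), ...
                      std(squaredDisplacement), numel(squaredDisplacement)];
    end
    msdAll{i} = msd;                                 % columns: trajectory, mean, std, n
end
```

A cell array is used because trajectories generally have different lengths, so their MSD tables have different numbers of rows. This still assumes the rows of each trajectory appear in chronological order.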

Related

How can I improve code that finds the n-samples vector subset meeting certain criteria?

In Matlab, given a vector A (please, find it here: https://www.dropbox.com/s/otropedwxj0lki7/A.mat?dl=0 ), how could I find the n-samples vector subset with the smallest range (or standard deviation)?
I am trying a potential solution: reshaping the vector into columns, computing the range of each column and selecting the smallest. However, reshape does not always work well when applied to other examples with different lengths. How could this be worked around in an easier and more efficient way?
Fs = 1000; % sampling frequency
time = round(length(A)/Fs)-1; % calculate approximated rounded total length in time
A_reshaped = reshape(A(1:time*Fs), [], time/2); % reshape A (deleting some samples at the end) in time/2 columns
D(1,:) = mean(A_reshaped);
D(2,:) = range(A_reshaped);
[~,idx] = min(D(2,:));
Value = D(1,idx);
Any help is much appreciated.
To find the n samples with minimum range, you can sort the vector and subtract the first section of the sorted vector from the last section. Each difference is the range of one run of n consecutive sorted values. Then use the index of the minimum to find the n samples:
n=4
a= rand(1,10);
s= sort(a);
[~,I]=min(s(n:end)-s(1:end-n+1))
result = s(I:I+n-1)
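If the positions of the selected samples in the original vector are also needed, the second output of sort can be carried along. This is a small sketch with a fixed vector instead of rand so the outcome can be checked by hand; order and idx are names introduced here for illustration.

```matlab
a = [5 1 9 2.2 2 8 2.1 7 3 6];   % fixed example data (hypothetical values)
n = 3;                           % size of the subset to find

[s, order] = sort(a);            % order(k) is the position of s(k) in a
% Range of every run of n consecutive sorted values.
ranges = s(n:end) - s(1:end-n+1);
[~, I] = min(ranges);

result = s(I:I+n-1);             % the n values with the smallest range
idx = order(I:I+n-1);            % where those values sit in the original vector
```

Note that this picks the n closest values regardless of where they occur in the vector. If the n samples had to be consecutive in time, a sliding window over the unsorted data would be needed instead.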

Only plot lines of specific length

Below is an image showing a contour plot with areas of interest that have been connected up using their centroids. What I want to achieve is that only lines of a certain length are plotted. Currently, every point has a line drawn to every other point.
C=contourf(K{i});
[Area,Centroid] = Contour2Area(C);
% This converts any entries that are negative into a positive value
% of the same magnitude
indices{i} = find( Centroid < 0);
Centroid(indices{i})=Centroid(indices{i}) * -1; %set all
% Does the same but for positive (+500)
indices{i} = find( Area > 500);
Area(indices{i})=0;
[sortedAreaVal, sortedAreaInd] = sort(Area, 'descend');
maxAreaVals = sortedAreaVal(1:10)';
maxAreaInd = sortedAreaInd(1:10)';
xc=Centroid(1,:); yc=Centroid(2,:);
hold on; plot(xc,yc,'-');
It would be very useful if there was a way of only plotting the lines that fall below a specific threshold. The next step will be to label and measure each line. Thanks in advance for your time.
If xc and yc are the x and y coordinates of the centroids, then you could do something like this:
d = sqrt(sum(diff([xc(:),yc(:)],1).^2,2))
What this does is take the difference between successive [x,y] data points, then calculate the Euclidean distance between them. You then have all the information you need to select the ones you want and label the lengths.
One thing though, this will only compute distances between successive centroids. I wrote it this way because it appears that's what you're trying to do above. If you are interested in finding out the distances between all centroids, you would have to loop through and compute the distances.
Something along the lines of:
for i=1:length(xc)-1
    for j=i+1:length(xc)
        d = sqrt((xc(i)-xc(j))^2 + (yc(i)-yc(j))^2); % distance between centroids i and j
    end
end
Hope this helps.
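Putting the loop together with a threshold test gives one way to draw only the short connections and keep their lengths for labelling later. This is a sketch, not a drop-in solution: the coordinates are made-up stand-ins for Centroid(1,:) and Centroid(2,:), and maxLen and pairs are hypothetical names.

```matlab
% Hypothetical centroid coordinates standing in for Centroid(1,:) and Centroid(2,:).
xc = [0 1 5 6];
yc = [0 0 0 1];
maxLen = 2;                 % assumed length threshold, in the same units as xc/yc

pairs = zeros(0, 3);        % each row: [i, j, distance]
hold on;
plot(xc, yc, 'o');          % mark the centroids themselves
for i = 1:numel(xc)-1
    for j = i+1:numel(xc)
        d = sqrt((xc(i)-xc(j))^2 + (yc(i)-yc(j))^2);
        if d < maxLen
            plot(xc([i j]), yc([i j]), '-');   % draw only the short connections
            pairs(end+1, :) = [i, j, d];       % remember the pair and its length
        end
    end
end
hold off;
```

The pairs matrix then holds everything needed to label or measure each plotted line.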

Computing a moving average

I need to compute a moving average over a data series, within a for loop. I have to get the moving average over N=9 days. The array I'm computing in is 4 series of 365 values (M), which itself are mean values of another set of data. I want to plot the mean values of my data with the moving average in one plot.
I googled a bit about moving averages and the "conv" command and found something which I tried implementing in my code:
hold on
for ii=1:4;
M=mean(C{ii},2)
wts = [1/24;repmat(1/12,11,1);1/24];
Ms=conv(M,wts,'valid')
plot(M)
plot(Ms,'r')
end
hold off
So basically, I compute my mean and plot it with a (wrong) moving average. I picked the "wts" value right off the mathworks site, so that is incorrect. (source: http://www.mathworks.nl/help/econ/moving-average-trend-estimation.html) My problem though, is that I do not understand what this "wts" is. Could anyone explain? If it has something to do with the weights of the values: that is invalid in this case. All values are weighted the same.
And if I am doing this entirely wrong, could I get some help with it?
My sincerest thanks.
There are two more alternatives:
1) filter
From the doc:
You can use filter to find a running average without using a for loop.
This example finds the running average of a 16-element vector, using a
window size of 5.
data = [1:0.2:4]';
windowSize = 5;
filter(ones(1,windowSize)/windowSize,1,data)
2) smooth as part of the Curve Fitting Toolbox (which is available in most cases)
From the doc:
yy = smooth(y) smooths the data in the column vector y using a moving
average filter. Results are returned in the column vector yy. The
default span for the moving average is 5.
%// Create noisy data with outliers:
x = 15*rand(150,1);
y = sin(x) + 0.5*(rand(size(x))-0.5);
y(ceil(length(x)*rand(2,1))) = 3;
%// Smooth the data using the loess and rloess methods with a span of 10%:
yy1 = smooth(x,y,0.1,'loess');
yy2 = smooth(x,y,0.1,'rloess');
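For option 1, it may help to see how the filter result relates to conv: with these arguments, filter returns the first length(data) samples of the full convolution, so the first windowSize-1 outputs are averages over a partly empty window. A minimal sketch follows; yFilter, yFull and firstFull are names introduced here.

```matlab
data = (1:0.2:4)';                       % same 16-element vector as in the doc example
windowSize = 5;
b = ones(1, windowSize) / windowSize;    % equal weights summing to 1

yFilter = filter(b, 1, data);            % causal running average, same length as data
yFull = conv(data, b(:));                % full convolution, length 16 + 5 - 1 = 20

% Only from sample windowSize onward does the output average a complete window.
firstFull = mean(data(1:windowSize));    % equals yFilter(windowSize)
```

Because filter is causal, its output lags the data by about half a window compared with a centred moving average.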
In 2016 MATLAB added the movmean function that calculates a moving average:
N = 9;
M_moving_average = movmean(M,N)
Using conv is an excellent way to implement a moving average. In the code you are using, wts is how much you are weighting each value (as you guessed). The sum of that vector should always be equal to one. If you wish to weight each value evenly and use a moving filter of size N, then you would want to do
N = 7;
wts = ones(N,1)/N;
sum(wts) % result = 1
Using the 'valid' argument in conv will result in having fewer values in Ms than you have in M. Use 'same' if you don't mind the effects of zero padding. If you have the signal processing toolbox you can use cconv if you want to try a circular moving average. Something like
N = 7;
wts = ones(N,1)/N;
cconv(x,wts,length(x)); % third argument is the output length, so use the signal length
should work.
You should read the conv and cconv documentation for more information if you haven't already.
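Applied to the question's setup (N = 9 days on a 365-value series), the equal-weight approach might look like the sketch below. The series M here is a placeholder ramp, not the asker's data, and half and tValid are names introduced to line the shorter 'valid' result up with the original samples.

```matlab
M = (1:365)';                    % placeholder series standing in for mean(C{ii},2)
N = 9;                           % window length from the question
wts = ones(N, 1) / N;            % equal weights summing to 1

MsValid = conv(M, wts, 'valid'); % 365 - 9 + 1 = 357 values, no edge effects
MsSame  = conv(M, wts, 'same');  % 365 values, edges pulled toward zero by padding

half = (N - 1) / 2;              % 4 samples lost on each side with 'valid'
tValid = (1+half : numel(M)-half)';   % x positions that centre MsValid on M

plot(M); hold on;
plot(tValid, MsValid, 'r');      % moving average aligned with the data
hold off;
```

Plotting MsValid against tValid avoids the horizontal shift that appears when the shorter vector is plotted against its own index.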
I would use this:
% does moving average on signal x, window size is w
function y = movingAverage(x, w)
k = ones(1, w) / w; % equal weights that sum to 1
y = conv(x, k, 'same');
end
ripped straight from here.
To comment on your current implementation: wts is the weighting vector. The one from the MathWorks page is a 13-point weighted average in which the first and last points get half the weight (1/24) of the others (1/12).

maximum points detection in a multiple plot MATLAB

I have 2 FFT spectrums on a plot. I want to get the top 5 maximum points of the overall plot. I get the maximum points separately for each spectrum. How can I combine these spectrums into one and get the overall maximum 5 points?
You have two separate matrices of maxima: let's call them Max1 and Max2.
Now combine both of them to form a third matrix:
Max3 = [Max1 Max2]
Sort Max3 in descending order:
Max3 = sort(Max3,'descend');
Extract the first 5 elements:
peaks = Max3(1:5)
Put the spectra in one vector and sort them in descending order.
spec1 = fft(x1); % a spectrum (column vector)
spec2 = fft(x2); % another spectrum (column vector)
dummy = abs([spec1; spec2]); % concatenate absolute values
sorted = sort(dummy, 'descend');
five_greatest = sorted(1:5);
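To know where each of the five values came from, and not only how large it is, the second output of sort can be used. This is a sketch with made-up magnitude vectors; mag1, mag2, fromSecond and bin are hypothetical names, and in practice the inputs would be abs(fft(x1)) and abs(fft(x2)).

```matlab
% Hypothetical magnitude spectra (column vectors).
mag1 = [1; 9; 3; 7; 2];
mag2 = [8; 4; 10; 6; 5];

combined = [mag1; mag2];                      % stack both spectra
[sortedVals, order] = sort(combined, 'descend');
topVals = sortedVals(1:5);                    % five largest magnitudes
topIdx = order(1:5);                          % their positions in the stacked vector

fromSecond = topIdx > numel(mag1);            % true where the value came from mag2
bin = topIdx - fromSecond * numel(mag1);      % index within the originating spectrum
```

This selects the five largest samples, which can be neighbours on the same peak. If distinct local maxima are wanted, findpeaks from the Signal Processing Toolbox may be a better starting point.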

How to connect a 3D points with a distance threshold Matlab

I have a set of 3D points, let's say A, as shown below:
A=[
-0.240265581092000 0.0500598627544876 1.20715641293013
-0.344503191645519 0.390376667574812 1.15887540716612
-0.0931248606994074 0.267137193112796 1.24244644549763
-0.183530493218807 0.384249186312578 1.14512014134276
-0.0201358671977785 0.404732019283683 1.21816745283019
-0.242108038906952 0.229873488902244 1.24229940627651
-0.391349107031230 0.262170158259873 1.23856838565023
]
What I want to do is connect with lines only those pairs of 3D points whose distance is less than a specific threshold T. I want to get a list of the pairs of points that need to be connected, such as
[
( -0.240265581092000 0.0500598627544876 1.20715641293013), (-0.344503191645519 0.390376667574812 1.15887540716612);
(-0.0931248606994074 0.267137193112796 1.24244644549763),(-0.183530493218807 0.384249186312578 1.14512014134276),.....
]
So, as shown, I'll have a list of the pairs of points that need to be connected. Could anyone please advise how this can be done in Matlab?
The following example demonstrates how to accomplish this.
%# Build an example matrix
A = [1 2 3; 0 0 0; 3 1 3; 2 0 2; 0 1 0];
Threshold = 3;
%# Calculate distance between all points
D = pdist2(A, A);
%# Discard any points with distance greater than threshold
D(D > Threshold) = nan;
If you wish to extract an index of all observation pairs that are linked by a distance less than (or equal to) Threshold, as well as the corresponding distance (your question didn't specify what form you wanted the output to take, so I am essentially guessing here), then instead use the following:
%# Obtain a list of linear indices of observations less than or equal to TH
I1 = find(D <= Threshold);
%#Extract the actual distances, as well as the corresponding observation indices from A
[Obs1Index, Obs2Index] = ind2sub(size(D), I1);
DList = [Obs1Index, Obs2Index, D(I1)];
Note, pdist2 uses Euclidean distance by default, but there are other options - see the documentation here.
UPDATE: Based on the OP's comments, the following code will express the output as a K*6 matrix, where K is the number of distance measures less than the threshold value, and the first three columns of each row is the first data point (3 dimensions) and the second three columns of each row is the connected data point.
DList2 = [A(Obs1Index, :), A(Obs2Index, :)];
SECOND UPDATE: I have not made any assumptions on the distance measure in this answer. That is, I'm deliberately using pdist2 in case your distance measure is not symmetric. However, if you are using a symmetric distance measure, then you could probably speed up the run-time by using pdist instead, although my indexing code would need to be adjusted accordingly.
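For a symmetric distance such as the Euclidean one, the list above contains every pair twice, (i,j) and (j,i), plus each point paired with itself at distance 0. One way to keep each pair once is to mask the strict upper triangle. The sketch below assumes Euclidean distance and computes it by hand so that it does not rely on pdist2; mask and diffK are names introduced here.

```matlab
A = [1 2 3; 0 0 0; 3 1 3; 2 0 2; 0 1 0];   % example points from the answer above
Threshold = 3;

% Pairwise Euclidean distances without toolbox functions.
n = size(A, 1);
D = zeros(n);
for k = 1:size(A, 2)
    diffK = A(:,k) * ones(1, n) - ones(n, 1) * A(:,k)';  % coordinate differences
    D = D + diffK.^2;
end
D = sqrt(D);

% Strict upper triangle: each pair appears once and no point pairs with itself.
mask = triu(true(n), 1) & (D <= Threshold);
[Obs1Index, Obs2Index] = find(mask);
DList = [Obs1Index, Obs2Index, D(mask)];       % index pair and distance
DList2 = [A(Obs1Index, :), A(Obs2Index, :)];   % K*6 matrix of connected points
```

For this example the result has 6 rows instead of the 17 produced by testing the full matrix.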
Plot3 and pdist2 can be used to achieve what you want.
D=pdist2(A,A);
T=0.2;
for i=1:size(A,1)
for j=i+1:size(A,1)
if D(i,j)<T && D(i,j)~=0
i
j
plot3(A([i j],1),A([i j],2),A([i j],3));
hold on;
fprintf('line is plotted\n');
pause;
end
end
end