Getting a moving average in NetLogo between ticks

I am computing a standard deviation at each tick. I'd like to get the mean of this through time with a moving average. Is there a smart built-in way to do this, or should I create a list and a function that computes the MA each tick?
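As far as I know, NetLogo has no built-in moving-average primitive, so the approach you suggest is the usual one: keep a list of recent values (e.g. with fput and sublist) and take its mean each tick. Here is a minimal sketch of that fixed-window idea, shown in Python for illustration, with a hypothetical window of 50 ticks:

from collections import deque

# A fixed window of the last 50 values; 50 is a hypothetical choice.
window = deque(maxlen=50)

def record_tick(std_dev):
    """Push this tick's standard deviation and return the moving average."""
    window.append(std_dev)            # the oldest value drops out automatically
    return sum(window) / len(window)  # mean over at most the last 50 values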

Related

Summation of a time-varying variable's values in Modelica

In my model I have the evolution of a variable over time, let's say the electric power, and I want to calculate the summation (integral) of the total electric power over the simulation time. If anyone has an idea of how to do this, I would be thankful.
The best choice would be to declare a variable with a meaningful name, say energy, and in the equation section of the model's top level assign der(energy) = electricPower. This gives you the continuous summation of the electric power variable; in other words, it integrates the power variable across the simulation time.
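For a sampled signal rather than a continuous model, the same idea reduces to numerical integration of the power samples. A small sketch of that discrete analogue in Python, with made-up sample data:

import numpy as np

t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])          # sample times in seconds (made up)
power = np.array([0.0, 10.0, 20.0, 20.0, 10.0])  # sampled power in watts (made up)

# Trapezoidal rule: the discrete counterpart of what der(energy) = electricPower
# accumulates continuously inside the Modelica solver.
energy = np.sum(np.diff(t) * (power[:-1] + power[1:]) / 2.0)
print(energy, "joules")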

How to calculate an exponentially weighted moving average on a conditional basis?

I am using MATLAB R2020a on macOS. I have a signal 'cycle_periods' consisting of the cycle periods of an ECG signal, on which I would like to perform an exponentially weighted mean such that older values are weighted less than newer ones. However, I would like this to be done element by element: a given element is only included in the overall weighted mean if the weighted mean with the current sample does not exceed 1.5 times, or go below 0.5 times, the weighted mean without the element.
I have used the dsp.MovingAverage function as shown below to calculate the weighted mean, but I am unsure how to adapt it to include my conditions.
% Exponentially weighted moving mean for stable cycle periods
movavgExp = dsp.MovingAverage('Method', 'Exponential weighting', 'ForgettingFactor', 0.1);
mean_cycle_period_exp = movavgExp(cycle_period_stable);
I would very much appreciate any help regarding this matter, thanks in advance.
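One way to impose such a condition is to step through the samples yourself instead of calling dsp.MovingAverage, since the accept/reject test has to run before each update. Here is a sketch of that idea in Python, using the plain EWMA recurrence with a hypothetical smoothing weight alpha (not numerically identical to MATLAB's forgetting-factor weighting) and assuming positive-valued cycle periods:

def conditional_ewma(samples, alpha=0.1):
    """Exponentially weighted mean that skips any sample which would push
    the mean above 1.5x or below 0.5x its value without that sample."""
    mean = None
    means = []
    for x in samples:
        if mean is None:
            mean = x  # seed with the first sample
        else:
            candidate = alpha * x + (1 - alpha) * mean
            # Include the sample only if the updated mean stays in range.
            if 0.5 * mean <= candidate <= 1.5 * mean:
                mean = candidate
        means.append(mean)
    return means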

Predicting the difference or the quotient?

For a time series forecasting problem, I noticed some people try to predict the difference or the quotient. For instance, in trading, we can try to predict the price difference P_{t-1} - P_t or the price quotient P_{t-1}/P_t, so that we get a more stationary problem. With a recurrent neural network for a regression problem, trying to predict the price difference can be a real pain if the price does not change sufficiently fast, because the model will mostly predict zero at each step.
Questions:
What are the advantages and disadvantages of using the difference or the quotient instead of the whole quantity?
What would be a nice tool to get rid of the repetitive zeros in a problem like predicting the price movement?
If the assumption is that the price is stationary (P_t = const), then predict the whole quantity.
If the assumption is that the price increase is stationary (P_t = P_{t-1} + const), then predict the absolute difference P_t - P_{t-1}. (Note: this is the ARIMA model with a degree of differencing d = 1.)
If the assumption is that the price growth (in percentage) is stationary (P_t = P_{t-1} + const * P_{t-1}), then predict the relative difference P_t / P_{t-1}.
If the price changes rarely (i.e. the absolute or relative difference is most often zero), then try to predict the time interval between two changes rather than the price itself, as in the sketch below.
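To make the four cases concrete, here is a small sketch in Python with a made-up price series showing the corresponding prediction targets:

import numpy as np

p = np.array([100.0, 100.5, 100.5, 101.0, 102.0])  # hypothetical prices

diff = p[1:] - p[:-1]    # absolute difference: the ARIMA d = 1 target
ratio = p[1:] / p[:-1]   # relative difference: the growth-rate target
# For rarely changing prices, predict the gaps between change points instead:
change_gaps = np.diff(np.flatnonzero(diff != 0))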

Determine intervals from kernel density estimation

I have 1-dimensional data: the time (t) users spend to complete a task. I applied the kernel density estimator from http://www.mathworks.com/matlabcentral/fileexchange/14034-kernel-density-estimator to remove the outliers who spent an unreasonable amount of time. I used the following lines:
[bandwidth,density,xmesh]=kde(dur1);
plot(xmesh,density);
After applying KDE, I have the problem of defining the local minima at which to split the data. The following link shows what the curve looks like:
http://s23.postimg.org/6aa1748jf/kde.jpg
I expect to see three clusters, where the middle one contains the reasonable spent time. However, the curve I have got has only one peak.
I am wondering if the steps I am following are correct.
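For comparison, here is how the same pipeline could look with SciPy's gaussian_kde plus a local-minima search (dur1 below is stand-in random data). If the estimated density really has only one peak, the bandwidth may simply be too large for the three clusters to show, and the minima search returns nothing:

import numpy as np
from scipy.signal import argrelmin
from scipy.stats import gaussian_kde

dur1 = np.random.lognormal(mean=3.0, sigma=0.5, size=500)  # stand-in durations

kde = gaussian_kde(dur1)  # bandwidth chosen by Scott's rule; try
                          # kde.set_bandwidth(...) if it oversmooths
xmesh = np.linspace(dur1.min(), dur1.max(), 512)
density = kde(xmesh)

# Local minima of the density are natural thresholds between modes.
minima = argrelmin(density)[0]
print(xmesh[minima])      # candidate split points (empty if there is one peak)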

How can I perform K-means clustering on time series data?

How can I do K-means clustering of time series data?
I understand how this works when the input data is a set of points, but I don't know how to cluster a time series of size 1×M, where M is the data length. In particular, I'm not sure how to update the mean of a cluster for time series data.
I have a set of labelled time series, and I want to use the K-means algorithm to check whether I will get back similar labels or not. My X matrix will be N×M, where N is the number of time series and M is the data length, as mentioned above.
Does anyone know how to do this? For example, how could I modify this k-means MATLAB code so that it would work for time series data? Also, I would like to be able to use different distance metrics besides Euclidean distance.
To better illustrate my doubts, here is the code I modified for time series data:
n = size(X,1); % number of time series (defined earlier in the full code)
% Check if the second input is a set of centroids
if ~isscalar(k)
    c = k;
    k = size(c,1);
else
    c = X(ceil(rand(k,1)*n),:); % pick k random series as initial centroids
end
% Allocate variables
g0 = ones(n,1);
gIdx = zeros(n,1);
D = zeros(n,k);
% Main loop: converged when the partition no longer changes
while any(g0 ~= gIdx)
    g0 = gIdx;
    % Euclidean distance from every series to every centroid
    for t = 1:k
        for s = 1:n
            D(s,t) = sqrt(sum((X(s,:) - c(t,:)).^2));
        end
    end
    % Assign each series to its closest centroid
    [~, gIdx] = min(D, [], 2);
    % Update each centroid as the pointwise mean of its partition
    for t = 1:k
        % Is this how we calculate the new mean of the time series?
        c(t,:) = mean(X(gIdx==t,:), 1); % dim 1 keeps single-member clusters intact
    end
end
Time series are usually high-dimensional, and you need a specialized distance function to compare them for similarity. Plus, there might be outliers.
k-means is designed for low-dimensional spaces with a (meaningful) Euclidean distance. It is not very robust towards outliers, as it puts squared weight on them.
Using k-means on time series data doesn't sound like a good idea to me. Try looking into more modern, robust clustering algorithms. Many will allow you to use arbitrary distance functions, including time series distances such as DTW.
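For reference, DTW itself is short to write down. A minimal sketch of the classic dynamic-programming form in Python, with absolute difference as the local cost (a real application would add a warping-window constraint or use a library implementation):

import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D series,
    O(len(a) * len(b)) in time and memory."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)  # D[i, j]: best cost aligning a[:i] with b[:j]
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # step in a only
                                 D[i, j - 1],      # step in b only
                                 D[i - 1, j - 1])  # step in both
    return D[n, m]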
It's probably too late for an answer, but:
k-means can be used to cluster longitudinal data
Anony-Mousse is right, DTW distance is the way to go for time series
The methods above use R. You'll find more methods by searching, e.g., for "Iterative Incremental Clustering of Time Series".
I have recently come across the kml R package which claims to implement k-means clustering for longitudinal data. I have not tried it out myself.
Also, the paper Time-series clustering - A decade review by S. Aghabozorgi, A. S. Shirkhorshidi and T. Ying Wah might be useful for seeking out alternatives. Another nice paper, although somewhat dated, is Clustering of time series data - a survey by T. Warren Liao.
If you did really want to use clustering, then depending on your application you could generate a low-dimensional feature vector for each time series. For example, use the time series mean, standard deviation, dominant frequency from a Fourier transform, etc. This would be suitable for use with k-means, but whether it gives useful results depends on your specific application and the content of your time series.
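A sketch of what such a feature vector could look like in Python (the choice of features is application-specific; these three are just the ones named above):

import numpy as np

def feature_vector(x, fs=1.0):
    """Summarize one series by its mean, standard deviation, and
    dominant frequency (DC component removed before the FFT)."""
    spectrum = np.abs(np.fft.rfft(x - np.mean(x)))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return np.array([np.mean(x), np.std(x), freqs[np.argmax(spectrum)]])

# Stack one row per series to get an N x 3 matrix suitable for k-means:
# F = np.vstack([feature_vector(row) for row in X])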
I don't think k-means is the right way for it either. As Anony-Mousse suggested, you can utilize DTW. In fact, I had the same problem for one of my projects and I wrote my own class for that in Python. The logic is:
Create all possible combinations of k cluster centers out of the n series. The number of combinations is n! / (k! (n-k)!), i.e. n choose k; each combination is a set of potential centers.
For each series, calculate the distance to each center in each candidate combination and assign the series to the closest one.
For each candidate combination, calculate the total distance within the individual clusters.
Choose the combination with the minimum total distance.
And, the Python implementation is here if you're interested.
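In case that link goes stale, here is a rough sketch of the brute-force logic described above, reusing the dtw_distance sketch from the earlier answer; note that enumerating n choose k combinations is only feasible for small n:

import itertools
import numpy as np

def brute_force_clusters(series, k, dist=dtw_distance):
    """Try every choice of k series as centers, assign each series to its
    nearest center, and keep the choice with the lowest total distance."""
    best = (np.inf, None, None)  # (total cost, centers, labels)
    for centers in itertools.combinations(range(len(series)), k):
        labels, cost = [], 0.0
        for s in series:
            d = [dist(s, series[c]) for c in centers]
            j = int(np.argmin(d))
            labels.append(j)
            cost += d[j]
        if cost < best[0]:
            best = (cost, centers, labels)
    return best  # (total distance, chosen center indices, cluster labels)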