Equivalent moment using Rainflow - MATLAB

I am trying to perform a fatigue analysis from a load series, and I would like to extract the equivalent moment for
number of cycles = 1000000
years = 25
I have a one-hour time series of the load.
Then I have read that Rainflow Analysis is a very good tool to extract the cycles from a time history. Therefore I apply:
% Rainflow moment
dt = time(2) - time(1);                          % sampling interval
[timeSeriesSig, extt] = sig2ext(timeSeries, dt); % local extrema and their times
rf = rainflow(timeSeriesSig, extt);              % rainflow-count the extrema
I read in the documentation:
OUTPUT
rf - rainflow cycles: matrix 3xn or 5xn depending on input,
rf(1,:) cycle amplitudes,
rf(2,:) cycle mean values,
rf(3,:) number of cycles (0.5 or 1.0),
rf(4,:) beginning time (when input includes dt or extt data),
rf(5,:) cycle period (when input includes dt or extt data),
If I am interested in the number of cycles, what does the term rf(3,:) mean? The vector contains only the values 0.5 and 1. I want to obtain a histogram of the number of cycles per amplitude bin. Thanks

If you are using code from the File Exchange or another source, it helps to link to where you obtained it.
Here, rf(3,:) is showing you whether the amplitudes and other outputs refer to half or full cycles. The rainflow algorithm finds half-cycles first, and then matches tensile and compressive half-cycles of equal amplitude to find full cycles. There are normally some residual half-cycles.
If you examine the rfhist function contained within that same File Exchange submission, you will see how they handle full and half amplitudes. Basically, two half-cycles at any given load amplitude will be counted as one full cycle when creating the histogram.
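A minimal sketch of that weighting, assuming the 3xn/5xn rf layout quoted above (the bin edges are an arbitrary choice):
amp  = rf(1,:);                       % cycle amplitudes
ncyc = rf(3,:);                       % 0.5 for half cycles, 1.0 for full cycles
edges = linspace(0, max(amp), 21);    % 20 amplitude bins
[~, bin] = histc(amp, edges);         % bin index of each cycle
counts = accumarray(bin(:), ncyc(:), [numel(edges) 1]); % half cycles add 0.5
bar(edges, counts, 'histc');
xlabel('Cycle amplitude'); ylabel('Number of cycles');
This way two half-cycles in a bin contribute one full cycle, matching what rfhist does.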

Related

Removing outlier with multiple consecutive values similar to a step

I am processing ocean wave data, where I have a time series of the peak wave period (Tp, in seconds). The typical values of Tp range from 2 s to 15 s at this location. However, Tp may exceed 15 s during extreme events such as storms, so removing data based on a threshold value is not suitable.
As you can see in the figure below, there are multiple values that are outliers. The high values occur for a short duration and then drop back down, whereas an extreme event would last for hours.
I have tried the functions filloutliers and medfilt1, but they do not succeed in removing the outliers, which I presume is because multiple consecutive outlier data points exist.
Is there a built-in MATLAB function to handle such a situation?
Otherwise, if I need to write my own function to filter such signals, could you provide some guidance?
Attaching a small data sample here as well: Download Data
[Figure: dataset plot (only the segment in the provided data above)]
[Figure: zoomed-in view at one of the outliers]
If we know that the values need to be in the range (2, 15), we can clip the values > 15 to 15.
Another way is to take a high percentile (say the 95th) of the observations and clip values above it.
The filloutliers and medfilt1 methods do not remove values like 18 because they do not treat them as outliers; 18 is not very far from the typical range of (2, 15).
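A minimal sketch of both approaches (Tp is the vector of peak periods; prctile requires the Statistics Toolbox):
TpClip = min(Tp, 15);         % hard clip: anything above 15 s becomes 15 s
p95 = prctile(Tp, 95);        % 95th percentile of the observations
TpClipPct = min(Tp, p95);     % clip values above the 95th percentile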

Tuning a gain table to match two curves

I have two data sets; let us name them "actual speed" and "desired speed". My main objective is to match the actual speed with the desired speed.
But to do that in my case, I need to tune an FF (1x10), an Integral (10x8) and a Proportional (10x8) gain table.
My approach so far has been as follows (a sketch of steps 1-4 is given after the list):
1. Start the iteration with 0.1 as the initial value in the first cell (FF[0]) of the FF table.
2. Find the R-squared or correlation between the two data sets (i.e. actual speed and desired speed).
3. Increment the value of the first cell (FF[0]) by 0.25 and compute the R-squared or correlation again.
4. Once the cell value (FF[0]) reaches 2 (the maximum gain, already defined by the lab), evaluate the R-squared values and write back into FF[0] the gain value which gives the minimum error between the two curves.
5. Tune the Integral and Proportional tables in the same way for the same RPM range.
6. Once they are tuned, go to the next RPM range and repeat steps 2-5 (RPM ranges: 800-1000; 1000-1200; ...; 3000-3200).
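A minimal sketch of steps 1-4 (names are illustrative; runController is a hypothetical stand-in for one real-time trial with the current gains):
ffCandidates = 0.1:0.25:2;                  % lab-defined range for FF(1)
bestR2 = -Inf;
for ff = ffCandidates
    FF(1) = ff;
    actualSpeed = runController(FF, Igain, Pgain);  % hypothetical call
    R  = corrcoef(actualSpeed, desiredSpeed);
    r2 = R(1,2)^2;                          % R-squared between the curves
    if r2 > bestR2
        bestR2 = r2;
        bestFF = ff;
    end
end
FF(1) = bestFF;                             % keep the best-fitting gain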
Now the problem is that this process takes far too long to complete; for example, it takes around one hour to tune a single cell of FF, which is very slow.
If possible, please suggest any other approach I could try to tune the tables. I am using MATLAB R2010a and I can't move to any other version because my controller can only communicate with this one, and I can't use any app for tuning since my GUI is already communicating with the controller and the two data sets are generated in real time.
In the given figure, let us take the (X1,Y1) curve as the desired speed and the (X2,Y2) curve as the actual speed.
UPDATE

Predicting runtime of parallel loop using an a-priori estimate of effort per iterand (for a given number of workers)

I am working on a MATLAB implementation of an adaptive Matrix-Vector Multiplication for very large sparse matrices coming from a particular discretisation of a PDE (with known sparsity structure).
After a lot of pre-processing, I end up with a number of different blocks (greater than, say, 200), for which I want to calculate selected entries.
One of the pre-processing steps is to determine the (number of) entries per block I want to calculate, which gives me an almost perfect measure of the amount of time each block will take (for all intents and purposes the quadrature effort is the same for each entry).
Thanks to https://stackoverflow.com/a/9938666/2965879, I was able to make use of this by ordering the blocks in reverse order, thus goading MATLAB into starting with the biggest ones first.
However, the number of entries differs so wildly from block to block, that directly running parfor is limited severely by the blocks with the largest number of entries, even if they are fed into the loop in reverse.
My solution is to do the biggest blocks serially (but parallelised on the level of entries!), which is fine as long as the overhead per iterand doesn't matter too much and the blocks don't get too small. The rest of the blocks I then do with parfor. Ideally, I'd let MATLAB decide how to handle this, but since a nested parfor-loop loses its parallelism, this doesn't work. Also, packaging both loops into one is (nigh) impossible.
My question now is about how to best determine this cut-off between the serial and the parallel regime, taking into account the information I have on the number of entries (the shape of the curve of ordered entries may differ for different problems), as well as the number of workers I have available.
So far, I had been working with the 12 workers available under the standard PCT license, but since I've started working on a cluster, determining this cut-off becomes more and more crucial: with many cores the serial loop becomes more and more costly in comparison to the parallel loop, but similarly, having blocks which hold up the rest is even more costly.
For 12 cores (resp. the configuration of the compute server I was working with), I had figured out a reasonable cut-off parameter of 100 entries per worker, but this doesn't work well when the number of cores is no longer small in relation to the number of blocks (e.g. 64 vs 200).
I have tried to deflate the number of cores with different powers (e.g. 1/2, 3/4), but this also doesn't work consistently. Next I tried to group the blocks into batches and determine the cut-off when entries are larger than the mean per batch, or by how many batches they are away from the end:
logical_sml = true(1,num_core); i = 0;
while all(logical_sml)
    i = i+1;
    m = mean(num_entr_asc(1:min(i*num_core,end))); % "asc" ~ ascending order
    logical_sml = num_entr_asc(i*num_core+(1:num_core)) < i^(3/4)*m;
    % if the small blocks were parallelised perfectly, i.e. all
    % cores take the same time, the time would be proportional to
    % i*m. To try to discount the different sizes (and imperfect
    % parallelisation), we only scale with a power of i less than
    % one to not end up with a few blocks which hold up the rest
end
num_block_big = num_block - (i+1)*num_core + sum(~logical_sml);
(Note: This code doesn't work for vectors num_entr_asc whose length is not a multiple of num_core, but I decided to omit the min(...,end) constructions for legibility.)
I have also omitted the < max(...,...) for combining both conditions (i.e. together with minimum entries per worker), which is necessary so that the cut-off isn't found too early. I thought a little about somehow using the variance as well, but so far all attempts have been unsatisfactory.
I would be very grateful if someone has a good idea for how to solve this.
I came up with a somewhat satisfactory solution, so in case anyone's interested I thought I'd share it. I would still appreciate comments on how to improve/fine-tune the approach.
Basically, I decided that the only sensible way is to build a (very) rudimentary model of the scheduler for the parallel loop:
function c = est_cost_para(cost_blocks, cost_it, num_cores)
% Estimate cost of parallel computation
% Inputs:
%   cost_blocks: Estimate of cost per block in arbitrary units. For
%     consistency with the other code this must be in the reverse order
%     that the scheduler is fed, i.e. cost should be ascending!
%   cost_it: Base cost of iteration (regardless of number of entries)
%     in the same units as cost_blocks.
%   num_cores: Number of cores
%
% Output:
%   c: Estimated cost of parallel computation
num_blocks = numel(cost_blocks);
c = zeros(num_cores,1);
i = min(num_blocks,num_cores);
c(1:i) = cost_blocks(end-i+1:end) + cost_it;  % seed each core with one block
while i < num_blocks
    i = i+1;
    [~,i_min] = min(c);  % which core finished first; is fed with next block
    c(i_min) = c(i_min) + cost_blocks(end-i+1) + cost_it;
end
c = max(c);  % the loop finishes when the last core does
end
The parameter cost_it for an empty iteration is a crude blend of many different side effects, which could conceivably be separated: the cost of an empty iteration in a for/parfor-loop (which could also differ per block), as well as the start-up time and data transmission of the parfor-loop (and probably more). My main reason to throw everything together is that I don't want to have to estimate/determine the more granular costs.
I use the above routine to determine the cut-off in the following way:
function i = cutoff_ser_para(cost_blocks, cost_it, num_cores)
% Determine cut-off between serial and parallel regime
% Inputs:
%   cost_blocks: Estimate of cost per block in arbitrary units. For
%     consistency with the other code this must be in the reverse order
%     that the scheduler is fed, i.e. cost should be ascending!
%   cost_it: Base cost of iteration (regardless of number of entries)
%     in the same units as cost_blocks.
%   num_cores: Number of cores
%
% Output:
%   i: Number of blocks to be calculated serially
num_blocks = numel(cost_blocks);
cost = zeros(num_blocks+1,2);
for i = 0:num_blocks
    % cost of the i largest blocks in the serial loop (entries parallelised) ...
    cost(i+1,1) = sum(cost_blocks(end-i+1:end))/num_cores + i*cost_it;
    % ... plus the estimated cost of the remaining blocks under parfor
    cost(i+1,2) = est_cost_para(cost_blocks(1:end-i), cost_it, num_cores);
end
[~,i] = min(sum(cost,2));  % minimise total (serial + parallel) cost
i = i-1;                   % convert from index back to block count
end
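A minimal usage sketch (num_entr_asc as in the question: entry counts per block in ascending order, used directly as the cost estimate; cost_it here is just an assumed value):
cost_it   = 50;    % assumed base cost per iteration, in "entries"
num_cores = 64;
i_ser = cutoff_ser_para(num_entr_asc, cost_it, num_cores);
blocks_ser = num_entr_asc(end-i_ser+1:end);  % largest blocks: serial loop
blocks_par = num_entr_asc(1:end-i_ser);      % the rest: parfor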
In particular, I don't inflate/change the value of est_cost_para which assumes (aside from cost_it) the most optimistic scheduling possible. I leave it as is mainly because I don't know what would work best. To be conservative (i.e. avoid feeding too large blocks to the parallel loop), one could of course add some percentage as a buffer or even use a power > 1 to inflate the parallel cost.
Note also that est_cost_para is called with successively fewer blocks (although I use the variable name cost_blocks for both routines, one is a subset of the other).
Compared to the approach in my wordy question I see two main advantages:
The relatively intricate dependence between the data (both the number of blocks as well as their cost) and the number of cores is captured much better with the simulated scheduler than would be possible with a single formula.
By calculating the cost for all possible combinations of serial/parallel distribution and then taking the minimum, one cannot get "stuck" too early while reading in the data from one side (e.g. by a jump which is large relative to the data so far, but small in comparison to the total).
Of course, the asymptotic complexity is higher by calling est_cost_para with its while-loop all the time, but in my case (num_blocks<500) this is absolutely negligible.
Finally, if a decent value of cost_it does not readily present itself, one can try to calculate it by measuring the actual execution time of each block, as well as its purely parallel part, and then fitting the resulting data to the cost prediction to get an updated value of cost_it for the next call of the routine (by using the difference between total cost and parallel cost, or by inserting a cost of zero into the fitted formula). This should hopefully "converge" to the most useful value of cost_it for the problem in question.
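A minimal sketch of the first variant (names are illustrative; t_total is the measured wall time of one full call and t_par the measured time of its parallel part, in the same units as the cost estimates):
overhead = t_total - t_par;      % time not explained by the per-entry costs
cost_it  = overhead/num_blocks;  % updated base cost per iteration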

MATLAB findpeaks using minpeakdistance

Suppose that I have a data set that contains a cyclical event, and I am identifying a threshold (peaks) to separate each event (to eventually find the coefficient of variation).
I have multiple trials of this data; the speed of these events is sometimes significantly faster than in others. This data is also a bit noisy, so some false local maxima are sometimes picked up if I don't set the 'minpeakdistance' constraint within the findpeaks function.
I am trying to find a way to ensure that, regardless of speed, I am finding only true local maxima. I have been visually inspecting each trial to ensure that I have identified only true peaks, and adjusting the mpd value for any trial where I have also identified false peaks, but this is literally going to take days.
Any suggestions?
Example:
For most trials of my collection, the following line of code only identifies true maxima:
mpd = 'minpeakdistance';
% builds, e.g., [t1.Mspine.pks(:,1), t1.Mspine.locs] = findpeaks(t1.Mspine.xyz(:,1), mpd, 25)
eval(['[t' num2str(a) '.Mspine.pks(:,1),t' num2str(a) '.Mspine.locs] = findpeaks(t' num2str(a) '.Mspine.xyz(:,1), mpd,25);']);
But in trial 11 they are moving much faster, so the mpd has to be adjusted to 9; however, if I apply an mpd value of 9 to all of the trials, it will pick up false local maxima.
I would go over to the frequency domain to find this "cyclical event". Specifically, if you know the rate at which data is sampled/generated, using an FFT will indicate the relative strengths of all periodic events in your data. Have a look at: http://www.mathworks.se/help/matlab/ref/fft.html
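A minimal sketch of that idea (names are illustrative: x is one trial's signal, fs its sampling rate in Hz):
x = x - mean(x);                  % remove the DC component
n = numel(x);
X = abs(fft(x));                  % magnitude spectrum
[~, k] = max(X(2:floor(n/2)));    % dominant bin, skipping DC
f0 = k*fs/n;                      % dominant event rate in Hz
mpdSamples = round(0.5*fs/f0);    % e.g. half a period, in samples
[pks, locs] = findpeaks(x, 'minpeakdistance', mpdSamples);
The per-trial mpd then adapts automatically to how fast the events occur.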

Measuring Frequency of Square wave in MATLAB using USB 1024HLS

I'm trying to measure the frequency of a square wave which is read through a USB 1024HLS DAQ module in MATLAB. What I've done is create a loop which reads 100 values from the digital input, which gives me a vector of 0's and 1's. There is also a timer in this loop which measures the duration for which the loop runs.
After getting the vector, I count the number of 1's and then use frequency = num_transitions/time to give me the frequency. However, this doesn't seem to work well :( I keep getting different frequencies for different numbers of loop iterations. Any suggestions?
I would suggest trying the following code:
vec = ...(the 100-element vector of digital values)...
dur = ...(the time required to collect the above vector)...
edges = find(diff(vec)); % Finds the indices of transitions between 0 and 1
period = 2*mean(diff(edges)); % Finds the mean period, in number of samples
frequency = 100/(dur*period);
First, the code finds the indices of the transitions from 0 to 1 or 1 to 0. Next, the differences between these indices are computed and averaged, giving the average duration (in number of samples) for the lengths of zeroes and ones. Multiplying this number by two then gives the average period (in number of samples) of the square wave. This number is then multiplied by dur/100 to get the period in whatever the time units of dur are (i.e. seconds, milliseconds, etc.). Taking the reciprocal then gives the average frequency.
One additional caveat: in order to get a good estimate of the frequency, you might have to make sure the 100 samples you collect contain at least a few repeated periods.
Functions of interest used above: DIFF, FIND, MEAN
First of all, you have to make sure that your 100 samples contain at least one full period of the signal, otherwise you'll get false results. You need a good compromise between sample rate (i.e. the more samples per period you have, the better the measurement) and the number of samples.
To be really precise, you should either have a timestamp associated with every measurement (as you usually can't be sure of equidistant time spacing in the for loop), or it may be possible to switch your USB module into some "running" mode which doesn't just take one sample at a time but acquires a complete waveform at a fixed sample rate.
Concerning the calculation of the frequency, gnovice already pointed out the right way. If you have individual timestamps (in seconds), the following changes are necessary:
tst = ...(the timestamps associated with every sample)...
period = 2*mean(diff(tst(edges)));
frequency = 1/period;
I can't figure out the problem, but if the boolean vector were v, then
frequency = sum(v)/time
should give the frequency.
Based on your description, it doesn't sound like a problem with the software, UNLESS you are using the Windows system timer, which is notoriously inaccurate (it is only accurate to about 15 milliseconds).
There are high-resolution timers available in Windows, but I don't know how to use them in Matlab. If you have access to the .NET framework, the Stopwatch class has 1 microsecond accuracy (or better), as does the QueryPerformanceCounter API in Win32.
Other than that, you might have some jitter. There could be something in your signal chain that is causing false triggers, etc.
UPDATE: The following CodeProject article should solve the timing problem, if there is one. You should check the Matlab documentation of your version of Matlab to see if it has a native high-resolution timer. Otherwise, you can use this:
C++/Mex wrapper adds microsecond resolution timer to Matlab under WinXP
http://www.codeproject.com/KB/cpp/Matlab_Microsecond_Timer.aspx
mersenne31:
Thanks everyone for your responses. I have tried the solutions that gnovice and groovingandi mentioned and I'm sure they will work as soon as the timing issue is solved.
The code I've used is shown below:
for i = 1:100
    tic;
    value = getvalue(portCH);
    vec(i) = value(1);
    tst(i) = toc;  % gets an individual time sample
end
% to get the total time I put total_time = toc after the for loop
totaltime = sum(tst);
edges = find(diff(vec));       % finds the indices of transitions between 0 and 1
period = 2*mean(diff(edges));  % finds the mean period, in number of samples
frequency = 100/(totaltime*period);
The problem is that measuring the time for one sample doesn't really help, because it is nearly the same for all samples. What is needed is, as groovingandi mentioned, some "running" mode which reads 100 samples over 3 seconds.
So something like for(3 seconds) and then do the data capture. But I can't find anything like this. Is there any function in MATLAB that could do this?
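A minimal sketch of a timed capture loop in plain MATLAB (getvalue/portCH as in the code above; the toc(t0) form requires R2008b or newer):
t0 = tic;
i = 0;
while toc(t0) < 3         % capture for 3 seconds
    i = i + 1;
    value = getvalue(portCH);
    vec(i) = value(1);
    tst(i) = toc(t0);     % timestamp of each sample
end
totaltime = tst(end);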
This won't answer your question, but it's what I thought of after reading it: square waves have infinite frequency content. The FFT of a square wave has a sin(x)/x envelope, which goes from -inf to +inf.
Also try counting only the rising edges in MATLAB. You can quantize the signal to just 1 and 0, and then only increment the count when you see a [0 1] slice of your vector.
OR
You can quantize, decimate, then just sum. This will only work if each square pulse is the same length and your sampling frequency is constant. I think this one would be harder to do.
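A minimal sketch of the rising-edge count (v is the 0/1 sample vector, totaltime the capture duration in seconds):
v = v > 0.5;                      % quantize to logical 0/1
numRising = sum(diff(v) == 1);    % count 0 -> 1 transitions, i.e. [0 1] slices
frequency = numRising/totaltime;  % full cycles per second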