Given a sequence of numbers that trend over time, I would like to use Reactive Extensions to raise an alert when there is a sudden absolute change (a spike or a drop), e.g. 101.2, 102.4, 101.4, 100.9, 95, 93, 85... and then increasing slowly back to 100.
The alert would be triggered on the drop from 100.9 to 95. Each value would have a timestamp, and I am looking for an alert of the form:
LargeChange
TimeStamp
Distance
Percentage
I believe I need to start with Buffer(60, 1) for a 60-sample moving average (with one minute between samples).
Whilst that would give the average value, I can't assign an arbitrary % to trigger the alert since this could vary from signal to signal: one may have more volatility than the other.
To get volatility I would then take a longer historical time frame Buffer(14, 1) (these would be 14 days of daily averages of the same signal).
I would then calculate the difference between each value in the buffer and the 14 day average, square and add all these deviations, and divide by the number of samples.
My questions are please:
How would I perform the above volatility calculation, or is it better to just do this outside of RX and update the new volatility value once daily external to the observable stream calculation (this may make more sense to avoid me having to run 14 days worth of 1 minute samples through it)?
How would we combine the fast moving average and volatility level (updated once per day) to give alerts? I am seeing Scan and DistinctUntilChanged in posts on SO, but can't work out how to put them together.
I would start by breaking this down into steps. (For simplicity I'll assume the original data source is an observable called values.)
Convert values into a moving averages observable (we'll call this averages here).
Combine values and averages into an observable that can watch for "extremes".
For step 1, you may be able to use the built-in Window method that Slugart mentioned in a comment or the similar Buffer method. A Select call after the Window or Buffer can be used to process the array into a single average value object. Something like:
averages = values.Buffer(60, 1)
.Select((buffer) => { /* do average and std dev calculation here */ });
If you need sliding windows, you may have to implement your own operator, but I could easily be unaware of one that does exist. Scan along with a queue seems like a good basis for such an operator if you need to write it.
For step 2, you will probably want to start with CombineLatest followed by a Where clause. Something like:
extremes = values.CombineLatest(averages, (v, a) => new { Current = v, Average = a })
.Where((value) => { /* check if value.Current is out of deviation from value.Average */ });
The nice part of this approach is that you can choose between having averages be computed directly from values in line like we did here or be some other source of volatility information with minimal effect on the rest of the code.
Note that the CombineLatest call may cause two subscriptions to values, one directly and one indirectly via a subscription to averages. If the underlying implementation of values makes this undesirable, use Publish and RefCount to get around this.
Also note that CombineLatest will output a value each time either values or averages outputs a value. This means that you will get two events every time averages updates, one for the values update and one for the averages update triggered by the value.
If you are using sliding windows, that would mean a double update on every value, and it would probably be better to simply include the current value on the Scan output and skip the CombineLatest altogether. You would have something like this instead:
averages = values.Scan(new WindowState(), (state, v) => { /* build sliding window and attach current value; WindowState is a hypothetical accumulator type */ });
extremes = averages.Where((a) => { /* check if current value is out of deviation for the window */ });
Once you have extremes, you can subscribe to it and trigger your alerts.
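To make the sliding-window idea concrete, here is a minimal sketch of the logic in plain Python rather than Rx (so it is not the operator chain itself, only the per-sample computation a Scan accumulator would carry). The names extremes, window_size and threshold_sigmas are assumptions for illustration, as is the choice of "more than 3 standard deviations from the window average" as the trigger. The alert fields follow the LargeChange shape from the question.

```python
from collections import deque
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Iterable, Iterator, Tuple


@dataclass
class LargeChange:
    timestamp: float
    distance: float      # current value minus the window average
    percentage: float    # distance as a percentage of the window average


def extremes(samples: Iterable[Tuple[float, float]],
             window_size: int = 60,
             threshold_sigmas: float = 3.0) -> Iterator[LargeChange]:
    """Yield a LargeChange when a value strays too far from the trailing window."""
    window: deque = deque(maxlen=window_size)
    for timestamp, value in samples:
        if len(window) == window_size:
            average = fmean(window)
            # Population std dev: sqrt(sum((x - average)^2) / n)
            deviation = pstdev(window)
            distance = value - average
            if deviation > 0 and abs(distance) > threshold_sigmas * deviation:
                percentage = 100.0 * distance / average if average else float("inf")
                yield LargeChange(timestamp, distance, percentage)
        # Append after the check so a spike is judged against earlier values only.
        window.append(value)
```

The current value is compared against the window before it is added, which is the same "attach the current value to the window state" idea described above and avoids the double update from CombineLatest.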
I would like to get peak value from STM32 adc samples. I have written the below code and I've managed to get peak value however most of the time this value includes the biggest noise. In order to eliminate noise effects, I have decided to apply averaging method. I would like to get 5 measurements' averages. Then I'd like to compare these averages and use the biggest one(biggest average). Can anybody suggest a code?
Regards,
Umut
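void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef* hadc)
{
    ADC_raw = HAL_ADC_GetValue(hadc);   /* raw 12-bit conversion result */
    Vdd = 3.3 * (ADC_raw) / 4095;       /* convert counts to volts */
    if (Vdd > Vmax)
    {
        Vmax = Vdd;                     /* track the highest voltage seen so far */
    }
}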
First, I would remove as much code as possible from the callback function, because it still runs inside the interrupt context, which should be kept as short as possible. This is mentioned in a lot of answers here, so I will not go into details on how to handle this.
For averaging the measurement, there are multiple ways you can go.
Automatic average
Use the ADC's oversampling function. The controller will sample the signal multiple times (configure using OVFS register) and calculate an average value before triggering the interrupt.
Manual average
Using the HAL_ADC_ConvCpltCallback function, store the desired number of values in an array and calculate the average in the main loop.
Manual average using DMA
Let the DMA store the number of samples you want to use in an array using the function HAL_ADC_Start_DMA. When all samples have been collected you will be notified. This will reduce the processor load because you don't have to shift the data into the array yourself.
You can also combine oversampling (a good idea most of the time) with one of the other methods, depending on your use case.
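The "average 5 measurements, then keep the biggest average" logic from the question could look roughly like the sketch below. It is written in Python purely to show the arithmetic; on the STM32 the same loop would live in C in the main loop, fed by the callback or by DMA. The function name max_block_average and the block_size parameter are assumptions, while the 3.3 V reference and 4095 full scale are taken from the question's code.

```python
from typing import Iterable, Optional

ADC_FULL_SCALE = 4095   # 12-bit ADC, as in the question's code
V_REF = 3.3             # reference voltage, as in the question's code


def max_block_average(raw_samples: Iterable[int], block_size: int = 5) -> Optional[float]:
    """Average consecutive blocks of raw ADC counts; return the largest average in volts."""
    best = None
    total = 0
    count = 0
    for raw in raw_samples:
        total += raw
        count += 1
        if count == block_size:
            average_volts = V_REF * (total / block_size) / ADC_FULL_SCALE
            if best is None or average_volts > best:
                best = average_volts
            # Start the next block; a trailing incomplete block is ignored.
            total = 0
            count = 0
    return best
```

With this approach a single noisy sample at full scale only raises its own block's average by a fifth of its height, so it no longer decides the peak on its own.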
In Prometheus, I have a monotonically increasing counter (ifHCInOctets from IF-MIB, in this case).
In Grafana, I can create a graph using the simple query ifHCInOctets{job='snmp',instance='$Device',ifDescr=~'eth0'} and see the counter graphed over different time ranges by selecting the desired range in the upper-right.
This is almost exactly what I want. However, I would like the graph to always start at zero and increase from there. The use-case is that I want to visualize my data usage over the course of a month to see how quickly I am approaching my data cap. (I already create a gauge object using increase(ifHCInOctets{...}[$__range]) function which shows me how much I have used in total over the given time range, but I'd like to be able to visualize that usage over time.)
Basically, I want ifHCInOctets{...} - X where X is the value of ifHCInOctets at the start of the range. My first thought was:
ifHCInOctets{...} - ifHCInOctets{...} offset $__range
But that seems to show me each data point minus the data point $__range time prior to it (rather than just subtracting the starting value from all points).
I then tried creating a query variable with the query query_result(ifHCInOctets{...} offset $__range) and setting it to update on time range change. This almost seemed to work, but the resulting graph always seemed to start slightly negative, depending on the time range selected, which made me think it wasn't doing what I thought it was.
I have also tried various forms of sum, sum_over_time, and increase, all to no avail.
You're probably looking for something like this:
ifHCInOctets
-
min_over_time(
  (ifHCInOctets
    and
  (month(timestamp(ifHCInOctets)) == scalar(month(vector($__to / 1000)))))[31d:]
)
But it doesn't take into account counter resets. And is ugly and inefficient as hell. It's basically the current value minus the min_over_time calculated over samples in the previous 31 days that fell into the same month as Grafana's $__to timestamp.
You probably want to set up a recording rule based on this expression (that adds year, month and day labels to a metric) and then calculate the increase() over any given month (including the current month). That takes into account both counter resets and counters that did not exist at the beginning of the month.
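To illustrate what "taking counter resets into account" means for a usage-since-start graph, here is a small Python sketch of the idea. It is a simplification and not how Prometheus implements increase() (which also extrapolates to the window boundaries); the function name usage_since_start is an assumption.

```python
from typing import Iterable, List


def usage_since_start(counter_samples: Iterable[float]) -> List[float]:
    """Return cumulative growth relative to the first sample, tolerating counter resets."""
    usage: List[float] = []
    total = 0.0
    previous = None
    for value in counter_samples:
        if previous is not None:
            # A drop means the counter restarted from zero,
            # so the whole new value counts as growth.
            total += (value - previous) if value >= previous else value
        usage.append(total)
        previous = value
    return usage
```

For the samples 1000, 1500, 1800, 200, 500 this gives 0, 500, 800, 1000, 1300: the series starts at zero and keeps climbing across the reset, whereas plain "current value minus starting value" would go negative at the reset.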
I'd like to get the 0.95 percentile memory usage of my pods over the last x time. However, this query starts to take too long if I use a 'big' (7 / 10d) range.
The query that I'm using right now is:
quantile_over_time(0.95, container_memory_usage_bytes[10d])
Takes around 100s to complete
I removed extra namespace filters for brevity
What steps could I take to make this query more performant (other than making the machine bigger)?
I thought about calculating the 0.95 percentile every x time (let's say 30min), labelling it p95_memory_usage, and using p95_memory_usage in the query instead of container_memory_usage_bytes, so that I can reduce the number of points the query has to go through.
However, would this not distort the values?
As you already observed, aggregating quantiles (over time or otherwise) doesn't really work.
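A small made-up example shows the distortion. The Python sketch below uses a simple nearest-rank percentile (Prometheus interpolates, so its exact numbers would differ) and two invented windows of memory samples; all names and values are assumptions for illustration.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


quiet_window = [10.0] * 100                  # memory stays flat
busy_window = [10.0] * 90 + [100.0] * 10     # 10% of the samples spike

per_window_p95 = [percentile(quiet_window, 95), percentile(busy_window, 95)]
p95_of_p95s = percentile(per_window_p95, 95)                # 100.0
true_p95 = percentile(quiet_window + busy_window, 95)       # 10.0
```

Over the combined data only 5% of the samples are spikes, so the true 95th percentile is still 10, but the pre-aggregated value reports 100.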
You could try to build a histogram of memory usage over time using recording rules, looking like a "real" Prometheus histogram (consisting of _bucket, _count and _sum metrics) although doing it may be tedious. Something like:
- record: container_memory_usage_bytes_bucket
  labels:
    le: "100000.0"
  expr: |
    (container_memory_usage_bytes <= bool 100000.0)
    + ignoring(le)
    (
      container_memory_usage_bytes_bucket{le="100000.0"}
      or ignoring(le)
      container_memory_usage_bytes * 0
    )
Repeat for all bucket sizes you're interested in, add _count and _sum metrics.
Histograms can be aggregated (over time or otherwise) without problems, so you can use a second set of recording rules that computes an increase of the histogram metrics, at much lower resolution (e.g. hourly or daily increase, at hourly or daily resolution). And finally, you can use histogram_quantile over your low resolution histogram (which has a lot fewer samples than the original time series) to compute your quantile.
It's a lot of work, though, and there will be a couple of downsides: you'll only get hourly/daily updates to your quantile and the accuracy may be lower, depending on how many histogram buckets you define.
Else (and this only came to me after writing all of the above) you could define a recording rule that runs at lower resolution (e.g. once an hour) and records the current value of container_memory_usage_bytes metrics. Then you could continue to use quantile_over_time over this lower resolution metric. You'll obviously lose precision (as you're throwing away a lot of samples) and your quantile will only update once an hour, but it's much simpler. And you only need to wait for 10 days to see if the result is close enough. (o:
The quantile_over_time(0.95, container_memory_usage_bytes[10d]) query can be slow because it needs to take into account all the raw samples for all the container_memory_usage_bytes time series over the last 10 days. The number of samples to process can be quite big. It can be estimated with the following query:
sum(count_over_time(container_memory_usage_bytes[10d]))
Note that if the quantile_over_time(...) query is used for building a graph in Grafana (i.e. a range query instead of an instant query), then the number of raw samples returned from sum(count_over_time(...)) must be multiplied by the number of points on the Grafana graph, since Prometheus executes quantile_over_time(...) individually for each point on the displayed graph. Grafana usually requests around 1000 points to build a smooth graph, so the number returned from sum(count_over_time(...)) must be multiplied by 1000 to estimate the number of raw samples Prometheus needs to process for the quantile_over_time(...) graph. See more details in this article.
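As a rough back-of-the-envelope version of that estimate, the Python sketch below multiplies it out. The 500 series and the 15-second scrape interval are invented numbers for illustration, not values from the question, and the function name estimate_samples is hypothetical.

```python
def estimate_samples(series: int, lookbehind_seconds: int,
                     scrape_interval_seconds: int, graph_points: int = 1) -> int:
    """Approximate raw samples touched: series x samples per window x graph points."""
    samples_per_series = lookbehind_seconds // scrape_interval_seconds
    return series * samples_per_series * graph_points


TEN_DAYS = 10 * 24 * 60 * 60

instant_query = estimate_samples(500, TEN_DAYS, 15)         # 28_800_000
range_query = estimate_samples(500, TEN_DAYS, 15, 1000)     # 28_800_000_000
```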
There are the following solutions for reducing query duration:
To add more specific label filters in order to reduce the number of selected time series and, consequently, the number of raw samples to process.
To reduce the lookbehind window in square brackets. For example, changing [10d] to [1d] reduces the number of raw samples to process by 10x.
To use recording rules for calculating coarser-grained results.
To try using other Prometheus-compatible systems, which may process heavy queries at faster speed. Try, for example, VictoriaMetrics.
I have an AB PLC where I am trying to read analog values to see if the values vary by more than 1V in 5 minutes. I have 10 sets of values I need to read. What would be the easiest way to implement this? I can think of creating arrays to save the values each time I read them, but the part I am having trouble with is how to keep a running average of the values and compare against it each time I read them.
Any help with this would be greatly appreciated!!
If I understand correctly all you want to do is see if your analog input is more or less than 1V from your set value? Just check if your value is greater than (set value + 1V) or less than (set value - 1V) every plc scan then set a bool value to true. That should be it.
I don't think averaging the analog input is the way to go for this. But if you did want to find the average of an analog input over time, you would need three things: a sample time, an interval time, and a total number of intervals. Say you set up a sample time of 12 seconds, so you read the analog value every 12 seconds. After 60 seconds you would take the total and divide it by the number of samples (60/12 = 5) to get that interval's average. You would then add that interval average to the running total of previous interval averages and divide by the number of intervals you have accumulated so far. Hope I didn't make that too complicated.
What I understood from your question is that you want to check whether the input voltage has changed, using the analog value you read. In my case I'm using 0 to 10 V. Simply store the analog value at the maximum input (10 V), do the same for 0 V, and from those two you can calculate the raw value that corresponds to 1 V. All you have to do then is compare the reading against the +/- 1 V value you got from that calculation. You can do this dynamically with n analog inputs (n = the maximum number of analog inputs supported by your PLC).
Have a look at FFL and FFU. They are First-In-First-Out buffers. You specify the length of the buffer you want and use FFL and FFU in pairs on the same buffer. Running averages are not that difficult to compute, and there are a number of ways to best implement depending on the platform (SLC vs CLX). The simplest method that would work on both platforms is to use a counter.ACC as a value to indirectly reference the element number of the FIFO for an addition function, then divide by the number of elements in your FIFO. This can all be done in a single multi-branch rung.
1. Load your value into FIFO buffer at some timer interval using FFL.
2. If you don't need the FIFO values 'Popped out' for use elsewhere, just set .POS to 0 when the FIFO is full and let it continue to update with new values, the values aren't cleared so they are still readable for your Running Average. But you MUST either use FFU to step the .POS back or use a MOV function to change the .POS once it's full or it will stop taking values.
3. Create a counter with a .PRE equal to the .LEN of your FIFO
4. On a parallel Rung, with each increment of the counter.ACC use an ADD function. Here's an example assuming CLX. If you're using SLC you can do the same thing but obviously you can't use tag names:
ADD
Value1: AllValues
Value2: FIFO[IndexCounter.ACC]
Destination: AllValues
5. When your counter.DN bit is set, divide AllValues by FIFO.LEN and store in a RunningAverage Tag, then reset the counter. Have your counter step once for each scan or put it all in a Periodic Function to execute the routine.
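Ladder logic does not translate line for line, but the data flow of the steps above can be sketched in Python to show what the FIFO and the running average are doing. This is only an illustration of the logic, not PLC code; the class name RunningAverage is an assumption, and unlike the rung described above it divides by the number of values currently held rather than by .LEN while the buffer is still filling.

```python
from collections import deque


class RunningAverage:
    """Fixed-length FIFO of samples plus the average of whatever it currently holds."""

    def __init__(self, length: int) -> None:
        # Once full, appending a new value drops the oldest one (FFL/FFU as a pair).
        self._fifo = deque(maxlen=length)

    def load(self, value: float) -> float:
        """Push a new reading and return the updated running average."""
        self._fifo.append(value)
        return sum(self._fifo) / len(self._fifo)
```

Each new reading could then be compared against the returned average, for example flagging the input when the two differ by more than 1 V.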
I have a sensor (gyro) that is connected to my Python program (over a UDP socket) and sends data to the Python console in real time, but at a frequency of 200 Hz.
I want to change the frequency of the data coming to my console but could not find a good way to do it.
I was thinking about doing it with a filter such as a mean, and I am waiting for ideas.
If you want to have regular updates, use a windowing mechanism. Take the last n values and store the average. Then let the next two values arrive and average the last n values again. This example would yield values with a frequency of 200 Hz/2.
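A minimal Python sketch of that windowing idea follows. The function name windowed_average and its window and step parameters are assumptions; with step=2 on a 200 Hz stream it would emit at roughly 100 Hz once the first window has filled.

```python
from collections import deque
from typing import Iterable, Iterator


def windowed_average(samples: Iterable[float], window: int, step: int) -> Iterator[float]:
    """Yield the mean of the last `window` samples once every `step` incoming samples."""
    recent = deque(maxlen=window)
    for index, sample in enumerate(samples, start=1):
        recent.append(sample)
        if len(recent) == window and index % step == 0:
            yield sum(recent) / window
```

For example, list(windowed_average([1, 2, 3, 4, 5, 6, 7, 8], window=4, step=2)) gives [2.5, 4.5, 6.5]. Since it accepts any iterable, it could be fed from a generator that reads values off the UDP socket.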
If you only want to see events when changes have occurred, store the last value, compare the current value with the last one and emit an event if it has changed, updating the stored value. As you're dealing with sensors (and thus, a little fuzziness), you probably want to implement a hysteresis.
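One simple way to get that behaviour is a deadband: only report a value when it has moved far enough from the last reported one. The Python sketch below is one possible reading of "hysteresis" here, not the only one; the function name changes and the deadband parameter are assumptions.

```python
from typing import Iterable, Iterator


def changes(samples: Iterable[float], deadband: float) -> Iterator[float]:
    """Yield a sample only when it differs from the last reported one by more than `deadband`."""
    last_reported = None
    for sample in samples:
        if last_reported is None or abs(sample - last_reported) > deadband:
            last_reported = sample
            yield sample
```

With a deadband of 0.2, the input 0.0, 0.05, 0.1, 0.3, 0.35, 0.0 is reduced to 0.0, 0.3, 0.0, so small sensor jitter no longer produces events.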
You can even raise the frequency by creating extra values in between the received ones through interpolation. For a steady frequency, you would have to take care of your timing though.