change Frequency of coming data - sockets

I Have a Sensor (Gyro) that connected to my python program (with socket UDP) and send data to python console in real-time but with 200 Hz frequency.
I want to change this frequency of coming data to my console but could not find a good way to do it.
I was thinking about doing it with filters like Mean an waiting for idea?

If you want to have regular updates, use a windowing mechanism. Take the last n values and store the average. Then, discard the next two values and take the last n values again. This example would yield values with a frequency of 200 Hz/2.
If you only want to see events when changes have occured, store the last value, compare the current value with the last one and emit an event if it has changed, updating the stored value. As you're dealing with sensors (and thus, a little fuzziness), you probably want to implement a hysteresis.
You can even raise the frequency by creating extra values in between the received ones through interpolation. For a steady frequency, you would have to take care about your timing though.

Related

How to persist previous data point when time range doesn't include a data point

TL;DR:
Can I get Grafana to show me the previous data point, when the currently selected time period does not have a data point? I have an example which sounds ridiculous, but at least it's simple to understand: I send data every 1 minute, and I wish to zoom into the last 30 seconds, and still see data. You may ask "why not just zoom out to 2 minutes" but the reason is that other data is on the same graph that has updated more often, and I wish to compare with that data. Also, for the more lengthy reasons below.
If not, how can I achieve what I want to achieve, see below?
Context
For a few years, I have been monitoring the water level in three of our basement sumps (which have pumps installed) by sending this data from Node-RED to InfluxDB, then visualising the sump levels in Grafana. I have set up three waterproof ultrasonic distance sensors, each pointed down a pipe that is inserted vertically into each sump. The water fills the pipe and the distance sensor, connected to an Arduino, sends me the reading. The Arduino also has other sensors connected (temp / humidity) and deals with distance calibrations to calculate the percent full of each sump. All this data is sent to Node-RED. In total, I am sending 4 values per sump: distance measurement in mm, percent full, temp, humidity. So that's 12 fields. Data is sent every 2 seconds, because I wished to have a reasonably high resolution to see nice curves in graphs.
Also I decided to store all this data so that I could later troubleshoot issues (we have had sewage floods resulting in water not being able to be pumped away, etc...) and design some warning systems for these issues based on data.
Storing 12 values for every 2 seconds, over the course of a number of years, takes up a lot of space (8GB).
Nature of the data
Storing this resolution of data has also helped me be able to describe the nature of the data. I will do so here.
(1) Non-meaningful NOISE (see below) - the percent-full reading goes up and down by 1 or 2 percent every couple of seconds:
(2) Meaningful DRIFT (see below) - I don't mean sensor drift, I am referring to actual water levels changing slowly over time, e.g. over 1 day or 1 week. Perhaps condensation on the walls drips down into the sump, or water evaporates from the sump, and the value can waver by a few percent over the course of a day. Each sump has slightly different characteristics.
(3) Meaningful MONITORING DATA - during wet weather, depending on rainfall amount, the sumps fill up over the course of say 30 mins to 3 hours. Then the pumps run and the water level drops again, wavers a bit, then the sumps continue to fill up. If the rain stopped, you can see a lovely curve as the water fills in progressively more slowly (see the green line below):
Solution to downsample
I know Influx has its own downsampling possibilities, however because of the nature of the data (which can hardly vary for 2 months but when it does, I really need to capture it in detail), I don't think lowering the sample rate is a great idea.
I have some understanding of digital filters (e.g. low pass etc) but have never programmed one myself. So I have written a basic filter in javascript (a Node-RED function) to filter the data in realtime as follows: only send each reading when it has changed from the previous one by x amount. (And update the previous one, when that occurs.)
This has already vastly reduced the amount of data being stored, and I can vary x to filter out noise shown in my first graph above, at the expense of resolution when the pumps run. Even if I set the x value to 2, it still vastly reduces data over long periods of dry weather.
So - onto my problem! Now data is not being logged to InfluxDB unless there is some meaningful change. Which means that when I zoom in to e.g. 15 minute timeframe of data, there is nothing to see.
Grafana does have the option of "fill (previous)" but this draws a line between points on the existing graph, rather than showing the previous data as if it hasn't changed since that point. Now my grafana dashboard looks a bit sad :(
One proposed solution is, in addition to sending "delta" data, send "summary" data, that is - send a full suite of data every 1 minute regardless of whether data changed or not. But then we get noise back again, and pointless storage.
Any other ideas?

Machine Learning to predict time-series multi-class signal changes

I would like to predict the switching behavior of time-dependent signals. Currently the signal has 3 states (1, 2, 3), but it could be that this will change in the future. For the moment, however, it is absolutely okay to assume three states.
I can make the following assumptions about these states (see picture):
the signals repeat periodically, possibly with variations concerning the time of day.
the duration of state 2 is always constant and relatively short for all signals.
the duration of states 1 and 3 are also constant, but vary for the different signals.
the switching sequence is always the same: 1 --> 2 --> 3 --> 2 --> 1 --> [...]
there is a constant but unknown time reference between the different signals.
There is no constant time reference between my observations for the different signals. They are simply measured one after the other, but always at different times.
I am able to rebuild my model periodically after i obtained more samples.
I have the following problems:
I can only observe one signal at a time.
I can only observe the signals at different times.
I cannot trigger my measurement with the state transition. That means, when I measure, I am always "in the middle" of a state. Therefore I don't know when this state has started and also not exactly when this state will end.
I cannot observe a certain signal for a long duration. So, i am not able to observe a complete period.
My samples (observations) are widespread in time.
I would like to get a prediction either for the state change or the current state for the current time. It is likely to happen that i will never have measured my signals for that requested time.
So far I have tested the TimeSeriesPredictor from the ML.NET Toolbox, as it seemed suitable to me. However, in my opinion, this algorithm requires that you always pass only the data of one signal. This means that assumption 5 is not included in the prediction, which is probably suboptimal. Also, in this case I had problems with the prediction not changing, which should actually happen time-dependently when I query multiple predictions. This behavior led me to believe that only the order of the values entered the model, but not the associated timestamp. If I have understood everything correctly, then exactly this timestamp is my most important "feature"...
So far, i did not do any tests on Regression-based approaches, e.g. FastTree, since my data is not linear, but keeps changing states. Maybe this assumption is not valid and regression-based methods could also be suitable?
I also don't know if a multiclassifier is required, because I had understood that the TimeSeriesPredictor would also be suitable for this, since it works with the single data type. Whether the prediction is 1.3 or exactly 1.0 would be fine for me.
To sum it up:
I am looking for a algorithm which is able to recognize the switching patterns based on lose and widespread samples. It would be okay to define boundaries, e.g. state duration 3 of signal 1 will never last longer than 30s or state duration 1 of signal 3 will never last longer 60s.
Then, after the algorithm has obtained an approximate model of the switching behaviour, i would like to request a prediction of a certain signal state for a certain time.
Which methods can I use to get the best prediction, preferably using the ML.NET toolbox or based on matlab?
Not sure if this is quite what you're looking for, but if detecting spikes and changes using signals is what you're looking for, check out the anomaly detection algorithms in ML.NET. Here are two tutorials that show how to use them.
Detect anomalies in product sales
Spike detection
Change point detection
Detect anomalies in time series
Detect anomaly period
Detect anomaly
One way to approach this would be to first determine the periodicity of each of the signals independently. This could be done by looking at the frequency distribution of time differences between measurements of state 2 only and separately for each signal.
This will give a multinomial distribution. The shortest time difference will be the duration of the switching event (after discarding time differences less than the max duration of state 2). The second shortest peak will be the duration between the end of one switching event and the start of the next.
When you have the 3 calculations of periodicity you can simply calculate the difference between each of them. Given you have the timestamps of the measurements of state 2 for each signal you should be able to calculate the time of switching for all other signals.

Reactive Extensions rate of change detection from normal movement

Given a sequence of numbers that trend overtime, I would like to use Reactive Extensions to give an alert when there is a sudden absolute change spike or drop. i.e 101.2, 102.4, 101.4, 100.9, 95, 93, 85... and then increasing slowly back to 100.
The alert would be triggered on the drop from 100.9 to 95, each would have a timestamp looking for an an alert of the form:
LargeChange
TimeStamp
Distance
Percentage
I believe i need to start with Buffer(60, 1) for a 60 sample moving average (of a minute frequency between samples).
Whilst that would give the average value, I can't assign an arbitrary % to trigger the alert since this could vary from signal to signal - one may have more volatility that the other.
To get volatility I would then take a longer historical time frame Buffer(14, 1) (these would be 14 days of daily averages of the same signal).
I would then calculate the difference between each value in the buffer and the 14 day average, square and add all these deviations, and divide by the number of samples.
My questions are please:
How would I perform the above volatility calculation, or is it better to just do this outside of RX and update the new volatility value once daily external to the observable stream calculation (this may make more sense to avoid me having to run 14 days worth of 1 minute samples through it)?
How would we combine the fast moving average and volatility level (updated once per day) to give alerts? I am seeing Scan and DistinctUntilChanged on posts on SO, but cant work out how to put together.
I would start by breaking this down into steps. (For simplicity I'll assume the original data source is an observable called values.)
Convert values into a moving averages observable (we'll call this averages here).
Combine values and averages into an observable that can watch for "extremes".
For step 1, you may be able to use the built-in Window method that Slugart mentioned in a comment or the similar Buffer method. A Select call after the Window or Buffer can be used to process the array into a single average value object. Something like:
averages = values.Buffer(60, 1)
.Select((buffer) => { /* do average and std dev calcuation here */ });
If you need sliding windows, you may have to implement your own operator, but I could easily be unaware of one that does exist. Scan along with a queue seem like a good basis for such an operator if you need to write it.
For step 2, you will probably want to start with CombineLatest followed by a Where clause. Something like:
extremes = values.CombineLatest(averages, (v, a) => new { Current = v, Average = a })
.Where((value) = { /* check if value.Current is out of deviation from value.Average */ });
The nice part of this approach is that you can choose between having averages be computed directly from values in line like we did here or be some other source of volatility information with minimal effect on the rest of the code.
Note that the CombineLatest call may cause two subscriptions to values, one directly and one indirectly via a subscription to averages. If the underlying implementation of values makes this undesirable, use Publish and RefCount to get around this.
Also note that CombineLatest will output a value each time either values or averages outputs a value. This means that you will get two events every time averages updates, one for the values update and one for the averages update triggered by the value.
If you are using sliding windows, that would mean a double update on every value, and it would probably be better to simply include the current value on the Scan output and skip the CombineLatest altogether. You would have something like this instead:
averages = values.Scan((v) => { /* build sliding window and attach current value */ });
extremes = averages.Where((a) => { /* check if current value is out of deviation for the window */ });
Once you have extremes, you can subscribe to it and trigger your alerts.

serial input buffer size Matlab

I'm trying to read a lot of data coming from my Arduino, I've set my input buffer to 500000 to make sure that it can handle all these data. My data are 4 sensors readings each samples at 250 Hz. With the default buffer size (712), I used to get snags when I plot the readings in real time and the samples get disordered which makes the plot go crazy. I solved this by increasing the buffer size to 50000. But now, this will work for a while but if I want to run it for 15 minutes, I get the same misbehavior after 5 minutes, with the addition that the plotting gets slower. I do have some processing code along with the live plotting but it shouldn't be like this with such a bi buffer. I want to know whether the buffer will contain all the data from the beginning until it's full or will it keep erasing older data when it gets full (knowing that I already saved it in another vector and plotted it). I truly don't understand why this keeps happening.
kind regards
I.H
When the buffer gets full, once you get new data it erases the old data. The behavior you are seeing is because your processing and your plotting is slower than the flow of the data.
Try to make sure that you optimize you processing
Make sure that for plotting is done by "drawnow". Like this you are sure that if there is anything in the queue it is not executed
Try to avoid saving and keeping all the data
If the problem is still there, you can try to implement a timer to make sure that you are consistent with reading your data

How to analyze scale-free signals and get signal properties

I am new with signal processing, i have following signals which i've got after some pre-processing on original signals.
You can see some of them has some similarities with others and some doesn't. but the problem is They have various range(in this example from 1000 to 3000).
Question
How can i analysis their properties scale-free(what i mean from properties is statistical properties of signals or whatever)??
Note that i don't want to cross-comparing the signals, i just want independent signals signatures which i can run some process on them sometime later.
Anything would help.
If you want to make a filter that separates signals that follow this pattern from signals that don't, well, there's tons of things you could do!
Just think practically. As a first shot at it, you could do something like this (in this order):
Check if the signals are all-positive
Check if the first element is close in value to the last element
Check if the maximum lies "in the middle" somewhere
Check if the first value is small, then the signal grows, then shrinks again
Check if the growth rates are gradual. You could for example analyze their derivatives (after smoothing):
a. derivative should be all-positive for a while, then all-negative.
b. derivative should be smooth (no jumps greater than some tolerance)
Without additional knowledge about the signal's nature/origin, it's going to be hard to come up with more meaningful metrics than these...