Kapacitor - shift series to epoch start

Using a Kapacitor batch script, how can I
shift the timestamps of a time series so that they start at the epoch,
add a tag, and
store the result in a separate InfluxDB database?
In other words: I’d like to run a “SELECT INTO”-style query, using the elapsed nanoseconds between measurements as the timestamps (and adding a tag).
Background: I’d like to extract the measurements taken during “experiments” and store them “time-neutralised” in a dedicated experiments database, in order to compare the measurements of multiple experiments in a single Grafana chart.
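Whatever tool ends up doing it, the "time-neutralising" step itself is simple arithmetic: subtract the first timestamp from every point and attach a tag. A minimal Python sketch of the idea (outside Kapacitor; the point format and tag names here are assumptions for illustration):

```python
def neutralise(points, tag_key="experiment", tag_value="exp-001"):
    """Shift a series so its first timestamp sits at the epoch (t=0)
    and attach a tag to every point.

    points: list of (timestamp_ns, value) tuples, assumed sorted by time.
    Returns a list of dicts ready to be written to a separate database.
    """
    if not points:
        return []
    t0 = points[0][0]
    return [
        {"time": t - t0, "tags": {tag_key: tag_value}, "value": v}
        for t, v in points
    ]

# Two samples taken 0.5 s apart; after neutralising, the series starts at t=0.
series = [(1_600_000_000_000_000_000, 1.0),
          (1_600_000_000_500_000_000, 2.0)]
shifted = neutralise(series, tag_value="run-42")
```

With every experiment rebased to t=0 and tagged with its run id, multiple experiments line up on a common x-axis in a single Grafana chart.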

Related

AnyLogic mean waiting time in queue

I would like to get the mean waiting time that each unit spends in my queue, for every hour (so between 7-8 am, for example, 4 minutes; between 8-9, 10 minutes; and so on). That is my current queue with my time measure. Is there a way to do so?
Create a normal dataset and call it datasetHourly. Deactivate the option Use time as horizontal value. This is where we will store your hourly data.
Create an event and set the trigger to cyclic, once every hour.
This cyclic event will get the current mean of your time measurement (waiting time + service time in your example) and save this single value in the extra dataset.
We also have to clear the dataset that is integrated into timeMeasureEnd, in order to get clean statistics again for the next hourly interval.
datasetHourly.add(time(HOUR),timeMeasureEnd.dataset.getYMean());
timeMeasureEnd.dataset.reset();
You can now visualise the hourly development by adding datasetHourly to a normal plot.

How to do a distinct count of a metric using graphite datasource in grafana?

I have a metric that shows the state of a server. The values are integers; if the value is 0 (zero) the server is stable, otherwise it is unstable. The graph we have is at minute-level resolution. I want to show an aggregated value indicating how many hours the server was unstable in the selected time range.
Let's say, if I select "Last 7 days" as the time duration, we should get X hours of server instability.
One more thing: I have a line graph (time series graph) that shows the state of the server. When I select "Last 24 hours" or "Last 48 hours" I get the graph at a minute level, but when I increase the duration to a quarter I get a point every 5 minutes or so. I understand it's aggregating the values, but does anybody know how Grafana does the aggregation?
I have tried the scaleToSeconds and consolidateBy functions, among others, to first get the count of non-zero-value minutes, but with no success.
Any help would be greatly appreciated.
Thanks in advance.
There are a few different ways to tackle this, there are 2 places that aggregation happens in this situation:
When you query for a time range longer than your raw retention interval and whisper returns aggregated data. The aggregation method used here is defined in your carbon aggregation configuration.
When Grafana sends a query to Graphite it passes maxDataPoints=<width of graph in pixels>, and Graphite will perform aggregation to return at most that many points (because you don't have enough pixels to render more points than that). The method used for this consolidation is controlled by the consolidateBy function.
It is possible for both of these to be used in the same query. If, for example, you have a panel that queries 3 days' worth of data, and you store 2 days at 1-minute and 7 days at 5-minute intervals in whisper, then you'd get 72 * 60 / 5 = 864 points from the 5-minute archive in whisper; but if your graph is only 500px wide, that would be consolidated at runtime down to 10-minute intervals, returning 432 points.
So, if you want to always have access to the count then you can change your carbon configuration to use sum aggregation for those series (and remove the existing whisper files so new ones are created with the new aggregation config), and pass consolidateBy('sum') in your queries, and you'll always get the sum back for each interval.
That said, you can also address this at query time by multiplying the average back out to get a total (assuming that your whisper aggregation config is using average). The simplest way to do that will be to summarize the data with average into buckets that match the longest aggregation interval you'll be querying, then scale those values by that interval to calculate the total number of minutes. Finally, you'll want to use consolidateBy('sum') so that any runtime consolidation will work properly.
consolidateBy(scale(summarize(my.series, '10min', 'avg'), 60), 'sum')
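The arithmetic that lets you scale averages back into totals can be checked with a quick sketch in plain Python (no Graphite involved; the sample values are made up):

```python
# Ten 1-minute samples: 1 = unstable minute, 0 = stable minute.
raw = [0, 1, 1, 0, 1, 0, 1, 0, 1, 0]

bucket_avg = sum(raw) / len(raw)   # what an 'average' rollup would store
recovered = bucket_avg * len(raw)  # multiply back by the bucket width

# recovered now equals the true count of unstable minutes in the bucket
```

This avg times n equals sum identity is what lets a scale-then-sum query recover totals from rollups that were stored as averages.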
With all of that said, you may want to consider reporting uptime in terms of percentages rather than raw minutes, in which case you can use the raw averages directly.
You say that when the value is zero (0) the server is healthy - what other values are reported while the server is unhealthy/unstable? If you're only reporting zero (healthy) or one (unhealthy), for example, then you could use the sumSeries function to get a count across multiple servers.
Some more information is needed here about the types of values the server is reporting in order to give you a better answer.
Grafana aggregates (consolidates) data typically using the average aggregation function. You can override this with the 'sum' aggregation in the consolidateBy function.
To get a running calculation over time, you would most likely have to use the summarize function (also with the sum aggregation) and define the time period, e.g. 1 hour, 1 day, 1 week, and so on. You could take this a step further by combining this with a time template variable so that as the period grows/shrinks, the summarize period will increase/decrease accordingly.

Adding lagged terms into functions [in MATLAB]

I am using a MATLAB toolbox, specifically, https://uk.mathworks.com/matlabcentral/fileexchange/32882-armax-garch-k-sk-toolbox-estimation-forecasting-simulation-and-value-at-risk-applications
To insert data into the functions, the author defines a data matrix and then uses data(:,3) for the third column, which represents a series.
I would like to do this but with data(:,3) lagged by one period.
My question: is there a way I can write something in MATLAB that lags the dataset by one period, which can then be inserted into the function?
If I understand correctly, you would like to lag a series by one time period, with the time period being however you collect the data, for example, daily data, lag the series by one day.
If so, you can use the lagmatrix function.
To provide an example,
LAGGEDX = lagmatrix(data(:,3),1)
This would lag your data(:,3) series by one day if it is daily data; you could then insert LAGGEDX in place of data(:,3). Note that lagmatrix pads the leading position(s) created by the lag with NaN, so you may need to trim or handle those rows.

How to calculate the average value in a Prometheus query from Grafana

I was trying to create a Prometheus graph in Grafana, but I can't find the function that calculates the average value.
For example, to create a graph for read_latency, the result contains many tags. If there are 3 machines, there will be 3 separate tags, for machine1, machine2 and machine3.
I want to combine these three together, so there will be only one tag, machines, whose value is the average of the three.
It seems that Prometheus's query functions don't have something like average(), so I am not sure how to do this.
I used to work with InfluxDB, where the graph could show exactly that.
I think you are searching for the avg() operation. See the documentation.
Use the built-in $__interval variable, where node and name are custom labels (depending on your metrics):
sum(avg_over_time(some_metric[$__interval])) by (node, name)
or a fixed value like 1m, 1h, etc.:
sum(avg_over_time(some_metric[1m])) by (node, name)
You can filter using Grafana variables:
sum(avg_over_time(some_metric{cluster=~"$cluster"}[1m])) by (node, name)
Short answer: use the avg() function to return the average value across multiple time series. For example, avg(metric) returns the average value over all time series with the name metric.
Long answer: Prometheus provides two functions for calculating the average:
avg_over_time calculates the average over the raw samples stored in the database, over the lookbehind window specified in square brackets. The average is calculated independently for each matching time series. For example, avg_over_time(metric[1h]) calculates the average of the raw samples over the last hour for each time series with the name metric.
avg calculates the average over multiple time series. The average is calculated independently for each point on the graph.
If you need to calculate the average over raw samples across all the time series that match the given selector, per time bucket, e.g.:
SELECT
  time_bucket('5 minutes', timestamp) AS t,
  avg(value)
FROM table
GROUP BY t
Then the following PromQL query must be used:
sum(sum_over_time(metric[$__interval])) / sum(count_over_time(metric[$__interval]))
Do not use avg(avg_over_time(metric[$__interval])), since it returns the average of averages, which isn't equal to the real average. See this explanation for details.
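To see why the average of averages goes wrong, here is a small plain-Python illustration (the sample counts are made up; the point is that the two series contribute different numbers of raw samples to the window):

```python
a = [1, 1, 1, 1]   # four raw samples in the window, average 1.0
b = [5, 5]         # two raw samples in the window,  average 5.0

# Averaging the per-series averages weights both series equally...
avg_of_avgs = (sum(a) / len(a) + sum(b) / len(b)) / 2

# ...but the true average weights each raw sample equally.
true_avg = (sum(a) + sum(b)) / (len(a) + len(b))
```

The sum-of-sums divided by sum-of-counts form of the PromQL query reproduces true_avg, which is why it is preferred over nesting avg() around avg_over_time().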

How to deal with multiple time series in MATLAB?

I have some smart meter data which shows gas and electricity meter readings at 30 min intervals for about two years, for 16000 households.
The data is stored in separate .mat files, with a datetime variable for the timestamps and a double variable for the actual readings. Some of the data has gaps, from a few hours to several days or weeks. I want to create a timeseries object containing all of the data and a continuous timestamp for the two-year period, so that I can then interpolate the gaps.
Another option would be to use synchronize, but for this it seems the 16000 data series would need to be in individual timeseries objects, which seems cumbersome.
I have tried this with timeseries objects and financial time series but cannot get all of the 16000 data series and corresponding timestamps into one timeseries object. When I try to add more than one series to an existing timeseries object, it is added "in series" rather than "in parallel" (i.e. the data ends up in the Data:1 column).
When I tried with a financial time series, I had difficulty preparing the datetime data in a cell array.
Any ideas what the most efficient way to do this is?
Thanks
Russell
Depending on the version of MATLAB that you have, the best approach would seem to be to use a table variable.
Tables can store disparate types, so you can keep the date/time stamp as well as the meter readings in the same variable.
You can horizontally concatenate the tables (or otherwise join them as you read them in), so that you end up with a single time series that has one date variable and the responses for each household.
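To illustrate the join-as-you-read idea outside MATLAB, here is a plain-Python sketch (dicts standing in for MATLAB tables; the household readings and timestamps are made up):

```python
def merge_households(households):
    """Outer-join several (timestamp -> reading) series on a shared
    timeline, leaving None where a household has a gap.

    households: dict mapping household id -> dict {timestamp: reading}.
    Returns (timeline, table), where table[hid] is a column of readings
    aligned to timeline.
    """
    # The shared timeline is the sorted union of every series' timestamps.
    timeline = sorted({t for series in households.values() for t in series})
    # Each household becomes one column; missing timestamps become None.
    table = {hid: [series.get(t) for t in timeline]
             for hid, series in households.items()}
    return timeline, table

h1 = {0: 1.2, 30: 1.3}         # readings at 0 and 30 minutes
h2 = {30: 0.4, 60: 0.5}        # starts later than h1
timeline, table = merge_households({"h1": h1, "h2": h2})
# Gaps appear as None in the aligned columns, ready for interpolation.
```

This mirrors what a MATLAB table join achieves: one shared time variable, one column per household, with gaps made explicit so they can be interpolated afterwards.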