Silk Performer - recalulating a specific time period - silk-performer

Is it possible to have Silk Performer recalculate results based on a specific time period from testing? What I am seeing is a spike at the end of my test. This needs investigated, but I would like to show that prior to that spike, the average time was good. Because of this spike I am getting a couple seconds higher - at least that is my assumption right now. I would like to show that before the spike the times are good.
Here is what I am seeing:

In the Silk Performance Explorer there is a resample option under results. You can set the specific time period for calculation there.

Related

How to persist previous data point when time range doesn't include a data point

TL;DR:
Can I get Grafana to show me the previous data point, when the currently selected time period does not have a data point? I have an example which sounds ridiculous, but at least it's simple to understand: I send data every 1 minute, and I wish to zoom into the last 30 seconds, and still see data. You may ask "why not just zoom out to 2 minutes" but the reason is that other data is on the same graph that has updated more often, and I wish to compare with that data. Also, for the more lengthy reasons below.
If not, how can I achieve what I want to achieve, see below?
Context
For a few years, I have been monitoring the water level in three of our basement sumps (which have pumps installed) by sending this data from Node-RED to InfluxDB, then visualising the sump levels in Grafana. I have set up three waterproof ultrasonic distance sensors, each pointed down a pipe that is inserted vertically into each sump. The water fills the pipe and the distance sensor, connected to an Arduino, sends me the reading. The Arduino also has other sensors connected (temp / humidity) and deals with distance calibrations to calculate the percent full of each sump. All this data is sent to Node-RED. In total, I am sending 4 values per sump: distance measurement in mm, percent full, temp, humidity. So that's 12 fields. Data is sent every 2 seconds, because I wished to have a reasonably high resolution to see nice curves in graphs.
Also I decided to store all this data so that I could later troubleshoot issues (we have had sewage floods resulting in water not being able to be pumped away, etc...) and design some warning systems for these issues based on data.
Storing 12 values for every 2 seconds, over the course of a number of years, takes up a lot of space (8GB).
Nature of the data
Storing this resolution of data has also helped me be able to describe the nature of the data. I will do so here.
(1) Non-meaningful NOISE (see below) - the percent-full reading goes up and down by 1 or 2 percent every couple of seconds:
(2) Meaningful DRIFT (see below) - I don't mean sensor drift, I am referring to actual water levels changing slowly over time, e.g. over 1 day or 1 week. Perhaps condensation on the walls drips down into the sump, or water evaporates from the sump, and the value can waver by a few percent over the course of a day. Each sump has slightly different characteristics.
(3) Meaningful MONITORING DATA - during wet weather, depending on rainfall amount, the sumps fill up over the course of say 30 mins to 3 hours. Then the pumps run and the water level drops again, wavers a bit, then the sumps continue to fill up. If the rain stopped, you can see a lovely curve as the water fills in progressively more slowly (see the green line below):
Solution to downsample
I know Influx has its own downsampling possibilities, however because of the nature of the data (which can hardly vary for 2 months but when it does, I really need to capture it in detail), I don't think lowering the sample rate is a great idea.
I have some understanding of digital filters (e.g. low pass etc) but have never programmed one myself. So I have written a basic filter in javascript (a Node-RED function) to filter the data in realtime as follows: only send each reading when it has changed from the previous one by x amount. (And update the previous one, when that occurs.)
This has already vastly reduced the amount of data being stored, and I can vary x to filter out noise shown in my first graph above, at the expense of resolution when the pumps run. Even if I set the x value to 2, it still vastly reduces data over long periods of dry weather.
So - onto my problem! Now data is not being logged to InfluxDB unless there is some meaningful change. Which means that when I zoom in to e.g. 15 minute timeframe of data, there is nothing to see.
Grafana does have the option of "fill (previous)" but this draws a line between points on the existing graph, rather than showing the previous data as if it hasn't changed since that point. Now my grafana dashboard looks a bit sad :(
One proposed solution is, in addition to sending "delta" data, send "summary" data, that is - send a full suite of data every 1 minute regardless of whether data changed or not. But then we get noise back again, and pointless storage.
Any other ideas?

what is the default grafana setting for $__rate_interval

I understand that rate(xyz[5m]) * 60 is the rate of xyz per minute, averaged over 5 mins.
How then would $__rate_interval and $__interval be defined,
possibly in the same syntax?
What format is rate being measured here, in my panel? Per minute, per second?
What is the interval= 30s in my panel here? My scraping interval is set to 5s.
How do i change the rate format?
See New in Grafana 7.2: $__rate_interval for Prometheus rate queries that just work.
Rate is always per second. See Grafana documentation for the rate function.
Click on Query options, then click on the Info-Symbol. An explanation will be displayed.
To get rate per minute, just multiply the rate with 60.
Edit: ($__rate_interval and $_interval)
Prometheus periodically fetches data from your application. Grafana periodically fetches Data from Prometheus. Grafana does not know, how often Prometheus polls your application for data. Grafana will estimate this time by looking at the data. The $__interval variable expands to the duration between two data points in the graph. (Note that this is only true for small time ranges and high resolution as the intended use case for $__interval is reducing the number of data points when the time range is wide. See Approximate Calculation of $__interval.)
If the time-distance between every two data points in each series is 15 seconds, it does not make sense to use anything less than [15s] as interval in the rate function. The rate function works best with at least 4 data points. Therefore [1m] would be much better than anything betweeen [15s] and [1m]. This is what $__rate_interval tries to achieve: guessing a minimal sensible interval for the rate function.
Personally, I think, this does not always work if your application delivers sparse data. I prefer using fixed intervals like 10m or even 1h or 1d in these situations. The interval need to be great enough to get you enough data points for the metric to work with the rate function.
A different approach would be to use any of $__rate_interval and $_interval but also set the Min step parameter for the query in the Grafana UI to be big enough.

What period is P99 or P95 calculated over?

When committing to/setting SLAs for a service, what time period should the SLA be calculated over?
For example, if I wanted all the services in my organization to commit to P95 latency, and one of the services commits to 500ms, what is the time window - because the P95 will be different based on the time window we look at.
Depends on in what cycles your latency fluctuates.
No daily / hourly peaks? A couple thousand samples do just fine.
Daily fluctuations (e.g. peak usage, concurrent backups etc.)? Then you will need to measure at least a whole day.
Weekly fluctuations (e.g. tied to work hours or evening activities etc.)? Then you will need to sample over a full week.
There is no strict requirement to sample everything over the chosen time window, but your time window better be representative or you may be held liable. Also make sure to be fair when you under-sample.
If you want to be on the safe side, take the worst-case-scenario in your load cycle, and within that scenario take a full minute worth of samples. That gives you a good estimate what will be held against you.

How to visualize cycle times of a periodic process?

I have a periodic backend process and I would like to visualize the history of the length of cycles on my dashboard. Is it possible?
I have full control over the data/metrics I generate, so I could perhaps increment a counter every time a cycle completes (a cycle takes about 3 days), so I would get counter updates every 3 days or so. Then how could I get Grafana to report the length of each cycle? (for instance: 72h; 69h; 74h; etc.) The actual widget doesn't matter, but I need something visual to tell me at once if cycles are getting faster or slower.
Any pointers or ideas are welcome.
It looks like a standard time series: X-axis - time, Y-axis - duration [s]:
Then you may add:
trend line
aggregations (min/max/avg/derivation/diff/...)
moving average
other math functions, which are available in used datasource

reset chart to 0 in grafan

Below is a chart I have in grafana:
My problem is that if my chosen time range is say 5 minutes, the graph wont show only what happened in the last 5 minutes. So in the picture, nothing happened in the past 5 minutes so it's just showing the last points it has. How can I change this so that it goes back to zero if nothing has changed? I'm using a Prometheus counter for this, if that is relevant.
As explained in the Prometheus documentation, a counter value in itself is not of much use. It depends on when your job was last restarted and everything that happened since.
What's interesting about a counter is how much it changed over some period of time. I.e. either the average rate of change per second (e.g. 3 queries per second) or the increase over some time range (e.g. 10K queries in the last hour).
So instead of graphing something like e.g. http_requests, you should graph rate(http_requests[1m]) (the averate number of requests over the previous 1 minute) or increase(http_requests[1h]) (the total number of requests over the past hour). You can play with the range size until you get something which makes sense for your data. But make sure to use a range at least 2x your scrape interval (and ideally more, as Prometheus is somewhat daft in the way it computes rates/increases).