I'm using Graphite + Grafana to monitor (by sampling) queue lengths in a test system at work. The data that gets uploaded to Graphite is grouped into different series/metrics by properties of the payloads in the queue. These properties can be somewhat arbitrary, at least to the point where they are not all known at the time when the data collection script is run.
For example, a property could be the project that the payload belongs to and this could be uploaded as a separate series/metric so that we can monitor the queues broken down by the different projects.
This has the consequence that Graphite returns a lot of null values for certain metrics if the queues in the test system did not contain any payloads with properties that would group them into that specific series/metric.
For example, if a certain project did not have any payloads in the queue at the time the data collection was run.
In Grafana this is not so nice: line graphs don't show up as connected lines, and gauges show either null or the last non-null value.
For line graphs I can just choose to connect null values in Grafana, but for gauges that's not possible.
I know about the keepLastValue function in Graphite, and the fact that it includes a limit on how long to keep the value actually suits me well, as I would like to keep the last value only until the next time data collection is run. Data collection runs periodically at known intervals.
The problem with keepLastValue is that it expects the limit as a number of points; I would rather give it a time period. In Grafana the relationship between time and data points is very dynamic, so it's not easy to hard-code a good point limit for keepLastValue.
Thus, my question is: Is there a way to tell Graphite to keep the last value for a given time instead of a given number of points?
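For reference, a rough workaround sketch in Python (the metric name, host, and numbers are all assumptions): if you know the finest retention step in storage-schemas.conf and your collection interval, you can translate the desired time window into the point count that keepLastValue expects, at least for queries served at raw resolution:

    import requests

    STORAGE_RESOLUTION_S = 10     # assumed finest retention step in storage-schemas.conf
    COLLECTION_INTERVAL_S = 300   # assumed collection script interval

    # One collection cycle expressed as a number of raw data points.
    points = COLLECTION_INTERVAL_S // STORAGE_RESOLUTION_S  # -> 30

    resp = requests.get(
        "http://graphite.example.com/render",  # hypothetical host
        params={
            "target": f"keepLastValue(queues.byProject.*.length, {points})",
            "from": "-1h",
            "format": "json",
        },
    )
    print(resp.json())

The caveat is that this only holds at raw resolution: once Graphite rolls data up into coarser retention archives (or Grafana's maxDataPoints forces consolidation), the seconds-per-point ratio changes, which is exactly the dynamic relationship described above.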
I'm trying to use a Kibana trigger from a monitoring alert: in the message box I formulate a Kibana link that points to a specific Kibana dashboard with a time range of the last 7 days from the time the alert is triggered, and I send it to a Slack webhook.
I'm using the ctx.periodStart variable as the end of my range. Is it possible to get the exact timestamp that's 7 days before ctx.periodStart as part of my Kibana trigger message? In other words, if ctx.periodStart is 2022-04-11T19:19:25Z and that is the end of my range, I want the beginning timestamp to be 2022-04-04T19:19:25Z. I can't seem to find anything that can do this. The only workaround I've found is to use the ctx.results.0.hits.hits.0._source.timestamp variable as my starting period, but that timestamp is a bit earlier than exactly 7 days before ctx.periodStart. A relative time range (like now-7d) isn't going to cut it: we fire this Kibana monitor alert every Friday at noon with a query where we know we will get at least one hit, and every Tuesday at midnight where we are not likely to get any results. And since this alert goes to a Slack message with the Kibana dashboard link once a week, a relative range like now-7d would mess up the Kibana links in past Slack alerts from the same monitoring alert.
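For illustration, the date arithmetic being asked for is trivial outside the template; a minimal sketch in Python (assuming ctx.periodStart arrives as an ISO-8601 UTC string, and that the subtraction can happen in a script that receives the webhook, since the trigger message templating itself may not support date math):

    from datetime import datetime, timedelta

    period_start = "2022-04-11T19:19:25Z"  # example value from above
    end = datetime.strptime(period_start, "%Y-%m-%dT%H:%M:%SZ")
    begin = end - timedelta(days=7)
    print(begin.strftime("%Y-%m-%dT%H:%M:%SZ"))  # -> 2022-04-04T19:19:25Z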
Thanks.
Trying to understand the difference between the two: Aggregator vs Aligner.
The docs were not helpful for me.
What I'm trying to achieve is to get the bytes of logs generated within a week for each namespace and container combination. For example, I want to see that container C in namespace N generated 10 GB of logs during the last 7 days.
This is how far I got:
Resource type = Kubernetes Container
Metric = Log bytes
Group by = namespace_name and container_name
Aggregator = sum(?) mean(?)
Minimum alignment period = 1(?) 7(?) days
Aligner = sum(?) mean(?)
I was confused with this until I realized that a single metric, such as kubernetes.io/container/cpu/core_usage_time is available in multiple different resources in my cluster.
So when you search for that metric, you'll get a whole lot of different resources that emit that metric. Aggregation is adding up all the data from those different resources WITH THAT SAME METRIC.
This all combines into one "time series" for that metric, an aggregation of all the individual time series from each of those different resources.
Now, alignment is the process of using that time series and putting all the data points through a function (over a period of time, known as the alignment period) which results in one single data point (per alignment period).
So aggregation combines the same metric across multiple resources, while alignment combines multiple data points in the same time series into one data point (per alignment period, which is why that field is required when using alignment).
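To make that concrete for the log-bytes question above, here is a hedged sketch using the google-cloud-monitoring Python client. It assumes the "Log bytes" metric corresponds to logging.googleapis.com/byte_count on the k8s_container resource (verify the exact metric type in Metrics Explorer); since that is a delta counter, summing is the right choice for both the aligner and the reducer, with a 7-day alignment period:

    import time
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    project = "projects/my-project-id"  # hypothetical project ID

    now = int(time.time())
    week = 7 * 24 * 3600
    interval = monitoring_v3.TimeInterval(
        {"start_time": {"seconds": now - week}, "end_time": {"seconds": now}}
    )
    aggregation = monitoring_v3.Aggregation({
        "alignment_period": {"seconds": week},  # one 7-day bucket
        # Aligner: sum the delta points of each individual series within the bucket.
        "per_series_aligner": monitoring_v3.Aggregation.Aligner.ALIGN_SUM,
        # Aggregator/reducer: sum the aligned series across resources...
        "cross_series_reducer": monitoring_v3.Aggregation.Reducer.REDUCE_SUM,
        # ...but keep one output series per namespace/container combination.
        "group_by_fields": [
            "resource.label.namespace_name",
            "resource.label.container_name",
        ],
    })
    for series in client.list_time_series(request={
        "name": project,
        "filter": 'metric.type = "logging.googleapis.com/byte_count" '
                  'AND resource.type = "k8s_container"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        "aggregation": aggregation,
    }):
        labels = series.resource.labels
        print(labels["namespace_name"], labels["container_name"],
              series.points[0].value.int64_value)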
I want to query my Graphite server to retrieve certain metrics.
I am able to query all data points within a given time period, but my requirement is to query the data points for a specific time of day on previous days.
How can I do this?
The Graphite Render API supports a number of arguments to make your query more specific. In particular, the from / until arguments will be useful to you; you can read about them here: https://graphite.readthedocs.io/en/latest/render_api.html#from-until
edit: I should add that if you're using Grafana for visualising your data, you can click+drag on the graph to select specific time ranges, or use the timepicker in the top-right corner, choose Custom, and set your range there.
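For example, here is a short sketch in Python of a query for one specific hour on a previous day, using the HH:MM_YYYYMMDD absolute-time syntax that from / until accept (the metric name and host are placeholders):

    import requests

    params = {
        "target": "my.queue.metric",   # hypothetical metric
        "from": "18:00_20240101",      # start: 18:00 on 2024-01-01
        "until": "19:00_20240101",     # end: 19:00 on the same day
        "format": "json",
    }
    resp = requests.get("http://graphite.example.com/render", params=params)
    print(resp.json())

To collect the same hour across several previous days, you would issue one such request per day.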
I have a JMeter project with multiple GET and POST requests and assertions for these. I use the Aggregate Report and View Results Tree listeners, but neither of these can store results on an hourly basis. I tried the JMeterPlugins-Standard and JMeterPlugins-Extras packages and the jp@gc - Graphs Generator listener, but all of them use aggregated data instead of hourly data. So I would like to get the number of successful and failed requests/assertions per hour; a bar chart would probably be most suitable for this purpose.
I'm going to suggest a non-conventional design-level solution: name your samplers dynamically with the hour (or date and hour), so that each hour the name changes and the samplers appear in a different category, i.e.:
The code for such a name is:
${__time(dd:HH,)} the rest of sampler name
(dd:HH is a SimpleDateFormat pattern: the day of month and the hour on a 24-hour clock.) Such a sampler will appear as a separate row per time slice in the Aggregate Report (here I simulated it with minutes/seconds, but the same happens with days/hours, just on a larger scale).
Pros and cons of such an approach:
Pros:
- Simple: you can aggregate anything by hour, minute, or any other time slice while the test is running, rather than through analysis after execution.
- Not listener-dependent: it can be used with pretty much any listener or visualizer.
Cons:
- If you also want overall stats, you will have to sum up every sub-category. So it alters the data, but in a way that can still be added back to the original relatively easily.
- Calculating __time before every sampler will not go completely unnoticed from a performance perspective, but I don't think it will add visible overhead to a script.
- You could get the same data by properly aggregating the JTL or CSV (whichever you use) after execution, so it doesn't give you anything that is impossible to achieve using standard methods.
- The script needs altering to make this happen. If you have hundreds of samplers, it's going to take a while. And if you want to change back...
You might want to use the Filter Results Tool, which has --start-offset and --end-offset parameters: you can "cut" your results file into "interesting" pieces and plot them according to your requirements.
You can install Filter Results Tool using JMeter Plugins Manager
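A hedged example invocation (file names are placeholders, offsets are seconds from the start of the test, and the exact script name may differ between plugin versions; on Windows it would be a .bat file), cutting the second hour of a run into its own file:

    FilterResults.sh --input-file results.jtl --output-file hour2.jtl --start-offset 3600 --end-offset 7200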
Also be aware that according to JMeter Best Practices you should
Use as few Listeners as possible; if using the -l flag as above they can all be deleted or disabled.
Don't use "View Results Tree" or "View Results in Table" listeners during the load test, use them only during scripting phase to debug your scripts.
You can get whatever information you need from the .jtl results file; you can specify the test results location via the -l command-line argument.
To get summarized results per hour, add a Generate Summary Results listener to your test plan:
Generates a summary of the test run so far to the log file and/or standard output
Update the interval in jmeter.properties to your needs (1 hour = 3600 seconds):
summariser.interval=3600
You will get a summary of your requests for each hour.
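For reference, the related summariser properties look like this (values other than the interval are the defaults; it is usually better to override them in user.properties than to edit jmeter.properties directly):

    summariser.name=summary
    summariser.interval=3600
    summariser.log=true
    summariser.out=true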
You can try the JMeter Backend Listener. It has integrations with Graphite and InfluxDB. After storing the results in one of these time series databases, you can display them in a Grafana dashboard. Grafana has its own controls for showing the results on an hourly, daily, or monthly basis, and so on.
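For reference, a sketch of the Graphite settings in the Backend Listener (the host and prefix are placeholders; the sender class and port are the usual defaults):

    Backend Listener implementation: org.apache.jmeter.visualizers.backend.graphite.GraphiteBackendListenerClient
    graphiteMetricsSender = org.apache.jmeter.visualizers.backend.graphite.TextGraphiteMetricsSender
    graphiteHost = graphite.example.com
    graphitePort = 2003
    rootMetricsPrefix = jmeter.
    summaryOnly = false

With rootMetricsPrefix set, results land in Graphite under jmeter.* and can then be graphed in Grafana with its usual time-range controls.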