Group by field and bin for CloudWatch Logs Insights line chart

I want to create a line chart with two lines in AWS CloudWatch Logs Insights. One line representing mobile users and the other desktop users, showing a success rate for each group of users.
This is the query I am working with:
| fields
    properties.device as device,
    properties.success as success
| stats avg(success) by device, bin(1hour)
The results of this query look promising. As you can see, the results include the device type, a timestamp, and the floating-point number to be plotted on the line chart:
#  device   bin(1hour)               avg(success)
1  desktop  2023-02-01T10:00:00.000  0.6129
2  mobile   2023-02-01T10:00:00.000  0.7453
3  desktop  2023-02-01T09:00:00.000  0.5578
4  mobile   2023-02-01T09:00:00.000  0.6082
However, the Visualization tab shows me this error:
The data is not suitable for a line chart.
Try a bar chart, or group your result by bin function.
I think Logs Insights is getting confused by the overlapping timestamps: it does not know that I intend one time series of mobile data and another of desktop data. To group by a field and bin by time, I seem to be doing the standard thing, using a single by clause with two arguments, but apparently that is not enough to produce a line chart.
Is there a better way to structure my query to convince CloudWatch of what I am trying to do?
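For what it's worth, one workaround I am considering (a sketch only, assuming properties.success is 0 or 1, that the device values are exactly 'mobile' and 'desktop', and that stats accepts a ratio of two sums) is to pivot each device into its own column, so that the only by grouping left is the bin:

# Hypothetical pivot: one column per device, grouped only by bin(1hour)
fields properties.success as success,
    strcontains(properties.device, 'mobile') as is_mobile,
    strcontains(properties.device, 'desktop') as is_desktop
| stats sum(success * is_mobile) / sum(is_mobile) as mobile_success,
        sum(success * is_desktop) / sum(is_desktop) as desktop_success
    by bin(1hour)

With one row per bin and one column per series, the Visualization tab should have nothing left to be confused about.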

Related

Can you calculate active users using time series

My Atomist client exposes metrics on commands that are run. Each command is a metric with a username element as well as a status element.
I've been scraping this data for months without resetting the counts.
My requirement is to show the number of active users over a time period, i.e. 1h, 1d, 7d, and 30d, in Grafana.
The original query was:
count(count({Username=~".+"}) by (Username))
This is an issue because I don't clear the metrics, so it's always a count since inception.
I then tried this:
count(
  max_over_time(help_command{job="Application Name",Username=~".+"}[1w])
  -
  max_over_time(help_command{job="Application Name",Username=~".+"}[1w] offset 1w)
  > 0
)
which works, but only for one command; I have about 50 other commands that need to be added to that count.
I then tried:
{__name__=~".+_command",job="app name"}[1w] offset 1w
but this is obviously very expensive (it times out in the browser) and has issues integrating with max_over_time, which doesn't support it.
Any help? Am I using the metric in the wrong way? Is there a better way to query? My only option at the moment is to repeat the working count format above for each command.
Thanks in advance.
To start, I will point out a number of issues with your approach.
First, the Prometheus documentation recommends against labels with arbitrarily large sets of values (as your usernames are). And, as you can see from your query timing out, they're not entirely wrong to advise against it.
Second, Prometheus may not be the right tool for analytics such as counting active users: partly due to the above, and partly because it is inherently limited by the fact that it samples the metrics (which does not appear to be an issue in your case, but may turn out to be).
Third, you collect separate metrics per command (i.e. help_command, foo_command) instead of a single metric with the command name as a label (i.e. command_usage{command="help"}, command_usage{command="foo"}).
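For illustration, with that layout the query below could drop the __name__ matching entirely (a sketch only; command_usage and its label names are hypothetical stand-ins for whatever naming you choose):

# Same idea as the query below, but with one metric and a command label
count by (command) (
  (
    command_usage{job="Application Name"}
    -
    command_usage{job="Application Name"} offset 1w
  ) > 0
)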
To get back to your question, though: you don't need the max_over_time; you can simply write your query as:
count by (__name__) (
  (
    {__name__=~".+_command",job="Application Name"}
    -
    {__name__=~".+_command",job="Application Name"} offset 1w
  ) > 0
)
This only works, though, because you say that whatever exports the counts never resets them. If that is simply because the exporter has never restarted, and when it does the counts will drop to zero, then you'd need to use increase instead of subtraction, and you'd run into the exact same performance issues as with max_over_time:
count by (__name__) (
  increase({__name__=~".+_command",job="Application Name"}[1w]) > 0
)

Two different averages on a Tableau graph - user and company-wide as reference lines

I'm using Tableau to show items resolved over time, either 'On Time' or 'Overdue'.
I want to be able to show the user's individual performance as one reference line and the company average as a second reference line.
The graph plots the items 'On Time' over time.
I can only get the user's individual average performance to show as a reference line.
Any ideas how to create this company-wide average?
I can't share the workbook as it contains sensitive data, but I can share calculations.
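For context, the kind of thing I am imagining for the second line is a fixed level-of-detail calculation (a guess on my part; [On Time] stands in for my real field):

// Hypothetical calculated field "Company Avg On Time".
// FIXED with no dimensions averages over the whole data source,
// so it should ignore the per-user breakdown in the view.
{FIXED : AVG([On Time])}

Dropped on the view as a reference line, this should stay at the company-wide value even when the sheet is filtered to a single user.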

JMeter to record results on hourly basis

I have a JMeter project with multiple GET and POST requests and assertions for these. I use the Aggregate Results and View Results Tree listeners, but neither of these can store results on an hourly basis. I tried the JMeterPlugins-Standard and JMeterPlugins-Extras packages and the jp@gc - Graphs Generator listener, but all of them use aggregated data instead of hourly data. So I would like to get the number of successful and failed requests/assertions per hour; maybe a bar chart would be most suitable for this purpose.
I'm going to suggest a non-conventional, design-level solution: name your samplers dynamically with the hour (or the date and hour), so that each hour the name changes and the samples fall into a different category.
The code for such a name is:
${__time(dd:HH,)} the rest of sampler name
Such a sampler will show up under a separate label per time slice in the Aggregate Report (I simulated it with minutes/seconds, but the same will happen with days/hours, just on a larger scale).
Pros and cons of such an approach:
Pros:
Simple; you can aggregate anything by hour, minute, or any other time slice while the test is running, not by analysis after execution.
Not listener-dependent; it can be used with pretty much any listener or visualizer.
Cons:
If you also want overall stats, you will have to sum up every sub-category. So it alters the data, but in a way that can still be added back to the original relatively easily.
Calculating __time before every sampler will not go completely unnoticed from a performance perspective, but I don't think it will add visible overhead to the script.
You could get the same data by properly aggregating the JTL or CSV (whichever you use) after execution, so it doesn't give you anything that cannot be achieved using standard methods.
The script needs altering to make this happen; if you have hundreds of samplers, it's going to take a while. And if you want to change back...
You might want to use the Filter Results Tool, which has --start-offset and --end-offset parameters; you can "cut" your results file into "interesting" pieces and plot them according to your requirements.
You can install the Filter Results Tool using the JMeter Plugins Manager.
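A typical invocation would look something like this (file names and offsets are made up; check the plugin's documentation for the exact parameters on your version):

# Keep only samples recorded between hour 1 and hour 2 of the run
# (offsets are in seconds from the start of the results file)
FilterResults.sh --input-file results.jtl \
    --output-file results-hour-2.jtl \
    --start-offset 3600 \
    --end-offset 7200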
Also be aware that, according to JMeter Best Practices, you should:
Use as few Listeners as possible; if using the -l flag as above they can all be deleted or disabled.
Don't use "View Results Tree" or "View Results in Table" listeners during the load test, use them only during the scripting phase to debug your scripts.
You can get whatever information you need from the .jtl results file; you can specify the test results location via the -l command-line argument.
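A typical non-GUI run that produces that file would look like this (file names are placeholders):

# -n = non-GUI mode, -t = test plan, -l = where to write the JTL results
jmeter -n -t test-plan.jmx -l results.jtl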
To get summarized results per hour, add a Generate Summary Results listener to your test plan:
Generates a summary of the test run so far to the log file and/or standard output
Update the interval in jmeter.properties to your needs; for 1 hour, that is 3600 seconds:
summariser.interval=3600
You will get a summary of your requests per hour.
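The relevant block in jmeter.properties would look roughly like this (property names as in a stock installation; verify against your own copy):

# Prefix used for the summary lines; 3600 seconds = one summary per hour
summariser.name=summary
summariser.interval=3600
# Write summaries to the log file and/or to standard output
summariser.log=true
summariser.out=true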
You can try the JMeter Backend Listener. It has integrations with Graphite and InfluxDB. After storing the results in one of these time series databases, you can display them in a Grafana dashboard. Grafana has its own options for showing the results on an hourly, daily, or monthly basis, and so on.

Google Analytics API: tiny differences in results between GA API and Google Analytics UI

I'm querying the GA Reporting API v4 to get some metrics for AdWords keywords.
As dimension I use:
ga:keyword
As metrics I use:
ga:adClicks,
ga:adCost,
ga:CPC,
ga:sessions,
ga:bounceRate,
ga:pageviewsPerSession,
ga:goalConversionRateAll,
ga:transactions,
ga:transactionRevenue
When I compare the results pulled from the API with the results I get from the Google Analytics UI, I found that certain metrics for some keywords have tiny differences.
When I tried GA API v3, I had the same result.
What is the reason?
Why are some returned metrics for keywords fully identical to the results in the UI, while certain ones are not?
I tried various date ranges (1 day, a week, a month), but in all cases I got some tiny differences in some metrics of certain keywords.
Here is a screenshot with an example of the differences in metrics (red marks values that differ; green marks values that are identical).
Problem: The reason for the discrepancy is that you are calling two different reports.
Report 1) UI Report.
As you have seen, this report is made up of two parts: the first being Clicks, Cost, and CPC, which come from the Google AdWords API; the other metrics (sessions, bounce, etc.) come from Google Analytics.
Because you are going into AdWords > Keywords, you are actually setting a filter to select only AdWords traffic.
Report 2) Custom Report.
This report pulls the keywords dimension without any filters. This means the report will also include data for organic keywords and for any utm_term parameters set.
Because sessions from organic keywords have no AdWords data, the first three columns will be the same; however, the Google Analytics-specific columns will show variation in the metrics.
Solution:
To get your reports to match, you need to add a filter to your API request, such as ga:adwordsCustomerID, or ga:source=google combined with ga:medium=cpc.
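In the v4 API that filter goes into the request body as a filtersExpression; a minimal sketch (the view ID and dates are placeholders):

{
  "reportRequests": [{
    "viewId": "XXXXXXXX",
    "dateRanges": [{ "startDate": "2018-01-01", "endDate": "2018-01-31" }],
    "dimensions": [{ "name": "ga:keyword" }],
    "metrics": [
      { "expression": "ga:adClicks" },
      { "expression": "ga:sessions" }
    ],
    "filtersExpression": "ga:source==google;ga:medium==cpc"
  }]
}

Note that filter expressions use == for an exact match and ; for AND.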

Google Analytics Core Reporting API query for exits and entrances metrics - entrance values incorrectly exactly the same as exits

I'm using GA's Core Reporting API to create a report that shows the top exit pages alongside some behavioural metrics for each page. The dimension is ga:exitPagePath, and the metrics I want are:
ga:exits
ga:pageviews
ga:entrances
ga:avgTimeOnPage
ga:bounceRate
ga:exitRate
I'm sorting by -ga:exits. I'm not using any filters or segments.
The query appears to work fine and doesn't return an error; however, the entrances values it returns are incorrect and exactly match the exits values for each page. Other queries for ga:entrances without ga:exits give the correct entrance values.
I may have overlooked it, but I can't find anything in the documentation indicating that these metrics can't be used together. I also tested creating a custom report within the GA interface with these two metrics and found the same result: no error or indication that I can't create a report with both metrics, but entrances incorrectly reported and exactly matching the exits values. I also get the same result in GA's Query Explorer.
Would love to work this out - it seems perfectly logical to me to want to view entrances alongside exits for exit pages :)
A better-late-than-never response.
It makes sense, because all users that have visited your site (entrances) have also left it (exits).
It becomes meaningful when you use these metrics along with a page dimension (ga:pagePath, for example).
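So a query along these lines (sketched as Query Explorer-style parameters; the view ID and dates are placeholders) should return entrances and exits that genuinely differ per page:

ids=ga:XXXXXXXX
dimensions=ga:pagePath
metrics=ga:entrances,ga:exits,ga:pageviews
sort=-ga:exits
start-date=2018-01-01
end-date=2018-01-31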