Selecting multiple values/aggregators from Influxdb, excluding time - grafana

I have an InfluxDB measurement consisting of
> SELECT * FROM results
name: results
time                artnum duration
----                ------ --------
1539084104865933709 1234   34
1539084151822395648 1234   81
1539084449707598963 2345   56
1539084449707598123 2345   52
and other tags. Both artnum and duration are fields (though that is changeable). I'm now trying to create a query (to use in Grafana) that gives me the following result, with a calculated mean() and the number of measurements per artnum:
artnum mean_duration no. measurements
------ ------------- ----------------
1234   58            2
2345   54            2
First of all: is it possible to exclude the time column? Secondly, what is the InfluxDB way to create such a table? I started with
SELECT mean("duration"), "artnum" FROM "results"
resulting in ERR: mixing aggregate and non-aggregate queries is not supported. Then I found https://docs.influxdata.com/influxdb/v1.6/guides/downsampling_and_retention/, which looked like what I wanted. I then created an infinite retention policy (duration 0s) and a continuous query:
> CREATE CONTINUOUS QUERY "cq" ON "test" BEGIN
SELECT mean("duration"),"artnum"
INTO infinite.mean_duration
FROM infinite.test
GROUP BY time(1m)
END
I followed the instructions, but after I fed some data to the db and waited for 1m, `SELECT * FROM "infinite"."mean_duration"` did not return anything.
Is that approach the right one or should I continue somewhere else? The very goal is to see the updated table in grafana, refreshing once a minute.

InfluxDB is a time series database, so you really need the time dimension, also in the response. You will have a hard time with Grafana if your query returns non-time-series data, so don't try to remove time from the query. A better option is to hide the time column in the Grafana table panel: use column styles and set Type: Hidden.
InfluxDB doesn't have tables, but measurements. You likely only need a query with proper grouping, no advanced continuous queries, etc. Try and improve this query*:
SELECT
MEAN("duration"),
COUNT("duration")
FROM results
GROUP BY "artnum" fill(null)
*you may have a problem with grouping in your case, because artnum is an InfluxDB field - a better option is to save artnum as an InfluxDB tag.
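InfluxQL itself can't easily be run outside InfluxDB, but the shape of the grouped result can be sketched with a relational analogue. The sketch below uses SQLite in place of InfluxDB; table and column names mirror the question, and the rows are the question's sample data.

```python
import sqlite3

# Relational analogue of the InfluxQL GROUP BY "artnum" query above
# (SQLite stands in for InfluxDB; rows are the question's sample data).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE results (time INTEGER, artnum TEXT, duration REAL)")
con.executemany("INSERT INTO results VALUES (?, ?, ?)", [
    (1539084104865933709, "1234", 34),
    (1539084151822395648, "1234", 81),
    (1539084449707598963, "2345", 56),
    (1539084449707598123, "2345", 52),
])
rows = con.execute("""
    SELECT artnum, AVG(duration) AS mean_duration, COUNT(duration) AS n
    FROM results
    GROUP BY artnum
    ORDER BY artnum
""").fetchall()
for r in rows:
    print(r)
```

Note that the mean for artnum 1234 comes out as 57.5; the question's expected value of 58 is that mean rounded.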

Related

I can't show to Grafana what time field it should use for chart building

I have a PostgreSQL DataSource with the following table:
It's basically a log table. All I want is to show on a chart how many successful records (with http_status == 200) I have per hour. Sounds simple, right? I wrote this query:
SELECT
count(http_status) AS "success_total_count_per_hour",
date_trunc('hour', created_at) "log_date"
FROM logs
WHERE
http_status = 200
GROUP BY log_date
ORDER BY log_date
It gives me the following result:
Looks good to me. I'm going ahead and trying to put it into Grafana:
Ok, I get it, I have to help Grafana understand which field holds the time.
I go to the Query Builder and see that it completely breaks my query. Since that moment I have been completely lost. Here is the Query Builder screen:
How do I explain to Grafana what I want? I just want a simple chart like:
Sorry for the rough picture, but I think you get the idea. Thanks for any help.
Your time column (e.g. created_at) should be of TIMESTAMP WITH TIME ZONE type*
Use a time condition; Grafana has a macro for it, so it will be easy, e.g. WHERE $__timeFilter(created_at)
You want hourly grouping, so you need to write the select for that. Again, Grafana has a macro: $__timeGroupAlias(created_at,1h,0)
So final Grafana SQL query (not tested, so it may need some minor tweaks):
SELECT
$__timeGroupAlias(created_at,1h,0),
count(*) AS value,
'success_total_count_per_hour' as metric
FROM logs
WHERE
$__timeFilter(created_at)
AND http_status = 200
GROUP BY 1
ORDER BY 1
*See Grafana doc: https://grafana.com/docs/grafana/latest/datasources/postgres/
The macros are documented there, including macros for the case when your time column is a UNIX timestamp.
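As a conceptual sketch of what the hourly grouping macro does (the exact SQL Grafana generates is described in the docs linked above; `time_group` here is a hypothetical helper for illustration, not a Grafana API):

```python
from datetime import datetime, timezone

def time_group(ts: datetime, interval_s: int = 3600) -> datetime:
    """Floor a timestamp to its interval bucket; conceptually what
    $__timeGroupAlias(created_at,1h,0) produces on the SQL side.
    (Hypothetical helper for illustration, not a Grafana API.)"""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % interval_s, tz=timezone.utc)

# 14:37:12 falls into the 14:00 hourly bucket
print(time_group(datetime(2023, 5, 1, 14, 37, 12, tzinfo=timezone.utc)))
```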

Tableau - Calculated field of different columns based on different partition of the same table

Sorry for the stupid question.
Situation: I have a partitioned table (the partition is the week of the year) with some metrics (e.g. frequency of some keywords); I need to run an analysis of metrics belonging to different partitions (e.g. the trend between the frequency of a keyword in week 32 compared to week 3). The ultimate purpose is to create a dashboard where the user can choose the week of the year and is presented with the calculated analysis on the go.
So far I have used a live query with two parameters (week_1 and week_2) that joins data from the same table based on the two different parameters. You can imagine that the dashboard recomputes everything once one of the parameters is changed by the user. To avoid long waiting times, I have set the two parameters to a non-existent default value (0, zero), so that the dashboard opens very quickly. Then I prompt the user to stop the dashboard, insert the new parameters of choice, and restart the dashboard to load the new computations.
My question is: is it possible to achieve the same by using an extract of the table? The table itself should not be excessively big (it should be 15 million records spanning 3 years) and as far as I know the extracts are performant with those numbers.
I am quite new to Tableau, so I would like to know from more expert people if there is a more optimal way to do such a thing without using live queries.
Please, feel free to ask more information if I was not clear! However, I cannot share my workbook, as it contains sensitive information.
Edit:
+-----------+---------+-----------+
| partition | keyword | frequency |
+-----------+---------+-----------+
| 202032    | hello   | 5000      |
| 202032    | ciao    | 567       |
| ...       | ...     | ...       |
| 202031    | hello   | 2323      |
| 202031    | ciao    | 34567     |
| ...       | ...     | ...       |
| 20203     | hello   | 2         |
| 20203     | ciao    | 1000      |
+-----------+---------+-----------+
With the live query, I can join the table where partition = 202032 with the same table where partition = 20203 and make a new table with a column where I compute e.g. a trend between the two frequencies:
+---------+---------------------+-------------+
| keyword | partitions_compared | trend       |
+---------+---------------------+-------------+
| hello   | 202032 - 20203      | +1billion % |
| ciao    | 202032 - 20203      | +1K %       |
+---------+---------------------+-------------+
With the live query I join on the keywords.
Thanks a lot in advance and have a great day!
Cheers
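The self-join described in the question can be sketched outside Tableau, e.g. with SQLite. In the sketch below the column is renamed from partition to part to avoid the SQL keyword, the values come from the example table, and the percentage trend formula is an assumption (the question doesn't define the exact trend computation):

```python
import sqlite3

# Sketch of the self-join: compare keyword frequencies between two week
# partitions. "part" replaces "partition" (SQL keyword); the trend formula
# (percentage change) is an assumption for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE freq (part INTEGER, keyword TEXT, frequency INTEGER)")
con.executemany("INSERT INTO freq VALUES (?, ?, ?)", [
    (202032, "hello", 5000),
    (202032, "ciao", 567),
    (20203, "hello", 2),
    (20203, "ciao", 1000),
])
rows = con.execute("""
    SELECT a.keyword,
           a.part || ' - ' || b.part AS partitions_compared,
           ROUND(100.0 * (a.frequency - b.frequency) / b.frequency, 1) AS trend_pct
    FROM freq a JOIN freq b ON a.keyword = b.keyword
    WHERE a.part = 202032 AND b.part = 20203
    ORDER BY a.keyword
""").fetchall()
for r in rows:
    print(r)
```

An extract should handle this fine at 15M rows; the join on keyword plus two partition filters is exactly what the live query does today.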

Display 20 curves in the same Grafana panel, 1 curve per user

I am using InfluxDB and Grafana. I have data for 20 users.
Here is the query I use to display a single user:
SELECT delta FROM "measures" WHERE user_id='23545296664228'
What I want is a Grafana panel where I can display all 20 series.
Ideally, it would be great to be able to display/hide series to focus on a specific one.
The goal here is to see whether a specific user behaves differently from the others, and to be able to identify them visually.
Is that possible?
Assuming user_id is a tag in the measures measurement, you need to use GROUP BY in the query, like in this example:
SELECT delta FROM "measures" WHERE time > now() - 1d GROUP BY time(1h),user_id
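Conceptually, GROUP BY on the user_id tag splits the points into one series per tag value, which Grafana then draws as one curve each; the legend lets you show/hide individual series. A minimal sketch of that split (user ids are made up, except the one from the question):

```python
from collections import defaultdict

# Toy illustration: grouping points by the user_id tag yields one series
# per distinct user, which Grafana renders as one curve each.
points = [
    ("23545296664228", 1.2),
    ("23545296664228", 0.9),
    ("99999999999999", 3.4),
]
series = defaultdict(list)
for user_id, delta in points:
    series[user_id].append(delta)

print(len(series), "series")  # one entry per distinct user_id
```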

Best way to query 4 B+ records in Tableau

I am looking for the best way to analyse 4B records (1 TB of data) stored in Vertica using Tableau. I tried using an extract of 1M records, which works perfectly, but I don't know how to manage 4B records, because querying them takes too long.
I have the following dataset:
timestamp id url domain keyword nor_word cat_1 cat_2 cat_3
So here I need to create a descending list of the top 10 IDs, top 10 urls, top 10 domains, top 10 keywords, top 10 nor_words, top 10 cat_1, cat_2, and cat_3 values, depending on the count of each field value, each in a separate worksheet, and combine all worksheets in one dashboard.
There is no primary key. This dataset covers 1 month, so I want to make global start-date and end-date filters to reduce the query size. But I don't know how to create a global date filter and display it on the dashboard.
You have two questions, one about Vertica and one about Tableau. You should split these up.
Regarding Vertica, you need to know that Vertica stores data in ascending sort order in physical storage. This means that an additional step will always be required anytime you want to get a descending sort order.
I would suggest creating a partition on the date, and subsequently running Database Designer (DBD) in incremental mode, using your queries as samples. By partitioning the data, Vertica can eliminate whole partitions during query optimization (partition pruning).
Running the DBD will generate some better optimized projections. You should consider the trade-off between how often you will need this data and whether it's worth creating these additional projections as it will impact your load performance.
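The "top 10 per field" requirement itself is just a descending count per column, which, as noted above, works against Vertica's ascending physical sort order. A minimal sketch of the aggregation (rows and values are made up; field names come from the question's schema):

```python
from collections import Counter

# Toy sketch of "top 10 per field": count distinct values per column
# and keep the 10 most common, in descending order.
rows = [
    {"domain": "a.com", "keyword": "x"},
    {"domain": "a.com", "keyword": "y"},
    {"domain": "b.com", "keyword": "x"},
]
top_domains = Counter(r["domain"] for r in rows).most_common(10)
top_keywords = Counter(r["keyword"] for r in rows).most_common(10)
print(top_domains)
print(top_keywords)
```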

Sorting result set according to variable in iReport

I have a resultset with columns:
interval_start (timestamp), camp, queue, other columns
2012-09-10 11:10 c1 q1
2012-09-10 11:20 c1 q2
interval_start has values at 10-minute intervals, like:
2012-09-10 11:10,
2012-09-10 11:20,
2012-09-10 11:30 ....
Using the Joda Time library and the interval_start field, I have created a variable that builds a string such that if the minutes of interval_start lie between 00 and 30, the minutes are set to 30, otherwise to 00.
I want to group the data as :
camp as group1
variable created as group2
queue as group3
and applied some aggregations.
But in my report result, I am getting the same queue many times in the same interval.
I have used order by camp, interval_start, queue but the problem still exists.
Attaching screenshot for your reference:
Is there any way to sort the resultset according to created variable?
My best guess would be an issue with your actual SQL query. You say the same queue is being repeated, but from looking at your image it is not actually repeated; it is a different row.
Your query is going to be tough to pull off, as you really want it to have an order by of camp, (rounded) interval_start, queue. Without that, it is ordering by the camp column, then the non-rounded version of interval_start, and then queue. This means the data is not in the correct order for JasperReports to group it the way you want. And the real kicker is that JasperReports does not have the functionality to sort the data once it receives it; it is up to the developer to supply that.
So you have a few options:
Update your SQL query to do the rounding of your time. Depending on your database this is done in different ways, but it will likely require some type of stored procedure or function (see this TSQL function for example).
Instead of having the SQL query in the report, move it outside the report and process the data, doing the rounding and sorting on the Java side. Then pass it in as the REPORT_DATASOURCE parameter.
Add a column to your table to store the rounded time. You may be able to create a trigger to handle this all in the database, without having to change any other code in your application.
Honestly, none of these options is ideal, and I hope someone comes along and provides an answer that proves me wrong. But I do not think there is currently a better way.
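For the second option (rounding on the Java side before handing Jasper the data source), the bucketing rule can be sketched as follows. This floors minutes to :00/:30, which is one common convention; the question's own rule ("between 00-30, set 30, else 00") reads slightly differently, so adjust as needed. Python is used here for brevity; the report itself would do this with Joda Time or java.time on the Java side.

```python
from datetime import datetime

def round_interval(ts: datetime) -> datetime:
    """Floor minutes to :00 or :30 so all rows in the same half-hour
    share one grouping key (hypothetical helper, not an iReport API)."""
    return ts.replace(minute=0 if ts.minute < 30 else 30,
                      second=0, microsecond=0)

print(round_interval(datetime(2012, 9, 10, 11, 17)))  # 2012-09-10 11:00:00
print(round_interval(datetime(2012, 9, 10, 11, 44)))  # 2012-09-10 11:30:00
```

Sorting the rows by (camp, rounded interval, queue) before building the data source gives JasperReports the order its groups expect.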