How to visualize stickiness (DAU / MAU) in Apache Superset - Druid

I am new to Apache Superset. I want to build a chart that shows stickiness, meaning the ratio of daily active users to monthly active users (DAU / MAU).
I am using Druid as the datasource.
The columns in my database are "userId", "event", and "timestamp".

For DAU you just need to truncate timestamp to daily grain:
SELECT COUNT(DISTINCT userId) AS dau, FLOOR(timestamp TO DAY) AS day
FROM your_datasource
GROUP BY FLOOR(timestamp TO DAY)
You can write the query above in SQL Lab, and then click "Visualize" to build a chart out of it. Alternatively, in the explore view you can choose "day" as the time grain, and then define the metric as COUNT(DISTINCT userId).
For WAU/MAU, it's a bit more complicated. Traditionally you'd use a self join for the query, something like this (in pseudo-SQL):
SELECT ...
FROM your_datasource AS today
JOIN your_datasource AS last_month
ON last_month.timestamp BETWEEN (today.timestamp - 30 days) AND today.timestamp
But Druid only supports equijoins.
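To make the metric itself concrete, here is a minimal sketch in plain Python rather than Druid SQL. It assumes the events have already been reduced to (userId, day) pairs and that MAU means distinct users in a trailing 30-day window ending on the given day; the function name `stickiness` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def stickiness(events, window_days=30):
    """Return {day: (dau, mau, dau / mau)} for every day that has events.

    events is an iterable of (user_id, day) pairs.
    """
    users_by_day = defaultdict(set)
    for user_id, day in events:
        users_by_day[day].add(user_id)

    result = {}
    for day in sorted(users_by_day):
        dau = len(users_by_day[day])
        # Trailing window: the day itself plus the window_days - 1 days before it.
        monthly_users = set()
        for offset in range(window_days):
            monthly_users |= users_by_day.get(day - timedelta(days=offset), set())
        mau = len(monthly_users)
        result[day] = (dau, mau, dau / mau)
    return result


# Hypothetical sample data: (userId, day)
events = [
    ("a", date(2023, 1, 1)),
    ("b", date(2023, 1, 1)),
    ("a", date(2023, 1, 2)),
    ("c", date(2023, 1, 31)),
]
report = stickiness(events)
```

On 2023-01-31 the window reaches back to 2023-01-02, so user "b" (last seen on 2023-01-01) no longer counts toward MAU.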

Related

Get truncated data from a table - PostgreSQL

I want to get truncated data over the last month. My time is stored as Unix timestamps, and I need to get the data from the last 30 days for each specific day.
The data is in the following form:
{
"id":"648637",
"exchange_name":"BYBIT",
"exchange_icon_url":"https://cryptowisdom.com.au/wp-content/uploads/2022/01/Bybit-colored-logo.png",
"trade_time":"1675262081986",
"price_in_quote_asset":23057.5,
"price_in_usd":1,
"trade_value":60180.075,
"base_asset_icon":"https://assets.coingecko.com/coins/images/1/large/bitcoin.png?1547033579",
"qty":2.61,
"quoteqty":60180.075,
"is_buyer_maker":true,
"pair":"BTCUSDT",
"base_asset_trade":"BTC",
"quote_asset_trade":"USDT"
}
I need to truncate data based on trade_time
How do I write the query?
The secret sauce is the date_trunc function, which takes a timestamp with time zone and truncates it to a specific precision (hour, day, week, etc). You can then group based on this value.
In your case, we first need to convert these JavaScript-style Unix timestamps (milliseconds since the epoch) to timestamp with time zone, which we can do with to_timestamp, but it's still a fairly simple query.
SELECT
date_trunc('day', to_timestamp(trade_time / 1000.0)),
COUNT(1)
FROM pings_raw
GROUP BY date_trunc('day', to_timestamp(trade_time / 1000.0))
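Another approach would be to leave everything as numbers, which might be marginally faster, though I find it less readable. The cast is to bigint rather than int, because the start of the day in milliseconds does not fit in a 4-byte integer:
SELECT
(trade_time/(1000*60*60*24))::bigint * (1000*60*60*24),
COUNT(1)
FROM pings_raw
GROUP BY (trade_time/(1000*60*60*24))::bigint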
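As a sanity check on the arithmetic, the two truncation approaches can be compared outside the database. This is a small Python sketch, not PostgreSQL; the helper names are hypothetical, and it works in UTC, whereas date_trunc on a timestamp with time zone truncates in the session time zone, so the SQL results can differ if the session is not in UTC.

```python
from datetime import datetime, timezone

MS_PER_DAY = 1000 * 60 * 60 * 24


def day_start_ms(trade_time_ms: int) -> int:
    """Numeric approach: floor to a whole number of days, back to milliseconds."""
    return (trade_time_ms // MS_PER_DAY) * MS_PER_DAY


def day_start_utc(trade_time_ms: int) -> datetime:
    """Timestamp approach: convert to a UTC datetime, then drop the time of day."""
    ts = datetime.fromtimestamp(trade_time_ms / 1000.0, tz=timezone.utc)
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


sample = 1675262081986  # trade_time from the example row
numeric_bucket = day_start_ms(sample)
timestamp_bucket = day_start_utc(sample)
```

Both helpers place the sample trade in the bucket that starts at 2023-02-01 00:00:00 UTC.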

Subtract Additional Time from $__timeFilter

I want to subtract additional time from $__timeFilter in Grafana. For example, if I have selected Last 7 days, I want to run two queries for comparison: one gives me the average CPU utilization for the last 7 days, and the other gives me the average CPU utilization from now() - 14d to now() - 7d. This should be dynamic, so it works for 6 hours, 2 days, or whatever range is selected.
My database is TimescaleDB and the Grafana version is 8.3.5.
Edit
Query is
select avg(cpu) from cpu_utilization where $__timeFilter(timestamp)
Whatever is selected in the time filter in Grafana, the query is rewritten accordingly.
For example, if I select the last 24 hours, Grafana expands the query to the following:
select avg(cpu) from cpu_utilization where timestamp BETWEEN '2022-09-07 05:32:10' and '2022-09-08 05:32:10'
This is normal behaviour. What I want is that, if I select the last 24 hours, this query behaves as it does now, but an additional query becomes
select avg(cpu) from cpu_utilization where timestamp BETWEEN '2022-09-06 05:32:10' and '2022-09-07 05:32:10'
(I don't want this just for the last 24 hours, but for any relative time period selected in the filter.)
Answer : https://stackoverflow.com/a/73658919/14817486
You can use the global variables $__to and $__from.
For example, ${__from:date:seconds} will give you a timestamp in seconds. You can then subtract 7 days (= 604800 seconds) from it and use it in your query's WHERE clause. Depending on your SQL dialect, that might be by using TIMESTAMP(), TO_TIMESTAMP() or something similar. So it would look similar to this:
[...] WHERE timestamp BETWEEN TO_TIMESTAMP(${__from:date:seconds}-604800) AND TO_TIMESTAMP(${__to:date:seconds}-604800) [...]
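The fixed offset of 604800 seconds only matches a 7-day selection. Since the question asks for any selected range, one option is to shift both bounds by the length of the range itself. The sketch below shows that arithmetic in Python; `previous_window` is a hypothetical helper, and its inputs stand for the values of ${__from:date:seconds} and ${__to:date:seconds}.

```python
from datetime import datetime, timezone
from typing import Tuple


def previous_window(from_s: int, to_s: int) -> Tuple[int, int]:
    """Return the window of equal length that ends where the selected one starts."""
    span = to_s - from_s
    return from_s - span, to_s - span


# The "last 24 hours" selection from the question, as epoch seconds (UTC assumed).
selected_from = int(datetime(2022, 9, 7, 5, 32, 10, tzinfo=timezone.utc).timestamp())
selected_to = int(datetime(2022, 9, 8, 5, 32, 10, tzinfo=timezone.utc).timestamp())

prev_from, prev_to = previous_window(selected_from, selected_to)
```

In the query itself, the same idea would mean subtracting (${__to:date:seconds} - ${__from:date:seconds}) from both bounds instead of a constant.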
Interesting question! If I understood correctly, you could use the timestamp column as the reference, since Grafana is already filtering on it for the comparison query. You can take min(timestamp) and max(timestamp) to find the limits of your period and then build something from them.
For example, min(timestamp) - INTERVAL '7 days' would give you the start of the previous range, and max(timestamp) - INTERVAL '7 days' would give you its end.

Is there a way to automatically increment the dates in my InfluxQL query?

I have a Grafana dashboard, version v8.1.6 (4a4083716c), where I display the output voltage, current and power of a solar panel. I am using the watt2kwh node to convert my power reading from watts to watt-hours. The interval between successive measurements is 10 seconds. Node-Red, version 2.0.6, is used to populate my database.
In Grafana I would like to show the total accumulated power for the current day from 00:00 to 00:00 of the next day. I am successfully doing this with the query below:
SELECT sum("value") FROM "solar/ina219/energy" WHERE time> '2021-10-10 00:00:00' AND time< '2021-10-11 00:00:00'
But each day I must manually change the dates. Can I automate the changing of the dates using InfluxQL? (or pure SQL)
Or would it be easier implementing this in Node-Red and then just fetching the accumulated energy from the database?
Below is a screenshot of the simple panel:
Any help will be appreciated!
Thank you.
After some research I concluded that the InfluxQL query language cannot do this and I would have to use Flux.
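If the dates are generated outside the database, as the question suggests doing in Node-Red, the day boundaries are simple to compute. The following is only a Python sketch of that idea; `daily_energy_query` is a hypothetical helper, it uses >= for the lower bound so that a reading at exactly midnight is included, and it ignores time zones, so check how your InfluxDB interprets the date strings.

```python
from datetime import datetime, timedelta


def daily_energy_query(now: datetime) -> str:
    """Build the query for the calendar day that contains `now`."""
    start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    end = start + timedelta(days=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        'SELECT sum("value") FROM "solar/ina219/energy" '
        f"WHERE time >= '{start.strftime(fmt)}' AND time < '{end.strftime(fmt)}'"
    )


query = daily_energy_query(datetime(2021, 10, 10, 13, 5, 0))
```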

Calculated Field to Count While Between Dates

I am creating a Tableau visualization for floor stock in our plant. We have a column for incoming date, quantity, and outgoing date. I am trying to create a visualization that sums the quantity but only while between the 2 columns.
So for example, if we have 9 parts in stock that arrived on 9/1 and are scheduled to ship out on 9/14, I would like this visualization to include these 9 parts in the sum only while they are in our stock between those 2 dates. Here is an example of some of the data I am working with.
Incoming date   Quantity   Outgoing date
4/20/2018       006        5/30/2018
4/20/2018       017        5/30/2018
4/20/2018       008        5/30/2018
6/29/2018       161        9/7/2018
Create a new calculation:
if [ArrivalDate] >= #2018-09-01# and [ArrivalDate] < #2018-09-15#
and [Shipdate] < #2018-09-15#
then [MEASUREofStock] else 0 end
Here is a solution using UNIONs, written before Tableau added support for Unions (so it required custom SQL): "Volume of an Incident Queue at a Point in Time".
For several years now, Tableau has supported Union directly, so now it is possible to get the same effect without writing custom SQL, but the concept is the same.
The main thing to understand is that you need a data row per event (per arrival or per departure) and a single date column, not two. That will let you calculate the net change in quantity per day, and you can then use a running total if you want to see the absolute quantity at the close of each day
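The event-per-row idea can be sketched outside Tableau as follows. This Python example uses the four sample rows from the question; the function name `stock_levels` is hypothetical, and it assumes a part stops counting on its outgoing date.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# (incoming date, quantity, outgoing date), taken from the sample data
stock = [
    (date(2018, 4, 20), 6, date(2018, 5, 30)),
    (date(2018, 4, 20), 17, date(2018, 5, 30)),
    (date(2018, 4, 20), 8, date(2018, 5, 30)),
    (date(2018, 6, 29), 161, date(2018, 9, 7)),
]


def stock_levels(rows):
    """Turn each row into two events, then keep a running total per event date."""
    net_change = defaultdict(int)
    for incoming, qty, outgoing in rows:
        net_change[incoming] += qty   # arrival event
        net_change[outgoing] -= qty   # departure event
    days = sorted(net_change)
    totals = accumulate(net_change[d] for d in days)
    return list(zip(days, totals))


levels = stock_levels(stock)
```

Each entry in `levels` is the quantity on hand from that date until the next event date, which is what a running total over the single date column gives you in the chart.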
There is no simple way to display the total quantity between the two dates without changing the input table structure. If you want to show all dates and the "eligible" quantity on each day, you should:
1. Create a calendar table that has all dates from 1990-01-01 to 2029-12-31. (You can limit the dates displayed in the dashboard later by applying a date filter, but here you want to be safe and include all dates that may exist in your stock table.) Here is how to create the date table quickly.
2. Left join the date table to the stock table and calculate the eligible quantity on each day.
SELECT
a.date,
SUM(CASE WHEN b.quantity IS NULL THEN 0 ELSE b.quantity END) AS quantity
FROM date a
LEFT JOIN
stock b on a.date BETWEEN b.Incoming_Date AND b.Outgoing_Date
GROUP BY a.date
3. Import the output table into Tableau, and simply add the dates and quantity to the chart.

Year over year monthly sales

I am using SQL Server 2008 R2. Here is the query I have that returns monthly sales totals by zip code, per store.
select
left(a.Zip, 5) as ZipCode,
s.Store,
datename(month,s.MovementDate) as TheMonth,
datepart(year,s.MovementDate) as TheYear,
datepart(mm,s.MovementDate) as MonthNum,
sum(s.Dollars) as Sales,
count(*) as [TxnCount],
count(distinct s.AccountNumber) as NumOfAccounts
from
dbo.DailySales s
inner join
dbo.Accounts a on a.AccountNumber = s.AccountNumber
where
s.SaleType = 3
and s.MovementDate > '1/1/2016'
and isnull(a.Zip, '') <> ''
group by
left(a.Zip, 5),
s.Store,
datename(month, s.MovementDate),
datepart(year, s.MovementDate),
datepart(mm, s.MovementDate)
Now I'd like to add columns that compare sales, TxnCount, and NumOfAccounts to the same month the previous year for each zip code and store. I also would like each zip code/store combo to have a record for every month in the range; so zeros if null.
I do have a calendar table that I tried to use to get all months, but I ran into problems because of my "where" statements.
I know that both of these issues (comparing to previous year and including all dates in a date range) have been asked and answered before, and I've gotten them to work before myself, but this particular one has me running in circles. Any help would be appreciated.
I hope this is clear enough.
Thanks,
Tim
Treat the query you have above as a data source. Run it as a CTE for the period you want to report, plus the same period 12 months earlier (to get the historic data). Call this CTE SalesPerMonth.
Then write a query that gets all the months you need from your calendar table as another CTE. These are the reporting months only, not the previous year. Call this CTE MonthsToReport.
Get a list of every valid zip code / store combo, probably with a SELECT DISTINCT from the SalesPerMonth CTE. This gives you only combos that have at least one sale in the period or in the historical period (you probably also want the ones that sold last year but not this year). Call this CTE StoreZip.
Finally, your main query cross joins the StoreZip results with MonthsToReport, which gives you the one row per StoreZip/month combo you are looking for. Left join twice to the SalesPerMonth data, once for the month itself and once for the same month one year earlier. Use ISNULL to change any nulls (no data) to zero.
Instead of CTEs, you could also do it as separate queries, storing the results in Temp tables instead. This may work better for large amounts of data.
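The shape of that query can be sketched as follows. It runs against an in-memory SQLite database from Python rather than SQL Server, so treat it as an illustration only: the tables `monthly_sales` and `calendar_months`, their columns, and the sample rows are all hypothetical, the sales are assumed to be pre-aggregated per month, and COALESCE stands in for ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_sales (zip TEXT, store TEXT, yr INTEGER, mon INTEGER, sales REAL);
INSERT INTO monthly_sales VALUES
  ('12345', 'A', 2015, 1, 100.0),
  ('12345', 'A', 2016, 1, 150.0),
  ('12345', 'A', 2016, 2, 80.0);
CREATE TABLE calendar_months (yr INTEGER, mon INTEGER);
INSERT INTO calendar_months VALUES (2016, 1), (2016, 2), (2016, 3);
""")

QUERY = """
WITH SalesPerMonth AS (   -- reporting period plus the 12 months before it
    SELECT zip, store, yr, mon, sales FROM monthly_sales
),
MonthsToReport AS (       -- reporting months only
    SELECT yr, mon FROM calendar_months
),
StoreZip AS (             -- every combo with at least one sale in either period
    SELECT DISTINCT zip, store FROM SalesPerMonth
)
SELECT sz.zip, sz.store, m.yr, m.mon,
       COALESCE(cur.sales, 0)  AS sales,
       COALESCE(prev.sales, 0) AS sales_prev_year
FROM StoreZip sz
CROSS JOIN MonthsToReport m
LEFT JOIN SalesPerMonth cur
       ON cur.zip = sz.zip AND cur.store = sz.store
      AND cur.yr = m.yr AND cur.mon = m.mon
LEFT JOIN SalesPerMonth prev
       ON prev.zip = sz.zip AND prev.store = sz.store
      AND prev.yr = m.yr - 1 AND prev.mon = m.mon
ORDER BY sz.zip, sz.store, m.yr, m.mon
"""

rows = conn.execute(QUERY).fetchall()
```

Because the filters live inside the SalesPerMonth CTE and the calendar drives the outer query, months without sales still produce a row with zeros, which avoids the problem of "where" conditions discarding the empty months.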