Aggregate by most recent not-null value - tableau-api

I have a dataset with the following columns [ product_id, country_id, date, number_of_installs, cumulative_installs_last_30_days ]
I have no problem applying the standard measures to find the sum, max or average number_of_installs within those three dimensions (product_id, country_id, date(aggregated by month or week)). However, I have not been able to aggregate by cumulative_installs_last_30_days because as that variable is already a cumulative, I need to return the “most recent value” and Tableau does not have that option built-in the aggregation functions.
How do I create a Calculated Field that enables an addicional column in the aggregated dataset with the most recent not-null value of cumulativeInstalls_last_30_days within the dimensions product_id, country_id and date(aggregated by month or week)?

Here's a dirty solution.
In the comments, you noted that you wanted that 30 days to be dynamic, so to accomplish that, create a parameter, make it an integer, select Range, and allow any integer over zero. I'll call it [Number of Days].
Then create a calculated field:
TOTAL(SUM(IIF(DATEDIFF("day", [date], TODAY()) < [Number of Days], [Number of Installs], NULL)))
I know that's ridonk, so I'll break it down, from the inside out.
DATEDIFF("day", [date], TODAY())
That just calculates the difference in days between today and the date in a given row.
IIF(DATEDIFF("day", [date], TODAY()) < [Number of Days], [Number of Installs], NULL)
That checks if that difference is less than the number of days you selected. If it is, this statement is equal to the number of installs. If it's not, it's null. As a result, if we sum all of these values, we only get the number of installs in the last [Number of Days] days.
With that in mind, we SUM() the rows. TOTAL() just performs that sum over every database row that contributes to the partition.
Note that if your database has dates after TODAY(), you'll need to add another condition to that IIF() statement to make sure those aren't included.
You also mentioned that you want to be able to aggregate the number of installs by month. That's MUCH easier. Just toss MONTH([date]) into the dashboard, then SUM([Number of Installs]), and Tableau will knock it out for you.

Related

First date with sales greater than 100 in TABLEAU

I have a very Basic flat file with Sales by date and product names. I need to create a field for First sales day where sales are greater than 100 units.
I tried {FIXED [Style Code]: MIN([Prod Cal Activity Date])} but that just gives me the first day in the data the Style code Exists
I also tried IF ([Net Sales Units]>200) THEN {FIXED [Style Code]: MIN([Prod Cal Activity Date])}END but that also gives me the first day in the data the Style code Exists
DATA EXISTS PRIOR TO SALES DATE
You can use the following calculation:
MIN(IF([Net Sales Units]>100) THEN [Prod Cal Activity Date] ELSE #2100-01-01# END)
The IF([Net Sales Units]>100) THEN [Prod Cal Activity Date] ELSE #2100-01-01# END part of the calculation converts the date into a very high value (year 2100 in the example) for all the cases where the sales was more than 100 units. Once this is done, you can simply take a minimum of the calculated date to get the desired result. If you need this by style code, then you can add a fixed function in the beginning.
A few ways to simplify further if you like. They don't change the meaning.
You don't need parenthesis around boolean expressions as you would in C.
You can eliminate the ELSE clause altogether. The if expression will default to null in cases where the condition was false. Aggregation functions like MIN(), MAX(), SUM() etc silently ignore nulls, so you don't have to come up with some default future date.
So MIN(IF [Net Sales Units] > 100 THEN [Prod Cal Activity Date] END is exactly equivalent, just a few less characters to read.
The next possible twist has a bit of analytic value beyond just saving keystrokes.
You don't need to hard code the choice of aggregation function into the calculation. You could instead name your calculated field something like High Sales Activity Date defined as just
if [Net Sales Units] > 100 then [Prod Cal Activity Date] end
This field just holds the date for records with high sales, and is null for records with low sales. But by leaving the aggregation function out of the calculation, you have more flexibility to use it in different ways. For example, you could
Calculate the earliest (i.e. Min) high sales date as requested originally
Calculate the latest high sales date using Max
Filter to only dates with high sales by filtering special non-null values
Calculate the number of high sales dates using COUNTD
Simple little filtering calculations like this can be very useful - so called because of the embedded if statement effectively filters out values that don't match the condition. There are still null values for the other records, but since aggregation functions ignore nulls, you can think of them as effectively filtered out by the calculation.

postgreSQL sorting with timestamps

I have the following SQL statement:
SELECT * FROM schema."table"
WHERE "TimeStamp"::timestamp >= '2016-03-09 03:00:05'
ORDER BY "TimeStamp"::date asc
LIMIT 15
What do I expect it to do? Giving out 15 rows of the table, where the timestamp is the same and bigger than that date, in ascending order. But postgres sends the rows in the wrong order. The first item is on the last position.
So has anyone an idea why the result is this strange?
Use simply ORDER BY "TimeStamp" (without casting to date).
By casting "TimeStamp" to date you throw away the time part of the timestamp, so all values within one day will be considered equal and are returned in random order. It is by accident that the first rows appear in the order you desire.
Don't cast to date in the ORDER BY clause if the time part is relevant for sorting.
Perhaps you are confused because Oracle's DATE type has a time part, which PostgreSQL's doesn't.

Tableau: How to calculate number of days

I have custom SQL query in tableau that displays the data in the following format:
Start Date | App
6/21/16 app1
6/22/16 app2
6/23/16 app3
In this case, the end date would be '6/23/16'. So, app1 has been "live" for 2 days, app2 for 1 and so on.
I am trying to find the the number of days an app has been live. I can try using the DateDiff function but I would need to hardcode the values in that case and I want it to be dynamic.
The challenge is to have a calculated field that would find the max date in the entire column and subtract it from the individual app's date. This would give me the 'number of live days' for an app.
I am new to tableau and do not know how to proceed. Any help is appreciated.
Here is one solution.
datediff('day', [Start Date], { fixed : max([Start Date]) } )
Note the expression in Curley braces. That is a level of Detail (LOD) calculation -- basically a separate subquery at a potential different level of detail. So you can compare values for each row with values computed based on the whole table.
Depending on how and where you want to use this calculation, you might want to alter that LOD calculation to be fixed for certain dimensions or include or exclude certain dimensions. The online help should explain.
Just use"DATEDIFF('day',[Order Date], [Ship Date])",
order date and ship date are example dimensions from superstore data.xlxs

Filter data by different time intervals

I need to filter my query with different time intervals like that:
...
where
date >= '2011-07-01' and date <='2011-09-30'
and date >='2012-07-01' and date >='2012-09-30'
I suppose such code is not good, because these dates conflicts with each other. But how to filter only these two intervals, skipping everything else? Is it even possible? Because if I query like this, I don't get any results.I tried to use BETWEEN, but it does same thing.
I bypassed this by extracting quarters from years and calculating only third quarter. But then other quarters sum is showed as zero and I can't ignore these rows that have sum column with zero value. I tried to filter where price > 0 (column where sum goes), but it says that column do not exist. So I put it whole FROM under '('')' brackets to make it calculate sum before where clause, but it still does give me error that such column do not exist.
Also if you need to see query I have now, I can post it, just tell me if it is needed.
I want to do this, because I need to compare third quarter of two different years (maybe I should use another approach).
You're not going to get any results because you can't have a date that's both within 7/1/2011 through 9/30/11 and after 7/1/2012 and after 9/30/12.
You can have a date that is either between 7/1/20122 and 9/30/2011 or between 7/1/2012 and 9/30/2012.
SELECT col1 FROM table1
WHERE date BETWEEN '7/1/2011' AND '9/30/2011' OR date BETWEEN '7/1/2012' AND '9/30/2012';

MDX - calculate one date dimension from another date dimension

I have a fact table that has 2 dates Invoice Date and Accounting Current Date. In order to get requested Revenue value I need to use combination of these two dates. For example, if I need YTD Revenue I need to select it like this:
(Note: I am writing SQL query because I am more familiar with it)
SELECT Revenue
FROM
Fact_Revenue
WHERE
Invoice_Date <= '2011-10-22'
and AccountingCurrent >= '2011-01'
and AccountingCurrent <= '2011-10'
Besides Revenue, this fact tables has other information that I also need, but for calculating this other data I don't need Accounting Current Date. So my idea is to use only 1 date (Invoice Date) in main MDX query (so that I can grab as many data with 1 query as I can) and for calculating Revenue I would like to use Calculated Member and in there I would like to associate Accounting Current Date with selected Invoice Date.
For example
SELECT {[Measure].[RevenueYTD],
[Measure].[RevenueMTD],
[Measure].[NumberOfInvoices],
[Measure].[NumberOfPolicies]}
ON COLUMNS,
{[People].Members} ON ROWS
FROM [Cube]
WHERE
[Invoice Date].[Date Hierarchy].[Date].&[2011-10-22]
In this case, [Measure].[RevenueYTD] and [Measure].[RevenueMTD] need to be limited by Accounting Current Date and Invoice Date must be lower than the date from the query. On the other hand, I need [Measure].[NumberOfInvoices] and [Measure].[NumberOfPolicies] for particual Invoice Date (or MTD Date, whatever), but without involvemenet of Accounting Current Date
Calculated member query should do something like this (this is more like algorithm):
ROUND(
SUM(
YTD([Accounting Current Date].[Date Hierarchy].CurrentMember),
[Measures].[Revenue]
),
2)
WHERE [Invoice Current Date].[Date Hierarchy] < [Invoice Current Date].[Date Hierarchy].CurrentMember
Navigating from one dimension to another is not something trivial in MDX. In theory dimensions are independent so standard language is missing functions for doing this. You can use StrToMember MDX function but it's slow and a bit strange.
For your filters, let's start with the first one :
Invoice_Date <= '2011-10-22'
In MDX we'll have to create a set with the members matching the expression. This can be done using the Range set operator :
NULL:[Invoice Date].[Date Hierarchy].[Date].&[2011-10-22]
The other filter is easy to guess :
AccountingCurrent >= '2011-01' and AccountingCurrent <= '2011-10'
MDX version :
[Accounting Date].[Date Hierarchy].[Date].&[2011-01-31]:[Accounting Date].[Date Hierarchy].[Date].&[2011-10-30]
It's also possible using Filter MDX function if your need different type of filters.
Now we need to take the pieces and build the query. One possible solution is using a set slicer and overwritting the values when you don't want the filter to be applied :
WITH
// here we're changing the 'selection' from the where clause
MEMBER [Measure].[NumberOfInvoices II] AS ([Accounting Date].[Date Hierarchy].defaultmember,[Measure].[NumberOfInvoices])
SELECT
.. axis here [Measure].[RevenueYTD] will be applying the filters defined in the where clause
FROM MyCube
WHERE {[Accounting Date].[Date Hierarchy].[Date].&[2011-01-31]:[Accounting Date].[Date Hierarchy].[Date].&[2011-10-30]}