Tableau - Date Filter that acts on Two Date Columns - tableau-api

I have a data set that we are pulling from two different SQL tables. This is a custom SQL query. This is financial data - trades and transactions. I am pulling them together in a single return so for each customer I can see all the trades and transactions in a single query. To be specific - some trades do not have corresponding transactions, and some transactions do not have corresponding trades - therefore there are some rows that will have both a trade AND transaction date, some will have one or the other. For those missing they are NULL.
My question is - how can I create a filter so that I get all trades and transactions in a given date range? I tried using two filters (one for each date), but for some reason that won't work. Also, I'd like for our customers to be able to have one filter that gives them any trade or transaction in that range.
I thought about a parameter, but with the data always changing (this gets refreshed daily), I need something dynamic - one filter that will filter the data set on those two fields. Any suggestions would be greatly appreciated. Thanks so much!!

Related

Identifying next closest record by date in tableau

I have a table of users and another table of transactions.
The transactions all have a date against them. What I am trying to ascertain for each user is the average time between transactions.
User | Transaction Date
-----+-----------------
A | 2001-01-01
A | 2001-01-10
A | 2001-01-12
Consider the above transactions for user A. I am basically looking for the distance from one transaction to the next chronologically to determine the distances.
There are 9 days between transactions one and two; and there are 2 days between transactions three and four. The average of these is obviously 4.5, so I would want to identify the average time between user A's transactions to be 4.5 days.
Any idea of how to achieve this in Tableau?
I am trying to create a calculated field for each transaction to identify the date of the "next" transaction but I am struggling.
{ FIXED [user id] : MIN(IF [Transaction Date] > **this transaction date** THEN [Transaction Date]) }
I am not sure what to replace this transaction date with or whether this is the right approach at all.
Any advice would be greatly appreciated.
LODs dont have access to previous values directly, so you need to create a self join in your data connection. Follow below steps to achieve what you want.
Create a self join with your data with following criteria
Create an LOD calculation as below
{FIXED [User],[Transaction Date]:
MIN(DATEDIFF('day',[Transaction Date],[Transaction Date (Data1)]))
}
Build the View
PS: If you want to improve the performance, Custom SQL might be the way.
The only type of calculation that can take order sequence into account (e.g., when the value for a calculated field depends on the value of the immediately preceding row) is a table calc. You can't use an LOD calc for this kind of problem.
You'll need to understand how partitioning and addressing works with table calcs, along with specifying your sort order criteria. See the online help. You can then do something like, for example, define days_since_last_transaction as:
if first() > 0 then min([Transaction Date]) -
lookup(min([Transaction Date]), -1) end
If you have very large data or for other reasons want to do your calculations at the database instead of in Tableau by a table calc, then you use SQL windowing (aka analytical) queries instead via Tableau's custom SQL.
Please attach an example workbook and anything you tried along with the error you have.
This might not be useful if you cannot set User ID Field as a filter.
So, you can set
User ID
as a filter. Then following the steps mentioned in here will lead you to calculating difference between any two dates. Ideally if you select any one value in the filter, the calculated field from the link should give you the difference in the dates that you have in the transaction dates column.

How to get all missing days between two dates

I will try to explain the problem on an abstract level first:
I have X amount of data as input, which is always going to have a field DATE. Before, the dates that came as input (after some process) where put in a table as output. Now, I am asked to put both the input dates and any date between the minimun date received and one year from that moment. If there was originally no input for some day between this two dates, all fields must come with 0, or equivalent.
Example. I have two inputs. One with '18/03/2017' and other with '18/03/2018'. I now need to create output data for all the missing dates between '18/03/2017' and '18/04/2017'. So, output '19/03/2017' with every field to 0, and the same for the 20th and 21st and so on.
I know to do this programmatically, but on powercenter I do not. I've been told to do the following (which I have done, but I would like to know of a better method):
Get the minimun date, day0. Then, with an aggregator, create 365 fields, each has that "day0"+1, day0+2, and so on, to create an artificial year.
After that we do several transformations like sorting the dates, union between them, to get the data ready for a joiner. The idea of the joiner is to do an Full Outer Join between the original data, and the data that is going to have all fields to 0 and that we got from the previous aggregator.
Then a router picks with one of its groups the data that had actual dates (and fields without nulls) and other group where all fields are null, and then said fields are given a 0 to finally be written to a table.
I am wondering how can this be achieved by, for starters, removing the need to add 365 days to a date. If I were to do this same process for 10 years intead of one, the task gets ridicolous really quick.
I was wondering about an XOR type of operation, or some other function that would cut the number of steps that need to be done for what I (maybe wrongly) feel is a simple task. Currently I now need 5 steps just to know which dates are missing between two dates, a minimun and one year from that point.
I have tried to be as clear as posible but if I failed at any point please let me know!
Im not sure what the aggregator is supposed to do?
The same with the 'full outer' join? A normal join on a constant port is fine :) c
Can you calculate the needed number of 'dublicates' before the 'joiner'? In that case a lookup configured to return 'all rows' and a less-than-or-equal predicate can help make the mapping much more readable.
In any case You will need a helper table (or file) with a sequence of numbers between 1 and the number of potential dublicates (or more)
I use our time-dimension in the warehouse, which have one row per day from 1753-01-01 and 200000 next days, and a primary integer column with values from 1 and up ...
You've identified you know how to do this programmatically and to be fair this problem is more suited to that sort of solution... but that doesn't exclude powercenter by any means, just feed the 2 dates into a java transformation, apply some code to produce all dates between them and for a record to be output for each. Java transformation is ideal for record generation
You've identified you know how to do this programmatically and to be fair this problem is more suited to that sort of solution... but that doesn't exclude powercenter by any means, just feed the 2 dates into a java transformation, apply some code to produce all dates between them and for a record to be output for each. Java transformation is ideal for record generation
Ok... so you could override your source qualifier to achieve this in the selection query itself (am giving Oracle based example as its what I'm used to and I'm assuming your data in is from a table). I looked up the connect syntax here
SQL to generate a list of numbers from 1 to 100
SELECT (MIN(tablea.DATEFIELD) + levquery.n - 1) AS Port1 FROM tablea, (SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <= 365) as levquery
(Check if the query works for you - haven't access to pc to test it at the minute)

Annualize data - Tableau

I'm trying to annualise my data in tableau, but get an error in the Calculated Field.
"Cannot mix aggregate and non-aggregate arguments to function"
my formula is
sum(profit)/month(selected date) *12
How do I get an integer for the current month? That seems to be the problem, it tries to aggregate the month as well.
Thanks.
Short answer: wrap the call to month in a call to min() -- which works well if you have MONTH([selected date]) on the visualization as a dimension.
There are three types of calculated fields in Tableau:
row level calculations which act on a single data row. They can read from values of other fields in the same row and return a single value per row.
aggregate calculations which act on a partition or block of data rows. They can reference the result of aggregating the values for a field across the entire partition, using a an aggregate function like SUM() or MIN().
table calculations which act on an entire table of aggregated results.
You can't mix and match. Everything in a calculated field must be all at one level or another -- either all referenced fields must use aggregation functions (for aggregate calculated fields) or no referenced fields must use aggregation functions (for data row level calculated fields).
Hence the error message you saw.
Sometimes you know that all values for a field will be the same in a partition based on your visualization, so the aggregation function seems unnecessary. But Tableau still requires you to be explicit about how to turn a block of values into a single value, because the calculation must be defined even when the visualization is partitioned differently. In these cases, you can use min(), max(), avg(), or perhaps attr() because they all return the same value for a list of identical values.
The first two types are typically executed on the server (i.e. they are implemented by Tableau emitting SQL to send to the database server). Table calculations are executed by Tableau on the client site to post-process the results from the database server.
Table calcs are the most complicated type, but can be very useful. Explaining them is a post for another day.

Executing query in chunks on Greenplum

I am trying to creating a way to convert bulk date queries into incremental query. For example, if a query has where condition specified as
WHERE date > now()::date - interval '365 days' and date < now()::date
this will fetch a years data if executed today. Now if the same query is executed tomorrow, 365 days data will again be fetched. However, I already have last 364 days data from previous run. I just want a single day's data to be fetched and a single day's data to be deleted from the system, so that I end up with 365 days data with better performance. This data is to be stored in a separate temp table.
To achieve this, I create an incremental query, which will be executed in next run. However, deleting the single date data is proving tricky when that "date" column does not feature in the SELECT clause but feature in the WHERE condition as the temp table schema will not have the "date" column.
So I thought of executing the bulk query in chunks and assign an ID to that chunk. This way, I can delete a chunk and add a chunk and other data remains unaffected.
Is there a way to achieve the same in postgres or greenplum? Like some inbuilt functionality. I went through the whole documentation but could not find any.
Also, if not, is there any better solution to this problem.
I think this is best handled with something like an aggregates table (I assume the issue is you have heavy aggregates to handle over a lot of data). This doesn't necessarily cause normalization problems (and data warehouses often denormalize anyway). In this regard the aggregates you need can be stored per day so you are able to cut down to one record per day of the closed data, plus non-closed data. Keeping the aggregates to data which cannot change is what is required to avoid the normal insert/update anomilies that normalization prevents.

SSRS 2008 - Multiple Groupings For Date Range

A record in a table contains a range of valid dates, say:
*tbl1.start_date* and *tbl1.end_date*. So to ensure I get all records that are valid for a specific date range, the selection logic is: <...> WHERE end_date >= #dtFrom AND start_date < #dtTo (the #dtTo parameter used in the SQL statement is actually the calculated next day of the *#prmDt_To* parameter used in the report).
Now in a report I need to count the number of records for each day within the specified data range and include the days, if any, for which there were no valid records. Thus a retrieved record may be counted in several different days. I can do it relatively easily with a recursive CTE within the data set, but my rule of thumb is to avoid the unnecessary load on the SQL database and instead return just the necessary raw data and let the Report engine handle groupings. So is there a means to do this within SSRS?
Thank you,
Sergey
You might be able to do something in SSRS with custom code, but I recommend against it. The place to do this is in the dataset. SSRS is not designed to fill in groups that don't exist in the dataset. That sounds like what you are trying to do: SSRS would need to create the groups for each date whether or not that date is in the dataset.
If you don't have a number or date table in your database, I would just create a recursive CTE with a record for every date in the range that you are interested as you mention. Then outer join this to your table and use COUNT(tbl1.start_date) to find the appropriate days. This shouldn't be too painful a query for SQL server.
If you really need to avoid the CTE, then I would create a date or number table to use to generate the dates in your range.