Count # of records by grouped date? - tableau-api

I'm a novice Tableau user, trying to help my organization to analyze phone traffic. My data source of incoming phone calls is in an Excel spreadsheet, and is listed like this:
TRANSACTION ID DATETIME
151313:179805 1/2/2018 9:57
151340:108017 1/2/2018 17:27
151395:176211 1/3/2018 15:27
Our total calls per day range from 10 to 50.
I'd like to count how many days had the same # of calls, and probably make a histogram with # of calls on the X-axis and # of days with that many calls on the Y-axis.
I feel like this would be a simple Calculated Field, but for the life of me, I'm not getting what I'd do here.
Help! :)

One solution is to define an LOD calc, calls_per_day, as
{ FIXED DateTrunc('day', [DATETIME]) : COUNT("*") }
which, in effect, prebuilds a small table showing the number of data rows for each day. That works if your input has one data row per transaction id.
If transaction ids are repeated, and instead you want the number of transactions for each day, you can use the following variation.
{ FIXED DateTrunc('day', [DATETIME]) : COUNTD([TRANSACTION ID]) }
COUNTD() can be expensive on large data sets, so it's better to use an alternative when you have the option.
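If you then want the histogram itself, one possible setup (assuming you save the first LOD above as a calculated field named [calls_per_day]) is to add a second, ordinary aggregate field for the Y-axis:
// how many distinct days had a given call volume
COUNTD(DATETRUNC('day', [DATETIME]))
Place [calls_per_day] on Columns as a discrete dimension (or build bins from it) and the field above on Rows; each bar then shows the number of days that received that many calls.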

You can use an LOD:
{ FIXED [TRANSACTION ID] : COUNT(DAY([DATETIME])) }
Try this and post the result

Related

PowerBI - calculation with 2 different dates

I have one data table from which I have to calculate 2 different KPIs; each KPI is tied to a different date column:
"Creation Date" for the "Net Satisfaction Score" KPI calculation and "Uni Date" for the "Response Rate" KPI calculation.
"Date" from the "Date table" is used as the field to filter time periods, so I need to have a relation to that field.
(screenshot: table design)
If I filter for results in September'22, I want to see "Net Satisfaction Score" calculated from all Ids with Creation date in SEP'22, and I want to see "Response Rate" calculated from all Ids with UniDate in SEP'22 (this means Ids 00004, 00007, 00009 and 00010 are not to be considered in Response Rate calculation).
What I have tried already:
1) Using more queries - one for Response Rate (with relation UniDate <-> Date) and a second one for Net Satisfaction Score (with relation Creation Date <-> Date).
This worked, but if I want to go into more detail and see the results by country, the numbers don't show up correctly, as there is no relation on "Country" or whatever detail I want to split the result by.
2) Making a relation based on IDs between the queries mentioned in 1) - circular dependency error.
I am really out of ideas, but maybe some of you tried to solve this kind of issue already.
Sounds like you need to use USERELATIONSHIP within your measure.
You will need something like this
Net Satisfaction Score =
CALCULATE(
    SUM('Net Satisfaction Score Table'[Score]),
    USERELATIONSHIP('DateTable'[Day], 'Net Satisfaction Score Table'[Created Date])
)
You can obviously use SUM, COUNT, AVERAGE... whatever you need to do with your score. You also need to make sure that both dates, Created Date and UniDate, have an inactive relationship back to your Date table.
Repeat the same measure pattern for your other measure. USERELATIONSHIP works perfectly on inactive relationships and only works inside CALCULATE, which is where you tell Power BI which relationship (and therefore which date) to use for that calculation.
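For the second KPI, a minimal sketch could look like the following, assuming the response data sits in a table called 'Response Table' with an inactive relationship from its UniDate column to the date table, and that [Response Rate Base] is a placeholder for whatever measure already holds your response-rate logic:
Response Rate =
CALCULATE(
    [Response Rate Base],    // placeholder for your existing response-rate calculation
    USERELATIONSHIP('DateTable'[Day], 'Response Table'[UniDate])
)
Only the USERELATIONSHIP argument changes between the two measures; the underlying aggregation stays whatever your KPI needs.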

How do I use Tableau to populate the count of each dimension over a time period?

How do I populate the number of purchases and sales per day in Tableau?
Here is my Sample Data:
In my first attempt, sales numbers are not counted to the exact date.
In my second attempt, I tried to tabulate by dropping sales date into the rows. However, it returned two figures - purchases and sales.
I have also tried a Calculated Field, but Tableau is unable to do a "for loop" like Python.
First attempt:
After dropping Sales Date into the Rows. This is what I get:
Is there any way to populate it like this? Please help, I am still new to Tableau. Special thanks to Fabio Fantoni for the first solution!
Desired Format:
I have another sample data set (refer to Sample Data 2) which I would like to populate in the desired format (refer to Desired Format 2). In Sample Data 2, the purchase date "15/12/2020" does not appear among the sold dates.
My apologies, but I may require some guidance as I am still new to Tableau. Thank you in advance.
Sample Data 2:
Desired Format 2:
Based on this sample:
To get around the double count across your two different date columns, you may want to self join your original data with a copy of it on original.Purchase = support.Sold, like this:
Doing so, you just have to create two calculated fields:
count Purchase:
COUNT([Purchase Date])
count Sold:
COUNT([Purchase Date (Foglio11)])
The only thing you have to pay attention to is that in the second calculation you count the Purchase Date field coming from the joined copy ([Purchase Date (Foglio11)]), due to your "inverted" self join.
You should get something like this:

Find count of active users in the last 29 days in Tableau

I require assistance in calculating the Total Active Users from Feb 16 2020 to March 16 2020.
I have tried using calculated fields, but not getting the correct results. Please advise.
Thank you,
Nirmal
To find the number of unique values that appear in a field, say [user_code], you can use the COUNT DISTINCT function, COUNTD(), as in COUNTD([user_code]).
To restrict the data to a particular time range, one way is to put your date field on the Filter shelf and choose the settings that include only the data rows you want, say the range from 2/16 to 3/16 as you stated.
Alternatively, you can push the filtering condition into the calculation with an IF function call, as in COUNTD(IF <data is relevant> THEN [user_code] END), effectively combining the two techniques. That works because if there is no ELSE clause and the IF condition is False, then the IF statement evaluates to null. Since COUNTD(), like other aggregation functions, silently ignores nulls, the expression acts as if the irrelevant data rows were filtered out.
So, for example,
COUNTD(IF [dates] >= #2/16/2020# AND [dates] <= #3/16/2020# THEN [user_code] END)
will tell you the number of unique user codes during the period between 2/16 and 3/16. The DATEDIFF() function will probably be useful in more elaborate tests.
Finally, what if you want more flexibility? You could easily use Parameters or Filter controls to let the user choose the date range interactively.
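For instance, a hedged sketch using two hypothetical date parameters, [Start Date] and [End Date], that you would create yourself:
// unique users inside the parameter-driven range
COUNTD(IF [dates] >= [Start Date] AND [dates] <= [End Date] THEN [user_code] END)
Showing both parameter controls on the dashboard then lets viewers slide the window without editing the calculation.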
If you want this calculation repeated for each possible day, showing the unique users in the preceding 30-day period as some sort of rolling calculation, then you'll need to learn about some more advanced features: either multiple calculations as above for different time ranges, Table Calculations, or some data prep and/or data padding with Tableau Prep Builder, Python or some other technique. That is mostly because in that scenario each data row contributes to multiple rolling counts, rather than to a single count when partitioning the data by some dimension.

Identifying next closest record by date in tableau

I have a table of users and another table of transactions.
The transactions all have a date against them. What I am trying to ascertain for each user is the average time between transactions.
User | Transaction Date
-----+-----------------
A | 2001-01-01
A | 2001-01-10
A | 2001-01-12
Consider the above transactions for user A. I am basically looking at the distance from each transaction to the next, chronologically.
There are 9 days between transactions one and two, and 2 days between transactions two and three. The average of these is 5.5, so I would want to identify the average time between user A's transactions as 5.5 days.
Any idea of how to achieve this in Tableau?
I am trying to create a calculated field for each transaction to identify the date of the "next" transaction but I am struggling.
{ FIXED [user id] : MIN(IF [Transaction Date] > **this transaction date** THEN [Transaction Date] END) }
I am not sure what to replace this transaction date with or whether this is the right approach at all.
Any advice would be greatly appreciated.
LODs don't have access to previous rows directly, so you need to create a self join in your data connection. Follow the steps below to achieve what you want.
Create a self join on your data with the following criteria
Create an LOD calculation as below
{FIXED [User],[Transaction Date]:
MIN(DATEDIFF('day',[Transaction Date],[Transaction Date (Data1)]))
}
Build the View
PS: If you want to improve the performance, Custom SQL might be the way.
The only type of calculation that can take order sequence into account (e.g., when the value for a calculated field depends on the value of the immediately preceding row) is a table calc. You can't use an LOD calc for this kind of problem.
You'll need to understand how partitioning and addressing work with table calcs, along with specifying your sort order criteria. See the online help. You can then, for example, define days_since_last_transaction as:
IF FIRST() < 0 THEN
    MIN([Transaction Date]) - LOOKUP(MIN([Transaction Date]), -1)
END
If you have very large data, or for other reasons want to do your calculations at the database instead of in Tableau with a table calc, then you can use SQL windowing (aka analytic) queries instead via Tableau's custom SQL.
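As a rough sketch of that custom SQL route, assuming a table named transactions with columns user_id and transaction_date (the exact date-difference syntax varies by database; SQL Server, for example, would use DATEDIFF(day, ...) instead of plain subtraction):
-- days since each user's previous transaction, via a window function
SELECT
    user_id,
    transaction_date,
    transaction_date
      - LAG(transaction_date) OVER (PARTITION BY user_id ORDER BY transaction_date)
      AS days_since_last_transaction
FROM transactions
Averaging days_since_last_transaction per user (ignoring the NULL on each user's first row) then gives the figure the question is after.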
Please attach an example workbook and anything you tried along with the error you have.
This might not be useful if you cannot set the User ID field as a filter.
Assuming you can set User ID as a filter, following the steps mentioned here will lead you to calculating the difference between any two dates. Ideally, if you select any one value in the filter, the calculated field from the link should give you the difference between the dates in the transaction dates column.

How to get all missing days between two dates

I will try to explain the problem on an abstract level first:
I have X amount of data as input, which is always going to have a DATE field. Before, the dates that came as input (after some processing) were put in a table as output. Now, I am asked to output both the input dates and every date between the minimum date received and one year from that moment. If there was originally no input for some day between these two dates, all fields must come out as 0, or equivalent.
Example: I have two inputs, one with '18/03/2017' and the other with '18/03/2018'. I now need to create output data for all the missing dates between '18/03/2017' and '18/04/2017'. So, output '19/03/2017' with every field set to 0, and the same for the 20th, the 21st, and so on.
I know how to do this programmatically, but in PowerCenter I do not. I've been told to do the following (which I have done, but I would like to know of a better method):
Get the minimum date, day0. Then, with an Aggregator, create 365 fields, each holding day0+1, day0+2, and so on, to build an artificial year.
After that we do several transformations, like sorting the dates and a union between them, to get the data ready for a Joiner. The idea of the Joiner is to do a Full Outer Join between the original data and the all-zero data we got from the previous Aggregator.
Then a Router picks, in one group, the data that had actual dates (and fields without nulls) and, in another group, the rows where all fields are null; those fields are then given a 0 and finally written to a table.
I am wondering how this can be achieved while, for starters, removing the need to hard-code 365 date additions. If I were to do this same process for 10 years instead of one, the task gets ridiculous really quickly.
I was wondering about an XOR type of operation, or some other function, that would cut the number of steps needed for what I (maybe wrongly) feel is a simple task. Currently I need 5 steps just to know which dates are missing between two dates: a minimum and one year from that point.
I have tried to be as clear as possible, but if I failed at any point please let me know!
I'm not sure what the Aggregator is supposed to do?
The same with the Full Outer Join? A normal join on a constant port is fine :)
Can you calculate the needed number of 'duplicates' before the Joiner? In that case a Lookup configured to return 'all rows', together with a less-than-or-equal condition, can help make the mapping much more readable.
In any case you will need a helper table (or file) with a sequence of numbers between 1 and the number of potential duplicates (or more).
I use our time dimension in the warehouse, which has one row per day from 1753-01-01 for the next 200,000 days, and a primary integer column with values from 1 and up...
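As a sketch of that helper-table / time-dimension idea expressed in SQL (the names date_dim, cal_date, source_data, date_field and some_field are placeholders for whatever you actually have, and the date arithmetic shown is Oracle-style):
-- left join the calendar spine onto the real data and default missing days to 0
SELECT d.cal_date,
       COALESCE(s.some_field, 0) AS some_field
FROM date_dim d
LEFT JOIN source_data s
       ON s.date_field = d.cal_date
WHERE d.cal_date BETWEEN (SELECT MIN(date_field) FROM source_data)
                     AND (SELECT MIN(date_field) + 365 FROM source_data)
The same pattern works whether the spine is a warehouse time dimension or a small generated sequence of numbers added to the minimum date.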
You've identified that you know how to do this programmatically, and to be fair this problem is more suited to that sort of solution... but that doesn't exclude PowerCenter by any means. Just feed the 2 dates into a Java transformation and apply some code to produce all the dates between them, outputting a record for each. The Java transformation is ideal for record generation.
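A rough plain-Java illustration of the date-generation step (how you wire it into a Java transformation, e.g. the output port names and the row-generation call, is an assumption to adapt to your own mapping):
// emits every date between a start date and one year later, inclusive
import java.time.LocalDate;

public class DateRangeSketch {
    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2017, 3, 18);   // would come from the input row
        LocalDate end = start.plusYears(1);            // or the second input date
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            // in a Java transformation you would set the output ports here and emit a row
            System.out.println(d);
        }
    }
}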
OK... so you could override your Source Qualifier to achieve this in the selection query itself (I'm giving an Oracle-based example as it's what I'm used to, and I'm assuming your incoming data is from a table). I looked up the CONNECT BY syntax here:
SQL to generate a list of numbers from 1 to 100
SELECT MIN(tablea.DATEFIELD) + levquery.n - 1 AS Port1
FROM tablea,
     (SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <= 365) levquery
GROUP BY levquery.n
(Check if the query works for you - I don't have access to a PC to test it at the minute.)