Calculated Field to Sum Two Values Tableau - tableau-api

Is there a way to sum up two values from two different sheets. I have one sheet that looks at full time students as a distinct count of their ID and on another sheet I made a calculated field that takes the contact hours of part time students and divides by 12 (FT course load). I want to be able to add up these two numbers so that the... SUM(Full Time + (Part Time Contact Hours/12)) = ### and would result in an FTE (full-time equivalent enrollment).

Try a calculations similar to these:
Full Time
IF [Type]="FullTime" THEN [StudentId] END
Part Time
IF [Type]!="FullTime" THEN [Hours] END
FTEs
COUNTD([Full Time]) + SUM([Part Time])/12
Obviously I don't know your data structure, hopefully that gives you enough to go on to start.

Related

Tableau measure count items if between dates

What I am trying to achieve is to get a count of people employed in a particular period.
I have 3 variables:
Employee ID (integer)
Hire date (date)
Termination date (date or null)
Example
the formula I am looking for is something like
if termination_date is null
then
count employee_ID in
dates between Hire_date and max of either hire_date or termination_date
else
count employee ID in
dates between hire_date and termination_date
This aims to show the dynamic of staff level over the time.
I am new to Tableau, not sure how to even start with it. Any suggestions welcome.
This problem will be simpler if you reshape your data to have the following three columns
Employee ID
Date
Action. (where action takes on the values of ‘Hire’ or ‘Terminate’).
Each data row represents one change in status for an employees. If an employee had a termination date, they will have two records in this new format, otherwise just one record showing the hiring date.
You can reshape your data by hand, or leave the original and use Tableau Prep or the Tableau data source page to reshape using a self Union and a few simple calculated fields.
Define a calculated field called Staffing_Change as
if Action=‘Hire’ then 1 else -1 end
Now you can plot the change in staff level over time by putting exact date on columns and sum(Staffing_Change) on Rows. You can use a quick Table calc, Running Sum, to see the net staffing level. For line mark types, I’d use a step style by pressing on the path button on the Marks card. Otherwise, the chart can give the impression of fractional number of employees.

Tableau Aggregation of Groups over Fixed Level of Detail

For some reason when I try to group a field created by a fixed level of detail groups containing more then one item disappear from the view.
The basic set up is that there are 'unique event IDs' which are RCIPID which have one or two dates associated with them (some events have two separate dates that are non-linear). There is a 'follow up' date which is tied only to the event location and not the RCIPID, but each RCIPID has only one location.
I've joined the 'follow up' data to the main file based on the location. The below computations [NTC Submission Date] is the date of the event. [Live Date] is the date of the followup. The computations correctly give the number of days after the event that the follow up happened.
However when I try to do a unique count of NTC Date (Or RCIPID) and then group the computations any group with more then one day disappears.
FP - Straight Diff
DATEDIFF('day', [NTC Submission Date], [Livedate])
FP - Remove Negatives
IF [FP - Straight Diff]<0
Then DateDiff('day',[NTC Submission Date],TODAY())
ELSE
[FP - Straight Diff]
END
FP - Days after NTC
{FIXED [Rcip Id], [NTC Submission Date]: MIN([FP - Remove Negatives])}
It works when it is not grouped together
But as soon as I group it all of the groups with more then one day disappear.
Any and all help is greatly appreciated. I think it has /something/ to do with being a dimension, but I honestly don't know what.
The goal is a bar chart similar to the second one, but with the groups "4-5 Days", "6-10 Days", "11-20 Days", and "Over 20 Days" visible. Those values do exist in the data and if I change the view to show the day instead of a count it shows the proper calculations:
EDIT: Using a calculated field instead of groups did not work. Trying a concatenated RCIPID and NTC Submission Date also did not have an effect.
I'm not a huge fan of the group by feature in Tableau as it has give me some unexpected behavior in the past.
There are a few approaches I would recommend trying. One is spelled out well here by using the bin functionality built into tableau. https://community.tableau.com/thread/188952
It sounds like you have some irregular bin sizes you're looking for, in which case you could create a separate calculated field with a series if else statements based on the range of [FP - Days after NTC] and assign a string to each 'grouping'
Lastly, I've not done an LOD of this style with two dimensions "FP - Days after NTC". You only appear to have COUNTD(submission days) on the view shelf. I would verify that you are getting the results you are expecting. And if not, you could create a separate calculated field that is a concatenation of [Rcip Id] and [NTC Submission Date] on which to based your LOD calculation, and then you can use COUNTD on that new field.
EDIT: It was suggested to try using a computed field instead of a group and/or concatenate RCIPID and NTC Date. I tried both and neither affected the result.

How to get all missing days between two dates

I will try to explain the problem on an abstract level first:
I have X amount of data as input, which is always going to have a field DATE. Before, the dates that came as input (after some process) where put in a table as output. Now, I am asked to put both the input dates and any date between the minimun date received and one year from that moment. If there was originally no input for some day between this two dates, all fields must come with 0, or equivalent.
Example. I have two inputs. One with '18/03/2017' and other with '18/03/2018'. I now need to create output data for all the missing dates between '18/03/2017' and '18/04/2017'. So, output '19/03/2017' with every field to 0, and the same for the 20th and 21st and so on.
I know to do this programmatically, but on powercenter I do not. I've been told to do the following (which I have done, but I would like to know of a better method):
Get the minimun date, day0. Then, with an aggregator, create 365 fields, each has that "day0"+1, day0+2, and so on, to create an artificial year.
After that we do several transformations like sorting the dates, union between them, to get the data ready for a joiner. The idea of the joiner is to do an Full Outer Join between the original data, and the data that is going to have all fields to 0 and that we got from the previous aggregator.
Then a router picks with one of its groups the data that had actual dates (and fields without nulls) and other group where all fields are null, and then said fields are given a 0 to finally be written to a table.
I am wondering how can this be achieved by, for starters, removing the need to add 365 days to a date. If I were to do this same process for 10 years intead of one, the task gets ridicolous really quick.
I was wondering about an XOR type of operation, or some other function that would cut the number of steps that need to be done for what I (maybe wrongly) feel is a simple task. Currently I now need 5 steps just to know which dates are missing between two dates, a minimun and one year from that point.
I have tried to be as clear as posible but if I failed at any point please let me know!
Im not sure what the aggregator is supposed to do?
The same with the 'full outer' join? A normal join on a constant port is fine :) c
Can you calculate the needed number of 'dublicates' before the 'joiner'? In that case a lookup configured to return 'all rows' and a less-than-or-equal predicate can help make the mapping much more readable.
In any case You will need a helper table (or file) with a sequence of numbers between 1 and the number of potential dublicates (or more)
I use our time-dimension in the warehouse, which have one row per day from 1753-01-01 and 200000 next days, and a primary integer column with values from 1 and up ...
You've identified you know how to do this programmatically and to be fair this problem is more suited to that sort of solution... but that doesn't exclude powercenter by any means, just feed the 2 dates into a java transformation, apply some code to produce all dates between them and for a record to be output for each. Java transformation is ideal for record generation
You've identified you know how to do this programmatically and to be fair this problem is more suited to that sort of solution... but that doesn't exclude powercenter by any means, just feed the 2 dates into a java transformation, apply some code to produce all dates between them and for a record to be output for each. Java transformation is ideal for record generation
Ok... so you could override your source qualifier to achieve this in the selection query itself (am giving Oracle based example as its what I'm used to and I'm assuming your data in is from a table). I looked up the connect syntax here
SQL to generate a list of numbers from 1 to 100
SELECT (MIN(tablea.DATEFIELD) + levquery.n - 1) AS Port1 FROM tablea, (SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <= 365) as levquery
(Check if the query works for you - haven't access to pc to test it at the minute)

Tableau Future and Current References

Tough problem I am working on here.
I have a table of CustomerIDs and CallDates. I want to measure whether there is a 'repeat call' within a certain period of time (up to 30 days).
I plan on creating a parameter called RepeatTime which is a range from 0 - 30 days, so the user can slide a scale to see the number/percentage of total repeats.
In Excel, I have this working. I sort CustomerID in order and then sort CallDate from earliest to latest. I then have formulas like:
=IF(AND(CurrentCustomerID = FutureCustomerID, FutureCallDate - CurrentCallDate <= RepeatTime), 1,0)
CurrentCustomerID = the current row, and the FutureCustomerID = the following row (so it is saying if the customer ID is the same).
FutureCallDate = the following row and the CurrentCallDate = the current row. It is subtracting the future call time from the first call time to measure the time in between.
The goal is to be able to see, dynamically, how many customers called in for a specific reason within maybe 4 hours or 1 day or 5 days, etc. All of the way up until 30 days (this is our actual metric but it is good to see the calls which are repeats within a shorter time frame so we can investigate).
I had a similar problem, see here for detailed version Array calculation in Tableau, maxif routine
In your case, that is basically the same thing as mine, so you could apply that solution, but I find it easier to understand the one I'm about to give, I would do:
1) Create a calculated field called RepeatTime:
DATEDIFF('day',MAX(CallDates),LOOKUP(MAX(CallDates),-1))
This will calculated how many days have passed since the last call to the current. You can add a IFNULL not to get Null values for the first entry.
2) Drag CustomersID, CallDates and RepeatTime to the worksheet (can be on the marks tab, don't need to be on rows or column).
3) Configure the table calculation of RepeatTIme, Compute using Advanced..., partitioning CustomersID, Adressing CallDates
Also Sort by Field CallDates, Maximum, Ascending.
This will guarantee the table calculation works properly
4) Now you have a base that you can use for what you need. You can either export it to csv or mdb and connect to it.
The best approach, actually, is to have this RepeatTime field calculated outside Tableau, on your database, so it's already there when you connect to it. But this is a way to use Tableau to do the calculation for you.
Unfortunately there's no direct way to do this directly with your database.

Default range for date range filter in tableau

I want to set the default range on a date filter to show me the last 10 days - so basically looking at the lastDate (max date) in the data and default filtering only on the last 10 days (maxDate - 10)
How it looks now:
I still would want to the see the entire range bar on the dashboard and give the user the ability to modify the selected range if he wants to. The maxDate changes after every data refresh so it has to be some sort of a condition that is applied to the filter.
How I want it to look (by default after every refresh of data - new dates coming in):
Any suggestions on how this can be done? I know I can use the relative date and show the data for last 10 days but that would modify the filter and create a drop down which I don't want.
Any suggestions are welcome!
One simple approach that does most of what you want is the following:
Create an integer valued parameter with a range from 1 to some max
you choose, say 100. Call it say num_days.
Show the parameter control on your dashboard as a slider, and give
it a nice title like "Number of days to display"
Create a boolean calculated field called Within_Day_Range defined as:
datediff('minute', [My_Date_Field], now()) < [num_days] * 24 * 60
Put Within_Day_Range on the filter shelf and select the value true.
This lets the user easily select how many days in the past to include, and works to the granularity of minutes (i.e. the last two days really means the last 48 hours, not starting at midnight yesterday). Adjust the calculated field if you want different behavior.
The main drawback of this approach as described so far is that it doesn't display the earliest date possible in the database because that is filtered out. Quick filters do an initial query to get the bounds, which has a performance cost -- so using the approach described here can avoid that query and thus load faster.
If you really need that information on your dashboard, you could create a different worksheet to get just the min([My_Date_Field]) and display that near your parameter control.