Max consecutive years for each customer in Tableau - tableau-api

I am trying to find for each customer the Max consecutive years he buys something. I tried to create a calculated field but to no avail.
I created two calculated fields
Consecutive: if max([Count])>0 then previous_value(0)+1+index()-index() else 0 end
max: window_max([Consecutive])
My data looks something like:
Year | Customer | Count
1996 | a | 2
1996 | b | 1
1997 | a | 1
1997 | b | 2
1998 | b | 1
So the result would be
a:2
b:3

Use nested table calcs.
The first calc, call it running_good_years, is a running count of consecutive years with sales.
If count(Sales) = 0 then 0 else previous_value(0) + 1 end
The second just returns the max
Window_max(running_good_years)
With table calcs, defining the partitioning and addressing is critical. Partition by Customer, Address by year

Related

Count records between rolling date range in Tableau

I have a file with a [start] and [end] date in Tableau and would like to create a calculated field that counts number of rows on a rolling basis that occur between [start] and [end] for each [person]. This data is like so:
| Start | End | Person
|1/1/2019 |1/7/2019 | A
|1/3/2019 |1/9/2019 | A
|1/8/2019 |1/15/2019| A
|1/1/2019 |1/7/2019 | B
I'd like to create a calculated field [count] with results like so:
| Start | End | Person | Count
|1/1/2019 |1/7/2019 | A | 1
|1/3/2019 |1/9/2019 | A | 2
|1/8/2019 |1/15/2019| A | 2
|1/1/2019 |1/7/2019 | B | 1
EDITED: A good analogy for what [count] represents is: "how many videos does each person rented at the same time as of that moment?" With the 1st row for person A, count is 1, with 1 item rented. As of row 2, person A has 2 items rented. But for the 3rd row [count]= 2 since the video rented in the first row is no longer rented.

Take new columns as output table - KDB

I have a query which returns results of data, which runs on a frequent basis. The new table will contain results of the old table as well but I only want to take whatever is in new in the most recent run of the new table and send that as an email. I already have the line for the email and trade data but just need a way to be able to:
display the results of the new table to be emailed
save the complete results of the new table to be used in the next run of the query
e.g.
Old results: tbl
| idx | name | age |
| 0 | Tom | 30 |
| 1 | Jerry | 25 |
| 2 | Bob | 30 |
| 3 | Ken | 45 |
New results: tbl
| idx | name | age |
| 0 | Tom | 30 |
| 1 | Jerry | 25 |
| 2 | Bob | 30 |
| 3 | Ken | 45 |
| 4 | Sam | 40 |
output required:
| 4 | Sam | 40 |
and then save the New results to be used in the next run
Thanks! :)
If the only changes between runs is that records are being appended onto the new table, you could just keep a variable denoting the last index seen and then select only those rows where idx is larger than that.
If the indexes are always increasing, this could be achieved using a query like
lastidx:exec last idx from tbl
select from tbl where idx>lastidx
If the idx values don't always increase monotonically, you could keep a count of the number of rows instead and only
lasti:count tbl
select from tbl where i>=lasti
This doesn't require saving the whole table in memory for use in the next iteration.
E.g to start with the old table had 4 rows so lasti = 4
q)tbl
idx name age
-------------
0 Tom 30
1 Jerry 25
2 Bob 30
3 Ken 45
q)lasti
4
The new table comes in and running the command selects the new row
q)tbl
idx name age
-------------
0 Tom 30
1 Jerry 25
2 Bob 30
3 Ken 45
4 Sam 40
q)select from tbl where i>lasti
idx name age
------------
4 Sam 40
lasti can then be updated to reflect the new count
q)lasti:count tbl
q)lasti
5
One way you can get this done, assuming the idx is the unique key :
q)old:([] idx:0 1 2 3; name:`T`J`B`K; age: 30 25 30 45)
q)new:old,enlist `idx`name`age!(4; `S;40) //new output from your query
q)out:()
q)if[0<count i:new[`idx] except old[`idx] ; out:new i ; old:new]
q)out
idx name age
------------
4 S 40
Another way, if your new records are always added to the last of old records:
q)old:([] idx:0 1 2 3; name:`T`J`B`K; age: 30 25 30 45)
q)i:count old
q)new:old,enlist `idx`name`age!(4; `S;40) //new output from your query
q)out:()
q)if[i<c:count new ; out:(i-c)#new ; old:new; i:c]
q)out
idx name age
------------
4 S 40

Cognos force 0 on group by

I've got a requirement to built a list report to show volume by 3 grouped by columns. The issue i'm having is if nothing happened on specific days for the specific grouped columns, i cant force it to show 0.
what i'm currently getting is something like:
ABC | AA | 01/11/2017 | 1
ABC | AA | 03/11/2017 | 2
ABC | AA | 05/11/2017 | 1
what i need is:
ABC | AA | 01/11/2017 | 1
ABC | AA | 02/11/2017 | 0
ABC | AA | 03/11/2017 | 2
ABC | AA | 04/11/2107 | 0
ABC | AA | 05/11/2017 | 1
ive tried going down the route of unioning a "dummy" query with no query filters, however there are days where nothing has happened, at all, for those first 2 columns so it doesn't always populate.
Hope that makes sense, any help would be greatly appreciated!
to anyone who wanted an answer i figured it out. Query 1 for just the dates, as there will always be some form of event happening daily so will always give a unique date range.
query 2 for the other 2 "grouped by" columns.
Create a data item in each with "1" as the result (but would work with anything as long as they are the same).
Query 1, left join to Query 2 on this new data item.
This then gives a full combination of all 3 columns needed. The resulting "Query 3" can then be left joined again to get the measures. Final query (depending on aggregation) may need to have the measure data item wrapped with a COALESCE/ISNULL to create a 0 on those days nothing happened.

Using DECLARE to create column numbers

When running a query I need column numbers to be applied to each row so that when I use the query to create a report in SSRS I can tell the report which data to put in which column. Example:
Case 1 | Jane Doe | Col 1
Case 1 | John Doe | Col 2
Case 2 | Sally Smith | Col 1 (only name in case)
My current query uses:
DECLARE #NumOfCols int=2;
And then this to tell it how to separate the columns:
(row_number() over (partition by case_num order by child_first) + #NumOfCols - 1)% #NumOfCols + 1 as DisplayCol
The problem is, when I run the query, even if a result only has one name (so only one column is needed) my data is getting duplicated. It seems like it is making it a mandatory column 1 and column 2 even if there is no data for a second column. Like this:
Case 1 | Jane Doe | Col 1
Case 1 | John Doe | Col 2
Case 2 | Sally Smith | Col 1 (only name in case)
Case 2 | Sally Smith | Col 2 (duplicating)
I hope this makes sense. Any ideas on how to eliminate duplicating the data?

postgresql Subselect Aggregate in larger query

I'm working with a gigantic dataset of individuals with demographic information and action tracking. I am trying to get the percentage of people who committed an action, which is simple, but also trying to get average ages of people who fit in a specific subgroup of the original SELECT. The CASE WHEN line works fine alone, and the subquery runs fine in it's own query but I cannot seem to get it integrated into this query as a subquery, it gives me a syntax error on the CASE WHEN statement. Here's a slightly anonymized version of the query. Any help would be VERY appreciated.
SELECT
AVG(ageagg)
FROM
(
SELECT
age AS ageagg
FROM
agetable
WHERE
age>30
AND action_taken=1) AvgAge_30Action,
COUNT(
CASE
WHEN action_taken=1
AND age> 30
THEN 1
ELSE 0 NULL) / COUNT(
CASE
WHEN age>30) AS Over_30_Action
FROM
agetable
WHERE
website_type=3
If I've interpreted your intent correctly, you wish to compute the following:
1) the number of people over the age of 30 that took a specific action as a percentage of the total number of people over the age of 30
2) the average age of the people over the age of 30 that took a specific action
Assuming my interpretation is correct, this query might work for you:
SELECT
100 * over_30_action / over_30_total AS percentage_of_over_30_took_action,
average_age_of_over_30_took_action
FROM (
SELECT
SUM(CASE WHEN action_taken=1 THEN 1 ELSE 0 END) AS over_30_action,
COUNT(*) AS over_30_total,
AVG(CASE WHEN action_taken=1 THEN age ELSE NULL END)
AS average_age_of_over_30_took_action
FROM agetable
WHERE website_type=3 AND age>30
) aggregated;
I created a dummy table and populated it with the following data.
postgres=# select * from agetable order by website_type, action_taken, age;
age | action_taken | website_type
-----+--------------+--------------
33 | 1 | 1
32 | 1 | 2
28 | 1 | 3
29 | 1 | 3
32 | 1 | 3
33 | 1 | 3
34 | 1 | 3
32 | 2 | 3
32 | 3 | 3
33 | 4 | 3
34 | 5 | 3
33 | 6 | 3
34 | 7 | 3
35 | 8 | 3
(14 rows)
Of the 14 rows, 4 rows (the first four in this listing) have either the wrong website_type or have age below 30. Of the ten remaining rows, you can see that 3 of them have an action_taken of 1. So, the query should determine that 30% of folks over the age of 30 took a particular action, and the average age among that particular population should be 33 (ages 32, 33, and 34). The results of the query I posted:
percentage_of_over_30_took_action | average_age_of_over_30_took_action
-----------------------------------+------------------------------------
30 | 33.0000000000000000
(1 row)
Again, all of this is predicated upon my interpretation of your intent actually being accurate. This is of course based on a highly contrived data set, but hopefully it's enough of a functional signpost to get you on the right path.