My dataset contains a few customers who churned, along with their subscription plan information. Customers can change their subscription, so here I get the max plan, i.e. the plan on each customer's latest dated row that has a plan:
{FIXED [Customer Id]:MAX(
(IF {FIXED [Customer Id]: MAX(
IF NOT ISNULL([Subscription Plan]) THEN [Date] END)
}=[Date] THEN [Subscription Plan] END)
)}
To find customer churn:
{ FIXED DATETRUNC('month', [Date]), [Max Plan]:
COUNTD(
IF NOT ISNULL([Churn Date]) AND
DATETRUNC('month', [Date]) = DATETRUNC('month', [Churn Date])
THEN [Customer Id] END
)
}
I want to calculate the revenue lost to churn: each churned customer on the advanced plan costs 9 dollars, and each churned customer on the premium plan costs 19 dollars.
IF [Max Plan] = 'advanced' THEN [Churn]*9
ELSEIF [Max Plan] = 'premium' THEN [Churn]*19
END
However, it doesn't give me the correct result.
Here is the expected result:
Here is the workbook attached: https://community.tableau.com/s/contentdocument/0694T000004aUnLQAU
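For reference, the arithmetic being asked for can be sketched outside Tableau. This is only an illustration in Python, not the Tableau fix: it assumes one row per customer carrying the latest plan and an optional churn date, and the names PLAN_PRICE and revenue_loss_by_month are made up for the example.

```python
from collections import Counter
from datetime import date

# Prices per churned customer, taken from the figures in the question.
PLAN_PRICE = {"advanced": 9, "premium": 19}

def revenue_loss_by_month(customers):
    """customers: iterable of (customer_id, latest_plan, churn_date or None).

    Assumes each customer appears exactly once.
    """
    churned = Counter()
    for _customer_id, plan, churn_date in customers:
        if churn_date is None:
            continue  # customer has not churned
        month = churn_date.replace(day=1)  # same idea as DATETRUNC('month', ...)
        churned[(month, plan)] += 1
    # Churned-customer count times the plan price, per (month, plan).
    return {key: count * PLAN_PRICE.get(key[1], 0) for key, count in churned.items()}

sample = [
    (1, "advanced", date(2021, 3, 14)),
    (2, "premium", date(2021, 3, 2)),
    (3, "premium", date(2021, 3, 30)),
    (4, "advanced", None),
]
loss = revenue_loss_by_month(sample)
```

With this sample, March 2021 has one advanced churn and two premium churns, so the losses are 9 and 38 respectively.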
I am trying to find the basket items (category, subcategory) in the last 3 orders of each customer. In the end, I hope to cluster customers according to the items or categories they bought most in their last 3 orders. I am stuck on how to calculate the last 3 orders of each customer. I assume I should use LOD expressions, but which one and how?
I think using FIXED [Client id] is the starting point. Should I rank orders descending (based on order date) and then filter with "<=3"?
I tried replicating your problem on the Sample Superstore data.
Creating this calculation will give you the last order for each customer:
{fixed [Customer Name]: max([Order Date])} = [Order Date]
Creating this calculation will give you the last 2 orders:
{Fixed [Customer Name]:MAX(
If [Order Date] <> {fixed [Customer Name]: max([Order Date])}
then [Order Date] END
)} = [Order Date]
OR
{fixed [Customer Name]: max([Order Date])} = [Order Date]
Similarly, creating this calculation will give you the last 3 orders:
{ FIXED [Customer Name] : max( IF
{Fixed [Customer Name]:MAX(
If [Order Date] <> {fixed [Customer Name]: max([Order Date])}
then [Order Date] END
)} <> [Order Date]
AND
{fixed [Customer Name]: max([Order Date])} <> [Order Date]
THEN [Order Date] END)} = [Order Date]
OR
{Fixed [Customer Name]:MAX(
If [Order Date] <> {fixed [Customer Name]: max([Order Date])}
then [Order Date] END
)} = [Order Date]
OR
{fixed [Customer Name]: max([Order Date])} = [Order Date]
The only assumption is that there is no more than 1 order on any given date.
Check it.
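The nested calculations above peel off one maximum date at a time. The same idea, sketched in Python for comparison, is to keep the n most recent distinct order dates per customer. The function name and the sample rows are hypothetical, and like the calculations above it works on dates, so several orders on one date count as a single entry.

```python
from collections import defaultdict
from datetime import date

def last_n_order_dates(orders, n=3):
    """orders: iterable of (customer, order_date).

    Returns {customer: [dates]} with the n most recent distinct dates, newest first.
    """
    dates = defaultdict(set)
    for customer, order_date in orders:
        dates[customer].add(order_date)  # a set collapses repeated dates
    return {c: sorted(ds, reverse=True)[:n] for c, ds in dates.items()}

orders = [
    ("Ann", date(2020, 1, 5)),
    ("Ann", date(2020, 2, 9)),
    ("Ann", date(2020, 3, 1)),
    ("Ann", date(2020, 4, 20)),
    ("Bob", date(2020, 2, 2)),
]
recent = last_n_order_dates(orders)
```

A row then belongs to a customer's last 3 orders when its order date is in that customer's list.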
My code computes an accumulated (running) total of revenue over a period of time. If a single day is blank (no revenue for that day), I need it to show the total from the day before, something like: CASE WHEN (today is blank) THEN yesterday's total ELSE today's total END.
I am not sure what the syntax for this is.
select distinct
date_trunc('day',admit_date) as admit_date,
revenue,
sum(revenue) over(order by admit_date) as running_rev
from dailyrev
order by admit_date
Expected Results:
Day 1: $100
Day 2: $200
Day 3: (no data so show Day 2 data) $200
Maybe this is what you need:
SELECT admit_date,
prev_revs[cardinality(prev_revs)] AS adj_revenue,
sum(prev_revs[cardinality(prev_revs)])
OVER (ORDER BY admit_date) AS running_sum
FROM (SELECT date_trunc('day', admit_date) AS admit_date,
array_remove(array_agg(revenue)
OVER (order by admit_date),
NULL) AS prev_revs
FROM dailyrev) AS q
ORDER BY admit_date;
Unfortunately PostgreSQL doesn't yet support the IGNORE NULLS clause; with it, this would have been simpler.
I am not sure if this is what you want, but try this:
SELECT
gs.date::date AS admit_date,
(SELECT revenue FROM dailyrev WHERE admit_date::date = gs.date) AS revenue,
(SELECT SUM(revenue) FROM dailyrev WHERE admit_date::date <= gs.date) AS accumulated_total
FROM
generate_series(
(SELECT MIN(admit_date::date) FROM dailyrev),
(SELECT MAX(admit_date::date) FROM dailyrev),
INTERVAL '1 day'
) AS gs(date)
ORDER BY gs.date::date;
Yes, it does not look that nice, but..
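To make the expected behaviour concrete, here is a small sketch in Python rather than SQL. It walks every calendar day between the first and last day with data and carries the running total forward when a day has no revenue. The function name and the sample figures are assumptions made for the illustration.

```python
from datetime import date, timedelta

def running_revenue(daily):
    """daily: {date: revenue}. Returns [(day, running_total)] for every calendar day."""
    if not daily:
        return []
    day, last = min(daily), max(daily)
    total, out = 0, []
    while day <= last:
        total += daily.get(day) or 0  # a missing or None day adds nothing
        out.append((day, total))
        day += timedelta(days=1)
    return out

daily = {
    date(2020, 1, 1): 100,
    date(2020, 1, 2): 100,
    # no row for 2020-01-03
    date(2020, 1, 4): 50,
}
totals = running_revenue(daily)
```

Here the third day has no data, so it repeats the second day's total of 200 before the fourth day brings it to 250.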
I want to get every customer's spending for their first year, but each customer has a different joining date. I have transaction data consisting of the columns [User ID], [Date Joined], [Transaction Date], [Amount].
Can someone help? Thanks
An LOD calculation oughta do the trick.
{ FIXED [User ID] :
SUM(
IIF(
(
[Transaction Date] >= [Date Joined]
AND [Transaction Date] < DATEADD('year', 1, [Date Joined])
),
[Amount],
0
)
)
}
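The same window logic, sketched in Python for anyone who wants to check the numbers outside Tableau. The helper names are hypothetical, and the handling of a 29 February join date (rolled to 28 February of the next year) is an assumption of this sketch.

```python
from collections import defaultdict
from datetime import date

def add_one_year(d):
    """Same calendar day one year later; 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)

def first_year_spend(transactions):
    """transactions: iterable of (user_id, date_joined, transaction_date, amount)."""
    totals = defaultdict(float)
    for user_id, joined, tx_date, amount in transactions:
        in_window = joined <= tx_date < add_one_year(joined)
        # Adding 0 keeps users with no first-year spend in the result,
        # like the 0 branch of the IIF above.
        totals[user_id] += amount if in_window else 0
    return dict(totals)

rows = [
    ("u1", date(2020, 1, 15), date(2020, 2, 1), 10.0),
    ("u1", date(2020, 1, 15), date(2021, 1, 14), 5.0),
    ("u1", date(2020, 1, 15), date(2021, 1, 15), 99.0),  # first day of year two, excluded
    ("u2", date(2020, 6, 1), date(2021, 7, 1), 20.0),    # outside the window
]
spend = first_year_spend(rows)
```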
I'm trying to calculate monthly retention rate in Amazon Redshift and have come up with the following query:
Query 1
SELECT EXTRACT(year FROM activity.created_at) AS Year,
EXTRACT(month FROM activity.created_at) AS Month,
COUNT(DISTINCT activity.member_id) AS active_users,
COUNT(DISTINCT future_activity.member_id) AS retained_users,
COUNT(DISTINCT future_activity.member_id) / COUNT(DISTINCT activity.member_id)::float AS retention
FROM ads.fbs_page_view_staging activity
LEFT JOIN ads.fbs_page_view_staging AS future_activity
ON activity.mongo_id = future_activity.mongo_id
AND datediff ('month',activity.created_at,future_activity.created_at) = 1
GROUP BY Year,
Month
ORDER BY Year,
Month
For some reason this query returns zero retained_users and zero retention. I'd appreciate any help understanding why this happens, or a completely different query for monthly retention that works.
I modified the query as per another SO post and here it goes:
Query 2
WITH t AS (
SELECT member_id
,date_trunc('month', created_at) AS month
,count(*) AS item_transactions
,lag(date_trunc('month', created_at)) OVER (PARTITION BY member_id
ORDER BY date_trunc('month', created_at))
= date_trunc('month', created_at) - interval '1 month'
OR NULL AS repeat_transaction
FROM ads.fbs_page_view_staging
WHERE created_at >= '2016-01-01'::date
AND created_at < '2016-04-01'::date -- time range of interest.
GROUP BY 1, 2
)
SELECT month
,sum(item_transactions) AS num_trans
,count(*) AS num_buyers
,count(repeat_transaction) AS repeat_buyers
,round(
CASE WHEN sum(item_transactions) > 0
THEN count(repeat_transaction) / sum(item_transactions) * 100
ELSE 0
END, 2) AS buyer_retention
FROM t
GROUP BY 1
ORDER BY 1;
This query gives me the following error:
An error occurred when executing the SQL command:
WITH t AS (
SELECT member_id
,date_trunc('month', created_at) AS month
,count(*) AS item_transactions
,lag(date_trunc('m...
[Amazon](500310) Invalid operation: Interval values with month or year parts are not supported
Details:
-----------------------------------------------
error: Interval values with month or year parts are not supported
code: 8001
context: interval months: "1"
query: 616822
location: cg_constmanager.cpp:145
process: padbmaster [pid=15116]
-----------------------------------------------;
I have a feeling that Query 2 would fare better than Query 1, so I'd prefer to fix the error on that.
Any help would be much appreciated.
Query 1 looks good in principle; I tried a similar one (see below). The problem is that you self-join the table (ads.fbs_page_view_staging) on mongo_id and compare the same column (created_at). Assuming mongo_id is unique, each row only joins to itself, so datediff('month', ...) always returns 0 and the condition datediff ('month',activity.created_at,future_activity.created_at) = 1 is always false.
-- Count distinct events of join_col_id that have lapsed for one month.
SELECT count(distinct E.join_col_id) dist_ct
FROM public.fact_events E
JOIN public.dim_table Z
ON E.join_col_id = Z.join_col_id
WHERE datediff('month', event_time, sysdate) = 1;
-- 2771654 -- dist_ct
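To illustrate what the retention figure should look like once rows are matched per member rather than per event, here is a small Python sketch. It assumes that retention means "active in month M and again in month M+1" and that the match is on member_id; both are my reading of Query 1, and the function name and sample events are made up.

```python
from collections import defaultdict

def monthly_retention(events):
    """events: iterable of (member_id, year, month).

    Returns {(year, month): (active_users, retained_users, retention)}.
    """
    active = defaultdict(set)
    for member_id, year, month in events:
        active[(year, month)].add(member_id)  # sets give COUNT(DISTINCT ...) semantics
    result = {}
    for (year, month), members in sorted(active.items()):
        following = (year + 1, 1) if month == 12 else (year, month + 1)
        retained = members & active.get(following, set())
        result[(year, month)] = (len(members), len(retained), len(retained) / len(members))
    return result

events = [
    (1, 2016, 1), (1, 2016, 1), (2, 2016, 1),  # member 1 has two January events
    (1, 2016, 2), (3, 2016, 2),
]
retention = monthly_retention(events)
```

In this sample, two members are active in January and one of them returns in February, so January's retention is 0.5.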
I am trying to write a query that reorders date ranges around particular spans. It should take something that looks like this:
Member  Rank  Begin Date  End Date
2275    A     9/9/14      11/17/14
2275    B     9/26/14     3/24/15
2275    B     3/25/15     12/31/15
8983    A     9/16/13     3/10/15
8983    B     2/24/15     4/28/15
8983    A     4/28/15     12/31/15
and have it become:
Member  Rank  Begin Date  End Date
2275    A     9/9/14      11/17/14
2275    B     11/18/14    3/24/15
2275    B     3/25/15     12/31/15
8983    A     9/16/13     3/10/15
8983    B     3/11/15     4/27/15
8983    A     4/28/15     12/31/15
To explain further, I am looking to update the dates. There isn't much to the ranking except A > B, and there are only ranks A and B. Date ranges with rank A should remain untouched. Overlapping B-ranked dates are okay; I am only concerned with B-ranked dates overlapping A-ranked dates. The table is very large (~700 members) with several different member IDs. For example, the 2nd line (Rank B) of member 2275 changes its begin date to 11/18/14 so that it no longer overlaps the 1st line.
I am using Microsoft SQL Server 2008 R2
Thanks
LATEST EDIT: Here's what I did for pre-2012. I don't think it's the most elegant solution.
WITH a AS (
SELECT
1 AS lgoffset
, NULL AS lgdefval
, ROW_NUMBER() OVER(PARTITION BY [Member] ORDER BY [Begin Date]) AS seq
, [Member]
, [Rank]
, [Begin Date]
, [End Date]
FROM #table
)
SELECT
a.seq
, a.[Member]
, a.[Rank]
, a.[Begin Date]
, CASE
WHEN a.[Rank] = 'B' AND a.[Begin Date] <= ISNULL(aLag.[End Date], a.lgdefval)
THEN ISNULL(aLag.[End Date], a.lgdefval)
ELSE a.[Begin Date]
END AS bdate2
, a.[End Date]
INTO #b
FROM a
LEFT OUTER JOIN a aLag
ON a.seq = aLag.seq + a.lgoffset
AND a.[Member] = aLag.[Member]
ORDER BY [Member], [Begin Date];
-- Push the adjusted begin date back to #table, using the column names from the CTE above.
UPDATE t
SET t.[Begin Date] = DATEADD(d, 1, b.bdate2)
FROM #table AS t
INNER JOIN #b AS b
ON b.[Member] = t.[Member]
AND b.[Begin Date] = t.[Begin Date]
AND b.[End Date] = t.[End Date]
WHERE t.[Rank] = 'B'
AND b.bdate2 > b.[Begin Date];
EDIT PS: Below was my previous answer that only applies to 2012 and later.
You may want to try the following SELECT statement to see if you get the desired results and then convert to an UPDATE:
SELECT
[Member]
, [Rank]
, CASE
-- LAG returns NULL for a member's first row, so that row falls through to ELSE
WHEN [Rank] = 'B' AND [Begin Date] <= LAG([End Date]) OVER(PARTITION BY [Member] ORDER BY [Begin Date])
THEN DATEADD(d,1,LAG([End Date]) OVER(PARTITION BY [Member] ORDER BY [Begin Date]))
ELSE [Begin Date]
END AS [Begin Date]
, [End Date]
FROM #Table
ORDER BY [Member], [Begin Date]
EDIT: So in order to update the begin date column:
-- Update through a CTE so that each row is compared with its own previous row
WITH lagged AS (
SELECT [Rank]
, [Begin Date]
, LAG([End Date]) OVER(PARTITION BY [Member] ORDER BY [Begin Date]) AS [Prev End Date]
FROM #Table
)
UPDATE lagged
SET [Begin Date] = DATEADD(d,1,[Prev End Date])
WHERE [Rank] = 'B' AND [Begin Date] <= [Prev End Date]
EDIT 2: Some of my code was incorrect because I had not realized that the LAG function needs an OVER clause; I have updated the SELECT and UPDATE statements.
Sources: Alternate of lead lag function in sql server 2008
http://blog.sqlauthority.com/2011/11/24/sql-server-solution-to-puzzle-simulate-lead-and-lag-without-using-sql-server-2012-analytic-function/
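As a way to check the expected output from the question independently of the SQL Server version, the rule "B ranges give way to A ranges" can be sketched in Python. This is only an illustration under simplifying assumptions: it trims a B range whose begin or end date falls inside an A range of the same member, and it does not handle a B range lying entirely inside an A range or an A range lying entirely inside a B range. The function name is hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def trim_b_ranges(rows):
    """rows: list of (member, rank, begin, end). Returns a new list with B rows trimmed."""
    out = []
    for member, rank, begin, end in rows:
        if rank == "B":
            for m, r, a_begin, a_end in rows:
                if m != member or r != "A":
                    continue
                if a_begin <= begin <= a_end:
                    begin = a_end + ONE_DAY   # B starts inside an A range
                if a_begin <= end <= a_end:
                    end = a_begin - ONE_DAY   # B ends inside an A range
        out.append((member, rank, begin, end))
    return out

rows = [
    (2275, "A", date(2014, 9, 9), date(2014, 11, 17)),
    (2275, "B", date(2014, 9, 26), date(2015, 3, 24)),
    (2275, "B", date(2015, 3, 25), date(2015, 12, 31)),
    (8983, "A", date(2013, 9, 16), date(2015, 3, 10)),
    (8983, "B", date(2015, 2, 24), date(2015, 4, 28)),
    (8983, "A", date(2015, 4, 28), date(2015, 12, 31)),
]
trimmed = trim_b_ranges(rows)
```

On the sample data from the question this reproduces the expected table, including the end date of member 8983's B range moving to 4/27/15.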