Postgres: Extracting the IDs and names of people who are cheating the system

I have a table A with the following transaction data:
ID Name Type
1 Albert Rewards
2 Albert Visit
3 Ruddy Rewards
4 Ruddy Visit
5 Ruddy Purchase
6 Mario Rewards
7 Mario Visit
...
I want a table that selects only the rows for people who used the "Rewards" and "Visit" types but didn't make a purchase, something like this:
ID Name Type
1 Albert Rewards
2 Albert Visit
6 Mario Rewards
7 Mario Visit
...
Any ideas?

The query below counts, for each name, how many Rewards, Visit, and Purchase rows exist. If the counts are 1/1/0, all rows from the table with that name are returned.
If fine-tuning is required (for example, when any of those counts can be greater than 1), adjust the numbers in the 'having' clause. Additional categories to check against can be added the same way.
select *
from mytable a
where exists (select b.name,
sum(case when b.type='Rewards' then 1 else 0 end),
sum(case when b.type='Visit' then 1 else 0 end),
sum(case when b.type='Purchase' then 1 else 0 end)
from mytable b
where b.name=a.name
group by b.name
having sum(case when b.type='Rewards' then 1 else 0 end) = 1
and
sum(case when b.type='Visit' then 1 else 0 end) = 1
and
sum(case when b.type='Purchase' then 1 else 0 end) = 0);
For completeness' sake: SQLFiddle with two queries. The first query also works, but a little differently.
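As a rough check of the fine-tuning idea above, here is a minimal sketch that runs a variant of the query against the sample rows from the question. It uses Python's built-in sqlite3 as a stand-in for Postgres (an assumption made only to keep the example runnable), and it relaxes the 'having' thresholds to ">= 1" so that repeated Rewards or Visit rows would still qualify.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "Albert", "Rewards"), (2, "Albert", "Visit"),
        (3, "Ruddy", "Rewards"), (4, "Ruddy", "Visit"), (5, "Ruddy", "Purchase"),
        (6, "Mario", "Rewards"), (7, "Mario", "Visit"),
    ],
)

# Same shape as the answer's query, with ">= 1" instead of "= 1"
# for Rewards and Visit, and still exactly 0 purchases.
query = """
SELECT a.id, a.name, a.type
FROM mytable a
WHERE EXISTS (
    SELECT 1
    FROM mytable b
    WHERE b.name = a.name
    GROUP BY b.name
    HAVING SUM(CASE WHEN b.type = 'Rewards'  THEN 1 ELSE 0 END) >= 1
       AND SUM(CASE WHEN b.type = 'Visit'    THEN 1 ELSE 0 END) >= 1
       AND SUM(CASE WHEN b.type = 'Purchase' THEN 1 ELSE 0 END) = 0
)
ORDER BY a.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this returns the Albert and Mario rows and drops Ruddy, who has a Purchase row.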

Related

Is there a dynamic way in BigQuery to select/create columns with pattern?

Not sure how best to phrase it, but essentially I need ~50 columns x 12 weeks in BigQuery and was hoping there was a way to do it more efficiently using some sort of logic or function.
I can generate the script in Python, but the end output itself is long and unwieldy. Is there a cleaner way to do it within BigQuery itself?
Example code:
select id,
sum(case when first_week_flag then visit else 0 end) as sum_visit_first_week,
sum(case when second_week_flag then visit else 0 end) as sum_visit_second_week,
... for 12 weeks,
avg(case when first_week_flag then gap else 0 end) as avg_gap_first_week,
avg(case when second_week_flag then gap else 0 end) as avg_gap_second_week,
... for 12 weeks,
etc. for 50 columns
from table
group by id
Potential for simplification:
select id,
sum(case when {WEEK}_flag then visit else 0 end) as sum_visit_{WEEK},
avg(case when {WEEK}_flag then gap else 0 end) as avg_gap_{WEEK},
etc. for 50 columns
from table
group by id
Can anyone point me in the right direction of what to search for? Thanks!

Find last occurring value within record in PostgreSQL

I'm not new to SQL, but I am new to PostgreSQL and am really struggling to adapt my current knowledge in a different environment.
I am trying to create a variable that captures whether someone stays active, skips, or churns within a 0/1 time series variable. For example, in the data below, my dataset would include the variables id, time, and voted, and I would create the variable "skipped":
id time voted skipped
1 1 1 active
1 2 0 skipped
1 3 1 active
2 1 1 active
2 2 0 churned
2 3 0 churned
3 1 1 active
3 2 1 active
3 3 0 churned
The rule for coding "skipped" is pretty simple: If 1 is the last record, the person is "active" and any zeroes count as "skipped", but if 0 is the last record, the person is "churned".
The record with id = 1 is a skip because voted is non-zero at time 3 after being 0 at time 2. In the other two cases, 0 is the final value, so they are "churned". Can anyone help? I've been noodling on it all day, and am hitting a wall.
This isn't particularly elegant, but it should meet your needs:
with votes as (
select
id, time, voted,
max(time) over (partition by id) as max_time
from voter_data
)
select
v1.id, v1.time, v1.voted,
case
when v1.voted = 1 then 'active'
when v2.voted = 1 then 'skipped'
else 'churned'
end as skipped
from
votes v1
join votes v2 on
v1.id = v2.id and
v1.max_time = v2.time
In a nutshell, we first figure out which is the last record for each voter id, and then we do a self-join on the resulting table to pick up only that last record.
There is a chance this could produce multiple results, if it's possible for the same ID to vote twice at the same time. If that's the case, you want row_number() instead of max().
Results on your data:
1 1 1 'active'
1 2 0 'skipped'
1 3 1 'active'
2 1 1 'active'
2 2 0 'churned'
2 3 0 'churned'
3 1 1 'active'
3 2 1 'active'
3 3 0 'churned'
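The row_number() alternative mentioned above could look roughly like the following sketch. It is run through Python's sqlite3 as a stand-in for PostgreSQL (an assumption for runnability; SQLite 3.25+ is needed for window functions), and the CTE name ranked and the column rn are illustrative names, not part of the original answer. Ordering by time descending and keeping rn = 1 guarantees exactly one "last" row per id, even when two rows share the same time (which of the tied rows wins is then arbitrary).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voter_data (id INTEGER, time INTEGER, voted INTEGER)")
conn.executemany(
    "INSERT INTO voter_data VALUES (?, ?, ?)",
    [
        (1, 1, 1), (1, 2, 0), (1, 3, 1),
        (2, 1, 1), (2, 2, 0), (2, 3, 0),
        (3, 1, 1), (3, 2, 1), (3, 3, 0),
    ],
)

# rn = 1 marks the latest row per id; the self-join attaches that
# row's voted value to every row of the same id.
query = """
WITH ranked AS (
    SELECT id, time, voted,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time DESC) AS rn
    FROM voter_data
)
SELECT v1.id, v1.time, v1.voted,
       CASE
           WHEN v1.voted = 1 THEN 'active'
           WHEN v2.voted = 1 THEN 'skipped'
           ELSE 'churned'
       END AS skipped
FROM ranked v1
JOIN ranked v2 ON v1.id = v2.id AND v2.rn = 1
ORDER BY v1.id, v1.time
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```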
Window functions can help readability by replacing the self-join.
WITH
add_last_voted_status AS (
    SELECT
        *
        -- Without an explicit frame, the default frame ends at the current
        -- row, so LAST_VALUE would just return the current row's value.
        , LAST_VALUE(voted) OVER (
            PARTITION BY id
            ORDER BY time
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS last_voted_status
    FROM voter_data
)
SELECT
    id
    , time
    , voted
    , CASE
        WHEN voted = 1
            THEN 'active'
        WHEN last_voted_status = 1
            THEN 'skipped'
        WHEN last_voted_status = 0
            THEN 'churned'
        ELSE '?'
    END AS skipped
FROM add_last_voted_status
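One detail worth checking with LAST_VALUE is the window frame. When a window has an ORDER BY but no explicit frame, the default frame runs from the start of the partition to the current row, so LAST_VALUE returns the current row's own value rather than the last value in the partition. The sketch below compares both behaviours side by side; it uses Python's sqlite3 as a stand-in for PostgreSQL (an assumption; both follow the same default frame rule), and the column aliases default_frame and full_frame are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voter_data (id INTEGER, time INTEGER, voted INTEGER)")
conn.executemany(
    "INSERT INTO voter_data VALUES (?, ?, ?)",
    [
        (1, 1, 1), (1, 2, 0), (1, 3, 1),
        (2, 1, 1), (2, 2, 0), (2, 3, 0),
        (3, 1, 1), (3, 2, 1), (3, 3, 0),
    ],
)

query = """
SELECT id, time, voted,
       -- default frame: partition start up to the current row
       LAST_VALUE(voted) OVER (PARTITION BY id ORDER BY time) AS default_frame,
       -- explicit frame: the whole partition
       LAST_VALUE(voted) OVER (
           PARTITION BY id ORDER BY time
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS full_frame
FROM voter_data
ORDER BY id, time
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the default frame, the extra column simply mirrors voted; only the explicit frame yields the final voted value per id (1 for id 1, 0 for ids 2 and 3).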

how to combine multiple query into one single query

I have three queries as below and I need to combine them into one. Does anybody know how to do that?
select COUNT(*) from dbo.VWAnswer where questionId =2 and answer =1
select COUNT(*) from dbo.VWAnswer where questionId =3 and answer =4
select COUNT(*) from dbo.VWAnswer where questionId =5 and answer =2
I want to find the total count of people whose gender = 1 and Education = 4 and marital status = 2.
These are the table columns (with one example) that I refer to:
questionId questionText anwser AnserSheetID
1 Gender 1 1
2 Qualification 4 1
3 Marital Status 2 1
1 Gender 2 2
2 Qualification 1 2
3 Marital Status 2 2
1 Gender 1 3
2 Qualification 3 3
3 Marital Status 1 3
Basically, these are questions answered by different people whose answers are stored in this table.
So if we consider above table entries I should get 1 as total count based upon above 3 conditions i.e. gender = 1 and Education = 4 and marital status = 2
Can someone tell me what I need to do to get this to work?
If you want to combine your three count queries, you can try the below SQL to get it done.
select
sum(case when questionId =2 and anwser=1 then 1 else 0 end) as FCount,
sum(case when questionId =3 and anwser=4 then 1 else 0 end) as SCount,
sum(case when questionId =5 and anwser=2 then 1 else 0 end) as TCount
from dbo.VWAnswer
Update 1:
select
Sum(case when questionText='Gender' and anwser='1' then 1 else 0 end) as GenderCount,
Sum(case when questionText='Qualification' and anwser='4' then 1 else 0 end) as EducationCount,
Sum(case when questionText='Marital Status' and anwser='2' then 1 else 0 end) as MaritalCount
from VWAnswer
The query above can only count rows one at a time, so each condition is checked against a single row, not against a person.
To count people who satisfy all the conditions, you can join the view to itself per condition and select the count of the answer sheets that match.
Select COUNT(*) as cnt from
(
Select a.AnserSheetID
from VWAnswer a
Join VWAnswer b on a.AnserSheetID=b.AnserSheetID and b.questionId = 2 and b.anwser=4
Join VWAnswer c on a.AnserSheetID=c.AnserSheetID and c.questionId = 3 and c.anwser=2
where a.questionId=1 and a.anwser=1
) hlp
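An alternative to one join per condition is to filter the matching rows and keep only the answer sheets that match all three questions, using GROUP BY and HAVING. This is a sketch on the sample rows from the question, run through Python's sqlite3 rather than SQL Server (an assumption for runnability); it assumes each sheet answers a given question at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VWAnswer ("
    "questionId INTEGER, questionText TEXT, anwser INTEGER, AnserSheetID INTEGER)"
)
conn.executemany(
    "INSERT INTO VWAnswer VALUES (?, ?, ?, ?)",
    [
        (1, "Gender", 1, 1), (2, "Qualification", 4, 1), (3, "Marital Status", 2, 1),
        (1, "Gender", 2, 2), (2, "Qualification", 1, 2), (3, "Marital Status", 2, 2),
        (1, "Gender", 1, 3), (2, "Qualification", 3, 3), (3, "Marital Status", 1, 3),
    ],
)

# Keep rows that satisfy any one condition, then require that a sheet
# has matched all three distinct questions.
query = """
SELECT COUNT(*) AS cnt
FROM (
    SELECT AnserSheetID
    FROM VWAnswer
    WHERE (questionId = 1 AND anwser = 1)
       OR (questionId = 2 AND anwser = 4)
       OR (questionId = 3 AND anwser = 2)
    GROUP BY AnserSheetID
    HAVING COUNT(DISTINCT questionId) = 3
) hlp
"""
cnt = conn.execute(query).fetchone()[0]
print(cnt)
```

Adding a fourth condition then means one more OR branch and changing the 3 to 4, instead of another join.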

Union Statement that can be changed to Case statement

Good day, everyone!
I have this kind of code, and it's kind of ugly.
A friend of mine told me I can use CASE statements here, but I don't know how to implement them. The code is long, so if you could help me optimize it, I would appreciate it greatly!
PS: please be gentle with me, I'm new to T-SQL :)
Thank you!
SELECT
SUM(CYJEWELRY) 'CY_Jewelry'
,SUM(CYAPPLICANCE) 'CY_Appliance'
,SUM(CYCELLPHONE) 'CY_Cellphone'
,SUM(PYJEWELRY) 'PY_Jewelry'
,SUM(PYAPPLIANCE) 'PY_Appliance'
,SUM(PYCELLPHONE) 'PY_Cellphone'
FROM
(
---TOTALS IN THE FORMAT 0,0,0,0,0,0
--------------CURRENT YEAR JEWELRY
SELECT COUNT (*) AS CYJEWELRY,0 AS CYAPPLICANCE,0 AS CYCELLPHONE,0 AS PYJEWELRY,0 AS PYAPPLIANCE,0 AS PYCELLPHONE
FROM #TEMPTABLE1
WHERE (fld_StorageGroupID >= 3 and fld_StorageGroupID <= 14)
UNION
-----------CURRENT YEAR APPLIANCE
SELECT 0,COUNT(*),0,0,0,0
FROM #TEMPTABLE1
WHERE fld_StorageGroupID = 1
UNION
------------CURRENT YEAR CELLPHONE
SELECT 0,0,COUNT(*),0,0,0
FROM #TEMPTABLE1
WHERE fld_StorageGroupID = 2
UNION
---------------LAST YEAR JEWELRY
SELECT 0,0,0,COUNT(*),0,0
FROM #TEMPTABLE2
WHERE (fld_StorageGroupID >= 3 and fld_StorageGroupID <= 14)
UNION
-----------------------LAST YEAR APPLIANCE
SELECT 0,0,0,0,COUNT (*),0
FROM #TEMPTABLE2
WHERE fld_StorageGroupID = 1
UNION
-------------------------LAST YEAR CELLPHONE
SELECT 0,0,0,0,0,COUNT(*)
FROM #TEMPTABLE2
WHERE fld_StorageGroupID = 2
)A
Assuming your data is a bit like this SQL Fiddle example, try this for the subquery, using SUM() and CASE.
SELECT SUM(CASE WHEN fld_StorageGroupID >= 3 and fld_StorageGroupID <= 14 THEN 1 ELSE 0 END) Col1And4,
SUM(CASE WHEN fld_StorageGroupID = 1 THEN 1 ELSE 0 END) Col2And5,
SUM(CASE WHEN fld_StorageGroupID = 2 THEN 1 ELSE 0 END) Col3And6
FROM #TEMPTABLE1
GROUP BY fld_StorageGroupID
Since you are applying the same filter for last 3 columns in the subquery, I have done only first 3 columns here.
EDIT:
I think this is better than above (Note: no need to use SUM() in the main query).
Fiddle Example with data
select col1_4 CY_Jewelry,
col2_5 CY_Appliance,
col3_6 CY_Cellphone,
col1_4 PY_Jewelry,
col2_5 PY_Appliance,
col3_6 PY_Cellphone
from (
select sum(case when id >= 3 and id <= 14 then 1 else 0 end) col1_4,
sum(case when id = 1 then 1 else 0 end) col2_5,
sum(case when id = 2 then 1 else 0 end) col3_6
from t
--group by id
) X
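Since the question counts current-year rows from #TEMPTABLE1 and last-year rows from #TEMPTABLE2, one way to get all six columns is to run the same SUM/CASE aggregation once per table and combine the two single-row results with a cross join. The sketch below uses Python's sqlite3 instead of SQL Server, with ordinary tables named temptable1 and temptable2 standing in for the temp tables and made-up group IDs as data (all assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temptable1 (fld_StorageGroupID INTEGER)")
conn.execute("CREATE TABLE temptable2 (fld_StorageGroupID INTEGER)")
# Hypothetical sample data: current year, then last year.
conn.executemany("INSERT INTO temptable1 VALUES (?)", [(g,) for g in (1, 1, 2, 3, 5, 14)])
conn.executemany("INSERT INTO temptable2 VALUES (?)", [(g,) for g in (1, 2, 2, 4, 20)])


def counts_subquery(table):
    """Return a one-row aggregation over `table` (a trusted, fixed name)."""
    return f"""
        SELECT SUM(CASE WHEN fld_StorageGroupID BETWEEN 3 AND 14 THEN 1 ELSE 0 END) AS jewelry,
               SUM(CASE WHEN fld_StorageGroupID = 1 THEN 1 ELSE 0 END) AS appliance,
               SUM(CASE WHEN fld_StorageGroupID = 2 THEN 1 ELSE 0 END) AS cellphone
        FROM {table}
    """


# Each subquery yields exactly one row, so the cross join yields one row too.
query = f"""
SELECT cy.jewelry   AS CY_Jewelry,
       cy.appliance AS CY_Appliance,
       cy.cellphone AS CY_Cellphone,
       py.jewelry   AS PY_Jewelry,
       py.appliance AS PY_Appliance,
       py.cellphone AS PY_Cellphone
FROM ({counts_subquery('temptable1')}) cy
CROSS JOIN ({counts_subquery('temptable2')}) py
"""
row = conn.execute(query).fetchone()
print(row)
```

Note that there is no GROUP BY in the subqueries: grouping by fld_StorageGroupID would return one row per group ID instead of a single row of totals.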

CRM Reports: Grouping by a related Entity

There is an N:N relationship between Contacts and Complaints.
My report currently looks like this:
Status 1 Status 2 Status 3 Status 4
3 4 32 34
With the following query:
SELECT
SUM(case WHEN status = 1 then 1 else 0 end) Status1,
SUM(case WHEN status = 2 then 1 else 0 end) Status2,
SUM(case WHEN status = 3 then 1 else 0 end) Status3,
SUM(case WHEN status = 4 then 1 else 0 end) Status4,
SUM(case WHEN status = 5 then 1 else 0 end) Status5
FROM [DB].[dbo].[Contact]
This is listing the number of contacts in each status. I'm now trying to GROUP BY a field in a related entity in CRM - complaints.
Status 1 Status 2 Status 3 Status 4
Contact.Complaints.CreatedBy[1] 3 4 32 34
Contact.Complaints.CreatedBy[2] 3 4 32 34
Contact.Complaints.CreatedBy[3] 3 4 32 34
Contact.Complaints.CreatedBy[4] 3 4 32 34
I'm not sure where to get started in my GROUP BY statement - any pointers would be awesome. I feel like I have to have another FROM statement pointing to the NN relationship, or at least Complaints.
It should be as easy as adding a JOIN to Complaints (through the N:N table). I completely agree with James: just make sure you execute the report as a CRM user, otherwise the Filtered views return 0 rows.
SELECT
MyComplaintType,
...existing Sum(Case) stuff
FROM
FilteredContacts c
JOIN
Filterednew_Contacts_new_Complaint_new_complaints r1 -- whatever your N:N is
ON c.contactId = r1.contactId
JOIN
Filterednew_Complaint comp
ON r1.new_complaintId = comp.new_complaintId
GROUP BY
MyComplaintType
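To see how the join through the N:N table changes the counts, here is a small sketch with Python's sqlite3. The tables contact, contact_complaint, and complaint, their columns, and the sample rows are all hypothetical stand-ins for the CRM filtered views, not the real schema. A contact linked to several complaints appears once per link, so it can be counted under more than one creator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified schema standing in for the CRM filtered views.
conn.execute("CREATE TABLE contact (contact_id INTEGER, status INTEGER)")
conn.execute("CREATE TABLE complaint (complaint_id INTEGER, created_by TEXT)")
conn.execute("CREATE TABLE contact_complaint (contact_id INTEGER, complaint_id INTEGER)")
conn.executemany("INSERT INTO contact VALUES (?, ?)", [(1, 1), (2, 2), (3, 1)])
conn.executemany("INSERT INTO complaint VALUES (?, ?)", [(10, "alice"), (11, "bob")])
conn.executemany(
    "INSERT INTO contact_complaint VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 11), (1, 11)],
)

# One output row per complaint creator, with the status counts
# of the contacts linked to that creator's complaints.
query = """
SELECT comp.created_by,
       SUM(CASE WHEN c.status = 1 THEN 1 ELSE 0 END) AS Status1,
       SUM(CASE WHEN c.status = 2 THEN 1 ELSE 0 END) AS Status2
FROM contact c
JOIN contact_complaint r1 ON c.contact_id = r1.contact_id
JOIN complaint comp ON r1.complaint_id = comp.complaint_id
GROUP BY comp.created_by
ORDER BY comp.created_by
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Here contact 1 is linked to complaints from both creators, so it contributes to both rows; if a contact should count only once per creator even with several complaints, counting distinct contact IDs per status is one option to consider.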