Count distinct column case when/conditional - postgresql

I'm trying to count the distinct number of ids in a column and this works fine.
COUNT(DISTINCT messages.id) AS link_created
But when I try to count with a condition, I get a syntax error. What is the proper syntax for adding a CASE WHEN (or some other condition) so that only the distinct message ids where messages.link_label is present are counted?
COUNT(DISTINCT messages.id CASE WHEN messages.link_label IS NOT NULL 1 END) AS link_created
My full query looks like this.
@customers = Customer.select("customers.*,
COUNT(DISTINCT recipient_lists.id) messages_sent,
COUNT(DISTINCT messages.id CASE WHEN messages.link_label IS NOT NULL 1 END) AS link_created,
COALESCE(SUM(video_activities.video_watched_count),0) AS watched_count,
COALESCE(SUM(video_activities.response_count),0) AS response_count,
COALESCE(SUM(video_activities.email_opened_count),0) AS email_opened_count,
COALESCE(SUM(CASE WHEN video_activities.video_watched_at IS NOT NULL THEN 1 ELSE 0 END),0) AS unique_watches,
COALESCE(SUM(CASE WHEN video_activities.email_opened_at IS NOT NULL THEN 1 ELSE 0 END),0) AS unique_opens,
COALESCE(SUM(CASE WHEN video_activities.response_count > 0 THEN 1 ELSE 0 END),0) AS unique_responses,
customers.updated_at AS last_login,
SUBSTRING( email from POSITION( '@' in email) + 1 for length(email)) AS company")
.joins("LEFT JOIN messages ON customers.id = messages.customer_id
LEFT JOIN recipient_lists ON messages.id = recipient_lists.message_id AND messages.link_label is NULL
LEFT JOIN video_activities ON messages.id = video_activities.message_id")
.group("customers.id")

Try this:
COUNT(DISTINCT CASE
WHEN messages.link_label IS NOT NULL
THEN messages.id
ELSE NULL END)
AS link_created
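The reason this works is that COUNT skips NULLs, and a CASE with no matching branch yields NULL, so only ids whose link_label is present reach the DISTINCT count. A minimal sketch of that behaviour follows, assuming SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL; the trimmed-down messages table and its rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for PostgreSQL here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER, customer_id INTEGER, link_label TEXT);
    -- Message 1 appears twice, as it would after a join fans rows out.
    INSERT INTO messages VALUES
        (1, 10, 'promo'),
        (1, 10, 'promo'),
        (2, 10, NULL),
        (3, 10, 'news');
""")

row = conn.execute("""
    SELECT COUNT(DISTINCT id) AS all_messages,
           COUNT(DISTINCT CASE WHEN link_label IS NOT NULL THEN id END) AS link_created
    FROM messages
    WHERE customer_id = 10
""").fetchone()

all_messages, link_created = row
print(all_messages, link_created)  # 3 distinct ids in total, 2 of them with a label
```

The ELSE NULL in the answer is optional: leaving it out gives the same result, because a CASE without ELSE already returns NULL.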

Related

UNION types integer and text cannot be matched in PostgreSQL

select product_name ,0 price1,0 price2,0 price3,
(CASE when sum(price)>100 then 1 else 0 end) as price4,0 price5
from sales_1
group by product_name,price
union
select product_name ,0 price1,0 price2,0 price3, 0 price4,
(CASE when sum(price)<100 then 'yes' else 'no' end) as price5
from sales_1
group by product_name,price
I want values which are less than 100 to turn into 'no' and the others into 'yes', but the query throws the error 'UNION types integer and text cannot be matched'. I have tried different kinds of casting to solve it, but none of them worked. I am doing this in PostgreSQL.
This is the code which got me to my required result:
SELECT product_name,
0 price1, 0 price2, 0 price3,
(CASE WHEN SUM(price)>100 THEN 'yes' ELSE 'no' END) AS price4,
'' price5
FROM sales_1
GROUP BY product_name,price
UNION ALL
SELECT product_name,
0 price1, 0 price2, 0 price3, '' price4,
(CASE WHEN SUM(price)<100 THEN 'yes' ELSE 'no' END) AS price5
FROM sales_1
GROUP BY product_name, price
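The working version differs from the first attempt in one respect: price4 and price5 are text in both branches ('' instead of 0), so every column position of the UNION resolves to a single type. The sketch below checks this with typeof(). It assumes SQLite via Python's sqlite3 module, which is dynamically typed and would not raise the PostgreSQL error, so it only illustrates the shape of the fix; the two sample rows are made up.

```python
import sqlite3

# SQLite stand-in with hypothetical sample rows; it does not reproduce the
# PostgreSQL error, it only shows that both branches now emit text values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_1 (product_name TEXT, price INTEGER);
    INSERT INTO sales_1 VALUES ('pen', 40), ('desk', 250);
""")

rows = conn.execute("""
    SELECT product_name, typeof(price4) AS t4, typeof(price5) AS t5
    FROM (
        SELECT product_name,
               CASE WHEN SUM(price) > 100 THEN 'yes' ELSE 'no' END AS price4,
               '' AS price5
        FROM sales_1
        GROUP BY product_name, price
        UNION ALL
        SELECT product_name,
               '' AS price4,
               CASE WHEN SUM(price) < 100 THEN 'yes' ELSE 'no' END AS price5
        FROM sales_1
        GROUP BY product_name, price
    )
""").fetchall()

# Collect the storage type of every price4/price5 value across both branches.
column_types = {t for _, t4, t5 in rows for t in (t4, t5)}
print(column_types)
```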
And this is the result I got from upper query:

Combine Two Queries in one SUM Case

I have two queries with the exact same grouping, but I don't seem to be able to combine them correctly.
Query1:
SELECT
WorkPeriods.Id AS Z_Number,
CONVERT(VARCHAR, (CONVERT(DATE, WorkPeriods.StartDate, 103)), 103) AS Z_Date,
SUM(CASE WHEN Payments.Name = 'Cash' THEN Payments.Amount ELSE 0 END) AS Cash_Payments,
COUNT(CASE WHEN Payments.Name = 'Cash' THEN 1 END) AS No_of_Tickets_Cash,
SUM(CASE WHEN Payments.Name = 'Credit Card' THEN Payments.Amount ELSE 0 END) AS Credit_Card_Payments,
COUNT(CASE WHEN Payments.Name = 'Credit Card' THEN 1 END) AS No_of_Tickets_Credit_Card
FROM
Payments, WorkPeriods
WHERE
Payments.Date BETWEEN WorkPeriods.StartDate AND WorkPeriods.EndDate
GROUP BY
WorkPeriods.Id, WorkPeriods.StartDate
Query 2:
SELECT
WorkPeriods.Id AS Z_Number,
CONVERT(VARCHAR, (CONVERT(DATE, WorkPeriods.StartDate, 103)), 103) AS Z_Date,
SUM(CASE WHEN Orders.CalculatePrice = 0 THEN Orders.Quantity * Orders.Price ELSE 0 END) AS Gifts_Amount,
SUM(CASE WHEN Orders.CalculatePrice = 0 THEN Orders.Quantity ELSE 0 END) AS No_of_Gift_Orders
FROM
Orders, WorkPeriods
WHERE
Orders.CreatedDateTime BETWEEN WorkPeriods.StartDate AND WorkPeriods.EndDate
GROUP BY
WorkPeriods.Id, WorkPeriods.StartDate
Any advice on how to continue? I have already tried merging them using all 3 tables and all sum-count conditions but the result I get is wrong. I need all results to appear on the same row. Attached are query results
You can't just join them all in one query, because you will get incorrect values as soon as there are multiple orders or payments in the same work period.
You could use the current queries as sub queries, and full join them to get the result. By using full join you get any results that are only on one table and not the other.
Select ISNULL(Pay.Z_Number, Ord.Z_Number) As Z_Number,
ISNULL(Pay.Z_Date, Ord.Z_Date) as Z_Date,
Pay.Cash_Payments,
Pay.No_of_Tickets_Cash,
Ord.Gifts_Amount
--other fields as appropriate
FROM (
--Query 1 here
) AS Pay
FULL OUTER JOIN (
--Query 2 here
) as Ord ON Pay.Z_Number = Ord.Z_Number and Pay.Z_Date = Ord.Z_Date
Another way to do this is to create one subquery that holds the data from both payments and orders unioned together, and then sum the resulting list in the outer query.
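A minimal sketch of that union approach, assuming SQLite via Python's sqlite3 module in place of SQL Server. The schema is simplified and hypothetical (integers stand in for datetimes, and only two of the measures are kept). Each branch emits one row per payment or order, pads the other source's measure with 0, and the outer query sums per work period, so payments and orders are never multiplied against each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema: integers stand in for datetimes.
    CREATE TABLE WorkPeriods (Id INTEGER, StartDate INTEGER, EndDate INTEGER);
    CREATE TABLE Payments (Name TEXT, Amount INTEGER, Date INTEGER);
    CREATE TABLE Orders (CalculatePrice INTEGER, Quantity INTEGER, Price INTEGER,
                         CreatedDateTime INTEGER);
    INSERT INTO WorkPeriods VALUES (1, 0, 9), (2, 10, 19);
    INSERT INTO Payments VALUES ('Cash', 50, 1), ('Cash', 30, 2), ('Credit Card', 20, 12);
    INSERT INTO Orders VALUES (0, 2, 5, 3), (1, 1, 99, 4), (0, 1, 7, 5);
""")

rows = conn.execute("""
    SELECT Z_Number,
           SUM(Cash_Payments) AS Cash_Payments,
           SUM(Gifts_Amount)  AS Gifts_Amount
    FROM (
        -- One row per payment; the order measure is padded with 0.
        SELECT w.Id AS Z_Number,
               CASE WHEN p.Name = 'Cash' THEN p.Amount ELSE 0 END AS Cash_Payments,
               0 AS Gifts_Amount
        FROM Payments p
        JOIN WorkPeriods w ON p.Date BETWEEN w.StartDate AND w.EndDate
        UNION ALL
        -- One row per order; the payment measure is padded with 0.
        SELECT w.Id,
               0,
               CASE WHEN o.CalculatePrice = 0 THEN o.Quantity * o.Price ELSE 0 END
        FROM Orders o
        JOIN WorkPeriods w ON o.CreatedDateTime BETWEEN w.StartDate AND w.EndDate
    ) AS combined
    GROUP BY Z_Number
    ORDER BY Z_Number
""").fetchall()

print(rows)  # [(1, 80, 17), (2, 0, 0)]
```

In this sample, work period 1 has two payments and three orders. Joining all three tables directly would produce six rows for that period and inflate both sums, which is the problem described above.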
The sample query below may be helpful:
SELECT
MAIN_T.Z_Number
,MAIN_T.Z_Date
,T1.Cash_Payments
,T1.Credit_Card_Payments
,T1.No_of_Tickets_Cash
,T1.No_of_Tickets_Credit_Card
,T2.Gifts_Amount
,T2.No_of_Gift_Orders
FROM
(SELECT DISTINCT
WorkPeriods.Id AS Z_Number,
CONVERT(VARCHAR, (CONVERT(DATE, WorkPeriods.StartDate, 103)), 103) AS Z_Date
FROM
Payments, WorkPeriods
WHERE
Payments.Date BETWEEN WorkPeriods.StartDate AND WorkPeriods.EndDate ) MAIN_T
LEFT JOIN
(SELECT
WorkPeriods.Id AS Z_Number,
CONVERT(VARCHAR, (CONVERT(DATE, WorkPeriods.StartDate, 103)), 103) AS Z_Date,
SUM(CASE WHEN Payments.Name = 'Cash' THEN Payments.Amount ELSE 0 END) AS Cash_Payments,
COUNT(CASE WHEN Payments.Name = 'Cash' THEN 1 END) AS No_of_Tickets_Cash,
SUM(CASE WHEN Payments.Name = 'Credit Card' THEN Payments.Amount ELSE 0 END) AS Credit_Card_Payments,
COUNT(CASE WHEN Payments.Name = 'Credit Card' THEN 1 END) AS No_of_Tickets_Credit_Card
FROM
Payments, WorkPeriods
WHERE
Payments.Date BETWEEN WorkPeriods.StartDate AND WorkPeriods.EndDate
GROUP BY
WorkPeriods.Id, WorkPeriods.StartDate) T1
ON MAIN_T.Z_Number=T1.Z_Number AND MAIN_T.Z_Date=T1.Z_Date
LEFT JOIN
(SELECT
WorkPeriods.Id AS Z_Number,
CONVERT(VARCHAR, (CONVERT(DATE, WorkPeriods.StartDate, 103)), 103) AS Z_Date,
SUM(CASE WHEN Orders.CalculatePrice = 0 THEN Orders.Quantity * Orders.Price ELSE 0 END) AS Gifts_Amount,
SUM(CASE WHEN Orders.CalculatePrice = 0 THEN Orders.Quantity ELSE 0 END) AS No_of_Gift_Orders
FROM
Orders, WorkPeriods
WHERE
Orders.CreatedDateTime BETWEEN WorkPeriods.StartDate AND WorkPeriods.EndDate
GROUP BY
WorkPeriods.Id, WorkPeriods.StartDate) T2
ON MAIN_T.Z_Number=T2.Z_Number AND MAIN_T.Z_Date=T2.Z_Date

how to select top 10 without duplicates

Using SQL Server 2012
I need to select the TOP 10 producers based on ProducerCode. But the data is messy: users entered the same producers with different spellings under the same ProducerCode.
I only need the TOP 10, so if a ProducerCode repeats, I just want to pick the first one in the list.
How can I achieve that?
Sample of my data
;WITH cte_TopWP --T
AS
(
SELECT distinct ProducerCode, Producer,SUM(premium) as NetWrittenPremium,
SUM(CASE WHEN PolicyType = 'New Business' THEN Premium ELSE 0 END) as NewBusiness1,
SUM(CASE WHEN PolicyType = 'Renewal' THEN Premium ELSE 0 END) as Renewal1,
SUM(CASE WHEN PolicyType = 'Rewrite' THEN Premium ELSE 0 END) as Rewrite1
FROM ProductionReportMetrics
WHERE YEAR(EffectiveDate) = 2016 AND TransactionType = 'Policy' AND CompanyLine = 'Arch Insurance Company'--AND ProducerType = 'Wholesaler'
GROUP BY ProducerCode,Producer
)
,
cte_Counts --C
AS
(
SELECT distinct ProducerCode, ProducerName, COUNT (distinct ControlNo) as Submissions2,
SUM(CASE WHEN QuotedPremium IS NOT NULL THEN 1 ELSE 0 END) as Quoted2,
SUM(CASE WHEN Type = 'New Business' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as NewBusiness2,
SUM(CASE WHEN Type = 'Renewal' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as Renewal2,
SUM(CASE WHEN Type = 'Rewrite' AND Status IN ('Bound','Cancelled','Notice of Cancellation') THEN 1 ELSE 0 END ) as Rewrite2,
SUM(CASE WHEN Status = 'Declined' THEN 1 ELSE 0 END ) as Declined2
FROM ClearanceReportMetrics
WHERE YEAR(EffectiveDate)=2016 AND CompanyLine = 'Arch Insurance Company'
GROUP BY ProducerCode,ProducerName
)
SELECT top 10 RANK() OVER (ORDER BY NetWrittenPremium desc) as Rank,
t.ProducerCode,
c.ProducerName as 'Producer',
NetWrittenPremium,
t.NewBusiness1,
t.Renewal1,
t.Rewrite1,
c.[NewBusiness2]+c.[Renewal2]+c.[Rewrite2] as PolicyCount,
c.Submissions2,
c.Quoted2,
c.[NewBusiness2],
c.Renewal2,
c.Rewrite2,
c.Declined2
FROM cte_TopWP t --LEFT OUTER JOIN tblProducers p on t.ProducerCode=p.ProducerCode
LEFT OUTER JOIN cte_Counts c ON t.ProducerCode=c.ProducerCode
You should use ROW_NUMBER to fix your issue.
https://msdn.microsoft.com/en-us/library/ms186734.aspx
A good example of this is the following answer:
https://dba.stackexchange.com/a/22198
Here's the code example from the answer.
SELECT * FROM
(
SELECT acss_lookup.ID AS acss_lookupID,
ROW_NUMBER() OVER
(PARTITION BY your_distinct_column ORDER BY any_column_you_think_is_appropriate)
as num,
acss_lookup.product_lookupID AS acssproduct_lookupID,
acss_lookup.region_lookupID AS acssregion_lookupID,
acss_lookup.document_lookupID AS acssdocument_lookupID,
product.ID AS product_ID,
product.parent_productID AS productparent_product_ID,
product.label AS product_label,
product.displayheading AS product_displayheading,
product.displayorder AS product_displayorder,
product.display AS product_display,
product.ignorenewupdate AS product_ignorenewupdate,
product.directlink AS product_directlink,
product.directlinkURL AS product_directlinkURL,
product.shortdescription AS product_shortdescription,
product.logo AS product_logo,
product.thumbnail AS product_thumbnail,
product.content AS product_content,
product.pdf AS product_pdf,
product.language_lookupID AS product_language_lookupID,
document.ID AS document_ID,
document.shortdescription AS document_shortdescription,
document.language_lookupID AS document_language_lookupID,
document.document_note AS document_document_note,
document.displayheading AS document_displayheading
FROM acss_lookup
INNER JOIN product ON (acss_lookup.product_lookupID = product.ID)
INNER JOIN document ON (acss_lookup.document_lookupID = document.ID)
)a
WHERE a.num = 1
ORDER BY product_displayheading ASC;
You could do this:
SELECT ProducerCode, MIN(Producer) AS Producer, ...
GROUP BY ProducerCode
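A minimal sketch of this idea, assuming SQLite via Python's sqlite3 module in place of SQL Server (so LIMIT 10 replaces TOP 10). The table is cut down to three columns and the rows are hypothetical. Grouping by ProducerCode alone collapses the spelling variants into one row, and MIN(Producer) picks a single spelling to display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: code 'A1' was entered under two spellings.
    CREATE TABLE ProductionReportMetrics (ProducerCode TEXT, Producer TEXT, Premium INTEGER);
    INSERT INTO ProductionReportMetrics VALUES
        ('A1', 'Acme Brokers', 100),
        ('A1', 'ACME Brokers Inc', 50),
        ('B2', 'Beta Agency', 120);
""")

rows = conn.execute("""
    SELECT ProducerCode,
           MIN(Producer) AS Producer,      -- one spelling per code
           SUM(Premium)  AS NetWrittenPremium
    FROM ProductionReportMetrics
    GROUP BY ProducerCode                  -- the name is no longer a grouping key
    ORDER BY NetWrittenPremium DESC
    LIMIT 10                               -- SQLite's counterpart of TOP 10
""").fetchall()

print(rows)  # [('A1', 'ACME Brokers Inc', 150), ('B2', 'Beta Agency', 120)]
```

Which spelling MIN returns depends on the collation, so it is arbitrary rather than "first entered". In the original query the same change would probably be needed in both CTEs, since cte_Counts also groups by ProducerName and can still return several rows per code.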

Count Distinct with Answer side by side instead of underneath

Here is my query:
SELECT substring(date,1,10), count(distinct id),
CASE WHEN name IS NOT NULL THEN 1 ELSE 0 END
FROM table
WHERE (date >= '2015-09-01')
GROUP BY substring(date,1,10), CASE WHEN name IS NOT NULL THEN 1 ELSE 0 END
ORDER BY substring(date,1,10)
This is my result:
substring count case
2015-09-01 20472 0
2015-09-01 7 1
2015-09-02 20465 0
2015-09-02 470 1
What I want it to look like is this:
substring count count
2015-09-01 20472 7
2015-09-02 20465 470
Thank you!
With PostgreSQL 9.4 or newer, we can filter an aggregate directly with the FILTER clause:
SELECT substring(date,1,10),
count(distinct id),
count(*) FILTER (WHERE name IS NOT NULL)
FROM table
WHERE (date >= '2015-09-01')
GROUP BY 1
ORDER BY 1
SELECT substring(date,1,10)
, count(distinct CASE WHEN name IS NOT NULL THEN id ELSE null END ) AS count1
, count(distinct CASE WHEN name IS NOT NULL THEN null ELSE id END ) AS count2
FROM event
WHERE (date >= '2015-09-01')
GROUP BY substring(date,1,10)
ORDER BY substring(date,1,10)
This gave me an answer like this: (which is exactly what I wanted so thank you so much)
substring count1 count2
2015-09-01 7 20472
2015-09-02 470 20465
Use CASE inside COUNT to get one column per condition (here, name IS NOT NULL), like this:
SELECT substring(date,1,10)
, count(distinct CASE WHEN name IS NOT NULL THEN id ELSE null END ) AS count1
, count(distinct CASE WHEN name IS NOT NULL THEN null ELSE id END ) AS count2
FROM table
WHERE (date >= '2015-09-01')
GROUP BY substring(date,1,10)
ORDER BY substring(date,1,10)
You can also use a subquery to create the columns:
SELECT dt, Count(id1) count1, Count(distinct id2) count2
FROM (
SELECT distinct substring(date,1,10) AS dt
, CASE WHEN name IS NOT NULL THEN id ELSE null END AS id1
, CASE WHEN name IS NOT NULL THEN null ELSE id END AS id2
FROM table
WHERE (date >= '2015-09-01')) d
GROUP BY dt
ORDER BY dt
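To see the side-by-side layout on a small data set, here is a sketch that assumes SQLite via Python's sqlite3 module rather than PostgreSQL. The event table and its five rows are hypothetical, and substr() is SQLite's spelling of substring(). Moving the condition out of GROUP BY and into each COUNT gives one row per day with two count columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample; 'event' is used as the table name because TABLE is a reserved word.
    CREATE TABLE event (id INTEGER, name TEXT, date TEXT);
    INSERT INTO event VALUES
        (1, NULL,  '2015-09-01 08:00:00'),
        (2, NULL,  '2015-09-01 09:00:00'),
        (2, NULL,  '2015-09-01 09:30:00'),
        (3, 'ann', '2015-09-01 10:00:00'),
        (4, 'bob', '2015-09-02 11:00:00');
""")

rows = conn.execute("""
    SELECT substr(date, 1, 10) AS day,
           COUNT(DISTINCT CASE WHEN name IS NOT NULL THEN id END) AS with_name,
           COUNT(DISTINCT CASE WHEN name IS NULL THEN id END)     AS without_name
    FROM event
    WHERE date >= '2015-09-01'
    GROUP BY day      -- only the day is a grouping key, so each day yields one row
    ORDER BY day
""").fetchall()

print(rows)  # [('2015-09-01', 1, 2), ('2015-09-02', 1, 0)]
```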

Count based on Or is not differentiating the count

My results are showing both counts the same but there should be some that have different counts as CarCode is sometimes null.
SELECT distinct car.carKey,
car.Weight,
car.CarCode,
COUNT(car.carKey)OVER(PARTITION BY car.carKey) AS TotalCarKeyCount,
COUNT(Case When (car.[Weight] IS not null) and (car.CarCode is null) as CarCountWithoutCode
then 0
else car.carKey End) OVER(PARTITION BY car.carKey) AS CarCount
from car
The results always show TotalCarKeyCount and CarCountWithoutCode with the same counts, as if the CASE expression isn't working.
It sounds like you might want to use SUM() instead:
SELECT distinct car.carKey,
car.Weight,
car.CarCode,
COUNT(car.carKey)OVER(PARTITION BY car.carKey) AS TotalCarKeyCount,
SUM(Case When (car.[Weight] IS not null) and (car.CarCode is null)
then 0 else 1 End) OVER(PARTITION BY car.carKey) AS CarCount
from car
SQL Fiddle demo showing the difference between using COUNT() and SUM():
create table test
(
id int
);
insert into test values
(1), (null), (23), (4), (2);
select
count(case when id is null then 0 else id end) [count],
sum(case when id is null then 0 else 1 end) [sum]
from test;
COUNT returns 5 and SUM returns 4. Alternatively, change the CASE inside COUNT() to return null, and the null values will be excluded from the final count:
select
count(case when id is null then null else id end) [count],
sum(case when id is null then 0 else 1 end) [sum]
from test;
Your query would be:
SELECT distinct car.carKey,
car.Weight,
car.CarCode,
COUNT(car.carKey)OVER(PARTITION BY car.carKey) AS TotalCarKeyCount,
COUNT(Case When (car.[Weight] IS not null) and (car.CarCode is null)
then null else 1 End) OVER(PARTITION BY car.carKey) AS CarCount
from car
Change the then 0 to then null. Zero values are counted, nulls are not.
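The same point can be checked against a car-shaped table. This sketch assumes SQLite via Python's sqlite3 module rather than SQL Server, and it uses a plain GROUP BY instead of the window function to keep it portable; the three rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (carKey INTEGER, Weight INTEGER, CarCode TEXT);
    INSERT INTO car VALUES
        (1, 900, 'X'),
        (1, 950, NULL),   -- has a weight but no code
        (1, 980, NULL);   -- has a weight but no code
""")

row = conn.execute("""
    SELECT COUNT(carKey) AS total,
           COUNT(CASE WHEN Weight IS NOT NULL AND CarCode IS NULL
                      THEN 0 ELSE carKey END)    AS with_zero,
           COUNT(CASE WHEN Weight IS NOT NULL AND CarCode IS NULL
                      THEN NULL ELSE carKey END) AS with_null
    FROM car
    GROUP BY carKey
""").fetchone()

total, with_zero, with_null = row
print(total, with_zero, with_null)  # 3 3 1
```

With THEN 0 every row still produces a non-null value, so the conditional count equals the total. With THEN NULL the two rows without a code drop out.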