SQL how to display group by results in columns postgresSQL - postgresql

I used a** group by **id and year in a SQL query to display the following table :
QueySQL
select s.id as societe, typecombustible,extract(YEAR from p.datedebut) as yearrr
,sum(quantiteconsommee) as somme
from sch_consomind.consommationcombustible, sch_referentiel.societe s, sch_referentiel.unite u,sch_referentiel.periode p
where unite=u.id and s.id=u.societe_id and p.id=periode
group by s.id, typecombustible, yearrr
order by yearrr
But, I want to display the result by columns, like the following table
Searching in google and StackOverflow I found PIVOT function which is available in SQL Server, but I use PostgreSQL

You can use filtered aggregation:
select s.id as societe,
c.typecombustible,
sum(c.quantiteconsommee) filter (where extract(YEAR from p.datedebut) = 2020) as "2020",
sum(c.quantiteconsommee) filter (where extract(YEAR from p.datedebut) = 2021) as "2021",
sum(c.quantiteconsommee) filter (where extract(YEAR from p.datedebut) = 2022) as "2022"
from sch_consomind.consommationcombustible c
join sch_referentiel.societe s on c.unite = u.id
join sch_referentiel.unite u on s.id = u.societe_id
join sch_referentiel.periode p on p.id = c.periode
group by s.id, c.typecombustible
order by s.id, c.typecombustible;
And before you ask: no, this can not be made "dynamic". A fundamental restriction of the SQL language is, that the number, names and data types of all columns of a query must be known before the database starts retrieving the data.

Related

Postgres string_agg function not recognized as aggregate function

I am attempting to run this query
SELECT u.*, string_agg(CAST(uar.roleid AS VARCHAR(100)), ',') AS roleids, string_agg(CAST(r.role AS VARCHAR(100)), ',') AS systemroles
FROM idpro.users AS u
INNER JOIN idpro.userapplicationroles AS uar ON u.id = uar.userid
INNER JOIN idpro.roles AS r ON r.id = uar.roleid
GROUP BY u.id, uar.applicationid
HAVING u.organizationid = '77777777-f892-4f4a-8328-c31df32bd6ba'
AND uar.applicationid = 'd88fbf05-c048-4697-8bf3-036f39897183'
AND (u.statusid = '7f9f0b75-44b7-4216-bf2a-03abc47dcff8')
AND uar.roleid IN ('cc9ada1c-fa21-400b-be98-c563ebb65a9c','de087148-4788-43da-89e2-dd7dff097735');
However, I'm getting an error stating that
ERROR: column "uar.roleid" must appear in the GROUP BY clause or be used in an aggregate function
LINE 9: AND uar.roleid IN ('cc9ada1c-fa21-400b-be98-c563ebb65a9c','...
string_agg() IS an aggregate function, is it not? My intent, if it isn't obvious, is to return each user record with the roleids and rolenames in comma-delimited lists. If I am doing everything wrong, could you please point me in the right direction?
You are filtering the data, so a WHERE clause would be needed. This tutorial is worth reading.
SELECT u.*,
string_agg(CAST(uar.roleid AS VARCHAR(100)), ',') AS roleids,
string_agg(CAST(r.role AS VARCHAR(100)), ',') AS systemroles
FROM idpro.users AS u
INNER JOIN idpro.userapplicationroles AS uar ON u.id = uar.userid
INNER JOIN idpro.roles AS r ON r.id = uar.roleid
WHERE u.organizationid = '77777777-f892-4f4a-8328-c31df32bd6ba'
AND uar.applicationid = 'd88fbf05-c048-4697-8bf3-036f39897183'
AND (u.statusid = '7f9f0b75-44b7-4216-bf2a-03abc47dcff8')
AND uar.roleid IN ('cc9ada1c-fa21-400b-be98-c563ebb65a9c','de087148-4788-43da-89e2-dd7dff097735');
GROUP BY u.id, uar.applicationid
The HAVING clause is helpful for filtering the aggregated values or the groups.
Since you are grouping by u.id, the table primary key you have access to every column of the u table. You can either use a where clause or a having clause.
For uar.applicationid, it is part of the group by so you can also use either a where or a having.
uar.roleid is not part of the group by clause, so to be usable in the having clause, you would have to consider the aggregated value.
The following example filters out rows whose aggregated length is more than 10 chars.
HAVING length(string_agg(CAST(uar.roleid AS VARCHAR(100)), ',')) > 10
A more common usage, on numerical field, is to filter out if the number of aggregated rows is less than a threshold (having count(*) > 2) or a sum of some kind (having sum(vacation_days) > 21)

Get distinct row by primary key, but use value from another column

I'm trying to get the sum of the total time that was spent sending all emails within a campaign.
Because of the joins in my query I end up with the 'processing_time' column duplicated over many rows. So running sum(s.processing_time) as send_time will always over represent how long it took to run.
select
c.id,
c.sender,
c.subject,
count(*) as total_items,
count(distinct s.id) as sends,
sum(s.processing_time) as send_time,
from campaigns c
left join sends s on c.id = s.campaigns_id
left join opens o on s.id = o.sends_id
group by c.id;
I'd ideally like to do something like sum(s.processing_time when distinct s.id) but I can't quite work out how to achieve that.
I have made other attempts using case but I always run into the same issue, I need to get the distinct rows based on the ID column, but work with another column.
Since you want statistics related to distinct s.id as well as c.id, group by both columns. Collect the (intermediate) data that you need,
and use this table as the inner table in a nested sub-select query.
In the outer select, group by c.id alone.
Since the inner select groups by s.id, values which are unique per s.id will not get double-counted when you sum/group by c.id.
SELECT id
, sender
, subject
, sum(total_items) as total_items
, sum(sends) as sends
, sum(processing_time) as send_time
FROM (
SELECT
c.id
, s.id as sid
, count(*) as total_items
, 1 as sends
, s.processing_time
, c.sender
, c.subject
FROM campaigns c
LEFT JOIN sends s on c.id = s.campaigns_id
LEFT JOIN opens o on s.id = o.sends_id
GROUP BY c.id, c.sender, c.subject, s.processing_time, s.id) t
GROUP BY id, sender, subject
ORDER BY id
Since the final table includes sender and subject, you'll need to group by these columns as well to avoid an error such as:
ERROR: column "c.sender" must appear in the GROUP BY clause or be used in an aggregate function
LINE 14: , c.sender

Unable to get Percentile_Cont() to work in Postgresql

I am trying to calculate a percentile using the percentile_cont() function in PostgreSQL using common table expressions. The goal is find the top 1% of accounts regards to their balances (called amount here). My logic is to find the 99th percentile which will return those whose account balances are greater than 99% of their peers (and thus finding the 1 percenters)
Here is my query
--ranking subquery works fine
with ranking as(
select a.lname,sum(c.amount) as networth from customer a
inner join
account b on a.customerid=b.customerid
inner join
transaction c on b.accountid=c.accountid
group by a.lname order by sum(c.amount)
)
select lname, networth, percentile_cont(0.99) within group
order by networth over (partition by lname) from ranking ;
I keeping getting the following error.
ERROR: syntax error at or near "order"
LINE 2: ...ame, networth, percentile_cont(0.99) within group order by n..
I am thinking that perhaps I forgot a closing brace etc. but I can't seem to figure out where. I know it could be something with the order keyword but I am not sure what to do. Can you please help me to fix this error?
This tripped me up, too.
It turns out percentile_cont is not supported in postgres 9.3, only in 9.4+.
https://www.postgresql.org/docs/9.4/static/release-9-4.html
So you have to use something like this:
with ordered_purchases as (
select
price,
row_number() over (order by price) as row_id,
(select count(1) from purchases) as ct
from purchases
)
select avg(price) as median
from ordered_purchases
where row_id between ct/2.0 and ct/2.0 + 1
That query care of https://www.periscopedata.com/blog/medians-in-sql (section: "Median on Postgres")
You are missing the brackets in the within group (order by x) part.
Try this:
with ranking
as (
select a.lname,
sum(c.amount) as networth
from customer a
inner join account b on a.customerid = b.customerid
inner join transaction c on b.accountid = c.accountid
group by a.lname
order by networth
)
select lname,
networth,
percentile_cont(0.99) within group (
order by networth
) over (partition by lname)
from ranking;
I want to point out that you don't need a subquery for this:
select c.lname, sum(t.amount) as networth,
percentile_cont(0.99) within group (order by sum(t.amount)) over (partition by lname)
from customer c inner join
account a
on c.customerid = a.customerid inner join
transaction t
on a.accountid = t.accountid
group by c.lname
order by networth;
Also, when using table aliases (which should be always), table abbreviations are much easier to follow than arbitrary letters.

Can't solve this SQL query

I have a difficulty dealing with a SQL query. I use PostgreSQL.
The query says: Show the customers that have done at least an order that contains products from 3 different categories. The result will be 2 columns, CustomerID, and the amount of orders. I have written this code but I don't think it's correct.
select SalesOrderHeader.CustomerID,
count(SalesOrderHeader.SalesOrderID) AS amount_of_orders
from SalesOrderHeader
inner join SalesOrderDetail on
(SalesOrderHeader.SalesOrderID=SalesOrderDetail.SalesOrderID)
inner join Product on
(SalesOrderDetail.ProductID=Product.ProductID)
where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
from Product
group by ProductCategoryID
having count(DISTINCT ProductCategoryID)>=3)
group by SalesOrderHeader.CustomerID;
Here are the database tables needed for the query:
where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
Is never going to give you a result as an ID (SalesOrderDetailID) will never logically match a COUNT (count(ProductCategoryID)).
This should get you the output I think you want.
SELECT soh.CustomerID, COUNT(soh.SalesOrderID) AS amount_of_orders
FROM SalesOrderHeader soh
INNER JOIN SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Product p ON sod.ProductID = p.ProductID
HAVING COUNT(DISTINCT p.ProductCategoryID) >= 3
GROUP BY soh.CustomerID
Try this :
select CustomerID,count(*) as amount_of_order from
SalesOrder join
(
select SalesOrderID,count(distinct ProductCategoryID) CategoryCount
from SalesOrderDetail JOIN Product using (ProductId)
group by 1
) CatCount using (SalesOrderId)
group by 1
having bool_or(CategoryCount>=3) -- At least on CategoryCount>=3

MS Access INNER JOIN most recent entry

I'm having some trouble trying to get Microsoft Access 2007 to accept my SQL query but it keeps throwing syntax errors at me that don't help me correct the problem.
I have two tables, let's call them Customers and Orders for ease.
I need some customer details, but also a few details from the most recent order. I currently have a query like this:
SELECT c.ID, c.Name, c.Address, o.ID, o.Date, o.TotalPrice
FROM Customers c
INNER JOIN Orders o
ON c.ID = o.CustomerID
AND o.ID = (SELECT TOP 1 ID FROM Orders WHERE CustomerID = c.ID ORDER BY Date DESC)
To me, it appears valid, but Access keeps throwing 'syntax error's at me and when I hit OK, it selects a piece of the SQL text that doesn't even relate to it.
If I take the extra SELECT clause out it works but is obviously not what I need.
Any ideas?
You cannot use AND in that way in MS Access, change it to WHERE. In addition, you have two reserved words in your column (field) names - Name, Date. These should be enclosed in square brackets when not prefixed by a table name or alias, or better, renamed.
SELECT c.ID, c.Name, c.Address, o.ID, o.Date, o.TotalPrice
FROM Customers c
INNER JOIN Orders o
ON c.ID = o.CustomerID
WHERE o.ID = (
SELECT TOP 1 ID FROM Orders
WHERE CustomerID = c.ID ORDER BY [Date] DESC)
I worked out how to do it in Microsoft Access. You INNER JOIN on a pre-sorted sub-query. That way you don't have to do multiple ON conditions which aren't supported.
SELECT c.ID, c.Name, c.Address, o.OrderNo, o.OrderDate, o.TotalPrice
FROM Customers c
INNER JOIN (SELECT * FROM Orders ORDER BY OrderDate DESC) o
ON c.ID = o.CustomerID
How efficient this is another story, but it works...