Receiving an error code every time I try to execute a query (MySQL Workbench)

Select c.CategoryName,COUNT(p1.ProductID) as CountOfProduct, MAX(p1.ListPrice) as MaxOfProduct
From Categories as c1 JOIN products as p1
ON c1.CategoryID=p1.CategoryID
Group By c1.CategoryName
Order By CountOfProduct DESC
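
The post does not quote the error, so the following is only a guess at the cause: the select list refers to the alias c, but the FROM clause declares c1. MySQL would normally report this as an unknown column. The sketch below reproduces the same class of error on SQLite through Python's sqlite3 module (an assumption made so the example is self-contained; the table contents are hypothetical) and then runs the query with the alias corrected.

```python
import sqlite3

# Hypothetical minimal schema: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE products (ProductID INTEGER PRIMARY KEY, CategoryID INTEGER, ListPrice REAL);
    INSERT INTO Categories VALUES (1, 'Guitars'), (2, 'Drums');
    INSERT INTO products VALUES (1, 1, 699.0), (2, 1, 1199.0), (3, 2, 799.0);
""")

query = """
    SELECT {alias}.CategoryName,
           COUNT(p1.ProductID) AS CountOfProduct,
           MAX(p1.ListPrice)   AS MaxOfProduct
    FROM Categories AS c1
    JOIN products AS p1 ON c1.CategoryID = p1.CategoryID
    GROUP BY c1.CategoryName
    ORDER BY CountOfProduct DESC
"""

# Original form: the select list uses alias "c", which is never declared.
try:
    conn.execute(query.format(alias="c"))
    original_error = None
except sqlite3.OperationalError as exc:
    original_error = str(exc)

# Corrected form: use the declared alias "c1" everywhere.
fixed_rows = conn.execute(query.format(alias="c1")).fetchall()
```

If the real error is something else (a missing table, for instance), this fix would not apply.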

Related

Postgres, aggregate function and ORDER BY

The statement works for me:
SELECT e.id, e.title, array_agg(d.start_date) AS dates, array_agg(d.id) AS ids
FROM event e JOIN event_date d ON e.id = d.event_id
GROUP BY e.id
I receive the results
id | title        | dates                   | ids
---+--------------+-------------------------+------
1  | First Event  | {2022-06-05,2022-10-05} | {1,2}
2  | Second Event | {2022-07-05}            | {3}
I want to order events by start_date. For that, I add ORDER BY d.start_date DESC to the statement:
SELECT e.id, e.title, array_agg(d.start_date) AS dates, array_agg(d.id) AS ids
FROM event e JOIN event_date d ON e.id = d.event_id
GROUP BY e.id
ORDER BY d.start_date DESC
And I receive the error message:
ERROR: column "d.start_date" must appear in the GROUP BY clause or be used in an aggregate function
LINE 4: ORDER BY d.start_date DESC
I don't understand it. array_agg is an aggregate function. How can I solve this?

This seems to work: sort the dates inside the aggregate, then order the rows by the resulting array.
SELECT e.id, e.title, array_agg(d.start_date ORDER BY d.start_date ASC) AS dates
FROM event e JOIN event_date d ON e.id = d.event_id
GROUP BY e.id
ORDER BY dates ASC
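
The error arises because, after GROUP BY, each event is a single row and d.start_date no longer names a single value, so ORDER BY has to reference an aggregate (or an alias of one). As an alternative sketch, the rows can be ordered by MIN or MAX of the date. The example below runs on SQLite via Python's sqlite3 module (an assumption made for self-containment, which is why it does not use array_agg), with the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE event_date (id INTEGER PRIMARY KEY, event_id INTEGER, start_date TEXT);
    INSERT INTO event VALUES (1, 'First Event'), (2, 'Second Event');
    INSERT INTO event_date VALUES
        (1, 1, '2022-06-05'), (2, 1, '2022-10-05'), (3, 2, '2022-07-05');
""")

# Order the grouped rows by an aggregate of the date, here the earliest one.
rows = conn.execute("""
    SELECT e.id, e.title,
           MIN(d.start_date) AS first_date,
           MAX(d.start_date) AS last_date
    FROM event e
    JOIN event_date d ON e.id = d.event_id
    GROUP BY e.id, e.title
    ORDER BY MIN(d.start_date) DESC
""").fetchall()
```

Swapping MIN for MAX in the ORDER BY clause would order events by their latest date instead.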

Get distinct row by primary key, but use value from another column

I'm trying to get the sum of the total time that was spent sending all emails within a campaign.
Because of the joins in my query, I end up with the 'processing_time' column duplicated over many rows, so sum(s.processing_time) as send_time will always overstate how long the campaign took to run.
select
    c.id,
    c.sender,
    c.subject,
    count(*) as total_items,
    count(distinct s.id) as sends,
    sum(s.processing_time) as send_time
from campaigns c
left join sends s on c.id = s.campaigns_id
left join opens o on s.id = o.sends_id
group by c.id;
I'd ideally like to do something like sum(s.processing_time when distinct s.id) but I can't quite work out how to achieve that.
I have made other attempts using case, but I always run into the same issue: I need to pick distinct rows based on the ID column while aggregating a different column.

Since you want statistics related to distinct s.id as well as c.id, group by both columns. Collect the (intermediate) data that you need, and use this table as the inner table in a nested sub-select query.
In the outer select, group by c.id alone.
Since the inner select groups by s.id, values which are unique per s.id will not get double-counted when you sum/group by c.id.
SELECT id
     , sender
     , subject
     , sum(total_items) as total_items
     , sum(sends) as sends
     , sum(processing_time) as send_time
FROM (
    SELECT
        c.id
      , s.id as sid
      , count(*) as total_items
      , 1 as sends
      , s.processing_time
      , c.sender
      , c.subject
    FROM campaigns c
    LEFT JOIN sends s on c.id = s.campaigns_id
    LEFT JOIN opens o on s.id = o.sends_id
    GROUP BY c.id, c.sender, c.subject, s.processing_time, s.id) t
GROUP BY id, sender, subject
ORDER BY id
Since the final table includes sender and subject, you'll need to group by these columns as well to avoid an error such as:
ERROR: column "c.sender" must appear in the GROUP BY clause or be used in an aggregate function
LINE 14: , c.sender
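
To make the double counting concrete, here is a small sketch that runs both the naive sum and the two-level grouping on SQLite through Python's sqlite3 module. The sample rows and the sender address are hypothetical: one campaign with two sends (10 and 20 time units), where the first send was opened three times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, sender TEXT, subject TEXT);
    CREATE TABLE sends (id INTEGER PRIMARY KEY, campaigns_id INTEGER, processing_time INTEGER);
    CREATE TABLE opens (id INTEGER PRIMARY KEY, sends_id INTEGER);
    INSERT INTO campaigns VALUES (1, 'news@example.com', 'Hello');
    INSERT INTO sends VALUES (1, 1, 10), (2, 1, 20);
    -- send 1 was opened three times, send 2 never
    INSERT INTO opens VALUES (1, 1), (2, 1), (3, 1);
""")

joins = """
    FROM campaigns c
    LEFT JOIN sends s ON c.id = s.campaigns_id
    LEFT JOIN opens o ON s.id = o.sends_id
"""

# Naive: processing_time of send 1 is repeated once per open.
naive = conn.execute(
    f"SELECT c.id, SUM(s.processing_time) {joins} GROUP BY c.id"
).fetchall()

# Two-level: collapse to one row per send first, then sum per campaign.
two_level = conn.execute(f"""
    SELECT id, SUM(total_items), SUM(sends), SUM(processing_time)
    FROM (
        SELECT c.id, s.id AS sid, COUNT(*) AS total_items,
               1 AS sends, s.processing_time
        {joins}
        GROUP BY c.id, s.id, s.processing_time
    ) t
    GROUP BY id
""").fetchall()
```

The naive query sums 10 three times plus 20, while the nested version counts each send's processing_time once.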

How do I efficiently select the most recent record for a set of values in T-SQL?

I have a set of tables, each containing related data, and I need to select the most recent set of records for each row in the source table. There are millions of rows, so I need to do this efficiently, and so far I'm unable to return only the most recent date for a given number.
For example the current result for a given number is:
CampaignName | MobileNumber | Date
Campaign A   | 12345678910  | 12/02/2018 14:50:30
Campaign B   | 12345678910  | 05/02/2018 11:35:22
Only the row for Campaign A should be returned.
I'm essentially trying to get the most recent message sent for each mobile number and the campaign data for that message (each message is part of a campaign).
SELECT CC.campaignname,
       Co.mobilenumber,
       Max(M.msgcreatetime)
FROM [Database].[dbo].[messages] M WITH(nolock)
INNER JOIN dbo.messagecontact MC WITH(nolock)
    ON M.msgid = MC.messageid
INNER JOIN dbo.campaigncontact Co WITH(nolock)
    ON Co.contactid = MC.contactid
INNER JOIN dbo.campaign CC WITH(nolock)
    ON M.campaignid = CC.campaignid
GROUP BY CC.campaignname,
         Co.mobilenumber

Use TOP 1 WITH TIES and order by ROW_NUMBER().
TOP 1 WITH TIES returns every row that ties for the lowest value of the ORDER BY expression.
ROW_NUMBER() OVER(PARTITION BY Co.mobilenumber ORDER BY M.msgcreatetime DESC) returns 1 for the latest date for each Co.mobilenumber, 2 for the second latest, and so on, so every row numbered 1 is returned:
SELECT TOP 1 WITH TIES
       CC.campaignname,
       Co.mobilenumber,
       M.msgcreatetime
FROM [Database].[dbo].[messages] M WITH(nolock)
INNER JOIN dbo.messagecontact MC WITH(nolock)
    ON M.msgid = MC.messageid
INNER JOIN dbo.campaigncontact Co WITH(nolock)
    ON Co.contactid = MC.contactid
INNER JOIN dbo.campaign CC WITH(nolock)
    ON M.campaignid = CC.campaignid
ORDER BY ROW_NUMBER() OVER(PARTITION BY Co.mobilenumber ORDER BY M.msgcreatetime DESC)
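
The same ROW_NUMBER() idea can also be written as a filter on a derived table, which is a different form from TOP 1 WITH TIES but portable to engines that lack it. The sketch below runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module. Several things here are assumptions: the joins are collapsed into a single hypothetical messages table, the dates from the question are read as day/month/year and stored in ISO format, and the third row is invented to show a second mobile number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (campaignname TEXT, mobilenumber TEXT, msgcreatetime TEXT);
    INSERT INTO messages VALUES
        ('Campaign A', '12345678910', '2018-02-12 14:50:30'),
        ('Campaign B', '12345678910', '2018-02-05 11:35:22'),
        ('Campaign B', '10987654321', '2018-02-06 09:00:00');
""")

# Number each mobile number's messages from newest to oldest, keep number 1.
latest = conn.execute("""
    SELECT campaignname, mobilenumber, msgcreatetime
    FROM (
        SELECT m.*,
               ROW_NUMBER() OVER (
                   PARTITION BY mobilenumber
                   ORDER BY msgcreatetime DESC
               ) AS rn
        FROM messages m
    ) ranked
    WHERE rn = 1
    ORDER BY mobilenumber
""").fetchall()
```

Which form performs better on millions of rows depends on the indexes available, so it is worth comparing execution plans rather than assuming either one wins.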

Can't solve this SQL query

I have a difficulty dealing with a SQL query. I use PostgreSQL.
The task is: show the customers that have placed at least one order containing products from 3 different categories. The result should have 2 columns: CustomerID and the number of orders. I have written this code, but I don't think it's correct.
select SalesOrderHeader.CustomerID,
       count(SalesOrderHeader.SalesOrderID) AS amount_of_orders
from SalesOrderHeader
inner join SalesOrderDetail on
    (SalesOrderHeader.SalesOrderID=SalesOrderDetail.SalesOrderID)
inner join Product on
    (SalesOrderDetail.ProductID=Product.ProductID)
where SalesOrderDetail.SalesOrderDetailID in
    (select DISTINCT count(ProductCategoryID)
     from Product
     group by ProductCategoryID
     having count(DISTINCT ProductCategoryID)>=3)
group by SalesOrderHeader.CustomerID;
Here are the database tables needed for the query:

where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
This will never give you a result, because an ID (SalesOrderDetailID) will never logically match a COUNT (count(ProductCategoryID)).
This should get you the output I think you want:
-- DISTINCT counts each order once rather than once per detail row
SELECT soh.CustomerID, COUNT(DISTINCT soh.SalesOrderID) AS amount_of_orders
FROM SalesOrderHeader soh
INNER JOIN SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Product p ON sod.ProductID = p.ProductID
GROUP BY soh.CustomerID
HAVING COUNT(DISTINCT p.ProductCategoryID) >= 3

Try this:
select CustomerID, count(*) as amount_of_orders
from SalesOrderHeader
join (
    select SalesOrderID, count(distinct ProductCategoryID) CategoryCount
    from SalesOrderDetail join Product using (ProductID)
    group by 1
) CatCount using (SalesOrderID)
group by 1
having bool_or(CategoryCount>=3) -- at least one order with CategoryCount>=3
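
The difference between counting categories per order and per customer is easy to miss, so here is a small sketch of the per-order approach. It runs on SQLite through Python's sqlite3 module, which has no bool_or, so MAX(CategoryCount) >= 3 stands in for it (an assumption of this sketch, not part of the answer above). The sample rows are hypothetical: customer 1 has one order spanning three categories, while customer 2 reaches three categories only across two separate orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductCategoryID INTEGER);
    CREATE TABLE SalesOrderDetail (SalesOrderDetailID INTEGER PRIMARY KEY,
                                   SalesOrderID INTEGER, ProductID INTEGER);
    INSERT INTO Product VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO SalesOrderHeader VALUES (10, 1), (11, 1), (20, 2), (21, 2);
    INSERT INTO SalesOrderDetail VALUES
        (1, 10, 1), (2, 10, 2), (3, 10, 3),  -- order 10: three categories
        (4, 11, 1),                          -- order 11: one category
        (5, 20, 1), (6, 20, 2),              -- order 20: two categories
        (7, 21, 3);                          -- order 21: one category
""")

# Inner query: distinct categories per order.
# Outer query: keep customers whose best order reaches three categories.
rows = conn.execute("""
    SELECT CustomerID, COUNT(*) AS amount_of_orders
    FROM SalesOrderHeader
    JOIN (
        SELECT SalesOrderID, COUNT(DISTINCT ProductCategoryID) AS CategoryCount
        FROM SalesOrderDetail
        JOIN Product USING (ProductID)
        GROUP BY SalesOrderID
    ) CatCount USING (SalesOrderID)
    GROUP BY CustomerID
    HAVING MAX(CategoryCount) >= 3
""").fetchall()
```

Only customer 1 qualifies, and amount_of_orders counts all of that customer's orders, not just the qualifying one.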

SQL Server 2012 Passing parameter from main query to the Joined subquery

I need to select some settings from some joined tables, but only if the ItemID is among the first 1000 Items when ordered by EndTime DESC.
To do this, I built the following query which, although it can surely be improved, works:
SELECT ss.ModuleCode, ss.MaxItems , w.*
FROM Subscriptions ss
JOIN Sellers s ON s.UID=ss.UID
JOIN Items i ON s.UserID=i.UserID
JOIN Items ii ON i.ItemID=ii.ItemID
JOIN Modules mo ON ss.ModuleCode=mo.ModuleCode
JOIN Settings w ON w.UID=s.UID AND ss.ModuleCode=w.WCode
FULL JOIN GoogleFonts f ON f.FontCode=a.FontFamily
JOIN ( SELECT ItemID
       FROM Items
       WHERE UserID=#UserID
       ORDER BY EndTime DESC
       OFFSET 0 ROWS
       FETCH FIRST (1000) ROWS ONLY
     ) it ON it.ItemID=i.ItemID
WHERE it.ItemID=#ItemID
AND .....
But the limit is not always 1000; its value is defined by ss.MaxItems.
I would like to replace the fixed value of 1000 with the dynamic value of ss.MaxItems, but I haven't found a way to do it.
Although it is not optimal, since it makes the query much heavier, I tried replacing 1000 with a further subquery:
SELECT ss.ModuleCode, ss.MaxItems , w.*
FROM Subscriptions ss
JOIN Sellers s ON s.UID=ss.UID
JOIN Items i ON s.UserID=i.UserID
JOIN Items ii ON i.ItemID=ii.ItemID
JOIN Modules mo ON ss.ModuleCode=mo.ModuleCode
JOIN Settings w ON w.UID=s.UID AND ss.ModuleCode=w.WCode
FULL JOIN GoogleFonts f ON f.FontCode=a.FontFamily
JOIN ( SELECT ItemID
       FROM Items
       WHERE UserID=#UserID
       ORDER BY EndTime DESC
       OFFSET 0 ROWS
       FETCH FIRST ( SELECT ss.MaxItems
                     FROM Subscriptions ss
                     JOIN Sellers s ON s.UID=ss.UID
                     JOIN Items i ON s.UserID=i.UserID
                     JOIN Modules mo ON ss.ModuleCode=mo.ModuleCode
                     JOIN Settings w ON w.UID=s.UID AND ss.ModuleCode=w.WCode
                     WHERE i.ItemID=#ItemID) ROWS ONLY
     ) it ON it.ItemID=i.ItemID
Where it.ItemID=#ItemID
AND .....
But since this subquery returns more than one value, it is not accepted. Limiting the last subquery to TOP 1 would work, but would not be fully dynamic as required.
Can anyone suggest how to solve this, or at least point me toward a solution?
Thanks!

Instead of FETCH, use ROW_NUMBER() and compare it with ss.MaxItems in the join condition:
JOIN (SELECT ItemID,
             ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EndTime DESC) AS seqnum
      FROM Items
      WHERE UserID = #UserID
     ) it
     ON it.ItemID = i.ItemID AND it.seqnum <= ss.MaxItems
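
A minimal sketch of why this works: the row number is computed once per item, and the per-subscription limit is applied in the ON clause, where ss.MaxItems is in scope. The example runs on SQLite (3.25 or later) through Python's sqlite3 module and uses a deliberately simplified, hypothetical schema: Subscriptions carries UserID directly and the other joined tables are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Subscriptions (UserID INTEGER, MaxItems INTEGER);
    CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, UserID INTEGER, EndTime TEXT);
    INSERT INTO Subscriptions VALUES (1, 2), (2, 1);
    INSERT INTO Items VALUES
        (1, 1, '2024-01-01'), (2, 1, '2024-01-02'), (3, 1, '2024-01-03'),
        (4, 2, '2024-01-01'), (5, 2, '2024-01-02');
""")

# seqnum ranks each user's items from latest EndTime to earliest;
# the join keeps only the first MaxItems of them for each subscription.
rows = conn.execute("""
    SELECT ss.UserID, it.ItemID
    FROM Subscriptions ss
    JOIN (
        SELECT ItemID, UserID,
               ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EndTime DESC) AS seqnum
        FROM Items
    ) it ON it.UserID = ss.UserID AND it.seqnum <= ss.MaxItems
    ORDER BY ss.UserID, it.seqnum
""").fetchall()
```

User 1 (MaxItems 2) keeps its two latest items and user 2 (MaxItems 1) keeps one, without any scalar subquery in the limit.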