PostgreSQL query with double has_many relationships

I have this complex data relationship.
POSTGRESQL FIDDLE: https://www.db-fiddle.com/f/vm2z8qLuddzcHEgyaMnCbc/3
An "Item Group" has many "items" through the item_ads table.
So an Item Group has many part_numbers.
The reports table contains the number of clicks per day for each adgroupid.
Each adgroupid has many part_numbers (table: product_ads).
Now I want to SUM all reports.clicks for each item_groups.id, using part_number to link the tables.
After this, I have to consider only those reports.adgroupid whose part_numbers are all included in the part_numbers of the item_group. So if an Item Group has three part_numbers (A, B, C), any adgroupid that contains only A, B, or C can be considered, but nothing more. If an adgroupid contains part_number D, it cannot be counted in the clicks sum.
Expected results
The result should be a table covering many item_group_ids.
I am looking for the PostgreSQL query to achieve this table.

First, let's build the query up in parts. It sounds like you already know how to get from item_group and adgroup to part_number, just not how to join them. For part 1 of your question, I've put a query that removes duplicate part numbers into a CTE:
WITH unique_part_numbers AS (
SELECT DISTINCT item_groups.id AS item_group_id,
part_number
FROM item_groups
JOIN item_ads ON item_group_id = item_groups.id
JOIN items ON items.id = item_ads.item_id
)
SELECT unique_part_numbers.item_group_id, SUM(clicks)
FROM unique_part_numbers
JOIN product_ads ON product_ads.part_number = unique_part_numbers.part_number
JOIN reports ON product_ads.adgroupid = reports.adgroupid
GROUP BY item_group_id
About the second part: it's not possible to do it exactly as you want, because you can have multiple adgroups per item_group, so I added adgroupid as an extra column. I build an array of part_numbers for the adgroup and check, using the @> (contains) operator, that all part numbers of the adgroupid also belong to unique_part_numbers.item_group_id.
WITH unique_part_numbers AS (
SELECT DISTINCT item_groups.id AS item_group_id,
part_number
FROM item_groups
JOIN item_ads ON item_group_id = item_groups.id
JOIN items ON items.id = item_ads.item_id
)
SELECT unique_part_numbers.item_group_id,
product_ads.adgroupid,
array_agg(unique_part_numbers.part_number),
SUM(clicks)
FROM unique_part_numbers
JOIN product_ads ON product_ads.part_number = unique_part_numbers.part_number
JOIN reports ON product_ads.adgroupid = reports.adgroupid
GROUP BY item_group_id, product_ads.adgroupid
HAVING array_agg(product_ads.part_number) @> (
SELECT ARRAY_AGG(other_product_ads.part_number)
FROM product_ads AS other_product_ads
WHERE other_product_ads.adgroupid = product_ads.adgroupid
)
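To make the "nothing more than A, B, or C" rule concrete, here is a small sketch that runs against an in-memory SQLite database from Python. The table layouts and sample rows are assumptions guessed from the queries above (the fiddle may differ), and because SQLite has no array type, the containment check is expressed with NOT EXISTS instead of an array operator. The sketch also joins reports only after the eligible (item group, adgroup) pairs are known, so each report row is counted once per pair rather than once per matching part number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema, inferred from the queries in this thread
CREATE TABLE item_groups (id INTEGER PRIMARY KEY);
CREATE TABLE items (id INTEGER PRIMARY KEY, part_number TEXT);
CREATE TABLE item_ads (item_group_id INTEGER, item_id INTEGER);
CREATE TABLE product_ads (adgroupid INTEGER, part_number TEXT);
CREATE TABLE reports (adgroupid INTEGER, clicks INTEGER);

INSERT INTO item_groups VALUES (1);
INSERT INTO items VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO item_ads VALUES (1, 1), (1, 2), (1, 3);
-- adgroup 10 holds only A and B (eligible); adgroup 20 holds A and D (not eligible)
INSERT INTO product_ads VALUES (10, 'A'), (10, 'B'), (20, 'A'), (20, 'D');
INSERT INTO reports VALUES (10, 5), (10, 7), (20, 100);
""")

query = """
WITH unique_part_numbers AS (
    SELECT DISTINCT item_ads.item_group_id, items.part_number
    FROM item_ads
    JOIN items ON items.id = item_ads.item_id
),
eligible AS (
    -- pairs that share a part number, where the adgroup has
    -- no part number outside the item group
    SELECT DISTINCT upn.item_group_id, pa.adgroupid
    FROM unique_part_numbers AS upn
    JOIN product_ads AS pa ON pa.part_number = upn.part_number
    WHERE NOT EXISTS (
        SELECT 1
        FROM product_ads AS other
        WHERE other.adgroupid = pa.adgroupid
          AND other.part_number NOT IN (
              SELECT g.part_number
              FROM unique_part_numbers AS g
              WHERE g.item_group_id = upn.item_group_id
          )
    )
)
SELECT e.item_group_id, SUM(r.clicks) AS clicks
FROM eligible AS e
JOIN reports AS r ON r.adgroupid = e.adgroupid
GROUP BY e.item_group_id
"""
click_totals = conn.execute(query).fetchall()
print(click_totals)
```

With this sample data only adgroup 10 qualifies for item group 1, so the 100 clicks of adgroup 20 are left out of the total.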


Return closest timestamp from Table B based on timestamp from Table A with matching Product IDs

Goal: Create a query to pull the closest cycle count event (Table C) for a product ID based on the inventory adjustments results sourced from another table (Table A).
All records from Table A will be used, but they are not guaranteed to have a match in Table C.
The ID column is present in both tables but is not unique in either, so the pair of ID and timestamp together is needed to identify a row in each table.
Current simplified SQL
SELECT
A.WHENOCCURRED,
A.LPID,
A.ITEM,
A.ADJQTY,
C.WHENOCCURRED,
C.LPID,
C.LOCATION,
C.ITEM,
C.QUANTITY,
C.ENTQUANTITY
FROM
A
LEFT JOIN
C
ON A.LPID = C.LPID
WHERE
A.facility = 'FACID'
AND A.WHENOCCURRED > '23-DEC-22'
AND A.ADJREASONABBREV = 'CYCLE COUNTS'
ORDER BY A.WHENOCCURRED DESC
;
This is currently pulling the first hit on C.WHENOCCURRED for the LPID matches. I want to see if there is a simpler JOIN solution before going in a direction that creates 2 temp tables based on WHENOCCURRED.
I have a functioning INDEX(MATCH(MIN()) solution in Excel but that requires exporting a couple system reports first and is extremely slow with X,XXX row tables.
If you are using Oracle 12 or later, you can use a LATERAL join and FETCH FIRST ROW ONLY:
SELECT A.WHENOCCURRED,
A.LPID,
A.ITEM,
A.ADJQTY,
C.WHENOCCURRED,
C.LPID,
C.LOCATION,
C.ITEM,
C.QUANTITY,
C.ENTQUANTITY
FROM A
LEFT OUTER JOIN LATERAL (
SELECT *
FROM C
WHERE A.LPID = C.LPID
AND A.whenoccurred <= c.whenoccurred
ORDER BY c.whenoccurred
FETCH FIRST ROW ONLY
) C
ON (1 = 1) -- The join condition is inside the lateral join
WHERE A.facility = 'FACID'
AND A.WHENOCCURRED > DATE '2022-12-23'
AND A.ADJREASONABBREV = 'CYCLE COUNTS'
ORDER BY A.WHENOCCURRED DESC;
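For readers who want to try the "first C row at or after the A timestamp" logic without an Oracle instance, here is a rough sketch in Python with an in-memory SQLite database. SQLite has no LATERAL join, so a correlated subquery on rowid with LIMIT 1 stands in for it; the trimmed-down tables, ISO date strings, and sample rows are all assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, trimmed-down versions of tables A and C
CREATE TABLE a (lpid TEXT, whenoccurred TEXT, adjqty INTEGER);
CREATE TABLE c (lpid TEXT, whenoccurred TEXT, quantity INTEGER);
INSERT INTO a VALUES ('LP1', '2023-01-05', -2), ('LP2', '2023-01-06', 1);
INSERT INTO c VALUES ('LP1', '2023-01-01', 10),
                     ('LP1', '2023-01-07', 8),
                     ('LP1', '2023-01-09', 8);
""")

closest_rows = conn.execute("""
SELECT a.lpid, a.whenoccurred, c.whenoccurred, c.quantity
FROM a
LEFT JOIN c ON c.rowid = (
    -- earliest C event for the same LPID at or after the A timestamp
    SELECT c2.rowid
    FROM c AS c2
    WHERE c2.lpid = a.lpid
      AND c2.whenoccurred >= a.whenoccurred
    ORDER BY c2.whenoccurred
    LIMIT 1
)
ORDER BY a.whenoccurred DESC
""").fetchall()

for row in closest_rows:
    print(row)
```

Every row of A is kept; LP2 has no event in C, so its C columns come back as NULL (None in Python), which mirrors what the LEFT OUTER JOIN does in the Oracle query.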

How to get the matching column name when searching across multiple columns in PostgreSQL?

Is there a way to get the matching column name when searching across multiple columns in PostgreSQL?
Say I have the following table structure and query:
CREATE TABLE document (
id serial PRIMARY KEY,
document_content VARCHAR
);
CREATE TABLE story (
id serial PRIMARY KEY,
headline VARCHAR
);
-----
SELECT
"document".*,
story.id,
story.headline
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
INNER JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(document_content) @@ to_tsquery('foo')
OR to_tsvector(headline) @@ to_tsquery('foo');
I was thinking of concatenating the values of the two columns, running the full text search, then creating a subquery for each column to re-run the search individually and record the result as a reference, but this would mean executing the search 3x:
SELECT
"document".*,
story.id AS story_id,
story.headline,
(SELECT "document".id WHERE to_tsvector(document_content) @@ to_tsquery('foo')) AS "matching_document_id",
(SELECT story_id WHERE to_tsvector(headline) @@ to_tsquery('foo')) AS "matching_story_id"
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
RIGHT JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(document_content || ' ' || headline) @@ to_tsquery('foo');
How could I get a reference to the column: document_content or headline, where the keyword "foo" was found in one query?
Thanks!
Since the columns are in different tables the best you can do is translate the OR into a UNION:
SELECT
"document".*,
story.id,
story.headline
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
INNER JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(document_content) @@ to_tsquery('foo')
UNION
SELECT
"document".*,
story.id,
story.headline
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
INNER JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(headline) @@ to_tsquery('foo');
Then PostgreSQL doesn't have to build the complete join just to filter out most of the rows. My variant will be fast if the conditions are selective and indexed and you have indexes on the join conditions as well, so that you can get fast nested loop joins.
Here is some more about dealing with OR.
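One way to also report which column matched, sketched here under assumptions, is to add a literal label column to each branch of the union. The example runs in Python against in-memory SQLite, which has no to_tsvector or to_tsquery, so a plain LIKE filter stands in for the full text match; the matched_column name, the story_document layout, and the sample rows are invented for the illustration. UNION ALL is used because the label already makes the two branches distinct, so a row matching in both columns shows up once per column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (id INTEGER PRIMARY KEY, document_content TEXT);
CREATE TABLE story (id INTEGER PRIMARY KEY, headline TEXT);
-- assumed link table, not shown in the question
CREATE TABLE story_document (story_id INTEGER, document_id INTEGER);
INSERT INTO document VALUES (1, 'foo in the body'), (2, 'nothing here');
INSERT INTO story VALUES (1, 'plain headline'), (2, 'foo headline');
INSERT INTO story_document VALUES (1, 1), (2, 2);
""")

labelled_matches = conn.execute("""
SELECT d.id AS document_id, s.id AS story_id,
       'document_content' AS matched_column
FROM document AS d
JOIN story_document AS sd ON sd.document_id = d.id
JOIN story AS s ON s.id = sd.story_id
WHERE d.document_content LIKE '%foo%'   -- stand-in for the @@ match
UNION ALL
SELECT d.id, s.id, 'headline'
FROM document AS d
JOIN story_document AS sd ON sd.document_id = d.id
JOIN story AS s ON s.id = sd.story_id
WHERE s.headline LIKE '%foo%'           -- stand-in for the @@ match
ORDER BY document_id, matched_column
""").fetchall()

for row in labelled_matches:
    print(row)
```

In PostgreSQL the same shape should carry over by swapping the LIKE filters back to the to_tsvector(...) @@ to_tsquery('foo') conditions.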

How does this query populate data?

It is my understanding that this query would never populate any data, no matter how many times it runs, because of the where clause
where c.company_id = lot.company_id
and p.product_id = lot.product_id
and l.packlevel_id = lot.packlevel_id
It looks to me that at the very beginning, when the table fact_table_lot is empty, the where clause would return no data because it would not find anything in an empty table, and this would happen every time. Is my understanding wrong?
insert into fact_table_lot(company_id, product_id, packlevel_id, l_num, sn_count, comm_loct, comm_start, commdate_end, man_date, exp_date, user_id, created_datetime)
select c.company_id, p.product_id, l.packlevel_id, l_num, sn_count, comm_loct, comm_start, commdate_end, man_date, exp_date, user_id, sysdate
from staging_serials s
left outer join fact_table_lot lot on s.lotnumber = lot.l_num
join company c on c.lsc_company_id = s.companyid
join product p on s.compositeprodcode = p.compositeprodcode
join level l on l.unit_of_measure = p.packaginguom
where c.company_id = lot.company_id
and p.product_id = lot.product_id
and l.packlevel_id = lot.packlevel_id
and lot.created_datetime is null
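In your query, staging_serials s left outer join fact_table_lot lot on s.lotnumber = lot.l_num produces a result set containing all records from staging_serials; since the fact table is empty, the columns coming from the fact table are NULL. Note that the WHERE conditions comparing against lot columns (for example c.company_id = lot.company_id) are not true when those columns are NULL, so such rows are filtered out afterwards. If you want unmatched records to be excluded by the join itself, use an inner join instead of a left join.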
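The interaction between the left join and the WHERE clause is easy to check on a toy case. The sketch below uses Python with in-memory SQLite and two heavily simplified, hypothetical versions of the tables (only a lot number, a company id, and a creation timestamp). It compares the question's pattern, where lot columns are compared in WHERE, with an anti-join variant that keeps the comparison in the ON clause and tests only for a missing match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of the tables in the question
CREATE TABLE staging_serials (lotnumber TEXT, company_id INTEGER);
CREATE TABLE fact_table_lot (l_num TEXT, company_id INTEGER,
                             created_datetime TEXT);
INSERT INTO staging_serials VALUES ('LOT1', 1), ('LOT2', 1);
-- fact_table_lot is left empty on purpose
""")

# Pattern from the question: lot columns compared in WHERE.
# For unmatched rows lot.company_id is NULL, so the comparison is not true.
where_rows = conn.execute("""
SELECT s.lotnumber
FROM staging_serials s
LEFT OUTER JOIN fact_table_lot lot ON s.lotnumber = lot.l_num
WHERE s.company_id = lot.company_id
  AND lot.created_datetime IS NULL
""").fetchall()

# Anti-join variant: comparisons live in ON, WHERE only checks for no match.
anti_join_rows = conn.execute("""
SELECT s.lotnumber
FROM staging_serials s
LEFT OUTER JOIN fact_table_lot lot
       ON s.lotnumber = lot.l_num
      AND s.company_id = lot.company_id
WHERE lot.l_num IS NULL
ORDER BY s.lotnumber
""").fetchall()

print(where_rows)
print(anti_join_rows)
```

With an empty fact table the first query returns nothing, while the second returns both staging lots, which is the usual shape for "insert only rows that are not there yet".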

How to use GROUP BY with Firebird?

I'm trying to create a SELECT with GROUP BY in Firebird, but I'm not having any success. How could I do this?
Exception
Can't format message 13:896 -- message file C:\firebird.msg not found.
Dynamic SQL Error.
SQL error code = -104.
Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause).
(49,765 sec)
What I'm trying:
SELECT FA_DATA, FA_CODALUNO, FA_MATERIA, FA_TURMA, FA_QTDFALTA,
ALU_CODIGO, ALU_NOME,
M_CODIGO, M_DESCRICAO,
FT_CODIGO, FT_ANOLETIVO, FT_TURMA
FROM FALTAS Falta
INNER JOIN ALUNOS Aluno ON (Falta.FA_CODALUNO = Aluno.ALU_CODIGO)
INNER JOIN MATERIAS Materia ON (Falta.FA_MATERIA = Materia.M_CODIGO)
INNER JOIN FORMACAOTURMAS Turma ON (Falta.FA_TURMA = Turma.FT_CODIGO)
WHERE (Falta.FA_CODALUNO = 238) AND (Turma.FT_ANOLETIVO = 2015)
GROUP BY Materia.M_CODIGO
Simple use of GROUP BY in Firebird: group by all columns.
select * from T1 t
where t.id in
(SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id)
Using GROUP BY doesn't make sense in your example code. It is only useful when using aggregate functions (+ some other minor uses). In any case, Firebird requires you to specify all columns from the SELECT column list except those with aggregate functions in the GROUP BY clause.
Note that this is more restrictive than the SQL standard, which allows you to leave out functionally dependent columns (i.e., if you specify a primary key or unique key, you don't need to specify the other columns of that table).
You don't specify why you want to group (because it doesn't make much sense to do it with this query). Maybe instead you want to ORDER BY, or you want the first row for each M_CODIGO.
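If the goal turns out to be a total per subject, the rule above would look roughly like the sketch below: every selected column that is not inside an aggregate also appears in GROUP BY. This is only a guess at the intent (summing FA_QTDFALTA per M_CODIGO for one student). It runs in Python on in-memory SQLite rather than Firebird, uses cut-down versions of the tables, and the TOTAL_FALTAS alias and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down, hypothetical versions of the tables in the question
CREATE TABLE MATERIAS (M_CODIGO INTEGER PRIMARY KEY, M_DESCRICAO TEXT);
CREATE TABLE FALTAS (FA_CODALUNO INTEGER, FA_MATERIA INTEGER,
                     FA_QTDFALTA INTEGER);
INSERT INTO MATERIAS VALUES (1, 'Math'), (2, 'History');
INSERT INTO FALTAS VALUES (238, 1, 2), (238, 1, 1), (238, 2, 4), (999, 1, 7);
""")

absences_per_subject = conn.execute("""
SELECT Materia.M_CODIGO,
       Materia.M_DESCRICAO,
       SUM(Falta.FA_QTDFALTA) AS TOTAL_FALTAS
FROM FALTAS Falta
INNER JOIN MATERIAS Materia ON (Falta.FA_MATERIA = Materia.M_CODIGO)
WHERE Falta.FA_CODALUNO = 238
-- both non-aggregated columns of the select list are listed here
GROUP BY Materia.M_CODIGO, Materia.M_DESCRICAO
ORDER BY Materia.M_CODIGO
""").fetchall()

for row in absences_per_subject:
    print(row)
```

SQLite is more lenient than Firebird about columns missing from GROUP BY, so it would not reproduce the error; the query is written in the stricter form that Firebird expects.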

How to do this JOIN

Related to my previous post here, I have the following SELECT:
SELECT tc.[Time],tc.[From], tc.[To], tc.[Cost], tc.[Length], tc.[Type], tc.[PlaceCalled]
FROM
TelstraCall as tc
WHERE
[AccountNumber] IN (@AccountNumber)
ORDER BY [Time] DESC
I'm trying to get the [Username] out of [Resource] given that the [PhoneNum] in [rtc] matches either [From] or [To], and Hogan has kindly helped me out with the first half:
USE [rtc]
SELECT [Username]
FROM [dbo].[Resource] R
JOIN ResourcePhone RP on R.ResourceId = RP.ResourceId
WHERE RP.PhoneNum = tc.[From]
Now I'm trying to work out the syntax of how to get a 'User1' given that [From] matches the [PhoneNum] in [rtc] and a 'User2' if [To] matches [PhoneNum] instead, because I can't have them being jumbled up.
What you're wanting to do is join on the same table twice to get related values based on two different references.
For this, you use table aliases. Here's a simple example:
SELECT u1.[Username] AS User1, u2.[Username] AS User2
FROM TelstraCall tc
INNER JOIN ResourcePhone rp1 ON tc.[From] = rp1.PhoneNum
INNER JOIN Resource u1 ON rp1.ResourceId = u1.ResourceId -- column names taken from the query in the question
INNER JOIN ResourcePhone rp2 ON tc.[To] = rp2.PhoneNum
INNER JOIN Resource u2 ON rp2.ResourceId = u2.ResourceId
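The double join on aliases can be tried out on toy data. The sketch below runs in Python on in-memory SQLite (which also accepts the bracketed identifiers); the tables are reduced to the columns used in the join, ResourceId is assumed to be the key on both sides as in the question's snippet, and the phone numbers and user names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced, hypothetical versions of the tables in the question
CREATE TABLE TelstraCall ([From] TEXT, [To] TEXT);
CREATE TABLE Resource (ResourceId INTEGER PRIMARY KEY, Username TEXT);
CREATE TABLE ResourcePhone (ResourceId INTEGER, PhoneNum TEXT);
INSERT INTO Resource VALUES (1, 'alice'), (2, 'bob');
INSERT INTO ResourcePhone VALUES (1, '0400000001'), (2, '0400000002');
INSERT INTO TelstraCall VALUES ('0400000001', '0400000002');
""")

call_users = conn.execute("""
SELECT u1.Username AS User1, u2.Username AS User2
FROM TelstraCall tc
-- first pair of aliases resolves the caller
INNER JOIN ResourcePhone rp1 ON tc.[From] = rp1.PhoneNum
INNER JOIN Resource u1 ON rp1.ResourceId = u1.ResourceId
-- second pair of aliases resolves the callee
INNER JOIN ResourcePhone rp2 ON tc.[To] = rp2.PhoneNum
INNER JOIN Resource u2 ON rp2.ResourceId = u2.ResourceId
""").fetchall()

print(call_users)
```

Because these are inner joins, a call only appears when both numbers belong to a known resource; switching the joins to LEFT JOIN would keep calls where one side is unknown, with NULL in that user column.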
Here is one way that you can do this using CROSS APPLY since you are using SQL Server 2008. CROSS APPLY helps you to join your table with sub queries.
In this case, the table CallDetails in the database PhoneBills drives your query using the fields From and To. Both these fields have to fetch the Username data from the table Resource in the database rtc by joining with the PhoneNumber column in the table ResourcePhone also in the database rtc.
So the inner/sub query will join the tables Resource and ResourcePhone, it will then be used twice to fetch User1 and User2. For User1, the filter will use the From field in the table CallDetails in the database PhoneBills and for User2, the filter will use the To field in the table CallDetails in the database PhoneBills
SELECT USR1.UserName AS [User1]
, USR2.UserName AS [User2]
FROM PhoneBills.dbo.CallDetails CD
CROSS APPLY (
SELECT Username
FROM rtc.dbo.Resource R
INNER JOIN rtc.dbo.ResourcePhone RP
ON RP.ResourceID = R.ResourceID
WHERE RP.PhoneNumber = CD.[From]
) USR1
CROSS APPLY (
SELECT Username
FROM rtc.dbo.Resource R
INNER JOIN rtc.dbo.ResourcePhone RP
ON RP.ResourceID = R.ResourceID
WHERE RP.PhoneNumber = CD.[To]
) USR2