Is it possible to join multiple rows - PostgreSQL

I am looking for a way to join two rows that share the same value (a promo code in my case). For example, I have a table:
ID PROMO_CODE PROMO_NAME DESCRIPTION LANGUAGE
1 PC123 ABC Desc in English ENG
2 PC123 CBA Desc in Español ESP
and I want the result like:
ID PROMO_CODE PROMO_NAME_ENG PROMO_NAME_ESP DESCRIPTION_ENG DESCRIPTION_ESP
1 PC123 ABC CBA Desc in English Desc in Español
Any help will be appreciated.

Join on PROMO_CODE:
select
a.ID,
a.PROMO_CODE,
a.PROMO_NAME as PROMO_NAME_ENG,
b.PROMO_NAME as PROMO_NAME_ESP,
a.DESCRIPTION as DESCRIPTION_ENG,
b.DESCRIPTION as DESCRIPTION_ESP
from mytable a
left join mytable b on b.PROMO_CODE = a.PROMO_CODE
and b.LANGUAGE = 'ESP'
where a.LANGUAGE = 'ENG'
A left join keeps the ENG row even when no matching ESP row exists. That seems to be the intent, since the expected output carries the ENG row's ID but not the ESP row's.
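To see the left-join behaviour concretely, here is a minimal sketch that runs the same self-join against an in-memory SQLite database. SQLite is only a stand-in for PostgreSQL here, and the third row (PC999, English only) is a hypothetical addition that is not in the question.

```python
import sqlite3

# SQLite stands in for PostgreSQL; the self-join itself is standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (
        id INTEGER PRIMARY KEY,
        promo_code TEXT, promo_name TEXT, description TEXT, language TEXT
    );
    INSERT INTO mytable VALUES
        (1, 'PC123', 'ABC', 'Desc in English', 'ENG'),
        (2, 'PC123', 'CBA', 'Desc in Español', 'ESP'),
        (3, 'PC999', 'XYZ', 'English only',    'ENG');
""")
# Row 3 is hypothetical: a promo code that has no ESP counterpart.

pivot_rows = conn.execute("""
    SELECT a.id, a.promo_code,
           a.promo_name  AS promo_name_eng,  b.promo_name  AS promo_name_esp,
           a.description AS description_eng, b.description AS description_esp
    FROM mytable a
    LEFT JOIN mytable b
           ON b.promo_code = a.promo_code AND b.language = 'ESP'
    WHERE a.language = 'ENG'
    ORDER BY a.id
""").fetchall()

for row in pivot_rows:
    print(row)
```

The PC999 row still appears, with None in the two ESP columns; an inner join would drop it.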

Related

Conditional JOIN with two different keys

I have a query that produces two separate IDs:
SELECT
date,
user_id,
vendor_id,
SUM(purchase) user_purchase,
SUM(spend) vendor_spend
FROM tabla.abc
GROUP BY 1,2,3
This produces results like this:
date user_id vendor_id user_purchase vendor_spend
1/1/18 123 NULL 5.00 0.00
1/1/18 NULL 456 0.00 10.00
I want to join it on a table that looks like this:
client_id user_id vendor_id
456789 123 NULL
101112 NULL 456
But the problem is, I want to join on whichever ID is appropriate for each row, so my final output can look like this:
date client_id user_id vendor_id user_purchase vendor_spend
1/1/18 456789 123 NULL 5.00 0.00
1/1/18 101112 NULL 456 0.00 10.00
So is there a way I can do like, a conditional join? Something like WHERE user_id IS NULL THEN... etc...
Use IS NOT DISTINCT FROM, because one of the arguments may be null:
select *
from (
select
date,
user_id,
vendor_id,
sum(purchase) user_purchase,
sum(spend) vendor_spend
from table1
group by 1,2,3
) t1
join table2 t2
on (t1.user_id, t1.vendor_id)
is not distinct from (t2.user_id, t2.vendor_id)
Note that for performance reasons you should join the already aggregated result (hence I have placed the original query in a derived table).
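The reason a null-safe comparison is needed can be sketched with an in-memory SQLite database. This is an illustration under assumptions, not the answer's exact query: the table names agg and link are made up, and SQLite's IS operator is used as its equivalent of PostgreSQL's IS NOT DISTINCT FROM, applied column by column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- agg: the already aggregated rows; link: the client lookup table (assumed names)
    CREATE TABLE agg  (date TEXT, user_id INTEGER, vendor_id INTEGER,
                       user_purchase REAL, vendor_spend REAL);
    CREATE TABLE link (client_id INTEGER, user_id INTEGER, vendor_id INTEGER);
    INSERT INTO agg  VALUES ('1/1/18', 123, NULL, 5.0, 0.0),
                            ('1/1/18', NULL, 456, 0.0, 10.0);
    INSERT INTO link VALUES (456789, 123, NULL), (101112, NULL, 456);
""")

# Plain equality: NULL = NULL is unknown, so no pair of rows passes both tests.
equal_join = conn.execute("""
    SELECT t2.client_id
    FROM agg t1
    JOIN link t2 ON t1.user_id = t2.user_id AND t1.vendor_id = t2.vendor_id
""").fetchall()

# Null-safe comparison: two NULLs count as equal, so each row finds its client.
null_safe_join = conn.execute("""
    SELECT t2.client_id, t1.user_id, t1.vendor_id
    FROM agg t1
    JOIN link t2 ON t1.user_id IS t2.user_id AND t1.vendor_id IS t2.vendor_id
    ORDER BY t2.client_id DESC
""").fetchall()

print(equal_join)
print(null_safe_join)
```

With plain equality the join returns nothing, which is why the answer reaches for the null-safe form.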
Try this:
SELECT
a.date,
COALESCE(lu.client_id, lv.client_id) AS client_id,
a.user_id,
a.vendor_id,
SUM(a.purchase) AS user_purchase,
SUM(a.spend) AS vendor_spend
FROM tabla.abc AS a
LEFT JOIN tabla.link AS lu ON lu.user_id = a.user_id
LEFT JOIN tabla.link AS lv ON lv.vendor_id = a.vendor_id
GROUP BY 1,2,3,4
I think the sufficient join is just this:
FROM aggregated_table t1
LEFT JOIN client_id_table t2
ON t1.user_id=t2.user_id
OR t1.vendor_id=t2.vendor_id
because as I understand you need to join by user id if there is user id and by vendor id if there is vendor id. Using a left join with OR does exactly that.
A conditional join is possible as well. If you're familiar with CASE expressions, they work perfectly well in join conditions. The same thing can be expressed as:
FROM aggregated_table t1
LEFT JOIN client_id_table t2
ON CASE
WHEN t1.user_id is not null THEN t1.user_id=t2.user_id
WHEN t1.vendor_id is not null THEN t1.vendor_id=t2.vendor_id
END
but this is more verbose than the previous option, which I think should produce the same result.
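Whether the OR form and the CASE form really agree can be checked with a small sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL, the table names agg and link are assumptions, and the row dated 1/2/18 (with both IDs set) is hypothetical data added to probe the one case where the two forms can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agg  (date TEXT, user_id INTEGER, vendor_id INTEGER);
    CREATE TABLE link (client_id INTEGER, user_id INTEGER, vendor_id INTEGER);
    INSERT INTO agg  VALUES ('1/1/18', 123, NULL),
                            ('1/1/18', NULL, 456),
                            ('1/2/18', 123, 456);
    INSERT INTO link VALUES (456789, 123, NULL), (101112, NULL, 456);
""")
# The '1/2/18' row is hypothetical: it has both a user_id and a vendor_id.

def run_join(on_clause):
    """Left-join agg to link using the given ON condition."""
    sql = f"""
        SELECT t1.date, t1.user_id, t1.vendor_id, t2.client_id
        FROM agg t1
        LEFT JOIN link t2 ON {on_clause}
        ORDER BY t1.date, t2.client_id
    """
    return conn.execute(sql).fetchall()

or_rows = run_join("t1.user_id = t2.user_id OR t1.vendor_id = t2.vendor_id")
case_rows = run_join("""
    CASE
        WHEN t1.user_id   IS NOT NULL THEN t1.user_id   = t2.user_id
        WHEN t1.vendor_id IS NOT NULL THEN t1.vendor_id = t2.vendor_id
    END
""")

print(or_rows)
print(case_rows)
```

On the two rows from the question both forms return the same matches. On the hypothetical row with both IDs set, OR matches both clients while CASE stops at the first branch and matches only the user's client.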

JOIN the record with the most similar name with each row from multiple tables

Platform: PostgreSQL
Tables:
shortlist: name (text), city (text)...
data1: name (text), ranking (integer), score1 (double)...
data2: name (text), ranking (integer), score1 (double)...
data3: name (text), ranking (integer), score1 (double)...
data4: name (text), ranking (integer), score1 (double)...
There is a limited number of data tables of similar format.
I would like to join each row in shortlist with the row in each data table that has the most similar name determined by similarity(shortlist.name, data#.name).
Pseudo code of the same idea:
for each s_row in shortlist:
select shortlist.*
join (SELECT data1.*, similarity(s_row.name, data1.name) AS sim FROM data1 ORDER BY sim DESC LIMIT 1)
join (SELECT data2.*, similarity(s_row.name, data2.name) AS sim FROM data2 ORDER BY sim DESC LIMIT 1)
join (SELECT data3.*, similarity(s_row.name, data3.name) AS sim FROM data3 ORDER BY sim DESC LIMIT 1)
join (SELECT data4.*, similarity(s_row.name, data4.name) AS sim FROM data4 ORDER BY sim DESC LIMIT 1)
Is there a way to do this in SQL?
I am not entirely sure what you are after but something like this:
select s.name,
d1.name as d1_name,
d2.name as d2_name
from shortlist s
left join lateral (
SELECT data1.*, similarity(s.name, data1.name) AS sim
FROM data1
ORDER BY sim DESC
LIMIT 1
) d1 on true
left join lateral (
SELECT data2.*, similarity(s.name, data2.name) AS sim
FROM data2
ORDER BY sim DESC
LIMIT 1
) d2 on true
You want an outer join (left join) for each table because otherwise you would not see anything if there is no match in at least one of the tables.
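What the lateral joins compute can be sketched in plain Python: for each shortlist row, pick the single best-scoring row from each data table, or nothing if the table has no rows. This is only an illustration. difflib.SequenceMatcher is a stand-in scorer, not the trigram measure behind PostgreSQL's similarity(), and the sample names are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Stand-in score in [0, 1]; NOT the trigram measure used by pg_trgm."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(name, candidates):
    """Mimic ORDER BY sim DESC LIMIT 1; None plays the role of the left join's NULL."""
    return max(candidates, key=lambda c: similarity(name, c), default=None)

# Hypothetical sample data standing in for shortlist.name, data1.name, data2.name.
shortlist = ["Acme Corp", "Zenith"]
data1 = ["Acme Corporation", "Zenith Ltd"]
data2 = []  # an empty table: the left join still keeps the shortlist row

joined = [(s, best_match(s, data1), best_match(s, data2)) for s in shortlist]

for row in joined:
    print(row)
```

Every shortlist row survives even though data2 is empty, which is the behaviour the left join lateral ... on true form gives in SQL.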

Postgres how to maintain order of rows using CTEs

I have 2 tables
students:
id | name | age
1 abc 20
2 xyz 21
scores:
id | studentid | marks
1 1 20
2 2 22
3 2 20
4 1 22
5 1 20
where studentid is foreign key to students table
When I do
select studentid
from scores
where marks=20;
I get the following result
1, 2, 1
But I want the name of the student, and when I do a join using
select t1.name
from students t1
inner join scores t2 on t1.id = t2.studentid
where t2.marks=20;
I get xyz,abc,abc. Though the output is correct, is there any way to maintain the order in which the scores are listed in the scores table? I should get abc,xyz,abc as output. I also tried using a subquery:
SELECT name
FROM students
WHERE ID IN ( select studentid from scores where marks=20) ;
but that did not give me the correct order either. How can this be achieved using CTEs (common table expressions)? I tried the following CTE, but it did not work:
with cte as(
select t2.id, t1.name
from students t1
inner join scores t2 on t1.id = t2.studentid
where t2.marks=20)
select name from cte order by id
You can order by a column that is not present in the select list:
select t1.name
from students t1
inner join scores t2 on t1.id = t2.studentid
where t2.marks=20
order by t2.id;
name
------
abc
xyz
abc
(3 rows)
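The same ordering can be reproduced end to end with the tables from the question. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL; ordering by a column that is not selected is standard SQL and behaves the same way there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE scores (id INTEGER PRIMARY KEY,
                         studentid INTEGER REFERENCES students(id),
                         marks INTEGER);
    INSERT INTO students VALUES (1, 'abc', 20), (2, 'xyz', 21);
    INSERT INTO scores VALUES (1, 1, 20), (2, 2, 22), (3, 2, 20), (4, 1, 22), (5, 1, 20);
""")

# Without ORDER BY the row order is unspecified; sorting by scores.id pins it down.
names_in_score_order = [
    name for (name,) in conn.execute("""
        SELECT t1.name
        FROM students t1
        INNER JOIN scores t2 ON t1.id = t2.studentid
        WHERE t2.marks = 20
        ORDER BY t2.id
    """)
]

print(names_in_score_order)
```

No CTE is needed: the join result has no inherent order, so the only reliable fix is an explicit ORDER BY on the outermost query.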

Limit for inner Join Table

I have a scenario where I am joining three tables and getting the results.
My problem is that I have to apply a limit to a joined table.
Take the example below: I have three tables, 1) Books, 2) Customer and 3) Authors. I need to find the list of books sold today with the author and customer names, but for each of the book IDs I pass in I only need the last N customers, not all of them.
Books Customer Authors
--------------- ---------------------- -------------
Id Name AID Id BID Name Date AID Name
1 1 1 ABC 1 A1
2 2 1 CED 2 A2
3 3 2 DFG
How can we achieve this?
You are looking for LATERAL.
Sample:
-- N and ids are placeholders for the row limit and the array of book IDs
SELECT B.Id, C.Name
FROM Books B,
LATERAL (SELECT * FROM Customer Cu WHERE Cu.BID = B.Id ORDER BY Cu.Id DESC LIMIT N) C
WHERE B.Id = ANY(ids)
AND C.Date = current_date
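The per-book limit can be tried out with a small runnable sketch. SQLite has no LATERAL, so this stand-in swaps in a correlated subquery with LIMIT, which selects the same "last N customers per book" rows; the sample rows are hypothetical, and the date filter and the Authors table are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, bid INTEGER, name TEXT);
    INSERT INTO books VALUES (1, 'Book one'), (2, 'Book two');
    INSERT INTO customer VALUES (1, 1, 'ABC'), (2, 1, 'CED'), (3, 1, 'DFG'), (4, 2, 'HIJ');
""")
# All rows above are made-up sample data.

def last_customers_per_book(n):
    """Return (book_id, customer_name) for the n most recent customers of each book."""
    return conn.execute("""
        SELECT b.id, c.name
        FROM books b
        JOIN customer c ON c.bid = b.id
        WHERE c.id IN (
            -- correlated subquery: re-evaluated per book, like the LATERAL subquery
            SELECT c2.id FROM customer c2
            WHERE c2.bid = b.id
            ORDER BY c2.id DESC
            LIMIT ?
        )
        ORDER BY b.id, c.id DESC
    """, (n,)).fetchall()

print(last_customers_per_book(2))
```

Book 1 has three customers but only the two with the highest IDs are returned, while book 2 keeps its single customer.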

counting in sql in subquery in the table

DNO DNAME
----- -----------
1 Research
2 Finance
EN ENAME CITY SALARY DNO JOIN_DATE
-- ---------- ---------- ---------- ---------- ---------
E1 Ashim Kolkata 10000 1 01-JUN-02
E2 Kamal Mumbai 18000 2 02-JAN-02
E3 Tamal Chennai 7000 1 07-FEB-04
E4 Asha Kolkata 8000 2 01-MAR-07
E5 Timir Delhi 7000 1 11-JUN-05
Find all departments that have more than 3 employees.
My try
select deptt.dname
from deptt,empl
where deptt.dno=empl.dno and (select count(empl.dno) from empl group by empl.dno)>3;
Here is the solution:
select deptt.dname
from deptt,empl
where deptt.dno=empl.dno
group by deptt.dname having count(1)>3;
select
*
from deptt d
inner join (
select dno from empl group by dno having count(*) > 3
) e on d.dno = e.dno
There are many approaches to this problem, but almost all of them use GROUP BY with a HAVING clause. HAVING lets you filter on the results of aggregate functions; here it keeps only those groups whose count is greater than 3.
In the query structure used above, the GROUP BY is applied to the employee table only, and the result (known as a derived table) is then joined to the departments table with an INNER JOIN. An inner join returns only matching records, so it filters the departments table down to the departments that have a count(*) greater than 3.
An advantage of this structure is that fewer records are joined and that all columns of the departments table are available for reporting. A disadvantage is that the count(*) of employees per department isn't visible in the output.
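The derived-table approach can be run against the sample data from the question. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL, takes the threshold as a parameter, and also selects the per-department count from the derived table; the employee_count alias and the function name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deptt (dno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE empl (en TEXT PRIMARY KEY, ename TEXT, city TEXT,
                       salary INTEGER, dno INTEGER, join_date TEXT);
    INSERT INTO deptt VALUES (1, 'Research'), (2, 'Finance');
    INSERT INTO empl VALUES
        ('E1', 'Ashim', 'Kolkata', 10000, 1, '01-JUN-02'),
        ('E2', 'Kamal', 'Mumbai',  18000, 2, '02-JAN-02'),
        ('E3', 'Tamal', 'Chennai',  7000, 1, '07-FEB-04'),
        ('E4', 'Asha',  'Kolkata',  8000, 2, '01-MAR-07'),
        ('E5', 'Timir', 'Delhi',    7000, 1, '11-JUN-05');
""")

def departments_with_more_than(min_count):
    """Return (dname, employee_count) for departments with more than min_count employees."""
    return conn.execute("""
        SELECT d.dname, e.employee_count
        FROM deptt d
        INNER JOIN (
            -- aggregate first, then join only the groups that pass HAVING
            SELECT dno, COUNT(*) AS employee_count
            FROM empl
            GROUP BY dno
            HAVING COUNT(*) > ?
        ) e ON d.dno = e.dno
        ORDER BY d.dname
    """, (min_count,)).fetchall()

print(departments_with_more_than(3))
print(departments_with_more_than(2))
```

With the five sample employees no department exceeds 3, so the first call returns an empty list; lowering the threshold to 2 returns Research with its count of 3.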