Get value of at least one coincidense between related tables postgres - postgresql

I have 3 tables:
CREATE TABLE public.student (id serial NOT NULL PRIMARY KEY,"student" varchar(30) NOT NULL) WITH (OIDS = FALSE);
CREATE TABLE public."class" (id serial NOT NULL PRIMARY KEY,"class" varchar(30) NOT NULL) WITH (OIDS = FALSE);
CREATE TABLE public.student_class (student_id integer, class_id integer, /* Keys */ CONSTRAINT student_class_index01
PRIMARY KEY (student_id, class_id)) WITH (OIDS = FALSE);
With these values:
INSERT INTO public.student_class ("student_id","class_id") VALUES(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(2,1),(2,2),(2,3),(3,4),(3,5),(3,6),(3,7),(3,8);
I want to now the names of the students that have at least one coincident class with other one:
Student 1 has got at least one coincidence with 2 and 3. Student 2 has
got at least one coincidence with 1. Student 3 has got at least one
coincidence with 1.
I have tried this, but it is not doing it by student.
SELECT distinct a.id,a.student from public.student a
LEFT OUTER JOIN public.student_class b ON b.student_id=a.id
LEFT OUTER JOIN public.class d ON d.id=b.class_id
WHERE class_id IN(select class_id from student_class where student_id not in(1))
AND a.id not in(1);
I think taht is easier than I think. Thanks.

What do you think about this :
with student_lst_classes as (
select student_id,array_agg(class_id order by class_id) lst_classes
from student_class group by 1
)
select st1.student,string_agg(st2.student,';' order by st2.student ASC) lst_student_coincident from
(student_lst_classes l1 join student st1 on (st1.id=l1.student_id))
,(student_lst_classes l2 join student st2 on (st2.id=l2.student_id))
where l1.student_id!=l2.student_id and l1.lst_classes && l2.lst_classes
group by 1 order by 1

Try this:
select *
from student
where exists (select null
from student_class sc1
where sc1.student_id = student.id
and exists (select null
from student_class sc2
where sc2.class_id = sc1.class_id
and sc2.student_id != sc1.student_id))

Besically a cartesian join of student X student, so there are no duplicates to take care of. (except for the symmetry s1 <--> s2)
SELECT s1.id AS a_id, s1.student AS a_student
, s2.id AS b_id, s2.student AS b_student
FROM student s1
JOIN student s2 ON EXISTS (
SELECT *
FROM student_class x1
JOIN student_class x2 ON x2.class_id = x1.class_id
WHERE x1.student_id = s1.id
AND x2.student_id = s2.id
)
WHERE s1.id < s2.id -- tie-breaker
;

Related

Rewrite Where condition using Lateral Join

I have a WITH statement which returns a table like followings:
id
parent_id
1
null
3
2
4
null
5
4
The desired output of my query is where parent_id is null or parent_id is present in the id column:
id
parent_id
1
null
4
null
5
4
I wrote the following query:
SELECT * FROM items
WHERE parent_id IS NULL OR parent_id = ANY(SELECT id from items)
As far I have understood, lateral joins are faster than the ANY operator so my idea was to rewrite the above query using them. I started with:
SELECT * FROM items i1
JOIN LATERAL (SELECT * FROM items i2 WHERE i2.parent_id = i1.id ) t ON true
But where do I add the condition to take the items where parent_id is null?
Use a self join:
SELECT t1.*
FROM items t1
LEFT JOIN items t2 ON t1.parent_id = t2.id
WHERE t1.parent_id IS NULL
OR t2.id IS NOT NULL
This is the best performing approach (assuming you have an index on the id column, which is almost certainly the case).

How to find in a many to many relation all the identical values in a column and join the table with other three tables?

I have a many to many relation with three columns, (owner_id,property_id,ownership_perc) and for this table applies (many owners have many properties).
So I would like to find all the owner_id who has many properties (property_id) and connect them with other three tables (Table 1,3,4) in order to get further information for the requested result.
All the tables that I'm using are
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
So, when I'm trying it like this, the query runs but it returns empty.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only when I'm trying the code below, it returns the owner_id who has many properties (see image below) but without joining it with the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!
This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns all the owner_ids with more than 1 property_ids.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.
You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

Joining two one-to-many tables duplicates records

I have 3 tables, Transaction, Transaction_Items and Transaction_History.
Where the Transaction is the parent table, while Transaction_Items and Transaction_History are the children tables, with one to many relationship.
When i try to join those tables together, if i have 2+ Transaction_History records, or 2+ Transaction_Items i get duplicated or triplicated record results.
This is the SQL query im currently using which works, but what worries me that in the future if i have to Join another one-to-many table, it will duplicate the results again.
I found a workaround for this, but i was just wondering if there is a better and cleaner way to do this ?
The results should be a PostgreSQL JSON array which will contain the Transaction_Items and Transaction_History
SELECT
TR.id AS transaction_id,
TR.transaction_number,
TR.status,
TR.status AS status,
to_json(TR_INV.list),
COUNT(TR_INV) item_cnt,
COUNT(THR) tr_cnt,
json_agg(THR)
FROM transaction_transaction AS TR
LEFT JOIN (
SELECT
array_agg(t) list, -- this is a workaround method
t.transaction_id
FROM (
SELECT
TR_INV.transaction_id transaction_id,
IT.id,
IT.stock_number,
CAT.key category_key,
ITP.description description,
ITP.serial_number serial_number,
ITP.color color,
ITP.manufacturer manufacturer,
ITP.inventory_model inventory_model,
ITP.average_cost average_cost,
ITP.location_in_store location_in_store,
ITP.firearm_caliber firearm_caliber,
ITP.federal_firearm_number federal_firearm_number,
ITP.sold_price sold_price
FROM transaction_transaction_item TR_INV
LEFT JOIN inventory_item IT ON IT.id = TR_INV.item_id
LEFT JOIN inventory_itemprofile ITP ON ITP.id = IT.current_profile_id
LEFT JOIN inventory_category CAT ON CAT.id = ITP.category_id
LEFT JOIN inventory_categorytype CAT_T ON CAT_T.id = CAT.category_type_id
) t
GROUP BY t.transaction_id
) TR_INV ON TR_INV.transaction_id = TR.id
LEFT JOIN transaction_transactionhistory THR ON THR.transaction_id = TR.id
AND (THR.audit_code_id = 44 OR THR.audit_code_id = 27 OR THR.audit_code_id = 28)
WHERE TR.store_id = 21
AND TR.transaction_type = 'Pawn_Loan' AND TR.date_made >= '2018-10-08'
GROUP BY TR.id, TR_INV.list
What you want to do can be achieved by not using joins, as shown below.
Because your actual tables have so many columns that I don't know and should not care. I just created the simplest forms of them for demonstration.
CREATE TABLE transactions (
tid serial PRIMARY KEY,
name varchar(40) NOT NULL
);
CREATE TABLE transaction_histories (
hid serial PRIMARY KEY ,
tid integer REFERENCES transactions(tid),
history varchar(40) NOT NULL
);
CREATE TABLE transaction_items (
iid serial PRIMARY KEY ,
tid integer REFERENCES transactions(tid),
item varchar(40) NOT NULL
);
INSERT INTO transactions(tid,name) Values(1, 'transaction');
INSERT INTO transaction_histories(tid, history) Values(1, 'history1');
INSERT INTO transaction_histories(tid, history) Values(1, 'history2');
INSERT INTO transaction_items(tid, item) Values(1, 'item1');
INSERT INTO transaction_items(tid, item) Values(1, 'item2');
select
t.*,
(select count(*) from transaction_histories h where h.tid= t.tid) h_count ,
(select json_agg(h) from transaction_histories h where h.tid= t.tid) h ,
(select count(*) from transaction_items i where i.tid= t.tid) i_count ,
(select json_agg(i) from transaction_items i where i.tid= t.tid) i
from transactions t;

PostgreSql unable to create view due to "duplicate column"

I am trying to create a country_name, and country cid pair between each country that are neighbours:
Here's the schema:
CREATE TABLE country (
cid INTEGER PRIMARY KEY,
cname VARCHAR(20) NOT NULL,
height INTEGER NOT NULL,
population INTEGER NOT NULL);
CREATE TABLE neighbour (
country INTEGER REFERENCES country(cid) ON DELETE RESTRICT,
neighbor INTEGER REFERENCES country(cid) ON DELETE RESTRICT,
length INTEGER NOT NULL,
PRIMARY KEY(country, neighbor));
My query:
create view neighbour_pair as (
select c1.cid, c1.cname, c2.cid, c2.cname
from neighbour n join country c1 on c1.cid = n.country
join country c2 on n.neighbor = c2.cid);
I am getting error code 42701 which means that there is a duplicate column.
The actual error message I am getting is:
ERROR: column "cid" specified more than once
********** Error **********
ERROR: column "cid" specified more than once
SQL state: 42701
I am unsure how to go around the error problem since I WANT the pair of neighbour countries with the country name and their cid.
Nevermind. I edited the first line of the query and changed the column names
create view neighbour_pair as
select c1.cid as c1cid, c1.cname as c1name, c2.cid as c2cid, c2.cname as c2name
from neighbour n join country c1 on c1.cid = n.country
join country c2 on n.neighbor = c2.cid;
I ran into a similar issue recently. I had a query like:
CREATE VIEW pairs AS
SELECT p.id, p.name,
(SELECT count(id) from results
where winner = p.id),
(SELECT count(id) from results
where winner = p.id OR loser = p.id)
FROM players p LEFT JOIN matches m ON p.id = m.id
GROUP BY 1,2;
The error was telling me: ERROR: column "count" specified more than once. The query WAS working via psycopg2, however when I brought it into a .sql file for testing the error arose.
I realized I just needed to alias the 2 count subqueries:
CREATE VIEW pairs AS
SELECT p.id, p.name,
(SELECT count(id) from results
where winner = p.id) as wins,
(SELECT count(id) from results
where winner = p.id OR loser = p.id) as matches
FROM players p LEFT JOIN matches m ON p.id = m.id
GROUP BY 1,2;
You can use alias with AS:
For example your view could be as follows:
create view neighbour_pair as
(
select c1.**cid**
, c1.cname
, c2.**cid AS cid_c2**
, c2.cname
from neighbour n
join country c1 on c1.cid = n.country
join country c2 on n.neighbor = c2.cid
);

Concatenated columns should not match in 2 tables

I'll just put this in layman's terms since I'm a complete noobie:
I have 2 tables A and B, both having 2 columns of interest namely: employee_number and salary.
What I am looking to do is to extract rows of 'combination' of employee_number and salary from A that are NOT present in B, but each of employee_number and salary should be present in both.
I am looking to doing it with the 2 following conditions(please forgive the wrong function
names.. this is just to present the problem 'eloquently'):
1.) A.unique(employee_number) exists in B.unique(employee_number) AND A.unique(salary)
exists in B.unique(salary)
2.) A.concat(employee_number,salary) <> B.concat(employee_number,salary)
Note: A and B are in different databases, so I'm looking to use dblink to do this.
This is what I tried doing:
SELECT distinct * FROM dblink('dbname=test1 port=5432
host=test01 user=user password=password','SELECT employee_number,salary, employee_number||salary AS ENS FROM empsal.A')
AS A(employee_number int8, salary integer, ENS numeric)
LEFT JOIN empsalfull.B B on B.employee_number = A.employee_number AND B.salary = A.salary
WHERE A.ENS not in (select distinct employee_number || salary from empsalfull.B)
but it turned out to be wrong as I had it cross-checked by using spreadsheets and I don't get the same result.
Any help would be greatly appreciated. Thanks.
For easier understanding I left out the dblink.
Because, the first one selects lines in B that equal the employeenumber in A as well as the salery in A, so their concatenated values will equal as well (if you expect this to not be true, please provide some test data).
SELECT * from firsttable A
LEFT JOIN secondtable B where
(A.employee_number = B.employee_number AND a.salery != b.salery) OR
(A.salery = B.salery AND A.employee_number != B.employee_number)
If you have troubles with lines containing nulls, you might also try somthing like this:
AND (a.salery != b.salery OR (a.salery IS NULL AND b.salery IS NOT NULL) or (a.salery IS NOT
NULL and b.salery IS NULL))
I think you're looking for something along these lines.
(Sample data)
create table A (
employee_number integer primary key,
salary integer not null
);
create table B (
employee_number integer primary key,
salary integer not null
);
insert into A values
(1, 20000),
(2, 30000),
(3, 20000); -- This row isn't in B
insert into B values
(1, 20000), -- Combination in A
(2, 20000), -- Individual values in A
(3, 50000); -- Only emp number in A
select A.employee_number, A.salary
from A
where (A.employee_number, A.salary) NOT IN (select employee_number, salary from B)
and A.employee_number IN (select employee_number from B)
and A.salary IN (select salary from B)
output: 3, 20000