Postgresql Select rows where column::text = array::text[] - postgresql

This is similar to the question "Postgresql Select rows where column = array".
create table students (id int, name text);
insert into students values
(1,'AA'),
(2,'BB'),
(3,'CC'),
(4,'DD');
create table classes (name text,students text[]);
insert into classes values
('CL-1','{2,4}'),
('YL-2','{2,1,4}'),
('CL-3','{2,3}'),
('BL-33','{2}'),
('CL-5','{1,3,4}'),
('CL-6','{4}');
How can I get the names of the students in each class?
select cl.name,
(select st.names
from student st
where st.id in cl.student) as student_names -- exp: AA,BB,CC
from class cl;

You can join the tables and re-aggregate the names that correspond to the IDs in your array:
select c.name as class_name,
string_agg(s.name,',') as student_names
from classes c
inner join students s
on s.id::text=any(students)
group by c.name;
-- class_name | student_names
--------------+---------------
-- CL-5 | AA,CC,DD
-- YL-2 | AA,BB,DD
-- CL-6 | DD
-- BL-33 | BB
-- CL-1 | BB,DD
-- CL-3 | BB,CC
If you don't want to group by a ton of columns in classes, you can initially retrieve these lists in a CTE, then join that to classes:
with student_name_lists as
( select c.name as class_name,
string_agg(s.name,',') as student_names
from classes c join students s
on s.id::text = any(students)
group by c.name )
select c.*,
sn.student_names
from classes c join student_name_lists sn
on c.name=sn.class_name;

Related

SQL check value from another table

I've got multiple tables.
I wrote my query like this:
SELECT a.creation, b.caseno, c.instanceno
FROM TableB b
JOIN TableA a
ON a.caseno = b.caseno
JOIN TableC c
ON c.caseno = b.caseno
WHERE a.creation BETWEEN '2021-01-01' AND '2021-12-31'
I also have TableD, which contains the following columns:
| InstanceNo | Position | Creation | TaskNo |
The idea is to add a new column (result) to my query.
If the instance from c.instanceno exists in TableD and taskno is 30 or 20, I would like d.creation for the row with the max(position).
If not, NULL is enough for the result column.
SELECT a.creation, b.caseno, c.instanceno, d.creation
FROM TableB b
JOIN TableA a
ON a.caseno = b.caseno
JOIN TableC c
ON c.caseno = b.caseno
LEFT JOIN (SELECT MAX(position) position, instanceno, creation, taskno FROM TableD GROUP BY instanceno, creation, taskno) d
ON d.instanceno = c.instanceno
AND d.taskno in (20,30)
WHERE a.creation BETWEEN '2021-01-01' AND '2021-12-31'
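Note that grouping the derived table by instanceno, creation and taskno produces one row per distinct combination, so it does not reduce TableD to the single row with the highest position per instance. One possible way to pick that row is a correlated scalar subquery, sketched below. This is only an illustration under assumptions: it runs on Python's built-in sqlite3 module as a stand-in for the asker's database, it leaves out the TableA/TableB joins, the sample rows are invented, and it assumes "max(position)" means the highest position among the rows whose taskno is 20 or 30.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableC (caseno INTEGER, instanceno INTEGER);
    CREATE TABLE TableD (instanceno INTEGER, position INTEGER, creation TEXT, taskno INTEGER);
    -- Invented sample data: instance 11 has matching tasks, 22 has none, 33 is absent from TableD.
    INSERT INTO TableC VALUES (1, 11), (2, 22), (3, 33);
    INSERT INTO TableD VALUES
        (11, 1, '2021-02-01', 20),
        (11, 2, '2021-03-01', 30),
        (11, 3, '2021-04-01', 40),
        (22, 1, '2021-05-01', 10);
""")

rows = conn.execute("""
    SELECT c.instanceno,
           -- Highest-position row with taskno 20 or 30; NULL when no such row exists.
           (SELECT d.creation
            FROM TableD d
            WHERE d.instanceno = c.instanceno
              AND d.taskno IN (20, 30)
            ORDER BY d.position DESC
            LIMIT 1) AS result
    FROM TableC c
    ORDER BY c.instanceno
""").fetchall()

for row in rows:
    print(row)
```

The subquery returns at most one value per outer row, so the outer query keeps exactly one row per instance and yields NULL (None in Python) where nothing matches.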

How to find in a many to many relation all the identical values in a column and join the table with other three tables?

I have a many-to-many relation table with three columns (owner_id, property_id, ownership_perc): many owners can have many properties.
So I would like to find every owner_id that has more than one property (property_id) and join those rows with the other three tables (Tables 1, 3 and 4) to get further information for the result.
All the tables that I'm using are
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
When I try it like this, the query runs but returns no rows.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only when I try the code below does it return the owner_id values that have many properties, but without joining the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!
This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns every owner_id that has more than one property_id.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.
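To see why the grouped query in the question comes back empty while the derived-table join does not, here is a small self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the sample rows (Ann, Bob, the addresses) are invented for illustration. Grouping by owner_id, address, region and name makes each group a single row unless an owner holds two properties in the same building, so HAVING count(owner_id) > 1 filters everything out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (id_owner INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE building (id_building INTEGER PRIMARY KEY, address TEXT, region TEXT);
    CREATE TABLE property (id_property INTEGER PRIMARY KEY, building_id INTEGER);
    CREATE TABLE owner_property (owner_id INTEGER, property_id INTEGER, ownership_perc REAL);
    -- Invented data: Ann owns two properties in different buildings, Bob owns one.
    INSERT INTO owner VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO building VALUES (10, '1 Main St', 'North'), (20, '2 Side St', 'South');
    INSERT INTO property VALUES (100, 10), (200, 20), (300, 20);
    INSERT INTO owner_property VALUES (1, 100, 100), (1, 200, 50), (2, 300, 100);
""")

# The question's query: every group holds exactly one row, so nothing survives HAVING.
grouped_too_finely = conn.execute("""
    SELECT address, region, name
    FROM owner_property
    JOIN property ON owner_property.property_id = property.id_property
    JOIN owner ON owner.id_owner = owner_property.owner_id
    JOIN building ON property.building_id = building.id_building
    GROUP BY owner_id, address, region, name
    HAVING count(owner_id) > 1
    ORDER BY owner_id
""").fetchall()

# Count per owner first, then join the qualifying owners to the detail tables.
multi_owner_rows = conn.execute("""
    SELECT b.address, b.region, o.name
    FROM (SELECT owner_id
          FROM owner_property
          GROUP BY owner_id
          HAVING COUNT(property_id) > 1) t
    INNER JOIN owner_property op ON op.owner_id = t.owner_id
    INNER JOIN property p ON op.property_id = p.id_property
    INNER JOIN owner o ON o.id_owner = op.owner_id
    INNER JOIN building b ON p.building_id = b.id_building
    ORDER BY op.owner_id, op.property_id
""").fetchall()

print(grouped_too_finely)
print(multi_owner_rows)
```

The count has to be taken at the owner level only; the address and region columns can be joined in afterwards without affecting it.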
You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

How to split data of one column in multiple columns on the basis of a condition

I have one table with this data:
Category | New data
Cost of equipment | 23
Price of equipments | 45
Cost of M&C | 13
Price of M&C | 12
And another table with:
Category
Equipments
M&C
Now I want the data as below:
Category | Cost | Price
Equipment | 23 | 45
M&C | 13 | 12
Can you please help me solve this?
You may try this. A better approach is to change your table design.
Note that while joining I had to use RTRIM to remove the trailing "s" from "Equipments". I am not aware of any other variations in your data that might not match between the two tables. Please change the join conditions appropriately (or use a REGEXP match instead of ILIKE) if there are any.
SQL Fiddle
PostgreSQL 9.6 Schema Setup:
CREATE TABLE Table1
(Category varchar(19), New_data int)
;
INSERT INTO Table1
(Category, New_data)
VALUES
('Cost of equipment', 23),
('Price of equipments', 45),
('Cost of M&C', 13),
('Price of M&C', 12)
;
CREATE TABLE Table2
(Category varchar(10))
;
INSERT INTO Table2
(Category)
VALUES
('Equipments'),
('M&C')
;
Query 1:
WITH t1
AS (
SELECT b.category
,a.new_data
FROM TABLE1 a
INNER JOIN TABLE2 b ON a.Category ILIKE '%cost%' || RTRIM(b.Category, 's') || '%'
)
,t2
AS (
SELECT c.category
,a.new_data
FROM TABLE1 a
INNER JOIN TABLE2 c ON a.Category ILIKE '%price%' || RTRIM(c.Category, 's') || '%'
)
SELECT t1.category
,t1.new_data AS cost
,t2.new_data AS price
FROM t1
INNER JOIN t2 ON t1.category = t2.category
Results:
| category | cost | price |
|------------|------|-------|
| Equipments | 23 | 45 |
| M&C | 13 | 12 |
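The same result can also be reached with a single join and conditional aggregation instead of two CTEs. The sketch below is an alternative, not the answer's method, and it runs on Python's built-in sqlite3 module as a stand-in for PostgreSQL. It assumes each category has at most one cost row and one price row, and that the labels start with "Cost" or "Price". SQLite's LIKE is case-insensitive for ASCII by default; in PostgreSQL you would write ILIKE as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Category TEXT, New_data INTEGER);
    CREATE TABLE Table2 (Category TEXT);
    INSERT INTO Table1 VALUES
        ('Cost of equipment', 23),
        ('Price of equipments', 45),
        ('Cost of M&C', 13),
        ('Price of M&C', 12);
    INSERT INTO Table2 VALUES ('Equipments'), ('M&C');
""")

rows = conn.execute("""
    SELECT b.Category,
           -- Each CASE keeps only the matching row's value; MAX collapses the group to it.
           MAX(CASE WHEN a.Category LIKE 'cost%'  THEN a.New_data END) AS cost,
           MAX(CASE WHEN a.Category LIKE 'price%' THEN a.New_data END) AS price
    FROM Table2 b
    JOIN Table1 a ON a.Category LIKE '%' || RTRIM(b.Category, 's') || '%'
    GROUP BY b.Category
    ORDER BY b.Category
""").fetchall()

for row in rows:
    print(row)
```

Table1 is read once here, and adding a third measure would only need one more CASE expression.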

Cascading sum hierarchy using recursive cte

I'm trying to write a recursive CTE in Postgres but I can't wrap my head around it. Performance shouldn't be an issue, since there are only 50 items in TABLE 1.
TABLE 1 (expense):
id | parent_id | name
------------------------------
1 | null | A
2 | null | B
3 | 1 | C
4 | 1 | D
TABLE 2 (expense_amount):
ref_id | amount
-------------------------------
3 | 500
4 | 200
Expected Result:
id, name, amount
-------------------------------
1 | A | 700
2 | B | 0
3 | C | 500
4 | D | 200
Query
WITH RECURSIVE cte AS (
SELECT
expenses.id,
name,
parent_id,
expense_amount.total
FROM expenses
WHERE expenses.parent_id IS NULL
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
UNION ALL
SELECT
expenses.id,
expenses.name,
expenses.parent_id,
expense_amount.total
FROM cte
JOIN expenses ON expenses.parent_id = cte.id
LEFT JOIN expense_amount ON expense_amount.expense_id = expenses.id
)
SELECT
id,
SUM(amount)
FROM cte
GROUP BY 1
ORDER BY 1
Results
id | sum
--------------------
1 | null
2 | null
3 | 500
4 | 200
You can do a conditional sum() for only the root row:
with recursive tree as (
select id, parent_id, name, id as root_id
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.root_id
from expense c
join tree p on c.parent_id = p.id
)
select e.id,
e.name,
e.root_id,
case
when e.id = e.root_id then sum(ea.amount) over (partition by root_id)
else amount
end as amount
from tree e
left join expense_amount ea on e.id = ea.ref_id
order by id;
I prefer doing the recursive part first, then join the related tables to the result of the recursive query, but you could do the join to the expense_amount also inside the CTE.
Online example: http://rextester.com/TGQUX53703
However, the above only aggregates on the top-level parent, not for any intermediate non-leaf rows.
If you want to see intermediate aggregates as well, this gets a bit more complicated (and is probably not very scalable for large results, but you said your tables aren't that big):
with recursive tree as (
select id, parent_id, name, 1 as level, concat('/', id) as path, null::numeric as amount
from expense
where parent_id is null
union all
select c.id, c.parent_id, c.name, p.level + 1, concat(p.path, '/', c.id), ea.amount
from expense c
join tree p on c.parent_id = p.id
left join expense_amount ea on ea.ref_id = c.id
)
select e.id,
lpad(' ', (e.level - 1) * 2, ' ')||e.name as name,
e.amount as element_amount,
(select sum(amount)
from tree t
where t.path = e.path
or t.path like e.path||'/%') as sub_tree_amount, -- the '/' keeps path '/1' from also matching '/10'
e.path
from tree e
order by path;
Online example: http://rextester.com/MCE96740
The query builds up a path of all IDs belonging to a (sub)tree and then uses a scalar sub-select to get all child rows belonging to a node. That sub-select is what will make this quite slow as soon as the result of the recursive query can't be kept in memory.
I used the level column to create a "visual" display of the tree structure. This helps me debug the statement and understand the result better. If you need the real name of an element in your program, you would obviously use only e.name instead of prepending it with blanks.
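As an alternative to matching on string paths, the recursion can expand every node into (ancestor, descendant) pairs, after which a plain GROUP BY gives the subtree sums. The sketch below is one possible formulation, not the query from the linked example. It runs on Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the CTE name closure and its column names are chosen here for illustration. The data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expense (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    CREATE TABLE expense_amount (ref_id INTEGER, amount INTEGER);
    INSERT INTO expense VALUES (1, NULL, 'A'), (2, NULL, 'B'), (3, 1, 'C'), (4, 1, 'D');
    INSERT INTO expense_amount VALUES (3, 500), (4, 200);
""")

rows = conn.execute("""
    WITH RECURSIVE closure(ancestor_id, descendant_id) AS (
        -- Every node is its own descendant, so its own amount is included.
        SELECT id, id FROM expense
        UNION ALL
        -- Step one level down from each descendant found so far.
        SELECT cl.ancestor_id, e.id
        FROM closure cl
        JOIN expense e ON e.parent_id = cl.descendant_id
    )
    SELECT e.id, e.name, COALESCE(SUM(ea.amount), 0) AS amount
    FROM expense e
    JOIN closure cl ON cl.ancestor_id = e.id
    LEFT JOIN expense_amount ea ON ea.ref_id = cl.descendant_id
    GROUP BY e.id, e.name
    ORDER BY e.id
""").fetchall()

for row in rows:
    print(row)
```

COALESCE turns the NULL sum of a node without any amounts (B in the sample) into 0, which matches the expected result in the question. The number of pairs grows with tree depth, so this is meant for small hierarchies like the 50-row table described.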
I could not get your query to work for some reason. Here's my attempt, which works without recursion for the particular table you provided (parent and child only, no grandchildren).
--- step 1: get parent-child data together
with parent_child as(
select t.*, amount
from
(select e.id, f.name as name,
coalesce(f.name, e.name) as pname
from expense e
left join expense f
on e.parent_id = f.id) t
left join expense_amount ea
on ea.ref_id = t.id
)
--- final step is to group by id, name
select id, pname, sum(amount)
from
(-- step 2: group by parent name and find corresponding amount
-- returns A, B
select e.id, t.pname, t.amount
from expense e
join (select pname, sum(amount) as amount
from parent_child
group by 1) t
on t.pname = e.name
-- step 3: to get C, D we union and get corresponding columns
-- results in all rows and corresponding value
union
select id, name, amount
from expense e
left join expense_amount ea
on e.id = ea.ref_id
) t
group by 1, 2
order by 1;

Query table with multiple joined values

I've created a query that joins six tables:
SELECT a.accession, b.value, c.name, d.description, e.value, f.seqlen, f.residues
FROM chado.dbxref a inner join chado.dbxrefprop b on a.dbxref_id = b.dbxref_id
inner join chado.biomaterial d on b.dbxref_id = d.dbxref_id
inner join chado.feature f on d.dbxref_id = f.dbxref_id
inner join chado.biomaterialprop e on d.biomaterial_id = e.biomaterial_id
inner join chado.contact c on d.biosourceprovider_id = c.contact_id;
I'm currently working with a PostgreSQL schema called Chado (http://gmod.org/wiki/Chado_Tables). My attempts to comply with the preexisting schema have led me to deposit multiple joined values within the same table (two different values within the dbxrefprop table, three different values within the biomaterialprop table). Querying the database results in a substantial amount of redundant output. Is there a way for me to reduce output redundancy by modifying my query statement? Ideally, I'd like the output to resemble the following:
test001 | GB0101 | source011 | Faaberg,K.; Lyoo,K.; Korol,D.M. | serum | T1 | Iowa, USA | 01 Jan 2005 | 1234 | AUGAACGCCUUGCAUUACUAUGACUAUGAUU
Working query statement:
SELECT a.accession,
string_agg(distinct b.value, ' | ' ORDER BY b.value) AS bvalue_list,
c.name,
d.description,
string_agg(distinct e.value, ' | ' ORDER BY e.value) AS evalue_list,
f.seqlen,
f.residues
FROM chado.dbxref a INNER JOIN chado.dbxrefprop b ON a.dbxref_id = b.dbxref_id
INNER JOIN chado.biomaterial d ON b.dbxref_id = d.dbxref_id
INNER JOIN chado.feature f ON d.dbxref_id = f.dbxref_id
INNER JOIN chado.biomaterialprop e ON d.biomaterial_id = e.biomaterial_id
INNER JOIN chado.contact c ON d.biosourceprovider_id = c.contact_id
GROUP BY a.accession, c.name, d.description, f.seqlen, f.residues;