SELECT statement for all 3 tables in a many-to-many relationship - postgresql

I am having trouble writing a SELECT query that includes all the 3 tables in a many-to-many relationship. I have the following tables:
Table "public.companies"
Column | Type | Modifiers | Storage | Stats target | Description
----------------+------------------------+--------------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('companies_id_seq'::regclass) | plain | |
name | character varying(48) | not null | extended | |
description | character varying(512) | | extended | |
tagline | character varying(64) | | extended | |
featured_image | integer | | plain | |
Indexes:
"companies_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "company_category_associations" CONSTRAINT "company_category_associations_company_id_foreign" FOREIGN KEY (company_id) REFERENCES companies(id) ON DELETE CASCADE
Table "public.company_category_associations"
Column | Type | Modifiers
-------------+---------+----------------------------------------------------------------------------
id | integer | not null default nextval('company_category_associations_id_seq'::regclass)
company_id | integer | not null
category_id | integer | not null
Indexes:
"company_category_associations_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"company_category_associations_category_id_foreign" FOREIGN KEY (category_id) REFERENCES company_categories(id) ON DELETE RESTRICT
"company_category_associations_company_id_foreign" FOREIGN KEY (company_id) REFERENCES companies(id) ON DELETE CASCADE
Table "public.company_categories"
Column | Type | Modifiers
-------------+-----------------------+-----------------------------------------------------------------
id | integer | not null default nextval('company_categories_id_seq'::regclass)
name | character varying(32) | not null
description | character varying(96) |
Indexes:
"company_categories_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "company_category_associations" CONSTRAINT "company_category_associations_category_id_foreign" FOREIGN KEY (category_id) REFERENCES company_categories(id) ON DELETE RESTRICT
My companies table will have around 100k rows and a company can have up to 10 categories associated. Of course I won't be selecting more than 200 companies at a time.
I managed to get the results with the following query:
select
c.id as companyid,
c.name as companyname,
cat.id as categoryid,
cat.name as categoryname
from company_categories cat
left join company_category_associations catassoc on catassoc.category_id = cat.id
left join companies c on catassoc.company_id = c.id where c.id is not null;
This question comes from the fact that I need to present data in JSON format and I would like it to look like this:
{
"companies": [
{
"name": "...",
"description": "...",
"categories": [
{
"id": 12,
"name": "Technology"
},
{
"id": 14,
"name": "Computers"
}
]
},
/* ... */
]
}
And basically I want to take as much of that data in as few queries as possible.
How can I write that SELECT query to fit my needs?
Is there a problem with the database structure as shown above?
Thank you!
P.S. I am using PostgreSQL 9.6.6

You can get JSON out of PostgreSQL directly.
First, define composite types for the JSON objects you want to output. This gives the fields proper names; otherwise they would be named f1, f2, ...
create type cat as (id integer, name varchar);
create type comp as (id integer, name varchar, categories cat[]);
Then select an array of comps, each with a nested array of cats, as JSON:
select to_json(array(
select (
c.id,
c.name,
array(
select (cc.id, cc.name)::cat
from company_categories cc
join company_category_associations cca on (cca.category_id=cc.id and cca.company_id=c.id)
))::comp
from companies c
)) as companies
dbfiddle

You can nest two JSON aggregations to get there.
The first level creates a JSON value for each company with the categories stored as an array:
select to_jsonb(c) || jsonb_build_object('categories', jsonb_agg(cc)) comp_json
from companies c
join company_category_associations cca on cca.company_id = c.id
join company_categories cc on cc.id = cca.category_id
group by c.id;
This returns one row per company. Now we need to aggregate those rows into a single JSON value:
select jsonb_build_object('companies', jsonb_agg(comp_json))
from (
select to_jsonb(c) || jsonb_build_object('categories', jsonb_agg(cc)) comp_json
from companies c
join company_category_associations cca on cca.company_id = c.id
join company_categories cc on cc.id = cca.category_id
group by c.id
) t;
Online example: http://rextester.com/TRCR26633
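Both queries above use inner joins, so a company without any category is left out of the result. A minimal sketch of a variant that keeps such companies with an empty "categories" array and only emits the fields shown in the question's JSON; the `limit 200` and the ordering by id are assumptions taken from the question's "no more than 200 companies at a time", not part of either answer:

```sql
-- Sketch only: LEFT JOIN keeps companies that have no categories.
-- FILTER (available since 9.4) drops the all-NULL row a LEFT JOIN produces.
select jsonb_build_object(
           'companies', coalesce(jsonb_agg(comp_json), '[]'::jsonb)
       ) as result
from (
    select jsonb_build_object(
               'name', c.name,
               'description', c.description,
               'categories', coalesce(
                   jsonb_agg(jsonb_build_object('id', cc.id, 'name', cc.name))
                       filter (where cc.id is not null),
                   '[]'::jsonb)
           ) as comp_json
    from companies c
    left join company_category_associations cca on cca.company_id = c.id
    left join company_categories cc on cc.id = cca.category_id
    group by c.id      -- c.id is the primary key, so c.name etc. may be selected
    order by c.id
    limit 200          -- assumed page size
) t;
```

Grouping by the primary key alone is enough in PostgreSQL 9.1 and later, which covers the 9.6.6 mentioned in the question.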

Related

Postgresql remove values from foreign key that has a cyclic reference and also is referenced in a primary table

There are 2 tables:
the first one is the Father Table
create table win_folder_principal(
id_folder_principal serial primary key not null,
folder_name varchar(300)not null
);
and the table that has a cyclic reference
create table win_folder_dependency(
id_folder_dependency serial primary key not null,
id_folder_father int not null,
id_folder_son int not null,
foreign key(id_folder_father)references win_folder_principal(id_folder_principal),
foreign key(id_folder_son)references win_folder_principal(id_folder_principal)
);
However, I ran into an interesting situation: if I want to remove a row from the father table that has a child, and that child has children of its own, is there any way to remove the rows from the last descendant up to the first, and also remove those rows from the father table?
**WIN_FOLDER_PRINCIPAL**
| Id | Folder_Name|
| 23 | new2 |
| 24 | new3 |
| 13 | new0 |
| 22 | new1 |
| 12 | nFol |
And these are the values stored in Win_Folder_Dependency:
**WIN_FOLDER_DEPENDENCY**
| Id_Father | Id_Son |
| 12 | 13 |
| 13 | 22 |
| 22 | 23 |
| 23 | 24 |
And this is the query I use to see the values in the dependency and principal tables:
SELECT m2.id_folder_principal AS "Principal",
m.folder_name AS "Dependency",
m2.id_folder_principal AS id_principal,
m.id_folder_principal AS id_dependency
FROM ((win_folder_dependency md
JOIN win_folder_principal m ON ((m.id_folder_principal = md.id_folder_son)))
JOIN win_folder_principal m2 ON ((m2.id_folder_principal = md.id_folder_father)))
If I want to remove the folder with Id_Principal 13, I need to remove its other relations in the Folder_Dependency table, and also remove the corresponding rows from Folder_Principal.
Is there any way to achieve that cyclic delete?
This anonymous code block walks down the dependency tree starting at ID 13 and accumulates every principal it finds in an array variable named l_principles. It then deletes all the dependency records where either the father or the son (or both) is contained in l_principles, and finally deletes all the principal records identified in l_principles:
DO $$DECLARE
l_principles int[];
BEGIN
with recursive t1(root, child, principals) as (
select id_folder_father
, id_folder_son
, array[id_folder_father, id_folder_son]
from win_folder_dependency
where id_folder_father = 13
union all
select root
, id_folder_son
, principals||id_folder_son
from win_folder_dependency
join t1
on id_folder_father = child
and not id_folder_son = any(principals) -- Avoid cycles
)
-- Flatten every path so that branching trees are fully covered
select array_agg(distinct p) into l_principles
from t1, unnest(principals) as p;
delete from win_folder_dependency
where id_folder_father = any(l_principles)
or id_folder_son = any(l_principles);
delete from win_folder_principal
where id_folder_principal = any(l_principles);
end$$;
With your provided sample data, the end result will be only one record remaining in the win_folder_principal and no records in the win_folder_dependency table.
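Before running a delete like this, it can help to preview which folders would be affected. A minimal sketch, assuming the same tables and the same starting ID 13; the CTE name `subtree` is made up for illustration. Using `union` rather than `union all` discards rows already seen, so a cyclic dependency cannot make the recursion loop forever:

```sql
-- Preview only: lists folder 13 and everything below it, deletes nothing.
with recursive subtree(id) as (
    select 13                        -- starting folder (taken from the question)
    union                            -- UNION deduplicates, which also stops cycles
    select d.id_folder_son
    from win_folder_dependency d
    join subtree s on d.id_folder_father = s.id
)
select p.id_folder_principal, p.folder_name
from win_folder_principal p
join subtree s on s.id = p.id_folder_principal
order by p.id_folder_principal;
```

With the sample data from the question this lists folders 13, 22, 23 and 24, which are the rows the block above removes.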
If you want to delete a record from win_folder_principal, you must first remove the references to it in win_folder_dependency, like so:
delete from win_folder_dependency where 13 in (id_folder_father, id_folder_son);
before you delete the record from win_folder_principal like so:
delete from win_folder_principal where id_folder_principal = 13;
Alternatively if you build your second table like this:
create table win_folder_dependency(
id_folder_dependency serial primary key not null,
id_folder_father int not null,
id_folder_son int not null,
foreign key(id_folder_father)references win_folder_principal(id_folder_principal) on delete cascade,
foreign key(id_folder_son)references win_folder_principal(id_folder_principal) on delete cascade
);
Note the on delete cascade clauses: with them in place, you can delete directly from the principal table, and the rows in the dependency table that reference the deleted folder are removed as well.
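If the dependency table already exists, the foreign keys can be replaced in place instead of recreating the table. This is a sketch under one assumption: the constraint names below follow PostgreSQL's default naming pattern for unnamed foreign keys, and the actual names should be checked with \d win_folder_dependency before running it.

```sql
-- Assumed constraint names (PostgreSQL's default <table>_<column>_fkey pattern).
begin;

alter table win_folder_dependency
    drop constraint win_folder_dependency_id_folder_father_fkey,
    add constraint win_folder_dependency_id_folder_father_fkey
        foreign key (id_folder_father)
        references win_folder_principal (id_folder_principal)
        on delete cascade;

alter table win_folder_dependency
    drop constraint win_folder_dependency_id_folder_son_fkey,
    add constraint win_folder_dependency_id_folder_son_fkey
        foreign key (id_folder_son)
        references win_folder_principal (id_folder_principal)
        on delete cascade;

commit;
```

The cascade only removes rows from win_folder_dependency. Descendant folders stay in win_folder_principal, so deleting a whole subtree still needs a recursive step.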

Postgres - updates with join gives wrong results

I'm having a hard time understanding what I'm doing wrong.
The result of this query shows the same results for each row instead of being updated by the right result.
My DATA
I'm trying to update a table of stats over a set of businesses
business_stats ( id SERIAL,
pk integer not null,
b_total integer,
PRIMARY KEY(pk)
);
the details of each business are stored here
business_details (id SERIAL,
category CHARACTER VARYING,
feature_a CHARACTER VARYING,
feature_b CHARACTER VARYING,
feature_c CHARACTER VARYING
);
and here a table that associate the pk with the category
datasets (id SERIAL,
pk integer not null,
category CHARACTER VARYING,
PRIMARY KEY(pk)
);
WHAT I DID (wrong)
UPDATE business_stats
SET b_total = agg.total
FROM business_stats b,
( SELECT d.pk, count(bd.id) total
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
GROUP BY d.pk
) agg
WHERE b.pk = agg.pk;
The result of this query is
| id | pk | b_total |
+----+----+-----------+
| 1 | 14 | 273611 |
| 2 | 15 | 273611 |
| 3 | 16 | 273611 |
| 4 | 17 | 273611 |
but if I run just the SELECT the results of each pk are completely different
| pk | agg.total |
+----+-------------+
| 14 | 273611 |
| 15 | 407802 |
| 16 | 179996 |
| 17 | 815580 |
THE QUESTION
why is this happening?
why is the WHERE clause not working?
Before writing this question I've used as reference these posts: a, b, c
Do the following (I always recommend against joins in Updates)
UPDATE business_stats bs
SET b_total =
( SELECT count(bd.id) total
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
where d.pk=bs.pk
)
/*optional*/
where exists (SELECT *
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
where d.pk=bs.pk)
The issue is your FROM clause. The repeated reference to business_stats means you aren't restricting the join like you expect to. You're joining agg against the second unrelated mention of business_stats rather than the row you want to update.
Something like this is what you are after (warning not tested):
UPDATE business_stats AS b
SET b_total = agg.total
FROM
(...) agg
WHERE b.pk = agg.pk;
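Spelled out with the aggregate subquery from the question in place of the placeholder, the statement might read as follows. This is a sketch assembled from the question's own tables and columns, and like the snippet above it has not been run against the asker's data:

```sql
-- business_stats appears once, as the update target; it is not repeated in FROM.
update business_stats as b
set b_total = agg.total
from (
    select d.pk, count(bd.id) as total
    from business_details as bd
    inner join datasets as d on bd.category = d.category
    group by d.pk
) as agg
where b.pk = agg.pk;   -- ties each aggregated row to the row being updated
```

Because the target table is listed only once, the WHERE clause now constrains the row being updated rather than a second, independent copy of business_stats.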

PostgreSQL Advanced Join

I am new to PostgreSQL.
I have two tables like this
Table A
id |value | type1 | type2 | type3
bigint |text | bigint | bigint | bigint
Table B
Id | description
bigint | text
Table A's type1, type2, and type3 columns hold ids from Table B, but there is no foreign key constraint.
I need to retrieve something like this:
Select a.id,
a.value,
b.description1(as of a.type1),
b.description2(as of a.type2),
b.description3(as of a.type3)
If you have too many columns, you should consider changing your db design
TableA
id | value
1 | <something>
2 | <something>
TableAType
id | TableA.id | type_id | typeValue
1 | 1 | type1 | bigint
2 | 1 | type2 | bigint
3 | 1 | type3 | bigint
.....
4 | 1 | typeN | bigint
TableB (type_description)
Id | description
bigint | text
Then your query becomes simpler and isn't affected when you add or remove types.
SELECT TB.Description, TT.TypeValue
FROM TableAType TT
JOIN TableB TB
ON TT.Type_id = TB.id
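Then you can pivot the rows to get the tabular data (PostgreSQL has no PIVOT keyword; crosstab from the tablefunc extension or conditional aggregation can do it). Again, the advantage is that you can add or remove types without changing your query; you only need to update the type tables.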
You should (LEFT) JOIN tableB 3 times, and use a different alias for each one.
select ta.id, ta.value,
ta.type1, t1.description descri1,
ta.type2, t2.description descri2,
ta.type3, t3.description descri3
from tableA ta
left join tableB t1
on ta.type1 = t1.id
left join tableB t2
on ta.type2 = t2.id
left join tableB t3
on ta.type3 = t3.id;

table with two nullable foreign keys join results into single column

Sorry if title isn't very descriptive. I have a table like this example, and am using sql server 2012:
PersonId | PetID
and want to join it to the following two tables
PersonId | PersonName | PersonAsset
AnimalId | AnimalName | AnimalAsset
So the end result is:
PersonId | PetId | Name | Asset
-------------------------------
1 null Dave 1
null 1 Fido 2
The output you require can be achieved by using a LEFT JOIN for your two tables and ISNULL for the required fields.
For example (assuming the first table is named 'common'):
SELECT common.PersonId,
common.PetId,
ISNULL(person.PersonName, animal.AnimalName) AS Name,
ISNULL(person.PersonAsset, animal.AnimalAsset) AS Asset
FROM common
LEFT JOIN person ON common.PersonId = person.PersonId
LEFT JOIN animal ON common.PetId = animal.AnimalId
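The same idea can be written with the standard COALESCE function instead of ISNULL. A minimal sketch, assuming the same hypothetical table names (common, person, animal) as the example above and that the first table's PetId refers to animal.AnimalId:

```sql
-- COALESCE returns the first non-NULL argument; it is standard SQL,
-- so the query is not tied to SQL Server.
SELECT c.PersonId,
       c.PetId,
       COALESCE(p.PersonName, a.AnimalName)   AS Name,
       COALESCE(p.PersonAsset, a.AnimalAsset) AS Asset
FROM common AS c
LEFT JOIN person AS p ON c.PersonId = p.PersonId
LEFT JOIN animal AS a ON c.PetId = a.AnimalId;
```

One practical difference in SQL Server: ISNULL takes its result type from the first argument, whereas COALESCE picks the result type by data type precedence across all arguments, which matters if the person and animal columns are declared with different types or lengths.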

Sort SELECT result by pairs of columns

In the following PostgreSQL 8.4.13 table
(where author users give grades to id users):
# \d pref_rep;
Table "public.pref_rep"
Column | Type | Modifiers
-----------+-----------------------------+-----------------------------------------------------------
id | character varying(32) | not null
author | character varying(32) | not null
good | boolean |
fair | boolean |
nice | boolean |
about | character varying(256) |
stamp | timestamp without time zone | default now()
author_ip | inet |
rep_id | integer | not null default nextval('pref_rep_rep_id_seq'::regclass)
Indexes:
"pref_rep_pkey" PRIMARY KEY, btree (id, author)
Check constraints:
"pref_rep_check" CHECK (id::text <> author::text)
Foreign-key constraints:
"pref_rep_author_fkey" FOREIGN KEY (author) REFERENCES pref_users(id) ON DELETE CASCADE
"pref_rep_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE
How can I find faked entries, i.e. those that have the same id and the same author_ip?
I.e. some users register several accounts and then submit bad notes (the good, fair, nice columns above) for other users. But I can still identify them by their author_ip addresses.
I'm trying to find them by fetching:
# select id, author_ip from pref_rep group by id, author_ip;
id | author_ip
-------------------------+-----------------
OK490496816466 | 94.230.231.106
OK360565502458 | 78.106.102.16
DE25213 | 178.216.72.185
OK331482634936 | 95.158.209.5
VK25785834 | 77.109.20.182
OK206383671767 | 80.179.90.103
OK505822972559 | 46.158.46.126
OK237791033602 | 178.76.216.77
VK90402803 | 109.68.173.37
MR16281819401420759860 | 109.252.139.198
MR5586967138985630915 | 2.93.14.248
OK341086615664 | 93.77.75.142
OK446200841566 | 95.59.127.194
But I need to sort the above result.
How can I sort it by the number of pairs (id, author_ip) desc please?
select id, pr.author_ip
from
pref_rep pr
inner join
(
select author_ip
from pref_rep
group by author_ip
having count(*) > 1
) s using(author_ip)
order by 2, 1
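The query above lists rows whose author_ip occurs more than once, ordered by address. To order by how often each (id, author_ip) pair occurs, as the question asks, a sketch along these lines could be used; the column alias pair_count is made up for illustration:

```sql
-- Count each (id, author_ip) pair and show the most frequent pairs first.
select id, author_ip, count(*) as pair_count
from pref_rep
group by id, author_ip
having count(*) > 1          -- keep only pairs that repeat
order by pair_count desc, id;
```

Dropping the having clause would list every pair, including those that occur only once, still sorted by frequency.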