I am trying to expand an existing system to allow self-referencing "foreign key" relationships based on values in JSONB. I'll give an example for better understanding:
| ID | DATA |
|----|------|
| 1  | {building_id: 1, building_name: 'Office 1'} |
| 2  | {building_id: 2, building_name: 'Office 2'} |
| 3  | {building_id: 1, full_name: 'John Doe', salary: 3000} |
| 4  | {building_id: 1, full_name: 'Alex Smit', salary: 2000} |
| 5  | {building_id: 1, full_name: 'Anna Birkin', salary: 2500} |
I tried using jsonb_array_elements_text, but I need the new data to be combined into a single JSON field, like this:
| ID | DATA |
|----|------|
| 1  | {building_id: 1, building_name: 'Office 1', full_name: 'John Doe', salary: 3000} |
| 2  | {building_id: 1, building_name: 'Office 1', full_name: 'Alex Smit', salary: 2000} |
| 3  | {building_id: 2, building_name: 'Office 2', full_name: 'Anna Birkin', salary: 2500} |
I am wondering whether this is even possible.
Assuming that Anna Birkin is in building_id 2 and these different object types are consistent enough to determine types by the presence of keys, try something like this:
select b.data || p.data as result
from really_bad_idea b
join really_bad_idea p on p.data->'building_id' = b.data->'building_id'
where b.data ? 'building_name'
and p.data ? 'full_name';
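A minimal reproduction of the idea (the table name here is the one used in the query above; the inserted rows follow the example, with Anna moved to building 2 per the stated assumption):

```sql
-- Hypothetical setup matching the example data
CREATE TABLE really_bad_idea (id serial PRIMARY KEY, data jsonb);

INSERT INTO really_bad_idea (data) VALUES
  ('{"building_id": 1, "building_name": "Office 1"}'),
  ('{"building_id": 2, "building_name": "Office 2"}'),
  ('{"building_id": 1, "full_name": "John Doe", "salary": 3000}'),
  ('{"building_id": 1, "full_name": "Alex Smit", "salary": 2000}'),
  ('{"building_id": 2, "full_name": "Anna Birkin", "salary": 2500}');

-- Pair each person row with its building row and merge the two objects;
-- the ? key-existence tests tell the two "row types" apart
SELECT b.data || p.data AS result
FROM really_bad_idea b
JOIN really_bad_idea p ON p.data->'building_id' = b.data->'building_id'
WHERE b.data ? 'building_name'
  AND p.data ? 'full_name';
```

The `||` operator merges two jsonb objects, with keys from the right-hand operand winning on conflict, which is why the person's fields end up alongside the building's.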
I'm saving dynamic objects (objects of which I do not know the type upfront) using the following 2 tables in Postgres:
CREATE TABLE IF NOT EXISTS objects(
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY(id)
);
CREATE TABLE IF NOT EXISTS object_values(
id UUID NOT NULL DEFAULT gen_random_uuid(),
event_id UUID NOT NULL,
param TEXT NOT NULL,
value TEXT NOT NULL,
PRIMARY KEY(id)
);
So for instance, if I have the following objects:
dog = [
{ breed: "poodle", age: 15, ... },
{ breed: "husky", age: 9, ... },
]
monitors = [
{ manufacturer: "dell", ... },
]
It will live in the DB as follows:
-- objects
| id | user_id | name |
|----|---------|---------|
| 1 | 1 | dog |
| 2 | 2 | dog |
| 3 | 1 | monitor |
-- object_values
| id | event_id | param | value |
|----|----------|--------------|--------|
| 1 | 1 | breed | poodle |
| 2 | 1 | age | 15 |
| 3 | 2 | breed | husky |
| 4 | 2 | age | 9 |
| 5 | 3 | manufacturer | dell |
Note, these tables are big (hundreds of millions of rows) and generally optimised for writing.
What would be a good way of querying/filtering objects based on multiple object params? For instance: Select the number of all husky dogs above the age of 10 per unique user.
I also wonder whether it would have been better to denormalise the tables and collapse the params onto a JSON column (and use gin indexes).
Are there any standards I can use here?
"Select the number of all husky dogs above the age of 10 per unique user" - The following query would do it.
SELECT user_id, COUNT(*) AS num_husky_dogs_older_than_10
FROM (
    SELECT o.user_id, o.id
    FROM objects o
    INNER JOIN object_values ov
            ON o.id = ov.event_id
    WHERE o.name = 'dog'
    GROUP BY o.user_id, o.id
    HAVING MAX(CASE WHEN ov.param = 'age'
                    THEN CASE WHEN ov.value::integer > 10 THEN 1 END
               END) = 1
       AND MAX(CASE WHEN ov.param = 'breed'
                     AND ov.value = 'husky' THEN 1 END) = 1
) matching_dogs
GROUP BY user_id;
Since your queries will most likely always perform the same JOIN between these two tables on the same fields, it would be good to have indices on:
the fields you join on ("objects.id", "object_values.event_id")
the fields you filter on ("objects.name", "object_values.param", "object_values.value")
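On the denormalisation question: a hedged sketch of what the JSONB alternative could look like (the table name, index name, and GIN opclass choice here are assumptions, not part of the original schema):

```sql
-- Hypothetical denormalised variant: one row per object, params collapsed into JSONB
CREATE TABLE objects_jsonb (
    id      UUID NOT NULL DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    name    TEXT NOT NULL,
    params  JSONB NOT NULL,
    PRIMARY KEY (id)
);

-- jsonb_path_ops GIN indexes are smaller than the default opclass and support @>
CREATE INDEX objects_jsonb_params_idx
    ON objects_jsonb USING gin (params jsonb_path_ops);

-- "husky dogs above the age of 10 per user": the containment test can use the
-- index; the numeric comparison is applied as a filter on the matching rows
SELECT user_id, COUNT(*) AS num_husky_dogs_older_than_10
FROM objects_jsonb
WHERE name = 'dog'
  AND params @> '{"breed": "husky"}'
  AND (params->>'age')::integer > 10
GROUP BY user_id;
```

The trade-off is the usual one: the EAV layout is append-friendly, while the JSONB layout turns the multi-row HAVING gymnastics into a single containment predicate per object.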
Basically I have 2 models like so:
Comments:
+----+--------+----------+-----------------+
| id | userId | parentId | text |
+----+--------+----------+-----------------+
| 2 | 5 | 1 | Beautiful photo |
+----+--------+----------+-----------------+
| 3 | 2 | 2 | Thanks Jeff. |
+----+--------+----------+-----------------+
| 4 | 7 | 2 | Thank you, Jeff.|
+----+--------+----------+-----------------+
This table is designed to handle threads: each parentId is itself the id of another comment.
And CommentLikes:
+----+--------+-----------+
| id | userId | commentId |
+----+--------+-----------+
| 1 | 2 | 2 |
+----+--------+-----------+
| 2 | 7 | 2 |
+----+--------+-----------+
| 3 | 7 | 3 |
+----+--------+-----------+
What I'm trying to achieve is an SQL query that will perform the following (given the parameter parentId):
Get a limit of 10 replies that belong to parentId. With each reply, I need a count indicating the total number of replies to that reply and another count indicating the total number of likes given to that reply.
Sample input #1: /replies/1
Expected output:
[{
id: 2,
userId: 5,
parentId: 1,
text: 'Beautiful photo',
likeCount: 2,
replyCount: 2
}]
Sample input #2: /replies/2
Expected output:
[
{
id: 3,
userId: 2,
parentId: 2,
text: 'Thanks Jeff.',
replyCount: 0,
likeCount: 1
},
{
id: 4,
userId: 7,
parentId: 2,
text: 'Thank you, Jeff.',
replyCount: 0,
likeCount: 0
}
]
I'm trying to use Sequelize for my case, but it seems to only over-complicate things, so any raw SQL query will do.
Thank you in advance.
What about something like this:
SELECT *,
       (SELECT COUNT(*) FROM comment_likes WHERE comment_likes."commentId" = comments.id) AS likecount,
       (SELECT COUNT(*) FROM comments AS c WHERE c."parentId" = comments.id) AS replycount
FROM comments
WHERE comments."parentId" = 2;
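The question also asks for at most 10 replies per parent; a hedged extension of the same query (ordering by id is an assumption here, any stable ordering works):

```sql
SELECT comments.*,
       (SELECT COUNT(*) FROM comment_likes
         WHERE comment_likes."commentId" = comments.id) AS "likeCount",
       (SELECT COUNT(*) FROM comments AS c
         WHERE c."parentId" = comments.id)              AS "replyCount"
FROM comments
WHERE comments."parentId" = 2   -- parameterise this per request
ORDER BY comments.id
LIMIT 10;
```

Quoting the aliases preserves the camelCase keys ("likeCount", "replyCount") that the expected JSON output uses.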
I have a table of transactions where I need to group by customerId and create aggregated JSON records in an array.
| customerId | transactionVal | day        |
|------------|----------------|------------|
| 1234       | 2              | 2019-01-01 |
| 1234       | 3              | 2019-01-04 |
| 14         | 1              | 2019-01-01 |
What I'm looking to do is return something like this:
customerId|transactions
1234|[{'transactionVal': 2, 'day': '2019-01-01'}, {'transactionVal': 3, 'day': '2019-01-04'}]
14|[{'transactionVal': 1, 'day': '2019-01-01'}]
I will need to later iterate through each transaction in the array to calculate % changes in the transactionVal.
I searched for a while but could not find anything to handle this, as the table is quite large (> 70 million rows).
Thanks!
It should be possible to use array_agg and json_build_object, like so:
ok=# select customer_id,
array_agg(json_build_object('thing', thing, 'time', time))
from test group by customer_id;
customer_id | array_agg
-------------+---------------------------------------------------------------------------------------------------
2 | {"{\"thing\" : 1, \"time\" : 1}"}
1 | {"{\"thing\" : 1, \"time\" : 1}",
"{\"thing\" : 2, \"time\" : 2}",
"{\"thing\" : 3, \"time\" : 3}"}
(2 rows)
ok=# select * from test;
customer_id | thing | time
-------------+-------+------
1 | 1 | 1
2 | 1 | 1
1 | 2 | 2
1 | 3 | 3
(4 rows)
ok=#
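Since the desired output is a single JSON array per customer (rather than a Postgres array of json values, as above), jsonb_agg may map onto it more directly. A hedged sketch using the column names from the question (the quoted camelCase identifiers are an assumption):

```sql
-- One JSON array of transaction objects per customer
SELECT "customerId",
       jsonb_agg(
           jsonb_build_object('transactionVal', "transactionVal", 'day', day)
           ORDER BY day            -- keeps transactions in date order
       ) AS transactions
FROM transactions
GROUP BY "customerId";
```

The ORDER BY inside the aggregate matters for the follow-up goal of computing % changes between consecutive transactionVal values, since it fixes the element order within each array.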
I have a many-to-one relationship between Animals and their attributes. Because different Animals have different attributes, I want to be able to select all animals with their attribute name as a column header and NULL values where that animal does not have that attribute.
Like so...
TABLE_ANIMALS
ID | ANIMAL | DATE | MORE COLS....
1 | CAT | 2012-01-10 | ....
2 | DOG | 2012-01-10 | ....
3 | FROG | 2012-01-10 | ....
...
TABLE_ATTRIBUTES
ID | ANIMAL_ID | ATTRIBUTE_NAME | ATTRIBUTE_VALUE
1 | 1 | noise | meow
2 | 1 | legs | 4
3 | 1 | has_fur | TRUE
4 | 2 | noise | woof
5 | 2 | legs | 4
6 | 3 | noise | croak
7 | 3 | legs | 2
8 | 3 | has_fur | FALSE
...
QUERY RESULT
ID | ANIMAL | NOISE | LEGS | HAS_FUR
1 | CAT | meow | 4 | TRUE
2 | DOG | woof | 4 | NULL
3 | FROG | croak | 2 | FALSE
How would I do this? To reiterate, it's important that all the columns are there even if one Animal doesn't have that attribute, such as "DOG" and "HAS_FUR" in this example. If it doesn't have the attribute, it should just be null.
How about a simple join, aggregation and group by?
create table table_animals(id int, animal varchar(10), date date);
create table table_attributes(id varchar(10), animal_id int, attribute_name varchar(10), attribute_value varchar(10));
insert into table_animals values (1, 'CAT', '2012-01-10'),
(2, 'DOG', '2012-01-10'),
(3, 'FROG', '2012-01-10');
insert into table_attributes values (1, 1, 'noise', 'meow'),
(2, 1, 'legs', 4),
(3, 1, 'has_fur', TRUE),
(4, 2, 'noise', 'woof'),
(5, 2, 'legs', 4),
(6, 3, 'noise', 'croak'),
(7, 3, 'legs', 2),
(8, 3, 'has_fur', FALSE);
select ta.id,
       ta.animal,
       max(attribute_value) filter (where attribute_name = 'noise') as noise,
       max(attribute_value) filter (where attribute_name = 'legs') as legs,
       max(attribute_value) filter (where attribute_name = 'has_fur') as has_fur
from table_animals ta
left join table_attributes tat on tat.animal_id = ta.id
group by ta.id, ta.animal;
Additionally, you can change the aggregation to MAX(CASE WHEN ...), but MAX(...) FILTER (WHERE ...) generally performs better.
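For reference, the MAX(CASE WHEN ...) form mentioned above would look like this, against the same two tables:

```sql
-- Portable alternative to FILTER: the CASE yields NULL for non-matching
-- attribute rows, which MAX then ignores
select ta.id,
       ta.animal,
       max(case when attribute_name = 'noise'   then attribute_value end) as noise,
       max(case when attribute_name = 'legs'    then attribute_value end) as legs,
       max(case when attribute_name = 'has_fur' then attribute_value end) as has_fur
from table_animals ta
left join table_attributes tat on tat.animal_id = ta.id
group by ta.id, ta.animal;
```

This version also runs on databases without the FILTER clause, which is the main reason to reach for it.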
Imagine I have a table cars with a field data inside:
CARS
name | data
car 1 | { "doors" => "5", "engine" => "1.1" }
car 2 | { "doors" => "3", "engine" => "1.1", "air_conditioning" => "true" }
car 3 | { "doors" => "5", "engine" => "1.4" }
Assuming data keys are dynamic (more can be added), how can I create a pivot table from that data like this:
CROSSTAB
name | doors | engine | air_conditioning
car 1 | 5 | 1.1 |
car 2 | 3 | 1.1 | "true"
car 3 | 5 | 1.4 |
Here's how to get the result you asked for:
CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE hstore_test (id bigserial primary key, title text, doors integer, engine text, air_conditioning boolean);

INSERT INTO hstore_test (title, doors, engine, air_conditioning)
VALUES ('Car1', 2, '1.1', false), ('Car2', 4, '1.2', true), ('Car3', 3, '1.3', false), ('Car4', 5, '1.4', null);
DROP TABLE IF EXISTS hstore_persist;
CREATE TABLE hstore_persist AS
SELECT hstore(t) car_data FROM hstore_test AS t;
SELECT car_data->'title' AS "name", car_data->'doors' AS doors, car_data->'engine' AS engine, car_data->'air_conditioning' AS air_conditioning
FROM hstore_persist;
This will result in the table
name | doors | engine | air_conditioning
Car1 | 2 | 1.1 | f
Car2 | 4 | 1.2 | t
Car3 | 3 | 1.3 | f
Car4 | 5 | 1.4 |
There is nothing "crosstab" about it, though. This is just using the accessor methods of an hstore to display the data in the way you show in the example.
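When the keys really are dynamic, they can also be enumerated instead of hard-coded; a small sketch against the same hstore_persist table:

```sql
-- List every key/value pair stored per car, without knowing the keys upfront;
-- each() expands an hstore into (key, value) rows
SELECT car_data->'title' AS name, kv.key, kv.value
FROM hstore_persist,
     each(car_data) AS kv(key, value)
ORDER BY name, kv.key;
```

Pivoting those rows back into one column per key still requires either hard-coded accessors as above or the crosstab() function from the tablefunc extension, since SQL needs the output column list fixed at parse time.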