I am designing a comment/reply model in Go and Postgres, and my comments table looks like this:
CREATE TABLE comments(
    postid uuid DEFAULT uuid_generate_v4(),
    comment TEXT,
    comment_reaction VARCHAR(255) NOT NULL,
    commented_user VARCHAR(255) NOT NULL,
    created_at TIMESTAMP,
    parent_path ltree,
    CONSTRAINT fk_post FOREIGN KEY(postid)
        REFERENCES posts(postid) ON DELETE CASCADE
);
I have added a couple of rows to the table, and my selection looks like this:
SELECT * FROM comments WHERE parent_path <@ 'ed9f0f769ee4455b8dbf6120afc902fa';
postid | comment | comment_reaction | commented_user | created_at | parent_path
--------------------------------------+----------+------------------+----------------+----------------------------+--------------------------------------
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | blah1 | | sai | 2021-02-06 16:46:36.436241 | ed9f0f769ee4455b8dbf6120afc902fa
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | reply1 | | sai | 2003-02-01 00:00:00 | ed9f0f769ee4455b8dbf6120afc902fa.1
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | reply2 | | sai | 2003-02-01 00:00:00 | ed9f0f769ee4455b8dbf6120afc902fa.2
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | reply1.1 | | sai | 2003-02-01 00:00:00 | ed9f0f769ee4455b8dbf6120afc902fa.1.2
So basically I have a post with id ed9f0f76-9ee4-455b-8dbf-6120afc902fa which has a comment blah1, whose replies are reply1 and reply2, and reply1 has a sub-reply reply1.1.
How do I map it to a struct in Go such that I get a JSON output like the one below?
{
  "post": "ed9f0f76-9ee4-455b-8dbf-6120afc902fa",
  "comment": "blah1",
  "reply1": {
    "comment": "reply1",
    "reply": {
      "comment": "reply1.1"
    }
  },
  "reply2": {
    "comment": "reply2"
  }
}
This can nest to n levels of comments and replies. Is it even possible to reconstruct this JSON mapping in Go, and if so, can someone help me? Alternative schemas and reconstruction approaches are also appreciated.
So, I figured out a solution myself while trying to understand Facebook's schema.
Facebook shows the first-level comments and indicates that replies exist, but the replies themselves are hidden.
So we can do a BFS on the tree:
SELECT * FROM comments WHERE parent_path ~ 'be785c64e9654a59821d20dff67230fc.*{1,1}';
This gives us the first-level children, which I can map to a struct, showing each child's replies as a count.
Then, if the user wants to dig in, they can make another API call with the child node.
This way performance is also preserved; mapping the entire tree to a struct recursively would hurt performance. A sketch of this approach in Go follows below.
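To make that concrete, here is a minimal Go sketch of the one-level-at-a-time approach, assuming database/sql with the lib/pq driver. The table and column names come from the schema above; the reply-count subquery, the JSON field names, and the DSN are my own assumptions, not part of the original.

package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

// Comment is one node of the tree; ReplyCount tells the client whether
// another API call for this node's children is worthwhile.
type Comment struct {
	PostID     string `json:"post"`
	Comment    string `json:"comment"`
	ParentPath string `json:"parent_path"`
	ReplyCount int    `json:"reply_count"`
}

// firstLevelReplies returns the direct children of the node at root,
// using the lquery pattern root.*{1,1} (exactly one label below root).
func firstLevelReplies(db *sql.DB, root string) ([]Comment, error) {
	rows, err := db.Query(`
		SELECT c.postid, c.comment, c.parent_path::text,
		       (SELECT count(*) FROM comments r
		        WHERE r.parent_path <@ c.parent_path
		          AND r.parent_path <> c.parent_path) AS reply_count
		FROM comments c
		WHERE c.parent_path ~ ($1 || '.*{1,1}')::lquery`, root)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var replies []Comment
	for rows.Next() {
		var c Comment
		if err := rows.Scan(&c.PostID, &c.Comment, &c.ParentPath, &c.ReplyCount); err != nil {
			return nil, err
		}
		replies = append(replies, c)
	}
	return replies, rows.Err()
}

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	replies, err := firstLevelReplies(db, "ed9f0f769ee4455b8dbf6120afc902fa")
	if err != nil {
		log.Fatal(err)
	}
	out, _ := json.MarshalIndent(replies, "", "  ")
	fmt.Println(string(out))
}

Each level is a flat slice, so nothing recursive ever has to be built; the client asks for the next level only when the user expands a comment.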
OK, I deleted my previous post and will try this again. I don't know this topic well, and I'm not sure whether this needs a loop or a stored function, or how to get what I'm looking for. Here's the sample data and expected output.
I have a single table A with the following fields: date created, unique person key, type, location.
I need a Postgres query that, for a given month (a parameter, based on date created) and a given location (a parameter, based on the location field), returns the fields below wherever the unique person key appears again within plus or minus 30 days of the date created in the given month, for the same type but across all locations.
Example Data
Date Created | Unique Person | Type | Location
---------------------------------------------------
2/5/2017 | 1 | Admit | Hospital1
2/6/2017 | 2 | Admit | Hospital2
2/15/2017 | 1 | Admit | Hospital2
2/28/2017 | 3 | Admit | Hospital2
3/3/2017 | 2 | Admit | Hospital1
3/15/2017 | 3 | Admit | Hospital3
3/20/2017 | 4 | Admit | Hospital1
4/1/2017 | 1 | Admit | Hospital2
Output for the month of March for Hospital1:
DateCreated| UniquePerson | Type | Location | +-30days | OtherLoc.
------------------------------------------------------------------------
3/3/2017 | 2 | Admit| Hospital1 | 2/6/2017 | Hospital2
Output for the month of March for Hospital2:
None, because no one was seen at Hospital2 in March
Output for the month of March for Hospital3:
DateCreated| UniquePerson | Type | Location | +-30days | OtherLoc.
------------------------------------------------------------------------
3/15/2017 | 3 | Admit| Hospital3 | 2/28/2017 | Hospital2
Version 1
I would use a WITH clause. Please notice that I've added an id column as a primary key to simplify the query; it's just there to prevent rows from being matched with themselves.
WITH x AS (
SELECT
id,
date_created,
unique_person_id,
type,
location
FROM
a
WHERE
location = 'Hospital1' AND
date_trunc('month', date_created) = date_trunc('month', '2017-03-01'::date)
)
SELECT
x.date_created,
x.unique_person_id,
x.type,
x.location,
a.date_created AS "+-30days",
a.location AS other_location
FROM
x
JOIN a
USING (unique_person_id, type)
WHERE
x.id != a.id AND
abs(x.date_created - a.date_created) <= 30;
Now, a bit of explanation:
First we select, let's say, the reference data with a WITH clause. Think of it as a temporary table that we can reference in the main query. In the given example it would be a "main visit" at the given hospital.
Then we join the "main visits" with the same person's other visits of the same type (the JOIN condition) that happened within 30 days of each other (the WHERE condition).
Notice that the WITH query carries the limits you want to apply (location and date). I use the date_trunc function, which truncates the date to the specified precision (a month in this case).
Version 2
As Laurenz Albe suggested, there is no special need to use a WITH clause. Right, so here is a second version.
SELECT
x.date_created,
x.unique_person_id,
x.type,
x.location,
a.date_created AS "+-30days",
a.location AS other_location
FROM
a AS x
JOIN a
USING (unique_person_id, type)
WHERE
x.location = 'Hospital1' AND
date_trunc('month', x.date_created) = date_trunc('month', '2017-03-01'::date) AND
x.id != a.id AND
abs(x.date_created - a.date_created) <= 30;
This version is shorter than the first one, but in my opinion the first is easier to understand. I don't have a big enough data set to test with, and I wonder which one runs faster (the query planner shows similar costs for both).
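If you need to run this from Go with the location and month as parameters, here is a minimal sketch, assuming database/sql with the lib/pq driver and that date_created is a date column; the DSN is hypothetical.

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	// The second query version from above, with location and month as placeholders.
	rows, err := db.Query(`
		SELECT x.date_created, x.unique_person_id, x.type, x.location,
		       a.date_created, a.location
		FROM a AS x
		JOIN a USING (unique_person_id, type)
		WHERE x.location = $1
		  AND date_trunc('month', x.date_created) = date_trunc('month', $2::date)
		  AND x.id != a.id
		  AND abs(x.date_created - a.date_created) <= 30`,
		"Hospital1", "2017-03-01")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var created, person, typ, loc, otherDate, otherLoc string
		if err := rows.Scan(&created, &person, &typ, &loc, &otherDate, &otherLoc); err != nil {
			log.Fatal(err)
		}
		fmt.Println(created, person, typ, loc, otherDate, otherLoc)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}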
Here is my table (simplified, only significant columns):
CREATE TABLE details(
    id serial PRIMARY KEY,
    name text,
    address jsonb
);
And some sample Data
# Select * from details
id | name | Address
----+----------+-----------------------------------------------------------
1 | Batman | {"city":"Gotham City","street":"1007 Mountain Drive"}
2 | Superman | {"city":"Metropolis","street":"344 Clinton Street"}
3 | Flash | {"city":"Central City","street":"122 Englewood street"}
Now I would like to select only the name and the city field of Address. The query would be:
Select name, Address -> 'city' as Address from details
name | Address
----------+------------------
Batman | "Gotham City"
Superman | "Metropolis"
Flash | "Central City"
But I want it returned as shown below.
name | Address
----------+-------------------------
Batman | {"city":"Gotham City"}
Superman | {"city":"Metropolis"}
Flash | {"city":"Central City"}
Is it possible to select only some fields from a jsonb column? If so, what would the query be?
If you want to include only one field, your query can be fairly simple:
select name, jsonb_build_object('city', address -> 'city') address
from details
However, if you want to include multiple fields, things get more complex. You could, for example, remove the unwanted keys one by one with the - operator, like jsonb_column - 'key1' - 'key2':
select name, address - 'street' address
from details
But this will only work when you have fairly few fields inside the JSON column (and they are well defined).
If you want a general solution, you should use some aggregation:
select name, (select jsonb_object_agg(e.key, e.value)
from jsonb_each(address) e
where e.key in ('city')) address
from details
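If you then need the trimmed object in Go, here is a minimal sketch that scans the result of the first query into a map, assuming database/sql with the lib/pq driver; the DSN is hypothetical.

package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	rows, err := db.Query(`SELECT name, jsonb_build_object('city', address -> 'city') FROM details`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var name string
		var raw []byte // jsonb arrives as raw bytes
		if err := rows.Scan(&name, &raw); err != nil {
			log.Fatal(err)
		}
		var addr map[string]string
		if err := json.Unmarshal(raw, &addr); err != nil {
			log.Fatal(err)
		}
		fmt.Println(name, addr["city"])
	}
}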
I'm trying to create a table that would enforce a unique combination of two columns of the same type - in both directions. E.g. this would be illegal:
col1 col2
1 2
2 1
I have come up with this, but it doesn't work:
database=> \d+ friend;
Table "public.friend"
Column | Type | Modifiers | Storage | Stats target | Description
--------------+--------------------------+-----------+----------+--------------+-------------
user_id_from | text | not null | extended | |
user_id_to | text | not null | extended | |
status | text | not null | extended | |
sent | timestamp with time zone | not null | plain | |
updated | timestamp with time zone | | plain | |
Indexes:
"friend_pkey" PRIMARY KEY, btree (user_id_from, user_id_to)
"friend_user_id_to_user_id_from_key" UNIQUE CONSTRAINT, btree (user_id_to, user_id_from)
Foreign-key constraints:
"friend_status_fkey" FOREIGN KEY (status) REFERENCES friend_status(name)
"friend_user_id_from_fkey" FOREIGN KEY (user_id_from) REFERENCES user_account(login)
"friend_user_id_to_fkey" FOREIGN KEY (user_id_to) REFERENCES user_account(login)
Has OIDs: no
Is it possible to write this without triggers or any advanced magic, using constraints only?
A variation on Neil's solution which doesn't need an extension is:
create table friendz (
from_id int,
to_id int
);
create unique index ifriendz on friendz(greatest(from_id,to_id), least(from_id,to_id));
Neil's solution lets you use an arbitrary number of columns, though. We're both relying on expression indexes, which are documented at:
https://www.postgresql.org/docs/current/indexes-expressional.html
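To see the symmetric constraint at work from Go, here is a minimal sketch, assuming the friendz table above and the lib/pq driver; the DSN is hypothetical, and 23505 is PostgreSQL's unique_violation error code.

package main

import (
	"database/sql"
	"fmt"
	"log"

	"github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	if _, err := db.Exec(`INSERT INTO friendz VALUES (1, 2)`); err != nil {
		log.Fatal(err) // the first direction inserts fine
	}
	// The reversed pair hits the same (greatest, least) index entry.
	_, err = db.Exec(`INSERT INTO friendz VALUES (2, 1)`)
	if pqErr, ok := err.(*pq.Error); ok && pqErr.Code == "23505" { // unique_violation
		fmt.Println("duplicate pair rejected:", pqErr.Message)
	}
}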
Do you consider the intarray extension to be magic?
You'd need to use int keys for the users instead of text though...
Here's a possible solution:
create extension intarray;
create table friendz (
from_id int,
to_id int
);
create unique index on friendz ( sort( array[from_id, to_id ] ) );
insert into friendz values (1,2); -- good
insert into friendz values (2,1); -- bad
http://sqlfiddle.com/#!15/c84b7/1
My PostgreSQL database has a table with entities that can be active or inactive, as determined by the isActive column value. Inactive entities are accessed very rarely, and as the database grows, the inactive-to-active ratio becomes very high. So I expect partitioning on a simple isActive check to bring a huge performance gain.
The problem is that the table is referenced by foreign key constraints from many other tables. As specified in the last bullet of the Caveats section of the PostgreSQL inheritance docs, there is no good workaround for this case.
So, is it true that partitioning in PostgreSQL is currently only suitable for simple cases where the partitioned table is not referenced from anywhere?
Are there any other ways to optimize the performance of queries against the table I described above? I'm pretty sure my use case is common, and there should be a good solution for it.
Example queries to create the tables:
CREATE TABLE resources
(
id uuid NOT NULL,
isActive integer NOT NULL, -- 0 means false, anything else is true, I intentionally do not use boolean type
PRIMARY KEY (id)
);
CREATE TABLE resource_attributes
(
id uuid NOT NULL,
resourceId uuid NOT NULL,
name character varying(128) NOT NULL,
value character varying(1024) DEFAULT NULL,
PRIMARY KEY (id),
CONSTRAINT fk_resource_attributes_resourceid_resources_id FOREIGN KEY (resourceId) REFERENCES resources (id)
);
In this case, I'd like to partition resources table.
If the inactive-to-active ratio is very high, a partial index is a good choice:
create index index_name on resources (isActive) where isActive = 1
The only workaround I can think of for creating a foreign key to a table that has multiple child tables is to create another table that holds just the primary keys (all of them, maintained by triggers) and to point all foreign key references at it,
like:
+-----------+ +----------------+ +---------------------+
| resources | | resource_uuids | | resource_part_n |
+===========+ 0 1 +================+ 1 0 +=====================+
| id | --> | id | <-- | (id from resources) |
+-----------+ +----------------+ +---------------------+
| ... | ↑ 1 | CHECK(...) |
+-----------+ +--------+ +---------------------+
| | INHERITS(resources) |
+---------------------+ | +---------------------+
| resource_attributes | |
+---------------------+ |
| resourceId | --+ *
+---------------------+
| ... |
+---------------------+
But you still can't partition that table (resource_uuids), so I don't think partitioning will help you in this case.
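For illustration, here is a hedged sketch of the keys-table part of the diagram as DDL run from Go, assuming the resources and resource_attributes tables from the question. The function and trigger names are my own, and only the insert path of the synchronization is shown; delete/update maintenance would also be needed.

package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
)

// ddl realizes the resource_uuids box from the diagram: every resources.id
// is copied into it by a trigger, and the child table's foreign key is
// repointed at it. Names other than those in the question are assumptions.
const ddl = `
CREATE TABLE resource_uuids (
    id uuid PRIMARY KEY
);

INSERT INTO resource_uuids SELECT id FROM resources;

CREATE FUNCTION copy_resource_id() RETURNS trigger AS $$
BEGIN
    INSERT INTO resource_uuids (id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER resources_copy_id
    AFTER INSERT ON resources
    FOR EACH ROW EXECUTE PROCEDURE copy_resource_id();

ALTER TABLE resource_attributes
    DROP CONSTRAINT fk_resource_attributes_resourceid_resources_id,
    ADD CONSTRAINT fk_resource_attributes_resourceid_resource_uuids_id
        FOREIGN KEY (resourceId) REFERENCES resource_uuids (id);
`

func main() {
	db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	// lib/pq sends a parameter-free Exec over the simple query protocol,
	// so the multi-statement DDL string runs as one batch.
	if _, err := db.Exec(ddl); err != nil {
		log.Fatal(err)
	}
}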