My PostgreSQL database has a table with entities that can be active or inactive, determined by the value of the isActive column. Inactive entities are accessed very rarely, and as the database grows, the inactive-to-active ratio becomes very high. So I expect partitioning based on a simple isActive check to bring a huge performance benefit.
The problem is that the table is referenced by foreign key constraints from many other tables. As stated in the last bullet of the Caveats section of the PostgreSQL inheritance documentation, there is no good workaround for this case.
So, is it true that partitioning in PostgreSQL is currently only suitable for simple cases where the partitioned table is not referenced from anywhere?
Are there any other ways to optimize the performance of queries against the table described above? I'm pretty sure my use case is common, and there should be a good solution for it.
Example of queries to create the tables:
CREATE TABLE resources
(
    id uuid NOT NULL,
    isActive integer NOT NULL, -- 0 means false, anything else is true; I intentionally do not use the boolean type
    PRIMARY KEY (id)
);
CREATE TABLE resource_attributes
(
    id uuid NOT NULL,
    resourceId uuid NOT NULL,
    name character varying(128) NOT NULL,
    value character varying(1024) DEFAULT NULL,
    PRIMARY KEY (id),
    CONSTRAINT fk_resource_attributes_resourceid_resources_id FOREIGN KEY (resourceId) REFERENCES resources (id)
);
In this case, I'd like to partition resources table.
If the inactive-to-active ratio is very high, a partial index is a good choice. Note that the indexed column should be something you actually search on (indexing isActive itself would only index a constant), and the predicate should match your "anything non-zero is true" convention:
create index resources_active_idx on resources (id) where isActive <> 0;
The only workaround I can think of for creating a foreign key to a table that has multiple child tables is to create another table holding just the primary keys (all of them, maintained by triggers) and point all foreign key references to it, like this:
+-----------+          +----------------+          +----------------------+
| resources |          | resource_uuids |          | resource_part_n      |
+===========+ 0      1 +================+ 1      0 +======================+
| id        | -------> | id             | <------- | (id from resources)  |
| ...       |          +----------------+          | CHECK(...)           |
+-----------+                  ^                   | INHERITS(resources)  |
                               | 1                 +----------------------+
+---------------------+        |
| resource_attributes |        |
+---------------------+        |
| resourceId          | -------+
| ...                 |        *
+---------------------+
But you still can't partition that table (resource_uuids), so I don't think partitioning will help you in this case.
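For completeness, a minimal sketch of that trigger-maintained key table might look like this (the table, function, and trigger names are made up for illustration, and one trigger would be needed per child table):

```sql
-- Hypothetical sketch: keep resource_uuids in sync with every child table.
CREATE TABLE resource_uuids (
    id uuid NOT NULL PRIMARY KEY
);

CREATE OR REPLACE FUNCTION sync_resource_uuids() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO resource_uuids (id) VALUES (NEW.id);
        RETURN NEW;
    ELSIF TG_OP = 'DELETE' THEN
        DELETE FROM resource_uuids WHERE id = OLD.id;
        RETURN OLD;
    END IF;
    RETURN NULL;
END;
$$;

-- One such trigger per child table; resource_part_n stands in for each one.
CREATE TRIGGER resource_part_n_sync
    AFTER INSERT OR DELETE ON resource_part_n
    FOR EACH ROW EXECUTE PROCEDURE sync_resource_uuids();

-- Foreign keys then point at resource_uuids instead of resources:
ALTER TABLE resource_attributes
    ADD CONSTRAINT fk_resource_attributes_resourceid
    FOREIGN KEY (resourceId) REFERENCES resource_uuids (id);
```

Updates to the key column would also need handling in practice; this only covers inserts and deletes.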
I am designing a comment-reply model in Go and Postgres, and my comments table looks like this:
create TABLE comments(
    postid uuid DEFAULT uuid_generate_v4(),
    comment TEXT,
    comment_reaction VARCHAR(255) NOT NULL,
    commented_user VARCHAR(255) NOT NULL,
    created_at TIMESTAMP,
    parent_path ltree,
    CONSTRAINT fk_post FOREIGN KEY(postid)
        REFERENCES posts(postid) ON DELETE CASCADE
);
I have added a couple of rows to the table, and my selection goes as below:
SELECT * FROM comments WHERE parent_path <@ 'ed9f0f769ee4455b8dbf6120afc902fa';
postid | comment | comment_reaction | commented_user | created_at | parent_path
--------------------------------------+----------+------------------+----------------+----------------------------+--------------------------------------
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | blah1 | | sai | 2021-02-06 16:46:36.436241 | ed9f0f769ee4455b8dbf6120afc902fa
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | reply1 | | sai | 2003-02-01 00:00:00 | ed9f0f769ee4455b8dbf6120afc902fa.1
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | reply2 | | sai | 2003-02-01 00:00:00 | ed9f0f769ee4455b8dbf6120afc902fa.2
ed9f0f76-9ee4-455b-8dbf-6120afc902fa | reply1.1 | | sai | 2003-02-01 00:00:00 | ed9f0f769ee4455b8dbf6120afc902fa.1.2
So basically I have a post with id ed9f0f76-9ee4-455b-8dbf-6120afc902fa which has a comment blah1, whose replies are reply1 and reply2, and reply1 has a sub-reply reply1.1.
How do I map it to a struct in Go such that I get a JSON output like the one below?
{
  "post": "ed9f0f76-9ee4-455b-8dbf-6120afc902fa",
  "comment": "blah1",
  "reply1": {
    "comment": "reply1",
    "reply": {
      "comment": "reply1.1"
    }
  },
  "reply2": {
    "comment": "reply2"
  }
}
This can go to n levels of comments and replies. Is it even possible to reconstruct this JSON mapping in Go, and if yes, can someone help me? Alternative solutions for the schema and the reconstruction are also appreciated.
So, I figured out a solution myself while trying to understand Facebook's schema.
Facebook shows the first-level comments and hides their replies until you expand them.
So we can do a BFS on the tree:
SELECT * FROM comments WHERE parent_path ~ 'be785c64e9654a59821d20dff67230fc.*{1,1}';
This gives us the first-level children, which I can map to a struct, showing the number of replies for each child as a count.
Then, if the user wants to dig in, they can make another API call with the child node.
This way performance is also preserved; mapping the entire tree to a struct recursively would hurt performance.
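A sketch of how that per-level query could also return the direct-reply count for each child in one pass (the cast of the concatenated path to lquery is the part to note; column names follow the table above):

```sql
-- First-level comments under a post, each with its direct-reply count.
SELECT c.parent_path,
       c.comment,
       (SELECT count(*)
          FROM comments r
         WHERE r.parent_path ~ (c.parent_path::text || '.*{1,1}')::lquery
       ) AS reply_count
  FROM comments c
 WHERE c.parent_path ~ 'be785c64e9654a59821d20dff67230fc.*{1,1}';
```

The correlated subquery is a simple sketch; a GROUP BY over a self-join may perform better on large tables.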
I'm using PostgreSQL 10.12.
I have labeled entities. Some are standard, some are not. Standard entities are shared among all users, whereas non-standard entities are user-owned. So let's say I have a table Entity with a text column label and a column user_id, which is null for standard entities.
CREATE TABLE Entity
(
    id uuid NOT NULL PRIMARY KEY,
    user_id integer,
    label text NOT NULL
);
Here is my constraint: two non-standard entities belonging to different users can have the same label. Standard entity labels are unique, and the entities of a given user have unique labels. The hard part is: a label must be unique within the group formed by the standard entities plus a given user's entities.
I'm using SQLAlchemy; here are the constraints I've made so far:
__table_args__ = (
UniqueConstraint("label", "user_id", name="_entity_label_user_uc"),
db.Index(
"_entity_standard_label_uc",
label,
user_id.is_(None),
unique=True,
postgresql_where=(user_id.is_(None)),
),
)
My problem with these constraints is that they do not guarantee that a user entity won't share a standard entity's label.
Example:
+----+---------+------------+
| id | user_id | label |
+----+---------+------------+
| 1 | null | std_ent |
| 2 | 42 | user_ent_1 |
| 3 | 42 | user_ent_2 |
| 4 | 43 | user_ent_1 |
+----+---------+------------+
This is a valid table. I want to make sure that it is no longer possible to create an entity with label std_ent, that user 42 cannot create another entity with label user_ent_1 or user_ent_2, and that user 43 cannot create another entity with label user_ent_1.
With my current constraints, it is still possible for users 42 and 43 to create an entity with label std_ent, which is what I want to fix.
Any ideas?
If your unique constraints are doing their job of preventing users from entering duplicate labels for their own "user entities", then you can prevent them from entering the label of a "standard entity" by adding a trigger.
You create a function …
CREATE OR REPLACE FUNCTION public.std_label_check()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
begin
    if exists(
        select * from entity
        where label = new.label and user_id is null) then
        raise exception '"%" is already a standard entity', new.label;
    end if;
    return new;
end;
$function$;
… and then attach it as a trigger to the table
CREATE TRIGGER entity_std_label_check
    BEFORE INSERT
    ON public.entity FOR EACH ROW
    EXECUTE PROCEDURE std_label_check();
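Note that this guards only one direction. If standard entities can also be created later, a symmetric check on inserts with a null user_id would be needed; a sketch in the same style (the function and trigger names are made up):

```sql
CREATE OR REPLACE FUNCTION public.user_label_check()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
begin
    if new.user_id is null and exists(
        select * from entity
        where label = new.label and user_id is not null) then
        raise exception '"%" is already in use as a user entity label', new.label;
    end if;
    return new;
end;
$function$;

CREATE TRIGGER entity_user_label_check
    BEFORE INSERT
    ON public.entity FOR EACH ROW
    EXECUTE PROCEDURE user_label_check();
```

For strict correctness under concurrency, both triggers would also need UPDATE coverage and a suitable locking strategy, since two concurrent inserts cannot see each other's uncommitted rows.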
I'm trying to create a table that would enforce a unique combination of two columns of the same type - in both directions. E.g. this would be illegal:
col1 col2
1 2
2 1
I have come up with this, but it doesn't work:
database=> \d+ friend;
Table "public.friend"
Column | Type | Modifiers | Storage | Stats target | Description
--------------+--------------------------+-----------+----------+--------------+-------------
user_id_from | text | not null | extended | |
user_id_to | text | not null | extended | |
status | text | not null | extended | |
sent | timestamp with time zone | not null | plain | |
updated | timestamp with time zone | | plain | |
Indexes:
"friend_pkey" PRIMARY KEY, btree (user_id_from, user_id_to)
"friend_user_id_to_user_id_from_key" UNIQUE CONSTRAINT, btree (user_id_to, user_id_from)
Foreign-key constraints:
"friend_status_fkey" FOREIGN KEY (status) REFERENCES friend_status(name)
"friend_user_id_from_fkey" FOREIGN KEY (user_id_from) REFERENCES user_account(login)
"friend_user_id_to_fkey" FOREIGN KEY (user_id_to) REFERENCES user_account(login)
Has OIDs: no
Is it possible to write this without triggers or any advanced magic, using constraints only?
A variation on Neil's solution which doesn't need an extension is:
create table friendz (
from_id int,
to_id int
);
create unique index ifriendz on friendz(greatest(from_id,to_id), least(from_id,to_id));
Neil's solution lets you use an arbitrary number of columns though.
We're both relying on expressions to build the index, which is documented at https://www.postgresql.org/docs/current/indexes-expressional.html
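For illustration, the same pair in either order then collides with the expression index defined above:

```sql
insert into friendz values (1, 2); -- ok
insert into friendz values (2, 1); -- rejected: (greatest, least) is (2, 1) in both rows
```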
Do you consider the intarray extension to be magic?
You'd need to use int keys for the users instead of text though...
Here's a possible solution:
create extension intarray;
create table friendz (
from_id int,
to_id int
);
create unique index on friendz ( sort( array[from_id, to_id ] ) );
insert into friendz values (1,2); -- good
insert into friendz values (2,1); -- bad
http://sqlfiddle.com/#!15/c84b7/1
I'm trying to write a function that finds the shortest path between two points using the pgr_dijkstra function. I'm following this guide. With the data provided in the guide everything works fine, but when I apply the same steps (build a topology using pgr_createTopology and then test it with pgr_dijkstra) to another data set, pgr_dijkstra returns an empty result. I've also noticed that the guide's data set has a LineString geometry column, while mine is a MultiLineString column. Could that be the reason?
My table's structure:
Table "public.roads"
Column | Type | Collation | Nullable | Default
--------+--------------------------------+-----------+----------+------------------------------------
id | integer | | not null | nextval('roads_gid_seq'::regclass)
geom | geometry(MultiLineString,4326) | | |
source | integer | | |
target | integer | | |
Indexes:
"roads_pkey" PRIMARY KEY, btree (id)
"roads_geom_idx" gist (geom)
"roads_source_idx" btree (source)
"roads_target_idx" btree (target)
Topology creation query:
SELECT pgr_createTopology('roads', 0.00001, 'geom', 'id');
Shortest way test:
SELECT seq, node, edge, cost as cost, agg_cost, geom
FROM pgr_dijkstra(
'SELECT id, source, target, st_length(geom, true) AS cost FROM roads',
-- Some random points
1, 200
) AS pt
JOIN roads rd ON pt.edge = rd.id;
The problem was indeed related to the geometry data type. The function doesn't work properly with MultiLineString, though it doesn't produce any errors. So I converted the MultiLineString column to LineString, and now everything seems to be OK.
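For reference, a conversion along these lines can be done in place. This is a sketch that assumes each MultiLineString actually merges into a single LineString (ST_LineMerge returns the geometry unchanged as a MultiLineString when it cannot merge, which would make the column type change fail):

```sql
-- Convert the geometry column from MultiLineString to LineString.
ALTER TABLE roads
    ALTER COLUMN geom TYPE geometry(LineString, 4326)
    USING ST_LineMerge(geom);

-- Rebuild the topology afterwards.
SELECT pgr_createTopology('roads', 0.00001, 'geom', 'id');
```

Rows whose geometry genuinely consists of disconnected parts would need to be split into separate rows first (e.g. with ST_Dump).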