I am looking to range-partition one of my tables (TransactionLog) in PostgreSQL 11.
While comparing query performance between the un-partitioned and the partitioned table, I am seeing a huge difference in planning time: it is much higher for the partitioned table.
Similarly, when I query the partition directly by name, the planning time is much lower (0.081 ms) than when I query through the parent table (6.231 ms). Samples are below.
How can I improve query performance on the partitioned table?
Following is the schema
CREATE TABLE TransactionLog (
    txnid        character varying(36) NOT NULL,
    txnDetails   character varying(64),
    loggingtime  timestamp(6) without time zone DEFAULT LOCALTIMESTAMP
) PARTITION BY RANGE (loggingtime);
CREATE TABLE IF NOT EXISTS TransactionLog_20200223 PARTITION OF TransactionLog FOR VALUES FROM ('2020-02-23') TO ('2020-02-24');
CREATE UNIQUE INDEX TransactionLog_20200223_UnqTxId ON TransactionLog_20200223 (txnid);
Following is the EXPLAIN ANALYZE result when I query the partition directly. Planning time is ~0.080 ms (average of 10 executions):
postgres=> EXPLAIN (ANALYZE,VERBOSE,COSTS,BUFFERS,TIMING,SUMMARY) select txnDetails FROM mra_part.TransactionLog_20200223 WHERE txnid = 'febd139d-1b7f-4564-a004-1b3474e51756';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using TransactionLog_20200223_UnqTxId on TransactionLog_20200223 (cost=0.57..4.61 rows=1 width=10) (actual time=0.039..0.040 rows=1 loops=1)
Output: txnDetails
Index Cond: ((TransactionLog_20200223.txnid)::text = 'febd139d-1b7f-4564-a004-1b3474e51756'::text)
Buffers: shared hit=5
**Planning Time: 0.081 ms**
Execution Time: 0.056 ms
(6 rows)
Following is the EXPLAIN ANALYZE result when I query through the parent table. Planning time is ~6.198 ms (average of 10 executions):
postgres=> EXPLAIN (ANALYZE,VERBOSE,COSTS,BUFFERS,TIMING,SUMMARY) select txnDetails FROM mra_part.TransactionLog WHERE txnid = 'febd139d-1b7f-4564-a004-1b3474e51756' AND loggingtime >= '2020-02-23'::timestamp without time zone AND loggingtime < '2020-02-24'::timestamp without time zone;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.57..4.62 rows=1 width=10) (actual time=0.036..0.037 rows=1 loops=1)
Buffers: shared hit=5
-> Index Scan using TransactionLog_20200223_UnqTxId on TransactionLog_20200223 (cost=0.57..4.61 rows=1 width=10) (actual time=0.035..0.036 rows=1 loops=1)
Output: TransactionLog_20200223.txnDetails
Index Cond: ((TransactionLog_20200223.txnid)::text = 'febd139d-1b7f-4564-a004-1b3474e51756'::text)
Filter: ((TransactionLog_20200223.loggingtime >= '2020-02-23 00:00:00'::timestamp without time zone) AND (TransactionLog_20200223.loggingtime < '2020-02-24 00:00:00'::timestamp without time zone))
Buffers: shared hit=5
**Planning Time: 6.231 ms**
Execution Time: 0.076 ms
(9 rows)
PostgreSQL Version : PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
Related
I have a table with 143 million rows (and growing); its current size is 107 GB. One of the columns is of type date and has low selectivity: for any given date, it's reasonable to assume there are somewhere between 0.5 and 4 million records with the same date value.
Now, if someone tries to do something like this:
select * from large_table where date_column > '2020-01-01' limit 100
It will execute "forever", and if you EXPLAIN ANALYZE it, you can see that it's doing a table scan. So the first (and so far only) idea is to try to turn this into an index scan. If Postgres can scan a subsection of an index and return the "limit" number of records, that sounds fast to me:
create index our_index_on_the_date_column ON large_table (date_column DESC);
VACUUM ANALYZE large_table;
EXPLAIN ANALYZE select * from large_table where date_column > '2020-01-01' limit 100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..37.88 rows=100 width=893) (actual time=0.034..13.520 rows=100 loops=1)
-> Seq Scan on large_table (cost=0.00..13649986.80 rows=36034774 width=893) (actual time=0.033..13.506 rows=100 loops=1)
Filter: (date_column > '2020-01-01'::date)
Rows Removed by Filter: 7542
Planning Time: 0.168 ms
Execution Time: 18.412 ms
(6 rows)
It still reverts to a sequential scan. Please disregard the execution time, as this took 11 minutes before caching came into play. We can force it to use the index by reducing the returned columns to those covered by the index:
select date_column from large_table where date_column > '2019-01-15' limit 100
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..3.42 rows=100 width=4) (actual time=0.051..0.064 rows=100 loops=1)
-> Index Only Scan using our_index_on_the_date_column on large_table (cost=0.57..907355.11 rows=31874888 width=4) (actual time=0.050..0.056 rows=100 loops=1)
Index Cond: (date_column > '2019-01-15'::date)
Heap Fetches: 0
Planning Time: 0.082 ms
Execution Time: 0.083 ms
(6 rows)
But this is of course a contrived example, since the table is very wide and covering all parts of the table in the index is not feasible.
So, can anyone share some guidance on how to get reasonable performance when using low-selectivity columns as predicates?
I was wondering whether my indexes are still working well, since I am using Node.js and dates with microsecond precision are not supported in that language. So, to get a correct comparison, my query does this kind of thing:
`WHERE (created_at::timestamp(0), uuid) < (${createdAt}::timestamp(0), ${uuid})`
Since I am using a cast that truncates to seconds, I suppose the indexes are broken. Am I right? Would the solution be to change the precision of the stored timestamps, or is there another way that keeps the existing ones?
You could change the PostgreSQL data type to millisecond precision:
ALTER TABLE tab ALTER created_at TYPE timestamp(3) without time zone;
Using the recommended EXPLAIN (ANALYZE, VERBOSE, BUFFERS), I tested this myself.
I created a table named users with a created_at column defaulting to CURRENT_TIMESTAMP:
create table users (
id uuid default uuid_generate_v4() not null
constraint users_pkey primary key,
created_at timestamp default CURRENT_TIMESTAMP
);
create index users_created_at_idx on users (created_at);
The test:
EXPLAIN(ANALYZE, VERBOSE, BUFFERS)
SELECT id
FROM users
WHERE (created_at >= '2022-01-21 15:43:33.631779');
Index Scan using users_created_at_idx on public.users (cost=0.14..4.16 rows=1 width=16) (actual time=0.010..0.018 rows=0 loops=1)
Output: id
Index Cond: (users.created_at >= '2022-01-21 15:43:33.631779'::timestamp without time zone)
Buffers: shared hit=1
Planning Time: 0.074 ms
Execution Time: 0.058 ms
EXPLAIN(ANALYZE, VERBOSE, BUFFERS)
SELECT id
FROM users
WHERE (created_at::timestamp(0) >= '2022-01-21 15:43:33.631779'::timestamp(0));
Seq Scan on public.users (cost=0.00..4.50 rows=33 width=16) (actual time=0.034..0.043 rows=0 loops=1)
Output: id
Filter: ((users.created_at)::timestamp(0) without time zone >= '2022-01-21 15:43:34'::timestamp(0) without time zone)
Rows Removed by Filter: 100
Buffers: shared hit=3
Planning Time: 0.073 ms
Execution Time: 0.089 ms
As we can see, the index on the created_at column is not used when we cast and truncate.
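One alternative sometimes worth testing (a sketch, not verified here) is to index the cast expression itself, so that the truncating comparison can still be served by a btree index. The WHERE clause must then use exactly the same expression:
CREATE INDEX users_created_at_trunc_idx ON users ((created_at::timestamp(0)));
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT id
FROM users
WHERE created_at::timestamp(0) >= '2022-01-21 15:43:33'::timestamp(0);
For the keyset-style row comparison in the original question, the same idea should extend to a composite expression index such as ((created_at::timestamp(0)), id).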
I have fewer than 200 partitions (daily partitions), each with 5M+ records.
When I query one day's data against the partition directly, I see an estimated planning time of 0.01 ms, but through the parent table it is 190 ms (far too much). The only difference observed is the Append node in the plan.
Can we eliminate the Append node or reduce the pruning time in PostgreSQL 11?
QUERY:
explain (ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING,SUMMARY) select 1 from test WHERE date1 >'2021-01-27 13:41:26' and date1<'2021-01-27 21:41:26' and own=123 and mob=123454234
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.12..4.19 rows=1 width=4) (actual time=0.018..0.018 rows=0 loops=1)
Buffers: shared hit=1
-> Index Only Scan using test_20210127_pkey on test_20210127 (cost=0.12..4.17 rows=1 width=4) (actual time=0.017..0.017 rows=0 loops=1)
Output: 1
Index Cond: ((test_20210127.date1 > '2021-01-27 13:41:26'::timestamp without time zone) AND (test_20210127.date1 < '2021-01-27 21:41:26'::timestamp without time zone) AND (test_20210127.own = 123) AND (test_20210127.mob = 123454234))
Heap Fetches: 0
Buffers: shared hit=1
Planning Time: 190.440 ms
Execution Time: 0.093 ms
------------Snipped table structure----
CREATE TABLE public.test
(
own integer NOT NULL,
mob bigint NOT NULL,
date1 timestamp without time zone NOT NULL,
ver integer NOT NULL,
c5
...
c100
CONSTRAINT test_pkey PRIMARY KEY (date1, own, mob, ver)
USING INDEX TABLESPACE tb_1
) PARTITION BY RANGE (date1)
WITH (
OIDS = FALSE
)
TABLESPACE tb_1;
-- Partitions SQL
CREATE TABLE public.test_20201003 PARTITION OF public.test
FOR VALUES FROM ('2020-10-03 00:00:00') TO ('2020-10-04 00:00:00');
CREATE TABLE public.test_20201004 PARTITION OF public.test
FOR VALUES FROM ('2020-10-04 00:00:00') TO ('2020-10-05 00:00:00');
........ 6 months of daily partitions
You can upgrade to a later PostgreSQL version, as there were performance improvements in v12.
But if query execution time is short, planning time will always dominate. You can test a prepared statement, but I doubt that runtime partition pruning will be so much faster.
Essentially, the worse query performance is the expected price you are paying for the benefit of a simple way to discard old data.
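If you want to experiment with the prepared-statement route, a sketch could look like the following (recent_tx is just a placeholder name). With a cached generic plan, PostgreSQL 11 can prune partitions at executor startup instead of at planning time, which EXPLAIN reports as "Subplans Removed":
PREPARE recent_tx(timestamp, timestamp, integer, bigint) AS
SELECT 1 FROM test
WHERE date1 > $1 AND date1 < $2 AND own = $3 AND mob = $4;
-- The first five executions are planned individually; after that the server may
-- cache a generic plan, and partition pruning then happens at run time.
EXPLAIN (ANALYZE)
EXECUTE recent_tx('2021-01-27 13:41:26', '2021-01-27 21:41:26', 123, 123454234);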
I'm running Postgres 11.
I have a table with 1.000.000 (1 million) rows and each row has a size of 40 bytes (it contains 5 columns). That is equal to 40MB.
When I execute the following (directly on the DB via DBeaver, DataGrid, etc., not called via Node, Python, etc.):
SELECT * FROM TABLE
it takes 40 seconds the first time (isn't this very slow, even for the first run?).
The CREATE statement of my tables:
CREATE TABLE public.my_table_1 (
c1 int8 NOT NULL GENERATED ALWAYS AS IDENTITY,
c2 int8 NOT NULL,
c3 timestamptz NULL,
c4 float8 NOT NULL,
c5 float8 NOT NULL,
CONSTRAINT my_table_1_pkey PRIMARY KEY (c1)
);
CREATE INDEX my_table_1_c3_idx ON public.my_table_1 USING btree (c3);
CREATE UNIQUE INDEX my_table_1_c2_idx ON public.my_table_1 USING btree (c2);
On 5 random tables: EXPLAIN (ANALYZE, BUFFERS) select * from [table_1...2,3,4,5]
Seq Scan on table_1 (cost=0.00..666.06 rows=34406 width=41) (actual time=0.125..7.698 rows=34406 loops=1)
Buffers: shared read=322
Planning Time: 15.521 ms
Execution Time: 10.139 ms
Seq Scan on table_2 (cost=0.00..9734.87 rows=503187 width=41) (actual time=0.103..57.698 rows=503187 loops=1)
Buffers: shared read=4703
Planning Time: 14.265 ms
Execution Time: 74.240 ms
Seq Scan on table_3 (cost=0.00..3486217.40 rows=180205440 width=41) (actual time=0.022..14988.078 rows=180205379 loops=1)
Buffers: shared hit=7899 read=1676264
Planning Time: 0.413 ms
Execution Time: 20781.303 ms
Seq Scan on table_4 (cost=0.00..140219.73 rows=7248073 width=41) (actual time=13.638..978.125 rows=7247991 loops=1)
Buffers: shared hit=7394 read=60345
Planning Time: 0.246 ms
Execution Time: 1264.766 ms
Seq Scan on table_5 (cost=0.00..348132.60 rows=17995260 width=41) (actual time=13.648..2138.741 rows=17995174 loops=1)
Buffers: shared hit=82 read=168098
Planning Time: 0.339 ms
Execution Time: 2730.355 ms
When I add a LIMIT 1.000.000 to table_5 (it contains roughly 18 million rows)
Limit (cost=0.00..19345.79 rows=1000000 width=41) (actual time=0.007..131.939 rows=1000000 loops=1)
Buffers: shared hit=9346
-> Seq Scan on table_5(cost=0.00..348132.60 rows=17995260 width=41) (actual time=0.006..68.635 rows=1000000 loops=1)
Buffers: shared hit=9346
Planning Time: 0.048 ms
Execution Time: 164.133 ms
When I add a WHERE clause between two dates (I monitored the query below with the DataDog software; the results, max. ~31K rows/sec when fetching, are here: https://www.screencast.com/t/yV0k4ShrUwSd):
Seq Scan on table_5 (cost=0.00..438108.90 rows=17862027 width=41) (actual time=0.026..2070.047 rows=17866766 loops=1)
Filter: (('2018-01-01 00:00:00+04'::timestamp with time zone < matchdate) AND (matchdate < '2020-01-01 00:00:00+04'::timestamp with time zone))
Rows Removed by Filter: 128408
Buffers: shared hit=168180
Planning Time: 14.820 ms
Execution Time: 2673.171 ms
All tables have a unique index on the c3 column.
The size of the database is about 500 GB in total.
The server has 16 cores and 112 GB M2 memory.
I have tried to tune Postgres settings such as work_mem (1GB), shared_buffers (50GB), and effective_cache_size (20GB), but it doesn't seem to change anything. (I know the settings have been applied, because I can see a big difference in the amount of idle memory the server has allocated.)
I know the database is too big for all data to be in memory. But is there anything I can do to boost the performance / speed of my query?
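For reference, this is roughly how such settings are applied and verified (a sketch; the values are just the ones mentioned above, and shared_buffers only takes effect after a restart):
ALTER SYSTEM SET shared_buffers = '50GB';       -- requires a server restart
ALTER SYSTEM SET work_mem = '1GB';
ALTER SYSTEM SET effective_cache_size = '20GB';
SELECT pg_reload_conf();                        -- reloads settings that do not need a restart
SHOW shared_buffers;                            -- shows the value currently in effect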
Make sure CreatedDate is indexed.
Make sure CreatedDate uses the date column type. This is more efficient for storage (just 4 bytes) and performance, and you can use all the built-in date formatting and functions.
Avoid select * and only select the columns you need.
Use YYYY-MM-DD ISO 8601 format. This has nothing to do with performance, but it will avoid a lot of ambiguity.
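A minimal illustration of the points above (the table, column, and index names are placeholders):
CREATE INDEX large_table_created_date_idx ON large_table (CreatedDate);
SELECT id, CreatedDate, amount                 -- only the columns you actually need
FROM large_table
WHERE CreatedDate >= DATE '2020-01-01'         -- ISO 8601 literals on a date column
  AND CreatedDate <  DATE '2020-02-01';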
The real problem is likely that you have thousands of tables and regularly take unions of hundreds of them. This indicates a need to redesign your schema to simplify your queries and get better performance.
Unions and date-change checks suggest a lot of redundancy. Perhaps you've partitioned your tables by date. Postgres has its own built-in table partitioning, which might help.
Without more detail that's all I can say. Perhaps ask another question about your schema.
Without seeing EXPLAIN (ANALYZE, BUFFERS), all we can do is speculate.
But we can do some pretty good speculation.
Cluster the tables on the index on CreatedDate (see the sketch after this list). This will allow the data to be accessed more sequentially, allowing more read-ahead (though this might not help much for some kinds of storage). If the tables have a high write load, they may not stay clustered, so you would have to recluster them occasionally. If they are static, this could be a one-time event.
Get more RAM. If you want to perform as if all the data was in memory, then get all the data into memory.
Get faster storage, like top-notch SSD. It isn't as fast as RAM, but much faster than HDD.
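A sketch of the clustering step (the table and index names are placeholders):
CLUSTER large_table USING large_table_created_date_idx;  -- rewrites the table in index order
ANALYZE large_table;                                      -- refresh statistics afterwards
-- CLUSTER takes an exclusive lock for the duration of the rewrite, so run it in a
-- maintenance window and repeat it if the table later loses its ordering.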
I used Influx to store our time series data. It was great while it worked, but after about one month it stopped working and I couldn't figure out why (similar to this issue: https://github.com/influxdb/influxdb/issues/1386).
Maybe Influx will be great one day, but for now I need to use something that's more stable. I'm thinking about Postgres. Our data comes from many sensors, each sensor has a sensor id. So I'm thinking about structuring our data as this:
(pk), sensorId(string), time(timestamp), value(float)
Influx is built for time series data, so it probably has some built-in optimizations. Do I need to do optimizations myself to make Postgres efficient? More specifically, I have these questions:
Influx has this notion of 'series', and it's cheap to create new series. So I had a separate series for each sensor. Should I create a separate Postgres table for each sensor?
How should I set up indexes to make queries fast? A typical query is: select all data for sensor123 for the last 3 days.
Should I use timestamp or integer for the time column?
How do I set a retention policy? E.g. delete data that's older than one week automatically.
Will Postgres scale horizontally? Can I set up EC2 clusters for data replication and load balancing?
Can I downsample in Postgres? I have read in some articles that I can use date_trunc. But it seems that I can't date_trunc it to a specific interval e.g. 25 seconds.
Any other caveats I missed?
Thanks in advance!
Updates
Storing the time column as big integer is faster than storing it as timestamp. Am I doing something wrong?
storing it as timestamp:
postgres=# explain analyze select * from test where sensorid='sensor_0';
Bitmap Heap Scan on test (cost=3180.54..42349.98 rows=75352 width=25) (actual time=10.864..19.604 rows=51840 loops=1)
Recheck Cond: ((sensorid)::text = 'sensor_0'::text)
Heap Blocks: exact=382
-> Bitmap Index Scan on sensorindex (cost=0.00..3161.70 rows=75352 width=0) (actual time=10.794..10.794 rows=51840 loops=1)
Index Cond: ((sensorid)::text = 'sensor_0'::text)
Planning time: 0.118 ms
Execution time: 22.984 ms
postgres=# explain analyze select * from test where sensorid='sensor_0' and addedtime > to_timestamp(1430939804);
Bitmap Heap Scan on test (cost=2258.04..43170.41 rows=50486 width=25) (actual time=22.375..27.412 rows=34833 loops=1)
Recheck Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > '2015-05-06 15:16:44-04'::timestamp with time zone))
Heap Blocks: exact=257
-> Bitmap Index Scan on sensorindex (cost=0.00..2245.42 rows=50486 width=0) (actual time=22.313..22.313 rows=34833 loops=1)
Index Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > '2015-05-06 15:16:44-04'::timestamp with time zone))
Planning time: 0.362 ms
Execution time: 29.290 ms
storing it as big integer:
postgres=# explain analyze select * from test where sensorid='sensor_0';
Bitmap Heap Scan on test (cost=3620.92..42810.47 rows=85724 width=25) (actual time=12.450..19.615 rows=51840 loops=1)
Recheck Cond: ((sensorid)::text = 'sensor_0'::text)
Heap Blocks: exact=382
-> Bitmap Index Scan on sensorindex (cost=0.00..3599.49 rows=85724 width=0) (actual time=12.359..12.359 rows=51840 loops=1)
Index Cond: ((sensorid)::text = 'sensor_0'::text)
Planning time: 0.130 ms
Execution time: 22.331 ms
postgres=# explain analyze select * from test where sensorid='sensor_0' and addedtime > 1430939804472;
Bitmap Heap Scan on test (cost=2346.57..43260.12 rows=52489 width=25) (actual time=10.113..14.780 rows=31839 loops=1)
Recheck Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > 1430939804472::bigint))
Heap Blocks: exact=235
-> Bitmap Index Scan on sensorindex (cost=0.00..2333.45 rows=52489 width=0) (actual time=10.059..10.059 rows=31839 loops=1)
Index Cond: (((sensorid)::text = 'sensor_0'::text) AND (addedtime > 1430939804472::bigint))
Planning time: 0.154 ms
Execution time: 16.589 ms
You shouldn't create a table for each sensor. Instead you could add a field to your table that identifies what series it is in. You could also have another table that describes additional attributes about the series. If data points could belong to multiple series, then you'd need a different structure altogether.
For the query you described in question 2, an index on your recorded_at column should work (time is an SQL reserved word, so it's best to avoid it as a column name).
You should use TIMESTAMP WITH TIME ZONE as your time data type.
Retention is up to you.
Postgres has various options for sharding/replication. That's a big topic.
Not sure I understand your objective for #6, but I'm sure you can figure something out.
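To make questions 2 and 6 a bit more concrete, here is a sketch (assuming a table readings(sensorid, recorded_at, value); names are placeholders, and none of this is benchmarked). A composite index serves the "sensor123, last 3 days" query, and bucketing on the epoch allows arbitrary intervals such as 25 seconds, which date_trunc alone cannot do:
CREATE INDEX readings_sensor_time_idx ON readings (sensorid, recorded_at);
-- Typical query: all data for one sensor over the last 3 days.
SELECT *
FROM readings
WHERE sensorid = 'sensor123'
  AND recorded_at >= now() - interval '3 days';
-- Downsampling into 25-second buckets by flooring the epoch.
SELECT sensorid,
       to_timestamp(floor(extract(epoch FROM recorded_at) / 25) * 25) AS bucket,
       avg(value) AS avg_value
FROM readings
WHERE sensorid = 'sensor123'
  AND recorded_at >= now() - interval '3 days'
GROUP BY sensorid, bucket
ORDER BY bucket;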