All examples show:
CREATE TABLE ... PARTITION BY ...
Which is kind of ridiculous, because the only time you would use partitioning is when a dataset has become too large, which by definition is not going to be a new table. If someone is creating a new table with partitioning, I think almost anyone would criticize that as premature optimization.
Just create a partitioned table and attach the existing table as a partition:
create table test (a int);
insert into test select generate_series(1,10);
-- assuming a partitioned parent table exists, e.g. (DEFAULT partitions need PostgreSQL 11+):
create table test_parent (a int) partition by range (a);
alter table test_parent attach partition test DEFAULT;
select * from test_parent;
a
----
1
2
3
4
5
6
7
8
9
10
(10 rows)
You could also rename the table. However, if you do this, you will need to re-define any views that point at the original table.
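A minimal sketch of that rename variant, using the same hypothetical names as above (DEFAULT partitions require PostgreSQL 11 or later):
-- the partitioned table takes over the original name
alter table test rename to test_old;
create table test (a int) partition by range (a);
alter table test attach partition test_old DEFAULT;
-- existing views still point at test_old and must be re-created against test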
I have two tables, Table 1 and Table 2. Now I have to create a third table whose primary key is an encoded combination of the primary keys of Table 1 and Table 2. I also want to fetch records from Table 1 and Table 2 via Table 3 on the basis of the Table 3 primary key.
Can someone help me?
The solution in my mind was to create something similar to a JWT token from the Table 1 and Table 2 primary keys and store it as the Table 3 primary key. But the problem is how I will decode it and perform basic CRUD operations with join queries between the different tables.
I have two tables whose corresponding FOREIGN KEYs I want to fill simultaneously through a TRIGGER at the time of inserting data into the customers table:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
sld_id integer,
customer_name varchar(35)
);
CREATE TABLE slds (
sld_id SERIAL PRIMARY KEY,
customer_id integer,
sld_code varchar(8) UNIQUE
);
ALTER TABLE customers
ADD CONSTRAINT customers_sld_id_fk
FOREIGN KEY (sld_id)
REFERENCES slds(sld_id);
ALTER TABLE slds
ADD CONSTRAINT slds_customer_id_fk
FOREIGN KEY(customer_id)
REFERENCES customers(customer_id);
I have tried to use an AFTER INSERT trigger function, but NEW.customer_id returned NULL.
Then I used BEFORE INSERT, which got me the value of NEW.customer_id. However, because the insertion hasn't taken place yet, the FOREIGN KEY constraint is not fulfilled and I get an error.
I have read here that currval() and lastval() can be used but are not recommended.
So I created a proxy table to store the generated values. Then, an AFTER INSERT trigger to fill in those fields back in the related tables.
I thought of using a CREATE TEMP TABLE, but found out that it only lasts for the duration of the calling function and not the connection session. Maybe I misunderstood the error message.
Is this a normal efficient practice? Namely, having a dirty table around just to use for such situations.
Or maybe there is another way to achieve this without using a proxy table?
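For reference, a minimal sketch of the kind of AFTER INSERT trigger described above (function and trigger names are made up):
CREATE OR REPLACE FUNCTION customers_fill_sld() RETURNS trigger AS $$
DECLARE
    new_sld_id integer;
BEGIN
    -- create the matching slds row and capture its generated id
    INSERT INTO slds (customer_id, sld_code)
    VALUES (NEW.customer_id, 'SL7'||to_char(floor(random() * 100000 + 1)::int, 'fm00000'))
    RETURNING sld_id INTO new_sld_id;
    -- write the generated id back to the customers row
    UPDATE customers SET sld_id = new_sld_id WHERE customer_id = NEW.customer_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_customers_fill_sld
AFTER INSERT ON customers
FOR EACH ROW EXECUTE PROCEDURE customers_fill_sld();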
EDITED:
SAMPLE DATA
customers TABLE:
customer_id  slds_id  customer_name
          1        1  johns
          3        2  jenn
          4        3  thomas
          7        4  jeff
          8        5  robin
          9        6  chris
         10        7  larry
slds TABLE:
slds_id  slds_code  customer_id
      1  SL747561             1
      2  SL710031             3
      3  SL719995             4
      4  SL765369             7
      5  SL738011             8
      6  SL722232             9
      7  SL751591            10
EDIT 2:
Forgot to mention that slds_code is generated within a trigger function:
sld_code varchar(8) := 'SL7'||to_char(floor(random() * 100000 + 1)::int, 'fm00000');
We have a table with nearly 2 billion events recorded. As per our data model, each event is uniquely identified by a combined primary key of 4 columns. Excluding the primary key, there are 5 B-tree indexes, each on a single, different column, so 6 B-tree indexes in total.
The recorded events span several years, and we now need to remove data older than 1 year.
We have a time column with long values recorded for each event, and we use the following query:
delete from events where ctid = any ( array (select ctid from events where time < 1517423400000 limit 10000) )
Do the indexes get updated?
During testing, they didn't.
After insertion,
total_table_size - 27893760
table_size - 7659520
index_size - 20209664
After deletion,
total_table_size - 20226048
table_size - 0
index_size - 20209664
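For reference, these size figures can be obtained with the standard size functions (assuming the table is called events):
select pg_total_relation_size('events') as total_table_size,
       pg_relation_size('events')       as table_size,
       pg_indexes_size('events')        as index_size;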
A REINDEX can be done:
Command: REINDEX
Description: rebuild indexes
Syntax:
REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ FORCE ]
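For example, assuming the table from the question is called events:
REINDEX TABLE events;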
That said, a_horse_with_no_name's method is the better solution.
What we had:
Postgres version 9.4.
1 table with 2 billion rows, 21 columns (all bigint), a 5-column combined primary key, and 5 single-column indexes, with dates spanning 2 years.
It looks similar to time-series data, with a time column containing a UNIX timestamp, except that it's an analytics project, so time does not increase in strict order. The table was insert- and select-only (most select queries use aggregate functions).
What we need: our data span is 6 months, and we need to remove the old data.
What we did (with little knowledge of Postgres internals):
Delete rows in batches of 10,000.
Initially, the deletes were fast, taking milliseconds; as the bloat increased, each batch delete slowed to nearly 10 s. Then autovacuum got triggered, and it ran for almost 3 months. The insert rate was high, and each batch delete increased the WAL size too. Poor statistics in the table made the current queries so slow that they ran for minutes and hours.
So we decided to go for partitioning. Using table inheritance in 9.4, we implemented it.
Note: Postgres has Declarative Partitioning from version 10, which handles most manual work needed in partitioning using Table Inheritance.
Please go through the official docs as they have a clear explanation.
Simplified, this is how we implemented it (a minimal sketch follows below):
Create the parent table.
Create child tables inheriting it, with CHECK constraints. (We had monthly partitions, created using a scheduler.)
Indexes need to be created separately for each child table.
To drop old data, just drop the corresponding child table; no vacuum is needed and it is instant.
Make sure the Postgres setting constraint_exclusion is set to partition.
VACUUM ANALYZE the old partition after you start inserting into the new partition. (In our case, it helped the query planner use an index-only scan instead of a sequential scan.)
Using triggers, as mentioned in the docs, may make inserts slower, so we deviated from that. Since we partitioned on the time column, we calculated the target table name at the application level from the time value before every insert, and it didn't affect our insert rate.
Also read other caveats mentioned there.
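A minimal sketch of the inheritance setup described above (table names, columns, and range boundaries are made up for illustration):
-- the parent table holds no data itself
CREATE TABLE events_parent (
    device_id bigint,
    time      bigint,   -- UNIX timestamp in milliseconds
    value     bigint
);
-- one child table per month, with a CHECK constraint describing its range
CREATE TABLE events_2018_02 (
    CHECK (time >= 1517423400000 AND time < 1519842600000)
) INHERITS (events_parent);
-- indexes must be created on each child separately
CREATE INDEX ON events_2018_02 (time);
-- lets the planner skip children whose CHECK constraint excludes the queried range
SET constraint_exclusion = partition;
-- removing old data is just a DROP of an old (hypothetical) partition; no DELETE or VACUUM involved
DROP TABLE events_2017_01;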
I am using a PostgreSQL database for a live project, in which I have one table with 8 columns.
This table contains millions of rows, so to make searching this table faster, I want to delete old entries from it and store them in another, new table.
To do so, I know one approach:
first, select some rows
create a new table
store those rows in that table
then delete them from the main table.
But it takes too much time and it is not efficient.
So I want to know: what is the best possible approach to perform this in a PostgreSQL database?
PostgreSQL version: 9.4.2.
Approx number of rows: 8000000
I want to move rows: 2000000
You can use a CTE (common table expression) to move rows in a single SQL statement (more in the documentation):
with delta as (
delete from one_table where ...
returning *
)
insert into another_table
select * from delta;
But think carefully whether you actually need it. Like a_horse_with_no_name said in the comment, tuning your queries might be enough.
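For example, applied to the scenario above (table, column, and cutoff names are hypothetical; the target table must have a compatible column list):
with delta as (
    delete from main_table
    where created_at < '2015-01-01'
    returning *
)
insert into archive_table
select * from delta;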
This is sample code for copying data between two tables of the same structure.
Here I used different DBs; one is my production DB and the other is my testing DB.
-- requires the dblink extension (CREATE EXTENSION dblink;)
-- the connection string points at the source database; the INSERT runs in the current (target) database
INSERT INTO "Table2"
SELECT * FROM dblink('dbname=DB1 user=postgres password=root',
                     'SELECT "col1","Col2" FROM "Table1"')
       AS t1(a character varying, b character varying);
I want to partition an external table in Hive based on a range of numbers. Say, numbers from 1 to 100 go to one partition. Is it possible to do this in Hive?
I am assuming here that you have a table with some records from which you want to load data into an external table which is partitioned by some field, say RANGEOFNUMS.
Now, suppose we have a table called testtable with columns name and value. The contents are like
India,1
India,2
India,3
India,3
India,4
India,10
India,11
India,12
India,13
India,14
Now, suppose we have an external table called testext with some columns, along with a partition column, say RANGEOFNUMS.
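For illustration, the external table might be defined like this (column types and the location are just an assumption):
CREATE EXTERNAL TABLE testext (
    name  STRING,
    value INT
)
PARTITIONED BY (rangeofnums STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hive/warehouse/testext';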
Now you can do one thing,
insert into table testext partition(rangeofnums="your value")
select * from testtable where value>=1 and value<=5;
This way, all records from testtable having value 1 to 5 will go into one partition of the external table.
The scenario is my assumption only. Please comment if this is not the scenario you have.
Achyut