Is it possible to dump from Timescale without hypertable insertions? - postgresql

I followed the manual on: https://docs.timescale.com/v1.0/using-timescaledb/backup
When I dump it into a binary file, everything works out as expected (I can restore it easily).
However, when I dump it into plain-text SQL, the INSERT statements target TimescaleDB's internal tables (catalog entries and hypertable chunks). Is it possible to generate INSERT statements against the table itself?
Say I have an 'Auto' table with the columns id, brand, speed
and with only one row: 1, Opel, 170.
Dumping to SQL will result in something like this:
INSERT INTO _timescaledb_catalog.hypertable VALUES ...
INSERT INTO _timescaledb_internal._hyper_382_8930_chunk VALUES (1, 'Opel',170);
What I need is this (and let TS do the work in the background):
INSERT INTO Auto VALUES (1,'Opel',170);
Is that possible somehow? (I know I can exclude tables from pg_dump, but that wouldn't create the needed INSERT statements.)
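For reference, the two kinds of dump being compared look roughly like this (the database name mydb and the file names are just placeholders, following the linked docs):
pg_dump -Fc -f auto.bak mydb    # custom ("binary") format, restored with pg_restore
pg_dump -Fp -f auto.sql mydb    # plain-text SQL, restored with psql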

Beatrice, unfortunately pg_dump will dump commands that mirror the underlying implementation of TimescaleDB. For example, _hyper_382_8930_chunk is a chunk underlying the Auto hypertable that you have.
Might I ask why you don't want pg_dump to behave this way? The SQL file that pg_dump creates is intended to be replayed on restore (with psql for plain-text dumps, or pg_restore for the custom format). So as long as you dump and restore and end up with the correct state, there is no problem with dump/restore.
Perhaps you are asking a different question?
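One workaround, if you really want statements against the top-level table (a rough sketch, not an official TimescaleDB recipe; the column list and file name follow the Auto example above), is to export the rows through the hypertable yourself and reload them the same way, letting TimescaleDB route each row into the right chunk on insert:
-- run in psql against the source database: export via the hypertable, not the chunks
\copy (SELECT id, brand, speed FROM auto) TO 'auto.csv' csv
-- on the target database, after the hypertable has been created again:
\copy auto (id, brand, speed) FROM 'auto.csv' csv
Loading through the hypertable behaves like the INSERT INTO Auto ... you were after, just in bulk.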

Related

What happens to existing data with psql dbname < pg_dump_file [duplicate]

This question already has an answer here:
will pg_restore overwrite the existing tables?
(1 answer)
Closed 9 months ago.
I have a database on AWS RDS, and I use a pg_dump from a local version of the database, then psql dbname < pg_dump_file with the proper arguments for remote upload, to populate the database.
I'd like to know what is expected to happen if that rds db already contains data. More specifically:
Data present in the local dump, but absent in rds
Data present on rds, but absent in the local data
Data present in both but that have been modified
My current understanding:
New data will be added and be present in both after upload
Data in rds should be unaffected?
The data from the pg_dump will be present in both (assuming the same pk, but different fields otherwise)
Is that about correct? I've been reading this, but it's a little thin on how the restore is actually performed, so I'm having a harder time figuring that out. Thanks.
EDIT: following @wildplasser's comment, looking at the pg_dump file it appears that the following happens:
CREATE TABLE [....]
ALTER TABLE [setting table owner]
ALTER SEQUENCE [....]
For each table in the db. Then, again one table at a time:
COPY [tablename] (list of cols) FROM stdin;
[data to be copied]
Finally, more ALTER statements to set constraints, foreign keys, etc.
So I guess the ultimate answer is "it depends". One could, I suppose, remove the CREATE TABLE [...], ALTER TABLE and ALTER SEQUENCE statements if those objects already exist as they should. I am not positive yet what happens if one runs CREATE TABLE against an existing table (an error is thrown, perhaps?).
Then I guess the COPY statements would overwrite whatever already exists, or perhaps throw an error. I'll have to test that. I'll write up an answer once I figure it out.
So the answer is a bit dull. It turns out that even if one removes the initial statements before the COPY, if the table has a primary key (and thus a uniqueness constraint) it won't work:
ERROR: duplicate key value violates unique constraint
So one gets shut down pretty quickly there. One would, I guess, have to rewrite the dump as a list of UPDATE statements instead, but at that point one might as well write a script to do so. I'm unsure whether pg_dump is all that useful in that case.
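For completeness, one alternative to hand-writing UPDATE statements (only a sketch; the table items and its columns are made-up names, and INSERT ... ON CONFLICT needs PostgreSQL 9.5 or later) is to load the dump into a staging table and merge from there:
-- hypothetical target table "items" with primary key "id"
CREATE TEMP TABLE items_staging (LIKE items INCLUDING ALL);
-- point the dump's COPY at items_staging instead of items, then merge:
INSERT INTO items (id, name, price)
SELECT id, name, price FROM items_staging
ON CONFLICT (id) DO UPDATE
    SET name  = EXCLUDED.name,
        price = EXCLUDED.price;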

Restoring PG database from dump fails due to generated columns

We use our Postgres database dumps as a way to back up and reset our staging DB. As part of that, we frequently remove all rows in the DB and insert the rows from the PG dump. However, the generated columns are included as part of the PG dump, with their values instead of the DEFAULT keyword.
Trying to run the DB dump triggers "cannot insert into column" errors, since one cannot insert values into a generated column. How do we dump our DB and recreate it from the dump despite the generated columns?
EDIT: Note that we cannot use GENERATED BY DEFAULT or OVERRIDING SYSTEM VALUE since those are only available for identity columns and not generated columns.
EDIT 2: It seems that it's a special case for us that the values are dumped instead of as DEFAULT. Any idea why that might be?
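One possible workaround (a sketch with placeholder names: orders is a hypothetical table whose total column is generated) is to export only the non-generated columns and name them explicitly on reload, so PostgreSQL recomputes the generated column itself:
-- run in psql: export only the regular columns
\copy (SELECT id, quantity, unit_price FROM orders) TO 'orders.csv' csv
-- reload with an explicit column list; the generated column "total" is recomputed
\copy orders (id, quantity, unit_price) FROM 'orders.csv' csv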

Postgres backup and overwrite one table

I have a Postgres database, and I am trying to back up a table with:
pg_dump --data-only --table=<table> <db> > dump.sql
Then, days later, I am trying to overwrite it (basically I want to erase all the data and add the data from my dump) with:
psql -d <db> -c --table=<table> < dump.sql
But it doesn't overwrite; it appends to the existing data without deleting it.
Any advice would be awesome, thanks!
You have basically two options, depending on your data and fkey constraints.
If there are no fkeys to the table, then the best thing to do is to truncate the table before loading it. Note that TRUNCATE behaves a little oddly in transactions, so the best approach is (in a transaction block, as sketched below):
Lock the table
Truncate
Load
This will avoid other transactions seeing an empty table.
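A minimal psql sketch of that sequence, assuming the table is called mytable and dump.sql is the data-only dump from above:
BEGIN;
-- keep other transactions from seeing the empty table
LOCK TABLE mytable IN ACCESS EXCLUSIVE MODE;
TRUNCATE mytable;
-- replay the data-only dump (its COPY ... FROM stdin) inside the same transaction
\i dump.sql
COMMIT;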
If you have fkeys then you may want to load into a temporary table and then do an upsert. In this case you may still want to lock the table, to avoid a race condition if it is possible that other transactions will write to the table (also in a transaction block):
Load data into a temporary table
Lock the destination table (optional, see above)
Use a writable CTE to "upsert" into the table.
Use a separate DELETE statement to remove rows that are not part of the new data.
Stage 3 is a little tricky. You might need to ask a separate question about it, but basically you will have two stages (see the sketch after this list, and write it in consultation with the docs):
Update existing records
Insert non-existing records
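A rough sketch of stages 3 and 4 (all names are hypothetical: target table mytable with primary key id and columns col1, col2, staging table mytable_staging; on PostgreSQL 9.5+ an INSERT ... ON CONFLICT would be simpler):
-- stage 3: update existing rows, then insert the ones that were not updated
WITH updated AS (
    UPDATE mytable m
       SET col1 = s.col1,
           col2 = s.col2
      FROM mytable_staging s
     WHERE m.id = s.id
 RETURNING m.id
)
INSERT INTO mytable (id, col1, col2)
SELECT s.id, s.col1, s.col2
  FROM mytable_staging s
 WHERE s.id NOT IN (SELECT id FROM updated);

-- stage 4: remove rows that are not part of the new load
DELETE FROM mytable
 WHERE id NOT IN (SELECT id FROM mytable_staging);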
Hope this helps.

PostgreSQL reset tables to original state

We have a large PostgreSQL dump with hundreds of tables that I can successfully import with pg_restore. We are developing software that inserts into a lot of these tables (~100), and for every run we need to return these tables to their original state (that is, the content that was in the dump). Restoring the original dump again takes a lot of time, and we just can't wait half an hour before every debugging session. So I need a relatively fast way to revert these tables to the state they are in after restoring from the dump.
I've tried using pg_restore with the -L switch and selecting these tables, but I get either a duplicate key error when using both --data-only and --clean, or a "cannot drop table X because other objects depend on it" error when using only --clean. Issuing a SET CONSTRAINTS ALL DEFERRED command before pg_restore did not work either. Maybe I have the rows in the table list all wrong; right now it is
491; 1259 39623998 TABLE public some_table some_user
8021; 0 0 COMMENT public TABLE some_table some_user
8022; 0 0 ACL public some_table some_user
for every table and then
6700; 0 39624062 TABLE DATA public some_table postgres
8419; 0 0 SEQUENCE SET public some_table_pk_id_seq some_user
for every table.
We only insert data and don't update existing rows, so deleting all rows above an index and resetting the sequences might work, but I really don't want to manually create these commands for all hundred tables, and I'm not even sure it would work even if I used CASCADE to delete other objects that depend on the given rows.
Does anyone have any better idea how to handle this?
So you are looking for something like a snapshot in order to be able to revert quickly to a certain state.
I am not aware of a possibility in PostgreSQL to roll back to a certain timestamp.
While searching for a solution, I've found two ideas here
Use CREATE DATABASE with the TEMPLATE option
Virtualize your PostgreSQL installation using VMware or VirtualBox, and use the snapshot feature of the virtual machines.
Again, both ideas are copied from the above source (I searched for "postgresql db snapshots").
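A sketch of the first idea (database names are placeholders): restore the dump once into a pristine database, then recreate the working database from it before each run:
-- one-time: restore the original dump into "staging_template" and leave it untouched
-- before each run (requires no active connections to either database):
DROP DATABASE IF EXISTS staging_db;
CREATE DATABASE staging_db TEMPLATE staging_template;
This is relatively fast because CREATE DATABASE copies the template at the file level rather than replaying the dump.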
You can use PITR to create a snapshot before loads and use the PITR snapshot to take you back to any point that you have the logs for.
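Roughly, that means enabling WAL archiving, taking a base backup before the load, and recovering to the desired point when a reset is needed. A sketch for PostgreSQL 12+, with placeholder paths and target time:
# in postgresql.conf: wal_level = replica, archive_mode = on,
#                     archive_command = 'cp %p /wal_archive/%f'
pg_basebackup -D /backups/base -X stream    # base backup taken before the load
# to reset: stop the server, restore the data directory from /backups/base,
# set restore_command = 'cp /wal_archive/%f %p' and recovery_target_time = '...',
# create an empty recovery.signal file, then start the server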

db2 reorganize a table

When I alter a table in DB2, I have to reorganize it, so I execute the following statement:
CALL SYSPROC.ADMIN_CMD('REORG TABLE myTable');
I'm searching for an appropriate way to reorganize a table when it is altered, or to reorganize the whole schema after making various modifications.
You can determine when tables will require a REORG by looking at SYSIBMADM.ADMINTABINFO:
select tabschema, tabname
from sysibmadm.admintabinfo
where reorg_pending = 'Y'
You may also want to look at the NUM_REORG_REC_ALTERS column, as this may show you additional tables that may require reorganization due to various ALTER TABLE statements.
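If you want to act on everything that is flagged, one option (a sketch, not a built-in feature) is to generate the REORG calls from the same view and then run the generated statements:
-- generate one REORG call per reorg-pending table
SELECT 'CALL SYSPROC.ADMIN_CMD(''REORG TABLE '
       || TRIM(tabschema) || '.' || TRIM(tabname) || ''')'
  FROM sysibmadm.admintabinfo
 WHERE reorg_pending = 'Y';
Each returned row is itself a statement you can then execute.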
The reorg operation is similar to defragmenting a hard disk. It frees empty space in pages, and it can also reorganize data according to an index. Depending on the features in use, it creates the compression dictionary and compresses data.
As you can see, the reorg operation is an administrative task, and it is not necessary each time data is modified; a database can run without reorg.
In order to ease this, DB2 includes autonomic features such as automatic maintenance (automatic backup and automatic reorg among them); however, this doesn't fully answer your question, since it will only trigger a reorg on tables that need it.
To reorg a table explicitly you need to execute the REORG command: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.admin.cmd.doc/doc/r0001966.html
or run it via ADMIN_CMD: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.sql.rtn.doc/doc/r0023582.html
In the DB2 database configuration we have:
Automatic reorganization (AUTO_REORG) = OFF
We can set AUTO_REORG to ON.
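For example, from the DB2 command line (the database name mydb is a placeholder; AUTO_REORG also requires the parent switches AUTO_MAINT and AUTO_TBL_MAINT to be ON):
db2 UPDATE DB CFG FOR mydb USING AUTO_MAINT ON AUTO_TBL_MAINT ON AUTO_REORG ON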