I'm trying to export an Oracle table into a local PostgreSQL dump via the copy command:
\copy (select * from remote_oracle_table) to /postgresql/table.dump with binary;
The Oracle table's size is 25 GB. However, the copy command created a 50 GB file. How is that possible?
I'm able to select from the remote Oracle table because I have the oracle_fdw extension.
A few factors are likely at work here, including:
Small numbers in integer and numeric fields use more space in binary format than text format;
Oracle probably stores the table with some degree of compression, which the binary dump won't have.
You'll likely find that if you compress the resulting dump it'll be a lot smaller.
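If the on-disk size is the concern, one option (just a sketch, reusing the path from your question and assuming gzip is available where psql runs) is to compress the dump as it is written, or to compare against a text-format dump:

\copy (select * from remote_oracle_table) to program 'gzip > /postgresql/table.dump.gz' with (format binary)
-- or, for comparison, a text-format dump, which is often smaller for numeric-heavy rows:
\copy (select * from remote_oracle_table) to '/postgresql/table_text.dump'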
I followed the manual on: https://docs.timescale.com/v1.0/using-timescaledb/backup
When I dump it into a binary file, everything works out as expected (I can restore it easily).
However, when I dump it into plain-text SQL, the INSERTs are generated against the hypertable's internal chunks. Is it possible to create INSERTs against the table itself?
Say I have an 'Auto' table with columns id, brand, speed,
and with only one row: 1, Opel, 170.
Dumping to SQL will produce something like this:
INSERT INTO _timescaledb_catalog.hypertable VALUES ...
INSERT INTO _timescaledb_internal._hyper_382_8930_chunk VALUES (1, 'Opel',170);
What I need is this (and let TS do the work in the background):
INSERT INTO Auto VALUES (1,'Opel',170);
Is that possible somehow? (I know I can exclude tables from pg_dump, but that wouldn't create the needed INSERTs.)
Beatrice, unfortunately pg_dump will dump commands that mirror the underlying implementation of Timescale. For example, _hyper_382_8930_chunk is a chunk underlying the Auto hypertable that you have.
Might I ask why you don't want pg_dump to behave this way? The dump that Postgres creates is intended to be restored as a whole (with psql for a plain SQL dump, or pg_restore for the custom format). So as long as you dump and restore and end up with a correct state, there is no problem with dump/restore.
Perhaps you are asking a different question?
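If what you actually need is to move the data through the hypertable itself rather than its chunks, one sketch (reusing your Auto example; TimescaleDB routes the rows back into chunks on insert) is to COPY at the hypertable level:

\copy (SELECT * FROM auto) TO '/tmp/auto.csv' WITH (format csv)
-- on the target, after creating the Auto hypertable:
\copy auto FROM '/tmp/auto.csv' WITH (format csv)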
I am using the Postgres foreign data wrapper for files (file_fdw) to run queries against an external file. Interestingly, I saw that if Postgres runs a query on a binary file, it uses more main memory than on a text file such as a .csv. Can anyone confirm why this happens? Is there any way to optimize this?
My file contains two columns (id | geometry), where geometry represents a polygon, so my foreign table consists of two columns. I have tested join queries like ST_Overlaps() and ST_Contains() against a CSV file and against a Postgres-compatible binary file; the foreign table definitions are sketched below.
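Roughly, they look like this (file paths and the server name are placeholders; the geometry type comes from PostGIS, and the binary file is assumed to have been produced with COPY ... WITH (FORMAT binary)):

CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE ftable1 (
    id   integer,
    geom geometry             -- PostGIS type
) SERVER file_server
  OPTIONS (filename '/data/polygons1.csv', format 'csv', header 'true');

CREATE FOREIGN TABLE ftable2 (
    id   integer,
    geom geometry
) SERVER file_server
  OPTIONS (filename '/data/polygons2.bin', format 'binary');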
SELECT COUNT(*) FROM ftable1 a, ftable2 b WHERE ST_Overlaps(a.geom, b.geom);
I checked PostgreSQL's memory usage with htop, and saw that when Postgres runs the query against the binary file, it uses more main memory than with the CSV file. For example, if it is 500 MB for CSV, it is almost 1 GB for the binary file. Why?
In order to build a staging database from my production data to test migrations, etc., I do a regular \copy of a subset of production records to CSV files and import the result. I do this for the specific tables which are very large (600 GB), as I don't need all of them for testing.
One such column is a character varying(30) [for hysterical raisins involving Django 1].
I have data in that column which is UTF-8 encoded. Some of it is exactly 30 glyphs wide, but that takes more than 30 bytes to encode. Strangely, it fits just fine in the original database, but after creating a new database, it does not fit.
I copy in with:
\copy public.cdrviewer_cdr from '/sg1/backups/2017-02-20/cdr.csv'
with (format 'csv', header, encoding 'utf-8') ;
This seems like it may be a bug, or maybe it's just a limitation of copy.
(I am using postgresql 9.6 on Devuan Jessie)
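One way to narrow this down (the column name below is a placeholder for the varying(30) column) is to compare the encodings of the two databases and the character vs. byte lengths of the affected values, since varchar(30) limits characters, not bytes:

SELECT datname, pg_encoding_to_char(encoding) FROM pg_database;

SELECT char_length(some_column) AS chars, octet_length(some_column) AS bytes
FROM public.cdrviewer_cdr
ORDER BY octet_length(some_column) DESC
LIMIT 5;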
I don't want to use pg_dump to export data into an SQL script, since feeding it to the Greenplum cluster is too slow when I have a large amount of data to import. So it seems that using Greenplum's gpfdist is preferred. Is there any way I can do this?
Or, as an alternative, can I export a particular Postgres table's data into a CSV file containing the large objects of that table?
pg_dump will create a file that uses COPY to load the data back into a database. When loading into Greenplum, it will load through the master server, and for very large loads that becomes a bottleneck. Yes, the preferred method is to use gpfdist, but you can most certainly use COPY to load data into Greenplum. It won't load at the 10+ TB per hour rate that gpfdist can achieve, but it can still achieve 1 to 2 TB per hour.
Another alternative is to use gpfdist to execute a program to get data. It would execute the SELECT statement against PostgreSQL to make the data available to an external table in Greenplum. I created a wrapper for this process called "gplink". You can check it out here: http://www.pivotalguru.com/?page_id=982
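As a rough sketch of that "execute a program" approach (host, database, table, and column names are all assumptions here; gplink essentially automates this):

CREATE EXTERNAL WEB TABLE ext_pg_source (
    id   int,
    name text
)
EXECUTE 'psql -h pg_host -d pg_db -c "COPY (SELECT id, name FROM source_table) TO STDOUT WITH CSV"'
ON MASTER
FORMAT 'CSV';

-- then load into a regular (distributed) Greenplum table
INSERT INTO gp_target SELECT * FROM ext_pg_source;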
According to the Greenplum reference:
The simplest data loading method is the SQL INSERT statement...
You can use the COPY command to load the data into a table when the data is in external text files...
You can use a pair of Greenplum utilities, gpfdist and gpload, to load external data into tables...
Nevertheless, if you want to use CSV to import the data, you can generate a CSV containing the large object data by joining your table against pg_largeobject. E.g.:
b=# create table lo (n text,p oid);
CREATE TABLE
b=# insert into lo values('wheel',lo_import ('/tmp/wheel.PNG'));
INSERT 0 1
b=# copy (select lo.*, pg_largeobject.pageno, pg_largeobject.data from lo join pg_largeobject on lo.p = loid) to '/tmp/lo.csv' WITH (format csv, header);
COPY 20
The generated /tmp/lo.csv will contain the name, the oid, the page number, and the bytea data in CSV format.
I have to transfer data from an old database to a new database where the table names and column names are different.
Can it be done with a DOS command or any other solution?
The new one is PostgreSQL and the old one is MySQL.
My concern is that the table names and column names are different, but the number of columns is the same.
Thank you
I do not know the PostgreSQL part, but for SQL Server you can use sqlcmd.exe to export data in text format, with or without column names.
Please check
http://msdn.microsoft.com/en-us/library/ms162773.aspx
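For the MySQL-to-PostgreSQL direction asked about, one sketch (all table, column, and file names below are made up for illustration; only the column count has to match) is to go through CSV and let the column list on the PostgreSQL side map the data onto the new names:

-- MySQL side: export the old table to CSV
SELECT old_id, old_brand
INTO OUTFILE '/tmp/old_table.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
FROM old_table;

-- PostgreSQL side (psql): the column list maps the CSV columns onto the new names
\copy new_table (new_id, new_brand) from '/tmp/old_table.csv' with (format csv)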