So I have a decent-sized database (roughly 8 million rows) that I need to pull data from. It needs to be output as a CSV file that can be opened by Excel. I've tried virtually every solution I found, to no avail.
\copy - puts all values in a single column, separated by ','.
copy ... to ... with csv header - same result as above.
copy ... into outfile - refuses to work; it claims there's something wrong with my path, even though I used the same path as before.
I'm not the most experienced with SQL, to say the least, but I'll try my best to provide any information necessary.
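For reference, the kind of client-side export these attempts were aiming for usually looks like the sketch below; the table, columns, and output path are placeholders, not taken from the post:

-- run from psql; \copy writes the file on the client machine
\copy (SELECT id, name, created_at FROM big_table) TO 'C:/temp/export.csv' WITH (FORMAT csv, HEADER, DELIMITER ',')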
Have you tried mysqldump?
I had a similar situation, backing up and loading about 11 million rows, and mysqldump worked for me.
Maybe you can search for the mysqldump equivalent for PostgreSQL.
Related
In a few other DB engines I can easily extract (part of) a table to a single file.
Then, if needed, I can 'mount' this file as a regular table. Querying it is obviously slow, but this is very useful.
I wonder if something similar is possible with psql?
I know about the COPY FROM/TO commands, but for bigger tables I have to wait ages for the records to be copied from CSV.
Yes, you can use file_fdw to access (read) a CSV file on the database server as if it were a table.
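A minimal sketch of that approach, with the file path and the column list assumed for illustration (the CSV must live on the database server and be readable by the server process):

CREATE EXTENSION file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- the foreign table reads the file on every query; nothing is imported
CREATE FOREIGN TABLE mydata_csv (
    id   integer,
    name text
) SERVER csv_files
  OPTIONS (filename '/var/lib/postgresql/mydata.csv', format 'csv', header 'true');

-- query it like a normal (read-only) table
SELECT count(*) FROM mydata_csv;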
I'm trying to load some large data sets from CSV into a Postgres 11 database (Windows) to do some testing. The first problem I ran into was that with a very large CSV I got this error: "ERROR: could not stat file 'D:/temp/data.csv': Unknown error". So after searching, I found a workaround to load the data from a zip file. I set up 7-Zip and was able to load some data with a command like this:
psql -U postgres -h localhost -d MyTestDb -c "copy my_table(id,name) FROM PROGRAM 'C:/7z e -so d:/temp/data.zip' DELIMITER ',' CSV"
Using this method, I was able to load a bunch of files of varying sizes, one with 100 million records that was 700MB zipped. But then I have one more large file with 100 million records that's around 1GB zipped, and that one for some reason is giving me grief. Basically, the psql process just keeps running and never stops. I can see from the growing data files that it generates data up to a certain point, but then at some point the files stop growing. I'm seeing 6 files in a data folder named 17955, 17955.1, 17955.2, etc. up to 17955.5. The Date Modified timestamp on those files continues to be updated, but they're not growing in size and my psql program just sits there. If I shut down the process, I lose all the data, since I assume it rolls back when the process does not run to completion.
I looked at the logs in the data/log folder, but there doesn't seem to be anything meaningful there. I can't say I'm very familiar with Postgres (I've used SQL Server the most), so I'm looking for tips on where to look, what extra logging to turn on, or anything else that could help figure out why this process is stalling.
Figured it out thanks to jjanes' comment above (sadly he/she didn't add an answer). I was adding 100 million records to a table with a foreign key to another table with 100 million records. I removed the foreign key, added the records, and then re-added the foreign key; that did the trick. I guess validating the foreign key for every row is just too much with a bulk insert this size.
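A sketch of that sequence, with the table, column, and constraint names invented for illustration (the COPY line mirrors the FROM PROGRAM command above):

-- drop the FK so the bulk load doesn't have to validate it row by row
ALTER TABLE my_table DROP CONSTRAINT my_table_parent_id_fkey;

COPY my_table (id, parent_id, name)
    FROM PROGRAM 'C:/7z e -so d:/temp/data.zip' DELIMITER ',' CSV;

-- re-add the FK; the constraint is validated once over the whole table
ALTER TABLE my_table
    ADD CONSTRAINT my_table_parent_id_fkey
    FOREIGN KEY (parent_id) REFERENCES parent_table (id);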
I have a CSV file whose data needs to be imported into a Postgres database. I did it using the import function in pgAdmin III, but the problem is that my CSV file changes frequently. How can I import the data from the CSV file so that it overwrites the data already in the database?
You can take advantage of a WAL-logging optimization by doing the TRUNCATE and the COPY in the same transaction. The basic idea is to wipe the table with TRUNCATE and reimport the data with COPY. This doesn't need to be done manually with pgAdmin each time; it can be scripted with something like:
BEGIN;
-- The CSV file is 'mydata.csv' and the table is 'mydata'.
TRUNCATE mydata;
COPY mydata FROM 'mydata.csv' WITH (FORMAT csv);
COMMIT;
Note that the server-side COPY FROM a file requires superuser access to work. The COPY command also takes various arguments, so you can adjust the settings for nulls, headers, etc.
Finally, note that the two statements above are wrapped in a single transaction; ideally you want that, both for the WAL optimization and so that a failed COPY doesn't leave you with an empty table. That said, this level of care isn't needed in many of the real-world cases where one is copying in a CSV file; if you think your situation needs it, it's not too hard to track down the details.
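If you don't have superuser access, a common workaround (my addition, not part of the answer above) is to do the same thing with psql's client-side \copy, which reads the file from the client machine instead of the server:

BEGIN;
TRUNCATE mydata;
-- \copy is a psql meta-command; the path is resolved on the client side
\copy mydata FROM 'mydata.csv' WITH (FORMAT csv)
COMMIT;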
After going through similar questions on Stack Overflow, I am unable to find a way to export a large CSV file from a query run in MySQL Workbench (v5.2).
The query returns about 4 million rows and 8 columns (which comes to about 300 MB when exported as a CSV file).
Currently I load all the rows (having to view them in the GUI) and use the export option, which makes my machine crash most of the time.
My constraints are:
I am not looking for a solution via bash terminal.
I need to export it to the client machine and not the database server.
Is this a drawback of MySQL Workbench?
How do I export all the rows into a single file without having to view them in the GUI?
There is a similar question I found, but the answers don't meet the constraints I have:
"Exporting query results in MySQL Workbench beyond 1000 records"
Thanks.
In order to export to CSV you first have to load all that data, which is a lot to have in a GUI; many controls are simply not made to carry that much data. So your best bet is to avoid the GUI as much as possible.
One way could be to run your query with the output sent to a text window (see the Query menu). This is not CSV, but it should at least work. You can then try to copy the text out into a spreadsheet and convert it to CSV.
If that is too much work, try splitting your rows into ranges, say 1 million each, using a LIMIT clause on your query (see the sketch below). Lower the size until you get one that MySQL Workbench can handle. You will end up with n CSV files that you have to concatenate later. A small application or (depending on your OS) a system tool should be able to strip the headers and concatenate the files into one.
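A sketch of that chunking, with a made-up table name and an ORDER BY on an assumed id column so the ranges don't overlap; each query is run separately in Workbench and its result set exported:

-- chunk 1
SELECT * FROM my_table ORDER BY id LIMIT 1000000 OFFSET 0;
-- chunk 2
SELECT * FROM my_table ORDER BY id LIMIT 1000000 OFFSET 1000000;
-- chunk 3
SELECT * FROM my_table ORDER BY id LIMIT 1000000 OFFSET 2000000;
-- chunk 4 (the query returns about 4 million rows, so four chunks of 1 million)
SELECT * FROM my_table ORDER BY id LIMIT 1000000 OFFSET 3000000;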
I am faced with a situation where we get a lot of CSV files from different clients, but there is always some issue with the column count and column lengths that our target table is expecting.
What is the best way to handle frequently changing CSV files? My goal is to load these CSV files into a Postgres database.
I checked the \copy command in Postgres, but it doesn't have an option to create the table.
You could try creating a pg_dump-compatible file instead, which has the appropriate "create table" section, and use that to load your data.
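Roughly, such a file looks like the sketch below (the table name, columns, and rows are made up; real pg_dump output also carries SET commands and ownership details). It can then be loaded with something like psql -d mydb -f load_client_data.sql.

CREATE TABLE client_data (
    id     integer,
    name   text,
    amount numeric
);

COPY client_data (id, name, amount) FROM stdin;
1	Alice	10.50
2	Bob	22.00
\.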
I recommend using an external ETL tool like CloverETL, Talend Studio, or Pentaho Kettle for data loading when you're having to massage different kinds of data.
\copy is really intended for importing well-formed data in a known structure.