Postgres Error | COPY Command Disk Quota Exceeded - postgresql

I have 2 tables, one with 7.7 million records and the other with 160 million records.
I want to dump the tables to my NAS drive (700GB+ of available space). I'm using the following command to export the data to a CSV file:
\COPY (select * from table_name) to '/path_to_nas_drive/test.csv' csv header;
After running the above command, it throws this error:
Could not write COPY data : disk quota exceeded
Why is it throwing this error? Is it because of a space issue on the database server, so that it cannot create a buffer/temporary file, or is there some way to handle this?

Let's try scaling back the request to a single row, just to make sure COPY works at all:
\COPY (select * from table_name limit 1) to '/path_to_nas_drive/test.csv' csv header;
You can also try compressing the output with gzip like so:
psql -c "\COPY (select * from table_name) to stdout" yourdbname | gzip > /path_to_nas_drive/test.csv.gz

It is a client issue, not a server issue. "could not write COPY data" is generated by psql, and "disk quota exceeded" comes from the OS and is then passed on by psql.
So your NAS is telling you that you are not allowed to write as much to it as you would like. Maybe it has plenty of space, but it isn't going to give it to you, where "you" is whoever is running psql, not whoever is running the PostgreSQL server.
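If the NAS is mounted on the client machine, a quick sanity check (assuming a Linux client; the mount point is the one from the question) is to look at the free space and any per-user quota as the account that runs psql:
df -h /path_to_nas_drive    # free space as the client sees it
quota -s                    # per-user quotas, if the quota tools are installed and quotas are enabled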

Related

PostgreSQL error: could not extend file: No space left on device

I'm running a query that duplicates a very large table (92 million rows) on PostgreSQL. After 3 iterations I got the error message shown in the title.
The query was:
CREATE TABLE table_name
AS SELECT * FROM big_table
The issue isn't due to lack of space in the database cluster: it was at 0.3% of the maximum possible storage at the time of running the query, and the table is about 0.01% of the maximum storage including all replicas. I also checked temporary files, and that's not it either.
You are definitely running out of file system resources.
Make sure you got the size right:
SELECT pg_table_size('big_table');
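If you prefer a human-readable figure, pg_size_pretty can wrap that (just a convenience, not required for the diagnosis):
SELECT pg_size_pretty(pg_table_size('big_table'));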
Don't forget that the files backing the new table are deleted after the error, so it is no surprise that you have lots of free space after the statement has failed.
One possibility is that you are not running out of disk space, but of free i-nodes. How to examine the free resources differs from file system to file system; for ext4 on Linux it would be
sudo dumpe2fs -h /dev/... | grep Free
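A quicker, file-system-agnostic overview on Linux is df with its inode switch (point it at the mount that holds your data directory; the path below is a placeholder):
df -i /path/to/data_directory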

Backing up a very large PostgreSQL DB "by parts"

I have a PostgreSQL DB that is about 6TB. I want to transfer this database to another server, using for example pg_dumpall. The problem is that I only have a 1TB HD. How can I copy this database to the other new server, which has enough space? Let's suppose I cannot get another HD. Is it possible to make partial backup files, upload them to the new server, erase the HD, and get another batch of backup files until the transfer is complete?
This works here (proof of concept):
shell-command issued from the receiving side
remote side dumps through the network-connection
local side psql just accepts the commands from this connection
the data is never stored in a physical file
(for brevity, I only sent the table definitions, not the actual data: --schema-only)
You could have some problems with users and tablespaces (these are global to an installation in Postgres); pg_dumpall will dump and restore these too, IIRC.
#!/bin/bash
remote=10.224.60.103
dbname=myremotedbname
pg_dump -h ${remote} --schema-only -c -C ${dbname} | psql
#eof
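To move the actual data as well, a sketch under the same assumptions would drop --schema-only, and pg_dumpall with --globals-only can carry over the users and tablespaces mentioned above:
#!/bin/bash
remote=10.224.60.103
dbname=myremotedbname
pg_dumpall -h ${remote} --globals-only | psql   # roles and tablespaces
pg_dump -h ${remote} -c -C ${dbname} | psql     # schema plus data, streamed over the network
#eof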
As suggested above, if you have a fast network connection between source and destination, you can do it without any extra disk.
However, for a 6 TB DB (which I assume includes indexes), using the archive dump format (-Fc) could yield a database dump of less than 1 TB.
Regarding the "by parts" question: yes, it is possible using the table pattern (-t, --table):
pg_dump -t TABLE_NAME ...
You can also exclude tables using -T, --exclude-table:
pg_dump -T TABLE_NAME ...
The above options (-t, -T) can be specified multiple times and can even be combined.
They also support patterns for specifying the tables:
pg_dump -t 'employee_*' ...
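Putting the pieces together, a hypothetical "by parts" transfer (the table pattern and database names are made up for illustration) could dump one group of tables at a time in the compressed archive format, restore it on the new server, delete the local file, and then continue with the next group:
pg_dump -Fc -t 'employee_*' sourcedb > part1.dump
pg_restore -h newserver -d targetdb part1.dump && rm part1.dump
pg_dump -Fc -T 'employee_*' sourcedb > part2.dump
pg_restore -h newserver -d targetdb part2.dump && rm part2.dump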

vacuumlo command deleted all matched records from postgresql db

My data folder size is 194GB. When I look at the size of the pg_largeobject table, it is 178GB. So I ran the command below to get rid of all unreferenced OIDs in my database:
vacuumlo -U postgres -W -v <db_name>
After that I ran the "VACUUM FULL ANALYZE pg_largeobject;" command to free up space.
The issue is that the space was freed up, but now the application is getting a "large object does not exist" error.
Please help me figure out what I missed.

Postgres copy query connection lost error

I'm trying to bulk load around 200M lines (3.5GB) of data to an Amazon RDS postgresql DB using the following command:
cat data.csv | psql -h<host>.rds.amazonaws.com -U<user> <db> -c "COPY table FROM STDIN DELIMITER AS ','"
After a couple of minutes I get this error:
connection not open
connection to server was lost
If I run head -n 100000000 data.csv to send only the first 100M lines instead of all 200M, the command succeeds. I'm guessing there's a timeout somewhere that causes the query with the full dataset to fail, but I've not been able to find any timeout settings or parameters.
How can I make the bulk insert succeed with the full dataset?
As I read the statement you're using, it basically builds one giant string, connects to the database, and then tries to feed the entire string in as the argument.
If you load psql and run something like \copy ... from '/path/to/data.csv' ..., I'd imagine the connection might stay alive while the file's content is streamed chunk by chunk.
That would be my hunch as to why 100M lines work (the argument gets pushed through before the connection times out) but the entire file does not (the argument is still uploading when the timeout hits).
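A sketch of that suggestion, reusing the connection details from the question (table name and delimiter assumed unchanged):
psql -h <host>.rds.amazonaws.com -U<user> <db> -c "\copy table FROM 'data.csv' DELIMITER AS ','"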

Create query that copies from a CSV file on my computer to the DB located on another computer in Postgres

I am trying to create a query that will copy data from a CSV file that is located on my computer to a Postgres DB that is on a different computer.
Our Postgres DB is located on another computer, and I work on my own computer to import and query data. I have successfully copied data from the CSV file on MY computer TO the DB in PSQL Console using the following:
\COPY table_name FROM 'c:\path\to\file.csv' CSV DELIMITER E'\t' HEADER;
But when writing a query using the SQL Editor, I use the same code as above without the '\' at the beginning, and I get the following error:
ERROR: could not open file "c:\pgres\dmi_vehinventory.csv" for reading: No such file or directory
SQL state: 58P01
I assume the query is actually trying to find the file on the DB's computer rather than my own.
How do I write a query that tells Postgres to look for the file on MY computer rather than the DB's computer?
Any help will be much appreciated!
\COPY is the correct way if you want to upload a file from the local computer (the computer where you've started psql).
COPY is correct when you want to load a file on the remote host from a directory on that host.
Here is an example; I've connected with psql to a remote server:
test=# COPY test(i, i1, i3)
FROM './test.csv' WITH DELIMITER ',';
ERROR: could not open file "./test.csv" for reading: No such file
test=# \COPY test(i, i1, i3)
FROM './test.csv' WITH DELIMITER ',';
test=# select * from test;
i | i1 | i3
---+----+----
1 | 2 | 3
(1 row)
There are several common misconceptions when dealing with PostgreSQL's COPY command.
Even though psql's \COPY FROM '/path/to/file/on/client' command has identical syntax (other than the backslash) to the backend's COPY FROM '/path/to/file/on/server' command, they are totally different. When you include a backslash, psql actually rewrites it to a COPY FROM STDIN command instead, and then reads the file itself and transfers it over the connection.
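As an illustration (the table and file names are made up), the following two invocations load the same client-side file; in the first, psql itself rewrites the backslash form into COPY ... FROM STDIN, while the second spells that out explicitly:
psql -d mydb -c "\copy test(i, i1, i3) FROM 'test.csv' WITH (FORMAT csv)"
psql -d mydb -c "COPY test(i, i1, i3) FROM STDIN WITH (FORMAT csv)" < test.csv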
Executing a COPY FROM 'file' command tells the backend to itself open the given path and load it into a given table. As such, the file must be mapped in the server's filesystem and the backend process must have the correct permissions to read it. However, the upside of this variant is that it is supported by any postgresql client that supports raw sql.
Successfully executing a COPY FROM STDIN places the connection into a special COPY_IN state during which an entirely different (and much simpler) sub-protocol is spoken between the client and server, which allows for data (which may or may not come from a file) to be transferred from the client to the server. As such, this command is not well supported outside of libpq, the official client library for C. If you aren't using libpq, you may or may not be able to use this command, but you'll have to do your own research.
COPY FROM STDIN/COPY TO STDOUT doesn't really have anything to do with standard input or standard output; rather the client needs to speak the sub-protocol on the database connection. In the COPY IN case, libpq provides two commands, one to send data to the backend, and another to either commit or roll back the operation. In the COPY OUT case, libpq provides one function that receives either a row of data or an end of data marker.
I don't know anything about SQL Editor, but it's likely that issuing a COPY FROM STDIN command will leave the connection in an unusable state from its point of view, especially if it's connecting via an ODBC driver. As far as I know, ODBC drivers for PostgreSQL do not support COPY IN.