I am running PostgreSQL on Windows 8 using the OpenGeo Suite. I'm running out of disk space on a large join. How can I change the temporary directory where the "hash-join temporary file" gets stored?
I am looking at the PostgreSQL configuration file and I don't see a tmp file directory.
Note: I am merging two tables of 10 million rows each on a variable-length text field that is set as the primary key.
This is my query:
UPDATE blocks
SET "PctBlack1" = race_blocks."PctBlack1"
FROM race_blocks
WHERE race_blocks.esriid = blocks.geoid10
First, make sure you have an index on these columns (in both tables). This lets PostgreSQL use fewer temporary files. Also, increase the GUC work_mem as much as your available RAM allows, so PostgreSQL can use more memory for operations like this.
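As a sketch (the index on blocks.geoid10 may already exist if it is the primary key, and the work_mem value below is only illustrative — size it to your machine):

```sql
-- Indexes on the join columns; skip any that already exist,
-- e.g. the primary key index on blocks.geoid10
CREATE INDEX ON blocks (geoid10);
CREATE INDEX ON race_blocks (esriid);

-- Raise work_mem for the current session only
SET work_mem = '256MB';
```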
Now, if you still need to change the temporary path, you first have to create a tablespace (if you haven't already):
CREATE TABLESPACE temp_disk LOCATION 'F:\pgtemp';
Then, you have to set the GUC temp_tablespaces. You can set it per database, per user, in postgresql.conf, or inside the current session (before your query):
SET temp_tablespaces TO 'temp_disk';
UPDATE blocks
SET "PctBlack1" = race_blocks."PctBlack1"
FROM race_blocks
WHERE race_blocks.esriid = blocks.geoid10
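If you want the setting to persist beyond the current session, it can also be stored per database or per role (the database and role names here are placeholders):

```sql
ALTER DATABASE mydb SET temp_tablespaces = 'temp_disk';
ALTER ROLE app_user SET temp_tablespaces = 'temp_disk';
```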
One more thing: the user must have the CREATE privilege on the tablespace to use it:
GRANT CREATE ON TABLESPACE temp_disk TO app_user;
I was unable to point PostgreSQL at the F:\pgtemp directory directly due to a lack of permissions.
So I created a soft link to it from the Windows command line using "mklink /D". Now, when PostgreSQL writes temporary files to c:\Users\Administrator.opengeo\pgdata\Administrator\base\pgsql_tmp, they are actually stored on the F: drive.
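For reference, the link was created roughly like this (paths are from my setup; run from an elevated Command Prompt with the PostgreSQL service stopped, and remove or move the existing pgsql_tmp directory first):

```
cd c:\Users\Administrator.opengeo\pgdata\Administrator\base
rmdir pgsql_tmp
mklink /D pgsql_tmp F:\pgtemp
```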
Related
I'm trying to change two parameters (random_page_cost and effective_io_concurrency) in my PostgreSQL 12 database. For the first one I found this command:
alter database mydb set random_page_cost=1.1;
However, for the second I couldn't find out how to do it. I thought about changing the postgresql.conf file, but the parameter isn't there at all. Can I add it? Or is there a command similar to the one for random_page_cost?
The syntax is as follows:
ALTER TABLESPACE mytablespace SET (effective_io_concurrency = 1);
Replace mytablespace with your specific tablespace, and choose a number for effective_io_concurrency that suits your storage.
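If you would rather set it instance-wide instead of per tablespace, ALTER SYSTEM writes the setting to postgresql.auto.conf (a sketch; the value 2 is only illustrative):

```sql
ALTER SYSTEM SET effective_io_concurrency = 2;
SELECT pg_reload_conf();
```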
Corresponding documentation for V10:
https://www.postgresql.org/docs/10/runtime-config-resource.html#:~:text=effective_io_concurrency
https://www.postgresql.org/docs/10/sql-altertablespace.html
I have a ~2.5 TB database, which is divided into tablespaces. The problem is that ~250 GB is stored in the pg_default tablespace.
I have 3 tables and 6 tablespaces: one per table and one for each table's index. None of the tablespace directories is empty, so no table or index is missing its tablespace. Yet the data/main/base/OID_of_database directory is about 250 GB in size.
Can anyone tell me what is stored there, is it OK, and if not, how can I move it to tablespace?
I am using PostgreSQL 10.
Inspect the base subdirectory of the data directory. It will contain numbered directories that correspond to your databases, and perhaps a pgsql_tmp directory.
Find out which directory contains the 250 GB. Map directory names to databases using
SELECT oid, datname
FROM pg_database;
Once you have identified the directory, change into it and see what it contains.
Map the numbers to database objects using
SELECT relname, relkind, relfilenode
FROM pg_class;
(Make sure you are connected to the correct database.)
Now you know which objects take up the space.
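As a shortcut, you can also ask the catalog directly which relations still live in the default tablespace (reltablespace = 0 means the database's default, typically pg_default). This query is a sketch and only covers objects the catalog knows about, not orphaned files:

```sql
SELECT relname, relkind,
       pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE reltablespace = 0
ORDER BY pg_relation_size(oid) DESC
LIMIT 20;
```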
If you had frequent crashes during operations like ALTER TABLE or VACUUM (FULL), the files may be leftovers from those. They can in theory be deleted, but I wouldn't do that without consulting a PostgreSQL expert.
I am trying to
create a snapshot of a PostgreSQL database (using pg_dump),
do some random tests, and
restore to the exact same state as the snapshot, and do some other random tests.
These runs can happen over several different days. Also, I am in a multi-user environment where I am not the DB admin; in particular, I cannot create a new database.
However, when I restore db using
gunzip -c dump_file.gz | psql my_db
changes in step 2 above remain.
For example, if I make a copy of a table:
create table foo1 as (select * from foo);
and then restore, the copied table foo1 remains there.
Could some explain how can I restore to the exact same state as if step 2 never happened?
-- Update --
Following the comments from @a_horse_with_no_name, I tried to use
DROP OWNED BY my_db_user
to drop all my objects before the restore, but I got an error associated with an extension that I cannot control, and my tables remained intact.
ERROR: cannot drop sequence bg_gid_seq because extension postgis_tiger_geocoder requires it
HINT: You can drop extension postgis_tiger_geocoder instead.
Any suggestions?
You have to remove everything that's there, by dropping and recreating the database or something similar. pg_dump basically just produces an SQL script that, when applied, ensures all the tables, stored procedures, etc. exist and contain their data. It doesn't remove anything.
You can use PostgreSQL Schemas.
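A sketch of that idea, using the table foo from the question: keep a pristine copy of each table in a separate schema, then rebuild the working tables from it before each test run (schema name is a placeholder; with many tables you would generate these statements from the catalog):

```sql
-- One-time setup: snapshot the pristine data
CREATE SCHEMA pristine;
CREATE TABLE pristine.foo AS SELECT * FROM public.foo;

-- Before each test run: reset the working copy
TRUNCATE public.foo;
INSERT INTO public.foo SELECT * FROM pristine.foo;
```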
I created a database called mapdata in which I will create a table called school. One of the datatypes for one of the columns is db2gse.ST_Point. I have tried creating the table school with the column with that datatype but it gave me an error saying db2gse.ST_Point is an undefined name. So then I figured I had to enable the spatial commands using this statement:
db2se enable_db mapdata
But that gives me error as well. It says a temporary table space could not be created because there is no available system temporary table space that has a compatible page size.
How can I resolve this problem?
If you take a look at the db2se enable_db page in the manual you will probably notice this, among other things:
Usage notes
Ensure that you have a system temporary table space with a page size of 8 KB or larger and with a minimum size of 500 pages. This is a requirement to run the db2se enable_db command successfully.
The error message tells you that there is no such tablespace. I suspect that your database also does not have a matching bufferpool.
To create a system temporary tablespace you might use the following commands (assuming your database is configured with automatic storage):
db2 "create bufferpool bp8k pagesize 8 k"
db2 "create system temporary tablespace tmpsys8k pagesize 8 k bufferpool bp8k"
We have a large PostgreSQL dump with hundreds of tables that I can successfully import with pg_restore. We are developing a software that inserts into a lot of these tables (~100) and for every run we need to return these tables to their original state (that means to the content that was in the dump). Restoring the original dump again takes a lot of time and we just can't wait for half an hour before every debugging session. So I need a relatively fast way to revert these tables to the state they are in after restoring from the dump.
I've tried using pg_restore with the -L switch and selecting these tables, but I get either a duplicate-key error when using both --data-only and --clean, or a "cannot drop table X because other objects depend on it" error when using only --clean. Issuing a SET CONSTRAINTS ALL DEFERRED command before pg_restore did not work either. Maybe my table-list rows are all wrong; right now it's
491; 1259 39623998 TABLE public some_table some_user
8021; 0 0 COMMENT public TABLE some_table some_user
8022; 0 0 ACL public some_table some_user
for every table and then
6700; 0 39624062 TABLE DATA public some_table postgres
8419; 0 0 SEQUENCE SET public some_table_pk_id_seq some_user
for every table.
We only insert data and never update existing rows, so deleting all rows above a certain index value and resetting the sequences might work, but I really don't want to write these commands by hand for all hundred tables, and I'm not even sure it would work even if I used cascading deletes for dependent objects.
Does anyone have any better idea how to handle this?
So you are looking for something like a snapshot, in order to revert quickly to a certain state.
I am not aware of a possibility in PostgreSQL to roll back to a certain timestamp.
While searching for a solution, I've found two ideas here
Use create database with the template option
Virtualize your PostgreSql installation using VMWare or VirtualBox, and use the snapshot feature of the virtual machines.
Again, both ideas are copied from the above source (I searched for "postgresql db snapshots").
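The template idea from the first bullet can be sketched like this (database names are placeholders; dropping and recreating requires that no sessions are connected to mydb, and the user needs CREATEDB privileges):

```sql
-- One-time: keep a pristine copy of the restored database
CREATE DATABASE mydb_pristine TEMPLATE mydb;

-- Before each test run: rebuild the working database from the copy
DROP DATABASE mydb;
CREATE DATABASE mydb TEMPLATE mydb_pristine;
```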
You can use PITR: create a snapshot before your loads, and use that snapshot to take you back to any point for which you have the WAL logs.