Postgres SQL Query about database and table space - postgresql

What is the hierarchy of database-related objects in PostgreSQL?
Is it true that a tablespace must be created at the instance level, unlike in other RDBMSs (where tablespaces live under a database)?
If we create the tablespace at the instance level, what is the purpose of a database? And what is the difference between a tablespace and a database on a PostgreSQL server?

An instance (called a cluster in PostgreSQL) is a data directory initialized with initdb, together with the PostgreSQL server process running on it.
A tablespace is a directory outside the data directory where objects can also be stored. Tablespaces are useful for certain corner cases like distributing I/O or limiting space for a subset of the data.
A database is a container for objects with permissions, organized in schemas.
The difference is that a tablespace is a physical concept: it defines a location where the data are stored. A database is a logical concept: it is about how data are organized, what they mean, how they are related, who is allowed to access them, and so on.
The two concepts are orthogonal.
A database can have tables in several tablespaces, and a tablespace can contain data from several databases.
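As a rough illustration of that orthogonality, the following sketch creates one tablespace at the cluster level and uses it from two databases. All names (fast_disk, sales, reporting, orders) and the directory path are hypothetical; the directory is assumed to exist already, be empty, and be owned by the operating system user that runs PostgreSQL.

```sql
-- Tablespaces are created once per cluster, not per database.
-- Hypothetical path; it must exist, be empty, and belong to the postgres OS user.
CREATE TABLESPACE fast_disk LOCATION '/mnt/ssd1/pg_tblspc';

-- Two databases in the same cluster. The first keeps the default tablespace,
-- the second uses fast_disk as its default.
CREATE DATABASE sales;
CREATE DATABASE reporting TABLESPACE fast_disk;

-- While connected to "sales": one table in the database default tablespace,
-- one explicitly placed in fast_disk. A single database now spans two
-- tablespaces, and fast_disk holds data from two databases.
CREATE TABLE customers (id bigint PRIMARY KEY, name text);
CREATE TABLE orders (id bigint PRIMARY KEY, customer_id bigint) TABLESPACE fast_disk;
```

Schemas, tables, and permissions stay inside each database; only the file location changes with the tablespace.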

A database is where you organize all your objects. A tablespace is just storage space for those objects.
You can store your database objects in different tablespaces. For example, one table can be stored in a tablespace on diskA while another table uses a tablespace on diskB, to improve performance. Or maybe you need a tablespace for big tables and don't mind using a slow, large HDD for those objects.

Related

Limit size of temporary tables (PostgreSQL)

I'm managing a PostgreSQL database server for some users who need to create temporary tables. One user accidentally sent a query with ridiculously many outer joins, and that completely filled the disk up.
PostgreSQL has a temp_file_limit parameter but it seems to me that it is not relevant:
It should be noted that disk space used for explicit temporary tables, as opposed to temporary files used behind-the-scenes in query execution, does not count against this limit.
Is there a way then to put a limit on the size on disk of "explicit" temporary tables? Or limit the row count? What's the best approach to prevent this?
The only way to limit a table's size in PostgreSQL is to put it in a tablespace on a file system of an appropriate size.
Since temporary tables are created in the default tablespace of the database you are connected to, you have to place your database in that size-restricted tablespace. To keep your regular tables from being limited in the same way, you'd have to explicitly create them in a different, less limited tablespace. Make sure that your user has no permissions on that less limited tablespace.
This is a rather unappealing solution, so maybe you should rethink your requirement. After all, the user could just as well fill up the disk by inserting the data into a permanent table.
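For reference, the tablespace approach described above might look roughly like this. It is only a sketch: the mount points, the tablespace names limited_ts and roomy_ts, the database userdb, the table measurements, and the role some_user are all hypothetical, and /mnt/small_fs is assumed to be a deliberately small file system.

```sql
-- Hypothetical paths: one small file system, one large one.
CREATE TABLESPACE limited_ts LOCATION '/mnt/small_fs/pg_tblspc';
CREATE TABLESPACE roomy_ts   LOCATION '/mnt/big_fs/pg_tblspc';

-- Make the small tablespace the database default, so temporary tables land there.
-- Must be run while connected to a different database, with no sessions in userdb.
ALTER DATABASE userdb SET TABLESPACE limited_ts;

-- Regular tables are placed explicitly in the larger tablespace by an administrator.
CREATE TABLE measurements (id bigint, val numeric) TABLESPACE roomy_ts;

-- Ordinary users must not be able to create objects in the larger tablespace.
REVOKE CREATE ON TABLESPACE roomy_ts FROM PUBLIC;
REVOKE CREATE ON TABLESPACE roomy_ts FROM some_user;
```

Once limited_ts fills up, statements that need more space there fail with an out-of-space error instead of filling the disk that holds the rest of the cluster.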

How does AWS postgres RDS read replication handle schema switching?

I want to know how AWS RDS for PostgreSQL handles replication when I rename schemas to "swap" them within the read/write instance of the database.
Does it replicate this action to the read-replicas by sending on the "alter schema" rename commands I gave to my read/write instance? Or after my renames, does it see wholly different sets of data in the schemas and do a whole new copy of each out to the read-replicas?
For example...
In my RDS instance I have a read/write instance of "my_mega_database" which I want to create read-replicas of for my applications to connect to.
Typically, in "my_mega_database" there are two schemas "my_data" and "my_data_old", whereby "my_data" contains data that was delivered last night, and "my_data_old" contains data from the previous night. Each contains many tables and huge amounts of data.
If I were to do the following...
ALTER SCHEMA my_data_old RENAME TO my_data_tmp;
ALTER SCHEMA my_data RENAME TO my_data_old;
ALTER SCHEMA my_data_tmp RENAME TO my_data;
... I have effectively swapped these around.
My expectation is that these actions are replicated via the PostgreSQL WAL (i.e. it sends the rename commands out to the replicas) and that AWS RDS replication won't waste time copying huge amounts of data all over the place.
Is this correct?
(Speaking about PostgreSQL here, but RDS is probably similar.)
Renaming a schema (or any other object) is a small update in a catalog table, and no data are moved. Internally PostgreSQL uses only the numeric object ID, which stays the same.
You might wrap the three statements in a transaction to make the whole magic atomic.
The same is true on the standby: it is a trivial (meta)data modification.
The only thing that might be a problem is concurrent sessions holding locks.
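A minimal sketch of the atomic variant, using the schema names from the question. The two SELECT statements are only there to show that the object IDs in pg_namespace stay attached to the same data while the names change.

```sql
-- Before: note which oid belongs to which schema name.
SELECT oid, nspname FROM pg_namespace WHERE nspname IN ('my_data', 'my_data_old');

-- The three renames become visible to other sessions all at once at COMMIT.
BEGIN;
ALTER SCHEMA my_data_old RENAME TO my_data_tmp;
ALTER SCHEMA my_data RENAME TO my_data_old;
ALTER SCHEMA my_data_tmp RENAME TO my_data;
COMMIT;

-- After: the same two oids appear, with the names swapped.
SELECT oid, nspname FROM pg_namespace WHERE nspname IN ('my_data', 'my_data_old');
```

If another session holds a conflicting lock on one of the schemas, the ALTER SCHEMA statements wait for it, so it may be worth running the swap at a quiet time.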

TABLE vs TABLESPACE, what's the difference?

I want to know the difference between TABLE and a TABLESPACE in db2-luw.
The tablespace is where tables get stored.
It links the physical storage layer (files on disks) and the logical storage layer (tables, indexes).
You can assign each table to a tablespace to control the physical storage layout (for example to put some tables on faster or more redundant disks, or to stripe tables across disks).
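In Db2 LUW the assignment could be sketched as follows. The names ts_fast and orders are hypothetical, and the database is assumed to use automatic storage, so that the tablespace needs no explicit container definition.

```sql
-- Hypothetical tablespace; with automatic storage Db2 chooses the containers.
CREATE TABLESPACE ts_fast;

-- The IN clause assigns the table to that tablespace.
CREATE TABLE orders (
    order_id  INTEGER NOT NULL,
    placed_on DATE
) IN ts_fast;
```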

Postgres - Is it necessary to create tablespace in my case?

I have a mobile/web project using PostgreSQL 9.3 as the database and Linux as the server.
The data won't be huge, but it will grow as time goes on.
With the long term in mind, I want to know the following:
Questions:
1. Is it necessary for me to create tablespace for my database, or just use the default one?
2. If I create new tablespace, what is the proper location on linux to create the folder, and why?
3. If I don't create it now and wait until I have to, will it be easy to migrate the database and its data to a new tablespace at that point?
Just use the default tablespace; do not create new tablespaces. Tablespaces are only useful if you have multiple physical disks, so that you can define which data is stored on which physical disk. The directory where your data is located is not that important for the workings of PostgreSQL, so if you only have one disk, tablespaces are of no use.
Should your data grow beyond the capacity of one disk, you will have to move data to another physical disk anyway, so you can configure tablespaces at that time.
The idea behind defining which data is located on which disk (with tablespaces) is that you can do things like putting a big, rarely used table on a slow disk and a very intensively used table on a separate, faster disk. But I assume you're not there yet, so don't overcomplicate things.
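Regarding question 3, moving an existing table later is a single statement per object. The sketch below assumes a second disk mounted at a hypothetical path and a hypothetical table big_rarely_used with its primary key index.

```sql
-- Hypothetical location on the newly added disk.
CREATE TABLESPACE second_disk LOCATION '/mnt/disk2/pg_tblspc';

-- Physically copies the table files to the new tablespace.
-- The table is locked exclusively for the duration of the move.
ALTER TABLE big_rarely_used SET TABLESPACE second_disk;

-- Indexes are not moved along with the table; move them separately if desired.
ALTER INDEX big_rarely_used_pkey SET TABLESPACE second_disk;
```

Because the table is unavailable while it is being copied, large tables are best moved during a maintenance window.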

Moving Postgres tablespaces and tables across EC2 instance

I have a PostgreSQL database running on an Amazon EC2 instance. I have a few tablespaces created for some monthly tables, such that each table is in its own tablespace. To get maximum performance, I have created each tablespace on its own Amazon EBS volume.
I want to move some of these tables to a different instance and database. I will explain it with an example.
Let's say:
I have EC2 instance A with postgres setup as explained above.
I have another Amazon instance B running and I installed postgres on it as well.
I want to create the same table structure on B for some of the tables present in A. I want to detach the volumes from instance A and attach them to instance B.
Also, I want to create tablespaces on instance B that point to the newly attached volumes.
When I start up this newly created PostgreSQL instance, I expect to see the tables populated with data from those volumes (database).
Finally, I will delete those tables from A.
I know my writing is a bit rusty, but I couldn't find a better way to ask the question.
Is something along these lines possible? Are there any pointers for achieving something like this?
No.
The data in the tablespace directory is only the data. To access it, you also need the metadata stored in the tables of the pg_catalog schema, as well as the information from pg_clog and pg_xlog (named pg_xact and pg_wal since PostgreSQL 10).
If you want to move things across using volumes, you must move the entire installation at once (all the tablespaces, including pg_default). Otherwise, you need to use pg_dump/pg_restore to transfer the data over.
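To see why the files alone are not enough, it can help to look at where the mapping between table names and files lives. The query below is a sketch that lists ordinary tables stored in a non-default tablespace together with their file paths; all of this information comes from the system catalogs of the source cluster, not from the tablespace directory itself.

```sql
-- Tables whose reltablespace is 0 live in the database default tablespace
-- and therefore do not appear in this join.
SELECT c.relname,
       t.spcname,
       pg_relation_filepath(c.oid::regclass) AS file_path
FROM pg_class AS c
JOIN pg_tablespace AS t ON t.oid = c.reltablespace
WHERE c.relkind = 'r';
```

The file names are numeric identifiers that only the originating cluster's catalogs can resolve, which is why a freshly installed instance B cannot interpret the attached volumes.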