How to dump a database with every table limited to 1000 rows - PostgreSQL

I have a database with 20 tables. Currently, we use the pg_dump command daily to archive it.
A few tables in this database are very big. We are working on a light version of the database for testing purposes and small tickets.
So I need a way to use the pg_dump command and save every table with only 1000 rows each. I searched Google for something like that, but without success.

Related

How to replicate a Postgres DB with only a sample of the data

I'm attempting to mock a database for testing purposes. Given a connection to an existing Postgres DB, I'd like to retrieve the schema, limit the data pulled to 1000 rows from each table, and persist both of these components as a file which can later be imported into a local database.
pg_dump doesn't seem to fulfill my requirements, as there's no way to tell it to retrieve only a limited number of rows from each table; it's all or nothing.
The COPY/\copy commands can help fill this gap; however, there doesn't seem to be a way to copy data from multiple tables into a single file. I'd rather avoid having to create one file per table. Is there a way to work around this?
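One possible workaround, as a minimal sketch rather than a tested recipe: let pg_dump write only the schema sections, and append the sampled rows of each table in the same "COPY ... FROM stdin;" framing that a plain-text dump uses, so everything ends up in one file. The function names copy_block and dump_sample are made up for this illustration. The sketch assumes that no table has generated columns, that sequence values do not need to be carried over (pg_dump places them in the data section, which is skipped here), and that foreign keys can tolerate an arbitrary 1000-row sample; if they cannot, the post-data part will fail on restore.

```shell
#!/bin/sh
# Wrap rows read from stdin in the "COPY ... FROM stdin;" framing used by
# plain-text dumps, so the result can be replayed with psql.
copy_block() {
    printf 'COPY %s FROM stdin;\n' "$1"
    cat
    printf '\\.\n\n'
}

# Hypothetical helper: write the pre-data schema, at most $2 rows per table,
# and then the post-data objects (indexes, constraints) into the single file $3.
dump_sample() {
    db="$1"; limit="$2"; out="$3"
    pg_dump --section=pre-data "$db" > "$out" || return 1
    # List user tables as quoted, schema-qualified names, one per line.
    psql -X -At -d "$db" -c "SELECT quote_ident(schemaname) || '.' || quote_ident(tablename)
                             FROM pg_tables
                             WHERE schemaname NOT IN ('pg_catalog', 'information_schema')" |
    while IFS= read -r table; do
        # Without ORDER BY, LIMIT returns an arbitrary subset of rows.
        psql -X -d "$db" -c "COPY (SELECT * FROM $table LIMIT $limit) TO STDOUT" |
            copy_block "$table"
    done >> "$out"
    pg_dump --section=post-data "$db" >> "$out"
}
```

With those assumptions, something like dump_sample mydb 1000 sample.sql would produce the file, and psql -X -d localdb -f sample.sql would load it into an empty local database (mydb, localdb, and sample.sql are placeholder names).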

Amazon RDS PostgreSQL snapshot preserves schema but loses all data

Using the AWS RDS console, I created a snapshot backup of a PostgreSQL v11 database containing multiple schemas. I then created a new instance from the backup. The process seemed to work without error. However, on inspecting the data in the new instance, I noticed that in exactly one of my schemas the data was not preserved. The schema structure, tables, indexes, constraints, etc. looked fine, but every table was empty (select count(*) from schema.table was 0 for every table in the schema). All other schemas looked fine and contained the expected data. I looked everywhere (could not find help for this online) and tried many tests myself (changing roles, rebuilding the schema, privileges, and much more) while attempting to solve this issue. What would cause my snapshots to preserve the entire schema structure but lose all of the data itself?
I finally realized that the only difference between the problem schema and the others was that all tables in the problem schema had been created with the 'UNLOGGED' keyword. This was done to increase write speed for the millions of rows inserted when the schema was first built. However, when a snapshot is created/restored as described above, the process depends on the WAL files that are written for normal (logged) tables to restore the data. To fix my problem I simply altered all of the tables and set them to be logged (alter table schema.table set logged). After this, snapshots worked fine. For anyone doing something similar in the future: if unlogged tables are needed for the initial mass population of data to get better write speed, it is a good idea to change them to logged once the data is loaded (if you plan on using snapshots, replication, or similar). Side note: pg_dump/pg_restore does still work for unlogged tables.
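With many tables, the alter table ... set logged step can be scripted. The following is only a sketch under stated assumptions: list_unlogged and set_logged_sql are names invented here, and the catalog query relies on pg_class.relpersistence being 'u' for unlogged relations and relkind being 'r' for ordinary tables.

```shell
#!/bin/sh
# Turn schema-qualified table names read from stdin into ALTER TABLE statements.
set_logged_sql() {
    while IFS= read -r table; do
        [ -n "$table" ] && printf 'ALTER TABLE %s SET LOGGED;\n' "$table"
    done
    return 0
}

# Hypothetical helper: print every unlogged ordinary table in database $1
# as a quoted, schema-qualified name, one per line.
list_unlogged() {
    psql -X -At -d "$1" -c "SELECT quote_ident(n.nspname) || '.' || quote_ident(c.relname)
                            FROM pg_class c
                            JOIN pg_namespace n ON n.oid = c.relnamespace
                            WHERE c.relpersistence = 'u' AND c.relkind = 'r'"
}
```

A cautious way to use it is to run list_unlogged mydb | set_logged_sql first and read the generated statements, then pipe them into psql -X -d mydb (mydb is a placeholder). Note that SET LOGGED rewrites each table and holds an exclusive lock while doing so, so large tables may be unavailable for a while.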

How can I automatically maintain a dump of modified rows in PostgreSQL

So, I have a PostgreSQL DB. For some chosen tables in that DB, I want to maintain a plain dump of the rows as they are modified. Note that this dump is not a recovery or backup dump. It is just a file holding the incremental rows. That is, whenever a row is inserted or updated, I want it appended to this file or to a file in a folder. The idea is to load that folder into something like Hive periodically so that I can run queries to check previous states of certain rows and columns. Now, these are very high-transaction tables, and the dump does not need to be real time. It can be in batches, every hour. I want to avoid a trigger firing hundreds of times every minute. I am looking for something off the shelf, already available in PostgreSQL. I did some research, but everything is related to PostgreSQL backup, which is not the exact use case.
I have read some links, like https://clarkdave.net/2015/02/historical-records-with-postgresql-and-temporal-tables-and-sql-2011/ and "Implementing history of PostgreSQL table", but these are based on insert/update triggers and create the history table in PostgreSQL itself. I want to avoid both. I cannot keep the history in PostgreSQL, as it will soon be huge. And I do not want to keep writing to files through a trigger firing constantly.

Importing data from MS Access db to PostgreSQL db

I have a table in an MS Access db that I want to export to a PostgreSQL database. Every 2 or so months, I want to move all records from the Access table to a table in Postgres.
Right now, I am using the Export to ODBC option in Access to do this, but every time it exports as an entirely new table in Postgres. Is there a way for me to routinely append the records in the Access table to an existing table in my Postgres database? I have come across the option of a FDW but I am not familiar with how to install/use it.
I am new to using PostgreSQL, and have little to no experience working with databases other than Access, so any input/advice would be greatly appreciated.

How can I combine multiple Postgres database dumps into a single database?

I have ~100 Postgres .dump files from different sources. They all have the same schema, just a single table, and a few hundred to a few hundred thousand rows each. However, the data was collected at different locations and now needs to be combined.
So I'd like to merge all the rows from all the databases into one single database, ignoring the ID key. What would be a decent way to do this? I may collect more data in the future from more sources, so it's likely to be a process I need to repeat.
If needed, use pg_restore to convert the dumps into SQL.
Then run the SQL dump through
sed '/^COPY .* FROM stdin;$/,/^\\.$/ p;d'
As there is only one table in your data, that will give you just the COPY command needed to load the data. Send that to your database to load it.
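The steps above could be wrapped in a small script so the merge is repeatable when more dumps arrive. This is a sketch, not a verified procedure: extract_copy and merge_dumps are names invented here, the target table is assumed to exist already, and "pg_restore -f -" is assumed to write to standard output, which holds for PostgreSQL 12 and later.

```shell
#!/bin/sh
# Keep only the COPY blocks (header, rows, "\." terminator) from a
# plain-SQL dump read on stdin; every other line is dropped.
extract_copy() {
    sed '/^COPY .* FROM stdin;$/,/^\\.$/ p;d'
}

# Hypothetical helper: load the rows of every dump file given after the
# first argument into the database named by the first argument.
merge_dumps() {
    target="$1"; shift
    for dump in "$@"; do
        pg_restore --data-only -f - "$dump" | extract_copy |
            psql -X -v ON_ERROR_STOP=1 -d "$target" || return 1
    done
}
```

For example, merge_dumps combined_db ./dumps/*.dump (placeholder names). The COPY blocks still carry the original ID column, so rows from different sources that share an ID will collide if that column has a primary key or unique constraint on the target table; the sketch does not handle that.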