Reverse engineer a database without primary keys - DB2

I need help/direction to reverse engineer a DB2 database that has no primary key or foreign key relationships. Please advise me of some tools and steps. I am relatively new to databases.
Thank you very much for your help.
Regards

If you need to get the DDL, you can use the db2look utility, e.g.
db2look -d MY_DB -a -e -x -o FileOut.txt
To move data between different databases, look at db2move.
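For example, a minimal export/import round trip with db2move might look like this (the database names are placeholders; the import must be run from the directory containing the exported files):
db2move MY_DB export
db2move MY_TARGET_DB import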
Otherwise you can look at IBM Data Studio or at some of the tools listed in Data Model tools for DB2 (although it's quite an old post).

Related

Protect Data on Postgres with pgcrypto

Can someone clarify the difference between the various ways of encrypting a database?
I saw that a lot of people use pgcrypto, but they say that TDE is always the best choice.
Is pgcrypto enough to comply with the GDPR?
I have already installed pgcrypto and tested it. It works fine.
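For context, the kind of test I ran looked roughly like this (the database name, value, and passphrase are just placeholders):
psql -d mydb -c "CREATE EXTENSION IF NOT EXISTS pgcrypto;"
psql -d mydb -c "SELECT pgp_sym_encrypt('my secret', 'my_passphrase');"
psql -d mydb -c "SELECT pgp_sym_decrypt(pgp_sym_encrypt('my secret', 'my_passphrase'), 'my_passphrase');"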
The only guide I found for TDE on Postgres says that it can be enabled by adding these lines to postgresql.conf:
keystore_location
tablespace_encryption_algorithm
and executing these commands:
select pgx_set_master_key 'passphrase'
pg_ctl --keystore-passphrase restart 'keystore location'
At the end you can create a new tablespace.
TDE is never explained in the official Postgres docs.
Thanks

What is the easiest way to generate a script to drop and create all objects in a database?

I'm used to working with SQL Server, and SQL Server Management Studio has the option to automatically generate a script to drop and recreate everything in a database (tables/views/procedures/etc.). When developing a new application and writing a bunch of junk into a local database for basic testing, I find it very helpful to have the option to just nuke the whole thing and recreate it as a clean slate, so I'm looking for similar functionality in Postgres/pgAdmin.
pgAdmin has an option to generate a CREATE script for a specific table, but right-clicking each table would be very tedious, and I'm wondering if there's another way to do it.
To recreate a clean, schema-only database you can use the pg_dump client included with a Postgres server install. The options to use are:
-c
--clean
Output commands to clean (drop) database objects prior to outputting the commands for creating them. (Unless --if-exists is also specified, restore might generate some harmless error messages, if any objects were not present in the destination database.)
This option is ignored when emitting an archive (non-text) output file. For the archive formats, you can specify the option when you call pg_restore.
and:
-s
--schema-only
Dump only the object definitions (schema), not data.
This option is the inverse of --data-only. It is similar to, but for historical reasons not identical to, specifying --section=pre-data --section=post-data.
(Do not confuse this with the --schema option, which uses the word “schema” in a different meaning.)
To exclude table data for only a subset of tables in the database, see --exclude-table-data.
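Putting the two options together, a minimal invocation might look like this (the database and file names are just examples; feeding the resulting script back through psql wipes and recreates the schema):
pg_dump --clean --if-exists --schema-only -f schema.sql mydb
psql -d mydb -f schema.sql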
clean in Flyway
The database migration tool Flyway offers a clean command that drops all objects in the configured schemas.
To quote the documentation:
Clean is a great help in development and test. It will effectively give you a fresh start, by wiping your configured schemas completely clean. All objects (tables, views, procedures, …) will be dropped.
Needless to say: do not use against your production DB!
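For reference, a sketch of how clean could be invoked from the Flyway command line (the connection details and schema list are placeholders):
flyway -url=jdbc:postgresql://localhost:5432/mydb -user=myuser -password=secret -schemas=public clean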

Managing foreign keys when using pg_restore with multiple dumps

I have a bit of a weird issue. We were trying to create a database baseline for our local environment that has very specific data pre-seeded into it. Our hopes were to make sure that everyone was operating with the same data, making collaboration and reviewing code a bit simpler.
My idea for this was to run a command to dump the database whenever we run a migration or decide a new account is necessary for local dev. The issue with this is that the database dump is around 17 MB, and I'm trying to avoid us having to add a 17 MB file to GitHub every time we update the database.
So the best solution I could think of was to set up a script to dump each individual table in the database. This way, if a single table is updated, we'd only be pushing that backup to GitHub, and it would be closer to a ~200 KB file as opposed to 17 MB.
The main issue I'm running into with this is trying to restore the database. With a full dump, handling the foreign keys is relatively simple as it's all done in a single restore command. But with multiple restores, it gets a bit more complicated.
I'm looking to find a way to restore all tables to a database, ignoring triggers and constraints, and then enabling them again once the data has been populated. (or find a way to export the tables based on the order the foreign keys are defined). There are a lot of tables to work with, so doing this manually would be a bit of a task.
I'm also concerned about the relational integrity of the database if I disabled/re-enable constraints. Any help or advice would be appreciated.
Right now I'm running the following on every single table:
pg_dump postgres://user:password@pg:5432/database -t table_name -Fc -Z9 -f /data/www/database/data/table_name.bak
And then this script to restore all the backups into the DB:
// Restore every table dump, skipping the roles file, with triggers disabled during the load.
$data_command = "pg_restore --disable-triggers -d $dbUrl -Fc \"%s\"";
$backups = glob("$directory*.bak");
foreach ($backups as $data_file) {
    // glob() returns full paths, so compare only the file name.
    if (basename($data_file) != 'data_roles.bak') {
        exec(sprintf($data_command, $data_file));
    }
}
This obviously doesn't work, as I hit a ton of "relation does not exist" errors. I guess I'm just looking for a better way to accomplish this.
I would separate the table data and the database metadata.
Create pre- and post-data scripts with:
pg_dump --section=pre-data -f pre.sql mydb
pg_dump --section=post-data -f post.sql mydb
Then dump just the data for each table:
pg_dump --section=data --table=tab1 -f tab1.sql mydb
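If there are many tables, the per-table data dumps can be scripted; a rough sketch, assuming all tables live in the public schema and the connection does not prompt for a password:
for t in $(psql -At -d mydb -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public'"); do
  pg_dump --section=data --table="public.$t" -f "$t.sql" mydb
done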
To restore the database, first restore pre.sql, then all the table data, then post.sql.
The pre- and post-data scripts will change often, but they are not large, so that shouldn't be a problem.
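A minimal restore sketch for that order, assuming a target database named newdb and the file names used above:
psql -d newdb -f pre.sql
for f in tab*.sql; do
  psql -d newdb -f "$f"
done
psql -d newdb -f post.sql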

Best way to make PostgreSQL backups

I have a site that uses PostgreSQL. All the content that I provide on my site is created in a development environment (this is because it is web-crawler content). The only information created in the production environment is information about the users.
I need to find a good way to update the data stored in production. Can I restore only the tables updated in the development environment to production and have PostgreSQL update those records there, or would it be better to back up the user information from production, insert it into development, and restore the whole database to production?
Thank you
You can use pg_dump to export just the data from the non-user tables in the development environment and pg_restore to bring that into prod.
The -t switch will let you pick specific tables.
pg_dump -d <database_name> -t <table_name>
https://www.postgresql.org/docs/current/static/app-pgdump.html
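For example, a sketch using the custom format so pg_restore can be used on the production side (database and table names are placeholders; --clean drops the target tables before recreating them):
pg_dump -Fc -d dev_db -t articles -t pages -f content.dump
pg_restore -d prod_db --clean --if-exists content.dump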
There are many tips around this subject here and here.
I'd suggest you take a look at these links before anything else.
If your data is discarded on each update, then a plain dump will be enough. You can redirect the pg_dump output directly to psql connected to production, avoiding the pg_restore step, something like below:
# Of course you must drop the tables to load them again,
# so it is reasonable to make a full backup before this.
pg_dump -Fp -U user -h host_to_dev -T user your_db | psql -U user -h host_to_production your_db
You might be asking yourself, "Why is he telling me to drop my tables?"
Bulk loading data on a fresh table is faster than deleting old data and inserting again. A quote from the docs:
Creating an index on pre-existing data is quicker than updating it incrementally as each row is loaded.
PS¹: If you can't connect to both environments at the same time, then you will need to dump to a file and restore it manually.
PS²: I don't recommend it, but you can append the --clean option to pg_dump to generate DROP statements automatically. Be extremely careful with this option to avoid dropping unexpected objects.

AWS pg_dump Does Not Include Globals

We have multiple PostgreSQL instances in AWS RDS. We need to maintain an on-premise copy of each database to comply with our disaster recovery policy. I have been successful in using pg_dump and pg_restore to export the database schemas and tables to our on-premise server, but I have been unsuccessful in exporting the roles and tablespaces. I have found that this is only possible with pg_dumpall, but since that requires superuser access, which is not allowed in RDS, how can I export those aspects of the database to our on-premise server?
My pg_dump command:
pg_dump -h {AWS Endpoint} -U {Master Username} -p 5432 -F c -f C:\AWS_Backups\{filename}.dmp {database name}
My pg_restore command:
pg_restore -h {AWS Endpoint} -p 5432 -U {Master Username} -d {database name} {filename}.dmp
I have found multiple examples of people using pg_dump to export their PostgreSQL databases; however, they do not address the "globals" that pg_dump ignores. Have I misread the documentation? After performing my pg_restore, my logins were not created in the database.
Any help you can provide on getting the FULL database (including globals) to our offsite location would be greatly appreciated.
UPDATE: My patch is now a part of Postgres v10+.
You can read about how this works here.
Earlier, I had posted a working solution to my GitHub account; back then you would have needed to compile the binary and use it. However, with the patch now part of Postgres v10+, any pg_dumpall from that version onwards supports this feature.
You can read some more detailed inner workings here.
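Assuming the feature referred to above is pg_dumpall's ability to skip role passwords (so it no longer needs to read pg_authid), dumping the globals from RDS might look roughly like this (the endpoint and user are placeholders):
pg_dumpall -h {AWS Endpoint} -U {Master Username} --globals-only --no-role-passwords -f globals.sql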
I haven't been able to find an answer to my question anywhere online. Just in case someone else may be experiencing this problem, I thought I would post a high-level outline of my "solution". I go around my elbow to get to my knee, but this is the option I have come up with:
Create a table (I created two: one for roles and one for logins) in each PostgreSQL database within AWS. These tables will need all the columns required to dynamically build the SQL for CREATE, GRANT, REVOKE, etc.
Insert all roles, logins, privileges, and permissions into these tables. They are scattered across several catalogs and views, but here are the ones I used:
pg_auth_members (role and login relationships)
pg_roles (role and login permissions, i.e. can log in, inherit from parent, etc.)
information_schema.role_usage_grants (schema privileges)
information_schema.role_table_grants (table privileges)
information_schema.role_routine_grants (function privileges)
To fill in the gaps, there are clever queries on the web page below that use the built-in functions to check for access. You will have to loop through the tables and process one row at a time.
https://vibhorkumar.wordpress.com/2012/07/29/list-user-privileges-in-postgresqlppas-9-1/
Specifically, I used a variation of the database_privs function.
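As an illustration of the kind of query this step builds on, here is a minimal sketch that generates CREATE ROLE statements from pg_roles (the rdsadmin exclusion and the placeholder password are assumptions; role passwords cannot be read on RDS):
psql -At -h {AWS Endpoint} -U {Master Username} -d postgres <<'SQL'
SELECT 'CREATE ROLE ' || quote_ident(rolname)
       || CASE WHEN rolcanlogin THEN ' LOGIN' ELSE ' NOLOGIN' END
       || CASE WHEN rolinherit THEN ' INHERIT' ELSE ' NOINHERIT' END
       || ' PASSWORD ''changeme'';'
FROM pg_roles
WHERE rolname NOT LIKE 'pg\_%'
  AND rolname <> 'rdsadmin';
SQL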
Once all of the data is in those tables, you can execute pg_dump, and it will extract that info from each database to your on-premise location. I did this through a Python script.
On your server, use the data in the tables to dynamically create the SQL statements needed to run the CREATE, GRANT, REVOKE, etc. scripts. Save in a .sql file that you can instruct a Python script to execute against the database and recreate the AWS roles and logins.
One thing I forgot to mention: because we are unable to access the pg_authid table in AWS, I have found no way to extract the passwords out of AWS. We are going to store these in a password manager, and when I create the CREATE ROLE statements, I'll pass in a default to be updated.
I haven't completed the process, but it has taken me several days to track down a viable workaround for the absence of pg_dumpall's functionality. If anyone sees any flaws in my logic or has a better solution, I'd love to read about it. Good luck!