PostgreSQL - AWS RDS data migration to GCP Cloud SQL

I am trying to migrate a bunch of databases from an AWS RDS PostgreSQL server to GCP Cloud SQL.
Since both run the PostgreSQL engine, I thought it would be simple to take a pg_dump from AWS and import it into GCP.
However, I was surprised when the Cloud SQL import failed with errors complaining that some roles are missing.
Below are the steps I tried.
Dump of the database in AWS RDS:
pg_dump -h <connection_endpoint> -U root -f db_dump.sql <db_name>
Then I tried to import it into GCP Cloud SQL with the command below:
instance-1:~$ PGPASSWORD=<password> psql --host=<host_name> --port=5432 --username=postgres --dbname=<db_name> < db_dump.sql
SET
SET
SET
SET
SET
set_config
------------
(1 row)
SET
SET
SET
CREATE SCHEMA
ERROR: must be member of role "rdsadmin"
CREATE SCHEMA
ERROR: must be member of role "root"
CREATE SCHEMA
ERROR: must be member of role "root"
CREATE SCHEMA
ERROR: must be member of role "root"
CREATE SCHEMA
ERROR: must be member of role "root"
CREATE EXTENSION
ERROR: must be owner of extension plpgsql
CREATE EXTENSION
COMMENT
SET
SET
CREATE TABLE...
As you can see, the errors refer to the rdsadmin and root roles, which are missing in Cloud SQL.
How can I make sure these roles are present in GCP Cloud SQL with the correct settings? Even after creating roles with the same names in Cloud SQL, the import does not succeed.
Is there a solution?

Answering my own question, as I found a solution.
The rdsadmin role is created by AWS for administrative tasks on the RDS instance, so it has no counterpart in Cloud SQL. Taking the pg_dump without owners and restoring it without owners does the trick, and I am able to restore into Cloud SQL.
pg_dump -Fc -O -h <rds-host> -U <user> -d <db> > db.dump
pg_restore -U postgres -d <db> -h <cloudsql-host> -v --no-owner db.dump
pg_restore cannot read a plain-text SQL dump; it needs an archive format such as the compressed custom format, which is why -Fc is passed to pg_dump.
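The two commands above can be wrapped in a small function. This is only a sketch: the function name and its positional parameters are made up for illustration, and the extra -x / --no-acl flags (which skip GRANT/REVOKE statements) are an assumption that goes beyond what the answer used.

```shell
# Sketch: copy one database from RDS to Cloud SQL without ownership or ACL statements.
# migrate_without_owners and its argument order are hypothetical names for this example.
migrate_without_owners() {
  src_host=$1 src_user=$2 dst_host=$3 dst_user=$4 db=$5
  dump_file="${db}.dump"

  # -Fc: custom archive format, which pg_restore can read
  # -O:  do not emit ALTER ... OWNER TO statements
  # -x:  do not emit GRANT/REVOKE statements (assumption: privileges are re-created by hand)
  pg_dump -Fc -O -x -h "$src_host" -U "$src_user" -d "$db" -f "$dump_file" || return 1

  # The target database must already exist; restored objects are owned by $dst_user.
  pg_restore --no-owner --no-acl -h "$dst_host" -U "$dst_user" -d "$db" -v "$dump_file"
}

# Example call (placeholders, not real hosts):
#   migrate_without_owners <rds-host> root <cloudsql-host> postgres <db>
```

Both tools prompt for a password unless it is supplied through PGPASSWORD or a ~/.pgpass file.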

Related

Problems to perform PostgreSQL restore

I'm having trouble restoring from a dump. The scenario is as follows: I am migrating an environment from GCP to AWS, and at the moment I am working on the migration of the database.
A partner dumped the db that is in GCP and placed the file on AWS S3 (I don't know the command he used to perform the dump).
I created an EC2 instance in the AWS environment and copied the dump from S3 to it (the file is 13 GB). I also created the RDS instance to host the new db, with all the correct security group settings.
Here comes the problem: I can connect to RDS from the server (EC2) without problems, but when I run the restore with pg_restore I get the following error message: pg_restore: too many command line arguments (first is "dbclient.dump").
The complete command I used was this:
pg_restore -h client-aurora-cluster-hmg-legado-instance-1.c23ltjbbz7ms.us-east-1.rds.amazonaws.com -U postgres -d db_hmg_legado dbclient.dump -W
I then changed my approach and tried psql instead of pg_restore, with this command:
psql -h client-aurora-cluster-hmg-legado-instance-1.c23ltjbbz7ms.us-east-1.rds.amazonaws.com -U postgres -d db_hmg_legado dbclient.dump -W
This time it worked, but I received some error messages during the restore, shown below:
psql:dbclient.dump:23: ERROR: schema "dw" already exists
CREATE EXTENSION
psql:dbclient.dump:37: ERROR: must be owner of extension hstore
CREATE EXTENSION
psql:dbclient.dump:51: ERROR: must be owner of extension intarray
CREATE EXTENSION
psql:dbclient.dump:65: ERROR: must be owner of extension pg_trgm
CREATE EXTENSION
psql:dbclient.dump:79: ERROR: must be owner of extension unaccent
The restore also takes a long time and only partially completes.
Mainly, I would like to understand why pg_restore didn't work. Has anyone experienced this?
And does anyone know how to resolve these owner errors when using psql?
As documented in the manual, the file to be restored is the last parameter and is specified without a "switch". But you are using -W after the dump file. Move the -W parameter somewhere before it (although it is usually not necessary to begin with).
So you need something like this:
pg_restore -W -h ... -U postgres -d db_hmg_legado dbclient.dump
However, if the restore worked when using psql then the dump file is a "plain text" dump which can't be restored using pg_restore to begin with.
Concerning the errors:
You should restore the dump into an empty database that doesn't contain any schemas except the default ones.
You need a superuser for CREATE EXTENSION, which you don't have in a hosted database. So pre-install these extensions with the techniques that Amazon provides, then restore the dump and ignore the errors.
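To find out which extensions need to be pre-installed, the extension names can be pulled out of the plain-text dump. The sketch below assumes the dump contains the usual pg_dump lines of the form CREATE EXTENSION IF NOT EXISTS name ...; the function names are hypothetical.

```shell
# Print the name of every extension that a plain-text dump tries to create.
# $1: path to the dump file
list_dump_extensions() {
  sed -n 's/^CREATE EXTENSION \(IF NOT EXISTS \)\{0,1\}"\{0,1\}\([A-Za-z0-9_-]*\)"\{0,1\}.*/\2/p' "$1" | sort -u
}

# Turn that list into statements that can be run on the target before the restore.
extension_sql() {
  list_dump_extensions "$1" | while IFS= read -r ext; do
    printf 'CREATE EXTENSION IF NOT EXISTS "%s";\n' "$ext"
  done
}

# Example (placeholders): run the statements as the RDS master user, then restore.
#   extension_sql dbclient.dump | psql -h <rds-endpoint> -U postgres -d db_hmg_legado
```

The "must be owner of extension" errors will still be printed during the restore, but by then the extensions already exist and the messages can be ignored.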

PostgreSQL: cannot see schemas

I am using PostgreSQL 12 and pgAdmin4. I created two new schemas from pgAdmin4, ml and web, in a database named testingDB. But when I access the database from the terminal, I can only see the public schema. How can I view the other schemas?
$ psql -U awspostgres -h address -p 5432 -d testingDB
If you want the tables in a schema to be visible with \dt, either be explicit:
\dt ml.*
or add the schema to your search_path:
SET search_path = ml, web, public;
If you want the latter change to be persistent for all connections to the database:
ALTER DATABASE "testingDB" SET search_path = ml, web, public;
Postgres can have multiple databases under one DBMS instance. Make sure you are connecting to the right database by specifying its name with the -d flag to psql:
$ psql -U awspostgres -h address -p 5432 -d <db name>
Once connected to Postgres (even if you didn't provide a database name when you started psql), you can list the databases with \l and connect to one with \c <db name>.

Can't pg_restore DB to RDS: missing pgpass.conf file?

I'm trying to restore my Postgres database to an RDS Postgres instance in AWS using pg_restore from an EC2 instance.
I am using the following command to restore the database:
pg_restore -v -h host --no-owner -d PostgresDB postgres.dump
Originally I didn't specify the --no-owner option. Since the owner of the local database that was backed up and the owner of the RDS instance aren't the same, this threw an error, and I read that specifying this option solves the issue.
However, now I get a
pg_restore: [archiver (db)] connection to database "XY" failed:
FATAL: password authentication failed for user "ec2-user"
error message, although the password is correct. I read that this does happen with PostgreSQL, but I can't find a way to solve it on EC2. According to this thread, I need to change the format of one of the configuration files. But my Postgres is an AWS instance, so how can I achieve this? I browsed my EC2 server instance and didn't find any pgpass.conf file (according to 'which', the file doesn't exist on the server). How can I solve this?
What username did you create for Postgres? Surely it isn't ec2-user. When no user is given, pg_restore falls back to the operating-system user name, which is why it tries "ec2-user". Specify the database user with --username (-U). The --password (-W) option takes no value and only forces a password prompt, so supply the password at the prompt, through the PGPASSWORD environment variable, or in a ~/.pgpass file (pgpass.conf is the name of that file on Windows). Here is the documentation.
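For unattended restores, the password can be stored in a password file instead of being typed at the prompt. A minimal sketch follows, assuming the standard host:port:database:username:password line format; the helper name add_pgpass_entry is hypothetical, and a password containing ':' or '\' would need those characters escaped with a backslash, which this sketch does not do.

```shell
# Append one entry to the PostgreSQL password file (PGPASSFILE, or ~/.pgpass by default).
# $1 host, $2 port, $3 database, $4 user, $5 password
add_pgpass_entry() {
  pgpass_file="${PGPASSFILE:-$HOME/.pgpass}"
  (
    # Create the file without group/other access if it does not exist yet.
    umask 077
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5" >> "$pgpass_file"
  )
  # libpq ignores the file on Unix if group or others can read it.
  chmod 600 "$pgpass_file"
}

# Example (placeholders):
#   add_pgpass_entry <rds-endpoint> 5432 PostgresDB <db-user> <password>
#   pg_restore -v -h <rds-endpoint> -U <db-user> --no-owner -d PostgresDB postgres.dump
```

With a matching entry in place, pg_restore, pg_dump and psql pick up the password without prompting.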

Restore Database from dump.sql

I am getting this error while restoring data from the dump file.
nishant@nishant-Lenovo-G50-70:~/Documents$ psql sortation_gor1 < dump.sql
psql: FATAL: role "nishant" does not exist
I have followed the PostgreSQL Ubuntu documentation,
but when I try to restore the database I get this error.
Any idea?
pg_dump does not save roles. Roles in PostgreSQL belong to the database cluster, not to a single database, so they are exported with pg_dumpall and its -r (--roles-only) option. Either create the missing roles manually with the SQL statement CREATE ROLE name LOGIN, or export them from the source cluster with pg_dumpall -r.
In the end I restored it with psql -U postgres -d database_name -f dump.sql
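Before restoring, it can help to see which roles a plain-text dump expects to exist. The sketch below assumes the dump contains the usual ALTER ... OWNER TO name; lines written by pg_dump; both function names are hypothetical, and the generated CREATE ROLE statements are only a starting point (no passwords or other attributes are set).

```shell
# List the distinct roles named as owners in a plain-text dump.
# $1: path to the dump file
list_dump_owners() {
  sed -n 's/^ALTER .* OWNER TO \(.*\);$/\1/p' "$1" | sort -u
}

# Emit one CREATE ROLE statement per owner found in the dump.
# Running a statement for a role that already exists simply reports an error.
missing_role_sql() {
  list_dump_owners "$1" | while IFS= read -r role; do
    printf 'CREATE ROLE %s LOGIN;\n' "$role"
  done
}

# Example (placeholders):
#   missing_role_sql dump.sql | psql -U postgres -d postgres
#   psql -U postgres -d database_name -f dump.sql
```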

How to solve privileges issues when restore PostgreSQL Database

I have dumped a clean, no owner backup for Postgres Database with the command
pg_dump sample_database -O -c -U
Later, I restored the database with
psql -d sample_database -U app_name
However, I encountered several errors that prevent me from restoring the data:
ERROR: must be owner of extension plpgsql
ERROR: must be owner of schema public
ERROR: schema "public" already exists
ERROR: must be owner of schema public
CREATE EXTENSION
ERROR: must be owner of extension plpgsql
I dug into the plain-text SQL that pg_dump generates and found that it contains the following statements:
CREATE SCHEMA public;
COMMENT ON SCHEMA public IS 'standard public schema';
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
I think the causes are that the user app_name doesn't have the privileges to alter the public schema and plpgsql.
How could I solve this issue?
To solve the issue you must assign the proper ownership permissions. Try the below which should resolve all permission related issues for specific users but as stated in the comments this should not be used in production:
root@server:/var/log/postgresql# sudo -u postgres psql
psql (8.4.4)
Type "help" for help.
postgres=# \du
                List of roles
    Role name    | Attributes  | Member of
-----------------+-------------+-----------
 <user-name>     | Superuser   | {}
                 : Create DB
 postgres        | Superuser   | {}
                 : Create role
                 : Create DB
postgres=# alter role <user-name> superuser;
ALTER ROLE
postgres=#
So connect to the database under a superuser account with sudo -u postgres psql and execute an ALTER ROLE <user-name> SUPERUSER; statement.
Keep in mind this is not the best solution on a multi-site hosting server, so take a look at assigning individual roles instead: https://www.postgresql.org/docs/current/static/sql-set-role.html and https://www.postgresql.org/docs/current/static/sql-alterrole.html.
AWS RDS users: if you are getting this error, it is because you are not a superuser, and according to the AWS documentation you cannot be one. I have found that I have to ignore these errors.
For people using Google Cloud Platform, any error will stop the import process.
Personally I encountered two different errors depending on the pg_dump command I issued :
1- The input is a PostgreSQL custom-format dump. Use the pg_restore command-line client to restore this dump to a database.
This occurs when the DB was dumped in a non-plain-text format, for example with -Fc. The import expects a plain-text dump, i.e. -Fp or --format=plain (which is also pg_dump's default when no format is given). However, with a plain-text dump you may then encounter the following error:
2- SET SET SET SET SET SET CREATE EXTENSION ERROR: must be owner of extension plpgsql
This is a permission issue that I was unable to fix using the command provided in the GCP docs, the tips from this thread, or the advice from the Google Postgres team here, which recommended issuing the following command:
pg_dump -Fp --no-acl --no-owner -U myusername myDBName > mydump.sql
The only thing that did the trick in my case was manually editing the dump file and commenting out all commands relating to plpgsql.
I hope this helps GCP-reliant souls.
Update :
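It's easier to filter the extension statements out while dumping, especially since some dumps can be huge (note that this pattern drops every line containing COMMENT ON, not only the extension comments):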
pg_dump ... | grep -v -E '(CREATE\ EXTENSION|COMMENT\ ON)' > mydump.sql
Which can be narrowed down to plpgsql :
pg_dump ... | grep -v -E '(CREATE\ EXTENSION\ IF\ NOT\ EXISTS\ plpgsql|COMMENT\ ON\ EXTENSION\ plpgsql)' > mydump.sql
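Since any error stops the import, it may be worth scanning the dump for the statements discussed above before uploading it. The check below is only a sketch: the function name is hypothetical and the pattern covers just the plpgsql and ownership statements mentioned in this thread, not every statement that could fail.

```shell
# Scan a plain-text dump for statements that are known to fail on a managed instance.
# Prints the offending lines with their line numbers and returns 1 if any are found.
# $1: path to the dump file
check_dump_for_import() {
  if grep -n -E '^(CREATE EXTENSION IF NOT EXISTS plpgsql|COMMENT ON EXTENSION plpgsql|ALTER .* OWNER TO )' "$1"; then
    return 1
  fi
  return 0
}

# Example (placeholder file name):
#   check_dump_for_import mydump.sql && echo "no known blockers found"
```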
Try using the -L flag with pg_restore, passing a list file derived from an archive created with pg_dump -Fc:
-L list-file
--use-list=list-file
Restore only those archive elements that are listed in list-file, and restore them in the order they appear in the file. Note that if filtering switches such as -n or -t are used with -L, they will further restrict the items restored.
list-file is normally created by editing the output of a previous -l operation. Lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the start of the line. See below for examples.
https://www.postgresql.org/docs/9.5/app-pgrestore.html
pg_dump -Fc -f pg.dump db_name
pg_restore -l pg.dump | grep -v 'COMMENT - EXTENSION' > pg_restore.list
pg_restore -L pg_restore.list pg.dump
Conversely, keeping only the comment entry shows exactly what is being filtered out:
pg_dump -Fc -f pg.dump db_name
pg_restore -l pg.dump | grep 'COMMENT - EXTENSION' > pg_restore_inverse.list
pg_restore -L pg_restore_inverse.list pg.dump
--
-- PostgreSQL database dump
--
-- Dumped from database version 9.4.15
-- Dumped by pg_dump version 9.5.14
SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;
--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--
COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
--
-- PostgreSQL database dump complete
--
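Instead of deleting lines with grep -v, the list file can also be edited the way the pg_restore documentation quoted above describes, by prefixing unwanted entries with a semicolon. The sketch below assumes TOC lines of the form "N; N N EXTENSION - plpgsql" and "N; N N COMMENT - EXTENSION plpgsql"; the function name is hypothetical.

```shell
# Comment out the plpgsql entries of a pg_restore TOC listing read from stdin,
# so that pg_restore -L skips both the CREATE EXTENSION and its COMMENT.
disable_plpgsql_entries() {
  sed -E '/^[0-9]+; [0-9]+ [0-9]+ (EXTENSION - plpgsql|COMMENT - EXTENSION plpgsql)( |$)/s/^/;/'
}

# Example (db_name is a placeholder):
#   pg_restore -l pg.dump | disable_plpgsql_entries > pg_restore.list
#   pg_restore -L pg_restore.list -d db_name pg.dump
```

Keeping the entries in the file, merely commented out, makes it easy to see later what was skipped.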
You can probably safely ignore the error messages in this case. Failing to add a comment to the public schema and installing plpgsql (which should already be installed) aren't going to cause any real problems.
However, if you want to do a complete re-install you'll need a user with appropriate permissions. That shouldn't be the user your application routinely runs as of course.
Shorter answer: ignore it.
plpgsql is the extension that provides the PL/pgSQL procedural language, and it is already installed in the target database. The error often pops up when copying a remote database, such as with a 'heroku pg:pull'. The restore does not overwrite the existing extension and reports that as an error.
For people using AWS: COMMENT ON EXTENSION is possible only as a superuser and, as the docs explain, RDS instances are managed by Amazon. To prevent you from breaking things like replication, your users, even the root user you set up when you create the instance, will not have full superuser privileges:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html
When you create a DB instance, the master user system account that you
create is assigned to the rds_superuser role. The rds_superuser role
is a pre-defined Amazon RDS role similar to the PostgreSQL superuser
role (customarily named postgres in local instances), but with some
restrictions. As with the PostgreSQL superuser role, the rds_superuser
role has the most privileges on your DB instance and you should not
assign this role to users unless they need the most access to the DB
instance.
To fix this error, just use -- to comment out the lines of SQL that contain COMMENT ON EXTENSION.
EDIT 1: As suggested by Dmitrii I., you can also omit comments when dumping: pg_dump --no-comments (available in PostgreSQL 11 and later)
For people who have narrowed down the issue to the COMMENT ON statements (as per various answers below) and who have superuser access to the source database from which the dump file is created, the simplest solution might be to prevent the comments from being included to the dump file in the first place, by removing them from the source database being dumped...
COMMENT ON EXTENSION postgis IS NULL;
COMMENT ON EXTENSION plpgsql IS NULL;
COMMENT ON SCHEMA public IS NULL;
Future dumps then won't include the COMMENT ON statements.
Use the postgres (admin) user to drop the schema, recreate it, and grant privileges on it before you do your restore.
In one command:
sudo -u postgres psql -c "DROP SCHEMA public CASCADE;
create SCHEMA public;
grant usage on schema public to public;
grant create on schema public to public;" myDBName
For me, I was setting up a database with pgAdmin and it seems setting the owner during database creation was not enough. I had to navigate down to the 'public' schema and set the owner there as well (was originally 'postgres').
Some of the answers have already provided various approaches related to getting rid of the create extension and comment on extensions. For me, the following command line seemed to work and be the simplest approach to solve the problem:
cat /tmp/backup.sql.gz | gunzip - | \
grep -v -E '(CREATE\ EXTENSION|COMMENT\ ON)' | \
psql --set ON_ERROR_STOP=on -U db_user -h localhost my_db
Some notes:
The first line just uncompresses my backup; you may need to adjust it accordingly.
The second line uses grep to get rid of the offending lines.
The third line is my psql command; you may need to adjust it to however you normally use psql for a restore.