Heroku: importing from S3 failing - postgresql

I'm trying to import a local PostgreSQL database to Heroku, following these steps: https://devcenter.heroku.com/articles/heroku-postgres-import-export#import-to-heroku-postgres.
I have successfully:
created a dump
uploaded it to an S3 Bucket
created a signed link from the AWS CLI
ran the command heroku pg:backups:restore '<SIGNED URL>' DATABASE_URL (adding -a with my app name).
The process to restore a backup starts correctly but then exits with this code:
! An error occurred and the backup did not finish.
!
! Could not initialize transfer
!
! Run heroku pg:backups:info r011 for more details.
Opening the log shows:
Database: BACKUP
Finished at: 2020-01-09 18:49:30 +0000
Status: Failed
Type: Manual
Backup Size: 0.00B (0% compression)
=== Backup Logs
2020-01-09 18:49:30 +0000 Could not initialize transfer
I've tried:
re-uploading the file to the bucket,
generating a new signed link,
putting the app in maintenance mode,
creating a user in IAM with full S3 access and saving the credentials in the app environment, as described in https://devcenter.heroku.com/articles/s3
Not sure where to go from here, but I would appreciate any help. (I'm on the hobby plan, so I can't ask Heroku support for help.)
Edit: I also tried:
deleting and recreating the S3 Bucket
installing version 1 of the AWS CLI to see if by chance the structure of a presigned link had changed
Edit 2: Since I could not find a solution, I've opted to migrate the hosting entirely to AWS for the moment.

Make sure that the credentials stored on your machine in ~/.aws/ (the default profile) are the ones you created for your Heroku config, and that the signed URL is created with those same credentials and settings. I had to set my default credentials to the credentials I put in my Heroku config, and I also had to set the default region in ~/.aws/config to match the bucket's location. It should work after that.
Here are some instructions if you are on macOS or Linux.
Sorry, Windows people; I would assume it is something similar.
Create a new access key ID and secret access key in IAM on AWS
Set your Heroku config to use those credentials: heroku config:set AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=yyy
Optional: you may have to set the bucket name in the Heroku config too
On your machine, set the credentials you just created as the default in ~/.aws/credentials
On your machine, set the default region in ~/.aws/config to the one your bucket is in
Create a signed URL: aws s3 presign s3://your-bucket-address/your-object
Run the restore: heroku pg:backups:restore '<SIGNED URL>' DATABASE_URL
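Put together, the whole sequence looks roughly like the sketch below (the app, bucket, and file names are illustrative placeholders, and it assumes the default AWS profile and region already match the bucket):
heroku config:set AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=yyy -a your-app
aws s3 cp mydb.dump s3://your-bucket-address/mydb.dump
aws s3 presign s3://your-bucket-address/mydb.dump --expires-in 3600   # presigned URL valid for one hour
heroku pg:backups:restore '<SIGNED URL>' DATABASE_URL -a your-app
If the restore still fails with "Could not initialize transfer", double-check that the presigned URL actually downloads the file when opened directly, since that is exactly what Heroku tries to do.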

Had the exact same error and made these two adjustments. In the S3 console, click on the file you want to use for the backup. You should see the name of your file followed by four tabs. In the General information tab, do the following:
Click on Make public to make the file available for download.
Get the URL for that object where it says URL of object
(It should be something like https://mybucket.s3.amazonaws.com/my.file; you can test whether it works by pasting that URL into a new browser tab, which should trigger the download of your file.)
Once the previous check is working you can proceed to
heroku pg:backups restore 'https://mybucket.s3.amazonaws.com/my.file' DATABASE_URL
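If you prefer the command line over the console, something along these lines should have the same effect (the bucket and key names are illustrative, and the bucket's Block Public Access settings must allow public ACLs):
aws s3api put-object-acl --bucket mybucket --key my.file --acl public-read
curl -I https://mybucket.s3.amazonaws.com/my.file   # expect an HTTP 200 response
Keep in mind that a public object can be downloaded by anyone with the URL, so remove it or revoke the ACL once the restore is done.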

I ran into the same error and discovered that I had my bucket's region set as us-east rather than us-east-1.
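If you are not sure which region a bucket is actually in, the AWS CLI can tell you (the bucket name is illustrative):
aws s3api get-bucket-location --bucket mybucket
Note that for buckets in us-east-1 this returns a null LocationConstraint, which is normal.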

Related

How can I access a PostgreSQL DB from outside of a Heroku app (for a Python app)?

I have a PostgreSQL DB hosted on Heroku and I want to access it from other applications that aren't hosted on Heroku, but I saw in the DB settings that the credentials are not permanent.
How can other applications always have up-to-date credentials?
Heroku recommends using the Heroku CLI to fetch fresh credentials every time you run your external application:
Always fetch the database URL config var from the corresponding Heroku app when your application starts. For example, you may follow 12Factor application configuration principles by using the Heroku CLI and invoke your process like so:
DATABASE_URL=$(heroku config:get DATABASE_URL -a your-app) your_process
This way, you ensure your process or application always has correct database credentials.
In this example, your_process will see an environment variable called DATABASE_URL that is set to the same value as the DATABASE_URL config var on the Heroku app called your-app.
Since you are using Python, here is one way to access that value:
import os
database_url = os.getenv("DATABASE_URL", default="some_default_for_local_development")

Database transfer from Heroku to Digital Ocean

I'm currently in the process of switching my cloud server from Heroku to DigitalOcean. However, is there a way to migrate the database from the Heroku server to the DigitalOcean one? I use PostgreSQL for my database.
I hope you already got a solution, but in case you didn't, I'll provide a simple guide on how I did it. I am going to assume that you have already created a Postgres database on DigitalOcean. You also need to navigate to your project directory and log in to Heroku using the Heroku CLI, and you need to have PostgreSQL installed or at least a psql client (installing PostgreSQL covers it, as it comes with psql).
Step 1: Create a backup and download the backup from heroku postgres
heroku pg:backups:capture --app <app_name>
heroku pg:backups:download --app <app_name>
The first command will create a backup of your database and the second will download it to your current directory as a .dump file. If you would like to read more, here is an article.
Step 2: Connect to your remote (digital ocean’s) database using psql
Before you can do this, you need to add the machine you are connecting from to the database's list of trusted sources. If you don't, you'll get a Connection Timed Out error, because the database's firewall doesn't allow connections from your local machine or resource (for security reasons).
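The connection itself looks roughly like this, with the placeholders taken from the connection details in your DigitalOcean dashboard:
psql "postgresql://<database_username>:<database_password>@<host>:<port>/<database>?sslmode=require"
The same connection can be reused after step 3 to confirm the import, for example by running \dt inside psql to list the restored tables.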
Step 3: Import the Database
pg_restore -d "postgresql://<database_username>:<database_password>@<host>:<port>/<database>?sslmode=require" --jobs 4 -c "/path/to/dump_file.dump"
This will import your database from your dump file. Just substitute the variables with the connection parameters that you get from your dashboard. If you would like to read more, here is another article for this step.
Another thing to make clear: sometimes you will see some harmless error messages when running this command, but it will push through anyway. To learn more about pg_restore, read this article.
And that's it; your database has been migrated. Can you confirm it worked? As for me, I used pgAdmin to connect to the remote database and I saw the tables and data as expected.
Hope this helps anyone with the same problem :)

Heroku Postgres app works on local machine but not on Heroku

I built and deployed a Node.js Postgres app to Heroku and cannot reach any of my endpoints via the Heroku site except the root GET route. Curiously, when I run heroku local web, ALL my endpoints behave exactly as they should. I can successfully perform CRUD on the app running via heroku local web. However, when I try, for instance, to create a user using the Heroku URL, it returns an empty error message. Yet, when I check the associated database I find that the user was indeed created. Other than returning an empty error message when I try to either create a user or sign in, the app correctly responds with the different errors I programmed. For example, when I tweak my login details or try to register the same user I earlier tried to register, it correctly says the user already exists. Still, when I try to log in that same existing user I get a blank error message. Note that I created both the Heroku PostgreSQL database and my local PostgreSQL database from exactly the same queries. Please, can you help me through this bottleneck? I am using Postman to test my APIs.
Test to sign in user on Heroku app running on the local machine: success!
Same exact test with Heroku URL: cryptic error.
OK, so after a lot of researching and fiddling around I discovered the solution: I had not added the keys from my .env file to Heroku as config vars, which are found under the Settings tab of the Heroku dashboard. Manually adding my environment variables resolved the matter. Now my app is working both on my local machine and via the Heroku URL.
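For anyone in the same spot, the config vars can also be set from the CLI instead of the dashboard; a rough example with illustrative variable names:
heroku config:set JWT_SECRET=some-secret DB_SSL=true -a your-app
heroku config -a your-app   # list the config vars to verify they were set
Whatever is in your local .env needs an equivalent config var on Heroku, since the .env file itself is typically not deployed.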

PostgreSQL in Openshift won't execute the entrypoint and can not start the database

We have a read-only Postgresql database that should run in Openshift cluster.
We are using RHEL as the underlying operating system.
Our Dockerfile installs the Postgres software, creates the database instance, loads the data into it, then shuts the database down and saves the image.
We are using only bash and SQL scripts and deploy the database using Flyway.
When the container starts, the entrypoint script simply starts the database instance using the "pg_ctl" command and then performs an endless loop to keep the container running.
The last command in the Dockerfile is USER 26, where 26 is the ID of the postgres user. The entrypoint script can be started as the postgres user or by a sudo user.
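Roughly, the entrypoint pattern looks like this (the data directory path is illustrative):
#!/bin/bash
pg_ctl -D /var/lib/pgsql/data -w start
while true; do sleep 3600; done   # keep the container running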
Everything is working well in Docker.
In OpenShift, the container is started by a different user belonging to the root group, but neither the root user nor user 26. OpenShift effectively ignores the USER 26 clause in the Dockerfile.
The user starting the container (we'll call it containeruser) has no rights to start the Postgres instance, so when running the entrypoint it gets permission denied on the PostgreSQL data folder.
I have tried different options, such as adding containeruser to the wheel group and modifying the sudoers file to allow it to use sudo and start the entrypoint as the postgres user, but with no success.
So I have my database image ready but cannot start it in OpenShift.
On the OpenShift configuration side, we are not allowed to make changes like allowing sudo usage or starting the container as the root or postgres user.
Any idea or help with this problem?
I am not an Openshift expert.
Thank you!
Best regards,
rimetnac
You have two choices.
The preferred choice is to fix your image so that it can run as any user. For this, do not use the existing postgres user; create a new user whose group is root. Then ensure that all directories/files that PostgreSQL needs to write to are owned by that user, but also have group root and are writable by the group. When the container is then started up, it will run as an assigned user ID that is not in /etc/passwd, and so will still fall back to using group root. Because the directories/files are writable by group root, everything will still work. For more information see:
https://docs.openshift.org/latest/creating_images/guidelines.html
Specifically, section 'Support Arbitrary User IDs'.
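In practice this usually comes down to something like the following at image build time, with an illustrative data directory path:
chgrp -R 0 /var/lib/pgsql/data
chmod -R g+rwX /var/lib/pgsql/data
This gives group root the same effective access as the owning user, which is what lets an arbitrary assigned UID (whose group is root) write to the data directory.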
The second option, if you have admin control of the cluster and your security team does not object to you overriding the default security model, is to allow your image to be run as the user ID it wants.
First create a new service account:
oc create serviceaccount runasnonroot
Next, grant that service account the ability to run as a non-root user ID of its choosing.
oc adm policy add-scc-to-user nonroot -z runasnonroot --as system:admin
Then patch the deployment config to use that service account.
oc patch dc/mydatabase --patch '{"spec":{"template":{"spec":{"serviceAccountName": "runasnonroot"}}}}'
Note that this still requires you to use USER in the image with an integer user ID, not postgres. Otherwise it can't be verified that the image will run as a non-root user; if you use a user name instead of a user ID, that name could be maliciously mapped to root.
I spent days figuring this out and found one good solution.
OpenShift Origin runs an image as a user created by it, as explained in this OpenShift blog post. This prevents programs from being able to access needed files and directories. To successfully run a program on OpenShift Origin, the blog post provides two solutions, however, the first will not work for PostgreSQL and the second has two disadvantages (explained in the notes):
Grant group write access to the directories used by the main program.
This will not solve the problem because, although the PostgreSQL files will be accessible by any program, they must be owned by the owner of the PostgreSQL process.
Ensure that when operating system libraries are used to look up a system user, one is returned for the ID of the user OpenShift Origin runs the image as. The following are two methods for doing this:
Use a package called nss_wrapper, "which intercepts any calls which look up details of a user and returns a valid entry."
Make the UNIX password database file (/etc/passwd) have global write permissions in the image build so that the OpenShift user can be added to it in the S2I run script.
Each method has a disadvantage: the first requires installing an extra package, and the second makes user accounts insecure.
The best solution is to build the docker image to run as the user OpenShift Origin will run the image as. I built this instructional image with it.
One additional problem to note is that, as the owner of the PostgreSQL process must be the owner of the files and directories accessed by PostgreSQL, PostgreSQL must be set up (i.e. initdb, roles, databases, etc.) during the image build. This is because file ownership can only be changed during the image build and the ownership of the files must be changed after PostgreSQL has been set up for the reason explained in #2 below.
Here are the complete steps with notes for setting up PostgreSQL in the image build:
Manually create the PostgreSQL data directory and change its ownership to a non-root user that will be used to initialize PostgreSQL and set up the components (e.g. roles and databases) required to run the server on OpenShift Origin.
This is required because the "initdb" executable must be executed by a user other than root and will need access to the data directory. Additionally, this user cannot be the user OpenShift Origin will run the image as because it is not in the system.
Switch to the non-root user.
This is required because the initdb executable must be executed "as the user that will own the server process, because the server needs to have access to the files and directories that initdb creates" (PostgreSQL documentation) and because the PostgreSQL server will be started to set up components (e.g. roles and databases) required to run the server on OpenShift Origin.
Run the "initdb" executable.
Start the PostgreSQL server, set up the required components (roles, databases, etc.) and stop the PostgreSQL server.
Switch back to the root user.
Change the ownership of the PostgreSQL files and directories to the user OpenShift Origin will run the image as.
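A heavily simplified sketch of those build steps as shell commands (the user name, UID, runtime UID, and paths are all illustrative; in a real Dockerfile the user switches would be USER instructions rather than su):
useradd -u 1001 pgbuild                                              # step 1: non-root build user
mkdir -p /var/lib/pgsql/data && chown pgbuild /var/lib/pgsql/data
su pgbuild -c 'initdb -D /var/lib/pgsql/data'                        # steps 2-3
su pgbuild -c 'pg_ctl -D /var/lib/pgsql/data -w start'               # step 4: start the server,
su pgbuild -c 'psql -d postgres -c "CREATE ROLE app LOGIN"'          # set up roles/databases,
su pgbuild -c 'pg_ctl -D /var/lib/pgsql/data -m fast -w stop'        # and stop it again
chown -R <runtime-uid> /var/lib/pgsql/data                           # steps 5-6: hand ownership to the runtime user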
Edit (06/20/18): I have found that there is a solution to set up PostgreSQL after the image is built. The user OpenShift Origin will run the image as can be added to the system at the start of the build. This will allow PostgreSQL to be set up and the ownership of its files and directories to be changed after the image build.
After gathering the comments from all contributors, I can answer my question as follows:
Option 1
When you create the Postgres database during the image build, you must configure OpenShift policies to allow starting your container as the user that created the database during the image build. Use this option when the database must be filled with data and that operation takes too long to be appropriate at container start; the entrypoint will then only start the already prepared database.
Option 2
Create your database when starting the container using the entrypoint script. Use this option when the database creation is fast enough to be done at container start.
Option 3
See the last comment from Adrian, which seems to answer all the problems, although I haven't had the time to test it.
Thank you all for your contributions.

Heroku database restore issue

I have gone through the different solutions available on Stack Overflow and also on different forums, but none addresses the precise problem.
As per the documentation: https://devcenter.heroku.com/articles/heroku-postgres-import-export
I have the dump file created from my local database, with this command:
pg_dump -Fc --no-acl --no-owner -h localhost -U postgres dss_iaya > dss_iaya_db_dump1.dump
Then, as per the documentation, I uploaded it to a server with a public-access URL: https://firebasestorage.googleapis.com/v0/b/iaya-664f3.appspot.com/o/dss_iaya_db_dump1.dump?alt=media&token=06167d04-1e98-4e4b-b0e0-9d83a86dd167
Now when I try to restore on Heroku using the documented syntax heroku pg:backups:restore [BACKUP] [DATABASE] --app APP with the following command, it returns an error message while restoring.
heroku pg:backups:restore --app heroku-postgres-*** 'https://firebasestorage.***/dss_iaya_db_dump1.dump?alt=media&token=***' 'postgres://quesu***:I***@ec2-54-***.eu-west-1.compute.amazonaws.com:5432/d3n***k0'
I have used *** for security purposes only, as I cannot share the full credentials, but I believe the overall syntax is clear.
When I restore the same .dump file into a newly created local database, it works without any issues and creates/restores the whole database with tables and data.
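The local check was essentially something like this (the local database name is illustrative):
createdb -h localhost -U postgres dss_iaya_test
pg_restore --no-acl --no-owner -h localhost -U postgres -d dss_iaya_test dss_iaya_db_dump1.dump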
Just found the solution; actually, two things were wrong in my case.
One, the uploaded .dump file was not directly readable/usable by Heroku.
Two, the complete Heroku PostgreSQL DB URL did not need to be provided.
So, what worked for me was making the uploaded file accessible without any token and without any virtual/indirect path; the URL should point to the file directly. In my case, I was using Firebase to host my DB file temporarily for the Heroku operation, and Firebase was not giving a direct URL to the uploaded physical file.
heroku pg:backups:restore --app heroku-postgres-f3*** 'https://www.h***.com/dss_iaya_db_dump2.dump' DATABASE_URL
After typing this command, I was asked to retype the heroku app name just to confirm the operation. Once done, everything worked like a charm.
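A quick way to check whether a URL is "direct" in this sense is to request it and confirm that it immediately returns the dump bytes rather than a redirect or an HTML page (the URL is a placeholder):
curl -I '<DIRECT URL TO DUMP FILE>'   # expect an HTTP 200 and a binary content type
If that check passes, the same URL should work with heroku pg:backups:restore.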
In the logs you see dump size: 0 Bytes, and you also see aborting and a 403,
so if you check your file: https://firebasestorage.googleapis.com/v0/b/iaya-664f3.appspot.com/o/dss_iaya_db_dump1.dump?alt=media
you get:
404. That’s an error.
The requested URL /v0/b/iaya-664f3.appspot.com/o/dss_iaya_db_dump1.dump was not found on
this server. That’s all we know.