Running Script In AWS? - postgresql

I'm not sure whether this is a relevant question or not. I have one CSV file and one shapefile on my own drive, and I used to run a script in cmd that combined these two files and stored the result in PostgreSQL using pgfutter. I want to do the same thing in AWS. If I keep these two files in a bucket, is it possible to do the same with the command I used in cmd (shown below)?
shp2pgsql -I "<Shape file directory>" <Tablename> | psql -U <username> -d <DatabaseName>
Example: shp2pgsql -I "C:\Test\filep" public.geo | psql -U postgres -d gisDB
If yes, please help me get this working. If no, please let me know the reason. [Please note that I am new to AWS.]

You can do it in two ways:
If you plan to do it only once or a few times: download the files locally using the AWS CLI (aws s3 cp or aws s3 sync) and then pass those files as input (see the sketch below).
If you will be accessing multiple files: use another AWS service to expose your S3 objects as files. Check AWS Storage Gateway and choose AWS Storage Gateway for Files. Once configured, you can refer to the S3 objects as files.
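For the first option, a minimal sketch (bucket, file and database names are placeholders; it assumes the AWS CLI, shp2pgsql and psql are installed on the machine where you run it, e.g. an EC2 instance):
aws s3 cp s3://my-bucket/shapefiles/ . --recursive
aws s3 cp s3://my-bucket/data.csv .
shp2pgsql -I filep.shp public.geo | psql -U postgres -d gisDB
The recursive copy matters for the shapefile, because shp2pgsql also needs the companion .shx/.dbf/.prj files next to the .shp.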

Related

Can't retrieve MongoDB to local drive using SCP from AWS EC2

I have a Docker container using Strapi (which uses MongoDB) on a now defunct AWS EC2. I need the content off that server - it can't run because it's too full. So I've tried to retrieve all the files using SCP, which worked a treat apart from downloading the database content (the actual stuff I need - Strapi and Docker boot up fine, but because there is no database content, they treat it as a new instance).
Every time I try to download the contents of the db from AWS I get 'permission denied'.
I'm using SCP something like this:
scp -i /directory/to/***.pem -r user@ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com:strapi-docker/* /your/local/directory/files/to/download
Does anyone know how I can get this entire Docker container running locally with the database content?
You can temporarily change permissions (recursively) on the directory in question to be world-readable using chmod.
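For example (the key path and host are the placeholders from the question; run the chmod on the EC2 instance first, then the scp from your local machine):
sudo chmod -R a+rX ~/strapi-docker
scp -i /directory/to/key.pem -r user@ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com:strapi-docker/* /your/local/directory/files/to/download
The capital X adds execute permission only on directories, which is needed to traverse them; you can restore the original permissions once the copy is done.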

Move default docker postgres data volume

I've created a docker postgis container with the following command:
docker run -p 5435:5432 -e POSTGRES_PASSWORD=xxxxxxx -e POSTGRES_INITDB_ARGS="-d" mdillon/postgis:9.6
This created a volume for data in /var/lib/docker/volumes/[some_very_long_id]/_data
Now I need to move this volume somewhere else, to make backups easier for my outsourcing contractor... and I don't know how to do this. I'm a bit lost, as there seem to be different alternatives, with data volumes and fs mounts for example.
So what's the correct way to do it today? And how do I move my current data directory to a better place?
Thanks.
You can declare a volume mount when you run your container. For example, you could run your container like this:
docker run -p 5435:5432 -e POSTGRES_PASSWORD=xxxxxxx -e POSTGRES_INITDB_ARGS="-d" \
-v /the/path/you/want/on/your/host:/var/lib/postgresql/data \
mdillon/postgis:9.6
This way, the postgres data directory will be at /the/path/you/want/on/your/host on your host.
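If you also want to keep the data you already have, one possible approach (just a sketch, using the volume id docker created for you) is to stop the old container, copy the volume contents into the new host path, and then start the container with the bind mount as above:
docker stop <container-id>
sudo cp -a /var/lib/docker/volumes/[some_very_long_id]/_data/. /the/path/you/want/on/your/host/
The -a flag preserves ownership and permissions, which postgres is picky about.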
I didn't check or search deeply, but in your case I suggest the following steps:
Create another container with an outside (host-mounted) data folder.
Use pg_basebackup to get all the data from the old container into the new one, or use replication (see the sketch below).
That way, you have the data folder outside the container.
Hopefully this will help your case.
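A rough sketch of the pg_basebackup route (assuming the old container is still running and published on localhost:5435, and that its pg_hba.conf allows replication connections for the postgres user):
pg_basebackup -h localhost -p 5435 -U postgres -D /the/path/you/want/on/your/host -X stream -P
You can then start a new container with -v /the/path/you/want/on/your/host:/var/lib/postgresql/data as shown in the other answer.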

Dokku/Docker, how to access file in file system of running container?

Previously, to access a file in a running dokku instance I would run:
docker ps to get the container ID followed by
ls /var/lib/docker/aufs/diff/<container-id>/app/...
note: I'm just using 'ls' as an example command. I ultimately want to reference a particular file.
This must have changed, as the container ID is no longer accessible via this path. There are loads of dirs in that folder, but none that match any of the running containers.
It seems like mounting a volume for the entire container would be overkill in this scenario. I know I can access the file system using dokku run project-name ls, and also docker exec <container-id> ls, but neither of those will satisfy my use case.
To explain a bit more fully, in my dokku project, I have some .sql files that I'm using to bootstrap my postgres DB. These files get pushed up via git push with the rest of the project.
I'm hoping to use the postgres dokku plugin to run the following:
dokku postgres:connect db-name < file-name.sql
This is what I had previously been doing:
dokku postgres:connect db-name < /var/lib/docker/aufs/diff/<container-id>/app/file-name.sql
but that no longer works.
Any thoughts on this? Am I going about this all wrong?
Thanks very much for any thoughts.
Here's an example:
dokku apps:list
=====> My Apps
DummyApp
dokku enter DummyApp
This opens a bash shell inside the DummyApp container.
Never rely on the /var/lib/docker file system paths: most of the data stored there depends on the storage driver currently in use, so it is subject to change.
cat the file from an existing container
docker exec <container> cat file.sql | dokku postgres:connect db-name
cat the file from an image
docker run --rm <image> cat file.sql | dokku postgres:connect db-name
Copy file from an existing container
docker cp <container>:file.sql file.sql
dokku postgres:connect db-name < file.sql

Importing data from external drive to Mongodb hosted on Google compute engine

I deployed MongoDB on Google Cloud. I have trouble importing data now. I have JSON data on my hard drive and would like to import it into the database. I tried multiple ways that didn't work:
directly specifying the location of the file
saving the file in a Google storage bucket.
These are the commands I ran:
mongoimport -d test -c trialcollection - f /mongobucket/trial.json
mongoimport -d test -c trialcollection /mongobucket/trial.json
mongoimport -d test -c trialcollection - f C:/desktop/mongo/trial.json
How do I get data to import into Mongo hosted on the Google compute engine?
It sounds like you have the JSON files on your local computer and you need to mongoimport them into your remote GCE MongoDB instance. The best way to do that is to copy the files you need over to your GCE instance.
If you haven't already, you should install the Google Cloud SDK on your local system. After you've installed that, you should be able to use the gcloud compute copy-files command to copy the files from your local system to your GCE instance. This command essentially works like scp.
From there you can use gcloud compute ssh to connect to your instance and then run the mongoimport command locally on your GCE instance.
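A sketch of those steps (the instance name and zone are placeholders):
gcloud compute copy-files C:/desktop/mongo/trial.json my-instance:~/ --zone us-central1-a
gcloud compute ssh my-instance --zone us-central1-a
mongoimport -d test -c trialcollection --file trial.json
The mongoimport command is run from your home directory inside the SSH session on the GCE instance.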

Copying directly from S3 to Google Cloud Storage

I can migrate data from Amazon AWS S3 to Azure using the AWS SDK for Java and the Azure SDK for Java. Now I want to migrate data from Amazon AWS S3 to Google Cloud Storage using Java.
The gsutil command-line tool supports S3. After you've configured gsutil, you'll see this in your ~/.boto file:
# To add aws credentials ("s3://" URIs), edit and uncomment the
# following two lines:
#aws_access_key_id =
#aws_secret_access_key =
Fill in the aws_access_key_id and aws_secret_access_key settings with your S3 credentials and uncomment the variables.
Once that's set up, copying from S3 to GCS is as easy as:
gsutil cp -R s3://bucketname gs://bucketname
If you have a lot of objects, run with the -m flag to perform the copy in parallel with multiple threads:
gsutil -m cp -R s3://bucketname gs://bucketname
Use the Google Cloud Storage Transfer Service.
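If your Cloud SDK is recent enough to include the transfer commands, a one-off job can be created with something like the following (bucket names are placeholders, and AWS credentials for the source bucket still need to be configured for the Storage Transfer Service, e.g. in the Cloud Console):
gcloud transfer jobs create s3://bucketname gs://bucketname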
The answer suggested by jterrace (AWS key and secret in the .boto file) is correct and worked for me in many regions, but not in some regions that accept only AWS Signature Version 4. For instance, while connecting to the 'Mumbai' region I got this error:
BadRequestException: 400 InvalidRequest
The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256
To overcome this problem (i.e. make gsutil use AWS Signature v4), I had to add the following additional lines to the ~/.boto file. These lines create a new section [s3] in the config file:
[s3]
host = s3.ap-south-1.amazonaws.com
use-sigv4 = True
Reference:
Interoperability support for AWS Signature Version 4
Gsutil cannot copy to s3 due to authentication
Create a new .boto file:
[Credentials]
aws_access_key_id = ACCESS_KEY_ID
aws_secret_access_key = SECRET_ACCESS_KEY
and then run this command:
BOTO_CONFIG=.boto gsutil -m cp s3://bucket-name/filename gs://bucket-name
or this
BOTO_CONFIG=.boto gsutil -m cp gs://bucket-name/filename s3://bucket-name
AWS_ACCESS_KEY_ID=XXXXXXXX AWS_SECRET_ACCESS_KEY=YYYYYYYY gsutil -m cp s3://bucket-name/filename gs://bucket-name
This approach allows copying data from S3 to GCS without the need for a .boto file. There can be situations where storing a credentials file on the running virtual machine is not recommended. With this approach you can integrate GCP Secret Manager, generate the above command at runtime and execute it, avoiding the need to store the credentials permanently as a file on the machine.
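A sketch of that idea (the secret names are placeholders and assume you have already stored the keys in Secret Manager):
AWS_ACCESS_KEY_ID=$(gcloud secrets versions access latest --secret=aws-access-key-id) \
AWS_SECRET_ACCESS_KEY=$(gcloud secrets versions access latest --secret=aws-secret-access-key) \
gsutil -m cp s3://bucket-name/filename gs://bucket-name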