Load to Redshift from S3 without Redshift credentials

We are loading data from S3 to Redshift, but we are providing the Redshift username and password on the command line.
Can we do this in a role-based way instead? The current approach leads to hard-coding the username and password in code, which is a security vulnerability.
psql -h $redshift_jdbc_url -U $redshift_db_username -d $redshift_dbname -p $port_number -c "copy $destinationTable$columnList from '$s3fileName' credentials 'aws_iam_role=arn:aws:iam::$account_number:role/$s3role;master_symmetric_key=$master_key' region '$s3region' format as json '$jsonPathFile' timeformat 'auto' GZIP TRUNCATECOLUMNS maxerror $maxError";

Though this question has nothing specifically to do with Redshift, there are several options for keeping a username/password from being accidentally checked into a code repository (CVS, Git, etc.) or otherwise shared.
I'm not sure whether what we do (described below) is best practice, but here is our approach, and I think it is reasonably safe.
We use environment variables. Those variables live outside the source code repository, and the shell script reads them only on the particular instance where it runs.
For example, if you have a shell script that executes the above command, it would load the environment file like below (e.g. psql.sh):
#!/bin/bash
echo "Loading environment variable"
. "$HOME/.env"
Your other commands
The env file could contain variables like the following:
#!/bin/bash
export REDSHIFT_USER="xxxxxxxxx"
export REDSHIFT_PASSWORD="xxxxxx"
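With those variables exported, the wrapper script never needs to put the password on the psql command line at all, since libpq reads PGPASSWORD automatically. A minimal sketch (the host, database, and port variables are the placeholders from the original command, and $copy_command is assumed to hold the COPY statement from the question):
export PGPASSWORD="$REDSHIFT_PASSWORD"
psql -h "$redshift_jdbc_url" -U "$REDSHIFT_USER" -d "$redshift_dbname" -p "$port_number" -c "$copy_command"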
There are other options too, though I'm not sure how well they work with Redshift:
A .pgpass file to store the password; see:
http://www.postgresql.org/docs/current/static/libpq-pgpass.html
"trust authentication" for that specific user, refer below link.
http://www.postgresql.org/docs/current/static/auth-methods.html#AUTH-TRUST
Hope that answers your question.

Approach 1:
Generate a temporary username/password with a TTL as part of your script, and use those temporary credentials to connect to the database.
Reference from the AWS documentation:
https://docs.aws.amazon.com/cli/latest/reference/redshift/get-cluster-credentials.html
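A minimal sketch of that approach, assuming the AWS CLI and jq are available and the instance's IAM role is allowed to call redshift:GetClusterCredentials (the cluster identifier, db-user, and $copy_command are placeholders; the other variables are the ones from the original command):
creds=$(aws redshift get-cluster-credentials \
    --cluster-identifier my-redshift-cluster \
    --db-user loader_user \
    --db-name "$redshift_dbname" \
    --duration-seconds 900)
export PGUSER=$(echo "$creds" | jq -r '.DbUser')          # returned as "IAM:loader_user"
export PGPASSWORD=$(echo "$creds" | jq -r '.DbPassword')  # expires after the requested duration
psql -h "$redshift_jdbc_url" -d "$redshift_dbname" -p "$port_number" -c "$copy_command"
The credentials expire on their own, so nothing long-lived ever has to be stored in code or on disk.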
Approach 2:
Use the AWS Secrets Manager service.
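A rough sketch of the Secrets Manager route, assuming the secret is stored as a JSON document with username and password keys (the secret name here is a placeholder):
secret=$(aws secretsmanager get-secret-value \
    --secret-id prod/redshift/loader \
    --query SecretString --output text)
export PGUSER=$(echo "$secret" | jq -r '.username')
export PGPASSWORD=$(echo "$secret" | jq -r '.password')
Access to the secret is then governed by IAM, and rotation can be handled centrally instead of in your scripts.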

Related

How do I store a SQL connection password in Airflow cfg?

In the .cfg file, I connected SQLAlchemy to Postgres with user airflow_admin and password pass:
sql_alchemy_conn = postgresql+psycopg2://airflow_admin:pass@localhost:5432/airflow_backend
How do I anonymize this so that the password doesn't show? Do I create a .env file and store the password as a variable and then reference that variable in .cfg conn string?
I read the following but an example would be helpful: https://airflow.readthedocs.io/en/stable/howto/set-config.html
There are several ways to do it:
1. Change the configuration file permissions so it is readable only by the airflow account.
2. Run Airflow in Docker and encrypt the configuration file with docker secret create, then map it in.
Option 1 is easy and convenient. Option 2 is more flexible and can be used in a production environment.
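A third possibility, described in the Airflow configuration docs the question links to, is to override the setting with an environment variable so the password never appears in airflow.cfg at all. A minimal sketch, assuming sql_alchemy_conn lives in the [core] section of your Airflow version (AIRFLOW_DB_PASSWORD is a placeholder you would set outside version control):
export AIRFLOW_DB_PASSWORD='pass'
export AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow_admin:${AIRFLOW_DB_PASSWORD}@localhost:5432/airflow_backend"
Environment variables of the form AIRFLOW__{SECTION}__{KEY} take precedence over values in airflow.cfg, so the connection string in the file can be left as a dummy.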
Good luck.

postgreSQL pg_dump Through LAN

I'm looking for a way to back up a database through a LAN on a mounted drive on a workstation. Basically a Bash script on the workstation or the server to dump a single database into a path on that volume. The volume isn't mounted normally, so I'm not clear as to which box to put the script on, given username/password and mounted volume permissions/availability.
The problem I currently have is permissions on the workstation:
myfile='/volumes/Dragonfly/PG_backups/serverbox_PG_mydomain5myusername_'`date +%Y_%m_%d_%H_%M`'.sql'
pg_dump -h serverbox.local -U adminuser -w dbname > $myfile
Is there a syntax that I can provide for this? Read the docs and there is no provision for a password, which is kind of expected. I also don't want to echo the password and keep it in a shell script. Or is there another way of doing this using rsync after the backups are done locally? Cheers
First, note the pg_dump command you are using includes the -w option, which means pg_dump will not issue a password prompt. This is indeed what you want for unattended backups (i.e. performed by a script). But you just need to make sure you have authentication set up properly. The options here are basically:
Set up a ~/.pgpass file on the host the dump is running from. Based on what you have written, you should keep this file in the home directory of the server this backup job runs on, not stored somewhere on the mounted volume. Based on the info in your example, the line in this file should look like:
serverbox.local:5432:database:adminuser:password
Remember to specify the database name that you are backing up! This was not specified in your example pg_dump command.
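A hedged sketch of that first option (the database name and password are placeholders):
cat >> ~/.pgpass <<'EOF'
serverbox.local:5432:dbname:adminuser:actual_password
EOF
chmod 600 ~/.pgpass   # libpq ignores the file if it is group- or world-readable
With that in place, the pg_dump command from the question can run unattended: -w suppresses the prompt and the password is picked up from the file.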
Fool with your Postgres server's pg_hba.conf file so that connections from your backup machine as your backup user don't require a password, but use something like trust or ident authentication. Be careful here of course, if you don't fully trust the host your backups are running on (e.g. it's a shared machine), this isn't a good idea.
Set environment variables on the server such as PGPASSWORD that are visible to your backup script. Using a ~/.pgpass file is generally recommended instead for security reasons.
Or is there another way of doing this using rsync after the backups are done locally?
Not sure what you are asking here -- you of course have to specify credentials for pg_dump before the backup can take place, not afterwards. And pg_dump is just one of many backup options, there are other methods that would work if you have SSH/rsync access to the Postgres server, such as file-system level backups. These kinds of backups (aka "physical" level) are complementary to pg_dump ("logical" level), you could use either or both methods depending on your level of paranoia and sophistication.
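If you do want the dump produced on the server and then shipped over the LAN, a rough sketch (paths and hostnames are placeholders, and it assumes SSH access from the server to the workstation plus a ~/.pgpass entry as above):
myfile="/var/backups/pg/dbname_$(date +%Y_%m_%d_%H_%M).sql"
pg_dump -h localhost -U adminuser -w -f "$myfile" dbname
rsync -av "$myfile" workstation.local:/volumes/Dragonfly/PG_backups/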
Got it to work with ~/.pgpass, pg_hba.conf on the server, and a script that included the TERM environment variable (xterm), and a path to pg_dump.
Cron jobs don't get a login environment, even for the current admin user, so the script runs a bit blind.

.pgpass file does not work as advertised

(Debian 8)
My .pgpass file is in my home folder (admin).
I am using the right format as indicated in the documentation:
hostname:port:database:username:password
The same fields I put there work fine when entered explicitly on the psql command line (of course I then have to enter the password manually).
However, running psql by itself gives an error:
psql: FATAL: role "admin" does not exist
Note that my SQL username is NOT admin, which is my Debian username.
What am I doing wrong? My goal is to get access to psql without having to use an elaborate command line including host/port/username/database.
.pgpass is not a way to choose which settings you want to use, it's a way to store passwords for a number of settings you've already chosen to use. It can contain multiple lines. The relevant line is then chosen as follows, according to the documentation:
The password field from the first line that matches the current connection parameters will be used.
You still have to provide your connection parameters (besides the password).
If you always want to use the same connection parameters, you should probably use the environment variables (PGHOST, PGDATABASE, PGUSER, ...), and possibly place them in your .bashrc file (depending on the shell you use).
You can then choose to store the password itself in the PGPASSWORD environment variable or in the .pgpass file. The latter might give you a bit more flexibility.
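For example, a minimal sketch of that split (host, database, and user names are placeholders): the connection parameters go in ~/.bashrc and only the password goes in ~/.pgpass:
# in ~/.bashrc
export PGHOST=dbserver.example.com
export PGPORT=5432
export PGDATABASE=mydb
export PGUSER=my_sql_user
# in ~/.pgpass (must be chmod 600), matching those parameters
dbserver.example.com:5432:mydb:my_sql_user:actual_password
After opening a new shell, a bare psql should then connect without prompting.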

Use postgresql copy command On Openshift $OPENSHIFT_DATA_DIR from within Node JS program

We are developing an app on OpenShift.
We recently upgraded it and made it scalable, moving PostgreSQL onto a separate gear from the Node.js gear.
In the app, a user can choose a CSV file and upload it to the server ($OPENSHIFT_DATA_DIR).
We then execute the following from within Node.js:
copy uploaded_data FROM '/var/lib/openshift/our_app_id/app-root/data/uploads/table.csv' WITH CSV HEADER
Since the upgrade, the above COPY command is broken; we are getting this error:
[error: could not open file "/var/lib/openshift/our_app_id/app-root/data/uploads/table.csv" for reading: No such file or directory]
I suppose that because PostgreSQL is now on a separate gear, it cannot access $OPENSHIFT_DATA_DIR.
Can I make this folder visible to PostgreSQL, even though it is on a separate gear?
Is there any other folder that is visible to both the DB and the app (each on its own gear)?
Can you suggest alternative ways to achieve similar functionality?
There is currently no shared disk space between gears within the same scaled application on OpenShift Online. If you want to store a file and access it on multiple gears, the best way would probably be to store it on Amazon S3 or some other shared file storage service that is accessible by all of your gears, or, as you have stated, store the data in the database and access it wherever you need it.
You can do this by using \COPY and psql. For example:
First, put your SQL commands in a file (file.sql), then run:
psql -h yourremotehost -d yourdatabase -p thedbport -U username -w -f file.sql
The -w option eliminates the password prompt. If you need a password, you can't supply it on the command line; instead, set the environment variable PGPASSWORD to your password. (Use of PGPASSWORD is discouraged for security reasons, but it still works.)
You can do this with rhc
rhc set-env PGPASSWORD=yourpassword -a yourapp
Here is a sample file.sql:
CREATE TABLE junk(id integer, "values" float, name varchar(100));
\COPY junk FROM 'table.csv' WITH CSV HEADER
Notice there is NO semicolon at the end of the second line.
If you're running this command from a script in your application, the file that contains your data and file.sql must be in your application's data directory,
i.e. app-root/data.
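Putting those pieces together, a rough sketch of the script run from the app gear (the OPENSHIFT_POSTGRESQL_DB_* variables are the ones the PostgreSQL cartridge normally publishes; treat their exact names and the database name as assumptions):
#!/bin/bash
cd "$OPENSHIFT_DATA_DIR"   # table.csv and file.sql live here (app-root/data)
psql -h "$OPENSHIFT_POSTGRESQL_DB_HOST" \
     -p "$OPENSHIFT_POSTGRESQL_DB_PORT" \
     -U "$OPENSHIFT_POSTGRESQL_DB_USERNAME" \
     -d yourdatabase -w -f file.sql   # PGPASSWORD was set with: rhc set-env PGPASSWORD=... -a yourapp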

Automatically setup jenkins users with CLI

I did not find any reference to user-related commands for the jenkins-cli tool.
I need this to automate deployment.
Is there any workaround?
To create an account in Jenkins' internal user database, simply use the following command:
echo 'jenkins.model.Jenkins.instance.securityRealm.createAccount("user1", "password123")' | \
java -jar jenkins-cli.jar -s http://localhost:8080/ groovy =
This will create user=user1 with password=password123
If you have an existing user and have restricted anonymous access to your Jenkins, you can specify that user's credentials with
--username "user_name" and --password "password"
Maybe you don't want to use Jenkins' internal user database at all. There are a host of "Authentication and User Management" plugins.
If you like MySQL, there is a MySQL authenticator (it reads a table of users and passwords), and your "adduser" command could do an insert on that table.
If you like flat files, there is a "Script Security Realm", where you can authenticate with an arbitrary script. Write a file with user and password combos in your favorite format, write an "adduser" script that writes to it, and write an auth script that reads the file and determines whether to authenticate the user.
You can also hook up to an LDAP server, Active Directory, Atlassian Crowd, Unix user accounts (pw_auth), or whatever authentication your application server uses (if it's running off of a Tomcat server, for instance, you can tell Jenkins to let Tomcat authenticate users, and set up Tomcat to do it however you want).
If you specify in more detail what you are trying to do people here may help you better. That said, here are some pointers:
All CLI commands are available via http://[jenkins-server]/cli. What's not found there is not available via CLI. You can specify user name / password via --username and --password (or --password-file) options in CLI commands.
Another option for Jenkins automation is to use Python JenkinsAPI.
You can also use tools like wget and curl to perform certain actions (such as starting a build). There you may use user-specific tokens instead of username/password.