How to use a different PostgreSQL instance for testing than for production - postgresql

I need my docker containers to connect to a different PostgreSQL server depending on the environment (test & production). What I want is to test my application locally against a local database instance and push the fixes afterwards. From what I have read, PostgreSQL's default connection parameters can be set through environment variables, so I think writing two different environment variable files for test/production and passing the desired one in with the --env-file option of docker run would do the trick.
Is this a suitable way to test & deploy a web application? If not, what would be a better solution?

Yes, in general this is the approach you should take when using Docker. Store your DB connection parameters (URL, username, password) in environment variables. There is no real need to use an environment file unless you have a ton of environment variables; you can also pass an arbitrary number of -e parameters to docker run. This is closer to how services like Amazon's ECS expect you to pass parameters.
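As a rough sketch (the image name, file names, and values here are placeholders; the PG* variables are the ones libpq-based clients read by default), you could keep one env file per environment and pick one at run time:
$ cat > test.env << END
PGHOST=localhost
PGPORT=5432
PGDATABASE=myapp_test
PGUSER=myapp
PGPASSWORD=devpassword
END
$ docker run --env-file test.env myapp:latest
$ docker run --env-file prod.env myapp:latest   # prod.env holds the same keys pointing at the production server
Or, with only a handful of variables, pass them directly:
$ docker run -e PGHOST=db.example.com -e PGDATABASE=myapp -e PGUSER=myapp -e PGPASSWORD=secret myapp:latest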
If you're going to write those to a file, make sure that the file is encrypted/encoded somehow - storing database passwords in a file in plaintext is not a great security practice.

Related

How to connect build agent to PostgreSQL database

My integration tests for my asp.net core application require a connection to a PostgreSQL database. In my deployment pipeline I only want to deploy if my integration tests pass.
How do I supply a working connection string inside the Microsoft build agent?
I looked under service connections and couldn't see anything related to a database.
If you are using a Microsoft-hosted agent, then your database needs to be accessible from the internet.
Otherwise, you need to run it on a self-hosted agent that can access your database.
I assume the default connection string is in appsettings.json. You could store the actual database connection string in a secret variable, then update the appsettings.json file with that variable's value through some task (e.g. Set Json Property) or do it programmatically (e.g. a PowerShell script) before running the web app and starting the tests during the build.
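If the app reads its settings through the default ASP.NET Core configuration providers (an assumption; the question doesn't say), a lighter-weight variant is to override the connection string with an environment variable in the test step instead of rewriting the JSON file, since a double underscore in a variable name maps to the : separator in configuration keys:
# The key name is a placeholder; match it to the entry in your appsettings.json.
# In the pipeline the value would come from the secret variable rather than being typed here.
$ export ConnectionStrings__DefaultConnection="Host=localhost;Database=myapp_test;Username=ci;Password=secret"
$ dotnet test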
If any PostgreSQL database will do, you can use a service container with a Docker image that provides PostgreSQL (e.g. postgres).
For a classic pipeline, you could call the docker command to run the image.
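For example, a classic pipeline step could start a throwaway postgres container before the tests and remove it afterwards (the container name, password, and port mapping are illustrative; POSTGRES_PASSWORD is the variable the official postgres image expects):
$ docker run -d --name test-db -e POSTGRES_PASSWORD=test -p 5432:5432 postgres
$ # ... run the integration tests against localhost:5432 ...
$ docker rm -f test-db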
I would recommend you use runsettings, which you can override in the task. That way you keep your connection string out of source control. Please check this link. As for service connections, you don't need one; all you need is a proper connection string.
Since I don't know the details of how you connect to your DB, I can't give you more info. If you provide an example of how you already connect to the database, I can try to give a better answer.

Docker and sensitive information used at run-time

We are dockerizing an application (written in Node.js) that will need to access some sensitive data at run-time (API tokens for different services) and I can't find any recommended approach to deal with that.
Some information:
The sensitive information is not in our codebase; it is kept in another repository in encrypted format.
On our current deployment, without Docker, we update the codebase with git, and then we manually copy the sensitive information via SSH.
The docker images will be stored in a private, self-hosted registry.
I can think of some different approaches, but all of them have some drawbacks:
Include the sensitive information in the Docker images at build time. This is certainly the easiest option; however, it makes the credentials available to anyone with access to the image (I don't know if we should trust the registry that much).
Like 1, but having the credentials in a data-only image.
Create a volume in the image that links to a directory in the host system, and manually copy the credentials over SSH like we're doing right now. This is very convenient too, but then we can't spin up new servers easily (maybe we could use something like etcd to synchronize them?)
Pass the information as environment variables. However, we have 5 different pairs of API credentials right now, which makes this a bit inconvenient. Most importantly, however, we would need to keep another copy of the sensitive information in the configuration scripts (the commands that will be executed to run Docker images), and this can easily create problems (e.g. credentials accidentally included in git, etc).
PS: I've done some research but couldn't find anything similar to my problem. Other questions (like this one) were about sensitive information needed at build time; in our case, we need the information at run time.
I've used your options 3 and 4 to solve this in the past. To rephrase/elaborate:
Create a volume in the image that links to a directory in the host system, and manually copy the credentials over SSH like we're doing right now.
I use config management (Chef or Ansible) to set up the credentials on the host. If the app takes a config file needing API tokens or database credentials, I use config management to create that file from a template. Chef can read the credentials from encrypted data bag or attributes, set up the files on the host, then start the container with a volume just like you describe.
Note that in the container you may need a wrapper to run the app. The wrapper copies the config file from whatever the volume is mounted to wherever the application expects it, then starts the app.
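A minimal sketch of such a wrapper, assuming the secrets volume is mounted at /run/secrets-in and the Node.js app expects its config at /app/config.json (both paths and the entry point are made up for the example):
#!/bin/sh
# entrypoint.sh: copy the mounted config to where the app expects it, then start the app
set -e
cp /run/secrets-in/config.json /app/config.json
exec node /app/server.js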
Pass the information as environment variables. However, we have 5 different pairs of API credentials right now, which makes this a bit inconvenient. Most importantly, however, we would need to keep another copy of the sensitive information in the configuration scripts (the commands that will be executed to run Docker images), and this can easily create problems (e.g. credentials accidentally included in git, etc).
Yes, it's cumbersome to pass a bunch of env variables using -e key=value syntax, but this is how I prefer to do it. Remember the variables are still exposed to anyone with access to the Docker daemon. If your docker run command is composed programmatically it's easier.
If not, use the --env-file flag as discussed here in the Docker docs. You create a file with key=value pairs, then run a container using that file.
$ cat >> myenv << END
FOO=BAR
BAR=BAZ
END
$ docker run --env-file myenv <image>
That myenv file can be created using chef/config management as described above.
If you're hosting on AWS you can leverage KMS here. Keep either the env file or the config file (the one passed to the container in a volume) encrypted via KMS. In the container, use a wrapper script to call out to KMS, decrypt the file, move it into place, and start the app. This way the config data is not exposed on disk.
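A rough sketch of that wrapper, assuming the encrypted file was produced with aws kms encrypt and is mounted at /secrets/config.json.enc (paths, region, and the app entry point are placeholders):
#!/bin/sh
# decrypt the KMS-encrypted config, write it where the app expects it, then start the app
set -e
aws kms decrypt \
    --region us-east-1 \
    --ciphertext-blob fileb:///secrets/config.json.enc \
    --output text --query Plaintext | base64 -d > /app/config.json
exec node /app/server.js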

Reading from a password file for tFTPConnection

I am trying to use tFTPConnection to download certain files from an FTP site.
It is a regular FTP connection, connecting on port 21.
I would like to be able to read the password from a file rather than hard coding the password to the job.
At the moment I'm simply making the connection and then printing success.
Any advice on how this could be approached or solved?
Talend supports the idea of context variables, whose values can be defined at run time.
This is typically used so you can "contextualise" a connection and then deploy the job in multiple environments, with the connection pointing to the environment-specific endpoint in each.
For instance, a job might need to connect to a database but that database is different for each of a development, a testing and a production environment.
Instead of hard-coding the connection parameters in the job, we create some context variables under a context group and refer to these context variables in the connection parameters.
Now, at run time we have the Talend job load these contexts from a file containing the relevant connection parameters, using an implicit context load.
In this case, the job will read the context variables at run time from a CSV called test.csv that looks like:
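The exact contents depend on how the context group and the implicit context load are configured, but with the variable names assumed here it would be something along these lines:
host;localhost
port;3306
database;test
username;root
password;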
Now, when this job is run, it will attempt to connect to localhost:3306/test with the root user and an empty password.
If we have another context file on another machine (but with the same file path), then it could refer to a database on some other server or simply use different credentials, and the job would connect to that other database instead.
For your use case you can simply create a single context group with the FTP connection settings, including the password (or potentially just contextualise the password), and then refer to it in the same way:
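For example, a context file for the FTP connection might look like this (the variable names are just an assumed naming convention; match them to the context variables in your context group):
ftp_host;ftp.example.com
ftp_port;21
ftp_username;myuser
ftp_password;secret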

How to deploy changes to a Cassandra CQL schema

We have an application which is using Cassandra for its database. How should we deploy schema changes in a live production environment?
In development we are just blowing the database away and recreating it with a 'database.cql' script kept in version control. This clearly isn't a solution in production.
In the relational world I would either use a sequence of upgrade scripts and apply them in order, or use a tool to interactively compare the staging and production databases and make the appropriate schema changes.
How do I solve the same problem in Cassandra?
Here's one I've started and have been using for a while.
https://github.com/heartysoft/aedes
It supports multiple environments and versioning. Since we're Windows-based, it's mainly PowerShell, but there's no reason a bash script couldn't be written to do the equivalent. The PowerShell script itself is extremely simple. It requires PowerShell v3+. Usage is pretty easy:
aedes.ps1 192.168.40.4 [-u username -p password -env dev]
will look for schema files in the ..\schema folder. Schema files are expected to have an n_ prefix, where n is the version number. Environment-specific files have a .<env>.cql postfix. So, if the files are:
1_people.dev.cql
1_people.prod.cql
2_people_some_indexes.cql
3_jobs.dev.cql
3_jobs.prod.cql
4_jobs_something_changed.cql
and you run it for prod, then the ones with .prod.cql and those with no env part in the name will be applied in order. You can also specify a $start version to control where to start applying from (e.g. if start is specified as 3, then anything with a 1_ or 2_ prefix will be skipped).
It's pretty basic but seems to work quite well. We just have Cassandra downloaded (not installed) on the "applier machine" (which could be your machine, i.e. not part of the cluster) and have cqlsh on the PATH for easier application. I did (and do) have plans for more features, but it's working nicely as is for the time being.
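For reference, applying one of those files by hand with cqlsh looks roughly like this (host, credentials, and paths are placeholders; -u/-p are only needed if authentication is enabled):
$ cqlsh 192.168.40.4 -u username -p password -f ../schema/1_people.prod.cql
$ cqlsh 192.168.40.4 -u username -p password -f ../schema/2_people_some_indexes.cql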
Since there wasn't an existing tool, I ended up writing one.
It is called cql-migrate, and provides incremental updates to a deployed Cassandra schema.
[update] Since writing this, I have found a couple more options: one for Rails and another for Go.

How to run same tests on different servers using prove?

I am using the Perl prove testing utility (TAP::Harness) to test my program.
I need to run the same tests first on a local computer, then on a remote computer.
(Test programs should connect to localhost or to the remote host, respectively.)
How can I pass parameters (test_server) to the tests using prove? Should I use an environment variable, or is there a better solution?
An environment variable sounds good, since you do not have easy access to higher-level means of passing data, such as command-line options or function parameters.
We already have prior art in the variables TEST_VERBOSE, AUTOMATED_TESTING and RELEASE_TESTING, which influence how tests are run.
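A minimal sketch of that approach (TEST_SERVER is an arbitrary name; inside the .t files it would be read with $ENV{TEST_SERVER} and defaulted to localhost when unset):
# run against the local machine
$ prove -lr t/
# run the same suite against a remote server
$ TEST_SERVER=db.example.com prove -lr t/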
Depending on your larger goals, you may wish to approach the problem differently. We use Jenkins to control test suite runs that ultimately run "prove". It's set up to run tests on multiple servers and offers a number of features to manage "builds", which can be test suite runs.