I am trying to read Airflow Variables in my ETL job to populate variables in the curation script. I am using the KubernetesPodOperator. How do I access the metadata database from my Kubernetes pod?
This is the error I am getting from Airflow:
ERROR - Unable to retrieve variable from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
This is what I have in main.py for logging the value to the console. I have a key in Airflow Variables named "AIRFLOW_URL".
import logging
from airflow.models import Variable

logger = logging.getLogger(__name__)
AIRFLOW_VAR_AIRFLOW_URL = Variable.get("AIRFLOW_URL")
logger.info("AIRFLOW_VAR_AIRFLOW_URL: %s", AIRFLOW_VAR_AIRFLOW_URL)
Can anyone point me in the right direction?
Your DAG can pass them as environment variables to your Pod, using a template (e.g. KubernetesPodOperator(... env_vars={"MY_VAR": "{{var.value.my_var}}"}, ...)).
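For example, a minimal sketch of the DAG side, assuming the Airflow Variable key is AIRFLOW_URL as in the question; the import path has moved between cncf.kubernetes provider versions, and the namespace and image below are placeholders:
from datetime import datetime
from airflow import DAG
# Older provider releases expose this under ...operators.kubernetes_pod instead of ...operators.pod
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(dag_id="curation_example", start_date=datetime(2023, 1, 1), schedule_interval=None) as dag:
    curate = KubernetesPodOperator(
        task_id="curate",
        name="curate",
        namespace="default",                      # placeholder: your namespace
        image="my-registry/curation-job:latest",  # placeholder: your ETL image
        # The template is rendered by the Airflow worker, which can reach the
        # metadata database; the pod only ever receives the plain value.
        env_vars={"AIRFLOW_URL": "{{ var.value.AIRFLOW_URL }}"},
    )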
It looks like you have a secrets backend configured in your Airflow config without an actual secrets backend set up, so Airflow is trying to fetch your variable from there. See the Airflow documentation on secrets backends.
Clear the backend and backend_kwargs keys (or remove them entirely) so the [secrets] section looks like the below, and Airflow will fall back to checking its own Variables first:
[secrets]
backend =
backend_kwargs =
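Also note that if you go the env_vars route from the first suggestion, the script inside the pod no longer needs the airflow package or metadata-database access at all. A minimal sketch of main.py under that assumption:
import logging
import os

logger = logging.getLogger(__name__)

# The pod cannot reach Airflow's metadata database, so read the value the DAG
# injected as an environment variable instead of calling Variable.get().
AIRFLOW_VAR_AIRFLOW_URL = os.environ.get("AIRFLOW_URL")
logger.info("AIRFLOW_VAR_AIRFLOW_URL: %s", AIRFLOW_VAR_AIRFLOW_URL)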
Related
I would like to be able to define environment variables in AWS ECS task definition like below:
TEST_USER: admin
TEST_PATH: /home/$TEST_USER/workspace
When I echo $TEST_PATH inside the container:
Actual value: /home/$TEST_USER/workspace
Expected value: /home/admin/workspace
You can't do that. I don't think Docker in general supports evaluating one environment variable inside another before exposing them in the container.
If you are using something like CloudFormation or Terraform to create your task definitions, you would store the value in a variable in that tool and build the ECS environment variables from that CloudFormation/Terraform variable.
Otherwise you could edit the entrypoint script of your Docker image to do the following when the container starts up:
export TEST_PATH="/home/$TEST_USER/workspace"
During a pipeline run, under a deployment job, providing a deployment environment eliminates the need to provide a service connection manually. I'd guess it either creates a new service connection (SC) at that point, or it created one when the environment was created and reuses it.
Either way, is there a way to find out which service connection is being used, from the logs of the pipeline run or from anywhere else?
In our environment I see a lot of service connections for one environment, and a cleanup is necessary to get things in order.
I tried specifying the SC manually along with the environment, and it works as expected, so going forward I can use that method. But for the cleanup I'd still like to know which one gets used when none is specified! (None of the auto-created SCs show any execution history, but I know the deployment has happened multiple times.)
Because a Kubernetes resource in an environment references a Kubernetes service connection, you can use this API to get the resource's serviceEndpointId, which is the id of the referenced service connection:
GET https://dev.azure.com/{organization}/{project}/_apis/distributedtask/environments/{environmentId}/providers/kubernetes/{resourceId}?api-version=7.0
Using the serviceEndpointId from that response as the endpointId, you can then call this API to get the details of the referenced service connection:
GET https://dev.azure.com/{organization}/{project}/_apis/serviceendpoint/endpoints/{endpointId}?api-version=7.0
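A rough Python sketch chaining the two calls with a personal access token; the organization, project, environment id and resource id are placeholders, and the response field names follow the documented shapes but are worth verifying against your own responses:
import requests

ORG = "myorg"          # placeholder
PROJECT = "myproject"  # placeholder
ENVIRONMENT_ID = 1     # placeholder: numeric id of the environment
RESOURCE_ID = 2        # placeholder: id of the Kubernetes resource in that environment
PAT = "..."            # personal access token with read access to environments/service connections

auth = ("", PAT)  # Azure DevOps PATs are sent as basic auth with an empty username
base = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis"

# 1) Get the Kubernetes resource; serviceEndpointId identifies the referenced service connection.
resource = requests.get(
    f"{base}/distributedtask/environments/{ENVIRONMENT_ID}/providers/kubernetes/{RESOURCE_ID}",
    params={"api-version": "7.0"},
    auth=auth,
).json()
endpoint_id = resource["serviceEndpointId"]

# 2) Fetch that service connection's details (name, URL, etc.).
endpoint = requests.get(
    f"{base}/serviceendpoint/endpoints/{endpoint_id}",
    params={"api-version": "7.0"},
    auth=auth,
).json()
print(endpoint.get("name"), endpoint.get("url"))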
I'm using ECS Fargate platform version 1.4.
I'm setting an environment variable while creating the task definition in a cluster, but when I try to access that variable inside the container, it is missing from the container's environment.
I have tried every way I could find to set and read it.
I even tried setting the variable via the command option, but that failed as well.
Please help me out.
I know this is an older question, but I'll share what I've done.
I use Terraform, so in the container_definitions block of my aws_ecs_task_definition I have environment defined like this:
environment = [
{
name = "MY_ENVIRONMENT_VARIABLE",
value = "MY_VALUE"
}
]
Then in my app, which is a Flask application, I do this:
from os import environ
env_value = environ.get("MY_ENVIRONMENT_VARIABLE", None) # "MY_VALUE"
In the ECS console, if you click on a task definition, you should be able to view the container definition, which should show the environment variables you set. It's just a sanity check.
So I have configured an OpenShift 3.9 build configuration such that environment variables are populated from an OpenShift secret at build time. I am using these environment variables to set up passwords for PostgreSQL roles in the image's ENTRYPOINT script.
Apparently these environment variables are baked into the image, not just the build image but also the resulting database image. (I can see their values when issuing set inside the running container.) On the one hand this seems necessary, because the ENTRYPOINT script needs access to them and it executes only at image run time (not build time). On the other hand it is somewhat disconcerting, because as far as I know anyone who obtained the image could extract those passwords. Unsetting the environment variables after use would not change that.
So is there a better way (or even best practice) for handling such situations in a more secure way?
UPDATE At this stage I see two possible ways forward (better choice first):
Configure DeploymentConfig such that it mounts the secret as a volume (not: have BuildConfig populate environment variables from it).
Store PostgreSQL password hashes (not: verbatim passwords) in secret.
As was suggested in a comment, what made sense was to shift the provisioning of environment variables from the secret out of the BuildConfig and into the DeploymentConfig. For reference:
oc explain bc.spec.strategy.dockerStrategy.env.valueFrom.secretKeyRef
oc explain dc.spec.template.spec.containers.env.valueFrom.secretKeyRef
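As an illustration of the volume-mount option, here is a hypothetical sketch of reading a role password from the mounted secret at container start; the mount path and key name are assumptions, and in practice the ENTRYPOINT may be a shell script doing the equivalent:
from pathlib import Path

# Assumption: the DeploymentConfig mounts the secret at /etc/postgresql-secrets,
# which yields one file per key, e.g. /etc/postgresql-secrets/app-role-password.
SECRET_DIR = Path("/etc/postgresql-secrets")

def read_secret(key: str) -> str:
    # Secret volumes are backed by tmpfs in the running pod, so the value never
    # ends up in an image layer and cannot be extracted from the image itself.
    return (SECRET_DIR / key).read_text().strip()

app_role_password = read_secret("app-role-password")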
I have a properties file in a Spring Boot app which has Postgres instance details for both the database hosted in AWS and a local one.
Every time I check out the code from git I have to comment out the AWS Postgres entries and uncomment the local Postgres instance to work locally.
Then, when I want to check in, I have to do the opposite.
What is the smartest way to handle this configuration switching so that I don't have to do it every time?
N.B.: AWS deployment happens from GitHub via a Jenkins pipeline.
You should provide your database parameters as environment variables (in your IDE's project/run settings, for example) and reference them in your application.properties as placeholders. For example:
spring.datasource.url=${DATASOURCE_URL}
where DATASOURCE_URL is one of those environment variables.
That way you set your local parameters on your workstation, and on AWS you set the production parameters.
- Use environment variables: you can use your local settings as default values and set environment variables for AWS usage on your EC2 instance.
- Use profiles (e.g. a local and an aws profile, with a properties file per profile) and set the active profile using a command-line parameter or an environment variable on the EC2 instance.
Read more about:
- externalised configuration in Spring Boot
- Spring Profiles