Flink Kubernetes deployment - how to provide S3 credentials from HashiCorp Vault?

I'm trying to deploy a Flink stream processor to a Kubernetes cluster with the help of the official Flink Kubernetes operator.
The Flink app also uses MinIO as its state backend. Everything worked fine until I tried to provide the credentials from HashiCorp Vault in the following way:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: flink-app
  namespace: default
spec:
  serviceAccount: sa-example
  podTemplate:
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-template
    spec:
      serviceAccountName: default:sa-example
      containers:
        - name: flink-main-container
          # ....
  flinkVersion: v1_14
  flinkConfiguration:
    presto.s3.endpoint: https://s3-example-api.dev.net
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3p://example-flink/example-1/high-availability/
    high-availability.cluster-id: example-1
    high-availability.namespace: example
    high-availability.service-account: default:sa-example
    # presto.s3.access-key: *
    # presto.s3.secret-key: *
    presto.s3.path-style-access: "true"
    web.upload.dir: /opt/flink
  jobManager:
    podTemplate:
      apiVersion: v1
      kind: Pod
      metadata:
        name: job-manager-pod-template
        annotations:
          vault.hashicorp.com/namespace: "/example/dev"
          vault.hashicorp.com/agent-inject: "true"
          vault.hashicorp.com/agent-init-first: "true"
          vault.hashicorp.com/agent-inject-secret-appsecrets.yaml: "example/Minio"
          vault.hashicorp.com/role: "example-serviceaccount"
          vault.hashicorp.com/auth-path: auth/example
          vault.hashicorp.com/agent-inject-template-appsecrets.yaml: |
            {{- with secret "example/Minio" -}}
            presto.s3.access-key: {{.Data.data.accessKey}}
            presto.s3.secret-key: {{.Data.data.secretKey}}
            {{- end }}
When I comment out the presto.s3.access-key and presto.s3.secret-key config values in the flinkConfiguration, replace them with the HashiCorp Vault annotations listed above, and try to provide them programmatically at runtime:
val configuration: Configuration = getSecretsFromFile("/vault/secrets/appsecrets.yaml")
val env = org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.getExecutionEnvironment(configuration)
I receive the following error message:
java.io.IOException: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)), SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey), WebIdentityTokenCredentialsProvider: You must specify a value for roleArn and roleSessionName, com.amazonaws.auth.profile.ProfileCredentialsProvider#5331f738: profile file cannot be null, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper#bc0353f: Failed to connect to service endpoint: ]
at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3OutputStream.uploadObject(PrestoS3FileSystem.java:1278) ~[flink-s3-fs-presto-1.14.2.jar:1.14.2]
at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3OutputStream.close(PrestoS3FileSystem.java:1226) ~[flink-s3-fs-presto-1.14.2.jar:1.14.2]
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) ~[flink-s3-fs-presto-1.14.2.jar:1.14.2]
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) ~[flink-s3-fs-presto-1.14.2.jar:1.14.2]
at org.apache.flink.fs.s3presto.common.HadoopDataOutputStream.close(HadoopDataOutputStream.java:52) ~[flink-s3-fs-presto-1.14.2.jar:1.14.2]
at org.apache.flink.runtime.blob.FileSystemBlobStore.put(FileSystemBlobStore.java:80) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
at org.apache.flink.runtime.blob.FileSystemBlobStore.put(FileSystemBlobStore.java:72) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:385) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
at org.apache.flink.runtime.blob.BlobServer.moveTempFileToStore(BlobServer.java:680) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:350) [flink-dist_2.12-1.14.2.jar:1.14.2]
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:110) [flink-dist_2.12-1.14.2.jar:1.14.2]
I initially also tried to append the secrets to flink-conf.yaml in docker-entrypoint.sh, based on this documentation - Configure Access Credentials:
if [ -f '/vault/secrets/appsecrets.yaml' ]; then
  (echo && cat '/vault/secrets/appsecrets.yaml') >> "$FLINK_HOME/conf/flink-conf.yaml"
fi
The question is how to provide the S3 credentials at runtime, since the Flink operator mounts flink-conf.yaml from a config map and the append fails with flink-conf.yaml: Read-only file system.
Thank you

The Flink Kubernetes operator does not support this. Strictly speaking, it is not a limitation of the operator itself: the underlying Flink native Kubernetes integration lacks the support. There is a separate ticket for this on the Kubernetes operator side - FLINK-27491.
As a workaround, you can set up an init container that reads the secrets from Vault and then updates the config map through the Kubernetes API. The updated config map then contains the secrets, and they are visible to the job manager and all of its task managers. The Flink cluster starts only after the init container has updated the config map, so the cluster should see the updated values.
A simple example of updating the config map from an init container can be found here. In this example, the config map is updated with a simple curl command. In theory, you can use any lightweight client to update the config map like this.
A side note: if possible, I would suggest using an AWS IAM role rather than static IAM credentials, as an IAM role is more secure.
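As a rough illustration of what such an init container could do, here is a minimal Python sketch. It only covers the data preparation: it merges the key: value lines rendered by Vault into the text of flink-conf.yaml and builds a JSON merge-patch body for the config map. The function names are hypothetical, and the config map name, the namespace and the service account permissions are assumptions that depend on your deployment. The resulting body would be sent with an HTTP PATCH to /api/v1/namespaces/<namespace>/configmaps/<name> using the Content-Type application/merge-patch+json, for example with curl as in the linked example.

```python
import json


def merge_secrets_into_conf(conf_text: str, secrets_text: str) -> str:
    """Apply the 'key: value' lines of secrets_text to a flink-conf.yaml text.

    Keys already present (and not commented out) are overwritten in place;
    keys that are missing are appended at the end.
    """
    secrets = {}
    for line in secrets_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        secrets[key.strip()] = value.strip()

    merged = []
    for line in conf_text.splitlines():
        key = line.partition(":")[0].strip()
        if key in secrets:
            merged.append(f"{key}: {secrets.pop(key)}")
        else:
            merged.append(line)
    # Whatever is left was not present in the original file.
    merged.extend(f"{key}: {value}" for key, value in secrets.items())
    return "\n".join(merged) + "\n"


def build_configmap_patch(conf_text: str) -> str:
    """Build a JSON merge-patch body replacing the flink-conf.yaml entry."""
    return json.dumps({"data": {"flink-conf.yaml": conf_text}})


if __name__ == "__main__":
    # Paths are assumptions: the Vault agent file from the question and a
    # local copy of the current flink-conf.yaml content.
    with open("/vault/secrets/appsecrets.yaml") as f:
        secrets_text = f.read()
    with open("flink-conf.yaml") as f:
        conf_text = f.read()
    print(build_configmap_patch(merge_secrets_into_conf(conf_text, secrets_text)))
```

The service account used by the init container needs permission to patch config maps in the namespace for this to work.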

Related

How to collect log data of a specific namespace in Openshift?

I have a cluster with many namespaces.
I'm trying to log data from a specific namespace in my OpenShift cluster, but it is logging the data from all the namespaces. I tried to follow the OpenShift documentation regarding logging, but there is no mention of scoping the log data.
I followed this documentation:
https://docs.openshift.com/container-platform/4.7/logging/cluster-logging.html
I'm using fluentd as the log collector.
With Cluster Logging on OpenShift, you can forward logs from only the namespaces or the labeled Pods you select. *1
A sample CR that forwards logs from the my-project namespace to the Elasticsearch instance deployed by Cluster Logging could look as follows:
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
    - name: my-app-logs
      application:
        namespaces:
          - my-project
  pipelines:
    - name: my-app
      inputRefs:
        - my-app-logs
      outputRefs:
        - default
You can customize the inputs field as you want. Pods can also be selected with a matchLabels expression. *2
The default entry in outputRefs sends the logs to the default Elasticsearch of Cluster Logging.
*1: https://docs.openshift.com/container-platform/4.11/logging/cluster-logging-external.html
*2: https://docs.openshift.com/container-platform/4.7/logging/cluster-logging-external.html#cluster-logging-collector-log-forward-logs-from-application-pods_cluster-logging-external

How to configure microk8s Kubernetes to use private containers in https://hub.docker.com/?

The microk8s document "Working with a private registry" leaves me unsure what to do. The Secure registry portion says Kubernetes does it one way (without indicating whether or not Kubernetes' way applies to microk8s), and microk8s uses containerd inside its implementation.
My YAML file contains a reference to a private container on dockerhub.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blaw
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blaw
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: blaw
    spec:
      containers:
        - image: johngrabner/py_blaw_service:v0.3.10
          name: py-transcribe-service
When I microk8s kubectl apply this file and do a microk8s kubectl describe, I get:
Warning Failed 16m (x4 over 18m) kubelet Failed to pull image "johngrabner/py_blaw_service:v0.3.10": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/johngrabner/py_blaw_service:v0.3.10": failed to resolve reference "docker.io/johngrabner/py_blaw_service:v0.3.10": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
I have verified that I can download this repo from a console doing a docker pull command.
Pods using public containers work fine in microk8s.
The file /var/snap/microk8s/current/args/containerd-template.toml already contains something to make dockerhub work since public containers work. Within this file, I found
# 'plugins."io.containerd.grpc.v1.cri".registry' contains config related to the registry
[plugins."io.containerd.grpc.v1.cri".registry]
# 'plugins."io.containerd.grpc.v1.cri".registry.mirrors' are namespace to mirror mapping for all namespaces.
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io", ]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:32000"]
endpoint = ["http://localhost:32000"]
The above does not appear related to authentication.
On the internet, I found instructions to create a secret to store credentials, but this does not work either.
microk8s kubectl create secret generic regcred --from-file=.dockerconfigjson=/home/john/.docker/config.json --type=kubernetes.io/dockerconfigjson
You have created the secret, but you also have to set up your deployment/pod to use that secret in order to download the image. This can be achieved with imagePullSecrets, as described in the microk8s document you mentioned.
Since you already created your secret, you just have to reference it in your deployment:
...
    spec:
      containers:
        - image: johngrabner/py_blaw_service:v0.3.10
          name: py-transcribe-service
      imagePullSecrets:
        - name: regcred
...
For more reading check how to Pull an Image from a Private Registry.

Kubernetes exposes more environment variables than expected

I've faced a strange behaviour with K8s pods running in an AWS EKS cluster (version 1.14). The services are deployed via Helm 3 charts. The problem is that a pod receives more environment variables than expected.
The pod specification says that variables should be populated from a config map.
apiVersion: v1
kind: Pod
metadata:
  name: apigw-api-gateway-59cf5bfdc9-s6hrh
  namespace: development
spec:
  containers:
    - env:
        - name: JAVA_OPTS
          value: -server -XX:MaxRAMPercentage=75.0 -XX:+UseContainerSupport -XX:+HeapDumpOnOutOfMemoryError
        - name: GATEWAY__REDIS__HOST
          value: apigw-redis-master.development.svc.cluster.local
      envFrom:
        - configMapRef:
            name: apigw-api-gateway-env # <-- this is the map
      # the rest of spec is hidden
The config map apigw-api-gateway-env has this specification:
apiVersion: v1
data:
  GATEWAY__APP__ADMIN_LOPUSH: ""
  GATEWAY__APP__CUSTOMER_LOPUSH: ""
  GATEWAY__APP__DISABLE_RATE_LIMITS: "true"
  # here are other 'GATEWAY__' envs
  JMX_AUTH: "false"
  JMX_ENABLED: "true"
  # here are other 'JMX_' envs
kind: ConfigMap
metadata:
  name: apigw-api-gateway-env
  namespace: development
If I request a list of environment variables, I can find values from a different service. These values are not specified in the config map of the 'apigw' application; they are stored in a map for a 'lopush' application. Here is a sample.
/ # env | grep -i lopush | sort | head -n 4
GATEWAY__APP__ADMIN_LOPUSH=<hidden>
GATEWAY__APP__CUSTOMER_LOPUSH=<hidden>
LOPUSH_GAME_ADMIN_MOBILE_PORT=tcp://172.20.248.152:5050
LOPUSH_GAME_ADMIN_MOBILE_PORT_5050_TCP=tcp://172.20.248.152:5050
I've also noticed that this behaviour is somehow related to the order in which the services were launched. That could be just because some config maps didn't exist at that moment. For now it seems like the pod receives variables from all config maps in the current namespace.
Has anyone faced this issue before? Is it possible that there are other criteria which force K8s to populate the environment from other maps?
If you mean the _PORT stuff, that's for compatibility with the old Docker Container Links system. Every service that already exists in the namespace when the pod starts automatically gets a set of these variables, to make it easier to move things from older Docker-based systems.
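To make the naming scheme concrete, here is a small Python sketch that reproduces the variable names Kubernetes derives for a single-port service. The function name is hypothetical, and the service name lopush-game-admin-mobile is an assumption inferred from the variables in the question (the name is upper-cased and dashes become underscores). It is only a model of the naming convention, not Kubernetes code.

```python
def service_link_env(name: str, cluster_ip: str, port: int, protocol: str = "TCP") -> dict:
    """Model the Docker-links-style variables injected for one service port."""
    prefix = name.upper().replace("-", "_")
    scheme = protocol.lower()
    url = f"{scheme}://{cluster_ip}:{port}"
    port_prefix = f"{prefix}_PORT_{port}_{protocol.upper()}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": url,
        port_prefix: url,
        f"{port_prefix}_PROTO": scheme,
        f"{port_prefix}_PORT": str(port),
        f"{port_prefix}_ADDR": cluster_ip,
    }


if __name__ == "__main__":
    # Service name is assumed; IP and port are taken from the question's output.
    env = service_link_env("lopush-game-admin-mobile", "172.20.248.152", 5050)
    for key in sorted(env):
        print(f"{key}={env[key]}")
```

None of these variables come from a config map, which is why they show up even though the 'apigw' config map does not define them.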

Standard way of keeping Dockerhub credentials in Kubernetes YAML resource

I am currently implementing a CI/CD pipeline using Docker, Kubernetes and Jenkins for my microservices deployment, and I am testing the pipeline using the public repository that I created on Dockerhub.com. When I tried the deployment using a Kubernetes Helm chart, I was able to add all my credentials in the values.yaml file - the default configuration file that is generated when creating a Helm chart.
Confusion
Now I have removed my Helm chart, and I am only using deployment and service in plain YAML files. So how can I add my Dockerhub credentials here?
Do I need to use an environment variable? Or do I need to create a separate YAML file for the credentials and reference it in the Deployment.yaml file?
If I use the imagePullSecrets way, how can I create a separate YAML file for the credentials?
From the Kubernetes point of view (see Pull an Image from a Private Registry), you can create a secret and reference it in your YAML (Pod/Deployment).
Steps:
1. Create a Secret by providing credentials on the command line:
kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
2. Create a Pod that uses your Secret (example pod):
apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
    - name: private-reg-container
      image: <your-private-image>
  imagePullSecrets:
    - name: regcred
You can pass the Dockerhub credentials as environment variables in Jenkins only. The imagePullSecrets are to be created as per the Kubernetes docs; as they are a one-time thing, you can add them directly to the required clusters.
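For the question of keeping the secret in a separate file, the following Python sketch builds a manifest equivalent in shape to what kubectl create secret docker-registry produces: a Secret of type kubernetes.io/dockerconfigjson whose .dockerconfigjson entry is the base64-encoded Docker config. The function name and the example credentials are hypothetical; https://index.docker.io/v1/ is the server value commonly used for Docker Hub. The output is JSON, which kubectl apply -f accepts just like YAML.

```python
import base64
import json


def docker_registry_secret(name: str, server: str, username: str,
                           password: str, email: str = "") -> dict:
    """Build a kubernetes.io/dockerconfigjson Secret manifest as a dict."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    entry = {"username": username, "password": password, "auth": auth}
    if email:
        entry["email"] = email
    docker_config = {"auths": {server: entry}}
    encoded = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }


if __name__ == "__main__":
    # Placeholder credentials; replace with your own before applying.
    manifest = docker_registry_secret(
        "regcred", "https://index.docker.io/v1/", "example-user", "example-password"
    )
    print(json.dumps(manifest, indent=2))
```

Keep in mind that the generated file contains the credentials in base64 only, which is an encoding and not encryption, so it should not be committed next to the Deployment.yaml in plain form.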

How to specify a GKE node pool configuration in a YAML file instead of using gcloud container node-pools create?

It seems that the only way to create node pools on Google Kubernetes Engine is with the command gcloud container node-pools create. I would like to have all the configuration in a YAML file instead. What I tried is the following:
apiVersion: v1
kind: NodeConfig
metadata:
  annotations:
    cloud.google.com/gke-nodepool: ares-pool
spec:
  diskSizeGb: 30
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-1
  metadata:
    disable-legacy-endpoints: 'true'
  oauthScopes:
    - https://www.googleapis.com/auth/devstorage.read_only
    - https://www.googleapis.com/auth/logging.write
    - https://www.googleapis.com/auth/monitoring
    - https://www.googleapis.com/auth/service.management.readonly
    - https://www.googleapis.com/auth/servicecontrol
    - https://www.googleapis.com/auth/trace.append
  serviceAccount: default
But kubectl apply fails with:
error: unable to recognize "ares-pool.yaml": no matches for kind "NodeConfig" in version "v1"
I am surprised that Google yields almost no relevant results for all my searches. The only documentation that I found was the one on Google Cloud, which is quite incomplete in my opinion.
Node pools are not Kubernetes objects, they are part of the Google Cloud API. Therefore Kubernetes does not know about them, and kubectl apply will not work.
What you actually need is an approach called "infrastructure as code": code that tells GCP what kind of node pool you want.
If you don't strictly need YAML, you can check out Terraform that handles this use case. See: https://terraform.io/docs/providers/google/r/container_node_pool.html
You can also look into Google Deployment Manager or Ansible (it has GCP module, and uses YAML syntax), they also address your need.
I don't know if it exactly answers your needs, but if you want to do IaC in general with Kubernetes, you can use Crossplane CRDs. If you already have a running cluster, you just have to install their Helm chart, and you can provision a cluster this way:
apiVersion: container.gcp.crossplane.io/v1beta1
kind: GKECluster
metadata:
  name: gke-crossplane-cluster
spec:
  forProvider:
    initialClusterVersion: "1.19"
    network: "projects/development-labs/global/networks/opsnet"
    subnetwork: "projects/development-labs/regions/us-central1/subnetworks/opsnet"
    ipAllocationPolicy:
      useIpAliases: true
    defaultMaxPodsConstraint:
      maxPodsPerNode: 110
And then you can define an associated node pool as follows:
apiVersion: container.gcp.crossplane.io/v1alpha1
kind: NodePool
metadata:
  name: gke-crossplane-np
spec:
  forProvider:
    autoscaling:
      autoprovisioned: false
      enabled: true
      maxNodeCount: 2
      minNodeCount: 1
    clusterRef:
      name: gke-crossplane-cluster
    config:
      diskSizeGb: 100
      # diskType: pd-ssd
      imageType: cos_containerd
      labels:
        test-label: crossplane-created
      machineType: n1-standard-4
      oauthScopes:
        - "https://www.googleapis.com/auth/devstorage.read_only"
        - "https://www.googleapis.com/auth/logging.write"
        - "https://www.googleapis.com/auth/monitoring"
        - "https://www.googleapis.com/auth/servicecontrol"
        - "https://www.googleapis.com/auth/service.management.readonly"
        - "https://www.googleapis.com/auth/trace.append"
    initialNodeCount: 2
    locations:
      - us-central1-a
    management:
      autoRepair: true
      autoUpgrade: true
If you want, you can find a full example of GKE provisioning with Crossplane here.