Setting Dynamic Properties in Dataproc Job

Here's what I am trying to accomplish. I want to create a workflow template so that I can spin up a cluster, run a job, and delete the cluster. Within the job, I want to pass in properties that can be set dynamically. For example, set a property to the current date.
Below is a simple example. It uses the date function correctly, but that is evaluated at creation time, so it looks like it will always be 12/31/2020 if I set up the workflow today. I know I can delete the job and add it back to the template for each run, but I was hoping for a simpler way.
gcloud dataproc workflow-templates create workflow-mk-test --region us-east1 --project data-engineering-doz4
gcloud dataproc workflow-templates set-managed-cluster workflow-mk-test \
--cluster-name=cluster-mk-test \
--project data-engineering-doz4 \
--image-version=1.3-ubuntu18 \
--bucket data-engineering-dev \
--region us-east1 \
--subnet ml-data-engineering-east1 \
--no-address \
--zone us-east1-b \
--master-machine-type n1-standard-4 \
--master-boot-disk-size 15 \
--num-workers 2 \
--worker-machine-type n1-standard-4 \
--worker-boot-disk-size 15
gcloud dataproc workflow-templates add-job pyspark gs://data-engineering-dev/jobs/ \
--workflow-template=workflow-mk-test \
--step-id=test-job \
--region=us-east1 \
--project=data-engineering-doz4 \
-- date `date -v -1d '+%Y/%m/%d'` \
--output-location s3n://missionlane-data-engineering-dev-us-east-1/delete-me/`date -v -1d '+%Y/%m/%d'`

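Dynamic properties generated by running shell commands are not a supported feature of Dataproc jobs: the backticks are expanded by your local shell when you run add-job, so the template only stores the resulting literal value. In this case, you might want to consider making the logic part of your job, i.e., getting the current date dynamically in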
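The behaviour from the question can be reproduced without Dataproc. The following is a minimal sketch, assuming a POSIX shell. The variable name `stored_args` is hypothetical and only stands in for whatever the template saves. It shows that command substitution is expanded once, when the command line is built, so only a literal string is ever stored.

```shell
#!/bin/sh
# Command substitution runs when the shell builds the command line,
# i.e. at the moment the add-job command is typed, not when the job runs.
# stored_args is a hypothetical stand-in for the arguments kept in the template.
stored_args="--date $(date '+%Y/%m/%d')"

# The saved value is a plain string with no reference to the date command,
# so it cannot change on later runs.
printf '%s\n' "$stored_args"
```

Because the stored value is fixed, anything that must be computed per run has to be computed inside the job itself or supplied again at run time.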


How to make Powershell pass a list as argument to gcloud

I want to submit my neural network model to google cloud via the following command as in the tutorial:
gcloud ai-platform jobs submit training ${JOB_NAME} \
--region=us-central1 \
--scale-tier=CUSTOM \
--master-machine-type=n1-standard-8 \
--master-accelerator=type=nvidia-tesla-p100,count=1 \
--job-dir=${JOB_DIR} \
--package-path=./trainer \
--module-name=trainer.task \
-- \
--train-files=gs://cloud-samples-data/ai-platform/chicago_taxi/training/small/taxi_trips_train.csv \
--eval-files=gs://cloud-samples-data/ai-platform/chicago_taxi/training/small/taxi_trips_eval.csv \
--num-epochs=10 \
--batch-size=100 \
I am working in PowerShell and have a problem with the master-accelerator argument, which should be a dictionary. I don't know how to pass such a value to gcloud. I have tried #{count=1; type=...}, but received a Bad syntax for dict arg: [#] error.
How can I pass a list of parameters in PowerShell such that the gcloud submit command accepts it?
I thank you in advance for your help.
I have also tried using alternative delimiters such as ^^^^:^^^^, but this still does not work (Invalid delimeter).

ERROR: (gcloud.sql.instances.create) Projects instance [my-project] not found: The requested flag is either misspelled or unsupported by Cloud SQL

When I try to create a Cloud SQL instance using gcloud, I get this error. Any thoughts, folks?
--database-version=$DB_VERSION \
--cpu=$NUMBER_CPUS \
--memory=$MEMORY_SIZE \
--storage-type=$STORAGE_TYPE \
--storage-size=$STORAGE_SIZE \
--storage-auto-increase \
--database-flags=$DATABASE_FLAGS \
--region=$REGION \
--authorized-networks=$NETWORKS \
--assign-ip \
It doesn't matter whether or not I include the project ID in this command.

In `aws cloudformation deploy --parameter-overrides`, how to pass multiple values to `List<AWS::EC2::Subnet::Id>` parameter?

I am using this CloudFormation template
The List parameter I'm trying to pass values to is:
"Subnets" : {
"Type" : "List<AWS::EC2::Subnet::Id>",
"Description" : "The list of SubnetIds in your Virtual Private Cloud (VPC)",
"ConstraintDescription" : "must be a list of at least two existing subnets associated with at least two different availability zones. They should be residing in the selected Virtual Private Cloud."
I've written a utility script that looks like this:
echo -e "\n==Deploying\n"
aws cloudformation deploy \
--region $REGION \
--profile $CLI_PROFILE \
--stack-name $STACK_NAME \
--template-file \
--no-fail-on-empty-changeset \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides \
VpcId=$VPC_ID \
Subnets="$SUBNET1 $SUBNET2" \ #<---------------this fails
InstanceType=$EC2_INSTANCE_TYPE \
KeyName=$KEY_NAME \
If I deploy this, after a while my stack fails to deploy saying that a Subnet with the value "subnet-abcdef subnet-ghijlmn" does not exist.
The correct way to pass values to a list parameter is to separate them with commas:
aws cloudformation deploy --parameter-overrides Subnets="$SUBNET1,$SUBNET2"
This will work.
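When the subnet IDs live in separate variables, the comma-separated value can be built once and reused. This is a minimal sketch, assuming a POSIX shell. `join_with_commas` is a hypothetical helper, not part of the AWS CLI, and the subnet IDs are the placeholder values from the error message in the question.

```shell
#!/bin/sh
# join_with_commas (hypothetical helper): joins all arguments with commas.
# The body runs in a subshell, so changing IFS does not leak to the caller.
join_with_commas() (
  IFS=,
  printf '%s\n' "$*"
)

# Placeholder subnet IDs taken from the error message in the question.
SUBNET1=subnet-abcdef
SUBNET2=subnet-ghijlmn

SUBNETS=$(join_with_commas "$SUBNET1" "$SUBNET2")

# This is the value to hand to --parameter-overrides.
printf 'Subnets=%s\n' "$SUBNETS"
```

The resulting string can then be passed as `Subnets="$SUBNETS"` in the deploy script.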
I tried every solution I could find online, and none worked.
According to the documentation, you should escape the comma with double backslashes. I tried that, and it didn't work either.
What worked FOR ME (apparently this is very environment-dependent) was the command below, escaping the comma with just one backslash.
aws cloudformation create-stack --stack-name teste-memdb --template-body file://memorydb.yml --parameters ParameterKey=VpcId,ParameterValue=vpc-xxxx ParameterKey=SubnetIDs,ParameterValue=subnet-xxxxx\,subnet-yyyy --profile whatever
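One possible reason for the environment dependence is the shell's own handling of backslashes before the AWS CLI ever sees the argument. That is an assumption about the setup above, not something the answer confirms. A minimal sketch in a POSIX shell, using the placeholder subnet IDs from the command above:

```shell
#!/bin/sh
# Unquoted: the shell consumes the backslash, so the program receives a plain comma.
unquoted=$(printf '%s' subnet-xxxxx\,subnet-yyyy)

# Single-quoted: the backslash is preserved and reaches the program unchanged.
quoted=$(printf '%s' 'subnet-xxxxx\,subnet-yyyy')

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

Printing the argument this way before running the real command shows exactly what string the CLI will receive in a given shell.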
From the documentation:
A list/array can be passed just like a Python list:
'["value1", "value2", "value3"]'
Also worth noting: the AWS CLI itself is written in Python.

How Can I Debug apiserver Startup When No Logs Are Generated?

I am trying to install the aws-encryption-provider by following its installation steps. After I added the --encryption-provider-config=/etc/kubernetes/aws-encryption-provider-config.yaml parameter to /etc/kubernetes/manifests/kube-apiserver.yaml, the apiserver process did not restart, and I do not see any error messages.
What technique can I use to see errors created when apiserver starts?
Realizing that the apiserver is running inside a docker container, I connected to one of my controller nodes using SSH. Then I started a container using the following command to get a shell prompt using the same docker image that apiserver is using.
docker run \
-it \
--rm \
--entrypoint /bin/sh \
--volume /etc/kubernetes:/etc/kubernetes:ro \
--volume /etc/ssl/certs:/etc/ssl/certs:ro \
--volume /etc/pki:/etc/pki:ro \
--volume /etc/pki/ca-trust:/etc/pki/ca-trust:ro \
--volume /etc/pki/tls:/etc/pki/tls:ro \
--volume /etc/ssl/etcd/ssl:/etc/ssl/etcd/ssl:ro \
--volume /etc/kubernetes/ssl:/etc/kubernetes/ssl:ro \
--volume /var/run/kmsplugin:/var/run/kmsplugin \
Once inside that container, I could run the same command that is set up in kube-apiserver.yaml. This command was:
kube-apiserver \
--encryption-provider-config=/etc/kubernetes/aws-encryption-provider-config.yaml \
--advertise-address= \
--service-node-port-range=30000-32767 \
--storage-backend=etcd3 \
I elided the bulk of the command since you'll need to get specific values from your own kube-apiserver.yaml file.
Using this technique showed me the error message:
Error: error while parsing encryption provider configuration file
"/etc/kubernetes/aws-encryption-provider-config.yaml": error while parsing
file: resources[0].providers[0]: Invalid value:
AESCBC:(*config.AESConfiguration)(nil), Secretbox:(*config.SecretboxConfiguration)
(nil), Identity:(*config.IdentityConfiguration)(nil), KMS:(*config.KMSConfiguration)
(nil)}: provider does not contain any of the expected providers: KMS, AESGCM,
AESCBC, Secretbox, Identity
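To start a container from the same image as the apiserver, as described above, the image name has to be read from the static pod manifest. This is a minimal sketch, assuming the manifest contains a conventional `image:` line. `manifest_image` is a hypothetical helper, and the manifest path in the usage comment is the one from the question.

```shell
#!/bin/sh
# manifest_image (hypothetical helper): print the first "image:" value
# found in a pod manifest. Leading spaces and a list dash are stripped.
# Keys such as imagePullPolicy are not matched because the colon must
# follow "image" directly.
manifest_image() {
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' "$1" | head -n 1
}

# Usage on a controller node (path taken from the question):
#   manifest_image /etc/kubernetes/manifests/kube-apiserver.yaml
```

The printed value can then be supplied as the image argument of the docker run command.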

How to get output of gcloud composer command?

I'm executing gcloud composer commands:
gcloud composer environments run airflow-composer \
--location europe-west1 --user-output-enabled=true \
backfill -- -s 20171201 -e 20171208 dags.my_dag_name \
kubeconfig entry generated for europe-west1-airflow-compos-007-gke.
It's a regular Airflow backfill. The command above prints the results only at the end of the whole backfill range. Is there any way to get the output in a streaming manner, so that each time a DAG run is backfilled it is printed to standard output, as with the regular Airflow CLI?