How does one get config into volumes for ECS tasks? - amazon-ecs

I tend to use a bootstrap task which basically puts config in a volume (gets it from S3, etc.) and then the main task mounts this volume.
Is there a better way to handle this scenario?
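For illustration, the question describes a separate bootstrap task, but the same pattern is often folded into a single task definition as a non-essential bootstrap container that copies the config from S3 into a shared task volume before the main container starts. A sketch in CloudFormation YAML, with bucket, image, and path names as placeholders:

TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: app-with-config
    # TaskRoleArn (with S3 read access), CPU/memory and networking settings omitted for brevity
    Volumes:
      - Name: config                 # task-scoped volume shared by both containers
    ContainerDefinitions:
      - Name: bootstrap
        Image: amazon/aws-cli        # image entrypoint is `aws`
        Essential: false             # allowed to exit once the copy is done
        Command: ["s3", "cp", "s3://my-config-bucket/app/", "/config/", "--recursive"]
        MountPoints:
          - SourceVolume: config
            ContainerPath: /config
      - Name: app
        Image: my-app:latest
        Essential: true
        DependsOn:                   # start only after the bootstrap container succeeds
          - ContainerName: bootstrap
            Condition: SUCCESS
        MountPoints:
          - SourceVolume: config
            ContainerPath: /etc/myapp
            ReadOnly: true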

One other way to achieve this would be to map an EFS volume in your tasks. It's a bit more work to add the volume in the task definition (+ creating the volume out of band), but it may be worth it to get rid of the init task. This blog series talks about the why and how of ECS/EFS.
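The extra work in the task definition is roughly the following (a sketch in CloudFormation YAML; the filesystem ID and paths are placeholders, and the EFS filesystem and its mount targets must already exist):

TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: app-with-efs-config
    # task role, CPU/memory and networking settings omitted for brevity
    Volumes:
      - Name: config
        EFSVolumeConfiguration:
          FilesystemId: fs-12345678        # created out of band
          RootDirectory: /app-config
          TransitEncryption: ENABLED
    ContainerDefinitions:
      - Name: app
        Image: my-app:latest
        MountPoints:
          - SourceVolume: config
            ContainerPath: /etc/myapp
            ReadOnly: true

On Fargate this also needs platform version 1.4.0 or later.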

Related

How to share volume mounts between different cron jobs in kubernetes

I have a unique use case where I need to run a bunch of cron jobs every minute. These cron jobs have numerous volume mounts defined within them (50 or so) via PVCs and PVs. Clearly, mounting and un-mounting during every cron run is inefficient. Given the constraint that I cannot move away from cron jobs for my use case, is there a better way to share the volume mounts?
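For context, each job looks roughly like this (a trimmed sketch showing two of the ~50 claims; names are hypothetical):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: every-minute-job
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: my-worker:latest
              volumeMounts:
                - name: data-1
                  mountPath: /data/1
                - name: data-2
                  mountPath: /data/2
                # ...and so on for the remaining claims
          volumes:
            - name: data-1
              persistentVolumeClaim:
                claimName: pvc-data-1
            - name: data-2
              persistentVolumeClaim:
                claimName: pvc-data-2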
Thanks
K

Spring Batch Restartability on Kubernetes for File Operations

I want to learn the proper way to get at the processed files when restarting a Spring Batch application on Kubernetes. In particular, if the target type is a file, it gets deleted together with the pod after the job fails.
We are considering using a persistent volume, or backing up the created file somewhere such as a DB or an SFTP server by implementing a listener.
Does anyone have experience with persistent volumes (NFS or other solutions) for file operations? We are concerned about performance and unexpected problems. Do you have any suggestions?
Thank you.
You should not rely on the ephemeral file system of a Pod for files that should persist and survive a Job (Pod) failure.
You need to use a persistent volume for that, so that Spring Batch can find the (incomplete) output file in a restart scenario and resume writing where it left off.
If you want data persistence, you may start with hostPath volumes. This will restrict which nodes your pods may be spawned on, but it is the simplest option and gives you the best performance.
https://kubernetes.io/docs/concepts/storage/volumes/#hostpath
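As a concrete starting point, a hostPath-backed volume for the Spring Batch output might look like this (a minimal sketch; image, path, and volume names are assumptions):

apiVersion: v1
kind: Pod
metadata:
  name: batch-job
spec:
  restartPolicy: Never
  containers:
    - name: spring-batch-app
      image: my-batch-app:latest
      volumeMounts:
        - name: batch-output
          mountPath: /data/output        # Spring Batch writes its output file here
  volumes:
    - name: batch-output
      hostPath:
        path: /var/batch-output          # survives the Pod, but ties the Pod to this node
        type: DirectoryOrCreate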
If you want dynamic allocation, you will need to configure a storage solution such as GlusterFS, NFS, Ceph, etc.

Kubernetes - Best way to initialize persistent volume for a database deployment

I have a Neo4j service, but before the deployment starts up, I need to pre-fill it with data (about 2GB of data). Currently I have a Kubernetes Job that transforms the data from a CSV and formats it for the database using the neo4j-admin tool. It saves the formatted data to a persistent volume. After waiting for the job to complete, I mount the volume in the Neo4j container, and the container is effectively read-only on this data for the rest of its life.
Is there a better way to do this more automatically?
I don't want to have to wait for the job to complete to run another command to create the Neo4j deployment. I looked into initContainers, but that isn't suitable because I don't want to redo the data filling when a pod is re-created. I just want subsequent pods to read from the same persistent volume. Is there a way to wait for the job to complete first?
As Jobs can't natively spawn new objects once they finish (and if the container exits gracefully, using a PreStop hook to invoke further actions won't work), you might want to monitor the API objects instead.
Programmatically accessing the API to determine when the Job is finished, and then creating your Deployment object, might be a feasible, automated way to do it.
Doing it this way, you don't have to worry about redoing the data processing with initContainers, as you can essentially create the Deployment and remount the already existing volume.
Also, using the official Go library allows you to either run within the cluster, in a pod or externally.
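Once the Job is observed to be complete, the Deployment just remounts the already-populated claim; a minimal sketch (claim and image names are hypothetical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: neo4j
spec:
  replicas: 1
  selector:
    matchLabels:
      app: neo4j
  template:
    metadata:
      labels:
        app: neo4j
    spec:
      containers:
        - name: neo4j
          image: neo4j:4.4
          volumeMounts:
            - name: graph-data
              mountPath: /data          # pre-filled by the Job; treated as read-only by the application
      volumes:
        - name: graph-data
          persistentVolumeClaim:
            claimName: neo4j-data       # the PVC the Job wrote into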
I assume that your Neo4j application data won't be updated from your Neo4j deployment, based on your statement that the deployment loads the volume as read-only.
If that is the case, why do you want Kubernetes to do the data loading? Use object storage like S3 or Azure Data Lake and ensure that there is some data workflow pipeline that will update the object storage. There are many tools that provide data pipeline features, such as Oozie and Airflow.
In your deployment, you can then refer to the object storage via a Persistent Volume Claim.

Set the ECS Cloudformation Update Stack timeout?

When updating a Cloudformation EC2 Container Service (ECS) Stack with a new Container Image, is there any way to control the timeout so if the service does not stabilize it rolls back automatically?
The UpdatePolicy attribute which is part of the Auto Scaling Group does not help since instances are not being created.
I also tried a WaitCondition but have not been able to get that to work.
The stack essentially just stays in the UPDATE_IN_PROGRESS state until it hits the default timeout (~3 hours), or you cancel the update.
Ideally we would be able to have the stack timeout after a short period of time.
This is what my Cloudformation template looks like:
https://s3.amazonaws.com/aws-rga-cw-public/ops/cfn/ecs-cluster-asg-elb-cfn.yaml
Thanks.
I've created a workaround for this problem until AWS creates an ECS UpdatePolicy and CreationPolicy that allow for resource signaling:
Use AWS::CloudFormation::WaitCondition with a Macro that will create new WaitCondition resources when the service is expected to update. Signal the wait condition with a non-essential container attached to the task.
Example: https://github.com/deuscapturus/cloudformation-macro-WaitConditionUpdate/blob/master/example-ecs-service.yaml
The Macro for the above example can be found here: https://github.com/deuscapturus/cloudformation-macro-WaitConditionUpdate
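For reference, the underlying WaitCondition pair that gets re-created on each update looks roughly like this (a plain-CloudFormation sketch, not the macro itself; the timeout value is arbitrary):

DeployWaitHandle:
  Type: AWS::CloudFormation::WaitConditionHandle
DeployWaitCondition:
  Type: AWS::CloudFormation::WaitCondition
  Properties:
    Handle: !Ref DeployWaitHandle
    Timeout: '600'        # fail the update if no signal arrives within 10 minutes
    Count: 1

The non-essential container attached to the task then signals the presigned URL that !Ref returns for the handle; if no signal arrives before the timeout, the update fails and rolls back.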
My workaround for this problem is to run a script in the background before triggering the stack update:
./deployment-breaker.sh &
And for the script
#!/bin/bash
sleep 600
deploymentStatus=$(aws cloudformation describe-stacks --stack-name STACK_NAME | jq XXX)  # replace XXX with a filter that extracts the stack status
if [[ $deploymentStatus == YOUR_TERMINATE_CONDITION ]]; then
  aws cloudformation cancel-update-stack --stack-name STACK_NAME
fi
If your WaitCondition is in the original create, you need to rename it (and the Handle). Once a WaitCondition has been signaled as complete, it will always be complete. If you rename it and do an update, the original WaitCondition and Handle will be dropped and the new ones created and signaled.
If you don't want to have to modify your template, you might be able to use Lambda and custom resources to create a unique WaitCondition via the AWS CLI for each update.
It's not possible at the moment with the provided CloudFormation types. I have the same problem and I might create a custom CloudFormation resource (using AWS Lambda) to replace my AWS::ECS::Service.
The other alternative is to use nested stacks to wrap the AWS::ECS::Service resources — it won't solve the problem, but it at least will isolate the individual service and the rest of the stack will be in a good state. My stacks have multiple services and this would help, but the custom resource is the best option so far (I know other people that did the same thing).

Coordinating Job containers and Volumes in a CI system

I'm working on a tinker Kubernetes-based CI system, where each build gets launched as a Job. I'm running these much like Drone CI does, in that each step in the build is a separate container. In my k8s CI case, I'm running each step as a container within a Job pod. Here's the behavior I'm after:
A build volume is created. All steps will mount this. A Job is fired off with all of the steps defined as separate containers, in order of desired execution.
The git step (container) runs, mounting the shared volume and cloning the sources.
The 'run tests' step mounts the shared volume to a container spawned from an image with all of the dependencies pre-installed.
If our tests pass, we proceed to the Slack announcement step, which is another container that announces our success.
I'm currently using a single Job pod with an emptyDir Volume for the shared build space. I did this so that we don't have to wait while a volume gets shuffled around between nodes/Pods. This also seemed like a nice way to ensure that things get cleaned up automatically at build exit.
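For reference, that layout is roughly the following (a trimmed sketch; names, images, and commands are hypothetical):

apiVersion: batch/v1
kind: Job
metadata:
  name: build-42
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: git
          image: alpine/git
          command: ["git", "clone", "https://example.com/repo.git", "/workspace"]
          volumeMounts:
            - name: workspace
              mountPath: /workspace
        - name: run-tests
          image: my-test-image:latest
          command: ["sh", "-c", "cd /workspace && make test"]
          volumeMounts:
            - name: workspace
              mountPath: /workspace
        - name: slack-announce
          image: my-notifier:latest
          volumeMounts:
            - name: workspace
              mountPath: /workspace
      volumes:
        - name: workspace
          emptyDir: {}              # shared build space, cleaned up with the Pod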
The problem is that if I fire up a multi-container Job with all of the above steps, they execute at the same time, meaning the 'run tests' step could fire before the 'git' step.
I've thought about coming up with some kind of logic in each of these containers to sleep until a certain unlock/"I'm done!" file appears in the shared volume, signifying the dependency step(s) are done, but this seems complicated enough to ask about alternatives before proceeding.
I could see giving in and using multiple Jobs with a coordinating Job, but then I'm stuck getting into Volume Claim territory (which is a lot more complicated than emptyDir).
To sum up the question:
Is my current approach worth pursuing, and if so, how to sequence the Job's containers? I'm hoping to come up with something that will work on AWS/GCE and bare metal deployments.
I'm hesitant to touch PVCs, since the management and cleanup bit is not something I want my system to be responsible for. I'm also not wanting to require networked storage when emptyDir could work so well.
Edit: Please don't suggest using another existing CI system, as this isn't helpful. I am doing this for my own gratification and experimentation. This tinker CI system is unlikely to ever be anything but my toy.
If you want all the build steps to run in containers, GitLab CI or Concourse CI would probably be a much better fit for you. I don't have experience with fabric8.io, but Frank.Germain suggests that it will also work.
Once you start getting complex enough that you need signaling between containers to order the build steps it becomes much more productive to use something pre-rolled.
As an option you could use a static volume (i.e. a host path) as an artifact cache, and trigger the next container in the sequence from the current container, mounting the same volume between the stages. You could then just add a step to the beginning or end of the build to clean up after your pipeline has been run.
To be clear: I don't think that rolling your own CI is the most effective way to handle this, as there are already systems in place that will do exactly what you are looking for