EFS storage growing too big - amazon-ecs

We have an ECS Fargate cluster that runs the fluentd application for collecting logs and routing them to Elasticsearch. Logs are buffered on disk (file buffer) before being routed to the destination. Since we are using Fargate, we mount the buffer path /var/log/fluentd/buffer/ on EFS.
What we would ideally expect is that the data in the buffer path is flushed to Elasticsearch and the buffer directory is then deleted. However, we see a huge number of these buffer directories left over from tasks that died and restarted several months ago.
So when an ECS task dies and comes back up again (autoscaling), it mounts a new /var/log/fluentd/buffer/ path on EFS while the old buffer directories are still held there. I am not sure if it is EFS that is holding on to these and remounting them on the new tasks.
Is there a way to delete these stale directories from EFS and keep only the paths specific to the running tasks? At any given time, we have 5 tasks running in the service.
Any help is appreciated.
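If it is just a matter of pruning old data, one approach (a minimal sketch, not a definitive fix) is a periodic cleanup job that deletes buffer directories whose contents have not been modified for some time. The mount point and retention window below are assumptions; adjust them to your environment and make sure live tasks never keep chunks that long.

```python
import os
import shutil
import time

# Assumed mount point of the EFS buffer and retention window; adjust as needed.
BUFFER_ROOT = "/var/log/fluentd/buffer"
MAX_AGE_SECONDS = 7 * 24 * 3600  # one week

now = time.time()
for entry in os.scandir(BUFFER_ROOT):
    if not entry.is_dir():
        continue
    # A directory counts as stale if nothing inside it was modified recently.
    newest = max(
        (os.path.getmtime(os.path.join(root, name))
         for root, _, files in os.walk(entry.path) for name in files),
        default=os.path.getmtime(entry.path),
    )
    if now - newest > MAX_AGE_SECONDS:
        shutil.rmtree(entry.path)
        print(f"removed stale buffer directory: {entry.path}")
```

This could run as a scheduled ECS task or cron job that mounts the same EFS file system.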

Related

What happens when network connection to GCP is lost?

Imagine I have a GCS bucket mounted on my local Linux file system. Imagine I have an app that is writing new files into a Linux directory that is mounted to GCS. My goal is to have those locally written files eventually show up in GCS.
I understand that the writes on Linux happen "locally" until the file is closed ... what happens if I lose network connectivity and hence can't write to GCS? Will the local file eventually end up in GCS? Do retries and re-attempts happen?
Based on the repository documentation for gcsfuse, file upload retries are already built into the utility; they happen when there are problems accessing the mounted storage bucket. You can adjust the maximum backoff for retries with the --max-retry-sleep flag. This flag controls the longest interval allowed between retries, after which retrying stops. It accepts a duration (for example, a number of minutes) as input.
This doc page is also relevant if you would like to know more about specific characteristics of gcsfuse.
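For reference, a minimal sketch of mounting with a capped retry backoff from a script (the bucket name, mount point, and duration value here are assumptions; check your gcsfuse version's docs for the exact duration format):

```python
import subprocess

BUCKET = "my-bucket"        # hypothetical bucket name
MOUNT_POINT = "/mnt/gcs"    # hypothetical mount point

# Mount the bucket and cap the maximum sleep between upload retries.
subprocess.run(
    ["gcsfuse", "--max-retry-sleep", "2m", BUCKET, MOUNT_POINT],
    check=True,
)
```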

Proper way for pods to read input files from the same persistent volume?

I'm new to Kubernetes and plan to use Google Kubernetes Engine. Hypothetically speaking, let's say I have a K8s cluster with 2 worker nodes. Each node would have its own pod housing the same application. This application will grab a file from some persistent volume and generate an output file that will be pushed back into a different persistent volume. Both pods in my cluster would be doing this continuously until there are no input files left in the persistent volume to be processed. Do the pods inherently know NOT to grab the same file that one pod is already using? If not, how would I be able to account for this? I would like to avoid 2 pods using the same input file.
Do the pods inherently know NOT to grab the same file that one pod is already using?
Pods are just processes. Two separate processes accessing files from a shared directory are going to run into conflicts unless they have some sort of coordination mechanism.
Option 1
Have one process whose job it is to enumerate the available files. Your two workers connect to this process and receive filenames via some sort of queue/message bus/etc. When they finish processing a file, they request the next one, until all files are processed. Because only a single process is enumerating the files and passing out the work, there's no option for conflict.
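As a rough sketch of the worker side of this option, here is what the loop could look like if a Redis list were used as the queue (the Redis host, key name, and process function are all assumptions, not part of the original question):

```python
import redis  # assumes the redis-py client is installed

# Hypothetical queue location and key name.
r = redis.Redis(host="redis.default.svc.cluster.local", port=6379)
QUEUE_KEY = "files-to-process"

def process(path: str) -> None:
    print(f"processing {path}")  # placeholder for the real work

while True:
    # BLPOP hands each filename to exactly one worker, so two pods
    # can never claim the same file.
    item = r.blpop(QUEUE_KEY, timeout=30)
    if item is None:   # queue drained, nothing left to do
        break
    _, raw_path = item
    process(raw_path.decode())
```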
Option 2
In general, renaming files is an atomic operation. Each worker creates a subdirectory within your PV. To claim a file, it renames the file into the appropriate subdirectory and then processes it. Because renames are atomic, even if both workers happen to pick the same file at the same time, only one will succeed.
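A minimal sketch of the claim step, assuming the shared PV is mounted at /mnt/input and each worker has its own claim subdirectory (both paths are assumptions):

```python
import os

INPUT_DIR = "/mnt/input"  # assumed shared PV mount
CLAIM_DIR = os.path.join(INPUT_DIR, f"claimed-{os.environ.get('HOSTNAME', 'worker')}")
os.makedirs(CLAIM_DIR, exist_ok=True)

def claim_next():
    """Claim one unprocessed file by renaming it into this worker's subdirectory."""
    for name in os.listdir(INPUT_DIR):
        src = os.path.join(INPUT_DIR, name)
        if not os.path.isfile(src):
            continue
        dst = os.path.join(CLAIM_DIR, name)
        try:
            # rename() within one file system is atomic: if another pod
            # grabbed this file first, we get FileNotFoundError and move on.
            os.rename(src, dst)
            return dst
        except FileNotFoundError:
            continue
    return None  # nothing left to claim
```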
Option 3
If your files have some sort of systematic naming convention, you can divide the list of files between your two workers (e.g., "everything that ends in an even number is processed by worker 1, and everything that ends with an odd number is processed by worker 2").
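A sketch of that kind of static partitioning, assuming each pod is given a WORKER_INDEX environment variable and the file names end with a digit (both assumptions for illustration):

```python
import os

WORKER_INDEX = int(os.environ.get("WORKER_INDEX", "0"))  # 0 or 1, set per pod
WORKER_COUNT = 2
INPUT_DIR = "/mnt/input"  # assumed shared PV mount

def my_files():
    # Keep only files whose trailing digit maps to this worker.
    for name in sorted(os.listdir(INPUT_DIR)):
        stem = os.path.splitext(name)[0]
        if stem and stem[-1].isdigit() and int(stem[-1]) % WORKER_COUNT == WORKER_INDEX:
            yield os.path.join(INPUT_DIR, name)

for path in my_files():
    print(f"worker {WORKER_INDEX} processing {path}")
```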
Etc. There are many ways to coordinate this sort of activity. The Wikipedia entry on Synchronization may be of interest.

Spring Batch Restartability on Kubernetes for File Operations

I want to learn the proper way to reach the processed files when restarting a Spring Batch application on Kubernetes. In particular, if the target type is a file, it is deleted together with the pod after the job fails.
We are considering using a persistent volume, or backing up the created file somewhere such as a DB or an SFTP server by implementing a listener.
Does anyone have experience with persistent volume usage (NFS or other solutions) for file operations? We are concerned about performance and unexpected problems. Do you have any suggestions?
Thank you.
You should not rely on the ephemeral file system of a Pod for files that should persist and survive a Job (Pod) failure.
You need to use a persistent volume for that, so that Spring Batch can find the (incomplete) output file in a restart scenario and resume writing where it left off.
If you want data persistence, you may begin with hostPath volumes. This will restrict which nodes your pods can be scheduled on, but it is the simplest option and gives you the best performance.
https://kubernetes.io/docs/concepts/storage/volumes/#hostpath
If you want dynamic allocation, you will need to configure a storage solution such as GlusterFS, NFS, Ceph, etc.

Does each task in an ECS cluster get its own disk space?

I have an ECS cluster set up. I can launch several tasks that all point to the same task definition, and I see them running with different container runtime IDs.
I understand that each Fargate task has its own isolation boundary and does not share the underlying kernel, CPU resources, memory resources, or elastic network interface with another task.
What I want to understand is: does each task get its own disk space as well? Suppose I append logs to a static file (logs/application_logs.txt). Will each running task only have its own logs in that file?
If 3 tasks are running together, will the logs of all 3 tasks end up in logs/application_logs.txt?
What I want to understand is: does each task get its own disk space as well? Suppose I append logs to a static file (logs/application_logs.txt). Will each running task only have its own logs in that file?
Each Fargate task gets its own storage space allocated. This storage is unique to the task, so each running task will only have its own logs in that particular location.
If 3 tasks are running together, will the logs of all 3 tasks end up in logs/application_logs.txt?
No, they will not. You can add data volumes to tasks that can be shared between multiple tasks if you want to. See https://docs.aws.amazon.com/AmazonECS/latest/developerguide/fargate-task-storage.html for more info.
Quoting the doc:
When provisioned, each Fargate task receives the following storage. Task storage is ephemeral. After a Fargate task stops, the storage is deleted.
10 GB of Docker layer storage
An additional 4 GB for volume mounts. This can be mounted and shared among containers using the volumes, mountPoints and volumesFrom parameters in the task definition.
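To make the shared-volume case concrete, here is a minimal sketch of registering a task definition with a task-scoped volume mounted into a container via boto3; the family name, image, and paths are hypothetical, and this only shows the volumes/mountPoints shape mentioned in the quote above:

```python
import boto3

ecs = boto3.client("ecs")  # assumes AWS credentials and region are configured

ecs.register_task_definition(
    family="log-writer",                       # hypothetical family name
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    volumes=[{"name": "shared-logs"}],         # task-scoped volume
    containerDefinitions=[
        {
            "name": "app",
            "image": "my-app:latest",          # hypothetical image
            "essential": True,
            # Mount the volume at the directory the app writes logs to;
            # other containers in the same task could mount it as well.
            "mountPoints": [
                {"sourceVolume": "shared-logs", "containerPath": "/logs"}
            ],
        }
    ],
)
```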
Yes, if you have provisioned storage inside the volumes section of the task definition, then your task gets non-persistent storage.
If you're appending logs to a file and there are 3 tasks running, then I guess each one of them will have its own log file.

How does Kubernetes deal with file write locks across multiple pods when hostPath volumes are involved?

I have an app that logs to the file my_log/1.log, and I use Filebeat to collect the logs from that file.
Now I use k8s to deploy it onto some nodes, and I use hostPath-type volumes to mount the my_log directory to the local file system at /home/my_log. Suddenly I found a subtle situation:
what will happen if more than one pod is deployed on this machine and they try to write the log at the same time?
I know that in the normal situation, when multiple processes try to write to a file at the same time, the system will lock the file so the processes can write one by one. BUT I am not sure whether different k8s pods share the same lock space; if they don't, it will be a disaster.
I tried to test this, and it seems different pods still share the file lock; the log file looks normal.
How does Kubernetes deal with file write locks across multiple pods when hostPath volumes are involved?
It doesn't.
The operating system and the file system handle that.
As an example, let's take syslog. It handles this by opening a socket, setting the socket to server mode, opening a log file in write mode, being notified of packets, parsing the message, and finally writing it to the file.
Logs can also be cached, and the process can be limited to one thread, so you should not have many pods writing to one file. That could lead to issues like missing logs or lines being cut.
Your application should handle the file locking when pushing logs. Also, if you want to have many pods writing logs, you should have a separate log file for each pod.
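As a rough illustration of application-level locking, here is a minimal sketch that serializes appends with a POSIX advisory lock (fcntl.flock). Whether such a lock is honored across pods depends on the underlying volume; it generally holds for hostPath mounts on the same node, but verify that for your setup.

```python
import fcntl
import os

LOG_PATH = "/home/my_log/1.log"  # path from the question; adjust as needed

def append_log_line(line: str) -> None:
    # Append mode plus an exclusive advisory lock keeps concurrent
    # writers from interleaving partial lines.
    with open(LOG_PATH, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # block until we hold the lock
        try:
            f.write(line.rstrip("\n") + "\n")
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

append_log_line(f"hello from pid {os.getpid()}")
```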