Regarding MongoDB deployment as a container in AWS Fargate - amazon-ecs

I want to deploy a Mongo image on a container service like Amazon Fargate.
Can I write data to that container? If so, where will the data be stored, and will it be billed as part of the task?

Each Fargate task (platform version 1.4) comes with 20 GB of ephemeral storage included in the price. You can extend it up to 200 GB for an additional fee. Note that this space is ephemeral: if the task shuts down, your disk is wiped.
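For illustration, the extra ephemeral storage is declared on the task definition. A rough sketch with the AWS CLI might look like this (family name, sizes, and the container definition are placeholders, not taken from the question):

# Register a Fargate task definition with 100 GiB of ephemeral storage
# (requires platform version 1.4; all names/values below are examples only)
aws ecs register-task-definition \
  --family mongo-task \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 1024 --memory 4096 \
  --ephemeral-storage sizeInGiB=100 \
  --container-definitions '[{"name":"mongo","image":"mongo:6","essential":true}]'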
Another option (in this case persistent) would be to mount an EFS volume into the Fargate task, but that is probably not a great fit for a database workload.
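If you do go the EFS route on Fargate, the volume is declared in the task definition and referenced by the container's mount point. A minimal, hedged sketch (file system ID, paths, and names are placeholders):

# Attach an existing EFS file system to a Fargate task and mount it at /data/db
aws ecs register-task-definition \
  --family mongo-efs-task \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 1024 --memory 4096 \
  --volumes '[{"name":"mongo-data","efsVolumeConfiguration":{"fileSystemId":"fs-12345678","transitEncryption":"ENABLED"}}]' \
  --container-definitions '[{"name":"mongo","image":"mongo:6","essential":true,"mountPoints":[{"sourceVolume":"mongo-data","containerPath":"/data/db"}]}]'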

Related

Add EFS volume to ECS for persistent mongodb data

I believe this requirement is pretty straightforward for anyone trying to host their tier 3, i.e. the database, in a container.
I have an MVP 3-tier MERN app using -
1x Container instance
3x ECS services (Frontend, Backend and Database)
3x Tasks (1x running task per service)
The database task (mongodb) has its task definition updated to use EFS, and I have tested stopping the task and starting a new one to verify data persistence.
Question - How do I ensure the EFS volume is auto-mounted on the ECS container host (a Spot instance)? If ECS uses a CloudFormation template under the covers, do I need to update or modify this template so the persistent EFS volume is auto-mounted on all container EC2 instances? I have come across various articles talking about a script in the EC2 launch configuration, but I don't see any launch configuration created by ECS / CloudFormation.
What is the easiest and simplest way to achieve something as trivial as a persistent EFS volume across my container host instances? I'm guessing the task definition alone doesn't solve this problem?
Thanks
Actually, I think the steps below achieved persistence for the DB task using EFS -
Updated the task definition for the database container to use EFS.
Mounted the EFS volume on the container instance:
sudo mount -t efs -o tls fs-:/ /database/data
The above mount command did not add any entries to /etc/fstab, but the data still seems to persist on the new ECS Spot instance.
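For reference, the usual way to make this survive instance replacement is to put the mount in the instance's user data (launch template / launch configuration) so every new container instance mounts EFS at boot. A minimal sketch, assuming amazon-efs-utils is available and using a placeholder file system ID:

#!/bin/bash
# EC2 user data for an ECS container instance: mount EFS at boot and persist
# the mount in /etc/fstab (fs-12345678 and /database/data are placeholders)
yum install -y amazon-efs-utils
mkdir -p /database/data
echo "fs-12345678:/ /database/data efs _netdev,tls 0 0" >> /etc/fstab
mount -a -t efs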

How to back up a large MongoDB instance hosted on Kubernetes?

We have a Mongo instance which is hosted on Kubernetes as a StatefulSet.
It's around 3.5 TB in size, with persistent volumes attached.
We are looking for a way to reduce backup time. What's the best way to back up and restore a MongoDB instance to and from AWS S3?
I've looked at the physical and logical backup options using PBM, but I'm not sure whether they're suitable for instances deployed as StatefulSets in k8s, given they are TBs in size.
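For context, pointing Percona Backup for MongoDB (PBM) at an S3 bucket and taking a physical backup can look roughly like the sketch below; the bucket, region, and connection URI are placeholders and would need to match the actual deployment:

# Configure PBM storage and run a physical backup (all values are examples)
cat > pbm-storage.yaml <<EOF
storage:
  type: s3
  s3:
    region: us-east-1
    bucket: my-mongo-backups
EOF
pbm config --file pbm-storage.yaml --mongodb-uri "mongodb://pbmuser:secret@mongo-0.mongo:27017"
pbm backup --type physical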

What is a preferable way to run a job that needs large tmp disk space?

I have a service that needs to scan large files, process them, and upload them back to the file server.
My problem is that the default available space in a pod is 10 GB, which is not enough.
I have 3 options:
use a hostPath/emptyDir volume, but this way I can't specify how much space I need, and my pods could be scheduled to a node that doesn't have enough disk space.
use a hostPath persistent volume, but the documentation says it is for "single node testing only".
use a local persistent volume, but according to the documentation dynamic provisioning is not supported yet; I would have to manually create a PV on each node, which doesn't seem acceptable to me, but if there are no other options this will be the only way to go.
Are there any other, simpler options than a local persistent volume?
Depending on your cloud provider, you can mount its block storage offering, e.g. Persistent Disk on Google Cloud, Azure Disk on Azure, or Elastic Block Store on AWS.
This way your storage won't depend on the availability of a particular node. All of them are supported in Kubernetes via volume plugins and can be consumed through persistent volume claims. For example:
gcePersistentDisk
A gcePersistentDisk volume mounts a Google Compute Engine (GCE) Persistent Disk into your Pod. Unlike emptyDir, which is erased when a Pod is removed, the contents of a PD are preserved and the volume is merely unmounted. This means that a PD can be pre-populated with data, and that data can be "handed off" between Pods.
This is similar for awsElasticBlockStore or azureDisk.
If you want to use AWS S3, there is an S3 Operator which you may find interesting.
The AWS S3 Operator will deploy the AWS S3 Provisioner, which will dynamically or statically provision AWS S3 bucket storage and access.
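Coming back to the block-storage approach: a hedged sketch of requesting the scratch space through a PersistentVolumeClaim, so the scheduler only runs the job where the claimed disk can be attached. The storage class name, sizes, and the job command are assumptions:

# Request a 100 GiB scratch volume and mount it into the job's pod
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-space
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: gp2          # assumed default EBS-backed class on AWS
  resources:
    requests:
      storage: 100Gi
---
apiVersion: batch/v1
kind: Job
metadata:
  name: file-scan
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: scanner
        image: busybox
        command: ["sh", "-c", "df -h /scratch"]   # placeholder for the real processing
        volumeMounts:
        - name: scratch
          mountPath: /scratch
      volumes:
      - name: scratch
        persistentVolumeClaim:
          claimName: scratch-space
EOF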

Is it Appropriate to Store a Database in a Kubernetes Persistent Volume (And How to Back It Up)?

I have a web application running on a Google Kubernetes cluster. My web app also uses persistent volumes for multiple MongoDB databases to store user and application data.
(1) Thus I am wondering if it is practical to store all data inside those persistent volumes in the long-run?
(2) Are there any methods for safely backing up the persistent volumes e.g. on a weekly basis (automatically)?
(3) I am also planning to integrate some kind of file upload into the application. Are persistent volumes capable of storing many GB/TB of data, or should I opt for something like Google cloud storage in this case?
Deploying stateful apps on K8s is a bit painful, which is well known in the K8s community. Usually, if we need HA for databases, they are supposed to be deployed in cluster mode, and in K8s deploying in cluster mode means looking at the StatefulSets concept. Anyway, I'm pasting links for your questions so that you can start from there.
(1) Thus I am wondering if it is practical to store all data inside those persistent volumes in the long-run?
Running MongoDB on Kubernetes with StatefulSets
(2) Are there any methods for safely backing up the persistent volumes e.g. on a weekly basis (automatically)?
Persistent Volume Snapshots
Volume Snapshot (Beta from K8s docs)
You can google even more docs.
(3) I am also planning to integrate some kind of file upload into the application. Are persistent volumes capable of storing many GB/TB of data, or should I opt for something like Google cloud storage in this case?
I'm not sure it can hold TBs, but if you are on a cloud provider, definitely consider using its object storage for that.
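To make the snapshot option concrete, a minimal sketch of a CSI VolumeSnapshot is below; the snapshot class and PVC names are assumptions, and a weekly schedule would typically be driven by a CronJob or the cloud provider's scheduled snapshots:

# Take a point-in-time snapshot of the MongoDB data PVC (names are examples)
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mongo-data-snapshot
spec:
  volumeSnapshotClassName: csi-snapshot-class   # assumed snapshot class
  source:
    persistentVolumeClaimName: mongo-data-pvc   # assumed PVC name
EOF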
Yes, you can use a PVC in Kubernetes to store the data. However, it depends on your application's use case and size.
In Kubernetes you can deploy MongoDB as a cluster that stores its data inside a PVC. A MongoDB Helm chart is available for HA; you can also look into that.
Helm chart : https://github.com/helm/charts/tree/master/stable/mongodb
It's suggested to run MongoDB on Kubernetes as a single pod or as a StatefulSet.
Backup:
For backing up the MongoDB database, you can take a snapshot of the disk storage (PVC) weekly; along with that, you can also use a MongoDB-level snapshot.
Most people choose a managed service, but it still depends on your organization as well.
Backup methods:
MongoDB snapshot
Disk storage snapshot
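One hedged way to automate the MongoDB-snapshot route weekly is a CronJob running mongodump; the service name, schedule, and image are placeholders, and shipping the archive to object storage is left out:

# Weekly mongodump via a Kubernetes CronJob (every Sunday at 03:00)
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongo-weekly-backup
spec:
  schedule: "0 3 * * 0"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: mongodump
            image: mongo:6
            # dumps into an emptyDir; a real job would then push the archive to S3/GCS
            command: ["sh", "-c", "mongodump --uri=mongodb://mongo:27017 --archive=/backup/dump.gz --gzip"]
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            emptyDir: {}
EOF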
Filesystem:
Yes, it can handle TBs of data, as it's ultimately a disk volume or file system.
Yes, you can use a PVC as the file system, but later you may run into scaling issues: a typical PVC is ReadWriteOnce, so if you want to scale the application along with the PVC you have to use a ReadWriteMany volume.
There are several ways to achieve this; you can also directly mount a shared file system such as AWS EFS into the pod, but you may find it slow for file operations.
For file systems there are various options available in Kubernetes, like CSI drivers, GlusterFS, MinIO, and EFS.
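For the ReadWriteMany case, a hedged sketch using the AWS EFS CSI driver (assuming the driver is installed; fs-12345678 and the names are placeholders) could look like this:

# Statically provision an EFS-backed ReadWriteMany volume and claim it
kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-files
spec:
  capacity:
    storage: 1Ti
  accessModes: ["ReadWriteMany"]
  storageClassName: efs-sc
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: efs-sc
  resources:
    requests:
      storage: 1Ti
EOF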

Google Kubernetes storage in EC2

I started to use Docker and I'm trying out Google's Kubernetes project for my container orchestration. It looks really good!
The only thing I'm curious of is how I would handle the volume storage.
I'm using EC2 instances, and the containers mount volumes from the EC2 filesystem.
The only thing left is figuring out how to deploy my application code onto all of those EC2 instances, right? How can I handle this?
It's somewhat unclear what you're asking, but a good place to start would be reading about your options for volumes in Kubernetes.
The options include using local EC2 disk with a lifetime tied to the lifetime of your pod (emptyDir), local EC2 disk with lifetime tied to the lifetime of the node VM (hostDir), and an Elastic Block Store volume (awsElasticBlockStore).
The Kubernetes Container Storage Interface (CSI) project is reaching maturity and includes a volume driver for AWS EBS that allows you to attach EBS volumes to your containers.
The setup is relatively advanced, but does work smoothly once implemented. The advantage of using EBS rather than local storage is that the EBS storage is persistent and independent of the lifetime of the EC2 instance.
In addition, the CSI plugin takes care of the disk creation -> mounting -> unmounting -> deletion lifecycle for you.
The EBS CSI driver has a simple example that could get you started quickly.
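As a brief sketch of that setup (assuming the aws-ebs-csi-driver is installed in the cluster; names and sizes are placeholders): a StorageClass backed by the driver plus a PVC that dynamically provisions an EBS volume for the pod that claims it.

# Dynamically provision an EBS gp3 volume through the EBS CSI driver
kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 20Gi
EOF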