Update task's container in ECS services - MongoDB

I have an ECS service with a task definition that has three containers:
Frontend container
Backend container
MongoDB container
I would like to update the frontend container and/or the backend container without losing the MongoDB data.
what would be the best way to do it?
Is it possible to update the service without redeploying the entire task (and losing the DB data), but only the container(s) I need?
Should I use a bind volume for MongoDB so I can save the data there, and when the service is redeployed the new MongoDB container will retrieve the data from there?
Hope that makes sense.
Thanks in advance.

what would be the best way to do it?
Ideally you wouldn't be running MongoDB in the same task. You would be running it in a separate ECS task (or just running it on an EC2 instance directly). Right now you can't scale out your frontend/backend services, because doing so would spin up the new service instances with a new, empty database.
Is it possible to update the service without redeploying the entire task (and losing the DB data), but only the container(s) I need?
No, that's not possible.
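The usual workflow is to register a new revision of the task definition with the updated frontend/backend image and point the service at it; ECS then replaces each running task as a whole, which is exactly why the MongoDB data has to live outside the task. A minimal sketch with the AWS CLI, where my-cluster, my-service and my-task are placeholder names, not values from the question:

# 1. Export the current task definition, change the frontend image tag in the
#    JSON by hand, and strip the read-only fields before re-registering it.
aws ecs describe-task-definition --task-definition my-task \
  --query 'taskDefinition' > taskdef.json
jq 'del(.taskDefinitionArn, .revision, .status, .requiresAttributes,
        .compatibilities, .registeredAt, .registeredBy)' taskdef.json > new-taskdef.json
aws ecs register-task-definition --cli-input-json file://new-taskdef.json
# 2. Point the service at the new revision; ECS replaces the whole task
#    (all three containers), not just the one container you changed.
aws ecs update-service --cluster my-cluster --service my-service --task-definition my-task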
Should I use a bind volume for MongoDB so I can save the data there, and when the service is redeployed the new MongoDB container will retrieve the data from there?
If you are deploying your ECS tasks to EC2 targets, then you could use a bind mount that would store the MongoDB data on one of the EC2 instances in your cluster. However, if that EC2 instance ever went down you would still lose all your data. The more fault-tolerant and highly available method would be to bind an EFS volume to your MongoDB container to store the data.
If you are deploying your ECS tasks to Fargate, then bind mounts are not an option; EFS is the only option.
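For illustration only, here is a rough sketch of what the EFS part of a task definition could look like for the MongoDB container. Every name and ID below (the mern-stack family, fs-XXXXXXXX, the mongo:6 image) is a placeholder, and a real Fargate task definition also needs IAM roles and networking/security details that are omitted here:

# Register a task definition whose volume is backed by EFS instead of local disk.
cat > mongo-taskdef.json <<'EOF'
{
  "family": "mern-stack",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "volumes": [
    {
      "name": "mongo-data",
      "efsVolumeConfiguration": {
        "fileSystemId": "fs-XXXXXXXX",
        "transitEncryption": "ENABLED"
      }
    }
  ],
  "containerDefinitions": [
    {
      "name": "mongodb",
      "image": "mongo:6",
      "essential": true,
      "mountPoints": [
        { "sourceVolume": "mongo-data", "containerPath": "/data/db" }
      ]
    }
  ]
}
EOF
aws ecs register-task-definition --cli-input-json file://mongo-taskdef.json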

Related

Add EFS volume to ECS for persistent MongoDB data

I believe this requirement is pretty straightforward for anyone trying to host their tier 3, i.e. the database, in a container.
I have an MVP 3-tier MERN app using -
1x Container instance
3x ECS services (Frontend, Backend and Database)
3x Tasks (1x running task per service)
The database task (MongoDB) has its task definition updated to use EFS, and I have tested stopping the task and starting a new one to confirm data persistence.
Question - How do I ensure the EFS volume is auto-mounted on the ECS container host (a SPOT instance)? If ECS uses a CloudFormation template under the covers, do I need to update or modify this template so the persistent EFS volume is auto-mounted on all container EC2 instances? I have come across various articles talking about a script in the EC2 launch config, but I don't see any launch config created by ECS / CloudFormation.
What is the easiest and simplest way to achieve something as seemingly trivial as a persistent EFS volume across my container host instances? I'm guessing the task definition alone doesn't solve this problem?
Thanks
Actually, I think the steps below achieved persistence for the DB task using EFS:
Updated task definition for the database container to use EFS.
Mounted the EFS vol on container instance
sudo mount -t efs -o tls fs-:/ /database/data
The above mount command did not add any entry to /etc/fstab, but the data still seems to persist on the new ECS SPOT instance (one way to make the mount survive new instances automatically is sketched below).
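To avoid mounting by hand on every new SPOT instance, one hedged option is to put the mount into the user data of the launch template / launch configuration used for the ECS container instances, so each instance mounts the volume at boot. fs-XXXXXXXX and the mount point are placeholders, and this assumes an Amazon Linux ECS AMI where amazon-efs-utils is available:

#!/bin/bash
# Runs once at boot via EC2 user data on each new container instance.
yum install -y amazon-efs-utils
mkdir -p /database/data
# Record the mount in /etc/fstab so it also survives instance reboots.
echo "fs-XXXXXXXX:/ /database/data efs _netdev,tls 0 0" >> /etc/fstab
mount -a -t efs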

Regarding MongoDB deployment as a container in AWS Fargate

I want to deploy a Mongo image on a container service like Amazon Fargate.
Can I write data to that container? If it is possible to write data to the container, where will the data be stored, and will it be charged as part of the task?
Each Fargate task (platform version 1.4) comes with 20 GB of ephemeral storage included in the price. You can extend it up to 200 GB (for an additional fee). See here. Again, this space is ephemeral: if the task shuts down, your disk is wiped.
One other option (persistent, in this case) would be to mount an EFS volume to the Fargate task, but that is probably not a great fit for a database workload.
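As an illustration only, extending the ephemeral storage is a single setting in the task definition; the family name, sizes, and image below are placeholders, and IAM roles are omitted:

# ephemeralStorage raises the included 20 GB; the extra space is what incurs the fee.
cat > fargate-storage.json <<'EOF'
{
  "family": "mongo-fargate",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "ephemeralStorage": { "sizeInGiB": 100 },
  "containerDefinitions": [
    { "name": "mongodb", "image": "mongo:6", "essential": true }
  ]
}
EOF
aws ecs register-task-definition --cli-input-json file://fargate-storage.json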

Where/How to configure Cassandra.yaml when deployed by Google Kubernetes Engine

I can't find the answer to a pretty easy question: where can I configure Cassandra (normally via cassandra.yaml) when it's deployed on a cluster with Kubernetes using Google Kubernetes Engine?
I'm completely new to distributed databases, Kubernetes, etc., and I'm setting up a Cassandra cluster (4 VMs, 1 pod each) on GKE for a university course right now.
I used the official example on how to deploy Cassandra on Kubernetes from the Kubernetes homepage (https://kubernetes.io/docs/tutorials/stateful-application/cassandra/) with a StatefulSet, persistent volume claims, a central load balancer, etc. Everything seems to work fine and I can connect to the DB via my Java application (using the DataStax Java/Cassandra driver) and via Google Cloud Shell + CQLSH on one of the pods directly. I created a keyspace and some tables and started filling them with data (~100 million entries planned), but as soon as the DB reaches a certain size, expensive queries result in a timeout exception (via DataStax and via CQL), just as expected. Speed isn't necessary for these queries right now; it's just for testing.
Normally I would start by trying to increase the timeouts in cassandra.yaml, but I'm unable to locate it on the VMs and have no clue where to configure Cassandra at all. Can someone tell me if these configuration files even exist on the VMs when deploying with GKE, and where to find them? Or do I have to configure those Cassandra details via kubectl/CQL/StatefulSet or somewhere else?
I think the fastest way to configure Cassandra in Kubernetes Engine is to use the Cassandra deployment from the Marketplace; there you can configure your cluster, and you can follow the guide linked there to configure it correctly.
======
The timeout setting is a configuration that has to be modified inside the container (in Cassandra's own configuration).
You can use the command kubectl exec -it POD_NAME -- bash to open a shell in the Cassandra container; that will let you get at the container's configuration, find the setting you need, and change it.
Once you have the configuration you require, you will need to automate it to avoid manual intervention every time one of your pods gets recreated (the configuration will not survive a container recreation). The following options are only suggestions:
Create your own Cassandra image from your own Dockerfile, changing the configuration value you require there, because the image you are using right now is a public image and the container will always start with the configuration baked into the pulled image (a sketch of this option follows after these suggestions).
Edit the YAML of the StatefulSet where Cassandra is running and add an initContainer, which lets you change the configuration of the running container (Cassandra); this applies the change automatically with a script every time your pods start.
Choose the option that fits you best.
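As a sketch of the custom-image option only: it assumes the official cassandra image, which keeps its config at /etc/cassandra/cassandra.yaml; the registry name, image tag, StatefulSet/container names, and timeout value are placeholders, and the GKE tutorial's own image may lay things out differently.

# Bake the changed timeout into your own image so recreated pods keep it.
cat > Dockerfile <<'EOF'
FROM cassandra:3.11
RUN sed -i 's/^read_request_timeout_in_ms:.*/read_request_timeout_in_ms: 20000/' \
    /etc/cassandra/cassandra.yaml
EOF
docker build -t my-registry/cassandra-tuned:latest .
docker push my-registry/cassandra-tuned:latest
# Point the StatefulSet at the new image.
kubectl set image statefulset/cassandra cassandra=my-registry/cassandra-tuned:latest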

How do I run HA MongoDB in my Kubernetes cluster without Portworx?

I want to have a MongoDB deployment as a service, following a database-per-service microservice architecture model.
Right now I am using Helm charts to deploy MongoDB, defining a persistent volume and persistent volume claims.
But I want to deploy MongoDB as HA, storing the data in EBS or something similar.
When I checked online for a solution, everything suggests doing it with Portworx. But is there a way to do it without using Portworx?
Any help appreciated.
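For illustration of the pattern the question is describing (not a complete answer): a StatefulSet with volumeClaimTemplates gives each MongoDB replica its own EBS-backed PersistentVolume through the cluster's EBS StorageClass, with no Portworx involved. Everything below (names, the gp2 StorageClass, the mongo:6 image) is a placeholder, and a real HA replica set also needs a headless Service and replica-set initialization:

# Minimal skeleton only; apply after filling in real values.
cat > mongo-statefulset.yaml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo            # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongodb
          image: mongo:6
          args: ["--replSet", "rs0"]
          volumeMounts:
            - name: mongo-data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: mongo-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp2   # EBS-backed StorageClass
        resources:
          requests:
            storage: 20Gi
EOF
kubectl apply -f mongo-statefulset.yaml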

Kubernetes - Persistent storage for PostgreSQL

We currently have a 2-node Kubernetes environment running on bare-metal machines (no GCE) and now we wish to set up a PostgreSQL instance on top of this.
Our plan was to map a data volume for the PostgreSQL data directory to the node using the volumeMounts option in Kubernetes. However, this would be a problem because if the Pod ever gets stopped, Kubernetes will re-launch it at random on one of the other nodes. Thus we have no guarantee that it will use the correct data directory on re-launch...
So what is the best approach for maintaining a consistent and persistent PostgreSQL Data Directory across a Kubernetes cluster?
One solution is to deploy HA PostgreSQL, for example https://github.com/sorintlab/stolon
Another is to have some network storage attached to all nodes (NFS, GlusterFS) and use volumeMounts in the pods; a sketch of that approach follows below.
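A hedged sketch of the second suggestion, using an NFS-backed PersistentVolume that every node can reach; the server address, export path, and sizes are placeholders:

# Define a cluster-wide PV and a claim the PostgreSQL pod can mount.
cat > postgres-storage.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-data
spec:
  capacity:
    storage: 20Gi
  accessModes: ["ReadWriteOnce"]
  nfs:
    server: 10.0.0.10          # placeholder NFS server reachable from both nodes
    path: /exports/postgres
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ""          # bind to the static PV above, no dynamic provisioning
  resources:
    requests:
      storage: 20Gi
EOF
kubectl apply -f postgres-storage.yaml
# The PostgreSQL pod then mounts the claim at its data directory, e.g.
#   volumeMounts: [{ name: data, mountPath: /var/lib/postgresql/data }]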