I am evaluating a migration of an application that currently runs with docker-compose to Kubernetes, and came across two solutions: Kompose and compose-on-kubernetes.
I'd like to know their differences in terms of functionality and ease of use so I can decide which one is better suited.
Both products provide a migration path from docker-compose to Kubernetes, but they take slightly different approaches.
Compose on Kubernetes runs within your Kubernetes cluster and lets you deploy your compose setup on that cluster unchanged.
Kompose translates your docker-compose files into a set of Kubernetes resources.
Compose on Kubernetes is a good solution if you want to keep running docker-compose in parallel with your Kubernetes deployment, and therefore plan to keep maintaining the docker-compose files.
If you're migrating completely to Kubernetes and don't plan to keep working with docker-compose, it's probably better to do the conversion once with Kompose and use the generated resources as the starting point for maintaining the configuration directly as Kubernetes resources.
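As a rough illustration (the service and image names here are made up), a compose file like this is what Kompose consumes:

```yaml
# Hypothetical docker-compose.yml
version: "3"
services:
  web:
    image: nginx:1.25
    ports:
      - "8080:80"
  api:
    image: example/api:latest    # placeholder image
    environment:
      - DB_HOST=db
  db:
    image: postgres:15
```

Running kompose convert -f docker-compose.yml then emits Kubernetes manifests (a Deployment per service, plus a Service for the services that publish ports) that you can check into your repository and maintain directly from that point on.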
Related
I have recently been reading more about infrastructure as a service (IaaS) and platform as a service (PaaS) and have some questions. I see that when we opt for a PaaS solution, it is generally very easy to create the infrastructure, as the cloud providers handle that for us, and we can even automate the deployment using an infrastructure-as-code solution like Terraform.
But if we use an IaaS solution, or even a local on-premises cluster, we seem to lose a lot of the automation that PaaS allows. So I was curious: are there any good tools out there for automating infrastructure deployment on a local cluster that is not in the cloud?
The best thing I could think of was to run a local Kubernetes cluster and then Dockerize each of the infrastructure components, but this seems difficult as each node in the cluster will need its own specific configuration files.
From my basic Googling, it seems like there is not a good solution to this.
Edit:
I was not clear enough with my original intentions. I have two problems I am trying to solve.
How do I automate infrastructure deployment locally? For example, suppose I wanted to create a Hadoop HDFS cluster. I would need to configure one node to be the namenode with an accessible IP, and the other nodes to be datanodes that are aware of the namenode's IP. At the moment, I have to do this manually by logging into each node, checking its IP, and then configuring each one. How would I automate this? If I were to use a Kubernetes approach, how do I specify that one of the running pods needs to be the namenode and the others are datanodes? How do I find the pods' IPs and have them be aware of the namenode's IP?
The next problem I have is very similar to the first, but with a slight modification: how would I deploy specific configuration files to each node? For instance in Kafka, the configuration file for one node requires the IPs of the Zookeeper nodes, as well as the IP it should listen on. This may be different for every node in the cluster. Is there a good way to make these config files pod-specific, so that I do not have to do bash text processing to insert the correct contents into each pod's config files?
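The closest thing I can picture on the Kubernetes side is a StatefulSet behind a headless Service, which (as far as I understand) gives each pod a stable DNS name, so datanodes could refer to the namenode by name rather than by a discovered IP. A rough, purely hypothetical sketch (image and names made up):

```yaml
# Hypothetical: stable DNS for an HDFS namenode via a headless Service
apiVersion: v1
kind: Service
metadata:
  name: hdfs-namenode
spec:
  clusterIP: None              # headless: gives pods per-pod DNS names
  selector:
    app: hdfs-namenode
  ports:
    - port: 8020               # HDFS RPC port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-namenode
spec:
  serviceName: hdfs-namenode
  replicas: 1
  selector:
    matchLabels:
      app: hdfs-namenode
  template:
    metadata:
      labels:
        app: hdfs-namenode
    spec:
      containers:
        - name: namenode
          image: example/hadoop:3.3          # placeholder image
          ports:
            - containerPort: 8020
```

Datanode pods could then point at hdfs-namenode-0.hdfs-namenode:8020, a name that survives pod restarts, and per-pod differences could presumably come from a ConfigMap or from the pod's own name via the downward API — but I'm not sure whether this is the intended approach.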
You can use Terraform for all of your on-premises infrastructure automation, and Ansible for configuration management.
Let's say you have three HPE servers. Install K8s or VMware on them using Ansible (see the sketch below); then you can treat them as three availability zones in one region, same as on AWS. From there you can start deploying Dockerized apps or Helm charts using Terraform.
Summary:
Ansible for installing and configuring K8s.
Terraform for provisioning resources on K8s.
Helm for installing apps on K8s.
After this you will have a basic automated on-premises infrastructure.
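A rough sketch of the Ansible piece (the inventory group, package names, and repository setup are assumptions; a real install would typically go through kubeadm or a ready-made role such as kubespray):

```yaml
# playbook.yml - hypothetical preparation of on-prem nodes for K8s
- hosts: k8s_nodes              # assumed inventory group for the three servers
  become: true
  tasks:
    - name: Install a container runtime
      apt:
        name: containerd
        state: present
        update_cache: true

    - name: Disable swap (required by the kubelet)
      command: swapoff -a

    - name: Install Kubernetes packages   # assumes the Kubernetes apt repo is already configured
      apt:
        name:
          - kubelet
          - kubeadm
          - kubectl
        state: present
```

Terraform and Helm then pick up from a working cluster: Terraform (for example via its Kubernetes and Helm providers) provisions the workloads, and Helm installs the charts.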
I can't find the answer to a pretty simple question: where can I configure Cassandra (normally done via cassandra.yaml) when it's deployed on a cluster with Kubernetes using Google Kubernetes Engine?
So I'm completely new to distributed databases, Kubernetes, etc., and I'm setting up a Cassandra cluster (4 VMs, 1 pod each) on GKE for a university course right now.
I used the official example on how to deploy Cassandra on Kubernetes that can be found on the Kubernetes homepage (https://kubernetes.io/docs/tutorials/stateful-application/cassandra/), with a StatefulSet, persistent volume claims, a central load balancer, etc. Everything seems to work fine and I can connect to the DB via my Java application (using the DataStax Java/Cassandra driver) and via Google Cloud Shell + CQLSH on one of the pods directly. I created a keyspace and some tables and started filling them with data (~100 million entries planned), but as soon as the DB reaches a certain size, expensive queries result in a timeout exception (via DataStax and via CQL), just as expected. Speed isn't necessary for these queries right now; it's just for testing.
Normally I would start by trying to increase the timeouts in cassandra.yaml, but I'm unable to locate it on the VMs and have no clue where to configure Cassandra at all. Can someone tell me whether these configuration files even exist on the VMs when deploying with GKE, and where to find them? Or do I have to configure these Cassandra details via kubectl/CQL/the StatefulSet or somewhere else?
I think the fastest way to configure Cassandra on Kubernetes Engine is to use the Cassandra deployment from the GCP Marketplace; there you can configure your cluster, and you can follow the guide linked there to configure it correctly.
======
The timeout settings are part of Cassandra's own configuration, so they have to be modified inside the container.
You can use the command kubectl exec -it POD_NAME -- bash to open a shell in the Cassandra container; from there you can locate the configuration file and change the values you need.
Once you have the configuration you need, you will want to automate it so you don't have to intervene manually every time one of your pods is recreated (the configuration will not survive a container recreation). The following options are only suggestions:
Create your own Cassandra image from your own Dockerfile, changing the configuration values you need there; the image you are using right now is a public one, so the container will always start with the configuration baked into that image.
Edit the YAML of the StatefulSet where Cassandra is running and add an initContainer, which can change the configuration of your Cassandra container; this applies the change automatically with a script every time your pods start (see the sketch below).
Choose the option that fits you best.
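A rough sketch of the initContainer option (the config path, image tag, and timeout keys below are assumptions and depend on the image you actually run): an init container copies the stock config onto a shared emptyDir volume and raises the timeouts, and the Cassandra container mounts that volume over its config directory.

```yaml
# Excerpt from the Cassandra StatefulSet pod template (paths and keys are assumptions)
spec:
  template:
    spec:
      volumes:
        - name: cassandra-config
          emptyDir: {}
      initContainers:
        - name: tune-config
          image: cassandra:3.11                  # use the same image as the main container
          command:
            - sh
            - -c
            - |
              cp -r /etc/cassandra/* /config/
              sed -i 's/^read_request_timeout_in_ms:.*/read_request_timeout_in_ms: 20000/' /config/cassandra.yaml
              sed -i 's/^range_request_timeout_in_ms:.*/range_request_timeout_in_ms: 30000/' /config/cassandra.yaml
          volumeMounts:
            - name: cassandra-config
              mountPath: /config
      containers:
        - name: cassandra
          image: cassandra:3.11
          volumeMounts:
            - name: cassandra-config
              mountPath: /etc/cassandra            # overlay the tuned config
```

A ConfigMap holding a full cassandra.yaml, mounted the same way, is another common variant of the same idea.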
I would like to know if it is possible for multiple pods in the same Kubernetes cluster to access a database which is configured using persistent volumes on a Google Cloud persistent disk.
Currently I am building a microservices-architecture web app which has 3 Node.js APIs in different pods, all accessing the same database. How do I achieve this with Kubernetes?
Kindly let me know if my architecture is right as well.
You can certainly connect multiple node-based app pods to the same database. It is sometimes said that microservices shouldn't share a database but this depends on what your apps are doing, the project history and the extent to which you want the parts to be worked on separately.
There are questions you have to answer about running databases at scale, such as your future load and whether you want to use relational databases if you're going to try to span availability zones. And there are some questions specific to Kubernetes, especially around how you associate DB Pods with their data. See https://stackoverflow.com/a/53980021/9705485. Another popular option is to use a managed DB service from a cloud provider. If you do run the DB in k8s then I'd suggest looking for a Helm chart or looking at an operator, such as the KubeDB operator, to avoid crafting the Kubernetes descriptors yourself and to get more guidance on running the DB and setting it up.
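To make the first point concrete, a single Service in front of the database gives all three API pods one stable in-cluster address (the names here are placeholders):

```yaml
# Hypothetical Service fronting the shared database
apiVersion: v1
kind: Service
metadata:
  name: app-db
spec:
  selector:
    app: app-db                # must match the labels on the database pod(s)
  ports:
    - port: 5432               # PostgreSQL-style port; adjust for your DB
```

Each of the three Node.js APIs can then be configured with app-db:5432 (for example via an environment variable in its Deployment) instead of a pod IP.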
If it's a new project and you've not used k8s before, then you'll also have to decide where to host your code, your Docker images and your deployment descriptors, and how to set up your CI pipelines. If you don't have answers to these questions already, I'd suggest looking at Jenkins X, as it provides out-of-the-box defaults for a whole cluster and CI setup, plus a template (a 'build pack') for building Node apps and deploying them to staging and production environments through a pipeline.
I have a project that uses the following resources:
A JSF application running under JBoss and using a PostgreSQL database
2 Spring Boot APIs using MongoDB.
So, I have the following Docker containers:
JSF + JBoss in the same container
a PostgreSQL container
a MongoDB container
one container for each Spring Boot app.
In Kubernetes I need to organize these containers into Pods, so my idea is to create the following:
A Pod for the JSF + JBoss container
Another Pod for PostgreSQL
Another Pod for MongoDB
Only one Pod for both Spring Boot apps, because they need each other.
So I have 4 Pods and 6 containers. Thinking about Kubernetes best practices, is this a good way to organize my project?
tl;dr: This doesn't follow Kubernetes best practices. Each application should be a separate Deployment or StatefulSet.
A better way to run this in Kubernetes would be a Deployment or StatefulSet for each individual application, so it would be:
One Deployment with a single container for JSF + JBoss
One StatefulSet for PostgreSQL (though I would suggest looking at an operator to manage your PostgreSQL cluster, e.g. KubeDB)
One StatefulSet for MongoDB (again, I strongly suggest using an operator to manage your MongoDB cluster, which KubeDB can also handle)
One Deployment for each of your Spring Boot applications, assuming they communicate with each other over the network. You can then manage and scale each one independently, regardless of their dependency on each other.
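As an illustration, one of the Spring Boot applications might get a Deployment along these lines (the image name and port are placeholders); the other Spring Boot app, the JSF + JBoss app, and the two database StatefulSets are each declared as their own object in the same way:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-api-1         # hypothetical name
spec:
  replicas: 2                    # scaled independently of everything else
  selector:
    matchLabels:
      app: springboot-api-1
  template:
    metadata:
      labels:
        app: springboot-api-1
    spec:
      containers:
        - name: api
          image: example/springboot-api-1:latest   # placeholder image
          ports:
            - containerPort: 8080
```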
I want to deploy my application in the cloud using a Kubernetes-based deployment. It consists of 3 layers: Kafka, Ignite (as DB and processing) and Python (ML engine).
From the Kafka layer we get a data stream as input, which is then passed to Ignite for processing (feature engineering). After processing, the data is passed to the Python server for further ML predictions. How can I break this monolithic application into microservices in Kubernetes?
Also, can using Istio provide any advantage?
You can use the bitnami/kafka image from Bitnami on Docker Hub if you want a pre-built image.
Push the image to your container registry with the gcloud command:
gcloud docker -- push [your image container registry path]
Deploy the image using the UI or the gcloud command.
Expose the ports (2181, 9092-9099), or whichever ones are exposed in the pulled image, after the deployment on Kubernetes.
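For the "expose the ports" step, a Service along these lines would work (the labels must match your Kafka pods, and 2181 only belongs here if ZooKeeper runs in the same pods; otherwise give ZooKeeper its own Service):

```yaml
# Hypothetical Service exposing Kafka inside the cluster
apiVersion: v1
kind: Service
metadata:
  name: kafka
spec:
  selector:
    app: kafka
  ports:
    - name: broker
      port: 9092
      targetPort: 9092
    - name: zookeeper
      port: 2181
      targetPort: 2181
```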
Here is the link to the Ignite image on Google Compute; you just have to deploy it on Kubernetes Engine and expose the appropriate ports.
For Python, you just have to build your Python app using a Dockerfile, as ignacio suggested.
It is possible, and in fact those tools are easy to deploy in Kubernetes. First, you need to gain some expertise in Kubernetes basics, especially StatefulSets and persistent volumes, since Kafka and Ignite are stateful components.
To deploy a Kafka cluster in Kubernetes, follow the instructions from this repository: https://github.com/Yolean/kubernetes-kafka
There are other alternatives, but this is the only one I've tested in production environments.
I have no experience with Ignite, but these docs provide a step-by-step guide. Maybe someone else could share other resources.
As for Python, just dockerize your ML model like any other Python app. On the official Docker image page for Python you'll find a basic Dockerfile for doing that. Once you have your Docker image pushed to a registry, just create a YAML file describing the Deployment and apply it to Kubernetes.
As an alternative for the last step, you can use Draft to dockerize and deploy Python code.
Good luck!