Chef Provisioning for canary deployment and orchestration

I am searching for Chef features that do the following jobs for deployment.
Configuration:
1) Configure the deployment node machines in a specific environment
2) Configure the number of service instances that must be alive in the environment at all times
Deployment:
Now, with just the above configuration in place: when I trigger a deployment of N services, it should randomly pick nodes from the deployment environment and start N services in total.
Multiple Services:
If I have 2 nodes and I want to bring up 4 services, it should bring up 2 services on each node.
Service Failure recovery:
If any machine goes down, or any service on any node goes down, it should bring up a replacement service on one of the environment's nodes.

I answered this over on Stack Exchange and then saw it posted here, so I'm answering it here too.
You should take a look at BOSH. It's the tool used by Cloud Foundry, its services, and a Kubernetes distribution called Kubo for installation, management and "Day 2" operations.
It's basically a declarative, cloud-agnostic orchestration tool that features rolling updates, canary deployments, scaling, monitoring and self-healing. It can monitor processes on VMs (i.e. services) as well as the VMs themselves, and will make sure that the deployment is running as you specified it in the deployment manifest.
In order to do all of this (especially the monitoring and self-healing bits) it has a client-server architecture, which is deployed with a cut-down version of BOSH itself called bosh bootloader, or bbl for short. You use this to deploy the BOSH director, and you talk to the director by installing the bosh-cli (brew install bosh-cli on a Mac).
For your deployment you would first need to create what's called a BOSH release. This can, admittedly, be a little daunting if you are not familiar with BOSH, but as CF, its services and Kubo are all open source there are tons of references out there. There are also lots of pre-baked releases and stemcells (OSes).
After creating your release you upload it to your BOSH director and bosh deploy. To upgrade, you upload the next version of your release and bosh deploy. To patch a security vulnerability, you upload the latest blessed stemcell from bosh.io and bosh deploy. I am sure you get the picture.
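To give a flavour of the declarative side, here is a minimal sketch of what a deployment manifest might look like (the names, versions and the my-service release are placeholders I've made up, not anything from your setup); the update block is where canaries and rolling updates are configured:

```yaml
# Hypothetical BOSH v2 deployment manifest. All names are placeholders.
name: my-service

releases:
- name: my-release        # the release you created and uploaded
  version: latest

stemcells:
- alias: default
  os: ubuntu-xenial       # assumed stemcell line
  version: latest

update:
  canaries: 1             # update one canary instance first
  canary_watch_time: 30000-60000
  max_in_flight: 2        # then roll the rest, two at a time
  update_watch_time: 30000-60000

instance_groups:
- name: my-service
  instances: 4            # BOSH keeps 4 instances alive, recreating failed ones
  azs: [z1, z2]           # spread across availability zones
  jobs:
  - name: my-service
    release: my-release
  vm_type: default        # must exist in the director's cloud config
  stemcell: default
  networks:
  - name: default
```

With something like this, bosh -d my-service deploy manifest.yml converges the deployment to the manifest, and the director recreates any instance that dies.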
If you want to go to the next level, there is a good getting-started guide here.

Chef does not do multi-node orchestration.

Related

Good solutions to automate infrastructure deployment locally?

I have recently been reading more about infrastructure as a service (IaaS) and platform as a service (PaaS) and had some questions. I see that when we opt for a PaaS solution it is generally very easy to create the infrastructure, as the cloud providers handle that for us, and we can even automate the deployment using an infrastructure-as-code solution like Terraform.
But if we use an IaaS solution, or even a local on-premises cluster, it seems we lose a lot of the automation that PaaS allows. So I was curious: are there any good tools out there for automating infrastructure deployment on a local cluster that is not in the cloud?
The best thing I could think of was to run a local Kubernetes cluster and then Dockerize each of the infrastructure components, but this seems difficult as each node in the cluster will need its own specific configuration files.
From my basic Googling, it seems like there is not a good solution to this.
Edit:
I was not clear enough with my original intentions. I have two problems I am trying to solve.
How do I automate infrastructure deployment locally? For example, suppose I wanted to create a Hadoop HDFS cluster. I would need to configure one node to be the namenode with an accessible IP, and the other nodes to be datanodes that are aware of the namenode's IP. At the moment I have to do this manually by logging into each node, checking its IP, and then configuring each one. How would I automate this? If I were to use a Kubernetes approach, how do I specify that one of the running pods is the namenode and the others are datanodes? How do I find the pods' IPs and make them aware of the namenode's IP? (See the sketch after the next paragraph.)
The next problem is very similar to the first, with a slight modification: how would I deploy specific configuration files to each node? For instance, in Kafka the configuration file for one node requires the IPs of the ZooKeeper nodes, as well as the IP it should listen on, and this may be different for every node in the cluster. Is there a good way to make these config files pod-specific, so that I do not have to do bash text processing to insert the correct contents into each pod's config files?
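For what it's worth, the usual Kubernetes answer to both problems is a headless Service plus a StatefulSet: each pod gets a stable, predictable DNS name, so datanodes can reference the namenode by name instead of by a discovered IP, and an entrypoint can branch on the pod's ordinal to generate per-pod configuration. A rough sketch, where the image and the start-*.sh scripts are hypothetical:

```yaml
# Hypothetical sketch: stable per-pod DNS names for an HDFS-style cluster.
apiVersion: v1
kind: Service
metadata:
  name: hdfs               # headless: pods resolve as hdfs-0.hdfs, hdfs-1.hdfs, ...
spec:
  clusterIP: None
  selector:
    app: hdfs
  ports:
  - port: 8020
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs
spec:
  serviceName: hdfs
  replicas: 3
  selector:
    matchLabels:
      app: hdfs
  template:
    metadata:
      labels:
        app: hdfs
    spec:
      containers:
      - name: hdfs
        image: my-registry/hdfs:latest   # assumed image
        env:
        - name: NAMENODE_ADDR            # every pod can resolve this stable name
          value: hdfs-0.hdfs:8020
        command: ["/bin/sh", "-c"]
        args:
        - |
          # Branch on the pod's ordinal: hdfs-0 acts as the namenode, the
          # rest act as datanodes. $HOSTNAME is set to the pod name.
          if [ "$HOSTNAME" = "hdfs-0" ]; then start-namenode.sh; else start-datanode.sh; fi
```

The same pattern answers the Kafka case: each broker derives its listen address and broker ID from its own stable name, so no per-pod text processing is needed.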
You can use Terraform for all of your on-premises infrastructure automation, and Ansible for configuration management.
Let's say you have three HPE servers. Install K8s or VMware on them using Ansible; then you can treat them as three availability zones in one region, same as AWS. From there you can start deploying Dockerized apps or Helm charts using Terraform.
Summary:
Ansible for installing and configuring K8s.
Terraform for provisioning K8s.
Helm for installing apps on K8s.
After this you will have a basic automated on-premises infrastructure.
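As a rough illustration of the Ansible piece (the hostnames, inventory groups, and the assumption that the Kubernetes apt repository is already configured are all mine), a playbook along these lines prepares every server before you run kubeadm:

```yaml
# site.yml -- hypothetical playbook preparing three on-prem servers for K8s.
# Assumed inventory: [control_plane] hpe-node-1, [workers] hpe-node-2 hpe-node-3.
# Assumes the Kubernetes apt repository has already been added on each host.
- hosts: all
  become: true
  tasks:
  - name: Install containerd as the container runtime
    apt:
      name: containerd
      state: present
      update_cache: true
  - name: Install kubeadm, kubelet and kubectl
    apt:
      name: [kubeadm, kubelet, kubectl]
      state: present
  - name: Make sure kubelet is running
    service:
      name: kubelet
      state: started
      enabled: true
```

From there, Terraform's Kubernetes and Helm providers can take over managing what runs on the cluster.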

Highly Available Kubernetes cluster

We are starting to migrate our system from Azure Web App Services to AKS infrastructure, and recently we had an incident with our test cluster in which the connection to all our environments was lost. It was due to upgrading the Kubernetes version and adding an additional node pool, which broke the route table so the nodes lost communication between themselves.
As a result we came up with the following HA infrastructure for our environments:
But that eventually adds more work to the CI/CD pipelines and doesn't seem very logical, as Kubernetes itself should be reliable.
Can I have your comments and thoughts on whether this is best practice or the proper way of moving forward?

Service Fabric Single Node SingleNodeClusterUpdateNotAllowed

I've got a single-node Service Fabric instance hosted in Azure, just for testing purposes. When I try to upgrade the Service Fabric version from 6.5 to 7.0, I get the message:
SingleNodeClusterUpdateNotAllowed
Is there anything I can do to allow this?
The short answer is no.
The reason for this is that in order to upgrade, Service Fabric has to take down a node, update it, and restart it. This is repeated for all nodes until the update is complete. In a single-node cluster this would mean taking the cluster offline completely, which the Service Fabric rules do not allow (at the very least one node must be available).
A single-node 'cluster' therefore cannot update the platform or the applications running on it.
The only way to update a single-node cluster is to delete and reinstall it. The same goes for applications (delete the application type before deploying an updated version). Depending on where you have the software deployed (a development box, a server, Azure) I would recommend scripting as much as possible; this will allow you to easily delete and redeploy. I am using a combination of an Azure (ARM) template, a DevOps pipeline, and a script to initialise the cluster and load some default data into the application.
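As an illustration of that scripting approach (not my exact setup; the service connection, resource group, template and script names below are placeholders), a pipeline along these lines can tear the test cluster down and recreate it in one run:

```yaml
# Hypothetical azure-pipelines.yml: recreate a single-node test cluster
# from an ARM template. All names are placeholders.
trigger: none                 # run manually when a redeploy is needed

steps:
- task: AzureCLI@2
  inputs:
    azureSubscription: my-service-connection   # assumed service connection
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      # Delete the old cluster entirely, then redeploy from the template.
      az group delete --name sf-test-rg --yes
      az group create --name sf-test-rg --location westeurope
      az deployment group create \
        --resource-group sf-test-rg \
        --template-file cluster.json \
        --parameters @cluster.parameters.json
- script: ./seed-default-data.sh              # assumed script loading default data
  displayName: Load default data
```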

Azure Service Fabric - connect to local service fabric cluster from outside the VM it's running on?

We have a 5-node Azure Service Fabric Cluster as our main Production microservices hub. Up until now, for testing purposes, we've just been pushing out separate versions of our applications (the production application with ".Test" appended to the name) to that production SFC.
We're looking for a better approach, namely a separate test Service Fabric cluster. But the issue comes down to cost: the smallest SFC you can create in Azure is 3 nodes. Further, you can't shut down an SFC when it's not being used, which we would also need to do to save on costs.
So now I'm looking at just spinning up a plain Windows VM in Azure and installing the local Service Fabric cluster app (which allows a one-node setup). Is it possible to do this and still communicate with the cluster from outside the VM?
What you are trying to accomplish is to set up a standalone cluster. The steps to do it are documented in the docs.
Yes, you can access the cluster from outside the VM. In simple terms: enable access to the network and open the firewall ports.
Technically both deployments (the standalone guide and the dev cluster) are very similar; the main difference is that you have better control over the templates when following the standalone guide, whereas with the development setup you don't have many options and the whole process is automated.
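To make "open the firewall ports" concrete: the defaults you usually need are 19000 (the client connection endpoint) and 19080 (Service Fabric Explorer). As one hedged way to automate it, an Ansible task using the community.windows collection might look like this (the rule naming is arbitrary):

```yaml
# Hypothetical Ansible task opening the default Service Fabric endpoints
# on the standalone Windows VM: 19000 = client connection endpoint,
# 19080 = Service Fabric Explorer.
- name: Allow Service Fabric client and Explorer traffic
  community.windows.win_firewall_rule:
    name: "Service Fabric port {{ item }}"
    localport: "{{ item }}"
    protocol: tcp
    direction: in
    action: allow
    state: present
  loop: [19000, 19080]
```

If the VM runs in Azure, remember that the network security group needs matching inbound rules as well.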
PS: I would highly recommend that you have a UAT/staging cluster with the exact same specs as the production version; the approach you used could be a good idea for a staging environment. Having environments with different specs increases the risk of issues, mainly related to configuration and concurrency.

Application monitoring in Azure Kubernetes cluster using new relic

Requirement - New Relic monitoring for an application running in pods as part of a Kubernetes cluster.
I have installed kube-state-metrics on my cluster and am able to see the Kubernetes dashboard using New Relic Insights.
I also need to configure application monitoring for the same application, and am following https://blog.newrelic.com/2017/11/27/monitoring-application-performance-in-kubernetes/ to do so.
I have some questions:
Can this be achieved using kube-state-metrics?
Do I need a separate YAML file for each pod containing the license key?
Do I need to make changes in my application too, or will adding the information in the spec work?
Do I need to install the Java agent in every pod? If yes, will it eat resources?
Somehow, installation of application monitoring is becoming complex. Please explain the exact installation requirements.
You didn't mention your stack; you should follow the instructions on their site for your language. Typically you just pull in their agent library and configure credentials to get started. You should not have a reason to tell your pods apart, so the agent credentials should be the same for all pods.
Installing the agent at the infrastructure level gives you infrastructure data, so you'll get alerts if you're running out of memory/space/CPU and such. The infrastructure agent cannot possibly know about application data. If you want application performance data (APM) you need to install the agent at the application level too, and then you'll get data such as HTTP request rates, error rates, and response times if it's a web server. You can also annotate the current transaction with data, which is all application-specific. They have a bunch of client agents; see if there's one for your stack. For example, all you need for a Node.js service is require('newrelic') at the top of your app plus configuration.
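To make the "same credentials for all pods" point concrete: instead of a YAML file per pod, put the license key in a single Secret and reference it from the pod template, so every replica picks it up. A sketch with placeholder names (the NEW_RELIC_* environment variables are read by New Relic's language agents):

```yaml
# Hypothetical Deployment fragment: one Secret, shared by every replica.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-registry/my-app:1.0        # assumed image
        env:
        - name: NEW_RELIC_APP_NAME           # how the service shows up in APM
          value: my-app
        - name: NEW_RELIC_LICENSE_KEY        # the same key for all pods
          valueFrom:
            secretKeyRef:
              name: newrelic                 # assumed: created with
              key: license                   #   kubectl create secret generic newrelic --from-literal=license=<key>
```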