Usage of custom resource definition (CRD) vs Service Catalog in k8s - kubernetes

I recently started to explore k8s extensions and got introduced to two concepts:
CRD.
Service catalogs.
They look pretty similar to me. The only difference to my understanding is, CRDs are deployed inside same cluster to be consumed; whereas, catalogs are deployed to be exposed outside the cluster for example as database service (client can order cluster of mysql which will be accessible from his cluster).
My query here is:
Is my understanding correct? if yes, can there be any other scenario where I would like to create catalog and not CRD.

Yes, your understanding is correct. Taken from official documentation:
Example use case
An application developer wants to use message queuing as part of their application running in a Kubernetes cluster.
However, they do not want to deal with the overhead of setting such a
service up and administering it themselves. Fortunately, there is a
cloud provider that offers message queuing as a managed service
through its service broker.
A cluster operator can setup Service Catalog and use it to communicate
with the cloud provider’s service broker to provision an instance of
the message queuing service and make it available to the application
within the Kubernetes cluster. The application developer therefore
does not need to be concerned with the implementation details or
management of the message queue. The application can simply use it as
a service.
With CRD you are responsible for provisioning resources, running backend logic and so on.
More info can be found in this KubeCon 2018 presentation.

Related

Running a stateless app as a statefulset (Kubernetes)

In the Kubernetes world, a typical/classic pattern is using Deployment for Stateless Applications and using StatefulSet for a stateful application.
I am using a vendor product (Ping Access) which is meant to be a stateless application (it plays the role of a Proxy in front of other Ping products such as Ping Federate).
The github repo for Ping Cloud (where they run these components as containers) shows them running Ping Access (a stateless application) as a Stateful Set.
I am reaching out to their support team to understand why anyone would run a Stateless application as a StatefulSet.
Are there other examples of such usage (as this appears strange/bizarre IMHO)?
I also observed a scenario where a customer is using a StatefulApp (Ping Federate) as a regular deployment instead of hosting them as a StatefulSet.
The Ping Cloud repository does build and deploy Ping Federate as a StatefulSet.
Honestly, both these usages, running a stateless app as a StatefulSet (Ping Access) and running a stateful app as a deployment (Ping Federate) sound like classic anti-patterns.
Apart from the ability to attach dedicated Volumes to StatefulSets you get the following features of which some might be useful for stateless applications:
Ordered startup and shutdown of Pods with K8s doing them one by one in an ordered fashion.
Possibility to guarantee that not more than a single Pod is running at a time even during unscheduled Pod restarts.
Stable DNS names for Pods.
I can only speculate, why Ping Federate uses a StatefulSet. Possibly, it has to do with access limitations of the downstream services it connects to.
The consumption of PingAccess is stateless, but the operation is very much stateful. Namely, the PingAccess admin console maintains a database for configuration, and part of that configuration includes clustered engine mapping and session keys.
Thus, if you were to take away the persistent volume, restarting the admin console would decouple all the engines in the cluster. Then the engines no longer receive configuration.. and web session keys would be mismatched.
The ping-cloud-base repo uses StatefulSet for engines not for persistent volumes, but for sts naming scheme. I personally disagree with this and recommend using Deployment for engines. The only downside is you then have to remove orphaned engines from the admin configuration. Orphaned engines meaning engine config that stays in the admin console db after the engine deployment is rolled/updated. These can be removed from the admin UI, or API. Pretty easy to script in a hook.
It would be ideal for an application that is not a datastore to run without persistent volume, but for the reasons mentioned above, the PingAccess admin console does require and act like a persistent datastore so I think StatefulSet is okay.
Finally, the Ping DevOps team focuses support on their helm chart (where engines are also deployments by default). I'd suspect the community and enterprise support is much larger there for folks deploying on their own. ping-cloud-base is a good place to pick up some hooks though.

Usecase when we need service discovery service when infrastructure is built with kubernetes

I am learning Kubernetes and have developed good knowledge about it. however I am not able to understand why and in which case one would use the service discovery tools when infra is on Kubernetes.
This was asked to me during the interview like which service discovery software will you use for microservices. I am not sure why one would need service discovery when in Kubernetes we have services objects which can be referenced by name.
Has anyone come across a case, where they are developing microservices on Kubernetes and needed the service discovery tool to say like etcd ?
Yes, there could be many more cases for setting up your own service discovery. One, in particular, is a multi-cluster setup with k8s. You can look at how Submariner (a tool for connecting several k8s clusters with an l3/4 tunnel) utilize CoreDNS to add a cross-cluster DNS Service Discovery).

Flink native kubernetes deployment

I have some limitations with the rights required by Flink native deployment.
The prerequisites say
KubeConfig, which has access to list, create, delete pods and **services**, configurable
Specifically, my issue is I cannot have a service account with the rights to create/remove services. create/remove pods is not an issue. but services by policy only can be created within an internal tool.
could it be any workaround for this?
Flink creates two service in native Kubernetes integration.
Internal service, which is used for internal communication between JobManager and TaskManager. It is only created when the HA is not enabled. Since the HA service will be used for the leader retrieval when HA enabled.
Rest service, which is used for accessing the webUI or rest endpoint. If you have other ways to expose the rest endpoint, or you are using the application mode, then it is also optional. However, it is always be created currently. I think you need to change some codes to work around.

multiple environment for websites in Kubernetes

I am a newbie in Kubernetes.
I have 19 LAN servers with 190 machines.
Each of the 19 LANs has 10 machines and 1 exposed IP.
I have different websites/apps and their environments that are assigned to each LAN.
how do I manage my Kubernetes cluster and do setup/housekeeping.
Would like to have a single portal or manager to manage the websites and environment(dev, QA, prod) and keep isolation.
Is that possible?
I only got a vague idea of what you want to achieve so here goes nothing.
Since Kubernetes has a lot of convenience tools for setting a cluster on a public cloud platform, I'd suggest to start by going through "kubernetes-the-hard-way". It is a guide to setup a cluster on Google Cloud Platform without any additional scripts or tools, but the instructions can be applied to local setup as well.
Once you have an operational cluster, next step should be to setup an Ingress Controller. This gives you the ability to use one or more exposed machines (with public IPs) as gateways for the services running in the cluster. I'd personally recommend Traefik. It has great support for HTTP and Kubernetes.
Once you have the ingress controller setup, your cluster is pretty much ready to use. Process for deploying a service is really specific to service requirements but the right hand rule is to use a Deployment and a Service for stateless loads, and StatefulSet and headless services for stateful workloads that need peer discovery. This is obviously too generalized and have many exceptions.
For managing different environments, you could split your resources into different namespaces.
As for the single portal to manage it all, I don't think that anything as such exists, but I might be wrong. Besides, depending on your workflow, you can create your own portal using the Kubernetes API but it requires a good understanding of Kubernetes itself.

Disadvantages of using eureka for Service Discovery with kubernetes

Context
I am deploying a set of services that are containerised using Docker into AWS. No matter which deployment solution is chosen (e.g. raw EC2/ECS/Elastic Beanstalk/Fargate) we will face the issue of "service discovery".
To name just a few of the options for service discovery that I've considered:
AWS Route 53 Service Registry
Kubernetes
Hashicorp Consul
Spring Cloud Netflix Eureka
Specifics Of My Stack
I am developing Java Spring Boot applications using Spring Cloud with the target deployment environment being AWS.
Given that my stack is Spring based, spring cloud eureka made sense to me while developing locally. It was easy to set up a single node, integrates well with the stack and ecosystem of choice and required very little set up.
Locally, we are using docker compose (not swarm) to deploy services - one of the containers deployed is a single node Eureka service discovery server.
However, when we progress outside of local development and into staging or production environment we are considering options like Kubernetes.
My Own Assessment Of Pros/Cons
AWS Route 53 Service Registry
Requires us to couple code specifically to AWS services. Not a problem per se, we are quite tied in anyway on other parts of the stack (SNS/SQS).
Makes running the stack locally slightly more difficult as it relies on Route 53, I suppose we could open up a certain hosted zone for local development.
AWS native, no managing service registries or extra "moving parts".
Spring Cloud Eureka
Downside is that thus requires us to deploy and manage a high availability service registry cluster and requires more resources. Another "moving part" to manage.
Advantages are that it fits into our stack well (spring ecosystem, spring boot, spring cloud, feign and zuul work well with this). Also can be run locally trivially.
I presume we need to configure the networks and registry zone to ensure that that clients publish their host address rather and docker container internal IP address. e.g. if service A is on host A and wants to talk to service B on host B, service B needs to advertise its EC2 address rather than some internal docker IP.
Questions
If we use Kubernetes for orchestration, are there any disadvantages to using something like Spring Cloud Eureka over the built in service discovery options described here https://kubernetes.io/docs/concepts/services-networking/service/#discovering-services
Given Kube provides this, it seems suboptimal to then use eureka deployed using kube to perform discovery. I presume kube can make some optimisations that impact avaialbility and stability that might nit be possible using eureka. e.g kube would know when deploying a new service - eureka will have to rely on heartbeats/health checks and depending on how that is configured (e.g. frequency) this could result in stale records whereas i presume kube might not suffer from this for planned service shutdown/restarts. I guess it still does for unplanned failures such as a host failure or network partition.
Does anyone have any advice on this, do people use services like Kubernetes but use other mechanisms for service discovery rather than those provided by kube. Is there a good reason to do one or the other?
Possible Challenges I Anticipate
We could replace eureka, but relying on Kube to perform discovery will mean that we need to run kube locally to deploy whereas currently we have a simple tiny docker-compose file. Also, I'll have to look at how easy it'll be to ensure that ribbon, zuul and feign play nicely with this.
Currently we have ribbon configured with a eureka client so that service A can server to service B just as "service-b" for example and have ribbon resolve a healthy host via a eureka client. I guess we can configure ribbon to not use eureka and use an external Kube service name which will be resolved by Kube DNS at runtime...
Final Note
Thanks in advance for any contribution or advice. I know this might elicit a primarily opinion focused response. But I am hoping someone can provide objective guidance on when one solution might be preferable to another.
Service discovery is something you get out-of-the-box with Kubernetes. So having another external service in your platform will be another application to maintain, deploy and can be a point of failure. So I would stick with the the service discovery provided by Kubernetes.