I have a Kubernetes cluster on IBM Cloud Platform (not important, the question is related to Kubernetes itself).
If I wanted to replicate across different data centers in different regions, should I use separate master nodes for the different regions? What's the best approach in this case and what would you suggest?
Thanks in advance.
I'll answer from an IBM Cloud perspective, since you are referring to data centers.
If you want to "replicate across different data centers in different regions", then you will need to create separate clusters in each of those data centers. Once you have done that, by definition you will have multiple masters (one for each of your clusters). So the short answer is yes, you will have multiple clusters (and masters).
See this doc for more info. In this case you're talking about scenario 3: https://console.bluemix.net/docs/containers/cs_clusters.html#planning_clusters
Note that you will need to provision a global load balancer to load balance between regions, as well as ensure your app can handle any data replication between regions that is needed.
I want to use a Kubernetes namespace for each user of my application, so potentially I'll need to create thousands of namespaces, each with Kubernetes resources in them. I want to make sure this is scalable, so I want to ensure that I can have millions of namespaces on a Kubernetes cluster before I use this construct on a per-user basis.
I'm building a web hosting application. So I'm giving resources to each user, but I want them separated by namespaces.
Are there any limitations to the number of Kubernetes namespaces you can create?
"In majority of cases, thresholds are NOT hard limits - crossing the limit results in degraded performance and doesn't mean cluster immediately fails over.
Many of the thresholds (for cluster scope) are given for the largest possible cluster. For smaller clusters, the limits are proportionally lower.
"
#Namespaces = 10,000 (scope = cluster)
source with more data
kube Talk explaining how the data is computed
You'll usually run into limitations with resources and etcd long before you hit a namespace limit.
For scaling, you'll probably want to scale out across multiple clusters, which most companies treat as cattle, rather than build one giant cluster that becomes a pet, which is not a scenario you want to be dealing with.
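If you do go namespace-per-user, a minimal sketch of what each tenant's resources could look like is below. The namespace name user-12345, the label, and the quota values are made up for illustration; adjust them to your own application.

```yaml
# Hypothetical per-user namespace; the name, label, and quota values are
# illustrative only and should be adapted to your application.
apiVersion: v1
kind: Namespace
metadata:
  name: user-12345
  labels:
    tenant: user-12345
---
# A ResourceQuota caps what a single user's namespace can consume,
# so one tenant cannot starve the rest of the cluster.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: user-quota
  namespace: user-12345
spec:
  hard:
    pods: "10"
    requests.cpu: "1"
    requests.memory: 2Gi
    limits.cpu: "2"
    limits.memory: 4Gi
```

A quota like this keeps any single user's namespace from consuming a disproportionate share of the cluster, which tends to matter long before you approach the namespace-count threshold.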
In my cluster there are 30 VMs located on 3 different physical servers. I want to deploy the replicas of each workload on different physical servers.
I know I can use podAntiAffinity to place replicas on different VMs, but I can't find any way to guarantee that replicas are spread across different physical servers.
I want to know if there is any way to solve this challenge.
I believe you gave the answer ;)
I went to the Kubernetes Patterns book (PDF available for free here) to see if there was something related to that in there, and found exactly that:
To express how Pods should be spread to achieve high availability, or be packed and co-located together to improve latency, Pod affinity and antiaffinity can be used.
Node affinity works at node granularity, but Pod affinity is not limited to nodes and can express rules at multiple topology levels. Using the topologyKey field, and the matching labels, it is possible to enforce more fine-grained rules, which combine rules on domains like node, rack, cloud provider zone, and region [...]
I really like the k8s docs as well; they are super complete and full of examples, so maybe you can get some ideas from here. I think the main idea will be to create your own affinity/anti-affinity rule.
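For example, assuming you label every VM (node) with the physical server it runs on (a hypothetical label such as physical-server=server-1), a sketch of a Deployment enforcing anti-affinity at that topology level could look like this:

```yaml
# Sketch only: assumes every node (VM) carries a hypothetical label such as
# physical-server=server-1 identifying the physical host it runs on.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-workload
  template:
    metadata:
      labels:
        app: my-workload
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: my-workload
            # The topology domain is the physical server, not the individual VM,
            # so no two replicas land on VMs that share a physical host.
            topologyKey: physical-server
      containers:
      - name: app
        image: nginx   # placeholder image
```

You would first label the nodes, e.g. kubectl label node <vm-name> physical-server=server-1, and with requiredDuringScheduling... the scheduler will refuse to place two replicas on VMs that share the same physical-server value.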
----------------------------------- EDIT -----------------------------------
There is a new feature within k8s version 1.18 that may be a better solution.
It's called Pod Topology Spread Constraints:
You can use topology spread constraints to control how Pods are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilization.
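Under the same assumption of a hypothetical physical-server node label, a minimal sketch using a topology spread constraint could look like this:

```yaml
# Sketch: requires Kubernetes 1.18+ and the same hypothetical
# physical-server node label as above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-workload
  template:
    metadata:
      labels:
        app: my-workload
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                      # per-server replica counts may differ by at most 1
        topologyKey: physical-server    # the domain to spread across
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: my-workload
      containers:
      - name: app
        image: nginx   # placeholder image
```

maxSkew: 1 with whenUnsatisfiable: DoNotSchedule keeps the replica counts per physical server within one of each other, which effectively spreads the three replicas across the three servers.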
I'm planning to deploy RabbitMQ on a Kubernetes Engine cluster. I see there are two kinds of location types, i.e. 1. Region 2. Zone.
Could someone help me understand what benefits I can expect from each location type? I believe a multi-zone setup
could help enhance network throughput, while a multi-region setup can ensure uninterrupted service even in case of regional failure events. Is this understanding correct? I'm looking for relevant justifications to choose a location type. Please help.
I'm planning to deploy RabbitMQ on a Kubernetes Engine cluster. I see there are two kinds of location types:
Region
Zone
Could someone help me understand what benefits I can expect from each location type?
A zone (availability zone) is typically a datacenter.
A region is multiple zones located in the same geographical area. When deploying a "cluster" to a region, you typically have a VPC (virtual private cloud) network spanning 3 datacenters, and you spread your components across those zones/datacenters. The idea is that you should be fault tolerant to the failure of a whole datacenter while still having relatively low latency within your system.
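For example, in a regional cluster you could spread the RabbitMQ replicas across zones with pod anti-affinity on the well-known zone label. A rough sketch, where the names, image tag, and replica count are illustrative:

```yaml
# Sketch: keep RabbitMQ replicas in different zones of a regional cluster
# using the well-known zone label.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: rabbitmq
            topologyKey: topology.kubernetes.io/zone   # at most one replica per zone
      containers:
      - name: rabbitmq
        image: rabbitmq:3   # placeholder image
```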
While a multi-region setup can ensure uninterrupted service even in case of regional failure events. Is this understanding correct? I'm looking for relevant justifications to choose a location type.
When using multiple regions, e.g. in different parts of the world, it is typically done to be near the customers, e.g. to provide lower latency. CDN services are distributed to multiple geographical locations for the same reason. When deploying a service to multiple regions, communication between regions is typically done with asynchronous protocols, e.g. message queues, since the latency may be too high for synchronous communication.
With the understanding that Ubernetes is designed to fully solve this problem, is it currently possible (not necessarily recommended) to span a single K8s/OpenShift cluster across multiple internal corporate datacenters?
Additionally, assume that latency between the data centers is relatively low and that the infrastructure across the corporate data centers is relatively consistent.
Example: given 3 corporate DCs, deploy 1..* masters at each datacenter (as a single cluster) and have 1..* nodes at each DC, with pods/RCs/services/... being spun up across all 3 DCs.
Has someone implemented something like this as a stopgap solution before Ubernetes drops, and if so, how has it worked and what considerations should be taken into account when running like this?
is it currently possible (not necessarily recommended) to span a single K8s/OpenShift cluster across multiple internal corporate datacenters?
Yes, it is currently possible. Nodes are given the address of an apiserver and client credentials and then register themselves with the cluster. Nodes don't know (or care) whether the apiserver is local or remote, and the apiserver allows any node to register as long as it has valid credentials, regardless of where the node exists on the network.
Additionally, assume that latency between the data centers is relatively low and that the infrastructure across the corporate data centers is relatively consistent.
This is important, as many of the settings in Kubernetes assume (either implicitly or explicitly) a high bandwidth, low-latency network between the apiserver and nodes.
Example: given 3 corporate DCs, deploy 1..* masters at each datacenter (as a single cluster) and have 1..* nodes at each DC, with pods/RCs/services/... being spun up across all 3 DCs.
The downside of this approach is that if you have one global cluster you have one global point of failure. Even if you have replicated, HA master components, data corruption can still take your entire cluster offline. And a bad config propagated to all pods in a replication controller can take your entire service offline. A bad node image push can take all of your nodes offline. And so on. This is one of the reasons that we encourage folks to use a cluster per failure domain rather than a single global cluster.
I'm new to Couchbase and NoSQL technologies in general, but I'm working on a web chat application running on Node.js using Express and some other modules.
I've chosen to work with NoSQL to store sessions and all needed data on the server side. But I don't really understand some important Couchbase concepts: what is a cluster, and what is a bucket? Where can I find clear definitions of how the server works?
Couchbase uses the term cluster in the same way as many other products, a Couchbase cluster is simply a collection of machines running as a co-ordinated, distributed system of Couchbase nodes.
A Bucket is a Couchbase-specific term that is roughly analogous to a 'database' in traditional RDBMS terms. A Bucket provides a container for grouping your data, both in terms of organising similar data together and in terms of resource allocation. You can configure your buckets separately, providing different quotas, different IO priorities, and different security settings on a per-bucket basis. Buckets are also the primary method for namespacing documents in Couchbase.
For further information, the Architecture and Concepts overview in the Couchbase documentation, specifically data storage, is a good starting point. A somewhat outdated, but still useful video on Introduction to Couchbase might also be useful to you.
Even though this has been answered, I hope the following will be helpful for someone.
A Couchbase cluster contains nodes. Nodes contain buckets. Buckets contain documents. Documents can be retrieved in multiple ways: by their keys, by querying with N1QL, and by using Views. (Ref)
As specified in the Couchbase Documentation,
Node
A single Couchbase Server instance running on a physical server, virtual machine, or a container. All nodes are identical: they consist of the same components and services and provide the same interfaces.
Cluster
A cluster is a collection of nodes that are accessed and managed as a single group. Each node is an equal partner in orchestrating the cluster to provide facilities such as operational information (monitoring) or managing cluster membership of nodes and health of nodes.
Clusters are scalable. You can expand a cluster by adding new nodes and shrink a cluster by removing nodes.
The Cluster Manager is the main component that orchestrates the cluster level operations. For more information, see Cluster Manager.
Bucket
A bucket is a logical container for a related set of items such as key-value pairs or documents. Buckets are similar to databases in relational databases. They provide a resource management facility for the group of data that they contain. Applications can use one or more buckets to store their data. Through configuration, buckets provide segregation along the following boundaries:
Cache and IO management
Authentication
Replication and Cross Datacenter Replication (XDCR)
Indexing and Views
For further info: Couchbase Terminology