Scale a StatefulSet with a shared volume per AZ - Kubernetes

I would like to scale a Kubernetes StatefulSet in EKS that serves static data. The data should be shared where possible, which means per availability zone.
Can I specify a volume claim template to make this possible? Or is there another mechanism?
I also need to initialize the volume (in an init-container) when the first node joins. Do I need to provide some external locking mechanism, rather than just checking whether the volume is empty?

If you want your pods to share static data, then you can:
Put the data into the Persistent Volume
Mount the same volume into the Pods (with ReadOnlyMany)
A few comments:
Here you can find the list of volumes, so you can choose the one that fits your needs
Since every pod serves the same data, you may use a Deployment instead of a StatefulSet; StatefulSets are for when each of your pods is different
If you need the first pod to initialize the data, you can use an initContainer (and then ReadWriteMany instead of ReadOnlyMany). Depending on what exactly you are trying to do, you could also initialize the data first and only then start your Deployment (and Pods); then you would not need any locking. A rough sketch of the shared-volume setup is shown below.
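A minimal sketch of that setup, assuming an EFS-backed StorageClass on EKS (the efs-sc name, the nginx image, and the mount path are assumptions), with every replica mounting the same claim read-only:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-data
spec:
  accessModes:
    - ReadOnlyMany            # switch to ReadWriteMany while an init step seeds the data
  storageClassName: efs-sc    # assumed EFS StorageClass; adjust to your cluster
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: static-server
  template:
    metadata:
      labels:
        app: static-server
    spec:
      containers:
        - name: web
          image: nginx:1.25
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
              readOnly: true
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: static-data
            readOnly: true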

Related

Increase ECS fargate memory using EFS

One of my applications running on ECS (with Fargate) needs more storage; the 20 GB of ephemeral storage is not sufficient for my application, so I am planning to use an EFS volume:
# volume block inside the aws_ecs_task_definition resource
volume {
  name = "efs-test-space"

  efs_volume_configuration {
    file_system_id     = aws_efs_file_system.efs_apache.id
    root_directory     = "/"
    transit_encryption = "ENABLED"
    container_path     = "/home/user/efs/"

    authorization_config {
      access_point_id = aws_efs_access_point.efs-access-point.id
      iam             = "ENABLED"
    }
  }
}
I can see it is mounted and my application is able to access the mounted folder, but for HA and parallelism my ECS task count is 6. Since I am using one EFS file system, the same one is shared by all tasks. The problem I am stuck on is providing a unique mounted EFS file path for each task.
I added something like /home/user/efs/{random_id}, but I want to make this part of the task lifecycle; that is, the folder should get deleted if my task is stopped or destroyed.
So is there a way to mount EFS as a bind mount, or to have the folder deleted during the task destroy stage?
You can now increase your ephemeral storage size up to 200 GiB; all you need to do is set the ephemeralStorage parameter in the Fargate task definition:
"ephemeralStorage": {
"sizeInGiB": 100
}
While this could be achieved in theory, there are a lot of moving parts to it, because the lifecycle of EFS (and Access Points) is decoupled from the lifecycle of tasks. This means you would need to create Access Points out of band and on the fly, AND these Access Points (and their data) are not automatically deleted when you tear down your tasks. The Fargate/EFS integration did not have this as a primary use case. The primary use cases were more around sharing the same data among different tasks (which is what you are observing, but it doesn't serve your use case!) in addition to providing persistence for a single task.
What you need to solve your problem easily is a new feature the Fargate team is working on right now that will allow you to expand the local ephemeral storage as a property of the task. I can't say more about the timing, but the feature is actively being developed, so you may want to consider waiting for it rather than building a complex workflow to achieve the same result.

How to differentiate between equally-named Prometheus metrics from dynamically discovered micro-services in Kubernetes

I’m looking for a way to differentiate between Prometheus metrics gathered from different dynamically discovered services running in a Kubernetes cluster (we’re using https://github.com/coreos/prometheus-operator). E.g. for the metrics written into the db, I would like to understand from which service they actually came.
I guess you can do this via a label from within the respective services; however, swagger-stats (http://swaggerstats.io/), which we're using, does not yet offer this functionality (there is an open issue to enhance this: https://github.com/slanatech/swagger-stats/issues/50).
Is there a way to implement this over Prometheus itself, e.g. that Prometheus adds a service-specific label per time series after a scrape?
Appreciate your feedback!
Is there a way to implement this over Prometheus itself, e.g. that Prometheus adds a service-specific label per time series after a scrape?
This is how Prometheus is designed to be used: a target doesn't know how the monitoring system views it, and prefixing metric names makes cross-service analysis harder. Both setting labels across an entire target and prefixing metric names are considered anti-patterns.
What you want is called a target label; these usually come from relabelling applied to metadata from service discovery.
When using the Prometheus Operator, you can specify targetLabels as a list of labels to copy from the Kubernetes Service to the Prometheus targets.
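For example, a ServiceMonitor that copies the Service's app label onto every series scraped from it could look like the following sketch (the names, the release label, and the metrics port/path are assumptions to adapt to your setup):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-service
  labels:
    release: prometheus            # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-service              # selects the Kubernetes Service to scrape
  targetLabels:
    - app                          # copied from the Service onto every scraped time series
  endpoints:
    - port: http-metrics
      path: /swagger-stats/metrics # path where swagger-stats exposes Prometheus metrics

Every series from that target then carries an app="my-service" label, which lets you tell the services apart in queries.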

Scale the replica set using Labels

I was able to scale the replica set using the following:
/apis/apps/v1/namespaces/{namespace}/deployments/{deployment}/scale
Is there a way I can scale based on a specific label instead of the namespace and deployment name?
I could find a way to get the deployments based on a label:
/apis/extensions/v1beta1/deployments?labelSelector={labelKey}={labelValue}
But I couldn't find a way to scale using a label.
Any help is appreciated.
You can scale Deployments, ReplicaSets, ReplicationControllers and StatefulSets using the appropriate API:
/apis/apps/v1/namespaces/{namespace}/deployments/{name}/scale
/apis/apps/v1/namespaces/{namespace}/replicationcontrollers/{name}/scale
/apis/apps/v1/namespaces/{namespace}/replicasets/{name}/scale
/apis/apps/v1/namespaces/{namespace}/statefulsets/{name}/scale
The idea is to find the Deployments with the required labels using the API /apis/extensions/v1beta1/deployments?labelSelector={labelKey}={labelValue},
and after that use the API /apis/apps/v1/namespaces/{namespace}/deployments/{name}/scale to scale each of them.
You can apply the same logic to ReplicaSets, ReplicationControllers and StatefulSets. But remember: if you use a Deployment, you need to scale the Deployment itself, not the ReplicaSet it creates.
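A minimal sketch of that two-step flow, assuming a hypothetical label app=web, a matching Deployment named web in the default namespace, and a target of 5 replicas:

# 1) Find Deployments by label (cluster-wide list):
#      GET /apis/apps/v1/deployments?labelSelector=app%3Dweb
# 2) For each Deployment found, send this Scale object to its scale subresource
#    (shown as YAML; the API server also accepts JSON):
#      PUT /apis/apps/v1/namespaces/default/deployments/web/scale
apiVersion: autoscaling/v1
kind: Scale
metadata:
  name: web
  namespace: default
spec:
  replicas: 5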

Kubernetes: send custom parameters per pod

I'm testing with a Kubernetes cluster, and it's been marvelous to work with. But I have the following scenario:
I need to pass to each pod a custom value(s) just for that pod.
Let's say I have deployment 1, and I define some env vars for that deployment; the env vars will go to each pod, and that's good. But what I need is to send custom values that go to a specific pod (like "to the third pod that I may create, send this").
Is there any artifact/feature I could use? It does not have to be an env var; it could be a ConfigMap value or anything. Thanks in advance.
Pods in a deployment are homogeneous. If you want to set up a set of pods that are distinct from one another, you might want to use a StatefulSet, which gives each pod an index you can use within the pod to select the relevant config params.
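A rough sketch of that approach, where the pod name injected via the downward API carries the ordinal and selects a per-pod config file (the image, binary, and config layout are hypothetical):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: worker
spec:
  serviceName: worker              # assumes a headless Service named "worker" exists
  replicas: 3
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
        - name: app
          image: my-app:latest     # hypothetical image
          env:
            - name: POD_NAME       # resolves to worker-0, worker-1, worker-2
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          # strip everything up to the last "-" to get the ordinal, then pick that pod's config
          command: ["sh", "-c", "exec /app/server --config /config/pod-${POD_NAME##*-}.yaml"]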
The real question here is how you would know what to put in a particular pod in the first place. You could probably achieve something like this by writing a custom initializer for your pods. You could also have an init container prefetch information from a central coordinator. To propose a solution, you need to figure this out in a "not a snowflake" way.
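And a rough sketch of the init-container idea, with a hypothetical config-service that hands out per-pod settings keyed by pod name:

apiVersion: v1
kind: Pod
metadata:
  name: app-0
spec:
  initContainers:
    - name: fetch-config
      image: curlimages/curl:8.7.1
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      # ask the central coordinator (hypothetical endpoint) for this pod's settings
      command: ["sh", "-c", "curl -fsS http://config-service/config/$POD_NAME -o /config/app.json"]
      volumeMounts:
        - name: config
          mountPath: /config
  containers:
    - name: app
      image: my-app:latest         # hypothetical image that reads /config/app.json
      volumeMounts:
        - name: config
          mountPath: /config
          readOnly: true
  volumes:
    - name: config
      emptyDir: {}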

How to Use Data Disk from Service Fabric

I have a service fabric .net application that needs more temporary storage than is available on the instance. The current instance allows for up to 4 data disks. I see how to allocate these from the scale set, but not how to specify mounts, permissions, or even how to properly access them from the file API. I was surprised to find very little in the way of documentation on this so any help is appreciated.
Once you get into the instances of the scale set, you can add a new data disk to the VM using the instructions here:
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/attach-disk-ps
After that you can access them using standard IO, e.g. System.IO.File.ReadAllText.
However, it is not recommended to modify the primary node type; for non-primary node types it is better to create a new VM scale set and gradually move services from the old scale set to the new one by updating placement properties.