Unique access to Kubernetes resources - kubernetes

For some integration tests we would like to have a way of ensuring, that only one test at a time has access to certain resources (e.g. 3 DeploymentConfigurations).
For that to work we have have the following workflow:
Before test is started - wait until all DCs are "undeployed".
When test is started - set DC replicas to 1.
When test is stopped - set DC replicas to 0.
This works to some degree, but obviously has the problem, that once the test terminates unexpectedly, the DCs might still be in flight.
Now one way to "solve" this would be to introduce a CR, with a Controller, which handles lifetime of the lock (CR).
Is there any more elegant and straight forward way of allowing unique access to Kubernetes resources?
EDIT:
Sadly we are stuck with Kubernetes 1.9 for now.

look at 'kubectl wait' api to set different conditions between the test flow and depending on the result proceed to next test step.

Related

Limiting the number of times an endpoint of Kubernetes pod can be accessed?

I have a machine learning model inside a docker image. I pushed the docker image to google container registry and then deploy it inside a Kubernetes pod. There is a fastapi application that runs on Port 8000 and this Fastapi endpoint is public
(call it mymodel:8000).
The structure of fastapi is :
app.get("/homepage")
asynd def get_homepage()
app.get("/model):
aysnc def get_modelpage()
app.post("/model"):
async def get_results(query: Form(...))
User can put query and submit them and get results from the machine learning model running inside the docker. I want to limit the number of times a query can be made by all the users combined. So if the query limit is 100, all the users combined can make only 100 queries in total.
I thought of a way to do this:
Store a database that stores the number of times GET and POST method has been called. As soon as the total number of times POST has been called crosses the limit, stop accepting any more queries.
Is there an alternative way of doing this using Kubernetes limits? Such as I can define a limit_api_calls such that the total number of times mymodel:8000 is accessed is at max equal to limit_api_calls.
I looked at the documentation and I could only find setting limits for CPUs, Memory and rateLimits.
There are several approaches that could satisfy your needs.
Custom implementation: As you mentioned, keep in a persistence layer the number of API calls received and deny requests after it has been reached.
Use a service mesh: Istio (for instance) will let you limit the number of requests received and act as a circuit breaker.
Use an external Api Manager: Apigee will also let you limit and even charge your users, however if it is only for internal use (not pay per use) I definitely won't recommend it.
The tricky part is what you want to happen after the limit has been reached, if it is just a pod you may exit the application to finish and clear it.
Otherwise, if you have a deployment with its replica set and several resources associated with it (like configmaps), you probably want to use some kind of asynchronous alert or polling check to clean up everything related to your deployment. You may want to have a deep look at orchestrators like Airflow (Composer) and use several tools such as Helm for keeping deployments easy.

How to configure channels and AMQ for spring-batch-integration where all steps are run as slaves on another cluster member

Followup to Configuration of MessageChannelPartitionHandler for assortment of remote steps
Even though the first question was answered (I think well), I think I'm confused enough that I'm not able to ask the right questions. Please direct me if you can.
Here is a sketch of the architecture I am attempting to build. Today, we have a job that runs a step across the cluster that works. We want to extend the architecture to run n (unbounded and different) jobs with n (unbounded and different) remote steps across the cluster.
I am not confusing job executions and job instances with jobs. We already run multiple job instances across the cluster. We need to be able to run other processes that are scalable in hte same way as the one we have defined already.
The source data is all coming from database which are known to the steps. The partitioner is defining the range of data for the "where" clause in the source database and putting that in the stepContext. All of the actual work happens in the stepContext. The jobContext simply serves to spawn steps, wait for completion, and provide the control API.
There will be 0 to n jobs running concurrently, with 0 to n steps from however many jobs running on the slave VM's concurrently.
Does each master job (or step?) require its own request and reply channel, and by extension its own OutboundChannelAdapter? Or are the request and reply channels shared?
Does each master job (or step?) require its own aggregator? By implication this means each job (or step) will have its own partition handler (which may be supported by the existing codebase)
The StepLocator on the slave appears to require a single shared replyChannel across all steps, but it appears to me that the messageChannelpartitionHandler requires a separate reply channel per step.
What I think is unclear (but I can't tell since it's unclear) is how the single reply channel is picked up by the aggregatedReplyChannel and then returned to the correct step. Of course I could be so lost I'm asking the wrong questions.
Thank you in advance
Does each master job (or step?) require its own request and reply channel, and by extension its own OutboundChannelAdapter? Or are the request and reply channels shared?
No, there is no need for that. StepExecutionRequests are identified with a correlation Id that makes it possible to distinguish them.
Does each master job (or step?) require its own aggregator? By implication this means each job (or step) will have its own partition handler (which may be supported by the existing codebase)
That should not be the case, as requests are uniquely identified with a correlation ID (similar to the previous point).
The StepLocator on the slave appears to require a single shared replyChannel across all steps, but it appears to me that the messageChannelpartitionHandler requires a separate reply channel per step.
The messageChannelpartitionHandler should be step or job scoped, as mentioned in the Javadoc (see recommendation in the last note). As a side note, there was an issue with message crossing in a previous version due to the reply channel being instance based, but it was fixed here.

Uber Cadence task list management

We are using Uber Cadence and periodically we run into issues on the production environment.
The setup is the following:
One Java 14 BE with Cadence client 2.7.5
Cadence service version 0.14.1 with Postgres DB
There are multiple domains, for all domains the single BE server is registered as a worker.
What is visible in the logs is that sometimes during a query the cadence seems to lose stickiness to the BE service:
"msg":"query direct through matching failed on sticky, clearing sticky before attempting on non-sticky","service":"cadence-history","shard-id":1,"address":"10.1.1.111:7934"
"msg":"query directly though matching on non-sticky failed","service":"cadence-history","shard-id":1,"address":"10.1.1.111:7934"..."error":"code:deadline-exceeded message:timeout"
"msg":"query directly though matching on non-sticky failed","service":"cadence-history","shard-id":1,"address":"10.1.1.111:7934"..."error":"code:deadline-exceeded message:timeout"
"msg":"query directly though matching on non-sticky failed","service":"cadence-history","shard-id":1,"address":"10.1.1.111:7934"..."error":"code:deadline-exceeded message:timeout"
"msg":"query directly though matching on non-sticky failed","service":"cadence-history","shard-id":1,"address":"10.1.1.111:7934"..."error":"code:deadline-exceeded message:timeout"
...
In the backend in the meanwhile nothing is visible. However, during this time if I check the pollers on the cadence web client I see that the task list is there, but it is not considered as a decision handler any more (http://localhost:8088/domains/mydomain/task-lists/mytasklist/pollers). Because of this pretty much the whole environment is dead because there is nothing that can progress with the decision. The only option is to restart the backend service and let it re-register as a worker.
At this point the investigation is stuck, so some help would be appreciated.
Does anyone know about how a worker or task list can lose its ability to be a decision handler? Is it managed by cadence, like based on how many errors the worker generates? I was not able to find anything about this.
As I understand when the stickiness is lost, cadence will check for another worker to replay the workflow and continue it (in my case this will be the same worker as there is only one). Is it possible that replaying the flow is not possible (although I think it would generate something in the backend log from the cadence client) or at that point the worker is already removed from the list and that causes the time-out?
Any help would be more than welcome! Thanks!
Does anyone know about how a worker or task list can lose its ability to be a decision handler
This will happen when worker stops polling for decision tasks. For example if you configure the worker only polls for activity tasks, then it will show like that. So apparently it will also happen if for some reason the worker stops polling for decision tasks.
As I understand when the stickiness is lost, cadence will check for another worker to replay the workflow and continue
Yes, as long as there is another worker polling for decision tasks. Note that Query tasks is considered as of the the decision task types. (this is a wrong design, we are working on to separate it).
From your logs:
"msg":"query directly though matching on non-sticky failed","service":"cadence-history","shard-id":1,"address":"10.1.1.111:7934"..."error":"code:deadline-exceeded message:timeout"
This means that Cadence dispatch the Query tasks to a worker, and a worker accepted, but didn't respond back within timeout.
It's very highly possible that there is some bugs in your Query handler logic. The bug caused decision worker to crash(which means Cadence java client also has a bug too, user code crashing shouldn't crash worker). And then a query task loop over all instances of your worker pool, finally crashed all your decision workers.

How can I get down time of a specific deployment in kubernetes?

I have an use case where I need to collect the downtime of each deployment (if all the replicas(pods) are down at the same point of time).
My goal is to maintain the total down time for each deployment since it was created.
I tried getting it from deployment status, but the problem is that I need to make frequent calls to get the deployment and check for any down time.
Also the deployment status stores only the latest change. So, I will end up missing out the changes that occurred in between each call if there is more than one change(i.e., down time). Also I will end up making multiple calls for multiple deployments frequently which will consume more compute resource.
Is there any reliable method to collect the down time data of an deployment?
Thanks in advance.
A monitoring tool like prometheus would be a better solution to handle this.
As an example, below is a graph from one of our deployments for last 2 days
If you look at the blue line for unavailable replicas, we had one replica unavailable from about 17:00 to 10:30 (ideally unavailable count should be zero)
This seems pretty close to what you are looking for.

Checking later the state of nodes during a past job execution

Is there a way to check in Azure Batch whether a node went to the unusable state while running a specific job on a specific pool? The context is that, when running a job and checking the pool on which it was running at that time, there were some nodes that went to the unusable state during the job execution, but we wouldn't have any indication that this happened if we weren't checking the heatmap of the pool during the job execution. Thus, how can I check if nodes went to the unusable state during some job run?
Also, I see that there are metrics collected about the state of nodes in Azure portal, but I am not sure why these metrics are always zero for me even though I am running jobs and tasks that fail?
I had a quick look for you: (I hope this helps :))
For the nodes state monitoring you can do something mentioned here:
https://learn.microsoft.com/en-us/azure/batch/batch-efficient-list-queries
PoolOperations: https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.batch.pooloperations?view=azurebatch-7.0.1
ListComputeNodes : Enumerates the ComputeNode of the specified pool.
I think at the detail level if you filter with correct clasue you will get ComputeNode information, then you can loop through the information and check the state.
https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.batch.common.computenodestate?view=azurebatch-7.0.1
Possible sample implementation: (Please note this specific code is probably for the pool health) https://github.com/Azure/azure-batch-samples/blob/master/CSharp/Common/GettingStartedCommon.cs#L31
With regards to the metrics, how are you getting the metrics back. I am sure I will get corrected if I said anything doubtful or incorrect. Thanks!