How to monitor ECS tasks which are not registered on ELB? - amazon-ecs

I have services that interact over messaging and are not required to be registered on a load balancer. For health checks, I have configured container health checks and verified that they work. But I cannot find a way to monitor these services - how do I create alarms and notifications when tasks become unhealthy or fall below the desired count? For the other tasks, which are registered on an ALB, I have alarms on the 'HealthyHostCount' metric.
I tried creating an EventBridge rule to receive notifications when a task stops, following this guide, but I am not sure if this is the right way.
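This is roughly what I tried, as a sketch (cluster and topic ARNs are placeholders):

    import json
    import boto3

    events = boto3.client("events")

    # Match ECS "Task State Change" events for tasks that reach STOPPED.
    pattern = {
        "source": ["aws.ecs"],
        "detail-type": ["ECS Task State Change"],
        "detail": {
            "clusterArn": ["arn:aws:ecs:eu-west-1:111111111111:cluster/my-cluster"],  # placeholder
            "lastStatus": ["STOPPED"],
        },
    }

    events.put_rule(
        Name="ecs-task-stopped",
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )

    # Deliver matched events to an SNS topic (the topic's access policy must
    # allow events.amazonaws.com to publish to it).
    events.put_targets(
        Rule="ecs-task-stopped",
        Targets=[{"Id": "notify", "Arn": "arn:aws:sns:eu-west-1:111111111111:ecs-alerts"}],  # placeholder
    )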

Related

Kubernetes: Can I execute a task in a POD from another POD?

I'm wondering if there is an option to execute a task from a POD. For example, I have a POD that listens for requests and, on each request, delegates the task to another POD (a worker POD). This worker POD is not alive when there are no jobs to do, and if there is more than one job to do, more PODs are created. After the job is done, the worker PODs are stopped. So a worker POD lives for the duration of one task, is then killed, and when a new task arrives a new worker POD is started. I hope I described it properly. Do you know if this is possible in Kubernetes? The worker POD could be started by, for example, a REST call from the main POD.
There are a few ways to achieve this behavior.
Pure Kubernetes Way
This solution requires ServiceAccount configuration. Please take a look at the Kubernetes documentation Configure Service Accounts for Pods.
Your application service/pod can handle different custom tasks. Applying a specific service account to your pod in order to perform a specific task is best practice. In Kubernetes, using a service account with predefined RBAC rules lets you handle this almost out of the box.
The main concept is to configure RBAC authorization rules for a specific service account by granting different permissions (get, list, watch, create) on different Kubernetes resources (pods, jobs).
In this scenario the working pod waits for incoming requests; after it receives one, it can perform the corresponding task against the Kubernetes API.
This can be extended, e.g. by using a sidecar container inside your working pod. More details about the sidecar concept can be found in this article.
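A minimal sketch of what the working pod could do when a request arrives, assuming it runs with a service account that is allowed to create Jobs (the image and names are placeholders):

    from kubernetes import client, config

    def launch_worker_job(task_id: str, namespace: str = "default"):
        # Use the pod's mounted service account token; the bound RBAC role
        # must allow creating Jobs in this namespace.
        config.load_incluster_config()

        job = client.V1Job(
            api_version="batch/v1",
            kind="Job",
            metadata=client.V1ObjectMeta(name=f"worker-{task_id}"),
            spec=client.V1JobSpec(
                ttl_seconds_after_finished=60,  # clean up finished worker pods
                template=client.V1PodTemplateSpec(
                    spec=client.V1PodSpec(
                        restart_policy="Never",
                        containers=[client.V1Container(
                            name="worker",
                            image="example.com/worker:latest",  # placeholder image
                            args=["--task-id", task_id],
                        )],
                    )
                ),
            ),
        )
        client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)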
Create custom controller
Another way to achieve your goal is to use a custom controller.
The example presented in the Extending the Kubernetes Controller article shows how a custom controller watches the Kubernetes API in order to instrument the underlying worker pods (it watches the Kubernetes configuration for changes and then deletes the corresponding pods). In your setup, such a controller could watch your API for waiting/unprocessed requests and perform an additional task, such as creating a Kubernetes Job inside the cluster (see the sketch after the list below).
Use an existing solution like Job Processing Using a Work Queue:
RabbitMQ on Kubernetes
Coarse Parallel Processing Using a Work Queue
Kubernetes Message Queue
Keda
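A rough sketch of such a controller loop, assuming pending work items show up as labelled ConfigMaps (a made-up convention for illustration) and reusing the launch_worker_job helper from the sketch above:

    from kubernetes import client, config, watch

    def run_controller(namespace: str = "default"):
        config.load_incluster_config()
        core = client.CoreV1Api()

        # Watch ConfigMaps labelled as pending work (hypothetical convention)
        # and start one worker Job per item that appears.
        for event in watch.Watch().stream(core.list_namespaced_config_map,
                                          namespace=namespace,
                                          label_selector="work=pending"):
            if event["type"] == "ADDED":
                work_item = event["object"]
                launch_worker_job(work_item.metadata.name, namespace=namespace)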

Dual Kubernetes readiness probes?

I have a scenario where it is required to 'prepare' Kubernetes for taking down/terminating/shutting down a container, but allow it to serve some requests until that happens.
For example, let's assume that there are three methods: StartAction, ProcessAction, EndAction. I want to prevent clients from invoking StartAction when a container is about to be shut down. However, they should be able to use ProcessAction and EndAction on that same container (after all Actions have been completed, the container will shut down).
I was thinking that this is some sort of 'dual' readiness probe, where I basically want to indicate a 'not ready' status but continue to serve requests for already started Actions.
I know that there is a PreStop hook, but I am not confident that it serves this need because, according to the documentation, I suspect that during PreStop the pod has already been taken off the load balancer:
(simultaneous with 3) Pod is removed from endpoints list for service, and are no longer considered part of the set of running Pods for replication controllers. Pods that shutdown slowly cannot continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.
(https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods).
Assuming that I must rely on stickiness and must continue serving requests for Actions on containers where those actions were started, is there some recommended practice?
I think you can just implement 2 endpoints in your application:
Custom readiness probe
Shutdown preparation endpoint
So to perform a graceful shutdown, you first call the shutdown preparation endpoint, which causes the custom readiness probe to start returning an error. Kubernetes then takes the POD out of the service load balancer (no new clients will come), but existing TCP connections are kept (existing clients keep operating). Once you see from some custom metric (which your service should expose) that all actions for clients are done, you shut down the container using the standard Kubernetes actions. All of this should probably be automated using the Kubernetes API and your application's API.
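A minimal sketch of the two endpoints, assuming a Flask service; the pod's readinessProbe would point at /ready (the metric endpoint and counter handling are placeholders):

    from flask import Flask

    app = Flask(__name__)
    draining = False
    active_actions = 0  # the real StartAction/EndAction handlers would maintain this

    @app.route("/ready")
    def ready():
        # Custom readiness probe: report "not ready" once shutdown preparation
        # has started, so Kubernetes removes the pod from the Service endpoints.
        return ("draining", 503) if draining else ("ok", 200)

    @app.route("/prepare-shutdown", methods=["POST"])
    def prepare_shutdown():
        # Shutdown preparation endpoint: flip the flag the readiness probe checks.
        global draining
        draining = True
        return {"draining": True, "active_actions": active_actions}

    @app.route("/metrics/active-actions")
    def active():
        # Custom metric your automation can poll before actually killing the pod.
        return {"active_actions": active_actions}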

Kubernetes service for background app

I'm in the middle of creating a K8s app that doesn't expose any HTTP endpoints; it's just a background app that pulls messages from a message bus and takes some action based on the incoming message. No other apps will interact directly with this background app, only by posting messages onto the message bus.
Scaling is a requirement and it will most likely always need to run more than one replica. What is the recommended Service type in Kubernetes to handle this type of workload?
No service required... just create a Deployment, which will result in a ReplicaSet, which will keep n replicas of your app running.
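A minimal sketch of such a Deployment created with the Python client, with no Service object at all (name, image and replica count are placeholders):

    from kubernetes import client, config

    config.load_kube_config()  # or load_incluster_config() when run inside the cluster

    deployment = client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="bus-consumer"),
        spec=client.V1DeploymentSpec(
            replicas=3,  # scale by changing this (or attach an HPA)
            selector=client.V1LabelSelector(match_labels={"app": "bus-consumer"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": "bus-consumer"}),
                spec=client.V1PodSpec(containers=[client.V1Container(
                    name="consumer",
                    image="example.com/bus-consumer:latest",  # placeholder image
                )]),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)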

Is there a Service Fabric notification BEFORE it takes down a service?

I have set up a service that uses RegisterServiceNotificationFilterAsync to get notified of service change events. It works as intended: when a service goes down, this event gets called.
But it happens AFTER the service has gone offline, which means that several requests could have failed against that now-offline service before I get it pulled out of my load balancer pool.
Sometimes Service Fabric can only react to a service going offline. For example, if someone pulls the plug on a server node, Service Fabric clearly can't tell me in advance that the service is going offline.
But many times, a threshold is reached and it is Service Fabric itself that kills the service (and starts a new one).
Is there a way to know BEFORE Service Fabric kills a service? (So I have time to update my load balancer.)
(In case it matters, I am running Service Fabric on premises.)
You can only be notified of shutdown inside the service itself. RegisterServiceNotificationFilterAsync is based on endpoint changes from the Naming Service.
If it's a reliable service, you get events for these scenarios: https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-lifecycle
For guest executables, the signal Service Fabric sends is Ctrl+C. For containers, it's "docker stop".
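For a guest executable that means handling Ctrl+C yourself; a minimal sketch, assuming the executable happens to be a Python process that drains its work before exiting:

    import signal
    import sys
    import time

    shutting_down = False

    def on_shutdown(signum, frame):
        # Service Fabric sends Ctrl+C (SIGINT) to guest executables when it
        # closes them; use it to deregister from your load balancer and drain
        # in-flight work before exiting.
        global shutting_down
        shutting_down = True

    signal.signal(signal.SIGINT, on_shutdown)
    signal.signal(signal.SIGTERM, on_shutdown)  # "docker stop" sends SIGTERM

    while not shutting_down:
        time.sleep(1)  # placeholder for the real request-processing loop

    # ...deregister/drain here, then exit cleanly
    sys.exit(0)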

Fabric Service availability on start

I have a scenario where one of our services exposes WCF hosts that receive callbacks from an external service.
These hosts are dynamically created and there may be hundreds of them. I need to ensure that they are all up and running on the node before the node starts receiving requests, so they don't receive failures; this is critical.
Is there a way to ensure that the service doesn't receive requests until I say it's ready? In Cloud Services I could do this by putting all of this code in the OnStart method.
My initial thought is that I might be able to bootstrap this before I open the communication listener - in the hope that the fabric manager only sends requests once this has been done, but I can't find any information on how this lifetime is handled.
There's no "fabric manager" that controls network traffic between your services within the cluster. If your service is up, clients or other services inside the cluster can choose to try to connect to it if they know its address. With that in mind, there are two things you have control over here:
The first is whether or not your service's endpoint is discoverable by other services or clients. This is the point at which your service endpoint is registered with Service Fabric's Naming Service, which occurs when your ICommunicationListener.OpenAsync method returns. At that point, the service endpoint is registered and others can discover it and attempt to connect to it. Of course you don't have to use the Naming Service or the ICommunicationListener pattern if you don't want to; your service can open up an endpoint whenever it feels like it, but if you don't register it with the Naming Service, you'll have to come up with your own service discovery mechanism.
The second is whether or not the node on which your service is running is receiving traffic from the Azure Load Balancer (or any load balancer if you're not hosting in Azure). This has less to do with Service Fabric and more to do with the load balancer itself. In Azure, you can use a load balancer probe to determine whether or not traffic should be sent to nodes.
EDIT:
I added some info about the Azure Load Balancer to our documentation, hope this helps: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-connect-and-communicate-with-services/
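A tiny sketch of that probe contract, assuming the node can expose a plain HTTP health endpoint (the real hosts here are WCF, so this only illustrates the idea of failing the probe until everything is open):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Flipped to True only after all dynamically created hosts have been opened
    # (a hypothetical flag set by your bootstrap code).
    all_hosts_open = False

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # The load balancer probe points here: 200 means "send traffic to
            # this node"; anything else keeps the node out of rotation.
            code = 200 if self.path == "/health" and all_hosts_open else 503
            self.send_response(code)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()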