What happens to the CronWorkflows in Cadence if a cluster is down when a workflow should have started? - cadence-workflow

What is the behaviour of CronWorkflows in the event that the Cadence cluster is down when a workflow should have started? When the cluster comes back up, would we expect the workflow to still be started?

Yes, any workflow run that should have started but didn't will start right away once the cluster is back. If multiple scheduled starts were missed, only one run will be executed.
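
For reference, a cron workflow in Cadence is just a regular workflow started with a CronSchedule in its start options; the server owns the schedule, which is why missed ticks are collapsed into a single run. Below is a minimal sketch using the Cadence Go client; the domain, task list, frontend address and the workflow type name "SampleCronWorkflow" are placeholders, not anything taken from the question above.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"go.uber.org/cadence/.gen/go/cadence/workflowserviceclient"
	"go.uber.org/cadence/client"
	"go.uber.org/yarpc"
	"go.uber.org/yarpc/transport/tchannel"
)

func main() {
	// Build a connection to the Cadence frontend (host/port is a placeholder).
	ch, err := tchannel.NewChannelTransport(tchannel.ServiceName("cadence-client"))
	if err != nil {
		panic(err)
	}
	dispatcher := yarpc.NewDispatcher(yarpc.Config{
		Name: "cadence-client",
		Outbounds: yarpc.Outbounds{
			"cadence-frontend": {Unary: ch.NewSingleOutbound("127.0.0.1:7933")},
		},
	})
	if err := dispatcher.Start(); err != nil {
		panic(err)
	}
	svc := workflowserviceclient.New(dispatcher.ClientConfig("cadence-frontend"))
	c := client.NewClient(svc, "my-domain", &client.Options{})

	// CronSchedule is what turns this into a cron workflow; missed ticks while
	// the cluster is down result in at most one catch-up run.
	opts := client.StartWorkflowOptions{
		ID:                              "sample-cron-workflow",
		TaskList:                        "sample-task-list",
		ExecutionStartToCloseTimeout:    10 * time.Minute,
		DecisionTaskStartToCloseTimeout: time.Minute,
		CronSchedule:                    "0 * * * *", // top of every hour
	}
	we, err := c.StartWorkflow(context.Background(), opts, "SampleCronWorkflow")
	if err != nil {
		panic(err)
	}
	fmt.Println("started cron workflow", we.ID, we.RunID)
}
```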

Related

Pause Scheduled tasks in SCDF

Hi, I'm running batch jobs via SCDF in an OpenShift environment. All the jobs have been scheduled through the scheduling option in SCDF. Is there a way to pause or hold those jobs from executing instead of destroying the schedules? Since there are many jobs, we would otherwise have to recreate all of the schedules every time.
Thanks.
We have an open issue, spring-cloud/spring-cloud-dataflow#3276, to add support for this.
Feel free to update the issue with your use-case requirements and acceptance criteria. Better yet, it'd be great if you could contribute support for it in a PR; we would love to collaborate and release it.

Kubernetes CronJob - Do ConcurrencyPolicy and manual job execution/creation communicate with one another?

I have a Kubernetes CronJob with a concurrencyPolicy of Replace. As I'd have expected, the documentation suggests this means that if a job is still running when the next scheduled run is due, the previous job is killed off / cancelled.
What I want to know is: if I manually kick off a job with kubectl create job --from, does the concurrencyPolicy still play a part? From the testing I've been doing it seems the answer is no (and I then end up with multiple concurrent jobs), but I'd like to confirm.
If I'm correct and they don't work together, is there a way to get this behaviour? Basically I want to be able to deploy a job and then test it without having to wait around for it to kick off, but I also don't want two jobs running at the same time.
Thanks!
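
Not an authoritative answer, but one way to approximate the Replace semantics for manual runs is to check for active Jobs owned by the CronJob before creating one from its job template, which is roughly what kubectl create job --from does. A rough client-go sketch, assuming batch/v1 CronJobs (Kubernetes 1.21+); the namespace and CronJob name are placeholders:

```go
package main

import (
	"context"
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	ctx := context.Background()
	namespace, cronJobName := "default", "my-cronjob" // placeholders

	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Refuse to start a manual run if any Job owned by the CronJob is still active.
	jobs, err := clientset.BatchV1().Jobs(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, j := range jobs.Items {
		for _, owner := range j.OwnerReferences {
			if owner.Kind == "CronJob" && owner.Name == cronJobName && j.Status.Active > 0 {
				fmt.Println("a run is already active, not starting another:", j.Name)
				return
			}
		}
	}

	// Create a one-off Job from the CronJob's template, roughly what
	// `kubectl create job --from=cronjob/my-cronjob <name>` does.
	cj, err := clientset.BatchV1().CronJobs(namespace).Get(ctx, cronJobName, metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	manual := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: cronJobName + "-manual-",
			Namespace:    namespace,
		},
		Spec: cj.Spec.JobTemplate.Spec,
	}
	created, err := clientset.BatchV1().Jobs(namespace).Create(ctx, manual, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("started manual job", created.Name)
}
```

This only guards against starting a second run; it does not make the CronJob controller aware of the manually created Job.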

Updating a kubernetes job: what happens?

I'm looking for a definitive answer on how Kubernetes responds to a Job being updated - specifically, if I update the container spec (image / args).
If the containers are starting up, will it stop & restart them?
If the job's pod is all running, will it stop & restart?
If it's Completed, will it run it again with the new setup?
If it failed, will it run it again with the new setup?
I've not been able to find documentation on this point, but if there is some I'd be very happy to get some signposting.
The .spec.template field of a Job cannot be updated; it is immutable. The Job would need to be deleted and recreated, which covers all of your questions.
The reasoning behind this isn't spelled out in the GitHub commit or PR, but the change was made soon after Jobs were originally added. Your questions are likely part of that reasoning, as making the field immutable removes the ambiguity.
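
As a rough illustration of that delete-and-recreate flow with client-go (the namespace, Job name, container name and image are placeholders, and it assumes an already-configured clientset, e.g. built as in the snippet further up):

```go
package jobs

import (
	"context"
	"time"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// recreateJob deletes the existing Job and submits a new one with an updated
// image, since .spec.template cannot be patched in place.
func recreateJob(ctx context.Context, clientset kubernetes.Interface, namespace, name, image string) error {
	propagation := metav1.DeletePropagationForeground
	err := clientset.BatchV1().Jobs(namespace).Delete(ctx, name, metav1.DeleteOptions{
		PropagationPolicy: &propagation, // also remove the Job's Pods
	})
	if err != nil && !apierrors.IsNotFound(err) {
		return err
	}

	// Wait until the old Job is fully gone before reusing its name.
	for {
		_, err := clientset.BatchV1().Jobs(namespace).Get(ctx, name, metav1.GetOptions{})
		if apierrors.IsNotFound(err) {
			break
		}
		if err != nil {
			return err
		}
		time.Sleep(2 * time.Second)
	}

	newJob := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{
						{Name: "worker", Image: image}, // updated image/args go here
					},
				},
			},
		},
	}
	_, err = clientset.BatchV1().Jobs(namespace).Create(ctx, newJob, metav1.CreateOptions{})
	return err
}
```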

Delete a pod but mark the job as successful

I have a system that brings up Jobs, each with a Pod that has multiple containers.
Two of those containers are not under my control and run "background"/sidecar daemons. The container I do control runs to completion, but once it's done the Pod is still considered active, since the other two containers are still up.
I've tried killing the other containers from mine, but that only works ~99% of the time, and we run a lot of Jobs. When it fails, deleting the Pod (or letting the Job time out) works, but it marks the Job as a failure rather than a success, and I use that status to indicate the result of their work to users.
Edit: I'm aware of the "sidecar containers" KEP, but no PR has been accepted for it yet, so it's not going to be available in a stable cluster for a very long time.
I don't know to what extent this answers your question, but there is an ongoing discussion about sidecar containers, as well as an enhancement proposal about this.
An interesting solution proposed in that thread is the k8s-controller-sidecars controller contributed by a user, which seems easy to configure.
Try it out, and let us know if it works.

Chronos + Mesosphere. How to execute tasks in parallel?

Good day everyone.
I have a single server running Chronos, Mesos and Zookeeper, and I want to use Chronos to run my scripts daily. Some scripts today, some tomorrow, and so on.
The problem is that when I try to launch tasks one after another, only the first one executes correctly; the other gets lost somewhere. If I launch the first, wait 3-4 seconds and then launch the other, they both run, but sequentially.
I need them to run in parallel.
Can someone provide a hint on this? Maybe there is some setting I must change?
You should set a UTC start time for both tasks with a repeating period of 24 hours. In that case, there is no reason why your tasks should not execute in parallel. Check the Chronos logs and the tasks' logs in the Mesos sandbox for errors.
You can certainly run all of these components (Chronos, master, slave, and ZK) on the same machine, although ZK really becomes valuable once you have HA with multiple masters.
As user4103259 suggested, check the master and slave logs for that LOST/failed taskId to see what exactly happened to it. A task can go LOST or fail for numerous reasons, anywhere along the launch/running/completing process.
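
For what it's worth, a Chronos scheduled job is defined by an ISO 8601 schedule of the form R/<start>/<period>, so two jobs with the same UTC start time and a P1D period should be eligible to launch together. Below is a rough sketch that posts two such definitions to the Chronos REST API; the host, port, job names, commands and owner are placeholders, and the /scheduler/iso8601 path is the endpoint as I recall it, which may differ between Chronos versions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chronosJob mirrors the job-definition fields that matter here; other fields
// (cpus, mem, epsilon, ...) are left to their defaults.
type chronosJob struct {
	Name     string `json:"name"`
	Command  string `json:"command"`
	Schedule string `json:"schedule"` // ISO 8601: R/<start>/<period>
	Owner    string `json:"owner"`
}

func main() {
	jobs := []chronosJob{
		{Name: "daily-script-a", Command: "/opt/scripts/a.sh", Schedule: "R/2015-06-01T03:00:00Z/P1D", Owner: "ops@example.com"},
		{Name: "daily-script-b", Command: "/opt/scripts/b.sh", Schedule: "R/2015-06-01T03:00:00Z/P1D", Owner: "ops@example.com"},
	}
	for _, j := range jobs {
		body, err := json.Marshal(j)
		if err != nil {
			panic(err)
		}
		// Endpoint for ISO 8601-scheduled jobs; adjust the host/port to your setup.
		resp, err := http.Post("http://localhost:4400/scheduler/iso8601", "application/json", bytes.NewReader(body))
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
		fmt.Println("submitted", j.Name, "->", resp.Status)
	}
}
```

Whether the two tasks actually run side by side still depends on the Mesos agent having enough free CPU and memory to host both offers at once.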