Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 1 year ago.
I have read quite a few papers on serverless cold starts, but I have not found a clear explanation of what actually causes them. Could you try to explain it from both the commercial and the open-source platforms' points of view?
Commercial platforms such as AWS Lambda or Azure Functions - I know they are more like a black box to us.
There are also open-source platforms such as OpenFaaS, Knative, or OpenWhisk. Do those platforms have a cold start issue as well?
My initial understanding of cold start latency is that it is the time spent spinning up a container. After the container is up, it can be reused if it has not been killed yet, so there is a warm start. Is this understanding really true? I have tried running a container locally from an image (see the commands below), and no matter how large the image is, the latency is close to zero.
Is the image download time also part of the cold start? But no matter how many cold starts happen on one node, only one image download is needed, so this seems to make little sense.
Maybe a different question: I also wonder what happens when we instantiate a container from an image. Are the executable and its dependent libraries (e.g., Python libraries) copied from disk into memory during this stage? What if there are multiple containers based on the same image? I guess there should be multiple copies from disk to memory, because each container is an independent process.
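For reference, the kind of local test I mean looks roughly like this (the image name is just an example):

```bash
# Time a run when the image is already cached locally: only container
# creation, namespace/cgroup setup, and process startup are measured.
time docker run --rm python:3.11-slim python3 -c 'print("hello")'

# Remove the cached image so the next run also pays the image-pull cost.
docker rmi python:3.11-slim
time docker run --rm python:3.11-slim python3 -c 'print("hello")'
```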
There are many levels of "cold start", and they all add latency. The hottest of the hot paths is when the container is still running and additional requests can simply be routed to it. The coldest is a brand-new node: it has to pull the image, start the container, register with service discovery, wait for the serverless platform's routing layer to update, and probably a few more steps if you dig deep enough. Some of those can happen in parallel, but most can't. If the pod has been shut down because it wasn't being used, and the next run is scheduled on the same machine, then yes, the kubelet usually skips pulling the image (unless imagePullPolicy: Always is forced somewhere), so you get a somewhat faster launch. Kubernetes' scheduler doesn't generally optimize for that, though.
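As a rough illustration (not tied to any particular serverless platform), this is the kind of pod spec where the warm-node path can skip the image download; a minimal sketch with placeholder names:

```bash
# Minimal sketch: the image is only pulled if it is not already cached on
# the node. With "Always", every start on that node re-checks the registry
# and may re-download the image.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cold-start-demo        # hypothetical name
spec:
  containers:
  - name: fn
    image: python:3.11-slim    # stand-in for a function image
    imagePullPolicy: IfNotPresent
    command: ["python3", "-c", "print('ready')"]
EOF
```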
There are already a million questions and answers stating that you cannot run a VPA and an HPA on the same metrics at the same time in your cluster, both actively making modifications, for obvious reasons.
However, my question is: can you run a VPA in recommend-only mode (updateMode: "Off") alongside an active HPA? It seems like others have had this question, but I haven't found a definitive answer. I just want to be really safe before I start turning things on and break something.
Others have asked here: https://github.com/kubernetes/autoscaler/issues/3858
Well, I just took the plunge and deployed it. So far I'm not having any issues, so it appears to be safe to deploy a VPA and an HPA that both watch CPU and memory, as long as the VPA is set to updateMode: "Off".
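For anyone who wants to try the same, here is a minimal sketch of a recommend-only VPA pointed at the same Deployment an HPA already scales (assumes the VPA CRDs and recommender are installed; names are placeholders):

```bash
kubectl apply -f - <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa            # placeholder name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # the same Deployment the HPA targets
  updatePolicy:
    updateMode: "Off"         # recommendations only, no evictions or updates
EOF

# Read the recommendations without letting the VPA act on them.
kubectl describe vpa my-app-vpa
```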
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 months ago.
Some context:
I have little experience with CI/CD and have managed a fast-paced, growing application since it first saw the light of day. It is composed of several microservices across different environments. Devs are constantly pushing new code to DEV, but they frequently forget to send new values from their local .env over to the OpenShift cloud, regardless of whether this is a brand-new environment or an existing one.
The outcome? Services that fail because their secrets were never updated.
I understand the underlying issue is a lack of communication between us DevOps staff and the devs themselves, but I've been trying to figure out some sort of process that would make sure we are not missing anything. Maybe something like a "before takeoff checklist" (yes, like the ones pilots run through before a real flight): if the check fails, then the aircraft is not ready to take off.
So the question is for everyone out there who practices DevOps: how do you deal with this?
Does anyone automate this within OpenShift/Kubernetes, for example? From your perspective and experience, would you suggest any tools for that, or simply enforce communication?
I guess no checklist or communication will work for a team that "...frequently forgets about sending new values from their local .env over...", which you must already have tried.
A step in your pipeline should check for service availability before proceeding to the next step, e.g., does the service have an endpoint registered within an acceptable time? No endpoint means the backing pod(s) did not reach the ready state as expected. In that case, roll back, send a notification to the team responsible for the service/application, and exit cleanly (see the sketch below).
There is no fixed formula for CI/CD, especially where human error is involved. Checks and balances at every step are the least you can do to trigger an early warning and avoid a disastrous deployment.
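A minimal sketch of such a gate as a pipeline step, assuming the services are Kubernetes/OpenShift Deployments and kubectl (or oc) is available in the pipeline image; all names are placeholders:

```bash
#!/usr/bin/env bash
set -euo pipefail

APP="my-service"          # placeholder deployment/service name
NS="dev"                  # placeholder namespace

# Wait for the rollout; fail the step if it doesn't become ready in time.
if ! kubectl -n "$NS" rollout status deployment/"$APP" --timeout=120s; then
  echo "Rollout of $APP failed or timed out - rolling back" >&2
  kubectl -n "$NS" rollout undo deployment/"$APP"
  # Here you would also notify the owning team (Slack/webhook/email).
  exit 1
fi

# Sanity check: the Service should have at least one ready endpoint.
ENDPOINTS=$(kubectl -n "$NS" get endpoints "$APP" \
  -o jsonpath='{.subsets[*].addresses[*].ip}')
if [ -z "$ENDPOINTS" ]; then
  echo "No ready endpoints behind service $APP - rolling back" >&2
  kubectl -n "$NS" rollout undo deployment/"$APP"
  exit 1
fi
```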
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I work in a hospital where the system shuts down during updates, leaving all orders hanging with no way to approve or modify them. Considering it's a hospital, this is a huge problem. So my question is: how can we update the system without shutting it down? I'm most interested in rolling updates, where there is no downtime.
This is a very broad question, but generally, yes, it is perfectly possible to update a system without shutting it down.
The simplest possible solution is to have a duplicate system. Let's say you are currently working with System A. When you want to do an update, you update System B. The update can take as long as it needs, since you are not using System B. There will be no impact at all.
Once the update is finished, you can test the hell out of System B to make sure the update didn't break anything. Again, this has no impact on working with the system. Only after you are satisfied that the update didn't break anything, do you switch over to using System B.
This switchover is near instantaneous.
If you discover later that there are problems with the update, you can still switch back to System A which is still running the old version.
For the next update, you again update the system which is currently not in use (in this case System A) and follow all the same steps.
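If the system happens to run on something like Kubernetes, the switchover itself can be as small as repointing a Service selector from one version to the other; a hedged sketch with made-up names:

```bash
# Both versions run side by side, e.g. deployments labelled version=a and
# version=b. Flipping the Service selector moves traffic near-instantly,
# and flipping it back is the rollback.
kubectl patch service hospital-orders \
  -p '{"spec":{"selector":{"app":"orders","version":"b"}}}'
```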
You can do the same if you have a backup system. Update the backup system, then fail over, then update the main system. Just be aware that while the update is happening, you do not have a backup system. So, if the main system crashes during the update process, you are in trouble. (Thankfully, this is not quite as bad as it sounds, because at least you will already have a qualified service engineer on the system who can immediately start working on either pushing the update forward to get the backup online or fixing the problem with the main system.)
The same applies when you have a redundant system. You can temporarily disable redundancy, update the disabled half, flip over, and do it again. Of course, just like in the previous option, you are operating without a safety net while the update is in progress.
If your system is a cluster system, it's even easier. If you have enough resources, you can take one machine out of the cluster, update it, then add it back into the cluster again, then do the next machine, and so on. (This is called a "rolling update", and is how companies like Netflix, Google, Amazon, Microsoft, Salesforce, etc. are able to never have any downtime.)
If you don't have enough resources, you can add a machine to the cluster just for the update, and then you are back to the situation that you do have enough resources.
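In Kubernetes terms, that per-machine replacement is what the default Deployment rolling-update strategy automates; a minimal sketch (the numbers and names are illustrative, not a recommendation):

```bash
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api            # placeholder name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: orders-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0       # never drop below full capacity
      maxSurge: 1             # add one extra pod while replacing old ones
  template:
    metadata:
      labels:
        app: orders-api
    spec:
      containers:
      - name: api
        image: registry.example.com/orders-api:1.0.1   # hypothetical image
        readinessProbe:       # only route traffic once the new pod is ready
          httpGet:
            path: /healthz
            port: 8080
EOF
```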
Yes.
Every kind of component can be updated without a reboot.
On Windows, you can always postpone reboots.
Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
My understanding of a virtual appliance is one or more pre-configured VMs designed to work with one another, each with a pre-configured:
Virtual hardware configuration (disks, RAM, CPUs, etc.)
Guest OS
Installed & configured software stack
Is this (essentially) the gist of what an appliance is? If not please correct me and clarify!
Assuming that my understanding is correct, it raises the question: what are the best ways to back up an appliance? Obviously an SCM like SVN would not be appropriate, because an appliance isn't source code - it's an enormous binary file representing an entire machine or even a set of machines.
So how do you all keep "backups" of appliances? How do you imitate version control for appliance configurations?
I'm using VBox so I'll use that in the next example, but this is really a generic virtualization question.
If I develop/configure an appliance, label it as the "1.0" version, and deploy that appliance to a production server running the VBox hypervisor, then in software terms I'd call that a "release". What happens if I find a configuration issue with the guest OS of that appliance and need to release a 1.0.1 patch?
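To make it concrete, the kind of workflow I'm imagining with VBox would look something like this (VM names and version tags are made up):

```bash
# Export the configured appliance as a versioned OVA "release artifact".
VBoxManage export my-appliance --output my-appliance-1.0.ova

# After fixing the guest-OS configuration issue, export the patched build
# as a new artifact rather than overwriting the old one.
VBoxManage export my-appliance --output my-appliance-1.0.1.ova

# Optionally keep a point-in-time snapshot inside VirtualBox as well.
VBoxManage snapshot my-appliance take "v1.0.1" --description "config fix"
```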
Thanks in advance!
From what I've seen and used, appliances are released with the ability to restore their default VM, probably from a ghost partition of some kind (I'm thinking of the Comrex radio STL units I've worked with). Patches can be applied to the appliance, with the latest patch usually containing all the previous patches (if needed).
A new VM means a new appliance - Comrex ACCESS 2.0 or whatever - and 1.0 patches don't work on it. It's never backed up; rather, it can just be restored to a factory state. The Comrex units store connection settings, static IP configuration, and all that junk, but resetting kills all of it and it has to be re-entered (which I've had to do before).
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 4 years ago.
I wanted to know whether I can just use the Linux command 'dd' to make a copy of a hard drive. The hard drive has Windows XP, and the goal is to move the data from a smaller HDD to this larger one without having to explicitly reinstall Windows. I personally don't know enough to tell whether this has the potential to mess up the file system. I also don't know whether this works between different models of HDDs.
Yes, you can do this, with a couple things to be aware of.
Different brand hard drives (or even different models of the same brand) may not be the exact same size. You should check the real size of the block devices to verify the target drive is the same size or larger than the source drive. As long as it is, you are good to go.
If the target drive happens to be larger, after you've cloned the drive you can use gparted to expand the partition to fill the drive.
In fact, you could use gparted to shrink the source partition and then copy it to the target drive if the target drive happens to be smaller.
As dicoroce mentioned, you can also copy just partitions instead of the entire drive. Just be aware that if you only copy the partition, you will have to reinstall the MBR (though that is trivial).
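A hedged sketch of the whole procedure (device names are examples; double-check them with lsblk before running anything destructive):

```bash
# Identify the source and target disks and compare their sizes in bytes.
lsblk -d -o NAME,SIZE,MODEL
blockdev --getsize64 /dev/sda    # source (smaller disk)
blockdev --getsize64 /dev/sdb    # target (must be the same size or larger)

# Clone the whole drive, MBR and partition table included.
# bs=4M is a reasonable starting block size; status=progress shows throughput.
dd if=/dev/sda of=/dev/sdb bs=4M conv=noerror,sync status=progress

# Afterwards, use gparted on /dev/sdb to grow the NTFS partition into the
# extra space on the larger drive.
```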
Yup. dd operates beneath the file system. You can duplicate partitions or whole drives, depending on which device nodes you use.
You may want to research the optimal "bs" (block size) for your hardware, because if you get it wrong, this can take forever.
If I'm not mistaken, one very nice feature of GParted is the ability to correctly resize an NTFS partition that has Windows installed, by updating some magic number somewhere. What this means for you is that you'll (probably) be able to expand the partition to fill the whole drive without worrying about confusing Windows.
You should be fine. Plus, if for some reason you have a problem you'll still have the original (smaller) drive as a fall back.