What are the impacts of Bluemix Auto-Scaling in terms of resource management? For example, if a runtime is specified with 1 GB of memory and auto-scaling is set to 2 instances, does the application consume 2 GB?
The same question for the disk allocated to the runtime.
Are logs from the various instances combined automatically?
If an instance is currently serving a (short) REST request, how does Auto-Scaling make sure that the request is not interrupted while being served?
When you say, "a runtime is specified with 1 GB of memory and auto-scaling is set to 2 instances" I assume that you set your group/application up such that each instance is given 1 GB of memory and you are asking what will happen if the Auto-Scaling service scales up your group/application to 2 instances.
Memory/Disk
For example, if a runtime is specified with 1 GB of memory and auto-scaling is set to 2 instances, does the application consume 2 GB? The same question for the disk allocated to the runtime.
Yes, your application will now consume 2 GB of your total memory quota. The same applies for disk allocation.
The Auto-Scaling service will deploy a new instance with the same configuration as your existing instances. If you've set up your group/application such that each instance gets 1 GB of memory, then when Auto-Scaling increases your group's instance count from 1 to 2 your application will now consume 2 GB of memory, assuming that adding another GB doesn't go beyond your memory quota. The same applies with disk allocation and quota.
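As a quick illustration with the Cloud Foundry CLI (the app name is a placeholder): limits are set per instance, so the totals scale with the instance count.

# per-instance limits: 2 instances at 1 GB memory / 1 GB disk each = 2 GB of each quota
cf scale myapp -i 2 -m 1G -k 1G
cf app myapp   # shows the per-instance limits and the current instance count

(Changing -m or -k restarts the app; changing -i alone does not.)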
Logs
Are logs from the various instances combined automatically?
Yes, the logs are combined automatically.
Cloud Foundry applications combine logs as well. For more information about viewing these logs check out the documentation.
The IBM Containers service sends logs to IBM's Logmet service. For more information check out the documentation.
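For Cloud Foundry apps, for instance, you can view the combined stream with the cf CLI; each line is tagged with the index of the instance that emitted it (the app name and output below are illustrative):

cf logs myapp --recent
# example output; the exact format can vary:
#   2016-09-01T12:00:01.00+0000 [App/0] OUT Handling GET /api/items
#   2016-09-01T12:00:01.20+0000 [App/1] OUT Handling GET /api/items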
Handling REST requests without interruption
If an instance is currently serving a (short) REST request, how does Auto-Scaling make sure that the request is not interrupted while being served?
Adding an instance to the group/application: no interruption
If an instance is being added to the group then there will be no interruption to existing requests because any previously existing instances are not touched or altered by the Auto-Scaling service.
Removing an instance from the group/application: possible interruption
At this time, the Auto-Scaling service does not protect in-flight requests during a scale-down operation. If a request is being processed by the instance that is being removed, that request will be dropped. It is up to the application to handle such cases; one option is to store session data in external storage so that the user can retry the request.
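As a sketch of that option on Bluemix (the service, plan, and app names below are placeholders, not a specific recommendation), you could bind an external session store and restage the app to pick up its credentials:

# create and bind an external session store; names and plan are examples
cf create-service rediscloud 30mb session-store
cf bind-service myapp session-store
cf restage myapp   # pick up the new VCAP_SERVICES credentials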
Additional Information
There are currently two different Auto-Scaling services in Bluemix:
Auto-Scaling for Cloud Foundry applications exists in all Bluemix regions and is available as a service you bind to your existing Cloud Foundry application.
Auto-Scaling for Container Groups is currently available as a beta service for the London region in the new Bluemix console.
The answers to your questions above are applicable to both services.
I hope this helps! Happy scaling!
Related
I have a Flask API running in an Azure App Service. The API loads quite a lot of data on startup and uses about 60-70% of the memory on an 8 GB plan (P1V3). I'm planning to scale the App Service plan to 3-5 instances depending on traffic.
Now I also want to release new versions of the API without downtime, but having a staging slot requires me to scale the plan to 16 GB in order to run two versions of the API simultaneously before swapping.
This is a very inefficient use of resources, as our API then runs at around 30% memory for double the cost, so I'm looking for ways to optimize our approach.
I've tried to manually scale up from 8 to 16 GB on release, but this takes down the API even when we have multiple instances and "Always On" enabled.
Does App Service support deploying one instance at a time (rolling deployment), or other deployment strategies that don't require us to scale our App Service plan to 16 GB?
We recently needed to add the Microsoft.Powershell.DSC extension to the VMSS that contains our Service Fabric cluster. We redeployed the cluster using our ARM template, with the addition of the new extension for DSC. During the deployment we observed that as many as 4 out of 5 scale set instances were restarting at the same time, and the services in our cluster were unresponsive during that window. The outage was only a few minutes long, but this seems like something that should not happen.
Reliability Level: Silver
Durability Level: Bronze
This is caused by the selected durability level, Bronze.
The durability tier is used to indicate to the system the privileges that your VMs have with the underlying Azure infrastructure. In the primary node type, this privilege allows Service Fabric to pause any VM-level infrastructure request (such as a VM reboot, VM reimage, or VM migration) that impacts the quorum requirements for the system services and your stateful services. In the non-primary node types, this privilege allows Service Fabric to pause any VM-level infrastructure request (VM reboot, VM reimage, VM migration, etc.) that impacts the quorum requirements for your stateful services running on it.
Bronze - No privileges. This is the default and is recommended if you are only running stateless workloads in your cluster.
I suggest reading this article; it's a Microsoft employee blog. I'll copy out the relevant part:
If you don’t mind all your VMs being rebooted at the same time, you can set upgradePolicy to “Automatic”. Otherwise set it to “Manual” and take care of applying changes to the scale set model to individual VMs yourself. It is fairly easy to script rolling out the update to VMs while maintaining application uptime. See https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-upgrade-scale-set for more details.
If your scale set is in a Service Fabric cluster, certain updates like changing OS version are blocked (currently – that will change in future), and it is recommended that upgradePolicy be set to “Automatic”, as Service Fabric takes care of safely applying model changes (like updated extension settings) while maintaining availability.
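As a hedged sketch of the Manual route with the Azure CLI (the resource group and scale set names are placeholders):

# switch the scale set model to manual upgrades
az vmss update -g myRG -n myScaleSet --set upgradePolicy.mode=Manual

# then roll the updated model out one instance at a time
for id in $(az vmss list-instances -g myRG -n myScaleSet --query "[].instanceId" -o tsv); do
  az vmss update-instances -g myRG -n myScaleSet --instance-ids $id
done

With a Service Fabric cluster, though, you would leave the policy on Automatic as the quoted excerpt recommends, and let Service Fabric sequence the changes safely.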
I have deployed my Dockerized microservices to AWS using Elastic Beanstalk; they are written in Scala with Akka HTTP (https://github.com/theiterators/akka-http-microservice).
I have allocated 512 MB of memory to each Docker container and am seeing performance problems. I have noticed that CPU usage increases as the server receives more requests (e.g. 20%, 23%, 45%, depending on load) and then automatically drops back to a normal level (0.88%). Memory usage, however, keeps increasing with every request and is not released even after CPU usage returns to normal; once it reaches 100%, the Docker container is killed and restarted.
I have also enabled the auto-scaling feature in EB to handle a large number of requests, but it only creates a duplicate instance after the running instance's CPU usage reaches its maximum.
How can I set up auto-scaling to create another instance once memory usage reaches its maximum limit (i.e. 500 MB out of 512 MB)?
Please suggest a solution to these problems as soon as possible; this is a very critical problem for us.
CloudWatch doesn't natively report memory statistics, but Amazon provides some scripts (usually just referred to as the "CloudWatch Monitoring Scripts for Linux") that push those statistics into CloudWatch so you can use them to build a scaling policy.
The Elastic Beanstalk documentation provides some information on installing the scripts on the Linux platform at http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-cw.html.
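Once the scripts are installed, a cron entry along these lines (the path assumes the scripts' usual install location) pushes memory metrics every five minutes:

# report memory utilization to CloudWatch every 5 minutes
*/5 * * * * ~/aws-scripts-mon/mon-put-instance-data.pl --mem-util --from-cron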
However, this will come with another caveat in that you cannot use the native Docker deployment JSON as it won't pick up the .ebextensions folder (see Where to put ebextensions config in AWS Elastic Beanstalk Docker deploy with dockerrun source bundle?). The solution here would be to create a zip of your application that includes the JSON file and .ebextensions folder and use that as the deployment artifact.
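A minimal sketch of building such a bundle, assuming the standard single-container Dockerrun.aws.json layout:

# bundle the Docker deployment JSON together with the .ebextensions folder
zip -r app-bundle.zip Dockerrun.aws.json .ebextensions/
# upload app-bundle.zip as the new application version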
One thing I am unclear on is whether these metrics will be available to choose from under the Configuration -> Scaling section of the application. You may need another .ebextensions config file to set up the scaling trigger on the custom metric, such as:
option_settings:
  # namespace assumed to be aws:autoscaling:trigger, which drives Auto Scaling;
  # the LowerThreshold and Unit values below are illustrative assumptions
  aws:autoscaling:trigger:
    BreachDuration: 3
    LowerBreachScaleIncrement: -1
    LowerThreshold: 40
    MeasureName: MemoryUtilization
    Period: 60
    Statistic: Average
    Unit: Percent
    UpperBreachScaleIncrement: 2
    UpperThreshold: 90
Now, even if this works: if the application does not lower its memory usage after scaling out and the load goes down, the scaling policy will just keep triggering and eventually reach the maximum instance count.
I'd first gather some garbage collection statistics for the JVM, and perhaps tune it to collect more often so memory comes down faster after the application load drops.
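As a hedged starting point (the flag values below are illustrative assumptions, not tuned for your workload), you could cap the heap below the container limit and ask the JVM to return freed memory more eagerly:

# example JVM options for a 512 MB container; values are illustrative
JAVA_OPTS="-Xmx400m -XX:MaxHeapFreeRatio=30 -XX:MinHeapFreeRatio=10 -XX:+PrintGCDetails"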
I've been trying to push a new build of an app to Bluemix, but staging keeps failing when it's at "Installing App Management" because it can't create regular files and directories due to the disk quota being exceeded.
I've already tried pushing it with "-k 2G", but it still fails.
Is there any way to find out how or why the disk quota keeps being exceeded? There's no way I'm near using 2GB of disk space.
Switching to npm v3 is a potential solution here, as it reduces the number of duplicated dependencies.
You can do that in your package.json, for example:
"engines": { "npm": "3.x" }
By design, Cloud Foundry applications on IBM Bluemix are limited to a disk quota of 2 GB (the default is 1 GB). If a cloud application needs more than 1 GB (even 1 GB is a lot for a cloud application...), it should usually be redesigned according to cloud patterns: break it down into microservices, and use external storage services if it simply needs static storage (for example, the Object Storage service on Bluemix).
You also have to consider that a cloud application's filesystem is ephemeral: the application can be redeployed automatically to a different virtual environment without any visible sign to end users.
Logs, too, should be collected by external services (by routing the log stream) if you need to keep them safe; otherwise they are lost as soon as the application is restarted on a different cluster node.
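For example, you can route the log stream to an external syslog endpoint with a user-provided service (the drain URL and app name below are placeholders):

cf cups my-log-drain -l syslog://logs.example.com:5000
cf bind-service myapp my-log-drain
cf restage myapp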
We are transitioning from building applications on monolithic application servers to more microservices-oriented applications on Spring Boot. We will publish health information with Spring Boot Actuator over HTTP or JMX.
What are the options/best practices to monitor services, that will be around 30-50 in total? Thanks for your input!
Not knowing too much detail about your architecture and services, here are some suggestions that represent (a subset of) the strategies that have been proven in systems I've worked on in production. For this I am assuming you are using one container/VM per microservice:
If your services are stateless (as they should be :-) and you have redundancy (as you should have :-), then set up your load balancer to call /health on each instance; if the health check fails, the load balancer should take the instance out of rotation. Depending on how tolerant your system is, you can define failure with various rules rather than a single failed check (e.g. 3 consecutive failures).
On each instance, run a Nagios agent that calls your health check (/health) on localhost. If this fails, generate an alert that specifies which instance failed.
You also want to ensure that a higher-level alert is generated if none of the instances of a given service are healthy. You might be able to set this up in your load balancer, or you can set up a monitor process outside the load balancer (see the sketch after these suggestions) that calls your service periodically; if it gets no response at all (i.e. none of the instances are responding), it should sound all alarms. Hopefully this condition is never triggered in production, because you dealt with the other alarms.
Advanced: in a cloud environment you can connect the alarms to automatic scaling features. That way, unhealthy instances are torn down and healthy ones are brought up automatically whenever the monitoring system deems an instance of a service unhealthy.
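Here is a minimal sketch of such an external monitor, assuming a host with curl and a mail command for alerting (the URL, thresholds, and addresses are placeholders):

# poll the service through the load balancer; alert after 3 consecutive failures
FAILS=0
while true; do
  if curl -fsS --max-time 5 https://myservice.example.com/health > /dev/null; then
    FAILS=0
  else
    FAILS=$((FAILS + 1))
    if [ "$FAILS" -ge 3 ]; then
      echo "ALERT: myservice unhealthy ($FAILS consecutive failures)" | mail -s "myservice down" oncall@example.com
      FAILS=0
    fi
  fi
  sleep 30
done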