Can't allocate more than 1 core to a container - kubernetes

Having an issue allocating more than 1 CPU to a pod that is running code that requires more processing power.
I have set my limit for the container to 3 CPUs
and have set the container to request 2 CPUs with a limit of 3.
But when running, the container never goes over 1000m (1 CPU).
There is very little else running during this process, and KEDA will start new nodes if needed.
How can I assign more CPU power to this container?
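Roughly, the relevant part of the container spec looks like this (pod name and image are simplified placeholders, not the real ones):

    apiVersion: v1
    kind: Pod
    metadata:
      name: heavy-worker               # placeholder
    spec:
      containers:
        - name: worker
          image: example/worker:latest # placeholder image
          resources:
            requests:
              cpu: "2"                 # request 2 CPUs
            limits:
              cpu: "3"                 # limit of 3 CPUs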
UPDATE
So I changed the default limit as suggested by moonkotte, but I can only ever get just over 1 CPU.
New nodes come online through KEDA when more containers are required.
Each node has 4 CPUs, so there are sufficient resources.
These are the details of each node; in this case it is running one of the containers in question.
It just isn't using all of the CPU allocated.
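For reference, the default limit mentioned above comes from the namespace LimitRange; a minimal sketch of such an object (the values here are illustrative, not my exact settings):

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: cpu-defaults        # placeholder
    spec:
      limits:
        - type: Container
          default:              # default CPU limit applied when a container sets none
            cpu: "3"
          defaultRequest:       # default CPU request applied when a container sets none
            cpu: "2"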

Related

Running multiple containers on the same Service Fabric node

I have a Windows Service Fabric node with 4 cores and I want to host 3 containerized stateless services on it, where each Windows container is allocated 1 core to read a message from a queue and process it. I ran some experiments and got these results:
- 1 container running on the node: message takes ~18 sec to be processed, avg CPU usage per container: 24.7%, memory usage: 1 GB
- 2 containers running on the node: message takes ~25 sec to be processed, avg CPU usage per container: 24.4%, memory usage: 1 GB
- 3 containers running on the node: message takes ~35 sec to be processed, avg CPU usage per container: 24.6%, memory usage: 1 GB
I thought that containers are supposed to be isolated, and I expected the processing time to stay constant at ~18 sec regardless of the number of containers, but in this case it seems that adding one container affects the processing time in the other containers. Each container is set to use 1 core, so they shouldn't be overstepping into each other's resources, and the CPU is not reaching full utilization. Even if the CPU were a bottleneck here, I'd expect that at least 2 containers would be able to run with ~18 sec processing time.
Is there a logical explanation for these results? Is it not possible to run multiple containers on the same Service Fabric host without affecting the performance of each, when there are enough compute resources? How big could the Service Fabric overhead possibly be when running multiple containers on the same node?
Thanks!
Your container is not only using CPU, but also memory and I/O (disk, network), which can also become bottlenecks.
To see the overhead of SF, run the containers outside of SF and see if it makes a difference.
Use a machine with more memory, and after that, try using an SSD drive. See if that increases performance.
To avoid per-process overhead, consider using a single container and having multiple threads do the message processing in parallel. Make sure to assign it 3 cores.

Pods stay in Pending state for too long

I have a cluster where jobs are created based on what my users do.
Sometimes I have 0 jobs in parallel and sometimes 20 to 100.
I have set the following limits for each container (a sketch of the spec follows this list):
- cpu limit: 512m
- memory limit: 512Mi
- cpu request: 256m
- memory request: 128Mi
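Roughly, each job carries those values like this (job name and image are placeholders):

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: user-job                     # placeholder
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: example/worker:1.0  # placeholder image
              resources:
                requests:
                  cpu: 256m
                  memory: 128Mi
                limits:
                  cpu: 512m
                  memory: 512Mi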
I have 2 nodes by default, and each one has:
- 7.91 CPU allocatable
- 10.16 GB allocatable
The node pool can scale to 5 nodes max.
But when the cluster starts to have 8 or more jobs in parallel, new jobs sit in Pending, waiting for other jobs to finish.
If a job is selected to start right away, it completes in 6 to 7 seconds.
But when the cluster starts to struggle at 8 or 10 jobs, each job takes approximately 20 seconds to complete, because it is blocked in the Pending or ContainerCreating state.
I have IfNotPresent as the imagePullPolicy and each image has a version.
Given my allocatable resources, I would expect the cluster to only start struggling at around 28 jobs, then create a new node, and so on.
Why am I wrong?
Is it possible to force each container to start without going through the Pending state?
I have found an alternative scheduler, poseidon-firmament-alternate-scheduler, but I am not sure whether it can help me.

Does Kubernetes allocate resources if the resource limit is above node capacity?

I would like to know, does the scheduler consider resource limits when scheduling a pod?
For example, if the scheduler schedules 4 pods on a specific node with total capacity <200Mi, 400m> and the total resource limits of those pods are <300Mi, 700m>, what will happen?
Only resource requests are considered during scheduling. This can result in a node being overcommitted. (Managing Compute Resources for Containers in the Kubernetes documentation says a little more.)
In your example, say your node has 1 CPU and 2 GB of RAM, and you've scheduled 4 pods that request 0.2 CPU and 400 MB RAM each. Those all "fit" (requiring 0.8 CPU and 1.6 GB RAM total) so they get scheduled. If any individual pod exceeds its own limit, its CPU usage will be throttled or memory allocation will fail or the process will be killed. But, say all 4 of the pods try to allocate 600 MB of RAM: none individually exceeds its limits, but in aggregate it's more memory than the system has, so the underlying Linux kernel will invoke its out-of-memory killer and shut down processes to free up space. You might see this as a pod restarting for no apparent reason.
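To make that concrete, a sketch of one of the four pods in this example (the limit values are illustrative; with four copies, the requests fit on the node but the limits, and potentially the actual usage, do not):

    apiVersion: v1
    kind: Pod
    metadata:
      name: overcommit-demo         # one of four identical pods
    spec:
      containers:
        - name: app
          image: example/app:latest # placeholder image
          resources:
            requests:
              cpu: 200m             # 4 x 200m = 0.8 CPU, fits on a 1-CPU node
              memory: 400Mi         # 4 x 400Mi = 1.6 GB, fits in 2 GB
            limits:
              cpu: 500m             # 4 x 500m = 2 CPUs, more than the node has
              memory: 700Mi         # 4 x 700Mi = 2.8 GB, more than the node has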

Once a Kubernetes CPU limit is set, does the container get more than the set limit?

Let's say I have set a request of 1 CPU and a limit of 4 CPUs. Once the container is up, is it possible for the container to get more than 4 CPUs if the node has more CPU available?
No, it is not possible - the limit is "hard". Even if there is more CPU resource available, the container will not be allowed to use it.
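As a minimal illustration, the relevant resources fragment of such a container spec (everything around it omitted):

    resources:
      requests:
        cpu: "1"   # the scheduler guarantees at least this much
      limits:
        cpu: "4"   # hard ceiling: usage is throttled here even if the node has spare CPU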

AWS ECS Task Memory and CPU Allocation

I'm looking for guidance on allocating memory for an ECS task. I'm running a Rails app for a client who wants to keep server costs as low as possible. I was looking at the medium instance size that has 2 CPUs and 4 GB of memory.
Most of the time I'll only need 1 container running the Rails server at a time. However, there are occasional spikes, and I want to scale out another server and have the container deployed to it. When traffic slows down, I want to scale back down to the single server / task.
Here's where I need help:
What should I make my task memory setting? 4 GB? That would be the total on the box, but it doesn't account for system processes. I could do 3 GB, but then I'd be needlessly wasting some free memory. Same question for the CPU... should I just make it 100%?
I don't want to pay for a bigger server, e.g. 16 GB, just to sit there with only 1 container needed most of the time... such a waste.
What I want seems simple: 1 task per instance. When the instance gets to 75% usage, scale out a new instance and deploy the task to it. I don't get why I have to set task memory and CPU settings when it's a one-to-one ratio.
Can anyone give me guidance on how to do what I've described? Or what the proper task definition settings should be when it's meant to be one-to-one with the instance?
Thanks for any help.
--Edit--
Based on feedback, here's a potential solution:
Task definition: memory reservation of 3 GB and memory (hard limit) of 4 GB.
EC2 medium nodes, which have 4 GB of memory.
ECS Service autoscaling configured:
- scale up (increase task count by 1) when Service CPU utilization is greater than 75%.
- scale down (decrease task count by 1) when Service CPU utilization is less than 25%.
ECS Cluster scaling configured:
- scale up (increase EC2 instance count by 1) when cluster memory utilization is greater than 80%.
- scale down (decrease EC2 instance count by 1) when cluster memory utilization is less than 40%.
Example:
Start with 1 EC2 instance running a task with a 3 GB reservation. This is 75% cluster utilization.
When the service spikes and its CPU utilization jumps above 75%, it triggers a service scale-out. The task count is increased and the new task asks for another 3 GB, making 6 GB total while only 4 GB is available, so the cluster is at 150% utilization.
This triggers the cluster scale-out (over 80%), which adds a new EC2 node to the cluster for the new task. Once it's there, we're back down to 6 GB demand / 8 GB available, which is 75% and stable.
The scale-down would happen the same way.
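A rough CloudFormation-style sketch of the service scale-out half of this setup (all resource names are placeholders; the scale-down policy and the cluster-level memory alarms would follow the same pattern):

    ServiceScalingTarget:
      Type: AWS::ApplicationAutoScaling::ScalableTarget
      Properties:
        MinCapacity: 1
        MaxCapacity: 2
        ResourceId: service/my-cluster/my-rails-service   # placeholder cluster/service names
        ScalableDimension: ecs:service:DesiredCount
        ServiceNamespace: ecs
        RoleARN: !GetAtt AutoScalingRole.Arn               # IAM role assumed to be defined elsewhere

    ServiceScaleOutPolicy:
      Type: AWS::ApplicationAutoScaling::ScalingPolicy
      Properties:
        PolicyName: scale-out-on-cpu
        PolicyType: StepScaling
        ScalingTargetId: !Ref ServiceScalingTarget
        StepScalingPolicyConfiguration:
          AdjustmentType: ChangeInCapacity
          Cooldown: 300
          StepAdjustments:
            - MetricIntervalLowerBound: 0
              ScalingAdjustment: 1                         # add one task

    ServiceCpuHighAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        Namespace: AWS/ECS
        MetricName: CPUUtilization
        Dimensions:
          - Name: ClusterName
            Value: my-cluster                              # placeholder
          - Name: ServiceName
            Value: my-rails-service                        # placeholder
        Statistic: Average
        Period: 60
        EvaluationPeriods: 2
        Threshold: 75                                      # the 75% trigger described above
        ComparisonOperator: GreaterThanThreshold
        AlarmActions:
          - !Ref ServiceScaleOutPolicy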
For setting memory for containers, I would recommend using "memoryReservation" (the soft memory limit for your container) and "memory" (the hard memory limit on your container).
You can set "memoryReservation" to 3 GB, which will ensure that another instance of the container does not end up on the same EC2 instance. The "memory" option will allow the container to use more memory when absolutely needed.
Ref: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html
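A minimal CloudFormation-style sketch of a task definition with those two settings (family, image, and port are placeholders):

    RailsTaskDefinition:
      Type: AWS::ECS::TaskDefinition
      Properties:
        Family: rails-app                    # placeholder
        ContainerDefinitions:
          - Name: rails
            Image: example/rails-app:latest  # placeholder image
            Essential: true
            MemoryReservation: 3072          # soft limit: 3 GB reserved for placement
            Memory: 4096                     # hard limit: the container is killed above 4 GB
            PortMappings:
              - ContainerPort: 3000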
ECS right now does not provide a way to prevent the same task from being deployed twice on the same EC2 compute instance.
But you can work around this by either reserving CPU/memory or exposing a fixed host port on your task.