CPU/Memory setting for Task with more than 1 container - amazon-ecs

If I create a task running on Fargate with more than one container (ignore the fact that each container should arguably be split out into its own task), e.g. this Terraform code:
resource "aws_ecs_task_definition" "citi" {
family = "myfamily"
execution_role_arn = data.aws_iam_role.this.arn
container_definitions = <<EOF
[
{
"name": "myapp1",
"image": "myapp1image",
"portMappings": [
{
"containerPort": 10001
}
]
},
{
"name": "myapp2",
"image": "myapp2image",
"portMappings": [
{
"containerPort": 10004
}
]
}
]
EOF
cpu = 512
memory = 1024
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
}
Note: cpu and memory are not specified in the container definitions.
Should each container specify its own cpu/memory values as a best practice?
If they do not, as in the case above, how is cpu/memory allocated to each container?
Do they each get half of the task total, e.g. Container 1 gets cpu 256, memory 512?
Or something else, e.g. if Container 1 is busy and Container 2 is not, can Container 1 use most of the task's resources?
If each container has the default value of essential set to true, does this mean that if Container 1 stops, the task stops and therefore Container 2 stops as well?

Short story:
1- It depends on the use case. If you want to optimize for simplicity you can omit them, and the two (or more) containers will fight for resources within the task. If you want to optimize for control you should configure them (see the sketch below).
2- Yes (sort of). They will basically compete for all cpu/memory resources with even weights.
3- Yes.
Long story: https://aws.amazon.com/blogs/containers/how-amazon-ecs-manages-cpu-and-memory-resources/
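For the "optimize for control" case, the limits live inside the container definitions themselves. A minimal sketch reusing the names from the question (the 256/512 split is just an illustration; at the container level, memory is a hard limit at which the container is killed, and cpu units are reserved for that container out of the task total):
[
  {
    "name": "myapp1",
    "image": "myapp1image",
    "cpu": 256,
    "memory": 512,
    "portMappings": [{ "containerPort": 10001 }]
  },
  {
    "name": "myapp2",
    "image": "myapp2image",
    "cpu": 256,
    "memory": 512,
    "portMappings": [{ "containerPort": 10004 }]
  }
]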

Related

Kubernetes: change backoffLimit default value

Is it possible to configure backoffLimit globally (for example, change the default limit from 6 to 2 for all Jobs in the cluster without specifying backoffLimit: 2 for each Job)?
It seems that the default values, spec.backoffLimit included, are hardcoded directly into the Kubernetes code.
From apis/batch/v1/defaults.go
func SetDefaults_Job(obj *batchv1.Job) {
    // For a non-parallel job, you can leave both `.spec.completions` and
    // `.spec.parallelism` unset. When both are unset, both are defaulted to 1.
    if obj.Spec.Completions == nil && obj.Spec.Parallelism == nil {
        obj.Spec.Completions = utilpointer.Int32Ptr(1)
        obj.Spec.Parallelism = utilpointer.Int32Ptr(1)
    }
    if obj.Spec.Parallelism == nil {
        obj.Spec.Parallelism = utilpointer.Int32Ptr(1)
    }
    if obj.Spec.BackoffLimit == nil {
        obj.Spec.BackoffLimit = utilpointer.Int32Ptr(6)
    }
    labels := obj.Spec.Template.Labels
    if labels != nil && len(obj.Labels) == 0 {
        obj.Labels = labels
    }
    if utilfeature.DefaultFeatureGate.Enabled(features.IndexedJob) && obj.Spec.CompletionMode == nil {
        mode := batchv1.NonIndexedCompletion
        obj.Spec.CompletionMode = &mode
    }
    if utilfeature.DefaultFeatureGate.Enabled(features.SuspendJob) && obj.Spec.Suspend == nil {
        obj.Spec.Suspend = utilpointer.BoolPtr(false)
    }
}
So I think it cannot be changed without changing the code, at the moment.
No, it's not possible, since backoffLimit is configured at the Job level, as per the official documentation:
There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6. Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The back-off count is reset when a Job's Pod is deleted or successful without any other Pods for the Job failing around that time.
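So the limit has to be set on each Job individually; a minimal manifest for the backoffLimit: 2 case (name and image are illustrative):
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  backoffLimit: 2        # per-Job setting; there is no cluster-wide default to override
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "exit 1"]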

Problem attaching multiple NICs to a VM in Azure using a Terraform module

We are developing a module to create Linux-based NVAs. Each NVA needs to have 3 NICs attached to it, each on a different subnet. At the moment, the module successfully creates the 3 NICs, but the output from the NIC creation is of type tuple, and we have not figured out how to use the NIC resource IDs stored in that tuple as input to the VM creation.
We are trying to keep the module generic enough to be able to use it to create other types of Linux VMs that may not have multiple NICs.
Working code that calls module to create the NICs:
module "linux_vm" {
source = "../modules/test_module"
resource_group_name = azurerm_resource_group.saca_rg.name
nic_names = ["nic-mgmt","nic-int","nic-ext"]
subnet_id = ["${local.subnet_ids[4]}","${local.subnet_ids[3]}","${local.subnet_ids[2]}"]
ip_allocation = ["Static","Static","Static"]
static_ip = ["${cidrhost(local.static_ips[4], 4)}","${cidrhost(local.static_ips[3], 4)}","${cidrhost(local.static_ips[2], 4)}"]
}
Module code that creates the NICs:
resource "azurerm_network_interface" "vm_nic" {
count = length(var.nic_names)
name = var.nic_names[count.index]
resource_group_name = data.azurerm_resource_group.rg.name
location = data.azurerm_resource_group.rg.location
ip_configuration {
name = "${var.nic_names[count.index]}-ipconfig"
subnet_id = var.subnet_id[count.index]
private_ip_address_allocation = var.ip_allocation[count.index]
private_ip_address = var.static_ip[count.index]
}
}
The output from this is a tuple that resembles the output below:
"outputs": {
"nic_ids": {
"value": [
"/subscriptions/<sub_id>/resourceGroups/rg/providers/Microsoft.Network/networkInterfaces/nic-mgmt",
"/subscriptions/<sub_id>/resourceGroups/rg/providers/Microsoft.Network/networkInterfaces/nic-int",
"/subscriptions/<sub_id>/resourceGroups/rg/providers/Microsoft.Network/networkInterfaces/nic-ext"
],
"type": [
"tuple",
[
"string",
"string",
"string"
]
]
}
},
When we try to use this output as input into the VM creation block using the parameter below:
network_interface_ids = [azurerm_network_interface.vm_nic.*.id]
We get the error generated below:
Error: Incorrect attribute value type

  on ../modules/test_module/main.tf line 27, in resource "azurerm_linux_virtual_machine" "vm":
  27: network_interface_ids = [azurerm_network_interface.vm_nic.*.id]
    |----------------
    | azurerm_network_interface.vm_nic is tuple with 3 elements

Inappropriate value for attribute "network_interface_ids": element 0: string required.
Discovered what the problem was.
We were trying to use the line below:
network_interface_ids = [azurerm_network_interface.vm_nic.*.id]
But since a tuple behaves like a list, this was nesting a list inside a list. When we switched to the line below, all 3 NICs were assigned as expected:
network_interface_ids = azurerm_network_interface.vm_nic.*.id
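Equivalently, on Terraform 0.12+ the full splat form expresses the same thing without the extra brackets:
network_interface_ids = azurerm_network_interface.vm_nic[*].id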

Pre-deploying Kubernetes loadbalancer with terraform on DigitalOcean?

I'm learning about creating a k8s cluster on DigitalOcean using Terraform. I've been trying to take the ID of the single k8s node I've created and reference it from the load balancer.
The main reason for this is so that I can declare the FQDN in the .tf file.
First, here is the cluster declaration:
variable "digitalocean_token" {}
provider "digitalocean" {
token = "${var.digitalocean_token}"
}
resource "digitalocean_kubernetes_cluster" "foo" {
name = "foo"
region = "nyc1"
version = "1.12.1-do.2"
node_pool {
name = "woker-pool"
size = "s-1vcpu-2gb"
node_count = 1
}
}
And here is the load balancer declaration:
resource "digitalocean_loadbalancer" "foo" {
name = "k8s-lb.nyc1"
region = "nyc1"
forwarding_rule {
entry_port = 80
entry_protocol = "http"
target_port = 80
target_protocol = "http"
}
droplet_ids = ["${digitalocean_kubernetes_cluster.foo.node_pool.0.id}"]
}
output "loadbalancer_ip" {
value = "${digitalocean_loadbalancer.foo.ip}"
}
resource "digitalocean_record" "terraform" {
domain = "example.com" # "${digitalocean_domain.example.name}"
type = "A"
name = "terraform"
value = "${digitalocean_loadbalancer.foo.ip}"
}
# Output the FQDN for the record
output "fqdn" {
value = "${digitalocean_record.terraform.fqdn}"
}
I'm guessing that maybe the digitalocean_loadbalancer resource is only set up to work with individual droplets?
Here are the errors when I run terraform apply:
* output.loadbalancer_ip: Resource 'digitalocean_loadbalancer.foo' not found for variable 'digitalocean_loadbalancer.foo.ip'
* digitalocean_record.terraform: Resource 'digitalocean_loadbalancer.foo' not found for variable 'digitalocean_loadbalancer.foo.ip'
* digitalocean_loadbalancer.foo: droplet_ids.0: cannot parse '' as int: strconv.ParseInt: parsing "d4292e64-9c0a-4afb-83fc-83f239bcb4af": invalid syntax
Pt. 2
I added a digitalocean_droplet resource, to see what kind of id was passed to the load balancer.
resource "digitalocean_droplet" "web" {
name = "web-1"
size = "s-1vcpu-1gb"
image = "ubuntu-18-04-x64"
region = "nyc1"
}
digitalocean_kubernetes_cluster.foo.node_pool.0.id = '6ae6a787-d837-4e78-a915-cb52155f66fe'
digitalocean_droplet.web.id = 132533158
So, the digitalocean_loadbalancer resource has an optional droplet_tag argument, which can be used to supply a common tag given to the created nodes/droplets.
However, when declaring a load balancer inside Kubernetes, a new one will still be created. So, for now at least, it would appear that defining the domain/CNAME record with Terraform isn't possible on DigitalOcean.
You're using the wrong attribute reference for your load balancer droplet_ids.
droplet_ids = ["${digitalocean_kubernetes_cluster.foo.node_pool.0.id}"]
This uses the node_pool's own id, which is a UUID (hence the ParseInt error above).
What you actually need to do is use the node_pool's nodes attribute:
droplet_ids = "${digitalocean_kubernetes_cluster.foo.node_pool.0.nodes}"
The next problem you're going to have is that this returns a list of maps, and you'll need to build a list of ids from that. I'm not currently sure how to solve that, I'm afraid, but this should move you along; a sketch of one possible approach follows.
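With Terraform 0.12+, a for expression can build that list of ids; a sketch, assuming a provider version that exposes a droplet_id attribute on each node object (older provider versions may not have it):
droplet_ids = [
  for node in digitalocean_kubernetes_cluster.foo.node_pool.0.nodes : node.droplet_id
]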
It seems from your answer, however, that what you actually want is to update DNS for your load balancer.
You can do this with external-dns using its digitalocean provider.
Simply deploy it as a pod, specifying the required configuration, and ensure that the arg --source=service is set.
If you want to go a step further and allow updating DNS with a specific hostname, deploy an ingress controller like nginx-ingress and specify ingresses for your applications. The external-dns deployment (if you set --source=ingress) will take the hostname from your ingress and update DNS for you. A sketch of the relevant deployment snippet follows.
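For reference, the relevant part of an external-dns Deployment might look like this (the image tag, domain, and secret name are illustrative; external-dns's digitalocean provider reads the API token from the DO_TOKEN environment variable):
containers:
- name: external-dns
  image: registry.k8s.io/external-dns/external-dns:v0.14.0
  args:
  - --source=service            # or --source=ingress
  - --provider=digitalocean
  - --domain-filter=example.com # restrict external-dns to this zone
  env:
  - name: DO_TOKEN
    valueFrom:
      secretKeyRef:
        name: digitalocean-token
        key: token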

Iteration through REST API POST calls

I'm working with a private cloud platform that is used for creating and testing virtual machines. It has a rich API which allows me to create VMs:
{
  "name": "WIN2016-01",
  "description": "This is a new VM",
  "vcpus": 4,
  "memory": 2147483648,
  "templateUuid": "sdsdd66-368c-4663-82b5-dhsg7739smm",
  ...
}
I need to automate this process of creating machines by simply incrementing the -01 part, so it becomes:
"name": "WIN2016-01",
"name": "WIN2016-02",
"name": "WIN2016-03"
etc.
I tried to use Postman Runner and build a workflow (https://learning.getpostman.com/docs/postman/collection_runs/building_workflows/) but with no luck - I'm not sure what syntax I need to use in the Tests tab.
This is one way of doing it.
Create a collection and your POST request.
In your Pre-request Script, add the following:
/* As this will be run through the Collection Runner, this extracts
   the number of the current iteration. We're adding +1, as the iteration starts from 0. */
let count = Number(pm.info.iteration) + 1;

// Convert the current iteration number to a '00' format (will be a string)
let countString = (count < 10) ? '0' + count.toString() : count.toString();

// Set an environment variable, which can be used anywhere
pm.environment.set("countString", countString);
In your POST request body, do something like this:
{
  "name": "WIN2016-{{countString}}",
  ...
}
Now, run your collection through the 'Collection Runner', and enter the number of Iterations (e.g. how many times you want your collection to run). You can also add a Delay, if your API imposes rate limits.
Finally, click Run.
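If you prefer the command line, newman (Postman's CLI runner) exposes the same knobs; a sketch, assuming the collection has been exported to collection.json:
newman run collection.json -n 10 --delay-request 500
Here -n is the iteration count and --delay-request is the pause between requests in milliseconds.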

Is it possible to prevent children inheriting the CPU/core affinity of the parent?

I'm particularly interested in doing this on Linux, regarding Java programs. There are already a few questions that say you have no control from Java, and some RFEs closed by Sun/Oracle.
If you have access to the source code and use a low-level language, you can certainly make the relevant system calls. However, sandboxed systems - possibly without source code - present more of a challenge. I would have thought that a tool to set this per process, or a kernel parameter, would be able to control this from outside the parent process. This is really what I'm after.
I understand the reason why this is the default. It looks like some versions of Windows may allow some control of this, but most do not. I was expecting Linux to allow control of it, but it seems that's not an option.
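For what it's worth, taskset can at least retarget an already-running process from outside; note that a changed mask is inherited only by children forked after the change, not by existing ones:
taskset -cp 1234        # show the CPU list PID 1234 is allowed on
taskset -cp 0-3 1234    # restrict PID 1234 (and its future children) to cores 0-3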
Provided you have sufficient privileges, you could simply call sched_setaffinity before exec'ing in the child. In other words, from
if (fork() == 0)
    execlp("prog", "prog", (char *)NULL);
move to use
/* simple example using taskset rather than sched_setaffinity directly */
if (fork() == 0)
    execlp("taskset", "taskset", "-c", "0-999999", "prog", (char *)NULL);
[Of course using 999999 is not nice, but it can be substituted by a program which automatically determines the number of CPUs and resets the affinity mask as desired. A shell sketch follows.]
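One way to do that substitution from the shell, assuming GNU coreutils' nproc is available (--all counts every installed core, ignoring the parent's own restricted mask):
taskset -c 0-$(($(nproc --all) - 1)) prog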
What you could also do is change the affinity of the child from the parent, after the fork(). By the way, I'm assuming you're on Linux; some of this, such as retrieving the number of cores with sysconf(), will differ across OSes and Unix flavors. The example here gets the CPU of the parent process and tries to ensure all child processes are scheduled on different cores, in round robin.
/* assumes #define _GNU_SOURCE plus <sched.h> and <unistd.h>, and declarations:
   int numcpu, i, mycpu = 0, cpu = 0; cpu_set_t mycpuset; pid_t pid; */

/* get the number of online CPUs */
numcpu = sysconf(_SC_NPROCESSORS_ONLN);

/* get our CPU: the first one set in the parent's affinity mask */
CPU_ZERO(&mycpuset);
sched_getaffinity(getpid(), sizeof mycpuset, &mycpuset);
for (i = 0; i < numcpu; i++) {
    if (CPU_ISSET(i, &mycpuset)) {
        mycpu = i;
        break;
    }
}

//...

while (1) {
    // Some other stuff.....

    /* now the fork */
    if ((pid = fork()) == 0) {
        // do your child stuff
    }
    /* Parent... can schedule the child. */
    else {
        cpu = (cpu + 1) % numcpu;   /* note: `cpu = ++cpu % numcpu` is undefined behavior in C */
        if (cpu == mycpu)
            cpu = (cpu + 1) % numcpu;
        CPU_ZERO(&mycpuset);
        CPU_SET(cpu, &mycpuset);
        /* pin the child to the chosen core */
        sched_setaffinity(pid, sizeof mycpuset, &mycpuset);
        // any other parent stuff
    }
}