One or more placement constraints on the service are undefined on all nodes that are currently up - azure-service-fabric

While trying to set up specific services so that they deploy to specific node types, I am getting this error from the Visual Studio publish dialog (which fails when it calls the New-ServiceFabricApplication PowerShell command).
I am using the service manifest to define the placementConstraints like this:
<StatelessServiceType ServiceTypeName="VisualObjects2.WebServiceType">
  <PlacementConstraints>(nodeType==node2)</PlacementConstraints>
</StatelessServiceType>
How can I define these placement constraints on the nodes?

In the Azure portal, go to your Service Fabric cluster, select the node types, and for each one you can add a key-value list of placement constraints. There I added the key-value pair nodeType = node2. After this, the deployment went only to the nodes with this attribute.
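If the cluster is deployed from an ARM template instead of edited in the portal, the same attribute can be declared on the node type through its placementProperties block. A minimal sketch of one entry in the nodeTypes array of a Microsoft.ServiceFabric/clusters resource (the name and value mirror the question and are illustrative):
{
  "name": "node2",
  "placementProperties": {
    "nodeType": "node2"
  }
}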

Related

How to upsize volume of Terraformed EKS node

We have been using Terraform for almost a year now to manage all kinds of resources on AWS from bastion hosts to VPCs, RDS and also EKS.
We are sometimes really baffled by the EKS module. That could, however, be due to a lack of understanding (and documentation), so here goes:
Problem: Upsizing Disk (volume)
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "12.2.0"
cluster_name = local.cluster_name
cluster_version = "1.19"
subnets = module.vpc.private_subnets
#...
node_groups = {
first = {
desired_capacity = 1
max_capacity = 5
min_capacity = 1
instance_type = "m5.large"
}
}
I thought the default 20 GB would easily be enough for this (dev) cluster's node, but it is filling up fast, so I now want to change disk_size to, say, 40 GB.
=> I thought I could just add something like disk_size = 40 and be done.
terraform plan tells me the node needs to be replaced. This is a 1-node cluster, so that's not good. And even if it weren't, I don't want to have to e.g. drain nodes; that's why I thought we were using managed Kubernetes like EKS.
Expected behaviour: since these are elastic volumes, I should be able to upsize them (though not downsize). Why is that not possible? I can definitely do so from the AWS UI.
Sure, with a slightly scary warning:
Are you sure that you want to modify volume vol-xx?
It may take some time for performance changes to take full effect.
You may need to extend the OS file system on the volume to use any newly-allocated space
But I can work with the provided docs on that: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html?icmpid=docs_ec2_console
Any guidelines on how to increase the storage? If I do so via the UI but don't touch Terraform, my EKS state will be out of sync.
To my knowledge, there is currently no way to resize an EKS node volume without recreating the node using Terraform.
Fortunately, there is a workaround: as you also found out, you can change the volume size directly via the AWS UI or API. To update your state file afterward, you can run terraform apply -refresh-only to pull in the latest data (e.g., the increased volume size). After that, change the disk size in your Terraform configuration as well so that configuration and state stay in sync.
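A rough sketch of that workflow, reusing the node group from the question (treat this as illustrative; exact behavior depends on the module version):
# 1. Resize the volume in the AWS console/API as described above.
# 2. Pull the new values into the state:
#      terraform apply -refresh-only
# 3. Mirror the change in the configuration so plan and state agree:
node_groups = {
  first = {
    desired_capacity = 1
    max_capacity     = 5
    min_capacity     = 1
    instance_type    = "m5.large"
    disk_size        = 40 # matches the size set manually in the AWS UI
  }
}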
For the future, you might want to look into moving to ephemeral nodes as (at least my) experience shows that you will have unforeseeable changes to clusters and nodes from time to time. Already planning with replaceable nodes in mind will make these changes substantially easier.
By using the terraform-aws-eks Terraform module you are actually already following the "ephemeral nodes" paradigm, because for both ways of creating instances (self-managed workers or managed node groups) the module creates Auto Scaling Groups that spawn EC2 instances from a Launch Template.
ASGs and Launch Templates are specifically designed so that you no longer care about specific nodes, only about the number of nodes. This means that to update the nodes, you simply replace them with new ones, which will use the updated launch template (with more GBs, for example, or a newer AMI, or a different instance type).
This is called "rolling updates", and it can be done manually (adding new instances, then draining the node, then deleting the old node), with scripts (see: eks-rolling-update in github by Hellofresh), or it can be done automagically if you use the AWS managed nodes (the ones you are actually using when specifying "node_groups", that is why if you add more GB, it will replace the node automatically when you run apply).
And this paradigm is the most common when operating Kubernetes in the cloud (and also very common on-premise datacenters when using virtualization).
Option 1) Self Managed Workers
With self-managed nodes, when you change a parameter like disk_size or instance_type, it changes the Launch Template. It updates the $latest version tag, which is commonly what the ASG points to (although this can be changed). This means that old instances will not see any change, but new ones will get the updated configuration.
If you want to change the existing instances, you actually want to replace them with new ones. That is what this ephemeral nodes paradigm is.
One by one you can drain the old instances while increasing the number of desired_instances on the ASG, or let the cluster autoscaler do the job. Alternatively, you can use an automated script which does this for you for each ASG: https://github.com/hellofresh/eks-rolling-update
In the terraform-aws-eks module, you create self-managed workers by using either the worker_groups or the worker_groups_launch_template (recommended) field.
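A rough sketch of the launch-template variant (attribute names such as root_volume_size are from my recollection of the module's worker-group schema and may differ between module versions):
worker_groups_launch_template = [
  {
    name                 = "workers"
    instance_type        = "m5.large"
    asg_min_size         = 1
    asg_desired_capacity = 1
    asg_max_size         = 5
    root_volume_size     = 40 # only newly launched instances pick this up
  }
]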
Option 2) Managed Nodes
Managed node groups are an EKS-specific feature. You configure them very similarly, but in reality they are an abstraction, and AWS creates the actual underlying ASG.
You can specify a Launch Template to be used by the ASG, and its version. Some config can be specified at the managed node group level (e.g. AMI and instance_types) or in the Launch Template (if it wasn't specified in the former).
Any change on the node group level config, or on the Launch Template version, will trigger an automatic rolling update, which will replace all old instances.
You can delay the rolling update by just not pointing to the $latest version (or pointing to $default, and not updating the $default tag when changing the LT).
In the terraform-aws-eks module, you create managed node groups by using the node_groups field. You can also play with the settings create_launch_template=true and set_instance_types_on_lt=true if you want the module to create the LT for you (alternatively, you can not use one at all, or pass a reference to an existing one) and to set the instance_type on that LT as described above.
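As a sketch, a managed node group with a module-created launch template could look roughly like this (the create_launch_template and set_instance_types_on_lt flags are the ones mentioned above; whether they are available, and under which names, depends on the module version):
node_groups = {
  first = {
    desired_capacity         = 1
    max_capacity             = 5
    min_capacity             = 1
    instance_type            = "m5.large"
    disk_size                = 40
    create_launch_template   = true
    set_instance_types_on_lt = true
  }
}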
But the behavior is similar to worker groups: in no case will your existing instances be changed in place. You can only change them manually.
However, there is an alternative: The manual way
You can use the EKS module to create the control plane, but then use a regular EC2 resource in Terraform (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/instance) to create one or multiple (using count or for_each) instances.
If you create the instances using the aws_instance resource, then Terraform will patch those instances (update in place) whenever the change allows it (e.g. increasing the root volume GBs or changing the instance type; changing the AMI, on the other hand, forces a replacement).
The only tricky part is that you need to configure the cloud-init script to make the instance join the cluster (something that is done automatically by the EKS module when using self-managed or managed node groups).
However, it is very much possible, and you can borrow the script from the module and plug it into the aws_instance's user_data field (https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/instance#user_data).
In this case (when talking about disk_size), however, you still need to patch the XFS filesystem manually (either over SSH, or by running a hacky exec from Terraform) so that it sees the increased disk space.
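A minimal sketch of that manual approach, assuming an EKS-optimized AMI looked up via a hypothetical data.aws_ami.eks_worker data source (the real user_data, IAM profile and security groups would be borrowed from the EKS module as noted above):
resource "aws_instance" "eks_worker" {
  ami           = data.aws_ami.eks_worker.id # EKS-optimized AMI (hypothetical data source)
  instance_type = "m5.large"
  subnet_id     = module.vpc.private_subnets[0]

  root_block_device {
    volume_size = 40 # growing this is an in-place update (per the note above)
  }

  # Make the instance join the cluster on boot; EKS-optimized AMIs ship this script.
  user_data = <<-EOT
    #!/bin/bash
    /etc/eks/bootstrap.sh ${local.cluster_name}
  EOT
}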
Another alternative: Consider Kubernetes storage
That said, there is also another alternative for certain use cases. If you want to increase the disk space of those instances because one of your applications uses a hostPath, it might be that you can instead use a Kubernetes built-in storage solution with the EBS CSI driver.
For example, I manage an Elasticsearch cluster in Kubernetes (and deploy it from Terraform with Helm), and it uses dynamic storage provisioning to request an EBS volume (note that performance is the same, because both the root volume and this other volume are EBS volumes). The EBS CSI driver supports volume expansion, so I can increase this disk just by changing a Terraform variable.
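If you go that route, a StorageClass that allows expansion can itself be managed from Terraform. A minimal sketch, assuming the EBS CSI driver is installed in the cluster and the Kubernetes provider is configured:
resource "kubernetes_storage_class" "ebs_expandable" {
  metadata {
    name = "ebs-expandable"
  }

  storage_provisioner    = "ebs.csi.aws.com"
  allow_volume_expansion = true # lets you grow PVCs by editing their requested size

  parameters = {
    type = "gp2"
  }
}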
To conclude, I would not recommend the aws_instance way unless you understand it and are sure you really want it. It may make sense in certain cases, but it is definitely not common.

How to define stateless services that will run per environment in Service Fabric

I have an application manifest with five stateless services defined. I have multiple Application Parameters files, one per environment, to change the number of instances for each service. For one of the environments, I don't want two specific services to run at all (zero instances), but SF doesn't accept 0 as an instance count parameter. How can I achieve that?
The best way to achieve this would be to stop using default services and instead use a script to start the required services in the appropriate environments (a sketch is shown after the links below).
The following links offer some comprehensive detail on this subject:
https://stackoverflow.com/a/50445801/490282
https://devblogs.microsoft.com/premier-developer/how-not-to-use-service-fabric-default-services/
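As a rough sketch of such a deployment script (application, service, and type names are illustrative; the list of services and instance counts would differ per environment):
# Assumes a connection has already been established with Connect-ServiceFabricCluster.
# Only the services that should run in this environment are listed here.
$services = @(
    @{ Name = "fabric:/MyApp/ServiceA"; Type = "ServiceAType"; Instances = 3 }
    @{ Name = "fabric:/MyApp/ServiceB"; Type = "ServiceBType"; Instances = 1 }
)

foreach ($svc in $services) {
    New-ServiceFabricService -ApplicationName "fabric:/MyApp" `
        -ServiceName $svc.Name `
        -ServiceTypeName $svc.Type `
        -Stateless `
        -PartitionSchemeSingleton `
        -InstanceCount $svc.Instances
}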

Kubernetes: Policy check before container execution

I am new to Kubernetes and I am looking to see if it's possible to hook into the container execution lifecycle events in the orchestration process, so that I can call an API with the details of the container and check whether it is allowed to execute in the given environment, location, etc.
An example check could be: a container may only run in European or US data centers, so if someone tries to execute the container in a data center outside those regions, it should not be allowed.
Is this possible and what is the best way to achieve this?
You can possibly set up an ImagePolicy admission controller in the clusters, where you describe which registries images may be pulled from.
kube-image-bouncer is an example of an ImagePolicy admission controller
A simple webhook endpoint server that can be used to validate the images being created inside of the kubernetes cluster.
If you don't want to start from scratch, there is a Cloud Native Computing Foundation (incubating) project, Open Policy Agent, with support for Kubernetes that seems to offer what you want. (I am not affiliated with the project.)

Property placeholder resolution precedence when using vault and consul

I have a question about placeholder resolution priority when using consul-config and vault-config.
I created a simple app using this information.
My dependencies are:
dependencies {
    compile('org.springframework.cloud:spring-cloud-starter-consul-config')
    compile('org.springframework.cloud:spring-cloud-starter-vault-config')
    compile('org.springframework.boot:spring-boot-starter-webflux')
    compile('org.springframework.cloud:spring-cloud-starter')
    testCompile('org.springframework.boot:spring-boot-starter-test')
}
Note that I'm not using service discovery.
As a next step, I created the property foo.prop = consul (in Consul storage) and foo.prop = vault (in Vault).
When using:
@Value("${foo.prop}")
private String prop;
I get vault as the output, but when I delete foo.prop from Vault and restart the app, I get consul.
I did this a few times in different combinations, and it seems the Vault config has higher priority than the Consul config.
My question is: where can I find information about the resolution strategy? (Imagine that we added zookeeper-config as a third source.) The spring-core documentation seems to keep quiet about this.
From what I understood by debugging the Spring source code, Vault currently has priority.
My investigation results:
PropertySourceBootstrapConfiguration.java is responsible for initializing all property sources in the bootstrap phase. Before locating properties, it sorts all propertySourceLocators by order:
AnnotationAwareOrderComparator.sort(this.propertySourceLocators);
Vault always "wins" because instance of LeasingVaultPropertySourceLocator (at least this one was created during my debugging) implements PriorityOrdered interface. Instance of ConsulPropertySourceLocator has #Order(0) annotation. According to OrderComparator : instance of PriorityOrdered is 'more important'.
In case you have another PriorityOrdered property source (e.g. custom one) you can influence this order by setting spring.cloud.vault.config.order for Vault.
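A minimal example, assuming the value is picked up from bootstrap.properties (the exact number needed depends on the order values of the competing locators):
# bootstrap.properties
spring.cloud.vault.config.order=10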
For now, without further customization, I don't know how to change the priority between Vault and Consul.

Why isn't it possible to change placement constraints in an upgrade?

I have a stateless ASP.NET Core (RC1) service running in my Azure Service Fabric cluster. It has the following manifest:
<ServiceManifest Name="MyServicePkg" Version="1.0.2" ...>
<ServiceTypes>
<StatelessServiceType ServiceTypeName="MyServiceType" />
</ServiceTypes>
...
</ServiceManifest>
My cluster is configured with placement properties. I have 5 servers with "nodeType=Backend" and 3 servers with "nodeType=Frontend".
I would like to upgrade my Service and specify that it may only be placed on "Backend" nodes. This is my updated manifest:
<ServiceManifest Name="MyServicePkg" Version="1.0.3" ...>
<ServiceTypes>
<StatelessServiceType ServiceTypeName="MyServiceType">
<PlacementConstraints>(nodeType==Backend)</PlacementConstraints>
</StatelessServiceType>
</ServiceTypes>
...
</ServiceManifest>
However, if I now execute the upgrade, I get the following error:
Start-ServiceFabricApplicationUpgrade : Default service descriptions
must not be modified as part of upgrade. Modified default service:
fabric:/MyApp/MyService
Why isn't it possible to change the constraints with an upgrade?
Would I have to delete and re-create the service? This would seem extremely problematic to me because it would result in downtime and data loss for stateful services.
So the issue here is actually with the DefaultServices part of the ApplicationManifest. When services are created as part of the DefaultServices, there are things you can't change about them afterwards. You might be able to change it through Service Fabric Explorer, but I'm not sure.
One recommendation would be to keep the DefaultServices section empty in the ApplicationManifest and instead create your services manually. By manually I mean either through PowerShell, code, or Service Fabric Explorer.
That gives you more flexibility to change parts of the service afterwards. When it's done that way, I know you have the possibility to change things like placement constraints after the service is running.
To create services with PowerShell, you can use the New-ServiceFabricService command.
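For example (names and the constraint expression are illustrative; New-ServiceFabricService accepts the placement constraint directly):
New-ServiceFabricService -ApplicationName "fabric:/MyApp" `
    -ServiceName "fabric:/MyApp/MyService" `
    -ServiceTypeName "MyServiceType" `
    -Stateless `
    -PartitionSchemeSingleton `
    -InstanceCount -1 `
    -PlacementConstraint "(nodeType == Backend)"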
To create them from code, you can use FabricClient. A sample of that can be found here: Azure Service Fabric Multi-Tenancy
There's actually a fairly easy way to do this without having to write a bunch of code to manually define the application on the fabric cluster.
While you can declare the placement constraints in the service manifest, you can also declare them in the application manifest. Anything declared in the application manifest overrides what's in the service manifest. And with the setting in the application manifest, you can then use parameters to alter the values based on the parameter file you use for a specific deployment.
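A sketch of what that can look like in the ApplicationManifest, with the constraint driven by a parameter (element and parameter names are illustrative; the exact element ordering depends on the manifest schema):
<Parameters>
  <Parameter Name="MyService_PlacementConstraints" DefaultValue="" />
</Parameters>
<DefaultServices>
  <Service Name="MyService">
    <StatelessService ServiceTypeName="MyServiceType" InstanceCount="-1">
      <SingletonPartition />
      <PlacementConstraints>[MyService_PlacementConstraints]</PlacementConstraints>
    </StatelessService>
  </Service>
</DefaultServices>
Each environment's parameter file can then set MyService_PlacementConstraints to something like (nodeType == Backend), or leave it empty.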
I've just written up a blog post that discusses this approach in greater detail. I hope you find it useful. :)