Probable causes for idempotency error from Terraform during infra creation - amazon-ecs

We are using Terraform to launch ECS containers in our AWS infra with a custom task definition.
Since we didn't need the full infrastructure launched every time, the part that only launches the ECS container was split out on its own.
The launch was working correctly until the ECS launch code was segregated; after that, the ECS service creation started failing with an idempotency error.
│ Error: error creating target service: error waiting for ECS service (sandbox) creation: InvalidParameterException: Creation of service was not idempotent.
│
│ with aws_ecs_service.ecs_service_target,
│ on aws_infra_ecs.tf line 100, in resource "aws_ecs_service" "ecs_service_target":
│ 100: resource "aws_ecs_service" "ecs_service_target" {
│
The ECS service is defined roughly like below:
resource "aws_ecs_service" "ecs_service_target" {
desired_count = 1
name = "target"
launch_type = "FARGATE"
cluster = data.aws_ecs_cluster.cluster_target.id
enable_ecs_managed_tasks = true
task_definition = aws_ecs_task_definition.target_taskdef.arn
platform_version = "1.4.0"
...
load_balancer {
...
target_group_arn = data.aws_lb_target_group.aws_target.arn
}
...
network_configuration {
...
security_groups = [ data.aws_security_group.target_sg.id ]
subnets = [ "subet-5767c3c2" ] # A dynamic subnet reference id is used here
}
depends_upon = [
var.second_service_name,
aws_ecs_task_definition.target_taskdef,
data.aws_efs_access_point.target_ap
]
...
}
I was expecting the problem to be one of the following kinds:
The subnet selected may differ between runs because of the variable-based selection
Use of indirect data references (rather than direct resource references) may cause the issue (a sketch of this is shown just below)
A task definition JSON encoding issue
What might be the other causes for such a problem?
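For the data-reference suspicion, here is a minimal sketch of what a direct-reference variant of the service could look like, assuming the cluster and target group were managed in this same configuration (the resource addresses aws_ecs_cluster.cluster_target and aws_lb_target_group.aws_target are hypothetical):
# Hypothetical sketch: replace data lookups with direct references to managed
# resources, to help rule out the indirect-reference suspicion above.
resource "aws_ecs_service" "ecs_service_target" {
  name            = "target"
  launch_type     = "FARGATE"
  cluster         = aws_ecs_cluster.cluster_target.id          # instead of data.aws_ecs_cluster...
  task_definition = aws_ecs_task_definition.target_taskdef.arn # already a direct reference

  load_balancer {
    target_group_arn = aws_lb_target_group.aws_target.arn      # instead of data.aws_lb_target_group...
    container_name   = "target"                                # hypothetical container name
    container_port   = 80                                      # hypothetical port
  }
}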

Related

mock outputs in Terragrunt dependency

I want to use Terragrunt to deploy this example: https://github.com/aws-ia/terraform-aws-eks-blueprints/blob/main/examples/complete-kubernetes-addons/main.tf
So far, I was able to create the VPC/EKS resources without a problem; I separated each module into a different module directory, and everything worked as expected.
When I tried to do the same for the kubernetes-addons module, I hit an issue with the data source that queries the cluster failing, since the cluster hadn't been created at that point.
Here's my terragrunt.hcl which I'm trying to execute for this specific module:
...
terraform {
  source = "git::git@github.com:aws-ia/terraform-aws-eks-blueprints.git//modules/kubernetes-addons?ref=v4.6.1"
}

locals {
  # Extract needed variables for reuse
  cluster_version = "${include.envcommon.locals.cluster_version}"
  name            = "${include.envcommon.locals.name}"
}

dependency "eks" {
  config_path = "../eks"

  mock_outputs = {
    eks_cluster_endpoint = "https://000000000000.gr7.eu-west-3.eks.amazonaws.com"
    eks_oidc_provider    = "something"
    eks_cluster_id       = "something"
  }
}

inputs = {
  eks_cluster_id       = dependency.eks.outputs.cluster_id
  eks_cluster_endpoint = dependency.eks.outputs.eks_cluster_endpoint
  eks_oidc_provider    = dependency.eks.outputs.eks_oidc_provider
  eks_cluster_version  = local.cluster_version
  ...
}
The error that I'm getting here:
INFO[0035]
Error: error reading EKS Cluster (something): couldn't find resource
with data.aws_eks_cluster.eks_cluster,
on data.tf line 7, in data "aws_eks_cluster" "eks_cluster":
7: data "aws_eks_cluster" "eks_cluster" {
The kubernetes-addons module deploys addons into an existing Kubernetes cluster. If you don't have a cluster running (and apparently you don't, since you're mocking the cluster_id variable), the aws_eks_cluster data source cannot find the cluster and you get that error.
You need to create the K8s cluster first, before you can start deploying the addons.
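If the mocks are only meant to let validate/plan run before the cluster exists, one option worth trying (a sketch, assuming Terragrunt's mock_outputs_allowed_terraform_commands setting fits your workflow) is to restrict the mocked outputs to non-apply commands, so apply still waits for the real eks outputs:
dependency "eks" {
  config_path = "../eks"

  # Only substitute the fake values for commands that don't touch real
  # infrastructure; terragrunt apply will still require the real outputs
  # from the ../eks module.
  mock_outputs_allowed_terraform_commands = ["init", "validate", "plan"]
  mock_outputs = {
    eks_cluster_endpoint = "https://000000000000.gr7.eu-west-3.eks.amazonaws.com"
    eks_oidc_provider    = "something"
    eks_cluster_id       = "something"
  }
}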

Create Azure AKS with Managed Identity using Terraform gives AutoUpgradePreview not enabled error

I am trying to create an AKS cluster with a managed identity using Terraform. This is my code so far, pretty basic and standard, pieced together from a few docs and blog posts I found online.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.79.1"
    }
  }
}

provider "azurerm" {
  features {}
  use_msi = true
}

resource "azurerm_resource_group" "rg" {
  name     = "prod_test"
  location = "northeurope"
}

resource "azurerm_kubernetes_cluster" "cluster" {
  name                = "prod_test_cluster"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  dns_prefix          = "weak"

  default_node_pool {
    name       = "default"
    node_count = "4"
    vm_size    = "standard_ds3_v2"
  }

  identity {
    type = "SystemAssigned"
  }
}
And this is the error message that I can't find a way around. Any thoughts on it?
Error: creating Managed Kubernetes Cluster "prod_test_cluster" (Resource Group "prod_test"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="BadRequest" Message="Feature Microsoft.ContainerService/AutoUpgradePreview is not enabled. Please see https://aka.ms/aks/previews for how to enable features."
│
│ with azurerm_kubernetes_cluster.cluster,
│ on main.tf line 19, in resource "azurerm_kubernetes_cluster" "cluster":
│ 19: resource "azurerm_kubernetes_cluster" "cluster" {
│
I tested it in my environment and faced the same issue.
To describe the issue: AutoChannelUpgrade went to public preview in August 2021. The Terraform azurerm provider 2.79.0 by default sends that value as none to the backend, but because the feature is not registered in the subscription, the request fails with the error Feature Microsoft.ContainerService/AutoUpgradePreview is not enabled.
To confirm you don't have the feature registered, you can use the below command:
az feature show -n AutoUpgradePreview --namespace Microsoft.ContainerService
You will see that it is not registered.
To overcome this you can try one of the two solutions below:
You can use Terraform azurerm provider 2.78.0 instead of 2.79.1 (a sketch of this pin is shown after the registration steps below).
The other solution is to register for the feature, after which you can keep using the same code you have.
For the second option you can follow the below steps:
Use the below command to register the feature (it takes around 5 minutes to get registered):
az login --identity
az feature register --namespace Microsoft.ContainerService -n AutoUpgradePreview
After that is done, you can check the registration status with the below command:
az feature registration show --provider-namespace Microsoft.ContainerService -n AutoUpgradePreview
Once the feature status becomes registered, you can run terraform apply on your code.
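For the first option, here is a minimal sketch of the version pin (assuming nothing else in your configuration needs 2.79.x):
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      # 2.78.0 predates the behaviour described above, so the request should
      # not trip over the unregistered AutoUpgradePreview feature.
      version = "2.78.0"
    }
  }
}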
I tested it using the below code on my VM:
provider "azurerm" {
features {}
subscription_id = "948d4068-xxxxx-xxxxxx-xxxx-e00a844e059b"
tenant_id = "72f988bf-xxxxx-xxxxxx-xxxxx-2d7cd011db47"
use_msi = true
}
resource "azurerm_resource_group" "rg" {
name = "terraformtestansuman"
location = "west us 2"
}
resource "azurerm_kubernetes_cluster" "cluster" {
name = "prod_test_cluster"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
dns_prefix = "weak"
default_node_pool {
name = "default"
node_count = "4"
vm_size = "standard_ds3_v2"
}
identity {
type = "SystemAssigned"
}
}
Reference:
GitHub issue
Install the Azure CLI on the VM using the Microsoft installer if it is not already installed.

Azure app registration creation error through terraform Azure Devops yml pipeline [duplicate]

This question already has answers here:
json.Marshal(): json: error calling MarshalJSON for type msgraph.Application (2 answers)
Closed 1 year ago.
I have very simple terraform code.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=2.46.0"
    }
    azuread = {
      source  = "hashicorp/azuread"
      version = "~> 2.0.0"
    }
  }
}

provider "azurerm" {
  features {}
}

provider "azuread" {
  tenant_id = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}

terraform {
  backend "azurerm" {
    resource_group_name  = "xxxx"
    storage_account_name = "xxxxxxxxx"
    container_name       = "xxxxxxxxxxxxx"
    key                  = "xxxxxxxxxxxxxxxxx"
  }
}

data "azuread_client_config" "current" {}

resource "azurerm_resource_group" "test" {
  name     = "test-rg-005"
  location = "East US"
}

resource "azuread_application" "example" {
  display_name = "Example-app"
}
However, when I run this through a YAML pipeline on Azure DevOps, I am getting this error during the apply stage.
Plan: 1 to add, 0 to change, 0 to destroy.
azuread_application.example: Creating...
│ Error: Could not create application
│
│ with azuread_application.example,
│ on terraform.tf line 42, in resource "azuread_application" "example":
│ 42: resource "azuread_application" "example" {
│
│ json.Marshal(): json: error calling MarshalJSON for type
│ msgraph.Application: json: error calling MarshalJSON for type
│ *msgraph.Owners: marshaling Owners: encountered DirectoryObject with nil
│ ODataId
##[error]Error: The process '/opt/hostedtoolcache/terraform/1.0.5/x64/terraform' failed with
exit code 1
Any clue would be helpful; it's not really clear what this error is about.
Thanks.
There is a bug in the Azure Active Directory provider following a Microsoft update. It impacts any azuread provider usage that creates new resources; it does seem to keep working against already deployed resources, i.e. changing and upgrading the configuration of resources that already exist in Azure AD. The following link tracks the bug:
https://github.com/hashicorp/terraform-provider-azuread/issues/588
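Once a fixed provider release is available, moving to it is the usual way out. A sketch of the constraint, with the exact version left as a placeholder since I have not verified which release contains the fix:
terraform {
  required_providers {
    azuread = {
      source = "hashicorp/azuread"
      # Placeholder constraint: replace with the release that the linked issue
      # confirms as containing the fix for the nil ODataId marshaling error.
      version = ">= 2.0.0"
    }
  }
}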

terraform plan recreates resources on every run with terraform cloud backend

I am running into an issue where terraform plan recreates resources that don't need to be recreated every run. This is an issue because some of the steps depend on those resources being available, and since they are recreated with each run, the script fails to complete.
My setup is Github Actions, Linode LKE, Terraform Cloud.
My main.tf file looks like this:
terraform {
  required_providers {
    linode = {
      source  = "linode/linode"
      version = "=1.16.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "=2.1.0"
    }
  }

  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "MY-ORG-HERE"

    workspaces {
      name = "MY-WORKSPACE-HERE"
    }
  }
}

provider "linode" {
}

provider "helm" {
  debug = true

  kubernetes {
    config_path = "${local_file.kubeconfig.filename}"
  }
}

resource "linode_lke_cluster" "lke_cluster" {
  label       = "MY-LABEL-HERE"
  k8s_version = "1.21"
  region      = "us-central"

  pool {
    type  = "g6-standard-2"
    count = 3
  }
}
and my outputs.tf file
resource "local_file" "kubeconfig" {
depends_on = [linode_lke_cluster.lke_cluster]
filename = "kube-config"
# filename = "${path.cwd}/kubeconfig"
content = base64decode(linode_lke_cluster.lke_cluster.kubeconfig)
}
resource "helm_release" "ingress-nginx" {
# depends_on = [local_file.kubeconfig]
depends_on = [linode_lke_cluster.lke_cluster, local_file.kubeconfig]
name = "ingress"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
}
resource "null_resource" "custom" {
depends_on = [helm_release.ingress-nginx]
# change trigger to run every time
triggers = {
build_number = "${timestamp()}"
}
# download kubectl
provisioner "local-exec" {
command = "curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod +x kubectl"
}
# apply changes
provisioner "local-exec" {
command = "./kubectl apply -f ./k8s/ --kubeconfig ${local_file.kubeconfig.filename}"
}
}
In Github Actions, I'm running these steps:
jobs:
  init-terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./terraform
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
        with:
          ref: 'privatebeta-kubes'
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          cli_config_credentials_token: ${{ secrets.TERRAFORM_API_TOKEN }}
      - name: Terraform Init
        run: terraform init
      - name: Terraform Format Check
        run: terraform fmt -check -v
      - name: List terraform state
        run: terraform state list
      - name: Terraform Plan
        id: plan
        run: terraform plan
        env:
          LINODE_TOKEN: ${{ secrets.LINODE_TOKEN }}
When I look at the results of terraform state list I can see my resources:
Run terraform state list
terraform state list
shell: /usr/bin/bash -e {0}
env:
TERRAFORM_CLI_PATH: /home/runner/work/_temp/3f9749b8-515b-4cb4-8053-1a6318496321
/home/runner/work/_temp/3f9749b8-515b-4cb4-8053-1a6318496321/terraform-bin state list
helm_release.ingress-nginx
linode_lke_cluster.lke_cluster
local_file.kubeconfig
null_resource.custom
But my terraform plan fails and the issue seems to stem from the fact that those resources try to get recreated.
Run terraform plan
terraform plan
shell: /usr/bin/bash -e {0}
env:
TERRAFORM_CLI_PATH: /home/runner/work/_temp/3f9749b8-515b-4cb4-8053-1a6318496321
LINODE_TOKEN: ***
/home/runner/work/_temp/3f9749b8-515b-4cb4-8053-1a6318496321/terraform-bin plan
Running plan in the remote backend. Output will stream here. Pressing Ctrl-C
will stop streaming the logs, but will not stop the plan running remotely.
Preparing the remote plan...
Waiting for the plan to start...
Terraform v1.0.2
on linux_amd64
Configuring remote state backend...
Initializing Terraform configuration...
linode_lke_cluster.lke_cluster: Refreshing state... [id=31946]
local_file.kubeconfig: Refreshing state... [id=fbb5520298c7c824a8069397ef179e1bc971adde]
helm_release.ingress-nginx: Refreshing state... [id=ingress]
╷
│ Error: Kubernetes cluster unreachable: stat kube-config: no such file or directory
│
│ with helm_release.ingress-nginx,
│ on outputs.tf line 8, in resource "helm_release" "ingress-nginx":
│ 8: resource "helm_release" "ingress-nginx" {
Is there a way to tell terraform it doesn't need to recreate those resources?
Regarding the actual error shown, Error: Kubernetes cluster unreachable: stat kube-config: no such file or directory, which references your outputs file, I found this issue which could help with your specific error: https://github.com/hashicorp/terraform-provider-helm/issues/418
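In the spirit of that issue, one option (a sketch, assuming the linode_lke_cluster kubeconfig attribute is base64-encoded standard kubeconfig YAML) is to configure the helm provider from the cluster's attributes instead of a file on disk, so a remote run doesn't depend on the local_file existing:
# Hedged sketch: decode the LKE kubeconfig in place rather than relying on a
# local file that may not exist on the remote Terraform Cloud worker.
provider "helm" {
  kubernetes {
    host  = yamldecode(base64decode(linode_lke_cluster.lke_cluster.kubeconfig)).clusters[0].cluster.server
    token = yamldecode(base64decode(linode_lke_cluster.lke_cluster.kubeconfig)).users[0].user.token
    cluster_ca_certificate = base64decode(
      yamldecode(base64decode(linode_lke_cluster.lke_cluster.kubeconfig)).clusters[0].cluster["certificate-authority-data"]
    )
  }
}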
One other thing looks strange to me: why does your outputs.tf contain resources rather than outputs? Shouldn't your outputs.tf look something like this?
output "local_file_kubeconfig" {
  value = "reference.to.resource"
}
Also I see your state file / backend config looks like it's properly configured.
I recommend logging into your terraform cloud account to verify that the workspace is indeed there, as expected. It's the state file that tells terraform not to re-create the resources it manages.
If the resources are already there and terraform is trying to re-create them, that could indicate that those resources were created prior to using terraform or possibly within another terraform cloud workspace or plan.
Did you end up renaming your backend workspace at any point with this plan? I'm referring to this part of your main.tf file where it says MY-WORKSPACE-HERE:
terraform {
  required_providers {
    linode = {
      source  = "linode/linode"
      version = "=1.16.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "=2.1.0"
    }
  }

  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "MY-ORG-HERE"

    workspaces {
      name = "MY-WORKSPACE-HERE"
    }
  }
}
Unfortunately I am not a Kubernetes expert, so someone with more Kubernetes experience may be able to help further there.

Features block terraform

terraform init initializes successfully, but terraform plan fails.
The error is related to the features block, and I'm unsure where to add it:
Insufficient features blocks (source code not available) At least 1 "features" blocks are required.
My configuration looks like:
terraform {
  required_version = ">= 0.11"

  backend "azurerm" {
    features {}
  }
}
I tried removing and adding the features block as suggested on the GitHub page.
When you run an updated version of Terraform, you need to define a separate provider block as shown below:
provider "azurerm" {
  features {}
}
Another reason for the message could be that an aliased (named) provider is in use:
provider "azurerm" {
  alias = "some_name" # <- here
  features {}
}
But not specified on the resource:
resource "azurerm_resource_group" "example" {
  # this line might be missing
  # -> provider = azurerm.some_name
  name     = var.rg_name
  location = var.region
}
Error message:
terraform plan
╷
│ Error: Insufficient features blocks
│
│ on <empty> line 0:
│ (source code not available)
│
│ At least 1 "features" blocks are required.
In Terraform >= 0.13, here's what a sample versions.tf looks like (note that the provider configuration goes in a separate block):
# versions.tf
terraform {
  required_providers {
    azurerm = {
      # ...
    }
  }
  required_version = ">= 0.13"
}

# This provider block goes outside of the terraform block!
provider "azurerm" {
  features {}
}
Please check that the provider "azurerm" block with features {} shown above is present in your template.