How do I prevent resources from being destroyed during the Azure DevOps Terraform Plan Step? - azure-devops

I have inherited some reponsibilities and by that I mean managing Terraform in AzureDevOps Release Pipeline deployments.
I am using the Terraform Task with the following steps:
init
validate
plan
apply
But during the plan output I can see a number of resources being destroyed that I don't want to be removed.
azurerm_key_vault_secret.kv_secret_az_backup_storage_account_name will be destroyed
I was looking for a way to disable any resource destruction during the creation of the tfstate file but there doesn't appear to be a way in Azure DevOps. So my best option would be I suppose resorted to amending the underlying main.tf script but I don't know how.
This is one of the resources being removed. I have renamed to keep anonymity. Can anyone suggestion a solution to my dilemma?
resource "azurerm_key_vault_secret" "kv_secret_az_storage_account_name" {
name = "storage-account-name"
value = azurerm_storage_account.storage_account.name
key_vault_id = azurerm_key_vault.keyvault.id
depends_on = [azurerm_storage_account.storage_account]
}

plan phase doesn't destroy your resource nor creates new. It does inform you what will happen when you run apply.
so
azurerm_key_vault_secret.kv_secret_az_backup_storage_account_name will be destroyed
this just says that your storage account will be destroyed if you run apply.
But since it tries to remove, it means that terrafrom keeps information bout this resource in state. So it was created be terraform and now if you want to put it out of the scope here - I mean you don't want to have it longer maintained by your teffafrom script you can use state rm command.
Items removed from the Terraform state are not physically destroyed. Items removed from the Terraform state are only no longer managed by Terraform. For example, if you remove an AWS instance from the state, the AWS instance will continue running, but terraform plan will no longer see that instance.

Related

Azure terraform pipeline

I hope somebody can help me to solve this issue and understand how to implement the best approach.
I have a production environment running tons of azure services (sql server, databases, web app etc).
all those infra has been created with terraform. For as powerful as it is, I am terrified on using it in a pipeline for 1 reason.
Some of my friend, often they do some changes to the infra manually, and having not having those changes in my terraform states, if I automate this process, it might destroy the resource ungracefully, which is something that I don't want to face.
so I was wondering if anyone can shade some light on the following question:
is it possible to have terraform automated to check the infra state at every push to GitHub, and to quit if the output of the plan reports any change?
change to make clear my example.
Lets say I have a terraform state on which I have 2 web app, and somebody manually created a 3 web app on that resource group, it develops some code and push it to GitHub.My pipeline triggers, and as first step I have terraform that runs a terraform plan and/or terraform apply, if this command reports any change, I want it to quit the pipeline(fail) so I will know there is something new there, and if the terraform plan and/or terraform apply return there are no changes to the infra, is up to date to continue with the code deployment.
thank you in advance for any help and clarification.
Yes, you can just run
terraform plan -detailed-exitcode
If the exit code is != 0, you know there are changes. See here for details.
Let me point out that I would highly advise you to lock down your prod environment so that nobody can do manual changes! Your CI/CD pipeline should be the only way to make changes there.
Adding to the above answer, you can also make use of terraform import command just to import the remote changes to your state file. The terraform import command is used to import existing resources into Terraform. Later run plan to check if the changes are in sync.
Refer: https://www.terraform.io/docs/cli/commands/import.html

GCP Cloud SQL failed to delete instance because `deletion_protection` is set to true - gcloud toggle?

Error message:
terraform destroy
module.application.google_sql_database_instance.sql-db-xxx: Destroying... [id=db-xxx]
Error: Error, failed to delete instance because deletion_protection is set to true. Set it to false to proceed with instance deletion
The terraform solution is here:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/sql_database_instance
On newer versions of the provider, you must explicitly set deletion_protection=false (and run terraform >apply to write the field to state) in order to destroy an instance. It is recommended to not set this >field (or set it to true) until you're ready to destroy the instance and its databases.
Question:
I do NOT want to make changes to the terraform script. I would rather toggle the deletion protection flag via gcloud, then destroy as per usual.
For gcloud VMs there is a deletion protection flag I can toggle. However, I cannot find the corresponding flag for the database:
cloud sql instances describe db-xxx
Is this deletion_protection flag meta data within terraform itself?
If not, what is the gcloud command to toggle it?
If so, how can I override it via terraform without modifying the code; ie command line parameter?
I have insufficient 'points' to add to the existing thread of a similar title.
To answer your questions:
From Terraform docs:
deletion_protection - Whether or not to allow Terraform to destroy the instance.
So yes, this is within Terraform itself. Deletion protection flag on GCP is currently only available on Compute Engine instances, not Cloud SQL instances.
Currently, deletion protection can only be toggled on a Compute Engine Instance.
You may consider using input variables like this:
terraform apply -var="deletion_protection=false"
terraform destroy
There are also other ways to use input variables. For more reference, here's the link.

How to set automatic rollbacks in CodeDeploy with CloudFormation?

I'm creating a Deployment Group in CodeDeploy with a CloudFormation template.
The Deployment Group is successfully created and the application is deployed perfectly fine.
The CF resource that I defined (Type: AWS::CodeDeploy::DeploymentGroup) has the "Deployment" property set. The thing is that I would like to configure automatic rollbacks for this deployment, but as per CF documentation for "AutoRollbackConfiguration" property: "Information about the automatic rollback configuration that is associated with the deployment group. If you specify this property, don't specify the Deployment property."
So my understanding is that if I specify "Deployment", I cannot set "AutoRollbackConfiguration"... Then how are you supposed to configure any rollback for the deployment? I don't see any other resource property that relates to rollbacks.
Should I create a second DeploymentGroup resource and bind it to the same instances that the original Deployment Group has? I'm not sure this is possible or makes sense but I ran out of options.
Thanks,
Nicolas
First i like to describe why you cannot specify both, deployment and rollback configuration:
Whenever you specify a deployment directly for the group, you already state which revision you like to deploy. This conflicts with the idea of CloudFormation of having resources managed by it without having a drift in the actual configuration of those resources.
I would recommend the following:
Use CloudFormation to deploy the 'underlying' infrastructure (the deployment group, application, roles, instances, etc.)
Create a CodePipline within this infrastructure template, which then includes a CodeDeploy deployment action (https://docs.aws.amazon.com/codepipeline/latest/userguide/action-reference-CodeDeploy.html, https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-codepipeline-pipeline-stages-actions-actiontypeid.html)
The pipeline can triggered whenever you have a new version inside you revision location
This approach clearly separates the underlying stuff, which is not changing dynamically and the actual application deployment, done using a proper pipeline.
Additionally in this way you can specify how you like to deploy (green/blue, canary) and how/when rollbacks should be handled. The status of your deployment also to be seen inside CodePipeline.
I didn't mention it but what you are suggesting about CodePipeline is exactly what I did.
In fact, I have one CloudFormation template that creates all the infrastructure and includes the DeploymentGroup. With this, the application is deployed for the first time to my EC2 instances.
Then I have another CF template for CI/CD purposes with a CodeDeploy stage/action that references the previous DeploymentGroup. Whenever I push some code to my repository, the Pipeline is triggered, code is built and new version successfully deployed to the instances.
However, I don't see how/where in any of the CF templates to handle/configure the rollback for the DeploymentGroup as you were saying. I think I get the idea of your explanation about the conflict CF might have in case of having a drift, but my impression is that in case of errors during the CF stack creation, CF rollback should just remove the DeploymentGroup you're trying to create. In other words, for me there's no CodeDeploy deployment rollback involved in that scenario, just removing the resource (DeploymentGroup) CF was trying to create.
One thing that really impresses me is that you can enable/disable automatic rollbacks for the DeploymentGroup through the AWS Console. Just edit and go to Advanced Configuration for the DeploymentGroup and you have a checkbox. I tried it and triggered the Pipeline again and worked perfectly. I made a faulty change to make the deployment fail in purpose, and then CodeDeploy automatically reverted back to the previous version of my application... completely expected behavior. Doesn't make much sense that this simple boolean/flag option is not available through CF.
Hope this makes sense and helps clarifying my current situation. Any extra help would be highly appreciated.
Thanks again

In teraform, is there a way to refresh the state of a resource using TF files without using CLI commands?

I have a requirement to refresh the state of a resource "ibm_is_image" using TF files without using CLI commands ?
I know that we can import the state of a resource using "terraform import ". But I should do the same using IaC in TF files.
How to achieve this ?
Example:
In workspace1, I create a resource "f5_custom_image" which gets deleted later from command line. In workspace2, the same code in TF file will assume that "f5_custom_image" already exists and it fails to read the custom image resource. So, my code has to refresh the terraform state of this resource for every execution of "terraform apply":
resource "ibm_is_image" "f5_custom_image" {
depends_on = ["data.ibm_is_images.custom_images"]
href = "${local.image_url}"
name = "${var.vnf_vpc_image_name}"
operating_system = "centos-7-amd64"
timeouts {
create = "30m"
delete = "10m"
}
}
In Terraform's model, an object is fully managed by a single Terraform configuration and nothing else. Having an object be managed by multiple configurations or having an object be created by Terraform but then deleted later outside of Terraform is not a supported workflow.
Terraform is intended for managing long-lived architecture that you will gradually update over time. It is not designed to manage build artifacts like machine images that tend to be created, used, and then destroyed.
The usual architecture for this sort of use-case is to consider the creation of the image as a "build" step, carried out using some other software outside of Terraform, and then we use Terraform only for the "deploy" step, at which point the long-lived infrastructure is either created or updated to use the new image.
That leads to a build and deploy pipeline with a series of steps like this:
Use separate image build software to construct the image, and record the id somewhere from which it can be retrieved using a data source in Terraform.
Run terraform apply to update the long-lived infrastructure to make use of the new image. The Terraform configuration should include a data block to read the image id from wherever it was recorded in the previous step.
If desired, destroy the image using software outside of Terraform once Terraform has completed.
When implementing a pipeline like this, it's optional but common to also consider a "rollback" process to use in case the new image is faulty:
Reset the recorded image id that Terraform is reading from back to the id that was stored prior to the new build step.
Run terraform apply to update the long-lived infrastructure back to using the old image.
Of course, supporting that would require retaining the previous image long enough to prove that the new image is functioning correctly, so the normal build and deploy pipeline would need to retain at least one historical image per run to roll back to. With that said, if you have a means to quickly recreate a prior image during rollback then this special workflow isn't strictly needed: instead, you can implement rollback instead by "rolling forward" to an image constructed with the prior configuration.
An example software package commonly used to prepare images for use with Terraform on other cloud vendors is HashiCorp Packer, but sadly it looks like it does not have IBM Cloud support and so you may need to look for some similar software that does support IBM Cloud, or write something yourself using the IBM Cloud SDK.

Set the ECS Cloudformation Update Stack timeout?

When updating a Cloudformation EC2 Container Service (ECS) Stack with a new Container Image, is there any way to control the timeout so if the service does not stabilize it rolls back automatically?
The UpdatePolicy attribute which is part of the Auto Scaling Group does not help since instances are not being created.
I also tried a WaitCondition but have not been able to get that to work.
The stack essentially just stays in the UPDATE_IN_PROGRESS state until it hits the default timeout (~3 hours), or you trigger a Cancel the update.
Ideally we would be able to have the stack timeout after a short period of time.
This is what my Cloudformation template looks like:
https://s3.amazonaws.com/aws-rga-cw-public/ops/cfn/ecs-cluster-asg-elb-cfn.yaml
Thanks.
I've created a workaround for this problem until AWS creates a ECS UpdatePolicy and CreationPolicy that allows for resourcing signaling:
Use AWS::CloudFormation::WaitCondition with a Macro that will create new WaitCondition resources when the service is expected to update. Signal the wait condition with a non-essential container attached to the task.
Example: https://github.com/deuscapturus/cloudformation-macro-WaitConditionUpdate/blob/master/example-ecs-service.yaml
The Macro for the above example can be found here: https://github.com/deuscapturus/cloudformation-macro-WaitConditionUpdate
My workaround for this problem is that before triggering an update stack, run a script in the background
./deployment-breaker.sh &
And for the script
#!/bin/bash
sleep 600
$deploymentStatus = (aws cloudformation describe-stack --stack-name STACK_NAME | jq XXX)
if [[ $deploymentStatus == YOUR_TERMINATE_CONDITION ]]then
aws cloudformation cancel-update-stack --stack-name STACK_NAME
fi
If your WaitCondition is in the original create you need to rename it (and the Handle). Once a waitcondition has been signaled as complete, it will always be complete. If you rename it and do an update, the original WaitCondition and Handle will be dropped and the new ones created created and signaled.
If you don't want to have to modify your template you might be able to use Lamba and Custom resources to create a unique WaitCondition via the aws cli for each update.
It's not possible at the moment with the provided CloudFormation types. I have the same problem and I might create a custom CloudFormation resource (usineg AWS Lambda) to replace my AWS::ECS::Service.
The other alternative is to use nested stacks to wrap the AWS::ECS::Service resources — it won't solve the problem, but it at least will isolate the individual service and the rest of the stack will be in a good state. My stacks have multiple services and this would help, but the custom resource is the best option so far (I know other people that did the same thing).