Install kubeflow using terraform - juju needed? - kubernetes

I need to install Kubeflow using Terraform. As I understand it, Juju must be installed, so I found the Juju provider for Terraform: https://registry.terraform.io/providers/juju/juju/latest
I use the following Terraform code:
terraform {
  required_version = "> 1.3.0"
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
      version = "2.16"
    }
    juju = {
      source = "juju/juju"
      version = "0.4.3"
    }
  }
  backend "local" {
    path = "/tmp/terraform.tfstate"
  }
}
provider "juju" {
  # controller_addresses = "localhost:8080"
  username = "jujuuser"
  password = "password1"
}
I don't know what controller_addresses should be, so I get:
Error: host addresses: cannot parse "" as address:port: missing port in address not valid
I also have some problems with an x509 certificate, but the controller_addresses problem needs to be solved first.
Maybe someone knows an easier way to install Kubeflow.
Terraform will be used on a local machine for tests and afterwards on an on-premises server.
The main requirement of the task is to install Kubeflow using Terraform.
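For reference, here is a minimal sketch of a fully specified Juju provider block. It assumes a Juju controller has already been bootstrapped and is reachable from the machine running Terraform. The address 10.0.0.10:17070 and the juju-ca.pem path are placeholders, not values from the question; the real ones can be read from the output of juju show-controller --show-password. Supplying ca_certificate is also a plausible fix for the x509 errors, although that is an assumption about their cause.

```hcl
provider "juju" {
  # Placeholder: comma-separated <host>:<port> list of controller API endpoints.
  controller_addresses = "10.0.0.10:17070"

  username = "jujuuser"
  password = "password1"

  # Placeholder path: the controller's CA certificate saved to a local file.
  ca_certificate = file("${path.module}/juju-ca.pem")
}
```

The provider documentation also describes environment variables (JUJU_CONTROLLER_ADDRESSES, JUJU_USERNAME, JUJU_PASSWORD, JUJU_CA_CERT) as an alternative to hard-coding these values.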

Related

To enable preview feature of azure resource provider

I would like to enable an Azure preview feature via Terraform. I have set skip_provider_registration, but when I run apply I still get a "provider already exists" error. I have to import the provider manually as a workaround.
Question:
Do we have to import manually to avoid the "already exists" error when registering a preview feature? I have already set skip_provider_registration, but it does not seem to work.
Thanks!
======== configuration ========
# Configure the Azure provider
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      version = "~> 2.99"
    }
  }
  required_version = ">= 1.1.0"
}
provider "azurerm" {
  features {}
  skip_provider_registration = true
}
resource "azurerm_resource_provider_registration" "example" {
  name = "Microsoft.Network"
  feature {
    name = "AFWEnableNetworkRuleNameLogging"
    registered = true
  }
}
I have configured the provider to skip provider registration, but when I run apply I still get the "already exists" error.
======== error log ========
terraform apply main.tf plan
azurerm_resource_provider_registration.example: Creating...
╷
│ Error: A resource with the ID "/subscriptions/xxxx-xxxx/providers/Microsoft.Network" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_resource_provider_registration" for more information.
Is there any solution for enabling the preview feature of the corresponding resource provider namespace?
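The resource provider is already registered in the subscription but is not tracked in the Terraform state, so it has to be imported before Terraform can manage it. skip_provider_registration only stops the azurerm provider from registering resource providers automatically; it does not affect the azurerm_resource_provider_registration resource.
Step 1:
Add the code below to the provider and main .tf files.
Provider .tf file: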
terraform {
  required_version = ">= 1.1.0"
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      version = "~> 2.99"
    }
  }
}
provider "azurerm" {
  features {}
  skip_provider_registration = true
}
Main .tf file:
resource "azurerm_resource_provider_registration" "example" {
  name = "Microsoft.Network"
  feature {
    name = "AFWEnableNetworkRuleNameLogging"
    registered = true
  }
}
Step 2:
Run the commands below:
terraform plan
terraform apply -auto-approve
NOTE:
If apply fails with "already exists - to be managed via Terraform this resource needs to be imported into the State.", import the existing registration into the Terraform state:
terraform import azurerm_resource_provider_registration.example /subscriptions/************************/providers/Microsoft.Network
Step 3:
Run the commands below again:
terraform plan
terraform apply -auto-approve
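On Terraform 1.5 or later, the import can also be declared in configuration instead of being run as a CLI command. A minimal sketch, assuming the same resource address as above; the subscription ID is a placeholder:

```hcl
# Declarative alternative to `terraform import` (requires Terraform >= 1.5).
import {
  to = azurerm_resource_provider_registration.example
  # Placeholder subscription ID; use the ID reported in the error message.
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Network"
}
```

With this block in place, terraform plan shows the pending import and terraform apply performs it. The block can be removed once the resource is in the state.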

Restoring an AWS DocumentDB snapshot with Terraform

I am unsure how to restore an AWS documentdb cluster that is managed by terraform.
My terraform setup looks like this:
resource "aws_docdb_cluster" "this" {
  cluster_identifier = var.env_name
  engine = "docdb"
  engine_version = "4.0.0"
  master_username = "USERNAME"
  master_password = random_password.this.result
  db_cluster_parameter_group_name = aws_docdb_cluster_parameter_group.this.name
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  db_subnet_group_name = aws_docdb_subnet_group.this.name
  deletion_protection = true
  backup_retention_period = 7
  preferred_backup_window = "07:00-09:00"
  skip_final_snapshot = false
  # Added on 6.25.22 to rollback an incorrect application of the namespace
  # migration, which occurred at 2AM EST on June 23.
  snapshot_identifier = "...the arn for the snapshot..."
}
resource "aws_docdb_cluster_instance" "this_2a" {
  count = 1
  engine = "docdb"
  availability_zone = "us-east-1a"
  auto_minor_version_upgrade = true
  cluster_identifier = aws_docdb_cluster.this.id
  instance_class = "db.r5.large"
}
resource "aws_docdb_cluster_instance" "this_2b" {
  count = 1
  engine = "docdb"
  availability_zone = "us-east-1b"
  auto_minor_version_upgrade = true
  cluster_identifier = aws_docdb_cluster.this.id
  instance_class = "db.r5.large"
}
resource "aws_docdb_subnet_group" "this" {
  name = var.env_name
  subnet_ids = module.vpc.private_subnets
}
I added the snapshot_identifier parameter and applied it, expecting a rollback. However, this did not have the intended effect of restoring documentdb state to its settings on June 23rd. (As far as I can tell, nothing changed at all)
I wanted to avoid using the AWS console approach (described here) because that creates a new cluster which won't be tracked by terraform.
What is the proper way of accomplishing this rollback using terraform?
The snapshot_identifier parameter is only used when Terraform creates a new cluster. Setting it after the cluster has been created just tells Terraform "If you ever have to recreate this cluster, use this snapshot".
To actually get Terraform to recreate the cluster, you need to do something that makes Terraform think the cluster needs to be recreated. Possible options are:
1. Run terraform taint aws_docdb_cluster.this to signal to Terraform that the resource needs to be recreated. It will then recreate it the next time you run terraform apply.
2. Delete the cluster through some other means, like the AWS console, and then run terraform apply.
Either way, deletion_protection = true in the configuration above will block deletion of the existing cluster until it has been disabled.
The general approach is as follows, although I have no experience with DocumentDB. Hope this helps.
0. Take a backup of your Terraform state file:
terraform state pull > backup_state_file_timestamp.json
1. Restore through the console to the point in time you want.
2. Remove the old instances and cluster from your Terraform state:
terraform state rm aws_docdb_cluster_instance.this_2a
terraform state rm aws_docdb_cluster_instance.this_2b
terraform state rm aws_docdb_cluster.this
3. Import the manually restored cluster and instances into Terraform (the instance resources use count, so their addresses carry an index):
terraform import aws_docdb_cluster.this cluster_identifier
terraform import 'aws_docdb_cluster_instance.this_2a[0]' identifier
terraform import 'aws_docdb_cluster_instance.this_2b[0]' identifier
(see the import section at the bottom of https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/docdb_cluster_instance and https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/docdb_cluster)
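On Terraform 1.5 or later, the import step can also be written as import blocks next to the resources, so that it is reviewed in a plan before anything changes. A sketch under that assumption; the three identifiers are placeholders for whatever names the restored cluster and instances were given in the console:

```hcl
# Declarative equivalent of the `terraform import` commands (Terraform >= 1.5).
import {
  to = aws_docdb_cluster.this
  id = "restored-cluster-id" # placeholder cluster identifier
}

# The instance resources use `count = 1`, so each address needs the [0] index.
import {
  to = aws_docdb_cluster_instance.this_2a[0]
  id = "restored-instance-a" # placeholder instance identifier
}

import {
  to = aws_docdb_cluster_instance.this_2b[0]
  id = "restored-instance-b" # placeholder instance identifier
}
```

If the restored cluster has a different identifier than var.env_name, the configuration probably needs to be updated to match; otherwise the plan is likely to propose replacing the cluster.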

Terraform tries to load old defunct provider

I am attempting to use the cyrilgdn/postgresql provider, but Terraform keeps trying to load hashicorp/postgresql, which causes init to fail. I am currently using Terraform 1.0.0, although the problem happens on 0.14.1 too. This code was not upgraded from 0.12.x; it has always run on 0.14.1 or newer.
I've reduced the code to the below, with nothing else in the folder, and I still get the problem:
terraform {
  required_version = ">= 0.14.1"
  required_providers {
    postgres = {
      source = "cyrilgdn/postgresql"
      version = ">=1.13.0"
    }
  }
}
provider "postgresql" {
  host = "TBC"
  port = 5432
  username = "TBC"
  password = "TBC"
}
init reports:
Initializing provider plugins...
- Finding cyrilgdn/postgresql versions matching ">= 1.13.0"...
- Finding latest version of hashicorp/postgresql...
- Installing cyrilgdn/postgresql v1.13.0...
- Installed cyrilgdn/postgresql v1.13.0 (self-signed, key ID 3918DD444A3876A6)
Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html
Error: Failed to query available provider packages
Could not retrieve the list of available versions for provider
hashicorp/postgresql: provider registry registry.terraform.io does not have a
provider named registry.terraform.io/hashicorp/postgresql
terraform providers reports
Providers required by configuration:
.
├── provider[registry.terraform.io/hashicorp/postgresql]
└── provider[registry.terraform.io/cyrilgdn/postgresql] >= 1.13.0
How can I stop it trying to find hashicorp/postgresql ?
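The key in required_providers should be postgresql, not postgres. The provider "postgresql" block refers to a provider by its local name; because no provider with the local name postgresql is declared, Terraform falls back to the implied source hashicorp/postgresql, which does not exist: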
terraform {
  required_version = ">= 0.14.1"
  required_providers {
    postgresql = {
      source = "cyrilgdn/postgresql"
      version = ">=1.13.0"
    }
  }
}
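If the local name postgres is preferred for some reason, it can be kept as long as every reference uses it consistently. A sketch of that alternative; the postgresql_role resource and its role name are only there for illustration and are not part of the question:

```hcl
terraform {
  required_version = ">= 0.14.1"
  required_providers {
    postgres = {
      source  = "cyrilgdn/postgresql"
      version = ">=1.13.0"
    }
  }
}

# The provider block must use the local name declared above.
provider "postgres" {
  host     = "TBC"
  port     = 5432
  username = "TBC"
  password = "TBC"
}

# Resource types starting with "postgresql_" default to a provider whose local
# name is "postgresql", so the non-default name has to be selected explicitly.
resource "postgresql_role" "example" {
  provider = postgres
  name     = "example_role" # hypothetical role name
}
```

Matching the local name to the resource type prefix avoids the extra provider argument on every resource.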

Terraform running in Azure Pipeline attempting to install azcli provider

I'm running Terraform in an Azure Pipeline (something I have experience of doing) and for some reason the init step is attempting to install a provider for azcli, which I don't think exists. This does not happen when I run Terraform on my local machine.
My providers file is:
terraform {
  required_version = ">=0.13"
  backend "azurerm" {
    container_name = "tfstate"
    key = "terraform.tfstate"
  }
  required_providers {
    grafana = {
      source = "grafana/grafana"
      version = "=1.5.0"
    }
  }
}
This is the error I'm seeing:
I'm not sure why Terraform is trying to install the azcli provider; I don't think it even exists. Has anyone seen this before?
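During initialization, Terraform looks for providers both directly (required_providers entries and provider blocks) and indirectly (the prefix of each resource type). There is probably a mistake in a resource name or provider definition somewhere. Search your codebase for azcli. For example, each of the following reproduces the error: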
▶ cat .\main.tf
resource "azcli_test" "test" {
  test = "true"
}
~\projects\test\t5 ◷ 10:10:21 AM
▶ C:\Users\pearcec\bin\terraform init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of hashicorp/azcli...
Error: Failed to install provider
Error while installing hashicorp/azcli: provider registry
registry.terraform.io does not have a provider named
registry.terraform.io/hashicorp/azcli
or
~\projects\test\t5 ◷ 10:10:23 AM
▶ cat .\main.tf
provider "azcli" {
  features {}
}
~\projects\test\t5 ◷ 10:13:41 AM
▶ C:\Users\pearcec\bin\terraform init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of hashicorp/azcli...
Error: Failed to install provider
Error while installing hashicorp/azcli: provider registry
registry.terraform.io does not have a provider named
registry.terraform.io/hashicorp/azcli
or
▶ cat .\main.tf
terraform {
  required_providers {
    azcli = {
      source = "-/azcli"
    }
  }
}
~\projects\test\t5 ◷ 10:16:09 AM
▶ C:\Users\pearcec\bin\terraform init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of -/azcli...
Error: Failed to query available provider packages
Could not retrieve the list of available versions for provider -/azcli:
provider registry registry.terraform.io does not have a provider named
registry.terraform.io/-/azcli

How to issue letsencrypt certificate for k8s (AKS) using terraform resources?

Summary
I am unable to issue a valid certificate for my Terraform-managed Kubernetes cluster on Azure AKS. The domain and certificate are successfully created (the certificate exists according to crt.sh), but the certificate is not applied to my domain and my browser reports "Kubernetes Ingress Controller Fake Certificate" as the applied certificate.
The terraform files are converted to the best of my abilities from a working set of yaml files (that issues certificates just fine). See my terraform code here.
UPDATE! In the original question I was also unable to create certificates. This was fixed by using the "tls_cert_request" resource from here. The change is included in my updated code below.
Here are some things I have checked and found NOT to be the issue:
The number of issued certificates from acme letsencrypt is not above rate-limits for either staging or prod.
I get the same "Fake certificate" error using both the staging and the prod certificate server.
Here are some areas that I am currently investigating as potential sources for the error:
I do not see a terraform-equivalent of the letsencrypt yaml input "privateKeySecretRef", and consequently do not know what the value of my deployment ingress annotation "certmanager.k8s.io/cluster-issuer" should be.
If anyone has any other suggestions, I would really appreciate hearing them (as this has been bugging me for quite some time now)!
Certificate Resources
provider "acme" {
  server_url = var.context.cert_server
}
resource "tls_private_key" "reg_private_key" {
  algorithm = "RSA"
}
resource "acme_registration" "reg" {
  account_key_pem = tls_private_key.reg_private_key.private_key_pem
  email_address = var.context.email
}
resource "tls_private_key" "cert_private_key" {
  algorithm = "RSA"
}
resource "tls_cert_request" "req" {
  key_algorithm = "RSA"
  private_key_pem = tls_private_key.cert_private_key.private_key_pem
  dns_names = [var.context.domain_address]
  subject {
    common_name = var.context.domain_address
  }
}
resource "acme_certificate" "certificate" {
  account_key_pem = acme_registration.reg.account_key_pem
  certificate_request_pem = tls_cert_request.req.cert_request_pem
  dns_challenge {
    provider = "azure"
    config = {
      AZURE_CLIENT_ID = var.context.client_id
      AZURE_CLIENT_SECRET = var.context.client_secret
      AZURE_SUBSCRIPTION_ID = var.context.azure_subscription_id
      AZURE_TENANT_ID = var.context.azure_tenant_id
      AZURE_RESOURCE_GROUP = var.context.azure_dns_rg
    }
  }
}
Pypiserver Ingress Resource
resource "kubernetes_ingress" "pypi" {
  metadata {
    name = "pypi"
    namespace = kubernetes_namespace.pypi.metadata[0].name
    annotations = {
      "kubernetes.io/ingress.class" = "inet"
      "kubernetes.io/tls-acme" = "true"
      "certmanager.k8s.io/cluster-issuer" = "letsencrypt-prod"
      "ingress.kubernetes.io/ssl-redirect" = "true"
    }
  }
  spec {
    tls {
      hosts = [var.domain_address]
    }
    rule {
      host = var.domain_address
      http {
        path {
          path = "/"
          backend {
            service_name = kubernetes_service.pypi.metadata[0].name
            service_port = "http"
          }
        }
      }
    }
  }
}
Let me know if more info is required, and I will update my question text with whatever is missing. And lastly I will let the terraform code git repo stay up and serve as help for others.
The answer to my question was that I had to install cert-manager in my cluster, and as far as I can tell there are no native terraform resources to create it. I ended up using Helm for my ingress and cert-manager.
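A rough sketch of what installing cert-manager through the Terraform Helm provider can look like. This is not the author's code: it assumes the hashicorp/helm provider in its 2.x block syntax, already configured against the cluster, and the Jetstack chart repository. The release name, namespace, and the installCRDs chart value are assumptions to check against the chart version in use.

```hcl
resource "helm_release" "cert_manager" {
  name             = "cert-manager"
  repository       = "https://charts.jetstack.io"
  chart            = "cert-manager"
  namespace        = "cert-manager"
  create_namespace = true

  # Assumption: this chart value installs the cert-manager CRDs with the release.
  set {
    name  = "installCRDs"
    value = "true"
  }
}
```

A ClusterIssuer matching the name in the ingress annotation (letsencrypt-prod in the question) still has to be created once the CRDs exist.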
The setup ended up a bit more complex than I initially imagined, and as it stands now it needs to be run twice. This is because the kubeconfig is not updated during the first run (I have to apply "set KUBECONFIG=.kubeconfig" before running "terraform apply" a second time). So it's not pretty, but it "works" as a minimum example to get your deployment up and running.
There definitely are ways of simplifying the pypi deployment part using native terraform resources, and there is probably an easy fix for the kubeconfig not being updated. But I have not had time to investigate further.
If anyone has tips for a more elegant, functional and (probably most of all) secure minimum terraform setup for a k8s cluster, I would love to hear them!
Anyway, for those interested, the resulting terraform code can be found here
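For completeness, the certificate created by the acme_certificate resource in the question could in principle be handed to the ingress without cert-manager, by storing it in a TLS secret. This is an untested sketch, not what the author did: the secret name pypi-tls is made up, and the ingress tls block would also need secret_name set to that secret's name.

```hcl
resource "kubernetes_secret" "pypi_tls" {
  metadata {
    name      = "pypi-tls" # hypothetical secret name
    namespace = kubernetes_namespace.pypi.metadata[0].name
  }

  type = "kubernetes.io/tls"

  data = {
    # Leaf certificate followed by the issuer chain.
    "tls.crt" = "${acme_certificate.certificate.certificate_pem}${acme_certificate.certificate.issuer_pem}"
    # Private key that was used to create the certificate request.
    "tls.key" = tls_private_key.cert_private_key.private_key_pem
  }
}
```

A certificate handled this way is only renewed when terraform apply runs, which is one reason cert-manager is the more common choice.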