Get redis host url from terraform helm_release resource - kubernetes

I'm using the Terraform helm_release resource to install a bitnami/redis instance in my K8s cluster.
The code looks like this:
resource "helm_release" "redis-chart" {
  name             = "redis-${var.env}"
  repository       = "https://charts.bitnami.com/bitnami"
  chart            = "redis"
  namespace        = "redis"
  create_namespace = true

  set {
    name  = "auth.enabled"
    value = "false"
  }

  set {
    name  = "master.containerPort"
    value = "6379"
  }

  set {
    name  = "replica.replicaCount"
    value = "2"
  }
}
It completes successfully, and in a separate directory I have my app's Terraform configuration.
In the app configuration, I want to get the Redis host from the above helm_release.
I'm doing it like this:
data "kubernetes_service" "redis-master" {
  metadata {
    name      = "redis-${var.env}-master"
    namespace = "redis"
  }
}
And then in the kubernetes_secret resource I'm passing the data to my app deployment:
resource "kubernetes_secret" "questo-server-secrets" {
  metadata {
    name      = "questo-server-secrets-${var.env}"
    namespace = kubernetes_namespace.app-namespace.metadata.0.name
  }
  data = {
    REDIS_HOST = data.kubernetes_service.redis-master.metadata.0.name
    REDIS_PORT = data.kubernetes_service.redis-master.spec.0.port.0.port
  }
}
But unfortunately, when I run the app deployment I'm getting the following logs:
[ioredis] Unhandled error event: Error: getaddrinfo ENOTFOUND
redis-dev-master
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:66:26) [ioredis] Unhandled error event: Error: getaddrinfo ENOTFOUND
Which suggests that redis-dev-master is not a correct host for the redis instance.
How do I get the redis host from helm_release or any underlying services the release creates?
I've tried debugging and pinging a specific pod from my app deployment.
For reference, these are my redis resources:
NAME READY STATUS RESTARTS AGE
pod/redis-dev-master-0 1/1 Running 0 34m
pod/redis-dev-replicas-0 1/1 Running 1 34m
pod/redis-dev-replicas-1 1/1 Running 0 32m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis-dev-headless ClusterIP None <none> 6379/TCP 34m
service/redis-dev-master ClusterIP 172.20.215.60 <none> 6379/TCP 34m
service/redis-dev-replicas ClusterIP 172.20.117.134 <none> 6379/TCP 34m
NAME READY AGE
statefulset.apps/redis-dev-master 1/1 34m
statefulset.apps/redis-dev-replicas 2/2 34m

Maybe you were just using the wrong attribute to get that information. Checking the kubernetes_service data source documentation on the Terraform Registry, we can use the cluster_ip attribute described under the spec block.
So you should end up with something like:
data.kubernetes_service.redis-master.spec.0.cluster_ip
And finally ending up with the following:
resource "kubernetes_secret" "questo-server-secrets" {
  metadata {
    name      = "questo-server-secrets-${var.env}"
    namespace = kubernetes_namespace.app-namespace.metadata.0.name
  }
  data = {
    REDIS_HOST = data.kubernetes_service.redis-master.spec.0.cluster_ip
    REDIS_PORT = data.kubernetes_service.redis-master.spec.0.port.0.port
  }
}
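If you would rather point the app at a stable DNS name than at the ClusterIP, a minimal sketch along these lines should also work. It assumes the default cluster.local cluster domain and builds the fully qualified service name from the same data source, which matters if the app runs in a different namespace from the "redis" one:
resource "kubernetes_secret" "questo-server-secrets" {
  metadata {
    name      = "questo-server-secrets-${var.env}"
    namespace = kubernetes_namespace.app-namespace.metadata.0.name
  }
  data = {
    # <service>.<namespace>.svc.cluster.local resolves from any namespace,
    # unlike the bare service name "redis-dev-master"
    REDIS_HOST = "${data.kubernetes_service.redis-master.metadata.0.name}.${data.kubernetes_service.redis-master.metadata.0.namespace}.svc.cluster.local"
    REDIS_PORT = data.kubernetes_service.redis-master.spec.0.port.0.port
  }
}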

Related

Cannot access container from NodePort using Kubernetes ingress istio

I'm learning Istio, so I followed the instructions here.
As I'm using Terraform, I converted the YAML files to Terraform and installed Istio via Helm:
locals {
  istio_charts_url = "https://istio-release.storage.googleapis.com/charts"
}

resource "helm_release" "istio-base" {
  name             = "istio-base"
  repository       = local.istio_charts_url
  chart            = "base"
  namespace        = "istio-system"
  create_namespace = true
}

resource "helm_release" "istiod" {
  name       = "istiod"
  repository = local.istio_charts_url
  chart      = "istiod"
  namespace  = "istio-system"
  depends_on = [helm_release.istio-base]
}

resource "kubernetes_namespace" "istio-ingress" {
  metadata {
    labels = {
      istio-injection = "enabled"
    }
    name = "istio-ingress"
  }
}

resource "helm_release" "istio-ingress" {
  repository = local.istio_charts_url
  chart      = "gateway"
  name       = "istio-ingress"
  namespace  = kubernetes_namespace.istio-ingress.id
  depends_on = [helm_release.istiod]

  set {
    name  = "service.type"
    value = "NodePort"
  }
}
and the application:
### blog page frontend
resource "kubernetes_service" "blog_page" {
  metadata {
    name      = "blog-page"
    namespace = kubernetes_namespace.istio-ingress.id
  }
  spec {
    port {
      port = 5000
      name = "http"
    }
    selector = {
      app = "blog_page"
    }
  }
}

resource "kubernetes_deployment" "blog_page_v1" {
  metadata {
    name      = "blog-page-v1"
    namespace = kubernetes_namespace.istio-ingress.id
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        app     = "blog_page"
        version = "v1"
      }
    }
    template {
      metadata {
        labels = {
          app     = "blog_page"
          version = "v1"
        }
      }
      spec {
        container {
          image             = "thiv17/blog-service:v1"
          name              = "blog-page"
          image_pull_policy = "Always"
          port {
            container_port = 5000
          }
        }
      }
    }
  }
}

resource "kubernetes_ingress" "istio-app" {
  metadata {
    name      = "istio-app"
    namespace = kubernetes_namespace.istio-ingress.id
    annotations = {
      "kubernetes.io/ingress.class" = "istio"
    }
  }
  spec {
    rule {
      http {
        path {
          path = "/*"
          backend {
            service_name = kubernetes_service.blog_page.metadata[0].name
            service_port = kubernetes_service.blog_page.spec[0].port[0].port
          }
        }
      }
    }
  }
}
I expected to be able to access it via the NodePort, with the node IP being 10.0.83.140:
kubectl describe svc istio-ingress --namespace=istio-ingress
-----
Port: http2 80/TCP
TargetPort: 80/TCP
NodePort: http2 30968/TCP
Endpoints: 10.0.91.237:80
Port: https 443/TCP
kubectl get pods --selector="app=istio-ingress" --namespace=istio-ingress --output=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
istio-ingress-5bd77ffbdf-h25vs 1/1 Running 0 24h 10.0.91.237 ip-10-0-83-140.us-west-2.compute.internal <none> <none>
However, when I SSH to this node, it is listening on port 30968:
[ec2-user@ip-10-0-83-140 ~]$ netstat -plan | grep 30968
(No info could be read for "-p": geteuid()=1000 but you should be root.)
tcp 0 0 0.0.0.0:30968 0.0.0.0:* LISTEN -
But I can't access the address http://localhost:30968:
* Trying ::1:30968...
* connect to ::1 port 30968 failed: Connection refused
* Failed to connect to localhost port 30968 after 0 ms: Connection refused
* Closing connection 0
curl: (7) Failed to connect to localhost port 30968 after 0 ms: Connection refused
[ec2-user@ip-10-0-83-140 ~]$
I also tried the public IP (after opening port 30968 in the security group) and even switched to a LoadBalancer service, but still couldn't access it successfully.
Other debug info
kubectl get pods --namespace=istio-ingress
NAME READY STATUS RESTARTS AGE
blog-api-v1-86789596cf-8rh2j 2/2 Running 0 7h58m
blog-page-v1-54d45997f8-q6h6l 2/2 Running 0 7h58m
blog-page-v2-74b6d4b7c9-bgdrm 2/2 Running 0 7h58m
istio-ingress-5bd77ffbdf-h25vs 1/1 Running 0 24h
kubectl describe ingress istio-app --namespace=istio-ingress
Name: istio-app
Labels: <none>
Namespace: istio-ingress
Address:
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
*
/* blog-page:5000 (10.0.81.70:5000,10.0.95.8:5000)
Annotations: kubernetes.io/ingress.class: istio
Events: <none>
Full code:
https://gitlab.com/jimmy-pet-projects/terraform-eks-with-monitoring/-/blob/main/modules/kubernetes/istio.tf
https://gitlab.com/jimmy-pet-projects/terraform-eks-with-monitoring/-/blob/main/modules/kubernetes/istio_app.tf
I found the issue: the Helm release name should be istio-ingressgateway. I don't understand why the documentation uses istio-ingress:
$ helm install istio-ingress istio/gateway -n istio-ingress --wait
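In Terraform terms, that means renaming the release. Below is a minimal sketch of the corrected resource, assuming the rest of the original arguments stay the same; the gateway chart names its Service and Deployment after the release by default, which is why the release name matters here:
resource "helm_release" "istio-ingress" {
  repository = local.istio_charts_url
  chart      = "gateway"
  # the release name drives the generated Service/Deployment names,
  # so use the conventional "istio-ingressgateway"
  name       = "istio-ingressgateway"
  namespace  = kubernetes_namespace.istio-ingress.id
  depends_on = [helm_release.istiod]

  set {
    name  = "service.type"
    value = "NodePort"
  }
}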
For all those who are facing issues with the Istio template, here is a working version. Since I faced a couple of issues with that template, I compiled this one for my own use case. I hope it's helpful.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

provider "kubernetes" {
  config_path = "~/.kube/config"
}

locals {
  istio_charts_url = "https://istio-release.storage.googleapis.com/charts"
}

resource "kubernetes_namespace" "istio_system" {
  metadata {
    name = "istio-system"
    labels = {
      istio-injection = "enabled"
    }
  }
}

resource "helm_release" "istio-base" {
  repository      = local.istio_charts_url
  chart           = "base"
  name            = "istio-base"
  namespace       = kubernetes_namespace.istio_system.metadata.0.name
  version         = ">= 1.12.1"
  timeout         = 120
  cleanup_on_fail = true
  force_update    = false
}

resource "helm_release" "istiod" {
  repository      = local.istio_charts_url
  chart           = "istiod"
  name            = "istiod"
  namespace       = kubernetes_namespace.istio_system.metadata.0.name
  version         = ">= 1.12.1"
  timeout         = 120
  cleanup_on_fail = true
  force_update    = false

  set {
    name  = "meshConfig.accessLogFile"
    value = "/dev/stdout"
  }

  depends_on = [helm_release.istio-base]
}

resource "helm_release" "istio-ingress" {
  repository      = local.istio_charts_url
  chart           = "gateway"
  name            = "istio-ingress"
  namespace       = kubernetes_namespace.istio_system.metadata.0.name
  version         = ">= 1.12.1"
  timeout         = 500
  cleanup_on_fail = true
  force_update    = false
  depends_on      = [helm_release.istiod]
}

Accessing a private GKE cluster via Cloud VPN

We have set up a GKE cluster using Terraform with private and shared networking:
Network configuration:
resource "google_compute_subnetwork" "int_kube02" {
  name          = "int-kube02"
  region        = var.region
  project       = "infrastructure"
  network       = "projects/infrastructure/global/networks/net-10-23-0-0-16"
  ip_cidr_range = "10.23.5.0/24"

  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.60.0.0/14" # 10.60 - 10.63
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.56.0.0/16"
  }
}
Cluster configuration:
resource "google_container_cluster" "gke_kube02" {
  name               = "kube02"
  location           = var.region
  initial_node_count = var.gke_kube02_num_nodes
  network            = "projects/ninfrastructure/global/networks/net-10-23-0-0-16"
  subnetwork         = "projects/infrastructure/regions/europe-west3/subnetworks/int-kube02"

  master_authorized_networks_config {
    cidr_blocks {
      display_name = "admin vpn"
      cidr_block   = "10.42.255.0/24"
    }
    cidr_blocks {
      display_name = "monitoring server"
      cidr_block   = "10.42.4.33/32"
    }
    cidr_blocks {
      display_name = "cluster nodes"
      cidr_block   = "10.23.5.0/24"
    }
  }

  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = true
    master_ipv4_cidr_block  = "192.168.23.0/28"
  }

  node_config {
    machine_type = "e2-highcpu-2"
    tags         = ["kube-no-external-ip"]
    metadata = {
      disable-legacy-endpoints = true
    }
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}
The cluster is online and running fine. If I connect to one of the worker nodes, I can reach the API using curl:
curl -k https://192.168.23.2
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
"reason": "Forbidden",
"details": {
},
"code": 403
}
I also see a healthy cluster when using an SSH port forward:
❯ k get pods --all-namespaces --insecure-skip-tls-verify=true
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system event-exporter-gke-5479fd58c8-mv24r 2/2 Running 0 4h44m
kube-system fluentbit-gke-ckkwh 2/2 Running 0 4h44m
kube-system fluentbit-gke-lblkz 2/2 Running 0 4h44m
kube-system fluentbit-gke-zglv2 2/2 Running 4 4h44m
kube-system gke-metrics-agent-j72d9 1/1 Running 0 4h44m
kube-system gke-metrics-agent-ttrzk 1/1 Running 0 4h44m
kube-system gke-metrics-agent-wbqgc 1/1 Running 0 4h44m
kube-system kube-dns-697dc8fc8b-rbf5b 4/4 Running 5 4h44m
kube-system kube-dns-697dc8fc8b-vnqb4 4/4 Running 1 4h44m
kube-system kube-dns-autoscaler-844c9d9448-f6sqw 1/1 Running 0 4h44m
kube-system kube-proxy-gke-kube02-default-pool-2bf58182-xgp7 1/1 Running 0 4h43m
kube-system kube-proxy-gke-kube02-default-pool-707f5d51-s4xw 1/1 Running 0 4h43m
kube-system kube-proxy-gke-kube02-default-pool-bd2c130d-c67h 1/1 Running 0 4h43m
kube-system l7-default-backend-6654b9bccb-mw6bp 1/1 Running 0 4h44m
kube-system metrics-server-v0.4.4-857776bc9c-sq9kd 2/2 Running 0 4h43m
kube-system pdcsi-node-5zlb7 2/2 Running 0 4h44m
kube-system pdcsi-node-kn2zb 2/2 Running 0 4h44m
kube-system pdcsi-node-swhp9 2/2 Running 0 4h44m
So far so good. Then I set up the Cloud Router to announce the 192.168.23.0/28 network. This was successful and replicated to our local site using BGP; running show route 192.168.23.2 shows that the correct route is advertised and installed.
When trying to reach the API from the monitoring server 10.42.4.33, I just run into timeouts. All three, the Cloud VPN, the Cloud Router and the Kubernetes cluster, run in europe-west3.
When I try to ping one of the workers it works completely fine, so networking in general works:
[me#monitoring ~]$ ping 10.23.5.216
PING 10.23.5.216 (10.23.5.216) 56(84) bytes of data.
64 bytes from 10.23.5.216: icmp_seq=1 ttl=63 time=8.21 ms
64 bytes from 10.23.5.216: icmp_seq=2 ttl=63 time=7.70 ms
64 bytes from 10.23.5.216: icmp_seq=3 ttl=63 time=5.41 ms
64 bytes from 10.23.5.216: icmp_seq=4 ttl=63 time=7.98 ms
Google's documentation gives no hint about what could be missing. From what I understand, the cluster API should be reachable by now.
What could be missing, and why is the API not reachable via VPN?
I have been missing the peering configuration documented here:
https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#cp-on-prem-routing
resource "google_compute_network_peering_routes_config" "peer_kube02" {
  peering              = google_container_cluster.gke_kube02.private_cluster_config[0].peering_name
  project              = "infrastructure"
  network              = "net-10-13-0-0-16"
  export_custom_routes = true
  import_custom_routes = false
}

Error: No valid responses from any peers. Errors: peer=undefined, status=grpc, message=Endorsement has failed

I'm working on Hyperledger Fabric and Kubernetes (minikube) on Ubuntu 18.04. In my network there are two organisations with one peer each, and the orderer type is Solo. I deployed this whole network on minikube. Chaincode install and instantiation completed successfully on the pods. After that, I'm trying to invoke the chaincode using the SDK.
I am using the code below as invoke.js:
'use strict';

const { Gateway, Wallets } = require('fabric-network');
const fs = require('fs');
const path = require('path');

async function main() {
  try {
    // load the network configuration
    const ccpPath = path.resolve(__dirname, '..', '..', 'first-network', 'connection1-org1.json');
    let ccp = JSON.parse(fs.readFileSync(ccpPath, 'utf8'));

    // Create a new file system based wallet for managing identities.
    const walletPath = path.join(process.cwd(), 'wallet');
    const wallet = await Wallets.newFileSystemWallet(walletPath);
    console.log(`Wallet path: ${walletPath}`);

    // Check to see if we've already enrolled the user.
    const identity = await wallet.get('user1');
    if (!identity) {
      console.log('An identity for the user "user1" does not exist in the wallet');
      console.log('Run the registerUser.js application before retrying');
      return;
    }

    // Create a new gateway for connecting to our peer node.
    const gateway = new Gateway();
    await gateway.connect(ccp, { wallet, identity: 'user1', discovery: { enabled: true, asLocalhost: true } });

    // Get the network (channel) our contract is deployed to.
    const network = await gateway.getNetwork(channelname);

    // Get the contract from the network.
    const contract = network.getContract(contractname);

    // Submit the specified transaction.
    // createCar transaction - requires 5 argument, ex: ('createCar', 'CAR12', 'Honda', 'Accord', 'Black', 'Tom')
    // changeCarOwner transaction - requires 2 args , ex: ('changeCarOwner', 'CAR10', 'Dave')
    await contract.submitTransaction('arumnet', 'argument');
    console.log('Transaction has been submitted');

    // Disconnect from the gateway.
    await gateway.disconnect();
  } catch (error) {
    console.error(`Failed to submit transaction: ${error}`);
    process.exit(1);
  }
}

main();
configTx.yaml
Organizations:
  - &Orderer
    Name: Orderer
    ID: OrdererMSP
    MSPDir: ./crypto-config/ordererOrganizations/acme.com/msp
    # Policies are mandatory starting 2.x
    Policies: &OrdererPolicies
      Readers:
        Type: Signature
        Rule: "OR('OrdererMSP.member')"
      Writers:
        Type: Signature
        Rule: "OR('OrdererMSP.member')"
      Admins:
        Type: Signature
        # ONLY Admin Role can carry out administration activities
        Rule: "OR('OrdererMSP.admin')"
      Endorsement:
        Type: Signature
        Rule: "OR('OrdererMSP.member')"
  - &Acme
    Name: Acme
    ID: AcmeMSP
    MSPDir: ./crypto-config/peerOrganizations/acme.com/msp
    Policies: &AcmePolicies
      Readers:
        Type: Signature
        # Any member can READ e.g., query
        Rule: "OR('AcmeMSP.member')"
      Writers:
        Type: Signature
        # Any member can WRITE e.g., submit transaction
        Rule: "OR('AcmeMSP.member')"
      Admins:
        Type: Signature
        # Either Acme admin OR Orderer Admin can carry out admin activities
        Rule: "OR('AcmeMSP.admin')"
      Endorsement:
        Type: Signature
        # Any member can act as an endorser
        Rule: "OR('AcmeMSP.member')"
    AnchorPeers:
      - Host: acme-peer-clusterip
        Port: 30751
  - &Budget
    Name: Budget
    ID: BudgetMSP
    MSPDir: ./crypto-config/peerOrganizations/budget.com/msp
    Policies: &BudgetPolicies
      Readers:
        Type: Signature
        # Any member
        Rule: "OR('BudgetMSP.member')"
      Writers:
        Type: Signature
        # Any member
        Rule: "OR('BudgetMSP.member')"
      Admins:
        Type: Signature
        # BOTH Budget Admin AND Orderer Admin needed for admin activities
        Rule: "OR('BudgetMSP.member')"
      Endorsement:
        Type: Signature
        Rule: "OR('BudgetMSP.member')"
    AnchorPeers:
      - Host: budget-peer-clusterip
        Port: 30851
Connection.json
{
  "name": "first-network-acme",
  "version": "1.0.0",
  "client": {
    "organization": "AcmeMSP",
    "connection": {
      "timeout": {
        "peer": {
          "endorser": "300"
        }
      }
    }
  },
  "organizations": {
    "AcmeMSP": {
      "mspid": "AcmeMSP",
      "peers": [
        "peer1.acme.com"
      ],
      "certificateAuthorities": []
    }
  },
  "channel": {
    "airlinechannel": {
      "orderers": [
        "orderer.acme.com"
      ],
      "peers": {
        "peer1.acme.com": {}
      }
    }
  },
  "peers": {
    "peer1.acme.com": {
      "url": "grpc://10.109.214.71:3005",
      "tlsCACerts": {
        "pem": "/crypto-config/peerOrganizations/acme.com/tlsca/tlsca.acme.com-cert.pem"
      },
      "grpcOptions": {
        "ssl-target-name-override": "peer1.acme.com",
        "hostnameOverride": "peer1.acme.com"
      }
    }
  },
  "certificateAuthorities": {}
}
Logs after running invoke.js
2020-11-26T05:31:09.252Z | connectivity_state | dns:localhost:30751 CONNECTING -> CONNECTING
2020-11-26T05:31:09.252Z | dns_resolver | Resolved addresses for target dns:localhost:30751: [127.0.0.1:30751]
2020-11-26T05:31:09.252Z | pick_first | IDLE -> IDLE
2020-11-26T05:31:09.252Z | resolving_load_balancer | dns:localhost:30751 CONNECTING -> IDLE
2020-11-26T05:31:09.253Z | connectivity_state | dns:localhost:30751 CONNECTING -> IDLE
2020-11-26T05:31:09.253Z | pick_first | Connect to address list 127.0.0.1:30751
2020-11-26T05:31:09.253Z | subchannel | 127.0.0.1:30751 refcount 3 -> 4
2020-11-26T05:31:09.253Z | pick_first | IDLE -> TRANSIENT_FAILURE
2020-11-26T05:31:09.253Z | resolving_load_balancer | dns:localhost:30751 IDLE -> TRANSIENT_FAILURE
2020-11-26T05:31:09.253Z | connectivity_state | dns:localhost:30751 IDLE -> TRANSIENT_FAILURE
2020-11-26T05:31:12.254Z - error: [ServiceEndpoint]: Error: Failed to connect before the deadline on Endorser- name: acme-peer-clusterip:30751, url:grpc://localhost:30751, connected:false, connectAttempted:true
2020-11-26T05:31:12.254Z - error: [ServiceEndpoint]: waitForReady - Failed to connect to remote gRPC server acme-peer-clusterip:30751 url:grpc://localhost:30751 timeout:3000
2020-11-26T05:31:12.254Z - error: [DiscoveryService]: _buildPeer[dsg-test] - Unable to connect to the discovered peer acme-peer-clusterip:30751 due to Error: Failed to connect before the deadline on Endorser- name: acme-peer-clusterip:30751, url:grpc://localhost:30751, connected:false, connectAttempted:true
2020-11-26T05:31:12.261Z - error: [DiscoveryHandler]: _build_endorse_group_member >> G1:0 - returning an error endorsement, no endorsement made
2020-11-26T05:31:12.261Z - error: [Transaction]: Error: No valid responses from any peers. Errors:
peer=undefined, status=grpc, message=Endorsement has failed
Failed to submit transaction: Error: No valid responses from any peers. Errors:
peer=undefined, status=grpc, message=Endorsement has failed
Below is the Kubernetes pod setup:
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/acme-orderer-0 1/1 Running 0 107m
pod/acme-peer-0 2/2 Running 0 107m
pod/budget-peer-0 2/2 Running 0 107m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/acme-orderer-clusterip ClusterIP 10.108.218.191 <none> 30750/TCP 107m
service/acme-orderer-nodeport NodePort 10.111.186.82 <none> 30750:30750/TCP 107m
service/acme-peer-clusterip ClusterIP 10.98.236.210 <none> 30751/TCP,30752/TCP 107m
service/acme-peer-nodeport NodePort 10.101.38.254 <none> 30751:30751/TCP,30752:30752/TCP 107m
service/budget-peer-clusterip ClusterIP 10.108.194.45 <none> 30851/TCP 107m
service/budget-peer-nodeport NodePort 10.100.136.250 <none> 30851:30851/TCP,30852:30852/TCP 107m
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 112m
service/svc-acme-orderer LoadBalancer 10.105.155.207 10.105.155.207 6005:30696/TCP 27m
service/svc-acme-peer LoadBalancer 10.98.44.14 10.109.214.71 3005:30594/TCP 56m
NAME READY AGE
statefulset.apps/acme-orderer 1/1 107m
statefulset.apps/acme-peer 1/1 107m
statefulset.apps/budget-peer 1/1 10
Some things to check:
connection1-org1.json - your invoke.js loads connection1-org1.json, but you've shown connection.json as the config you presumably expect to be used to load/access the network. Please confirm that this is correct.
connection.json - specifies tlsCACerts, but the URL scheme used is grpc:// rather than grpcs://.
connection.json - does not specify the orderer URL. This may not be necessary if the orderer is discoverable via other means.
Is TLS enabled on the peers/orderer?
If you can confirm/amend as necessary, it will give someone a better chance of helping.

EKS kube-system deployments CrashLoopBackOff

I am trying to deploy Kube State Metrics into the kube-system namespace in my EKS Cluster (eks.4) running Kubernetes v1.14.
Kubernetes Connection
provider "kubernetes" {
  host                   = var.cluster.endpoint
  token                  = data.aws_eks_cluster_auth.cluster_auth.token
  cluster_ca_certificate = base64decode(var.cluster.certificate)
  load_config_file       = true
}
Deployment Manifest (as .tf)
resource "kubernetes_deployment" "kube_state_metrics" {
  metadata {
    name      = "kube-state-metrics"
    namespace = "kube-system"
    labels = {
      k8s-app = "kube-state-metrics"
    }
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        k8s-app = "kube-state-metrics"
      }
    }
    template {
      metadata {
        labels = {
          k8s-app = "kube-state-metrics"
        }
      }
      spec {
        container {
          name  = "kube-state-metrics"
          image = "quay.io/coreos/kube-state-metrics:v1.7.2"
          port {
            name           = "http-metrics"
            container_port = 8080
          }
          port {
            name           = "telemetry"
            container_port = 8081
          }
          liveness_probe {
            http_get {
              path = "/healthz"
              port = "8080"
            }
            initial_delay_seconds = 5
            timeout_seconds       = 5
          }
          readiness_probe {
            http_get {
              path = "/"
              port = "8080"
            }
            initial_delay_seconds = 5
            timeout_seconds       = 5
          }
        }
        service_account_name = "kube-state-metrics"
      }
    }
  }
}
I have deployed all the required RBAC manifests from https://github.com/kubernetes/kube-state-metrics/tree/master/kubernetes as well - redacted here for brevity.
When I run terraform apply on the deployment above, the Terraform output is as follows:
kubernetes_deployment.kube_state_metrics: Still creating... [6m50s elapsed]
Eventually timing out at 10m.
Here is the log output from the kube-state-metrics pod:
I0910 23:41:19.412496 1 main.go:140] metric white-blacklisting: blacklisting the following items:
W0910 23:41:19.412535 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
W0910 23:41:19.412565 1 client_config.go:546] error creating inClusterConfig, falling back to default config: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
F0910 23:41:19.412782 1 main.go:148] Failed to create client: invalid configuration: no configuration has been provided
Adding the following to the spec got me to a successful deployment:
automount_service_account_token = true
For posterity:
resource "kubernetes_deployment" "kube_state_metrics" {
  metadata {
    name      = "kube-state-metrics"
    namespace = "kube-system"
    labels = {
      k8s-app = "kube-state-metrics"
    }
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        k8s-app = "kube-state-metrics"
      }
    }
    template {
      metadata {
        labels = {
          k8s-app = "kube-state-metrics"
        }
      }
      spec {
        automount_service_account_token = true
        container {
          name  = "kube-state-metrics"
          image = "quay.io/coreos/kube-state-metrics:v1.7.2"
          port {
            name           = "http-metrics"
            container_port = 8080
          }
          port {
            name           = "telemetry"
            container_port = 8081
          }
          liveness_probe {
            http_get {
              path = "/healthz"
              port = "8080"
            }
            initial_delay_seconds = 5
            timeout_seconds       = 5
          }
          readiness_probe {
            http_get {
              path = "/"
              port = "8080"
            }
            initial_delay_seconds = 5
            timeout_seconds       = 5
          }
        }
        service_account_name = "kube-state-metrics"
      }
    }
  }
}
I didn't try it with Terraform.
I just ran this deployment locally and got the same error.
Please run your deployment locally to see the state of your deployment and pods.
I0910 13:25:49.632847 1 main.go:140] metric white-blacklisting: blacklisting the following items:
W0910 13:25:49.632871 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
and finally:
I0910 13:25:49.634748 1 main.go:185] Testing communication with server
I0910 13:25:49.650994 1 main.go:190] Running with Kubernetes cluster version: v1.12+. git version: v1.12.8-gke.10. git tree state: clean. commit: f53039cc1e5295eed20969a4f10fb6ad99461e37. platform: linux/amd64
I0910 13:25:49.651028 1 main.go:192] Communication with server successful
I0910 13:25:49.651598 1 builder.go:126] Active collectors: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses
I0910 13:25:49.651607 1 main.go:226] Starting metrics server: 0.0.0.0:8080
I0910 13:25:49.652149 1 main.go:201] Starting kube-state-metrics self metrics server: 0.0.0.0:8081
verification:
Connected to kube-state-metrics (xx.xx.xx.xx) port 8080 (#0)
GET /metrics HTTP/1.1
Host: kube-state-metrics:8080
User-Agent: curl/7.58.0
Accept: */*
HTTP/1.1 200 OK
Content-Type: text/plain; version=0.0.4
Date: Tue, 10 Sep 2019 13:39:52 GMT
Transfer-Encoding: chunked
[49027 bytes data]
# HELP kube_certificatesigningrequest_labels Kubernetes labels converted to Prometheus labels.
If you are building your own image, please follow the issues on GitHub and the docs.
Update, just to clarify: as mentioned in my answer, I didn't try it with Terraform, but the original question described only one problem: W0910 13:25:49.632871 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
So I suggested running this deployment locally and verifying all the errors from the logs. It later turned out that there is a problem with automount_service_account_token; this important error wasn't included in the original question.
So please follow the Terraform issues on GitHub for how to solve this problem.
As per the description on GitHub:
I spent hours trying to figure out why a service account and deployment wasn't working in Terraform, but worked with no issues in kubectl - it was the AutomountServiceAccountToken being hardcoded to False in the deployment resource.
At a minimum this should be documented in the Terraform docs for the resource with something noting the resource does not behave like kubectl does.
I hope this explains the problem.

How to configure correctly to receive the prediction result in Seldon Core python client?

I'm trying out Seldon Core on Minikube and have successfully deployed a model on the cluster.
I tested it with the command below:
seldon-core-api-tester ../seldon-core/examples/models/keras_mnist/contract.json `minikube ip` `kubectl get svc -l app=seldon-apiserver-container-app -o jsonpath='{.items[0].spec.ports[0].nodePort}'` --oauth-key oauth-key --oauth-secret oauth-secret -p
and got the correct prediction result, which looks like this:
RECEIVED RESPONSE:
meta {
puid: "gn83vb2ag419k547eqkhfduos2"
requestPath {
key: "mnist"
value: "mnist:0.1"
}
}
data {
names: "t:0"
names: "t:1"
names: "t:2"
names: "t:3"
names: "t:4"
names: "t:5"
names: "t:6"
names: "t:7"
names: "t:8"
names: "t:9"
ndarray {
values {
list_value {
values {
number_value: 0.00026227490161545575
}
values {
number_value: 0.0007252057548612356
}
values {
number_value: 0.028986405581235886
}
values {
number_value: 0.8030332922935486
}
values {
number_value: 7.914198795333505e-05
}
values {
number_value: 0.14541368186473846
}
values {
number_value: 0.002676495350897312
}
values {
number_value: 0.015001941472291946
}
values {
number_value: 0.0034872409887611866
}
values {
number_value: 0.00033424459979869425
}
}
}
}
}
However, when I was trying to use the Python client,
from seldon_core.seldon_client import SeldonClient
sc = SeldonClient(deployment_name="mnist",namespace="seldon", seldon_rest_endpoint= '127.0.0.1:30790')
r = sc.predict(transport="rest")
I got this error.
HTTPConnection object at 0xb2bb5a780>: Failed to establish a new connection: [Errno 61] Connection refused'))
Could someone help me find out what's wrong?
$ kubectl get svc
mnist-deployment-mnist ClusterIP 10.99.10.81 <none> 8000/TCP,5001/TCP 2d22h
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 4d22h
seldon-core-redis-master ClusterIP 10.107.217.176 <none> 6379/TCP 2d22h
seldon-core-seldon-apiserver NodePort 10.106.34.6 <none> 8080:30790/TCP,5000:31866/TCP 2d22h
seldon-mnist-0-1-4249605 ClusterIP 10.101.205.227 <none> 9000/TCP 2d22h
When you ran the seldon-core-api-tester script, you provided the minikube ip as an argument (along with the ambassador port). You need that address for the endpoint when you initialize the client, instead of 127.0.0.1. So first run in your shell
minikube ip
and take note of the IP, then find the ambassador port:
kubectl get svc ambassador -o jsonpath='{.spec.ports[0].nodePort}'
Then your client and call will look something like this:
from seldon_core.seldon_client import SeldonClient
import numpy as np

# this is the ip from `minikube ip` and port from `kubectl get svc ambassador -o jsonpath='{.spec.ports[0].nodePort}'`
minikube_ambassador_endpoint = "192.168.99.108:32667"
deployment_name = "mnist"
namespace = "default"

sc = SeldonClient(
    gateway="ambassador",
    gateway_endpoint=minikube_ambassador_endpoint,
    transport="rest",
    deployment_name=deployment_name,
    namespace=namespace
)

response = sc.predict(
    data=np.ones((5,)),
    deployment_name=deployment_name,
    payload_type="ndarray"
)
print(response)
print(response)