I'm following the tutorials to evaluate Istio as the service mesh for my K8s cluster, but for some reason I cannot get the simple example that uses a couple of services to work properly:
https://istio.io/docs/tasks/integrating-services-into-istio.html
If I try to call service-two from service-one, I get this error:
# kubectl exec -ti ${CLIENT} -- curl -v service-two:80
Defaulting container name to app.
Use 'kubectl describe pod/service-one-714088666-73fkp' to see all of the containers in this pod.
* Rebuilt URL to: service-two:80/
* Trying 10.102.51.89...
* connect to 10.102.51.89 port 80 failed: Connection refused
* Failed to connect to service-two port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to service-two port 80: Connection refused
However, if I try to connect to service-two from another service in my cluster, even in a different namespace, then it works:
# kubectl exec -ti redis-4054078334-mj287 -n redis -- curl -v service-two.default:80
* Rebuilt URL to: service-two.default:80/
* Hostname was NOT found in DNS cache
* Trying 10.102.51.89...
* Connected to service-two.default (10.102.51.89) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.38.0
> Host: service-two.default
> Accept: */*
>
< HTTP/1.1 200 OK
* Server envoy is not blacklisted
< server: envoy
< date: Sat, 19 Aug 2017 14:43:01 GMT
< content-type: text/plain
< x-envoy-upstream-service-time: 2
< transfer-encoding: chunked
<
CLIENT VALUES:
client_address=127.0.0.1
command=GET
real path=/
query=nil
request_version=1.1
request_uri=http://service-two.default:8080/
SERVER VALUES:
server_version=nginx: 1.10.0 - lua: 10001
HEADERS RECEIVED:
accept=*/*
content-length=0
host=service-two.default
user-agent=curl/7.38.0
x-b3-sampled=1
x-b3-spanid=00000caf6e052e86
x-b3-traceid=00000caf6e052e86
x-envoy-expected-rq-timeout-ms=15000
x-forwarded-proto=http
x-ot-span-context=00000caf6e052e86;00000caf6e052e86;0000000000000000;cs
x-request-id=1290973c-7bca-95d2-8fa8-80917bb404ad
BODY:
* Connection #0 to host service-two.default left intact
-no body in request-
Is there any reason or explanation for this unexpected behaviour?
Thanks.
I figured out what happened: on service-one the init containers had not completed properly, so it was not resolving requests correctly.
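In case it helps, a minimal way to confirm whether the init containers finished on that pod (using the pod name from the output above) is something like:

# Each init container should report a "terminated" state with reason "Completed"
kubectl get pod service-one-714088666-73fkp \
  -o jsonpath='{range .status.initContainerStatuses[*]}{.name}{": "}{.state}{"\n"}{end}'

# Or just eyeball the Init Containers section of:
kubectl describe pod service-one-714088666-73fkp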
I was following this doc (https://istio.io/latest/docs/tasks/traffic-management/ingress/secure-ingress/) to set up Kubernetes on Docker Desktop and used the Istio ingress gateway. I deployed an echo test app and added a VirtualService that points to the test app endpoint at port 8081. Then I set the Istio Gateway to open port 443 with the following:
servers:
- hosts:
  - some.random.host
  port:
    name: https
    number: 443
    protocol: HTTPS
  tls:
    mode: SIMPLE
    credentialName: test-app-tls
I also created a TLS-type secret named test-app-tls using the certs and private key I generated.
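Roughly, the secret was created like this (a sketch: the file names are placeholders, and the secret needs to live in the same namespace as the ingress gateway, istio-system by default, for the credentialName lookup to work):

# placeholder file names; namespace must match the ingress gateway's
kubectl create -n istio-system secret tls test-app-tls \
  --cert=some.random.host.crt \
  --key=some.random.host.key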
(Just in case I forgot to mention something here: I tried with port 80 and plain HTTP and everything works. Here is an example.)
curl -i -H 'Host: some.random.host' 'localhost:80/host'
HTTP/1.1 200 OK
content-type: application/json
date: Tue, 02 Aug 2022 21:10:31 GMT
content-length: 148
x-envoy-upstream-service-time: 1
server: istio-envoy
{"name":"some-expected-response","address":"some-other-expected-response"}
Then I tried to curl localhost to hit the test app in the cluster with the following command:
curl -k -i -v -H 'Host: some.random.host' 'https://localhost:443/host'
It gave me this error:
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to localhost:443
* Closing connection 0
curl: (35) LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to localhost:443
I also tried with https://127.0.0.1:443/host and it still doesn't work.
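For what it's worth, the next thing I plan to check (assuming my istioctl has the proxy-config subcommands) is whether the ingress gateway actually loaded the secret and opened a listener on 443:

# <ingressgateway-pod> is a placeholder for the pod reported by
# kubectl get pods -n istio-system
istioctl proxy-config secret <ingressgateway-pod> -n istio-system
istioctl proxy-config listeners <ingressgateway-pod> -n istio-system --port 443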
I'm fairly new to setting up TLS for Kubernetes. Could anyone please help me with this?
Thank you very much!!
I get "Connection reset by peer" every time I try to use proxy from the Kubernetes pod.
Here is the log when from the curl:
>>>> curl -x http://5.188.62.223:15624 -L http://google.com -vvv
* Trying 5.188.62.223:15624...
* Connected to 5.188.62.223 (5.188.62.223) port 15624 (#0)
> GET http://google.com/ HTTP/1.1
> Host: google.com
> User-Agent: curl/7.79.1
> Accept: */*
> Proxy-Connection: Keep-Alive
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
Interestingly, I have no issues when I use the same proxy on my local computer, in Docker, and on a remote host. Apparently something within the cluster doesn't let me communicate with it.
Currently I use Azure-hosted Kubernetes, but the same error happens on DigitalOcean as well.
I would be grateful for any clue on how I can get around this restriction, because I'm out of ideas.
Server Info:
{
Major:"1",
Minor:"20",
GitVersion:"v1.20.7",
GitCommit:"ca90e422dfe1e209df2a7b07f3d75b92910432b5",
GitTreeState:"clean",
BuildDate:"2021-10-09T04:59:48Z",
GoVersion:"go1.15.12", Compiler:"gc",
Platform:"linux/amd64"
}
The YAML file I use to start the pod is just super basic. But originally I use Airflow with the Kubernetes executor, which spawns pretty similar basic pods:
apiVersion: v1
kind: Pod
metadata:
  name: scrapeevent.test
spec:
  affinity: {}
  containers:
  - command:
    - /bin/sh
    - -ec
    - while :; do echo '.'; sleep 5 ; done
    image: jaklimoff/mooncrops-opensea:latest
    imagePullPolicy: Always
    name: base
  restartPolicy: Never
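To reproduce the failure I just exec into that pod and run the same curl through the proxy (a minimal check, nothing Airflow-specific):

kubectl exec -it scrapeevent.test -- \
  curl -x http://5.188.62.223:15624 -L http://google.com -vvv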
I get this error after waiting for a while (~1 min):
Waiting for HTTP-01 challenge propagation: failed to perform self check GET request 'http://jenkins.xyz.in/.well-known/acme-challenge/AoV9UtBq1rwPLDXWjrq85G5Peg_Z6rLKSZyYL_Vfe4I': Get "http://jenkins.xyz.in/.well-known/acme-challenge/AoV9UtBq1rwPLDXWjrq85G5Peg_Z6rLKSZyYL_Vfe4I": dial tcp 103.66.96.201:80: connect: connection timed out
I am able to access this URL in the browser from anywhere (internet):
curl -v http://jenkins.xyz.in/.well-known/acme-challenge/AoV9UtBq1rwPLDXWjrq85G5Peg_Z6rLKSZyYL_Vfe4I
* Trying 103.66.96.201:80...
* Connected to jenkins.xyz.in (103.66.96.201) port 80 (#0)
> GET /.well-known/acme-challenge/AoV9UtBq1rwPLDXWjrq85G5Peg_Z6rLKSZyYL_Vfe4I HTTP/1.1
> Host: jenkins.xyz.in
> User-Agent: curl/7.71.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< cache-control: no-cache, no-store, must-revalidate
< date: Wed, 13 Jan 2021 08:54:23 GMT
< content-length: 87
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 1
< server: istio-envoy
<
* Connection #0 to host jenkins.xyz.in left intact
AoV9UtBq1rwPLDXWjrq85G5Peg_Z6rLKSZyYL_VfT4I.EZvkP5Fpi6EYc_-tWTQgvaQxrrbSr2MEJkuXJaywatk
My setup is:
1. Istio ingress load balancer running on a node (192.168.14.118)
2. I am pointing my external IP and domain jenkins.xyz.in to 192.168.14.118 through another load balancer
request -> public IP -> load balancer -> 192.168.14.118
From outside it works fine, but when I try this from the node itself / from a pod inside the cluster I get:
$ curl -v http://jenkins.xyz.in/
* About to connect() to jenkins.xyz.in port 80 (#0)
* Trying 103.66.96.201...
I have read somewhere about hairpinning: since my Kubernetes node IP and the Istio ingress load balancer's external IP are the same, the request might be looping.
EXTRA: I am running k8s on bare metal
Is there any solution to get around this?
I found a workaround.
As my node was not able to access the URL (loop), I added another node to the cluster and set the cert-manager pods' affinity to the new node.
cert-manager was able to access the URL from the new node. Not a great solution, but it worked for me.
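For anyone who wants to do the same, the pinning boils down to a nodeAffinity on the cert-manager Deployment's pod template, roughly like this sketch (the hostname value new-node is a placeholder for the actual node name):

# Snippet to merge into the cert-manager Deployment;
# "new-node" is a placeholder for the new node's hostname label
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - new-node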
I have deployed 2 Istio-enabled services on a GKE cluster.
The Istio version is 1.1.5 and GKE is on v1.15.9-gke.24.
Istio has been installed with global.mtls.enabled=true.
serviceA communicates properly.
serviceB apparently has TLS-related issues.
I spin up a non-Istio-enabled deployment just for testing and exec into this test pod to curl these 2 service endpoints.
/ # curl -v serviceA
* Rebuilt URL to: serviceA/
* Trying 10.8.61.75...
* TCP_NODELAY set
* Connected to serviceA (10.8.61.75) port 80 (#0)
> GET / HTTP/1.1
> Host: serviceA
> User-Agent: curl/7.57.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json
< content-length: 130
< server: istio-envoy
< date: Sat, 25 Apr 2020 09:45:32 GMT
< x-envoy-upstream-service-time: 2
< x-envoy-decorator-operation: serviceA.mynamespace.svc.cluster.local:80/*
<
{"application":"Flask-Docker Container"}
* Connection #0 to host serviceA left intact
/ # curl -v serviceB
* Rebuilt URL to: serviceB/
* Trying 10.8.58.228...
* TCP_NODELAY set
* Connected to serviceB (10.8.58.228) port 80 (#0)
> GET / HTTP/1.1
> Host: serviceB
> User-Agent: curl/7.57.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
Exec-ing into the Envoy proxy of the problematic service and turning trace-level logging on, I see this error:
serviceB-758bc87dcf-jzjgj istio-proxy [2020-04-24 13:15:21.180][29][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:168] [C1484] handshake error: 1
serviceB-758bc87dcf-jzjgj istio-proxy [2020-04-24 13:15:21.180][29][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:201] [C1484] TLS error: 268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST
The Envoy sidecars of both services display similar information when debugging their certificates.
I verify this by exec-ing into both istio-proxy containers, cd-ing into /etc/certs/..data and running:
openssl x509 -in root-cert.pem -noout -text
The two root-cert.pem are identical!
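Another comparison I could make (assuming this Istio version still ships the authn tls-check subcommand) is whether the client-side and server-side mTLS settings actually agree for each service; <test-pod> is a placeholder for my test pod's name:

# A CONFLICT in the STATUS column would explain a reset during the handshake
istioctl authn tls-check <test-pod> serviceA.mynamespace.svc.cluster.local
istioctl authn tls-check <test-pod> serviceB.mynamespace.svc.cluster.local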
Since those 2 Istio proxies have exactly the same TLS configuration in terms of certs, why this cryptic SSL error on serviceB?
FWIW, serviceB communicates with a non-Istio-enabled Postgres service.
Could that be causing the issue?
Curling the container of serviceB from within itself, however, returns a healthy response.
I have Istio v1.1.6 installed on Kubernetes v1.11 using the Helm chart provided in the Istio repository, with a bunch of overrides including:
global:
  outboundTrafficPolicy:
    mode: ALLOW_ANY
pilot:
  env:
    PILOT_ENABLE_FALLTHROUGH_ROUTE: "1"
mixer:
  enabled: true
galley:
  enabled: true
security:
  enabled: false
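For context, the overrides are applied roughly like this (a sketch: overrides.yaml is a placeholder for my values file, and the chart path is the one from the Istio 1.1 release layout):

# "overrides.yaml" is a placeholder for the actual values file
helm upgrade --install istio install/kubernetes/helm/istio \
  --namespace istio-system \
  -f overrides.yaml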
The problem is that I can't make any simple outbound HTTP request to a service running on port 80 (inside or outside of the mesh) from a pod that is inside the Istio mesh and has istio-proxy injected as a sidecar. The response is always 404:
user#pod-12345-12345$ curl -v http://httpbin.org/headers
* Hostname was NOT found in DNS cache
* Trying 52.200.83.146...
* Connected to httpbin.org (52.200.83.146) port 80 (#0)
> GET /headers HTTP/1.1
> User-Agent: curl/7.38.0
> Host: httpbin.org
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Wed, 15 May 2019 05:43:24 GMT
* Server envoy is not blacklisted
< server: envoy
< content-length: 0
<
* Connection #0 to host httpbin.org left intact
The response flag in the istio-proxy logs from envoy states that it can't find the proper route:
"GET / HTTP/1.1" 404 NR "-" 0 0 0 - "-" "curl/7.38.0" "238d0799-f83d-4e5e-94e7-79de4d14fa53" "httpbin.org" "-" - - 172.217.27.14:80 100.99.149.201:52892 -
NR: No route configured for a given request in addition to 404 response code.
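To see which route configuration the sidecar actually has for outbound port 80, I can dump it like this (assuming istioctl's proxy-config subcommand is available for this version; <pod-name> is a placeholder):

# Show the route config named "80" that the sidecar uses for outbound HTTP on port 80
istioctl proxy-config routes <pod-name> --name 80 -o json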
It is probably worth adding that:
- Other outbound calls to any port other than 80 work perfectly fine.
- Checking proxy-status shows nothing suspicious: all pods are SYNCED.
- mTLS is disabled.
- The example above is a call to an external service, but calls to internal services (e.g. curl another-service.svc.cluster.local/health) have the same issue.
I expected calls to internal mesh services to work out of the box; even so, I tried to define a DestinationRule and a ServiceEntry, and that did not help either.
I don't really want to add the traffic.sidecar.istio.io/excludeOutboundIPRanges: "0.0.0.0/0" annotation to the deployment because, according to the docs:
this approach completely bypasses the sidecar, essentially disabling all of Istio's features for the specified IPs
Any idea where else I can look or what is missing?
To me it looks like you have a short connection timeout defined in your istio-proxy sidecar; please check the similar GitHub issue in the Envoy project.
BTW, as @Robert Panzer mentioned, sharing the whole istio-proxy config dump would help a lot in investigating your particular case.
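For completeness, a full dump can be pulled straight from the sidecar's Envoy admin endpoint, roughly like this (assuming the default admin port 15000):

# Dump the full Envoy configuration from the istio-proxy sidecar
kubectl exec <pod-name> -c istio-proxy -- \
  curl -s localhost:15000/config_dump > istio-proxy-config.json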