Looking for debugging advice on SSL errors from EKS using varnish - kubernetes

I know this is a somewhat specific question, but I'm having a problem I can't seem to track down. I have a single pod deployed to EKS - the pod contains a python app, and a varnish reverse caching proxy. I'm serving chunked json (that is, streaming lines of json, a la http://jsonlines.org/), and it can be multiple GB of data.
The first time I make a request, and it hits the python server, everything acts correctly. It takes (much) longer than the cached version, but the entire set of json lines is downloaded. However, now that it's cached in varnish, if I use curl, I get:
curl: (56) GnuTLS recv error (-110): The TLS connection was non-properly terminated.
or
curl: (56) GnuTLS recv error (-9): A TLS packet with unexpected length was received.
The SSL is terminated at the ELB, and when I use curl from the proxy container itself (using curl http://localhost?....), there is no problem.
The hard part of this is that the problem is somewhat intermittent.
If there is any advice in terms of clever varnishlog usage, or anything of the same ilk on AWS, I'd be much obliged.
Thanks!

Because TLS is terminated on your ELB load balancers, the connection between the ELB and Varnish should be plain HTTP.
The error is probably not coming from Varnish, because Varnish currently doesn't handle TLS natively. I'm not sure if varnishlog can give you better insight into what is actually happening.
Checklist
The only checklist I can give you is the following:
Make sure the certificate you're using is valid
Make sure you're connecting to your target group over HTTP, not HTTPS
If you enable the PROXY protocol on your ELB, make sure Varnish has a -a listener that listens for PROXY protocol requests, on top of regular HTTP requests.
Debugging
Perform top-down debugging:
Increase the verbosity of your cURL calls and try to get more information about the error
Try accessing the logs of your ELB and get more details there
Get more information from your EKS logs
And finally, run varnishlog -g request -q "ReqURL eq '/your-url'" to get the full Varnish log for a specific URL
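Since the failure is intermittent and only shows up on cached responses, one thing that may help alongside the steps above is recording how far each download gets before it breaks, then comparing the ELB path with a direct request from the proxy container. The following is a minimal Python sketch, not part of the original answer: summarize_stream is a hypothetical helper, and it assumes you pass it the raw body chunks produced by whatever HTTP client you use.

```python
import json
from typing import Iterable


def summarize_stream(chunks: Iterable[bytes]) -> dict:
    """Count bytes and complete JSON lines in a streamed JSON Lines body."""
    total_bytes = 0
    complete_lines = 0
    invalid_lines = 0
    buffer = b""
    error = None
    try:
        for chunk in chunks:
            total_bytes += len(chunk)
            buffer += chunk
            # Split off every complete line; keep the unfinished tail buffered.
            *lines, buffer = buffer.split(b"\n")
            for line in lines:
                if not line.strip():
                    continue
                try:
                    json.loads(line)
                    complete_lines += 1
                except ValueError:
                    invalid_lines += 1
    except Exception as exc:  # e.g. the client raising on a broken connection
        error = repr(exc)
    return {
        "total_bytes": total_bytes,
        "complete_lines": complete_lines,
        "invalid_lines": invalid_lines,
        # A non-empty tail means the body stopped in the middle of a line.
        "truncated_tail": len(buffer),
        "error": error,
    }
```

If the byte count at the point of failure is always similar, or always lands on an idle period in the stream, that points toward a timeout or size limit between the ELB and the pod rather than toward the certificate.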

Related

How to force kubernetes pod to route through the internet?

I have an issue with my kubernetes routing.
The issue is that one of the pods makes a GET request to auth.domain.com:443 but the internal routing is directing it to auth.domain.com:8443 which is the container port.
Because the host returning the SSL negotiation identifies itself as auth.domain.com:8443 instead of auth.domain.com:443 the connection times out.
[2023/01/16 18:03:45] [provider.go:55] Performing OIDC Discovery...
[2023/01/16 18:03:55] [main.go:60] ERROR: Failed to initialise OAuth2 Proxy: error intiailising provider: could not create provider data: error building OIDC ProviderVerifier: could not get verifier builder: error while discovery OIDC configuration: failed to discover OIDC configuration: error performing request: Get "https://auth.domain.com/realms/master/.well-known/openid-configuration": net/http: TLS handshake timeout
(If someone knows the root cause of why it is not identifying itself with the correct port 443 but instead the container port 8443, that would be extremely helpful as I could fix the root cause.)
To work around this issue, my idea is to force it to route out of the pod onto the internet and then back into the cluster.
I tested this by setting up the file I am trying to GET on a host external to the cluster, and in this case the SSL negotiation works fine and the GET request succeeds. However, I need to serve the file from within the cluster, so this isn't a viable option.
However, if I can somehow force the pod to route through the internet, I believe it would work. I am having trouble with this though, because every time the pod looks up auth.domain.com it sees that it is an internal kubernetes IP, and it rewrites the routing so that it is routed locally to the 10.0.0.0/24 address. After doing this, it seems to always return with auth.domain.com:8443 with the wrong port.
If I could force the pod to route through the full publicly routable IP, I believe it would work as it would come back with the external facing auth.domain.com:443 with the correct 443 port.
Anyone have any ideas on how I can achieve this or how to fix the server from identifying itself with the wrong auth.domain.com:8443 port instead of auth.domain.com:443 causing the SSL negotiation to fail?

How does Postgres negotiate TLS usage?

I am puzzled a bit about Postgres option sslmode=prefer. It implies that it negotiates with the server to figure out whether the server supports TLS or not.
I am curious how it's done. Does it try TLS first and if it fails, try without TLS or am I missing something in TLS (or Postgres) which allow them to truly negotiate this?
Does it try TLS first and if it fails, try without TLS
Yes. And when both attempts fail, this might be visible, as two different error messages might be produced.
Some additional info on top of @janes's answer:
https://www.postgresql.org/docs/current/protocol-flow.html
To initiate an SSL-encrypted connection, the frontend initially sends
an SSLRequest message rather than a StartupMessage. The server then
responds with a single byte containing S or N, indicating that it is
willing or unwilling to perform SSL, respectively. The frontend might
close the connection at this point if it is dissatisfied with the
response. To continue after S, perform an SSL startup handshake (not
described here, part of the SSL specification) with the server. If
this is successful, continue with sending the usual StartupMessage. In
this case the StartupMessage and all subsequent data will be
SSL-encrypted. To continue after N, send the usual StartupMessage and
proceed without encryption.
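The exchange described in that passage can be sketched in a few lines of Python. This is only an illustration: the SSLRequest layout (an Int32 length of 8 followed by the Int32 code 80877103) comes from the protocol's message format documentation, while next_step is a hypothetical, simplified stand-in for what a client library decides, not libpq's actual code.

```python
import struct

# Request code defined by the protocol: (1234 << 16) | 5679.
SSL_REQUEST_CODE = 80877103


def build_ssl_request() -> bytes:
    """Return the 8-byte SSLRequest: Int32 length (8) + Int32 request code."""
    return struct.pack("!ii", 8, SSL_REQUEST_CODE)


def next_step(server_reply: bytes, sslmode: str = "prefer") -> str:
    """Describe what a client does after the one-byte reply to SSLRequest.

    Simplified assumption: only 'prefer' is allowed to fall back to plaintext.
    """
    if server_reply == b"S":
        return "start TLS handshake, then send StartupMessage"
    if server_reply == b"N" and sslmode == "prefer":
        return "send StartupMessage unencrypted"
    # 'N' under a stricter mode, or any unexpected byte.
    return "close connection"
```

So the negotiation happens in the Postgres protocol itself, before any TLS bytes are exchanged: the client asks in plaintext, and the server answers with a single byte.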

HAProxy : Prevent stickiness to a backup server

I'm facing a configuration issue with HAProxy (1.8).
Context:
In a HAProxy config, I have several servers in a backend, and an additional backup server in case the other servers are down.
Once a client gets an answer from a server, it must stick to this server for its next queries.
For some good reasons, I can't use a cookie for this concern, and I had to use a stick-table instead.
Problem:
When every "normal" server is down, clients are redirected to the backup server, as expected.
BUT the stick-table is then filled with an association between the client and the id of the backup server.
AND when every "normal" server is back, the clients which are present in the stick table and associated with the id of the backup server will continue to get redirected to the backup server instead of the normal ones!
This is really upsetting me...
So my question is: how can I prevent HAProxy from sticking clients to a backup server in a backend?
Please find below a configuration sample:
defaults
    option redispatch

frontend fe_test
    bind 127.0.0.1:8081
    stick-table type ip size 1m expire 1h
    acl acl_test hdr(host) -i whatever.domain.com
    ...
    use_backend be_test if acl_test
    ...

backend be_test
    mode http
    balance roundrobin
    stick on hdr(X-Real-IP) table fe_test
    option httpchk GET /check
    server test-01 server-01.lan:8080 check
    server test-02 server-02.lan:8080 check
    server maintenance 127.0.0.1:8085 backup
(I've already tried to add a lower weight to the backup server, but it didn't solve this issue.)
I read in the documentation that the "stick on" keyword has some "if/unless" options, and maybe I can use it to write a condition based on the backend server names, but I have no clue about the syntax to use, or even if it is possible.
Any idea is welcome!
So silly of me! I was so obsessed by the stick table configuration that I didn't think to look in the server options...
There is a simple keyword that perfectly solves my problem: non-stick
Never add connections allocated to this sever to a stick-table. This
may be used in conjunction with backup to ensure that stick-table
persistence is disabled for backup servers.
So the last line of my configuration sample simply becomes:
server maintenance 127.0.0.1:8085 backup non-stick
...and everything is now working as I expected.
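To make the effect of non-stick easier to see, here is a toy Python model of the scenario. It is a deliberately simplified sketch and not how HAProxy is implemented: the Server and Backend classes are hypothetical, load balancing is reduced to "first available server", and entries never expire.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    up: bool = True
    backup: bool = False
    non_stick: bool = False


@dataclass
class Backend:
    servers: list
    stick_table: dict = field(default_factory=dict)

    def route(self, client_ip: str) -> str:
        by_name = {s.name: s for s in self.servers}
        # 1. Honour an existing stick-table entry if that server is up.
        stuck = by_name.get(self.stick_table.get(client_ip, ""))
        if stuck is not None and stuck.up:
            return stuck.name
        # 2. Otherwise prefer normal servers and fall back to backups.
        active = [s for s in self.servers if s.up and not s.backup]
        backups = [s for s in self.servers if s.up and s.backup]
        candidates = active or backups
        if not candidates:
            raise RuntimeError("no server available")
        chosen = candidates[0]
        # 3. Record the association unless the server is flagged non-stick.
        if not chosen.non_stick:
            self.stick_table[client_ip] = chosen.name
        return chosen.name
```

In this model, a client routed to the backup while test-01 and test-02 are down stays on the backup after they recover, unless the backup is created with non_stick=True, in which case no entry is stored and the next request goes to a normal server again.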

Haproxy Health Check port

I'm trying to think through the advantages and disadvantages of haproxy health checks happening on a different port from regular traffic.
If a server becomes overloaded having health checks on a different port may mark the server as being up even when overloaded. I think this is a good thing because taking servers offline may make an overloading problem worse, but want to confirm that that makes sense. I can't seem to find any good docs on the tradeoffs though and was wondering if someone has a good analysis on the tradeoffs.
The port keyword is often used with address to send health checks somewhere else than directly to the service you are checking. One example might be enabling option httpchk to monitor a non-HTTP service. What you then do is have a HTTP-compatible service that when queried can execute complex health checks against the service you are actually testing.
The above is often done with agent-check nowadays, but some people prefer to use an HTTP interface.
This also has nothing to do with server load. The only idea is to send health checks to some other service, not the one directly monitored, which is more capable of testing the actual service (possibly by using more complex logic) and returning a result. As an example, a MySQL backend could be tested by a PHP script instead of being tested just for authentication by option mysql-check. The script could, for example, check whether a backup is running and return a 5xx HTTP error if it is. The configuration could be something like:
backend mysql
    mode tcp
    option httpchk GET /mysql-status.php
    server mysqlserver 10.0.0.1:3306 check port 80
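The answer mentions a PHP script as the check endpoint. As a rough illustration of the same idea, here is a minimal Python sketch of such an endpoint using only the standard library. The names backup_is_running, health_status and serve are hypothetical, the backup check is a placeholder you would replace, and the default port 8080 is an assumption (the check port in the HAProxy config would have to match whatever port you actually use).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def backup_is_running() -> bool:
    """Placeholder: replace with a real check against the database host."""
    return False


def health_status(backup_running: bool) -> int:
    """Map the check result to the HTTP status the health check will see."""
    # With option httpchk, 2xx and 3xx count as healthy; 5xx marks the
    # server as down.
    return 503 if backup_running else 200


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = health_status(backup_is_running())
        body = b"OK\n" if status == 200 else b"backup in progress\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the health endpoint until interrupted."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Calling serve() on the database host starts the endpoint. The service being load balanced still receives its real traffic on 3306, while HAProxy only asks this small HTTP service whether the server should stay in rotation.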

How to send HTTP Commands through Port 80

Brief description of what I am trying to accomplish: I am working with Crestron's Simpl+ software. My job is to create a module for a sound masking system called QT Pro. QT Pro has an API that lets you control it via HTTP. I need a way to establish a connection with the QT Pro via HTTP (I have everything I need: IP, username, password).
What's the problem? I have just started working with this language. Unfortunately there isn't as much documentation as I would like, otherwise I wouldn't be here. I know I need to create a socket connection via TCP on port 80. I just don't know what I'm supposed to send through it.
Here is an example:
http://username:password@address/cmd.htm?cmd=setOneZoneData&ZN=Value&mD=Value&mN=Value&auxA=Value&auxB=Value&autoR=Value
If I were to put this into the URL box and fill it in correctly, it would change the values that I specify. Am I supposed to send the entire thing? Or just the part after cmd.htm? Or is there some other way I'm supposed to send data? I'd like to stay away from the TCP/IP Module so I can keep this all within the same module.
Thanks.
You send
GET /cmd.htm?cmd=setOneZoneData&ZN=Value&mD=Value&mN=Value&auxA=Value&auxB=Value&autoR=Value HTTP/1.1
Host: address
Connection: close
(End each line with CRLF, and finish with an empty line, so the request ends with two CRLF sequences in a row.)
If you need to use HTTP basic authentication, then also include a header like
Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
where the gibberish is the base64-encoded version of username:password.
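To show exactly which bytes go over the socket, here is a small sketch that assembles such a request. It is written in Python rather than Simpl+, purely as an illustration of the byte layout; build_request is a hypothetical helper, and the host, path and credentials are the placeholders from the question.

```python
import base64


def build_request(host: str, path: str, username: str, password: str) -> bytes:
    """Build a raw HTTP/1.1 GET request with a Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Authorization: Basic {token}",
        "Connection: close",
        "",  # the empty line ends the header block
        "",
    ]
    # HTTP uses CRLF line endings, so the request ends with CRLF CRLF.
    return "\r\n".join(lines).encode("ascii")
```

For example, build_request("address", "/cmd.htm?cmd=setOneZoneData&ZN=Value", "username", "password") yields the bytes to write to the TCP connection on port 80. Note that the username and password travel in the header, not in the request line.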
But surely there is some mechanism for opening HTTP connections already there for you? Just blindly throwing out headers like this and hoping the response is what you expect is not robust, to say the least.
To see what is going on with your requests and responses, a great tool is netcat (or telnet, for that matter.)
Do nc address 80 to connect to server address on port 80, then paste your HTTP request:
GET /cmd.htm HTTP/1.1
Host: address
Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
Connection: close
and see what comes back. SOMETHING should come back. (Remember to terminate with two newlines.)
To see what requests your browser is sending when you do something that works, you can listen like this: nc -l -p 8080.
Then direct your browser to localhost:8080 with the rest of the URL as before, and you'll see the request that was sent. (Then you can type back to see how the browser handles the response.)