Failed to accept an incoming connection: connection from "9.42.x.x" rejected, allowed hosts: "zabbix-server" - kubernetes

SUMMARY
I have installed Zabbix on an OpenShift cluster. I am trying to monitor a host (VM) outside the cluster, but the Zabbix server is unable to connect to it. In the /etc/zabbix/zabbix_agentd.conf file I have set the DNS name of the server, zabbix-server, but it looks like the server is trying to connect through a different public IP. I am not sure what this IP is.
OS / ENVIRONMENT / Used docker-compose files
I applied the kubernetes.yaml file present in this repo - https://github.com/zabbix/zabbix-docker/blob/6.2/kubernetes.yaml - on an OpenShift cluster.
CONFIGURATION
In the /etc/zabbix/zabbix_agentd.conf file Server=zabbix-server.
STEPS TO REPRODUCE
Apply the kubernetes.yaml file on an OpenShift cluster and try to monitor any external VM.
EXPECTED RESULTS
The zabbix server should be able to connect to the vm.
ACTUAL RESULTS
Zabbix server logs.
Defaulted container "zabbix-server" out of: zabbix-server, zabbix-snmptraps
** Updating '/etc/zabbix/zabbix_server.conf' parameter "DBHost": 'mysql-server'...added
287:20230120:060843.131 Zabbix agent item "system.cpu.load[all,avg5]" on host "Host-C" failed: first network error, wait for 15 seconds
289:20230120:060858.592 Zabbix agent item "system.cpu.num" on host "Host-C" failed: another network error, wait for 15 seconds
289:20230120:060913.843 Zabbix agent item "system.sw.arch" on host "Host-C" failed: another network error, wait for 15 seconds
289:20230120:060929.095 temporarily disabling Zabbix agent checks on host "Host-C": interface unavailable
Logs from the agent installed on the vm.
350446:20230122:103232.230 failed to accept an incoming connection: connection from "9.x.x.219" rejected, allowed hosts: "zabbix-server"
350444:20230122:103332.525 failed to accept an incoming connection: connection from "9.x.x.219" rejected, allowed hosts: "zabbix-server"
350445:20230122:103432.819 failed to accept an incoming connection: connection from "9.x.x.210" rejected, allowed hosts: "zabbix-server"
350446:20230122:103533.114 failed to accept an incoming connection: connection from "9.x.x.217" rejected, allowed hosts: "zabbix-server"
If I add this IP to /etc/zabbix/zabbix_agentd.conf it works. But what IP is this? Is it a service IP, or a node or pod IP? It keeps changing, and I cannot update the conf file every time it does. I need something more stable.
Kindly help me out with this issue.

I don't know Zabbix, so I have to make some educated guesses about how both the agent and the server work.
But, to summarize: unlike something like docker compose, where you run the Zabbix server on a known machine, in OpenShift/Kubernetes you are deploying into a cluster of machines with its own networking. The whole point of OpenShift is that it controls where the application's pod gets deployed and relocates or restarts that pod as needed, with a different IP every time. (And the DNS name is meaningless, since the two systems aren't sharing DNS anyway.) Most likely the IPs you are seeing are the pod's randomly assigned IPs, or the IPs of whichever node the pod happens to be running on.
So what do you do when an external application requires a predictable IP? Option 1 is to remove that requirement: using something like a certificate is more secure and more reliable than depending on an IP anyway. Another option is to use an egress IP. This is an OpenShift feature that essentially acts as a proxy, so that the external application always sees traffic from the same, consistent IP.
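To make the rejection in the agent log concrete, here is a minimal Python sketch of the kind of check an agent could perform against its allowed-hosts list. It is an illustration only, not Zabbix's actual implementation: the names is_allowed and system_resolve, the injected resolve callback, and the 198.51.100.x addresses are all assumptions made for the example. It assumes the list may contain IP addresses, CIDR ranges, or DNS names, which is how the Zabbix documentation describes the Server parameter.

```python
import ipaddress
import socket


def system_resolve(name):
    """Resolve a DNS name with the local resolver; return [] if it is unknown."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def is_allowed(peer_ip, allowed_hosts, resolve=system_resolve):
    """Return True if peer_ip matches any entry of a comma-separated list.

    Entries may be plain IPs, CIDR ranges, or DNS names (hypothetical format,
    modelled on the agent's Server= setting).
    """
    peer = ipaddress.ip_address(peer_ip)
    for entry in allowed_hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            network = None  # not an IP or CIDR range, treat it as a DNS name
        if network is not None:
            if peer in network:
                return True
            continue
        # DNS name: it only helps if this machine can resolve it to the peer's IP.
        if any(ipaddress.ip_address(addr) == peer for addr in resolve(entry)):
            return True
    return False
```

With only the name zabbix-server in the list, and a VM that cannot resolve that cluster-internal name, every source address is rejected. Listing the egress IP, or a CIDR range that covers the nodes, would let the check pass.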

Related

How to force kubernetes pod to route through the internet?

I have an issue with my kubernetes routing.
The issue is that one of the pods makes a GET request to auth.domain.com:443 but the internal routing is directing it to auth.domain.com:8443 which is the container port.
Because the host returning the SSL negotiation identifies itself as auth.domain.com:8443 instead of auth.domain.com:443 the connection times out.
[2023/01/16 18:03:45] [provider.go:55] Performing OIDC Discovery...
[2023/01/16 18:03:55] [main.go:60] ERROR: Failed to initialise OAuth2 Proxy: error intiailising provider: could not create provider data: error building OIDC ProviderVerifier: could not get verifier builder: error while discovery OIDC configuration: failed to discover OIDC configuration: error performing request: Get "https://auth.domain.com/realms/master/.well-known/openid-configuration": net/http: TLS handshake timeout
(If someone knows the root cause of why it is not identifying itself with the correct port 443 but instead the container port 8443, that would be extremely helpful as I could fix the root cause.)
To work around this issue, my idea is to force the request to route out of the pod onto the internet and then back into the cluster.
I tested this by hosting the file I am trying to GET on a host external to the cluster; in that case the SSL negotiation works fine and the GET request succeeds. However, I need to serve the file from within the cluster, so this isn't a viable option.
However, if I can somehow force the pod to route through the internet, I believe it would work. I am having trouble with this, though, because every time the pod looks up auth.domain.com it sees that it is an internal Kubernetes IP and routes the request locally to the 10.0.0.0/24 address. After doing this, it always seems to come back as auth.domain.com:8443, with the wrong port.
If I could force the pod to route through the full publicly routable IP, I believe it would work as it would come back with the external facing auth.domain.com:443 with the correct 443 port.
Anyone have any ideas on how I can achieve this or how to fix the server from identifying itself with the wrong auth.domain.com:8443 port instead of auth.domain.com:443 causing the SSL negotiation to fail?
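One way to confirm the behaviour described in this question (the name resolving to a cluster-internal address from inside the pod) is to classify what the name resolves to. This is a minimal Python sketch; the helper names resolve and classify_addresses are made up for this example, and it only diagnoses the lookup, it does not change any routing.

```python
import ipaddress
import socket


def resolve(hostname, port=443):
    """Return the distinct IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def classify_addresses(addresses):
    """Label each IP string 'internal' (private range) or 'public'."""
    return {
        addr: "internal" if ipaddress.ip_address(addr).is_private else "public"
        for addr in addresses
    }
```

Running classify_addresses(resolve("auth.domain.com")) inside the pod and again on a machine outside the cluster would show whether the two see different addresses for the same name.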

Remote EJB in Kubernetes

I'm trying to setup a remote EJB call between 2 WebSphere Liberty servers deployed in k8s.
Yes, I'm aware that EJB is not something one would want to use when deploying in k8s, but I have to deal with it for now.
The problem I have is how to expose remote ORB IP:port in k8s. From what I understand, it's only possible to get it to work if both client and remote "listen" on the same IP. I'm not a network expert, and I'm quite fresh in k8s, so maybe I'm missing something here, that's why I need help.
The only way I got it to work was to explicitly set host on the remote server to its own IP address and then access it from the client on that same IP. This test was done on a Docker host with a macvlan0 network (each container had its own IP address).
This is ORB setup for remote server.xml configuration:
<iiopEndpoint id="defaultIiopEndpoint" host="172.30.106.227" iiopPort="2809" />
<orb id="defaultOrb" iiopEndpointRef="defaultIiopEndpoint">
    <serverPolicy.csiv2>
        <layers>
            <!-- don't care about security at this point -->
            <authenticationLayer establishTrustInClient="Never"/>
            <transportLayer sslEnabled="false"/>
        </layers>
    </serverPolicy.csiv2>
</orb>
And client server.xml configuration:
<orb id="defaultOrb">
    <clientPolicy.csiv2>
        <layers>
            <!-- really, I don't care about security -->
            <authenticationLayer establishTrustInClient="Never"/>
            <transportLayer sslEnabled="false"/>
        </layers>
    </clientPolicy.csiv2>
</orb>
From client, this is JNDI name I try to access it:
corbaname::172.30.106.227:2809#ejb/global/some-app/ejb/BeanName!org\.example\.com\.BeanRemote
And this works.
Since one doesn't want to set fixed IP when exposing ORB port, I have to find a way to expose it dynamically, based on host IP.
Exposing on 0.0.0.0 does not work. Same goes for localhost. In both cases, client refuses to connect with this kind of error:
Error connecting to host=0.0.0.0, port=2809: Connection refused (Connection refused)
In k8s, I've exposed port 2809 through LoadBalancer service for remote pods, and try to access remote server from client pod, where I've set remote's service IP address in corbaname definition.
This, of course, does not work. I can access remote ip:port by telnet, so it's not a network issue.
I've tried all combinations of setup on the remote server. Exporting on host="0.0.0.0" results in the same exception as above (Connection refused).
I'm not sure exporting on internal IP address would work either, but even if it would, I don't know the internal IP before pod is deployed in k8s. Or is there a way to know? There is no env. variable with it, I've checked.
Exposing on service IP address (with host="${REMOTE_APP_SERVICE_HOST}") fails with this error:
The server socket could not be opened on 2,809. The exception message is Cannot assign requested address (Bind failed).
Again, I know replacing EJB with Rest is the way to go, but it's not an option for now (don't ask why).
Help, please!
EDIT:
I've managed to get some progress. Actually, I believe I've successfully called remote EJB.
What I did was add hostAliases in pod definition, which added alias for my host, something like this:
hostAliases:
- ip: 0.0.0.0
  hostnames:
  - my.host.name
Then I added this host name to remote server.xml:
<iiopEndpoint id="defaultIiopEndpoint" host="my.host.name" iiopPort="2809" />
I've also added host alias to my client pod:
hostAliases:
- ip: {remote.server.service.ip.here}
  hostnames:
  - my.host.name
Finally, I've changed JNDI name to:
corbaname::my.host.name:2809#ejb/global/some-app/ejb/BeanName!org\.example\.com\.BeanRemote
With this setup, remote server was successfully called!
However, now I have another problem which I didn't have while testing on Docker host. Lookup is done, but what I get is not what I expect.
Lookup code is pretty much what you'd expect:
Object obj = new InitialContext().lookup(jndi);
BeanRemote remote = (BeanRemote) PortableRemoteObject.narrow(obj, BeanRemote.class);
Unfortunately, this narrow call fails with a ClassCastException:
Caused by: java.lang.ClassCastException: org.example.com.BeanRemote
at com.ibm.ws.transport.iiop.internal.WSPortableRemoteObjectImpl.narrow(WSPortableRemoteObjectImpl.java:50)
at [internal classes]
at javax.rmi.PortableRemoteObject.narrow(PortableRemoteObject.java:62)
Object I do receive is org.omg.stub.java.rmi._Remote_Stub. Any ideas?
Solved it!
So, the first problem was resolving the host mapping, which was solved as mentioned in the edit above, by adding host aliases in the pod definitions:
Remote pod:
hostAliases:
- ip: 0.0.0.0
  hostnames:
  - my.host.name
Client pod:
hostAliases:
- ip: {remote.server.service.ip.here}
  hostnames:
  - my.host.name
Remote server then has to use that host name in iiop host definition:
<iiopEndpoint id="defaultIiopEndpoint" host="my.host.name" iiopPort="2809" />
Also, client has to reference that host name through JNDI lookup:
corbaname::my.host.name:2809#ejb/global/some-app/ejb/BeanName!org\.example\.com\.BeanRemote
This setup resolves remote EJB call.
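For anyone assembling these lookup names by hand, the pattern used above can be sketched as a small helper. This is Python purely for illustration (the real lookup happens in Java on the client); the function name corbaname_url is made up, and the rule of backslash-escaping the dots in the interface name is simply inferred from the JNDI strings in this post.

```python
def corbaname_url(host, port, bean_path, remote_interface):
    """Build a corbaname lookup string in the shape used in this post.

    host and port must match what the remote server's iiopEndpoint advertises,
    because that is the address the client ends up connecting to.
    """
    # Dots in the interface name are escaped with a backslash, as in the post.
    escaped_interface = remote_interface.replace(".", "\\.")
    return f"corbaname::{host}:{port}#{bean_path}!{escaped_interface}"
```

For example, corbaname_url("my.host.name", 2809, "ejb/global/some-app/ejb/BeanName", "org.example.com.BeanRemote") produces the same string as the JNDI name shown above.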
The other problem with ClassCastException was really unusual. I managed to reproduce the error on Docker host and then changed one thing at a time until the problem was resolved. It turns out that the problem was with ldapRegistry-3.0 feature (!?). Adding this feature to client's feature list resolved my problem:
<feature>ldapRegistry-3.0</feature>
With this feature added, remote EJB was successfully called.

Are service addresses available to the dc/os host OS?

I’m trying to have my dc/os 1.8 docker containers send log messages to a logstash that is also running in dc/os by using the service address of the logstash service.
that doesn’t appear to work as docker throws an error: logstash.marathon.l4lb.thisdcos.directory: no such host
are service addresses not exposed to the host systems (or do I need to configure something for this)?
on dc/os 1.7 I used a fixed host port in my logstash config and logstash.marathon.mesos as host, but these .marathon.mesos hostnames seem to not exist in 1.8 anymore.
the service addresses work fine when I try to use them from within a container (for example to link my prometheus service to my alertmanager service). but from the host level they don’t exist.
EDIT:
my statement about the missing marathon.mesos urls was wrong. they do work, but I used the wrong one. for now this more or less fixes my problem. I configured logging using this host and a fixed container port.
for everybody trying the same thing: you have to configure the fixed host port every time you make changes to the service config in the ui via the json mode. the fixed host port config is no longer available in the network tab of the ui, so the dc/os ui will DELETE the host port config on every load.
still no idea why the l4lb urls don't work.
EDIT2
still no idea, but i figured out that minuteman generates crash and error logs every other second:
/opt/mesosphere/active/minuteman/minuteman/error.log:
CRASH REPORT Process <0.25809.2> with 0 neighbours exited with reason: {timeout,{gen_server,call,[{lashup_kv,'navstar#10.2.140.216'},{start_kv_sync_fsm,'minuteman#10.2.103.143',<0.25809.2>}]}} in gen_server:call/2 line 204
/opt/mesosphere/active/minuteman/minuteman/log/crash.log
2016-10-12 13:16:49 =CRASH REPORT====
crasher:
initial call: lashup_kv_sync_tx_fsm:init/1
pid: <0.29002.2>
registered_name: []
exception exit: {{timeout,{gen_server,call,[{lashup_kv,'navstar#10.2.140.216'},{start_kv_sync_fsm,'minuteman#10.2.103.143',<0.29002.2>}]}},[{gen_server,call,2,[{file,"gen_server.erl"},{line,204}]},{lashup_kv_sync_tx_fsm,init,1,[{file,"/pkg/src/minuteman/_build/default/lib/lashup/src/lashup_kv_sync_tx_fsm.erl"},{line,23}]},{gen_statem,init_it,6,[{file,"gen_statem.erl"},{line,554}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
ancestors: [lashup_kv_aae_sup,lashup_kv_sup,lashup_platform_sup,lashup_sup,<0.916.0>]
messages: []
links: [<0.992.0>]
dictionary: []
trap_exit: false
status: running
heap_size: 610
stack_size: 27
reductions: 127
neighbours:
the dc/os ui claims spartan and minuteman are healthy, but while the crash.log of the dns dispatcher is empty the l4lb gets new crashes every other second.
They should certainly be available from the host OS. Are these hosts running the "Spartan" and "Minuteman" services?
my problem was twofold:
the l4lb did not run properly; that was only fixed after a total reinstall of the cluster
the l4lb only supports TCP traffic. because I wanted to use it to send container logs to logstash using UDP (docker-gelf only supports UDP), this failed
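To illustrate the second point: a GELF log message sent the way the docker gelf driver sends it is a single UDP datagram, so a load balancer that only forwards TCP connections never sees it. Below is a minimal Python sketch of such a sender. The function names build_gelf and send_udp are made up for the example, the payload carries only the basic GELF 1.1 fields, and it is sent uncompressed; it is not the docker driver's actual code.

```python
import json
import socket
import time


def build_gelf(host, short_message, level=6):
    """Return a minimal GELF 1.1 message as UTF-8 encoded JSON bytes."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 means informational
    }
    return json.dumps(payload).encode("utf-8")


def send_udp(payload, address):
    """Send payload as one UDP datagram to address, a (host, port) tuple."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)
```

Because there is no connection and no reply, a datagram sent to an address that only handles TCP is silently lost, which matches the symptom of logs simply not arriving.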

Consul.io - how to run multiple servers on same machine

This is probably a very basic question for you, but I'm just getting into consul and for testing purposes, I wanna run multiple servers on my PC. For example, I run the first server with
consul agent -server -bootstrap-expect=1 -dc=dev -data-dir=/tmp/consul -ui-dir="c:/consul 0.5.2/dist"
and then I try to run the second server with
consul agent -server -data-dir=/tmp/consul2 -dc=dc2
but it returns
==> Error starting agent: Failed to start Consul server: Failed to start RPC layer: listen tcp 0.0.0.0:8300: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.
What am I missing from my command?
You are launching two consul servers using mostly default values. In this case the problem is that you use default ports.
When you read the error message you will notice that your second consul server tries to bind to port 8300. But your first server is already using this port, causing the second server to fail at startup. (note: consul binds to a variety of ports, each having another purpose and default setting. Take a look at the documentation).
As suggested by LenW, you can use Vagrant to set your environment. You could follow the consul tutorial.
If you do not want to use Vagrant or set up any virtual machines on your own, you could change the defaults of the second server.
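The bind failure itself is easy to reproduce outside of consul. The following minimal Python sketch (the helper names bind_tcp and port_is_free are made up for this example) shows that a second listener on a port that is already taken fails, and it can be used to check whether a port such as 8300 is still free before picking ports for the second server.

```python
import socket


def bind_tcp(port, host="127.0.0.1"):
    """Bind and listen on (host, port); port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    sock.listen(1)
    return sock


def port_is_free(port, host="127.0.0.1"):
    """Return True if a new listener could bind (host, port) right now."""
    try:
        bind_tcp(port, host).close()
        return True
    except OSError:
        # Any bind error (typically "address already in use") counts as not free.
        return False
```

While the first server is running, port_is_free(8300, "0.0.0.0") would be expected to return False on the same machine, which is the condition the error message reports.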
If you are trying to simulate a production topology on your dev machine I would look at using Vagrant in combination with VirtualBox to simulate a couple of machines for testing.

RhodeCode - What is blocking my connection?

All connection attempts on RhodeCode on CentOS 6.3 are refused except from localhost.
Note that iptables is not running, and I am only trying to visit the web interface.
I have googled the exact error message below and looked around SO. I have yet to find a solution.
abort: error: No connection could be made because the target machine actively refused it
If the firewall is down, and I am not trying to modify any repository, what else is preventing me from connecting? EDIT: See #5 below. Not sure how to address it yet.
Things tried and other info
1. Using localhost, 127.0.0.1 and hostname in production.ini
2. service iptables stop
3. Connected over HTTP successfully. In other words, connections are accepted outside RhodeCode.
4. Made sure no authentication methods were enabled or configured in production.ini
5. Although the server accepts connections on localhost, netstat -l does not show that port 5000 is listening. Port 5000 is set in production.ini and ps uax | grep paster confirms the server is running. No other software tries to grab port 5000.
Ok, apparently I had been misunderstanding the host configuration. I was running on the assumption that host should be set to 127.0.0.1 or localhost in production.ini so that RhodeCode would know which host to look on for another service. This was a faulty presumption on my part, since I am used to pointing web applications at local systems to look for databases.
It turns out that host binds the application to a specific address for access, meaning that RhodeCode was set up to respond only to local requests, regardless of what other system policies say. The setup docs did not make this clear because they did not say that external connections would be refused. All they said was:
This command [paster serve] runs the RhodeCode server. The web app should be available at the 127.0.0.1:5000. This ip and port is configurable via the production.ini file created in previous step
The problem was fixed by binding RhodeCode to 0.0.0.0, which opened it to outside connections. Kudos to Łukasz Balcerzak for pointing this out in the RC support google group.
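The difference between the two settings can be shown with plain sockets. This is a minimal Python sketch, unrelated to RhodeCode's own code; the helper names start_listener and can_connect are made up for the example.

```python
import socket


def start_listener(bind_host):
    """Listen on bind_host using a free port chosen by the OS."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((bind_host, 0))
    server.listen(1)
    return server


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A listener started with start_listener("127.0.0.1") only accepts connections addressed to 127.0.0.1, so clients that target the machine's LAN address are refused. One started with start_listener("0.0.0.0") accepts connections on every local address, which is why the change in production.ini opened the web interface to other machines.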