I am running two VMs with Vagrant on a private network:
host1: 192.168.1.1/24
host2: 192.168.1.2/24
On host1, the app listens on port 6443, but it cannot be reached from host2:
# host1
root@host1:~# ss -lntp | grep 6443
LISTEN 0 4096 *:6443 *:* users:(("kube-apiserver",pid=10537,fd=7))
# host2
root@host2:~# nc -zv -w 3 192.168.1.1 6443
nc: connect to 192.168.1.1 port 6443 (tcp) failed: Connection refused
(The app is actually kube-apiserver, and host2 fails to join as a worker node with kubeadm.)
What am I missing?
Both are Ubuntu Focal (box_version '20220215.1.0') and UFW is inactive.
After changing the hosts' IPs, it works:
host1: 192.168.1.1/24 -> 192.168.1.2/24
host2: 192.168.1.2/24 -> 192.168.1.3/24
I suspect the cause was assigning host1 the first IP of the subnet, 192.168.1.1, which is reserved for the gateway.
I'll add references about that here later; I have to set up the k8s cluster for now.
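A quick way to confirm the conflict from inside either VM (a minimal sketch; the private-network NIC name enp0s8 is an assumption, and on VirtualBox host-only networks the host/gateway side typically claims the .1 address):
# Addresses assigned to this VM on the private network
ip -4 addr show dev enp0s8
# Routes and gateways per interface
ip route show
# Check whether something else (e.g. the VirtualBox host-only adapter) already answers for .1
ping -c1 192.168.1.1 >/dev/null; ip neigh | grep 192.168.1.1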
I have an Arch Linux Linode running WordPress via the LinuxServer.io SWAG container. It works. I installed UFW and Tailscale; all SSH traffic goes over the tailnet, and ports 80 and 443 are open:
Status: active
To                            Action      From
--                            ------      ----
Anywhere on tailscale0        ALLOW       Anywhere
80                            ALLOW       Anywhere
443                           ALLOW       Anywhere
Anywhere (v6) on tailscale0   ALLOW       Anywhere (v6)
Everything works well until I access the WordPress instance from my iPhone (Firefox for iOS); at that point, the IP the iPhone is using gets blocked for some time. Let me demonstrate.
First, I scan the ports:
nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 21:50 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.034s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp open http
443/tcp open https
Nmap done: 1 IP address (1 host up) scanned in 0.15 seconds
Now let me connect with my iPhone (to another domain pointing to the same box, routed to WordPress) and see what we get within 2 seconds or so:
nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 21:51 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.034s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp closed http
443/tcp closed https
Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds
Now, if I switch on Mullvad VPN on my iPhone, I can reach the WP instance and click about one link before it's blocked again. If I switch on Mullvad VPN on my laptop, I can access the system too. Let me demonstrate; I execute these commands within 3 seconds or so:
[freek@freex ~]$ nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 21:54 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.033s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp closed http
443/tcp closed https
Nmap done: 1 IP address (1 host up) scanned in 0.12 seconds
[freek@freex ~]$ wg-quick up mullvad-se3
wg-quick must be run as root. Please enter the password for freek to continue:
[#] ip link add mullvad-se3 type wireguard
[#] wg setconf mullvad-se3 /dev/fd/63
[#] ip -4 address add 10.66.88.174/32 dev mullvad-se3
[#] ip -6 address add fc00:bbbb:bbbb:bb01::3:58ad/128 dev mullvad-se3
[#] ip link set mtu 1420 up dev mullvad-se3
[#] resolvconf -a mullvad-se3 -m 0 -x
[#] wg set mullvad-se3 fwmark 51820
[#] ip -6 route add ::/0 dev mullvad-se3 table 51820
[#] ip -6 rule add not fwmark 51820 table 51820
[#] ip -6 rule add table main suppress_prefixlength 0
[#] nft -f /dev/fd/63
[#] ip -4 route add 0.0.0.0/0 dev mullvad-se3 table 51820
[#] ip -4 rule add not fwmark 51820 table 51820
[#] ip -4 rule add table main suppress_prefixlength 0
[#] sysctl -q net.ipv4.conf.all.src_valid_mark=1
[#] nft -f /dev/fd/63
[freek@freex ~]$ nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 21:54 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.050s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp open http
443/tcp open https
Nmap done: 1 IP address (1 host up) scanned in 0.25 seconds
Indeed, now I can access and use the site from my laptop normally. And in about 10-20 minutes the ports open up again for my regular IP address.
I'm really baffled. I didn't install any firewall other than UFW, and it shouldn't even be blocking anything. I did not install fail2ban or any similar service.
What could this be? Why does my iPhone trigger it (with normal use even)?
Any suggestions on how to further investigate?
Oh, and when I disable UFW it still happens. Here I keep port scanning while I access the WP instance:
nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 22:03 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.036s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp open http
443/tcp open https
Nmap done: 1 IP address (1 host up) scanned in 0.16 seconds
[freek@freex ~]$ nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 22:03 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.16s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp closed http
443/tcp filtered https
Nmap done: 1 IP address (1 host up) scanned in 4.50 seconds
[freek@freex ~]$ nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 22:03 CET
Note: Host seems down. If it is really up, but blocking our ping probes, try -Pn
Nmap done: 1 IP address (0 hosts up) scanned in 3.04 seconds
[freek@freex ~]$ nmap -p 443,80 anacreon.domain.nl
Starting Nmap 7.93 ( https://nmap.org ) at 2023-01-05 22:03 CET
Nmap scan report for anacreon.domain.nl (139.144.66.219)
Host is up (0.27s latency).
Other addresses for anacreon.domain.nl (not scanned): 2a01:7e01::f03c:93ff:fea2:10ab
PORT STATE SERVICE
80/tcp closed http
443/tcp closed https
Nmap done: 1 IP address (1 host up) scanned in 2.27 seconds
Notice that for a time it indicates the port is filtered...
Perhaps I should also mention that I have nginx basic auth enabled in front of the WP instance.
As I said, I'm confused. I would really like to learn how to determine what is going wrong here.
OK, I found it: the SWAG container has fail2ban activated... It seems that it does not play nice with nginx basic auth: https://docs.linuxserver.io/images/docker-swag#using-fail2ban
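For anyone hitting the same thing, a minimal sketch of how to inspect and unban from inside the container (assuming the container is named swag and that the nginx-http-auth jail is the one firing; both names are guesses):
# List jails and currently banned IPs inside the SWAG container
docker exec swag fail2ban-client status
docker exec swag fail2ban-client status nginx-http-auth
# Unban a specific address while testing (example address)
docker exec swag fail2ban-client set nginx-http-auth unbanip 203.0.113.10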
I guess this is the flip side of complex infra as code; 15 years ago there was never a service I didn't know I was running ;)
My Docker Postgres instance can't be connected to from the internet.
I think it is because Docker has mapped it to localhost only.
root@VM01:~/docker# docker port postgres
5432/tcp -> 127.0.0.1:5432
I am new to Docker and would like to try remapping that to
5432/tcp -> 0.0.0.0:5432
to see if I can then connect remotely over the internet.
root@VM01:~/docker# netstat -na | grep 5432
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN
Does anyone have experience doing this, or advice on whether it might work?
I have another Docker container on the same host that shows 0.0.0.0:8000, and using telnet from any machine on the internet shows it is accessible.
Not this one, though:
127.0.0.1:5432->5432/tcp
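A published binding cannot be changed in place; the container has to be recreated with a different -p mapping. A minimal sketch, assuming the official postgres image and a container simply named postgres (when no IP is given, -p publishes on 0.0.0.0):
# Recreate the container, publishing the port on all interfaces
# (-p 5432:5432 is equivalent to 0.0.0.0:5432->5432/tcp)
# Make sure the data lives in a volume before removing the old container
docker rm -f postgres
docker run -d --name postgres \
  -e POSTGRES_PASSWORD=changeme \
  -p 5432:5432 \
  postgres:15
# Verify the new binding
docker port postgres
Keep in mind that this exposes the database to the whole internet, so locking it down with pg_hba.conf and a host firewall is worth doing at the same time.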
Running a clean install of microk8s 1.19 on Fedora 32, I am able to ping an external IP address, but when I try to wget, I get "no route to host" (this is the output of commands run from a busybox pod):
/ # wget x.x.x.x
Connecting to x.x.x.x (x.x.x.x:80)
wget: can't connect to remote host (x.x.x.x): No route to host
/ # ping x.x.x.x
PING x.x.x.x (x.x.x.x): 56 data bytes
64 bytes from x.x.x.x: seq=0 ttl=127 time=1.209 ms
64 bytes from x.x.x.x: seq=1 ttl=127 time=0.765 ms
I finally found https://github.com/ubuntu/microk8s/issues/408
I had to enable masquerading in the firewalld zone associated with the bridge interface, or, in my case, with my Ethernet connection.
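A minimal sketch of that fix, assuming firewalld with the interface in the FedoraWorkstation zone (the zone name is a guess; check the first command's output):
# See which zone each interface belongs to
sudo firewall-cmd --get-active-zones
# Enable masquerading in that zone, then reload
sudo firewall-cmd --permanent --zone=FedoraWorkstation --add-masquerade
sudo firewall-cmd --reload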
This is a Kubespray deployment using Calico. All the defaults were left as-is, except for the fact that there is a proxy. Kubespray ran to the end without issues.
Access to Kubernetes services started failing, and after investigation there was no route to host for the CoreDNS service. Accessing a K8s service by IP worked. Everything else seems to be correct, so I am left with a cluster that works, but without DNS.
Here is some background information:
Starting up a busybox container:
# nslookup kubernetes.default
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find kubernetes.default: NXDOMAIN
*** Can't find kubernetes.default: No answer
Now the output when pointing nslookup explicitly at the CoreDNS service IP:
# nslookup kubernetes.default 10.233.0.3
;; connection timed out; no servers could be reached
Notice that telnet to the Kubernetes API works:
# telnet 10.233.0.1 443
Connected to 10.233.0.1
kube-proxy logs (10.233.0.3 is the service IP for CoreDNS; the last lines look concerning, even though they are INFO):
$ kubectl logs kube-proxy-45v8n -nkube-system
I1114 14:19:29.657685 1 node.go:135] Successfully retrieved node IP: X.59.172.20
I1114 14:19:29.657769 1 server_others.go:176] Using ipvs Proxier.
I1114 14:19:29.664959 1 server.go:529] Version: v1.16.0
I1114 14:19:29.665427 1 conntrack.go:52] Setting nf_conntrack_max to 262144
I1114 14:19:29.669508 1 config.go:313] Starting service config controller
I1114 14:19:29.669566 1 shared_informer.go:197] Waiting for caches to sync for service config
I1114 14:19:29.669602 1 config.go:131] Starting endpoints config controller
I1114 14:19:29.669612 1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I1114 14:19:29.769705 1 shared_informer.go:204] Caches are synced for service config
I1114 14:19:29.769756 1 shared_informer.go:204] Caches are synced for endpoints config
I1114 14:21:29.666256 1 graceful_termination.go:93] lw: remote out of the list: 10.233.0.3:53/TCP/10.233.124.23:53
I1114 14:21:29.666380 1 graceful_termination.go:93] lw: remote out of the list: 10.233.0.3:53/TCP/10.233.122.11:53
All pods are running without crashes or restarts, and otherwise services behave correctly.
IPVS looks correct, and the CoreDNS service is defined there:
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.233.0.1:443 rr
-> x.59.172.19:6443 Masq 1 0 0
-> x.59.172.20:6443 Masq 1 1 0
TCP 10.233.0.3:53 rr
-> 10.233.122.12:53 Masq 1 0 0
-> 10.233.124.24:53 Masq 1 0 0
TCP 10.233.0.3:9153 rr
-> 10.233.122.12:9153 Masq 1 0 0
-> 10.233.124.24:9153 Masq 1 0 0
TCP 10.233.51.168:3306 rr
-> x.59.172.23:6446 Masq 1 0 0
TCP 10.233.53.155:44134 rr
-> 10.233.89.20:44134 Masq 1 0 0
UDP 10.233.0.3:53 rr
-> 10.233.122.12:53 Masq 1 0 314
-> 10.233.124.24:53 Masq 1 0 312
Host routing also looks correct.
# ip r
default via x.59.172.17 dev ens3 proto dhcp src x.59.172.22 metric 100
10.233.87.0/24 via x.59.172.21 dev tunl0 proto bird onlink
blackhole 10.233.89.0/24 proto bird
10.233.89.20 dev calib88cf6925c2 scope link
10.233.89.21 dev califdffa38ed52 scope link
10.233.122.0/24 via x.59.172.19 dev tunl0 proto bird onlink
10.233.124.0/24 via x.59.172.20 dev tunl0 proto bird onlink
x.59.172.16/28 dev ens3 proto kernel scope link src x.59.172.22
x.59.172.17 dev ens3 proto dhcp scope link src x.59.172.22 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
I have redeployed this same cluster in separate environments with Flannel, and with Calico using iptables instead of IPVS. I have also temporarily disabled the Docker HTTP proxy after deploying. None of this makes any difference.
Also:
kube_service_addresses: 10.233.0.0/18
kube_pods_subnet: 10.233.64.0/18
(They do not overlap)
What is the next step in debugging this issue?
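For reference, one check I am considering (a sketch; 10.233.122.12 and 10.233.124.24 are the CoreDNS pod endpoints from the ipvsadm output above, and the service name coredns is an assumption) is querying the CoreDNS pods directly from a node, to separate an IPVS/kube-proxy problem from a pod-network problem:
# List the pod IPs behind the CoreDNS service
kubectl -n kube-system get endpoints coredns -o wide
# Query each pod IP directly from a node; a timeout here points at
# pod-to-pod (overlay) traffic rather than at IPVS/kube-proxy
dig @10.233.122.12 kubernetes.default.svc.cluster.local +short +time=3
dig @10.233.124.24 kubernetes.default.svc.cluster.local +short +time=3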
I highly recommend avoiding the latest busybox image to troubleshoot DNS. There are a few reported issues regarding nslookup on versions newer than 1.28.
v 1.28.4
user@node1:~$ kubectl exec -ti busybox busybox | head -1
BusyBox v1.28.4 (2018-05-22 17:00:17 UTC) multi-call binary.
user@node1:~$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 169.254.25.10
Address 1: 169.254.25.10
Name: kubernetes.default
Address 1: 10.233.0.1 kubernetes.default.svc.cluster.local
v 1.31.1
user@node1:~$ kubectl exec -ti busyboxlatest busybox | head -1
BusyBox v1.31.1 (2019-10-28 18:40:01 UTC) multi-call binary.
user@node1:~$ kubectl exec -ti busyboxlatest -- nslookup kubernetes.default
Server: 169.254.25.10
Address: 169.254.25.10:53
** server can't find kubernetes.default: NXDOMAIN
*** Can't find kubernetes.default: No answer
command terminated with exit code 1
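A quick way to repeat the test with the older image (a sketch; the pod name dnstest is arbitrary):
# One-off DNS test pod pinned to busybox 1.28
kubectl run dnstest --rm -ti --restart=Never --image=busybox:1.28 \
  -- nslookup kubernetes.default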
Going deeper and exploring more possibilities, I reproduced your problem on GCP, and after some digging I was able to figure out what is causing this communication problem.
GCE (Google Compute Engine) blocks traffic between hosts by default; we have to allow Calico traffic to flow between containers on different hosts.
According to the Calico documentation, you can do this by creating a firewall rule that allows this communication:
gcloud compute firewall-rules create calico-ipip --allow 4 --network "default" --source-ranges "10.128.0.0/9"
You can verify the rule with this command:
gcloud compute firewall-rules list
This is not present in the most recent Calico documentation, but it is still true and necessary. (Protocol 4 in the rule is IP-in-IP, the encapsulation Calico's default IPIP mode uses for pod-to-pod traffic between nodes.)
Before creating the firewall rule:
user@node1:~$ kubectl exec -ti busybox2 -- nslookup kubernetes.default
Server: 10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
nslookup: can't resolve 'kubernetes.default'
command terminated with exit code 1
After creating the firewall rule:
user@node1:~$ kubectl exec -ti busybox2 -- nslookup kubernetes.default
Server: 10.233.0.3
Address 1: 10.233.0.3 coredns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.233.0.1 kubernetes.default.svc.cluster.local
It doesn't matter whether you bootstrap your cluster using Kubespray or kubeadm; this problem will happen because Calico needs to communicate between nodes and GCE blocks that traffic by default.
This is what worked for me. I installed my k8s cluster using Kubespray, configured with Calico as the CNI and containerd as the container runtime:
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -F
Then delete the CoreDNS pods so they are recreated.
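A sketch of that last step, assuming the CoreDNS pods carry the usual k8s-app=kube-dns label:
# Delete the CoreDNS pods; the Deployment recreates them
kubectl -n kube-system delete pod -l k8s-app=kube-dns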
I bought a VPS from DigitalOcean with the Rails + Unicorn + Nginx application stack. I installed PostgreSQL 9.1 and am trying to accept remote connections to it. I read all of the solutions to problems like this (Googled a lot) and followed them exactly. The problem is the following:
psql: could not connect to server: Connection refused
Is the server running on host "xxx.xxx.xxx.xxx" and accepting
TCP/IP connections on port 5432?
I edited the postgresql.conf file with listen_addresses='*'
I edited the pg_hba.conf file and added host all all 0.0.0.0/0 md5
I restarted the PostgreSQL service and even the whole VPS; however, I still cannot connect to the database. So I checked the server's listening ports:
netstat -an | grep 5432
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN
unix 2 [ ACC ] STREAM LISTENING 8356 /tmp/.s.PGSQL.5432
and then I nmap'ed the server:
Not shown: 996 closed ports
PORT STATE SERVICE
21/tcp open ftp
22/tcp open ssh
80/tcp open http
554/tcp open rtsp
But I still cannot understand why PostgreSQL is not serving on port 5432 after these configuration changes. I need advice.
Thanks.
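For reference, a minimal sketch for checking which configuration the running server actually loaded (with several PostgreSQL versions installed it is easy to edit a postgresql.conf that the active cluster never reads; the service name is an assumption):
# Ask the running server which config file it loaded and what it listens on
sudo -u postgres psql -c "SHOW config_file;"
sudo -u postgres psql -c "SHOW listen_addresses;"
sudo -u postgres psql -c "SHOW port;"
# listen_addresses only takes effect after a full restart, not a reload
sudo service postgresql restart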