Get non SYN/ACK packets of gRPC via tcpdump - sockets

I want to see just the DATA packets in the underlying transfers in gRPC. I ran the greeter server-client example application, with both processes running on localhost. When I capture traffic on port 50051, I get the following trace (a few lines omitted for brevity):
$ sudo tcpdump -i lo port 50051
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
14:55:43.756358 IP6 ip6-localhost.58242 > ip6-localhost.50051: Flags [S], seq 3711036939, win 65476, options [mss 65476,sackOK,TS val 3380654929 ecr 0,nop,wscale 7], length 0
14:55:43.756379 IP6 ip6-localhost.50051 > ip6-localhost.58242: Flags [S.], seq 60801323, ack 3711036940, win 65464, options [mss 65476,sackOK,TS val 3380654929 ecr 3380654929,nop,wscale 7], length 0
...
...
14:55:43.760075 IP6 ip6-localhost.50051 > ip6-localhost.58242: Flags [P.], seq 224:241, ack 396, win 512, options [nop,nop,TS val 3380654933 ecr 3380654933], length 17
14:55:43.760091 IP6 ip6-localhost.58242 > ip6-localhost.50051: Flags [.], ack 241, win 512, options [nop,nop,TS val 3380654933 ecr 3380654933], length 0
14:55:43.760440 IP6 ip6-localhost.58242 > ip6-localhost.50051: Flags [F.], seq 396, ack 241, win 512, options [nop,nop,TS val 3380654933 ecr 3380654933], length 0
14:55:43.760588 IP6 ip6-localhost.58242 > ip6-localhost.50051: Flags [R.], seq 397, ack 241, win 512, options [nop,nop,TS val 3380654933 ecr 3380654933], length 0
Note that the above trace contains SYN, ACK, and FIN packets too.
However, when I try to extract just the SYN and ACK packets, I get no output:
$ sudo tcpdump -i lo 'tcp port 50051 and tcp[tcpflags] & (tcp-syn|tcp-ack) != 0'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
I do get the packets back when I reverse the condition:
$ sudo tcpdump -i lo 'tcp port 50051 and not tcp[tcpflags] & (tcp-syn|tcp-ack) != 0'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
15:05:09.644989 IP6 ip6-localhost.58658 > ip6-localhost.50051: Flags [S], seq 1905191489, win 65476, options [mss 65476,sackOK,TS val 3381220818 ecr 0,nop,wscale 7], length 0
15:05:09.645011 IP6 ip6-localhost.50051 > ip6-localhost.58658: Flags [S.], seq 3628833616, ack 1905191490, win 65464, options [mss 65476,sackOK,TS val 3381220818 ecr 3381220818,nop,wscale 7], length 0
...
...
15:05:09.649368 IP6 ip6-localhost.50051 > ip6-localhost.58658: Flags [P.], seq 224:241, ack 396, win 512, options [nop,nop,TS val 3381220822 ecr 3381220822], length 17
15:05:09.649382 IP6 ip6-localhost.58658 > ip6-localhost.50051: Flags [.], ack 241, win 512, options [nop,nop,TS val 3381220822 ecr 3381220822], length 0
15:05:09.649768 IP6 ip6-localhost.58658 > ip6-localhost.50051: Flags [F.], seq 396, ack 241, win 512, options [nop,nop,TS val 3381220822 ecr 3381220822], length 0
15:05:09.649929 IP6 ip6-localhost.58658 > ip6-localhost.50051: Flags [R.], seq 397, ack 241, win 512, options [nop,nop,TS val 3381220823 ecr 3381220822], length 0
How do I remove the SYN/ACK/FIN packets from the tcpdump trace?
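A possible explanation, offered as a hedged note rather than a confirmed answer: the trace is IPv6 (IP6 ip6-localhost), and the pcap-filter documentation notes that the tcp[...] indexing form has historically applied only to IPv4, so tcp[tcpflags] may never match on this interface. That would also explain why the negated filter matches everything. Separately, (tcp-syn|tcp-ack) != 0 is true for any segment with either bit set, which includes data segments, since they normally carry ACK. One way to sidestep both issues is to post-filter tcpdump's text output on payload length. The sketch below is in Python; the name data_packets is made up for illustration.

```python
import re

# Matches the trailing "length N" field that tcpdump prints for each TCP segment.
_LENGTH_RE = re.compile(r"\blength (\d+)\s*$")


def data_packets(lines):
    """Yield only the tcpdump output lines whose TCP payload length is > 0."""
    for line in lines:
        match = _LENGTH_RE.search(line)
        if match and int(match.group(1)) > 0:
            yield line


# Two lines taken from the trace above: one data segment, one bare ACK.
sample = [
    "14:55:43.760075 IP6 ip6-localhost.50051 > ip6-localhost.58242: Flags [P.], "
    "seq 224:241, ack 396, win 512, length 17",
    "14:55:43.760091 IP6 ip6-localhost.58242 > ip6-localhost.50051: Flags [.], "
    "ack 241, win 512, length 0",
]
for kept in data_packets(sample):
    print(kept)  # only the [P.] line with length 17 is printed
```

The same function can be fed line by line from a live capture, for example by piping tcpdump -l output into a small script that iterates over standard input.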

Related

TCP proxy through Istio is not working in one of our clusters but works in another. What should I change in the config? How do I debug further?

A TCP proxy through Istio fails in one of our clusters but works in another, even though the configuration is identical.
We have configured RabbitMQ, and it accepts TLS connections on port 5671. If I port-forward the rabbitmq service and connect from localhost, it works, but the same connection does not work through the Istio TCP proxy. The same code works with RabbitMQ running in the other cluster.
The RabbitMQ logs show no connection lifecycle entries. The connection closes abruptly with the following error:
error in connection to rabbitmq Error: Client network socket disconnected before secure TLS connection was established
Gateway and VS config
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: amqptls-ingressgateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 5671
      name: ampqs
      protocol: TCP
    hosts:
    - "rabbitmq.xyz.io"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: rabbitmq-virtual-service
  namespace: istio-system
spec:
  hosts:
  - "rabbitmq.xyz.io"
  gateways:
  - amqptls-ingressgateway
  tcp:
  - match:
    - port: 5671
    route:
    - destination:
        host: rabbitmq.xyz.svc.cluster.local
        port:
          number: 5671
Services
MacBook-Pro-3:Desktop manuchaudhary$k get svc -nistio-system
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                      AGE
infra-applications     ClusterIP   172.20.191.220   <none>        8080/TCP                                     284d
istio-ingressgateway   ClusterIP   172.20.142.243   <none>        15021/TCP,80/TCP,443/TCP,5671/TCP,5672/TCP   285d
istiod                 ClusterIP   172.20.55.27     <none>        15010/TCP,15012/TCP,443/TCP,15014/TCP        285d
RabbitMQ
MacBook-Pro-3:Desktop manuchaudhary$ k get svc -nxyz | grep rabbitmq
rabbitmq            ClusterIP   172.20.36.189   <none>   5672/TCP,5671/TCP,4369/TCP,25672/TCP,15672/TCP,9419/TCP   72m
rabbitmq-headless   ClusterIP   None            <none>   4369/TCP,5672/TCP,5671/TCP,25672/TCP,15672/TCP            72m
Other info
MacBook-Pro-3:bin manuchaudhary$ ./istioctl version
client version: 1.12.0
control plane version: 1.12.0
data plane version: 1.12.0 (16 proxies)
TCP dump inside Istio ingress gateway of a working cluster
15:46:23.647656 IP 122.162.144.32.23435 > 10.0.1.47.5671: Flags [S], seq 3103710824, win 65535, options [mss 1440,nop,wscale 6,nop,nop,TS val 2736535918 ecr 0,sackOK,eol], length 0
15:46:23.647697 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [S.], seq 681655678, ack 3103710825, win 62643, options [mss 8961,sackOK,TS val 3140511074 ecr 2736535918,nop,wscale 7], length 0
15:46:23.717973 IP 122.162.144.32.23435 > 10.0.1.47.5671: Flags [.], ack 1, win 2052, options [nop,nop,TS val 2736535999 ecr 3140511074], length 0
15:46:23.720154 IP 122.162.144.32.23435 > 10.0.1.47.5671: Flags [P.], seq 1:308, ack 1, win 2052, options [nop,nop,TS val 2736536001 ecr 3140511074], length 307
15:46:23.720175 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [.], ack 308, win 487, options [nop,nop,TS val 3140511146 ecr 2736536001], length 0
15:46:23.722885 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [.], seq 1:1429, ack 308, win 487, options [nop,nop,TS val 3140511149 ecr 2736536001], length 1428
15:46:23.722892 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [P.], seq 1429:2857, ack 308, win 487, options [nop,nop,TS val 3140511149 ecr 2736536001], length 1428
15:46:23.722897 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [.], seq 2857:4285, ack 308, win 487, options [nop,nop,TS val 3140511149 ecr 2736536001], length 1428
15:46:23.722899 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [P.], seq 4285:5713, ack 308, win 487, options [nop,nop,TS val 3140511149 ecr 2736536001], length 1428
15:46:23.722921 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [P.], seq 5713:6595, ack 308, win 487, options [nop,nop,TS val 3140511149 ecr 2736536001], length 882
15:46:23.795525 IP 122.162.144.32.23435 > 10.0.1.47.5671: Flags [.], ack 6595, win 1949, options [nop,nop,TS val 2736536075 ecr 3140511149], length 0
15:46:23.803620 IP 122.162.144.32.23435 > 10.0.1.47.5671: Flags [.], ack 6595, win 2048, options [nop,nop,TS val 2736536084 ecr 3140511149], length 0
15:46:23.807625 IP 122.162.144.32.23435 > 10.0.1.47.5671: Flags [P.], seq 308:446, ack 6595, win 2048, options [nop,nop,TS val 2736536086 ecr 3140511149], length 138
15:46:23.807636 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [.], ack 446, win 486, options [nop,nop,TS val 3140511234 ecr 2736536086], length 0
15:46:23.807883 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [P.], seq 6595:6602, ack 446, win 486, options [nop,nop,TS val 3140511234 ecr 2736536086], length 7
15:46:23.808115 IP 10.0.1.47.5671 > 122.162.144.32.23435: Flags [F.], seq 6602, ack 446, win 486, options [nop,nop,TS val 3140511234 ecr 2736536086], length 0
...
TCP dump inside the Istio ingress gateway of a NOT WORKING cluster. Notice the FIN (Flags [F.]) that the gateway sends immediately after acknowledging the client's first 307-byte segment.
15:46:43.860865 IP 122.162.144.32.25867 > 10.0.3.179.5671: Flags [S], seq 1911004422, win 65535, options [mss 1440,nop,wscale 6,nop,nop,TS val 1866753148 ecr 0,sackOK,eol], length 0
15:46:43.860889 IP 10.0.3.179.5671 > 122.162.144.32.25867: Flags [S.], seq 3885047813, ack 1911004423, win 62643, options [mss 8961,sackOK,TS val 2941952952 ecr 1866753148,nop,wscale 7], length 0
15:46:44.147954 IP 122.162.144.32.25867 > 10.0.3.179.5671: Flags [.], ack 1, win 2052, options [nop,nop,TS val 1866753437 ecr 2941952952], length 0
15:46:44.148953 IP 122.162.144.32.25867 > 10.0.3.179.5671: Flags [P.], seq 1:308, ack 1, win 2052, options [nop,nop,TS val 1866753440 ecr 2941952952], length 307
15:46:44.148969 IP 10.0.3.179.5671 > 122.162.144.32.25867: Flags [.], ack 308, win 487, options [nop,nop,TS val 2941953240 ecr 1866753440], length 0
15:46:44.149082 IP 10.0.3.179.5671 > 122.162.144.32.25867: Flags [F.], seq 1, ack 308, win 487, options [nop,nop,TS val 2941953240 ecr 1866753440], length 0
15:46:44.455798 IP 122.162.144.32.25867 > 10.0.3.179.5671: Flags [.], ack 2, win 2052, options [nop,nop,TS val 1866753746 ecr 2941953240], length 0
15:46:44.455798 IP 122.162.144.32.25867 > 10.0.3.179.5671: Flags [.], ack 2, win 2052, options [nop,nop,TS val 1866753746 ecr 2941953240], length 0
15:46:44.459798 IP 122.162.144.32.25867 > 10.0.3.179.5671: Flags [F.], seq 308, ack 2, win 2052, options [nop,nop,TS val 1866753750 ecr 2941953240], length 0
15:46:44.459818 IP 10.0.3.179.5671 > 122.162.144.32.25867: Flags [.], ack 309, win 487, options [nop,nop,TS val 2941953551 ecr 1866753750], length 0
...
So the issue was that there was an AuthorizationPolicy in place:
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: block-api-clients
  namespace: istio-system
spec:
  action: DENY
  rules:
  - to:
    - operation:
        paths: ["/api/clients/xyz*"]
I am not sure why this caused Istio to close the connection to both downstream and upstream, but removing the policy fixed the issue.
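A likely explanation, based on a reading of the Istio authorization documentation and not verified against this cluster: port 5671 is plain TCP from the gateway's point of view, and paths is an HTTP-only field. For a DENY policy on a TCP port, Istio cannot evaluate the HTTP-only fields, drops them, and keeps the rest of the rule, so a rule that consists only of paths ends up matching every TCP connection. The policy also has no selector, so it applies to every workload in istio-system, including the ingress gateway. The Python sketch below is a deliberately simplified model of that behaviour; the function names and the HTTP_ONLY_FIELDS set are illustrative and are not Istio or Envoy code.

```python
# Operation fields that only make sense for HTTP traffic (illustrative subset).
HTTP_ONLY_FIELDS = {"hosts", "notHosts", "methods", "notMethods", "paths", "notPaths"}


def effective_tcp_operation(operation):
    """Model: on a TCP port, HTTP-only fields of a DENY rule are dropped."""
    return {k: v for k, v in operation.items() if k not in HTTP_ONLY_FIELDS}


def deny_rule_matches_tcp(operation, dest_port):
    """Return True if the modelled DENY rule would match a TCP connection."""
    remaining = effective_tcp_operation(operation)
    ports = remaining.get("ports")
    # With no remaining constraint, the rule matches every connection.
    return ports is None or str(dest_port) in ports


# The operation from the policy above, applied to the AMQPS port.
print(deny_rule_matches_tcp({"paths": ["/api/clients/xyz*"]}, 5671))
# A hypothetical variant that is scoped to the HTTPS port only.
print(deny_rule_matches_tcp({"paths": ["/api/clients/xyz*"], "ports": ["443"]}, 5671))
```

Under this model, scoping the policy with a selector for the HTTP workload, or restricting the operation to the HTTP ports, would keep it from touching port 5671. Treat that as a hypothesis to test, not as a confirmed fix.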

TCPDUMP, tcp Flag not changing from Flags [S] to other flag values

I need help understanding the lines below.
When I try to connect to the server on a particular port, the client shows "connecting" and then fails with a timeout error.
In the tcpdump output, the packet flag never changes from [S] to any other flag.
Please review the log below and suggest how to resolve this.
07:30:42.787417 IP broadband.actcorp.in.44306 > xxx.xx.xxx.xx.5000: Flags [S], seq 416168771, win 29200, options [mss 1460,sackOK,TS val 1731483200 ecr 0,nop,wscale 7], length 0
E..<.C#./.6fj3.(..d.......;C......r.q..........
g4V#........
07:30:42.788613 IP broadband.actcorp.in.44304 > xxx.xxx.xxx.xx.5000: Flags [S], seq 288165140, win 29200, options [mss 1460,sackOK,TS val 1731483200 ecr 0,nop,wscale 7], length 0
E..<..#./...j3.(..d......-........r............
g4V#........
07:30:43.811043 IP broadband.actcorp.in.44306 > xxx.xxx.xxx.xx.5000: Flags [S], seq 416168771, win 29200, options [mss 1460,sackOK,TS val 1731484225 ecr 0,nop,wscale 7], length 0
E..<.D#./.6ej3.(..d.......;C......r.m..........
g4ZA........
07:30:43.812304 IP broadband.actcorp.in.44304 > xxx.xxx.xxx.xx.5000: Flags [S], seq 288165140, win 29200, options [mss 1460,sackOK,TS val 1731484225 ecr 0,nop,wscale 7], length 0
E..<..#./...j3.(..d......-........r............
g4ZA........
07:30:45.826800 IP broadband.actcorp.in.44306 > xxx.xxx.xxx.xx.5000: Flags [S], seq 416168771, win 29200, options [mss 1460,sackOK,TS val 1731486240 ecr 0,nop,wscale 7], length 0
E..<.E#./.6dj3.(..d.......;C......r.e..........
g4b ........
07:30:45.828063 IP broadband.actcorp.in.44304 > xxx.xxx.xxx.xx.5000: Flags [S], seq 288165140, win 29200, options [mss 1460,sackOK,TS val 1731486240 ecr 0,nop,wscale 7], length 0
E..<..#./...j3.(..d......-........r............
g4b ........
07:30:49.955119 IP broadband.actcorp.in.44306 > xxx.xxx.xxx.xx.5000: Flags [S], seq 416168771, win 29200, options [mss 1460,sackOK,TS val 1731490368 ecr 0,nop,wscale 7], length 0
E..<.F#./.6cj3.(..d.......;C......r.U..........
g4r#........
07:30:49.956179 IP broadband.actcorp.in.44304 > xxx.xxx.xxx.xx.5000: Flags [S], seq 288165140, win 29200, options [mss 1460,sackOK,TS val 1731490368 ecr 0,nop,wscale 7], length 0
E..<..#./...j3.(..d......-........r............
g4r#........
07:30:50.425698 IP 192.168.100.4.46835 > xxx.xxx.xxx.xx.5000: Flags [S], seq 3617736523, win 14600, options [mss 1460,sackOK,TS val 3944752462 ecr 0,nop,wscale 7], length 0
E..<b.#.#..R..d...d.......GK......9............
. %N........
07:30:51.427297 IP 192.168.100.4.46835 > xxx.xxx.xxx.xx.5000: Flags [S], seq 3617736523, win 14600, options [mss 1460,sackOK,TS val 3944753464 ecr 0,nop,wscale 7], length 0
E..<b.#.#..Q..d...d.......GK......9............
. )8........
07:30:53.431409 IP 192.168.100.4.46835 > xxx.xxx.xxx.xx.5000: Flags [S], seq 3617736523, win 14600, options [mss 1460,sackOK,TS val 3944755468 ecr 0,nop,wscale 7], length 0
E..<b.#.#..P..d...d.......GK......9............
. 1.........
07:30:57.435466 IP 192.168.100.4.46835 > xxx.xxx.xxx.xx.5000: Flags [S], seq 3617736523, win 14600, options [mss 1460,sackOK,TS val 3944759472 ecr 0,nop,wscale 7], length 0
E..<b.#.#..O..d...d.......GK......9..X.........
. #.........
07:30:58.147142 IP broadband.actcorp.in.44306 > xxx.xxx.xxx.xx.5000: Flags [S], seq 416168771, win 29200, options [mss 1460,sackOK,TS val 1731498561 ecr 0,nop,wscale 7], length 0
E..<.G#./.6bj3.(..d.......;C......r.5..........
g4.A........
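Every packet in this log is a SYN, and each source port keeps repeating the same sequence number, which means the clients are retransmitting their SYN and never see a SYN-ACK come back. To make the pattern visible, the Python sketch below groups SYN lines by source endpoint and sequence number and computes the gap between attempts. The helper name syn_retransmit_gaps is hypothetical, and the regular expression assumes the default one-line tcpdump format shown above.

```python
import re
from collections import defaultdict
from datetime import datetime

# Timestamp, source endpoint, and sequence number of a bare SYN line.
SYN_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP (\S+) > \S+ Flags \[S\], seq (\d+)"
)


def syn_retransmit_gaps(lines):
    """Map (source, seq) to the list of seconds between successive SYNs.

    Hex/ASCII payload lines do not match the pattern and are skipped.
    Captures that cross midnight are not handled in this sketch.
    """
    seen = defaultdict(list)
    for line in lines:
        match = SYN_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            seen[(match.group(2), match.group(3))].append(stamp)
    return {
        key: [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]
        for key, stamps in seen.items()
    }
```

Run against the log above, source port 44306 gives gaps of roughly 1, 2, 4, and 8 seconds, which is the usual exponential backoff for an unanswered SYN. That typically points to the SYN or the reply being dropped somewhere on the path (a firewall rule, a security group, nothing listening behind a NAT, or a missing return route), though the log alone cannot say which.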

Kubernetes multiport service: Unable to connect to graphite port in influxdb from a client pod

I have an influxdb service running in my Kubernetes cluster, and it exposes the following two ports:
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: influxdb
  name: influxdb
  namespace: my-namespace
spec:
  type: NodePort
  ports:
  - port: 8086
    name: influxdb-port
    targetPort: 8086
    nodePort: 30101
  - port: 2003
    name: graphite-port
    targetPort: 2003
    nodePort: 30103
  selector:
    k8s-app: influxdb
I also have graphite enabled in influxdb.conf as follows:
[[graphite]]
enabled = true
bind-address = ":2003"
protocol = "tcp"
When I deploy this service in my cluster, I am able to write to the graphite port from outside the k8s cluster as well as from any of the k8s nodes:
echo "local.random.diceroll 4 `date +%s`" | nc -v 10.233.26.252 2003
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.233.26.252:2003.
Ncat: 35 bytes sent, 0 bytes received in 0.01 seconds.
However, if I run the same command from a client pod running in the same namespace, the command hangs. From the same client pod I am able to write to the influxdb port without any issue.
The following is the tcpdump trace of the nc command sending to the graphite port 2003 from the client pod:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
03:02:17.133367 In b2:7c:b9:93:b8:9c ethertype IPv4 (0x0800), length 76: 10.233.102.131.60272 > 10.233.26.252.cfinger: Flags [S], seq 3653742255, win 29200, options [mss 1460,sackOK,TS val 629141248 ecr 0,nop,wscale 7], length 0
03:02:17.133414 Out ethertype IPv4 (0x0800), length 76: 10.233.102.131.60272 > 10.233.71.26.cfinger: Flags [S], seq 3653742255, win 29200, options [mss 1460,sackOK,TS val 629141248 ecr 0,nop,wscale 7], length 0
03:02:17.133783 In ethertype IPv4 (0x0800), length 76: 10.233.71.26.cfinger > 10.233.102.131.60272: Flags [S.], seq 4245034624, ack 3653742256, win 28960, options [mss 1460,sackOK,TS val 629140002 ecr 629141248,nop,wscale 7], length 0
03:02:17.133791 Out ee:ee:ee:ee:ee:ee ethertype IPv4 (0x0800), length 76: 10.233.26.252.cfinger > 10.233.102.131.60272: Flags [S.], seq 4245034624, ack 3653742256, win 28960, options [mss 1460,sackOK,TS val 629140002 ecr 629141248,nop,wscale 7], length 0
03:02:17.133805 In b2:7c:b9:93:b8:9c ethertype IPv4 (0x0800), length 68: 10.233.102.131.60272 > 10.233.26.252.cfinger: Flags [.], ack 1, win 229, options [nop,nop,TS val 629141248 ecr 629140002], length 0
03:02:17.133809 Out ethertype IPv4 (0x0800), length 68: 10.233.102.131.60272 > 10.233.71.26.cfinger: Flags [.], ack 1, win 229, options [nop,nop,TS val 629141248 ecr 629140002], length 0
03:02:17.134036 In b2:7c:b9:93:b8:9c ethertype IPv4 (0x0800), length 103: 10.233.102.131.60272 > 10.233.26.252.cfinger: Flags [P.], seq 1:36, ack 1, win 229, options [nop,nop,TS val 629141249 ecr 629140002], length 35
03:02:17.134049 Out ethertype IPv4 (0x0800), length 103: 10.233.102.131.60272 > 10.233.71.26.cfinger: Flags [P.], seq 1:36, ack 1, win 229, options [nop,nop,TS val 629141249 ecr 629140002], length 35
03:02:17.134235 In ethertype IPv4 (0x0800), length 68: 10.233.71.26.cfinger > 10.233.102.131.60272: Flags [.], ack 36, win 227, options [nop,nop,TS val 629140002 ecr 629141249], length 0
03:02:17.134258 Out ee:ee:ee:ee:ee:ee ethertype IPv4 (0x0800), length 68: 10.233.26.252.cfinger > 10.233.102.131.60272: Flags [.], ack 36, win 227, options [nop,nop,TS val 629140002 ecr 629141249], length 0
^C
10 packets captured
10 packets received by filter
I had to use the -q0 option instead of -v. The -q0 option makes sure that the netcat connection is closed after sending the data.
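The trace is consistent with that: the handshake completes and the 35-byte payload is acknowledged, so the data arrived and nc was simply waiting with the connection still open. For clients where nc is not available, the same idea (send the line, then close so the server sees end-of-stream) can be sketched in Python. The helper name send_graphite_line is made up; the host and port in the usage note are the ones from the question.

```python
import socket
import time


def send_graphite_line(host, port, metric, value, timestamp=None, timeout=5.0):
    """Send one Graphite plaintext line, then close the connection.

    Returns the number of bytes sent.
    """
    if timestamp is None:
        timestamp = int(time.time())
    payload = f"{metric} {value} {timestamp}\n".encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        # Half-close the write side so the peer sees EOF right away,
        # roughly what nc -q0 achieves once its input is exhausted.
        sock.shutdown(socket.SHUT_WR)
    return len(payload)
```

A call such as send_graphite_line("10.233.26.252", 2003, "local.random.diceroll", 4) would send the same kind of line as the echo command above and return as soon as the data is written.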

libpcap findalldevs not working in guest LDOM on Solaris 11

Environment
Oracle Solaris 11 for SPARC
Running in a Non-primary (Guest) Logical Domain (LDOM).
Logged in with root access.
Problem
My application uses libpcap to capture network traffic. When my application (myTestApp) calls libpcap's pcap_findalldevs(), it sees only one network interface ("lo0"), yet ifconfig -a shows many more interfaces.
My application is statically linked to libpcap (version 1.3). The build machine is SunOS RS-T5120-01 5.10 Generic_141444-09 sun4v sparc SUNW,SPARC-Enterprise-T5120.
Any ideas why my application can't see all the network interfaces?
Command line sample output
# tcpdump --version
tcpdump version 4.1.1
libpcap version 1.1.1
# uname -a
SunOS g99dnpi802-LD 5.11 11.1 sun4v sparc sun4v
# ./myTestApp -adapters
[Available Adapters]
name: "lo0", description: "", address: 127.0.0.1, mask: 255.0.0.0
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
net0: flags=100001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,PHYSRUNNING> mtu 1500 index 2
inet 10.99.220.15 netmask ffffff00 broadcast 10.99.220.255
ether 0:14:4f:fa:e0:8d
net1: flags=100001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,PHYSRUNNING> mtu 1500 index 3
inet 10.99.193.210 netmask ffffff80 broadcast 10.99.193.255
ether 0:14:4f:f9:d0:9c
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
inet6 ::1/128
net0: flags=120002000840<RUNNING,MULTICAST,IPv6,PHYSRUNNING> mtu 1500 index 2
inet6 ::/0
ether 0:14:4f:fa:e0:8d
net1: flags=120002000840<RUNNING,MULTICAST,IPv6,PHYSRUNNING> mtu 1500 index 3
inet6 ::/0
ether 0:14:4f:f9:d0:9c
# tcpdump -i net1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on net1, link-type EN10MB (Ethernet), capture size 65535 bytes
09:32:29.520815 IP g99dnpi802-LD.ssh > 10.99.8.102.65436: Flags [P.], seq 3397909586:3397909718, ack 1479093081, win 64240, length 132
09:32:29.520860 IP g99dnpi802-LD.ssh > 10.99.8.102.65436: Flags [P.], seq 132:232, ack 1, win 64240, length 100
09:32:29.521644 IP 10.99.8.102.65436 > g99dnpi802-LD.ssh: Flags [.], ack 132, win 16379, length 0
09:32:29.680844 00:14:4f:f9:8d:84 (oui Unknown) > Broadcast, ethertype Unknown (0xcafe), length 90:
0x0000: 0500 ad85 0939 ffff 0001 ffff 809c 7401 .....9........t.
0x0010: 0000 004c 0000 0000 8070 00ab 0000 0000 ...L.....p......
0x0020: 0000 0000 0000 0000 0043 ffff 2074 6167 .........C...tag
0x0030: 6d61 7374 0672 0014 4ff9 8d84 5f31 3362 mast.r..O..._13b
0x0040: 650a 0000 0000 0000 84f9 0aab e...........
[update]
Here is the (edited) output of running the following truss command on the build machine and the customer machine.
truss -f -a -vall -l -d -o truss.txt ./myTestApp -adapters
truss on build machine
14365/1: 0.0751 so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, "", SOV_DEFAULT) = 3
14365/1: 0.0753 so_socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP, "", SOV_DEFAULT) = 4
14365/1: 0.0755 ioctl(3, SIOCGLIFNUM, 0xFFBE9F50) = 0
14365/1: 0.0757 ioctl(3, SIOCGLIFCONF, 0xFFBE9F40) = 0
14365/1: 0.0804 ioctl(3, SIOCGLIFFLAGS, 0xFFBE9DC8) = 0
14365/1: 0.0806 ioctl(3, SIOCGLIFNETMASK, 0xFFBE9C50) = 0
14365/1: 0.0809 open64("/dev/lo", O_RDWR) Err#2 ENOENT
14365/1: 0.0811 open64("/dev/lo0", O_RDWR) Err#2 ENOENT
14365/1: 0.0813 ioctl(3, SIOCGLIFFLAGS, 0xFFBE9DC8) = 0
14365/1: 0.0815 ioctl(3, SIOCGLIFNETMASK, 0xFFBE9C50) = 0
14365/1: 0.0817 ioctl(3, SIOCGLIFBRDADDR, 0xFFBE9AD8) = 0
14365/1: 0.0819 open64("/dev/e1000g", O_RDWR) = 5
truss on customer machine
6346/1: 0.0315 so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, 0, SOV_DEFAULT) = 3
6346/1: 0.0319 so_socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP, 0, SOV_DEFAULT) = 5
6346/1: 0.0320 ioctl(3, SIOCGLIFNUM, 0xFFBEA830) = 0
6346/1: 0.0321 ioctl(3, SIOCGLIFCONF, 0xFFBEA820) = 0
6346/1: 0.0322 ioctl(3, SIOCGLIFFLAGS, 0xFFBEA6A8) = 0
6346/1: 0.0323 ioctl(3, SIOCGLIFNETMASK, 0xFFBEA530) = 0
6346/1: 0.0327 open64("/dev/lo", O_RDWR) Err#2 ENOENT
6346/1: 0.0328 open64("/dev/lo0", O_RDWR) = 6
6346/1: 0.0345 ioctl(3, SIOCGLIFFLAGS, 0xFFBEA6A8) = 0
6346/1: 0.0346 ioctl(3, SIOCGLIFNETMASK, 0xFFBEA530) = 0
6346/1: 0.0347 ioctl(3, SIOCGLIFBRDADDR, 0xFFBEA3B8) = 0
6346/1: 0.0347 open64("/dev/net", O_RDWR) Err#21 EISDIR
6346/1: 0.0349 ioctl(3, SIOCGLIFFLAGS, 0xFFBEA6A8) = 0
6346/1: 0.0349 ioctl(3, SIOCGLIFNETMASK, 0xFFBEA530) = 0
6346/1: 0.0350 ioctl(3, SIOCGLIFBRDADDR, 0xFFBEA3B8) = 0
6346/1: 0.0351 open64("/dev/net", O_RDWR) Err#21 EISDIR
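Reading the two traces side by side suggests a pattern, inferred only from the truss output and not from the libpcap source: for each interface the library first opens /dev/ plus the name with its unit number stripped (/dev/lo, /dev/e1000g, /dev/net), and it retries with the full name only when that first open fails with ENOENT. On the customer machine /dev/net exists but is a directory, so the open fails with EISDIR and net0 and net1 are skipped. The Python sketch below models that inferred sequence; probe_device and the open_result map are hypothetical stand-ins for the real open64 calls.

```python
import errno


def probe_device(name, open_result):
    """Model the device-open sequence inferred from the truss traces.

    open_result maps a path to None (open succeeds) or an errno value;
    paths that are not listed are treated as ENOENT.
    Returns (paths_tried, opened).
    """
    base = name.rstrip("0123456789")  # "net0" -> "net", "e1000g0" -> "e1000g"
    tried = []
    for path in (f"/dev/{base}", f"/dev/{name}"):
        tried.append(path)
        err = open_result.get(path, errno.ENOENT)
        if err is None:
            return tried, True
        if err != errno.ENOENT:
            # Any other error (such as EISDIR) ends the probe with no retry.
            return tried, False
    return tried, False


# Customer machine: /dev/lo0 opens, /dev/net is a directory.
customer = {"/dev/lo0": None, "/dev/net": errno.EISDIR}
print(probe_device("lo0", customer))
print(probe_device("net0", customer))
```

If this reading is right, the limitation comes from the libpcap 1.3 build made on Solaris 10 and linked statically into the application, since the system tcpdump (libpcap 1.1.1) on the same host can open net1. Rebuilding libpcap on Solaris 11, or linking against the system library, would be the first things to try.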

Apple push notifications stop working after a while [closed]

Closed 10 years ago.
FIXED!
It turned out it was caused by our firewall, which removes idle connections from its session list after 1 hour. We increased the timeout to 24 hours and set up an EventMachine periodic timer to re-establish the connection every 23 hours. That's a workaround until EventMachine 1.0.0 hits stable, which will allow us to set SO_KEEPALIVE and should resolve our issues.
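For readers unfamiliar with SO_KEEPALIVE, here is a minimal sketch of what enabling it looks like on a plain socket. It is written in Python rather than Ruby/EventMachine, so it illustrates the socket option only and is not the code we run. The helper name enable_keepalive and the timing values are assumptions; the idle time just needs to be comfortably below the firewall's idle timeout.

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, probes=5):
    """Turn on TCP keepalive so an idle connection still produces traffic.

    idle: seconds of silence before the first probe (assumed value).
    interval: seconds between probes; probes: failures before giving up.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below use Linux names and are not available on
    # every platform, so each one is applied only if Python exposes it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", probes),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With probes going out every few minutes, the firewall keeps seeing traffic on the session, and a connection that has silently died is reported to the application as an error instead of hanging in retransmission.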
The Problem was
We are using an EventMachine-based implementation to push messages to Apple's APNs. It basically works nicely until it doesn't... :) We have tried to debug this quite heavily and have now narrowed it down to something strange happening on the socket to Apple after a while.
Usually, until you send a notification to the APNs server, the socket is completely quiet.
If you send a notification, this is what tcpdump prints (sudo tcpdump -vv -i bond0 tcp port 2195):
18:05:23.672477 IP (tos 0x0, ttl 64, id 47828, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 3894:4331(437) ack 2724 win 91 <nop,nop,timestamp 893114182 2332608880>
18:05:23.776055 IP (tos 0x0, ttl 48, id 33720, offset 0, flags [DF], proto TCP (6), length 52) st11p01st-interface013-bz.push.apple.com.2195 > my-worker-hostname.50669: ., cksum 0x7844 (correct), 2724:2724(0) ack 4331 win 159 <nop,nop,timestamp 2332623235 893114182>
Nothing suspicious so far imho.
However, after a while (a while being a random amount of time) our worker process starts sending packets to the apple server every 1-2 minutes even though no push notification has been triggered by us:
17:55:06.009741 IP (tos 0x0, ttl 64, id 51807, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.54853 > st11p01st-interface002-bz.push.apple.com.2195: P 0:437(437) ack 1 win 91 <nop,nop,timestamp 892959766 2935299208>
17:56:25.881823 IP (tos 0x0, ttl 64, id 51808, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.54853 > st11p01st-interface002-bz.push.apple.com.2195: P 0:437(437) ack 1 win 91 <nop,nop,timestamp 892979734 2935299208>
17:58:25.877756 IP (tos 0x0, ttl 64, id 51809, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.54853 > st11p01st-interface002-bz.push.apple.com.2195: P 0:437(437) ack 1 win 91 <nop,nop,timestamp 893009734 2935299208>
17:59:12.030887 IP (tos 0x0, ttl 64, id 20781, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.59335 > st11p01st-interface013-bz.push.apple.com.2195: P 3749093679:3749094116(437) ack 4206642630 win 91 <nop,nop,timestamp 893021272 2330366860>
17:59:12.345740 IP (tos 0x0, ttl 64, id 20782, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.59335 > st11p01st-interface013-bz.push.apple.com.2195: P 0:437(437) ack 1 win 91 <nop,nop,timestamp 893021351 2330366860>
17:59:12.977805 IP (tos 0x0, ttl 64, id 20783, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.59335 > st11p01st-interface013-bz.push.apple.com.2195: P 0:437(437) ack 1 win 91 <nop,nop,timestamp 893021509 2330366860>
As soon as this starts, the push notifications don't work anymore until we restart the worker.
I'm running out of ideas here...
UPDATE:
After waiting a while longer without sending notifications, with nothing happening on the socket at all, I initiated another push. This caused the described behavior once again:
19:10:44.951026 IP (tos 0x0, ttl 64, id 47829, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894094502 2332623235>
19:10:45.361786 IP (tos 0x0, ttl 64, id 47830, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894094605 2332623235>
19:10:46.185822 IP (tos 0x0, ttl 64, id 47831, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894094811 2332623235>
19:10:47.837788 IP (tos 0x0, ttl 64, id 47832, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894095223 2332623235>
19:10:51.133744 IP (tos 0x0, ttl 64, id 47833, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894096048 2332623235>
19:10:57.725824 IP (tos 0x0, ttl 64, id 47834, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894097696 2332623235>
19:11:10.913826 IP (tos 0x0, ttl 64, id 47835, offset 0, flags [DF], proto TCP (6), length 489) my-worker-hostname.50669 > st11p01st-interface013-bz.push.apple.com.2195: P 4331:4768(437) ack 2724 win 91 <nop,nop,timestamp 894100992 2332623235>