Apache NiFi - Listen UDP/TCP ranges - range

I'm trying to configure the ListenUDP or ListenTCP processors to get input from multiple, but very specific IP's. I'm trying to find out if IP ranges can be used instead of a single IP, this way all my palo altos will go to one processor, and so on.

ListenTCP & ListenUDP do not filter incoming traffic.
You can choose which network interface is used to listen for incoming traffic by setting Local Network Interface - so you can apply normal filtering techniques such as iptables or a firewall to filter what traffic is allowed to reach that network interface.
In theory, recieved messages do write an attribute tcp.sender or udp.sender to each FlowFile. So you could technically filter AFTER the ListenTCP by comparing the value of this Attribute and dropping messages that are not valid...but this is a lot less efficient than filtering network traffic outside of NiFi.

Related

How to enforce single threaded requests?

I've started rate limiting my API using HAProxy, but my biggest problem is not so much the rate of requests, but when multi-threaded requests overlap.
Even within my legal per-second limits, big problems are occurring when clients don't wait for a response before issuing another request.
Is it possible (say, per IP address) to queue requests and pass them one at at time to the back end for sequential processing?
Here is a possible solution to enforce one connection at a time per src IP.
You need to put the following HAProxy conf in the corresponding frontend:
frontend fe_main
mode http
stick-table type ip size 1m store conn_cur
tcp-request connection track-sc0 src
tcp-request connection reject if { src_conn_cur gt 1 }
This will create a stick table that stores concurrent connection counts per source IP. Then rejects new connections if there is already one established from the same src IP.
Browsers imitating multiple connections to your API or clients behind a NAT will not be able to efficiently use you API.

How to Enable Timestamp Option in IP Header

I am designing an application layer protocol on top of UDP. One of requirements is that the receiving side should keep only the most up to date datagram.
Therefore, if datagram A was sent and then datagram B was sent, but datagram B was received first, datagram A should be discarded by the application when received.
One way to implement this is a counter stored in the data part of the UDP packet. The counter is incremented each time a datagram is sent.
I also noticed that IP options contain a timestamp option which looks suitable for this task.
My questions are (in the context of BSD-like sockets):
How do I enable this option on the sending side?
How do I read this field on the receiving side?
You can set IP options using setsockopt() using option level IPPROTO_IP and specifying the name of the option. See Unix/Linux IP documentation, for example see here. Reading IP header options generally requires using a RAW socket which in turn usually requires root permissions. It's not advisable to (try to) use IP options because it may not always be supported since it's very rarely used (either at the origination system or at systems it passes).

how can firewall/iptables check incoming tcp traffic of already bound ports?

As far as i know only one process can be bound to a port of the same protocol, and in order to read incoming information to a port a socket must be bound to a that relevant port.
is there a way of sharing a socket with another process or something like that?
is there a way of sharing a socket with another process or something like that?
Sharing a socket and thus the port between two processes is possible (like after fork) but this is probably not what you want for data analysis, since if one process reads the data the other does not get them anymore.
how can firewall/iptables check incoming tcp traffic of already bound ports?
Packet filter like iptables work inside the kernel and get the data before they gets send to the socket. It does not even matter if there is socket bound to this specific port at all. Unless the packet filter denies the data they get forwarded unchanged to the socket (if there is any).
Passive IDS like snort or tools like tcpdump get the raw packets and here it also does not matter if there is a socket at all. They can only read the packets, i.e. not modify or block.
Application level firewalls or (reverse) proxies have their own socket and receive the data there (directly or redirected by the packet filter). They can then analyse the data and will explicitly forward the data (maybe after modification) to the original application.

At the level of IP, does "leave the connection open" have a specific technical meaning - such as intermediate gateways storing an IP map entry?

I am an experienced socket-level programmer in C++, but I do not understand what happens at the IP network level when a socket connection is left open (vs. being closed by calling the close function on the socket from within code).
I have studied the IP header and tried to understand if leaving a socket open has any implications at the IP level.
At the TCP level, leaving a socket open could make sense to me, because perhaps that means the "sequence number" field in the TCP header continues to increment. However, that would be a purely endpoint-based implementation, and therefore could not cut down on transit time for TCP packets. It is my understanding that leaving a connection open generally means that transit time between endpoints across the internet is decreased for packets.
The question is, does it mean anything at the IP level to leave a socket connection open?
The best guess I have is that if a socket connection remains open, that intervening gateways along the complete IP network path will attempt to leave an entry in their mapping table so that the next hop can be executed immediately, without needing to do a broadcast to all connected gateways in order to determine the next hop.
(Perhaps the overhead of DNS lookup is also avoided in this fashion.)
Am I correct in guessing that "leaving a connection open" corresponds to map entries remaining in place on intermediate IP gateways (which speeds up packet transfer)?
Direct answer: No.
Your question suggests that you don't fully understand the purpose of TCP, which is to establish a data stream between two hosts. Keeping that in mind, the purpose of leaving a connection open should be obvious: if you close the connection, the stream will end.
The status of a TCP connection is not visible on the IP level; it's only of relevance to TCP. With the exception of NAT gateways, intermediate hosts do not generally keep track of the status of TCP connections passing through them. (In many cases, it'd be impossible for them to do so -- large routers have far more connections running through them than they could possibly track.)
The best guess I have is that if a socket connection remains open, that intervening gateways along the complete IP network path will attempt to leave an entry in their mapping table so that the next hop can be executed immediately, without needing to do a broadcast to all connected gateways in order to determine the next hop.
This guess is incorrect. A router will have some sort of algorithm for picking a route based on the destination IP, based on a set of routing tables it keeps internally. Read up on BGP for details on how this is determined on large routers; on smaller routers, the routing table is typically defined by the administrator.
First of all, let's clear up a misconception:
that intervening gateways along the complete IP network path will attempt to leave an entry in their mapping table so that the next hop can be executed immediately, without needing to do a broadcast to all connected gateways in order to determine the next hop.
Routers never "broadcast to all connected gateways" in order to determine the next hop. If a packet arrives and the router does not already know how to route it, the packet is simply dropped (possibly with an ICMP error message being sent back to the source). The job of the routing protocols that run on routers is to prepopulate the router's routing table with routes learned from peers so that they are then prepared to receive packets and route them.
Also, "the complete IP network path" is not well-defined. The network path can change at any time as links fail on the network or new links become available. It can even change from one packet to the next in the absence of routing changes due to load balancing.
Back to your question: no, whether or not a socket is closed has no impact on IP. IP is stateless in the sense that every packet is self-contained and routed independently.
Whether or not a socket is closed does make a difference to TCP, but, as you note, that concerns only the two nodes at the endpoints of the connection.
The impact of "leaving a connection open" on speed, such that it is, is that establishing a connection in TCP requires a round-trip. But more to the point, a connection also has semantic meaning to most protocols running on TCP. Two bits of data sent on the same connection are related in a way that two bits of data sent on different connections are not.

Ganglia - security when polling metrics over TCP (xml format) from nodes

Context: I am a student and I am trying to prepare a proof of concept for quick network-monitoring.
our imaginary context is that we have multiple clusters which are on different subnets. I have read numerous documentations regarding ganglia and what I really want to find out is during node polling, assuming that gmetad is on a different subnet as the node as well, is there any security measure that is utilised to protect sending the XML data over TCP.
It's not entirely clear whether you mean to ask about TCP or UDP transport here, but I assume TCP since that's how gmetad-gmetad and gmetad-gmond communication is done.
The only security measures are the trusted_hosts configuration attribute for gmetad and the access control lists that can be specified for gmond's tcp_accept_channel configuration.
You could perhaps consider a secure tunneled route between the hosts if you're looking to avoid eavesdropping?