Is there a maintained alternative for libnids? - pcap

As libnids seems to be two years old and there are no current updates, does anyone know of an alternative to libnids, or a better library? It seems to drop packets at rates above 1 Gbit/s.
Moreover, it has no support for IPv6 addresses.

An alternative to libnids is Bro. It comes with a robust TCP reassembler that has been thoroughly tested and used by the network security monitoring community over the years. It ships with a number of protocol analyzers for common protocols, such as HTTP, DNS, FTP, SMTP, and SSL.
Bro is "the Python of network processing": it has its own domain-specific scripting language with first-class types and functions for IP addresses (both v4 and v6), subnets, and ports. The programming style has an asynchronous, event-based flavor: users write callback functions for events that reflect network activity. The analysis operates at connection granularity. Here is an example (connection_established fires when a TCP handshake completes):
event connection_established(c: connection)
    {
    if ( c$id$orig_h == 1.2.3.4 && c$id$resp_p == 31337/tcp )
        # IP 1.2.3.4 successfully connected to a remote host on port 31337.
        print fmt("%s connected to %s on port %s", c$id$orig_h, c$id$resp_h, c$id$resp_p);
    }
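For reference, such a script can be run against a live interface or a recorded trace, e.g. bro -i eth0 example.bro or bro -r trace.pcap example.bro ( the script file name here is assumed ).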
Moreover, Bro supports a cluster mode that allows for line-rate monitoring of 10 Gbps links. Because most analyses do not require sharing of inter-connection state, Bro scales very well across cores (using PF_RING) as well as multiple nodes. There exist Bro installations with >= 140 nodes. A typical deployment looks as follows:
[Diagram: a typical Bro cluster deployment (source: bro.org)]
Due to this high scalability, there is typically no longer any need to grapple with low-level details and fine-tune C implementations. Put differently, with Bro you spend your time working on the analysis, not the implementation.

Having multiple sockets for same Context() and same port in ZMQ

My current system takes an input stream from each camera, each camera in a separate instance, applies Computer Vision models to each stream (Object Detection, Object Tracking and Personnel Recognition), and then passes the results to a sink/master process that performs the rest of the functionality over those results. I'm using ZMQ for the inter-process communication.
What I have implemented now is that each worker connects to a different port, and the sink subscribes to these ports independently. But this solution is not scalable, as we might have 3 or 4 cameras/workers, and I felt it would not be efficient to keep opening ports like that.
[Diagram: multiple-ports implementation]
That's when I tried to implement a Multi-Pub/Single-Sub module, where all workers connect to one port and the sink subscribes to that port only.
[Diagram: single-port implementation]
The problem I faced is that I can no longer distinguish between the different cameras, since I receive footage from different cameras on the same port, which causes a problem when streaming them later. That's why I'm thinking about the possibility of having multiple sockets per context, with each socket subscribing to a different IP. Is that possible?
Note: I've seen this answer, but it uses different ports for different sockets, which does not really serve my case.
Q : " ... I no longer can distinguish between different cameras ... "
A : Yet there are ZeroMQ tools to do so - check the details of :
.setsockopt( zmq.METADATA, "X-key:value" )
.setsockopt( zmq.ROUTING_ID, Id )
As you can see, the PUB/SUB archetype is the worst one to use here ( you pay all the costs of the TOPIC-filter based subscription management, yet receive nothing in return ).
Using better-matching archetypes is the way to go - see the sketch right below.
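For illustration only - a minimal pyzmq sketch, not your code: with a PUSH/PULL pair, every worker connects to the single sink port and prepends its own camera ID as the first frame of a multipart message, so the sink can always tell which camera a frame came from ( the names CAMERA_ID, sink-host and port 5555 are assumptions ) :
# ---- worker process (one per camera, all sharing the sink's single port) ----
import zmq

CAMERA_ID = b"cam-01"                        # assumed per-worker identifier

push = zmq.Context().socket( zmq.PUSH )
push.connect( "tcp://sink-host:5555" )       # every worker connects here

frame = b"...encoded video frame..."         # placeholder payload
push.send_multipart( [CAMERA_ID, frame] )    # frame 1: camera ID, frame 2: payload

# ---- sink process (a single PULL socket on one port receives from all) ----
import zmq

pull = zmq.Context().socket( zmq.PULL )
pull.bind( "tcp://*:5555" )

while True:
    camera_id, payload = pull.recv_multipart()
    # route payload into the per-camera pipeline, keyed by camera_id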
Given that no performance details were posted, the capacity may soon get over-saturated, so you may take more specific steps to flatten the workload and protect the smooth flow of the service :
.setsockopt( zmq.TOS, aTransportPath_TOS )
.setsockopt( zmq.MAXMSGSIZE, aBLOB_limit_to_save_RAM )
Given that streaming could block on many "old" frames that have not made it through the e2e-pipeline in due time, it might make sense to also set this :
.setsockopt( zmq.CONFLATE, 1 )
As you can see, there are many smart details in the configuration space of ZeroMQ. Moreover, once the scaling grows larger and larger, your design shall also fine-tune the Context()-engine performance upon instantiation :
.Context( aNumOfContextIOthreads2use )
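A hedged pyzmq sketch putting these knobs together ( the I/O-thread count and the BLOB size limit are illustrative values, not recommendations; note that options must be set before connect() ) :
import zmq

ctx  = zmq.Context( io_threads = 4 )             # more I/O threads for heavier traffic
sock = ctx.socket( zmq.SUB )
sock.setsockopt( zmq.MAXMSGSIZE, 10_000_000 )    # refuse BLOBs over ~10 MB, saving RAM
sock.setsockopt( zmq.CONFLATE, 1 )               # keep only the newest message per peer
sock.setsockopt( zmq.SUBSCRIBE, b"" )            # subscribe to everything
sock.connect( "tcp://publisher-host:5556" )      # assumed publisher address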

Bidirectional REQ/REP on a single port with ZeroMQ

I'm struggling with the following problem:
I'd like to make bidirectional asynchronous requests/replies between N clients and 1 server with ZeroMQ.
This means that any client can make a request to the server and the server must reply to the client.
In the other way, the server must be able to make a request to any of the identified clients and the client to reply to the server.
I think I must use routers/dealers, but I'm not sure whether I need them in both directions.
Moreover, is there a way to have this whole paradigm using only one port on the server side and on each client side?
Q: is there a way to have this using only one port on the server / client side?
Well, this is the simpler part. No, this is not achievable. Having just one telephone box in the college will ring in the hallway, but will never help to address a call to the required department ( it would not correctly reach the intended professor of Quantum Mechanics instead of the one in Fine Arts ).
Using but one port means having a chance to expose for public access one and only one ZeroMQ Scalable Formal Communication Pattern Archetype AccessPoint, and ( except for the very specific PAIR/PAIR distributed-behaviour Archetype ) there is always some kind of hard-wired distributed-behaviour of the interconnected agents' AccessPoint-s.
This means that using one port gives just one and only one kind of such distributed-behaviour, not a mix of them.
This also answers your first part. If a REQ/REP distributed-behaviour Archetype is used in the direction from the client-nodes towards the server, and another REQ/REP distributed-behaviour Archetype is to be used in the opposite direction, from the server towards the client-nodes, these ( directed ) services cannot co-exist on the same address:port.
BONUS PART : A kind of life-saving, but a bit dirty, trick ( not to be used for Medical and/or Emergency Systems ) : one may sort of supersample one and only one of the REQ/REP messaging directions and add a tricky "quasi-protocol", so as to make this same channel serve both of the intended signalling directions. If sending sufficiently many protocol messages from one side, be it { client | server }, the REQ/REP message initiator will simply send NOP-messages often enough to permit the REP-side replying party to "quasi-initiate" its "quasi-REQ-message" while still being the authentic REP-AccessPoint in the single REQ/REP distributed-behaviour Archetype.
Yes, performance and resource use are a cost of doing this, but careful soft-real-time system design practices will help to make this work, if your needs are extremely dependent on using but one port and if your consciousness and your deployment ecosystem can tolerate the increased traffic patterns of such supersampled REQ/REP data-flows.
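A hedged sketch of that "quasi-protocol" ( the NOP / SRV_REQ tags, the port and the have_question_for_client() hook are all invented for illustration ) : the client keeps the channel saturated with polls, and the server smuggles its own requests into its replies :
# ---- server process: an authentic REP that "quasi-initiates" via replies ----
import zmq

def have_question_for_client():
    return False                                   # hypothetical trigger logic

rep = zmq.Context().socket( zmq.REP )
rep.bind( "tcp://*:5555" )

while True:
    tag, body = rep.recv_multipart()
    if have_question_for_client():
        rep.send_multipart( [b"SRV_REQ", b"...server's question..."] )
    else:
        rep.send_multipart( [b"NOP", b""] )        # nothing to say this round

# ---- client process: a supersampled REQ that polls often enough ----
import time, zmq

req = zmq.Context().socket( zmq.REQ )
req.connect( "tcp://server:5555" )

while True:
    req.send_multipart( [b"NOP", b""] )            # frequent no-op poll
    tag, body = req.recv_multipart()
    if tag == b"SRV_REQ":
        pass                                       # answer it in the next poll
    time.sleep( 0.01 )                             # poll rate bounds srv->cli latency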
You might also like posts here on Stack Overflow about the unavoidable mutual deadlocks that distributed REQ/REP FSA-s will fall into.
ZeroMQ hierarchy explained in less than five seconds
UN-AVOIDABLE DEADLOCKS

How to effectively establish point to point channel using ZeroMQ?

I'm having trouble establishing an asynchronous point-to-point channel using ZeroMQ.
My approach to building a point-to-point channel was to create one ZMQ_PAIR socket per peer in the network. Because a ZMQ_PAIR socket ensures an exclusive connection between two peers, a node needs as many sockets as it has peers. My first attempt is realized in the following diagram, which represents the pairing connections between two peers.
But the problem with the above approach is that each pairing socket needs a distinct bind address. For example, if four peers are in the network, then each peer needs at least three ( TCP ) addresses to bind for the rest of the peers, which is very unrealistic and inefficient.
( I assume that each peer has exactly one unique address among the others, e.g. tcp://*:5555 )
It seems that there is no way other than using different patterns, which involve some set of message brokers, such as XREQ/XREP ( DEALER/ROUTER ).
( I intentionally avoid a broker-based approach, because my application will heavily exchange messages between peers, which would often result in a performance bottleneck at the broker processes. )
But I wonder if there is anybody who uses ZMQ_PAIR sockets to efficiently build point-to-point channels? Or is there a way to avoid having distinct host IP addresses for multiple ZMQ_PAIR sockets to bind to?
Q : How to effectively establish ... ?
Well, given the above narrative, the story of "How to effectively ..." ( where a metric of what and how actually measures the desired effectivity may get some further clarification later ) turns into another question - "Can we re-factor the ZeroMQ Signalling / Messaging infrastructure, so as to work without using as many IP-address:port#-s as a tcp://-transport-class based topology would actually need?"
Upon the explicitly expressed limit of having no more than just one IP:PORT# per host/node ( this being the architecture's / design's most expensive resource ), one will have to overcome a lot of trouble on such a way forward.
It is fair to note that any such attempt will come at an extra cost to be paid. There is no magic wand to "bypass" such a principal limit as expressed above. So get ready to indeed pay the costs.
It reminds me of a Project in TELCO, where a distributed system was operated in a similar manner, with a similar original motivation. Each node had an ssh/sshd service set up, where local port-forwarding made it possible to expose just one publicly accessible IP:PORT# access-point, while all the rest was implemented "inside" a mesh of topological links going through ssh-tunnels - not just for the encryption service, but for the comfort of maintaining all the local-port-forwardings towards specific remote ports, as a means of setting up and operating such exclusive peer-to-peer links between all the service-nodes, yet having just a single public-access IP:PORT# per node.
If no other approach seems feasible, the above mentioned "stone-age" ssh/sshd port-forwarding, with ZeroMQ running against such local ports only, may save you ( a sketch follows below ). PUB/SUB gets evicted either for the traffic actually flowing to each terminal node ( in older ZeroMQ/API versions, where Topic-filtering is processed only on the SUB-side - something neither the security nor the network Departments will like to support ), or for the concentrated workloads and immense resource needs on the PUB-side ( in newer ZeroMQ/API versions, where the Topic-filter is processed on the sender's side ). Addressing, dynamic network peer (re-)discovery, maintenance, resource planning, fault resilience, ... - no, there is no easy shortcut anywhere near, ready to just grab and (re-)use.
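A hedged sketch of that ssh-tunnel approach ( host names and port numbers are invented ) : each peer forwards a distinct local port through the single publicly exposed sshd port, and ZeroMQ only ever talks to 127.0.0.1 :
# tunnel created out-of-band, e.g.:
#     ssh -N -L 6001:127.0.0.1:6001 peer-b.example.com
# peer B binds its PAIR socket on its local port 6001; peer A reaches it
# through the forwarded port, so only sshd's port is publicly exposed
import zmq

pair = zmq.Context().socket( zmq.PAIR )
pair.connect( "tcp://127.0.0.1:6001" )     # the local end of the ssh-tunnel
pair.send( b"ping" )
print( pair.recv() )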
Anyway - Good Luck on the hunt!

kernel-based (Linux) data relay between two TCP sockets

I wrote a TCP relay server which works like a peer-to-peer router (supernode).
The simplest case is two open sockets and a data relay between them:
clientA <---> server <---> clientB
However, the server has to serve about 2000 such A-B pairs, i.e. 4000 sockets...
There are two well-known data stream relay implementations in userland (based on socketA.recv() --> socketB.send() and socketB.recv() --> socketA.send()):
using the select / poll functions (non-blocking method)
using threads / forks (blocking method)
I used threads, so in the worst case the server creates 2*2000 threads! I had to limit the stack size and it works, but is it the right solution?
Core of my question:
Is there a way to avoid actively relaying data between two sockets in userland?
It seems there may be a passive way. For example, I could create a file descriptor from each socket, create two pipes, and use dup2() - the same method as for stdin/stdout redirecting. Then the two threads would be useless for the data relay and could be finished/closed.
The question is whether the server should ever close the sockets and pipes, and how to know when a pipe is broken, so as to log the fact?
I've also found "socket pairs" but I am not sure about them for my purpose.
What solution would you advise to off-load the userland and limit the number of threads?
Some extra explanations:
The server has a defined static routing table (e.g. ID_A with ID_B - paired identifiers). Client A connects to the server and sends ID_A. Then the server waits for client B. When A and B are paired (both sockets open), the server starts the data relay.
Clients are simple devices behind symmetric NAT, therefore N2N protocols or NAT traversal techniques are too complex for them.
Thanks to Gerhard Rieger I have the hint:
I am aware of two kernel-space ways to avoid read/write, recv/send in user space:
sendfile
splice
Both have restrictions regarding the type of file descriptor.
dup2 will not help to do anything in the kernel, AFAIK.
Man pages: splice(2) vmsplice(2) sendfile(2) tee(2)
Related links:
Understanding sendfile() and splice()
http://blog.superpat.com/2010/06/01/zero-copy-in-linux-with-sendfile-and-splice/
http://yarchive.net/comp/linux/splice.html (Linus)
C, sendfile() and send() difference?
bridging between two file descriptors
Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)
http://ogris.de/howtos/splice.html
OpenBSD implements SO_SPLICE:
relayd asiabsdcon2013 slides / paper
http://www.manualpages.de/OpenBSD/OpenBSD-5.0/man2/setsockopt.2.html
http://metacpan.org/pod/BSD::Socket::Splice
Does Linux support something similar, or is writing one's own kernel module the only solution?
TCPSP
SP-MOD described here
TCP-Splicer described here
L4/L7 switch
HAProxy
Even for loads as tiny as 2000 concurrent connections, I'd never go with threads. They have the highest stack and switching overhead, simply because it's always more expensive to ensure that you can be interrupted anywhere than when you can only be interrupted at specific places. Just use epoll() and splice() (if your sockets are TCP, which seems to be the case) and you'll be fine. You can even make epoll work in edge-triggered mode, where you only register your fds once.
If you absolutely want to use threads, use one thread per CPU core to spread the load, but if you need to do this, it means you're playing at speeds where affinity, RAM locality on each CPU socket, etc. play a significant role, which doesn't seem to be the case in your question. So I'm assuming that a single thread is more than enough in your case.
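A minimal sketch of the splice() idea in Python ( 3.10+, Linux only; sock_a and sock_b stand for two already-connected TCP sockets and are assumptions ) : data moves socket -> pipe -> socket entirely inside the kernel, never copied into a user-space buffer :
import os

PIPE_BUF = 65536

def relay_once(src_fd, dst_fd, pipe_r, pipe_w):
    # move up to PIPE_BUF bytes from src_fd to dst_fd via an in-kernel pipe
    n = os.splice(src_fd, pipe_w, PIPE_BUF)            # socket -> pipe, in kernel
    if n == 0:
        return 0                                       # peer closed the connection
    moved = 0
    while moved < n:
        moved += os.splice(pipe_r, dst_fd, n - moved)  # pipe -> socket
    return n

# usage sketch - one pipe per direction, driven by an epoll readiness loop:
#     pipe_r, pipe_w = os.pipe()
#     whenever epoll reports sock_a readable:
#         relay_once(sock_a.fileno(), sock_b.fileno(), pipe_r, pipe_w)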

What is the fastest (lowest latency) messaging queue solution for sending a message from host A to host B?

OK folks, NOT counting ethernet speed (InfiniBand), kernel bypass or any other fancy stuff, just plain TCP/IP (TCP/UDP over Ethernet) networking: what is the fastest messaging queue implementation that can deliver a message from host A to host B?
Let's assume 10 Gigabit Ethernet cards connecting both machines, with up-to-date architectures and CPUs. What latency in microseconds are we talking about here for a 1472-byte message (MTU minus IP/UDP headers)?
As @Sachin described very well, what I am looking for is the messaging queue, and the latency number to send a message from A to B, like below:
Host A <-------TCP-------> Messaging queue (process, route, etc) <-------TCP-------> Host B
If you do not require a broker in between, 0MQ gave us the best performance (you will need to test the numbers on your platform / use case). If using a broker in between, both ActiveMQ and RabbitMQ performed in the same range. Using Redis as a messaging server did not hold up for us.
If not using a messaging server, options such as Netty, JGroups etc. might be useful (not sure about your programming language).
You could look into reliable UDP as well if going with straight socket connectivity.
Hope it helps.
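A hedged pyzmq sketch for measuring that number yourself ( the host name, port and message count are invented; run it with the "echo" argument on host B and without arguments on host A ) :
import sys, time, zmq

ctx = zmq.Context()
if sys.argv[1:] == ["echo"]:                  # host B:  python rtt.py echo
    rep = ctx.socket( zmq.REP )
    rep.bind( "tcp://*:5555" )
    while True:
        rep.send( rep.recv() )                # bounce every message back
else:                                         # host A:  python rtt.py
    req = ctx.socket( zmq.REQ )
    req.connect( "tcp://hostB:5555" )         # assumed address of host B
    msg = b"x" * 1472                         # the 1472-byte payload in question
    N = 10000
    t0 = time.perf_counter()
    for _ in range(N):
        req.send( msg )
        req.recv()
    dt = time.perf_counter() - t0
    print( "avg round-trip: %.1f us" % (dt / N * 1e6) )  # halve for one-way estimate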
The lower bound would be at least 2 TCP connections plus the routing time inside the messaging queue server (meaning the delays associated with these):
Host A <-------TCP-------> Messaging queue (process, route, etc) <-------TCP-------> Host B
Of course, if you build in redundancy, fault tolerance etc., then you are certainly going to be way above this lower bound.
It looks like you are talking about a UDP-based MQ, because you mentioned the MTU. Well, for UDP-based MQs this time is usually measured as the time required to publish a message and see it back on the message bus. So it is a round-trip time, not the one-way time you described. This can usually be done in less than 6 microseconds, depending of course on your choice of LAN.