socket programming: How do I handle out of band data - sockets

I just looked into Wikipedia's entry on out-of-band data and as far as I understand, OOB data is somehow flagged as more important and treated as ordinary data, but transmitted in a separate stream, which profoundly confuses me.
The actual question would be (besides "Could someone explain what OOB data is?"):
I'm writing a unix application that uses sockets and need to make use of select() and was wondering what to do with the exceptfds parameter? Do I need to put all my sockets into this parameter and react to such events? Or do I just ignore them?

I know you've decided you don't need to handle OOB data, but here are some things to keep in mind if you ever do care about OOB...
TCP doesn't really send OOB data on a separate channel, or at a different priority. It is just a flag (URG) and an urgent pointer in the header of an ordinary segment.
OOB data is extremely limited -- 1 byte!
OOB data can be received either inline or separately depending on socket options
An "exception" signaling OOB data may occur even if the next read doesn't contain the OOB data (the network stack on the sender may flag any already queued data, so the other side will know there's OOB ASAP). This is often handled by entering a "drain" loop where you discard data until the actual OOB data is available.
If this seems a bit confusing and worthless, that's because it mostly is. There are good reasons to use OOB, but it's rare. One example is FTP, where the user may be in the middle of a large transfer but decide to abort. The abort is sent as OOB data. At that point the server and client just eat any further "normal" data to drain anything that's still in transit. If the abort were handled inline with the data then all the outstanding traffic would have to be processed, only to be dumped.
It's good to be aware that OOB exists and the basics of how it works, just in case you ever do need it. But don't bother learning it inside-out unless you're just curious. Chances are decent you may never use it.
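To make the points above concrete, here is a minimal sketch of sending one urgent byte over a loopback TCP connection and picking it up through the exceptfds argument of select(). The function name oob_demo and the payloads are made up for illustration, and the behaviour shown reflects what Linux does; other stacks may treat urgent data differently.

```python
import select
import socket


def oob_demo():
    """Send ordinary data plus one urgent byte; return (urgent, normal)."""
    # OOB data here means TCP urgent data, so a TCP socket pair is used.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        sender.sendall(b"hello")            # ordinary in-band data
        sender.send(b"!", socket.MSG_OOB)   # a single urgent byte

        # The third list is select()'s exceptfds: it reports pending OOB data.
        _, _, exceptional = select.select([], [], [receiver], 5.0)
        urgent = receiver.recv(1, socket.MSG_OOB) if exceptional else b""

        # A plain recv() never returns the urgent byte unless SO_OOBINLINE is set.
        normal = receiver.recv(1024)
        return urgent, normal
    finally:
        sender.close()
        receiver.close()
        listener.close()
```

If the receiver simply never passes the socket in exceptfds and never calls recv() with MSG_OOB, the urgent byte is not delivered by ordinary reads, which is why ignoring exceptfds is safe when you do not care about OOB.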

I think I found the answer on this page. In short:
I don't need to handle OOB data on the receiving side if I'm not sending any OOB data. I had thought that OOB data could be generated by the OS of the sender.

You don't need to handle it at the receiving end even if you are sending it - OOB data is transparently ignored in all circumstances unless you actively go about receiving it.

Related

CAN Communication: Good Practices

I am preparing to write some code for a master controller that communicates (via CANbus) with multiple nodes in a product. Each node monitors its own sensors (i.e. voltages, currents, fault flags, etc.) and can be started/stopped by the master controller. The master controller also sends the data to a display.
I am using an STM32H7B3I-EVAL board and using the STM32CubeIDE environment to write the code. I am trying to determine some good practices for writing this code, but I am inexperienced in CAN communication. I wanted to get people's opinions on the following high-level questions:
If we want to be constantly monitoring, should all the code for transmitting and receiving data be in a never-ending while loop?
Is it better to transmit all data then receive all data, or transmit data when needed and have an interrupt for received messages?
What are the pros/cons in using an RXBUFFER vs RXFIFO?
First of all, you need to invent an application-layer CAN protocol unless you have one already. This isn't entirely trivial and requires some experience of CAN. Here you first of all need to take bus load into account, which in turn depends on the number of nodes and the amount of data allowed, as well as the baudrate. How to design this also depends on whether it's a control system (hard realtime, milliseconds) or just some industrial sensor network (hundreds of milliseconds or seconds).
If we want to be constantly monitoring, should all the code for transmitting and receiving data be in a never-ending while loop?
Probably not. Regarding RX, depending on what CAN controller you have, there will at least be some manner of RX FIFO. Modern controllers also support dedicated "mailbox" slots for individual messages, which is more powerful and easier to work with. Your only requirement for never losing data is that you empty the FIFO at least as often as FIFO size times the time it takes to send the minimum frame size (DLC=0). Unless your program is very busy, this is usually not a tough realtime deadline to meet.
Regarding TX, again it depends on the controller, but here it is usually sufficient to check that the previously sent message has been sent before attempting a new one. And unless you are really competing hard for bus access during a time of heavy bus load, this shouldn't be happening. Sensible CAN application protocols have some simple scheduling requirements such as "this gets sent after x ms in relation to that". Re-sending messages lost due to errors is handled by the controller hardware.
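As a rough worked example of that RX deadline: a classical CAN data frame with an 11-bit identifier and DLC=0 is 44 bits long before bit stuffing, plus a 3-bit intermission before the next frame can start. The sketch below just multiplies that out. The FIFO depth of 3 and the 500 kbit/s bitrate in the usage note are assumed values, not figures for any particular controller, and the helper name is made up.

```python
# Shortest classical CAN data frame (11-bit ID, DLC=0), ignoring stuff bits:
# 44 frame bits plus the 3-bit intermission that separates frames.
MIN_FRAME_BITS = 44 + 3


def fifo_drain_deadline_s(fifo_depth, bitrate_bps, frame_bits=MIN_FRAME_BITS):
    """Worst-case time (seconds) for back-to-back minimum frames to fill the FIFO."""
    return fifo_depth * frame_bits / bitrate_bps
```

With an assumed 3-deep FIFO at 500 kbit/s, fifo_drain_deadline_s(3, 500_000) gives 282 microseconds. Stuff bits and real payloads only make frames longer, so this is the tightest case.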
Is it better to transmit all data then receive all data, or transmit data when needed and have an interrupt for received messages?
TX and RX buffers work independently of each other. Also what you are saying doesn't really make sense, since CAN is half-duplex and one node's TX is another node's RX.
What are the pros/cons in using an RXBUFFER vs RXFIFO?
Those terms are pretty much synonymous. I suppose they may have some special meaning given a specific CAN controller, but you don't mention one (STM32 have several, one old and really bad "bxCAN" and one newer which I don't know much about. And some stubbornly insist on the horrible solution of using external controllers, particularly the Arduino kids).
Anyway, it is better to have neither; using a CAN controller with mailboxes is the best option. The exception is when the number of expected identifiers is greater than the number of mailbox slots you have. In that case you have to direct low priority messages to an RX FIFO and use mailbox slots for high priority messages.

Architecture diagram involving the flow of data between trading engine, order routing engine,quickfix and the exchange

If I write an order routing system based on QuickfixJ, can I just start submitting my trades to an exchange? Or do I need to register myself with the exchange or get permission or something like that?
I am not able to understand how QuickfixJ, the order routing system, the actual trading engine and the exchange fit together. Any online architecture diagram would be very helpful for how these components fit together.
FIX is just a transmission protocol. By itself, it's pretty dumb. QuickFIX (any language port) is just an engine that does all the boring dirty work of managing a FIX connection.
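To give a feel for how simple the wire format is, here is a minimal sketch of the framing that an engine such as QuickFIX does for you: tag=value pairs separated by the SOH byte, with BodyLength (tag 9) and CheckSum (tag 10) computed over the message. The function names and the CompIDs CLIENT and SERVER are invented for illustration, and required session fields such as SendingTime (tag 52) are left out, so this is not something a real counterparty would accept as-is.

```python
SOH = "\x01"  # FIX field delimiter


def build_fix_message(begin_string, fields):
    """Frame (tag, value) pairs as a FIX message with BodyLength and CheckSum."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    # BodyLength counts the bytes after the 9= field up to the CheckSum field.
    header = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    # CheckSum is the byte sum of everything before the 10= field, modulo 256.
    checksum = sum((header + body).encode("ascii")) % 256
    return f"{header}{body}10={checksum:03d}{SOH}".encode("ascii")


def checksum_ok(raw):
    """Recompute the trailer checksum of a raw message and compare."""
    head, sep, trailer = raw.rpartition(b"10=")
    expected = f"{sum(head) % 256:03d}".encode("ascii") + b"\x01"
    return bool(sep) and trailer == expected
```

For example, build_fix_message("FIX.4.2", [(35, "0"), (49, "CLIENT"), (56, "SERVER"), (34, 1)]) frames a bare Heartbeat (MsgType 0) with sequence number 1.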
The FIX specification includes a list of messages and fields. In reality, you can treat these as suggestions that, in practice, no commercial FIX counterparty uses as-is. Every counterparty I've connected to makes modifications to those messages and fields, sometimes adding entirely new messages. No counterparty supports every message and field.
When connecting to a counterparty, do not assume anything. Your counterparty should provide documentation on how they expect their interface to be used, and which messages and fields they will send and which they expect to receive from you.
Their docs should tell you which message to send them to request market data and any special fields/options you must use.
Their docs will tell you how to submit a trade.
Their docs will tell you how to do anything that they support, and which messages/fields you will receive in return.
Do not try to send any message type to your counterparty unless their docs say they support it.
If you are writing the ORS side... then you have no docs. If you haven't written a FIX client before, you probably shouldn't be writing a FIX server without some assistance from someone who has. At the least, you should try to get ahold of some other systems' FIX interface docs to get an idea of how to go about it. (Unfortunately, such firms usually only give them to client-developers.)

How much data can a socket store in its buffer before read?

I have a client / server application where clients send messages to the server. Due to a legacy library that I use, my server cannot read immediately but must wait for a condition to come true until it reads the messages. How much data can a socket store? Is there a fixed buffer size/limit?
Thanks.
It depends on the size of the socket receive buffer, whose default value varies among operating systems. You can control it from your application via setsockopt() and the SO_RCVBUF option.
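A small sketch of querying and adjusting that buffer, using the SO_RCVBUF socket option. The helper name receive_buffer_size is made up, and the value you get back is whatever the operating system decides to grant: Linux, for instance, reports double the requested size and caps it at a system-wide limit.

```python
import socket


def receive_buffer_size(sock, requested=None):
    """Optionally request a new receive buffer size, then report the actual one."""
    if requested is not None:
        # This is only a request; the OS may round, double, or clamp it.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Calling receive_buffer_size(sock) on a fresh socket shows the platform default, and receive_buffer_size(sock, 65536) shows what the OS actually granted after the request.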
That depends on many factors of which many you can't control. This is not the correct approach.
You should read the data as soon as it is available, but only process it if the conditions are met.
Edit: I guess I misinterpreted the question, see @EJP's answer.

How to maintain a persistent network-connection between two applications over a network?

I was recently approached by my management with an interesting problem - where I am pretty sure I am telling my bosses the correct information but I really want to make sure I am telling them the correct stuff.
I am being asked to develop some software that has this function:
An application at one location is constantly processing real-time data every second and only generates data if the underlying data has changed in any way.
On the event that the data has changed send the results to another box over a network
Maintains a persistent connection between both machines, alerting the remote box if for some reason the network connection went down
From what I understand, I imagine that I need to do some reading on doing some sort of TCP/IP socket-level stuff. That way if the connection is dropped the remote location will be aware that the data it has received may be stale.
However management seems to be very convinced that this can be accomplished using SOAP. I was under the impression that SOAP is more or less a way for a client to initiate a procedure from a server and get some results via the HTTP protocol. Am I wrong in assuming this? I haven't been able to find much information on how SOAP might be able to solve a problem like this.
I feel like a lot of people around my office are using SOAP as a buzzword and that has generated a bit of confusion over what SOAP actually is - and is capable of.
Any thoughts on how to accomplish this task would be appreciated!
I think SOAP is the wrong tool. SOAP is a spec for exchanging structured data. For your problem, the simplest thing would be to write a program to just transfer data and figure out if the other end is alive. Sockets are a good way to go. There are lots of socket programming tutorials on the net. Pick your language, and ask Mr. Google. Write a couple of demo programs to teach yourself how it works. Ask if you have more specific questions.
For the problem, you'll need a sender and a receiver. The sender sends data when it gets it, the receiver waits for data and hands it off when it arrives. Get that working first. Next, add in heartbeats: a message that says "I'm alive", sent periodically. Get that working next. You'll need to determine the exact behavior you want -- whether both sides should send heartbeats to the other end, the maximum time you are willing to wait for a heartbeat, and what action you take should heartbeats stop arriving. The network connection can drop, the other end can crash, the other end can hang, and perhaps there are other conditions you should think about (e.g., what if the real time data is nonsense?). Figure out how to handle each condition, and code up the error handling. Test it out, and serve with a side of documentation.
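The receiving side of the heartbeat idea can be sketched as a small timer object: record when the last message arrived and declare the peer dead once too much time has passed. The class name HeartbeatMonitor and the injectable clock parameter are illustrative choices, not part of any library, and what to do when the peer is declared dead is left to the application.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer was last heard from and flags a timeout."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = clock()    # treat creation time as the first sign of life

    def beat(self):
        """Call whenever a heartbeat (or any data) arrives from the peer."""
        self._last_seen = self._clock()

    def is_alive(self):
        """True while the last message is no older than the timeout."""
        return (self._clock() - self._last_seen) <= self.timeout_s
```

The receive loop would call beat() on every message and check is_alive() on a timer; when it returns False, the receiver marks its data as possibly stale and tries to reconnect.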
SOAP certainly won't tell you when the data source goes down, though you could use "heartbeats" to add that.
Probably you are right and they are just repeating a buzz word, and don't actually know much about what SOAP is or does or have any real argument for why it ought to be used here.

serving large file using select, epoll or kqueue

Nginx uses epoll, or other multiplexing techniques (select), to handle multiple clients, i.e. it does not spawn a new thread for every request, unlike Apache.
I tried to replicate the same in my own test program using select. I could accept connections from multiple clients by creating a non-blocking socket and using select to decide which client to serve. My program would simply echo their data back to them. It works fine for small data transfers (some bytes per client).
The problem occurs when I need to send a large file over a connection to the client. Since I have only one thread to serve all clients, I cannot resume serving other clients until I have finished reading the file and writing it to the socket.
Is there a known solution to this problem, or is it best to create a thread for every such request ?
When using select you should not send the whole file at once. If you are using, e.g., sendfile on a blocking socket to do this, it will block until the whole file has been sent. Instead use a small buffer, and send a little data at a time to each client. Then use select to identify when the socket is again ready to be written to and send some more until all data has been sent. This will allow you to handle multiple clients in parallel.
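A minimal sketch of that chunked approach, assuming the sockets are already connected and set to non-blocking. The function name serve_in_chunks, the CHUNK size, and the idea of passing the pending data as an in-memory dict are all illustrative; a real server would read each chunk from the file on demand and would also watch for readable sockets and errors.

```python
import select
import socket

CHUNK = 4096  # assumed chunk size; tune for your workload


def serve_in_chunks(payloads):
    """Send each socket's bytes a chunk at a time, interleaving all clients.

    payloads maps a connected, non-blocking socket to the bytes it should get.
    """
    offsets = {sock: 0 for sock in payloads}  # per-client progress
    while offsets:
        # Block until at least one client can accept more data.
        _, writable, _ = select.select([], list(offsets), [])
        for sock in writable:
            start = offsets[sock]
            data = payloads[sock]
            try:
                sent = sock.send(data[start:start + CHUNK])
            except BlockingIOError:
                continue  # buffer filled up in the meantime; retry later
            offsets[sock] = start + sent
            if offsets[sock] >= len(data):
                del offsets[sock]  # this client is done
```

The offsets dictionary is the per-connection state: each pass through the loop does a small amount of work for whichever clients are ready, so one slow or large transfer does not stall the others.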
The simplest approach is to create a thread per request, but it's certainly not the most scalable approach. I think at this time basically all high-performance web servers use various asynchronous approaches built on things like epoll (Linux), kqueue (BSD), or IOCP (Windows).
Since you don't provide any information about your performance requirements, and since all the non-threaded approaches require restructuring your application to use these often-complex asynchronous techniques (as described in the C10K article and others found from there), for now your best bet is just to use the threaded approach.
Please update your question with concrete requirements for performance and other relevant data if you need more.
For background this may be useful reading http://www.kegel.com/c10k.html
I think you are using your callback to handle a single connection. This is not how it was designed. Your callback has to handle however many thousands of connections you are planning to serve, i.e. from the file descriptor you get as a parameter, you have to know (by looking up per-connection state, for example in global variables) what to do with that client: read(), send(), or whatever else is needed.