How to create and maintain a receiver buffer for a network simulation framework? - matlab

I am trying to simulate a mesh network in MATLAB. The intermediate nodes and the destination need to maintain a receiver buffer so that whenever a packet arrives from a source, it is stored in the buffer and can be used for further operations. I am using a main file, and the source, intermediate and destination nodes are functions. Since the functions are called every time a new packet arrives, how and where can I maintain an individual or combined buffer for reception? The packets can't be treated on a first-come, first-served basis but need to be buffered collectively. Please ask if I haven't explained the problem clearly.

Related

Omnetpp application sends multiple streams

Let's say I have a car with different sensors: several cameras, LIDAR and so on. The data from these sensors is going to be sent to some host over a 5G network (omnetpp + inet + simu5g). For video it is about 5000 packets of 1400 bytes each, for LIDAR 7500 packets of 1240 bytes, and so on. Each flow is encoded in UDP packets.
So in the handleMessage method of my omnetpp module I have two sendTo calls, each scheduled "as soon as possible", i.e., with no delay, which corresponds to the idea of multiple parallel streams. How does omnetpp handle the situation where it needs to send two different packets at the same time from the same module to the same module (some client that receives the sensor data streams)? Does it create some inner buffer on the sender or receiver side, thereby really allowing only one packet to be sent per handleMessage call, or is that wrong? I want to optimize data transmission and play with packet sizes and maybe with sending intervals, so I want to know how omnetpp handles multiple simultaneous streams. If it actually buffers, then maybe it makes sense to form a single packet from multiple streams, where each such packet contains a certain amount of data from each stream.
There is some confusion here that needs to be clarified first:
OMNeT++ is a discrete event simulation framework. An OMNeT++ model contains modules that communicate with each other using OMNeT++ API calls like sendTo() and handleMessage(). Any call of the sendTo() method just queues the provided message into the future event queue (an internal, time-ordered queue). So if you send more than one packet in a single handleMessage() call, they will be queued in that order. The packets will be delivered one by one to the requested destination modules when the requested simulation time is reached. So you can send as many packets as you wish, and those packets will be delivered one by one to the destination's handleMessage() method. But beware! Even though the packets are delivered sequentially in the program's logic, they can still be delivered simultaneously in terms of simulation time. There are two time concepts here: real time, which describes the execution order of the code, and simulation time, which describes how time passes from the point of view of the simulated system. That's why OMNeT++, although it is a single-threaded application that runs events sequentially, can still simulate any number of systems running in parallel.
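To make the two time concepts concrete, here is a minimal toy scheduler in Python. It is only a sketch of the general discrete-event idea, not the OMNeT++ API: the names Simulator, send, sensor_node and client are made up for illustration, and the insertion-order tie-break is an assumption of this sketch.

```python
import heapq
import itertools

class Simulator:
    """Toy discrete-event scheduler (illustrative only, not the OMNeT++ API)."""

    def __init__(self):
        self.now = 0.0                  # current simulation time
        self._queue = []                # future event queue, ordered by (time, seq)
        self._seq = itertools.count()   # tie-breaker: keeps insertion order at equal times
        self.log = []

    def send(self, delay, dest, msg):
        # Sending only schedules an event; nothing is delivered yet.
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), dest, msg))

    def run(self):
        # Events are executed one by one, in time order.
        while self._queue:
            self.now, _, dest, msg = heapq.heappop(self._queue)
            dest(self, msg)

def client(sim, msg):
    sim.log.append((sim.now, msg))

def sensor_node(sim, msg):
    # Two sends with zero delay from one handler call.
    sim.send(0.0, client, "video")
    sim.send(0.0, client, "lidar")

sim = Simulator()
sim.send(1.0, sensor_node, "start")
sim.run()
print(sim.log)
```

Both messages reach the client in separate handler calls, in the order they were sent, yet both carry the same simulation timestamp.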
BUT:
You are not modeling directly with OMNeT++ modules, but rather using the INET Framework, which is a model library created specifically to simulate internet protocols and networks. INET's core entity is a node, which is something that has one or more network interfaces (and queues belonging to them). Transmission between nodes is properly modeled, and only a single packet can travel on an Ethernet line at a time. Other packets must queue in the network interface queue and wait for an opportunity to be delivered from there.
This is actually the core of the problem for Time-Sensitive Networking: given a lot of pre-defined data streams in a network, how the various packets interfere with and affect each other, and how they change the delay and jitter statistics of the various streams at the destination. Plus, how you can configure the source and network gate scheduling to achieve some desired upper bounds on those statistics.
The INET master branch (to be released as INET 4.4) contains a lot of TSN code, so I highly recommend trying it if you want to model in-vehicle networks.
If you are not interested in in-vehicle communication, but rather want to stream some data over 5G, then TSN is not your concern, but you should NOT start to multiplex/demultiplex data streams at the application level. The communication layers below your UDP application will fragment/defragment and queue the packets exactly as is done in the real world. You will not gain anything by doing mux/demux at the application layer.
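The effect of a single interface queue can be illustrated with a short calculation. This is a simplified sketch, not INET code: the function name and the 100 Mbit/s link rate are assumptions chosen for the example, and propagation delay, headers and inter-frame gaps are ignored.

```python
def departure_times(packets, bit_rate):
    """Finish times for packets leaving one interface, one packet on the wire at a time.

    packets: list of (arrival_time_s, size_bytes), sorted by arrival time.
    bit_rate: link rate in bits per second (hypothetical value in the example below).
    """
    link_free_at = 0.0
    finish = []
    for arrival, size_bytes in packets:
        start = max(arrival, link_free_at)          # wait in the queue if the link is busy
        link_free_at = start + size_bytes * 8 / bit_rate
        finish.append(link_free_at)
    return finish

# A 1400-byte video packet and a 1240-byte LIDAR packet handed over at the same instant.
times = departure_times([(0.0, 1400), (0.0, 1240)], bit_rate=100e6)
print(times)
```

Although both packets are sent at the same simulation time, the second one finishes later because it has to wait until the first one has left the interface.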

how to set timer for physical process in Castalia?

The usual practice in Castalia is that the application module requests a sensor reading using the requestSensorReading() function, which is handled by the sensor manager. The sensor manager forwards the request to the physical process, and the physical process replies with its value.
What I want is for the physical process to broadcast its value at set intervals of time. The sensor device will have a sensitivity > 0, and a few nodes will receive the value. How can I accomplish this? Is it possible to use the timerFiredCallback function and BROADCAST_NETWORK_ADDRESS inside the physical process?
You seem to be confused about the basic models of Castalia. The physical process is not a sensor node that can send network broadcast messages. It is a module that models the physical process that the sensors in our sensor nodes are sampling. Moreover, a physical process does not have one value. Values change depending on space and time, and depending on the specific model you have defined (the manual has plenty of info on how to define physical processes). You could define a physical process that returns the same value for every point in space and every point in time, but I am not sure why you would want to use such a process in a simulation.
A physical process does not "broadcast its value". Sensor nodes sample the physical process, and based on space, time, and the specific model of the process they get a value back. Different sensor nodes might get different values back. To achieve what you want, you simply make all sensor nodes periodically sample the physical process. There are some examples of Applications that do that.
So to recap: You define how your physical process needs to behave and then you make sensor nodes sample it (from the Application module using the method requestSensorReading() as you already know).
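The sampling model described above can be pictured with a small toy in Python. This is not Castalia code: physical_process and sample_periodically are hypothetical names, and the field formula is an arbitrary assumption chosen only to show a value that depends on both position and time.

```python
import math

def physical_process(x, y, t):
    """Hypothetical field: strength grows with time and decays with distance from the origin."""
    return (10.0 + t) / (1.0 + math.hypot(x, y))

def sample_periodically(position, period, n_samples):
    """What a node's application would do: sample the process at fixed intervals."""
    x, y = position
    return [(k * period, physical_process(x, y, k * period)) for k in range(n_samples)]

near = sample_periodically((0.0, 0.0), period=1.0, n_samples=3)
far = sample_periodically((3.0, 4.0), period=1.0, n_samples=3)
print(near)
print(far)
```

The two nodes sample at the same instants but read different values, because the process is a function of space and time rather than a single broadcast value.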

Push data to client, how to handle slow clients?

In a push model, where the server pushes data to clients, how does one handle clients with low or variable bandwidth?
For example, I receive data from a producer and send the data to my clients (push). What if one of my clients decides to download a Linux ISO, so that the bandwidth available to this client becomes too small to download my data?
Now when my producer produces data and the server pushes it to the clients, all clients will have to wait until all clients have downloaded the data. This is a problem when there are one or more slow clients with little bandwidth.
I can cache the data to be sent for every client, but because the data size is big this isn't really an option (lots of clients * data size = huge memory requirements).
How is this generally solved? No need for code; just a few thoughts or ideas are more than welcome.
"Now when my producer produces data and the server pushes it to the clients, all clients will have to wait until all clients have downloaded the data."
The above shouldn't be the case -- your clients should be able to download asynchronously from each other, with each client maintaining its own independent download state. That is, client A should never have to wait for client B to finish, and vice versa.
"I can cache the data to be sent for every client, but because the data size is big this isn't really an option (lots of clients * data size = huge memory requirements)."
As Warren said in his answer, this problem can be reduced by keeping only one copy of the data rather than one copy per client. Reference-counting (e.g. via shared_ptr, if you are using C++, or something equivalent in another language) is an easy way to make sure that the shared data is deleted only when all clients are done downloading it. You can make the sharing more fine-grained, if necessary, by breaking up the data into chunks (e.g. instead of all clients holding a reference to a single 800MB Linux ISO, you could break it up into 800 1MB chunks, so that you can start removing the earlier chunks from memory as soon as all clients have downloaded them, instead of having to hold the entire 800MB of data in memory until every client has downloaded the entire thing).
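A rough sketch of the chunked, shared cache idea follows, in Python rather than C++. ChunkCache and its methods are hypothetical names, and explicit per-chunk sets of outstanding clients stand in for shared_ptr-style reference counts.

```python
class ChunkCache:
    """One shared copy of the data, split into chunks that are freed as clients finish."""

    def __init__(self, chunks, client_ids):
        self.chunks = dict(enumerate(chunks))                      # index -> bytes in memory
        self.waiting = {i: set(client_ids) for i in self.chunks}   # clients still needing chunk i

    def get(self, index):
        return self.chunks[index]

    def ack(self, client_id, index):
        """Call once `client_id` has fully received chunk `index`."""
        self.waiting[index].discard(client_id)
        if not self.waiting[index]:      # last reference dropped: free the chunk
            del self.waiting[index]
            del self.chunks[index]

cache = ChunkCache([b"aaaa", b"bbbb", b"cccc"], ["fast", "slow"])
for i in range(3):           # the fast client finishes everything
    cache.get(i)
    cache.ack("fast", i)
cache.get(0)                 # the slow client has only finished chunk 0
cache.ack("slow", 0)
print(sorted(cache.chunks))
```

Chunk 0 is released as soon as both clients have it, while chunks 1 and 2 stay in memory only because the slow client still needs them.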
Of course, that sort of optimization only gets you so far -- e.g. if two clients each request a different 800MB file, then you're liable to end up with 1.6GB of RAM usage for caching, unless you come up with a more clever solution.
Here are some possible approaches you could try (from less complex to more complex). You could try any of these either separately or in combination:
Monitor how much each client's "backlog" is -- that is, keep a count of the amount of data you have cached waiting to send to that client. Also keep track of the number of bytes of cached data your server is currently holding; if that number gets too high, force-disconnect the client with the largest backlog, in order to free up memory. (this doesn't result in a good user experience for the client, of course; but if the client has a buggy or slow connection he was unlikely to have a good user experience anyway. It does keep your server from crashing or swapping itself to death because a single client has a bad connection)
Keep track of how much data your server has cached and waiting to send out. If the amount of data you have cached is too large (for some appropriate value of "too large"), temporarily stop reading from the socket(s) that are pushing the data out to you (or if you are generating your data internally, temporarily stop generating it). Once the amount of cached data gets down to an acceptable level again, you can resume receiving (or generating) more data to push.
(this may or may not be applicable to your use-case) Revise your data model so that instead of being communications-oriented, it becomes state-oriented. For example, if your goal is to update the clients' state to match the state of the data-source, and you can organize the data-source's state into a set of key/value pairs, then you can require that the data-source include a key with each piece of data it sends. Whenever a key/value pair is received from the data-source, simply place that key/value pair into a map (or hash table or some other key/value oriented data structure) for each client (again, use shared_ptrs or similar here to keep memory usage reasonable). Whenever a given client has drained its queue of outgoing TCP data, remove the oldest item from that client's key/value map, convert it into TCP bytes to send, and add them to the outgoing-TCP-data queue. Repeat as necessary. The advantage of this is that "obsolete" values for a given key are automatically dropped inside the server and therefore never need to be sent to the slow clients; rather the slow clients will only ever get the "latest" value for that given key. The beneficial consequence of that is that a given client's maximum "backlog" will be limited by the number of keys in the state-model, regardless of how slow or intermittent the client's bandwidth is. Thus a slow client might see fewer updates (per second/minute/hour), but the updates it does see will still be as recent as possible given its bandwidth.
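The state-oriented approach in the last item could look roughly like this. It is a minimal sketch with hypothetical names (ClientState, update, next_to_send); it relies on Python dicts preserving insertion order, and it treats the "oldest" item as the key that has been waiting longest.

```python
class ClientState:
    """Per-client map holding only the latest pending value for each key."""

    def __init__(self):
        self.pending = {}   # key -> latest value; insertion order = waiting order

    def update(self, key, value):
        # Overwriting drops the obsolete value, so it is never sent to this client.
        self.pending[key] = value

    def next_to_send(self):
        """Call when the client's outgoing TCP queue has drained."""
        if not self.pending:
            return None
        key = next(iter(self.pending))        # key that has waited longest
        return key, self.pending.pop(key)

slow = ClientState()
for key, value in [("depth", 10), ("temp", 4), ("depth", 11), ("depth", 12)]:
    slow.update(key, value)

backlog = len(slow.pending)
sent = [slow.next_to_send(), slow.next_to_send(), slow.next_to_send()]
print(backlog, sent)
```

Four updates arrived while the client was stalled, but its backlog never exceeded the number of keys (two), and it only receives the most recent value for each.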
Cache the data once only, and have each client handler keep track of where it is in the download, all using the same cache. Once all clients have all the data, the cached data can be deleted.

Handling concurrent UDP DatagramReceivedFcn executions in Matlab

I'm attempting to read ocean depth values at multiple frequencies which are being broadcast via UDP packets. What I'm doing is telling the logging program to return the depth values to a specific UDP port, then use the DatagramReceivedFcn to run a function when data is received and essentially save that depth.
u1 = udp(remoteip,dataport18,'ByteOrder','littleEndian','LocalPort',dataport18,'DatagramTerminateMode','off');
set(u1,'InputBufferSize',6000);
u1.DatagramReceivedFcn = {@receivedata18};
fopen(u1);
Thus, when data is received on the port specified in 'dataport18', it will run the function receivedata18(). However, I'm trying to read depth data for multiple frequencies, so I create additional UDP objects:
u2 = udp(remoteip,dataport38,'ByteOrder','littleEndian','LocalPort',dataport38,'DatagramTerminateMode','off');
set(u2,'InputBufferSize',6000);
u2.DatagramReceivedFcn = {@receivedata38};
fopen(u2);
What I'm finding though is that only data for u1 (18 kHz) is being saved. My guess is that since both frequencies ping at the same time, they both send their UDP packets and try to evaluate their respective functions at the same time, which Matlab is not capable of doing.
Is this indeed what is going on? If so, is there any way around this issue so that I can concurrently read depth data that is being sent at the same time from two separate UDP packets?
Thanks!
Update
I'm wondering if I would need the Parallel Computing Toolbox in order to do this. I have a similar program in Python that works in essentially the same way, and it has no issues. I'm assuming that Matlab can't run functions simultaneously without the Parallel Computing Toolbox.
Thought I should update this in case anyone cares. It's not really an answer to my question, but it's what I'm currently doing, and it works.
Instead of having the data sent to different UDP ports, I simply have it all sent to the same port and then read it sequentially. Thus I'm not reading the streams concurrently, although that doesn't really slow down the operation much at all.

Implement a good performing "to-send" queue with TCP

In order not to flood the remote endpoint my server app will have to implement a "to-send" queue of packets I wish to send.
I use Windows Winsock, I/O Completion Ports.
So, I know that when my code calls "socket->send(.....)" my custom "send()" function will check to see if data is already "on the wire" (towards that socket).
If data is indeed on the wire, it will simply queue the new data to be sent later.
If no data is on the wire, it will call WSASend() to really send the data.
So far everything is nice.
Now, the size of the data I'm going to send is unpredictable, so I break it into smaller chunks (say 64 bytes) in order not to waste memory for small packets, and queue/send these small chunks.
When a "write-done" completion status is given by IOCP regarding the packet I've sent, I send the next packet in the queue.
That's the problem: the speed is awfully low.
I'm actually getting speeds like 200kb/s, and that's on a local connection (127.0.0.1).
So, I know I'll have to call WSASend() with several chunks (an array of WSABUF objects), and that will give much better performance, but how much should I send at once?
Is there a recommended size of bytes? I'm sure the answer is specific to my needs, yet I'm also sure there is some "general" point to start with.
Is there any other, better, way to do this?
Of course you only need to resort to providing your own queue if you are trying to send data faster than the peer can process it (either due to link speed or the speed that the peer can read and process the data). Then you only need to resort to your own data queue if you want to control the amount of system resources being used. If you only have a few connections then it is likely that this is all unnecessary, if you have 1000s then it's something that you need to be concerned about. The main thing to realise here is that if you use ANY of the asynchronous network send APIs on Windows, managed or unmanaged, then you are handing control over the lifetime of your send buffers to the receiving application and the network. See here for more details.
And once you have decided that you DO need to bother with this, you still don't always need to bother: if the peer can process the data faster than you can produce it, then there's no need to slow things down by queuing on the sender. You'll see that you need to queue data because your write completions will begin to take longer, as the overlapped writes that you issue cannot complete while the TCP stack is unable to send any more data due to flow control (see http://www.tcpipguide.com/free/t_TCPWindowSizeAdjustmentandFlowControl.htm). At this point you are potentially using an unconstrained amount of limited system resources (both non-paged pool memory and the number of memory pages that can be locked are limited and (as far as I know) both are used by pending socket writes)...
Anyway, enough of that... I assume you already have achieved good throughput before you added your send queue? To achieve maximum performance you probably need to set the TCP window size to something larger than the default (see http://msdn.microsoft.com/en-us/library/ms819736.aspx) and post multiple overlapped writes on the connection.
Assuming you already HAVE good throughput then you need to allow a number of pending overlapped writes before you start queuing, this maximises the amount of data that is ready to be sent. Once you have your magic number of pending writes outstanding you can start to queue the data and then send it based on subsequent completions. Of course, as soon as you have ANY data queued all further data must be queued. Make the number configurable and profile to see what works best as a trade off between speed and resources used (i.e. number of concurrent connections that you can maintain).
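The pending-write limit described above can be sketched independently of Winsock. The following Python model is only an illustration: SendQueue, issue_write and on_write_complete are hypothetical names, and issue_write stands in for whatever starts an overlapped write (such as a WSASend() call) in the real server.

```python
from collections import deque

class SendQueue:
    """Allow up to max_pending writes in flight; queue everything beyond that."""

    def __init__(self, max_pending, issue_write):
        self.max_pending = max_pending   # the configurable "magic number"
        self.issue_write = issue_write   # stand-in for starting an overlapped write
        self.pending = 0
        self.queued = deque()

    def send(self, buf):
        # Once ANY data is queued, all further data must be queued to keep ordering.
        if self.queued or self.pending >= self.max_pending:
            self.queued.append(buf)
        else:
            self.pending += 1
            self.issue_write(buf)

    def on_write_complete(self):
        """Call from the completion handler for each finished write."""
        self.pending -= 1
        if self.queued:
            self.pending += 1
            self.issue_write(self.queued.popleft())

issued = []
q = SendQueue(max_pending=2, issue_write=issued.append)
for buf in [b"1", b"2", b"3", b"4"]:
    q.send(buf)
before = list(issued)
q.on_write_complete()
print(before, issued, list(q.queued))
```

With a limit of two, the first two buffers are written immediately, the rest are queued, and each completion releases exactly one queued buffer. Profiling different values of max_pending is how the trade-off between speed and resource usage would be explored.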
I tend to queue the whole data buffer that is due to be sent as a single entry in a queue of data buffers. Since you're using IOCP, it's likely that these data buffers are already reference counted to make it easy to release them when the completions occur and not before, so the queuing process is simple: you hold a reference to the send buffer whilst the data is in the queue and release it once you've issued a send.
Personally I wouldn't optimise by using scatter/gather writes with multiple WSABUFs until you have the base working and you know that doing so actually improves performance, I doubt that it will if you have enough data already pending; but as always, measure and you will know.
64 bytes is too small.
You may have already seen this but I wrote about the subject here: http://www.lenholgate.com/blog/2008/03/bug-in-timer-queue-code.html though it's possibly too vague for you.