How to simulate M/M/1 using omnet++? - queue

I am new to OMNeT++ and am trying to use the INET framework to simulate an M/M/1 queueing model on the server side with a FIFO discipline.
First, I found that INET supports external output queues such as FIFO and DropTailQueue. I tried to edit one so that it works on input data rather than output data, but I am not sure whether I am on the right track.
My question: in which layer should I simulate the M/M/1 queue, and how can I calculate the arrival rate and the processing rate?

As your task has nothing to do with actual protocols (just a generic queuing question), you would be better off with the queuing tutorial in the base OMNeT++ installation (omnetpp/samples/queuenet). You can quite easily assemble various queueing models.
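To make the two rates concrete, here is a minimal sketch in plain Python (not OMNeT++ or INET code) of a FIFO M/M/1 queue. The function name simulate_mm1 and its parameters are made up for illustration. It assumes exponential inter-arrival and service times, estimates the arrival rate as customers per unit of simulated time and the service rate as customers per unit of server busy time, and reports the mean time in system so it can be compared with the textbook value 1 / (mu - lambda).

```python
import random


def simulate_mm1(arrival_rate, service_rate, num_customers, seed=1):
    """Simulate a FIFO single-server queue with exponential arrivals and service."""
    rng = random.Random(seed)
    arrival_time = 0.0     # arrival instant of the current customer
    server_free_at = 0.0   # instant at which the server finishes its current job
    busy_time = 0.0        # total time the server spent serving
    total_time_in_system = 0.0

    for _ in range(num_customers):
        arrival_time += rng.expovariate(arrival_rate)
        service_time = rng.expovariate(service_rate)
        # FIFO: service starts on arrival, or when the previous customer leaves.
        start = max(arrival_time, server_free_at)
        server_free_at = start + service_time
        busy_time += service_time
        total_time_in_system += server_free_at - arrival_time

    return {
        "measured_arrival_rate": num_customers / arrival_time,
        "measured_service_rate": num_customers / busy_time,
        "mean_time_in_system": total_time_in_system / num_customers,
    }


if __name__ == "__main__":
    stats = simulate_mm1(arrival_rate=0.5, service_rate=1.0, num_customers=200000)
    print(stats)
    print("theoretical mean time in system:", 1.0 / (1.0 - 0.5))
```

With lambda = 0.5 and mu = 1.0 the utilisation is 0.5, so the measured mean time in system should come out close to 2 time units.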

Related

Omnetpp application sends multiple streams

Let's say I have a car with different sensors: several cameras, LIDAR and so on. The data from these sensors is going to be sent to some host over a 5G network (omnetpp + inet + simu5g). For video it is something like 5000 packets of 1400 bytes each, for LIDAR 7500 packets of 1240 bytes each, and so on. Each flow is encoded in UDP packets.
So in the handleMessage method of my omnetpp module I have two sendTo calls, each scheduled "as soon as possible", i.e., with no delay, which corresponds to the idea of multiple parallel streams. How does omnetpp handle the situation where it needs to send two different packets at the same time from the same module to the same module (some client that receives the sensor data streams)? Does it create some internal buffer on the sender or receiver side, so that really only one packet is sent per handleMessage call, or is that wrong? I want to optimize data transmission and play with packet sizes and maybe with sending intervals, so I want to know how omnetpp handles multiple simultaneous streams. If it actually buffers, then maybe it makes sense to form a single packet from multiple streams, where each such packet consists of a certain amount of data from each stream.
There is some confusion here that needs to be clarified first:
OMNeT++ is a discrete event simulation framework. An OMNeT++ model contains modules that communicate with each other using OMNeT++ API calls like sendTo() and handleMessage(). Any call of the sendTo() method just queues the provided message into the future event queue (an internal, time-ordered queue). So if you send more than one packet in a single handleMessage() method, they will be queued in that order. The packets will be delivered one by one to the requested destination modules when the requested simulation time is reached. So you can send as many packets as you wish, and those packets will be delivered one by one to the destination's handleMessage() method. But beware! Even though the different packets are delivered one by one, sequentially, in the program's logic, they can still be delivered simultaneously in terms of simulation time. There are two time concepts here: real time, which describes the execution order of the code, and simulation time, which describes the time that passes from the point of view of the simulated system. That's why OMNeT++, although it is a single-threaded application that runs events sequentially, can still simulate any number of systems running in parallel.
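The difference between execution order and simulation time can be sketched in a few lines of Python. This is a toy scheduler, not the OMNeT++ API: the class Simulation and its methods send and run are hypothetical names, and ties between events with the same timestamp are assumed to be broken by insertion order.

```python
import heapq
import itertools


class Simulation:
    """Toy discrete event scheduler with a time-ordered future event queue."""

    def __init__(self):
        self.now = 0.0
        self._future_events = []            # heap of (time, sequence, destination, message)
        self._sequence = itertools.count()  # tie-breaker: insertion order

    def send(self, message, destination, delay=0.0):
        # Sending only queues the message; nothing is delivered yet.
        entry = (self.now + delay, next(self._sequence), destination, message)
        heapq.heappush(self._future_events, entry)

    def run(self):
        # Events are processed strictly one after another in real time,
        # even when they carry the same simulation timestamp.
        while self._future_events:
            self.now, _, destination, message = heapq.heappop(self._future_events)
            destination(self.now, message)


def demo():
    delivered = []

    def receiver(sim_time, message):
        delivered.append((sim_time, message))

    sim = Simulation()
    sim.send("video packet", receiver)            # same simulation time...
    sim.send("lidar packet", receiver)            # ...as this one
    sim.send("later packet", receiver, delay=1.0)
    sim.run()
    return delivered


if __name__ == "__main__":
    for sim_time, message in demo():
        print(sim_time, message)
```

Both the video and the lidar packet are delivered at simulation time 0.0, yet the receiver's handler is invoked twice in sequence.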
BUT:
You are not modeling directly with OMNeT++ modules, but rather using the INET Framework, which is a model library created specifically to simulate internet protocols and networks. INET's core entity is a node, which is something that has network interface(s) (and queues belonging to them). Transmission between nodes is properly modeled, and only a single packet can travel on an Ethernet line at a time. Other packets must queue in the network interface queue and wait for an opportunity to be delivered from there.
This is actually the core of the problem for Time-Sensitive Networking: given a lot of pre-defined data streams in a network, how do the various packets interfere with and affect each other, and how do they change the delay and jitter statistics of the various streams at the destination? Plus, how can you configure the source and network gate scheduling to achieve some desired upper bounds on those statistics?
The INET master branch (to be released as INET 4.4) contains a lot of TSN code, so I highly recommend trying it if you want to model in-vehicle networks.
If you are not interested in the in-vehicle communication, but rather want to stream some data over 5G, then TSN is not relevant to you, but you should NOT start to multiplex/demultiplex data streams at the application level. The communication layers below your UDP application will fragment/defragment and queue the packets exactly as it is done in the real world. You will not gain anything by doing mux/demux at the application layer.

Need for multi-threading in Systemverilog using fork-join

In most textbooks advocating layered testbench designs, it is recommended that the different layers/blocks run in parallel. I'm currently unable to figure out why this is so. Why can't we follow the sequence below?
repeat for 1000 tests
    generate a transaction
    drive the transaction on the DUT
    monitor the transaction on the DUT
    compare output with a reference
Instead, what is recommended is that all four blocks (generator, driver, monitor and scoreboard/checker) should run in parallel. My confusion is why we avoid the sequential behavior described above, in which we go through the tests one test case at a time, and prefer different blocks running in parallel.
Some texts say it is because that is how things are done in hardware, i.e. everything runs in parallel. However, the layered testbench does not need to model any synthesizable hardware. So why do we have to restrict our verification environment/testbench to this hardware-like behavior?
A sample block diagram that I'm referring to is given below:
Suppose that you have a fifo which you want to test. Your driver pushes data into it, and the monitor checks the other end. The data gets pushed when it is available and until the fifo is full; the consumer on the other end reads data when it can. So the pipe is sometimes full, sometimes empty.
When the fifo is full, the driver must stop. The monitor always works, but its values do not change at the same frequency as the stimuli, and they are delayed due to the fifo depth.
In your example, when the fifo is full, the stopped driver will block the whole loop, so the monitor will not work either. Of course, you can come up with some conditional statements which bypass the stopped driver. But you will need to run the monitor and the scoreboard every time, even if the data is not changing.
With more complicated designs with multiple fifos, pipelines, delays, clock frequencies, etc., your loop will become so complicated that it will be difficult, if not impossible, to manage.
The problem is that in simple sequential programming it is not possible to express a block/wait condition for one statement without blocking the whole loop. It is much easier to do with parallel threads.
The general approach is to run driver and monitor in separate simulation threads. Monitor in this case waits for the data to appear and does not block the driver. The driver pushes data when it is available and can be blocked by fifo full or if there is nothing to drive. It does not block the monitor.
With a single monitor, you can probably pack the scoreboard in the same thread with the monitor, but with multiple monitors it will be problematic, in particular when all monitors run in separate threads. So, the scoreboard should run as a separate thread as well.
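The structure described above can be sketched outside SystemVerilog with ordinary threads. The following Python sketch is only an analogy under stated assumptions: a bounded queue.Queue stands in for the fifo under test, and the function run_testbench with its driver, monitor and scoreboard threads are hypothetical names. Each blocking call stalls only its own thread.

```python
import queue
import threading


def run_testbench(num_transactions=20, fifo_depth=4):
    """Drive a bounded FIFO from one thread and check its output from others."""
    fifo = queue.Queue(maxsize=fifo_depth)  # stands in for the DUT
    observed = queue.Queue()                # monitor -> scoreboard channel
    expected = list(range(num_transactions))
    mismatches = []
    checked = []

    def driver():
        for value in expected:
            fifo.put(value)   # blocks only the driver while the FIFO is full
        fifo.put(None)        # end-of-test marker

    def monitor():
        while True:
            value = fifo.get()  # blocks only the monitor while the FIFO is empty
            if value is None:
                return
            observed.put(value)

    def scoreboard():
        for want in expected:
            got = observed.get()  # blocks only the scoreboard
            checked.append(got)
            if got != want:
                mismatches.append((want, got))

    threads = [threading.Thread(target=f) for f in (driver, monitor, scoreboard)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(checked), mismatches


if __name__ == "__main__":
    print(run_testbench())
```

Writing the same test as one sequential loop would require explicit "is the fifo full / empty" checks around every step, which is exactly the bookkeeping the parallel version avoids.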
You are mixing two different concepts. The layered approach is a software concept that helps manage different abstraction levels, from software transactions (a frame of data) down to the individual pin wiggles. These layers are very similar to the OSI network model. Layering also helps with maintenance and reusability by defining clear interfaces that enable you to build up a larger system. It's hard to see the benefits of this on a testbench for a small combinational block.
Parallelism comes into play for other reasons. There are relatively few complete designs out there that can be tested by applying a single stream of inputs and then comparing the output to a reference model. You might be able to test one small block of a design this way, but not a complete chip, as it typically has many interfaces that need to be driven in parallel.
But let's take the case of two simple blocks that you tested individually with the approach above. Now you want to connect them together, where the output of the first DUT becomes the driver of the second DUT:
Driver1 -> DUT1 -> DUT2 -> Monitor2
This works best if I originally write the drivers and monitors as separate objects running in parallel.

Simulink: Introduce delay with UDP Send/Receive

I'm building a client/server-type subsystem in a control system application using UDP Send/Receive blocks in Simulink. Data x is sent to the server via UDPSend block which is then processed at the server that returns output y.
Currently, both the client (a Simulink model) and the server (processing logic written in Java) reside on localhost. Therefore, the packet exchanges take essentially zero time. I'd like to introduce network delay such that the packet exchanges take a varying amount of time (say, due to changes in bandwidth availability), effectively simulating a scenario where the server node is located in a different geographical location.
Could someone guide me on how to achieve this? Thanks.
As a general (Simulink-independent) solution in a Windows environment, you should have a look at the following tool, which "makes your network condition significantly worse, but in a managed and interactive manner."

Is Communicating Sequential Processes [CSP] an alternative to the actor model in Scala?

In a 1978 paper, Hoare introduced an idea called Communicating Sequential Processes. It is used by Go, Occam, and Clojure's core.async.
Is it possible to use CSP as an alternative to the Actor Model in Scala? (I'm seeing JCSP but I'm wondering if this is the only option, if it is mature, and if anyone uses it).
EDIT - I'm also seeing Communicating Scala Objects as an alternative to JCSP in Scala. But both of these seem to be tied to real threads, which seems to miss one of the benefits of CSP: getting away from the memory cost of keeping large numbers of threads always active.
You should consult this document, but in general there are a few differences:
Channels are anonymous while actors have identities
In CSP, you use channels to transmit messages, but actors can directly contact each other.
In CSP communication is done in the form of rendezvous (i.e., it is synchronous). Actors support asynchronous message passing.
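The rendezvous versus asynchronous distinction above can be illustrated with a small Python sketch. It is not JCSP or any Scala actor library; RendezvousChannel and Actor are hypothetical classes built on the standard queue module, and the channel assumes a single sender at a time.

```python
import queue
import threading


class RendezvousChannel:
    """Anonymous channel: send() returns only after a receiver took the item."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)

    def send(self, item):
        self._slot.put(item)  # hand the item over
        self._slot.join()     # wait until the receiver acknowledges it

    def receive(self):
        item = self._slot.get()
        self._slot.task_done()  # releases the waiting sender
        return item


class Actor:
    """Named endpoint with an unbounded mailbox: tell() never waits."""

    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()

    def tell(self, message):
        self.mailbox.put(message)


def demo():
    channel = RendezvousChannel()
    sender = threading.Thread(target=channel.send, args=("hello",))
    sender.start()
    sender.join(timeout=0.2)               # nobody is receiving yet
    blocked_without_receiver = sender.is_alive()
    received = channel.receive()            # rendezvous completes here
    sender.join()

    actor = Actor("printer")
    actor.tell("hello")                     # returns at once, message is queued
    return blocked_without_receiver, received, actor.mailbox.qsize()


if __name__ == "__main__":
    print(demo())
```

The channel has no identity of its own beyond the object reference that both sides share, while the actor is addressed by name and buffers whatever it is sent.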
And yes, it is possible to use CSP as an alternative to the Actor model if these differences are acceptable in your situation. I don't have any experience with JCSP, but I wouldn't recommend using that specific library (the reason being that, as far as I can see, there hasn't been any activity in the project since 2011).

What is Event Driven Concurrency?

I am starting to learn Scala and functional programming. I was reading the book "Programming Scala: Tackle Multi-Core Complexity on the Java Virtual Machine". In the first chapter I came across the terms Event-Driven concurrency and Actor model. Before I continue reading this book, I want to have an idea of what Event-Driven concurrency and the Actor Model are.
What is Event-Driven concurrency, and how is it related to Actor Model?
An Event Driven programming model involves registering code to be run when a given event fires. An example is, instead of calling a method that returns some data from a database:
val user = db.getUser(1)
println(user.name)
You could instead register a callback to be run when the data is ready:
db.getUser(1, u => println(u.name))
In the first example, no concurrency was happening; the current thread would block until db.getUser(1) returned data from the database. In the second example, db.getUser would return immediately and the program would carry on executing the code that follows. In parallel to this, the callback u => println(u.name) will be executed at some point in the future.
Some people prefer the second approach because it means memory-hungry threads are not needlessly sitting around waiting for slow I/O to return.
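For readers who want to run the two styles side by side, here is a rough Python equivalent. The names USERS, get_user_blocking and get_user_async are invented for this sketch, the "database" is just a dictionary with an artificial delay, and a thread pool plays the role of whatever performs the I/O in the background.

```python
import time
from concurrent.futures import ThreadPoolExecutor

USERS = {1: "alice"}  # hypothetical stand-in for a database table


def get_user_blocking(user_id):
    """Blocking style: the caller waits until the data is available."""
    time.sleep(0.05)  # simulated slow I/O
    return USERS[user_id]


def get_user_async(executor, user_id, callback):
    """Event-driven style: register a callback and return immediately."""
    future = executor.submit(get_user_blocking, user_id)
    future.add_done_callback(lambda done: callback(done.result()))
    return future


def demo():
    events = []
    with ThreadPoolExecutor(max_workers=1) as executor:
        events.append("blocking: " + get_user_blocking(1))
        get_user_async(executor, 1, lambda name: events.append("callback: " + name))
        events.append("caller continues")
    # Leaving the with-block waits for the background work to finish.
    return events


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The caller's "continues" entry is recorded before the callback fires, because get_user_async does not wait for the simulated I/O.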
The Actor Model is an example of how Event-Driven concepts can be used to help the programmer easily write concurrent programs.
From a very high level, Actors are objects that define a series of Event-Driven message handlers that get fired when the Actor receives messages. In Akka, each instance of an Actor is single-threaded; however, when many of these Actors are put together they create a system with concurrency.
For example, Actor A could send messages to Actor B and C in parallel. Actor B and C could fire messages back to Actor A. Actor A would have message handlers to receive these messages and behave as desired.
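The A/B/C exchange just described can be sketched as follows. This is plain Python, not Akka: the Actor class, its tell and stop methods, and the STOP sentinel are assumptions made for the example. Each actor processes its own mailbox on a single thread, so a handler never runs concurrently with itself.

```python
import queue
import threading

STOP = object()  # sentinel that tells an actor to shut down


class Actor:
    """Minimal actor: one mailbox, one thread, one message handler."""

    def __init__(self, name, handler):
        self.name = name
        self._handler = handler
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message, sender=None):
        self._mailbox.put((message, sender))  # asynchronous, never blocks

    def stop(self):
        self._mailbox.put((STOP, None))
        self._thread.join()

    def _run(self):
        while True:
            message, sender = self._mailbox.get()
            if message is STOP:
                return
            self._handler(self, message, sender)


def demo():
    replies = queue.Queue()

    def worker_handler(me, message, sender):
        sender.tell(message.upper(), sender=me)  # fire a message back

    def a_handler(me, message, sender):
        if message == "start":
            b.tell("ping", sender=me)  # B and C now work concurrently
            c.tell("ping", sender=me)
        else:
            replies.put((sender.name, message))

    b = Actor("B", worker_handler)
    c = Actor("C", worker_handler)
    a = Actor("A", a_handler)
    a.tell("start")
    collected = sorted(replies.get(timeout=5) for _ in range(2))
    for actor in (a, b, c):
        actor.stop()
    return collected


if __name__ == "__main__":
    print(demo())
```

The order in which B and C answer is not fixed, which is why the demo sorts the replies before returning them.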
To learn more about the Actor model I would recommend reading the Akka documentation. It is really well written: http://doc.akka.io/docs/akka/2.1.4/
There is also lots of good documentation around the web about Event-Driven Concurrency that is much more detailed than what I've written here: http://berb.github.io/diploma-thesis/original/055_events.html
Theon's answer provides a good modern overview. I'd like to add some historical perspective.
Tony Hoare and Robin Milner both developed mathematical algebras for analysing concurrent systems (Communicating Sequential Processes, CSP, and the Calculus of Communicating Systems, CCS). Both of these look like heavy mathematics to most of us, but the practical application is relatively straightforward. CSP led directly to the Occam programming language amongst others, with Go being the newest example. CCS led to the Pi calculus and the mobility of communicating channel ends, a feature that is part of Go and was added to Occam in the last decade or so.
CSP models concurrency purely by considering autonomous entities ('processes', very lightweight things like green threads) interacting simply by event exchange. The medium for passing events is channels. Processes may have to deal with several inputs or outputs, and they do this by selecting the event that is ready first. The events usually carry data from the sender to the receiver.
A principal feature of the CSP model is that a pair of processes engage in communication only when both are ready; in practical terms this leads to what is usually called 'synchronous' communication. However, the actual implementations (Go, Occam, Akka) allow channels to be buffered (the normal state in Akka), so that the lock-step exchange of events is often actually decoupled instead.
So in summary, an event-driven CSP-based system is really a data-flow network of processes connected by channels.
Besides the CSP interpretation of event-driven, there have been others. An important example is the 'event-wheel' approach, once popular for modelling concurrent systems whilst actually having a single processing thread. Such systems handle events by putting them into a processing queue and dealing with them in due course, usually via a callback. Java Swing's event processing engine is a good example. There were others, e.g. for time-based simulation engines. One might think of the JavaScript / NodeJS model as fitting into this category as well.
So in summary, an event-wheel was a way to express concurrency but without parallelism.
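A minimal sketch of such an event wheel, assuming nothing beyond a queue of pending events and a table of callbacks, might look like this in Python. The class EventWheel and its methods on, post and run are hypothetical names and are not modelled on any particular toolkit's API.

```python
from collections import deque


class EventWheel:
    """Single-threaded event loop: queued events are dispatched to callbacks."""

    def __init__(self):
        self._pending = deque()
        self._handlers = {}

    def on(self, event_name, callback):
        self._handlers.setdefault(event_name, []).append(callback)

    def post(self, event_name, payload=None):
        self._pending.append((event_name, payload))  # handled in due course

    def run(self):
        while self._pending:
            event_name, payload = self._pending.popleft()
            for callback in self._handlers.get(event_name, []):
                callback(payload)


def demo():
    log = []
    wheel = EventWheel()

    def on_click(button):
        log.append("click " + button)
        wheel.post("redraw", button)  # a handler may queue follow-up events

    def on_redraw(button):
        log.append("redraw " + button)

    wheel.on("click", on_click)
    wheel.on("redraw", on_redraw)
    wheel.post("click", "a")
    wheel.post("click", "b")
    wheel.run()
    return log


if __name__ == "__main__":
    print(demo())
```

The two click activities are interleaved with their redraws, so they make progress concurrently, yet only one callback ever executes at a time.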
The irony of this is that the two approaches I've described above are both described as event driven but what they mean by event driven is different in each case. In one case, hardware-like entities are wired together; in the other, almost all actions are executed by callbacks. The CSP approach claims to be scalable because it's fully composable; it's naturally adept at parallel execution also. If there are any reasons to favour one over the other, these are probably it.
To understand the answer to this you have to look at event concurrency from the OS layer up. First you start with threads which are the smallest section of code that can be run by the OS and eventually deal with I/O, timing and other kinds of events.
The OS groups threads into a process in which they share the same memory, protection and security permissions. Above that layer you have user programs which typically make I/O requests that are handled by user libraries.
The I/O libraries handle these requests in one of two ways. Unix-like systems use a "reactor" model in which the library registers I/O handlers for all the different types of I/O and events in the system. These handlers are activated when I/O is ready on a specific device. Windows-like systems use an I/O completion model in which I/O requests are made and a callback is triggered when the request is complete.
Both of these models require a significant amount of overhead to manage overall program state if you were to use them directly. However some programming tasks (web apps / services) lend themselves to a seemingly more direct implementation if you use an event model directly, but you still need to manage all of that program state. In order to track program logic across dispatches of several related events you have to manually track state and pass it around to the callbacks. This tracking structure is usually called a state context or baton. As you might imagine passing batons around all over the place to numerous seemingly unrelated handlers makes for some extremely hard to read and spaghetti-like code. It's also a pain to write and debug -- especially when you're trying to handle the synchronization of various concurrent paths of execution. You start getting into Futures and then the code becomes really difficult to read.
One well-known event processing library is called libuv. It's a portable event loop that integrates Unix's reactor model with Windows' completion model into a single model usually called a "proactor". It's the event loop that drives NodeJS.
Which brings us to communicating sequential processes.
https://en.wikipedia.org/wiki/Communicating_sequential_processes
Rather than writing asynchronous I/O dispatch and synchronization code using one or more concurrency models (and their often competing conventions), we flip the problem on its head. We use a "coroutine" which looks like normal sequential code.
A simple example is a coroutine that receives a single byte over an event channel from another coroutine that sends a single byte. This effectively synchronizes I/O producer and consumer because the writer/sender has to wait for a reader/receiver and vice-versa. While either process is waiting they explicitly yield execution to other processes. When a coroutine yields, its scoped program state is saved on a stack frame thus saving you from the confusion of managing multi-layered baton state in an event loop.
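The byte-at-a-time example can be approximated with Python's asyncio coroutines. This is a sketch under assumptions: asyncio.Queue with put followed by join is used to imitate a rendezvous channel, None serves as an end-of-stream marker, and the function names producer, consumer and main are arbitrary. Each await is a point where the coroutine yields and its local state is preserved for it.

```python
import asyncio


async def producer(channel, data):
    for byte in data:
        await channel.put(byte)  # hand over one byte
        await channel.join()     # yield until the consumer has taken it
    await channel.put(None)      # end-of-stream marker


async def consumer(channel, received):
    while True:
        byte = await channel.get()  # yield until a byte is available
        channel.task_done()         # lets the waiting producer continue
        if byte is None:
            return
        received.append(byte)


async def main():
    channel = asyncio.Queue(maxsize=1)
    received = []
    await asyncio.gather(producer(channel, b"CSP"), consumer(channel, received))
    return bytes(received)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both functions read like ordinary sequential loops; there is no explicit baton or state context being passed between callbacks.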
Using applications built on these event channels we can construct arbitrary, reusable, concurrent logic and the algorithms no longer look like spaghetti code. In pure CSP systems if you write to a channel and there is no reader, you will be blocked. The channel endpoints are known via handles internally to the program.
Actor systems are different in a couple of ways. First, the endpoints are the actor threads, and they are named and known external to the mainline program. The second difference is that sends and receives on these channels are buffered. In other words, if you send a message to an actor and it isn't listening or it's busy, you aren't blocked until it reads from its input channel. Other differences exist; for example, one actor can publish to two different actors concurrently.
As you might guess Actor systems can easily be built from CSP systems. There are other details like waiting for specific event patterns and selecting from them, but that's the basics.
I hope that clarifies things a bit.
Other constructs can be built from these ideas. Various programming systems (Go, Erlang, etc) include CSP implementations within them. Operating systems like Inferno and Node9 use CSPs and Channels as the basis of their distributed computing model.
Go: https://en.wikipedia.org/wiki/Go_(programming_language)
Erlang: https://en.wikipedia.org/wiki/Erlang_(programming_language)
Inferno: https://en.wikipedia.org/wiki/Inferno_(operating_system)
Node9: https://github.com/jvburnes/node9