Is pipelining possible in ESPER CEP? - complex-event-processing

I am new to ESPER.
I just want to know if pipelining is possible in Esper.
I have a listener which monitors temperature difference. Suppose there are two cases :
Difference in temperatures is 10 degrees.
spike in temperature is above 2%.
Now I have to run these checks in two separate EPstatments on the same event having same listener.
For example if it receives a temperature difference of 30 then both the queries will return true and corresponding listener will execute, so there is no reason for both of them raising 2 alerts.
So, how can we design so that if first statement is true and raises an alert the second should not execute.

Related

Machine Learning to predict time-series multi-class signal changes

I would like to predict the switching behavior of time-dependent signals. Currently the signal has 3 states (1, 2, 3), but it could be that this will change in the future. For the moment, however, it is absolutely okay to assume three states.
I can make the following assumptions about these states (see picture):
the signals repeat periodically, possibly with variations concerning the time of day.
the duration of state 2 is always constant and relatively short for all signals.
the duration of states 1 and 3 are also constant, but vary for the different signals.
the switching sequence is always the same: 1 --> 2 --> 3 --> 2 --> 1 --> [...]
there is a constant but unknown time reference between the different signals.
There is no constant time reference between my observations for the different signals. They are simply measured one after the other, but always at different times.
I am able to rebuild my model periodically after i obtained more samples.
I have the following problems:
I can only observe one signal at a time.
I can only observe the signals at different times.
I cannot trigger my measurement with the state transition. That means, when I measure, I am always "in the middle" of a state. Therefore I don't know when this state has started and also not exactly when this state will end.
I cannot observe a certain signal for a long duration. So, i am not able to observe a complete period.
My samples (observations) are widespread in time.
I would like to get a prediction either for the state change or the current state for the current time. It is likely to happen that i will never have measured my signals for that requested time.
So far I have tested the TimeSeriesPredictor from the ML.NET Toolbox, as it seemed suitable to me. However, in my opinion, this algorithm requires that you always pass only the data of one signal. This means that assumption 5 is not included in the prediction, which is probably suboptimal. Also, in this case I had problems with the prediction not changing, which should actually happen time-dependently when I query multiple predictions. This behavior led me to believe that only the order of the values entered the model, but not the associated timestamp. If I have understood everything correctly, then exactly this timestamp is my most important "feature"...
So far, i did not do any tests on Regression-based approaches, e.g. FastTree, since my data is not linear, but keeps changing states. Maybe this assumption is not valid and regression-based methods could also be suitable?
I also don't know if a multiclassifier is required, because I had understood that the TimeSeriesPredictor would also be suitable for this, since it works with the single data type. Whether the prediction is 1.3 or exactly 1.0 would be fine for me.
To sum it up:
I am looking for a algorithm which is able to recognize the switching patterns based on lose and widespread samples. It would be okay to define boundaries, e.g. state duration 3 of signal 1 will never last longer than 30s or state duration 1 of signal 3 will never last longer 60s.
Then, after the algorithm has obtained an approximate model of the switching behaviour, i would like to request a prediction of a certain signal state for a certain time.
Which methods can I use to get the best prediction, preferably using the ML.NET toolbox or based on matlab?
Not sure if this is quite what you're looking for, but if detecting spikes and changes using signals is what you're looking for, check out the anomaly detection algorithms in ML.NET. Here are two tutorials that show how to use them.
Detect anomalies in product sales
Spike detection
Change point detection
Detect anomalies in time series
Detect anomaly period
Detect anomaly
One way to approach this would be to first determine the periodicity of each of the signals independently. This could be done by looking at the frequency distribution of time differences between measurements of state 2 only and separately for each signal.
This will give a multinomial distribution. The shortest time difference will be the duration of the switching event (after discarding time differences less than the max duration of state 2). The second shortest peak will be the duration between the end of one switching event and the start of the next.
When you have the 3 calculations of periodicity you can simply calculate the difference between each of them. Given you have the timestamps of the measurements of state 2 for each signal you should be able to calculate the time of switching for all other signals.

Modeling a waiting line in a call centre in AnyLogic

I need to model a waiting line in a call centre in AnyLogic. This is what I don't understand. It says:
If all of the service representatives are busy, an arriving customer is placed on hold, but ties up on the phone lines.
I am not sure what block or how to model customers waiting. Can someone help me? Thank You!
Here is my solution, although I'm absolutely sure other methods exist.
To represent an arrival rate of 15/hr, use a source block with arrivals defined by rate, with the rate set to 15 per hour.
To represent 24 phone lines and 3 service reps, use a queue block (callsWaiting) followed by a delay block (service). The queue block should have capacity = 21 and the delay block should have capacity = 3 with a delay time of exponential(0.1,0) minutes representing the exponential service time (with mean 10 min).
To represent losing calls when all of the phone lines are tied up, place a selectOutput block before the callsWaiting queue and set its condition to: callsWaiting.canEnter(). It will return false if the queue is at maximum capacity. On the false branch for that selectOutput, place a sink block for dropped calls.

Anylogic - How to measure work in process inventory (WIP) within simulation

I am currently working on a simple simulation that consists of 4 manufacturing workstations with different processing times and I would like to measure the WIP inside the system. The model is PennyFab2 in case anybody knows it.
So far, I have measured throughput and cycle time and I am calculating WIP using Little's law, however the results don't match he expectations. The cycle time is measured by using the time measure start and time measure end agents and the throughput by simply counting how many pieces flow through the end of the simulation.
Any ideas on how to directly measure WIP without using Little's law?
Thank you!
For little's law you count the arrivals, not the exits... but maybe it doesn't make a difference...
Otherwise.. There are so many ways
you can count the number of agents inside your system using a RestrictedAreaStart block and use the entitiesInside() function
You can just have a variable that adds +1 if something enters and -1 if something exits
No matter what, you need to add the information into a dataset or a statistics object and you get the mean of agents in your system
Little's Law defines the relationship between:
Work in Process =(WIP)
Throughput (or Flow rate)
Lead Time (or Flow Time)
This means that if you have 2 of the three you can calculate the third.
Since you have a simulation model you can record all three items explicitly and this would be my advice.
Little's Law should then be used to validate if you are recording the 3 values correctly.
You can record them as follows.
WIP = Record the average number of items in your system
Simplest way would be to count the number of items that entered the system and subtract the number of items that left the system. You simply do this calculation every time unit that makes sense for the resolution of your model (hourly, daily, weekly etc) and save the values to a DataSet or Statistics Object
Lead Time = The time a unit takes from entering the system to leaving the system
If you are using the Process Modelling Library (PML) simply use the timeMeasureStart and timeMeasureEnd Blocks, see the example model in the help file.
Throughput = the number of units out of the system per time unit
If you run the model and your average WIP is 10 units and on average a unit takes 5 days to exit the system, your throughput will be 10 units/5 days = 2 units/day
You can validate this by taking the total units that exited your system at the end of the simulation and dividing it by the number of time units your model ran
if you run a model with the above characteristics for 10 days you would expect 20 units to have exited the system.

Simulation: send packets according to exponential distribution

I am trying to build a network simulation (aloha like) where n nodes decide at any instant whether they have to send or not according to an exponential distribution (exponentially distributed arrival times).
What I have done so far is: I set a master clock in a for loop which ticks and any node will start sending at this instant (tick) only if a sample I draw from a uniform [0,1] for this instant is greater than 0.99999; i.e. at any time instant a node has 0.00001 probability of sending (very close to zero as the exponential distribution requires).
Can these arrival times be considered exponentially distributed at each node and if yes with what parameter?
What you're doing is called a time-step simulation, and can be terribly inefficient. Each tick in your master clock for loop represents a delta-t increment in time, and in each tick you have a laundry list of "did this happen?" possible updates. The larger the time ticks are, the lower the resolution of your model will be. Small time ticks will give better resolution, but really bog down the execution.
To answer your direct questions, you're actually generating a geometric distribution. That will provide a discrete time approximation to the exponential distribution. The expected value of a geometric (in terms of number of ticks) is 1/p, while the expected value of an exponential with rate lambda is 1/lambda, so effectively p corresponds to the exponential's rate per whatever unit of time a tick corresponds to. For instance, with your stated value p = 0.00001, if a tick is a millisecond then you're approximating an exponential with a rate of 1 occurrence per 100 seconds, or a mean of 100 seconds between occurrences.
You'd probably do much better to adopt a discrete-event modeling viewpoint. If the time between network sends follows the exponential distribution, once a send event occurs you can schedule when the next one will occur. You maintain a priority queue of pending events, and after handling the logic of the current event you poll the priority queue to see what happens next. Pull the event notice off the queue, update the simulation clock to the time of that event, and dispatch control to a method/function corresponding to the state update logic of that event. Since nothing happens between events, you can skip over large swatches of time. That makes the discrete-event paradigm much more efficient than the time step approach unless the model state needs updating in pretty much every time step. If you want more information about how to implement such models, check out this tutorial paper.

Controlling events in a hybrid Modelica model

I am confused by the hybrid modelling paradigm in Modelica. On one hand, events are useful, on the other hand, they are to be avoided. Let me explain my case:
I have a large model consisting of multiple buildings in a neighborhood that is simulated over 1 year. Initially, the model ran very slow. Adding noEvent() around as many if-conditions as possible drastically improved the speed.
As the development continued, the control of the model got more complicated, and I have again many events, sometimes at very short intervals. To give an idea:
Number of (model) time events : 28170
Number of (U) time events : 0
Number of state events : 22572
Number of step events : 0
These events blow up the output (for correct post-processing I need the variables at events) and slows the simulation. And moreover, I have the feeling that some of the noEvent(if...) lead to unexpected behavior.
I wonder if it would be a solution to force my events at certain time steps and prohibit them in between these time steps? Ideally, I would like to trigger these 'forced events' based on certain conditions. For example: during the day they should be every 15 minutes, at high solar radiation at every minute, during nights I don't want events at all.
Is this a good idea to do? I guess this will be faster as many of the state events will become time events? How can this be done with Modelica 3.2 (in Dymola)?
Thanks on beforehand for all answers.
Roel
A few comments.
First, if you have a simulation with lots of events (relative to the total duration of the simulation), the first thing I would encourage you to do is use a lower order integrator. The point here is that higher-order integrators normally allow you to take longer time steps. But if those steps are constantly truncated by events, they just end up being really expensive.
Second, you could try fixed-step integrators. Depending on the tool, they may implement this kind of "pool events and fire them all at once" kind of approach in the context of fixed-time step integrators. But the specification doesn't really say anything on how tools should deal with events that occur between fixed time steps.
Third, another way to approach this would be to "pool" your events yourself. The simplest way I could imagine doing this would be to take all the statements that currently generate events and wrap them in a "when sample(...,...) then" statement. This way, you could make sure that the events were only triggered at specific intervals. This would be more portable then the fixed time step approach. I think this is what you were actually proposing in your question but it is important to point out that it should not be based on time steps (the model has no concept of a time step) but rather on a model specified sampling interval (which will, in practice, be completely independent of time steps).
As you point out, using "sample(...,...)" will turn these into time events and, yes, this should be faster.