I have a need to incorporate retroactive events in my event stream and I'm not sure of the best way to implement it.
We need to keep the original event stream unchanged for audit and all of the other standard benefits. The event stream is also temporal in nature, giving us the ability to see the values for any point in history, e.g. the value of x was 10.00 at 5 pm June 1st. Occasionally we find out later, say on June 5, that the value of x was actually 12.00 at 5 pm June 1st. In this scenario we refer to 10.00 as the 'as-at' value and 12.00 as the 'as-of' value, and we track both of these in the event stream.
Rebuilding the state for the as-at value is a straightforward query: start from the most recent snapshot before 5 pm June 1st and apply all of the events through June 1st.
Where I am hesitant is in rebuilding the as-of state. If there is an as-of correction to the model then it should be used by default rather than the as-at value. However, I can't see any way to determine whether there is an as-of correction without reading the entire event stream from the point in time until the present. This can be large, and most of the changes will not matter, as they will relate to later changes and not to the point in time in question.
Is there a different approach I should be looking at here?
Thanks,
Chris
I think what you're referring to is a bitemporal data model. That is, you can answer not only "who won the US Presidential election in 2000", but "who did we think won the US Presidential election in the evening of election day in 2000".
In general, your event stream is not necessarily built to answer all your queries efficiently, and that includes bitemporal queries. It is simply a history of the facts you learned. If you learn today a fact about last year, it still belongs at the end of your event stream, but marked with the relevant dates.
The best way to query this data depends on what kinds of questions you want to answer. There are several nice papers on how to construct temporal and bitemporal database schemas, which would be populated by projectors feeding off your event stream.
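To make the projection idea a bit more concrete, here is a minimal JavaScript sketch. The event shape (`effectiveAt`, `recordedAt`, `value`) and the function names are assumptions made up for this example, not part of any particular event store. The stream stays append-only; the projector builds a read model ordered by effective time, so a point-in-time lookup no longer has to scan forward to the present.

```javascript
// Hypothetical event shape: { effectiveAt, recordedAt, value }, times as ISO-8601 UTC strings.
// effectiveAt = when the value applies; recordedAt = when we learned it.
const events = [
  { effectiveAt: '2023-06-01T17:00:00Z', recordedAt: '2023-06-01T17:00:00Z', value: 10.0 },
  { effectiveAt: '2023-06-03T09:00:00Z', recordedAt: '2023-06-03T09:00:00Z', value: 11.0 },
  // Retroactive correction learned on June 5 about June 1, appended at the end of the stream.
  { effectiveAt: '2023-06-01T17:00:00Z', recordedAt: '2023-06-05T12:00:00Z', value: 12.0 },
];

// Projector: fold the stream into a read model sorted by effective time, then recorded time.
function project(stream) {
  return [...stream].sort((a, b) =>
    a.effectiveAt.localeCompare(b.effectiveAt) || a.recordedAt.localeCompare(b.recordedAt));
}

// Value effective at `pointInTime`, using only what was known by `knownBy`.
function valueAt(readModel, pointInTime, knownBy) {
  let best;
  for (const e of readModel) {
    if (e.effectiveAt > pointInTime) break; // sorted, so nothing later can apply
    if (e.recordedAt <= knownBy) best = e;  // last match wins: latest effective, then latest recorded
  }
  return best ? best.value : undefined;
}

const readModel = project(events);
const asAt = valueAt(readModel, '2023-06-01T17:00:00Z', '2023-06-01T23:59:59Z'); // what we believed then
const asOf = valueAt(readModel, '2023-06-01T17:00:00Z', '2023-06-06T00:00:00Z'); // what we believe now
```

In a real system the read model would more likely be a database table indexed on both time columns rather than an in-memory array; the sketch only shows the shape of the query.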
I am new to Drools / Fusion (7.x) and am not sure how to solve this requirement. Assume I have event objects as Event{long: timestamp, id: string} where id identifies a physical asset (like a tractor) and timestamp represents the time the event fired relative to the asset. In my scenario these Events do not arrive in my system in 'real-time', meaning they can be seconds, minutes or even days late. And my rules system needs to monitor multiple assets. Given this, when rules are evaluated the clock needs to be relative to the asset being monitored; it can't be a clock that spans assets.
I'm aware of Pseudo Clock, is there a way to assign Pseudo clocks per Asset?
My assumption is that a clock must always progress forward or temporal functions will not work properly. Take for example the following scenario:
Fact A for Asset 1 arrives at 1:00; it is inserted into memory and rules are fired. Then Fact B arrives for the same Asset 1 at 2:00. It too is inserted and rules are fired. Now Fact Z arrives for Asset 2 at 1:30 (30 minutes behind the clock). I'm assuming I shouldn't simply move the clock backwards and evaluate; furthermore I'd want to set the clock back to 2:00 since that was the "latest" data I received. Now assume I am monitoring thousands of assets, all sending data at different times...
The best way I can think to address this is to keep a clock per asset and then save the engine state when each asset's data is evaluated. Can individual KieSessions have different clocks, or is it at a container level?
Sample rule: When Fact 1 arrives after Fact 2 for the same Asset.
You're approaching the problem incorrectly. Regardless of whether you're using a realtime or pseudo clock, you're using a clock. You can't say "Fact #1 use clock A, and Fact #2 use clock B."
Instead you should be leveraging the metadata tags for events, specifically the @timestamp tag. This tag indicates to Drools that a specific field inside of the event is actually the timestamp for the Event, rather than the actual time the fact enters working memory.
For example:
import com.example.SampleEvent
declare SampleEvent
    @role( event )
    // this field is actually in the object, it's not the time the fact was inserted
    @timestamp( createdDateTime )
end
Not knowing anything about what your rules are actually doing, the major issue I can foresee here is that if your rules rely on the temporal operators or define an expiry (@expires), they're not going to work and you'll need to redesign them. Especially for expirations: once an event expires, it is removed from working memory; when your out-of-band events come in, any previously expired events are already gone and can't be worked against.
Of course that concern would be true regardless of whether you use @timestamp or your original "different pseudo clock" plan. Either way you're going to have to manage the fact that events cannot live forever in working memory: you will eventually run out of resources and your system will crash. Events must be evicted at some point, so you'll need to design around that in both your models and your rules.
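To illustrate the design problem (not the Drools solution), here is a rough sketch in plain JavaScript. None of it is Drools API; the class name, the event shape and the retention policy are all assumptions invented for the example. It orders events by their own timestamp per asset, reports late arrivals, and evicts relative to each asset's newest event so that assets never share a clock.

```javascript
// Plain JavaScript illustration only; none of this is Drools API.
// Hypothetical event shape: { id: assetId, timestamp: epochMillis }.
class AssetTimeline {
  constructor(retentionMs) {
    this.retentionMs = retentionMs;
    this.byAsset = new Map(); // assetId -> events sorted by event timestamp
  }

  // Returns true when the event is older than something already seen for the same asset.
  insert(event) {
    const events = this.byAsset.get(event.id) ?? [];
    const latest = events.length ? events[events.length - 1].timestamp : -Infinity;
    const outOfOrder = event.timestamp < latest;

    // Keep the list ordered by event time, not arrival time.
    const pos = events.findIndex((e) => e.timestamp > event.timestamp);
    if (pos === -1) events.push(event); else events.splice(pos, 0, event);

    // Evict relative to this asset's own newest event.
    const cutoff = Math.max(latest, event.timestamp) - this.retentionMs;
    this.byAsset.set(event.id, events.filter((e) => e.timestamp >= cutoff));
    return outOfOrder;
  }
}
```

Note the trade-off the sketch makes visible: an event that arrives later than the retention window is dropped straight away, which is the same "already expired" problem described above.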
I've lost track of how many questions & responses I've read while trying to find an answer on this. The ones that sound like they're related often aren't, or else the users are just accused of being confused. (As an example, the first answer here just tells the person asking the question that they don't really mean to be asking what they're asking. Then there's this one. The most upvoted answer here says it's just impossible. Etc., etc., etc.)
I need to be able to take a time--say 9:00 AM--and work with it as 9:00 AM regardless of which timezone my user is in. If a user pulls up this time in a US/Eastern timezone, they should see this value as 9:00 AM. If a user pulls up this time in a US/Pacific timezone, they should see this value as 9:00 AM. I recognize that this is not actually the same moment in time, and I don't need it to be.
To illustrate, let's call the timezone-immune timestamp I'm talking about timezoneImmuneTimestamp, and say that its value should always be 9:00 AM.
Say I'm executing someMomentInUsEasternTimezone.diff(timezoneImmuneTimestamp, 'minutes'), where someMomentInUsEasternTimezone is equal to 10:00AM (EST). The answer I need is 60 minutes.
Now let's add another Moment, someMomentInUsPacificTimezone, and say its value is 11:00AM (PST). When I execute someMomentInUsPacificTimezone.diff(timezoneImmuneTimestamp, 'minutes'), the answer I need is 120 minutes.
Has anyone else had this particular problem, and more importantly, solved it?
It sounds like you want to work with only the "wall time" of each moment object. To do that, first create a clone of each moment and set their offsets to zero, as if they were UTC. When doing so, pass true to keep the wall time instead of the same point in actual universal time.
This is described in the docs for the utcOffset function:
The utcOffset function has an optional second parameter which accepts
a boolean value indicating whether to keep the existing time of day.
Passing false (the default) will keep the same instant in Universal Time, but the local time will change.
Passing true will keep the same local time, but at the expense of choosing a different point in Universal Time.
Thus, to get the difference in minutes between momentA and momentB with respect only to wall time:
momentA.clone().utcOffset(0, true).diff(momentB.clone().utcOffset(0, true), 'minutes')
Though missing from the docs, the same argument can be passed to the utc function. So if you prefer, you can shorten it to:
momentA.clone().utc(true).diff(momentB.clone().utc(true), 'minutes')
(Cloning helps the rest of your code by not mutating the original moment objects.)
Also - The Moment team highly recommends only using Moment in legacy/existing code. If you are writing a new application, please try Luxon instead. In Luxon, the setZone function has a keepLocalTime option that does the same thing as I showed in Moment.
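If it helps to see what "keeping the wall time" means without either library, here is a rough sketch using only the built-in Intl.DateTimeFormat. The helper names are made up for this example, and the dates are arbitrary ones chosen to match the 10:00 AM Eastern / 11:00 AM Pacific scenario from the question. The idea is the same as above: read the wall clock in each zone, pretend that reading is UTC, and diff the results.

```javascript
// Returns the wall-clock reading of `instant` in `timeZone`, re-expressed as if it were UTC.
// (Hypothetical helper name; only the built-in Intl API is used.)
function wallClockAsUtc(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone, hourCycle: 'h23',
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', second: '2-digit',
  }).formatToParts(instant);
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  return Date.UTC(get('year'), get('month') - 1, get('day'), get('hour'), get('minute'), get('second'));
}

// Difference in minutes between two wall-clock readings, ignoring their zones.
function wallClockDiffMinutes(instantA, zoneA, instantB, zoneB) {
  return (wallClockAsUtc(instantA, zoneA) - wallClockAsUtc(instantB, zoneB)) / 60000;
}

const immune = new Date('2021-01-15T09:00:00Z');  // "9:00 AM", stored as a UTC reading
const eastern = new Date('2021-01-15T15:00:00Z'); // 10:00 AM in America/New_York
const pacific = new Date('2021-01-15T19:00:00Z'); // 11:00 AM in America/Los_Angeles

const easternDiff = wallClockDiffMinutes(eastern, 'America/New_York', immune, 'UTC');
const pacificDiff = wallClockDiffMinutes(pacific, 'America/Los_Angeles', immune, 'UTC');
```

With those inputs easternDiff comes out as 60 and pacificDiff as 120, which are the two answers asked for in the question.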
I'm trying to get Dialogflow to report sales in different time periods. When I ask things like
what's the sale last month
what's the sale in march 2011
They work fine. However, if I don't specify "last" or a year, Dialogflow always thinks I'm asking about the upcoming period (which I guess works great for restaurant booking, etc.). For example, "what was our sale in april" would guess that I mean April of next year.
Should I write a fulfillment to deal with this, or is there a way to restrict matching to historical periods only?
A related problem is that it has trouble understanding notations like 2010q1 or q2/2015, which are quite common in economics and finance. "q1/2000" would extract only q1 (correctly, but resolved to the upcoming q1). "2000q1" wouldn't be recognized at all.
Thanks!
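If the fulfillment route is taken, the post-processing might look roughly like the sketch below. It is plain JavaScript with no Dialogflow API calls; the function names and the assumption that the raw query text and a resolved start date are available to the handler are mine, for illustration only. It covers the two problems described: parsing quarter notation from the raw text, and pulling a period that was resolved into the future back into the past.

```javascript
// Hypothetical helpers for a fulfillment handler; no Dialogflow API is used here.

// Parses "2010q1", "2010 Q1", "q2/2015", "Q2 2015" into { year, quarter }, or null.
function parseQuarter(text) {
  const s = text.trim().toLowerCase();
  const m = s.match(/^(\d{4})\s*q([1-4])$/) || s.match(/^q([1-4])[\/\s-]+(\d{4})$/);
  if (!m) return null;
  const yearFirst = m[1].length === 4;
  return { year: Number(yearFirst ? m[1] : m[2]), quarter: Number(yearFirst ? m[2] : m[1]) };
}

// Converts a quarter into a [start, end) range of ISO timestamps in UTC.
function quarterRange({ year, quarter }) {
  const start = new Date(Date.UTC(year, (quarter - 1) * 3, 1));
  const end = new Date(Date.UTC(year, quarter * 3, 1)); // month 12 rolls over to next January
  return { start: start.toISOString(), end: end.toISOString() };
}

// Moves a period start that was resolved into the future back by whole years.
function preferPast(periodStart, now) {
  const d = new Date(periodStart);
  while (d > now) d.setUTCFullYear(d.getUTCFullYear() - 1);
  return d;
}
```

For example, parseQuarter('q2/2015') gives { year: 2015, quarter: 2 }, and a "sale in april" query that was resolved to next April can be passed through preferPast before the sales lookup runs.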
Problem
Some recurring events that have no fixed end (like club meetings) depend on other conditions (like the holiday season). However, these exceptions would have to be added manually every year, as the dates might differ.
Research
I have found out about exdate (see the image of "iCalendar components and their properties" on Wikipedia).
I also found a possible workaround: 'just writing a script to process such events'. This would still mean I need to process a .ics file manually and import it into my calendar, which implies some limitations:
can not be determined for all time spans (e.g. holidays not fixed for more than three years)
these events would probably be separate and not reoccurring/'grouped', which makes further edits harder
Question
Is there a way to specify recurring exceptions in iCal?
To clarify, I have a recurring event and recurring exceptions.
So for instance I have an infinitely recurring weekly event that depends on the month; it might only take place if it's not e.g. January, August, or December.
Is there a way to use another event (or calendar) to filter events by boolean logic?
If one could use a second event (or several) to plug into exdate this would solve the first problem and add some more possibilities.
note
if this question is too specific and the original problem could be solved by other means (other calendar-formats), feel free to comment/edit/answer
RFC2445 defines an EXRULE (exception rule) property. You can use that in addition to the RRULE to define recurring exceptions.
However, RFC2445 was superseded by RFC5545, which unfortunately deprecates the EXRULE property. So, client support is questionable.
As you already proposed, automatically adding EXDATE properties is a possible solution.
BYMONTH would be another possibility. For example, here's a rule for a club meeting that occurs on the first Wednesday of every month except December (which is their Christmas party, so no business meeting):
RRULE:FREQ=MONTHLY;BYDAY=1WE;BYMONTH=1,2,3,4,5,6,7,8,9,10,11
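For exceptions that can't be expressed with BYMONTH (holidays whose dates move each year), the "automatically adding EXDATE" route could be scripted along these lines. This is only a sketch: the helper names and the shape of the excluded ranges are assumptions, it works in UTC with fixed 7-day steps (so it ignores daylight-saving shifts for events defined in local time), and it does not fold long lines as the iCalendar spec asks.

```javascript
// Formats a Date as an iCalendar UTC DATE-TIME, e.g. 20240103T180000Z.
function toICalUtc(date) {
  return date.toISOString().replace(/[-:]/g, '').replace(/\.\d{3}Z$/, 'Z');
}

// Hypothetical helper: builds an EXDATE line for every weekly occurrence that
// falls inside one of the given [start, end) ranges (e.g. this year's holidays).
function buildExdate(firstOccurrence, weeks, excludedRanges) {
  const WEEK_MS = 7 * 24 * 60 * 60 * 1000;
  const skipped = [];
  for (let i = 0; i < weeks; i++) {
    const occurrence = new Date(firstOccurrence.getTime() + i * WEEK_MS);
    if (excludedRanges.some((r) => occurrence >= r.start && occurrence < r.end)) {
      skipped.push(toICalUtc(occurrence));
    }
  }
  return skipped.length ? 'EXDATE:' + skipped.join(',') : '';
}

// Weekly Wednesday meeting at 18:00 UTC, with a holiday break over the new year.
const exdateLine = buildExdate(
  new Date('2024-12-11T18:00:00Z'),
  4,
  [{ start: new Date('2024-12-20T00:00:00Z'), end: new Date('2025-01-02T00:00:00Z') }],
);
```

The resulting line can be written next to the RRULE of the same VEVENT, which keeps the series as one recurring event rather than a set of separate ones.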
Let's say we have a reactive sales forecasting system.
Every time we make a sale we re-calculate our Forecast for future sales.
This works beautifully if there are lots of sales triggering our re-forecasting.
What happens, however, if sales go from 100 events per second to 0, and stay at 0 for a long time?
The forecast we published back when sales were good remains the most up-to-date forecast.
In this situation, how would you model an event that represents 'no sales happening' without falling back to some batch hourly/minutely/arbitrary time segment event that says 'X time has passed'?
This is a specific case of a generic question - How do you model time passing with nothing happening in an event based system - without using a ticking clock style event which would wake everyone up to reconsider their current values [an implementation which would not scale].
The only option I have considered that makes sense:
Every time we take a sale, we also schedule a deferred event 2 hours in the future that asks us to reconsider our assessment of that sale.
In handling that deferred event we may then choose to schedule further deferred events for re-considering.
Considering this is a very generic scenario, you've made a rather large assumption that it's not possible to come up with a design for re-evaluating past sales in a scalable way unless it's done one sale at a time.
There are many different scale related numbers in the scenario, and you're only looking at the one whereby a single scheduled forecast updater may attempt to process a very large number of past sales at the same time.
Other scalability issues I can think of:
Reevaluating the forecast for every single new sale doesn't sound great if you're expecting 100s of sales per second. If you're talking about a financial forecasting model for accounting, it's unlikely it needs to be updated every single time the organisation makes a sale, if the organisation is making hundreds of sales a second.
If you're talking about a short-term predictive engine to be used for financial markets (i.e. predicting how much cash you'll need in the next 10 seconds, or energy, or other resources), then it sounds like you have constant volatility and you're not really likely to have a situation where nothing happens for hours. And if you do need forecasts updated very frequently, waiting a couple of hours before triggering a re-update is not likely to get you the kind of information you need in the way you need it.
With your approach, you will end up with one future scheduled event per product (which could be large), and every time you make a sale, you'll be dropping the old scheduled event and scheduling a new one. So for frequently selling products, you'll be doing repetitive work to constantly kick the can down the road a bit further, when you're not likely to ever get there.
What constitutes a good design is going to be based on the real scenario. The generic case is interesting to think about, but good designs need to be shaped to their circumstances.
Here are a few ideas I have that might be appropriate:
If you want an updated forecast per product when that product has a sale, but some products can sell very frequently, then a good approach may be to throttle or buffer the sales on a per product basis. If a product is selling 50 times a second, you can probably afford to wait 1 second, 10 seconds, 2 hours, whatever and evaluate all those sales at once, rather than re-forecasting 50 times a second. Especially if your forecasting process is heavy, doing it for every sale is likely to cause high load for low value, as the information will be outdated almost straight away by the next sale.
You could also have a generic timer that updates forecasts for all products that haven't sold in the last window, but handle the products in buffers. For example, every hour you could pick the 10 products with the oldest forecasts and update them. This prevents the single timer from taking on re-forecasting the entire product set in one hit.
You could use only the single timer approach above and forget the forecast updates on every sale if you want something dead simple.
If you're worried about load from batch forecasting, this kind of work should be done on different hardware from the machines handling sales.
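The buffering and "oldest forecasts first" ideas above can be sketched together roughly as follows. Everything here is hypothetical: the class name, the method names and the injected reforecast callback are invented for illustration, and the timers that would call flushBuffered and refreshStalest are left to whatever scheduler you already have.

```javascript
// Hypothetical sketch: names and shapes are invented for illustration.
class ForecastScheduler {
  constructor(reforecast) {
    this.reforecast = reforecast;    // (productId, sales) => void, supplied by the caller
    this.pendingSales = new Map();   // productId -> sales buffered since the last flush
    this.lastForecastAt = new Map(); // productId -> time of the last forecast
  }

  // Called per sale: only buffers, never forecasts directly.
  recordSale(productId, sale) {
    if (!this.pendingSales.has(productId)) this.pendingSales.set(productId, []);
    this.pendingSales.get(productId).push(sale);
    if (!this.lastForecastAt.has(productId)) this.lastForecastAt.set(productId, 0);
  }

  // Called on a short timer: one forecast per product that sold, however many sales it had.
  flushBuffered(now) {
    for (const [productId, sales] of this.pendingSales) {
      this.reforecast(productId, sales);
      this.lastForecastAt.set(productId, now);
    }
    this.pendingSales.clear();
  }

  // Called on a slow timer: refresh only the `limit` products with the oldest forecasts.
  refreshStalest(now, limit) {
    const stalest = [...this.lastForecastAt.entries()]
      .sort((a, b) => a[1] - b[1])
      .slice(0, limit);
    for (const [productId] of stalest) {
      this.reforecast(productId, []);
      this.lastForecastAt.set(productId, now);
    }
  }
}
```

A product selling 50 times a second costs one forecast per flush, and a product with no sales is still picked up eventually by refreshStalest without a per-sale deferred event. Sorting every product on each refresh is fine for a sketch; with a very large catalogue a priority queue would be the more likely choice.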