FlinkCEP: Can I reference an earlier event to define a subsequent match? - scala

Here is a simple example:
val pattern =
Pattern.begin[Event]("start").where(_.getId == 42).
next("middle").subtype(classOf[SubEvent]).where(x => x.getVolume == **first event matched**.getVolume) ...
Essentially the second event ("middle") need to access the state of the first event ("start"). Is it possible to do this within FlinkCEP without requiring an external state?

Sure. You can get events by for a specific pattern with the help of Context.
new IterativeCondition<Event>() {
private static final long serialVersionUID = 8061969839441121955L;
#Override
public boolean filter(Event value, IterativeCondition.Context<Event> ctx) throws Exception {
double sum = 0.0;
for (Event e : ctx.getEventsForPattern("middle")) {
sum += e.getPrice();
}
return sum > 5.0;
}
}

Related

How to do Geofence monitoring/analytics using KSQLDB?

I am trying to do geofence monitoring/analytics using KSQLDB. I want to get a message whenever a vehicle ENTERS/LEAVES a geofence. Taking inspiration from the [https://github.com/gschmutz/various-demos/tree/master/kafka-geofencing] I have created a UDF named as GEOFENCE, below is the code for the same.
Below is my query to perform join on geofence stream and live vehicle position stream
CREATE stream join_live_pos_geofence_status_1 AS SELECT lp1.vehicleid,
lp1.lat,
lp1.lon,
s1p.geofencecoordinates,
Geofence(lp1.lat, lp1.lon, 'POLYGON(('+s1p.geofencecoordinates+'))') AS geofence_status
FROM live_position_1 LP1
LEFT JOIN stream_1_processed S1P within 72 hours
ON kmdlp1.clusterid = kmds1p.clusterid emit changes;
I am taking into account all the geofences created in last 3 days.
I have created another query to use the geofence status from previous query to calculate whether the vehicle is ENTERING/LEAVING geofence.
CREATE stream join_geofence_monitoring_1 AS SELECT *,
Geofence(jlpgs1.lat, jlpgs1.lon, 'POLYGON(('+jlpgs1.geofencecoordinates+'))', jlpgs1.geofence_status) geofence_monitoring_status
FROM join_live_pos_geofence_status_1 JLPGS1 emit changes;
The above query give me the output as 'INSIDE', 'INSIDE' for geofence_status and geofence_monitoring_status columns, respectively or the output is 'OUTSIDE', 'OUTSIDE' for geofence_status and geofence_monitoring_status columns, respectively. I know I am not taking into account the time aspect, like these 2 queries should never be executed at same time say 't0' but I am not able to think the correct way of doing this.
public class Geofence
{
private static final String OUTSIDE = "OUTSIDE";
private static final String INSIDE = "INSIDE";
private static GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory();
private static WKTReader wktReader = new WKTReader(geometryFactory);
#Udf(description = "Returns whether a coordinate lies within a polygon or not")
public static String geofence(final double latitude, final double longitude, String geometryWKT) {
boolean status = false;
String result = "";
Polygon polygon = null;
try {
polygon = (Polygon) wktReader.read(geometryWKT);
// However, an important point to note is that the longitude is the X value
// and the latitude the Y value. So we say "lat/long",
// but JTS will expect it in the order "long/lat".
Coordinate coord = new Coordinate(longitude, latitude);
Point point = geometryFactory.createPoint(coord);
status = point.within(polygon);
if(status)
{
result = INSIDE;
}
else
{
result = OUTSIDE;
}
} catch (ParseException e) {
throw new RuntimeException(e.getMessage());
}
return result;
}
#Udf(description = "Returns whether a coordinate moved in or out of a polygon")
public static String geofence(final double latitude, final double longitude, String geometryWKT, final String statusBefore) {
String status = geofence(latitude, longitude, geometryWKT);
if (statusBefore.equals("INSIDE") && status.equals("OUTSIDE")) {
//status = "LEAVING";
return "LEAVING";
} else if (statusBefore.equals("OUTSIDE") && status.equals("INSIDE")) {
//status = "ENTERING";
return "ENTERING";
}
return status;
}
}
My question is how can I calculate correctly that a vehicle is ENTERING/LEAVING a geofence? Is it even possible to do with KSQLDB?
Would it be correct to say that the join_live_pos_geofence_status_1 stream can have rows that go from INSIDE -> OUTSIDE and then from OUTSIDE -> INSIDE for some key value?
And what you're wanting to do is to output LEAVING and ENTERING events for these transitions?
You can likely do what you want using a custom UDAF. Custom UDAFs take and input and calculate an output, via some intermediate state. For example, an AVG udaf would take some numbers as input, its intermediate state would be the number of inputs and the sum of inputs, and the output would be count/sum.
In your case, the input would be the current state, e.g. either INSIDE or OUTSIDE. The UDAF would need to store the last two states in its intermediate state, and then the output state can be calculated from this. E.g.
Input Intermediate Output
INSIDE INSIDE <only single in intermediate - your choice what you output>
INSIDE INSIDE,INSIDE no-change
OUTSIDE INSIDE,OUTSIDE LEAVING
OUTSIDE OUTSIDE,OUTSIDE no-change
INSIDE OUTSIDE,INSIDE ENTERING
You'll need to decide what to output when there is only a single entry in the intermediate state, i.e. the first time a key is seen.
You can then filter the output to remove any rows that have no-change.
You may also need to set cache.max.bytes.buffering to zero to stop any results being conflated.
UPDATE: suggested code.
Not tested, but something like the following code may do what you want:
#UdafDescription(name = "my_geofence", description = "Computes the geofence status.")
public final class GoeFenceUdaf {
private static final String STATUS_1 = "STATUS_1";
private static final String STATUS_2 = "STATUS_2";
#UdafFactory(description = "Computes the geofence status.",
aggregateSchema = "STRUCT<" + STATUS_1 + " STRING, " + STATUS_2 + " STRING>")
public static Udaf<String, Struct, String> calcGeoFenceStatus() {
final Schema STRUCT_SCHEMA = SchemaBuilder.struct().optional()
.field(STATUS_1, Schema.OPTIONAL_STRING_SCHEMA)
.field(STATUS_2, Schema.OPTIONAL_STRING_SCHEMA)
.build();
return new Udaf<String, Struct, String>() {
#Override
public Struct initialize() {
return new Struct(STRUCT_SCHEMA);
}
#Override
public Struct aggregate(
final String newValue,
final Struct aggregate
) {
if (newValue == null) {
return aggregate;
}
if (aggregate.getString(STATUS_1) == null) {
// First status for this key:
return aggregate
.put(STATUS_1, newValue);
}
final String lastStatus = aggregate.getString(STATUS_2);
if (lastStatus == null) {
// Second status for this key:
return aggregate
.put(STATUS_2, newValue);
}
// Third and subsequent status for this key:
return aggregate
.put(STATUS_1, lastStatus)
.put(STATUS_2, newValue);
}
#Override
public String map(final Struct aggregate) {
final String previousStatus = aggregate.getString(STATUS_1);
final String currentStatus = aggregate.getString(STATUS_2);
if (currentStatus == null) {
// Only have single status, i.e. first status for this key
// What to do? Probably want to do:
return previousStatus.equalsIgnoreCase("OUTSIDE")
? "LEAVING"
: "ENTERING";
}
// Two statuses ...
if (currentStatus.equals(previousStatus)) {
return "NO CHANGE";
}
return previousStatus.equalsIgnoreCase("OUTSIDE")
? "ENTERING"
: "LEAVING";
}
#Override
public Struct merge(final Struct agg1, final Struct agg2) {
throw new RuntimeException("Function does not support session windows");
}
};
}
}

How do I update a gtk listbox from an async method?

So when writing UI in GTK it's generally preferrable to handle reading of files, etc. in an Async Method. things such as listboxes, are generally bound to a ListModel, the items in the ListBox updated in accordance with the items_changed signal.
So if I have some class, that implements ListModel, and has an add function, and some FileReader that holds a reference to said ListModel, and call add from an async function, how do i make that in essence triggering the items_changed and having GTK update accordingly?
I've tried list.items_changed.connect(message("Items changed!")); but it never triggers.
I saw this: How can one update GTK+ UI in Vala from a long operation without blocking the UI
but in this example, it's just the button label that is changed, no signal is actually triggered.
EDIT: (Codesample added at the request of #Michael Gratton
//Disclaimer: everything here is still very much a work in progress, and will, as soon as I'm confident that what I have is not total crap, be released under some GPL or other open license.
//Note: for the sake of readability, I adopted the C# naming convention for interfaces, namely, putting a capital 'I' in front of them, a decision i do not feel quite as confident in as I did earlier.
//Note: the calls to message(..) was put in here to help debugging
public class AsyncFileContext : Object{
private int64 offset;
private bool start_read;
private bool read_to_end;
private Factories.IVCardFactory factory;
private File file;
private FileMonitor monitor;
private Gee.Set<IVCard> vcard_buffer;
private IObservableSet<IVCard> _vCards;
public IObservableSet<IVCard> vCards {
owned get{
return this._vCards;
}
}
construct{
//We want to start fileops at the beginning of the file
this.offset = (int64)0;
this.start_read = true;
this.read_to_end = false;
this.vcard_buffer = new Gee.HashSet<IVCard>();
this.factory = new Factories.GenericVCardFactory();
}
public void add_vcard(IVCard card){
//TODO: implement
}
public AsyncFileContext(IObservableSet<IVCard> vcards, string path){
this._vCards = vcards;
this._vCards = IObservableSet.wrap_set<IVCard>(new Gee.HashSet<IVCard>());
this.file = File.new_for_path(path);
this.monitor = file.monitor_file(FileMonitorFlags.NONE, null);
message("1");
//TODO: add connect
this.monitor.changed.connect((file, otherfile, event) => {
if(event != FileMonitorEvent.DELETED){
bool changes_done = event == FileMonitorEvent.CHANGES_DONE_HINT;
Idle.add(() => {
read_file_async.begin(changes_done);
return false;
});
}
});
message("2");
//We don't know that changes are done yet
//TODO: Consider carefully how you want this to work when it is NOT called from an event
Idle.add(() => {
read_file_async.begin(false);
return false;
});
}
//Changes done should only be true if the FileMonitorEvent that triggers the call was CHANGES_DONE_HINT
private async void read_file_async(bool changes_done) throws IOError{
if(!this.start_read){
return;
}
this.start_read = false;
var dis = new DataInputStream(yield file.read_async());
message("3");
//If we've been reading this file, and there's then a change, we assume we need to continue where we let off
//TODO: assert that the offset isn't at the very end of the file, if so reset to 0 so we can reread the file
if(offset > 0){
dis.seek(offset, SeekType.SET);
}
string line;
int vcards_added = 0;
while((line = yield dis.read_line_async()) != null){
message("position: %s".printf(dis.tell().to_string()));
this.offset = dis.tell();
message("4");
message(line);
//if the line is empty, we want to jump to next line, and ignore the input here entirely
if(line.chomp().chug() == ""){
continue;
}
this.factory.add_line(line);
if(factory.vcard_ready){
message("creating...");
this.vcard_buffer.add(factory.create());
vcards_added++;
//If we've read-in and created an entire vcard, it's time to yield
message("Yielding...");
Idle.add(() => {
_vCards.add_all(vcard_buffer);
vcard_buffer.remove_all(_vCards);
return false;
});
Idle.add(read_file_async.callback);
yield;
message("Resuming");
}
}
//IF we expect there will be no more writing, or if we expect that we read ALL the vcards, and did not add any, it's time to go back and read through the whole thing again.
if(changes_done){ //|| vcards_added == 0){
this.offset = 0;
}
this.start_read = true;
}
}
//The main idea in this class is to just bind the IObservableCollection's item_added, item_removed and cleared signals to the items_changed of the ListModel. IObservableCollection is a class I have implemented that merely wraps Gee.Collection, it is unittested, and works as intended
public class VCardListModel : ListModel, Object{
private Gee.List<IVCard> vcard_list;
private IObservableCollection<IVCard> vcard_collection;
public VCardListModel(IObservableCollection<IVCard> vcard_collection){
this.vcard_collection = vcard_collection;
this.vcard_list = new Gee.ArrayList<IVCard>.wrap(vcard_collection.to_array());
this.vcard_collection.item_added.connect((vcard) => {
vcard_list.add(vcard);
int pos = vcard_list.index_of(vcard);
items_changed(pos, 0, 1);
});
this.vcard_collection.item_removed.connect((vcard) => {
int pos = vcard_list.index_of(vcard);
vcard_list.remove(vcard);
items_changed(pos, 1, 0);
});
this.vcard_collection.cleared.connect(() => {
items_changed(0, vcard_list.size, 0);
vcard_list.clear();
});
}
public Object? get_item(uint position){
if((vcard_list.size - 1) < position){
return null;
}
return this.vcard_list.get((int)position);
}
public Type get_item_type(){
return Type.from_name("VikingvCardIVCard");
}
public uint get_n_items(){
return (uint)this.vcard_list.size;
}
public Object? get_object(uint position){
return this.get_item((int)position);
}
}
//The IObservableCollection parsed to this classes constructor, is the one from the AsyncFileContext
public class ContactList : Gtk.ListBox{
private ListModel list_model;
public ContactList(IObservableCollection<IVCard> ivcards){
this.list_model = new VCardListModel(ivcards);
bind_model(this.list_model, create_row_func);
list_model.items_changed.connect(() => {
message("Items Changed!");
base.show_all();
});
}
private Gtk.Widget create_row_func(Object item){
return new ContactRow((IVCard)item);
}
}
Heres the way i 'solved' it.
I'm not particularly proud of this solution, but there are a couple of awful things about the Gtk ListBox, one of them being (and this might really be more of a ListModel issue) if the ListBox is bound to a ListModel, the ListBox will NOT be sortable by using the sort method, and to me at least, that is a dealbreaker. I've solved it by making a class which is basically a List wrapper, which has an 'added' signal and a 'remove' signal. Upon adding an element to the list, the added signal is then wired, so it will create a new Row object and add it to the list box. That way, data is control in a manner Similar to ListModel binding. I can not make it work without calling the ShowAll method though.
private IObservableCollection<IVCard> _ivcards;
public IObservableCollection<IVCard> ivcards {
get{
return _ivcards;
}
set{
this._ivcards = value;
foreach(var card in this._ivcards){
base.prepend(new ContactRow(card));
}
this._ivcards.item_added.connect((item) => {
base.add(new ContactRow(item));
base.show_all();
});
base.show_all();
}
}
Even though this is by no means the best code I've come up with, it works very well.

To return a list from Database using a custom Item Reader in spring batch

I have a custom item reader to return a list of records from table.My job is running in an infinite loop as the reader contract is not met.Any suggestions on this pls?
public class customReader implements ItemReader<List<T>>{
#Autowired
customDao customDao;
static List<T> CCTransDlyLg = null;
#Override
public List<T> read() throws Exception {
if(CCTransDlyLg==null || (CCTransDlyLg!=null && CCTransDlyLg.size()==0)){
CCTransDlyLg=customDao.getList();
}
log.info("CCTransDlyLg List:"+CCTransDlyLg.size());
return CCTransDlyLg.size()==0 ? null : CCTransDlyLg;
}
You're list never changes. Assuming you read a list that is size 5, your return statement will always return that same list. The logic of your ItemReader looks like you only want to return a single list (aka one call to the read() method).
As per Spring Batch Reader contract your method will be called again and again till it returns null.In your code if customDao succeeds your list will be always of Same Size it will never be zero. You need some condition to break out of that loop and return null .This is one possible solution by using a variable called index to break out of that loop.
On other note i see Mike answered your question i learned spring batch from his book and video itself :)
public class customReader implements ItemReader<List<T>> {
private static List<T> CCTransDlyLg = null;
#Autowired
customDao customDao;
private int index = 0;
#Override
public List<T> read() throws Exception {
if (CCTransDlyLg == null || (CCTransDlyLg != null && CCTransDlyLg.size() == 0)) {
CCTransDlyLg = customDao.getList();
index = CCTransDlyLg.size() + 1;
}
log.info("CCTransDlyLg List:" + CCTransDlyLg.size());
return index > CCTransDlyLg.size() ? null : CCTransDlyLg;
}

How to set max no of records read in flatfileItemReader?

My application needs only fixed no of records to be read
& processed. How to limit this if I am using a flatfileItemReader ?
In DB based Item Reader, I am returning null/empty list when max_limit is reached.
How to achieve the same if I am using a org.springframework.batch.item.file.FlatFileItemReader ?
For the FlatFileItemReader as well as any other ItemReader that extends AbstractItemCountingItemStreamItemReader, there is a maxItemCount property. By configuring this property, the ItemReader will continue to read until either one of the following conditions has been met:
The input has been exhausted.
The number of items read equals the maxItemCount.
In either of the two above conditions, null will be returned by the reader, indicating to Spring Batch that the input is complete.
If you have any custom ItemReader implementations that need to satisfy this requirement, I'd recommend extending the AbstractItemCountingItemStreamItemReader and going from there.
The best approch is to write a delegate which is responsible to track down number of read records and stop after a fixed count; the components should take care of execution context to allow restartability
class CountMaxReader<T> implements ItemReader<T>,ItemStream
{
private int count = 0;
private int max = 0;
private ItemReader<T> delegate;
T read() {
T next = null;
if(count < max) {
next = delegate.read();
++count;
}
return next;
}
void open(ExecutionContext executionContext) {
((ItemStream)delegate).open(executionContext);
count = executionContext.getInt('count', 0);
}
void close() {
((ItemStream)delegate).close(executionContext);
}
void update(ExecutionContext executionContext) {
((ItemStream)delegate).update(executionContext);
executionContext.putInt('count', count);
}
}
This works with any reader.
public class CountMaxFlatFileItemReader extends FlatFileItemReader {
private int counter;
private int maxCount;
public void setMaxCount(int maxCount) {
this.maxCount = maxCount;
}
#Override
public Object read() throws Exception {
counter++;
if (counter >= maxCount) {
return null; // this will stop reading
}
return super.read();
}
}
Something like this should work. The reader stops reading, as soon as null is returned.

Case-insensitive indexing with Hibernate-Search?

Is there a simple way to make Hibernate Search to index all its values in lower case ? Instead of the default mixed-case.
I'm using the annotation #Field. But I can't seem to be able to configure some application-level set
Fool that I am ! The StandardAnalyzer class is already indexing in lowercase. It's just a matter of setting the search terms in lowercase too. I was assuming the query would do that.
However, if a different analyzer were to be used, application-wide, then it can be set using the property hibernate.search.analyzer.
Lowercasing, term splitting, removing common terms and many more advanced language processing functions are applied by the Analyzer.
Usually you should process user input meant to match indexed strings with the same Analyzer used at indexing; configuring hibernate.search.analyzer sets the default (global) Analyzer, but you can customize it per index, per entity type, per field and even on different entity instances.
It is for example useful to have language specific analysis, so to process Chinese descriptions with Chinese specific routines, Italian descriptions with Italian tokenizers.
The default analyzer is ok for most use cases, and does lowercasing and splits terms on whitespace.
Consider as well that when using the Lucene Queryparser the API requests you the appropriate Analyzer.
When using the Hibernate Search QueryBuilder it attempts to apply the correct Analyzer on each field; see also http://docs.jboss.org/hibernate/search/4.1/reference/en-US/html_single/#search-query-querydsl .
There are multiple way to make sort insensitive in string type field only.
1.First Way is add #Fields annotation in field/property on entity.
Like
#Fields({#Field(index=Index.YES,analyze=Analyze.YES,store=Store.YES),
#Field(index=Index.YES,name = "nameSort",analyzer = #Analyzer(impl=KeywordAnalyzer.class), store = Store.YES)})
private String name;
suppose you have name property with custom analyzer and sort on that. so it's not possible then you can add new Field in index with nameSort apply sort on that field.
you must apply Keyword Analyzer class because that is not tokeniz field and by default apply lowercase factory class in field.
2.Second way is that you can implement your comparison class on sorting like
#Override
public FieldComparator newComparator(String field, int numHits, int sortPos, boolean reversed) throws IOException {
return new StringValComparator(numHits, field);
}
Make one class with extend FieldComparatorSource class and implement above method.
Created new Class name with StringValComparator and implements FieldComparator
and implement following method
class StringValComparator extends FieldComparator {
private String[] values;
private String[] currentReaderValues;
private final String field;
private String bottom;
StringValComparator(int numHits, String field) {
values = new String[numHits];
this.field = field;
}
#Override
public int compare(int slot1, int slot2) {
final String val1 = values[slot1];
final String val2 = values[slot2];
if (val1 == null) {
if (val2 == null) {
return 0;
}
return -1;
} else if (val2 == null) {
return 1;
}
return val1.toLowerCase().compareTo(val2.toLowerCase());
}
#Override
public int compareBottom(int doc) {
final String val2 = currentReaderValues[doc];
if (bottom == null) {
if (val2 == null) {
return 0;
}
return -1;
} else if (val2 == null) {
return 1;
}
return bottom.toLowerCase().compareTo(val2.toLowerCase());
}
#Override
public void copy(int slot, int doc) {
values[slot] = currentReaderValues[doc];
}
#Override
public void setNextReader(IndexReader reader, int docBase) throws IOException {
currentReaderValues = FieldCache.DEFAULT.getStrings(reader, field);
}
#Override
public void setBottom(final int bottom) {
this.bottom = values[bottom];
}
#Override
public String value(int slot) {
return values[slot];
}
}
Apply sorting on Fields Like
new SortField("name",new StringCaseInsensitiveComparator(), true);