How to create an Elastic watch that can identify changes to data in a given index of Elasticsearch

On the official site of Elastic Watcher, they say:
Watcher is a plugin for Elasticsearch that provides alerting and notification based on changes in your data
The relevant data or changes in data can be identified with a periodic Elasticsearch query
What I want is something like a MySQL trigger, that is, when a record is updated, an action is triggered.
But I didn't find an example or documentation addressing this use case. Can anybody tell me how to do this?

You define an input of type search, where indices specifies which indices to look at and body holds the actual query. If you need other settings, there are many more things you can configure. After this, you define a condition and an action to complete the flow.
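For illustration, here is a minimal sketch of such a watch, registered over the REST API from Python. The index name my-index, the updated_at timestamp field, the 10-minute interval and the logging action are assumptions to adapt to your data; the watch endpoint path also differs between Watcher/X-Pack versions.

    import requests

    # Sketch of a watch: every 10 minutes, search the index for documents whose
    # (assumed) "updated_at" field changed in the last 10 minutes, and log a
    # message if any were found.
    watch = {
        "trigger": {"schedule": {"interval": "10m"}},
        "input": {
            "search": {
                "request": {
                    "indices": ["my-index"],  # assumed index name
                    "body": {
                        "query": {"range": {"updated_at": {"gte": "now-10m"}}}  # assumed field
                    },
                }
            }
        },
        "condition": {"compare": {"ctx.payload.hits.total": {"gt": 0}}},
        "actions": {
            "log_changes": {
                "logging": {"text": "{{ctx.payload.hits.total}} documents changed in the last 10 minutes"}
            }
        },
    }

    # Register the watch; the URL prefix varies by version
    # (e.g. /_watcher/watch/... or /_xpack/watcher/watch/...).
    resp = requests.put("http://localhost:9200/_watcher/watch/changed-docs", json=watch)
    print(resp.status_code, resp.json())

The condition only fires the action when the query found at least one hit, which is the closest Watcher equivalent of "trigger on change": you poll for documents whose timestamp moved inside the window.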
Make an attempt and create a watch. If you have difficulties, provide details of what you tried in a different SO post (your current post is not really appropriate for SO, since you are asking for complete code without having tried anything).

Related

How to quickly locate which sheets/dashboards contain a field?

I am creating a data dictionary and I am supposed to track the location of any used field in a workbook. For example (superstore sample data), I need to specify which sheets/dashboards have the [sub-category] field.
My dataset has hundreds of measures/dimensions/calculated fields, so it's incredibly time-consuming to click into every single sheet/dashboard just to see if a field exists in there. Is there a quicker way to do this?
One robust, but not free, approach is to use Tableau's Data Catalog, which is part of the Tableau Server Data Management Add-On.
Another option is to build your own cross-reference. You could start with Chris Gerrard's Ruby libraries described in the article http://tableaufriction.blogspot.com/2018/09/documenting-dashboards-and-their.html
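If you go the build-your-own route and don't want to depend on the Ruby libraries, a .twb workbook is plain XML (a packaged .twbx needs to be unzipped first), so a small script can produce the cross-reference. Here is a rough Python sketch; the file name, the field caption and the simple text-match approach are all assumptions, so treat it as a starting point rather than a finished data dictionary.

    import xml.etree.ElementTree as ET

    WORKBOOK = "Superstore.twb"   # assumed file name; unzip a .twbx to get the .twb
    FIELD = "Sub-Category"        # field caption to look for

    root = ET.parse(WORKBOOK).getroot()

    # Worksheets live under <worksheet name="...">; a crude but workable check is
    # to look for the field caption anywhere in each worksheet's serialized XML
    # (shelves, filters, datasource dependencies, ...).
    sheets = []
    for ws in root.iter("worksheet"):
        if FIELD.lower() in ET.tostring(ws, encoding="unicode").lower():
            sheets.append(ws.get("name"))

    # Dashboards reference worksheets by name in their zones, so a dashboard
    # "contains" the field if any of its zones point at one of those sheets.
    dashboards = []
    for db in root.iter("dashboard"):
        zones = ET.tostring(db, encoding="unicode")
        if any(sheet and sheet in zones for sheet in sheets):
            dashboards.append(db.get("name"))

    print("Sheets using", FIELD, ":", sheets)
    print("Dashboards using", FIELD, ":", dashboards)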

Real-time data streaming using Wikipedia's RecentChanges API

Lately I've been trying to create a demo of real-time streaming using NiFi -> Kafka -> Druid -> Superset. For the purposes of this demo I chose to use Wikipedia's RecentChanges API in order to get asynchronous data about the most recent changes.
I use this URL in order to get a response of changes. I'm calling the API constantly so that I don't miss any changes, and this way I get a lot of duplicates that I do not want. Is there any way to parameterize this API to fix that, for example getting all the changes from the previous second and doing that every second, or is there something else that could tackle this issue? I'm trying to build a configuration for this using NiFi; if someone has something to add on that part, please visit this discussion on Cloudera.
Yes. See https://en.wikipedia.org/w/api.php?action=help&modules=query%2Brecentchanges Use rcstart and rcend to define your start and end times. You can use "now" for rcend.
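For reference, a single windowed request can look like the Python sketch below (the timestamps, rclimit and rcprop values are just example choices); with rcdir=newer, rcstart is the older bound and rcend the newer one, and rcend accepts the literal value "now".

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    params = {
        "action": "query",
        "list": "recentchanges",
        "rcstart": "2023-01-01T12:00:00Z",   # example start of the window
        "rcend": "now",                      # end of the window ("now" is accepted)
        "rcdir": "newer",                    # list oldest-to-newest within the window
        "rcprop": "title|ids|timestamp",
        "rclimit": "500",
        "format": "json",
    }

    resp = requests.get(API, params=params, timeout=30)
    for change in resp.json()["query"]["recentchanges"]:
        print(change["timestamp"], change["title"])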
I want to expand on smartse's answer and come up with a solution. You want to put your API requests into certain time windows by shifting the start and end parameters. Windowing might work like this:
Initialize start, end timestamp parameters
Put those parameters as attributes on the flow
Downstream processors can call the API using those parameters
After doing that, you have to set start = previous_end + 1 second and end = now
When you determine the new window for the next run, you need the parameters from the previous run. This is why you have to remember those values. You can achieve this using NiFi's distributed map cache.
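Outside of NiFi, the same windowing logic looks roughly like the Python sketch below; the in-memory dictionary stands in for the distributed map cache, and the initial date and poll interval are assumptions.

    import time
    from datetime import datetime, timedelta, timezone

    # Stand-in for NiFi's distributed map cache: remembers the previous window's end.
    cache = {"previous_end": None}

    INITIAL_START = datetime(2023, 1, 1, tzinfo=timezone.utc)    # assumed initial date

    def next_window():
        """Compute the [start, end] window for the next run."""
        end = datetime.now(timezone.utc)                         # end is always "now"
        if cache["previous_end"] is None:
            start = INITIAL_START                                # first run
        else:
            start = cache["previous_end"] + timedelta(seconds=1) # previous end + 1 second
        cache["previous_end"] = end                              # remember for the next run
        return start, end

    while True:
        start, end = next_window()
        print("fetch changes from", start.isoformat(), "to", end.isoformat())
        # ... call the RecentChanges API here with rcstart=start, rcend=end, rcdir=newer ...
        time.sleep(10)                                           # assumed poll interval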
I've assembled a flow for you:
Zoom into Get next date range:
The end parameter is always now, so you just have to store the start parameter. FetchDistributedMapCache will fetch that for you and put it into the stored.state attribute:
Set time range processor will initialize the parameters:
Notice that end is always now and start is either an initial date (for the first run) or the last end parameter plus 1 second. At this point the flow is directed into the Time range output, where you can call your API downstream. Additionally you have to update the stored.value. This happens in the ReplaceText processor:
Finally you update the state:
The lifecycle of the parameters is bound to the cache identifier. When you change the identifier, you start from scratch.

Data store in an actor system

I'm working on an event processing pipeline based on Akka actors. I have three actors, one for each step of the pipeline: FilterWorker, EnrichWorker and ProcessWorker; plus a supervisor actor that makes sure the events are sent from one step of the pipeline to the next.
The enrich step might need to query some external database for extra data or even create new data that I'll want to persist. For example, the enrich step of a web analytics system might want to enrich a click event with the user that made the click and store that user information in a database.
Keeping in mind that example, I see the following options:
1. Use a singleton, e.g. a UserStore that keeps in memory all the users gathered so far, saves them to the database once in a while, and has all the logic to fetch users that are not yet in memory. However, using a singleton in an actor system doesn't seem like a good idea (?).
2. Use a store actor. Use tell to add a new user and ask to fetch one.
Is there a better pattern for this?
Thanks!
In order to not leave this unanswered, I went with my second option and johanandren's suggestion of having an Actor fill the data store role. Works pretty well!
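For anyone curious what that second option looks like in outline, here is a rough, non-Akka Python sketch of the store-actor idea: one task owns the user store and everything else goes through its mailbox, with tell-style puts and ask-style gets. The UserStore name, the in-memory dictionary and the message shapes are illustrative only; a real Akka version would use an actor with a typed protocol and persist to the database.

    import asyncio

    class UserStore:
        """One task owns the store; others talk to it only through the mailbox,
        so there is no shared mutable state."""

        def __init__(self):
            self.mailbox = asyncio.Queue()
            self.users = {}                    # users gathered so far (in memory)

        async def run(self):
            while True:
                msg = await self.mailbox.get()
                if msg[0] == "put":            # tell-style: fire and forget
                    _, user_id, user = msg
                    self.users[user_id] = user
                elif msg[0] == "get":          # ask-style: reply through a future
                    _, user_id, reply = msg
                    # A real store would fall back to the database on a miss.
                    reply.set_result(self.users.get(user_id))

        def tell_put(self, user_id, user):
            self.mailbox.put_nowait(("put", user_id, user))

        async def ask_get(self, user_id):
            reply = asyncio.get_running_loop().create_future()
            await self.mailbox.put(("get", user_id, reply))
            return await reply

    async def main():
        store = UserStore()
        asyncio.create_task(store.run())
        store.tell_put("u1", {"name": "Alice"})    # enrich step stores a user
        print(await store.ask_get("u1"))           # another step asks for it

    asyncio.run(main())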

Visio 2013: How to trigger a change in databinding of all shapes

I have a nice process overview for our ordering process in Visio. I have an external data source (SQL Server), which works fine. Every record in my data source represents one ordering process. Currently all my shapes of the process are linked to the first record of the data source.
Now I want to add a dynamic behavior. What I want to achieve is this:
A user provides the order reference in a textbox (order reference is a column in the data source)
Afterwards the user clicks a button
After the button click, the process is updated and all shapes are now linked to the external data source record that matches the provided order reference
So in short: the user should be able to select which process that needs to be visualized.
I assume that this is common functionality, but I don't see how I can deal with this requirement. I've already spent some days searching on this issue, but without any success.
Can you help me with this issue?
Thanks a lot!
Problem solved :-)
Some old school VBA was required. Using the DataRecordSet object did the trick. It contains a method GetDataRowIDs that you can use to query the external dataset. Once you have the record to visualize, it's just a matter of dynamically updating the shapes with the correct record. Use macro recording to see how to do this.
MSDN: http://msdn.microsoft.com/en-us/library/office/ms195694(v=office.12).aspx
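The same object model is also reachable from outside VBA over COM. Below is a rough Python sketch using win32com; the column name [Order Reference], the criteria quoting and the idea of relinking every shape on the active page are assumptions, so compare them against what the macro recorder produces for your document.

    import win32com.client

    order_ref = "ORD-12345"   # value the user typed into the textbox (assumed)

    visio = win32com.client.GetActiveObject("Visio.Application")
    doc = visio.ActiveDocument
    page = visio.ActivePage

    # Assume the drawing has a single external data recordset (index 1).
    recordset = doc.DataRecordsets.Item(1)

    # GetDataRowIDs takes a criteria string; the exact column name and quoting
    # depend on your data source, so adjust this line as needed.
    row_ids = recordset.GetDataRowIDs('[Order Reference] = "%s"' % order_ref)

    if row_ids:
        row_id = row_ids[0]
        for shape in page.Shapes:
            try:
                shape.LinkToData(recordset.ID, row_id)   # relink shape to the matching row
            except Exception:
                pass   # some shapes (e.g. connectors) may not accept data links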

Determining whether MongoDB's save method really updates a record or not

My question is as stated in the title. When a request comes to my service to update a related record in MongoDB, we use the "save" method.
However, I would like to understand whether the save method really updates the record or not.
In other words, I would like to know whether the content being saved is the same as the existing content in MongoDB. Accordingly, even if the save method executes without any errors, is it possible to tell whether the record was really updated or not?
Thanks in advance
There are several ways to check this.
The first, after calling save, is to call the getLastError method. Within the console this is just db.getLastError().
This will tell you if an error occurred during the last operation. More details can be found at the following address: http://docs.mongodb.org/manual/core/write-operations/#write-concern.
Another way would be to call findAndModify; this will allow you to update the document and get either the original or the updated document back.
http://docs.mongodb.org/manual/reference/command/findAndModify/
Both of these are available in all of the official drivers.
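As a sketch of the findAndModify route in one of the drivers, PyMongo's find_one_and_replace can hand back the document as it was before the write, so you can compare it with what you just saved; the connection string, collection and fields below are assumptions.

    from pymongo import MongoClient, ReturnDocument

    client = MongoClient("mongodb://localhost:27017")     # assumed connection string
    coll = client["mydb"]["orders"]                       # assumed database/collection

    new_doc = {"_id": 42, "status": "shipped", "qty": 3}  # the content you want to save

    # findAndModify-style call: replace the document and return it as it was
    # *before* the write, so you can see whether anything actually changed.
    before = coll.find_one_and_replace(
        {"_id": new_doc["_id"]},
        new_doc,
        upsert=True,
        return_document=ReturnDocument.BEFORE,
    )

    if before is None:
        print("document was inserted; it did not exist before")
    elif before == new_doc:
        print("write happened, but the content is identical to what was already stored")
    else:
        print("document was updated; previous version:", before)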
The save method always writes the record.
There is no situation in Mongo where the write would not happen because the record that is being saved is identical to the record that's already there. The write would simply happen and "overwrite" existing data with new data (which happens to be identical).
The only way you can tell is by comparing the old and new documents, and that's a lot of extra work.