Debugging Elastic ingest pipelines with the grok processor - elastic-stack

I have an Elasticsearch ingest pipeline with a grok processor defined, along with error handling:
{
"my_ingest" : {
"description" : "parse multiple patterns",
"processors" : [
{
"grok" : {
"field" : "message",
"patterns" : [
"""^\[end ] %{DATA:method} \'%{GREEDYDATA:url}' %{DATA:status} :: Duration: %{DATA:duration} ms""",
"""^\[start] %{DATA:method} \'%{GREEDYDATA:url}' :: Start Time:%{GREEDYDATA:starttime}""",
"%{GREEDYDATA:message}"
],
"on_failure" : [
{
"set" : {
"field" : "failure",
"value" : "{{_ingest.on_failure_processor_type }}-{{ _ingest.on_failure_message }}"
}
}
]
}
}
],
"on_failure" : [
{
"set" : {
"field" : "_index",
"value" : "failedindex"
}
}
]
}
}
I am referring to this pipeline in my filebeat.yml.
The grok patterns work when I run a simulate in Dev Tools. But when I run the actual logging, I do not see the log statements. It looks like they fail to parse and are not visible in Kibana. I also don't see a new index created, where I was hoping to see the errors logged as defined in on_failure.
Can someone please suggest pointers for debugging the issue?
How do I access on_failure_processor_type and on_failure_message from the metadata?
Thanks

The _simulate endpoint is generally the best starting point for debugging.
If that doesn't solve the issue, please post a sample document. Otherwise we won't be able to help there.
Also for "i also don't see a new index created": Are you sure the data is being sent to Elasticsearch? Some logs from Filebeat or whatever you are using might be worth a check (or a share).
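As a rough illustration of what a simulate call looks like, the Python sketch below only builds the request path and body for the ingest simulate API and prints them, so they can be pasted into Dev Tools. The helper name `build_simulate_request` and the sample log line are hypothetical; the real log format is not shown in the question.

```python
import json


def build_simulate_request(pipeline_id, messages):
    """Return (path, body) for the ingest simulate API."""
    # verbose=true asks Elasticsearch to report the result of each processor.
    path = f"/_ingest/pipeline/{pipeline_id}/_simulate?verbose=true"
    body = {"docs": [{"_source": {"message": m}} for m in messages]}
    return path, body


# Hypothetical log line shaped like the second grok pattern in the question.
path, body = build_simulate_request(
    "my_ingest",
    ["[start] GET '/api/items' :: Start Time:2020-01-01T00:00:00"],
)
print("POST " + path)
print(json.dumps(body, indent=2))
```

Feeding the simulate call a line copied from the actual log file, rather than a hand-written one, makes it easier to see whether the patterns or the shipping step is at fault.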

Related

How to build definition hover capabilities to a new language extension

Background: I have a custom JSON-based language.
Consider the following
In file 1, I have the following:
[
{
"name" : "abcde",
"source" : "source::abcde",
// other attributes
},
{
"name" : "qwerty",
"source" : "source::qwerty"
// other attributes
},
]
In file 2, I have the following:
abcde.json
{
"name" : "abcde"
// properties related to it
}
qwerty.json
{
"name" : "qwerty"
// properties related to it
}
Now, I want to build an extension/grammar such that when a visitor uses Ctrl + click on source::abcde, it takes them to abcde.json.
I am wondering how to achieve this through a VS Code extension. I don't have a lot of expertise in this area.
I took a look at https://marketplace.visualstudio.com/ but could not find one directly. I have 10000+ such definitions, and it is becoming very hard to maintain and update them.
Any help on how to achieve this, or pointers to some blogs, would be really helpful.

MongoListener + Spring Detect updated fields in Document

I have a Spring Boot application with MongoDB, and I need to audit every update made to specified fields of a collection (for data analysis purposes).
If I have a collection with documents like:
{
"_id" : ObjectId("12345678910"),
"label_1" : ObjectId("someIdForLabel1"),
"label_2" : ObjectId("someIdForLabel2"),
"label_3" : ObjectId("someIdForLabel"),
"name": "my data",
"description": "some curious stuff",
"updatedAt" : ISODate("2022-06-21T08:28:23.115Z")
}
I want to write an audit document whenever a label_* field is updated. Something like:
{
"_id" : ObjectId("111213141516"),
"modifiedDocument" : ObjectId("12345678910"),
"modifiedLabel" : "label_1",
"newValue" : ObjectId("someNewIdForLabel1"),
"updatedBy" : ObjectId("userId"),
"updatedAt" : ISODate("2022-06-21T08:31:20.315Z")
}
How can I achieve this with MongoListener? I already have two methods for AfterSave and AfterDelete, for other purposes, but they give me the whole new Document.
I would rather avoid querying the DB again or using findAndModify() in the first place.
I looked at ChangeStreams too, but I have too many doubts about them when more than one instance is running.
Thank you so much, any tip will be appreciated!

MongoDB Find and Modify With GraphQL

I am working on a GraphQL mutation and need help. My document looks like this:
{
"_id" : ObjectId("5bc02db357146d0c385d4988"),
"item_type" : "CategoryMapping",
"id" : null,
"CategoryGroupName" : "Mystries & Thriller",
"CustomCategory" : [
{
"name" : "Private Investigator",
"MappedBisacs" : [
"investigator",
"Privately owned",
"Secret"
]
},
{
"name" : "Crime Investigator",
"MappedBisacs" : [
"crime investigator",
"crime thriller"
]
}
]
}
UI
Allow the user to update MappedBisacs through a list of checkboxes, so the user can add, update, or delete bisacs.
Problem - When the client sends a GraphQL query like the following:
mutation {
CategoryMapping_add(input: {CategoryGroupName: "Mystries & Thriller", CustomCategory: [{name: "Crime Investigator", MappedBisacs: ["investigator", "dafdfdaf", "dafsdf"]}]}) {
clientMutationId
}
}
I need to find the specific custom category and update its bisac array.
I am not sure I understood correctly, but this is more a MongoDB question than a GraphQL one. First you must find the document that you want (I would use the id of the document instead of CategoryGroupName); then you can update this array in several ways. For example, after you have found the document, you could read the array content, spread it into a new array together with the new data from your mutation, and save the object with the update method (if you simply want to add new data without removing any).
So, it depends on the case.
Check: https://docs.mongodb.com/manual/reference/operator/update-array/
Hope it helps! :)
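One of those "several ways" could be the positional `$` operator described on the linked page. The Python sketch below only builds the filter and update documents; the helper name `build_bisac_update` is hypothetical, and handing the result to a driver call such as pymongo's `update_one` is an assumption about the setup, not something stated in the question.

```python
def build_bisac_update(doc_id, category_name, bisacs):
    """Return (filter, update) replacing MappedBisacs of one custom category."""
    # The filter must match on the array field so that "$" can resolve
    # to the index of the first matching CustomCategory element.
    query_filter = {"_id": doc_id, "CustomCategory.name": category_name}
    update = {"$set": {"CustomCategory.$.MappedBisacs": list(bisacs)}}
    return query_filter, update


query_filter, update = build_bisac_update(
    "5bc02db357146d0c385d4988",  # in a real call this would be an ObjectId
    "Crime Investigator",
    ["investigator", "dafdfdaf", "dafsdf"],
)
# With pymongo this could be used as:
#   collection.update_one(query_filter, update)
```

Because the whole list is replaced, the same update covers adding, changing, and removing checkbox entries in one write.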

How to trigger a marketing goal in Sitecore and later see it in reports?

I am triggering marketing goals using back-end code as follows:
if (!TrackerEnabled())
{
Tracker.StartTracking();
}
Item goal = Sitecore.Context.Database.GetItem(goalId);
var goalAsPageEvent = new PageEventItem(goal);
var pageEventsRow = Sitecore.Analytics.Tracker.CurrentPage.Register(goalAsPageEvent);
Sitecore.Analytics.Tracker.Submit();
And I can see the data in the MongoDB Interactions collection as follows:
"PageEvents" : [
{
"Name" : "Apply Now - Auto Loans",
"Timestamp" : NumberLong(0),
"PageEventDefinitionId" : LUUID("dc9d7115-7bd5-7b40-9fa5-2722a2fb2e00"),
"IsGoal" : true,
"DateTime" : ISODate("2016-07-28T12:47:33.700Z"),
"Value" : 25
},
// ...
]
My question is: how can I see this data in Sitecore Experience Analytics or Content Editor?
Yes, you will be able to see this in Experience Analytics in an aggregated state.
If you want to see this data in Sitecore in detail, you should use the Experience Profile application.

Storing a query in Mongo

This is the case: a webshop in which I want to configure which items should be listed in the shop based on a set of parameters.
I want this to be configurable, because that allows me to experiment with different parameters and also change their values easily.
I have a Product collection that I want to query based on multiple parameters.
A couple of these are found here:
within product:
"delivery" : {
"maximum_delivery_days" : 30,
"average_delivery_days" : 10,
"source" : 1,
"filling_rate" : 85,
"stock" : 0
}
but also other parameters exist.
An example of such a query to decide whether or not to include a product could be:
"$or" : [
{
"delivery.stock" : 1
},
{
"$or" : [
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 60
}
},
{
"delivery.filling_rate" : {
"$gt" : 90
}
}
]
},
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 40
}
},
{
"delivery.filling_rate" : {
"$gt" : 80
}
}
]
},
{
"$and" : [
{
"delivery.delivery_days" : {
"$lt" : 25
}
},
{
"delivery.filling_rate" : {
"$gt" : 70
}
}
]
}
]
}
]
Now to make this configurable, I need to be able to handle boolean logic, parameters and values.
So, since such a query is itself JSON, I got the idea to store it in Mongo and have my Java app retrieve it.
The next step is using it in the filter (e.g. find, or whatever) and working on the corresponding selection of products.
The advantage of this approach is that I can actually analyse the data and the effectiveness of the query outside of my program.
I would store it by name in the database. E.g.
{
"name": "query1",
"query": { the thing printed above starting with "$or"... }
}
using:
db.queries.insert({
"name" : "query1",
"query": { the thing printed above starting with "$or"... }
})
Which results in:
2016-03-27T14:43:37.265+0200 E QUERY Error: field names cannot start with $ [$or]
at Error (<anonymous>)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:161:19)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
at insert (src/mongo/shell/bulk_api.js:646:20)
at DBCollection.insert (src/mongo/shell/collection.js:243:18)
at (shell):1:12 at src/mongo/shell/collection.js:161
However, I can store it using Robomongo, but not always. Obviously I am doing something wrong, but I have no idea what it is.
If it fails and I create a brand new collection and try again, it succeeds. This goes beyond what I can comprehend.
But when I try updating values in the "query", the changes never go through. Not even sometimes.
I can, however, create a new object and discard the previous one. So the workaround is there.
db.queries.update(
{"name": "query1"},
{"$set": {
... update goes here ...
}
}
)
Doing this results in:
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 52,
"errmsg" : "The dollar ($) prefixed field '$or' in 'action.$or' is not valid for storage."
}
})
This seems pretty close to the other message above.
Needless to say, I am pretty clueless about what is going on here, so I hope some of the wizards here are able to shed some light on the matter.
I think the error message contains the important info you need to consider:
QUERY Error: field names cannot start with $
Since you are trying to store a query (or part of one) in a document, you'll end up with attribute names that contain mongo operator keywords (such as $or, $ne, $gt). The mongo documentation actually references this exact scenario (emphasis added):
Field names cannot contain dots (i.e. .) or null characters, and they must not start with a dollar sign (i.e. $)...
I wouldn't trust 3rd party applications such as Robomongo in these instances. I suggest debugging/testing this issue directly in the mongo shell.
My suggestion would be to store an escaped version of the query in your document so that it does not interfere with reserved operator keywords. You can use JSON.stringify(my_obj) to encode your partial query into a string, and then decode it with JSON.parse(escaped_query_string_from_db) when you retrieve it later on.
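A minimal sketch of this stringify-and-parse idea, written in Python rather than the mongo shell: `json.dumps` plays the role of `JSON.stringify` and `json.loads` the role of `JSON.parse`. The helper names `to_storable` and `from_stored` and the shortened query are illustrative assumptions, not part of the original question.

```python
import json

# Shortened version of the query from the question (illustrative only).
query = {
    "$or": [
        {"delivery.stock": 1},
        {"$and": [
            {"delivery.maximum_delivery_days": {"$lt": 60}},
            {"delivery.filling_rate": {"$gt": 90}},
        ]},
    ]
}


def to_storable(name, query):
    """Wrap the query as a JSON string so no field name starts with '$'."""
    return {"name": name, "query": json.dumps(query)}


def from_stored(doc):
    """Decode the stored string back into a filter document."""
    return json.loads(doc["query"])


stored = to_storable("query1", query)
restored = from_stored(stored)
```

The trade-off is that the stored query is opaque to MongoDB: it cannot be changed field by field with $set, only replaced as a whole string.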
Your approach of storing the query as a JSON object in MongoDB is not viable.
You could potentially store your query logic and fields in MongoDB, but you have to have an external app build the query with the proper MongoDB syntax.
MongoDB queries contain operators, and some of those have special characters in them.
There are rules for MongoDB field names. These rules do not allow for special characters.
Look here: https://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names
The probable reason you can sometimes successfully create the doc using Robomongo is that Robomongo is transforming your query into a string and properly escaping the special characters as it sends it to MongoDB.
This also explains why your attempt to update them never works. You tried to create a document, but instead created something that is a string object, so your update conditions are probably not retrieving any docs.
I see two problems with your approach.
In following query
db.queries.insert({
"name" : "query1",
"query": { the thing printed above starting with "$or"... }
})
Valid JSON expects key-value pairs. Here, in "query", you are storing an object without a key. You have two options: either store the query as text, or create another key inside the curly braces.
The second problem is that you are storing query values without wrapping them in quotes. All string values must be wrapped in quotes.
So your final document should appear as:
db.queries.insert({
"name" : "query1",
"query": 'the thing printed above starting with "$or"... '
})
Now try, it should work.
Obviously my attempt to store a query in mongo the way I did was foolish, as became clear from the answers from both #bigdatakid and #lix. So what I finally did was this: I altered the naming of the fields to comply with the mongo requirements.
E.g. instead of $or I used _$or etc., and instead of using a . inside the name I used a #. I replace both of these in my Java code.
This way I can still easily try and test the queries outside of my program. In my Java program I just change the names back and use the query, using just 2 lines of code. It simply works now. Thanks guys for the suggestions you made.
String documentAsString = query.toJson().replaceAll("_\\$", "\\$").replaceAll("#", ".");
Object q = JSON.parse(documentAsString);
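The two-line replacement works on the whole JSON text, so it would also rewrite a # or _$ that happens to appear inside a value. A key-only variant can be sketched as follows, in Python rather than Java for brevity; `decode_keys`, `_decode_key`, and the sample document are illustrative assumptions, not the poster's code.

```python
def _decode_key(key):
    """Undo the storage-safe naming for a single field name."""
    if key.startswith("_$"):
        key = key[1:]  # "_$or" -> "$or"
    return key.replace("#", ".")  # "delivery#stock" -> "delivery.stock"


def decode_keys(node):
    """Recursively decode field names, leaving all values untouched."""
    if isinstance(node, dict):
        return {_decode_key(k): decode_keys(v) for k, v in node.items()}
    if isinstance(node, list):
        return [decode_keys(item) for item in node]
    return node


# Hypothetical stored form of a small query; "tag" shows an untouched value.
stored = {
    "_$or": [
        {"delivery#stock": 1},
        {"delivery#filling_rate": {"_$gt": 90}},
        {"tag": "a#b"},
    ]
}
decoded = decode_keys(stored)
```

Here the string value "a#b" survives unchanged, which a text-level replace of # would not guarantee.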