I am capturing live streaming data and processing it. I configured my logstash.conf file.
I started Elasticsearch, Logstash and Kibana.
I created my index in Kibana, and when I do a GET index in the dev tools,
I have something like this
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
I want to change the type of message from String to Double. How can I do it?
You can't change a mapping after an index is created - you'll have to create the mapping yourself in a new index, explicitly creating the fields/types you need:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
then re-index from the old to the new index:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
Note the type you want is 'double' not 'Double':
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html
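Putting those together, a minimal sketch in the Kibana dev tools (the index names are placeholders):

PUT my-new-index
{
  "mappings": {
    "properties": {
      "message": { "type": "double" }
    }
  }
}

POST _reindex
{
  "source": { "index": "my-old-index" },
  "dest": { "index": "my-new-index" }
}

Note that the reindex will fail for any message value that can't be parsed as a number.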
Changing the data type of a field in Elasticsearch (ES) is a breaking change. In your case, you need to update the mapping and reindex in ES.
Please use https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-mapping.html to verify that the mapping was updated successfully in ES.
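For example, in the dev tools (the index name is a placeholder):

GET my-new-index/_mapping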
The Reindex API requires _source to be enabled. Please refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html for more information on the _source field and whether it's enabled in your case.
If it's not enabled in your case, then the only option you have is to delete the old index (which has the older mapping) and create it again with the new mapping.
Let me know if you have any doubt or face any issue implementing this.
I'm using MS Automate to solve an integration challenge between two systems we use in our Project Management lifecycle. I have a custom connector written by the vendor of System A which allows me to create a Flow in MS Automate which is triggered when a record is Created or Updated.
So far, so good. However, the method in the connector provided by System A returns the new or updated record containing a number of fields which hold value GUIDs, as the fields are 'choice' type fields, e.g. Department, Status etc. What I end up with is a record where Status = "XXXXXX-000000-00000-00000" etc. The vendor also provides a RESTful API endpoint which I can query; it returns a JSON collection of fields, including a 'Choices' section for each field of this type, which is a standard JSON that looks like:
{
  "Id": "156e6c29-24b3-4413-af91-80a62a04d443",
  "Order": 110,
  "InternalName": "PrjStatus",
  "DisplayName": "Status",
  "ColumnType": 5,
  "ColumnAggregate": 0,
  "Choices": {
    "69659014-be4d-eb11-bf94-00155df8457c": "(0) Not Set",
    "c30c50e5-51cf-ea11-bfd3-00155de84703": "(1) On Track",
    "c40c50e5-51cf-ea11-bfd3-00155de84703": "(2) At Risk",
    "c50c50e5-51cf-ea11-bfd3-00155de84703": "(3) Off Track",
    "6a659014-be4d-eb11-bf94-00155df8457c": "(4) Not Tracked"
  },
Technical problem:
What I have is the GUID of the choice (not the field). I need to take the GUID, in this case "6a659014-be4d-eb11-bf94-00155df8457c" and translate it into "(4) Not Tracked" and store this in a variable to write to a SharePoint list. I need to do this for about 30 fields which are similar in the record.
I've created the flow and the connector has given me the record with a list of fields, some of which contain value GUIDs. I know which fields these are and I have the Display Names of these fields.
I have added an HTTP call to the provided API endpoint (let's call it GetFields), which has returned a 200 response, the body of the response containing a JSON collection of the 50 or so fields in System A.
I can't work out how to parse the body of the response for the GUID I have for each field value, ensure I get the right corresponding text value, write it to a field variable, and then create a SharePoint record, all wrapped up in an MS Automate flow.
I hope I've understood you correctly but from what I can work out, you want to dynamically select the value of the choice from the GUID you've been provided (by whatever means).
I created a small flow to prove the concept. Firstly, these two steps set up the scenario, the first being the GUID you want to extract the choice value for and the second being the JSON object itself ...
The third step will take the value from the first variable and use it dynamically in an expression to extract that key from the JSON and return the value.
This is the expression ...
variables('JSON')?['Choices'][variables('Choice ID')]
You can see I'm just using the variable in the path expression over the top of the JSON object to extract the key I want.
This is the end result ...
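Spelled out in text, the steps were roughly as follows (the step and variable names are my own):

Initialize variable "Choice ID" (String) = 6a659014-be4d-eb11-bf94-00155df8457c
Initialize variable "JSON" (Object) = the field JSON from the question
Compose, with inputs: variables('JSON')?['Choices'][variables('Choice ID')]

With those values, the Compose output is "(4) Not Tracked".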
According to the documentation:
Document metadata added by steps
For every content object outputted by a Data Hub step, regardless of the step type, Data Hub will add the following document metadata keys and values to the document wrapped by the content object:
datahubCreatedOn = the date and time at which the document is written
datahubCreatedBy = the MarkLogic user used to run the step
datahubCreatedInFlow = the name of the flow containing the step being run
datahubCreatedByStep = the name of the step being run
datahubCreatedByJob = the ID of the job being run; this will contain the job ID of every flow run on the step, with multiple values being space-delimited
Is there any way to add some extra metadata keys and values to the document?
It is possible to add additional static values in your headers options or use one of these keywords to dynamically add values.
{
  "headers": {
    "sources": [{
      "name": "loadCustomersJSON"
    }],
    "createdOn": "datahubCreatedOn",
    "createdBy": "datahubCreatedBy"
  }
}
You can also dynamically add values by using an interceptor (see: https://docs.marklogic.com/datahub/5.6/flows/about-interceptors-custom-hooks.html) or by updating the header value in a custom step, if you are already using one (see: https://docs.marklogic.com/datahub/5.6/modules/editing-custom-step-module.html).
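A rough sketch of such an interceptor module, assuming the DHF 5.x interceptor contract (the module is invoked with contentArray and options variables, and each content object can carry a context.metadata map; the module path and key name here are made up):

// /custom-modules/addMetadata.sjs, referenced from the step's "interceptors"
// array with "when": "beforeContentPersisted"
var contentArray; // supplied by Data Hub: the content objects produced by the step
var options;      // supplied by Data Hub: the combined step options

contentArray.forEach(function (content) {
  content.context = content.context || {};
  content.context.metadata = content.context.metadata || {};
  content.context.metadata.myCustomKey = 'myCustomValue'; // the extra metadata to add
});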
I have a question regarding best practices for inserting documents in MongoDB.
In my data source the key "myData2" can be null or a string. Should I add "myData2" as null to my database or is it better to leave the value out if not defined? What is the "clean" way to deal with this?
[{
  "myData1": "Stuff",
  "myData2": null
}]
Since MongoDB permits fields to be added to documents at any time, most (production) applications are written to handle both of the following cases:
A new field is added to the code, but the existing data doesn't have it, and it needs to be added over time to the existing data either on demand or as a background process
A field is no longer used by the code but still contains values in the database
What would your application do if the field is missing, as opposed to if it's set to the null value? If it would do the same thing, then I suggest not setting fields to null values for two reasons:
It streamlines the code because you only need to handle one possibility (missing field) on the reading side, instead of two (field missing or null)
It requires less storage space in the database.
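For instance, on the reading side the missing-field case can be handled in one place (a mongo shell sketch; the collection name is made up, the field names follow the question):

var doc = db.things.findOne({ "myData1": "Stuff" });
// A field that was never written comes back as undefined; treat it like null.
var myData2 = (doc.myData2 === undefined) ? null : doc.myData2;

Note that a query like { myData2: null } matches both documents where the field is explicitly null and documents where it is missing entirely, while { myData2: { $exists: false } } matches only the missing case.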
In the documentation from FMI, the HTTP-body example for creating records using the FMS16 Data API (REST) looks like this:
{"data":
{
"field_1": "value_1",
"field_2": "value_2",
"repetitionField(1)" : "fieldValue",
"Orders::OrderDate.0":"12/22/2015"
}
}
The last attribute, Orders::OrderDate.0, sets a value on a field of a related record, and since that record doesn't already exist, it will be created.
My question focuses on the .0 suffix of the attribute name. It looks to me like the 0 is a serial/identifier for which related record the value should be inserted into. This leads me to wonder if it is possible to create more than one related record in the same request that creates the parent record.
The body below returns an error that the record does not exist, but why can one related record be created and not two?
{"data":
{
"field_1": "value_1",
"field_2": "value_2",
"repetitionField(1)" : "fieldValue",
"Orders::OrderDate.0":"12/22/2015",
"Orders::OrderDate.1":"11/11/2011"
}
}
Any clue if the above code should work? Am I missing something?
I am fully aware that I can (should) post several requests aimed at the related table's layout to create the related records. I just wish to know, since the .0 notation is in the documentation, whether it has a valid function.
Found this under the notes section in the doc you linked to:
"Only one related record can be created per create record call."
So there you have it. Looks like it behaves similarly to record creation from a portal, where you also can only create one related record at a time.
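For completeness, the workaround is a second create call aimed at a layout based on the Orders table, with a body along these lines (a sketch only; the OrderDate and CustomerID field names are assumptions about the schema):

{"data":
  {
    "OrderDate": "11/11/2011",
    "CustomerID": "<the parent record's match field value>"
  }
}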
I wonder, how do I change a live data schema with MongoDB?
For example If I have "Users" collection with the following document:
var user = {
  _id: 123312,
  name: "name",
  age: 12,
  address: {
    country: "",
    city: "",
    location: ""
  }
};
Now, in a new version of my application, if I add a new property to the "User" entity, let us say weight, height or adult (based on the user's age), how do I change all the current live data which does not have the adult property? I read about the MapReduce and group aggregation commands, but they seem to be suited to analytics and other calculations, or am I wrong?
So what is the best way to change your currently running data schema in MongoDB?
It really depends upon your programming language. MongoDB is really good at having a dynamic schema. I think your pattern of thought at the moment is too SQL related whereby you believe that all rows, even if they do not yet have a value, must have the new field.
The reality is quite different. The rows which have nothing meaningful to put into them do not require the field, and you can, in your application, just check to see if the returned document has a value; if not, then you can assume, as in a fixed SQL schema, that the value is null.
This is one aspect where MongoDB shines: you don't have to apply that new field to the entire collection on demand; instead you can lazily fill it in as data is entered by the user.
So just code the field into your application and let the user do the work for you.
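For instance, a lazy backfill on read might look like this (a mongo shell sketch; treating 18 as the cut-off for adult is an assumption for illustration):

var user = db.users.findOne({ _id: 123312 });
if (user.adult === undefined) {
    user.adult = (user.age >= 18);  // derive the new field from existing data
    db.users.update({ _id: user._id }, { $set: { adult: user.adult } });
}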
Alternatively, the best way to add this field to all documents up front is to write a loop, maybe in the console on or close to the primary of your replica set (if you have one, otherwise just on the server), like so:
db.users.find().forEach(function(doc){
    doc.weight = '44 stone'; // or compute the new field's value from the existing document
    db.users.save(doc);      // write the document back with the new field added
});
That is currently the best way to do something like what you're asking.
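On more recent servers you can also do the backfill in a single server-side command with updateMany (a sketch; it sets a default only where the field is missing):

db.users.updateMany(
    { weight: { $exists: false } },   // only documents that lack the field
    { $set: { weight: '44 stone' } }  // add it with a default value
);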