MarkLogic Data Hub document metadata added by steps - metadata

According to the documentation:
Document metadata added by steps
For every content object outputted by a Data Hub step, regardless of the step type, Data Hub will add the following document metadata keys and values to the document wrapped by the content object:
datahubCreatedOn = the date and time at which the document is written
datahubCreatedBy = the MarkLogic user used to run the step
datahubCreatedInFlow = the name of the flow containing the step being run
datahubCreatedByStep = the name of the step being run
datahubCreatedByJob = the ID of the job being run; this will contain the job ID of every flow run on the step, with multiple values being space-delimited
Is there any possibility to add some extra metadata keys and values to the document?

It is possible to add static values in your headers options, or to use one of the keywords listed above (such as datahubCreatedOn) to add values dynamically.
{
  "headers": {
    "sources": [{
      "name": "loadCustomersJSON"
    }],
    "createdOn": "datahubCreatedOn",
    "createdBy": "datahubCreatedBy"
  }
}
You can also add values dynamically by using an interceptor (see https://docs.marklogic.com/datahub/5.6/flows/about-interceptors-custom-hooks.html) or, if you are already using a custom step, by updating the header value in that step (see https://docs.marklogic.com/datahub/5.6/modules/editing-custom-step-module.html).

Related

Azure data factory - custom mapping for Rest service

I am creating a Copy activity that reads from a SQL Server table and has to send the data to an API endpoint with a PATCH request.
API provider specified that the body must be in the form of
"updates":[{"key1":"value1","key2":"value2","key3":"value3" },
{"key1":"value1","key2":"value2","key3":"value3" }, ...
.... {"key1":"value1","key2":"value2","key3":"value3" }]
However, my SQL table maps to JSON this way (without the "updates" wrapper):
[{"key1":"value1","key2":"value2","key3":"value3" },
{"key1":"value1","key2":"value2","key3":"value3" }, ...
.... {"key1":"value1","key2":"value2","key3":"value3" }]
I use the Copy activity with a sink dataset of type REST.
How can I modify the mapping so that the schema is wrapped in an "updates" object?
With the Copy data activity alone, it may not be possible to wrap the data (an array of objects) in an updates key.
Instead, I used a Lookup activity to get the data, a Set variable activity to wrap the data in an updates object key, and finally a Web activity with the PATCH method and that variable's value as the body.
The following is the sample data I have taken for my SQL server table.
Use a Lookup activity to select the data from this table using the table or query option (I used the query option). The debug output would be as follows:
NOTE: If your data is not the same as the sample table I have used, try the query option so that the output looks like the one shown below.
In the Set variable activity, I used an array variable with the following dynamic content to wrap the above array of objects in an updates key.
@array(json(concat('{"updates":',string(activity('Lookup1').output.value),'}')))
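To make the intent of that expression concrete, here is a minimal Python sketch of the same wrapping step outside Data Factory. It is only an illustration: `wrap_updates` and `lookup_rows` are hypothetical names, and the rows are the placeholder values from the question, standing in for the Lookup activity's output.

```python
import json


def wrap_updates(rows):
    """Wrap a list of row objects under a single "updates" key."""
    return {"updates": rows}


# Stand-in for the Lookup activity's output.value (placeholder rows from the question).
lookup_rows = [
    {"key1": "value1", "key2": "value2", "key3": "value3"},
    {"key1": "value1", "key2": "value2", "key3": "value3"},
]

# Serialize the wrapped object; this is the string a PATCH request would send as its body.
body = json.dumps(wrap_updates(lookup_rows))
print(body)
```

The array wrapper in the pipeline expression exists only because the variable is of type array; the payload itself is the single object with the updates key.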
Now in the Web activity, choose all the necessary settings (PATCH method, authorization, headers, URL, etc.) and give the body as follows (I used a fake REST API as a demo):
@variables('tp')[0]
Since I am using a fake REST API, the activity succeeds, and the Web activity's debug input shows the body that is passed to the REST API. The following is an image for reference:

Translating a GUID to a text value, from an API response in a Power Automate Flow

I'm using Power Automate to solve an integration challenge between two systems we use in our project management lifecycle. I have a custom connector, written by the vendor of System A, which lets me create a flow in Power Automate that is triggered when a record is created or updated.
So far, so good. However, the connector method provided by System A returns the new or updated record with a number of fields that contain GUIDs, because those fields are 'choice' type fields (e.g. Department, Status). What I end up with is a record where Status = "XXXXXX-000000-00000-00000" and so on. The vendor also provides a RESTful API endpoint that I can query. It returns a JSON collection of fields, including a 'Choices' section for each field of this type, which looks like:
{
  "Id": "156e6c29-24b3-4413-af91-80a62a04d443",
  "Order": 110,
  "InternalName": "PrjStatus",
  "DisplayName": "Status",
  "ColumnType": 5,
  "ColumnAggregate": 0,
  "Choices": {
    "69659014-be4d-eb11-bf94-00155df8457c": "(0) Not Set",
    "c30c50e5-51cf-ea11-bfd3-00155de84703": "(1) On Track",
    "c40c50e5-51cf-ea11-bfd3-00155de84703": "(2) At Risk",
    "c50c50e5-51cf-ea11-bfd3-00155de84703": "(3) Off Track",
    "6a659014-be4d-eb11-bf94-00155df8457c": "(4) Not Tracked"
  },
Technical problem:
What I have is the GUID of the choice (not the field). I need to take the GUID, in this case "6a659014-be4d-eb11-bf94-00155df8457c" and translate it into "(4) Not Tracked" and store this in a variable to write to a SharePoint list. I need to do this for about 30 fields which are similar in the record.
I've created the flow and the connector has given me the record with a list of fields, some of which contain value GUIDs. I know which fields these are and I have the Display Names of these fields.
I have added an HTTP call to the provided API endpoint (let's call it GetFields), which has returned a 200 response, the body of the response containing a JSON collection of the 50 or so fields in System A.
I can't work out how to search the response body for the GUID I have for each field value and make sure I get the right corresponding text value, so that I can write it to a field variable and then create a SharePoint record, all within a Power Automate flow.
I hope I've understood you correctly but from what I can work out, you want to dynamically select the value of the choice from the GUID you've been provided (by whatever means).
I created a small flow to prove the concept. Firstly, these two steps set up the scenario, the first being the GUID you want to extract the choice value for and the second being the JSON object itself ...
The third step will take the value from the first variable and use it dynamically in an expression to extract that key from the JSON and return the value.
This is the expression ...
variables('JSON')?['Choices'][variables('Choice ID')]
You can see I'm just using the variable in the path expression over the top of the JSON object to extract the key I want.
This is the end result ...
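For comparison, the same keyed lookup can be sketched in Python, extended to the "about 30 fields" case from the question. This is a rough illustration under stated assumptions: the GetFields response is taken to be a list of field objects shaped like the sample, and record keys are assumed to match each field's DisplayName. The helper names and the "Title" field are hypothetical.

```python
# One field object, trimmed from the sample in the question.
fields = [
    {
        "InternalName": "PrjStatus",
        "DisplayName": "Status",
        "Choices": {
            "69659014-be4d-eb11-bf94-00155df8457c": "(0) Not Set",
            "c30c50e5-51cf-ea11-bfd3-00155de84703": "(1) On Track",
            "6a659014-be4d-eb11-bf94-00155df8457c": "(4) Not Tracked",
        },
    },
]


def build_choice_index(fields):
    """Map DisplayName -> {choice GUID -> label} for fields that define choices."""
    return {f["DisplayName"]: f["Choices"] for f in fields if f.get("Choices")}


def translate_record(record, choice_index):
    """Replace choice GUIDs with their labels; leave every other value untouched."""
    out = {}
    for name, value in record.items():
        choices = choice_index.get(name)
        if choices and isinstance(value, str):
            out[name] = choices.get(value, value)
        else:
            out[name] = value
    return out


# Assumed record shape: keys are display names, choice fields hold GUIDs.
record = {"Status": "6a659014-be4d-eb11-bf94-00155df8457c", "Title": "Demo project"}
translated = translate_record(record, build_choice_index(fields))
print(translated["Status"])
```

Building the index once and reusing it for every choice field avoids repeating the lookup logic per field. An unknown GUID is passed through unchanged rather than raising an error.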

Update multiple records for PUT in Rest API

I have an API to update records in my Database. I have the endpoint as below
PUT /student/name/{name}/roll/{rollNumber}/reg-number/{regNumber}
and body as
{
  "studentList" : [
    {
      "fathersName": "string",
      "baseSubject": "string",
      "age": "string",
      "preferredLanguage": "string"
    }
  ]
}
Each record is uniquely identified by the name, rollNumber and regNumber and I want to update multiple records at once (I have a list in my request body).
What is the best way to achieve this? Should I pass arrays of name, rollNumber and regNumber as path params, with the corresponding records in the same order in the body inside studentList,
or
should I have all fields in the body itself and nothing in the path params?
I am following OpenAPI specs.
Update: Adding some more details to my question
I have a table with Primary key as a combination of name, rollNumber and regNumber
I want to update multiple rows in a single PUT call. The fields that can be updated are passed in my request body, and the fields that identify the row to be updated are passed in the URI as path params.
I would like to know the correct approach and the REST specification for achieving this. The two options I have in mind are described in my question.
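To make the second option concrete (all fields in the body, nothing in the path), here is a small Python sketch of how a handler might apply such a request. It only illustrates that option and is not a statement of what the REST or OpenAPI specifications require. The in-memory `table`, the sample values, and the function name are hypothetical.

```python
KEY_FIELDS = ("name", "rollNumber", "regNumber")


def apply_batch_update(table, student_list):
    """Update each row identified by its composite key; return the keys not found."""
    missing = []
    for item in student_list:
        key = tuple(item[k] for k in KEY_FIELDS)
        if key not in table:
            missing.append(key)
            continue
        # Only the non-key fields are treated as updatable.
        changes = {k: v for k, v in item.items() if k not in KEY_FIELDS}
        table[key].update(changes)
    return missing


# Hypothetical table keyed by (name, rollNumber, regNumber).
table = {
    ("Asha", "12", "R-100"): {"age": "15", "baseSubject": "Math"},
    ("Ben", "7", "R-101"): {"age": "16", "baseSubject": "Physics"},
}

# Each item in the body carries its own identifying fields.
student_list = [
    {"name": "Asha", "rollNumber": "12", "regNumber": "R-100", "age": "16"},
    {"name": "Cara", "rollNumber": "3", "regNumber": "R-999", "age": "14"},
]

not_found = apply_batch_update(table, student_list)
print(not_found)
```

Carrying the key inside each item keeps every record self-describing, so the pairing between identifiers and values does not depend on array order.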

Referencing JSON payload value in Azure Data Factory for If condition

I have a JSON file like the one below, returned as the payload of an API call, which is an HTTP dataset type in Data Factory.
{
  "count": 2,
  "name": "DatasetABC",
  "columnNames": [
    "Column_1",
    "Column_2"
  ],
  "rows": [
    "1234",
    "5678"
  ]
}
I would like to be able to use the count of records returned in an If condition. I'm wondering what I need to use to get the value of "count", which is 2.
Any help appreciated.
Based on your description, I suppose you could use the Lookup activity in Azure Data Factory.
Lookup activity can retrieve a dataset from any of the Azure Data Factory-supported data sources. Use it in the following scenario:
Dynamically determine which objects to operate on in a subsequent
activity, instead of hard coding the object name. Some object examples
are files and tables. Lookup activity reads and returns the content of
a configuration file or table. It also returns the result of executing
a query or stored procedure. The output from Lookup activity can be
used in a subsequent copy or transformation activity if it's a
singleton value. The output can be used in a ForEach activity if it's
an array of attributes.
For example, you may be able to access the count value by using @{activity('MyLookupActivity').output.firstRow.count} in the If Condition activity.
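As a rough analogy for what that expression reads, the following Python sketch parses the sample payload and branches on count. It assumes the parsed object plays the role of output.firstRow. The `record_count` helper and the "greater than zero" test are hypothetical, since the question does not say what the If condition should check.

```python
import json

# Sample payload from the question.
payload_text = """
{
  "count": 2,
  "name": "DatasetABC",
  "columnNames": ["Column_1", "Column_2"],
  "rows": ["1234", "5678"]
}
"""


def record_count(first_row):
    """Return the "count" property as an integer."""
    return int(first_row["count"])


# Stand-in for activity('MyLookupActivity').output.firstRow.
first_row = json.loads(payload_text)

# Assumed condition: take the "true" branch when at least one record came back.
if record_count(first_row) > 0:
    print("records returned:", record_count(first_row))
else:
    print("no records returned")
```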

PUT multiple related records in Data API request

In the documentation from FMI the HTTP-body example for creating records using FMS16 Data API (REST) looks like this
{"data":
  {
    "field_1": "value_1",
    "field_2": "value_2",
    "repetitionField(1)" : "fieldValue",
    "Orders::OrderDate.0":"12/22/2015"
  }
}
The last attribute, Orders::OrderDate.0, sets a value on a field of a related record, and since the record doesn't already exist it will be created.
My question focuses on the .0 suffix of the attribute name. It looks to me like the 0 is a serial number or identifier indicating which related record the value should be inserted into. This leads me to wonder whether it is possible to create more than one related record in the same request that creates the parent record.
The body below returns an error saying that the record does not exist, but why can one related record be created but not two?
{"data":
  {
    "field_1": "value_1",
    "field_2": "value_2",
    "repetitionField(1)" : "fieldValue",
    "Orders::OrderDate.0":"12/22/2015",
    "Orders::OrderDate.1":"11/11/2011"
  }
}
Any clue if the above code should work? Am I missing something?
I am fully aware that I can (and should) post several requests aimed at the related table's layout to create the related records. I just wish to know, since the .0 notation is in the documentation, whether it has a valid function.
Found this under the notes section in the doc you linked to:
"Only one related record can be created per create record call."
So there you have it. Looks like it behaves similarly to record creation from a portal, where you also can only create one related record at a time.
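Given that note, a client could check a payload before sending it. The Python sketch below is hypothetical: it treats the numeric suffix as an index identifying a related record, which is the asker's reading rather than something confirmed here. The regular expression and function name are illustrative and not part of the Data API.

```python
import re

# Matches keys such as "Orders::OrderDate.0" -> table "Orders", index 0.
RELATED_KEY = re.compile(r"^(?P<table>[^:]+)::(?P<field>.+)\.(?P<index>\d+)$")


def related_record_slots(data):
    """Return the set of (table, index) pairs referenced by related-field keys."""
    slots = set()
    for key in data:
        match = RELATED_KEY.match(key)
        if match:
            slots.add((match.group("table"), int(match.group("index"))))
    return slots


# The two request bodies from the question.
one_related = {
    "field_1": "value_1",
    "repetitionField(1)": "fieldValue",
    "Orders::OrderDate.0": "12/22/2015",
}
two_related = dict(one_related, **{"Orders::OrderDate.1": "11/11/2011"})

for body in (one_related, two_related):
    slots = related_record_slots(body)
    # Under the documented limit, more than one slot means the request should be split.
    print(len(slots), "related record(s):", "ok" if len(slots) <= 1 else "split into separate requests")
```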