Loading and setting values of JSON in Scala

I am new to Scala and Cucumber. I have the following JSON file in the resources of my project:
{
  "town": {
    "address": {
      "Dates": [
        {
          "startDate": ""
        }
      ],
      "condtion": {
        "includeAll": [
          {
            "type": "",
            "id": "",
            "details": [
              {
                "destination": ""
              }
            ]
          },
          {
            "includeAny": [
              {
                "type": "",
                "id": "",
                "details": [
                  {
                    "value": ""
                  }
                ]
              }
            ]
          }
        ]
      },
      "FinalId": "N9"
    }
  }
}
Fields are kept empty in this file. In my feature file I have the following info in the Examples section:
Given:...
When: ...
Then: ...
Examples:
| includeAll | includeAny |
| typeValue;idValue;destination,typeValue1;idValue1;destination1 | typeValue2;idValue2;destination2,typeValue3;idValue4;destination4 |
Values of the fields are separated by the delimiter ";" and each combination is separated by another delimiter ",". I may have many combinations of typeValue, idValue, and destinationValue in my feature file separated by ",", and I need to load that empty JSON, set the values in it, and store the resulting JSON in a variable or file. includeAny values may or may not be present, but includeAll will always be there. If includeAny values are not specified, I don't want to include "includeAny" in my final JSON. How can I do that?
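One possible approach is to parse each Examples cell and rebuild the "condtion" object before setting it in the template. Below is a minimal sketch assuming the circe library for JSON handling; the object name, the helper names, and the template.json resource name are hypothetical and not part of the question.

import io.circe.Json
import io.circe.parser.parse
import scala.io.Source

object JsonTemplateFiller {

  // Turn "type;id;dest,type1;id1;dest1" into JSON objects of the shape
  // { "type": ..., "id": ..., "details": [ { <detailKey>: ... } ] }.
  def buildEntries(raw: String, detailKey: String): Vector[Json] =
    raw.split(",").toVector.map { combo =>
      val Array(typeValue, idValue, detailValue) = combo.split(";")
      Json.obj(
        "type"    -> Json.fromString(typeValue),
        "id"      -> Json.fromString(idValue),
        "details" -> Json.arr(Json.obj(detailKey -> Json.fromString(detailValue)))
      )
    }

  // Build the "condtion" object; the "includeAny" wrapper is added only when values are given.
  def buildCondition(includeAll: String, includeAny: Option[String]): Json = {
    val allEntries = buildEntries(includeAll, "destination")
    val anyEntry = includeAny.filter(_.trim.nonEmpty).map { raw =>
      Json.obj("includeAny" -> Json.fromValues(buildEntries(raw, "value")))
    }
    Json.obj("includeAll" -> Json.fromValues(allEntries ++ anyEntry.toVector))
  }

  // Load the empty template from resources and replace its "condtion" object.
  def fillTemplate(includeAll: String, includeAny: Option[String]): Either[Throwable, Json] = {
    val raw = Source.fromResource("template.json").mkString
    parse(raw).map { template =>
      template.hcursor
        .downField("town")
        .downField("address")
        .downField("condtion")
        .set(buildCondition(includeAll, includeAny))
        .top
        .getOrElse(template)
    }
  }
}

In a Cucumber step you could then call fillTemplate(includeAllCell, Option(includeAnyCell)), keep the returned Json in a variable, or write it out as a string with .spaces2.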

Cannot parametrize any value under placement.managedCluster.config

My goal is to create a Dataproc workflow template from Python code. I also want the ability to parametrize the placement.managedCluster.config.gceClusterConfig.subnetworkUri field during template instantiation.
I read the template from a JSON file like this:
{
"id": "bigquery-extractor",
"placement": {
"managed_cluster": {
"config": {
"gce_cluster_config": {
"subnetwork_uri": "some-subnet-name"
},
"software_config" : {
"image_version": "1.5"
}
},
"cluster_name": "some-name"
}
},
"jobs": [
{
"pyspark_job": {
"args": [
"job_argument"
],
"main_python_file_uri": "gs:///path-to-file"
},
"step_id": "extract"
}
],
"parameters": [
{
"name": "CLUSTER_NAME",
"fields": [
"placement.managedCluster.clusterName"
]
},
{
"name": "SUBNETWORK_URI",
"fields": [
"placement.managedCluster.config.gceClusterConfig.subnetworkUri"
]
},
{
"name": "MAIN_PY_FILE",
"fields": [
"jobs['extract'].pysparkJob.mainPythonFileUri"
]
},
{
"name": "JOB_ARGUMENT",
"fields": [
"jobs['extract'].pysparkJob.args[0]"
]
}
]
}
The code snippet I use (imports shown for completeness):

from google.api_core.client_options import ClientOptions
from google.api_core.exceptions import AlreadyExists
from google.cloud import dataproc_v1 as dataproc

options = ClientOptions(api_endpoint="{}-dataproc.googleapis.com:443".format(region))
client = dataproc.WorkflowTemplateServiceClient(client_options=options)

# Read the template definition (json.load would be safer than eval for plain JSON).
template_file = open(path_to_file, "r")
template_dict = eval(template_file.read())
print(template_dict)

template = dataproc.WorkflowTemplate(template_dict)

full_region_id = "projects/{project_id}/regions/{region}".format(project_id=project_id, region=region)

try:
    client.create_workflow_template(
        parent=full_region_id,
        template=template
    )
except AlreadyExists as err:
    print(err)
    pass
When I try to run this code, I get the following error:
google.api_core.exceptions.InvalidArgument: 400 Invalid field path placement.managed_cluster.configuration.gce_cluster_config.subnetwork_uri: Field gce_cluster_config does not exist.
The behavior is the same if I try to parametrize placement.managedCluster.config.softwareConfig.imageVersion; I will get:
google.api_core.exceptions.InvalidArgument: 400 Invalid field path placement.managed_cluster.configuration.software_config.image_version: Field software_config does not exist.
But if I leave the fields under placement.managedCluster.config out of the parameters map, the template is created successfully.
I didn't find any restriction on parametrizing these fields. Is there one, or am I doing something wrong?
This doc lists the parameterizable fields. It seems that, within managedCluster, only the managed cluster name is parameterizable:
Managed cluster name. Dataproc will use the user-supplied name as the name prefix, and append random characters to create a unique cluster name. The cluster is deleted at the end of the workflow.
I don't see managedCluster.config listed as parameterizable.

Gentics Mesh - Multilanguage support - Cross-language in a list of nodes - GraphQL query

Gentics Mesh Version : v1.5.1
Intro:
Let's suppose we have schema A with a field of type: list, list type: node, and allowed schemas: B (see (1)).
An instance of node B has been created in language en (b1-EN) and in de (b1-DE).
Another instance of node B has been created in language en only (b2-EN).
An instance of node A has been created in language de (a1-DE), and b1-DE and b2-EN have been added to the node list (Bs) of a1.
As a result, when selecting the de language in the Gentics Mesh CMS, node a1-DE has a list of 2 nodes: b1-DE and b2-EN.
When the following GraphQL query is applied:
{
node(path: "/a1-DE") {
... on A {
path
uuid
availableLanguages
fields {
Bs {
... on B {
path
fields {
id
}
}
}
}
}
}
}
The result is:
{
"data": {
"node": {
"path": "/a1-DE",
"uuid": "30dfd534cdee40dd8551e6322c6b1518",
"availableLanguages": [
"de"
],
"fields": {
"Bs": [
{
"path": "/b1-DE",
"fields": {
"id": "b1-DE"
}
},
{
"path": null,
"fields": null
}
]
}
}
}
}
Question:
Why is the result not showing the b2-EN node in the list of nodes? Is the query wrong? What I would like to get as a result is the default language version of the node (b2-EN), because b2-DE has not been contributed yet. So the expected result is:
{
"data": {
"node": {
"path": "/a1-DE",
"uuid": "30dfd534cdee40dd8551e6322c6b1518",
"availableLanguages": [
"de"
],
"fields": {
"Bs": [
{
"path": "/b1-DE",
"fields": {
"id": "b1-DE"
}
},
{
"path": "/b2-EN",
"fields": {
"id": "b2-EN"
}
}
]
}
}
}
}
In the documentation (2):
The fallback to the configured default language will be applied if no other matching content can be found. Null will be returned if this also fails.
Can someone enlighten me?
(1): Schema
{
"name": "A",
"container": false,
"autoPurge": false,
"displayField": "id",
"segmentField": "id",
"urlFields": [
"id"
],
"fields": [
{
"name": "Bs",
"type": "list",
"label": "Bs",
"required": false,
"listType": "node",
"allow": [
"B"
]
},
{
"name": "id",
"type": "string",
"label": "id",
"required": true
}
]
}
(2) https://getmesh.io/docs/graphql/#_multilanguage_support
There are some known issues and inconsistent behaviour when loading nodes via GraphQL. See this issue: https://github.com/gentics/mesh/issues/971
The queried list of nodes will always be in the configured default language (set in mesh.yml), which in your case seems to be de. This is why the English-only node yields no result.
Until this is fixed, you can work around this issue by loading all languages of the node list:
{
node(path: "/a1-DE") {
... on A {
path
uuid
availableLanguages
fields {
Bs {
... on B {
languages {
path
language
fields {
id
}
}
}
}
}
}
}
}
You will get the contents of all languages of the node list. This means you will have to filter for the desired language in your code after receiving the response.
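For example, here is a minimal filtering sketch, assuming the GraphQL response has been parsed with circe; the function name and the preferred/fallback language values are placeholders, not part of Gentics Mesh's API.

import io.circe.Json

// For each entry of fields.Bs, keep the variant in the preferred language,
// falling back to the default language when the preferred one is missing.
def pickLanguage(bs: Vector[Json], preferred: String = "de", fallback: String = "en"): Vector[Json] =
  bs.flatMap { node =>
    val variants = node.hcursor
      .downField("languages")
      .as[Vector[Json]]
      .getOrElse(Vector.empty)

    def byLang(lang: String): Option[Json] =
      variants.find(_.hcursor.get[String]("language").contains(lang))

    byLang(preferred).orElse(byLang(fallback))
  }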

MongoDB query returning unwanted documents

I have a database containing documents of two structures:
{
"name": "",
"name_ar": "",
"description": "",
"bla1": {
"name": "",
"link": "",
"Logo": ""
},
"bla2": {
"name": "",
"id": ""
}
}
and
{
"name": "",
"name_ar": "",
"description": "",
"bla1": {
"name": [],
"link": "",
"Logo": ""
},
"bla2": {
"name": "",
"id": ""
}
}
I want to query my collection to get documents where "bla1.name" is a string exactly equal to something. However, the following query:
{$and: [{'bla1.name': {'$type': 'string'}}, {"bla1.name":'something'}]}
returns all documents containing the name 'something', even those where "bla1.name" is an array.
What am I doing wrong?
From the MongoDB docs:
$type now works with arrays in the same way it works with other BSON types. Previous versions only matched documents where the field contained a nested array.
That means: if an array has at least one element of the given type, the document gets selected.
If you want to exclude arrays, you have to extend your query. The $type: 'string' check can then be dropped, since matching the literal string 'something' while excluding arrays already implies a string value:
$and: [
// not necessary any more, as this selection is already implied by the last part
// {
// "bla1.name": {
// "$type": "string"
// }
// },
{
"bla1.name": {
$not: {
"$type": "array"
}
}
}, {
"bla1.name": "something"
}
]
See the official docs: https://docs.mongodb.com/manual/reference/operator/query/type/#behavior
Here is a working demo on the Mongo playground: https://mongoplayground.net/p/3ri7Bjfrae8
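If you are running this from application code rather than the shell, a small sketch of the same filter with the MongoDB Scala driver could look like this; the connection string, database, and collection names are placeholders.

import org.bson.BsonDocument
import org.mongodb.scala._

import scala.concurrent.Await
import scala.concurrent.duration._

object FindNonArrayNames extends App {
  // Placeholder connection, database, and collection names.
  val collection = MongoClient("mongodb://localhost:27017")
    .getDatabase("mydb")
    .getCollection("mycollection")

  // Same filter as in the answer above: match the string value and exclude arrays.
  val filter = BsonDocument.parse(
    """{ "$and": [
      |  { "bla1.name": { "$not": { "$type": "array" } } },
      |  { "bla1.name": "something" }
      |] }""".stripMargin
  )

  val docs = Await.result(collection.find(filter).toFuture(), 10.seconds)
  docs.foreach(doc => println(doc.toJson()))
}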

Using ADF Copy Activity with dynamic schema mapping

I'm trying to drive the columnMapping property from a database configuration table. My first activity in the pipeline pulls in the rows from the config table. My copy activity source is a JSON file in Azure Blob Storage and my sink is an Azure SQL database.
In the copy activity I'm setting the mapping using the dynamic content window. The code looks like this:
"translator": {
"value": "#json(activity('Lookup1').output.value[0].ColumnMapping)",
"type": "Expression"
}
My question is, what should the value of activity('Lookup1').output.value[0].ColumnMapping look like?
I've tried several different JSON formats, but the copy activity always seems to ignore them.
For example, I've tried:
{
"type": "TabularTranslator",
"columnMappings": {
"view.url": "url"
}
}
and:
"columnMappings": {
"view.url": "url"
}
and:
{
"view.url": "url"
}
In this example, view.url is the name of the column in the JSON source, and url is the name of the column in my destination table in Azure SQL database.
The issue is due to the dot (.) sign in your column name.
To use column mapping, you should also specify the structure in your source and sink datasets.
For your source dataset, you need to specify your format correctly. And since your column name contains a dot, you need to specify the JSON path for it explicitly.
You could use the ADF UI to set up a copy for a single file first to get the related format, structure, and column mapping format, and then switch it to use the lookup.
As I understand it, your first format should be the right one. If the value is already in JSON format, then you may not need to use the "json" function in your expression.
There seems to be a disconnect between the question and the answer, so I'll hopefully provide a more straightforward answer.
When setting this up, you should have a source dataset with dynamic mapping. The sink doesn't require one, as we're going to specify it in the mapping.
Within the copy activity, format the dynamic json like the following:
{
"structure": [
{
"name": "Address Number"
},
{
"name": "Payment ID"
},
{
"name": "Document Number"
},
...
...
]
}
You would then specify your dynamic mapping like this:
{
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"name": "Address Number",
"type": "Int32"
},
"sink": {
"name": "address_number"
}
},
{
"source": {
"name": "Payment ID",
"type": "Int64"
},
"sink": {
"name": "payment_id"
}
},
{
"source": {
"name": "Document Number",
"type": "Int32"
},
"sink": {
"name": "document_number"
}
},
...
...
]
}
}
Assuming these were set in separate variables, you would want to send the source as a string, and the mapping as json:
source: #string(json(variables('str_dyn_structure')).structure)
mapping: #json(variables('str_dyn_translator')).translator
VladDrak - You could skip the source dynamic definition by building dynamic mapping like this:
{
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"type": "String",
"ordinal": "1"
},
"sink": {
"name": "dateOfActivity",
"type": "String"
}
},
{
"source": {
"type": "String",
"ordinal": "2"
},
"sink": {
"name": "CampaignID",
"type": "String"
}
}
]
}
}

How would I use the attribute as input for Value with JOLT?

For a specific function that I am building, I need to parse my JSON and, in some cases, have the attribute name itself (rather than its value) be used as a value. But how do I manage that with JOLT?
Let's say this is my input:
{
"Results": [
{
"FirstName": "John",
"LastName": "Doe"
},
{
"FirstName": "Mary",
"LastName": "Joe"
},
{
"FirstName": "Thomas",
"LastName": "Edison"
}
]
}
And this should be the outcome:
{
"Results": [
{
"Name": "FirstName",
"Value": "John"
},
{
"Name": "FirstName",
"Value": "Mary"
},
{
"Name": "FirstName",
"Value": "Thomas"
},
{
"Name": "LastName",
"Value": "Doe"
},
{
"Name": "LastName",
"Value": "Doe"
},
{
"Name": "LastName",
"Value": "Edison"
}
]
}
For those interested: I'm building JSON-to-Excel export functionality in Mendix and it has to be completely dynamic, regardless of the input. To accomplish this, I need an array where each attribute (corresponding to a column in Excel) has to be its own object with a column name and a value. If each column's data is its own object, I can simply say "create a column for each object with the same Name". A little bit difficult to explain, but it 'should' work.
Arrays and Jolt are not the best combination. Basically, there are 3 ways to deal with arrays in Shift:
1. You explicitly assign data to an array position, aka foo[0] and foo[1].
2. You reference a "number" that exists in the input data, aka foo[&2] and foo[&3].
3. You "accumulate" data into a list, aka foo[].
Your input data is an array of size 3. Your desired output is an array of size 6. You want this to be flexible and able to handle variable inputs.
This means option 3. So you have to "fix" / process your data into its "final form" while maintaining the original input JSON structure (a list with 3 items), and then accumulate all the "built" items into a list.
This means that you are building a list of lists, and then finally "squashing" it down to a single list.
Spec
[
  {
    // Step 1 : Pivot the data into parallel lists of keys and values,
    // maintaining the original outer input list structure.
    "operation": "shift",
    "spec": {
      "Results": {
        "*": { // results index
          "*": { // FirstName / LastName
            "$": "temp[&2].keys[]",
            "#": "temp[&2].values[]"
          }
        }
      }
    }
  },
  {
    // Step 2 : Un-pivot the data into the desired
    // Name/Value pairs, using the inner array index to
    // keep things organized/separated.
    "operation": "shift",
    "spec": {
      "temp": {
        "*": { // temp index
          "keys": {
            "*": "temp[&2].[&].Name"
          },
          "values": {
            "*": "temp[&2].[&].Value"
          }
        }
      }
    }
  },
  {
    // Step 3 : Accumulate the "finished" Name/Value pairs
    // into the final "one big list" output.
    "operation": "shift",
    "spec": {
      "temp": {
        "*": { // outer array
          "*": "Results[]"
        }
      }
    }
  }
]
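If you want to run this chained spec from code, here is a small usage sketch calling the Jolt Java library from Scala; the resource file names are placeholders, and jolt-core plus json-utils need to be on the classpath. JsonUtils is generally tolerant of the // comments used in the spec above; if your JSON loader is not, strip them first.

import com.bazaarvoice.jolt.{Chainr, JsonUtils}

object JoltRunner extends App {
  // Load the three-step chained spec shown above and the input document.
  val spec  = JsonUtils.classpathToList("/spec.json")
  val input = JsonUtils.classpathToMap("/input.json")

  val chainr = Chainr.fromSpec(spec)
  val output = chainr.transform(input)

  println(JsonUtils.toJsonString(output))
}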