Filter Lookup Results from values in Second Lookup in Azure Data Factory - azure-data-factory

I have two lookups within an Until activity in ADF. The first lookup (BookList) is a list of books that look like the JSON listed below.
[
{
"BookID": 1,
"BookName": "Book A"
},
{
"BookID": 2,
"BookName": "Book B"
}
]
The second lookup is a list of books that I want to exclude from the first list (ExcludedBooks) which is listed below.
[
{
"BookID": 2,
"BookName": "Book B"
}
]
After these two lookups, I have a Filter activity whose items are the values from BookList lookup. I would like the filter condition to be based on the BookID value not being listed in the ExcludedBooks values, but I'm not sure how to write this condition based on the collection functions in ADF. What I have is listed below which does not work.
@not(contains(activity('ExcludedBooks').output.value, item().BookID))
I realize one way to solve this is to loop through each record of ExcludedBooks and use a SetVariable activity to build an array of BookIDs, which WOULD work with the collection function contains(), but ADF does not allow nested activity groups for some reason (ForEach within an Until).
I also cannot set the list of excluded books outside of the Until activity, as it will change with each iteration of the Until activity. I also realize the workaround to the nested group activity restriction is to create a completely different pipeline, but that is not ideal and creates unnecessary complexity when trying to return the results.
Does anyone have any suggestions for how to filter the results of a lookup based on the results of another lookup?

The expression below doesn't work because each element of activity('ExcludedBooks').output.value is an object, while item().BookID is a number.
@not(contains(activity('ExcludedBooks').output.value, item().BookID))
If each item in ExcludedBooks has exactly the same properties as the items in BookList (as in the sample you provided), you can use this expression instead: @not(contains(activity('ExcludedBooks').output.value, item()))
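The difference between the two conditions can be reproduced outside ADF. The sketch below is plain Python, not ADF expression syntax, and it assumes that contains() on an array checks whether some element equals the given value. The names book_list and excluded_books are illustrative and mirror the sample Lookup outputs from the question.

```python
# Sample Lookup outputs from the question, as Python lists of dicts.
book_list = [
    {"BookID": 1, "BookName": "Book A"},
    {"BookID": 2, "BookName": "Book B"},
]
excluded_books = [{"BookID": 2, "BookName": "Book B"}]

# A number never equals an object, so comparing BookID against the
# list of objects filters nothing out (the original condition).
by_id = [b for b in book_list if b["BookID"] not in excluded_books]

# Comparing the whole object matches when every property is identical.
by_item = [b for b in book_list if b not in excluded_books]

print(by_id)    # both books are still present
print(by_item)  # only Book A remains
```

The whole-object comparison only works while both lookups return exactly the same properties for a book.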
On the other hand, if the items in ExcludedBooks look like this JSON (with BookList the same as in your question):
[
{
"BookID": 2,
"BookName": "Book B",
"num": 22
}
]
you can compare only their BookID values by using this expression:
@not(contains(join(activity('ExcludedBooks').output.value,','),concat('"BookID":',item().BookID,',')))
(This converts activity('ExcludedBooks').output.value to a string, builds the fragment "BookID":2, from the current BookList item, and checks whether the ExcludedBooks string contains that fragment.)
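The string-matching idea can be sketched in plain Python. This is not ADF expression syntax, and it assumes the array is serialized compactly (no spaces after ':' or ','), which is what the fragment "BookID":2, relies on. The helper is_excluded is a hypothetical name used only for this illustration.

```python
import json

book_list = [
    {"BookID": 1, "BookName": "Book A"},
    {"BookID": 2, "BookName": "Book B"},
]
excluded_books = [{"BookID": 2, "BookName": "Book B", "num": 22}]

# Assumption: compact serialization, mimicking join(..., ',') on the array.
excluded_text = ",".join(
    json.dumps(e, separators=(",", ":")) for e in excluded_books
)

def is_excluded(book):
    # Build the fragment, e.g. '"BookID":2,', and search for it in the text.
    needle = '"BookID":' + str(book["BookID"]) + ","
    return needle in excluded_text

kept = [b for b in book_list if not is_excluded(b)]
print(kept)  # only Book A remains
```

The trailing comma in the fragment is what keeps BookID 2 from matching BookID 22. It also means the match depends on BookID not being the last property of the serialized object.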
Hope this can help you.

Related

Is there a way to define single fields that are never indexed in firestore in all collections

I understand that indexes have a cost in Firestore. Most of the time we simply store objects without really thinking about indexes, even though we don’t want most of the fields to be indexed.
If I understand correctly, every field at every level is indexed, e.g. for the following document in pseudo-JSON:
{
"root_field1": "abc" (indexed)
"root_field2": "def" (indexed)
"root_field3": {
"sub_field1": "ghi" (indexed)
"sub_field2": "jkl" (indexed)
"sub_field3": {
"inner_field1": "mno" (indexed)
"inner_field2": "pqr" (indexed)
}
}
}
Let’s assume that I have the following record
{
"name": "abc"
"birthdate": "2000-01-01"
"gender": "m"
}
Let’s assume that I just want the field "name" to be indexed. One solution (A), which avoids having to specify every field, is to move the other root fields into an "unindexed" sub-level and exempt "unindexed" from indexing:
{
"name": "abc"
"unindexed": {
"birthdate": "2000-01-01"
"gender": "m"
}
}
Ideally I would like to just specify a prefix such as _ to prevent such fields from being indexed, but there is no global solution for that.
{
"name": "abc"
"_birthdate": "2000-01-01"
"_gender": "m"
}
Is my solution (A) correct and is there a more elegant generic solution?
Thanks!
According to the documentation
https://cloud.google.com/firestore/docs/query-data/indexing
Add a single-field index exemption
Single-field index exemptions allow you to override automatic index
settings for specific fields in a collection. You can add a
single-field exemption from the console:
Go to the Single Field Indexes section.
Click Add Exemption.
Enter a Collection ID and Field path.
Select new indexing settings for this field. Enable or disable
automatically updated ascending, descending, and array-contains
single-field indexes for this field.
Click Save Exemption.
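If the project's indexes are deployed with the Firebase CLI rather than configured in the console, the same exemption can be expressed in firestore.indexes.json. The Python sketch below only generates that JSON. The collection name "people" is a hypothetical example, the field path "unindexed" follows solution (A) from the question, and it assumes an empty "indexes" list is how automatic indexing is switched off for a field.

```python
import json

def exemption(collection_group, field_path):
    # An empty "indexes" list turns off the automatic single-field indexes.
    return {
        "collectionGroup": collection_group,
        "fieldPath": field_path,
        "indexes": [],
    }

# "people" is a hypothetical collection name used for illustration.
index_config = {
    "indexes": [],
    "fieldOverrides": [exemption("people", "unindexed")],
}

print(json.dumps(index_config, indent=2))
```

An exemption is defined per collection and field path, so this does not give the global prefix rule the question asks for.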

Azure Data Factory - Copy Activity - rest api collection reference

Hello everyone,
I am fairly new to Data Factory and I need to copy information from Dynamics Business Central's Rest API. I am struggling with the "Details" type entities such as "invoiceSalesHeader".
The API for that entity forces me to provide a header ID as a filter. In that sense, I would have to loop x times (a few thousand) and call the REST API to retrieve the lines of each sales invoice. I find that completely ridiculous and am trying to find other ways to get the information.
To avoid doing that, I am trying to get the information by calling the "salesInvoice" entity and use "$expand=salesInvoiceLines".
That gets me the information I need but inside data factory's Copy Activity, I am struggling with what I should put as a "collection reference" so that I end up with one row per salesInvoiceLine.
The data returned is an array of sales invoices with a sub array of invoice lines.
If I select "salesInvoiceLines" as the collection reference, I end up with "$['value'][0]['salesInvoiceLines']" and that only gives me the lines for the first invoice (since there is an index of zero).
What should I put in Collection Reference so that I get one row per salesInvoiceLine?
Iterating over a nested JSON array is not supported in ADF.
Alternatively, we can use a Flatten transformation in a data flow to flatten the nested JSON array.
Here is my example.
This is my sample JSON data, which has the same structure as yours:
[
{
"id": 1,
"Value": "January",
"orders":[{"orderid":1,"orderno":"qaz"},{"orderid":2,"orderno":"edc"}]
},
{
"id": 2,
"Value": "February",
"orders":[{"orderid":3,"orderno":"wsx"},{"orderid":4,"orderno":"rfv"}]
},
{
"id": 3,
"Value": "March",
"orders":[{"orderid":5,"orderno":"rfv"},{"orderid":6,"orderno":"tgb"}]
},
{
"id": 11,
"Value": "November",
"orders":[{"orderid":7,"orderno":"yhn"},{"orderid":8,"orderno":"ujm"}]
}
]
In the data flow, we select the nested JSON array to flatten (here, orders).
The result shows the orders arrays, whose elements each have 2 fields (orderid, orderno), flattened into 8 rows.
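What the flattening step produces can be sketched in plain Python. This only illustrates the resulting row shape on the sample data above and is not how the data flow is configured. The variable names are illustrative.

```python
sample = [
    {"id": 1, "Value": "January",
     "orders": [{"orderid": 1, "orderno": "qaz"}, {"orderid": 2, "orderno": "edc"}]},
    {"id": 2, "Value": "February",
     "orders": [{"orderid": 3, "orderno": "wsx"}, {"orderid": 4, "orderno": "rfv"}]},
    {"id": 3, "Value": "March",
     "orders": [{"orderid": 5, "orderno": "rfv"}, {"orderid": 6, "orderno": "tgb"}]},
    {"id": 11, "Value": "November",
     "orders": [{"orderid": 7, "orderno": "yhn"}, {"orderid": 8, "orderno": "ujm"}]},
]

# One output row per nested order, repeating the parent's fields on each row.
rows = [
    {"id": parent["id"], "Value": parent["Value"], **order}
    for parent in sample
    for order in parent["orders"]
]

for row in rows:
    print(row)
```

Applied to the question's data, the parent would be a sales invoice and the nested array would be salesInvoiceLines.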

Return Only Most Recent Record From Related Entity in OData Query

I am trying to create an OData query to return Bugs from Azure DevOps for a Power BI report, but I am not getting the results I am looking for, as one of the related entities that I am trying to expand returns multiple results.
My base Query looks like this (simplified & removing custom fields)
https://analytics.dev.azure.com/[organization]/[project]/_odata/v3.0-preview/WorkItems?$select=WorkItemId,WorkItemType,Title,State,LeadTimeDays&$filter=WorkItemType eq 'bug'&$expand=Teams($select=TeamName,AnalyticsUpdatedDate)
Some records return multiple Team Names in the JSON Response
"value": [
{
"WorkItemId": 16547,
"LeadTimeDays": 173.0639004,
"Title": "test",
"WorkItemType": "Bug",
"State": "Closed",
"Severity": "3 - Medium",
"Teams": [
{
"TeamName": "Team1",
"AnalyticsUpdatedDate": "2019-09-17T01:48:46.5433333Z"
},
{
"TeamName": "Team2",
"AnalyticsUpdatedDate": "2019-12-03T16:52:39.9466667Z"
}
]
}
]
I can't tell why these records have multiple values for this Entity, but I only need the most recent (Team 2 in the example above). Is it possible to return only the most recent record for the Related Teams Entity? I've tried using orderby and top on the expand clause and other places in the query to no effect. If I can't do it in the OData query, then I can accomplish it in Power BI after expanding the Table.
I found how to solve this. I needed semicolons between the clauses within the Expand clause.
https://analytics.dev.azure.com/[organization]/[project]/_odata/v3.0-preview/WorkItems?$select=WorkItemId,WorkItemType,Title,State,LeadTimeDays&$filter=WorkItemType eq 'bug'&$expand=Teams($select=TeamName,AnalyticsUpdatedDate;$orderby=AnalyticsUpdatedDate desc;$top=1)
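When the query string is assembled in code, the separator rule is easy to get wrong: options at the top level of the URL are joined with &, while options nested inside $expand(...) are joined with semicolons. A minimal Python sketch follows. The helper expand_clause is a hypothetical name, and the sketch does no URL encoding.

```python
def expand_clause(entity, options):
    # Options nested inside $expand(...) are separated by ';', not '&'.
    inner = ";".join(f"{name}={value}" for name, value in options.items())
    return f"$expand={entity}({inner})"

clause = expand_clause(
    "Teams",
    {
        "$select": "TeamName,AnalyticsUpdatedDate",
        "$orderby": "AnalyticsUpdatedDate desc",
        "$top": "1",
    },
)
print(clause)
```

The printed clause is the same $expand part used in the working URL above.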

Partial Representations in REST API's for collections vs items

I'm putting together a REST based API but I'm not sure on how I should deliver the response for collections vs individual resources.
Does it make sense to have a slimmed down representation for a collection over a single item in the world of REST?
Say I have something along the lines of this for a collection of albums:
{
items: [
{
"id": 1,
"title": "Thriller"
},
...
]
}
But then for the actual individual item I had
{
"id": 1,
"title": "Thriller",
"artist": "Michael Jackson",
"released": "1982",
"imageLinks": {
"smallThumbnail": "...",
"largeThumbnail": "..."
}
...
}
A resource representation should be the same regardless of whether it is returned as part of a collection or as a single item. However, you can introduce a new parameter such as fields, which clients can use to request only the fields they need, thereby saving bandwidth.
/albums - This should return the list of objects, each with the same structure you would return from the individual item API.
/albums?fields=id,title - This can return the list of objects with just the id and title.
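A minimal sketch of how a server might apply such a fields parameter, written in framework-free Python. The function select_fields and the in-memory ALBUMS list are hypothetical names for illustration, and the album values are taken from the example in the question.

```python
ALBUMS = [
    {"id": 1, "title": "Thriller", "artist": "Michael Jackson", "released": "1982"},
]

def select_fields(items, fields=None):
    """Return full items, or only the requested comma-separated fields."""
    if not fields:
        return items  # /albums: full representation
    wanted = [name.strip() for name in fields.split(",")]
    return [{k: item[k] for k in wanted if k in item} for item in items]

print(select_fields(ALBUMS))              # as for /albums
print(select_fields(ALBUMS, "id,title"))  # as for /albums?fields=id,title
```

Unknown field names are silently ignored here. Rejecting them with an error response would be an equally reasonable design.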

How can I model my meteor collection to feed three different reactive views

I am having some difficulty structuring my data so that I can benefit from reactivity in Meteor. Mainly, nesting arrays of objects makes queries tricky.
The three main views I am trying to project this data onto are
waiter: shows order for one table, each person's meal (items nested, essentially what I have below)
kitchen manager: columns of orders by table (only needs table, note, and the items)
cook: columns of items, by category where started=true (only need item info)
Currently I have a meteor collection of order objects like this:
Order {
table: "1",
waiter: "neil",
note: "note from kitchen",
meals: [
{
seat: "1",
items: [ {n: "potato", category: "fryer", started: false },
{n: "water", category: "drink" }
]
},
{
seat: "2",
items: [ {n: "water", category: "drink" } ]
},
]
}
Is there any way to query inside the nested array and apply some projection, or do I need to look at an entirely different data model?
Assuming you're building this app for one restaurant, there shouldn't be many active orders at any given time—presumably you don't have thousands of tables. Even if you want to keep the orders in the database after they're served, you could add a field active to separate out the current ones.
Then your query is simple: activeOrders = Orders.find({active: true}).fetch(). The fetch returns an array, which you could loop through several times for each of your views, using nested if and for loops as necessary to dig down into the child objects. See also Underscore's _.pluck. You don't need to get everything right with some complicated Mongo query, and in fact your app will run faster if you query once and process the results however many times you need to.
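The "query once, process several times" idea can be sketched as follows. The sketch is in Python purely for illustration; in a Meteor app the same loops would be written in JavaScript over the fetched array. The active field is the one suggested above, the order data is the sample from the question, and the view names are hypothetical.

```python
# Stand-in for the array returned by fetching the active orders.
active_orders = [
    {
        "table": "1", "waiter": "neil", "note": "note from kitchen", "active": True,
        "meals": [
            {"seat": "1", "items": [
                {"n": "potato", "category": "fryer", "started": False},
                {"n": "water", "category": "drink"},
            ]},
            {"seat": "2", "items": [{"n": "water", "category": "drink"}]},
        ],
    },
]

# Kitchen manager view: table, note and a flat list of items per order.
kitchen_view = [
    {"table": o["table"], "note": o["note"],
     "items": [i for m in o["meals"] for i in m["items"]]}
    for o in active_orders
]

# Cook view: names of started items, grouped by category.
cook_view = {}
for o in active_orders:
    for m in o["meals"]:
        for i in m["items"]:
            if i.get("started"):
                cook_view.setdefault(i["category"], []).append(i["n"])

print(kitchen_view)
print(cook_view)  # empty for the sample, since no item has started yet
```

The waiter view needs no reshaping, because the stored order document already has the nested per-seat structure.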