JSONB transformation help (PostgreSQL)

The SQL table rec_call has a jsonb column called config. It holds a deeply nested array of objects. Each object has an items field, which is an array of objects. Each of those objects in turn has a concepts field, which is also an array of objects.
[{ "items": [{ "concepts": [{ "text": "hello" }] }] }]
Every field is optional except $.items[*].concepts[*].text.
SELECT config FROM rec_call LIMIT 1;
[
{
"items": [
{
"tag": "whitelist",
"concepts": [
{
"text": "cable",
"wordsAttributes": {
"groupName": "Festnetz",
"score": 1.0
},
"score": 0.0,
"count": 0
},
{
"text": "tv",
"wordsAttributes": {
"groupName": "Group 2",
"score": 1.0
}
},
{
"text": "adhl",
},
{
"text": "internet",
"wordsAttributes": {
"groupName": "Group 2",
"score": 1.0
}
}
]
},
{
"tag": "blacklist",
"concepts": [
{
"text": "cable",
"wordsAttributes": {
"groupName": "sssdd",
"score": 1.0
},
"textOriginal": "Cable",
"score": 0.0,
"count": 0
},
{
"text": "cable"
}
]
},
{
"tag": "filler",
"concepts": [
{
"text": "something",
"wordsAttributes": {
"groupName": "filler",
"score": 1.0
},
"score": 0.0,
"count": 0
}
]
}
]
}
]
Write a migration that transforms the config column of every row into the following format.
Field transformation mappings:
the first-level grouping is by tag => whitelist | blacklist | filler,
the second-level grouping is by wordsAttributes.groupName. If it does not exist, `group` is an empty string.
for each group, every concepts[*]->text is pushed into the texts field
score => wordsAttributes.score
other is an empty array
The end result should be:
{
"whitelist": [
{
group: "Festnetz",
score: 1,
texts: ["cable"],
other: []
},
{
group: "Group 2",
score: 1,
texts: ["tv", "internet"],
other: []
},
{
group: "",
score: 1,
texts: ["adhl"],
other: []
}
],
"blacklist": [
{
group: "sssdd",
score: 1,
texts: ["cable"],
other: []
}
],
"filler": [
{
group: "filler",
score: 1,
texts: ["something"],
other: []
}
]
}
Postgres 11 preferably, but if that is not possible, a later version (> 11) is fine.
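To pin down the mapping before writing the SQL, here is a minimal Python sketch of the grouping rules listed above. It is not the migration itself, only a reference for what each row should become. The function name transform_config is made up for illustration. Two things are assumptions: a concept without wordsAttributes gets score 1 (as the "adhl" entry in the expected result suggests), and an item without a tag is grouped under an empty-string key.

```python
def transform_config(config):
    """Regroup a config array by tag, then by wordsAttributes.groupName."""
    grouped = {}  # tag -> {group name -> group object}, insertion-ordered
    for entry in config:
        for item in entry.get("items", []):
            # Assumption: a missing tag is grouped under "".
            groups = grouped.setdefault(item.get("tag", ""), {})
            for concept in item.get("concepts", []):
                attrs = concept.get("wordsAttributes") or {}
                name = attrs.get("groupName", "")
                group = groups.setdefault(name, {
                    "group": name,
                    # Assumption: default score is 1 when wordsAttributes is absent.
                    "score": attrs.get("score", 1),
                    "texts": [],
                    "other": [],
                })
                group["texts"].append(concept["text"])
    return {tag: list(groups.values()) for tag, groups in grouped.items()}


sample = [{"items": [{"tag": "whitelist", "concepts": [
    {"text": "cable", "wordsAttributes": {"groupName": "Festnetz", "score": 1.0}},
    {"text": "tv", "wordsAttributes": {"groupName": "Group 2", "score": 1.0}},
    {"text": "adhl"},
    {"text": "internet", "wordsAttributes": {"groupName": "Group 2", "score": 1.0}},
]}]}]
transformed = transform_config(sample)
```

Note that the score of a group is taken from the first concept seen in that group; the question does not say what should happen when scores within one group differ.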

Remove unwanted key on nested unique keys MongoDB

I have a MongoDB document like this:
"data": {
"2023-02-01": {
"123": {
"price": 100,
},
"234": {
"price": 100,
},
},
"2023-02-02": {
"123": {
"price": 100,
},
"234": {
"price": 100,
},
},
"2023-02-03": {
"123": {
"price": 100,
},
"234": {
"price": 100,
},
},
}
I have a list of mapped IDs in my system, like this:
ids = [123]
I want to remove every key that is not in the list (ids) from the document, starting from a specific date (today, e.g. "2023-02-02"). Both the date and the IDs keep changing. My expected result is:
"data": {
"2023-02-01": {
"123": {
"price": 100,
},
"234": {
"price": 100,
},
},
"2023-02-02": {
"123": {
"price": 100,
},
},
"2023-02-03": {
"123": {
"price": 100,
},
},
}
Could I achieve that with a MongoDB aggregation? I'm using pymongo.
Following the discussion in the comments, if refactoring the schema is an option, you can achieve what you need with a very simple query.
db.collection.update({
"date": {
$gte: ISODate("2023-02-02")
}
},
[
{
$set: {
value: {
$filter: {
input: "$value",
as: "v",
cond: {
$in: [
"$$v.key",
[
"123"
]
]
}
}
}
}
}
],
{
multi: true
})
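Since the question mentions pymongo, here is a rough sketch of how the same update might be issued from Python. The helper names build_prune_update and prune are made up for illustration, and collection is assumed to be a pymongo Collection holding documents in the refactored schema. Pipeline-style updates need MongoDB 4.2 or later.

```python
from datetime import datetime


def build_prune_update(allowed_keys, start):
    """Return (query, pipeline) mirroring the shell update above."""
    query = {"date": {"$gte": start}}
    pipeline = [{
        "$set": {
            "value": {
                "$filter": {
                    "input": "$value",
                    "as": "v",
                    # Keep only the entries whose key is in the allowed list.
                    "cond": {"$in": ["$$v.key", list(allowed_keys)]},
                }
            }
        }
    }]
    return query, pipeline


def prune(collection, allowed_keys, start):
    """Apply the update to every matching document (assumes a pymongo Collection)."""
    query, pipeline = build_prune_update(allowed_keys, start)
    return collection.update_many(query, pipeline)


query, pipeline = build_prune_update(["123"], datetime(2023, 2, 2))
```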
The schema I am proposing:
[
{
"date": ISODate("2023-02-01"),
"value": [
{
"key": "123",
"price": 100
},
{
"key": "234",
"price": 100
}
]
},
{
"date": ISODate("2023-02-02"),
"value": [
{
"key": "123",
"price": 100
},
{
"key": "234",
"price": 100
}
]
},
{
"date": ISODate("2023-02-03"),
"value": [
{
"key": "123",
"price": 100
},
{
"key": "234",
"price": 100
}
]
}
]
A few things to note about this schema:
it avoids using dynamic values as field names
it stores dates as proper date objects
it avoids deeply nested arrays/objects
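If the schema cannot be changed, one fallback is to prune the nested object client-side and write it back. Below is a minimal sketch of that idea on the original document shape. prune_from is a hypothetical helper, and it assumes the date keys are ISO "YYYY-MM-DD" strings, so plain string comparison matches date order.

```python
def prune_from(data, allowed_ids, start_date):
    """Drop ids not in allowed_ids from every date on or after start_date."""
    allowed = {str(i) for i in allowed_ids}  # keys in the document are strings
    pruned = {}
    for day, entries in data.items():
        if day >= start_date:
            pruned[day] = {k: v for k, v in entries.items() if k in allowed}
        else:
            pruned[day] = entries  # earlier dates are left untouched
    return pruned


data = {
    "2023-02-01": {"123": {"price": 100}, "234": {"price": 100}},
    "2023-02-02": {"123": {"price": 100}, "234": {"price": 100}},
    "2023-02-03": {"123": {"price": 100}, "234": {"price": 100}},
}
result = prune_from(data, [123], "2023-02-02")
```

This round-trips the whole data object through the client, so it is only reasonable for small documents and when concurrent writers are not a concern.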

How can I merge two documents, get rid of duplicates and keep certain data?

I have the following data, which describes who is going to do what work.
Basically I want to replace the "workId" and "userId" with objects that contain all the data from their respective documents and retain the "when" data.
I am starting with this data:
{
"schedule": {
"WorkId": "4e51dc1069c27c015ede4e3e",
"daily": [
{
"when": 1,
"U_W": [
{
"workId": "3a60dc1069c27c015ede1111",
"userId": "5f60c3b7f93d8e00a1cdf414"
},
{
"workId": "3a60dc1069c27c015ede1122",
"userId": "5f60c3b7f93d8e00a1cdf415"
}
]
}
]
}
}
Here is the user collection:
"userSchema": [
{
_id: "5f60c3b7f93d8e00a1cdf414",
Name: "Bob"
},
{
_id: "5f60c3b7f93d8e00a1cdf415",
Name: "Joe"
}
],
Here is the work collection:
"workSchema": [
{
_id: "3a60dc1069c27c015ede1111",
Name: "shovel"
},
{
_id: "3a60dc1069c27c015ede1122",
Name: "hammer"
}
]
What I want to end up with is this:
{
"schedule": {
"WorkId": "4e51dc1069c27c015ede4e3e",
"daily": [
{
"when": 1,
"U_W": [
{
"work": {
"id": "3a60dc1069c27c015ede1111",
"name": "shovel"
},
"user": {
"id": "5f60c3b7f93d8e00a1cdf414",
"name": "Bob"
}
},
{
"work": {
"id": "3a60dc1069c27c015ede1122",
"name": "hammer"
},
"user": {
"id": "5f60c3b7f93d8e00a1cdf415",
"name": "Joe"
}
}
]
}
]
}
}
Here is my first attempt:
I have it joining the two documents.
How can I get rid of the duplicates (Bob:hammer and Joe:shovel)?
And how do I include the "when" field?
The playground produces the following:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Bob",
"_id": "5f60c3b7f93d8e00a1cdf414"
},
"work_role": {
"Name": "shovel",
"_id": "3a60dc1069c27c015ede1111"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Bob",
"_id": "5f60c3b7f93d8e00a1cdf414"
},
"work_role": {
"Name": "hammer",
"_id": "3a60dc1069c27c015ede1122"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Joe",
"_id": "5f60c3b7f93d8e00a1cdf415"
},
"work_role": {
"Name": "shovel",
"_id": "3a60dc1069c27c015ede1111"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Joe",
"_id": "5f60c3b7f93d8e00a1cdf415"
},
"work_role": {
"Name": "hammer",
"_id": "3a60dc1069c27c015ede1122"
}
}
]
After beating my head against the wall for some time, I found a pretty cool Mongoose feature: references.
For example, in the schema:
REF_work: { type: Schema.Types.ObjectId, required: true, ref: 'work' },
REF_person: { type: Schema.Types.ObjectId, required: true, ref: 'users' },
Then, when I query from my get function, I add populate calls to the find:
assignments.find(query).populate('daily.cp.REF_person').populate('daily.cp.REF_work');
I get exactly what I want:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"REF_person": {
"Name": "Bob",
"_id": "5f60c3b7f93d8e00a1cdf414"
},
"REF_work": {
"Name": "shovel",
"_id": "3a60dc1069c27c015ede1111"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"REF_person": {
"Name": "Joe",
"_id": "5f60c3b7f93d8e00a1cdf415"
},
"REF_work": {
"Name": "hammer",
"_id": "3a60dc1069c27c015ede1122"
}
}
]
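Conceptually, this works because each reference is resolved by id on its own, instead of joining every user with every work item, so no cross-product duplicates appear and "when" stays where it was. Here is a minimal in-memory Python sketch of that lookup against the data from the question. resolve_schedule is a hypothetical helper, not part of any library.

```python
def resolve_schedule(doc, users, works):
    """Replace workId/userId pairs with embedded work/user objects."""
    user_by_id = {u["_id"]: u for u in users}
    work_by_id = {w["_id"]: w for w in works}
    daily = []
    for day in doc["schedule"]["daily"]:
        pairs = []
        for pair in day["U_W"]:
            # Each pair is resolved as a unit, so Bob is never matched with "hammer".
            work = work_by_id[pair["workId"]]
            user = user_by_id[pair["userId"]]
            pairs.append({
                "work": {"id": work["_id"], "name": work["Name"]},
                "user": {"id": user["_id"], "name": user["Name"]},
            })
        daily.append({"when": day["when"], "U_W": pairs})
    return {"schedule": {"WorkId": doc["schedule"]["WorkId"], "daily": daily}}


users = [{"_id": "5f60c3b7f93d8e00a1cdf414", "Name": "Bob"},
         {"_id": "5f60c3b7f93d8e00a1cdf415", "Name": "Joe"}]
works = [{"_id": "3a60dc1069c27c015ede1111", "Name": "shovel"},
         {"_id": "3a60dc1069c27c015ede1122", "Name": "hammer"}]
doc = {"schedule": {"WorkId": "4e51dc1069c27c015ede4e3e", "daily": [{"when": 1, "U_W": [
    {"workId": "3a60dc1069c27c015ede1111", "userId": "5f60c3b7f93d8e00a1cdf414"},
    {"workId": "3a60dc1069c27c015ede1122", "userId": "5f60c3b7f93d8e00a1cdf415"},
]}]}}
resolved = resolve_schedule(doc, users, works)
```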

How to insert an object into the players array in MongoDB?

I have the document below and I need to insert an object into the players array. How can I do this with MongoDB?
{
"data": {
"createTournament": {
"_id": "6130d9a565aa744f173a824a",
"title": "Jogo de truco",
"description": "",
"status": "PENDING",
"size": 8,
"prizePool": 20,
"currency": "USD",
"type": "Battle",
"entryFee": 1,
"startDate": "2021-09-01",
"endDate": "2021-09-01",
"rounds": [{
"round": 1,
"totalMatches": 4,
"matches": [{
"match": 1,
"players": []
}
]
}]
}
}
}
The following update pushes "3" into the players array of the match whose match is 1, inside the round whose round is 1:
db.collection('example').updateOne({},
{
$push: { "data.createTournament.rounds.$[outer].matches.$[inner].players": "3" }
},
{ "arrayFilters": [
{ "outer.round": 1 }, // change this to choose the round to push into
{ "inner.match": 1 } // change this to choose the match to push into
] }
)
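To make clear what the two array filters select, here is a small Python sketch. push_player emulates the effect of the update on an in-memory document, and build_push returns the arguments one might pass to pymongo's update_one (which accepts an array_filters keyword). Both function names are made up for illustration.

```python
def push_player(doc, round_no, match_no, player):
    """In-memory equivalent of $push with the outer/inner array filters."""
    for rnd in doc["data"]["createTournament"]["rounds"]:
        if rnd["round"] != round_no:      # plays the role of {"outer.round": ...}
            continue
        for match in rnd["matches"]:
            if match["match"] == match_no:  # plays the role of {"inner.match": ...}
                match["players"].append(player)
    return doc


def build_push(round_no, match_no, player):
    """Return (update, array_filters) for a pymongo update_one call."""
    update = {"$push": {
        "data.createTournament.rounds.$[outer].matches.$[inner].players": player
    }}
    array_filters = [{"outer.round": round_no}, {"inner.match": match_no}]
    return update, array_filters


doc = {"data": {"createTournament": {"rounds": [
    {"round": 1, "matches": [{"match": 1, "players": []},
                             {"match": 2, "players": []}]},
]}}}
push_player(doc, 1, 1, "3")
update, array_filters = build_push(1, 1, "3")
# With a real pymongo Collection (assumed), the call would look like:
# collection.update_one({}, update, array_filters=array_filters)
```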

MongoDB count value of attribute and group by month and year

I have documents in MongoDB and I am trying to count each distinct description and classify the counts by the month the documents were created.
I want to return an array of objects, each containing a month number and a sub-array of description values with their counts.
I want to be able to choose which description values to count and which year the data is taken from.
For example, these are my documents:
[
{
"username": "ron",
"skills": [
{
"rank": "high",
list: [
{
"subject": "Football"
},
{
"subject": "Swim"
}
]
},
{
"rank": "low",
list: [
{
"subject": "Baseball"
},
]
}
],
"duration": 0,
"date": ISODate("2021-07-24T12:05:27.127Z"),
"createdAt": ISODate("2021-07-24T12:05:49.985Z"),
"updatedAt": ISODate("2021-07-24T12:05:49.985Z"),
"__v": 0
},
{
"username": "john",
"skills": [
{
"rank": "low",
list: [
{
"subject": "Football"
},
]
}
],
"duration": 0,
"date": ISODate("2021-07-25T12:05:53.000Z"),
"createdAt": ISODate("2021-07-24T12:05:59.249Z"),
"updatedAt": ISODate("2021-07-24T12:05:59.249Z"),
"__v": 0
},
{
"username": "david",
"skills": [
{
"rank": "high",
list: [
{
"subject": "Football"
},
{
"subject": "Baseball"
}
]
},
{
"rank": "low",
list: [
{
"subject": "Swim"
},
]
}
],
"duration": 0,
"date": ISODate("2021-08-26T12:06:13.000Z"),
"createdAt": ISODate("2021-07-24T12:06:21.328Z"),
"updatedAt": ISODate("2021-07-24T12:06:21.328Z"),
"__v": 0
},
{
"username": "david",
"skills": [
{
"request": "high",
list: [
{
"subject": "Swim"
},
]
},
{
"request": "low",
list: [
{
"subject": "Football"
},
{
"subject": "Baseball"
},
]
}
],
"duration": 0,
"date": ISODate("2021-01-21T13:07:50.000Z"),
"createdAt": ISODate("2021-07-24T12:08:05.552Z"),
"updatedAt": ISODate("2021-07-24T12:14:51.285Z"),
"__v": 0
},
{
"username": "david",
"skills": [
{
"rank": "high",
list: [
{
"subject": "Football"
},
]
},
],
"duration": 0,
"date": ISODate("2022-01-21T13:07:50.000Z"),
"createdAt": ISODate("2022-07-24T12:08:05.552Z"),
"updatedAt": ISODate("2022-07-24T12:14:51.285Z"),
"__v": 0
}
]
The result I expect if I match only the descriptions equal to "Football" or "Baseball":
[
{"month_number": 7, "result": [{"description": "Football", "count": 2}, {"description": "Baseball", "count": 1}]},
{"month_number": 8, "result": [{"description": "Football", "count": 1}]}
]
I'm new to MongoDB. So far I have been able to count how many there are of each value and display only the values I want, but I do not know how to classify them into months for the year I choose.
I tried this:
db.exercises.aggregate([{$match:{ $and:[{description:{$in: ["Baseball","Football"]}}]}} ,{$group:{_id:"$description",count:{$sum:1}}}])
and this:
db.exercises.aggregate(
[
{
$project:
{
_id: 0,
year: { $year: "$date" },
month: { $month: "$date" },
description:"$description"
}
},
{
$match:{$and:[{description:{$in: ["Baseball","Football","Swim"]},year:2021}]}
},
{$group:{_id:"$description" , count:{$sum:1}}}
]
)
You need to add one more grouping, i.e. by month:
db.collection.aggregate([
{
$project: {
_id: 0,
year: {
$year: "$date"
},
month: {
$month: "$date"
},
description: "$description"
}
},
{
$match: {
$and: [
{
description: {
$in: [
"Baseball",
"Football",
"Swim"
]
},
year: 2021
}
]
}
},
{
$group: {
_id: {
description: "$description",
month: "$month"
},
count: {
$sum: 1
}
}
},
{
$group: {
_id: {
month_number: "$_id.month"
},
results: {
$push: {
description: "$_id.description",
count: "$count"
}
}
}
},
{
$project: {
_id: 0,
month_number: "$_id.month_number",
results: 1
}
}
])
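The two $group stages can be easier to follow as plain code: first count per (month, description), then collect those counts per month. A minimal Python sketch of the same idea follows. count_by_month is a hypothetical helper, and the sample documents are simplified to just a description and a date, which is an assumption about the real collection.

```python
from collections import Counter, defaultdict
from datetime import datetime


def count_by_month(docs, descriptions, year):
    """Count descriptions per month for one year, grouped like the pipeline."""
    # First grouping: one counter per (month, description) pair.
    counts = Counter()
    for doc in docs:
        if doc["date"].year == year and doc["description"] in descriptions:
            counts[(doc["date"].month, doc["description"])] += 1
    # Second grouping: push each (description, count) into its month.
    by_month = defaultdict(list)
    for (month, description), count in counts.items():
        by_month[month].append({"description": description, "count": count})
    return [{"month_number": m, "results": r} for m, r in sorted(by_month.items())]


docs = [
    {"description": "Football", "date": datetime(2021, 7, 24)},
    {"description": "Football", "date": datetime(2021, 7, 25)},
    {"description": "Baseball", "date": datetime(2021, 7, 25)},
    {"description": "Swim", "date": datetime(2021, 7, 26)},
    {"description": "Football", "date": datetime(2021, 8, 26)},
    {"description": "Football", "date": datetime(2022, 1, 21)},
]
monthly = count_by_month(docs, {"Football", "Baseball"}, 2021)
```

Unlike the sketch, the pipeline does not sort by month; a $sort stage on month_number could be appended if the order matters.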

MongoDB count values of multiple nested documents

My data looks something like this:
[
{
"_id": 1,
"members": [
{
"id": 1,
"name": "name_1",
"assigned_tasks": [
1,
2,
3
]
},
{
"id": 1,
"name": "name_2",
"assigned_tasks": [
1
]
}
],
"tasks": [
{
"id": 1,
"name": "task_1",
},
{
"id": 2,
"name": "task_2",
},
{
"id": 3,
"name": "task_3",
}
]
}
]
I have a collection that represents a "class", which contains a list of members and a list of tasks.
Each member can be assigned to multiple tasks.
I want to count the number of members assigned to each task and add it to the results as a new field, like this:
[
{
"_id": 1,
"members": [
{
"id": 1,
"name": "name_1",
"assigned_tasks": [
1,
2,
3
]
},
{
"id": 1,
"name": "name_2",
"assigned_tasks": [
1
]
}
],
"tasks": [
{
"id": 1,
"name": "task_1",
"number_of_assigned_members":2
},
{
"id": 2,
"name": "task_2",
"number_of_assigned_members":1
},
{
"id": 3,
"name": "task_3",
"number_of_assigned_members":1
}
]
}
]
How can I create that query?
You can use $map and then $reduce.
$map iterates over tasks object by object; for each task, $reduce runs over members and checks whether the task id is in assigned_tasks, adding 1 if it is and 0 otherwise:
db.collection.aggregate([
{
$addFields: {
tasks: {
$map: {
input: "$tasks",
as: "t",
in: {
$mergeObjects: [
"$$t",
{
number_of_assigned_members: {
$reduce: {
input: "$members",
initialValue: 0,
in: {
$cond: [
{ $in: ["$$t.id", "$$this.assigned_tasks"] },
{ $add: ["$$value", 1] },
"$$value"
]
}
}
}
}
]
}
}
}
}
}
])
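For readers less familiar with $map and $reduce, the same logic can be sketched in plain Python: a comprehension stands in for $map and functools.reduce for $reduce. add_member_counts is a hypothetical name used only for this illustration.

```python
from functools import reduce


def add_member_counts(doc):
    """Return a copy of doc whose tasks carry number_of_assigned_members."""
    def count_members(task):
        # Fold over members, adding 1 whenever the task id is assigned.
        return reduce(
            lambda acc, m: acc + 1 if task["id"] in m.get("assigned_tasks", []) else acc,
            doc["members"],
            0,
        )

    tasks = [
        {**task, "number_of_assigned_members": count_members(task)}
        for task in doc["tasks"]
    ]
    return {**doc, "tasks": tasks}


doc = {
    "_id": 1,
    "members": [
        {"id": 1, "name": "name_1", "assigned_tasks": [1, 2, 3]},
        {"id": 1, "name": "name_2", "assigned_tasks": [1]},
    ],
    "tasks": [
        {"id": 1, "name": "task_1"},
        {"id": 2, "name": "task_2"},
        {"id": 3, "name": "task_3"},
    ],
}
counted = add_member_counts(doc)
```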