Why do Doctrine2 ODM's findBy() and createQueryBuilder()->getQuery()->execute() results differ from each other? - mongodb

I have tried two different ways to do the same query with Doctrine's MongoDB-ODM.
Can you figure out why these two, in my opinion similar, queries return different results? Snippet 1 doesn't return anything, whereas Snippet 2 returns the correct database entries. Both queries look similar in the log file, except that #1 does not have the skip & limit lines.
Snippet 1
$dateDayAgo = new \DateTime('1 day ago');
$recentLogins = $this->get('user_activity_tracker')->findBy(array(
    'targetUser' => $userAccount->getId(),
    'code' => array('$in' => array('login.attempt', 'login.ok')),
    'ts' => array('$gte', $dateDayAgo)
))->sort(['ts' => 1]);
Symfony's log entries for Snippet 1:
[2012-08-13 09:14:33] doctrine.INFO: MongoDB query: { "find": true, "query": { "targetUser": ObjectId("4fa377e06803fa7303000002"), "code": { "$in": [ "login.attempt", "login.ok" ] }, "ts": [ "$gte", new Date("Sun, 12 Aug 2012 09:14:33 +0000") ] }, "fields": [ ], "db": "eventio_com", "collection": "ActivityEvent" } [] []
[2012-08-13 09:14:33] doctrine.INFO: MongoDB query: { "sort": true, "sortFields": { "ts": 1 }, "query": { "targetUser": ObjectId("4fa377e06803fa7303000002"), "code": { "$in": [ "login.attempt", "login.ok" ] }, "ts": [ "$gte", new Date("Sun, 12 Aug 2012 09:14:33 +0000") ] }, "fields": [ ] } [] []
Snippet 2
$recentLoginsQuery = $this->get('user_activity_tracker')->createQueryBuilder()
    ->field('targetUser')->equals($userAccount->getId())
    ->field('code')->in(array('login.attempt', 'login.ok'))
    ->field('ts')->gte($dateDayAgo)
    ->sort('ts', 'asc')
    ->getQuery();
$recentLogins = $recentLoginsQuery->execute();
Log entries for Snippet 2:
[2012-08-13 09:17:30] doctrine.INFO: MongoDB query: { "find": true, "query": { "targetUser": ObjectId("4fa377e06803fa7303000002"), "code": { "$in": [ "login.attempt", "login.ok" ] }, "ts": { "$gte": new Date("Sun, 12 Aug 2012 09:17:30 +0000") } }, "fields": [ ], "db": "eventio_com", "collection": "ActivityEvent" } [] []
[2012-08-13 09:17:30] doctrine.INFO: MongoDB query: { "limit": true, "limitNum": null, "query": { "targetUser": ObjectId("4fa377e06803fa7303000002"), "code": { "$in": [ "login.attempt", "login.ok" ] }, "ts": { "$gte": new Date("Sun, 12 Aug 2012 09:17:30 +0000") } }, "fields": [ ] } [] []
[2012-08-13 09:17:30] doctrine.INFO: MongoDB query: { "skip": true, "skipNum": null, "query": { "targetUser": ObjectId("4fa377e06803fa7303000002"), "code": { "$in": [ "login.attempt", "login.ok" ] }, "ts": { "$gte": new Date("Sun, 12 Aug 2012 09:17:30 +0000") } }, "fields": [ ] } [] []
[2012-08-13 09:17:30] doctrine.INFO: MongoDB query: { "sort": true, "sortFields": { "ts": 1 }, "query": { "targetUser": ObjectId("4fa377e06803fa7303000002"), "code": { "$in": [ "login.attempt", "login.ok" ] }, "ts": { "$gte": new Date("Sun, 12 Aug 2012 09:17:30 +0000") } }, "fields": [ ] } [] []
My 'user_activity_tracker' service works just as a proxy to the underlying Doctrine repository / document manager. Both snippets get a LoggableCursor back after the query.

The extra log output with the query builder method is due to Query::prepareCursor(), which always sets additional cursor options. The repository findBy() method, which utilizes DocumentPersister::loadAll(), only sets options if a non-null value is provided. That explains the difference in log output, but it is unrelated to any difference in the result sets.
The logged queries for each example are identical, apart from a small drift in the ts criteria. If the count() values of the two cursors differ, and the results are still different after unwrapping the cursors with iterator_to_array(), I would suggest reproducing this in a failing test case and submitting a pull request against the mongodb-odm repository.
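Following that suggestion, here is a rough mongo-shell sketch for comparing counts of the two criteria exactly as they appear in the logs above (the database, collection name, and ObjectId are copied from the log lines; the "1 day ago" date is recomputed here, so the counts can drift slightly from the original runs):
// Count matches for each criterion exactly as it was logged, against the
// eventio_com database and the ActivityEvent collection from the log output.
db = db.getSiblingDB("eventio_com");
var dayAgo = new Date(Date.now() - 24 * 60 * 60 * 1000);

// Criterion as logged for Snippet 1 ("ts" is a plain array value):
db.ActivityEvent.count({
  targetUser: ObjectId("4fa377e06803fa7303000002"),
  code: { $in: ["login.attempt", "login.ok"] },
  ts: ["$gte", dayAgo]
});

// Criterion as logged for Snippet 2 ("ts" uses the $gte operator):
db.ActivityEvent.count({
  targetUser: ObjectId("4fa377e06803fa7303000002"),
  code: { $in: ["login.attempt", "login.ok"] },
  ts: { $gte: dayAgo }
});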

Related

MongoDB Atlas Search not showing results when typing a few characters

The problem I am facing is that I want to develop an autocomplete search bar using the MEAN stack, like the one on this site, but when I type, for example, 'ag', it does not return the right location, which should be 'Aguascalientes'.
I have two different search indexes set up and a different query for each.
First Index:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "name": {
        "foldDiacritics": false,
        "maxGrams": 7,
        "minGrams": 3,
        "tokenization": "edgeGram",
        "type": "autocomplete"
      },
      "searchName": {
        "foldDiacritics": false,
        "maxGrams": 7,
        "minGrams": 3,
        "tokenization": "edgeGram",
        "type": "autocomplete"
      }
    }
  }
}
First Query:
[
  {
    $search: {
      index: "autocomplete2",
      compound: {
        must: [
          {
            text: {
              query: search,
              path: "searchName",
              fuzzy: {
                maxEdits: 2,
              },
            },
          },
        ],
      },
    },
  },
  {
    $limit: 10,
  },
]
The first setup is not returning any documents at all, but the second one is:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "name": {
        "analyzer": "lucene.standard",
        "type": "string"
      },
      "searchName": {
        "analyzer": "lucene.standard",
        "type": "string"
      }
    }
  }
}
Query:
[
  {
    $search: {
      index: 'default',
      compound: {
        must: [
          {
            text: {
              query: search,
              path: 'name',
              fuzzy: {
                maxEdits: 1,
              },
            },
          },
          {
            text: {
              query: search,
              path: 'searchName',
              fuzzy: {
                maxEdits: 1,
              },
            },
          },
        ],
      },
    },
  },
  {
    $limit: 5,
  },
]
The second example only returns documents if the search term is 'aguascalient' or longer, but it returns nothing if the search term is shorter, unlike the site. Maybe it has something to do with the fuzzy edits, but if I set maxEdits to greater than 2 I get an error.
Also, the order is not right: it returns the CITY first and the STATE second, but I need the STATE first because the search term is more similar to it than to the city. Let me explain: the search field for the STATE is just 'Aguascalientes', while the search field for the city is 'Aguascalientes Aguascalientes', so I don't know why it is not working properly. Maybe I should assign weights accordingly, but I'm not sure that is the right approach.
My data structure:
{
  "_id": "638d0ffc34ad076c6bd12cb6",
  "depth": 2,
  "label": "CITY",
  "location_id": "V1-C-247",
  "name": "Aguascalientes",
  "parent": "Aguascalientes",
  "fullName": "Aguascalientes, Aguascalientes",
  "parentId": "V1-B-61",
  "searchName": "Aguascalientes Aguascalientes",
}
{
  "_id": "638d0ffc34ad076c6bd12cb6",
  "depth": 1,
  "label": "STATE",
  "location_id": "V1-C-248",
  "name": "Aguascalientes",
  "parent": null,
  "fullName": "Aguascalientes",
  "parentId": null,
  "searchName": "Aguascalientes",
}
For the first index + query setup:
First, you are indexing the name field but are not searching on it. I will remove it from the code snippets for readability, but you can add it back to your index definition if you find you need to search on it.
There are two problems with this index + query setup if you want to return results for the query "ag". You have searchName defined as a field mapping of type autocomplete, but you also need to use the autocomplete operator in your query:
[
  {
    $search: {
      index: "autocomplete2",
      compound: {
        must: [
          {
            autocomplete: {
              query: search,
              path: "searchName",
            },
          },
        ],
      },
    },
  },
  {
    $limit: 10,
  },
]
Second, in your index definition's field mapping for searchName, you have minGrams set to 3 and maxGrams set to 7. Based on the documentation for the autocomplete field mapping, this means your data will be tokenized into sequences of between 3 and 7 characters, using the selected tokenization strategy. Since you have selected edgeGram, the text "Aguascalientes" will be tokenized starting from the left edge, resulting in the tokens "agu", "agua", "aguas", "aguasc", and "aguasca". Since the search term "ag" does not match any of those tokens, nothing is returned. So you must change minGrams to 2 to get the token "ag":
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "searchName": {
        "foldDiacritics": false,
        "maxGrams": 7,
        "minGrams": 2,
        "tokenization": "edgeGram",
        "type": "autocomplete"
      }
    }
  }
}
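If you manage this index from mongosh rather than from the Atlas UI, a sketch of applying that change could look like the following (the collection name locations is a placeholder, and db.collection.updateSearchIndex() requires a recent MongoDB/mongosh version; on older versions, edit the index definition in the Atlas UI instead):
// Hypothetical: update the existing "autocomplete2" Atlas Search index so that
// minGrams is 2 and the edgeGram token "ag" is generated for "Aguascalientes".
// "locations" is a placeholder collection name.
db.locations.updateSearchIndex("autocomplete2", {
  mappings: {
    dynamic: false,
    fields: {
      searchName: {
        foldDiacritics: false,
        maxGrams: 7,
        minGrams: 2,
        tokenization: "edgeGram",
        type: "autocomplete"
      }
    }
  }
});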
Finally, if you want the document with an exact match to be returned ahead of a partial match, i.e. "Aguascalientes" should come before "Aguascalientes Aguascalientes", you need to implement exact matching. Here is a MongoDB blog post outlining a few options.
One option that I tried: in the index, use a keyword analyzer on the "searchName" field typed as a string data type. In the query, use the text operator nested in a should clause so that exact matches rank higher than other results.
Index:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "searchName": [
        {
          "foldDiacritics": false,
          "maxGrams": 7,
          "type": "autocomplete"
        },
        {
          "analyzer": "lucene.keyword",
          "searchAnalyzer": "lucene.keyword",
          "type": "string"
        }
      ]
    }
  }
}
Query:
[
  {
    $search: {
      compound: {
        must: [
          {
            autocomplete: {
              query: search,
              path: "searchName"
            }
          }
        ],
        should: [
          {
            text: {
              query: search,
              path: "searchName"
            }
          }
        ],
      },
    },
  },
]
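To verify that the exact match now ranks above the partial match, you can project the relevance score next to each result. A sketch, assuming the combined index above is your default index and using "Aguascalientes" as an example search term (the collection name locations is a placeholder):
// Hypothetical check of result ordering: project the Atlas Search relevance
// score so you can confirm that the STATE document ("Aguascalientes") scores
// above the CITY document ("Aguascalientes Aguascalientes").
db.locations.aggregate([
  {
    $search: {
      compound: {
        must: [{ autocomplete: { query: "Aguascalientes", path: "searchName" } }],
        should: [{ text: { query: "Aguascalientes", path: "searchName" } }]
      }
    }
  },
  { $limit: 10 },
  { $project: { label: 1, name: 1, searchName: 1, score: { $meta: "searchScore" } } }
]);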

MongoDB: $set specific fields of a document's array elements only if not null

I have a collection with the following documents (for example):
{
"_id": {
"$oid": "61acefe999e03b9324czzzzz"
},
"matchId": {
"$oid": "61a392cc54e3752cc71zzzzz"
},
"logs": [
{
"actionType": "CREATE",
"data": {
"talent": {
"talentId": "qq",
"talentVersion": "2.10",
"firstName": "Joelle",
"lastName": "Doe",
"socialLinks": [
{
"type": "FACEBOOK",
"url": "https://www.facebook.com"
},
{
"type": "LINKEDIN",
"url": "https://www.linkedin.com"
}
],
"webResults": [
{
"type": "VIDEO",
"date": "2021-11-28T14:31:40.728Z",
"link": "http://placeimg.com/640/480",
"title": "Et necessitatibus",
"platform": "Repellendus"
}
]
},
"createdBy": "DEVELOPER"
}
},
{
"actionType": "UPDATE",
"data": {
"talent": {
"firstName": "Joelle new",
"webResults": [
{
"type": "VIDEO",
"date": "2021-11-28T14:31:40.728Z",
"link": "http://placeimg.com/640/480",
"title": "Et necessitatibus",
"platform": "Repellendus"
}
]
}
}
}
]
},
{
"_id": {
"$oid": "61acefe999e03b9324caaaaa"
},
"matchId": {
"$oid": "61a392cc54e3752cc71zzzzz"
},
"logs": [....]
}
A brief breakdown: I have many objects like this one in the collection. They are a kind of audit log for actions taken on other documents ('Matches'), for example CREATE + the data, UPDATE + the data, etc.
As you can see, the logs field of the document is an array of objects, each describing one of these actions.
The data for each action may or may not contain specific fields, which in turn can also be arrays of objects: socialLinks and webResults.
I'm trying to remove sensitive data from all of these documents with the specified Match ids.
For each document, I want to go over the logs array field and change the value of specific fields only if they exist: for example, change firstName to ***** (same for lastName) if it appears. Also, go over the socialLinks array if it exists and, for each element inside it, change the url field to ***** as well if it exists.
What I've tried so far are many minor variations of this query:
$set: {
  'logs.$[].data.talent.socialLinks.$[].url': '*****',
  'logs.$[].data.talent.webResults.$[].link': '*****',
  'logs.$[].data.talent.webResults.$[].title': '*****',
  'logs.$[].data.talent.firstName': '*****',
  'logs.$[].data.talent.lastName': '*****',
},
and some experimenting with this kind of aggregation query:
[{
  $set: {
    'talent.socialLinks.$[el].url': {
      $cond: [{ $ne: ['el.url', null] }, '*****', undefined],
    },
  },
}]
resulting in errors like: "The path 'logs.0.data.talent.socialLinks' must exist in the document in order to apply array updates."
But I just can't get it to work... :(
I would love an explanation of how to achieve this kind of set-only-if-exists behaviour.
A working example would also be much appreciated, thanks.
I would suggest using $[<identifier>] (the filtered positional operator) together with arrayFilters to update the nested document(s) in the array field.
In arrayFilters, use $exists to check that the relevant field exists, so that only the array elements which match the condition are updated.
db.collection.update({},
  {
    $set: {
      "logs.$[a].data.talent.socialLinks.$[].url": "*****",
      "logs.$[b].data.talent.webResults.$[].link": "*****",
      "logs.$[b].data.talent.webResults.$[].title": "*****",
      "logs.$[c].data.talent.firstName": "*****",
      "logs.$[d].data.talent.lastName": "*****",
    }
  },
  {
    arrayFilters: [
      {
        "a.data.talent.socialLinks": {
          $exists: true
        }
      },
      {
        "b.data.talent.webResults": {
          $exists: true
        }
      },
      {
        "c.data.talent.firstName": {
          $exists: true
        }
      },
      {
        "d.data.talent.lastName": {
          $exists: true
        }
      }
    ]
  }
)
Sample Mongo Playground
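Since you only want to scrub documents with specific Match ids, the empty filter {} above can be narrowed and the update applied to every matching document. A sketch of that variation (the matchIds array is a placeholder you would fill with your real ObjectIds):
// Hypothetical: apply the same masking update only to documents whose matchId
// is in a given list, across all matching documents.
const matchIds = [ /* your Match ObjectIds here */ ];
db.collection.updateMany(
  { matchId: { $in: matchIds } },
  {
    $set: {
      "logs.$[a].data.talent.socialLinks.$[].url": "*****",
      "logs.$[b].data.talent.webResults.$[].link": "*****",
      "logs.$[b].data.talent.webResults.$[].title": "*****",
      "logs.$[c].data.talent.firstName": "*****",
      "logs.$[d].data.talent.lastName": "*****"
    }
  },
  {
    arrayFilters: [
      { "a.data.talent.socialLinks": { $exists: true } },
      { "b.data.talent.webResults": { $exists: true } },
      { "c.data.talent.firstName": { $exists: true } },
      { "d.data.talent.lastName": { $exists: true } }
    ]
  }
);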

Cannot use Nested VariableOperators.mapItemsOf in Spring Data MongoDb

I'm forced to use the aggregation framework and the project operation of Spring Data MongoDb.
What I'd like to do is create an array of objects as the result of a project operation.
Considering this intermediate aggregation result:
{
"processes": [
{
"id": "101a",
"assignees": [
{
"id": "201a",
"username": "carl93"
},
{
"id": "202a",
"username": "susan"
}
]
},
{
"id": "101b",
"assignees": [
{
"id": "201a",
"username": "carl93"
},
{
"id": "202a",
"username": "susan"
}
]
}
]
}
I'm trying to get, for each process, all the assignee usernames and ids. Hence, what I want to obtain is something like this:
[
{
"results": [
{
"id": "201a",
"value": "carl93",
"parentObjectId": "101a"
},
{
"id": "202a",
"value": "susan",
"parentObjectId": "101a"
},
{
"id": "201a",
"value": "carl93",
"parentObjectId": "101b"
},
{
"id": "202a",
"value": "susan",
"parentObjectId": "101b"
}
]
}
]
To reach this goal I'm using two nested VariableOperators.mapItemsOf calls, obtaining:
org.springframework.data.mapping.MappingException: Cannot convert [Document{{id= 201a, value= carl93, parentObjectId= 101a}}, Document{{id= 202a, value = susan, parentObjectId= 101a}}]
of type class java.util.ArrayList into an instance of class java.lang.Object!
Implement a custom Converter<class java.util.ArrayList, class java.lang.Object> and register it with the CustomConversions.
Here's the code that I'm currently using:
new ProjectionOperation().and(
    VariableOperators.mapItemsOf("processes")
        .as("pr")
        .andApply(
            VariableOperators.mapItemsOf("$pr.ownership.assignees")
                .as("ass")
                .andApply(aggregationOperationContext -> {
                    Document document = new Document();
                    document.append("id", "$$ass.id");
                    document.append("value", "$$ass.username");
                    document.append("parentObjectId", "$$pr.id");
                    return document;
                })
        )
).as("results");
The code produces this:
[
[
{
"id": "201a",
"value": "carl93",
"parentObjectId": "101a"
},
{
"id": "202a",
"value": "susan",
"parentObjectId": "101a"
}
],
[
{
"id": "201a",
"value": "carl93",
"parentObjectId": "101b"
},
{
"id": "202a",
"value": "susan",
"parentObjectId": "101b"
}
]
]
As you can see, there are 2 nested arrays, [[],[]]. This is the reason why the exception is thrown.
Nevertheless, what I want to obtain is just one array containing all the objects (possibly without duplicates or null values). I've tried the addToSet operator and other aggregation operators, without any success.
Use $reduce with $concatArrays to join the arrays.
new ProjectionOperation().and(
    ArrayOperators.arrayOf("processes")
        .reduce(ArrayOperators.ConcatArrays.arrayOf("$$value").concat(
            VariableOperators.mapItemsOf("$$this.ownership.assignees")
                .as("ass")
                .andApply(aggregationOperationContext -> {
                    Document document = new Document();
                    document.append("id", "$$ass.id");
                    document.append("value", "$$ass.username");
                    document.append("parentObjectId", "$$this.id");
                    return document;
                })
        )).startingWith(Arrays.asList())
).as("results");
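For reference, the projection this builds corresponds roughly to the following shell pipeline (field paths taken from the Java code above; the collection name is a placeholder). Seeing the raw operators can make it easier to reason about the $$this / $$ass variable bindings:
// Approximate shell equivalent: $reduce over "processes", concatenating the
// $map result for each process's assignees onto the accumulated "results" array.
db.collection.aggregate([
  {
    $project: {
      results: {
        $reduce: {
          input: "$processes",
          initialValue: [],
          in: {
            $concatArrays: [
              "$$value",
              {
                $map: {
                  input: "$$this.ownership.assignees",
                  as: "ass",
                  in: {
                    id: "$$ass.id",
                    value: "$$ass.username",
                    // "$$this" here still refers to the current element of
                    // $reduce (the process), because the $map variable was
                    // renamed to "ass".
                    parentObjectId: "$$this.id"
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
]);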

Update field of object in array of array with mongoose

I have a MongoDB collection with this model:
_id: ObjectId("5705005b240166e927f841cb")
chapters: {
type: Array,
default: [
{
"id":"capitulo_0",
"active": true,
"title": "CAPÍTULO 0 - INTRODUCCIÓN",
"sections": [
{
"title": "Institucional",
"type": "Video",
"id": "d74fb24654a2",
"url": "jPTG5P0528k",
"active": true
}
]
},
{
"id":"capitulo_1",
"active": false,
"title": "CAPÍTULO 1 - BIENVENIDA",
"sections": [
{
"title": "Introducción",
"type": "Video",
"url": "j2TG1P05k8k",
"id": "b2454d7f66de",
"active": false
}
]
},
...
]
}
For the query I have the user_id and the id of the section, and I need to update the active field in the sections array.
I'm doing this:
User.findOneAndUpdate({_id: userId, 'chapters.sections':{$elemMatch: {id:sectionId}}}, {$set: {'sections.$.active': false}}).exec(function (err, doc) {console.log(doc)});
The active field does not change.
How can I write this query?
Thanks!
I don't have your data, so run this:
User.findOne({_id: userId, 'chapters.sections':{$elemMatch: {id:sectionId}}})
and see if you get a response and the record is found, because your update looks fine.
UPDATE:
After seeing your data, I think you are missing chapters in your $set path:
User.findOneAndUpdate({_id: userId, 'chapters.sections':{$elemMatch: {id:sectionId}}}, {$set: {'chapters.sections.$.active': false}}).exec(function (err, doc) {console.log(doc)});
I hope this helps
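If the positional $ still does not hit the nested sections element, an alternative worth trying is the filtered positional operator with arrayFilters (a sketch only, not tested against your schema; it assumes MongoDB 3.6+ and a Mongoose version that passes arrayFilters through):
// Hypothetical alternative: walk every chapter with $[] and update only the
// section whose id matches, via the $[s] filtered positional operator.
User.findOneAndUpdate(
  { _id: userId, 'chapters.sections.id': sectionId },
  { $set: { 'chapters.$[].sections.$[s].active': false } },
  { arrayFilters: [ { 's.id': sectionId } ] }
).exec(function (err, doc) {
  console.log(err, doc);
});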

MongoDB: addToSet not working when adding subdocuments to an array

I am new to MongoDB and I am stuck trying to get unique subdocuments in an array.
A document in my collection looks like this:
{
"PubDate": "1/01/01 00:00",
"Title": "Identification of DNA-Dependent Protein Kinase Catalytic Subunit (DNA-PKcs) as a Novel Target of Bisphenol A",
"Datums": [
{
"evidence_id": "3515620_6",
"evidence": [
"\n\nTo examine the interaction between DNA-PKcs and Ku70/Ku80 more directly, we performed immunoprecipitation (IP) using FLAG-Ku70 or FLAG-Ku80 recombinants, which were expressed in 293T cells after IR-irradiation (Fig. 4B\n ) or UV-irradiation (Fig. 4C\n ). After IR-irradiation, co-precipitation of DNA-PKcs with Ku80 increased compared with that in the non-irradiated controls (Fig. 4B\n lanes 7 and 8)."
],
"map": {
"change": [
{
"Text": "increased"
}
],
"subject": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"treatment": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"assay": [
{
"Text": "copptby"
}
]
}
},
{
"evidence_id": "3515620_6",
"evidence": [
"\n\nTo examine the interaction between DNA-PKcs and Ku70/Ku80 more directly, we performed immunoprecipitation (IP) using FLAG-Ku70 or FLAG-Ku80 recombinants, which were expressed in 293T cells after IR-irradiation (Fig. 4B\n ) or UV-irradiation (Fig. 4C\n ). After IR-irradiation, co-precipitation of DNA-PKcs with Ku80 increased compared with that in the non-irradiated controls (Fig. 4B\n lanes 7 and 8)."
],
"map": {
"change": [
{
"Text": "increased"
}
],
"subject": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"treatment": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"assay": [
{
"Text": "copptby"
}
]
}
},
{
"evidence_id": "3515620_6",
"evidence": [
"\n\nTo examine the interaction between DNA-PKcs and Ku70/Ku80 more directly, we performed immunoprecipitation (IP) using FLAG-Ku70 or FLAG-Ku80 recombinants, which were expressed in 293T cells after IR-irradiation (Fig. 4B\n ) or UV-irradiation (Fig. 4C\n ). After IR-irradiation, co-precipitation of DNA-PKcs with Ku80 increased compared with that in the non-irradiated controls (Fig. 4B\n lanes 7 and 8)."
],
"map": {
"change": [
{
"Text": "increased"
}
],
"subject": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"treatment": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"assay": [
{
"Text": "copptby"
}
]
}
}
],
"Volume": "7",
"FullJournalName": "PLoS ONE",
"Authors": "Ito Y, Ito T, Karasawa S, Enomoto T, Nashimoto A, Hase Y, Sakamoto S, Mimori T, Matsumoto Y, Yamaguchi Y, Handa H",
"Issue": "12",
"Pages": "e50481",
"PMCID": "3515620"
}
In the above example, the "Datums" field has only one subdocument, but usually, the "Datums" field will have around 20-30 subdocuments. I want my MongoDB query to output documents (that satisfy certain criteria), where the "Datums" field will have unique subdocuments in its array. To do that I am using the following MongoDB query:
db.My_Datums.aggregate(
  [
    { "$match": {
      "Datums": {
        "$elemMatch": {
          "map.treatment.Entity.uniprotSym": { "$in": ["P33981", "P78527"] },
          "map.assay.Text": "copptby"
        }
      }
    }},
    { "$project": { "PMCID": 1, "Title": 1, "PubDate": 1, "Volume": 1, "Issue": 1, "Pages": 1, "FullJournalName": 1, "Authors": 1, "Datums.map.assay.Text": 1, "Datums.map.change.Text": 1, "Datums.map.subject.Entity.strings": 1, "Datums.map.treatment.Entity.uniprotSym": 1, "Datums.evidence_id": 1, "_id": 0 }},
    { "$unwind": "$Datums" },
    { "$match": { "Datums.map.treatment.Entity.uniprotSym": { "$in": ["P33981", "P78527"] }, "Datums.map.assay.Text": "copptby" }},
    { "$group": { "_id": "$PMCID", "Datums": { "$addToSet": "$Datums" }}}
  ]
  #{ allowDiskUse: 1 }
)
But on running the above command, I am getting the below output:
{u'Datums': [{u'evidence_id': u'3515620_6',
u'map': {u'assay': [{u'Text': u'copptby'}],
u'change': [{u'Text': u'increased'}],
u'subject': [{u'Entity': {u'strings': u'dna-pkcs'}}],
u'treatment': [{u'Entity': {u'uniprotSym': u'P78527'}}]}},
{u'evidence_id': u'3515620_6',
u'map': {u'assay': [{u'Text': u'copptby'}],
u'change': [{u'Text': u'increased'}],
u'subject': [{u'Entity': {u'strings': u'dna-pkcs'}}],
u'treatment': [{u'Entity': {u'uniprotSym': u'P78527'}}]}},
{u'evidence_id': u'3515620_6',
u'map': {u'assay': [{u'Text': u'copptby'}],
u'change': [{u'Text': u'increased'}],
u'subject': [{u'Entity': {u'strings': u'dna-pkcs'}}],
u'treatment': [{u'Entity': {u'uniprotSym': u'P78527'}}]}}],
u'_id': u'3515620'}
What I don't understand is why $addToSet is adding duplicate subdocuments to "Datums". Is there any way I can filter out the duplicates? What am I doing wrong in my query? I have searched and read up a lot, but couldn't find any solution. Any MongoDB guru out there who could help this noob? I will be eternally grateful to you!
Thanks in advance!