Find result within Array of Objects and match email address field - mongodb

I'm trying to match the emailAddress field and the page_slug. Currently I'm using the following which matches just the about page in the modularSequence:
db.getCollection('users').find({"modularSequence.page_slug": "about"}, {"modularSequence.$": 1 })
This works and returns:
{
"_id" : ObjectId("5740c631742da6e83389abb4"),
"modularSequence" : [
{
"page_id" : "1",
"sequence" : "m_1",
"category" : "headers",
"page_slug" : "about"
}
]
}
Which is half of what I want. I also want to return the emailAddress field. I've tried the following, but it returns everything, including multiple modularSequence elements:
db.getCollection('users').find({$and:[{"emailAddress": 'paul#example.com'}, {"modularSequence.page_slug": "about"}, {"modularSequence": {$elemMatch: {page_slug:'about'}}}]})
[
{
"emailAddress": "paul#example.com",
"modularSequence": [
{
"page_slug": "about",
"category": "headers",
"sequence": "m_1",
"page_id": "1"
},
{
"page_slug": "contact",
"category": "content",
"sequence": "m_4",
"page_id": "2"
}
]
}
]
How do I match both the emailAddress field and the modularSequence.page_slug - only return a result if both the email address matches and the page_slug?

Your $and array is including your field selection parameter as well. But you don't need to use $and here anyway as multiple query terms are implicitly ANDed by default, so you can simplify your query to:
db.users.find({"emailAddress": 'paul#example.com', "modularSequence.page_slug": "about"},
{"emailAddress": 1, "modularSequence.$": 1})
Which is your first query, but with an emailAddress field added to both the query and field selection parameters.
The first parameter of find is the query (which docs), and the second is the projection (which fields within those docs), so that's why those fields are there twice. The $ in the projection represents the modularSequence array element matched in the query.
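To illustrate the query/projection split, here is a minimal plain JavaScript sketch that mimics the shell command on an in-memory array. The helper name findWithPositional and the second sample user are hypothetical, and the sketch omits the _id field that the real projection also returns. It only models this one case and is not how MongoDB works internally.

```javascript
// Hypothetical in-memory stand-in for the users collection.
const users = [
  {
    emailAddress: "paul#example.com",
    modularSequence: [
      { page_id: "1", sequence: "m_1", category: "headers", page_slug: "about" },
      { page_id: "2", sequence: "m_4", category: "content", page_slug: "contact" },
    ],
  },
  {
    emailAddress: "someone.else#example.com",
    modularSequence: [
      { page_id: "1", sequence: "m_1", category: "headers", page_slug: "about" },
    ],
  },
];

// Rough equivalent of:
//   find({emailAddress: ..., "modularSequence.page_slug": slug},
//        {emailAddress: 1, "modularSequence.$": 1})
function findWithPositional(docs, emailAddress, slug) {
  return docs
    // Query step: both conditions must hold (implicit AND).
    .filter(
      (doc) =>
        doc.emailAddress === emailAddress &&
        doc.modularSequence.some((m) => m.page_slug === slug)
    )
    // Projection step: keep emailAddress and only the first matching element.
    .map((doc) => ({
      emailAddress: doc.emailAddress,
      modularSequence: [doc.modularSequence.find((m) => m.page_slug === slug)],
    }));
}

const result = findWithPositional(users, "paul#example.com", "about");
console.log(JSON.stringify(result, null, 2));
```

The filter decides which documents come back; the map decides which parts of each document come back. That is why emailAddress appears in both places.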

Related

Mongodb find documents with a specific aggregate value in an array

I have a mongo database with a collection of countries.
One property (currencies) contains an array of currencies.
A currency has multiple properties:
"currencies": [{
"code": "EUR",
"name": "Euro",
"symbol": "€"
}],
I wish to select all countries that use the Euro, possibly alongside other currencies.
I'm using the following statement:
db.countries.find({currencies: { $in: [{code: "EUR"}]}})
Unfortunately I'm getting an empty result set.
When I use:
db.countries.find({"currencies.code": "EUR"})
I do get results. Why does the first query fail while the second one succeeds?
The first query does not work because it checks whether the whole currencies array is one of the values listed in $in, which is never true here.
It would be true for:
currencies: {
$in: [
[{
"code": "EUR",
"name": "Euro",
"symbol": "€"
}],
...
]
}
Besides dot notation, I believe $elemMatch is what you need:
db.collection.find({
currencies: {
$elemMatch: {
code: "EUR"
}
}
})
MongoDB matches a field the same way whether it holds an array or a single value, which is why the second query works.
So why doesn't the first one work? The problem is that you are looking for an object that is exactly {code: "EUR"}, with no name or symbol fields. To make it work, you should change it to:
db.getCollection('countries').find({currencies: { $in: [{
"code" : "EUR",
"name" : "Euro",
"symbol" : "€"
}]}})
or query the subfield directly:
db.getCollection('countries').find({"currencies.code": { $in: ["EUR"]}})
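The difference between the two forms can be sketched in plain JavaScript. The helper names and the country names below are hypothetical, and the whole-element comparison is a simplification: the real $in can also match when the entire array equals a listed value, and JSON.stringify is only an adequate equality check for flat sample data like this.

```javascript
// Hypothetical sample data shaped like the countries collection.
const countries = [
  { name: "Netherlands", currencies: [{ code: "EUR", name: "Euro", symbol: "€" }] },
  { name: "Japan", currencies: [{ code: "JPY", name: "Yen", symbol: "¥" }] },
];

// Exact embedded-document comparison: same fields, same order, same values.
const sameDocument = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Roughly { currencies: { $in: candidates } }:
// an array element must equal one of the candidates exactly.
const matchWholeElement = (docs, candidates) =>
  docs.filter((d) =>
    d.currencies.some((c) => candidates.some((x) => sameDocument(c, x)))
  );

// Roughly { "currencies.code": code }:
// only the code field of each element is compared.
const matchCode = (docs, code) =>
  docs.filter((d) => d.currencies.some((c) => c.code === code));

console.log(matchWholeElement(countries, [{ code: "EUR" }]).length);
console.log(
  matchWholeElement(countries, [{ code: "EUR", name: "Euro", symbol: "€" }]).length
);
console.log(matchCode(countries, "EUR").length);
```

The partial object {code: "EUR"} is never equal to a full currency document, so the first call finds nothing, while the other two find the Euro country.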

mongodb check regex on fields from one collection to all fields in other collection

After digging google and SO for a week I've ended up asking the question here. Suppose there are two collections,
UsersCollection:
[
{...
name:"James"
userregex: "a|regex|str|here"
},
{...
name:"James"
userregex: "another|regex|string|there"
},
...
]
PostCollection:
[
{...
title:"a string here ..."
},
{...
title: "another string here ..."
},
...
]
I need to get all users whose userregex matches any post.title (I need user_id, post_id groups or something similar).
What I've tried so far:
1. Get all users in the collection and run each regex against all posts. This works but is too dirty: it has to execute a query for each user.
2. Same as above, but using a forEach in a Mongo query. It's the same thing, only at the database layer instead of the application layer.
I searched a lot for available methods such as aggregations, $unwind, etc., with no luck.
So is it possible to do this in Mongo? Should I change my database type? If yes, what type would be good? Performance is my first priority. Thanks
It is not possible to reference a regex stored in a document field from the $regex operator inside a match expression.
So it can't be done on the Mongo side with the current structure.
$lookup works well with an equality condition. So one alternative (similar to what Nic suggested) would be to update your post collection to include an extra field called keywords (an array of keyword values it can be searched on) for each title.
db.users.aggregate([
{$lookup: {
from: "posts",
localField: "userregex",
foreignField: "keywords",
as: "posts"
}
}
])
The above query will do something like this (works from 3.4):
keywords: { $in: [ userregex.elem1, userregex.elem2, ... ] }
From the docs:
If the field holds an array, then the $in operator selects the
documents whose field holds an array that contains at least one
element that matches a value in the specified array (e.g. <value1>,
<value2>, etc.)
It looks like earlier versions (tested on 3.2) will only match if the arrays have the same order, values, and length.
Sample Input:
Users
db.users.insertMany([
{
"name": "James",
"userregex": [
"another",
"here"
]
},
{
"name": "John",
"userregex": [
"another",
"string"
]
}
])
Posts
db.posts.insertMany([
{
"title": "a string here",
"keywords": [
"here"
]
},
{
"title": "another string here",
"keywords": [
"another",
"here"
]
},
{
"title": "one string here",
"keywords": [
"string"
]
}
])
Sample Output:
[
{
"name": "James",
"userregex": [
"another",
"here"
],
"posts": [
{
"title": "another string here",
"keywords": [
"another",
"here"
]
},
{
"title": "a string here",
"keywords": [
"here"
]
}
]
},
{
"name": "John",
"userregex": [
"another",
"string"
],
"posts": [
{
"title": "another string here",
"keywords": [
"another",
"here"
]
},
{
"title": "one string here",
"keywords": [
"string"
]
}
]
}
]
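The matching rule behind this output can be sketched in plain JavaScript: a post is joined to a user when the two arrays share at least one value. The function name lookupPosts is hypothetical, and the sketch makes no claim about the order in which $lookup returns the joined posts.

```javascript
// Sample data mirroring the users and posts shown above.
const users = [
  { name: "James", userregex: ["another", "here"] },
  { name: "John", userregex: ["another", "string"] },
];
const posts = [
  { title: "a string here", keywords: ["here"] },
  { title: "another string here", keywords: ["another", "here"] },
  { title: "one string here", keywords: ["string"] },
];

// Attach to each user the posts whose keywords overlap the user's userregex array.
function lookupPosts(users, posts) {
  return users.map((user) => ({
    ...user,
    posts: posts.filter((post) =>
      post.keywords.some((k) => user.userregex.includes(k))
    ),
  }));
}

const joined = lookupPosts(users, posts);
for (const u of joined) {
  console.log(u.name, u.posts.map((p) => p.title));
}
```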
MongoDB is good for your use case, but you need to use an approach different from the current one. Since you are only concerned with whether any post title matches a user's regex, you can store the last result of such a match. Below is some example code:
db.users.find({last_post_id: {$exists: 0}}).forEach(
function(row) {
var regex = new RegExp(row['userregex']);
var found = db.post_collection.findOne({title: regex});
if (found) {
var post_id = found["post_id"];
db.users.updateOne({
user_id: row["user_id"]
}, {
$set :{ last_post_id: post_id}
});
}
}
)
What it does is filter only the users that don't have last_post_id set, search the post records for a match, and set last_post_id if a record is found. So after running this, you can return the results like this:
db.users.find({last_post_id: {$exists: 1}}, {user_id:1, last_post_id:1, _id:0})
The only thing you need to be concerned about is an edit/delete of an existing post. After every edit/delete, you should run the following so that all matches for that post id are computed again.
post_id_changed = 1
db.users.updateMany({last_post_id: post_id_changed}, {$unset: {last_post_id: 1}})
This will make sure that the next time you run the update, these users are processed again. The approach does have one drawback: for every user without a matching title, the query runs again and again. You can work around that by using timestamps or a post count check.
Also, you should make sure to put an index on post_collection.title.
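The cache-and-invalidate flow described above can be sketched in plain JavaScript on in-memory arrays. The function names refreshMatches and invalidatePost, the ids, and the second user's regex are hypothetical. This sketch only models the bookkeeping, not the actual database calls.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const users = [
  { user_id: 1, userregex: "a|regex|str|here" },
  { user_id: 2, userregex: "nothing-matches-this" },
];
const posts = [
  { post_id: 10, title: "a string here ..." },
  { post_id: 11, title: "another string here ..." },
];

// Process only users without a cached match and store the first matching post id.
function refreshMatches(users, posts) {
  for (const user of users) {
    if (user.last_post_id !== undefined) continue; // already cached
    const regex = new RegExp(user.userregex);
    const found = posts.find((p) => regex.test(p.title));
    if (found) user.last_post_id = found.post_id;
  }
}

// After a post is edited or deleted, drop the cached match so the user is re-processed.
function invalidatePost(users, postId) {
  for (const user of users) {
    if (user.last_post_id === postId) delete user.last_post_id;
  }
}

refreshMatches(users, posts);
console.log(users);
```

The second user illustrates the drawback mentioned above: it never gets a last_post_id, so every refresh tests its regex against the posts again.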
I was thinking that if you pre-tokenized your post titles like this:
{
"_id": ...
"title": "Another string there",
"keywords": [
"another",
"string",
"there"
]
}
but unfortunately $lookup requires that foreignField is a single element, so my idea of something like this will not work :( But maybe it will give you another idea?
db.Post.aggregate([
{$lookup: {
from: "Users",
localField: "keywords",
foreignField: "keywords",
as: "users"
}
},
])

Concurrent update of array elements which are embedded documents in MongoDB

I have documents like this one at collection x at MongoDB:
{
"_id" : ...
"attrs" : [
{
"key": "A1",
"type" : "T1",
"value" : "13"
},
{
"key": "A2",
"type" : "T2",
"value" : "14"
}
]
}
The A1 and A2 elements above are just examples: the attrs field may hold any number of array elements.
I need several independent clients to access the attrs array in MongoDB concurrently. For example, consider two clients, one wanting to change the value of the element with key "A1" to "80" and the other wanting to change the value of the element with key "A2" to "20". Is there any compact way of doing this using MongoDB operations?
It is important to note that:
Clients don't know the position of each element in the attrs array, only the key of the element whose value has to be modified.
Reading the whole attrs array in client space, searching for the element to modify in client space, then updating attrs with the new array (in which the element to modify has been changed) would involve race conditions.
Clients may also add and remove elements in the array. Thus, first searching MongoDB to locate the position of the element to modify and then updating it using that position doesn't work in general, as elements could have been added or removed in the meantime, altering the position previously found.
The process here is really quite simple, it only varies in where you want to "find or create" the elements in the array.
First, assuming the elements for each key are in place already, then the simple case is to query for the element and update with the index returned via the positional $ operator:
db.collection.update(
{
"_id": docId,
"attrs": { "$elemMatch": { "key": "A1", "type": "T1" } }
},
{ "$set": { "attrs.$.value": "20" } }
)
That will only modify the element that is matched without affecting others.
In the second case where "find or create" is required and the particular key may not exist, then you use "two" update statements. But the Bulk Operations API allows you to do this in a single request to the server with a single response:
var bulk = db.collection.initializeOrderedBulkOp();
// Try to update where exists
bulk.find({
"_id": docId,
"attrs": { "$elemMatch": { "key": "A1", "type": "T2" } }
}).updateOne({
"$set": { "attrs.$.value": "30" }
});
// Try to add where it does not exist
bulk.find({
"_id": docId,
"attrs": { "$not": { "$elemMatch": { "key": "A1", "type": "T2" } } }
}).updateOne({
"$push": { "attrs": { "key": "A1", "type": "T2", "value": "30" } }
});
bulk.execute();
The basic logic is that the first update attempts to match an element with the required values, just as before. The other condition tests for the case where the element is not found at all, by reversing the match logic with $not.
In the case where the array element was not found, a new one is valid for addition via $push.
I should really add that since we are specifically looking for negative matches here, it is always a good idea to match the "document" that you intend to update by some unique identifier such as the _id key. While this is possible with "multi" updates, you need to be careful about what you are doing.
So when running the "find or create" process, the element that was not matched is added to the array correctly, without interfering with other elements, and the previous update for an expected match is applied in the same way:
{
"_id" : ObjectId("55b570f339db998cde23369d"),
"attrs" : [
{
"key" : "A1",
"type" : "T1",
"value" : "20"
},
{
"key" : "A2",
"type" : "T2",
"value" : "14"
},
{
"key" : "A1",
"type" : "T2",
"value" : "30"
}
]
}
This is a simple pattern to follow, and of course the Bulk Operations here remove any overhead involved by sending and receiving multiple requests to and from the server. All of this happily works without interfering with other elements that may or may not exist.
Aside from that, there are the extra benefits of keeping the data in an array for easy query and analysis as supported by the standard operators without the need to revert to JavaScript server processing in order to traverse the elements.
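The "find or create" logic from this answer can be sketched in plain JavaScript on a single in-memory document. The function name upsertAttr is hypothetical. The sketch models only the matching logic of the two update statements, not the server-side atomicity of each update.

```javascript
// Update the value of the element with this key/type, or append it if absent.
function upsertAttr(doc, key, type, value) {
  // Step 1: mirrors the $elemMatch query with the "attrs.$.value" update.
  const existing = doc.attrs.find((a) => a.key === key && a.type === type);
  if (existing) {
    existing.value = value;
    return "updated";
  }
  // Step 2: mirrors the $not/$elemMatch query with the $push update.
  doc.attrs.push({ key, type, value });
  return "pushed";
}

// Starting document as in the question (without _id).
const doc = {
  attrs: [
    { key: "A1", type: "T1", value: "13" },
    { key: "A2", type: "T2", value: "14" },
  ],
};

const first = upsertAttr(doc, "A1", "T1", "20"); // element exists
const second = upsertAttr(doc, "A1", "T2", "30"); // element does not exist
console.log(first, second);
console.log(JSON.stringify(doc, null, 2));
```

Only one of the two branches can apply to a given key/type pair, which corresponds to the two bulk statements having mutually exclusive query conditions.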

mongodb $group aggregation yields _id with multiple values as array; how to remove dupes from _id?

I am trying to conduct a very simple aggregation to collect some indexes associated with a particular owner. My query is as follows (in moped syntax):
owners = Serials.collection.aggregate([
{'$group' => {
'_id' => '$owners.owner.party_name',
'serials' => { '$addToSet' => '$serial_number' }
}}])
That's the entire function. The issue is that the 'owners.owner' field can take two forms -- it is often a nested array, with multiple party names associated with the record. But, it can also be a single record:
Form 1:
"owners": {
"owner": [
{
"entry_number": "1",
"party_name": "Company Name, LLC",
"other_fields": "other info",
},
{
"entry_number": "1",
"party_name": "Company Name, LLC",
"other_fields": "other info",
}
]
},
(yes, often the entries are repeating within the array. Sometimes it is two or more distinct owners.)
Form 2:
"owners": {
"owner": {
"entry_number": "1",
"party_name": "Another Company, Inc.",
"other_fields": "other_info",
}
},
Notice it is not embedded in an array in this case. Thus, I'm not sure an $unwind step in the aggregation process would work because the documents without an embedded array would return an error.
So anyways, the results of the aggregation yield records that look like this:
{"_id"=>["Random co.", "Random co."], "serials"=>["12345678"]}
but also records that look like this:
{"_id"=>["Company 1 co.", "Company 2 co."], "serials"=>["12345679", "12345778", "14562378", "87654321", "33822112", "11111111"]}
i.e. the 'party_name' fields are sometimes unique, but sometimes are two or more distinct strings.
My question is, how can I further refine this aggregation to remove duplicate strings from the '_id' field, and only preserve distinct values?
So, for example, in the first case the result would be:
{"_id"=>["Random co."], "serials"=>["12345678"]}
While in the second case the result would be identical.

Updating Value of Array Element in MongoDB

I'd like to know how to update the "value" field of one of the elements identified by the "name" field in the array "array_of_stuff". For example, I want to update the value for "name_of_thing_1" to "new_value_of_thing_1". How can I do this using ONLY the second parameter (i.e. the update parameter) to the update command? I am re-using a class library written in-house, and I don't have control over the first argument to the update command (i.e. the query parameter). Is this possible?
{
"array_of_stuff": [
{
"name": "name_of_thing_1",
"value": "value_of_thing_1",
},
{
"name": "name_of_thing_2",
"value": "value_of_thing_2",
}
]
}
Thanks for your help!
You can update the value of a single item in an array (if you know its index) like this:
db.stuff.update(/* query ... */, {$set:{"arrayname.<index>":new_value}})
If your array contains documents, you can update a particular field of a document at that index like this:
db.stuff.update(/* query ... */, {$set:{"array_of_stuff.0.value":"new_value_of_thing_1"}})
// If you could use the query parameter and knew something
// about the value in the array you wanted to change:
db.stuff.update({"array_of_stuff.value":"value_of_thing_1"}, {$set:{"array_of_stuff.$.value":"new_value_of_thing_1"}})
See if this example helps you:
db.bruno.insert({"array": [{"name": "Hello", "value": "World"}, {"name": "Joker", "value": "Batman"}]})
db.bruno.update({"array.name": "Hello"}, {$set: {"array.$.value": "Change"}})
db.bruno.find().pretty()
output:
db.bruno.find().pretty()
{
"_id" : ObjectId("52389faaafd72821e7b25a73"),
"array" : [
{
"name" : "Hello",
"value" : "Change"
},
{
"name" : "Joker",
"value" : "Batman"
}
]
}
I don't think it is possible. In order to update a field of one of the elements in an array, you should use the positional $ operator, e.g.:
update({'array_of_stuff.name':'name_of_thing_1'},
{ $set: {'array_of_stuff.$.value':'new_value_of_thing_1'}})
But according to the documentation, the positional $ operator acts as a placeholder for the first element that matches the query document, and the array field must appear as part of the query document.
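As a rough illustration of why the query has to name the array field, here is a plain JavaScript sketch of what the positional $ stands for. The function name setFirstMatchingValue is hypothetical, and the code runs on an in-memory object rather than a collection.

```javascript
// The query part locates the index; the update part then writes at that index.
function setFirstMatchingValue(doc, name, newValue) {
  // Corresponds to the query {'array_of_stuff.name': name}: first match wins.
  const index = doc.array_of_stuff.findIndex((e) => e.name === name);
  if (index === -1) return false; // no match, so there is no position to update
  // Corresponds to {$set: {'array_of_stuff.$.value': newValue}}.
  doc.array_of_stuff[index].value = newValue;
  return true;
}

const stuff = {
  array_of_stuff: [
    { name: "name_of_thing_1", value: "value_of_thing_1" },
    { name: "name_of_thing_2", value: "value_of_thing_2" },
  ],
};

const changed = setFirstMatchingValue(stuff, "name_of_thing_1", "new_value_of_thing_1");
console.log(changed, JSON.stringify(stuff, null, 2));
```

Without the lookup in the first step there is no index to write to, which mirrors the limitation described above: the update parameter alone cannot say which element is meant unless the index is already known.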