Checking for similar records in MongoDB across multiple fields

In mongodb, I have a collection of people with the schema below. I need to write an aggregation to find possible duplicates in the database by checking:
If another person with same firstName, lastName & currentCompany exists.
Or, if another person with the same currentCompany & currentTitle exists.
Or, if another person has the same email (which is stored as an object in an array)
Or, if someone else has the same linkedIn/twitter url.
Is there a straightforward way of checking for duplicates based on the above cases w/ a mongodb aggregation? This question is close to what I'm looking for, but I need to check more than just one key/value.
{ _id: 'wHwNNKMSL9v3gKEuz',
firstName: 'John',
lastName: 'Doe',
currentCompany: 'John Co',
currentTitle: 'VP Sanitation',
emails:
[ { address: 'Anais.Grant#hotmail.com',
valid: true } ],
urls:
{ linkedIn: 'http://linkedin.com/johnDoe',
twitter: 'http://twitter.com/#john',
}
}
Thanks!

You can do this with the $and, $or and $ne operators.
Note: you need to supply one record as input, so that its values can be matched against the other records when looking for duplicates.
The sample query below filters your collection on these two criteria; you can add the rest of your conditions in the same way to get the final result:
If another person with same firstName, lastName & currentCompany exists.
Or, if someone else has the same linkedIn/twitter url.
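db.yourcollection.find({
  $and: [
    // skip the input record itself
    { _id: { $ne: 'wHwNNKMSL9v3gKEuz' } },
    {
      $or: [
        // same firstName, lastName & currentCompany
        { firstName: 'John', lastName: 'Doe', currentCompany: 'John Co' },
        // same linkedIn or twitter url
        { 'urls.linkedIn': 'http://linkedin.com/johnDoe' },
        { 'urls.twitter': 'http://twitter.com/#john' }
      ]
    }
  ]
})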
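To cover all four criteria from the question in one query, the filter can be generated from the input record. The following is only a minimal sketch: `buildDuplicateFilter` is a hypothetical helper name, and it assumes the record you pass in has the same shape as the sample document in the question.

```javascript
// Hypothetical helper: builds a find() filter that matches possible
// duplicates of `person` under the four rules listed in the question.
function buildDuplicateFilter(person) {
  const candidates = [
    // same firstName, lastName & currentCompany
    {
      firstName: person.firstName,
      lastName: person.lastName,
      currentCompany: person.currentCompany
    },
    // same currentCompany & currentTitle
    {
      currentCompany: person.currentCompany,
      currentTitle: person.currentTitle
    }
  ];

  // same email address (emails is an array of { address, valid } objects)
  const addresses = (person.emails || []).map(e => e.address);
  if (addresses.length > 0) {
    candidates.push({ 'emails.address': { $in: addresses } });
  }

  // same linkedIn / twitter url
  const urls = person.urls || {};
  if (urls.linkedIn) candidates.push({ 'urls.linkedIn': urls.linkedIn });
  if (urls.twitter) candidates.push({ 'urls.twitter': urls.twitter });

  // $ne on _id keeps the input record out of its own result set
  return { _id: { $ne: person._id }, $or: candidates };
}
```

The result would then be passed straight to find, e.g. `db.yourcollection.find(buildDuplicateFilter(person))`.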

Related

How do I return all results that don't contain a certain value in mongodb

I have some data in my mongodb database that looks similar to this:
[
{
username: 'will',
post: 'some random post',
user_id: '12345',
_id: 0
},
{
username: 'jogno',
post: 'some random post',
user_id: '23412',
_id: 1
},
{
username: 'aflack',
post: 'some random post',
user_id: '24332',
_id: 2
}
]
If my user_id is 12345, I want my query to return the other 2 posts, where the user_id does NOT equal 12345. However, when I use the query below it just returns all 3 posts. Am I missing something, or is there a different and better way to do this? Thanks
Post.find(
{ user_id: { $not: 12345 } }
)
SOLUTION:
I had to use $ne instead of $not
In your database, the user_id is a string, but in your query you are selecting a number.
Also, you should be using $ne for this.
Try $ne: "12345"
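Since a number never equals a string in a MongoDB comparison, it can help to normalise the id before building the filter. A small sketch, assuming user_id is always stored as a string as in the sample data; `othersFilter` is a hypothetical helper name:

```javascript
// Hypothetical helper: matches every post NOT written by `userId`.
// String() guards against callers passing the id as a number.
function othersFilter(userId) {
  return { user_id: { $ne: String(userId) } };
}
```

It would be used as `Post.find(othersFilter(12345))`, which sends `{ user_id: { $ne: "12345" } }` to the database.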

MongoDB - Query nested objects in nested array with array of strings filter

So basically I need to filter my data with my own filter, which is an array of strings, but the problem is that the field in question sits inside a nested object in an array in the DB. Part of my schema looks like this:
members: [
{
_id: { type: Schema.Types.ObjectId, ref: "Users" },
profilePicture: { type: String, required: true },
profile: {
firstName: { type: String },
lastName: { type: String },
occupation: { type: String },
gender: { type: String }
}
}
]
and my filter looks like this
gender: ["male","female"]
The expected result with this filter is a team that has both male and female users; if it has only male or only female members, that team should not be returned. But everything I've tried returned every team that included males or females, even when there were only male members.
What I've tried:
db.teams.find({ members: { $elemMatch: { "profile.gender": { $in: gender } } } })
This works only when there is one gender specified in the filter. I know it cannot do what I am trying to achieve, but I don't know how to achieve it. Any help will be appreciated.
Edit: I've tried to do it in this way
db.teams.find({
$and: [
{ members: { $elemMatch: { "profile.gender": gender[0] } } },
{ members: { $elemMatch: { "profile.gender": gender[1] } } }
]
})
This gives me a result only when both filters are specified; if there is only one filter (either "male" or "female"), it gives me nothing.
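Use the $and operator instead of $in, with one $elemMatch per value (putting both conditions inside a single $elemMatch would require one member to be both male and female):
db.teams.find({$and: [{members: {$elemMatch: {'profile.gender': 'male'}}}, {members: {$elemMatch: {'profile.gender': 'female'}}}]})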
This query works no matter how many elements you want to compare
db.teams.find({$and: [{'members.profile.gender': 'male'}, {'members.profile.gender': 'female'}]})
You need to dynamically generate the query before passing it to find, if you want to cover more than one case.
You can do this with the $all operator, which finds docs where an array field contains all of the specified elements:
var gender = ['male', 'female'];
db.teams.find({'members.profile.gender': {$all: gender}});
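Generating the query dynamically, as suggested above, could look like the sketch below. `buildGenderFilter` is a hypothetical helper, not part of MongoDB or Mongoose. It emits one $elemMatch clause per requested value, so it behaves the same for one, two, or more values, and it returns an empty filter when nothing is requested.

```javascript
// Hypothetical helper: a team matches only if, for EVERY requested
// gender, at least one of its members has that gender.
function buildGenderFilter(genders) {
  // no filter selected: match every team
  if (!genders || genders.length === 0) return {};
  return {
    $and: genders.map(g => ({
      members: { $elemMatch: { 'profile.gender': g } }
    }))
  };
}
```

It would be called as `db.teams.find(buildGenderFilter(gender))`, with `gender` being the filter array from the question.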

Mongoose how to use positional operator to pull from double nested array with specific condition, and return new result

Suppose I have the following schema:
{
_id: ObjectId(1),
title: string,
answers: [
{
_id: ObjectId(2),
text: string,
upVotes: [
{
_id: ObjectId(3),
userId: ObjectId(4)
}
]
}
]
}
What I want is to pull the vote of a specific user from an answer's upVotes and return the updated result.
For example, find a question with id 1, and get its specific answer with id 2, then from that answer pull my vote using userId inside upvotes.
I want to do it with a single findOneAndUpdate query
You can use a single positional $ operator together with $pull to update the nested array:
db.collection.findOneAndUpdate(
{ "_id": ObjectId(1), "answers._id": ObjectId(2) },
{ "$pull": { "answers.$.upVotes": { "userId": ObjectId(4) }}}
)
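The question also asks for the updated document back. In Mongoose, findOneAndUpdate returns the document as it was before the update unless the `new: true` option is passed. A sketch, assuming a Mongoose model named `Question` (a hypothetical name); `buildPullVote` is likewise a hypothetical helper that only assembles the three arguments:

```javascript
// Hypothetical helper: assembles [filter, update, options] for
// findOneAndUpdate so that one user's vote is pulled from one answer.
function buildPullVote(questionId, answerId, userId) {
  // "answers._id" in the filter is what the positional $ resolves against
  const filter = { _id: questionId, 'answers._id': answerId };
  const update = { $pull: { 'answers.$.upVotes': { userId: userId } } };
  // new: true asks Mongoose for the document as it is AFTER the update
  const options = { new: true };
  return [filter, update, options];
}

// Usage (assumed model name):
// const updated = await Question.findOneAndUpdate(...buildPullVote(qId, aId, uId));
```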
If I understood correctly, you want to search within a specific array:
db.collection.update(
  {
    "_id": "507f1f77bcf86cd799439011", // id of the document
    "answers.upVotes._id": "507f1f77bcf86cd799439011" // id of the array element
  },
  {
    // edit; use "$addToSet" instead to add
    "$set": { "answers.$.upVotes": { userId: "507f1f77bcf86cd799439011" } }
  }
)

mongo: update subdocument's array

I have the following schema:
{
_id: objectID('593f8c591aa95154cfebe612'),
name: 'test',
businesses: [
{
_id: objectID('5967bd5f1aa9515fd9cdc87f'),
likes: [objectID('595796811aa9514c862033a1'), objectID('593f8c591ba95154cfebe790')]
},
{
_id: objectID('59579ff91aa9514f600cbba6'),
likes: [objectID('693f8c554aa95154cfebe146'), objectID('193f8c591ba95154cfeber790')]
}
]
}
I need to update "businesses.likes" where businesses._id is equal to a certain value and where businesses.likes contains a certain objectID.
If the objectID exists in the array, I want to remove it.
This is what I have tried. It didn't work correctly, because $in searches all the subdocuments instead of only the subdocument where businesses._id equals my value:
db.col.update(
{ 'businesses._id': objectID('5967bd5f1aa9515fd9cdc87f'), 'businesses.likes': {$in: [objectID('193f8c591ba95154cfeber790')]}},
{$pull: {'businesses.$.likes': objectID('193f8c591ba95154cfeber790')}}
)
Any ideas how I can write the query? Keep in mind that businesses.likes from different businesses can contain the same objectIDs.
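One possible approach is to put both conditions inside a single $elemMatch, so that they must hold for the same array element and the positional $ points at that element. This is only a sketch under that assumption; `buildUnlike` is a hypothetical helper name and the ids are passed in as arguments.

```javascript
// Hypothetical helper: assembles [filter, update] that removes `likeId`
// from the likes of the one business whose _id is `businessId`.
function buildUnlike(businessId, likeId) {
  const filter = {
    // both conditions must be satisfied by the SAME businesses element
    businesses: { $elemMatch: { _id: businessId, likes: likeId } }
  };
  // the positional $ refers to the element matched by $elemMatch above
  const update = { $pull: { 'businesses.$.likes': likeId } };
  return [filter, update];
}

// Usage:
// const args = buildUnlike(businessId, likeId);
// db.col.update(args[0], args[1]);
```

If the business does not contain that like, the filter matches nothing and the document is left unchanged, even when another business in the same document does contain it.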

MongoDB Regex $and $or Search Query

I am trying to construct a query that will accept multiple fields that can be searched over using regex for partial field matching that also has a hard constraint on other fields.
Example:
Collection: "Projects"
Required Information: { propertyId: "abc", clientId: "xyz" }
Fields to be Searched: name, serviceType.name, manager.name
Currently, I have a query like this, but if there are no results it returns all the results, which isn't helpful.
{
'$and': [
{ propertyId: '7sHGCHT4ns6z9j6BC' },
{ clientId: 'xyz' },
{ '$or':
[
{ name: /HVAC/gi },
{ 'serviceType.name': /HVAC/gi },
{ 'manager.name': /HVAC/gi }
]
}
]
}
If anyone has any insight into this it would be much appreciated.
Example Document:
{
_id: "abc",
propertyId: "7sHGCHT4ns6z9j6BC",
clientId: "xyz",
name: "16.000.001",
serviceType: {
_id: "asdf",
name: "HVAC"
},
manager: {
_id: "dfgh",
name: "Patrick Lewis",
}
}
The expected result is to find only documents where propertyId = 7sHGCHT4ns6z9j6BC AND at least one of the following keys matches an inputted string (in this case HVAC): name, serviceType.name, or manager.name. If none of the regex fields match, nothing should be returned.
UPDATE
The issue was with MongoDB itself; after restarting it, everything worked.
Try the following script:
db.collection.find({
$and:[
{propertyId:"7sHGCHT4ns6z9j6BC"},
{
$or:[
{name: /HVAC/i},
{"serviceType.name": /HVAC/i},
{"manager.name": /HVAC/i}
]
}]
})
The query above returns one or more documents if and only if propertyId matches and at least one of name, serviceType.name, or manager.name matches the given regex.
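When the search term comes from user input, it is usually worth escaping regex metacharacters and building the $or list from the field names. A sketch under those assumptions: `escapeRegex` and `buildSearchFilter` are hypothetical helpers, and the $regex / $options form is used here in place of a regex literal.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the term is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: `required` holds the hard constraints,
// `fields` the paths to search, `term` the user's search string.
function buildSearchFilter(required, fields, term) {
  const pattern = escapeRegex(term);
  return {
    $and: [
      required,
      // case-insensitive partial match on at least one field
      { $or: fields.map(f => ({ [f]: { $regex: pattern, $options: 'i' } })) }
    ]
  };
}

// Usage:
// db.collection.find(buildSearchFilter(
//   { propertyId: '7sHGCHT4ns6z9j6BC', clientId: 'xyz' },
//   ['name', 'serviceType.name', 'manager.name'],
//   'HVAC'
// ));
```

Note that an empty term produces a pattern that matches every string, so callers may want to reject it before building the filter.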