Group and Combine Text Fields Using Pymongo - mongodb

I have a collection of user reviews and I'm trying to combine all the reviews by user so I can run some NLP analysis on them. This feels like it should be easy, but I'm missing something with how Mongo treats strings.
My documents look like this:
{'_id': ObjectId('57e079d3e3874f12ad721f70'),
'atmosphere': 5,
'review_id': 63,
'dedication': 3,
'orgName': 'Some Organization',
'enabled': True,
'accessibility': 3,
'efficiency': 3,
'orgId': '57e05e0de3874f121d516616',
'user': '5809f2c0bc0a53eb49eac583',
'date': '10/20/15 0:00',
'quality': 3,
'orgId_orig': 1098,
'description': 'Here is some sample text'
}
I've tried this:
agg_result = revs.aggregate([
    {"$group": {"_id": "$user", "mergedText": {"$mergeObjects": "$description"}}}
])
for i in agg_result:
    print(i)
But I'm getting this error:
OperationFailure: $mergeObjects requires object inputs, but input "Here is some sample text" is of type string
My expected output would be
{
'userId1':{'mergedText':'joined descriptions from this user'},
'userId2':{'mergedText':'this users descriptions'},
'userId3':{'mergedText':'all descriptions from this user'}
}
where the various userIds are Mongo ObjectIds from the 'user' field.
I'm brand new to Mongo and this has been tripping me up for awhile. Thank you.

Try this. $mergeObjects requires object inputs, but your description is a string. You could $push the descriptions into an array instead:
agg_result = revs.aggregate([
    {"$group": {"_id": "$user", "mergedText": {"$push": "$description"}}}
])
for i in agg_result:
    print(i)
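The $push pipeline gives each user a list of descriptions rather than one string. A minimal sketch of joining them on the client side follows, assuming the aggregation yields documents shaped like {'_id': ..., 'mergedText': [...]}. The join_merged_text helper and the sample data are made up for illustration.

```python
def join_merged_text(agg_docs, sep=" "):
    """Collapse each user's pushed descriptions into one string, keyed by user id."""
    return {
        doc["_id"]: {
            # Skip non-string entries (e.g. None) so str.join does not raise.
            "mergedText": sep.join(t for t in doc["mergedText"] if isinstance(t, str))
        }
        for doc in agg_docs
    }

# Stand-in for the cursor returned by revs.aggregate([...]) with $push
sample = [
    {"_id": "userId1", "mergedText": ["Here is some sample text", "Another review"]},
    {"_id": "userId2", "mergedText": ["Only one review", None]},
]
result = join_merged_text(sample)
```

With a real cursor the call would be join_merged_text(agg_result), which produces the user-keyed shape described in the question.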

Related

Can't remove object in array using Mongoose

This has been extensively covered here, but none of the solutions seems to be working for me. I'm attempting to remove an object from an array using that object's id. Currently, my Schema is:
const scheduleSchema = new Schema({
//unrelated
_id: ObjectId,
shifts: [
{
_id: Types.ObjectId,
name: String,
shift_start: Date,
shift_end: Date,
},
],
});
I've tried almost every variation of something like this:
.findOneAndUpdate(
{ _id: req.params.id },
{
$pull: {
shifts: { _id: new Types.ObjectId(req.params.id) },
},
}
);
Database: [image: Database Format]
Within these variations, the usual response I've gotten has been either an empty array or null.
I was able to work around this and accomplish the deletion by using the main _id of the schema (instead of the nested one):
.findOneAndUpdate(
{ _id: <main _id> },
{ $pull: { shifts: { _id: new Types.ObjectId(<nested _id>) } } },
{ new: true }
);
But I was hoping to figure out a way to do this by just using the nested _id. Any suggestions?
The problem is that you are using the same _id for both the document query and the nested element.
In Mongo, the update method takes three objects: query, update and options.
query selects the document in the collection that will be updated.
update is the action to apply to that document (add a field, change a value...).
options holds any additional options.
Then, assuming you have this collection:
[
{
"_id": 1,
"shifts": [
{
"_id": 2
},
{
"_id": 3
}
]
}
]
If you look for a document whose _id is 2, the response will obviously be empty.
And if no document is found, no document is updated.
What happens if we look for a document using shifts._id: 2?
This tells Mongo "find a document whose shifts field contains an object with _id equal to 2". This query works, but be careful: it returns the WHOLE document, not only the array element that matches the _id.
It does not return this:
[
{
"_id": 1,
"shifts": [
{
"_id": 2
}
]
}
]
With this query, Mongo returns the ENTIRE document that has a shifts field containing an object whose _id is 2. That includes the whole array.
So now you know why the find works. Adding this to an update gives you the following queries.
This one removes every shifts element whose _id equals 2:
db.collection.update({
"shifts._id": 2
},
{
$pull: {
shifts: {
_id: 2
}
}
})
Example
Or this one, which removes the shifts element with _id 2 only when the parent _id equals 1:
db.collection.update({
"_id": 1
},
{
$pull: {
shifts: {
_id: 2
}
}
})
Example
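For readers using Python, the first query (matching through the nested _id alone) can be sketched as follows. The build_pull_shift helper, the schedules collection name and the local pull_locally stand-in are hypothetical. The stand-in only mimics what $pull does to the sample document, without a server.

```python
def build_pull_shift(shift_id):
    """Return (filter, update) that removes the shift with this nested _id."""
    # Match the parent document through the nested _id, so the parent _id is not needed.
    query = {"shifts._id": shift_id}
    # $pull removes every element of shifts whose _id equals shift_id.
    update = {"$pull": {"shifts": {"_id": shift_id}}}
    return query, update

query, update = build_pull_shift(2)
# With a pymongo collection (hypothetical name `schedules`):
# schedules.update_one(query, update)
# If the ids are ObjectIds, convert the string first with bson.ObjectId(...).

def pull_locally(doc, shift_id):
    """Plain-Python stand-in for the effect of $pull on one document."""
    return {**doc, "shifts": [s for s in doc["shifts"] if s["_id"] != shift_id]}

after = pull_locally({"_id": 1, "shifts": [{"_id": 2}, {"_id": 3}]}, 2)
```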

Mongo DB - How to create a dynamic field based on the presence of element in array?

I have a problem in Mongo for which I am not getting any clue to resolve it efficiently.
Say I have a 'Course' collection something like this (index is created on the 'studentIds' field):
{
"courseId": 1,
"name": "Mathematics",
"studentIds": [1,3,5]
...
...
}
{
"courseId": 2,
"name": "Physics",
"studentIds": [2,3,5]
...
...
}
I am trying to write a query which would return records in the below format:
Say student 1 is querying the courses: he is enrolled in courseId 1, so 'enrolled' is true, but he is not enrolled in courseId 2, so 'enrolled' is false:
{
"courseId": 1,
"name": "Mathematics",
"enrolled": true
}
{
"courseId": 2,
"name": "Physics",
"enrolled": false
}
The only solution I can think of uses two queries: the first finds all course IDs the student is enrolled in, and then, while running through the cursor of the second query over the courses, I add the 'enrolled' field based on whether the courseId appears in the result of the first query. I'm looking for a way to achieve this in a single query.
Thanks.
You just need the $in operator:
let studentId = 1;
db.collection.aggregate([
{
$project: {
courseId: 1,
name: 1,
enrolled: { $in: [ studentId, "$studentIds" ] }
}
}
])
Mongo Playground
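The same pipeline can be built in Python. This is a sketch only: enrolled_pipeline and the courses collection name are hypothetical, and the list comprehension at the end is a plain-Python stand-in for what the $project stage computes on the sample data (a real aggregation would also return _id unless it is excluded).

```python
def enrolled_pipeline(student_id):
    """Build the $project stage that flags whether student_id is in studentIds."""
    return [
        {
            "$project": {
                "courseId": 1,
                "name": 1,
                # Inside an aggregation, $in takes [value, array expression].
                "enrolled": {"$in": [student_id, "$studentIds"]},
            }
        }
    ]

pipeline = enrolled_pipeline(1)
# With pymongo (hypothetical collection name `courses`):
# results = list(db.courses.aggregate(pipeline))

# Plain-Python stand-in for the stage, using the sample data from the question
courses = [
    {"courseId": 1, "name": "Mathematics", "studentIds": [1, 3, 5]},
    {"courseId": 2, "name": "Physics", "studentIds": [2, 3, 5]},
]
expected = [
    {"courseId": c["courseId"], "name": c["name"], "enrolled": 1 in c["studentIds"]}
    for c in courses
]
```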

How to show specific column in mongo db collection

I tried to show particular columns in a MongoDB collection, but it's not working. How do I show particular columns?
user_collection
[{
"user_name":"hari",
"user_password":"123456"
}]
find_query
db.use_collection.find({},{projection:{user_name:1}})
I got output
[{
"user_name":"hari",
"user_password":"123456"
}]
Expected output
[{
"user_name":"hari"
}]
Try:
db.use_collection.find({}, {user_name:1, _id: 0 })
That way you get the user_name field and exclude the _id.
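In Python the projection is likewise passed as the second argument to find. The sketch below assumes a pymongo handle named db, which is hypothetical. The apply_inclusion helper is a plain-Python stand-in for an inclusion projection on top-level fields, included only to show which keys survive.

```python
projection = {"user_name": 1, "_id": 0}
# With pymongo (hypothetical handle `db`):
# docs = list(db.use_collection.find({}, projection))

def apply_inclusion(doc, projection):
    """Plain-Python stand-in for an inclusion projection on top-level fields."""
    keep = {field for field, flag in projection.items() if flag}
    # _id is returned by default unless the projection turns it off.
    if projection.get("_id", 1):
        keep.add("_id")
    return {field: value for field, value in doc.items() if field in keep}

stored = {"_id": 1, "user_name": "hari", "user_password": "123456"}
shown = apply_inclusion(stored, projection)
```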
Extra info:
project fields and project fields excluding the id
With aggregate:
db.use_collection.aggregate( [ { $project : { _id: 0, user_name : 1 } } ] )
You can try this
Mongo query:
db.users.aggregate([
{
"$project":
{
"_id": 0,
"first_name": 1,
}
}
])
Or in ruby (Mongoid)
User.collection.aggregate(
[
"$project":
{
"_id": 0,
"first_name": 1,
}
]
)
If you want to inspect the records, convert the result to an array first (e.g. User.collection.aggregate(...).to_a).
You can use the official MongoDB reference when writing Mongoid queries; usually you just need to double-quote the property names on the left-hand side to make them work in Mongoid.
Try:
db.use_collection.find({}, {user_password:0, _id: 0 ,user_name:1 })

Mongoose how to use positional operator to pull from double nested array with specific condition, and return new result

Suppose I have the following schema:
{
_id: ObjectId(1),
title: string,
answers: [
{
_id: ObjectId(2),
text: string,
upVotes: [
{
_id: ObjectId(3),
userId: ObjectId(4)
}
]
}
]
}
What I want is to pull a specific user's vote from an answer's upVotes and return the updated result.
For example, find the question with id 1, get its answer with id 2, then pull my vote from that answer's upVotes using userId.
I want to do it with a single findOneAndUpdate query.
You can use the single positional $ operator with $pull to update the nested array:
db.collection.findOneAndUpdate(
{ "_id": ObjectId(1), "answers._id": ObjectId(2) },
{ "$pull": { "answers.$.upVotes": { "userId": ObjectId(4) }}}
)
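A Python version of the same filter and update might look like this. The build_pull_vote helper and the questions collection name are hypothetical. The commented call uses pymongo's find_one_and_update with ReturnDocument.AFTER, which asks for the document as it looks after the update.

```python
def build_pull_vote(question_id, answer_id, user_id):
    """Return (filter, update) that pulls one user's vote from one answer."""
    # "answers._id" in the filter is what lets the positional $ pick the matching answer.
    query = {"_id": question_id, "answers._id": answer_id}
    update = {"$pull": {"answers.$.upVotes": {"userId": user_id}}}
    return query, update

# Small integers stand in for the ObjectIds used in the question.
query, update = build_pull_vote(1, 2, 4)

# With pymongo (hypothetical collection name `questions`):
# from pymongo import ReturnDocument
# updated = db.questions.find_one_and_update(
#     query, update, return_document=ReturnDocument.AFTER
# )
```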
I think you want to search within the specific array:
db.collection.update(
{
"_id": "507f1f77bcf86cd799439011", // id field
"answers.upVotes._id": "507f1f77bcf86cd799439011" // id array
},
{
"$set": { "answers.$.upVotes": { userId: "507f1f77bcf86cd799439011" } } // edit
// use "$addToSet" to add
}
)

merge mongodb aggregate result

I have one collection say user. Structure of each document is something like this
_id:String
status:Int32
account:{
firstName:
lastName:
.. some other nested property
}
..some more property
My end goal is to generate a new nested field fullName in the account field, which is a concatenation of the two name fields. I can run an aggregate query like this:
db.user.aggregate(
[
{ $project: { 'account.name': { $concat: [ "$account.firstName", " ", "$account.lastName" ] } } }
])
If I add a $out stage, my existing data gets replaced. How do I actually merge so that my final structure remains as
_id:String
status:Int32
account:{
firstName:String
lastName:String
fullName:String
.. some other nested property
}
..some more property
In your $project pipeline, you need to include the other fields using the dot notation on the embedded fields as follows:
db.user.aggregate([
{
"$project": {
"status": 1,
"field1": 1, // the other properties
"field2": 1,
"account.firstName": 1,
"account.lastName": 1,
"account.name": {"$concat":["$account.firstName", " ", "$account.lastName"]}
}
},
{ "$out": "tempcollection" }
])
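A different technique from the $project/$out pipeline, sketched here in Python, is an update with an aggregation pipeline: a $set stage adds the new nested field in place and leaves the other fields alone. This assumes MongoDB 4.2 or later and a pymongo handle named db, both assumptions rather than details from the question. The with_full_name helper is a plain-Python stand-in showing the intended document shape, and the sample names are invented.

```python
# Update pipeline: $set adds account.fullName and keeps every other field.
add_full_name = [
    {
        "$set": {
            "account.fullName": {
                "$concat": ["$account.firstName", " ", "$account.lastName"]
            }
        }
    }
]
# With pymongo, assuming MongoDB 4.2+ (hypothetical handle `db`):
# db.user.update_many({}, add_full_name)

def with_full_name(doc):
    """Plain-Python stand-in showing the shape the update should produce."""
    account = dict(doc["account"])  # copy so the input document is not mutated
    account["fullName"] = account["firstName"] + " " + account["lastName"]
    return {**doc, "account": account}

sample = {"_id": "u1", "status": 1, "account": {"firstName": "Ada", "lastName": "Lovelace"}}
merged = with_full_name(sample)
```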