MongoDB count occurrences of a substring in a collection - mongodb

Hello, I'm a MongoDB beginner. I have a database containing an IRC chat log. The document structure is very simple:
{
"_id" : ObjectId("000"),
"user" : "username",
"message" : "foobar foobar potato idontknow",
"time" : NumberLong(1451775601469)
}
I have thousands of these and I want to count the number of occurrences of the string "foobar". I have googled this and found something about aggregations. It looks very complicated, and I haven't found any question about something this "simple". I'd be glad if someone pointed me in the right direction on what to research, and I wouldn't mind an example command that does exactly what I want. Thank you.

There is no built-in operator that does this.
You can try the following query, but it performs very poorly because it fetches every document and runs the regex on the client:
db.chat.find().forEach(function(doc){
print(doc["user"] + " > " + ((doc["message"].match(/foobar/g) || []).length))
})
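The loop above prints one count per document, while the question asks for a single total. As a rough sketch of the same counting logic in plain JavaScript: the docs array below is a hypothetical in-memory stand-in for what db.chat.find() returns, and countMatches and totalOccurrences are illustrative names, not MongoDB functions.

```javascript
// Hypothetical in-memory stand-in for the documents returned by db.chat.find().
const docs = [
  { user: "alice", message: "foobar foobar potato idontknow" },
  { user: "bob", message: "no match here" },
  { user: "carol", message: "foobar" },
  { user: "dave" } // a document without a message field
];

// Count non-overlapping matches of a global regex in one string.
function countMatches(text, pattern) {
  if (typeof text !== "string") return 0; // tolerate missing fields
  return (text.match(pattern) || []).length;
}

// Sum the per-document counts into one total.
function totalOccurrences(documents, pattern) {
  return documents.reduce(
    (sum, doc) => sum + countMatches(doc.message, pattern),
    0
  );
}

const total = totalOccurrences(docs, /foobar/g);
console.log(total); // prints 3
```

The pattern needs the g flag; without it, match returns only the first hit and every matching document would count as 1.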
If you could change your message field to an array, then we could apply aggregation...
EDIT:
If you add an array of the split words to each entry, we can apply aggregation.
Sample:
{
"_id" : ObjectId("569bb7040586bcb40f7d2539"),
"user" : "username",
"fullmessage" : "foobar foobar potato idontknow",
"message" : [
"foobar",
"foobar",
"potato",
"idontknow"
],
"time" : NumberLong(1451775601469)
}
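One way to get from the original shape to the sample above, sketched in plain JavaScript. This assumes words are separated by whitespace only; withWordArray is a hypothetical helper name, and writing the result back to the collection is left out.

```javascript
// Build the reshaped document: keep the original text in `fullmessage`
// and store the whitespace-separated words in `message`.
function withWordArray(doc) {
  const words = doc.message
    .trim()
    .split(/\s+/)
    .filter(word => word.length > 0); // an empty message yields an empty array
  return { ...doc, fullmessage: doc.message, message: words };
}

const original = {
  user: "username",
  message: "foobar foobar potato idontknow",
  time: 1451775601469
};

const reshaped = withWordArray(original);
console.log(reshaped.message); // [ 'foobar', 'foobar', 'potato', 'idontknow' ]
```

Splitting on whitespace keeps punctuation attached to words ("foobar," would be its own token), so a stricter tokenizer may be needed for real chat logs.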
Aggregation: we create a new entry for each array element ($unwind), keep the elements that match the given word (foobar, in this case), and then count the matches per original document.
db.chat.aggregate([
{"$unwind" : "$message"},
{"$match" : {"message" : {"$regex" : "foobar", "$options" : "i"}}},
{"$group" : {_id:{"_id" : "$_id", "user" : "$user", "time" : "$time", "fullmessage" : "$fullmessage"}, "count" : {$sum:1}}},
{"$project" : {_id:"$_id._id", "user" : "$_id.user", "time" : "$_id.time", "fullmessage" : "$_id.fullmessage", "count" : "$count"}}
])
Result:
[
{
"_id" : ObjectId("569bb7040586bcb40f7d2539"),
"count" : 2,
"user" : "username",
"time" : NumberLong(1451775601469),
"fullmessage" : "foobar foobar potato idontknow"
}
]

Related

Mongodb update with $concat and field missing

I want to update a collection to set a new field built from other fields of the same document, for example to generate a person's full name.
MongoDB Enterprise > db.name.find()
{ "_id" : ObjectId("5d7ca743c45316e35251a49e"), "first" : "Don", "middle" : "Jhon", "last" : "Trump" }
{ "_id" : ObjectId("5d7ca75bc45316e35251a49f"), "first" : "Dila", "last" : "Tp" }
{ "_id" : ObjectId("5d7ca76dc45316e35251a4a0"), "first" : "Li", "last" : "Wei" }
I want to set the full name to $first + $middle + $last in one update.
I tried update with an aggregation pipeline, which is a new feature in MongoDB 4.2:
db.name.updateMany({},[{$set:{full:{$concat: [ "$first", "$middle","$last" ] }}}])
But the result is null wherever one of the fields is missing:
db.name.find()
{ "_id" : ObjectId("5d7ca743c45316e35251a49e"), "first" : "Don", "middle" : "Jhon", "last" : "Trump", "full" : "DonJhonTrump" }
{ "_id" : ObjectId("5d7ca75bc45316e35251a49f"), "first" : "Dila", "last" : "Tp", "full" : null }
{ "_id" : ObjectId("5d7ca76dc45316e35251a4a0"), "first" : "Li", "last" : "Wei", "full" : null }
What you are looking for is the $ifNull operator. Rewrite your query like this:
db.name.updateMany({},[{$set:{full:{$concat: [ {$ifNull:["$first", ""]}, {$ifNull:["$middle", ""]},{$ifNull:["$last", ""]} ] }}}])
I also tried $cond. It works too, but it is more complicated:
db.name.updateMany({},[{$set:{full:{$concat: [ "$first", {"$cond" :{if :{$gt:["$middle", null]},then :"$middle" ,else :""}},"$last" ] }}}])
I'm putting it here in case it helps others.
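To see why the first attempt produced null and why $ifNull fixes it, here is a small plain-JavaScript imitation of the two operators. The concat and ifNull functions are hypothetical stand-ins that only mimic the null handling described above; they are not MongoDB APIs.

```javascript
// Mimics $concat: if any argument is null or missing, the result is null.
function concat(parts) {
  const hasGap = parts.some(part => part === null || part === undefined);
  return hasGap ? null : parts.join("");
}

// Mimics $ifNull: fall back to a replacement when the value is null or missing.
function ifNull(value, fallback) {
  return value === null || value === undefined ? fallback : value;
}

const people = [
  { first: "Don", middle: "Jhon", last: "Trump" },
  { first: "Dila", last: "Tp" } // no middle name
];

// First attempt: a missing field turns the whole result into null.
const naive = people.map(p => concat([p.first, p.middle, p.last]));

// With the fallback, a missing field contributes an empty string instead.
const guarded = people.map(p =>
  concat([ifNull(p.first, ""), ifNull(p.middle, ""), ifNull(p.last, "")])
);

console.log(naive);   // [ 'DonJhonTrump', null ]
console.log(guarded); // [ 'DonJhonTrump', 'DilaTp' ]
```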

Insert document into mongodb from existing table

I am trying to write a query in mongo that will create a new table, loop through my data set, and insert the TopExecutiveTitle into the new table. I also would like it to keep count of each position and only insert a position into the table when it is new.
This is what I have so far. This code loops through my table and inserts the TopExecutiveTitle into a new table. However, it does not group them together and keep count. How do I write my query so that it will?
db.car.find().forEach( function (x) {
db.TopExecutiveTable.insert({Topexecutivetitle: x.Topexecutivetitle})
});
Here is a sample of a document in my database.
{
"_id" : ObjectId("5a22c8e562c2e489c5df70fa"),
"2016rank" : 1,
"Dealershipgroupname" : "AutoNation Inc.?",
"Address" : "200 S.W. 1st Ave.",
"City/State/Zip" : "Fort Lauderdale, FL 33301",
"Phone" : "(954) 769-7000",
"Companywebsite" : "www.autonation.com",
"Topexecutive" : "Mike Jackson",
"Topexecutivetitle" : "chairman & CEO",
"Totalnewretailunits" : "337,622",
"Totalusedunits" : "225,713",
"Totalfleetunits" : 3,
"Totalwholesaleunits" : "82,342",
"Total_units" : "649,415",
"Total_number_of _dealerships" : 260,
"Grouprevenuealldepartments*" : "$21,609,000,000",
"2015rank" : 1
}
The result I would like is something like this:
{
"Topexecutivetitle" : "chairman & CEO",
"Count" : 3
}
{
"Topexecutivetitle" : "president",
"Count" : 7
}
To do this you need to use MongoDB's aggregation framework, something like this:
db.car.aggregate([
{
$group:{
_id:"$Topexecutivetitle",
count:{$sum:1}
}
},
{
$project:{
Topexecutivetitle:"$_id",
count:1,
_id:0
}
},
{
$out:"result"
}])
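This groups the documents by title, counts each group, and stores the output in a new collection "result". Because the $project stage renames _id to Topexecutivetitle, the documents produced by the pipeline look like this:
{
"Topexecutivetitle" : "president",
"count" : 1.0
},
{
"Topexecutivetitle" : "chairman & CEO",
"count" : 3.0
}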
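If it helps to see what the $group stage with $sum: 1 is doing, the same counting can be sketched in plain JavaScript. The cars array is made-up sample data and countByTitle is a hypothetical name; in the real pipeline the server does this work.

```javascript
// Made-up sample documents; only the grouped field matters here.
const cars = [
  { Topexecutivetitle: "chairman & CEO" },
  { Topexecutivetitle: "president" },
  { Topexecutivetitle: "chairman & CEO" },
  { Topexecutivetitle: "chairman & CEO" }
];

// Group by title and add 1 per document, like $group with count: {$sum: 1}.
function countByTitle(documents) {
  const counts = new Map();
  for (const doc of documents) {
    const key = doc.Topexecutivetitle;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Shape each entry like the output of the $project stage.
  return Array.from(counts, ([Topexecutivetitle, count]) => ({
    Topexecutivetitle,
    count
  }));
}

const titleCounts = countByTitle(cars);
console.log(titleCounts);
```

Each distinct title appears once in the output, which also covers the "only insert a position when it is new" requirement from the question.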

How to query an array of objects in mongodb

I have an object structure as shown below
{
"_id" : ObjectId("55d164f1c8f2c53a82535b9a"),
"plant_name" : "TOTAL",
"installed_capacity" : 3473,
"wind_data" : [
{
"date" : "16-08-15",
"timestamp" : " 16:27:15",
"generated_capacity" : 617.24,
"frequency" : 50.01
},
{
"date" : "16-08-15",
"timestamp" : " 21:21:15",
"generated_capacity" : 670.25,
"frequency" : 49.94
}, ....]
}
I need to sum up (at least retrieve) "generated_capacity" of all the objects under "wind_data" having "date" equal to "16-08-15" of "TOTAL" object. I have tried this query
db.collectionName.aggregate(
{"$unwind":"$wind_data"},
{"$match":{"plant_name":"TOTAL","wind_data.date":"16-08-15"}}
)
But, this query is not working. Please suggest some way to figure this out.
The following query would do the job. It adds a $group stage that sums generated_capacity over the matching wind_data entries:
db.collectionName.aggregate([
{"$unwind":"$wind_data"},
{"$match":{"plant_name":"TOTAL","wind_data.date":"16-08-15"}},
{"$group":{"_id":"$wind_data.date","generated_capacity_sum":{"$sum":"$wind_data.generated_capacity"}}}
])
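For intuition, the three stages map onto ordinary array operations. The sketch below runs them over in-memory data in plain JavaScript; the plants array is illustrative (the 17-08-15 entry and the OTHER plant are invented to show filtering), and sumCapacity is a hypothetical helper.

```javascript
// Illustrative data: two entries from the question plus invented ones.
const plants = [
  {
    plant_name: "TOTAL",
    wind_data: [
      { date: "16-08-15", generated_capacity: 617.24 },
      { date: "16-08-15", generated_capacity: 670.25 },
      { date: "17-08-15", generated_capacity: 100 } // invented, filtered out
    ]
  },
  {
    plant_name: "OTHER", // invented, filtered out
    wind_data: [{ date: "16-08-15", generated_capacity: 5 }]
  }
];

function sumCapacity(documents, plantName, date) {
  return documents
    // $unwind: one row per wind_data element
    .flatMap(doc =>
      doc.wind_data.map(entry => ({ plant_name: doc.plant_name, wind_data: entry }))
    )
    // $match: keep rows for the requested plant and date
    .filter(row => row.plant_name === plantName && row.wind_data.date === date)
    // $group with $sum: add up the remaining capacities
    .reduce((sum, row) => sum + row.wind_data.generated_capacity, 0);
}

const capacitySum = sumCapacity(plants, "TOTAL", "16-08-15");
console.log(capacitySum.toFixed(2));
```

The sum is a floating-point value, so round it for display instead of comparing it for exact equality.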

MongoDB Text search fails for stop words

I'm trying to run a query on my collection, but it's not returning anything.
Here's my query:
{'$match': {'$text': {'$search': 'a'}}},
{'$group': {'_id': {'texto': '$texto'},
'somanumero': {'$sum': '$numero'}}}
My collection:
{ "_id" : ObjectId("555cdc4fe13823315537042d"), "texto" : ObjectId("555cdc4fe13823315537042c"), "numero" : ObjectId("555cdc4fe13823315537042e") }
{ "_id" : ObjectId("555cdc5ee13823315537042f"), "numero" : 5, "texto" : "a", "lattexto" : "-15.79506", "lontexto" : "-47.88322" }
{ "_id" : ObjectId("555cdc6ae138233155370430"), "numero" : 10, "texto" : "a", "lattexto" : "-15.79506", "lontexto" : "-47.88322" }
{ "_id" : ObjectId("555cdc73e138233155370431"), "numero" : 3, "texto" : "b", "lattexto" : "-15.79506", "lontexto" : "-47.88322" }
And here's my text index:
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "texto_text",
"ns" : "OSA.teste_texto",
"default_language" : "portuguese",
"weights" : {
"texto" : 1
},
"language_override" : "language",
"textIndexVersion" : 2
}
When I use $group or $match alone, it works.
Am I doing something wrong?
From the docs:
MongoDB supports text search for various languages. text indexes drop
language-specific stop words (e.g. in English, “the”, “an”, “a”,
“and”, etc.) and uses simple language-specific suffix stemming.
The problem with your data is that the value you are searching for, a, is a language-specific stop word, and it is a stop word in Portuguese too. The Portuguese stop word list begins as follows, with a at the top:
a
ao
aos
aquela
aquelas
aquele
aqueles
aquilo
as
até
com
como
These words are never indexed, and hence whenever you query for stop words, you get no results.
At the same time, if you query for b, you get results, since it is not a stop word and is therefore indexed.
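A toy model of that behaviour in plain JavaScript: terms on the stop word list are dropped before anything is indexed or searched, so a search for a has nothing left to match. This is only a simplified illustration; STOP_WORDS holds just the words listed above, indexableTerms is a hypothetical name, and the real text index also applies stemming.

```javascript
// Only the stop words quoted above; the real Portuguese list is longer.
const STOP_WORDS = new Set([
  "a", "ao", "aos", "aquela", "aquelas", "aquele",
  "aqueles", "aquilo", "as", "até", "com", "como"
]);

// Return the terms that would survive stop word removal.
function indexableTerms(text) {
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter(term => term.length > 0 && !STOP_WORDS.has(term));
}

console.log(indexableTerms("a")); // []
console.log(indexableTerms("b")); // [ 'b' ]
```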

Mongodb + Mongoose: trying to add a sub-sub-item

Does this make any sense when trying to add a sub-sub-item? (I'm new to mongo - be merciful :-))
question = db.questions.findOne({_id: ObjectId("529c5d44211c9a8c11000006")})
question.answers[0].votes.insert(...)
When I run this from the mongo console the result is an error saying [object object] does not have the method insert.
I have the following mongoDB Question Schema.
{
"__v" : 2,
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d44211c9a8c11000006"),
"answers" : [
{
"postDate" : ISODate("2013-12-02T10:14:19.060Z"),
"postDateText" : "15min ago",
"authorEmail" : "guys#pix.com",
"authorName" : "guys#pix.com",
"body" : "You need magic powder",
"isWinner" : false,
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d7b211c9a8c11000008"),
"votes" : [
{
"voteType" : "up",
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d5b211c9a8c11000007")
}
]
}
],
"authorEmail" : "guys#wix.com",
"authorName" : "guys#wix.com",
"body" : "I'm trying to fly...\n\n<pre class=\"brush: js;\">\nfunction logName(name) {\n console.log(name);\n}\n</pre>",
"isResolved" : false,
"postDate" : ISODate("2013-12-02T10:13:24.235Z"),
"tags" : [
"fly"
],
"title" : "How do I fly?",
"views" : [],
"votes" : [
{
"voteType" : "up",
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d5b211c9a8c11000007")
}
]
}
I'm trying, given a questionId and an answerId to add a vote to the votes array (which is inside the answer). I can't seem to do it. Help?
insert is for adding whole new documents; when you just want to add a new element to an array field of an existing document, you can use update along with an operator like $push.
So, in the shell you would use something like this (the $push operator goes at the top level of the update document, with the array path as its key):
db.questions.update(
{_id: ObjectId("529c5d44211c9a8c11000006")},
{$push: {'answers.0.votes': voteToPush}})
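Since the question asks about a given questionId and answerId rather than a fixed array index, one option is the positional $ operator, which targets the array element matched by the query. A minimal sketch that only builds the two documents to pass to update; buildVoteUpdate is a hypothetical helper, and the ids here are placeholder strings where real code would pass ObjectId values.

```javascript
// Build the filter and update documents for pushing a vote into the
// votes array of the answer whose _id matches answerId.
function buildVoteUpdate(questionId, answerId, vote) {
  return {
    // Match the question and the specific answer inside it.
    filter: { _id: questionId, "answers._id": answerId },
    // "$" stands for the index of the answer matched by the filter.
    update: { $push: { "answers.$.votes": vote } }
  };
}

// Placeholder ids for illustration only.
const { filter, update } = buildVoteUpdate("questionId", "answerId", {
  voteType: "up"
});

console.log(JSON.stringify(filter));
console.log(JSON.stringify(update));
// In the shell these would be passed as: db.questions.update(filter, update)
```

The filter has to reference the answers array for the positional operator to resolve; without the "answers._id" condition the update fails.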