How to see the "queryDebugString" on MongoDB's Text Search? - mongodb

I'm trying the Text Search functionality from MongoDB, and out there on the internet I can see lots of information where people do this:
db.posts.runCommand("text", {search: '"robots are crazy"'})
And get this:
{
"queryDebugString" : "robot||||robots are||",
"language" : "english",
"results" : [
{
"score" : 0.6666666666666666,
"obj" : {
"_id" : ObjectId("50ebc482214a1e88aaa4ad9e"),
"txt" : "Robots are superior to humans"
}
}
],
"stats" : {
"nscanned" : 2,
"nscannedObjects" : 0,
"n" : 1,
"timeMicros" : 185
},
"ok" : 1
}
I know runCommand("text", ...) is deprecated, but I've also tried the db.posts.find({ $text: { $search: '"robots are crazy"' } }) approach, and the field doesn't show up there either.
How can I see this "queryDebugString" attribute? I've looked for some kind of debug flags to use when starting up mongod, but couldn't find anything.

For more recent versions of Mongo (2.6 at least), use .explain(true) for the verbose output, which will contain a parsedTextQuery field with more, and more readable, information than queryDebugString:
> db.test.find({ "$text" : { "$search" : "cows are lovely" } }).explain(true)
{
"cursor" : "TextCursor",
...
"stats" : {
"type" : "TEXT",
...
"parsedTextQuery" : {
"terms" : ["cow", "love"],
"negatedTerms" : [],
"phrases" : [],
"negatedPhrases" : []
}
}
...
}
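If you want to pull that field out of the explain result programmatically, here is a minimal sketch. It assumes the explain output is available as a plain object and that parsedTextQuery sits somewhere inside it; the exact nesting may differ between server versions (an assumption on my part), so the hypothetical helper findParsedTextQuery searches recursively. The sample object only mirrors the output above and is not real server output.

```javascript
// Hypothetical helper: walk an explain() result (a plain object) and return
// the first "parsedTextQuery" sub-document found, or null if there is none.
function findParsedTextQuery(node) {
  if (node === null || typeof node !== "object") {
    return null;
  }
  if (Object.prototype.hasOwnProperty.call(node, "parsedTextQuery")) {
    return node.parsedTextQuery;
  }
  // Object.values works for both nested documents and arrays.
  for (const value of Object.values(node)) {
    const found = findParsedTextQuery(value);
    if (found !== null) {
      return found;
    }
  }
  return null;
}

// Stand-in for the explain(true) output shown above (elided fields omitted).
const sampleExplain = {
  cursor: "TextCursor",
  stats: {
    type: "TEXT",
    parsedTextQuery: {
      terms: ["cow", "love"],
      negatedTerms: [],
      phrases: [],
      negatedPhrases: []
    }
  }
};

console.log(JSON.stringify(findParsedTextQuery(sampleExplain)));
```

In the shell you would pass the result of .explain(true) instead of sampleExplain.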

Just try this:
db.posts.find({ "text" : '"robots are crazy"' })
In current versions of Mongo, basic operations generally take this form:
db.collection.action({query})
I think that's enough to get started, but also have a look at the MongoDB documentation:
http://docs.mongodb.org/manual/

Related

MongoDB count occurrences of a substring in a collection

Hello, I'm a MongoDB beginner. I have a database of an IRC chat log. The document structure is very simple:
{
"_id" : ObjectId("000"),
"user" : "username",
"message" : "foobar foobar potato idontknow",
"time" : NumberLong(1451775601469)
}
I have thousands of these and I want to count the number of occurrences of the string "foobar". I have googled this issue and found something about aggregations. It looks very complicated and I haven't really found any issue this "simple". I'd be glad if someone pointed me in the right direction on what to research, and I wouldn't mind an example command that does exactly what I want. Thank you.
There is no built-in operator that does this.
You can try this query, but it performs very poorly because it reads every document on the client:
db.chat.find().forEach(function(doc){
print(doc["user"] + " > " + ((doc["message"].match(/foobar/g) || []).length))
})
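The loop above prints one count per document. If what you want is a single total, here is a minimal client-side sketch. The helpers countOccurrences and countInMessages are hypothetical names, and sampleDocs is made-up data in the shape of the chat documents; like the loop above, this still reads every message, so it will not be fast.

```javascript
// Hypothetical helper: count non-overlapping occurrences of a literal
// substring in one string.
function countOccurrences(text, needle) {
  if (typeof text !== "string" || needle.length === 0) {
    return 0;
  }
  let count = 0;
  let index = text.indexOf(needle);
  while (index !== -1) {
    count += 1;
    index = text.indexOf(needle, index + needle.length);
  }
  return count;
}

// Sum the per-message counts over an array of chat documents.
function countInMessages(docs, needle) {
  return docs.reduce(
    (total, doc) => total + countOccurrences(doc.message, needle),
    0
  );
}

// Made-up sample data in the shape of the chat documents.
const sampleDocs = [
  { user: "username", message: "foobar foobar potato idontknow" },
  { user: "other", message: "no match here" },
  { user: "third", message: "xfoobarx" }
];

console.log(countInMessages(sampleDocs, "foobar")); // 3
```

In the shell you could feed it db.chat.find().toArray(), which loads the whole collection into memory, so treat it as an illustration rather than something for a large collection.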
If you could change your message field to an array, we could apply aggregation.
EDIT:
If you add an array of the split words to each entry, we can apply aggregation.
Sample:
{
"_id" : ObjectId("569bb7040586bcb40f7d2539"),
"user" : "username",
"fullmessage" : "foobar foobar potato idontknow",
"message" : [
"foobar",
"foobar",
"potato",
"idontknow"
],
"time" : NumberLong(1451775601469)
}
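One way to produce that shape is to split the original message on whitespace before storing it. This is a minimal sketch; the helper name withWordArray is hypothetical, and splitting on whitespace is an assumption about what counts as a word in your chat log.

```javascript
// Hypothetical helper: return a copy of a chat document where the original
// text moves to "fullmessage" and "message" becomes an array of words.
function withWordArray(doc) {
  const words = doc.message
    .split(/\s+/)
    .filter((word) => word.length > 0);
  return Object.assign({}, doc, {
    fullmessage: doc.message,
    message: words
  });
}

const converted = withWordArray({
  user: "username",
  message: "foobar foobar potato idontknow"
});

console.log(JSON.stringify(converted.message));
```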
Aggregation: we create a new entry for each array element ($unwind), keep the entries that match the given word (foobar, in this case), and then count the matches per original document.
db.chat.aggregate([
{"$unwind" : "$message"},
{"$match" : {"message" : {"$regex" : "foobar", "$options" : "i"}}},
{"$group" : {_id:{"_id" : "$_id", "user" : "$user", "time" : "$time", "fullmessage" : "$fullmessage"}, "count" : {$sum:1}}},
{"$project" : {_id:"$_id._id", "user" : "$_id.user", "time" : "$_id.time", "fullmessage" : "$_id.fullmessage", "count" : "$count"}}
])
Result:
[
{
"_id" : ObjectId("569bb7040586bcb40f7d2539"),
"count" : 2,
"user" : "username",
"time" : NumberLong(1451775601469),
"fullmessage" : "foobar foobar potato idontknow"
}
]

Mongoid query embedded document and return parent

I have this document, each is a tool:
{
"_id" : ObjectId("54da43aea96ddcc40915a457"),
"checked_in" : false,
"barcode" : "PXJ-234234",
"calibrations" : [
{
"_id" : ObjectId("54da46ec546173129d810100"),
"cal_date" : null,
"cal_date_due" : ISODate("2014-08-06T00:00:00.000+0000"),
"time_in" : ISODate("2015-02-10T17:46:20.250+0000"),
"time_out" : ISODate("2015-02-10T17:46:20.250+0000"),
"updated_at" : ISODate("2015-02-10T17:59:08.796+0000"),
"created_at" : ISODate("2015-02-10T17:59:08.796+0000")
},
{
"_id" : ObjectId("5509e815686d610b70010000"),
"cal_date_due" : ISODate("2015-03-18T21:03:17.959+0000"),
"time_in" : ISODate("2015-03-18T21:03:17.959+0000"),
"time_out" : ISODate("2015-03-18T21:03:17.959+0000"),
"cal_date" : ISODate("2015-03-18T21:03:17.959+0000"),
"updated_at" : ISODate("2015-03-18T21:03:17.961+0000"),
"created_at" : ISODate("2015-03-18T21:03:17.961+0000")
},
{
"_id" : ObjectId("5509e837686d610b70020000"),
"cal_date_due" : ISODate("2015-03-18T21:03:51.189+0000"),
"time_in" : ISODate("2015-03-18T21:03:51.189+0000"),
"time_out" : ISODate("2015-03-18T21:03:51.189+0000"),
"cal_date" : ISODate("2015-03-18T21:03:51.189+0000"),
"updated_at" : ISODate("2015-03-18T21:03:51.191+0000"),
"created_at" : ISODate("2015-03-18T21:03:51.191+0000")
}
],
"group" : "Engine",
"location" : "Here or there",
"model" : "ZX101C",
"serial" : NumberInt(15449),
"tool" : "octane analyzer",
"updated_at" : ISODate("2015-09-30T20:43:55.652+0000"),
"description" : "Description...",
}
Tools are calibrated periodically. What I want to do is grab tools that are due this month.
Currently, my query is this:
scope :upcoming, -> { where(:at_ats => false).where('calibrations.0.cal_date_due' => {'$gte' => Time.now-1.day, '$lte' => Time.now+30.days}).order_by(:'calibrations.cal_date_due'.asc) }
However, this query gets the tool by the first calibration object and it needs to be the last. I've tried a myriad of things, but I'm stuck here.
How can I make sure I'm querying the most recent calibration document, not the first (which would be the oldest and therefore not relevant)?
You should look into the aggregation framework and the $unwind operator.
This link may be of help.
This link may be helpful. It contains an example of using the aggregation framework to get the last element of an array, which is the most recent calibration in your case.
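To make the "last element, not the first" logic concrete, here is a minimal client-side sketch in plain JavaScript. It is not a Mongoid scope and not the aggregation pipeline itself; toolsDueBetween is a hypothetical helper, and the second sample tool is made up. It only shows the selection rule the query needs to express.

```javascript
// Hypothetical client-side filter: keep tools whose LAST calibration is due
// between `from` and `to` (inclusive). Field names follow the document above.
function toolsDueBetween(tools, from, to) {
  return tools.filter((tool) => {
    const calibrations = tool.calibrations || [];
    if (calibrations.length === 0) {
      return false;
    }
    const last = calibrations[calibrations.length - 1];
    const due = last.cal_date_due;
    return due instanceof Date && due >= from && due <= to;
  });
}

const sampleTools = [
  {
    barcode: "PXJ-234234",
    calibrations: [
      { cal_date_due: new Date("2014-08-06T00:00:00.000Z") },
      { cal_date_due: new Date("2015-03-18T21:03:17.959Z") }
    ]
  },
  {
    // Made-up second tool whose most recent calibration is outside the window.
    barcode: "HYPOTHETICAL-1",
    calibrations: [
      { cal_date_due: new Date("2015-03-10T00:00:00.000Z") },
      { cal_date_due: new Date("2016-01-01T00:00:00.000Z") }
    ]
  }
];

const dueTools = toolsDueBetween(
  sampleTools,
  new Date("2015-03-01T00:00:00.000Z"),
  new Date("2015-03-31T23:59:59.999Z")
);

console.log(dueTools.map((tool) => tool.barcode));
```

The second tool is skipped even though an older calibration falls inside the window, which is the behaviour the question asks for.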

MongoDB Text search fails for stop words

I'm trying to run a query against my collection, but it's not returning anything.
Here's my query:
{'$match': {'$text': {'$search': 'a'}}},
{'$group': {'_id': {'texto': '$texto'},
'somanumero': {'$sum': '$numero'}}}
My collection:
{ "_id" : ObjectId("555cdc4fe13823315537042d"), "texto" : ObjectId("555cdc4fe13823315537042c"), "numero" : ObjectId("555cdc4fe13823315537042e") }
{ "_id" : ObjectId("555cdc5ee13823315537042f"), "numero" : 5, "texto" : "a", "lattexto" : "-15.79506", "lontexto" : "-47.88322" }
{ "_id" : ObjectId("555cdc6ae138233155370430"), "numero" : 10, "texto" : "a", "lattexto" : "-15.79506", "lontexto" : "-47.88322" }
{ "_id" : ObjectId("555cdc73e138233155370431"), "numero" : 3, "texto" : "b", "lattexto" : "-15.79506", "lontexto" : "-47.88322" }
And here's my text index:
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "texto_text",
"ns" : "OSA.teste_texto",
"default_language" : "portuguese",
"weights" : {
"texto" : 1
},
"language_override" : "language",
"textIndexVersion" : 2
}
When I use $group or $match alone, it works.
Am I doing something wrong?
From the docs:
MongoDB supports text search for various languages. text indexes drop
language-specific stop words (e.g. in English, “the”, “an”, “a”,
“and”, etc.) and uses simple language-specific suffix stemming.
The problem is that the term you search for, a, is a language-specific stop word, and it is a stop word in Portuguese too. It is at the top of the Portuguese stop word list, which begins:
a
ao
aos
aquela
aquelas
aquele
aqueles
aquilo
as
até
com
como
These words are never indexed, so whenever you query for a stop word you get no results.
If you query for b instead, you get results, since it is not a stop word and is indexed.
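As a rough illustration of why the search for a comes back empty, here is a minimal sketch of stop-word filtering. It is not MongoDB's tokenizer: the STOP_WORDS set contains only the partial list quoted above, the helper indexableTerms is hypothetical, and stemming is left out entirely.

```javascript
// Partial Portuguese stop-word list, copied from the excerpt above.
const STOP_WORDS = new Set([
  "a", "ao", "aos", "aquela", "aquelas", "aquele",
  "aqueles", "aquilo", "as", "até", "com", "como"
]);

// Hypothetical helper: lower-case the search string, split on whitespace,
// and drop every term that is a stop word.
function indexableTerms(search) {
  return search
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0 && !STOP_WORDS.has(term));
}

console.log(indexableTerms("a").length); // 0, nothing left to search for
console.log(indexableTerms("b").length); // 1, "b" survives
```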

mongodb query should be covered by index but is not

the query:
db.myColl.find({"M.ST": "mostrepresentedvalueinthecollection", "M.TS": new Date(2014,2,1)}).explain()
explain output :
"cursor" : "BtreeCursor M.ST_1_M.TS_1",
"isMultiKey" : false,
"n" : 587606,
"nscannedObjects" : 587606,
"nscanned" : 587606,
"nscannedObjectsAllPlans" : 587606,
"nscannedAllPlans" : 587606,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 9992,
"nChunkSkips" : 0,
"millis" : 174820,
"indexBounds" : {
"M.ST" : [
[
"mostrepresentedvalueinthecollection",
"mostrepresentedvalueinthecollection"
]
],
"M.TS" : [
[
ISODate("2014-03-01T00:00:00Z"),
ISODate("2014-03-01T00:00:00Z")
]
]
},
"server" : "myServer"
additional details: myColl contains about 40m documents, average object size is 300b.
I don't get why indexOnly is not set to true, I have a compound index on {"M.ST":1, "M.TS":1}
The mongo host is a unix box with 16gb RAM and 500gb disk space (spinning disk).
The total index size of the database is 10gb. We've got around 1k upserts/sec; of those, about 20 are inserts and the rest are increments.
We have another query that adds a third field in the find query (called "M.X"), and also a compound index on "M.ST", "M.X", "M.TS". That one is lightning fast and scans only 330 documents.
Any idea what could be wrong ?
Thanks.
EDIT : here's the structure of a sample document:
{
"_id" : "somestring",
"D" : {
"20140301" : {
"IM" : {
"CT" : 143
}
},
"20140302" : {
"IM" : {
"CT" : 44
}
},
"20140303" : {
"IM" : {
"CT" : 206
}
},
"20140314" : {
"IM" : {
"CT" : 5
}
}
},
"Y" : "someotherstring",
"IM" : {
"CT" : 1
},
"M" : {
"X" : 99999,
"ST" : "mostrepresentedvalueinthecollection",
"TS" : ISODate("2014-03-01T00:00:00.000Z")
},
}
The idea is to store some analytics metrics by month, the "D" field represents an array of documents containing data for each day of the month.
EDIT:
This feature is not currently implemented. The corresponding JIRA ticket is SERVER-2104. You can upvote it, but for now, to use covered index queries you need to avoid dot notation and embedded documents.
I think you need to set a projection on that query, so that it only returns fields that are in the index.
Try this:
db.myColl.find({"M.ST": "mostrepresentedvalueinthecollection", "M.TS": new Date(2014,2,1)},{ "M.ST":1, "M.TS":1, _id:0 }).explain()
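As a rule of thumb, a query can only be covered when every queried field and every returned field is in the index, and _id is excluded unless it is indexed too. Here is a minimal sketch of that check; isCoveredByIndex is a hypothetical helper, not a MongoDB function, and it deliberately ignores multikey indexes and the dot-notation limitation mentioned above, so a true result does not guarantee indexOnly: true.

```javascript
// Hypothetical rule-of-thumb check, not a MongoDB API:
// - every queried field must be an index key,
// - the projection must include at least one field, all of them index keys,
// - _id must be excluded explicitly unless it is itself an index key.
function isCoveredByIndex(indexKeys, queryFields, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const queryOk = queryFields.every((field) => indexed.has(field));
  const included = Object.keys(projection).filter(
    (field) => projection[field] === 1
  );
  const projectionOk =
    included.length > 0 && included.every((field) => indexed.has(field));
  const idOk = indexed.has("_id") || projection._id === 0;
  return queryOk && projectionOk && idOk;
}

const compoundIndex = { "M.ST": 1, "M.TS": 1 };

console.log(
  isCoveredByIndex(compoundIndex, ["M.ST", "M.TS"], {
    "M.ST": 1,
    "M.TS": 1,
    _id: 0
  })
);
console.log(isCoveredByIndex(compoundIndex, ["M.ST", "M.TS"], {}));
```

The second call models the original query, which has no projection and therefore returns whole documents.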

Mongodb + Mongoose: trying to add a sub-sub-item

Does this make any sense when trying to add a sub-sub-item? (I'm new to mongo - be merciful :-))
question = db.questions.findOne({_id: ObjectId("529c5d44211c9a8c11000006")})
question.answers[0].votes.insert(...)
When I run this from the mongo console the result is an error saying [object object] does not have the method insert.
I have the following mongoDB Question Schema.
{
"__v" : 2,
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d44211c9a8c11000006"),
"answers" : [
{
"postDate" : ISODate("2013-12-02T10:14:19.060Z"),
"postDateText" : "15min ago",
"authorEmail" : "guys#pix.com",
"authorName" : "guys#pix.com",
"body" : "You need magic powder",
"isWinner" : false,
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d7b211c9a8c11000008"),
"votes" : [
{
"voteType" : "up",
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d5b211c9a8c11000007")
}
]
}
],
"authorEmail" : "guys#wix.com",
"authorName" : "guys#wix.com",
"body" : "I'm trying to fly...\n\n<pre class=\"brush: js;\">\nfunction logName(name) {\n console.log(name);\n}\n</pre>",
"isResolved" : false,
"postDate" : ISODate("2013-12-02T10:13:24.235Z"),
"tags" : [
"fly"
],
"title" : "How do I fly?",
"views" : [],
"votes" : [
{
"voteType" : "up",
"_creator" : ObjectId("529c5d2d211c9a8c11000005"),
"_id" : ObjectId("529c5d5b211c9a8c11000007")
}
]
}
I'm trying, given a questionId and an answerId to add a vote to the votes array (which is inside the answer). I can't seem to do it. Help?
insert is for adding whole new documents; when you just want to add a new element to an array field of an existing document, you can use update along with an operator like $push.
So, in the shell you would use something like this:
db.questions.update(
{_id: ObjectId("529c5d44211c9a8c11000006")},
{$push: {'answers.0.votes': voteToPush}})
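Since the question starts from a questionId and an answerId rather than an array position, here is a minimal sketch that builds the filter and update documents using MongoDB's positional $ operator, which refers to the first array element matched by the filter. The helper buildVotePush is hypothetical, and the string ids below are stand-ins for real ObjectId values.

```javascript
// Hypothetical helper: build the filter and update documents for pushing a
// vote into the votes array of one specific answer.
function buildVotePush(questionId, answerId, vote) {
  return {
    // Match the question and, inside it, the answer with the given _id.
    filter: { _id: questionId, "answers._id": answerId },
    // "$" stands for the position of the answer matched by the filter.
    update: { $push: { "answers.$.votes": vote } }
  };
}

// Plain strings stand in for ObjectId values so the sketch runs anywhere.
const spec = buildVotePush(
  "529c5d44211c9a8c11000006",
  "529c5d7b211c9a8c11000008",
  { voteType: "up" }
);

console.log(JSON.stringify(spec, null, 2));
```

In the shell you would then call db.questions.update(spec.filter, spec.update), passing real ObjectId values instead of strings.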