Aggregation: Project dotted field doesn't seem to work - mongodb

I have a database containing this document:
{"_id":{"$id":"xxx"},"duration":{"sec":137,"usec":0},"name":"test"}
If I call db.collection.aggregate with this pipeline:
{$project:{_id: 0, name: 1, duration: 1, seconds: "$duration.sec"}}
I get this result:
{"result":[{"duration":{"sec":137,"usec":0},"name":"test"}],"ok":1}
Why does the result not have a 'seconds' field? Have I used the wrong projection syntax?
I'm not entirely sure of the version of mongodb the server is running. I'm using the 1.3.1 php driver with php 5.4.3, but the server may be older than that - perhaps by about half a year?

According to the MongoDB documentation on $project:
You may also use $project to rename fields. Consider the following
example:
db.article.aggregate(
    { $project : {
        title : 1 ,
        page_views : "$pageViews" ,
        bar : "$other.foo"
    }} );
This operation renames the pageViews field to page_views, and renames the foo field in the other sub-document as the top-level
field bar.
That example seems to match up pretty well with what you are trying to do.
I know 10gen officially released the aggregation framework with MongoDB v2.2. Check out the current production release, which I believe is 2.2.3. If you are running on a prior development version, there could be something odd going on with aggregation.
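To make explicit what the projection in the question is expected to produce, here is a minimal plain-JavaScript sketch of how a field path such as "$duration.sec" resolves against that document. The helper resolveFieldPath is hypothetical and written only for illustration; it is not part of MongoDB or any driver, and it ignores array traversal.

```javascript
// Hypothetical helper (not a MongoDB API): resolves a "$a.b.c" field path
// against a plain object, roughly as $project does for nested sub-documents.
function resolveFieldPath(doc, fieldPath) {
  if (!fieldPath.startsWith("$")) {
    throw new Error("field paths start with '$'");
  }
  return fieldPath
    .slice(1)
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Document from the question (the _id is omitted here).
const doc = { duration: { sec: 137, usec: 0 }, name: "test" };

// Shape the projected result should have on a server that supports it.
const projected = {
  name: doc.name,
  duration: doc.duration,
  seconds: resolveFieldPath(doc, "$duration.sec"),
};

console.log(JSON.stringify(projected));
```

On a server where the projection works, the result should carry a seconds field equal to 137 alongside name and duration. If it is missing, the server version is the first thing to check.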

As Bryce said, this should work. I'm currently using MongoDB 2.6 through the shell, and the $project stage renames nested fields the way you are doing it.
db.article.aggregate({$project:{'_id': 0, 'name': 1, 'duration': 1, 'seconds': '$duration.sec'}})
I have not yet tried this through the Python or PHP drivers, but my earlier pipelines worked very well with the latest pymongo.

Related

Mongo error: "$out stage requires a string argument, but found object (code: 14, codeName: TypeMismatch)"

I'm trying to write a Mongo aggregation using the out operator as described in the docs. This is the aggregation I'm writing:
db.mycollectionname.aggregate([
{ $match: {} },
{ $project: {}},
{ $out: {to: "projets", mode: "insertDocuments"}}
])
When I execute this I get the following error: $out stage requires a string argument, but found object - clear in and of itself but it goes against what the docs say. When I provide a string to the $out stage, I don't get the error but that's not what I want.
Mongo version: 3.6.9
(I have more logic under the $project pipeline stage which I removed for brevity, it doesn't have any impact).
Can someone help me understand why this differs from what the docs say? And how I can provide the arguments I want to pass to the out stage (an object containing "to" and "mode") as a string?
Many thanks,
Chris.
You should look at the version specific documentation:
https://docs.mongodb.com/v3.6/reference/operator/aggregation/out/
$out in MongoDB 3.6 and MongoDB 4.0 only accepts a single string. In MongoDB 4.2, $out can use a dictionary to set the mode.
I think the problem is with your MongoDB version. You are using 3.6.9, but the documentation says:
MongoDB 4.2 adds a new syntax structure that implements expanded functionality and flexibility around merging aggregation pipeline results into a target collection, including support for sharded collections and output modes that preserve the existing collection data.
Just update your version and it will work. :)
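As a sketch of the string form that 3.6 accepts, the pipeline from the question could be written as below. The $match stage is the placeholder from the question, and any other stages would go before $out. The function hasStringOutAsLastStage is a hypothetical sanity check, not a MongoDB API.

```javascript
// Pipeline in the form MongoDB 3.6 accepts: $out takes the target
// collection name as a plain string and must be the last stage.
const outPipeline = [
  { $match: {} },       // placeholder filter from the question
  { $out: "projets" },  // collection name from the question
];

// Hypothetical sanity check (not a MongoDB API).
function hasStringOutAsLastStage(pipeline) {
  const last = pipeline[pipeline.length - 1];
  return last !== undefined && typeof last.$out === "string";
}

console.log(hasStringOutAsLastStage(outPipeline));

// In the mongo shell this would be run as:
//   db.mycollectionname.aggregate(outPipeline)
```

Note that with the string form, $out replaces the target collection if it already exists, so it does not give the insert-style behaviour the question was after.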

MongoDB:createCollection with parameters

I used this command
> db.createCollection("naveen",{capped:true,autoIndexId:true,size:53440099,max:1000});
and I got this:
{
"note" : "the autoIndexId option is deprecated and will be removed in a future release",
"ok" : 1
}
In fact, you don't need to create a collection explicitly in MongoDB. When you insert a document, MongoDB automatically creates the collection for you and also generates a unique _id for each document.
Example:
db.createCollection("naveen",{capped:true,autoIndexId:true,size:53440099,max:1000}); //don't
You don't need to create the collection first; you just insert a document directly. It's not like SQL; it's JSON.
{ name: 'john', ...}
Insert a document like this without creating the collection first, and MongoDB automatically creates the collection for you:
db.newYear.insert({ name: 'john', age: 33, year: 2020 });
Removing the autoIndexId parameter will remove the note in the response.
From MongoDB version 3.2, the autoIndexId parameter is deprecated when using createCollection, hence you are receiving this note message along with the ok value to make you aware of this.
The autoIndexId parameter is removed in version 3.4.
A reply to a comment below is worth including in this answer. The comment asked:
and replaced with what?
Looking at this SO answer, the MongoDB docs and MongoDB's JIRA, it seems they are pushing developers not to interfere with automatic indexing.
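A minimal sketch of the same call without the deprecated option follows. The helper withoutAutoIndexId is hypothetical and only strips the key from a plain options object; the remaining options are the ones from the question.

```javascript
// Hypothetical helper (not a MongoDB API): returns a copy of the
// createCollection options without the deprecated autoIndexId key.
function withoutAutoIndexId(options) {
  const { autoIndexId, ...rest } = options;
  return rest;
}

// Options from the question.
const originalOptions = { capped: true, autoIndexId: true, size: 53440099, max: 1000 };
const cleanedOptions = withoutAutoIndexId(originalOptions);

console.log(JSON.stringify(cleanedOptions));

// In the mongo shell this would be run as:
//   db.createCollection("naveen", cleanedOptions)
```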

MongoDB aggregation $lookup to a field that is an indexed array

I am trying a fairly complex aggregate command on two collections involving a $lookup pipeline. This normally works just fine on a simple aggregation as long as an index is set on foreignField.
But my $lookup is more complex, as the indexed field is not just a normal Int64 field but actually an array of Int64. When doing a simple find(), it is easy to verify using explain() that the index is being used. But explaining the aggregate pipeline does not show whether the index is being used in the $lookup pipeline. All my timing tests seem to indicate that the index is not being used. MongoDB version is 3.6.2. Db compatibility is set to 3.6.
As I said earlier, I am not using simple foreignField lookup but the 3.6-specific pipeline + $match + $expr...
Could using pipeline be a showstopper for the index? Does anyone have any deep experience with the new $lookup pipeline syntax and/or an index on an array field?
Examples
Either of the following works fine and if explained, shows that index on followers is being used.
db.col1.find({followers: {$eq : 823778}})
db.col1.find({followers: {$in : [823778]}})
But the following one does not seem to make use of the index on followers [there are more steps in the pipeline, stripped for readability].
db.col2.aggregate([
    {$match:{field: "123"}},
    {$lookup:{
        from: "col1",
        let : {follower : "$follower"},
        pipeline: [{
            $match: {
                $expr: {
                    $or: [
                        { $eq : ["$follower", "$$follower"] },
                        { $in : ["$$follower", "$followers"]}
                    ]
                }
            }
        }],
        as: "followers_all"
    }
}])
This is a missing feature which is going to be part of version 3.8.
Currently, $eq matches in a $lookup sub-pipeline are optimised to use indexes.
Refer to the JIRA ticket fixed in 3.7.1 (a development version).
Also, this may be relevant for non-multikey indexes.

MongoDB covered query on embedded document

I am learning indexing in MongoDB.
My sample schema is:
name
location
    street
    number
I have created two indexes, on name and on location.number.
When I type
db.table.find({ 'name': 'Steve' }, { _id: 0, 'name': 1 }).explain('executionStats')
I get a covered query, but when I type
db.table.find({ 'location.number': 46 }, { _id: 0, 'location.number': 1 }).explain('executionStats')
the totalDocsExamined is not equal to 0, so it is not a covered query. Why? The query contains only one field, which the index has, and _id is excluded just as in the first query. Do covered queries not work with embedded documents?
No, they do not. It is a well-documented restriction:
An index cannot cover a query if any of the indexed fields in the query predicate or returned in the projection are fields in embedded documents.
The text quoted in Alex Blex's answer no longer appears on the linked page. I haven't yet tested this with my own data, but I think this may be possible now.
According to the new docs:
Changed in version 3.6: An index can cover a query on fields within
embedded documents.
Version 3.6 was released in November 2017, so it was definitely a new feature at the time the original post was made.
See the docs for more examples.
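The check described in the question (totalDocsExamined equal to 0) can be sketched as a small function over the explain output. The helper isCoveredQuery and the sample objects are hypothetical; only the executionStats field names come from the explain('executionStats') output. The sketch also assumes at least one index key was scanned, so a query that matches nothing is not reported as covered.

```javascript
// Hypothetical helper: decides from explain("executionStats") output whether
// the query was answered from the index alone (no documents fetched).
function isCoveredQuery(explainOutput) {
  const stats = explainOutput.executionStats;
  return stats.totalDocsExamined === 0 && stats.totalKeysExamined > 0;
}

// Illustrative, made-up numbers; real output contains many more fields.
const coveredExample = {
  executionStats: { nReturned: 1, totalKeysExamined: 1, totalDocsExamined: 0 },
};
const fetchedExample = {
  executionStats: { nReturned: 1, totalKeysExamined: 1, totalDocsExamined: 1 },
};

console.log(isCoveredQuery(coveredExample), isCoveredQuery(fetchedExample));

// In the mongo shell, the real output would come from something like:
//   db.table.find({ "location.number": 46 }, { _id: 0, "location.number": 1 })
//           .explain("executionStats")
```

Running the same check against both queries from the question on a 3.6 or later server would be one way to confirm whether the embedded-field query is now covered.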

MongoDB full text search and the aggregation pipeline

I am currently using the full-text search capabilities of MongoDB to count the number of documents per hour which contain a certain keyword.
This is really interesting when run across a large collection where each document is a Tweet. For example, for the keyword "thanks" we see Nov 29 (Thanksgiving).
My current approach works (it generated the above plot) but it is not going to scale. At the moment I manually count the number of tweets in each hour by iterating over the documents returned by search. This approach is not going to scale as this search result will eventually reach the MongoDB document limit. At the moment it works because I have only 3.5 million tweets but I plan on collecting a lot more.
from collections import Counter

data = db.command('text', collection,
                  search=query,
                  project={'hour_bucket': 1, '_id': 0},
                  limit=-1
                  )
# Count matching tweets per hour bucket on the client side.
hours = Counter()
for d in data['results']:
    hours[d['obj']['hour_bucket']] += 1
My question is: can text-search be used inside the aggregation pipeline? This would fix all of my problems. However the only comment I have seen about this is the following: https://jira.mongodb.org/browse/SERVER-9063
Does anyone know what the status of this work is?
Somewhat coincidentally, support for text search in the aggregation framework has recently been committed with a tagged fixVersion for the upcoming MongoDB 2.5.5 development/unstable release (see SERVER-11675).
Assuming all goes well in QA/testing, this feature will be included in the 2.6 production release.
There should be some further information included in the draft 2.6 release notes after 2.5.5 is released, and I would encourage you to test this feature in your development environment.
FYI, you can find or subscribe to release announcements via the mongodb-announce discussion group.
Manual: http://docs.mongodb.org/manual/tutorial/text-search-in-aggregation/
Example:
db.tweets.aggregate(
    [
        { $match : { $text: { $search: "query" } } },
        { $project : { day : { $substr: ["$created_at", 0, 10]}}},
        { $group : { _id : "$day", number : { $sum : 1 }}},
        { $sort : { _id : 1 }}
    ]
)
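Applied to the hour_bucket field from the question, a pipeline of the same shape might look like the sketch below. The helper name hourlyCountPipeline and the output field name count are assumptions. The sketch also assumes a server version with $text support in aggregation and a text index on the collection.

```javascript
// Hypothetical helper: builds a pipeline that counts matching documents per
// hour_bucket on the server, replacing the client-side Counter loop.
function hourlyCountPipeline(searchTerm) {
  return [
    // $text is only allowed in a $match that is the first pipeline stage.
    { $match: { $text: { $search: searchTerm } } },
    { $group: { _id: "$hour_bucket", count: { $sum: 1 } } },
    { $sort: { _id: 1 } },
  ];
}

console.log(JSON.stringify(hourlyCountPipeline("thanks")));

// In the mongo shell this would be run as:
//   db.tweets.aggregate(hourlyCountPipeline("thanks"))
```

Because the grouping happens on the server, only one small document per hour bucket comes back, rather than every matching tweet.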