MongoCXX - handling a cursor from distinct - mongodb

It should be fairly obvious what I'm trying to do here: query a collection, m_coll, and get all unique values of Density from it. However, what it returns seems to be an element rather than a full document, so it can't be indexed by key, and the call throws: C++ exception with description "unset document::element" thrown in the test body. What modification is needed to make this work?
std::vector<int> MongoReader::getLvlOne()
{
    std::vector<int> ret;
    bsoncxx::builder::stream::document empty;
    mongocxx::cursor cursor = m_coll.distinct("Density", empty.view());
    for (bsoncxx::document::view doc : cursor)
    {
        ret.push_back(doc["Density"].get_int32());
    }
    return ret;
}

This is really obscure and poorly documented, for which I apologize. I've opened a Jira ticket, CXX-1406, about improving docs and providing an example.
The distinct method returns a cursor, but it only ever provides a single document that looks like this:
{
    "values" : [ "A", "B" ],
    "ok" : 1
}
That's just exactly what the distinct database command returns.
You can see an example of usage in the tests for distinct.
There's a ticket, CXX-1126, for a better API, but it would be a breaking change, so we're not sure when we'll address it.
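As a language-neutral illustration of the result shape described above, here is a minimal Python sketch that treats the cursor as an iterable yielding that single result document and pulls the distinct values out of its "values" array. The function name extract_distinct_values and the plain dicts standing in for BSON documents are assumptions made for this example; they are not part of the mongocxx API. In C++ the same idea would mean reading the "values" field of the one document and iterating over that array, rather than keying each document by "Density".

```python
def extract_distinct_values(cursor):
    """Collect the distinct values from a distinct-style result.

    `cursor` is any iterable of dict-like documents shaped like
    {"values": [...], "ok": 1}; in practice it yields exactly one.
    """
    values = []
    for doc in cursor:
        # The distinct values live in the "values" array,
        # not under the name of the queried field.
        values.extend(doc.get("values", []))
    return values


# Hypothetical stand-in for what the cursor yields.
fake_cursor = [{"values": [10, 20, 30], "ok": 1}]
densities = extract_distinct_values(fake_cursor)
print(densities)
```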

Related

Unique Values in NoSQL

Consider mongodb or couchbase. What if I need a certain value to be unique (maybe incremental) within the range of UINT32?
Well, I guess I could add a field like another_id and use something like this to increment it (mongo).
function getNextSequence(name) {
    var ret = db.counters.findAndModify(
        {
            query: { _id: name },
            update: { $inc: { seq: 1 } },
            new: true
        }
    );
    return ret.seq;
}
db.users.insert(
    {
        another_id : getNextSequence("userid"),
        name : "Stack O. Flow"
    }
)
But really the question is,
Is this approach safe?
Should I even use NoSQL for this? (Consider that I only have around 50M rows of data, but I really need fast reads and writes because these rows get updated a few times a second.)
If I should stick with SQL, which one should I use? I've used MySQL, joining quite a few tables, and it was too slow (though a lack of optimization might be at fault).
Thank you for any suggestions.
There is a specific counter object in Couchbase that should do what you want. Here is an example of it with Node.js.
You could relate it to the main object you are using by doing an objectID such as:
original_objectID::counter.
Then when you go to get the original object, you just do another get for the counter object by ID and done. You can iterate it easily as well. So if you needed to get the object and the original objectID was
user::kirk
then that user's counter object would be:
user::kirk::counter
And you can get and set it by that ID. It works very well in Couchbase.
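The key naming convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper counter_key and the in-memory CounterStore class are hypothetical stand-ins written for this example, not Couchbase SDK calls, and their exact semantics (such as starting from 0) are not taken from Couchbase. A real counter operation is performed atomically by the server; here a lock plays that role.

```python
import threading


def counter_key(object_id):
    """Derive the counter document's key from the main object's key."""
    return f"{object_id}::counter"


class CounterStore:
    """Hypothetical in-memory stand-in for a key-value store with counters."""

    def __init__(self):
        self._counters = {}
        self._lock = threading.Lock()

    def increment(self, key, delta=1, initial=0):
        # The lock makes read-modify-write a single step, mimicking
        # the atomic counter operation a real server would provide.
        with self._lock:
            value = self._counters.get(key, initial) + delta
            self._counters[key] = value
            return value

    def get(self, key):
        with self._lock:
            return self._counters[key]


store = CounterStore()
key = counter_key("user::kirk")
first = store.increment(key)
second = store.increment(key)
print(key, first, second)
```

Because the counter key is derived from the object key, no lookup table is needed: knowing user::kirk is enough to find its counter.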

MongoDB: Find document given field values in an object with an unknown key

I'm making a database on theses/arguments. They are related to other arguments, which I've placed in an object with a dynamic key, which is completely random.
{
    _id : "aeokejXMwGKvWzF5L",
    text : "test",
    relations : {
        cF6iKAkDJg5eQGsgb : {
            type : "interpretation",
            originId : "uFEjssN2RgcrgiTjh",
            ratings: [...]
        }
    }
}
Can I find this document if I only know the value of type? That is, I want to do something like this:
db.theses.find({relations['anything']: { type: "interpretation"}})
This could've been done easily with the positional operator, if relations had been an array. But then I cannot make changes to the objects in ratings, as mongo doesn't support those updates. I'm asking here to see if I can keep from having to change the database structure.
You seem to have arrived at this structure because of a problem with updates to nested arrays, but in doing so you have only created another problem: searching on unspecified keys is not supported either. There is no "wildcard" concept for matching unknown keys with the standard query operators, which are the ones that perform well.
The only way you can really search for such data is by using JavaScript code on the server to traverse the keys using $where. This is clearly not a really good idea as it requires brute force evaluation rather than using useful things like an index, but it can be approached as follows:
db.theses.find(function() {
    var relations = this.relations;
    return Object.keys(relations).some(function(rel) {
        return relations[rel].type == "interpretation";
    });
})
While this will return those objects from the collection that contain the required nested value, it must inspect each object in the collection in order to do the evaluation. This is why such evaluation should really only be used alongside another query condition that can use an index directly, so that fewer documents have to be evaluated.
Still, the better solution is to consider remodelling the data to take advantage of indexes in search. Where it is necessary to update the "ratings" information, "flatten" the structure so that each "rating" element is the only array data:
{
    "_id": "aeokejXMwGKvWzF5L",
    "text": "test",
    "relationsRatings": [
        {
            "relationId": "cF6iKAkDJg5eQGsgb",
            "type": "interpretation",
            "originId": "uFEjssN2RgcrgiTjh",
            "ratingId": 1,
            "ratingScore": 5
        },
        {
            "relationId": "cF6iKAkDJg5eQGsgb",
            "type": "interpretation",
            "originId": "uFEjssN2RgcrgiTjh",
            "ratingId": 2,
            "ratingScore": 6
        }
    ]
}
Now searching is of course quite simple:
db.theses.find({ "relationsRatings.type": "interpretation" })
And of course the positional $ operator can now be used with the flatter structure:
db.theses.update(
    { "relationsRatings.ratingId": 1 },
    { "$set": { "relationsRatings.$.ratingScore": 7 } }
)
Of course this means duplicating the "related" data for each "ratings" value, but this is generally the cost of being able to update by matched position, as that is supported only with a single level of array nesting.
So you can force the logic to match with the way you have it structured, but it is not a great idea to do so and will lead to performance problems. If however your main need here is to update the "ratings" information rather than just append to the inner list, then a flatter structure will be of greater benefit and of course be a lot faster to search.
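To make the remodelling concrete, here is a hedged Python sketch that converts a document in the original shape into the flatter one. The function name flatten_relations is hypothetical, and because the contents of ratings are elided in the question, the sample ratings below (with ratingId and ratingScore keys, borrowed from the flattened example) are an assumption about what each rating looks like.

```python
def flatten_relations(doc):
    """Return a new document with one array entry per (relation, rating)."""
    flat = []
    for relation_id, relation in doc.get("relations", {}).items():
        for rating in relation.get("ratings", []):
            flat.append({
                "relationId": relation_id,
                "type": relation["type"],
                "originId": relation["originId"],
                # Assumed rating shape; the original elides it.
                "ratingId": rating["ratingId"],
                "ratingScore": rating["ratingScore"],
            })
    return {
        "_id": doc["_id"],
        "text": doc["text"],
        "relationsRatings": flat,
    }


original = {
    "_id": "aeokejXMwGKvWzF5L",
    "text": "test",
    "relations": {
        "cF6iKAkDJg5eQGsgb": {
            "type": "interpretation",
            "originId": "uFEjssN2RgcrgiTjh",
            "ratings": [
                {"ratingId": 1, "ratingScore": 5},
                {"ratingId": 2, "ratingScore": 6},
            ],
        }
    },
}

flattened = flatten_relations(original)
print(len(flattened["relationsRatings"]))
```

Note that in this sketch a relation with no ratings produces no entry at all, so a real migration would need to decide how to keep such relations.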

Read response from MongoDB operation

In my application I insert and update some documents.
I need to act depending on the result of each operation, but I do not understand how to use the WriteResult object.
This is the toString() of an update that completed successfully:
Update write result: { "serverUsed" : "xxx.xxx.xxx.xxx:27017" , "ok" : 1 , "n" : 1 , "updatedExisting" : true}
Now, from the documentation I read that the getLastError methods are deprecated.
I have getN, which just tells me how many records have been updated (meaningless for inserts).
I have no method to retrieve the OK value.
Do you have any suggestions on how to use the WriteResult object to understand the result of an operation?
Thanks in advance,
Samuel
I was using another framework that was hiding this deprecation, but since having removed that framework, I also came upon this problem.
I was as puzzled as you were about what the documentation was trying to say. Having gone through the MongoDB source code, it looks like what was previously this:
WriteResult result = collection.update(query, update, true, false);
if (!result.getLastError().ok()) {
    // handle error here
}
now looks like this:
try {
    collection.update(query, update, true, false);
}
catch (CommandFailureException e) {
    CommandResult result = e.getCommandResult();
    // you can use result.ok() here, but it should almost always return false.
}
MongoException is a runtime exception with several subclasses, including CommandFailureException, so you may want to take a look at the javadocs.

Best way to create a mongo expression that never matches

What I am looking for is somehow the equivalent of doing in SQL:
WHERE 1 = 0
I'm looking for such a thing because I'm building a typesafe DSL to perform queries on my domain, supporting conjunctions and disjunctions. Sometimes it may be easier to add a query that never matches anything than to deal with it in the code.
For example, in my use case:
StampleFilters().underCategoryIds(sharedCategoryIds.toList)
In this case, it does not work as expected because sharedCategoryIds is empty, so the resulting query is $(), which does not filter anything.
For an empty list, I would rather build a query that never returns anything.
Is there an easy way to do such a thing, without any impact on performance?
I could probably add some query like { somefield: unexistingvalue }, but I wonder whether there is something better.
Edit
I expect the expression to be composable. I mean it should work in queries like $or(exp1,exp2,exp3) where exp1 is, for example, the expression that never matches.
If you have any suggestions, it would be nice to explain why one is better than the others and how it affects the query engine's performance (or not).
I think the best way to achieve what you want is to add {_id : -1}.
db.coll.find({a : 1}) will be transformed into db.coll.find({a : 1, _id : -1}). This is simpler than all of shx2's solutions (except the last one with noScan, which is nice).
Moreover, the _id field already has a primary index, so the server will quickly determine that no such _id value exists in the collection.
P.S. If someone is clever enough to use -1 as an _id, you can use {_id : NaN} instead.
If there is a document with _id = NaN, then you most probably need to redevelop your app.
I came up with a few ways to achieve that:
"P&!P": { $and: [ {X:0}, {X:{$ne:0}} ] }
Can't be "$in" an empty list: { X: {$in: []} }
Nothing can be this long: { X: {$size: 9999999999999999} }
"noScan": db.coll.find({})._addSpecial("$maxScan", 0)
EDIT:
one more, using $where: { $where: function() {return 0} }
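A small Python sketch of how a never-matching expression can stay composable in a query-building DSL. It assembles plain filter dictionaries only and does not talk to a server. The helper names (under_category_ids, or_) and the field name categoryId are hypothetical, chosen to mirror the question; NEVER uses the empty $in form listed above, applied to _id.

```python
# A filter that matches no documents: nothing is "in" an empty list.
NEVER = {"_id": {"$in": []}}


def under_category_ids(category_ids):
    """Build a category filter; an empty list yields the never-matching filter."""
    ids = list(category_ids)
    if not ids:
        return NEVER
    return {"categoryId": {"$in": ids}}


def or_(*expressions):
    """Combine filters with $or, dropping branches that can never match."""
    live = [e for e in expressions if e != NEVER]
    if not live:
        return NEVER
    if len(live) == 1:
        return live[0]
    return {"$or": live}


print(under_category_ids([]))
print(or_(under_category_ids([]), {"a": 1}))
print(or_(under_category_ids([1, 2]), {"a": 1}))
```

Dropping a never-matching branch from a disjunction leaves its meaning unchanged, so the server never sees the dummy condition there. A conjunction helper would do the opposite and collapse to NEVER as soon as any branch is NEVER.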

sort by string length in Mongodb/pymongo

I was wondering if anyone knows how to sort a mongodb find() result by string length.
I have tried something like db.foo.find().sort({item.length:-1}), but obviously that doesn't work. Can somebody help me, and also suggest a way to do the same thing in pymongo?
There are a lot of things (and basic API) I would personally love to see in the aggregation framework, such as:
Math functions
    log (as in logarithm)
    ceil
    floor
Array
    sum
String
    length
Just to name a few.
And that is without resorting to obscure usages of the $mod operator or other means in such cases as "ceil" and "floor". But I digress.
Your "string length" falls into this category. Raise a JIRA issue about it. But for now you can use mapReduce and the existing JavaScript functionality:
db.collection.mapReduce(
    function() {
        emit( this.item.length, this.item );
    },
    function(key,values) {
        return values;
    },
    { "out": { "inline": 1 } }
)
So while that does have the funky "mapReduce" style of returning a re-shaped document, with everything of the same length collected in an array, it takes advantage of the nature of "mapReduce" (not restricted to MongoDB), in which the emitted "key" values are sorted in the response.
There is now a solution for this in MongoDB v3.4+ using the $strLenBytes operator in the aggregation framework. Given the following document:
{_id: 0, name: "Bob"}
We can use
db.mycollection.aggregate([{
    $project: {
        byteLength: {$strLenBytes: "$name"}
    }
}])
which will return a byteLength of 3, the number of bytes in "Bob".
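Since the question also asks about sorting and about pymongo, here is a hedged sketch of how a computed length could drive a sort. It only builds the pipeline as plain Python data, in the list-of-stages form that pymongo's aggregate() takes, and shows the same ordering applied client-side. The function names, the field name item (from the question), and the temporary field itemLength are assumptions made for this example. $strLenCP counts code points where $strLenBytes counts bytes; both, like $addFields, require MongoDB 3.4 or later.

```python
def sort_by_length_pipeline(field, descending=True):
    """Build an aggregation pipeline that sorts documents by string length."""
    return [
        # Compute the length of the string field into a temporary field.
        {"$addFields": {"itemLength": {"$strLenCP": "$" + field}}},
        # Sort on the computed length: -1 is longest first.
        {"$sort": {"itemLength": -1 if descending else 1}},
    ]


def sort_by_length_client_side(docs, field, descending=True):
    """Same ordering, done in Python after the documents are fetched."""
    return sorted(docs, key=lambda d: len(d[field]), reverse=descending)


pipeline = sort_by_length_pipeline("item")
# With a live pymongo collection this would be: collection.aggregate(pipeline)

docs = [{"item": "pear"}, {"item": "fig"}, {"item": "banana"}]
ordered = sort_by_length_client_side(docs, "item")
print([d["item"] for d in ordered])
```

Documents where the field is missing or null make $strLenCP raise an error, so a real pipeline may need a $match stage in front to filter them out.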
No, this is actually not possible.
I was dealing with a similar problem; what I did was store the string length of every object as a property of the object itself. This bypassed the problem.
If you think this should be implemented (I do), I recommend you upvote the issue in JIRA, which for some reason does not have many votes:
https://jira.mongodb.org/browse/SERVER-5319