Azure Cosmos DB Mongo API Bug With NOT IN

I have stored the following document in my Cosmos DB using the Mongo API:
{
    "_id" : ObjectId("59157eaabfeb1900011592c8"),
    "imageResourceId" : "1489496086018.png",
    "gallery" : "Tst",
    "thumbnailRaw" : {
        "$binary" : "<SNIP>",
        "$type" : "00"
    },
    "tags" : [
        "Weapon/Sword",
        "Japanese"
    ],
    "__v" : 1
}
I'm trying to perform a query that excludes any objects containing the tag "Japanese". I've crafted the following query, which performs correctly (that is, it does not return the above document) on a real Mongo DB:
{"gallery":"Tst, "tags":{"$nin":["Japanese"]}}
On Cosmos DB, this query returns the above document, even though its tags array contains a value listed in the $nin array. Am I writing this query correctly? Is there another, supported way to perform a NOT IN logical operation on Cosmos DB?

I ran into a different issue with Cosmos DB that made me run a few tests on operations with arrays. I believe that in your case this should work:
db.gallery.find({"tags":{"$elemMatch":{$nin: ["Japanese"]}}})
My issue:
Azure Cosmos DB check if array in field is contained in search array
I agree with the comment that Cosmos DB implements only a subset of MongoDB, and the documentation is very scarce, but I hope the fix I propose works for you.
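One caveat with that workaround, for what it's worth: in standard MongoDB semantics, $elemMatch with $nin matches any document where at least one array element is outside the list, so the example document would still match because of "Weapon/Sword". If Cosmos DB supports $ne on arrays or $not with $elemMatch (untested here, so treat this as a sketch), something like this may be closer to the original NOT IN intent:
// Matches documents where no element of tags equals "Japanese" (standard MongoDB behavior):
db.gallery.find({ "gallery": "Tst", "tags": { "$ne": "Japanese" } })
// Equivalent formulation with an explicit negated element match:
db.gallery.find({ "gallery": "Tst", "tags": { "$not": { "$elemMatch": { "$eq": "Japanese" } } } })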

Related

Using $bitsAllClear and other bitwise operators in Azure CosmosDB for MongoDB

While developing against a local MongoDB server I used $bitsAllClear in the following query to fetch some data from a collection
db.getCollection('messages').find({
    "flags": {
        $bitsAllClear: 64
    },
    "someprop": ...
});
I.e., find all documents from the messages collection where the flags property (a number) does not have that particular bit set (i.e. flags & 64 === 0). This works fine when querying locally against a MongoDB instance.
When I tried the very same query against Azure Cosmos DB for MongoDB, I got the following error back:
Error: error: {
    "ok" : 0,
    "errmsg" : "$bitsAllClear not supported",
    "code" : 115,
    "codeName" : "CommandNotSupported"
}
Reading the docs (which I only did afterwards), I learned that $bitsAllClear and similar operators are not supported in Azure Cosmos DB for MongoDB.
So now my question: is there any workaround for this type of query other than
querying ALL documents and filtering on the flag afterwards, or
extracting this particular flag into its own boolean property?
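If the second option is acceptable, a minimal sketch of the one-off migration might look like this in the mongo shell (the field name flag64Clear is made up for illustration, and this assumes the collection is small enough for a single pass):
// Materialize "bit 64 is clear" into its own boolean field:
db.getCollection('messages').find({}).forEach(function (doc) {
    db.getCollection('messages').updateOne(
        { _id: doc._id },
        { $set: { flag64Clear: (doc.flags & 64) === 0 } }
    );
});
// The original filter then becomes a plain equality match:
db.getCollection('messages').find({ "flag64Clear": true });
Note that the boolean has to be kept in sync whenever flags is written, which is the main cost of this workaround.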

How to update data in Elasticsearch with something like MongoDB's bulk update?

I'm looking for a way to update data in Elasticsearch using Go. The data is about 1,000,000+ documents, and each update must target a specific document ID. In MongoDB I can do this with a bulk operation, but I can't find an equivalent in Elasticsearch. Is there such an operation, or does anyone have an idea how to update a huge amount of data in Elasticsearch by document ID? Thanks in advance.
In general, you can use the bulk API to make such bulk updates. You can either index the data again using the same id or just run an update. You can use curl to push the updates from the command line if you are doing it as a one-off update.
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
The other option is to use update_by_query if you are setting custom fields. With update by query, you can also combine it with a pipeline to update existing data.
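For example, an update by query that sets a field on every document matching a term (reusing the hypothetical index and field names from the bulk example above) might look like this:
POST test/_update_by_query
{
  "script": {
    "source": "ctx._source.field2 = 'value2'",
    "lang": "painless"
  },
  "query": {
    "term": { "field1": "value1" }
  }
}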
It entirely comes down to whether you are trying to run the update using information from a different index (in such a case, you can use the enrich processor, which is available from 7.5 onwards) OR whether you simply want to add a new field and populate it using some rule based on attributes already available on the document.
So for different types of scenarios, different options are available. The bulk API is more appropriate when the data source is external, but if the data is already available in Elasticsearch, then update by query is appropriate.
You can also look at reindexing with pipeline scripting. But again, horses for courses applies here as well.

How to query for MongoDB documents with duplicate keys?

I have Mongo documents with duplicate keys:
{
    "_id" : ObjectId("576a3b4a2bf2bc22bccb80ec"),
    "Name" : "User1",
    "Name" : "User2"
}
{
    "_id" : ObjectId("576a3b4a2bf2bc22bccb80ab"),
    "Name" : "User2",
    "Name" : "User1"
}
When I try to query for Name as "User1", I always get only one document, but the result should be two documents. Is there any way to get the correct result?
Thanks in advance.
Note: I know my design is wrong; I am just trying to make it work.
Please note that according to the docs, it IS possible to have duplicate key names, but depending on the driver you either might not be able to insert or read such data:
BSON documents may have more than one field with the same name. Most MongoDB interfaces, however, represent MongoDB with a structure (e.g. a hash table) that does not support duplicate field names. If you need to manipulate documents that have more than one field with the same name, see the driver documentation for your driver.
(Source: https://docs.mongodb.com/manual/core/document/#field-names)
Unfortunately, in order to correct your existing data, you will have to use a driver which can handle duplicate keys.
You cannot have two fields with the same name in a document in MongoDB.
When you try to insert a document with two fields with the same key, MongoDB keeps only the latest value rather than creating separate fields.
Example:
db.test.insert({'Name':'user1','Name':'user2'})
db.test.insert({'Name':'user2','Name':'user1'})
will insert 2 documents, as shown below:
{ "_id" : ObjectId("576a8b4731157693143d0571"), "Name" : "user2" }
{ "_id" : ObjectId("576a8b5531157693143d0572"), "Name" : "user1" }
so a query such as
db.collection_name.find({"Name" : "User1"})
only ever sees the last value stored for that key.
Within a collection, ObjectIds in MongoDB are unique because _id acts as the primary key for that collection. As per the MongoDB manual, you can never have two documents in the same collection with the same ObjectId.
But across different collections, it is possible to have the same ObjectId, and in that case you obviously have to specify the collection name when querying.
Hope this helps

Bug for collections that are sharded over a hashed key

When querying for large amounts of data in sharded collections, we benefited a lot from querying the shards in parallel.
The following problem occurs only in collections that are sharded over a hashed key.
In Mongo 2.4 it was possible to query with hash borders in order to get all data of one chunk.
We used the query from this post.
It is a range query with hash values as borders:
db.collection.find(
    { "_id" : { "$gte" : -9219144072535768301,
                "$lt" : -9214747938866076750 }
    }).hint({ "_id" : "hashed" })
The same query also works in 2.6 but takes a long time. explain() shows that it is using the index, but the number of scanned objects is way too high:
"cursor" : "BtreeCursor _id_hashed",
Furthermore, the borders are wrong:
"indexBounds" : {
"_id" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
Was there some big change from 2.4 to 2.6 which breaks this query?
Even if the borders are interpreted as non-hash values, why does it take so long?
Is there some other way to get all documents of one chunk or hash index range?
The official mongo-hadoop connector also has this problem with sharded collections.
Thanks!
The query above working in 2.4 was not supported behavior. See SERVER-14557 for a similar complaint and an explanation of how to properly perform this query. Reformatted for proper behavior, your query becomes:
db.collection.find().min({ _id : -9219144072535768301}).max({ _id : -9214747938866076750}).hint({_id : "hashed"})
As reported in the SERVER ticket, there is an additional bug (SERVER-14400) that prevents this query from being targeted at a single shard; at this point in time there are no plans to address it in 2.6. The reformatted query should, however, prevent the table scan you are seeing under 2.6 and allow for more efficient retrieval.
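To address the broader question of fetching all documents of one chunk: the chunk boundaries are recorded in the config database, so one option is to read them from there and issue one min()/max() query per chunk. A rough sketch, run against a mongos (mydb.collection stands in for your namespace, and edge chunks bounded by MinKey/MaxKey may need special handling):
// Look up the chunk ranges for the collection, then query each range separately:
var chunks = db.getSiblingDB("config").chunks.find({ ns: "mydb.collection" }).toArray();
chunks.forEach(function (chunk) {
    var cursor = db.getSiblingDB("mydb").getCollection("collection").find()
        .min({ _id: chunk.min._id })
        .max({ _id: chunk.max._id })
        .hint({ _id: "hashed" });
    // process cursor here, potentially one chunk per worker in parallel ...
});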

MongoDB $or query not working for me after mongorestore

I ran mongodump and then mongorestore to move a MongoDB database from one computer to another. The data is there; I can query it (first query) and get results, but using $or in a query produces no results (second query).
db.employees.find( { 'name.first' : 'Joe' })
-- vs --
db.employees.find( { $or : [ { 'name.first' : 'Joe' }]})
As far as I can tell, the indexes have been recreated from system.indexes.bson. Any ideas what is wrong?
Indexes:
> db.employees.getIndexes()
[
    {
        "name" : "_id_",
        "ns" : "data.demployees",
        "key" : {
            "_id" : 1
        }
    }
]
original server: MongoDB 1.6.5, 64-bit
new server: MongoDB 1.4.4, 32-bit
I was running the query through the console, not pymongo.
To really help here, we need a few pieces of information:
version numbers (MongoDB and pymongo, server and new computer)
output from db.employees.getIndexes()
can you run a test on a smaller data set? (see below)
can you double-check data types?
Smaller Data Set
Try copying out a small set of the employees to a new collection and run the same queries:
db.employees.find().limit(100).forEach( function(x) { db.employees_test.insert(x); } )
Basically, let's try to rule out data corruption. Then let's isolate the version and see if this is a known bug.
Double-check Data Types
Ensure that the data types are correct.
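A quick shell check along these lines can confirm the stored type (a sketch; it assumes the sampled document actually has the field):
// Verify that name.first is stored as a string rather than, say, a symbol or regex:
var doc = db.employees.findOne({ 'name.first': { $exists: true } });
print(typeof doc.name.first); // expect "string"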
Is this a bug?
This could be a bug, but if it is, the bug should be trivial to reproduce. Once you've double-checked that the system is behaving incorrectly, it's time to repro this so that you can at least file a bug.
pymongo requires quotes around the special operators; have you tried this?
db.employees.find( { '$or' : [ { 'name.first' : 'Joe' }]})