Update last object inside array [duplicate] - mongodb

I have an object structure like this:
{
name: "...",
pockets: [
{
cdate: "....",
items: [...]
}
...
]
}
In an update operation, I want to add some records into the items field of the last pocket item. Using dot notation is the only way that I know to access a sub document, but I can't get what I want. So, I'm looking for something like these:
pockets.-1.items
pockets.$last.items
Is it possible to modify the last element? If yes, how?

I don't know of a way to do this in a single update statement, but you can select each document, modify it, and then save it:
var query = <insert query here>;
db.mycollection.find(query).forEach(function (doc) {
    // Append to the items of the last pocket, then write the document back.
    doc.pockets[doc.pockets.length - 1].items.push('new item');
    db.mycollection.save(doc);
});

I don't believe it is possible to do it atomically. There is a request for this functionality to be added to MongoDB.
If you can assure thread-safety in your application code, you could probably use a sequence of operations: $pop the last element from the pockets array into a variable p, $addToSet the new records to p.items, and then $push p back into pockets. But if your application has no way to ensure that only one process does this at a time, another process could modify the array in the middle of those steps, and you may end up losing that update.
You might also look into "Update if Current" semantics as another way to work around the possible race between multiple threads.
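As a rough illustration of the "update if current" idea, here is a minimal sketch in Python. It only builds the filter and update documents; the helper name build_last_pocket_update and the sample document are made up for this example. It assumes the document was read just before, and it guards the write with $size so that the update only matches if the pockets array still has the length that was read.

```python
def build_last_pocket_update(doc, new_items):
    """Return (filter, update) that appends new_items to the last pocket.

    The filter only matches while the pockets array still has the length
    we read, so a concurrent change to the array length makes the update
    match nothing instead of touching the wrong element.
    """
    pockets = doc["pockets"]
    if not pockets:
        raise ValueError("document has no pockets")
    last = len(pockets) - 1
    update_filter = {"_id": doc["_id"], "pockets": {"$size": len(pockets)}}
    # Dot notation with a numeric index addresses one array element.
    update = {"$push": {"pockets.%d.items" % last: {"$each": list(new_items)}}}
    return update_filter, update


# Hypothetical document, as it might look right after reading it.
doc = {"_id": 1, "name": "n", "pockets": [{"items": ["a"]}, {"items": ["b"]}]}
flt, upd = build_last_pocket_update(doc, ["x", "y"])
print(flt)
print(upd)
```

With PyMongo, the pair could be passed to update_one; if the result's matched_count is 0, someone else changed the array in between, and the caller would re-read the document and try again.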

Related

Is there a way to update array value in algolia Partial Update Method?

My records are something like this,
{
objectID: "123123",
product_id: "456456",
categories: ['pie', 'desert']
}
I want to just replace desert with sweet in categories.
Is this possible by using partial_update_objects method?
You can't update a value in place, but you can get the same result with the built-in Remove and Add operations. This only works as an "update" if the values in the array are unique. An alternative is to fetch the object, compute the new array on your side, and then replace the attribute with it.
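A minimal sketch of the Remove-then-Add approach, written as plain Python dictionaries. The helper name replace_in_array_ops is made up for this example, and whether both payloads can go into one partial_update_objects call or need two separate calls is an assumption you should check against your client version.

```python
def replace_in_array_ops(object_id, attribute, old, new):
    """Build two partial-update payloads: remove `old`, then add `new`."""
    remove = {
        "objectID": object_id,
        attribute: {"_operation": "Remove", "value": old},
    }
    add = {
        "objectID": object_id,
        attribute: {"_operation": "Add", "value": new},
    }
    # Order matters: the removal has to be applied before the addition.
    return [remove, add]


ops = replace_in_array_ops("123123", "categories", "desert", "sweet")
for op in ops:
    print(op)
```

Note that Add appends the new value, so the position of the replaced element in the array is not preserved.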

call custom python function on every document in a collection Mongo DB

I want to call a custom python function on some existing attribute of every document in the entire collection and store the result as a new key-value pair in that (same) document. May I know if there's any way to do that (since each call is independent of others) ?
I noticed cursor.forEach but can't it be done just using python efficiently ?
A simple example would be to split the string in text and store the no. of words as a new attribute.
def split_count(text):
    # some complex preprocessing...
    return len(text.split())
# Need something like this...
db.collection.update_many({}, {'$set': {"split": split_count('$text') }}, upsert=True)
But it seems like setting a new attribute in a document based on the value of another attribute in the same document is not possible this way yet. This post is old but the issues seem to be still open.
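For the simple word-count case only, MongoDB 4.2 and later accept an aggregation pipeline as the update document, and a pipeline stage can read other fields of the same document. The sketch below just builds such a pipeline; the helper name word_count_pipeline is made up, and it assumes a server and driver recent enough to support pipeline updates. It does not help with arbitrary Python preprocessing, which still has to run on the client.

```python
def word_count_pipeline(source_field, target_field):
    """Build an update pipeline that stores the number of space-separated parts."""
    return [
        {"$set": {target_field: {"$size": {"$split": ["$" + source_field, " "]}}}}
    ]


pipeline = word_count_pipeline("text", "split")
print(pipeline)
```

The list would be passed as the second argument to update_many in place of the usual update document. Splitting on a single space is not the same as Python's text.split(), which also collapses runs of whitespace, so the counts can differ on messy input.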
I found a way to call any custom python function on a collection using parallel_scan in PyMongo.
import threading
def process_text(cursor):
    for row in cursor.batch_size(200):
        # Any complex preprocessing here...
        split_text = row['text'].split()
        db.collection.update_one({'_id': row['_id']},
                                 {'$set': {'split_text': split_text,
                                           'num_words': len(split_text)}},
                                 upsert=True)
def preprocess(num_threads=4):
    # Get up to max 'num_threads' cursors.
    # (parallel_scan is only available in older PyMongo releases; it was removed in PyMongo 4.0.)
    cursors = db.collection.parallel_scan(num_threads)
    threads = [threading.Thread(target=process_text, args=(cursor,)) for cursor in cursors]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
This is not really faster than cursor.forEach (but not that slow either), but it helps me execute any arbitrarily complex python code and save the results from within Python itself.
Also if I have an array of ints in one of the attributes, doing cursor.forEach converts them to floats which I don't want. So I preferred this way.
But I would be glad to know if there're any better ways than this :)
It is quite unlikely that it will ever be efficient to do this kind of thing in python. This is because the document would have to make a round trip and go through the python function on the client machine.
In your example code, you are passing the result of a function to a mongodb update query, which won't work. You can't run any python code inside mongodb queries on the db server.
As the answer to your linked question suggests, this type of action has to be performed in the mongo shell, e.g.:
db.collection.find().snapshot().forEach(
    function (elem) {
        var splitLength = elem.text.split(" ").length;
        db.collection.update(
            { _id: elem._id },
            { $set: { split: splitLength } }
        );
    }
);
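If the processing does have to stay in Python, one way to cut down on round trips is to group the per-document updates into batches instead of sending them one at a time. The sketch below only prepares the batches as plain (filter, update) pairs; the names batched_updates and split_fields and the sample documents are made up for illustration.

```python
def batched_updates(docs, transform, batch_size=200):
    """Yield lists of (filter, update) pairs, at most batch_size per list."""
    batch = []
    for doc in docs:
        batch.append(({"_id": doc["_id"]}, {"$set": transform(doc)}))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def split_fields(doc):
    """Example transform: any arbitrary Python preprocessing could go here."""
    words = doc["text"].split()
    return {"split_text": words, "num_words": len(words)}


# Hypothetical documents standing in for a cursor over the collection.
docs = [{"_id": i, "text": "w " * (i + 1)} for i in range(5)]
batches = list(batched_updates(docs, split_fields, batch_size=2))
print([len(b) for b in batches])
```

With PyMongo, each pair could be wrapped in UpdateOne(filter, update) and each batch sent with a single collection.bulk_write(...) call, so the server sees one request per batch rather than one per document.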

Updating multiple complex array elements in MongoDB

I know this has been asked before, but I have yet to find a solution that works efficiently. I am working with the MongoDB C# driver, though this is more of a general question about MongoDB operations.
I have a document structure that looks something like this:
field1: value1
field2: value2
...
users: [ {...user 1 subdocument...}, {...user 2 subdocument...}, ... ]
Some facts:
Each user subdocument includes further sub-arrays & subdocuments (so they're fairly complex).
The average users array only contains about 5 elements, but in the worst case can surpass 100.
Several thousand update operations on multiple users may be conducted per day in this system, each on one document at a time. Larger arrays will receive more frequent updates due to their data size.
I am trying to figure out how to do this efficiently. From what I've heard, you cannot directly set several array elements to new values all at once, so I had to try something else.
I tried using the $pullAll / $addToSet + $each operations to remove the old array and replace it with a modified one. I am aware that $pullAll can remove only the elements that I need as well, but I would like to preserve the order of elements.
The C# code:
try
{
    WriteConcernResult wcr = collection.Update(query,
        Update.Combine(Update.PullAll("users"),
            Update.AddToSetEach("users", newUsers.ToArray())));
}
catch (WriteConcernException wce)
{
    return wce.Message;
}
In this case newUsers is a List<BsonValue> converted to an array. However, I am getting the following exception message:
Cannot update 'users' and 'users' at the same time
By the looks of it, I can't have two update statements in use on the same field in the same write operation.
I also tried Update.Set("users", newUsers.ToArray()), but apparently the Set statement doesn't work with arrays, just basic values:
Argument 2: cannot convert from 'MongoDB.Bson.BsonValue[]' to 'MongoDB.Bson.BsonValue'
So then I tried converting that array to a BsonDocument:
Update.Set("users", newUsers.ToArray().ToBsonDocument());
And got this:
An Array value cannot be written to the root level of a BSON document.
I could try replacing the whole document, but that seems like overkill and definitely not very efficient.
So the only thing I can think of now is to run two separate write operations: one to remove the unwanted old users and another to replace them with their newer versions:
WriteConcernResult wcr1 = collection.Update(query, Update.PullAll("users"));
WriteConcernResult wcr2 = collection.Update(query, Update.AddToSetEach("users", newUsers.ToArray()));
Is this my best option? Or is there another, better way of doing this?
Your code should work with a minor change:
Update.Set("users", new BsonArray(newUsers));
BsonArray is a BsonValue, whereas an array of documents is not, and we don't implicitly convert arrays like we do other primitive values.
This extension method solved my problem:
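using System.Collections;
using MongoDB.Bson;
public static class MongoExtension
{
    // Copies any sequence of BsonValue items into a BsonArray.
    public static BsonArray ToBsonArray(this IEnumerable list)
    {
        var array = new BsonArray();
        foreach (var item in list)
            array.Add((BsonValue) item);
        return array;
    }
}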
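Since the question also asks to preserve element order, the replacement array for the single $set can be computed on the client first. Here is a minimal sketch in Python, assuming each user subdocument carries a unique key; the field name "id" and both helper names are made up for this example.

```python
def merge_users(current_users, changed_users, key="id"):
    """Return current_users with changed entries swapped in, order preserved."""
    changed = {user[key]: user for user in changed_users}
    return [changed.get(user[key], user) for user in current_users]


def build_set_users(current_users, changed_users, key="id"):
    """Build one $set update that replaces the whole users array."""
    return {"$set": {"users": merge_users(current_users, changed_users, key)}}


current = [{"id": 1, "n": "a"}, {"id": 2, "n": "b"}, {"id": 3, "n": "c"}]
changed = [{"id": 3, "n": "C"}, {"id": 1, "n": "A"}]
update = build_set_users(current, changed)
print(update)
```

Because only one operator touches the users field, this avoids the "Cannot update 'users' and 'users' at the same time" conflict. The trade-off is that the whole array is rewritten, so a concurrent writer's change to the same array could be overwritten.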

Autocomplete with Firebase

How does one use Firebase to do basic auto-completion/text preview?
For example, imagine a blog backed by Firebase where the blogger can tag posts with tags. As the blogger is tagging a new post, it would be helpful if they could see all currently-existing tags that matched the first few keystrokes they've entered. So if "blog," "black," "blazing saddles," and "bulldogs" were tags, if the user types "bl" they get the first three but not "bulldogs."
My initial thought was that we could set the tag with the priority of the tag, and use startAt, such that our query would look something like:
fb.child('tags').startAt('bl').limitToFirst(5).once('value', function(snap) {
console.log(snap.val())
});
But this would also return "bulldogs" as one of the results (not the end of the world, but not the best either). Using startAt('bl').endAt('bl') returns no results. Is there another way to accomplish this?
(I know that one option is that this is something we could use a search server, like ElasticSearch, for -- see https://www.firebase.com/blog/2014-01-02-queries-part-two.html -- but I'd love to keep as much in Firebase as possible.)
Edit
As Kato suggested, here's a concrete example. We have 20,000 users, with their names stored as such:
/users/$userId/name
Oftentimes, users will be looking up another user by name. As a user is looking up their buddy, we'd like a drop-down to populate a list of users whose names start with the letters that the searcher has inputted. So if I typed in "Ja" I would expect to see "Jake Heller," "jake gyllenhaal," "Jack Donaghy," etc. in the drop-down.
I know this is an old topic, but it's still relevant. Based on Neil's answer above, you can search more easily by doing the following:
fb.child('tags').startAt(queryString).endAt(queryString + '\uf8ff').limit(5)
See Firebase Retrieving Data.
The \uf8ff character used in the query above is a very high code point
in the Unicode range. Because it is after most regular characters in
Unicode, the query matches all values that start with queryString.
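To see why the startAt / endAt pair acts as a prefix match, here is a small in-memory emulation in Python. It is only a sketch of the ordering logic on a sorted list, not of Firebase itself, and the function name prefix_matches is made up for this example.

```python
import bisect

HIGH = "\uf8ff"  # sorts after most regular characters


def prefix_matches(sorted_values, prefix, limit=5):
    """Return up to `limit` values in the range [prefix, prefix + HIGH]."""
    start = bisect.bisect_left(sorted_values, prefix)          # like startAt(prefix)
    end = bisect.bisect_right(sorted_values, prefix + HIGH)    # like endAt(prefix + '\uf8ff')
    return sorted_values[start:end][:limit]


tags = sorted(["blog", "black", "blazing saddles", "bulldogs"])
print(prefix_matches(tags, "bl"))
```

Every string that starts with "bl" compares greater than or equal to "bl" and less than "bl\uf8ff", while "bulldogs" falls outside that range because "u" sorts after "l".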
As inspired by Kato's comments -- one way to approach this problem is to set the priority to the field you want to search on for your autocomplete and use startAt(), limit(), and client-side filtering to return only the results that you want. You'll want to make sure that the priority and the search term are both lower-cased, since Firebase is case-sensitive.
This is a crude example to demonstrate this using the Users example I laid out in the question:
For a search for "ja", assuming all users have their priority set to the lowercased version of the user's name:
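fb.child('users').
    startAt('ja'). // The user-inputted search
    limitToFirst(20).
    once('value', function(snap) {
        var users = snap.val();
        for (var key in users) {
            // Each child is a user object; compare its lowercased name with the search term.
            var name = users[key].name;
            if (name.toLowerCase().indexOf('ja') === 0) {
                console.log(name);
            }
        }
    });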
This should only return the names that actually begin with "ja" (even if Firebase actually returns names alphabetically after "ja").
I choose to use limitToFirst(20) to keep the response size small and because, realistically, you'll never need more than 20 for the autocomplete drop-down. There are probably better ways to do the filtering, but this should at least demonstrate the concept.
Hope this helps someone! And it's quite possible the Firebase guys have a better answer.
(Note that this is very limited -- if someone searches for the last name, it won't return what they're looking for. Hence the "best" answer is probably to use a search backend with something like Kato's Flashlight.)
It strikes me that there's a much simpler and more elegant way of achieving this than client-side filtering or hacking Elastic.
By converting the search key into its Unicode value and storing that as the priority, you can search with startAt() and endAt(), using the value incremented by one as the upper bound.
var start = "ABA";
var pad = "AAAAAAAAAA";
start += pad.substring(0, pad.length - start.length);
var blob = new Blob([start]);
var reader = new FileReader();
reader.onload = function(e) {
    var typedArray = new Uint8Array(e.target.result);
    var array = Array.prototype.slice.call(typedArray);
    var priority = parseInt(array.join(""));
    console.log("Priority of", start, "is:", priority);
}
reader.readAsArrayBuffer(blob);
You can then limit your search priority to the key "ABB" by incrementing the last charCode by one and doing the same conversion:
var limit = String.fromCharCode(start.charCodeAt(start.length -1) +1);
limit = start.substring(0, start.length -1) +limit;
"ABA..." to "ABB..." ends up with priorities of:
Start: 65666565656565650000
End: 65666665656565650000
Simples!
Based on Jake and Matt's answers, here is an updated version for SDK 3.1, where '.limit' no longer works:
firebaseDb.ref('users')
    .orderByChild('name')
    .startAt(query)
    .endAt(`${query}\uf8ff`)
    .limitToFirst(5)
    .on('child_added', (child) => {
        console.log({
            id: child.key,
            name: child.val().name
        })
    })

Am i correctly using indexes for this mongoDB?

So I need some advice as to what I'm doing incorrectly.
My database is set up exactly like a file system consisting of folders and files.
It begins with a folder, but can have a practically unlimited number of subfolders and/or files.
{
"name":"folder1",
"uniqueID":"zzz0",
"subcontents": [ {"name":"subfolder1", "uniqueID":"zzz1"},
{"name":"subfile1", "uniqueID":"zzz2"},
{"name":"subfile2", "uniqueID":"zzz3"},
{"name":"subfolder2", "subcontents": [...etc...], "uniqueID":"zzz4"},
]
}
Each folder/file document has a uniqueID so that I can reference it (seen above as zzz#). My question is, can I make a MongoDB query to pull out only a single document?
For example, would db.fileSystemCollection.find({"uniqueID":"zzz4"}) give me the following result? Do I have to use indexes to do this? I've been trying, but the query returns empty every time.
intended result ---> {"name":"subfolder2", "subcontents": [...etc...], "uniqueID":"zzz4"}
[EDIT]
Based on the responses below, I will consider an XML database instead of MongoDB. The JSON structure can't be rearranged to work with MongoDB (too much data).
The short answer is no, as Chris stated.
Your embedded representation of a tree is good for intuitive understanding (and for implementation as well). But if you want efficient, index-backed searches on your tree in MongoDB, you might consider other ways of storing the tree. Several are listed at http://docs.mongodb.org/manual/tutorial/model-tree-structures/
Please keep in mind that every representation has its own pros and cons depending on your access patterns.
Since a filesystem-like structure will likely need to find all the sub-contents of a given folder, you may use the child references pattern for this:
{
"name":"folder1",
"uniqueID":"zzz0",
"subcontents": [ "zzz1",
"zzz2",
"zzz3",
"zzz4"
]
}
{
"name":"subfolder1",
"uniqueID":"zzz1"
}
...
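Converting an existing embedded tree into child-reference documents can be done with a short recursive walk. The following is a minimal sketch in Python; the function name to_child_references and the sample tree (including the extra zzz5 node) are made up for illustration.

```python
def to_child_references(root):
    """Flatten an embedded tree into one document per node (child references)."""
    docs = []

    def visit(node):
        children = node.get("subcontents", [])
        flat = {"name": node["name"], "uniqueID": node["uniqueID"]}
        if children:
            # Store only the children's IDs instead of the embedded documents.
            flat["subcontents"] = [child["uniqueID"] for child in children]
        docs.append(flat)
        for child in children:
            visit(child)

    visit(root)
    return docs


tree = {
    "name": "folder1", "uniqueID": "zzz0",
    "subcontents": [
        {"name": "subfolder1", "uniqueID": "zzz1"},
        {"name": "subfile1", "uniqueID": "zzz2"},
        {"name": "subfile2", "uniqueID": "zzz3"},
        {"name": "subfolder2", "uniqueID": "zzz4",
         "subcontents": [{"name": "subfile3", "uniqueID": "zzz5"}]},
    ],
}
flat_docs = to_child_references(tree)
for d in flat_docs:
    print(d)
```

Each resulting document is top-level, so a find on uniqueID matches it directly, and an index on uniqueID keeps that lookup fast.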
No; searching for {uniqueID: "zzz4"} will only get you documents whose top-level uniqueID matches.
What you probably want is to maintain an array on the document which lists all the unique IDs in that tree. So your document would be:
{
"name":"folder1",
"uniqueID":"zzz0",
"idList": ["zzz0", "zzz1", "zzz2", "zzz3", "zzz4"],
"subcontents": [ {"name":"subfolder1", "uniqueID":"zzz1"},
{"name":"subfile1", "uniqueID":"zzz2"},
{"name":"subfile2", "uniqueID":"zzz3"},
{"name":"subfolder2", "subcontents": [...etc...], "uniqueID":"zzz4"},
]
}
Then you can index that:
db.fileSystemCollection.ensureIndex({"idList": 1})
Then you can find on it:
db.fileSystemCollection.find({"idList": "zzz4"})
That'll return you those documents.
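The idList has to be built when the document is written, and since the query returns the whole top-level document, the matching sub-folder still has to be picked out on the client. Both steps can be sketched with two small recursive helpers in Python; the function names and the sample tree (including the extra zzz5 node) are made up for illustration.

```python
def collect_ids(node):
    """Return the uniqueID of this node and of every descendant."""
    ids = [node["uniqueID"]]
    for child in node.get("subcontents", []):
        ids.extend(collect_ids(child))
    return ids


def find_node(node, unique_id):
    """Return the (sub)document with the given uniqueID, or None."""
    if node["uniqueID"] == unique_id:
        return node
    for child in node.get("subcontents", []):
        found = find_node(child, unique_id)
        if found is not None:
            return found
    return None


tree = {
    "name": "folder1", "uniqueID": "zzz0",
    "subcontents": [
        {"name": "subfolder1", "uniqueID": "zzz1"},
        {"name": "subfile1", "uniqueID": "zzz2"},
        {"name": "subfile2", "uniqueID": "zzz3"},
        {"name": "subfolder2", "uniqueID": "zzz4",
         "subcontents": [{"name": "subfile3", "uniqueID": "zzz5"}]},
    ],
}
tree["idList"] = collect_ids(tree)   # store this alongside the document
print(tree["idList"])
print(find_node(tree, "zzz4")["name"])
```

The idList needs to be recomputed whenever a node is added to or removed from the tree, otherwise the index will point at stale contents.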
As an aside, if you're trying to store files in Mongo, have you looked at GridFS?