Using Memcached to cache results from Model::find() - memcached

I'd like to store the DocumentSet returned from Model::find() in memcached. However, I get the MongoException below when I try to work with the results after retrieving them from cache. Specifically, when using foreach, the exception is thrown when calling if ($this->_resource->hasNext()) on line 63 of \data\source\mongo_db\Result.php
MongoException
The MongoCursor object has not been correctly initialized by its constructor
I can understand why this would be the case.
My question is, is there anyway to pre-populate a Model::find() or create my own DocumentSet so that I can work with the data? Normally, I'd just convert it to an array and store that in cache. However, I need access to some of the Model methods I've written (ex: Customer::fullName())
Update: I've found a bit of a workound that is ok but not great. I'm saving the Model::find() results as an array in cache $result->to('array'). Then, upon retrieval, I loop through the $results and populate a new array with Model::create($result, array("exists" => true) for each $result;

A DocumentSet returned by Model::find contains a Mongo db cursor. It doesn't load all of the data from the database until the items are iterated. As each item is iterated, a Document is created and is cached in memory into the DocumentSet object. The built-in php function iterator_to_array() can be used to turn the DocumentSet into an array of documents which you could cache.
As you mention in your update, you can also use ->to('array') to prepare for caching and then Model::create() to build it back up. One caveat with that method: when you use ->to('array') it also casts MongoId and MongoDate objects to strings and integers respectively. If you've defined a $_schema for your model and set the 'id' and 'date' types, they will be cast back to the original MongoId and MongoDate objects in the Document returned by Model::create(). If you don't have a schema for all of your fields, it can be a problem. I have a recent pull request that tries to make it possible to do ->to('array') without casting the native mongo objects (also fixes a problem with the mongo objects always being casted when inside arrays of sub-documents).
FWIW, I actually prefer saving just the data in cache because it's less space than serializing a whole php object and avoids potential issues with classes not being defined or other items not initialized when the item is pulled from cache.
I haven't tried this... but I would think you could make a cache strategy class that would take care of this transparently for you. Another example of the great care that went into making Lithium a very flexible and powerful framework. http://li3.me/docs/lithium/storage/cache/strategy

Related

Get Deleted Object in AbstractMongoEventListener

I want to run some logic when an Object get deleted from MongoDB. I am using SpringData Mongo.
I am using AbstractMongoEventListener as the object can be deleted from collection through number of ways and I am overriding the
public void onBeforeDelete(BeforeDeleteEvent<Object> event)
method. But there are no method in event object which will return the Object I am going to delete.
event.getSource() and event.getDocument() returns the document. How can I get the object.
Somehow this Event seems to be messed up. In difference to the other MongoMappingEvent<T> descendents, this one inherits a MongoMappingEvent<Document> through AbstractDelteEvent<T>. I cannot explain this difference.
But as I also was in need to retrive the Documents before deleting them, I used the debugger to find, it is possible to retrive the Document Ids, using some hackish shit undocumented get("Key")-chain.
event.getDocument()
.get("_id", Document.class) // BSON Document!
.getList("$in", ObjectId.class) // ObjectId.class or what ever Type your Id is.
With that you can retrive a list of the ids of your documents. Take the repository or what ever, and use those ids to fetch the documents.
I do really not like using those string-key-things that I have not found in a documentation, as who knows when they will be removed.
I would love to remove this answer as soon as someone provides a less hackish way.
Be aware, that when you are using an #EventHandler, it can not consider the type parameter.

Deleting an Item in Firebase

I have the following data in Firebase:
Before Deletion (Link)
In the "-Kabn1954" branch, I want to delete the item "apple". Using Swift, I delete an item at a specific index, in a particular branch, using this:
self.ref.child("-Kabn1954").child("foods").child("1").removeValue()
However, after I do this, the Firebase data looks like this:
After Deletion (Link)
As you can see, the data in this branch now goes directly from index 0 to index 2. For this reason, I get an error. How can I make it such that when the item at index 1 is deleted, the two remaining items have an index of 0 followed by an index of 1?
Firebase doesn't actually store the data as an array, instead it stores it as an object keyed by the index as you're observing. The guide suggests that you should try to restructure your data so that the array-like behavior is not used.
If that is not possible or really not preferable, I don't know about how the Swift API works, however in both the python and JavaScript libraries, if you observe on the parent foods element, you'll get an array object which you can splice and push an update. I'm guessing this is also true in Swift, as the API indicates that an NSArray can be returned too.
As the blog post mentions, you'll need to update the entire array when you want to reindex, as Firebase will not do it for you. setValue() accepts an NSArray which can be called on the foods reference. Be careful about race conditions here, you'll want to encapsulate the read and write into a single transaction to avoid losing your update.

MongoDB - is this efficient?

I have a collection users in mongodb. I am using the node/mongodb API.
In my node.js if I go:
var users = db.collection("users");
and then
users.findOne({ ... })
Is this idiomatic and efficient?
I want to avoid loading all users into application memory.
In Mongo, collection.find() returns a cursor, which is an abstract representation of the results that match your query. It doesn't return your results over the wire until you iterate over the cursor by calling cursor.next(). Before iterating over it, you can, for instance, call cursor.limit(x) to limit the number of results that it is allowed to return.
collection.findOne() is effectively just a shortcut version of collection.find().limit(1).next(). So the cursor is never allowed to return more than the one result.
As already explained, the collection object itself is a facade allowing access to the collection, but which doesn't hold any actual documents in memory.
findOne() is very efficient, and it is indeed idiomatic, though IME it's more used in dev/debugging than in real application code. It's very useful in the CLI, but how often does a real application need to just grab any one document from a collection? The only case I can think of is when a given query can only ever return one document (i.e. an effective primary key).
Further reading:
Cursors,
Read Operations
Yes, that should only load one user into memory.
The collection object is exactly that, it doesn't return all users when you create a new collection object, only when you either use a findOne or you iterate a cursor from the return of a find.

RequestFactory Diff Calculation and 'static' find method

Am bit stuck by these three questions:
1) I see that diff is calculated in AutoBeanUtils's diff method. I saw a tag called parentObject in the entity which is used in the comparison to calculate diff.
parent = proxyBean.getTag(Constants.PARENT_OBJECT); in AbstractRequestContext class.
Does that mean there are two copies for a given entity thats loaded on to the browser? If my entity actual size is say 1kb, actual data loaded will be 2kb (as two copies of entity are getting loaded onto the browser) ?
2) On the server side:
Suppose I have to fetch an entity from the database, the static find<EntityName> should be such that I have to make a db call every time, or is there a way where I can fine tune that behavior? [Sorry I did not understand the locator concept very well.]
3) What happens if there is a crash on the server side(for any reason which need not be current request specific) when a diff is sent from the client?
Thanks a lot.
when you .edit() a proxy, it makes a copy and stores the immutable proxy you passed as argument as the PARENT_OBJECT of the returned proxy.
you'd generally make a DB call every time the method is called (this is the same for a Locator's find() method), which will be no more than twice for each request. You can use some sort of cache if you need, but if you use JPA or JDO this is taken care of for you (you have to use a session-per-request pattern, aka OpenSessionInView)
If there's any error while decoding the request, a global error will be returned, that will be passed to onFailure of all Receivers for the failed RequestContext request.
See https://code.google.com/p/google-web-toolkit/wiki/RequestFactoryMovingParts#Flow

What's the difference between -objectRegisteredForID: and -existingObjectWithID:error:?

What's the difference between getting an managed object with
- (NSManagedObject *)objectRegisteredForID:(NSManagedObjectID *)objectID
and
- (NSManagedObject *)existingObjectWithID:(NSManagedObjectID *)objectID error:(NSError **)error
What are "registered" objects? What's the difference between "registered" objects and "unregistered" objects?
What are "registered" objects?
Judging from the results I've gotten using these methods, a registered object is one that has been fetched into the MOC. If an object exists in the persistent store but has not been fetched, feeding its objectID to the objectRegisteredForID method will return nil.
How could you even have its objectID if it had not been fetched? Well, I visited this question when implementing a Revert routine. I dumped any unsaved changes by replacing the database with an older copy, cleared the context and then reaccessed it. But I wanted to be able to restore the user's selection of objects to the cache of a table. So, before doing the reversion, I stashed the objectIDs that the user had selected in an array. Then, after the reversion, I rebuilt the table cache using the stashed objectIDs.
When I called objectRegisteredForID using these stashed objectIDs, it always returned nil. (But if I tested this before getting a fresh context, it would return the corresponding object -- which at that point was a fetched, loaded object. Hence my inference as to the meaning of "registered.")
When I called objectWithID using these stashed objectIDs, everything was fine unless the object had been deleted post the last save, in which case it would no longer exist in the database and the invalid but non-nil return would cause exceptions later.
So I used existingObjectWithID:error. If the object still existed, it would be returned. If it no longer existed, the return would be nil and the error's localizedDescription would be "Attempt to access an object not found in store."
Years after the fact:
As Wienke suspects, registered objects are those already in memory for that context. So objectRegisteredForID: will return an object only if somebody else has previously obtained that object.
objectWithID: will return an object if it currently exists in the persistent store.
So the really important distinction is:
objectWithID may go to the persistent store.
Note the corollary: objectWithID may have to perform a fetch. That means locking the store. So:
if the store is already locked by somebody else, objectWithID may block, whereas objectRegisteredForID will never block; and
supposing you had an array of 30 object IDs and you performed objectWithID for each, you'd potentially do 30 separate trips to the store — it'd be much faster to check whether the objects are already registered and then, if any aren't, use an NSFetchRequest to get the rest. Notice that a self in X query can accept an array or set of object IDs to return actual objects even though that wouldn't technically exactly match the normal Objective-C meaning of 'self'.
Falling back on NSFetchRequest is also generally preferable if you have any relationship paths you're going to need prefetched. So there's potentially quite a lot of performance to be gained.
To your first question:
objectRegisteredForID: is the quick & easy way to get the object -- it either returns your object or it returns nil letting you know that it could not. You use this when you either already know why the result might be nil or you don't care.
existingObjectWithID:error: is similar, in that it returns either your object or nil but, in addition, if you pass an error parameter, it will tell you WHY it returned nil. You may want to do this if you plan to do any sort of error reporting (a log message or alert), or error handling (perhaps you plan to take different actions, depending on which kind of error is returned.
EDIT: In addition (per docs), if there is not a managed object with the given ID already registered in the context, the corresponding object is faulted into the context.
I suggest that you break out the "what are registered objects?" portion of your question to a separate question in order to facilitate both getting a better answer (based on the subject line matching the question) and to help future spelunkers who may come looking for an answer.
I was recently confused as to why registeredObject(for objectID: NSManagedObjectID) was returning nil, but there was a simple explanation: the NSManagedObject instances I fetched were not retained: I extracted the information I needed from them and let them be deallocated, which seems to "unregister" them from the managed object context, though they could easily be retrieved using other methods on NSManagedObjectContext. I find the CoreData documentation truly terrible: "registered" is just one of many distinctions that are not clearly explained.