Pagination solution for Salat/Casbah - mongodb

I am interested in a pagination solution for documents stored in MongoDB. I use Salat/Casbah to work with this data. As far as I can tell, there is nothing readily available as open source for paginating data with those two libraries. Is there a solution I'm overlooking for paginating data that I expose through an HTTP API using those as my drivers?

Despite its cheesy attempts at humor, this post on MongoDB paging is pretty good and focuses on range queries and associated techniques to paginate your data. How you actually do it depends on the amount of data and the nature of your application.

Please be careful with pagination! In MongoDB, pagination very often results in iterating over the entire collection, which is exactly why Casbah doesn't have a good pagination solution. You can try filtering instead of paginating: for example, when results are ordered by a relevance field, select the results where relevance > some value.
There is a lot of information about how to do efficient paging in MongoDB, e.g.: MongoDB - paging
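The filtering idea above (select results where relevance > some value instead of skipping rows) can be sketched as follows. This is a minimal illustration in Python over an in-memory list, not Casbah or Salat code. `DOCS`, `sort_key`, and `fetch_page` are hypothetical names, and the `_id` tie-breaker is an added assumption so that documents with equal relevance are not lost between pages.

```python
from typing import Dict, List, Optional, Tuple

Doc = Dict[str, int]
Cursor = Tuple[int, int]

# Hypothetical sample data standing in for a MongoDB collection.
DOCS: List[Doc] = [{"_id": i, "relevance": 100 - i} for i in range(1, 11)]


def sort_key(doc: Doc) -> Cursor:
    # Order by relevance, using _id to break ties (an assumption of this sketch).
    return (doc["relevance"], doc["_id"])


def fetch_page(
    docs: List[Doc], limit: int, after: Optional[Cursor] = None
) -> Tuple[List[Doc], Optional[Cursor]]:
    """Return one page plus the cursor to pass in for the next page.

    A real database would answer the range filter from an index; the
    in-memory sort here only imitates that ordering.
    """
    ordered = sorted(docs, key=sort_key)
    if after is not None:
        # Range filter: keep only documents strictly past the last one seen.
        ordered = [d for d in ordered if sort_key(d) > after]
    page = ordered[:limit]
    cursor = sort_key(page[-1]) if page else None
    return page, cursor
```

An HTTP API could hand the returned cursor back to the client and accept it as a query parameter on the next request, so no page needs to skip over earlier rows.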

Related

Firestore pagination of multiple queries

In my case, there are 10 fields, and all of them need to be searched with "or". That is why I'm running multiple queries and filtering the common items on the client side with Promise.all().
The problem is that I would like to implement pagination. I don't want to fetch all the results of each query, because that costs too many reads. But I can't use .limit() on each query, because what I want to limit is the final result.
For example, I would like to get the first 50 common results across the 10 queries' results. If I apply limit(50) to each query, the final result might contain fewer than 50.
Does anyone have ideas about pagination for multiple queries?
I believe the best way to achieve that is to use query cursors, so you can better control how much data each search retrieves.
I recommend taking a look at the links below for more information, including a community-answered question that seems similar to your case.
Paginate data with query cursors
multi query and pagination with firestore
Let me know if the information helped you!
Not sure it's relevant, but I think I'm having a similar problem and have come up with 4 approaches that might work as workarounds.
Instead of making 10 queries, fetch all the products matching a single selection filter, e.g. category (in my case a customer can only set a single category field), and do all the remaining filtering on the client side. With this approach the app still reads lots of documents at once, but at least it can reuse them during the session and filter with more flexibility than Firestore's strict query rules allow.
Run the multiple queries in a server environment, such as Cloud Functions with Node.js, and return only the first 50 documents that match all the filters. With this approach the client receives only the data it wants, but the server still reads a lot.
This is actually your approach combined with the accepted answer.
Create automatically maintained documents in Firebase with the help of Cloud Functions, e.g. Colors: {red:[product1ID,product2ID....], ....}, storing just the document IDs. Depending on the filters, get the corresponding documents on the server side with Cloud Functions, intersect the matching arrays (AND logic), and push the first 50 elements of the result to the client. Knowing which products to display, the client then handles fetching them with the client-side library.
Hope these help. Here is my original post: Firestore multiple `in` queries with `and` logic, query structure
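The ID-array approach above (one array of document IDs per filter value, combined with AND logic, first 50 sent to the client) could look roughly like this. It is a sketch in plain Python rather than Cloud Functions code. `EXAMPLE_INDEX` and `first_matching_ids` are hypothetical names, and the result order is assumed to follow the first selected array.

```python
from typing import Dict, List

# Hypothetical index documents: filter value -> IDs of matching products.
EXAMPLE_INDEX: Dict[str, List[str]] = {
    "red": ["p1", "p2", "p3", "p4"],
    "large": ["p2", "p4", "p5"],
    "cotton": ["p4", "p2"],
}


def first_matching_ids(
    index: Dict[str, List[str]], selected: List[str], limit: int = 50
) -> List[str]:
    """Return up to `limit` IDs that appear under every selected filter."""
    id_lists = [index[key] for key in selected]
    if not id_lists:
        return []
    # Sets make the membership checks against the other filters cheap.
    others = [set(ids) for ids in id_lists[1:]]
    result: List[str] = []
    for doc_id in id_lists[0]:
        if all(doc_id in ids for ids in others):
            result.append(doc_id)
            if len(result) == limit:
                break  # stop early once the page is full
    return result
```

The client would then fetch only the returned IDs, so the document reads are bounded by the page size rather than by the size of each query's result.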

Reading the similar data from more than two collections in mongoDB

I am a novice MongoDB user. In our application the data for each table is quite large, so I decided to split it into different collections even though the documents are all of the same kind. The only difference between the documents is the "id" (the documents in one collection all fall under one category). So we decided to insert the data into a number of collections, each holding a certain number of documents. Currently I have 10 collections of the same kind of document data.
My requirements are:
1) to get the data from all the collections in a single query to display on the application home page.
2) to sort and filter the data before fetching it.
I have gone through some posts on Stack Overflow that suggest using the MongoDB 3.2 $lookup aggregation for this requirement, but I suspect that using $lookup across 10 collections would cause performance issues and make the query too complex.
I have divided the same kind of data into a number of collections: each collection holds the documents that fall under one category, and since I have 10 categories I need 10 collections.
Could anybody please tell me whether my approach is correct?
If you have a lot of data, how could you display all of it on one web page?
My understanding is that you will only display a portion of the dataset per database query. Since you didn't mention how many records you have, it's not easy to make a recommendation.
Based on the vague description, sharding is the likely solution; you should check out the official docs.
However, before you shard, and since you mentioned you are a novice user, you probably want to check your database's indexing and data models and benchmark your performance first.
Hope this helps.
You should store all 10 types of data in 1 collection, not 10. Don't make things more difficult than they need to be.

How do I efficiently page a large collection of query results with Sails.js / Waterline?

I'm working with a large dataset behind the Waterline ORM. In several use cases I need to do some processing on many or most of the records (tens of thousands).
So far I've been working with .find(), but that executes and returns the entire result set. Is there a Sails/Waterline approach to iterating over a query result that preserves the storage-agnostic aspect of the ORM?
You can use paginate, something like -> Model.find().paginate({page: xx, limit: xx});
More info here: http://sailsjs.org/documentation/concepts/models-and-orm/query-language
Search for pagination :)
If you want to keep Waterline's storage-agnostic trait, you will still have to take a look at your actual schema implementation (even if you're coding storage agnostic).
You can:
Use pagination as in holzanic's answer; however, this can cause critical performance issues with some storage technologies.
Use streams.
If you will be listing whole objects from a Model, you can paginate by id: take the first n elements of a query, then obtain the next page by asking for the records whose id attribute is greater than the last id received in the previous page.
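The id-based option above can be wrapped in a generator so the caller processes records batch by batch without loading the whole result set. This is a sketch in Python rather than Waterline code. `iter_by_id`, `fetch_after`, and `make_in_memory_fetch` are hypothetical names. The `fetch_after(last_id, limit)` callback is where a real query ("id greater than last_id, sorted by id, limited") would go.

```python
from typing import Callable, Dict, Iterator, List, Optional

Record = Dict[str, int]
FetchAfter = Callable[[Optional[int], int], List[Record]]


def iter_by_id(fetch_after: FetchAfter, batch_size: int) -> Iterator[Record]:
    """Yield every record, fetching one batch at a time in ascending id order."""
    last_id: Optional[int] = None
    while True:
        batch = fetch_after(last_id, batch_size)
        if not batch:
            return  # no records beyond last_id
        yield from batch
        last_id = batch[-1]["id"]  # the next batch starts after this id


def make_in_memory_fetch(records: List[Record]) -> FetchAfter:
    """In-memory stand-in for the storage layer, for illustration only."""
    ordered = sorted(records, key=lambda r: r["id"])

    def fetch_after(last_id: Optional[int], limit: int) -> List[Record]:
        rows = [r for r in ordered if last_id is None or r["id"] > last_id]
        return rows[:limit]

    return fetch_after
```

Because only the callback touches the storage layer, the iteration logic itself stays storage agnostic. It does assume that ids are unique and sortable.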

Why use ElasticSearch with Mongo?

I have read a few articles recently on the combination of mongodb for storage and elasticsearch for indexing/search. I feel like I'm missing something though. Why would you go this route as opposed to just using mongo to index the data? What benefits does elasticsearch bring and is it worth the added complexity?
ElasticSearch implements a lot more features, such as custom splitting of text into words, custom stemming, facetted search and a whole lot more. While MongoDB's (rather simple) text search does some of this, it is not nearly as powerful as ElasticSearch.
If all you ever do is look for a single string in a single field, then MongoDB's normal query system will work excellently for that. If you need to look for words in multiple fields, then MongoDB's text search will work. If you need anything more than that, ElasticSearch is the way to go.
A search engine and a database do some fundamentally different things. A good search engine (like ElasticSearch) supports far more elaborate and complex indexing, facets, highlighting etc. In the case of ElasticSearch, you also get your replies 'real-time'. On the other hand, a search engine doesn't return every single document that matches your query. Instead, it will score documents according to how much they match, and return the top scoring ones. When you query a database such as MongoDB, you should expect it to return everything that matches your query.
You can store the entire document in ElasticSearch, but it is usually not the optimal solution. Normally you will have it configured to return the document IDs, which you then use to fetch the documents from a database. MongoDB is a database optimized for document-based storage. This is why you hear about people using them together.
edit:
When this was posted, it matched the recommendations, but this may no longer be the case.
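The two-step pattern described in this answer (the search engine returns ranked document IDs, the database returns the documents) might be sketched like this. It uses plain Python with a dictionary standing in for the database. `hydrate`, `fetch_by_ids`, and `STORE` are hypothetical names, and the assumption is that the database may return the documents in a different order than the ranking.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]

# Hypothetical document store keyed by ID, standing in for the database.
STORE: Dict[str, Doc] = {
    "a": {"_id": "a", "title": "Alpha"},
    "b": {"_id": "b", "title": "Beta"},
    "c": {"_id": "c", "title": "Gamma"},
}


def fetch_by_ids(ids: List[str]) -> List[Doc]:
    # Imitates a multi-key lookup; results come back in store order,
    # not in the order the IDs were requested.
    wanted = set(ids)
    return [doc for key, doc in STORE.items() if key in wanted]


def hydrate(
    ranked_ids: List[str], fetch: Callable[[List[str]], List[Doc]]
) -> List[Doc]:
    """Fetch the documents for the ranked IDs and restore the ranking order."""
    by_id = {doc["_id"]: doc for doc in fetch(ranked_ids)}
    # IDs that no longer exist in the database are silently skipped.
    return [by_id[i] for i in ranked_ids if i in by_id]
```

For example, `hydrate(["c", "a"], fetch_by_ids)` returns the "Gamma" document before the "Alpha" one, matching the search ranking rather than the storage order.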
Derick's answer pretty much nails it. The question behind all this is:
What are the features you want to implement in your application?
If you rely on heavy searching capabilities in large chunks of text, ElasticSearch is probably a good thing to use. If you want to have a flexible datastore that can cope with complex ad-hoc queries, Mongo might be a good fit. If you have different requirements for a datastore, it is often a good thing to combine two tools instead of implementing all kinds of workarounds to make it work with just one datastore.
Choose the right tool for the job.

Is mongoDB efficient in doing multi-key lookups?

I'm evaluating MongoDB, coming from Membase/memcached, because I want more flexibility.
Of course Membase is excellent in doing fast (multi)-key lookups.
I like the additional options that MongoDB gives me, but is it also fast in doing multi-key lookups? I've seen the $or and $in operator and I'm sure I can model it with that. I just want to know if it's performant (in the same league) as Membase.
Use case: e.g., Lucene/Solr returns 20 product IDs. Look up these product IDs in Couchdb to return the docs or the appropriate fields.
Thanks,
Geert-Jan
For your use case, I'd say it is, from my experience: I hacked some analytics into a database of mine that made a lot of $in queries with thousands of ids and it worked fine (it was a hack). To my surprise, it worked rather well, in the lower millisecond area.
Of course, it's hard to compare this, and -as usual- theory is a bad companion when it comes to performance. I guess the best way to figure it out is to migrate some test data and send some queries to the system.
Use MongoDB's excellent built-in profiler, use $explain, keep the one-index-per-query rule in mind, take a look at the logs, keep an eye on mongostat, and do some benchmarks. This shouldn't take too long and should give you a definite answer. If your queries turn out slow, people here and on the news group probably have some ideas on how to improve the exact query or the indexing.
One index per query. It's sometimes thought that queries on multiple keys can use multiple indexes; this is not the case with MongoDB. If you have a query that selects on multiple keys, and you want that query to use an index efficiently, then a compound-key index is necessary.
http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-Oneindexperquery.
There's more information on that page as well with regard to Indexes.
The bottom line is Mongo will be great if your indexes are in memory and you are indexing on the columns you want to query using composite keys. If you have poor indexing then your performance will suffer as a result. This is pretty much in line with most systems.