How to offset results when retrieving data from Freebase

Is there a way to offset results when retrieving data from a query like the one below? Is there a property like "limit": x, but for offsetting (besides "&cursor=")?
https://www.googleapis.com/freebase/v1/mqlread?
query[{"type":"/cvg/computer_videogame",...

No, the cursor is your only option. Alternatively, you can make your query more specific so the offsetting happens at the query level (e.g. query a smaller time range).
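The cursor loop shape looks like this; a minimal sketch in Python, with a fake fetch() standing in for the actual mqlread HTTP call (the PAGES dict is made up, but it mirrors the real response shape, where cursor comes back as false on the last page):

```python
# Fake pages standing in for mqlread responses; each response carries
# the cursor for the next request, and the last one returns False.
PAGES = {
    "": {"result": [1, 2], "cursor": "c1"},
    "c1": {"result": [3], "cursor": False},
}

def fetch(cursor):
    # Real version: GET .../freebase/v1/mqlread with query=...&cursor=<cursor>,
    # then JSON-decode the body.
    return PAGES[cursor]

results, cursor = [], ""   # an empty cursor starts from the first page
while True:
    page = fetch(cursor)
    results.extend(page["result"])
    cursor = page["cursor"]
    if not cursor:          # cursor is False once all pages are consumed
        break
print(results)
```

Each iteration fetches one "page", so N pages of results cost N requests; there is no way to jump straight to page N.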

Related

MongoDB Geospatial and createdAt sorting

I'm struggling to figure out how to properly sort data from MongoDB. The collection uses a 2dsphere index and has a createdAt timestamp. The goal is to show the latest pictures (that's what this collection is about, just a field mediaUrl...), but they also have to be close to the user. I'm not very familiar with complex MongoDB aggregation queries, so I thought this was a good place to ask. Sorting with $near shows items sorted only by distance, but upload time matters too: e.g. if an item is 5 minutes fresh but 500 meters farther away than an older item, it should still sort higher.
An ugly way would be to iterate every few hundred meters and collect data, but maybe there's a smarter way?
So if I understand correctly, you want to be able to sort on 2 fields?
distance
timestamp
You should check out the $sort aggregation stage:
https://docs.mongodb.com/manual/reference/operator/aggregation/sort/
It allows you to sort on multiple fields.
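A sketch of such a pipeline (the field names dist/pictures and the 5 km radius are assumptions; createdAt comes from the question, and with pymongo you would pass this list to collection.aggregate):

```python
# $geoNear must be the first stage; it computes each document's
# distance from the user into the "dist" field.
user_location = [-73.99, 40.73]  # hypothetical [lng, lat] of the user

pipeline = [
    {"$geoNear": {
        "near": {"type": "Point", "coordinates": user_location},
        "distanceField": "dist",
        "maxDistance": 5000,   # metres -- assumed search radius
        "spherical": True,
    }},
    # Newest first, nearest as tiebreaker.
    {"$sort": {"createdAt": -1, "dist": 1}},
]
# With pymongo: results = db.pictures.aggregate(pipeline)
print(pipeline[-1])
```

Note that with near-unique timestamps a plain two-field sort is dominated by createdAt; blending distance and freshness into a single computed score (e.g. via an $addFields stage) is the usual refinement if you want a genuine trade-off between the two.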

Could the order of the results change after using Pymongo's rewind() function?

Given that MongoDB query results are returned in the order that they are found, which "may coincide with insertion order (but isn't guaranteed to be) or the order of the index(es) used":
Does this mean that the order of the results could change after using Pymongo's rewind() function?
It seems like rewind() performs another database query, right?
Correct, rewind() performs another database query, as if the first had never happened. If you don't specify any sort order for your results, and MongoDB had to move some documents between the two queries (because some changed size, for example), you will get them back in a different order.
If you need your documents in a particular order, use sort.
http://api.mongodb.com/python/current/api/pymongo/cursor.html#pymongo.cursor.Cursor.sort
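To see why an explicit sort pins the order down, here is a driver-free simulation (the document lists are made up; in real code you would call coll.find(query).sort("_id", pymongo.ASCENDING) before iterating):

```python
# Storage order can differ between the two passes of the same query,
# e.g. because a document grew and was moved on disk.
first_pass  = [{"_id": 2}, {"_id": 1}, {"_id": 3}]   # order seen initially
second_pass = [{"_id": 1}, {"_id": 3}, {"_id": 2}]   # order after rewind()

assert first_pass != second_pass   # unsorted results may disagree

# An explicit sort (what cursor.sort() asks the server to do)
# makes both passes come back identically:
def by_id(docs):
    return sorted(docs, key=lambda d: d["_id"])

print(by_id(first_pass))
```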

Get total number of matches along with result in algolia

I am looking for something like FOUND_ROWS() in a MySQL SELECT query in Algolia results, as I need to keep track of how many total results to expect. Is there some way to get this in Algolia?
The proper way to obtain the number of results is to access the nbHits value which is available in the JSON response of every search call.
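For example, given the raw JSON of a search response (this sample body is made up, but nbHits and hits are the real field names):

```python
import json

# Trimmed sample of an Algolia search response body.
raw = ('{"hits": [{"objectID": "42"}, {"objectID": "43"}],'
       ' "nbHits": 1377, "page": 0, "hitsPerPage": 2}')
response = json.loads(raw)

total_matches = response["nbHits"]   # total matches across all pages
page_hits = len(response["hits"])    # just the hits on this page
print(total_matches, page_hits)
```

nbHits counts every matching record, while hits only contains the current page, so the two usually differ.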

How can you measure the space that a set of documents takes up (in bytes) in mongo db?

What I would like to do is figure out how much space in bytes a certain set of documents takes up. E.g. something like:
collection.stuff.stats({owner: someOwner}, {sizeInBytes: 1})
Where the first parameter is a query, and the second is like a projection of the statistics you want calculated.
I read that there's an Object.bsonsize() shell function you can use to measure the size of a single document. I'm wondering if maybe I could use that along with the aggregation methods to calculate the size of a search result. But if I were going to do that, I'd want to know how bsonsize works. How does it work? Is it expensive to run?
Are there other options for measuring the size of data in mongo?
One perhaps "quick and dirty" way to find this would be to assign your results to a cursor, then insert each result into a new collection and call db.collection.stats() on it. It would look like this in the shell:
var myCursor = db.collection.find({key: value});
while (myCursor.hasNext()) {
    db.resultColl.insert(myCursor.next());
}
db.resultColl.stats();
This should return the size information for the subset of documents.
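If you'd rather not copy documents into a scratch collection, you can sum per-document sizes client-side. A sketch in Python: with pymongo installed the exact size is len(bson.encode(doc)), but here json.dumps stands in so the sketch runs without a driver (it only approximates the BSON size):

```python
import json

def doc_size_bytes(doc):
    # Approximation; the exact figure would be len(bson.encode(doc)).
    return len(json.dumps(doc).encode("utf-8"))

# Stand-in for collection.find({"owner": "someOwner"}) results:
docs = [
    {"owner": "someOwner", "mediaUrl": "http://example.com/a.jpg"},
    {"owner": "someOwner", "mediaUrl": "http://example.com/b.jpg"},
]
total = sum(doc_size_bytes(d) for d in docs)
print(total)
```

The trade-off is that every matching document has to travel over the wire, so this is only reasonable for result sets you would fetch anyway.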

Solr: Query for documents whose from-to date range contains the user input

I would like to store and query documents that contain a from-to date range, where the range represents an interval when the document has been valid.
Typical use cases in the Lucene/Solr documentation address the opposite problem: querying for documents that contain a single timestamp which is contained in a date range provided as a query parameter. (createdate:[1976-03-06T23:59:59.999Z TO *])
I want to use the edismax parser.
I have found the ms() function, which seems to me to be designed for boosting score only, not to eliminate non-matching results entirely.
I have found the article Spatial Search Tricks for People Who Don't Have Spatial Data, where the problem described by me is said to be Easy... (Find People Alive On May 25, 1977).
Is there any simpler way to express something like
date_from_query:[valid_from_field TO valid_to_field] than using the spatial approach?
The most direct approach is to create the bounds yourself:
valid_from_field:[* TO date_from_query] AND valid_to_field:[date_from_query TO *]
.. which would give you documents where valid_from_field is earlier than the date you're querying and valid_to_field is later than it; in effect, it matches documents whose validity interval contains the query date. This assumes that neither field is multivalued.
I'd probably add it as a filter query, since you don't need any scoring from it, and you probably want to allow other search queries at the same time.
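A sketch of building that request (the field names come from the question; the main query, Solr host, and core name are made up):

```python
from urllib.parse import urlencode

query_date = "1977-05-25T00:00:00Z"  # the user-supplied date

# A document is valid iff valid_from <= query_date <= valid_to.
fq = (f"valid_from_field:[* TO {query_date}] AND "
      f"valid_to_field:[{query_date} TO *]")

params = urlencode({
    "q": "title:solr",   # scored search terms -- hypothetical
    "fq": fq,            # non-scoring filter; Solr also caches it
    "defType": "edismax",
})
url = "http://localhost:8983/solr/mycore/select?" + params  # host/core assumed
print(url)
```

Keeping the range check in fq rather than q means it never affects ranking, and the filter's result set can be reused from Solr's filter cache across searches that share the same date.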