Algolia Custom Ranking on the Fly

I have an index of thousands of music tracks. For searching, I want the tracks returned ordered by track title, ascending.
I also have a created_at field, a datetime of when I added the track to the library. Is it OK for me to change the ranking on the fly?
So for my normal artist/title search, before the query I would run:
index.setSettings({
  ranking: [
    "asc(title)",
    "asc(artist)"
  ]
});
And then when I want to return the tracks I recently added to the database I would run:
index.setSettings({
  ranking: [
    "desc(created_at)",
    "asc(title)",
    "asc(artist)"
  ]
});
My question is: is this performant? Are there any downsides to doing this for each query?
Thanks for the advice!

Algolia sorts data at indexing time, not query time. That's why you have to use the setSettings method. If you do it this way, all the data will be re-sorted every time you push new settings.
The solution is to use replicas: copies of the master index, each sorted differently.
You can find the relevant doc here:
https://www.algolia.com/doc/guides/relevance/sorting/#multiple-sorting-strategies
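As a sketch of the replica approach (index names here are invented for illustration): the settings become a one-time setup rather than a per-query call. The primary index declares its replicas, and each replica gets its own ranking:

```javascript
// Hypothetical one-time setup (not per-query). "tracks" is the primary
// index and "tracks_recent" is a replica name invented for this sketch.
const primarySettings = {
  ranking: ["asc(title)", "asc(artist)"],
  replicas: ["tracks_recent"] // declared on the primary index only
};

const replicaSettings = {
  ranking: ["desc(created_at)", "asc(title)", "asc(artist)"]
};

// With a real client this would be applied once, e.g.:
//   client.initIndex("tracks").setSettings(primarySettings);
//   client.initIndex("tracks_recent").setSettings(replicaSettings);
// At query time you pick the index that matches the sort the user wants.
console.log(Object.keys(primarySettings)); // [ 'ranking', 'replicas' ]
```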

Related

MongoDB Querying Large Datasets

Let's say I have a simple document structure like:
{
  "item": {
    "name": "Skittles",
    "category": "Candies & Snacks"
  }
}
On my search page, whenever a user searches for a product name, I want to offer filter options by category.
Since there can be many categories (around 50), I cannot display all of the checkboxes in the sidebar beside the search results. I want to show only those that have products associated with them in the results. So if no product in the search result has a given category, do not show that category option.
Now, the item search by name itself is paginated; I only show 30 items per page. And we have tens of thousands of items in our database.
I could search and retrieve all items from all pages, then parse out the categories. But retrieving tens of thousands of items in one page would be really slow.
Is there a way to optimize this query?
You can use different approaches based on your workflow and see what works best in your situation. Some good candidates are:
Use distinct prior to running the query on the large dataset.
Use the aggregation pipeline, as @Lucia suggested:
[{ $group: { _id: "$item.category" } }]
Use another datastore (either Redis or MongoDB itself) to store precomputed information about categories.
Finally, based on the approach you choose and the inflow of filter requests, you may want to consider indexing some fields.
P.S. You're right about how aggregation works: unless you have a $match filter as the first stage, it will fetch all the documents and then apply the next stage.
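To combine the points above: putting a $match stage first keeps the $group from scanning the whole collection. A hedged sketch (the field names and the search regex are assumptions), with a tiny in-memory simulation showing the shape of what the two stages return:

```javascript
// Hypothetical pipeline to send to MongoDB: match the search first,
// then group to get one row per distinct category in the results.
const pipeline = [
  { $match: { "item.name": /skittles/i } },   // the user's search
  { $group: { _id: "$item.category" } }       // distinct categories
];

// In-memory simulation of $match + $group on invented sample docs,
// just to illustrate the result shape:
const docs = [
  { item: { name: "Skittles",      category: "Candies & Snacks" } },
  { item: { name: "Sour Skittles", category: "Candies & Snacks" } },
  { item: { name: "Cola",          category: "Drinks" } }
];

const matched = docs.filter(d => /skittles/i.test(d.item.name));
const categories = [...new Set(matched.map(d => d.item.category))]
  .map(c => ({ _id: c }));

console.log(categories); // e.g. [ { _id: 'Candies & Snacks' } ]
```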

Algolia: Best way to query slave index to get sort by date ranking functionality

I have a data set that I want to sort by date (both ascending and descending) on the fly. I read through the docs, and as instructed I've created a slave index of my master index, where the top ranking criterion is my 'date' attribute, ordered DESC. The date is in the correct integer Unix-timestamp format.
My question is how do I query this new index on the fly using the front end Javascript Algolia API?
Right now, my code looks like the following:
this.client = algoliasearch("xxxx", "xxxxx");
this.index = this.client.initIndex('master_index');
this.index.search(
  this.query, {
    hitsPerPage: 10,
    page: this.pagination,
    facets: '*',
    facetFilters: facetArray
  },
  function(error, results) {
    // do stuff
  }.bind(this));
What I've tried doing is to just change the initIndex to use my slave index instead and this does work...but I'm thinking that this is slow and inefficient if I need to reinitialize the index every time the user just wants to sort by date. Isn't there a parameter instead that I can insert in the query to sort by date?
Also, my second question is that even when I change the index to the slave index, it only sorts by descending. How can I have it sort by ascending as well?
I really do not want to create ANOTHER slave index just to sort by ascending date since I have many thousands of rows and am already close to exceeding my record limit. Surely there must be another way here?
Thanks!
What I've tried doing is to just change the initIndex to use my slave index instead and this does work...but I'm thinking that this is slow and inefficient if I need to reinitialize the index every time the user just wants to sort by date. Isn't there a parameter instead that I can insert in the query to sort by date?
You should store all the indices you want to do sorts in different properties on the this object:
this.indices = {
  mostRelevant: this.client.initIndex('master_index'),
  desc: this.client.initIndex('slave_desc')
};
Then you can use this.indices.mostRelevant.search() or this.indices.desc.search().
Doing so is not a performance problem.
Also see the dedicated library to create instant-search experiences: https://community.algolia.com/instantsearch.js/
Also, my second question is that even when I change the index to the slave index, it only sorts by descending. How can I have it sort by ascending as well?
I really do not want to create ANOTHER slave index just to sort by ascending date since I have many thousands of rows and am already close to exceeding my record limit. Surely there must be another way here?
This is the only way to do sorts in Algolia. Pre-sorting at indexing time is by design part of what makes Algolia so fast, so a separate slave index per sort order is currently required.
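One way to keep this tidy on the front end is a small helper that maps the user's sort selection to the index to query. The index names below are hypothetical (only 'master_index' appears in the question), and, as noted, the ascending sort would still need its own slave index:

```javascript
// Hypothetical mapping from the UI's sort choice to an Algolia index name.
// Each sort order requires its own slave/replica index.
const INDEX_BY_SORT = {
  relevance: "master_index",
  date_desc: "slave_date_desc",
  date_asc:  "slave_date_asc"   // would require one more slave index
};

function indexNameFor(sort) {
  // Fall back to the relevance-ranked master index for unknown choices.
  return INDEX_BY_SORT[sort] || INDEX_BY_SORT.relevance;
}

console.log(indexNameFor("date_desc")); // "slave_date_desc"
console.log(indexNameFor("unknown"));   // "master_index"
```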

Using MongoDB in a store

I am using MongoDB for a store, and I need to find how frequently an item sells. I know the logic but not the syntax in Mongo. Assume I have 3 items: itemA was sold at "2015-08-25 00:28:41", itemB at "2015-08-25 00:29:05", and itemC at "2015-08-25 00:30:02". I need to subtract C-B and B-A, then add the two results and divide by 2. How can I run this query for multiple items, for example 100? Thanks.
It seems your question is a bit more basic: how to query MongoDB.
So if your collection name is 'store', you will use:
db.store.find() // This will get all records.
If you want to sort by date, you can add .sort({ date: -1 }).
Then adding .limit(100) will limit the number of results you get, and you can carry on with whatever logic you need:
db.store.find().sort({ date: -1}).limit(100)
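The interval logic itself ((C-B) + (B-A), divided by 2, i.e. the average gap between consecutive sales) is easiest to carry out in application code once the documents are fetched. A plain-JavaScript sketch using the timestamps from the question:

```javascript
// Average gap between consecutive sale times, as the asker describes:
// sum the deltas between neighbouring sales and divide by their count.
function avgSaleIntervalMs(timestamps) {
  const times = timestamps
    .map(t => new Date(t).getTime())
    .sort((a, b) => a - b);           // oldest to newest
  if (times.length < 2) return null;  // need two sales for an interval
  let total = 0;
  for (let i = 1; i < times.length; i++) total += times[i] - times[i - 1];
  return total / (times.length - 1);
}

const gap = avgSaleIntervalMs([
  "2015-08-25T00:28:41Z", // itemA
  "2015-08-25T00:29:05Z", // itemB
  "2015-08-25T00:30:02Z"  // itemC
]);
console.log(gap / 1000); // 40.5 seconds: (24s + 57s) / 2
```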

Solr: How to Search by Time *AND* Distance

We are working on an app that uses Solr to search by distance. A Solr consultant wrote the original code but is no longer available, so I, a Solr newbie, am trying to take it over. The current index-insertion code looks like this:
{add:
  {doc: {
      id: <my_id>,
      category: <my_type>,
      resourcename: <private_flag>,
      store: <my_latlng>
    },
    overwrite: true,
    commitWithin: <commit_time>
  }
}
And the query below properly returns all the documents near (mylat, mylng):
localhost:8983/solr/select?wt=json&q=category:"<mytype>"&fl=id&fq=
{!bbox}&sfield=store&pt=<mylat>,<mylng>&d=<my_distance>&rows=200
All was well in paradise. Now we need to add a time dimension: instead of just retrieving nearby docs, we need to retrieve nearby docs within a specific time range (e.g. the last 2 days, the last 2 months). This means adding an "origin_time" field to each document, and then modifying the query to search by time plus distance.
Can anyone suggest how I should add this time field to the index and how to adjust the query to search by distance and time?
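One hedged sketch (the field name and time window are assumptions): if origin_time is declared as a Solr date field in the schema and sent with each document, then, since multiple fq parameters are ANDed together, a date-math range filter can be added alongside the existing bbox filter:

```
localhost:8983/solr/select?wt=json&q=category:"<mytype>"&fl=id
  &fq={!bbox}&sfield=store&pt=<mylat>,<mylng>&d=<my_distance>
  &fq=origin_time:[NOW-2DAYS TO NOW]
  &rows=200
```

The origin_time field would also need to be added to the add/doc block above and given a date type in the schema; [NOW-2MONTHS TO NOW] would cover the two-month case.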
Thanks!

MongoDB - Query embedded documents

I have a collection named Events. Each Event document has a collection of Participants as embedded documents.
Now my question: is there a way to query an Event and get only the Participants with, e.g., Age > 18?
When you query a collection in MongoDB, by default it returns the entire document which matches the query. You could slice it and retrieve a single subdocument if you want.
If all you want is the Participants who are older than 18, it would probably be best to do one of two things:
Store them in a subdocument inside of the event document called "Over18" or something. Insert them into that document (and possibly the other if you want) and then when you query the collection, you can instruct the database to only return the "Over18" subdocument. The downside to this is that you store your participants in two different subdocuments and you will have to figure out their age before inserting. This may or may not be feasible depending on your application. If you need to be able to check on arbitrary ages (i.e. sometimes its 18 but sometimes its 21 or 25, etc) then this will not work.
Query the collection, retrieve the Participants subdocument, and then filter it in your application code. Despite what some people may believe, this isn't terrible, because you don't want your database doing too much work all the time. Offloading the computation to your application can actually benefit your database, because it can then spend more time querying and less time filtering. That leads to better scalability in the long run.
Short answer: no. I tried to do the same a couple of months back, but mongoDB does not support it (at least in version <= 1.8). The same question has been asked in their Google Group for sure. You can either store the participants as a separate collection or get the whole documents and then filter them on the client. Far from ideal, I know. I'm still trying to figure out the best way around this limitation.
For future reference: This will be possible in MongoDB 2.2 using the new aggregation framework, by aggregating like this:
db.events.aggregate(
  { $unwind: '$participants' },
  { $match: { 'participants.age': { $gte: 18 } } },
  { $project: { participants: 1 } }
)
This will return a list of n documents, where n is the number of participants older than 18, and each entry looks like this (note that the "participants" field now holds a single embedded document instead of an array):
{
  _id: objectIdOfTheEvent,
  participants: { firstName: 'only one', lastName: 'participant' }
}
It could probably even be flattened on the server to return a plain list of participants. See the official documentation for more information.
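To make the stages concrete, here is a plain-JavaScript simulation of what $unwind, $match, and $project compute, on invented sample data (this is not MongoDB code, just an illustration of the result shape):

```javascript
// In-memory simulation of $unwind -> $match -> $project on one event.
const events = [{
  _id: "event1",
  name: "Meetup",
  participants: [
    { firstName: "only one", lastName: "participant", age: 34 },
    { firstName: "too",      lastName: "young",       age: 12 }
  ]
}];

const result = events
  // $unwind: one output document per array element
  .flatMap(e => e.participants.map(p => ({ ...e, participants: p })))
  // $match: keep adults only
  .filter(d => d.participants.age >= 18)
  // $project: keep only the participants field (plus _id, as MongoDB does)
  .map(d => ({ _id: d._id, participants: d.participants }));

console.log(result.length);                    // 1
console.log(result[0].participants.firstName); // "only one"
```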