I have a Restaurant model and an Offer model (Restaurant has_many :offers). The Sphinx index itself is defined at the restaurant level. I want a delta index that is triggered when new offers are added for a restaurant or when an existing offer is updated.
What's the best way to do that?
I think that this question could be useful for your problem:
Using Delta Indexes for associations in Thinking Sphinx
I need to perform sorting on Elasticsearch documents...
I have an index created for the MongoDB collection 'products', which has price and product rating fields in it.
I have another collection, 'product_hits', in which I save one record (product_id, IP, etc.) every time a user clicks on a particular product. Now I want to sort the product documents by considering the product hit count (which I could perhaps get through aggregation), price and product rating.
In short, I want to rank all the products based on price and popularity, as other sites do.
How can I achieve this in Elasticsearch?
I have gone through Elasticsearch scripting and I am able to sort on price and product rating, but I haven't found anything useful for sorting based on multiple indices.
Is it possible, or do I have to sort all the records myself in code?
I am using the Play framework.
I hope this question can be understood; it's a bit complex.
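One possible sketch, not a definitive answer: as far as I know, Elasticsearch cannot sort the results of one index by values stored in another index, so the hit count would have to be denormalized onto each product document, say as a hypothetical hit_count field kept up to date from 'product_hits'. A single search request can then sort on all three criteria at once; something like the following query body (field names assumed):
{
  "query": { "match_all": {} },
  "sort": [
    { "hit_count": "desc" },
    { "rating": "desc" },
    { "price": "asc" }
  ]
}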
Let's say I have a collection with this structure:
student_id, score, score_type
I have an index on score, and I want to query the scores of the student with id 1000 and order the results by score.
I ran the query on my dataset and this is what the query plan shows:
1: First the db uses the index on score to sort the documents.
2: Then it filters for the documents with student_id: 1000.
Even though an index is used here, all the documents are examined for the match (since there is no index on student_id). My question is: if all the documents have to be examined anyway, why doesn't the db consider this alternative plan:
1: Do a collection-wide scan and do the filtering.
2: Then use the index on score to do the sorting.
Here the sorting will be done on a smaller dataset, so it should be faster.
What is wrong with the second plan?
In general, MongoDB uses only one index per query.
So if you want to filter on one key and sort by another, you need a compound index:
db.collection.ensureIndex({student_id:1,score:1})
db.collection.find({student_id: 1000}).sort({score:1})
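If you want to verify this (a sketch, assuming a reasonably recent shell), explain() should show that the compound index serves both the filter and the sort:
// look for an IXSCAN stage on { student_id: 1, score: 1 } and the absence of an in-memory SORT stage
db.collection.find({student_id: 1000}).sort({score: 1}).explain("executionStats")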
As we already know, there are no table joins in MongoDB, but if we want to get a result from 2 different collections, how can we query MongoDB? Consider the following example.
Collection 1 -> department
{Id_: 123, name: "technical", location: "B Wing"}
{Id_: 234, name: "account", location: "main Wing"}
{Id_: 547, name: "HR", location: "C Wing"}
Collection 2 -> employee
{Id_: "a101", name: "Peter", dept_id: 234, DOB: "2010-01-01"}
{Id_: "a102", name: "Liomo", dept_id: 547, DOB: "1950-01-01"}
{Id_: "a103", name: "Juno", dept_id: 123, DOB: "1990-01-01"}
{Id_: "a104", name: "ole", dept_id: 554, DOB: "2011-01-01"}
So how can we get all the fields (EmployeeName, DepartmentName, DOB) in one result? I can't find a way to do this; please help me out.
Thanks in advance
There is no way; MongoDB does not support joins. The NoSQL way is to denormalize your data, meaning you embed in A a copy of the fields from B that you need.
There is such a thing as database references, but all that does is provide syntactic sugar for combining the result of two queries client side.
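For example, with the collections from the question, that client-side combination could look roughly like this in the mongo shell (the missing-department guard is for employees like a104 whose dept_id has no match):
var emp = db.employee.findOne({Id_: "a103"});
var dept = db.department.findOne({Id_: emp.dept_id});
var result = {
    EmployeeName: emp.name,
    DepartmentName: dept ? dept.name : null,
    DOB: emp.DOB
};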
By the way, if your data is really relational in nature (like an employee database the way I would probably design it), perhaps a relational database would be a better fit.
I have some collection A with _id, content, timestamp as fields and some collection B with A_id, _id, content, timestamp as fields. A_id refers to some object in A.
I want to sort the objects in A based on their latest timestamps in B.
I can get it to work by re-architecting my db design (e.g. storing a latest_B_timestamp in A), but is there a simple way to do this directly with Mongo?
Thanks!
I doubt there is any good way to do that with mongo. Your current solution seems ok and natural in mongo. Duplication is the way to go.
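A rough sketch of that duplication approach in the mongo shell, using the latest_B_timestamp field name from the question (collection and variable names are illustrative):
// whenever a document is written to B, refresh the duplicated timestamp on A
var aId = db.A.findOne()._id;   // some existing A document
var now = new Date();
db.B.insert({A_id: aId, content: "new entry", timestamp: now});
db.A.update({_id: aId}, {$set: {latest_B_timestamp: now}});
// sorting A by the latest activity in B is then a plain (indexable) sort
db.A.find().sort({latest_B_timestamp: -1});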
No.
MongoDB has no joins, so if 2 collections have related data, it has to be combined in the application layer.
I'm working on a small project where I need to build an inverted index and apply similarity algorithms based on a user query - basic information retrieval. What's the best NoSQL product for building and searching inverted indices?
Thanks,
J
Since an inverted index is all about storing the relationship between words and their locations within a document, I'm not sure this is really a good use case for NoSQL. Traditional SQL will work better here. For example, try a data structure like this:
Documents (DocumentID primary key, DocumentText text)
Words (WordID primary key, Word text)
Instances (InstanceID primary key, WordID foreign key, DocumentID foreign key, WordIndex integer)
With this structure, as you insert a document into the Documents table, you parse out each word and add it to the Words table if it's new or retrieve the existing WordID if it already exists, and then add the associated data to the Instances table.
If you're intent on using NoSQL, you can do this with something like MongoDB: put all your documents in one collection and all the words in another collection. Inside each Word document, include an Instances array of objects holding the ObjectId of the associated document and the word index within that document. However, I'm not sure MongoDB is optimized for handling such large arrays within documents; common words like 'a' and 'the' could even end up going over the 4MB document limit, depending on how much data you have.
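A minimal sketch of that layout in the mongo shell (collection and field names are my own, assuming a current shell with the insertOne/updateOne helpers):
// index one document: store its text, then upsert each word with a
// (document_id, word_index) pair pushed onto its instances array
var docId = db.documents.insertOne({ text: "the quick brown fox" }).insertedId;
"the quick brown fox".split(" ").forEach(function (word, index) {
    db.words.updateOne(
        { word: word },
        { $push: { instances: { document_id: docId, word_index: index } } },
        { upsert: true }
    );
});
// look up every (document, position) where "quick" occurs
db.words.findOne({ word: "quick" }).instances;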
See Elasticsearch:
- Distributed, scalable, and highly available
- Real-time search and analytics capabilities
- Sophisticated RESTful API