I have an index in Algolia and two websites. As an example, the records are:
[{
    name: "record 1",
    public: 1
}, {
    name: "record 2",
    public: 0
}]
Both websites search the same index, but what I want is:
the first website can search all records
the second website should only be able to search records with public: 1 and shouldn't be able to search records with public: 0
I thought about using two separate indexes, but the records with public: 1 are shared by both websites, so I would have to duplicate them (and each plan has a record limit). It's not a solution I want to apply.
How can I achieve this?
You can create a secured API key for the second website that contains a filter for public:1.
From the docs about secured API keys:
The goal of a secured API key is to ensure a set of query parameters
cannot be changed by the end user. In order to do that, we compute a
HMAC SHA-256 hash between one of your API keys that is used as a
secret and the set of query parameters you want to enforce.
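For example, with the JavaScript API client, the key for the second website could be generated server-side like this (a minimal sketch, assuming the v3 algoliasearch client; the application ID and keys are placeholders):

const algoliasearch = require('algoliasearch');

// Generation must happen on your server, never in the browser,
// because it uses an API key as the HMAC secret.
const client = algoliasearch('YourApplicationID', 'YourSearchOnlyAPIKey');

// Embed the filter in the key; the end user cannot remove or change it.
const publicSiteKey = client.generateSecuredApiKey('YourSearchOnlyAPIKey', {
    filters: 'public:1'
});

The second website then initializes its search client with publicSiteKey, while the first website keeps using an unrestricted search-only key.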
I'm developing a social media app and I need to create a friend list and a followers list. For the followers list I'm thinking about query performance, so I have two options in mind, but I don't know which is best:
First strategy:
{
    id: "any_id",
    follower_id: "follower_id",
    user_id: "user_id"
}
Second strategy:
{
    id: "user_id",
    user_followers: [ { id: "follower_1" }, { id: "follower_2" } ]
}
The first strategy is better, for several reasons.
For example, with follower_id and user_id stored in a separate collection, you can use skip, limit, and pagination to fetch the list of followers as it grows; an embedded array of followers needs more and more processing as the list grows.
Updating the list of followers in the first strategy is also simpler and faster than updating an array.
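A minimal sketch of the first strategy in the mongo shell (the followers collection name is an assumption):

// One document per follower relationship, indexed by the followed user.
db.followers.createIndex({ user_id: 1 })

// Page through a user's followers 30 at a time, however large the list grows.
db.followers.find({ user_id: "user_id" }).skip(30).limit(30)

// Adding or removing a follower touches one small document, instead of
// rewriting a growing array inside the user document.
db.followers.insert({ follower_id: "follower_id", user_id: "user_id" })
db.followers.remove({ follower_id: "follower_id", user_id: "user_id" })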
I have collections of users and projects.
Every project is connected exactly to one user.
My question is: should every user hold the list of project ids?
If I want to retrieve all the projects of a specific user, which option is more efficient and best practice:
Create an index on the projects collection on the user id property, then just query on the user id property.
Create an index on the projects collection on the project id property; then, if the user holds its project ids, query the projects collection for those specific ids.
Which option should I choose? Maybe there is a third option that is better?
The advantage of the first option is that I don't need to update the list of projects in the user document when deleting/adding projects.
Thanks!
Every project is connected exactly to one user.
A user can have many projects (and a project is associated with one user only). This is a one-to-many relationship.
My question is: should every user hold the list of project ids?
Every user should store the list of his/her projects. For example:
user:
{
    id: <some value>,
    name: <some value>,
    email: <some value>,
    projects: [
        { projectId: <some value>, projectName: <...>, projectDescription: <...>, otherInfo: { fld1: <...>, fld2: <...>, etc. } },
        { projectId: <some value>, projectName: <...>, projectDescription: <...>, otherInfo: { fld1: <...>, fld2: <...>, etc. } },
        ...
    ]
}
Note that each project is a sub-document (object or embedded document) within the projects array. A project has its related details like projectId, projectName, etc.
If I want to retrieve all the projects of a specific user, which
option is more efficient and best practice:
a. Create an index on the projects collection on the user id property,
then just query on the user id property.
b. Create an index on the projects collection on the project id property;
then, if the user holds its project ids, query the projects
collection for those specific ids.
I think there should be only one collection, called user_projects, assuming that: (i) a user may have 0 to 100 projects, and (ii) a project's details are not too large.
This is a model that embeds the 'many' side of the 1-to-N relationship into the 'one' side. It is a recommended way of de-normalizing the data, with the advantage of efficient, fast queries. It also simplifies transactions, as writes (inserts, updates, and deletes) are atomic single operations on a document within the same collection.
About retrieving all projects for a specific user:
You will be using the user id or name (with a unique index) to retrieve a document, and it will be a very fast query. You can also have an index on the projects array (indexes on array fields are called Multikey Indexes) on the project's fields. For example, an index on projectId and/or projectName makes sense.
You can get all projects for a user with a simple query on the user id / name. Query projection controls which project information is returned. You can use a find or aggregate method to build the query.
You can query a specific project for a user, using the projectId or projectName. Since there are indexes on user and project fields, this will be an efficient query.
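As a rough sketch in the mongo shell (the collection and field names follow the example above):

// Unique index on the user, plus a multikey index on the embedded projects.
db.user_projects.createIndex({ id: 1 }, { unique: true })
db.user_projects.createIndex({ "projects.projectId": 1 })

// All projects for a user, projecting only the projects array.
db.user_projects.find({ id: "user1" }, { projects: 1 })

// A specific project for a user; the positional $ projection returns
// just the matching array element.
db.user_projects.find(
    { id: "user1", "projects.projectId": "p42" },
    { "projects.$": 1 }
)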
So, my recommendation is to have a single collection, user_projects, with a user's information and the projects information embedded in it.
Let's say I have a simple document structure like:
{
    "item": {
        "name": "Skittles",
        "category": "Candies & Snacks"
    }
}
On my search page, whenever a user searches for a product name, I want to offer filter options by category.
Since there can be many categories (around 50), I cannot display all of the checkboxes in the sidebar beside the search results. I want to show only those that have products associated with them in the results. So if none of the products in the search results has a given category, that category option should not be shown.
Now, the item search by name is itself paginated: I only show 30 items per page, and we have tens of thousands of items in our database.
I could search and retrieve all items from all pages, then parse out the categories. But retrieving tens of thousands of items in one query would be really slow.
Is there a way to optimize this query?
You can use different approaches based on your workflow and see what works best in your situation. Some good candidates for a solution are:
Use distinct prior to running the query on the large dataset
Use the Aggregation Pipeline, as @Lucia suggested:
[{ $group: { _id: "$item.category" } }]
Use another datastore (either Redis or Mongo itself) to store intelligence about categories
Finally, based on the approach you choose and the inflow of filter requests, you may want to consider indexing some fields.
P.S. You're right about how aggregation works: unless you have a $match filter as the first stage, it will fetch all the documents and then apply the next stage.
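For example, a sketch that only groups the documents matching the user's search (the name filter on item.name is an assumption about how your search query looks):

db.items.aggregate([
    // Narrow the pipeline first, so $group only sees the search results.
    { $match: { "item.name": /skittles/i } },
    // Collect the distinct categories present in those results.
    { $group: { _id: "$item.category" } }
])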
I am new to Cloudant. In my current assignment I want to find all distinct records based on a field x.
I have documents which have domain as an attribute, and I want all unique domains present in my db. Below is an example:
documentNo1-{"domain":"gmail.com"}
documentNo2-{"domain":"ymail.com"}
documentNo3-{"domain":"gmail.com"}
The expected result is that the API should return only the unique domain names, like below:
[gmail.com,ymail.com]
I cannot find operators in Cloudant which can achieve this; the only solution I have is to retrieve everything and build our own unique domain list.
Looking for a good approach/solution to the above scenario.
You can use Cloudant Search to create a faceted index.
See https://console.bluemix.net/docs/services/Cloudant/api/search.html#faceting
This would allow you to essentially group documents by domain, creating the unique list you need.
There is a good video tutorial showing this technique:
https://www.youtube.com/watch?v=9er3XI150VM
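A minimal sketch (the design document and index names are placeholders): create a search index that marks domain as a facet, then ask for counts on it.

{
    "_id": "_design/search",
    "indexes": {
        "domains": {
            "index": "function (doc) { if (doc.domain) { index('domain', doc.domain, { facet: true }); } }"
        }
    }
}

A query such as

GET /yourdb/_design/search/_search/domains?q=*:*&counts=["domain"]&limit=0

then returns counts like { "domain": { "gmail.com": 2, "ymail.com": 1 } }, and the keys of that object are the unique domain list you need.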
Is it possible to make a compound index, where one of the fields have a fixed value?
Let's say I want to prevent users from using the same e-mail for different accounts, but only for regular user accounts. I want to allow admins to use the same e-mail in as many places as they want, and even to have a regular user account and an administrative account with the same e-mail.
User.index({ username: 1, email: 1 }, { unique: true })
That is not useful, since it will not allow admins to reuse the email. Is it possible to do something like this?
User.index({ role: "regular_user", username: 1, email: 1 }, { unique: true });
Luis,
In regard to the example you gave: if you create a unique compound index, individual keys can have the same values, but the combination of values across the keys that exist in the index entry can only appear once. So if we had a unique index on {"username" : 1, "role" : 1}, the following inserts would be legal:
> db.users.insert({"username" : "Luis Sieira"})
> db.users.insert({"username" : "Luis Sieira", "role" : "regular"})
> db.users.insert({"username" : "Luis Sieira", "role" : "admin"})
If you tried to insert a second copy of any of the above documents, you would cause a duplicate key exception.
Your Scenarios
I think you could add an allowance field to your schema. When an admin creates a new account, you can set a different value for their admin allowance. If you added a unique index on {"username" : 1, "email" : 1, "allowance" : 1},
you could make the following inserts legally:
>db.users.insert({"username" : "inspired","email": "i#so.com", "allowance": 0})
>db.users.insert({"username" : "inspired","email": "i#so.com", "allowance": 1})
>db.users.insert({"username" : "inspired","email": "i#so.com", "allowance": 2})
>db.users.insert({"username" : "inspired","email": "i#so.com", "allowance": 3})
Of course, you'll have to handle certain logic on the client, but this will allow you to use an allowance code of 0 for regular accounts and to save a higher allowance code (incrementing it or adding a custom value) each time an admin creates another account.
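To make that concrete, the index and the failure case for regular accounts would look something like this in the mongo shell (a sketch):

db.users.createIndex(
    { username: 1, email: 1, allowance: 1 },
    { unique: true }
)

// Regular accounts always use allowance 0, so a second regular account
// with the same username and email raises a duplicate key exception.
db.users.insert({ "username" : "inspired", "email": "i#so.com", "allowance": 0 })
db.users.insert({ "username" : "inspired", "email": "i#so.com", "allowance": 0 }) // fails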
I hope this offers some direction with using unique compound indexes.
You are on the right track. First things first: if you define an index with the role like this:
User.index({role: 1, username: 1, email: 1}, { unique: true });
Mongo will use null for documents that do not specify the role field. If you insert a user without specifying the role and try to add it again, you will get an error because the three fields already exist in the database. So you can use this to your advantage by not including a role (or by using a predefined value for better readability, like the regular_user you proposed).
Now, the tricky part is making the index permit admins to bypass the uniqueness constraint. The best solution would be to generate some hash and append it to the role. So, if you just add admins with the role admin_user, you won't bypass the constraint, while using a role like admin_user_635646 (always with a varying suffix) will allow you to insert the same admin multiple times.
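A minimal sketch of that idea (the suffix generation here is just one possible choice):

// Regular users share a fixed role, so the unique index applies to them.
const regularUser = { role: "regular_user", username: "luis", email: "luis@example.com" };

// Admins get a varying suffix, so the (role, username, email) triple
// never collides and the constraint is effectively bypassed for them.
const adminUser = {
    role: "admin_user_" + Date.now().toString(36),
    username: "luis",
    email: "luis@example.com"
};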