Faceting in Algolia search engine index

I'm integrating the Algolia search engine using Node.js and am having trouble getting correct facets for an array property.
For example, each record in the Algolia index has an items property containing data in this format:
{
  id: 1,
  category: 'books',
  items: [
    { id: 1, name: 'C Programming Language', instock: true },
    { id: 2, name: 'Head First C', instock: false }
  ]
}
We want the facet to contain only the names of items whose instock value is true.
I applied distinct on items.name and filtered by instock:true, but the facet still contains both 'C Programming Language' and 'Head First C'.
Expected result: only 'C Programming Language' should appear in the items name facet when records are filtered by instock:true.
Is there an option I'm missing? Any help would be appreciated.

Your data structure is inside out for an index.
The individual titles should be the records, with "category" as an attribute of those records. With your current object structure, the entire category is a record.
Something like this should work:
{
  id: 1,
  name: 'C Programming Language',
  instock: true,
  category: 'books'
},
{
  id: 2,
  name: 'Head First C',
  instock: false,
  category: 'books'
}
Then you can filter on titles that are category:'books' and instock:true
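The restructuring described above can be sketched in plain JavaScript. This is only an illustration of the data transformation, not Algolia API code: the helpers flattenCategory and countFacet are hypothetical names, and the in-memory filter merely stands in for a filter on category:'books' and instock:true.

```javascript
// Hypothetical helper: turn one category-level record into one record per item.
function flattenCategory(categoryRecord) {
  return categoryRecord.items.map((item) => ({
    id: item.id,
    name: item.name,
    instock: item.instock,
    category: categoryRecord.category,
  }));
}

// Hypothetical helper: count the values of one attribute, the way a facet would.
function countFacet(records, attribute) {
  const counts = {};
  for (const record of records) {
    counts[record[attribute]] = (counts[record[attribute]] || 0) + 1;
  }
  return counts;
}

const nested = {
  id: 1,
  category: 'books',
  items: [
    { id: 1, name: 'C Programming Language', instock: true },
    { id: 2, name: 'Head First C', instock: false },
  ],
};

const records = flattenCategory(nested);
// In-memory stand-in for filtering on category:'books' and instock:true.
const inStockBooks = records.filter((r) => r.category === 'books' && r.instock === true);
const nameFacet = countFacet(inStockBooks, 'name');
console.log(nameFacet);
```

Because each title is its own record, filtering on instock removes the whole record, so its name no longer contributes to the facet.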

Related

Mongoose findOne not working as expected on nested records

I've got a collection in MongoDB whose simplified version looks like this:
Dealers = [{
Id: 123,
Name: 'Someone',
Email: 'someone@somewhere.com',
Vehicles: [
{
Id: 1234,
Make: 'Honda',
Model: 'Civic'
},
{
Id: 2345,
Make: 'Ford',
Model: 'Focus'
},
{
Id: 3456,
Make: 'Ford',
Model: 'KA'
}
]
}]
And my Mongoose Model looks a bit like this:
const vehicle_model = mongoose.Schema({
  Id: {
    type: Number
  },
  Email: {
    type: String
  },
  Vehicles: [{
    Id: {
      type: Number
    },
    Make: {
      type: String
    },
    Model: {
      type: String
    }
  }]
})
Note the Ids are not MongoDB Ids, just distinct numbers.
I try doing something like this:
const response = await vehicle_model.findOne({ 'Id': 123, 'Vehicles.Id': 1234 })
But when I do:
console.log(response.Vehicles.length)
It returns all of the nested Vehicles records instead of only the one I'm after.
What am I doing wrong?
Thanks.

This question is asked very frequently. Indeed someone asked a related question here just 18 minutes before this one.
When you query the database, you are asking it to identify matching documents and return them to the client. That is an entirely separate action from asking it to transform the shape of those documents before they are sent back to the client.
In MongoDB, the latter operation (transforming the shape of the document) is usually referred to as "Projection". Simple projections, specifically just returning a subset of the fields, can be done directly in find() (and similar) operations. Most drivers and the shell use the second argument to the method as the projection specification, see here in the documentation.
Your particular case is a little more complicated because you are looking to trim off some of the values in the array. There is a dedicated page in the documentation titled Project Fields to Return from Query which goes into more detail about different situations. Indeed near the bottom is a section titled Project Specific Array Elements in the Returned Array which describes your situation more directly. In it is where they describe usage of the positional $ operator. You can use that as a starting place as follows:
db.collection.find({
"Id": 123,
"Vehicles.Id": 1234
},
{
"Vehicles.$": 1
})
Playground demonstration here.
If you need something more complex, then you would have to start exploring usage of the $elemMatch (projection) operator (not the query variant) or, as @nimrod serok mentions in the comments, using the $filter aggregation operator in an aggregation pipeline. The last option here is certainly the most expressive and flexible, but also the most verbose.
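To make the difference between these options concrete, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB: projectFirstMatch and filterArray are hypothetical helpers that only imitate what the positional $ projection (first matching element) and $filter (every matching element) would return. The pipeline constant shows how a $filter stage for this data might be written; running it against a real collection is not shown.

```javascript
const dealer = {
  Id: 123,
  Name: 'Someone',
  Vehicles: [
    { Id: 1234, Make: 'Honda', Model: 'Civic' },
    { Id: 2345, Make: 'Ford', Model: 'Focus' },
    { Id: 3456, Make: 'Ford', Model: 'KA' },
  ],
};

// Imitates the positional $ projection: keep only the FIRST matching element.
// Returns null when nothing matches, since such a document would not match the query.
function projectFirstMatch(doc, arrayField, predicate) {
  const first = doc[arrayField].find(predicate);
  return first === undefined ? null : { [arrayField]: [first] };
}

// Imitates $filter: keep EVERY element that satisfies the condition.
function filterArray(doc, arrayField, predicate) {
  return { [arrayField]: doc[arrayField].filter(predicate) };
}

// A $filter stage for the same data, as it might appear in an aggregation pipeline.
const pipeline = [
  { $match: { Id: 123 } },
  {
    $project: {
      Vehicles: {
        $filter: {
          input: '$Vehicles',
          as: 'vehicle',
          cond: { $eq: ['$$vehicle.Make', 'Ford'] },
        },
      },
    },
  },
];

const one = projectFirstMatch(dealer, 'Vehicles', (v) => v.Id === 1234);
const fords = filterArray(dealer, 'Vehicles', (v) => v.Make === 'Ford');
console.log(one.Vehicles.length, fords.Vehicles.length);
```

The positional form suits the case where exactly one element is wanted, while $filter is the one to reach for when several elements may match.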

Find & Update partial nested collection

Assume I have a Mongo collection as such:
The general schema: There are Categories, each Category has an array of Topics, and each Topic has a Rating.
[
{CategoryName: "Cat1", ..., Topics: [{TopicName: "T1", rating: 9999, ...},
{TopicName: "T2", rating: 42, ....}]},
{CategoryName: "Cat2", ... , Topics: [...]},
...
]
In my client-side Meteor code, I have two operations I'd like to execute smoothly, without any extra filtering afterwards: finding and updating.
I'm imagining the find query as follows:
.find({CategoryName: "Cat1", Topics: [{TopicName: "T1"}]}).fetch()
This will, however, return the whole document - The result I want is only partial:
[{CategoryName: "Cat1", ..., Topics: [{TopicName: "T1", rating: 9999, ...}]}]
Similarly, with updating, I'd like a query somewhat as such:
.update({CategoryName: "Cat1", Topics: [{TopicName: "T1"}]}, {$set: {Topics: [{rating: infinityyy}]}})
To only update the rating of the topic T1, and not all topics of category Cat1.
Again, I'd like to avoid any filtering, as the rest of the data should not even be sent to the client in the first place.
Thanks!

You need to amend your query to the following:
Categories.find(
{ CategoryName: 'Cat1', 'Topics.TopicName': 'T1' },
{ fields: { 'Topics.$': 1 }}, // make sure you put any other fields you want in here too
).fetch()
This query searches for a Category whose name is Cat1 and whose Topics array contains an object with TopicName equal to T1.
In the fields projection we use the $ symbol to tell MongoDB to return only the object that was matched by the query, and not all the objects in the Topics array.
To update this nested object you need to use the same $ symbol:
Categories.update(
  { CategoryName: 'Cat1', 'Topics.TopicName': 'T1' },
  { $set: { 'Topics.$.rating': 100 } }
);
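As a rough picture of what the positional update does, the following plain JavaScript sketch applies the same change to an in-memory copy of the data. It is not Meteor or MongoDB code: setOnFirstMatch is a hypothetical helper, and the sample ratings are taken from the question.

```javascript
const categories = [
  {
    CategoryName: 'Cat1',
    Topics: [
      { TopicName: 'T1', rating: 9999 },
      { TopicName: 'T2', rating: 42 },
    ],
  },
  { CategoryName: 'Cat2', Topics: [] },
];

// Hypothetical helper imitating {$set: {'Topics.$.rating': value}}:
// in the first document matching docPredicate, update only the first
// array element matching elementPredicate. Returns the number of documents changed.
function setOnFirstMatch(docs, docPredicate, arrayField, elementPredicate, field, value) {
  for (const doc of docs) {
    if (!docPredicate(doc)) continue;
    const index = doc[arrayField].findIndex(elementPredicate);
    if (index === -1) continue; // the query requires a matching element
    doc[arrayField][index][field] = value;
    return 1; // only one document is touched, as with a non-multi update
  }
  return 0;
}

const changed = setOnFirstMatch(
  categories,
  (doc) => doc.CategoryName === 'Cat1',
  'Topics',
  (topic) => topic.TopicName === 'T1',
  'rating',
  100
);
console.log(changed, categories[0].Topics);
```

Only T1's rating changes; T2 and the other categories are left as they were.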
Hope that helps

How to filter minimongo collection with more parameters in meteor

I need help reactively filtering a minimongo collection by three or more parameters. First, I loaded minimongo from the server's Mongo collection with publish/subscribe. Now I want to filter that collection so that the user sees only the objects that match the filters. The first filter is a search query that checks whether the input string matches some field in the collection:
Job.find({ $or: [ { Name: regex }, { Description: regex }, ]});
This part works well. Now the second filter: each object has a field that is true if that job is remote friendly and false if it is not. If the user enables that filter, they should see only remote-friendly positions; if they disable it, they should see all available jobs (still restricted by the search query, of course):
if (remoteQuery === true ) {
return Job.find({ $or: [ { Name: regex }, { Description: regex }, ] , Remote: true});
}
This works, but it is a bad way of doing it. The biggest problem comes with the last filter: I have one more field that stores the job (collection object) type. The type can be 1, 2, 3 or 4. So how could I tell minimongo, for example, "Show only jobs that contain "Front-end" (search query), that are remote friendly, and whose type is 2 or 3"?
Job.find({ $or: [ { Name: "Front-end"}, { Description: "Front-end"}, ] , Remote: true, positionType: [2,3],});
Something like this? Thanks!

Sounds like you are looking for the MongoDB query $in operator:
The $in operator selects the documents where the value of a field equals any value in the specified array.
Therefore your 3rd query could look like:
Job.find({
positionType: {
$in: [2, 3] // Matches documents where `positionType` is either 2 or 3
},
// Other conditions…
});
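One way to avoid a separate find() call for every combination of filters is to build the selector object step by step. The sketch below is an assumption about how that could look: buildJobSelector and its option names (search, remoteOnly, types) are hypothetical, while the field names Name, Description, Remote and positionType come from the question.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: add a condition only for the filters that are active.
function buildJobSelector({ search = '', remoteOnly = false, types = [] } = {}) {
  const selector = {};
  if (search) {
    const regex = new RegExp(escapeRegExp(search), 'i');
    selector.$or = [{ Name: regex }, { Description: regex }];
  }
  if (remoteOnly) {
    selector.Remote = true;
  }
  if (types.length > 0) {
    selector.positionType = { $in: types };
  }
  return selector;
}

const selector = buildJobSelector({ search: 'Front-end', remoteOnly: true, types: [2, 3] });
console.log(selector);
// The result would then be passed to the collection, e.g. Job.find(selector).
```

With no filters enabled the helper returns an empty selector, which matches every job, so the same find() call covers all cases.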

Querying by field in MongoDB, which can be arbitrarily deeply nested in the document

I am currently developing a RESTful API/thinking about the implementation.
For the sake of simplicity, in my model, there's a document type, called 'Box'. A box can contain items, and other boxes as well. (Kind of like a composite pattern) Such sub-boxes can be arbitrarily deeply nested.
In MongoDB, such a box document would look like this:
{
  _id: 0,
  items: ['A', 'B', 'C'],
  'sub-boxes': [
    {
      _id: 1,
      items: ['D', 'E']
    },
    {
      _id: 2,
      items: [],
      'sub-boxes': [
        {
          _id: 3,
          items: ['G']
        }
      ]
    }
  ]
}
My REST API url looks like this:
GET /api/boxes/:id
I would like to be able to retrieve the box #0 the same way as #3 (from an API point of view).
GET /api/boxes/0
GET /api/boxes/3
My question: is it possible in MongoDB to query by the field _id even if I don't know how deeply it is nested in the document? I cannot hardcode the location of _id in my queries, since it can be basically anywhere.
I know that I could normalize my model, so each 'sub-boxes' property would only contain references to other boxes, but I would prefer to keep my model denormalized, if possible.

In the end I decided to normalize my data, so each box only contains references to other boxes.
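A minimal sketch of what that normalization could look like, in plain JavaScript rather than MongoDB code. normalizeBox and the subBoxIds field name are hypothetical; the idea is only that every box becomes its own top-level document that refers to its children by _id, so a lookup by _id no longer depends on nesting depth.

```javascript
const nestedBox = {
  _id: 0,
  items: ['A', 'B', 'C'],
  'sub-boxes': [
    { _id: 1, items: ['D', 'E'] },
    { _id: 2, items: [], 'sub-boxes': [{ _id: 3, items: ['G'] }] },
  ],
};

// Walk the nested structure and emit one flat document per box.
function normalizeBox(box, out = []) {
  const children = box['sub-boxes'] || [];
  out.push({
    _id: box._id,
    items: box.items,
    subBoxIds: children.map((child) => child._id), // references instead of embedded boxes
  });
  for (const child of children) {
    normalizeBox(child, out);
  }
  return out;
}

const boxes = normalizeBox(nestedBox);
// With flat documents, GET /api/boxes/3 reduces to a lookup by _id.
const byId = new Map(boxes.map((box) => [box._id, box]));
console.log(byId.get(3));
```

The trade-off is that returning a box together with its descendants now takes one lookup per referenced box instead of a single document read.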

Pymongo Query Filtering a versioned data

Here's some sample data from my database:
{'data':
[{'is_active': True, 'printer_name': 'A', 'vid': 14510},
{'is_active': True, 'printer_name': 'Changed A', 'vid': 14511}
]
},
{'data':
[{'is_active': False, 'printer_name': 'B', 'vid': 14512}]
}
The vid field here is the version id. Whenever a record is edited, the modified data is pushed into the list and it therefore has a higher vid than its old version.
Now I want to define a method called get_all_active_printers which returns all printers with is_active :True
Here's my attempt, but it returns both printers when it should not return printer B
def get_all_active_printers():
    return printers.find(
        {'data.is_active': True}, {"data": {"$slice": -1}, '_id': 0, 'vid': 0})
What's wrong with my query?
Edit 1 in response to comment by WanBachtiar
Here's the actual output from using the command print([c for c in get_all_active_printers()])
[{'data': [{'printer_name': 'Changed A', 'vid': 1451906336.6602068, 'is_active': True, 'user_id': ObjectId('566bbf0d680fdc1ac922be4c')}]}, {'data': [{'printer_name': 'B', 'vid': 1451906343.8941162, 'is_active': False, 'user_id': ObjectId('566bbf0d680fdc1ac922be4c')}]}]
As you can see in the actual output - the is_active value for Printer B is False, but get_all_active_printers still returns B
Here's my version details:
Python 3.4.3
pymongo 3.2
mongodb 2.4.9
On Ubuntu 14.04, if that matters.
Edit 2
I noticed yet another issue: the query returns the vid field even though I clearly specified 'vid': 0 in the projection.
Edit 3
I am not sure what you mean by "make sure that there is no other documents for {'printer_name': 'B'}". Yes, the data for printer B has a second row. That was the first version, from when the printer was created and is_active was true; later it became false. Here's the snapshot of the database:
But I want to filter on the latest data only, because the old data is kept purely as an audit trail.
If I move 'data.is_active': True into the projection, as in the following code:
def get_all_active_printers():
    return printers.find(
        {}, {'data': {'$slice': -1}, 'data.is_active': True, '_id': 0, 'vid': 0})
I get the following error message:
pymongo.errors.OperationFailure: database error: You cannot currently
mix including and excluding fields. Contact us if this is an issue.
So how do I filter based on the latest data, given the snapshot above? Sorry if my question did not make this clear earlier.

Thanks for clarifying the question.
So you want to match only those documents whose latest element has is_active: True.
Unfortunately, in your case find({'data.is_active': True}) matches every document containing any data element with is_active: True, not just the last element in the array. Also, without knowing the length of the array, you cannot reference its last element using the array.i syntax.
However there are other ways/alternatives:
Update using $push, $each and $position to insert new elements to the front of the array. Mongo Shell example:
/* When updating */
db.printers.update(
  /* Filter document for printer B */
  {"data.printer_name": 'B'},
  /* Push a new document to the front of the array */
  {"$push": {
      "data": {
        $each: [{'is_active': false, 'printer_name': "B", 'vid': 14513 }],
        $position: 0
      }
    }
  }
);
/* When querying now you know that the latest one is on index 0 */
db.printers.find(
  {"data.0.is_active": true},
  {"data": { $slice: 1} }
);
Note that $position is new in MongoDB v2.6.
Use MongoDB aggregation to $unwind the array, $group then $match to filter. For example:
db.printers.aggregate([
  {$unwind: '$data' },
  {$sort: { 'data.vid' : 1 } },
  {$group: {
      '_id': { 'printer_name' : '$data.printer_name', id:'$_id' },
      'vid': { $max: '$data.vid' },
      'is_active' : { $last: '$data.is_active' }
    }
  },
  {$match:{"is_active": true}}
]);
It may be beneficial for you to re-consider the document schema. For example, instead of having an array of documents maybe you should consider having them flat for ease of querying.
{'is_active': true, 'printer_name': 'A', 'vid': 14512}
{'is_active': false, 'printer_name': 'B', 'vid': 14513}
For more examples and discussions of different version tracking schema designs, see these blog posts:
How to track versions with MongoDB
Further thought on how to track versions with MongoDB
Also a useful reference on schema designs : Data Modeling Introduction.
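The intent of filtering on the latest version only can be illustrated in memory with plain JavaScript. This is not a MongoDB query: latestVersion and getAllActivePrinters are hypothetical helpers. The sample data mirrors the question, with an earlier active version added for printer B as described in Edit 3; the vid values for B are chosen to line up with the update example above and are otherwise an assumption. The sketch also assumes every document has at least one version in data.

```javascript
const printers = [
  {
    data: [
      { is_active: true, printer_name: 'A', vid: 14510 },
      { is_active: true, printer_name: 'Changed A', vid: 14511 },
    ],
  },
  {
    data: [
      { is_active: true, printer_name: 'B', vid: 14512 },
      { is_active: false, printer_name: 'B', vid: 14513 },
    ],
  },
];

// Return the array element with the highest vid, i.e. the latest version.
function latestVersion(doc) {
  return doc.data.reduce((latest, version) => (version.vid > latest.vid ? version : latest));
}

// Keep only printers whose LATEST version is active; older versions are ignored.
function getAllActivePrinters(docs) {
  return docs.map(latestVersion).filter((version) => version.is_active);
}

const active = getAllActivePrinters(printers);
console.log(active.map((version) => version.printer_name));
```

Printer B is excluded even though an older version of it was active, which is the behaviour the original query could not express.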
"The query is returning vid field, even though have clearly mentioned 'vid': 0 in the projection."
You could hide it with "data.vid": 0 instead of 'vid': 0, because vid is a field of the array elements rather than of the top-level document.
"If i move 'data.is_active': True to the projections as in the following code... I get the following error message."
There are rules for projections that you have to follow. In particular, as the error message says, a projection cannot mix included and excluded fields (apart from _id). Please see projecting fields from query results for more information on projections.
If you are starting a new project, I would recommend using the latest stable release of MongoDB, which is currently v3.2.0.
Regards,
Wan.
