Algolia AND search through an array

I am looking for a way to search in Algolia a record where at least one element of an array meets several conditions.
As an example, imagine this kind of record:
{
  "name": "Shoes",
  "price": 100,
  "prices": [
    {
      "start": 20160101,
      "end": 20160131,
      "price": 50
    },
    {
      "start": 20160201,
      "end": 20160229,
      "price": 80
    }
  ]
}
I am looking for a way to do a query like the following:
prices.price<60 AND prices.start<=20160210 AND prices.end>=20160210
(A product where the price is less than 60 for the given date)
That query should return nothing, because no single price entry satisfies all three conditions for that date, but the record is returned anyway. Presumably the conditions are evaluated "globally" across all elements of the prices array rather than against a single element.
I am a beginner with Algolia and trying to learn. Is there a way I can do the desired request or will I have to go for a separate index for prices and use multiple queries?
Thanks.

When a facetFilter or tagFilter is applied on an array, Algolia's engine checks if any element of the array matches and then goes to the next condition.
The reason it behaves that way and not the way you expected is simple: let's assume you have an array of strings (like tags):
{ tags: ['public', 'cheap', 'funny', 'useless'] }
When a user wants to find a record that is "useless" and "funny", this user is not looking for a tag that is both "useless" and "funny" at the same time, but for a record containing both tags in the array.
The solution for this is to denormalize your object in some way: transforming a record with an array of objects into multiple records with one object each.
So you would transform
{
  "name": "Shoes",
  "price": 100,
  "prices": [
    { "start": 20160101, "end": 20160131, "price": 50 },
    { "start": 20160201, "end": 20160229, "price": 80 }
  ]
}
into
[
  {
    "name": "Shoes",
    "default_price": 100,
    "price": { "start": 20160101, "end": 20160131, "price": 50 }
  },
  {
    "name": "Shoes",
    "default_price": 100,
    "price": { "start": 20160201, "end": 20160229, "price": 80 }
  }
]
You could either do this in the same index (and use distinct to de-duplicate), or have one index per month or day. It really depends on your use-case, which makes it a bit hard to give you the right solution.
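A minimal sketch of that denormalization in JavaScript, following the record shape above (the objectID scheme is an assumption; any unique, stable ID works):

```javascript
// Split a record with a `prices` array into one record per price entry,
// so that numeric filters such as price.price < 60 AND price.start <= X
// are evaluated against a single price object at a time.
function denormalize(record) {
  return record.prices.map((p, i) => ({
    // Hypothetical objectID scheme, for illustration only.
    objectID: `${record.name}-${i}`,
    name: record.name,
    default_price: record.price,
    price: p,
  }));
}

const records = denormalize({
  name: "Shoes",
  price: 100,
  prices: [
    { start: 20160101, end: 20160131, price: 50 },
    { start: 20160201, end: 20160229, price: 80 },
  ],
});
// records[0].price.price === 50, records[1].price.price === 80
```

You would run this over your catalog before pushing objects to the index.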

Related

Find count of records till a specific row index in MongoDB using Spring Data Criteria query with Pageable

I have 200 records in my MongoDB collection and I want to implement a basic search with auto-suggestion. When I type something in the search text box, matching records should be displayed as suggestions in a drop-down. When I select a record from the suggestion list, I want to find the page number that particular record is on, so that I can fetch all the 50 records of that page and highlight the selected record in my data panel where I show all the records.
I am stuck on the query for finding the page number of the selected record: I pass the ID of the selected record from the UI, and based on that ID I want to find its page number.
I tried a count query, but I did not find a way in MongoDB to count the number of records up to a particular row index, from which I could calculate the page number.
[
  { "name": "ABC1", "id": "33d47e5da3a08555dceb17e34" },
  { "name": "ABC2", "id": "5f47e5da3a08555dceb17e8f" },
  { "name": "ABC3", "id": "5f47e5da3a08555dceb17e99" },
  { "name": "ABC4", "id": "5f47e5da3a08555dceb17e77" },
  { "name": "ABC5", "id": "5f47e5da3a08555dceb17e66" },
  { "name": "ABC6", "id": "5f47e5da3a08555dceb17e55" },
  { "name": "ABC7", "id": "5f47e5da3a08555dceb17e78" },
  { "name": "ABC8", "id": "5f47e5da3a08555dceb17e68" },
  { "name": "ABC9", "id": "5f47e5da3a08555dceb17e55" }
]
I want to know which pageNumber the record below is on. For that I need the total record count up to the ABC5 record.
{
"name": "ABC5",
"id": "5f47e5da3a08555dceb17e66"
}
So that I can find the pageNumber like this:
final int pageLimit = 3;
int pageNumber = count / pageLimit;
Page<T> page = repository.findAll(new PageRequest(pageNumber, pageLimit));
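As a side note, the page arithmetic itself can be sketched in plain JavaScript (assuming the count is the zero-based index of the record in the sorted collection):

```javascript
// Given the zero-based position of a record in the sorted collection,
// integer division by the page size yields the zero-based page number.
function pageNumberOf(index, pageLimit) {
  return Math.floor(index / pageLimit);
}
```

With pageLimit = 3, ABC5 sits at zero-based index 4 and lands on page 1 (the second page), which holds ABC4, ABC5 and ABC6, matching the expected final result.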
And this should be my final Result:
[
  { "name": "ABC4", "id": "5f47e5da3a08555dceb17e77" },
  { "name": "ABC5", "id": "5f47e5da3a08555dceb17e66" },
  { "name": "ABC6", "id": "5f47e5da3a08555dceb17e55" }
]
Can someone please help on this?
Thanks in advance.

How to implement a RESTful API for order changes on large collection entries?

I have an endpoint that may contain a large number of resources. They are returned in a paginated list. Each resource has a unique id, a rank field and some other data.
Semantically the resources are ordered with respect to their rank. Users should be able to change that ordering. I am looking for a RESTful interface to change the rank field in many resources in a large collection.
Reordering one resource may result in a change of the rank fields of many resources. For example consider moving the least significant resource to the most significant position. Many resources may need to be "shifted down in their rank".
The collection being paginated makes the problem a little tougher. There has been a similar question before about a small collection.
The rank field is an integer type. I could change its type if it results in a reasonable interface.
For example:
GET /my-resources?limit=3&marker=234 returns:
{
  "pagination": {
    "prevMarker": 123,
    "nextMarker": 345
  },
  "data": [
    { "id": 12, "rank": 2, "otherData": {} },
    { "id": 35, "rank": 0, "otherData": {} },
    { "id": 67, "rank": 1, "otherData": {} }
  ]
}
Considered approaches.
1) A PATCH request for the list.
We could modify the rank fields with the standard json-patch request. For example the following:
[
  { "op": "replace", "path": "/data/0/rank", "value": 0 },
  { "op": "replace", "path": "/data/1/rank", "value": 1 },
  { "op": "replace", "path": "/data/2/rank", "value": 2 }
]
The problems I see with this approach:
a) The patch operations address resources by array index in path. Each resource already has a unique ID; I would rather use that.
b) I am not sure what the array index should refer to in a paginated collection. I guess it should refer to the global index once all pages are received and merged back to back.
c) The index of a resource in the collection may be changed by other clients, so what the current client thinks is at index 1 may not be at that index anymore. I guess one could add test operations to the patch request first, so the full patch request would look like:
[
  { "op": "test", "path": "/data/0/id", "value": 12 },
  { "op": "test", "path": "/data/1/id", "value": 35 },
  { "op": "test", "path": "/data/2/id", "value": 67 },
  { "op": "replace", "path": "/data/0/rank", "value": 0 },
  { "op": "replace", "path": "/data/1/rank", "value": 1 },
  { "op": "replace", "path": "/data/2/rank", "value": 2 }
]
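For illustration, a minimal sketch of how a server might apply such a test-then-replace patch. Only the two op types used above are supported, and `applyPatch` is a hypothetical helper, not part of any framework; a real server would use a complete JSON Patch (RFC 6902) library:

```javascript
// Apply a restricted subset of JSON Patch: "test" ops must match or the
// whole patch is rejected; "replace" ops overwrite the addressed value.
function applyPatch(doc, ops) {
  for (const op of ops) {
    const parts = op.path.slice(1).split("/"); // "/data/0/rank" -> ["data","0","rank"]
    const last = parts.pop();
    const parent = parts.reduce((node, key) => node[key], doc);
    if (op.op === "test") {
      if (parent[last] !== op.value) throw new Error(`test failed at ${op.path}`);
    } else if (op.op === "replace") {
      parent[last] = op.value;
    }
  }
  return doc;
}
```

Note that the test ops only narrow the race window; without transactional semantics on the server, a concurrent reorder can still slip in between the test and the replace.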
2) Make the collection a "dictionary"/JSON object and use a patch request for a dictionary.
The advantage of this approach over 1) is that we could use the unique IDs in path in patch operations.
The "data" in the returned resources would not be a list anymore:
{
  "pagination": {
    "prevMarker": 123,
    "nextMarker": 345
  },
  "data": {
    "12": { "id": 12, "rank": 2, "otherData": {} },
    "35": { "id": 35, "rank": 0, "otherData": {} },
    "67": { "id": 67, "rank": 1, "otherData": {} }
  }
}
Then I could use the unique ID in the patch operations. For example:
{ "op": "replace", "path": "/data/12/rank", "value": 0 }
The problems I see with this approach:
a) The my-resources collection can be large, and I have difficulty with the meaning of a paginated JSON object, or a paginated dictionary. I am not sure whether an iteration order can be defined over such a large object.
3) Have a separate endpoint for modifying the ranks with PUT
We could add a new endpoint, PUT /my-resource-ranks, and expect the complete ordered list of IDs to be passed in the PUT request. For example:
[
  { "id": 12 },
  { "id": 35 },
  { "id": 67 }
]
We would make the MyResource.rank a readOnly field so it cannot be modified through other endpoints.
The problems I see with this approach:
a) The need to send the complete ordered list. The PUT request for /my-resource-ranks would include only the unique IDs of the resources, not any other data. That is less severe than sending the full resources, but the complete ordered list can still be large.
4) Avoid the MyResource.rank field and let the "rank" be the order in the /my-collections response.
The returned resources would not have the "rank" field; they would already be sorted by rank in the response:
{
  "pagination": {
    "prevMarker": 123,
    "nextMarker": 345
  },
  "data": [
    { "id": 35, "otherData": {} },
    { "id": 67, "otherData": {} },
    { "id": 12, "otherData": {} }
  ]
}
The user could change the ordering with the move operation in json-patch.
[
  { "op": "test", "path": "/data/2/id", "value": 12 },
  { "op": "move", "from": "/data/2", "path": "/data/0" }
]
The problems I see with this approach:
a) I would prefer the server to be free to return /my-collections in an order that is "arbitrary" from the client's point of view. As long as the order is consistent, the optimal order for a simpler server implementation may differ from the rank defined by the application.
b) The same concern as 1)b): does the index in the patch operation refer to the global index once all pages are received and merged back to back, or to the index within the current page?
Update:
Does anyone know further examples from an existing public API? I am looking for further inspiration. So far I have:
Spotify's Reorder a Playlist's Tracks
Google Tasks: change order, move
I would:
Use PATCH.
Define a specialized content-type specifically for updating the order.
The application/json-patch+json type is pretty great for doing straight-up modifications, but I think your use-case is unique enough to warrant a useful, minimal specialized content-type.
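As a sketch of what such a specialized reorder operation could look like server-side. The { move, before } payload shape is invented for illustration, not part of any standard:

```javascript
// Apply a hypothetical { move: <id>, before: <id> } operation to the ordered
// list of IDs, then recompute every rank from the new positions.
function applyMove(ids, payload) {
  const result = ids.filter((id) => id !== payload.move);
  const target = result.indexOf(payload.before);
  result.splice(target, 0, payload.move);
  return result.map((id, rank) => ({ id, rank }));
}

const reordered = applyMove([35, 67, 12], { move: 12, before: 35 });
// Moves id 12 to the most significant position; 35 and 67 shift down.
```

The appeal of a payload like this is that the client sends one small operation instead of the complete ordered list, and the server owns the rank renumbering.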

Trying to aggregate on multiple fields in Mongoose

I have created a time entry system in which users can enter the amount of time (as a percentage) spent on a task in a given time period. Each record looks like the following; I changed the user _id to explicit names to make it easier to visualize:
"project_name": "first_project",
"linked_project": "5bd057f5d4b8173d88b7fe47",
"percentage": 25,
"user": {
"$oid": "Steve"
},
"project_name": "first_project",
"linked_project": "5bd057f5d4b8173d88b7fe47",
"percentage": 50,
"user": {
"$oid": "Steve"
},
"project_name": "second_project",
"linked_project": "5bd057f5d4b8173d88b7fe48",
"percentage": 25,
"user": {
"$oid": "Steve"
},
"project_name": "second_project",
"linked_project": "5bd057f5d4b8173d88b7fe48",
"percentage": 75,
"user": {
"$oid": "Mary"
},
I'm trying to group first by user and then by project. Basically I want a total of how much time each user has spent on each project. I'm not sure whether what I am trying to achieve is even possible. I have included the desired output below:
Example output:
[
  {
    "user": "Steve",
    "projects": {
      "first_project": 75,
      "second_project": 25
    }
  },
  {
    "user": "Mary",
    "projects": {
      "second_project": 75
    }
  }
]
I've tried a variety of ways to achieve this and I haven't come close. Hopefully someone has some insight on how to achieve this.
You can use two $group stages: one to sum the percentages for each user and project_name combination, and a second to push all the project totals for each user.
db.colname.aggregate([
  {"$group": {
    "_id": {
      "user": "$user",
      "project_name": "$project_name"
    },
    "time": {"$sum": "$percentage"}
  }},
  {"$group": {
    "_id": "$_id.user",
    "projects": {"$push": {"project_name": "$_id.project_name", "time": "$time"}}
  }}
])
To get projects as a single merged document instead of an array, you can use the following in the last $group stage:
"projects":{"$mergeObjects":{"$arrayToObject":[[["$_id.project_name","$time"]]]}}

MongoDb query - aggregation, group, filter, max

I am trying to figure out a specific MongoDB query, so far unsuccessfully.
Documents in my collection look something like this (they contain more attributes, which are irrelevant for this query):
[{
  "_id": ObjectId("596e01b6f4f7cf137cb3d096"),
  "code": "A",
  "name": "name1",
  "sys": {
    "cts": ISODate("2017-07-18T12:40:22.772Z")
  }
},
{
  "_id": ObjectId("596e01b6f4f7cf137cb3d097"),
  "code": "A",
  "name": "name2",
  "sys": {
    "cts": ISODate("2017-07-19T12:40:22.772Z")
  }
},
{
  "_id": ObjectId("596e01b6f4f7cf137cb3d098"),
  "code": "B",
  "name": "name3",
  "sys": {
    "cts": ISODate("2017-07-16T12:40:22.772Z")
  }
},
{
  "_id": ObjectId("596e01b6f4f7cf137cb3d099"),
  "code": "B",
  "name": "name3",
  "sys": {
    "cts": ISODate("2017-07-10T12:40:22.772Z")
  }
}]
What I need is to get the current version of each document, filtered by code or name, or both. "Current version" means that from two (or more) documents with the same code, I want to pick the one with the latest sys.cts value.
So the result of this query executed with the filter name="name3" would be the 3rd document in the list above. The result of the query without any filter would be the 2nd and 3rd documents.
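For reference, the "latest document per code" rule can be sketched in plain JavaScript (timestamps compared as ISO strings, which sort lexicographically):

```javascript
// Keep, for each code, the document whose sys.cts value is the greatest.
function latestPerCode(docs) {
  const latest = new Map();
  for (const doc of docs) {
    const current = latest.get(doc.code);
    if (!current || doc.sys.cts > current.sys.cts) latest.set(doc.code, doc);
  }
  return [...latest.values()];
}

const currentDocs = latestPerCode([
  { code: "A", name: "name1", sys: { cts: "2017-07-18T12:40:22.772Z" } },
  { code: "A", name: "name2", sys: { cts: "2017-07-19T12:40:22.772Z" } },
  { code: "B", name: "name3", sys: { cts: "2017-07-16T12:40:22.772Z" } },
  { code: "B", name: "name3", sys: { cts: "2017-07-10T12:40:22.772Z" } },
]);
// currentDocs holds name2 (code A) and the 2017-07-16 version of name3 (code B)
```

In the aggregation framework, the usual equivalent is a $sort on sys.cts descending followed by a $group on code taking $first of each field, with a $match stage in front for the name/code filters.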
I have an idea how to construct this query with a changed data model, but I was hoping someone could point me the right way without doing so.
Thank you

Does the OData protocol provide a way to transform an array of objects to an array of raw values?

Is there a way to specify in an OData query that instead of certain name/value pairs being returned, a raw array should be returned instead? For example, suppose I have an OData query that results in the following:
{
  "@odata.context": "http://blah.org/MyService/$metadata#People",
  "value": [
    {
      "Name": "Joe Smith",
      "Age": 55,
      "Employers": [
        { "Name": "Acme", "StartDate": "1/1/1990" },
        { "Name": "Enron", "StartDate": "1/1/1995" },
        { "Name": "Amazon", "StartDate": "1/1/1999" }
      ]
    },
    {
      "Name": "Jane Doe",
      "Age": 30,
      "Employers": [
        { "Name": "Joe's Crab Shack", "StartDate": "1/1/2007" },
        { "Name": "TGI Fridays", "StartDate": "1/1/2010" }
      ]
    }
  ]
}
Is there anything I can add to the query to instead get back:
{
  "@odata.context": "http://blah.org/MyService/$metadata#People",
  "value": [
    {
      "Name": "Joe Smith",
      "Age": 55,
      "Employers": [
        [ "Acme", "1/1/1990" ],
        [ "Enron", "1/1/1995" ],
        [ "Amazon", "1/1/1999" ]
      ]
    },
    {
      "Name": "Jane Doe",
      "Age": 30,
      "Employers": [
        [ "Joe's Crab Shack", "1/1/2007" ],
        [ "TGI Fridays", "1/1/2010" ]
      ]
    }
  ]
}
While I could obviously do the transformation client side, in my use case the field names are very large compared to the data, and I would rather not transmit all those names over the wire nor spend the CPU cycles on the client doing the transformation. Before I come up with my own custom parameters to indicate that the format should be as I desire, I wanted to check if there wasn't already a standardized way to do so.
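For comparison, the client-side transformation mentioned above could be sketched like this (the [Name, StartDate] field order in each raw array is an assumption):

```javascript
// Turn each Employers entry from a name/value object into a raw
// [Name, StartDate] array, mirroring the desired response shape.
function flattenEmployers(response) {
  return {
    ...response,
    value: response.value.map((person) => ({
      ...person,
      Employers: person.Employers.map((e) => [e.Name, e.StartDate]),
    })),
  };
}
```

This is cheap in code but, as noted, it still pays for the field names on the wire.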
OData provides several options to control the amount of data and metadata to be included in the response.
In OData v4, you can add odata.metadata=minimal to the Accept header parameters (check the documentation here). This is the default behaviour, but even with it the field names are still included in the response, and for a good reason.
I can see why you want to send only the values without the field names, but keep in mind that this changes the semantic meaning of the response structure and makes it less intuitive to deal with as a JSON record on the client side.
So to answer your question: no, there is no standardized way to do this.
Other options to minimize the response size:
You can use the $value option to get the raw value of a single property. Check this example:
services.odata.org/OData/OData.svc/Categories(1)/Products(1)/Supplier/Address/City/$value
You can also use the $select option to cherry-pick only the fields you need by selecting a subset of properties to include in the response.