best way to extract sort value from hit? - scala

I'm currently sorting data with multiple sorts, and the number of sorts I use, as well as their order, differs depending on the search we're doing. I'm simply trying to output the distance from the document location to a fixed point. I can do this with script fields pretty easily. However, I want to know if I can leverage the sort distance values I already have instead of recomputing them, maybe using a script field but without using the arcDistance() function (as I assume this would add latency).
For example, I'm getting the following in the response:
"sort": [
0.0,
-92322255895,
2,
0,
0,
"baxa",
18736.217, // this is the distance in meters; I'd like to somehow extract this easily
"test"
]
In code I'm able to get the sort values, but I have no idea which index is associated with the distance at runtime unless I store it somewhere (which I can do, but I'm wondering if there's a better way).
Any help would be appreciated!
Thanks.
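
One possible approach, sketched here in Python rather than Scala: sort values come back in the same order as the sort clauses in the request, so the index can be derived from the clause list at the moment the request is built instead of being stored separately. The clause contents, field names, and the `_geo_distance` key below are assumptions about the query, not taken from the question.

```python
# Hypothetical sort clauses, in the order they are sent with the search request.
sort_clauses = [
    {"_score": "desc"},
    {"created": "desc"},
    {"_geo_distance": {"location": {"lat": 40.0, "lon": -70.0},
                       "order": "asc", "unit": "m"}},
    {"name": "asc"},
]


def sort_index(clauses, key):
    """Return the position of the first sort clause whose top-level key is `key`."""
    for i, clause in enumerate(clauses):
        if key in clause:
            return i
    raise KeyError(key)


def distance_from_hit(hit, clauses):
    """Read the distance out of the hit's sort values instead of recomputing it."""
    return hit["sort"][sort_index(clauses, "_geo_distance")]


# A hit shaped like the response fragment above (shortened to four sort values).
hit = {"_id": "1", "sort": [0.0, -92322255895, 18736.217, "test"]}
print(distance_from_hit(hit, sort_clauses))
```

Because the index is computed from the same list that produces the request, it stays correct when the number or order of sorts changes between searches.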

Related

Algolia aroundLatLngViaIP not returning results further away

I have a large database of objects, and I'd like to show objects that are nearer to the searching user first, so I'm using aroundLatLngViaIP.
This works well for objects that are nearby. However, if there aren't any nearby, it doesn't show any that are further away, even if there is an exact text match.
Is it possible to use aroundLatLngViaIP to promote results nearby but not exclude those that are far away?

To achieve this, you need to use aroundRadius: all as an additional query parameter. Quoting from the doc:
The special value all causes the geo distance to be computed and taken into account for ranking, but without filtering; this option is faster than specifying a high integer value.
https://www.algolia.com/doc/api-reference/api-parameters/aroundRadius/
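
As a rough illustration, the two parameters can be combined in one set of search parameters. This sketch only builds the parameter dictionary; the function name is hypothetical and how the dictionary is passed to a client is left out.

```python
def geo_ranked_params(query):
    """Build search parameters that rank by distance without filtering by it."""
    return {
        "query": query,
        "aroundLatLngViaIP": True,  # locate the user from the request IP
        "aroundRadius": "all",      # use distance for ranking only, no radius filter
    }


params = geo_ranked_params("coffee")
print(params)
```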

Possible to do TSP with mongodb?

I'm working on a map app where I can add waypoints along specific routes. I need to pull my waypoints out in order, obviously, so I can get directions from A-D in the correct order.
I've read a bit about GeoJSON in MongoDB, but I'm curious if there's a way to query my data so that my points come out ordered by how close they are to each other rather than in the order I've put them in.
Basically what I'm asking is: is there a way to do a "Traveling Salesman query" so that my waypoints are ordered in the smartest order?

I think the short answer is no. You're going to need to add an order key to your waypoints. This is a pretty standard pattern for any nav system with waypoints. At the very least you need to know which point is first and which is last so that you can then solve the TSP.
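
A minimal sketch of computing such an order key in application code, assuming the first point is known. It uses a greedy nearest-neighbour pass, which is a heuristic and not an exact TSP solution; the function names and the `order`/`loc` fields are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def order_waypoints(start, waypoints):
    """Greedy nearest-neighbour ordering, beginning at `start`."""
    remaining = list(waypoints)
    route, current = [], start
    while remaining:
        nearest = min(remaining, key=lambda p: haversine_km(current, p))
        remaining.remove(nearest)
        route.append(nearest)
        current = nearest
    return route


# Assign the order key before saving the waypoint documents.
start = (0.0, 0.0)
stops = [(0.0, 3.0), (0.0, 1.0), (0.0, 2.0)]
ordered = [{"order": i, "loc": p}
           for i, p in enumerate(order_waypoints(start, stops), start=1)]
print(ordered)
```

Once the key is stored, reading the waypoints back in route order is a plain sort on `order`.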

Complex URL handling conception

I'm currently struggling with a complex URL handling concept question. The application has a product property database table/collection with all the different product types (i.e. categories, colors, manufacturers, materials, etc.).
{_id:1,alias:"mercedes-benz",type:"brand"},
{_id:2,alias:"suv-cars",type:"category"},
{_id:3,alias:"cars",type:"category"},
{_id:4,alias:"toyota",type:"manufacturer"},
{_id:5,alias:"red",type:"color"},
{_id:6,alias:"yellow",type:"color"},
{_id:7,alias:"bmw",type:"manufacturer"},
{_id:8,alias:"leather",type:"material"}
...
Now the mission is to handle URL requests in the style below, in every(!) possible order, to retrieve the included product properties. The only allowed separator character is the dash (a settled SEO requirement). Some properties can also include dashes themselves, which I think is an important point, e.g. the category "suv-cars" or the manufacturer "mercedes-benz":
http://www.example.com/{category}-{color}-{manufacturer}-{material}
http://www.example.com/{color}-{manufacturer}
http://www.example.com/{color}-{category}-{material}-{manufacturer}
http://www.example.com/{category}-{color}-nonexistingproperty-{manufacturer}
http://www.example.com/{color}-{category}-{manufacturer}
http://www.example.com/{manufacturer}
http://www.example.com/{manufacturer}-{category}-{color}-{material}
http://www.example.com/{category}
http://www.example.com/{manufacturer}-nonexistingproperty-{category}-{color}-{material}
http://www.example.com/{color}-crap-{manufacturer}
...
...so: every order of the properties should be allowed! The result has to be the information about the properties used in each URL request (BTW yes, the duplicate content will be fixed by redirects and a predefined schema). The "nonexistingproperty"/"crap" parts are possible and should just be ignored.
UPDATE:
Idea 1: One way I'm thinking about the question is to split the query string by dashes and analyze it value by value. The problem: with properties made of two, three, or more words there are too many different combinations and variations, so it takes a lot of queries, which I think kills this idea.
Idea 2: The other way is to build a (in my opinion) too large alias/URL table with all of the different combinations, but I think that's just an ugly workaround. There are about 15,000 different properties, so the number of aliases across the different sort orders kills this idea.
Idea 3: It's your turn! Thanks for your mind and your time.

While your question is a bit broad, below are some ideas. There isn't a single awesome answer unless you find a free or commercial engine for this that works exactly the way you want.
The way I thought about your problem was to consider the URL as a list of keywords.
- Use Lucene as a keyword/tag system. It's good at the types of searches you suggest you want, including phrases, stems, etc.
- Store and index the data in the DB of your choice, but pull the keywords into memory and build a bit index of all keywords vs. items. Iterate through the keyword table producing weighted results. If the order of keywords matters, you'll also need to make a pass through the result set to weight based on word order. These types of searches always need to cap their result set quickly in order to return results quickly.
- Cache the results from working matches like crazy, and give precedence to the results that users seem to click on the most for a given URL.
- Attack the database by using tag indexes in MongoDB. You'd still need to merge and weight results. Very intensive and not likely a good use of DB resources.
- Read some of the academic papers on keyword searches. It's a popular topic.
- Build a table of words that have dashes in them, and normalize/convert those before running your queries.
- Always check for full exact matches first.
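
The last two ideas can be sketched as a longest-match scan over the dash-separated words, assuming all aliases fit in an in-memory table (about 15,000 entries in the question). The table contents mirror the sample records from the question; the names `ALIASES`, `MAX_WORDS`, and `parse_slug` are hypothetical.

```python
# Alias table held in memory, keyed by alias, taken from the sample records.
ALIASES = {
    "mercedes-benz": "brand",
    "suv-cars": "category",
    "cars": "category",
    "toyota": "manufacturer",
    "red": "color",
    "yellow": "color",
    "bmw": "manufacturer",
    "leather": "material",
}
# Longest alias, measured in dash-separated words.
MAX_WORDS = max(alias.count("-") + 1 for alias in ALIASES)


def parse_slug(slug):
    """Return (type, alias) pairs found in the slug, trying the longest match first."""
    words = slug.split("-")
    found = []
    i = 0
    while i < len(words):
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            candidate = "-".join(words[i:i + n])
            if candidate in ALIASES:
                found.append((ALIASES[candidate], candidate))
                i += n
                break
        else:
            i += 1  # unknown word such as "crap": ignore it and move on
    return found


print(parse_slug("suv-cars-red-crap-mercedes-benz"))
```

Each URL costs a handful of dictionary lookups rather than one query per word combination, and trying the longest candidate first keeps "suv-cars" from being read as the shorter "cars".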

The only way this may work is if you restrict all property values to be unique. So you make a single set of categories + colors + manufacturers, etc., in which every value is unique. This allows you to find which property a value belongs to.
The data structure for this should be fairly simple:
{_id:ValueOfTheProperty, Property:TypeOfProperty}
Here are some possible samples:
{ _id: Red, Property: Color }
{ _id: Green, Property: Color }
{ _id: Boots, Property: Category }
{ _id: Shoes, Property: Category }
...
This way, the order does not matter, and you are able to convert them in a single pass to a map:
{ Color: Red, Category: Boots }
Though I predict some problems with ambiguous names here.
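
A small sketch of that single pass, plus a check that surfaces ambiguous names before they are loaded into the collection. The "Orange" records are a hypothetical clash added for illustration, and `build_index` is an assumed helper name.

```python
def build_index(records):
    """Map each value to its property; collect values claimed by more than one property."""
    index, ambiguous = {}, set()
    for rec in records:
        value, prop = rec["_id"], rec["Property"]
        if value in index and index[value] != prop:
            ambiguous.add(value)
        index.setdefault(value, prop)  # keep the first property seen for a value
    return index, ambiguous


records = [
    {"_id": "Red", "Property": "Color"},
    {"_id": "Green", "Property": "Color"},
    {"_id": "Boots", "Property": "Category"},
    {"_id": "Shoes", "Property": "Category"},
    {"_id": "Orange", "Property": "Color"},
    {"_id": "Orange", "Property": "Brand"},  # hypothetical clash
]

index, ambiguous = build_index(records)

# Convert URL values to a property map in one pass; unknown values are dropped.
result = {index[v]: v for v in ["Red", "Boots", "unknown"] if v in index}
print(result, ambiguous)
```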

How can I get the min/max value of a sub-array in MongoDB?

So I have a situation where I'm tagging documents using an array, such as:
tags: [
'Housing' : 10,
'Retail' : 1,
'Stocks' : 25,
]
I was just saving the tags themselves, but have recently added the numbers because I need to know rank/position. So the number next to the tag stands for the rank of the document in the set of documents marked with that tag.
It gets tricky in a few places, but right now I'm just trying to figure out how I add another document with one or more of these tags. Let's say I create a new doc and tag it as Housing. Its rank needs to be set to 11, but how do I know that?
The only solution I've been able to find so far is to do a map/reduce to go through the records and find the max value for that tag, add one and save. Now, most records will probably only have 2-3 tags each, but it's theoretically possible to have 10-15. Either way it seems like map/reduce would take a herculean effort once there are a lot of tags and a lot of records...
Is there an easier way or should I just start looking for another solution to this problem?
Edit:
Let me give you a little more detail about the problem I'm trying to solve... I'm displaying images in a slideshow/carousel. The slideshow covers different categories, so using the category/tags from above, you can either view ALL of the images, or just those from Housing, Retail, Stocks, etc. Right now I only have a handful of defined categories, but it's quite possible that these will expand over time. They need to be filtered by the tag, and sorted by date (newest first).
Now, up until this point I've been doing just that. The problem comes in when I want to select an arbitrary image in the middle of the deck. Say you want to load "housing_chart.gif" that was uploaded 6 months ago. I don't want to load 6 months of images in order to get to that image (which is basically what I'm doing now). Instead I want to load that particular image and then be able to paginate it for next/previous images.
But in order to "paginate" the images on the carousel, I have to know the location of that image in the results... without actually getting all of the results and counting. So I figured putting a rank on them would be the way to go, but that causes other problems as well. I don't really like the idea of creating another collection just to store ranks, but that may be the most efficient way of going about it.

What do you need the rank for? Is it ever updated? If not, and if the rank is always increasing, can't you just sort the documents for a particular tag by their date?
I would argue that information about the rank really belongs to the "tag" rather than the document, i.e. each tag should have an associated list of documents with this tag, and the position in that list can then define the rank.
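
To make the sort-by-date suggestion concrete, here is an in-memory sketch, assuming upload dates are unique per tag. The position of an image is the number of same-tag images that are newer, and its neighbours are the closest newer and closest older images. In a database each of these would presumably be a count or a one-document sorted query rather than a scan; the field and function names are hypothetical.

```python
from datetime import date

images = [
    {"name": "a.gif", "tags": ["Housing"], "uploaded": date(2023, 1, 5)},
    {"name": "housing_chart.gif", "tags": ["Housing", "Stocks"], "uploaded": date(2023, 3, 1)},
    {"name": "b.gif", "tags": ["Retail"], "uploaded": date(2023, 4, 2)},
    {"name": "c.gif", "tags": ["Housing"], "uploaded": date(2023, 6, 9)},
]


def position(images, tag, target):
    """Zero-based position of `target` among `tag` images sorted newest first."""
    return sum(1 for img in images
               if tag in img["tags"] and img["uploaded"] > target["uploaded"])


def neighbours(images, tag, target):
    """Return (previous, next) around `target` in newest-first order, or None at the ends."""
    tagged = [img for img in images if tag in img["tags"]]
    newer = [img for img in tagged if img["uploaded"] > target["uploaded"]]
    older = [img for img in tagged if img["uploaded"] < target["uploaded"]]
    previous_img = min(newer, key=lambda img: img["uploaded"], default=None)
    next_img = max(older, key=lambda img: img["uploaded"], default=None)
    return previous_img, next_img


target = images[1]
print(position(images, "Housing", target), neighbours(images, "Housing", target))
```

With this approach no rank has to be stored or renumbered when a new image is added.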

Searching for surrounding suburbs based on latitude & longitude using Objective C

I need to identify items of interest nearby to a particular latitude/longitude. I have two sets of objects stored in Core Data:
PostalArea
- latitude
- longitude
- postcode
Store
- name
- latitude
- longitude
I need to be able to retrieve a record from the PostalArea table, and then find stores which are close by.
Most of the examples I've found are SQL based. I'm hoping someone can assist me with getting this working with Core Data. Ideally I would like to limit the result set to a certain number (say, 10), as opposed to limiting it based on distance.
EDIT: pulling all the records into memory is an option, if need be.

The most common solution to this is to determine a min/max for both latitude and longitude and then build a predicate from those. That narrows your search to a bounding box, and with the remaining objects in memory you can then sort by distance to the point.
Update
Once you have the objects in memory you can then do some fun things with them. For instance, you could have a transient value called 'currentPoint' and then use KVC on the resulting array, such as:
[resultingArray setValue:aPoint forKey:@"currentPoint"];
Then you could have a method that returns the distance from that point which you can sort by.
That is one example, I am sure there are other ways but the general idea is to get a subset of locations into memory so that you can then calculate distance and finally sort.
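
The general idea can be sketched outside Core Data as follows, written in Python for brevity. The attribute names follow the entities in the question; the box size in degrees, the haversine formula as the distance measure, and the function names are assumptions.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def nearest_stores(area, stores, box_deg=0.5, limit=10):
    """Filter stores to a lat/long box around the postal area, then sort by distance."""
    lat, lon = area["latitude"], area["longitude"]
    # Step 1: the cheap min/max filter (the part a fetch predicate would do).
    in_box = [s for s in stores
              if abs(s["latitude"] - lat) <= box_deg
              and abs(s["longitude"] - lon) <= box_deg]
    # Step 2: exact distance sort on the small in-memory subset.
    in_box.sort(key=lambda s: haversine_km(lat, lon, s["latitude"], s["longitude"]))
    return in_box[:limit]


area = {"postcode": "2000", "latitude": -33.87, "longitude": 151.21}
stores = [
    {"name": "B", "latitude": -33.95, "longitude": 151.30},
    {"name": "C", "latitude": -37.81, "longitude": 144.96},
    {"name": "A", "latitude": -33.88, "longitude": 151.20},
]
print([s["name"] for s in nearest_stores(area, stores)])
```

If the box yields fewer stores than the limit, one option is to widen `box_deg` and run the filter again.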

How would you feel about a REST service option? If you use the Google APIs, you can query for exactly those things. There are plenty of JSON interfaces available for Objective-C.