Algolia search results for partial string matches - algolia

Trying to do a pretty basic search implementation of partial matching. For instance, I'd like 'ia hu' to return 'Ian Hunter'. I've got first and last name split so we're indexing first, last and combined.
I was reading the suggestion here, but it just isn't a very elegant or feasible way to solve this: https://www.algolia.com/doc/faq/troubleshooting/how-can-i-make-queries-within-the-middle-of-a-word.
I don't think we should have to generate a ton of substring combos for first and last name to get this to return results.
Has anyone implemented a more elegant solution?

In this specific use case (matching "Ian Hunter" with "ia hu"), you can turn on prefix matching for all query words with queryType=prefixAll (see documentation).
This will not allow infix matching, so "an hu" or "ia un" will not match "Ian Hunter". It therefore cannot be considered a general solution to your question. However, in practice, prefix matching tends to be what people use instinctively; infix matching is relatively rare in my experience.
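To make the prefix versus infix distinction concrete, here is a toy model in plain Scala. It is not Algolia's matching engine (no typo tolerance, ranking, or tokenization rules), and the names PrefixAllSketch and matchesAllPrefixes are made up for illustration; it only encodes the rule "every query word must be a prefix of some word in the record".

```scala
object PrefixAllSketch {
  // Hypothetical helper: true when every whitespace-separated query word
  // is a prefix of at least one word in the record (case-insensitive).
  def matchesAllPrefixes(query: String, record: String): Boolean = {
    val recordWords = record.toLowerCase.split("\\s+").filter(_.nonEmpty)
    val queryWords  = query.toLowerCase.split("\\s+").filter(_.nonEmpty)
    queryWords.forall(q => recordWords.exists(_.startsWith(q)))
  }
}
```

Under this model "ia hu" matches "Ian Hunter", while "an hu" and "ia un" do not, because "an" and "un" only occur in the middle of a word.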

Scala spark: Efficient check if condition is matched anywhere?

What I want is roughly equivalent to
df.where(<condition>).count() != 0
But I'm pretty sure it's not quite smart enough to stop once it finds any such violation. I would expect some sort of aggregator to be able to do this, but I haven't found one. I could do it with a max and some sort of conversion, but again I don't think it would necessarily know to quit (since max is not specific to booleans, I'm not sure it understands that no value is larger than true).
More specifically, I want to check if a column contains only a single distinct value. Right now my best idea is to do this by grabbing the first value and comparing everything else to it.
I would try this option; it should be much faster. Note that isEmpty is true when no row matches, so negate it to get the equivalent of count() != 0:
df.where(<condition>).head(1).isEmpty
You can also try to define your condition on a row together with Scala's exists (which stops at the first occurrence of true); the same negation applies:
df.mapPartitions(rows => if(rows.exists(row => <condition>)) Iterator(1) else Iterator.empty).isEmpty
In the end you should benchmark the alternatives.
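The short-circuiting behaviour of exists can be checked without Spark. The sketch below uses a plain Scala Iterator and a hypothetical helper, findWithCount, that counts how many elements were actually inspected; it says nothing about how Spark schedules partitions, only about what exists does within one iterator.

```scala
object ExistsShortCircuit {
  // Hypothetical helper: returns (found, number of elements inspected).
  def findWithCount(values: Iterator[Int], condition: Int => Boolean): (Boolean, Int) = {
    var inspected = 0
    val found = values.exists { v =>
      inspected += 1   // count every element the predicate is applied to
      condition(v)
    }
    (found, inspected)
  }
}
```

Calling ExistsShortCircuit.findWithCount(Iterator.from(1), _ == 3) terminates even though the iterator is infinite, because exists stops at the first element that satisfies the condition.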

Why are while loops not recommended in scala

The scala style checker says that while loops are deprecated if you’re using a strict functional style - http://www.scalastyle.org/rules-dev.html#org_scalastyle_scalariform_WhileChecker.
I found one related question: Is there any advantage to avoiding while loops in Scala?
It says that mutability will ensure that, in the long run, you'll introduce bugs with a while pattern. How can this happen?
Why is there no check for for loop if immutability is highly restricted?
I have a simple use case where I have to remove all the occurrences of a substring that are present at the end of a string. I could not find a solution for it, which is why I was using loops.
Example - String is "IABCFGHUABCABC" and substring is "ABC". String output should be "IABCFGHU", where all the trailing occurrences of the substring are removed.
Is there any non-imperative and recommended way to solve this problem using Scala?
Why is there no check for for loop if immutability is highly restricted?
Because unlike in C-style for loops, there's no mutability in Scala for:
for (i <- <something>) {
<body>
}
is just another way to write the method call <something>.foreach { i => <body> }.
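As a small check of that equivalence, the two hypothetical methods below do the same work, once with for and once with an explicit foreach call. The names ForDesugar, viaFor and viaForeach are made up for this sketch.

```scala
object ForDesugar {
  // Loop written with for; the compiler rewrites it to a foreach call.
  def viaFor(xs: Seq[Int]): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    for (x <- xs) { out += x * 2 }
    out.result()
  }

  // The same loop written directly as the method call.
  def viaForeach(xs: Seq[Int]): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    xs.foreach { x => out += x * 2 }
    out.result()
  }
}
```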
Is there any non imperative and recommended way to solve this problem using scala?
Yes, of course. As the question you linked says, you can use tail recursion. I won't provide code, but the idea is: if the string doesn't end with the substring, return it; if it does, remove that ending and call the function again with the new arguments. You should think about why this will ultimately return the desired result.
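For readers who want to compare notes afterwards, the idea can be sketched roughly as follows. This is only one possible shape: the names StripTrailing and stripTrailing are made up here, and the guard against an empty substring is an added assumption (without it the recursion would never terminate, since every string ends with the empty string).

```scala
import scala.annotation.tailrec

object StripTrailing {
  // Removes every trailing occurrence of `suffix` from `s`.
  @tailrec
  def stripTrailing(s: String, suffix: String): String =
    if (suffix.isEmpty || !s.endsWith(suffix)) s          // nothing (more) to remove
    else stripTrailing(s.dropRight(suffix.length), suffix) // drop one occurrence, recurse
}
```

Each recursive call works on a strictly shorter string, so the function must eventually reach a string that no longer ends with the substring. With the example from the question, StripTrailing.stripTrailing("IABCFGHUABCABC", "ABC") yields "IABCFGHU".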

Why to use := in Scala? [duplicate]

What is the difference between = and := in Scala?
I have googled extensively for "scala colon-equals", but was unable to find anything definitive.
= in scala is the actual assignment operator -- it does a handful of specific things that for the most part you don't have control over, such as
Giving a val or var a value when it's created
Changing the value of a var
Changing the value of a field on a class
Making a type alias
Probably others
:= is not a built-in operator -- anyone can overload it and define it to mean whatever they like. The reason people like to use := is because it looks very assignmenty and is used as an assignment operator in other languages.
So, if you're trying to find out what := means in the particular library you're using... my advice is look through the Scaladocs (if they exist) for a method named :=.
from Martin Odersky:
Initially we had colon-equals for assignment -- just as in Pascal, Modula, and Ada -- and a single equals sign for equality. A lot of programming theorists would argue that that's the right way to do it. Assignment is not equality, and you should therefore use a different symbol for assignment. But then I tried it out with some people coming from Java. The reaction I got was, "Well, this looks like an interesting language. But why do you write colon-equals? What is it?" And I explained that it's like that in Pascal. They said, "Now I understand, but I don't understand why you insist on doing that." Then I realized this is not something we wanted to insist on. We didn't want to say, "We have a better language because we write colon-equals instead of equals for assignment." It's a totally minor point, and people can get used to either approach. So we decided to not fight convention in these minor things, when there were other places where we did want to make a difference.
from The Goals of Scala's Design
= performs assignment. := is not defined in the standard library or the language specification. It's a name that is free for other libraries or your code to use, if you wish.
Scala allows for operator overloading, where you can define the behaviour of an operator just like you could write a method.
As in other languages, = is an assignment operator.
There is no standard operator I'm aware of called :=, but you could define one with this name. If you see an operator like this, you should check the documentation of whatever you're looking at, or search for where that operator is defined.
There is a lot you can do with Scala operators. You can essentially make an operator out of virtually any characters you like.
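To illustrate that := is just a method name, here is a minimal sketch of a user-defined := on a hypothetical mutable Cell class. Neither Cell nor this meaning of := comes from any real library; a library you encounter may give := a completely different meaning.

```scala
object ColonEqualsDemo {
  // Hypothetical mutable container with a user-defined := method.
  final class Cell[A](initial: A) {
    private var current: A = initial

    // An ordinary method whose name happens to be ":=".
    def :=(newValue: A): Unit = { current = newValue }

    def get: A = current
  }
}
```

With this definition, `cell := 2` is simply infix syntax for the method call `cell.:=(2)`.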

Case Insensitive filtering using Google Guava

Currently I am using the following piece of code to filter a map by key and get the filtered result set.
final Map filteredMap = Maps.filterKeys(mymap, Predicates.containsPattern("^Xyz"));
However Guava Predicates.containsPattern does case-sensitive matching.
How should I use containsPattern to do case-insensitive matching?
Use
Predicates.contains(Pattern.compile("^Xyz", Pattern.CASE_INSENSITIVE))
as predicate instead. See core Java Pattern and Predicates.contains.
EDIT (after OP's comment): yes, you can write:
Predicates.containsPattern("(?i)^Xyz")
(see Pattern's documentation: Case-insensitive matching can also be enabled via the embedded flag expression (?i).) but it's IMO less self-explanatory. In addition, the compiled Pattern from the first example can be stored in a private static final constant when it is used in a loop, which can improve performance.
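The equivalence of the two spellings can be checked with java.util.regex alone. The sketch below deliberately leaves Guava out and calls Matcher.find() directly, on the assumption that this mirrors what the Guava predicates do; the object and method names are made up for illustration.

```scala
import java.util.regex.Pattern

object CaseInsensitivePrefix {
  // Compiled once and reused, as suggested above for use in loops.
  private val ViaFlag: Pattern     = Pattern.compile("^Xyz", Pattern.CASE_INSENSITIVE)
  private val ViaEmbedded: Pattern = Pattern.compile("(?i)^Xyz")

  // Both check whether the key starts with "xyz" in any letter case.
  def matchesViaFlag(key: String): Boolean     = ViaFlag.matcher(key).find()
  def matchesViaEmbedded(key: String): Boolean = ViaEmbedded.matcher(key).find()
}
```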

Finding info on scala operators

I'm reading http://debasishg.blogspot.com/2008/04/external-dsls-made-easy-with-scala.html and I am trying to find info on the "<~" operator, for example:
def trans = "(" ~> repsep(trans_spec, ",") <~ ")"
My guess is that it has something to do with the product ("~") operator along with lists?
What does it do?
In the future, how do I look up operators like that? It is no good to google "<~", for example.
EDIT:
Found the "<~" info in Scala combinator parsers - distinguish between number strings and variable strings
Question 2 remains
On Question 2, unfortunately that is one disadvantage of Scala's allowance of non-alphabetic characters in names: they're not easily found in search engines. Your best bet is simply to check the Scaladocs of whatever code is in scope.
Regarding Question 2, there is an upcoming (time-frame unknown to me) addition to the ScalaDoc processor that will produce a cross-reference index that allows you to look up method and field names and see which classes declare or define them.
You can get a preview of this (not integrated with the ScalaDocs, but useful nonetheless) here: ScalaDoc Name Index