Sort matches by closeness to the beginning of the string - Sphinx

There is a table that contains site URLs.
I want to sort Sphinx results so that the closer the keyword is to the beginning of the string, the more relevant the match.
For the keyword "foo", the correct result order is "foobar.com, barfoo.com, barbarfoo.com".
I have tried:
$s = new SphinxClient;
$s->setServer("localhost", 9312);
$s->SetMatchMode(SPH_MATCH_ALL);
$s->SetSortMode(SPH_SORT_RELEVANCE);
$s->SetFieldWeights(array(
'id' => 0,
'url' => 1000,
));
$result = $s->query("foo");
Unfortunately, the results I get are sorted by id.

Hmm, I don't think Sphinx can do that directly. There are various ranking factors, but they are all based on words.
You can match partial words using 'min_prefix_len', but you can't find out where in the word the match happens, so you can't rank by it.
The only way I can think of to get this working with Sphinx would be to use wordbreaker
http://sphinxsearch.com/blog/2013/01/29/a-new-tool-in-the-trunk-wordbreaker/
to index your domain names as separate words, hoping that your domains would be split correctly as "foo bar com", "bar foo com", "bar bar foo com". You could then rank by word position, e.g. with min_hit_pos
http://sphinxsearch.com/docs/current.html#field-factors
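If ranking inside Sphinx turns out not to be practical, one fallback is to re-order the matched URLs in application code by the character position of the keyword. The sketch below is only an illustration in JavaScript and is not part of the Sphinx API; the function name sortByKeywordPosition is hypothetical, and it assumes the result set is small enough to sort in memory.

```javascript
// Hypothetical helper (not part of Sphinx): re-rank already matched URLs
// by how early the keyword appears in each string.
function sortByKeywordPosition(urls, keyword) {
  const needle = keyword.toLowerCase();
  return urls
    // Record the position of the first occurrence of the keyword.
    .map((url) => ({ url, pos: url.toLowerCase().indexOf(needle) }))
    // Drop entries that do not contain the keyword at all.
    .filter((entry) => entry.pos !== -1)
    // Earlier positions sort first.
    .sort((a, b) => a.pos - b.pos)
    .map((entry) => entry.url);
}

const ranked = sortByKeywordPosition(
  ["barbarfoo.com", "foobar.com", "barfoo.com"],
  "foo"
);
console.log(ranked.join(", ")); // foobar.com, barfoo.com, barbarfoo.com
```

This only re-orders what the search engine already returned, so it does not help with paging across a large result set.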

Count filtered records in scala

I am new to Scala, so this problem might look very basic.
I have a file called data.txt with contents like the following:
xxx.lss.yyy23.com-->mailuogwprd23.lss.com,Hub,12689,14.98904563,1549
xxx.lss.yyy33.com-->mailusrhubprd33.lss.com,Outbound,72996,1.673717588,1949
xxx.lss.yyy33.com-->mailuogwprd33.lss.com,Hub,12133,14.9381027,664
xxx.lss.yyy53.com-->mailusrhubprd53.lss.com,Outbound,72996,1.673717588,3071
I want to split each line and find records based on the number in the host name before "-->" (for example, xxx.lss.yyy23.com):
val data = io.Source.fromFile("data.txt").getLines().map { x => (x.split("-->"))}.map { r => r(0) }.mkString("\n")
which gives me
xxx.lss.yyy23.com
xxx.lss.yyy33.com
xxx.lss.yyy33.com
xxx.lss.yyy53.com
This is how I am trying to count the matching records:
data.count { x => x.contains("33")}
How do I get the count of records that do not contain 33?
The following will give you the number of lines that contain "33":
data.split("\n").count(a => a.contains("33"))
Your version doesn't work because data is no longer a collection of lines: mkString concatenated the results into a single string, using newline as the separator. Calling count on a String iterates over its characters, not its lines, so you need to split data back into an array of strings first.
The following will work for getting the lines that do not contain "33":
data.split("\n").count(a => !a.contains("33"))
You simply need to negate the contains operation in this case.

Searching with multiple keys and "begins with"

What's the best way to perform the following type of search in a collection named "things":
mylist = ['lak', 'dodg', 'ang']
and the return could be:
["lake", "Lakers", "laky", "dodge", "Dodgers", "Angels", "angle"]
Would I need to perform a separate query for each?
To do this, use the MongoDB $in operator to find all things whose name matches one of the entries in your array.
The query would be:
db.things.find( {name: { $in: mylist }} )
For this to work, though, the array needs to contain regular expressions. You can either define them in the array directly or, if you want to keep the strings, create a second array by looping through the strings and building a regex from each one.
mylist = [/^lak/i, /^dodg/i, /^ang/i]
The ^ makes each pattern match only when the value begins with it, and the trailing i makes the search case-insensitive.
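The "loop through and create regex from the strings" step could look roughly like the sketch below. The helper names escapeRegExp and toPrefixRegexes are hypothetical, and the local filter at the end only stands in for the database so the patterns can be checked without a MongoDB connection.

```javascript
// Escape regex metacharacters so each prefix is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one anchored, case-insensitive pattern per prefix string.
function toPrefixRegexes(prefixes) {
  return prefixes.map((prefix) => new RegExp("^" + escapeRegExp(prefix), "i"));
}

const prefixes = ["lak", "dodg", "ang"];
const patterns = toPrefixRegexes(prefixes);

// In the mongo shell the array would then be passed to $in:
//   db.things.find({ name: { $in: patterns } })

// Local check against sample names (an assumption, not real collection data).
const names = ["lake", "Lakers", "Dodgers", "Angels", "flake", "dog"];
const matched = names.filter((name) => patterns.some((re) => re.test(name)));
console.log(matched); // [ 'lake', 'Lakers', 'Dodgers', 'Angels' ]
```

Escaping the input matters if the prefixes come from users, since a string such as "a.b" would otherwise be read as a pattern rather than as literal text.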

How to replace words after first two words

Let's say I have a full name like: Wan Ahmad Wan Dollah Karmat.
And I want to display it as: Wan Ahmad W.D.K
I tried this code:
preg_replace('/(.)[^\s]+\s?/', '${1}.', strtoupper($_GET['fullname']), 2)
But the output is: W.A.Wan Dollah Karmat
I want to keep the first two words and abbreviate the remaining words. Please help.
Problem solved, thanks to Casimir et Hippolyte. The final code is:
preg_replace('~^(?:\s*\S+){1,2}(*SKIP)(*FAIL)|(\S)\S+~', '${1}.', strtoupper($_GET['fullname']))
It was a matter of getting the pattern right.
You can use the backtracking control verbs (*SKIP) and (*FAIL) to avoid the two first words.
$pattern = '~^(?:\s*\S+){1,2}(*SKIP)(*FAIL)|(\S)\S+~';
$result = preg_replace_callback($pattern,
function ($m) { return strtoupper($m[1]) . '.'; },
$_GET['fullname'] );
In short:
(*SKIP) forces the substring matched by the preceding subpattern not to be retried if the pattern fails later.
(*FAIL) forces the pattern to fail.
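For comparison, the same goal can be reached without backtracking verbs by splitting the name into words. This is a different technique from the (*SKIP)(*FAIL) pattern above, shown here in JavaScript, whose regex engine does not support those verbs. The function name abbreviateAfterTwoWords is hypothetical, and the sketch assumes words are separated by whitespace.

```javascript
// Hypothetical helper: keep the first two words, reduce the rest to initials.
function abbreviateAfterTwoWords(fullName) {
  // Split on runs of whitespace and ignore empty pieces.
  const words = fullName.trim().split(/\s+/).filter((word) => word.length > 0);
  const kept = words.slice(0, 2);
  const initials = words.slice(2).map((word) => word[0].toUpperCase());
  if (initials.length === 0) {
    return kept.join(" ");
  }
  // Initials are joined with dots, with no trailing dot.
  return kept.join(" ") + " " + initials.join(".");
}

console.log(abbreviateAfterTwoWords("Wan Ahmad Wan Dollah Karmat")); // Wan Ahmad W.D.K
```

Names with two words or fewer are returned unchanged apart from whitespace normalization.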

What does this mean in Perl 1..$#something?

I have a loop for example :
for my $something ( @place[1..$#thing] ) {
}
I don't get this statement 1..$#thing
I know that # is for comments, but my IDE doesn't color #thing as a comment. Or is it really just a comment telling someone that what is in "$" is "thing"? And if it's a comment, why was the rest of the line, like ] ) {, not commented out?
If it has another meaning, I would like to know. Sorry if my question sounds odd; I am new to Perl and perplexed by this expression.
$# is the syntax for getting the highest index of an array, so $#thing is the highest index of the array @thing. This is documented in perldoc perldata.
.. is the range operator, and 1 .. $#thing means a list of numbers from 1 to whatever the highest index of @thing is.
Using this list inside array brackets with the @ sigil denotes an array slice, that is, a selected set of elements of the @place array.
So assuming the following:
my @thing = qw(foo bar baz);
my @place = qw(home work restaurant gym);
then @place[1 .. $#thing] (or @place[1 .. 2]) would expand into the list work, restaurant.
It is correct that # is used for comments, but not in this case.
It's how you define a range, from a starting value to some other value.
for my $something ( @place[1..3] ) {
# Takes the elements at indices 1 through 3
}
Binary ".." is the range operator, which is really two different
operators depending on the context. In list context, it returns a list
of values counting (up by ones) from the left value to the right
value. If the left value is greater than the right value then it
returns the empty list. The range operator is useful for writing
foreach (1..10) loops and for doing slice operations on arrays. In the
current implementation, no temporary array is created when the range
operator is used as the expression in foreach loops, but older
versions of Perl might burn a lot of memory when you write something
like this:
http://perldoc.perl.org/perlop.html#Range-Operators

jquery syntax to look for a hidden field in a form

I have a form with a table in it. In each row is a table cell with a hidden input item with the name of it starting with "hf_id_" followed by a number so that row 1's field has a name of "hf_id_1", row 2 is "hf_id_2" and so on. I need to search all of these fields for a particular value but I'm not quite sure how to get to the hidden fields. I know how to get to them when the full name is known but in this case I'm not sure if there's a way to get an array of these where name starts with "hf_id_". Thanks.
You can match attribute values with ^= (starts with) and $= (ends with), for example:
$('input[name^="hf_id_"]');
So you can get all those elements like:
var elements = $('input[name^="hf_id_"]');
And you can iterate over them to search for a particular value like:
$('input[name^="hf_id_"]').each(function(){
if ($(this).val() === 'search value here')
{
// found..........
}
});
Or, if these are the only hidden inputs on the page, you could simply use
$('input[type="hidden"]');