How to add hash character search to Sphinx

I made the following changes to my Sphinx configuration so that the hash character (#) can be searched, to no avail.
config file:
source MY_SOURCE{
...
sql_query_pre = SET CHARACTER_SET_RESULTS=utf8
sql_query_pre = SET NAMES utf8
}
index MY_INDEX {
path = C:\Sphinx\data\MY_INDEX
...
charset_type = utf-8
charset_table = 0..9, A..Z->a..z, a..z, +, #, U+002E
}
I then run indexer --rotate --all. Please note that Sphinx is running as a Windows service.
When I run the following query, I get no results:
SELECT count(*) FROM MY_INDEX WHERE Match("#");
Can someone please look at this info and let me know what I am doing incorrectly?
Thank you!

A bare # in the config file starts a comment, so everything from that point on the charset_table line is never seen by Sphinx. Encode the hash as U+0023 instead.

Related

PostgreSQL absolute over relative xpath location

Consider the following xml document that is stored in a PostgreSQL field:
<E_sProcedure xmlns="http://www.minushabens.com/2008/FMSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" modelCodeScheme="Emo_ex" modelCodeSchemeVersion="01" modelCodeValue="EMO_E_PROCEDURA" modelCodeMeaning="Section" sectionID="11">
<tCatSnVsn_Pmax modelCodeScheme="Emodinamica_referto" modelCodeSchemeVersion="01" modelCodeValue="tCat4" modelCodeMeaning="My text"><![CDATA[1]]></tCatSnVsn_Pmax>
</E_sProcedure>
If I run the following query I get the correct result for Line 1, while Line 2 returns nothing:
SELECT
--Line 1
TRIM(BOTH FROM array_to_string((xpath('//child::*[@modelCodeValue="tCat4"]/text()', t.xml_element)),'')) as tCatSnVsn_Pmax_MEANING
--Line 2
,TRIM(BOTH FROM array_to_string((xpath('/tCatSnVsn_Pmax/text()', t.xml_element)),'')) as tCatSnVsn_Pmax
FROM (
SELECT unnest(xpath('//x:E_sProcedure', s.XMLDATA::xml, ARRAY[ARRAY['x', 'http://www.minushabens.com/2008/FMSchema']])) AS xml_element
FROM sr_data as s)t;
What's wrong with the XPath in Line 2?
Your second xpath() doesn't return anything because of two problems. First, you need to use //tCatSnVsn_Pmax, because xml_element still starts with <E_sProcedure>; the path /tCatSnVsn_Pmax tries to select a top-level element with that name.
But even then, the second one won't return anything because of the namespace. You need to pass the same namespace definition to xpath() and prefix the element name with it, so you need something like this:
SELECT (xpath('//x:tCatSnVsn_Pmax/text()', t.xml_element, ARRAY[ARRAY['x', 'http://www.minushabens.com/2008/FMSchema']]))[1] as tCatSnVsn_Pmax
FROM (
SELECT unnest(xpath('//x:E_sProcedure', s.XMLDATA::xml, ARRAY[ARRAY['x', 'http://www.minushabens.com/2008/FMSchema']])) AS xml_element
FROM sr_data as s
)t;
With modern Postgres versions (>= 10) I prefer using xmltable() for anything nontrivial. It makes it easier to pass namespaces and to access multiple attributes or elements.
SELECT xt.*
FROM sr_data
cross join
xmltable(xmlnamespaces ('http://www.minushabens.com/2008/FMSchema' as x),
'/x:E_sProcedure'
passing (xmldata::xml)
columns
sectionid text path '@sectionID',
pmax text path 'x:tCatSnVsn_Pmax',
model_code_value text path 'x:tCatSnVsn_Pmax/@modelCodeValue') as xt
For your sample XML, the above returns:
sectionid | pmax | model_code_value
----------+------+-----------------
11 | 1 | tCat4
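The namespace effect is easy to reproduce outside the database. The following is a minimal Python sketch, not part of the original answer, that runs the standard library's xml.etree.ElementTree on a trimmed copy of the sample document (most attributes and the CDATA wrapper are dropped for brevity). ElementTree only implements a subset of XPath, so treat this as an illustration of the principle rather than of PostgreSQL's exact behaviour: an unprefixed element name does not match a namespaced element, while a wildcard with an attribute test, as in Line 1, does.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping; the prefix "x" is arbitrary, only the URI matters.
NS = {"x": "http://www.minushabens.com/2008/FMSchema"}

# Trimmed version of the sample document (assumption: extra attributes removed).
XML_DOC = (
    '<E_sProcedure xmlns="http://www.minushabens.com/2008/FMSchema" sectionID="11">'
    '<tCatSnVsn_Pmax modelCodeValue="tCat4">1</tCatSnVsn_Pmax>'
    "</E_sProcedure>"
)

root = ET.fromstring(XML_DOC)

# Unprefixed name: looks for an element in no namespace, so nothing matches.
unqualified = root.findall(".//tCatSnVsn_Pmax")

# Prefixed name plus namespace mapping: matches the element.
qualified = root.findall(".//x:tCatSnVsn_Pmax", NS)

# Wildcard with an attribute test: ignores the element name, so it matches too.
by_attribute = root.findall(".//*[@modelCodeValue='tCat4']")

print(len(unqualified), len(qualified), len(by_attribute))
```

The same pattern explains the query results: Line 1 never names the element, so the default namespace does not get in its way.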

How to search in SQL for a column value using an asterisk wildcard?

I am trying to build an advanced search for my database.
I want something like this: if the user types overf**w as the search term
and a column in my database has the value overflow, that row should be shown.
This is my code:
$name = str_replace('*', '_', $name);
SELECT name FROM table WHERE name LIKE CONCAT('%', ?, '%')
It's not working, and I don't know what the problem is.
LIKE is not a good fit for this; use REGEXP to do a wildcard search. Replace * or ** with .* and, to return only names that start with the given value, put ^ at the beginning of the regular expression:
SELECT name
FROM actors
WHERE name REGEXP('^overf.*w')
I don't know PHP, but your $name parameter should be set like this (in pseudo code):
$name = '^' + replace($name, '**', '.*')
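As a rough sketch of that pseudo code in runnable form (Python here rather than PHP, and the helper name wildcard_to_regex is made up for illustration): split the user's input on runs of asterisks, escape every other character so that it is matched literally, and join the pieces with .*. Note that re.escape targets Python's regex syntax; it is only assumed here to be close enough to MySQL's for ordinary text.

```python
import re


def wildcard_to_regex(user_input: str) -> str:
    """Turn a search term such as 'overf**w' into an anchored regex.

    Hypothetical helper: every run of one or more '*' becomes '.*',
    and all other characters are escaped so they match literally.
    """
    literal_parts = re.split(r"\*+", user_input)
    return "^" + ".*".join(re.escape(part) for part in literal_parts)


pattern = wildcard_to_regex("overf**w")
print(pattern)

# Quick check against sample values using Python's own regex engine.
print(bool(re.search(pattern, "overflow")))
print(bool(re.search(pattern, "underflow")))
```

The resulting string would then be passed to the query as a bound parameter (WHERE name REGEXP ?) rather than concatenated into the SQL text.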

query error: no field 'face' found in schema

I have a bigint column named face in MySQL, and this is my sphinx.conf:
source src1
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass = pass
sql_db = nums
sql_port = 3306 # optional, default is 3306
sql_query = SELECT id,id AS id_attr,tel,name,sex,face from tel
sql_attr_uint = id_attr
sql_attr_bigint = face
}
index num
{
rt_attr_bigint = face
rt_field = face
source = src1
path = C:/sphinx/bin/data/numaralar
}
I can search by name and tel, but not by face.
Fatal error: Uncaught exception 'Foolz\SphinxQL\Exception\DatabaseException' with message '[1064] index nums: query error: no field 'face' found in schema [ SELECT * FROM nums WHERE MATCH('(@face 123456)') LIMIT 0, 10 OPTION max_matches = 5000;SHOW META]' in ..
Why might this be?
You are trying to use the value as a field. The @ full-text operator (and indeed the whole MATCH() full-text query) operates on fields ONLY.
You've instead defined face as an attribute. Attributes don't work in full-text queries.
You can either:
Make face a field instead (remove the sql_attr_bigint), or make it both an attribute and a field. To do that, you would have to duplicate it in the query the way you've duplicated the id, one copy for the field and one for the attribute. Alternatively use sql_field_string, but that creates a string attribute.
or
Filter by the attribute instead. I don't really know how to do that in Foolz, but the SphinxQL query would be something like:
SELECT * FROM nums WHERE `face` = 123456 LIMIT 0, 10

PostgreSQL, Perl and Dojo special character issue (æ, ø and å)

I have a web page written in Perl and Dojo on top of a PostgreSQL database. I need to search for available people in the database, and since I'm from Denmark the letters æ, ø and å have to work in the search. I thought this was standard when using UTF-8, and since I normally program in PHP on MySQL I didn't think it would be that hard.
I have tried probably every trick I know to convert the search word to the right encoding so that I can search the PostgreSQL database for names containing æ, ø and å, but it still fails.
My Perl code performs the fetch, but it returns 0 rows, whereas the same command in the psql terminal returns 46 rows (I copied the statement printed to STDERR from the "tail -f" log terminal and pasted it into another terminal connected to the database through psql). The Perl code is:
sub dbSearchPersons {
my $search_word = escapeSql($_[0]);
$search_word = Encode::decode_utf8($search_word);
$statement = "SELECT id,name,initials,email FROM person WHERE name ilike '\%".$search_word."\%' OR email ilike '\%".$search_word."\%' OR initials ilike '\%".$search_word."\%' ORDER BY name ASC";
$sth = $dbh->prepare($statement);
$num_rows = $sth->execute();
print STDERR "Statement: " . $statement;
if($num_rows > 0){
$persons = $dbh->selectall_hashref($statement,'id');
}
dbFinish($sth);
webdie($DBI::errstr) if($DBI::errstr);
}
As you can see, I write the SQL statement to STDERR, which outputs the following:
[Fri Apr 27 11:24:26 2012] [error] [client 10.254.0.1] Statement: SELECT id,name,initials,email FROM person WHERE name ilike '%Jørgen%' OR email ilike '%Jørgen%' OR initials ilike '%Jørgen%' ORDER BY name ASC, referer: https://xx.xxx.xxx.xx/cgi-bin/users.cgi
The SQL is correctly written (as far as I can see from the terminal output above), and if I copy the statement from the terminal and paste it directly into psql, I get 46 rows returned, as I should. But the Perl code still won't return any rows.
I don't get it. Once the string has been decoded so that it displays "ø" rather than "Ã¸" (which is how Perl shows the raw UTF-8 bytes of "J%C3%B8rgen", the value sent through dojo.xhr.post), shouldn't I be able to use it in an SQL statement? Is it because the PostgreSQL database has a particular encoding that I have to take into account somehow? Or could it be something completely different?
I hope someone can help me. I have been struggling with this problem for two days now, and since everything looks as it should but doesn't work, I'm getting a little sad :/
Regards,
Thor Astrup Pedersen
You probably forgot to set pg_enable_utf8. With it enabled, the database interface returns Perl character data to you.
$ createdb -e -E UTF-8 -l en_US.UTF-8 -T template0 so10349280
CREATE DATABASE so10349280 ENCODING 'UTF-8' TEMPLATE template0 LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8';
$ echo 'create table person (id int, name varchar, initials varchar, email varchar)'|psql so10349280
CREATE TABLE
$ echo "insert into person (id, name) values (1, 'Jørgensen')"|psql so10349280
INSERT 0 1
$ echo 'select * from person'|psql so10349280
id | name | initials | email
----+-----------+----------+-------
1 | Jørgensen | |
$ perl -Mutf8 -Mstrictures -MDBI -MDevel::Peek -E'
my $dbh = DBI->connect(
"DBI:Pg:dbname=so10349280", $ENV{LOGNAME}, "", { RaiseError => 1, AutoCommit => 1, pg_enable_utf8 => 1}
);
my $r = $dbh->selectall_hashref("select * from person where name = ?", "id", undef, "Jørgensen");
Dump $r->{1}{name};
'
SV = PV(0x836e20) at 0xa58dc8
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
PV = 0xa5a000 "J\303\270rgensen"\0 [UTF8 "J\x{f8}rgensen"]
CUR = 10
LEN = 16
You don't say so explicitly, but I think you eventually intend to send the character data out as JSON for use with Dojo. For that you need to encode it into UTF-8 octets; the various JSON libraries take care of this automatically, so there is no need to call Encode functions manually.
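The distinction between octets and characters that this answer relies on can be shown in a few lines. The sketch below is in Python rather than Perl and is only an illustration of the encoding mechanics, not of DBD::Pg itself; it starts from the percent-encoded value quoted in the question.

```python
from urllib.parse import unquote_to_bytes

# The form value as it travels over the wire (taken from the question).
wire_value = "J%C3%B8rgen"

# Percent-decoding yields octets (bytes), not characters.
octets = unquote_to_bytes(wire_value)

# Decoding the octets as UTF-8 gives the intended character string.
characters = octets.decode("utf-8")

# Decoding the same octets as Latin-1 gives two junk characters for "ø".
mojibake = octets.decode("latin-1")

print(len(octets), len(characters), repr(mojibake))
```

A search term has to be compared in the same form the driver uses for the stored names, either both as characters or both as octets. Mixing the two can produce a statement that looks right in the log yet matches no rows.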

PostgreSQL change part of a string to uppercase

I have a field named rspec in a table trace.
Currently the field looks like "Vol3/data/20070204_191426_FXBS.v3a".
All I need is a query to change it to the format "Vol3/data/20070204_191426_FXBS.V3A".
Assuming the current version:
select left(rspec, - 3)||upper(right(rspec, 3))
from trace
For older versions:
select substr(rspec, 1, length(rspec) - 3)||upper(substring(rspec from '...$'))
from trace
Or, to cover all cases, such as:
file extensions of variable length: abc123.jpeg
no file extension at all: abc123
dot as last character: abc123.
multiple dots: abc.123.jpg
SELECT CASE WHEN rspec ~~ '%.%'
THEN substring(rspec, E'^.*\\.')
|| upper(substring(rspec , E'([^.]*)$'))
ELSE rspec
END AS rspec
FROM (VALUES
('abc123.jpeg')
, ('abc123')
, ('abc123.')
, ('abc.123.jpg')
) AS x(rspec); -- test cases
Explanation:
If the string has no dot, use the string.
Else, take everything up to and including the last dot in the string.
Append everything after the last dot in upper case.
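For comparison, the same three rules can be written as a short Python sketch. The function name upper_extension is made up for illustration, and the code is not a replacement for the SQL above, just a way to check the expected results for the test cases.

```python
def upper_extension(path: str) -> str:
    """Upper-case everything after the last dot; leave dot-less strings alone."""
    # rpartition splits on the LAST dot: (before, ".", after).
    # If there is no dot, the separator part comes back empty.
    head, dot, extension = path.rpartition(".")
    if not dot:
        return path
    return head + dot + extension.upper()


for sample in ("abc123.jpeg", "abc123", "abc123.", "abc.123.jpg"):
    print(sample, "->", upper_extension(sample))
```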