Sphinx search: getting rt indexing to work with mysql - first time

I am trying to get real-time (RT) indexing to work:
http://sphinxsearch.com/docs/current.html#rt-overview
I am missing the link between Sphinx and MySQL.
In sphinx.conf I have:
index rt_test
{
type = rt
path = /home/my/path/sphinx/data/rt_test
rt_field = title
rt_field = content
}
I run /home/path/bin/indexer --all
It tells me
skipping non-plain index 'rt_test'... (which I read is as it should be)
Then in mysql (logging in as I normally would):
create table rt_test(id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT, title varchar(100),
content varchar(100));
insert into rt_test(title, content) values ("test title", "test content");
SELECT * FROM rt_test WHERE MATCH('test');
This gives me a "wrong syntax" error. That's not surprising: MySQL just thinks that I
have created a regular table and inserted regular data, and it doesn't understand the
Sphinx query.
So what's the missing link? How does MySQL know about Sphinx? If I don't create the table first, I get an error that the table doesn't exist (Sphinx has not made a "sphinx" table that can be queried from MySQL).
I have installed sphinx on Linux as described here:
http://sphinxsearch.com/docs/current.html#installing
Using this version:
wget http://sphinxsearch.com/files/sphinx-2.0.8-release.tar.gz
Edit: I also ran $searchd
It says:
WARNING: compat_sphinxql_magics=1 is deprecated; please update your applica
WARNING: preopen_indexes=1 has no effect with seamless_rotate=0
listening on all interfaces, port=9312
listening on all interfaces, port=9306
precaching index 'other'
precaching index 'rt_test'
precached 2 indexes in 0.012 sec

I get it now.
Instead of logging into my regular MySQL database, I connect (funnily enough) to the port that searchd says it is listening on:
$ mysql -h 127.0.0.1 -P 9306
Then I don't create the table at all; I just insert and search.
My regular database and the Sphinx "MySQL" interface are completely separate. I have to insert all my data into my regular database and also into the Sphinx index. Then I search in Sphinx and use the result to fetch the full data from my regular database.
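The session described above might look roughly like the following shell sketch. The host, port 9306 and the rt_test index come from the post; the function names and the document id 1 are placeholders made up for illustration. As far as I know, Sphinx 2.0.x wants an explicit id in every INSERT into an RT index, so the sketch supplies one.

```shell
# Sketch only. sphinx_demo_sql and sphinx_demo are hypothetical helper names.

# Print the SphinxQL statements. There is no CREATE TABLE, because the index
# is already declared in sphinx.conf. The id (1 here) should match the row's
# primary key in the regular MySQL table so that search hits can be looked
# up there afterwards.
sphinx_demo_sql() {
  cat <<'SQL'
INSERT INTO rt_test (id, title, content) VALUES (1, 'test title', 'test content');
SELECT * FROM rt_test WHERE MATCH('test');
SQL
}

# Send the statements to searchd (not to the MySQL server), using the same
# host and port as in the post.
sphinx_demo() {
  sphinx_demo_sql | mysql -h 127.0.0.1 -P 9306
}
```

Calling sphinx_demo should list the matching document ids; the full rows would then be fetched from the regular database by those ids.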

Related

Return name of the Postgres database containing value in a row in a table

I am looking for a way to search an entire Postgres instance, across all of its databases, to find the database containing a specific row in a specific table.
I have a Postgres instance with 170 databases, and every database has the exact same schema and table layout. In each database there is a specific table "SM_Project" with a specific column "ProjectName". There are cases where we know a ProjectName, or at least a partial match (we can use LIKE with %), but we have no idea which database it lives in.
I want to script this so that I can enter a ProjectName, search the entire Postgres instance (all databases in it) and get back the name of the database that contains that record.
I foolishly thought this would be a simple task. With my lack of experience I've tried several SELECT statements, and while I can explicitly connect to a database and search for the record from there, I can't find a way to return the parent database name. I was thinking of clunkily scripting it in bash to iterate through the databases until an EXISTS in a SELECT statement returns true. But I feel like I'm overlooking something fundamental.
So my setup is like this:
Postgres
  db1
    SM_Project
      ProjectName
  db2
    SM_Project
      ProjectName
  db3
    SM_Project
      ProjectName
In short I'm looking to return the name of the database that contains a record of ProjectName equal to a string.
Any thoughts would be very welcomed!
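The bash loop mentioned in the question could be sketched roughly as follows. This assumes psql can connect without prompting (PGHOST, PGUSER, ~/.pgpass and so on) and that the table and column were created with quoted mixed-case names; project_query and find_project_db are made-up helper names.

```shell
# Sketch only. project_query and find_project_db are hypothetical names.

# Print the probe query. :'pattern' is a psql variable that psql substitutes
# as a properly quoted string literal.
project_query() {
  printf '%s\n' "SELECT EXISTS (SELECT 1 FROM \"SM_Project\" WHERE \"ProjectName\" LIKE :'pattern');"
}

# Print the name of every connectable, non-template database that holds a
# matching ProjectName. $1 is the LIKE pattern.
find_project_db() {
  psql -At -d postgres -c "SELECT datname FROM pg_database WHERE datallowconn AND NOT datistemplate" |
  while read -r db; do
    # With -At, EXISTS comes back as a bare "t" or "f".
    found=$(project_query | psql -At -d "$db" -v pattern="$1")
    if [ "$found" = "t" ]; then
      printf '%s\n' "$db"
    fi
  done
}
```

For example, find_project_db 'Acme%' would print one database name per line for every database that has a project starting with Acme.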

Is it mandatory to use "" around the table name performing query on PostgreSQL?

I am fairly new to PostgreSQL and pgAdmin 4 and have the following question.
Here is what I can see in pgAdmin 4 (screenshot):
As you can see, it is performing this very simple query:
SELECT * FROM public."Example"
ORDER BY id ASC
The thing that I am not understanding is what is this public name in front of the Example table name. What is it?
I was trying to perform a query in this way but it is not working:
SELECT * FROM Example
ORDER BY id ASC
It gives me a syntax error. I have often used MySQL, and in MySQL this works.
I tried to replace the query in this way:
SELECT * FROM "Example"
ORDER BY id ASC
and this way it works. So does it mean that in a PostgreSQL database the "" around the table name are mandatory?

The thing that I am not understanding is what is this public name in front of the Example table name. What is it?
As said in postgres documentation:
"By default tables (and other objects) are automatically put into a schema named "public". Every new database contains such a schema."
So it means that in PostgreSQL database the "" around the table name are mandatory?
Not really, but you need them if the name is a reserved keyword (such as "user" or "order") or if your table's name contains uppercase letters (which is your case). In any case, if you can, it's better to change your table's name.
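The quoting rule can be illustrated with a small shell function that imitates how PostgreSQL treats a simple identifier: unquoted names are folded to lowercase, quoted names keep their case. fold_identifier is a made-up helper, not part of PostgreSQL, and it ignores edge cases such as embedded quotes.

```shell
# Sketch only. fold_identifier is a hypothetical helper that mimics
# PostgreSQL's case folding for one simple identifier.
fold_identifier() {
  case "$1" in
    \"*\")
      # Double-quoted: strip the quotes and keep the case exactly as written.
      name=${1#\"}
      name=${name%\"}
      printf '%s\n' "$name"
      ;;
    *)
      # Unquoted: fold to lowercase, as PostgreSQL does.
      printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
      ;;
  esac
}
```

So SELECT * FROM Example looks for a table called example, while the table shown in pgAdmin is stored as Example, which only "Example" matches.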

You should rename your table so that it is all lowercase letters, then try again with
select * from example

How to skip or modify an index in pgloader?

I have a MySQL database with a FULLTEXT index that I wish to port to Postgres. When I create a Postgres database using pgloader, the index in Postgres becomes this:
"idx_33441_ibtsearchidx" gin (to_tsvector('simple'::regconfig, keywords))
Now, the simple configuration is not what I want; for this application I need english. I can manually enter an ALTER INDEX statement in psql after the migration, but I would like to fully automate the pgloader process (which worked beautifully in every other case!)
But how do I configure pgloader to do this? It seems like there are three possibilities:
1. Just put an ALTER INDEX statement into the pgloader script's AFTER LOAD section. But the problem is, I won't know the index name. Also I think this approach would be inefficient, since one index would be built and then a new one would be built after that.
2. Tell pgloader NOT to automatically make the fulltext index in Postgres. I don't know how to do this. Can it be done? I know how to exclude tables but not indexes. In this case I could create the index in the AFTER LOAD section without any problem, because I can choose my own index name.
3. Specify exactly the full text index configuration I want in the pgloader script. I was unable to find an option for doing this in the pgloader reference. Is it possible?

Postgres equivalent to SQL Server's @@DBTS

I am mainly from a SQL Server background, and following some issues with getting MySQL to work with the Microsoft Sync Framework (namely that it does not cater for snapshots), I am having to look into Postgres and try to get that working with the Sync Framework.
The triggers that are needed include a call to the function "@@DBTS", but I am having trouble finding an equivalent for it in Postgres.
The Microsoft documentation for it says:
@@DBTS returns the current database's last-used timestamp value.
A new timestamp value is generated when a row with a timestamp
column is inserted or updated.
In MySQL it was the following:
USE INFORMATION_SCHEMA;
SELECT MAX(UPDATE_TIME) FROM TABLES WHERE UPDATE_TIME < NOW();
Can anyone tell me what this would be in Postgres?

PostgreSQL does not keep track of when a table was last modified. So there is no equivalent for SQL Server's @@DBTS, nor for MySQL's INFORMATION_SCHEMA.TABLES.UPDATE_TIME.
You also might be interested in this discussion:
http://archives.postgresql.org/pgsql-general/2009-02/msg01171.php
which essentially says: "if you need to know when a table was last modified, you have to add a timestamp column to each table that records the last time the row was updated".
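A minimal sketch of that timestamp-column approach, written as a shell function that prints the SQL for one table. The names touch_trigger_sql, updated_at and touch_updated_at are made up for illustration, and the table name is pasted in unescaped, so only trusted, simple identifiers should be passed.

```shell
# Sketch only. touch_trigger_sql, updated_at and touch_updated_at are
# hypothetical names, not anything PostgreSQL or the Sync Framework defines.

# Print the SQL that adds a last-modified column to table $1 and refreshes
# it on every UPDATE (the column default covers INSERT).
touch_trigger_sql() {
  cat <<EOF
ALTER TABLE $1 ADD COLUMN updated_at timestamptz NOT NULL DEFAULT now();

CREATE OR REPLACE FUNCTION touch_updated_at() RETURNS trigger AS \$\$
BEGIN
  NEW.updated_at := now();
  RETURN NEW;
END;
\$\$ LANGUAGE plpgsql;

CREATE TRIGGER ${1}_touch BEFORE UPDATE ON $1
  FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();
EOF
}
```

The output can be piped into psql, after which SELECT max(updated_at) FROM the table gives the time of its latest insert or update. Deletes leave no trace with this approach.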

Table invisible in PostgreSQL - Undefined relation issue at different sessions

I have executed the following create statement using SQLWorkbench at my target postgresql database:
CREATE TABLE Config (
id serial PRIMARY KEY,
pub_ip_range_low varchar(100),
pub_ip_range_high varchar(100)
);
Right after creating the table I requested its content by typing 'select * from config;' and saw that the table could be retrieved. Nevertheless, my Java program, which uses a JDBC type 4 driver, cannot access the table when I issue the same select statement in it. An exception is thrown when the program tries to access it, which says "Undefined relation" for the config table.
My questions are:
Why does SQL Workbench, where I had previously run the create statement, recognize the table while my Java program cannot find it?
Where does the PostgreSQL DBMS put the tables I created? I don't see them in either the public schema or the information schema.
NOTE:
I checked the target Postgres database and cannot see the table Config anywhere, although SQL Workbench can query it. Then I opened another SQL Workbench instance and noticed that the table cannot be queried there (i.e. not found). So my conclusion is that PostgreSQL puts the table I created in the first running SQL Workbench instance into some location that is bound to that session. Another SQL Workbench instance or my Java program is not bound to that session, so it cannot query the previously created table config.

The only "bloody location" that is session-local in PostgreSQL is the schema pg_temp, in other words: temporary tables. But your CREATE command does not display the keyword TEMP[ORARY]. Of course, as long as the transaction is not committed, nobody sees anything outside the transaction.
It's more likely you are seeing a switcheroo of hosts / databases / ports / or the schema search_path. A mixup with the mixed-case table name is a hot candidate, too. If you don't double-quote "Config", the table ends up all lower case in the system, so: config. If you later double quote the name, it won't match. The manual has the details.
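One way to check for such a mix-up is to run the same diagnostic queries from both clients and compare the results. The shell function below only prints the queries; where_am_i_sql is a made-up name, while the functions and the pg_tables view it uses are standard PostgreSQL.

```shell
# Sketch only. where_am_i_sql is a hypothetical helper name.

# Print queries that show which server, database and schema a session is
# really using, and in which schema any table named config/Config lives.
where_am_i_sql() {
  cat <<'SQL'
SELECT current_database(), current_schema(), inet_server_addr(), inet_server_port();
SHOW search_path;
SELECT schemaname, tablename FROM pg_tables WHERE lower(tablename) = 'config';
SQL
}
```

If the two sessions report different databases, ports or search paths, or the last query returns nothing in one of them, the table is not session-bound; the clients are simply looking in different places or the creating transaction was never committed.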

Maybe the create failed on the extra trailing comma?
CREATE TABLE config (
id serial PRIMARY KEY,
pub_ip_range_low varchar(100),
pub_ip_range_high varchar(100) -- >> ,
);