I did the following:
ALTER TABLE blog_entry ADD COLUMN body_tsv tsvector;
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON blog_entry
FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(body_tsv, 'pg_catalog.english', body);
CREATE INDEX blog_entry_tsv ON blog_entry USING gin(body_tsv);
UPDATE blog_entry SET body_tsv=to_tsvector(body);
Now this is working:
SELECT title FROM blog_entry WHERE body_tsv @@ plainto_tsquery('hello world');
But when trying to search for non-English text, it's not working at all (no results).
I am using v9.2.2
Please help.
It's been a while since I played with this, but you need to create the tsvector in the correct language, not the tsquery.
So when you update your table, use:
UPDATE blog_entry SET body_tsv=to_tsvector('german', body);
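Note that the trigger from the question still references the English configuration, so rows inserted or updated later would be indexed with it again; presumably it needs the same change (a sketch, reusing the trigger definition from the question):
DROP TRIGGER IF EXISTS tsvectorupdate ON blog_entry;
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON blog_entry
FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger(body_tsv, 'pg_catalog.german', body);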
You can also extend the functionality and use an ispell dictionary to improve the stemming for the text search engine (although it still won't be as sophisticated as e.g. Solr).
To do that, download an ispell dictionary, e.g. the one contained in the OpenOffice German dictionary.
The .oxt file is actually a .zip file, so you can simply extract its content.
Then copy the file de_DE_frami.dic to the PostgreSQL "share/tsearch_data" directory while changing the extension to .dict (which is what PostgreSQL expects).
Then copy the file de_DE_frami.aff to the same directory, changing the extension to .affix.
You need to convert both (text) files to UTF-8 in order for them to work with PostgreSQL.
Then register that dictionary using:
CREATE TEXT SEARCH CONFIGURATION de_config (COPY = german);
CREATE TEXT SEARCH DICTIONARY german_stem (
    TEMPLATE = snowball,
    language = german
);
CREATE TEXT SEARCH DICTIONARY german_ispell (
    TEMPLATE = ispell,
    dictfile = de_DE_frami,
    afffile = de_DE_frami
);
ALTER TEXT SEARCH CONFIGURATION de_config
    ALTER MAPPING FOR asciiword WITH german_ispell, german_stem;
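To check that words are actually routed through the new ispell dictionary, ts_debug can be used (a sketch; 'Haus' is just a sample word):
SELECT alias, token, dictionaries, lexemes FROM ts_debug('de_config', 'Haus');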
Once that is done, you can create your tsvector using:
UPDATE blog_entry SET body_tsv=to_tsvector('de_config', body);
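For querying, the same configuration name can be passed to plainto_tsquery (a sketch; the search term is just an example):
SELECT title FROM blog_entry WHERE body_tsv @@ plainto_tsquery('de_config', 'hallo welt');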
This is also described in the manual: http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY
I know it's been a while since this question was asked, but I was searching for how to change the FTS language and found another solution (and one that is better than downloading a dictionary).
In the Postgres CLI (psql), you can get a list of the available text search configurations with the command: \dF
Check your current configuration:
show default_text_search_config;
Change your text search configuration to another language:
set default_text_search_config = 'pg_catalog.[language]';
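For example, to switch to German (a sketch; mydb is a placeholder database name; SET only affects the current session, while ALTER DATABASE changes the default for new connections):
SET default_text_search_config = 'pg_catalog.german';
ALTER DATABASE mydb SET default_text_search_config = 'pg_catalog.german';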
I use like with lowercaseString and Russian characters, but LOWER doesn't convert them to lowercase in the query. I tried to create my own function, but it didn't work for me. How can I solve this problem?
Having studied the SQLite documentation, I learned that you need to link the ICU library. How can this be done with this library?
Library: stephencelis/SQLite.swift (https://github.com/stephencelis/SQLite.swift)
Thanks for the help.
// value of name in the database: ПРИВЕТ
let search_name = "Привет"
user.filter(name.lowercaseString.like("%" + search_name.lowercased() + "%"))
SQLite's LOWER only works for ASCII characters. If you want case-insensitive matching for Russian (or any other non-ASCII characters), use FTS3/FTS4 https://www.sqlite.org/fts3.html (or FTS5 https://www.sqlite.org/fts5.html).
SQLite.swift has the corresponding full text search modules https://github.com/stephencelis/SQLite.swift/blob/master/Documentation/Index.md#full-text-search
To use it in your project with an existing database, you should create the virtual table via the FTS module and filter the query using .match:
// CREATE VIRTUAL TABLE "table" USING fts4("row0", "row1"), if not exists
try db.run(table.create(.FTS4(row0, row1), ifNotExists: true))
// SELECT * FROM "table" WHERE "row0" MATCH 'textToMatch*'
try db.prepare(table.filter(row0.match("\(textToMatch)*")))
// SELECT * FROM "table" WHERE "any row" MATCH 'textToMatch*'
try db.prepare(table.match("\(textToMatch)*"))
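For reference, a rough raw-SQL equivalent of the above (a sketch; users_fts and name are placeholder names, and the tokenize=unicode61 option is worth noting because the default "simple" FTS tokenizer only case-folds ASCII):
-- FTS4 table with Unicode-aware case folding
CREATE VIRTUAL TABLE IF NOT EXISTS users_fts USING fts4(name, tokenize=unicode61);
-- copy existing rows into the virtual table
INSERT INTO users_fts(name) SELECT name FROM users;
-- case-insensitive prefix match for Russian text
SELECT * FROM users_fts WHERE name MATCH 'привет*';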
I am trying to use PostgreSQL Full Text Search. I read that the stop words (words ignored for indexing) are implemented via a dictionary. But I would like to give the user limited control over the stop words (inserting new ones), so I grouped them in a table.
From the example below:
select strip(to_tsvector('simple', texto)) from longtxts where id = 23;
I can get the vector:
{'alta' 'aluno' 'cada' 'do' 'em' 'leia' 'livro' 'pedir' 'que' 'trecho' 'um' 'voz'}
And now I would like to remove the elements from the stopwords table:
select array(select palavra_proibida from stopwords);
That returns the array:
{a,as,ao,aos,com,default,e,eu,o,os,da,das,de,do,dos,em,lhe,na,nao,nas,no,nos,ou,por,para,pra,que,sem,se,um,uma}
Then, following the documentation:
ts_delete(vector tsvector, lexemes text[]) returns tsvector: removes any occurrence of the lexemes in lexemes from vector. Example: ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])
I tried a lot. For example:
select ts_delete((select strip(to_tsvector('simple', texto)) from longtxts where id = 23), array[(select palavra_proibida from stopwords)]);
But I always receive the error:
ERROR: function ts_delete(tsvector, character varying[]) does not exist
LINE 1: select ts_delete((select strip(to_tsvector('simple', texto))...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Could anyone help me? Thanks in advance!
ts_delete was introduced in PostgreSQL 9.6. Based on the error message, you're using an earlier version. You may try select version(); to be sure.
When you land on the PostgreSQL online documentation via a web search, it may correspond to any version. The version is in the URL, and there's a "This page in another version" set of links at the top of each page to help you switch to the equivalent doc for a different version.
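On 9.6 or later, a call along these lines should work (a sketch; table and column names are taken from the question, and the stopword array is aggregated and cast to text[] to match the documented ts_delete(tsvector, text[]) signature):
SELECT ts_delete(
         strip(to_tsvector('simple', texto)),
         (SELECT array_agg(palavra_proibida)::text[] FROM stopwords)
       )
FROM longtxts
WHERE id = 23;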
I am new to OpenEdge and I am trying, initially, to export a table to an XML file.
My final aim is to export three tables to an XML file.
I have tried exporting to a simple delimited file and that is working.
Here is what I have tried.
For txt:
OUTPUT TO c:\temp\file.txt.
FOR EACH cGrSIRVATNBR:
EXPORT DELIMITER ";" cGrSIRVATNBR.
END.
OUTPUT CLOSE.
For xml:
cGrSIRVATNBR:WRITE-XML("FILE","c:\temp\tt.xml", TRUE).
For xml, I think it is only supported from 10.2B. That's why I am getting the error (Unable to understand after -- cGrSIRVATNBR:) when using WRITE-XML.
I would appreciate any help.
This works fine for me:
define temp-table ttCust no-undo like customer.
for each customer no-lock where custNum = 1:
create ttCust.
buffer-copy customer to ttCust.
end.
temp-table ttCust:write-xml( "file", "cust.xml", true ).
You cannot directly write a db table to XML. You have to copy the records that you want into a temp-table first.
It looks like Psycopg has a custom command for executing a COPY:
psycopg2 COPY using cursor.copy_from() freezes with large inputs
Is there a way to access this functionality from within SQLAlchemy?
The accepted answer is correct, but if you want more than just EoghanM's comment to go on, the following worked for me for COPYing a table out to CSV...
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
eng = create_engine("postgresql://user:pwd@host:5432/db")
ses = sessionmaker(bind=eng)
dbcopy_f = open('/tmp/some_table_copy.csv','wb')
copy_sql = 'COPY some_table TO STDOUT WITH CSV HEADER'
fake_conn = eng.raw_connection()
fake_cur = fake_conn.cursor()
fake_cur.copy_expert(copy_sql, dbcopy_f)
The sessionmaker isn't necessary, but if you're in the habit of creating the engine and the session at the same time, to use raw_connection you'll need to separate them (unless there is some way to access the engine through the session object that I don't know of). The SQL string provided to copy_expert is also not the only way to do it; there is a basic copy_to function that you can use with a subset of the parameters that you could pass to a normal COPY TO query. Overall performance of the command seems fast for me, copying out a table of ~20,000 rows.
http://initd.org/psycopg/docs/cursor.html#cursor.copy_to
http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine.raw_connection
If your engine is configured with a psycopg2 connection string (which is the default, so either "postgresql://..." or "postgresql+psycopg2://..."), you can create a psycopg2 cursor from a SQLAlchemy session using
cursor = session.connection().connection.cursor()
which you can use to execute
cursor.copy_from(...)
The cursor will be active in the same transaction as your session currently is. If a commit or rollback happens, any further use of the cursor will throw a psycopg2.InterfaceError; you would have to create a new one.
You can use:
import cStringIO  # Python 2; on Python 3 use io.StringIO instead

def to_sql(engine, df, table, if_exists='fail', sep='\t', encoding='utf8'):
    # Create Table
    df[:0].to_sql(table, engine, if_exists=if_exists)

    # Prepare data
    output = cStringIO.StringIO()
    df.to_csv(output, sep=sep, header=False, encoding=encoding)
    output.seek(0)

    # Insert data
    connection = engine.raw_connection()
    cursor = connection.cursor()
    cursor.copy_from(output, table, sep=sep, null='')
    connection.commit()
    cursor.close()
With this I insert 200,000 rows in 5 seconds instead of 4 minutes.
It doesn't look like it.
You may have to just use psycopg2 to expose this functionality and forego the ORM capabilities. I guess I don't really see the benefit of ORM in such an operation anyway since it's a straight bulk insert and dealing with individual objects a la an ORM would not really make a whole lot of sense.
If you're starting from SQLAlchemy, you need to first get to the connection engine (also known by the property name bind on some SQLAlchemy objects):
engine = create_engine('postgresql+psycopg2://myuser:password@localhost/mydb')
# or
engine = session.bind
# or any other way you know to get to the engine
From the engine you can isolate a psycopg2 connection:
# get a psycopg2 connection
connection = engine.connect().connection
# get a cursor on that connection
cursor = connection.cursor()
Here are some templates for the COPY statement to use with cursor.copy_expert(), a more complete and flexible option than copy_from() or copy_to(), as indicated here: https://www.psycopg.org/docs/cursor.html#cursor.copy_expert.
# to dump to a file
dump_to = """
COPY mytable
TO STDOUT
WITH (
FORMAT CSV,
DELIMITER ',',
HEADER
);
"""
# to copy from a file:
copy_from = """
COPY mytable
FROM STDIN
WITH (
FORMAT CSV,
DELIMITER ',',
HEADER
);
"""
Check out what the options above mean, and others that may be of interest to your specific situation, at https://www.postgresql.org/docs/current/static/sql-copy.html.
IMPORTANT NOTE: The documentation of cursor.copy_expert() indicates to use STDOUT to write out to a file and STDIN to copy from a file. But if you look at the syntax in the PostgreSQL manual, you'll notice that you can also specify the file to write to or read from directly in the COPY statement. Don't do that: the file is then read or written by the database server process, so unless you have the required permissions on the server side you're likely just wasting your time (who runs Python as root during development?). Just do what's indicated in the psycopg2 docs and specify STDIN or STDOUT in your statement with cursor.copy_expert(), and it should be fine.
# running the copy statement
with open('/path/to/your/data/file.csv') as f:
    cursor.copy_expert(copy_from, file=f)
# don't forget to commit the changes.
connection.commit()
You don't need to drop down to psycopg2, use raw_connection, or use a cursor.
Just execute the SQL as usual; you can even use bind parameters with text():
engine.execute(text('''copy some_table from :csv
                       delimiter ',' csv'''
                    ).execution_options(autocommit=True),
               csv='/tmp/a.csv')
You can drop the execution_options(autocommit=True) if this PR gets accepted.
I'm trying to import a .csv file into phpMyAdmin where several fields are purposefully left blank. I need these fields to register as NULL values and not just be left as blank strings.
I know in the field properties you can select to allow "null" vs. "not null" for each field, but it still doesn't change the cell to a NULL value while importing. After the import I can manually go check the null box for each field on each record, but that is unrealistic considering the amount of data I'm working with.
Is there a way to get phpMyAdmin to set these blank cells to NULL values on import?
I've been experiencing similar issues.
If you download a phpMyAdmin CSV file with NULL values, you'll notice that NULL doesn't get encapsulated in quotes. So you'll have lines like this:
"1";"2";NULL;NULL
"2";"2";NULL;NULL
etc.
However, if you edit a CSV file in something like OpenOffice Calc, it might change this to put quotes around NULL, like so:
"1";"2";"NULL";"NULL"
"2";"2";"NULL";"NULL"
etc.
What should work is doing a search and replace for ["NULL" = NULL].
In your case, because you have empty (blank) fields, you'll be looking at doing a search and replace like this:
[,, = ,NULL,]
And probably a second pass for NULL values at the end of a line like so:
[,\n = ,NULL\n]
Ancient question, but in case another MySQL noob like myself comes across it.
The find/replace rigamarole jmbertucci describes is avoidable if you're in charge of creating the CSV file, for example when you're backing up your own databases. In phpMyAdmin, if you select the "custom" export method, you will see "Replace NULL with:" and the default is NULL. Simply change that to "NULL" and you save yourself a step.
I ran into this same problem and jmbertucci's answer worked great. I did run into one additional problem. In the case of a row of data like this:
"hello","world",,,,,,
which has multiple consecutive null values, doing a search and replace with [,, = ,NULL,] as jmbertucci suggested won't work as you intend it to on the first pass. Instead, you'll end up with
"hello","world",NULL,,NULL,,NULL
You should continue to do the search and replace until you end up with 0 occurrences replaced.