Is it possible to combine multiple text search configurations in FTS on PostgreSQL? - postgresql

I tried to combine multiple text search configurations for full text search in PostgreSQL.
I tried:
Create text search configuration test (
copy = english, french
)
But this didn't work:
text search configuration parameter "french" not recognized
I have a column that contains a mix of English and French words, and I want to apply multiple text search configurations when searching it.
Example:
to_tsvector('test', words) @@ to_tsquery('test', 'activité')
to_tsvector('test', words) @@ to_tsquery('test', 'mystery')
How can I mix different text search configurations so that I get results whether I search for a French or an English word?

The French text search configuration uses French stemming (the french_stem dictionary), while for English english_stem is used.
How do you want to stem for both? You could create a text search configuration that applies both stemmers, but I guess that the result would not be convincing. The same applies to stop words.
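A minimal sketch of such a combined configuration, assuming the built-in english configuration and the built-in french_stem and english_stem dictionaries; the name test is taken from the question. The PostgreSQL documentation notes that a Snowball stemmer recognizes every token, so the stemmer listed second is not expected to be consulted, which may be one reason the result is unconvincing. ts_debug shows which dictionary actually handled each token.

```sql
-- COPY accepts exactly one source configuration, so start from English.
CREATE TEXT SEARCH CONFIGURATION test (COPY = english);

-- List both stemmers for the word-like token types.
-- Dictionaries are tried in order; the first one that recognizes
-- the token wins, and Snowball stemmers recognize everything.
ALTER TEXT SEARCH CONFIGURATION test
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH french_stem, english_stem;

-- Inspect which dictionary produced the lexemes for each token.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('test', 'activité mystery');
```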
You can explicitly specify the text search configuration in the query if you know what language you want to search for.
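As a sketch of specifying the configuration explicitly, assuming a hypothetical table docs with columns id and words (only the column name words comes from the question): the first query searches with one known language, the second matches under either configuration when the language of the search term is not known. The expression indexes are optional and only illustrate one way to support both queries.

```sql
-- Language known: use the matching configuration on both sides.
SELECT id, words
FROM docs
WHERE to_tsvector('french', words) @@ to_tsquery('french', 'activité');

-- Language unknown: accept a match under either configuration.
SELECT id, words
FROM docs
WHERE to_tsvector('english', words) @@ to_tsquery('english', 'mystery')
   OR to_tsvector('french',  words) @@ to_tsquery('french',  'mystery');

-- Optional: one GIN expression index per configuration.
CREATE INDEX docs_words_en_idx ON docs USING gin (to_tsvector('english', words));
CREATE INDEX docs_words_fr_idx ON docs USING gin (to_tsvector('french',  words));
```

Each index is only usable by a query whose to_tsvector call names the same configuration as the index expression.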

Related

PostgreSQL full text search returning wrong results

I'm using a PostgreSQL full text search tsvector column.
But I found a problem:
When I search for "calça"
The results include the following:
1- calça red
2- calça blue
3- calçado red
Why is "calçado" being returned when I search for "calça"?
Is there a configuration setting that solves this?
Thanks.
It isn't just a matter that one string contains the other. The Portuguese stemmer thinks this is the way they should be stemmed. If you turn the longer word into 'calçadot', for example, it no longer stems it, because (presumably) 'adot' is not recognized as a Portuguese suffix which ought to be removed the way 'ado' is.
If you don't want stemming at all, then you could change the config to 'simple', which doesn't stem. But at that point, maybe you don't want full text search at all, and could just use LIKE instead with a pg_trgm index.
If it is just this particular word that you don't want stemmed, I think you can set up a synonym dictionary which will map calçado to itself, which will bypass stemming.
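The options above can be sketched as follows, assuming the built-in portuguese configuration and portuguese_stem dictionary and a UTF-8 database. The names pt_keep and pt_custom are hypothetical, and the synonym dictionary assumes a file pt_keep.syn exists in $SHAREDIR/tsearch_data containing the single line "calçado calçado".

```sql
-- Compare what the Portuguese stemmer produces for each word.
SELECT ts_lexize('portuguese_stem', 'calça')   AS calca,
       ts_lexize('portuguese_stem', 'calçado') AS calcado;

-- The 'simple' configuration lowercases but does not stem,
-- so 'calça' should not match 'calçado' here.
SELECT to_tsvector('simple', 'calçado red') @@ to_tsquery('simple', 'calça');

-- Hypothetical synonym dictionary that maps calçado to itself.
CREATE TEXT SEARCH DICTIONARY pt_keep (TEMPLATE = synonym, SYNONYMS = pt_keep);

-- Copy the Portuguese configuration and consult the synonym
-- dictionary before the stemmer for word-like tokens.
CREATE TEXT SEARCH CONFIGURATION pt_custom (COPY = portuguese);
ALTER TEXT SEARCH CONFIGURATION pt_custom
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH pt_keep, portuguese_stem;
```

The tsvector column and the queries would both have to use pt_custom for the synonym mapping to take effect.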

VSCode multiline search of two words?

I saw an SO post that says you can use a regex, or the literal text itself, to search for multiline text. But what if you want to (quickly) search for two or three words within a given range of lines?
For example, suppose you want to find a multiline passage that contains both "ruby" and "regex", say to locate where you took a note in a txt, Markdown, or rich text file. You might be looking for "how to use regex in ruby" or "the ruby regex tutorial".
You can use a simple (but redundant) regex like ruby(.*\n)+regex|regex(.*\n)+ruby. But to me it doesn't look beautiful. For three or more words, this workaround needs one alternative per ordering of the words (n! alternatives for n words), which is not good.
So is there a smarter way to do this? Thanks.

MongoDB Text Search AND multiple search words with word stemming

I am trying to search for multiple words in a text inclusively (an AND operation)
without losing word stemming.
For example:
db.supplies.runCommand("text", {search:"printers inks"})
should return results containing (printer ink), (printers ink), (printer inks), or (printers inks), instead of all results containing either printer or ink.
This post covers searching for multiple words as an AND operation, but its solution doesn't match stemmed words: MongoDB Text Search AND multiple search words.
The only way I could think of is to generate every permutation of the words and run one search per permutation (which could be a large number).
This may not be an effective way to search a large collection.
Is there a better and smarter way to do it?
Is there a reason you have to use a text search? If it were me, I would use a regular expression.
https://docs.mongodb.com/manual/reference/operator/query/regex/
Off the top of my head, something like this:
db.collection.find({products:/printers inks|printers|inks/})
I suppose you can do the same thing with a text search too:
db.collection.find({$text:{$search : "\"printers inks\" printers inks"}})
Note the escaped quotes.

How to full-text index both Chinese and English characters together using the ngram parser in MySQL 5.7?

I have a table named 'comp' with a column 'compName'. The compName values contain characters from different countries' languages. I am using MySQL 5.7 with the ngram parser. Searching for Chinese words works fine, but searching for English words gives bad results. According to the INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE table, English words are tokenized character by character: for example, abc is tokenized as ab, bc. But as I understand it, English should be tokenized on spaces, right? So how can I resolve this kind of case when using the ngram parser in MySQL 5.7?
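The tokenization described above can be reproduced with a sketch like the following. It assumes a hypothetical schema name mydb, an index name ft_compname chosen for illustration, the default ngram_token_size of 2, and that any other columns of comp have defaults; only comp and compName come from the question.

```sql
-- Full-text index on compName using the ngram parser.
CREATE FULLTEXT INDEX ft_compname ON comp (compName) WITH PARSER ngram;

-- Point the InnoDB full-text diagnostic tables at this table
-- (format is 'schema_name/table_name').
SET GLOBAL innodb_ft_aux_table = 'mydb/comp';

-- Tokens of rows inserted after index creation appear in the cache table.
INSERT INTO comp (compName) VALUES ('abc');

SELECT WORD, DOC_ID, POSITION
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY DOC_ID, POSITION;
```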

Foreign languages words in a text

I have a French text with some English words in it, and I want to find those words and highlight all of them at once. Is there any program that can help me do that? Is it possible to do this with any other foreign language?
I'm using Microsoft Word.
Word can do this if the English words are formatted with English as their language (and the rest with French). In that case, the advanced options of Word's Find feature can filter on the language formatting instead of on the text.