Stopword "_" in sphinx - sphinx

I have records that come in as LastName_FirstName, e.g. Smith_Bob, so I am unable to search for Smith or Bob independently. I tried adding _ to my stopwords.txt file and re-rotating, to no avail. Is there another way to force _ to be ignored?

Related

Search for XML Nodes in Azure DevOps Server Code Search

We have recently configured our ADS 2020.1 with ElasticSearch, which will now allow us to do Code searches.
I checked the documentation but am unable to find an answer to my current "issue".
Suppose I want to find all files that contain this code:
<Project>
When I search for it just as <Project>, I get many hits for "Project" in the code.
When I search for "<Project>", I get the same result as above.
I also tried \<Project\> and "\<Project\>".
How can I properly search for the <> brackets in the code?
I have the same question for searches that include the = sign or similar characters: I cannot find a way to include them and search for them as text.
After further searching I found another topic here:
Is there a way to make TFS code search recognize the "#" symbol?
That's not supported.
I checked some characters in code search. You cannot use symbol characters other than * and ? as part of your search query. This includes the following characters: . , : ; / \ ` ' " @ = ! # $ & + ^ | ~ < > ( ) { } [ ]. The search simply ignores these symbols.
https://developercommunity.visualstudio.com/t/allow-non-alphanumeric-characters-in-code-search/893393
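To illustrate why <Project>, "<Project>" and \<Project\> all return the same hits, here is a toy model in Python of a search that ignores the symbols listed above. It is only a sketch of the described behavior, not the actual Azure DevOps or ElasticSearch implementation; the function name normalize_query is made up for this example.

```python
# Toy model only: the symbol set is taken from the answer above,
# and normalize_query is a hypothetical helper, not a real API.
IGNORED = set(".,:;/\\`'\"@=!#$&+^|~<>(){}[]")

def normalize_query(query: str) -> list[str]:
    """Replace every ignored symbol with a space and split into search terms."""
    cleaned = "".join(" " if ch in IGNORED else ch for ch in query)
    return cleaned.split()

# All of these variants collapse to the same single term.
for q in ["Project", "<Project>", '"<Project>"', "\\<Project\\>"]:
    print(repr(q), "->", normalize_query(q))
```

Under this model every variant reduces to the plain term Project, which matches what the question observed.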

Powershell - Delete specific word from a text file

I searched on Google but did not find anything about it.
I have a .CSV file that contains, for example, this:
APPNAME,Status OK,Ping OK
APPNAME2,Status OK,Ping FAIL
There are multiple lines. All I need is to delete "Status" and "Ping" from the text file so that the output looks like this:
APPNAME, OK, OK
APPNAME2, OK, FAIL
Is that possible? If so, thank you for your reply.
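The question asks about PowerShell; as a language-neutral illustration, here is a minimal Python sketch of the same idea: delete the whole words "Status" and "Ping" (in any letter case) from each line and leave everything else untouched. The function names and file paths are assumptions made for this example.

```python
import re

# Whole-word match for the two words to delete; IGNORECASE covers "Ping" vs "PING".
WORDS = re.compile(r"\b(?:Status|Ping)\b", re.IGNORECASE)

def strip_words(line: str) -> str:
    """Remove the unwanted words from one line, keeping the remaining text as-is."""
    return WORDS.sub("", line)

def strip_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path with the unwanted words removed from every line."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(strip_words(line))

print(strip_words("APPNAME,Status OK,Ping OK"))     # APPNAME, OK, OK
print(strip_words("APPNAME2,Status OK,Ping FAIL"))  # APPNAME2, OK, FAIL
```

The space that followed each deleted word is kept, which is why the result matches the "APPNAME, OK, OK" layout requested above.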

Postgres full text search ignore url

I am trying to use PostgreSQL to implement a full-text search system.
I have encountered a strange, or maybe intended, behavior with it.
When I index or search a column that contains file names with extensions (e.g. myimage.jpg), the system treats the name as a URL and does not tokenize it properly.
I referred to the documentation and can see via ts_debug that the file name is treated as the host of a URL.
Could someone tell me how to treat all input as normal words in PostgreSQL FTS?
Also, as a second request, how can one do contains, startswith, and endswith searches with it?
Update
I have now tried the statement create text search configuration..., copied from pg_catalog.english, removed host, url, and url_path, and then specified that configuration in the ts_debug call. But still no go: myimage.jpg is still identified as host.
Version
I use version 9.4
tl;dr Look at pre-parsing your input and removing punctuation if you really only want words (and not emails, urls, hosts, etc).
After trying to figure this out myself, the issue seems to be that you cannot easily customise the parser. From my understanding, the parser runs first and generates tokens. Those tokens are then matched against dictionaries.
By removing host, url, and url_path from the configuration, all you are doing is preventing these tokens from being looked up in a dictionary, so no lexeme is produced for them. That essentially means they do not exist as far as search is concerned, which is not what you want.
Ideally you would customise the parser so that it does not generate those tokens in the first place, or so that it also generates overlapping tokens (similar to how hyphenated words generate a token for the entire word as well as for the individual components). This does not seem to be possible at the moment without writing a custom parser.
The only solution would be to pre-parse the text to remove the full stop. Note that if you rely on other token types such as version (e.g. 8.3.0) or email (e.g. name@domain.com), this will break them, so you may need to be a bit clever about how you remove characters.
select ts_debug('english', replace('this-is-a-file.jpg', '.', ' '));
"(asciihword,"Hyphenated word, all ASCII",this-is-a-file,{english_stem},english_stem,{this-is-a-fil})"
"(hword_asciipart,"Hyphenated word part, all ASCII",this,{english_stem},english_stem,{})"
"(blank,"Space symbols",-,{},,)"
"(hword_asciipart,"Hyphenated word part, all ASCII",is,{english_stem},english_stem,{})"
"(blank,"Space symbols",-,{},,)"
"(hword_asciipart,"Hyphenated word part, all ASCII",a,{english_stem},english_stem,{})"
"(blank,"Space symbols",-,{},,)"
"(hword_asciipart,"Hyphenated word part, all ASCII",file,{english_stem},english_stem,{file})"
"(blank,"Space symbols"," ",{},,)"
"(asciiword,"Word, all ASCII",jpg,{english_stem},english_stem,{jpg})"
As for your second question: are you talking about partial word matches? You get this to a small degree from stemming when using a config like english, so running becomes run, which will match whether you search for run or running. If you are talking about fuzzy matching, it gets a little more complicated. I suggest reading this article: http://rachbelaid.com/postgres-full-text-search-is-good-enough/
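The "be a bit clever" part of the pre-parsing suggestion can be sketched in application code before the text reaches to_tsvector. The following Python sketch is one possible approach, not the only one: it splits on whitespace and replaces full stops with spaces, except in tokens that look like an email address or a dotted version number. The two regular expressions and the function name preparse are assumptions for this example and are deliberately simple.

```python
import re

# Deliberately simple patterns (assumptions for this sketch, not PostgreSQL's parser rules).
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
VERSION = re.compile(r"^\d+(?:\.\d+)+$")

def preparse(text: str) -> str:
    """Replace '.' with a space, except inside email-like and version-like tokens."""
    parts = []
    for token in text.split():
        if EMAIL.match(token) or VERSION.match(token):
            parts.append(token)                    # keep these tokens intact
        else:
            parts.append(token.replace(".", " "))  # myimage.jpg -> "myimage jpg"
    # Collapse any doubled spaces introduced by the replacement.
    return " ".join(" ".join(parts).split())

print(preparse("myimage.jpg"))
print(preparse("see 8.3.0 and name@domain.com in this-is-a-file.jpg"))
```

The cleaned string would then be passed to the full-text functions in place of the raw column value, in the same way as the replace() call in the ts_debug example.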

find - globbing in path to search

My question is simple, and I couldn't find an answer on Google.
Why is it that if I type:
find *h
or
find *g
or any other character following the star, the result is all files in the current directory and its subdirectories?
The result is the same for
find *
which is obvious. I guess the star (*) acts here as the directory where the search starts, not as the file pattern to search for, so the * expands to "all directories in the current directory". In this case find will search in all directories and find all files, which is the expected behavior. But why, if I provide '*g' as the directory to start searching, does it also find all files, even though there is not a single directory whose name ends with 'g'?
What you are describing is not how it works. *g is expanded by the shell to all the files and directories in the current directory which end with g and then find acts on that list.
As @Barmar points out in a comment, what you describe sounds like you have no matches for *g and the nullglob option set in your shell, which causes a wildcard expression with no matches to expand to nothing, so find receives no path argument at all. (The default behavior is to leave the pattern unexpanded, which would cause an error message from find.)
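The expansion step described above can be modeled in a few lines of Python. This is only a sketch of the shell's behavior, not real shell code: shell_expand and find_argv are hypothetical helpers, and the directory listing is made up. It shows the argument list that find would receive in each case. With an empty argument list, GNU find starts from the current directory, which would explain why every file is listed.

```python
import fnmatch

def shell_expand(pattern: str, names: list[str], nullglob: bool = False) -> list[str]:
    """Model how a shell expands a glob against the entries of the current directory."""
    # A leading '*' does not match hidden entries, so skip names starting with '.'.
    matches = sorted(n for n in names
                     if not n.startswith(".") and fnmatch.fnmatchcase(n, pattern))
    if matches:
        return matches
    # No match: nullglob removes the word, the default keeps the pattern literally.
    return [] if nullglob else [pattern]

def find_argv(pattern: str, names: list[str], nullglob: bool = False) -> list[str]:
    """Build the argument vector that find would actually be started with."""
    return ["find"] + shell_expand(pattern, names, nullglob)

listing = ["src", "docs", "readme.txt"]               # hypothetical directory contents
print(find_argv("*g", listing))                       # pattern left as-is, find reports an error
print(find_argv("*g", listing, nullglob=True))        # no path argument at all
print(find_argv("*g", listing + ["log"]))             # only the matching entry is passed
```

In every case find never sees the wildcard as a pattern to search for; it only sees whatever the shell left on the command line.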

Eclipse - Search for a word or Phrase inside the package

Is there a way I can find a particular word or phrase in a package/project (in all files in the package/project) without opening each and every class file and using Ctrl+F?
Let's say I have a template project for schools. I just need to change the school's name with the needed name. I know I can keep this as a constant and do, but I am just giving you a scenario.
My requirement is to find all files in the package where the particular phrase or word exists and change it.
Use Search -> Search... or Ctrl+H. There is File Search, where you can search by text or regular expression and restrict the search by scope and/or file name patterns. And there is Java Search, which allows you to find declarations, references, and occurrences of Java elements.