VK newsfeed.search -- search string not respecting " " for exact term search - vk

When calling newsfeed.search with q = "UEFA EL", we find that the exact-phrase search is ignored. The response contains posts with the keywords "UEFA" and "EL", but separated rather than as a phrase.
Does VK newsfeed.search respect exact-phrase search?
https://api.vk.com/method/newsfeed.search?v=5.69&q=%22UEFA%20EL%22&count=200&start_time=1618441503&access_token=[access_token]

Regex : findall with a repeated capture group

I would like to understand why :
re.findall(r"(\d[A-Za-z]+)", "My user name is 3e4r 5fg")
returns
['3e', '4r', '5fg']
while :
re.findall(r"(\d[A-Za-z]+)+", "My user name is 3e4r 5fg")
returns
['4r', '5fg']
I tested some combinations with spaces between the "digit-letter" groups, and two things are clearly involved:
the spaces between those groups
the final "+".
I don't really understand why adding "+" after the group changes the result. Can someone explain the steps of the process that lead to these different answers? Thank you very much.
When you put + after the parentheses, you are searching for a pattern that contains one or more repetitions of the sub-pattern: one digit followed by one or more letters.
So the pattern "(\d[A-Za-z]+)+" returns 2 matches:
3e4r
5fg
When you put a sub-pattern in parentheses, whatever it matches is captured in a group. When the group is repeated, only its last repetition is kept, so the groups are:
4r
5fg
The function re.findall returns only the groups (unless there are no groups, in which case it returns the whole matches).
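The difference between the whole match and the captured group can be checked directly. The following is a minimal sketch using re.finditer to show both side by side. The non-capturing variant (?:...) at the end is an addition not mentioned in the question; it is one way to get the full matches back from re.findall.

```python
import re

text = "My user name is 3e4r 5fg"
pattern = re.compile(r"(\d[A-Za-z]+)+")

# For each match, show the whole match (group 0) next to the captured group 1.
# A repeated group only remembers its last repetition.
pairs = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]
print(pairs)  # [('3e4r', '4r'), ('5fg', '5fg')]

# With a non-capturing group there is nothing to capture,
# so re.findall returns the whole matches instead.
whole = re.findall(r"(?:\d[A-Za-z]+)+", text)
print(whole)  # ['3e4r', '5fg']
```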

Prefix/wildcard searches with 'websearch_to_tsquery' in PostgreSQL Full Text Search?

I'm currently using the websearch_to_tsquery function for full text search in PostgreSQL. It all works well except for the fact that I no longer seem to be able to do partial matches.
SELECT ts_headline('english', q.\"Content\", websearch_to_tsquery('english', {request.Text}), 'MaxFragments=3,MaxWords=25,MinWords=2') Highlight, *
FROM (
SELECT ts_rank_cd(f.\"SearchVector\", websearch_to_tsquery('english', {request.Text})) AS Rank, *
FROM public.\"FileExtracts\" f, websearch_to_tsquery('english', {request.Text}) as tsq
WHERE f.\"SearchVector\" @@ tsq
ORDER BY rank DESC
) q
Searches for customer work but cust* and cust:* do not.
I've had a look through the documentation and a number of articles but I can't find a lot of info on it. I haven't worked with it before so hopefully it's just something simple that I'm doing wrong?
You can't do this with websearch_to_tsquery, but you can do it with to_tsquery (because to_tsquery accepts the :* prefix wildcard) and add the websearch syntax yourself in your backend.
For example, in a Node.js environment you could do something like this:
let trimmedSearch = req.query.search.trim()
let searchArray = trimmedSearch.split(/\s+/) // split on runs of whitespace, dropping the whitespace
let searchWithStar = searchArray.join(' & ') + ':*' // join the words back together with AND between them and a prefix wildcard on the last word
let escapedSearch = yourEscapeFunction(searchWithStar)
and then use it in your SQL:
search_column @@ to_tsquery('english', ${escapedSearch})
You need to write the tsquery directly if you want to use partial matching. plainto_tsquery doesn't pass through partial match notation either, so what were you doing before you switched to websearch_to_tsquery?
Anything that applies a stemmer is going to have a hard time handling partial matches. What is it supposed to do: take off the notation, stem the part, then add it back on again? Not do stemming on the whole string? Not do stemming on just the token containing the partial match indicator? And how would it even know a partial match was intended, rather than the notation just being another piece of punctuation?
To add something on top of the other good answers here, you can also compose your query with both websearch_to_tsquery and to_tsquery to have everything from both worlds:
select * from your_table where ts_vector_col @@ to_tsquery('simple', websearch_to_tsquery('simple', 'partial query')::text || ':*')
Another solution I have come up with is to do the text transform as part of the query, so building the tsquery looks like this:
to_tsquery(concat(regexp_replace(trim(' all the search terms here '), '\W+', ':* & ', 'g'), ':*'));
(trim) Removes leading/trailing whitespace
(regexp_replace) Splits the search string on non-word chars and adds trailing wildcards to each term, then ANDs the terms (:* & ). The 'g' flag is needed so that every separator is replaced, not just the first one.
(concat) Adds a trailing wildcard to the final term
(to_tsquery) Converts to a tsquery
You can test the string manipulation by running
SELECT concat(regexp_replace(trim(' all the search terms here '), '\W+', ':* & ', 'gm'), ':*')
the result should be
all:* & the:* & search:* & terms:* & here:*
So you get multi-word partial matches, e.g. searching for spi ma would return results matching spider man.
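If you would rather build the string in application code than in SQL, the same transformation can be sketched in Python. This is only an illustration: the function name build_prefix_tsquery is made up for this example, and the result should still be passed to to_tsquery as a bound parameter rather than concatenated into the SQL text.

```python
import re


def build_prefix_tsquery(raw: str) -> str:
    """Turn free text into a to_tsquery string with a prefix wildcard on every term.

    Hypothetical helper for illustration; it mirrors the regexp_replace/concat
    approach above, not any built-in PostgreSQL behaviour.
    """
    # Split on runs of non-word characters (the same '\W+' idea as in the SQL)
    # and drop the empty pieces left by leading/trailing separators.
    terms = [t for t in re.split(r"\W+", raw.strip()) if t]
    # AND the terms together, each with a trailing ':*' prefix marker.
    return " & ".join(f"{t}:*" for t in terms)


print(build_prefix_tsquery(" all the search terms here "))
# all:* & the:* & search:* & terms:* & here:*
```

Input with no word characters at all yields an empty string, which the caller would need to handle before running the query.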

Microsoft graph Mail Search Strict value

I have an issue with the search parameters. I want to pass a phrase in my query. For example, I'm looking for emails where the subject is "Test 1".
For this I'm doing a GET on this resource:
https://graph.microsoft.com/v1.0/me/messages?$search="subject:Test 1"
But this query behaves as: look for mails that contain "Test" in the subject OR "1" in any other field.
Referring to the KQL syntax:
A phrase (includes two or more words together, separated by spaces; however, the words must be enclosed in double quotation marks)
So to do what I want, I have to put double quotes (") around my phrase to do a strict value search, like below:
subject:"Test 1"
This is where the problem is: the Microsoft Graph API already uses double quotes (") after the $search parameter.
?$search="Key words"
So I can't do what is mentioned in the KQL doc.
https://graph.microsoft.com/v1.0/me/messages?$search="subject:"Test 1""
It throws an error:
"Syntax error: character '1' is not valid at position 15 in '\"subject:\"test 1\"\"'.",
This is expected behaviour; I was pretty sure it would not work.
If someone has any suggestions for a solution or a workaround, I'm all ears.
What I've already tried so far:
Using single quotes
Removing the quotes right after $search=
Removing the subject part, $search="Test 1": same behaviour as the first request mentioned in this post. It looks for emails that contain "test" or "1".
Best regards.
EDIT:
After sasfrog's answer:
I used $filter: it works well with the simple operators AND and OR. I get some errors when using the NOT operator. Also, to sort the results by date you have to use the $orderby parameter and add the same field to the $filter parameters.
Example 1 (working, what I asked for first):
https://graph.microsoft.com/v1.0/me/messages/?$orderby=receivedDateTime desc &$filter=receivedDateTime ge 1900-01-01T00:00:00Z AND contains(subject,'test 1')
Example 2 (not working):
https://graph.microsoft.com/v1.0/me/messages/?$orderby=receivedDateTime desc &$filter=(receivedDateTime ge 1900-01-01T00:00:00Z AND contains(subject,'test 1')) NOT(contains(from/EmailAddress/address,[specific address]))
EDIT 2
After some tests with the filter parameters:
The NOT operator is still not working, so as a workaround use "ne" (not equals).
Example 2 becomes:
https://graph.microsoft.com/v1.0/me/messages/?$orderby=receivedDateTime desc&$filter=(receivedDateTime ge 1900-01-01T00:00:00Z AND contains(subject,'test 1')) AND (from/EmailAddress/address ne [specific address])
UPDATE : OTHER SOLUTION WITH $search
Using $filter is great, but it sometimes seemed pretty slow. So I found a workaround for my issue.
It is to use the AND operator between all terms.
Example 4:
I'm looking for the mails where the subject is test 1.
Let value = "test 1". You have to split it on the space separator, then write some code to manipulate the resulting array to obtain something like below.
$search="(subject:test AND subject:1)"
The brackets can be important if you search on multiple fields. And voilà.
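The "split and join with AND" step could look roughly like this in Python. The helper name build_field_search is hypothetical, and the sketch only builds the value of the $search parameter; it does not call the Graph API, and the value would still need to be URL-encoded when placed in the request.

```python
def build_field_search(field: str, value: str) -> str:
    """Build a $search value that requires every word of `value` in `field`.

    Hypothetical helper for illustration only; it reproduces the string
    shown above and makes no request to Microsoft Graph.
    """
    # Split on whitespace so "test 1" becomes ["test", "1"].
    terms = value.split()
    # Prefix each term with the field name and join them with AND.
    clause = " AND ".join(f"{field}:{term}" for term in terms)
    # Brackets keep the clause together; the outer double quotes are the
    # ones that $search expects around its value.
    return f'"({clause})"'


print(build_field_search("subject", "test 1"))
# "(subject:test AND subject:1)"
```

Note that this matches messages whose subject contains all the words in any order, which is close to, but not the same as, an exact phrase match.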
Not sure if it's sufficient for what you're doing, but how about using the contains function within a filter query instead:
https://graph.microsoft.com/v1.0/me/messages?$filter=contains(subject,'Test 1')
Sounds like you're already looking at the docs, but here they are just in case.
Update: this also worked for me using the search method:
https://graph.microsoft.com/v1.0/me/messages?$search="subject:'Test 1'"

Reduce multiple whitespaces to a single space in KDB+/Q

To get from "a    b" to "a b"
ssr["a    b";"[ ]+";" "]
doesn't seem to work.
Thanks!
You can use the following, which splits the string on each pair of spaces and joins the pieces back with a single space,
then, using over, repeats this until nothing changes.
q)x:"This    is a     test"
q)(" "sv"  "vs)/[x]
"This is a test"
It is possible to do this more efficiently than with vs and sv, using the each-prior adverb ':
{x where not(and':)null x}"This    is a     test"
"This is a test"
Alternatively, you can use ssr with the adverb over / to keep replacing blocks of two spaces with one until the string stops changing:
ssr[;"  ";" "]/["This    is a test"]
"This is a test"
The example you provided fails because of the limited regex options available in q: using + as in "[ ]+" is an operation that is not supported. You can read more about regex in q on the kx wiki.
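For readers less familiar with q, the "replace pairs until nothing changes" idea behind the over (/) versions above can be sketched in Python. This is only an illustration of the technique, not q code; the function name squeeze_spaces is made up for the example.

```python
def squeeze_spaces(s: str) -> str:
    """Collapse runs of spaces to a single space by repeated pair replacement."""
    while True:
        # Replace every non-overlapping pair of spaces with one space.
        t = s.replace("  ", " ")
        # Stop once a pass changes nothing, which is the convergence
        # condition that over (/) applies in the q versions.
        if t == s:
            return s
        s = t


print(squeeze_spaces("This    is a     test"))  # This is a test
```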

Sphinx query before and after a term

Is it possible to set up a query in Sphinx where a term only matches if a given word appears either before OR after it?
(TermBefore) (Term) (TermAfter)
so that both
TermBefore Term
Term TermAfter
would match but
Term
does not?
The proximity search operator is pretty much designed for this
"Term TermAfter"~2
http://sphinxsearch.com/docs/current.html#extended-syntax
Ah, I thought you meant 'TermAfter' to actually be the same word, just that it can be before or after.
But if they are two different terms, possibly the easiest is just to do:
"TermBefore Term" | "Term TermAfter"
Just the simple phrase operator, where either phrase must match.
Edit again:
If you don't want the matches to have to be adjacent, use the strict order operator rather than the phrase operator:
(TermBefore << Term) | (Term << TermAfter)