REST pattern for for query parameters that might be LIKE searches - rest

Hi I'm building a RESTful app and can't find the recommended way to pattern optional fuzzy or LIKE queries. For example a strict query might be,
/place?city=New+York&state=NY
Corresponds to SQL "... WHERE city="New York" AND state="NY"
But what if I wanted to search for the city field for any row with "York" in city name?
"... WHERE city LIKE "%{parameter}%" AND state="{parameter2}"
I'm thinking about just adding some kind of url-valid character to the request like this:
/place?city=*York*&state=NY
Is there an established or recommended pattern I should use? Thanks!

It's fine to use query string for searching, but it's a little bit weird to use macro character like "*" or "?" in query string(unless you decide to build a really powerful search engine like Google). More importantly, search is usually considered in fuzzy mode by default, so it's redundant to append/prepend the keyword with "*". If you do need exact search, you could surround the exact(or strict) keyword with double quotes. Namely, instead of using /place?city=*York*&state=NY, I recommend /place?city=York&state="NY".
In fact, Google uses quotes to search for an exact word or set of words, and I also found this site takes this pattern.

Related

Complex RegEx in T-SQL

I recently started using RegEx as conditional in my queries, but it seems that T-SQL has limited support for the official syntax.
As an example, I wish to test if a string is valid as a time between 00:00 and 23:59, and a fine RegEx expression would be "([0-1][0-9]|[2][0-3]):([0-5][0-9])":
select iif('16:06' like '([0-1][0-9]|[2][0-3]):([0-5][0-9])', 'Valid', 'Invalid')
.. fails and outputs "Invalid". Am I right to understand that T-SQL cannot handle groupings and conditionals (|)? I wound up lazily using a simplified RegEx which does not properly test the string - which I am fairly unhappy with:
select iif('16:06' like '[0-2][0-9]:[0-5][0-9]', 'Valid, 'Invalid')
.. which returns "Valid", but would also consider the string "28:06" as valid.
I know I can add further checks to fully check if it is a valid time string, but I would much prefer to take full advantage of RegEx.
Simply asked: Am I just doing or thinking things wrong about this being a limitation, and if yes - how can I use proper RegEx in T-SQL?
The pattern syntax used for LIKE and PATINDEX is much more limited than what's commonly known as Regular Expressions.
In standard SQL it actually has only 2 special characters.
% : wildcard for 0 or more characters
_ : any 1 character
And T-SQL added the character class [...] to the syntax.
But to test if a string contains a time, using LIKE is a clumsy way to do it.
In MS Sql Server one can use the TRY_CONVERT or TRY_CAST functions.
They'll return NULL is the conversion to a datatype fails.
select IIF(TRY_CAST('16:06' AS TIME) IS NOT NULL, 'Valid', 'Invalid')
This will return 'Valid' for '23:59', but 'Invalid' for '24:00'
You may use the following logic:
SELECT IIF('16:06' LIKE '[01][0-9]:[0-5][0-9]' OR
'16:06' LIKE '2[0-3]:[0-5][0-9]', 'Valid', 'Invalid');
The first LIKE expression matches 00:00 to 19:59, and the second LIKE matches 20:00 to 23:59. If SQL Server supported full regex, we could just use a single regex expression with an alternation.
I would recommend writing a user defined function using SQLCLR. Since .Net supports Regex you can port it to T-SQL. First link in Google gave this implementation, but there may be other (better) implementations.
Caveat - use of SQLCLR requires elevated permissions and may lead to security issues or performance issues or even issues with stability of the SQL Server if not implemented correctly. But if you know what you are doing this may lead to significant enhancements of T-SQL specific for your use cases.

Multiple regex in one command

Disclaimer: I have no engineering background whatsoever - please don't hold it against me ;)
What I'm trying to do:
Scan a bunch of text strings and find the ones that
are more than one word
contain title case (at least one capitalized word after the first one)
but exclude specific proper nouns that don't get checked for title case
and disregard any parameters in curly brackets
Example: Today, a Man walked his dogs named {FIDO} and {Fifi} down the Street.
Expectation: Flag the string for title capitalization because of Man and Street, not because of Today, {FIDO} or {Fifi}
Example: Don't post that video on TikTok.
Expectation: No flag because TikTok is a proper noun
I have bits and pieces, none of them error-free from what https://www.regextester.com/ keeps telling me so I'm really hoping for help from this community.
What I've tried (in piece meal but not all together):
(?=([A-Z][a-z]+\s+[A-Z][a-z]+))
^(?!(WordA|WordB)$)
^((?!{*}))
I think your problem is not really solvable solely with regex...
My recommendation would be splitting the input via [\s\W]+ (e.g. with python's re.split, if you really need strings with more than one word, you can check the length of the result), filtering each resulting word if the first character is uppercase (e.g with python's string.isupper) and finally filtering against a dictionary.
[\s\W]+ matches all whitespace and non-word characters, yielding words...
The reasoning behind this different approach: compiling all "proper nouns" in a regex is kinda impossible, using "isupper" also works with non-latin letters (e.g. when your strings are unicode, [A-Z] won't be sufficient to detect uppercase). Filtering utilizing a dictionary is a way more forward approach and much easier to maintain (I would recommend using set or other data type suited for fast lookups.
Maybe if you can define your use case more clearer we can work out a pure regex solution...

Algolia tag not searchable when ending with special characters

I'm coming across a strange situation where I cannot search on string tags that end with a special character. So far I've tried ) and ].
For example, given a Fruit index with a record with a tag apple (red), if you query (using the JS library) with tagFilters: "apple (red)", no results will be returned even if there are records with this tag.
However, if you change the tag to apple (red (not ending with a special character), results will be returned.
Is this a known issue? Is there a way to get around this?
EDIT
I saw this FAQ on special characters. However, it seems as though even if I set () as separator characters to index that only effects the direct attriubtes that are searchable, not the tag. is this correct? can I change the separator characters to index on tags?
You should try using the array syntax for your tags:
tagFilters: ["apple (red)"]
The reason it is currently failing is because of the syntax of tagFilters. When you pass a string, it tries to parse it using a special syntax, documented here, where commas mean "AND" and parentheses delimit an "OR" group.
By the way, tagFilters is now deprecated for a much clearer syntax available with the filters parameter. For your specific example, you'd use it this way:
filters: '_tags:"apple (red)"'

regex to get string within 2 strings

"artistName":"Travie McCoy", "collectionName":"Billionaire (feat. Bruno Mars) - Single", "trackName":"Billionaire (feat. Bruno Mars)",
i wish to get the artist name so Travie McCoy from within that code using regex, please not i am using regexkitlite for the iphone sdk if this changes things.
Thanks
"?artistName"?\s*:\s*"([^"]*)("|$) should do the trick. It even handles some variations in the string:
White space before and after the :
artistName with and without the quotes
missing " at the end of the artist name if it is the last thing on the line
But there will be many more variations in the input you might encounter that this regex will not match.
Also you don’t want to use a regex for matching this for performance reasons. Right now you might only be interested in the artistName field. But some time later you will want information from the other fields. If you just change the field name in the regex you’ll have to match the whole string again. Much better to use a parser and transform the whole string into a dictionary where you can access the different fields easily. Parsing the whole string shouldn’t take much longer than matching the last key/value pair using a regex.
This looks like some kind of JSON, there are lots of good and complete parsers available. It isn’t hard to write one yourself though. You could write a simple recursive descent parser in a couple of hours. I think this is something every programmer should have done at least once.
\"?artistName\"?\s*:\s*\"([^\"]*)(\"|$)
Thats for objective c

Matching beginning of words in a NSString

Is there a method built in to NSString that tokenizes the string and searches the beginning of each token? the compare method seems to only do the beginning of a string, and using rangeOfString isn't really sufficient because it doesn't have knowledge of tokens. Right now I'm thinking the best way to do this is to call
[myString componentsSeparatedByString:#" "]
and then loop over the resulting array, calling compare on each component of the string. Is this built-in and I just missed it?
Using CFStringTokenizer for, um, tokenizing strings will be more robust than splitting on #" ", but searching the results is still left up to you.
You may want to look into RegexKit Lite:
http://regexkit.sourceforge.net/#RegexKitLite/
Although it's a third party library, it's basically a very small (one class) wrapper built around the built-in fairly powerful regular expression engine.
It seems like this would be more useful since you could have non-capturing expressions match around the token-separators and then the capturing portion include or not include the text you are looking for along with the remaining text between tokens. If you have not used regular expressions much before, you'll want to read some kind of reference but just be aware you can separate out matching patterns from content you want to see with a cryptic but very powerful syntax.
I'm also not sure you can use CFStringTokenizer on the iPhone since the iPhone specific doc set has no reference for it.