How do I replace all occurrences of ' sub.*' with the exception of ' substation.*'?
regexp_replace("CleanString",' sub.*',' ', 'ig')
I have tried using various combinations of groupings () but still not getting it.
Using postgres regexp_replace()
A regular expression normally matches only things that are there, not things that are not there - you cannot simply put an "if-then-else" in there.
However, Postgres's regex support, as documented in the manual, includes "lookahead" and "lookbehind" expressions.
In your case, you want a "negative lookahead":
(?!re) negative lookahead matches at any point where no substring matching re begins (AREs only)
It's important to note the phrase "at any point" - lookarounds are "zero width", so (?!station) doesn't mean "something other than station", it means "a position in the string where station isn't coming next".
You can therefore construct your query like this:
' sub(?!station).*'
That will match in any of " sub", "foo sub", " subbar", or "foo subbar", but not in any of " substation", "foo substation", " substationbar", or "foo substationbar". Note that the pattern starts with a space, so a bare "sub" at the very start of the string is never matched. Since the (?!station) is zero-width, and the next token is .*, it's fine for nothing to come after " sub".
If you want there to be something after the "sub", you could instead write:
' sub(?!station).+'
The .+ means "at least one of something", so it will still match " subbar" and "foo subbar", but will no longer match " sub" or "foo sub".
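To see the zero-width behaviour outside the database, here is a small sketch using Python's re module, which supports the same (?!...) syntax. It only approximates regexp_replace with the 'ig' flags: the helper name and the sample strings are made up for illustration, and Python's engine is not identical to Postgres's ARE engine.

```python
import re

# Same pattern as above. IGNORECASE stands in for the 'i' flag;
# re.sub replaces every match by default, which stands in for 'g'.
PATTERN = re.compile(r" sub(?!station).*", re.IGNORECASE)

def clean(text: str) -> str:
    """Replace ' sub...' (but not ' substation...') with a single space."""
    return PATTERN.sub(" ", text)

samples = [
    "foo sub",
    "foo subbar baz",
    "foo substation bar",
    "foo Substation and sub x",
]
for s in samples:
    print(repr(s), "->", repr(clean(s)))
```

The last sample shows that the lookahead only rejects the position before "station": the later " sub x" in the same string is still replaced.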
I am trying to parse all the files and check whether any file's content contains the strings TESTDIR or TEST_DIR.
Files contents might look something like:-
TESTDIR = foo
include $(TESTDIR)/chop.mk
...
TEST_DIR := goldimage
MAKE_TESTDIR = var_make
NEW_TEST_DIR = tesing_var
Actually I am only interested in TESTDIR, $(TESTDIR) and TEST_DIR, so in my case the last two lines should be ignored. I am new to Perl. Can anyone help me out with the regex?
/\bTEST_?DIR\b/
\b means a "word boundary", i.e. the place between a word character and a non-word character. "Word" here has the Perl meaning: word characters are letters, digits, and underscores.
_? means "nothing or an underscore"
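As a quick check of the word-boundary behaviour, here is a sketch that runs the same pattern over the sample lines from the question. It uses Python's re module rather than Perl, on the assumption that \b behaves the same way for these ASCII names. The list name is invented for the example.

```python
import re

# Same pattern as above. The underscore counts as a word character,
# so there is no boundary between "MAKE_" and "TESTDIR".
pattern = re.compile(r"\bTEST_?DIR\b")

lines = [
    "TESTDIR = foo",
    "include $(TESTDIR)/chop.mk",
    "TEST_DIR := goldimage",
    "MAKE_TESTDIR = var_make",
    "NEW_TEST_DIR = tesing_var",
]

# Keep only the lines that contain TESTDIR or TEST_DIR as a whole word.
matching = [line for line in lines if pattern.search(line)]
for line in matching:
    print(line)
```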
Look at "characterset".
Only a surrounding space is allowed:
/^(.* )?TEST_?DIR /
^ beginning of the line
(.* )? there may be some leading content (.*), but if there is, it must be followed by a space
The space at the end says that a space must follow the name. Otherwise use ( .*)?$ at the end.
One of a given character set is allowed:
If characters other than a space should also be allowed, you can use a character class []:
/^(.*[ \t(])?TEST_?DIR[) :=]/
(.*[ \t(])? in front of TEST_?DIR there may be a space, a \t (tab), or a (, or nothing at all if the line starts with the name itself.
Afterwards there must be one of: a space, :, =, or ). That may be followed by anything (the "=" of ":=" counts as part of "anything").
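The character-class variant can be tried against the sample lines from the question. This is a sketch in Python's re module rather than Perl; the function name and the extra "TESTDIRS" line are assumptions added to show a rejected case.

```python
import re

# Character-class variant from above: an optional prefix ending in
# space, tab or "(", then the name, then one of ")", space, ":" or "=".
pattern = re.compile(r"^(.*[ \t(])?TEST_?DIR[) :=]")

def mentions_testdir(line: str) -> bool:
    """True if the line uses TESTDIR/TEST_DIR in one of the allowed contexts."""
    return pattern.search(line) is not None

lines = [
    "TESTDIR = foo",
    "include $(TESTDIR)/chop.mk",
    "TEST_DIR := goldimage",
    "MAKE_TESTDIR = var_make",
    "NEW_TEST_DIR = tesing_var",
    "TESTDIRS = extra",  # hypothetical line: "S" is not in the trailing class
]
for line in lines:
    print(mentions_testdir(line), line)
```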
One of a given group is allowed:
For this you need a group in (), with each alternative separated by a |:
/^(.*( |\t))?TEST_?DIR( | := | = )/
In this case the group at the beginning, ( |\t), is equivalent to [ \t], because each alternative is a single character.
At the end there must be a single space, or := surrounded by spaces, or = surrounded by spaces, followed by anything.
You can use any combination...
/^(.*[ \t(])?TEST_?DIR([) =:]| :=| =|)/
Test it on Debuggex.com. (Use 'PCRE')
I am trying to allow double quotation marks into my grammar's functions. I was hoping that I could use Haskell conventions to generate something like:
> mkSentence "This is \"just\" a sentence"
> This is "just" a sentence
However, when I try this in my grammar, I am faced with errors like in the example below using the English RGL:
> cc -table ss "This is \"just\" a sentence"
constant not found: just
given Predef, Predef, CatEng, ResEng, MorphoEng, Prelude,
ParadigmsEng
A function type is expected for ss "This is " instead of type {s : Str}
0 msec
> cc -table ss "This is \"just a sentence"
lexical error
0 msec
I can see that src/common/ExtendFunctor.gf in the RGL has an implementation of quoted:
oper
quoted : Str -> Str = \s -> "\"" ++ s ++ "\"" ; ---- TODO bind ; move to Prelude?
I have tried to implement something similar, but " may be used in different parts of my grammar, so ideally the double quotation marks could be escaped without special binds. I am considering defaulting to ” to avoid the issues with ", but maybe there is a way to escape double quotation marks "everywhere" (like in these docs)?
Any tips or reference to other docs would be very appreciated!
As far as I know, there is no API function for handling quotes. You can just do something like this yourself:
oper
qmark : Str = "\"" ;
quote : Str -> Str = \s -> qmark + s + qmark ;
And call it like this:
> cc -one ss ("This is" ++ quote "just" ++ "a sentence")
This is "just" a sentence
As long as you're only handling strings that are not runtime tokens, it works fine.
It's of course a bit clumsy to have to write it like that, but you can always write a sed oneliner from your preferred syntax. This works for just one "quoted" part, adjust as you wish for more.
$ sed -E 's/(.*) \\"(.*)\\" (.*)/("\1" ++ quote "\2" ++ "\3")/'
this is \"just\" a sentence
("this is" ++ quote "just" ++ "a sentence")
this is \"just\" a sentence with \"two\" quoted words
("this is \"just\" a sentence with" ++ quote "two" ++ "quoted words")
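If sed is not at hand, the same rewrite can be sketched with Python's re module. The function name is made up, and the behaviour is meant to mirror the sed expression above, including its limitation to a single quoted part per line.

```python
import re

def to_gf_expr(line: str) -> str:
    """Rewrite one backslash-quoted part of a line into a `quote` call."""
    # The first group is greedy, so when a line has several \"...\" parts
    # only the last one becomes a quote call, as with the sed version.
    return re.sub(
        r'(.*) \\"(.*)\\" (.*)',
        r'("\1" ++ quote "\2" ++ "\3")',
        line,
        count=1,
    )

print(to_gf_expr(r'this is \"just\" a sentence'))
print(to_gf_expr(r'this is \"just\" a sentence with \"two\" quoted words'))
```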
I am trying to extract whatever is between two strings. The first string is a known string, the second string could be from a list of strings.
For example,
We have the start string and the end strings. We want to get the text between these.
start = "start"
end = ["then", "stop", "other"]
Criteria
test = "start a task then do something else"
result = "a task"
test = "start a task stop doing something else"
result = "a task"
test = "start a task then stop"
result = "a task"
test = "start a task"
result = "a task"
I have looked at using a regex, and I got one which works for text between two strings, but I cannot create one which works with a choice of end strings:
(?<=start\s).*(?=\sthen)
I have tried using this:
(?<=start\s).*(?=\sthen|\sstop|\sother)
but this will include 'then, stop or other' in the match like so:
"start a task then stop" will return "a task then"
I have also tried to do a "match any character except the end list" in the capture group like so: (?<=start\s)((?!then|stop|other).*)(?=\sthen|\sstop|\sother) but this has the same effect as the one above.
I am using swift, so I am also wondering whether this can be achieved by finding the substring between two strings.
Thanks for any help!
You may use
(?<=start\s).*?(?=\s+(?:then|stop|other)|$)
See the regex demo. To search for whole words, add \b word boundary in proper places:
(?<=\bstart\s).*?(?=\s+(?:then|stop|other)\b|$)
See another regex demo
Details
(?<=start\s) - a positive lookbehind that matches a location immediately preceded with start string and a whitespace
.*? - any 0+ chars other than line break chars, as few as possible
(?=\s+(?:then|stop|other)|$) - a position in the string that is immediately followed with
\s+ - 1+ whitespaces
(?:then|stop|other) - one of the words
|$ - or end of string.
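The four test cases from the question can be checked with a short sketch. It uses Python's re module instead of Swift's regex engine, on the assumption that the lookbehind, lazy quantifier and lookahead behave the same for this pattern. The names END_WORDS and between are invented for the example.

```python
import re

# Build the alternation from the list of end words, escaping each one
# in case it ever contains regex metacharacters.
END_WORDS = ["then", "stop", "other"]
alternatives = "|".join(map(re.escape, END_WORDS))

# Whole-word variant of the pattern from the answer.
pattern = re.compile(rf"(?<=\bstart\s).*?(?=\s+(?:{alternatives})\b|$)")

def between(text):
    """Return the text after 'start ' up to the first end word, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

tests = [
    "start a task then do something else",
    "start a task stop doing something else",
    "start a task then stop",
    "start a task",
]
for t in tests:
    print(repr(t), "->", repr(between(t)))
```

The lazy .*? is what stops the match at the first end word. With a greedy .* the third case would run on to "a task then".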
Strange syntax in this code fragment:
var result =
try {
Process(bl).!!
} catch {
case e: Exception =>
log.error(s"Error on query: ${hql}\n")
"Etc etc" + "Query: " + hql
}
Why is there no separator like , or ; after log.error(s"...")?
Is the catch clause returning one or two values?
PS: is there a better guide than this one, covering all the Scala syntax alternatives?
Newline characters can terminate statements
semi ::= ‘;’ | nl {nl}
Scala is a line-oriented language where statements may be terminated
by semi-colons or newlines. A newline in a Scala source text is
treated as the special token “nl” ...
IMHO, newline character \n is just as good of a statement terminator as semicolon character ;. However, it may have an advantage over ; in that it is invisible to humans which perhaps has the benefit of less code clutter. It might seem strange because it is invisible, but rest assured it is there silently doing its job delimiting statements. Perhaps it might become less strange if we try to imagine it like so
1 + 42'\n' // separating with invisible character \n
1 + 42; // separating with visible character ;
Note that we must use semicolons when writing multiple statements on the same line
log.error(s"Error on query: ${hql}\n"); "Etc etc" + "Query: " + hql
Addressing the comment, AFAIU, your confusion stems from misunderstanding how pattern matching anonymous functions and block expressions work. Desugared handler function
case e: Exception =>
log.error(s"Error on query: ${hql}\n")
"Etc etc" + "Query: " + hql
is equivalent to something like
case e: Exception => {
log.error(s"Error on query: ${hql}\n"); // side-effect statement that just logs an error
"Etc etc" + "Query: " + hql; // final expression becomes the value of the block
}
Hence, "one block with two branches into it" is not the correct understanding, instead there is only a single code path through your particular function.
I wonder whether it is possible to get the cursor context in Dragon NaturallySpeaking's advanced scripting.
By cursor context, I mean the surrounding characters. For example, I sometimes want to condition some steps of a voice command on whether the character preceding the cursor is a space.
Best I could come up with is my CheckNewPara function shown here: http://knowbrainer.com/forums/forum/messageview.cfm?catid=4&threadid=2739&discTab=true&messid=11427&parentid=11409&FTVAR_FORUMVIEWTMP=Single
Function CheckNewPara()
Clipboard$()
SendKeys "+{Left}^c", True ' copy previous character
Select Case Clipboard$()
Case "" ' if the prior character is nothing
CheckNewPara = "" ' add no space
Case Chr(13)&Chr(10), Chr(9), ")" ' if the prior character is a CR-LF, tab or )
SendKeys "{Right}", True
CheckNewPara = "" ' add no space
Case "." ' if the prior character is a period
SendKeys "{Right}", True
Clipboard$() ' check for No.
SendKeys "+{Left 3}^c", True ' copy prior three characters
SendKeys "{Right}", True
If Clipboard$() = "No." Then
CheckNewPara = " " ' add one space after No.
Else
CheckNewPara = "  " ' add two spaces after period
End If
Case "?", "!"
SendKeys "{Right}", True
CheckNewPara = "  " ' add two spaces after other ends of sentence
Case Else
SendKeys "{Right}", True
CheckNewPara = " " ' add one space in the usual case
End Select
Clipboard$()
End Function
You should look at the complete topic at http://knowbrainer.com/forums/forum/messageview.cfm?FTVAR_FORUMVIEWTMP=Linear&catid=4&threadid=2739&discTab=true to get all the context, but the code in the post I pointed to should get you started.
My newest version of the function actually calls an AutoHotKey script which looks at both the prior three characters (or as many as there are, if there are any) and the next two characters (or how ever many there are, if there are any) and returns either a space, two spaces, or nothing depending on the context. The context could be a terminal punctuation (requiring two spaces) or a pound/hash # symbol or close paren bracket or brace ) ] } all requiring no spaces, or else by default one space. I also have it so I can call it before and/or after typing in the results of a Dragon command.
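The spacing rules in CheckNewPara can also be written as a pure function of the text before the cursor, which makes them easy to test outside Dragon. This is only a rough Python sketch of those rules, not the AutoHotKey script mentioned above; the function name is made up, and it ignores the characters after the cursor.

```python
def spacing_for(preceding: str) -> str:
    """Return the spaces to insert, given the text before the cursor."""
    if not preceding:
        return ""  # nothing before the cursor: add no space
    last = preceding[-1]
    if preceding.endswith("\r\n") or last in ("\t", ")"):
        return ""  # after a line break, tab or ")": add no space
    if last == ".":
        # "No." is an abbreviation, not the end of a sentence
        return " " if preceding.endswith("No.") else "  "
    if last in ("?", "!"):
        return "  "  # two spaces after other sentence endings
    return " "  # one space in the usual case

for sample in ["", "Hello.", "See No.", "Really?", "word", "(x)", "line\r\n"]:
    print(repr(sample), "->", repr(spacing_for(sample)))
```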
HTH, YMMV,