Sed replacing Special Characters in a string - sed

I am having difficulty replacing a string that contains special characters using sed. My old and new strings are shown below:
oldStr = "# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity"
newStr = "# opt b3lyp/6-31g geom=connectivity"
My sed command is the following
sed -i 's/\# td\=\(nstates\=20\) cam\-b3lyp\/6\-31g geom\=connectivity/\# opt b3lyp\/6\-31g geom\=connectivity/g' myfile.txt
I don't get any errors, but nothing matches. Any ideas on how to fix my pattern?
Thanks

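Try:
sed -i 's|# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity|# opt b3lyp/6-31g geom=connectivity|g' myfile.txt
You can use almost any character as the delimiter after s instead of /. Since your expression contains slashes, I used | instead. -, = and # don't have to be escaped (a minus is only special inside character sets [...]). In sed's default (basic) regex syntax, escaped parens \( \) delimit a group, while unescaped parens are literals. That is why your pattern, which escaped the parens, never matched the literal "(nstates=20)".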
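If a scripting language is available, the escaping question can be sidestepped altogether by doing a literal (non-regex) replacement. The following is a minimal sketch in Python, not sed; the function name replace_route_line and the sample text are made up for illustration.

```python
# The two strings from the question, used as plain text (no regex involved).
OLD = "# td=(nstates=20) cam-b3lyp/6-31g geom=connectivity"
NEW = "# opt b3lyp/6-31g geom=connectivity"


def replace_route_line(text):
    # str.replace compares plain characters, so (, ), /, = and - need no escaping.
    return text.replace(OLD, NEW)


# Hypothetical file content, only for demonstration.
sample = "header\n" + OLD + "\nfooter\n"
print(replace_route_line(sample))
```

This trades sed's in-place editing for explicit read/replace/write code, so it is only worth it when the strings are awkward to express as a pattern.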

Related

How to match exact string in perl

I am trying to parse all the files and verify if any of the file content has strings TESTDIR or TEST_DIR
Files contents might look something like:-
TESTDIR = foo
include $(TESTDIR)/chop.mk
...
TEST_DIR := goldimage
MAKE_TESTDIR = var_make
NEW_TEST_DIR = tesing_var
I am only interested in TESTDIR, $(TESTDIR) and TEST_DIR; the last two lines should be ignored. I am new to Perl. Can anyone help me out with the regex?
/\bTEST_?DIR\b/
\b means a "word boundary", i.e. the place between a word character and a non-word character. "Word" here has the Perl meaning: it contains letters, digits, and underscores.
_? means "nothing or an underscore"
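The effect of the word boundaries can be checked against the sample lines from the question. This is a small sketch in Python rather than Perl; for these ASCII inputs Python's \b behaves the same way. The helper name wanted is made up for illustration.

```python
import re

# Same pattern as the Perl regex above.
PATTERN = re.compile(r"\bTEST_?DIR\b")

# Sample lines taken from the question.
LINES = [
    "TESTDIR = foo",
    "include $(TESTDIR)/chop.mk",
    "TEST_DIR := goldimage",
    "MAKE_TESTDIR = var_make",
    "NEW_TEST_DIR = tesing_var",
]


def wanted(lines):
    # Keep only lines where TESTDIR or TEST_DIR appears as a whole word.
    return [line for line in lines if PATTERN.search(line)]


for line in wanted(LINES):
    print(line)
```

The last two lines are rejected because the underscore before TEST is itself a word character, so there is no word boundary at that position.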
Have a look at character classes (character sets).
Only spaces allowed around the name:
/^(.* )?TEST_?DIR /
^ beginning of the line
(.* )? There may be some content .* before the name, but if there is, it must be followed by a space
The trailing space at the end says that a space must follow the name. Otherwise use ( .*)?$ at the end.
One of a given character set is allowed:
Should characters other than a space be possible, you can use a character class []:
/^(.*[ \t(])?TEST_?DIR[) :=]/
(.*[ \t(])? In front of TEST_?DIR there may be a space, a \t (tab) or a (, or nothing if the line starts with the name itself.
Afterwards there must be one of space, :, = or ), followed by anything (the "=" of ":=" counts as part of "anything").
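The character-class variant is stricter than a plain word-boundary match. The sketch below, in Python rather than Perl, runs it over the question's sample lines plus one extra line (x-TESTDIR = 1, invented here) to show the difference. The helper name matches_class_pattern is made up for illustration.

```python
import re

# Character-class pattern from above: an allowed character (or line start)
# before the name, and one of ")", space, ":" or "=" right after it.
CLASS_PATTERN = re.compile(r"^(.*[ \t(])?TEST_?DIR[) :=]")


def matches_class_pattern(line):
    return CLASS_PATTERN.search(line) is not None


for line in [
    "TESTDIR = foo",
    "include $(TESTDIR)/chop.mk",
    "TEST_DIR := goldimage",
    "MAKE_TESTDIR = var_make",
    "NEW_TEST_DIR = tesing_var",
    "x-TESTDIR = 1",  # invented line: "-" is not in the allowed set
]:
    print(line, "->", matches_class_pattern(line))
```

Note that this pattern requires a character after the name, so a line that ends directly with TESTDIR would not match.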
One of a given group is allowed:
For this you need groups within (), with the alternatives separated by |:
/^(.*( |\t))?TEST_?DIR( | := | = )/
In this case the group at the beginning is no different from [ \t], because each alternative is a single character (a space or \t).
At the end there must be a single space, or := surrounded by spaces, or = surrounded by spaces, followed by anything.
You can use any combination:
/^(.*[ \t(])?TEST_?DIR([) =:]| :=| =|)/
Test it on Debuggex.com. (Use 'PCRE')

How to escape special char when use glib.string.escape()

According to the documentation of glib.string.escape():
Escapes the special characters '\b', '\f', '\n', '\r', '\t', '\v', '\' and '"' in the string source by inserting a '\' before them.
Additionally all characters in the range 0x01-0x1F (everything below SPACE) and in the range 0x7F-0xFF (all non-ASCII chars) are replaced with a '\' followed by their octal representation. Characters supplied in exceptions are not escaped.
Now I do not want to escape the characters in the range 0x7F-0xFF. How should I write the exceptions argument?
My example code does not work:
shellcmd = "bash -c \""+file.get_string(title,"List").escape("0x7F-0xFF")+"\"";
print("shellcmd: %s\n", shellcmd);
Process.spawn_command_line_sync (shellcmd,
out ls_stdout, out ls_stderr, out ls_status);
if(ls_status!=0){ list = ls_stderr.split("\n"); }
else{ list = ls_stdout.split("\n"); }
This works:
shellcmd = "bash -c \""+file.get_string(title,"Check").replace("\"","\\\"")+"\"";
You actually have to put the characters 0x7f to 0xff in the exceptions argument. So something like:
shellcmd = "bash -c \""+file.get_string(title,"List").escape("\x7F\x80\x81\x82…\xfe\xff")+"\"";
You would need to list them all manually.
Looking more generally at your code, you seem to be constructing a command to run. This is a very bad idea and you should never do it. It is wide open to code injection. Use Process.spawn_sync() and pass it an argument vector instead.
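The idea of passing an argument vector instead of a command string can be sketched as follows. This is Python's subprocess module, not GLib, and is only meant to illustrate the principle; the function name run_with_argv is made up, and the child process simply echoes its argument back.

```python
import subprocess
import sys


def run_with_argv(user_text):
    # Each list element reaches the child as exactly one argument.
    # No shell parses user_text, so quotes and semicolons stay plain data.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# Text that would break out of the quotes in a concatenated "bash -c" string.
print(run_with_argv('a"; echo injected; "'))
```

Because no command string is ever assembled, there is nothing to escape in the first place.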

Running a PowerShell script file with path containing spaces from Jenkins Pipeline without using backtick

I want to run the following PowerShell script file from Jenkins Pipeline:
".\Folder With Spaces\script.ps1"
I have been able to do it with the following step definition:
powershell(script: '.\\Folder` With` Spaces\\script.ps1')
So I have to remember to:
escape the backslash with a double backslash (Groovy syntax)
escape the space with backtick (PowerShell syntax)
I would prefer to avoid at least some of this. Is it possible to avoid using the backtick escaping, for example? (Putting it between "" does not seem to work, for some reason.)
I found that it's possible to use the ampersand, or invoke, operator, like this:
powershell(script: "& '.\\Folder With Spaces\\script.ps1'")
That gets rid of the backtick escaping, and should make life a tiny bit easier.
To avoid escaping the backslashes you could use slashy strings or dollar slashy strings as follows. However, you cannot use a backslash as the very last character of a slashy string, as it would escape the closing /. Forward slashes also have to be escaped in slashy strings.
String slashy = /String with \ /
echo slashy
assert slashy == 'String with \\ '
// won't work
// String slashy = /String with \/
String dollarSlashy = $/String with / and \/$
echo dollarSlashy
assert dollarSlashy == 'String with / and \\'
You also lose the ability to include newlines (\n) and other special characters via backslash escapes. However, since both slashy and dollar slashy strings support multiple lines, newlines at least can be included like this:
String slashyWithNewline = /String with \/ and \
with newline/
echo slashyWithNewline
assert slashyWithNewline == 'String with / and \\ \nwith newline'
String dollarSlashyWithNewline = $/String with / and \
with newline/$
echo dollarSlashyWithNewline
assert dollarSlashyWithNewline == 'String with / and \\ \nwith newline'
If you combine that with your own answer, you won't need either kind of escaping.

Is there a function to escape all regex-relevant characters?

The regex I'm using in my application is a combination of user input and code. Because I don't want to restrict the user, I would like to escape all regex-relevant characters like "+", brackets, slashes, etc. in the entry.
Is there a function for that or at least an easy way to get all those characters in an array so that I can do something like this:
for regexChar in regexCharacterArray {
    myCombinedRegex = myCombinedRegex.replacingOccurrences(of: regexChar, with: "\\" + regexChar)
}
Yes, there is NSRegularExpression.escapedPattern(for:):
Returns a string by adding backslash escapes as necessary to protect any characters that would match as pattern metacharacters.
Example:
let escaped = NSRegularExpression.escapedPattern(for: "[*]+")
print(escaped) // \[\*]\+
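To show why escaping matters when user input is combined with a code-supplied pattern, here is a small sketch using Python's re.escape, which plays the same role as escapedPattern(for:) but is a different API. The function name build_pattern and the "item-<number>: " prefix are made up for illustration.

```python
import re


def build_pattern(user_input):
    # The prefix is the "code" part of the regex (hypothetical);
    # the user part is escaped so it only matches itself literally.
    return re.compile(r"^item-\d+: " + re.escape(user_input) + r"$")


pattern = build_pattern("[*]+")
print(bool(pattern.match("item-7: [*]+")))  # True: the literal text matches
print(bool(pattern.match("item-7: ***")))   # False: "[*]+" is not treated as a regex
```

Without the escaping step, the user's "[*]+" would be interpreted as "one or more asterisks" and the second line would match.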

Flip array index with sed

I have some java code declaring a 2d array that I want to flip.
Content is like:
zData[0][0] = 198;
zData[0][1] = 198;
zData[0][2] = 198;
...
And I want to flip indices to have
zData[0][0] = 198;
zData[1][0] = 198;
zData[2][0] = 198;
So I tried doing it with sed:
sed -r 's#zData[([0-9]*)][([0-9]*)]#zData[\2][\1]#g' DataSample1.java
But unfortunately sed says:
sed: -e expression #1, char 43: Unmatched ) or \)
Might the string "zData" hold kind of flag or option?
I tried not using the -r option but I have the same kind of message for:
sed 's#zData[\(\[\0\-\9\]\*\)][\(\[\0\-\9\]\*\)]#zData[\2][\1]#g' DataSample1.java
Thanks for your help
Simples:
$ sed -r 's/(zData)(\[[^]]+])(\[[^]]+])/\1\3\2/' file
zData[0][0] = 198;
zData[1][0] = 198;
zData[2][0] = 198;
Regexplanation:
# Match
(zData) # Capture the variable name we want to transpose
( # Start capture group for first index
\[ # Opening bracket escaped to mean literal [
[^]]+ # One or more non-] characters, i.e. the digits
] # The closing literal ] doesn't need escaping here.
) # Close the capture
(\[[^]]+]) # Same regexp as before for the second index
# Replace
\1\3\2 # Switch the indexes by rearranging the 2nd and 3rd capture groups
Note: Switch \[[^]]+] to \[[0-9]+] if that is clearer for you. Instead of saying "match an opening square bracket, followed by one or more characters that are not a closing bracket, followed by a closing bracket", you are then saying "match an opening square bracket, followed by one or more digits, followed by a closing bracket".
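The same capture-and-reorder idea can be tried outside sed. Below is a minimal sketch in Python using the digit-based variant of the pattern; the function name flip_indices is made up for illustration. As in the sed version, the square brackets are escaped, because an unescaped [ opens a bracket expression instead of matching a literal bracket.

```python
import re

# Group 1: the variable name, group 2: first index, group 3: second index.
INDEX_PAIR = re.compile(r"(zData)(\[[0-9]+\])(\[[0-9]+\])")


def flip_indices(line):
    # Emit the name, then the second index, then the first one.
    return INDEX_PAIR.sub(r"\1\3\2", line)


for line in ["zData[0][0] = 198;", "zData[0][1] = 198;", "zData[0][2] = 198;"]:
    print(flip_indices(line))
```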
Try that one:
sed 's#\([a-zA-Z0-9_-]\+\)\(\[[^]]*\]\)\(\[[^]]*\]\)\(.*$\)#\1\3\2\4#'
It adds four captures for the variable name, the first index, the second index and the rest and then switches order.
Edit: @Sudo_O's solution with extended regular expressions is much more readable. Thanks for that! Nevertheless, on some systems sed -r may not be available, since it is not part of basic POSIX.