How to escape minus in regular expression with sed? - sed

I need to strip unwanted characters from a string. In this example I want to keep only the +'s and -'s from b and write the result to c. So if b is +fdd-dfdf+, c should be +-+.
read b
c=$(echo $b | sed 's/[^(\+|\-)]//g')
But when I run the script, the console says:
sed: -e expression #1, char 15: Invalid range end
The reason is the \- in my regular expression. How can I solve this problem and say that I want to filter all -'s?

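Are you looking for this? Inside [...] a backslash is an ordinary character, so \- does not escape the minus; put the - first or last in the set instead: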
kent$ echo 'a + b + c - d - e'|sed 's/[^-+]//g'
++--
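A minimal sketch of that command applied to the b and c variables from the question. The helper name keep_signs is made up for illustration; the bracket expression [^-+] is the one from the answer, with the - placed first so it cannot start a range.

```shell
# keep_signs is a hypothetical helper name, not part of sed.
keep_signs() {
  # [^-+] matches any character that is neither - nor +;
  # the g flag deletes every such character on the line.
  printf '%s\n' "$1" | sed 's/[^-+]//g'
}

b='+fdd-dfdf+'
c=$(keep_signs "$b")
echo "$c"    # prints +-+
```

Writing the set as [^+-] (minus last) should behave the same way.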

Related

How to substitute with basic regex with alternating signs?

I want to do the following to all of the statements in the file:
Input: xblahxxblahxxblahblahx
Output: <blah><blah><blahblah>
So far I am thinking of using sed -i 's/x/</g' something.ucli
You can use
sed 's/x\([^x]*\)x/<\1>/g'
Details:
x - an x
\([^x]*\) - Group 1 (\1 refers to this group value from the replacement pattern): zero or more (*) chars other than x ([^x])
x - an x
See the online demo:
#!/bin/bash
s='xblahxxblahxxblahblahx'
sed 's/x\([^x]*\)x/<\1>/g' <<< "$s"
# => <blah><blah><blahblah>
If x is a multichar string, e.g. xyz, it will be easier with perl:
perl -pe 's/xyz(.*?)xyz/<$1>/g'

Extract a substring using command line utilities

I have a text file including lines in the form of:
(term1 x:a y:b (term2 z:c k:a))
I want to extract only terms from this line using command line utilities such as awk, grep, sed. i.e I want the result to be:
term1
term2
I have formed a regex matching everything but the terms, but could not find a way to negate it.
(\()|( \()|( (.*?) \()|( (.*?)\)+)
How can I form a command extracting every substring after '(' and before ' '?
Thanks
Try this (GNU sed, which accepts \n in the replacement):
sed "s/(\([^ )]*\)[^(]*/\1\n/g"
For example:
$ echo "(term1 x:a y:b (term2 (term3) z:c k:a) x (termX a:b ) )" | sed "s/(\([^ )]*\)[^(]*/\1\n/g"
term1
term2
term3
termX
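If the sed in use does not accept \n in the replacement, a possible alternative is to let grep print each match on its own line. This is only a sketch: extract_terms is a made-up name, and it assumes a grep that supports -o (GNU and BSD grep do, but POSIX does not require it).

```shell
# extract_terms is a hypothetical helper name.
# Assumption: grep supports the -o option (print only the matched text).
extract_terms() {
  # '([^ )]*' matches a literal ( followed by everything up to the
  # next space or ). tr then removes the parenthesis from each match.
  grep -o '([^ )]*' | tr -d '('
}

echo '(term1 x:a y:b (term2 z:c k:a))' | extract_terms
# prints:
# term1
# term2
```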

replace two consecutive lines based on a pattern and repeat through out the file

I'm trying to replace two consecutive lines based on a pattern match, and would want this to repeat for the entire file. Here is the input file:
c aaaaa bbb
+ 0.1
c xxxx
c yyyy
+ 0.2
* c gggg
m eeeee hhhhh
+ 0.3
The command I tried is:
sed '/^c/{N;s/+/*+/}'
I expected to see a * prepended to each line beginning with +, but only on those lines immediately following a c line:
c aaaaa bbb
*+ 0.1
c xxxx
c yyyy
*+ 0.2
* c gggg
m eeeee hhhhh
+ 0.3
What I actually get:
c aaaaa bbb
*+ 0.1
c xxxx
c yyyy
+ 0.2
* c gggg
m eeeee hhhhh
+ 0.3
Here I see that only the first occurrence of + (where the previous line begins with c) is replaced with *+. The second occurrence of + in the file is not replaced.
What am I doing wrong? How do I get the result I want, with the replacement applied at every such pair of consecutive lines in the file?
The problem you run into is that when a line that starts with c comes right after another line that starts with c, the N command in your code consumes it, so it is no longer available for checking when the following line is processed.
Instead of reading ahead to see if the next line should be changed, I'd remember the last line and look back to see if the current line should be changed:
sed 'x; G; /^c/ s/+/*+/; s/.*\n//' file
This works as follows:
x             # Swap pattern space and hold buffer. Because we do this here,
              # the previous line will be in the hold buffer for every line
              # (except the first, then it is empty)
G             # Append hold buffer to pattern space. Now the pattern space
              # contains the previous line followed by the current line.
/^c/ s/+/*+/  # If the pattern space begins with a c (i.e., if the previous
              # line began with a c), replace + with *+
s/.*\n//      # Remove the first line (the previous one) from the pattern
              # space
              # Then drop off the end. The changed current line is printed.
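One caveat with the one-liner above: s/+/*+/ changes the first + anywhere in the pattern space, so a + inside the preceding c line would absorb the substitution. Below is a sketch of a stricter variant, assuming only a + at the very start of the current line should be marked. The name mark_after_c is made up for illustration.

```shell
# mark_after_c is a hypothetical helper name.
mark_after_c() {
  # x ; G          build "previous line\ncurrent line" in the pattern space
  # /^c/           act only if the previous line starts with c
  # s/\(\n\)+/...  match the + only directly after the newline, i.e. at
  #                the start of the current line, and put * before it
  # s/.*\n//       drop the previous line again before printing
  sed 'x; G; /^c/ s/\(\n\)+/\1*+/; s/.*\n//'
}

printf '%s\n' 'c a+b' '+ 0.1' 'c xxxx' 'c yyyy' '+ 0.2' 'm e' '+ 0.3' | mark_after_c
# prints:
# c a+b
# *+ 0.1
# c xxxx
# c yyyy
# *+ 0.2
# m e
# + 0.3
```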
sed -e 'H;$!d' -e 'x' -e ':cycle' -e 's/\(\nc[[:alnum:][:blank:][:punct:]]*\n\)+/\1*+/g;t cycle' -e 's/.//' YourFile
POSIX version that changes the whole file in at most 2 internal cycles:
load the file into memory (-e 'H;$!d' -e 'x')
add the * in front of each line starting with a + that comes after a line starting with a c ( s/\(\nc[[:alnum:][:blank:][:punct:]]*\n\)+/\1*+/g)
repeat the substitution for as long as it still changes something ( :cycle and t cycle)
use a trick to ensure the buffer starts with a newline: H appends every line to the hold buffer, including the first, so there is an extra leading newline (needed when the first line starts with a c); remove it at the end ('s/.//')

remove all words containing backslash

I've been trying so many different variations to get this right.
I am simply looking to use sed to remove all words beginning with or containing a backslash.
So the string
another test \/ \u7896 \n test ha\ppy
would become
another test test
I've tried so many different options, but none of them seems to work. Does anybody have an idea how to do this?
And before everyone starts giving me minus 1 for this question, believe me, I have tried to find the answer.
You could use str.split and a list comprehension:
# raw string, so \u7896 and \n stay literal backslash sequences
>>> strs = r"another test \/ \u7896 \n test ha\ppy"
>>> [x for x in strs.split() if '\\' not in x]
['another', 'test', 'test']
# use str.join to join the list
>>> ' '.join([x for x in strs.split() if '\\' not in x])
'another test test'
$ echo "another test \/ \u7896 \n test ha\ppy" | sed -r 's/\S*\\\S*//g' | tr -s '[:blank:]'
another test test
This might work for you (GNU sed):
sed 's/\s*\S*\\\S*//g' file
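The \s and \S shorthands in the command above are GNU extensions. A sketch of the same substitution written with POSIX character classes follows; it assumes the input arrives on standard input, and drop_backslash_words is a made-up name.

```shell
# drop_backslash_words is a hypothetical helper name.
drop_backslash_words() {
  # [[:space:]]*    optional whitespace in front of the word
  # [^[:space:]]*   the part of the word before the backslash (may be empty)
  # \\              one literal backslash
  # [^[:space:]]*   the rest of the word
  sed 's/[[:space:]]*[^[:space:]]*\\[^[:space:]]*//g'
}

printf '%s\n' 'another test \/ \u7896 \n test ha\ppy' | drop_backslash_words
# prints: another test test
```

If the very first word contains a backslash, the line is left with a leading space, which may need trimming separately.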
string = r"another test \/ \u7896 \n test ha\ppy"
string_no_slashes = " ".join([x for x in string.split() if "\\" not in x])

Flip array index with sed

I have some Java code declaring a 2D array that I want to flip.
Content is like:
zData[0][0] = 198;
zData[0][1] = 198;
zData[0][2] = 198;
...
And I want to flip indices to have
zData[0][0] = 198;
zData[1][0] = 198;
zData[2][0] = 198;
So I tried doing it with sed:
sed -r 's#zData[([0-9]*)][([0-9]*)]#zData[\2][\1]#g' DataSample1.java
But unfortunately sed says:
sed: -e expression #1, char 43: Unmatched ) or \)
Could the string "zData" be interpreted as some kind of flag or option?
I tried not using the -r option but I have the same kind of message for:
sed 's#zData[\(\[\0\-\9\]\*\)][\(\[\0\-\9\]\*\)]#zData[\2][\1]#g' DataSample1.java
Thanks for your help
Simples:
$ sed -r 's/(zData)(\[[^]]+])(\[[^]]+])/\1\3\2/' file
zData[0][0] = 198;
zData[1][0] = 198;
zData[2][0] = 198;
Regexplanation:
# Match
(zData)       # Capture the variable name we want to transpose
(             # Start capture group for first index
\[            # Opening bracket escaped to mean literal [
[^]]+         # One or more non-] characters, i.e. the digits
]             # The closing literal ] doesn't need escaping here.
)             # Close the capture
(\[[^]]+])    # Same regexp as before for the second index
# Replace
\1\3\2        # Switch the indexes by rearranging the 2nd and 3rd capture groups
Note: Switch \[[^]]+] to \[[0-9]+] if that is clearer for you. Instead of saying "match an opening square bracket followed by one or more non-closing-bracket characters followed by a closing bracket", you are then saying "match an opening square bracket followed by one or more digits followed by a closing bracket".
Try that one:
sed 's#\([a-zA-Z0-9_-]\+\)\(\[[^]]*\]\)\(\[[^]]*\]\)\(.*$\)#\1\3\2\4#'
It adds four captures for the variable name, the first index, the second index and the rest and then switches order.
Edit: @Sudo_O's solution with extended regular expressions is much more readable. Thx for that! Nevertheless, on some systems sed -r may not be available, since it is not part of basic POSIX.
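For systems without sed -r, here is a sketch that stays close to the attempt in the question and uses only basic regular expressions. The errors in the question come from the unescaped [ characters, which sed reads as the start of a bracket expression. The name flip_indices is made up, and the sketch assumes both indices are plain digit runs.

```shell
# flip_indices is a hypothetical helper name.
flip_indices() {
  # \[ is a literal [ ; a ] outside a bracket expression is already literal.
  # \([0-9]*\) captures each index; \2 and \1 write them back swapped.
  sed 's/zData\[\([0-9]*\)]\[\([0-9]*\)]/zData[\2][\1]/g'
}

printf '%s\n' 'zData[0][1] = 198;' 'zData[0][2] = 198;' | flip_indices
# prints:
# zData[1][0] = 198;
# zData[2][0] = 198;
```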