remove all words containing backslash - sed

I've been trying so many different variations to get this right.
I am simply looking to use sed to remove all words beginning with or containing a backslash.
So the string
another test \/ \u7896 \n test ha\ppy
would become
another test test
I've tried many different options, but none of them seems to work. Does anybody have an idea how to do this?
And before everyone starts giving me minus 1 for this question, believe me, I have tried to find the answer.

You could use str.split and a list comprehension. Use a raw string so that Python keeps \u7896 and \n as literal backslash sequences instead of interpreting them:
>>> strs = r"another test \/ \u7896 \n test ha\ppy"
>>> [x for x in strs.split() if '\\' not in x]
['another', 'test', 'test']
# use str.join to join the list
>>> ' '.join([x for x in strs.split() if '\\' not in x])
'another test test'

$ echo "another test \/ \u7896 \n test ha\ppy" | sed -r 's/\S*\\\S*//g' | tr -s '[:blank:]'
another test test

This might work for you (GNU sed):
sed 's/\s*\S*\\\S*//g' file
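To check what the \s*\S*\\\S* pattern actually matches, here is a minimal sketch of the same substitution with Python's re module. For this ASCII input, \s and \S behave like their GNU sed counterparts. The helper name drop_backslash_words is made up for illustration.

```python
import re

def drop_backslash_words(text):
    # Hypothetical helper mirroring the sed substitution above:
    # optional leading whitespace, then a run of non-space characters
    # that contains at least one backslash.
    return re.sub(r"\s*\S*\\\S*", "", text)

# Raw string so Python keeps the backslashes literally.
print(drop_backslash_words(r"another test \/ \u7896 \n test ha\ppy"))
# prints: another test test
```

Because the leading whitespace is removed together with the word, no separate squeeze step is needed. A backslash word at the very start of the line would still leave one leading space behind.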

# raw string: keeps \u7896 and \n as literal backslash sequences
string = r"another test \/ \u7896 \n test ha\ppy"
string_no_slashes = " ".join([x for x in string.split() if "\\" not in x])

Related

How to substitute with basic regex with alternating signs?

I want to do the following to all of the statements in the file:
Input: xblahxxblahxxblahblahx
Output: <blah><blah><blahblah>
So far I am thinking of using sed -i 's/x/</g' something.ucli
You can use
sed 's/x\([^x]*\)x/<\1>/g'
Details:
x - an x
\([^x]*\) - Group 1 (\1 refers to this group value from the replacement pattern): zero or more (*) chars other than x ([^x])
x - an x
See the online demo:
#!/bin/bash
s='xblahxxblahxxblahblahx'
sed 's/x\([^x]*\)x/<\1>/g' <<< "$s"
# => <blah><blah><blahblah>
If x is a multi-character string, e.g. xyz, it will be easier with perl:
perl -pe 's/xyz(.*?)xyz/<$1>/g'
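The same two substitutions can be sketched in Python for anyone who wants to experiment with the patterns outside sed or perl. The function names and the marker parameter are assumptions introduced here, not part of the answer above.

```python
import re

def wrap_single(text):
    # Same idea as the sed command: x, then any run of non-x characters
    # (captured as group 1), then x.
    return re.sub(r"x([^x]*)x", r"<\1>", text)

def wrap_multi(text, marker="xyz"):
    # Non-greedy variant for a multi-character delimiter. re.escape guards
    # against regex metacharacters in the (hypothetical) marker argument.
    m = re.escape(marker)
    return re.sub(m + r"(.*?)" + m, r"<\1>", text)

print(wrap_single("xblahxxblahxxblahblahx"))   # <blah><blah><blahblah>
print(wrap_multi("xyzblahxyzxyzblahblahxyz"))  # <blah><blahblah>
```

The negated class [^x] only works for a single-character delimiter, which is why the multi-character case falls back to a non-greedy .*? match.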

Extract a substring using command line utilities

I have a text file including lines in the form of:
(term1 x:a y:b (term2 z:c k:a))
I want to extract only the terms from this line using command line utilities such as awk, grep, or sed, i.e. I want the result to be:
term1
term2
I have formed a regex matching everything but the terms, but could not find a way to negate it.
(\()|( \()|( (.*?) \()|( (.*?)\)+)
How can I form a command that extracts every substring after '(' and before ' '?
Thanks
Try this:
sed "s/(\([^ )]*\)[^(]*/\1\n/g"
For example:
$ echo "(term1 x:a y:b (term2 (term3) z:c k:a) x (termX a:b ) )" | sed "s/(\([^ )]*\)[^(]*/\1\n/g"
term1
term2
term3
termX
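An alternative to substituting away everything that is not a term is to collect the terms directly. The sketch below does that with re.findall in Python. It is a different technique from the sed substitution above, and the name extract_terms is hypothetical.

```python
import re

def extract_terms(line):
    # A term is whatever follows "(" up to the next space or parenthesis.
    # Empty captures (for example from "((" or "( ") are dropped.
    return [t for t in re.findall(r"\(([^ ()]*)", line) if t]

for term in extract_terms("(term1 x:a y:b (term2 z:c k:a))"):
    print(term)
# prints term1 and term2 on separate lines
```

Collecting matches avoids the trailing blank line that the substitution approach leaves after the last term.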

How to escape minus in regular expression with sed?

I need to free a string from unwanted characters. In this example I want to filter all +'s and all -'s from b and write the result to c. So if b is +fdd-dfdf+, c should be +-+.
read b
c=$(echo $b | sed 's/[^(\+|\-)]//g')
But when I run the script, the console says:
sed: -e expression #1, char 15: Invalid range end
The reason is the \- in my regular expression. How can I solve this problem and say, that I want to filter all -'s?
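Are you looking for this? Inside a bracket expression the backslash is a literal character, so \-) is read as a range from \ to ), which is invalid. Put the - first or last in the list instead: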
kent$ echo 'a + b + c - d - e'|sed 's/[^-+]//g'
++--
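The hyphen-first form can be tried out in Python as well. This is only a sketch with a made-up function name, and note that Python's re module is more lenient than POSIX bracket expressions (it would also accept \- inside the class), so it illustrates the working pattern, not the sed error.

```python
import re

def keep_plus_minus(text):
    # "-" placed directly after "^" is a literal hyphen, not a range operator,
    # so the class means "anything that is neither - nor +".
    return re.sub(r"[^-+]", "", text)

print(keep_plus_minus("+fdd-dfdf+"))         # +-+
print(keep_plus_minus("a + b + c - d - e"))  # ++--
```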

Remove only single spaces in text file with sed, perl, awk, tr or anything

I have a rather large text file where there is an extra space between every character;
I t   l o o k s   l i k e   t h i s .
I'd like to remove those extra characters so
It looks like this.
via the Linux terminal.
I can't seem to find any way to do this without removing all of the whitespace. I'm willing to try any solution at this point. I'd appreciate any nudge in the right direction.
$ echo 'I t   l o o k s   l i k e   t h i s . ' | sed 's/\(.\) /\1/g'
It looks like this.
Are you certain that the intermediate characters are spaces? It is most likely that this is a UTF-16 file.
I suggest you use a capable editor to open it as such and convert it to UTF-8.
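To test the UTF-16 hypothesis before converting anything, one could inspect the raw bytes. The sketch below shows how UTF-16 text looks at the byte level and includes a rough heuristic. The function name looks_like_utf16 and the heuristic itself are assumptions for illustration, not a robust encoding detector.

```python
def looks_like_utf16(raw: bytes) -> bool:
    # Heuristic only: a UTF-16 byte order mark, or NUL bytes in every
    # other position, which is what mostly-ASCII UTF-16 text looks like.
    if raw[:2] in (b"\xff\xfe", b"\xfe\xff"):
        return True
    return len(raw) >= 2 and (set(raw[1::2]) == {0} or set(raw[0::2]) == {0})

sample = "It looks like this.".encode("utf-16-le")
print(sample[:8])                   # b'I\x00t\x00 \x00l\x00'
print(looks_like_utf16(sample))     # True
print(sample.decode("utf-16-le"))   # It looks like this.
```

If the in-between bytes turn out to be NUL rather than real spaces, decoding the file as UTF-16 and writing it back as UTF-8 is the cleaner fix.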
An awk solution (an empty FS splits each character into its own field, which works in gawk):
echo "I t   l o o k s   l i k e   t h i s ." | awk '{for (i=1;i<=NF;i+=2) printf "%s", $i; print ""}' FS=""
It looks like this.
As long as it's every other character you want to get rid of, you can use python.
>>> s = "I t   l o o k s   l i k e   t h i s ."
>>> print(s[0::2])
It looks like this.
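If you wanted to do this for the text file, do the following:
with open("/path/to/file.txt") as f:
    lines = f.readlines()
# open the output file in write mode
with open("/path/to/new.txt", "w") as g:
    for line in lines:
        # strip the newline before slicing so it is neither cut nor doubled
        g.write(line.rstrip("\n")[0::2] + "\n")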
perl -pe 's|( +)| " " x (length($1) > 1) |ge' file

sed: replace a value between two matches

Hi, I want to replace a string that appears between two markers using sed.
Example: -amystring -bxyz
I want to replace mystring with ****.
The value after -a can be anything, e.g. -amystring 123 -bxyz, -amystring 123<newline_char>, -a'mystring 123' -bxyz, -a'mystring 123'<newline_char>
I tried the following regex, but it does not work in all cases:
sed -re "s#(-w)([^\s\-]+)#\1**** #g"
Can anybody help me solve this issue?
MyString="YourStringWithoutRegExSpecialCharNotEscaped"
sed "s/-a${MyString} -b/-a**** -b/g"
This works if you first escape every regex special character in your string (such as * . \ / [ ]) with something like
MyString=$(printf '%s\n' "$MyString" | sed 's#[][\\/.*^$]#\\&#g')
before using it in sed.
Otherwise, you need to define the edge pattern more precisely.
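The escape-then-substitute idea can be sketched in Python, where re.escape plays the role of the escaping step. The function name mask_value is hypothetical, and the sketch assumes, like the sed command above, that the value is known in advance and is followed by " -b".

```python
import re

def mask_value(line, value):
    # re.escape neutralises regex metacharacters in the literal value,
    # so characters such as . and * only match themselves.
    pattern = "-a" + re.escape(value) + " -b"
    return re.sub(pattern, "-a**** -b", line)

print(mask_value("-amystring -bxyz", "mystring"))  # -a**** -bxyz
print(mask_value("-aa.b*c -bxyz", "a.b*c"))        # -a**** -bxyz
```

Without the escaping, a value such as a.b*c would be treated as a pattern and could also match unrelated text like axbc.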