How do I get rid of this unicode character? - unicode

Any idea how to get rid of this irritating character, U+0092, from a bunch of text files? I've tried all of the commands below, but none of them work. The character map calls it U+0092+control.
sed -i 's/\xc2\x92//' *
sed -i 's/\u0092//' *
sed -i 's///' *
Ah, I've found a way:
CHARS=$(python2 -c 'print u"\u0092".encode("utf8")')
sed 's/['"$CHARS"']//g'
But is there a direct sed method for this?

Try sed "s/\`//g" *. (I added the g so it will remove all the backticks it finds).
EDIT: It's not a backtick that OP wants to remove.
Following the solution in this question, this ought to work:
sed 's/\xc2\x92//g'
To demonstrate it does:
$ CHARS=$(python2 -c 'print u"asdf\u0092asdf".encode("utf8")')
$ echo $CHARS
asdf<funny glyph symbol>asdf
$ echo $CHARS | sed 's/\xc2\x92//g'
asdfasdf
Seeing as it's something you tried already, perhaps what is in your text file is not U+0092?
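One way to check that guess is to look at the raw bytes. The sketch below is only an illustration: show_bytes and strip_u0092 are hypothetical helper names, and it assumes the character is stored as UTF-8 (bytes c2 92). A file saved as Windows-1252 would instead contain a lone 0x92 byte, which a \xc2\x92 pattern cannot match.

```shell
# Hypothetical helpers, not taken from any answer above.

# Dump stdin as hex bytes so you can see what is really stored.
show_bytes() {
  od -An -tx1
}

# Delete UTF-8 encoded U+0092 (bytes c2 92) from stdin.
# printf builds the two raw bytes, so this does not rely on sed
# understanding \x escapes; LC_ALL=C makes sed work byte by byte.
strip_u0092() {
  u0092=$(printf '\302\222')
  LC_ALL=C sed "s/${u0092}//g"
}

# Look for "c2 92" in the hex dump, then remove it.
printf 'asdf\302\222asdf\n' | show_bytes
printf 'asdf\302\222asdf\n' | strip_u0092
# the second command prints: asdfasdf
```

If the hex dump shows something other than c2 92 at the offending position, the pattern has to be adjusted to those bytes.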

This might work for you (GNU sed):
echo "string containing funny character(s)" | sed -n 'l0'
This displays the string the way sed sees it, with non-printable bytes written as octal escapes. Then use:
echo "string containing funny character(s)" | sed 's/\onnn//g'
where nnn is the octal value of the byte to delete (repeat \onnn for each byte).
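As a concrete illustration with U+0092, here is a small sketch. It assumes GNU sed and UTF-8 encoded input, and it forces the C locale so that sed treats each byte separately.

```shell
# Assumes GNU sed: the \oNNN escape is a GNU extension.

# 'l' prints the line unambiguously; the two bytes of U+0092
# should show up as the octal escapes \302 and \222.
printf 'asdf\302\222asdf\n' | LC_ALL=C sed -n 'l'

# Feed those octal values back to sed as \oNNN to delete the bytes.
printf 'asdf\302\222asdf\n' | LC_ALL=C sed 's/\o302\o222//g'
# the second command prints: asdfasdf
```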

Related

remove special character (^#) with sed

I want to remove "^#^#^#^#^#^#^#^#^#" from my textfile. I tried the following, but it did not work:
sed -i 's/\\^#//g' myfile.txt
sed is not generally robust against null characters, but Perl and tr are:
tr -d '\000' <myfile.txt >newfile.txt
Some sed variants will be able to handle null bytes with the notation which works in Perl:
perl -i -pe 's/\x00//g' myfile.txt
The -i option tells Perl to replace the original file in place, which some sed variants also allow.
What shows up as ^# here is a null byte being displayed in caret notation (it is usually rendered as ^@). sed (at least the GNU one) can match it as \x00:
sed 's/\x00//g'
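To confirm that the null bytes are really gone, you can count bytes before and after. A minimal sketch using tr, with strip_nul as a made-up helper name:

```shell
# Hypothetical helper: delete every NUL byte from stdin.
strip_nul() {
  tr -d '\000'
}

# The input is 4 bytes (a, NUL, b, newline); 3 bytes remain after stripping.
printf 'a\000b\n' | wc -c
printf 'a\000b\n' | strip_nul | wc -c
```

For a real file this would be used as strip_nul <myfile.txt >newfile.txt, mirroring the tr command above.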

Replacing the text with sed

I'm trying to replace some text using sed, but it's showing an error. I can't see where I'm going wrong.
sed -i 's/process.env.REDIRECT_URI/http:\/\/test-domain.apps.io/\callback/g' input.txt
I have this:
process.env.REDIRECT_URI
and want to replace it with:
http://test-domain.apps.io
Try:
sed -i 's/process.env.REDIRECT_URI/http:\/\/test-domain.apps.io/g' input.txt
Notes:
The original command has a spurious string /\callback. All that was needed to make the code work was to remove it.
. is a wildcard. If you want to be sure that you are matching periods, they should be escaped:
sed -i 's/process\.env\.REDIRECT_URI/http:\/\/test-domain.apps.io/g' input.txt
Sometimes, it's clearer if one doesn't have to escape /. One can use a separator of one's choice. For example, use #:
sed -i 's#process\.env\.REDIRECT_URI#http://test-domain.apps.io#g' input.txt
If you did want /callback in the output, use:
sed -i 's/process\.env\.REDIRECT_URI/http:\/\/test-domain.apps.io\/callback/g' input.txt
or:
sed -i 's#process\.env\.REDIRECT_URI#http://test-domain.apps.io/callback#g' input.txt
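To see what escaping the dots buys you, here is a small sketch that works on standard input instead of editing a file. The function name and the sample lines are made up for illustration.

```shell
# Hypothetical helper wrapping the last command above, minus -i.
replace_redirect_uri() {
  sed 's#process\.env\.REDIRECT_URI#http://test-domain.apps.io/callback#g'
}

printf '%s\n' \
  'redirect: process.env.REDIRECT_URI' \
  'other: processXenvXREDIRECT_URI' | replace_redirect_uri
# redirect: http://test-domain.apps.io/callback
# other: processXenvXREDIRECT_URI
```

With unescaped dots the second line would have been rewritten too, because . matches any character.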

How to change part of the string using sed?

I have a file data.txt with the following strings:
text-common-1.1.1-SNAPSHOT.jar
text-special-common-2.1.2-SNAPSHOT.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-3.3.3-SNAPSHOT.jar
I want to change all of the text-something-digits-something.jar to text-something-5.0.jar.
Here is my script with sed (GNU sed version 4.2.1), but it doesn't work and I don't know why:
#!/bin/bash
for t in ./data.txt
do
sed -i "s/\(text-[a-z]*-(\d|\.)*\).*\(.jar\)/\15.0\2/" ${t}
done
What is wrong with my sed usage?
How about this awk:
awk '/^text/ {sub(/[0-9].*\./,"5.0.")}1' data.txt
text-common-5.0.jar
text-special-common-5.0.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-5.0.jar
Changing text-something-digits-something.jar to text-something-5.0.jar amounts to replacing digits-something with 5.0.
The /^text/ condition also makes sure that only lines starting with text are changed.
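I think a simpler approach might be enough, at least for the names that contain common-:
sed -r -e 's/(text-(.*-)?common-)([0-9\.]+)(-.*\.jar)/\15.0\4/' < your_data
Note that \4 puts the -SNAPSHOT.jar suffix back, so the result is text-common-5.0-SNAPSHOT.jar.
Another way of saying the same thing with perl:
perl -pe 's/(text-(?:(.*-))*common-)([\d\.]+)(-.*\.jar)/${1}5.0${4}/' < your_data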
#!/bin/bash
for t in ./data.txt
do
# "something" stands for the literal suffix; \1 is the only capture group (\.jar)
sed -i '/^text-/ s/[.0-9]\{1,\}-something\(\.jar\)$/5.0\1/' "${t}"
# for "any" something
#sed -i '/^text-/ s/[.0-9]\{1,\}-[^?]\{1,\}\(\.jar\)$/5.0\1/' "${t}"
done
This selects the lines starting with text- and replaces the version digits if they are present.
Using sed:
sed '/^text-/ s/-[0-9.]*-/-5.0-/' file
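As for what is wrong with the original attempt: sed uses basic regular expressions by default, where unescaped ( ) and | are literal characters, and \d is not a digit class in sed, so the pattern never matches. Below is a minimal sketch that sticks to basic syntax and also drops the -SNAPSHOT part. The function name set_version is made up here, and it assumes the version is the first dash-delimited part that starts with a digit.

```shell
# Hypothetical helper: rewrite "-<version>-<anything>.jar" to "-5.0.jar"
# on lines that start with "text-"; other lines pass through untouched.
set_version() {
  sed '/^text-/ s/-[0-9][0-9.]*-.*\.jar$/-5.0.jar/'
}

set_version <<'EOF'
text-common-1.1.1-SNAPSHOT.jar
text-special-common-2.1.2-SNAPSHOT.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-3.3.3-SNAPSHOT.jar
EOF
# text-common-5.0.jar
# text-special-common-5.0.jar
# some-text-variant-1.1.1-SNAPSHOT.jar
# text-another-variant-text-5.0.jar
```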

how to find replace value with whitespace using sed in a bash script

I have values in a file like this ' value-to-remove '(without the ' characters). I want to use sed to run through the file and replace the values including the space before and after. I am running this via a bash script.
How can I do this?
The sed command I'm using at the moment replaces the values but leaves behind the two spaces.
sed -i 's/ '$value' / /g' test.conf
In the script I have
sed -i -e 's/\s'$DOMAIN'-'$SITE'\s/\s/g' gitosis.conf
echoed as
sed -i -e s/\sffff.com-eeee\s/\s/g test.conf
Not working though.
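IMHO your sed does not know \s, so use the POSIX class [[:space:]] instead (a sed without \s is unlikely to understand \t inside brackets either). In the replacement, \s has no special meaning, so write a literal space there. Put the whole expression in double quotes so that the variables expand without leaving gaps in the quoting, e.g.:
sed -i -e "s/[[:space:]]${DOMAIN}-${SITE}[[:space:]]/ /g" gitosis.conf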
Let me know if this is what you need
echo 'Some values to remove value-to-remove and more' | sed -e 's/\svalue-to-remove\s/CHANGED/g'
output: Some values to removeCHANGEDand more
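Putting the quoting points together, here is a hedged sketch that reads standard input rather than editing a file. remove_entry is a hypothetical helper and the sample line is invented. It assumes the values contain no / or other characters that are special to sed, and a . in the domain still matches any character.

```shell
# Hypothetical helper: $1 = domain, $2 = site.
# Replaces " <domain>-<site> " (with its surrounding whitespace) by one space.
remove_entry() {
  sed "s/[[:space:]]${1}-${2}[[:space:]]/ /g"
}

printf 'members = alice ffff.com-eeee bob\n' | remove_entry 'ffff.com' 'eeee'
# members = alice bob
```

Using an empty replacement instead of the single space would remove both surrounding spaces as well.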

sed Removing whitespace around certain character

What would be the best way to remove whitespace only around a certain character? Let's say a dash: Some- String- 12345- Here would become Some-String-12345-Here. Something like sed 's/\ -/-/g;s/-\ /-/g' works, but I am sure there must be a better way.
Thanks!
If you mean all whitespace, not just spaces, then you could try \s:
echo 'Some- String- 12345- Here' | sed 's/\s*-\s*/-/g'
Output:
Some-String-12345-Here
Or use the [:space:] character class:
echo 'Some- String- 12345- Here' | sed 's/[[:space:]]*-[[:space:]]*/-/g'
Different versions of sed may or may not support these, but GNU sed does.
Try:
sed 's/ *- */-/g'
You can use awk as well:
$ echo 'Some - String- 12345-' | awk -F" *- *" '{$1=$1}1' OFS="-"
Some-String-12345-
If it's just "- " as in your example, bash parameter expansion is enough:
$ s="Some- String- 12345-"
$ echo "${s//- /-}"
Some-String-12345-
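The variants above differ in what they count as whitespace: ' *- *' and the ${s//- /-} expansion only handle spaces, while [[:space:]] also covers tabs. A small sketch to show the difference, with trim_around_dash as a made-up helper name and an invented input that contains a tab:

```shell
# Hypothetical helper: remove any whitespace directly before or after a dash.
trim_around_dash() {
  sed 's/[[:space:]]*-[[:space:]]*/-/g'
}

# The input has a tab after the first dash and spaces around the others.
printf 'Some -\tString- 12345 - Here\n' | trim_around_dash
# Some-String-12345-Here
```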