How to change part of the string using sed?

I have a file data.txt with the following strings:
I want to change all of the text-something-digits-something.jar to text-something-5.0.jar.
Here is my script with sed (GNU sed version 4.2.1), but it doesn't work and I don't know why:
for t in ./data.txt; do
  sed -i "s/\(text-[a-z]*-(\d|\.)*\).*\(.jar\)/\15.0\2/" "${t}"
done
What is wrong with my sed usage?

How about this awk
awk '/^text/ {sub(/[0-9].*\./,"5.0.")}1'
Changing text-something-digits-something.jar to text-something-5.0.jar amounts to replacing digits-something with 5.0.
It also changes only the lines that start with text.
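A quick way to see what this awk command does is to run it on a few made-up lines. The file names below are hypothetical (the original data.txt is not shown), and the function name is only for illustration. The second sample shows a caveat: the pattern starts at the first digit on the line, so a name that itself contains a digit gets clobbered.

```shell
# Sketch: the awk one-liner from above wrapped in a function (name is made up).
set_version_awk() {
  awk '/^text/ {sub(/[0-9].*\./, "5.0.")} 1'
}

# Hypothetical sample lines; the real data.txt is not shown in the question.
printf '%s\n' \
  'text-something-1.2.3-something.jar' \
  'text-log4j-1.2-extra.jar' \
  'other-lib-1.0.jar' | set_version_awk
# Prints:
#   text-something-5.0.jar
#   text-log5.0.jar      (the match starts at the "4" in log4j)
#   other-lib-1.0.jar    (does not start with "text", so it is left alone)
```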

I think a simpler approach might be enough: sed -r -e 's/(text-(.*-)?common-)([0-9\.]+)(-.*\.jar)/\15.0\4/' < your_data.
Another way of saying the same thing with perl: perl -pe 's/(text-(?:(.*-))*common-)([\d\.]+)(-.*\.jar)/${1}5.0${4}/' < your_data.

for t in ./data.txt; do
  sed -i '/^text-/ s/[.0-9]\{1,\}-something\(\.jar\)$/5.0\1/' "${t}"
  # for "any" something
  #sed -i '/^text-/ s/[.0-9]\{1,\}-[^?]\{1,\}\(\.jar\)$/5.0\1/' "${t}"
done
This selects the lines starting with text- and replaces the digit value if one is present.

Using sed:
sed '/^text-/ s/-[0-9.]*-/-5.0-/' file
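For what it's worth, two things in the question's expression explain the failure: sed does not treat \d as a digit class, and in sed's default (basic) regex syntax an unescaped ( ) or | is a literal character. Below is a minimal sketch of a variant that avoids both. It assumes lines shaped like text-<name>-<version>-<suffix>.jar (a guess, since the data is not shown) and a sed that accepts -E, as GNU sed does; the function name and sample lines are made up.

```shell
# Sketch: extended-regex version; [0-9.] replaces the unsupported \d.
set_version_sed() {
  sed -E 's/^(text-[a-z]+-)[0-9.]+-.*(\.jar)$/\15.0\2/'
}

# Hypothetical sample lines.
printf '%s\n' \
  'text-something-1.2.3-something.jar' \
  'other-lib-1.0.jar' | set_version_sed
# Prints:
#   text-something-5.0.jar
#   other-lib-1.0.jar
```

In the replacement, \1 is followed directly by the literal text 5.0; sed back-references are a single digit, so \15 is not read as group 15.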


Sed Remove 3 last digits from string

The second field (after the first ;) is a time in the form gg:mm:sssss:. I just want it to be gg:mm:ss:
Like so:
I tried cut, but it deletes everything after the nth occurrence of a character, and now I am stuck. Please help.
Give this one-liner a try:
awk -F';' -v OFS=";" 'sub(/...$/,"",$2)+1' file
It removes the last 3 chars from column 2.
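To make the awk one-liner easier to follow, here is a small sketch run on made-up input (the question's real data is not shown, so the field layout is an assumption). sub() returns the number of substitutions it made; adding 1 keeps the pattern true, so lines where nothing was removed are still printed.

```shell
# Sketch: drop the last three characters of the second ;-separated field.
trim_time() {
  awk -F';' -v OFS=';' 'sub(/...$/, "", $2) + 1'
}

# Hypothetical sample lines.
printf '%s\n' \
  'id1;12:34:56789;rest' \
  'id2;x;rest' | trim_time
# Prints:
#   id1;12:34:56;rest
#   id2;x;rest          (field shorter than 3 characters, printed unchanged)
```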
Update with a sed one-liner, if you are a fan of sed:
sed -r 's/(;[^;]*)...;/\1;/' file
With sed:
sed -r 's/^([^;]+;[^;]+)...;/\1;/' file
sed -r 's/^([^;]+;[0-9]{2}:[0-9]{2}:[0-9]{2})...;/\1;/' file
It can also be something like sed -E 's/(.*)(([0-9]{2}:){2}[0-9]{2})[0-9]*;(.*)/\1\2;\4/'
It is not very clean, but at least it is clearer to me.
I'd use perl for this:
perl -pe 's/(?<=:\d\d)\d+(?=;)//' file
That removes any digits between "colon-digit-digit" and the semicolon (first match only, not globally in the line).
If you want to edit the file in-place: perl -i -pe ...
With sed:
sed -E 's/(:[0-9]{2})[0-9]{3}/\1/' file
or perl:
perl -pe's/:\d\d\B\K...//' file

Add new line using awk, sed

I have a large file which is slightly corrupted. The new lines have disappeared. There should have been a new line at every 250th character. How can I fix that?
Thanks in advance.
How about
sed 's/.\{250\}/&\n/g'
The .\{250\} matches 250 characters of any kind. In the replacement, & stands for the matched text, so each block is replaced by itself plus a newline (\n in the replacement is understood by GNU sed).
try this:
sed -r 's/.{250}/&\n/g'
awk -v FPAT='.{1,250}' -v OFS='\n' '$1=$1'
There is a command in coreutils that can wrap lines, it is called fold:
fold -w 250
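One difference between fold and the sed substitution shown earlier in this thread can be seen on a short example. The sketch below uses a width of 5 instead of 250, made-up input, and made-up function names; the \n in the sed replacement assumes GNU sed. When the line length is an exact multiple of the width, the sed version leaves an extra empty line at the end, while fold does not.

```shell
# Sketch: both wrappers take the width as their first argument.
wrap_sed()  { sed "s/.\{$1\}/&\n/g"; }
wrap_fold() { fold -w "$1"; }

# 12 characters, width 5: both print abcde / fghij / kl
printf '%s\n' 'abcdefghijkl' | wrap_sed 5
printf '%s\n' 'abcdefghijkl' | wrap_fold 5

# 10 characters, width 5: sed emits 3 lines (the last one empty), fold emits 2
printf '%s\n' 'abcdefghij' | wrap_sed 5 | wc -l
printf '%s\n' 'abcdefghij' | wrap_fold 5 | wc -l
```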
sed 's/^.\{250\}/&\
/;P;D' YourFile
Could be faster on huge file
An awk version
awk '{L=250;for (i=1;i<=length($0);i+=L) print substr($0,i,L)}'

using sed for substitution in next line

I am working on sed command to translate some text into another text.
cat text
sed -e 's|<strong>(.*?)</strong>|//textbf{1}|g'
Expected Outcome: \textbf{ABC}
But with the above script I cannot get the expected output, since there is a newline between the tags. How can I handle such cases?
This might work for you (GNU sed):
sed -r '$!N;s|(<)(strong>)([^\n]*)\n\s*\1/\2|//textbf{\3}|;P;D' file
sed '$!N;s|\(<\)\(strong>\)\([^\n]*\)\n\s*\1/\2|//textbf{\3}|;P;D' file
sed -e 'N;s|<strong>\(.*\?\)\n</strong>|\/textbf{\1}|g'
As CodeGnome and David Ravetti said, the N command appends the next input line to the pattern space, which allows patterns to span multiple lines.
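As a sketch of how the N/P/D idiom plays out, the example below runs on made-up input in which the closing tag sits on the line after the text (the question's actual file is not shown). It assumes GNU sed, since \n inside a bracket expression is a GNU extension, and it writes the replacement as \\textbf so that the output carries a single backslash, matching the expected outcome stated in the question. The function name is hypothetical.

```shell
# Sketch: keep a two-line window in the pattern space.
#   $!N  append the next line (unless on the last line)
#   s    rewrite a <strong>...</strong> pair that spans the newline
#   P;D  print and delete the first line, then restart with the remainder
strong_to_textbf() {
  sed -E '$!N;s|<strong>([^\n]*)\n[[:space:]]*</strong>|\\textbf{\1}|;P;D'
}

# Hypothetical sample input.
printf '%s\n' 'before' '<strong>ABC' '</strong>' 'after' | strong_to_textbf
# Prints:
#   before
#   \textbf{ABC}
#   after
```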

How do I get rid of this unicode character?

Any idea how to get rid of this irritating character U+0092 from a bunch of text files? I've tried all the below but it doesn't work. It's called U+0092+control from the character map
sed -i 's/\xc2\x92//' *
sed -i 's/\u0092//' *
sed -i 's///' *
Ah, I've found a way:
CHARS=$(python2 -c 'print u"\u0092".encode("utf8")')
sed 's/['"$CHARS"']//g'
But is there a direct sed method for this?
Try sed "s/\`//g" *. (I added the g so it will remove all the backticks it finds).
EDIT: It's not a backtick that OP wants to remove.
Following the solution in this question, this ought to work:
sed 's/\xc2\x92//g'
To demonstrate it does:
$ CHARS=$(python -c 'print u"asdf\u0092asdf".encode("utf8")')
$ echo $CHARS
asdf<funny glyph symbol>asdf
$ echo $CHARS | sed 's/\xc2\x92//g'
Seeing as it's something you tried already, perhaps what is in your text file is not U+0092?
This might work for you (GNU sed):
echo "string containing funny character(s)" | sed -n 'l0'
This will display the string as sed sees it, with non-printable characters shown as octal escapes. Then use:
echo "string containing funny character(s)" | sed 's/\onnn//g'
Where nnn is the octal value, to delete it/them.
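Put together for U+0092, whose UTF-8 encoding is the two bytes 0xC2 0x92 (octal 302 222), the approach might look like the sketch below. It assumes GNU sed (for the \oNNN escapes) and forces the C locale so that sed works byte by byte; the sample string and the function name are made up.

```shell
# Sketch: delete the UTF-8 byte pair for U+0092 (octal 302 222).
strip_u0092() {
  LC_ALL=C sed 's/\o302\o222//g'
}

# Inspect the bytes first: the l command should show them as \302\222.
printf 'asdf\302\222asdf\n' | LC_ALL=C sed -n 'l'

# Then remove them.
printf 'asdf\302\222asdf\n' | strip_u0092
# Prints: asdfasdf
```

If the l output shows different octal values, the file contains some other character, and those values are the ones to put in the substitution.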

To remove blank lines in data set

I need a one liner using sed, awk or perl to remove blank lines from my data file. The data in my file looks like this -
These blanks are at random and appear anywhere in my data file. Can someone suggest a one-liner to remove these blank lines from my dataset.
It can be done in many ways, e.g. with awk:
awk '$0' yourFile
or sed:
sed '/^$/d' yourFile
or grep:
grep -v '^$' yourFile
A Perl solution. From the command line.
$ perl -i.bak -n -e'print if /\S/' INPUT_FILE
Edits the file in-place and creates a backup of the original file.
AWK Solution:
Here we loop through the input file and check whether each line has any fields. NF is AWK's built-in variable that holds the number of fields. If the line is empty, NF is 0. In this one-liner we test whether NF is true, i.e. non-zero. If it is, the line is printed, which is the implicit action in AWK when the pattern is true.
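The command itself is missing from this answer; from the description it is presumably awk 'NF', which is what the sketch below assumes. The sample input and function name are made up. Two details are worth seeing in action: NF is also 0 for lines that contain only whitespace, so those are removed too, and a line consisting of just 0 is kept, whereas a test on $0 itself treats that line as false and drops it.

```shell
# Sketch: print only lines that have at least one field.
drop_blank() {
  awk 'NF'
}

# Hypothetical sample: an empty line, a whitespace-only line, and a "0" line.
printf '%s\n' 'a' '' '   ' '0' 'b' | drop_blank
# Prints:
#   a
#   0
#   b

# For comparison, testing $0 also drops the "0" line:
printf '%s\n' 'a' '' '0' 'b' | awk '$0'
# Prints:
#   a
#   b
```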
SED Solution:
This solution is similar to the ones already mentioned. As the syntax shows, we print only the lines that are not blank.
sed -n '/^$/!p' INPUT_FILE
You can do:
sed -i.bak '/^$/d' file
A Perl solution:
perl -ni.old -e 'print unless /^\s*$/' file
...which creates a backup copy of the original file, suffixed with '.old'.
With perl it is as easy as with sed, awk, or grep.
$ cat tmp/tmpfile
$ perl -i -pe 's{^\s*\n$}{}' tmp/tmpfile
$ cat tmp/tmpfile