How to copy and paste sample ID to end of header - sed

I have a fasta file with headers that look like...
>DNA1111_0
>DNA2987_1
>DNA3674_5
How do I use sed to modify the headers so they look like...
>DNA1111_0;sample=DNA1111
>DNA2987_1;sample=DNA2987
>DNA3674_5;sample=DNA3674
I haven't been able to get the correct modification, thank you.

With GNU sed:
sed -E 's/^>(.*)(_.*)$/>\1\2;sample=\1/' file
Output:
>DNA1111_0;sample=DNA1111
>DNA2987_1;sample=DNA2987
>DNA3674_5;sample=DNA3674

With any sed that supports -E (e.g. GNU and OSX seds):
$ sed -E 's/([^>_]+).*/&;sample=\1/' file
>DNA1111_0;sample=DNA1111
>DNA2987_1;sample=DNA2987
>DNA3674_5;sample=DNA3674
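One thing to watch in a real FASTA file: the second command has no anchor on >, so it would also append ;sample=... to sequence lines. A minimal sketch that restricts the edit to header lines and needs no -E is below. The function name add_sample_tag and the sequence lines in the sample input are made up for illustration.

```shell
# Hypothetical helper: tag FASTA headers with their sample ID.
# Only lines starting with ">" are edited; sequence lines pass through.
add_sample_tag() {
    # \1 = text between ">" and the first "_"; & = the whole matched header
    sed '/^>/ s/^>\([^_]*\)_.*$/&;sample=\1/' "$@"
}

# Made-up sample input (the sequence lines are not from the question)
printf '>DNA1111_0\nACGT\n>DNA2987_1\nTTGA\n' | add_sample_tag
```

The function passes its arguments to sed, so add_sample_tag file works as well as reading from a pipe.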

Related

Trying to overwrite a string that has single quotes using Perl or sed in the terminal

I have the following line in a file:
$app-assets:"/assets/";
I am trying to use sed in the terminal to overwrite that line to read as follows:
$app-assets:"http://www.example.com/assets/";
I have tried the following but it does not work:
sed -i '' -e 's/app-assets:"/assets/"/app-assets:"http://www.example.com/assets/"/g' myfile.txt
I am fine using Perl if easier.
Use the following sed approach:
sed -i 's~\(\$app-assets:"\)\(/assets/\)"~\1http://www.example.com\2"~' myfile.txt
Here ~ is used as the delimiter of the s command instead of /, so the slashes in the path and URL do not need escaping.
sed 's/app-assets:\"\/assets\/\";/app-assets:\"http:\/\/www\.example\.com\/assets\/\";/g' filename
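The original attempt fails because the unescaped / characters inside the pattern are read as s-command delimiters. A minimal sketch of the alternate-delimiter idea, using | and reading from standard input so nothing is edited in place, might look like this. The function name rewrite_assets is hypothetical.

```shell
# Hypothetical helper: prefix the /assets/ path with the site URL.
# "|" is the s-command delimiter, so "/" needs no escaping.
rewrite_assets() {
    # \1 = the literal text $app-assets:" captured from the input
    sed 's|\(\$app-assets:"\)/assets/"|\1http://www.example.com/assets/"|' "$@"
}

# Single quotes keep the shell from expanding $app
printf '%s\n' '$app-assets:"/assets/";' | rewrite_assets
```

Once the output looks right, the same expression can be used with sed -i on the real file.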

Sed Remove 3 last digits from string

27211;18:05:03479;20161025;0;0;0;0;10991;0;10991;000;0;0;000;1000000;0;0;000;0;0;0;82
The second field (after the first ;) is a time in the form gg:mm:sssss. I just want it to be gg:mm:ss.
Like so:
27211;18:05:03;20161025;0;0;0;0;10991;0;10991;000;0;0;000;1000000;0;0;000;0;0;0;82
I tried with cut, but it deletes everything after the n'th occurrence of the character, and for now I am stuck. Please help.
Give this one-liner a try:
awk -F';' -v OFS=";" 'sub(/...$/,"",$2)+1' file
It removes the last 3 chars from column 2.
Update with a sed one-liner:
If you are a fan of sed:
sed -r 's/(;[^;]*)...;/\1;/' file
With sed:
sed -r 's/^([^;]+;[^;]+)...;/\1;/' file
(Or)
sed -r 's/^([^;]+;[0-9]{2}:[0-9]{2}:[0-9]{2})...;/\1;/' file
It can also be something like sed -E 's/(([0-9]{2}:){2}[0-9]{2})[0-9]*;/\1;/'
It is not very clean, but at least it is clearer to me.
Regards
I'd use perl for this:
perl -pe 's/(?<=:\d\d)\d+(?=;)//' file
That removes any digits between "colon-digit-digit" and the semicolon (first match only, not globally in the line).
If you want to edit the file in-place: perl -i -pe ...
With sed:
sed -E 's/(:[0-9]{2})[0-9]{3}/\1/' file
or perl:
perl -pe's/:\d\d\B\K...//' file
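If the number of extra digits can vary, another option is to keep a fixed width instead of deleting a fixed count. The sketch below assumes the time is always the second ;-separated field and that its first 8 characters are the part to keep. The function name trim_time is made up.

```shell
# Hypothetical helper: keep only the first 8 characters of field 2.
trim_time() {
    # FS/OFS are both ";" so the record is rebuilt with the same separator
    awk 'BEGIN { FS = OFS = ";" } { $2 = substr($2, 1, 8) } 1' "$@"
}

# Shortened version of the line from the question
printf '%s\n' '27211;18:05:03479;20161025;0;0' | trim_time
```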

How to change part of the string using sed?

I have a file data.txt with the following strings:
text-common-1.1.1-SNAPSHOT.jar
text-special-common-2.1.2-SNAPSHOT.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-3.3.3-SNAPSHOT.jar
I want to change all of the text-something-digits-something.jar to text-something-5.0.jar.
Here is my script with sed (GNU sed version 4.2.1), but it doesn't work, I don't know why:
#!/bin/bash
for t in ./data.txt
do
sed -i "s/\(text-[a-z]*-(\d|\.)*\).*\(.jar\)/\15.0\2/" ${t}
done
What is wrong with my sed usage?
How about this awk
awk '/^text/ {sub(/[0-9].*\./,"5.0.")}1'
text-common-5.0.jar
text-special-common-5.0.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-5.0.jar
Changing text-something-digits-something.jar to text-something-5.0.jar
amounts to replacing digits-something with 5.0.
The /^text/ condition also makes sure that only lines starting with text are changed.
I think a simpler approach might be enough: sed -r -e 's/(text-(.*-)?common-)([0-9\.]+)(-.*\.jar)/\15.0\4/' < your_data.
Another way of saying the same thing with perl: perl -pe 's/(text-(?:(.*-))*common-)([\d\.]+)(-.*\.jar)/${1}5.0${4}/' < your_data.
Note that both commands only match names containing common- and keep the suffix after the version (for example -SNAPSHOT).
#!/bin/bash
for t in ./data.txt
do
# only one group is captured, so the back-reference must be \1
sed -i '/^text-/ s/[.0-9]\{1,\}-something\(\.jar\)$/5.0\1/' "${t}"
# for "any" something
#sed -i '/^text-/ s/[.0-9]\{1,\}-[^?]\{1,\}\(\.jar\)$/5.0\1/' "${t}"
done
This selects the lines starting with text- and changes the version number where one is present.
Using sed:
sed '/^text-/ s/-[0-9.]*-/-5.0-/' file
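As for what is wrong with the original attempt: sed does not support \d, and in a basic regular expression an unescaped ( or | is a literal character, so (\d|\.)* never matches a version number. [a-z]* also stops at the first hyphen, which rules out names like text-special-common. A minimal sketch that produces the text-something-5.0.jar form from the question is below. It assumes the suffix after the version contains no further hyphen, and the function name set_version is made up.

```shell
# Hypothetical helper: on lines starting with "text-", replace
# "-<digits.digits...>-<suffix>.jar" at the end of the line with "-5.0.jar".
set_version() {
    sed -E '/^text-/ s/-[0-9]+(\.[0-9]+)*-[^-]*\.jar$/-5.0.jar/' "$@"
}

# The four names from data.txt
printf '%s\n' \
    'text-common-1.1.1-SNAPSHOT.jar' \
    'text-special-common-2.1.2-SNAPSHOT.jar' \
    'some-text-variant-1.1.1-SNAPSHOT.jar' \
    'text-another-variant-text-3.3.3-SNAPSHOT.jar' | set_version
```

The third name is left unchanged because it does not start with text-.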

remove ^M characters from file using sed

I have this line inside a file:
ULNET-PA,client_sgcib,broker_keplersecurities
,KEPLER
I try to get rid of that ^M (carriage return) character so I used:
sed 's/^M//g'
However, this removes everything after the ^M:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities^M,KEPLER
[root@localhost tmp]# sed 's/^M//g' test
ULNET-PA,client_sgcib,broker_keplersecurities
What I want to obtain is:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities,KEPLER
Use tr:
tr -d '^M' < inputfile
(Note that the ^M character can be input using Ctrl+V Ctrl+M)
EDIT: As suggested by Glenn Jackman, if you're using bash, you could also say:
tr -d $'\r' < inputfile
Still the same command:
sed -i 's/^M//g' file
When you type the command, enter ^M by pressing Ctrl+V Ctrl+M.
Actually, if you have already opened the file in vim, you can just do this in vim:
:%s/^M//g
Again, type ^M as Ctrl-V Ctrl-M.
You can simply use dos2unix which is available in most Unix/Linux systems. However I found the following sed command to be better as it removed ^M where dos2unix couldn't:
sed 's/\r//g' < input.txt > output.txt
Hope that helps.
Note: ^M is actually carriage return character which is represented in code as \r
What dos2unix does is most likely equivalent to removing a \r only when it sits directly before the line's \n. Because sed strips the trailing \n before matching, that is written as:
sed 's/\r$//' < input.txt > output.txt
It does not remove a \r that is not immediately followed by \n. This fails with certain types of files, like the one I just tested with.
alias dos2unix="sed -i -e 's/'\"\$(printf '\015')\"'//g' "
Usage:
dos2unix file
If Perl is an option:
perl -i -pe 's/\r\n$/\n/g' file
-i edits the file in place (use -i.bak to also keep a .bak backup of the input file)
\r = carriage return
\n = linefeed
$ = end of line
s/foo/bar/g = globally substitute "foo" with "bar"
In awk:
sub(/\r/,"")
If it is in the end of record, sub(/\r/,"",$NF) should suffice. No need to scan the whole record.
A better way to achieve this:
tr -d '\015' < inputfile_name > outputfile_name
Later rename the file to original file name.
I agree with @twalberg (see accepted answer comments, above), dos2unix on Mac OSX covers this, quoting man dos2unix:
To run in Mac mode use the command-line option "-c mac" or use the
commands "mac2unix" or "unix2mac"
I settled on 'mac2unix', which got rid of my less-cmd-visible '^M' entries, introduced by an Apple 'Messages' transfer of a bash script between 2 Yosemite (OSX 10.10) Macs!
I installed 'dos2unix', trivially, on Mac OSX using the popular Homebrew package installer. I highly recommend it and its companion command, Cask.
This is clean and simple and it works:
sed -i 's/\r//g' file
where \r of course is the equivalent for ^M.
Simply run the following command:
sed -i -e 's/\r$//' input.file
I verified this as valid in Mac OSX Monterey.
remove any \r :
nawk 'NF+=OFS=_' FS='\r'
gawk 3 ORS= RS='\r'
remove end of line \r :
mawk2 8 RS='\r?\n'
mawk -F'\r$' NF=1
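Since support for \r inside a sed expression differs between implementations, a sketch that avoids relying on it is to let printf produce the carriage return and pass it to sed as a literal character. Both function names below are made up. The first removes every carriage return (which is what the question's mid-line ^M needs), the second only those at the end of a line.

```shell
# Hypothetical helper: delete every carriage return from standard input.
strip_cr_all() {
    tr -d '\r'
}

# Hypothetical helper: delete a carriage return only at the end of a line.
strip_cr_eol() {
    cr=$(printf '\r')        # a literal CR character
    sed "s/${cr}\$//" "$@"   # \$ keeps the end-of-line anchor out of shell expansion
}

# od -c makes any remaining \r visible
printf 'broker_keplersecurities\r,KEPLER\r\n' | strip_cr_all | od -c
printf 'broker_keplersecurities\r,KEPLER\r\n' | strip_cr_eol | od -c
```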

How to remove set of special characters (see attachment)

These characters are special and I cannot paste them as code because the forum does not support them. Here is how it looks in code format: [32;1m
The cube (first character) appears as a left arrow in the file (see the links below).
Here is a picture of how the character looks: http://www.dodaj.rs/f/2u/ar/3B1Q7J4Q/sample.jpg
And here is the file containing what I want to remove: http://hotfile.com/dl/124448134/58e08a0/File.log.html
Here is the complete file:
[32;1m/var/log/daemon.log file is rotated1...[0m
[32;1m/var/log/daemon.log file is rotated2...[0m
[37;1m/var/log/daemon.log file is rotated3...[0m
[35;1m/var/log/daemon.log file is rotated3...[0m
[33;1mhello[0m
[33;1mthis is sample[0m
[33;1mwhats up?[0m
What I want is to delete all the unnecessary characters so that the output is:
/var/log/daemon.log file is rotated1...
/var/log/daemon.log file is rotated2...
/var/log/daemon.log file is rotated3...
/var/log/daemon.log file is rotated3...
hello
this is sample
whats up?
I tried to delete the special characters with sed like this:
cat File.log | sed 's/[!@#\$%^&*()]//g' | sed -e 's/37;1m//g' > output.log
but it does nothing.
Can someone please write the code that does what I need?
Thanks.
EDIT: After posting, the arrow character is not visible on the forum...
sed -e 's/[[:cntrl:]]//g' -e 's/\[32;1m//g' -e 's/\[33;1m//g' -e 's/\[35;1m//g' -e 's/\[37;1m//g' -e 's/\[0m//g'
echo '[32;1m/var/log/daemon.log file is rotated1...[0m' | awk -F'1m' '{sub("\[0m","",$2);print $2}'
/var/log/daemon.log file is rotated1...
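The unprintable first character is most likely the ESC byte (octal 033) that starts an ANSI color sequence, which is an assumption based on the [32;1m ... [0m pattern. Under that assumption, a more general sketch removes any ESC [ digits ; digits m sequence in one expression instead of listing each color code. The function name strip_ansi is made up.

```shell
# Hypothetical helper: remove ANSI color sequences such as ESC[32;1m and ESC[0m.
strip_ansi() {
    esc=$(printf '\033')                # a literal ESC character
    sed "s/${esc}\[[0-9;]*m//g" "$@"    # ESC, "[", digits/semicolons, "m"
}

# Sample line rebuilt with explicit ESC bytes
printf '\033[32;1m/var/log/daemon.log file is rotated1...\033[0m\n' | strip_ansi
```

Usage on the real file would be strip_ansi File.log > output.log.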