How to copy and paste sample ID to end of header


I have a fasta file with headers that look like...
>DNA1111_0
>DNA2987_1
>DNA3674_5
How do I use sed to modify the headers so they look like...
>DNA1111_0;sample=DNA1111
>DNA2987_1;sample=DNA2987
>DNA3674_5;sample=DNA3674
I haven't been able to get the correct modification, thank you.

With GNU sed:
sed -E 's/^>(.*)(_.*)$/>\1\2;sample=\1/' file
Output:
>DNA1111_0;sample=DNA1111
>DNA2987_1;sample=DNA2987
>DNA3674_5;sample=DNA3674

With any sed that supports -E (e.g. GNU and OSX seds):
$ sed -E 's/([^>_]+).*/&;sample=\1/' file
>DNA1111_0;sample=DNA1111
>DNA2987_1;sample=DNA2987
>DNA3674_5;sample=DNA3674
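One caveat with the second command: it is not anchored on >, so in a FASTA file that also contains sequence lines it would append ;sample=... to those lines too. Below is a minimal sketch that restricts the substitution to header lines, assuming the sample ID is everything between > and the first underscore. The function name add_sample_tag and the sequence lines in the sample input are made up for illustration.

```shell
# Hypothetical helper: append ";sample=<ID>" to FASTA header lines only.
# Assumes the sample ID is everything between ">" and the first underscore.
add_sample_tag() {
    sed -E '/^>/ s/^>([^_]+)_.*$/&;sample=\1/' "$@"
}

# Made-up input with sequence lines, to show that they are left untouched.
printf '>DNA1111_0\nACGT\n>DNA2987_1\nGGCA\n' | add_sample_tag
```

The /^>/ address means the s command only runs on lines starting with >, and & re-inserts the whole matched header before the new suffix.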

Related

Trying to overwrite a string that has single quotes using Perl or sed in the terminal

I have the following line in a file:
$app-assets:"/assets/";
I am trying to use sed in the terminal to overwrite that line to read as follows:
$app-assets:"http://www.example.com/assets/";
I have tried the following but it does not work:
sed -i '' -e 's/app-assets:"/assets/"/app-assets:"http://www.example.com/assets/"/g' myfile.txt
I am fine using Perl if easier.

Use the following sed approach:
sed -i 's~\(\$app-assets:"\)\(/assets/\)"~\1http://www.example.com\2"~' myfile.txt
Here ~ is used as the delimiter of the s command, so the slashes in the path and the URL do not need escaping.

sed 's/app-assets:\"\/assets\/\";/app-assets:\"http:\/\/www\.example\.com\/assets\/\";/g' filename
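The attempt in the question fails because / is both the delimiter of the s command and part of the text being matched. As a sketch of the same idea with | as the delimiter, the following can be tried on a copy of the line without touching any file. The function name set_assets_url is hypothetical, and the result is written to stdout rather than edited in place.

```shell
# Hypothetical helper: rewrite the asset path to an absolute URL.
# "|" is the s-command delimiter, so "/" needs no escaping.
set_assets_url() {
    sed 's|\(\$app-assets:"\)/assets/"|\1http://www.example.com/assets/"|' "$@"
}

# Single quotes stop the shell from expanding $app or interpreting the double quotes.
printf '%s\n' '$app-assets:"/assets/";' | set_assets_url
```

Once the output looks right, the same expression can be used with -i on the real file.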

Sed Remove 3 last digits from string

27211;18:05:03479;20161025;0;0;0;0;10991;0;10991;000;0;0;000;1000000;0;0;000;0;0;0;82
The second field (after the first ;) is a time in the form gg:mm:sssss. I just want it to be gg:mm:ss.
Like so:
27211;18:05:03;20161025;0;0;0;0;10991;0;10991;000;0;0;000;1000000;0;0;000;0;0;0;82
I tried with cut, but it deletes everything after the nth occurrence of a character, and for now I am stuck. Please help.

Give this one-liner a try:
awk -F';' -v OFS=";" 'sub(/...$/,"",$2)+1' file
It removes the last 3 characters from column 2.
Update: if you are a fan of sed, here is a sed one-liner:
sed -r 's/(;[^;]*)...;/\1;/' file

With sed:
sed -r 's/^([^;]+;[^;]+)...;/\1;/' file
(Or)
sed -r 's/^([^;]+;[0-9]{2}:[0-9]{2}:[0-9]{2})...;/\1;/' file

It can also be done with something like sed -E 's/(([0-9]{2}:){2}[0-9]{2})[0-9]*;/\1;/' file
It is not very clean, but at least it is clearer to me.
Regards

I'd use perl for this:
perl -pe 's/(?<=:\d\d)\d+(?=;)//' file
That removes any digits between "colon-digit-digit" and the semicolon (first match only, not globally in the line).
If you want to edit the file in-place: perl -i -pe ...

With sed:
sed -E 's/(:[0-9]{2})[0-9]{3}/\1/' file
or perl:
perl -pe's/:\d\d\B\K...//' file
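Since the time is always the second ;-separated field, another option is to cut that field to a fixed width instead of pattern-matching the extra digits. This is only a sketch, assuming the field always starts with an 8-character gg:mm:ss prefix. The name trim_time is made up, and the input is a shortened version of the line from the question.

```shell
# Hypothetical helper: keep only the first 8 characters (gg:mm:ss) of field 2.
trim_time() {
    awk -F';' -v OFS=';' '{ $2 = substr($2, 1, 8) } 1' "$@"
}

printf '27211;18:05:03479;20161025;0;82\n' | trim_time
```

Assigning to $2 makes awk rebuild the record with OFS, which is why OFS is set to ; as well.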

How to change part of the string using sed?

I have a file data.txt with the following strings:
text-common-1.1.1-SNAPSHOT.jar
text-special-common-2.1.2-SNAPSHOT.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-3.3.3-SNAPSHOT.jar
I want to change all of the text-something-digits-something.jar to text-something-5.0.jar.
Here is my script with sed (GNU sed version 4.2.1), but it doesn't work and I don't know why:
#!/bin/bash
for t in ./data.txt
do
sed -i "s/\(text-[a-z]*-(\d|\.)*\).*\(.jar\)/\15.0\2/" ${t}
done
What is wrong with my sed usage?

How about this awk:
awk '/^text/ {sub(/[0-9].*\./,"5.0.")}1' data.txt
text-common-5.0.jar
text-special-common-5.0.jar
some-text-variant-1.1.1-SNAPSHOT.jar
text-another-variant-text-5.0.jar
Changing text-something-digits-something.jar to text-something-5.0.jar amounts to replacing digits-something with 5.0.
The /^text/ condition also makes sure that only lines starting with text are changed.

I think a simpler approach might be enough: sed -r -e 's/(text-(.*-)?common-)([0-9\.]+)(-.*\.jar)/\15.0\4/' < your_data
Another way of saying the same thing with perl: perl -pe 's/(text-(?:(.*-))*common-)([\d\.]+)(-.*\.jar)/${1}5.0${4}/' < your_data

#!/bin/bash
for t in ./data.txt
do
sed -i '/^text-/ s/[.0-9]\{1,\}-something\(\.jar\)$/5.0\1/' "${t}"
# for "any" something
#sed -i '/^text-/ s/[.0-9]\{1,\}-[^?]\{1,\}\(\.jar\)$/5.0\1/' "${t}"
done
This selects lines starting with text- and replaces the version digits, where present, with 5.0.

Using sed:
sed '/^text-/ s/-[0-9.]*-.*\.jar$/-5.0.jar/' file
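As for why the attempt in the question fails: sed does not understand \d as a digit class, and in its default (basic) regex syntax ( and | are literal characters, so (\d|\.)* never matches a version number. Below is a sketch that uses -E and a bracket expression instead. The name set_version is hypothetical, and it assumes the version is a dotted number followed by exactly one hyphen-free qualifier such as SNAPSHOT.

```shell
# Hypothetical helper: replace "-<dotted version>-<qualifier>.jar" with "-5.0.jar"
# on lines that start with "text-".
set_version() {
    sed -E '/^text-/ s/-[0-9]+(\.[0-9]+)*-[^-]*\.jar$/-5.0.jar/' "$@"
}

printf '%s\n' \
    'text-common-1.1.1-SNAPSHOT.jar' \
    'some-text-variant-1.1.1-SNAPSHOT.jar' \
    'text-another-variant-text-3.3.3-SNAPSHOT.jar' | set_version
```

The line starting with some- is printed unchanged because the /^text-/ address does not select it.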

remove ^M characters from file using sed

I have this line inside a file:
ULNET-PA,client_sgcib,broker_keplersecurities^M,KEPLER
I am trying to get rid of that ^M (carriage return) character, so I used:
sed 's/^M//g'
However, this removes everything after the ^M:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities^M,KEPLER
[root@localhost tmp]# sed 's/^M//g' test
ULNET-PA,client_sgcib,broker_keplersecurities
What I want to obtain is:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities,KEPLER

Use tr:
tr -d '^M' < inputfile
(Note that the ^M character can be input using Ctrl+V Ctrl+M)
EDIT: As suggested by Glenn Jackman, if you're using bash, you could also say:
tr -d $'\r' < inputfile

Still the same command:
sed -i 's/^M//g' file
When you type the command, enter the ^M as Ctrl+V Ctrl+M.
If you already have the file open in vim, you can do the same there:
:%s/^M//g
Again, the ^M is typed as Ctrl+V Ctrl+M.

You can simply use dos2unix, which is available on most Unix/Linux systems. However, I found the following sed command to work better, as it removed ^M where dos2unix couldn't:
sed 's/\r//g' < input.txt > output.txt
Hope that helps.
Note: ^M is the carriage return character, which is written as \r in code.
What dos2unix does is most likely equivalent to:
sed 's/\r$//' < input.txt > output.txt
That is, it removes \r only when it is immediately followed by \n, and leaves any other \r in place. This fails with certain types of files, like one I just tested with.

alias dos2unix="sed -i -e 's/'\"\$(printf '\015')\"'//g' "
Usage:
dos2unix file

If Perl is an option:
perl -i -pe 's/\r\n$/\n/g' file
-i edits the file in place (use -i.bak to keep a .bak copy of the original)
\r = carriage return
\n = linefeed
$ = end of line
s/foo/bar/g = globally substitute "foo" with "bar"

In awk:
sub(/\r/,"")
If it is at the end of the record, sub(/\r/,"",$NF) should suffice. No need to scan the whole record.

This is a better way to achieve it:
tr -d '\015' < inputfile_name > outputfile_name
Afterwards, rename the output file to the original file name.

I agree with @twalberg (see the comments on the accepted answer, above): dos2unix on Mac OSX covers this. Quoting man dos2unix:
To run in Mac mode use the command-line option "-c mac" or use the
commands "mac2unix" or "unix2mac"
I settled on 'mac2unix', which got rid of my less-cmd-visible '^M' entries, introduced by an Apple 'Messages' transfer of a bash script between 2 Yosemite (OSX 10.10) Macs!
I installed 'dos2unix', trivially, on Mac OSX using the popular Homebrew package installer. I highly recommend it and its companion command, Cask.

This is clean and simple and it works:
sed -i 's/\r//g' file
where \r of course is the equivalent for ^M.

Simply run the following command:
sed -i -e 's/\r$//' input.file
I verified this as valid in Mac OSX Monterey.

remove any \r :
nawk 'NF+=OFS=_' FS='\r'
gawk 3 ORS= RS='\r'
remove end of line \r :
mawk2 8 RS='\r?\n'
mawk -F'\r$' NF=1
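Whichever variant is used, it helps to check what is actually in the file, because a terminal renders a carriage return by moving the cursor to the start of the line rather than printing anything. The sketch below uses od -c to make the \r visible before and after removal. The name strip_cr is made up, and the sample line is a shortened form of the one in the question.

```shell
# Hypothetical helper: delete every carriage return (octal 015) from stdin.
strip_cr() {
    tr -d '\015'
}

# od -c prints \r explicitly, so the effect can be checked byte by byte.
printf 'broker_keplersecurities\r,KEPLER\n' | od -c
printf 'broker_keplersecurities\r,KEPLER\n' | strip_cr | od -c
```

The first od dump shows a \r between the two parts of the line; the second should not.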

How to remove set of special characters (see attachment)

These characters are special, and I cannot paste them as code because the forum does not support them. Here is how they look in code format: [32;1m
The cube (the first character) appears as a left arrow in the file (see the links below).
Here is a picture of how the character looks: http://www.dodaj.rs/f/2u/ar/3B1Q7J4Q/sample.jpg
And here is an attachment of the file containing what I want to remove: http://hotfile.com/dl/124448134/58e08a0/File.log.html
Here is the complete file:
[32;1m/var/log/daemon.log file is rotated1...[0m
[32;1m/var/log/daemon.log file is rotated2...[0m
[37;1m/var/log/daemon.log file is rotated3...[0m
[35;1m/var/log/daemon.log file is rotated3...[0m
[33;1mhello[0m
[33;1mthis is sample[0m
[33;1mwhats up?[0m
What I want is to delete all of the unnecessary characters so that the output is:
/var/log/daemon.log file is rotated1...
/var/log/daemon.log file is rotated2...
/var/log/daemon.log file is rotated3...
/var/log/daemon.log file is rotated3...
hello
this is sample
whats up?
I tried to delete the special characters with sed like this:
cat File.log | sed 's/[!@#\$%^&*()]//g' | sed -e 's/37;1m//g' > output.log
but it does nothing.
Can someone please write the code that does what I need?
Thanks.
EDIT: After posting, the arrow cannot be seen on the forum...

sed -e 's/[[:cntrl:]]//g' -e 's/\[32;1m//g' -e 's/\[33;1m//g' -e 's/\[35;1m//g' -e 's/\[37;1m//g' -e 's/\[0m//g'

echo '[32;1m/var/log/daemon.log file is rotated1...[0m' | awk -F'1m' '{sub("\[0m","",$2);print $2}'
/var/log/daemon.log file is rotated1...
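The sequences in the log look like ANSI color codes: an ESC byte (octal 033, presumably the character the forum could not display), then [, numeric parameters separated by ;, then m. Under that assumption, a single expression can remove any such color sequence instead of listing every color. In this sketch the name strip_colors is hypothetical, and the ESC byte is produced with printf because writing \033 inside a sed pattern is not portable.

```shell
# Hypothetical helper: strip ANSI color sequences of the form ESC [ <digits;digits> m.
strip_colors() {
    esc=$(printf '\033')    # literal ESC byte
    sed "s/${esc}\[[0-9;]*m//g" "$@"
}

printf '\033[32;1m/var/log/daemon.log file is rotated1...\033[0m\n' | strip_colors
```

It can be run on the real file as strip_colors File.log > output.log.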
