How can I change spaces to underscores and lowercase everything?

I have a text file which contains:
Cycle code
Cycle month
Cycle year
Event type ID
Event ID
Network start time
I want to change this text so that every space is replaced with a _, and then lowercase all the characters, like below:
cycle_code
cycle_month
cycle_year
event_type_id
event_id
network_start_time
How could I accomplish this?

Another Perl method:
perl -pe 'y/A-Z /a-z_/' file

tr alone works:
tr ' [:upper:]' '_[:lower:]' < file

After looking into the sed documentation some more and following advice from the comments, the following command should work (GNU sed; \L is a GNU extension):
sed -r -i -e 's/[A-Z]/\L&/g;s/ /_/g' {filehere}

There is a perl tag in your question as well. So:
#!/usr/bin/perl
use strict; use warnings;
while (<DATA>) {
print join('_', split ' ', lc), "\n";
}
__DATA__
Cycle code
Cycle month
Cycle year
Event type ID
Event ID
Network start time
Or:
perl -i.bak -wple '$_ = join("_", split " ", lc)' test.txt

sed "y/ABCDEFGHIJKLMNOPQRSTUVWXYZ /abcdefghijklmnopqrstuvwxyz_/" filename

Just use your shell, if you have Bash 4:
while read -r line
do
    line=${line,,} # change to lowercase
    echo "${line// /_}"
done < "file" > newfile
mv newfile file
With gawk:
awk '{$0=tolower($0);$1=$1}1' OFS="_" file
With Perl:
perl -ne 's/ +/_/g;print lc' file
With Python:
>>> f = open("file")
>>> for line in f:
...     print('_'.join(line.split()).lower())
...
>>> f.close()
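For comparison, here is a minimal Python 3 sketch of the same idea as a reusable function. The name to_snake is made up for illustration. Like the split-based answers, it collapses runs of whitespace into a single underscore, which differs from tr, where every space becomes its own underscore.

```python
def to_snake(line: str) -> str:
    """Lowercase a line and join its whitespace-separated words with '_'."""
    return "_".join(line.split()).lower()


# Sample lines taken from the question.
for sample in ["Cycle code", "Event type ID", "Network start time"]:
    print(to_snake(sample))
```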

Related

Normalize column fill with space on right

Working with the example log file below:
1;000117;20190529;055529;9521;0988388019
1;000015;20190529;071944;2222;2231
1;000012;20190529;072734;4258;4252
1;000006;20190529;073336;2226;1000
3;000005;20190529;073715;1000;037760967
3;000004;20190529;073751;1000;037760967
I need to normalize the last column by padding it with spaces on the right until its length is 25.
I tried this Perl code, without success:
perl -F';' -lane '$F[5] = $F[5], sprintf "% 25d"; $" = ";"; print "@F"'
I need the output below:
1;000117;20190529;055529;9521;0988388019
1;000015;20190529;071944;2222;2231
1;000012;20190529;072734;4258;4252
1;000006;20190529;073336;2226;1000
3;000005;20190529;073715;1000;037760967
3;000004;20190529;073751;1000;037760967

$ awk 'BEGIN{FS=OFS=";"} {$NF=sprintf("%-25s",$NF)}1' file
1;000117;20190529;055529;9521;0988388019
1;000015;20190529;071944;2222;2231
1;000012;20190529;072734;4258;4252
1;000006;20190529;073336;2226;1000
3;000005;20190529;073715;1000;037760967
3;000004;20190529;073751;1000;037760967
So you can see the blanks:
$ awk 'BEGIN{FS=OFS=";"} {$NF=sprintf("%-25s",$NF)}1' file | tr ' ' '#'
1;000117;20190529;055529;9521;0988388019###############
1;000015;20190529;071944;2222;2231#####################
1;000012;20190529;072734;4258;4252#####################
1;000006;20190529;073336;2226;1000#####################
3;000005;20190529;073715;1000;037760967################
3;000004;20190529;073751;1000;037760967################

You were on the right track. Perl code that works:
perl -F';' -lane '$F[5]=sprintf("%-25s",$F[5]);print join ";",@F'
perl -F';' -pane '$F[5]=sprintf("%-25s",$F[5]);$_=join ";",@F'

This might work for you (GNU sed):
sed -i ':a;/;[^;]\{25\}$/!s/$/ /;ta' file
If the last field is not 25 characters long, add a space until it is.
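The padding step itself can be sketched in Python 3 as well. This is only an illustration: pad_last_field is a hypothetical helper, and the defaults (width 25, separator ;) are taken from the question.

```python
def pad_last_field(line: str, width: int = 25, sep: str = ";") -> str:
    """Right-pad the last sep-separated field with spaces up to width."""
    fields = line.rstrip("\n").split(sep)
    fields[-1] = fields[-1].ljust(width)  # ljust never truncates longer fields
    return sep.join(fields)


padded = pad_last_field("1;000015;20190529;071944;2222;2231")
print(padded.replace(" ", "#"))  # show the blanks, as in the tr example above
```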

Extracting fasta ids after string match

I have a list of fasta sequences as following:
>Product_1_001:299:H377WBGXB:1:11101
TGATCATCTCACCTACTAATAGGACGATGACCCAGTGACGATGA
>Product_2_001:299:H377WBGXB:2:11101
CATCGATGATCATTGATAAGGGGCCCATACCCATCAAAACCGTT
The original fasta sequence is much longer than the subset posted here. I wanted to extract the 10 characters after the pattern "TCAT" into a separate file and did this
grep -oP "(?<=TCAT).{10}"
I do get the needed result as:
CTCACCTACT
TGATAAGGGG
I would like their corresponding fasta ids as one column and the extracted pattern as second column like:
>Product_1_001:299:H377WBGXB:1:11101 CTCACCTACT
>Product_2_001:299:H377WBGXB:2:11101 TGATAAGGGG

Try this one-liner
perl -lne ' /^[^>].+?(?<=TCAT)(.{10})/ and print $p,"\t",$1; $p=$_ ' file
with your given inputs
$ cat fasta.txt
>Product_1_001:299:H377WBGXB:1:11101
TGATCATCTCACCTACTAATAGGACGATGACCCAGTGACGATGA
>Product_2_001:299:H377WBGXB:2:11101
CATCGATGATCATTGATAAGGGGCCCATACCCATCAAAACCGTT
$ perl -lne ' /^[^>].+?(?<=TCAT)(.{10})/ and print $p,"\t",$1; $p=$_ ' fasta.txt
>Product_1_001:299:H377WBGXB:1:11101 CTCACCTACT
>Product_2_001:299:H377WBGXB:2:11101 TGATAAGGGG
$

Another way is to use awk, like this:
cat <your_file>| awk -F"_" '/Product/{printf "%s", $0; next} 1'|awk -F"TCAT" '{ print substr($1,2,35) "\t" substr($2,1,10)}'
The output:
Product_1_001:299:H377WBGXB:1:11101 CTCACCTACT
Product_2_001:299:H377WBGXB:2:11101 TGATAAGGGG
Here substr($1,2,35) skips the leading > and assumes 35-character ids, as in the sample.
Hope this helps.
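If the fixed id width is a concern, a small Python 3 sketch can pair each header with the match instead. It assumes one sequence line per record and reports only the first TCAT hit on that line (grep -o would report every hit); the function name ids_with_motif is hypothetical.

```python
import re


def ids_with_motif(lines, motif="TCAT", length=10):
    """Yield (header, text) pairs: the `length` characters after the first motif."""
    pattern = re.compile(re.escape(motif) + "(.{%d})" % length)
    header = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            header = line  # remember the most recent fasta id
        elif header is not None:
            match = pattern.search(line)
            if match:
                yield header, match.group(1)


records = [
    ">Product_1_001:299:H377WBGXB:1:11101",
    "TGATCATCTCACCTACTAATAGGACGATGACCCAGTGACGATGA",
    ">Product_2_001:299:H377WBGXB:2:11101",
    "CATCGATGATCATTGATAAGGGGCCCATACCCATCAAAACCGTT",
]
for fasta_id, hit in ids_with_motif(records):
    print(fasta_id, hit, sep="\t")
```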

How to add new number into each line?

I have this line about 500 times in a my file backup.xml
my-company-review/</link>
Is there a way through command line, perl, etc. to add a number into the line after the word review. For example, something like this:
my-company-review1/</link>
my-company-review2/</link>
my-company-review3/</link>
Thanks in advance for the help!

Why not use Perl, as I suggested with your last problem. Once again, this is a sort of hack solution, that only works if there's a maximum of one replacement per line... But it's a quick throw-away program.
perl -e '$count=1; foreach (<>) { s/(my-company-review)(\/<\/link>)/$1$count$2/ && $count++; print; }'
An extra loop will do multiple substitutions on a line:
perl -e '$count=1; foreach (<>) { while(s/(my-company-review)(\/<\/link>)/$1$count$2/) {$count++;} print; }'
That awk solution looks way nicer =)

Here's one way:
perl -i -wpe ' BEGIN { $count = 1; }
++$count
if s{(my-company-review)(/</link>)}{$1$count$2};
' backup.xml
(Disclaimer: not tested.)

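You can use awk:
awk 'gsub("/</link>", NR "/</link>")' infile
or perl:
perl -ne 's:/</link>:$./</link>:; print' infile
Both number by input line (NR in awk, $. in Perl), so the result is 1, 2, 3, ... only if every line of the file matches; the awk version also prints only the lines where a substitution happened.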
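A running counter that is independent of line numbers can be sketched in Python 3 as follows. The function name number_links is made up for illustration, and the pattern is the one from the question.

```python
import itertools
import re


def number_links(text: str, start: int = 1) -> str:
    """Insert a running number after each 'my-company-review' before '/</link>'."""
    counter = itertools.count(start)
    return re.sub(
        r"(my-company-review)(/</link>)",
        # The callback runs once per match, so only matches consume numbers.
        lambda m: f"{m.group(1)}{next(counter)}{m.group(2)}",
        text,
    )


sample = "intro\nmy-company-review/</link>\nother\nmy-company-review/</link>\n"
print(number_links(sample), end="")
```

Reading backup.xml, passing its contents through the function and writing the result back is left out here.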

Bash, Perl or Sed, Insert on New Line after found phrase

Ok I guess I need something that will do the following:
search for this line of code in /var/lib/asterisk/bin/retrieve_conf:
$engineinfo = engine_getinfo();
insert these two lines immediately following:
$engineinfo['engine']="asterisk";
$engineinfo['version']="1.6.2.11";
Thanks in advance,
Joe

You could do it like this
sed -ne '/$engineinfo = engine_getinfo();/a\'$'\n''$engineinfo['\''engine'\'']="asterisk";\'$'\n''$engineinfo['\''version'\'']="1.6.2.11";'$'\n'';p' /var/lib/asterisk/bin/retrieve_conf
Add -i for modification in place once you confirm that it works.
What does it do and how does it work?
First we tell sed to match a line containing your string. On that matched line we then will perform an a command, which is "append text".
The syntax of a sed a command is
a\
line of text\
another line
;
Note that the literal newlines are part of this syntax. To make it all one line (and preserve copy-paste ability) in place of literal newlines I used $'\n' which will tell bash or zsh to insert a real newline in place. The quoting necessary to make this work is a little complex: You have to exit single-quotes so that you can have the $'\n' be interpreted by bash, then you have to re-enter a single-quoted string to prevent bash from interpreting the rest of your input.
EDIT: Updated to append both lines in one append command.
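The append-after-match logic, without the shell quoting, can be sketched in Python 3. This is an illustration only: insert_after is a hypothetical helper that works on a list of lines, and reading or rewriting /var/lib/asterisk/bin/retrieve_conf is left out.

```python
def insert_after(lines, marker, new_lines):
    """Return a copy of lines with new_lines added after each line containing marker."""
    out = []
    for line in lines:
        out.append(line)
        if marker in line:  # plain substring test, no regex escaping needed
            out.extend(new_lines)
    return out


original = ["<?php", "$engineinfo = engine_getinfo();", "// rest of the script"]
extra = [
    "$engineinfo['engine']=\"asterisk\";",
    "$engineinfo['version']=\"1.6.2.11\";",
]
print("\n".join(insert_after(original, "$engineinfo = engine_getinfo();", extra)))
```

Like the sed a command, this appends after every matching line, not only the first one.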

You can use Perl and Tie::File (included in the Perl distribution):
use Tie::File;
tie my @array, 'Tie::File', "/var/lib/asterisk/bin/retrieve_conf" or die $!;
for (0..$#array) {
    if ($array[$_] =~ /\$engineinfo = engine_getinfo\(\);/) {
        splice @array, $_+1, 0, q{$engineinfo['engine']="asterisk"; $engineinfo['version']="1.6.2.11";};
        last;
    }
}

Just for the sake of symmetry here's an answer using awk.
awk '{ if(/\$engineinfo = engine_getinfo\(\);/) print $0"\n$engineinfo['\''engine'\'']=\"asterisk\";\n$engineinfo['\''version'\'']=\"1.6.2.11\"" ; else print $0 }' in.txt

You may also use ed:
# cf. http://wiki.bash-hackers.org/howto/edit-ed
cat <<-'EOF' | ed -s /var/lib/asterisk/bin/retrieve_conf
H
/\$engineinfo = engine_getinfo();/a
$engineinfo['engine']="asterisk";
$engineinfo['version']="1.6.2.11";
.
wq
EOF

A Perl one-liner:
perl -pE 's|(\$engineinfo) = engine_getinfo\(\);.*\K|\n${1}['\''engine'\'']="asterisk";\n${1}['\''version'\'']="1.6.2.11";|' file

sed -i 's/$engineinfo = engine_getinfo();/$engineinfo = engine_getinfo();<CTRL V><CTRL M>$engineinfo['\''engine'\'']="asterisk"; $engineinfo['\''version'\'']="1.6.2.11";/' /var/lib/asterisk/bin/retrieve_conf

How can I remove the timestamp from a filename in Perl?

I have a file which has a line in it as:
/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
I need a script which would read this line and remove the time stamp from it, that is:
10.01.21_16.54.18
The script should print the filename without the timestamp and holding the full path, that is:
/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut.log
Please help as I'm unable to pattern match and output the file path without the timestamp.

echo "/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log" |
perl -pe "s/_\d\d\.\d\d\.\d\d_\d\d\.\d\d\.\d\d//;"

$ perl -e 's{_\d{2}\.\d{2}\.\d{2}_\d{2}\.\d{2}\.\d{2}}{} and print for @ARGV' /hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log

Path shortened to prevent scrolling:
$ cat paths
CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
$ perl -pe 's/(_(\d\d(\.\d\d){2})){2}\.log$/.log/' paths
CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut.log
The timestamp is made up of 2 sequences that look like _##.##.##. The subsequences end with 2 sequences of .##. These are the roles of the {2} quantifiers.
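The same nested-quantifier pattern carries over to Python 3 almost unchanged. A minimal sketch, assuming the timestamp always sits directly before .log; the names TIMESTAMP and strip_timestamp are made up for illustration.

```python
import re

# Two groups of the form _##.##.##, followed (but not consumed) by ".log" at the end.
TIMESTAMP = re.compile(r"(?:_\d\d(?:\.\d\d){2}){2}(?=\.log$)")


def strip_timestamp(path: str) -> str:
    """Remove a trailing _DD.DD.DD_DD.DD.DD timestamp that precedes '.log'."""
    return TIMESTAMP.sub("", path)


print(strip_timestamp(
    "CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log"
))
```

The lookahead keeps .log out of the match, so nothing has to be put back in the replacement.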

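while (<>) {
    chomp;                                      # drop the newline so it is not printed twice
    @s = split /\//;
    $fullpath = join("/", splice @s, 0, $#s);   # everything except the file name
    @a = split /[_.]/, $s[-1];                  # file name split into its pieces
    $newfile = "$fullpath/$a[0].$a[-1]";        # first piece plus the extension
    print $newfile."\n";
}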

You can use the following code:
use strict;
use warnings;
my $var = "/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log";
$var=~s/_\d\d\.\d\d\.\d\d//g;
# $var=~s/_10\.01\.21_16\.54\.18//g; # You can use this way also
print "$var\n";

We Keep Coding
