I have a string like this:
my $masterP = "A:B:C a:b:c a:c:b A:C:B B:C:A";
my (#scen) = split (/ /, $$masterP);
foreach my $key (#scen) {
my ($string1, $string2, $string3) = split (/:/, $key);
my $new = "${string1}_${string2}";
my $try .= $try . "$new";
}
print "$try\n";
I am expecting $try to print: A_B a_b a_c A_C B_C (with space), but it doesn't work. How can this be fixed?
This will do what you require:
use strict;
use warnings;
my $masterP = "A:B:C a:b:c a:c:b A:C:B B:C:A";
my #scen = split ' ', $masterP;
my #try = map { join '_', (split /:/)[0,1] } #scen;
my $try = "#try";
print "$try\n";
Output
A_B a_b a_c A_C B_C
Please always use Strict and Warnings to be get valuable code:
use strict;
use warnings;
my $masterP = "A:B:C a:b:c a:c:b A:C:B B:C:A";
my #scen = split(/ /, $masterP);
my $try;
foreach my $key (#scen) {
my ($string1, $string2, $string3) = split (/:/, $key);
my $new = "${string1}_${string2}";
$try .= " $new";
$try =~ s/^\s//;
chomp($try);
}
print "$try\n";
Related
How could I add dynamically a modifier to a substitution?
my $str = 'aaBaabaabaa';
my $mod = 'g';
$str =~ s/b// + $mod;
qr/(?X:).../ doesn't work for operation-only modifiers like /g.
A possible approach would be:
#!/usr/bin/perl
use strict;
use warnings;
my $str = 'aaBaabaabaa';
print "BEFORE: ${str}\n";
# regex configuration
my $pattern = '(.)(.)'; # 'b';
my $replacement = '"$2$1"'; # '';
my $mod = 'ieg'; # 'g';
# extract operation-only modifiers
my $operation_only = qr/([ge])/;
my %special;
$special{$_}++ foreach ($mod =~ /$operation_only/g);
my $mod_filtered;
($mod_filtered = $mod) =~ s/$operation_only//g;
# generate regex
my $regex = "(?${mod_filtered}:${pattern})";
my $compiled = qr/$regex/;
print "s/ ${regex} -> ${compiled} / ${replacement} / $mod -> ", join('', sort keys %special), "\n";
# execute substitution
my $replacement_code = sub { return $replacement };
for (1..$special{e} // 0) {
my $recurse = $replacement_code;
$replacement_code = sub { return eval $recurse->() };
}
if (exists $special{g}) {
$str =~ s/$compiled/$replacement_code->()/ge;
} else {
$str =~ s/$compiled/$replacement_code->()/e;
}
print "AFTER: ${str}\n";
exit 0;
Output:
$ perl dummy.pl
BEFORE: aaBaabaabaa
s/ (?i:(.)(.)) -> (?^:(?i:(.)(.))) / "$2$1" / ieg -> eg
AFTER: aaaBbaaaaba
I would like to create a program that replaces characters and retains the special characters. An example input and output is shown below.
Here's what I did so far:
$sentence = userinput;
#words = split(/ /, $sentence);
for ($i = 0; $i < #words.length; $i ++){
$words[$i] =~ s/\W//g;
#characters = split(//, $words[$i]);
#print $words[$i] . "\n";
$wordlength = length($words[$i]);
for ($j = 0; $j < #characters.length; $j ++){
$char = $characters[$j];
for ($x = 0; $x < $wordlength; $x++){
$char++;
if ($char eq "aa"){
$char = "a";
}
elsif ($char eq "AA"){
$char = "A";
}
}
print $char;
if ($x = 0){
$output[$i] = $char;
}
else {
$output[$i] = join ($char);
}
}
print $output[$i];
}
Input:
Hi! how are you doing?
Output:
Jk! krz duh brx itnsl?
A couple of things in your code don't make sense:
Missing use strict; use warnings;.
All variables are global (you should be using my to create variables)
#foo.length is not the number of elements in the array #foo. It's the number of elements in the array #foo concatenated with the number of characters in $_ (because arrays in scalar context return their length, . concatenates strings, and length works on $_ by default).
join ($char) always returns the empty string: You're joining an empty list (no elements) using $char as a separator.
Here's an attempt to fix all of these issues:
use strict;
use warnings;
my $sentence = readline;
$sentence =~ s{([A-Za-z]+)}{
my $word = $1;
join '', map {
my $base = ord(/^[A-Z]/ ? 'A' : 'a');
chr((ord($_) - $base + length($word)) % 26 + $base)
} split //, $word
}eg;
print $sentence;
I think what you are doing is rot3 encoding, but if so then your example is wrong
my $sentence = 'Hi! how are you doing?';
$sentence =~ tr/A-Za-z/D-ZA-Cd-za-c/;
print $sentence, "\n";
output
Kl! krz duh brx grlqj?
which is similar, but not identical to
Jk! krz duh brx itnsl?
I want to compare a string in column A with that in column B for every row and print a third column that highlights the differences.
Column A Column B
uuaaugcuaauugugauaggggu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguuu uuaaugcuaauugugauaggggu
Desired Result:
Column A Column B Column C
uuaaugcuaauugugauaggggu uuaaugcuaauugugauaggggu ********************
uuaaugcuaauugugauagggguu uuaaugcuaauugugauaggggu ********************u
uuaaugcuaauugugauagggguuu uuaaugcuaauugugauaggggu ********************uu
I have an example script that might work, but how do I do this for every row in the data frame?
use strict;
use warnings;
my $string1 = 'AAABBBBBCCCCCDDDDD';
my $string2 = 'AEABBBBBCCECCDDDDD';
my $result = '';
for(0 .. length($string1)) {
my $char = substr($string2, $_, 1);
if($char ne substr($string1, $_, 1)) {
$result .= "**$char**";
} else {
$result .= $char;
}
}
print $result;
Using bruteforce and substr
use strict;
use warnings;
while (<DATA>) {
my ($str1, $str2) = split;
my $len = length $str1 < length $str2 ? length $str1 : length $str2;
for my $i (0..$len-1) {
my $c1 = substr $str1, $i, 1;
my $c2 = substr $str2, $i, 1;
if ($c1 eq $c2) {
substr $str1, $i, 1, '*';
substr $str2, $i, 1, '*';
}
}
printf "%-30s %s\n", $str1, $str2;
}
__DATA__
Column_A Column_B
uuaaugcuaauugugauaggggu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguuu uuaaugcuaauugugauaggggu
AAABBBBBCCCCCDDDDD AEABBBBBCCECCDDDDD
Outputs:
*******A *******B
*********************** ***********************
***********************u ***********************
***********************uu ***********************
*A********C******* *E********E*******
Alternative using XOR
It's also possible to use ^ to find the intersection between two strings.
The following performs the same as the above:
while (<DATA>) {
my ($str1, $str2) = split;
my $intersection = $str1 ^ $str2;
while ($intersection =~ /(\0+)/g) {
my $len = length $1;
my $pos = pos($intersection) - $len;
substr $str1, $pos, $len, '*' x $len;
substr $str2, $pos, $len, '*' x $len;
}
printf "%-30s %s\n", $str1, $str2;
}
I could not resist to provide a modified Miller's solution with regular expressions
use strict;
use warnings;
while (<DATA>) {
my $masked_str1 ="";
my $masked_str2 ="";
my ($str1, $str2) = split;
my $intersection = $str1 ^ $str2;
while ($intersection =~ /(\x00+)/g) {
my $mask = $intersection;
$mask =~ s/\x00/1/g;
$mask =~ s/[^1]/0/g;
while ( $mask =~ /\G(.)/gc ) { # traverse the mask
my $bit = $1;
if ( $str1 =~ /\G(.)/gc ) { # traverse the string1 to be masked
$masked_str1 .= $bit ? '_' : $1;
}
if ( $str2 =~ /\G(.)/gc ) { # traverse the string2 to be masked
$masked_str2 .= $bit ? '_' : $1;
}
}
}
print "=" x 80;
printf "\n%-30s %s\n", $str2, $str1; # Minimum length 30 char, left-justified
printf "%-30s %s\n", $str1, $str2;
printf "%-30s %s\n\n", $masked_str1, $masked_str2;
}
I have a perl script but it calculate molecular weight only when sequence is given. However I want to calculate molecular weight of protein sequences which is in fasta file.
print "Enter the amino acid sequence:\n";
$a = < STDIN > ;
chomp($a);
my #a = ();
my $a = '';
$x = length($a);
print "Length of sequence is : $x";
#a = split('', $a);
$b = 0;
my %data = (
A=>71.09, R=>16.19, D=>114.11, N=>115.09,
C=>103.15, E=>129.12, Q=>128.14, G=>57.05,
H=>137.14, I=>113.16, L=>113.16, K=>128.17,
M=>131.19, F=>147.18, P=>97.12, S=>87.08,
T=>101.11, W=>186.12, Y=>163.18, V=>99.14
);
foreach $i(#a) {
$b += $data{$i};
}
$c = $b - (18 * ($x - 1));
print "\nThe molecular weight of the sequence is $c";
first of all u must tell us what format has .fasta files. As i know they looks like
>seq_ID_1 descriptions etc
ASDGDSAHSAHASDFRHGSDHSDGEWTSHSDHDSHFSDGSGASGADGHHAH
ASDSADGDASHDASHSAREWAWGDASHASGASGASGSDGASDGDSAHSHAS
SFASGDASGDSSDFDSFSDFSD
>seq_ID_2 descriptions etc
ASDGDSAHSAHASDFRHGSDHSDGEWTSHSDHDSHFSDGSGASGADGHHAH
ASDSADGDASHDASHSAREWAWGDASHASGASGASG
if we will make suggestion that your code works fine, and counts molecular weight all we need is to read fasta files, parse them and count weight by yours code. It's more easy that sounds like.
#!/usr/bin/perl
use strict;
use warnings;
use Encode;
for my $file (#ARGV) {
open my $fh, '<:encoding(UTF-8)', $file;
my $input = join q{}, <$fh>;
close $fh;
while ( $input =~ /^(>.*?)$([^>]*)/smxg ) {
my $name = $1;
my $seq = $2;
$seq =~ s/\n//smxg;
my $mass = calc_mass($seq);
print "$name has mass $mass\n";
}
}
sub calc_mass {
my $a = shift;
my #a = ();
my $x = length $a;
#a = split q{}, $a;
my $b = 0;
my %data = (
A=>71.09, R=>16.19, D=>114.11, N=>115.09,
C=>103.15, E=>129.12, Q=>128.14, G=>57.05,
H=>137.14, I=>113.16, L=>113.16, K=>128.17,
M=>131.19, F=>147.18, P=>97.12, S=>87.08,
T=>101.11, W=>186.12, Y=>163.18, V=>99.14
);
for my $i( #a ) {
$b += $data{$i};
}
my $c = $b - (18 * ($x - 1));
return $c;
}
Very new here so be gentle. :)
Here is the jist of what I want to do:
I want to take a string that is made up of numbers separated by semi-colons (ex. 6;7;8;9;1;17;4;5;90) and replace every "X" number of semicolons with a "\n" instead. The "X" number will be defined by the user.
So if:
$string = "6;7;8;9;1;17;4;5;90";
$Nth_number_of_semicolons_to_replace = 3;
The output should be:
6;7;8\n9;1;17\n4;5;90
I've found lots on changing the Nth occurrence of something but I haven't been able to find anything on changing every Nth occurrence of something like I am trying to describe above.
Thanks for all your help!
use List::MoreUtils qw(natatime);
my $input_string = "6;7;8;9;1;17;4;5;90";
my $it = natatime 3, split(";", $input_string);
my $output_string;
while (my #vals = $it->()) {
$output_string .= join(";", #vals)."\n";
}
Here is a quick and dirty answer.
my $input_string = "6;7;8;9;1;17;4;5;90";
my $count = 0;
$input_string =~ s/;/++$count % 3 ? ";" : "\n"/eg;
Don't have time for a full answer now, but this should get you started.
$string = "6;7;8;9;1;17;4;5;90";
$Nth_number_of_semicolons_to_replace = 3;
my $regexp = '(' . ('\d+;' x ($Nth_number_of_semicolons_to_replace - 1)) . '\d+);';
$string =~ s{ $regexp ) ; }{$1\n}xsmg
sub split_x{
my($str,$num,$sep) = #_;
return unless defined $str;
$num ||= 1;
$sep = ';' unless defined $sep;
my #return;
my #tmp = split $sep, $str;
while( #tmp >= $num ){
push #return, join $sep, splice #tmp, 0, $num;
}
push #return, join $sep, #tmp if #tmp;
return #return;
}
print "$_\n" for split_x '6;7;8;9;1;17;4;5;90', 3
print join( ',', split_x( '6;7;8;9;1;17;4;5;90', 3 ) ), "\n";
my $string = "6;7;8;9;1;17;4;5;90";
my $Nth_number_of_semicolons_to_replace = 3;
my $num = $Nth_number_of_semicolons_to_replace - 1;
$string =~ s{ ( (?:[^;]+;){$num} [^;]+ ) ; }{$1\n}gx;
print $string;
prints:
6;7;8
9;1;17
4;5;90
The regex explained:
s{
( # start of capture group 1
(?:[^;]+;){$num} # any number of non ';' characters followed by a ';'
# repeated $num times
[^;]+ # any non ';' characters
) # end of capture group
; # the ';' to replace
}{$1\n}gx; # replace with capture group 1 followed by a new line
If you've got 5.10 or higher, this could do the trick:
#!/usr/bin/perl
use strict;
use warnings;
my $string = '1;2;3;4;5;6;7;8;9;0';
my $n = 3;
my $search = ';.*?' x ($n -1);
print "string before: [$string]\n";
$string =~ s/$search\K;/\n/g;
print "print string after: [$string]\n";
HTH,
Paul