Splitting argument using regular expression with limit

$aa = "Main:http://google-test.com:8080/service"
(or)
$aa = "http://google-test.com:8080/service2"
I want to split this into two parts:
Main:
http://google-test.com:8080/service
But it is not working with this split:
split (/\:/,$aa,1);

You need to change the limit from 1 to 2.
perl -le 'my $aa="Main:http://google-test.com:8080/service"; my @parts = split(/:/, $aa, 2); print scalar @parts;'
From perldoc -f split:
If LIMIT is specified and positive, it represents the maximum number
of fields the EXPR will be split into,
It looks like you were trying to use it as the maximum number of times to split and not the number of parts to return.
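To make the difference concrete, here is a small sketch of what the different LIMIT values return for the string from the question. The helper name split_with_limit is made up for this illustration and is not part of the original answer.

```perl
use strict;
use warnings;

my $aa = "Main:http://google-test.com:8080/service";

# Hypothetical helper: return the fields produced for a given LIMIT.
sub split_with_limit {
    my ($string, $limit) = @_;
    return split /:/, $string, $limit;
}

my @one = split_with_limit($aa, 1);   # LIMIT 1: at most one field, so no split at all
my @two = split_with_limit($aa, 2);   # LIMIT 2: split at the first colon only
my @all = split_with_limit($aa, 0);   # LIMIT 0: same as omitting it, split at every colon

print scalar(@one), " field:  ", join(" | ", @one), "\n";
print scalar(@two), " fields: ", join(" | ", @two), "\n";
print scalar(@all), " fields: ", join(" | ", @all), "\n";
```

Note that split removes the colon it splits on, so the first field is "Main" rather than "Main:".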

New question, new answer:
my ($a1, $a2) = $aa =~ /^(\w*):?(http:\/\/.+)$/;
This assumes the "Main" part can only contain word characters (alphanumerics and underscore). It will also set $a1 to the empty string if "Main" is left out, which you can check for with an if statement or similar.
Split would work too, with a limit of two, as gpojd has already answered.
my ($a1, $a2) = split /:/, $aa, 2;
But then you would need to check and see what you caught in the two variables. E.g. the URL could be in either $a1 or $a2. And you might need to join them back together afterwards.

You want to split it at the colons?
try:
my @DATA;
$aa = "Main:http://google-test.com:8080/service";
@DATA = split(/:/, $aa);
Then you can access the different parts of the split using:
for ($i = 0; $i < @DATA; $i++)
{
    print "data section $i value is: " . $DATA[$i] . "\n";
}

Related

Passing strings as array to subroutine and return count of specific char

I am trying to work out the right way to tackle this:
I would like to pass an array of n elements as the argument to a subroutine. For each element, I want to match two characters, S and T, and print the count of these letters for that element. So far I have this, but I am stuck and my code runs into an infinite loop.
use strict;
use warnings;
sub main {
    my @array = @_;
    while (@array) {
        my $s = ($_ = tr/S//);
        my $t = ($_ = tr/T//);
        print "ST are in total $s + $t\n";
    }
}
my @bunchOfdata = ("QQQRRRRSCCTTTS", "ZZZSTTKQSST", "ZBQLDKSSSS");
main(@bunchOfdata);
I would like the output to be:
Element 1 Counts of ST = 5
Element 2 Counts of ST = 6
Element 3 Counts of ST = 4
Any clue how to solve this?

while (@array) will be an infinite loop since @array never gets smaller, and it never assigns anything to the default variable $_. For this to work, use for (@array), which sets $_ to each array item in turn until all have been visited.
The tr transliteration operator is the right tool for your task.
The code needed to get your results could be:
#!/usr/bin/perl
use strict;
use warnings;
my @data = ("QQQRRRRSCCTTTS", "ZZZSTTKQSST", "ZBQLDKSSSS");
my $i = 1;
for (@data) {
    my $count = tr/ST//;
    print "Element $i Counts of ST = $count\n";
    $i++;
}
Also, note that my $count = tr/ST//; doesn't require binding the transliteration operator to $_, because tr/// operates on $_ by default.
Your code had my $s = ($_ = tr/S//); with = instead of =~. That is an error: it assigns the count to $_ instead of counting within it. With an explicit binding it has to be $s = ($_ =~ tr/S//); but the shorter way I've shown is the preferred way.
You can combine the two sought letters as in my code. It's not necessary to count them separately.
I got the output you want.
Element 1 Counts of ST = 5
Element 2 Counts of ST = 6
Element 3 Counts of ST = 4
Also, you can't perform math operations in a quoted string like you had.
print "ST are in total $s + $t\n";
Instead, you would need to do:
print "ST are in total ", $s + $t, "\n";
where the operation is performed outside of the string.
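Since the question asks for a subroutine that receives the array and reports a count per element, the pieces above could be put together roughly like this. The name count_st is invented for this sketch; it returns the counts and leaves the printing to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number of S and T characters
# found in each string passed in, in the same order.
sub count_st {
    my @strings = @_;
    my @counts;
    for my $string (@strings) {
        # tr/ST// with an empty replacement list only counts
        push @counts, ($string =~ tr/ST//);
    }
    return @counts;
}

my @data   = ("QQQRRRRSCCTTTS", "ZZZSTTKQSST", "ZBQLDKSSSS");
my @counts = count_st(@data);

for my $i (0 .. $#counts) {
    printf "Element %d Counts of ST = %d\n", $i + 1, $counts[$i];
}
```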

Don't use while to traverse an array - your array gets no smaller, so the condition is always true and you get an infinite loop. You should use for (or foreach) instead.
for (@array) {
    my $s = tr/S//; # No need for =~ as tr/// works on $_ by default
    my $t = tr/T//;
    print "ST are in total ", $s + $t, "\n";
}

Why tr///??
sub main {
    my @array = @_;
    for (@array) {
        my $s = split(/S/, $_, -1) - 1;
        my $t = split(/T/, $_, -1) - 1;
        print "ST are in total ", $s + $t, "\n";
    }
}

Count Characters in Perl

I need to count the letter "e" in the string
$x="012ei ke ek ek ";
So far, I've tried with a for-loop:
$l=length($x);
$a=0;
for($i=0;$i<$l;$i++)
{$s=substr($x,$i,1);
if($s=="e")
{$a++;}
print $a;

Your code has some problems. You forgot to close the for loop brace,
and in Perl == is supposed to compare numbers. Use eq for strings.
It is also recommended that you use warnings and enable strict mode,
which would have helped you debug this. In your case, "e" is treated
as 0 in a numeric comparison, and so is every other non-numeric
character, so the one-character substrings 1 and 2 are the only ones
not equal to "e" when compared with ==. A cleaned up version of your
code could be written as:
use warnings;
use strict;
my $x = "012ei ke ek ek ";
my $l = length $x;
my $count = 0;
for(my $i = 0; $i < $l; $i++) {
my $s = substr($x, $i, 1);
$count++ if ($s eq "e");
}
print $count;
There are multiple ways to achieve this. You could use a match with a
group, which if global returns all the occurrences in list context.
Since you want the number, take this result in scalar context. You can
achieve this for example with:
my $count = () = $string =~ /(e)/g;
Or:
my $count = @{[ $string =~ /(e)/g ]};
Another way is to split the string into characters and grep those that
are e:
my $count = grep $_ eq 'e', split //, $string;
And probably the most compact is to use tr which returns the count of
characters in scalar context, although this does restrict this usage to
counting characters only:
my $count = $string =~ tr/e//;

You compare characters with the numeric operator (==) when you should use the string comparison operator eq. If you had used the warnings pragma you would have seen that.
Your code should have looked like:
#!/usr/bin/env perl
use strict;
use warnings;
my $x = "012ei ke ek ek ";
my $l = length($x);
my $a = 0;
for ( my $i = 0; $i < $l; $i++ ) {
my $s = substr( $x, $i, 1 );
if ( $s eq "e" ) {
$a++;
}
}
print "$a\n";
Proper indentation and the use of the strict and warnings pragmas will avoid and/or catch unintentional, dumb errors.
A much more Perl-ish (and shorter) way to achieve your answer is:
perl -le '$x="012ei ke ek ek";@count=$x=~m/e/g;print scalar @count'
4
This matches globally and collects all the matches in list context. The scalar value of the list gives the number of occurrences you seek.
Another way is to use tr
perl -le '$x="012ei ke ek ek";print scalar $x=~tr/e//'
4

@sidyll already mentioned what the problem in your script is and most of the possible ways, but TIMTOWTDI.
$x="012ei ke ek ek ";
my $count;
$count++ while($x=~/e/g);
print $count;

Selecting highest count of element except when...

So I have been working on this Perl script that will analyze and count the same letters in different line spaces. I have implemented the count with a hash but am having trouble excluding the "-" character from the output of this hash. I tried using delete or next if, but I am not getting rid of the - count in the output.
So with this input:
@extract = ------------------------------------------------------------------MGG-------------------------------------------------------------------------------------
And following code:
#Count selected amino acids.
my %counter = ();
foreach my $extract (@extract) {
    #next if $_ =~ /\-/; #This line code does not function correctly.
    $counter{$_}++;
}
sub largest_value_mem (\%) {
my $counter = shift;
my ($key, #keys) = keys %$counter;
my ($big, #vals) = values %$counter;
for (0 .. $#keys) {
if ($vals[$_] > $big) {
$big = $vals[$_];
$key = $keys[$_];
}
}
$key
}
I expect the most common element to be G, and that is what the output should report. If there is a tie between elements, say G = M, a way to display both would be great but is not necessary. Any tips on how to delete or remove the '-' are much appreciated. I am slowly learning Perl.
Please let me know if what I am asking is not clear or if more information is needed, thanks again kindly for all the comments.

Your data doesn't entirely make sense, since it's not actually working perl code. I'm guessing that it's a string divided into characters. After that it sounds like you just want to be able to find the highest frequency character, which is essentially just a sort by descending count.
Therefore the following demonstrates how to count your characters and then sort the results:
use strict;
use warnings;
my $str = '------------------------------------------------------------------MGG-------------------------------------------------------------------------------------';
my @chars = split '', $str;
# Count characters
my %count;
$count{$_}++ for @chars;
delete $count{'-'}; # Don't count -
# Sort keys by count descending
my @keys = sort {$count{$b} <=> $count{$a}} keys %count;
for my $key (@keys) {
    print "$key $count{$key}\n";
}
Outputs:
G 2
M 1
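The question also asked about ties. One possible way to report every character that shares the highest count is sketched below. The helper name most_common and the short sample strings are invented for illustration, and List::Util (a core module) is used for max.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: return all characters (except '-') that share
# the highest count, sorted alphabetically.
sub most_common {
    my ($str) = @_;
    my %count;
    $count{$_}++ for grep { $_ ne '-' } split //, $str;
    return unless %count;               # nothing left after dropping '-'
    my $max     = max(values %count);
    my @winners = grep { $count{$_} == $max } keys %count;
    return sort @winners;
}

print join(",", most_common('---MGG---')), "\n";   # single winner: G
print join(",", most_common('--MMGG--')), "\n";    # tie: G,M
```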

foreach my $extract (@extract) {
    #next if $_ =~ /\-/
The loop assigns each element to $extract, so $_ is not set here.
(In this case, $_ keeps whatever value it had before the loop, e.g. from earlier code.)
Also, you can use a character class for better readability:
next if $extract =~ /[-]/;

Could i search between keys of a hash and assign its value to a variable in Perl?

I want to use the substr function to extract some nucleotides from sequences. Here are those sequences in FASTA format:
>dvex28051
AAAACAAAAACATTCGCTAGAAAGTAATCAGCTGGTCATTTATTTGAAATGTTAATGATATATTTCATGTTGCTAATTTTTTATGAAAAAAATCATTGCTTATTTAATTACTCTTGGTTCTTGACCAACTATAAAAGCATTGTTTAGTATCAAGTGTCCAGGTATCAGCAGTTTTGTTTGAAAACAAACTTTTATTCATGCAGTCAGTGGCGGATCCAGGTAGAGTGCAGAGGCAGCACCCTCCGTCAGAAAACCAAAAAAAGAAGAAATGAAAAATTATAAAAAAAATTTCTAAACGTTGGTGCACTTAAGTGTAGCAAAAAATTCCTGTTTAGATATTCAGTGGGGAGCGACACCTTTTGGGGCCTATAGCTTCAAATCTTACTTGGTGACCTAAAATCGCTTTTTCGTTGGATCTGCGAAAGCTAGAATTTGGTTGCTGCAAATCGAATCGGTGCATCAACTGCATCAATATCAACGATGTGGTGACTGGTGGTATATTTTGGGTTCGTGCAATGCTACATTTATTTCAATCATATTTCAAGGCAGAAAGGGAAAGAAAACATCAGGTCAAGACAGTGGCGTAGCGAGGGAAGGGGGGCATACGTCCCCGGGCGCAACACGATGTCTTTTTTTTTAATCATCTGCGAAATTCAGACATTTTTTAGAGACTAAATGAAACTATGGAAAACCGGGCCCTTATAAAAGTTGAGACCAAGTGAAAAACTGGGGATAAAACATGAAAATCGGGCTCCAAAAGAATGAGAGTCCGCCCTTGGTCTGTACCAGCATGATTTGAGCGCAAATTTCATTAAGCCCCCGGGCGCAAGACACTCACGCTACGCCCCTGGGTAAAGACAAACAGAGTAGTTTTTCTTATAAACACAAGCATGCACAAACAACATAAAAACAAAACACAGTTTTTTTTAAGACGATGTGCTGCGTGCACCCGCTCAATGTTTTTTTTTTTTTTTTATAGAAAAGCAAAACTTTGAAAGGTTAACGTCAACTCATTTTACAACAATTTGTGGCAAATGGTATCAAGGTATCAAGCAATTAACTAAATGTCTTCCACTAGAACGCAGAACACCATTTTGCAATTATTTATTTGATGTAAACCAGTGTGTTAGATCAAAATCACTTCGACGCCGTTTTTTGACTCCGTGAAAATCTTGGTATTCTTCTCGCATTGCATAATGATGGTTTGTTGAAATAAAATTAAACGCTTAACGTTCTTAAAATGAGCGCGATACTACTTTTCTTTGTAGATTTTCTGCATGCGCTCCTTTTAAGTTGATCCCGAGCTACAAACTTCTTTATGAACGTTTTGGATTTCTCCAAAATAAAGCCTGCAAGCAGTTTTCTAAAAACACCGCACCCCCCATTAGGAATTTCTAGATCCGCCCCTGCATACAGTATTTGTTAATTATTAAAACCAACCAGCAGCAATTGTTTATTCAATGACTATTAAACCAACCTGGATAGTGCGTTTGGTCTTGATTGAAGCGATTGCTGCATTGACGTCTTTCGGAACCACATCACC
>dvex294195
GAATCAGTGGAAAAGTCACAACGCAGCTTGCCGAATTACTGCAGATTCTTTACACTTTTTTTTCTACATTATCACTGTTTTGCTTAATTTTCAATTATAGAAATCAAAATTAATAACTGGTATGTAGTTGGTCGGTGCTTCGAGAAAGTAGCCTACTCAATGATTTCTCAGAATGTTACAGTACTTCAAAAAAACAGACTACCCATTTCAAAAAATATAAACCTAGTA
I want to compare each key of the hash with the Hit column (dvex\d++) of this table:
#Query Hit sense start end star_q end_q lenght_q # this line is informative don't make part of the code.
miRNA1 dvex28051 + 205 232 11 38 51
miRNA1 dvex202016 - 75 106 17 48 51
miRNA1 dvex294195 + 55 85 11 48 51
If it exists, I want to assign its value in the hash to a variable (i.e. $sequence) in order to apply the substr function:
my $fragment = substr $sequence, $start, $length_sequence;
I made an array with the sequences, and tried reading it two values at a time and comparing them:
while (my $line1 = <$MYINPUTFILE>){ #Entry of the sequences Fasta file
    chomp $line1;
    push @array_lines, $line1;
}
while (my $line2 = <$IN>){ #Entry of the table
    chomp $line2;
    push @database_lines, $line2;
}
foreach my $database_line (@database_lines){ #each value of the table
    my @entry = split /\s++/,$database_line;
    $pattern = $entry[1];
    $query = $entry[0];
    $start = $entry[3];
    $l_pattern = length $pattern;
    $end = $entry[4];
    $lng_sequence = ($end - $start) + 1;
    $sense = $entry[2];
    $l_query = $entry[7];
    my $count = 2;
    for (my $i = 0; $i <= $#array_lines; $i +=$count){
        chomp $array_lines[$i-2];
        chomp $array_lines[$i-1];
        $seq = $array_lines[$i-1];
        $header = $array_lines[$i-2];
        if($new_header =~ /$pattern/ && $l_header == $l_pattern){
            if(($end+$right_diff+$increment) > $l_query){
                $clean_seq = substr $seq, $start, $l_query;
            } else {;}
        }
The problem with my code is that Perl always takes $seq to be the last sequence and always applies substr to it. I need to look up $pattern among the sequences and, if it exists, assign the matching sequence to $seq and then apply substr.
Any suggestions?

I see two significant problems with your code. First, in the loop:
for (my $i = 0; $i <= $#array_lines; $i +=$count){
chomp $array_lines[$i-2];
chomp $array_lines[$i-1];
$seq = $array_lines[$i-1];
$i is set to zero the first time through, but you access array elements $i-1 and $i-2. Element -1 will be the last element of the array, and -2 will be the second to the last element. So it looks like $seq and $header will have incorrect values the first time through your loop. Maybe you need to start $i at $count instead of zero?
Secondly, in this line:
if(($end+$right_diff+$increment) > $l_query){
$increment appears only here in your code. It is never set to anything. Did you mean to use $i here?
A few other suggestions:
Make sure you use warnings; use strict; This will catch errors such as the $increment variable above.
Here is a simpler way to read a file into an array:
my @array_lines = <$MYINPUTFILE>;
chomp @array_lines;
Within regexes, ++ is a possessive quantifier that disables backtracking. If you want to split on one or more whitespace characters, it is more typical to use split /\s+/, or the nearly equivalent split ' ', which also ignores leading whitespace.
With this line, you appear to be simply checking that two strings are equal:
if($new_header =~ /$pattern/ && $l_header == $l_pattern)
You could just do this instead:
if($new_header eq $pattern)
When you have multiple conditions, it is clearer to put them all in one if statement instead of using nested statements. If you have many conditions, you can put them on multiple lines for clarity.
It isn't necessary to use else {;} If you don't need to do anything there, just omit the else clause altogether.
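The title of the question mentions a hash, and a hash keyed by the FASTA header would avoid the inner loop entirely. The following is only a sketch under assumptions: the sequences and table rows are shortened, made-up stand-ins for the real files, the names %sequence_for and fragment_for are mine, and I am assuming the start and end columns are 1-based and inclusive while substr counts from 0.

```perl
use strict;
use warnings;

# Shortened stand-in data; the real input would be read from the two files.
my @fasta_lines = (
    '>dvex28051',  'AAAACAAAAACATTCGCTAG',
    '>dvex294195', 'GAATCAGTGGAAAAGTCACA',
);
my @table_lines = (
    'miRNA1 dvex28051 + 3 8 11 38 51',
    'miRNA1 dvex202016 - 2 5 17 48 51',   # no matching sequence above
);

# Build a hash: header (without the leading '>') => sequence.
my %sequence_for;
for (my $i = 0; $i < $#fasta_lines; $i += 2) {
    (my $header = $fasta_lines[$i]) =~ s/^>//;
    $sequence_for{$header} = $fasta_lines[$i + 1];
}

# Hypothetical helper: return the fragment for one table row,
# or undef when the Hit column is not a key of the hash.
sub fragment_for {
    my ($row) = @_;
    my ($query, $hit, $sense, $start, $end) = split ' ', $row;
    return undef unless exists $sequence_for{$hit};
    # Assumption: table coordinates are 1-based and inclusive.
    return substr $sequence_for{$hit}, $start - 1, $end - $start + 1;
}

for my $row (@table_lines) {
    my $fragment = fragment_for($row);
    print defined($fragment) ? "$fragment\n" : "no sequence for this hit\n";
}
```

The exists test is an exact key lookup, so no regex match or length comparison against the header is needed.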

Reformulate a string query in perl

How do I reformulate a string in Perl?
For example, consider the string "Where is the Louvre located?"
How can I generate strings like the following:
"the is Louvre located"
"the Louvre is located"
"the Louvre located is"
These are being used as queries to do a web search.
I was trying to do something like this:
Get rid of punctuation and split the sentence into words.
my @words = split / /, $_[0];
I don't need the first word in the string, so I get rid of it.
shift(@words);
And then I need to move the next word throughout the array, but I am not sure how to do this.
Finally, convert the array of words back to a string.

See "How can I generate all permutations of an array in Perl?"
Then use join to glue each permutation array back together into a single string.
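As a rough illustration of that idea, here is a plain recursive sketch that builds every ordering and joins each one into a string. The sub name permutations is my own, and this is not necessarily the approach the linked material uses. It generates all orderings, which is more than the three strings the question lists.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of array references,
# one for each ordering of the items passed in.
sub permutations {
    my @items = @_;
    return ([]) unless @items;          # one empty ordering for no items
    my @result;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($picked) = splice @rest, $i, 1;   # take item $i out of the copy
        push @result, [ $picked, @$_ ] for permutations(@rest);
    }
    return @result;
}

# Glue each permutation back together into a single query string.
print join(" ", @$_), "\n" for permutations(qw(the Louvre located));
```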

Somewhat more verbose example:
use strict;
use warnings;
use Data::Dumper;
my $str = "Where is the Louvre located?";
# split into words and remove the punctuation
my @words = map {s/\W+//; $_} split / /, $str;
# remove the first two words while storing the second
my $moving = splice @words, 0 ,2;
# generate the variations
my @variants;
foreach my $position (0 .. $#words) {
    my @temp = @words;
    splice @temp, $position, 0, $moving;
    push @variants, \@temp;
}
print Dumper(\@variants);

my @head;
my ($x, @tail) = @words;
while (@tail) {
    push @head, shift @tail;
    print join(" ", @head, $x, @tail), "\n";
}
Or you can just "bubble" $x through the array by swapping $words[$n-1] and $words[$n]:
foreach my $n (1 .. $#words) {
    ($words[$n-1], $words[$n]) = ($words[$n], $words[$n-1]);
    print join(" ", @words), "\n";
}
