The problem is that the result looks as if split were operating on an undefined value.
Becky.
split string perl code
sub start_thread {
    my @args = @_;
    print('Thread started: ', @args, "\n");
    open(my $myhandle, '<', @args) or die "unable to open file"; # typical open call
    my @aftersplit;
    for (;;) {
        while (<$myhandle>) {
            chomp;
            @aftersplit = split('|', $_);
            #print $_."\n";
            foreach my $val (@aftersplit) {
                print $val."\n";
            }
        }
        sleep 1;
        seek FH, 0, 1; # this clears the eof flag on FH
    }
}
It should split the string in $_ and save the pieces in the array @aftersplit.
You have to escape | because it is a special character in a regex:
my @aftersplit = split(/\|/, $_);
You need to escape the special character | with \
@aftersplit = split('\|', $_);
You need to escape your delimiter, since it's a special character.
For certain special characters you need to precede the character with a literal \
my @aftersplit = split '\|', $_;
You can also use quotemeta.
my $separator = quotemeta('|');
my @aftersplit = split /$separator/, $_;
Or use the escape sequence \Q
my @aftersplit = split /\Q|/, $_;
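To see why the unescaped version misbehaves, here is a minimal sketch that compares it with the escaped forms from the answers above. The sample record and the variable names are made up for illustration. The first argument to split is treated as a pattern, so a bare | is an alternation of two empty branches and matches between every character.

```perl
use strict;
use warnings;

my $line = 'alpha|beta|gamma';    # hypothetical sample record

# Unescaped: the pattern matches the empty string, so the line
# falls apart into single characters (the pipes included).
my @chars = split '|', $line;

# Escaped: each of these treats the pipe as a literal delimiter.
my @fields    = split /\|/, $line;
my @fields_q  = split /\Q|\E/, $line;
my $separator = quotemeta('|');
my @fields_qm = split /$separator/, $line;

print scalar(@chars), " pieces without escaping\n";
print join(',', @fields), "\n";
```

All three escaped variants should produce the same three fields.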
I am trying to write a big script but I am stuck on one part. I want to split an array on "..".
From the script I got this:
print @coordinates;
gene complement(872..1288)
my desired output:
complement 872 1288
I tried:
1) my @answer = split(.., @coordinates)
print("@answer\n");
2) my @answer = split /../, @coordinates;
3) print +(split /\../)[-1],[-2],[-3] while <@coordinates>
4) foreach my $anwser ( @coordinates )
{$anwser =~ s/../"\t"/;
print $anwser;}
5) my @answer = split(/../, "complement(872..1288)"); #to see if the printed array is problematic.
which prints:
) ) ) ) ) ) ) ) )
6) my @answer = split /"gene "/, @coordinates; # I tried to "catch" the entire output's spaces and tabs
which prints
0000000000000000000000000000000001000000000100000000
But none of them works. Does anyone have any idea how to get past this issue?
P.S. Unfortunately, I can't run my script on Linux right now, so I used this website to run it. I hope this is not the reason why I didn't get my desired output.
my $RE_COMPLEMENT = qr{(complement)\((\d+)\.\.(\d+)\)}msx;
for my $item (@coordinates) {
    my ($head, $i, $j) = $item =~ $RE_COMPLEMENT;
    if (defined($head) && defined($i) && defined($j)) {
        print("$head\t$i\t$j\n");
    }
}
split operates on a scalar, not on an array.
my $string = 'gene complement(872..1288)';
my @parts = split /\.\./, $string;
print $parts[0]; # gene complement(872
print $parts[1]; # 1288)
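A small sketch of what this means for the attempts in the question, assuming @coordinates holds one string per feature (the second element below is invented for illustration). Passing the array itself puts it in scalar context, so split sees the element count rather than the strings. Splitting each element separately gives the expected pieces.

```perl
use strict;
use warnings;

# The second entry is a hypothetical extra record.
my @coordinates = ('gene complement(872..1288)', 'gene complement(1300..1500)');

# The array is evaluated in scalar context, so this splits the string "2".
my @not_what_you_want = split /\.\./, @coordinates;

# Split each element on its own instead; every result is a two-element array ref.
my @pairs = map { [ split /\.\./, $_ ] } @coordinates;

print "$_->[0] | $_->[1]\n" for @pairs;
```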
To get the desired output, you can use a substitution:
my $string = 'gene complement(872..1288)';
$string =~ s/gene +|\)//g;
$string =~ s/\.\./ /;
$string =~ s/\(/ /;
The desired effect can be achieved by using the tr operator to replace each of '(', '.' and ')' with a space, splitting the string on whitespace, keeping only the required elements of the resulting list, and printing those elements joined with tabs.
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ tr/(.)/ /;
my @elements = (split ' ', $data)[1..3];
say join "\t", @elements;
__DATA__
gene complement(872..1288)
Or as an alternative solution with only substitutions (without splitting data string into array)
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ s/gene\s+//;
$data =~ s/\)//;
$data =~ s/[(.]+/\t/g;
say $data;
__DATA__
gene complement(872..1288)
Output
complement 872 1288
I feel there must be a better way to count the occurrences of a character than writing a sub, either in Perl or in a Linux shell.
#!/usr/bin/perl -w
use strict;
return 1 unless $0 eq __FILE__;
main() if $0 eq __FILE__;
sub main{
my $str = "ru8xysyyyyyyysss6s5s";
my $char = "y";
my $count = count_occurrence($str, $char);
print "count<$count> of <$char> in <$str>\n";
}
sub count_occurrence{
my ($str, $char) = @_;
my $len = length($str);
$str =~ s/$char//g;
my $len_new = length($str);
my $count = $len - $len_new;
return $count;
}
If the character is constant, the following is best:
my $count = $str =~ tr/y//;
If the character is variable, I'd use the following:
my $count = length( $str =~ s/[^\Q$char\E]//rg );
I'd only use the following if I wanted compatibility with versions of Perl older than 5.14 (as it is slower and uses more memory):
my $count = () = $str =~ /\Q$char/g;
The following uses no memory, but might be a bit slow:
my $count = 0;
++$count while $str =~ /\Q$char/g;
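As a quick sanity check, the sketch below runs three of these approaches side by side on the string from the question. The variable names $by_tr, $by_list and $by_loop are only for illustration. The s///r variant is left out so the sketch also runs on Perl versions older than 5.14.

```perl
use strict;
use warnings;

my $str  = "ru8xysyyyyyyysss6s5s";
my $char = "y";

# tr counts matching characters; the character must be written literally.
my $by_tr = ($str =~ tr/y//);

# Assigning to an empty list, then to a scalar, yields the match count.
my $by_list = () = $str =~ /\Q$char\E/g;

# Stepping through the matches one by one avoids building a list.
my $by_loop = 0;
++$by_loop while $str =~ /\Q$char\E/g;

print "$by_tr $by_list $by_loop\n";
```

All three counts should agree with the 8 reported by the shell one-liners further down.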
Counting the occurrences of a character in a string can be done in one line of Perl (compared to your four lines). There is no need for a sub (although there is nothing wrong with encapsulating functionality in a sub). From perlfaq4, "How can I count the number of occurrences of a substring within a string?":
use warnings;
use strict;
my $str = "ru8xysyyyyyyysss6s5s";
my $char = "y";
my $count = () = $str =~ /\Q$char/g;
print "count<$count> of <$char> in <$str>\n";
In a beautiful* Bash/Coreutils/Grep one-liner:
$ str=ru8xysyyyyyyysss6s5s
$ char=y
$ fold -w 1 <<< "$str" | grep -c "$char"
8
Or maybe
$ grep -o "$char" <<< "$str" | wc -l
8
The first one works only if the substring is just one character long; the second one works only if the substrings are non-overlapping.
* Not really.
toolic has given a correct answer, but you might consider not hardcoding your values to make the program reusable.
use strict;
use warnings;
die "Usage: $0 <text> <characters>" if @ARGV < 1;
my $search = shift; # the string you are looking for
my $str; # the input string
if (@ARGV && -e $ARGV[0] || !@ARGV) { # if str is file, or there is no str
local $/; # slurp input
$str = <>; # use diamond operator
} else { # else just use the string
$str = shift;
}
my $count = () = $str =~ /\Q$search\E/gms;
print "Found $count of '$search' in '$str'\n";
This will allow you to use the program to count for the occurrence of a character, or a string, inside a string, a file, or standard input. For example:
count.pl needles haystack.txt
some_process | count.pl foo
count.pl x xyzzy
Very new here so be gentle. :)
Here is the gist of what I want to do:
I want to take a string that is made up of numbers separated by semi-colons (ex. 6;7;8;9;1;17;4;5;90) and replace every "X" number of semicolons with a "\n" instead. The "X" number will be defined by the user.
So if:
$string = "6;7;8;9;1;17;4;5;90";
$Nth_number_of_semicolons_to_replace = 3;
The output should be:
6;7;8\n9;1;17\n4;5;90
I've found lots on changing the Nth occurrence of something but I haven't been able to find anything on changing every Nth occurrence of something like I am trying to describe above.
Thanks for all your help!
use List::MoreUtils qw(natatime);
my $input_string = "6;7;8;9;1;17;4;5;90";
my $it = natatime 3, split(";", $input_string);
my $output_string;
while (my @vals = $it->()) {
    $output_string .= join(";", @vals)."\n";
}
Here is a quick and dirty answer.
my $input_string = "6;7;8;9;1;17;4;5;90";
my $count = 0;
$input_string =~ s/;/++$count % 3 ? ";" : "\n"/eg;
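Since the question asks for a user-defined N, the same counting trick can be wrapped in a sub that takes N as a parameter. This is only a sketch; replace_every_nth is a hypothetical name, and it assumes N is a positive integer.

```perl
use strict;
use warnings;

# replace_every_nth is a hypothetical helper: it turns every $n-th
# semicolon in $string into a newline and leaves the others alone.
sub replace_every_nth {
    my ($string, $n) = @_;
    my $count = 0;
    # /e evaluates the replacement as Perl code for each semicolon found.
    $string =~ s/;/++$count % $n ? ';' : "\n"/eg;
    return $string;
}

print replace_every_nth("6;7;8;9;1;17;4;5;90", 3), "\n";
```

Because the counter lives inside the sub, repeated calls do not interfere with each other.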
Don't have time for a full answer now, but this should get you started.
$string = "6;7;8;9;1;17;4;5;90";
$Nth_number_of_semicolons_to_replace = 3;
my $regexp = '(' . ('\d+;' x ($Nth_number_of_semicolons_to_replace - 1)) . '\d+);';
$string =~ s{$regexp}{$1\n}g;
sub split_x{
    my($str,$num,$sep) = @_;
    return unless defined $str;
    $num ||= 1;
    $sep = ';' unless defined $sep;
    my @return;
    my @tmp = split $sep, $str;
    while( @tmp >= $num ){
        push @return, join $sep, splice @tmp, 0, $num;
    }
    push @return, join $sep, @tmp if @tmp;
    return @return;
}
print "$_\n" for split_x '6;7;8;9;1;17;4;5;90', 3;
print join( ',', split_x( '6;7;8;9;1;17;4;5;90', 3 ) ), "\n";
my $string = "6;7;8;9;1;17;4;5;90";
my $Nth_number_of_semicolons_to_replace = 3;
my $num = $Nth_number_of_semicolons_to_replace - 1;
$string =~ s{ ( (?:[^;]+;){$num} [^;]+ ) ; }{$1\n}gx;
print $string;
prints:
6;7;8
9;1;17
4;5;90
The regex explained:
s{
( # start of capture group 1
(?:[^;]+;){$num} # one or more non-';' characters followed by a ';',
# repeated $num times
[^;]+ # one or more non-';' characters
) # end of capture group
; # the ';' to replace
}{$1\n}gx; # replace with capture group 1 followed by a new line
If you've got 5.10 or higher, this could do the trick:
#!/usr/bin/perl
use strict;
use warnings;
my $string = '1;2;3;4;5;6;7;8;9;0';
my $n = 3;
my $search = ';.*?' x ($n -1);
print "string before: [$string]\n";
$string =~ s/$search\K;/\n/g;
print "print string after: [$string]\n";
HTH,
Paul
I have strings similar to this
INSERT INTO `log_action` VALUES (1,'a',1,4),(2,'a',1,1),(3,'a',4,4),(4,'a',1,1),(5,'a',6,4);
where I would like to add a number each of the first values, so it becomes (when value is 10)
INSERT INTO `log_action` VALUES (11,'a',1,4),(12,'a',1,1),(13,'a',4,4),(14,'a',1,1),(15,'a',6,4);
I have tried this
#!/usr/bin/perl -w
use strict;
my $input;
if ($#ARGV == 0) {
$input = $ARGV[0];
} else {
print "Usage: test.pl filename\n\n";
die "Wrong number of arguments.\n";
}
my $value;
$value = 10;
open(FILE, '<', $input) or die $!;
foreach my $line (<FILE>) {
if ($line =~ m/^INSERT INTO \`log_action\` VALUES/) {
$line =~ s/\((\d+),/\($1+$value,/ge;
print $line . "\n";
}
}
close FILE;
It fails because of the \($1+$value, in the replacement. The \( and , are there because the search pattern consumes those characters.
Any suggestions how to solve it?
You were almost there, but the part you put in the replacement side of s///e needs to be valid Perl. You are evaluating Perl code:
my $string =<<HERE;
INSERT INTO `log_action` VALUES
(1,'a',1,4),(2,'a',1,1),(3,'a',4,4),(4,'a',1,1),(5,'a',6,4);
HERE
my $value = 10;
$string =~ s/\((\d+),/ '(' . ($1+$value) . ',' /ge;
print "$string\n";
The Perl code that /e evaluates is just a string concatenation:
'(' . ($1+$value) . ','
However, when I want to match parts of the string that I don't want to replace, I use lookarounds so those parts aren't part of the replacement:
my $string =<<HERE;
INSERT INTO `log_action` VALUES
(1,'a',1,4),(2,'a',1,1),(3,'a',4,4),(4,'a',1,1),(5,'a',6,4);
HERE
my $value = 10;
$string =~ s/ (?<=\() (\d+) (?=,) / $1+$value /xge;
print "$string\n";
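A variation on the same idea, offered as a sketch rather than a drop-in fix: on Perl 5.10 or later, \K can stand in for the lookbehind, so that only the digits are replaced. The helper name shift_first_values is hypothetical, and the input is a shortened version of the statement from the question.

```perl
use strict;
use warnings;

# shift_first_values is a hypothetical helper name.
sub shift_first_values {
    my ($sql, $offset) = @_;
    # \K keeps the "(" out of the replaced text; (?=,) leaves the comma alone.
    $sql =~ s/\(\K(\d+)(?=,)/$1 + $offset/ge;
    return $sql;
}

print shift_first_values("INSERT INTO `log_action` VALUES (1,'a',1,4),(2,'a',1,1);", 10), "\n";
```

Only a number that directly follows an opening parenthesis is changed, so the other values in each tuple stay as they are.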
A perl script I'm writing needs to parse a file that has continuation lines like a Makefile. i.e. lines that begin with whitespace are part of the previous line.
I wrote the code below but don't feel like it is very clean or perl-ish (heck, it doesn't even use "redo"!)
There are many edge cases: EOF at odd places, single-line files, files that start or end with a blank line (or non-blank line, or continuation line), empty files. All my test cases (and code) are here: http://whatexit.org/tal/flatten.tar
Can you write cleaner, perl-ish, code that passes all my tests?
#!/usr/bin/perl -w
use strict;
sub process_file_with_continuations {
my $processref = shift @_;
my $nextline;
my $line = <ARGV>;
$line = '' unless defined $line;
chomp $line;
while (defined($nextline = <ARGV>)) {
chomp $nextline;
next if $nextline =~ /^\s*#/; # skip comments
$nextline =~ s/\s+$//g; # remove trailing whitespace
if (eof()) { # Handle EOF
$nextline =~ s/^\s+/ /;
if ($nextline =~ /^\s+/) { # indented line
&$processref($line . $nextline);
}
else {
&$processref($line);
&$processref($nextline) if $nextline ne '';
}
$line = '';
}
elsif ($nextline eq '') { # blank line
&$processref($line);
$line = '';
}
elsif ($nextline =~ /^\s+/) { # indented line
$nextline =~ s/^\s+/ /;
$line .= $nextline;
}
else { # non-indented line
&$processref($line) unless $line eq '';
$line = $nextline;
}
}
&$processref($line) unless $line eq '';
}
sub process_one_line {
my $line = shift @_;
print "$line\n";
}
process_file_with_continuations \&process_one_line;
How about slurping the whole file into memory and processing it with regular expressions? That is much more 'perlish'. This passes your tests and is much smaller and neater:
#!/usr/bin/perl
use strict;
use warnings;
$/ = undef; # we want no input record separator.
my $file = <>; # slurp whole file
$file =~ s/^\n//; # Remove newline at start of file
$file =~ s/\s+\n/\n/g; # Remove trailing whitespace.
$file =~ s/\n\s*#[^\n]+//g; # Remove comments.
$file =~ s/\n\s+/ /g; # Merge continuations
# Done
print $file;
If you don't mind loading the entire file in memory, then the code below passes the tests.
It stores the lines in an array, adding each line either to the previous one (continuation) or at the end of the array (other).
#!/usr/bin/perl
use strict;
use warnings;
my @out;
while( <>)
{ chomp;
s{#.*}{}; # suppress comments
next unless( m{\S}); # skip blank lines
if( s{^\s+}{ }) # does the line start with spaces?
{ $out[-1] .= $_; } # yes, continuation, add to last line
else
{ push @out, $_; } # no, add as new line
}
$, = "\n"; # set output field separator
$\ = "\n"; # set output record separator
print @out;
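The array-based approach can also be wrapped in a sub that takes the lines as a list, which makes it easier to test without a file. This is a sketch only: join_continuations is a hypothetical name, it has not been run against the test cases from the question, and it adds one assumption of its own, namely that an indented first line is treated as an ordinary line because there is nothing to append it to.

```perl
use strict;
use warnings;

# join_continuations is a hypothetical helper: it takes raw lines and
# returns the logical lines with indented continuations merged in.
sub join_continuations {
    my @lines = @_;
    my @out;
    for my $line (@lines) {
        chomp $line;
        $line =~ s/#.*//;              # drop comments
        $line =~ s/\s+$//;             # drop trailing whitespace
        next unless $line =~ /\S/;     # skip blank lines
        if ($line =~ s/^\s+/ / && @out) {
            $out[-1] .= $line;         # continuation: append to previous line
        }
        else {
            $line =~ s/^\s+//;         # indented first line: nothing to append to
            push @out, $line;
        }
    }
    return @out;
}

print "$_\n" for join_continuations("target: a\n", "    b\n", "# note\n", "other\n");
```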