What do <prog1.pl> and <prog2.pl> do in their while-loop? [closed] - perl

I ran two programmes <prog1.pl> and <prog2.pl>. Now I need to figure out what happens in their while-loop. Help would be greatly appreciated.
prog1.pl
my $k = "";
print "running ...\n";
open (IN,"auste-north-1522.txt");
open (OUT,">outfile3.txt");
while (<IN>) {
if ($_ =~ m/\ <[^i].*[^i]\ >/g) {
print OUT $_;
}
}
close (IN);
close (OUT);
print "Press the return/enter key to finish.";
$k = <STDIN>
prog2.pl
my $k = "";
print "running ...\n";
open (IN,"auste-north-1522.txt");
open (OUT,">outfile4.txt");
while (<IN>) {
$_ =~ s/(\ <i\ >)|(\ <\ /i\ >)//g;
print OUT $_ unless ($_ =~ m/\ <.*\ >/g);
}
close (IN);
close (OUT);
print "Press the return/enter key to finish.";
$k = <STDIN>
I was told to study the scripts, but I still struggle to understand them.

while (<IN>) is a common shorthand for
while (defined($_ = readline IN))
Inside the first loop
$_ =~ m/\ <[^i].*[^i]\ >/g
matches the topic variable $_ against a regular expression. The backslashes before the spaces aren't needed, and neither is the /g. You can write it as
/ <[^i].*[^i] >/
Which matches if there's a space, <, anything but i, then 0 or more of anything but a newline, anything but i again, space, and >.
For example, these strings match:
" <jj >"
" <jXj >"
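To try the pattern on a few strings, here is a minimal sketch. It assumes the simplified pattern above is what prog1.pl effectively uses; the helper name prog1_keeps is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if prog1.pl's loop would copy the line
# to the output file, 0 otherwise (assuming the simplified pattern).
sub prog1_keeps {
    my ($line) = @_;
    return $line =~ / <[^i].*[^i] >/ ? 1 : 0;
}

for my $line (' <jj >', ' <jXj >', ' <i >', ' <j >', 'no tags here') {
    printf "%-16s %s\n", "'$line'", prog1_keeps($line) ? 'kept' : 'skipped';
}
```

Note that the pattern needs at least two characters between " <" and " >", and that neither the first nor the last of them may be an i.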
The second loop is left as an exercise for the reader, see perlsyn, perlop, perlre.

Related

Read specific text patterns in perl [closed]

I want to read from a text file only specific text, for example:
FileExample:
1111111/first/second/third/fourth.c11111111...etc...
1111111/afirst/asecond/athird/afourth.c11111111...etc...etc
I would like to read the whole file except the part that runs from the third "1" counting back from the first "/" up to the ".c" after the fourth "/". To make myself clearer, I will bold the text I want my program to read and leave unbolded the part I don't want it to read.
1111111/first/second/third/fourth.c11111111...etc...etc
1111111/afirst/asecond/athird/afourth.c11111111...etc...etc
After I do all the operations I want on the bolded text, I want to write to another file the unbolded text unmodified and the bolded text with its modifications, all in the original file order.
open my $fh1, '<', 'hex.txt';
open my $fh2, '<', 'hex2.txt';
until ( eof $fh1 or eof $fh2 ) {
my @l1 = map hex,unpack '(a2)*', <$fh1>;
my @l2 = map hex,unpack '(a2)*', <$fh2>;
my $n = @l2 > @l1 ? @l2 : @l1;
my @sum = map {
$l1[$_] + $l2[$_];
} 0 .. $n-1;
@sum = map { sprintf '%X', $_ } @sum;
open my $out, '>', 'sum.txt';
print { $out } @sum, "\n";
}
I want to add the hex values from the file hex to the hex values from the file hex2. Both files have the same structure: both have text and hex values in the same locations and have exactly the same length. I just need to understand how to tell the program to read from location1 to location2.
Convert file to hex:
{
my $input = do {
open my $in, '<', $ARGV[0];
local $/;
<$in>
};
open my $out, '>', 'hex.txt';
print $out unpack 'H*', $input;
}
Your precise criteria aren't clear. Are those digits always ones? It's a mistake to show such a very simple example when you're hoping for help. But I suggest you use split.
Something like this perhaps?
use strict;
use warnings;
use feature 'say';
my $data = do {
local $/;
<DATA>;
};
$data =~ tr/\n//d;
say for split qr{\d\d\d(?:/\w+)+/\w+\.c}, $data;
__DATA__
1111111/first/second/third/fourth.c11111111...etc...
1111111/afirst/asecond/athird/afourth.c11111111...etc...etc
output
1111
11111111...etc...1111
11111111...etc...etc
I changed the input to be able to recognize what 1's it matches:
abcd111/first/second/third/fourth.cX1111111...etc...
abcd111/afirst/asecond/athird/afourth.cX1111111...etc...etc
This seems to produce the output you want
perl -pe 's=([^/]+).../.*\.c=$1='
[^/] is a character class, it matches anything that's not a slash;
+ means it must be present one or more times
putting it into parentheses makes it a "capture group", i.e. Perl will remember what matched that part.
.../ matches any three characters followed by a slash.
.* matches anything.
\.c matches a dot followed by a c.
the whole matching part (abcd in the sample input, up to the c before X) is substituted (hence the starting s) with $1, i.e. the contents of the first capture group, i.e. the abcd in the sample input.
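The same substitution can be wrapped in a small function to check it against the modified sample lines. This is only a sketch; the name drop_middle is invented here, and it works on a copy so the input is left alone.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-liner's substitution.
sub drop_middle {
    my ($line) = @_;
    # Keep what precedes the last three characters before the first slash,
    # then drop everything up to and including the last ".c".
    (my $copy = $line) =~ s=([^/]+).../.*\.c=$1=;
    return $copy;
}

print drop_middle('abcd111/first/second/third/fourth.cX1111111...etc...'), "\n";
print drop_middle('abcd111/afirst/asecond/athird/afourth.cX1111111...etc...etc'), "\n";
```

Because .* is greedy, the match runs to the last ".c" on the line, which is something to watch for if the trailing text can itself contain ".c".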

how to output the second line in a multi-line file [closed]

I have a big file with repeated lines as follows:
#UUSM
ABCDEADARFA
+------qqq
!2wqeqs6777
I would like to output all the 'second lines' in the file. I have the following code snippet for doing this, but it's not working as expected. Lines 1, 3 and 4 are in the output instead.
open(IN,"<", "file1.txt") || die "cannot open input file:$!";
while (<IN>) {
$line = $line . $_;
if ($line =~ /^\#/) {
<IN>;
#next;
my $line = $line;
}
}
print "$line";
Please help!
try this
open(IN,"<", "file1.txt") || die "cannot open input file:$!";
my $lines = "";
while (<IN>) {
$lines .= $_ if $. % 4 == 2;
}
print "$lines";
I assume what you are asking is how to print the line that comes after a line that begins with #:
perl -ne 'if (/^\#/) { print scalar <> }' file1.txt
This says, "If the line begins with #, then print the next line. Do this for all the files in the argument list." The scalar function is used here to impose a scalar context on the file handle, so that it does not print the whole file. By default print has a list context for its arguments.
If you actually want to print the second line in the file, well, that's even easier. Here's a few examples:
Using the line number $. variable, printing if it equals line number 2.
perl -ne '$. == 2 and print, close ARGV' yourfile.txt
Note that if you have multiple files, you must close the ARGV file handle to reset the counter $.. Note also that using the lower-precedence operator and forces both print and close to be bound to the conditional.
Using regular logic.
perl -ne 'print scalar <>; close ARGV;'
perl -pe '$_ = <>; close ARGV;'
Both of these use a short-circuit feature by closing the ARGV file handle when the second line is printed. If you want to print every other line of a file, both of these will do that if you remove the close statements.
perl -ne '$at = $. if /^\#/; print if $. - 1 == $at' file1.txt
Written out longhand, the above is equivalent to
open my $fh, "<", "file1.txt";
my $at_line = 0;
while (<$fh>) {
if (/^\#/) {
$at_line = $.;
}
else {
print if $. - 1 == $at_line;
}
}
If you want lines 2, 6, 10 printed, then:
while (<>)
{
print if $. % 4 == 2;
}
Where $. is the current line number — and I didn't spend the time opening and closing the file. That might be:
{
my $file = "file1.txt";
open my $in, "<", $file or die "cannot open input file $file: $!";
while (<$in>)
{
print if $. % 4 == 2;
}
}
This uses the modern preferred form of file handle (a lexical file handle), and the braces around the construct mean the file handle is closed automatically. The name of the file that couldn't be opened is included in the error message; the or operator is used so the precedence is correct (the parentheses and || in the original were fine too and could be used here, but conventionally are not).
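For experimenting without a file on disk, the same $. test can be run against an in-memory file handle. A minimal sketch; the function name every_fourth_from_second is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns lines 2, 6, 10, ... of the given text.
sub every_fourth_from_second {
    my ($text) = @_;
    # Opening a reference to a scalar gives an in-memory file handle.
    open my $in, '<', \$text or die "cannot open in-memory file: $!";
    my @picked;
    while (my $line = <$in>) {
        # $. is the line number of the handle that was read last.
        push @picked, $line if $. % 4 == 2;
    }
    close $in;
    return @picked;
}

print every_fourth_from_second("#UUSM\nABCDEADARFA\n+------qqq\n!2wqeqs6777\n");
```

Each freshly opened handle starts counting from line 1, so the function can be called repeatedly without resetting $. by hand.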
If you want the line after a line starting with # printed, you have to organize things differently.
my $print_next = 0;
while (<>)
{
if ($print_next)
{
print $_;
$print_next = 0;
}
elsif (m/^#/)
{
$print_next = 1;
}
}
Dissecting the code in the question
The original version of the code in the question was (line numbers added for convenience):
1 open(IN,"<", "file1.txt") || die "cannot open input file:$!";
2 while (<IN>) {
3 $line = $line . $_;
4 if ($line =~ /^\#/) {
5 <IN>;
6 #next;
7 my $line = $line;
8 }
9 }
10 print "$line";
Discussion of each line:
1. OK, though it doesn't use a lexical file handle or report which file could not be opened.
2. OK.
3. Premature and misguided. This adds the current line to the variable $line before any analysis is done. If it was desirable, it could be written $line .= $_;
4. Suggests that the correct description for the desired output is not 'the second lines' but 'the line after a line starting with #'. Note that since there is no multi-line modifier on the regex, this will always match only the first line segment in the variable $line. Because of the premature concatenation, it will match on each line (because the first line of data starts with #), executing the code in lines 5-8.
5. Reads another line into $_. It doesn't test for EOF, but that's harmless.
6. Comment line; no significance except to suggest some confusion.
7. my $line = $line; is a self-assignment to a new variable hiding the outer $line...mainly, this is weird and to a lesser extent it is a no-op. You are not using use strict; and use warnings; because you would have warnings if you did. Perl experts use use strict; and use warnings; to make sure they haven't made silly mistakes; novices should use them for the same reason.
8. Of itself, OK. However, the code in the condition has not really done very much. It skips the second line in the file; it will later skip the fourth, the sixth, the eighth, etc.
9. OK.
10. OK, but...if you're only interested in printing the lines after the line starting #, or only interested in printing the line numbers 4N+2 for integral N, then there is no need to build up the entire string in memory before printing each line. It will be simpler to print each line that needs printing as it is found.

script conversion from Perl to shell [closed]

The following is the code in Perl.
Can we write the same thing as a shell script?
If yes, how?
I have used associative arrays but am unable to achieve what this code does.
open MYFILE, "<", "$ARGV[0]" or die "Can't open $ARGV[0] file \n";
############ to retrieve the info and put it in an associative array ##############
$line = <MYFILE>;
@line1 = split(/,/ , $line);
$length = @line1;
$count = 0;
while($count < $length)
{
$line1[$count] =~ s/^\"//;
$line1[$count] =~ s/\"$//;
$count++;
}
$line = <MYFILE>;
@line2 = split(/,/ , $line);
$length = @line2;
$count = 0;
while($count < $length)
{
$line2[$count] =~ s/^\"//;
$line2[$count] =~ s/\"$//;
$count++;
}
$count = 0;
while($count < $length)
{
$array{$line1[$count]}=$line2[$count];
$count++;
}
Of course you can translate that to a shell script: Just wrap the perl script in a here-doc, pass it to perl, and put #!/bin/sh at the top…
#!/bin/sh
perl - <<'END' $1
...
END
But more seriously, you might achieve enlightenment by rewriting the code in a different fashion. What you are doing is reading a line, splitting it at commata, and removing quotation marks at the beginning and end of each field:
sub get_fields {
map { s/^"//; s/"$//; $_ } split /,/, $_[0];
}
my @keys = get_fields scalar <>; # 1st line
my @vals = get_fields scalar <>; # 2nd line
my %hash;
@hash{ @keys } = @vals;
Except for the slice operation at the end, you can now more easily rewrite the code because it uses data flow instead of structured programming as the predominant paradigm. Not to mention that my code is shorter by an order of magnitude (in base 3).
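To see the hash slice at work without an input file, here is a small sketch that takes the two lines as strings. The function name fields_to_hash and the sample data are made up, and the naive comma split carries the same limitation as the code above.

```perl
use strict;
use warnings;

# Split on commas and strip one leading and one trailing double quote
# from each field (naive: breaks on commas inside quotes).
sub get_fields {
    my ($line) = @_;
    chomp $line;
    return map { my $f = $_; $f =~ s/^"//; $f =~ s/"$//; $f } split /,/, $line;
}

# Hypothetical helper: first line supplies the keys, second the values.
sub fields_to_hash {
    my ($key_line, $val_line) = @_;
    my @keys = get_fields($key_line);
    my @vals = get_fields($val_line);
    my %hash;
    @hash{@keys} = @vals;   # hash slice: assigns all pairs at once
    return %hash;
}

my %h = fields_to_hash(qq{"name","age"\n}, qq{"Ann","42"\n});
print "$_ => $h{$_}\n" for sort keys %h;
```

The slice replaces the third counting loop in the question: the keys and values are paired up by position in a single assignment.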
If you are writing code for production purposes, don't do this. It will break. I assume you are processing CSV. Stick with Perl, and use Text::CSV. Then:
use strict; use warnings; use autodie;
use Text::CSV;
my $csv = Text::CSV->new({ binary => 1 });
open my $fh, "<:utf8", $ARGV[0];
my $keys = $csv->getline($fh);
my $vals = $csv->getline($fh);
my %hash;
@hash{@$keys} = @$vals;
It isn't even much longer, but very unlikely to break (It doesn't split on commas inside quotes).

Reading file line by line iteration issue

I have the following simple piece of code (identified as the problem piece of code and extracted from a much larger program).
Is it me, or can you see an obvious error in this code that is stopping it from matching against $variable and printing FOUND when it definitely should?
Nothing is printed when I try to print $variable, and there are definitely matching lines in the file I am using.
The code:
if (defined $var) {
open (MESSAGES, "<$messages") or die $!;
my $theText = $mech->content( format => 'text' );
print "$theText\n";
foreach my $variable (<MESSAGES>) {
chomp ($variable);
print "$variable\n";
if ($theText =~ m/$variable/) {
print "FOUND\n";
}
}
}
I have located this as the point at which the error is occurring but cannot understand why?
There may be something I am totally overlooking, as it's very late.
Update I have since realised that I misread your question and this probably doesn't solve the problem. However the points are valid so I am leaving them here.
You probably have regular expression metacharacters in $variable. The line
if ($theText =~ m/$variable/) { ... }
should be
if ($theText =~ m/\Q$variable/) { ... }
to escape any that there are.
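A short sketch of the difference \Q makes; the function name contains_literal and the sample strings are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $needle occurs in $text as a literal string.
sub contains_literal {
    my ($text, $needle) = @_;
    # \Q ... \E quotes regex metacharacters in the interpolated variable.
    return $text =~ /\Q$needle\E/ ? 1 : 0;
}

my $text   = 'total: a+b (approx)';
my $needle = 'a+b';

# Without quoting, "+" is a quantifier, so the pattern means "one or more a, then b".
print "unquoted: ", ($text =~ /$needle/ ? "match" : "no match"), "\n";
print "quoted:   ", (contains_literal($text, $needle) ? "match" : "no match"), "\n";
```

If only a literal substring test is needed, index($text, $needle) >= 0 does the same job without involving the regex engine.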
But are you sure you don't just want eq?
In addition, you should read from the file using
while (my $variable = <MESSAGES>) { ... }
as a for loop will unnecessarily read the entire file into memory. And please use a better name than $variable.
This works for me.. Am I missing the question at hand? You're just trying to match "$theText" to anything on each line in the file right?
#!/usr/bin/perl
use warnings;
use strict;
my $fh;
my $filename = $ARGV[0] or die "$0 filename\n";
open $fh, "<", $filename;
my $match_text = "whatever";
my $matched = '';
# I would use a while loop, out of habit here
#while(my $line = <$fh>) {
foreach my $line (<$fh>) {
$matched =
$line =~ m/$match_text/ ? "Matched" : "Not matched";
print $matched . ": " . $line;
}
close $fh
./test.pl testfile
Not matched: this is some textfile
Matched: with a bunch of lines or whatever and
Not matched: whatnot....
Edit: Ah, I see.. Why don't you try printing before and after the "chomp()" and see what you get? That shouldn't be the issue, but it doesn't hurt to test each case..

Perl: Looping over input lines with an index-based approach

This is a beginner-best-practice question in perl. I'm new to this language. The question is:
If I want to process the output lines from a program, how can I format THE FIRST LINE in a special way?
I think of two possibilities:
1) A flag variable that is set once the loop has run for the first time. But it will be evaluated on every cycle. BAD solution
2) An index-based loop (like a "for"). Then I would start the loop in i=1. This solution is far better. The problem is HOW CAN I DO IT?
I just found the code for looping over with the while ( <> ) construct.
Here you can see better:
$command_string = "par-format 70j p0 s0 < " . $ARGV[0] . "|\n";
open DATA, $command_string or die "Couldn't execute program: $!";
print "\t <div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|-- <strong>Description</strong></div>\n";
while ( defined( my $line = <DATA> ) ) {
chomp($line);
# print "$line\n";
print "\t <div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;-- " . $line . "</div>\n";
}
close DATA;
Please also don't hesitate in correcting any code in here, this is my first perl poem.
Thanks!
You can always use $. or the English name $INPUT_LINE_NUMBER to control the logic in your loop with:
while (my $line = <>) {
if ($. == 1) {
# do cool stuff here
}
# do normal stuff here
}
To handle the first line differently, you could just put
$line = <DATA>;
above your loop.
With proper checking for read problems (empty file, etc.) this should be
if ($line = <DATA>) {
...do special things...
}
while (my $line = <DATA>) {
...do regular things...
}
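The defined() call is optional in the while condition: when the condition is just an assignment from <DATA>, Perl tests it with defined() implicitly. An if gets no such treatment, so the if above would skip a first line consisting of a lone 0 with no trailing newline; write if (defined($line = <DATA>)) if that matters.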
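Put together, the read-one-line-first approach might look like the sketch below. The function name mark_first_line and the "FIRST:" and "rest:" prefixes are placeholders, and an in-memory file handle stands in for the command's output.

```perl
use strict;
use warnings;

# Hypothetical helper: formats the first line one way and the rest another.
sub mark_first_line {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @out;

    # Read the first line outside the loop; defined() guards against empty input.
    if (defined(my $first = <$fh>)) {
        chomp $first;
        push @out, "FIRST: $first";
    }

    # The loop only ever sees lines 2 and onward, so it needs no flag.
    while (my $line = <$fh>) {
        chomp $line;
        push @out, "rest: $line";
    }
    close $fh;
    return @out;
}

print "$_\n" for mark_first_line("Description\nsecond line\nthird line\n");
```

This avoids testing a flag or $. on every iteration, at the cost of having the first-line handling sit outside the loop.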
From a 'best practices' perspective there is much wrong with that code sample:
open DATA, $command_string or die "Couldn't execute program: $!";
Security hole, please exploit me.
DATA is a magical value that points to a __DATA__ section at the end of the current file.
You should use
open my $fh
Which uses a lexical variable for a file handle instead of a global.
You should use 3 arg open, ie:
open my $fh, '<' , $filename
open my $fh, '-|' , $command
open my $fh, '-|' , $command, @args
Sadly, I have yet to work out how 3-arg open works with dual pipes.
There's this IPC::Open2 thing, but I haven't worked out how
to use that effectively yet. Suggestions welcome.
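As an illustration of the list form of the piped open, here is a minimal sketch. The function name read_command_lines is made up, and the current Perl interpreter ($^X) stands in for the real command so the example does not depend on any external program.

```perl
use strict;
use warnings;

# Hypothetical helper: runs a program with its arguments passed as a list,
# so no shell is involved, and returns its output lines without newlines.
sub read_command_lines {
    my ($program, @args) = @_;
    open my $fh, '-|', $program, @args
        or die "cannot run $program: $!";
    chomp(my @lines = <$fh>);
    # For a piped handle, close also reports a non-zero exit status.
    close $fh or die "$program failed: " . ($! || "exit status $?");
    return @lines;
}

my @lines = read_command_lines($^X, '-e', 'print "one\ntwo\n"');
print "got: $_\n" for @lines;
```

Because the arguments never pass through a shell, a file name taken from $ARGV[0] cannot smuggle in extra commands, which is the security hole in the two-argument form. Input redirection such as the "<" in the question's command string would have to be handled separately, for example by opening the file and feeding it to the program.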