How to combine two lines together using Perl? - perl

How to combine two lines together using Perl? I'm trying to combine these two lines using a Perl regular expression:
__Data__
test1 - results
dkdkdkdkdkd
I would like the output to be like this:
__Data__
test1 - results dkdkdkdkdkd
I thought this would accomplish this but not working:
$_ =~ s/__Data__\n(test1.*)\n(.*)\n/__Data__\n$1 $2/smg;

If you have a multiline string:
s/__Data__\ntest1.*\K\n//g;
The /s modifier only makes the wildcard . match \n, so it will cause .* to slurp your newline and cause the match of \n to be displaced to the last place it occurs. Which, depending on your data, might be far off.
The /m modifier makes ^ and $ match inside the string at newlines, so not so useful. The \K escape preserves whatever comes before it, so you do not need to put it back afterwards.
If you have a single line string, for instance in a while loop:
while (<>) {
if (/^__Data__/) {
$_ .= <>; # add next line
chomp; # remove newline
$_ .= <>; # add third line
}
print;
}

There seems to be a problem with the setup of $_. When I run this script, I get the output I expect (and the output I think you'd expect). The main difference is that I've added a newline at the end of the replacement pattern in the substitute. The rest is cosmetic or test infrastructure.
Script
#!/usr/bin/env perl
use strict;
use warnings;
my $text = "__Data__\ntest1 - results\ndkdkdkdkdkd\n";
my $copy = $text;
$text =~ s/__Data__\n(test1.*)\n(.*)\n/__Data__\n$1 $2\n/smg;
print "<<$copy>>\n";
print "<<$text>>\n";
Output
<<__Data__
test1 - results
dkdkdkdkdkd
>>
<<__Data__
test1 - results dkdkdkdkdkd
>>
Note the use of << and >> to mark the ends of strings; it often helps when debugging. Use any symbols you like; just enclose your displayed text in such markers to help yourself debug what's going on.
(Tested with Perl 5.12.1 on RHEL 5 for x86/64, but I don't think the code is version or platform dependent.)

Related

Awk's output in Perl doesn't seem to be working properly

I'm writing a simple Perl script which is meant to output the second column of an external text file (columns one and two are separated by a comma).
I'm using AWK because I'm familiar with it.
This is my script:
use v5.10;
use File::Copy;
use POSIX;
$s = `awk -F ',' '\$1==500 {print \$2}' STD`;
say $s;
The contents of the local file "STD" is:
CIR,BS
60,90
70,100
80,120
90,130
100,175
150,120
200,260
300,500
400,600
500,850
600,900
My output is very strange and it prints out the desired "850" but it also prints a trailer of the line and a new line too!
ka#man01:$ ./test.pl
850
ka#man01:$
The problem isn't just printing. I need to use the variable generated by awk "i.e. the $s variable) but the variable is also being reserved with a long string and a new line!
Could you guys help?
Thank you.
I'd suggest that you're going down a dirty road by trying to inline awk into perl in the first place. Why not instead:
open ( my $input, '<', 'STD' ) or die $!;
while ( <$input> ) {
s/\s+\z//;
my #fields = split /,/;
print $fields[1], "\n" if $fields[0] == 500;
}
But the likely problem is that you're not handling linefeeds, and say is adding an extra one. Try using print instead, or chomp on the resultant string.
perl can do many of the things that awk can do. Here's something similar that replaces your entire Perl program:
$ perl -naF, -le 'chomp; print $F[1] if $F[0]==500' STD
850
The -n creates a while loop around your argument to -e.
The -a splits up each line into #F and -F lets you specify the separator. Since you want to separate the fields on a comma you use -F,.
The -l adds a newline each time you call print.
The -e argument is the program to run (with the added while from -n). The chomp removes the newline from the output. You get a newline in your output because you happen to use the last field in the line. The -l adds a newline when you print; that's important when you want to extract a field in the middle of the line.
The reason you get 2 newlines:
the backtick operator does not remove the trailing newline from the awk output. $s contains "850\n"
the say function appends a newline to the string. You have say "850\n" which is the same as print "850\n\n"

Print Line By Line

I've been trying to work on a lyrical bot for my server, but before I started to work on it, I wanted to give it a test so I came up with this script using the Lyrics::Fetcher module.
use strict;
use warnings;
use Lyrics::Fetcher;
my ($artist, $song) = ('Coldplay', 'Adventures Of A Lifetime');
my $lyrics = Lyrics::Fetcher->fetch($artist, $song, [qw(LyricWiki AstraWeb)]);
my #lines = split("\n\r", $lyrics);
foreach my $line (#lines) {
sleep(10);
print $line;
}
This script works fine, it grabs the lyrics and prints it out in a whole(which is not what I'm looking for).
I was hoping to achieve a line by line print of the lyrics every 10 seconds. Help please?
Your call to split looks suspicious. In particular the regex "\n\r". Note, the first argument to split is always interpreted as a regex regardless of whether you supply a quoted string.
On Unix systems the line ending is typically "\n". On DOS/Windows it's "\r\n" (the reverse of what you have). On ancient Macs it was "\r". To match all thre you could do:
my #lines = split(/\r\n|\n|\r/, $lyrics);
You will need to enable autoflush, otherwise the lines will just be buffered and printed when the buffer is full or when the program terminates
STDOUT->autoflush;
You can use the regex generic newline pattern \R to split on any line ending, whether your data contains CR, LF, or CR LF. This feature is available only in Perl v5.10 or better
my #lines = split /\R/, $lyrics;
And you will need to print a newline after each line of lyrics, because the split will have removed them
print $line, "\n";

Perl: How to remove spaces and blank lines in one pass

I have got 2 perl scripts, first one removes blank lins from a file and the second one removes all spaces inside a file. I wonder, if it's possible to connect both of these regular expressions inside 1 script?
For spaces, i have used this regsub: $str =~ tr/ //d;
and for Blank lines, I have used this regexp
while (<$file>) {
if (/\S/){
print $new_file $_; }}
It should be really easy: just add tr/ //d before the if line.
Note: It will remove lines containing spaces only, too. If you want to keep them (but transliterated to empty lines), insert the transliteration before the print line.
If you wish to trim the end of the line that contains space,
you might want it to work like this:
perl -pi -e 's/\s*$/\n/' f1 f2 f3 #UNIX file format
perl -pi -e 's/\s*$/\r\n/' f1 f2 f3 #DOS file format

I want to create a perl code to extract what is in the parentheses and port it to a variable

I want to create a perl code to extract what is in the parentheses and port it to a variable.
"(05-NW)HPLaserjet" should become "05-NW"
Something like this:
Catch "("
take out any spaces that exsist in between ()
everything in between () = variable 1
How would I go about doing this?
This is a job for regular expressions. Looks confusing because parens are used as meta characters in regular expression and are also part of the pattern in your example, escaped by backslashes.
C:\temp $ echo (05-NW)HPLaserjet | perl -nlwe "print for m/\(([^)]+)\)/g"
Match opening paren, start capture group, match one or more characters that aren't the closing paren, close capture group, match closing paren.
You can use regular expressions (see perlretut) to match and capture the value. By assigning to a list, you can put your captures into named variables. The global variables $1, $2 etc. are also used for capture groups, so you can use that instead of list assignment if you like.
use strict;
use warnings;
while (<>) # read every line
{
my ($printer_code) = m/
\( # Match literal opening parenthesis
([^\)]*) # Capture group (printer_code): Match characters which aren't right parenthesis, zero or more times
\)/x; # Match literal closing parenthesis
# The 'x' modifier allows you to add whitespace and comments to regex for clarity.
# If you use it, make sure you use '\ ' (or '\s', etc.) for actual literal whitespace matching!
}
__DATA__
(05-NW)HPLaserjet
perldoc perlre
use warnings;
use strict;
my $s = '(05-NW)HPLaserjet';
my ($v) = $s =~ /\((.*)\)/; # Grab everything between parens (including other parens)
$v =~ s/\s//g; # Remove all whitespace
print "$v\n";
__END__
05-NW
See also: Perl Idioms Explained - #ary = $str =~ m/(stuff)/g

Need to print the last occurrence of a string in Perl

I have a script in Perl that searches for an error that is in a config file, but it prints out any occurrence of the error. I need to match what is in the config file and print out only the last time the error occurred. Any ideas?
Wow...I was not expecting this much of a response. I should've been more clear in stating this is for log monitoring on a windows box that sends an alert to Nagios. This is actually my first Perl program and all this information has been very helpful. Does anyone know how I can apply this any of the tail answers on a wintel box?
Another way to do it:
perl -n -e '$e = $1 if /(REGEX_HERE)/; END{ print $e }' CONFIG_FILE_HERE
What exactly do you need to print? The line containing the error? More context than that?
File::ReadBackwards can be helpful.
In outline:
my $errinfo;
while (<>)
{
$errinfo = "whatever" if (m/the error pattern/);
}
print "error: $errinfo\n" if ($errinfo);
This catches all errors, but doesn't print until the end, when only the last one survives.
A brute-force approach involves setting up your own pipeline by pointing STDOUT to tail. This allows you to print all errors, and then it's up to tail to worry about only letting the last one out.
You didn't specify, so I assume a legal config line is of the form
Name = some value
Matching that is straightforward:
^ (starting at the beginning of line)
\w+ (one or more “word characters”)
\s+ (followed by mandatory whitespace)
= (followed by an equals sign)
\s+ (more mandatory whitespace)
.+ (some mandatory value)
$ (finishing at the end of the line)
Gluing it together, we get
#! /usr/bin/perl
use warnings;
use strict;
# for demo only
*ARGV = *DATA;
my $pid = open STDOUT, "|-", "tail", "-1" or die "$0: open: $!";
while (<>) {
print unless /^ \w+ \s+ = \s+ .+ $/x;
}
close STDOUT or warn "$0: close: $!";
__DATA__
This = assignment is ok
But := not this
And == definitely not this
Output:
$ ./lasterr
And == definitely not this
With regular expressions, when you want the last occurrence of a pattern, place ^.* at the front of your pattern. For example, to replace the last X in the input with Y, use
$ echo XABCXXXQQQXX | perl -pe 's/^(.*)X/$1Y/'
XABCXXXQQQXY
Note that the ^ is redundant because regular-expression quantifiers are greedy, but I like having it there for emphasis.
Applying this technique to your problem, you can search for the last line in your config file that contains an error as in the following program:
#! /usr/bin/perl
use warnings;
use strict;
local $_ = do { local $/; scalar <DATA> };
if (/\A.* ^(?! \w+ \s+ = \s+ [^\r\n]+ $) (.+?)$/smx) {
print $1, "\n";
}
__DATA__
This = assignment is ok
But := not this
And == definitely not this
The syntax of the regular expression is a bit different because $_ contains multiple lines, but the principle is the same. \A is similar to ^, but it matches only at the beginning of string to be searched. With the /m switch (“multi-line”), ^ matches at logical line boundaries.
Up to this point, we know the pattern
/\A.* ^ .../
matches the last line that looks like something. The negative look-ahead assertion (?!...) looks for a line that is not a legal config line. Ordinarily . matches any character except newline, but the /s switch (“single line”) lifts this restriction. Specifying [^\r\n]+, that is, one or more characters that are neither carriage return nor line feed, does not allow the match to spill into the next line.
Look-around assertions do not capture, so we grab the offending line with (.+?)$. The reason it's safe to use . in this context is because we know the current line is bad and the non-greedy quantifier +? stops matching as soon as it can, which in this case is the end of the current logical line.
All these regular expressions use the /x switch (“extended mode”) to allow extra whitespace: the aim is to improve readability.