How to print a variable to a file in Perl? - perl

I am using the following code to try to print a variable to file.
my $filename = "test/test.csv";
open FILE, "<$filename";
my $xml = get "http://someurl.com";
print $xml;
print FILE $xml;
close FILE;
So print $xml prints the correct output to the screen. But print FILE $xml doesn't do anything.
Why does the printing to file line not work? Perl seems to often have these things that just don't work...
For the print to file line to work, is it necessary that the file already exists?

The < opens a file for reading. Use > to open a file for writing (or >> to append).
It is also worthwhile adding some error handling:
use strict;
use warnings;
use LWP::Simple;
my $filename = "test/test.csv";
open my $fh, ">", $filename or die("Could not open file. $!");
my $xml = get "http://example.com";
print $xml;
print $fh $xml;
close $fh;
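On the follow-up question of whether the file must already exist: with > or >> Perl creates the file if it is missing, although the directory (test/ here) has to exist. The sketch below illustrates the difference between the two write modes. It uses a temporary directory from File::Temp instead of test/, and the helper names write_text and read_all are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch location; the directory is removed at program exit.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/test.csv";

# Write $text to $path using the given open mode ('>' or '>>').
sub write_text {
    my ($mode, $path, $text) = @_;
    open my $fh, $mode, $path or die "Could not open $path: $!";
    print {$fh} $text;
    close $fh or die "Could not close $path: $!";
    return;
}

# Read the whole file back as a single string.
sub read_all {
    my ($path) = @_;
    open my $fh, '<', $path or die "Could not open $path: $!";
    local $/;                 # slurp mode: no line separator
    my $content = <$fh>;
    close $fh;
    return $content;
}

write_text('>',  $filename, "first\n");    # creates the file
write_text('>',  $filename, "second\n");   # truncates it, then writes
write_text('>>', $filename, "third\n");    # appends to what is there

print read_all($filename);                 # prints "second" and "third"
```

Only "second" and "third" survive because the second > call truncated the file before writing.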

Related

Perl - Encoding error when working with .html file

I have some .html files in a directory to which I want to add one line of css code. Using perl, I can locate the position with a regex and add the css code, this works very well.
However, my first .html file contains an accented letter, é, and the resulting .html file has an encoding problem: it shows \xE9 instead.
In the Perl script I have been careful to specify UTF-8 encoding when opening the files, as shown in the MWE below, but that does not solve the problem. How can I fix this encoding error?
MWE
use strict;
use warnings;
use File::Spec::Functions qw/ splitdir rel2abs /; # To get the current directory name
# Define variables
my ($inputfile, $outputfile, $dir);
# Initialize variables
$dir = '.';
# Open current directory
opendir(DIR, $dir);
# Scan all files in directory
while (my $inputfile = readdir(DIR)) {
#Name output file based on input file
$outputfile = $inputfile;
$outputfile =~ s/_not_centered//;
# Open output file
open(my $ofh, '>:encoding(UTF-8)', $outputfile);
# Process only files ending in _not_centered.html
next unless (-f "$dir/$inputfile");
next unless ($inputfile =~ m/\_not_centered.html$/);
# Open input file
open(my $ifh, '<:encoding(UTF-8)', $inputfile);
# Read input file
while(<$ifh>) {
# Catch and store the number of the chapter
if(/(<h2)(.*?)/) {
# $_ =~ s/<h2/<h2 style="text-align: center;"/;
print $ofh "$1 style=\"text-align: center;\"$2";
}else{
print $ofh "$_";
}
}
# Close input and output files
close $ifh;
close $ofh;
}
# Close output file and directory
closedir(DIR);
Problematic file named "Chapter_001_not_centered.html"
<html >
<head></head>
<body>
<h2 class="chapterHead"><span class="titlemark">Chapter 1</span><br /><a id="x1-10001"></a>Brocéliande</h2>
Brocéliande
</body></html>
The following demo script performs the required injection and uses the glob function to find the files.
Note: the script writes a new file. Uncomment the rename lines to replace the original file with the new one.
use strict;
use warnings;
use open ":encoding(Latin1)";
my $dir = '.';
process($_) for glob("$dir/*_not_centered.html");
sub process {
my $fname_in = shift;
my $fname_new = $fname_in . '.new';
open my $in, '<', $fname_in
or die "Couldn't open $fname_in: $!";
open my $out, '>', $fname_new
or die "Couldn't open $fname_new: $!";
while( <$in> ) {
s/<h2/<h2 style="text-align: center;"/;
print $out $_;
}
close $in;
close $out;
# rename $fname_new, $fname_in
# or die "Couldn't rename $fname_new to $fname_in";
}
If you do not mind running the script once per file, as script.pl in_file > out_file, the following is enough:
use strict;
use warnings;
print s/<h2/<h2 style="text-align: center;"/ ? $_ : $_ for <>;
If the task only comes up occasionally, it can be solved with a one-liner:
perl -pe "s/<h2/<h2 style='text-align: center;'/" in_file
This question was answered in the comments by @Shawn and @sticky bit:
Changing the encoding used to open the files to ISO 8859-1 solves the problem. If one of you wants to post the answer, I will accept it.
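A likely explanation for the \xE9 is that the input file was stored as ISO 8859-1, where é is the single byte 0xE9. That byte on its own is not valid UTF-8, so a UTF-8 input layer cannot decode it. The sketch below, using the core Encode module, shows the same character in both encodings; the variable names are made up for this example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# One byte, 0xE9: the ISO 8859-1 (Latin-1) encoding of "é".
my $latin1_bytes = "\xE9";

# Decoding with the encoding the file really uses gives the character U+00E9.
my $char = decode('ISO-8859-1', $latin1_bytes);

# Encoding that character as UTF-8 gives the two bytes 0xC3 0xA9.
my $utf8_bytes = encode('UTF-8', $char);

printf "code point:  U+%04X\n", ord($char);
printf "UTF-8 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord($_) } split //, $utf8_bytes;
```

If the output should end up as UTF-8, one option is to read with '<:encoding(ISO-8859-1)' and write with '>:encoding(UTF-8)', and to make sure the HTML declares the matching charset.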

How to open a file that has a special character in it such as $?

This seems fairly simple, but the "$" in the file name appears to split the name. I tried escaping the character, but when I try to open the file I get GLOB().
my $path = 'C:\dir\name$.txt';
open my $file, '<', $path || die
print "file = $file\n";
It should open the file so I can traverse the entries.
It has nothing to do with the "$". Just follow standard file handling procedure.
use strict;
use warnings;
my $path = 'C:\dir\name$.txt';
open my $file_handle, '<', $path or die "Can't open $path: $!";
# read and print the file line by line
while (my $line = <$file_handle>) {
# the <> in scalar context gets one line from the file
print $line;
}
# reset the handle
seek $file_handle, 0, 0;
# read the whole file at once, print it
{
# enclose in a block to localize the $/
# $/ is the line separator, so when it's set to undef,
# it reads the whole file
local $/ = undef;
my $file_content = <$file_handle>;
print $file_content;
}
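Two details from the question can be checked with a short sketch. Inside single quotes the "$" is an ordinary character, so the path needs no escaping, and the GLOB(...) text is simply what a filehandle looks like when it is printed instead of read from. The temporary file and its "entry one" content below are stand-ins for the real name$.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Single quotes: no interpolation, the "$" is an ordinary character.
my $single = 'name$.txt';

# Double quotes: "$." would interpolate the input line number variable,
# so the dollar sign has to be escaped there.
my $escaped = "name\$.txt";

print $single eq $escaped ? "same name\n" : "different names\n";

# Stand-in data file with one made-up entry.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "entry one\n";
close $tmp_fh or die "Can't close $tmp_name: $!";

open my $fh, '<', $tmp_name or die "Can't open $tmp_name: $!";
my $handle_text = "$fh";     # the handle itself, stringified
my $first_line  = <$fh>;     # an actual line from the file
close $fh;

print "handle prints as: $handle_text\n";   # GLOB(0x...)
print "first line read:  $first_line";
```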
Consider using the CPAN modules File::Slurper or Path::Tiny which will handle the exact details of using open and readline, checking for errors, and encoding if appropriate (most text files are encoded to UTF-8).
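use strict;
use warnings;
my $path = 'C:\dir\name$.txt';
# Option 1: File::Slurper (decodes as UTF-8 by default)
use File::Slurper 'read_text';
my $file_content = read_text($path);
# Option 2: Path::Tiny, an alternative to option 1
use Path::Tiny 'path';
my $same_content = path($path)->slurp_utf8;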
If it's a data file, use read_binary or slurp_raw.

Perl Script: sorting through log files.

I am trying to write a script that opens a directory, reads a number of log files line by line, and searches for information such as "Attendance = 0".
Previously I used grep "Attendance =" * to find this information, but now I want to do the search in a script.
I need your help to finish this task.
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/path/';
opendir (DIR, $dir) or die $!;
while (my $file = readdir(DIR))
{
print "$file\n";
}
closedir(DIR);
exit 0;
What's your perl experience?
I'm assuming each file is a text file. I'll give you a hint. Try to figure out where to put this code.
# Now to open and read a text file.
my $fn='file.log';
# $! is a variable which holds a possible error msg.
open(my $INFILE, '<', $fn) or die "ERROR: could not open $fn. $!";
my @filearr = <$INFILE>; # Read the whole file into an array.
close($INFILE);
# Now look in @filearr, which has one entry per line of the original file.
exit; # Normal exit
I prefer to use File::Find::Rule for things like this. It preserves path information, and it's easy to use. Here's an example that does what you want.
use strict;
use warnings;
use File::Find::Rule;
my $dir = '/path/';
my $type = '*';
my @files = File::Find::Rule->file()
->name($type)
->in($dir);
for my $file (@files){
print "$file\n\n";
open my $fh, '<', $file or die "can't open $file: $!";
while (my $line = <$fh>){
if ($line =~ /Attendance =/){
print $line;
}
}
}
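If installing File::Find::Rule is not an option, the readdir loop from the question and the open/read hint above can be combined using only core Perl. This is a minimal sketch under that assumption: the sub name grep_dir, the file names, and the log contents are made up, and it only looks at files directly inside the directory (no recursion).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return every line matching $pattern from the plain files directly inside $dir,
# each prefixed with its file name, similar to grep's output.
sub grep_dir {
    my ($dir, $pattern) = @_;
    my @matches;

    opendir my $dh, $dir or die "Can't open directory $dir: $!";
    my @names = readdir $dh;
    closedir $dh;

    for my $name (sort @names) {
        my $path = "$dir/$name";
        next unless -f $path;          # skips ".", ".." and subdirectories

        open my $fh, '<', $path or die "Can't open $path: $!";
        while (my $line = <$fh>) {
            push @matches, "$name:$line" if $line =~ $pattern;
        }
        close $fh;
    }
    return @matches;
}

# Demo against throwaway log files.
my $dir  = tempdir(CLEANUP => 1);
my %logs = (
    'a.log' => "start\nAttendance = 0\nend\n",
    'b.log' => "Attendance = 12\nnothing here\n",
);
for my $name (keys %logs) {
    open my $out, '>', "$dir/$name" or die "Can't write $dir/$name: $!";
    print {$out} $logs{$name};
    close $out or die "Can't close $dir/$name: $!";
}

my @found = grep_dir($dir, qr/Attendance =/);
print @found;
```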

Add a string at the beginning of the file with OPEN CLOSE function in Perl

My code does not work no matter how I modify it. Apart from appending, nothing I try with the open and close functions works.
#!/usr/local/bin/perl
my $file = 'test';
open(INFO, $file);
print INFO "Add this line please\n";
print INFO "First line\n";
close(INFO);
You need to tell perl what type of filehandle you want
open(INFO, ">", "$file")|| die "Cannot open $file";
This will create and write to a file.
Look up
http://perldoc.perl.org/functions/open.html
By default, open(INFO, $file) opens the file handle in read mode ('<'). Unless you specify write mode ('>'), you cannot print values into the file. You should also add use strict; and use warnings; to your code, as they help catch mistakes like this.
Code:
use strict;
use warnings;
my $InputFile = $ARGV[0];
open(FH,'<',"$InputFile")or die "Couldn't open the file $InputFile: $!";
my @file_content = <FH>;
close(FH);
open(FH,'>',"$InputFile") or die "Cannot open $InputFile: $!";
# String to be added at the beginning of the file
my $file = "test";
print FH $file . "\n";
print FH @file_content;
close(FH);
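The code above keeps the whole file in memory and rewrites it in place. A different technique, sketched here as an alternative rather than as the answer's method, copies to a second file and renames it over the original, so the original stays intact if something fails midway. The sub name prepend_text and the .tmp suffix are made up for this example, and renaming over an existing file may behave differently on some platforms.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Prepend $text to the file at $path by writing a new file next to it
# and renaming it over the original once the copy is complete.
sub prepend_text {
    my ($path, $text) = @_;
    my $tmp = "$path.tmp";    # hypothetical temporary name

    open my $in,  '<', $path or die "Can't read $path: $!";
    open my $out, '>', $tmp  or die "Can't write $tmp: $!";

    print {$out} $text;
    print {$out} $_ while <$in>;    # copy the old content line by line

    close $in;
    close $out or die "Can't close $tmp: $!";
    rename $tmp, $path or die "Can't rename $tmp to $path: $!";
    return;
}

# Demo on a throwaway file.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/test";

open my $init, '>', $file or die "Can't create $file: $!";
print {$init} "First line\n";
close $init or die "Can't close $file: $!";

prepend_text($file, "Add this line please\n");

open my $check, '<', $file or die "Can't read $file: $!";
my @lines = <$check>;
close $check;
print @lines;
```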

parsing pdf in perl

I am trying to extract some information from pdf. I am trying to use getpdftext.pl from the CAM::PDF module. When I just run $~ getpdftext.pl sample.pdf, it produces a text of the pdf to stdout.
But I am thinking of writing this to a textfile and parse for required fields in perl. Can someone please guide me on how to do this?
But when I try to call getpdftext.pl inside my Perl script, I get a "No such file" error.
#program to extract text from pdf and save it in a text file
use PDF;
use CAM::PDF;
use CAM::PDF::PageText;
use warnings;
use IPC::System::Simple qw(system capture);
$filein = 'sample.pdf';
$fileout = 'output1.txt';
open OUT, ">$fileout" or die "error: $!";
open IN, "getpdftext.pl $filein" or die "error :$!" ;
while(<IN>)
{
print OUT $fileout;
}
It would probably be easier to adapt getpdftext.pl to do what you want.
Working with the code from getpdftext.pl, this (untested code) should output the pdf to a text file.
my $filein = 'sample.pdf';
my $fileout = 'output1.txt';
my $doc = CAM::PDF->new($filein) || die "$CAM::PDF::errstr\n";
open my $fo, '>', $fileout or die "error: $!";
foreach my $p ( 1 .. $doc->numPages() ) {
my $str = $doc->getPageText($p);
if (defined $str) {
CAM::PDF->asciify(\$str);
print $fo $str;
}
}
close $fo;
See perldoc -f open. You want to take the output stream of an external command and use it as an input stream inside your Perl script. That's what the -| mode is for:
open my $IN, '-|', "getpdftext.pl $filein" or die $!;
while (<$IN>) {
...
}
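The "No such file" error in the question comes from open being called without a mode, so Perl looks for a file literally named "getpdftext.pl sample.pdf". The sketch below puts the pipe open together with writing the text file and parsing fields. It uses the list form of the pipe open, which bypasses the shell (on Windows this needs Perl 5.22 or later). Since getpdftext.pl and a sample PDF may not be at hand, the running perl stands in as the command, and the "Key: value" lines it prints are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $fileout = "$dir/output1.txt";

# Stand-in command: the running perl ($^X) printing two made-up lines.
# In the real script this list would be ('getpdftext.pl', $filein).
my @command = ($^X, '-e', 'print "Name: Alice\nTotal: 42\n"');

open my $in,  '-|', @command or die "Can't run @command: $!";
open my $out, '>', $fileout  or die "Can't write $fileout: $!";

my %field;
while (my $line = <$in>) {
    print {$out} $line;                               # keep a copy on disk
    $field{$1} = $2 if $line =~ /^(\w+):\s*(.*)$/;    # parse "Key: value" lines
}

close $out or die "Can't close $fileout: $!";
close $in  or die "Command failed, exit status $?";

print "$_ = $field{$_}\n" for sort keys %field;
```

Checking the result of close on the pipe matters here because that is where a failing external command is reported.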