Perl program keeps waiting when opening a file using utf8 - perl

I am trying to read a UTF-8 encoded XML file. The file is around 8 MB and consists of a single line.
I used the code below to open this single-line XML file:
open(INP,"<:utf8","$infile") or die "Couldn't open file passed as input, $!";
local $/ = undef;
my $inputfile = <INP>;
print $inputfile; ## Not working..
But after this, the program gets stuck and keeps waiting.
I have tried other methods such as binmode and decode, but I get the same issue.
The same program works when I change the file-opening code above to:
open(INP,"$infile") or die "Couldn't open file passed as input, $!";
local $/ = undef;
my $inputfile = <INP>;
print $inputfile; ## It works..
But it stops working again as soon as I add binmode:
open(INP,"$infile") or die "Couldn't open file passed as input, $!";
binmode(INP, ":utf8");
local $/ = undef;
my $inputfile = <INP>;
print $inputfile; ## Not working..
Can you please help me understand what I am doing wrong here? I need to perform some operations on the input data and produce UTF-8 encoded output.

I tried your last snippet here (Ubuntu 12.04, perl 5.14.2) and it works as expected. The only problem I have is a difference between the input and the output: the input file is UTF-8 and the output is ISO-8859-1.
When I add
use utf8;
use open qw(:std :utf8);
this problem is gone, though. So this must be an environment issue.
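A minimal sketch of reading and printing with explicit UTF-8 layers on both ends, as an alternative to the use open pragma above. The helper name slurp_utf8 is hypothetical, and the sketch assumes the file path is passed as the first command-line argument.

```perl
use strict;
use warnings;

# Read a whole file as UTF-8 text and return it as a Perl character string.
# slurp_utf8 is a hypothetical helper name, not part of any library.
sub slurp_utf8 {
    my ($path) = @_;
    open(my $in, '<:encoding(UTF-8)', $path)
        or die "Couldn't open '$path': $!";
    local $/;                 # slurp mode, limited to this sub
    my $text = <$in>;
    close($in);
    return $text;
}

# Encode output as UTF-8 so the bytes written match the input encoding.
binmode(STDOUT, ':encoding(UTF-8)');

# Assumed usage: the input file is given on the command line.
print slurp_utf8($ARGV[0]) if @ARGV;
```

Setting the layer on STDOUT explicitly addresses the ISO-8859-1 output described above without relying on the environment.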

Related

How to modify content of a file using single file handle

I'm trying to modify content of a file using Perl.
The following script works fine.
#!/usr/bin/perl
use strict;
use warnings;
open(FH,"test.txt") || die "not able to open test.txt $!";
open(FH2,">","test_new.txt")|| die "not able to open test_new.txt $!";
while(my $line = <FH>)
{
$line =~ s/perl/python/i;
print FH2 $line;
}
close(FH);
close(FH2);
The content of test.txt:
im learning perl
im in File handlers chapter
The output in test_new.txt:
im learning python
im in File handlers chapter
If I try to use same file handle for modifying the content of file, then I'm not getting expected output. The following is the script that attempts to do this:
#!/usr/bin/perl
use strict;
use warnings;
open(FH,"+<","test.txt") || die "not able to open test.txt $!";
while(my $line = <FH>)
{
$line =~ s/perl/python/i;
print FH $line;
}
close(FH);
Incorrect output in test.txt:
im learning perl
im learning python
chapter
chapter
How do I modify the file contents using single file handle?
You can't delete from a file (except at the end).
You can't insert characters into a file (except at the end).
You can replace a character in a file.
You can append to a file.
You can shorten a file.
That's it.
You're imagining you can simply replace "Perl" with "Python" in the file. Those aren't of the same length, so it would require inserting characters into the file, and you can't do that.
You can effectively insert characters into a file by loading the rest of the file into memory and writing it back out two characters further. But doing this gets tricky for very large files. It's also very slow since you end up copying a (possibly very large) portion of the file every time you want to insert characters.
The other problem with in-place modifications is that you can't recover from an error. If something happens, you'll be left with an incomplete or corrupted file.
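The rule that you can overwrite but not insert can be illustrated with a same-length replacement. This is only a sketch: overwrite_at is a hypothetical helper, and it assumes the caller already knows the byte offset and that the new text is exactly as long as the text it replaces.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Overwrite bytes in place, starting at byte offset $pos in $path.
# Only correct when $new has the same length as the text it replaces,
# because nothing is inserted or deleted.
sub overwrite_at {
    my ($path, $pos, $new) = @_;
    open(my $fh, '+<', $path) or die "Can't open '$path': $!";
    seek($fh, $pos, SEEK_SET) or die "Can't seek in '$path': $!";
    print {$fh} $new          or die "Can't write to '$path': $!";
    close($fh)                or die "Can't close '$path': $!";
    return;
}
```

For example, overwrite_at('test.txt', 12, 'ruby') would replace "perl" with "ruby" in the first line of the sample file, since both words are four bytes long. Replacing it with "python" the same way would overwrite the start of the next line instead of shifting it.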
If the file is small and you're ok with losing the data if something goes wrong, the simplest approach is to load the entire file into memory.
use Fcntl qw(SEEK_SET);
open(my $fh, '+<', $qfn)
    or die("Can't open \"$qfn\": $!\n");
my $file = do { local $/; <$fh> };
$file =~ s/Perl/Python/g;
seek($fh, 0, SEEK_SET)
    or die $!;
print($fh $file)
    or die $!;
truncate($fh, tell($fh))
    or die $!;
A safer approach is to write the data to a new file, then rename the file when you're done.
my $new_qfn = $qfn . ".tmp";
open(my $fh_in, '<', $qfn)
or die("Can't open \"$qfn\": $!\n");
open(my $fh_out, '>', $new_qfn)
or die("Can't create \"$new_qfn\": $!\n");
while (<$fh_in>) {
s/Perl/Python/g;
print($fh_out $_);
}
close($fh_in);
close($fh_out);
rename($new_qfn, $qfn)
or die $!;
The downside of this approach is it might change the file's permissions, and hardlinks will point to the old content instead of the new file. You also need permissions to create a file.
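The permissions part of that downside can be reduced by copying the mode bits from the original file onto the new one before the rename. A rough sketch, with copy_mode as a hypothetical helper; it does not address hardlinks or ownership.

```perl
use strict;
use warnings;

# Copy the permission bits of $from onto $to.
# copy_mode is a hypothetical helper; ownership and hardlinks are not handled.
sub copy_mode {
    my ($from, $to) = @_;
    my @st = stat($from)
        or die "Can't stat '$from': $!";
    chmod($st[2] & 07777, $to)
        or die "Can't chmod '$to': $!";
    return;
}
```

It would be called as copy_mode($qfn, $new_qfn) after closing the output handle and before rename.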
As @Сухой27 answered,
this is a typical situation where a Perl one-liner is convenient.
perl -i -pe 's/perl/python/i'
perl takes the following options:
-p : wrap the code in a line-by-line loop (each line is assigned to $_, and $_ is printed after the code has run)
-e : the code to evaluate inside that loop (the regex uses $_ as its default operand)
-i : edit the file in place (if you pass an argument to -i, perl keeps a copy of the original file with that extension)
If you run the command below
perl -i.bak -pe 's/perl/python/i' test.txt
you will get the modified test.txt
im learning python
im in File handlers chapter
and the original text is kept in test.txt.bak
im learning perl
im in File handlers chapter

Open file in perl and STDERR contents problems

I'm trying to write some unit tests for a perl file uploading script. I'm still pretty new to perl so I'm having some issues achieving the outcome I expect from my code.
Basically my thought process is that I can pass a test_only attribute along with the request that will tell the script to just grab a file already on the system rather than try to use an uploaded file.
I created a test file and put it in my output/tmp directory. I made sure to set its permissions to 775. It's just a simple .txt file that says "I am a test file".
What I expect to happen currently is that when I run my test script I should see the contents of the file printed out to the error log as well as some reference to the buffer(so I can verify the file is being opened properly). However, this is not happening, nothing is being put in the error log. I'm wondering if the file is being opened properly?
I'm sure I'm just missing something fundamental about how perl opens files. Any help will be appreciated. Thanks :)
This is the appropriate snippet of my code:
my $test_only = 1;
my $tmp_uploads_path = "/home/my_instance/output/tmp/";
if($test_only)
{
#put simulated file handle and file name here
$file = "";
$file_name = "test_file.txt";
}
else
{
$file = $q->upload('file')
|| die "No file data sent\n $!";
$file_name = $q->param('file_name')
|| die "No file_name sent\n $!";
}
########
#SAVE THE UPLOAD
########
my $bufsize = 1024;
my $buffer = '';
open(my $TMPFILE, ">".$tmp_uploads_path.$file_name);
binmode $TMPFILE;
print STDERR "=> ".Dumper($TMPFILE)."\n";
while(read ($TMPFILE, $buffer, $bufsize)){
print STDERR "=> ".Dumper($TMPFILE)."\n";
print STDERR "=> ".Dumper($buffer)."\n";
print $TMPFILE $buffer;
}
close($TMPFILE);
You opened $TMPFILE for writing (the > mode), so you cannot read from it.
You should always put use strict; use warnings; at the top of your scripts; this would have alerted you to the problem!
You should open files like
my $name = ...;
open my $fh, "<", $name or die "Can't open $name: $!";
or
use autodie;
open my $fh, "<", $name;
That is, do proper error handling, and use the three-arg variant of open: handle, mode and name (don't concatenate mode and name, except on ancient perls).
I am also surprised that you are using read. You can get a similar effect with
local $/ = \$bufsize;
while (defined(my $buffer = <$TMPFILE>)) { ... }
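Put together, the record-size approach could look like the sketch below, which copies one file to another in fixed-size chunks. The name copy_in_chunks and its arguments are assumptions made for illustration; the source handle is opened for reading and the destination for writing, which is the distinction the question's code missed.

```perl
use strict;
use warnings;

# Copy $src to $dst in chunks of at most $bufsize bytes.
# Returns the number of chunks read. copy_in_chunks is a hypothetical name.
sub copy_in_chunks {
    my ($src, $dst, $bufsize) = @_;
    open(my $in,  '<', $src) or die "Can't open '$src': $!";
    open(my $out, '>', $dst) or die "Can't create '$dst': $!";
    binmode($in);
    binmode($out);

    local $/ = \$bufsize;     # each <$in> now returns at most $bufsize bytes
    my $chunks = 0;
    while (defined(my $buffer = <$in>)) {
        print {$out} $buffer or die "Can't write to '$dst': $!";
        $chunks++;
    }

    close($in);
    close($out) or die "Can't close '$dst': $!";
    return $chunks;
}
```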

Copy/rename images with utf8 names using csv file

I'm working on a script to batch rename and copy images based on a CSV file. The CSV consists of column 1: old name and column 2: new name. I want to use the CSV file as input for the Perl script so that it checks the old name and makes a copy using the new name in a new folder. The problem that (I think) I'm having has to do with the image names. They contain non-ASCII characters like ß. When I run the script it prints out this: Barfu├ƒg├ñsschen where it should be Barfußgässchen, along with the following error:
Unsuccessful stat on filename containing newline at C:/Perl64/lib/File/Copy.pm line 148, <$INFILE> line 1.
Copy failed: No such file or directory at X:\Script directory\correction.pl line 26, <$INFILE> line 1.
I know it has to do with binmode utf8, but even when I try a simple script (seen here: How can I output UTF-8 from Perl?):
use strict;
use utf8;
my $str = 'Çirçös';
binmode(STDOUT, ":utf8");
print "$str\n";
it prints out this: Ãirþ÷s
This is my entire script. Can someone explain where I'm going wrong? (It's not the cleanest code because I was testing things out.)
use strict;
use warnings;
use File::Copy;
use utf8;
my $inputfile = shift || die "give input!\n";
#my $outputfile = shift || die "Give output!\n";
open my $INFILE, '<', $inputfile or die "In use / not found :$!\n";
#open my $OUTFILE, '>', $outputfile or die "In use / not found :$!\n";
binmode($INFILE, ":encoding(utf8)");
#binmode($OUTFILE, ":encoding(utf8)");
while (<$INFILE>) {
s/"//g;
my @elements = split /;/, $_;
my $old = $elements[1];
my $new = "new/$elements[3]";
binmode STDOUT, ':utf8';
print "$old | $new\n";
copy("$old","$new") or die "Copy failed: $!";
#copy("Copy.pm",\*STDOUT);
# my $output_line = join(";", @elements);
# print $OUTFILE $output_line;
#print "\n"
}
close $INFILE;
#close $OUTFILE;
exit 0;
You need to ensure every step of the process is using UTF-8.
When you create the input CSV, you need to make sure that it's saved as UTF-8, preferably without a BOM. Windows Notepad will add a BOM so try Notepad++ instead which gives you more control of the encoding.
You also have the problem that the Windows console is not UTF-8 compliant by default. See Unicode characters in Windows command line - how?. Either set the codepage with chcp 65001 or don't change the STDOUT encoding.
In terms of your code, the first error about the newline is probably due to the trailing newline on each CSV line. Add chomp; as the first statement inside the while (<$INFILE>) { loop.
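A small sketch of the line handling with the newline removed before the fields are used as filenames. It assumes the two-column "old;new" layout described in the question (the question's own script uses different indices), and parse_rename_line is a hypothetical helper name.

```perl
use strict;
use warnings;

# Split one CSV line of the assumed form "old";"new" into its two fields.
# parse_rename_line is a hypothetical helper, not part of the original script.
sub parse_rename_line {
    my ($line) = @_;
    chomp($line);             # drop the trailing newline
    $line =~ s/\r\z//;        # drop a CR left over from Windows line endings
    $line =~ s/"//g;          # strip the quotes, as the original script does
    my ($old, $new) = split /;/, $line;
    return ($old, $new);
}
```

Without the chomp, the last field keeps its newline, which is what triggers the "filename containing newline" warning from File::Copy.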
Update:
To "address" the file you need to encode your filenames in the correct locale - See How do you create unicode file names in Windows using Perl and What is the universal way to use file I/O API with unicode filenames?. Assuming you're using Western 1252 / Latin, your copy command will look like this (encode comes from the Encode module, so add use Encode; at the top):
copy(encode("cp1252", $old), encode("cp1252", $new))
Your open should encode the filename as well:
open my $INFILE, '<', encode("cp1252", $inputfile)
Update 2:
As you're running in a DOS window, remove binmode(STDOUT, ":utf8"); and leave the default codepage in place.

perl unable to copy contents of file and print it

I need to read/copy the contents of a file(test.pl) just as the way it is formatted and email it.
I am using the following code but I am unable to print anything out.
I get this error even though the file exists in the same directory.
Failed: No such file or directory
Code:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
open my $fh, '<', 'test.pl '
or die "Failed: $!\n";
my $text = do {
local $/;
<$fh>
};
close $fh
or die "Failed again: $!\n";
print $text, "\n";
It looks like there is an extra space in the filename you are trying to open. In your open statement, try changing 'test.pl ' to 'test.pl'.
If you are going to read file names from STDIN (user input), you may want to trim them, either with a regex (s/^\s+//....) or with Text::Trim, among other validations.
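The regex variant of that trimming might look like the following sketch; trim is a hypothetical helper name here and is not the Text::Trim function.

```perl
use strict;
use warnings;

# Remove leading and trailing whitespace (including a trailing newline).
# trim is a hypothetical local helper, not the one from Text::Trim.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}
```

A filename read from user input could then be cleaned with my $name = trim(scalar <STDIN>); before being passed to open.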

Perl : Cannot open a file with filename passed as argument

I am passing two filenames from a DOS batch file to a Perl script.
my $InputFileName = $ARGV[0];
my $OutputFileName = $ARGV[1];
Only the input file physically exists while the Outputfile must be created by the script.
open HANDLE, $OutputFileName or die $!;
open (HANDLE, ">$OutputFileName");
open HANDLE, ">$OutputFileName" or die $!;
All three fail.
However the following works fine.
open HANDLE, ">FileName.Txt" or die $!;
What is the correct syntax?
Edit : Error message is : No such file or directory at Batchfile.pl at line nn
The proper way is to use the three-parameter form of open (with the mode as a separate parameter) with lexical file handles. Also die doesn't have a capital D.
Like this
open my $out, '>', $OutputFileName or die $!;
but your last example should work assuming you have spelled die properly in your actual code.
If you are providing a path to the filename that doesn't exist then you also need to create the intermediate directories.
The die string will tell you the exact problem. What message do you get when this fails?
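If the output path points into a directory that does not exist yet, the intermediate directories can be created before opening the file. A minimal sketch using the core File::Path and File::Basename modules; open_for_write is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);

# Create any missing parent directories, then open $path for writing.
# open_for_write is a hypothetical helper name.
sub open_for_write {
    my ($path) = @_;
    my $dir = dirname($path);
    make_path($dir) unless -d $dir;
    open(my $out, '>', $path)
        or die "Can't create '$path': $!";
    return $out;
}
```

With this, my $out = open_for_write($OutputFileName); either returns a writable handle or dies with the system error message.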
code:
my $file_name = $ARGV[0];
open(OUTPUT, ">", $file_name) or die "unable to create or open $file_name: $!";
print OUTPUT "hello world";
close(OUTPUT);
command to execute:
perl perl_file.pl data.txt
Try this; it should work.