Having some trouble with Perl (I am brand new at it). I have one .txt file in the same directory. I'm planning to copy the file, print it to stdout, add more text to the copy, and compare file sizes. This is what I have so far:
#!/usr/local/bin/perl
use File::Copy;
copy("data.txt", "copyOfData.txt") or die "copy failed :(";
open (MYFILE, "data.txt") or die "open failed :(";
while (<MYFILE>) {
chomp;
print "$_\n";
}
$filesize = -s MYFILE;
print "MYFILE filesize is $filesize\n";
close (MYFILE);
open(MYCOPYFILE, ">>copyOfData.txt");
print MYCOPYFILE "\nextra data here blah blah blah\n";
$filesize = -s MYCOPYFILE;
print "MYCOPYFILE filesize is $filesize\n";
close (MYCOPYFILE);
However, the output I'm getting is as follows:
MYFILE filesize is 28
MYCOPYFILE filesize is 28
Surely the MYCOPYFILE size should be bigger than the MYFILE size as I've added extra text? I have checked both text files and the copy does have the extra text at the end.
Thanks for your help!
You can check the size of the file using the filename (you don't have to open it).
my $size = -s 'data.txt';
And you should always start your script with
use strict;
use warnings;
And opening files is better done with the three-argument version of open:
open my $filehandle, '<', 'filename' or die "Failed to open: $!";
$filesize = -s MYFILE;
-s does accept a filehandle, but it reports the size currently on disk, so output still sitting in Perl's buffer isn't counted. If you'd rather get the size from a filehandle via stat, element 7 of its return list is the size in bytes:
$filesize = (stat(MYFILE))[7];
See perldoc -f -X for details of -s and friends, and perldoc -f stat for stat.
Check the filesize after closing the file; close flushes Perl's output buffer to disk. If that does not change the result, try adding more text and the size should change then.
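For example, a minimal sketch of the tail of the script, with the size checked by filename after the close (file and handle names taken from the question):
close(MYCOPYFILE);                   # close flushes Perl's buffer to disk
$filesize = -s 'copyOfData.txt';     # check by filename, after the close
print "copyOfData.txt filesize is $filesize\n";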
Greetings
dgw's answer is correct.
Another way is to use File::stat as:
#!/usr/bin/perl
use strict;
use warnings;
use File::stat;
my $filesize = stat("test.txt")->size;
print "Size: $filesize\n";
exit 0;
Perl automatically buffers output on IO objects. The contents may not be written to disk right after you call print FH "blah blah". However, the -s function reads the file size from disk, so when you think you have "updated" the file, you may get a size smaller than expected.
To get the correct size, flush the buffered content to disk and then fetch the size with -s. Note that close(FH) first flushes the buffer automatically and then closes the file handle, so you can put the -s operation after the close call to get an accurate size.
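A short demonstration sketch of the effect (the file name is a placeholder):
open my $fh, '>>', 'demo.txt' or die $!;
print {$fh} "x" x 100;                                     # small write: likely still in the buffer
print "before close: ", (-s 'demo.txt') // 0, " bytes\n";
close $fh or die $!;
print "after close:  ", -s 'demo.txt', " bytes\n";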
Or, flush the buffer explicitly by calling flush() of IO::Handle before getting size:
use IO::Handle;                      # provides the flush() method on filehandles

open(MYCOPYFILE, ">>copyOfData.txt") or die "open failed: $!";
print MYCOPYFILE "\nextra data here blah blah blah\n";
MYCOPYFILE->flush();                 # push buffered data out to disk
$filesize = -s MYCOPYFILE;           # now reflects everything written
print "MYCOPYFILE filesize is $filesize\n";
-s operates on filehandles, directory handles, and expressions that evaluate to a file name, so your -s call itself is fine; the problem is purely the buffering described above.
Related
When I open a SQLite database file, there is a lot of readable text at the beginning of the file. How big is the chance that a SQLite file is wrongly filtered away by the -B file test?
#!/usr/bin/env perl
use warnings;
use strict;
use 5.10.1;
use File::Find;

my $dir = shift;
my $databases = [];   # initialize so the final dereference can't blow up

find( {
    wanted => sub {
        my $file = $File::Find::name;
        return if not -B $file;
        return if not -s $file;
        return if not -r $file;
        say $file;
        open my $fh, '<', $file or die "$file: $!";
        my $firstline = readline( $fh ) // '';
        close $fh or die $!;
        push @$databases, $file if $firstline =~ /\ASQLite\sformat/;
    },
    no_chdir => 1,
},
$dir );

say scalar @$databases;
The perlfunc man page has the following to say about -T and -B:
The -T and -B switches work as follows. The first block or so of the file is
examined for odd characters such as strange control codes or characters with
the high bit set. If too many strange characters (>30%) are found, it's a -B
file; otherwise it's a -T file. Also, any file containing a zero byte in the
first block is considered a binary file.
Of course you could now do a statistical analysis of a number of SQLite files, parse their "first block or so" for "odd characters", calculate the probability of their occurrence, and that would give you an idea of how likely it is that -B fails for SQLite files.
However, you could also go the easy route. Can it fail? Yes, it's a heuristic. And a bad one at that. So don't use it.
File type recognition on Unix is usually done by evaluating the file's content. And yes, there are people who've done all the work for you already: it's called libmagic (the thingy that yields the file command line tool). You can use it from Perl with e.g. File::MMagic.
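A minimal sketch of that route, assuming File::MMagic's checktype_filename interface (the exact type string it reports for SQLite may vary with your magic database):
use strict;
use warnings;
use feature 'say';
use File::MMagic;

my $file = shift @ARGV or die "usage: $0 file\n";  # path of the file to test
my $mm   = File::MMagic->new;
my $type = $mm->checktype_filename($file);         # type guessed from the content
say "$file => $type";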
Well, all files are technically a collection of bytes, and thus binary. Beyond that, there is no accepted definition of binary, so it's impossible to evaluate -B's reliability unless you care to posit a definition by which it is to be evaluated.
I'm new to Perl, so sorry if this is obvious, but I looked up how to open a file and use the flags, and for the life of me they don't seem to work right. I narrowed it down to these lines of code.
if ($flag eq "T"){
open xFile, ">" , "$lUsername\\$openFile";
}
else
{
open xFile, ">>", "$lUsername\\$openFile";
}
Both of these methods seem to delete the contents of my file. I also checked that the flag is formatted correctly, and it is; I know for a fact I've gone down both conditions.
EDIT: codepaste of a larger portion of my code http://codepaste.net/n52sma
New to Perl? I hope you're using use strict and use warnings.
As others have stated, you should be using a test to make sure your open succeeded. However, that's not really the problem here. In fact, I used your code, and it seems to work fine for me. Maybe you should try printing some debugging messages to see whether it's doing what you think it's doing:
use strict;
use warnings;
use autodie; #Will stop your program if the "open" doesn't work.

my $flag      = shift // 'T';   # placeholder: set this however your script really does
my $lUsername = "ABaker";
my $openFile  = "somefile.txt";

if ($flag eq "T") {
    print qq(DEBUG: Flag = "$flag": Deleting file "$lUsername/$openFile"\n);
    open xFile, ">", "$lUsername/$openFile";
}
else {
    print qq(DEBUG: Flag = "$flag": Appending file "$lUsername/$openFile"\n);
    open xFile, ">>", "$lUsername/$openFile";
}
You want to use strict and warnings in order to make sure you're not having issues with variable names. The use strict forces you to declare your variables first. For example, are you setting $Flag, but then using $flag? Maybe $flag is set the first time through, but you're setting $Flag the second time through.
Anyway, the DEBUG: statements will give you a better idea of what your error could be.
By the way, you're checking whether $flag is set to T and not t. If you want to accept both t and T, test uc $flag eq 'T' rather than just $flag eq 'T'.
@Ukemi
I reformatted to comply with use strict; I also added print statements to make sure I was truncating when I want to, and not when I don't. It still is deleting the file, although now sometimes it's simply not writing. I'm going to give a larger portion of my code in a link; I'd really appreciate it if you gave it a once-over.
Are you seeing it say Truncating, but the file is empty? Are you sure the file already existed? There's a reason why I put the flag and everything in my debug statements. The more you print, the more you know. Try the following section of code:
$file = "lUsername/$openFile" #Use forward slashes vs. back slashes.
if ($flag eq "T") {
print qq(Flag = "$flag". Truncating file "$file"\n);
open $File , '>', $file
or die qq(Unable to open file "$file" for writing: $!\n);
}
else {
print qq(Flag = "$flag". Appending to file "$file"\n);
if (not -e $file) {
print qq(File "$file" does not exist. Will create it\n");
}
open $File , '>>', $file
or die qq(Unable to open file "$file" for appending: $!\n);
}
Note I'm printing out the flag and the name of the file in quotes. This will allow me to see if there are any hidden characters in my file name.
I'm using the qq(...) method to quote strings, so I can use the quotation marks in my print statements.
Also note I'm checking for the existence of the file before appending. This way, I know whether the file actually existed beforehand.
This should point out any possible errors in your logic. The other thing you can do is to stop your program when you finish writing out the file and verify that the file was written out as expected.
print "Write to file now:\n";
my $writeToFile = <>;
printf $File "$writeToFile";
close $File;
print "DEBUG: Temporary stop. Examine file\n";
<STDIN>; #DEBUG:
Now, if you see it saying it's appending to the file, and the file exists, and you still see the file being overwritten, we'll know the problem lies in your actual open xFile, ">>", $file statement.
You should use the three-argument version of open with lexical filehandles, and check whether an error occurred:
# Writing to file (clobbering it if it exists)
open my $fh, '>', $filename
    or die "Unable to write to file '$filename': $!";

# Appending to file
open my $fh, '>>', $filename
    or die "Unable to append to file '$filename': $!";
>> does not clobber or truncate. Either you ended up in the "then" clause when you expected to be in the "else" clause, or the problem is elsewhere.
To check what $flag contains:
use Data::Dumper;
local $Data::Dumper::Useqq = 1;
print(Dumper($flag));
For reference, here are some basic file-handling techniques.
open FILE, "filename.txt" or die $!;
The command above associates the FILE filehandle with the file filename.txt. You can then use the filehandle to read from the file. If the file doesn't exist, or you cannot read it for any other reason, the script dies with the appropriate error message stored in the $! variable.
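For example, reading the file line by line through that handle (a minimal sketch in the same style):
open FILE, "filename.txt" or die $!;
while (my $line = <FILE>) {
    chomp $line;            # strip the trailing newline
    print "$line\n";
}
close FILE;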
open FILEHANDLE, MODE, EXPR
The available modes are the following (see the sketch after the list):
read    <    # opens the file for reading
write   >    # creates a new file; if the file already exists, it is truncated and overwritten
append  >>   # appends to the file if it already exists, else creates a new one
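A minimal sketch of all three modes with lexical filehandles (the filenames are placeholders):
open my $in,  '<',  'input.txt'  or die "read failed: $!";    # read
open my $out, '>',  'output.txt' or die "write failed: $!";   # create/truncate
open my $log, '>>', 'log.txt'    or die "append failed: $!";  # append
close $_ for $in, $out, $log;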
If this is confusing, you can use the module File::Slurp. Here is some sample code using File::Slurp:
use strict;
use warnings;
use File::Slurp;

my $content = read_file("test.txt");                          # read file contents
write_file("test2.txt", $content);                            # write file
my @all_files = read_dir("/home/desktop", keep_dot_dot => 0); # read a dir
write_file("test2.txt", {append => 1}, "@all_files");         # append mode
I have some code that appends into some files in the nested for loops. After exiting the for loops, I want to append .end to all the files.
foreach my $file (@SPICE_FILES)
{
open(FILE1, ">>$file") or die "[ERROR $0] cannot append to file : $file\n";
print FILE1 "\n.end\n";
close FILE1;
}
I noticed that in some strange cases the ".end" is appended into the middle of the file! How do I resolve this?
Since I do not yet have the comment privilege, I'll have to write this as an 'answer'.
Do you use any dodgy modules?
I have run into issues where (obviously) broken Perl modules have done something to the output buffering. For me, placing
$| = 1;
in the code has helped. The statement above turns off Perl's output buffering (AFAIK). It might have other effects too, but I have not seen anything negative come out of it.
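Note that $| only affects the currently selected handle (STDOUT by default). To unbuffer one specific filehandle, a minimal sketch using IO::Handle's autoflush (handle and file names are placeholders):
use IO::Handle;

open my $fh, '>>', 'out.txt' or die $!;
$fh->autoflush(1);                     # flush after every print on this handle
print $fh "written to disk immediately\n";
close $fh;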
I guess you've got data buffered in some previously opened file descriptors. Try closing them before re-opening:
open my $fd, ">>", $file or die "Can't open $file: $!";
print $fd $data;    # note: no comma after the filehandle
close $fd or die "Can't close: $!";
Better yet, you can collect those filehandles in an array or hash and write to them during cleanup:
push @handles, $fd;
# later
print {$_} "\n.end\n" for @handles;
Here's a case to reproduce the "impossible" append in the middle:
#!/usr/bin/perl -w
use strict;
my $file = "file";
open my $fd, ">>", $file;
print $fd "begin"; # no \n -- write buffered
open my $fd2, ">>", $file;
print $fd2 "\nend\n";
close $fd2; # file flushed on close
# program ends here -- $fd finally closed
# you're left with "end\nbegin"
It’s not possible to append something to the middle of the file. The O_APPEND flag guarantees that each write(2) syscall will place its contents at the old EOF and update the st_size field by incrementing it by however many bytes you just wrote.
Therefore if you find that your own data is not showing up at the end when you go to look at it, then another agent has written more data to it afterwards.
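A minimal sketch of what O_APPEND looks like when opened explicitly (Perl's '>>' does this under the hood; the file name is a placeholder):
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

sysopen my $fh, 'file.txt', O_WRONLY | O_APPEND | O_CREAT
    or die "sysopen: $!";
print {$fh} "this always lands at the current EOF\n";   # each write(2) appends
close $fh;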
(Context: I'm trying to monitor a long-running process from a Perl CGI script. It backs up an MSSQL database and then 7-zips it. So far, the backup part (using WITH STATS=1) outputs to a file, which I can have the browser look at, refreshing every few seconds, and it works.)
I'm trying to use 7zip's command-line utility but capture the progress bar to a file. Unfortunately, unlike SQL backups, which output another line every time another percent is done, 7zip rewinds its output before printing the new progress data, so that it looks nicer when you're watching it on the command line. This is unfortunate because normal redirects using >, 1>, and 2> only create a blank file, and no output ever appears in it, except for >, which shows nothing until the job is done, which isn't very useful for a progress bar.
How can I capture this kind of output, either by having every change in % somehow be appended to a logfile (so I can use my existing method of logfile monitoring) just using command-line trickery (no Perl), or by using some Perl code to capture it directly after calling system()?
If you need to capture the output all at once, backticks are the code you want:
my $output = `$cmd`;    # runs the command and captures its STDOUT
If you want to read the output line by line then you need this code:
#! perl -slw
use strict;
use threads qw[ yield async ];
use threads::shared;

my( $cmd, $file ) = @ARGV;
my $done : shared = 0;
my @lines : shared;

async {
    my $pid = open my $CMD, "$cmd |" or die "$cmd : $!";
    open my $fh, '>', $file or die "$file : $!";
    while( <$CMD> ) {
        chomp;
        print $fh $_;    ## output to the file
        push @lines, $_; ## and push it to a shared array
    }
    $done = 1;
}->detach;

my $n = 0;
while( !$done ) {
    if( @lines ) {          ## lines to be processed
        print pop @lines;   ## process them
    }
    else {
        ## Else nothing to do but wait.
        yield;
    }
}
Another option is using the Windows CreateProcess API. I know that CreateProcess in Windows C/C++ will allow you to redirect all stdout. Perl has access to the same API call: see Win32::Process.
You can try opening a pipe to read 7zip's output.
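A minimal sketch of that approach (the 7za command line and log file are placeholders; since 7zip redraws its progress with carriage returns, setting $/ to "\r" makes each redraw arrive as its own "line"):
use strict;
use warnings;
use IO::Handle;

local $/ = "\r";                            # 7zip redraws its progress line with \r
open my $pipe, '7za a archive.7z somedir |' or die "can't start 7za: $!";
open my $log,  '>>', 'progress.log'         or die "progress.log: $!";
$log->autoflush(1);                         # so the CGI poller sees updates immediately
while ( my $chunk = <$pipe> ) {
    chomp $chunk;                           # strip the trailing \r
    print $log "$chunk\n";                  # append each redraw as its own line
}
close $pipe;
close $log;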
This doesn't answer how to capture output that gets rewound, but it's a useful way of going about it that I ended up using.
For restores:
use 7za l to list the files in the zip file and their sizes
fork 7za e using open my $command
track each file as it comes out with -s $filename and compare to the listing
when all output files are at their full size, you're done
For backups:
create a unique dir somewhere
fork 7za a -w
find the .tmp file in the dir
track its size
when the .tmp file no longer exists, you're done
For restores you get enough data to show a percentage done, but for backups you can only show the total file size so far. You could compare that with historical ratios if you're compressing similar data to get a guesstimate. Still, it's more feedback than before (none).
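A minimal sketch of the restore-side tracking (the file name and expected size are hypothetical stand-ins for the data parsed out of 7za l):
use strict;
use warnings;

# %expected maps each output file to its size as reported by `7za l` (hypothetical data)
my %expected = ( 'db.bak' => 1_048_576 );

my ( $done, $total ) = ( 0, 0 );
for my $file ( keys %expected ) {
    my $size = ( -s $file ) // 0;    # current size on disk; 0 if not created yet
    $size = $expected{$file} if $size > $expected{$file};
    $done  += $size;
    $total += $expected{$file};
}
printf "%.0f%% restored\n", 100 * $done / $total;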
I have an issue with a Perl script: it modifies the content of a file, then reopens it to write the result back, and in the process some characters are lost. All words starting with '%' are deleted from the file. That's pretty annoying, because the % expressions are variable placeholders for dialog boxes.
Do you have any idea why? The source file is XML with default encoding.
Here is the code:
undef $/;
open F, $file or die "cannot open file $file\n";
my $content = <F>;
close F;
$content =~s{status=["'][\w ]*["']\s*}{}gi;
printf $content;
open F, ">$file" or die "cannot reopen $file\n";
printf F $content;
close F or die "cannot close file $file\n";
You're using printf there, and it thinks its first argument is a format string. See the printf documentation for details. When I run into this sort of problem, I always check that I'm using the functions correctly. :)
You probably want just print:
print FILE $content;
In your example, you don't need to read in the entire file since your substitution does not cross lines. Instead of trying to read and write to the same filename all at once, use a temporary file:
open my $in,  "<", $file       or die "cannot open file $file\n";
open my $out, ">", "$file.bak" or die "cannot open file $file.bak\n";

while( <$in> )
{
    s{status=["'][\w ]*["']\s*}{}gi;
    print $out $_;    # note the $_ - a bare "print $out;" prints the handle itself
}

close $in;
close $out;    # flush before renaming over the original

rename "$file.bak", $file or die "Could not rename file\n";
This also reduces to this command-line program:
% perl -pi.bak -e 's{status=["\'][\w ]*["\']\s*}{}g' file
Er. You're using printf.
printf interprets "%" as something special.
use "print" instead.
If you have to use printf, use
printf "%s", $content;
Important note:
printf stands for "print formatted", just as it does in C, and fprintf is the C equivalent for file I/O.
Perl is not C.
And even in C, passing your content as the first parameter gets you shot for security reasons: it's a classic format-string vulnerability.
Or even
perl -i.bak -pe 's{status=["\'][\w ]*["\']\s*}{}gi;' yourfiles
-e says "there's code following for you to run"
-i.bak says "rename the old file to whatever.bak"
-p adds a read-print loop around the -e code
Perl one-liners are a powerful tool and can save you a lot of drudgery.
If you want a solution that is aware of the XML nature of the docs (i.e., only delete status attributes, and not matching text contents) you could also use XML::PYX:
$ pyx doc.xml | perl -ne'print unless /^Astatus/' | pyxw
That's because you used printf instead of print. printf doesn't print "%" literally (it assumes you forgot the format symbol, such as %s or %f) unless you explicitly write it as "%%". :-)
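A tiny demonstration sketch of the difference ($content is a stand-in for your data):
my $content = "100% done\n";
print  $content;          # print treats % literally
printf "%s", $content;    # safe: the data is an argument, not the format
printf "%%\n";            # a literal percent in a format string must be doubled
# printf $content;        # WRONG: $content is treated as a format string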