What might cause a print error in Perl?

I have a long-running script that opens a file every hour, prints to it and closes it. I've recently found that, very rarely, the print fails. I'm not testing the status of the print itself; I only notice because entries are missing from the file until the system is actually rebooted!
I do trap file open failures and write a message to syslog when that happens, and I'm not seeing any open failures, so I'm now guessing it may be the print that is failing. I'm not trapping print failures, which I suspect most people don't either, but I'm now going to update that one print.
Meanwhile, my question is: does anyone know what kinds of situations could cause a print statement to fail when there is plenty of disk space and no contention for a file that has been successfully opened in append mode?

You could be out of memory (ENOMEM) or over a filesize limit (E2BIG or SIGXFSZ). You could have an old-fashioned I/O error (EIO). You could have a race condition if the script is run concurrently or if the file is accessed over NFS. And, of course, you could have an error in the expression whose value you would print.
An exotic cause that I once saw is that a CPU heatsink failure can lead to sprintf spuriously failing, causing some surprising results including writing garbage to file descriptors.
Finally, remember that print usually writes into an I/O buffer first. This means two things: (1) you need to check the result of close() as well, and (2) if you print but don't close() or flush() promptly, your data can sit in the buffer and not actually be written until much later (or not at all if the process dies horribly).
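As an illustration, here is a minimal sketch (the log path and message are hypothetical) of checking each step; open, print and close can all fail, and close is often where buffered errors finally surface:

use strict;
use warnings;

my $log = '/var/tmp/hourly.log';    # hypothetical path for illustration

open my $fh, '>>', $log
    or die "open '$log' failed: $!";
print {$fh} "entry at ", scalar localtime, "\n"
    or warn "print to '$log' failed: $!";
close $fh
    or warn "close of '$log' failed: $!";   # buffered write errors show up here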

Related

What does if (print) mean in Perl?

I have the following code:
if (print msgsnd($ipc_msg_to_mppt_id, $msg, 0)) {
What is the purpose of print here? What does it return?
The documentation says it returns true if successful. But how can printing be unsuccessful?
Printing isn't necessarily as simple as dumping output to a console. It could also be redirected to a file or some other kind of pipe. If it's redirected to a file you don't have write access to, printing to that file will fail. If it's piped into another program and that program terminates, writing to it will cause a broken-pipe error.
As a general principle, I/O operations are outside the control of your program, so you should always assume that they can fail. Reading from or writing to a file, the console, or any kind of socket or pipe can always fail without warning. So if you want your program to do something about it, you need to check the return value of functions like print.
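For instance, here is a minimal sketch (assuming a Unix-like system with head available; the command and counts are arbitrary) of print reporting a broken pipe once the reader goes away. SIGPIPE is ignored so the failure shows up as a false return from print with $! set, instead of killing the script:

use strict;
use warnings;

$SIG{PIPE} = 'IGNORE';    # let the failed write set $! instead of killing us

open my $out, '|-', 'head -n 1' or die "cannot start reader: $!";
for my $i (1 .. 50_000) {
    unless (print {$out} "record $i\n") {
        warn "print failed: $!\n";    # typically "Broken pipe" once head has exited
        last;
    }
}
close $out;

Because the pipe is buffered, the first few prints may still appear to succeed; the failure surfaces on a later print, when Perl flushes the buffer.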

fork() creates run-time error, am I using it correctly?

I have a Perl program that I've written that parses SQL-like statements and creates a job to run on a mainframe that extracts fields and records based on the criteria. The records are returned to the PC and then output in various formats (e.g. csv, XML, etc.). I co-opted SQL's INTO clause to list the files and outputs.
The issue is that writing records to a SQLite database (one of the output formats) takes a relatively long time and I don't want that to hold up the processing of the next SQL command (in case there are multiple SQL queries passed in). So, I thought I would use fork() to make this happen.
The original bit of code is:
foreach (@into) {
    dointo($_, $recs);
}
@into is a list of files and formats (e.g. 'File1.csv' is a comma-delimited format written to File1.csv, 'File1.xml' is an XML format written to File1.xml, etc.) to be processed. The subroutine dointo handles each of these. $recs is a sort of iterator that returns records in a variety of formats (flat, anonhash, anonarray, etc.).
So, I changed the code to:
foreach (@into) {
    unless (fork()) {
        dointo($_, $recs);
        exit 0;
    }
}
but now when the code runs, it seems to work, yet it throws a run-time error every time.
I didn't capture the return from fork() because I really don't care about waiting for the forked process to finish. Could this be the error? Does the parent process NEED to wait for the child processes to finish before it can safely exit?
Any help would be appreciated.
(BTW, Windows XP is the OS, Activestate Perl 5.10.0 is the Perl version)
Don't write to a SQLite database in parallel.
Fork on Windows has always been a bit dodgy. Remember that fork() can return two false values: 0 if you're in the child process and undef if the fork failed. So in your code above you may have a failed fork. Also, since you don't want to wait for your children, you should set $SIG{CHLD} to 'IGNORE' so that they don't become zombies.
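As a minimal sketch of what that checking might look like (the @into, $recs and dointo names are stand-ins borrowed from the question, stubbed out here so the snippet runs on its own; on Windows the CHLD handling matters less, since fork is emulated with threads):

use strict;
use warnings;

$SIG{CHLD} = 'IGNORE';                  # don't wait for children, don't leave zombies

# Hypothetical stand-ins for the question's data and subroutine.
my @into = ('File1.csv', 'File1.xml');
my $recs = undef;
sub dointo { my ($target, $r) = @_; warn "child $$ writing $target\n" }

foreach my $target (@into) {
    my $pid = fork();
    if (!defined $pid) {                # undef: the fork itself failed
        warn "fork failed: $!; doing the work in the parent instead\n";
        dointo($target, $recs);
    }
    elsif ($pid == 0) {                 # 0: we are the child
        dointo($target, $recs);
        exit 0;
    }
    # a true PID: we are the parent, so carry straight on to the next target
}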
See perlipc, perlfaq8 and waitpid for more information. Plus, it would help if you said what the runtime error is :)
I forgot to mention that you might also want to look at Parallel::ForkManager. It will make this simpler for you.
And see perlfork to understand limitations of fork emulation on Windows.
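For comparison, a minimal sketch of the Parallel::ForkManager approach mentioned above (again with hypothetical stand-ins for @into, $recs and dointo, and an arbitrary cap of 4 concurrent children):

use strict;
use warnings;
use Parallel::ForkManager;

my @into = ('File1.csv', 'File1.xml');   # hypothetical stand-ins
my $recs = undef;
sub dointo { my ($target, $r) = @_; warn "child $$ writing $target\n" }

my $pm = Parallel::ForkManager->new(4);  # at most 4 children at a time

foreach my $target (@into) {
    $pm->start and next;                 # parent gets the child's PID and moves on
    dointo($target, $recs);              # only the child runs this
    $pm->finish;                         # child exits here
}
$pm->wait_all_children;                  # reap everything before continuing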

Is it bad to open() and close() in a loop if speed isn't an issue?

I modified another programmer's Perl script I use so that it outputs logs. The script goes through files, and for every file it processes I open() the log, write/print to it and then close() it. This happens many times. I do it this way to make sure I don't lose any data if the Perl script hangs (it eventually starts doing that, and I'm not knowledgeable enough to fix it), so I don't have a good alternative to repeating open() and close() in that loop.
My main question is this: the script is for personal use, so reduced speed is not an issue. But are there other problems that could follow from this likely improper use of open/close? It may sound like a stupid question, but is it possible this would wear my hard disk down faster, or am I misunderstanding how file handling works?
Thanks in advance.
As others have mentioned, there is no issue here other than performance (and arguably cleanliness of code).
However, if you are merely worried about "losing data if Perl hangs up", just set autoflush on the file handle:
use IO::Handle;
open HANDLE, '>log.txt'
or die "Unable to open log.txt for writing: $!";
HANDLE->autoflush(1);
Now every print to HANDLE will get flushed automatically. No need to keep opening and closing.
Search for "autoflush" in perldoc IO::Handle (or see $| in perlvar) for more information.
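Applied to the loop described in the question, a minimal sketch (the log name and file list are hypothetical) of opening the log once in append mode with autoflush, instead of reopening it for every file:

use strict;
use warnings;
use IO::Handle;

my $logfile = 'run.log';                 # hypothetical log name
my @files   = glob '*.txt';              # hypothetical list of files to process

open my $log, '>>', $logfile
    or die "Unable to open $logfile for appending: $!";
$log->autoflush(1);                      # flush after every print

for my $file (@files) {
    # ... process $file here ...
    print {$log} "processed $file\n"
        or warn "print to $logfile failed: $!";
}

close $log or warn "close of $logfile failed: $!";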
In theory it's usually better to open and close connections as quickly as possible, and files are no different. The two things you will run into are file locking and performance.
File locking could come about if something else is accessing your file at the same time.
Performance, as you mentioned, isn't a huge concern.
We're not talking about lifetimes of waiting for open/close operations anyway...it's mostly noticeable with high concurrency or hundreds of thousands of actions.
The OS mediates hard-drive access, so you should be fine; opening and closing a lot of files is OK. The only thing that might happen is that if your script hangs (for some odd reason) while it holds the file handle from open(), you could lose data if you edit the file manually and the script later resumes (a pretty rare scenario). And if your script crashes, the descriptors get released anyway, so there's no issue as far as I can tell.

When do you need to `END { close STDOUT }` in Perl?

In tchrist's boilerplate I found this explicit closing of STDOUT in an END block.
END { close STDOUT }
I know END and close, but I don't see why it is needed.
When I started searching, I found the following in perlfaq8:
For example, you can use this to make sure your filter program managed to finish its output without filling up the disk:
END {
    close(STDOUT) || die "stdout close failed: $!";
}
and I still don't understand it. :(
Can someone explain (maybe with some code examples):
why and when it is needed
how and in what cases my Perl filter can fill up the disk, and so on
what goes wrong without it
etc.?
A lot of systems implement "optimistic" file operations. By this I mean that a call to, for instance, print, which should add some data to a file, can return successfully before the data is actually written to the file, or even before enough space has been reserved on disk for the write to succeed.
In these cases, if your disk is nearly full, all your prints can appear successful, but when it is time to close the file and flush it out to disk, the system realizes that there is no room left. You then get an error when closing the file.
This error means that all the output you thought you saved might actually not have been saved at all (or partially saved). If that was important, your program needs to report an error (or try to correct the situation, or ...).
All this can happen on the STDOUT filehandle if it is connected to a file, e.g. if your script is run as:
perl script.pl > output.txt
If the data you're outputting is important, and you need to know if all of it was indeed written correctly, then you can use the statement you quoted to detect a problem. For example, in your second snippet, the script explicitly calls die if close reports an error; tchrist's boilerplate runs under use autodie, which automatically invokes die if close fails.
(This will not guarantee that the data is stored persistently on disk though, other factors come into play there as well, but it's a good error indication. i.e. if that close fails, you know you have a problem.)
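One way to see this for yourself is to write to a device that rejects every write (this assumes a Linux system, where /dev/full reports "No space left on device" for all writes):

# demo.pl: each print "succeeds" because the data only lands in Perl's buffer;
# the failure only becomes visible when the buffer is flushed at close time.
use strict;
use warnings;

for my $i (1 .. 100) {
    print "line $i\n" or warn "print $i failed: $!\n";
}

END {
    close STDOUT or die "stdout close failed: $!";
}

Running perl demo.pl > /dev/full produces no per-line warnings, but the END block dies with "stdout close failed: No space left on device".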
I believe Mat is mistaken.
Both Perl and the system have buffers. close causes Perl's buffers to be flushed to the system. It does not necessarily cause the system's buffers to be written to disk as Mat claimed. That's what fsync does.
Now, this would happen anyway on exit, but calling close gives you a chance to handle any error it encountered flushing the buffers.
The other thing close does is report earlier errors in attempts by the system to flush its buffers to disk.
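To make the layers concrete, here is a minimal sketch (with a hypothetical file name; sync is only available where the OS supports fsync): print fills Perl's buffer, flush and close push it to the system, and IO::Handle's sync asks the system to push it to disk:

use strict;
use warnings;
use IO::Handle;

open my $fh, '>', 'important.dat' or die "open failed: $!";    # hypothetical file

print {$fh} "critical record\n" or die "print failed: $!";     # into Perl's buffer
$fh->flush                      or die "flush failed: $!";     # Perl's buffer -> system
$fh->sync                       or die "fsync failed: $!";     # system buffers -> disk
close $fh                       or die "close failed: $!";     # reports any late errors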

How can I debug a Perl program that suddenly exits?

I have a Perl program based on IO::Async, and it sometimes just exits after a few hours/days without printing any error message whatsoever. There's nothing in dmesg or /var/log either. STDOUT/STDERR both have autoflush(1) set, so data shouldn't be lost in buffers. It doesn't actually return from IO::Async::Loop->loop_forever; a print I put there just to make sure of that never gets triggered.
Now one way would be to keep peppering the program with more and more prints and hope one of them gives me some clue. Is there a better way to find out what was going on in the program when it exits or silently crashes?
One trick I've used is to run the program under strace or ltrace (or attach to the process using strace). Naturally that was under Linux. Under other operating systems you'd use ktrace or dtrace or whatever is appropriate.
A trick I've used for programs that only exhibit sparse issues over days or weeks, and then only on a handful among hundreds of systems, is to direct the output from my tracer to a FIFO and have a custom program keep only 10K lines in a ring buffer, with handlers on SIGPIPE and SIGHUP that dump the current buffer contents into a file. (It's a simple program, but I don't have a copy handy and I'm not going to re-write it tonight; my copy was written for internal use and is owned by a former employer.)
The ring buffer allows the tracing to run indefinitely without fear of running systems out of disk space; we usually only need a few hundred, or at most a couple of thousand, lines of the trace in such matters.
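The original tool isn't available, but a rough sketch of the idea (entirely hypothetical: it reads the trace from STDIN, keeps the last 10,000 lines, and dumps them to a file on SIGHUP or when the input ends) might look like:

#!/usr/bin/perl
# ring-buffer.pl (hypothetical): keep only the most recent trace lines so a
# long-running trace never fills the disk.
use strict;
use warnings;

my $max = 10_000;                 # how many lines to keep
my @ring;

my $dump = sub {
    open my $out, '>', "trace-dump.$$" or return;
    print {$out} @ring;
    close $out;
};
$SIG{HUP}  = $dump;               # dump the current contents on SIGHUP
$SIG{PIPE} = sub { $dump->(); exit 0 };

while (my $line = <STDIN>) {
    push @ring, $line;
    shift @ring while @ring > $max;
}
$dump->();                        # dump whatever we have at EOF

Something like strace -p <pid> 2>&1 | perl ring-buffer.pl would then keep a bounded window of the trace.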
If you are capturing STDERR, you could start the program as perl -MCarp::Always foo_prog. Carp::Always forces a stack trace on all errors.
A sudden exit without any error message is possibly a SIGPIPE. Traditionally SIGPIPE is used to stop things like the cat command in the following pipeline:
cat file | head -10
It doesn't usually result in anything being printed either by libc or perl to indicate what happened.
Since in an IO::Async-based program you'd not want to silently exit on SIGPIPE, my suggestion would be to put somewhere in the main file of the program a line something like
$SIG{PIPE} = sub { die "Aborting on SIGPIPE\n" };
which will at least alert you to this fact. If instead you use Carp::croak without the \n you might even be lucky enough to get the file/line number of the syswrite, etc... that caused the SIGPIPE.
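Putting that together, a minimal sketch (assuming IO::Async is installed; the actual notifiers and handles are omitted) of making SIGPIPE both fatal and informative, using Carp::confess for a full backtrace:

use strict;
use warnings;
use Carp ();
use IO::Async::Loop;

# Turn the normally silent SIGPIPE into a loud failure with a backtrace, so
# the syswrite/print that triggered it shows up in the output.
$SIG{PIPE} = sub { Carp::confess('Aborting on SIGPIPE') };

my $loop = IO::Async::Loop->new;

# ... set up notifiers, streams, timers, etc. here (omitted) ...

$loop->run;    # or $loop->loop_forever in older code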