My program prints to a file which, from its point of view, is STDOUT.
That is, print "text here"; ends up in a file x.log, while I am also printing to the same file x.log through a file handle, as in print FH1 "text here"; . I notice that when the file-handle print comes first and the STDOUT print follows, the second print can overwrite the first. I would like to understand why this happens.
This makes me suspect a race condition, or that the file-handle print is slower (because it goes through a buffer?) than the STDOUT print. I am not sure whether that is how Perl works. Perl version: 5.22.0.
As far as I understand your program basically looks like this:
open(my $fh,'>','foobar.txt');
print $fh "foo\n";
print "bar\n"; # prints to STDOUT
And then you use it in a way that STDOUT is redirected in the shell to the same file which is already opened in your program:
$ perl test.pl > foobar.txt
This will open two independent file handles to the same file: one within your program and the other within the shell where you start the program. Both file handles manage their own file position for writing, start at position 0 and advance the position after each write.
Since these file handles are independent of each other, they do not care whether other file handles are currently dealing with the same file, no matter whether those handles are inside or outside the program. This means the writes will overwrite each other.
In addition to this there is also internal buffering: each print first writes into an internal buffer and may only later result in a write to the underlying file. When the data are actually written depends on the buffering mode of the file handle, i.e. unbuffered, line-buffered, or block-buffered with a buffer of a specific size. This makes the result somewhat unpredictable.
If you don't want this behavior but still want to write to the same file through multiple file handles, you should use append mode, i.e. open with >> instead of > in both the Perl code and the shell. This makes every write go to the current end of the file instead of to the file position maintained by each handle, so data will not get overwritten. Additionally, you may want to make the file handles unbuffered so that the data end up in the file in the same order as the print statements were executed:
open(my $fh,'>>','foobar.txt');
$fh->autoflush(1); # make $fh unbuffered
$|=1; # make STDOUT unbuffered
print $fh "foo\n";
print "bar\n"; # prints to STDOUT
$ perl test.pl >> foobar.txt
I basically want to reopen STDERR/STDOUT so they write to one logfile with both the stream and the timestamp included on every line. So print STDERR "Hello World" prints STDERR: 20130215123456: Hello World. I don't want to rewrite all my print statements into function calls, also some of the output will be coming from external processes via system() calls anyway which I won't be able to rewrite.
I also need for the output to be placed in the file "live", i.e. not only written when the process completes.
(p.s. I'm not asking particularly for details of how to generate timestamps, just how to redirect to a file and prepend a string)
I've worked out the following code, but it's messy:
my $mode = ">>";
my $file = "outerr.txt";
open(STDOUT, "|-", qq(perl -e 'open(FILE, "$mode", "$file"); while (<>) { print FILE "STDOUT: \$\_"; }'));
open(STDERR, "|-", qq(perl -e 'open(FILE, "$mode", "$file"); while (<>) { print FILE "STDERR: \$\_"; }'));
(The above doesn't add dates, but that should be trivial to add)
I'm looking for a cleaner solution, one that doesn't require quoting Perl code and passing it on the command line, or at least a module that hides some of the complexity. Looking at the code for Capture::Tiny, it doesn't look like it can handle rewriting parts of the output as they are written, though I'm not sure about that. annotate-output sadly only works on an external command; I need this to work on both external commands and ordinary Perl printing.
A child launched via system doesn't write to your STDOUT variable, because it has no access to the variables in your program; it writes directly to the underlying file descriptor. That means mechanisms that run Perl code when you write to a Perl file handle (e.g. tie) won't work.
Write another script that runs your script with STDOUT and STDERR replaced with pipes, reads from those pipes, and prints out the modified output. I suggest using IPC::Run for this, because it saves you from having to use select. You can get away without it if you combine STDOUT and STDERR into one stream.
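A rough sketch of that wrapper, assuming IPC::Run's new_chunker line filter; the script name, log file name, and timestamp format are placeholders:
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;
use IPC::Run qw(run new_chunker);

# Log file name and timestamp format are assumptions; adjust to taste.
open my $log, '>>', 'outerr.txt' or die "Can't open log: $!";
$log->autoflush(1);                       # write "live", not only on exit

sub stamp { scalar localtime }            # placeholder timestamp

# Run the real script; each full line of its STDOUT/STDERR is handed to a
# callback, which prefixes the stream name and timestamp before logging it.
run [ 'perl', 'my_real_script.pl' ],
    '>',  new_chunker, sub { print {$log} 'STDOUT: ', stamp(), ": $_[0]" },
    '2>', new_chunker, sub { print {$log} 'STDERR: ', stamp(), ": $_[0]" }
    or die "child failed: $?";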
In my script I am opening files and writing to them. I found that something is wrong with a file I try to open, even though the file exists, it is not empty, and I am passing the right path to the file handle.
I know that my question might sound weird, but while I was debugging my code I put the following command in my script to check some files:
system ("ls");
Then my script worked well; when it is removed, it no longer works correctly.
my @unique = ("test1","test2");
open(unique_fh,">orfs");
print unique_fh #unique ;
open(ORF,"orfs")or die ("file doesnot exist");
system ("ls");
while(<ORF>){
split ;
}
@neworfs=@_ ;
print @neworfs ;
Perl buffers the output when you print to a file. In other words, it doesn't actually write to the file every time you say print; it saves up a bunch of data and writes it all at once. This is faster.
In your case, you couldn't see anything you had written to the file, because Perl hadn't written anything yet. Adding the system("ls") call, however, caused Perl to write your output first (the interpreter is smart enough to do this, because it thinks you might want to use the system() call to do something with the file you just created).
How do you get around this? You can close the file before you open it again to read it, as choroba suggested. Or you can disable buffering for that file. Put this code just after you open the file:
my $fh = select(unique_fh);   # make unique_fh the currently selected output handle
$| = 1;                       # turn on autoflush for the selected handle
select($fh);                  # restore the previously selected handle (usually STDOUT)
Then anytime you print to the file, it will get written immediately ($| is a special variable that sets the output buffering behavior).
Closing the file first is probably the better idea, although it is possible to have the file open for reading and writing at the same time.
You did not close the filehandle before trying to read from the same file.
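For completeness, a minimal corrected sketch of the script above (writing one value per line is an assumption), closing the write handle before the file is reopened for reading:
use strict;
use warnings;

my @unique = ("test1", "test2");

open my $out, '>', 'orfs' or die "Can't write orfs: $!";
print {$out} "$_\n" for @unique;          # one value per line
close $out;                               # flush and release the file before re-reading it

open my $in, '<', 'orfs' or die "Can't read orfs: $!";
my @neworfs;
while (<$in>) {
    push @neworfs, split;                 # collect the fields explicitly instead of relying on @_
}
close $in;

print "@neworfs\n";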
I have the code:
open(FILE, "<$new_file") or die "Cant't open file \n";
@lines=<FILE>;
close FILE;
open(STDOUT, ">$new_file") or die "Can't open file\n";
$old_fh = select(OUTPUT_HANDLE);
$| = 1;
select($old_fh);
for(@lines){
s/(.*?xsl.*?)xsl/$1xslt/;
print;
}
close(STDOUT);
STDOUT -> autoflush(1);
print "file changed";
After closing STDOUT, the program does not write the last print statement, print "file changed";. Why is this?
Edit: I want the message printed to the console, not to the file.
I suppose it is because print's default filehandle is STDOUT, which at that point is already closed. You could reopen it, or print to another filehandle, for example STDERR:
print STDERR "file changed";
It's because you've closed the filehandle stored in STDOUT, so print can't use it anymore. Generally speaking, opening a new filehandle into one of the predefined handle names isn't a very good idea, because it's bound to lead to confusion. It's much clearer to use lexical filehandles, or just a different name for your output file handle. Yes, you then have to specify the filehandle in your print call, but then you don't have any confusion over what has happened to STDOUT.
A print statement writes the string to STDOUT, which is the default output file handle.
So the statement
print "This is a message";
is the same as
print STDOUT "This is a message";
In your code, you have closed STDOUT and then printed the message, which will not work. Reopen the STDOUT filehandle or do not close it; when the script ends, the file handles are closed automatically. If you need to restore STDOUT afterwards, save a duplicate of it first:
&", STDOUT;">
open(OLDOUT, ">&", \*STDOUT) or die "Can't dup STDOUT: $!";   # save a copy of STDOUT
close STDOUT;
open(STDOUT, ">", $new_file) or die "Can't open file: $!";
...
close(STDOUT);
open(STDOUT, ">&", \*OLDOUT) or die "Can't restore STDOUT: $!";
print "file changed";
You seem to be confused about how file IO operations are done in perl, so I would recommend you read up on that.
What went wrong?
What you are doing is:
Open a file for reading
Read the entire file and close it
Open the same file for overwrite (the original file is truncated), using the STDOUT file handle.
Juggle around the default print handle in order to set autoflush on a file handle which is not even opened in the code you show.
Perform a substitution on all lines and print them
Close STDOUT then print a message when everything is done.
Your biggest mistake is trying to reopen the default output file handle STDOUT. I assume this is because you did not know how print works, i.e. that you can supply a file handle to print to: print FILEHANDLE "text". Or that you did not know that STDOUT is a pre-defined file handle.
Your other errors:
You did not use use strict; use warnings;. No program you write should be without these. They will prevent you from doing bad things, and give you information on errors, and will save you hours of debugging.
You should never "slurp" a file (read the entire file into a variable) unless you really need to, because it is inefficient and slow, and for huge files it will cause your program to run out of memory.
Never reassign the default file handles STDIN, STDOUT, STDERR, unless A) you really need to, B) you know what you are doing.
select sets the default file handle for print; read the documentation. This is rarely something that you need to concern yourself with. The variable $| turns autoflush on (if set to a true value) for the currently selected file handle. So what you did actually accomplished nothing, because OUTPUT_HANDLE is a non-existent file handle. If you had skipped the select statements, it would have set autoflush for STDOUT (but you wouldn't have noticed any difference).
print uses output buffers because that is efficient. I assume you are trying to autoflush because you think your prints get caught in the buffer, which is not true. Generally speaking, this is not something you need to worry about: all print buffers are automatically flushed when the program ends.
For the most part, you do not need to explicitly close file handles. File handles are automatically closed when they go out of scope, or when the program ends.
Using lexical file handles, e.g. open my $fh, ... instead of global, e.g. open FILE, .. is recommended, because of the previous statement, and because it is always a good idea to avoid global variables.
Using three-argument open is recommended: open FILEHANDLE, MODE, FILENAME. Otherwise you risk meta-characters in your file names corrupting your open statement. A small sketch putting these recommendations together is shown below.
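For example (the file name input.txt is made up):
use strict;
use warnings;

my $file = 'input.txt';                                     # hypothetical file name
open my $fh, '<', $file or die "Cannot open '$file': $!";   # lexical handle, three-argument open
while (my $line = <$fh>) {                                  # read line by line instead of slurping
    chomp $line;
    print "read: $line\n";
}
# $fh is closed automatically when it goes out of scope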
The quick fix:
Now, as I said in the comments, this -- or rather, what you intended, because this code is wrong -- is pretty much identical to the idiomatic usage of the -p command line switch:
perl -pi.bak -e 's/(.*?xsl.*?)xsl/$1xslt/' file.txt
This short little snippet actually does all that your program does, but does it much better. Explanation:
The -p switch wraps the code you provide in a while (<>) { } loop and prints each line after your code has been executed.
The -i switch tells perl to do an in-place edit on the file, saving a backup copy as "file.txt.bak".
So, that one-liner is equivalent to a program such as this:
$^I = ".bak"; # turns inplace-edit on
while (<>) { # diamond operator automatically uses STDIN or files from @ARGV
s/(.*?xsl.*?)xsl/$1xslt/;
print;
}
Which is equivalent to this:
my $file = shift; # first argument from @ARGV
open my $fh, "<", $file or die $!;
open my $tmp, ">", "/tmp/foo.bar" or die $!; # not sure where tmpfile is
while (<$fh>) { # read lines from org file
s/(.*?xsl.*?)xsl/$1xslt/;
print $tmp $_; # print line to tmp file
}
rename($file, "$file.bak") or die $!; # save backup
rename("/tmp/foo.bar", $file) or die $!; # overwrite original file
The inplace-edit option actually creates a separate file, then copies it over the original. If you use the backup option, the original file is first backed up. You don't need to know this information, just know that using the -i switch will cause the -p (and -n) option to actually perform changes on your original file.
Using the -i switch with the backup option activated is not required (except on Windows), but recommended. A good idea is to run the one-liner without the option first, so the output is printed to screen instead, and then adding it once you see the output is ok.
The regex
s/(.*?xsl.*?)xsl/$1xslt/;
You search for a string that contains "xsl" twice. The usage of .*? is good in the second case, but not in the first. Any time you find yourself starting a regex with a wildcard string, you're probably doing something wrong. Unless you are trying to capture that part.
In this case, though, you capture it and remove it, only to put it back, which is completely useless. So the first order of business is to take that part out:
s/(xsl.*?)xsl/$1xslt/;
Now, removing something and putting it back is really just a magic trick for not removing it at all. We don't need magic tricks like that, when we can just not remove it in the first place. Using look-around assertions, you can achieve this.
In this case, since you have a variable-length expression and would need a look-behind assertion, you have to use the \K (mnemonic: Keep) escape instead, because variable-length look-behinds are not implemented.
s/xsl.*?\Kxsl/xslt/;
So, since we didn't take anything out, we don't need to put anything back using $1. Now, you may notice, "Hey, if I replace 'xsl' with 'xslt', I don't need to remove 'xsl' at all." Which is true:
s/xsl.*?xsl\K/t/;
You may consider using options for this regex, such as /i, which causes it to ignore case and thus also match strings such as "XSL FOO XSL". Or the /g option which will allow it to perform all possible matches per line, and not just the first match. Read more in perlop.
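For example, with both options applied the substitution becomes:
s/xsl.*?xsl\K/t/gi;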
Conclusion
The finished one-liner is:
perl -pi.bak -e 's/xsl.*?xsl\K/t/' file.txt
I have a command line utility from a third party (it's big and written in Java) that I've been using to help me process some data. This utility expects information in a line delimited file and then outputs processed data to STDOUT.
In my testing phases, I was fine with writing some Perl to create a file full of information to be processed and then sending that file to this third party utility, but as I'm nearing putting this code in production, I'd really prefer to just pipe data to this utility directly instead of first writing that data to a file as this would save me the overhead of having to write unneeded information to disk. I recently asked on this board how I could do this in Unix, but have since realized that it would be incredibly more convenient to actually run it directly out of a Perl module. Perhaps something like:
system(bin/someapp do-action --option1 some_value --input $piped_in_data)
Currently I call the utility as follows:
bin/someapp do-action --option1 some_value --input some_file
Basically, what I want is to write all my data either to a variable or to STDOUT and then to pipe it to the Java app through a system call in the SAME Perl script or module. This would make my code a lot more fluid. Without it, I'd wind up needing to write a Perl script which calls a bash file half way through which in turn would need to call another Perl script to prep data. If at all possible I'd love to just stay in Perl the whole way through. Any ideas?
If I am reading your question correctly, you want to spawn a process and be able both to write to its stdin and to read from its stdout. If that is the case, then IPC::Open2 is exactly what you need. (Also see IPC::Open3 if you also need to read from the process' stderr.)
Here is some sample code. I have marked the areas you will have to change.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
# Sample data -- ignore this.
my @words = qw(the quick brown fox jumped over the lazy dog);
# Automatically reap child processes. This is important when forking.
$SIG{'CHLD'} = 'IGNORE';
# Spawn the external process here. Change this to the process you need.
open2(*READER, *WRITER, "wc -c") or die "wc -c: $!";
# Fork into a child process. The child process will write the data, while the
# parent process reads data back from the process. We need to fork in case
# the process' output buffer fills up and it hangs waiting for someone to read
# its output. This could cause a deadlock.
my $pid;
defined($pid = fork()) or die "fork: $!";
if (!$pid) {
# This is the child.
# Close handle to process' stdout; the child doesn't need it.
close READER;
# Write out some data. Change this to print out your data.
print WRITER $words[rand(@words)], " " for (1..100000);
# Then close the handle to the process' stdin.
close WRITER;
# Terminate the child.
exit;
}
# Parent closes its handle to the process' stdin immediately! As long as one
# process has an open handle, the program on the receiving end of the data will
# never see EOF and may continue waiting.
close WRITER;
# Read in data from the process. Change this to whatever you need to do to
# process the incoming data.
print "READ: $_" while (<READER>);
# Close the handle to the process' stdin. After this call, the process should
# be finished executing and will terminate on its own.
close READER;
If it only accepts files, let it open "/proc/self/fd/0", which is the same as its STDIN. For the rest, see cdhowie's answer.
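A rough sketch of how that could look, combined with IPC::Open2; the command and options are the placeholders from the question, and /proc/self/fd/0 is Linux-specific:
use strict;
use warnings;
use IPC::Open2;

# The child is told to read its "input file" from /proc/self/fd/0,
# which is really the pipe attached to its stdin by open2.
my $pid = open2(my $reader, my $writer,
    'bin/someapp', 'do-action',
    '--option1', 'some_value',
    '--input', '/proc/self/fd/0');

print {$writer} "one record per line\n";   # write the data you would have put in the file
close $writer;                             # the child sees EOF on its "input file"

print while <$reader>;                     # read the processed output back
close $reader;
waitpid $pid, 0;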
If all you want to do is pipe the STDOUT from your program into your other program's STDIN, you can do this via the standard Perl open command:
open (CMD, "|$command") or die qq(Couldn't execute $command for piping);
Then, all you have to do to send data to this command is to use the print statement:
print CMD $dataToCommand;
And, you finally close your pipe with the close statement:
close (CMD);
PERL HINT
Perl has a command called perldoc which can give you the documentation of any Perl function or Perl module installed on your system. To get more information about the open command, type:
$ perldoc -f open
The -f option says that this is a built-in Perl function.
If you're doing what cdhowie said in his answer, (you're spawning a process, then reading and writing to that process), you will need IPC::Open2. To get information about the IPC::Open2 module, type:
$ perldoc IPC::Open2
Possible Duplicate:
Can I send STDOUT and STDERR to a log file and also to the screen in Win32 Perl?
I would like to redirect STDOUT and STDERR to temp_log, and then append it to logfile.txt after the process is complete. The process runs for a full 20 minutes, so I would like to flock temp_log while the process runs.
STDOUT and STDERR are just filehandles that are initialized to your program's standard output and error, but they can be reassigned at any time for any reason. For this purpose, you will want to save the original settings so you can restore them.
sub process_that_writes_to_temp_log {
# save the original settings; the second example below uses local typeglobs instead.
*OLD_STDOUT = *STDOUT;
*OLD_STDERR = *STDERR;
# reassign STDOUT, STDERR
open my $log_fh, '>>', '/logdirectory/the-log-file' or die "Can't open log file: $!";
*STDOUT = $log_fh;
*STDERR = $log_fh;
# ...
# run some other code. STDOUT/STDERR are now redirected to log file
# ...
# done, restore STDOUT/STDERR
*STDOUT = *OLD_STDOUT;
*STDERR = *OLD_STDERR;
}
Depending on how your program is laid out, saving and restoring the old STDOUT/STDERR settings can be done automatically using local.
sub routine_that_writes_to_log {
open my $log_fh, '>>', 'log-file' or die "Can't open log file: $!";
local *STDOUT = $log_fh;
local *STDERR = $log_fh;
# now STDOUT,STDERR write to log file until this subroutine ends.
# ...
}
The redirection part of your question is already answered in Can I send STDOUT and STDERR to a log file and also to the screen in Win32 Perl?. Even though it says "Win32", the answers are for portable Perl.
The other part, the temporary file, is a bit more tricky until you've been burned by it at least once.
Name the temporary file with something that should be unique for this run, for example by including the PID or the program start time. You can do that on your own with something like "$filename.$$", but you can also use File::Temp.
If you then want to append the temporary log to an existing log, you probably want to make another temp file to hold the entire stitched result. Once you have the whole thing stitched together, move it into place. That way you don't have competing processes changing the same file as they all try to append their temp logs to the main one.
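A rough sketch of that stitching step, assuming the main log is logfile.txt and the per-run temp log was named with the PID:
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Copy qw(move);

my $main_log = 'logfile.txt';
my $temp_log = "temp_log.$$";             # per-run temp log, unique via the PID

# Build the combined log in a fresh temp file, then move it into place,
# so nothing ever appends to a half-written main log.
my ($out, $combined) = tempfile('logfile_XXXXXX', DIR => '.', UNLINK => 0);
for my $part ($main_log, $temp_log) {
    open my $in, '<', $part or next;      # the main log may not exist yet
    print {$out} $_ while <$in>;
    close $in;
}
close $out or die "close: $!";

move($combined, $main_log) or die "move: $!";
unlink $temp_log;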
Even better, use something like Log::Log4perl since it handles all the details for you.