I have two scripts. Each opens the same file through IO::Handle for appending (">>filename"), and then I call $io->autoflush(1);
The question is: will this work correctly if both scripts run at the same time, or could some lines be lost while appending?
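You'll want to use syswrite, like the Log::Log4perl docs suggest for this sort of situation. syswrite bypasses Perl's buffering and hands each record to the operating system in a single write call, and on a handle opened for appending each of those writes is positioned at the current end of file, even when other processes are appending to the same file.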
That will not work, as append mode is more like a shortcut for "open the file, don't truncate it, and after opening, seek to the end of the file". So yes, you will lose lines.
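The two answers above disagree, so here is a minimal sketch of the syswrite approach for comparison. The helper name append_line and the file name shared.log are made up for illustration, and the advisory flock is an extra precaution that the answers do not mention; it only helps if every writer takes the same lock.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: append one line to a log file shared by several scripts.
sub append_line {
    my ($path, $line) = @_;

    # '>>' opens the file in append mode without truncating it.
    open(my $fh, '>>', $path) or die "Can't open $path: $!";

    # Advisory lock (an assumption of this sketch, not part of the answers).
    flock($fh, LOCK_EX) or die "Can't lock $path: $!";

    # syswrite bypasses Perl's buffering and issues one write call.
    my $data    = "$line\n";
    my $written = syswrite($fh, $data);
    die "Write to $path failed: $!" unless defined $written;
    die "Short write to $path" unless $written == length $data;

    # Closing the handle also releases the lock.
    close($fh) or die "Can't close $path: $!";
    return $written;
}

append_line('shared.log', "writer $$: hello");
```

Running this from two scripts at once should leave one complete line per call in shared.log, assuming both scripts use the same helper.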
Say I want to open a remote file in Emacs over Tramp. If the file is huge, this could take a long time.
Can I tell Emacs/Tramp to retrieve and display only part of it, such as the head?
You can open the directory that the file is in, and type M-! head my_file. The command gets executed over SSH.
The function insert-file-contents takes optional arguments that specify which portion of the file to insert, and from a quick glance it seems like Tramp tries to extract only the parts it needs. You'd need to write an interactive function on top of that, though.
I'm using Perl 5.16.1 from Strawberry in a Windows environment. I have a Perl script reading very large text files. The smallest text file is 30M. When reading files that do not have a line feed at the end of the very last line, I get very peculiar results. It does not happen every time, but when it does, it's as though the script is reading cached data from the I/O system for another file that I previously opened with the same script. If I manually edit the file and add a line feed, it's fine. I added a line counter and some inline code to display what happens near the end of the file, to make sure I wasn't going nuts. To try to fix it, I added this to my script:
open (SS_LOG, ">>", $SSFile) or die "Can't open $SSFile\r\n $!\r\n";
print SS_LOG "\r\n";
close SS_LOG;
but it does nothing. The file stays the same size. I'm also storing data in large arrays.
Has anyone else seen anything like this?
Try unbuffering your output:
SS_LOG->autoflush(1);
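As a separate check on the symptom described in the question, here is a rough sketch that looks at the last byte of a file and appends a line feed only when one is missing. The helper name ensure_trailing_newline is invented for this example, and the sketch works on raw bytes, so it writes a bare "\n" rather than "\r\n".

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical helper: make sure the file at $path ends with a line feed.
# Returns 1 if a line feed was appended, 0 if nothing had to change.
sub ensure_trailing_newline {
    my ($path) = @_;
    my $size = -s $path;
    return 0 unless $size;                 # missing or empty file

    open(my $fh, '+<', $path) or die "Can't open $path: $!";
    binmode($fh);                          # raw bytes, no CRLF translation

    # Read the final byte of the file.
    seek($fh, -1, SEEK_END) or die "Can't seek in $path: $!";
    my $got = read($fh, my $last, 1);
    die "Can't read $path: $!" unless defined $got;

    my $added = 0;
    if ($last ne "\n") {
        # Reposition before switching from reading to writing.
        seek($fh, 0, SEEK_END) or die "Can't seek in $path: $!";
        print {$fh} "\n";
        $added = 1;
    }
    close($fh) or die "Can't close $path: $!";   # close flushes the write
    return $added;
}
```

Calling ensure_trailing_newline($SSFile) before reading the file would show, through its return value, whether the file really lacked the final line feed.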
For example,
#!/usr/bin/perl
open FILE1, '>out/existing_file1.txt';
open FILE2, '>out/existing_file2.txt';
open FILE3, '>out/existing_file3.txt';
versus
#!/usr/bin/perl
if (-d 'out') {
system('rm -f out/*');
}
open FILE1, '>out/new_file1.txt';
open FILE2, '>out/new_file2.txt';
open FILE3, '>out/new_file3.txt';
In the first example, we clobber the files (truncate them to zero length). In the second, we clean the directory and then create new files.
The second method (where we clean the directory) seems redundant and unnecessary. The only advantage to doing this (in my mind) is that it resets permissions, as well as the change date.
Which is considered the best practice? (I suspect the question is pedantic, and the first example is more common.)
Edit: The reason I ask is because I have a script that will parse data and write output files to a directory - each time with the same filename/path. This script will be run many times, and I'm curious whether at the start of the script I should partially clean the directory (of the files I am writing to) or just let the file handle '>' clobber the files for me, and take no extra measures myself.
Other than the permissions issue you mentioned, the only significant difference between the two methods is if another process has one of the output files open while you do this. If you remove the file and then recreate it, the other process will continue to see the data in the original file. If you clobber the file, the other process will see the file contents change immediately (although if it's using buffered I/O, it may not notice it until it needs to refill the buffer).
Removing the files will also update the modification time of the containing directory.
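The difference described above can be seen in a small experiment. This is only a sketch: the helper names write_file and reader_view and the file name demo.txt are made up, and it assumes a Unix-like system where a file can be removed while another handle still has it open (on Windows the unlink would likely fail).

```perl
use strict;
use warnings;

# Write $text to $path, truncating any existing file (what '>' does).
sub write_file {
    my ($path, $text) = @_;
    open(my $out, '>', $path) or die "Can't write $path: $!";
    print {$out} $text;
    close($out) or die "Can't close $path: $!";
}

# Returns what a reader that opened $path before the rewrite sees afterwards.
sub reader_view {
    my ($path, $remove_first) = @_;
    write_file($path, "old contents\n");

    # This handle plays the role of "another process" holding the file open.
    open(my $reader, '<', $path) or die "Can't read $path: $!";

    if ($remove_first) {
        unlink($path) or die "Can't remove $path: $!";
    }
    write_file($path, "new\n");

    local $/;                          # slurp mode: read everything at once
    my $seen = <$reader>;
    close($reader);
    return $seen;
}

print "removed first: ", reader_view('demo.txt', 1);   # prints "old contents"
print "clobbered:     ", reader_view('demo.txt', 0);   # prints "new"
unlink 'demo.txt';
```

With removal, the reader keeps the original file; with a plain '>' open, the reader sees the replaced contents because both handles refer to the same file.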
I could not find this answer in the man or info pages, nor with a search here or on Google. I have a file which is, in essence, a text file, but it somehow got screwed up upon saving. (I think there are a few strange bytes at the front of the file accidentally.)
I am able to open the file, and it makes sense, using head or cat, but not using any sort of editor.
In the end, all I wish to do is open the file in emacs, delete the "messy" characters, and save it once cleaned up. The file, however, is huge, so I need something powerful like emacs to be able to open it.
Otherwise, I suppose I can try to create a script to read this in line by line, forcing the script to read it in text format, then write it. But I wanted something quick, since I won't be doing this over & over.
Thanks!
Mike
perl -i.bk -pe 's/[^[:ascii:]]//g;' file
Found this perl one liner here: http://www.perlmonks.org/?node_id=619792
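To see what that substitution does before running it on the real file, here is a small sketch applying the same regex to a string. The helper name strip_non_ascii and the sample bytes are assumptions for illustration. In the one-liner, -p loops over the file line by line and -i.bk edits in place while keeping a backup with the .bk suffix.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $bytes with every non-ASCII byte removed.
sub strip_non_ascii {
    my ($bytes) = @_;
    (my $clean = $bytes) =~ s/[^[:ascii:]]//g;
    return $clean;
}

# Example: three stray bytes (here a UTF-8 byte order mark) in front of text.
my $sample = "\xEF\xBB\xBFfirst line\n";
print strip_non_ascii($sample);    # prints "first line"
```

Note that control characters below 128 count as ASCII, so this substitution leaves them in place; if the stray bytes are of that kind, the character class would need adjusting.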
Try M-x find-file-literally in Emacs.
You could edit the file using hexl-mode, which lets you edit the file in hexadecimal. That would let you see precisely what those offending characters are, and remove them.
It sounds like you either got a different line ending in the file (eg: carriage returns on a *nix system) or it got saved in an unexpected encoding.
You could use strings to grab the "printable characters" in the file. You might have to play with the --encoding option, though I have only ever used strings to grab ASCII strings from executable files.
Not really sure if this is possible, but I am running this on Terminal:
script -q \/my\/directory\/\/$outfile \.\/lexparser.csh $file
Explanation
I run this from a Perl script. The directory plus $outfile is where I am saving the output of the Terminal command. The \.\/lexparser.csh $file part just calls that script to work on the input file, $file.
Problem
I passed -q because I didn't want the unnecessary extra output saved to the file. The file is big, about 30 thousand lines of text. It has been running for some time now, which was expected.
Question
I would like to check that everything is going smoothly. The output file shows up in Finder, but I'm afraid that if I click on it, it will ruin the output. How can I check the progress (possibly by looking at the current text file) without disrupting the process?
Thanks for your time, let me know if the question is unclear.
Open a new Terminal, navigate to the output directory, and:
tail -f <output_file>
You will continue to see new appends to the file without interruption to any writing process. Just leave the Terminal open with the tail going, and you can watch it all day long. Grab some popcorn.
In addition to tail, you can also learn about tee. The point of tee is to output to a file while also outputting to STDOUT in your terminal. Best of both worlds! Well, some good aspects of two possible worlds.
You could tail the file via the command line which shouldn't cause problems.
Additionally, you could have the program print to stderr as well as stdout, redirect stdout to the file, and let stderr through so it can tell you its progress. Though that is more of a 20/20 hindsight solution.
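The stderr idea could look roughly like this in Perl. It is a sketch only: the helper name process_lines, the script name progress.pl and the reporting interval of 1000 lines are all invented for the example, and the loop simply copies lines where the real work would go.

```perl
use strict;
use warnings;

# Hypothetical helper: send results to one handle and progress notes to another.
# Returns the number of lines processed.
sub process_lines {
    my ($in, $out, $progress, $every) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        print {$out} $line;            # the real output (stand-in for actual work)
        $count++;
        # Report every $every lines on the progress handle only.
        print {$progress} "processed $count lines\n" if $count % $every == 0;
    }
    return $count;
}

# Only runs when an input file is named on the command line.
if (@ARGV) {
    open(my $in, '<', $ARGV[0]) or die "Can't open $ARGV[0]: $!";
    process_lines($in, \*STDOUT, \*STDERR, 1000);
    close($in);
}
```

Invoked as perl progress.pl input.txt > output.txt, the results go to output.txt while the progress messages still appear in the Terminal, since only stdout is redirected.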