Perl to read each line - perl

I'm beginner for Perl script. Below script is to check if file modified time is greater than 600 seconds. I read filename from filelist.txt.
When I tried to print file modified time, it shows modified time as blank.
Could you help me where I'm wrong?
filelist.txt
a.sh
file.txt
Perl script
#!/usr/bin/perl
my $filename = '/root/filelist.txt';
open(INFO, $filename) or die("Could not open file.");
foreach $eachfile (<INFO>) {
my $file="/root/$eachfile";
my $file_timestamp = (stat $file)[9];
my $timestamp = localtime($epoch_timestamp);
my $startTime = time();
my $fm = $startTime - $file_timestamp;
print "The file modified time is = $file_timestamp\n";
if ($rm > 600) {
print "$file modified time is greater than 600 seconds";
}
else {
print "$file modified time is less than 600 seconds\n";
}
}

You didn't include use strict; or use warnings; which is your downfall.
You set $fm; you test $rm. These are not the same variable. Using strictures and warnings would have pointed out the error of your ways. Expert use them routinely to make sure they aren't making silly mistakes. Beginners should use them too to make sure they aren't making silly mistakes either.
This revised script:
Uses use strict; and use warnings;
Makes sure each variable is defined with my
Doesn't contain $epoch_timestamp or $timestamp
Uses lexical file handles ($info) and the three argument form of open
Closes the file
Includes newlines at the ends of messages
Chomps the file name read from the file
Prints the file name so it can be seen that the chomp is doing its stuff
Locates the files in the current directory instead of /root
Avoids parentheses around the argument to die
Includes the file name in the argument to die
Could be optimized by moving my $startTime = time; outside the loop
Uses $fm in the test
Could be improved if the greater than/less than comments included the equals case correctly
Code:
#!/usr/bin/perl
use strict;
use warnings;
my $filename = './filelist.txt';
open my $info, '<', $filename or die "Could not open file $filename";
foreach my $eachfile (<$info>)
{
chomp $eachfile;
my $file="./$eachfile";
print "[$file]\n";
my $file_timestamp = (stat $file)[9];
my $startTime = time();
my $fm = $startTime - $file_timestamp;
print "The file modified time is = $file_timestamp\n";
if ($fm > 600) {
print "$file modified time is greater than 600 seconds\n";
}
else {
print "$file modified time is less than 600 seconds\n";
}
}
close $info;
Tangentially: if you're working in /root, the chances are you are running as user root. I trust you have good backups. Experimenting in programming as root is a dangerous game. A simple mistake can wipe out the entire computer, doing far more damage than if you were running as a mere mortal user (rather than the god-like super user).
Strong recommendation: Don't learn to program as root!
If you ignore this advice, make sure you have good backups, and know how to recover from them.
(FWIW: I run my Mac as a non-root user; I even run system upgrades as a non-root user. I do occasionally use root privileges via sudo or equivalents, but I never login as root. I have no need to do so. And I minimize the amount of time I spend as root to minimize the chance of doing damage. I've been working on Unix systems for 30 years; I haven't had a root-privileged accident in over 25 years, because I seldom work as root.)

What others have run into before, is that reading the filename from the file INFO you end up with a newline character at the end of the string and then trying to open /root/file1<cr> doesn't work because that file doesn't exist.
Try calling:
chomp $eachfile
before constructing $file

Related

How to match and find common based on substring from two files?

I have two files. File1 contains list of email addresses. File2 contains list of domains.
I want to filter out all the email addresses after matching exact domain using Perl script.
I am using below code, but I don't get correct result.
#!/usr/bin/perl
#use strict;
#use warnings;
use feature 'say';
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
my $match = open(MATCH, ">matching_domain") || die;
open(my $data1, '<', $file1) or die "Could not open '$file1' $!\n";
my #wrd = <$data1>;
chomp #wrd;
# loop on the fiile to be searched
open(my $data2, '<', $file2) or die "Could not open '$file2' $!\n";
while(my $line = <$data2>) {
chomp $line;
foreach (#wrd) {
if($line =~ /\#$_$/) {
print MATCH "$line\n";
}
}
}
File1
abc#1gmail.com.au
abc#gmail.com
abc#gmail.com1
abc#2outlook.com2
abc#outlook.com1
abc#yahoo.com
abc#yahooo1.com
abc#yahooo.com
File2
yahoo.com
gmail.com
Expected output
abc#gmail.com
abc#yahoo.com
First off, since you seem to be on *nix, you might want to check out grep -f, which can take search patterns from a given file. I'm no expert in grep, but I would try the file and "match whole words" and this should be fairly easy.
Second: Your Perl code can be improved, but it works as expected. If you put the emails and domains in the files as indicated by your code. It may be that you have mixed the files up.
If I run your code, fixing only the paths, and keeping the domains in file1, it does create the file matching_domain and it contains your expected output:
abc#gmail.com
abc#yahoo.com
So I don't know what you think your problem is (because you did not say). Maybe you were expecting it to print output to the terminal. Either way, it does work, but there are things to fix.
#use strict;
#use warnings;
It is a huge mistake to remove these two. Biggest mistake you will ever do while coding Perl. It will not remove your errors, just hide them. You will spend 10 times as much time bug fixing. Uncomment this as your first thing you do to fix this.
use feature 'say';
You never use this. You could for example replace print MATCH "$line\n" with say MATCH $line, which is slightly more concise.
my $file1 = "/home/user/domain_file" or die " FIle not found\n";
my $file2 = "/home/user/email_address_file" or die " FIle not found\n";
This is very incorrect. You are placing a condition on the creation of a variable. If the condition fails, does the variable exist? Don't do this. I assume this is to check if the file exists, but that is not what this does. To check if a file exists, you can use -e, documented as perldoc "-X" (various file tests).
Furthermore, a statement in the form of a string, "/home/user..." is TRUE ("truthy"), as far as Perl conditions are concerned. It is only false if it is "0" (zero), "" (empty) or undef (undefined). So your or clause will never be executed. E.g. "foo" or die will never die.
Lastly, this test is quite meaningless, as you will be testing this in your open statement later on anyway. If the file does not exist, the open will fail and your program will die.
my $match = open(MATCH, ">matching_domain") || die;
This is also very incorrect. First off, you never use the $match variable. Secondly, I bet it does not contain what you think it does. (it contains a boolean which states whether open was successful or not, see perldoc -f open) Thirdly, again, don't put conditions on my declarations of variables, it is a bad idea.
What this statement really means is that $match will contain either the return value of the open, or the return value of die. This should probably be simply:
open my $match, ">", "matching_domain" or die "Cannot open '$match': $!;
Also, use the three argument open with explicit open MODE, and use lexical file handles, like you have done elsewhere.
And one more thing on top of all the stuff I've already badgered you with: I don't recommend hard coding output files for small programs like this. If you want to redirect the output, use shell redirection: perl foo.pl > output.txt. I think this is what has prompted you to think something is wrong with your code: You don't see the output.
Other than that, your code is fine, as near as I can tell. You may want to chomp the lines from the domain file, but it should not matter. Also remember that indentation is a good thing, and it helps you read your code. I mentioned this in a comment, but it was removed for some reason. It is important though.
Good luck!
This assumes that the lines labeled File1 are in the file pointed to by $file1 and the lines labeled File2 are in the file pointed to by $file2.
You have your variables swapped. You want to match what is in $line against $_, not the other way around:
# loop on the file to be searched
open( my $data2, '<', $file2 ) or die "Could not open '$file2' $!\n";
while ( my $line = <$data2> ) {
chomp $line;
foreach (#wrd) {
if (/\#$line$/) {
print MATCH "$_\n";
}
}
}
You should un-comment the warnings and strict lines:
use strict;
use warnings;
warnings shows you that the or die checks are not really working the way you intended in the file name assignment statements. Just use :
my $file1 = "/home/user/domain_file";
my $file2 = "/home/user/email_address_file";
You are already doing the checks where they belong (on open).

Limitations of the Perl Tie::File module

I tried using the Tie:File module to write a text file which should be containing 1 billion lines, but it throws an error after writing 16 million
"Out of memory!"
"Callback called exit at C:/perl/lib/Tie/File.pm line 979 during global destruction."
this is the code I tried with.
use Tie::File;
tie #array, 'Tie::File', "Out.txt";
$start = time();
for ($i = 0; $i <= 15000000; $i++) {
$array[$i].= "$i,";
}
$end = time();
print("time taken: ", $end - $start, "seconds");
untie #array;
I don't know why it throws an error. Any solutions to overcome this? It also took about 55 minutes to write 16 million records and it throws error! Is this usual time it takes to write?
The Tie:File module is known to be quite slow, and it is best used where the advantage of having random access to the lines of the file outweighs the poor performance.
But this isn't a problem with the module, it is a limitation of Perl. Or, more accurately, a limitation of your computer system. If you take the module out of the situation and just try to create an ordinary array with 1,000,000,000 elements then Perl will die with an Out of memory! error. The limit for my 32-bit build of Perl 5 version 20 is around 30 million. For 64-bit builds it will be substantially more.
Tie:File doesn't keep the whole file in memory but pages it in and out to save space, so it can handle very large files. Just not that large!
In this case you don't have any need of the advantages of Tie:File, and you should just write the data sequentially to the file. Something like this
use strict;
use warnings;
use 5.010;
use autodie;
open my $fh, '>', 'Out.txt';
my $time = time;
for my $i (0 .. 15_000_000) {
print $fh "$i,\n";
}
$time = time - $time;
printf "Time taken: %d seconds\n", $time;
This program ran in seven seconds on my system.
Please also note use strict and use warnings at the start of the program. This is essential for any Perl program, and will quickly reveal many simple problems that you would otherwise overlook. With use strict in place, each variable must be declared with my as close as possible to the first point of use.

Recursive directory traversal in perl

I intend to recursively traverse a directory containing this piece of perl script.
The idea is to traverse all directories whose parent directory contains the perl script and list all files path into a single array variable. Then return the list.
Here comes the error msg:
readdir() attempted on invalid dirhandle $DIR at xxx
closedir() attempted on invalid dirhandle $DIR at xxx
Code is attached for reference, Thank you in advance.
use strict;
use warnings;
use Cwd;
our #childfile = ();
sub recursive_dir{
my $current_dir = $_[0]; # a parameter
opendir(my $DIR,$current_dir) or die "Fail to open current directory,error: $!";
while(my $contents = readdir($DIR)){
next if ($contents =~ m/^\./); # filter out "." and ".."
#if-else clause separate dirs from files
if(-d "$contents"){
#print getcwd;
#print $contents;
#closedir($DIR);
recursive_dir(getcwd."/$contents");
print getcwd."/$contents";
}
else{
if($contents =~ /(?<!\.pl)$/){
push(#childfile,$contents);
}
}
}
closedir($DIR);
#print #childfile;
return #childfile;
}
recursive_dir(getcwd);
Please tell us if this is homework? You are welcome to ask for help with assignments, but it changes the sort of answer you should be given.
You are relying on getcwd to give you the current directory that you are processing, yet you never change the current working directory so your program will loop endlessly and eventually run out of memory. You should simply use $current_dir instead.
I don't believe that those error messages can be produced by the program you show. Your code checks whether opendir has succeeded and the program dies unless $DIR is valid, so the subsequent readdir and closedir must be using a valid handle.
Some other points:
Comments like # a parameter are ridiculous and only serve to clutter your code
Upper-case letters are generally reserved for global identifiers like package names. And $dir is a poor name for a directory handle, as it could also mean the directory name or the directory path. Use $dir_handle or $dh
It is crazy to use a negative look-behind just to check that a file name doesn't end with .pl. Just use push #childfile, $contents unless $contents =~ /\.pl$/
You never use the return value from your subroutine, so it is wasteful of memory to return what could be an enormous array from every call. #childfile is accessible throughout the program so you can just access it directly from anywhere
Don't put scalar variables inside double quotes. It simply forces the value to a string, which is probably unnecessary and may cause arcane bugs. Use just -d $contents
You probably want to ignore symbolic links, as otherwise you could be looping endlessly. You should change else { ... } to elsif (-f $contents) { ... }

File truncated, when opened in Perl

Im new to perl, so sorry if this is obvious, but i looked up how to open a file, and use the flags, but for the life of me they dont seem to work right I narrowed it down to these lines of code.
if ($flag eq "T"){
open xFile, ">" , "$lUsername\\$openFile";
}
else
{
open xFile, ">>", "$lUsername\\$openFile";
}
Both of these methods seem to delete the contents of my file. I also checked if the flag is formatted correctly and it is, i know for a fact ive gone down both conditions.
EDIT: codepaste of a larger portion of my code http://codepaste.net/n52sma
New to Perl? I hope you're using use strict and use warnings.
As other's have stated, you should be using a test to make sure your file is open. However, that's not really the problem here. In fact, I used your code, and it seems to work fine for me. Maybe you should try printing some debugging messages to see if this is doing what you think it's doing:
use strict;
use warnings;
use autodie; #Will stop your program if the "open" doesn't work.
my $lUsername = "ABaker";
my $openFile = "somefile.txt";
if ($flag eq "T") {
print qq(DEBUG: Flag = "$flag": Deleting file "$lUsername/$openFile");
open xFile, ">" , "$lUsername/$openFile";
}
else {
print qq(DEBUG: Flag = "$flag": Appending file "$lUsername/$openFile");
open xFile, ">>", "$lUsername/$openFile";
}
You want to use strict and warnings in order to make sure you're not having issues with variable names. The use strict forces you to declare your variables first. For example, are you setting $Flag, but then using $flag? Maybe $flag is set the first time through, but you're setting $Flag the second time through.
Anyway, the DEBUG: statements will give you a better idea of what your error could be.
By the way, in Perl, you're checking if $flag is set to T and not t. If you want to test against both t and T, test whether uc $flag eq 'T' and not just $flag eq 'T'.
#Ukemi
I reformated to comply with use strict, i also made print statements to make sure i was trunctating when i want to, and not when i dont. It still is deleting the file. Although now sometimes its simply not writing, im going to give a larger portion of my code in a link, id really appreciate it if you gave it a once over.
Are you seeing it say Truncating, but the file is empty? Are you sure the file already existed? There's a reason why I put the flag and everything in my debug statements. The more you print, the more you know. Try the following section of code:
$file = "lUsername/$openFile" #Use forward slashes vs. back slashes.
if ($flag eq "T") {
print qq(Flag = "$flag". Truncating file "$file"\n);
open $File , '>', $file
or die qq(Unable to open file "$file" for writing: $!\n);
}
else {
print qq(Flag = "$flag". Appending to file "$file"\n);
if (not -e $file) {
print qq(File "$file" does not exist. Will create it\n");
}
open $File , '>>', $file
or die qq(Unable to open file "$file" for appending: $!\n);
}
Note I'm printing out the flag and the name of the file in quotes. This will allow me to see if there are any hidden characters in my file name.
I'm using the qq(...) method to quote strings, so I can use the quotation marks in my print statements.
Also note I'm checking for the existence of the file when I truncate. This way, I make sure the file actually exists.
This should point out any possible errors in your logic. The other thing you can do is to stop your program when you finish writing out the file and verify that the file was written out as expected.
print "Write to file now:\n";
my $writeToFile = <>;
printf $File "$writeToFile";
close $File;
print "DEBUG: Temporary stop. Examine file\n";
<STDIN>; #DEBUG:
Now, if you see it saying it's appending to the file, and the file exists, and you still see the file being overwritten, we'll know the problem lies in your actual open xFile, ">>" $file statement.
You should use the three-argument-version of open, lexical filehandles and check wether there might have been an error:
# Writing to file (clobbering it if it exists)
open my $file , '>', $filename
or die "Unable to write to file '$filename': $!";
# Appending to file
open my $file , '>>', $filename
or die "Unable to append to file '$filename': $!";
>> does not clobber or truncate. Either you ended up in the "then" clause when you expected to be in the "else" clause, or the problem is elsewhere.
To check what $flag contains:
use Data::Dumper;
local $Data::Dumper::Useqq = 1;
print(Dumper($flag));
For your reference I have mentioned some basic file handling techniques below.
open FILE, "filename.txt" or die $!;
The command above will associate the FILE filehandle with the file filename.txt. You can use the filehandle to read from the file. If the file doesn't exist - or you cannot read it for any other reason - then the script will die with the appropriate error message stored in the $! variable.
open FILEHANDLE, MODE, EXPR
The available modes are the following:
read < #this mode will read the file
write > # this mode will create the new file. If the file already exists it will truncate and overwrite.
append >> #this will append the contents if the file already exists,else it will create new one.
if you have confusion on this, you can use the module called File::Slurp;
I have mentioned the sample codes using File::Slurp module.
use strict;
use File::Slurp;
my $read_mode=read_file("test.txt"); #to read file contents
write_file("test2.txt",$read_mode); #to write file
my #all_files=read_dir("/home/desktop",keep_dot_dot=>0); #read a dir
write_file("test2.txt",{append=>1},"#all_files"); #Append mode

Capturing command-line output on win32 that hasn't been flushed yet

(Context: I'm trying to monitor a long-running process from a Perl CGI script. It backs up an MSSQL database and then 7-zips it. So far, the backup part (using WITH STATS=1) outputs to a file, which I can have the browser look at, refreshing every few seconds, and it works.)
I'm trying to use 7zip's command-line utility but capture the progress bar to a file. Unfortunately, unlike SQL backups, where every time another percent is done it outputs another line, 7zip rewinds its output before outputting the new progress data, so that it looks nicer if you're just using it normally on the command-line. The reason this is unfortunate is that normal redirects using >, 1>, and 2> only create a blank file, and no output ever appears in it, except for >, which has no output until the job is done, which isn't very useful for a progress bar.
How can I capture this kind of output, either by having every change in % somehow be appended to a logfile (so I can use my existing method of logfile monitoring) just using command-line trickery (no Perl), or by using some Perl code to capture it directly after calling system()?
If you need to capture the output all at once then this is the code you want:
$var=`echo cmd`;
If you want to read the output line by line then you need this code:
#! perl -slw
use strict;
use threads qw[ yield async ];
use threads::shared;
my( $cmd, $file ) = #ARGV;
my $done : shared = 0;
my #lines : shared;
async {
my $pid = open my $CMD, "$cmd |" or die "$cmd : $!";
open my $fh, '>', $file or die "$file : $!";
while( <$CMD> ) {
chomp;
print $fh $_; ## output to the file
push #lines, $_; ## and push it to a shared array
}
$done = 1;
}->detach;
my $n = 0;
while( !$done ) {
if( #lines ) { ## lines to be processed
print pop #lines; ## process them
}
else {
## Else nothing to do but wait.
yield;
}
}
Another option is using Windows create process. I know Windows C/C++ create process will allow you to redirect all stdout. Perl has access to this same API call: See Win32::Process.
You can try opening a pipe to read 7zip's output.
This doesn't answer how to capture output that gets rewound, but it was a useful way of going about it that I ended up using.
For restores:
use 7za l to list the files in the zip file and their sizes
fork 7za e using open my $command
track each file as it comes out with -s $filename and compare to the listing
when all output files are their full size, you're done
For backups:
create a unique dir somewhere
fork 7za a -w
find the .tmp file in the dir
track its size
when the .tmp file no longer exists, you're done
For restores you get enough data to show a percentage done, but for backups you can only show the total file size so far, but you could compare with historical ratios if you're using similar data to get a guestimate. Still, it's more feedback than before (none).