Opening last edited file from a directory - perl

I'm trying to run a perl script that opens the last edited file in a directory.
I know how to open a single file, as in the sample below, but not the most recently edited file in /home/test/:
open(CONFIG_FILE,"/home/test/test.txt");
while (<CONFIG_FILE>){
}
close(CONFIG_FILE);
How can I do that?

This reads all files from /home/test/ and takes the newest one:
my ($last_modified) = sort { -M $a <=> -M $b } grep -f, </home/test/*>;
Check perldoc -f -X
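Then, to actually open the newest file, a minimal sketch building on the line above:
open(my $fh, '<', $last_modified) or die "Cannot open $last_modified: $!";
while (my $line = <$fh>) {
    # process $line here
}
close($fh);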

Try doing this:
chomp(my $file = qx(cd /home/test && ls -1t | sed q));
This solution works by evaluating a shell command. sed q is a trick that prints only the first line.
qx() executes the shell command and captures its output; see perldoc -f qx.
Or with pure-perl :
use File::DirList;
my $files = File::DirList::list('/home/test', 'M');
my $file = $files->[0]->[13];
File::DirList::list('/home/test', 'M') returns an array reference sorted by modification date; we just pick the first entry. Index 13 is the filename.
See perldoc File::DirList

There's an example of how to read the files in a directory in the documentation for the readdir function.
Once you get that going, you'll want to use the stat function to check the mtime of each file as you read it from the directory.
All the rest is just programming ;-)
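For instance, a rough sketch of that approach, assuming the directory from the question (/home/test):
use strict;
use warnings;

opendir(my $dh, '/home/test') or die "Cannot open directory: $!";
my ($newest, $newest_mtime);
while (my $entry = readdir $dh) {
    my $path = "/home/test/$entry";
    next unless -f $path;                # skip . , .. and subdirectories
    my $mtime = (stat $path)[9];         # element 9 of stat is the mtime
    if (!defined $newest_mtime || $mtime > $newest_mtime) {
        ($newest, $newest_mtime) = ($path, $mtime);
    }
}
closedir $dh;
print "$newest\n" if defined $newest;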
Alternatively, you could take the last item from ls sorted by date: ls -tr | tail -1.

Related

Perl regular expression loop through all the directory and get specific file

I would like to translate a Unix command into Perl to get specific files matching a condition.
Suppose I have a Perl script called totalResult.pl in the directory /nfs/cs/test_case/y2016. This directory also contains many directories such as testWeek1, testWeek2, testWeek3...etc. Each of those contains sub-directories such as testCase1, testCase2, testCase3...etc., and each testCase directory contains a file called .test_result whose contents record the result, either success or fail.
So I can get the file information using unix command, for example:
wc /nfs/cs/test_case/y2016/testWeek1/testCase1/.test_result
If I want to get the .test_result files for each directory and sub-directory that contain fail, I can do it from the current path /nfs/cs/test_case/y2016 in Unix like this:
grep -ri "fail" */*/.test_result
It will give me the output:
/nfs/cs/test_case/y2016/testWeek1/testCase1/.test_result:fail
/nfs/cs/test_case/y2016/testWeek3/testCase45/.test_result:fail
/nfs/cs/test_case/y2016/testWeek4/testCase12/.test_result:fail
.
.
...etc
How can I achieve this by writing a Perl script, so that I can just run perl testCase.pl and get the same output? I'm new to Unix and Perl; can anyone help?
# Collect names of all test files
my @TestFiles = glob('/nfs/cs/test_case/y2016/*/*/.test_result');

# Check test files for "fail"
foreach my $TestFile ( @TestFiles ) {
    open(my $T, '<', $TestFile) or die "Can't open < $TestFile: $!";
    while (<$T>) {
        if ( /fail/ ) {
            chomp;
            print $TestFile, ":", $_, "\n";
        }
    }
    close($T);
}
You can also execute the same Linux command within Perl using the backtick (`) operator.
my @result = `grep -ri "fail" */*/.test_result`;
print @result;

perl script to add line of code only modifies one file

I have this:
perl -pi -e 'print "code I want to insert\n" if $. == 2' *.php
which puts the line code I want to insert on the second line of the file, which is what I need done to every single PHP file.
If I run it in a directory with both PHP files and non-PHP files it does the right thing, but only to one PHP file. I thought *.php would apply it to all PHP files, but it doesn't do it.
How can I write it so it will modify every PHP file in a directory? Bonus if there is an easy way to do this recursively through all directories. I don't mind running the Perl script for each directory as there aren't that many, but don't want to hand edit every single file.
The problem is that the file handle ARGV that Perl uses to read the files passed on the command line is never explicitly closed, so the line number $. just keeps incrementing after the end of the first file and never goes back to one.
Fix this by closing ARGV when it has reached end of file. Perl will reopen it to read the next file in the list, and so reset $.
perl -i -pe 'print "code I want to insert\n" if $. == 2; close ARGV if eof' *.php
If you can use sed, this should work:
sed -si '2i\CODE YOU WANT TO INSERT' *.php
To do it recursively, you might try:
find -name '*.php' -execdir sed -si '2i\CODE YOU WANT TO INSERT' '{}' +
Using File::Find.
Note, I've included 3 sanity checks to verify that things are actually being processed the way that you want.
Initially the script will just print out the found files until you comment out the bare return.
Then the script will save backups unless you uncomment the unlink statement.
Finally, the script will only process a single file until you comment out the exit statement.
These three checks are just so you can verify that everything is working as you desire before editing a whole directory tree.
use strict;
use warnings;

use File::Find;

my $to_insert = "code I want to insert\n";

find(sub {
    return unless -f && /\.php$/;

    print "Edit $File::Find::name\n";
    return; # Comment out once satisfied with found files

    local $^I = '.bak';
    local @ARGV = $_;

    while (<>) {
        print $to_insert if $. == 2 && $_ ne $to_insert;
        print;
    }

    # unlink "$_$^I"; # Uncomment to delete backups once certain that first file is processed correctly.

    exit; # Comment out once certain that first file is processed correctly
}, '.');

Read same extension multiple files in one directory in Perl

I currently have an issue with reading files in one directory.
I need to take all the fastq files in a directory, run the script on each file, and then put the new files in an 'Edited_sequences' folder.
The one script I had is
perl -ne '$i++; if($i<80001){print}' BM2003_TCCCAGAACAAC_L001_R1_001.fastq > ./Edited_sequences/BM2003_TCCCAGAACAAC_L001_R1_001.fastq
It takes the first 80000 lines in one fastq file then outputs the result.
Now if, for example, I have 2000 fastq files, I would need to copy and paste 2000 times.
I know there is a glob command suited to this situation, but I just do not know how to use it.
Please help me out.
You can use Perl to do the copy/paste for you. The first argument, *.fastq, is all the fastq files, and the second, ./Edited_sequences, is the target folder for the new files:
perl -e '$d=pop; `head -80000 "$_" > "$d/$_"` for @ARGV' *.fastq ./Edited_sequences
glob gets you an array of filenames matching a particular expression. It's frequently used with <> brackets, a lot like reading input (you can think of it as reading files from a directory).
This is a simple example that will print the names of every ".fastq" file in the current directory:
print "$_\n" for <*.fastq>;
The important part is <*.fastq>, which gives us an array of filenames matching that expression (in this case, a file extension). If you need to change which directory your Perl script is working in, you can use chdir.
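For example, if the fastq files live somewhere else, you could switch to that directory first (the path here is just a placeholder):
chdir '/path/to/fastq/files' or die "Cannot chdir: $!";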
From there, we can process your files as needed:
while (my $filename = <*.fastq>) {
    open(my $in,  '<', $filename) or die $!;
    open(my $out, '>', "./Edited_sequences/$filename") or die $!;

    for (1 .. 80000) {
        my $line = <$in>;
        last unless defined $line;   # stop early if the file has fewer lines
        print $out $line;
    }
}
You have two choices:
Use Perl to read in the 2000 files and run it as part of your program
Use the shell to pass each of those 2000 files to your command line
Here's the bash alternative:
for file in *.fastq
do
perl -ne '$i++; if($i<80001){print}' "$file" > "./Edited_sequences/$file"
done
Your same Perl script, but with the shell finding each file. This should work and not overload the command line. The for loop in bash, if handed a glob, can expand it correctly.
However, I always recommend that you don't actually execute the command, but echo the resulting commands into a file:
for file in *.fastq
do
echo "perl -ne '\$i++; if(\$i<80001){print}' \
\"$file\" > \"./Edited_sequences/$file\"" >> myoutput.txt
done
Then, you can look at myoutput.txt to make sure it looks good before you actually do any real harm. Once you've determined that myoutput.txt is a good file, you can execute that as a shell script:
$ bash myoutput.txt

How to check for non empty files in Perl

I'm using the find command for finding files in directories. I would like to check that the files in the directories are not empty (non-zero size) before proceeding. Thanks to the find manual, I know how to identify empty files using the -empty option.
However, I want to use Perl to check for non-empty files. How can I do that?
Thanks in advance.
Refer to perldoc perlfunc -X for a refresher of the Perl file test operators. What you want is this one:
-s File has nonzero size (returns size in bytes).
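For instance, a minimal check, assuming $filename holds the path being examined:
if (-s $filename) {
    # the file exists and has nonzero size; safe to proceed
}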
Simple script showing how to use File::Find:
#!/usr/bin/perl -w
use strict;
use File::Find;

# $ARGV[0] is the first command line argument
my $startingDir = $ARGV[0];

finddepth(\&wanted, $startingDir);

sub wanted
{
    # if current path is a file and non-empty
    if (-f $_ && -s $_)
    {
        # print full path to the console
        print $File::Find::name . "\n";
    }
}
In this example I have the output going to the console. To pipe it to a file, you can just use shell output redirection, e.g. ./findscript.pl /some/dir > somefile.out.
Please have a look at the perldoc for the file test operators: http://perldoc.perl.org/functions/-X.html
-z File has zero size (is empty).
-s File has nonzero size (returns size in bytes).
Sample usage to detect non-empty file:
unless (-z $FILE) { process_file($FILE); }
if (-s $FILE) { process_file($FILE); }

How do I run a Perl script on multiple input files with the same extension?

How do I run a Perl script on multiple input files with the same extension?
perl scriptname.pl file.aspx
I'm looking to have it run for all aspx files in the current directory
Thanks!
In your Perl file,
my @files = <*.aspx>;
for my $file (@files) {
    # do something.
}
The <*.aspx> is called a glob.
You can pass those files to perl with a wildcard.
In your script:
foreach (@ARGV) {
    print "file: $_\n";
    # open your file here...
    # ...do something
    # close your file
}
On the command line:
$ perl myscript.pl *.aspx
You can use glob explicitly, to use shell parameters without depending too much on the shell behaviour.
for my $file ( map { glob($_) } @ARGV ) {
    print $file, "\n";
}
You may need to guard against duplicate filenames when more than one parameter expands to the same file.
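One way to do that, sketched with a simple hash-based filter:
my %seen;
my @files = grep { !$seen{$_}++ } map { glob } @ARGV;   # drop repeated names
for my $file (@files) {
    print $file, "\n";
}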
For a simple one-liner with -n or -p, you want
perl -i~ -pe 's/foo/bar/' *.aspx
The -i~ says to modify each target file in place, and leave the original as a backup with an ~ suffix added to the file name. (Omit the suffix to not leave a backup. But if you are still learning or experimenting, that's a bad idea; removing the backups when you're done is a much smaller hassle than restoring the originals from a backup if you mess something up.)
If your Perl code is too complex for a one-liner (or just useful enough to be reusable), replace -e '...' with scriptname.pl. You might then refactor scriptname.pl so that it accepts a list of file name arguments, and simply use perl scriptname.pl *.aspx to run it on all *.aspx files in the current directory.
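A minimal sketch of such a scriptname.pl (an illustration, not the original poster's script; it sets $^I to get the same in-place editing as -i~):
#!/usr/bin/perl
use strict;
use warnings;

$^I = '~';          # enable in-place editing with a ~ backup, like -i~
while (<>) {
    s/foo/bar/;     # the same substitution as in the one-liner
    print;
}
Run it as perl scriptname.pl *.aspx.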
If you need to recurse a directory structure and find all files with a particular naming pattern, the find utility is useful.
find . -name '*.aspx' -exec perl -pi~ -e 's/foo/bar/' {} +
If your find does not support -exec ... +, try -exec ... \; instead, though it will be slower and launch more processes (one per file found instead of as few as possible to process all the files).
To only scan some directories, replace . (which names the current directory) with a space-separated list of the directories to examine, or even use find to find the directories themselves (and then perhaps explore -execdir for doing something in each directory that find selects with your complex, intricate, business-critical, maybe secret list of find option predicates).
Maybe also explore find2perl to do this directory recursion natively in Perl.
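For a rough idea of the native approach, File::Find can collect the matching files for you (a sketch, not a drop-in replacement for the find command above):
use strict;
use warnings;
use File::Find;

my @aspx;
find(sub { push @aspx, $File::Find::name if -f && /\.aspx$/ }, '.');
print "$_\n" for @aspx;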
If you are on a Linux machine, you could try something like this:
for i in `ls /tmp/*.aspx`; do perl scriptname.pl $i; done
For example, to handle perl scriptname.pl *.aspx *.asp:
On Linux, the shell expands wildcards, so the Perl can simply be:
for (@ARGV) {
    operation($_); # do something with each file
}
Windows doesn't expand wildcards, so expand the wildcards in each argument in Perl as follows. The for loop then processes each file in the same way as above:
for (map { glob } @ARGV) {
    operation($_); # do something with each file
}
For example, this will print the expanded list under Windows
print "$_\n" for(map {glob} #ARGV);
You can also pass the path where you have your aspx files and read them one by one.
#!/usr/bin/perl -w
use strict;

my $path = shift;
my @files = split /\n/, `ls $path/*.aspx`;

foreach my $file (@files) {
    # do something...
}