Debugging Perl one-liner for renaming files - perl

I was testing/trying out this Perl one-liner, and I'm trying to figure out what happened to the files. I don't see the files anymore.
Did I delete them or what went wrong?
Example of file names listed (original):
IMG_0178.JPG
IMG_0182.JPG
IMG_0183.JPG
IMG_0184.JPG
IMG_0186.JPG
I wanted to simply change the file extension to lowercase (.jpg):
perl -e'while(<*.JPG>) { s/JPG$/jpg/; rename <*.jpg>, $_ }'

Don't use rename with a glob. Use scalars. Try to assign the file name to a new variable before the substitution and rename the old name to the modified one, like this:
perl -e'while(<*.JPG>) { ($new = $_) =~ s/JPG$/jpg/; rename $_, $new }'
Check output with ls -1:
IMG_0178.jpg
IMG_0182.jpg
IMG_0183.jpg
IMG_0184.jpg
IMG_0186.jpg

Bizarrely your code should do what you wanted.
A file glob like <*.JPG> in scalar context will return the next file that matches the pattern, and since both while and rename apply scalar context, the two globs return the same value at each iteration.
while (<*.JPG>) {
s/JPG$/jpg/;
rename <*.jpg>, $_;
}
In the first iteration of the loop $_ is set to IMG_0178.JPG by the while, and the substitution sets the file type to lower case.
Then in the rename <*.jpg> is executed in scalar context and again returns IMG_0178.JPG - the first file in the same list because Windows file names are case-insensitive.
So finally the rename performs rename 'IMG_0178.JPG', 'IMG_0178.jpg' as required.
Replacing rename with a stub that only prints its arguments shows this clearly:
sub ren($$) {
print "$_[0] -> $_[1]\n";
}
while (my $file = <*.JPG>) {
$file =~ s/JPG$/jpg/;
ren <*.JPG>, $file;
}
output
IMG_0178.JPG -> IMG_0178.jpg
IMG_0182.JPG -> IMG_0182.jpg
IMG_0183.JPG -> IMG_0183.jpg
IMG_0184.JPG -> IMG_0184.jpg
IMG_0186.JPG -> IMG_0186.jpg
So you are lucky, and your files should have been renamed as you wanted.
But don't do this. In particular you should run the program with a print statement in place of any critical operations so that you can see what is going to happen.
This would be better, as it more clearly does what is intended:
perl -e '($f = $_) =~ s/JPG$/jpg/i and rename $_, $f while <*.JPG>'
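The advice to print before doing anything critical can be built into the script as a dry-run switch. This is only a minimal sketch: the helper name lowercase_ext and the $dry_run flag are made up for illustration and do not come from the answers above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the new name for a file, or undef when
# the name does not end in .JPG and should be left alone.
sub lowercase_ext {
    my ($old) = @_;
    (my $new = $old) =~ s/\.JPG$/.jpg/ or return;
    return $new;
}

my $dry_run = 1;    # assumption: set to 0 once the printed plan looks right

for my $old (glob '*.JPG') {
    my $new = lowercase_ext($old);
    next unless defined $new;
    if ($dry_run) {
        # print the plan instead of touching any file
        print "would rename $old -> $new\n";
    }
    else {
        rename $old, $new or warn "could not rename $old: $!\n";
    }
}
```

With $dry_run left at 1 the script only lists the planned renames, so a mistake in the pattern shows up before any file is touched.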

Related

How to rename multiple files in a folder with a specific format?

I have many files in a folder with the format '{galaxyID}-cutout-HSC-I-{#}-pdr2_wide.fits', where {galaxyID} and {#} are different numbers for each file. Here are some examples:
2185-cutout-HSC-I-9330-pdr2_wide.fits
992-cutout-HSC-I-10106-pdr2_wide.fits
2186-cutout-HSC-I-9334-pdr2_wide.fits
I want to change the format of all files in this folder to match the following:
2185_HSC-I.fits
992_HSC-I.fits
2186_HSC-I.fits
namely, I want to take out "cutout", the second number, and "pdr2_wide" from each file name. I would prefer to do this in either Perl or Python. For my Perl script, so far I have the following:
rename [-n];
my @parts=split /-/;
my $this=$parts[0].$parts[1].$parts[2].$parts[3].$parts[4].$parts[5];
$_ = $parts[0]."_".$parts[2]."_".$parts[3];
*fits
which gives me the error message
Not enough arguments for rename at ./rename.sh line 3, near "];" Execution of ./rename.sh aborted due to compilation errors.
I included the [-n] because I want to make sure the changes are what I want before actually doing it; either way, this is in a duplicated directory just for safety.
It looks like you are using the rename you get on Ubuntu (it's not the one that's on my ArchLinux box), but there are other ones out there. But, you've presented it oddly. The brackets around -n shouldn't be there and the ; ends the command.
The syntax, if you are using what I think you are, is this:
% rename -n -e PERL_EXPR file1 file2 ...
The Perl expression is the argument to the -e switch, and can be a simple substitution. Note that this expression is a string that you give to -e, so that probably needs to be quoted:
% rename -n -e 's/-\d+-pdr2_wide//' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-cutout-HSC-I.fits)
And, instead of doing this in one step, I'd do it in two:
% rename -n -e 's/-cutout-/-/; s/-\d+-pdr2_wide//' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-HSC-I.fits)
There are other patterns that might make sense. Instead of taking away parts, you can keep parts:
% rename -n -e 's/\A(\d+).*(HSC-I).*/$1-$2.fits/' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-HSC-I.fits)
I'd be inclined to use named captures so the next poor slob knows what you are doing:
% rename -n -e 's/\A(?<galaxy>\d+).*(HSC-I).*/$+{galaxy}-$2.fits/' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-HSC-I.fits)
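If that rename utility is not installed, the same named-capture idea can be used from a plain Perl script. The sketch below is an illustration only: the helper short_name and the capture name band are made up, and it joins the parts with an underscore, as in the file names the question asks for. It prints the plan without renaming anything.

```perl
use strict;
use warnings;

# Hypothetical helper: map an old cutout name to the short form,
# or return undef when the name does not have the expected shape.
sub short_name {
    my ($old) = @_;
    return
        unless $old =~ /\A(?<galaxy>\d+)-cutout-(?<band>HSC-I)-\d+-pdr2_wide\.fits\z/;
    return "$+{galaxy}_$+{band}.fits";
}

for my $old (glob '*-cutout-*.fits') {
    my $new = short_name($old);
    next unless defined $new;
    # Dry run: only print the plan, as rename -n does
    print "rename($old, $new)\n";
}
```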
From your description {galaxyID}-cutout-HSC-I-{#}-pdr2_wide.fits, I assume that cutout-HSC-I is fixed.
Here's a script that will do the rename. It takes a list of files on stdin. But, you could adapt to take the output of readdir:
#!/usr/bin/perl
master(@ARGV);
exit(0);
sub master
{
    my($oldname);
    while ($oldname = <STDIN>) {
        chomp($oldname);
        # find the file extension/suffix
        my($ix) = rindex($oldname,".");
        next if ($ix < 0);
        # get the suffix
        my($suf) = substr($oldname,$ix);
        # only take filenames of the expected format
        next unless ($oldname =~ /^(\d+)-cutout-(HSC-I)/);
        # get the new name
        my($newname) = $1 . "_" . $2 . $suf;
        printf("OLDNAME: %s NEWNAME: %s\n",$oldname,$newname);
        # rename the file
        # change to "if (1)" to actually do it
        if (0) {
            rename($oldname,$newname) or
                die("unable to rename '$oldname' to '$newname' -- $!\n");
        }
    }
}
For your sample input file, here's the program output:
OLDNAME: 2185-cutout-HSC-I-9330-pdr2_wide.fits NEWNAME: 2185_HSC-I.fits
OLDNAME: 992-cutout-HSC-I-10106-pdr2_wide.fits NEWNAME: 992_HSC-I.fits
OLDNAME: 2186-cutout-HSC-I-9334-pdr2_wide.fits NEWNAME: 2186_HSC-I.fits
The above is how I usually do things but here's one with just a regex. It's fairly strict in what it accepts [for safety], but you can adapt as desired:
#!/usr/bin/perl
master(@ARGV);
exit(0);
sub master
{
    my($oldname);
    while ($oldname = <STDIN>) {
        chomp($oldname);
        # only take filenames of the expected format
        next unless ($oldname =~ /^(\d+)-cutout-(HSC-I)-\d+-pdr2_wide([.].+)$/);
        # get the new name
        my($newname) = $1 . "_" . $2 . $3;
        printf("OLDNAME: %s NEWNAME: %s\n",$oldname,$newname);
        # rename the file
        # change to "if (1)" to actually do it
        if (0) {
            rename($oldname,$newname) or
                die("unable to rename '$oldname' to '$newname' -- $!\n");
        }
    }
}
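The adaptation to readdir mentioned above could look roughly like this. It is a sketch under assumptions: the helper name planned_renames is made up, the directory is assumed to be the current one, and nothing is renamed; it only reports what would happen.

```perl
use strict;
use warnings;

# Sketch: collect old => new pairs for one directory via readdir
# instead of reading names from STDIN. Nothing is renamed here.
sub planned_renames {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "unable to open '$dir' -- $!\n";
    my %plan;
    while (defined(my $oldname = readdir $dh)) {
        # same strict pattern as the regex-only script
        next unless $oldname =~ /^(\d+)-cutout-(HSC-I)-\d+-pdr2_wide([.].+)$/;
        $plan{$oldname} = $1 . "_" . $2 . $3;
    }
    closedir $dh;
    return %plan;
}

my %plan = planned_renames('.');    # assumption: run from the data directory
printf("OLDNAME: %s NEWNAME: %s\n", $_, $plan{$_}) for sort keys %plan;
```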

How to print result STDOUT to a temporary blank new file in the same directory in Perl?

I'm new to Perl, so this may be a very basic case that I still can't understand.
Case:
The program asks the user to type a file name.
The user types the file name (one or more files).
The program reads the content of the input file(s).
If there is a single input file, it just prints the entire content of it.
If there are multiple input files, it combines the contents of each file in sequence.
It then prints the result to a temporary new file located in the same directory as program.pl.
file1.txt:
head
a
b
end
file2.txt:
head
c
d
e
f
end
SINGLE INPUT program ioSingle.pl:
#!/usr/bin/perl
print "File name: ";
$userinput = <STDIN>; chomp ($userinput);
#read content from input file
open ("FILEINPUT", $userinput) or die ("can't open file");
#print the content for as long as there is any in the file
while (<FILEINPUT>) {
print ; }
close FILEINPUT;
SINGLE RESULT in cmd:
>perl ioSingle.pl
File name: file1.txt
head
a
b
end
I found tutorial code that combines the content of multiple input files, but I cannot adapt its while condition to the code above:
while ($userinput = <>) {
print ($userinput);
}
I am stuck at making it work for multiple input files.
How am I supposed to restructure the code so that my program gives a result like this?
EXPECTED MULTIFILES RESULT in cmd:
>perl ioMulti.pl
File name: file1.txt file2.txt
head
a
b
end
head
c
d
e
f
end
I appreciate your response :)
A good way to start working on a problem like this is to break it down into smaller sections.
Your problem seems to break down to this:
get a list of filenames
for each file in the list
display the file contents
So think about writing subroutines that do each of these tasks. You already have something like a subroutine to display the contents of the file.
sub display_file_contents {
    # filename is the first (and only) argument to the sub
    my $filename = shift;
    # Use lexical filehandle and three-arg open
    open my $filehandle, '<', $filename or die $!;
    # Shorter version of your code
    print while <$filehandle>;
}
The next task is to get our list of files. You already have some of that too.
sub get_list_of_files {
    print 'File name(s): ';
    my $files = <STDIN>;
    chomp $files;
    # We might have more than one filename. Need to split input.
    # Assume filenames are separated by whitespace
    # (Might need to revisit that assumption - filenames can contain spaces!)
    my @filenames = split /\s+/, $files;
    return @filenames;
}
We can then put all of that together in the main program.
#!/usr/bin/perl
use strict;
use warnings;
my @list_of_files = get_list_of_files();
foreach my $file (@list_of_files) {
    display_file_contents($file);
}
By breaking the task down into smaller tasks, each one becomes easier to deal with. And you don't need to carry the complexity of the whole program in your head at one time.
p.s. But like JRFerguson says, taking the list of files as command line parameters would make this far simpler.
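On the caveat about filenames containing spaces: one possible way to revisit that assumption is to let the user quote such names and split the line with shellwords from the core Text::ParseWords module. This is a sketch only; the helper name split_filenames is made up. Note that shellwords also treats backslashes as escape characters, which matters for Windows-style paths.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Split one line of user input into filenames, honouring quotes so
# that "my notes.txt" stays a single name.
sub split_filenames {
    my ($line) = @_;
    chomp $line;
    return shellwords($line);
}

my @names = split_filenames(qq{file1.txt "my notes.txt" file2.txt\n});
print "$_\n" for @names;
```

The quoted name comes back as one element with the quotes removed, so it can be passed straight to open.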
The easy way is to use the diamond operator <> to open and read the files specified on the command line. This would achieve your objective:
while (<>) {
chomp;
print "$_\n";
}
Thus: ioSingle.pl file1.txt file2.txt
If this is the sole objective, you can reduce this to a command line script using the -p or -n switch like:
perl -pe '1' file1.txt file2.txt
perl -ne 'print' file1.txt file2.txt
These switches create implicit loops around the -e commands. The -p switch prints $_ after every loop as if you had written:
LINE:
while (<>) {
# your code...
} continue {
print;
}
Using -n creates:
LINE:
while (<>) {
# your code...
}
Thus, -p adds an implicit print statement.
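The diamond operator can also be combined with names typed at a prompt, because <> simply reads whatever files are listed in @ARGV. A minimal sketch, assuming a made-up helper slurp_all and a hard-coded line standing in for the user's input:

```perl
use strict;
use warnings;

# Read every file named in @names through the diamond operator and
# return the combined contents as one string.
sub slurp_all {
    my @names = @_;
    return '' unless @names;    # with an empty @ARGV, <> would read STDIN
    local @ARGV = @names;       # <> reads the files listed in @ARGV
    my $text = '';
    while (my $line = <>) {
        $text .= $line;
    }
    return $text;
}

# Stand-in for the line typed at the prompt; in the real program this
# would be: my $input = <STDIN>;
my $input = "file1.txt file2.txt\n";
my @names = grep { -f } split ' ', $input;    # skip names that are not files
print slurp_all(@names);
```

Printing to a file instead of the screen then only needs an output filehandle opened for writing and passed to print.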

How do I extract lines between two strings

I am an absolute beginner in Perl and I am trying to extract lines of text between two strings on different lines, but without success. It looks like I'm missing something in my code. The code should print out the file name and the found strings. Do you have any idea where the problem could be? Many thanks indeed for your help or advice. Here is the example:
*****************
example:
START
new line 1
new line 2
new line 3
END
*****************
and my script:
use strict;
use warnings;
my $command0 = "";
opendir (DIR, "C:/Users/input/") or die "$!";
my @files = readdir DIR;
close DIR;
splice (@files,0,2);
open(MYOUTFILE, ">>output/output.txt");
foreach my $file (@files) {
open (CHECKBOOK, "input/$file")|| die "$!";
while ($record = <CHECKBOOK>) {
if (/\bstart\..\/bend\b/) {
print MYOUTFILE "$file;$_\n";
}
}
close(CHECKBOOK);
$command0 = "";
}
close(MYOUTFILE);
I suppose that you are trying to use a flip-flop here, which might work well for your input, but you've written it wrong:
if (/\bstart\..\/bend\b/) {
A flip-flop (the range operator in scalar context) takes two expressions, separated by either .. or ... (two or three dots). What you want is two regexes joined with ..:
if (/\bSTART\b/ .. /\bEND\b/)
Of course, you also want to match the case (upper), or use the /i modifier to ignore case. You might even want to use beginning of line anchor ^ to only match at the beginning of a line, e.g.:
if (/^START\b/ .. /^END\b/)
You should also know that your entire program can be replaced with a one-liner, such as
perl -ne 'print if /^START\b/ .. /^END\b/' input/*
Alas, this only works on Linux (or in any shell that expands globs itself). The cmd shell in Windows does not glob, so you must do that manually:
perl -ne "BEGIN { @ARGV = map glob, @ARGV }; print if /^START\b/ .. /^END\b/" input/*
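To see the flip-flop at work without any files, here is a small sketch that runs it over an in-memory list of lines. The helper name lines_between and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Return the lines from START through END (inclusive) using the flip-flop.
sub lines_between {
    my @lines = @_;
    my @found;
    for (@lines) {
        # becomes true on a line matching START and stays true
        # up to and including the next line matching END
        push @found, $_ if /^START\b/ .. /^END\b/;
    }
    return @found;
}

my @sample = ("intro\n", "START\n", "new line 1\n", "new line 2\n", "END\n", "outro\n");
print lines_between(@sample);
```

The START and END marker lines themselves are included in the result; the lines before START and after END are dropped.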
If you are having troubles with the whole file printing no matter what you do, I think the problem lies with your input file. So take a moment to study it and make sure it is what you think it is, for example:
perl -MData::Dumper -ne"$Data::Dumper::Useqq = 1; print Dumper $_;" file.txt
If you're matching a multi-line string, you might need to tell the regexp about it:
if (/\bstart\..\/bend\b/s) {
note the s after the regex.
Perldoc says:
s
Treat string as single line. That is, change "." to match any
character whatsoever, even a newline, which normally it would not
match.

How to find a file which exists in different directories under a given path in Perl

I'm looking for a way to find a file that resides in several directories under a given path. In other words, those directories all contain files with the same filename. My script seems to have a hierarchy problem: it does not look into the correct path to find the filename for processing. I have a fixed path as input, and the script needs to look into that path and find files from there, but it seems to get stuck two tiers up and process from there rather than looking into the last directories in the tier (in my case, processing "ln" and "nn" and starting the subroutine from there).
The fixed input path is:
/nfs/disks/version_2.0/
The files that I want to post-process with the subroutine exist under several directories, as below. Basically, I want to check whether file1.abc exists in all of the directories temp1, temp2 and temp3 under the ln directory, and likewise whether file2.abc exists in temp1, temp2 and temp3 under the nn directory.
The full paths of the files that I want to check look like this:
/nfs/disks/version_2.0/dir_a/ln/temp1/file1.abc
/nfs/disks/version_2.0/dir_a/ln/temp2/file1.abc
/nfs/disks/version_2.0/dir_a/ln/temp3/file1.abc
/nfs/disks/version_2.0/dir_a/nn/temp1/file2.abc
/nfs/disks/version_2.0/dir_a/nn/temp2/file2.abc
/nfs/disks/version_2.0/dir_a/nn/temp3/file2.abc
My script is below:
#! /usr/bin/perl -w
my $dir = '/nfs/fm/disks/version_2.0/' ;
opendir(TEMP, $dir) || die $! ;
foreach my $file (readdir(TEMP)) {
next if ($file eq "." || $file eq "..") ;
if (-d "$dir/$file") {
my $d = "$dir/$file";
print "Directory:- $d\n" ;
&getFile($d);
&compare($file) ;
}
}
Note that I put the print "Directory:- $d\n" ; there for debugging purposes, and it printed this:
/nfs/disks/version_2.0/dir_a/
/nfs/disks/version_2.0/dir_b/
So I know it goes into the wrong path when processing the following subroutine.
Can somebody help point out where the error in my script is? Thanks!
To be clear: the script is supposed to recurse through a directory and look for files with a particular filename? In this case, I think the following code is the problem:
if (-d "$dir/$file") {
my $d = "$dir/$file";
print "Directory:- $d\n" ;
&getFile($d);
&compare($file) ;
}
I'm assuming the &getFile($d) is meant to step into a directory (i.e., the recursive step). This is fine. However, it looks like the &compare($file) is the action that you want to take when the object that you're looking at isn't a directory. Therefore, that code block should look something like this:
if (-d "$dir/$file") {
&getFile("$dir/$file"); # the recursive step, for directories inside of this one
} elsif( -f "$dir/$file" ){
&compare("$dir/$file"); # the action on files inside of the current directory
}
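The general shape of the code should look like this:
sub myFind {
    my $dir = shift;
    opendir( my $dh, $dir ) or die "Cannot open $dir: $!";
    my @entries = readdir $dh;
    closedir $dh;
    foreach my $file (@entries) {
        next if $file eq "." || $file eq "..";
        my $obj = "$dir/$file";
        if ( -d $obj ) {
            myFind( $obj );    # recurse into subdirectories
        } elsif ( -f $obj ) {
            doSomethingWithFile( $obj );    # placeholder for your own processing
        }
    }
}
myFind( "/nfs/fm/disks/version_2.0" );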
As a side note: this script is reinventing the wheel. You only need to write a script that does the processing on an individual file. You could do the rest entirely from the shell:
find /nfs/fm/disks/version_2.0 -type f -name "the-filename-you-want" -exec your_script.pl {} \;
Wow, it's like reliving the 1990s! Perl code has evolved somewhat, and you really need to learn the new stuff. It looks like you learned Perl in version 3.0 or 4.0. Here are some pointers:
Use use warnings; instead of -w on the command line.
Use use strict;. This will require you to predeclare variables using my which will scope them to the local block or the file if they're not in a local block. This helps catch a lot of errors.
Don't put & in front of subroutine names.
Use and, or, and not instead of &&, ||, and !.
Learn about Perl Modules which can save you a lot of time and effort.
When someone says detect duplicates, I immediately think of hashes. If you use a hash based upon your file's name, you can easily see if there are duplicate files.
Of course a hash can only have a single value for each key. Fortunately, in Perl 5.x, that value can be a reference to another data structure.
So, I recommend you use a hash that contains a reference to a list (array in old parlance). You can push each instance of the file to that list.
Using your example, you'd have a data structure that looks like this:
%file_hash = (
    "file1.abc" => [
        "/nfs/disks/version_2.0/dir_a/ln/temp1",
        "/nfs/disks/version_2.0/dir_a/ln/temp2",
        "/nfs/disks/version_2.0/dir_a/ln/temp3",
    ],
    "file2.abc" => [
        "/nfs/disks/version_2.0/dir_a/nn/temp1",
        "/nfs/disks/version_2.0/dir_a/nn/temp2",
        "/nfs/disks/version_2.0/dir_a/nn/temp3",
    ],
);
And, here's a program to do it:
#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say); #Can use `say` which is like `print "\n"`;
use File::Basename; #imports `dirname` and `basename` commands
use File::Find; #Implements Unix `find` command.
use constant DIR => "/nfs/disks/version_2.0";
# Find all duplicates
my %file_hash;
find (\&wanted, DIR);
# Print out all the duplicates
foreach my $file_name (sort keys %file_hash) {
    if (scalar (@{$file_hash{$file_name}}) > 1) {
        say qq(Duplicate File: "$file_name");
        foreach my $dir_name (@{$file_hash{$file_name}}) {
            say " $dir_name";
        }
    }
}
sub wanted {
    return if not -f $_;
    if (not exists $file_hash{$_}) {
        $file_hash{$_} = [];
    }
    push @{$file_hash{$_}}, $File::Find::dir;
}
Here's a few things about File::Find:
The work takes place in the subroutine wanted.
The $_ is the name of the file, and I can use this to see if this is a file or directory
$File::Find::name is the full name of the file including the path.
$File::Find::dir is the name of the directory.
If the array reference doesn't exist, I create it with $file_hash{$_} = [];. This isn't necessary, because push creates it on demand (autovivification), but I find it comforting, and it can prevent errors. To use $file_hash{$_} as an array, I have to dereference it. I do that by wrapping it in @{...}, giving @{$file_hash{$_}}. The braces are required here: @$file_hash{$_} would be parsed as a slice of a hash reference named $file_hash.
Once all the files are found, I can print out the entire structure. The only thing I do is check to make sure there is more than one member in each array. If there's only a single member, then there are no duplicates.
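The hash-of-arrays idea can be tried out without touching the disk. The following is a sketch with made-up names (group_by_name, %by_name) that builds the same kind of structure from a plain list of paths, relying on push to create each array reference on first use:

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);

# Group full paths by file name: name => [ directories containing it ].
sub group_by_name {
    my @paths = @_;
    my %by_name;
    for my $path (@paths) {
        # push autovivifies the array reference on first use
        push @{ $by_name{ basename($path) } }, dirname($path);
    }
    return %by_name;
}

my %by_name = group_by_name(
    '/nfs/disks/version_2.0/dir_a/ln/temp1/file1.abc',
    '/nfs/disks/version_2.0/dir_a/ln/temp2/file1.abc',
    '/nfs/disks/version_2.0/dir_a/nn/temp1/file2.abc',
);
for my $name (sort keys %by_name) {
    print "$name: @{ $by_name{$name} }\n";
}
```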
Response to Grace
Hi David W., thank you very much for your explanation and sample script. Sorry, maybe I wasn't really clear in defining my problem statement. I don't think I can use a hash as the data structure for my path finding, since there are a few hundred file*.abc files (the number is undetermined), and although each file*.abc has the same filename in each directory structure, its content actually differs.
For example, the file1.abc that resides under "/nfs/disks/version_2.0/dir_a/ln/temp1" does not have the same content as the file1.abc under "/nfs/disks/version_2.0/dir_a/ln/temp2" and "/nfs/disks/version_2.0/dir_a/ln/temp3". My intention is to grep the list of file*.abc files in each of the directory structures (temp1, temp2 and temp3) and compare the filename list with a master list. Could you help shed some light on how to solve this? Thanks. – Grace yesterday
I'm just printing the file in my sample code, but instead of printing the file, you could open them and process them. After all, you now have the file name and the directory. Here's the heart of my program again. This time, I'm opening the file and looking at the content:
foreach my $file_name (sort keys %file_hash) {
    if (scalar (@{$file_hash{$file_name}}) > 1) {
        #say qq(Duplicate File: "$file_name");
        foreach my $dir_name (@{$file_hash{$file_name}}) {
            #say " $dir_name";
            open (my $fh, "<", "$dir_name/$file_name")
                or die qq(Can't open file "$dir_name/$file_name" for reading);
            # Process your file here...
            close $fh;
        }
    }
}
If you are only looking for certain files, you could modify the wanted function to skip over files you don't want. For example, here I am only looking for files which match the file*.txt pattern. Note I use a regular expression of /^file.*\.txt$/ to match the name of the file. As you can see, it's the same as the previous wanted subroutine. The only difference is my test: I'm looking for something that is a file (-f) and has the correct name (file*.txt):
sub wanted {
    return unless -f $_ and /^file.*\.txt$/;
    if (not exists $file_hash{$_}) {
        $file_hash{$_} = [];
    }
    push @{$file_hash{$_}}, $File::Find::dir;
}
If you are looking at the file contents, you can use the MD5 hash to determine whether the file contents match. This reduces a file to a string of 32 hexadecimal characters, which could even be used as a hash key instead of the file name. This way, files that have matching MD5 hashes (and thus, in practice, matching contents) would be in the same hash list.
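Keying a hash by digest rather than by file name can be sketched like this. It uses md5_hex from the core Digest::MD5 module on in-memory strings; the helper group_by_digest and the sample labels are made up for illustration, and in a real script the content would be read from the files.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Group labels by the MD5 digest of their content.
# %content maps a label (e.g. a path) to the data read from that file.
sub group_by_digest {
    my (%content) = @_;
    my %by_digest;
    for my $label (sort keys %content) {
        push @{ $by_digest{ md5_hex( $content{$label} ) } }, $label;
    }
    return %by_digest;
}

my %groups = group_by_digest(
    'temp1/file1.abc' => "alpha\n",
    'temp2/file1.abc' => "alpha\n",
    'temp3/file1.abc' => "beta\n",
);
for my $digest (sort keys %groups) {
    print "$digest: @{ $groups{$digest} }\n";
}
```

Labels with identical content end up in the same list, regardless of what the files are called.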
You talk about a "master list" of files and it seems you have the idea that this master list needs to match the content of the file you're looking for. So, I'm making a slight mod in my program. I am first taking that master list you talked about, and generating MD5 sums for each file. Then I'll look at all the files in that directory, but only take the ones with the matching MD5 hash...
By the way, this has not been tested.
#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say); #Can use `say` which is like `print "\n"`;
use File::Find; #Implements Unix `find` command.
use Digest::file qw(digest_file_hex);
use constant DIR => "/nfs/disks/version_2.0";
use constant MASTER_LIST_DIR => "/some/directory";
# First, I'm going through the MASTER_LIST_DIR directory
# and finding all of the master list files. I'm going to take
# the MD5 hash of those files, and store them in a Perl hash
# that's keyed by the name of the file. Thus, when I find a
# file with a matching name, I can compare the MD5 of that file
# and the master file. If they match, the files are the same. If
# not, they're different.
# In this example, I'm inlining the function I use to find the files
# instead of making it a separate function.
my %master_hash;
find (
    sub {
        $master_hash{$_} = digest_file_hex($_, "MD5") if -f;
    },
    MASTER_LIST_DIR
);
# Now I have the MD5 of all the master files, I'm going to search my
# DIR directory for the files that have the same MD5 hash as the
# master list files did. If they do have the same MD5 hash, I'll
# print out their names as before.
my %file_hash;
find (\&wanted, DIR);
# Print out all the duplicates
foreach my $file_name (sort keys %file_hash) {
    if (scalar (@{$file_hash{$file_name}}) > 1) {
        say qq(Duplicate File: "$file_name");
        foreach my $dir_name (@{$file_hash{$file_name}}) {
            say " $dir_name";
        }
    }
}
# The wanted function has been modified since the last example.
# Here, I'm only going to put files in the %file_hash if they
# have the same MD5 hash as the master file of the same name.
sub wanted {
    return unless -f $_ and exists $master_hash{$_};
    if (digest_file_hex($_, "MD5") eq $master_hash{$_}) {
        $file_hash{$_} //= []; #Using TLP's syntax hint
        push @{$file_hash{$_}}, $File::Find::dir;
    }
}

Read content from file and find full file-name on disk

My problem is that I have a bunch of file names without the version appended (the version keeps changing every time). The file names are in a file in a particular sequence, and I need to get the latest version of each from a folder and then install them sequentially. The logic would be:
scan a file with contents
read a line from the file
using this as a key, access the folder and match the same
if found, write the full file-name to a file with some characters appended
if not found, skip and loop to line 1, till all the lines in the file are finished
What is the best language to use: shell script or Perl for such a task? And if someone can provide some hints in the form of code :-)
I would read in all your partial filenames then loop through the folder matching the full filenames against the partial ones. The exact implementation would depend on some details.
Do the full filenames need to appear in the same order as the partial ones did? Can you derive the partial filename from the full filename?
Update: so, something like (assuming $infile, $outfile, and $indir are already opened file and dirhandles, and a translation routine partial_filename_from_full that returns undef for things like directories or non-relevant files):
chomp( my @partial_filenames = readline( $infile ) );
while ( my $filename = readdir( $indir ) ) {
    my $partial_filename = partial_filename_from_full( $filename );
    if ( defined $partial_filename ) {
        $full_filename{ $partial_filename } = $filename;
    }
}
for my $partial_filename ( @partial_filenames ) {
    if ( exists $full_filename{ $partial_filename } ) {
        print $outfile $full_filename{ $partial_filename }, "\n";
    } else {
        # error? just skip it? you decide
    }
}
If there are multiple full filenames per partial filename, instead of assigning:
$full_filename{ $partial_filename } = $filename;
you would determine if $filename were a better "match" than the previously encountered one.
Your question is not very clear, but I'm guessing you have a directory containing file names such as:
fileA01
fileA02
fileB03
fileB05
fileB12
fileC02
fileD09
fileE22
The file you scan 'with contents' contains a list of names such as:
fileA
fileB
fileE
And you want code to find the entry in the directory with the highest version number for the corresponding file name:
fileA02
fileB12
fileE22
You will have to decide exactly how versions are compared - I've used 2-digit version numbers, but you haven't stated your constraints.
I would probably use Perl for this. First, I'd read the whole 'file with contents' into memory, and then create a monster regex to recognize the file names - possibly with the version number detection included. I'd use opendir, readdir (and closedir) to process the directory. For each directory entry, I'd match it with the regex, and capture whether the name was the most recent version of any of the sought files. If so, I'd capture the filename in a hash, indexed by the version-less filename (hence, if fileA01 was read first, then I'd have $filelist{fileA} = "fileA01"; except of course both the hash key and the full filename would be in variables).
Doing it in shell would be harder. Using the most powerful features of Bash, it is probably doable; I'd still use Perl (or Python, or any scripting language).
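A rough Perl sketch of that approach follows, working on an in-memory list instead of a real directory. The helper name latest_versions is made up, and it assumes that a full name is the version-less name followed only by digits, compared numerically; other version schemes would need a different comparison.

```perl
use strict;
use warnings;

# For each wanted base name, pick the directory entry with the highest
# numeric suffix. Assumes a full name is the base name plus digits only.
sub latest_versions {
    my ($wanted, $entries) = @_;    # two array references
    return unless @$wanted;
    # one regex that recognises any of the wanted names
    my $alternation = join '|', map { quotemeta($_) } @$wanted;
    my (%best, %best_version);
    for my $entry (@$entries) {
        next unless $entry =~ /\A($alternation)(\d+)\z/;
        my ($base, $version) = ($1, $2);
        if (!exists $best{$base} or $version > $best_version{$base}) {
            $best{$base}         = $entry;
            $best_version{$base} = $version;
        }
    }
    # keep the order of the wanted list, skipping names with no match
    return map { $best{$_} } grep { exists $best{$_} } @$wanted;
}

my @entries = qw(fileA01 fileA02 fileB03 fileB05 fileB12 fileC02 fileD09 fileE22);
print "$_\n" for latest_versions([qw(fileA fileB fileE)], \@entries);
```

For the sample directory listing and name list above, this prints fileA02, fileB12 and fileE22, one per line, in the order of the name list.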
I would use awk.
awk -f myawk.awk
myawk.awk
BEGIN{
}
{
myfilename = $0;
retval = getline otherfile < myfilename;
if (retval == -1) # check the correct syntax
{
# file does not exist. do the necessary error handling
}
else
{
# File exists. so do what you want.
# perhaps you might want to write to a new file with the modified filename
}
}
END{
}