Sort across multiple columns (Perl) - perl

How would I go about sorting across multiple columns for the below code?
Currently, the code:
1. Gets a #list of files in a $directory
2. Uses regex to get the $fileName, $fileLocation and $fileSize for each element in #list
3. Prints out the 3 values in (2) into 3 fixed-width columns
4. Then prints out the total number of files and directory size
I would like the output to display sorted by:
1. $fileName then
2. $fileLocation then
3. $fileSize
$directory = '/shared/tmp';
$count = 0;
#list = qx{du -ahc $directory};
printf ("%-60s %-140s %-5s\n", "Filename", "Location", "Size");
foreach(#list) {
chop($_); # remove newline at end
if (/^(.+?K)\s+(.+\/)(.+\.[A-Za-z0-9]{2,4})$/) { # store lines with valid filename into new array
# push(#files,$1);
$fileSize = $1;
$fileLocation = $2;
$fileName = $3;
if ($fileName =~ /^\./) {
next; }
printf ("%-60s %-140s %-5s\n", $fileName, $fileLocation, $fileSize);
$count++;
}
else {
next;
}
}
print "Total number of files: $count\n";
$total = "$list[$#list]";
$total =~ s/^(.+?)\s.+/$1/;
print "Total directory size: $total\n";

You can specify your own sorting algorithm and give it to sort!
Documention: sort - perldoc.perl.org
A sample implementation
Push your results (in a hash reference) into an array called #entries, and use something like the below.
my #entries;
...
# inside your loop
push #entries, {
'filename' => $fileName,
'location' => $fileLocation,
'size' => $fileSize
};
...
my #sorted_entries = sort {
$a->{'filename'} cmp $b->{'filename'} || # use 'cmp' for strings
$a->{'location'} cmp $b->{'location'} ||
$a->{'size'} <=> $b->{'size'} # use '<=>' for numbers
} #entries;

Related

Printing the search path taken to find item during BFS

I am trying to solve the doublets puzzle problem using Perl. This is one of my first times using Perl so please excuse the messy code.
I have everything working, I believe, but am having an issue printing the shortest path. Using a queue and BFS I am able to find the target word but not the actual path taken.
Does anyone have any suggestions? I have been told to keep track of the parents of each element but it is not working.
#!/usr/bin/perl
use strict;
my $file = 'test';
#my $file = 'wordlist';
open(my $fh, $file);
my $len = length($ARGV[0]);
my $source = $ARGV[0];
my $target = $ARGV[1];
my #words;
# Creates new array of correct length words
while (my $row = <$fh>) {
chomp $row;
my $rowlen = length($row);
if ($rowlen == $len) {
push #words, $row;
}
}
my %wordHash;
# Creates graph for word variations using dictionary
foreach my $word (#words) {
my $wordArr = [];
for (my $i = 0; $i < $len; $i++) {
my $begin = substr($word, 0, $i);
my $end = substr($word, $i+1, $len);
my $key = "$begin" . "_" . "$end";
my $Arr = [];
my $regex = "$begin" . "[a-z]" . "$end";
foreach my $wordTest (#words) {
if ("$wordTest" =~ $regex && "$wordTest" ne "$word") {
push $wordArr, "$wordTest";
}
}
}
$wordHash{"$word"} = $wordArr;
}
my #queue;
push(#queue, "$source");
my $next = $source;
my %visited;
my %parents;
my #path;
# Finds path using BFS and Queue
while ("$next" ne "$target") {
print "$next: ";
foreach my $variation (#{$wordHash{$next}}) {
push(#queue, "$variation");
$parents{"$variation"} = $next;
print "$variation | ";
}
print "\n-----------------\n";
$visited{"$next"} = 1;
push(#path, "$next");
$next = shift(#queue);
while ($visited{"$next"} == 1) {
$next = shift(#queue);
}
}
print "FOUND: $next\n\n";
print "Path the BFS took: ";
print "#path\n\n";
print "Value -> Parent: \n";
for my $key (keys %parents) {
print "$key -> $parents{$key}\n";
}
Before you accept a word from the #queue to be $next, you test to ensure that it's not been %visited. By then, though, damage has been done. The test has ensured a visited word wont become the focus again and hence, will prevent loops but the earlier code updated %parents whether the word had been %visited or not.
If a word has been %visited, you not only want to avoid it becomming the $next candidate, you want to avoid it being a considered $variation as that will screw up %parents. I don't have a word dictionary to test with and you haven't given an example of the failure but I think you can fix this up by shifting the %visited guard into the inner loop where variations are considered;
foreach my $variation (#{$wordHash{$next}}) {
next if %visited{ $variation } ;
push(#queue, "$variation");
... etc ...
This will protect the integrity of your #parents array as well as stop loops. On a small note, you don't need use double quotes when indexing into a hash; as I've done above, just state the scalar variable - using quotes just interpolates the value of the variable which produces the same result.
Your code, IMHO, is excellent for a beginner, BTW.
Update
I've since got a word dictionary and the problem above does exists as well as one other. The code does move one letter at a time from the source but in a near random direction - not necessarily closer to the target. To correct that, I changed the regex you use to build your graph such that the corresponding letter from the target replaces the generic [a-z]. There are also a couple of minor changes - mostly style related. The updated code looks like this;
use v5.12;
my $file = 'wordlist.txt';
#my $file = 'wordlist';
open(my $fh, $file);
my $len = length($ARGV[0]);
my $source = $ARGV[0];
my $target = $ARGV[1];
chomp $target;
my #target = split('', $target);
my #words;
# Creates new array of correct length words
while (my $row = <$fh>) {
$row =~ s/[\r\n]+$//;
my $rowlen = length($row);
if ($rowlen == $len) {
push #words, $row;
}
}
my %wordHash;
# Creates graph for word variations using dictionary
foreach my $word (#words) {
my $wordArr = [];
for (my $i = 0; $i < $len; $i++) {
my $begin = substr($word, 0, $i);
my $end = substr($word, $i+1, $len);
my $key = "$begin" . "_" . "$end";
my $Arr = [];
# my $re_str = "$begin[a-z]$end";
my $regex = $begin . $target[$i] . $end ;
foreach my $wordTest (#words) {
if ($wordTest =~ / ^ $regex $ /x ) {
next if $wordTest eq $word ;
push $wordArr, "$wordTest";
}
}
}
$wordHash{"$word"} = $wordArr;
}
my #queue;
push(#queue, "$source");
my $next = $source;
my %visited;
my %parents;
my #path;
# Finds path using BFS and Queue
while ($next ne $target) {
print "$next: ";
$visited{$next} = 1;
foreach my $variation (#{$wordHash{$next}}) {
next if $visited{ $variation } ;
push(#queue, $variation);
$parents{$variation} = $next;
print "$variation | ";
}
print "\n-----------------\n";
push(#path, $next);
while ( $visited{$next} ) {
$next = shift #queue ;
}
}
push #path, $target ;
print "FOUND: $next\n\n";
print "Path the BFS took: #path\n\n";
print "Value -> Parent: \n";
for my $key (keys %parents) {
print "$key -> $parents{$key}\n";
}
and when ran produces;
./words.pl head tail | more
head: heal |
-----------------
heal: teal | heil |
-----------------
teal:
-----------------
heil: hail |
-----------------
hail: tail |
-----------------
FOUND: tail
Path the BFS took: head heal teal heil hail tail
Value -> Parent:
hail -> heil
heil -> heal
teal -> heal
tail -> hail
heal -> head
You could probably remove the printing of the %parents hash - as hash values come out randomly, it doesnt tell you much

How do I speed up pattern recognition in perl

This is the program as it stands right now, it takes in a .fasta file (a file containing genetic code), creates a hash table with the data and prints it, however, it is quite slow. It splits a string an compares it against all other letters in the file.
use strict;
use warnings;
use Data::Dumper;
my $total = $#ARGV + 1;
my $row;
my $compare;
my %hash;
my $unique = 0;
open( my $f1, '<:encoding(UTF-8)', $ARGV[0] ) or die "Could not open file '$ARGV[0]' $!\n";
my $discard = <$f1>;
while ( $row = <$f1> ) {
chomp $row;
$compare .= $row;
}
my $size = length($compare);
close $f1;
for ( my $i = 0; $i < $size - 6; $i++ ) {
my $vs = ( substr( $compare, $i, 5 ) );
for ( my $j = 0; $j < $size - 6; $j++ ) {
foreach my $value ( substr( $compare, $j, 5 ) ) {
if ( $value eq $vs ) {
if ( exists $hash{$value} ) {
$hash{$value} += 1;
} else {
$hash{$value} = 1;
}
}
}
}
}
foreach my $val ( values %hash ) {
if ( $val == 1 ) {
$unique++;
}
}
my $OUTFILE;
open $OUTFILE, ">output.txt" or die "Error opening output.txt: $!\n";
print {$OUTFILE} "Number of unique keys: " . $unique . "\n";
print {$OUTFILE} Dumper( \%hash );
close $OUTFILE;
Thanks in advance for any help!
It is not clear from the description what is wanted from this script, but if you're looking for matching sets of 5 characters, you don't actually need to do any string matching: you can just run through the whole sequence and keep a tally of how many times each 5-letter sequence occurs.
use strict;
use warnings;
use Data::Dumper;
my $str; # store the sequence here
my %hash;
# slurp in the whole file
open(IN, '<:encoding(UTF-8)', $ARGV[0]) or die "Could not open file '$ARGV[0]' $!\n";
while (<IN>) {
chomp;
$str .= $_;
}
close(IN);
# not sure if you were deliberately omitting the last two letters of sequence
# this looks at all the sequence
my $l_size = length($str) - 4;
for (my $i = 0; $i < $l_size; $i++) {
$hash{ substr($str, $i, 5) }++;
}
# grep in a scalar context will count the values.
my $unique = grep { $_ == 1 } values %hash;
open OUT, ">output.txt" or die "Error opening output.txt: $!\n";
print OUT "Number of unique keys: ". $unique."\n";
print OUT Dumper(\%hash);
close OUT;
It might help to remove searching for information that you already have.
I don't see that $j depends upon $i so you're actually matching values to themselves.
So you're getting bad counts as well. It works for 1, because 1 is the square of 1.
But if for each five-character string you're counting strings that match, you're going
to get the square of the actual number.
You would actually get better results if you did it this way:
# compute it once.
my $lim = length( $compare ) - 6;
for ( my $i = 0; $i < $lim; $i++ ){
my $vs = substr( $compare, $i, 5 );
# count each unique identity *once*
# if it's in the table, we've already counted it.
next if $hash{ $vs };
$hash{ $vs }++; # we've found it, record it.
for ( my $j = $i + 1; $j < $lim; $j++ ) {
my $value = substr( $compare, $j, 5 );
$hash{ $value }++ if $value eq $vs;
}
}
However, it could be an improvement on this to do an index for your second loop
and let the c-level of perl do your matching for you.
my $pos = $i;
while ( $pos > -1 ) {
$pos = index( $compare, $vs, ++$pos );
$hash{ $vs }++ if $pos > -1;
}
Also, if you used index, and wanted to omit the last two characters--as you do, it might make sense to remove those from the characters you have to search:
substr( $compare, -2 ) = ''
But you could do all of this in one pass, as you loop through file. I believe the code
below is almost an equivalent.
my $last_4 = '';
my $last_row = '';
my $discard = <$f1>;
# each row in the file after the first...
while ( $row = <$f1> ) {
chomp $row;
$last_row = $row;
$row = $last_4 . $row;
my $lim = length( $row ) - 5;
for ( my $i = 0; $i < $lim; $i++ ) {
$hash{ substr( $row, $i, 5 ) }++;
}
# four is the maximum we can copy over to the new row and not
# double count a strand of characters at the end.
$last_4 = substr( $row, -4 );
}
# I'm not sure what you're getting by omitting the last two characters of
# the last row, but this would replicate it
foreach my $bad_key ( map { substr( $last_row, $_ ) } ( -5, -6 )) {
--$hash{ $bad_key };
delete $hash{ $bad_key } if $hash{ $bad_key } < 1;
}
# grep in a scalar context will count the values.
$unique = grep { $_ == 1 } values %hash;
You may be interested in this more concise version of your code that uses a global regex match to find all the subsequences of five characters. It also reads the entire input file in one go, and removes the newlines afterwards.
The path to the input file is expected as a parameter on the command line, and the output is sent to STDIN, and can be redirected to a file on the command line, like this
perl subseq5.pl input.txt > output.txt
I've also used Data::Dump instead of Data::Dumper because I believe it to be vastly superior. However it is not a core module, and so you will probably need to install it.
use strict;
use warnings;
use open qw/ :std :encoding(utf-8) /;
use Data::Dump;
my $str = do { local $/; <>; };
$str =~ tr|$/||d;
my %dups;
++$dups{$1} while $str =~ /(?=(.{5}))/g;
my $unique = grep $_ == 1, values %dups;
print "Number of unique keys: $unique\n";
dd \%dups;

Perl - sort filenames by order based on the filemask YYY-MM-DD that is in the filename

Need some help, not grasping a solution here on what method I should use.
I need to scan a directory and obtain the filenames by order of
1.YYYY-MM-DD, YYYY-MM-DD is part of the filename.
2. Machinename which is at the start of the filename to the left of the first "."
For example
Machine1.output.log.2014-02-26
Machine2.output.log.2014-02-27
Machine2.output.log.2014-02-26
Machine2.output.log.2014-02-27
Machine3.output.log.2014-02-26
So that it outputs in an array as follows
Machine1.output.log.2014-02-26
Machine2.output.log.2014-02-26
Machine3.output.log.2014-02-26
Machine1.output.log.2014-02-27
Machine2.output.log.2014-02-27
Thanks,
Often, temporarily turning your strings into a hash or array for sorting purposes, and then turning them back into the original strings is the most maintainable way.
my #filenames = qw/
Machine1.output.log.2014-02-26
Machine2.output.log.2014-02-27
Machine2.output.log.2014-02-26
Machine2.output.log.2014-02-27
Machine3.output.log.2014-02-26
/;
#filenames =
map $_->{'orig_string'},
sort {
$a->{'date'} cmp $b->{'date'} || $a->{'machine_name'} cmp $b->{'machine_name'}
}
map {
my %attributes;
#attributes{ qw/orig_string machine_name date/ } = /\A(([^.]+)\..*\.([^.]+))\z/;
%attributes ? \%attributes : ()
} #filenames;
You can define your own sort like so ...
my #files = (
"Abc1.xxx.log.2014-02-26"
, "Abc1.xxx.log.2014-02-27"
, "Abc2.xxx.log.2014-02-26"
, "Abc2.xxx.log.2014-02-27"
, "Abc3.xxx.log.2014-02-26"
);
foreach my $i ( #files ) { print "$i\n"; }
sub bydate {
(split /\./, $a)[3] cmp (split /\./, $b)[3];
}
print "sort it\n";
foreach my $i ( sort bydate #files ) { print "$i\n"; }
You can take your pattern 'YYYY-MM-DD' and match it to what you need.
#!/usr/bin/perl
use strict;
opendir (DIRFILES, ".") || die "can not open data file \n";
my #maplist = readdir(DIRFILES);
closedir(MAPS);
my %somehash;
foreach my $tmp (#maplist) {
next if $tmp =~ /^.{1,2}$/;
next if $tmp =~ /test/;
$tmp =~ /(\d{4})-(\d{2})-(\d{2})/;
$somehash{$tmp} = $1 . $2 . $3; # keep the original file name
# allows for duplicate dates
}
foreach my $tmp (keys %somehash) {
print "-->", $tmp, " --> " , $somehash{$tmp},"\n";
}
my #list= sort { $somehash{$a} <=> $somehash{$b} } keys(%somehash);
foreach my $tmp (#list) {
print $tmp, "\n";
}
Works, tested it with touch files.

Manipulating files according to indexes by perl

I am working on some genome data and I have 2 files ->
File1
A1 1 10
A1 15 20
A2 2 11
A2 13 16
File2
>A1
CTATTATTTATCGCACCTACGTTCAATATTACAGGCGAACATACCTACTA
AAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATAACAATTGAAT
>A2
GTCTGCACAGCCGCTTTCCACACAGACATCATAACAAAAAATTTCCACCA
AACCCCCCCCTCCCCCCGCTTCTGGCCACAGCACTTAAACACATCTCTGC
CAAACCCCAAAAACAAAGAACCCTAACACCAGCCTAACCAGATTTCAAAT
In file 1, 2nd and 3rd column represents the indexes in File2. So I want that, if character in column1 of file1 matches with character followed by symbol (>) in file2 , then from next line of that file2 give back the substring according to indexes in col2 and col3 of file1. (sorry, I know its complicated) Here is the desire output ->
Output
>A1#1:10
CTATTATTTA
>A1#15:20
ACCTA
>A2#2:11
TCTGCACAGC
>A2#13:16
GCTT
I know if I have only 1 string I can take out sub-string very easily ->
#ARGV or die "No input file specified";
open $first, '<',$ARGV[0] or die "Unable to open input file: $!";
$string="GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCAT";
while (<$first>)
{
#cols = split /\s+/;
$co=$cols[1]-1;
$length=$cols[2]-$co;
$fragment = substr $string, $co, $length;
print ">",$cols[0],"#",$cols[1],":",$cols[2],"\n",$fragment,"\n";
}
but here my problem is when should I input my second file and how should I match the character in col1 (of file1) with character in file2 (followed by > symbol) and then how to get substring?
I wasnt sure if they were all one continuous line or separate lines.
I set it up as continuous for now.
Basically, read the 2nd file as master.
Then you can process as many index files as you need.
You can use hash of arrays to help with the indexing.
push #{$index{$key}}, [$start,$stop];
use strict;
my $master_file = "dna_master.txt";
if ($#ARGV) {
print "Usage: $0 [filename(s)]\n";
exit 1;
}
my %Data = read_master($master_file);
foreach my $index_file (#ARGV) {
my %Index = read_index($index_file);
foreach my $key (sort keys %Index) {
foreach my $i (#{$Index{$key}}) {
my ($start,$stop) = #$i;
print ">$key#$start:$stop\n";
my $pos = $start - 1;
my $count = $stop - $start + 1;
print substr($Data{$key},$pos,$count)."\n";
}
}
}
sub read_file {
my $file = shift;
my #lines;
open(FILE, $file) or die "Error: cannot open $file\n$!";
while(<FILE>){
chomp; #remove newline
s/(^\s+|\s+$)//g; # strip lead/trail whitespace
next if /^$/; # skip blanks
push #lines, $_;
}
close FILE;
return #lines;
}
sub read_index {
my $file = shift;
my #lines = read_file($file);
my %index;
foreach (#lines) {
my ($key,$start,$stop) = split /\s+/;
push #{$index{$key}}, [$start,$stop];
}
return %index;
}
sub read_master {
my $file = shift;
my %master;
my $key;
my #lines = read_file($file);
foreach (#lines) {
if ( m{^>(\w+)} ) { $key = $1 }
else { $master{$key} .= $_ }
}
return %master;
}
Load File2 in a Hash, with A1, A2... as keys, and the DNA sequence as value. This way you can get the DNA sequence easily.
This 2nd update turns the master file into a hash of arrays as well.
This treats each row in the 2nd file as individual sequences.
use strict;
my $master_file = "dna_master.txt";
if ($#ARGV) {
print "Usage: $0 [filename(s)]\n";
exit 1;
}
my %Data = read_master($master_file);
foreach my $index_file (#ARGV) {
my %Index = read_index($index_file);
foreach my $key (sort keys %Index) {
foreach my $i (#{$Index{$key}}) {
my ($start,$stop) = #$i;
print ">$key#$start:$stop\n";
my $pos = $start - 1;
my $count = $stop - $start + 1;
foreach my $seq (#{$Data{$key}}) {
print substr($seq,$pos,$count)."\n";
}
}
}
}
sub read_file {
my $file = shift;
my #lines;
open(FILE, $file) or die "Error: cannot open $file\n$!";
while(<FILE>){
chomp; #remove newline
s/(^\s+|\s+$)//g; # strip lead/trail whitespace
next if /^$/; # skip blanks
push #lines, $_;
}
close FILE;
return #lines;
}
sub read_index {
my $file = shift;
my #lines = read_file($file);
my %index;
foreach (#lines) {
my ($key,$start,$stop) = split /\s+/;
push #{$index{$key}}, [$start,$stop];
}
return %index;
}
sub read_master {
my $file = shift;
my %master;
my $key;
my #lines = read_file($file);
foreach (#lines) {
if ( m{^>(\w+)} ) { $key = $1 }
else { push #{ $master{$key} }, $_ }
}
return %master;
}
Output:
>A1#1:10
CTATTATTTA
AAGTGTGTTA
>A1#15:20
ACCTAC
ATTAAT
>A2#2:11
TCTGCACAGC
ACCCCCCCCT
AAACCCCAAA
>A2#13:16
GCTT
CCCC
ACAA

How to create the next file or folder in a series of progressively numbered files?

Sorry for the bad title but this is the best I could do! :D
I have a script which creates a new project every time the specified function is called.
Each project must be stored in its own folder, with the name of the project. But, if you don't specify a name, the script will just name it "new projectX", where X is a progressive number.
With time the user could rename the folders or delete some, so every time the script runs, it checks for the smallest number available (not used by another folder) and creates the relevant folder.
Now I managed to make a program which I think works as wanted, but I would like to hear from you if it's OK or there's something wrong which I'm unable to spot, given my inexperience with the language.
while ( defined( $file = readdir $projects_dir ) )
{
# check for files whose name start with "new project"
if ( $file =~ m/^new project/i )
{
push( #files, $file );
}
}
# remove letters from filenames, only the number is left
foreach $file ( #files )
{
$file =~ s/[a-z]//ig;
}
#files = sort { $a <=> $b } #files;
# find the smallest number available
my $smallest_number = 0;
foreach $file ( #files )
{
if ( $smallest_number != $file )
{
last;
}
$smallest_number += 1;
}
print "Smallest number is $smallest_number";
Here's a basic approach for this sort of problem:
sub next_available_dir {
my $n = 1;
my $d;
$n ++ while -e ($d = "new project$n");
return $d;
}
my $project_dir = next_available_dir();
mkdir $project_dir;
If you're willing to use a naming pattern that plays nicely with Perl's string auto-increment feature, you can simplify the code further, eliminating the need for $n. For example, newproject000.
I think I would use something like:
use strict;
use warnings;
sub new_project_dir
{
my($base) = #_;
opendir(my $dh, $base) || die "Failed to open directory $base for reading";
my $file;
my #numbers;
while ($file = readdir $dh)
{
$numbers[$1] = 1 if ($file =~ m/^new project(\d+)$/)
}
closedir($dh) || die "Failed to close directory $base";
my $i;
my $max = $#numbers;
for ($i = 0; $i < $max; $i++)
{
next if (defined $numbers[$i]);
# Directory did not exist when we scanned the directory
# But maybe it was created since then!
my $dir = "new project$i";
next unless mkdir "$base/$dir";
return $dir;
}
# All numbers from 0..$max were in use...so try adding new numbers...
while ($i < $max + 100)
{
my $dir = "new project$i";
$i++;
next unless mkdir "$base/$dir";
return $dir;
}
# Still failed - give in...
die "Something is amiss - all directories 0..$i in use?";
}
Test code:
my $basedir = "base";
mkdir $basedir unless -d $basedir;
for (my $j = 0; $j < 10; $j++)
{
my $dir = new_project_dir($basedir);
print "Create: $dir\n";
if ($j % 3 == 2)
{
my $k = int($j / 2);
my $o = "new project$k";
rmdir "$basedir/$o";
print "Remove: $o\n";
}
}
Try this:
#!/usr/bin/env perl
use strict;
use warnings;
# get the current list of files
# see `perldoc -f glob` for details.
my #files = glob( 'some/dir/new\\ project*' );
# set to first name, in case there are none others
my $next_file = 'new project1';
# check for others
if( #files ){
# a Schwartian transform
#files = map { $_->[0] } # get original
sort { $a->[1] <=> $b->[1] } # sort by second field which are numbers
map { [ $_, do{ ( my $n = $_ ) =~ s/\D//g; $n } ] } # create an anonymous array with original value and the second field nothing but digits
#files;
# last file name is the biggest
$next_file = $files[-1];
# add one to it
$next_file =~ s/(.*)(\d+)$/$1.($2+1)/e;
}
print "next file: $next_file\n";
Nothing wrong per se, but that's an awful lot of code to achieve a single objective (get the minimum index of directories.
A core module, couple of subs and few Schwartzian transforms will make the code more flexible:
use strict;
use warnings;
use List::Util 'min';
sub num { $_[0] =~ s|\D+||g } # 'new project4' -> '4', 'new1_project4' -> '14' (!)
sub min_index {
my ( $dir, $filter ) = #_;
$filter = qr/./ unless defined $filter; # match all if no filter specified
opendir my $dirHandle, $dir or die $!;
my $lowest_index = min # get the smallest ...
map { num($_) } # ... numerical value ...
grep { -d } # ... from all directories ...
grep { /$filter/ } # ... that match the filter ...
readdir $dirHandle; # ... from the directory contents
$lowest_index++ while grep { $lowest_index == num( $_ ) } readdir $dirhandle;
return $lowest_index;
}
# Ready to use!
my $index = min_index ( 'some/dir' , qr/^new project/ );
my $new_project_name = "new project $index";