How to store File::Find::name output in an array


I've managed to extract the filenames of my .txt files, but I'm having trouble storing them in an array.
Filenames:
sample1.txt sample2.txt sample3.txt
Code:
sub find_files {
    my $getfile = $File::Find::name;
    if ($getfile =~ m/txt$/) {
        my @sample;
        ($file, $path, $ext) = fileparse($getfile, qr/\..*/);
        push(@sample, "$file");
        print "$sample[0] ";
    }
}
Expected output:
sample1
Output:
sample1 sample2 sample3

You are storing each file name in @sample, but that array is declared in far too small a scope: it is discarded at the end of the if block, right after the print, so every call to find_files starts with a fresh, empty array.
This should work rather better. It's also more concise and makes sure that the items found are files, not directories.
use File::Find;
use File::Basename;

my @sample;
sub find_files {
    return unless -f and /\.txt\z/i;
    my ($file, $path, $ext) = fileparse($File::Find::name, qr/\.[^.]*\z/);
    push @sample, $file;
}
find(\&find_files, '/my/dir');
print "$_\n" for @sample;
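If you would rather not keep a file-level array around, one possible variation is to wrap the search in a function and let the callback close over a lexical array. This is only a sketch: the name txt_basenames is made up for illustration, and it assumes you want the base names sorted.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(fileparse);

# Hypothetical helper (not part of the answer above): returns the base names
# of all .txt files found anywhere under $dir.
sub txt_basenames {
    my ($dir) = @_;
    my @found;    # declared outside the callback, so it survives every call
    find(
        sub {
            # File::Find sets $_ to the current entry's name
            return unless -f $_ && /\.txt\z/i;
            my ($base) = fileparse($_, qr/\.[^.]*\z/);
            push @found, $base;
        },
        $dir,
    );
    my @sorted = sort @found;
    return @sorted;
}
```

Calling it would look like print "$_\n" for txt_basenames('/my/dir'); the array lives exactly as long as one search, which avoids the scoping problem described above.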

Related

How to check whether one file's value contains in another text file? (perl script)

I would like to check whether the values in one file also appear in another file. If a value is present, the script should report that there is an existing bin for that entry; if not, it should report that there is no existing bin. The problem is that I am not sure how to check all the values at once.
The first text file, DID1, contains:
L84A:D:O:M:
L84C:B:E:D:
The second text file, DID, contains:
L84A:B:E:Q:X:F:i:M:Y:
L84C:B:E:Q:X:F:i:M:Y:
L83A:B:E:Q:X:F:i:M:Y:
If the first four characters match, all the values on that line need to be checked.
For example, L84A has the value M in both the first and the second text file, so the script should print that there is an existing M bin.
Below is my code:
use strict;
use warnings;
my $filename = 'DID.txt';
my $filename1 = 'DID1.txt';
my $count = 0;
open( FILE2, "<$filename1" )
    or die("Could not open log file. $!\n");
while (<FILE2>) {
    my ($number) = $_;
    chomp($number);
    my @values1 = split( ':', $number );
    open( FILE, "<$filename" )
        or die("Could not open log file. $!\n");
    while (<FILE>) {
        my ($line) = $_;
        chomp($line);
        my @values = split( ':', $line );
        foreach my $val (@values) {
            if ( $val =~ /$values1[0]/ ) {
                $count++;
                if ( $values[$count] =~ /$values1[$count]/ ) {
                    print
                        "Yes ,There is an existing bin & DID\n @values1\n";
                }
                else {
                    print "No, There is an existing bin & DID\n";
                }
            }
        }
    }
}
I cannot check all the values. Please give me any advice you can, since this is my first time learning Perl. Thanks a lot :)

Based on my understanding, I wrote this code:
use strict;
use warnings;
#use ReadWrite;
use Array::Utils qw(:all);
use vars qw($my1file $myfile1cnt $my2file $myfile2cnt @output);
$my1file = "did1.txt"; $my2file = "did2.txt";
We are going to read both first and second files (DID1 and DID2).
readFileinString($my1file, \$myfile1cnt); readFileinString($my2file, \$myfile2cnt);
As the OP requested, the first four characters of each line in the first file are matched against the second file; if they match, the remaining characters of that line are checked against the matching line in the second file.
while($myfile1cnt=~m/^((\w){4})\:([^\n]+)$/mig)
{
print "<LineStart>";
my $lineChk = $1; my $full_Line = $3; #print ": $full_Line\n";
my @First_values = split /\:/, $full_Line; #print join "\n", @First_values;
If the first four characters match, then:
if($myfile2cnt=~m/^$lineChk\:([^\n]+)$/m)
{
Store the rest of the matching line, split it on colons, and keep the characters to be compared with the first file's contents.
my $FullLine = $1; my @second_values = split /:/, $FullLine;
Then check each letter from the first file against the matching line of the second file.
foreach my $sngletter(@First_values)
{
If a letter appears in both files, it is printed.
if( grep {$_ eq "$sngletter"} @second_values)
{
print "Matched: $sngletter\t";
}
}
}
else { print "Not Matched..."; }
This just marks the end of the line in the output.
print "<LineEnd>\n"
}
#------------------>Reading a file
sub readFileinString
#------------------>
{
my $File = shift;
my $string = shift;
use File::Basename;
my $filenames = basename($File);
open(FILE1, "<$File") or die "\nFailed Reading File: [$File]\n\tReason: $!";
read(FILE1, $$string, -s $File, 0);
close(FILE1);
}

Read the search patterns and the data into hashes (the first field is the key), then go through the data and select only the fields included in the pattern for that key.
use strict;
use warnings;
use feature 'say';
my $input1 = 'DID1.txt'; # look for key,pattern(array)
my $input2 = 'DID.txt'; # data - key,elements(array)
my $pattern;
my $data;
my %result;
$pattern = file2hash($input1); # read pattern into hash
$data = file2hash($input2); # read data into hash
while( my($k,$v) = each %{$data} ) { # walk through data
next unless defined $pattern->{$k}; # skip keys that are not in the pattern hash
my $find = join '|', @{ $pattern->{$k} }; # form search pattern for grep
my @found = grep {/$find/} @{ $v }; # extract only those of interest
$result{$k} = \@found; # store in result hash
}
while( my($k,$v) = each %result ) { # walk through result hash
say "$k has " . join ':', @{ $v }; # output final result
}
sub file2hash {
my $filename = shift;
my %hash;
my $fh;
open $fh, '<', $filename
or die "Couldn't open $filename";
while(<$fh>) {
chomp;
next if /^\s*$/; # skip empty lines
my($key,@data) = split ':';
$hash{$key} = \@data;
}
close $fh;
return \%hash;
}
Output
L84C has B:E
L84A has M
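The regex-based grep above treats each wanted value as a pattern, so a value like B would also match a hypothetical bin named AB. If exact matches are what you need, a hash lookup is one alternative. The following is a sketch only: parse_bins and common_bins are names invented for this example, and the input is assumed to be lines in the KEY:a:b:c: format shown in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "KEY:a:b:c:" lines into { KEY => [a, b, c] }.
sub parse_bins {
    my (@lines) = @_;
    my %bins;
    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip empty lines
        my ($key, @values) = split /:/, $line;
        $bins{$key} = \@values;
    }
    return \%bins;
}

# For every key present in both hashes, keep the wanted bins that
# also exist (by exact string comparison) in the data.
sub common_bins {
    my ($want, $have) = @_;
    my %result;
    for my $key (sort keys %$want) {
        next unless exists $have->{$key};
        my %present = map { $_ => 1 } @{ $have->{$key} };
        $result{$key} = [ grep { $present{$_} } @{ $want->{$key} } ];
    }
    return \%result;
}
```

With the sample data from the question this gives the same answer as the output above (M for L84A, B and E for L84C); the difference only shows up once bin names are longer than one character.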

Perl script to pair two array

I want to pair two arrays and add a '/' character between them. Say the two arrays are like below
@array1 = (FileA .. FileZ);
@array2 = (FileA.txt .. FileZ.txt);
The output that I want is like below
../../../experiment/fileA/fileA.txt
.
.
../../../experiment/fileZ/fileZ.txt
here is my code
my @input_name = input();
my $dirname = "../../../experiment/";
# CREATE FOLDER PATH
my @fileDir;
foreach my $input_name (@input_name){
    chomp $input_name;
    $_ = $dirname . $input_name;
    push @fileDir, $_;
}
# CREATE FILE NAME
my @filename;
my $extension = '.txt';
foreach my $input_name (@input_name){
    chomp $input_name;
    $_ = $input_name . $extension;
    push @filename, $_;
}
The code that I tried is below, but it doesn't seem to work
#CREATE FULL PATH
foreach my $test_path (@test_path){
    foreach my $testname (@testname){
        my $test = map "$test_path[$_]/$testname[$_]", 0..$#test_path;
        push @file, $test;
    }
}
print @file;

I assume input() returns something like ('fileA', 'fileB').
The problem with your code is the nested loop here:
foreach my $test_path (@test_path){
foreach my $testname (@testname){
This combines every $test_path with every possible $testname. You don't want that. Also, it doesn't make much sense to assign the result of map to a scalar: All you'll get is the number of elements in the list created by map.
(Also, you have random chomp calls sprinkled throughout your code. None of those should be there.)
You only need a single array and a single loop:
use strict;
use warnings;
sub input {
    return ('fileA', 'fileB');
}
my @input = input();
my $dirname = '../../../experiment';
my @files = map "$dirname/$_/$_.txt", @input;
for my $file (@files) {
    print "got $file\n";
}
Here the loop is hidden in the map ..., @input call. If you want to write it as a for loop, it would look like this:
my @files;
for my $input (@input) {
    push @files, "$dirname/$input/$input.txt";
}
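If you really do end up with two separate arrays that have to be paired element by element, the map over indices from the question can be made to work by assigning it to a list instead of a scalar and dropping both loops. A minimal sketch, assuming both arrays have the same length; pair_paths is a made-up name:

```perl
use strict;
use warnings;

# Hypothetical helper: join element i of @$dirs with element i of @$names.
sub pair_paths {
    my ($dirs, $names) = @_;
    die "arrays differ in length\n" unless @$dirs == @$names;
    # one pass over the shared index range, no nested loop
    my @paths = map { "$dirs->[$_]/$names->[$_]" } 0 .. $#{$dirs};
    return @paths;
}

my @dirs  = ('../../../experiment/fileA', '../../../experiment/fileB');
my @names = ('fileA.txt', 'fileB.txt');
print "$_\n" for pair_paths(\@dirs, \@names);
```

Each index is visited once, so every directory is combined only with the file name at the same position.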

The problem is your algorithm. You're iterating all filenames and all dirnames at the same time.
I mean, your code says "For every directory, create every file".
Try something along the lines of this and you'll be fine:
# WRITE TESTFILE
foreach my $filename (@filename){
    chomp $filename;
    if ( -e "$filename/$filename" and -d "$filename/$filename" ){
        print "File already exists\n";
    }
    else {
        open ( TXT_FILE, ">$filename/$filename" );
        print TXT_FILE "Hello World";
        close TXT_FILE;
    }
}

Perl -two list matching elements

I am trying to get the list of files that Jenkins has updated between the last build and the latest build, and store it in a Perl array.
I also have a list of files and folders in the source code that are considered sensitive in terms of changes, such as XXXX\yyy/., XXX/TTTT/FFF.txt, ... in FILE.TXT.
I want the script to tell me whether any of these sensitive files were among my changed files and, if so, to list their names so that we can double-check the change with the development team before we trigger the build.
How should I achieve this, and how do I compare multiple files under one path against the changed file paths?
I have written the script below, which is called inside Jenkins with %workspace# as its argument.
It does not give any matching result.
use warnings;
use Array::Utils qw(:all);
$url = `curl -s "http://localhost:8080/job/Rev-number/lastStableBuild/" | findstr "started by"`;
$url =~ /([0-9]+)/;
system("cd $ARGV[1]");
@difffiles = `svn diff -r $1:HEAD --summarize`;
chomp @difffiles;
foreach $f (@difffiles) {
    $f = substr( $f, 8 );
    $f = "$f\n";
}
open FILE, '/path/to/file'
    or die "Can't open file: $!\n";
@array = <FILE>;
@isect = intersect( @difffiles, @array );
print @isect;

I have managed to solve this issue using the Perl script below:
sub Verifysensitivefileschanges()
{
    $count1=0;
    @isect = intersect(@difffiles,@sensitive);
    #print "ISECT=@isect\n";
    foreach $t (@isect)
    {
        if (@isect) {print "Matching element found -- $t\n"; $count1=1;}
    }
    return $count1;
}
sub Verifysensitivedirschanges()
{
    $count2=0;
    foreach $g (@difffiles)
    {
        $dirs = dirname($g);
        $filename = basename($g);
        #print "$dirs\n";
        foreach $j (@array)
        {
            if( "$j" =~ /\Q$dirs/)
            {print "Sensitive path files changed under path $j and file name is $filename\n";$count2=1;}
        }
    }
    return $count2;
}
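The two checks above (exact file match and "lives under a sensitive directory") can also be folded into one pass without Array::Utils. This is only a sketch under some assumptions that are not in the original script: both lists are already chomped, backslashes and forward slashes should be treated as the same separator, and a sensitive entry with a trailing slash means "everything below this directory". The name sensitive_hits is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the changed paths that are either listed
# verbatim as sensitive or sit somewhere below a sensitive directory.
sub sensitive_hits {
    my ($changed, $sensitive) = @_;

    # Normalise the rules once: unify separators, drop trailing slashes.
    my @rules = map {
        my $p = $_;
        $p =~ s{\\}{/}g;
        $p =~ s{/+\z}{};
        $p;
    } @$sensitive;

    my @hits;
    for my $path (@$changed) {
        (my $norm = $path) =~ s{\\}{/}g;
        # exact match, or the rule followed by "/" is a prefix of the path
        push @hits, $path
            if grep { $norm eq $_ || index($norm, "$_/") == 0 } @rules;
    }
    return @hits;
}
```

Comparing against "$_/" rather than the bare rule keeps a rule such as src/secure from matching an unrelated sibling like src/secure2.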

Need some help in program logic

I am trying to read a config file and discard the directories that are listed in there with size mentioned in the file. So far I have this-
open FILE, 'C:\reports\config.txt' or die $!;
my $size_req;
my $path;
my $sub_dir;
my $count;
my @lines = <FILE>;
foreach $_ (@lines)
{
my @line = split /\|/, $_;
if ($line[0] eq "size")
{
$size_req= $line[1];
$size_req= ">".$size_req*1024;;
}
if ($line[0] eq "path")
{
$path= $line[1];
}
if ($line[0] eq "directories")
{ my $aa;
my $siz_two_digit;
my $sub_dir;
my $i;
my $array_size=@line;
for($i=1; $i < $array_size; )
{
$sub_dir=$line[$i];
print $sub_dir;
print "\n";
print $path;
print "\n";
my $r1 = File::Find::Rule->directory
->name($sub_dir)
->prune # don't go into it
->discard; # don't report it
my $fn = File::Find::Rule->file
->size( $size_req );
my @files = File::Find::Rule->or( $r1, $fn )
->in( $path);
print @files;
undef @files;
print @files;
$i++;
print "\n";
print "\n";
}
}
}
The problem with the for loop is this: it reads all the subdirectories to be discarded from the array just fine. However, when it handles the first directory to be discarded, it does not know about the remaining subdirectories and lists them too. When it moves on to the second value, it ignores the previous one and lists that as well.
Does anyone know whether File::Find::Rule can take an array at a time, so that the code considers the entire line in the configuration file at once? Or is there some other logic I could use?
Thank you

This code does not do what you think:
my $r1 = File::Find::Rule->directory
->name($sub_dir)
->prune # don't go into it
->discard; # don't report it
You are trying to store a rule in a scalar, but what you are actually doing is calling File::Find::Rule and converting the resulting list to an integer (the number of elements in the list) and storing that in $r1.
Just put the whole call in the @files call. It may look messy but it will work a whole lot better.
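On the question of excluding every listed subdirectory in a single search: one way to sidestep the issue is to use the core File::Find module instead of File::Find::Rule and prune by name from a hash. This swaps in a different module from the one in the question, so treat it as a sketch; big_files_outside is a made-up name, the size threshold is assumed to be in bytes, and directories are matched by plain name only.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: list files under $root that are larger than
# $min_bytes, never descending into a directory named in @$skip_dirs.
sub big_files_outside {
    my ($root, $min_bytes, $skip_dirs) = @_;
    my %skip = map { $_ => 1 } @$skip_dirs;    # all names known up front
    my @files;
    find(
        sub {
            if (-d $_ && $skip{$_}) {
                $File::Find::prune = 1;        # skip this whole subtree
                return;
            }
            return unless -f $_;
            # "_" reuses the stat information from the -f test above
            push @files, $File::Find::name if (-s _) > $min_bytes;
        },
        $root,
    );
    return @files;
}
```

Because the whole list of directory names is in %skip before the search starts, a single traversal honours every exclusion, which is the behaviour the loop in the question could not get by running one search per directory.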

How to check for files that has two different extensions in Perl

I have a file reflog with the content below. There will be items with the same name but different extensions. For each item (file1, file2 and file3 here, as examples), I want to check that it exists with both extensions (.abc and .def). If both extensions exist, the script will perform some regex and print the name. Otherwise it will just report the file name together with its extension (i.e. if only one of file1.abc or file1.def exists, that one is printed).
reflog:
file1.abc
file2.abc
file2.def
file3.abc
file3.def
file4.abc
file5.abc
file5.def
file6.def
file8abc.def
file7.abc
file1.def
file9abc.def
file10def.abc
My script is below (edited from yb007's script), but I have some issues with the output that I don't know how to resolve. I notice that the output is wrong when the reflog file contains any file named *abc.def (e.g. file8abc.def and file9abc.def). It trims off the last four characters and returns the wrong extension (.abc here, but I think it should be .def).
#! /usr/bin/perl
use strict;
use warnings;
my @files_abc ;
my @files_def ;
my $line;
open(FILE1, 'reflog') || die ("Could not open reflog") ;
open (FILE2, '>log') || die ("Could not open log") ;
while ($line = <FILE1>) {
    if($line=~ /(.*).abc/) {
        push(@files_abc,$1);
    } elsif ($line=~ /(.*).def/) {
        push(@files_def,$1); }
}
close(FILE1);
my %first = map { $_ => 1 } @files_def ;
my @same = grep { $first{$_} } @files_abc ;
my @abc_only = grep { !$first{$_} } @files_abc ;
foreach my $abc (sort @abc_only) {
    $abc .= ".abc";
}
my %second = map {$_=>1} @files_abc;
my @same2 = grep { $second{$_} } @files_def; # @same and @same2 are equal.
my @def_only = grep { !$second{$_} } @files_def;
foreach my $def (sort @def_only) {
    $def .= ".def";
}
my @combine_all = sort (@same, @abc_only, @def_only);
print "\nCombine all:-\n @combine_all\n" ;
print "\nList of files with same extension\n @same";
print "\nList of files with abc only\n @abc_only";
print "\nList of files with def only\n @def_only";
foreach my $item (sort @combine_all) {
    print FILE2 "$item\n" ;
}
close (FILE2) ;
My output, which is wrong, looks like this.
Screen output:
Combine all:-
file.abc file.abc file1 file10def.abc file2 file3 file4.abc file5 file6.def file7.abc
List of files with same extension
file1 file2 file3 file5
List of files with abc only
file4.abc file.abc file7.abc file.abc file10def.abc
List of files with def only
file6.def
Log output as below:
**file.abc
file.abc**
file1
file10def.abc
file2
file3
file4.abc
file5
file6.def
file7.abc
Can you please help me see where it goes wrong? Thanks heaps.

ALWAYS add
use strict;
use warnings;
to the head of your program. They will catch most simple errors before you need to ask for help.
You should always check whether a file open succeeded with open FILE, "reflog" or die $!;
You are using a variable $ine that doesn't exist. You mean $line
The lines you read into the array contain a trailing newline. Write chomp @lines; to remove them
Your regular expressions are wrong and you need || instead of &&. Instead write if ($line =~ /\.(iif|isp)$/)
If you still have problems when these are fixed then please ask again.

Aside from the errors already pointed out, you appear to be loading @lines from FUNC instead of FILE. Is that also a typo?
Also, If reflog truly contains a series of lines with one filename on each line, why would you ever expect the conditional "if ($line =~ /.abc/ && $line =~ /.def/)" to evaluate true?
It would really help if you could post an example from the actual file you are reading from, along with the actual code you are debugging. Or at least edit the question to fix the typos already mentioned

use strict;
use warnings;
my @files_abc;
my @files_def;
my $line;
open(FILE,'reflog') || die ("could not open reflog");
while ($line = <FILE>) {
    if($line=~ /(.*)\.abc/) {
        push(@files_abc,$1);
    }
    elsif($line=~ /(.*)\.def/) {
        push(@files_def,$1);
    }
}
close(FILE);
my %second = map {$_=>1} @files_def;
my @same = grep { $second{$_} } @files_abc;
print "\nList of files with same extension\n @same";
foreach my $abc (@files_abc) {
    $abc .= ".abc";
}
foreach my $def (@files_def) {
    $def .= ".def";
}
print "\nList of files with abc extension\n @files_abc";
print "\nList of files with def extension\n @files_def";
Output is
List of files with same extension
file1 file2 file3 file5
List of files with abc extension
file1.abc file2.abc file3.abc file4.abc file5.abc file7.abc file10def.abc
List of files with def extension
file2.def file3.def file5.def file6.def file8abc.def file1.def file9abc.def
Hope this helps...
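If you also want the "only one extension" list from the question, one option is to record which extensions were seen per base name in a hash of hashes, with the regex anchored at the end of the name so that file8abc.def cannot be mistaken for an .abc file. A sketch, assuming the names have already been chomped; classify is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical helper: split names into those seen with both extensions
# and those seen with only one (reported with that extension).
sub classify {
    my (@names) = @_;
    my %ext;    # base name => { abc => 1, def => 1 }
    for my $name (@names) {
        # the extension must be the very last thing in the name
        next unless $name =~ /\A(.+)\.(abc|def)\z/;
        $ext{$1}{$2} = 1;
    }
    my (@both, @single);
    for my $base (sort keys %ext) {
        my @seen = sort keys %{ $ext{$base} };
        if (@seen == 2) { push @both,   $base }
        else            { push @single, "$base.$seen[0]" }
    }
    return (\@both, \@single);
}
```

For the reflog in the question this should put file1, file2, file3 and file5 in the first list and report file8abc.def and file9abc.def with their real extension in the second.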

You don't need to slurp the whole file; you can read one line at a time. I think this code works on this extended version of your reflog file:
xx.pl
#!/usr/bin/env perl
use strict;
use warnings;
open my $file, '<', "reflog" or die "Failed to open file reflog for reading ($!)";
open my $func, '>', 'log' or die "Failed to create file log for writing ($!)";
my ($oldline, $oldname, $oldextn) = ("", "", "");
while (my $newline = <$file>)
{
chomp $newline;
$newline =~ s/^\s*//;
my ($newname, $newextn) = ($newline =~ m/(.*)([.][^.]*)$/);
if ($oldname eq $newname)
{
# Found the same file - presumably $oldextn eq ".abc" and $newextn eq ".def"
print $func "$newname\n";
print "$newname\n";
$oldline = "";
$oldname = "";
$oldextn = "";
}
else
{
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
$oldline = $newline;
$oldname = $newname;
$oldextn = $newextn;
}
}
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
#unlink "reflog" ;
chmod 0644, "log";
close $func;
close $file;
Since the code does not actually check the extensions, it would be feasible to omit $oldextn and $newextn; on the other hand, you might well want to check the extensions if you're sufficiently worried about the input format to need to deal with leading white space.
I very seldom find it good for a processing script like this to remove its own input, hence I've left unlink "reflog"; commented out; your mileage may vary. I would also often just read from standard input and write to standard output; that would simplify the code quite a bit. This code writes to both the log file and to standard output; obviously, you can omit either output stream. I was too lazy to write a function to handle the writing, so the print statements come in pairs.
This is a variant on control-break reporting.
reflog
file1.abc
file1.def
file2.abc
file2.def
file3.abc
file3.def
file4.abc
file5.abc
file5.def
file6.def
file7.abc
Output
$ perl xx.pl
file1
file2
file3
file4.abc
file5
file6.def
file7.abc
$ cat log
file1
file2
file3
file4.abc
file5
file6.def
file7.abc
$
To handle unsorted file names with blank lines
#!/usr/bin/env perl
use strict;
use warnings;
open my $file, '<', "reflog" or die "Failed to open file reflog for reading ($!)";
open my $func, '>', 'log' or die "Failed to create file log for writing ($!)";
my @lines;
while (<$file>)
{
chomp;
next if m/^\s*$/;
push @lines, $_;
}
@lines = sort @lines;
my ($oldline, $oldname, $oldextn) = ("", "", "");
foreach my $newline (@lines)
{
chomp $newline;
$newline =~ s/^\s*//;
my ($newname, $newextn) = ($newline =~ m/(.*)([.][^.]*)$/);
if ($oldname eq $newname)
{
# Found the same file - presumably $oldextn eq ".abc" and $newextn eq ".def"
print $func "$newname\n";
print "$newname\n";
$oldline = "";
$oldname = "";
$oldextn = "";
}
else
{
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
$oldline = $newline;
$oldname = $newname;
$oldextn = $newextn;
}
}
print $func "$oldline\n" if ($oldline);
print "$oldline\n" if ($oldline);
#unlink "reflog" ;
chmod 0644, "log";
close $func;
close $file;
This is very similar to the original code I posted. The new lines are these:
my @lines;
while (<$file>)
{
chomp;
next if m/^\s*$/;
push @lines, $_;
}
@lines = sort @lines;
my ($oldline, $oldname, $oldextn) = ("", "", ""); # Old
foreach my $newline (@lines)
This reads the 'reflog' file, skipping blank lines, saving the rest in the @lines array. When the lines are all read, they're sorted. Then, instead of a loop reading from the file, the new code reads entries from the sorted array of lines. The rest of the processing is as before. For your described input file, the output is:
file1
file2
file3
Urgh: the chomp $newline; is not needed, though it is not otherwise harmful. The old-fashioned chop (a precursor to chomp) would have been dangerous. Score one for modern Perl.

open( FILE, '<', "reflog" ) or die "Could not open reflog: $!";
open( FUNC, '>', 'log' )    or die "Could not open log: $!";
my %seen;
while ( my $line = <FILE> ) {
    chomp $line;
    $line =~ s/^\s*//;
    # count how many of the two extensions were seen for each base name
    if ( $line =~ /(.+)\.(abc|def)$/ ) {
        $seen{$1}++;
    }
}
foreach my $file ( keys %seen ) {
    if ( $seen{$file} > 1 ) {
        ## do whatever you want to
    }
}
unlink "reflog";
chmod( 0750, "log" );
close(FUNC);
close(FILE);