Perl automated file testing

So I'm taking a Perl programming class and our teacher gave us our first assignment with very little talk about how to actually program perl. Here's exactly what our teacher assigned:
"You should write a script (you may name you script whatever you deem appropriate) that accepts 3 filenames as arguments. The first filename corresponds to the program source code written in C++. The second filename corresponds to an input file that is to be used by the C++ program listed as the first filename. The third filename corresponds to a text file that contains the expected, correct output for the program in question. A directory path may be provided with any of the filenames.
If the program doesn’t require an input file, the second parameter on the command line should be the filename "/dev/null".
All of the files - the program source, the input file, and expected output file - should be copied to a scratch test directory before any of them are used by the script so that there is little chance that the originals will be modified by the test procedure. If the scratch directory doesn’t exist, your script should create one as a subdirectory of the current working directory.
Your script should then compile and link the scratch copy of the source using the GNU g++ compiler. The script should then run the program – saving the output to a temporary file stored in the scratch test directory.
After running the program, your script should then use the UNIX command diff to compare the actual output generated in the previous step with the expected output file and report either that the output conforms to specifications or report any differences as reported by the diff program.
After completion, your script should remove all of the temporary copies and scratch files. Do not remove the original program, the original input file, the original expected output file, or the scratch directory itself."
I have this so far:
#!/usr/bin/perl -w
use strict;
my ($line, $program, $input, $output);
print "Give the program, input, and standart output for testing. ";
$line = <>;
chomp $line;
($program, $input, $output) = split/\s+/, $line; # split/\s+/ is to separate spaces from the input
my($o_test) = $output + "_test";
print "$program ";
print "$input ";
print "$output ";
system("mkdir test_scratch") == 0
or die "failed to create test_scratch. exiting...."
system("cp $program, /test_scratch/"); # error
system("cp $input, /test_scratch/");
system("cp $output, /test_scratch/");
system("cd test_scratch");
system("g++ $program");
system("chmod +x a.out");
system("./a.out < $input > $o_test");
my($DIFF) = system("diff $output $o_test") # error
if[ $DIFF != ""]
print ("Output conforms to specifications."); # error
then
print ("$DIFF");
system("cd ..");
I'm getting errors at the # in code. I don't even know how to do the "/dev/null". I've spent a lot of time looking things up online and searching stackoverflow, but I just don't know what else to do. I realize this is an extremely long question but I don't know what else to do. Thank you for ANY help you can give me.

There are several modules which can help you here. The first one I would recommend is ExtUtils::CBuilder, which can manage a build process for you. Then you might also use File::Copy for moving things into the temporary folder, and even File::chdir for managing the working directory. Since the prof specifies that you should use diff, perhaps you should, but there are modules which do that task, or you could use Test::More to check that the output is what is expected.
Just for future reference, this is how I would accomplish a similar task (don't keep temp dir, don't need diff):
#!/usr/bin/env perl
use strict;
use warnings;
use File::chdir;
use File::Temp;
use File::Copy;
use File::Basename;
use File::Slurp;
use ExtUtils::CBuilder;
use Test::More tests => 1;
die "Not enough inputs\n" unless @ARGV >= 3;
# create a temporary directory
my $temp = File::Temp->newdir;
# copy all arguments to that temporary directory
copy $_, $temp for @ARGV;
# store only the filename (not path) of each argument
my ($cpp_file, $in_file, $expected_file) = map { scalar basename $_ } @ARGV;
# change working directory to temporary one (via File::chdir)
local $CWD = $temp;
# build the executable
my $builder = ExtUtils::CBuilder->new(config => {cc => 'g++'});
my $obj = $builder->compile( source => $cpp_file, 'C++' => 1 );
my $exe = $builder->link_executable( objects => $obj );
# run the executable
my $output = `./$exe $in_file`;
# read in the expected file
my $expected = read_file $expected_file;
# test the resulting output
is $output, $expected, 'Got expected output from executable';
Note that you may need to be careful with newlines on the output and expected files.

I'm getting errors at the # in code.
The easy fix for this error is to remove the comma. The standard Linux cp command doesn't take a comma between the source and destination. You might also consider File::Copy to copy files in Perl.
I don't even know how to do the "/dev/null".
/dev/null is a standard Linux file(*), which points to emptiness. You can write bytes to this file, and they disappear. You can try to read bytes from this file, and nothing's ever there.
This leaves you with two options:
Just use /dev/null directly, and the program will be unable to read anything from an empty byte stream.
Check if the input file is equal to /dev/null, and don't give the C++ program an input value if the input is /dev/null.
Either method would work. If you copy /dev/null to a directory, you get a new zero-length file called 'null', which is a perfectly valid empty input file for the C++ program.
(*) No, /dev/null is not a file in the sense you're used to if you come from the Windows world. It doesn't exist on disk - never has and never will. But the classic Unix philosophy is to make every data source a file, even if that file doesn't exist on disk. As an example, see the Linux /proc filesystem, which allows you to see information about CPUs, processes, etc. in a directory tree structure. That philosophy held up somewhat until the sockets API took a completely different route. If you're looking for an operating system that does make basically everything into a file, including network connections and the screen, look up Plan 9. I would not recommend doing your homework assignments on Plan 9 and expecting them to work on Linux, though.
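To see what "nothing's ever there" means in practice, here is a minimal sketch. The helper name first_line is made up for illustration; the only thing taken from the discussion above is the path /dev/null. A program handed /dev/null as its input sees end-of-file on its first read.

```perl
use strict;
use warnings;

# Return the first line read from a file, or undef if there is nothing to read.
sub first_line {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

my $line = first_line('/dev/null');
print defined $line ? "got a line\n" : "end of file straight away\n";
```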
N.B. Don't forget to check return codes from system, since if g++ fails to compile the C++ program there will be no way to run a compiled program that doesn't exist.
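One way to check those return codes is sketched below. run_checked is a hypothetical helper, not something from the assignment or from a module, and the g++ file names in the comment are placeholders. It uses the list form of system, so no shell gets involved, and hands back the raw status so the caller can stop before trying to run a program that was never built.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command in list form and return its raw wait
# status. Zero means success; anything else means a non-zero exit or death
# by signal. Dies if the command could not be started at all.
sub run_checked {
    my @cmd = @_;
    my $status = system @cmd;
    die "Could not run $cmd[0]: $!\n" if $status == -1;
    return $status;
}

# Usage sketch (placeholder file names):
#   run_checked('g++', '-o', 'prog', 'prog.cpp') == 0
#       or die "Compilation failed; there is nothing to run\n";
print "exit code of 'true': ", run_checked('true') >> 8, "\n";
```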

Related

sh: 1: Syntax error: "(" unexpected

I am trying to run an executable perl file that copies a directory to another location and then removes every file in that new location except for those ending with .faa and .tsv. Here's the code:
#!/usr/bin/perl
use strict;
use warnings;
my $folder = $ARGV[0];
system ("cp -r ~/directoryA/$folder/ ~/directoryB/");
chdir "~/directoryB/$folder";
# Remove everything except for .faa and .tsv files
system ("rm !\(*.faa|*.tsv\)");
Regardless of whether or not I escape the parenthesis, I get the error:
sh: 1: Syntax error: "(" unexpected
and it didn't remove any files. The location of the perl file is ~/bin, and I'd like to avoid changing the #!/usr/bin/perl line since several computers will be using this script.
This is a little beyond my knowledge, as I only know basic scripting, but does anyone know why this is happening?

This entire program is much simpler without the use of shell commands.
I would write this, which copies only the wanted file types in the first place. I assume there are no nested directories to be copied.
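use strict;
use warnings;
use File::Copy 'copy';
my ($folder) = @ARGV;
# glob expands ~, but File::Copy does not, so spell out the destination
my $dest = "$ENV{HOME}/directoryB";
while ( my $file = glob "~/directoryA/$folder/*.{faa,tsv}" ) {
    copy $file, $dest or die "Cannot copy $file: $!";
}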

It's complaining about this line:
system ("rm !\(*.faa|*.tsv\)");
which, even if you get the quoting of the shell metacharacters right, is pretty obtuse: !(...) is a bash extended-glob pattern, but system hands the command to /bin/sh, and the error message shows that your sh does not understand that syntax.
Perl is up to the latter task.
unlink grep { -f $_ && !/[.]faa$/ && !/[.]tsv$/ } glob("*")
is one of several ways.
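Another of those ways, sketched under a few assumptions: remove_unwanted is a made-up helper name, the file names are invented, and the sketch works in a throwaway directory so it cannot touch real data. It uses opendir/readdir rather than glob, so it does not depend on the current working directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Delete every plain file in $dir whose name does not end in .faa or .tsv,
# and return the names that were selected for removal.
sub remove_unwanted {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @doomed = grep { -f "$dir/$_" && !/[.](?:faa|tsv)\z/ } readdir $dh;
    closedir $dh;
    for my $name (@doomed) {
        unlink "$dir/$name" or warn "Cannot remove $dir/$name: $!";
    }
    return sort @doomed;
}

# Try it on a scratch directory with invented file names.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.faa b.tsv c.log d.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}
my @removed = remove_unwanted($dir);
print "removed: @removed\n";
```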

Read same extension multiple files in one directory in Perl

I currently have an issue with reading files in one directory.
I need to take all the fastq files in a directory, run the script on each file, and put the new files in an ‘Edited_sequences’ folder.
The one script I had is
perl -ne '$i++; if($i<80001){print}' BM2003_TCCCAGAACAAC_L001_R1_001.fastq > ./Edited_sequences/BM2003_TCCCAGAACAAC_L001_R1_001.fastq
It takes the first 80000 lines in one fastq file then outputs the result.
Now suppose I have 2000 fastq files: I would need to copy and paste the command 2000 times.
I know there is a glob command suited to this situation, but I just do not know how to use it.
Please help me out.

You can let perl do the copy/paste for you. The leading arguments (*.fastq) are all the fastq files, and the last argument (./Edited_sequences) is the target folder for the new files:
perl -e '$d=pop; `head -n 80000 "$_" > "$d/$_"` for @ARGV' *.fastq ./Edited_sequences

glob gets you an array of filenames matching a particular expression. It's frequently used with <> brackets, a lot like reading input (you can think of it as reading files from a directory).
This is a simple example that will print the names of every ".fastq" file in the current directory:
print "$_\n" for <*.fastq>;
The important part is <*.fastq>, which gives us an array of filenames matching that expression (in this case, a file extension). If you need to change which directory your Perl script is working in, you can use chdir.
From there, we can process your files as needed:
while (my $filename = <*.fastq>) {
    open(my $in, '<', $filename) or die $!;
    open(my $out, '>', "./Edited_sequences/$filename") or die $!;
    for (1..80000) {
        my $line = <$in>;
        last unless defined $line;   # stop early if the file is shorter
        print $out $line;
    }
}

You have two choices:
Use Perl to read in the 2000 files and run it as part of your program
Use the Shell to pass each of those 2000 file to your command line
Here's the bash alternative:
for file in *.fastq
do
perl -ne '$i++; if($i<80001){print}' "$file" > "./Edited_sequences/$file"
done
This is your same Perl script, but with the shell finding each file. It should work without overloading the command line: given a glob, the bash for loop expands it and hands over one file name per iteration.
However, I always recommend that you don't actually execute the command, but echo the resulting commands into a file:
for file in *.fastq
do
echo "perl -ne '\$i++; if(\$i<80001){print}' \
\"$file\" > \"./Edited_sequences/$file\"" >> myoutput.txt
done
Then, you can look at myoutput.txt to make sure it looks good before you actually do any real harm. Once you've determined that myoutput.txt is a good file, you can execute that as a shell script:
$ bash myoutput.txt

how to create a script from a perl script which will use bash features to copy a directory structure

Hi, I have written a Perl script that copies an entire directory structure from a source to a destination. I also have to generate, from that Perl script, a restore script that undoes what the Perl script did: a shell script that uses bash features to restore the contents from the destination back to the source. I'm struggling to find the right function or command to copy recursively (recursion itself is not a requirement), but I want exactly the same structure as before.
Below is how I am trying to create a file called restore to do the restoration.
I'm particularly looking for an algorithm.
The restore script should also restore the structure to a directory given on its command line, if one is supplied; if not, you can assume the default inputs supplied to the Perl script:
$source
$target
In this case we want to copy from target to source.
So we have two different parts in one script:
1. It copies from source to destination.
2. It creates a script file which will undo what part 1 has done.
I hope this makes it clear.
unless(open FILE, '>'."$source/$file")
{
# Die with error message
# if we can't open it.
die "\nUnable to create $file\n";
}
# Write some text to the file.
print FILE "#!/bin/sh\n";
print FILE "$1=$target;\n";
print FILE "cp -r \n";
# close the file.
close FILE;
# here we change the permissions of the file
chmod 0755, "$source/$file";
The last problem I have is that I couldn't get $1 into my restore file, because Perl treats it as one of its own variables.
I need it so that restore can read its command-line input when it runs, e.g. with $0 = ./restore and $1 = /home/xubuntu/User.

First off, the standard way in Perl for doing this:
unless(open FILE, '>'."$source/$file") {
die "\nUnable to create $file\n";
}
is to use the or operator:
open my $file_fh, ">", "$source/$file"
or die qq(Unable to create "$file": $!);
It's just easier to understand.
A more modern way would be use autodie; which will handle all IO problems when opening or writing to files.
use strict;
use warnings;
use autodie;
open my $file_fh, '>', "$source/$file";
You should look at the Perl Modules File::Find, File::Basename, and File::Copy for copying files and directories:
use File::Find;
use File::Basename;
use File::Copy;
my @file_list;
find ( sub {
return unless -f;
push @file_list, $File::Find::name;
},
$directory );
Now, @file_list will contain all the files in $directory.
for my $file ( @file_list ) {
my $directory = dirname $file;
mkdir $directory unless -d $directory;
copy $file, ...;
}
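Note that autodie will also terminate your program if mkdir fails. copy comes from File::Copy rather than being a Perl built-in, so autodie does not cover it by default; check its return value yourself.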
I didn't fill in the copy command because where you want to copy and how may differ. Also you might prefer use File::Copy qw(cp); and then use cp instead of copy in your program. The copy command will create a file with default permissions while the cp command will copy the permissions.
You didn't explain why you wanted a bash shell command. I suspect you wanted to use it for the directory copy, but you can do that in Perl anyway. If you still need to create a shell script, the easiest way is via a here-document:
print {$file_fh} <<END_OF_SHELL_SCRIPT;
Your shell script goes here
and it can contain as many lines as you need.
Since there are no quotes around `END_OF_SHELL_SCRIPT`,
Perl variables will be interpolated
This is the last line. The END_OF_SHELL_SCRIPT marks the end
END_OF_SHELL_SCRIPT
close $file_fh;
See Here-docs in Perldoc.
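Applied to the question's restore script, a here-doc might look like the sketch below. Everything specific in it is an assumption for illustration: the two example paths, the helper name restore_script_text, the shell variable dest, and writing the script into a scratch directory instead of $source. The point it shows is that \$1 is escaped, so Perl leaves it alone and the shell sees it when restore runs, while $source and $target are interpolated by Perl when the script is generated. It also assumes the paths contain no double quotes.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build the text of the restore script. $source and $target are filled in by
# Perl now; \$1 and \$dest are escaped so that the shell expands them later.
sub restore_script_text {
    my ($source, $target) = @_;
    return <<END_OF_SHELL_SCRIPT;
#!/bin/sh
# Restore into the directory given as the first argument, else the original source.
dest="\${1:-$source}"
mkdir -p "\$dest"
cp -r "$target/." "\$dest/"
END_OF_SHELL_SCRIPT
}

# Placeholder paths standing in for the question's $source and $target.
my ($source, $target) = ('/tmp/example_source', '/tmp/example_target');

# Write the script to a scratch directory and make it executable.
my $dir    = tempdir(CLEANUP => 1);
my $script = "$dir/restore";
open my $fh, '>', $script or die "Cannot create $script: $!";
print {$fh} restore_script_text($source, $target);
close $fh or die "Cannot close $script: $!";
chmod 0755, $script or die "Cannot chmod $script: $!";

print restore_script_text($source, $target);
```

Running the generated script as ./restore /some/dir would restore into /some/dir; with no argument it falls back to the source path that was baked in at generation time.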

First, I see that you want to make a copy-script - because if you only need to copy files, you can use:
system("cp -r /sourcepath /targetpath");
Second, if you need to copy subfolders, you can use -r switch, can't you?

Unix commands in Perl?

I'm very new to Perl, and I would like to make a program that creates a directory and moves a file into that directory using the Unix command like:
mkdir test
Which I know would make a directory called "test". From there I would like to give more options like:
mv *.jpg test
That would move all .jpg files into my new directory.
So far I have this:
#!/usr/bin/perl
print "Folder Name:";
$fileName = <STDIN>;
chomp($fileType);
$result=`mkdir $fileName`;
print"Your folder was created \n";
Can anyone help me out with this?

Try this:
#!/usr/bin/perl
use strict; use warnings;
print "Folder Name:";
my $dirName = <STDIN>;
chomp($dirName);
mkdir($dirName) && print "Your folder was created \n";
rename $_, "$dirName/$_" for <*.jpg>;
You will have a better control when using built-in perl functions than using Unix commands. That's the point of my snippet.

Most (if not all) Unix commands have a corresponding Perl function, e.g.
mkdir: see perldoc -f mkdir
mv: see perldoc -f rename
and so on. Either print out the relevant manual pages or take a trip to the book shop (the O'Reilly Nutshell book is quite good, along with others).

In Perl you can run shell commands in backticks. However, what happens when the directory isn't created by the mkdir command? Your program doesn't get notified of this and continues on its merry way thinking that everything is fine.
You should use Perl's built-in functions that do the same thing.
http://perldoc.perl.org/functions/mkdir.html
http://perldoc.perl.org/functions/rename.html
It is much easier to trap errors with those functions and fail gracefully. In addition, they run faster because you don't have to fork a new process for each command you run.
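A small sketch of what trapping such an error looks like. The directory name is made up and the whole thing runs inside a scratch directory so it is harmless to try. mkdir returns false on failure and leaves the reason in $!, so the program can react instead of carrying on.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $base = tempdir(CLEANUP => 1);   # scratch area so the sketch is harmless
my $dir  = "$base/test";            # made-up directory name

# mkdir returns false on failure and sets $! to the reason.
mkdir $dir or die "Cannot create $dir: $!\n";

# A second attempt fails because the directory already exists, and the
# program finds out immediately instead of continuing regardless.
my $second_ok = mkdir $dir;
my $exists    = $!{EEXIST} ? 1 : 0;   # %! maps errno names to true/false
print "second mkdir failed: directory already exists\n"
    if !$second_ok && $exists;
```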

Perl has some functions similar to those of the shell. You can just use
mkdir $filename;
You can use backquotes to run a shell command, but that is only useful if the command writes something to its standard output, which mkdir does not. For commands without output, use system:
0 == system "mv *.jpg $folder" or die "Cannot move: $?";

How does Perl interact with the scripts it is running?

I have a Perl script that runs a different utility (called Radmind, for those interested) that has the capability to edit the filesystem. The Perl script monitors output from this process, so it would be running throughout this whole situation.
What would happen if the utility being run by the script tried to edit the script file itself, that is, replace it with a newer version? Does Perl load the script and any linked libraries at the start of its execution and then ignore the script file itself unless told specifically to mess with it? Or perhaps, would all hell break loose, and executions might or might not fail depending on how the new file differed from the one being run?
Or maybe something else entirely? Apologies if this belongs on SuperUser—seems like a gray area to me.

It's not quite as simple as pavel's answer states, because Perl doesn't actually have a clean division of "first you compile the source, then you run the compiled code"[1], but the basic point stands: Each source file is read from disk in its entirety before any code in that file is compiled or executed and any subsequent changes to the source file will have no effect on the running program unless you specifically instruct perl to re-load the file and execute the new version's code[2].
[1] BEGIN blocks will run code during compilation, while commands such as eval and require will compile additional code at run-time
[2] Most likely by using eval or do, since require and use check whether the file has been loaded already and ignore it if it has.
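The difference described in [2] can be seen in a small sketch. The file name snippet.pl and the helper write_snippet are invented for the demonstration, and the file lives in a scratch directory. The file is rewritten between loads: the second require leaves it alone because it is already recorded in %INC, while do reads and runs the new contents.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/snippet.pl";       # made-up file name

# Write a one-line Perl file that evaluates to the given value.
sub write_snippet {
    my ($value) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "$value;\n";
    close $fh or die "Cannot close $file: $!";
}

write_snippet(10);
require $file;                      # compiles and runs the file, records it in %INC

write_snippet(20);                  # change the file on disk
require $file;                      # already in %INC, so the file is not read again
my $reloaded = do $file;            # do always re-reads and re-runs the file

print "do picked up the new contents: $reloaded\n";
```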

For a fun demonstration, consider
#! /usr/bin/perl
die "$0: where am I?\n" unless -e $0;
unlink $0 or die "$0: unlink $0: $!\n";
print "$0: deleted!\n";
for (1 .. 5) {
sleep 1;
print "$0: still running!\n";
}
Sample run:
$ ./prog.pl
./prog.pl: deleted!
./prog.pl: still running!
./prog.pl: still running!
./prog.pl: still running!
./prog.pl: still running!
./prog.pl: still running!

Your Perl script will be compiled first, then run; so changing your script while it runs won't change the running compiled code.
Consider this example:
#!/usr/bin/perl
use strict;
use warnings;
push @ARGV, $0;
$^I = '';
my $foo = 42;
my $bar = 56;
my %switch = (
foo => 'bar',
bar => 'foo',
);
while (<ARGV>) {
s/my \$(foo|bar)/my \$$switch{$1}/;
print;
}
print "\$foo: $foo, \$bar: $bar\n";
and watch the result when run multiple times.

The script file is read once into memory. You can edit the file from another utility after that -- or from the Perl script itself -- if you wish.

As the others said, the script is read into memory, compiled and run. GBacon shows that you can delete the file and it will keep running. The code below shows that you can change the file, do it, and get the new behavior.
use strict;
use warnings;
use English qw<$PROGRAM_NAME>;
open my $ph, '>', $PROGRAM_NAME;
print $ph q[print "!!!!!!\n";];
close $ph;
do $PROGRAM_NAME;
... DON'T DO THIS!!!

Perl scripts are simple text files that are read into memory, compiled in memory, and the text file script is not read again. (Exceptions are modules that come into lexical scope after compilation and do and eval statements in some cases...)
There is a well-known utility that exploits this behavior. Look at CPAN and its many versions, which are probably in your /usr/bin directory. There is a CPAN version for each version of Perl on your system. CPAN will sense when a new version of CPAN itself is available, ask if you want to install it, and if you say "y" it will download the newer version and respawn itself right where you left off without losing any data.
The logic of this is not hard to follow. Read /usr/bin/CPAN and then follow the individualized versions related to what $Config::Config{version} would generate on your system.
Cheers.
