CMG biotools - a Perl-based tool

sub runBlast {
    # order is preserved !
    for ( my $subject_counter = 0 ; $subject_counter < scalar ( @{$xmlcfg->{sources}[0]->{entry}} ) ; $subject_counter++ ) {
        my $subjectTitle    = $INFO{$subject_counter}{title};
        my $subjectSubtitle = $INFO{$subject_counter}{subtitle};
        for ( my $query_counter = 0 ; $query_counter < scalar ( @{$xmlcfg->{sources}[0]->{entry}} ) ; $query_counter++ ) {
            my $queryTitle    = $INFO{$query_counter}{title};
            my $querySubtitle = $INFO{$query_counter}{subtitle};
            $tab_h{"$query_counter-$subject_counter"} = $cm->start();
            unless ( $tab_h{"$query_counter-$subject_counter"} ) {
                my $blastreport_scratch = "$scratch/$query_counter-$subject_counter.blastout.gz";
                my $jobid = md5 ( "$scratch/$query_counter.fsa" , "$scratch/$subject_counter.fsa" ) ;
                system "$perl /usr/biotools/indirect/cacher --id='$jobid' --source='$cache_source' -action get > $blastreport_scratch";
                if ( $? != 0 or $clean or -s $blastreport_scratch == 0 ) {
                    print STDERR "# jobid $jobid not in cache - redoing\n";
                    my $cmd = "$BLASTALL -F 0 -p blastp -d $scratch/$subject_counter.fsa -e 1e-5 -m 7 < $scratch/$query_counter.fsa | $TIGRCUT | gawk '{print \$1\"\\t\"\$2}' | $gzip > $blastreport_scratch";
                    system $cmd;
                    die "# failed at '$cmd'\n" if $? != 0;
                    system "$perl /usr/biotools/indirect/cacher --id=$jobid --source=$cache_source -action put -expire 100 < $blastreport_scratch";
                } else {
                    my $s = -s $blastreport_scratch;
                    print STDERR "# fetched jobid $jobid from cache ( $s bytes)\n";
                }
                exit;
            }
        }
    }
    $cm->wait_all_children;
}
I am completely new to Perl programming. I had to run a tool called CMG Biotools, which is written in Perl, and I am attaching part of its code above. Can anyone please tell me when the "jobid ... not in cache - redoing" message will be displayed?

Your script, blastmatrix, attempts to use an external (to this script) Perl tool called "cacher" - /usr/biotools/indirect/cacher - passing the parameters
-action get
--source='$cache_source'
--id='$jobid'
So the script is attempting to retrieve a job with ID $jobid from a caching utility, and that retrieval is failing. Having failed, the "redoing" branch runs BLASTALL, that is /usr/biotools/blast/bin/blastall, and then calls the cacher again, this time with -action put to store the fresh result.
So, if all you are seeing is the message but the script is otherwise working, then I'd guess - and that's all I can do - that the BLASTALL run is cleaning up some issue - an unexpected file, a missing file, who knows - and the subsequent cache operations succeed.
If it's not working at all, I can only say that it finally fails - which is a different thing from saying "the root cause is ..." - when it attempts to use the cacher.
Note - all the above is speculative.
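To pin down exactly when that message appears: in the code above it is guarded by a single if, so it is printed whenever any one of three conditions holds after the cacher call. A trimmed-down sketch of just that decision (all variables are assumed to be set as in the full runBlast above):
system "$perl /usr/biotools/indirect/cacher --id='$jobid' --source='$cache_source' -action get > $blastreport_scratch";
if ( $? != 0                               # the cacher call failed (non-zero exit status)
     or $clean                             # a clean rerun was requested, ignoring the cache
     or -s $blastreport_scratch == 0 ) {   # the fetched report file is empty (or missing)
    print STDERR "# jobid $jobid not in cache - redoing\n";
    # ... BLAST is (re)run and its output is pushed back into the cache with -action put ...
}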

Related

Export Snapshots from Accurev using Perl

I am trying to use a Perl script to pull all snapshots out of AccuRev, but I'm having issues.
I can run this command fine on its own:
accurev show -p myDepot streams
This gets all the streams for me, but when I put it into my Perl script I come up empty and can't pass the result into a foreach loop.
Here's what I have:
#!/usr/bin/perl
#only tested on Windows - not supported by AccuRev
use XML::Simple ;
use Data::Dumper ;
use strict ;
use Time::Piece;
### Modify to reflect your local AccuRev client path
$::AccuRev = "/cygdrive/c/\"Program Files (x86)\"/AccuRev/bin/accurev.exe" ;
my ($myDepot, $myDate, $stream_raw, $stream_xml, $streamNumber, $streamName, $counter, $snapTime) ;
### With AccuRev 4.5+ security, if you want to ensure you are authenticated before executing the script,
### uncomment the following line and use a valid username and password.
system "$::AccuRev login -n username password" ;
chomp($myDepot = $ARGV[0]);
chomp($myDate  = $ARGV[1]);
if ($myDepot eq "") {
    print "\nUsage: perl snapshot_streams.pl <depot_name>\n" ;
    print "This script will return the name of the snapshot streams for the depot passed in...\n" ;
    exit(1) ;
}
$stream_raw = `$::AccuRev show -p $myDepot -fx streams`;
$stream_xml = XMLin($stream_raw, forcearray => 1, suppressempty => '', KeyAttr => 'stream') ;
if ($stream_xml eq "") {
    print "\nDepot $myDepot doesn't exist...\n" ;
    exit(1) ;
}
print "List of snapshots in depot $myDepot:\n";
$counter = 0 ;
foreach $stream_xml (@{$stream_xml->{stream}})
{
    if ($stream_xml->{type} eq "snapshot") {
        $streamName = $stream_xml->{name};
        $snapTime   = scalar localtime($stream_xml->{time});
        my $datecheck = $snapTime->strftime('%Y%m%d');
        if ($datecheck >= $myDate){
            print "Snapshot Name: $streamName \t\t\t Time: $snapTime\n" ;
        }
        $counter = $counter + 1 ;
    }
}
if ( $counter == 0 ) {
    print "\nNo snapshots found in depot $myDepot...\n" ;
}
The problem was that the AccuRev path was not being resolved correctly, so I was not getting the correct output. Since I have the AccuRev home directory in my environment variables, I was able to call accurev directly and save the output to an XML file to be referenced in the XMLin call.
In addition to this, the command had to be built with double quotes (""), not single quotes ('') or backticks (``).
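Presumably the quoting mattered because Perl only interpolates variables inside double quotes, while backticks run the command and capture its output rather than building a string. A quick illustration using the $myDepot variable from the script (results shown in the comments):
my $myDepot = "myDepot";
my $good = "accurev show -p $myDepot -fx streams";   # double quotes interpolate: accurev show -p myDepot -fx streams
my $bad  = 'accurev show -p $myDepot -fx streams';   # single quotes do not:      accurev show -p $myDepot -fx streams
my $out  = `accurev show -p $myDepot -fx streams`;   # backticks execute immediately and capture STDOUT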
Below is the end result, with additional arguments to specify the date range of snapshots:
#!C:\Strawberry\perl\bin
#only tested on Windows - not supported by AccuRev
use XML::Simple qw(:strict);
use English qw( -no_match_vars );
use Data::Dumper ;
use strict ;
use Time::Piece;
my ( $login, $xml, $command, $myDepot, $myDateStart, $myDateEnd, $stream_xml, $streamNumber, $streamName, $counter, $snapTime) ;
### If AccuRev is already in your environment variables, you can call it without setting the path,
### otherwise uncomment and update the script
### $accurev = "/cygdrive/c/\"Program Files (x86)\"/AccuRev/bin/accurev.exe";
### With AccuRev 4.5+ security, if you want to ensure you are authenticated before executing the script,
### uncomment the following lines and use a valid username and password.
### $login = "accurev login -n username password" ;
### system($login);
chomp($myDepot     = $ARGV[0]);
chomp($myDateStart = $ARGV[1]);
chomp($myDateEnd   = $ARGV[2]);
if ($myDepot eq "") {
    print "\nUsage: perl snapshot_streams.pl <depot_name>\n" ;
    print "This script will return the name of the snapshot streams for the depot passed in...\n" ;
    exit(1) ;
}
$command = "accurev show -p $myDepot -fx streams > snapshot_streams.xml";
system($command);
$stream_xml = XMLin("snapshot_streams.xml", ForceArray => 1, SuppressEmpty => '', KeyAttr => 'stream') ;
if ($stream_xml eq "") {
    print "\nDepot $myDepot doesn't exist...\n" ;
    exit(1) ;
}
print "List of snapshots in depot $myDepot:\n";
$counter = 0 ;
foreach $stream_xml (@{$stream_xml->{stream}})
{
    if ($stream_xml->{type} eq "snapshot") {
        $streamName = $stream_xml->{name};
        $snapTime   = scalar localtime($stream_xml->{time});
        my $datecheck = $snapTime->strftime('%Y%m%d');
        if ($datecheck >= $myDateStart && $datecheck <= $myDateEnd){
            print "Snapshot Name: $streamName \t\t\t Time: $snapTime\n" ;
        }
        $counter = $counter + 1 ;
    }
}
if ( $counter == 0 ) {
    print "\nNo snapshots found in depot $myDepot...\n" ;
}
Here is the call:
perl -w snapshot.pl <depot> "FromDate" "ToDate" > output.txt 2>&1
The output looks something like this:
List of snapshots in depot <Depot_Name>:
Snapshot Name: Product_1_SS Time: Tue Jul 04 10:00:05 2018
Snapshot Name: Product_2_SS Time: Tue Jul 07 11:00:15 2018
Snapshot Name: Product_3_SS Time: Tue Jul 15 12:30:30 2018

How do I run multiple sh scripts in the background from a Perl script? (tried several methods but all failed)

I tried to run multiple sh scripts in the background from a Perl script; below are my three attempts:
for ( $i = 0; $i < @num - 1; $i++ ) {
    $num1 = $num[ $i + 1 ] - 1;
    `nohup sh ./tmpScript/matrixR.$num[$i]-$num1.sh &`
    # Don't know why this doesn't run in the background........the scripts just run one by one and I have to wait.........
}
Tried:
`sh *.sh &`
# Wrong........they still run one by one and I have to wait.........
Also tried:
`cd tmpScript; for file in *.sh; do nohup sh \$file \&; done `;
# Still wrong........they still run one by one and I have to wait.........
They all failed. Could anyone help me solve this? Thanks!
So, your problem seems to be:
invoke sh ./tmpScript/matrixR.$num[$i]-$num1.sh & multiple times, in parallel.
My first thought would be that it's worth having a look at that script, because Perl can probably do what it does too.
However, for the general case I would suggest using Parallel::ForkManager:
use strict;
use warnings;
use Parallel::ForkManager;

# @num is assumed to be populated as in your original loop.
my $manager = Parallel::ForkManager->new(30);
for ( my $i = 0; $i < @num - 1; $i++ ) {
    my $num1 = $num[ $i + 1 ] - 1;
    $manager->start and next;
    exec("./tmpScript/matrixR.$num[$i]-$num1.sh >/dev/null")
        or die($!);
}
$manager->wait_all_children();
This has the added advantage that it will run up to 30 jobs in parallel but wait for sub-processes to complete before starting more, which means you don't denial-of-service your own server.
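As a side note on why your original attempts appeared to run one at a time: backticks wait for the child's STDOUT to close, so a trailing & only helps if the script's output is redirected away from that pipe. A minimal fire-and-forget sketch under that assumption (no concurrency limit, unlike Parallel::ForkManager above):
# Backgrounding via the shell: redirecting STDOUT/STDERR lets the shell
# detach the job and return immediately instead of blocking the backticks.
for ( my $i = 0; $i < @num - 1; $i++ ) {
    my $num1 = $num[ $i + 1 ] - 1;
    system("nohup sh ./tmpScript/matrixR.$num[$i]-$num1.sh > /dev/null 2>&1 &");
}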

Perl brute force attack

I am having a lot of trouble trying to create a brute force script. The password I need to crack is 1 to 4 characters long and all lowercase letters. I think I have figured out the code to generate all the possible combinations but I am not sure how to test this on a file. Any guidance or hints would be great.
$password = "aaaa";
while ( length $password < 5 ) {
    print "$password\n";
    $password++;
}
I had a similar problem. Either you are in my class, or scripting classes around the country do this exercise at the same time. My professor encourages forum use, but we can't share answers with direct classmates at our university.
If you know me from your class by my username, then I ask that you do not use my code. Otherwise, enjoy. I have commented the code, since learning from working code is the best way to learn.
As long as you are using only letters, you can just increment a scalar instead of nesting loops; see the short illustration below. If you do need other characters, you could use an array of possible characters and step through that array for each position, but let's ignore that since you seem to need only lowercase letters =)
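To illustrate that scalar increment (Perl's magic string auto-increment) before the full sub below:
my $p = "a";
$p++;               # "b"
$p = "az"; $p++;    # "ba"
$p = "zz"; $p++;    # "aaa" - the length grows, which is why a simple length check bounds the search
print "$p\n";       # prints "aaa"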
sub brute2()
{
    print "Bruteforce Attack...\n";
    print "Enter password length: ";        # Prompt user for maximum length for pass
    chomp(my $plen = (<>));                 # Receive input and remove newline character
    print "Password Length is $plen\n";
    $plen++;
    print "Press any key to continue.\n";   # Execute once they hit any key
    if (<>)
    {
        my $pass = "a";                     # This code assumes only letters a..z, so we just set here
        while ( length $pass < $plen )      # Run check loop until we exhaust all possibilities within the maximum length
        {
            my $status = system("unzip -pp -o -P $pass secret_file_brute.zip > /dev/null 2>&1"); # System call to try our password against a zip file; this sets $status to the return value
            print ("Attempting: $pass Return: $status\n");
            if ($status == 0)               # Return value of 0 means success
            {
                print ("Password is: $pass Return is: $status\n"); # Print correct password. I did return value also for debug
                last;                       # Break loop since we got the correct password
            }
            $pass++;                        # Increment $pass to the next iteration, i.e. "a" to "b", "aa" to "ab", "zzz" to "aaaa" etc...
        }
    }
}
According to the man page I found, unzip returns exit code 82 when it can't decrypt.
sub try {
    my ($password) = @_;
    system("unzip -qq -o -P $password secret_file_brute.zip >/dev/null 2>&1");
    die("Can't launch unzip: $!\n") if $? == -1;
    die("unzip killed by signal ".($? & 0x7F)."\n") if $? & 0x7F;
    my $exit_code = $? >> 8;
    die("unzip exited with error $exit_code\n") if $exit_code && $exit_code != 82;
    return !$exit_code;
}
Your code does not generate all of the possible passwords (e.g. it doesn't generate aaa). The following does:
sub brute_force {
    for (my $password = 'a'; length($password) < 5; ++$password) {
        return $password if try($password);
    }
    return undef;
}
The final bit is to display the results.
{
    my $password = brute_force();
    defined($password)
        or die("Password not found\n");
    print("$password\n");
}

Obtaining exit status values from GNU parallel

The Perl wrapper below executes commands in parallel, saving STDOUT
and STDERR to /tmp files:
open(A, "|parallel");
for $i ("date", "ls", "pwd", "factor 17") {
    print A "$i 1> '/tmp/$i.out' 2> '/tmp/$i.err'\n";
}
close(A);
How do I obtain the exit status values from the individual commands?
To get the exit status of the individual jobs, parallel would need to write the info somewhere. I don't know if it does or not. If it doesn't, you can do that yourself.
my %jobs = (
    "date"   => "date",
    "ls"     => "ls",
    "pwd"    => "pwd",
    "factor" => "factor 17",
);
open(my $parallel, "|parallel");
for my $id (keys(%jobs)) {
    print $parallel
        $jobs{$id}
        ." 1> '/tmp/$id.out'"
        ." 2> '/tmp/$id.err' ; "
        ."echo \$?"
        ." > '/tmp/$id.exit'\n";
}
close($parallel);
my $exit_status = $? >> 8;
if ($exit_status >= 255) {
    print("Failed\n");
} else {
    printf("%d failed jobs\n", $exit_status);
}
for my $id (keys(%jobs)) {
    # ... grab output and exit code from files ...
}
Update:
I went and installed parallel.
It has an option called --joblog {file} which produces a report with exit codes. It accepts - for file name if you want it to output to STDOUT.
Note that parallel doesn't recognise abnormal death by signal, so this is not included in the --joblog report. Using the solution I posted above, a missing .exit file would indicate an abnormal death. (You must make sure it doesn't exist in the first place, though.)
Update:
@Ole Tange mentions that the limitation of --joblog {file} I mentioned above - the lack of logging of death by signal - has been addressed in version 20110722.
GNU Parallel 20110722 has exit val and signal in --joblog:
parallel --joblog /tmp/log false ::: a
cat /tmp/log
Seq Host Starttime Runtime Send Receive Exitval Signal Command
1 : 1311332758 0 0 0 1 0 false a
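If you want to pick those values back up in Perl, here is a sketch of parsing the --joblog report, using the column names shown above and assuming the fields are whitespace/tab separated as in that sample:
open(my $log, '<', '/tmp/log') or die("Can't open /tmp/log: $!\n");
chomp(my $header = <$log>);
my @cols = split ' ', $header;                       # Seq Host Starttime ... Exitval Signal Command
while (my $line = <$log>) {
    chomp $line;
    my %row;
    @row{@cols} = split ' ', $line, scalar @cols;    # the limit keeps the full command in the last field
    print "$row{Command}: exit=$row{Exitval} signal=$row{Signal}\n";
}
close($log);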
If you want to avoid the wrapper you could consider:
cat foo | parallel "{} >\$PARALLEL_SEQ.out 2>\$PARALLEL_SEQ.err; echo \$? >\$PARALLEL_SEQ.status"
Version 20110422 or later makes it even shorter:
cat foo | parallel "{} >{#}.out 2>{#}.err; echo \$? >{#}.status"
If your lines do not contain ' then this should work too:
cat foo | parallel "{} >'{}'.out 2>'{}'.err; echo \$? >'{}'.status"
Instead of wrapping parallel, you can use any of the tons of modules available from CPAN providing similar functionality.
For instance:
use Proc::Queue size => 10, qw(run_back);
use feature 'say';

my @pids;
for $i ("date", "ls", "pwd", "factor 17") {
    push @pids, run_back {
        open STDOUT, '>', "/tmp/$i.out";
        open STDERR, '>', "/tmp/$i.err";
        exec $i;
    };
}
for (@pids) {
    1 while waitfor($_, 0) <= 0;
    say "process $_ exit code: ", ($? >> 8);
}

Why is my Perl script that calls FTP all of a sudden failing?

I have a script that has been running for over a year and now it is failing:
It is creating a command file:
open ( FTPFILE, ">get_list");
print FTPFILE "dir *.txt\n";
print FTPFILE "quit\n";
close FTPFILE;
Then I run the system command:
$command = "ftp ".$Server." < get_list | grep \"\^-\" >new_list";
$code = system($command);
The logic then checks:
if ($code == 0) {
    do stuff
} else {
    log error
}
It is logging an error. When I print the $code variable, I am getting 256.
I used this command to parse the $? variable:
$exit_value = $? >> 8;
$signal_num = $? & 127;
$dumped_core = $? & 128;
print "Exit: $exit_value Sig: $signal_num Core: $dumped_core\n";
Results:
Exit: 1 Sig: 0 Core: 0
Thanks for any help/insight.
Mel - you might gain a bit more information by looking at the standard error output of the ftp command.
1) Does the FTP command work by hand from a shell prompt?
2) If command-line ftp works, capture the output (stdout and stderr) of the ftp command and print it in the Perl script. For a couple of ways to do so, see perlfaq8 - How can I capture STDERR from an external command?
The two easiest approaches are these:
my $output = `$command 2>&1`;
my $pid = open(PH, "$command 2>&1 |");
while (<PH>) { print "Next line from FTP output: $_"; }
3) As wisely noted by Snake Plissken in a comment, an alternative (and more idiomatic and possibly easier) approach is to scrap the system call to the "ftp" command and instead use the Net::FTP Perl module.
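For completeness, a minimal sketch of that Net::FTP approach, reproducing the "dir *.txt" step; the server name and credentials are placeholders, and whether the server honours the wildcard in a LIST is server-dependent:
use strict;
use warnings;
use Net::FTP;

my $server = 'ftp.example.com';              # placeholder - use your $Server value
my $ftp = Net::FTP->new($server, Timeout => 60)
    or die "Cannot connect to $server: $@\n";
$ftp->login('username', 'password')
    or die "Login failed: ", $ftp->message;

my @listing = $ftp->dir('*.txt');            # long-format listing, like "dir *.txt" in the command file
print "$_\n" for @listing;

$ftp->quit;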