perl - help reworking code to include use of a sub routine - perl

My test script simply does a perl dbi connection to a mysql database and given a list of tables, extracts (1) record per table.
For every table I list, I also want to print that (1) record out to its own file. For example if I have a list of 100 tables, I should expect 100 uniques files with (1) record each.
So far the code works, but I am interested in creating a sub routine, call it create_file for pieces of the code that handles that #Create file
I am not familiar with writing sub routines and need help implementing that if possible.
I am not sure how I would call the part where the data is built. $data='';
Can someone show me good way to do this? Thanks for your help.
code:
# Get list of tables
my #tblist = qx(mysql -u foo-bar -ppassw0rd --database $dbsrc -h $node --port 3306 -ss -e "show tables");
# Data output
foreach my $tblist (#tblist)
{
my $data = '';
chomp $tblist;
#Create file
my $out_file = "/home/$node-$tblist.$dt.dat";
open (my $out_fh, '>', $out_file) or die "cannot create $out_file: $!";
my $dbh = DBI->connect("DBI:mysql:database=$dbsrc;host=$node;port=3306",'foo-bar','passw0rd');
my $sth = $dbh->prepare("SELECT UUID(), '$node', ab, cd, ef, gh, hi FROM $tblist limit 1");
$sth->execute();
while (my($id, $nd,$ab,$cd,$ef,$gh,$hi) = $sth->fetchrow_array() ) {
$data = $data. "__pk__^A$id^E1^A$nd^E2^A$ab^E3^A$cd^E4^A$ef^E5^A$gh^E6^A$hi^E7^D";
}
$sth->finish;
$dbh->disconnect;
#Create file
print $out_fh $data;
close $out_fh or die "Failed to close file: $!";
};

my $dt = "2011-02-25";
my $dbsrc = "...";
my $node = "...";
# Get list of tables
my #tblist = qx(mysql -u foo-bar -ppassw0rd --database $dbsrc -h $node --port 3306 -ss -e "show tables");
my $dbh = DBI->connect("DBI:mysql:database=$dbsrc;host=$node;port=3306",'foo-bar','passw0rd');
foreach my $tblist (#tblist)
{
# This breaks - chomp is given a list-context
#extract_data($dbh, chomp($tblist));
chomp $tblist;
extract_data($dbh, $tblist);
};
$dbh->disconnect;
sub extract_table
{
my($dbh, $tblist) = #_;
my $out_file = "/home/$node-$tblist.$dt.dat";
open (my $out_fh, '>', $out_file) or die "cannot create $out_file: $!";
my $sth = $dbh->prepare("SELECT UUID(), '$node', ab, cd, ef, gh, hi FROM $tblist limit 1");
$sth->execute();
while (my($id, $nd,$ab,$cd,$ef,$gh,$hi) = $sth->fetchrow_array() ) {
print $out_fh "__pk__^A$id^E1^A$nd^E2^A$ab^E3^A$cd^E4^A$ef^E5^A$gh^E6^A$hi^E7^D";
}
$sth->finish;
close $out_fh or die "Failed to close file: $!";
};
Unless you really need to connect to the database for each statement you execute, keep the database open between operations.

Instead of creating your own create_file, you could use write_file from File::Slurp.
It abstracts away the open/close/die/print.

Related

Saving Strings to Files

When I save a string to a file, and than load it back from the same file, it acts differently from the original string.
It seems that the first code with the hardcoded ip address works, but the second code where I write to the file and than read back, won't work. If I print $ip after loading from the file, it looks the same.
$ip = "100.10.100.1";
$port = 1337;
socket(S,PF_INET,SOCK_STREAM,getprotobyname("tcp"));
connect(S,sockaddr_in($port,inet_aton($ip)));
$ip = "100.10.100.1";
my $filename = 'c:\\tmp\\ip.txt';
open(my $fh, '>', $filename);
print $fh "$ip";
close $fh;
open(my $fh, '<', $filename);
$i = 0;
while (my $row = <$fh>) {
chomp $row;
if ($i eq 0) {
$ip = $row;
}
$i = $i + 1;
}
$port = 1337;
socket(S,PF_INET,SOCK_STREAM,getprotobyname("tcp"));
connect(S,sockaddr_in($port,inet_aton($ip)));
OK, it seems that the perl was run with -T cmd line option, which means it runs in "taint mode" and distrust data it read from files.
When printing the errors to a file, I saw a warning
"Insecure dependency in connect while running with -T switch"
Thanks for the tips!

how to combine the code to make the output is on the same line?

Can you help me to combine both of these progeam to display the output in a row with two columns? The first column is for $1 and the second column is $2.
Kindly help me to solve this. Thank you :)
This is my code 1.
#!/usr/local/bin/perl
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Timing Path Group \'(\S+)\'/) {
$line = $1;
print ("$1\n");
}
}
close (FILE);
This is my code 2.
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Levels of Logic:\s+(\S+)/) {
$line = $1;
print ("$1\n");
}
}
close (FILE);
this is my output for code 1 which contain 26 line of data:
**async_default**
**clock_gating_default**
Ddia_link_clk
Ddib_link_clk
Ddic_link_clk
Ddid_link_clk
FEEDTHROUGH
INPUTS
Lclk
OUTPUTS
VISA_HIP_visa_tcss_2000
ckpll_npk_npkclk
clstr_fscan_scanclk_pulsegen
clstr_fscan_scanclk_pulsegen_notdiv
clstr_fscan_scanclk_wavegen
idvfreqA
idvfreqB
psf5_primclk
sb_nondet4tclk
sb_nondetl2tclk
sb_nondett2lclk
sbclk_nondet
sbclk_sa_det
stfclk_scan
tap4tclk
tapclk
The output code 1 also has same number of line.
paste is useful for this: assuming your shell is bash, then using process substitutions
paste <(perl script1.pl) <(perl script2.pl)
That emits columns separated by a tab character. For prettier output, you can pipe the output of paste to column
paste <(perl script1.pl) <(perl script2.pl) | column -t -s $'\t'
And with this, you con't need to try and "merge" your perl programs.
To combine the two scripts and to output two items of data on the same line, you need to hold on until the end of the file (or until you have both data items) and then output them at once. So you need to combine both loops into one:
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
my( $levels, $timing );
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Levels of Logic:\s+(\S+)/) {
$levels = $1;
}
if ($line=~ m/^\s+Timing Path Group \'(\S+)\'/) {
$timing = $1;
}
}
print "$levels, $timing\n";
close (FILE);
You still haven't given us one vital piece of information - what does the input data looks like. Most importantly, are the two pieces of information you're looking for on the same line?
[Update: Looking more closely at your regexes, I see it's possible for both pieces of information to be on the same line - as they are both supposed to be the first item on the line. It would be helpful if you were clearer about that in your question.]
I think this will do the right thing, no matter what the answer to your question is. I've also added the improvements I suggested in my answer to your previous question:
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $zipped = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $unzipped = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $zipped => $unzipped
or die "gunzip failed: $GunzipError\n";
open (my $fh, '<', $unzipped) or die "Cannot open '$unzipped': $!\n";
my ($levels, $timing);
while (<$fh>) {
chomp;
if (m/^\s+Levels of Logic:\s+(\S+)/) {
$levels = $1;
}
if (m/^\s+Timing Path Group \'(\S+)\'/) {
$timing = $1;
}
# If we have both values, then print them out and
# set the variables to 'undef' for the next iteration
if ($levels and $timing) {
print "$levels, $timing\n";
undef $levels;
undef $timing;
}
}
close ($fh);

Optimize Perl script to deal with large amount of data

Here is my script:
#!/usr/bin/perl -w
use warnings;
use strict;
no warnings 'uninitialized';
`rm /slot/ems12093/oracle/working/marchfound.txt`;
`touch /slot/ems12093/oracle/working/marchfound.txt`;
`rm /slot/ems12093/oracle/working/newcontact.txt`;
`touch /slot/ems12093/oracle/working/newcontact.txt`;
my ( $filename, $handle, #contact_list, $file_list, $k, #file_list2, $i, $e, $m, $fh, $f, $g,
$file1, $data, $file_location, $arrSize, $namefile );
$file_location = '/slot/ems12093/oracle/working/marchfound.txt';
$filename = '/slot/ems12093/oracle/working/contact.txt';
open( $handle, '<', $filename ) or die $!;
#contact_list = <$handle>;
close $handle;
chomp #contact_list;
chdir( '/scratch/mount_point/dnbfiles/oracle_cr/' );
$file_list = qx(ls|grep -i \"2016_03_Mar_EA\");
chomp( $file_list );
$k = "/scratch/mount_point/dnbfiles/oracle_cr/2016_03_Mar_EA";
chdir( $k );
#file_list2 = qx(ls|grep -i contact|grep -i full|grep -Ev "Glb");
chomp #file_list2;
foreach $file1 ( #file_list2 ) {
foreach $i ( #contact_list ) {
$e = "zgrep $i $file1";
$f = qx($e);
if ( $f ) {
print "working\n";
$g = "$f, $file1";
open $data, '>>', $file_location or die $!;
print $data "$g\n";
close $data;
#contact_list = grep { !/$i/ } #contact_list;
$arrSize = #contact_list;
print "$arrSize\n";
}
}
}
$m = "/slot/ems12093/oracle/working/";
chdir( $m );
chomp #contact_list;
$namefile = '/slot/ems12093/oracle/working/newcontact.txt';
open( $fh, '<', $namefile ) or die $!;
#contact_list = <$fh>;
close $fh;
print "done\n";
Here I am taking an input file contact.txt which has 370k records, for example mail address, and checking if those records are present in March month's zipped database 2016_03_Mar_EA.
The database again contains approx 1.6 million records e.g. name, designation, mail, etc. So it's going to take a LOT of time to check and print all 355k * 1.6m records.
Please suggest if there is any way that I can improve my script to get a faster result.
Not purely speed specific but you should do below modifications.
1) contact.txt has 370k records therefore you should not slurp whole data at once. So instead of doing
#contact_list = <$handle>;
You should read data line by line using
while(<$handle>){
#process one contact at a time
}
2) You are changing directories and executing shell commands to get desired files. It'd be better to use File::Find::Rule. It's easier to use, see below:
my #files = File::Find::Rule->file()->name( '*.pm' )->in( #INC );
The way you are doing this, I'd bet most of the time is spent in umcompressing the database dump (which will happen 370k times). Uncompress it once - before doing the matches. (That assumes you do have enough disk).
If you are not checking for actual regexps, fgrep will save some (marginal) time (though I suspect that this optimizatin is done internally by grep)
The advice on not slurping files is a good for memory saving, and should not affect speed much, for a single scan through the data. However, you are actually unnecessarily scanning the arry multiple times, in order to get rid of duplicate contacts
#contact_list = grep { !/$i/ } #contact_list;
and that not always slows the whole shebang down, it also wastes memory as #contact_list is being copied in memory.
You can read by line, keep track in a hash, and skip the loop body on duplicates:
next if exists $seen{$i};
$seen{$i}++

perl write variables to a file

Here's my code to parse a configuration file, write the retrieved data to another file and send it to a MySQL database.
The database connection and writing data to a table works fine, however I can't get it to write data to the mentioned file mongoData.txt.
I'm quite new to Perl, so any help will be highly appreciated.
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
my $line;
# Retrieving data
open( my $FILE, "<", "/etc/mongod.conf" )
or die "Cannot find file! : $!\n";
while ( $line = <$FILE> ) {
chomp($line);
my ( $KEY, $VALUE ) = split /\:/, $line;
# Ignoring commented lines
$_ = $line;
unless ( $_ = ~/^#/ ) {
# Write to file
open my $FILE2, ">", "/home/sierra/Documents/mongoData.txt"
or die "Cannot create file $!\n";
print $FILE2 "$KEY", "$VALUE\n";
}
# Connection to SQL database
my $db = DBI->connect(( "dbi:mysql:dbname=mongodconf;
host = localhost;", "root", "sqladmin"
)) or die "can't connect to mysql";
# Inserting into database
$db->do("insert into data values ('$KEY', '$VALUE')")
or die "query error\n";
}
close($FILE);
Every time you open a file for output, you create a new file and delete any pre-existing file with the same name. That means you're going to be left with only the last line you wrote to the file
Here are some more pointers
Variable identifiers should in general be all in digits, lower case letters, and underscores. Capital letters are reserved for global identifiers such as package names
If you are running a version of Perl later than v5.14 then you can use autodie which checks all IO operations for you and removes the need to test the return status by hand
If you use a die string that has no newline at the end, then Perl will add information about the source file name and line number where it occurred, which can be useful for debugging
It is unnecessary to name your loop control variables. Programs can be made much more concise and readable by using Perl's pronoun variable $_ which is the default for many built-in operators
It is wasteful to reconnect to your database every time you need to make changes. You should connect once at the top of your program and use that static connection throughout your code
You should use placeholders when passing parameter expressions to an SQL operation. It can be dangerous, and that way DBI will quote them correctly for you
There is no need to close input files explicitly. Everything will be closed automatically at the end of the program. But if you are worried about the integrity of your output data, you may want to do an explicit close on output file handles so that you can check that they succeeded
Here's what I would write. Rather than testing whether each line of the input begins with a hash, it removes everything from the first hash character onwards and then checks to see if there are any non-blank characters in what remains. That allows for trailing comments in the data
#!/usr/bin/perl
use strict;
use warnings 'all';
use autodie;
use DBI;
my ($input, $output, $dsn) = qw{
/etc/mongod.conf
/home/sierra/Documents/mongoData.txt
dbi:mysql:dbname=mongodconf;host=localhost;
};
open my $fh, '<', $input;
open my $out_fh, '>', $output;
my $dbh = DBI->connect($dsn, qw/ root sqladmin /)
or die "Can't connect to MySQL: $DBI::errstr";
while ( <$fh> ) {
chomp;
s/#.*//;
next unless /\S/;
my ( $key, $val ) = split /\:/;
print $out_fh "$key $val\n";
$dbh->do('insert into data values (?, ?)', $key, $val);
}
close $out_fh or die $!;
$dbh->disconnect or warn $dbh->errstr;
You need to append the text into the creating new file mongoData.txt
while ($line=<$FILE>)
{
chomp ($line);
my ($KEY, $VALUE) = split /\:/,$line;
# Ignoring commented lines
$_ = $line;
unless ($_ = ~/^#/)
{
open my $FILE2, ">>", "/home/sierra/Documents/mongoData.txt" or die "Cannot create file $!\n";
print $FILE2 "$KEY","$VALUE\n";
}
}
close($FILE2);
or else
Create the text file once before your nesting the while loop
open my $FILE2, ">", "/home/sierra/Documents/mongoData.txt"
or die "Cannot create file $!\n";
while ($line=<$FILE>)
{
chomp ($line);
my ($KEY, $VALUE) = split /\:/,$line;
# Ignoring commented lines
$_ = $line;
unless ($_ = ~/^#/)
{
print $FILE2 "$KEY","$VALUE\n";
}
}
close($FILE2);
May be this will help you.

How do I search for a value in a file and print it using Perl?

I am having trouble searching for a value and printing it. This is what I have so far. What am I doing wrong? How do i get the desired output by searching in the output?
my $host = $ARGV[0];
my $port = $ARGV[1];
my $domain = $ARGV[2];
my $bean = $ARGV[3];
my $get = $ARGV[4];
open(FILE, ">", "/home/hey");
print FILE "open $host:$port\n";
print FILE "domain $domain\n";
print FILE "bean $bean\n";
print FILE "get -s $get\n";
print FILE "close\n";
close FILE;
open JMX, "/root/jdk1.6.0_37/bin/java -jar /var/scripts/jmxterm-1.0-alpha-4-uber.jar -v silent -n < /home//hey |";
open (dbg, ">", "/home/donejava1");
#print JMX "help \n";
foreach ( <JMX> )
{
chomp;
print $_;
open (LOG, ">", "/home/out1");
print LOG $_;
close LOG;
}
//output
{
committed = 313733;
init = 3221225472;
max = 3137339392;
used = 1796598680;
}
// how do i print 1796598680, looking for the attribute "used" ?
The following example should provide a solution for you.
perl -lne'print $1 if /used\s*=\s*(\d+);/' filename