Untainting a blessed hash member with or without the delete - perl

I saw this line of code in some sources
( $self->{arg} ) = ( ( delete $self->{arg} ) =~ /(.*)/s ) if ${^TAINT};
I understand the untainting. I also known delete
My question is, in what circumstances is it necessary or preferred to use the delete, and isn't it enough to use the simpler
( $self->{arg} ) = ( ( $self->{arg} ) =~ /(.*)/s ) if ${^TAINT};
For example
#!/usr/bin/env perl -T
use 5.014;
use warnings;
package Some {
use Moose;
has 'arg' => (is => 'rw', isa => 'Str');
sub doit {
my $self = shift;
#( $self->{arg} ) = ( ( delete $self->{arg} ) =~ /(.*)/s ) if ${^TAINT};
( $self->{arg} ) = ( ( $self->{arg} ) =~ /(.*)/s ) if ${^TAINT};
}
};
my $some = Some->new( arg => 'some text' );
$some->doit();
say $some->arg;

With a normal hash deleting the value and reinserting will give the same result as modifying it in place.
The commit does not give any information about why he deletes it just that he copies the functionality from Mason 1. But if you look at the source of HTML::Mason::Lexer, you will find this comment:
We need to untaint the component or else the regexes will fail
to a Perl bug. The delete is important because we need to
create an entirely new scalar, not just modify the existing one.
($current->{comp_source}) = (delete $current->{comp_source}) =~ /(.*)/s if taint_is_on;
So the reason of doing it this way is to have a new scalar, although he does not do that for the other place where he is untainting: Mason::Interp, so my guess is an earlier Perl bug, when untainting.
So the difference is that with delete will give you a new scalar, although this will seldom have a practical application. (Delete and insert is also a slower operation of course.)
use strict;
my $hash->{test} = 'test';
print \($hash->{test}),"\n";
( $hash->{test} ) = ( ( $hash->{test} ) =~ /(.*)/s );
print \($hash->{test}),"\n";
( $hash->{test} ) = ( ( delete $hash->{test} ) =~ /(.*)/s );
print \($hash->{test}),"\n";
gives
SCALAR(0x7f84d10047e8)
SCALAR(0x7f84d10047e8)
SCALAR(0x7f84d1029230)

Related

Different output from same code parsing simiiar tab-delimited files

The Perl script below is written in a shell.
If I use the tab-delimited file numeric then I get the desired result of each line parsed accordingly. However, if I use the file alpha as input then only the first line is parsed.
The only difference between alpha and numeric is that numeric has NC_000023
NC_000023.11:g.41747805_41747806delinsTT
NC_000023.11:g.41750615C>A
while alpha has NC_0000X
NC_0000X.11:g.41747805_41747806delinsTT
NC_0000X.11:g.41750615C>A
What am I missing?
numeric
Input Variant Errors Chromosomal Variant Coding Variant(s)
NM_003924.3:c.*18_*19delGCinsAA NC_000023.11:g.41747805_41747806delinsTT LRG_513t1:c.*18_*19delinsAA NM
NM_003924.3:c.013G>T NC_000023.11:g.41750615C>A LRG_513t1:c.13G>T
alpha
Input Variant Errors Chromosomal Variant Coding Variant(s)
NM_003924.3:c.*18_*19delGCinsAA NC_0000X.11:g.41747805_41747806delinsTT LRG_513t1:c.*18_*19delinsAA NM_003924.3:c.*18_*19delinsAA
NM_003924.3:c.013G>T NC_0000X.11:g.41750615C>A LRG_513t1:c.13G>T NM_003924.3:c.13G>T
Perl
perl -ne '
next if $. == 1;
if ( /.*del([A-Z]+)ins([A-Z]+).*NC_0+([^.]+)\..*g\.([0-9]+)_([0-9]+)/ ) { # indel
print join( "\t", $3, $4, $5, $1, $2 ), "\n";
}
else {
while ( /\t*NC_(\d+)\.\S+g\.(\d+)(\S+)/g ) {
# conditional parse
( $num1, $num2, $common ) = ( $1, $2, $3 );
$num3 = $num2;
if ( $common =~ /^([A-Z])>([A-Z])$/ ) { # SNP
( $ch1, $ch2 ) = ( $1, $2 );
}
elsif ( $common =~ /^del([A-Z])$/ ) { # deletion
( $ch1, $ch2 ) = ( $1, "-" );
}
elsif ( $common =~ /^ins([A-Z])$/ ) { # insertion
( $ch1, $ch2 ) = ( "-", $1 );
}
elsif ( $common =~ /^_(\d+)del([A-Z]+)$/ ) { # multi deletion
( $num3, $ch1, $ch2 ) = ( $1, $2, "-" );
}
elsif ( $common =~ /^_(\d+)ins([A-Z]+)$/ ) { # multi insertion
( $num3, $ch1, $ch2 ) = ( "-", $1, $2 );
}
printf( "%d\t%d\t%d\t%s\t%s\n", $num1, $num2, $num3, $ch1, $ch2 ); # output
map { undef } ( $num1, $num2, $num3, $common, $ch1, $ch2 );
}
}' numeric
output
23 41747805 41747806 GC AA
23 41750615 41750615 C A
output using alpha:
X 41747805 41747806 GC AA
If I use \w instead of \d in the while condition, like this
while ( /\t*NC_(\w+)\.\S+g\.(\d+)(\S+)/g ) { ... }
I get this result
X 41747805 41747806 GC AA
0 41750615 41750615 C A
Why the zero in $1
The while (/\t*NC_(\d+)\. will not match 'NC_0000X.11', due to the 'X' and the regex is looking for digits only.
After the change you made, NC_(\w+) will match 'NC_0000X' and $num1 is set to '0000X'.
Your printf "%d...." $num1 ... will print a 0 for non numeric input. As $num1 is '0000X', it will print as 0.
The input example suggests, that each line consists of fields, which are separated by white-space. Some fields are of interest, otheres not. Each field holds recognizable information.
Your program should follow this structure.
Read the file line by line
split the line into fields
skip fields, which are not of interest, e.g. non NC.*
extract the necessary information from the field
do whatever necessary, sum it up, collect it
print the information at the level needed. Per field, line, file or after all files
It is much easier to work on smaller chuncks instead of finding a regex which works for the whole line. It is much easier to read, understand and maintain.

more elegant way to construct SQL adding WHERE and using placeholders

What is the best way to construct sql with various number of WHERE conditions ?
My solution looks ugly:
my ($where, #values);
if ($phone_number)
{
$where = 'AND pnone_number=?';
#values = ($from, $till, $phone_number);
}
else
{
$where = '';
#values = ($from, $till);
}
my $sql = 'SELECT * FROM calls WHERE time between ? AND ? '.$where.' ORDER BY time';
my $res = $dbh->selectall_arrayref($sql, undef, #values) or warn 'error';
How about:
my $where = '';
my #values = ( $from, $till );
if ( $phone_number ) {
$where = 'AND phone_number=?';
push #values, $phone_number;
}
That eliminates the need for your else clause.
You could also use something like SQL::Abstract.
use SQL::Abstract;
...
my ( $sql, #values ) = SQL::Abstract->new->select(
'calls', # table
'*', # columns
{ time => { '<=' => $till, '>' => $from }, # where clause
$phone_number ? ( phone_number => $phone_number ) : ( ),
},
'time' # order clause
);
1=1 is added for cases when $where would be epmty.
my $where = "AND time between ? AND ? ";
my #values = ($from, $till);
if ($phone_number) {
$where .= 'AND pnone_number=? ';
push #values, $phone_number;
}
my $sql = 'SELECT * FROM calls WHERE 1=1 $where ORDER BY time';
my $res = $dbh->selectall_arrayref($sql, undef, #values) or warn 'error';
Conditional list-include (aka "enterprise"):
my #values = ( $from,
$till,
( $phone_number ) x !! $phone_number,
);
my $sql = 'SELECT * FROM calls WHERE time between ? AND ? '
. 'AND phone_number=?' x !! $phone_number
. ' ORDER BY time';

Perl DBI and sql now()

I have been trying to use sql NOW() function while I update a table. But nothing happens to that date field, I think DBI just ignores that value. Code is :
dbUpdate($id, (notes => 'This was an update', filesize => '100.505', dateEnd => 'NOW()'));
and the function is :
sub dbUpdate {
my $id = shift;
my %val_hash = #_;
my $table = 'backupjobs';
my #fields = keys %val_hash;
my #values = values %val_hash;
my $update_stmt = "UPDATE $table SET ";
my $count = 1;
foreach ( #fields ) {
$update_stmt .= "$_ = ? ";
$update_stmt .= ', ' if ($count != scalar #fields);
$count++;
}
$update_stmt .= " WHERE ID = ?";
print "update_stmt is : $update_stmt\n";
my $dbo = dbConnect();
my $sth = $dbo->prepare( $update_stmt );
$sth->execute( #values, $id ) or die "DB Update failed : $!\n";
$dbo->disconnect || die "Failed to disconnect\n";
return 1;
}#dbUpdate
As you can see this is a "dynamic" generation of the sql query and hence I dont know where an sql date function(like now()) come.
In the given example the sql query will be
UPDATE backupjobs SET filesize = ? , notes = ? , dateEnd = ? WHERE ID = ?
with param values
100.55, This was an update, NOW(), 7
But the date column still shows 0000-00-........
Any ideas how to fix this ?
I have been trying to use sql NOW() function while I update a table. But
nothing happens to that date field, I think DBI just ignores that
value.
No, it doesn't. But the DB thinks you are supplying a datetime or timestamp data type when you are in fact trying to add the string NOW(). That of course doesn't work. Unfortunately DBI cannot find that out for you. All it does is change ? into an escaped (via $dbh->quote()) version of the string it got.
There are several things you could do:
Supply a current timestamp string in the appropriate format instead of the string NOW().
If you have it available, you can use DateTime::Format::MySQL like this:
use DateTime::Format::MySQL;
dbUpdate(
$id,
(
notes => 'This was an update',
filesize => '100.505',
dateEnd => DateTime::Format::MySQL->format_datetime(DateTime->now),
)
);
If not, just use DateTime or Time::Piece.
use DateTime;
my $now = DateTime->now->datetime;
$now =~ y/T/ /;
dbUpdate(
$id,
(notes => 'This was an update', filesize => '100.505', dateEnd => $now));
Tell your dbUpdate function to look for things like NOW() and react accordingly.
You can do something like this. But there are better ways to code this. You should also consider that there is e.g. CURRENT_TIMESTAMP() as well.
sub dbUpdate {
my $id = shift;
my %val_hash = #_;
my $table = 'backupjobs';
my $update_stmt = "UPDATE $table SET ";
# Check for the NOW() value
# This could be done with others too
foreach ( keys %val_hash ) {
if ($val_hash{$_} =~ m/^NOW\(\)$/i) {
$update_stmt .= "$_ = NOW()";
$update_stmt .= ', ' if scalar keys %val_hash > 1;
delete $val_hash{$_};
}
}
# Put everything together, join keeps track of the trailing comma
$update_stmt .= join(', ', map { "$_=?" } keys %val_hash );
$update_stmt .= " WHERE ID = ?";
say "update_stmt is : $update_stmt";
say "values are: ", join(', ', values %val_hash);
my $dbo = dbConnect();
my $sth = $dbo->prepare( $update_stmt );
$sth->execute( values %val_hash, $id ) or die "DB Update failed : $!\n";
$dbo->disconnect || die "Failed to disconnect\n";
return 1;
}
Write your queries yourself.
You're probably not going to do it and I'll not add an example since you know how to do it anyway.
Here's something else: Is this the only thing you do with your database while your program runs? It is not wise to connect and disconnect the database every time you make a query. It would be better for performance to connect the database once you need it (or at the beginning of the program, if you always use it) and just use this dbh/dbo everywhere. It saves a lot of time (and code).
Sorry but you can't use NOW() as an interpolated value like that.
when a ? is used in an SQL statement, the corresponding value is (via some mechanism or other) passed to the database escaped, so whatever value is interpreted as a string, not as SQL. So you are in effect attempting to use the string 'NOW()' as a date value, not the function NOW().
To use NOW() as a function, you will have to insert it into the SQL itself rather than pass it as a bound value. This means you will either have to use a hack of your dbupdate function or write a new one, or obtain the time as a string in perl and pass the resulting string the dbupdate.

DBD::CSV: Problem with file-name-extensions

In this script I have problems with file-name-extensions:
if I use /home/mm/test_x it works, with file named /home/mm/test_x.csv it doesn't:
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use DBI;
my $table_1 = '/home/mm/test_1.csv';
my $table_2 = '/home/mm/test_2.csv';
#$table_1 = '/home/mm/test_1';
#$table_2 = '/home/mm/test_2';
my $dbh = DBI->connect( "DBI:CSV:" );
$dbh->{RaiseError} = 1;
$table_1 = $dbh->quote_identifier( $table_1 );
$table_2 = $dbh->quote_identifier( $table_2 );
my $sth = $dbh->prepare( "SELECT a.id, a.name, b.city FROM $table_1 AS a NATURAL JOIN $table_2 AS b" );
$sth->execute;
$sth->dump_results;
$dbh->disconnect;
Output with file-name-extention:
DBD::CSV::st execute failed:
Execution ERROR: No such column '"/home/mm/test_1.csv".id' called from /usr/local/lib/perl5/site_perl/5.12.0/x86_64-linux/DBD/File.pm at 570.
Output without file-name-extension:
'1', 'Brown', 'Laramie'
'2', 'Smith', 'Watertown'
2 rows
Is this a bug?
cat test_1.csv
id,name
1,Brown
2,Smith
5,Green
cat test_2.csv
id,city
1,Laramie
2,Watertown
8,Springville
DBD::CSV provides a way to map the table names you use in your queries to filenames. The same mechanism is used to set up per-file attributes like line ending, field separator etc. look for 'csv_tables' in the DBD::CSV documentation.
#!/usr/bin/env perl
use warnings;
use strict;
use DBI;
my $dbh = DBI->connect("DBI:CSV:f_dir=/home/mm", { RaiseError => 1 });
$dbh->{csv_tables}->{table_1} = {
'file' => 'test_1.csv',
'eol' => "\n",
};
$dbh->{csv_tables}->{table_2} = {
'file' => 'test_2.csv',
'eol' => "\n",
};
my $sth = $dbh->prepare( "SELECT a.id, a.name, b.city FROM table_1 AS a NATURAL JOIN table_2 AS b" );
$sth->execute();
$sth->dump_results();
$dbh->disconnect();
In my case I had to specify a line ending character, because I created the CSV files in vi so they ended up with Unix line endings whereas DBD::CSV assumes DOS/Windows line-endings regardless of the platform the script is run on.
I looks like even this works:
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use DBI;
my $dbh = DBI->connect("DBI:CSV:f_dir=/home/mm/Dokumente", undef, undef, { RaiseError => 1, });
my $table = 'new.csv';
$dbh->do( "DROP TABLE IF EXISTS $table" );
$dbh->do( "CREATE TABLE $table ( id INT, name CHAR(64), city CHAR(64) )" );
my $sth_new = $dbh->prepare( "INSERT INTO $table( id, name, city ) VALUES ( ?, ?, ? )" );
$dbh->{csv_tables}->{table_1} = { 'file' => '/tmp/test_1.csv', 'eol' => "\n", };
$dbh->{csv_tables}->{table_2} = { 'file' => '/tmp/test_2.csv', 'eol' => "\n", };
my $sth_old = $dbh->prepare( "SELECT a.id, a.name, b.city FROM table_1 AS a NATURAL JOIN table_2 AS b" );
$sth_old->execute();
while ( my $hash_ref = $sth_old->fetchrow_hashref() ) {
state $count = 1;
$sth_new->execute( $count++, $hash_ref->{'a.name'}, $hash_ref->{'b.city'} );
}
$dbh->disconnect();
I think you might want to take a look at the f_ext and f_dir attributes. You can then class your table names as "test_1" and "test_2" without the csv but the files used will be test_1.csv and test_2.csv. The problem with a dot in the table name is a dot is usually used for separating the schema from the table name (see f_schema).

How do I insert values from a hash into a database using Perl's DBI module?

I need to insert values from a hash into a database. Following is the code template I have to insert values in table1 column key and value:
use DBI;
use strict;
%hash; #assuming it already contains desired values
my $dbh = DBI->connect(
"dbi:Sybase:server=$Srv;database=$Db",
"$user", "$passwd"
) or die sprintf 'could not connect to database %s', DBI->errstr;
my $query= "Insert INTO table1(key, values) VALUES (?,?) ";
my $sth = $dbh->prepare($query)
or die "could not prepare statement\n", $dbh->errstr;
$sth-> execute or die "could not execute", $sth->errstr;
I know how to insert values using array i.e use execute_array(), but do not know how to insert values present in %hash in table1.
Any suggestions?
The following uses the execute_array function as mentioned in your question. I tested it.
my $dbh = DBI->connect("DBI:mysql:database=$DB;host=$host;port=$port", $user, $password);
my %hash = (
1 => 'A',
2 => 'B',
0 => 'C',
);
my #keys = keys %hash;
my #values = values %hash;
my $sth = $dbh->prepare("INSERT INTO table1(id, value) VALUES (?,?);");
$sth->execute_array({},\#keys, \#values);
(Sorry, I don't have a Sybase database to work with, or I'd use it as an example.)
Try SQL::Abstract
use DBI;
use SQL::Abstract;
use strict;
%hash; #assuming it already contains desired values
my $dbh = DBI->connect(
"dbi:Sybase:server=$Srv;database=$Db",
"$user", "$passwd"
) or die sprintf 'could not connect to database %s', DBI->errstr;
my ($query, #bind) = $sql->insert("tableName", \%hash);
my $sth = $dbh->prepare($query)
or die "could not prepare statement\n", $dbh->errstr;
$sth-> execute (#bind) or die "could not execute", $sth->errstr;
Here's a mostly easy way to build the query. I will typically do something like this because I haven't found another workaround yet.
use strict;
use DBI;
my $dbh = Custom::Module::Make::DBH->connect('$db');
my %hash = (
apple => 'red',
grape => 'purple',
banana => 'yellow',
);
my $keystr = (join ",\n ", (keys %hash));
my $valstr = join ', ', (split(/ /, "? " x (scalar(values %hash))));
my #values = values %hash;
my $query = qq`
INSERT INTO table1 (
$keystr
)
VALUES (
$valstr
)
`;
my $sth = $dbh->prepare($query)
or die "Can't prepare insert: ".$dbh->errstr()."\n";
$sth->execute(#values)
or die "Can't execute insert: ".$dbh->errstr()."\n";
But it's possible I also didn't understand the question correctly :P
Maybe you could try using
for my $key (keys %hash) {
$sth->execute($key, $hash{$key}) or die $sth->errstr;
}
Is this what you're trying to achieve?
If I understand the manual correctly ("Execute the prepared statement once for each parameter tuple (group of values) [...] via a reference passed ...") it should also be possible to simply to
($tuples, $rows) = $sth->execute_array(\%hash) or die $sth->errstr;