Can I rename a hash key inside another hash in Perl?

I have a hash of hashes (HoH) that I got from using selectall_hashref on a MySQL query.
The format is perfect for what I want, and I have two other HoHs obtained by similar means.
I eventually want to merge these HoHs together, and the only problem (I think) is that a key in the 'sub-hash' of one HoH has the same name as a key in another HoH.
E.g.
my $statement1 = "select id, name, age, height from school_db";
my $school_hashref = $dbh->selectall_hashref($statement1, 1);
my $statement2 = "select id, name, address, post_code from address_db";
my $address_hashref = $dbh->selectall_hashref($statement2, 1);
So when I dump the data I get results like so:
$VAR1 = {
    '57494' => {
        'name' => 'John Smith',
        'age' => '9',
        'height' => '120'
    }
};
$VAR1 = {
    '57494' => {
        'name' => 'Peter Smith',
        'address' => '5 Cambridge Road',
        'post_code' => 'CR5 0FS'
    }
};
(this is just an example, so it might seem illogical to have different names, but I need it :) )
So I would like to rename 'name' to 'address_name' or such. Is this possible? I know you can do
$hashref->{$newkey} = delete $hashref->{$oldkey};
(edit: this is an example that I found online, but haven't tested.)
but I don't know how I would represent the 'id' part. Any ideas?

Not knowing the way that you are merging them together, the easiest solution might be to simply change the select statement to rename the column in your results, rather than trying to manipulate the hash afterward.
my $statement2 = "select id, name as address_name, address, post_code from address_db";
my $address_hashref = $dbh->selectall_hashref($statement2, 1);
If that is not a realistic option for you, then a loop might be your best option:
foreach (keys %{$address_hashref}) {
    $address_hashref->{$_}{$newkey} = delete $address_hashref->{$_}{$oldkey};
}
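For example, assuming the key should become address_name and the two HoHs are merged by id (both assumptions taken from the question), the rename followed by the merge could look like this untested sketch:
# rename 'name' to 'address_name' in every sub-hash of the address HoH
for my $id (keys %{$address_hashref}) {
    $address_hashref->{$id}{address_name} = delete $address_hashref->{$id}{name};
}

# then merge the two HoHs by id; the school and address data no longer collide on 'name'
# (the // operator needs Perl 5.10+)
my %merged;
for my $id (keys %{$school_hashref}) {
    $merged{$id} = { %{ $school_hashref->{$id} }, %{ $address_hashref->{$id} // {} } };
}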

You have to do it with $hashref->{$newkey} = delete $hashref->{$oldkey}; because of the way hashes are implemented.
You can do it with a hash of hashes too.
$hashref->{$key_id}{$newKey} = delete $hashref->{$key_id}{$oldKey};
The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.
Here's a simple example:
Our hash
{
'a' => "apples",
'b' => "oranges"
}
Let's define our hash function as idx = h(key). Using the function on our keys above gives us:
h('a') = 02;
h('b') = 00;
How it's stored in the array of buckets:
idx | value
00 | 'oranges'
01 | ''
02 | 'apples'
03 | ''
... and so on
Say we want the key of 'apples' to be 'c'. We cannot simply change the key to 'c', since the hash function always returns 02 for 'a' and will return something different for 'c'. So, if we want to change the key, we also have to move the value to the correct idx in the array/bucket.
(Well, h('c') might not be different, but then that is a collision, and collisions are a special case that has to be handled when implementing a hash.)
For more info on hashes:
http://en.wikipedia.org/wiki/Hash_table
How are hash tables implemented internally in popular languages?

Related

Logical solution for nested for loop

I have a bit of logic drain. I hope I can explain what I am missing and what I want in a coherent manner. Let me know if I have to add a bit more data or information.
I have an Excel spreadsheet which I am trying to load to a database. I have slurped the data into an array of hashes. The data in the array looks like this
$hash_of_excel = [
{
col1 => 'value1',
col2 => 'value2',
col3 => 'value3|value4',
col4 => 'value5|value6|value7',
},
{
col1 => 'value8',
col2 => 'value9',
col3 => 'value10|value11|value12',
col4 => 'value13|value14|value15',
},
{
col1 => 'value16|value17',
col2 => 'value19|value18',
col3 => 'value20',
col4 => 'value21',
}
]
I have a piece of code that walks this data structure to get the values
foreach my $results ( @$hash_of_excel ) {
    for my $colname ( sort keys %$results ) {
        my @array = split /\|/, $results->{$colname};
        foreach my $value ( @array ) {
            warn $results->{'col1'}, $results->{'col2'}, $results->{'col3'};
            last;
        }
    }
    last if $counter++ == 2;
}
This results in the same values printing over and over, once for each column present in the hash (i.e. 4 times in our case).
How can I access the different columns for the DBI insert without having to go through a lot of for loops?
Is there a way to check whether a value actually holds more than one value and only then push them into an array, instead of putting everything into an array?
Or is it better to hand the database inserts to a subroutine and pass just the required column values in an array?
It's not clear what exactly you want, but your innermost loop is weird: it iterates over @array with $value, but $value isn't used in it - that's why you're getting the same output for all the iterations.
The following loop outputs all the values instead:
foreach my $value (@array){
warn $value;
}
i.e. no $results, no last.
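For the other part of the question (only turning multi-valued cells into arrays before the insert), here is a rough, untested sketch; insert_row is a hypothetical helper and not from the original post:
use strict;
use warnings;

for my $row (@$hash_of_excel) {
    for my $colname (keys %$row) {
        # only cells containing '|' become array references; single values stay plain scalars
        $row->{$colname} = [ split /\|/, $row->{$colname} ]
            if $row->{$colname} =~ /\|/;
    }
    # each row can now be handed to one routine that performs the DBI insert, e.g.
    # insert_row($dbh, $row);   # hypothetical helper
}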

XML File Creation in Perl - Updated Requirements [duplicate]

Sorry, I am posting this again, but a lot of the requirements have changed and I need advice.
My first input file is
Root1 TBLA KEY1 COLA A B
Root1 TBLA KEY1 COLB D E
Root1 TBLA KEY3 COLX M N
Root2 TBLB KEY4 COLX M N
Root2 TBLB KEY4 COLD A B
Root3 TBLC KEY5 COLD A B
My second input file is
Root1 TBLA KEY6
Root2 TBLB KEY7
Root3 TBLC KEY8
My third input file is
Root1 TBLA KEY9
Root1 TBLA KEY10
Root3 TBLC KEY11
Basically the file representation is
1) The first file represents the old and new values. The first column is the root table, the second is the actual table in which the diff exists, the third column is the key value, and the fourth and fifth are the old and new values.
2) The second file represents the primary keys which exist in db1 only and not in db2. The first column is the root table, the second is the actual table in which the key exists, and the third column is the key value.
3) The third file represents the primary keys which exist in db2 only and not in db1. The first column is the root table, the second is the actual table in which the key exists, and the third column is the key value.
The output to be created in XML format is
<Data>
<Root1>
<TBLA>
<NEW1>
<KEY>KEY6</KEY>
</NEW1>
<NEW2>
<KEY>KEY9</KEY>
<KEY>KEY10</KEY>
</NEW2>
<MODIFIED>
<KEY name="KEY1">
<COLA>
<oldvalue>A</oldvalue>
<newvalue>B</newvalue>
</COLA>
<COLB>
<oldvalue>D</oldvalue>
<newvalue>E</newvalue>
</COLB>
</KEY>
<KEY name="KEY3">
<COLX>
<oldvalue>M</oldvalue>
<newvalue>N</newvalue>
</COLX>
</KEY>
</MODIFIED>
</TBLA>
</Root1>
</Data>
THIS IS NOT THE COMPLETE OUTPUT; ONLY PART OF IT IS DISPLAYED.
Can anyone suggest what would be the best way to do this? Should I convert these text files to a hash of hashes first and then try using pltoxml()? Does that make sense? Would XML::Simple or XML::Writer suffice for this?
This is the first time I am working with XML, and I am not sure which approach will get me to a solution efficiently.
A small example with respect to my requirements would be appreciated.
*Input file will always be sorted on Root and then TBLNAME
Output format
The output contains, for every root, every table in that root, and for every table first the keys which exist only in db1 and then the keys which exist only in db2; these go in the NEW1 and NEW2 sections respectively. The third section, MODIFIED, is built from the first input file and lists each key value and, for that key, which columns were modified (with their old and new values).
If I have to use XML::Simple, how do I create a hashref from these files that I can pass to XMLout? There is no key in any of these files.
This is simply a matter of using split to break the data into fields, storing it in a hash and then transforming it using XML::Simple.
Note that I stick things into an array to enforce the order you intended.
All the data is read from one input stream (the while (<>) loop below), with the three files separated by a line containing only '---'. You shouldn't need me to show you IO code.
The @processors array simply holds the different processors you would use on the various files:
Code:
use 5.016;
use strict;
use warnings;
use XML::Simple qw(:strict);
my %roots;
my @processors
= ( sub {
my ( $root, $table, $key, $col, $old, $new ) = split /\s+/;
$roots{ $root }{ $table }[2]{MODIFIED}{ $col }
= { oldvalue => $old
, newvalue => $new
};
return;
}
, sub {
my ( $root, $table, $key ) = split /\s+/;
push @{ $roots{ $root }{ $table }[0]{NEW1}{KEY} }, $key;
}
, sub {
my ( $root, $table, $key ) = split /\s+/;
push @{ $roots{ $root }{ $table }[1]{NEW2}{KEY} }, $key;
}
);
my $processor = shift @processors;
while ( <> ) {
chomp;
if ( $_ eq '---' ) {
$processor = shift @processors;
}
else {
$processor->( $_ );
}
}
my $xs = XML::Simple->new( NoAttr => 1, RootName => 'Data', );
my $xml = $xs->XMLout( \%roots, KeyAttr => {} );
say $xml;
It produces:
<Data>
<Root1>
<TBLA>
<NEW1>
<KEY>KEY6</KEY>
</NEW1>
</TBLA>
<TBLA>
<NEW2>
<KEY>KEY9</KEY>
<KEY>KEY10</KEY>
</NEW2>
</TBLA>
<TBLA>
<MODIFIED>
<COLA>
<newvalue>B</newvalue>
<oldvalue>A</oldvalue>
</COLA>
<COLB>
<newvalue>E</newvalue>
<oldvalue>D</oldvalue>
</COLB>
<COLX>
<newvalue>N</newvalue>
<oldvalue>M</oldvalue>
</COLX>
</MODIFIED>
</TBLA>
</Root1>
<Root2>
<TBLB>
<NEW1>
<KEY>KEY7</KEY>
</NEW1>
</TBLB>
<TBLB></TBLB>
<TBLB>
<MODIFIED>
<COLD>
<newvalue>B</newvalue>
<oldvalue>A</oldvalue>
</COLD>
<COLX>
<newvalue>N</newvalue>
<oldvalue>M</oldvalue>
</COLX>
</MODIFIED>
</TBLB>
</Root2>
<Root3>
<TBLC>
<NEW1>
<KEY>KEY8</KEY>
</NEW1>
</TBLC>
<TBLC>
<NEW2>
<KEY>KEY11</KEY>
</NEW2>
</TBLC>
<TBLC>
<MODIFIED>
<COLD>
<newvalue>B</newvalue>
<oldvalue>A</oldvalue>
</COLD>
</MODIFIED>
</TBLC>
</Root3>
</Data>
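If the three files should not be concatenated into one '---'-separated stream, a rough alternative (file names are made up) is to run each processor over its own file, replacing the shift/'---' dispatch in the while loop above:
my @files = ( 'diff.txt', 'db1_only.txt', 'db2_only.txt' );   # hypothetical names
for my $i ( 0 .. $#files ) {
    open my $fh, '<', $files[$i] or die "Cannot open $files[$i]: $!";
    while ( <$fh> ) {             # the processors split $_, so keep reading into $_
        chomp;
        $processors[$i]->( $_ );
    }
    close $fh;
}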

Interpreting Perl DBI MySQL column_info()

I'm trying to write Perl code that accepts bindings for a SQL INSERT statement, and identifies problems that might cause the INSERT to be rejected for bad data, and fixes them. To do this, I need to get and interpret column metadata.
The $dbh->column_info method returns the info in a coded form. I've pored through the official CPAN documentation but am still confused.
my $sth_column_info
= $dbh->column_info( $catalog, $schema, $table, undef );
my $columns_aoh_ref = $sth_column_info->fetchall_arrayref(
{ COLUMN_NAME => 1,
DATA_TYPE => 1,
NULLABLE => 1,
ORDINAL_POSITION => 1,
COLUMN_DEF => 1,
}
);
say $table;
for my $href (@$columns_aoh_ref) {
    my @list;
    while ( my ( $k, $v ) = each %$href ) {
        push @list, "$k=" . ( $v // 'undef' );
    }
    say join '|', @list;
}
Output is:
dw_phone
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=4|ORDINAL_POSITION=1|COLUMN_NAME=phone_id
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=4|ORDINAL_POSITION=2|COLUMN_NAME=phone_no
NULLABLE=1|COLUMN_DEF=undef|DATA_TYPE=4|ORDINAL_POSITION=3|COLUMN_NAME=phone_ext
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=1|ORDINAL_POSITION=4|COLUMN_NAME=phone_type
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=1|ORDINAL_POSITION=5|COLUMN_NAME=phone_location
NULLABLE=1|COLUMN_DEF=undef|DATA_TYPE=1|ORDINAL_POSITION=6|COLUMN_NAME=phone_status
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=11|ORDINAL_POSITION=7|COLUMN_NAME=insert_date
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=11|ORDINAL_POSITION=8|COLUMN_NAME=update_date
Where - for example - does one find a mapping of data type codes to strings? Should I be using DATA_TYPE, TYPE_NAME, or SQL_DATA_TYPE? Should I be using NULLABLE or IS_NULLABLE, and why the two flavors?
I can appreciate the difficulty of documenting (let alone implementing) a universal interface for databases. But I wonder if anyone knows of a reference manual for using the DBI that's specific to MySQL?
UPDATE 1:
Tried to shed more light by retrieving all info using an array rather than a hash:
my $sth_column_info
= $dbh->column_info( $catalog, $schema, $table, undef );
my $aoa_ref = $sth_column_info->fetchall_arrayref; # <- chg. to arrayref, no parms
say $table;
for my $aref (@$aoa_ref) {
    my @list = map $_ // 'undef', @$aref;
    say join '|', @list;
}
Now I can see lots of potentially useful information mixed in there.
dw_contact_source
undef|dwcust1|dw_contact_source|contact_id|4|BIGINT|20|undef|undef|10|0|undef|undef|4|undef|undef|1|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|1|bigint(20)|undef|0
undef|dwcust1|dw_contact_source|company_id|4|SMALLINT|6|undef|undef|10|0|undef|undef|4|undef|undef|2|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|1|smallint(6)|undef|0
undef|dwcust1|dw_contact_source|contact_type_id|4|TINYINT|4|undef|undef|10|0|undef|undef|4|undef|undef|3|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef||tinyint(4)|undef|0
undef|dwcust1|dw_contact_source|insert_date|11|DATETIME|19|undef|0|undef|0|undef|undef|9|-79|undef|4|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef||datetime|undef|0
undef|dwcust1|dw_contact_source|update_date|11|DATETIME|19|undef|0|undef|0|undef|undef|9|-79|undef|5|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef||datetime|undef|0
So my question would be:
How do I get the corresponding names/descriptions of these metadata?
How do I fetchall_arrayref just what I need, using symbols rather than integers? (I tried fetchall_arrayref([qw/COLUMN_NAME DATA_TYPE/]) and got back all undefs; now I'm just flailing about guessing.)
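(For reference: the slice argument of fetchall_arrayref is either an array ref of column index numbers or a hash ref of column names, which would explain the undefs from a plain list of names. An untested illustration:)
# hash-ref slice selects columns by name (what the earlier snippet already does)
my $cols = $sth_column_info->fetchall_arrayref( { COLUMN_NAME => 1, DATA_TYPE => 1 } );
# array-ref slice selects columns by index number, not by name:
# my $cols = $sth_column_info->fetchall_arrayref( [ 3, 4 ] );   # COLUMN_NAME, DATA_TYPE in DBD::mysql's order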
UPDATE 2:
Now I'm digging around in DBD::mysql.pm and I found a very interesting array:
my @names = qw(
TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME
DATA_TYPE TYPE_NAME COLUMN_SIZE BUFFER_LENGTH DECIMAL_DIGITS
NUM_PREC_RADIX NULLABLE REMARKS COLUMN_DEF
SQL_DATA_TYPE SQL_DATETIME_SUB CHAR_OCTET_LENGTH
ORDINAL_POSITION IS_NULLABLE CHAR_SET_CAT
CHAR_SET_SCHEM CHAR_SET_NAME COLLATION_CAT COLLATION_SCHEM COLLATION_NAME
UDT_CAT UDT_SCHEM UDT_NAME DOMAIN_CAT DOMAIN_SCHEM DOMAIN_NAME
SCOPE_CAT SCOPE_SCHEM SCOPE_NAME MAX_CARDINALITY
DTD_IDENTIFIER IS_SELF_REF
mysql_is_pri_key mysql_type_name mysql_values
mysql_is_auto_increment
);
These correspond precisely to what is returned by fetchall_arrayref. Now I can see I have four choices for learning the data type, so let's see if any of the codes are documented.
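(An untested sketch of pairing that @names list with the raw rows from UPDATE 1, so every value gets its label:)
for my $aref (@$aoa_ref) {
    my %col;
    @col{@names} = @$aref;   # hash slice: pair each name with its value
    printf "%-20s DATA_TYPE=%s TYPE_NAME=%s NULLABLE=%s\n",
        map { $_ // 'undef' } @col{qw(COLUMN_NAME DATA_TYPE TYPE_NAME NULLABLE)};
}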
UPDATE 3:
DBI Recipes is a very nice adjunct to the CPAN DBI documentation about getting information back into Perl (especially the {select|fetch}{row|all}_{hash|array} methods).
This will help you determine the values for the data types. I normally use DATA_TYPE to decide how to handle a column based on its type.
You first look up the MySQL type in the DBD::mysql mapping below to get its ANSI type, then look at the DBI table below and match that name up to get the DATA_TYPE value. Example: a BIGINT maps to INTEGER, which matches SQL_INTEGER, so the DATA_TYPE value is 4.
DBD::MySQL
### ANSI datatype mapping to mSQL datatypes
%DBD::mysql::db::ANSI2db = ("CHAR" => "CHAR",
"VARCHAR" => "CHAR",
"LONGVARCHAR" => "CHAR",
"NUMERIC" => "INTEGER",
"DECIMAL" => "INTEGER",
"BIT" => "INTEGER",
"TINYINT" => "INTEGER",
"SMALLINT" => "INTEGER",
"INTEGER" => "INTEGER",
"BIGINT" => "INTEGER",
"REAL" => "REAL",
"FLOAT" => "REAL",
"DOUBLE" => "REAL",
"BINARY" => "CHAR",
"VARBINARY" => "CHAR",
"LONGVARBINARY" => "CHAR",
"DATE" => "CHAR",
"TIME" => "CHAR",
"TIMESTAMP" => "CHAR"
);
DBI.pm
TYPE
The TYPE attribute contains a reference to an array of integer values representing the international standard values for the respective datatypes. The array of integers has a length equal to the number of columns selected within the original statement, and can be referenced in a similar way to the NAME attribute example shown earlier.
The standard values for common types are:
SQL_CHAR 1
SQL_NUMERIC 2
SQL_DECIMAL 3
SQL_INTEGER 4
SQL_SMALLINT 5
SQL_FLOAT 6
SQL_REAL 7
SQL_DOUBLE 8
SQL_DATE 9
SQL_TIME 10
SQL_TIMESTAMP 11
SQL_VARCHAR 12
SQL_LONGVARCHAR -1
SQL_BINARY -2
SQL_VARBINARY -3
SQL_LONGVARBINARY -4
SQL_BIGINT -5
SQL_TINYINT -6
SQL_BIT -7
SQL_WCHAR -8
SQL_WVARCHAR -9
SQL_WLONGVARCHAR -10
While these numbers are fairly standard, the way drivers map their native types to these standard types varies greatly. Native types that don't correspond well to one of these types may be mapped into the range officially reserved for use by the Perl DBI: -9999 to -9000.
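As a rough illustration (untested), DBI exports these constants under the :sql_types tag, so the numeric DATA_TYPE codes from column_info can be listed or compared by name rather than hard-coding 4, 11 and so on:
use strict;
use warnings;
use DBI qw(:sql_types);

# list every SQL_* constant in the :sql_types export tag with its numeric value
for my $const ( @{ $DBI::EXPORT_TAGS{sql_types} } ) {
    printf "%-20s %d\n", $const, DBI->can($const)->();
}

# or compare a column's code directly against a constant, e.g.
# warn 'integer column' if $href->{DATA_TYPE} == SQL_INTEGER;   # SQL_INTEGER == 4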

How to write hash key and value pairs into MongoDB documents as field values in Perl?

my %Hash= (2012=> 1, 1982=>12, 2010=>0);
The hash keys and values all need to go into the same field, 'time', as an array.
$mycollection->insert(
{
'field1' => $var1,
'field2' => $var2,
#right here I need to know how to add above hash key and values
# like below
#'time': ["2012.1","1982.12","2010.0"]
}
);
Any suggestions or ideas will be appreciated. This could probably be accomplished with a series of update statements, but I would like to do it with one insert statement due to my requirements.
I suppose your %Hash variable is something like this:
my %Hash= (2012=> 1, 1982=>12, 2010=>0);
So your array "time" is built this way:
my @time = map { $_ . "." . $Hash{$_} } keys %Hash;
and finally:
$mycollection->insert({
'field1' => $var1,
'field2' => $var2,
'time' => \@time
});
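One note on ordering: a Perl hash has no defined key order, so if the time entries should come out sorted, sorting the keys inside the map is a small (untested) tweak:
my @time = map { "$_.$Hash{$_}" } sort { $a <=> $b } keys %Hash;
# gives ("1982.12", "2010.0", "2012.1")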

Having an SQL SELECT query, how do I get number of items?

I'm writing a web app in Perl using Dancer framework. The database is in sqlite and I use DBI for database interaction.
I'm fine with select statements, but I wonder whether there is a way to count the selected rows.
E.g. I have
get '/' => sub {
my $content = database->prepare(sprintf("SELECT * FROM content LIMIT %d",
$CONTNUM));
$content->execute;
print(Dumper($content->fetchall_arrayref));
};
How do I count all items in the result without issuing another query?
What I want to achieve this way is showing 30 items per page and knowing how many pages there would be. Of course I can run SELECT COUNT (*) foo bar, but it looks wrong and redundant to me. I'm looking for a more or less general, DRY and not too heavy on database way to do so.
Any SQL or Perl hack or a hint what should I read about would be appreciated.
// I know using string concatenation for queries is bad
You have to do it the hard way: one query to get the count and another to get your desired slice of the row set:
my $count = $database->prepare('SELECT COUNT(*) FROM content');
$count->execute();
my $n = $count->fetchall_arrayref()->[0][0];
my $content = $database->prepare('SELECT * FROM content LIMIT ?');
$content->execute($CONTNUM);
#...
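If you prefer, DBI's selectrow_array collapses the count query into a single call (a small variation on the above, nothing Dancer-specific):
my ($n) = $database->selectrow_array('SELECT COUNT(*) FROM content');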
Not too familiar with Perl, but I assume you can just store the result of $content->fetchall_arrayref and retrieve the count from that array before you print it.
[edit]
Something like
my $ref = $content->fetchall_arrayref;
my $count = scalar(@$ref);
I don't use SQLite myself, but the following might work:
select * from table join (select count(*) from table);
Whether the above works or not, the first thing I'd look for is scrollable cursors if you are going to page through results; I doubt SQLite has those. However, in DBI you can use fetchall_arrayref with a max_rows argument to fetch a "page" at a time. Just look up the example in the DBI docs under fetchall_arrayref - it is something like this:
my $rowcache = [];
while( my $row = ( shift(@$rowcache) || shift(@{$rowcache=$sth->fetchall_arrayref(undef,100)||[]}) )
) {
# do something here
}
UPDATE: Added what you'd get with selectall_hashref assuming the table is called content with one integer column called "a":
$ perl -le 'use DBI; my $h = DBI->connect("dbi:SQLite:dbname=fred.db"); my $r = $h->selectall_hashref(q/select * from content join (select count(*) as count from content)/, "a");use Data::Dumper;print Dumper($r);'
$VAR1 = {
'1' => {
'count' => '3',
'a' => '1'
},
'3' => {
'count' => '3',
'a' => '3'
},
'2' => {
'count' => '3',
'a' => '2'
}
};
If you want to know how many results there will be, as well as getting the results themselves, all in one query, then get the count as a new value:
SELECT COUNT(*) AS num_rows, * from Table WHERE ...
Now the row count will be the first column of every row of your resultset, so simply pop that off before presenting the data.
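In Perl that could look roughly like this (untested; the statement handle $sth and the column layout are assumed):
my $rows  = $sth->fetchall_arrayref;                 # each row: [ num_rows, col1, col2, ... ]
my $total = @$rows ? $rows->[0][0] : 0;              # the count, or 0 if nothing matched
my @data  = map { [ @{$_}[ 1 .. $#$_ ] ] } @$rows;   # rows with the leading count stripped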