XML File Creation in Perl - Updated Requirements [duplicate] - perl

Sorry, I am posting this again, but a lot of the requirements have changed and I need advice.
My first input file is
Root1 TBLA KEY1 COLA A B
Root1 TBLA KEY1 COLB D E
Root1 TBLA KEY3 COLX M N
Root2 TBLB KEY4 COLX M N
Root2 TBLB KEY4 COLD A B
Root3 TBLC KEY5 COLD A B
My second input file is
Root1 TBLA KEY6
Root2 TBLB KEY7
Root3 TBLC KEY8
My third input file is
Root1 TBLA KEY9
Root1 TBLA KEY10
Root3 TBLC KEY11
The files represent the following:
1) The first file holds the old and new values. The first column is the root table, the second is the actual table in which the difference occurs, the third is the key value, the fourth is the column name, and the fifth and sixth are the old and new values.
2) The second file lists the primary keys that exist in db1 only and not in db2. The first column is the root table, the second is the actual table in which the key exists, and the third is the key value.
3) The third file lists the primary keys that exist in db2 only and not in db1. The first column is the root table, the second is the actual table in which the key exists, and the third is the key value.
The output should be created in XML format as
<Data>
<Root1>
<TBLA>
<NEW1>
<KEY>KEY6</KEY>
</NEW1>
<NEW2>
<KEY>KEY9</KEY>
<KEY>KEY10</KEY>
</NEW2>
<MODIFIED>
<KEY name="KEY1">
<COLA>
<oldvalue>A</oldvalue>
<newvalue>B</newvalue>
</COLA>
<COLB>
<oldvalue>D</oldvalue>
<newvalue>E</newvalue>
</COLB>
</KEY>
<KEY name="KEY3">
<COLX>
<oldvalue>M</oldvalue>
<newvalue>N</newvalue>
</COLX>
</KEY>
</MODIFIED>
</TBLA>
</Root1>
</Data>
This is not the complete output; only part of it is shown.
Can anyone suggest the best way to do this? Should I convert these text files to a hash of hashes first and then try using pltoxml()? Does this make sense? Would XML::Simple or XML::Writer be sufficient for this?
This is the first time I am working with XML, and I am not sure which approach will solve my problem efficiently.
A small example matching my requirements would be appreciated.
*The input files will always be sorted on Root and then TBLNAME.
Output format
The output contains, for every root, every table in that root, and for every table, the keys that exist only in the first database and then the keys that exist only in the second. These go in the NEW1 and NEW2 sections respectively. The third section, MODIFIED, is read from the first input file and lists each key value together with the columns modified for that key (their old and new values).
If I have to use XML::Simple, how do I create a hashref from these files that I can pass to XMLout? There is no key in any of these files.

This is simply a matter of using split to break each line into fields, storing them in a hash, and then transforming that hash with XML::Simple.
Note that I put things into an array to enforce the order you intended.
All the data is read from a single input stream, with a line containing only --- separating one file's contents from the next. You shouldn't need me to show you IO code.
The @processors array simply holds the different processors you would use on the various files:
Code:
use 5.016;
use strict;
use warnings;
use XML::Simple qw(:strict);

my %roots;

# One processor per input file, in the order the files arrive.
# Array slots 0, 1 and 2 fix the order NEW1, NEW2, MODIFIED in the output.
my @processors
    = ( sub {
            # File 1: modified columns ($key is split out but not used here).
            my ( $root, $table, $key, $col, $old, $new ) = split /\s+/;
            $roots{ $root }{ $table }[2]{MODIFIED}{ $col }
                = { oldvalue => $old
                  , newvalue => $new
                  };
            return;
        }
      , sub {
            # File 2: keys that exist in db1 only.
            my ( $root, $table, $key ) = split /\s+/;
            push @{ $roots{ $root }{ $table }[0]{NEW1}{KEY} }, $key;
        }
      , sub {
            # File 3: keys that exist in db2 only.
            my ( $root, $table, $key ) = split /\s+/;
            push @{ $roots{ $root }{ $table }[1]{NEW2}{KEY} }, $key;
        }
      );

my $processor = shift @processors;
while ( <> ) {
    chomp;
    if ( $_ eq '---' ) {
        # A line of --- marks the start of the next file's data.
        $processor = shift @processors;
    }
    else {
        $processor->( $_ );
    }
}

my $xs = XML::Simple->new( NoAttr => 1, RootName => 'Data', );
my $xml = $xs->XMLout( \%roots, KeyAttr => {} );
say $xml;
It produces:
<Data>
<Root1>
<TBLA>
<NEW1>
<KEY>KEY6</KEY>
</NEW1>
</TBLA>
<TBLA>
<NEW2>
<KEY>KEY9</KEY>
<KEY>KEY10</KEY>
</NEW2>
</TBLA>
<TBLA>
<MODIFIED>
<COLA>
<newvalue>B</newvalue>
<oldvalue>A</oldvalue>
</COLA>
<COLB>
<newvalue>E</newvalue>
<oldvalue>D</oldvalue>
</COLB>
<COLX>
<newvalue>N</newvalue>
<oldvalue>M</oldvalue>
</COLX>
</MODIFIED>
</TBLA>
</Root1>
<Root2>
<TBLB>
<NEW1>
<KEY>KEY7</KEY>
</NEW1>
</TBLB>
<TBLB></TBLB>
<TBLB>
<MODIFIED>
<COLD>
<newvalue>B</newvalue>
<oldvalue>A</oldvalue>
</COLD>
<COLX>
<newvalue>N</newvalue>
<oldvalue>M</oldvalue>
</COLX>
</MODIFIED>
</TBLB>
</Root2>
<Root3>
<TBLC>
<NEW1>
<KEY>KEY8</KEY>
</NEW1>
</TBLC>
<TBLC>
<NEW2>
<KEY>KEY11</KEY>
</NEW2>
</TBLC>
<TBLC>
<MODIFIED>
<COLD>
<newvalue>B</newvalue>
<oldvalue>A</oldvalue>
</COLD>
</MODIFIED>
</TBLC>
</Root3>
</Data>
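The output above drops the key names from the MODIFIED section and repeats the table element once per section. If the layout in the question is needed exactly (one table element, with the columns grouped under KEY name="..."), one option is to group by key while reading and emit the tags directly. The following is only a sketch under stated assumptions: build_xml and esc are hypothetical helper names, it uses core Perl only, it assumes root, table and column names are valid XML element names, and it sorts roots and tables alphabetically. A dedicated module such as XML::Writer would be the more robust choice for real data.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub esc {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Hypothetical helper: each argument is an array ref of input lines
# (file 1, file 2 and file 3 from the question, in that order).
sub build_xml {
    my ( $modified, $only_db1, $only_db2 ) = @_;
    my %tree;    # $tree{root}{table} = { NEW1 => [...], NEW2 => [...], MODIFIED => [...] }

    for my $pair ( [ NEW1 => $only_db1 ], [ NEW2 => $only_db2 ] ) {
        my ( $section, $lines ) = @$pair;
        for my $line (@$lines) {
            my ( $root, $table, $key ) = split ' ', $line;
            push @{ $tree{$root}{$table}{$section} }, $key;
        }
    }

    my %cols_for;    # "root table key" => array ref of [ col, old, new ]
    for my $line (@$modified) {
        my ( $root, $table, $key, $col, $old, $new ) = split ' ', $line;
        my $cols = $cols_for{"$root $table $key"};
        if ( !$cols ) {
            # First time this key is seen: register it under MODIFIED.
            $cols = $cols_for{"$root $table $key"} = [];
            push @{ $tree{$root}{$table}{MODIFIED} }, [ $key, $cols ];
        }
        push @$cols, [ $col, $old, $new ];
    }

    my $xml = "<Data>\n";
    for my $root ( sort keys %tree ) {
        $xml .= "<$root>\n";
        for my $table ( sort keys %{ $tree{$root} } ) {
            my $t = $tree{$root}{$table};
            $xml .= "<$table>\n";
            for my $section (qw(NEW1 NEW2)) {
                $xml .= "<$section>\n";
                $xml .= '<KEY>' . esc($_) . "</KEY>\n" for @{ $t->{$section} || [] };
                $xml .= "</$section>\n";
            }
            $xml .= "<MODIFIED>\n";
            for my $entry ( @{ $t->{MODIFIED} || [] } ) {
                my ( $key, $cols ) = @$entry;
                $xml .= '<KEY name="' . esc($key) . qq{">\n};
                for my $c (@$cols) {
                    my ( $col, $old, $new ) = @$c;
                    $xml .= "<$col>\n"
                          . '<oldvalue>' . esc($old) . "</oldvalue>\n"
                          . '<newvalue>' . esc($new) . "</newvalue>\n"
                          . "</$col>\n";
                }
                $xml .= "</KEY>\n";
            }
            $xml .= "</MODIFIED>\n</$table>\n";
        }
        $xml .= "</$root>\n";
    }
    return $xml . "</Data>\n";
}
```

Called as print build_xml( \@file1_lines, \@file2_lines, \@file3_lines ), where the three arrays (names assumed here) hold the lines of the respective input files. Sections with no entries are still emitted as empty elements.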

Related

Perl : Printing a hash with multiple values for single key

I have a hash with multiple values per key. How can I print the multiple values of a key independently?
# Hash with multiple values for each key
my %hash = ( "fruits" => [ "apple", "mango" ], "vegs" => [ "Potato", "Onion" ] );
# Set up the table
print "<table border='1'>";
print "<th>Category</th><th>value1</th><th>value2</th>";
# Print keys and values in the hash in tabular format
foreach my $key (sort keys %hash) {
    print "<tr><td>".$key."</td>";
    print "<td>".@{$hash{$key}}."</td>";
}
* Current Output: *
Category Value1 Value2
fruits apple mango
vegs Potato Onion
* Desired Output: *
Category Value1 Value2
fruits apple mango
vegs Potato Onion
Try replacing the second line of your loop with
print "<td>$_</td>" for @{ $hash{$key} };
which will loop over each item in the array reference and wrap it in td tags.
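Put together, the loop with that replacement might look like the sketch below. The table_rows helper name is made up for illustration, and the closing tr and table tags are additions that the original snippet did not print.

```perl
use strict;
use warnings;

# Build one <tr> per key, with one <td> per value in that key's array ref.
sub table_rows {
    my ($hash) = @_;
    my $html = '';
    for my $key ( sort keys %$hash ) {
        $html .= "<tr><td>$key</td>";
        $html .= "<td>$_</td>" for @{ $hash->{$key} };    # dereference, then loop
        $html .= "</tr>\n";
    }
    return $html;
}

my %hash = ( fruits => [ 'apple', 'mango' ], vegs => [ 'Potato', 'Onion' ] );
print "<table border='1'>\n";
print "<tr><th>Category</th><th>value1</th><th>value2</th></tr>\n";
print table_rows( \%hash );
print "</table>\n";
```

The original line printed a number because an array used in string concatenation is evaluated in scalar context, which yields its element count.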

for loop in hash table printing only last value

Please advise.
for my $record (@item) {
    for my $int (@$record) {
        # DEBUG( "DEBUG:: $record and $int");
        my %data = ( $record, $int );
    }
}
The records look like this:
abc ,china
abc ,japan
abc , italy
abc , singapore
print Dumper %data;
output :
abc , singapore
Now the issue is that when I dump the output, it shows only the last record in the hash table. Maybe that is because the key must be unique.
Kindly suggest a fix.
Two problems:
You are recreating the hash in each iteration of the loop. The correct way would be
my %data;
for my $record (@item) {
    for my $int (@$record) {
        $data{$record} = $int;
    }
}
Hash keys must be unique. It's not possible to have a hash like
( abc => 'china',
abc => 'japan' )
You can use a hash of arrays, though. Just assign to it with
push @{ $data{$record} }, $int;
It will create the following structure:
( abc => [ 'china', 'japan', 'italy', 'singapore' ] )
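As a minimal, self-contained sketch of that hash-of-arrays approach: the group_pairs helper and the [ key, value ] input shape are assumptions made for illustration, since the question does not show how @item is built.

```perl
use strict;
use warnings;

# Group values by key; repeated keys accumulate in an array ref
# instead of overwriting each other.
sub group_pairs {
    my (@pairs) = @_;    # each pair is [ key, value ]
    my %data;            # declared once, outside the loop
    for my $pair (@pairs) {
        my ( $key, $value ) = @$pair;
        push @{ $data{$key} }, $value;    # autovivifies the array on first use
    }
    return \%data;
}

my $data = group_pairs(
    [ abc => 'china' ],
    [ abc => 'japan' ],
    [ abc => 'italy' ],
    [ abc => 'singapore' ],
);
print "$_: @{ $data->{$_} }\n" for sort keys %$data;
# prints "abc: china japan italy singapore"
```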

SQLite: Add columns on the fly if they don't exist

sub insert {
    my ($self) = @_;
    my $data = $self->{data};
    my $keys = join( ', ', keys %$data );
    my $values = join( ', ', map qq('$_'), values %$data );
    my $sql = "INSERT INTO tbl ($keys) VALUES ($values);";
    my $sth = $self->{dbh}->prepare($sql);
    $sth->execute();
}
I have a method that inserts the contents of a hash ref into a one-table SQLite database. I was wondering if there is a simple way to add a column to the table if a hash key is not already a column. Obviously, if one of the keys is not a column name, the insert will fail. Can I capitalize on that failure, add the missing columns, and redo the insert? Or would I have to check all columns against all the keys each time I want to insert into the database? (All keys have TEXT values.)
Alter your table based on the info you get using PRAGMA:
my $inf_query = $db->prepare("PRAGMA table_info('tbl')");
$inf_query->execute();
my @inf = map { $_->[1] } @{$inf_query->fetchall_arrayref()};
@inf will be an array containing the columns present in the table, and you can use that info to construct your ALTER query.
Edited to return an array you can use with grep ;)
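One way to turn that column list into ALTER statements is sketched below. The missing_column_sql helper is hypothetical; it assumes every new column is TEXT (as stated in the question) and that the hash keys contain no double quotes, since they are spliced into the SQL as quoted identifiers.

```perl
use strict;
use warnings;

# Hypothetical helper: given the table name, the existing column names
# (e.g. @inf from the PRAGMA query) and the data hash ref, return one
# ALTER TABLE statement per hash key that is not yet a column.
sub missing_column_sql {
    my ( $table, $existing, $data ) = @_;

    # SQLite treats column names case-insensitively, so compare in lower case.
    my %have = map { ( lc($_) => 1 ) } @$existing;

    return map  { qq{ALTER TABLE $table ADD COLUMN "$_" TEXT} }
           grep { !$have{ lc $_ } }
           sort keys %$data;
}

my @statements = missing_column_sql(
    'tbl',
    [ 'id', 'name' ],
    { name => 'x', colour => 'red', Size => 'L' },
);
print "$_\n" for @statements;
```

Each returned statement could then be run with $dbh->do($_) before the INSERT is attempted.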

perl script to search for multiple lines from a file with start and end keywords

I am trying to search a .sql file for SQL statements that start with CREATE TABLE, followed by the field definitions, then the keywords [TB_DATA and TB_INDX], and end with ;. Each statement spans multiple lines.
Sample .sql file, with statements on multiple lines:
-- CREATE TABLE HDTB_COD;
CREATE TABLE HDTB_CODE( IDPK VARCHAR(256) NOT NULL)
IN TB_DATA INDEX
IN TB_INDX;
CREATE TABLE HDTB_RES
(ARTID VARCHAR(256) NOT NULL)
IN TB_DATA INDEX
IN TB_INDX;
-- DROP TABLE HDTB_COD;
CREATE TABLE HDTB_DE ( IDPK VARCHAR(256)
NOT NULL);
-------------output----------------------
CREATE TABLE HDTB_CODE( IDPK VARCHAR(256) NOT NULL)
IN TB_DATA INDEX IN TB_INDX;
CREATE TABLE HDTB_RES(ARTID VARCHAR(256) NOT NULL)
IN TB_DATA INDEX IN TB_INDX;
perl -n -e 'chomp; next if /^--/; @p = () if /CREATE TABLE/; push @p, $_; if ("@p" =~ /IN TB_DATA INDEX\s+IN TB_INDX;/) { print "@p\n"; @p = () }' t.sql
How it works
chomp; # remove newlines
next if /^--/; # skip lines that are SQL comments
@p = () if /CREATE TABLE/; # start of a table definition, clear array @p
push @p, $_; # put current line into array @p
# the two keywords sit on separate lines, so test the joined lines rather than $_;
# once they match, print @p and clear it
if ("@p" =~ /IN TB_DATA INDEX\s+IN TB_INDX;/) { print "@p\n"; @p = () }
Here is an example of how to create quick-and-dirty parsing pipelines. Once you understand the basic pattern, it's easy to add more filtering steps (with grep) or transforming steps (with map).
# Slurp entire file.
my $sql = do { local $/ = undef; <> };
# 1. Grab the CREATE TABLE statements.
# 2. Retain only the statements of interest.
# 3. Modify the statements as needed before printing.
print
map { "$_\n" } # 3b. Add trailing newlines.
map { s/\s+/ /g; $_ } # 3a. Normalize whitespace.
grep { /IN TB_INDX/ } # 2b. Filter.
grep { /IN TB_DATA INDEX/ } # 2a. Filter.
$sql =~ /^(CREATE TABLE .+?;)\s*$/gsm; # 1. Grab.
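To try the same pipeline without a file, it can be wrapped in a function and fed the sample from the question. This is only a sketch: the tablespace_statements name is made up, and the non-destructive /r substitution flag (Perl 5.14 or later) replaces the in-place s/// so the matched strings are not modified.

```perl
use 5.014;
use strict;
use warnings;

# Return the CREATE TABLE statements that mention both tablespaces,
# each collapsed onto a single line.
sub tablespace_statements {
    my ($sql) = @_;
    return
        map  { s/\s+/ /gr }                       # 3. Normalize whitespace.
        grep { /IN TB_INDX/ }                     # 2b. Filter.
        grep { /IN TB_DATA INDEX/ }               # 2a. Filter.
        $sql =~ /^(CREATE TABLE .+?;)\s*$/gsm;    # 1. Grab.
}

my $sample = <<'END_SQL';
-- CREATE TABLE HDTB_COD;
CREATE TABLE HDTB_CODE( IDPK VARCHAR(256) NOT NULL)
IN TB_DATA INDEX
IN TB_INDX;
CREATE TABLE HDTB_RES
(ARTID VARCHAR(256) NOT NULL)
IN TB_DATA INDEX
IN TB_INDX;
-- DROP TABLE HDTB_COD;
CREATE TABLE HDTB_DE ( IDPK VARCHAR(256)
NOT NULL);
END_SQL

print "$_\n" for tablespace_statements($sample);
```

The commented-out statements are skipped because the pattern is anchored at the start of a line, and HDTB_DE is dropped by the two grep filters.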

DBI: selectall_arrayref and columnnames

When I fetch the data this way, is it possible to access the column names and column types, or do I need an explicit prepare to get at them?
use DBI;
my $dbh = DBI->connect( ... );
my $select = "...";
my @arguments = ( ... );
my $ref = $dbh->selectall_arrayref( $select, {}, @arguments, );
Update:
With prepare I would do it this way:
my $sth = $dbh->prepare( $select );
$sth->execute( @arguments );
my $col_names = $sth->{NAME};
my $col_types = $sth->{TYPE};
my $ref = $sth->fetchall_arrayref;
unshift @$ref, $col_names;
The best solution is to use prepare to get a statement handle, as you describe in the second part of your question. If you use selectall_hashref or selectall_arrayref, you don't get a statement handle, and you have to query the column type information yourself via $dbh->column_info:
my $sth = $dbh->column_info('','',$table,$column); # or $column='' for all
my $info = $sth->fetchall_arrayref({});
use Data::Dumper; print Dumper($info);
(specifically, the COLUMN_NAME and TYPE_NAME attributes).
However, this introduces a race condition if the table changes schema between the two queries.
Also, you can use selectall_arrayref with the Slice parameter to fetch every row as a hash ref. It needs no prepared statement and returns an array ref of the result rows, where each row is a hash whose keys are the column names and whose values are the column values, i.e.:
my $result = $dbh->selectall_arrayref( qq{
SELECT * FROM table WHERE condition = value
}, { Slice => {} }) or die "Error: ".$dbh->errstr;
# $result now looks like:
# [
#   { column1 => 'column1Value', column2 => 'column2Value', ... },   # row 0
#   { column1 => 'column1Value', column2 => 'column2Value', ... },   # row 1
# ]
This makes it easy to iterate over the results, i.e.:
for my $row ( @$result ) {
    print "$row->{column1}, $row->{column2}\n";
}
You can also specify which columns to extract, but that is rarely useful because it is more efficient to select only those columns in the SQL query itself.
{ Slice => { column1Name => 1, column2Name => 1 } }
That would return only the values for column1Name and column2Name, just like saying in your SQL:
SELECT column1Name, column2Name FROM table...
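If you stay with the prepare route from the question, the names in $sth->{NAME} can be paired with each row from fetchall_arrayref to get the same hash-per-row shape. The sketch below uses plain data in place of a statement handle so that it runs without a database; rows_to_hashes and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Pair each row (an array ref) with the column names to get one hash per row.
sub rows_to_hashes {
    my ( $names, $rows ) = @_;
    return [
        map {
            my %h;
            @h{@$names} = @$_;    # hash slice: names become keys, row values the values
            \%h;
        } @$rows
    ];
}

# Stand-ins for $sth->{NAME} and $sth->fetchall_arrayref.
my $col_names = [ 'id', 'city' ];
my $rows      = [ [ 1, 'Oslo' ], [ 2, 'Lima' ] ];

for my $row ( @{ rows_to_hashes( $col_names, $rows ) } ) {
    print "$row->{id}, $row->{city}\n";
}
```

Unlike the unshift in the question, this leaves the row list free of a header row, so every element can be treated the same way.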