How do I handle a varying number of items from a database query? - perl

Effectively a duplicate of: How can I display data in table with Perl
The accepted answer there applies here. So do some of the alternatives.
I am trying to run raw database queries from a Perl program and display the results to the user, something like select * from table. I want to display the information in an HTML table whose columns correspond to the returned columns.
I am having some issues. I can run a describe table query to find out how many columns the table has. However, how do I store the returned results in arrays?
So if I am storing results like this:
while (($f1, $t2, $n3, $k4, $d5, $e6) = $sth1->fetchrow_array)
In this case I only know that there are, say, four columns (which I got from describe table). But this number is dynamic and can change depending on the table name, so I cannot declare my variables based on it. Any suggestions?

Try:
print "<table>\n";
# display HTML header
my @cols = @{ $sth->{NAME_uc} };
print "<tr>".join("", map { "<th>${_}</th>" } @cols)."</tr>\n";
# display one HTML table row for each DB row
while (my @row = $sth->fetchrow_array) {
print "<tr>".join("", map { "<td>${_}</td>" } @row)."</tr>\n";
}
print "</table>\n";

while (my @row = $sth->fetchrow_array)
{
print "<tr>".join("", map{ "<td>${_}</td>" } @row)."</tr>" ;
}

Use the technique suggested in the answer(s) to the other question - use fetchrow_array to fetch into an array:
while (my @row = $sth->fetchrow_array())
{
...process array...
}
Or use an alternative to fetchrow_array(), such as fetchrow_hashref().
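A minimal sketch of the same idea that runs without a database connection. The rows are hard-coded stand-ins for what fetchrow_array would return, and rows_to_html and escape_html are hypothetical helper names, not DBI functions. Unlike the snippets above, it also escapes the cell values, which matters as soon as the data can contain < or &.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text = '' unless defined $text;    # NULL columns arrive as undef
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: build a table from any number of columns.
sub rows_to_html {
    my ($cols, $rows) = @_;
    my $html = "<table>\n";
    $html .= '<tr>' . join('', map { sprintf '<th>%s</th>', escape_html($_) } @$cols) . "</tr>\n";
    for my $row (@$rows) {
        $html .= '<tr>' . join('', map { sprintf '<td>%s</td>', escape_html($_) } @$row) . "</tr>\n";
    }
    return $html . "</table>\n";
}

# Stand-in data; with DBI, @cols would come from @{ $sth->{NAME} }
# and each row from $sth->fetchrow_array.
my @cols = ('ID', 'NAME');
my @rows = ([1, 'Ann'], [2, 'B<ob>']);
print rows_to_html(\@cols, \@rows);
```

Because the helpers only see array references, the number of columns never appears in the code.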

Related

Perl cgi bind dynamic number of columns

I'm trying to make a simple select from a database, the thing is that I want the same script to be able to select any of the tables in it. I have gotten everything solved up until the point when I need to bind the columns to variables, since they must be generated dynamically I just don't know how to do it.
here's the code:
if($op eq "SELECT"){
if ($whr){
$query1 = "SELECT $colsf FROM $tab WHERE $whr";
}else{
$query1 = "SELECT $colsf FROM $tab";
}
$seth = $dbh->prepare($query1);
$seth->execute();
foreach $cajas (@columnas){
$seth->bind_col(*$dynamically_generated_var*);
}
print $q->br();
print $q->br();
print $q->br();
The variable @columnas contains the names of the selected columns (which vary a lot), and I need a variable assigned to each of the columns in the $seth->bind_col() call.
How can I achieve this?
Using bind_col will not gain you anything here. As you have already figured out, that's used to bind a fixed number of results to a set of variables. But you do not have a fixed set.
Thinking in terms of "oh, I can just create them dynamically" is a very common mistake. It will get you into all kinds of trouble later. Perl has a data structure specifically for this use case: the hash.
DBI has a bunch of functions built in for retrieving data after execute. One of those is fetchrow_hashref. It will return the results as a hash reference, with one key per column, one row at a time.
while (my $res = $sth->fetchrow_hashref) {
p $res; # p is from Data::Printer
}
Let's assume the result looks like this:
$res = {
id => 1,
color => 'red',
}
You can access the color by saying $res->{color}. The perldocs on perlref and perlreftut have a lot of info about this.
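As a small illustration of why the hash removes the need for per-column variables, here is a sketch that walks over whatever keys a row happens to have. The $res value is a hard-coded stand-in for one fetchrow_hashref result, not real query output.

```perl
use strict;
use warnings;

# Stand-in for one row as returned by $sth->fetchrow_hashref.
my $res = { id => 1, color => 'red' };

# The column names are just the hash keys, so no per-column
# variables are needed, however many columns the query returns.
my @lines;
for my $col (sort keys %$res) {
    push @lines, "$col = $res->{$col}";
}
print "$_\n" for @lines;
```

The keys are sorted here only to get a stable order; a hash does not remember the column order of the query.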
Note that the best practice for naming statement handle variables is $sth.
In your case, you have a dynamic number of columns. Those have to be joined to be in the format of col1, col2, col3. I guess you have already done that in $colsf. The table is pretty obvious in $tab, so we only have the $whr left.
This part is tricky. It's important to always sanitize your input, especially in a CGI environment. With DBI this is best done by using placeholders. They will take care of all the escaping for you, and they are easy to use.
my $sth = $dbh->prepare('select cars from garage where color=?');
$sth->execute($color);
Now we don't need to care if the color is red, blue or ' and 1; --, which might have broken stuff. If it's all very dynamic, use $dbh->quote instead.
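Placeholders also work when the number of values is not known in advance. The sketch below only builds the SQL fragment and the bind list, so it runs without a database; in_clause is a hypothetical helper name, and the commented-out DBI calls show where the result would be used.

```perl
use strict;
use warnings;

# Hypothetical helper: build a parameterised IN (...) clause.
sub in_clause {
    my ($column, @values) = @_;
    die "invalid column name\n" if $column =~ /[^a-zA-Z0-9_]/;
    die "need at least one value\n" unless @values;
    my $marks = join ', ', ('?') x @values;    # one ? per value
    return ("$column IN ($marks)", @values);
}

my ($where, @bind) = in_clause('color', 'red', "' and 1; --");
print "$where\n";    # color IN (?, ?)
# With DBI this would be used as:
#   my $sth = $dbh->prepare("SELECT cars FROM garage WHERE $where");
#   $sth->execute(@bind);
```

Only the column name is interpolated into the SQL, and it is checked against a whitelist pattern first; the values never touch the query string.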
Let's put this together in your code.
use strict;
use warnings;
use DBI;
# ...
# the columns
my $colsf = join ',', @some_list_of_column_names; # also check those!
# the table name
my $table = $q->param('table');
die 'invalid table name' if $table =~ /[^a-zA-Z0-9_]/; # input checking
# where
# I'm skipping this part as I don't know where it is coming from
if ($op eq 'SELECT') {
# double quotes, so that the variables are interpolated
my $sql = "SELECT $colsf FROM $table";
$sql .= " WHERE $whr" if $whr;
my $sth = $dbh->prepare($sql) or die $dbh->errstr;
$sth->execute;
my @headings = @{ $sth->{NAME} }; # NAME is an array reference, see https://metacpan.org/pod/DBI#NAME1
while (my $res = $sth->fetchrow_hashref) {
# do stuff here
}
}

Perl DBI Results Referenced by Index

The Problem:
I'm using DBI with Perl, and need to do a double nest loop through records in my records set.
In the past, I've used while statements like:
my $someQuery = "SELECT * FROM foo;";
my $sth = $dbh->prepare($someQuery);
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
my $someVariable = "$ref->{'dbFieldName'}";
# do stuff
}
What I'm trying to achieve would be easy to do with a for loop instead of a while loop, but I'd need to know how to reference the results by row index as well as the total row count. Any idea on how to do this?
Bonus Points:
The above will help me solve the one problem, but I'd like to get better techniques to let me figure it out on my own. I'm using Perl EPIC in Eclipse, which doesn't give any sort of context look-ahead/autocomplete (or if it does I don't know how to enable it). Is there a way to enable this or a different add-on for Eclipse that would show the context look-ahead, so I could see what options are available?
Current row number
There are a couple of ways you can get the row number:
Perl variable
my $row_num = 0;
while (my $row = $sth->fetchrow_hashref) {
$row_num++;
# Do stuff
}
SQL variable
This depends on which DBMS you're using, but you can do something like this in MySQL:
$dbh->do('SET @row_num := 0');
my $statement = <<'SQL';
SELECT @row_num := @row_num + 1 AS row_num,
foo,
bar
FROM table
SQL
my $sth = $dbh->prepare($statement);
$sth->execute;
while (my $row = $sth->fetchrow_hashref) {
say $row->{row_num};
# Do stuff
}
Total rows returned
You can use $sth->rows, with the following caveat:
Generally, you can only rely on a row count after a non-SELECT execute (for some specific operations like UPDATE and DELETE), or after fetching all the rows of a SELECT statement.
For SELECT statements, it is generally not possible to know how many rows will be returned except by fetching them all. Some drivers will return the number of rows the application has fetched so far, but others may return -1 until all rows have been fetched. So use of the rows method or $DBI::rows with SELECT statements is not recommended.
One alternative method to get a row count for a SELECT is to execute a "SELECT COUNT(*) FROM ..." SQL statement with the same "..." as your query and then fetch the row count from that.
However, if you're already counting each row as you fetch it, you could simply use that instead.
Below is the hack I did to get the results into an array structure, which will allow me to do the nested looping easily. Hopefully someone else will post an answer if they have a better way.
my @results;
my $fieldFirstName = 0;
my $fieldLastName = 1;
my $fieldAddress = 2;
my $i = 0;
my $someQuery = "SELECT FirstName, LastName, Address FROM foo;";
my $sth = $dbh->prepare($someQuery);
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
$results[$i][$fieldFirstName] = "$ref->{'FirstName'}";
$results[$i][$fieldLastName] = "$ref->{'LastName'}";
$results[$i][$fieldAddress] = "$ref->{'Address'}";
$i++;
}
$sth->finish();
my $resultsRecordCount = @results;
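If the goal is to loop by index, one alternative to copying fields by hand is to let DBI build the array: fetchall_arrayref({}) returns a reference to an array with one hash reference per row. The sketch below uses hard-coded stand-in rows in that shape, so the names and data are made up, and it assumes the result set is small enough to hold in memory.

```perl
use strict;
use warnings;

# Stand-in for: my $rows = $sth->fetchall_arrayref({});
my $rows = [
    { FirstName => 'Ann', LastName => 'Lee', Address => '1 Elm St' },
    { FirstName => 'Bob', LastName => 'Ray', Address => '2 Oak Ave' },
    { FirstName => 'Cy',  LastName => 'Lee', Address => '3 Fir Rd' },
];

my $count = @$rows;    # total row count, known once all rows are fetched

# Nested loop by index: compare every row with every later row.
my @same_last_name;
for my $i (0 .. $#$rows) {
    for my $j ($i + 1 .. $#$rows) {
        push @same_last_name, [$i, $j]
            if $rows->[$i]{LastName} eq $rows->[$j]{LastName};
    }
}
print "rows: $count\n";
print "rows $_->[0] and $_->[1] share a last name\n" for @same_last_name;
```

Keeping each row as a hash reference means the field-index constants from the hack above are no longer needed.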

Parsing CSV files, finding columns and remembering them

I am trying to figure out a way to do this, I know it should be possible. A little background first.
I want to automate the process of creating the NCBI Sequin block for submitting DNA sequences to GenBank. I always end up creating a table that lists the species name, the specimen ID value, the type of sequences, and finally the location of the collection. It is easy enough for me to export this into a tab-delimited file. Right now I do something like this:
while ($csv) {
foreach ($_) {
if ($_ !~ m/table|species|accession/i) {
@csv = split('\t', $csv);
print NEWFILE ">[species=$csv[0]] [molecule=DNA] [moltype=genomic] [country=$csv[2]] [spec-id=$csv[1]]\n";
}
else {
next;
}
}
}
I know that is messy, and I just typed up something similar to what I have by memory (don't have script on any of my computers at home, only at work).
Now that works for me fine right now because I know which columns the information I need (species, location, and ID number) are in.
But is there a way (there must be) for me to find the columns that are for the needed info dynamically? That is, no matter the order of the columns the correct info from the correct column goes to the right place?
The first row will usually be Table X (where X is the number of the table in the publication). The next row will usually have the column headings of interest, which are nearly universal in title. Nearly all tables will have standard headings to search for, and I can just use | in my pattern matching.
First off, I would be remiss if I didn’t recommend the excellent Text::CSV_XS module; it does a much more reliable job of reading CSV files, and can even handle the column-mapping scheme that Barmar referred to above.
That said, Barmar has the right approach, though it ignores the "Table X" row being a separate row entirely. I recommend taking an explicit approach, perhaps something like this (and this is going to have a bit more detail just to make things clear; I would probably write it more tightly in production code):
# Assumes the file has been opened and that the filehandle is stored in $csv_fh.
# Get header information first.
my $hdr_data = {};
while( <$csv_fh> ) {
if( ! $hdr_data->{'table'} && /Table (\d+)/ ) {
$hdr_data->{'table'} = $1;
next;
}
if( ! $hdr_data->{'species'} && /species/ ) {
chomp; # strip the newline so the last heading is matched too
my $n = 0;
# Takes the column headers as they come, creating
# a map between the column name and column number.
# Assumes that column names are case-insensitively
# unique.
my %columns = map { lc($_) => $n++ } split( /\t/ );
# Now pick out exactly the columns we want.
foreach my $thingy ( qw{ species accession country } ) {
$hdr_data->{$thingy} = $columns{$thingy};
}
last;
}
}
# Now process the rest of the lines.
while( <$csv_fh> ) {
chomp;
my @col = split( /\t/ );
printf NEWFILE ">[species=%s] [molecule=DNA] [moltype=genomic] [country=%s] [spec-id=%s]\n",
$col[$hdr_data->{'species'}],
$col[$hdr_data->{'country'}],
$col[$hdr_data->{'accession'}];
}
Some variation of that will get you close to what you need.
Create a hash that maps column headings to column numbers:
my %columns;
...
if (/table|species|accession/i) {
chomp;
my @headings = split('\t');
my $col_num = 0;
foreach my $heading (@headings) {
$columns{"\L$heading"} = $col_num++;
}
}
Then you can use $csv[$columns{'species'}].
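Put together, the heading-to-column map can be sketched as follows. The header and data lines are made-up samples with the columns deliberately in a different order, and column_map is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: map lower-cased headings to column numbers.
sub column_map {
    my ($header_line) = @_;
    chomp $header_line;
    my $n = 0;
    my %columns = map { lc($_) => $n++ } split /\t/, $header_line;
    return \%columns;
}

# Made-up sample lines; the column order differs from the question's.
my $columns = column_map("Country\tAccession\tSpecies\n");
my @csv     = split /\t/, "Peru\tAB123\tHomo sapiens";

# A hash slice looks up all three column numbers at once.
printf ">[species=%s] [molecule=DNA] [moltype=genomic] [country=%s] [spec-id=%s]\n",
    @csv[ @{$columns}{qw(species country accession)} ];
```

Whatever order the columns arrive in, the output line is assembled by name rather than by position.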

Perl Array Dereference Problem with DBI::fetchall_arrayref

I'm a Perl newbie and am having issues with dereferencing an array that is a result of fetchall_arrayref in the DBI module:
my $sql = "SELECT DISTINCT home_room FROM $classlist";
my $sth = $dbh->prepare($sql);
$sth->execute;
my $teachers = $sth->fetchall_arrayref;
foreach my $teacher (@{$teachers}) {
print $teacher;
}
Running this will print the reference instead of the values in the array.
However, when I run:
my $arrref = [1,2,4,5];
foreach (@{$arrref}) {
print "$_\n";
}
I get the values of the array.
What am I doing wrong? Thank you for your help!
Jeff
From the doc
The fetchall_arrayref method can be
used to fetch all the data to be
returned from a prepared and executed
statement handle. It returns a
reference to an array that contains
one reference per row.
So in your example, $teacher is an ARRAY ref.
So you will need to loop through this array ref
foreach my $teacher (@{$teachers}) {
foreach my $titem (@$teacher) {
print $titem;
}
}
If you want to extract only the teacher column, you want to use:
my @teachers = @{$dbh->selectcol_arrayref($sql)};
fetchall_arrayref fetches all the results of the query, so what you're actually getting back is a reference to an array of arrays. Each row returned will be an arrayref of the columns. Since your query has only one column, you can say:
my $teachers = $sth->fetchall_arrayref;
foreach my $teacher (@{$teachers}) {
print $teacher->[0];
}
to get what you want.
See more:
Arrays of arrays in Perl.
You have a reference to an array of rows. Each row is a reference to an array of fields.
foreach my $teacher_row (@$teachers) {
my ($home_room) = @$teacher_row;
print $home_room;
}
You would have seen the difference with Data::Dumper.
use Data::Dumper;
print(Dumper($teachers));
print(Dumper($arrref));
$sth->fetchall_arrayref returns a reference to an array that contains one reference per row!
Take a look at DBI docs here.
Per the documentation of DBI's fetchall_arrayref():
The fetchall_arrayref method can be
used to fetch all the data to be
returned from a prepared and executed
statement handle. It returns a
reference to an array that contains
one reference per row.
You're one level of indirection away:
my $sql = "SELECT DISTINCT home_room FROM $classlist";
my $sth = $dbh->prepare($sql);
$sth->execute;
my $teachers = $sth->fetchall_arrayref;
foreach my $teacher (@{$teachers}) {
local $" = ', ';
print "@{$teacher}\n";
}
The data structure might be a little hard to visualize sometimes. When that happens I resort to Data::Dumper so that I can insert lines like this:
print Dumper $teacher;
I've found that sometimes by dumping the data structure I get an instant map to use as a reference point when creating code to manipulate the structure. I recently worked through a real nightmare of a structure just by using Dumper once in a while to straighten my head out.
You can use map to dereference the returned structure:
my @teachers = map { $_->[0] } @$teachers;
Now you have a simple array of teachers.
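To see the two levels side by side without a database, here is a sketch that uses a hard-coded stand-in for the fetchall_arrayref result. The room names are made up.

```perl
use strict;
use warnings;

# Stand-in for: my $teachers = $sth->fetchall_arrayref;
# one inner array reference per row, one element per column.
my $teachers = [ ['Room 1'], ['Room 2'], ['Room 3'] ];

# Printing an element of the outer array shows a reference...
my $as_printed = "$teachers->[0]";    # something like ARRAY(0x55d...)

# ...so dereference one more level to reach the column value.
my @home_rooms = map { $_->[0] } @$teachers;
print "$_\n" for @home_rooms;
```

The same pattern extends to queries with several columns: $_->[1] would be the second column of each row.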

Best way to prevent output of a duplicate item in Perl in realtime during a loop

I see a lot of 'related' questions showing up, but none I looked at answer this specific scenario.
During a while/for loop that parses a result set generated from a SQL select statement, what is the best way to prevent the next line from being outputted if the line before it contains the same field data (whether it be the 1st field or the xth field)?
For example, if two rows were:
('EML-E','jsmith@mail.com','John','Smith')
('EML-E','jsmith2@mail.com','John','Smith')
What is the best way to print only the first row based on the fact that 'EML-E' is the same in both rows?
Right now, I'm doing this:
Storing the first field (specific to my scenario) into a 2-element array (dupecatch[1])
Checking if dupecatch[0] = dupecatch[1] (duplicate - escape loop using 's')
After row is processed, set dupecatch[0] = dupecatch[1]
while ($DBS->SQLFetch() == *PLibdata::RET_OK)
{
$s=0; #s = 1 to escape out of inside loop
while ($i != $array_len and $s==0)
{
$rowfetch = $DBS->{Row}->GetCharValue($array_col[$i]);
if($i==0){$dupecatch[1] = $rowfetch;} #dupecatch prevents duplicate primary key field entries
if($dupecatch[0] ne $dupecatch[1])
{
dosomething($rowfetch);
}
else{$s++;}
$i++;
}
$i=0;
$dupecatch[0]=$dupecatch[1];
}
That is the standard way if you only care about duplicate items in a row, but $dupecatch[0] is normally named $old and $dupecatch[1] is normally just the variable in question. You can tell the array is not a good fit because you only ever refer to its indices.
If you want to avoid all duplicates you can use a %seen hash:
my %seen;
while (defined (my $row = get_data())) {
next if $seen{$row->[0]}++; #skip all but the first instance of the key
do_stuff();
}
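For comparison, the "previous value" approach with a plain $old scalar might look like the sketch below. It only suppresses consecutive duplicates, so it assumes the rows arrive sorted on the key (for example through ORDER BY). The rows are hard-coded stand-ins, and the third one is made up.

```perl
use strict;
use warnings;

# Stand-in rows, sorted on the first field as an ORDER BY would do.
my @rows = (
    ['EML-E', 'jsmith@mail.com',  'John', 'Smith'],
    ['EML-E', 'jsmith2@mail.com', 'John', 'Smith'],
    ['EML-F', 'adoe@mail.com',    'Ann',  'Doe'],
);

my $old;      # key of the previous row, undef before the first row
my @kept;
for my $row (@rows) {
    my $key = $row->[0];
    next if defined $old && $key eq $old;    # same key as the row before
    push @kept, $row;
    $old = $key;
}
print join(',', @$_), "\n" for @kept;
```

This uses constant memory, whereas %seen grows with the number of distinct keys but also catches duplicates that are not adjacent.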
I suggest using DISTINCT in your SQL statement. That's probably by far the easiest fix.