index is not visible to DBIx::Class - perl

I have this small piece of Perl code which adds a record to a table, but I am confused why DBIC is not able to see the primary key. I have not been able to find an answer anywhere. The table and column names were originally camelCase; I then changed them to snake_case, but it still won't run :(
$ ./test.pl
DBIx::Class::ResultSource::unique_constraint_columns(): Unknown unique constraint node_id on 'node' at ./test.pl line 80
code:
sub addNode
{
    my $node   = shift;
    my $lcNode = lc($node);
    my $id = $schema
        ->resultset('Node')
        ->find_or_create(
            { node_name => $lcNode },
            { key       => 'node_id' },
        );
    return $id;
}
table details:
mysql> desc node;
+------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-----------------------+------+-----+---------+----------------+
| node_id | mediumint(5) unsigned | NO | PRI | NULL | auto_increment |
| node_name | varchar(50) | NO | | NULL | |
| node_notes | varchar(1000) | YES | | NULL | |
+------------+-----------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
DBIx::Class Result class:
$ cat Node.pm
use utf8;
package Testdb::Schema::Result::Node;
# Created by DBIx::Class::Schema::Loader
# DO NOT MODIFY THE FIRST PART OF THIS FILE
use strict;
use warnings;
use base 'DBIx::Class::Core';
__PACKAGE__->table("node");
__PACKAGE__->add_columns(
    "node_id",
    {
        data_type         => "mediumint",
        extra             => { unsigned => 1 },
        is_auto_increment => 1,
        is_nullable       => 0,
    },
    "node_name",
    { data_type => "varchar", is_nullable => 0, size => 50 },
    "node_notes",
    { data_type => "varchar", is_nullable => 1, size => 1000 },
);
__PACKAGE__->set_primary_key("node_id");
# Created by DBIx::Class::Schema::Loader v0.07045 @ 2017-08-21 22:14:58
# DO NOT MODIFY THIS OR ANYTHING ABOVE! md5sum:bWXf98hpLJgNBU93aaRYkQ
# You can replace this text with custom code or comments, and it will be preserved on regeneration
1;

https://metacpan.org/pod/DBIx::Class::ResultSet#find:
To aid with preparing the correct query for the storage you may supply the key attribute, which is the name of a unique constraint (the unique constraint corresponding to the primary columns is always named primary).
(Emphasis mine.)
In other words, to use the primary key, you need to specify { key => 'primary' }. Any other key attribute is looked up as the name of an additional unique constraint.
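For illustration, a minimal sketch of a primary-key lookup (assuming the $schema handle from the question; the id value 42 is made up). Note that the column behind the named key must appear in the query:
# Look up by primary key: the key name is 'primary', not 'node_id',
# and the primary-key column itself must be part of the search.
my $row = $schema->resultset('Node')->find(
    { node_id => 42 },
    { key     => 'primary' },
);

# Equivalent shorthand: a bare value passed to find() uses the primary key.
my $same = $schema->resultset('Node')->find(42);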

You didn't make clear exactly how addNode should work. But if you want to look up existing nodes by node_name, you should simply remove the key attribute:
my $id = $schema->resultset('Node')->find_or_create(
    { node_name => $lcNode }
);
But read the caveat in the DBIC documentation:
If no such constraint is found, find currently defaults to a simple search->(\%column_values) which may or may not do what you expect. Note that this fallback behavior may be deprecated in further versions. If you need to search with arbitrary conditions - use "search". If the query resulting from this fallback produces more than one row, a warning to the effect is issued, though only the first row is constructed and returned as $result_object.
You should probably consider adding a unique constraint to node_name.
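If you do add one, a hedged sketch of how the Result class and the lookup might then look (the constraint name node_name_unique is my own invention):
# In Testdb::Schema::Result::Node, below the Schema::Loader-generated part:
__PACKAGE__->add_unique_constraint(node_name_unique => ['node_name']);

# find_or_create can then be pointed at that constraint explicitly:
my $row = $schema->resultset('Node')->find_or_create(
    { node_name => $lcNode },
    { key       => 'node_name_unique' },
);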

The real answer, which came from Henry on the DBIx::Class mailing list, is: whatever key you use, the column(s) it corresponds to must appear in the query. The answers above are not wrong, but they are ambiguous; they don't spell out that precise fact.

Related

How do I use the Class::DBI->sequence() method to fill 'id' field automatically in perl?

I'm following the example from Class::DBI.
I create the cd table like that in my MariaDB database:
CREATE TABLE cd (
cdid INTEGER PRIMARY KEY,
artist INTEGER, # references 'artist'
title VARCHAR(255),
year CHAR(4)
);
The primary key cdid is not set to auto-incremental. I want to use a sequence in MariaDB. So, I configured the sequence:
mysql> CREATE SEQUENCE cd_seq START WITH 100 INCREMENT BY 10;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT NEXTVAL(cd_seq);
+-----------------+
| NEXTVAL(cd_seq) |
+-----------------+
| 100 |
+-----------------+
1 row in set (0.00 sec)
And set-up the Music::CD class to use it:
Music::CD->columns(Primary => qw/cdid/);
Music::CD->sequence('cd_seq');
Music::CD->columns(Others => qw/artist title year/);
After that, I try these inserts:
# NORMAL INSERT
my $cd = Music::CD->insert({
    cdid   => 4,
    artist => 2,
    title  => 'October',
    year   => 1980,
});

# SEQUENCE INSERT
my $cd = Music::CD->insert({
    artist => 2,
    title  => 'October',
    year   => 1980,
});
The "normal insert" succeed, but the "sequence insert" give me this error:
DBD::mysql::st execute failed: You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB server version for the right syntax to use near ''cd_seq')' at line
1 [for Statement "SELECT NEXTVAL ('cd_seq')
"] at /usr/local/share/perl5/site_perl/DBIx/ContextualFetch.pm line 52.
I think the quotation marks ('') are provoking the error, because when I run the command "SELECT NEXTVAL (cd_seq)" (without quotation marks) in the mysql client it works (see above). I tried all the combinations (', ", `, no quotation), but still no luck.
Any idea?
My versions: perl 5.30.3, 10.5.4-MariaDB
The documentation for sequence() says this:
If you are using a database with AUTO_INCREMENT (e.g. MySQL) then you do not need this, and any call to insert() without a primary key specified will fill this in automagically.
MariaDB is based on MySQL. Therefore you do not need the call to sequence(). Use the AUTO_INCREMENT keyword in your table definition instead.
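A minimal sketch of that approach (table layout from the question; everything else stays as in the Class::DBI example):
# Redefine the table so the engine assigns the key itself:
#   CREATE TABLE cd (
#       cdid   INTEGER PRIMARY KEY AUTO_INCREMENT,
#       artist INTEGER,
#       title  VARCHAR(255),
#       year   CHAR(4)
#   );

# No sequence() call needed; insert() without cdid lets MariaDB fill it in.
my $cd = Music::CD->insert({
    artist => 2,
    title  => 'October',
    year   => 1980,
});
print $cd->cdid, "\n";   # the auto-generated primary key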

execute failed: Incorrect string value: '\xD6sterl...' with mariadb and perl DBD

I'm a novice Perl programmer trying to use DBI to write a buffer of text that contains an email with umlauts and other non-ASCII characters to a Joomla database, and I'm having a problem.
DBD::mysql::st execute failed: Incorrect string value: '\xD6sterl...' for column `lsv5webstage`.`xuxgc_content`.`fulltext` at row 1 at /home/alerts/scripts_linstage/AdvisoryTest.pm line 373.
I'm not familiar enough with how encoding works to fully understand what the problem is. This is a Fedora 29 system with mariadb-10.3.12 and joomla-3.9.
Apparently '\xD6' is an O with an umlaut ('Ö') in "Sebastian Österlund". I read something about MySQL's utf8 not being able to handle 4-byte characters, but I don't fully understand.
I found the following reference online which talks about changing the encoding type from utf8 to utf8mb4, but the tables all appear to already be using that encoding:
> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR
Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
I'm not sure it's helpful, but this is the insert statement I'm using in my perl code:
my $sql = <<EOF;
INSERT INTO xuxgc_content (title, alias, introtext, `fulltext`, state, catid, created, created_by, created_by_alias, modified, modified_by, checked_out, checked_out_time, publish_up, publish_down, images, urls, attribs, version, ordering, metakey, metadesc, metadata, access, hits, language)
VALUES ($title, "$title_alias", $introText, $fullText, $state, $catid, $created, $created_by, $created_by_alias, $modified, $modified_by, $checked_out, $checked_out_time, $publish_up, $publish_down, $images, $urls, $attribs, $version, $ordering, $metakey, $metadesc, $metadata, $access, $hits, $language);
EOF
my $sth = $dbh->prepare($sql);
$sth->execute();
db_disconnect($dbh);
The $fullText variable is populated from a buffer that contains the body of the email. I'm running it through quote() before performing the INSERT.
$fullText = $dbh->quote($fullText);
I also tried using "SET NAMES utf8mb4;INSERT INTO Mytable ...;" and it just didn't like the format.
Here's the full function that's used to connect to the database:
sub db_connect () {
    my %DB = (
        'host' => 'myhost',
        'db'   => 'mydb',
        'user' => 'myuser',
        'pass' => 'mypass',
    );
    return DBI->connect("DBI:mysql:database=$DB{'db'};host=$DB{'host'}",
        $DB{'user'}, $DB{'pass'}, { mysql_enable_utf8mb4 => 1 });
}
I don't recall having this problem in the past, and this script has been in use for quite a while.
D6 is hex for Ö in CHARACTER SET latin1 (and several others).
You have declared that your client uses UTF-8 (utf8mb4) encoding, so the server rejected that latin1 byte.
Please provide SELECT HEX(col), col ... output so we can see whether D6 actually got into the database (an insert problem) or something else did (possibly a fetch/display problem).
Also, if your $fullText string is not quoted, you are likely to get all sorts of syntax errors.
Please don't blindly interpolate strings into INSERT statements; escape them as you put them in.
There may be some useful Perl hint in this:
use utf8;
use open ':std', ':encoding(UTF-8)';
my $dbh = DBI->connect("dbi:mysql:" . $dsn, $user, $password, {
    PrintError        => 0,
    RaiseError        => 1,
    mysql_enable_utf8 => 1,   # Switch to UTF-8 for communication and decode.
});
# or {mysql_enable_utf8mb4 => 1} if using utf8mb4
And look for techniques for binding/quoting/escaping.
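For reference, a minimal sketch of the bind-parameter style that avoids manual quoting entirely (column list shortened from the question's INSERT; $dbh and the variables are assumed to come from the question's code):
# Placeholders: DBD::mysql quotes and escapes each value itself, and with
# mysql_enable_utf8mb4 => 1 Perl strings are sent to the server as utf8mb4.
my $sth = $dbh->prepare(q{
    INSERT INTO xuxgc_content (title, alias, `fulltext`)
    VALUES (?, ?, ?)
});
$sth->execute($title, $title_alias, $fullText);   # no $dbh->quote() needed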

Cannot insert NULL as primary key value in PostgreSQL for Flask-SQLAlchemy after switching from SQLite

I recently started porting an SQLite database over to PostgreSQL for a Flask site built with SQLAlchemy. I have my schemas in PostgreSQL and even inserted the data into the database. However, I am unable to run my usual INSERT commands to add information to the database. Normally, I insert new records using SQLAlchemy by leaving the ID column NULL and just setting the other columns. However, that results in the following error:
sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) null value in column "id" violates not-null constraint
DETAIL: Failing row contains (null, 2017-07-24 20:40:37.787393+00, 2017-07-24 20:40:37.787393+00, episode_length_list = [52, 51, 49, 50, 83]
sum_length = 0
for ..., 0, f, 101, 1, 0, 0, , null).
[SQL: 'INSERT INTO submission (date_created, date_modified, code, status, correct, assignment_id, course_id, user_id, assignment_version, version, url) VALUES (CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, %(code)s, %(status)s, %(correct)s, %(assignment_id)s, %(course_id)s, %(user_id)s, %(assignment_version)s, %(version)s, %(url)s) RETURNING submission.id'] [parameters: {'code': 'episode_length_list = [52, 51, 49, 50, 83]\n\nsum_length = 0\n\nfor episode_length in episode_length_list:\n pass\n\nsum_length = sum_length + episode_length\n\nprint(sum_length)\n', 'status': 0, 'correct': False, 'assignment_id': 101, 'course_id': None, 'user_id': 1, 'assignment_version': 0, 'version': 0, 'url': ''}]
Here is my SQL Alchemy table declarations:
class Base(Model):
    __abstract__ = True

    @declared_attr
    def __tablename__(cls):
        return cls.__name__.lower()

    def __repr__(self):
        return str(self)

    id = Column(Integer(), primary_key=True)
    date_created = Column(DateTime, default=func.current_timestamp())
    date_modified = Column(DateTime, default=func.current_timestamp(),
                           onupdate=func.current_timestamp())

class Submission(Base):
    code = Column(Text(), default="")
    status = Column(Integer(), default=0)
    correct = Column(Boolean(), default=False)
    assignment_id = Column(Integer(), ForeignKey('assignment.id'))
    course_id = Column(Integer(), ForeignKey('course.id'))
    user_id = Column(Integer(), ForeignKey('user.id'))
    assignment_version = Column(Integer(), default=0)
    version = Column(Integer(), default=0)
    url = Column(Text(), default="")
I created the schema by calling db.create_all() in a script.
Checking the PostgreSQL side, we can see the constructed table:
Table "public.submission"
Column | Type | Modifiers | Storage | Stats target | Description
--------------------+--------------------------+-----------+----------+--------------+-------------
id | bigint | not null | plain | |
date_created | timestamp with time zone | | plain | |
date_modified | timestamp with time zone | | plain | |
code | text | | extended | |
status | bigint | | plain | |
correct | boolean | | plain | |
assignment_id | bigint | | plain | |
user_id | bigint | | plain | |
assignment_version | bigint | | plain | |
version | bigint | | plain | |
url | text | | extended | |
course_id | bigint | | plain | |
Indexes:
"idx_16881_submission_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"submission_course_id_fkey" FOREIGN KEY (course_id) REFERENCES course(id)
"submission_user_id_fkey" FOREIGN KEY (user_id) REFERENCES "user"(id)
Has OIDs: no
I'm still new to this, but shouldn't there be a sequence?
Any insight or suggestions on what to look for next would be super appreciated.
It is standard SQL that a PRIMARY KEY is UNIQUE and NOT NULL. PostgreSQL enforces the standard and does not allow even a single NULL in a primary-key column. Some other databases allow one NULL there, hence the different behaviour.
PostgreSQL current documentation on Primary Keys clearly states it:
5.3.4. Primary Keys
A primary key constraint indicates that a column, or group of columns, can be used as a unique identifier for rows in the table. This requires that the values be both unique and not null.
If you want your PRIMARY KEY to be a synthetic (i.e.: not natural) sequence number, you should define it with type BIGSERIAL instead of BIGINT. I don't know the details on how this is achieved using SQLAlchemy, but look at the references.
When you then INSERT into your table, the id should NOT be in the INSERT column list (it should not be set to NULL; it should simply not be there). I.e., this will work and generate a new id:
INSERT INTO public.submission (code) VALUES ('Some code') ;
This won't:
INSERT INTO public.submission (id, code) VALUES (NULL, 'Some code') ;
I guess SQLAlchemy should be smart enough to generate the proper SQL INSERT statements, once properly configured.
Reference:
Why isn't SQLAlchemy creating serial columns?
Ultimately, I discovered what went wrong, and it was definitely my fault. The process I used to load the old data into the database (pgloader) was doing more than just loading data - it was somehow overwriting parts of the table definitions! I was able to pg_dump the data out, reset the tables, and then load it back in - everything works as expected. Thanks for sanity checks!

Build tree-like hash (YAML) from a complex database extraction (SQL)?

Introduction
Considering the tables given at the end of this question, I would like an algorithm or a simple solution that returns a nested tree from a YAML description. Using the YAML format is optional. In fact, the output I need is an array of ordered hashes that may or may not contain nested ordered hashes or arrays of ordered hashes.
In short, I am talking about a tree-like structure.
For a better understanding of my question I will work through a simple example that covers all my needs; it is in fact the example I am using to implement this algorithm.
I decided to ask this question in parallel with my own investigations, as my knowledge of Perl is limited. I don't want to dig down the wrong tunnel, and that's why I am asking for help.
I am currently focussing on the DBI module. I tried to look at other modules such as DBIx::Tree::NestedSet, but I don't think it is what I need.
So, let's get down to the details of my example.
Example
The initial idea is to write a Perl program that takes a YAML description and outputs the extracted data.
This input description follows simple rules:
query is what data we are looking for. It can contain the following keys:
    sql is the SQL query.
    hide hides columns from the final output. This field is used when a column is required only in a subquery but is not wanted in the end.
    subquery is a nested query executed for each row of the parent query.
    bind binds column values to the sql query.
    hash tells the program to group the results not in an array of hashes but in a hash of hashes. Actually this could be given directly to DBI::selectall_hashref. If this field is omitted, the output is listed as an array of ordered hashes.
    key is the name of the key listed at the same level as the parent's result. We will see later that a key name can mask a result column.
    list tells the program to list the result in an array. Notice that only one column can be displayed, i.e. list: name displays the list of names.
connect is the DBI connection string.
format is the output format. It can be either XML, YAML or JSON. I primarily focus on the YAML format as it can be easily translated. When omitted, the default output is YAML.
indent is how many spaces make up one indentation level. The tabs or tab value is also supported.
In addition, we know that in Perl hashes are not ordered. Here the order of the output keys is important: they should appear in the same order as the columns of the SQL query. For now I simply use the YAML module :(
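As a side note, a minimal sketch of how Tie::IxHash (which the code further down already uses) keeps key order:
use Tie::IxHash;

tie my %row, 'Tie::IxHash';          # keys keep their insertion order
$row{nationality} = 'Norvegian';
$row{smoke}       = 'Dunhill';
print join(', ', keys %row), "\n";   # always "nationality, smoke"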
In summary, in the end we will just execute this command:
$ cat desc.yml | ./fetch > data.yml
The desc.yml description is given below:
---
connect: "dbi:SQLite:dbname=einstein-puzzle.sqlite"
indent: 4
query:
  - sql: SELECT * from people
    hide:
      - pet_id
      - house_id
      - id
    subquery:
      - key: brevage
        bind: id
        sql: |
          SELECT name, calories, potassium FROM drink
          LEFT JOIN people_has_drink ON drink.id = people_has_drink.id_drink
          WHERE people_has_drink.id_people = ?
        hash:
          - name
      - key: house
        sql: SELECT color as paint, size, id from house WHERE id = ?
        hide: id
        bind: house_id
        subquery:
          - key: color
            sql: SELECT name, ral, hex from color WHERE short LIKE ?
            bind: paint
      - key: pet
        sql: SELECT name from pet WHERE id = ?
        bind: pet_id
        list: name
Expected Output
From the description above, the output data would be this:
---
- nationality: Norvegian
  smoke: Dunhill
  brevage:
    orange juice:
      calories: 45
      potassium: 200 mg
    water:
      calories: 0
      potassium: 3 mg
  house:
    color:
      name: Zinc yellow
      ral: RAL 1018
      hex: '#F8F32B'
    paint: yellow
    size: small
  pet:
    - cats
- nationality: Brit
  smoke: Pall Mall
  brevage:
    milk:
      calories: 42
      potassium: 150 mg
  house:
    color:
      name: Vermilion
      ral: RAL 2002
      hex: '#CB2821'
    paint: red
    size: big
  pet:
    - birds
    - phasmatodea
Where I am
I have not yet fully implemented the nested queries. My current state is given here:
#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;
use DBI;
use YAML;
use Data::Dumper;
use Tie::IxHash;

# Read configuration and database connection
my %yml = %{ Load(do { local $/; <DATA> }) };
my $dbh = DBI->connect($yml{connect});

# Fill the bind values of the first query with command-line information
my %bind;
for (@ARGV) {
    next unless /--(\w+)=(.*)/;
    $bind{$1} = $2;
}
my $q0 = $yml{query}[0];
if ($q0->{bind} and keys %bind > 0) {
    $q0->{bind_values} = arrayref($q0->{bind});
    $q0->{bind_values}[$_] = $bind{$q0->{bind}[$_]} foreach (0 .. @{$q0->{bind}} - 1);
}

# Fetch all data from the database recursively
my $out = fetch($q0);

sub fetch {
    # As long as we have a query, process it
    my $query = shift;
    return undef unless $query;
    $query->{bind_values} = [] unless ref $query->{bind_values} eq 'ARRAY';

    # Execute SQL query
    my $sth = $dbh->prepare($query->{sql});
    $sth->execute(@{$query->{bind_values}});
    my @columns = @{$sth->{NAME}};

    # Fetch all the current level's data and preserve column order
    my @return;
    for my $row (@{$sth->fetchall_arrayref()}) {
        my %data;
        tie %data, 'Tie::IxHash';
        $data{$columns[$_]} = $row->[$_] for (0 .. $#columns);
        for my $subquery (@{ $query->{subquery} }) {
            my @bind;
            push @bind, $data{$_} for (@{ arrayref($subquery->{bind}) });
            $subquery->{bind_values} = \@bind;
            my $sub = fetch($subquery);

            # Present output as a list
            if ($subquery->{list}) {
                #if ( map ( $query->{list} eq $_ , keys $sub ) )
                my @list;
                for (@$sub) {
                    push @list, $_->{$subquery->{list}};
                }
                $sub = \@list;
            }
            if ($subquery->{key}) {
                $data{$subquery->{key}} = $sub;
            } else {
                die "[Error] Key is missing for query '$subquery->{sql}'";
            }
        }

        # Remove unwanted columns from the output
        if ($query->{hide}) {
            delete $data{$_} for ( @{ arrayref($query->{hide}) } );
        }
        push @return, \%data;
    }
    \@return;
}

DumpYaml($out);

sub arrayref {
    my $ref = shift;
    return (ref $ref ne 'ARRAY') ? [$ref] : $ref;
}

sub DumpYaml {
    # I am not happy with this current dumper. I cannot specify the indent and it does
    # not preserve the extraction order
    print Dump shift;
}

__DATA__
---
connect: "dbi:SQLite:dbname=einstein-puzzle.sqlite"
indent: 4
query:
  - sql: SELECT * from people
    hide:
      - pet_id
      - house_id
      - id
    subquery:
      - key: brevage
        bind: id
        sql: |
          SELECT name, calories, potassium FROM drink
          LEFT JOIN people_has_drink ON drink.id = people_has_drink.id_drink
          WHERE people_has_drink.id_people = ?
        hash:
          - name
      - key: house
        sql: SELECT color as paint, size, id from house WHERE id = ?
        hide: id
        bind: house_id
        subquery:
          - key: color
            sql: SELECT short, ral, hex from color WHERE short LIKE ?
            bind: paint
      - key: pet
        sql: SELECT name from pet WHERE id = ?
        bind: pet_id
        list: name
And this is what output I get:
---
- brevage:
    - calories: 0
      name: water
      potassium: 3 mg
    - calories: 45
      name: orange juice
      potassium: 200 mg
  house:
    - color:
        - hex: '#F8F32B'
          ral: RAL 1018
          short: yellow
      paint: yellow
      size: small
  nationality: Norvegian
  pet:
    - cats
  smoke: Dunhill
- brevage:
    - calories: 42
      name: milk
      potassium: 150 mg
  house:
    - color:
        - hex: '#CB2821'
          ral: RAL 2002
          short: red
      paint: red
      size: big
  nationality: Brit
  pet:
    - birds
    - phasmatodea
  smoke: Pall Mall
Database
My test database is an SQLite db whose tables are listed below:
Table People
.----+-------------+----------+--------+-----------.
| id | nationality | house_id | pet_id | smoke |
+----+-------------+----------+--------+-----------+
| 1 | Norvegian | 4 | 3 | Dunhill |
| 2 | Brit | 1 | 2 | Pall Mall |
'----+-------------+----------+--------+-----------'
Table Drink
.----+--------------+----------+-----------.
| id | name | calories | potassium |
+----+--------------+----------+-----------+
| 1 | tea | 1 | 18 mg |
| 2 | coffee | 0 | 49 mg |
| 3 | milk | 42 | 150 mg |
| 4 | beer | 43 | 27 mg |
| 5 | water | 0 | 3 mg |
| 6 | orange juice | 45 | 200 mg |
'----+--------------+----------+-----------'
Table People Has Drink
.-----------+----------.
| id_people | id_drink |
+-----------+----------+
| 1 | 5 |
| 1 | 6 |
| 2 | 3 |
'-----------+----------'
Table House
+----+--------+--------+
| id | color | size |
+----+--------+--------+
| 1 | red | big |
| 2 | green | small |
| 3 | white | middle |
| 4 | yellow | small |
| 5 | blue | huge |
+----+--------+--------+
Table Color
.--------+-------------+----------+---------.
| short | color | ral | hex |
+--------+-------------+----------+---------+
| red | Vermilion | RAL 2002 | #CB2821 |
| green | Pale green | RAL 6021 | #89AC76 |
| white | Light grey | RAL 7035 | #D7D7D7 |
| yellow | Zinc yellow | RAL 1018 | #F8F32B |
| blue | Capri blue | RAL 5019 | #1B5583 |
'--------+-------------+----------+---------'
Table Pet
+----+-------------+
| id | name |
+----+-------------+
| 1 | dogs |
| 2 | birds |
| 3 | cats |
| 4 | horses |
| 5 | fishes |
| 2 | phasmatodea |
+----+-------------+
Database data
If you wish to use the same data as mine, the dump below gives you everything you need:
BEGIN TRANSACTION;
CREATE TABLE "pet" (
    `id`   INTEGER,
    `name` TEXT
);
INSERT INTO `pet` VALUES (1,'dogs');
INSERT INTO `pet` VALUES (2,'birds');
INSERT INTO `pet` VALUES (3,'cats');
INSERT INTO `pet` VALUES (4,'horses');
INSERT INTO `pet` VALUES (5,'fishes');
INSERT INTO `pet` VALUES (2,'phasmatodea');
CREATE TABLE `people_has_drink` (
    `id_people` INTEGER NOT NULL,
    `id_drink`  INTEGER NOT NULL,
    PRIMARY KEY(id_people,id_drink)
);
INSERT INTO `people_has_drink` VALUES (1,5);
INSERT INTO `people_has_drink` VALUES (1,6);
INSERT INTO `people_has_drink` VALUES (2,3);
CREATE TABLE "people" (
    `id`          INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `nationality` VARCHAR(45),
    `house_id`    INT,
    `pet_id`      INT,
    `smoke`       VARCHAR(45)
);
INSERT INTO `people` VALUES (1,'Norvegian',4,3,'Dunhill');
INSERT INTO `people` VALUES (2,'Brit',1,2,'Pall Mall');
CREATE TABLE "house" (
    `id`    INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `color` TEXT,
    `size`  TEXT
);
INSERT INTO `house` VALUES (1,'red','big');
INSERT INTO `house` VALUES (2,'green','small');
INSERT INTO `house` VALUES (3,'white','middle');
INSERT INTO `house` VALUES (4,'yellow','small');
INSERT INTO `house` VALUES (5,'blue','huge');
CREATE TABLE `drink` (
    `id`        INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    `name`      TEXT,
    `calories`  INTEGER,
    `potassium` TEXT
);
INSERT INTO `drink` VALUES (1,'tea',1,'18 mg');
INSERT INTO `drink` VALUES (2,'coffee',0,'49 mg');
INSERT INTO `drink` VALUES (3,'milk',42,'150 mg');
INSERT INTO `drink` VALUES (4,'beer',43,'27 mg');
INSERT INTO `drink` VALUES (5,'water',0,'3 mg');
INSERT INTO `drink` VALUES (6,'orange juice',45,'200 mg');
CREATE TABLE `color` (
    `short` TEXT UNIQUE,
    `color` TEXT,
    `ral`   TEXT,
    `hex`   TEXT,
    PRIMARY KEY(short)
);
INSERT INTO `color` VALUES ('red','Vermilion','RAL 2002','#CB2821');
INSERT INTO `color` VALUES ('green','Pale green','RAL 6021','#89AC76');
INSERT INTO `color` VALUES ('white','Light grey','RAL 7035','#D7D7D7');
INSERT INTO `color` VALUES ('yellow','Zinc yellow','RAL 1018','#F8F32B');
INSERT INTO `color` VALUES ('blue','Capri blue','RAL 5019','#1B5583');
COMMIT;
Is my implementation good
This is a rather broad question, and the answer probably depends on what you want from your code. For instance:
Does it work? Does it have all the features you need? Does it do what you want? Does it respond appropriately for all the ranges of inputs you want to cater for (and input you don't)? If you aren't sure, write some tests.
Is it fast enough? If not, what are the slow bits? Use Devel::NYTProf to find them (see the sketch just below).
If it's working, you probably also want to turn your code into a module rather than just a script so you can use it again.
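On the profiling point, a typical Devel::NYTProf session might look like this (using the fetch script name from the question's pipeline):
$ perl -d:NYTProf fetch < desc.yml > data.yml   # profile one run
$ nytprofhtml --open                            # build and open the HTML report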
and if not (I'm supposing that I am doing it all wrong), what modules should I use to get the desired behavior?
It sounds very much like you're trying to do something like DBIx::Class (aka DBIC) does when you ask it to prefetch; it will build you a data structure of objects.
If you need to do this dynamically in response to arbitrary databases and YAML, that's not quite what DBIC was designed to do; it's probably possible but will probably involve you dynamically creating packages, which will not be easy.
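For a flavour of what DBIC gives you, a hedged sketch of prefetch against hypothetical Result classes mirroring this schema (People belongs_to house, House belongs_to color; none of these class or relationship names come from the question):
# One JOINed query; DBIC inflates the nested row objects from it.
my @people = $schema->resultset('People')->search(
    {},
    { prefetch => { house => 'color' } },
)->all;

for my $person (@people) {
    print $person->nationality, " lives in a ", $person->house->size, " house\n";
}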

Interpreting Perl DBI MySQL column_info()

I'm trying to write Perl code that accepts bindings for a SQL INSERT statement, and identifies problems that might cause the INSERT to be rejected for bad data, and fixes them. To do this, I need to get and interpret column metadata.
The $dbh->column_info method returns the info in a coded form. I've pored through the official CPAN documentation but am still confused.
my $sth_column_info
    = $dbh->column_info( $catalog, $schema, $table, undef );
my $columns_aoh_ref = $sth_column_info->fetchall_arrayref(
    {   COLUMN_NAME      => 1,
        DATA_TYPE        => 1,
        NULLABLE         => 1,
        ORDINAL_POSITION => 1,
        COLUMN_DEF       => 1,
    }
);
say $table;
for my $href (@$columns_aoh_ref) {
    my @list;
    while ( my ( $k, $v ) = each %$href ) {
        push @list, "$k=" . ( $v // 'undef' );
    }
    say join '|', @list;
}
Output is:
dw_phone
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=4|ORDINAL_POSITION=1|COLUMN_NAME=phone_id
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=4|ORDINAL_POSITION=2|COLUMN_NAME=phone_no
NULLABLE=1|COLUMN_DEF=undef|DATA_TYPE=4|ORDINAL_POSITION=3|COLUMN_NAME=phone_ext
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=1|ORDINAL_POSITION=4|COLUMN_NAME=phone_type
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=1|ORDINAL_POSITION=5|COLUMN_NAME=phone_location
NULLABLE=1|COLUMN_DEF=undef|DATA_TYPE=1|ORDINAL_POSITION=6|COLUMN_NAME=phone_status
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=11|ORDINAL_POSITION=7|COLUMN_NAME=insert_date
NULLABLE=0|COLUMN_DEF=undef|DATA_TYPE=11|ORDINAL_POSITION=8|COLUMN_NAME=update_date
Where - for example - does one find a mapping of data type codes to strings? Should I be using DATA_TYPE, TYPE_NAME, or SQL_DATA_TYPE? Should I be using NULLABLE or IS_NULLABLE, and why the two flavors?
I can appreciate the difficulty of documenting (let alone implementing) a universal interface for databases. But I wonder if anyone knows of a reference manual for using the DBI that's specific to MySQL?
UPDATE 1:
Tried to shed more light by retrieving all info using an array rather than a hash:
my $sth_column_info
    = $dbh->column_info( $catalog, $schema, $table, undef );
my $aoa_ref = $sth_column_info->fetchall_arrayref;   # <- chg. to arrayref, no parms
say $table;
for my $aref (@$aoa_ref) {
    my @list = map $_ // 'undef', @$aref;
    say join '|', @list;
}
Now I can see lots of potentially useful information mixed in there.
dw_contact_source
undef|dwcust1|dw_contact_source|contact_id|4|BIGINT|20|undef|undef|10|0|undef|undef|4|undef|undef|1|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|1|bigint(20)|undef|0
undef|dwcust1|dw_contact_source|company_id|4|SMALLINT|6|undef|undef|10|0|undef|undef|4|undef|undef|2|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|1|smallint(6)|undef|0
undef|dwcust1|dw_contact_source|contact_type_id|4|TINYINT|4|undef|undef|10|0|undef|undef|4|undef|undef|3|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef||tinyint(4)|undef|0
undef|dwcust1|dw_contact_source|insert_date|11|DATETIME|19|undef|0|undef|0|undef|undef|9|-79|undef|4|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef||datetime|undef|0
undef|dwcust1|dw_contact_source|update_date|11|DATETIME|19|undef|0|undef|0|undef|undef|9|-79|undef|5|NO|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef|undef||datetime|undef|0
So my question would be:
How do I get the corresponding names/descriptions of these metadata?
How do I fetchall_arrayref just what I need, using symbols rather than integers? (I tried fetchall_arrayref([qw/COLUMN_NAME DATA_TYPE/]) and got back all undefs; now I'm just flailing about guessing.)
UPDATE 2:
Now I'm digging around in DBD::mysql.pm and I found a very interesting array:
my #names = qw(
TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME
DATA_TYPE TYPE_NAME COLUMN_SIZE BUFFER_LENGTH DECIMAL_DIGITS
NUM_PREC_RADIX NULLABLE REMARKS COLUMN_DEF
SQL_DATA_TYPE SQL_DATETIME_SUB CHAR_OCTET_LENGTH
ORDINAL_POSITION IS_NULLABLE CHAR_SET_CAT
CHAR_SET_SCHEM CHAR_SET_NAME COLLATION_CAT COLLATION_SCHEM COLLATION_NAME
UDT_CAT UDT_SCHEM UDT_NAME DOMAIN_CAT DOMAIN_SCHEM DOMAIN_NAME
SCOPE_CAT SCOPE_SCHEM SCOPE_NAME MAX_CARDINALITY
DTD_IDENTIFIER IS_SELF_REF
mysql_is_pri_key mysql_type_name mysql_values
mysql_is_auto_increment
);
These correspond precisely to what is returned by fetchall_arrayref. Now I can see I have four choices for learning data type, so let's see if any of the codes are documented.
UPDATE 3:
DBI Recipes is a very nice adjunct to CPAN DBI documentation about retrieving info back into Perl (Especially the {select|fetch}{row|all}_{hash|array} methods.)
This will help you determine the values for the data types. I normally use DATA_TYPE to decide how to handle a column based on its type.
Look up the MySQL type in the DBD::mysql mapping below to get its ANSI family, then find that name in the DBI table below to get the DATA_TYPE value. Example: a BIGINT is an INTEGER type, which matches SQL_INTEGER, so the DATA_TYPE value is 4.
DBD::MySQL
### ANSI datatype mapping to mSQL datatypes
%DBD::mysql::db::ANSI2db = (
    "CHAR"          => "CHAR",
    "VARCHAR"       => "CHAR",
    "LONGVARCHAR"   => "CHAR",
    "NUMERIC"       => "INTEGER",
    "DECIMAL"       => "INTEGER",
    "BIT"           => "INTEGER",
    "TINYINT"       => "INTEGER",
    "SMALLINT"      => "INTEGER",
    "INTEGER"       => "INTEGER",
    "BIGINT"        => "INTEGER",
    "REAL"          => "REAL",
    "FLOAT"         => "REAL",
    "DOUBLE"        => "REAL",
    "BINARY"        => "CHAR",
    "VARBINARY"     => "CHAR",
    "LONGVARBINARY" => "CHAR",
    "DATE"          => "CHAR",
    "TIME"          => "CHAR",
    "TIMESTAMP"     => "CHAR",
);
DBI.pm
TYPE
The TYPE attribute contains a reference to an array of integer values representing the international standard values for the respective datatypes. The array of integers has a length equal to the number of columns selected within the original statement, and can be referenced in a similar way to the NAME attribute example shown earlier.
The standard values for common types are:
SQL_CHAR 1
SQL_NUMERIC 2
SQL_DECIMAL 3
SQL_INTEGER 4
SQL_SMALLINT 5
SQL_FLOAT 6
SQL_REAL 7
SQL_DOUBLE 8
SQL_DATE 9
SQL_TIME 10
SQL_TIMESTAMP 11
SQL_VARCHAR 12
SQL_LONGVARCHAR -1
SQL_BINARY -2
SQL_VARBINARY -3
SQL_LONGVARBINARY -4
SQL_BIGINT -5
SQL_TINYINT -6
SQL_BIT -7
SQL_WCHAR -8
SQL_WVARCHAR -9
SQL_WLONGVARCHAR -10
While these numbers are fairly standard, the way drivers map their native types to these standard types varies greatly. Native types that don't correspond well to one of these types may be mapped into the range officially reserved for use by the Perl DBI: -9999 to -9000.
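As a cross-check, DBI itself exports these constants and can ask the driver how it describes a given code; a minimal sketch (the DSN and credentials are placeholders):
use v5.10;
use strict;
use warnings;
use DBI qw(:sql_types);            # exports SQL_INTEGER, SQL_VARCHAR, ...

say SQL_INTEGER;                   # prints 4, matching the table above

my $dbh  = DBI->connect('DBI:mysql:database=test', 'user', 'pass');
my $info = $dbh->type_info(SQL_INTEGER);
say $info->{TYPE_NAME} if $info;   # the driver's name for that standard type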