Does SELECT DISTINCT work with Perl's DBD::CSV? - perl

I found a SELECT example on the web.
When I try it in my script, I get this error message:
Specifying DISTINCT when using aggregate functions isn't reasonable - ignored. at /usr/lib/perl5/site_perl/5.10.0/SQL/Parser.pm line 496.
#!/usr/bin/perl
use warnings;
use strict;
use DBI;
my $dbh = DBI->connect( "DBI:CSV:", undef, undef, { RaiseError => 1, AutoCommit => 1 } );
my $table = 'artikel';
my $array_ref = [ [ 'a_nr', 'a_name', 'a_preis' ],
[ 12, 'Oberhemd', 39.80, ],
[ 22, 'Mantel', 360.00, ],
[ 11, 'Oberhemd', 44.20, ],
[ 13, 'Hose', 119.50, ],
];
$dbh->do( "CREATE TEMP TABLE $table AS IMPORT(?)", {}, $array_ref );
my $sth = $dbh->prepare( "SELECT DISTINCT a_name FROM $table" );
$sth->execute();
$sth->dump_results();
$dbh->disconnect();
Does SELECT DISTINCT not work with DBD::CSV or is something wrong with my script?
edit:
The output is
'Oberhemd'
'Mantel'
'Oberhemd'
'Hose'
4 rows
I thought it should be
'Oberhemd'
'Mantel'
'Hose'
3 rows
Installed versions:
Perl : 5.010000 (x86_64-linux-thread-multi)
OS : linux (2.6.31)
DBI : 1.609
DBD::Sponge : 12.010002
DBD::SQLite : 1.25
DBD::Proxy : 0.2004
DBD::Gofer : 0.011565
DBD::File : 0.37
DBD::ExampleP : 12.010007
DBD::DBM : 0.03
DBD::CSV : 0.26

Hi. This is an easily reproducible bug. SELECT data_display_mask FROM test.csv returns 200-plus rows. SELECT DISTINCT data_display_mask FROM test.csv returns the warning message and the same 200 rows.
If I extract the column with awk and run it through sort -u to get the unique values, I get 36 values, which is what I would expect.
This is certainly a bug in the code.
-Kanwar
perl -V
Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
Platform:
osname=linux, osvers=2.2.24-6.2.3, archname=i686-linux-thread-multi
DBD::CSV 0.26
SQL::Parser 1.23
DBI 1.609
example:
Specifying DISTINCT when using aggregate functions isn't reasonable - ignored. at /opt/perl2exe/perl5/lib/site_perl/5.10.0/SQL/Parser.pm line 496.
87060
87060
87060
87060
SQL used is SELECT DISTINCT entry_id FROM test.csv

Note that the message about something not being reasonable is only a warning, so your script runs nevertheless. It is also confusing and nonsensical: you don't use any aggregate functions.
I smell a bug in either DBD::CSV or SQL::Statement.
Edit: DISTINCT is explicitly allowed in SQL::Statement

my $sth = $dbh->prepare("SELECT DISTINCT $attributeName1, COUNT( $attributeName2) FROM tableName GROUP BY $attributeName1, $attributeName2");
This gave me attributeName1 and a distinct count of attributeName2.

This is an example of a more general phenomenon with DBD::CSV: it accepts a lot of SQL syntax whose meaning is then silently ignored.
I have seen cases of SELECT DISTINCT that actually filter out duplicates, so the case you mention here seems to be a bug,
but I haven't found a way to make the DISTINCT in SELECT COUNT(DISTINCT foo) FROM bar do anything.

Works for me. I get 3 rows back.
$ perl x.pl
'Oberhemd'
'Mantel'
'Hose'
3 rows
perl -MDBI -le 'DBI->installed_versions;'
Perl : 5.010001 (i686-linux-gnu-thread-multi)
OS : linux (2.6.24-28-server)
DBI : 1.617
DBD::mysql : 4.020
DBD::Sys : 0.102
DBD::Sponge : 12.010002
DBD::SQLite : 1.33
DBD::Proxy : 0.2004
DBD::Pg : 2.17.2
DBD::Oracle : 1.38
DBD::ODBC : 1.33
DBD::Multiplex : 2.014122
DBD::Gofer : 0.015057
DBD::File : 0.40
DBD::ExampleP : 12.014310
DBD::DBM : 0.06
DBD::CSV : 0.30
Added:
perl -MSQL::Statement -le 'print $SQL::Statement::VERSION'
1.31
Version 1.23, release November 20th, 2009
* Correct handling of DISTINCT in aggregate functions

I ran into the same problem.
You can work around it by using a GROUP BY clause instead of DISTINCT.
This is only a workaround until the bug is resolved.
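The workaround relies on GROUP BY returning one row per distinct value of the grouped column. A minimal sketch of that equivalence, using the artikel data from the question. It uses Python's built-in sqlite3 module as a stand-in SQL engine, which is an assumption made purely for illustration: it is not DBD::CSV, and whether a given SQL::Statement version handles GROUP BY this way still has to be checked there.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (assumption: not DBD::CSV).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artikel (a_nr INTEGER, a_name TEXT, a_preis REAL)")
conn.executemany(
    "INSERT INTO artikel VALUES (?, ?, ?)",
    [(12, "Oberhemd", 39.80), (22, "Mantel", 360.00),
     (11, "Oberhemd", 44.20), (13, "Hose", 119.50)],
)

# What the question expected from DISTINCT.
distinct_names = [row[0] for row in conn.execute(
    "SELECT DISTINCT a_name FROM artikel ORDER BY a_name")]

# The workaround: group by the same column instead.
grouped_names = [row[0] for row in conn.execute(
    "SELECT a_name FROM artikel GROUP BY a_name ORDER BY a_name")]

print(distinct_names)
print(grouped_names)
```

Both queries yield the three names once each; the ORDER BY is only there to make the two lists directly comparable.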

Related

jsonb with psycopg2 RealDictCursor

I have a postgresql 9.4 (aka mongodb killer ;-) ) and this simple schema :
CREATE TABLE test (id SERIAL, name text, misc jsonb);
Now I populate it; a select shows something like
id | name | misc
1 | user1 | { "age" : 23, "size" : "M" }
2 | user2 | { "age" : 30, "size" : "XL" }
Now, if I run a query with psycopg2,
cur.execute("SELECT * FROM test;")
rows = list(cur)
I end up with
[ { 'id' : 1, 'name' : 'user1', 'misc' : '{ "age" : 23, "size" : "M" }' },
{ 'id' : 2, 'name' : 'user2', 'misc' : '{ "age" : 30, "size" : "XL" }' }]
What's wrong, you ask? misc is of type str. I would expect it to be recognized as JSON and converted to a Python dict.
The psycopg2 documentation (psycopg2.extras page) states that "Reading from the database, json values will be automatically converted to Python objects."
With RealDictCursor this does not seem to be the case.
This means that I cannot access rows[0]['misc']['age'], which would be convenient.
OK, I could do it manually with
for r in rows:
r['misc'] = json.loads(r['misc'])
but I would rather avoid that if there's a nicer solution.
ps.
someone with 1500+ rep could create the postgresql9.4 tag ;-)
Current psycopg version (2.5.3) doesn't know the oid for the jsonb type. In order to support it it's enough to call:
import psycopg2.extras
psycopg2.extras.register_json(oid=3802, array_oid=3807, globally=True)
once in your project.
You can find further information in this ML message.
Works out of the box with psycopg2 2.7.1 (no need to call json.loads; dictionaries are what come out of queries).
sudo apt-get install libpq-dev
sudo pip install psycopg2 --upgrade
python -c "import psycopg2; print(psycopg2.__version__)"
2.7.1 (dt dec pq3 ext lo64)
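If the same code has to run against both an older psycopg2 (which returns the jsonb column as str) and a newer one (which returns a dict), the manual json.loads fallback from the question can be made tolerant of either case. A minimal sketch: decode_jsonb_column is a hypothetical helper name, and the rows are plain dicts standing in for RealDictCursor results, so no database is needed to try it.

```python
import json


def decode_jsonb_column(rows, column):
    """Decode a JSON text column in place; leave already-decoded values alone.

    Assumption: the column holds JSON objects or arrays. A jsonb value that is
    itself a bare JSON string would need separate handling.
    """
    for row in rows:
        value = row[column]
        if isinstance(value, str):      # older driver: still raw JSON text
            row[column] = json.loads(value)
    return rows


# Plain dicts standing in for RealDictCursor rows (hypothetical sample data).
rows = [
    {"id": 1, "name": "user1", "misc": '{ "age" : 23, "size" : "M" }'},  # str
    {"id": 2, "name": "user2", "misc": {"age": 30, "size": "XL"}},       # dict
]
decode_jsonb_column(rows, "misc")
first_age = rows[0]["misc"]["age"]
print(first_age)
```

NULL values (None) and values the driver has already converted pass through untouched, so the helper becomes a no-op once psycopg2 is upgraded.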

"Can't find table names in FROM clause" error with DBD::CSV

I'm trying to use a UNION statement in DBI to combine two CSV files:
#!/usr/bin/perl -w
use strict;
use DBI;
my $dbh = DBI->connect ("dbi:CSV:")
or die "Cannot connect to the CSV file: $DBI::errstr";
$dbh->{RaiseError} = 1;
$dbh->{TraceLevel} = 0;
my $query = "select * from file.csv UNION select * from output.csv";
my $sth = $dbh->prepare ($query);
$sth->execute ();
$sth->dump_results();
$sth->finish();
$dbh->disconnect();
However, I get the following errors:
Can't find table names in FROM clause! at C:/Perl64/site/lib/SQL/Statement.pm line 88.
DBD::CSV::db prepare failed: Can't find table names in FROM clause! at C:/Perl64/site/lib/SQL/Statement.pm line 88. [for Statement "select * from file.csv UNION select * from output.csv"] at CSV.pl line 15.
DBD::CSV::db prepare failed: Can't find table names in FROM clause! at C:/Perl64/site/lib/SQL/Statement.pm line 88. [for Statement "select * from file.csv UNION select * from output.csv"] at CSV.pl line 15.
I updated SQL::Statement and SQL::Parser as suggested elsewhere, but that didn't fix the issue. I'm running on Windows 8.1. What is causing the errors?
Drop the .csv extensions from the query and make sure your files are in the current directory:
my $query = "select * from file UNION select * from output";
You can also explicitly set the folder containing the CSV files:
my $dbh = DBI->connect ("dbi:CSV:", "", "", {
f_dir => 'C:\path_to_csv',
});
DBD::CSV uses SQL::Statement as its SQL engine. SQL::Statement only supports a subset of SQL commands, which does not include UNION.
As an alternative, why not simply concatenate the two files?
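The concatenation idea can be sketched as follows. This is only an illustration under stated assumptions: concat_csv is a hypothetical helper, both inputs are assumed to share the same header row, and in-memory streams stand in for file.csv and output.csv.

```python
import csv
import io


def concat_csv(sources, dest):
    """Copy the rows of several CSV streams to dest, writing the header once.

    Assumes every source starts with the same header row; raises ValueError
    if a later header differs from the first one.
    """
    writer = csv.writer(dest, lineterminator="\n")
    header = None
    for src in sources:
        reader = csv.reader(src)
        first = next(reader, None)
        if first is None:               # empty input: nothing to copy
            continue
        if header is None:
            header = first
            writer.writerow(header)
        elif first != header:
            raise ValueError("CSV headers differ: %r vs %r" % (header, first))
        writer.writerows(reader)        # remaining rows are data


# In-memory stand-ins for file.csv and output.csv (hypothetical contents).
file_csv = io.StringIO("id,name\n1,a\n2,b\n")
output_csv = io.StringIO("id,name\n2,b\n3,c\n")
merged = io.StringIO()
concat_csv([file_csv, output_csv], merged)
merged_text = merged.getvalue()
print(merged_text)
```

Note that plain concatenation keeps duplicate rows, so it corresponds to UNION ALL rather than UNION; rows would have to be de-duplicated separately to match UNION semantics.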

PostgreSQL, perl and dojo special character issue (æ,ø and å)

I have a web page written in Perl and Dojo, backed by a PostgreSQL database. I have to search for available people in the database, and since I'm from Denmark, the letters æ, ø and å have to work in the search. I thought this was standard when using UTF-8, and given my experience programming in PHP with MySQL, I didn't think it would be that hard.
I have probably tried every trick I know to convert this search_word to the right encoding so that I can search the PostgreSQL database for names containing æ, ø and å, but it still fails.
My Perl code performs the fetch, but it returns 0 rows, whereas the same command in the psql terminal returns 46 rows (I copied the statement printed to STDERR from a terminal running tail -f on the log and pasted it into another terminal connected to the database through psql). The Perl code is:
sub dbSearchPersons {
my $search_word = escapeSql($_[0]);
$search_word = Encode::decode_utf8($search_word);
$statement = "SELECT id,name,initials,email FROM person WHERE name ilike '\%".$search_word."\%' OR email ilike '\%".$search_word."\%' OR initials ilike '\%".$search_word."\%' ORDER BY name ASC";
$sth = $dbh->prepare($statement);
$num_rows = $sth->execute();
print STDERR "Statement: " . $statement;
if($num_rows > 0){
$persons = $dbh->selectall_hashref($statement,'id');
}
dbFinish($sth);
webdie($DBI::errstr) if($DBI::errstr);
}
As you can see, I write the SQL statement to STDERR, which outputs the following:
[Fri Apr 27 11:24:26 2012] [error] [client 10.254.0.1] Statement: SELECT id,name,initials,email FROM person WHERE name ilike '%Jørgen%' OR email ilike '%Jørgen%' OR initials ilike '%Jørgen%' ORDER BY name ASC, referer: https://xx.xxx.xxx.xx/cgi-bin/users.cgi
The SQL is correctly written (as far as I can see from the terminal output above), and if I copy the statement from the terminal and paste it directly into psql, I get 46 rows returned, as I should. But the Perl code still won't return any rows.
I don't get it. Once the string is formatted to display "ø" and not "Ã¸" (which is how Perl renders the UTF-8 encoding of "J%C3%B8rgen", the value sent through dojo.xhr.post), should I not be able to use it in an SQL statement? Do I have to take into account that the PostgreSQL database may have a certain encoding? Or could it be something completely different?
I hope someone can help me. I have been struggling with this problem for two days now, and since things look the way they should but don't work, I'm getting a little sad :/
Regards,
Thor Astrup Pedersen
You probably forgot to enable pg_enable_utf8. The database interface will then return Perl character data to you.
$ createdb -e -E UTF-8 -l en_US.UTF-8 -T template0 so10349280
CREATE DATABASE so10349280 ENCODING 'UTF-8' TEMPLATE template0 LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8';
$ echo 'create table person (id int, name varchar, initials varchar, email varchar)'|psql so10349280
CREATE TABLE
$ echo "insert into person (id, name) values (1, 'Jørgensen')"|psql so10349280
INSERT 0 1
$ echo 'select * from person'|psql so10349280
id | name | initials | email
----+-----------+----------+-------
1 | Jørgensen | |
$ perl -Mutf8 -Mstrictures -MDBI -MDevel::Peek -E'
my $dbh = DBI->connect(
"DBI:Pg:dbname=so10349280", $ENV{LOGNAME}, "", { RaiseError => 1, AutoCommit => 1, pg_enable_utf8 => 1}
);
my $r = $dbh->selectall_hashref("select * from person where name = ?", "id", undef, "Jørgensen");
Dump $r->{1}{name};
'
SV = PV(0x836e20) at 0xa58dc8
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
PV = 0xa5a000 "J\303\270rgensen"\0 [UTF8 "J\x{f8}rgensen"]
CUR = 10
LEN = 16
You don't say so clearly, but I think you eventually intend to send the character data out as JSON for use with Dojo. You need to encode it into UTF-8 octets; the various JSON libraries take care of that automatically for you, so there is no need to invoke Encode functions manually.

Is there an equivalent of PHP's mysql_real_escape_string() for Perl's DBI?

Could someone tell me whether there is a function in Perl's DBI module that works the same as PHP's mysql_real_escape_string()?
You should use placeholders and bind values.
Don't. Escape. SQL.
Don't. Quote. SQL.
Use SQL placeholders/parameters (?). The structure of the SQL statement and the data values represented by the placeholders are sent to the database completely separately, so (barring a bug in the database engine or the DBD module) there is absolutely no way that the data values can be interpreted as SQL commands.
my $name = "Robert'); DROP TABLE Students; --";
my $sth = $dbh->prepare('SELECT id, age FROM Students WHERE name = ?');
$sth->execute($name); # Finds Little Bobby Tables without harming the db
As a side benefit, using placeholders is also more efficient if you re-use your SQL statement (it only needs to be prepared once) and no less efficient if you don't (if you don't call prepare explicitly, it still gets called implicitly before the query is executed).
Like quote?
I would also recommend reading the documentation for DBD::MySQL if you are worried about utf8.
From http://www.stonehenge.com/merlyn/UnixReview/col58.html :
use SQL::Abstract;
...
my $sqa = SQL::Abstract->new;
my ($owner, $account_type) = @_; # from inputs
my ($sql, @bind) = $sqa->select('account_data', # table
[qw(account_id balance)], # fields
{
account_owner => $owner,
account_type => $account_type
}, # "where"
);
my $sth = $dbh->prepare_cached($sql); # reuse SQL if we can
$sth->execute(@bind); # execute it for this query
Database Handle Method "quote"
my $dbh = DBI->connect( ... );
$sql = sprintf "SELECT foo FROM bar WHERE baz = %s",
$dbh->quote("Don't");
http://metacpan.org/pod/DBI#quote

how to search for whole words efficiently in a sqlite db

I have a table that has a title column. I want to search for whole words like foo, so it should match " hi foo bye" or "foo", but not "foobar" or "hellofoo". Is there a way to do this without changing the table structure? I currently use 3 LIKE conditions, but it is too slow: select * from articles where title like '% foo' or title like 'foo %' or title = 'foo' or title like '% foo %';
There has got to be a better way to do this?
You might be interested in a search indexer like lucene, ferret, or sphinx. These would run as separate processes that would index your data for fast searching where stemming, etc. can be configured.
Alternatively, depending on your data, you could just return all results that contain "foo" in any context and then filter them with regular expressions or such outside of the database. This might be an improvement depending on the characteristics of your data.
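The "filter outside the database" idea can be sketched like this: a coarse substring match in SQL, followed by a whole-word check in application code. This is a minimal illustration using Python's built-in sqlite3 and re modules; the articles table contents and the whole_word_matches helper are hypothetical, and any non-word character (space, punctuation, start or end of string) is treated as a word boundary.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT)")
titles = [" hi foo bye", "foo", "foobar", "hellofoo", "foo! again", "nothing here"]
conn.executemany("INSERT INTO articles (title) VALUES (?)", [(t,) for t in titles])


def whole_word_matches(conn, word):
    """Return titles containing `word` as a whole word."""
    # Coarse filter in SQL: every title that contains the substring at all.
    # (Assumes `word` contains no LIKE wildcards such as % or _.)
    candidates = conn.execute(
        "SELECT title FROM articles WHERE title LIKE ? ORDER BY rowid",
        ("%" + word + "%",),
    )
    # Fine filter in Python: \b only matches at a word boundary.
    word_re = re.compile(r"\b" + re.escape(word) + r"\b")
    return [title for (title,) in candidates if word_re.search(title)]


matches = whole_word_matches(conn, "foo")
print(matches)
```

Whether this beats the original query depends on how many rows contain the substring, since the LIKE '%foo%' scan still touches every row.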
spontaneous answer:
use the regexp operator instead of the like operator.
EDIT: I just realised that regexp is not always included with SQLite. You might have to compile your own; in other words, it's not there by default.
EDIT2
here's a working Perl sample ..
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
use DBI;
# connect to the DB
my $dbh = DBI->connect("dbi:SQLite:dbname=dbfile","","");
# create ugly, pureperl function 'regexp'
# stolen from http://d.hatena.ne.jp/tokuhirom/20090416/1239849298
$dbh->func( "regexp"
, 2
, sub { my ( $pattern, $target ) = @_;
utf8::decode($pattern);
utf8::decode($target);
$target =~ m{$pattern} ? 1 : 0;
}
, "create_function" );
# drop table, if it exists
$dbh->do('drop table if exists foobar');
$dbh->do('create table foobar (foo varchar not null)');
my $sth=$dbh->prepare('insert into foobar (foo) values (?)');
while (<DATA>) { chomp; $sth->execute($_); }
#query using regexp
my $a= $dbh->selectall_arrayref( 'select foo '
.'from foobar '
.'where foo regexp "^foo$|^foo\s+.*|.*\W+foo\W+.*|.*\W+foo$"'
);
print join("\n", map {$_->[0];} @{$a})
__DATA__
foo
foo
barfoo
foobarfolo
sdasdssds bar dasdsdsad
dasdsdasdsadsads foo! dasdasdasdsa
There are various regexp libraries that you can include in your iPhone application by linking to them in your build.
See this Stackoverflow question for further info.