Special character handling when fetching data from MS SQL Server using Perl DBD - perl

I have an MS SQL Server 2008 Database, from which I am fetching data using perl DBD::Sybase module. But there are some special characters in the DB, like the Copyright symbol, Trademark symbol etc., which are not getting imported properly. Perl seems to change all of these special characters to a Question mark character. Is there a way to fix this?
I have tried specifying charset=utf8 in the connection string. The doc mentions a syb_enable_utf8 (bool) setting, but whenever I try that, I get an error:
Can't locate object method "syb_enable_utf8" via package "DBI::db"

One solution I found was this:
use Encode qw(encode_utf8);
Then, wherever you are writing data to a file or anywhere else, use Encode::encode_utf8($data);
where $data is the column/value which you have fetched from MSSQL.

I don't use DBD::Sybase but a) I use a lot of other DBDs and b) I am currently collecting information about unicode support in DBDs. According to the pod you need at least OpenClient 15.x when using syb_enable_utf8. Are you using 15.x or later? Perhaps syb_enable_utf8 is not defined if your client is less than 15.x or perhaps you have too old a version of DBD::Sybase. Unfortunately I cannot see from the Changes file when syb_enable_utf8 was added.
However, when you say "can't locate method" I think that is a clue as syb_enable_utf8 is not a method, it is an attribute (it is under Sybase Specific Attributes) in the pod. So you need to add it to your connect call or set it via a connection handle like this:
my $h = DBI->connect("dbi:Sybase:something","user","password", {syb_enable_utf8 => 1});
or
$h->{syb_enable_utf8} = 1;
You should also read the bits in the pod describing what happens when syb_enable_utf8 is set as it appears from the documents it only applies to UNIVARCHAR, UNICHAR, and UNITEXT columns.
Lastly, you need to ensure you insert the data correctly in the first place. I'd guess if it is not inserted from Perl with syb_enable_utf8 and charset=utf8 and your data is not proper unicode characters in Perl before you insert you'll get garbage back.
The comment Raze2dust made had nothing to do with your issue but is worth heeding if you are going to write the data retrieved from your database elsewhere. Just remember to decode any data input to your script and encode any data output.

Related

Simple `to_tsvector` configuration - postgres

How can I change the to_tsvector configuration to use a simple tokenization rule like:
lowercase
split by spaces only
Executing the following query:
SELECT to_tsvector('english', 'birthday=19770531 Name=John-Oliver Age=44 Code=AAA-345')
I get these lexemes:
'-345':9 '19770531':2 '44':6 'aaa':8 'age':5 'birthday':1 'code':7 'john':4 'name':3
The kind of searching I'm looking for is like:
(!birthday | birthday=19770531) & (code=AAA-345)
It means, get me all records that has a text "birthday=19770531" or doesn't have "birthday" at all, and a text equals to "code=AAA-345"). The way lexemes are being created it is not possible. I was expecting to have something like this:
'birthday=19770531':1 'age=44':2 'code=aaa-345':4 'name=john-oliver':3
You would have to code a custom parser. This can only be done in C.
But you might be able to use the existing testing parser test_parser, it seems to do what you want. If not, it would at least be a good starting point.
The problem may be that this is in src/test/modules/, and I don't think it ships with most installation packaging. So it might take some effort to get it to install. It would depend on your OS, version, and package manager.

How to parse the CAP id instead of a hashed value with Weather::NOAA::Alert in Perl

Thanks to the accepted answer in the following solution, I'm now able to extract most of the values I need from NOAA alerts: perl Data::Dumper to extract key values
I would like to parse the "CAP id" as well, however when I try, I receive a hashed value instead of the URL.
For example, using the previously mentioned thread, what I would like to parse is:
http://alerts.weather.gov/cap/wwacapget.php?x=TX12516CBE9400.FloodWarning.12516CC068C0TX.MAFFLWMAF.f21e7ce7cf8e930ab73a110c4d912576
What I get instead: HASH(0x26384c0)
I imagine this is only possible by modifying alert.pm:
https://github.com/mikegrb/Weather-NOAA-Alert/blob/master/lib/Weather/NOAA/Alert.pm and if I've read enough into the issue, it may be on account of XML::Simple?
Typically, I would use XPath to parse XML like data, but for this ATOM format I'm lost.
Ultimately, I'm simply looking to add the parsed variables to an SQL database. With NOAA looking to transition from CAP v1.1 to v1.2 (when, I have no clue), perhaps I should be looking at using something else.
In your previous code, you can get the single key of the hashref $events->{'TXC301'} like this:
my #keys = keys %{$events->{'TXC301'}}
my $alert_url = $keys[0]
Now $alert_url should hold the URL you were mentioning.
Does this answer your question?

Setting application_name on Postgres/SQLAlchemy

Looking at the output of select * from pg_stat_activity;, I see a column called application_name, described here.
I see psql sets this value correctly (to psql...), but my application code (psycopg2/SQLAlchemy) leaves it blank.
I'd like to set this to something useful, like web.1, web.2, etc, so I could later on correlate what I see in pg_stat_activity with what I see in my application logs.
I couldn't find how to set this field using SQLAlchemy (and if push comes to shove - even with raw sql; I'm using PostgresSQL 9.1.7 on Heroku, if that matters).
Am I missing something obvious?
the answer to this is a combination of:
http://initd.org/psycopg/docs/module.html#psycopg2.connect
Any other connection parameter supported by the client library/server can be passed either in the connection string or as keywords. The PostgreSQL documentation contains the complete list of the supported parameters. Also note that the same parameters can be passed to the client library using environment variables.
where the variable we need is:
http://www.postgresql.org/docs/current/static/runtime-config-logging.html#GUC-APPLICATION-NAME
The application_name can be any string of less than NAMEDATALEN characters (64 characters in a standard build). It is typically set by an application upon connection to the server. The name will be displayed in the pg_stat_activity view and included in CSV log entries. It can also be included in regular log entries via the log_line_prefix parameter. Only printable ASCII characters may be used in the application_name value. Other characters will be replaced with question marks (?).
combined with :
http://docs.sqlalchemy.org/en/rel_0_8/core/engines.html#custom-dbapi-args
String-based arguments can be passed directly from the URL string as query arguments: (example...) create_engine() also takes an argument connect_args which is an additional dictionary that will be passed to connect(). This can be used when arguments of a type other than string are required, and SQLAlchemy’s database connector has no type conversion logic present for that parameter
from that we get:
e = create_engine("postgresql://scott:tiger#localhost/test?application_name=myapp")
or:
e = create_engine("postgresql://scott:tiger#localhost/test",
connect_args={"application_name":"myapp"})
If you're using asyncpg driver, you should use
conn = await asyncpg.connect(server_settings={'application_name': 'foo'})
src - https://github.com/MagicStack/asyncpg/issues/204#issuecomment-333917251

Cant enter raw_parser in postgreSQL?

I'm trying to analyze how postgreSQL parse a query, and after some postgreSQL source code tracing with embedding printf() here and there, I've known that the query will be parsed into raw parse tree with raw_parser, which located in file parser.c.
The strange thing is, I've already embedded a printf() dummy in the raw_parser, and after re-installing the postgreSQL and execute a query, my printf() dummy is not printed to the screen!
Can anybody please help me, where I went wrong?
Thanks in advance :D
if you use printf(stderr, "...."), then you can find result in server log. Don't forget - you are not work with server directly. For debugging purposes there are a elog function - it's like printf for client application:
elog(NOTICE, "some text");
a format string is same like printf's format - but you must to remember, PostgreSQL uses a different formats than glibc - so you can to show only integer or float variables. String variables uses different format than is C zero finished string.

After querying DB I can't print data as well as text anymore to browser

I'm in a web scripting class, and honestly and unfortunately, it has come second to my networking and design and analysis classes. Because of this I find I encounter problems that may be mundane but can't find the solution to it easily.
I am writing a CGI form that is supposed to work with a MySQL DB. I can insert and delete into the DB just fine. My problem comes when querying the DB.
My code compiles fine and I don't get errors when trying to "display" the info in the DB through the browser but the data and text doesn't in fact display. The code in question is here:
print br, 'test';
my $dbh = DBI->connect("DBI:mysql:austinc4", "*******", "*******", {RaiseError => 1} );
my $usersstatement = "select * from users";
my $projstatment = "select * from projects";
# Get the handle
my $userinfo = $dbh->query($usersstatement);
my $projinfo = $dbh->query($projstatement);
# Fetch rows
while (#userrow = $userinfo->fetchrow()) {
print $userrow[0], br;
}
print 'end';
This code is in an if statement that is surrounded by the print header, start_html, form, /form, end_html. I was just trying to debug and find out what was happening and printed the statements test and end. It prints out test but doesn't print out end. It also doesn't print out the data in my DB, which happens to come before I print out end.
What I believe I am doing is:
Connecting to my DB
Forming a string the contains the command/request to the DB
Getting a handle for my query I perform on the DB
Fetching a row from my handle
Printing the first field in the row I fetched from my table
But I don't see why my data wouldn't print out as well as the end text. I looked in DB and it does in fact contain data in the DB and the table that I am trying to get data from.
This one has got me stumped, so I appreciate any help. Thanks again. =)
Solution:
I was using a that wasn't supported by the modules I was including. This leads me to another question. How can I detect errors like this? My program does in fact compile correctly and the webpage doesn't "break". Aside from me double checking that all the methods I do use are valid, do I just see something like text not being displayed and assume that an error like this occurred?
Upon reading the comments, the reason your program is broken is because query() does not execute an SQL query. Therefore you are probably calling an undefined subroutine unless this is a wrapper you have defined elsewhere.
Here is my original posting of helpful hints, which still apply:
I hope you have use CGI, use DBI, etc... and use CGI::Carp and use strict;
Look in /var/log/apache2/access.log or error.log for the bugs
Realize that the first thing a CGI script prints MUST be a valid header or the web server and browser become unhappy and often nothing else displays.
Because of #3 print the header first BEFORE you do anything, especially before you connect to the database where the script may die or print something else because otherwise the errors or other messages will be emitted before the header.
If you still don't see an error go back to #2.
CGIs that use CGI.pm can be run from a command line in a terminal session without going through the webserver. This is also a good way to debug.