How to properly use UTF-8-encoded data from Schema inside Catalyst app?

Data defined inside the Catalyst app or in templates has the correct encoding and is displayed well, but everything non-Latin1 coming from the database is converted to ?. I suppose the problem is in the model class, which looks like this:
use strict;
use base 'Catalyst::Model::DBIC::Schema';
__PACKAGE__->config(
schema_class => 'vhinnad::Schema::DB',
connect_info => {
dsn => 'dbi:mysql:test',
user => 'user',
password => 'password',
{
AutoCommit => 1,
RaiseError => 1,
mysql_enable_utf8 => 1,
},
'on_connect_do' => [
'SET NAMES utf8',
],
}
);
1;
I see no flaws here, but something must be wrong. I also used my schema with test scripts, where the data was well encoded and the output was correct, but inside the Catalyst app I did not get the encoding right. Where might the problem be?
EDIT
For future reference I put the solution here: I mixed the old and the new style in the connect info.
The old style is a list like (dsn, username, password, hashref_options, hashref_other_options).
The new style is a flat hash (dsn => dsn, user => username, etc.), so the right way is:
connect_info => {
dsn => 'dbi:mysql:test',
user => 'user',
password => 'password',
AutoCommit => 1,
RaiseError => 1,
mysql_enable_utf8 => 1,
on_connect_do => [
'SET NAMES utf8',
],
}

In a typical Catalyst setup with Catalyst::View::TT and Catalyst::Model::DBIC::Schema you'll need several things for UTF-8 to work:
add Catalyst::Plugin::Unicode::Encoding to your Catalyst app
add encoding => 'UTF-8' to your app config
add ENCODING => 'utf-8' to your TT view config
add <meta http-equiv="Content-type" content="text/html; charset=UTF-8"/> to the <head> section of your html to satisfy old IEs which don't care about the Content-Type:text/html; charset=utf-8 http header set by Catalyst::Plugin::Unicode::Encoding
make sure your text editor saves your templates in UTF-8 if they include non-ASCII characters
configure your DBIC model according to DBIx::Class::Manual::Cookbook#Using Unicode
if you use Catalyst::Authentication::Store::LDAP configure your LDAP stores to return UTF-8 by adding ldap_server_options => { raw => 'dn' }
According to Catalyst::Model::DBIC::Schema#connect_info:
The old arrayref style with hashrefs for DBI then DBIx::Class options is also supported.
But you are already using the 'new' style so you shouldn't nest the dbi attributes:
connect_info => {
dsn => 'dbi:mysql:test',
user => 'user',
password => 'password',
AutoCommit => 1,
RaiseError => 1,
mysql_enable_utf8 => 1,
on_connect_do => [
'SET NAMES utf8',
],
}
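
To see why the nested hashref in the question silently loses the DBI attributes, here is a minimal sketch in plain Perl (no Catalyst or DBIC involved; the variable names $mixed, $flat and @stray are made up for illustration). It only shows how Perl itself builds the two hashes. How Catalyst::Model::DBIC::Schema then treats the stray keys is not shown and is not claimed here.

```perl
use strict;
use warnings;

# Mixed style, as in the question: a hashref sits where a key is expected.
my $mixed;
{
    no warnings 'misc';    # silence "Odd number of elements in anonymous hash"
    $mixed = {
        dsn      => 'dbi:mysql:test',
        user     => 'user',
        password => 'password',
        { AutoCommit => 1, RaiseError => 1, mysql_enable_utf8 => 1 },
        on_connect_do => ['SET NAMES utf8'],
    };
}

# Flat style, as in the answer: every option is a top-level key.
my $flat = {
    dsn               => 'dbi:mysql:test',
    user              => 'user',
    password          => 'password',
    AutoCommit        => 1,
    RaiseError        => 1,
    mysql_enable_utf8 => 1,
    on_connect_do     => ['SET NAMES utf8'],
};

# In the mixed hash the references were stringified into keys such as
# "HASH(0x...)", so mysql_enable_utf8 is not visible as an option at all.
my @stray = grep { /^(?:HASH|ARRAY)\(0x/ } keys %$mixed;

printf "mixed: mysql_enable_utf8 %s, %d stray key(s)\n",
    ( exists $mixed->{mysql_enable_utf8} ? 'present' : 'missing' ),
    scalar @stray;
printf "flat:  mysql_enable_utf8 %s\n",
    ( exists $flat->{mysql_enable_utf8} ? 'present' : 'missing' );
```

Dumping the keys of a suspicious connect_info like this is a quick way to check whether an option actually ended up at the top level.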

This advice assumes you have fairly up to date versions of DBIC and Catalyst.
This is not necessary: on_connect_do => [ 'SET NAMES utf8' ]
Ensure the table and column charsets are UTF-8 in your DB. You can get results that sometimes look right even when parts of the chain are broken. The DB must be saving the character data as UTF-8 if you expect the entire chain to work.
Ensure you're using and configuring Catalyst::Plugin::Unicode::Encoding in your Catalyst app. It had fairly serious bugs in the not too distant past, so get the newest version.

TCA type 'inline' handling in multi language environment

The following scenario: I have a page translated (connected mode, not copy/free mode!) with multiple elements that are translated from the default language.
In my elements without inline fields everything is fine!
In my elements with inline fields I am completely confused about the handling and/or the configuration!
Let’s say I have a content element which contains three inline elements (let’s call them “quotes”). If I translate these quotes 1:1 everything works as expected.
Well ... almost:
I can create new quotes in the translation, but they won’t be displayed.
I can change the sorting, which won’t be taken into account in frontend. The frontend uses the sorting of the default language.
If I create a new quote in the default language, I get the record displayed in the translation and can translate it. So this works as expected.
This leads me to my questions:
How do I make the quotes/inline elements in the translation independent of the default language?
If this is not possible (which would be fine with me, as it somewhat contradicts the idea of the Translate/Connected mode), how do I get rid of the buttons for Sort and Create new (of course only in the translation, not in the default language)? Otherwise editors will try this and wonder why it doesn't work.
I hope I've simply forgotten an option, but I've been thinking about it and looking for a solution for hours now, so I probably can't see the forest for the trees.
This might help if it is a missing option:
TCA
'config' => [
'appearance' => [
'collapseAll' => '1',
'enabledControls' => [
'dragdrop' => '1',
],
'levelLinksPosition' => 'bottom',
'newRecordLinkTitle' => 'New quote',
'useSortable' => '1',
'showSynchronizationLink' => true,
'showAllLocalizationLink' => true,
'showPossibleLocalizationRecords' => true,
],
'foreign_field' => 'parent_id',
'foreign_sortby' => 'sorting',
'foreign_table' => 'my_quotes_table',
'foreign_table_field' => 'parent_table',
'minitems' => '1',
'type' => 'inline',
],
Typoscript
dataProcessing {
10 = TYPO3\CMS\Frontend\DataProcessing\DatabaseQueryProcessor
10 {
if.isTrue.field = my_quotes_field
table = my_quotes_table
pidInList.field = pid
where = parent_id=###uid### AND deleted=0 AND hidden=0
orderBy = sorting
markers.uid.field = uid
as = quotes
}
}
I am using TYPO3 version 11.5.17 with PHP 8.1 and MariaDB 10.5
1. How do I make the quotes/inline elements in the translation independent of the default language?
You cannot (by any standard TYPO3 means) do this in connected mode. Your page needs to be in free-translation mode to do stuff like that.
If you want to hide the field for all the other languages, you can add a "displayCond" property to your TCA like this:
'displayCond' => [
'AND' => [
'FIELD:sys_language_uid:=:0'
]
],
This way the field at least stays hidden for the connected languages.

Net::Kashflow - doesn't work with utf8 descriptions

I know this is a very old Perl module (5 or so years since the last update). I've found it useful for a project I'm doing, though, which needs to be in Perl. Rather than starting from scratch, I've found it helpful for doing the basics. I've already fixed a few bugs in it, but this one I can't figure out.
The module in question is: https://github.com/simoncozens/Net-KashFlow
An example usage with the problem is:
my $kf = Net::KashFlow->new(username => q|foo@bar.com|, password => "xxxx" );
my $i = $kf->create_invoice({
CustomerReference => "0123-1111111", CustomerID => 50108952,
CurrencyCode => "EUR"
}) || die $!;
$i->add_line({
Quantity => 1,
Description => "íéó foo bar test",
Rate => 10,
VatAmount => 4,
VatRate => 0,
CurrencyCode => "GBP"
});
This item gets added, but the "Description" value gets converted to:
7enzIGZvbyBiYXIgdGVzdA==
If you use plain a-z 0-9 it works fine (and shows correctly). The issue seems to be that it's being encoded into base64 and then not decoded correctly at the other end. My guess is that KashFlow are not going to "fix" this, so it really needs to be done at this end. I'm not really familiar with the SOAP::Lite module (again, a pretty old module, it seems!), but that's what it uses.
This is the part I think that deals with adding a new "line" to the invoice:
InsertInvoiceLine => {
endpoint => 'https://securedwebapp.com/api/service.asmx',
soapaction => 'KashFlow/InsertInvoiceLine',
namespace => 'KashFlow',
parameters => [
SOAP::Data->new(name => 'UserName', type => 'xsd:string', attr => {}),
SOAP::Data->new(name => 'Password', type => 'xsd:string', attr => {}),
SOAP::Data->new(name => 'InvoiceID', type => 'xsd:int', attr => {}),
SOAP::Data->new(name => 'InvLine', type => 'tns:InvoiceLine', attr => {}),
], # end parameters
}, # end InsertInvoiceLine
You can see the structure here:
https://securedwebapp.com/api/service.asmx?op=InsertInvoiceLine
After researching this, it was suggested that you tell SOAP::Lite to not convert utf8 into base64, using (I assume), something like:
The structure is:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<InsertInvoiceLine xmlns="KashFlow">
<UserName>string</UserName>
<Password>string</Password>
<InvoiceID>int</InvoiceID>
<InvLine>
<Quantity>decimal</Quantity>
<Description>string</Description>
<Rate>decimal</Rate>
<ChargeType>int</ChargeType>
<VatRate>decimal</VatRate>
<VatAmount>decimal</VatAmount>
<ProductID>int</ProductID>
<Sort>int</Sort>
<ProjID>int</ProjID>
<LineID>int</LineID>
<ValuesInCurrency>integer</ValuesInCurrency>
</InvLine>
</InsertInvoiceLine>
</soap:Body>
</soap:Envelope>
So it looks like it's Body > InsertInvoiceLine > InvLine > Description, but I'm unsure how I can tell it not to encode that particular string.
Any advice would be much appreciated. While it's not a major show stopper (as all the data is in the system), it would be much nicer/easier to see the item names as expected :)
Thanks!
I think this is probably SOAP::Lite deciding to convert things to base64 when it thinks they aren't a particular subset of ASCII. You'll find this heuristic in SOAP/Lite.pm in SOAP::Serializer:
$self->typelookup({
'base64Binary' =>
[10, sub {$_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ }, 'as_base64Binary'],
'zerostring' =>
[12, sub { $_[0] =~ /^0\d+$/ }, 'as_string'],
... many other types ...
'string' =>
[100, sub {1}, 'as_string'],
});
This comes into play when SOAP::Lite doesn't know an object's type because no one has told it. I'm guessing that in the bowels of your program it's serializing Description and typelookup sticks its dirty mitts in.
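
To make the heuristic concrete, here is a small sketch using only core modules (MIME::Base64 and Encode). It applies the same regex as the base64Binary entry quoted above and decodes the value from the question. The helper name looks_binary is made up for illustration, and reading the decoded bytes as ISO-8859-1 is an assumption about how the string reached the serializer.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Same test as the 'base64Binary' entry quoted above: true when the value
# contains anything outside tab, LF, CR and the range \x20-\x7f.
sub looks_binary { return $_[0] =~ /[^\x09\x0a\x0d\x20-\x7f]/ ? 1 : 0 }

# The value that ended up in KashFlow, decoded back to raw bytes and then
# interpreted as ISO-8859-1 (Latin-1), which is an assumption.
my $stored = '7enzIGZvbyBiYXIgdGVzdA==';
my $text   = decode( 'ISO-8859-1', decode_base64($stored) );

binmode STDOUT, ':encoding(UTF-8)';
print "decoded: $text\n";
printf "accented description looks binary: %d\n", looks_binary($text);
printf "plain description looks binary:    %d\n", looks_binary('foo bar test');
```

The decoded text is the original description, which suggests the data itself is intact and only the type guess is wrong.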
And from here you are on your own, because SOAP::Lite is no fun. I would start by hacking on SOAP::Lite a bit to see what you can trace. Copy the SOAP/Lite.pm file somewhere and put that location in your @INC. That way you don't mess around with the original file.
If you never want the base64, it might be as simple as deleting that line in typelookup, although declaring the Description type would be more proper (but potentially also a rabbit hole of work). The fast fix can stand in while you work on the right fix.
There's also the PerlMonks meditation How to convince SOAP::Lite to return UTF-8 data in responses as UTF-8?.

How to connect RapidApp to PostgreSQL, with UTF-8 enabled

I'm creating a simple CRUD interface to a database, and I'm trying RapidApp.
I have an existing database, which I connect to with existing Moose-based code. There is a complication in that there is UTF-8 text in the database (eg 'Encyclopédie médico-chirurgicale. Técnicas quirúrgicas. Aparato digestivo')
My Moose-based code works just fine: data goes in & data comes out... and everyone is happy.
In my existing Moose code, the connector is:
$schema = My::Service::Schema->connect(
'dbi:Pg:dbname=my_db;host=my.host.name;port=1234',
'me',
'secret',
{ pg_enable_utf8 => 1 }
);
When I set about connecting RapidApp, I first tried a simple rdbic.pl command, but that doesn't pick up the UTF-8 strings. In an attempt to enforce UTF-8-ness, I've created the following:
use Plack::Runner;
use Plack::App::RapidApp::rDbic;
my $cnf = {
connect_info => {
dsn => 'dbi:Pg:dbname=my_db;host=my.host.name;port=1234',
user => 'me',
password => 'secret',
{ pg_enable_utf8 => 1 },
},
schema_class => 'My::Service::Schema'
};
my $App = Plack::App::RapidApp::rDbic->new( $cnf );
my $psgi = $App->to_app;
my $runner = Plack::Runner->new;
$runner->parse_options('--port', '5678');
$runner->run($psgi);
(which is pretty much rdbic.pl, compressed to one specific thing)
However - I'm getting mal-formed strings (eg: 'Encyclopédie médico-chirurgicale. Técnicas quirúrgicas. Aparato digestivo')
Having fought to get the correct text INTO the database, I know the database is correct... so how do I connect RapidApp to get UTF-8 back out?
Your schema will need to be configured to support UTF-8. Here's a helpful set of things to try:
How to properly use UTF-8-encoded data from Schema inside Catalyst app?
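
Applied to the rDbic config from the question, the advice in that linked answer would mean moving pg_enable_utf8 out of its own hashref and into the top level of connect_info. The following is only a sketch: it assumes Plack::App::RapidApp::rDbic passes connect_info through in the same flat style that Catalyst::Model::DBIC::Schema expects, which the source does not confirm. The @stray check is a made-up sanity test, not part of RapidApp.

```perl
use strict;
use warnings;

# Flat connect_info: pg_enable_utf8 is a top-level key next to
# dsn/user/password instead of being wrapped in its own hashref.
my $cnf = {
    connect_info => {
        dsn            => 'dbi:Pg:dbname=my_db;host=my.host.name;port=1234',
        user           => 'me',
        password       => 'secret',
        pg_enable_utf8 => 1,
    },
    schema_class => 'My::Service::Schema',
};

# Sanity check: no option has been swallowed into a stringified
# "HASH(0x...)" key, which is what a nested hashref would produce.
my @stray = grep { /^HASH\(0x/ } keys %{ $cnf->{connect_info} };
print @stray ? "nested hashref found\n" : "connect_info is flat\n";
```

The resulting $cnf would then be handed to Plack::App::RapidApp::rDbic->new exactly as in the question.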

How do I write tests against POE EasyDBI?

I'm looking for some ideas or techniques for writing tests against code that uses an EasyDBI session to access data in MySQL. I don't want the EasyDBI session to be aware of being tested, so I was hoping to find a way to mock a DSN or something like that. But it's not clear to me how I might do that.
Anyone had/solved this problem before?
I ended up using DBD::Mock, which is pretty nice. When I set up my EasyDBI component I used DBI:Mock: as the DSN. Then in the options I passed the result sets I wanted to return.
my @result_set = (list of stuff);
my $eDBI = POE::Component::EasyDBI->spawn(
alias => 'eDBI',
dsn => "DBI:Mock:",
username => "",
password => "",
options => {
AutoCommit => 0,
mock_add_resultset => \@result_set,
},
no_connect_failures => 1,
reconnect_wait => 2,
max_retries => 5,
connect_error => [ $alias, "dbi_failure", 5 ],
connected => [ $alias, "dbi_connected" ],
);
Maybe Test::Database (see Test::Database::Tutorial) is what you need. Or you could create the test database from your __DATA__ section with :cache:

Cakephp Form Labels Encoding Utf8

In my PHP application I have set everything to UTF-8 from the beginning to avoid future problems. I set up my database:
class DATABASE_CONFIG {
public $default = array(
'datasource' => 'Database/Mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'root',
'password' => '',
'database' => 'aquitex',
'prefix' => '',
'encoding' => 'utf8',
);
public $test = array(
'datasource' => 'Database/Mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'root',
'password' => '',
'database' => 'aquitex',
'prefix' => '',
'encoding' => 'utf8',
);
}
The file core.php:
Configure::write('App.encoding', 'UTF-8');
And the default layout of the views:
<?php echo $this->Html->charset(); ?>
However, I'm still having problems with some elements, like form labels.
In my index.ctp file, this line:
echo $this->Html->link("Segurança", array('controller' => 'Posts','action'=> 'add'), array( 'class' => 'button'));
works perfectly and there's no problem with the 'ç' character.
But in forms, like this:
echo $this->Form->create('Post');
echo $this->Form->input('Nome Produto');
echo $this->Form->input(utf8_encode("Código Produto"));
echo $this->Form->input("Versão");
echo $this->Form->input('Data');
//echo $this->Form->input('body', array('rows' => '3'));
echo $this->Form->end('Criar Ficha');
there's no way I can get the words in the form labels with 'ó' or 'ç' characters to show properly. As you can see, I even tried utf8_encode() on one of them.
Any hints? Thank you!
There is no need to use utf8_encode() in your views.
You simply forgot to save the view file properly.
Save it as "UTF-8 without BOM" and you will be fine.
Files that do not contain any special UTF-8 characters can stay as ANSI (there is no difference between the two encodings in that case).
But every file that does contain such a character needs to be saved as UTF-8 (even controllers and models, if you plan on using UTF-8 characters there for error messages etc.).
PS: In general it is wiser to use English and translate it into your language via a PO file.
This way you can leave the files as they are and you are more flexible (you can add new languages on the fly just by creating a new PO file).
EDIT
After figuring out together that your input() calls use UTF-8 characters, I need to update my answer:
It is wise to use "underscore_field_names" for your DB fields (and therefore your input fields), and to keep them in English:
echo $this->Form->input("version");
You can easily translate them via a PO file afterwards, or by specifying the label:
echo $this->Form->input("version", array('label' => 'Versão'));
But the first way is recommended to keep it DRY.
App.encoding just tells Cake to send data in UTF8. If you're using MySQL, make sure the database itself is set to utf8_general_ci collation.