How to display readable UTF-8 strings with Data::Dumper? - perl

I have some UTF-8 encoded strings in structures which I am dumping for debugging purposes with Data::Dumper.
A small test case is:
use utf8;
use Data::Dumper;
say Dumper({да=>"не"}
It outputs
{
"\x{434}\x{430}" => "\x{43d}\x{435}"
};
but I want to see
{
"да" => "не"
};
Of course my structure is quite more complex.
How can I make the strings in the dumped structure readable while debugging? Maybe I have to process the output via chr somehow before warn/say?

Just for debugging:
#!/usr/bin/perl
use strict;
use warnings;
use v5.10;
use utf8;
use Data::Dumper;
binmode STDOUT, ':utf8';
CASE_1: {
# Redefine Data::Dumper::qquote() to do nothing
no warnings 'redefine';
local *Data::Dumper::qquote = sub { qq["${\(shift)}"] };
# Use the Pure Perl implementation of Dumper
local $Data::Dumper::Useperl = 1;
say Dumper({да=>"не"});
}
CASE_2: {
# Use YAML instead
use YAML;
say Dump({да=>"не"});
}
CASE_3: {
# Evalulate whole dumped string
no strict 'vars';
local $Data::Dumper::Terse = 1;
my $var = Dumper({да=>"не"});
say eval "qq#$var#" or die $#;
}
__END__
$VAR1 = {
"да" => "не"
};
---
да: не
{
"да" => "не"
}

print Dumper(%mydata) =~ s/\\x\{([0-9a-f]{2,})\}/chr hex $1/ger;

sorry but I had tested eval whole dump and had got some repugnancy for my data so
Data::Dumper->new(\#_)
->Indent(1)->Sortkeys(1)->Terse(1)->Useqq(0)->Dump
=~ s/((?:\\x\{[\da-f]+\})+)/eval '"'.$1.'"'/eigr;

Related

List all the subroutine names in perl program

I am using more modules in my perl program.
example:
use File::copy;
so likewise File module contains Basename, Path, stat and etc..
i want to list all the subroutine(function) names which is in File Package module.
In python has dir(modulename)
It list all the function that used in that module....
example:
#!/usr/bin/python
# Import built-in module math
import math
content = dir(math)
print content
Like python tell any code for in perl
If you want to look at the contents of a namespace in perl, you can use %modulename::.
For main that's either %main:: or %::.
E.g.:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub fish {};
sub carrot {};
print "Stuff defined in Dumper:\n";
print Dumper \%Data::Dumper::;
print "Stuff defined:\n";
print Dumper \%::;
That covers a load of stuff though - including pragmas. But you can check for e.g. subroutines by simply testing it for being a code reference.
foreach my $thing ( keys %:: ) {
if ( defined &$thing ) {
print "sub $thing\n";
}
}
And with reference to the above sample, this prints:
sub Dumper
sub carrot
sub fish
So with reference to your original question:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use File::Copy;
print "File::Copy has subs of:\n";
foreach my $thing ( keys %File::Copy:: ) {
if ( defined &$thing ) {
print "sub $thing\n";
}
}
Unfortunately you can't do the same thing with the whole File:: namespace, because there's a whole bunch of different modules that could be installed/loaded, but might not be.
You'd have to use e.g. CPAN to check that -
perl -MCPAN -e shell
i /^File::/
Which will list you around 717 modules that are grouped into the File:: tree.
You could look this up on CPAN. Or if you're just after the core modules, then some variant of using Module::CoreList might do what you want.
Something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Module::CoreList;
foreach my $module ( Module::CoreList->find_modules(qr/^File::/) ) {
if ( eval { require $module =~ s|::|/|gr . ".pm" } ) {
print "Module: $module contains\n";
my $key_str = "\%$module\:\:";
my %stuff = eval $key_str;
foreach my $thing ( sort keys %stuff ) {
my $full_sub_path = "$module::$thing";
if ( eval {"defined &$full_sub_path"} ) {
if ( defined &$thing ) {
print "$thing <- $full_sub_path imported by default\n";
}
else {
print "\t$full_sub_path might be loadable\n";
}
}
}
}
else {
print "Module: $module couldn't be loaded\n";
}
}
It's a bit messy because you have to eval various bits of it to test if a module is in fact present and loadable at runtime. Oddly enough, File::Spec::VMS wasn't present on my Win32 system. Can't think why.... :).
Should note - just because you could import a sub from a module (that isn't exported by default) doesn't make it a good idea. By convention, any sub prefixed with an _ is not supposed to be used externally, etc.
My Devel::Examine::Subs module can do this, plus much more. Note that whether it's a method or function is irrelevant, it'll catch both. It works purely on subroutines as found with PPI.
use warnings;
use strict;
use Devel::Examine::Subs;
my $des = Devel::Examine::Subs->new;
my $subs = $des->module(module => 'File::Copy');
for (#$subs){
print "$_\n";
}
Output:
_move
move
syscopy
carp
mv
_eq
_catname
cp
copy
croak
Or a file/full directory. For all Perl files in a directory (recursively), just pass the dir to file param without a file at the end of the path:
my $des = Devel::Examine::Subs->new(file => '/path/to/file.pm');
my $subs = $des->all;
If you just want to print it use the Data::Dumper module and the following method, CGI used as an example:
use strict;
use warnings;
use CGI;
use Data::Dumper;
my $object = CGI->new();
{
no strict 'refs';
print "Instance METHOD IS " . Dumper( \%{ref ($object)."::" }) ;
}
Also note, it's File::Copy, not File::copy.

How to put data from CSV file to Perl hash

I have Perl and CSV file with something like:
"Name","Lastname"
"Homer","Simpsons"
"Ned","Flanders"
In this CSV file I have header in the first line and in other lines there are
data.
I want to convert this CSV file to such Perl data:
[
{
Lastname => "Simpsons",
Name => "Homer",
},
{
Lastname => "Flanders",
Name => "Ned",
},
]
I've written the function that users Text::CSV and doing what I need.
Here is the sample script:
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use 5.010;
use utf8;
use open qw(:std :utf8);
use Text::CSV;
sub read_csv {
my ($filename) = #_;
my #first_line;
my $result;
my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";
while (my $row = $csv->getline ($fh)) {
if (not #first_line) {
#first_line = #{$row};
} else {
push #{$result}, { map { $first_line[$_] => $row->[$_] } 0..$#first_line };
}
}
close $fh;
return $result;
}
my $data = read_csv('sample.csv');
This works fine but this function I want to use in several scripts. I'm
greatly suprised that Text::CSV doesn't have this feature.
My question. What should I do to simplify solving such tasks in the future for
me and others?
Should I use some Perl module from CPAN, should I try to add this function to
Text::CSV, or something else?
Huh? Why so complicated? First, we fetch the header outside of the loop:
my $headers = $csv->getline($fh) or die "no header";
Assign these to be the column names:
$csv->column_names(#$headers);
Then, each call to getline_hr will provide a hashref:
while (my $hashref = $csv->getline_hr($fh)) {
push #$result, $hashref;
}
We can also use getline_hr_all:
$result = $csv->getline_hr_all($fh);
In other words, it ain't complex, most pieces are already provided by Text::CSV, and it can be done in very few lines.
Also, a module like this seems to already exist: Text::CSV::Slurp. (note: reverse dependency search through metacpan is awesome)
It's probably not a standard feature because different people will want their CSV files parsed into different data structures.
Why not create your own module that wraps this function?
package CSVRead;
use strict;
use warnings;
use 5.010;
use open qw(:std :utf8);
use Text::CSV;
require Exporter;
our #ISA = qw(Exporter);
our #EXPORT = qw(read_csv);
sub read_csv {
my ($filename) = #_;
my #first_line;
my $result;
my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";
while (my $row = $csv->getline ($fh)) {
if (not #first_line) {
#first_line = #{$row};
} else {
push #{$result}, { map { $first_line[$_] => $row->[$_] } 0..$#first_line };
}
}
close $fh;
return $result;
}
Then, use it like this:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper;
use CSVRead;
my $data = read_csv('sample.csv');
say Dumper $data;

XML reading using Perl

I am new to the Perl language. I have an XML like,
<xml>
<date>
<date1>2012-10-22</date1>
<date2>2012-10-23</date2>
</date>
</xml>
I want to parse this XML file & store it in array. How to do this using perl script?
Use XML::Simple - Easy API to maintain XML (esp config files) or
see XML::Twig - A perl module for processing huge XML documents in tree mode.
Example like:
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
my $xml = q~<xml>
<date>
<date1>2012-10-22</date1>
<date2>2012-10-23</date2>
</date>
</xml>~;
print $xml,$/;
my $data = XMLin($xml);
print Dumper( $data );
my #dates;
foreach my $attributes (keys %{$data->{date}}){
push(#dates, $data->{date}{$attributes})
}
print Dumper(\#dates);
Output:
$VAR1 = [
'2012-10-23',
'2012-10-22'
];
Here's one way with XML::LibXML
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $doc = XML::LibXML->load_xml(location => 'data.xml');
my #nodes = $doc->findnodes('/xml/date/*');
my #dates = map { $_->textContent } #nodes;
Using XML::XSH2, a wrapper around XML::LibXML:
#!/usr/bin/perl
use warnings;
use strict;
use XML::XSH2;
xsh << '__XSH__';
open 2.xml ;
for $t in /xml/date/* {
my $s = string($t) ;
perl { push #l, $s }
}
__XSH__
no warnings qw(once);
print join(' ', #XML::XSH2::Map::l), ".\n";
If you can't/don't want to use any CPAN mod:
my #hits= $xml=~/<date\d+>(.+?)<\/date\d+>/
This should give you all the dates in the #hits array.
If the XML isn't as simple as your example, using a XML parser is recommended, the XML::Parser is one of them.

Devel::Declare removes line from script

I am trying to learn Devel::Declare so as to attempt to reimplement something like PDL::NiceSlice without source filters. I was getting somewhere when I noticed that it was removing the next line from my script. To illustrate I have made this minimal example wherein one can use the comment keyword to remove the entire line from the code, allowing a compile even though barewords abound on that line.
#Comment.pm
package Comment;
use strict;
use warnings;
use Devel::Declare ();
sub import {
my $class = shift;
my $caller = caller;
Devel::Declare->setup_for(
$caller,
{ comment => { const => \&parser } }
);
no strict 'refs';
*{$caller.'::comment'} = sub {};
}
sub parser {
#my $linestr = Devel::Declare::get_linestr;
#print $linestr;
Devel::Declare::set_linestr("");
}
1
and
#!/usr/bin/env perl
#test.pl
use strict;
use warnings;
use Comment;
comment stuff;
print "Print 1\n";
print "Print 2\n";
yields only
Print 2
what am I missing?
P.S. I will probably have a few more questions on D::D coming up if I should figure this one out, so thanks in advance!
Ok so I got it. Using perl -MO=Deparse test.pl you get:
use Comment;
use warnings;
use strict 'refs';
comment("Print 1\n");
print "Print 2\n";
test.pl syntax OK
which tells me that if forces the comment function to be called. After some experimentation I found that I could just set the output to call comment() explicitly so that it doesn't try to call comment on whatever is next.
sub parser {
Devel::Declare::set_linestr("comment();");
}
so that the deparse is:
use Comment;
use warnings;
use strict 'refs';
comment();
print "Print 1\n";
print "Print 2\n";
test.pl syntax OK
and the proper output too.

Importing in hierarchical Perl modules into the local namespace

Situation:
I have a module Foo::Quux::Bar, living in ./Bar.pm. I need to be able to unit test Bar. However, it is not advantageous due to circumstances beyond my control to set up a Foo/Quux directory structure.
So what I'd like to do is have some sort of unit_test_use routine that lets me grab Bar.pm and move/copy its functions into the local namespace(Note that Bar has a package Foo::Quux::Bar specifier) for my testing pleasure.
Grubbing around in the Perl documentation has not helped me.
Assuming your Bar.pm exports its functions in the standard way, you can load it with require and do the import manually:
BEGIN {
require 'Bar.pm'; # now the package Foo::Quux::Bar is set up
Foo::Quux::Bar->import;
};
But it's definitely worth looking into setting up the directory structure in the standard way, if you can.
The example below uses the following Bar.pm:
package Foo::Quux::Bar;
use warnings;
use strict;
sub one { 1 }
sub two { "zwei" }
sub three { 0x3333 }
1;
In your test-bar program, you can install a hook that will use the current directory's Bar.pm with
#! /usr/bin/perl
use warnings;
use strict;
use File::Basename;
BEGIN {
sub find_bar {
my(undef,$name) = #_;
if (basename($name) eq "Bar.pm") {
open my $fh, "<", "./Bar.pm" or die "$0: open ./Bar.pm: $!";
$fh;
}
}
unshift #INC => \&find_bar;
}
Hooks in #INC are documented in the perlfunc documentation for require.
Now to import all subs, ignoring any import in Foo::Quux::Bar,
# fake use Foo::Quux::Bar
BEGIN {
require Foo::Quux::Bar;
{
no strict 'refs';
while (my($name,$glob) = each %Foo::Quux::Bar::) {
if (*{ $glob }{CODE}) {
*{ __PACKAGE__ . "::" . $name } = *{ $glob }{CODE};
}
}
}
}
Back out in the test code where the strict pragma is enabled, we can
print map "$_\n", one, two, three;
and get the following output:
1
zwei
13107
Here's what I wrote:
sub import_module_into_main
{
my ($mod_name, $filename) = #_;
require $filename;
no strict;
foreach my $var ( keys( %{$mod_name . "::"}))
{
$main::{$var} = ${$mod_name. "::"}{$var};
}
}
Invoke with this: import_module_into_main("Foo::Quux::Bar", "Bar.pm").