Match pattern from vector - match

Let's say I'm having this function:
match symbol {
letter # 'a'..'z' => Token { ... },
digit # '0'..'9' => Token { ... },
whitespace # ['\r', '\t', '\n'] => Token {...}
}
This obviously doesn't work. Is there any way to have whitespace use similar construct, without writing this like:
match symbol {
letter # 'a'..'z' => Token { ... },
digit # '0'..'9' => Token { ... },
'\r' => Token {...},
'\n' => Token {...},
'\r' => Token {...},
}

You can combine patterns using the pipe (|) operator:
match symbol {
letter # 'a'..'z' => Token { ... },
digit # '0'..'9' => Token { ... },
'\r' | '\t' | '\n' => Token {...}
}
But binding match to a variable is ugly:
match symbol {
letter # 'a'..'z' => Token { ... },
digit # '0'..'9' => Token { ... },
ws # '\r' | ws # '\t' | ws # '\n' => Token {...}
}

Related

in perl, why does sprintf(Dumper \%hash) throw a warning when the hash contains a long string?

I have been using syntax like the following for months, without triggering a warning:
die join('', sprintf(Dumper [#stack]), sprintf(Dumper {%oprAtnNOW}), 'opt tojudge not specified');
That is, I have used sprintf with Dumper, without specifying a format.
In the following code, we see that this works fine, but only up to a point. When %oprAtnNOW contains a long string, a warning is triggered.
(In all cases the string compiles as a regex; but before compilation, it is nothing but a string.)
What causes the warning with the long string? Why is there a "missing argument"?
Granted, sprintf is supposed to be given a format, as in
https://perldoc.perl.org/functions/sprintf.
But why is this only enforced when a smaller string is replaced by a long string?
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper qw(Dumper);
$Data::Dumper::Sortkeys = 1;
print "Perl version: $^V\n";
my %oprAtnNOW;
my $string='~~~~~1983-10-21 Fri 13:01:13, today we went to the movie.';
%oprAtnNOW = (
Vv => {
v=>[ '(?<a>a)',],
},
);
tryit();
%oprAtnNOW = (
Vv => {
v=>[
'(?m)^(?<boundjour2009>(?<tilde5>[~]{5})[\\x20\\t]*(?<dateISO1mbeWeekdaymbeTIME>(?<dateISO1mbeWeekday>(?<dateISO1>(?<YYYY>[1-9]\\d\\d\\d)[-](?<nMonth2>0[1-9]|1[0-2])[-](?<nMonthDay2>3[01]|[0-2][0-9]))([\\x20\\t]+(?<wWeekdayAllor3>Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sun|Mon|Tue|Wed|Thu|Fri|Sat))?)([\\x20\\t]+(?<nTIMEdiverse>(at[\\x20\\t]+)?((?<HHcMMcSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9]):(?<SS>[0-5][0-9]))|(?<HHMMmbeSS>(?<HHMM>(?<HH>0[0-9]|1[0-9]|2[0-3])(?<MM>[0-5][0-9]))(?<SS>[0-5][0-9])?)|(?<HHcMM_pct_cSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))|(?<HHcMM_stop>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])(?![:][0-5][0-9])))))?))',
],
},
);
tryit();
sub tryit
{
my $rgx=qr/$oprAtnNOW{Vv}->{v}->[0]/;
if($string=~$rgx)
{
print Dumper \%+;
}
print "with format:\n";
print sprintf('%s', Dumper \%oprAtnNOW);
print "WITHOUT format:\n";
print sprintf(Dumper \%oprAtnNOW);
}
The output:
Perl version: v5.18.4
$VAR1 = {
'a' => 'a'
};
with format:
$VAR1 = {
'Vv' => {
'v' => [
'(?<a>a)'
]
}
};
WITHOUT format:
$VAR1 = {
'Vv' => {
'v' => [
'(?<a>a)'
]
}
};
$VAR1 = {
'HH' => '13',
'HHcMMcSS' => '13:01:13',
'MM' => '01',
'SS' => '13',
'YYYY' => '1983',
'boundjour2009' => '~~~~~1983-10-21 Fri 13:01:13',
'dateISO1' => '1983-10-21',
'dateISO1mbeWeekday' => '1983-10-21 Fri',
'dateISO1mbeWeekdaymbeTIME' => '1983-10-21 Fri 13:01:13',
'nMonth2' => '10',
'nMonthDay2' => '21',
'nTIMEdiverse' => '13:01:13',
'tilde5' => '~~~~~',
'wWeekdayAllor3' => 'Fri'
};
with format:
$VAR1 = {
'Vv' => {
'v' => [
'(?m)^(?<boundjour2009>(?<tilde5>[~]{5})[\\x20\\t]*(?<dateISO1mbeWeekdaymbeTIME>(?<dateISO1mbeWeekday>(?<dateISO1>(?<YYYY>[1-9]\\d\\d\\d)[-](?<nMonth2>0[1-9]|1[0-2])[-](?<nMonthDay2>3[01]|[0-2][0-9]))([\\x20\\t]+(?<wWeekdayAllor3>Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sun|Mon|Tue|Wed|Thu|Fri|Sat))?)([\\x20\\t]+(?<nTIMEdiverse>(at[\\x20\\t]+)?((?<HHcMMcSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9]):(?<SS>[0-5][0-9]))|(?<HHMMmbeSS>(?<HHMM>(?<HH>0[0-9]|1[0-9]|2[0-3])(?<MM>[0-5][0-9]))(?<SS>[0-5][0-9])?)|(?<HHcMM_pct_cSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))|(?<HHcMM_stop>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])(?![:][0-5][0-9])))))?))'
]
}
};
WITHOUT format:
Missing argument in sprintf at /Users/kpr/u/kh/bin/z.pl line 38.
Invalid conversion in sprintf: "%:" at /Users/kpr/u/kh/bin/z.pl line 38.
$VAR1 = {
'Vv' => {
'v' => [
'(?m)^(?<boundjour2009>(?<tilde5>[~]{5})[\\x20\\t]*(?<dateISO1mbeWeekdaymbeTIME>(?<dateISO1mbeWeekday>(?<dateISO1>(?<YYYY>[1-9]\\d\\d\\d)[-](?<nMonth2>0[1-9]|1[0-2])[-](?<nMonthDay2>3[01]|[0-2][0-9]))([\\x20\\t]+(?<wWeekdayAllor3>Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sun|Mon|Tue|Wed|Thu|Fri|Sat))?)([\\x20\\t]+(?<nTIMEdiverse>(at[\\x20\\t]+)?((?<HHcMMcSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9]):(?<SS>[0-5][0-9]))|(?<HHMMmbeSS>(?<HHMM>(?<HH>0[0-9]|1[0-9]|2[0-3])(?<MM>[0-5][0-9]))(?<SS>[0-5][0-9])?)|(?<HHcMM_pct_cSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))|(?<HHcMM_stop>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])(?![:][0-5][0-9])))))?))'
]
}
};
It's not because of the length, but because the long string contains a percent sign.
...(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))...
~
As it's the only argument, it's interpreted as the format.
You can demonstrate the same behaviour with much shorter strings, e.g.
sprintf '%';
If you don't need to format, just use print:
print Dumper \%oprAtnNOW;

I have two arrays roles added and roles deleted, want to merge them and write in this order in perl

I have another scenario.
My file has the following fields:
Username | Roles | Type |date |
abc|admin |added | 01072015
abc|developer |deleted |01072015
abc|deploy |added |01072015
xyz |admin |deleted |01072015
xyz| deploy|deleted|01072015
cdf|deploy|added|01072015
Note here, the date is going to be the same day, so no change
now, I want this to be printed as
username |roles_added |roles deleted |date
abc |admin,deploy |developer |01072015
xyz ||admin,deploy |01072015
cdf |deploy||01072015
I tried the below approach given, but didn't work out for me. Please guide me.
#!/usr/bin/perl
open(FIL,"report.txt") or die("$!");
my %k=();
while (my $line=<FIL>) {
my ($user,$roles,$type,$dt)=split(/\|/,$line);
$k{$user}{$roles}=1;
$k{$user}{$type}=1;
}
my #names=(sort keys(%k));
foreach my $name (#names) {
foreach my $value ( (keys(%{$k{$name}})) ){
print "$value ";
}
print "$name\n";
}
print " i am here \n";
while( my ($k, $v) = each %$k ) {
print "key: $k, value: $v.\n";
}
When dealing with complex data structures, Data::Dumper is your friend. Try adding use Data::Dumper to your code and then print Dumper \%k just after you've finished building your %k hash. You'll see something like this:
$VAR1 = {
'xyz' => {
' deploy' => 1,
'deleted' => 1
},
'Username ' => {
' Type ' => 1,
' Roles ' => 1
},
'xyz ' => {
'deleted ' => 1,
'admin ' => 1
},
'cdf' => {
'deploy' => 1,
'added' => 1
},
'abc' => {
'deleted ' => 1,
'deploy ' => 1,
'developer ' => 1,
'added ' => 1,
'admin ' => 1
}
};
See how the keys in the sub-hashes are names after two types of things. Half of them are the role names and half of them are "added" or "deleted". It's going to be really hard to get anything useful out of that data structure, so let's try something different.
Where you had:
$k{$user}{$roles}=1;
$k{$user}{$type}=1;
Try this instead:
push #{$k{$user}{type}}, $role;
Now our data structure looks like this:
$VAR1 = {
'xyz' => {
'deleted' => [
' deploy'
]
},
'xyz ' => {
'deleted ' => [
'admin '
]
},
'cdf' => {
'added' => [
'deploy'
]
},
'abc' => {
'deleted ' => [
'developer '
],
'added ' => [
'admin ',
'deploy '
]
}
};
I think you can see that it's far easier to get the information you want out of this data structure. It's basically:
foreach (#names) {
print join ',', #{$k{$_}{added}};
print join ',', #{$k{$_}{deleted}};
}
You'll need to change the code a little to get exactly what you want.
Oh, and please get into the habit of adding use strict and use warnings to all of your code. They would have shown you why the debugging output at the end of your original code wasn't working.
Update: I threw together a complete solution.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my %k;
<>; # Skip header
while (<>) {
chomp;
my ($user, $roles, $type, $dt) = split(/\s*\|\s*/);
push #{$k{$user}{$type}}, $roles;
$k{$user}{date} = $dt;
}
say 'username |roles_added |roles deleted |date';
foreach my $name (sort keys %k) {
say "$name |",
join(',',#{$k{$name}{added} || []}), ' |',
join(',',#{$k{$name}{deleted} || []}), ' |',
$k{$name}{date};
}
You need to pass it the name of the input file as a command line argument.

Concise way to make a 0+ length list in Marpa grammar?

I'm new to Marpa. I've tried a couple ways to describe a list of 0 or more terms in my grammar, and I want to avoid multiple parse trees.
My language will have exactly 1 component followed by 0+ subcomponents:
package => component-rule [subcomponent-rule ...]
What I tried first was this:
{ lhs => 'Package', rhs => [qw/component-rule subcomponents/] },
{ lhs => 'subcomponents', rhs => [qw/subcomponent-list/] },
{ lhs => 'subcomponent-list', rhs => [qw/subcomponent-rule/], action => 'do_subcomponent_list' },
{ lhs => 'subcomponent-list', rhs => [qw/subcomponent-list subcomponent-rule/], action => 'do_subcomponent_list' },
{ lhs => 'subcomponent-list', rhs => [qw//], action => 'do_subcomponent_empty_list' },
{ lhs => 'subcomponent-rule', rhs => [qw/subcomponent subcomponent-name/], action => 'do_subcomponent' },
(Full code at end of post.)
Here's my input:
$recce->read( 'component', );
$recce->read( 'String', 'MO Factory');
$recce->read( 'subcomponent', );
$recce->read( 'String', 'Memory Wipe Station');
$recce->read( 'subcomponent', );
$recce->read( 'String', 'DMO Tour Robot');
I get two parse trees, the first one with an undesirable undef, and the second one which I prefer. Both give the list back as inherently a tree.
$VAR1 = [
{
'Component' => 'MO Factory'
},
[
[
{
'Subcomponent' => undef
},
{
'Subcomponent' => 'Memory Wipe Station'
}
],
{
'Subcomponent' => 'DMO Tour Robot'
}
]
];
$VAR2 = [
{
'Component' => 'MO Factory'
},
[
{
'Subcomponent' => 'Memory Wipe Station'
},
{
'Subcomponent' => 'DMO Tour Robot'
}
]
];
The nullable rule for subcomponent-list was to allow the case of 0 subcomponents, but it introduces the null element on the front of a list of 1+ subcomponents, which is an alternate parse. (Marpa descends the cycle only once, thank goodness.)
My other idea was to make subcomponent-list non-nullable, and introduce an intermediate rule that is 0 or 1 subcomponent-lists:
{ lhs => 'subcomponents', rhs => [qw//] },
{ lhs => 'subcomponents', rhs => [qw/subcomponent-list/] },
This at least eliminated the multiple parse, but I still have a cycle, and a messy nested tree to compress.
Is there a more direct way to make a 0+ length list or otherwise make a symbol optional?
Full sample code:
#!/usr/bin/perl
use Marpa::R2;
use Data::Dumper;
my $grammar = Marpa::R2::Grammar->new(
{ start => 'Package',
actions => 'My_Actions',
default_action => 'do_what_I_mean',
rules => [
{ lhs => 'Package', rhs => [qw/component-rule subcomponents/] },
{ lhs => 'component-name', rhs => [qw/String/] },
{ lhs => 'component-rule', rhs => [qw/component component-name/], action => 'do_component' },
{ lhs => 'subcomponent-name', rhs => [qw/String/] },
{ lhs => 'subcomponent-rule', rhs => [qw/subcomponent subcomponent-name/], action => 'do_subcomponent' },
{ lhs => 'subcomponents', rhs => [qw//] },
{ lhs => 'subcomponents', rhs => [qw/subcomponent-list/] },
{ lhs => 'subcomponent-list', rhs => [qw/subcomponent-rule/], action => 'do_subcomponent_list' },
{ lhs => 'subcomponent-list', rhs => [qw/subcomponent-list subcomponent-rule/], action => 'do_subcomponent_list' },
# { lhs => 'subcomponent-list', rhs => [qw//], action => 'do_subcomponent_empty_list' },
# { lhs => 'subcomponent-list', rhs => [qw//], },
],
}
);
$grammar->precompute();
my $recce = Marpa::R2::Recognizer->new( { grammar => $grammar } );
$recce->read( 'component', );
$recce->read( 'String', 'MO Factory');
if (1) {
$recce->read( 'subcomponent', );
$recce->read( 'String', 'Memory Wipe Station');
$recce->read( 'subcomponent', );
$recce->read( 'String', 'DMO Tour Robot');
$recce->read( 'subcomponent', );
$recce->read( 'String', 'SMO Break Room');
}
my #values = ();
while ( defined( my $value_ref = $recce->value() ) ) {
push #values, ${$value_ref};
}
print "result is ",Dumper(#values),"\n";
sub My_Actions::do_what_I_mean {
print STDERR "do_what_I_mean\n";
# The first argument is the per-parse variable.
# At this stage, just throw it away
shift;
# Throw away any undef's
my #children = grep { defined } #_;
# Return what's left
return scalar #children > 1 ? \#children : shift #children;
}
sub My_Actions::do_component {
my ( undef, $t1 ) = #_;
print STDERR "do_component $t1\n";
my $href = { 'Component' => $t1 };
return $href;
}
sub My_Actions::do_subcomponent{
my ( undef, $t1 ) = #_;
print STDERR "do_subcomponent $t1\n";
my $href = { 'Subcomponent' => $t1 };
return $href;
}
sub My_Actions::do_subcomponent_empty_list
{
print STDERR "do_subcomponent_empty_list\n";
my $href = { 'Subcomponent' => undef };
return $href;
}
sub My_Actions::do_subcomponent_list{
# The first argument is the per-parse variable.
# At this stage, just throw it away
shift;
# Throw away any undef's
my #children = grep { defined } #_;
print STDERR "do_subcomponent_list size ",scalar(#children),"\n";
# Do this to collapse recursive trees to a list:
# #children = map { ref $_ eq "ARRAY" ? #{$_} : $_; } #children;
return scalar #children > 1 ? \#children : shift #children;
}
Specify a sequence rule with the min argument. The value may either be 0 (aka the * quantifier in regexes) or 1 (aka the + quantifier). You can do this by removing the subcomponents and subcomponent-list rules. Instead add:
{
lhs => 'subcomponents',
rhs => ['subcomponent-rule'],
min => 0,
action => 'do_subcomponent_list',
}
Your grammar then runs without further modifications.
Using sequence rules is preferable: No flattening has to take place, and the grammar should be more efficient.
Note that you are encouraged to use the Scanless Interface. The DSL abstracts nicely over this issue:
subcomponents ::= <subcomponent rule>* action => do_subcomponent_list

unable to access hash of array element created using XML parser

My XML Parser looks as below:
$VAR1 = {
'Symmetrix' => {
'Masking_View' => {
'View_Info' => {
'Initiators' => {
'user_node_name' => [
'5001438001725614',
'5001438001725714'
],
'user_port_name' => [
'5001438001725614',
'5001438001725714'
],
'wwn' => [
'5001438001725614',
'5001438001725714'
]
},
'port_grpname' => 'PG_1E0_2E0'
}
},
'Symm_Info' => {
'symid' => '000295900074'
}
}
};
I am trying to pull element of wwn. But I'm not able to get through.
#!C:\Perl\bin
use strict;
use XML::Simple;
use Data::Dumper;
my $input_file = $ARGV[0];
my $detail_info = XMLin("$input_file");
# Loop through each view_info
$detail_info->{Symmetrix}{Masking_View}{View_Info} = [ $detail_info->{Symmetrix} {Masking_View}{View_Info} ] if ref ($detail_info->{Symmetrix}{Masking_View}{View_Info}) ne 'ARRAY';
foreach my $view_info (#{$detail_info-> {Symmetrix}{Masking_View}{View_Info}})
{
$view_info->{Initiators} = [$view_info->{Initiators}] if ref ($view_info-> {Initiators}) ne 'ARRAY';
foreach my $wwn (keys %{$view_info->{Initiators}})
{
my #flags = ();
push (#flags,"$wwn:$view_info->{Initiators}{$wwn}";
print #flags;
#"{$wwn->{wwn}}";
}
}
I am getting output as below;
{ARRAY(0x20c8904)}
I am looking for wwn element in single line of different line.
You're producing too much code to distinguish arrayref values from non-reference values. Strict mode requires you to set the ForceArray options, and thus you can be certain that every value is an arrayref, even if there's just one element in it.
use strictures;
use XML::Simple qw(:strict);
my $detail_info = XMLin($ARGV[0], ForceArray => 1, KeyAttr => []);
# $detail_info is {
# Symmetrix => [
# {
# name => 'Masking_View',
# View_Info => [
# {
# Initiators => [
# {
# user_node_name => [5001438001725614, 5001438001725714],
# user_port_name => [5001438001725614, 5001438001725714],
# wwn => [5001438001725614, 5001438001725714]
# }
# ],
# port_grpname => 'PG_1E0_2E0'
# }
# ]
# },
# {
# name => 'Symm_Info',
# symid => '000295900074'
# }
# ]
# }
my #flags;
for my $view_info (#{ $detail_info->{Symmetrix}[0]{View_Info} }) {
for my $initiator (#{ $view_info->{Initiators} }) {
push #flags, $initiator->{wwn};
}
}
# #flags is (
# [
# 5001438001725614,
# 5001438001725714
# ]
# )

In Perl, how can I skip an empty key when traversing a hash?

This is my problem, I'm not very knowledgeable in Perl, and I have this function that needs to be fixed.
When this function deviceModelMenu() is called, the CLI displays the following text:
The following models are available
==================================================
1.
2. Cisco1240
3. Catalyst3750
4. Catalyst3650
5. HP2524
The first item is empty, which is wrong, and I need to fix that, the piece of code that displays this menu is:
my $features = shift;
print "=" x 50, "\n";
print "The following models are available\n";
print "=" x 50, "\n";
my $i=1;
foreach (keys %{$features->{features}[0]->{deviceModel}})
{
print "$i. $_ \n";
$i++;
}
If I add the following line:
warn Dumper($features->{features}[0]->{deviceModel});
It dumps this:
$VAR1 = {
'deviceModel' => {
'' => {
'cfg' => []
},
'Cisco1240' => {
'cfg' => [
'cisco1240feature.cfg'
]
},
'Catalyst3750' => {
'cfg' => [
'catalyst3750feature.cfg'
]
},
'Catalyst3650' => {
'cfg' => [
'catalyst3650feature.cfg'
]
},
'HP2524' => {
'cfg' => [
'hp2524feature.cfg'
]
}
}
};
As you may notice, the first item is indeed empty. I added the following line to skip it, and just print the rest of the info:
if ($_ eq '') {
shift;
}
But it doesn't seem to work do what I want. I want to skip the item if it's empty.
Well, shifting #ARGV (implicit argument to shift in main program) nor shifting #_ (implicit argument of shift in a function) are not going to help you, because you are not printing either of them.
You can either:
Not add the '' entry in the first place (depends on how it's generated)
Remove the '' entry before printing:
delete $features->{features}[0]->{deviceModel}->{''};
Don't print the entry:
if($_ eq '') {
next;
}
or
if($_ ne '') {
print "$i. $_ \n";
$i++;
}
foreach (keys %{$features->{features}[0]->{deviceModel}})
{
next unless length($_);
print "$i. $_ \n";
$i++;
}
#!/usr/bin/env perl
use strict; use warnings;
my $devices = {
'deviceModel' => {
'' => { 'cfg' => [] },
'Cisco1240' => { 'cfg' => ['cisco1240feature.cfg' ] },
'Catalyst3750' => { 'cfg' => [ 'catalyst3750feature.cfg' ]},
'Catalyst3650' => { 'cfg' => [ 'catalyst3650feature.cfg' ]},
'HP2524' => { 'cfg' => [ 'hp2524feature.cfg' ]},
}
};
{
my $item = 1;
for my $d (grep length, keys %{ $devices->{deviceModel} }) {
printf "%2d. %s\n", $item++, $d;
}
}
Output:
1. Catalyst3750
2. Cisco1240
3. Catalyst3650
4. HP2524