Store and get values from Perl hash tables

I want to store names in a hash or array. They come in pairs of the form apple<->banana, orange<->papaya.
I have only half of the information, such as apple or papaya, and I need to look it up in that hash table, get the full combination apple<->banana, and store it in a variable.
I hope my question is clear. I have read a few documents about hashes, and they all describe searching by the full name, whereas I need to search by half of the name, or by the first word.

Assuming your input file is like:
apple<->banana
orange<->papaya
Here is a way to do the job:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';   # say is not available without this
my %corresp;
while (<DATA>) {
    chomp;
    my ($k, $v) = split /<->/, $_;
    $corresp{$k} = $v;
}
# Swap keys and values so the second half can be looked up too
my %reverse = reverse %corresp;
my $search = 'apple';
if (exists $corresp{$search}) {
    say "$search = $corresp{$search}";
} elsif (exists $reverse{$search}) {
    say "$search = $reverse{$search}";
} else {
    say 'No match!';
}
__DATA__
apple<->banana
orange<->papaya

Try this one:
my %hash = (
    'apple'  => 'banana',
    'orange' => 'papaya'
);
## the word we are looking for
my $word = 'orange';
## check using the key
if (defined($hash{$word})) {
    print "$word <=> $hash{$word}\n";
}
## check using the value
else {
    ## handles only the first matching value, not all of them
    my ($key) = grep { $hash{$_} eq $word } keys %hash;
    print "$key <=> $hash{$key}\n" if defined $key;
}
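Both answers above look the word up on one side and then fall back to the other. Another way to sketch it, assuming each word appears in only one pair, is to index both halves up front so that a single lookup returns the full combination. The names %full_for and full_pair are made up for this illustration.

```perl
use strict;
use warnings;

# Sample pairs taken from the question.
my @pairs = ('apple<->banana', 'orange<->papaya');

# Map each half of a pair to the full "left<->right" string.
my %full_for;
for my $pair (@pairs) {
    my ($left, $right) = split /<->/, $pair, 2;
    $full_for{$left}  = $pair;
    $full_for{$right} = $pair;
}

# Look up by either half; returns undef when the word is unknown.
sub full_pair {
    my ($word) = @_;
    return $full_for{$word};
}

for my $word (qw(apple papaya kiwi)) {
    my $found = full_pair($word);
    print defined $found ? "$word => $found\n" : "$word => no match\n";
}
```

If a word can appear in several pairs, the values would have to be array references rather than single strings.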

Related

Split string into a hash of hashes (perl)

At the moment I'm a little confused.
I am looking for a way to write a string with an indefinite number of words (separated by slashes) into a nested hash.
These "strings" are output from a text database.
Given is for example
"office/1/hardware/mouse/count/200"
the next one can be longer or shorter..
This must be created from it:
{
office {
1{
hardware {
mouse {
count => 200
}
}
}
}
}
Any idea ?

Work backwards. Split the string. Use the last two elements to make the inner-most hash. While more words exist, make each one the key of a new hash, with the inner hash as its value.
my $s = "office/1/hardware/mouse/count/200";
my @word = split(/\//, $s);
# Bottom level taken explicitly
my $val = pop @word;
my $key = pop @word;
my $h = { $key => $val };
# Test the array, not the popped value, so a key of "0" is not skipped
while (@word)
{
    my $key = pop @word;
    $h = { $key => $h };
}

A simple recursive function should do:
use strict;
use warnings;
use Data::Dumper;
sub foo {
    my $str = shift;
    # Split off the first word; the limit of 2 keeps the rest intact
    my ($key, $rest) = split m|/|, $str, 2;
    if (defined $rest) {
        return { $key => foo($rest) };
    } else {
        return $key;
    }
}
my $hash = foo("foo/bar/baz/2");
print Dumper $hash;
Gives output
$VAR1 = {
'foo' => {
'bar' => {
'baz' => '2'
}
}
};
But like I said in the comment: What do you intend to use this for? It is not a terribly useful structure.

If there are many lines to be read into a single hash and the lines have a variable number of fields, you have big problems, and the other two answers will clobber data by either smashing sibling keys or overwriting final values. I'm supposing this is the case because there is no rational reason to convert a single line into a hash.
You will have to walk down the hash with each field. This will also give you the most control over the process.
our $hash = {};
our $eolmark = "\000";
while (my $line = <...>) {
    chomp $line;
    my @fields = split /\//, $line;
    my $count = @fields;
    my $h = $hash;
    my $i = 0;
    map { (++$i == $count) ?
        ($h->{$_}{$eolmark} = 1) :
        ($h = $h->{$_} ||= {});
    } @fields;
}
$h->{$_}{$eolmark} = 1: You need the special "end of line" key so that you can recognize the end of a record and still permit longer records to coexist. If you had two records, foo/bar/baz and foo/bar/baz/quux, the second would overwrite the final value of the first.
$h = $h->{$_} ||= {}: This statement is a very handy idiom to both create and populate a cache in one step and then take a shortcut reference to it. Never do a hash lookup more than once.
HTH
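Since the input source in the code above is elided, here is a small self-contained sketch of the same walk-down idea with two sample records. The helper name insert_path and the variable $EOL are made up for this illustration, and it uses //= (Perl 5.10 or later) where the answer uses ||=.

```perl
use strict;
use warnings;
use Data::Dumper;

my $EOL = "\000";   # end-of-record marker key, same idea as $eolmark above
my %tree;

# Walk down the tree one field at a time, creating levels as needed,
# then mark the node where the record ends.
sub insert_path {
    my ($root, $line) = @_;
    my $node = $root;
    $node = $node->{$_} //= {} for split m{/}, $line;
    $node->{$EOL} = 1;
    return $root;
}

insert_path(\%tree, $_) for qw(foo/bar/baz foo/bar/baz/quux);

$Data::Dumper::Sortkeys = 1;
print Dumper(\%tree);
```

Both records survive: the baz level holds the marker for the shorter record as well as the quux branch for the longer one.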

Accessing and modifying a nested hash based on a dot separated string

I have a string as input, say apple.mango.orange = 100
I also have a hash reference:
$inst = {
'banana' => 2,
'guava' => 3,
'apple' => {
'mango' => {
'orange' => 80
}
}
};
I want to modify the value of orange using the input string. Can someone please help me how I could do this?
I tried splitting the string into (key, value) pair. I then did the following on the key string:
my $key2 = "\$inst->{".$key."}";
$key2 =~ s/\./}->{/g;
$$key2 = $value;
This does not work as intended. Can someone help me out here? I have read the Perl FAQ about not using a variable value as variable but I am unable to think of an alternative.

You are building a string that consists of (buggy) Perl code, but you never ask Perl to execute it. Executing it isn't the right approach anyway.
sub dive_val :lvalue {
    my $p = \shift;
    $p = \($$p->{$_}) for @_;
    $$p
}
my @key = split /\./, "apple.mango.orange";
dive_val($inst, @key) = $value;
or
use Data::Diver qw( DiveVal );
my @key = split /\./, "apple.mango.orange";
DiveVal($inst, map \$_, @key) = $value;

Not only is a symbolic reference a very bad idea here, it doesn't even solve your problem. You're building an expression in $key2, and just jamming another dollar sign in front of its name won't make perl execute that code. For that you would need eval, which is another bad idea.
You can install and use the Data::Diver module, which does exactly this sort of thing, or you can simply loop over the list of hash keys, picking up a new hash reference each time and assigning the value to the element with the last key.
The biggest issue is actually parsing the incoming string into a list of keys and a value. This code implements a subroutine apply which applies the implied operation in the string to a nested hash. Unless you are confident of your data, it needs some error checking added to make sure each of the keys in the list exists. The Data::Dumper output is just to demonstrate the validity of the result.
use strict;
use warnings 'all';
use Data::Dumper;
my $inst = { 'banana' => 2, 'guava' => 3, 'apple' => { 'mango' => { 'orange' => 80 } } };
my $s = 'apple.mango.orange = 100';
apply($s, $inst);
print Dumper $inst;
sub apply {
    my ($operation, $data) = @_;
    my ($keys, $val) = $operation =~ /([\w.]+)\s*=\s*(\d+)/;
    my @keys = split /\./, $keys;
    my $last = pop @keys;
    my $hash = $data;
    $hash = $hash->{$_} for @keys;
    $hash->{$last} = $val;
}
output
$VAR1 = {
'banana' => 2,
'apple' => {
'mango' => {
'orange' => '100'
}
},
'guava' => 3
};
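The error checking mentioned above could look something like the following. This is only a sketch under one assumption: that an assignment should be rejected unless every key in the path already exists. The name set_existing is made up for this illustration.

```perl
use strict;
use warnings;

# Assign through a dotted path, but refuse to create new keys.
sub set_existing {
    my ($data, $operation) = @_;

    my ($path, $value) = $operation =~ /^\s*([\w.]+)\s*=\s*(\d+)\s*$/
        or die "Cannot parse '$operation'\n";

    my @keys = split /\./, $path;
    my $last = pop @keys;

    my $node = $data;
    for my $key (@keys) {
        # Reading the element does not autovivify it
        die "No nested hash at '$key' in '$path'\n"
            unless ref($node->{$key}) eq 'HASH';
        $node = $node->{$key};
    }
    die "No key '$last' in '$path'\n" unless exists $node->{$last};

    $node->{$last} = $value;
    return $value;
}

my $inst = { banana => 2, guava => 3, apple => { mango => { orange => 80 } } };
set_existing($inst, 'apple.mango.orange = 100');
print "orange is now $inst->{apple}{mango}{orange}\n";

# An unknown path is reported instead of silently creating keys.
eval { set_existing($inst, 'apple.kiwi.orange = 1'); 1 }
    or print "Rejected: $@";
```

Whether a missing key should be an error or should be created on the fly depends on the data, so treat the die calls as one possible policy.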

How to use the reference concept and access elements of subroutine arguments using Perl?

I am writing code that calls a subroutine which has 4 arguments (3 hashes and one file handle). I want to know how to access them in the subroutine. My code is below.
#print OUTFILE "Content of TPC file:.\n";
my $DATA_INFO = $ARGV[0];
my $OUT_DIR = $ARGV[1];
my $log= "$OUT_DIR/log1";
open(LOG1,">$log");
require "$DATA_INFO";
my $SCRIPT_DIR = $ENV{"SCRIPT_DIR"} ;
require "$SCRIPT_DIR/cmp_fault.pl";
require "$SCRIPT_DIR/pattern_mismatch.pl";
require "$SCRIPT_DIR/scan_count.pl";
print "\nComparing data:\n\n" ;
pattern_mismatch("\%data","\%VAR1","\%status",*LOG1);
cmp_fault("\%data","\%VAR1","\%status",*LOG1);
scan_count("\%data","\%status",*LOG1);
print "\n Comparison done:\n";
foreach $pattern (keys %status) {
print "pattern";
foreach $attr (keys %{$status{$pattern}}) {
print ",$attr";
}
print "\n";
last;
}
#Print Data
foreach $pattern (keys %status) {
print "$pattern";
foreach $attr (keys %{$status{$pattern}}) {
print ",$status{$pattern}{$attr}";
}
print "\n";
Sub routine cmp_fault is here:
sub cmp_fault {
use strict;
use warning;
$data_ref= $_[0];;
$VAR1_ref= $_[1];
$status_ref = $_[2];
$log1_ref=$_[3];
# print LOG1"For TPC : First find the pattern and then its fault type\n";
for $pat ( keys %$data_ref ) {
print "fgh:\n$pat,";
for $key (keys %{$data_ref{$pat}}) {
if($key=~/fault/){
print LOG1 "$key:$data_ref{$pat}{$key},\n";
}
}
}
# print LOG1 "\nFor XLS : First find the pattern and then its pattern type\n";
for $sheet (keys %$VAR1_ref){
if ("$sheet" eq "ATPG") {
for $row (1 .. $#{$VAR1_ref->{$sheet}}) {
$patname = $VAR1_ref->{'ATPG'}[$row]{'Pattern'} ;
next if ("$patname" eq "") ;
$faultXls = $VAR1_ref->{'ATPG'}[$row]{'FaultType'} ;
# print LOG1 " $patname==>$faultXls \n";
if (defined $data{$patname}{'fault'}) {
$faultTpc = $data{$patname}{'fault'} ;
# print LOG1 "\n $patname :XLS: $faultXls :TPC: $faultTpc\n";
if("$faultXls" eq "$faultTpc") {
print LOG1 "PASS: FaultType Matched $patname :XLS: $faultXls :TPC: $faultTpc\n\n\n";
print "PASS: FaultType Matched $patname :XLS: $faultXls :TPC: $faultTpc\n\n";
$status_ref->{$patname}{'FaultType'} = PASS;
}
else {
print LOG1 "FAIL: FaultType Doesn't Match\n\n";
$status_ref->{$patname}{'FaultType'} = Fail;
}
}
}
}
}
}
return 1;

When passing parameters to a subroutine, you can only ever pass a single list of parameters.
For scalars, this isn't a problem. If all you're acting on is a single array, this also isn't a problem.
If you need to send scalars and an array or hash, then the easy way is to 'extract' the scalar parameters first, and then treat 'everything else' as the list.
use strict;
use warnings;
sub scalars_and_array {
    my ( $first, $second, @rest ) = @_;
    print "$first, $second, ", join( ":", @rest ), "\n";
}
scalars_and_array( "1", "2", "3", 4, 5, 6 );
But it should be noted that by doing so you're passing values. You can do this with hashes too.
To pass data structure references, it's as you note: pass by reference, then dereference. It's worth knowing that the -> operator is handy here, because it dereferences the reference and accesses the hash element in one step.
use strict;
use warnings;
use Data::Dumper;
sub pass_hash {
    my ( $hashref ) = @_;
    print $hashref,"\n";
    print $hashref -> {"one"},"\n";
    print $hashref -> {"fish"} -> {"haddock"};
}
my %test_hash = ( "one" => 2,
"three" => 4,
"fish" => { "haddock" => "plaice" }, );
pass_hash ( \%test_hash );
print "\n";
print Dumper \%test_hash;
The core of your problem here though, is that you haven't turned on strict and warnings which would tell you that:
for $pat ( keys %data_ref ) {
is wrong - there is no hash called data_ref there's only a scalar (which holds a hash reference) called $data_ref.
You need %$data_ref here.
And here:
for $key ( keys %{ $data{$pat} } ) {
You also have no $data - your code says $data_ref. (You might have %data in scope, but that's a really bad idea to mess around with within a sub).
There's a bunch of other errors - which would also be revealed by strict and warnings. That's a very basic debugging step, and you will generally get a much better response from Stack Overflow if you do this before asking for assistance. So please - do that, tidy up your code and remove errors/warnings. If you are still having problems after that, then by all means make a post outlining where and what problem you're having.
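To tie the points above together, here is a minimal sketch of a subroutine that takes two hash references and a filehandle and runs cleanly under strict and warnings. It is not a fixed version of cmp_fault: the sample data, the name report_faults and the 'SEEN' status value are all made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the asker's %data and %status.
my %data = ( pat1 => { fault => 'stuck', scan => 3 }, pat2 => { scan => 5 } );
my %status;

# Unpack all arguments from @_ in one step, then dereference with ->.
sub report_faults {
    my ($data_ref, $status_ref, $log_fh) = @_;

    for my $pat (sort keys %$data_ref) {
        for my $key (sort keys %{ $data_ref->{$pat} }) {
            next unless $key =~ /fault/;
            print {$log_fh} "$pat: $key = $data_ref->{$pat}{$key}\n";
            $status_ref->{$pat}{FaultType} = 'SEEN';
        }
    }
    return 1;
}

# An in-memory filehandle keeps the sketch self-contained; a real script
# would open a log file instead.
my $log = '';
open my $log_fh, '>', \$log or die "Cannot open in-memory log: $!";
report_faults(\%data, \%status, $log_fh);
close $log_fh;

print $log;
```

Passing \%data rather than the string "\%data" is what makes the first argument a real reference, and a lexical filehandle can be passed around like any other scalar.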

Dealing with multiple capture groups in multiple records

Data Format:
attribname: data
Data Example:
cheese: good
pizza: good
bagel: good
fire: bad
Code:
my $subFilter='(.+?): (.+)';
my @attrib = ($dataSet=~/$subFilter/g);
for (@attrib)
{
    print "$_\n";
}
The code spits out:
cheese
good
pizza
good
[etc...]
I was wondering what an easy Perly way to do this is. I am parsing the data from a log; the data above is a throwaway example for simplicity. I am new to Perl. I suspect I could do this by finagling indexes, but I was wondering if there is a shorter method. Is there any way to have the capture groups put into two different variables instead of serially appended to the list along with all matches?
Edit: I want the attribute and its associated value together so I can then do what I need to with them, for example by accessing both the attribute name and the attribute value within my for loop.
Edit:
I tried
my %attribs;
while (my $line = <$data>)
{
my ($attrib, $value) = ($line=~m/$subFilter/);
print $attribs{$attrib}," : ", $value,"\n";
}
and no luck :( I don't get any output with this. My data is in a variable, not a file, because it is parsed out of a set of parent data which is in a file. It would be convenient if list assignment worked so that my (@attrib, @value) = ($line=~/$subFilter/g); filled the lists appropriately with the multiple matches.
Solution:
my @line = ($7 =~/(.+?)\n/g);
for (@line)
{
    my ($attrib, $value) = ($_=~m/$subFilter/);
    if ($attrib ne "")
    {
        print $attrib," : ", $value,"\n";
    }
}

I'm not really clear on what you actually want to store, but here's how you could store the data in a hash table, with '1' indicating good and '0' indicating 'bad':
use strict;
use warnings;
use Data::Dumper;
my %foods;
while (my $line = <DATA>)
{
chomp $line;
my ($food, $good) = ($line =~ m/^(.+?): (.+)$/);
$foods{$food} = ($good eq 'good' ? 1 : 0);
}
print Dumper(\%foods);
__DATA__
cheese: good
pizza: good
bagel: good
fire: bad
This prints:
$VAR1 = {
'bagel' => 1,
'cheese' => 1,
'fire' => 0,
'pizza' => 1
};

A sensible approach would be to make use of the split function:
my %attrib;
open my $data, '<', 'fileName' or die "Unable to open file: $!";
while ( my $line = <$data> ) {
    chomp $line;   # otherwise every value keeps its trailing newline
    my ( $attrib, $value ) = split /:\s*/, $line, 2;
    $attrib{$attrib} = $value;
}
close $data;
foreach my $attrib ( keys %attrib ) {
    print "$attrib: $attrib{$attrib}\n";
}
If you're into one-liners, the following would achieve the same:
$ perl -F'/:\s*/' -lane '$attrib{$F[0]} = $F[1]; } END { print $_,"\t",$attrib{$_} foreach keys %attrib;' fileName
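Because the asker's data lives in a variable rather than a file, it may help to see the same thing done directly on a string. This is a sketch assuming one "name: value" pair per line, with the sample data from the question. A global match in scalar context hands back one pair per loop iteration.

```perl
use strict;
use warnings;

my $dataSet = "cheese: good\npizza: good\nbagel: good\nfire: bad\n";

# In scalar context, m//g resumes after the previous match, so each pass
# through the loop sees one attribute/value pair in $1 and $2.
my @pairs;
while ($dataSet =~ /^(.+?): (.+)$/mg) {
    my ($attrib, $value) = ($1, $2);
    push @pairs, [$attrib, $value];
    print "$attrib : $value\n";
}

# Alternatively, the flat list from a list-context match can feed a hash
# directly, assuming attribute names are unique.
my %attrib = $dataSet =~ /^(.+?): (.+)$/mg;
```

The array of pairs keeps the original order and allows repeated names, while the hash is shorter to write but keeps only the last value for each name.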

Where can I find an array of the (un)assigned Unicode code points for a particular block?

At the moment, I'm writing these arrays by hand.
For example, the Miscellaneous Mathematical Symbols-A block has an entry in hash like this:
my %symbols = (
...
miscellaneous_mathematical_symbols_a => [(0x27C0..0x27CA), 0x27CC,
(0x27D0..0x27EF)],
...
)
The simpler, 'continuous' array
miscellaneous_mathematical_symbols_a => [0x27C0..0x27EF]
doesn't work because Unicode blocks have holes in them. For example, there's nothing at 0x27CB. Take a look at the code chart [PDF].
Writing these arrays by hand is tedious, error-prone and a bit fun. And I get the feeling that someone has already tackled this in Perl!

Perhaps you want Unicode::UCD? Use its charblock routine to get the range of any named block. If you want to get those names, you can use charblocks.
This module is really just an interface to the Unicode databases that come with Perl already, so if you have to do something fancier, you can look at the lib/5.x.y/unicore/UnicodeData.txt or the various other files in that same directory to get what you need.
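As a quick illustration of the charblock routine mentioned above, here is a minimal sketch that asks for one block by name and keeps only the assigned code points. It assumes the block name is spelled exactly as in the Unicode block list; the result depends on the Unicode version shipped with your Perl.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charblock);

my $name   = 'Miscellaneous Mathematical Symbols-A';
my $ranges = charblock($name)
    or die "Unknown block '$name'\n";

# Each range starts with [ start, end ]; keep only the assigned code points.
my @assigned;
for my $range (@$ranges) {
    my ($start, $stop) = @{$range}[0, 1];
    push @assigned, grep { chr($_) =~ /\A\p{Assigned}\z/ } $start .. $stop;
}

printf "%d assigned code points, first is U+%04X\n",
    scalar @assigned, $assigned[0];
```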
Here's what I came up with to create your %symbols. I go through all the blocks (although in this sample I skip the ones without "Math" in their name). I get the starting and ending code points and check which ones are assigned. From that, I create a custom property that I can use to check if a character is in the range and assigned.
use strict;
use warnings;
digest_blocks();
my $property = 'My::InMiscellaneousMathematicalSymbolsA';
foreach ( 0x27BA..0x27F3 )
{
my $in = chr =~ m/\p{$property}/;
printf "%X is %sin $property\n",
$_, $in ? '' : ' not ';
}
sub digest_blocks {
use Unicode::UCD qw(charblocks);
my $blocks = charblocks();
foreach my $block ( keys %$blocks )
{
next unless $block =~ /Math/; # just to make the output small
my( $start, $stop ) = @{ $blocks->{$block}[0] };
$blocks->{$block} = {
assigned => [ grep { chr =~ /\A\p{Assigned}\z/ } $start .. $stop ],
unassigned => [ grep { chr !~ /\A\p{Assigned}\z/ } $start .. $stop ],
start => $start,
stop => $stop,
name => $block,
};
define_my_property( $blocks->{$block} );
}
}
sub define_my_property {
my $block = shift;
(my $subname = $block->{name}) =~ s/\W//g;
$block->{my_property} = "My::In$subname"; # needs In or Is
no strict 'refs';
my $string = join "\n", # can do ranges here too
map { sprintf "%X", $_ }
@{ $block->{assigned} };
*{"My::In$subname"} = sub { $string };
}
If I were going to do this a lot, I'd use the same thing to create a Perl source file that has the custom properties already defined so I can just use them right away in any of my work. None of the data should change until you update your Unicode data.
sub define_my_property {
my $block = shift;
(my $subname = $block->{name}) =~ s/\W//g;
$block->{my_property} = "My::In$subname"; # needs In or Is
no strict 'refs';
my $string = num2range( @{ $block->{assigned} } );
print <<"HERE";
sub My::In$subname {
return <<'CODEPOINTS';
$string
CODEPOINTS
}
HERE
}
# http://www.perlmonks.org/?node_id=87538
sub num2range {
local $_ = join ',' => sort { $a <=> $b } @_;
s/(?<!\d)(\d+)(?:,((??{$++1})))+(?!\d)/$1\t$+/g;
s/(\d+)/ sprintf "%X", $1/eg;
s/,/\n/g;
return $_;
}
That gives me output suitable for a Perl library:
sub My::InMiscellaneousMathematicalSymbolsA {
return <<'CODEPOINTS';
27C0 27CA
27CC
27D0 27EF
CODEPOINTS
}
sub My::InSupplementalMathematicalOperators {
return <<'CODEPOINTS';
2A00 2AFF
CODEPOINTS
}
sub My::InMathematicalAlphanumericSymbols {
return <<'CODEPOINTS';
1D400 1D454
1D456 1D49C
1D49E 1D49F
1D4A2
1D4A5 1D4A6
1D4A9 1D4AC
1D4AE 1D4B9
1D4BB
1D4BD 1D4C3
1D4C5 1D505
1D507 1D50A
1D50D 1D514
1D516 1D51C
1D51E 1D539
1D53B 1D53E
1D540 1D544
1D546
1D54A 1D550
1D552 1D6A5
1D6A8 1D7CB
1D7CE 1D7FF
CODEPOINTS
}
sub My::InMiscellaneousMathematicalSymbolsB {
return <<'CODEPOINTS';
2980 29FF
CODEPOINTS
}
sub My::InMathematicalOperators {
return <<'CODEPOINTS';
2200 22FF
CODEPOINTS
}

Maybe this?
my @list =
    grep { chr($_) =~ /^\p{Assigned}$/ }
    0x27C0..0x27EF;
@list = map { sprintf("%X", $_) } @list;
print "@list\n";
Gives me
27C0 27C1 27C2 27C3 27C4 27C5 27C6 27C7 27C8 27C9 27CA 27D0 27D1 27D2 27D3
27D4 27D5 27D6 27D7 27D8 27D9 27DA 27DB 27DC 27DD 27DE 27DF 27E0 27E1 27E2
27E3 27E4 27E5 27E6 27E7 27E8 27E9 27EA 27EB

I don't know why you wouldn't say miscellaneous_mathematical_symbols_a => [0x27C0..0x27EF], because that's how the Unicode standard is defined according to the PDF.
What do you mean when you say it doesn't "work"? If it's giving you some sort of error when you check the existence of the character in the block, then why not just weed them out of the block when your checker comes across an error?
