Test two boolean values with AND operator in Perl - perl

I want to test two boolean values in Perl.
In case both are 0 do x, in case not do y.
For example:
$var1 = 0; false in Perl
$var2 = 0; false in Perl
if($var1==0 && var2==0)
{
x //both are false
}
else
{
y //both are true
}

if ( $var1 && $var2 ) {
# Both are true.
}
elsif ( !$var1 && !$var2 ) {
# Both are false.
}
or
if ( !( $var1 xor $var2 ) ) {
# Both are true or both are false.
if ( $var1 ) {
# Both are true.
} else {
# Both are false.
}
}
or
if ( $var1 ) {
if ( $var2 ) {
# Both are true.
}
} else {
if ( !$var2 ) {
# Both are false.
}
}

So … assuming you want y only when both $var1 and $var2 are true:
if (!$var1 and !$var2) {
# do x
} elsif ($var1 and $var2) {
# do y
}
But there TIMTOWTDI. Because Perl.

Related

Making tree hash to correct path

I have a hash variable as a tree:
\%data = {
'node' => {
'RN:4' => {
'next' => {
'1' => {
'RN:23' => {
'next' => {
'1' => {
'RN:29' => {
'end' => 1
}
},
'2' => {
'RN:32' => {
'next' => {
'1' => {
'RN:30' => {
'end' = 1
}
}
}
}
}
}
I want to convert this tree to correct paths like this:
1, RN:4 >> RN:23 >> RN:29
2, RN:4 >> RN:23 >> RN:32 >> RN:30
I have tried some recursive code but alway get wrong path.
Help me please !
The data structure is wholly too complicated. Hashes are being used as arrays, and it would be easier if the id wasn't used as the key. It would be better if a node looked like this:
{
id => ...,
children => [ ... ]
}
The structure would become
[
{
id => 'RN:4',
children => [
{
id => 'RN:23',
children => [
{
id => 'RN:29',
children => []
},
{
id => 'RN:32',
children => [
{
id => 'RN:30',
children => []
}
]
}
]
}
]
}
]
You need the id of all ancestors so we pass a long a list of the ancestors as the parameters.
use 5.016;
sub print_paths {
my $i = 0;
my $helper = sub {
my $node = $_[-1];
my $children = $node->{children};
if (#$children) {
__SUB__->(#_, $_) for #$children;
} else {
say $i, ", ", join(" >> ", map { $_->{id} } #_);
}
};
$helper->(#_);
}
print_paths($_) for #$roots;
The above assumes the ends are the nodes with no children. If your ends can have children, you have a trie. Simply add end => 1 to the end nodes and use the following as the core of the visitor:
if (#$children) {
__SUB__->(#_, $_) for #$children;
}
if ($node->{end}) {
say $i, ", ", join(" >> ", map { $_->{id} } #_);
}
With your format, it's trickier (and more expensive).
$node->{id} is replaced with (keys(%$node))[0].
$node->{children} is replaced with $node->{$id}{next}.
$node->{end} is replaced with $node->{$id}{end}.
for my $child (#$children) is replaced with for (my $j=1; my $child = $children->{$j}; ++$j).
use 5.016;
sub print_paths {
my $i = 0;
my $helper = sub {
my $node = $_[-1];
my $id = (keys(%$node))[0];
my $children = $node->{$id}{next};
if ($children) {
for (my $j=1; my $child = $children->{$j}; ++$j) {
__SUB__->(#_, $child) for #$children;
}
}
if ($node->{$id}{end}) {
say $i, ", ", join(" >> ", map { (keys(%$node))[0] } #_);
}
};
$helper->(#_);
}
print_paths($data->{node});

How can I speed up this nested foreach loop in Perl?

I have a Perl script that compares two sets of data loaded into two arrays, and I'm trying to make the comparison more efficient. Currently the code is as follows:
foreach(#{FILE_DATA}) {
if((++$file_current_linenum % 200) == 0) {
$progress = int($file_current_linenum / $file_total_lines * 10000) / 100;
logg("Processed $file_current_linenum file rows, $progress%, $mismatches mismatches.");
}
$file_current_line = $_;
$match_found = 0;
foreach(#{DB_DATA}) {
$db_current_line = $_;
if($file_current_line->{"channel"} == $db_current_line->{"channel"} ) {
if ($file_current_line->{"checksum"} == $db_current_line->{"checksum"} &&
$file_current_line->{"time"} > ($db_current_line->{"date_part"} - $TIME_MATCH_TOLERANCE) &&
$file_current_line->{"time"} < ($db_current_line->{"date_part"} + $TIME_MATCH_TOLERANCE) ){
$match_found = 1;
last; # break;
}
}
}
if($match_found != 1) {
push(#results, $file_current_line);
$mismatches++;
}
}
My first thought would be to remove matches from both arrays to reduce the pool size, would that affect the iterators position?
Both sets of data can have up to a couple of million elements and the comparison can take a few hours to complete.
Your solution is O(DB * FILE).
The following is O(DB + FILE) if and only if there never more than a few lines with the same channel and checksum:
my %DB_DATA;
for my $db_line (#DB_DATA) {
push #{ $DB_DATA{ $db_line->{channel} }{ $db_line->{checksum} } }, $db_line;
}
for my $file_line_idx (0..$#FILE_DATA) {
my $file_line = $FILE_DATA[$file_line_idx];
my $found = 0;
if (my $r1 = $DB_DATA{ $file_line->{channel} } ) {
if (my $r2 = $r1->{ $file_line->{checksum} } ) {
my $file_time = $file_line->{time};
for my $db_line (#$r2) {
my $db_time = $db_line->{date_part};
if (abs($file_time - $db_time) < $TIME_MATCH_TOLERANCE) {
$found = 1;
last;
}
}
}
}
push #mismatches, $file_line if !$found;
if ((($file_line_idx+1) % 200) == 0) {
logg(sprintf("Processed %d file rows, %d%, %d mismatches.",
$file_line_idx+1,
int(($file_line_idx+1)/#FILE_DATA) * 100,
0+#mismatches,
));
}
}
The following is O(DB + FILE) even if there are many lines with the same channel and checksum, but uses a lot of memory if $TIME_MATCH_TOLERANCE is big:
my %DB_DATA;
for my $db_line (#DB_DATA) {
for my $db_time (
$db_line->{date_part} - $TIME_MATCH_TOLERANCE + 1
..
$db_line->{date_part} + $TIME_MATCH_TOLERANCE - 1
) {
++$DB_DATA{ $db_line->{channel} }{ $db_line->{checksum} }{$db_time};
}
}
for my $file_line_idx (0..$#FILE_DATA) {
my $file_line = $FILE_DATA[$file_line_idx];
my $found = 0;
if (my $r1 = $DB_DATA{ $file_line->{channel} } ) {
if (my $r2 = $r1->{ $file_line->{checksum} } ) {
if ($r2->{ $file_line->{time} } {
$found = 1;
last;
}
}
}
push #mismatches, $file_line if !$found;
if ((($file_line_idx+1) % 200) == 0) {
logg(sprintf("Processed %d file rows, %d%, %d mismatches.",
$file_line_idx+1,
int(($file_line_idx+1)/#FILE_DATA) * 100,
0+#mismatches,
));
}
}
Note: Assumes the timestamps are integers. If they're not, convert them to integers before using them as keys.
The following is O((DB + FILE) log DB) [ which is very close to O(DB + FILE) ] even if there are many lines with the same channel and checksum, and uses minimal memory:
sub binsearch(&\#) {
my ($compare, $array) = #_;
my $i = 0;
my $j = $#$array;
return 0 if $j == -1;
while (1) {
my $k = int(($i+$j)/2);
for ($array->[$k]) {
my $cmp = $compare->()
or return 1;
if ($cmp < 0) {
$j = $k-1;
return 0 if $i > $j;
} else {
$i = $k+1;
return 0 if $i > $j;
}
}
}
}
my %DB_DATA;
for my $db_line (#DB_DATA) {
push #{ $DB_DATA{ $db_line->{channel} }{ $db_line->{checksum} } }, $db_line;
}
for my $r1 (values(%DB_DATA)) {
for my $r2 (values(%$r1)) {
#$r2 = sort { $a->{date_part} <=> $b->{date_part} } #$r2;
}
}
for my $file_line_idx (0..$#FILE_DATA) {
my $file_line = $FILE_DATA[$file_line_idx];
my $found = 0;
if (my $r1 = $DB_DATA{ $file_line->{channel} } ) {
if (my $r2 = $r1->{ $file_line->{checksum} } ) {
my $file_time = $file_line->{time};
my $min_db_time = $file_time - $TIME_MATCH_TOLERANCE;
my $max_db_time = $file_time + $TIME_MATCH_TOLERANCE;
if ( binsearch {
$_->{date_part} >= $max_db_time ? -1
: $_->{date_part} <= $min_db_time ? +1
: 0
} #$r2 ) {
$found = 1;
last;
}
}
}
push #mismatches, $file_line if !$found;
if ((($file_line_idx+1) % 200) == 0) {
logg(sprintf("Processed %d file rows, %d%, %d mismatches.",
$file_line_idx+1,
int(($file_line_idx+1)/#FILE_DATA) * 100,
0+#mismatches,
));
}
}
You could probably reduce the time significantly by pre-building a hash from DB_DATA, using the concatenation of the "channel" and "checksum" values as the key, and each value being a list of all of the DB_DATA entries with that channel and checksum. That way, for each FILE_DATA entry, you only need to check that list.
If there are a lot of entries with a given channel and checksum, you try to improve even more by sorting them by date_part, and then trying to binary search to find a valid entry.
If there are very few entries with a given channel and checksum, this should reduce your run time by a factor of a million or so, since it reduces the run time from O($#FILE_DATA * $#DB_DATA) to O($#FILE_DATA + $#DB_DATA).

Optimize perl code

I have to write to check array ref for more than 3 params. If the value is coming from an array then I have written a foreach loop for that, then chop that and assign to a new variable after concatenating pipe.
Code
if ( defined $args->{hotel} ) {
if ( ref( $args->{hotel} ) eq "ARRAY" ) {
foreach my $hotel ( #{ $args->{hotel} } ) {
$hotel .= $hotel . "|";
}
chop($hotel);
$args->{hotel_name} = $hotel;
} else {
$args->{hotel_name} = $args->{hotel};
}
} else {
$args->{hotel_name} = $hotel;
}
if ( defined $args->{country} ) {
if ( ref( $args->{country} ) eq "ARRAY" ) {
foreach my $country_name ( #{ $args->{country} } ) {
$country_name .= $country_name . "|";
}
chop($country_name);
$args->{country_name} = $country_name;
} else {
$args->{country_name} = $args->{country};
}
} else {
$args->{country_name} = $country_name;
}
if ( defined $args->{city} ) {
if ( ref( $args->{city} ) eq "ARRAY" ) {
foreach my $city ( #{ $args->{city} } ) {
$city .= $city . "|";
}
chop($city);
$args->{city_name} = $city;
} else {
$args->{city_name} = $args->{city};
}
} else {
$args->{city_name} = $city;
}
I want to write a function for this kind of work so that there will be no repetition of same code. Please help me; how can we do this in Perl?
You can write:
sub convert_to_name ($$) { # ($value, $fallback_name)
my ($value, $fallback_name) = #_;
if (defined $value) {
if (ref($value) eq 'ARRAY') {
return join '|', #$value;
} else {
return "$value";
}
} else {
return $fallback_name;
}
}
$args->{'hotel_name'} = convert_to_name $args->{'hotel'}, $hotel;
$args->{'country_name'} = convert_to_name $args->{'country'}, $country;
$args->{'city_name'} = convert_to_name $args->{'city'}, $city;
There appear to be some potential bugs in your code, the biggest centering around reusing variable names at lower scopes.
However, of course you can add an iteration loop to your code that would remove the need for 3 nearly identical sections. The following does that by creating a intermediate hash data structure to relate field names to values.
Note: I also simplified the code by inverting the logic if your first if statement so that all ifs could be at the same level. Additionally, it makes sense to use a join instead of rolling your own such functionality.
my %hash = (
hotel => $hotel,
country => $country_name,
city => $city,
);
while ( my ( $field, $value ) = each %hash ) {
if ( !defined $args->{$field} ) {
$args->{"${field}_name"} = $value;
} elsif ( ref( $args->{$field} ) eq "ARRAY" ) {
$args->{"${field}_name"} = join '|', #{ $args->{$field} };
} else {
$args->{"${field}_name"} = $args->{$field};
}
}
Also, if you're comfortable with the Conditional operator, this can be reduced further. However, some would consider this too cluttered:
while ( my ( $field, $value ) = each %hash ) {
$args->{"${field}_name"} = !defined $args->{$field}
? $value
: ref( $args->{$field} ) eq "ARRAY"
? join( '|', #{ $args->{$field} } )
: $args->{$field};
}

Getting nth/last value inside a hash of hashes

Im trying to make a hashes of hashes to uniquely identify the number that only comes under one set of levels. the hash structure looks something like this :
my %gh = {
'Test1' => {
'level1a' => {
'level2b' => {
'level3a' => {
'level4a' => {
'level5' => '63'
}
}
}
}
}
};
Can some please tell me what is the simplest way to traverse the hash so i can get the value 63.
I have been using
my $x = '';
foreach my $l0 (%gh){
foreach my $l1 (%{$l0}){
foreach my $l2 (%$l1){
foreach my $l3 (%{$l2}){
foreach my $l4 (%$l3){
foreach my $l5 (%{$l4}){
$x = $l5;
}
}
}
}
}
}
This process seems to be working fine . But i was just looking for something simpler and shorter;
Thanks in advance
This will work in your case (only hashes, and plain scalar value at the end)
sub hval {
my ($h) = #_;
return map { ref() ? hval($_) : $_ } values %$h;
}
my $gh = {
'Test1' => {
'level1a' => {
'level2b' => {
'level3a' => {
'level4a' => {
'level5' => '63'
}
}
}
}
}
};
my ($x) = hval($gh);
If you use a reference to a hash instead, here is one way:
use warnings;
use strict;
my $gh = {
'Test1' => {
'level1a' => {
'level2b' => {
'level3a' => {
'level4a' => {
'level5' => '63'
}
}
}
}
}
};
print $gh->{Test1}{level1a}{level2b}{level3a}{level4a}{level5}, "\n";
See also: perldoc perldsc and Data::Diver

Safe way to establish the initial state again.

Is there a possibility of calling the function main without the function _end_win being called?
#!/usr/bin/env perl
use warnings;
use strict;
use 5.10.0;
use Term::ReadKey;
use constant {
NEXT_getch => -1,
CONTROL_C => 0x03,
CONTROL_D => 0x04,
KEY_ENTER => 0x0d,
KEY_RIGHT => 0x1b5b43,
KEY_LEFT => 0x1b5b44,
};
say main();
sub main {
my $arg = {};
$arg->{handle_out} = *STDOUT;
_init_scr( $arg );
while ( 1 ) {
my $c = _getch( $arg );
if ( ! defined $c ) {
_end_win( $arg );
warn "EOT";
return;
}
next if $c == NEXT_getch;
given ( $c ) {
when ( $c >= 97 && $c <= 122 ) {
print chr $c;
$arg->{string} .= chr $c;
}
when ( $c == KEY_RIGHT ) {
print '>';
$arg->{string} .= '>';
}
when ( $c == KEY_LEFT ) {
print '<';
$arg->{string} .= '<';
}
when ( $c == CONTROL_D ) {
_end_win( $arg );
return;
}
when ( $c == CONTROL_C ) {
_end_win( $arg );
print STDERR "^C";
kill( 'INT', $$ );
return;
}
when ( $c == KEY_ENTER ) {
_end_win( $arg );
return $arg->{string};
}
}
}
}
sub _init_scr {
my ( $arg ) = #_;
$arg->{old_handle} = select( $arg->{handle_out} );
$arg->{backup_flush} = $|;
$| = 1;
Term::ReadKey::ReadMode 'ultra-raw';
}
sub _end_win {
my ( $arg ) = #_;
print "\n\r";
Term::ReadKey::ReadMode 'restore';
$| = $arg->{backup_flush};
select( $arg->{old_handle} );
}
sub _getch {
my ( $arg ) = #_;
my $c1 = ReadKey 0;
return if ! defined $c1;
if ( $c1 eq "\e" ) {
my $c2 = ReadKey 0.10;
if ( ! defined $c2 ) { return NEXT_getch; }
elsif ( $c2 eq 'C' ) { return KEY_RIGHT; }
elsif ( $c2 eq 'D' ) { return KEY_LEFT; }
elsif ( $c2 eq '[' ) {
my $c3 = ReadKey 0;
if ( $c3 eq 'C' ) { return KEY_RIGHT; }
elsif ( $c3 eq 'D' ) { return KEY_LEFT; }
else {
return NEXT_getch;
}
}
else {
return NEXT_getch;
}
}
else {
return ord $c1;
}
}
To ensure that the terminal is reset when your program exits, put the reset code into an END block. For example, you could replace your _end_win sub with:
END {
print "\n\r";
Term::ReadKey::ReadMode 'restore';
}
(I removed the lines resetting $| and the selected output filehandle since the process is exiting anyhow, so they're about to become irrelevant.)
An END block will always run when the program terminates in a "normal" way, such as calling exit or die or hitting the end of the executable code. It does not fire when the process terminates due to receiving a signal; it looks like you're handling the ctrl-C character directly, but you may want to consider adding a %SIG{INT} handler as well, in case someone sends you a kill -2.