How to transform a parenthesized string into a parent-child representation - perl

I'm trying to visualize a formula which is in a single string of this format:
AND(operator1(operator2(x,a),operator3(y,b)),operator4(t))
There's no limit to how many arguments are in an operator.
Currently I'm trying to solve it in Perl. I guess the easiest way is to transform it to a parent-child representation, and then it would be easy to present it in a collapsible way like this:
[-]AND(
[-]operator1(
[-]operator2(
x
a
)
[+]operator3(...)
)
[-]operator4(
t
)
)
I doubt I'm the first one to tackle this, but I can't find any such example online. Thanks!

I did something like this a few months ago and think this might be along the lines of what you are looking for...
c:\Perl>perl StackOverflow.pl
SomeValue
AND
-operator1
--operator2
---x
---a
--operator3
---y
---b
-operator4
--t
c:\Perl>
If so, here's the code (below). It's a bit of a mess, but it should be enough to get you started.
#!c:/perl/bin/perl.exe
use strict;
use warnings;

my $SourceString = "SomeValue AND (operator1(operator2(x,a),operator3(y,b)),operator4(t))";
my $NodeIndex    = 0;
my $IndexString  = "-";
print ProcessString($SourceString, $NodeIndex);
exit;

#----- subs go here -----
sub ProcessString {
    my $TargetString = shift;
    my $NodeIndex    = shift;
    my $ReturnString = "";
    # nothing left to process: return an empty string so the caller can append it safely
    return "" unless defined $TargetString && length $TargetString;
    # are we starting with a paren, comma or space?
    if ($TargetString =~ m/^([\(\)\, ])/) {
        # yep, adjust the depth for parens, delete the char and process the rest
        my $Char = $1;
        $NodeIndex++ if $Char eq "(";
        $NodeIndex-- if $Char eq ")";
        $ReturnString .= ProcessString(substr($TargetString, 1), $NodeIndex);
    } else {
        # nope, must be a keyword or the end
        if ($TargetString =~ m/([\(\)\, ])/) {
            my $KeywordString = substr($TargetString, 0, index($TargetString, $1));
            $ReturnString .= $IndexString x $NodeIndex;
            $ReturnString .= $KeywordString . "\n";
            $ReturnString .= ProcessString(substr($TargetString, index($TargetString, $1)), $NodeIndex);
        } else {
            # we should never be here
        } # end keyword check if
    } # end if
    # send the string back
    return $ReturnString;
} # end sub
__END__
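If an actual parent-child structure is wanted rather than indented text, one possible approach is a small recursive-descent parser that returns nested hashes. The following is only a sketch under assumptions: the names parse_node and dump_node and the { name, children } layout are made up for illustration, and names are assumed to contain no parentheses, commas or whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "name(arg,arg,...)" starting at the string's
# current pos() and return { name => ..., children => [ ... ] }.
sub parse_node {
    my ($ref) = @_;    # reference to the string being parsed
    $$ref =~ /\G\s*([^(),\s]+)\s*/gc
        or die "Expected a name at offset " . (pos($$ref) // 0) . "\n";
    my $node = { name => $1, children => [] };
    if ($$ref =~ /\G\(\s*/gc) {
        # keep reading children until the matching close paren
        until ($$ref =~ /\G\)\s*/gc) {
            push @{ $node->{children} }, parse_node($ref);
            $$ref =~ /\G,\s*/gc;    # skip the separator between arguments
        }
    }
    return $node;
}

# Hypothetical helper: render the tree in the same dashed format as above.
sub dump_node {
    my ($node, $depth) = @_;
    my $out = ('-' x $depth) . $node->{name} . "\n";
    $out .= dump_node($_, $depth + 1) for @{ $node->{children} };
    return $out;
}

my $formula = 'AND(operator1(operator2(x,a),operator3(y,b)),operator4(t))';
my $tree    = parse_node(\$formula);
print dump_node($tree, 0);
```

With the tree in hand, a collapsible view only needs to decide per node whether to descend into its children array.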

Related

What are all the Unicode properties a Perl 6 character will match?

The .uniprop returns a single property:
put join ', ', 'A'.uniprop;
I get back one property (the general category):
Lu
Looking around I didn't see a way to get all the other properties (including derived ones such as ID_Start and so on). What am I missing? I know I can go look at the data files, but I'd rather have a single method that returns a list.
I am mostly interested in this because regexes understand properties and match the right properties. I'd like to take any character and show which properties it will match.
"A".uniprop("Alphabetic") will get the Alphabetic property. Are you asking for what other properties are possible?
All of these that have a checkmark by them will likely work. The issue just displays the status of roast testing for each one: https://github.com/perl6/roast/issues/195
This may be more useful for you: https://github.com/rakudo/rakudo/blob/master/src/core/Cool.pm6#L396-L483
The first hash just maps aliases for the property names to the full names. The second hash specifies whether the property is B for boolean, S for string, I for integer, nv for numeric value, na for Unicode Name, plus a few other specials.
If I didn't understand your question, please let me know and I will revise this answer.
Update: It seems you want to find all the properties that will match. What you will want to do is iterate over all of https://github.com/rakudo/rakudo/blob/master/src/core/Cool.pm6#L396-L483 and look only at string, integer and boolean properties. Here is the full thing: https://gist.github.com/samcv/ae09060a781bb4c36ae6cac80ea9325f
sub MAIN {
    use Test;
    my $char = 'a';
    my @result = what-matches($char);
    for @result {
        ok EVAL("'$char' ~~ /$_/"), "$char ~~ /$_/";
    }
}
use nqp;
sub what-matches (Str:D $chr) {
    my @result;
    my %prefs = prefs();
    for %prefs.keys -> $key {
        given %prefs{$key} {
            when 'S' {
                my $propval = $chr.uniprop($key);
                if $key eq 'Block' {
                    @result.push: "<:In" ~ $propval.trans(' ' => '') ~ ">";
                }
                elsif $propval {
                    @result.push: "<:" ~ $key ~ "<" ~ $chr.uniprop($key) ~ ">>";
                }
            }
            when 'I' {
                @result.push: "<:" ~ $key ~ "<" ~ $chr.uniprop($key) ~ ">>";
            }
            when 'B' {
                @result.push: ($chr.uniprop($key) ?? "<:$key>" !! "<:!$key>");
            }
        }
    }
    @result;
}
sub prefs {
my %prefs = nqp::hash(
'Other_Grapheme_Extend','B','Titlecase_Mapping','tc','Dash','B',
'Emoji_Modifier_Base','B','Emoji_Modifier','B','Pattern_Syntax','B',
'IDS_Trinary_Operator','B','ID_Continue','B','Diacritic','B','Cased','B',
'Hangul_Syllable_Type','S','Quotation_Mark','B','Radical','B',
'NFD_Quick_Check','S','Joining_Type','S','Case_Folding','S','Script','S',
'Soft_Dotted','B','Changes_When_Casemapped','B','Simple_Case_Folding','S',
'ISO_Comment','S','Lowercase','B','Join_Control','B','Bidi_Class','S',
'Joining_Group','S','Decomposition_Mapping','S','Lowercase_Mapping','lc',
'NFKC_Casefold','S','Simple_Lowercase_Mapping','S',
'Indic_Syllabic_Category','S','Expands_On_NFC','B','Expands_On_NFD','B',
'Uppercase','B','White_Space','B','Sentence_Terminal','B',
'NFKD_Quick_Check','S','Changes_When_Titlecased','B','Math','B',
'Uppercase_Mapping','uc','NFKC_Quick_Check','S','Sentence_Break','S',
'Simple_Titlecase_Mapping','S','Alphabetic','B','Composition_Exclusion','B',
'Noncharacter_Code_Point','B','Other_Alphabetic','B','XID_Continue','B',
'Age','S','Other_ID_Start','B','Unified_Ideograph','B','FC_NFKC_Closure','S',
'Case_Ignorable','B','Hyphen','B','Numeric_Value','nv',
'Changes_When_NFKC_Casefolded','B','Expands_On_NFKD','B',
'Indic_Positional_Category','S','Decomposition_Type','S','Bidi_Mirrored','B',
'Changes_When_Uppercased','B','ID_Start','B','Grapheme_Extend','B',
'XID_Start','B','Expands_On_NFKC','B','Other_Uppercase','B','Other_Math','B',
'Grapheme_Link','B','Bidi_Control','B','Default_Ignorable_Code_Point','B',
'Changes_When_Casefolded','B','Word_Break','S','NFC_Quick_Check','S',
'Other_Default_Ignorable_Code_Point','B','Logical_Order_Exception','B',
'Prepended_Concatenation_Mark','B','Other_Lowercase','B',
'Other_ID_Continue','B','Variation_Selector','B','Extender','B',
'Full_Composition_Exclusion','B','IDS_Binary_Operator','B','Numeric_Type','S',
'kCompatibilityVariant','S','Simple_Uppercase_Mapping','S',
'Terminal_Punctuation','B','Line_Break','S','East_Asian_Width','S',
'ASCII_Hex_Digit','B','Pattern_White_Space','B','Hex_Digit','B',
'Bidi_Paired_Bracket_Type','S','General_Category','S',
'Grapheme_Cluster_Break','S','Grapheme_Base','B','Name','na','Ideographic','B',
'Block','S','Emoji_Presentation','B','Emoji','B','Deprecated','B',
'Changes_When_Lowercased','B','Bidi_Mirroring_Glyph','bmg',
'Canonical_Combining_Class','S',
);
}
OK, so here's another take on answering this question, but the solution is not perfect. Bring the downvotes!
If you join the #perl6 channel on freenode, there's a bot called unicodable6 with functionality that you may find useful. You can ask it to do this (e.g. for the characters A and π simultaneously):
<AlexDaniel> propdump: Aπ
<unicodable6> AlexDaniel, https://gist.github.com/b48e6062f3b0d5721a5988f067259727
Not only does it show the value of each property, but if you give it more than one character it will also highlight the differences!
Yes, it seems like you're looking for a way to do that within perl 6, and this answer is not it. But in the meantime it's very useful. Internally Unicodable just iterates through this list of properties. So basically this is identical to the other answer in this thread.
I think someone can make a module out of this (hint-hint), and then the answer to your question will be “just use module Unicode::Propdump”.

(m/regexp/) or {multiple ; commands; after; or; }

I like this syntax very much:
try_something() or warn "Cant do it";
How can I add more commands after or?
For example it would be useful in this code:
foreach (@array)
{
m/regex/ or {warn "Does not match"; next;} # this syntax is wrong
...
}
One way I found is
try_something() or eval {warn "Can't do it"; next;};
but I think it is bad idea.
BEST ANSWERS:
do is better than eval.
The comma operator is even better: do_smth() or warn("Does not match"), next; Nota bene: parentheses are mandatory for warn so that next does not parse as one of its arguments.
I think that will end up being pretty unreadable pretty fast, but you can do:
foo() or do { bar(); baz(); };
sub foo {
return $_[0] == 2;
}
for (1..3) {
print $_;
foo($_) or do { print " !foo\n"; next; };
print " foo!\n";
}
For the case in your question, I would use unless.
for (@array) {
unless (/regex/) {
warn "Does not match";
next;
}
...
}
You can sometimes get away with using the comma operator. It evaluates its left-hand argument, throws away the result, evaluates the right-hand argument and returns that result. Applied to your situation it looks like
for (@array) {
/regex/ or warn("Does not match"), next;
...
}
Note the extra parentheses. You have to be a bit more careful about parentheses and grouping this way. Be judicious in your use of this technique: it can get ugly quickly.
In a comment below, Zaid suggests
warn('Does not match'), next unless /regex/;
The choice is a matter of style. Perl was created by a linguist. Natural languages allow us to express the same thought in different ways depending on which part we want to emphasize. In your case, do you want to emphasize the warning or the pattern match? Place the more important code out front.
I figured out (and tested) that you can also use 'and':
try_something() or warn "Cant do it" and print "Hmm." and next;
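If try_something() succeeds, nothing after or is executed.
If try_something() fails, it warns, prints, and then runs next. Note that this relies on warn and print each returning a true value; if any link in the and chain returns false, the rest of the chain is skipped.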
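To see why the and chain is more fragile than a do block, here is a minimal sketch. The sub log_quietly is a hypothetical stand-in for any step that happens to return false; everything else is plain Perl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a step that returns a false value.
sub log_quietly { return 0 }

my @seen;
for my $n (1 .. 3) {
    # Parsed as: $n % 2 or (log_quietly() and next)
    # log_quietly() is false, so `next` is never reached for even numbers.
    $n % 2 or log_quietly() and next;
    push @seen, $n;
}

my @kept;
for my $n (1 .. 3) {
    # The do block runs both statements regardless of return values.
    $n % 2 or do { log_quietly(); next };
    push @kept, $n;
}

print "and chain: @seen\n";
print "do block:  @kept\n";
```

In the first loop the even number slips through because the chain stops at the false value; in the second loop it is skipped as intended.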

program exhibiting bizarre behavior when reading words out from a file

So I have two files, one that contains my text, and another which I want to contain filter words. The one shown here is supposed to be the one with the curse words. Basically, what I'm doing is iterating through each of the words in the text file, and trying to compare them against the curse words.
sub filter {
    $word_to_check = $_;
    open ( FILE2, $ARGV[1]) || die "Something went wrong. \n";
    while(<FILE2>) {
        @cursewords = split;
        foreach $curse (@cursewords) {
            print $curse."\n";
            if($word_to_check eq $curse) { return "BAD!";}
        }
    }
    close ( FILE2 );
}
Here are the "curse words":
what is
Here is the text file:
hey dude what is up
But here's what's going wrong. As you can see, I've put a print statement to see if the curse words are getting checked correctly.
hey what
is
dude what
is
what what
is
is what
is
up what
is
I literally have no idea why this could be happening. Please let me know if I should post more code.
EDIT:
AHA! thanks evil otto. It seems I was getting confused with another print statement I had put in before. Now the problem remains: I think I'm not checking for string equality correctly. Here's where filter is getting called:
foreach $w( @text_file_words )
{
if(filter($w) eq "BAD!")
{
#do something here
}
else { print "good!"; }
}
EDIT 2: Nevermind, more stupidity on my part. I need to get some sleep, thanks evil otto.
change
$word_to_check = $_;
to
$word_to_check = shift;
In Perl, a subroutine's arguments arrive in the array @_, so you need to collect them from there:
sub myFunction{
    my ($wordToCheck) = @_; # @_ is the argument array; for more than one argument, list more variables between the parentheses, separated by commas
}
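Putting the two suggestions together, a sketch of the filter might look like the following. It makes several assumptions that are not in the original post: the helper name load_words, the "ok" return value for clean words, loading the word list once into a hash instead of re-reading the file for every word, and an in-memory filehandle standing in for the file named in $ARGV[1].

```perl
use strict;
use warnings;

# Hypothetical helper: read whitespace-separated words from a filehandle
# into a hash so each lookup is a single hash access.
sub load_words {
    my ($fh) = @_;
    my %words;
    while (my $line = <$fh>) {
        $words{$_} = 1 for split ' ', $line;
    }
    return \%words;
}

# The word to check is taken from @_ rather than from $_.
sub filter {
    my ($word, $bad) = @_;
    return $bad->{$word} ? "BAD!" : "ok";
}

# An in-memory filehandle stands in for the real word-list file.
my $curse_data = "what is\n";
open my $fh, '<', \$curse_data or die "Cannot open word list: $!";
my $bad = load_words($fh);
close $fh;

my @results = map { filter($_, $bad) } split ' ', "hey dude what is up";
print "@results\n";
```

Reading the list once also avoids reopening FILE2 on every call, which the original sub did whenever it returned early without reaching close.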

Alternative to "last" in do loops

According to the Perl manual for last (http://perldoc.perl.org/functions/last.html), last can't be used to break out of do {} loops, but it doesn't mention an alternative. The script I'm maintaining has this structure:
do {
...
if (...)
{
...
last;
}
} while (...);
and I'm pretty sure the original author wants it to go to the end of the loop, but it's actually exiting the current subroutine. So I need to either change the last or refactor the whole loop, if there is a better way that someone can recommend.
Wrap the do "loop" in a bare block (which is a loop):
{
do {
...
if (...)
{
...
last;
}
} while (...);
}
This works for last and redo, but not next; for that place the bare block inside the do block:
do {{
...
if (...)
{
...
next;
}
...
}} while (...);
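Both bare-block placements can be tried out with a small self-contained sketch. The counters and limits below are arbitrary values chosen for illustration only.

```perl
use strict;
use warnings;

my @visited;
my $i = 0;
{    # outer bare block: a loop that runs once, so `last` can target it
    do {
        $i++;
        last if $i == 3;    # leaves the bare block, and the do-while with it
        push @visited, $i;
    } while ($i < 10);
}

my @odd;
my $j = 0;
do {{    # inner bare block: `next` jumps to its end, then the condition is tested
    $j++;
    next if $j % 2 == 0;
    push @odd, $j;
}} while ($j < 5);

print "visited: @visited\n";
print "odd:     @odd\n";
```

The first loop stops early once the counter reaches 3; the second keeps looping but skips the even values.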
do BLOCK while (EXPR) is funny in that do is not really a loop structure. So last, next, and redo are not supposed to be used there. Get rid of the last and adjust the EXPR to evaluate false when that situation is found.
Also, turn on strict and warnings; warnings should report the last that exits the subroutine here.
I was never a fan of do/while loops in Perl. The do isn't really a loop, which is why last won't break out of it. In our old Pascal daze you couldn't exit a loop in the middle because that would be wrong according to the sage Niklaus "One entrance/one exit" Wirth. Therefore, we had to create an exit flag. In Perl it'd look something like this:
my $endFlag = 0;
do {
...
if (...)
{
...
$endFlag = 1;
}
} while ((...) and (not $endFlag));
Now you can see why Pascal never caught on.
Why not just use a while loop?
while (...) {
...
if (...) {
last;
}
}
You might have to change your logic slightly to accommodate the fact that your test is at the beginning instead of end of your loop, but that should be trivial.
By the way, you actually CAN break out of a Pascal loop if you're using Delphi, and Delphi DID catch on for a little while until Microsoft wised up and came out with the .net languages.
From http://perldoc.perl.org/functions/last.html:
last cannot be used to exit a block that returns a value such as eval {} , sub {} or do {} , and should not be used to exit a grep() or map() operation.
So, use a boolean in the 'while()' and set it where you have 'last'...
Late to the party - I've been messing with for(;;) recently. In my rudimentary testing, for conditional expressions A and B, what you want to do with:
do {
last if A;
} while(B);
can be accomplished as:
for(;; B || last) {
last if A;
}
A bit ugly, but perhaps not more so than the other workarounds :) . An example:
my $i=1;
for(;; $i<=3 || last) {
print "$i ";
++$i;
}
Outputs 1 2 3. And you can combine the increment if you want:
my $i=1;
for(;; ++$i, $i<=3 || last) {
print "$i ";
}
(using || because it has higher precedence than ,)

What is the proper way to check if a string is empty in Perl?

I've just been using this code to check if a string is empty:
if ($str == "")
{
# ...
}
And also the same with the not equals operator...
if ($str != "")
{
# ...
}
This seems to work (I think), but I'm not sure it's the correct way, or if there are any unforeseen drawbacks. Something just doesn't feel right about it.
For string comparisons in Perl, use eq or ne:
if ($str eq "")
{
# ...
}
The == and != operators are numeric comparison operators. They will attempt to convert both operands to numbers before comparing them, so any two non-numeric strings compare as equal (both become 0).
See the perlop man page for more information.
Due to the way that strings are stored in Perl, getting the length of a string is optimized.
if (length $str) is a good way of checking that a string is non-empty.
If you're in a situation where you haven't already guarded against undef, then the catch-all for "non-empty" that won't warn is if (defined $str and length $str).
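The defined-and-length check can be wrapped in a small helper and compared against the numeric operator. This is only a sketch; the helper name is_empty and the sample values are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true for undef and for the empty string, false otherwise.
sub is_empty {
    my ($str) = @_;
    return !(defined $str && length $str);
}

# '0' and ' ' are not empty strings, even though '0' is false in boolean context.
my @checks = map { is_empty($_) ? 1 : 0 } (undef, '', '0', ' ', 'abc');
print "@checks\n";

# Numeric == converts both sides to numbers, so 'abc' and '' both become 0.
my ($x, $y) = ('abc', '');
my $numeric_equal;
{
    no warnings 'numeric';    # silence the "isn't numeric" warnings for this demo
    $numeric_equal = ($x == $y) ? 1 : 0;
}
print "numeric comparison says equal: $numeric_equal\n";
```

The last check is the drawback of the original == test: it reports a non-empty string such as 'abc' as equal to the empty string.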
You probably want to use "eq" instead of "==".
If you worry about some edge cases you may also want to check for undefined:
if (not defined $str) {
# this variable is undefined
}
As already mentioned by several people, eq is the right operator here.
If you use warnings; in your script, you'll get warnings about this (and many other useful things); I'd recommend use strict; as well.
The very concept of a "proper" way to do anything, apart from using CPAN, is non existent in Perl.
Anyways those are numeric operators, you should use
if($foo eq "")
or
if(length($foo) == 0)
To check for an empty string you could also do something as follows
if (!defined $val || $val eq '')
{
# empty
}
The rest of the answers are complicating things. A plain truth test is enough, as long as the string "0" (which is also false in Perl) cannot occur:
If filled:
if ($var) {
}
If not filled:
if (! $var) {
}