program exhibiting bizarre behavior when reading words out from a file - perl

So I have two files, one that contains my text, and another which I want to contain filter words. The one shown here is supposed to be the one with the curse words. Basically, what I'm doing is iterating through each of the words in the text file, and trying to compare them against the curse words.
sub filter {
$word_to_check = $_;
open ( FILE2, $ARGV[1]) || die "Something went wrong. \n";
while(<FILE2>) {
@cursewords = split;
foreach $curse (@cursewords) {
print $curse."\n";
if($word_to_check eq $curse) { return "BAD!";}
}
}
close ( FILE2 );
}
Here are the "curse words":
what is
Here is the text file:
hey dude what is up
But here's what's going wrong. As you can see, I've put a print statement to see if the curse words are getting checked correctly.
hey what
is
dude what
is
what what
is
is what
is
up what
is
I literally have no idea why this could be happening. Please let me know if I should post more code.
EDIT:
AHA! thanks evil otto. It seems I was getting confused with another print statement I had put in before. Now the problem remains: I think I'm not checking for string equality correctly. Here's where filter is getting called:
foreach $w ( @text_file_words )
{
if(filter($w) eq "BAD!")
{
#do something here
}
else { print "good!"; }
}
EDIT 2: Nevermind, more stupidity on my part. I need to get some sleep, thanks evil otto.

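Change
$word_to_check = $_;
to
$word_to_check = shift;
Inside a sub, the arguments arrive in @_, not in $_. Called with no operand inside a sub, shift removes and returns the first element of @_.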

You need to collect the arguments from the @_ array in Perl:
sub myFunction{
($wordToCheck) = @_; # @_ is the argument array; with more than one argument, separate the variables between the parentheses with commas.
}
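Putting both answers together, here is a minimal sketch of the corrected sub. The filehandle parameter, the "ok" return value, and the in-memory filehandle standing in for the filter file are assumptions made to keep the example self-contained; they are not part of the original code.

```perl
use strict;
use warnings;

# Returns "BAD!" if $word appears in the filter list read from $fh,
# otherwise "ok" (the "ok" value is an assumption for this sketch).
sub filter {
    my ($word, $fh) = @_;              # arguments arrive in @_, not in $_
    while (my $line = <$fh>) {
        for my $curse (split ' ', $line) {
            return "BAD!" if $word eq $curse;
        }
    }
    return "ok";
}

# An in-memory filehandle stands in for the filter-word file.
my $filter_words = "what is\n";
for my $w (qw(hey dude what is up)) {
    open my $fh, '<', \$filter_words or die "Cannot open filter list: $!";
    print "$w: ", filter($w, $fh), "\n";
    close $fh;
}
```

Reading into a lexical $line instead of $_ also keeps the loop from overwriting the caller's $_.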


How to transform a parenthesized string to a parent-child representation

I'm trying to visualize a formula which is in a single string of this format:
AND(operator1(operator2(x,a),operator3(y,b)),operator4(t))
There's no limit to how many arguments are in an operator.
Currently I'm trying to solve it in Perl. I guess the easiest way is to transform it to a parent-child representation, after which it would be easy to present it in a collapsible way like this:
[-]AND(
[-]operator1(
[-]operator2(
x
a
)
[+]operator3(...)
)
[-]operator4(
t
)
)
I doubt I'm the first one to tackle this, but I can't find any such example online. Thanks!
I did something like this a few months ago and think this might be along the lines of what you are looking for...
c:\Perl>perl StackOverflow.pl
SomeValue
AND
-operator1
--operator2
---x
---a
--operator3
---y
---b
-operator4
--t
c:\Perl>
If so, here's the code (below). It's a bit of a mess, but it should be enough to get you started.
#!c:/perl/bin/perl.exe
my $SourceString="SomeValue AND (operator1(operator2(x,a),operator3(y,b)),operator4(t))";
my $NodeIndex=0;
my $IndexString="-";
print ProcessString($SourceString, $NodeIndex);
exit;
#----- subs go here -----
sub ProcessString {
my $TargetString=shift||return undef;
my $NodeIndex=shift;
my $ReturnString="";
#are we starting with a paren or comma?
if($TargetString=~m/^([\(\)\, ])/) {
#yep, delete the char and pass it through again for further processing
if($1 eq " ") {$ReturnString.=ProcessString(substr($TargetString, index($TargetString, $1)+1), $NodeIndex);}
elsif($1 eq ",") {$ReturnString.=ProcessString(substr($TargetString, index($TargetString, $1)+1), $NodeIndex);}
elsif($1 eq "(") {$ReturnString.=ProcessString(substr($TargetString, index($TargetString, $1)+1), ++$NodeIndex);}
elsif($1 eq ")") {$ReturnString.=ProcessString(substr($TargetString, index($TargetString, $1)+1), --$NodeIndex);}
} else {
#nope, must be a keyword or the end
if($TargetString=~m/([\(\)\, ])/) {
my $KeywordString=substr($TargetString, 0, index($TargetString, $1));
$ReturnString.=$IndexString x $NodeIndex;
$ReturnString.=$KeywordString."\n";
$ReturnString.=ProcessString(substr($TargetString, index($TargetString, $1)), $NodeIndex);
} else {
#we should never be here
} #end keyword check if
} #end if
#send the string back
return $ReturnString;
} #end sub
__END__
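If an actual parent-child structure is wanted rather than printed text, a small recursive-descent parser is one option. The sketch below is not the code above: parse_node, render, and the hash layout (name plus children) are hypothetical names chosen for illustration, and it assumes well-formed input of the form name(arg,arg,...).

```perl
use strict;
use warnings;

# Parse "name(arg,arg,...)" into { name => ..., children => [ ... ] }.
# $src is a reference to the string; pos() on that string tracks progress.
sub parse_node {
    my ($src) = @_;
    $$src =~ /\G\s*([^(),\s]+)\s*/gc or die "expected a name";
    my $node = { name => $1, children => [] };
    if ($$src =~ /\G\(/gc) {
        # Keep reading children until the matching close paren.
        until ($$src =~ /\G\s*\)/gc) {
            push @{ $node->{children} }, parse_node($src);
            $$src =~ /\G\s*,/gc;       # consume a separator if present
        }
    }
    return $node;
}

# Render the tree with one dash per nesting level.
sub render {
    my ($node, $depth) = @_;
    my $out = ('-' x $depth) . $node->{name} . "\n";
    $out .= render($_, $depth + 1) for @{ $node->{children} };
    return $out;
}

my $formula = "AND(operator1(operator2(x,a),operator3(y,b)),operator4(t))";
my $tree    = parse_node(\$formula);
print render($tree, 0);
```

Because each node keeps its children in an array, a collapsible view only needs a per-node flag that tells the renderer to skip the children.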

Extracting a single file from a huge archive using Perl

I'm trying to extract a single file from a large ".tgz" file. I'm using the Archive::Tar::Streamed module.
Here is the sample code.
my $tar2 = Archive::Tar::Streamed->new($filename);
$fil = $tar2->next;
while($fil) {
$_ = $fil->name;
if(m/abc\.txt/g) {
$fil->extract($outpath);
$fil = $tar2->next;
}
}
But the iterator is not working. It keeps looping on the first file in the archive and never moves to the next one.
Can someone tell me what mistake I've made here?
You put the call to next inside your if, so it's only executed if you extracted the file. There's nothing that modifies $fil inside the loop if the file is not extracted.
You can simplify your code quite a bit by just calling the iterator in the condition of the while loop. Also, you can use the =~ binding operator instead of storing the name in $_. And you do not want the /g regex modifier here. In scalar context, you use /g to loop through multiple matches in a string. Here, all you want is to know whether the string contains a match.
my $tar2 = Archive::Tar::Streamed->new($filename);
while(my $fil = $tar2->next) {
if($fil->name =~ m/abc\.txt/) {
$fil->extract($outpath);
}
}
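The difference the /g modifier makes in scalar context can be seen without an archive. This is a small sketch on a made-up string, not part of the original answer:

```perl
use strict;
use warnings;

my $name = "abc.txt and abc.txt again";

# Scalar context with /g: each successful match resumes where the
# previous one ended, so the loop visits every occurrence.
my $count = 0;
$count++ while $name =~ /abc\.txt/g;
print "matches with /g: $count\n";

# Without /g the match is a plain yes/no test from the start of the string.
my $found = ($name =~ /abc\.txt/) ? 1 : 0;
print "contains a match: $found\n";
```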

Perl Ending loops/code blocks based on user input (!die/exit)

I have a general question that came up while I was playing around with some code. Is there any way to terminate a specific part of the program based on user input (I apologize if I am misusing terminology here) other than die(), since that ends the entire program?
here's the code:
if ($choice eq 'y'){
print "\nHit diagnostics: \n";
{
my $hitList=@hitList;
for (my $i=0; $i<$hitList; $i++){
print $hitList[$i]."\n";
#segmented listout of misses with interrupt
if(($i%4) eq 0){
print "CONTINUE or Q to end\n";
my $next=<>;
chomp($next);
if(lc($next) eq 'q'){
die "Killing request...\n";
}
}
}
}
So basically I want the user to be able to stop the listing at one of the every-fourth-item prompts if they decide they don't want to see the entire list, while the rest of the program still continues (there is a miss prompt afterwards as well).
Is the best way to do this just to use a variable as a 'switch' to determine whether or not the hit list should continue? Just wondering if there is a more acceptable/elegant solution.
for my $i (0..$hitList-1) {
...
if (...) {
last;
}
...
}
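last exits the innermost enclosing loop immediately; the program then continues with the statement after the loop.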
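Applied to the hit list, the idea can be sketched as follows. The show_hits sub and its list of canned answers are hypothetical; the canned answers replace reading from <> so the example runs without user input.

```perl
use strict;
use warnings;

# Collects hits until the user answers 'q' at one of the prompts.
# $answers is a list of canned replies used instead of <> (an assumption).
sub show_hits {
    my ($hits, $answers) = @_;
    my @shown;
    for my $i (0 .. $#$hits) {
        push @shown, $hits->[$i];
        if ($i % 4 == 0) {                  # numeric comparison, so == rather than eq
            my $next = shift(@$answers) // '';
            last if lc($next) eq 'q';       # leaves only this loop
        }
    }
    return @shown;
}

my @shown = show_hits([qw(h0 h1 h2 h3 h4 h5 h6)], ['', 'q']);
print "shown: @shown\n";
print "program continues after the loop\n";
```

No switch variable is needed, because control falls through to whatever follows the loop.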

(m/regexp/) or {multiple ; commands; after; or; }

I like very much this syntax:
try_something() or warn "Cant do it";
How can I add more commands after or?
For example it would be useful in this code:
foreach (@array)
{
m/regex/ or {warn "Does not match"; next;} # this syntax is wrong
...
}
One way I found is
try_something() or eval {warn "Can't do it"; next;};
but I think it is a bad idea.
BEST ANSWERS:
do is better than eval.
The comma operator is even better: do_smth() or warn("Does not match"), next; Nota bene: parentheses are mandatory for warn so that next does not parse as one of its arguments.
I think that will end up being pretty unreadable pretty fast, but you can do:
foo() or do { bar(); baz(); };
sub foo {
return $_[0] == 2;
}
for (1..3) {
print $_;
foo($_) or do { print " !foo\n"; next; };
print " foo!\n";
}
For the case in your question, I would use unless.
for (@array) {
unless (/regex/) {
warn "Does not match";
next;
}
...
}
You can sometimes get away with using the comma operator. It evaluates its left-hand argument, throws away the result, evaluates the right-hand argument and returns that result. Applied to your situation it looks like
for (@array) {
/regex/ or warn("Does not match"), next;
...
}
Note the extra parentheses. You have to be a bit more careful about parentheses and grouping this way. Be judicious in your use of this technique: it can get ugly quickly.
In a comment below, Zaid suggests
warn('Does not match'), next unless /regex/;
The choice is a matter of style. Perl was created by a linguist. Natural languages allow us to express the same thought in different ways depending on which part we want to emphasize. In your case, do you want to emphasize the warning or the pattern match? Place the more important code out front.
I figured out (and tested) that you can also use 'and':
try_something() or warn "Cant do it" and print "Hmm." and next;
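If try_something() succeeds, nothing after or is executed.
If try_something() fails, it warns, prints, and then runs next. This relies on warn and print each returning a true value, because and stops at the first false operand.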
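An and chain only reaches its last step while every earlier step returns true. The sketch below uses a hypothetical helper, note(), that returns false, to show how the chain differs from a do block in that case:

```perl
use strict;
use warnings;

my @log;
sub note { push @log, $_[0]; return 0 }     # hypothetical helper that returns false

my @chained;
for my $n (1 .. 3) {
    # Parsed as: ($n == 2) or (note(...) and next)
    $n == 2 or note("skip $n") and next;     # next is never reached: note() is false
    push @chained, $n;
}

my @blocked;
for my $n (1 .. 3) {
    $n == 2 or do { note("skip $n"); next }; # next always runs when the test fails
    push @blocked, $n;
}

print "and chain kept: @chained\n";
print "do block kept: @blocked\n";
```

With the and chain, every element is kept because next never runs; with the do block, only the matching element is kept.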

How could I redefine a subroutine and keep the old one too?

Here's what I'd like to achieve:
sub first {
print "this is original first";
}
*original_first = \&first;
sub first {
print "this is first redefined";
}
original_first(); # I expect this to print "this is original first"
first(); # I expect this to print "this is first redefined"
I thought that by saving the symbol for first, I'd be able to later call the original subroutine ( under the name original_first ) and to also be able to call first, and get the one redefined. However, if I call the original_first, I still get the "this is first redefined". What do I have to do to make this work?
This should work as you expect:
sub first {
print "this is original first";
}
*original_first = \&first;
*first = sub {
print "this is first redefined";
};
In your code, Perl interprets both sub declarations similarly to this:
BEGIN {
*first = sub { ... }
}
So both assignments to &first happen at compile time, before the copy is saved and before either routine is called. The fix is to turn the second declaration into a runtime assignment:
sub first {
print "this is original first";
}
*original_first = \&first;
*first = sub {print "this is first redefined"};
original_first(); # prints "this is original first"
first(); # prints "this is first redefined"
See the Hook::LexWrap module, which can handle all of that for you. If you don't want to use the module, just look at the source, which shows you exactly how to do it.
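Without a module, the same runtime glob assignment can also wrap the original instead of merely replacing it. This is a minimal sketch under strict and warnings; the "wrapped: " prefix is made up for illustration, and the subs return strings rather than print so the result is easy to inspect.

```perl
use strict;
use warnings;

sub first { return "this is original first" }

# Save a reference to the original before replacing it.
my $original = \&first;

{
    # Silence the "Subroutine first redefined" warning for this assignment only.
    no warnings 'redefine';
    *first = sub { return "wrapped: " . $original->() };
}

print $original->(), "\n";   # the saved original
print first(), "\n";         # the runtime replacement, which calls the original
```

Keeping the original in a lexical code reference avoids adding a second named sub to the symbol table.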