What if there exists no matched rule in a Lex program because of REJECT?

I'm currently reading the documentation on Lex written by Lesk and Schmidt, and I am confused by the REJECT action.
Consider the two rules
a[bc]+ { ... ; REJECT;}
a[cd]+ { ... ; REJECT;}
Input:
ab
Only the first rule matches. Here is what the documentation says:
The action REJECT means ``go do the next alternative.'' It causes whatever rule was second choice after the current rule to be executed.
However, there is no second choice here. Will an error occur?

There are really very few use cases for REJECT; I don't think I've ever seen an instance of it in use other than in examples.
Anyway, unless you specify %option nodefault (or the -s command-line flag), flex will add a default fallback action to your ruleset, equivalent to
.|\n ECHO;
In your case, that pattern will match after the REJECT.
However, it is possible to override the default action; for example, you could add the rule:
.|\n REJECT;
In that case, flex really will not have an alternative after the two REJECTs, and it will print an error message on stderr ("flex scanner jammed") and then call exit.

How to encode normalized(A,B) properly?

I am using clingo to solve a homework problem and stumbled upon something I can't explain:
normalized(0,0).
normalized(A,1) :-
A != 0.
normalized(10).
As I understand it, the second argument of normalized should be 0 when the first argument is 0, and 1 in every other case.
Running clingo on that, however, produces the following:
test.pl:2:1-3:12: error: unsafe variables in:
normalized(A,1):-[#inc_base];A!=0.
test.pl:2:12-13: note: 'A' is unsafe
Why is A unsafe here?
According to Programming with CLINGO
Some error messages say that the program
has “unsafe variables.” Such a message usually indicates that the head of one of
the rules includes a variable that does not occur in its body; stable models of such
programs may be infinite.
But in this example A is present in the body.
Will clingo produce an infinite set consisting of answers for all numbers here?
I tried adding number(_) around the first parameter and pattern matching on it to avoid this situation but with the same result:
normalized(number(0),0).
normalized(A,1) :-
A=number(B),
B != 0.
normalized(number(10)).
How would I write normalized properly?
"Variables occurring in the body" actually means occurring in a positive literal in the body. I can recommend the official guide: https://github.com/potassco/guide/releases/
Second, ASP is not Prolog. Your rules get grounded, i.e. each first-order variable is replaced with the values of its domain. In your case A has no domain.
What would be the expected outcome of your program?
normalized(12351,1).
normalized(my_mom,1).
would all be valid replacements for A so you create an infinite program. This is why 'A' has to be bounded by a domain. For example:
dom(a). dom(b). dom(c). dom(100).
normalized(0,0).
normalized(A,1) :- dom(A).
would produce
normalized(0,0).
normalized(a,1).
normalized(b,1).
normalized(c,1).
normalized(100,1).
Also note that there is no such thing as number/1. ASP is a typefree language.
Also,
normalized(10).
is a different predicate with only one parameter. I do not know how it is meant to fit into your program.
Maybe you are looking for something like this:
dom(1..100).
normalize(0,0).
normalize(X,1) :- dom(X).
foo(43).
bar(Y) :- normalize(X,Y), foo(X).

Matching packet Content in a specific order with Suricata?

I'm attempting to create a Suricata rule that will match a packet if and only if all content is found and in a specific order.
The problem with my current rule is that it will match even if the packet content is test2 test1.
Is there a way to achieve this functionality without using pcre?
alert tcp $HOME_NET any -> $EXTERNAL_NET [80,443] (msg:"Test Rule"; flow:established,to_server; content:"test1"; fast_pattern; content:"test2"; distance:0; classtype:web-application-activity; sid:5182976; rev:2;)
I figured out that the method I was using to test the Suricata signatures was duplicating the tested data at some point, causing the signature to always fire.
To answer my own question: content order can be enforced by adding a distance modifier to each content match after the first.
As seen in:
content:"one"; content:"two"; distance:0; content:"three"; distance:0; . . .
As far as I can tell, the fast_pattern keyword can be omitted.

How can Perl's Getopt::Long discover arguments with mandatory parameter missing?

In one of my scripts I use the Getopt::Long library. At the beginning of the program I make a call:
&GetOptions ('help', 'debug', 'user=s' => \$GetUser);
The first two arguments are simple: I discover their existence by checking $opt_help and $opt_debug respectively. However, the third argument is tricky, because I need to distinguish between three cases: no option at all ($GetUser is undefined, which is ok for me), "--user" alone ($GetUser is also undefined, but this time I want to display an error message), and "--user FooBar" (where $GetUser receives 'FooBar', which I can use in further processing).
How can I distinguish between using no "--user" option and using it alone, without a username?
You are looking for : instead of =, so 'user:s' => \$GetUser. From the "Options with values" section of the documentation:
Using a colon : instead of the equals sign indicates that the option value is optional. In this case, if no suitable value is supplied, string valued options get an empty string '' assigned, while numeric options are set to 0.
This allows you to legitimately call the program with --user and no value (with = it's an error). Then you only declare my $GetUser; and after the options are processed you can tell what happened. If it is undef it wasn't mentioned, if it is '' (empty string) it was invoked without a value and you can emit your message. This assumes that it being '' isn't of any other use in your program.
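As a minimal sketch of the three cases, the snippet below parses an explicit argument list with GetOptionsFromArray instead of @ARGV so that it can be run as-is. The helper name classify_user and its return strings are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: report how --user was supplied in the given arguments.
sub classify_user {
    my @args = @_;
    my $user;    # stays undef when --user does not appear at all
    GetOptionsFromArray(\@args, 'user:s' => \$user)
        or return 'error';
    return 'absent'  if !defined $user;   # option not mentioned
    return 'missing' if $user eq '';      # --user given without a value
    return "given:$user";                 # --user given with a value
}

print classify_user(), "\n";
print classify_user('--user'), "\n";
print classify_user('--user', 'FooBar'), "\n";
```

In a real script the 'missing' branch is where the custom error message would go.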
Otherwise, when you use 'user=s' and no value is given, GetOptions reports an error by returning false and emits a descriptive message to STDERR. So you may well leave it and do
GetOptions( 'user=s' => ...) or die "Option error\n";
and rely on the module to catch and report wrong use. Our own message above isn't really needed, as the module's messages clearly describe the problem.
One other way of doing this would go along the lines of
usage(), exit if not GetOptions('user=s' => \$GetUser, ...);
sub usage {
# Your usage message, briefly listing options etc.
}
I'd like to add: you don't need & in front of a function call. When used without parentheses it makes the caller's @_ visible to the called function; it also ignores the function prototype and does a few other similarly involved things. One common use is to get a coderef, $rc = \&fun, where it is needed. See for example this post
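A small sketch of the @_ effect, using hypothetical sub names: calling a sub as &name; with no parentheses lets it see the caller's current @_, while a normal call with an empty argument list does not.

```perl
use strict;
use warnings;

# Returns the number of arguments the sub can see in @_.
sub count_args { return scalar @_; }

# &count_args with no parentheses: the callee shares this sub's @_.
sub call_with_amp { return &count_args; }

# Ordinary call with an explicit, empty argument list.
sub call_with_parens { return count_args(); }

print call_with_amp(1, 2, 3), "\n";     # prints 3
print call_with_parens(1, 2, 3), "\n";  # prints 0
```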

Erlang mnesia equivalent of "select * from Tb"

I'm a total erlang noob and I just want to see what's in a particular table I have. I want to just "select *" from a particular table to start with. The examples I'm seeing, such as the official documentation, all have column restrictions which I don't really want. I don't really know how to form the MatchHead or Guard to match anything (aka "*").
A very simple primer on how to just get everything out of a table would be very appreciated!
For example, you can use qlc:
F = fun() ->
Q = qlc:q([R || R <- mnesia:table(foo)]),
qlc:e(Q)
end,
mnesia:transaction(F).
The simplest way to do it is probably mnesia:dirty_match_object:
mnesia:dirty_match_object(foo, #foo{_ = '_'}).
That is, match everything in the table foo that is a foo record, regardless of the values of the fields (every field is '_', i.e. wildcard). Note that since it uses record construction syntax, it will only work in a module where you have included the record definition, or in the shell after evaluating rr(my_module) to make the record definition available.
(I expected mnesia:dirty_match_object(foo, '_') to work, but that fails with a bad_type error.)
To do it with select, call it like this:
mnesia:dirty_select(foo, [{'_', [], ['$_']}]).
Here, MatchHead is '_', i.e. match anything. The guards are [], an empty list, i.e. no extra limitations. The result spec is ['$_'], i.e. return the entire record. For more information about match specs, see the match specifications chapter of the ERTS user guide.
If an expression is too deep and gets printed with ... in the shell, you can ask the shell to print the entire thing by evaluating rp(EXPRESSION). EXPRESSION can either be the function call once again, or v(-1) for the value returned by the previous expression, or v(42) for the value returned by the expression preceded by the shell prompt 42>.

Perl Dancer trailing slash

Using the Perl web application framework Dancer, I am having some problems with trailing slashes in the URL matching.
Say for example, I want to match the following URL, with an optional Id parameter:
get '/users/:id?' => sub
{
#Do something
};
Both /users/morgan and /users/ match, but /users does not, which does not seem very uniform. I would prefer to match only the URLs without the trailing slash:
/users/morgan and /users. How would I achieve that?
Another approach is to use a named sub - all the examples of Dancer code tend to use anonymous subs, but there's nothing that says it has to be anonymous.
get '/users' => \&show_users;
get '/users/:id' => \&show_users;
sub show_users
{
#Do something
}
Note that, due to the way Dancer does the route matching, this is order-dependent and, in my experience, I've had to list the routes with fewer elements first.
With a regex route, id will contain everything after /users/ (any run of non-slash characters); the slash after /users is optional.
get qr{^/users/?(?<id>[^/]+)?$} => sub {
my $captures = captures;
if ( defined $captures->{id} ) {
return sprintf 'the id is: %s', $captures->{id};
}
else {
return 'global user page'
}
};
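The pattern in the route above can be exercised outside Dancer with plain Perl to see which paths it accepts. The sketch below does that and, as an assumption rather than part of the original answer, compares it with a hypothetical stricter variant that requires a slash before the id and rejects a bare trailing slash. The helper match_id is also hypothetical.

```perl
use strict;
use warnings;

# Pattern copied from the route above.
my $route = qr{^/users/?(?<id>[^/]+)?$};

# Hypothetical stricter variant: the id must follow a slash,
# and "/users/" on its own is not accepted.
my $strict = qr{^/users(?:/(?<id>[^/]+))?$};

# Describe how a path matches a pattern.
sub match_id {
    my ($re, $path) = @_;
    return 'no match' unless $path =~ $re;
    return defined $+{id} ? 'id=' . $+{id} : 'no id';
}

for my $path ('/users', '/users/', '/users/morgan', '/usersmorgan') {
    printf "%-14s route: %-10s strict: %s\n",
        $path, match_id($route, $path), match_id($strict, $path);
}
```

The original pattern also accepts /usersmorgan (with id set to morgan), which the stricter variant rejects along with /users/.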
I know this is an old question, but I've recently solved this problem by using a Plack middleware. There are two of them you can choose from depending on whether you prefer URLs with trailing slashes or not:
Plack::Middleware::TrailingSlash
Plack::Middleware::TrailingSlashKiller
Using any of the middleware above should greatly simplify your core Dancer application code and unit tests since you do not need to handle both cases.
In addition, as mentioned by Dave Sherohman, you should definitely arrange your routes with the fewer elements first in order to match those first, especially if you use the TrailingSlash middleware to force trailing slashes.