I have the following Cython function.
 01:
+02: cdef int count_char_in_x(unicode x,Py_UCS4 c):
 03:     cdef:
+04:         int count = 0
 05:         Py_UCS4 x_k
 06:
+07:     for x_k in x: ## Yellow
+08:         if x_k == c:
+09:             count+=1
 10:
+11:     return count
Line 07 is not properly optimized.
The annotated HTML expands that line to:
+07: for x_k in x: ## Yellow
  if (unlikely(__pyx_v_x == Py_None)) {
    PyErr_SetString(PyExc_TypeError, "'NoneType' is not iterable");
    __PYX_ERR(0, 8, __pyx_L1_error)
  }
  __Pyx_INCREF(__pyx_v_x);
  __pyx_t_1 = __pyx_v_x;
  __pyx_t_6 = __Pyx_init_unicode_iteration(__pyx_t_1, (&__pyx_t_3), (&__pyx_t_4), (&__pyx_t_5)); if (unlikely(__pyx_t_6 == ((int)-1))) __PYX_ERR(0, 8, __pyx_L1_error)
  for (__pyx_t_7 = 0; __pyx_t_7 < __pyx_t_3; __pyx_t_7++) {
    __pyx_t_2 = __pyx_t_7;
    __pyx_v_x_k = __Pyx_PyUnicode_READ(__pyx_t_5, __pyx_t_4, __pyx_t_2);
Any tips on how this could be improved?
I think it is possible to write a cdef/cpdef function that completely avoids Python None type checks at runtime. Any idea on how this could be done?
The generated C code looks pretty good to me. The loop overall is an int-iterated for loop (i.e. it's not relying on calling the Python methods __iter__ and __next__).
__Pyx_PyUnicode_READ is translated pretty directly to PyUnicode_READ (depending slightly on the Python version you're using). PyUnicode_READ is a C macro which is as close to a direct array access as you can get.
This is probably as good as it gets. You might get a small improvement by using bytes rather than unicode (provided you're dealing with ASCII characters). You might just consider whether it's really worth reimplementing unicode.count.
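For a sense of what that alternative looks like, a minimal plain-Python sketch (the string here is made up):

# str.count / unicode.count already does this scan in C:
text = u"abracadabra"
print(text.count(u"a"))   # 5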
If it were a regular def function you could declare x as unicode not None to remove the None check before the loop. That might make a small difference. However, as @ead points out, that isn't supported for cdef functions. It's likely the cost of a def function call will be slightly larger than the cost of a None-check, but you should time it if you care.
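If the def route is acceptable, a minimal sketch of that approach (untested; the wrapper name is made up) could be:

def count_char_in_x_py(unicode x not None, Py_UCS4 c):
    # `not None` removes the None check; the loop body is unchanged from above.
    cdef int count = 0
    cdef Py_UCS4 x_k
    for x_k in x:
        if x_k == c:
            count += 1
    return count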
In Scala, pattern matching has guard patterns:
val ch = 23
val sign = ch match {
  case _: Int if 10 < ch => 65
  case '+' => 1
  case '-' => -1
  case _ => 0
}
Is the Raku version like this?
my $ch = 23;
given $ch {
    when Int and * > 10 { say 65}
    when '+' { say 1 }
    when '-' { say -1 }
    default { say 0 }
}
Is this right?
Update: as jjmerelo suggested, I'm posting my result below; the signature version is also interesting.
multi washing_machine(Int \x where * > 10 ) { 65 }
multi washing_machine(Str \x where '+' ) { 1 }
multi washing_machine(Str \x where '-' ) { -1 }
multi washing_machine(\x) { 0 }
say washing_machine(12); # 65
say washing_machine(-12); # 0
say washing_machine('+'); # 1
say washing_machine('-'); # -1
say washing_machine('12'); # 0
say washing_machine('洗衣机'); # 0
TL;DR I've written another answer that focuses on using when. This answer focuses on an alternative that combines Signatures, Raku's powerful pattern matching construct, with a where clause.
"Does pattern match in Raku have guard clause?"
Based on what little I know about Scala, some/most Scala pattern matching actually corresponds to using Raku signatures. (And guard clauses in that context are typically where clauses.)
Quoting Martin Odersky, Scala's creator, from The Point of Pattern Matching in Scala:
instead of just matching numbers, which is what switch statements do, you match what are essentially the creation forms of objects
Raku signatures cover several use cases (yay, puns). These include the Raku equivalent of the functional programming paradigmatic use in which one matches values' or functions' type signatures (cf Haskell) and the object oriented programming paradigmatic use in which one matches against nested data/objects and pulls out desired bits (cf Scala).
Consider this Raku code:
class body { has ( $.head, @.arms, @.legs ) }    # Declare a class (object structure).
class person { has ( $.mom, $.body, $.age ) }    # And another that includes first.
multi person's-age-and-legs           # Declare a function that matches ...
  ( person                            # ... a person ...
      ( :$age where * > 40,           # ... whose age is over 40 ...
        :$body ( :@legs, *% ),        # ... noting their body's legs ...
        *% ) )                        # ... and ignoring other attributes.
  { say "$age {+@legs}" }             # Display age and number of legs.
my $age = 42;                         # Let's demo handy :$var syntax below.
person's-age-and-legs                 # Call function declared above ...
  person                              # ... passing a person.
    .new:                             # Explicitly construct ...
      :$age,                          # ... a middle aged ...
      body => body.new:
        :head,
        :2arms,
        legs => <left middle right>   # ... three legged person.
# Displays "42 3"
Notice that there's a close equivalent to a Scala pattern matching guard clause in the above -- where * > 40. (This can be nicely bundled up into a subset type, as sketched below.)
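For example, a minimal sketch of such a subset (the name OverForty is made up here):

subset OverForty of Int where * > 40;
# ... which the signature could then use as:  :$age where OverForty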
We could define other multis that correspond to different cases, perhaps pulling out the "names" of the person's legs ('left', 'middle', etc.) if their mom's name matches a particular regex or whatever -- you hopefully get the picture.
A default case (multi) that doesn't bother to deconstruct the person could be:
multi person's-age-and-legs (|otherwise)
{ say "let's not deconstruct this person" }
(In the above we've prefixed a parameter in a signature with | to slurp up all remaining structure/arguments passed to a multi. Given that we do nothing with that slurped structure/data, we could have written just (|).)
Unfortunately, I don't think signature deconstruction is mentioned in the official docs. Someone could write a book about Raku signatures. (Literally. Which of course is a great way -- the only way, even -- to write stuff. My favorite article that unpacks a bit of the power of Raku signatures is Pattern Matching and Unpacking from 2013 by Moritz. Who has authored Raku books. Here's hoping.)
Scala's match/case and Raku's given/when seem simpler
Indeed.
As @jjmerelo points out in the comments, using signatures means there's a multi foo (...) { ...} for each and every case, which is much heavier syntactically than case ... => ....
In mitigation:
Simpler cases can just use given/when, just like you wrote in the body of your question;
Raku will presumably one day get non-experimental macros that can be used to implement a construct that looks much closer to Scala's match/case construct, eliding the repeated multi foo (...)s.
From what I see in this answer, that's not really an implementation of a guard pattern in the same sense Haskell has them. However, Perl 6 does have guards in the same sense Scala has: using default patterns combined with ifs.
The Haskell to Perl 6 guide does have a section on guards. It hints at the use of where as guards; so that might answer your question.
TL;DR You've encountered what I'd call a WTF?!?: when Type and ... fails to check the and clause. This answer talks about what's wrong with the when and how to fix it. I've written another answer that focuses on using where with a signature.
If you want to stick with when, I suggest this:
when (condition when Type) { ... } # General form
when (* > 10 when Int) { ... } # For your specific example
This is (imo) unsatisfactory, but it does first check the Type as a guard, and then the condition if the guard passes, and works as expected.
"Is this right?"
No.
given $ch {
    when Int and * > 10 { say 65}
}
This code says 65 for any given integer, not just one over 10!
WTF?!? Imo we should mention this on Raku's trap page.
We should also consider filing an issue to make Rakudo catch a when construct that starts with a compile-time constant value that's a type object and continues with and (or &&, andthen, etc.). It could either fail at compile time or display a warning.
Here's the best option I've been able to come up with:
when (* > 10 when Int) { say 65 }
This takes advantage of the statement modifier (aka postfix) form of when inside the parens. The Int is checked before the * > 10.
This was inspired by Brad++'s new answer which looks nice if you're writing multiple conditions against a single guard clause.
I think my variant is nicer than the other options I've come up with in previous versions of this answer, but still unsatisfactory inasmuch as I don't like the Int coming after the condition.
Ultimately, especially if/when RakuAST lands, I think we will experiment with new pattern matching forms. Hopefully we'll come up with something that eliminates this wart nicely.
Really? What's going on?
We can begin to see the underlying problem with this code:
.say for ('TrueA' and 'TrueB'),
         ('TrueB' and 'TrueA'),
         (Int and 42),
         (42 and Int)
displays:
TrueB
TrueA
(Int)
(Int)
The and construct boolean evaluates its left hand argument. If that evaluates to False, it returns it, otherwise it returns its right hand argument.
In the first line, 'TrueA' boolean evaluates to True so the first line returns the right hand argument 'TrueB'.
In the second line 'TrueB' evaluates to True so the and returns its right hand argument, in this case 'TrueA'.
But what happens in the third line? Well, Int is a type object. Type objects boolean evaluate to False! So the and duly returns its left hand argument which is Int (which the .say then displays as (Int)).
This is the root of the problem.
(To continue to the bitter end, the compiler evaluates the expression Int and * > 10; immediately returns the left hand side argument to and which is Int; then successfully matches that Int against whatever integer is given -- completely ignoring the code that looks like a guard clause (the and ... bit).)
If you were using such an expression as the condition of, say, an if statement, the Int would boolean evaluate to False and you'd get a false negative. Here you're using a when which uses .ACCEPTS which leads to a false positive (it is an integer but it's any integer, disregarding the supposed guard clause). This problem quite plausibly belongs on the traps page.
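To illustrate the if case concretely (a minimal sketch; the value is made up):

my $ch = 23;
if Int and $ch > 10 { say 65 }   # prints nothing: `Int and ...` returns Int, which boolifies to False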
Years ago I wrote a comment mentioning that you had to be more explicit about matching against $_ like this:
my $ch = 23;
given $ch {
    when $_ ~~ Int and $_ > 10 { say 65}
    when '+' { say 1 }
    when '-' { say -1 }
    default { say 0 }
}
After coming back to this question, I realized there was another way.
when can safely be inside of another when construct.
my $ch = 23;
given $ch {
    when Int:D {
        when $_ > 10 { say 65}
        proceed
    }
    when '+' { say 1 }
    when '-' { say -1 }
    default { say 0 }
}
Note that the inner when will succeed out of the outer one, which will succeed out of the given block.
If the inner when doesn't match we want to proceed on to the outer when checks and default, so we call proceed.
This means that we can also group multiple when statements inside of the Int case, saving having to do repeated type checks. It also means that those inner when checks don't happen at all if we aren't testing an Int value.
when Int:D {
    when $_ < 10 { say 5 }
    when 10 { say 10}
    when $_ > 10 { say 65}
}
I'm trying to write a DAC macro that gets as input the name of a list of bits and its size, and the name of an integer variable. Every element in the list should be constrained to be equal to the corresponding bit of the variable (both have the same length). For example, for a list named list_of_bits and a variable named foo, both of length 4, the macro's output should be:
keep list_of_bits[0] == foo[0:0];
keep list_of_bits[1] == foo[1:1];
keep list_of_bits[2] == foo[2:2];
keep list_of_bits[3] == foo[3:3];
My macro's code is:
define <keep_all_bits'exp> "keep_all_bits <list_size'exp> <num'name> <list_name'name>" as computed {
    for i from 0 to (<list_size'exp> - 1) do {
        result = appendf("%s keep %s[%d] == %s[%d:%d];",result, <list_name'name>, index, <num'name>, index, index);
    };
};
The error I get:
*** Error: The type of '<list_size'exp>' is 'string', while expecting a
numeric type
...
for i from 0 to (<list_size'exp> - 1) do {
Why does it interpret <list_size'exp> as a string?
Thank you for your help
All macro arguments in DAC macros are considered strings (except repetitions, which are considered lists of strings).
The point is that a macro treats its input purely syntactically, and it has no semantic information about the arguments. For example, in the case of an expression (<exp>), the macro is unable to actually evaluate the expression and compute its value at compilation time, or even to figure out its type. This information is figured out at later compilation phases.
In your case, I would assume that the size is always a constant. So, first of all, you can use <num> instead of <exp> for that macro argument, and use as_a() to convert it to the actual number. The difference between <exp> and <num> is that <num> allows only constant numbers and not any expressions; but it's still treated as a string inside the macro.
Another important point: your macro itself should be a <struct_member> macro rather than an <exp> macro, because this construct itself is a struct member (namely, a constraint) and not an expression.
And one more thing: to ensure that the list size will be exactly as needed, add another constraint for the list size.
So, the improved macro can look like this:
define <keep_all_bits'struct_member> "keep_all_bits <list_size'num> <num'name> <list_name'name>" as computed {
    result = appendf("keep %s.size() == %s;", <list_name'name>, <list_size'num>);
    for i from 0 to (<list_size'num>.as_a(int) - 1) do {
        result = appendf("%s keep %s[%d] == %s[%d:%d];",result, <list_name'name>, i, <num'name>, i, i);
    };
};
Why not write it without a macro?
keep for each in list_of_bits {
    it == foo[index:index];
};
This should do the same, but it is more readable and easier to debug; the generation engine might also take advantage of the more concise constraint.
I'm new here. After reading through how to ask and format, I hope this will be an OK question. I'm not very skilled in Perl, but it is the programming language I know best.
I'm trying to apply Perl to real life, but I didn't get much understanding - especially not from my wife. I tell her that:
if she didn't bring me 3 beers in the evening, that means I got zero (or no) beers.
As you probably guessed, without much success. :(
Now factually. From perlop:
Unary "!" performs logical negation, that is, "not".
Languages that have boolean types (which can have only two "values") are OK:
if it is not the one value -> it must be the other one.
so naturally:
!true -> false
!false -> true
But Perl doesn't have boolean variables - it has only a truth system, where everything that is not 0, '0', undef, or '' is TRUE. The problem comes when applying logical negation to a non-logical value, e.g. numbers.
E.g. if some number IS NOT 3, that means it IS ZERO or empty, instead of the real-life meaning, where if something is NOT 3 it can be anything but 3 (e.g. zero too).
So the following code:
use 5.014;
use strictures;
my $not_3beers = !3;
say defined($not_3beers) ? "defined, value>$not_3beers<" : "undefined";
say $not_3beers ? "TRUE" : "FALSE";
my $not_4beers = !4;
printf qq{What is not 3 nor 4 mean: They're same value: %d!\n}, $not_3beers if( $not_3beers == $not_4beers );
say qq(What is not 3 nor 4 mean: @{[ $not_3beers ? "some bears" : "no bears" ]}!) if( $not_3beers eq $not_4beers );
say ' $not_3beers>', $not_3beers, "<";
say '-$not_3beers>', -$not_3beers, "<";
say '+$not_3beers>', -$not_3beers, "<";
prints:
defined, value><
FALSE
What is not 3 nor 4 mean: They're same value: 0!
What is not 3 nor 4 mean: no bears!
$not_3beers><
-$not_3beers>0<
+$not_3beers>0<
Moreover:
perl -E 'say !!4'
So not not 4 IS 1, instead of 4!
The above statements about my wife are "false" (meaning 0) :), but I'm really trying to teach my son Perl, and after a while he asked my wife: why, if something is not 3, does that mean it is 0?
So the questions are:
how to explain this to my son
why Perl has this design, i.e. why !0 is always 1
Is there something "behind" it that requires !0 to be 1 rather than some random number?
as I already said, I don't know other languages well - is !3 == 0 in every language?
I think you are focusing too much on negation and too little on what Perl booleans mean.
Historical/Implementation Perspective
What is truth? The detection of a voltage higher than x volts.
On a higher abstraction level: If this bit here is set.
The abstraction of a sequence of bits can be considered an integer. Is this integer false? Yes, if no bit is set, i.e. the integer is zero.
A hardware-oriented language will likely use this definition of truth, e.g. C and all C descendants, including Perl.
The negation of 0 could be bitwise negation (all bits are flipped to 1), or we could just set the last bit to 1. The results would usually be decoded as the integers -1 and 1 respectively, but the latter is more energy efficient.
Pragmatic Perspective
It is convenient to think of all numbers but zero as true when we deal with counts:
my $wordcount = ...;
if ($wordcount) {
    say "We found $wordcount words";
} else {
    say "There were no words";
}
or
say "The array is empty" unless #array; # notice scalar context
A pragmatic language like Perl will likely consider zero to be false.
Mathematical Perspective
There is no reason for any number to be false, every number is a well-defined entity. Truth or falseness emerges solely through predicates, expressions which can be true or false. Only this truth value can be negated. E.g.
¬(x ≤ y) where x = 2, y = 3
is false. Many languages which have a strong foundation in maths won't consider anything false but a special false value. In Lisps, '() or nil is usually false, but 0 will usually be true. That is, a value is only true if it is not nil!
In such mathematical languages, !3 == 0 is likely a type error.
Re: Beers
Beers are good. Any number of beers are good, as long as you have one:
my $beers = ...;
if (not $beers) {
    say "Another one!";
} else {
    say "Aaah, this is good.";
}
Boolification of a beer-counting variable just tells us if you have any beers. Consider !! to be a boolification operator:
my $enough_beer = !! $beers;
The boolification doesn't concern itself with the exact amount. But maybe any number ≥ 3 is good. Then:
my $enough_beer = ($beers >= 3);
The negation is not enough beer:
my $not_enough_beer = not($beers >= 3);
or
my $not_enough_beer = not $beers;
fetch_beer() if $not_enough_beer;
Sets
A Perl scalar does not symbolize a whole universe of things. Especially, not 3 is not the set of all entities that are not three. Is the expression 3 a truthy value? Yes. Therefore, not 3 is a falsey value.
The suggested behaviour, that 4 == not 3 should be true, is likely undesirable: 4 and "all things that are not three" are not equal; the four is just one of many things that are not three. We should write it correctly:
4 != 3 # four is not equal to three
or
not( 4 == 3 ) # the same
It might help to think of ! and not as logical-negation-of, but not as except.
How to teach
It might be worth introducing mathematical predicates: expressions which can be true or false. If we only ever “create” truthness by explicit tests, e.g. length($str) > 0, then your issues don't arise. We can name the results: my $predicate = (1 < 2), but we can decide to never print them out, instead: print $predicate ? "True" : "False". This sidesteps the problem of considering special representations of true or false.
Considering values to be true/false directly would then only be a shortcut, e.g. foo if $x can be considered a shortcut for
foo if defined $x and length($x) > 0 and $x != 0;
Perl is all about shortcuts.
Teaching these shortcuts, and the various contexts of Perl and where they turn up (numeric/string/boolean operators), could be helpful; a small sketch follows the list below.
List Context
Even-sized List Context
Scalar Context
Numeric Context
String Context
Boolean Context
Void Context
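A small sketch of a few of those contexts (plain Perl; the data is made up):

use strict;
use warnings;
use feature 'say';

my @words = ('a', 'b', 'c');
my $count = @words;            # scalar (numeric) context: 3
say "have words" if @words;    # boolean context: true, because the count is non-zero
say "@words";                  # string (interpolation) context: "a b c"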
as I already said, I don't know other languages well - is !3 == 0 in every language?
Yes. In C (and thus C++), it's the same.
#include <stdio.h>

int main(void) {
    int i = 3;
    int n = !i;
    int nn = !n;
    printf("!3=%i ; !!3=%i\n", n, nn);
    return 0;
}
Prints (see http://codepad.org/vOkOWcbU )
!3=0 ; !!3=1
how to explain this to my son
Very simple. !3 means "the opposite of some non-false value", which is of course false. This is called "context": in the Boolean context imposed by the negation operator, "3" is NOT a number, it's a statement of true/false.
The result is also not a "zero" but merely something that's convenient Perl representation of false - which turns into a zero if used in a numeric context (but an empty string if used in a string context - see the difference between 0 + !3 and !3 . "a")
The Boolean context is just a special kind of scalar context where no conversion to a string or a number is ever performed. (perldoc perldata)
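A quick illustration of that numeric/string difference (results are in the comments):

use feature 'say';
say 0 + !3;      # 0 - false used in numeric context
say !3 . "a";    # a - false used in string context is the empty string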
why Perl has this design, i.e. why !0 is always 1
See above. Among other likely reasons (though I don't know if that was Larry's main reason), C has the same logic and Perl took a lot of its syntax and ideas from C.
For a VERY good look at the underlying technical detail, see the answers here: "What do Perl functions that return Boolean actually return" and here: "Why does Perl use the empty string to represent the boolean false value?"
Is there something "behind" it that requires !0 to be 1 rather than some random number?
Nothing aside from simplicity of implementation. It's easier to produce a "1" than a random number.
If you're asking the different question of "why is it 1 instead of the original number that was negated to get 0", the answer to that is simple: by the time the Perl interpreter gets to negate that zero, it no longer knows/remembers that the zero was a result of "!3" as opposed to some other expression that resulted in a value of zero/false.
If you want to test that a number is not 3, then use this:
$my_variable != 3;
With the syntax !3, since ! is a boolean operator, 3 is first converted into a boolean (even though Perl may not have an official boolean type, it still works this way); since it is non-zero, it converts to the equivalent of true. Then !true yields false, which, when converted back in an integer context, gives 0. Continuing with that logic shows how !!3 converts 3 to true, inverts it to false, then inverts it again back to true; if that value is used in an integer context, it gets converted to 1. This is true of most modern programming languages (although maybe not some of the more logic-centered ones), though the exact syntax may vary depending on the language...
Logically negating a false value requires some value be chosen to represent the resulting true value. "1" is as good a choice as any. I would say it is not important which value is returned (or conversely, it is important that you not rely on any particular true value being returned).
I've finally decided to put the sort.data.frame method that's floating around the internet into an R package. It just gets requested too much to be left to an ad hoc method of distribution.
However, it's written with arguments that make it incompatible with the generic sort function:
sort(x,decreasing,...)
sort.data.frame(form,dat)
If I change sort.data.frame to take decreasing as an argument, as in sort.data.frame(form,decreasing,dat), and discard decreasing, then it loses its simplicity because you'll always have to specify dat= and can't really use positional arguments. If I add it to the end, as in sort.data.frame(form,dat,decreasing), then the order doesn't match the generic function. If I hope that decreasing gets caught up in the dots, as in sort.data.frame(form,dat,...), then when using position-based matching I believe the generic function will assign the second position to decreasing and it will get discarded. What's the best way to harmonize these two functions?
The full function is:
# Sort a data frame
sort.data.frame <- function(form, dat){
  # Author: Kevin Wright
  # http://tolstoy.newcastle.edu.au/R/help/04/09/4300.html
  # Some ideas from Andy Liaw
  # http://tolstoy.newcastle.edu.au/R/help/04/07/1076.html
  # Use + for ascending, - for descending.
  # Sorting is left to right in the formula
  # Usage is either of the following:
  # sort.data.frame(~Block-Variety,Oats)
  # sort.data.frame(Oats,~-Variety+Block)

  # If dat is the formula, then switch form and dat
  if(inherits(dat,"formula")){
    f=dat
    dat=form
    form=f
  }
  if(form[[1]] != "~") {
    stop("Formula must be one-sided.")
  }

  # Make the formula into character and remove spaces
  formc <- as.character(form[2])
  formc <- gsub(" ","",formc)
  # If the first character is not + or -, add +
  if(!is.element(substring(formc,1,1),c("+","-"))) {
    formc <- paste("+",formc,sep="")
  }

  # Extract the variables from the formula
  vars <- unlist(strsplit(formc, "[\\+\\-]"))
  vars <- vars[vars!=""] # Remove spurious "" terms

  # Build a list of arguments to pass to "order" function
  calllist <- list()
  pos=1 # Position of + or -
  for(i in 1:length(vars)){
    varsign <- substring(formc,pos,pos)
    pos <- pos+1+nchar(vars[i])
    if(is.factor(dat[,vars[i]])){
      if(varsign=="-")
        calllist[[i]] <- -rank(dat[,vars[i]])
      else
        calllist[[i]] <- rank(dat[,vars[i]])
    } else {
      if(varsign=="-")
        calllist[[i]] <- -dat[,vars[i]]
      else
        calllist[[i]] <- dat[,vars[i]]
    }
  }
  dat[do.call("order",calllist),]
}
Example:
library(datasets)
sort.data.frame(~len+dose,ToothGrowth)
Use the arrange function in plyr. It allows you to individually pick which variables should be in ascending and descending order:
arrange(ToothGrowth, len, dose)
arrange(ToothGrowth, desc(len), dose)
arrange(ToothGrowth, len, desc(dose))
arrange(ToothGrowth, desc(len), desc(dose))
It also has an elegant implementation:
arrange <- function (df, ...) {
  ord <- eval(substitute(order(...)), df, parent.frame())
  unrowname(df[ord, ])
}
And desc is just an ordinary function:
desc <- function (x) -xtfrm(x)
Reading the help for xtfrm is highly recommended if you're writing this sort of function.
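A tiny illustration of what xtfrm buys you (the factor values are made up):

f <- factor(c("low", "high", "mid"), levels = c("low", "mid", "high"))
xtfrm(f)             # 1 3 2 -- numbers that sort the same way as f
f[order(-xtfrm(f))]  # high mid low -- negating the xtfrm values reverses the order, as desc() does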
There are a few problems there. sort.data.frame needs to have the same arguments as the generic, so at a minimum it needs to be
sort.data.frame <- function(x, decreasing = FALSE, ...) {
  ....
}
To have dispatch work, the first argument needs to be the object dispatched on. So I would start with:
sort.data.frame <- function(x, decreasing = FALSE, formula = ~ ., ...) {
  ....
}
where x is your dat, formula is your form, and we provide a default for formula to include everything. (I haven't studied your code in detail to see exactly what form represents.)
Of course, you don't need to specify decreasing in the call, so:
sort(ToothGrowth, formula = ~ len + dose)
would be how to call the function using the above specifications.
Otherwise, if you don't want sort.data.frame to be an S3 method of sort, call it something else and then you are free to have whatever arguments you want.
I agree with @Gavin that x must come first. I'd put the decreasing parameter after the formula though - since it probably isn't used that much, and hardly ever as a positional argument.
The formula argument would be used much more and therefore should be the second argument. I also strongly agree with @Gavin that it should be called formula, and not form.
sort.data.frame <- function(x, formula = ~ ., decreasing = FALSE, ...) {
  ...
}
You might want to extend the decreasing argument to allow a logical vector where each TRUE/FALSE value corresponds to one column in the formula:
d <- data.frame(A=1:10, B=10:1)
sort(d, ~ A+B, decreasing=c(A=TRUE, B=FALSE)) # sort by decreasing A, increasing B
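A hedged sketch of how that extension might be implemented, reusing xtfrm() in the spirit of desc(); the argument handling and defaults here are illustrative rather than a tested design:

sort.data.frame <- function(x, formula = ~ ., decreasing = FALSE, ...) {
  vars <- all.vars(formula)
  if (identical(vars, ".")) vars <- names(x)                  # default: sort by every column
  dec <- rep_len(decreasing, length(vars))                    # recycle a plain logical vector
  if (!is.null(names(decreasing))) dec <- decreasing[vars]    # or match flags by column name
  keys <- Map(function(v, d) if (d) -xtfrm(x[[v]]) else xtfrm(x[[v]]), vars, dec)
  x[do.call(order, unname(keys)), , drop = FALSE]
}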