I'm new here. After reading through how to ask and format, I hope this will be an OK question. I'm not very skilled in Perl, but it is the programming language I know best.
I have been trying to apply Perl to real life, but without a great deal of understanding - especially not from my wife. I tell her that:
if she didn't bring me 3 beers in the evening, that means I got zero (or no) beers.
As you probably guessed, without much success. :(
Now to the facts. From perlop:
Unary "!" performs logical negation, that is, "not".
Languages that have boolean types (which can hold only two "values") are fine:
if it is not the one value -> it must be the other one.
so naturally:
!true -> false
!false -> true
But Perl doesn't have boolean variables - it has only a truthiness system, where everything that is not 0, '0', undef, or '' is TRUE. The problem comes when applying logical negation to a non-logical value, e.g. a number.
E.g. if some number IS NOT 3, that means it IS ZERO (or empty) - instead of the real-life meaning, where if something is NOT 3 it can be anything but 3 (e.g. zero too).
So the following code:
use 5.014;
use strictures;
my $not_3beers = !3;
say defined($not_3beers) ? "defined, value>$not_3beers<" : "undefined";
say $not_3beers ? "TRUE" : "FALSE";
my $not_4beers = !4;
printf qq{What is not 3 nor 4 mean: They're same value: %d!\n}, $not_3beers if( $not_3beers == $not_4beers );
say qq(What is not 3 nor 4 mean: @{[ $not_3beers ? "some bears" : "no bears" ]}!) if( $not_3beers eq $not_4beers );
say ' $not_3beers>', $not_3beers, "<";
say '-$not_3beers>', -$not_3beers, "<";
say '+$not_3beers>', +$not_3beers, "<";
prints:
defined, value><
FALSE
What is not 3 nor 4 mean: They're same value: 0!
What is not 3 nor 4 mean: no bears!
$not_3beers><
-$not_3beers>0<
+$not_3beers><
Moreover:
perl -E 'say !!4'
so !!4 is 1, instead of 4!
The above statements to my wife are "false" (meaning 0) :), but I am really trying to teach my son Perl, and after a while he asked my wife: why, if something is not 3, does that mean it is 0?
So the questions are:
how to explain this to my son
why Perl has this design, i.e. why !0 is always 1
is there something "behind" this that requires !0 to be not just any random number, but 1
as I already said, I don't know other languages well - is !3 == 0 in every language?
I think you are focusing too much on negation and too little on what Perl booleans mean.
Historical/Implementation Perspective
What is truth? The detection of a voltage higher than x volts.
On a higher abstraction level: If this bit here is set.
The abstraction of a sequence of bits can be considered an integer. Is this integer false? Yes, if no bit is set, i.e. the integer is zero.
A hardware-oriented language will likely use this definition of truth, e.g. C, and all C descendants including Perl.
The negation of 0 could be bitwise negation, where all bits are flipped to 1, or we could just set the last bit to 1. The results would usually be decoded as the integers -1 and 1 respectively; the latter is more energy-efficient.
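For comparison, here is a quick sketch in Python, which exposes both operations: bitwise NOT flips every bit (yielding -1 for 0 under two's complement), while logical NOT yields a single truth value that encodes as 1 or 0.
print(~0)              # -1: bitwise NOT of zero, all bits flipped
print(int(not 0))      # 1: logical NOT of zero, only the lowest bit set
print(~3, int(not 3))  # -4 0: the two negations disagree on non-zero input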
Pragmatic Perspective
It is convenient to think of all numbers but zero as true when we deal with counts:
my $wordcount = ...;
if ($wordcount) {
say "We found $wordcount words";
} else {
say "There were no words";
}
or
say "The array is empty" unless #array; # notice scalar context
A pragmatic language like Perl will likely consider zero to be false.
Mathematical Perspective
There is no reason for any number to be false, every number is a well-defined entity. Truth or falseness emerges solely through predicates, expressions which can be true or false. Only this truth value can be negated. E.g.
¬(x ≤ y) where x = 2, y = 3
is false. Many languages which have a strong foundation in maths won't consider anything false but a special false value. In Lisps, '() or nil is usually false, but 0 will usually be true. That is, a value is only true if it is not nil!
In such mathematical languages, !3 == 0 is likely a type error.
Re: Beers
Beers are good. Any number of beers are good, as long as you have one:
my $beers = ...;
if (not $beers) {
say "Another one!";
} else {
say "Aaah, this is good.";
}
Boolification of a beer-counting variable just tells us if you have any beers. Consider !! to be a boolification operator:
my $enough_beer = !! $beers;
The boolification doesn't concern itself with the exact amount. But maybe any number ≥ 3 is good. Then:
my $enough_beer = ($beers >= 3);
The negation is not enough beer:
my $not_enough_beer = not($beers >= 3);
or
my $not_enough_beer = not $beers;
fetch_beer() if $not_enough_beer;
Sets
A Perl scalar does not symbolize a whole universe of things. In particular, not 3 is not the set of all entities that are not three. Is the expression 3 a truthy value? Yes. Therefore, not 3 is a falsey value.
The suggested behaviour, where 4 == not 3 would be true, is likely undesirable: 4 and “all things that are not three” are not equal; the four is just one of many things that are not three. We should write it correctly:
4 != 3 # four is not equal to three
or
not( 4 == 3 ) # the same
It might help to think of ! and not as logical-negation-of, but not as except.
How to teach
It might be worth introducing mathematical predicates: expressions which can be true or false. If we only ever “create” truthiness by explicit tests, e.g. length($str) > 0, then your issues don't arise. We can name the results: my $predicate = (1 < 2), but we can decide to never print them out, instead: print $predicate ? "True" : "False". This sidesteps the problem of considering special representations of true or false.
Considering values to be true/false directly would then only be a shortcut, e.g. foo if $x can be considered a shortcut for
foo if defined $x and length($x) > 0 and $x != 0;
Perl is all about shortcuts.
Teaching these shortcuts, and the various contexts of Perl and where they turn up (numeric/string/boolean operators), could be helpful; a rough analogy in another language is sketched after this list.
List Context
Even-sized List Context
Scalar Context
Numeric Context
String Context
Boolean Context
Void Context
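As a rough analogy (and only an analogy: Python, unlike Perl, treats the string '0' as true), Python's boolean context is the same kind of shortcut:
for x in (None, '', 0, '0', 3):
    print(repr(x), bool(x))
# None, '' and 0 are false; '0' and 3 are true (in Perl, '0' would be false too)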
as I already said, I don't know other languages well - is !3 == 0 in every language?
Yes. In C (and thus C++), it's the same.
#include <stdio.h>

int main(void) {
    int i = 3;
    int n = !i;
    int nn = !n;
    printf("!3=%i ; !!3=%i\n", n, nn);
    return 0;
}
Prints (see http://codepad.org/vOkOWcbU )
!3=0 ; !!3=1
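Python behaves the same way once the boolean results are used as integers (bool is an int subtype there):
i = 3
n = not i   # False
nn = not n  # True
print("!3=%i ; !!3=%i" % (n, nn))  # prints: !3=0 ; !!3=1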
how to explain this to my son
Very simple. !3 means "the opposite of some non-false value", which is of course false. This is called "context" - in the Boolean context imposed by the negation operator, "3" is NOT a number, it's a statement of true/false.
The result is also not a "zero" but merely Perl's convenient representation of false - which turns into a zero if used in a numeric context (but into an empty string if used in a string context - see the difference between 0 + !3 and !3 . "a").
The Boolean context is just a special kind of scalar context where no conversion to a string or a number is ever performed. (perldoc perldata)
why Perl has this design, i.e. why !0 is always 1
See above. Among other likely reasons (though I don't know if that was Larry's main reason), C has the same logic and Perl took a lot of its syntax and ideas from C.
For a VERY good underlying technical detail, see the answers here: " What do Perl functions that return Boolean actually return " and here: " Why does Perl use the empty string to represent the boolean false value? "
is there something "behind" this that requires !0 to be not just any random number, but 1
Nothing aside from simplicity of implementation. It's easier to produce a "1" than a random number.
If you're asking the different question of "why is it 1 instead of the original number that was negated to get 0", the answer is simple - by the time the Perl interpreter gets to negate that zero, it no longer knows/remembers that the zero was a result of "!3" as opposed to some other expression that evaluated to zero/false.
If you want to test that a number is not 3, then use this:
my_variable != 3;
Using the syntax !3, since ! is a boolean operator, first converts 3 into a boolean (even though Perl may not have an official boolean type, it still works this way), which, since it is non-zero, means it gets converted to the equivalent of true. Then !true yields false, which, when converted back to an integer context, gives 0. Continuing with that logic shows how !!3 converts 3 to true, which is then inverted to false, then inverted again back to true; if this value is used in an integer context, it gets converted to 1. This is true of most modern programming languages (although maybe not some of the more logic-centered ones), though the exact syntax may vary depending on the language...
Logically negating a false value requires some value be chosen to represent the resulting true value. "1" is as good a choice as any. I would say it is not important which value is returned (or conversely, it is important that you not rely on any particular true value being returned).
I had a programming interview which consisted of 3 interviewers, 45 minutes each.
While the first two interviewers gave me 2-3 short coding questions each (e.g. reverse a linked list, implement rand(7) using rand(5), etc.), the third interviewer used the whole timeslot for a single question:
You are given a string representing a correctly formed and parenthesized boolean expression consisting of the characters T, F, &, |, !, (, ) and spaces. T stands for True, F for False, & for logical AND, | for logical OR, ! for negation. & has greater priority than |. Each of these chars is followed by a space in the input string. I was to evaluate the value of the expression and print it (output should be T or F). Example: Input: ! ( T | F & F ) Output: F
I tried to implement a variation of the Shunting Yard algorithm to solve the problem (turning the input into postfix form, then evaluating the postfix expression), but failed to code it properly in the given timeframe, so I ended up explaining in pseudocode and words what I wanted.
My recruiter said that the first two interviewers gave me "HIRE", while the third interviewer gave me "NO HIRE", and since the final decision is a "logical AND", he thanked me for my time.
My questions:
Do you think this question is appropriate to code on a whiteboard in approx. 40 minutes? To me it seems like too much code for such a short timeslot and the dimensions of a whiteboard.
Is there a shorter approach than the Shunting Yard algorithm for this problem?
Well, once you have some experience with parsers, the postfix evaluation algorithm is quite simple:
1. From left to right, evaluate each char:
if it's an operand, push it onto the stack.
if it's an operator, pop A, then pop B, then push (B operator A) onto the stack. The last item on the stack will be the result. If there is none, or more than one, you're doing it wrong (assuming the postfix notation is valid).
Infix to postfix is quite simple as well. That being said, I don't think it's an appropriate task for 40 minutes if you don't know the algorithms. Here is a boolean postfix evaluation method I wrote at some stage (it uses a lambda as well):
// Tokens: '0' = false, '1' = true, 'A' = AND, 'R' = OR, 'X' = XOR, 'N' = NOT
public static boolean evaluateBool(String s)
{
    Stack<Object> stack = new Stack<>();
    StringBuilder expression = new StringBuilder(s);
    expression.chars().forEach(ch ->
    {
        if (ch == '0') stack.push(false);
        else if (ch == '1') stack.push(true);
        else if (ch == 'A' || ch == 'R' || ch == 'X')
        {
            // binary operator: pop both operands, push the result
            boolean op1 = (boolean) stack.pop();
            boolean op2 = (boolean) stack.pop();
            switch (ch)
            {
                case 'A': stack.push(op2 && op1); break;
                case 'R': stack.push(op2 || op1); break;
                case 'X': stack.push(op2 ^ op1); break;
            } //endSwitch
        } else if (ch == 'N')
        {
            // unary NOT: pop a single operand
            boolean op1 = (boolean) stack.pop();
            stack.push(!op1);
        } //endIf
    });
    return (boolean) stack.pop();
}
In your case, to make it work with that snippet, you would first have to parse the expression and replace special characters like "!", "|", "^" etc. with plain letters as in the if cases above (or just use the integer char values in your if cases).
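For reference, here is a minimal sketch in Python of the whole pipeline the question asks for - infix to postfix via Shunting Yard, then postfix evaluation - assuming the input tokens are exactly T, F, &, |, ! and parentheses, separated by spaces:
def to_postfix(tokens):
    # '&' binds tighter than '|'; '!' binds tightest and is right-associative
    prec = {'!': 3, '&': 2, '|': 1}
    out, ops = [], []
    for tok in tokens:
        if tok in ('T', 'F'):
            out.append(tok)
        elif tok == '(':
            ops.append(tok)
        elif tok == ')':
            while ops[-1] != '(':
                out.append(ops.pop())
            ops.pop()  # discard the '('
        else:
            # pop operators of higher precedence (or equal precedence,
            # for the left-associative binary operators) before pushing
            while ops and ops[-1] != '(' and (
                    prec[ops[-1]] > prec[tok]
                    or (prec[ops[-1]] == prec[tok] and tok != '!')):
                out.append(ops.pop())
            ops.append(tok)
    while ops:
        out.append(ops.pop())
    return out
def eval_postfix(postfix):
    stack = []
    for tok in postfix:
        if tok in ('T', 'F'):
            stack.append(tok == 'T')
        elif tok == '!':
            stack.append(not stack.pop())
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a and b if tok == '&' else a or b)
    return stack.pop()
print('T' if eval_postfix(to_postfix("! ( T | F & F )".split())) else 'F')  # F
Even knowing the algorithm, that is a fair amount of code to produce on a whiteboard in 40 minutes.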
Given an arbitrary list of booleans, what is the most elegant way of determining that exactly one of them is true?
The most obvious hack is type conversion: converting them to 0 for false and 1 for true, then summing them and returning sum == 1.
I'd like to know if there is a way to do this without converting them to ints, actually using boolean logic.
(This seems like it should be trivial, idk, long week)
Edit: In case it wasn't obvious, this is more of a code-golf / theoretical question. I'm not fussed about using type conversion / int addition in PROD code, I'm just interested if there is way of doing it without that.
Edit2: Sorry folks it's a long week and I'm not explaining myself well. Let me try this:
In boolean logic, ANDing a collection of booleans is true if all of the booleans are true; ORing the collection is true if at least one of them is true. Is there a logical construct that will be true if exactly one boolean is true? XOR is this for a collection of two booleans, for example, but for any more than that it falls over.
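(The type-conversion hack mentioned above is a one-liner in Python, since True counts as 1 when summed:)
bools = [False, True, False]
print(sum(bools) == 1)  # True: exactly one element is true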
You can actually accomplish this using only boolean logic, although there's perhaps no practical value of that in your example. The boolean version is much more involved than simply counting the number of true values.
Anyway, for the sake of satisfying intellectual curiosity, here goes. First, the idea of using a series of XORs is good, but it only gets us half way. For any two variables x and y,
x ⊻ y
is true whenever exactly one of them is true. However, this does not continue to be true if you add a third variable z,
x ⊻ y ⊻ z
The first part, x ⊻ y, is still true if exactly one of x and y is true. If either x or y is true, then z needs to be false for the whole expression to be true, which is what we want. But consider what happens if both x and y are true. Then x ⊻ y is false, yet the whole expression can become true if z is true as well. So either one variable or all three must be true. In general, if you have a statement that is a chain of XORs, it will be true if an uneven number of variables are true.
Since one is an uneven number, this might prove useful. Of course, checking for an uneven number of truths is not enough. We additionally need to ensure that no more than one variable is true. This can be done in a pairwise fashion by taking all pairs of two variables and checking that they are not both true. Taken together, these two conditions ensure that exactly one of the variables is true.
Below is a small Python script to illustrate the approach.
from itertools import product
print("x|y|z|only_one_is_true")
print("======================")
for x, y, z in product([True, False], repeat=3):
uneven_number_is_true = x ^ y ^ z
max_one_is_true = (not (x and y)) and (not (x and z)) and (not (y and z))
only_one_is_true = uneven_number_is_true and max_one_is_true
print(int(x), int(y), int(z), only_one_is_true)
And here's the output.
x|y|z|only_one_is_true
======================
1 1 1 False
1 1 0 False
1 0 1 False
1 0 0 True
0 1 1 False
0 1 0 True
0 0 1 True
0 0 0 False
Sure, you could do something like this (pseudocode, since you didn't mention language):
found = false;
alreadyFound = false;
for (boolean in booleans):
if (boolean):
found = true;
if (alreadyFound):
found = false;
break;
else:
alreadyFound = true;
return found;
After your clarification, here it is with no integers.
bool IsExactlyOneBooleanTrue( bool *boolAry, int size )
{
bool areAnyTrue = false;
bool areTwoTrue = false;
for(int i = 0; (!areTwoTrue) && (i < size); i++) {
areTwoTrue = (areAnyTrue && boolAry[i]);
areAnyTrue |= boolAry[i];
}
return ((areAnyTrue) && (!areTwoTrue));
}
No-one mentioned that this "operation" we're looking for is shortcut-able similarly to boolean AND and OR in most languages. Here's an implementation in Java:
public static boolean exactlyOneOf(boolean... inputs) {
boolean foundAtLeastOne = false;
for (boolean bool : inputs) {
if (bool) {
if (foundAtLeastOne) {
// found a second one that's also true, shortcut like && and ||
return false;
}
foundAtLeastOne = true;
}
}
// we're happy if we found one, but if none found that's less than one
return foundAtLeastOne;
}
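The same short-circuiting idea can be written very compactly in Python; a sketch: both any() calls consume the same iterator, so the scan stops as soon as a second True is found.
def exactly_one_of(bools):
    it = iter(bools)
    return any(it) and not any(it)
print(exactly_one_of([False, True, False]))        # True
print(exactly_one_of([True, False, True, False]))  # False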
With plain boolean logic, it may not be possible to achieve what you want. What you are asking for is a truth evaluation based not just on the truth values but also on additional information (the count, in this case). But boolean evaluation is binary logic: it cannot depend on anything but the operands themselves. And there is no way to reverse-engineer the operands given a truth value, because there can be four possible combinations of operands but only two results. Given a false, can you tell whether it came from F ^ F or from T ^ T in your case, so that the next evaluation can be determined based on that?
booleanList.Where(y => y).Count() == 1;
Due to the large number of reads by now, here comes a quick clean up and additional information.
Option 1:
Ask if only the first variable is true, or only the second one, ..., or only the n-th variable.
x1 & !x2 & ... & !xn |
!x1 & x2 & ... & !xn |
...
!x1 & !x2 & ... & xn
This approach scales in O(n^2); the evaluation stops after the first positive match is found. Hence, it is preferred if a positive match is likely.
Option 2:
Ask if there is at least one variable true in total. Additionally check every pair to contain at most one true variable (Anders Johannsen's answer)
(x1 | x2 | ... | xn) &
(!x1 | !x2) &
...
(!x1 | !xn) &
(!x2 | !x3) &
...
(!x2 | !xn) &
...
This option also scales in O(n^2) due to the number of possible pairs. Lazy evaluation stops the formula after the first counterexample. Hence, it is preferred if a negative match is likely.
(Option 3):
This option involves a subtraction and is thus not a valid answer for the restricted setting. Nevertheless, it shows why looping over the values might not be the most beneficial solution in an unrestricted setting.
Treat x1 ... xn as a binary number x. Subtract one, then AND the results. The output is zero <=> x1 ... xn contains at most one true value. (the old "check power of two" algorithm)
x 00010000
x-1 00001111
AND 00000000
If the bits are already stored in such a bitboard, this might be beneficial over looping. Though, keep in mind this kills the readability and is limited by the available board length.
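As a sketch in Python, assuming the booleans are already packed into an integer bitboard x, adding an "at least one" test turns this power-of-two check into an exactly-one check:
def exactly_one_set(x):
    # x & (x - 1) clears the lowest set bit, so it is zero
    # iff at most one bit was set; x != 0 rules out zero bits
    return x != 0 and (x & (x - 1)) == 0
print(exactly_one_set(0b00010000))  # True
print(exactly_one_set(0b00010010))  # False
print(exactly_one_set(0))           # False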
A last note to raise awareness: by now there exists a Stack Exchange site called Computer Science which is exactly intended for this type of algorithmic question.
It can be done quite nicely with recursion, e.g. in Haskell
-- there isn't exactly one true element in the empty list
oneTrue [] = False
-- if the list starts with False, discard it
oneTrue (False : xs) = oneTrue xs
-- if the list starts with True, all other elements must be False
oneTrue (True : xs) = not (or xs)
// Javascript
Use .filter() on the array and check the length of the new array.
// Example using an array parameter
isExactlyOneBooleanTrue(booleans: boolean[]) {
    return booleans.filter(value => value === true).length === 1;
}
// Example using rest parameters (...booleans)
isExactlyOneBooleanTrue(...booleans) {
    return booleans.filter(value => value === true).length === 1;
}
One way to do it is to perform a pairwise AND and then check that none of the pairwise comparisons returned true, plus a check that at least one value is true at all. In Python I would implement it using
from itertools import combinations
def one_true(bools):
    # no pair may be true simultaneously, and at least one value must be true
    pairwise_comp = [a and b for a, b in combinations(bools, 2)]
    return any(bools) and not any(pairwise_comp)
This approach easily generalizes to lists of arbitrary length, although for very long lists, the number of possible pairs grows very quickly.
Python:
boolean_list.count(True) == 1
OK, another try. Call the different booleans b[i], and call a slice of them (a range of the array) b[i .. j]. Define functions none(b[i .. j]) and just_one(b[i .. j]) (can substitute the recursive definitions to get explicit formulas if required). We have, using C notation for logical operations (&& is and, || is or, ^ for xor (not really in C), ! is not):
none(b[i .. i + 1]) ~~> !b[i] && !b[i + 1]
just_one(b[i .. i + 1]) ~~> b[i] ^ b[i + 1]
And then recursively:
none(b[i .. j + 1]) ~~> none(b[i .. j]) && !b[j + 1]
just_one(b[i .. j + 1]) ~~> (just_one(b[i .. j]) && !b[j + 1]) ^ (none(b[i .. j]) && b[j + 1])
And you are interested in just_one(b[1 .. n]).
The expressions will turn out horrible.
Have fun!
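Unrolled into a single left-to-right pass, the two recursive definitions above become a pair of running flags; a Python transcription might look like this:
def just_one(bs):
    none_so_far, one_so_far = True, False
    for b in bs:
        # apply the recurrences, using the old value of none_so_far
        one_so_far = (one_so_far and not b) ^ (none_so_far and b)
        none_so_far = none_so_far and not b
    return one_so_far
print(just_one([False, True, False]))  # True
print(just_one([True, True, False]))   # False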
That Python script does the job nicely. Here's the one-liner it uses:
((x ∨ (y ∨ z)) ∧ (¬(x ∧ y) ∧ (¬(z ∧ x) ∧ ¬(y ∧ z))))
Retracted for Privacy and Anders Johannsen have already provided correct and simple answers. But both solutions do not scale very well (O(n^2)). If performance is important, you can use the following solution, which performs in O(n):
def exact_one_of(array_of_bool):
exact_one = more_than_one = False
for array_elem in array_of_bool:
more_than_one = (exact_one and array_elem) or more_than_one
exact_one = (exact_one ^ array_elem) and (not more_than_one)
return exact_one
(I used Python and a for loop for simplicity. But of course this loop could be unrolled into a sequence of NOT, AND, OR and XOR operations.)
It works by tracking two states per boolean variable/list entry:
is there exactly one "True" from the beginning of the list up to this entry?
is there more than one "True" from the beginning of the list up to this entry?
The states of a list entry can be simply derived from the previous states and corresponding list entry/boolean variable.
Python:
Let's see using an example...
Steps:
the function exactly_one_topping below takes three parameters
it stores their values in a list as True/False
it checks whether there is exactly one true value by checking that the count is exactly 1
def exactly_one_topping(ketchup, mustard, onion):
args = [ketchup,mustard,onion]
if args.count(True) == 1: # check if Exactly one value is True
return True
else:
return False
How do you want to count how many are true without, you know, counting? Sure, you could do something messy like this (C syntax, my Python is horrible):
/* Returns 1 if exactly one element of booleans[0..last-1] is true. */
int exactly_one(const int *booleans, int last)
{
    int i;
    for (i = 0; i < last && !booleans[i]; i++)
        ;
    if (i == last)
        return 0; /* No true one found */
    /* We have a true one, check there isn't another */
    for (i++; i < last && !booleans[i]; i++)
        ;
    if (i == last)
        return 1; /* No more true ones */
    else
        return 0; /* Found another true */
}
I'm sure you'll agree that the win (if any) is slight, and the readability is bad.
It is not possible without looping. Check BitSet cardinality() in the Java implementation.
http://fuseyism.com/classpath/doc/java/util/BitSet-source.html
For two booleans, we can do it this way:
if (A=true or B=true)and(not(A=true and B=true)) then
<enter statements>
end if
So Mathematica is different from other dialects of Lisp because it blurs the line between functions and macros. In Mathematica, if a user wanted to write a mathematical function they would likely use pattern matching like f[x_]:= x*x instead of f=Function[{x},x*x], though both would return the same result when called as f[x]. My understanding is that the first approach is roughly equivalent to a Lisp macro, and in my experience it is favored because of the more concise syntax.
So I have two questions: is there a performance difference between executing functions versus the pattern-matching/macro approach? Part of me wouldn't be surprised if functions were actually transformed into some version of macros to allow features like Listable to be implemented.
The reason I care about this question is the recent set of questions (1) (2) about trying to catch Mathematica errors in large programs. If most of the computations were defined in terms of Functions, it seems to me that keeping track of the order of evaluation and where an error originated would be easier than trying to catch the error after the input has been rewritten by successive applications of macros/patterns.
The way I understand Mathematica, it is one giant search-and-replace engine. All functions, variables, and other assignments are essentially stored as rules, and during evaluation Mathematica goes through this global rule base and applies rules until the resulting expression stops changing.
It follows that the fewer times you have to go through the list of rules the faster the evaluation. Looking at what happens using Trace (using gdelfino's function g and h)
In[1]:= Trace@(#*#)&@x
Out[1]= {x x, x^2}
In[2]:= Trace@g@x
Out[2]= {g[x], x x, x^2}
In[3]:= Trace@h@x
Out[3]= {{h, Function[{x}, x x]}, Function[{x}, x x][x], x x, x^2}
it becomes clear why anonymous functions are fastest and why using Function introduces additional overhead over a simple SetDelayed. I recommend looking at the introduction of Leonid Shifrin's excellent book, where these concepts are explained in some detail.
I have on occasion constructed a Dispatch table of all the functions I need and manually applied it to my starting expression. This provides a significant speed increase over normal evaluation as none of Mathematica's inbuilt functions need to be matched against my expression.
My understanding is that the first approach is something equivalent to a lisp macro and in my experience is favored because of the more concise syntax.
Not really. Mathematica is a term rewriter, as are Lisp macros.
So I have two questions, is there a performance difference between executing functions versus the pattern matching/macro approach?
Yes. Note that you are never really "executing functions" in Mathematica. You are just applying rewrite rules to change one expression into another.
Consider mapping the Sqrt function over a packed array of floating point numbers. The fastest solution in Mathematica is to apply the Sqrt function directly to the packed array because it happens to implement exactly what we want and is optimized for this special case:
In[1] := xs = N@Range[100000];
In[2] := Sqrt[xs]; // AbsoluteTiming
Out[2] = {0.0060000, Null}
We might define a global rewrite rule that has terms of the form sqrt[x] rewritten to Sqrt[x] such that the square root will be calculated:
In[3] := Clear[sqrt];
sqrt[x_] := Sqrt[x];
Map[sqrt, xs]; // AbsoluteTiming
Out[3] = {0.4800007, Null}
Note that this is ~100× slower than the previous solution.
Alternatively, we might define a global rewrite rule that replaces the symbol sqrt with a lambda function that invokes Sqrt:
In[4] := Clear[sqrt];
sqrt = Function[{x}, Sqrt[x]];
Map[sqrt, xs]; // AbsoluteTiming
Out[4] = {0.0500000, Null}
Note that this is ~10× faster than the previous solution.
Why? Because the slow second solution is looking up the rewrite rule sqrt[x_] :> Sqrt[x] in the inner loop (for each element of the array) whereas the fast third solution looks up the value Function[...] of the symbol sqrt once and then applies that lambda function repeatedly. In contrast, the fastest first solution is a loop calling sqrt written in C. So searching the global rewrite rules is extremely expensive and term rewriting is expensive.
If so, why is Sqrt ever fast? You might expect a 2× slowdown instead of 10× because we've replaced one lookup for Sqrt with two lookups for sqrt and Sqrt in the inner loop but this is not so because Sqrt has the special status of being a built-in function that will be matched in the core of the Mathematica term rewriter itself rather than via the general-purpose global rewrite table.
Other people have described much smaller performance differences between similar functions. I believe the performance differences in those cases are just minor differences in the exact implementation of Mathematica's internals. The biggest issue with Mathematica is the global rewrite table. In particular, this is where Mathematica diverges from traditional term-level interpreters.
You can learn a lot about Mathematica's performance by writing mini Mathematica implementations. In this case, the above solutions might be compiled to (for example) F#. The array may be created like this:
> let xs = [|1.0..100000.0|];;
...
The built-in sqrt function can be converted into a closure and given to the map function like this:
> Array.map sqrt xs;;
Real: 00:00:00.006, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
...
This takes 6ms just like Sqrt[xs] in Mathematica. But that is to be expected because this code has been JIT compiled down to machine code by .NET for fast evaluation.
Looking up rewrite rules in Mathematica's global rewrite table is similar to looking up the closure in a dictionary keyed on its function name. Such a dictionary can be constructed like this in F#:
> open System.Collections.Generic;;
> let fns = Dictionary<string, (obj -> obj)>(dict["sqrt", unbox >> sqrt >> box]);;
This is similar to the DownValues data structure in Mathematica, except that we aren't searching multiple resulting rules for the first to match on the function arguments.
The program then becomes:
> Array.map (fun x -> fns.["sqrt"] (box x)) xs;;
Real: 00:00:00.044, CPU: 00:00:00.031, GC gen0: 0, gen1: 0, gen2: 0
...
Note that we get a similar 10× performance degradation due to the hash table lookup in the inner loop.
An alternative would be to store the DownValues associated with a symbol in the symbol itself in order to avoid the hash table lookup.
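The cost of that per-call table lookup is easy to reproduce in Python (a hypothetical micro-benchmark, not Mathematica's actual machinery): one version fetches the function from a global dictionary inside the inner loop, the other binds it once outside.
import math
import timeit
fns = {"sqrt": math.sqrt}                  # stand-in for a global rewrite table
xs = [float(i) for i in range(1, 100001)]
def table_lookup():
    return [fns["sqrt"](x) for x in xs]    # dictionary lookup per element
bound = math.sqrt                          # looked up once
def direct():
    return [bound(x) for x in xs]
print(timeit.timeit(table_lookup, number=10))
print(timeit.timeit(direct, number=10))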
We can even write a complete term rewriter in just a few lines of code. Terms may be expressed as values of the following type:
> type expr =
    | Float of float
    | Symbol of string
    | Packed of float []
    | Apply of expr * expr [];;
Note that Packed implements Mathematica's packed lists, i.e. unboxed arrays.
The following init function constructs a List with n elements using the function f, returning a Packed if every return value was a Float or a more general Apply(Symbol "List", ...) otherwise:
> let init n f =
    let rec packed ys i =
      if i=n then Packed ys else
        match f i with
        | Float y ->
            ys.[i] <- y
            packed ys (i+1)
        | y ->
            Apply(Symbol "List", Array.init n (fun j ->
              if j<i then Float ys.[j]
              elif j=i then y
              else f j))
    packed (Array.zeroCreate n) 0;;
val init : int -> (int -> expr) -> expr
The following rule function uses pattern matching to identify expressions that it can understand and replaces them with other expressions:
> let rec rule = function
    | Apply(Symbol "Sqrt", [|Float x|]) ->
        Float(sqrt x)
    | Apply(Symbol "Map", [|f; Packed xs|]) ->
        init xs.Length (fun i -> rule (Apply(f, [|Float xs.[i]|])))
    | f -> f;;
val rule : expr -> expr
Note that the type of this function expr -> expr is characteristic of term rewriting: rewriting replaces expressions with other expressions rather than reducing them to values.
Our program can now be defined and executed by our custom term rewriter:
> rule (Apply(Symbol "Map", [|Symbol "Sqrt"; Packed xs|]));;
Real: 00:00:00.049, CPU: 00:00:00.046, GC gen0: 24, gen1: 0, gen2: 0
We've recovered the performance of Map[Sqrt, xs] in Mathematica!
We can even recover the performance of Sqrt[xs] by adding an appropriate rule:
| Apply(Symbol "Sqrt", [|Packed xs|]) ->
Packed(Array.map sqrt xs)
I wrote an article on term rewriting in F#.
Some measurements
Based on @gdelfino's answer and comments by @rcollyer I made this small program:
j = # # + # # &;
g[x_] := x x + x x ;
h = Function[{x}, x x + x x ];
anon = Table[Timing[Do[ # # + # # &[i], {i, k}]][[1]], {k, 10^5, 10^6, 10^5}];
jj = Table[Timing[Do[ j[i], {i, k}]][[1]], {k, 10^5, 10^6, 10^5}];
gg = Table[Timing[Do[ g[i], {i, k}]][[1]], {k, 10^5, 10^6, 10^5}];
hh = Table[Timing[Do[ h[i], {i, k}]][[1]], {k, 10^5, 10^6, 10^5}];
ListLinePlot[ {anon, jj, gg, hh},
PlotStyle -> {Black, Red, Green, Blue},
PlotRange -> All]
The results are, at least for me, very surprising:
Any explanations? Please feel free to edit this answer (comments are a mess for long text)
Edit
Tested with the identity function f[x] = x to isolate the parsing from the actual evaluation. Results (same colors):
Note: the results are very similar to the plot above for constant functions (f[x]:=1).
Pattern matching seems faster:
In[1]:= g[x_] := x*x
In[2]:= h = Function[{x}, x*x];
In[3]:= Do[h[RandomInteger[100]], {1000000}] // Timing
Out[3]= {1.53927, Null}
In[4]:= Do[g[RandomInteger[100]], {1000000}] // Timing
Out[4]= {1.15919, Null}
Pattern matching is also more flexible as it allows you to overload a definition:
In[5]:= g[x_] := x * x
In[6]:= g[x_,y_] := x * y
For simple functions you can compile to get the best performance:
In[7]:= k = Compile[{x}, x*x];
In[8]:= Do[k[RandomInteger[100]], {100000}] // Timing
Out[8]= {0.083517, Null}
You can use the function recordSteps from a previous answer to see what Mathematica actually does with Function. It treats it just like any other head. I.e., suppose you have the following
f = Function[{x}, x + 2];
f[2]
It first transforms f[2] into
Function[{x}, x + 2][2]
At the next step, x+2 is transformed into 2+2. Essentially, "Function" evaluation behaves like an application of pattern-matching rules, so it shouldn't be surprising that it's not faster.
You can think of everything in Mathematica as an expression, where evaluation is the process of rewriting parts of the expression in a predefined sequence; this applies to Function just as to any other head.
What is the difference between the & and && logical operators in MATLAB?
The single ampersand & is the logical AND operator. The double ampersand && is again a logical AND operator, but one that employs short-circuiting behaviour. Short-circuiting just means the second operand (right-hand side) is evaluated only when the result is not fully determined by the first operand (left-hand side).
A & B (A and B are evaluated)
A && B (B is only evaluated if A is true)
&& and || take scalar inputs and always short-circuit. | and & take array inputs and short-circuit only in the conditions of if/while statements. In assignments, the latter do not short-circuit.
See these doc pages for more information.
As already mentioned by others, & is a logical AND operator and && is a short-circuit AND operator. They differ in how the operands are evaluated as well as whether or not they operate on arrays or scalars:
& (AND operator) and | (OR operator) can operate on arrays in an element-wise fashion.
&& and || are short-circuit versions for which the second operand is evaluated only when the result is not fully determined by the first operand. These can only operate on scalars, not arrays.
Both are logical AND operations. The && though, is a "short-circuit" operator. From the MATLAB docs:
They are short-circuit operators in that they evaluate their second operand only when the result is not fully determined by the first operand.
See more here.
& is a logical elementwise operator, while && is a logical short-circuiting operator (which can only operate on scalars).
For example (pardon my syntax).
If..
A = [True True False True]
B = False
A & B = [False False False False]
..or..
B = True
A & B = [True True False True]
For &&, the right operand is only calculated if the left operand is true, and the result is a single boolean value.
x = (b ~= 0) && (a/b > 18.5)
Hope that's clear.
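The same distinction exists in other array languages; as a rough analogue, in Python with NumPy (assuming numpy is installed), & is elementwise on arrays while `and` works on scalars and short-circuits:
import numpy as np
A = np.array([True, True, False, True])
B = False
print(A & B)  # [False False False False], elementwise like MATLAB's &
b = 0
# `and` short-circuits: the division is never evaluated when b == 0,
# just like MATLAB's && in x = (b ~= 0) && (a/b > 18.5)
x = (b != 0) and (10 / b > 18.5)
print(x)      # False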
&& and || are short-circuit operators operating on scalars. & and | operate on arrays, and use short-circuiting only in the context of if or while expressions.
A good rule of thumb when constructing arguments for use in conditional statements (IF, WHILE, etc.) is to always use the &&/|| forms, unless there's a very good reason not to. There are two reasons...
As others have mentioned, the short-circuiting behavior of &&/|| is similar to most C-like languages. That similarity / familiarity is generally considered a point in its favor.
Using the && or || forms forces you to write the full code for deciding your intent for vector arguments. When a = [1 0 0 1] and b = [0 1 0 1], is a&b true or false? I can't remember the rules for MATLAB's &, can you? Most people can't. On the other hand, if you use && or ||, you're FORCED to write the code "in full" to resolve the condition.
Doing this, rather than relying on MATLAB's resolution of vectors in & and |, leads to code that's a little bit more verbose, but a LOT safer and easier to maintain.