I have been reading about methods and functions in Scala. Jim's post and Daniel's complement to it do a good job of explaining the differences between them. Here is what I took away:
functions are objects, methods are not;
as a consequence, functions can be passed as arguments, but methods cannot;
methods can be type-parametrised, functions cannot;
methods are faster.
I also understand the difference between def, val and var.
Now I have actually two questions:
Why can't we parametrise the apply method of a function in order to parametrise the function itself? And
Why can't the function object simply delegate to the underlying method so it runs just as fast? Or why can't the caller of the function be made to call the original method directly?
Looking forward to your answers and many thanks in advance!
1 - Parameterizing functions.
It is theoretically possible for a compiler to parameterize the type of a function; one could add that as a feature. It isn't entirely trivial, though, because functions are contravariant in their argument and covariant in their return value:
trait Function1[-T, +R] { ... }
which means that a function that can accept a wider range of arguments counts as a subclass (since it can process anything that the superclass can process), and if it produces a smaller set of results, that's okay too (since it still obeys the superclass's contract that way). But how do you encode
def fn[A](a: A) = a
in that framework? The whole point is that the return type is equal to the type passed in, whatever that type has to be. You'd need
Function1[ ThisCanBeAnything, ThisHasToMatch ]
as your function type. "This can be anything" is well-represented by Any if you want a single type, but then you could return anything as the original type is lost. This isn't to say that there is no way to implement it, but it doesn't fit nicely into the existing framework.
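For what it's worth, you can approximate a type-parameterised function yourself by defining your own trait whose apply method takes a type parameter; the names below are mine, not anything from the standard library:
trait PolyIdentity {
  def apply[A](a: A): A
}

val id = new PolyIdentity {
  def apply[A](a: A): A = a
}

val n: Int = id(42)          // A is inferred as Int
val s: String = id("hello")  // A is inferred as String
Such a value can be passed around like any other object, but it is not a FunctionN, so it won't fit where the standard function types are expected.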
2 - Speed of functions.
This is really simple: a function is the apply method on another object. You have to have that object in order to call its method. This will always be slower (or at least no faster) than calling your own method, since you already have yourself.
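To make that concrete, here is roughly what a function literal amounts to (a simplified sketch; the actual compiler output differs in detail):
val double: Int => Int = (x: Int) => x * 2

// ...is approximately equivalent to allocating an anonymous Function1 object:
val doubleDesugared: Int => Int = new Function1[Int, Int] {
  def apply(x: Int): Int = x * 2
}

double(3)   // really an apply call on that extra object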
As a practical matter, JVMs can do a very good job inlining functions these days; there is often no difference in performance as long as you're mostly using your method or function, not creating the function object over and over. If you're deeply nesting very short loops, you may find yourself creating way too many functions; moving them out into vals outside of the nested loops may save time. But don't bother until you've benchmarked and know that there's a bottleneck there; typically the JVM does the right thing.
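As a hedged illustration of that last point (the names are mine), the two variants below differ only in where the function value is created:
def sumNaive(xs: Array[Int]): Int = {
  var total = 0
  var i = 0
  while (i < xs.length) {
    val square = (n: Int) => n * n   // a function value created inside the hot loop
    total += square(xs(i))
    i += 1
  }
  total
}

def sumHoisted(xs: Array[Int]): Int = {
  val square = (n: Int) => n * n     // created once, outside the loop
  var total = 0
  var i = 0
  while (i < xs.length) {
    total += square(xs(i))
    i += 1
  }
  total
}
Whether the first variant actually allocates on every iteration depends on the Scala version and the JIT, which is exactly why the advice is to benchmark before rearranging anything.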
Think about the type signature of a function: it explicitly states which types it takes. Type-parameterizing apply() would be inconsistent with that fixed signature.
A function is an object, which must be created, initialized, and then garbage-collected. When apply() is called, it has to grab the function object in addition to the parent.
Related
Method types have no value. How do we evaluate a method?
Using SML as an example, I have
fun myFunc(x) = x + 5
val b = myFunc(2)
In the second expression, myFunc has a type and a value; we use its type for type checking, and we use its value, together with the argument, to evaluate the value of b.
But in Scala, if methods have no value, how do we evaluate them? I am pretty new to Scala, so it may not be very clear.
def myFunc(x: Int) = x + 5
val b = myFunc(2)
From val b = myFunc(2) to val b = 2 + 5, what happened in between? From where or what object do we know that myFunc(x) is x + 5?
Thanks!!
The simple answer is: just because a method is not a value in the sense of "a thing that can be manipulated by you" doesn't mean that it is not a value in the sense of "a thing that can be manipulated by the author of the compiler".
Of course, a method will have an object representing it inside of the compiler. In fact, that object will probably look very similar to the object representing a function inside, say, the MLTon SML compiler or SML/NJ.
In SML, syntax is not a value either, yet you are not questioning how it is possible to write a function call, are you? After all, in order to call a function in SML, I need to write a function call using function call syntax, so how can I do that when syntax is not a value?
Well, the answer is the same: just because syntax is not a value that the programmer can manipulate doesn't mean the compiler (or more precisely the parser) doesn't know about syntax.
I can't tell you why the decision was made to have functions be values in Scala but not methods, but I can make a guess. Scala is an object-oriented language. In an object-oriented language, every value is an object, and every object has methods that are bound to that object. So, if methods are objects, they need to have methods, which are objects, which have methods, which are objects, and so on.
There are ways to deal with this, of course, but it makes the language more complex. For a similar reason, classes aren't objects (unlike, say, in Smalltalk, Python, and Ruby). Note that even in highly reflective, introspective, dynamic languages like Ruby, methods are not objects. Classes are, but not methods.
It is possible using reflection to get a proxy object that represents a method, but that object is not the method itself. And you can actually do the same in Scala as well.
And of course it is possible to turn a method into a function value by η-expansion.
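For example, a small sketch of η-expansion:
def addFive(x: Int): Int = x + 5   // a method, not a value

val f1: Int => Int = addFive _     // explicit η-expansion
val f2: Int => Int = addFive       // an expected function type triggers it too
List(1, 2, 3).map(addFive)         // expanded at the call site: List(6, 7, 8)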
I'm assuming that you're compiling to Java Virtual Machine (JVM) bytecode, such as with scalac, which is probably the most common way to use Scala. Disclaimer: I'm not an expert on the JVM, so some parts of this answer might be a bit wrong, but the general idea is right.
Essentially, a method is a set of instructions for the runtime to execute. It exists as part of the compiled code on disk (e.g. a .class file). When the JVM loads the class, it pulls the entire class file into memory, including the methods. When the JVM encounters a method call, it looks up the method and starts executing the instructions in it. If the method returns a result, the JVM makes that result available in the calling code, then does whatever you wanted to do with it there, such as assigning to a variable.
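If you want to see this for yourself, one way is to disassemble a compiled class (a sketch; the exact output depends on your Scala and JVM versions):
// Foo.scala
class Foo {
  def myFunc(x: Int): Int = x + 5
}

// On the command line:
//   scalac Foo.scala   -- produces Foo.class on disk
//   javap -c Foo       -- disassembles the compiled method bodies
// The output lists myFunc as an entry in the class file whose body is a handful
// of JVM instructions (roughly: load the argument, push 5, add, return).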
With that knowledge, we can answer some of your questions:
From val b = myFunc(2) to val b = 2 + 5, what happened in between?
This isn't quite how it works, as the JVM doesn't "expand" myFunc in place, but instead looks up myFunc and executes the instructions in it.
From where or what object do we know that myFunc(x) is x + 5?
Not from any object. While myFunc is in memory, it's in an area of memory that you can't access directly (but the JVM can).
why can't it be a value since it is a chunk of memory?
Not all memory fits into the nice abstractions of types and values.
This is a rather general-purpose question, not specific to any one language. I don't quite understand the point of passing a function as an argument to another function. If a function, say foo1(), needs to use some result returned by another function foo2(), why can't the values returned/updated by foo2() be passed to foo1() as they are? Or, in another scenario, why can't foo2() be called within foo1(), with its results being used there?
Also what happens under the hood when a foo2() is passed as an argument to foo1()? Is foo2() executed prior to foo1()?
Generally speaking, you pass a function foo2 to a function foo1 in cases where multiple evaluations of foo2 will be necessary - and you perhaps don't know in advance what parameters will be used for each call of foo2, so you couldn't possibly perform the calls yourself in advance.
I think that a sort() function/method on lists might be the best concrete example. Consider a list of People - you might reasonably want to sort them alphabetically by name, or numerically by age, or by geographical distance from a given point, or many other possible orders. It would hardly be practical to include every such ordering as built-in options to sort(): the usual approach taken by languages is to allow the caller to provide a function as a parameter, that defines the ordering between items of the list.
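A Scala sketch of that example (the Person class and sample data are made up):
case class Person(name: String, age: Int)

val people = List(Person("Ann", 34), Person("Bob", 28), Person("Cid", 45))

val byName = people.sortBy(_.name)                     // caller decides: order by name
val byAge  = people.sortBy(_.age)                      // ...or by age
val custom = people.sortWith((a, b) => a.age > b.age)  // ...or any ordering you can express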
There are many reasons:
Dependency injection: if you pass a method that in production will use a database call, and you use it with different parameters, you could substitute it with some mock when unit testing.
Map, filter, reduce: you could apply the same method to a list of parameters to map it, filter it, or reduce it (see the sketch below).
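For instance, a minimal sketch of the second point:
val xs = List(1, 2, 3, 4, 5)

val doubled = xs.map(x => x * 2)          // pass "what to do with each element"
val evens   = xs.filter(x => x % 2 == 0)  // pass "which elements to keep"
val total   = xs.reduce((a, b) => a + b)  // pass "how to combine two elements"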
Usually to provide callbacks, or to separate interface from implementation. Look up the following:
1. Dependency Injection,
2. Callbacks,
3. Anonymous Functions (aka Lambdas),
4. PIMPL
etc
Take a look at this book where it is used extensively in developing TDD with C:
https://www.amazon.co.uk/Driven-Development-Embedded-Pragmatic-Programmers/dp/193435662X
I've read here about the difference between functions and methods in Scala. It says that methods can be slightly faster than functions. But when passing a method m as an argument using m _, m is implicitly converted to a function.
Is the performance difference significant enough to ponder avoiding Functions when it is going to be a bottleneck in my program?
Is there a way to pass a method as an argument without converting it to a Function?
This is somewhat beside your question 2, but in general: forget about performance; methods are more readable than function declarations. They might be a little faster in some situations thanks to compiler optimizations, but:
You cannot pass a method as an argument without converting it to a function. A method is a special language construct, and not an object itself. You must use eta-expansion to convert it to one if you want to use it as an object.
No.
No. In the extremely rare case where this kind of microperformance was important, you'd want to pass the object that the method is on, and either modify the receiving function to accept that, or make the object implement one of the FunctionN interfaces so that it can be used as a function.
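A sketch of that second option (the class name is made up):
class Doubler extends (Int => Int) {     // (Int => Int) is Function1[Int, Int]
  def apply(x: Int): Int = x * 2
}

val d = new Doubler
List(1, 2, 3).map(d)   // d is itself the function value; no extra wrapper is created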
It makes just as much sense avoiding functions as it does avoiding other object allocations. There's nothing particularly special about them.
It's possible to pass and invoke methods directly using reflection, but the performance is going to be much worse than passing functions in the similar situation.
I'm trying to understand how the programming technique known as currying differs from an ordinary callback interface (such as the Observer/Observable interfaces in Java, or the classic Visitor design pattern).
I understand what currying is, I just don't understand why it's uniquely useful to the point that it requires its own terminology and language support.
Could someone explain a programming situation that is better solved by currying than by a callback method? What's the practical significance of the fact that currying uses a separate function for each argument?
[update:] To summarize the answers I got: currying comes part and parcel with the fact that functions are "first-class" citizens, i.e. objects that can be created and passed around like any other object reference. This makes it possible to return a function from a function, in other words currying.
As for the reason why currying is useful: currying provides a syntax that lets you concisely decorate function calls so that derived functions can be created with minimal boilerplate code. Whereas in Java you might create several overloaded or "wrapper" methods for each partial parameter set, which ultimately invoke a master method containing all the parameters, currying provides a lighter syntax that lets you generate these "function wrappers" as needed in your code.
Currying and callbacks are two completely different technologies.
Callbacks are essentially a synonym for "passing a function to a function" (i.e. higher-order function that consumes a function); currying is a form of partial application, i.e. a function which isn't passed all of the parameters it expects returns a new function that only expects the free parameters.
Accordingly, they are not alternatives at all.
Currying is useful because it makes it much easier to concisely create functions that can be used as, for example, callbacks, or in a point-free program. It also means that you can, for example, pass a callback to a curried function like map and get back a new function that applies your callback to every element of any list you care to pass to it.
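A small Scala sketch of the idea (names are mine):
def add(a: Int)(b: Int): Int = a + b   // a curried method

val addTen: Int => Int = add(10)       // supply only the first argument
List(1, 2, 3).map(addTen)              // List(11, 12, 13)
List(1, 2, 3).map(add(10))             // or pass the partially applied call directly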
Well, it's a point of language support.
In Java, for example, you can define all sorts of callback interfaces: one for parameterless methods, one for methods with one argument, one for methods with two arguments, and so forth.
But when functions are first-class citizens, one does not need this: single-argument functions will do the job, because functions can be returned. Hence, one important interface in all "functional Java" projects will be some interface of the form:
interface Fun<A,B> {
    public B apply(A a);
}
or the like that covers this pattern.
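That is also why single-argument functions suffice for multi-argument callbacks: a "two-argument function" is just a Fun<A, Fun<B, C>>. A sketch of the same idea in Scala syntax:
val add: Int => (Int => Int) = a => b => a + b
val addTen: Int => Int = add(10)
addTen(5)   // 15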
I mean, is there some declarative way to prevent an object from changing any of its members?
In the following example
class Student(var name: String)
val s = new Student("John")
"s" has been declared as a val, so it will always point to the same student.
But is there some way to prevent s.name from being changed, just by declaring it immutable somehow?
Or is the only solution to declare everything as a val and manually enforce immutability?
No, it's not possible to declare something immutable. You have to enforce immutability yourself by not allowing anyone to change it, that is, by removing all ways of modifying the class.
Someone can still modify it using reflection, but that's another story.
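A minimal sketch of what that looks like; the withName helper is just illustrative:
class Student(val name: String) {                    // val: no setter is generated
  def withName(newName: String): Student = new Student(newName)
}

val s = new Student("John")
// s.name = "Jane"          // does not compile: name is a val
val t = s.withName("Jane")  // "change" by building a new object instead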
Scala doesn't enforce that, so there is no way to know. There is, however, an interesting compiler-plugin project named pusca (I guess it stands for Pure-Scala). Pure is defined there as not mutating a non-local variable and being side-effect free (e.g. not printing to the console)—so that calling a pure method repeatedly will always yield the same result (what is called referentially transparent).
I haven't tried out that plug-in myself, so I can't say whether it is stable or usable yet.
There is no way that Scala could do this generally.
Consider the following hypothetical example:
class Student(var name: String, var course: Course)

def stuff(course: Course) {
  magically_pure_val s = new Student("Fredzilla", course)
  someFunctionOfStudent(s)
  genericHigherOrderFunction(s, someFunctionOfStudent)
  course.someMethod()
}
The pitfalls for any attempt to actually implement that magically_pure_val keyword are:
someFunctionOfStudent takes an arbitrary student, and isn't implemented in this compilation unit. It was written/compiled knowing that Student consists of two mutable fields. How do we know it doesn't actually mutate them?
genericHigherOrderFunction is even worse; it's going to take our Student and a function of Student, but it's written polymorphically. Whether or not it actually mutates s depends on what its other arguments are; determining that at compile time with full generality requires solving the Halting Problem.
Let's assume we could get around that (maybe we could set some secret flags that mean exceptions get raised if the s object is actually mutated, though personally I wouldn't find that good enough). What about that course field? Does course.someMethod() mutate it? That method call isn't invoked from s directly.
Worse than that, we only know that we'll have passed in an instance of Course or some subclass of Course. So even if we are able to analyze a particular implementation of Course and Course.someMethod and conclude that this is safe, someone can always add a new subclass of Course whose implementation of someMethod mutates the Course.
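A sketch of that last pitfall, with class names invented for illustration:
class Course { def someMethod(): Unit = () }           // looks harmless on its own

class EvilCourse extends Course {
  var counter = 0
  override def someMethod(): Unit = { counter += 1 }   // silently mutates state
}

// stuff() only knows it was handed "a Course"; no analysis of Course alone can
// rule out that course.someMethod() resolves to the mutating override above.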
There's simply no way for the compiler to check that a given object cannot be mutated. The pusca plugin mentioned by 0__ appears to detect purity the same way Mercury does: by ensuring that every method is known from its signature to be either pure or impure, and by raising a compiler error if the implementation of anything declared to be pure does anything that could cause impurity (unless the programmer promises that the method is pure anyway).[1]
This is quite different from simply declaring a value to be completely (and deeply) immutable and expecting the compiler to notice if any of the code that could touch it might mutate it. It is also not a perfect inference, just a conservative one.
[1]The pusca README claims that it can infer impurity of methods whose last expression is a call to an impure method. I'm not quite sure how it can do this, as checking if that last expression is an impure call requires checking if it's calling a not-declared-impure method that should be declared impure by this rule, and the implementation might not be available to the compiler at that point (and indeed could be changed later even if it is). But all I've done is look at the README and think about it for a few minutes, so I might be missing something.