I have been going through the Scala Stream collection API and I have noticed that Stream.cons is implemented as an embedded object. What advantage does this have over implementing it as a function? Under what circumstances should one consider using this technique?
Cheers.
As an object, it defines unapply in addition to apply, which lets you pattern match on it.
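A minimal sketch of the same technique on an ordinary List, assuming a hypothetical object named NonEmpty (it is not part of the standard library and is not how Stream.cons itself is written). The apply method makes the object callable like a function, and unapply makes it usable as a pattern:

```scala
// Hypothetical example; NonEmpty is illustrative and not a standard library object.
object NonEmpty {
  // apply lets callers write NonEmpty(head, tail) as if it were a function call.
  def apply[A](head: A, tail: List[A]): List[A] = head :: tail

  // unapply lets NonEmpty(h, t) appear on the left-hand side of a case.
  def unapply[A](xs: List[A]): Option[(A, List[A])] =
    xs match {
      case h :: t => Some((h, t))
      case Nil    => None
    }
}

object ConsDemo {
  def describe(xs: List[Int]): String = xs match {
    case NonEmpty(h, t) => s"head=$h, rest=${t.length}"
    case _              => "empty"
  }
}
```

A plain function could only provide the construction half; the extractor half needs an object to hold unapply.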
Is there a method for switching off that warning? I completely understand what it says, why it is generally helpful and why WithFilter exists, but this particular monad is used to compose individual functions rather than collections of monadic type values, and WithFilter won't provide any meaningful improvement while requiring an additional method and classes confounding the simple interface of the class.
What about defining def withFilter(f: A => Boolean) = filter(f) and documenting that it only exists for this purpose? Unfortunately, the Scala compiler doesn't have a general way to switch off warnings you don't want, and I don't think there is a way specific to this one.
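As a rough sketch of that suggestion, assuming a hypothetical wrapper class called Box (the asker's actual class is not shown), withFilter simply delegates to filter so that guards in for comprehensions resolve to it directly:

```scala
// Hypothetical minimal wrapper; Box stands in for the asker's class.
final case class Box[A](value: Option[A]) {
  def map[B](f: A => B): Box[B] = Box(value.map(f))
  def flatMap[B](f: A => Box[B]): Box[B] = Box(value.flatMap(a => f(a).value))
  def filter(p: A => Boolean): Box[A] = Box(value.filter(p))

  // Exists only so that `if` guards in for comprehensions find withFilter.
  def withFilter(p: A => Boolean): Box[A] = filter(p)
}

object BoxDemo {
  // The guard desugars to a withFilter call followed by map.
  val kept: Box[Int] = for (x <- Box(Option(4)) if x % 2 == 0) yield x * 10
  val dropped: Box[Int] = for (x <- Box(Option(3)) if x % 2 == 0) yield x * 10
}
```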
I've read here about the difference between functions and methods in scala. It says that methods can be slightly faster than functions. But when passing a method m as an argument using m _, m is implicitly converted to a function.
Is the performance difference significant enough to ponder avoiding Functions when it is going to be a bottleneck in my program?
Is there a way to pass a method as an argument without converting it to a Function?
The first question is largely made moot by the answer to the second. In general, though, forget about performance: methods are more readable than function declarations. They might be a little faster in some situations because of compiler optimizations, but that doesn't help you here.
You cannot pass a method as an argument without converting it to a function. A method is a special language construct, not an object itself. You must use eta-expansion to convert it to a function if you want to use it as an object.
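A small sketch of eta-expansion, using made-up names (EtaDemo, double, applyTwice). Where a function type is expected, the compiler performs the conversion on its own; writing m _ requests the same conversion explicitly:

```scala
object EtaDemo {
  // A method: a member of EtaDemo, not a value in its own right.
  def double(x: Int): Int = x * 2

  // A higher-order method that expects a function value.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // The compiler eta-expands `double` into a function object at this call.
  val viaMethod: Int = applyTwice(double, 3)

  // The expanded function can also be stored as an ordinary value.
  val asFunction: Int => Int = double
}
```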
To the first question: no.
To the second: no. In the extremely rare case where this kind of micro-performance matters, you'd want to pass the object that the method is on, and either modify the receiving function to accept that object, or make the object implement one of the FunctionN interfaces so that it can be used as a function.
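A sketch of the second option, assuming a hypothetical class named Scaler. Because the object itself extends Function1, it can be handed to map directly and no separate wrapper has to be created for the method:

```scala
// Hypothetical class; the name Scaler is illustrative only.
final class Scaler(factor: Int) extends (Int => Int) {
  // The method doing the work is the function's own apply.
  def apply(x: Int): Int = x * factor
}

object ScalerDemo {
  val scaler = new Scaler(3)

  // Passed directly where an Int => Int is expected.
  val scaled: List[Int] = List(1, 2, 3).map(scaler)
}
```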
It makes just as much sense avoiding functions as it does avoiding other object allocations. There's nothing particularly special about them.
It's possible to pass and invoke methods directly using reflection, but the performance is going to be much worse than passing functions in a similar situation.
I'm new to Scala (I've just started learning it), and I've noticed something that seems strange to me: the classes Array and List both have methods such as foreach, forall, map, etc., but none of these methods appear to be inherited from some special class (trait). From a Java perspective, if Array and List provide some contract, that contract has to be declared in an interface and partially implemented in abstract classes. Why does each type (Array and List) declare its own set of methods in Scala? Why don't they have some common type?
But none of these methods appear to be inherited from some special class (trait)
That's simply not true.
If you open the scaladoc, look up, say, the map method of Array and List, and then click on it, you'll see where it is defined:
For list:
For array:
See also info about Traversable and Iterable both of which define most of the contracts in scala collections (but some collections may re-implement methods defined in Traversable/Iterable, e.g. for efficiency).
You may also want to look at relations between collections (scroll to the two diagrams) in general.
I'll extend om-nom-nom's answer here.
Scala doesn't have an Array of its own -- that's Java's array, and a Java array doesn't implement any interface. In fact, it isn't even a proper class, if I'm not mistaken, and it certainly is implemented through special mechanisms at the bytecode level.
In Scala, however, everything is a class -- an Int (Java's int) is a class, and so is Array. But in these cases, where the actual class comes from Java, Scala is limited by the type hierarchy provided by Java.
Now, going back to foreach, map, etc, they are not methods present in Java. However, Scala allows one to add implicit conversions from one class to another, and, through that mechanism, add methods. When you call arr.foreach(println), what is really done is Predef.refArrayOps(arr).foreach(println), which means foreach belongs to the ArrayOps class -- as you can see in the scaladoc documentation.
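The mechanism can be sketched with a made-up wrapper, ShoutOps, which plays the role that ArrayOps plays for arrays. The names here are assumptions for illustration and not part of the standard library:

```scala
import scala.language.implicitConversions

object EnrichDemo {
  // Hypothetical wrapper class that carries the extra method.
  final class ShoutOps(private val s: String) {
    def shout: String = s.toUpperCase + "!"
  }

  // Implicit conversion from String to the wrapper.
  implicit def toShoutOps(s: String): ShoutOps = new ShoutOps(s)

  // String has no `shout` method, so the compiler rewrites this call
  // to toShoutOps("hello").shout.
  val result: String = "hello".shout
}
```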
I was wondering if there is a good reason to use Collection.empty[T] instead of new Collection[T]() (or the inverse)? Or is it just a personal preference?
Thanks.
Calling new Collection[T]() will create a new instance every time. On the other hand, Collection.empty[T] will most likely always return the same singleton object, usually defined somewhere as
object Empty extends Collection[Nothing] ...
which will be much faster. Edit: This is only possible for immutable collections; mutable collections have to return a new instance every time empty is called.
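A quick way to observe the difference, using List and ArrayBuffer as stand-ins for the generic Collection in the question:

```scala
object EmptyDemo {
  // Immutable: both calls return the same shared empty list (Nil),
  // regardless of the element type requested.
  val sameImmutable: Boolean = List.empty[Int] eq List.empty[String]

  // Mutable: each call has to produce a fresh, independent buffer.
  val a = scala.collection.mutable.ArrayBuffer.empty[Int]
  val b = scala.collection.mutable.ArrayBuffer.empty[Int]
  val sameMutable: Boolean = a eq b
}
```

Here eq compares references, so sameImmutable is true and sameMutable is false.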
You should always prefer Collection.empty[Type].
In addition to Collection.empty[T] being clearer about the intent, you should favour it for the same reason that you should favour factory methods in general when instantiating a collection: because those factories abstract away some implementation details that you might not (or should not) care about.
For example, when you do Seq.empty[String] you actually get an instance of List[String]. You could directly instantiate a List[String], but if all you care about is having some Seq, you would introduce a needless dependency on List (well OK, actually you cannot as it stands, because List is already abstract, but let's pretend we can for the sake of the argument).
The whole point of factories is precisely to have some amount of separation of concerns and not bother with unnecessary instantiation details.
As another, more elaborate example, let's talk about collection.immutable.HashMap. This one is very much a concrete class, so you might think there is no need for a factory here. Except that, for optimization purposes, the factory in the companion object collection.immutable.HashMap will actually create different sub-classes depending on the number of elements that you initialize the map with (see this question: Scala: how to make a Hash(Trie)Map from a Map (via Anorm in Play)). Obviously, if you directly instantiate collection.immutable.HashMap you will lose this optimization.
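The Seq case can be checked with a short sketch (FactoryDemo is just a container name for this example). The calling code only ever mentions Seq, while the factory picks the concrete class:

```scala
object FactoryDemo {
  // Only the Seq contract is named here.
  val names: Seq[String] = Seq.empty[String]

  // The factory happens to hand back a List behind the scenes.
  val backedByList: Boolean = names.isInstanceOf[List[_]]

  // Code written against Seq keeps working whatever the factory returns.
  val more: Seq[String] = names :+ "a"
}
```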
Another common optimization for empty is to always return (when it is an immutable collection) the same instance, yet another useful optimization that you would lose by directly instantiating the collection.
So as a rule of thumb, as far as you can you should use the factories that are provided by the various collection companion objects, so as to shield yourself from unneeded dependencies while at the same time benefiting from potential optimizations provided by the collection framework.
empty is just a special case of factory, and so the same logic applies.
I have been reading about methods and functions in Scala. Jim's post and Daniel's complement to it do a good job of explaining what the differences between these are. Here is what I took with me:
functions are objects, methods are not;
as a consequence functions can be passed as argument, but methods can not;
methods can be type-parametrised, functions can not;
methods are faster.
I also understand the difference between def, val and var.
Now I have actually two questions:
Why can't we parametrise the apply method of a function to parametrise the function? And
Why can't the method be called by the function object to run faster? Or why can't the caller of the function be made to call the original method directly?
Looking forward to your answers and many thanks in advance!
1 - Parameterizing functions.
It is theoretically possible for a compiler to parameterize the type of a function; one could add that as a feature. It isn't entirely trivial, though, because functions are contravariant in their argument and covariant in their return value:
trait Function1[-T, +R] { ... }
which means that a function that accepts a wider range of arguments counts as a subtype (since it can process anything that the supertype can process), and if it produces a narrower set of results, that's okay (since it still obeys the supertype's contract that way). But how do you encode
def fn[A](a: A) = a
in that framework? The whole point is that the return type is equal to the type passed in, whatever that type has to be. You'd need
Function1[ ThisCanBeAnything, ThisHasToMatch ]
as your function type. "This can be anything" is well-represented by Any if you want a single type, but then you could return anything as the original type is lost. This isn't to say that there is no way to implement it, but it doesn't fit nicely into the existing framework.
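The trade-off described above can be sketched as follows, with PolyDemo and the value names being illustrative only. The method keeps its type parameter, while each function value derived from it has to commit to concrete types:

```scala
object PolyDemo {
  // A method can abstract over its type parameter.
  def fn[A](a: A): A = a

  // A function value must fix the types when it is created.
  val intId: Int => Int = fn[Int]

  // Falling back to Any accepts everything but forgets the argument's type.
  val anyId: Any => Any = fn[Any]

  val stillInt: Int = fn(42)   // the method preserves Int
  val onlyAny: Any = anyId(42) // the function can only promise Any
}
```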
2 - Speed of functions.
This is really simple: a function is the apply method on another object. You have to have that object in order to call its method. This will always be slower (or at least no faster) than calling your own method, since you already have yourself.
As a practical matter, JVMs can do a very good job inlining functions these days; there is often no difference in performance as long as you're mostly using your method or function, not creating the function object over and over. If you're deeply nesting very short loops, you may find yourself creating way too many functions; moving them out into vals outside of the nested loops may save time. But don't bother until you've benchmarked and know that there's a bottleneck there; typically the JVM does the right thing.
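A sketch of hoisting a function out of a loop, assuming a hypothetical helper called sumScaled. Whether this makes a measurable difference depends on the compiler version and the JVM, so treat it as something to try only after benchmarking:

```scala
object HoistDemo {
  def sumScaled(rows: Seq[Seq[Int]], factor: Int): Int = {
    // Created once, outside the loop; it captures `factor`.
    val scale: Int => Int = x => x * factor

    var total = 0
    val outer = rows.iterator
    while (outer.hasNext) {
      // Writing outer.next().map(x => x * factor) here instead could
      // create a new function object on every iteration.
      total += outer.next().map(scale).sum
    }
    total
  }
}
```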
Think about the type signature of a function. It explicitly says what types it takes. So then type-parameterizing apply() would be inconsistent.
A function is an object, which must be created, initialized, and then garbage-collected. When apply() is called, it has to grab the function object in addition to the parent.