Preamble: I'm teaching a course in object-functional programming using Scala and one of the things we do is to take sample problems and compare how they might be implemented using object-functional programming and using state-based, object-oriented programming, which is the background most of the students have.
So I want to implement a simple class in Scala that has a private var with a public accessor method (a very common idiom in state-based, object-oriented programming). Looking at Alvin Alexander's "Scala Cookbook" the recommended code to do this is pretty ghastly:
class Person(private var _age: Int):
def incrAge() = _age += 1
def age = _age
I say "ghastly" because I'm having to invent two names that essentially represent the age field, one used in the constructor and another used in the class interface. I'm curious if people more familiar with Scala would know of a simpler syntax that would avoid this?
EDIT: It seems clear to me now that Scala combines the val/var declaration with the given visibility (public/private), so for a var either both accessor&mutator are public or both are private. Depending on perspective, you might find this inflexible, or feel it rightly punishes you for using var 🙂.
Yes, a better way of doing it is not using var
class Person(val age: Int) {
def incrAge = new Person(age+1)
}
If you are going to write idiomatic scala code, you should start with pretending that certain parts of it simply do not exist: mostly vars, nulls and returns, but also mutable structures or collections, arrays, and certain methods like .get on a Try or an Option, or the Await object. Oh, and also isInstanceOf and asInstance.
You may ask "why do these things exist if they are not supposed to be used?". Well, because sometimes, in a very few very specific cases they are actually useful for achieving a very limited very specific purpose. But that would be probably fewer than 0.1% of the cases you will come across in your career, unless you are involved in some hard core low level library development (in which case, you would not be posting questions like this here).
So, until you acquire enough command of the language to be able to definitively distinguish those 0.1% of the cases from the other 99.9%, you are much better off simply ignoring those language features, and pretending they do not exist (if you can't figure out how to achieve a certain task without using one of those, post a question here, and people will gladly help you).
You said "Having to create two names to manage a single field is ugly." Indeed. But you know what's uglier? Using vars.
(Btw, the way you typically do this in java is getAge and setAge – still two names. The ugliness is rooted in allowing the value labeled with the given name to be different at different points of program execution, not in how specifically the semantics of mutation looks like).
Related
I am in the process of learning Scala through Coursera course (progfun).
We are being learned to think functionally and use tail recursions when possible to implement functions/methods.
And as an example for foreach on a list function, we have taught to implement it like:
def foreach[T](list: List[T], f: [T] => Unit) {
if (!list.isEmpty) foreach(list.tail) else f(list.head)
}
Then I was surprised when I found the following implementation in some Scala apis:
override /*IterableLike*/
def foreach[B](f: A => B) {
var these = this
while (!these.isEmpty) {
f(these.head)
these = these.tail
}
}
So how come we are being learned to use recursion and avoid using mutable variables and the api is being implemented by opposite techniques?
Have a look at scala.collection.LinearSeqOptimized where scala.collection.immutable.List extend. (similar implementation found in the List class itself)
Don't forget that Scala is intended to be a multiparadigm language. For educational purposes, it's good to know how to read and write tail-call recursive functions. But when using the language day-to-day, it's important to remember that it's not pure FP.
It's possible that part of the library predated TCO and the #tailrec annotation. You'd have to look at commit history to find out.
That implementation of foreach might use a mutable var, but from the outside, it appears to be pure. Ultimately, this is exactly what TCO would do behind the scenes.
There are two parts to your question:
So how come we are being learned to use recursion and avoid using mutable variables
Because the teachers assume that you either already know about imperative programming with mutable state and loops or will be exposed to it sometime during your career anyway, so they would rather focus on teaching you the things you are less likely to pick up on your own.
Also, imperative programming with mutable state is much harder to reason about, much harder to understand and thus much harder to teach.
and the api is being implemented by opposite techniques?
Because the Scala standard library is intended to be a high-performance industrial-strength library, not a teaching example. Maybe the person who wrote that code profiled it and measured it to be 0.001% percent faster than the tail-recursive version. Maybe, when that code was written, the compiler couldn't yet reliably optimize the tail-recursive version.
Don't forget that Iterable and friends are the cornerstone of Scala's collections library, those methods you are looking at are probably among the most often executed methods in the entire Scala universe. Even the tiniest performance optimization pays out in a method that is executed billions of times.
I've been reading a lot of other people's Scala code recently, and one of the things that I have difficultly with (coming from Java) is a lack of explicit type annotations.
It's certainly convenient when writing code to be able to leave out type annotations -- however when reading code I often find that explicit type annotations help me to understand at a glance what code is doing more easily.
The Scala style guide (http://docs.scala-lang.org/style/types.html) doesn't seem to provide any definitive guidance on this, stating:
Use type inference where possible, but put clarity first, and favour explicitness in public APIs.
To my mind, this is a bit contradictory. While it's clearly obvious what type this variable is:
val tokens = new HashMap[String, Int]
It's not so obvious what type this one is:
val tokens = readTokens()
So, if I was putting clarity first I would probably annotate all variables where the type is not already declared on the same line.
Do any Scala practitioners have guidance on this? Am I crazy to be considering adding type annotations to my local variables? I'm particularly interested in hearing from folks who spend a lot of time reading scala code (for example, in code reviews), as well as writing it.
It's not so obvious what type this one is:
val tokens = readTokens()
Good names are important: the name is plural, ergo it returns some collection of some kind. The most general collection types in Scala are Traversable and Iterator, and they mostly share a common interface, so it's not really important which one of the two it is. The name also talks about "reading tokens", ergo it obviously should return Tokens in some fashion. And last but not least, the method call has parentheses, which according to the style guide means it has side-effects, so I wouldn't count on being able to traverse the collection more than once.
Ergo, the return type is something like
Traversable[Token]
or
Iterator[Token]
and which of the two it is doesn't really matter because their client interfaces are mostly identical.
Note also that the latter constraint (only traversing the collection once) isn't even captured in the type, even if you were providing an explicit type, you would still have to look at the name and the style!
I am new to Scala and heard a lot that everything is an object in Scala. What I don't get is what's the advantage of "everything's an object"? What are things that I cannot do if everything is not an object? Examples are welcome. Thanks
The advantage of having "everything" be an object is that you have far fewer cases where abstraction breaks.
For example, methods are not objects in Java. So if I have two strings, I can
String s1 = "one";
String s2 = "two";
static String caps(String s) { return s.toUpperCase(); }
caps(s1); // Works
caps(s2); // Also works
So we have abstracted away string identity in our operation of making something upper case. But what if we want to abstract away the identity of the operation--that is, we do something to a String that gives back another String but we want to abstract away what the details are? Now we're stuck, because methods aren't objects in Java.
In Scala, methods can be converted to functions, which are objects. For instance:
def stringop(s: String, f: String => String) = if (s.length > 0) f(s) else s
stringop(s1, _.toUpperCase)
stringop(s2, _.toLowerCase)
Now we have abstracted the idea of performing some string transformation on nonempty strings.
And we can make lists of the operations and such and pass them around, if that's what we need to do.
There are other less essential cases (object vs. class, primitive vs. not, value classes, etc.), but the big one is collapsing the distinction between method and object so that passing around and abstracting over functionality is just as easy as passing around and abstracting over data.
The advantage is that you don't have different operators that follow different rules within your language. For example, in Java to perform operations involving objects, you use the dot name technique of calling the code (static objects still use the dot name technique, but sometimes the this object or the static object is inferred) while built-in items (not objects) use a different method, that of built-in operator manipulation.
Number one = Integer.valueOf(1);
Number two = Integer.valueOf(2);
Number three = one.plus(two); // if only such methods existed.
int one = 1;
int two = 2;
int three = one + two;
the main differences is that the dot name technique is subject to polymorphisim, operator overloading, method hiding, and all the good stuff that you can do with Java objects. The + technique is predefined and completely not flexible.
Scala circumvents the inflexibility of the + method by basically handling it as a dot name operator, and defining a strong one-to-one mapping of such operators to object methods. Hence, in Scala everything is an object means that everything is an object, so the operation
5 + 7
results in two objects being created (a 5 object and a 7 object) the plus method of the 5 object being called with the parameter 7 (if my scala memory serves me correctly) and a "12" object being returned as the value of the 5 + 7 operation.
This everything is an object has a lot of benefits in a functional programming environment, for example, blocks of code now are object too, making it possible to pass back and forth blocks of code (without names) as parameters, yet still be bound to strict type checking (the block of code only returns Long or a subclass of String or whatever).
When it boils down to it, it makes some kinds of solutions very easy to implement, and often the inefficiencies are mitigated by the lack of need to handle "move into primitives, manipulate, move out of primitives" marshalling code.
One specific advantage that comes to my mind (since you asked for examples) is what in Java are primitive types (int, boolean ...) , in Scala are objects that you can add functionality to with implicit conversions. For example, if you want to add a toRoman method to ints, you could write an implicit class like:
implicit class RomanInt(i:Int){
def toRoman = //some algorithm to convert i to a Roman representation
}
Then, you could call this method from any Int literal like :
val romanFive = 5.toRoman // V
This way you can 'pimp' basic types to adapt them to your needs
In addition to the points made by others, I always emphasize that the uniform treatment of all values in Scala is in part an illusion. For the most part it is a very welcome illusion. And Scala is very smart to use real JVM primitives as much as possible and to perform automatic transformations (usually referred to as boxing and unboxing) only as much as necessary.
However, if the dynamic pattern of application of automatic boxing and unboxing is very high, there can be undesirable costs (both memory and CPU) associated with it. This can be partially mitigated with the use of specialization, which creates special versions of generic classes when particular type parameters are of (programmer-specified) primitive types. This avoids boxing and unboxing but comes at the cost of more .class files in your running application.
Not everything is an object in Scala, though more things are objects in Scala than their analogues in Java.
The advantage of objects is that they're bags of state which also have some behavior coupled with them. With the addition of polymorphism, objects give you ways of changing the implicit behavior and state. Enough with the poetry, let's go into some examples.
The if statement is not an object, in either scala or java. If it were, you could be able to subclass it, inject another dependency in its place, and use it to do stuff like logging to a file any time your code makes use of the if statement. Wouldn't that be magical? It would in some cases help you debug stuff, and in other cases it would make your hairs grow white before you found a bug caused by someone overwriting the behavior of if.
Visiting an objectless, statementful world: Imaging your favorite OOP programming language. Think of the standard library it provides. There's plenty of classes there, right? They offer ways for customization, right? They take parameters that are other objects, they create other objects. You can customize all of these. You have polymorphism. Now imagine that all the standard library was simply keywords. You wouldn't be able to customize nearly as much, because you can't overwrite keywords. You'd be stuck with whatever cases the language designers decided to implement, and you'd be helpless in customizing anything there. Such languages exist, you know them well, they're the sequel-like languages. You can barely create functions there, but in order to customize the behavior of the SELECT statement, new versions of the language had to appear which included the features most desired. This would be an extreme world, where you'd only be able to program by asking the language designers for new features (which you might not get, because someone else more important would require some feature incompatible with what you want)
In conclusion, NOT everything is an object in scala: Classes, expressions, keywords and packages surely aren't. More things however are, like functions.
What's IMHO a nice rule of thumb is that more objects equals more flexibility
P.S. in Python for example, even more things are objects (like the classes themselves, the analogous concept for packages (that is python modules and packages). You'd see how there, black magic is easier to do, and that brings both good and bad consequences.
There are 2 reasons for me to ask:
1. I'd like a better code fragmentation to facilitate version control on per-function level
2. I struggle from some attention deficit disorder and it is hard for me to work with long pieces of code such as big class files
To address these problems I used to use include directives in C++ and partial class definitions and manually-definable foldable regions in C#. Are there any such things available in Scala 2.8?
I've tried to use editor-fold tag in NetBeans IDE, but it does not work in Scala editor unfortunately :-(
UPDATE: As far as I understand, there are no such facilities in Scala. So I'd like to ask: someone who has any connection to Scala authors, or an account on their Bugzilla (or whatever they use), please, suggest them an idea - they should probably think of introducing something of such (I was fascinated by C# regions and partial classes for example, and plain old includes also look like a convenient tool to have) to make Scala even more beautiful through laconicity, IMHO.
How about doing it with traits? You define it like this:
trait Similarity
{
def isSimilar(x: Any): Boolean
def isNotSimilar(x: Any): Boolean = !isSimilar(x)
}
...and then you use it like so:
class Point(xc: Int, yc: Int) extends Similarity
{
var x: Int = xc
var y: Int = yc
def isSimilar(obj: Any) =
obj.isInstanceOf[Point] &&
obj.asInstanceOf[Point].x == x
}
If the class Point were bigger, you could split it further into traits, resulting in the division that you want. Please note, however, that I don't think this is advisable, as it will make it very difficult to get a good overview of your code, unless you already know it by heart. If you can break it in a nice way, however, you might be able to get some nice, reusable blocks out of it, so in the end it might still be worth doing.
Best of luck to you!
//file A.scala
trait A { self: B =>
....
}
//file B.scala
trait B { self: A =>
....
}
//file C.scala
class C extends A with B
I suggest to read white paper by Martin at this link. In this white paper 'Case Sudy: The Scala Compiler' chapter will give you idea about how you can achieve component based design having code in several separate files.
Scala code folding works properly in IDEA.
The version control tools I work with (bzr or git, mostly) have no trouble isolating changes line-by-line. What use case do you have--that's common enough to worry about--where line-level isolation (which allows changes to independent methods to be merged without user intervention) is not enough?
Also, if you can't focus on something as large as one class with many methods, use more classes. A method generally requires you to know what the other methods are, what the fields are, and so on. Having that split across separate files is just asking for trouble. Instead, encapsulate your problem in a different way so you can work with smaller self-contained chunks at a time.
I keep seeing the phrase "duck typing" bandied about, and even ran across a code example or two. I am way too lazy busy to do my own research, can someone tell me, briefly:
the difference between a 'duck type' and an old-skool 'variant type', and
provide an example of where I might prefer duck typing over variant typing, and
provide an example of something that i would have to use duck typing to accomplish?
I don't mean to seem fowl by doubting the power of this 'new' construct, and I'm not ducking the issue by refusing to do the research, but I am quacking up at all the flocking hype i've been seeing about it lately. It looks like no typing (aka dynamic typing) to me, so I'm not seeing the advantages right away.
ADDENDUM: Thanks for the examples so far. It seems to me that using something like 'O->can(Blah)' is equivalent to doing a reflection lookup (which is probably not cheap), and/or is about the same as saying (O is IBlah) which the compiler might be able to check for you, but the latter has the advantage of distinguishing my IBlah interface from your IBlah interface while the other two do not. Granted, having a lot of tiny interfaces floating around for every method would get messy, but then again so can checking for a lot of individual methods...
...so again i'm just not getting it. Is it a fantastic time-saver, or the same old thing in a brand new sack? Where is the example that requires duck typing?
In some of the answers here, I've seen some incorrect use of terminology, which has lead people to provide wrong answers.
So, before I give my answer, I'm going to provide a few definitions:
Strongly typed
A language is strongly typed if it enforces the type safety of a program. That means that it guarantees two things: something called progress and something else called preservation. Progress basically means that all "validly typed" programs can in fact be run by the computer, They may crash, or throw an exception, or run for an infinite loop, but they can actually be run. Preservation means that if a program is "validly typed" that it will always be "Validly typed", and that no variable (or memory location) will contain a value that does not conform to its assigned type.
Most languages have the "progress" property. There are many, however, that don't satisfy the "preservation" property. A good example, is C++ (and C too). For example, it is possible in C++ to coerce any memory address to behave as if it was any type. This basically allows programmers to violate the type system any time they want. Here is a simple example:
struct foo
{
int x;
iny y;
int z;
}
char * x = new char[100];
foo * pFoo = (foo *)x;
foo aRealFoo;
*pFoo = aRealFoo;
This code allows someone to take an array of characters and write a "foo" instance to it. If C++ was strongly typed this would not be possible. Type safe languages, like C#, Java, VB, lisp, ruby, python, and many others, would throw an exception if you tried to cast an array of characters to a "foo" instance.
Weakly typed
Something is weakly typed if it is not strongly typed.
Statically typed
A language is statically typed if its type system is verified at compile time. A statically typed language can be either "weakly typed" like C or strongly typed like C#.
Dynamically typed
A dynamically typed language is a language where types are verified at runtime. Many languages have a mixture, of some sort, between static and dynamic typing. C#, for example, will verify many casts dynamically at runtime because it's not possible to check them at compile time. Other examples are languages like Java, VB, and Objective-C.
There are also some languages that are "completely" or "mostly" dynamically typed, like "lisp", "ruby", and "small talk"
Duck typing
Duck typing is something that is completely orthogonal to static, dynamic, weak, or strong typing. It is the practice of writing code that will work with an object regardless of its underlying type identity. For example, the following VB.NET code:
function Foo(x as object) as object
return x.Quack()
end function
Will work, regardless of what the type of the object is that is passed into "Foo", provided that is defines a method called "Quack". That is, if the object looks like a duck, walks like a duck, and talks like a duck, then it's a duck. Duck typing comes in many forms. It's possible to have static duck typing, dynamic duck typing, strong duck typing, and weak duck typing. C++ template functions are a good example of "weak static duck typing". The example show in "JaredPar's" post shows an example of "strong static duck typing". Late binding in VB (or code in Ruby or Python) enables "strong dynamic duck typing".
Variant
A variant is a dynamically typed data structure that can hold a range of predefined data types, including strings, integer types, dates, and com objects. It then defines a bunch of operations for assigning, converting, and manipulating data stored in variants. Whether or not a variant is strongly typed depends on the language in which it is used. For example, a variant in a VB 6 program is strongly typed. The VB runtime ensures that operations written in VB code will conform to the typing rules for variants. Tying to add a string to an IUnknown via the variant type in VB will result in a runtime error. In C++, however, variants are weakly typed because all C++ types are weakly typed.
OK.... now that I have gotten the definitions out of the way, I can now answer your question:
A variant, in VB 6, enables one form of doing duck typing. There are better ways of doing duck typing (Jared Par's example is one of the best), than variants, but you can do duck typing with variants. That is, you can write one piece of code that will operate on an object regardless of its underlying type identity.
However, doing it with variants doesn't really give a lot of validation. A statically typed duck type mechanism, like the one JaredPar describes gives the benefits of duck typing, plus some extra validation from the compiler. That can be really helpful.
The simple answer is variant is weakly typed while duck typing is strongly typed.
Duck typing can be summed up nicely as "if it walks like a duck, looks like a duck, acts like a duck, then it's a duck." It computer science terms consider duck to be the following interface.
interface IDuck {
void Quack();
}
Now let's examine Daffy
class Daffy {
void Quack() {
Console.WriteLine("Thatsssss dispicable!!!!");
}
}
Daffy is not actually an IDuck in this case. Yet it acts just like a Duck. Why make Daffy implement IDuck when it's quite obvious that Daffy is in fact a duck.
This is where Duck typing comes in. It allows a type safe conversion between any type that has all of the behaviors of a IDuck and an IDuck reference.
IDuck d = new Daffy();
d.Quack();
The Quack method can now be called on "d" with complete type safety. There is no chance of a runtime type error in this assignment or method call.
Duck typing is just another term for dynamic typing or late-binding. A variant object that parses/compiles with any member access (e.g., obj.Anything) that may or not actually be defined during runtime is duck typing.
Probably nothing requires duck-typing, but it can be convenient in certain situations.
Say you have a method that takes and uses an object of the sealed class Duck from some 3rd party library. And you want to make the method testable. And Duck has an awfully big API (kind of like ServletRequest) of which you only need to care about a small subset. How do you test it?
One way is to make the method take something that quacks. Then you can simply create a quacking mock object.
Try reading the very first paragraph of the Wikipedia article on duck typing.
Duck typing on Wikipedia
I can have an interface (IRunnable) that defines the method Run().
If I have another class with a method like this:
public void RunSomeRunnable(IRunnable rn) { ... }
In a duck type friendly language I could pass in any class that had a Run() method into the RunSomeRunnable() method.
In a statically typed language the class being passed into RunSomeRunnable needs to explicitly implement the IRunnable interface.
"If it Run() like a duck"
variant is more like object in .NET at least.
#Kent Fredric
Your example can most certainly be done without duck typing by using explicit interfaces...uglier yes, but it's not impossible.
And personally, I find having well defined contracts in interfaces much better for enforcing quality code, than relying on duck typing...but that's just my opinion and take it with a grain of salt.
public interface ICreature { }
public interface IFly { fly();}
public interface IWalk { walk(); }
public interface IQuack { quack(); }
// ETC
// Animal Class
public class Duck : ICreature, IWalk, IFly, IQuack
{
fly() {};
walk() {};
quack() {};
}
public class Rhino: ICreature, IWalk
{
walk();
}
// In the method
List<ICreature> creatures = new List<ICreature>();
creatures.Add(new Duck());
creatures.Add(new Rhino());
foreach (ICreature creature in creatures)
{
if (creature is IFly)
(creature as IFly).fly();
if (creature is IWalk)
(creature as IWalk).walk();
}
// Etc
In regards to your request for an example of something you'd need to use duck typing to accomplish, I don't think such a thing exists. I think of it like I think about whether to use recursion or whether to use iteration. Sometimes one just works better than the other.
In my experience, duck typing makes code more readable and easier to grasp (both for the programmer and the reader). But I find that more traditional static typing eliminates a lot of needless typing errors. There's simply no way to objectively say one is better than another or even to say what situations one is more effective than the other.
I say that if you're comfortable using static typing, then use it. But you should at least try duck typing out (and use it in a nontrivial project if possible).
To answer you more directly:
...so again i'm just not getting it. Is it a fantastic time-saver, or the same old thing in a brand new sack?
It's both. You're still attacking the same problems. You're just doing it a different way. Sometimes that's really all you need to do to save time (even if for no other reason to force yourself to think about doing something a different way).
Is it a panacea that will save all of mankind from extinction? No. And anyone who tells you otherwise is a zealot.
A variant (at least as I've used them in VB6) holds a variable of a single, well-defined, usually static type. E.g., it might hold an int, or a float, or a string, but variant ints are used as ints, variant floats are used as floats, and variant strings are used as strings.
Duck typing instead uses dynamic typing. Under duck typing, a variable might be usable as an int, or a float, or a string, if it happens to support the particular methods that an int or float or string supports in a particular context.
Example of variants versus duck typing:
For a web application, suppose I want my user information to come from LDAP instead of from a database, but I still want my user information to be useable by the rest of the web framework, which is based around a database and an ORM.
Using variants: No luck. I can create a variant that can contain a UserFromDbRecord object or a UserFromLdap object, but UserFromLdap objects won't be usable by routines that expect objects from the FromDbRecord hierarchy.
Using duck typing: I can take my UserFromLdap class and add a couple of methods that make it act like a UserFromDbRecord class. I don't need to replicate the entire FromDbRecord interface, just enough for the routines that I need to use. If I do this right, it's an extremely powerful and flexible technique. If I do it wrong, it produces very confusing and brittle code (subject to breakage if either the DB library or the LDAP library changes).
I think the core point of duck typing is how it is used. One uses method detection and introspection of the entity in order to know what to do with it, instead of declaring in advance what it will be ( where you know what to do with it ).
It's probably more practical in OO languages, where primitives are not primitives, and are instead objects.
I think the best way to sum it up, in variant type, an entity is/can be anything, and what it is is uncertain, as opposed to an entity only looks like anything, but you can work out what it is by asking it.
Here's something I don't believe is plausible without ducktyping.
sub dance {
my $creature = shift;
if( $creature->can("walk") ){
$creature->walk("left",1);
$creature->walk("right",1);
$creature->walk("forward",1);
$creature->walk("back",1);
}
if( $creature->can("fly") ){
$creature->fly("up");
$creature->fly("right",1);
$creature->fly("forward",1);
$creature->fly("left", 1 );
$creature->fly("back", 1 );
$creature->fly("down");
} else if ( $creature->can("walk") ) {
$creature->walk("left",1);
$creature->walk("right",1);
$creature->walk("forward",1);
$creature->walk("back",1);
} else if ( $creature->can("splash") ) {
$creature->splash( "up" ) for ( 0 .. 4 );
}
if( $creature->can("quack") ) {
$creature->quack();
}
}
my #x = ();
push #x, new Rhinoceros ;
push #x, new Flamingo;
push #x, new Hyena;
push #x, new Dolphin;
push #x, new Duck;
for my $creature (#x){
new Thread(sub{
dance( $creature );
});
}
Any other way would require you to put type restrictions on for functions, which would cut out different species, needing you to create different functions for different species, making the code really hellish to maintain.
And that really sucks in terms of just trying to perform good choreography.
Everything you can do with duck-typing you can also do with interfaces. Duck-typing is fast and comfortable, but some argue it can lead to errors (if two distinct methods/properties are named alike). Interfaces are safe and explicit, but people might say "why state the obvious?". Rest is a flame. Everyone chooses what suits him and no one is "right".