Preamble: I'm teaching a course in object-functional programming using Scala and one of the things we do is to take sample problems and compare how they might be implemented using object-functional programming and using state-based, object-oriented programming, which is the background most of the students have.
So I want to implement a simple class in Scala that has a private var with a public accessor method (a very common idiom in state-based, object-oriented programming). Looking at Alvin Alexander's "Scala Cookbook" the recommended code to do this is pretty ghastly:
class Person(private var _age: Int):
  def incrAge() = _age += 1
  def age = _age
I say "ghastly" because I'm having to invent two names that essentially represent the age field, one used in the constructor and another used in the class interface. I'm curious if people more familiar with Scala would know of a simpler syntax that would avoid this?
EDIT: It seems clear to me now that Scala combines the val/var declaration with the given visibility (public/private), so for a var either both accessor&mutator are public or both are private. Depending on perspective, you might find this inflexible, or feel it rightly punishes you for using var 🙂.
Yes, a better way of doing it is not using var
class Person(val age: Int) {
  def incrAge = new Person(age + 1)
}
If you are going to write idiomatic Scala code, you should start by pretending that certain parts of the language simply do not exist: mostly vars, nulls and returns, but also mutable structures or collections, arrays, and certain methods like .get on a Try or an Option, or the Await object. Oh, and also isInstanceOf and asInstanceOf.
You may ask "why do these things exist if they are not supposed to be used?". Well, because sometimes, in a very few very specific cases they are actually useful for achieving a very limited very specific purpose. But that would be probably fewer than 0.1% of the cases you will come across in your career, unless you are involved in some hard core low level library development (in which case, you would not be posting questions like this here).
So, until you acquire enough command of the language to be able to definitively distinguish those 0.1% of the cases from the other 99.9%, you are much better off simply ignoring those language features, and pretending they do not exist (if you can't figure out how to achieve a certain task without using one of those, post a question here, and people will gladly help you).
You said "Having to create two names to manage a single field is ugly." Indeed. But you know what's uglier? Using vars.
(Btw, the way you typically do this in Java is getAge and setAge: still two names. The ugliness is rooted in allowing the value labeled with a given name to differ at different points of program execution, not in what the mutation syntax specifically looks like.)
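To make the immutable alternative above concrete, here is a minimal sketch. It swaps the plain class from the answer for a case class so that copy is available; that swap is my assumption, not something the answer prescribes.

```scala
// A single name, `age`, serves as constructor parameter and public accessor.
final case class Person(age: Int) {
  // Returns a new Person instead of mutating this one.
  def incrAge: Person = copy(age = age + 1)
}

val alice = Person(30)
val older = alice.incrAge // alice itself is left unchanged
```

The original value stays valid after the "update", so nothing that holds a reference to alice can observe a change.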
Coming from a Java background, I always mark instance variables as private. I'm learning Scala, and in almost all of the code I have viewed the val/var members have default (public) access. Why is this the default? Does it not break the information hiding/encapsulation principle?
It would help if you specified which code, but keep in mind that some example code is in a simplified form to highlight whatever it is that the example is supposed to show you. Since the default access is public, that means that you often get the modifiers left off for simplicity.
That said, since a val is immutable, there's not much harm in leaving it public as long as you recognize that this is now part of the API for your class. That can be perfectly okay:
class DataThingy(data: Array[Double]) {
  val sum = data.sum
}
Or it can be an implementation detail that you shouldn't expose:
class Statistics(data: Array[Double]) {
  val sum = data.sum
  val sumOfSquares = data.map(x => x*x).sum
  val expectationSquared = (sum * sum)/(data.length*data.length)
  val expectationOfSquare = sumOfSquares/data.length
  val varianceOfSample = expectationOfSquare - expectationSquared
  val standardDeviation = math.sqrt(data.length*varianceOfSample/(data.length-1))
}
Here, we've littered our class with all of the intermediate steps for calculating standard deviation. And this is especially foolish given that this is not the most numerically stable way to calculate standard deviation with floating point numbers.
Rather than merely making all of these private, it is better style, if possible, to use local blocks or private[this] defs to perform the intermediate computations:
val sum = data.sum
val standardDeviation = {
  val sumOfSquares = ...
  ...
  math.sqrt(...)
}
or
val sum = data.sum
private[this] def findSdFromSquares(s: Double, ssq: Double) = { ... }
val standardDeviation = findSdFromSquares(sum, data.map(x => x*x).sum)
If you need to store a calculation for later use, then private val or private[this] val is the way to go, but if it's just an intermediate step on the computation, the options above are better.
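For illustration, here is one way the local-block version might look when written out in full. The intermediate formulas are copied from the Statistics class above; the local name n and the explicit type annotations are my additions. As noted earlier, this formula is not the most numerically stable choice, so treat it as a sketch of the scoping idea only.

```scala
class Statistics(data: Array[Double]) {
  val sum: Double = data.sum

  // Only the final result is a member; the intermediate values
  // live in the block's local scope and never reach the API.
  val standardDeviation: Double = {
    val n = data.length
    val sumOfSquares = data.map(x => x * x).sum
    val expectationSquared = (sum * sum) / (n.toDouble * n)
    val expectationOfSquare = sumOfSquares / n
    val varianceOfSample = expectationOfSquare - expectationSquared
    math.sqrt(n * varianceOfSample / (n - 1))
  }
}

val stats = new Statistics(Array(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0))
```

Callers can read stats.sum and stats.standardDeviation, but something like stats.sumOfSquares no longer compiles.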
Likewise, there's no harm in exposing a var if it is a part of the interface--a vector coordinate on a mutable vector for instance. But you should make them private (better yet: private[this], if you can!) when it's an implementation detail.
One important difference between Java and Scala here is that in Java you can not replace a public variable with getter and setter methods (or vice versa) without breaking source and binary compatibility. In Scala you can.
So in Java if you have a public variable, the fact that it's a variable will be exposed to the user and if you ever change it, the user has to change his code. In Scala you can replace a public var with a getter and setter method (or a public val with just a getter method) without the user ever knowing the difference. So in that sense no implementation details are exposed.
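A minimal sketch of the var case, with hypothetical class names: the first version exposes a plain public var, the second replaces it with a getter and a setter (Scala's name_= convention) that validates its input. Assignment syntax on the caller's side is the same for both.

```scala
// Version 1: a plain public var.
class ThermostatV1 {
  var celsius: Double = 20.0
}

// Version 2: same external interface, but backed by a private field
// with a validating setter.
class ThermostatV2 {
  private var stored: Double = 20.0
  def celsius: Double = stored
  def celsius_=(value: Double): Unit = {
    require(value >= -273.15, "temperature below absolute zero")
    stored = value
  }
}

val v1 = new ThermostatV1
v1.celsius = 21.5

val v2 = new ThermostatV2
v2.celsius = 21.5 // calls celsius_= behind the scenes
```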
As an example, let's consider a rectangle class:
class Rectangle(val width: Int, val height: Int) {
  val area = width * height
}
Now what happens if we later decide that we don't want the area to be stored as a variable, but rather it should be calculated each time it's called?
In Java the situation would be like this: If we had used a getter method and a private variable, we could just remove the variable and change the getter method to calculate the area instead of using the variable. No changes to user code needed. But since we've used a public variable, we are now forced to break user code :-(
In Scala it's different: we can just change the val to def and that's it. No changes to user code needed.
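Side by side, the two versions of the rectangle might look like this (the V1/V2 suffixes are only there so both fit in one snippet):

```scala
// Version 1: area is computed once at construction and stored.
class RectangleV1(val width: Int, val height: Int) {
  val area: Int = width * height
}

// Version 2: area is recomputed on every access.
class RectangleV2(val width: Int, val height: Int) {
  def area: Int = width * height
}

// Caller code is spelled the same way in both cases.
val stored = new RectangleV1(3, 4).area
val computed = new RectangleV2(3, 4).area
```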
Actually, some Scala developers tend to use default access too much. But you can find appropriate examples in well-known Scala projects (for example, Twitter's Finagle).
On the other hand, creating objects as immutable values is the standard way in Scala. We don't need to hide all the attributes if they're immutable completely.
I'd like to answer the question with a somewhat more generic approach. I think the answer you are looking for has to do with the design paradigms on which Scala is built. Instead of the classical procedural / object-oriented approach you see in Java, functional programming is used to a much greater extent. I cannot cover all the code that you mention, of course, but in general (well-written) Scala code will not need a lot of mutability.
As pointed out by Rex, val's are immutable, so there are few reasons for them to not be public. But as I see it the immutability is not a goal in itself, but a result of functional programming. So if we consider functions as something like x -> function -> y the function part becomes somewhat of a black box; we don't really care what it does, as long as it does it correctly. As the Haskell Wiki writes:
Purely functional programs typically operate on immutable data. Instead of altering existing values, altered copies are created and the original is preserved.
This also explains the missing closure, since the parts we traditionally wanted to hide away is executed in the functions and thus hidden anyway.
So, to cut things short, I would argue that mutability and closure have become more redundant in Scala. And why clutter things up with getters and setters when it can be avoided?
How do I create a properly functional configurable object in Scala? I have watched Tony Morris' video on the Reader monad and I'm still unable to connect the dots.
I have a hard-coded list of Client objects:
case class Client(name : String, age : Int){ /* etc */}
object Client{
  //Horrible!
  val clients = List(Client("Bob", 20), Client("Cindy", 30))
}
I want Client.clients to be determined at runtime, with the flexibility of either reading it from a properties file or from a database. In the Java world I'd define an interface, implement the two types of source, and use DI to assign a class variable:
trait ConfigSource {
  def clients : List[Client]
}
object ConfigFileSource extends ConfigSource {
  override def clients = buildClientsFromProperties(Properties("clients.properties"))
  //...etc, read properties files
}
object DatabaseSource extends ConfigSource { /* etc */ }
object Client {
  @Resource("configuration_source")
  private var config : ConfigSource = _ //Inject it at runtime
  val clients = config.clients
}
This seems like a pretty clean solution to me (not a lot of code, clear intent), but that var does jump out (OTOH, it doesn't seem to me really troublesome, since I know it will be injected once-and-only-once).
What would the Reader monad look like in this situation and, explain it to me like I'm 5, what are its advantages?
Let's start with a simple, superficial difference between your approach and the Reader approach, which is that you no longer need to hang onto config anywhere at all. Let's say you define the following vaguely clever type synonym:
type Configured[A] = ConfigSource => A
Now, if I ever need a ConfigSource for some function, say a function that gets the n'th client in the list, I can declare that function as "configured":
def nthClient(n: Int): Configured[Client] = {
  config => config.clients(n)
}
So we're essentially pulling a config out of thin air, any time we need one! Smells like dependency injection, right? Now let's say we want the ages of the first, second and third clients in the list (assuming they exist):
def ages: Configured[(Int, Int, Int)] =
  for {
    a0 <- nthClient(0)
    a1 <- nthClient(1)
    a2 <- nthClient(2)
  } yield (a0.age, a1.age, a2.age)
For this, of course, you need some appropriate definition of map and flatMap. I won't get into that here, but will simply say that Scalaz (or Rúnar's awesome NEScala talk, or Tony's which you've seen already) gives you all you need.
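If you just want to see the mechanics without pulling in Scalaz, here is a minimal hand-rolled sketch. The implicit class ConfiguredOps, the shortened firstTwoAges, and the InMemorySource object are my own illustrative names and not part of any library; Scalaz's real Reader is more general than this.

```scala
object ReaderSketch {
  final case class Client(name: String, age: Int)

  trait ConfigSource { def clients: List[Client] }

  type Configured[A] = ConfigSource => A

  // Adds map and flatMap to any ConfigSource => A, which is all a
  // for-comprehension needs. Both simply thread the same config through.
  implicit class ConfiguredOps[A](run: Configured[A]) {
    def map[B](f: A => B): Configured[B] =
      config => f(run(config))
    def flatMap[B](f: A => Configured[B]): Configured[B] =
      config => f(run(config))(config)
  }

  def nthClient(n: Int): Configured[Client] =
    config => config.clients(n)

  // No ConfigSource is mentioned in the body; it only shows up in the type.
  def firstTwoAges: Configured[(Int, Int)] =
    for {
      c0 <- nthClient(0)
      c1 <- nthClient(1)
    } yield (c0.age, c1.age)

  // Hypothetical stand-in for a properties file or database source.
  object InMemorySource extends ConfigSource {
    val clients: List[Client] = List(Client("Bob", 20), Client("Cindy", 30))
  }
}
```

Since a Configured[A] is just a function, ReaderSketch.firstTwoAges(ReaderSketch.InMemorySource) supplies the environment and runs the whole computation.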
The important point here is that the ConfigSource dependency and its so-called injection are mostly hidden. The only "hint" that we can see here is that ages is of type Configured[(Int, Int, Int)] rather than simply (Int, Int, Int). We didn't need to explicitly reference config anywhere.
As an aside, this is the way I almost always like to think about monads: they hide their effect so it's not polluting the flow of your code, while explicitly declaring the effect in the type signature. In other words, you needn't repeat yourself too much: you say "hey, this function deals with effect X" in the function's return type, and don't mess with it any further.
In this example, of course, the effect is to read from some fixed environment. Other monadic effects you might be familiar with include error handling: we can say that Option hides error-handling logic while making the possibility of errors explicit in your method's type. Or, sort of the opposite of reading, the Writer monad hides the thing we're writing to while making its presence explicit in the type system.
Now finally, just as we normally need to bootstrap a DI framework (somewhere outside our usual flow of control, such as in an XML file), we also need to bootstrap this curious monad. Surely we'll have some logical entry point to our code, such as:
def run: Configured[Unit] = // ...
It ends up being pretty simple: since Configured[A] is just a type synonym for the function ConfigSource => A, we can just apply the function to its "environment":
run(ConfigFileSource)
// or
run(DatabaseSource)
Ta-da! So, contrasting with the traditional Java-style DI approach, we don't have any "magic" occurring here. The only magic, as it were, is encapsulated in the definition of our Configured type and the way it behaves as a monad. Most importantly, the type system keeps us honest about which "realm" dependency injection is occurring in: anything with type Configured[...] is in the DI world, and anything without it is not. We simply don't get this in old-school DI, where everything is potentially managed by the magic, so you don't really know which portions of your code are safe to reuse outside of a DI framework (for example, within your unit tests, or in some other project entirely).
update: I wrote up a blog post which explains Reader in greater detail.
There are 2 reasons for me to ask:
1. I'd like a better code fragmentation to facilitate version control on per-function level
2. I struggle with some attention deficit disorder and it is hard for me to work with long pieces of code such as big class files
To address these problems I used to use include directives in C++ and partial class definitions and manually-definable foldable regions in C#. Are there any such things available in Scala 2.8?
I've tried to use editor-fold tag in NetBeans IDE, but it does not work in Scala editor unfortunately :-(
UPDATE: As far as I understand, there are no such facilities in Scala. So I'd like to ask: someone who has any connection to Scala authors, or an account on their Bugzilla (or whatever they use), please, suggest them an idea - they should probably think of introducing something of such (I was fascinated by C# regions and partial classes for example, and plain old includes also look like a convenient tool to have) to make Scala even more beautiful through laconicity, IMHO.
How about doing it with traits? You define it like this:
trait Similarity
{
  def isSimilar(x: Any): Boolean
  def isNotSimilar(x: Any): Boolean = !isSimilar(x)
}
...and then you use it like so:
class Point(xc: Int, yc: Int) extends Similarity
{
  var x: Int = xc
  var y: Int = yc
  def isSimilar(obj: Any) =
    obj.isInstanceOf[Point] &&
    obj.asInstanceOf[Point].x == x
}
If the class Point were bigger, you could split it further into traits, resulting in the division that you want. Please note, however, that I don't think this is advisable, as it will make it very difficult to get a good overview of your code, unless you already know it by heart. If you can break it in a nice way, however, you might be able to get some nice, reusable blocks out of it, so in the end it might still be worth doing.
Best of luck to you!
//file A.scala
trait A { self: B =>
  ....
}
//file B.scala
trait B { self: A =>
  ....
}
//file C.scala
class C extends A with B
I suggest reading the white paper by Martin at this link. Its chapter 'Case Study: The Scala Compiler' will give you an idea of how to achieve a component-based design with code spread over several separate files.
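As a rough illustration of the skeleton above, here is a small sketch with hypothetical names (Invoice and friends are invented for this example). Each trait could live in its own file, and the self-type annotation lets one piece use members defined in another.

```scala
// File InvoiceCore.scala: the state the other pieces rely on.
trait InvoiceCore {
  def lines: List[(String, BigDecimal)]
}

// File InvoiceTotals.scala: the self-type gives access to `lines`.
trait InvoiceTotals { self: InvoiceCore =>
  def total: BigDecimal = lines.map(_._2).sum
}

// File InvoiceRendering.scala: depends on both of the traits above.
trait InvoiceRendering { self: InvoiceCore with InvoiceTotals =>
  def render: String =
    lines.map { case (name, price) => s"$name: $price" }.mkString("\n") +
      s"\nTotal: $total"
}

// File Invoice.scala: the class just stitches the pieces together.
final class Invoice(val lines: List[(String, BigDecimal)])
  extends InvoiceCore with InvoiceTotals with InvoiceRendering

val invoice = new Invoice(List(("pen", BigDecimal(1)), ("ink", BigDecimal(2))))
```

The compiler checks that whatever class mixes in InvoiceRendering also provides InvoiceCore and InvoiceTotals, so the split cannot silently leave a piece out.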
Scala code folding works properly in IDEA.
The version control tools I work with (bzr or git, mostly) have no trouble isolating changes line-by-line. What use case do you have--that's common enough to worry about--where line-level isolation (which allows changes to independent methods to be merged without user intervention) is not enough?
Also, if you can't focus on something as large as one class with many methods, use more classes. A method generally requires you to know what the other methods are, what the fields are, and so on. Having that split across separate files is just asking for trouble. Instead, encapsulate your problem in a different way so you can work with smaller self-contained chunks at a time.
I keep seeing the phrase "duck typing" bandied about, and even ran across a code example or two. I am way too lazy busy to do my own research, can someone tell me, briefly:
the difference between a 'duck type' and an old-skool 'variant type', and
provide an example of where I might prefer duck typing over variant typing, and
provide an example of something that i would have to use duck typing to accomplish?
I don't mean to seem fowl by doubting the power of this 'new' construct, and I'm not ducking the issue by refusing to do the research, but I am quacking up at all the flocking hype i've been seeing about it lately. It looks like no typing (aka dynamic typing) to me, so I'm not seeing the advantages right away.
ADDENDUM: Thanks for the examples so far. It seems to me that using something like 'O->can(Blah)' is equivalent to doing a reflection lookup (which is probably not cheap), and/or is about the same as saying (O is IBlah) which the compiler might be able to check for you, but the latter has the advantage of distinguishing my IBlah interface from your IBlah interface while the other two do not. Granted, having a lot of tiny interfaces floating around for every method would get messy, but then again so can checking for a lot of individual methods...
...so again i'm just not getting it. Is it a fantastic time-saver, or the same old thing in a brand new sack? Where is the example that requires duck typing?
In some of the answers here, I've seen some incorrect use of terminology, which has led people to provide wrong answers.
So, before I give my answer, I'm going to provide a few definitions:
Strongly typed
A language is strongly typed if it enforces the type safety of a program. That means that it guarantees two things: something called progress and something else called preservation. Progress basically means that all "validly typed" programs can in fact be run by the computer. They may crash, throw an exception, or loop forever, but they can actually be run. Preservation means that if a program is "validly typed", it will always remain "validly typed", and that no variable (or memory location) will contain a value that does not conform to its assigned type.
Most languages have the "progress" property. There are many, however, that don't satisfy the "preservation" property. A good example, is C++ (and C too). For example, it is possible in C++ to coerce any memory address to behave as if it was any type. This basically allows programmers to violate the type system any time they want. Here is a simple example:
struct foo
{
    int x;
    int y;
    int z;
};
char * x = new char[100];
foo * pFoo = (foo *)x;
foo aRealFoo;
*pFoo = aRealFoo;
This code allows someone to take an array of characters and write a "foo" instance to it. If C++ was strongly typed this would not be possible. Type safe languages, like C#, Java, VB, lisp, ruby, python, and many others, would throw an exception if you tried to cast an array of characters to a "foo" instance.
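For contrast, here is roughly what the same attempt looks like on the JVM, sketched in Scala. The Foo class is a hypothetical stand-in for the struct above; the cast is checked at runtime and fails with a ClassCastException instead of reinterpreting the memory.

```scala
import scala.util.Try

// Hypothetical counterpart of the C++ struct.
final class Foo(val x: Int, val y: Int, val z: Int)

// Typed as Any so the compiler lets us attempt the cast at all.
val chars: Any = new Array[Char](100)

// The runtime verifies the cast and throws; Try captures the exception.
val attempt: Try[Foo] = Try(chars.asInstanceOf[Foo])
```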
Weakly typed
Something is weakly typed if it is not strongly typed.
Statically typed
A language is statically typed if its type system is verified at compile time. A statically typed language can be either "weakly typed" like C or strongly typed like C#.
Dynamically typed
A dynamically typed language is a language where types are verified at runtime. Many languages have a mixture, of some sort, between static and dynamic typing. C#, for example, will verify many casts dynamically at runtime because it's not possible to check them at compile time. Other examples are languages like Java, VB, and Objective-C.
There are also some languages that are "completely" or "mostly" dynamically typed, like Lisp, Ruby, and Smalltalk.
Duck typing
Duck typing is something that is completely orthogonal to static, dynamic, weak, or strong typing. It is the practice of writing code that will work with an object regardless of its underlying type identity. For example, the following VB.NET code:
function Foo(x as object) as object
    return x.Quack()
end function
This will work regardless of the type of the object that is passed into "Foo", provided that it defines a method called "Quack". That is, if the object looks like a duck, walks like a duck, and talks like a duck, then it's a duck. Duck typing comes in many forms. It's possible to have static duck typing, dynamic duck typing, strong duck typing, and weak duck typing. C++ template functions are a good example of "weak static duck typing". The example shown in JaredPar's post is an example of "strong static duck typing". Late binding in VB (or code in Ruby or Python) enables "strong dynamic duck typing".
Variant
A variant is a dynamically typed data structure that can hold a range of predefined data types, including strings, integer types, dates, and COM objects. It then defines a bunch of operations for assigning, converting, and manipulating data stored in variants. Whether or not a variant is strongly typed depends on the language in which it is used. For example, a variant in a VB 6 program is strongly typed. The VB runtime ensures that operations written in VB code will conform to the typing rules for variants. Trying to add a string to an IUnknown via the variant type in VB will result in a runtime error. In C++, however, variants are weakly typed because all C++ types are weakly typed.
OK.... now that I have gotten the definitions out of the way, I can now answer your question:
A variant, in VB 6, enables one form of doing duck typing. There are better ways of doing duck typing (Jared Par's example is one of the best), than variants, but you can do duck typing with variants. That is, you can write one piece of code that will operate on an object regardless of its underlying type identity.
However, doing it with variants doesn't really give a lot of validation. A statically typed duck type mechanism, like the one JaredPar describes gives the benefits of duck typing, plus some extra validation from the compiler. That can be really helpful.
The simple answer is variant is weakly typed while duck typing is strongly typed.
Duck typing can be summed up nicely as "if it walks like a duck, looks like a duck, acts like a duck, then it's a duck." In computer science terms, consider a duck to be the following interface.
interface IDuck {
    void Quack();
}
Now let's examine Daffy
class Daffy {
    void Quack() {
        Console.WriteLine("Thatsssss dispicable!!!!");
    }
}
Daffy is not actually an IDuck in this case. Yet it acts just like a duck. Why make Daffy implement IDuck when it's quite obvious that Daffy is in fact a duck?
This is where duck typing comes in. It allows a type-safe conversion between any type that has all of the behaviors of an IDuck and an IDuck reference.
IDuck d = new Daffy();
d.Quack();
The Quack method can now be called on "d" with complete type safety. There is no chance of a runtime type error in this assignment or method call.
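Scala's structural types give a compiler-checked version of this idea. A minimal sketch follows, with hypothetical names (Quacker, SqueakyToy, makeItQuack). It assumes Scala 2, where scala.language.reflectiveCalls enables such calls; Scala 3 routes structural member access through scala.reflect.Selectable, so the import may need adjusting there.

```scala
import scala.language.reflectiveCalls

object DuckSketch {
  // Structural type: anything with a matching quack() method qualifies.
  type Quacker = { def quack(): String }

  // Neither class declares that it implements a duck interface.
  class Daffy { def quack(): String = "Thatsssss despicable!" }
  class SqueakyToy { def quack(): String = "Squeak" }

  // Conformance is checked at compile time; passing an object without
  // quack() is a compile error, not a runtime failure.
  def makeItQuack(duck: Quacker): String = duck.quack()
}
```

The call itself is dispatched reflectively on the JVM, which is the performance cost hinted at in the question's addendum.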
Duck typing is just another term for dynamic typing or late binding. A variant object that parses/compiles with any member access (e.g., obj.Anything) that may or may not actually be defined at runtime is duck typing.
Probably nothing requires duck-typing, but it can be convenient in certain situations.
Say you have a method that takes and uses an object of the sealed class Duck from some 3rd party library. And you want to make the method testable. And Duck has an awfully big API (kind of like ServletRequest) of which you only need to care about a small subset. How do you test it?
One way is to make the method take something that quacks. Then you can simply create a quacking mock object.
Try reading the very first paragraph of the Wikipedia article on duck typing.
Duck typing on Wikipedia
I can have an interface (IRunnable) that defines the method Run().
If I have another class with a method like this:
public void RunSomeRunnable(IRunnable rn) { ... }
In a duck type friendly language I could pass in any class that had a Run() method into the RunSomeRunnable() method.
In a statically typed language the class being passed into RunSomeRunnable needs to explicitly implement the IRunnable interface.
"If it Run() like a duck"
variant is more like object in .NET at least.
@Kent Fredric
Your example can most certainly be done without duck typing by using explicit interfaces...uglier yes, but it's not impossible.
And personally, I find having well defined contracts in interfaces much better for enforcing quality code, than relying on duck typing...but that's just my opinion and take it with a grain of salt.
public interface ICreature { }
public interface IFly { void fly(); }
public interface IWalk { void walk(); }
public interface IQuack { void quack(); }
// ETC
// Animal Class
public class Duck : ICreature, IWalk, IFly, IQuack
{
    public void fly() { }
    public void walk() { }
    public void quack() { }
}
public class Rhino : ICreature, IWalk
{
    public void walk() { }
}
// In the method
List<ICreature> creatures = new List<ICreature>();
creatures.Add(new Duck());
creatures.Add(new Rhino());
foreach (ICreature creature in creatures)
{
    if (creature is IFly)
        (creature as IFly).fly();
    if (creature is IWalk)
        (creature as IWalk).walk();
}
// Etc
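For readers following along in Scala, the same explicit-interface dispatch could be sketched with traits and pattern matching in place of is/as. The trait names and the actions helper are my own; the methods return strings only so the result is easy to inspect.

```scala
trait Creature
trait Flyer { def fly(): String }
trait Walker { def walk(): String }

class Duck extends Creature with Flyer with Walker {
  def fly(): String = "duck flies"
  def walk(): String = "duck walks"
}

class Rhino extends Creature with Walker {
  def walk(): String = "rhino walks"
}

// Each capability is a declared contract; the match checks the contract,
// not the mere presence of a method with the right name.
def actions(creature: Creature): List[String] = {
  val flying = creature match {
    case f: Flyer => List(f.fly())
    case _        => Nil
  }
  val walking = creature match {
    case w: Walker => List(w.walk())
    case _         => Nil
  }
  flying ++ walking
}
```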
In regards to your request for an example of something you'd need to use duck typing to accomplish, I don't think such a thing exists. I think of it like I think about whether to use recursion or whether to use iteration. Sometimes one just works better than the other.
In my experience, duck typing makes code more readable and easier to grasp (both for the programmer and the reader). But I find that more traditional static typing eliminates a lot of needless typing errors. There's simply no way to objectively say one is better than another or even to say what situations one is more effective than the other.
I say that if you're comfortable using static typing, then use it. But you should at least try duck typing out (and use it in a nontrivial project if possible).
To answer you more directly:
...so again i'm just not getting it. Is it a fantastic time-saver, or the same old thing in a brand new sack?
It's both. You're still attacking the same problems. You're just doing it a different way. Sometimes that's really all you need to do to save time (even if for no other reason than to force yourself to think about doing something a different way).
Is it a panacea that will save all of mankind from extinction? No. And anyone who tells you otherwise is a zealot.
A variant (at least as I've used them in VB6) holds a variable of a single, well-defined, usually static type. E.g., it might hold an int, or a float, or a string, but variant ints are used as ints, variant floats are used as floats, and variant strings are used as strings.
Duck typing instead uses dynamic typing. Under duck typing, a variable might be usable as an int, or a float, or a string, if it happens to support the particular methods that an int or float or string supports in a particular context.
Example of variants versus duck typing:
For a web application, suppose I want my user information to come from LDAP instead of from a database, but I still want my user information to be useable by the rest of the web framework, which is based around a database and an ORM.
Using variants: No luck. I can create a variant that can contain a UserFromDbRecord object or a UserFromLdap object, but UserFromLdap objects won't be usable by routines that expect objects from the FromDbRecord hierarchy.
Using duck typing: I can take my UserFromLdap class and add a couple of methods that make it act like a UserFromDbRecord class. I don't need to replicate the entire FromDbRecord interface, just enough for the routines that I need to use. If I do this right, it's an extremely powerful and flexible technique. If I do it wrong, it produces very confusing and brittle code (subject to breakage if either the DB library or the LDAP library changes).
I think the core point of duck typing is how it is used. One uses method detection and introspection of the entity in order to know what to do with it, instead of declaring in advance what it will be ( where you know what to do with it ).
It's probably more practical in OO languages, where primitives are not primitives, and are instead objects.
I think the best way to sum it up: with a variant type, an entity can be anything, and what it is remains uncertain; with duck typing, an entity only looks like it could be anything, but you can work out what it is by asking it.
Here's something I don't believe is plausible without ducktyping.
sub dance {
    my $creature = shift;
    if( $creature->can("walk") ){
        $creature->walk("left",1);
        $creature->walk("right",1);
        $creature->walk("forward",1);
        $creature->walk("back",1);
    }
    if( $creature->can("fly") ){
        $creature->fly("up");
        $creature->fly("right",1);
        $creature->fly("forward",1);
        $creature->fly("left", 1 );
        $creature->fly("back", 1 );
        $creature->fly("down");
    } elsif ( $creature->can("walk") ) {
        $creature->walk("left",1);
        $creature->walk("right",1);
        $creature->walk("forward",1);
        $creature->walk("back",1);
    } elsif ( $creature->can("splash") ) {
        $creature->splash( "up" ) for ( 0 .. 4 );
    }
    if( $creature->can("quack") ) {
        $creature->quack();
    }
}
my @x = ();
push @x, new Rhinoceros ;
push @x, new Flamingo;
push @x, new Hyena;
push @x, new Dolphin;
push @x, new Duck;
for my $creature (@x){
    new Thread(sub{
        dance( $creature );
    });
}
Any other way would require you to put type restrictions on for functions, which would cut out different species, needing you to create different functions for different species, making the code really hellish to maintain.
And that really sucks in terms of just trying to perform good choreography.
Everything you can do with duck typing you can also do with interfaces. Duck typing is fast and comfortable, but some argue it can lead to errors (if two distinct methods/properties are named alike). Interfaces are safe and explicit, but people might say "why state the obvious?". The rest is a flame war. Everyone chooses what suits them, and no one is "right".