How does the Scalding DSL translate into regular Scala code? - scala

Please help to find out how Scalding DSL translates into regular Scala code.
https://github.com/twitter/scalding/wiki/Fields-based-API-Reference#sortBy
For example:
val fasterBirds = birds.map('speed -> 'doubledSpeed) { speed : Int => speed * 2 }
Questions:
What conventions do I need to follow to add my own functions to Scalding's map, reduce, groupBy, sort and scanLeft?
How does Scalding translate expressions on fields like 'inpFld -> 'outFld into Scala code?
What data structures/functions does the Scalding translator create? Where can I find them in the Scalding source code?
Thanks!

That IS regular Scala code. One strength of Scala lies in its extensibility. The syntax allows the programmer to extend the syntax of programs to create domain-specific languages. This is especially helpful when using underlying libraries.
The domain-specific language doesn't translate so much as allow you to defer application of code until the appropriate time. The tick character (') means that the following set of characters is a Symbol, a built-in datatype. The -> operator is syntactic sugar for building a pair: 'inpFld -> 'outFld is the same value as the tuple ('inpFld, 'outFld), but visually it imparts the concept of "translation" or "from this to that".
The domain-specific language you are looking at doesn't create structures, although it does create a functor (a function object). In this case it is seen by the Java Virtual Machine as a Function1[Type, Type] instance, which has an apply method that takes its argument and returns a result calculated by the provided code.
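To make this concrete, here is a minimal sketch of how a fields-style map could be declared in plain Scala. MiniPipe and its Map-based rows are hypothetical names invented for illustration; this is not Scalding's actual implementation, only the same shape of method signature.

```scala
// Hypothetical stand-in for a Scalding pipe; NOT Scalding's real code.
// Each row is modelled as a Map from field name (a Symbol) to value.
final case class MiniPipe(rows: List[Map[Symbol, Any]]) {
  // Two parameter lists: the field mapping, then the function to apply.
  // `fields` is an ordinary Tuple2, which is what `a -> b` builds.
  def map[A, B](fields: (Symbol, Symbol))(fn: A => B): MiniPipe = {
    val (in, out) = fields
    MiniPipe(rows.map(row => row + (out -> fn(row(in).asInstanceOf[A]))))
  }
}

object MiniPipeDemo {
  // Symbol("speed") is the value that the literal 'speed denotes in Scala 2.
  val speed = Symbol("speed")
  val doubledSpeed = Symbol("doubledSpeed")

  val birds = MiniPipe(List(Map(speed -> 10), Map(speed -> 25)))

  // The braces just delimit a function literal, i.e. a Function1[Int, Int].
  val fasterBirds = birds.map(speed -> doubledSpeed) { (s: Int) => s * 2 }
}
```

Nothing here is translated: the call is an ordinary method invocation with a tuple in the first argument list and a function object in the second.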

Related

Is the reason we can use val defining functions in Scala?

Is the reason a val can hold a function definition that functions are first-class citizens, which can be stored in variables?
In Scala damn near everything is an expression. From a practical perspective what that means is pretty much every bit of syntactically correct Scala code that you can write evaluates to an object that you can do more Scala on. Examples of things you can do to these objects are: call a method on it, pass it to a function, or store it in a val. Expressions can be thought of in contrast to statements, which are just instructions to the computer to do something. An example of a statement in Scala is the import command. The heavy prevalence of expressions in Scala is a deliberate design choice intended to make the language more flexible and extensible.
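A small sketch of the three things mentioned above (storing in a val, passing to a function, calling a method on the result); the names are made up for the example.

```scala
object ExpressionsDemo {
  // A function literal is an expression, so it can be stored in a val.
  val double: Int => Int = x => x * 2

  // if/else is an expression too: it evaluates to a value.
  def sign(n: Int): String = if (n < 0) "negative" else "non-negative"

  // A function held in a val can be passed to another function.
  val doubledAll: List[Int] = List(1, 2, 3).map(double)

  // It is an object, so methods such as Function1.andThen can be called on it.
  val doubleThenInc: Int => Int = double.andThen(_ + 1)
}
```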

"Lifting" exceptions to Option types

Both F# and Scala act as hybrid languages that are often used to bridge the worlds of traditional object-oriented code and functional code.
A concept that belongs more to the OO world are exceptions, whereas the functional world favors Option types in many cases.
To wrap existing library code that relies on exceptions - and make it more functional - I would thus like to "lift" exception-throwing code to instead return an option type.
In Scala, there is a nice library function to "catch all" and convert to option. It can be used like this:
import scala.util.control.Exception._
val functionalVersion = allCatch opt myFunction
see In Scala, is there a pre-existing library function for converting exceptions to Options?
Now that I'm moving to F# I have the same requirement, but I can't seem to find an existing utility function for this - and also struggle to implement one myself.
I can create such a wrapper for a unit function, aka an action
let catchAll f = try Some (f()) with | _ -> None
But the problem here is that I don't want to first wrap all the exception-throwing code into an action.
For example I would like to wrap the array-indexing operator, so that it doesn't throw.
// Wrap out-of-bounds exception with option type
let maybeGetIndex (array: int[]) (index: int) = catchAll (fun () -> array.[index])
maybeGetIndex [| 1; 2; 3 |] 10 // -> None
However, it would be much nicer if one could simply write
(catchAll a.[index])
i.e. apply catchAll to a whole expression before it is evaluated.
(Scala can achieve this through call-by-name parameters which seem to be missing from F#)
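For comparison, the Scala side of that remark might look like the sketch below. It reuses the name catchAll from above and is built on scala.util.Try, which is a choice made for this example; note that Try only catches non-fatal exceptions, so this is not a literal "catch everything".

```scala
import scala.util.Try

object CatchAllDemo {
  // `expr: => A` is a by-name parameter: the argument is not evaluated at the
  // call site, only when `expr` is referenced inside the method body.
  def catchAll[A](expr: => A): Option[A] = Try(expr).toOption

  val xs = Array(1, 2, 3)
  val inBounds = catchAll(xs(1))     // index exists
  val outOfBounds = catchAll(xs(10)) // indexing throws inside catchAll
}
```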
So this question is twofold:
Is there an existing library function to wrap exceptions into option
types?
Is there a language feature that would allow me to implement
it?
First of all, I think that it's not true that a "concept that belongs more to the object-oriented world are exceptions". Exceptions exist in many functional languages of the ML family and, for example, OCaml relies on them quite heavily and uses them even for certain control flow structures. In F#, this is not so much the case, because .NET exceptions are somewhat slower, but I see exceptions very much orthogonal to the object-oriented/functional issue.
For this reason, I actually find exceptions often preferable to option types in F#. The downside is that there is less type-checking (you do not know what might throw), but the upside is that the language provides nicely integrated support for exceptions. If you need to handle exceptional situations then exceptions are a good way of doing that!
To answer your original question about syntactic tricks you can make - I would probably just use a function as your existing code does, because that's explicit and easy to understand (and presumably, you'll only need exception wrapping in some core functions that you implement).
That said, you can define a computation expression builder that wraps the code in the body and serves as your catchAll function with a somewhat neater syntax:
type CatchAllBuilder() =
    member x.Delay(f) = try Some(f()) with _ -> None
    member x.Return(v) = v
let catchAll = CatchAllBuilder()
This lets you write something like:
catchAll { return Array.empty.[0] }
As mentioned earlier, I wouldn't do this because (i) I don't think all exceptions need to be converted to options in F# and (ii) it introduces unfamiliar syntax that new team members (and future you) might be confused by, but it is probably the nicest syntax you can get.
[EDIT: Now a working version with return - this is somewhat less pretty, but perhaps still useful!]

Scala: why it is possible to have Some(None)?

>Option(None)
res2: Option[None.type] = Some(None)
Why is this possible? Why doesn't Option of None return None?
Scala (like most statically typed functional programming languages) is built out of pieces that can be composed together in consistent ways. This is in contrast with other programming languages and libraries (often dynamic ones) that attempt to predict the programmer's intentions and often support this by having lots of special cases (automatic flattening of nested constructions, etc.).
In Scala Option is just a type constructor—you can create an Option[A] for literally any type A by writing Option(a). Option[Int] is itself a type, for example, so you could have an Option[Option[Int]], an Option[Option[Option[Int]]], and so on. There are no special cases, just a general mechanism for building up programs.
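A short sketch of that behaviour: Option.apply only treats null specially, and collapsing nested options is something you ask for explicitly with flatten.

```scala
object NestedOptionDemo {
  // Option.apply only special-cases null; None is an ordinary non-null value.
  val wrappedNone = Option(None)
  val wrappedNull = Option(null: String)

  // Nesting composes like any other type constructor.
  val nested: Option[Option[Int]] = Some(Some(1))

  // Flattening is available, but only when requested explicitly.
  val flat: Option[Int] = nested.flatten
  val flatNone: Option[Int] = (Some(None): Option[Option[Int]]).flatten
}
```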
Not sure if this is a good answer, but try reading the following article.
http://danielwestheide.com/blog/2012/12/19/the-neophytes-guide-to-scala-part-5-the-option-type.html
Some is actually a wrapper for the value you are trying to use. So your code is valid.

Scala Bean Coercion :: The Missing LINQ

Update
Play 2.0's Scala version will feature ANORM, which seems similar to Querulous in that both are JDBC wrappers and not ORMs. Here's ANORM query coercion at work with a parser combinator:
SQL("""
  select * from Country c
  join CountryLanguage l on l.CountryCode = c.Code
  where c.code = 'FRA';
"""
).as(
  str("name") ~< spanM(
    by=str("code"), str("language") ~< str("isOfficial")
  ) ^^ {
    case country~languages =>
      SpokenLanguages(
        country,
        languages.collect { case lang~"T" => lang } headOption,
        languages.collect { case lang~"F" => lang }
      )
  } ?
)
The multi-line query """ sql """ is nice, I like that, but the coercion, please, no ;-) In Groovy, bean coercion with the same query is a 1-liner:
List[Country] c = sql.rows("select * from country")?.collect{ it.toRowResult() as Country }
The null-safe operator (?.might-be-null) in Groovy is quite convenient; Scala seems to require the Some()/Option[] combo to deal with possible null outcomes. Do Scala coders like null handling in Scala?
I guess the general thrust of this post is: can Scala provide scripting language concision while retaining compiler type safe code? Given that Scala is perhaps more powerful/expressive than C# (unintentional flame), then a full blown Scala LINQ must be possible. Furthermore, since Scala straddles the functional and OO paradigms, then it must also be able to achieve Groovy level concision (for example, the 1-liner query-bean-coercion above).
If these assumptions are true, then why do the existing scala ORMs and jdbc wrappers require so much boilerplate compared to groovy and LINQ on C#? Obviously I am an idealist looking for bare bones DSLs where implementations are either incredibly concise, or closely mirror the underlying language they represent (as in LINQ-to-SQL).
Original
Have been taking a run through the various Scala ORMs (squeryl, daomapper, couple others will fill in later) and SQL helper frameworks (querulous so far)
Being new to Scala and strongly typed languages in general, one thing that leaps out at me is the need to specify the type (String, Int, etc.) of each column in every query result.
About to get on an overnight train here, but this struck me just now, so putting it out there (will add some examples when I get back online again to make this a bit of less of a ramble)
For now, a quick one from Querulous's readme on github:
val users = queryEvaluator.select("SELECT * FROM users WHERE id IN (?) OR name = ?", List(1,2,3), "Jacques") { row =>
  new User(row.getInt("id"), row.getString("name"))
}
While I understand that the compiler needs to know the type of every "object" you work with, it seems non-DRY to have to specify "row.getInt('id')" when the domain class itself already declares that id is of type Int.
So, coming from a fair degree of ignorance, I will ask, why do Scala ORMs and SQL helper frameworks not provide developers with an implementation model that allows for inferred or implicit result sets?
Just to put in context, I am coming from Grails, which has an, imo, excellent domain/validation model among other framework nice-to-haves, but suffers from dynamic language time wasting fat-finger typing (startup time is painful as well) which is why I am exploring Scala frameworks.
See Scala Integrated Query; as I understand it, it is scheduled to be integrated into the Typesafe stack as the Scala Language Integrated Connection Kit (SLICK).

Scala versus F# question: how do they unify OO and FP paradigms?

What are the key differences between the approaches taken by Scala and F# to unify OO and FP paradigms?
EDIT
What are the relative merits and demerits of each approach? If, in spite of the support for subtyping, F# can infer the types of function arguments then why can't Scala?
I have looked at F#, doing low level tutorials, so my knowledge of it is very limited. However, it was apparent to me that its style was essentially functional, with OO being more like an add on -- much more of an ADT + module system than true OO. The feeling I get can be best described as if all methods in it were static (as in Java static).
See, for instance, any code using the pipe operator (|>). Take this snippet from the wikipedia entry on F#:
[1 .. 10]
|> List.map fib
(* equivalent without the pipe operator *)
List.map fib [1 .. 10]
The function map is not a method of the list instance. Instead, it works like a static method on a List module which takes a list instance as one of its parameters.
Scala, on the other hand, is fully OO. Let's start, first, with the Scala equivalent of that code:
List(1 to 10: _*) map fib
// Without operator notation or implicits:
List.apply(Predef.intWrapper(1).to(10): _*).map(fib)
Here, map is a method on the instance of List. Static-like methods, such as intWrapper on Predef or apply on List, are much more uncommon. Then there are functions, such as fib above. Here, fib is not a method on int, but neither is it a static method. Instead, it is an object -- the second main difference I see between F# and Scala.
Let's consider the F# implementation from the Wikipedia, and an equivalent Scala implementation:
// F#, from the wiki
let rec fib n =
    match n with
    | 0 | 1 -> n
    | _ -> fib (n - 1) + fib (n - 2)
// Scala equivalent
def fib(n: Int): Int = n match {
  case 0 | 1 => n
  case _ => fib(n - 1) + fib(n - 2)
}
The above Scala implementation is a method, but Scala converts that into a function to be able to pass it to map. I'll modify it below so that it becomes a method that returns a function instead, to show how functions work in Scala.
// F#, returning a lambda, as suggested in the comments
let rec fib = function
    | 0 | 1 as n -> n
    | n -> fib (n - 1) + fib (n - 2)
// Scala method returning a function
def fib: Int => Int = {
  case n @ (0 | 1) => n
  case n => fib(n - 1) + fib(n - 2)
}
// Same thing without syntactic sugar (a recursive method needs an explicit result type):
def fib: Function1[Int, Int] = new Function1[Int, Int] {
  def apply(param0: Int): Int = param0 match {
    case n @ (0 | 1) => n
    case n => fib.apply(n - 1) + fib.apply(n - 2)
  }
}
So, in Scala, all functions are objects implementing the trait FunctionX, which defines a method called apply. As shown here and in the list creation above, .apply can be omitted, which makes function calls look just like method calls.
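The difference between a method and a function object, and the omitted .apply, can be seen in a small sketch like this one (the names are illustrative only):

```scala
object FunctionObjectsDemo {
  // A method: it belongs to the enclosing object and is not itself a value.
  def incMethod(x: Int): Int = x + 1

  // A function value written out as the object it really is.
  val incFunction: Function1[Int, Int] = new Function1[Int, Int] {
    def apply(x: Int): Int = x + 1
  }

  // Where a function type is expected, the compiler wraps the method
  // in a Function1 object (eta-expansion).
  val incFromMethod: Int => Int = incMethod

  val viaSugar = incFunction(41)            // looks like a method call
  val viaApply = incFunction.apply(41)      // what the sugar expands to
  val mapped = List(1, 2, 3).map(incMethod) // method converted at the call site
}
```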
In the end, everything in Scala is an object -- an instance of a class -- and every such object does belong to a class, and all code belongs to a method, which gets executed somehow. Even match in the example above used to be a method, but has been converted into a keyword to avoid some problems quite a while ago.
So, how about the functional part of it? F# belongs to one of the most traditional families of functional languages. While it doesn't have some features some people think are important for functional languages, the fact is that F# is functional by default, so to speak.
Scala, on the other hand, was created with the intent of unifying functional and OO models, instead of just providing them as separate parts of the language. The extent to which it was successful depends on what you deem to be functional programming. Here are some of the things that were focused on by Martin Odersky:
Functions are values. They are objects too -- because all values are objects in Scala -- but the concept that a function is a value that can be manipulated is an important one, with its roots all the way back to the original Lisp implementation.
Strong support for immutable data types. Functional programming has always been concerned with decreasing the side effects in a program, so that functions can be analysed as true mathematical functions. So Scala made it easy to make things immutable, but it did not do two things which FP purists criticize it for:
It did not make mutability harder.
It does not provide an effect system, by which mutability can be statically tracked.
Support for Algebraic Data Types. Algebraic data types (called ADT, which confusingly also stands for Abstract Data Type, a different thing) are very common in functional programming, and are most useful in situations where one would commonly use the visitor pattern in OO languages.
As with everything else, ADTs in Scala are implemented as classes and methods, with some syntactic sugars to make them painless to use. However, Scala is much more verbose than F# (or other functional languages, for that matter) in supporting them. For example, instead of F#'s | for case statements, it uses case.
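As an illustration of ADTs built from classes, here is a minimal sketch with a made-up Shape type: a sealed trait plays the role of the sum type and each case class is one alternative.

```scala
object ShapeAdtDemo {
  // `sealed` restricts direct subclasses to this file, so the compiler can
  // warn when a match does not cover every case.
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rect(width: Double, height: Double) extends Shape

  // Pattern matching does the job a visitor would do in a classic OO design.
  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }
}
```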
Support for non-strictness. Non-strictness means only computing stuff on demand. It is an essential aspect of Haskell, where it is tightly integrated with the side effect system. In Scala, however, non-strictness support is quite timid and incipient. It is available and used, but in a restricted manner.
For instance, Scala's non-strict list, the Stream, does not support a truly non-strict foldRight, such as Haskell does. Furthermore, some benefits of non-strictness are only gained when it is the default in the language, instead of an option.
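Two of the opt-in mechanisms Scala does offer are by-name parameters and lazy vals. The sketch below uses a counter (an assumption of this example, not something from the answer) to show when evaluation actually happens.

```scala
object LazinessDemo {
  // Counts how often the "expensive" computation really runs.
  var evaluations = 0

  def expensive(): Int = { evaluations += 1; 42 }

  // By-name parameter: `fallback` is only evaluated if the list is empty.
  def firstOrElse(xs: List[Int], fallback: => Int): Int =
    xs.headOption.getOrElse(fallback)

  // lazy val: computed on first access, then cached.
  lazy val cached: Int = expensive()
}
```

Calling firstOrElse(List(1), expensive()) never runs expensive(), and reading cached twice runs it once.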
Support for list comprehension. Actually, Scala calls it for-comprehension, as the way it is implemented is completely divorced from lists. In its simplest terms, list comprehensions can be thought of as the map function/method shown in the example, though nesting of map statements (supported with flatMap in Scala) as well as filtering (filter or withFilter in Scala, depending on strictness requirements) are usually expected.
This is a very common operation in functional languages, and often light in syntax -- like in Python's in operator. Again, Scala is somewhat more verbose than usual.
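A sketch of how a for-comprehension relates to map, flatMap and withFilter; the second value is roughly what the compiler produces from the first, and the last line shows the same syntax on a non-list type.

```scala
object ForComprehensionDemo {
  val xs = List(1, 2, 3)
  val ys = List(10, 20)

  val viaFor: List[Int] =
    for {
      x <- xs
      if x % 2 == 1
      y <- ys
    } yield x * y

  // Roughly what the expression above is rewritten into.
  val viaMethods: List[Int] =
    xs.withFilter(x => x % 2 == 1).flatMap(x => ys.map(y => x * y))

  // The same syntax works on Option, which is not a list at all.
  val sum: Option[Int] = for { a <- Option(1); b <- Option(2) } yield a + b
}
```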
In my opinion, Scala is unparalleled in combining FP and OO. It comes from the OO side of the spectrum towards the FP side, which is unusual. Mostly, I see FP languages with OO tacked on -- and it feels tacked on to me. I guess FP on Scala probably feels the same way for functional languages programmers.
EDIT
Reading some other answers I realized there was another important topic: type inference. Lisp was a dynamically typed language, and that pretty much set the expectations for functional languages. The modern statically typed functional languages all have strong type inference systems, most often the Hindley-Milner [1] algorithm, which makes type declarations essentially optional.
Scala can't use the Hindley-Milner algorithm because of Scala's support for inheritance [2]. So Scala has to adopt a much less powerful type inference algorithm -- in fact, type inference in Scala is intentionally undefined in the specification, and subject to ongoing improvements (its improvement is one of the biggest features of the upcoming 2.8 version of Scala, for instance).
In the end, however, Scala requires all parameters to have their types declared when defining methods. In some situations, such as recursion, return types for methods also have to be declared.
Functions in Scala can often have their types inferred instead of declared, though. For instance, no type declaration is necessary here: List(1, 2, 3) reduceLeft (_ + _), where _ + _ is actually an anonymous function of type Function2[Int, Int, Int].
Likewise, type declaration of variables is often unnecessary, but inheritance may require it. For instance, Some(2) and None have a common superclass Option, but actually belong to different subclasses. So one would usually declare var o: Option[Int] = None to make sure the correct type is assigned.
This limited form of type inference is much better than statically typed OO languages usually offer, which gives Scala a sense of lightness, and much worse than statically typed FP languages usually offer, which gives Scala a sense of heaviness. :-)
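The rules described in the last few paragraphs can be summarised in a short sketch (the method names are made up for the example):

```scala
object InferenceDemo {
  // Parameter types of a method must be written; the result type is inferred.
  def add(a: Int, b: Int) = a + b

  // A recursive method needs its result type spelled out.
  def factorial(n: Int): Int = if (n <= 1) 1 else n * factorial(n - 1)

  // The types of the anonymous function are inferred from List[Int].
  val total = List(1, 2, 3).reduceLeft(_ + _)

  // Without the annotation, `o` would get the type None.type and could
  // never be reassigned to a Some.
  var o: Option[Int] = None
  o = Some(2)
}
```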
Notes:
1. Actually, the algorithm originates from Damas and Milner, who called it "Algorithm W", according to the wikipedia.
2. Martin Odersky mentioned in a comment here that:
The reason Scala does not have Hindley/Milner type inference is
that it is very difficult to combine with features such as
overloading (the ad-hoc variant, not type classes), record
selection, and subtyping
He goes on to state that it may not be actually impossible, and it came down to a trade-off. Please do go to that link for more information, and, if you do come up with a clearer statement or, better yet, some paper one way or another, I'd be grateful for the reference.
Let me thank Jon Harrop for looking this up, as I was assuming it was impossible. Well, maybe it is, and I couldn't find a proper link. Note, however, that it is not inheritance alone causing the problem.
F# is functional - It allows OO pretty well, but the design and philosophy is functional nevertheless. Examples:
Haskell-style functions
Automatic currying
Automatic generics
Type inference for arguments
It feels relatively clumsy to use F# in a mainly object-oriented way, so one could describe the main goal as to integrate OO into functional programming.
Scala is multi-paradigm with focus on flexibility. You can choose between authentic FP, OOP and procedural style depending on what currently fits best. It's really about unifying OO and functional programming.
There are quite a few points that you can use for comparing the two (or three). First, here are some notable points that I can think of:
Syntax
Syntactically, F# and OCaml are based on the functional programming tradition (space separated and more lightweight), while Scala is based on the object-oriented style (although Scala makes it more lightweight).
Integrating OO and FP
Both F# and Scala very smoothly integrate OO with FP (because there is no contradiction between these two!!) You can declare classes to hold immutable data (functional aspect) and provide members related to working with the data, you can also use interfaces for abstraction (object-oriented aspects). I'm not as familiar with OCaml, but I would think that it puts more emphasis on the OO side (compared to F#)
Programming style in F#
I think that the usual programming style used in F# (if you don't need to write .NET library and don't have other limitations) is probably more functional and you'd use OO features only when you need to. This means that you group functionality using functions, modules and algebraic data types.
Programming style in Scala
In Scala, the default programming style is more object-oriented (in the organization), however you still (probably) write functional programs, because the "standard" approach is to write code that avoids mutation.
What are the key differences between the approaches taken by Scala and F# to unify OO and FP paradigms?
The key difference is that Scala tries to blend the paradigms by making sacrifices (usually on the FP side) whereas F# (and OCaml) generally draw a line between the paradigms and let the programmer choose between them for each task.
Scala had to make sacrifices in order to unify the paradigms. For example:
First-class functions are an essential feature of any functional language (ML, Scheme and Haskell). All functions are first-class in F#. Member functions are second-class in Scala.
Overloading and subtypes impede type inference. F# provides a large sublanguage that sacrifices these OO features in order to provide powerful type inference when these features are not used (requiring type annotations when they are used). Scala pushes these features everywhere in order to maintain consistent OO at the cost of poor type inference everywhere.
Another consequence of this is that F# is based upon tried and tested ideas whereas Scala is pioneering in this respect. This is ideal for the motivations behind the projects: F# is a commercial product and Scala is programming language research.
As an aside, Scala also sacrificed other core features of FP such as tail-call optimization for pragmatic reasons due to limitations of their VM of choice (the JVM). This also makes Scala much more OOP than FP. Note that there is a project to bring Scala to .NET that will use the CLR to do genuine TCO.
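For context on the tail-call point: the JVM offers no general tail-call elimination, but the Scala compiler does rewrite self-recursive calls in tail position into loops, and the @tailrec annotation asks it to verify that. A minimal sketch, with an illustrative function:

```scala
import scala.annotation.tailrec

object TailCallDemo {
  // @tailrec makes compilation fail unless the recursive call is in tail
  // position, in which case the method is compiled to a loop.
  @tailrec
  def sumTo(n: Int, acc: Long = 0L): Long =
    if (n == 0) acc else sumTo(n - 1, acc + n)
}
```

This covers direct self-recursion only; mutually recursive functions calling each other in tail position still consume stack on the JVM.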
What are the relative merits and demerits of each approach? If, in spite of the support for subtyping, F# can infer the types of function arguments then why can't Scala?
Type inference is at odds with OO-centric features like overloading and subtypes. F# chose type inference over consistency with respect to overloading. Scala chose ubiquitous overloading and subtypes over type inference. This makes F# more like OCaml and Scala more like C#. In particular, Scala is no more a functional programming language than C# is.
Which is better is entirely subjective, of course, but I personally much prefer the tremendous brevity and clarity that comes from powerful type inference in the general case. OCaml is a wonderful language but one pain point was the lack of operator overloading that required programmers to use + for ints, +. for floats, +/ for rationals and so on. Once again, F# chooses pragmatism over obsession by sacrificing type inference for overloading specifically in the context of numerics, not only on arithmetic operators but also on arithmetic functions such as sin. Every corner of the F# language is the result of carefully chosen pragmatic trade-offs like this. Despite the resulting inconsistencies, I believe this makes F# far more useful.
From this article on Programming Languages:
Scala is a rugged, expressive,
strictly superior replacement for
Java. Scala is the programming
language I would use for a task like
writing a web server or an IRC client.
In contrast to OCaml [or F#], which was a
functional language with an
object-oriented system grafted to it,
Scala feels more like an true hybrid
of object-oriented and functional
programming. (That is, object-oriented
programmers should be able to start
using Scala immediately, picking up
the functional parts only as they
choose to.)
I first learned about Scala at POPL
2006 when Martin Odersky gave an
invited talk on it. At the time I saw
functional programming as strictly
superior to object-oriented
programming, so I didn't see a need
for a language that fused functional
and object-oriented programming. (That
was probably because all I wrote back
then were compilers, interpreters and
static analyzers.)
The need for Scala didn't become
apparent to me until I wrote a
concurrent HTTPD from scratch to
support long-polled AJAX for yaplet.
In order to get good multicore
support, I wrote the first version in
Java. As a language, I don't think
Java is all that bad, and I can enjoy
well-done object-oriented programming.
As a functional programmer, however,
the lack of (or needlessly verbose)
support of functional programming
features (like higher-order functions)
grates on me when I program in Java.
So, I gave Scala a chance.
Scala runs on the JVM, so I could
gradually port my existing project
into Scala. It also means that Scala,
in addition to its own rather large
library, has access to the entire Java
library as well. This means you can
get real work done in Scala.
As I started using Scala, I became
impressed by how cleverly the
functional and object-oriented worlds
blended together. In particular, Scala
has a powerful case
class/pattern-matching system that
addressed pet peeves lingering from my
experiences with Standard ML, OCaml
and Haskell: the programmer can decide
which fields of an object should be
matchable (as opposed to being forced
to match on all of them), and
variable-arity arguments are
permitted. In fact, Scala even allows
programmer-defined patterns. I write a
lot of functions that operate on
abstract syntax nodes, and it's nice
to be able to match on only the
syntactic children, but still have
fields for things such as annotations
or lines in the original program. The
case class system lets one split the
definition of an algebraic data type
across multiple files or across
multiple parts of the same file, which
is remarkably handy.
Scala also
supports well-defined multiple
inheritance through class-like devices
called traits.
Scala also allows a
considerable degree of overloading;
even function application and array
update can be overloaded. In my
experience, this tends to make my
Scala programs more intuitive and
concise.
One feature that turns out to save a
lot of code, in the same way that type
classes save code in Haskell, is
implicits. You can imagine implicits
as an API for the error-recovery phase
of the type-checker. In short, when
the type checker needs an X but got a
Y, it will check to see if there's an
implicit function in scope that
converts Y into X; if it finds one, it
"casts" using the implicit. This makes
it possible to look like you're
extending just about any type in
Scala, and it allows for tighter
embeddings of DSLs.
From the above excerpt it is clear that Scala's approach to unifying the OO and FP paradigms is far superior to that of OCaml or F#.
HTH.
Regards,
Eric.
The syntax of F# was taken from OCaml but the object model of F# was taken from .NET. This gives F# a light and terse syntax that is characteristic of functional programming languages and at the same time allows F# to interoperate with the existing .NET languages and .NET libraries very smoothly through its object model.
Scala does a similar job on the JVM that F# does on the CLR. However Scala has chosen to adopt a more Java-like syntax. This may assist in its adoption by object-oriented programmers but to a functional programmer it can feel a bit heavy. Its object model is similar to Java's allowing for seamless interoperation with Java but has some interesting differences such as support for traits.
If functional programming means programming with functions, then Scala bends that a bit. In Scala, if I understand correctly, you're programming with methods instead of functions.
When the class (and the object of that class) behind the method don't matter, Scala will let you pretend it's just a function. Perhaps a Scala language lawyer can elaborate on this distinction (if it even is a distinction), and any consequences.