I'm going to be giving a short (30-40 mins) lunch-time talk on Scala to technical staff at my company. I'd like some suggestions for what would be the most appropriate content. Most people attending will have experience in Java and/or C# (plus various other languages).
What are the key things to cover? I'd like to give a brief introduction to the Scala syntax so that people don't feel lost when looking at code examples. I'll also cover some of the history behind the language and its designers. What would help people get the most out of the talk?
People are almost certainly coming to talk to get an answer to the question, "Why should I use Scala?" Anything you can provide to help them answer that will be valuable.
Keep the discussion of the history and the personalities behind Scala to a minimum.
A whirlwind tour of the syntax is useful, but keep it short.
Spend a good chunk of the talk demonstrating examples and comparisons to Java. Show cases where Scala shines. You should literally be running and executing code so that people get a real, hands-on feel for how things work.
Make sure to cover weaknesses, too! Provide an objective and balanced overview.
I gave a similar talk - mostly to those with a Java background. I felt that taking a piece of real Java (about 30 lines) and iteratively adding scala features worked pretty well. The 30 lines of Java eventually ended up as 6 (six!) of scala. The point being (of course) that 6 lines are more readable and maintainable than 30.
I converted the scala to line-by-line Java equivalent and then introduced:
Type inference
Option
Closures
Pattern-matching (on lists)
Type aliases
Tail recursion
I found that this segment took quite a long time because the audience were very interested in the minutiae of scala's syntax (especially around function-expressions). Before undertaking the pattern-matching bit, I had a slide explaining the various things you could use in a match.
Tough. One has to balance the new and the familiar. For instance:
Talk about traits, how they differ from interfaces and multiple inheritance. Note that most methods in all of Scala collections can actually be found on the trait Traversable, which has a single abstract method: foreach.
Speak of functions and partial functions, show map/filter/foreach, and how they make use of functions.
Talk about pattern matching -- show how unapply is used to enable representation independence, while at the same time case classes make the common case easy.
Above all AVOID any topic that might be difficult to understand quickly, or you may waste time on them. For example of great topics I wouldn't talk about: self types, variance, for-comprehensions.
Pick more topics than you have time for. Let the public steer the conversation towards the topcis they are more interested in. If anyone starts to boggle down a topic too much, say you'll be pleased to explain it in more details later, and ask if they would mind if you moved to another topic. On the other hand, if everyone seems to be picking up on one thing in particular, stay with it. Otherwise, it might feel like you want to hide something.
I gave a presentation on re-writing Java classes in Scala. It has lots of examples of Java -> Scala and (hopefully) makes the gains obvious. Feel free to borrow any content you want... presentation took 1hr 10minutes so you might want to cut some stuff out.
Presentation: http://www.colinhowe.co.uk/downloads/rewriting-java-in-scala.ppt
Video: http://skillsmatter.com/podcast/java-jee/re-writing-java-classes-in-scala-and-making-your-code-lovely
You could do worse than running through Jonas Bonér's presentation, Pragmatic Real-World Scala. Perhaps skip some advanced topics in there on different applications of traits and self-type annotations.
Related
How does the Cats library relate to scalaz? The Cats project mentions it is descended from scalaz.
I would like to keep this from getting too political*, but cats is for all intents and purposes scalaz. It has not reached full parity as of yet, but keep in mind it was only created a few months ago. The goal is for it to be a more pragmatic approach and more democratic when it comes to its evolution. So, naming of operators and classes is hopefully going to be a little more straightforward, as well as it has no qualms with using mutable data within a method if it means better performance. Last, they are HOPING to have better documentation....all of this means that it may end up becoming a replacement for scalaz with a better beginner's approach for those not embroiled in the math world. If you want a fuller answer, then maybe head over to their gitter board and Erik (non) could answer it himself :)
*The gist is that scalaz has some social baggage with it that causes a number of big names to shy away from using and/or contributing.
just want to mention here that fairly recently scalaz became as well a namespace for a bunch of new, fp libraries, like testz, scalaz-zio, scalaz-metrics, scalaz-http, scalaz-analytics and more. And ScalaZ 8 is on the way!
I was under the impression that there are folks out there that do write pure applications using Scalaz, but based on this example: [ stacking StateT in scalaz ], it looks like anything real would also be impossibly hairy.
Are there any guidelines or examples of real, modular, loosely-coupled, pure applications in Scala? I'm expecting that this means scalaz.effect.SafeApp and RWST over IO, but I'd like to hear from folks who have done it.
Thanks.
Edit: In the absence of an answer, I've started collecting resources as an answer below. If you have any examples or related links to contribute, please do.
i think you are mixing two different things. one is pure functional programming and second is scala type system. you can do 'pure' programming in any language, even in java. if the language is funvtional than you will have pure functional programming.
does it make your programs work faster? depends on the program - it scales better but for single threaded parts you will rather loose performance.
does it 'save your cognition'? it depends on how good you are in what you are doing. if you work with FP, monads, arrows etc on the daily basis then i assume it may help significantly. if you show the code to the OO developer he probably won't understand anything.
does it save the development time? as previously, i think it may but to be honest it doesn't matter that much. you more often read the code rather than write it
can you do useful stuff in PFP? yes, some companies makes money on haskell
and now, can it be done in scala? for sure. will anyone do it in scala? probably not because it's too easy to break the purity, because type system is too weak and because there are better, 'more pure' tools for it (but currently not on jvm)
I guess I will start collecting resources here, and update as I find more.
Functional Reactive Programming: stefan hoeck's blog, github, examples
Monadic effect worlds for interacting safely with mutable data. (tpolecat)
Mellow database access for Scala (tpolecat)
Dependency Injection without the Gymnastics (tony, rúnar)
Google search for "extends SafeApp"
Soooo...
Semigroups, Monoids, Monads, Functors, Lenses, Catamorphisms, Anamorphisms, Arrows... These all sound good, and after an exercise or two (or ten), you can grasp their essence. And with Scalaz, you get them for free...
However, in terms of real-world programming, I find myself struggling to find usages to these notions. Yes, of course I always find someone on the web using Monads for IO or Lenses in Scala, but... still...
What I am trying to find is something along the "prescriptive" lines of a pattern. Something like: "here, you are trying to solves this, and one good way to solve it is by using lenses this way!"
Suggestions?
Update: Something along these lines, with a book or two, would be great (thanks Paul): Examples of GoF Design Patterns in Java's core libraries
The key to functional programming is abstraction, and composability of abstractions. Monads, Arrows, Lenses, these are all abstractions which have proven themselves useful, mostly because they are composable. You've asked for a "prescriptive" answer, but I'm going to say no. Perhaps you're not convinced that functional programming matters?
I'm sure plenty of people on StackOverflow would be more than happy to try and help you solve a specific problem the FP way. Have a list of stuff and you want to traverse the list and build up some result? Use a fold. Want to parse XML? hxt uses arrows for that. And monads? Well, tons of data types turn out to be Monads, so learn about them and you'll discover a wealth of ways you can manipulate these data types. But its kind of hard to just pull examples out of thin air and say "lenses are the Right Way to do this", "monoids are the best way to do that", etc. How would you explain to a newbie what the use of a for loop is? If you want to [blank], then use a for loop [in this way]. It's so general; there are tons of ways to use a for loop. The same goes for these FP abstractions.
If you have many years of OOP experience, then don't forget you were once a newbie at OOP. It takes time to learn the FP way, and even more time to unlearn some OOP tendencies. Give it time and you will find plenty of uses for a Functional approach.
I gave a talk back in September focused on the practical application of monoids and applicative functors/monads via scalaz.Validation. I gave another version of the same talk at the scala Lift Off, where the emphasis was more on the validation. I would watch the first talk until I start on validations and then skip to the second talk (27 minutes in).
There's also a gist I wrote which shows how you might use Validation in a "practical" application. That is, if you are designing software for nightclub bouncers.
I think you can take the reverse approach and instead when writing a small piece of functionality, ask yourself whether any of those would apply: Semigroups, Monoids, Monads, Functors, Lenses, Catamorphisms, Anamorphisms, Arrows... A lots of those concepts can be used in a local way.
Once you start down that route, you may see usage everywhere. For me, I sort of get Semigroups, Monoids, Monads, Functors. So take the example of answering this question How do I populate a list of objects with new values. It's a real usage for the person asking the question (a self described noob). I am trying to answer in a simple way but I have to refrain myself from scratching the itch "there are monoids in here".
Scratching it now: using foldMap and the fact that Int and List are monoids and that the monoid property is preserved when dealing with tuple, maps and options:
// using scalaz
listVar.sliding(2).toList.foldMap{
case List(prev, i) => Some(Map(i -> (1, Some(List(math.abs(i - prev))))))
case List(i) => Some(Map(i -> (1, None)))
case _ => None
}.map(_.mapValues{ case (count, gaps) => (count, gaps.map(_.min)) })
But I don't come to that result by thinking I will use hard core functional programming. It comes more naturally by thinking this seems simpler if I compose those monoids combined with the fact that scalaz has utility methods like foldMap. Interestingly when looking at the resulting code it's not obvious that I'm totally thinking in terms of monoid.
You might like this talk by Chris Marshall. He covers a couple of Scalaz goodies - namely Monoid and Validation - with many practical examples. Ittay Dror has written a very accessible post on how Functor, Applicative Functor, and Monad can be useful in practice. Eric Torreborre and Debasish Gosh's blogs also have a bunch of posts covering use cases for categorical constructs.
This answer just lists a few links instead of providing some real substance here. (Too lazy to write.) Hope you find it helpful anyway.
I understand your situation, but you will find that to learn functional programming you will need to adjust your point of view to the documentation you find, instead of the other way around. Luckily in Scala you have the possibility of becoming a functional programmer gradually.
To answer your questions and explain the point-of-view difference, I need to distinguish between "type classes" (monoids, functors, arrows), mathematically called "structures", and generic operations or algorithms (catamorphisms or folds, anamorphisms or unfolds, etc.). These two often interact, since many generic operations are defined for specific classes of data types.
You look for prescriptive answers similar to design patterns: when does this concept apply? The truth is that you have surely seen the prescriptive answers, and they are simply the definitions of the different concepts. The problem (for you) is that those answers are intrinsically different from design patterns, but it is so for good reasons.
On the one hand, generic algorithms are not design patterns, which suggest a structure for the code you write; they are abstractions defined in the language which you can directly apply. They are general descriptions for common algorithms which you already implement today, but by hand. For instance, whenever you are computing the maximum element of a list by scanning it, you are hardcoding a fold; when you sum elements, you are doing the same; and so on. When you recognize that, you can declare the essence of the operation you are performing by calling the appropriate fold function. This way, you save code and bugs (no opportunity for off-by-one errors), and you save the reader the effort to read all the needed code.
On the other hand, structures concern not the goal you have in mind but properties of the entities you are modeling. They are more useful for bottom-up software construction, rather than top-down: when defining your data, you can declare that it is a e.g. a monoid. Later, when processing your data, you have the opportunity to use operations on e.g. monoids to implement your processing. In some cases it is useful to strive to express your algorithm in terms of the predefined ones. For instance, very often if you need to reduce a tree to a single value, a fold can do most or all of what you need. Of course, you can also declare that your data type is a monoid when you need a generic algorithm on monoids; but the earlier you notice that, the earlier you can start reusing generic algorithms for monoids.
Last advice is that probably most of the documentation you will find about these concepts concerns Haskell, because this language has been around for much more time and supports them in a quite elegant way. Quite recommended here are Learn you a Haskell for Great Good, a Haskell course for beginners, where among others chapters 11 to 14 focus on some type classes, and Typeclassopedia (which contains links to various articles with specific examples). EDIT: Finally, an example of applications of Monoids, taken from Typeclassopedia, is here: http://apfelmus.nfshost.com/articles/monoid-fingertree.html. I'm not saying there is little documentation for Scala, just that there is more in Haskell, and Haskell is where the application of these concepts to programming was born.
In Java and C++ designing program's objects hierarchy is pretty obvious. But beginning Scala I found myself difficult to decide what classes to define to better employ Scala's syntactic sugar facilities (an even idealess about how should I design for better performance). Any good readings on this question?
I have read 4 books on Scala, but I have not found what you are asking for. I guess you have read "Programming in Scala" by Odersky (Artima) already. If not, this is a link to the on-line version:
http://www.docstoc.com/docs/8692868/Programming-In-Scala
This book gives many examples how to construct object-oriented models in Scala, but all examples are very small in number of classes. I do not know of any book that will teach you how to structure large scale systems using Scala.
Imperative object-orientation has
been around since Smalltalk, so we
know a lot about this paradigm.
Functional object-orientation on the
other hand, is a rather new concept,
so in a few years I expect books
describing large scale FOO systems to
appear. Anyway, I think that the PiS
book gives you a pretty good picture
how you can put together the basic
building blocks of a system, like
Factory pattern, how to replace the
Strategy pattern with function
literals and so on.
One thing that Viktor Klang once told me (and something I really agree upon) is that one difference between C++/Java and Scala OO is that you define a lot more (smaller) classes when you use Scala. Why? Because you can! The syntactic sugar for the case class result in a very small penalty for defining a class, both in typing and in readability of the code. And as you know, many small classes usually means better OO (fewer bugs) but worse performance.
One other thing I have noticed is that I use the factory pattern a lot more when dealing with immutable objects, since all "changes" of an instance results in creating a new instance. Thank God for the copy() method on the case class. This method makes the factory methods a lot shorter.
I do not know if this helped you at all, but I think this subject is very interesting myself, and I too await more literature on this subject.
Cheers!
This is still an evolving matter. For instance, the just released Scala 2.8.0 brought support of type constructor inference, which enabled a pattern of type classes in Scala. The Scala library itself has just began using this pattern. Just yesterday I heard of a new Lift module in which they are going to try to avoid inheritance in favor of type classes.
Scala 2.8.0 also introduced lower priority implicits, plus default and named parameters, both of which can be used, separately or together, to produce very different designs than what was possible before.
And if we go back in time, we note that other important features are not that old either:
Extractor methods on case classes object companions where introduced February 2008 (before that, the only way to do extraction on case classes was through pattern matching).
Lazy values and Structural types where introduced July 2007.
Abstract types support for type constructors was introduced in May 2007.
Extractors for non-case classes was introduced in January 2007.
It seems that implicit parameters were only introduced in March 2006, when they replaced the way views were implemented.
All that means we are all learning how to design Scala software. Be sure to rely on tested designs of functional and object oriented paradigms, to see how new features in Scala are used in other languages, like Haskell and type classes or Python and default (optional) and named parameters.
Some people dislike this aspect of Scala, others love it. But other languages share it. C# is adding features as fast as Scala. Java is slower, but it goes through changes too. It added generics in 2004, and the next version should bring some changes to better support concurrent and parallel programming.
I don't think that there are much tutorials for this. I'd suggest to stay with the way you do it now, but to look through "idiomatic" Scala code as well and to pay special attention in the following cases:
use case classes or case objects instead of enums or "value objects"
use objects for singletons
if you need behavior "depending on the context" or dependency-injection-like functionality, use implicits
when designing a type hierarchy or if you can factor things out of a concrete class, use traits when possible
Fine grained inheritance hierarchies are OK. Keep in mind that you have pattern matching
Know the "pimp my library" pattern
And ask as many questions as you feel you need to understand a certain point. The Scala community is very friendly and helpful. I'd suggest the Scala mailing list, Scala IRC or scala-forum.org
I've just accidentally googled to a file called "ScalaStyleGuide.pdf". Going to read...
This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why is Lisp used for AI?
What makes a language suitable for Artificial Intelligence development?
I've heard that LISP and Prolog are widely used in this field. What features make them suitable for AI?
Overall I would say the main thing I see about languages "preferred" for AI is that they have high order programming along with many tools for abstraction.
It is high order programming (aka functions as first class objects) that tends to be a defining characteristic of most AI languages http://en.wikipedia.org/wiki/Higher-order_programming that I can see. That article is a stub and it leaves out Prolog http://en.wikipedia.org/wiki/Prolog which allows high order "predicates".
But basically high order programming is the idea that you can pass a function around like a variable. Surprisingly a lot of the scripting languages have functions as first class objects as well. LISP/Prolog are a given as AI languages. But some of the others might be surprising. I have seen several AI books for Python. One of them is http://www.nltk.org/book. Also I have seen some for Ruby and Perl. If you study more about LISP you will recognize a lot of its features are similar to modern scripting languages. However LISP came out in 1958...so it really was ahead of its time.
There are AI libraries for Java. And in Java you can sort of hack functions as first class objects using methods on classes, it is harder/less convenient than LISP but possible. In C and C++ you have function pointers, although again they are much more of a bother than LISP.
Once you have functions as first class objects, you can program much more generically than is otherwise possible. Without functions as first class objects, you might have to construct sum(array), product(array) to perform the different operations. But with functions as first class objects you could compute accumulate(array, +) and accumulate(array, *). You could even do accumulate(array, getDataElement, operation). Since AI is so ill defined that type of flexibility is a great help. Now you can build much more generic code that is much easier to extend in ways that were not originally even conceived.
And Lambda (now finding its way all over the place) becomes a way to save typing so that you don't have to define every function. In the previous example, instead of having to make getDataElement(arrayelement) { return arrayelement.GPA } somewhere you can just say accumulate(array, lambda element: return element.GPA, +). So you don't have to pollute your namespace with tons of functions to only be called once or twice.
If you go back in time to 1958, basically your choices were LISP, Fortran, or Assembly. Compared to Fortran LISP was much more flexible (unfortunately also less efficient) and offered much better means of abstraction. In addition to functions as first class objects, it also had dynamic typing, garbage collection, etc. (stuff any scripting language has today). Now there are more choices to use as a language, although LISP benefited from being first and becoming the language that everyone happened to use for AI. Now look at Ruby/Python/Perl/JavaScript/Java/C#/and even the latest proposed standard for C you start to see features from LISP sneaking in (map/reduce, lambdas, garbage collection, etc.). LISP was way ahead of its time in the 1950's.
Even now LISP still maintains a few aces in the hole over most of the competition. The macro systems in LISP are really advanced. In C you can go and extend the language with library calls or simple macros (basically a text substitution). In LISP you can define new language elements (think your own if statement, now think your own custom language for defining GUIs). Overall LISP languages still offer ways of abstraction that the mainstream languages still haven't caught up with. Sure you can define your own custom compiler for C and add all the language constructs you want, but no one does that really. In LISP the programmer can do that easily via Macros. Also LISP is compiled and per the programming language shootout, it is more efficient than Perl, Python, and Ruby in general.
Prolog basically is a logic language made for representing facts and rules. What are expert systems but collections of rules and facts. Since it is very convenient to represent a bunch of rules in Prolog, there is an obvious synergy there with expert systems.
Now I think using LISP/Prolog for every AI problem is not a given. In fact just look at the multitude of Machine Learning/Data Mining libraries available for Java. However when you are prototyping a new system or are experimenting because you don't know what you are doing, it is way easier to do it with a scripting language than a statically typed one. LISP was the earliest languages to have all these features we take for granted. Basically there was no competition at all at first.
Also in general academia seems to like functional languages a lot. So it doesn't hurt that LISP is functional. Although now you have ML, Haskell, OCaml, etc. on that front as well (some of these languages support multiple paradigms...).
The main calling card of both Lisp and Prolog in this particular field is that they support metaprogramming concepts like lambdas. The reason that is important is that it helps when you want to roll your own programming language within a programming language, like you will commonly want to do for writing expert system rules.
To do this well in a lower-level imperative language like C, it is generally best to just create a separate compiler or language library for your new (expert system rule) language, so you can write your rules in the new language and your actions in C. This is the principle behind things like CLIPS.
The two main things you want are the ability to do experimental programming and the ability to do unconventional programming.
When you're doing AI, you by definition don't really know what you're doing. (If you did, it wouldn't be AI, would it?) This means you want a language where you can quickly try things and change them. I haven't found any language I like better than Common Lisp for that, personally.
Similarly, you're doing something not quite conventional. Prolog is already an unconventional language, and Lisp has macros that can transform the language tremendously.
What do you mean by "AI"? The field is so broad as to make this question unanswerable. What applications are you looking at?
LISP was used because it was better than FORTRAN. Prolog was used, too, but no one remembers that. This was when people believed that symbol-based approaches were the way to go, before it was understood how hard the sensing and expression layers are.
But modern "AI" (machine vision, planners, hell, Google's uncanny ability to know what you 'meant') is done in more efficient programming languages that are more sustainable for a large team to develop in. This usually means C++ these days--but it's not like anyone thinks of C++ as a good language for AI.
Hell, you can do a lot of what was called "AI" in the 70s in MATLAB. No one's ever called MATLAB "a good language for AI" before, have they?
Functional programming languages are easier to parallelise due to their stateless nature. There seems to already be a subject about it with some good answers here: Advantages of stateless programming?
As said, its also generally simpler to build programs that generate programs in LISP due to the simplicity of the language, but this is only relevant to certain areas of AI such as evolutionary computation.
Edit:
Ok, I'll try and explain a bit about why parallelism is important to AI using Symbolic AI as an example, as its probably the area of AI that I understand best. Basically its what everyone was using back in the day when LISP was invented, and the Physical Symbol Hypothesis on which it is based is more or less the same way you would go about calculating and modelling stuff in LISP code. This link explains a bit about it:
http://www.cs.st-andrews.ac.uk/~mkw/IC_Group/What_is_Symbolic_AI_full.html
So basically the idea is that you create a model of your environment, then searching through it to find a solution. One of the simplest to algorithms to implement is a breadth first search, which is an exhaustive search of all possible states. While producing an optimal result, it is usually prohibitively time consuming. One way to optimise this is by using a heuristic (A* being an example), another is to divide the work between CPUs.
Due to statelessness, in theory, any node you expand in your search could be ran in a separate thread without the complexity or overhead involved in locking shared data. In general, assuming the hardware can support it, then the more highly you can parallelise a task the faster you will get your result. An example of this could be the folding#home project, which distributes work over many GPUs to find optimal protein folding configurations (that may not have anything to do with LISP, but is relevant to parallelism).
As far as I know from LISP is that is a Functional Programming Language, and with it you are able to make "programs that make programs. I don't know if my answer suits your needs, see above links for more information.
Pattern matching constructs with instantiation (or the ability to easily construct pattern matching code) are a big plus. Pattern matching is not totally necessary to do A.I., but it can sure simplify the code for many A.I. tasks. I'm finding this also makes F# a convenient language for A.I.
Languages per se (without libraries) are suitable/comfortable for specific areas of research/investigation and/or learning/studying ("how to do the simplest things in the hardest way").
Suitability for commercial development is determined by availability of frameworks, libraries, development tools, communities of developers, adoption by companies. For ex., in internet you shall find support for any, even the most exotic issue/areas (including, of course, AI areas), for ex., in C# because it is mainstream.
BTW, what specifically is context of question? AI is so broad term.
Update:
Oooops, I really did not expect to draw attention and discussion to my answer.
Under ("how to do the simplest things in the hardest way"), I mean that studying and learning, as well as academic R&D objectives/techniques/approaches/methodology do not coincide with objectives of (commercial) development.
In student (or even academic) projects one can write tons of code which would probably require one line of code in commercial RAD (using of component/service/feature of framework or library).
Because..! oooh!
Because, there is no sense to entangle/develop any discussion without first agreeing on common definitions of terms... which are subjective and depend on context... and are not so easy to be formulate in general/abstract context.
And this is inter-disciplinary matter of whole areas of different sciences
The question is broad (philosophical) and evasively formulated... without beginning and end... having no definitive answers without of context and definitions...
Are we going to develop here some spec proposal?