Why is elt not as common as car, cdr, first, rest? - lisp

I see this a lot in examples I read in books and articles:
(caddr *something*)
Or the many variants of c***r commands.
It seems a bit ridiculous to me, when you can more clearly just pull things out with elt:
(elt *something* 2)
But I don't see this technique used as much.
Is there a convention I don't understand that prefers the c***r functions?

elt is a generic function that works on both lists and arrays. It is needed when you want to write a generic algorithm that works the same way on both data types. But there aren't many such algorithms, because indexed access usually puts lists at a disadvantage: elt on a list has to walk the list from the front, so it is O(n).
The general convention seems to be that:
If you write a general-purpose function that operates on lists, you would use a c(a|d)+r function (such cases are exceedingly rare, because most of the time there is already a library function for the job). In practice this mostly shows up in Stack Overflow snippets, code for class assignments, and the like :)
Seasoned Lisp programmers would use first, second, etc. where possible. This is also sometimes mentioned in best-practice guides, which usually add that one should define appropriate data structures instead of working with non-trivial lists.
nth and elt are indeed rare, because it's hard to think of a good use case for them. I can imagine both being used in macros, where performance isn't important but some kind of generality is desirable, for instance if someone wanted to treat strings and lists of characters in the same way. Maybe also in prototyping code, where the programmer isn't yet sure which data type they are going to use, but that's about it.

Functions like caddr compose car and cdr, so variants that mix a's and d's (such as caar or cadar) can extract components from lists nested inside other lists. elt only indexes the top-level list: it can return a nested list in its entirety, but to pull a component out of that sublist you would have to nest elt calls yourself, which is essentially what the composed c***r accessors do for you. So the two are not really comparable in general.
In some cases you can interchange them: for a flat proper list, (caddr *something*) and (elt *something* 2) return the same element.

Related

What is a multi-pass Sequence in Swift?

I am looking at this language from the Swift GeneratorType documentation and I'm having a hard time understanding it:
Any code that uses multiple generators (or for...in loops) over a single sequence should have static knowledge that the specific sequence is multi-pass, either because its concrete type is known or because it is constrained to CollectionType. Also, the generators must be obtained by distinct calls to the sequence's generate() method, rather than by copying.
What does it mean for a sequence to be "multi-pass"? This language seems quite important, but I can't find a good explanation for it. I understand, for example, the concept of a "multi-pass compiler", but I'm unsure if the concepts are similar or related...
Also, I have searched SO for other posts that answer this question. I have found this one, which makes the following statement in the C++ context:
The difference between algorithms that copy their iterators and those that do not is that the former are termed "multipass" algorithms, and require their iterator type to satisfy ForwardIterator, while the latter are single-pass and only require InputIterator.
But the meaning of that isn't entirely clear to me either, and I'm not sure if the concept is the same in Swift.
Any insight from those wiser than me would be much appreciated.
A "multi-pass" sequence is one that can be iterated over multiple times via a for...in loop or by using any number of generators (constructed via generate())
The text explains that you would know a sequence is multi-pass because you either
know its concrete type (perhaps a class you designed), or
know that it conforms to CollectionType (for example, sets and arrays).
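The distinction is not specific to Swift, so a sketch in another language may still help. In Scala, for instance, a List behaves like a multi-pass sequence while an Iterator is single-pass; the little demo below (object and value names are purely illustrative) shows the difference:
object MultiPassDemo extends App {
  val xs = List(1, 2, 3)     // a collection is multi-pass: every traversal starts fresh
  println(xs.sum)            // 6
  println(xs.sum)            // 6 again -- the same sequence can be walked repeatedly

  val it = Iterator(1, 2, 3) // an iterator is single-pass: it is consumed as it goes
  println(it.sum)            // 6
  println(it.sum)            // 0 -- the iterator is already exhausted
}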

Should I prefer hashes or hashrefs in Perl?

I'm still learning Perl.
For me it feels more "natural" to use references to hashes rather than to access the hashes directly, because it is easier to pass a reference to a sub (one scalar can be passed instead of a list). Generally I prefer this approach over accessing %hashes directly.
The question is: where (in what situations) is it better to use a plain %hash, so
$hash{key} = $value;
instead of
$href->{key} = $value;
Is there any speed advantage, or any other reason, that favours plain %hashes over $hashrefs? Or is it purely a matter of personal taste and TIMTOWTDI? Could you give some examples of when a plain %hash is the better choice?
I think this kind of question is very legitimate: programming languages such as Perl or C++ have come a long way and accumulated a lot of historical baggage, but people typically learn them from ahistorical, synchronous exposés. Hence they keep wondering why TIMTOWTDI, WTF all these choices, and what is better and what should be preferred.
So: before version 5, Perl didn't have references; it only had value types. References were added on top of Perl 4's value types, enabling lots more stuff to be written. Value types had to be retained, of course, to keep backward compatibility, and also for simplicity's sake, because frequently you don't need the indirection that references provide.
To answer your question:
Don't waste time thinking about the speed of Perl hashes. They're fast; they're memory access. Accessing a database, the filesystem, or the network - that is where your program will typically spend its time.
In theory, a dereference operation should take a tiny bit of time, so tiny it shouldn't matter.
If you're curious, then benchmark. Don't draw too many conclusions from differences you might see. Things could look different on another release.
So there is no speed reason to favour references over value types or vice versa.
Is there any other reason? I'd say it's a question of style and taste. Personally, I prefer the syntax without -> accessors.
If a plain hash is enough to describe your data, use a plain hash. However, when your data structure gets a bit more complex, you will need to use references.
Imagine a program where I'm storing information about inventory items, and how many I have in stock. A simple hash works quite well:
$item{XP232} = 324;
$item{BV348} = 145;
$item{ZZ310} = 485;
If all you're doing is creating quick programs that can read a file and store simple information for a report, there's no need to use references at all.
However, when things get more complex, you need references. For example, suppose my program isn't just tracking stock levels but all aspects of my inventory: items also have names, the companies that make them, and so on. In this case, I'll want my hash values to be not a single data point (the number of items I have in stock) but a reference to a nested hash:
$item{XP232}->{DESCRIPTION} = "Blue Widget";
$item{XP232}->{IN_STOCK} = 324;
$item{XP232}->{MANUFACTURER} = "The Great American Widget Company";
$item{BV348}->{DESCRIPTION} = "A Small Purple Whatzit";
$item{BV348}->{IN_STOCK} = 145;
$item{BV348}->{MANUFACTURER} = "Acme Whatzit Company";
You can do all sorts of wacky things to achieve something like this (like keeping a separate hash for each field, or packing all the fields into a single value separated by colons), but it's simply easier to use references to store these more complex structures.
For me the main reason to prefer $hashrefs over %hashes is the ability to give them meaningful names (a related idea is giving a meaningful name to a reference to an anonymous hash), which helps you separate data structures from program logic and makes things easier to read and maintain.
If you end up with multiple levels of references (refs to refs?!), you start to lose that clean, readable advantage, though. Also, for short programs or modules, or at early stages of development where you are re-testing things as you go, accessing the %hash directly can make simple debugging easier (print statements and the like) and help you avoid accidental "action at a distance" issues, so you can focus on iterating through your design and bring in references where appropriate.
In general, though, I think this is a great question, because TIMTOWTDI and TIMTOCWDI, where C = "correct". Thanks for asking it, and thanks for the answers.

Real World Functional Programming in Scala

Soooo...
Semigroups, Monoids, Monads, Functors, Lenses, Catamorphisms, Anamorphisms, Arrows... These all sound good, and after an exercise or two (or ten), you can grasp their essence. And with Scalaz, you get them for free...
However, in terms of real-world programming, I find myself struggling to find uses for these notions. Yes, of course I always find someone on the web using Monads for IO or Lenses in Scala, but... still...
What I am trying to find is something along the "prescriptive" lines of a pattern. Something like: "here, you are trying to solve this, and one good way to solve it is by using lenses this way!"
Suggestions?
Update: Something along these lines, with a book or two, would be great (thanks Paul): Examples of GoF Design Patterns in Java's core libraries
The key to functional programming is abstraction, and composability of abstractions. Monads, Arrows, Lenses, these are all abstractions which have proven themselves useful, mostly because they are composable. You've asked for a "prescriptive" answer, but I'm going to say no. Perhaps you're not convinced that functional programming matters?
I'm sure plenty of people on StackOverflow would be more than happy to try and help you solve a specific problem the FP way. Have a list of stuff and want to traverse it and build up some result? Use a fold. Want to parse XML? hxt uses arrows for that. And monads? Well, tons of data types turn out to be Monads, so learn about them and you'll discover a wealth of ways to manipulate those data types. But it's kind of hard to just pull examples out of thin air and say "lenses are the Right Way to do this", "monoids are the best way to do that", etc. How would you explain to a newbie what a for loop is for? If you want to [blank], then use a for loop [in this way]. It's so general; there are tons of ways to use a for loop. The same goes for these FP abstractions.
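To make the "use a fold" advice slightly more concrete: a minimal sketch in plain Scala (no Scalaz; the data and names are invented for the example) that traverses a list and builds up a result in one pass:
object FoldDemo extends App {
  // "Have a list of stuff and want to traverse it and build up some result? Use a fold."
  val words = List("lens", "monad", "lens", "arrow", "monad", "lens")

  // Build up a word-count map in a single traversal.
  val counts = words.foldLeft(Map.empty[String, Int]) { (acc, w) =>
    acc.updated(w, acc.getOrElse(w, 0) + 1)
  }

  println(counts) // Map(lens -> 3, monad -> 2, arrow -> 1) (entry order may vary)
}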
If you have many years of OOP experience, then don't forget you were once a newbie at OOP. It takes time to learn the FP way, and even more time to unlearn some OOP tendencies. Give it time and you will find plenty of uses for a Functional approach.
I gave a talk back in September focused on the practical application of monoids and applicative functors/monads via scalaz.Validation. I gave another version of the same talk at the Scala Lift Off, where the emphasis was more on the validation. I would watch the first talk until I start on validations and then skip to the second talk (27 minutes in).
There's also a gist I wrote which shows how you might use Validation in a "practical" application. That is, if you are designing software for nightclub bouncers.
I think you can take the reverse approach: when writing a small piece of functionality, ask yourself whether any of those would apply: Semigroups, Monoids, Monads, Functors, Lenses, Catamorphisms, Anamorphisms, Arrows... A lot of those concepts can be used in a local way.
Once you start down that route, you may see uses everywhere. For me, I sort of get Semigroups, Monoids, Monads, Functors. So take the example of answering this question: How do I populate a list of objects with new values. It's a real usage for the person asking the question (a self-described noob). I am trying to answer in a simple way, but I have to refrain from scratching the itch "there are monoids in here".
Scratching it now: using foldMap and the fact that Int and List are monoids, and that the monoid property is preserved for tuples, maps and options:
// using scalaz
listVar.sliding(2).toList.foldMap {
  case List(prev, i) => Some(Map(i -> (1, Some(List(math.abs(i - prev))))))
  case List(i) => Some(Map(i -> (1, None)))
  case _ => None
}.map(_.mapValues { case (count, gaps) => (count, gaps.map(_.min)) })
But I don't arrive at that result by thinking "I will use hardcore functional programming". It comes more naturally from thinking "this seems simpler if I compose those monoids", combined with the fact that scalaz has utility methods like foldMap. Interestingly, looking at the resulting code, it's not obvious that I'm thinking in terms of monoids at all.
You might like this talk by Chris Marshall. He covers a couple of Scalaz goodies - namely Monoid and Validation - with many practical examples. Ittay Dror has written a very accessible post on how Functor, Applicative Functor, and Monad can be useful in practice. Eric Torreborre's and Debasish Ghosh's blogs also have a bunch of posts covering use cases for categorical constructs.
This answer just lists a few links instead of providing some real substance here. (Too lazy to write.) Hope you find it helpful anyway.
I understand your situation, but you will find that to learn functional programming you will need to adjust your point of view to the documentation you find, instead of the other way around. Luckily in Scala you have the possibility of becoming a functional programmer gradually.
To answer your questions and explain the point-of-view difference, I need to distinguish between "type classes" (monoids, functors, arrows), mathematically called "structures", and generic operations or algorithms (catamorphisms or folds, anamorphisms or unfolds, etc.). These two often interact, since many generic operations are defined for specific classes of data types.
You look for prescriptive answers similar to design patterns: when does this concept apply? The truth is that you have surely seen the prescriptive answers, and they are simply the definitions of the different concepts. The problem (for you) is that those answers are intrinsically different from design patterns, but it is so for good reasons.
On the one hand, generic algorithms are not design patterns, which suggest a structure for the code you write; they are abstractions defined in the language which you can directly apply. They are general descriptions for common algorithms which you already implement today, but by hand. For instance, whenever you are computing the maximum element of a list by scanning it, you are hardcoding a fold; when you sum elements, you are doing the same; and so on. When you recognize that, you can declare the essence of the operation you are performing by calling the appropriate fold function. This way, you save code and bugs (no opportunity for off-by-one errors), and you save the reader the effort to read all the needed code.
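For instance, here is the hand-written scan for a maximum next to the fold that declares the same essence - a small illustrative Scala sketch (the value names are made up for the example):
object MaxAsFold extends App {
  val xs = List(3, 1, 4, 1, 5, 9, 2, 6)

  // Hand-coded scan: the fold is there, just hidden inside the loop.
  var max = Int.MinValue
  for (x <- xs) if (x > max) max = x

  // Declaring the essence of the operation instead:
  val max2 = xs.foldLeft(Int.MinValue)(_ max _)

  println(max)  // 9
  println(max2) // 9
}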
On the other hand, structures concern not the goal you have in mind but properties of the entities you are modeling. They are more useful for bottom-up software construction than top-down: when defining your data, you can declare that it is, e.g., a monoid. Later, when processing your data, you have the opportunity to use operations on monoids to implement your processing. In some cases it is useful to strive to express your algorithm in terms of the predefined ones. For instance, very often if you need to reduce a tree to a single value, a fold can do most or all of what you need. Of course, you can also declare that your data type is a monoid only when you need a generic algorithm on monoids; but the earlier you notice that, the earlier you can start reusing generic algorithms for monoids.
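As a rough sketch of that bottom-up idea in Scala: a tiny hand-rolled Monoid (written inline rather than using Scalaz's Monoid, so the trait and helper names are only illustrative), declared once for the data and then reused by a single generic algorithm:
object MonoidDemo extends App {
  // A monoid: an associative combine operation plus an identity element.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  // Declaring properties of the data we model, bottom-up:
  implicit val intSum: Monoid[Int] = new Monoid[Int] {
    def empty = 0
    def combine(x: Int, y: Int) = x + y
  }
  implicit val stringConcat: Monoid[String] = new Monoid[String] {
    def empty = ""
    def combine(x: String, y: String) = x + y
  }

  // One generic algorithm, reusable for every type declared to be a monoid:
  def combineAll[A](as: List[A])(implicit m: Monoid[A]): A =
    as.foldLeft(m.empty)(m.combine)

  println(combineAll(List(1, 2, 3)))       // 6
  println(combineAll(List("a", "b", "c"))) // abc
}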
A last piece of advice: most of the documentation you will find about these concepts concerns Haskell, because that language has been around much longer and supports them quite elegantly. Recommended here are Learn You a Haskell for Great Good, a Haskell course for beginners where, among others, chapters 11 to 14 focus on some type classes, and the Typeclassopedia (which contains links to various articles with specific examples). EDIT: Finally, an example application of Monoids, taken from the Typeclassopedia, is here: http://apfelmus.nfshost.com/articles/monoid-fingertree.html. I'm not saying there is little documentation for Scala, just that there is more in Haskell, and Haskell is where the application of these concepts to programming was born.

Would the ability to declare Lisp functions 'pure' be beneficial?

I have been reading a lot about Haskell lately, and the benefits that it derives from being a purely functional language. (I'm not interested in discussing monads for Lisp) It makes sense to me to (at least logically) isolate functions with side-effects as much as possible. I have used setf and other destructive functions plenty, and I recognize the need for them in Lisp and (most of) its derivatives.
Here we go:
Would something like (declare pure) potentially help an optimizing compiler? Or is this a moot point because it already knows?
Would the declaration help in proving a function or program, or at least a subset that was declared as pure? Or is this again something that is unnecessary because it's already obvious to the programmer and compiler and prover?
If for nothing else, would it be useful to a programmer for the compiler to enforce purity for functions carrying this declaration, and would that add to the readability/maintainability of Lisp programs?
Does any of this make any sense? Or am I too tired to even think right now?
I'd appreciate any insights here. Info on compiler implementation or provability is welcome.
EDIT
To clarify, I didn't intend to restrict this question to Common Lisp. It clearly (I think) doesn't apply to certain derivative languages, but I'm also curious if some features of other Lisps may tend to support (or not) this kind of facility.
You have two answers, but neither touches on the real problem.
First, yes, it would obviously be good to know that a function is pure. There is a ton of compiler-level things that would like to know that, as well as user-level things. Given that Lisp languages are so flexible, you could twist things a bit: instead of a "pure" declaration that asks the compiler to try harder or something, you just make the declaration restrict the code in the definition. This way you can guarantee that the function is pure.
You can even do that with additional supporting facilities -- I mentioned two of them in a comment I made on johanbev's answer: add the notion of immutable bindings and immutable data structures. I know that in Common Lisp these are very problematic, especially immutable bindings (since CL loads code by "side-effecting" it into place). But such features will help simplify things, and they're not inconceivable (see for example the Racket implementation, which has immutable pairs and other immutable data structures, and has immutable bindings).
But the real question is what can you do in such restricted functions. Even a very simple looking problem would be infested with issues. (I'm using Scheme-like syntax for this.)
(define-pure (foo x)
  (cons (+ x 1) (bar)))
It seems easy enough to tell that this function is indeed pure; it doesn't do anything obviously impure. It also seems that having define-pure restrict the body and allow only pure code would work fine in this case, and would allow this definition.
Now start with the problems:
It's calling cons, so it assumes that it is also known to be pure. In addition, as I mentioned above, it should rely on cons being what it is, so assume that the cons binding is immutable. Easy, since it's a known builtin. Do the same with bar, of course.
But cons does have a side effect (even if you're talking about Racket's immutable pairs): it allocates a new pair. This seems like a minor and ignorable point, but, for example, if you allow such things to appear in pure functions, then you won't be able to auto-memoize them. The problem is that someone might rely on every foo call returning a new pair -- one that is not-eq to any other existing pair. Seems that to make it fine you need to further restrict pure functions to deal not only with immutable values, but also values where the constructor doesn't always create a new value (eg, it could hash-cons instead of allocate).
But that code also calls bar -- so now you need to make the same assumptions about bar: it must be known to be a pure function, with an immutable binding. Note specifically that bar receives no arguments -- so in that case the compiler could not only require that bar is a pure function, it could also use that information to pre-compute its value. After all, a pure function with no inputs can be reduced to a plain value. (Note, BTW, that Haskell doesn't have zero-argument functions.)
And that brings in another big issue. What if bar is a function of one argument? In that case you'd have an error, and some exception would get thrown... and that's no longer pure. Exceptions are side effects. You now need to know the arity of bar in addition to everything else, and you need to avoid other exceptions. Now, how about that input x -- what happens if it isn't a number? That would throw an exception too, so you need to avoid that as well. This means that you now need a type system.
Change that (+ x 1) to (/ 1 x) and you can see that not only do you need a type system, you need one that is sophisticated enough to distinguish 0s.
Alternatively, you could re-think the whole thing and have new pure arithmetic operations that never throw exceptions -- but with all the other restrictions you're now quite a long way from home, with a language that is radically different.
Finally, there's one more side effect that remains a PITA: what if the definition of bar is (define-pure (bar) (bar))? It certainly is pure according to all of the above restrictions... but diverging is a form of side effect, so even this is no longer kosher. (For example, if you did make your compiler optimize nullary functions to values, then on this example the compiler itself would get stuck in an infinite loop.) (And yes, Haskell doesn't deal with this either; that doesn't make it any less of an issue.)
Given a Lisp function, knowing if it is pure or not is undecidable in general. Of course, necessary conditions and sufficient conditions can be tested at compile time. (If there are no impure operations at all, then the function must be pure; if an impure operation gets executed unconditionally, then the function must be impure; for more complicated cases, the compiler could try to prove that the function is pure or impure, but it will not succeed in all cases.)
If the user can manually annotate a function as pure, then the compiler could either (a.) try harder to prove that the function is pure, ie. spend more time before giving up, or (b.) assume that it is and add optimizations which would not be correct for impure functions (like, say, memoizing results). So, yes, annotating functions as pure could help the compiler if the annotations are assumed to be correct.
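As a language-agnostic illustration of why this optimization hinges on the annotation being correct, here is a hedged Scala sketch (not Lisp; the memoize helper is hypothetical and only for demonstration). Memoizing a pure function is invisible to the caller; memoizing an impure one silently changes the program:
object MemoDemo extends App {
  import scala.collection.mutable

  // Memoization is only correct if f is pure: the same input must always
  // produce the same output, with no observable side effects.
  def memoize[A, B](f: A => B): A => B = {
    val cache = mutable.Map.empty[A, B]
    a => cache.getOrElseUpdate(a, f(a))
  }

  val square: Int => Int = n => n * n // pure
  val fastSquare = memoize(square)
  println(fastSquare(12)) // 144, computed
  println(fastSquare(12)) // 144, served from the cache -- indistinguishable to the caller

  var calls = 0
  val impure: Int => Int = { n => calls += 1; n + calls } // result depends on hidden state
  val memoImpure = memoize(impure)
  println(memoImpure(1)) // 2
  println(memoImpure(1)) // still 2 -- but calling impure(1) directly would now give 3,
                         // so memoization has silently changed the program's behaviour
}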
Apart from heuristics like the "trying harder" idea above, the annotation would not help to prove stuff, because it's not giving any information to the prover. (In other words, the prover could just assume that the annotation is always there before trying.) However, it could make sense to attach to pure functions a proof of their purity.
The compiler could either (a.) check at compile time that functions declared pure really are pure, which is undecidable in general, or (b.) add code that tries to catch side effects in pure functions at runtime and reports them as errors. (a.) would probably be limited to simple heuristics (like "an impure operation gets executed unconditionally"); (b.) would be useful for debugging.
No, it seems to make sense. Hopefully this answer also does.
The usual goodies apply when we can assume purity and referential transparency. We can automatically memoize hotspots. We can automatically parallelize computation. We can do away with a lot of race conditions. We can also use structure sharing for data that we know cannot be modified; for instance, the (quasi) primitive cons() does not need to copy the cons cells in the list it is consing onto, because those cells are not affected in any way by having another cons cell point to them. This example is kind of obvious, but compilers are often good at figuring out more complex structure sharing.
However, actually determining whether a lambda (a function) is pure or referentially transparent is very tricky in Common Lisp. Remember that a call (foo bar) starts by looking at (symbol-function 'foo). So in this case
(defun foo (bar)
  (cons 'zot bar))
foo() is pure.
The next lambda is also pure.
(defun quux ()
  (mapcar #'foo '(zong ding flop)))
However, later on we can redefine foo:
(let ((accu -1))
  (defun foo (bar)
    (incf accu)))
The next call to quux() is no longer pure! The old pure foo() has been redefined to an impure lambda. Yikes. This example may be somewhat contrived, but it's not that uncommon to redefine functions like this, for instance inside a let block. In that case it's not possible to know at compile time what will happen.
Common Lisp has very dynamic semantics, so actually being able to determine control flow and data flow ahead of time (for instance when compiling) is very hard, and in most useful cases entirely undecidable. This is quite typical of languages with dynamic type systems. There are a lot of common idioms in Lisp that you cannot use if you insist on static typing, and it is mainly these that foil any attempt at much meaningful static analysis. We can do it for primitives like cons and friends, but for lambdas involving anything other than primitives we are in much deeper water, especially in cases where we need to look at complex interplay between functions. Remember that a lambda is only pure if all the lambdas it calls are also pure.
Off the top of my head, it might be possible, with some deep macrology, to do away with the redefinition problem. In a sense, each lambda would get an extra argument: a monad that represents the entire state of the Lisp image (we can obviously restrict ourselves to what the function will actually look at). But it's probably more useful to be able to declare purity ourselves, in the sense that we promise the compiler that this lambda is indeed pure. The consequences if it isn't are then undefined, and all sorts of mayhem could ensue...

Transient collections for Scala?

Clojure has a very nice concept of transient collections. Is there a library providing those for Scala (or F#)?
This sounds like a really great concept for a language like F# - thanks for an interesting link!
When programming with arrays, F# programmers use exactly the same pattern: create a mutable array, initialize it imperatively, return it, and then work with it using functions that treat it as immutable, such as Array.map (even though the array can actually still be mutated, since there is no transient array type).
Using the seq<'a> type: One way to do something similar is to convert the data structure to a generic sequence (seq<'a>), which is a read-only view, so you cannot (directly) modify the original data structure through it. For example:
let test () =
    let arr = Array.create 10 0
    for i in 0 .. (arr.Length - 1) do
        arr.[i] <- i * i // some calculation
    Array.toSeq arr
The good thing is that the conversion is usually O(1) (arrays, lists, etc. implement seq<'a> as an interface, so this is just a cast). However, seq<'a> doesn't preserve the properties of the source collection (such as efficient random access), and you can process it only using the generic functions for working with sequences (from the Seq module). Still, I think this is relatively close to the transient-collections pattern.
A similar .NET type is ReadOnlyCollection<'a>, which wraps a collection type (somewhat more powerful than seq<'a>) in an immutable wrapper whose mutating operations throw exceptions.
Related types: For more complicated kinds of collections, F#/.NET usually has both a mutable and an immutable implementation (the immutable one coming from the F# libraries). The types are usually quite different, but sometimes share a common interface. This makes it possible to use one type while you need mutation and convert it to the other once you know you won't need mutation anymore. However, here you need to copy the data between the different structures, so the conversion definitely isn't O(1); it may be somewhere between O(n) and O(n log n).
Examples of similar collections are mutable Dictionary<'Key, 'Value> with immutable Map<'Key, 'Value> and mutable HashSet<'T> or SortedSet<'T> with immutable set<'T> (from F# libraries).
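For completeness, the closest everyday analogue in Scala itself is probably a mutable buffer or builder that is filled imperatively and then frozen into an immutable collection. This is only a rough sketch of the build-mutably-then-share-immutably idiom, not a real transient: unlike Clojure's transients, the final conversion copies the data, so it is O(n):
object BuilderDemo extends App {
  import scala.collection.mutable

  // Build imperatively, much like the F# array example above...
  val buf = mutable.ArrayBuffer.empty[Int]
  for (i <- 0 until 10) buf += i * i // some calculation

  // ...then freeze into an immutable structure and share only that.
  val frozen: Vector[Int] = buf.toVector

  println(frozen) // Vector(0, 1, 4, 9, 16, 25, 36, 49, 64, 81)
}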
I don't know of any library for this in F# (nothing in the standard library, and I don't recall seeing anyone blog anything like this, though there are a number of libraries for simple persistent/immutable structures). It would be great for some third-party to create such a library. Rich Hickey is the man these days when it comes to these awesome practical (mostly-)functional data structures, I love reading about that stuff.
Please have a look at the following post by Daniel Spiewak:
http://www.codecommit.com/blog/scala/implementing-persistent-vectors-in-scala
He also ported Rich Hickey's algorithm to Scala. The article also mentions IntMap, which is almost as fast as the Clojure implementation.