transformations of data structures with structural sharing - scala

In functional programming, it's possible to save a large amount of memory by employing structural sharing. For instance, these two lists are the same, but the second is represented in memory in a much more efficient way:
val n = 4
// n: Int = 4
val l1 = List.tabulate(n)(x => (n-x until n).toList)
// l1: List[List[Int]] = List(List(), List(3), List(2, 3), List(1, 2, 3))
val l2 = List.unfold((n, List.empty[Int])) { case (i, l) =>
  if (i > 0) Some((l, (i - 1, (i - 1) :: l)))
  else None
}
// l2: List[List[Int]] = List(List(), List(3), List(2, 3), List(1, 2, 3))
But the moment you actually do something with l2, this advantage is quickly lost. Not only will the result of l2.map(_.map(_ + 1)) require just as much memory as l1, it's also inefficient, because it will perform n*(n-1)/2 additions, even though there are only n-1 different numbers in this data structure.
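Both claims can be checked with reference equality (eq) and a call counter. The following is a minimal sketch, assuming Scala 2.13 (for List.unfold); the names sharedBefore, sharedAfter, equalAfter and additions are illustrative and not part of the question.

```scala
val n = 4
val l2 = List.unfold((n, List.empty[Int])) { case (i, l) =>
  if (i > 0) Some((l, (i - 1, (i - 1) :: l)))
  else None
}

// In l2, each inner list is literally the tail of the next one (same object).
val sharedBefore = l2(2).tail eq l2(1)

// Count how often the mapped function runs (the var is used only for measuring).
var additions = 0
val mapped = l2.map(_.map { x => additions += 1; x + 1 })

// After mapping, the inner lists are equal by value but are no longer the same objects.
val sharedAfter = mapped(2).tail eq mapped(1)
val equalAfter = mapped(2).tail == mapped(1)
```

With n = 4 the counter ends at 6, which is n*(n-1)/2, although only 3 distinct numbers are stored.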
With a mutable data structure this is easy: you can update the values in place, along with a marker that tells you whether the operation was already performed on that node. This way, you only traverse the data structure once, structural sharing is preserved and you only perform n-1 additions.
But is there some elegant, functional way to achieve this without using mutable data structures?

As far as I know the mutable collections in the standard library don't take advantage of structural sharing, so you'd be implementing one on your own.
l2's structural sharing is somewhat coincidental: a side effect of how it was built up. It's not encoding the underlying structure (in this case that the nth element is the tail of the n+1th element with the last element being all the underlying elements). But you can encode that structure fairly easily (I suspect it may be easier than with the mutable version).
Using the 2.13 APIs:
case class TailsFirst[A](elements: List[A]) extends Seq[List[A]] {
  def length: Int = 1 + elements.length
  def apply(idx: Int): List[A] = {
    require(idx >= 0 && idx < length)
    // the idx-th entry is the suffix of `elements` holding its last idx elements
    elements.drop(elements.size - idx)
  }
  // useful for iteration, which will happen a lot (e.g. in toString)
  lazy val reverseElements: List[A] = elements.reverse
  def iterator: Iterator[List[A]] =
    new Iterator[List[A]] {
      var state: Option[List[A]] = Some(reverseElements)
      var toEmit: List[A] = Nil
      def hasNext: Boolean = state.nonEmpty
      def next(): List[A] = {
        if (hasNext) {
          val ret = toEmit
          state.flatMap(_.headOption) match {
            case None =>
              assume(state.contains(Nil)) // hint for a mythical static analyzer
              state = None
              ret
            case Some(toPrepend) =>
              state = state.map(_.tail)
              toEmit = toPrepend :: toEmit
              ret
          }
        } else throw new NoSuchElementException("exhausted iterator")
      }
    }
  def flatMapElements[B](f: A => List[B]): TailsFirst[B] =
    TailsFirst(elements.flatMap(f))
  def mapElements[B](f: A => B): TailsFirst[B] =
    TailsFirst(elements.map(f))
}
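The core idea (transform the underlying elements once, then rebuild the shared view) can also be sketched without a custom Seq. This is only an illustration under that assumption; tailsFirst, mapShared and calls are hypothetical names, not part of the answer's class.

```scala
// Rebuild the "tails first" view from the underlying elements. Every inner
// list is built as `x :: acc`, so it shares all of its cons cells with the
// previous inner list.
def tailsFirst[A](elements: List[A]): List[List[A]] =
  elements.reverse.scanLeft(List.empty[A])((acc, x) => x :: acc)

// Map the underlying elements once, then rebuild the shared view.
def mapShared[A, B](elements: List[A])(f: A => B): List[List[B]] =
  tailsFirst(elements.map(f))

var calls = 0 // only for measuring how often f runs
val result = mapShared(List(1, 2, 3)) { x => calls += 1; x + 1 }
// result == List(List(), List(4), List(3, 4), List(2, 3, 4)), and f ran 3 times
```

Here result(2).tail eq result(1) holds, so the sharing survives the transformation. The trade-off is that the caller has to keep the underlying element list around instead of the nested list.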

Related

functional programming with scala

I've been trying to improve my knowledge of Scala.
I am trying to implement this function recursively but am having difficulty.
My main question is: how can you declare a parameter that accepts either a list or a number?
depth(x: Any): Int is the signature you want, then pattern match on x to determine if it's a List[_] or not, where _ indicates you don't care what's in the list. (Using Seq[_] would be the more idiomatic Scala type to use, actually.) Note that the example shown is missing a pair of parens, List(1, 2, List(3))... It also assumes that depth(8) == 0 (for example).
A tricky part is that you shouldn't assume that a nested list will either be the first or last element in the "parent" list, i.e., ...List(1,List(2,3),4)... is possible.
A final bit worth mentioning; if you were building a "production" depth method, it would be worth having a Tree abstraction with Node and Leaf concrete types so you can use a better type signature, depth(tree: Tree[_]): Int, to make it explicitly clear when something represents part of the tree structure vs. data in the tree. Using List here is convenient for the exercise, but a bit ambiguous otherwise, since you could have a tree of stuff where some nodes are actually lists.
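The Tree suggestion could look roughly like this. It is a sketch only: the field names value and children are assumptions, and a type parameter is used in place of Tree[_], which serves the same purpose here.

```scala
// Hypothetical tree types: a Leaf holds data, a Node holds child trees.
sealed trait Tree[+A]
final case class Leaf[A](value: A) extends Tree[A]
final case class Node[A](children: List[Tree[A]]) extends Tree[A]

// A leaf has depth 0 (matching depth(8) == 0). A node is one level deeper
// than its deepest child, so an empty node has depth 1.
def depth[A](tree: Tree[A]): Int = tree match {
  case Leaf(_)        => 0
  case Node(children) => 1 + children.map(c => depth(c)).foldLeft(0)(_ max _)
}

// Corresponds to List(1, List(2, 3), 4): the nested list sits in the middle.
val example: Tree[Int] = Node(List(Leaf(1), Node(List(Leaf(2), Leaf(3))), Leaf(4)))
val d = depth(example) // 2
```

Because Tree is sealed, the compiler can check that the match covers every case, which List[Any] cannot offer.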
Here is my attempt at a recursive solution:
def depth(listOrNum: Any): Int = {
  def help(y: Any, c: Int): Int = {
    y match {
      case y: Int => c
      case List(curHead, rest @ _*) =>
        Math.max(help(curHead, c+1), help(rest, c))
      case _ => 0
    }
  }
  help(listOrNum, 0)
}
collect is handy here:
def depth(xs: List[Any]): Int =
  1 + xs.collect{case xs: List[_] => depth(xs)}
        .foldLeft(0)(_ max _)
P.S. I agree with Dean's comments about the type List[Any] being a poor way to represent trees. List[Any] is a type that should never appear in ordinary Scala code, so I'm sad to see it used in an exercise intended for beginners.
If you insist on using a for comprehension, here is an implementation that works with it. Note that it relies on CanBuildFrom, which exists only in Scala 2.12 and earlier.
First, define two helper classes:
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.Builder
// A Builder that folds every added element into a single value instead of building a collection
class Folder[T](init : T, step : (T,T) => T) extends Builder[T,T] {
  private[this] var state = init
  override def += (elem : T) = {
    state = step(state, elem)
    this
  }
  override def clear(): Unit = {
    state = init
  }
  override def result() : T = state
}
class CanBuildFolder[F,T](init : T, step : (T,T) => T) extends CanBuildFrom[F,T,T] {
  override def apply() : Builder[T,T] = new Folder(init, step)
  override def apply(from : F) : Builder[T,T] = new Folder(init, step)
}
Then you can use them with the for comprehension:
import scala.math.max
object Test {
  val example = List(1, List(2, 3), List( List(4, 5), 6, List(7, List(List(8), 9))))
  implicit val maxFolder = new CanBuildFolder[List[Any], Int](0, max)
  def depth(source : List[Any]) : Int =
    for (x <- source) yield x match {
      case l : List[Any] => depth(l) + 1
      case _ => 1
    }
  assert(depth(example) == 5)
}

Insert a new element in a specified position of a list

There is no built-in function or method of List that lets the user add a new element at a certain position of a List. I've written a function that does this, but I'm not sure it's a good idea to do it this way, even though it works perfectly well:
def insert(list: List[Any], i: Int, value: Any) = {
  list.take(i) ++ List(value) ++ list.drop(i)
}
Usage:
scala> insert(List(1,2,3,5), 3, 4)
res62: List[Any] = List(1, 2, 3, 4, 5)
Type Safety
The most glaring thing I see is the lack of type safety / loss of type information. I would make the method generic in the list's element type:
def insert[T](list: List[T], i: Int, value: T) = {
  list.take(i) ++ List(value) ++ list.drop(i)
}
Style
If the body only consists of a single expression, there is no need for curly braces:
def insert[T](list: List[T], i: Int, value: T) =
  list.take(i) ++ List(value) ++ list.drop(i)
Efficiency
@Marth's comment about using List.splitAt to avoid traversing the list twice is also a good one:
def insert[T](list: List[T], i: Int, value: T) = {
  val (front, back) = list.splitAt(i)
  front ++ List(value) ++ back
}
Interface
It would probably be convenient to be able to insert more than one value at a time:
def insert[T](list: List[T], i: Int, values: T*) = {
  val (front, back) = list.splitAt(i)
  front ++ values ++ back
}
Interface, take 2
You could make this an extension method of List:
implicit class ListWithInsert[T](val list: List[T]) extends AnyVal {
  def insert(i: Int, values: T*) = {
    val (front, back) = list.splitAt(i)
    front ++ values ++ back
  }
}
List(1, 2, 3, 6).insert(3, 4, 5)
// => List(1, 2, 3, 4, 5, 6)
Closing remarks
Note, however, that inserting into the middle of the list is just not a good fit for a cons list. You'd be much better off with a (mutable) linked list or a dynamic array instead.
You can also use xs.patch(i, ys, r), which replaces r elements of xs starting with i by the patch ys, by using r=0 and by making ys a singleton:
List(1, 2, 3, 5).patch(3, List(4), 0)
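To make the role of the third argument concrete, here is a small sketch using the same patch method. The value names are illustrative only.

```scala
val xs = List(1, 2, 3, 6)

// r = 0: nothing is removed, so the patch is purely an insertion at index 3.
val inserted = xs.patch(3, List(4, 5), 0)

// r = 1: the element at index 3 is replaced by the patch instead.
val replaced = xs.patch(3, List(4), 1)
```

Here inserted is List(1, 2, 3, 4, 5, 6) and replaced is List(1, 2, 3, 4).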
In the Scala course by his eminence Martin Odersky himself, he implements it similarly to
def insert(list: List[Any], i: Int, value: Any): List[Any] = list match {
  case head :: tail if i > 0 => head :: insert(tail, i-1, value)
  case _ => value :: list
}
One traversal at most.
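The recursive version above combines naturally with the type parameter suggested earlier. The following sketch only adds [T] and shows how the edge cases fall out of the two cases; the value names are illustrative.

```scala
def insert[T](list: List[T], i: Int, value: T): List[T] = list match {
  case head :: tail if i > 0 => head :: insert(tail, i - 1, value)
  case _                     => value :: list
}

val middle   = insert(List(1, 2, 3, 5), 3, 4) // List(1, 2, 3, 4, 5)
val pastEnd  = insert(List(1, 2), 10, 9)      // index beyond the end: appended
val negative = insert(List(1, 2), -1, 0)      // negative index: prepended
```

The recursion is not tail-recursive, so inserting at a very large index uses a correspondingly deep call stack.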

Split a collection to those items that satisfy a predicate, and those that don't [duplicate]

How do I split a sequence into two lists by a predicate?
Alternative: I can use filter and filterNot, or write my own method, but isn't there a better, more general (built-in) method?
Use the partition method:
scala> List(1,2,3,4).partition(x => x % 2 == 0)
res0: (List[Int], List[Int]) = (List(2, 4),List(1, 3))
Good that partition was the thing you wanted. There's another method that also uses a predicate to split a list in two: span.
The first one, partition, will put all "true" elements in one list and the others in the second list.
span will put all elements in the first list until it reaches an element that is "false" (in terms of the predicate). From that point forward, it will put the elements in the second list, whether or not they satisfy the predicate.
scala> Seq(1,2,3,4).span(x => x % 2 == 0)
res0: (Seq[Int], Seq[Int]) = (List(),List(1, 2, 3, 4))
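The difference is easier to see when both methods run on the same input. A small sketch follows; the value names are illustrative.

```scala
val xs = List(2, 4, 1, 6)
val isEven = (x: Int) => x % 2 == 0

// partition looks at every element: all evens on the left, all odds on the right.
val byPartition = xs.partition(isEven)

// span stops at the first element that fails the predicate; the 6 after it
// stays in the second list even though it is even.
val bySpan = xs.span(isEven)
```

Here byPartition is (List(2, 4, 6), List(1)) and bySpan is (List(2, 4), List(1, 6)).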
You might want to take a look at scalex.org - it allows you to search the scala standard library for functions by their signature. For example, type the following:
List[A] => (A => Boolean) => (List[A], List[A])
You would see partition.
You can also use foldLeft if you need something a little extra. I just wrote some code like this when partition didn't cut it:
val list:List[Person] = /* get your list */
val (students,teachers) =
  // note: prepending with :: leaves each result list in reverse order
  list.foldLeft((List.empty[Student], List.empty[Teacher])) {
    case ((acc1, acc2), p) => p match {
      case s:Student => (s :: acc1, acc2)
      case t:Teacher => (acc1, t :: acc2)
    }
  }
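On Scala 2.13 or later, the same split by subtype can be sketched with partitionMap, which keeps the original order and gives each side its own element type. Person, Student and Teacher are not defined in the answer, so the definitions below are assumptions made only to keep the example self-contained.

```scala
// Hypothetical model types standing in for the answer's Person hierarchy.
sealed trait Person
final case class Student(name: String) extends Person
final case class Teacher(name: String) extends Person

val people: List[Person] = List(Student("Ann"), Teacher("Bob"), Student("Cy"))

// Left values end up in the first list, Right values in the second.
val (students, teachers) = people.partitionMap {
  case s: Student => Left(s)
  case t: Teacher => Right(t)
}
```

Here students is a List[Student] and teachers is a List[Teacher], both in input order.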
I know I might be late for the party and there are more specific answers, but you could make good use of groupBy
val ret = List(1,2,3,4).groupBy(x => x % 2 == 0)
ret: scala.collection.immutable.Map[Boolean,List[Int]] = Map(false -> List(1, 3), true -> List(2, 4))
ret(true)
res3: List[Int] = List(2, 4)
ret(false)
res4: List[Int] = List(1, 3)
This makes your code a bit more future-proof if you need to change the condition into something non boolean.
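For instance, switching to a non-boolean key only changes the function passed to groupBy. This is a small sketch and the value names are illustrative.

```scala
// Group by remainder modulo 3 instead of by a true/false condition.
val byRemainder = List(1, 2, 3, 4, 5, 6).groupBy(_ % 3)

// A key that never occurred is simply absent, so a lookup with a default is safer
// than byRemainder(key), which throws for a missing key.
val missing = byRemainder.getOrElse(7, Nil)
```

Here byRemainder maps 0 to List(3, 6), 1 to List(1, 4) and 2 to List(2, 5).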
If you want to split a list into more than 2 pieces and drop the boundary elements, you could use something like this (modify it if you need to search for ints):
def split(list_in: List[String], search: String): List[List[String]] = {
  def split_helper(accum: List[List[String]], list_in2: List[String], search: String): List[List[String]] = {
    // h1 is the chunk before the next separator, h2 starts at the separator (if any)
    val (h1, h2) = list_in2.span(_ != search)
    val new_accum = accum :+ h1
    if (h2.contains(search)) split_helper(new_accum, h2.drop(1), search)
    else new_accum // no separator left: keep the final chunk as well
  }
  split_helper(List(), list_in, search)
}
// TEST
// split(List("a", "b", "c", "d", "c", "a"), "c")
// => List(List(a, b), List(d), List(a))

What's a good and functional way to swap collection elements in Scala?

In a project of mine one common use case keeps coming up. At some point I've got a sorted collection of some kind (List, Seq, etc.; it doesn't matter) and one element of this collection. What I want to do is swap the given element with its following element (if that element exists) or, at other times, with the preceding element.
I'm well aware of the ways to achieve this using procedural programming techniques. My question is what would be a good way to solve the problem by means of functional programming (in Scala)?
Thank you all for your answers. I accepted the one I myself did understand the most. As I'm not a functional programmer (yet) it's kind of hard for me to decide which answer was truly the best. They are all pretty good in my opinion.
The following is a functional version of swapping with the next element in a list: you construct a new list with the elements swapped.
def swapWithNext[T](l: List[T], e : T) : List[T] = l match {
  case Nil => Nil
  case `e`::next::tl => next::e::tl
  case hd::tl => hd::swapWithNext(tl, e)
}
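The question also asks about swapping with the preceding element. The same pattern-matching style covers that case; the following is a sketch, and swapWithPrevious is a name introduced here, not one from the answer.

```scala
def swapWithPrevious[T](l: List[T], e: T): List[T] = l match {
  case prev :: `e` :: tl => e :: prev :: tl            // e found with a predecessor: swap
  case hd :: tl          => hd :: swapWithPrevious(tl, e)
  case Nil               => Nil
}

val swapped   = swapWithPrevious(List(1, 2, 3, 4), 3) // List(1, 3, 2, 4)
val unchanged = swapWithPrevious(List(3, 1), 3)       // 3 is first: nothing to swap
```

Like the original, this only swaps the first occurrence and leaves the list unchanged when the element is missing.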
A zipper is a pure functional data structure with a pointer into that structure. Put another way, it's an element with a context in some structure.
For example, the Scalaz library provides a Zipper class which models a list with a particular element of the list in focus.
You can get a zipper for a list, focused on the first element.
import scalaz._
import Scalaz._
val z: Option[Zipper[Int]] = List(1,2,3,4).toZipper
You can move the focus of the zipper using methods on Zipper, for example, you can move to the next offset from the current focus.
val z2: Option[Zipper[Int]] = z >>= (_.next)
This is like List.tail except that it remembers where it has been.
Then, once you have your chosen element in focus, you can modify the elements around the focus.
val swappedWithNext: Option[Zipper[Int]] =
  for (x <- z2;
       y <- x.delete)
    yield y.insertLeft(x.focus)
Note: this is with the latest Scalaz trunk head, in which a bug with Zipper's tail-recursive find and move methods has been fixed.
The method you want is then just:
def swapWithNext[T](l: List[T], p: T => Boolean) : List[T] = (for {
  z <- l.toZipper
  y <- z.findZ(p)
  x <- y.delete
} yield x.insertLeft(y.focus).toStream.toList) getOrElse l
This matches an element based on a predicate p. But you can go further and consider all nearby elements as well. For instance, to implement an insertion sort.
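To show what a zipper does without depending on Scalaz, here is a hand-rolled minimal list zipper. It is an illustrative sketch, not the Scalaz implementation; ListZipper, fromList and the method names are all assumptions.

```scala
// `left` holds the elements before the focus in reverse order, so moving the
// focus one step only touches the heads of `left` and `right`.
final case class ListZipper[A](left: List[A], focus: A, right: List[A]) {
  // Move the focus to the next element, if there is one.
  def next: Option[ListZipper[A]] = right match {
    case r :: rs => Some(ListZipper(focus :: left, r, rs))
    case Nil     => None
  }
  // Swap the focused element with its successor; the focus stays on the same element.
  def swapWithNext: ListZipper[A] = right match {
    case r :: rs => ListZipper(r :: left, focus, rs)
    case Nil     => this
  }
  def toList: List[A] = left.reverse ::: (focus :: right)
}

// An empty list has no element to focus on, hence the Option.
def fromList[A](xs: List[A]): Option[ListZipper[A]] = xs match {
  case h :: t => Some(ListZipper(Nil, h, t))
  case Nil    => None
}

// Focus on the third element (3) and swap it with its successor.
val swappedList = fromList(List(1, 2, 3, 4))
  .flatMap(_.next)
  .flatMap(_.next)
  .map(_.swapWithNext.toList)
```

Here swappedList is Some(List(1, 2, 4, 3)). The swap itself is constant time once the focus is in place.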
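A generic version of Landei's (it relies on CanBuildFrom and SeqLike, so it applies to Scala 2.12 and earlier):
import scala.collection.generic.CanBuildFrom
import scala.collection.SeqLike
def swapWithNext[A,CC](cc: CC, e: A)(implicit w1: CC => SeqLike[A,CC],
                        w2: CanBuildFrom[CC,A,CC]): CC = {
  val seq: SeqLike[A,CC] = cc
  val (h,t) = seq.span(_ != e)
  if (t.isEmpty) cc // e does not occur: nothing to swap
  else {
    val (m,l) = (t.head,t.tail)
    if(l.isEmpty) cc // e is the last element: nothing to swap with
    else (h :+ l.head :+ m) ++ l.tail
  }
}
Some usages: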
scala> swapWithNext(List(1,2,3,4),3)
res0: List[Int] = List(1, 2, 4, 3)
scala> swapWithNext("abcdef",'d')
res2: java.lang.String = abcedf
scala> swapWithNext(Array(1,2,3,4,5),2)
res3: Array[Int] = Array(1, 3, 2, 4, 5)
scala> swapWithNext(Seq(1,2,3,4),3)
res4: Seq[Int] = List(1, 2, 4, 3)
An alternative implementation for venechka's method:
def swapWithNext[T](l: List[T], e: T): List[T] = {
  val (h,t) = l.span(_ != e)
  h ::: t.tail.head :: e :: t.tail.tail
}
Note that this fails with an error if e is the last element or does not occur in the list at all.
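A total variant of the span-based approach can be sketched by matching on the result of span, so that the "last element" and "not found" cases fall through to the unchanged list. The name swapWithNextSafe is introduced here for illustration.

```scala
def swapWithNextSafe[T](l: List[T], e: T): List[T] =
  l.span(_ != e) match {
    // e was found and has a successor: rebuild with the two elements swapped
    case (h, `e` :: next :: rest) => h ::: next :: e :: rest
    // e is missing or is the last element: return the list unchanged
    case _ => l
  }

val swappedMiddle = swapWithNextSafe(List(1, 2, 3, 4), 3) // List(1, 2, 4, 3)
val lastElement   = swapWithNextSafe(List(1, 2, 3, 4), 4) // unchanged
val notFound      = swapWithNextSafe(List(1, 2, 3, 4), 9) // unchanged
```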
If you know both elements, and every element occurs only once, it gets more elegant:
def swap[T](l: List[T], a: T, b: T): List[T] = l.map {
  case `a` => b
  case `b` => a
  case e => e
}
How about:
// removes the element at index n
def dropIndex[T](xs: List[T], n: Int): List[T] = {
  val (l1, l2) = xs splitAt n
  l1 ::: (l2 drop 1)
}
val identifierPosition = 3
val l = "this is a identifierhere here"
val sl = l.split(" ").toList
val elementAtPos = sl(identifierPosition)
// moves the element at identifierPosition to the front of the list
val swapped = elementAtPos :: dropIndex(sl, identifierPosition)
println(swapped)
Kudos to http://www.scala-lang.org/old/node/5286 for the dropIndex function.