Access element of an Array and return a monad? - scala

If I access an index outside the bounds of an Array, I get an ArrayIndexOutOfBoundsException, eg:
val a = new Array[String](3)
a(4)
java.lang.ArrayIndexOutOfBoundsException: 4
Is there a method to return a monad instead (eg: Option)? And why doesn't the default collections apply method for Array support this?

You can use lift:
a.lift(4) // None
a.lift(2) // Some(null)
Array[T] is a PartialFunction[Int, T] and lift creates a Function[Int, Option[T]] from the index to an option of the element type.

You could use scala.util.Try:
scala> val a = new Array[String](3)
a: Array[String] = Array(null, null, null)
scala> import scala.util.Try
import scala.util.Try
scala> val fourth = Try(a(3))
third: scala.util.Try[String] = Failure(java.lang.ArrayIndexOutOfBoundsException
: 3)
scala> val third = Try(a(2))
third: scala.util.Try[String] = Success(null)
Another good idea is not using Array in the first place, but that's outside the scope of this question.
Where it is relevant to your question though is why this behaviour. Array is intended to function like a Java array and be compatible with a Java Array. Since this is how Java arrays work, this is how Array works.

Related

What is the difference between List.view and LazyList?

I am new to Scala and I just learned that LazyList was created to replace Stream, and at the same time they added the .view methods to all collections.
So, I am wondering why was LazyList added to Scala collections library, when we can do List.view?
I just looked at the Scaladoc, and it seems that the only difference is that LazyList has memoization, while View does not. Am I right or wrong?
Stream elements are realized lazily except for the 1st (head) element. That was seen as a deficiency.
A List view is re-evaluated lazily but, as far as I know, has to be completely realized first.
def bang :Int = {print("BANG! ");1}
LazyList.fill(4)(bang) //res0: LazyList[Int] = LazyList(<not computed>)
Stream.fill(3)(bang) //BANG! res1: Stream[Int] = Stream(1, <not computed>)
List.fill(2)(bang).view //BANG! BANG! res2: SeqView[Int] = SeqView(<not computed>)
In 2.13, you can't force your way back from a view to the original collection type:
scala> case class C(n: Int) { def bump = new C(n+1).tap(i => println(s"bump to $i")) }
defined class C
scala> List(C(42)).map(_.bump)
bump to C(43)
res0: List[C] = List(C(43))
scala> List(C(42)).view.map(_.bump)
res1: scala.collection.SeqView[C] = SeqView(<not computed>)
scala> .force
^
warning: method force in trait View is deprecated (since 2.13.0): Views no longer know about their underlying collection type; .force always returns an IndexedSeq
bump to C(43)
res2: scala.collection.IndexedSeq[C] = Vector(C(43))
scala> LazyList(C(42)).map(_.bump)
res3: scala.collection.immutable.LazyList[C] = LazyList(<not computed>)
scala> .force
bump to C(43)
res4: res3.type = LazyList(C(43))
A function taking a view and optionally returning a strict realization would have to also take a "forcing function" such as _.toList, if the caller needs to choose the result type.
I don't do this sort of thing at my day job, but this behavior surprises me.
The difference is that LazyList can be generated from huge/infinite sequence, so you can do something like:
val xs = (1 to 1_000_000_000).to(LazyList)
And that won't run out of memory. After that you can operate on the lazy list with transformers. You won't be able to do the same by creating a List and taking a view from it. Having said that, SeqView has a much reacher set of methods compared to LazyList and that's why you can actually take a view of a LazyList like:
val xs = (1 to 1_000_000_000).to(LazyList)
val listView = xs.view

Cats Seq[Xor[A,B]] => Xor[A, Seq[B]]

I have a sequence of Errors or Views (Seq[Xor[Error,View]])
I want to map this to an Xor of the first error (if any) or a Sequence of Views
(Xor[Error, Seq[View]]) or possibly simply (Xor[Seq[Error],Seq[View])
How can I do this?
You can use sequenceU provided by the bitraverse syntax, similar to as you would do with scalaz. It doesn't seem like the proper type classes exist for Seq though, but you can use List.
import cats._, data._, implicits._, syntax.bitraverse._
case class Error(msg: String)
case class View(content: String)
val errors: List[Xor[Error, View]] = List(
Xor.Right(View("abc")), Xor.Left(Error("error!")),
Xor.Right(View("xyz"))
)
val successes: List[Xor[Error, View]] = List(
Xor.Right(View("abc")),
Xor.Right(View("xyz"))
)
scala> errors.sequenceU
res1: cats.data.Xor[Error,List[View]] = Left(Error(error!))
scala> successes.sequenceU
res2: cats.data.Xor[Error,List[View]] = Right(List(View(abc), View(xyz)))
In the most recent version of Cats Xor is removed and now the standard Scala Either data type is used.
Michael Zajac showed correctly that you can use sequence or sequenceU (which is actually defined on Traverse not Bitraverse) to get an Either[Error, List[View]].
import cats.implicits._
val xs: List[Either[Error, View]] = ???
val errorOrViews: Either[Error, List[View]] = xs.sequenceU
You might want to look at traverse (which is like a map and a sequence), which you can use most of the time instead of sequence.
If you want all failing errors, you cannot use Either, but you can use Validated (or ValidatedNel, which is just a type alias for Validated[NonEmptyList[A], B].
import cats.data.{NonEmptyList, ValidatedNel}
val errorsOrViews: ValidatedNel[Error, List[View]] = xs.traverseU(_.toValidatedNel)
val errorsOrViews2: Either[NonEmptyList[Error], List[View]] = errorsOrViews.toEither
You could also get the errors and the views by using MonadCombine.separate :
val errorsAndViews: (List[Error], List[View]) = xs.separate
You can find more examples and information on Either and Validated on the Cats website.

Best way to handle Error on basic Array

val myArray = Array("1", "2")
val error = myArray(5)//throws an ArrayOutOfBoundsException
myArray has no fixed size, which explains why a call like performed on the above second line might happen.
First, I never really understood the reasons to use error handling for expected errors. Am I wrong to consider this practice as bad, resulting from poor coding skills or an inclination towards laziness?
What would be the best way to handle the above case?
What I am leaning towards: basic implementation (condition) to prevent accessing the data like depicted;
use Option;
use Try or Either;
use a try-catch block.
1 Avoid addressing elements through index
Scala offers a rich set of collection operations that are applied to Arrays through ArrayOps implicit conversions. This lets us use combinators like map, flatMap, take, drop, .... on arrays instead of addressing elements by index.
2 Prevent access out of range
An example I've seen often when parsing CSV-like data (in Spark):
case class Record(id:String, name: String, address:String)
val RecordSize = 3
val csvData = // some comma separated data
val records = csvData.map(line => line.split(","))
.collect{case arr if (arr.size == RecordSize) =>
Record(arr(0), arr(1), arr(2))}
3 Use checks that fit in the current context
If we are using monadic constructs to compose access to some resource, use a fitting way of lift errors to the application flow:
e.g. Imagine we are retrieving user preferences from some repository and we want the first one:
Option
def getUserById(id:ID):Option[User]
def getPreferences(user:User) : Option[Array[Preferences]]
val topPreference = for {
user <- userById(id)
preferences <- getPreferences(user)
topPreference <- preferences.lift(0)
} yield topPreference
(or even better, applying advice #1):
val topPreference = for {
user <- userById(id)
preferences <- getPreferences(user)
topPreference <- preferences.headOption
} yield topPreference
Try
def getUserById(id:ID): Try[User]
def getPreferences(user:User) : Try[Array[Preferences]]
val topPreference = for {
user <- userById(id)
preferences <- getPreferences(user)
topPreference <- Try(preferences(0))
} yield topPreference
As general guidance: Use the principle of least power.
If possible, use error-free combinators: = array.drop(4).take(1)
If all that matters is having an element or not, use Option
If we need to preserve the reason why we could not find an element, use Try.
Let the types and context of the program guide you.
If indexing myArray can be expected to error on occasion, then it sounds like Option would be the way to go.
myArray.lift(1) // Option[String] = Some(2)
myArray.lift(5) // Option[String] = None
You could use Try() but why bother if you already know what the error is and you're not interested in catching or reporting it?
Use arr.lift (available in standard library) which returns Option instead of throwing exception.
if not use safely
Try to access the element safely to avoid accidentally throwing exceptions in middle of the code.
implicit class ArrUtils[T](arr: Array[T]) {
import scala.util.Try
def safely(index: Int): Option[T] = Try(arr(index)).toOption
}
Usage:
arr.safely(4)
REPL
scala> val arr = Array(1, 2, 3)
arr: Array[Int] = Array(1, 2, 3)
scala> implicit class ArrUtils[T](arr: Array[T]) {
import scala.util.Try
def safely(index: Int): Option[T] = Try(arr(index)).toOption
}
defined class ArrUtils
scala> arr.safely(4)
res5: Option[Int] = None
scala> arr.safely(1)
res6: Option[Int] = Some(2)

Is there a way to get a Scala HashMap to automatically initialize values?'

I thought it could be done as follows
val hash = new HashMap[String, ListBuffer[Int]].withDefaultValue(ListBuffer())
hash("A").append(1)
hash("B").append(2)
println(hash("B").head)
However the above prints the unintuitive value of 1. I would like
hash("B").append(2)
To do something like the following behind the scenes
if (!hash.contains("B")) hash.put("B", ListBuffer())
Use getOrElseUpdate to provide the default value at the point of access:
scala> import collection.mutable._
import collection.mutable._
scala> def defaultValue = ListBuffer[Int]()
defaultValue: scala.collection.mutable.ListBuffer[Int]
scala> val hash = new HashMap[String, ListBuffer[Int]]
hash: scala.collection.mutable.HashMap[String,scala.collection.mutable.ListBuffer[Int]] = Map()
scala> hash.getOrElseUpdate("A", defaultValue).append(1)
scala> hash.getOrElseUpdate("B", defaultValue).append(2)
scala> println(hash("B").head)
2
withDefaultValue uses exactly the same value each time. In your case, it's the same empty ListBuffer that gets shared by everyone.
If you use withDefault instead, you could generate a new ListBuffer every time, but it wouldn't get stored.
So what you'd really like is a method that would know to add the default value. You can create such a method inside a wrapper class and then write an implicit conversion:
class InstantiateDefaults[A,B](h: collection.mutable.Map[A,B]) {
def retrieve(a: A) = h.getOrElseUpdate(a, h(a))
}
implicit def hash_can_instantiate[A,B](h: collection.mutable.Map[A,B]) = {
new InstantiateDefaults(h)
}
Now your code works as desired (except for the extra method name, which you could pick to be shorter if you wanted):
val hash = new collection.mutable.HashMap[
String, collection.mutable.ListBuffer[Int]
].withDefault(_ => collection.mutable.ListBuffer())
scala> hash.retrieve("A").append(1)
scala> hash.retrieve("B").append(2)
scala> hash("B").head
res28: Int = 2
Note that the solution (with the implicit) doesn't need to know the default value itself at all, so you can do this once and then default-with-addition to your heart's content.

How to use mutable collections in Scala

I think I may be failing to understand how mutable collections work. I would expect mutable collections to be affected by applying map to them or adding new elements, however:
scala> val s: collection.mutable.Seq[Int] = collection.mutable.Seq(1)
s: scala.collection.mutable.Seq[Int] = ArrayBuffer(1)
scala> s :+ 2 //appended an element
res32: scala.collection.mutable.Seq[Int] = ArrayBuffer(1, 2)
scala> s //the original collection is unchanged
res33: scala.collection.mutable.Seq[Int] = ArrayBuffer(1)
scala> s.map(_.toString) //mapped a function to it
res34: scala.collection.mutable.Seq[java.lang.String] = ArrayBuffer(1)
scala> s //original is unchanged
res35: scala.collection.mutable.Seq[Int] = ArrayBuffer(1)
//maybe mapping a function that changes the type of the collection shouldn't work
//try Int => Int
scala> s.map(_ + 1)
res36: scala.collection.mutable.Seq[Int] = ArrayBuffer(2)
scala> s //original unchanged
res37: scala.collection.mutable.Seq[Int] = ArrayBuffer(1)
This behaviour doesn't seem to be separate from the immutable collections, so when do they behave separately?
For both immutable and mutable collections, :+ and +: create new collections. If you want mutable collections that automatically grow, use the += and +=: methods defined by collection.mutable.Buffer.
Similarly, map returns a new collection — look for transform to change the collection in place.
map operation applies the given function to all the elements of collection, and produces a new collection.
The operation you are looking for is called transform. You can think of it as an in-place map except that the transformation function has to be of type a -> a instead of a -> b.
scala> import collection.mutable.Buffer
import collection.mutable.Buffer
scala> Buffer(6, 3, 90)
res1: scala.collection.mutable.Buffer[Int] = ArrayBuffer(6, 3, 90)
scala> res1 transform { 2 * }
res2: res1.type = ArrayBuffer(12, 6, 180)
scala> res1
res3: scala.collection.mutable.Buffer[Int] = ArrayBuffer(12, 6, 180)
The map method never modifies the collection on which you call it. The type system wouldn't allow such an in-place map implementation to exist - unless you changed its type signature, so that on some type Collection[A] you could only map using a function of type A => A.
(Edit: as other answers have pointed out, there is such a method called transform!)
Because map creates a new collection, you can go from a Collection[A] to a Collection[B] using a function A => B, which is much more useful.
As others have noted, the map and :+ methods return a modified copy of the collection. More generally, all methods defined in collections from the scala.collection package will never modify a collection in place even when the dynamic type of the collection is from scala.collection.mutable. To modify a collection in place, look for methods defined in scala.collection.mutable._ but not in scala.collection._.
For example, :+ is defined in scala.collection.Seq, so it will never perform in-place modification even when the dynamic type is a scala.collection.mutable.ArrayBuffer. However, +=, which is defined in scala.collection.mutable.ArrayBuffer and not in scala.collection.Seq, will modify the collection in place.