Semigroup typeclass (Either) with slightly altered combine - scala

Using cats.Semigroup one can write this:
import cats.Semigroup
import cats.implicits._
val l1: String Either Int = Left("error")
val r1: String Either Int = Right(1)
val r2: String Either Int = Right(2)
l1 |+| r1 // Left("error")
r1 |+| r2 // Right(3)
I would like to have an equally idiomatic operator (combine-like) that works like this:
if there is (at least) one Right in my computation, return a Right
if there are only Lefts, return a Left
E.g.:
Right(1) |+| Right(2) // Right(3)
Right(1) |+| Left("2") // Right(1)
Left("1") |+| Left("2") // Left("12") // in my particular case the wrapped value here does not really matter (could also be e.g. Left("1") or Left("2")), but I guess Left("12") would be the most logical result
Is there something like this already defined in e.g. cats on Either?

There are a bunch of lawful semigroup instances for Either, and which of them should be included in Cats was a matter of some debate. Cats, Scalaz, and Haskell all make different choices in this respect. The instance you're describing (flipped, but with both lefts and rights combining) is different from all three of those; it doesn't have a specific name that I'm aware of, and it isn't provided under any name or in any form by Cats.
That's of course not a problem in itself, since as we'll see below it's pretty easy to verify that this instance is lawful, but there is one potential issue you should be aware of. You don't really explain your intended semantics, but if you ever want to promote this to a Monoid, the fact that you pick the Right when you have both a Left and a Right means that your zero will have to be Left. This might be kind of weird if you're thinking of rights as successes and lefts as errors that are safe to ignore when combining values.
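To make the point about the zero concrete, here is a minimal sketch that uses only the standard library. The object name RightBiasedCombine, the Result alias and the hand-written combine are assumptions for illustration (they are not Cats API); the function is the combine from the question, specialised to Either[String, Int].

```scala
object RightBiasedCombine {
  type Result = Either[String, Int]

  // Hand-written version of the combine from the question,
  // specialised to String lefts and Int rights.
  def combine(x: Result, y: Result): Result = (x, y) match {
    case (Right(a), Right(b))    => Right(a + b)
    case (r @ Right(_), Left(_)) => r
    case (Left(_), r @ Right(_)) => r
    case (Left(a), Left(b))      => Left(a + b)
  }

  // Candidate identity element: a Left wrapping the empty String.
  val empty: Result = Left("")
}
```

With this definition a Right can never act as the identity: combining Right(0) with any Left drops the Left, whereas Left("") leaves both lefts and rights unchanged.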
You're asking about Semigroup, though, not Monoid, so let's just ignore that for now and show that this thing is lawful. First for the definition:
import cats.kernel.Semigroup
implicit def eitherSemigroup[A, B](implicit
A: Semigroup[A],
B: Semigroup[B]
): Semigroup[Either[A, B]] = Semigroup.instance {
case (Right(x), Right(y)) => Right(B.combine(x, y))
case (r @ Right(_), Left(_)) => r
case (Left(_), r @ Right(_)) => r
case (Left(x), Left(y)) => Left(A.combine(x, y))
}
And then the checking part:
import cats.instances.int._
import cats.instances.string._
import cats.kernel.instances.either.catsStdEqForEither
import cats.kernel.laws.discipline.SemigroupTests
import org.scalacheck.Test.Parameters
SemigroupTests(eitherSemigroup[String, Int]).semigroup.all.check(Parameters.default)
And yeah, it's fine:
+ semigroup.associative: OK, passed 100 tests.
+ semigroup.combineAllOption: OK, passed 100 tests.
+ semigroup.repeat1: OK, passed 100 tests.
+ semigroup.repeat2: OK, passed 100 tests.
Personally if I wanted something like this I'd probably use a wrapper to avoid confusing future readers of my code (including myself), but given that nobody really knows what the semigroup of Either should do, I don't think using a custom instance is as big of a problem as it is for most other types from the standard library.

Related

Scala: isInstanceOf followed by asInstanceOf

In my team, I often see teammates writing
list.filter(_.isInstanceOf[T]).map(_.asInstanceOf[T])
but this seems a bit redundant to me.
If we know that everything in the filtered list is an instance of T then why should we have to explicitly cast it as such?
I know of one alternative, which is to use match.
eg:
list.match {
case thing: T => Some(thing)
case _ => None
}
but this has the drawback that we must then explicitly state the generic case.
So, given all the above, I have 2 questions:
1) Is there another (better?) way to do the same thing?
2) If not, which of the two options above should be preferred?
You can use collect:
list collect {
case el: T => el
}
Real types just work (barring type erasure, of course):
scala> List(10, "foo", true) collect { case el: Int => el }
res5: List[Int] = List(10)
But, as @YuvalItzchakov has mentioned, if you want to match for an abstract type T, you must have an implicit ClassTag[T] in scope.
So a function implementing this may look as follows:
import scala.reflect.ClassTag
def filter[T: ClassTag](list: List[Any]): List[T] = list collect {
case el: T => el
}
And using it:
scala> filter[Int](List(1, "foo", true))
res6: List[Int] = List(1)
scala> filter[String](List(1, "foo", true))
res7: List[String] = List(foo)
collect takes a PartialFunction, so you shouldn't provide the generic case.
But if needed, you can convert a function A => Option[B] to a PartialFunction[A, B] with Function.unlift. Here is an example of that, also using shapeless.Typeable to work around type erasure:
import shapeless.Typeable
import shapeless.syntax.typeable._
def filter[T: Typeable](list: List[Any]): List[T] =
list collect Function.unlift(_.cast[T])
Using:
scala> filter[Option[Int]](List(Some(10), Some("foo"), true))
res9: List[Option[Int]] = List(Some(10))
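The Function.unlift trick is not tied to shapeless; it should work with any function that returns an Option. A small standard-library sketch follows, where UnliftDemo, parseInt and ints are hypothetical names chosen for the example:

```scala
import scala.util.Try

object UnliftDemo {
  // A total function that signals "no result" with None.
  val parseInt: String => Option[Int] = s => Try(s.trim.toInt).toOption

  // Function.unlift turns String => Option[Int] into a PartialFunction
  // that is defined exactly where parseInt returns Some.
  val parsed: PartialFunction[String, Int] = Function.unlift(parseInt)

  // collect keeps only the inputs on which the partial function is defined.
  def ints(xs: List[String]): List[Int] = xs.collect(parsed)
}
```

Here UnliftDemo.ints(List("1", "x", "3")) keeps the two parseable entries and silently drops "x".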
but this seems a bit redundant to me.
Perhaps programmers in your team are trying to shield that piece of code from someone mistakenly inserting a type other than T, assuming this is some sort of collection with type Any. Otherwise, the first mistake you make, you'll blow up at run-time, which is never fun.
I know of one alternative, which is to use match.
Your sample code won't work because of type erasure. If you want to match on underlying types, you need to use ClassTag and TypeTag respectively for each case, and use =:= for type equality and <:< for subtyping relationships.
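A rough sketch of what erasure does to a type pattern on an abstract T, next to the ClassTag-based variant. The names ErasureDemo, uncheckedFilter and checkedFilter are made up for this illustration, and @unchecked is only there to silence the compiler warning:

```scala
import scala.reflect.ClassTag

object ErasureDemo {
  // No ClassTag: T is erased, so the type test degenerates and
  // every element passes through the "filter".
  def uncheckedFilter[T](list: List[Any]): List[T] =
    list.collect { case el: T @unchecked => el }

  // With a ClassTag in scope the compiler uses it for the type test,
  // so only real instances of T are kept.
  def checkedFilter[T: ClassTag](list: List[Any]): List[T] =
    list.collect { case el: T => el }
}
```

Calling uncheckedFilter[Int] on List(1, "foo", true) returns all three elements typed as List[Int], which is exactly the kind of result that blows up later with a ClassCastException.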
Is there another (better?) way to do the same thing?
Yes, work with the type system, not against it. Use typed collections when you can. You haven't elaborated on why you need to use run-time checks and casts on types, so I'm assuming there is a reasonable explanation to that.
If not, which of the two options above should be preferred?
That's a matter of taste, but using pattern matching on types can be more error-prone since one has to be aware of the fact that types are erased at run-time, and create a bit more boilerplate code for you to maintain.

Enforcing non-emptiness of scala varargs at compile time

I have a function that expects a variable number of parameters of the same type, which sounds like the textbook use case for varargs:
def myFunc[A](as: A*) = ???
The problem I have is that myFunc cannot accept empty parameter lists. There's a trivial way of enforcing that at runtime:
def myFunc[A](as: A*) = {
require(as.nonEmpty)
???
}
The problem with that is that it happens at runtime, as opposed to compile time. I would like the compiler to reject myFunc().
One possible solution would be:
def myFunc[A](head: A, tail: A*) = ???
And this works when myFunc is called with inline arguments, but I'd like users of my library to be able to pass in a List[A], which this syntax makes very awkward.
I could try to have both:
def myFunc[A](head: A, tail: A*) = myFunc(head +: tail)
def myFunc[A](as: A*) = ???
But we're right back where we started: there's now a way of calling myFunc with an empty parameter list.
I'm aware of scalaz's NonEmptyList, but as much as possible, I'd like to stay with stdlib types.
Is there a way to achieve what I have in mind with just the standard library, or do I need to accept some runtime error handling for something that really feels like the compiler should be able to deal with?
What about something like this?
scala> :paste
// Entering paste mode (ctrl-D to finish)
def myFunc()(implicit ev: Nothing) = ???
def myFunc[A](as: A*) = println(as)
// Exiting paste mode, now interpreting.
myFunc: ()(implicit ev: Nothing)Nothing <and> [A](as: A*)Unit
myFunc: ()(implicit ev: Nothing)Nothing <and> [A](as: A*)Unit
scala> myFunc(3)
WrappedArray(3)
scala> myFunc(List(3): _*)
List(3)
scala> myFunc()
<console>:13: error: could not find implicit value for parameter ev: Nothing
myFunc()
^
scala>
Replacing Nothing with a class that has an appropriate implicitNotFound annotation should allow for a sensible error message.
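A sketch of that variation, assuming a marker trait for which no implicit instance is ever defined. The names NonEmptyVarargs and EmptyCallNotAllowed, and the message text, are invented for this example:

```scala
import scala.annotation.implicitNotFound

object NonEmptyVarargs {
  // Hypothetical marker type: no implicit instance exists anywhere,
  // so any call that needs one should fail with the message below.
  @implicitNotFound("myFunc needs at least one argument")
  sealed trait EmptyCallNotAllowed

  // Selected for myFunc(); expected to be rejected at compile time.
  def myFunc()(implicit ev: EmptyCallNotAllowed): Nothing =
    throw new IllegalStateException("unreachable")

  // Selected for every call with one or more arguments.
  def myFunc[A](as: A*): Seq[A] = as
}
```

Calls such as NonEmptyVarargs.myFunc(3) and NonEmptyVarargs.myFunc(List(1, 2): _*) go to the varargs overload, just as in the REPL session above. Note that myFunc(List.empty[Int]: _*) still compiles, so this only guards the literal empty call.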
Let's start out with what I think is your base requirement: the ability to define myFunc in some way such that the following occurs at the Scala console when a user provides literals. Then maybe if we can achieve that, we can try to go for varargs.
myFunc(List(1)) // no problem
myFunc(List[Int]()) // compile error!
Moreover, we don't want to have to force users either to split a list into a head and tail or have them convert to a ::.
Well when we're given literals, since we have access to the syntax used to construct the value, we can use macros to verify that a list is non-empty. Moreover, there's already a library that'll do it for us, namely refined!
scala> refineMV[NonEmpty]("Hello")
res2: String Refined NonEmpty = Hello
scala> refineMV[NonEmpty]("")
<console>:39: error: Predicate isEmpty() did not fail.
refineMV[NonEmpty]("")
^
Unfortunately this is still problematic in your case, because you'll need to put refineMV into the body of your function at which point the literal syntactically disappears and macro magic fails.
Okay what about the general case that doesn't rely on syntax?
// Can we do this?
val xs = getListOfIntsFromStdin() // Pretend this function exists
myFunc(xs) // compile error if xs is empty
Well now we're up against a wall; there's no way a compile time error can happen here since the code has already been compiled and yet clearly xs could be empty. We'll have to deal with this case at runtime, either in a type-safe manner with Option and the like or with something like runtime exceptions. But maybe we can do a little better than just throw our hands up in the air. There's two possible paths of improvement.
Somehow provide implicit evidence that xs is nonempty. If the compiler can find that evidence, then great! If not, it's on the user to provide it somehow at runtime.
Track the provenance of xs through your program and statically prove that it must be non-empty. If this cannot be proved, either error out at compile time or somehow force the user to handle the empty case.
Once again, unfortunately this is problematic.
I strongly suspect this is not possible (but this is still only a suspicion and I would be happy to be proved wrong). The reason is that ultimately implicit resolution is type-directed which means that Scala gets the ability to do type-level computation on types, but Scala has no mechanism that I know of to do type-level computation on values (i.e. dependent typing). We require the latter here because List(1, 2, 3) and List[Int]() are indistinguishable at the type level.
Now you're in SMT solver land, which does have some efforts in other languages (hello Liquid Haskell!). Sadly I don't know of any such efforts in Scala (and I imagine it would be a harder task to do in Scala).
The bottom line is that when it comes to error checking there is no free lunch. A compiler can't magically make error handling go away (although it can tell you when you don't strictly need it), the best it can do is yell at you when you forget to handle certain classes of errors, which is itself very valuable. To underscore the no free lunch point, let's return to a language that does have dependent types (Idris) and see how it handles non-empty values of List and the prototypical function that breaks on empty lists, List.head.
First we get a compile error on empty lists
Idris> List.head []
(input):1:11:When checking argument ok to function Prelude.List.head:
Can't find a value of type
NonEmpty []
Good, what about non-empty lists, even if they're obfuscated by a couple of leaps?
Idris> :let x = 5
-- Below is equivalent to
-- val y = identity(Some(x).getOrElse(3))
Idris> :let y = maybe 3 id (Just x)
-- Idris makes a distinction between Natural numbers and Integers
-- Disregarding the Integer to Nat conversion, this is
-- val z = Stream.continually(2).take(y)
Idris> :let z = Stream.take (fromIntegerNat y) (Stream.repeat 2)
Idris> List.head z
2 : Integer
It somehow works! What if we really don't let the Idris compiler know anything about the number we pass along and instead get one at runtime from the user? We blow up with a truly gargantuan error message that starts with When checking argument ok to function Prelude.List.head: Can't find a value of type NonEmpty...
import Data.String
generateN1s : Nat -> List Int
generateN1s x = Stream.take x (Stream.repeat 1)
parseOr0 : String -> Nat
parseOr0 str = case parseInteger str of
Nothing => 0
Just x => fromIntegerNat x
z : IO Int
z = do
x <- getLine
let someNum = parseOr0 x
let firstElem = List.head $ generateN1s someNum -- Compile error here
pure firstElem
Hmmm... well what's the type signature of List.head?
Idris> :t List.head
-- {auto ...} is roughly the same as Scala's implicit
head : (l : List a) -> {auto ok : NonEmpty l} -> a
Ah so we just need to provide a NonEmpty.
data NonEmpty : (xs : List a) -> Type where
IsNonEmpty : NonEmpty (x :: xs)
Oh a ::. And we're back at square one.
Use scala.collection.immutable.::
:: is the cons of the list
defined in std lib
::[A](head: A, tail: List[A])
use :: to define myFunc
def myFunc[A](list: ::[A]): Int = 1
def myFunc[A](head: A, tail: A*): Int = myFunc(::(head, tail.toList))
Scala REPL
scala> def myFunc[A](list: ::[A]): Int = 1
myFunc: [A](list: scala.collection.immutable.::[A])Int
scala> def myFunc[A](head: A, tail: A*): Int = myFunc(::(head, tail.toList))
myFunc: [A](head: A, tail: A*)Int
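For callers who only have a plain List[A], one way to reach the ::-typed function is to pattern match once and handle the empty case explicitly. In this sketch ConsDemo and fromList are hypothetical names, and myFunc returns the head instead of the constant 1 so the result is observable:

```scala
object ConsDemo {
  // Accepts only non-empty lists: the argument type is the cons cell itself.
  def myFunc[A](list: ::[A]): A = list.head

  // Bridge for callers holding a plain List: the match is the single place
  // where emptiness is checked, and the empty case must be handled.
  def fromList[A](xs: List[A]): Option[A] = xs match {
    case head :: tail => Some(myFunc(::(head, tail)))
    case Nil          => None
  }
}
```

The check has not disappeared; it has moved to the boundary where the List is turned into a ::, which is consistent with the "no free lunch" point made in the other answer.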

How can I avoid boilerplate when generating case classes with ScalaCheck?

I used to use an idiom like the following to generate case classes with ScalaCheck:
GenSomething.map2(GenSomethingElse)(MyClass(_, _))
We recently upgraded ScalaCheck to 1.11, which removed the mapN methods. I'd really like to be able to avoid having to assign intermediate names to the generators for each field, and the mapN methods provided the easiest way to do that. Now, the best syntax is:
for {
something <- GenSomething
somethingElse <- GenSomethingElse
} yield MyClass(
something = something,
somethingElse = somethingElse)
That's not so bad (for structures with a small number of constructor arguments), but I'd really like to make it clear that there's nothing special going on here, and I'm just specifying generators for each of the arguments without the reader of the code having to read through to confirm that.
In short, I'd like something akin to applicative syntax. Unfortunately, it's not an option to use scalaz, shapeless, or macros. I realize that that last sentence pretty much makes my question "how can I do X without access to the things that let me do X", but I'm hoping that someone will have a good idea.
Since you are explicitly excluding libraries that are meant to prevent boilerplate, you will have to live with some boilerplate.
You can define gen combiners for each arity, using a similar approach to Gen.resultOf. In fact, you can just use Gen.resultOf, since the only difference to resultOf is that you want explicitly provided Gens instead of implicitly provided Arbitrarys.
object GenCombiner {
def zipMap[A, R](a: Gen[A])(f: A ⇒ R): Gen[R] =
Gen.resultOf(f)(Arbitrary(a))
def zipMap[A, B, R](a: Gen[A], b: Gen[B])(f: (A, B) ⇒ R): Gen[R] =
Gen.resultOf(f)(Arbitrary(a), Arbitrary(b))
def zipMap[A, B, C, R](a: Gen[A], b: Gen[B], c: Gen[C])(f: (A, B, C) ⇒ R): Gen[R] =
Gen.resultOf(f)(Arbitrary(a), Arbitrary(b), Arbitrary(c))
// other arities
}
object GenCombinerTest {
import GenCombiner._
case class Foo(alpha: String, num: String)
val fooGen: Gen[Foo] = zipMap(Gen.alphaStr, Gen.numStr)(Foo)
}

Is there any fundamental limitations that stops Scala from implementing pattern matching over functions?

In languages like SML, Erlang and a bunch of others we may define functions like this:
fun reverse [] = []
| reverse (x :: xs) = reverse xs @ [x];
I know we can write analog in Scala like this (and I know, there are many flaws in the code below):
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
But I wonder, if we could write former code in Scala, perhaps with desugaring to the latter.
Is there any fundamental limitations for such syntax being implemented in the future (I mean, really fundamental -- e.g. the way type inference works in scala, or something else, except parser obviously)?
UPD
Here is a snippet of how it could look like:
type T
def reverse(Nil: List[T]) = Nil
def reverse(x :: xs: List[T]): List[T] = reverse(xs) ++ List(x)
It really depends on what you mean by fundamental.
If you are really asking "is there a technical showstopper that would prevent implementing this feature", then I would say the answer is no. You are talking about desugaring, and you are on the right track here. All there is to do is to basically stitch several separate cases into one single function, and this can be done as a mere preprocessing step (this only requires syntactic knowledge, no need for semantic knowledge). But for this to even make sense, I would define a few rules:
The function signature is mandatory (in Haskell, for example, this would be optional, but it is always optional whether you are defining the function at once or in several parts). We could try to arrange to live without the signature and attempt to extract it from the different parts, but lack of type information would quickly come to bite us. A simpler argument is that if we are to try to infer an implicit signature, we might as well do it for all the methods. But the truth is that there are very good reasons to have explicit signatures in Scala and I can't imagine changing that.
All the parts must be defined within the same scope. To start with, they must be declared in the same file because each source file is compiled separately, and thus a simple preprocessor would not be enough to implement the feature. Second, we still end up with a single method in the end, so it's only natural to have all the parts in the same scope.
Overloading is not possible for such methods (otherwise we would need to repeat the signature for each part just so the preprocessor knows which part belongs to which overload)
Parts are added (stitched) to the generated match in the order they are declared
So here is how it could look like:
def reverse[T](lst: List[T]): List[T] // Exactly like an abstract def (provides the signature)
// .... some unrelated code here...
def reverse(Nil) = Nil
// .... another bit of unrelated code here...
def reverse(x :: xs ) = reverse(xs) ++ List(x)
Which could be trivially transformed into:
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
// .... some unrelated code here...
// .... another bit of unrelated code here...
It is easy to see that the above transformation is very mechanical and can be done by just manipulating a source AST (the AST produced by the slightly modified grammar that accepts this new constructs), and transforming it into the target AST (the AST produced by the standard scala grammar).
Then we can compile the result as usual.
So there you go, with a few simple rules we are able to implement a preprocessor that does all the work to implement this new feature.
If by fundamental you are asking "is there anything that would make this feature out of place" then it can be argued that this does not feel very scala. But more to the point, it does not bring that much to the table. Scala author(s) actually tend toward making the language simpler (as in less built-in features, trying to move some built-in features into libraries) and adding a new syntax that is not really more readable goes against the goal of simplification.
In SML, your code snippet is literally just syntactic sugar (a "derived form" in the terminology of the language spec) for
val rec reverse = fn x =>
case x of [] => []
| x::xs => reverse xs @ [x]
which is very close to the Scala code you show. So, no there is no "fundamental" reason that Scala couldn't provide the same kind of syntax. The main problem is Scala's need for more type annotations, which makes this shorthand syntax far less attractive in general, and probably not worth the while.
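For comparison, Scala already accepts a pattern-matching anonymous function wherever a function type is expected, which is about as close as current syntax gets to the SML derived form. A small sketch, with ReverseDemo as a made-up container object and the same quadratic reverse as in the question:

```scala
object ReverseDemo {
  // The body is a block of cases with no explicit `match`; the expected
  // function type List[T] => List[T] tells the compiler what to build.
  def reverse[T]: List[T] => List[T] = {
    case Nil     => Nil
    case x :: xs => reverse[T](xs) ++ List(x)
  }
}
```

The full signature still has to be written out once, which is the extra type annotation cost mentioned above.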
Note also that the specific syntax you suggest would not fly well, because there is no way to distinguish one case-by-case function definition from two overloaded functions syntactically. You probably would need some alternative syntax, similar to SML using "|".
I don't know SML or Erlang, but I know Haskell. It is a language without method overloading. Method overloading combined with such pattern matching could lead to ambiguities. Imagine following code:
def f(x: String) = "String "+x
def f(x: List[_]) = "List "+x
What should it mean? It can mean method overloading, i.e. the method is determined in compile time. It can also mean pattern matching. There would be just a f(x: AnyRef) method that would do the matching.
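The two readings really do behave differently, as this sketch tries to show. DispatchDemo, g and value are names invented for the illustration; f is the pair of overloads from above:

```scala
object DispatchDemo {
  // Reading 1: overloading, resolved from the static type at compile time.
  def f(x: String): String  = "String " + x
  def f(x: List[_]): String = "List " + x

  // Reading 2: a single method that dispatches on the runtime class.
  def g(x: AnyRef): String = x match {
    case s: String  => "String " + s
    case l: List[_] => "List " + l
    case other      => "Other " + other
  }

  // Statically an AnyRef, dynamically a String.
  val value: AnyRef = "hi"
}
```

With these definitions f(value) is rejected by the compiler because no overload accepts AnyRef, while g(value) inspects the runtime class and takes the String branch.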
Scala also has named parameters, which would be probably also broken.
I don't think that Scala is able to offer more simple syntax than you have shown in general. A simpler syntax may IMHO work in some special cases only.
There are at least two problems:
[ and ] are reserved characters because they are used for type arguments. The compiler allows spaces around them, so that would not be an option.
The other problem is that = returns Unit. So the expression after the | would not return any result
The closest I could come up with is this (note that is very specialized towards your example):
// Define a class to hold the values left and right of the | sign
class |[T, S](val left: T, val right: PartialFunction[T, T])
// Create a class that contains the | operator
class OrAssoc[T](left: T) {
def |(right: PartialFunction[T, T]): T | T = new |(left, right)
}
// Add the | to any potential target
implicit def anyToOrAssoc[S](left: S): OrAssoc[S] = new OrAssoc(left)
object fun {
// Use the magic of the update method
def update[T, S](choice: T | S): T => T = { arg =>
if (choice.right.isDefinedAt(arg)) choice.right(arg)
else choice.left
}
}
// Use the above construction to define a new method
val reverse: List[Int] => List[Int] =
fun() = List.empty[Int] | {
case x :: xs => reverse(xs) ++ List(x)
}
// Call the method
reverse(List(3, 2, 1))

How do you use scalaz.WriterT for logging in a for expression?

How do you use scalaz.WriterT for logging?
About monad transformers
This is a very short introduction. You may find more information on haskellwiki or this great slide by @jrwest.
Monads don't compose, meaning that if you have a monad A[_] and a monad B[_], then A[B[_]] can not be derived automatically. However in most cases this can be achieved by having a so-called monad transformer for a given monad.
If we have monad transformer BT for monad B, then we can compose a new monad A[B[_]] for any monad A. That's right, by using BT, we can put the B inside A.
Monad transformer usage in scalaz
The following assumes scalaz 7, since frankly I didn't use monad transformers with scalaz 6.
A monad transformer MT takes two type parameters, the first is the wrapper (outside) monad, the second is the actual data type at the bottom of the monad stack. Note: It may take more type parameters, but those are not related to the transformer-ness, but rather specific for that given monad (like the logged type of a Writer, or the error type of a Validation).
So if we have a List[Option[A]] which we would like to treat as a single composed monad, then we need OptionT[List, A]. If we have Option[List[A]], we need ListT[Option, A].
How to get there? If we have the non-transformer value, we can usually just wrap it with MT.apply to get the value inside the transformer. To get back from the transformed form to normal, we usually call .run on the transformed value.
So val a: OptionT[List, Int] = OptionT[List, Int](List(some(1))) and val b: List[Option[Int]] = a.run are the same data, just the representation is different.
It was suggested by Tony Morris that it is best to go into the transformed version as early as possible and use that as long as possible.
Note: Composing multiple monads using transformers yields a transformer stack with types just the opposite order as the normal data type. So a normal List[Option[Validation[E, A]]] would look something like type ListOptionValidation[+E, +A] = ValidationT[({type l[+a] = OptionT[List, a]})#l, E, A]
Update: As of scalaz 7.0.0-M2, Validation is (correctly) not a Monad and so ValidationT doesn't exist. Use EitherT instead.
Using WriterT for logging
Based on your need, you can use the WriterT without any particular outer monad (in this case in the background it will use the Id monad which doesn't do anything), or can put the logging inside a monad, or put a monad inside the logging.
First case, simple logging
import scalaz.{Writer}
import scalaz.std.list.listMonoid
import scalaz._
def calc1 = Writer(List("doing calc"), 11)
def calc2 = Writer(List("doing other"), 22)
val r = for {
a <- calc1
b <- calc2
} yield {
a + b
}
r.run should be_== (List("doing calc", "doing other"), 33)
We import the listMonoid instance, since it also provides the Semigroup[List] instance. It is needed since WriterT needs the log type to be a semigroup in order to be able to combine the log values.
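To see why the log type has to be combinable, here is a hand-rolled miniature writer using only the standard library. It is a sketch for illustration, not scalaz's implementation: MiniWriter and its Writer class are hypothetical, and the log is fixed to List[String] so that ++ plays the role of the semigroup append.

```scala
object MiniWriter {
  // Hypothetical minimal writer: a value paired with its accumulated log.
  final case class Writer[A](log: List[String], value: A) {
    def map[B](f: A => B): Writer[B] = Writer(log, f(value))

    // flatMap is where the logs of two steps are combined.
    def flatMap[B](f: A => Writer[B]): Writer[B] = {
      val next = f(value)
      Writer(log ++ next.log, next.value)
    }
  }

  def calc1: Writer[Int] = Writer(List("doing calc"), 11)
  def calc2: Writer[Int] = Writer(List("doing other"), 22)

  // Same shape as the scalaz example above.
  val r: Writer[Int] = for {
    a <- calc1
    b <- calc2
  } yield a + b
}
```

Every <- in the for expression goes through flatMap, so without some way to append two logs there would be nothing sensible to put in the result.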
Second case, logging inside a monad
Here we chose the Option monad for simplicity.
import scalaz.{Writer, WriterT}
import scalaz.std.list.listMonoid
import scalaz.std.option.optionInstance
import scalaz.syntax.pointed._
def calc1 = WriterT((List("doing calc") -> 11).point[Option])
def calc2 = WriterT((List("doing other") -> 22).point[Option])
val r = for {
a <- calc1
b <- calc2
} yield {
a + b
}
r.run should be_== (Some(List("doing calc", "doing other"), 33))
With this approach, since the logging is inside the Option monad, if any of the bound options is None, we would just get a None result without any logs.
Note: x.point[Option] is the same in effect as Some(x), but may help to generalize the code better. It is not essential; I just did it that way here.
Third option, logging outside of a monad
import scalaz.{Writer, WriterT, OptionT}
import scalaz.std.list.listMonoid
import scalaz.std.option.optionInstance
import scalaz.syntax.pointed._
type Logger[+A] = WriterT[scalaz.Id.Id, List[String], A]
def calc1 = OptionT[Logger, Int](Writer(List("doing calc"), Some(11): Option[Int]))
def calc2 = OptionT[Logger, Int](Writer(List("doing other"), None: Option[Int]))
val r = for {
a <- calc1
b <- calc2
} yield {
a + b
}
r.run.run should be_== (List("doing calc", "doing other") -> None)
Here we use OptionT to put the Option monad inside the Writer. One of the calculations is None to show that even in this case logs are preserved.
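The difference between the second and third case comes down to where the log sits relative to the Option. The following standard-library sketch spells out the two shapes by hand; LogPlacement, inside and outside are hypothetical names, and the code only mimics what the transformer stacks do for a two-step computation:

```scala
object LogPlacement {
  type Log = List[String]

  // Log inside Option, i.e. Option[(Log, A)]: a None anywhere
  // discards the whole pair, log included.
  def inside(a: Option[(Log, Int)], b: Option[(Log, Int)]): Option[(Log, Int)] =
    for {
      pa <- a
      pb <- b
    } yield (pa._1 ++ pb._1, pa._2 + pb._2)

  // Option inside the log, i.e. (Log, Option[A]): the log is always kept.
  // If the first step is None the second step never runs, so only
  // the first log survives.
  def outside(a: (Log, Option[Int]), b: => (Log, Option[Int])): (Log, Option[Int]) =
    a._2 match {
      case None    => (a._1, None)
      case Some(x) =>
        val (logB, valueB) = b
        (a._1 ++ logB, valueB.map(x + _))
    }
}
```

Which shape is preferable depends on whether the log of a failed computation is still needed afterwards.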
Final remarks
In these examples List[String] was used as the log type. However using String is hardly ever the best way, just some convention forced on us by logging frameworks. It would be better to define a custom log ADT for example, and if needed to output, convert it to string as late as possible. This way you could serialize the log's ADT and easily analyse it later programmatically (instead of parsing strings).
WriterT has a host of useful methods to work with to ease logging, check out the source. For example given a w: WriterT[...], you may add a new log entry using w :++> List("other event"), or even log using the currently held value using w :++>> ((v) => List("the result is " + v)), etc.
The examples contain a lot of explicit and longish code (types, calls). As always, this is for clarity; in your own code, refactor by extracting common types and operations.
type OptionLogger[A] = WriterT[Option, NonEmptyList[String], A]
val two: OptionLogger[Int] = WriterT.put(2.some)("The number two".pure[NonEmptyList])
val hundred: OptionLogger[Int] = WriterT.put(100.some)("One hundred".pure[NonEmptyList])
val twoHundred = for {
a <- two
b <- hundred
} yield a * b
twoHundred.value must be equalTo(200.some)
val log = twoHundred.written map { _.list } getOrElse List() mkString(" ")
log must be equalTo("The number two One hundred")