`circe` Type-level Json => A Function? - scala

Using circe or argonaut, how can I write a Json => A (note - Json may not be the name of the type) where A is given by the SSN class:
// A USA Social Security Number has exactly 8 digits.
case class SSN(value: Sized[List[Nat], _8])
?
Pseudocode:
// assuming this function is named f
f(JsArray(JsNumber(1))) would fail to become an A since its size is 1, whereas
f(JsArray(JsNumber(1), ..., JsNumber(8))) === SSN(SizedList(1,...,8))

circe doesn't (currently) provide instances for Sized, but it probably should. In any case you can write your own pretty straightforwardly:
import cats.data.Xor
import io.circe.{ Decoder, DecodingFailure }
import shapeless.{ Nat, Sized }
import shapeless.ops.nat.ToInt
import shapeless.syntax.sized._
implicit def decodeSized[L <: Nat, A](implicit
dl: Decoder[List[A]],
ti: ToInt[L]
): Decoder[Sized[List[A], L]] = Decoder.instance { c =>
dl(c).flatMap(as =>
Xor.fromOption(as.sized[L], DecodingFailure(s"Sized[List[A], _${ti()}]", c.history))
)
}
I've restricted this to List representations, but you could make it more generic if you wanted.
Now you can write your SSN instance like this (note that I'm using Int instead of Nat for the individual numbers, since once you've got something statically typed as a Nat it's not worth much):
case class SSN(value: Sized[List[Int], Nat._8])
implicit val decodeSSN: Decoder[SSN] = Decoder[Sized[List[Int], Nat._8]].map(SSN(_))
And then:
scala> import io.circe.jawn.decode
import io.circe.jawn.decode
scala> decode[SSN]("[1, 2, 3, 4, 5, 6, 7, 8]")
res0: cats.data.Xor[io.circe.Error,SSN] = Right(SSN(List(1, 2, 3, 4, 5, 6, 7, 8)))
scala> decode[SSN]("[1, 2, 3, 4, 5, 6, 7]")
res1: cats.data.Xor[io.circe.Error,SSN] = Left(DecodingFailure(Sized[List[A], _8], List()))
If you really want a Json => SSN you could do this:
val f: Json => SSN = Decoder[SSN].decodeJson(_).valueOr(throw _)
But that's not really idiomatic use of circe.

Related

Scala generics with unknown number of types

I have a scala function that should receive any number of Iterable[T]s of an unknown and different Ts.
for example, this is a valid call
foo(Iterable(1, 2, 3), Iterable(" "))
and should be translated to something like
foo(arg1: Iterable[Int](1, 2, 3),
arg2: Iterable[String](" "))
Note that the following does not work.
foo(args: Iterable[Any]*)
Because I need each iterable to be of one specific type.
Here's an example of how I want this to look:
foo[T*](args: Iterable[T]*) = {
args.foreach({
case _: Iterable[Int] => ...
case _: Iterable[String] => ...
})
}
How can I achieve this?

Scala : Is there any operation in scala like rdd1.fun( val1) or rdd1.fun( rdd2)?

Suppose I have two RDDs: rdd1=(Double,Int,String), rdd2=(Double,String) and a function fun1 that I wrote myself, which takes both rdd1 and rdd2 as its inputs. How can I get the result through a call like rdd1.fun1(val1) or rdd1.fun1(rdd2)?
For example,
rdd1=((1.53, 1, "22.35, 20.37, 15.52, 8.96"),
(2.62, 2, "17.15, 1.83, 16.36, 5.24"),
(5.66, 3, "7.98, 14.16, 12.35, 6.36"))
rdd2=( 1.53,"22.35, 20.37")
(P.S. 1.53 is the minimum of [1.53, 2.62, 5.66].)
fun1 would return a new rdd3 built from rdd1, where each element of rdd2 replaces the corresponding element in every row of rdd1. The expected output is as follows:
fun1(rdd1,rdd2)
{
...
new Tuple3(p1:Double, p2:Int, p3:String)
}
rdd3=((1.53, 1, "22.35, 20.37, 15.52, 8.96"),
(1.53, 2, "22.35, 20.37, 16.36, 5.24"),
(1.53, 3, "22.35, 20.37,12.35,6.36")).
Maybe one way to call fun1 is rdd2.fun1(rdd1), or some other calling convention.
I've tried "join", but it doesn't work for my problem, because "join" only returns the pairs that share the same key.
But I don't know how to make fun1 work when rdd1 and rdd2 are the inputs.
You can do this with implicit conversions or implicit classes.
implicit class RichRDD[T](rdd: RDD[T]) {
def myFunction(other: RDD[T]): ? = { ... }
}
Here's an example with integers:
scala> implicit class RichInt(i: Int) {
     |   def toThePowerOf(b: Int): Int = scala.math.pow(i, b).toInt
     | }
defined class RichInt
scala> 2.toThePowerOf(4)
res1: Int = 16
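The same pattern can be sketched for the question's data without Spark. This is only an illustration under stated assumptions: a plain Seq stands in for the RDD, the names RichSeqSketch, Row and RichRows are hypothetical, and the replacement rule (overwrite the Double and the leading comma-separated values of the String) is inferred from the expected output in the question.

```scala
object RichSeqSketch {
  // Hypothetical row shape taken from the question: (Double, Int, String).
  type Row = (Double, Int, String)

  // Adds fun1 to any Seq[Row]; with Spark the wrapped type would be an RDD instead.
  implicit class RichRows(rows: Seq[Row]) {
    def fun1(replacement: (Double, String)): Seq[Row] = {
      val (newKey, prefix) = replacement
      val prefixParts = prefix.split(",").map(_.trim)
      rows.map { case (_, id, values) =>
        val parts = values.split(",").map(_.trim)
        // Keep the tail of the original values, overwrite the leading ones.
        val merged = prefixParts ++ parts.drop(prefixParts.length)
        (newKey, id, merged.mkString(", "))
      }
    }
  }
}

import RichSeqSketch._

val rdd1 = Seq(
  (1.53, 1, "22.35, 20.37, 15.52, 8.96"),
  (2.62, 2, "17.15, 1.83, 16.36, 5.24"),
  (5.66, 3, "7.98, 14.16, 12.35, 6.36")
)
val rdd3 = rdd1.fun1((1.53, "22.35, 20.37"))
```

Because the method is attached through an implicit class, the call site reads rdd1.fun1(...) even though Seq itself has no such method.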

Accessing Previous output while operator chaining in Scala

How to access the resulting output value to perform an upcoming operation for example:
scala> List(1,4,3,4,4,5,6,7)
res0: List[Int] = List(1, 4, 3, 4, 4, 5, 6, 7)
scala> res0.removeDuplicates.slice(0, ???.size -2)
In the above line, I need to perform the slice operation after removing duplicates. To do this, how can I access the output of removeDuplicates, so that I can use its size in the slice operation?
I need to perform this in a single step. Not in multiple steps like:
scala> res0.removeDuplicates
res1: List[Int] = List(1, 4, 3, 5, 6, 7)
scala> res1.slice(0, res1.size -2)
res2: List[Int] = List(1, 4, 3, 5)
I want to access intermediate results in the final operation. removeDuplicates is just an example.
list.op1().op2().op3().finalop() Here I want to access the output of op1, op2 and op3 in finalop.
Wrapping it in an Option may be one option (no pun intended):
val finalResult = Some(foo).map { foo =>
foo.op1(foo.stuff)
}.map { foo =>
foo.op2(foo.stuff)
}.map { foo =>
foo.op3(foo.stuff)
}.get.finalOp
You can make the wrapping part implicit to make it a little nicer:
object Tapper {
// The constructor parameter list must come before "extends AnyVal".
implicit class Tapped[T](val v: T) extends AnyVal {
def tap[R](f: T => R): R = f(v)
}
}
import Tapper._
val finalResult = foo
.tap(f => f.op1(f.stuff))
.tap(f => f.op2(f.stuff))
.tap(f => f.finalOp(f.stuff))
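If you are on Scala 2.13 or later (an assumption, since the question predates it), the standard library ships a similar helper in scala.util.chaining. Note that what the answer above calls tap corresponds to pipe there, and distinct stands in for the long-deprecated removeDuplicates. A minimal sketch:

```scala
import scala.util.chaining._

// pipe hands the intermediate result to a function, so its size is in scope.
val result = List(1, 4, 3, 4, 4, 5, 6, 7)
  .distinct
  .pipe(xs => xs.slice(0, xs.size - 2))
// result: List(1, 4, 3, 5)
```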
With a for comprehension you can compose the operations quite readably and still access the intermediate results:
val res = for {
ls1 <- Option(list.op1)
ls2 = ls1.op2() // Possible to access list, ls1
ls3 = ls2.op3() // Possible to access list, ls1, ls2
} yield ls3.finalOp() // Possible to access list, ls1, ls2, ls3
For example:
scala> val ls = List(1,1,2,2,3,3,4,4)
ls: List[Int] = List(1, 1, 2, 2, 3, 3, 4, 4)
scala> :paste
// Entering paste mode (ctrl-D to finish)
for {
ls1 <- Option(ls.map(_ * 2))
ls2 = ls1.map(_ + ls1.size)
ls3 = ls2.filter(_ < ls1.size + ls2.size)
} yield ls3.sum
// Exiting paste mode, now interpreting.
res15: Option[Int] = Some(72)
You will not need to know the length if you use dropRight:
scala> val a = List(1,4,3,4,4,5,6,7)
a: List[Int] = List(1, 4, 3, 4, 4, 5, 6, 7)
scala> a.dropRight(2)
res0: List[Int] = List(1, 4, 3, 4, 4, 5)
So do this: res0.removeDuplicates.dropRight(2)
If you really need it in one function, you can write a custom foldLeft, something like this:
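import scala.collection.mutable.HashSet

var count = 0
val found = HashSet[Int]()
res0.foldLeft(List[Int]()) { (z, i) =>
  // Keep only the first four distinct elements; every branch must return the accumulator.
  if (!found.contains(i) && count < 4) {
    found += i
    count += 1
    z :+ i
  } else z
}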
However I don't really see the problem in chaining calls like in res0.removeDuplicates.slice. One benefit of functional programming is that our compiler can optimize in situations like this where we just want a certain behavior and don't want to specify the implementation.
You want to process some data through a series of transformations: someData -> op1 -> op2 -> op3 -> finalOp. However, inside op3, you would like to have access to intermediate results from the processing done in op1. The key here is to pass to the next function in the processing chain all the information that will be required downstream.
Let's say that your input is xs: Seq[String] and op1 is of type (xs: Seq[String]) => Seq[String]. You want to modify op1 to return case class ResultWrapper(originalInputLength: Int, deduplicatedItems: Seq[String], somethingNeededInOp5: SomeType). If all of your ops pass along what the other ops need down the line, you will get what you need. It's not very elegant, because there is coupling between your ops: the upstream needs to save the info that the downstream needs. They are not really "different operations" any more at this point.
One thing you can do is to use a Map[A,B] as your "result wrapper". This way, there is less coupling between ops, but less type safety as well.
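The typed "result wrapper" idea might look like the following sketch. The names Stage, op1, op2 and finalOp are hypothetical, and the operations themselves are placeholders chosen only to show the intermediate value travelling down the chain.

```scala
// Carries the data plus whatever later stages need from earlier ones.
final case class Stage(originalLength: Int, items: Seq[String])

// op1 records the input length before deduplicating.
def op1(xs: Seq[String]): Stage = Stage(xs.length, xs.distinct)

// op2 transforms the items and passes the recorded length along untouched.
def op2(s: Stage): Stage = s.copy(items = s.items.map(_.toUpperCase))

// finalOp can use both the current items and the length seen by op1.
def finalOp(s: Stage): String =
  s"${s.items.mkString(",")} (${s.originalLength - s.items.length} duplicates removed)"

val summary = finalOp(op2(op1(Seq("a", "b", "a"))))
// summary: "A,B (1 duplicates removed)"
```

The coupling mentioned above is visible here: op1 has to know that finalOp will want the original length.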

SortedSet map does not always preserve element ordering in result?

Given the following Scala 2.9.2 code:
Updated with non-working example
import collection.immutable.SortedSet
case class Bar(s: String)
trait Foo {
val stuff: SortedSet[String]
def makeBars(bs: Map[String, String])
= stuff.map(k => Bar(bs.getOrElse(k, "-"))).toList
}
case class Bazz(rawStuff: List[String]) extends Foo {
val stuff = SortedSet(rawStuff: _*)
}
// test it out....
val b = Bazz(List("A","B","C"))
b.makeBars(Map("A"->"1","B"->"2","C"->"3"))
// List[Bar] = List(Bar(1), Bar(2), Bar(3))
// Looks good?
// Make a really big list not in order. This is why we pass it to a SortedSet...
val data = Stream.continually(util.Random.shuffle(List("A","B","C","D","E","F"))).take(100).toList
val b2 = Bazz(data.flatten)
// And how about a sparse map...?
val bs = util.Random.shuffle(Map("A" -> "1", "B" -> "2", "E" -> "5").toList).toMap
b2.makeBars(bs)
// res24: List[Bar] = List(Bar(1), Bar(2), Bar(-), Bar(5))
I've discovered that, in some cases, the makeBars method of classes extending Foo does not return a sorted List. In fact, the list ordering does not reflect the ordering of the SortedSet.
What am I missing about the above code where Scala will not always map a SortedSet to a List with elements ordered by the SortedSet ordering?
You're being surprised by implicit resolution.
The map method requires a CanBuildFrom instance that's compatible with the target collection type (in simple cases, identical to the source collection type) and the mapper function's return type.
In the particular case of SortedSet, its implicit CanBuildFrom requires that an Ordering[A] (where A is the return type of the mapper function) be available. When your map function returns something that the compiler already knows how to find an Ordering for, you're good:
scala> val ss = collection.immutable.SortedSet(10,9,8,7,6,5,4,3,2,1)
ss: scala.collection.immutable.SortedSet[Int] = TreeSet(1, 2, 3, 4, 5,
6, 7, 8, 9, 10)
scala> val result1 = ss.map(_ * 2)
result1: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 4, 6, 8, 10,
12, 14, 16, 18, 20)
// still sorted because Ordering[Int] is readily available
scala> val result2 = ss.map(_ + " is a number")
result2: scala.collection.immutable.SortedSet[String] = TreeSet(1 is a number,
10 is a number,
2 is a number,
3 is a number,
4 is a number,
5 is a number,
6 is a number,
7 is a number,
8 is a number,
9 is a number)
// The default Ordering[String] is an "asciibetical" sort,
// so 10 comes between 1 and 2. :)
However, when your mapper function turns out to return a type for which no Ordering is known, the implicit on SortedSet doesn't match (specifically, no value can be found for its implicit parameter), so the compiler looks "upward" for a compatible CanBuildFrom and finds the generic one from Set.
scala> case class Foo(i: Int)
defined class Foo
scala> val result3 = ss.map(Foo(_))
result3: scala.collection.immutable.Set[Foo] = Set(Foo(10), Foo(4), Foo(6), Foo(7), Foo(1), Foo(3), Foo(5), Foo(8), Foo(9), Foo(2))
// The default Set is a hash set, therefore ordering is not preserved
Of course, you can get around this by simply supplying an instance of Ordering[Foo] that does whatever you expect:
scala> implicit val fooIsOrdered: Ordering[Foo] = Ordering.by(_.i)
fooIsOrdered: Ordering[Foo] = scala.math.Ordering$$anon$9@7512dbf2
scala> val result4 = ss.map(Foo(_))
result4: scala.collection.immutable.SortedSet[Foo] = TreeSet(Foo(1), Foo(2),
Foo(3), Foo(4), Foo(5),
Foo(6), Foo(7), Foo(8),
Foo(9), Foo(10))
// And we're back!
Finally, note that toy examples often don't exhibit the problem, because the Scala collection library has special implementations for small Sets and Maps (up to four elements), and those keep their elements in insertion order.
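If the goal is just an ordered List, as in the question's makeBars, another way around this is to leave the set before mapping. The sketch below assumes that converting to a List first is acceptable; it reuses Bar and the sparse map from the question with a smaller, hand-written key set.

```scala
import scala.collection.immutable.SortedSet

final case class Bar(s: String)

val stuff = SortedSet("F", "B", "E", "A", "D", "C")
val bs = Map("A" -> "1", "B" -> "2", "E" -> "5")

// toList walks the set in sorted order; List.map then keeps that order
// and needs no Ordering[Bar].
val bars = stuff.toList.map(k => Bar(bs.getOrElse(k, "-")))
// bars: List(Bar(1), Bar(2), Bar(-), Bar(-), Bar(5), Bar(-))
```

This also keeps the repeated Bar(-) entries, which mapping directly over the set would collapse into one.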
You're probably making assumptions about what SortedSet does based on Java. You need to specify the order you want the elements to be in. See http://www.scala-lang.org/docu/files/collections-api/collections_8.html

generate case classes from CSV in Scala

I've got a CSV response from a service and I want to generate a list of case classes. For example:
case class MyCaseClass(e1: String, e2: String, e3: String)
val body = getLargeCsvFromServiceOrSomething()
val elements = body.split(",")
Now I have an Array[String]. I want to take that large array and break it down into 3-element chunks, so I can generate my List[MyCaseClass], where each instance takes 3 elements from the array. Is there a method similar to splitAt, but one that splits every n elements? I'm sure I can do this point-free, but it's just not coming to me.
What you want is grouped:
scala> List(1,2,3,4,5,6,7).grouped(3).toList
res0: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7))
So your thing might be like:
val elements = Array("a","b","c","d","e","f")
val classes = elements.grouped(3).map{ case Array(a,b,c) => MyCaseClass(a,b,c) }
println(classes.toList) // List(MyCaseClass(a,b,c), MyCaseClass(d,e,f))
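One caveat with the pattern match above: if the number of fields is not a multiple of 3, the last group is shorter and map throws a MatchError. A hedged variant, assuming it is acceptable to drop an incomplete trailing group, uses collect instead:

```scala
final case class MyCaseClass(e1: String, e2: String, e3: String)

// Seven fields: the trailing "g" does not fill a complete group of three.
val elements = "a,b,c,d,e,f,g".split(",")

// collect keeps only the groups the partial function matches.
val classes = elements
  .grouped(3)
  .collect { case Array(a, b, c) => MyCaseClass(a, b, c) }
  .toList
// classes: List(MyCaseClass(a,b,c), MyCaseClass(d,e,f))
```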