Generate case classes from CSV in Scala

I've got a CSV response from a service and I want to generate a list of case classes. For example:
case class MyCaseClass(e1: String, e2: String, e3: String)
val body = getLargeCsvFromServiceOrSomething()
val elements = body.split(",")
Now I have an Array[String]. I want to take that large array and break it down into 3-element chunks, so I can generate my List[MyCaseClass], where each instance takes 3 elements from the array. Is there a method similar to splitAt, but one that splits every n elements? I'm sure I can do this point-free, but it's just not coming to me.

What you want is grouped:
scala> List(1,2,3,4,5,6,7).grouped(3).toList
res0: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7))
So your thing might be like:
val elements = Array("a","b","c","d","e","f")
val classes = elements.grouped(3).map{ case Array(a,b,c) => MyCaseClass(a,b,c) }
println(classes.toList) // List(MyCaseClass(a,b,c), MyCaseClass(d,e,f))
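One caveat with the snippet above: if the number of fields is not a multiple of 3, the last group is shorter and the `case Array(a,b,c)` match throws a MatchError. A minimal sketch of one way around that, assuming you are happy to silently drop an incomplete trailing record (the `GroupedExample` object and `parse` helper are made-up names for illustration):

```scala
object GroupedExample {
  case class MyCaseClass(e1: String, e2: String, e3: String)

  // Hypothetical helper: split the CSV body and keep only complete groups of 3.
  // collect applies the partial function where it matches and skips the rest,
  // so a short trailing group is dropped instead of raising a MatchError.
  def parse(body: String): List[MyCaseClass] =
    body
      .split(",")
      .grouped(3)
      .collect { case Array(a, b, c) => MyCaseClass(a, b, c) }
      .toList
}
```

If a short final group should be treated as an error rather than ignored, keeping `map` (and the MatchError) or validating the length up front is probably the better choice.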

Related

Accessing Previous output while operator chaining in Scala

How can I access the output of one operation in order to perform the next one? For example:
scala> List(1,4,3,4,4,5,6,7)
res0: List[Int] = List(1, 4, 3, 4, 4, 5, 6, 7)
scala> res0.removeDuplicates.slice(0, ???.size -2)
In the above line, I need to perform the slice operation after removing duplicates. To do this, how can I access the output of .removeDuplicates, so that I can use its size in the slice operation?
I need to perform this in a single step. Not in multiple steps like:
scala> res0.removeDuplicates
res1: List[Int] = List(1, 4, 3, 5, 6, 7)
scala> res1.slice(0, res1.size -2)
res2: List[Int] = List(1, 4, 3, 5)
I want to access intermediate results in the final operation. removeDuplicates() is just an example.
list.op1().op2().op3().finalop(): here I want to access the output of op1, op2 and op3 in finalop.
Wrapping it into an Option may be one option (no pun intended):
val finalResult = Some(foo).map { foo =>
foo.op1(foo.stuff)
}.map { foo =>
foo.op2(foo.stuff)
}.map { foo =>
foo.op3(foo.stuff)
}.get.finalOp
You can make the wrapping part implicit to make it a little nicer:
object Tapper {
implicit class Tapped[T](val v: T) extends AnyVal {
def tap[R](f: T => R) = f(v)
}
}
import Tapper._
val finalResult = foo
.tap(f => f.op1(f.stuff))
.tap(f => f.op2(f.stuff))
.tap(f => f.finalOp(f.stuff))
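For what it's worth, on Scala 2.13 or later the standard library ships this helper in `scala.util.chaining`. Note that there the method that returns the function's result is called `pipe`, while the library's `tap` returns the original value. A small sketch applied to the list from the question, assuming 2.13+ and using `distinct` in place of the old `removeDuplicates` (the `PipeExample` object is only a wrapper for illustration):

```scala
import scala.util.chaining._

object PipeExample {
  val input: List[Int] = List(1, 4, 3, 4, 4, 5, 6, 7)

  // pipe hands the intermediate result (the de-duplicated list) to the lambda,
  // so its size is available for the slice in a single expression.
  val result: List[Int] =
    input.distinct.pipe(xs => xs.slice(0, xs.size - 2))
}
```

Here `PipeExample.result` is `List(1, 4, 3, 5)`, the same value as res2 in the question.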
With a for comprehension it is possible to compose operations in a quite readable way while keeping access to the intermediate results:
val res = for {
ls1 <- Option(list.op1())
ls2 = ls1.op2() // Possible to access list, ls1
ls3 = ls2.op3() // Possible to access list, ls1, ls2
} yield ls3.finalOp() // Possible to access list, ls1, ls2, ls3
For example:
scala> val ls = List(1,1,2,2,3,3,4,4)
ls: List[Int] = List(1, 1, 2, 2, 3, 3, 4, 4)
scala> :paste
// Entering paste mode (ctrl-D to finish)
for {
ls1 <- Option(ls.map(_ * 2))
ls2 = ls1.map(_ + ls1.size)
ls3 = ls2.filter(_ < ls1.size + ls2.size)
} yield ls3.sum
// Exiting paste mode, now interpreting.
res15: Option[Int] = Some(72)
You will not need to know the length if you use dropRight:
scala> val a = List(1,4,3,4,4,5,6,7)
a: List[Int] = List(1, 4, 3, 4, 4, 5, 6, 7)
scala> a.dropRight(2)
res0: List[Int] = List(1, 4, 3, 4, 4, 5)
So do this: res0.removeDuplicates.dropRight(2)
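If you really need it in one function, you can write a custom foldLeft, something like this:
import scala.collection.mutable.HashSet
var count = 0
val found = HashSet[Int]()
// Keep the first 4 distinct elements, in their original order
res0.foldLeft(List[Int]()) { (z, i) =>
if (!found.contains(i) && count < 4) {
found += i
count += 1
z :+ i // append the new element to the accumulator
} else {
z // already seen or limit reached: leave the accumulator unchanged
}
}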
However I don't really see the problem in chaining calls like in res0.removeDuplicates.slice. One benefit of functional programming is that our compiler can optimize in situations like this where we just want a certain behavior and don't want to specify the implementation.
You want to process some data through a series of transformations: someData -> op1 -> op2 -> op3 -> finalOp. However, inside op3, you would like to have access to intermediate results from the processing done in op1. The key here is to pass to the next function in the processing chain all the information that will be required downstream.
Let's say that your input is xs: Seq[String] and op1 is of type (xs: Seq[String]) => Seq[String]. You want to modify op1 to return case class ResultWrapper(originalInputLength: Int, deduplicatedItems: Seq[String], somethingNeededInOp5: SomeType). If all of your ops pass along what the other ops need down the line, you will get what you need. It's not very elegant, because there is coupling between your ops: the upstream needs to save the info that the downstream needs. They are not really "different operations" any more at this point.
One thing you can do is to use a Map[A,B] as your "result wrapper". This way, there is less coupling between ops, but less type safety as well.
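To make the "result wrapper" idea concrete, here is a rough sketch. The field names follow the ResultWrapper described above, but `op1`, `duplicatesRemoved` and the enclosing `WrapperExample` object are placeholders I made up, not anything from the question:

```scala
object WrapperExample {
  // Carries everything later steps will need, not just the transformed data.
  final case class ResultWrapper(
    originalInputLength: Int,
    deduplicatedItems: Seq[String]
  )

  // Hypothetical op1: de-duplicates, but also remembers the original length.
  def op1(xs: Seq[String]): ResultWrapper =
    ResultWrapper(xs.length, xs.distinct)

  // Hypothetical downstream step: uses both the op1 output and the saved info.
  def duplicatesRemoved(r: ResultWrapper): Int =
    r.originalInputLength - r.deduplicatedItems.length
}
```

For example, `WrapperExample.duplicatesRemoved(WrapperExample.op1(Seq("a", "b", "a", "c")))` evaluates to 1. The coupling mentioned above is visible here: op1 has to know that something downstream wants the original length.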

Looping through a list of tuples in Scala

I have a sample List as below
List[(String, Object)]
How can I loop through this list using for?
I want to do something like
for(str <- strlist)
but for the 2d list above. What would be placeholder for str?
Here it is,
scala> val fruits: List[(Int, String)] = List((1, "apple"), (2, "orange"))
fruits: List[(Int, String)] = List((1,apple), (2,orange))
scala>
scala> fruits.foreach {
| case (id, name) => {
| println(s"$id is $name")
| }
| }
1 is apple
2 is orange
Note: The expected type requires a one-argument function accepting a 2-Tuple.
Consider a pattern matching anonymous function, { case (id, name) => ... }
Easy to copy code:
val fruits: List[(Int, String)] = List((1, "apple"), (2, "orange"))
fruits.foreach {
case (id, name) => {
println(s"$id is $name")
}
}
With for you can extract the elements of the tuple,
for ( (s,o) <- list ) yield f(s,o)
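A slightly fuller sketch of the same pattern, with and without yield. The sample data and the `TupleLoopExample` wrapper are assumptions for illustration, and `Any` stands in for the `Object` from the question:

```scala
object TupleLoopExample {
  val list: List[(String, Any)] = List(("a", 1), ("b", "two"))

  // Without yield the loop runs only for its side effects (like foreach).
  def printAll(): Unit =
    for ((s, o) <- list) println(s"$s -> $o")

  // With yield the loop builds a new List (like map).
  val described: List[String] =
    for ((s, o) <- list) yield s"$s -> $o"
}
```

So the placeholder for `str` is just a tuple pattern such as `(s, o)`, which binds both components on each iteration.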
I would suggest using map, filter, fold or foreach (whichever suits your need) rather than iterating over the collection with a loop.
Edit 1:
For example, if you want to apply some function foo(tuple) to each element:
val newList = oldList.map(tuple => foo(tuple))
val tupleStrings = tupleList.map(tuple => tuple._1) // in your situation
If you want to filter according to some boolean condition:
val newList = oldList.filter(tuple => someCondition(tuple))
Or, if you simply want to print your list:
oldList.foreach(tuple => println(tuple)) // assuming tuple is printable
You can find examples and similar functions here: https://twitter.github.io/scala_school/collections.html
If you just want to get the strings you could map over your list of tuples like this:
// Just some example object
case class MyObj(i: Int = 0)
// Create a list of tuples like you have
val tuples = Seq(("a", new MyObj), ("b", new MyObj), ("c", new MyObj))
// Get the strings from the tuples
val strings = tuples.map(_._1)
// Output: Seq[String] = List(a, b, c)
Note: Tuple members are accessed using the underscore notation (which is indexed from 1, not 0).

SortedSet map does not always preserve element ordering in result?

Given the following Scala 2.9.2 code:
Updated with non-working example
import collection.immutable.SortedSet
case class Bar(s: String)
trait Foo {
val stuff: SortedSet[String]
def makeBars(bs: Map[String, String])
= stuff.map(k => Bar(bs.getOrElse(k, "-"))).toList
}
case class Bazz(rawStuff: List[String]) extends Foo {
val stuff = SortedSet(rawStuff: _*)
}
// test it out....
val b = Bazz(List("A","B","C"))
b.makeBars(Map("A"->"1","B"->"2","C"->"3"))
// List[Bar] = List(Bar(1), Bar(2), Bar(3))
// Looks good?
// Make a really big list not in order. This is why we pass it to a SortedSet...
val data = Stream.continually(util.Random.shuffle(List("A","B","C","D","E","F"))).take(100).toList
val b2 = Bazz(data.flatten)
// And how about a sparse map...?
val bs = util.Random.shuffle(Map("A" -> "1", "B" -> "2", "E" -> "5").toList).toMap
b2.makeBars(bs)
// res24: List[Bar] = List(Bar(1), Bar(2), Bar(-), Bar(5))
I've discovered that, in some cases, the makeBars method of classes extending Foo does not return a sorted List. In fact, the list ordering does not reflect the ordering of the SortedSet.
What am I missing about the above code where Scala will not always map a SortedSet to a List with elements ordered by the SortedSet ordering?
You're being surprised by implicit resolution.
The map method requires a CanBuildFrom instance that's compatible with the target collection type (in simple cases, identical to the source collection type) and the mapper function's return type.
In the particular case of SortedSet, its implicit CanBuildFrom requires that an Ordering[A] (where A is the return type of the mapper function) be available. When your map function returns something that the compiler already knows how to find an Ordering for, you're good:
scala> val ss = collection.immutable.SortedSet(10,9,8,7,6,5,4,3,2,1)
ss: scala.collection.immutable.SortedSet[Int] = TreeSet(1, 2, 3, 4, 5,
6, 7, 8, 9, 10)
scala> val result1 = ss.map(_ * 2)
result1: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 4, 6, 8, 10,
12, 14, 16, 18, 20)
// still sorted because Ordering[Int] is readily available
scala> val result2 = ss.map(_ + " is a number")
result2: scala.collection.immutable.SortedSet[String] = TreeSet(1 is a number,
10 is a number,
2 is a number,
3 is a number,
4 is a number,
5 is a number,
6 is a number,
7 is a number,
8 is a number,
9 is a number)
// The default Ordering[String] is an "asciibetical" sort,
// so 10 comes between 1 and 2. :)
However, when your mapper function turns out to return a type for which no Ordering is known, the implicit on SortedSet doesn't match (specifically, no value can be found for its implicit parameter), so the compiler looks "upward" for a compatible CanBuildFrom and finds the generic one from Set.
scala> case class Foo(i: Int)
defined class Foo
scala> val result3 = ss.map(Foo(_))
result3: scala.collection.immutable.Set[Foo] = Set(Foo(10), Foo(4), Foo(6), Foo(7), Foo(1), Foo(3), Foo(5), Foo(8), Foo(9), Foo(2))
// The default Set is a hash set, therefore ordering is not preserved
Of course, you can get around this by simply supplying an instance of Ordering[Foo] that does whatever you expect:
scala> implicit val fooIsOrdered: Ordering[Foo] = Ordering.by(_.i)
fooIsOrdered: Ordering[Foo] = scala.math.Ordering$$anon$9@7512dbf2
scala> val result4 = ss.map(Foo(_))
result4: scala.collection.immutable.SortedSet[Foo] = TreeSet(Foo(1), Foo(2),
Foo(3), Foo(4), Foo(5),
Foo(6), Foo(7), Foo(8),
Foo(9), Foo(10))
// And we're back!
Finally, note that toy examples often don't exhibit the problem, because the Scala collection library has special implementations for small (n <= 4) Sets and Maps, and those keep their elements in insertion order.
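If all you need at the end is an ordered List (as in makeBars from the question), another option is to leave the set before mapping. This is only a sketch under that assumption; the data and the `SortedSetMapExample` wrapper are made up, and Bar mirrors the case class from the question:

```scala
import scala.collection.immutable.SortedSet

object SortedSetMapExample {
  case class Bar(s: String)

  val stuff: SortedSet[String] = SortedSet("C", "A", "B", "D", "E", "F")
  val bs: Map[String, String] = Map("A" -> "1", "B" -> "2", "E" -> "5")

  // toList walks the SortedSet in sorted order; mapping the List afterwards
  // keeps that order and does not collapse equal results such as Bar("-").
  val bars: List[Bar] = stuff.toList.map(k => Bar(bs.getOrElse(k, "-")))
}
```

Mapping the set directly would also merge all the `Bar("-")` values into one element, because the result is still a Set. That is why the output in the question has only four entries for six keys.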
You're probably making assumptions about what SortedSet does based on Java. You need to specify what order you want the elements to be in. See http://www.scala-lang.org/docu/files/collections-api/collections_8.html

How to turn a list of objects into a map of two fields in Scala

I'm having a real brain fart here. I'm working with the Play Framework. I have a method which takes a map and turns it into a HTML select element. I had a one-liner to take a list of objects and convert it into a map of two of the object's fields, id and name. However, I'm a Java programmer and my Scala is weak, and I've only gone and forgotten the syntax of how I did it.
I had something like
organizations.all.map {org => /* org.prop1, org.prop2 */ }
Can anyone complete the commented part?
I would suggest:
organizations.all.map { org => (org.id, org.name) }.toMap
e.g.
scala> case class T(val a : Int, val b : String)
defined class T
scala> List(T(1, "A"), T(2, "B"))
res0: List[T] = List(T(1,A), T(2,B))
scala> res0.map(t => (t.a, t.b))
res1: List[(Int, String)] = List((1,A), (2,B))
scala> res0.map(t => (t.a, t.b)).toMap
res2: scala.collection.immutable.Map[Int,String] = Map(1 -> A, 2 -> B)
You could also take an intermediary List out of the equation and go straight to the Map like this:
case class Org(prop1:String, prop2:Int)
val list = List(Org("foo", 1), Org("bar", 2))
val map:Map[String,Int] = list.map(org => (org.prop1, org.prop2))(collection.breakOut)
Using collection.breakOut as the implicit CanBuildFrom allows you to basically skip a step in the process of getting a Map from a List.
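One thing to be aware of: `collection.breakOut` was removed in Scala 2.13, so the line above only compiles on older versions. A sketch of a way to get a similar effect that should work on both old and new versions is to map over an iterator, which also avoids building the intermediate List (Org is copied from the example above; the `ToMapExample` wrapper is just for illustration):

```scala
object ToMapExample {
  case class Org(prop1: String, prop2: Int)

  val list: List[Org] = List(Org("foo", 1), Org("bar", 2))

  // iterator.map is lazy, so the pairs are fed straight into toMap
  // without materialising a List[(String, Int)] first.
  val map: Map[String, Int] =
    list.iterator.map(org => org.prop1 -> org.prop2).toMap
}
```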

What is the correct way to get a subarray in Scala?

I am trying to get a subarray in Scala, and I am a little confused about what the proper way of doing it is. What I would like most is something like how you can do it in Python:
x = [3, 2, 1]
x[0:2]
but I am fairly certain you cannot do this.
The most obvious way to do it would be to use the Java Arrays util library.
import java.util.Arrays
val start = Array(1, 2, 3)
Arrays.copyOfRange(start, 0, 2)
But it always makes me feel a little dirty to use Java libraries in Scala. The most "scalaic" way I found to do it would be
def main(args: List[String]) {
val start = Array(1, 2, 3)
arrayCopy(start, 0, 2)
}
def arrayCopy[A](arr: Array[A], start: Int, end: Int)(implicit manifest: Manifest[A]): Array[A] = {
val ret = new Array[A](end - start)
Array.copy(arr, start, ret, 0, end - start)
ret
}
but is there a better way?
You can call the slice method:
scala> Array("foo", "hoo", "goo", "ioo", "joo").slice(1, 4)
res6: Array[java.lang.String] = Array(hoo, goo, ioo)
It works like in Python.
Imagine you have an array with elements from a to f
scala> val array = ('a' to 'f').toArray // Array('a','b','c','d','e','f')
Then you can extract a sub-array from it in different ways:
Drop the first n elements with drop(n: Int)
array.drop(2) // Array('c','d','e','f')
Take the first n elements with take(n: Int)
array.take(4) // Array('a','b','c','d')
Select any interval of elements with slice(from: Int, until: Int). Note that until is excluded.
array.slice(2,4) // Array('c','d')
The slice method is strictly equivalent to:
array.take(4).drop(2) // Array('c','d')
Exclude the last n elements with dropRight(n: Int):
array.dropRight(4) // Array('a','b')
Select the last n elements with takeRight(n: Int):
array.takeRight(4) // Array('c','d','e','f')
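One difference from plain index access that may be worth knowing: slice clamps out-of-range bounds instead of throwing. A short sketch using the same a-to-f array (the `SliceBoundsExample` wrapper and the value names are made up for illustration):

```scala
object SliceBoundsExample {
  val array: Array[Char] = ('a' to 'f').toArray

  // An upper bound past the end is clamped to the array length.
  val pastEnd: Array[Char] = array.slice(4, 100)

  // A negative lower bound is treated as 0.
  val negativeFrom: Array[Char] = array.slice(-3, 2)

  // When from >= until the result is simply empty.
  val reversed: Array[Char] = array.slice(4, 2)
}
```

Note that, unlike Python, a negative index here does not count from the end of the array; takeRight and dropRight cover that case.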
Reference: Official documentation
An example of extracting specific columns from a 2D Scala Array (original_array):
import scala.collection.mutable.ArrayBuffer
val sub_array = ArrayBuffer[Array[String]]()
val columns_subset: Seq[String] = Seq("ColumnA", "ColumnB", "ColumnC")
val columns_original = original_array(0)
for (column_now <- columns_subset) {
sub_array += original_array.map{_(columns_original.indexOf(column_now))}
}
sub_array