list of string matching with input string - scala

I am learning scala and I have the following issue:
Given a list in input
val listin = List("Apple,January,10",
"Banana,August,15",
"Strawberry,June,20")
and a String val inputstring="Banana,August"
I want to find the price in column matching with the string.
I wrote the following code :
case class Fruit(name:String, month:String,price:Int)
val splitString=inputstring.split(",")
val listSplit=listin.map(_.split(","))
But I don't know how to match the case of equality between the string and a line in the list
The expected result is
val output="Banana_August_15"

Not sure why you want to replace the commas with underscores, or what purpose the case class serves, but this produces the requested result.
listin.filter(_.startsWith(inputstring+","))
.map(_.replaceAllLiterally(",","_"))
//res0: List[String] = List(Banana_August_15)
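If the case class from the question is meant to be used, the same lookup can be sketched by parsing each line into a Fruit first. This is only a sketch: the object name FruitLookup and the helpers parse and find are made up for illustration, and toIntOption assumes Scala 2.13 or later.

```scala
object FruitLookup {
  final case class Fruit(name: String, month: String, price: Int)

  // Parse one "name,month,price" line; None when the line is malformed.
  def parse(line: String): Option[Fruit] =
    line.split(",").map(_.trim) match {
      case Array(name, month, price) =>
        price.toIntOption.map(p => Fruit(name, month, p))
      case _ => None
    }

  // Find the first fruit whose name and month equal the "name,month" key
  // and render it with underscores.
  def find(lines: List[String], key: String): Option[String] =
    key.split(",").map(_.trim) match {
      case Array(name, month) =>
        lines.flatMap(line => parse(line)).collectFirst {
          // Backticks compare against the existing values instead of binding new ones.
          case Fruit(`name`, `month`, price) => s"${name}_${month}_$price"
        }
      case _ => None
    }
}
```

Calling FruitLookup.find(listin, inputstring) returns an Option, so a key with no matching line yields None instead of an empty list.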

Related

Read a tuple from a file in Scala

my Task is to read registrations from a file given like:
Keri,345246,2
Ingar,488058,2
Almeta,422016,1
and insert them into a list(Tuple of (String, Int, Int).
So far I wrote this:
The problem is that I don't understand why I can't cast value2 and value3 to Int, even though they should be Strings because they come from an Array of Strings. Could someone tell me what my mistake is? I am relatively new to Scala.
What is the point of using Scala if you are going to write Java code?
This is how you would properly read a file as a List of case classes.
import scala.io.Source
import scala.util.Using
// Use proper names for the fields.
final case class Registration(field1: String, field2: Int, field3: Int)
// You may change the error handling logic.
def readRegistrationsFromFile(fileName: String): List[Registration] =
Using(Source.fromFile(fileName)) { source =>
source.getLines().map(line => line.split(',').toList).flatMap {
case field1Raw :: field2Raw :: field3Raw :: Nil =>
for {
field2 <- field2Raw.toIntOption
field3 <- field3Raw.toIntOption
} yield Registration(field1 = field1Raw.trim, field2, field3)
case _ =>
None
}.toList
}.getOrElse(default = List.empty)
(feel free to ask any question you may have about this code)
In Scala, converting a String to an Int requires an explicit conversion; a String cannot simply be cast to Int.
This can be achieved like this if you are sure the string can be parsed into an integer:
val value2 = values(1).toInt
If you cannot trust the input (and you probably should not), you can use .toIntOption, which gives you an Option[Int]: defined if the value was converted successfully, or empty if the string did not represent an integer.
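A small sketch of the two conversions side by side, using one of the sample lines from the question. The object name ParseInts and the helper parseLine are illustrative only, and toIntOption assumes Scala 2.13 or later.

```scala
object ParseInts {
  val values: Array[String] = "Keri,345246,2".split(",")

  // toInt throws a NumberFormatException if the text is not a valid integer.
  val value2: Int = values(1).toInt

  // toIntOption never throws; it returns None for invalid input.
  val value3: Option[Int] = values(2).toIntOption
  val invalid: Option[Int] = "abc".toIntOption

  // Combine both conversions: the tuple is built only if both fields parse.
  def parseLine(line: String): Option[(String, Int, Int)] =
    line.split(",") match {
      case Array(name, a, b) =>
        for {
          x <- a.trim.toIntOption
          y <- b.trim.toIntOption
        } yield (name.trim, x, y)
      case _ => None
    }
}
```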
The previous answers are correct. I would add a few more points.
saveContent is declared as a val. This means it cannot be changed (assigned another value). You can use the Scala REPL (command-line) tool to check:
scala> val saveContent = Nil
val saveContent: collection.immutable.Nil.type = List()
scala> saveContent = 3
^
error: reassignment to val
Instead, you could use a var, although it would be more idiomatic to have an overall pattern like the one provided by Luis Miguel's answer - with pattern-matching and a for-comprehension.
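Since the original code is not shown, the following is only a guess at its shape: it assumes saveContent was meant to accumulate one (String, Int, Int) tuple per line. The names Accumulate, withVar and withMap are invented for this sketch, which contrasts the var version with one that needs no reassignment.

```scala
object Accumulate {
  val lines: List[String] =
    List("Keri,345246,2", "Ingar,488058,2", "Almeta,422016,1")

  // Variant 1: a var that is reassigned on every iteration.
  def withVar(input: List[String]): List[(String, Int, Int)] = {
    var saveContent: List[(String, Int, Int)] = Nil
    for (line <- input) {
      val values = line.split(",")
      saveContent = saveContent :+ ((values(0), values(1).toInt, values(2).toInt))
    }
    saveContent
  }

  // Variant 2: no mutation; map builds the resulting list directly.
  def withMap(input: List[String]): List[(String, Int, Int)] =
    input.map { line =>
      val values = line.split(",")
      (values(0), values(1).toInt, values(2).toInt)
    }
}
```

Both variants still throw on malformed lines because they use toInt; they only illustrate the val/var point.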
You can use the Scala REPL to check the types of the variables, too. Splitting a String will always lead to more Strings, not Ints, etc.
> val values = "a,2,3".split(",")
val values: Array[String] = Array(a, 2, 3)
> values(2)
val res3: String = 3
This is why a cast like Gael's is necessary.
In Scala, array elements are accessed with parentheses, not square brackets. See above, and http://scalatutorials.com/tour/interactive_tour_of_scala_lists for more details.

Assigning value to arg with val object(arg) = object

I am following this tutorial on GraphQL with Sangria. I am wondering about the following line
val JsObject(fields) = requestJSON
where requestJSON is an object of JsValue. This way of assigning fields is new to me and my question is, if you could name that pattern or provide me with a link to a tutorial regarding this structure.
The important thing to know is that val definitions support a Pattern on the left-hand side of the assignment, thus providing (a subset of the functionality of) Pattern Matching.
So, your example is equivalent to:
val fields = requestJSON match {
case JsObject(foo) => foo
}
See Scala Language Specification Section 4.1 Value Declarations and Definitions for details.
So, for example, if you have a list l and you want to assign the first element and the rest, you could write:
val x :: xs = l
Or, for the fairly common case where a method returns a tuple, you could write:
val (result1, result2) = foo()
It is the Extractor pattern, you can reach the same result implementing the unapply method on your arbitrary object (like shown in the example). When you create a case class the compiler produces an unapply method for you, so you can do:
case class Person(name : String, surname : String)
val person = Person("gianluca", "aguzzi")
val Person(name, surname) = person
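To illustrate the unapply route on an object that is not a case class, here is a minimal sketch. KeyValue is a hypothetical extractor written for this example; it is not part of Sangria or spray-json.

```scala
object PatternVal {
  // Hypothetical extractor: splits "key=value" into its two parts.
  object KeyValue {
    def unapply(s: String): Option[(String, String)] =
      s.split("=", 2) match {
        case Array(k, v) => Some((k, v))
        case _           => None
      }
  }

  // A pattern on the left-hand side of a val calls KeyValue.unapply.
  val KeyValue(key, value) = "lang=scala"

  // The same syntax destructures a tuple.
  val (quotient, remainder) = (7 / 2, 7 % 2)
}
```

If the pattern does not match (for example a string without "="), such a val definition throws a scala.MatchError at runtime, so a match expression with a fallback case is the safer choice for untrusted input.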

Scala Cast List of Any to list of Int

Given a List of Any:
val l = List(2.9940714E7, 2.9931662E7, 2.993162E7, 2.9931625E7, 2.9930708E7, 2.9930708E7, 2.9931477E7)
I need to cast each item to Int.
Works:
l(1).asInstanceOf[Double].toInt
Not:
l.foreach{_.asInstanceOf[Double].toInt}
> java.lang.String cannot be cast to java.lang.Double
If
l.foreach{_.asInstanceOf[String].toDouble.toInt}
> java.lang.Double cannot be cast to java.lang.String
I'm new to Scala. Please tell me what I'm missing.
Why can I cast one item from the list, but not do the same via an iterator?
Thanks!
It seems as if a String somehow ended up in your List l.
Given a list that is structured like this (with mixed integers, Doubles, and Strings):
val l = List[Any](2.9940714E7, 2.9931625E7, "2.345E8", 345563)
You can convert it to list of integers as follows:
val lAsInts = l.map {
case i: Int => i
case d: Double => d.toInt
case s: String => s.toDouble.toInt
}
println(lAsInts)
This works for Doubles, Ints and Strings. If it still crashes with some exceptions during the cast, then you can add more cases. You can, of course, do the same in a foreach if you want, but in this case it wouldn't do anything, because you don't do anything in the body of foreach except casting. This has no useful side-effects (e.g. it prints nothing).
Another option would be:
val lAsInts = l.map(_.toString.toDouble.toInt)
This is in a way even more forgiving of all kinds of weird input, at least as long as all values are "numeric" in some sense.
However, this is definitely code-smell: how did you get a list with such a wild mix of values to begin with?
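If the list may also contain values that are not numeric at all, a variant that skips them instead of throwing can be sketched as follows. The names ToInts and toIntOpt are illustrative, and silently dropping bad entries is an assumption about what is wanted.

```scala
import scala.util.Try

object ToInts {
  // Convert a single value if possible; None for anything unsupported.
  def toIntOpt(a: Any): Option[Int] = a match {
    case i: Int    => Some(i)
    case d: Double => Some(d.toInt)
    case s: String => Try(s.trim.toDouble.toInt).toOption
    case _         => None
  }

  val l: List[Any] = List(2.9940714E7, 2.9931625E7, "2.345E8", 345563, "oops")

  // flatMap keeps the converted values and drops the entries that failed.
  val ints: List[Int] = l.flatMap(a => toIntOpt(a))
}
```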
Your given List is a List[Double]. You can simply use the map operation to convert it to Int. Try the following code:
val l: List[Double] = List(2.9940714E7, 2.9931662E7, 2.993162E7, 2.9931625E7, 2.9930708E7, 2.9930708E7, 2.9931477E7)
//simply map the list and use toInt
val intLst: List[Int] = l.map(_.toInt)
print(intLst)
//output
//List(29940714, 29931662, 29931620, 29931625, 29930708, 29930708, 29931477)
But suppose you have the same list as a List[Any] instead; then you can use the following to convert it to Int.
val l: List[Any] = List(2.9940714E7, 2.9931662E7, 2.993162E7, 2.9931625E7, 2.9930708E7, 2.9930708E7, 2.9931477E7)
val intLst: List[Int] = l.map(_.asInstanceOf[Double].toInt)
It will give same output as above.

Spark cast column to sql type stored in string

The simple request: I need help adding a column to a dataframe, but the column has to be empty, its type comes from ...spark.sql.types, and the type has to be defined from a string.
I can probably do this with ifs or case but I'm looking for something more elegant. Something that does not require writing a case for every type in org.apache.spark.sql.types
If I do this for example:
df = df.withColumn("col_name", lit(null).cast(org.apache.spark.sql.types.StringType))
It works as intended, but I have the type stored as a string,
var the_type = "StringType"
or
var the_type = "org.apache.spark.sql.types.StringType"
and I can't get it to work by defining the type from the string.
For those interested here are some more details: I have a set containing tuples (col_name, col_type) both as strings and I need to add columns with the correct types for a future union between 2 dataframes.
I currently have this:
for (i <- set_of_col_type_tuples) yield {
val tip = Class.forName("org.apache.spark.sql.types."+i._2)
df = df.withColumn(i._1, lit(null).cast(the_type))
df }
if I use
val the_type = Class.forName("org.apache.spark.sql.types."+i._2)
I get
error: overloaded method value cast with alternatives: (to: String)org.apache.spark.sql.Column <and> (to: org.apache.spark.sql.types.DataType)org.apache.spark.sql.Column cannot be applied to (Class[?0])
if I use
val the_type = Class.forName("org.apache.spark.sql.types."+i._2).getName()
It's a string so I get:
org.apache.spark.sql.catalyst.parser.ParseException: mismatched input '.' expecting {<EOF>, '('}(line 1, pos 3)
== SQL == org.apache.spark.sql.types.StringType
---^^^
EDIT: So, just to be clear, the set contains tuples like this ("col1","IntegerType"), ("col2","StringType") not ("col1","int"), ("col2","string"). A simple cast(i._2) does not work.
Thank you.
You can use overloaded method cast, which has a String as an argument:
val stringType : String = ...
column.cast(stringType)
def cast(to: String): Column
Casts the column to a different data type, using the canonical string
representation of the type.
You can also scan for all Data Types:
val types = classOf[DataTypes]
.getDeclaredFields()
.filter(f => java.lang.reflect.Modifier.isStatic(f.getModifiers()))
.map(f => f.get(new DataTypes()).asInstanceOf[DataType])
Now types is Array[DataType]. You can translate it to Map:
val typeMap = types.map(t => (t.getClass.getSimpleName.replace("$", ""), t)).toMap
and use in code:
column.cast(typeMap(yourType))
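For the simple types mentioned in the question's edit ("IntegerType", "StringType"), another route is to turn the class-style name into the lowercase name that cast(to: String) parses. The sketch below is plain Scala with no Spark dependency; toCastName is a hypothetical helper, and it rests on the assumption that Spark accepts the resulting names ("integer", "string", "double", and so on). Parameterised or nested types such as decimals or arrays are not handled.

```scala
object SparkTypeNames {
  // Hypothetical helper: "IntegerType" -> "integer",
  // "org.apache.spark.sql.types.StringType" -> "string".
  def toCastName(typeName: String): String =
    typeName
      .split('.').last      // drop an optional package prefix
      .stripSuffix("$")     // drop the Scala object marker, if present
      .stripSuffix("Type")
      .toLowerCase

  val columns: Set[(String, String)] =
    Set(("col1", "IntegerType"), ("col2", "StringType"))

  // Column name -> string intended for Column.cast(to: String).
  val castNames: Map[String, String] =
    columns.map { case (name, tpe) => name -> toCastName(tpe) }.toMap

  // With Spark on the classpath, the columns could then be added without a var:
  //   columns.foldLeft(df) { case (acc, (name, tpe)) =>
  //     acc.withColumn(name, lit(null).cast(toCastName(tpe)))
  //   }
}
```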

Runtime exception in syntax like defining two vals with identical names

In some book I've got a code similar to this:
object ValVarsSamples extends App {
val pattern = "([ 0-9] +) ([ A-Za-z] +)". r // RegEx
val pattern( count, fruit) = "100 Bananas"
}
This is supposed to be a trick: it looks like defining two vals with the same name, but it is not.
However, this fails with an exception.
The question: what is this construct supposed to be, and why does it not work?
--
As I understand first: val pattern - refers to RegEx constructor function.. And in second val we are trying to pass the params using such a syntax? just putting a string
This is an extractor:
val pattern( count, fruit) = "100 Bananas"
This code is equivalent to:
val res = pattern.unapplySeq("100 Bananas")
val count = res.get(0)
val fruit = res.get(1)
The problem is your regex doesn't match, you should change it to:
val pattern = "([ 0-9]+) ([ A-Za-z]+)".r
The space before + in [ A-Za-z] + means you are matching a single character in the class [ A-Za-z] and then at least one space character. You have the same issue with [ 0-9] +.
Scala regexes define an extractor, which returns a sequence of matching groups in the regular expression. Your regex defines two groups so if the match succeeds the sequence will contain two elements.
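A short sketch that puts the pieces together: the regex from the book, the corrected regex, and a match expression that avoids the exception when the input does not fit. The object name RegexExtract and the helper parse are illustrative only.

```scala
object RegexExtract {
  // Regex as printed in the book: the stray spaces before + break it.
  val broken = "([ 0-9] +) ([ A-Za-z] +)".r

  // Corrected regex from the answer above.
  val pattern = "([ 0-9]+) ([ A-Za-z]+)".r

  // Extractor syntax: binds count and fruit, throws MatchError if there is no match.
  val pattern(count, fruit) = "100 Bananas"

  // Safer alternative: a match expression with a fallback case.
  def parse(s: String): Option[(String, String)] = s match {
    case pattern(c, f) => Some((c, f))
    case _             => None
  }
}
```

Both extracted groups are Strings; converting count to a number is a separate step (for example with toIntOption).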