What 28 frames are elided when dividing by zero in the REPL? - scala

scala> 5 / 0
java.lang.ArithmeticException: / by zero
... 28 elided
Twenty-eight frames elided for a simple arithmetic expression?! What are these frames, why does Scala need that many to do safe division, and why are they being elided in the first place?

scala> import scala.util.Try
import scala.util.Try
scala> Try(5/0)
res2: scala.util.Try[Int] = Failure(java.lang.ArithmeticException: / by zero)
scala> res2.recover { case e: ArithmeticException => e.printStackTrace }
java.lang.ArithmeticException: / by zero
at $line8.$read$$iw$$iw$$anonfun$1.apply$mcI$sp(<console>:13)
at $line8.$read$$iw$$iw$$anonfun$1.apply(<console>:13)
at $line8.$read$$iw$$iw$$anonfun$1.apply(<console>:13)
at scala.util.Try$.apply(Try.scala:192)
at $line8.$read$$iw$$iw$.<init>(<console>:13)
at $line8.$read$$iw$$iw$.<clinit>(<console>)
at $line8.$eval$.$print$lzycompute(<console>:7)
at $line8.$eval$.$print(<console>:6)
at $line8.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:74)
at scala.tools.nsc.MainGenericRunner.run$1(MainGenericRunner.scala:87)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:98)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:103)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
res3: scala.util.Try[AnyVal] = Success(())
The elided frames are essentially the REPL's own overhead: reading the line, compiling it into a class, and constructing an instance of that class, which is where the code written in the REPL is actually evaluated.
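As a rough, hedged sketch of that overhead (the wrapper names below are only illustrative of the generated $read/$iw/$eval classes visible in the trace above, not the exact code the interpreter emits), the REPL compiles each line into nested wrapper objects and then loads and initializes the result reflectively:
object $read {
  object $iw {
    val res0 = 5 / 0   // the user's expression runs inside the wrapper's initializer
  }
}
// The interpreter then reflectively invokes a generated entry point that forces this
// initializer, which is why the trace shows Method.invoke, IMain$ReadEvalPrint.call,
// ILoop.interpret and friends: those are the elided frames.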

Related

How can I cast WrappedArray to List in Spark Scala?

I use a DataFrame to handle data in Spark, and this DataFrame has an array column. At the end of all the transformations I want to do, I have a DataFrame with one array column and one row. In order to apply groupBy, map, and reduce I want to have this array as a List, but I can't do it.
.drop("ScoresArray")
.filter($"min_score" < 0.2)
.select("WordsArray")
.agg(collect_list("WordsArray"))
.withColumn("FlattenWords", flatten($"collect_list(WordsArray)"))
.drop("collect_list(WordsArray)")
.collect()
val test1 = words(0).getAs[immutable.List[String]](0)
Here is the error message:
[error] (run-main-0) java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.collection.immutable.List
[error] java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.collection.immutable.List
[error] at analysis.Analysis$.main(Analysis.scala:37)
[error] at analysis.Analysis.main(Analysis.scala)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error] at java.lang.reflect.Method.invoke(Method.java:498)
[error] stack trace is suppressed; run last Compile / bgRun for the full output
Thoughts?
You can't cast a WrappedArray to a List, but you can convert one to the other.
val test1 = words(0).getSeq[String](0).toList
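As a self-contained illustration (a minimal sketch; it assumes a SparkSession named spark and makes up a one-row DataFrame in place of the transformations from the question):
import spark.implicits._
// One row with a single array<string> column, standing in for the question's result:
val df = Seq(Seq("a", "b", "c")).toDF("WordsArray")
val row = df.collect()(0)
// getSeq reads the array column as a Seq, which can then be converted to a List:
val asList: List[String] = row.getSeq[String](0).toList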

How to convert an Array of Any elements into a DataFrame in Spark Scala?

I have an Array like Array[(Any, Any, Any)]. For example:
l1 = [(a,b,c),(d,e,f),(x,y,z)]
I want to convert it to a DataFrame as:
c1 c2 c3
a b c
d e f
x y z
I tried to convert the existing DataFrame to a list:
val l1 = test_df.select("c1", "c2", "c3").rdd.map(x =>
  (x(0), x(1), x(2))).collect()
println(l1)
val c = Seq(l1).toDF("c1", "c2", "c3")
c.show()
But it is throwing this error:
Exception in thread "main" java.lang.ClassNotFoundException: scala.Any
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
In Pyspark:
l1 = [('a','b','c'),('d','e','f'),('x','y','z')]
sdf=spark.createDataFrame(l1)
sdf.show()
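The same idea works in Scala once the tuples have a concrete element type; the ClassNotFoundException: scala.Any appears because Spark cannot derive an encoder for Any. A minimal sketch (assuming a SparkSession named spark):
import spark.implicits._
// Typing the tuples as (String, String, String) instead of (Any, Any, Any)
// lets Spark derive an encoder, so toDF no longer fails:
val l1 = Seq(("a", "b", "c"), ("d", "e", "f"), ("x", "y", "z"))
val sdf = l1.toDF("c1", "c2", "c3")
sdf.show()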

Scala hex string to integer conversion

In Scala, I can't seem to convert hex-strings back to integers:
val cols = Array(0x2791C3FF, 0x5DA1CAFF, 0x83B2D1FF, 0xA8C5D8FF,
0xCCDBE0FF, 0xE9D3C1FF, 0xDCAD92FF, 0xD08B6CFF,
0xC66E4BFF, 0xBD4E2EFF)
cols.map({ v => Integer.toHexString(v)}).map(v => Integer.parseInt(v, 16))
I get the following error message:
java.lang.NumberFormatException: For input string: "83b2d1ff"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:583)
at $anonfun$3.apply(<console>:12)
at $anonfun$3.apply(<console>:12)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
... 35 elided
Int is too small. Use Long.
scala> 0x83B2D1FFDL
res3: Long = 35352551421
or
scala> java.lang.Long.decode("0x83B2D1FFD")
res4: Long = 35352551421
and back
scala> java.lang.Long.toHexString(res3)
res5: String = 83b2d1ffd
Try it as unsigned values.
cols.map(Integer.toHexString).map(Integer.parseUnsignedInt(_, 16))
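To see why the round trip fails in the first place, here is a minimal sketch: Integer.toHexString prints the unsigned 32-bit form, so a negative Int comes back as a string that is out of range for the signed Integer.parseInt, while parseUnsignedInt restores the original bit pattern.
val v = 0x83B2D1FF                            // a negative Int (top bit set)
val hex = Integer.toHexString(v)              // "83b2d1ff", the unsigned form
// Integer.parseInt(hex, 16)                  // would throw NumberFormatException
val back = Integer.parseUnsignedInt(hex, 16)  // same 32-bit pattern as v
assert(back == v)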

Spark java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.util.ArrayList

Spark is throwing a ClassCastException when performing any operation on a WrappedArray.
Example:
I have a map output like the one below.
Output:
Map(1 -> WrappedArray(Pan4), 2 -> WrappedArray(Pan15), 3 -> WrappedArray(Pan16, Pan17, Pan18), 4 -> WrappedArray(Pan19, Pan1, Pan2, Pan3, Pan4, Pan5, Pan6))]
When map.values is invoked, it prints the following output:
MapLike(WrappedArray(Pan4), WrappedArray(Pan15), WrappedArray(Pan16, Pan17, Pan18), WrappedArray(Pan19, Pan1, Pan2, Pan3, Pan4, Pan5, Pan6))
The exception is thrown when invoking map.values.map(arr => arr) or map.values.forEach { value => println(value)}.
I am not able to perform any operation on the WrappedArray; I just need the number of elements in each WrappedArray.
Error StackTrace
------------------
java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.util.ArrayList
at WindowTest$CustomMedian$$anonfun$1.apply(WindowTest.scala:176)
at WindowTest$CustomMedian$$anonfun$1.apply(WindowTest.scala:176)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Map$Map4.foreach(Map.scala:181)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at WindowTest$CustomMedian.evaluate(WindowTest.scala:176)
at org.apache.spark.sql.execution.aggregate.ScalaUDAF.eval(udaf.scala:446)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$35.apply(AggregationIterator.scala:376)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$35.apply(AggregationIterator.scala:368)
at org.apache.spark.sql.execution.aggregate.SortBasedAggregationIterator.next(SortBasedAggregationIterator.scala:154)
at org.apache.spark.sql.execution.aggregate.SortBasedAggregationIterator.next(SortBasedAggregationIterator.scala:29)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I resolved the error by converting to Seq (sequence type).
Earlier:
val bufferMap: Map[Int, util.ArrayList[String]] = buffer.getAs[Map[Int, util.ArrayList[String]]](1)
Modified:
val bufferMap: Map[Int, Seq[String]] = buffer.getAs[Map[Int, Seq[String]]](1)
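Since the question only needs the number of elements in each WrappedArray, reading the values back as Seq is enough; a minimal sketch using the same buffer and column index as above:
val bufferMap: Map[Int, Seq[String]] = buffer.getAs[Map[Int, Seq[String]]](1)
// The element counts the question asks for, keyed by the original map keys:
val sizes: Map[Int, Int] = bufferMap.map { case (k, vs) => k -> vs.size }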
For those using Spark with Java, encode the dataset into an object instead of using Row and the getAs method.
Suppose a dataset with some information about a machine:
+-----------+------------+------------+-----------+---------+--------------------+
|epoch | RValues| SValues| TValues| ids| codes|
+-----------+------------+------------+-----------+---------+--------------------+
| 1546297225| [-1.0, 5.0]| [2.0, 6.0]| [3.0, 7.0]| [2, 3]|[MRT0000020611, M...|
| 1546297226| [-1.0, 3.0]| [-6.0, 6.0]| [3.0, 4.0]| [2, 3]|[MRT0000020611, M...|
| 1546297227| [-1.0, 4.0]|[-8.0, 10.0]| [3.0, 6.0]| [2, 3]|[MRT0000020611, M...|
| 1546297228| [-1.0, 6.0]|[-8.0, 11.0]| [3.0, 5.0]| [2, 3]|[MRT0000020611, M...|
+-----------+------------+------------+-----------+---------+--------------------+
Instead of a Dataset<Row>, create a Dataset<MachineLog> that matches this dataset's column definitions, and create the MachineLog class. When doing a transformation, use .as(Encoders.bean(MachineLog.class)) to define the encoder.
For example:
spark.createDataset(dataset.rdd(), Encoders.bean(MachineLog.class));
But converting from a Dataset to an RDD is not recommended; try to use the as method instead.
Dataset<MachineLog> mLog = spark.read().parquet("...").as(Encoders.bean(MachineLog.class));
It can also be used after a transformation.
Dataset<MachineLog> machineLogDataset = aDataset
    .join(
        otherDataset,
        functions.col("...").eqNullSafe("...")
    )
    .as(Encoders.bean(MachineLog.class));
Take into account that the MachineLog class must obey the serialization rules (i.e., have an explicit no-argument constructor, and getters and setters).
Try the below:
map.values.array.forEach { value => println(value)}
array is a method in WrappedArray. It returns Array[T], where T is the type of the elements in the WrappedArray.
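For example (an illustrative sketch, assuming map is the Map[Int, WrappedArray[String]] shown in the question):
// Unwrap each WrappedArray to a plain Array and take its length:
val sizes = map.values.map(_.array.length)
sizes.foreach(println)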

Why does a Scala class in the worksheet with the same name but different case as the worksheet cause an exception?

When I have the following class I get an exception.
object rational {
println("Welcome to the Scala worksheet") //> Welcome to the Scala worksheet
val x = new Rational(1,2) //> java.lang.NoClassDefFoundError: Rational (wrong name: rational)
//| at java.lang.ClassLoader.defineClass1(Native Method)
//| at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
//| at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
//| at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:14
//| 1)
//| at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
//| at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
//| at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
//| at java.security.AccessController.doPrivileged(Native Method)
//| at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
//| at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
//| at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
//| at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
//| at rational$$anonfun$main$1.apply$mcV$sp(rational.scala:3)
//| at org.scalaide.worksheet.runtime.library.WorksheetSupport$$ano
//| Output exceeds cutoff limit.
}
class Rational(x: Int, y: Int) {
  def numer = x
  def denom = y
  def add(that: Rational) =
    new Rational(
      numer * that.denom + denom * that.numer, denom * that.denom
    )
  override def toString() = numer + "/" + denom
}
When I change the name of object rational to rationals, everything is fine.
This is probably because of the case-insensitivity of the Windows file system. This would get messed up in Java too. The compiled .class files for the upper- and lower-case versions are indistinguishable and cannot both exist in the filesystem at the same time.
The object rational generates both a rational$.class and a rational.class.
The class Rational generates a Rational.class, which appears to be overwritten by rational.class. When the JVM tries to read Rational.class it finds rational.class, sees that the content is wrong, and complains bitterly about having been led astray.
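A minimal sketch of the fix the question already found: give the worksheet object a name that no longer collides with the class when case is ignored.
object rationals {                 // rationals.class no longer clobbers Rational.class
  val x = new Rational(1, 2)
  println(x)                       // prints 1/2
}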