Can't access value - scala

scala> val blank_line_accumulator = sc.accumulator(0,"Blank Lines")
blank_line_accumulator: org.apache.spark.Accumulator[Int] = 0
scala> val input_file2 = sc.textFile("file:///home/cloudera/input2.txt").foreach{x=>if(x.length()==0)blank_line_accumulator +=1}
input_file2: Unit = ()
scala> input_file2.value
<console>:40: error: value value is not a member of Unit
       input_file2.value
This is the problem I run into when accessing the value.

I get no error when accessing the value; it worked like a charm. Maybe you are making a simple mistake somewhere else. Start a fresh spark-shell and try again.
scala> blank_line_accumulator.value
res3: Int = 3
To debug this, try the following; it should give Class[Int] = int:
scala> blank_line_accumulator.value.getClass
res4: Class[Int] = int
Then check the class of the accumulator itself, which should give the following:
scala> blank_line_accumulator.getClass
res6: Class[_ <: org.apache.spark.Accumulator[Int]] = class org.apache.spark.Accumulator

foreach doesn't return any useful value, which is represented as Unit (which you can see in the type: input_file2: Unit = ()). Unit doesn't have a value, so there's nothing to access. Probably you meant blank_line_accumulator.value as Ram Ghadiyaram's answer shows.
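A minimal sketch of the same pattern without Spark, assuming an ordinary List in place of the RDD and a local counter in place of the accumulator (the names lines, blankLines and result are made up for illustration). foreach runs only for its side effect and returns Unit, so the count has to be read from the counter and not from the result of foreach.

```scala
object ForeachUnitDemo {
  // Stand-in for the lines of the input file (hypothetical data).
  val lines: List[String] = List("first", "", "second", "", "")

  // Stand-in for the accumulator: a plain local counter.
  var blankLines: Int = 0

  // foreach is executed only for its side effect; its result type is Unit.
  val result: Unit = lines.foreach { line =>
    if (line.isEmpty) blankLines += 1
  }
}
```

Reading ForeachUnitDemo.blankLines gives the count, while result carries no information, which is why calling .value on it cannot compile.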

Related

Eliminating identity wrapper types from Scala APIs

Suppose I am trying to "abstract over execution":
import scala.language.higherKinds
class Operator[W[_]]( f : Int => W[Int] ) {
def operate( i : Int ) : W[Int] = f(i)
}
Now I can define an Operator[Future] or Operator[Task] etc. For example...
import scala.concurrent.{ExecutionContext,Future}
def futureSquared( i : Int ) = Future( i * i )( ExecutionContext.global )
In REPL-style...
scala> val fop = new Operator( futureSquared )
fop: Operator[scala.concurrent.Future] = Operator@105c54cb
scala> fop.operate(4)
res0: scala.concurrent.Future[Int] = Future(<not completed>)
scala> res0
res1: scala.concurrent.Future[Int] = Future(Success(16))
Hooray!
But I also might want a straightforward synchronous version, so I define somewhere
type Identity[T] = T
And I can define a synchronous operator...
scala> def square( i : Int ) : Identity[Int] = i * i
square: (i: Int)Identity[Int]
scala> val sop = new Operator( square )
sop: Operator[Identity] = Operator@18f2960b
scala> sop.operate(9)
res2: Identity[Int] = 81
Sweet.
But, it's awkward that the inferred type of the result is Identity[Int], rather than the simpler, straightforward Int. Of course the two types are really the same, and so are identical in every way. But I'd like clients of my library who don't know anything about this abstracting-over-execution stuff not to be confused.
I could write a wrapper by hand...
class SimpleOperator( inner : Operator[Identity] ) extends Operator[Identity]( inner.operate ) {
override def operate( i : Int ) : Int = super.operate(i)
}
which does work...
scala> val simple = new SimpleOperator( sop )
simple: SimpleOperator = SimpleOperator@345c744e
scala> simple.operate(7)
res3: Int = 49
But this feels very boiler-platey, especially if my abstracted-over-execution class has lots of methods rather than just one. And I'd have to remember to keep the wrapper in sync as the generic class evolves.
Is there some more generic, maintainable way to get a version of Operator[Identity] that makes the containing type disappear from the type inference and API docs?
This is more of a long comment than an answer...
"But, it's awkward that the inferred type of the result is Identity[Int], rather than the simpler, straightforward Int. Of course the two types are really the same, and so are identical in every way. But I'd like clients of my library who don't know anything about this abstracting-over-execution stuff not to be confused."
This sounds like you want to convert Identity[T] back to T. Have you considered type ascription?
scala> def f[T](t: T): Identity[T] = t
scala> f(3)
// res11: Identity[Int] = 3
scala> f(3): Int
// res12: Int = 3
// So in your case
scala> sop.operate(9): Int
// res18: Int = 81
As Steve Waldman suggested in the comments, given type Identity[T] = T, the types T and Identity[T] really are identical without any ceremony, substitutable and transparent at call sites or anywhere else. For example, the following works fine out of the box:
sop.operate(9) // res2: cats.Id[Int] = 81
def foo(i: Int) = i
foo(sop.operate(9)) // res3: Int = 81
extract from Cats is the dual of pure and extracts the value from its context, so perhaps we could provide similar methods for users not familiar with the above equivalence (like myself if you see my previous edit).
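The equivalence described above can be checked with a small self-contained sketch. It reuses Operator, Identity and square from the question; IdentityAliasDemo, plain, addOne and viaMethod are names invented here for illustration.

```scala
import scala.language.higherKinds

object IdentityAliasDemo {
  type Identity[T] = T

  class Operator[W[_]](f: Int => W[Int]) {
    def operate(i: Int): W[Int] = f(i)
  }

  def square(i: Int): Identity[Int] = i * i

  val sop = new Operator[Identity](square)

  // Identity[Int] is Int, so a plain Int annotation needs no conversion.
  val plain: Int = sop.operate(9)

  // A method that knows nothing about Identity accepts the result directly.
  def addOne(i: Int): Int = i + 1
  val viaMethod: Int = addOne(sop.operate(9))
}
```

Only the displayed type differs; nothing has to be unwrapped at the call site.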
This can be done by providing the types explicitly, but it still looks magical to external users inspecting the method signature.
type Identity[T] = T
def square( i : Int ):Int = i * i
class Operator[W[_], T <: W[Int] ]( f : Int => T ) {
def operate(i : Int):T = f(i)
}
val op = new Operator[Identity,Int](square)
op.operate(5)
//res0: Int = 25
Works for new Operator[Future,Future[Int]] as well.
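A sketch of both instantiations side by side, assuming the two-parameter Operator from this answer and futureSquared from the question. TypedOperatorDemo, syncOp, asyncOp and runAsync are hypothetical names, and the five-second timeout is an arbitrary choice for the example.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.language.higherKinds

object TypedOperatorDemo {
  type Identity[T] = T

  class Operator[W[_], T <: W[Int]](f: Int => T) {
    def operate(i: Int): T = f(i)
  }

  def square(i: Int): Int = i * i
  def futureSquared(i: Int): Future[Int] = Future(i * i)(ExecutionContext.global)

  // Synchronous version: the result type is plain Int.
  val syncOp = new Operator[Identity, Int](square)
  val syncResult: Int = syncOp.operate(5)

  // Asynchronous version: the result type is Future[Int].
  val asyncOp = new Operator[Future, Future[Int]](futureSquared)

  // A def rather than a val, so the blocking wait does not run
  // while the object itself is still being initialized.
  def runAsync(): Int = Await.result(asyncOp.operate(5), 5.seconds)
}
```

The cost of this design is that every caller has to spell out the second type argument.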

Get max in a ListBuffer of Some[Long] in Scala

This piece of code works fine and returns 343423 as expected:
val longList: ListBuffer[Long] = ListBuffer(103948,343423,209754)
val maxLong = longList.max
But it doesn't work for Some[Long]:
val longSomeList: ListBuffer[Some[Long]] = ListBuffer(Some(103948),Some(343423),Some(209754))
val maxSomeLong = longSomeList.max
Error: No implicit Ordering defined for Some[Long].
val maxSomeLong = longSomeList.max
Is there any simple solution to get the max of the second list?
The max function comes from TraversableForwarder (scala.collection.generic).
In which real world scenario would you have a ListBuffer[Some[Long]]? You could just as well have a ListBuffer[Long] then.
This works:
val longSomeList: ListBuffer[Option[Long]] = ListBuffer(Some(103948),Some(343423),Some(209754))
val maxSomeLong = longSomeList.max
You are looking for .flatten.
longSomeList.flatten.max
Or give it the ordering to use explicitly:
longSomeList
.max(Ordering.by[Option[Long], Long](_.getOrElse(Long.MinValue)))
Also, don't use mutable collections.
longSomeList.collect { case Some(n) => n }.max
The problem is that you are trying to order elements of type Some[Long], for which no Ordering is defined. The compiler does not know how to compare these:
scala> Some(1) < Some(2)
<console>:8: error: value < is not a member of Some[Int]
Some(1) < Some(2)
^
What you can do is either unwrap the Somes to get the Longs
longSomeList.flatten.max
or define your own implicit ordering like this:
implicit object Ord extends Ordering[Some[Long]] {
def compare(a: Some[Long], b: Some[Long]) = a.getOrElse(Long.MinValue) compare b.getOrElse(Long.MinValue)
}
and then:
scala> longSomeList.max
res12: Some[Long] = Some(343423)
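The approaches from the answers above can be compared in one sketch. MaxOfSomeDemo and the val names are invented for illustration, and the Long literals carry an explicit L suffix; treat it as a sketch and not as the definitive way to do this.

```scala
import scala.collection.mutable.ListBuffer

object MaxOfSomeDemo {
  val longSomeList: ListBuffer[Some[Long]] =
    ListBuffer(Some(103948L), Some(343423L), Some(209754L))

  // Unwrap first, then take the max of the plain Longs.
  val viaFlatten: Long = longSomeList.flatten.max
  val viaCollect: Long = longSomeList.collect { case Some(n) => n }.max

  // Ask max to compare the elements as Option[Long], for which the
  // standard library already provides an implicit Ordering.
  val viaOptionOrdering: Some[Long] = longSomeList.max[Option[Long]]

  // Pass an Ordering for Some[Long] explicitly instead of defining an implicit.
  val viaExplicitOrdering: Some[Long] =
    longSomeList.max(Ordering.by[Some[Long], Long](_.get))
}
```

The first two variants return a bare Long, the last two keep the Some wrapper.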

How to get the type of a field using reflection?

Is there a way to get the Type of a field with scala reflection?
Let's see the standard reflection example:
scala> import scala.reflect.runtime.{universe => ru}
scala> class C { val x = 2; var y = 3 }
defined class C
scala> val m = ru.runtimeMirror(getClass.getClassLoader)
m: scala.reflect.runtime.universe.Mirror = JavaMirror ...
scala> val im = m.reflect(new C)
im: scala.reflect.runtime.universe.InstanceMirror = instance mirror for C@5f0c8ac1
scala> val fieldX = ru.typeOf[C].declaration(ru.newTermName("x")).asTerm.accessed.asTerm
fieldX: scala.reflect.runtime.universe.TermSymbol = value x
scala> val fmX = im.reflectField(fieldX)
fmX: scala.reflect.runtime.universe.FieldMirror = field mirror for C.x (bound to C@5f0c8ac1)
scala> fmX.get
res0: Any = 2
Is there a way to do something like
val test: Int = fmX.get
In other words, can I "cast" the result of a reflective get to the actual type of the field? And otherwise, is it possible to do a reflective set from a string? In the example, something like
fmX.set("10")
Thanks for hints!
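Here's the deal: the type is not known at compile time, so you have to tell the compiler what type it's supposed to be. You can do it either unsafely with a cast or safely with a pattern match, like this:
// Unsafe: throws a ClassCastException if the value is not an Int
val test: Int = fmX.get.asInstanceOf[Int]
// Safe: check the runtime type and handle the unexpected case yourself
val test: Int = fmX.get match {
case n: Int => n
case _ => 0 // or however you want to handle the exception
}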
Note that, since you declared test to be Int, you have to assign an Int to it. And even if you kept test as Any, at some point you have to pick a type for it, and it is always going to be something static -- as in, in the source code.
The second case just uses pattern matching to ensure you have the right type.
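A small sketch of the two options that runs without reflection, assuming a plain Any value standing in for the result of fmX.get. AnyToIntDemo, raw and asInt are hypothetical names, and returning an Option in place of a default of 0 is a choice made for this example.

```scala
object AnyToIntDemo {
  // Stand-in for the untyped result of fmX.get (hypothetical value).
  val raw: Any = 2

  // Unchecked: throws a ClassCastException if raw is not an Int.
  val unsafe: Int = raw.asInstanceOf[Int]

  // Checked: Some(n) only when the runtime value really is an Int.
  def asInt(value: Any): Option[Int] = value match {
    case n: Int => Some(n)
    case _      => None
  }

  val checked: Option[Int] = asInt(raw)
  val rejected: Option[Int] = asInt("2")
}
```

Returning an Option pushes the decision about the failure case to the caller.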
I'm not sure I understand what you mean by your second question (the reflective set from a string).

InstanceOf some type from runtime, Scala

The idea is that, for example, we have the type of some object:
val tm = getTypeTag("String here").tpe
//> tm: reflect.runtime.universe.Type = java.lang.String
// for example I got another val or var, of some type:
val tmA: Any = "String here"
//> tmA: Any = String here
How can I do something like tmA.InstanceOf(tm) (this is pseudocode)? tm is not a type alias, so we can't write InstanceOf[tm] directly.
EDITED
I mean an analogue of asInstanceOf, to perform a kind of type cast.
EDITED2
I'll partly answer my own question. If we have a TypeTag, it is all easy!
def tGet[T](t: TypeTag[T], obj: Any): T = obj.asInstanceOf[T]
The situation is harder if we only have a Type and not the whole TypeTag[T].
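One caveat about tGet: because T is erased, asInstanceOf[T] performs no runtime check inside the method. A checked variant is possible when a ClassTag is available. Note that this swaps in ClassTag for TypeTag and does not solve the case where only a Type is known; castTo and SafeCastDemo are hypothetical names for this sketch.

```scala
import scala.reflect.ClassTag

object SafeCastDemo {
  // Returns Some(obj) typed as T when obj's runtime class matches T, else None.
  // Only the erased class is checked: type arguments such as the String in
  // List[String] are not verified.
  def castTo[T](obj: Any)(implicit tag: ClassTag[T]): Option[T] = tag.unapply(obj)

  val tmA: Any = "String here"

  val asString: Option[String] = castTo[String](tmA)
  val notAString: Option[String] = castTo[String](List("String here"))
}
```

The caller gets an Option back and never sees a ClassCastException.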
You can use a mirror to reflect the instance:
import scala.reflect.runtime.universe._
val mirror = runtimeMirror(getClass.getClassLoader)
def isTm(a: Any) = mirror.reflect(a).symbol == tm.typeSymbol
And then:
scala> isTm("String here": Any)
res0: Boolean = true
scala> isTm(List("String here"): Any)
res1: Boolean = false
I don't think I have to tell you what a bad idea this is, though.
You just need to use the singleton type of an existing variable, written as the variable name followed by .type.
As an example, you can write:
val h ="hello"
val b:Any = "hhhh"
val stringB: String = b.asInstanceOf[h.type]
println(stringB)

How do I use Scala Hashmaps and Tuples together correctly?

My code is as follows
import scala.collection.mutable.HashMap
type CrossingInterval = (Date, Date)
val crossingMap = new HashMap[String, CrossingInterval]
val crossingData: String = ...
Firstly why does the following line compile?
val time = crossingMap.getOrElse(crossingData, -1)
I would have thought -1 would have been an invalid value
Secondly how do I do a basic check such as the following
if (value exists in map) {
}
else {
}
In Java I would just check for null values. I'm not sure about the proper way to do it in Scala
Typing your code in the interpreter shows why the first statement compiles:
type Date = String
scala> val time = crossingMap.getOrElse(crossingData, -1)
time: Any = -1
Basically, getOrElse on a Map[A, B] (here B = CrossingInterval) accepts a default of any type B1 >: B, meaning that B1 must be a supertype of B. Here B1 = Any, and -1 is of course a valid value of type Any. In this case you actually want a type declaration for time, so that the compiler rejects -1.
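A sketch of the widening described above, with Date replaced by String as in this answer. GetOrElseDemo, widened, typed, the key k1 and the sample dates are made up for illustration.

```scala
import scala.collection.mutable.HashMap

object GetOrElseDemo {
  // Date is replaced by String to keep the sketch self-contained.
  type CrossingInterval = (String, String)

  val crossingMap = new HashMap[String, CrossingInterval]
  crossingMap.update("k1", ("2020-01-01", "2020-01-02"))

  // Compiles because the result type is widened to a common supertype of
  // CrossingInterval and Int, which here is Any.
  val widened: Any = crossingMap.getOrElse("missing", -1)

  // With a declared type, the default must itself be a CrossingInterval;
  // passing -1 here would be rejected by the compiler.
  val typed: CrossingInterval = crossingMap.getOrElse("missing", ("", ""))
}
```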
For testing whether a key belongs to the map, just call the contains method. An example is below - since Date was not available, I simply defined it as an alias to String.
scala> crossingMap.contains(crossingData)
res13: Boolean = false
scala> crossingMap += "" -> ("", "")
res14: crossingMap.type = Map("" -> ("",""))
// Now "" is a key of the map
scala> crossingMap.contains("")
res15: Boolean = true
If you want to check whether a value is part of the map, the simplest way is to write this code:
crossingMap.values.toSet.contains(("", ""))
However, this builds a Set containing all values. EDIT: You can find a better solution for this subproblem in Kipton Barros's comment.
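Two alternatives are sketched below, offered as assumptions and not as the solution from the comment mentioned above: Map.get with a pattern match for the "if the key exists ... else ..." check from the question, and values.exists for a value check that builds no intermediate Set. MapLookupDemo, describe, hasValue and the sample entry are hypothetical.

```scala
import scala.collection.mutable.HashMap

object MapLookupDemo {
  type CrossingInterval = (String, String)

  val crossingMap = new HashMap[String, CrossingInterval]
  crossingMap.update("k1", ("start", "end"))

  // Key check plus retrieval in one step: get returns an Option, not null.
  def describe(key: String): String = crossingMap.get(key) match {
    case Some((from, to)) => s"$key: $from to $to"
    case None             => s"$key: not present"
  }

  // Value check that stops at the first match and builds no intermediate Set.
  def hasValue(v: CrossingInterval): Boolean = crossingMap.values.exists(_ == v)
}
```

Both calls still scan or hash on every invocation; if value lookups are frequent, a second map keyed by value may be the better structure.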