Kafka Scala: How to move val into try catch block

I'd like to move:
val kafkaPartitionOffset = kafkaConsumer.endOffsets(consumedPartitions.asJava)
into a try catch block like so:
val kafkaPartitionOffset : SomeClass =
try {
kafkaConsumer.endOffsets(consumedPartitions.asJava)
} catch {
case e: Exception => {
log.error(s"${consumerGroupId} Could not get Kafka offset", e)
None
}
}
But I'm having trouble working out what SomeClass should be. I've tried Map[TopicPartition, Long], but the compiler reports a type mismatch. Any help is appreciated, thank you!
Update: I've also tried Any, but then I'm unable to call kafkaPartitionOffset.get(topicPartition) below (get is highlighted in red with the error message "cannot resolve symbol get"):
for((topicPartition,OffsetAndMetadata) <- mapTopicPartitionOffset){
val bbCurrentOffset = OffsetAndMetadata.get(topicPartition)
// latest offset
val partitionLatestOffset = kafkaPartitionOffset.get(topicPartition)
// Log for a particular partition
val delta = partitionLatestOffset - bbCurrentOffset
topicOffsetList += delta.abs
}

Take a look at this:
val x = try {
throw new RuntimeException("runtime ex")
"some string"
} catch { case _: RuntimeException => 2 }
The compiler needs to know the type of x before runtime, since x can be used somewhere else in your code, right? So the compiler says:
"Hmm, what is a type that this literal "some string" is of that type,
and also, literal 2 is of that type?"
So it looks for the least upper bound (the most specific common supertype) of String and Int, which is Any in this case, so val x: Any = .... I'm not sure what the expression kafkaConsumer.endOffsets(...) returns. If it returns Option[T], then SomeClass would also be Option[T], since you're returning None in the catch block. If not, do not use None there just because nothing else seems to fit; there are better approaches to exception handling.
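A minimal sketch of that inference, assuming a hypothetical risky function that is not part of the question: when the two branches have unrelated types the result is Any, and when both branches are made to produce an Option the result type is useful again.

```scala
// Hypothetical function that either throws or returns a String
def risky(fail: Boolean): String =
  if (fail) throw new RuntimeException("runtime ex") else "some string"

// Branch types are String and Int, so the only common type is Any
val x: Any = try risky(fail = true) catch { case _: RuntimeException => 2 }

// Both branches are an Option[String], so the type stays precise
val y: Option[String] =
  try Some(risky(fail = true)) catch { case _: RuntimeException => None }
```

With x typed as Any, no String or Int methods are available on it, which is the same reason get could not be resolved in the question.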
In any case, Scala provides utility types that let you avoid this kind of try/catch as much as possible. I would recommend using Try in this case:
val kafkaPartitionOffset: Try[Whatever-endOffsets-returns] =
Try(kafkaConsumer.endOffsets(consumedPartitions.asJava))
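For reference, KafkaConsumer.endOffsets returns a java.util.Map[TopicPartition, java.lang.Long], which is why Map[TopicPartition, Long] did not type-check. The sketch below shows the same shape without a Kafka dependency: fetchEndOffsets is a hypothetical stand-in for the consumer call, partitions are plain strings, and the converters import assumes Scala 2.13 (on 2.12 it would be scala.collection.JavaConverters).

```scala
import java.{lang => jl, util => ju}
import scala.jdk.CollectionConverters._
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for kafkaConsumer.endOffsets: returns a Java map
// with boxed Long values and throws when it cannot answer
def fetchEndOffsets(partitions: ju.Collection[String]): ju.Map[String, jl.Long] = {
  if (partitions.isEmpty) throw new IllegalStateException("no partitions assigned")
  partitions.asScala.map(p => p -> jl.Long.valueOf(42L)).toMap.asJava
}

// Wrap the call in Try, then convert the Java map to an immutable Scala map
def endOffsets(partitions: Seq[String]): Map[String, Long] =
  Try(fetchEndOffsets(partitions.asJava)) match {
    case Success(javaMap) =>
      javaMap.asScala.map { case (tp, off) => tp -> off.longValue() }.toMap
    case Failure(e) =>
      println(s"Could not get offsets: ${e.getMessage}")
      Map.empty
  }
```

Falling back to an empty map is only one possible choice; returning the Try itself keeps the failure visible to the caller.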
By the way, the title of the question doesn't match the actual question, please consider changing the title :)

Related

Using Option with .map() and .getOrElse()

I am trying to read a value from a Map[String, String] given a key.
This key|value is optional, in that it might not be there
So, I want to use Option and then map & getOrElse as below to write the value if it's there, or set it to some default in case it's not there.
val endpoint:String = Option(config.getString("endpoint"))
.map(_.value())
.getOrElse()
The code above fails with "Symbol value is inaccessible from this place"
config is a Map[String, Object]
getString is a method on config that takes in the key, and returns the value
public String getString(String key){
<...returns value...>
}
I could just drop the Option() and do the following, but then I have to deal with the exception that will be thrown by getString():
val endpoint:String = config.getString("endpoint")
Any ideas what's wrong with this, or how to fix this?
Better ways of writing this?
UPDATE: I need to mention that config is an object in an imported Java library. Not sure if that makes a difference or not.
If I understand your question correctly, config.getString will throw an exception when the key is not present. In this case, wrapping the call in Option() will not help catch that exception: you should wrap in Try instead and convert that to an Option.
Try[String] represents a computation that can either succeed and become a Success(String), or fail and give you a Failure(thrownException). If you're familiar with Option, this is very similar to the two possibilities of Some and None, except that Failure wraps the exception so that you know what caused the problem. The Try(someComputation) method does something like this for you (the real implementation catches only non-fatal throwables, via scala.util.control.NonFatal):
try {
Success(someComputation)
} catch {
case ex: Exception => Failure(ex)
}
The second thing to consider is what you actually want to happen when there is no value. One sensible idea would be to provide a default configuration, and this is what getOrElse is for: you can't use it without giving it a default value!
Here is an example:
val endpoint = Try(config.getString("endpoint"))
.toOption
.getOrElse("your_default_value")
We can do even better: now that we're using Try to catch the exception, there is no need to convert to Option if we're going to access the value right away.
val endpoint = Try(config.getString("endpoint")).getOrElse("your_default_value")
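Here is a self-contained sketch of that one-liner. The Config class is a hypothetical stand-in for the Java library object, assumed to throw when a key is missing, as described in the question.

```scala
import scala.util.Try

// Hypothetical stand-in for the imported Java config object:
// getString throws when the key is absent
class Config(values: Map[String, String]) {
  def getString(key: String): String =
    values.getOrElse(key, throw new NoSuchElementException(s"missing key: $key"))
}

val config = new Config(Map("endpoint" -> "http://localhost:8080"))

// Key present: the configured value is used
val endpoint: String = Try(config.getString("endpoint")).getOrElse("http://default")

// Key absent: the exception is captured and the default is used instead
val timeout: String = Try(config.getString("timeout")).getOrElse("30s")
```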
You can get a value from a map like this.
val m: Map[String, String] = Map("foo" -> "bar")
val res = m.get("foo").getOrElse("N.A")
val res2 = m.getOrElse("foo", "N.A") // same as above but cleaner
But perhaps if you want to use pattern matching:
val o: Option[String] = m.get("foo")
val res: String = o match {
case Some(value) => value
case None => "N.A"
}
Finally, a safe way to handle reading from config.
val endpoint:String = config.getString("endpoint") // this can return null
val endpoint: Option[String] = Option(config.getString("endpoint")) // None if getString returns null; this does not catch exceptions
I suspect the config object might even have a method like
val endpoint: Option[String] = config.getStringOpt("endpoint")
Then you can use pattern matching to extract the value in the option, or one of the many combinators (map, flatMap, fold, etc.):
val endPoint = Option(config.getString("endpoint"))
def callEndPoint(endPoint: String): Future[Result] = ??? // calls endpoint
endPoint match {
case Some(ep) => callEndPoint(ep)
case None => Future.failed(new NoSuchElementException("End point not found"))
}
Or
val foo = endPoint.map(callEndPoint).getOrElse(Future.failed(new NoSuchElement...))
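The Option(...) approach only helps when the lookup signals a missing key by returning null. A small sketch under that assumption, where getString is a hypothetical Java-style lookup rather than the real config API:

```scala
// Hypothetical Java-style lookup that returns null for a missing key
def getString(key: String): String =
  if (key == "endpoint") "http://localhost:8080" else null

// Option.apply turns null into None and any other value into Some
val found: Option[String] = Option(getString("endpoint"))
val missing: Option[String] = Option(getString("timeout"))

// fold handles both cases in one expression: default first, then the mapping
def describe(ep: Option[String]): String =
  ep.fold("End point not found")(e => s"calling $e")
```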

Scala Try[String] found instead of String

I am trying to understand how Try works in scala (not try/catch) but Try. As an example, here I wish to check if the file exists, and if yes, I will use the data in the file later in the code, but it doesn't work:
val texte = Try(Source.fromFile(chemin_texte).getLines().filter(_!="").foldLeft(""){_+_})
texte match {
case Success(x) => x
case Failure(e) => println("An error occured with the text file"); println("Error: " + e.getMessage)
}
/*phrases du texte*/
val phrases_txt = split_phrases(texte).map(phrase => phrase)
At val phrases_txt I wish to use the output of texte if the file exists, if not the program should halt at Failure(e).
The error that I get is type mismatch; found: scala.util.Try[String] required: String .
Any help? Thanks.
Think of Try as just a container for a computation that can fail. It is not comparable to a try/catch block, because there the exception is simply thrown and is expected to be handled somewhere later in the program. Scala's Try forces you to deal with a possible error at every point where the value is used from then on.
You can do something like this:
val texte = Try(Source.fromFile(chemin_texte).getLines().filter(_!="").foldLeft(""){_+_})
val phrases: Try[List[String]] = texte.map(split_phrases)
I don't see the point of .map(phrase => phrase), because it returns an equivalent value. For a container type T, map takes a function f: A => B and turns a T[A] into a T[B]: it runs f on the contents of the container and produces a container of type B, where f is responsible for converting an object of type A to type B.
If you wish to further use your phrases object in your program with other values that produce Try values, you can use the flatMap function or for expressions that make life easier. For example:
val morePhrases: Try[List[String]] = ???
def mergePhrases(phrases1: List[String], phrases2: List[String]): List[String] = phrases1 ++ phrases2
val mergedPhrases: Try[List[String]] = for {
p1 <- phrases
p2 <- morePhrases
} yield mergePhrases(p1, p2) // Only for demonstration, you could also do yield p1 ++ p2
The mergedPhrases value in the code above is just a Try container containing the result of application of mergePhrases function on contents of phrases and morePhrases.
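The same idea as a runnable sketch, with in-memory strings instead of a file. splitPhrases is a hypothetical replacement for split_phrases, and the failing value simulates a missing file.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for split_phrases: split on full stops
def splitPhrases(text: String): List[String] =
  text.split("\\.").map(_.trim).filter(_.nonEmpty).toList

val texte: Try[String] = Try("First phrase. Second phrase.")
val broken: Try[String] = Try(throw new java.io.FileNotFoundException("missing.txt"))

// map keeps the result inside Try, so a failure is carried along untouched
val phrases: Try[List[String]] = texte.map(splitPhrases)

// The for expression stops at the first Failure it meets
val merged: Try[List[String]] = for {
  p1 <- phrases
  p2 <- broken.map(splitPhrases)
} yield p1 ++ p2

// Unwrap only at the edge of the program, where both cases are decided
val report: String = merged match {
  case Success(ps) => s"${ps.size} phrases"
  case Failure(e)  => s"Error: ${e.getMessage}"
}
```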
Note that Try may not always be the best way to capture errors: at the end of your program you'll know that an error occurred, and you'll have the first one, but only as a Throwable, which may not tell you exactly what went wrong. That's why we have things like Either.
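One way to read the Either suggestion is to translate the captured exception into an error type of your own. This is only a sketch: ReadError, FileMissing, Unexpected and readText are hypothetical names, and the file read is simulated.

```scala
import scala.util.Try

// Hypothetical domain errors, so callers see what went wrong in the type
sealed trait ReadError
final case class FileMissing(path: String) extends ReadError
final case class Unexpected(message: String) extends ReadError

// Simulated read: an empty path stands for a file that does not exist
def readText(path: String): Either[ReadError, String] =
  Try {
    if (path.isEmpty) throw new java.io.FileNotFoundException(path)
    else s"contents of $path"
  }.toEither.left.map {
    case _: java.io.FileNotFoundException => FileMissing(path)
    case other                            => Unexpected(String.valueOf(other.getMessage))
  }
```

Try.toEither is available from Scala 2.12 onwards.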

Error while finding lines starting with H or I using Scala

I am trying to learn Spark and Scala. I am working on a scenario to identify the lines that start with H or I. Below is my code
def startWithHorI(s:String):String=
{
if(s.startsWith("I")
return s
if(s.startsWith("H")
return s
}
val fileRDD=sc.textFile("wordcountsample.txt")
val checkRDD=fileRDD.map(startWithHorI)
checkRDD.collect
It is throwing an error while creating the function Found:Unit Required:Boolean.
From research I understood that it is not able to recognize the return as Unit means void. Could someone help me.
There are a few things wrong with your def, we will start there:
It is throwing the error because according to the code posted, your syntax is incomplete and the def is defined improperly:
def startWithHorI(s:String): String=
{
if(s.startsWith("I")) // missing extra paren char in original post
s // do not need return statement
if(s.startsWith("H")) // missing extra paren char in original post
s // do not need return statement
}
This will still produce an error, because the function is declared to return a String but the compiler sees something else. An if without an else has an implicit else branch that yields Unit, so the expression cannot be typed as String (what should be returned when s starts with neither H nor I?). The correction is to add an else branch so that every path ultimately returns a String. Note also that only the last expression in the body is the return value, so the two checks have to be combined into a single expression.
def startWithHorI(s: String): String = {
  // a single if/else expression, so its value is what the function returns
  if (s.startsWith("I") || s.startsWith("H")) s else "no H or I"
}
If you don't want to return anything, then an Option is worth looking at for a return type.
Finally we can achieve what you are doing via filter - no need to map with a def:
val fileRDD = sc.textFile("wordcountsample.txt")
val checkRDD = fileRDD.filter(s => s.startsWith("H") || s.startsWith("I"))
checkRDD.collect
While passing any function to rdd.map(fn) make sure that fn covers all possible scenarios.
If you want to completely drop strings that do not start with either H or I, then use flatMap and return Option[String] from your function.
Example:
def startWithHorI(s:String): Option[String]=
{
if(s.startsWith("I") || s.startsWith("H")) Some(s)
else None
}
Then,
sc.textFile("wordcountsample.txt").flatMap(startWithHorI)
This will remove all rows not starting with H or I.
In general, to minimize run-time errors, try to write total functions that handle all possible values of their arguments.
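The filter and flatMap approaches can be compared without a Spark cluster. The sketch below uses a plain List in place of the RDD, with made-up sample lines, and includes an empty line to show that startsWith stays safe on it.

```scala
// Total function: every input maps to Some or None, nothing is left undefined
def startWithHorI(s: String): Option[String] =
  if (s.startsWith("H") || s.startsWith("I")) Some(s) else None

// A List standing in for the RDD; the empty string is a deliberate edge case
val lines = List("Hello ", "Hello World", "Instagram", "Good Morning", "")

// filter keeps the matching lines as they are
val viaFilter = lines.filter(s => s.startsWith("H") || s.startsWith("I"))

// flatMap over an Option drops the None results and unwraps the Some results
val viaFlatMap = lines.flatMap(s => startWithHorI(s))
```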
Something like below would work for you?
val fileRDD=sc.textFile("wordcountsample.txt")
fileRDD.collect
Array[String] = Array("Hello ", Hello World, Instragram, Good Morning)
val filterRDD=fileRDD.filter( x=> (x(0) == 'H'||x(0) == 'I'))
filterRDD.collect()
Array[String] = Array("Hello ", Hello World, Instragram)

Scala functional composition

I'm trying to get a function currying working correctly. What I have is the following:
def method(x: ByteArrayInputStream)
(y: ByteArrayOutputStream)
(z: GZIPOutputStream)
(func: (ByteArrayInputStream, GZIPOutputStream) => Unit) = {
.....
.....
}
Now when I call it, I call it like this:
method(new ByteArrayInputStream("".getBytes("UTF-8")))
(new ByteArrayOutputStream())
(new GZIPOutputStream(_))
(myFunc(_, _))
My understanding is that in the third parameter i.e., to the GZIPOutputStream, when I say _, it will pick the value from the second parameter. But it complains saying that
Type mismatch, expected: GZIPOutputstream, actual: (OutputStream) => GZIPOutputStream
Any hints?
The problem is at
(new GZIPOutputStream(_))
As your error says, your method wants a GZIPOutputStream, but you are passing it a function from OutputStream to GZIPOutputStream.
The underscore is a little confusing at first. Here it is placeholder syntax: new GZIPOutputStream(_) is shorthand for the anonymous function x => new GZIPOutputStream(x). In other words, you are passing the function itself instead of the result of the function; the underscore does not pick up the value from the previous parameter list.
How to fix it depends on what you're actually trying to do. If you actually want to pass a GZIPOutputStream, you'll need to replace that _ with an OutputStream.
If your intent is to have method create a GZIPOutputStream from a factory function like the one you are passing, you'd want to change the declared type of z, e.g.,
(z: (OutputStream) => GZIPOutputStream)
and then in the method body you could say something like z(y) to get a GZIPOutputStream. (Or replace y with some other output stream.)
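A sketch of the factory variant, under the assumption that the method should return the compressed bytes. The copy helper and the return type are illustrative additions, not part of the original signature.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, OutputStream}
import java.util.zip.GZIPOutputStream

// z is now a factory, so new GZIPOutputStream(_) type-checks at the call site
def method(x: ByteArrayInputStream)
          (y: ByteArrayOutputStream)
          (z: OutputStream => GZIPOutputStream)
          (func: (ByteArrayInputStream, GZIPOutputStream) => Unit): Array[Byte] = {
  val gzip = z(y) // build the gzip stream on top of the second argument
  try func(x, gzip) finally gzip.close() // close flushes the gzip trailer into y
  y.toByteArray
}

// Hypothetical helper: copy everything from the input into the gzip stream
def copy(in: ByteArrayInputStream, out: GZIPOutputStream): Unit = {
  val buf = new Array[Byte](1024)
  var n = in.read(buf)
  while (n != -1) {
    out.write(buf, 0, n)
    n = in.read(buf)
  }
}

val compressed: Array[Byte] =
  method(new ByteArrayInputStream("hello".getBytes("UTF-8")))(new ByteArrayOutputStream())(new GZIPOutputStream(_))(copy)
```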
I'm not exactly sure how to do this... but here is one solution that mimics what you are looking for
def add(j: Int)(i: Option[Int] = None): Int = j + i.getOrElse(j)
add(5)()
The add(5)() returns 10 and uses the j value
I have managed to rework that a bit, and here is what I ended up with:
val bytePayload = method(new ByteArrayInputStream(s.getBytes("UTF-8")))(new ByteArrayOutputStream())(writeBytes(_,_))
def method(bin: ByteArrayInputStream)
          (bos: ByteArrayOutputStream)
          (func: (ByteArrayInputStream, GZIPOutputStream) => Unit): Either[String, Array[Byte]] = {
  val gzip = new GZIPOutputStream(bos)
  try {
    func(bin, gzip)
    gzip.finish()
    // the try/catch is the result expression, so a failure is returned as Left
    Right(bos.toByteArray)
  } catch {
    case e: Exception => Left(e.getMessage)
  } finally {
    bin.close()
    gzip.close()
    bos.close()
  }
}
Though I still handle Exceptions, I'm to some extent convinced that I don't throw them around.
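If Scala 2.13 is available, scala.util.Using can take over the closing logic. This is a sketch of an alternative and not the code above: gzipString is a hypothetical simplified helper that compresses a single string.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream
import scala.util.Using

// Using closes the gzip stream whether or not the body throws and returns a Try
def gzipString(s: String): Either[String, Array[Byte]] = {
  val bos = new ByteArrayOutputStream()
  Using(new GZIPOutputStream(bos)) { gzip =>
    gzip.write(s.getBytes("UTF-8"))
  }.map(_ => bos.toByteArray) // read the bytes only after the stream is closed
    .toEither
    .left.map(e => String.valueOf(e.getMessage))
}
```

Closing a ByteArrayOutputStream has no effect, so it is left out of the resource handling here.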

Return from nested condition, for loop, try/catch block in Scala

I've got a nested block of a condition, a for loop and try/catch block to return a tuple:
val (a, b) = {
if (...) {
for (...) {
try {
getTuple(conf)
} catch {
case e: Throwable => println(...)
}
}
sys.exit
} else {
try {
getTuple(userConf)
} catch {
case e: Throwable => println(...); sys.exit
}
}
}
If the if condition matches I would like to try x different conf configurations. When getTuple throws an exception, try the next one. When getTuple does not throw an exception, fill the tuple with the result. getTuple returns the tuple (a,b).
Problem: However, the for loop does not exit when getTuple does not throw an exception. I also tried break but that does not work as it should return the tuple and not just exit the for loop.
How can I get this to work?
Instead of throwing an exception, getTuple should evaluate to an Option[Tuple2[T,U]]; it carries more meaning and does not break the flow of the program.
This way, you can have a for like this:
val tuples: List[Option[Tuple2[T,U]]] = for {
  c <- configs
} yield getTuple(c)
val firstWorkingConfig: Option[Tuple2[T,U]] = tuples.flatten.headOption
// Exit the program if no config is okay
firstWorkingConfig.getOrElse {
  sys.exit()
}
Hope this helps
I'd suggest using more of Scala's functional features instead of dealing with loops and exceptions.
Suppose we have a function that throws an exception on some inputs:
def myfn(x: Double): Double =
  if (x < 0)
    throw new Exception
  else
    Math.sqrt(x)
We can either refactor it to return Option:
def optMyfn(x: Double): Option[Double] =
  if (x < 0)
    None
  else
    Some(Math.sqrt(x))
Or, if we don't want to (or cannot) modify the original code, we can simply wrap it using Exception.Catcher:
def optMyfn(x: Double): Option[Double] =
  scala.util.control.Exception.allCatch.opt(myfn(x))
Now let's have a sequence of numbers and we want to find the first one on which the function succeeds:
val testseq: Seq[Double] = Seq(-3.0, -2.0, 2.0, 4.0, 5.0)
We can use Scala's functional features to apply the function to all the elements and find the first Some result as
testseq.toStream.map(optMyfn _).flatten.headOption
It would work without toStream as well, but we'd call optMyfn unnecessarily on all the elements, instead of just those needed to find the first successful result. Streams make the computation lazy.
Another option would be to use views:
testseq.view.map(optMyfn _).collectFirst({ case Some(x) => x })
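Since Stream is deprecated from Scala 2.13 in favour of LazyList, an iterator is another way to get the same laziness on any version. The sketch below counts the calls to show that evaluation stops at the first success; the calls counter and the sample values are illustrative only.

```scala
import scala.util.Try

// Function that throws on negative input
def myfn(x: Double): Double =
  if (x < 0) throw new IllegalArgumentException("negative input")
  else math.sqrt(x)

val testseq: Seq[Double] = Seq(-3.0, -2.0, 4.0, 9.0)

// Counter used only to show how many elements were actually tried
var calls = 0

// Iterator.map is lazy and collectFirst stops at the first Some it sees
val first: Option[Double] = testseq.iterator
  .map { x => calls += 1; Try(myfn(x)).toOption }
  .collectFirst { case Some(v) => v }
```

Here first is Some(2.0) and calls is 3, so the last element is never evaluated.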
The sys.exit call is somewhat misplaced in the if branch. It should be inside the catch clause, so that the tuple can be returned if the try is successful.
If you want to loop until you get a working tuple, a while loop makes more sense. The semantics of a for loop are to evaluate the body for all the elements you are iterating over. Since your goal is to stop after the first one that satisfies a condition, a while loop seems much more natural. Of course there are also more functional alternatives, for instance:
wrap your getTuple in an Option, where None corresponds to an exception.
put the calls to getTuple in some iterable/traversable collection
get the first element which is not None
While I was explaining the concept, Romain Sertelon gave a nice example of the idea...
Another solution is to just wrap the whole block into a method and use return like:
val (a: Type, b: Type) = setup(...)
def setup(...) : (Type, Type) = {
if (...) {
for (...) {
try {
return getTuple(...)
} catch {
...
}
}
sys.exit
} else {
try {
return getTuple(...)
} catch {
case e: Throwable => println(...); sys.exit
}
}
}
I think I should learn more basics of Scala, but that works now for the moment.
Thanks for your answers.