Is it possible to reference count call location? - scala

This question isn't programming language specific (the more general the better), but I'm working in Scala (not necessarily on the JVM). Is there a means to reference count by call location, not the number of total calls? In particular, it would be great to be able to detect if a given method is called from more than one call location.
I think I can fake it to some extent by doing a reference equality check with a function, but this could be abused easily by having a global-ish token, or even calling the function multiple times in the same scope:
sealed case class Token();
class MyClass[A] {
  var tokenOpt: Option[Token] = None
  def callMeFromOnePlace(x: A)(implicit tk: Token) = {
    tokenOpt match {
      case Some(priorTk) => if (priorTk ne tk) throw new IllegalStateException("")
      case None => tokenOpt = Some(tk)
    }
    // Do some work ...
  }
}
Then this should work fine:
val myObj = new MyClass[Int]
val myIntList = List(1,2,3)
implicit val token = Token()
myIntList.map(ii => myObj.callMeFromOnePlace(ii))
But unfortunately, so would this:
val myObj = new MyClass[Int]
implicit val token = Token()
myObj.callMeFromOnePlace(1)
myObj.callMeFromOnePlace(1) //oops, want this to fail

When you are talking about call location, it can be represented by a call stack trace. Here is a simple example:
// keep track of calls here (you can use immutable style if you want)
var callCounts = Map.empty[Int, Int]
def f(): Unit = {
  // calculate call stack trace hashCode for more efficient storage
  // .toSeq makes WrappedArray, that knows how to properly calculate .hashCode()
  val hashCode = new RuntimeException().getStackTrace.toSeq.hashCode()
  val callLocation = hashCode
  callCounts += (callLocation -> (callCounts.getOrElse(callLocation, 0) + 1))
}
List(1,2,3).foreach(_ =>
  f()
)
f()
f()
println(callCounts) // Map(75070239 -> 3, 900408638 -> 1, -1658734417 -> 1)
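If the goal is only to detect that a method is reached from more than one call location, it may be enough to inspect the immediate caller's stack frame instead of hashing the whole trace. The following is a minimal sketch under several assumptions: it runs on a platform that fills in stack traces with line numbers (the JVM with default debug info), two calls on the same source line inside the same method may be indistinguishable, and the names CallSiteGuard and distinctSites are made up for illustration. It is not thread-safe.

```scala
object CallSiteGuard {
  // Call locations seen so far, recorded as "class.method:line" of the immediate caller.
  private var sites = Set.empty[String]

  def distinctSites: Int = sites.size

  def callMeFromOnePlace(): Unit = {
    // Frame 0 is this method; frame 1 is whoever called it.
    val caller = new RuntimeException().getStackTrace()(1)
    val site = s"${caller.getClassName}.${caller.getMethodName}:${caller.getLineNumber}"
    // Reject a call that arrives from a location other than the first one seen.
    if (sites.nonEmpty && !sites.contains(site))
      throw new IllegalStateException(s"second call location: $site")
    sites += site
  }
}
```

Repeated calls from the same lambda (for example inside a map or foreach) share one location and pass; a call written on a different line throws.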

I'm not completely clear on what you want to do, but to make your //oops example fail you just need to check whether a prior token has already been stored (tokenOpt is not None) and throw in that case. (Do note that this is not a thread-safe solution.)

For completeness: enforcing this kind of constraint from a type system perspective requires linear types.

Related

request timeout from flatMapping over cats.effect.IO

I am attempting to transform some data that is encapsulated in cats.effect.IO with a Map that also is in an IO monad. I'm using http4s with blaze server and when I use the following code the request times out:
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
implicit val shiftJsonReader = new Reader[ShiftJson] {
def read(value: JValue): ShiftJson = value.extract[ShiftJson]
}
implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
// get the shifts
var getDbShifts: IO[List[Shift]] = shiftModel.findByUserId(userId)
// use the userRoleId to get the RoleId then get the tasks for this role
val taskMap : IO[Map[String, Double]] = taskModel.findByUserId(userId).flatMap {
case tskLst: List[Task] => IO(tskLst.map((task: Task) => (task.name -> task.standard)).toMap)
}
val traversed: IO[List[Shift]] = for {
shifts <- getDbShifts
traversed <- shifts.traverse((shift: Shift) => {
val lstShiftJson: IO[List[ShiftJson]] = read[List[ShiftJson]](shift.roleTasks)
.map((sj: ShiftJson) =>
taskMap.flatMap((tm: Map[String, Double]) =>
IO(ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / tm.get(sj.name).get)))
).sequence
//TODO: this flatMap is bricking my request
lstShiftJson.flatMap((sjLst: List[ShiftJson]) => {
IO(Shift(shift.id, shift.shiftDate, shift.shiftStart, shift.shiftEnd,
shift.lunchDuration, shift.shiftDuration, shift.breakOffProd, shift.systemDownOffProd,
shift.meetingOffProd, shift.trainingOffProd, shift.projectOffProd, shift.miscOffProd,
write[List[ShiftJson]](sjLst), shift.userRoleId, shift.isApproved, shift.score, shift.comments
))
})
})
} yield traversed
traversed.flatMap((sLst: List[Shift]) => Ok(write[List[Shift]](sLst)))
}
As you can see from the TODO comment, I've narrowed the problem in this method down to the flatMap below that comment. If I remove that flatMap and merely return "IO(shift)" to the traversed variable, the request does not time out. However, that doesn't help me much, because I need to make use of the lstShiftJson variable, which holds my transformed JSON.
My intuition tells me I'm abusing the IO monad somehow, but I'm not quite sure how.
Thank you for your time in reading this!
So with the guidance of Luis's comment I refactored my code to the following. I don't think it is optimal (i.e. the flatMap at the end seems unnecessary, but I couldn't figure out how to remove it), but it's the best I've got.
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
implicit val shiftJsonReader = new Reader[ShiftJson] {
def read(value: JValue): ShiftJson = value.extract[ShiftJson]
}
implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
// FOR EACH SHIFT
// - read the shift.roleTasks into a ShiftJson object
// - divide each task value by the task.standard where task.name = shiftJson.name
// - write the list of shiftJson back to a string
val traversed = for {
taskMap <- taskModel.findByUserId(userId).map((tList: List[Task]) => tList.map((task: Task) => (task.name -> task.standard)).toMap)
shifts <- shiftModel.findByUserId(userId)
traversed <- shifts.traverse((shift: Shift) => {
val lstShiftJson: List[ShiftJson] = read[List[ShiftJson]](shift.roleTasks)
.map((sj: ShiftJson) => ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / taskMap.get(sj.name).get ))
shift.roleTasks = write[List[ShiftJson]](lstShiftJson)
IO(shift)
})
} yield traversed
traversed.flatMap((t: List[Shift]) => Ok(write[List[Shift]](t)))
}
Luis mentioned that mapping my List[Task] to a Map[String, Double] is a pure operation, so we want to use map instead of flatMap.
He also mentioned that I was wrapping every operation that depends on the database in IO, which caused a great deal of recomputation (including DB transactions).
To solve this issue I moved all of the database operations inside my for comprehension. Using the "<-" operator to flatMap each of the returned values lets the bound variables reside within the IO monad, which prevents the recomputation experienced before.
I do think there must be a better way of returning my result. flatMapping the "traversed" variable to get back inside the IO monad seems like unnecessary recomputation, so please correct me if you know a better way.
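The recomputation described above can be illustrated without any library. The sketch below uses MiniIO, a toy stand-in written for this example (it is not cats.effect.IO); it only mimics the property that an effect value is a description that runs again every time it is sequenced. The names RecomputationDemo, dbHits, perElement and bindOnce, and the "pick" task, are all hypothetical.

```scala
// Toy effect type: wraps a thunk and re-runs it on every unsafeRun().
final class MiniIO[A](thunk: () => A) {
  def unsafeRun(): A = thunk()
  def map[B](f: A => B): MiniIO[B] = new MiniIO(() => f(thunk()))
  def flatMap[B](f: A => MiniIO[B]): MiniIO[B] =
    new MiniIO(() => f(thunk()).unsafeRun())
}
object MiniIO {
  def apply[A](a: => A): MiniIO[A] = new MiniIO(() => a)
}

object RecomputationDemo {
  // Counts how often the pretend database query actually runs.
  var dbHits = 0
  val taskMap: MiniIO[Map[String, Double]] =
    MiniIO { dbHits += 1; Map("pick" -> 2.0) }

  // Sequencing taskMap inside the per-element step re-runs the query per element.
  def perElement(values: List[Double]): MiniIO[List[Double]] =
    values.foldRight(MiniIO(List.empty[Double])) { (v, acc) =>
      taskMap.flatMap(tm => acc.map(rest => v / tm("pick") :: rest))
    }

  // Binding taskMap once and doing the rest with a pure map runs the query once.
  def bindOnce(values: List[Double]): MiniIO[List[Double]] =
    taskMap.map(tm => values.map(_ / tm("pick")))
}
```

With three input values, perElement triggers the pretend query three times while bindOnce triggers it once, and both produce the same list.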

How to functionally handle a logging side effect

I want to log in the event that a record doesn't have an adjoining record. Is there a purely functional way to do this? One that separates the side effect from the data transformation?
Here's an example of what I need to do:
val records: Seq[Record] = Seq(record1, record2, ...)
val accountsMap: Map[Long, Account] = Map(record1.id -> account1, ...)
def withAccount(accountsMap: Map[Long, Account])(r: Record): (Record, Option[Account]) = {
  (r, accountsMap.get(r.id))
}
def handleNoAccounts(tuple: (Record, Option[Account])) = {
  val (r, a) = tuple
  if (a.isEmpty) logger.error(s"no account for ${r.id}")
  tuple
}
def toRichAccount(tuple: (Record, Option[Account])) = {
  val (r, a) = tuple
  a.map(acct => RichAccount(r, acct))
}
records
  .map(withAccount(accountsMap))
  .map(handleNoAccounts) // if no account is found, log
  .flatMap(toRichAccount)
So there are multiple issues with this approach that I think make it less than optimal.
The tuple return type is clumsy. I have to destructure the tuple in both of the latter two functions.
The logging function has to handle the logging and then return the tuple with no changes. It feels weird that this is passed to .map even though no transformation is taking place -- maybe there is a better way to get this side effect.
Is there a functional way to clean this up?
I could be wrong (I often am) but I think this does everything that's required.
records
  .flatMap(r =>
    accountsMap.get(r.id).fold{
      logger.error(s"no account for ${r.id}")
      Option.empty[RichAccount]
    }{a => Some(RichAccount(r,a))})
If you're using Scala 2.13 or newer, you could use tapEach, which takes a function A => Unit, applies it as a side effect to every element of the collection, and then passes the collection on unchanged:
// you no longer need to return the tuple from the side-effecting function
def handleNoAccounts(tuple: (Record, Option[Account])): Unit = {
  val (r, a) = tuple
  if (a.isEmpty) logger.error(s"no account for ${r.id}")
}
records
  .map(withAccount(accountsMap))
  .tapEach(handleNoAccounts) // if no account is found, log
  .flatMap(toRichAccount)
In case you're using an older Scala version, you could provide an extension method (updated according to Levi Ramsey's suggestion):
implicit class SeqOps[A](s: Seq[A]) {
  def tapEach(f: A => Unit): Seq[A] = {
    s.foreach(f)
    s
  }
}
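Another way to keep the side effect apart from the transformation is to return the missing ids as data and log them in a separate step. This is a rough sketch assuming Scala 2.13 or newer (for partitionMap); the case class fields, the SeparateLogging object, and the use of println in place of the real logger are all assumptions made for illustration.

```scala
object SeparateLogging {
  // Hypothetical shapes for the types used in the question.
  final case class Record(id: Long)
  final case class Account(owner: String)
  final case class RichAccount(record: Record, account: Account)

  // Pure step: ids without an account go left, enriched accounts go right.
  def enrich(records: Seq[Record], accounts: Map[Long, Account]): (Seq[Long], Seq[RichAccount]) =
    records.partitionMap { r =>
      accounts.get(r.id).map(a => RichAccount(r, a)).toRight(r.id)
    }

  // Effectful step, kept at the edge; println stands in for logger.error.
  def report(missing: Seq[Long]): Unit =
    missing.foreach(id => println(s"no account for $id"))
}
```

The caller would destructure the pair once, pass the first half to report, and continue with the second half, so no tuple has to travel through the pipeline.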

Type mismatch found Unit, required Future[Customer] on flatmap

I have the code below, and in my findOrCreate() function I'm getting an error saying Type mismatch found Unit, required Future[Customer]. The customerByPhone() function that is called inside findOrCreate() also contains calls that expect Futures, which is why I'm using a flatMap. I don't know why the result of the flatMap is Unit. What am I doing wrong?
override def findOrCreate(phoneNumber: String, creationReason: String): Future[AvroCustomer] = {
  //query for customer in db
  val avroCustomer: Future[AvroCustomer] = customerByPhone(phoneNumber).flatMap(_ => createUserAndEvent(phoneNumber, creationReason, 1.0))
}
override def customerByPhone(phoneNumber: String): Future[AvroCustomer] = {
val query = Schema.Customers.byPhoneNumber(phoneNumber)
val dbAction: DBIO[Option[Schema.Customer]] = query.result.headOption
db.run(dbAction)
.map(_.map(AvroConverters.toAvroCustomer).orNull)
}
private def createUserAndEvent(phoneNumber: String, creationReason: String, version: Double): Future[AvroCustomer] = {
val query = Schema.Customers.byPhoneNumber(phoneNumber)
val dbAction: DBIO[Option[Schema.Customer]] = query.result.headOption
val data: JsValue = Json.obj(
"phone_number" -> phoneNumber,
"agent_number" -> "placeholder for agent number",
"creation_reason" -> creationReason
)
//empty for now
val metadata: JsValue = Json.obj()
//creates user
val avroCustomer: Future[AvroCustomer] = db.run(dbAction).map(_.map(AvroConverters.toAvroCustomer).orNull)
avroCustomer.onComplete({
case Success(null) => {
}
//creates event
case Success(customer) => {
val uuid: UUID = UUID.fromString(customer.id)
//create event
val event: Future[CustomerEvent] = db.run(Schema.CustomerEvents.create(
uuid,
"customer_creation",
version,
data,
metadata)
).map(AvroConverters.toAvroEvent)
}
case Failure(exception) => {
}
})
Future.successful(new AvroCustomer)
}
While Reactormonk basically answered this in the comments, I'm going to actually write an answer with some details. His comment that a val statement produces Unit is fundamentally correct, but I'm hoping some elaboration will make things more clear.
The key element that I see is that val is a declaration. Declarations in Scala are statements that don't produce useful values. Because of the functional nature of Scala, they do produce a value, but it is Unit and as there is only one instance of Unit, it doesn't carry any meaning.
The reason programmers new to Scala are often tempted to do something like this is that they don't think of blocks of code as expressions and are often used to using return in other languages. So let's consider a simplified function here.
def foo(i: Int): Int = {
  42 * i
}
I include a code block as I think that is key to this error, though it really isn't needed here. The value of a code block is simply the value of the last expression in the code block. This is why we don't have to specify return, but most programmers who are used to return are a bit uncomfortable with that naked expression at the end of a block. That is why it is tempting to throw in the val declaration.
def foo(i: Int): Int = {
  val result = 42 * i // Error: type mismatch.
}
Of course, as was mentioned, the val declaration results in Unit, which makes this incorrect. You could add an extra line with just result, and that will compile, but it is overly verbose and non-idiomatic.
Scala supports the use of return to leave a method/function and give back a particular value, though its use is generally frowned upon. As such, the following code works.
def foo(i: Int): Int = {
  return 42 * i
}
While you shouldn't use return in Scala code, I feel that imagining it being there can help with understanding what is wrong here. If you stick a return in front of the val you get code like the following.
def foo(i: Int): Int = {
  return val result = 42 * i // Error: type mismatch.
}
At least to me, this code is clearly incorrect. The val is a declaration and as such it just doesn't work with a return. It takes some time to get used to the functional style of blocks as expressions. Until you get to that point, it might help just to act like there is a return at the end of methods without actually putting one there.
It is worth noting that, in the comments, jwvh claims that a declaration like this in C would return a value. That is false. Assignments in most C-family languages give back the value that was assigned, so a = 5 returns the value 5, but declarations don't, so int a = 5; does not give back anything and can't be used as an expression.
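Applied to the question's findOrCreate, the fix is to let the flatMap expression be the last expression of the block instead of binding it to a val. The following is only a sketch: Customer, the in-memory known map, and createCustomer are hypothetical stand-ins for the database-backed code, and the lookup returns an Option rather than using orNull, which is a substitution made here for clarity.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FindOrCreateSketch {
  final case class Customer(phone: String, origin: String)

  // Hypothetical stand-ins for the database calls in the question.
  private val known = Map("555-0100" -> Customer("555-0100", "existing"))
  def customerByPhone(phone: String): Future[Option[Customer]] =
    Future.successful(known.get(phone))
  def createCustomer(phone: String): Future[Customer] =
    Future.successful(Customer(phone, "created"))

  def findOrCreate(phone: String): Future[Customer] = {
    // No `val` here: the flatMap expression is the last expression of the
    // block, so its Future[Customer] becomes the method's result.
    customerByPhone(phone).flatMap {
      case Some(existing) => Future.successful(existing)
      case None           => createCustomer(phone)
    }
  }
}
```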

Scala Parallel Collections- How to return early?

I have a list of possible input Values
val inputValues = List(1,2,3,4,5)
I have a really long to compute function that gives me a result
def reallyLongFunction( input: Int ) : Option[String] = { ..... }
Using scala parallel collections, I can easily do
inputValues.par.map( reallyLongFunction( _ ) )
To get what all the results are, in parallel. The problem is, I don't really want all the results, I only want the FIRST result. As soon as one of my input is a success, I want my output, and want to move on with my life. This did a lot of extra work.
So how do I get the best of both worlds? I want to
Get the first result that returns something from my long function
Stop all my other threads from useless work.
Edit -
I solved it like a dumb java programmer by having
@volatile var done = false;
Which is set and checked inside my reallyLongFunction. This works, but does not feel very Scala. I would like a better way to do this.
(Updated: no, it doesn't work, doesn't do the map)
Would it work to do something like:
inputValues.par.find({ v => reallyLongFunction(v); true })
The implementation uses this:
protected[this] class Find[U >: T](pred: T => Boolean, protected[this] val pit: IterableSplitter[T]) extends Accessor[Option[U], Find[U]] {
  @volatile var result: Option[U] = None
  def leaf(prev: Option[Option[U]]) = { if (!pit.isAborted) result = pit.find(pred); if (result != None) pit.abort }
  protected[this] def newSubtask(p: IterableSplitter[T]) = new Find(pred, p)
  override def merge(that: Find[U]) = if (this.result == None) result = that.result
}
which looks pretty similar in spirit to your @volatile except you don't have to look at it ;-)
I interpreted your question in the same way as huynhjl, but if you just want to search and discard Nones, you could do something like this to avoid the need to repeat the computation when a suitable outcome is found:
class Computation[A,B](value: A, function: A => B) {
  lazy val result = function(value)
}
def f(x: Int) = { // your function here
  Thread.sleep(100 - x)
  if (x > 5) Some(x * 10)
  else None
}
val list = List.range(1, 20) map (i => new Computation(i, f))
val found = list.par find (_.result.isDefined)
//found is Option[Computation[Int,Option[Int]]]
val result = found map (_.result.get)
//result is Option[Int]
However find for parallel collections seems to do a lot of unnecessary work (see this question), so this might not work well, with current versions of Scala at least.
Volatile flags are used in the parallel collections (take a look at the source for find, exists, and forall), so I think your idea is a good one. It's actually better if you can include the flag in the function itself. It kills referential transparency on your function (i.e. for certain inputs your function now sometimes returns None rather than Some), but since you're discarding the stopped computations, this shouldn't matter.
If you're willing to use a non-core library, I think Futures would be a good match for this task. For instance:
Akka's Futures include Futures.firstCompletedOf
Twitter's Futures include Future.select
...both of which appear to enable the functionality you're looking for.
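Since Scala 2.10 the standard library ships scala.concurrent.Future and Promise, so a similar effect can be sketched without extra dependencies. The example below is one possible shape, not an established API: firstDefined and the stop-flag callback are names invented here, failed attempts are simply treated as "no result", and cancellation is cooperative (the work function has to poll the flag itself, much like the volatile flag in the question).

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global

object FirstResult {
  // `work` receives the input plus a check it may poll to give up early.
  def firstDefined[A, B](inputs: List[A])(work: (A, () => Boolean) => Option[B]): Future[Option[B]] = {
    val stop    = new AtomicBoolean(false)
    val promise = Promise[Option[B]]()

    val attempts: List[Future[Option[B]]] = inputs.map { in =>
      Future(work(in, () => stop.get()))
        .recover { case _ => None } // treat a failed attempt as "no result"
        .map { r =>
          // The first defined result wins and raises the stop flag for the rest.
          if (r.isDefined && promise.trySuccess(r)) stop.set(true)
          r
        }
    }

    // If every attempt finished without a result, complete with None.
    Future.sequence(attempts).foreach(_ => promise.trySuccess(None))
    promise.future
  }
}
```

A long-running function would call the supplied check between steps and return None once it reports true; attempts that never poll the flag still run to completion, but their results are ignored.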

How to use scalax.io.CommandLineParser?

I want to create a class that takes a string array as a constructor argument and exposes the command line option values as member vals. Something like the code below, but I don't understand how Bistate works.
import scalax.data._
import scalax.io.CommandLineParser
class TestCLI(arguments: Array[String]) extends CommandLineParser {
  private val opt1Option = new Flag("p", "print") with AllowAll
  private val opt2Option = new Flag("o", "out") with AllowAll
  private val strOption = new StringOption("v", "value") with AllowAll
  private val result = parse(arguments)
  // true or false
  val opt1 = result(opt1Option)
  val opt2 = result(opt2Option)
  val str = result(strOption)
}
Here are shorter alternatives to that pattern matching to get a boolean:
val opt1 = result(opt1Option).isInstanceOf[Positive[_]]
val opt2 = result(opt2Option).posValue.isDefined
The second one is probably better. The field posValue is an Option (there's negValue as well). The method isDefined from Option tells you whether it is a Some(x) or None.
I'm not personally familiar with Scalax or Bistate in particular, but just looking at the scaladocs, it looks like a left-right disjunction. Scala's main library has a monad very much like this (Either), so I'm surprised that they didn't just use the standard one.
In essence, Bistate and Either are a bit like Option, except their "None-equivalent" can contain a value. For example, if I were writing code using Either, I might do something like this:
def div(a: Int, b: Int) = if (b != 0) Left(a / b) else Right("Divide by zero")
div(4, 2) match {
  case Left(x) => println("Result: " + x)
  case Right(e) => println("Error: " + e)
}
This would print "Result: 2". In this case, we're using Either to simulate an exception. We return an instance of Left which contains the value we want, unless that value cannot be computed for some reason, in which case we return an error message wrapped up inside an instance of Right.
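Note that the usual convention in Scala code is the opposite orientation: Right carries the successful value and Left carries the error. A small sketch of the same idea in that style follows; SafeDiv and describe are names chosen for this example only.

```scala
object SafeDiv {
  // Conventional orientation: Left carries the error, Right the result.
  def div(a: Int, b: Int): Either[String, Int] =
    if (b != 0) Right(a / b) else Left("Divide by zero")

  def describe(r: Either[String, Int]): String = r match {
    case Right(x) => "Result: " + x
    case Left(e)  => "Error: " + e
  }
}
```

Since Scala 2.12, Either is right-biased, so with this orientation map and flatMap operate on the successful value directly, for example SafeDiv.div(4, 2).map(_ + 1).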
So if I want to assign to a variable a boolean value indicating whether the flag was found, I have to do it like below?
val opt1 = result(opt1Option) match {
  case Positive(_) => true
  case Negative(_) => false
}
Isn't there a way to write this common case with less code than that?