Customized Iterator for the Process.repeatEval in scalaz stream - scala

I want to follow an unbounded chain of URLs using scalaz-stream. The response from each URL follows this template:
{
nextUrl: "nextUrl"
}
I am thinking of using scalaz-stream to follow these links indefinitely. The method I plan to use is Process.repeatEval. However, it is a little hard to do, since the next link is embedded in the response of the current URL. So I created a custom iterator; here is some pseudocode:
class Iterator {
var currentUrl = null //state...
def hasNext(): Boolean
def next(): UrlContent
}
Process.repeatEval(Task {iterator}).takeWhile(_.hasNext()).map(_.next()).run.run
It works, but I am not a fan of it, because the iterator has state, and I am trying to avoid mutable values.
Back to my question: is Process.repeatEval the right choice from scalaz-stream for this? If so, should I use this custom iterator?
Many thanks in advance
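For what it's worth, the "next link lives inside the current response" shape is usually expressed as an unfold, where the current URL is passed along as an immutable state value instead of being kept in a var. The sketch below is not scalaz-stream code: it uses LazyList.unfold from the Scala 2.13 standard library to show the idea, and UrlContent's fields, fakeServer, fetch and crawl are all made-up names for illustration.

```scala
// Hypothetical response type: the body plus an optional link to the next page.
final case class UrlContent(body: String, nextUrl: Option[String])

object UnfoldSketch {
  // Stand-in for an HTTP call; a real fetch would perform I/O.
  val fakeServer: Map[String, UrlContent] = Map(
    "/a" -> UrlContent("first", Some("/b")),
    "/b" -> UrlContent("second", Some("/c")),
    "/c" -> UrlContent("third", None)
  )

  def fetch(url: String): UrlContent = fakeServer(url)

  // The "current URL" is threaded through unfold as the state,
  // so no mutable field is needed. The stream ends when nextUrl is None.
  def crawl(start: String): LazyList[UrlContent] =
    LazyList.unfold(Option(start)) {
      case Some(url) =>
        val page = fetch(url)
        Some((page, page.nextUrl))
      case None => None
    }
}
```

Because the result is lazy, UnfoldSketch.crawl("/a").take(2) would only fetch the first two pages. An effectful stream library would wrap fetch in its effect type, but the state-passing structure stays the same.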

Related

Efficient JSON-to-JSON transformations with spray-json

I have a scenario similar to this: an Akka HTTP service calls another service and performs some transformations on its JSON response. Let's say it replaces "http" with "https" on every "link" attribute's value.
Right now the implementation is something like:
def route: Route =
  callToAnotherService(request) { eventualJsonResponse =>
    complete(
      eventualJsonResponse.flatMap(
        (jsonResponse: HttpResponse) => {
          Unmarshal(jsonResponse.entity.withContentType(MediaTypes.`application/json`))
            .to[JsValue]
            .map(replaceHttpInLinks)
            .flatMap(Marshal(_).to[ResponseEntity])
            .map(responseEntity => jsonResponse.copy(entity = responseEntity))
        }
      )
    )
  }
The transformation method has the following signature:
def replaceHttpInLinks(jsValue: JsValue): JsValue = {
// Recursively find "link" attributes and replace protocol
}
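To make the elided body concrete, here is a rough sketch of what such a recursive transformation might look like. It is an assumption, not the asker's actual implementation, and it uses a tiny stand-in AST (Json, JObj, JArr, JStr, JNum) instead of spray-json's JsValue classes so that it runs without dependencies. Like the original, it works on a fully materialised tree, so it does not address the memory concern.

```scala
// Minimal stand-in for a JSON AST; these are not the spray-json classes.
sealed trait Json
final case class JObj(fields: Map[String, Json]) extends Json
final case class JArr(elements: Vector[Json]) extends Json
final case class JStr(value: String) extends Json
final case class JNum(value: BigDecimal) extends Json

object LinkRewriter {
  // Recursively walks the tree; only string values stored under a
  // "link" key that start with "http://" are rewritten.
  def replaceHttpInLinks(json: Json): Json = json match {
    case JObj(fields) =>
      JObj(fields.map {
        case ("link", JStr(url)) if url.startsWith("http://") =>
          "link" -> JStr("https://" + url.stripPrefix("http://"))
        case (key, value) =>
          key -> replaceHttpInLinks(value)
      })
    case JArr(elements) => JArr(elements.map(replaceHttpInLinks))
    case other          => other // strings elsewhere and numbers are untouched
  }
}
```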
As you can see, the called service's JSON response is unmarshalled into a JsValue object and then this object is used to perform the changes.
That response can be huge, and I'm concerned about both performance and memory consumption.
I was looking for a way of making those changes without unmarshalling the whole JSON document, and hopefully without introducing foreign libraries (Play JSON or others). I was thinking of something event based, along the lines of the old SAX API for XML.
Does anyone have an idea how to achieve this?
I think this is more complicated with spray-json, because it will try to build the whole JsValue from the body of the HTTP message. My suggestion is to use Circe and its HCursor to unmarshal manually. Take a look at some examples here.
You can integrate circe with Akka: https://github.com/hseeberger/akka-http-json

How can I perform session based logging in Play Framework

We are currently using the Play Framework with its standard logging mechanism. We have implemented an implicit context to pass the username and session ID to all service methods. We want logging to be session based, which requires implementing our own logger. This works for our own logs, but how do we do the same for basic exception handling and the logs it produces? Maybe there is a better way to capture this than with implicits, or a way to override the exception-handling logging. Essentially, we want as many log messages as possible to be associated with the session.
It depends if you are doing reactive style development or standard synchronous development:
If standard synchronous development (i.e. no futures, 1 thread per request), then I'd recommend you just use MDC, which stores values in a ThreadLocal for logging. You can then customise the output in logback / log4j. When you get the username / session (possibly in a Filter or in your controller), you can set the values there and then, and you do not need to pass them around with implicits.
If you are doing reactive development you have a couple options:
You can still use MDC, except you'd have to use a custom Execution Context that effectively copies the MDC values to the thread, since each request could in theory be handled by multiple threads. (as described here: http://code.hootsuite.com/logging-contextual-info-in-an-asynchronous-scala-application/)
The alternative is the solution I tend to use (and it is close to what you have now): make a class which represents MyAppRequest. Set the username, session info, and anything else on it. You can continue to pass it around as an implicit. However, instead of using Action.async, you make your own MyAction class, which can be used like this:
myAction.async { implicit myRequest => /* some code */ }
Inside the myAction, you'd have to catch all Exceptions and deal with future failures, and do the error handling manually instead of relying on the ErrorHandler. I often inject myAction into my Controllers and put common filter functionality in it.
The downside is that it is a manual method. Also, I've made MyAppRequest hold a Map of loggable values which can be set anywhere, which means it has to be a mutable map. Also, sometimes you need more than one myAction.async. The upside is that it is quite explicit and under your control, without too much ExecutionContext/ThreadLocal magic.
Here is some very rough sample code as a starter, for the manual solution:
def logErrorAndRethrow(myrequest:MyRequest, x:Throwable): Nothing = {
//log your error here in the format you like
throw x //you can do this or handle errors how you like
}
class MyRequest {
val attr : mutable.Map[String, String] = new mutable.HashMap[String, String]()
}
//make this a util to inject, or move it into a common parent controller
def myAsync(block: MyRequest => Future[Result] ): Action[AnyContent] = {
val myRequest = new MyRequest()
try {
Action.async(
block(myRequest).recover { case cause => logErrorAndRethrow(myRequest, cause) }
)
} catch {
case x:Throwable =>
logErrorAndRethrow(myRequest, x)
}
}
//the method your Route file refers to
def getStuff = myAsync { request:MyRequest =>
//execute your code here, passing around request as an implicit
Future.successful(Results.Ok)
}
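For comparison, the other option mentioned above (a custom ExecutionContext that copies the context onto whichever thread runs the task) can be sketched roughly as follows. This is only an illustration: LogContext is a plain ThreadLocal standing in for slf4j's MDC, and ContextPropagatingEC and ContextDemo are made-up names, not something Play provides.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Stand-in for MDC: a per-thread map of logging context values.
object LogContext {
  private val local = new ThreadLocal[Map[String, String]] {
    override def initialValue(): Map[String, String] = Map.empty
  }
  def get: Map[String, String] = local.get()
  def set(values: Map[String, String]): Unit = local.set(values)
}

// Wraps another ExecutionContext. The context is captured on the thread that
// submits the task and installed on the thread that runs it; the worker's
// previous context is restored afterwards.
final class ContextPropagatingEC(underlying: ExecutionContext) extends ExecutionContext {
  def execute(task: Runnable): Unit = {
    val captured = LogContext.get // read on the submitting thread
    underlying.execute(new Runnable {
      def run(): Unit = {
        val previous = LogContext.get
        LogContext.set(captured)
        try task.run() finally LogContext.set(previous)
      }
    })
  }
  def reportFailure(cause: Throwable): Unit = underlying.reportFailure(cause)
}

object ContextDemo {
  // Sets a session id on the calling thread, then reads it from inside a Future.
  def sessionSeenInFuture(): Option[String] = {
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext =
      new ContextPropagatingEC(ExecutionContext.fromExecutor(pool))
    try {
      LogContext.set(Map("sessionId" -> "abc123"))
      Await.result(Future(LogContext.get.get("sessionId")), 5.seconds)
    } finally pool.shutdown()
  }
}
```

With a plain ExecutionContext the Future body would see an empty context on the pool thread; with the wrapper it sees the values set by the submitting thread. A real setup would swap LogContext for MDC's own copy and restore calls.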

What is this ScalaRX code doing?

So I'm pretty new to both Scala and RX. The guy who knew the most, and who actually wrote this code, just left, and I'm not sure what's going on. This construct is all over his code and I'm not really clear what it's doing:
def foo(ids: List[Long]): Observable[Unit] =
  Observable {
    subscriber => {
      // do some stuff
      subscriber.onNext(())
      subscriber.onCompleted()
    }
  }
I mostly get do some stuff, and the calls to subscriber. What I don't get is, where does subscriber come from? Does subscriber => { instantiate the subscriber? What does Observable { subscriber => { ... } } do/mean?
If you take a look at the Observable companion object documentation, you will see an apply method that takes a function of type (Subscriber[T]) ⇒ Unit. So, when you call Observable{withSomeLambda}, then this is the same as calling Observable.apply{withSomeLambda}
And, if you go all the way to the source code you will see that this is really returning
toScalaObservable(rx.Observable.create(f))
where f is the lambda that you passed in.
So, subscriber is just the parameter of the lambda. It is passed in by the caller of that function.
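To make "passed in by the caller" concrete, here is a stripped-down imitation of the pattern. These are hypothetical toy types, not the real RxScala Observable or Subscriber, and the real library does much more (unsubscription, error handling, operators). The only point is to show who supplies the subscriber argument.

```scala
import scala.collection.mutable.ListBuffer

// Toy types for illustration only; not the RxScala classes.
trait Subscriber[T] {
  def onNext(value: T): Unit
  def onCompleted(): Unit
}

final class Observable[T](onSubscribe: Subscriber[T] => Unit) {
  // Nothing runs until someone subscribes; at that point the stored
  // lambda is invoked with the subscriber supplied by the caller.
  def subscribe(subscriber: Subscriber[T]): Unit = onSubscribe(subscriber)
}

object Observable {
  // Observable { subscriber => ... } is sugar for Observable.apply(...).
  def apply[T](onSubscribe: Subscriber[T] => Unit): Observable[T] =
    new Observable(onSubscribe)
}

object ObservableDemo {
  def collect(): List[String] = {
    val events = ListBuffer.empty[String]
    val source = Observable[Int] { subscriber =>
      subscriber.onNext(1)
      subscriber.onNext(2)
      subscriber.onCompleted()
    }
    // The subscriber instance is created here, by the consumer of the stream.
    source.subscribe(new Subscriber[Int] {
      def onNext(value: Int): Unit = events += s"next $value"
      def onCompleted(): Unit = events += "done"
    })
    events.toList
  }
}
```

So subscriber => { ... } does not instantiate anything. It only declares a function parameter, and the value arrives when subscribe is called.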
This code is creating a new Observable as described here.
Basically when a downstream component subscribes to this stream, this callback is called. In the callback we determine when we, as a data source, will call onNext(v: T) which is how we pass each element we are generating to them, and when we will call onCompleted() which is how we tell the subscriber that we are done sending data.
Once you have created an Observable you can start calling Observable operators, which will either result in another, compound Observable, or will result in a terminating condition which will end the process, and generally result in a final result for the flow (often a collection or aggregate value).
You don't use the List in your question, but normally if you wanted to make a reactive stream out of a list you would call Observable.from().
PS: I think this is RxJava code.

Serialize Function1 to database

I know it's not directly possible to serialize a function/anonymous class to the database but what are the alternatives? Do you know any useful approach to this?
To present my situation: I want to award a user "badges" based on his scores. So I have different types of badges that can be easily defined by extending this class:
class BadgeType(id:Long, name:String, detector:Function1[List[UserScore],Boolean])
The detector member is a function that walks the list of scores and returns true if the User qualifies for a badge of this type.
The problem is that each time I want to add/edit/modify a badge type I need to edit the source code, recompile the whole thing and re-deploy the server. It would be much more useful if I could persist all BadgeType instances to a database. But how to do that?
The only thing that comes to mind is to have the body of the function as a script (ex: Groovy) that is evaluated at runtime.
Another approach (that does not involve a database) might be to have each badge type into a jar that I can somehow hot-deploy at runtime, which I guess is how a plugin-system might work.
What do you think?
My very brief advice is that if you want this to be truly data-driven, you need to implement a rules DSL and an interpreter. The rules are what get saved to the database, and the interpreter takes a rule instance and evaluates it against some context.
But that's overkill most of the time. You're better off having a little snippet of actual Scala code that implements the rule for each badge, give them unique IDs, then store the IDs in the database.
e.g.:
trait BadgeEval extends Function1[User,Boolean] {
def badgeId: Int
}
object Badge1234 extends BadgeEval {
def badgeId = 1234
def apply(user: User) = {
user.isSufficientlyAwesome // && ...
}
}
You can either have a big whitelist of BadgeEval instances:
val weDontNeedNoStinkingBadges = Map(
  1234 -> Badge1234,
  5678 -> Badge5678,
  // ...
)
def evaluator(id: Int): Option[BadgeEval] = weDontNeedNoStinkingBadges.get(id)
def doesUserGetBadge(user: User, id: Int) = evaluator(id).map(_(user)).getOrElse(false)
... or if you want to keep them decoupled, use reflection:
def badgeEvalClass(id: Int) = Class.forName("com.example.badge.Badge" + id + "$").asInstanceOf[Class[BadgeEval]]
... and if you're interested in runtime pluggability, try the service provider pattern.
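For completeness, the "rules DSL plus interpreter" idea from the start of this answer could look roughly like the sketch below. Everything in it is illustrative: the points field on UserScore and the particular rule types are assumptions, since the question does not show what a score contains. The rules are plain case classes, so they can be written to a database in whatever encoding you like and rebuilt later, while the interpreter stays in compiled code.

```scala
// Hypothetical score type; the question's UserScore is not shown.
final case class UserScore(points: Int)

// Rules are plain data, so they can be persisted and reloaded.
sealed trait Rule
final case class TotalAtLeast(threshold: Int) extends Rule
final case class CountAtLeast(n: Int) extends Rule
final case class And(left: Rule, right: Rule) extends Rule
final case class Or(left: Rule, right: Rule) extends Rule

object RuleInterpreter {
  // Evaluates a rule against a user's scores.
  def eval(rule: Rule, scores: List[UserScore]): Boolean = rule match {
    case TotalAtLeast(t) => scores.map(_.points).sum >= t
    case CountAtLeast(n) => scores.size >= n
    case And(l, r)       => eval(l, scores) && eval(r, scores)
    case Or(l, r)        => eval(l, scores) || eval(r, scores)
  }
}
```

Adding a new badge then means inserting a new rule row, with no recompile, as long as the rule can be expressed with the existing node types.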
You can try and use Scala Continuations - they can give you the ability to serialize the computation and run it at later time or even on another machine.
Some links:
Continuations
What are Scala continuations and why use them?
Swarm - Concurrency with Scala Continuations
Serialization relates to data rather than methods. You cannot serialize functionality: the code lives in the class file, while object serialization only serializes the fields of an object.
So like Alex says, you need a rule engine.
Try this one if you want something fairly simple, which is string based, so you can serialize the rules as strings in a database or file:
http://blog.maxant.co.uk/pebble/2011/11/12/1321129560000.html
Using a DSL has the same problems unless you interpret or compile the code at runtime.

Not able to use Mockito ArgumentMatchers in Scala

I am using ScalaMock and Mockito
I have this simple code
class MyLibrary {
def doFoo(id: Long, request: Request) = {
println("came inside real implementation")
Response(id, request.name)
}
}
case class Request(name: String)
case class Response(id: Long, name: String)
I can easily mock it using this code
val lib = new MyLibrary()
val mock = spy(lib)
when(mock.doFoo(1, Request("bar"))).thenReturn(Response(10, "mock"))
val response = mock.doFoo(1, Request("bar"))
response.name should equal("mock")
But if I change my code to
val lib = new MyLibrary()
val mock = spy(lib)
when(mock.doFoo(anyLong(), any[Request])).thenReturn(Response(10, "mock"))
val response = mock.doFoo(1, Request("bar"))
response.name should equal("mock")
I see that it goes inside the real implementation and gets a null pointer exception.
I am pretty sure it goes inside the real implementation without matchers too, the difference is that it just doesn't crash in that case (any ends up passing null into the call).
When you write when(mock.doFoo(...)), the compiler has to call mock.doFoo to compute the parameter that is passed to when.
Doing this with mock works, because all implementations are stubbed out, but spy wraps around the actual object, so, the implementations are all real too.
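The evaluation order can be demonstrated without Mockito at all. In this sketch, when is a hand-written stand-in (not Mockito's when) and Library is a made-up class. It only shows that an ordinary method argument is computed before the method receiving it runs.

```scala
import scala.collection.mutable.ListBuffer

object EagerArgDemo {
  val log = ListBuffer.empty[String]

  class Library {
    def doFoo(id: Long): String = {
      log += "real doFoo ran"
      "real"
    }
  }

  // Stand-in for a `when`-style method: by the time its body runs,
  // the argument expression has already been evaluated.
  def when[T](methodCall: T): Unit = log += "when ran"

  def run(): List[String] = {
    log.clear()
    when(new Library().doFoo(1))
    log.toList
  }
}
```

The log shows the real doFoo body executing first, which is why a spy (whose methods are real) runs the implementation inside when(...), while a mock (whose methods are stubs) does not.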
Spies are frowned upon in the Mockito world, and are considered a code smell.
If you find yourself having to mock out some functionality of your class while keeping the rest of it, it is almost surely the case when you should just split it into two separate classes. Then you'd be able to just mock the whole "underlying" object entirely, and have no need to spy on things.
If you are still set on using spies for some reason, doReturn would be the workaround, as the other answer suggests. You should not pass null as the vararg parameter though, it changes the semantics of the call. Something like this should work:
doReturn(Response(10, "mock"), Array.empty:_*).when(mock).doFoo(any(), any())
But, I'll stress it once again: this is just a work around. The correct solution is to use mock instead of spy to begin with.
Try this
doReturn(Response(10, "mock"), null.asInstanceOf[Array[Object]]: _*).when(mock).doFoo(anyLong(), any[Request])