I moved from Casbah to Reactive Mongo and from that moment I couldn't make work the test of my actor.
I have a dao for the persistence layer and tests for that tier. All the tests passed. So, the only thing that comes to my mind its a problem of synchronization.
" UserActor " should {
val socketActorProbe = new TestProbe(system)
val peyiProbe = new TestProbe(system)
val identifyId = 1
val emailCsr = "csr#gmail.com"
val emailPeyi = "peyi#gmail.com"
val point = new Point[LatLng](new LatLng(-31.4314041, -64.1670626))
" test preStart() " in new WithApplication {
db.createDB(id1, id2, id3)
val userActorRefCsr = TestActorRef[UserActor](Props(classOf[UserActor], emailCsr, socketActorProbe.ref))
val csr = userActorRefCsr.underlyingActor
val userActorRef = TestActorRef[UserActor](Props(classOf[UserActor], emailPeyi, socketActorProbe.ref))
val peyi = userActorRef.underlyingActor
peyi.receive(ActorIdentity(identifyId, Option(userActorRefCsr)))
db.clearDB()
}
Actor class.
class UserActor(email: String, upstream: ActorRef) extends Actor {
import UserActor._
val identifyId = 1
val usersFromDB = ReactiveMongoFactory.db.collection[BSONCollection]("users")
val userDao = new UserDao(usersFromDB)
val meFuture = userDao.findMeByEmail(email)
var friends: Map[String, ActorRef] = Map()
override def preStart() = {
meFuture onComplete { result =>
val emailsFriends: List[String] = userDao.getMyFriendsEmail(result.get.get)
println(emailsFriends)
for (email <- emailsFriends) {
println("sending msg to " + email)
context.actorSelection("/user/" + email) ! Identify(identifyId)
}
}
}
private def giveMyFriend(email: String): Option[ActorRef] = {
for(friend <- friends){
if (friend._1 == email) new Some(friend._2)
}
None
}
def active(another: ActorRef): Actor.Receive = {
case Terminated(`another`) => context.stop(self)
}
def receive = {
case ActorIdentity(`identifyId`, Some(actorRef)) =>
meFuture onComplete { result =>
println(" ... subscribing ... " + result.get.get.basicProfile.email)
actorRef ! Subscribe(result.get.get.basicProfile.email.get)
context.watch(actorRef)
context.become(active(actorRef))
}
case Subscribe(email) =>
friends += (email -> sender)
context watch sender
case Terminated(user) => {
for(friend <- friends){
if (friend._2 == user ) friends -= friend._1 //removing by key
}
}
case UserMoved(email, point) =>
upstream ! UserPosition(email, System.currentTimeMillis(), point.coordinates)
}
}
Im receiving the following output.
The exception is thrown in the following lines of code.
def findMeByEmail(email: String): Future[Option[User]] = {
val query = BSONDocument("email" -> email)
println( " .... finding user ..... email: " + email )
val cursor = users.find(query).cursor[BSONDocument]
val userFuture = cursor.headOption.map(
doc => Some(userReader.read(doc.get))
)
userFuture
}
If I run the test for that method, it's all ok.
describe("get my friends emails") {
it("returns a list of emails") {
val futureUser = userDao.findMeByEmail("csr#gmail.com")
ScalaFutures.whenReady(futureUser) { result =>
val friends = userDao.getMyFriendsEmail(result.get)
assert(friends.length == 2)
}
}
}
Basically, Im trying to look my friends (Other actor) and then register them in a map to have a reference. I couldn't find any good example which shows tests using Reactive Mongo with Actors.
I hope somebody can help me to understand whats going on here. Thanks in advance.
Related
I'm new to Akka Stream. I used following code for CSV parsing.
class CsvParser(config: Config)(implicit system: ActorSystem) extends LazyLogging with NumberValidation {
import system.dispatcher
private val importDirectory = Paths.get(config.getString("importer.import-directory")).toFile
private val linesToSkip = config.getInt("importer.lines-to-skip")
private val concurrentFiles = config.getInt("importer.concurrent-files")
private val concurrentWrites = config.getInt("importer.concurrent-writes")
private val nonIOParallelism = config.getInt("importer.non-io-parallelism")
def save(r: ValidReading): Future[Unit] = {
Future()
}
def parseLine(filePath: String)(line: String): Future[Reading] = Future {
val fields = line.split(";")
val id = fields(0).toInt
try {
val value = fields(1).toDouble
ValidReading(id, value)
} catch {
case t: Throwable =>
logger.error(s"Unable to parse line in $filePath:\n$line: ${t.getMessage}")
InvalidReading(id)
}
}
val lineDelimiter: Flow[ByteString, ByteString, NotUsed] =
Framing.delimiter(ByteString("\n"), 128, allowTruncation = true)
val parseFile: Flow[File, Reading, NotUsed] =
Flow[File].flatMapConcat { file =>
val src = FileSource.fromFile(file).getLines()
val source : Source[String, NotUsed] = Source.fromIterator(() => src)
// val gzipInputStream = new GZIPInputStream(new FileInputStream(file))
source
.mapAsync(parallelism = nonIOParallelism)(parseLine(file.getPath))
}
val computeAverage: Flow[Reading, ValidReading, NotUsed] =
Flow[Reading].grouped(2).mapAsyncUnordered(parallelism = nonIOParallelism) { readings =>
Future {
val validReadings = readings.collect { case r: ValidReading => r }
val average = if (validReadings.nonEmpty) validReadings.map(_.value).sum / validReadings.size else -1
ValidReading(readings.head.id, average)
}
}
val storeReadings: Sink[ValidReading, Future[Done]] =
Flow[ValidReading]
.mapAsyncUnordered(concurrentWrites)(save)
.toMat(Sink.ignore)(Keep.right)
val processSingleFile: Flow[File, ValidReading, NotUsed] =
Flow[File]
.via(parseFile)
.via(computeAverage)
def importFromFiles = {
implicit val materializer = ActorMaterializer()
val files = importDirectory.listFiles.toList
logger.info(s"Starting import of ${files.size} files from ${importDirectory.getPath}")
val startTime = System.currentTimeMillis()
val balancer = GraphDSL.create() { implicit builder =>
import GraphDSL.Implicits._
val balance = builder.add(Balance[File](concurrentFiles))
val merge = builder.add(Merge[ValidReading](concurrentFiles))
(1 to concurrentFiles).foreach { _ =>
balance ~> processSingleFile ~> merge
}
FlowShape(balance.in, merge.out)
}
Source(files)
.via(balancer)
.withAttributes(ActorAttributes.supervisionStrategy { e =>
logger.error("Exception thrown during stream processing", e)
Supervision.Resume
})
.runWith(storeReadings)
.andThen {
case Success(_) =>
val elapsedTime = (System.currentTimeMillis() - startTime) / 1000.0
logger.info(s"Import finished in ${elapsedTime}s")
case Failure(e) => logger.error("Import failed", e)
}
}
}
I wanted to to use Akka HTTP which would give all ValidReading entities parsed from CSV but I couldn't understand on how would I do that.
The above code fetches file from server and parse each lines to generate ValidReading.
How can I pass/upload CSV via akka-http, parse the file and stream the resulted response back to the endpoint?
The "essence" of the solution is something like this:
import akka.http.scaladsl.server.Directives._
val route = fileUpload("csv") {
case (metadata, byteSource) =>
val source = byteSource.map(x => x)
complete(HttpResponse(entity = HttpEntity(ContentTypes.`text/csv(UTF-8)`, source)))
}
You detect that the uploaded thing is a multipart-form-data with a chunk named "csv". You get the byteSource from that. Do the calculation (insert your logic to the .map(x=>x) part). Convert your data back to ByteString. Complete the request with the new source. This will make your endoint like a proxy.
My stream works for smaller file of 1000 lines but stops when I test it on a large file ~12MB and ~250,000 lines? I tried applying backpressure with a buffer and throttling it and still same thing...
Here is my data streamer:
class UserDataStreaming(usersFile: File) {
implicit val system = ActorSystemContainer.getInstance().getSystem
implicit val materializer = ActorSystemContainer.getInstance().getMaterializer
def startStreaming() = {
val graph = RunnableGraph.fromGraph(GraphDSL.create() {
implicit builder =>
val usersSource = builder.add(Source.fromIterator(() => usersDataLines)).out
val stringToUserFlowShape: FlowShape[String, User] = builder.add(csvToUser)
val averageAgeFlowShape: FlowShape[User, (String, Int, Int)] = builder.add(averageUserAgeFlow)
val averageAgeSink = builder.add(Sink.foreach(averageUserAgeSink)).in
usersSource ~> stringToUserFlowShape ~> averageAgeFlowShape ~> averageAgeSink
ClosedShape
})
graph.run()
}
val usersDataLines = scala.io.Source.fromFile(usersFile, "ISO-8859-1").getLines().drop(1)
val csvToUser = Flow[String].map(_.split(";").map(_.trim)).map(csvLinesArrayToUser)
def csvLinesArrayToUser(line: Array[String]) = User(line(0), line(1), line(2))
def averageUserAgeSink[usersSource](source: usersSource) {
source match {
case (age: String, count: Int, totalAge: Int) => println(s"age = $age; Average reader age is: ${Try(totalAge/count).getOrElse(0)} count = $count and total age = $totalAge")
case bad => println(s"Bad case: $bad")
}
}
def averageUserAgeFlow = Flow[User].fold(("", 0, 0)) {
(nums: (String, Int, Int), user: User) =>
var counter: Option[Int] = None
var totalAge: Option[Int] = None
val ageInt = Try(user.age.substring(1, user.age.length-1).toInt)
if (ageInt.isSuccess) {
counter = Some(nums._2 + 1)
totalAge = Some(nums._3 + ageInt.get)
}
else {
counter = Some(nums._2 + 0)
totalAge = Some(nums._3 + 0)
}
//println(counter.get)
(user.age, counter.get, totalAge.get)
}
}
Here is my Main:
object Main {
def main(args: Array[String]): Unit = {
implicit val system = ActorSystemContainer.getInstance().getSystem
implicit val materializer = ActorSystemContainer.getInstance().getMaterializer
val usersFile = new File("data/BX-Users.csv")
println(usersFile.length())
val userDataStreamer = new UserDataStreaming(usersFile)
userDataStreamer.startStreaming()
}
It´s possible that there may be any error related to one row of your csv file. In that case, the stream materializes and stops. Try to define your flows like that:
FlowFlowShape[String, User].map {
case (user) => try {
csvToUser(user)
}
}.withAttributes(ActorAttributes.supervisionStrategy {
case ex: Throwable =>
log.error("Error parsing row event: {}", ex)
Supervision.Resume
}
In this case the possible exception is captured and the stream ignores the error and continues.
If you use Supervision.Stop, the stream stops.
I made a Source for an Akka Stream based on a ReactiveStreams Publisher like this:
object FlickrSource {
val apiKey = Play.current.configuration.getString("flickr.apikey")
val flickrUserId = Play.current.configuration.getString("flickr.userId")
val flickrPhotoSearchUrl = s"https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=$apiKey&user_id=$flickrUserId&min_taken_date=%s&max_taken_date=%s&format=json&nojsoncallback=1&page=%s&per_page=500"
def byDate(date: LocalDate): Source[JsValue, Unit] = {
Source(new FlickrPhotoSearchPublisher(date))
}
}
class FlickrPhotoSearchPublisher(date: LocalDate) extends Publisher[JsValue] {
override def subscribe(subscriber: Subscriber[_ >: JsValue]) {
try {
val from = new LocalDate()
val fromSeconds = from.toDateTimeAtStartOfDay.getMillis
val toSeconds = from.plusDays(1).toDateTimeAtStartOfDay.getMillis
def pageGet(page: Int): Unit = {
val url = flickrPhotoSearchUrl format (fromSeconds, toSeconds, page)
Logger.debug("Flickr search request: " + url)
val photosFound = WS.url(url).get().map { response =>
val json = response.json
val photosThisPage = (json \ "photos" \ "photo").as[JsArray]
val numPages = (json \ "photos" \ "pages").as[JsNumber].value.toInt
Logger.debug(s"pages: $numPages")
Logger.debug(s"photos this page: ${photosThisPage.value.size}")
photosThisPage.value.foreach { photo =>
Logger.debug(s"onNext")
subscriber.onNext(photo)
}
if (numPages > page) {
Logger.debug("nextPage")
pageGet(page + 1)
} else {
Logger.debug("onComplete")
subscriber.onComplete()
}
}
}
pageGet(1)
} catch {
case ex: Exception => {
subscriber.onError(ex)
}
}
}
}
It will make a search request to Flickr and source the results as JsValues. I tried to wire it to lots of different Flows and Sinks, but this would be the most basic setup:
val source: Source[JsValue, Unit] = FlickrSource.byDate(date)
val sink: Sink[JsValue, Future[Unit]] = Sink.foreach(println)
val stream = source.toMat(sink)(Keep.right)
stream.run()
I see that the onNext gets called a couple of times, and then the onComplete. However, the Sink does not receive anything. What am I missing, is this not a valid way to create a Source?
I mistakenly understood that Publisher was a simple interface like Observable, that you can implement yourself. The Akka team pointed out that this is not the correct way to implement a Publisher. In fact Publisher is a complicated class that is supposed to be implemented by libraries, rather than end users. This Source.apply(Publisher) method used in the question is there for interoperability with other Reactive Streams implementations.
The purpose for wanting an implementation of Source is that I want a backpressured source to fetch the search results from Flickr (which is maximized at 500 per request) and I don't want to make more (or faster) requests than is needed downstream. This can be achieved by implementing an ActorPublisher.
Update
This is the ActorPublisher that does what I want: create a Source that produces search results, but only makes as many REST calls as are needed downstream. I think there is still room for improvement, so feel free to edit it.
import akka.actor.Props
import akka.stream.actor.ActorPublisher
import akka.stream.actor.ActorPublisherMessage.{Cancel, Request}
import org.joda.time.LocalDate
import play.api.Play.current
import play.api.libs.json.{JsArray, JsNumber, JsValue}
import play.api.libs.ws.WS
import play.api.{Logger, Play}
import scala.concurrent.ExecutionContext.Implicits.global
object FlickrSearchActorPublisher {
val apiKey = Play.current.configuration.getString("flickr.apikey")
val flickrUserId = Play.current.configuration.getString("flickr.userId")
val flickrPhotoSearchUrl = s"https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=$apiKey&user_id=$flickrUserId&min_taken_date=%s&max_taken_date=%s&format=json&nojsoncallback=1&per_page=500&page="
def byDate(from: LocalDate): Props = {
val fromSeconds = from.toDateTimeAtStartOfDay.getMillis / 1000
val toSeconds = from.plusDays(1).toDateTimeAtStartOfDay.getMillis / 1000
val url = flickrPhotoSearchUrl format (fromSeconds, toSeconds)
Props(new FlickrSearchActorPublisher(url))
}
}
class FlickrSearchActorPublisher(url: String) extends ActorPublisher[JsValue] {
var currentPage = 1
var numPages = 1
var photos = Seq[JsValue]()
def searching: Receive = {
case Request(count) =>
Logger.debug(s"Received Request for $count results from Subscriber, ignoring as we are still searching")
case Cancel =>
Logger.info("Cancel Message Received, stopping")
context.stop(self)
case _ =>
}
def accepting: Receive = {
case Request(count) =>
Logger.debug(s"Received Request for $count results from Subscriber")
sendSearchResults()
case Cancel =>
Logger.info("Cancel Message Received, stopping")
context.stop(self)
case _ =>
}
def getNextPageOrStop() {
if (currentPage > numPages) {
Logger.debug("No more pages, stopping")
onCompleteThenStop()
} else {
val pageUrl = url + currentPage
Logger.debug("Flickr search request: " + pageUrl)
context.become(searching)
WS.url(pageUrl).get().map { response =>
val json = response.json
val photosThisPage = (json \ "photos" \ "photo").as[JsArray]
numPages = (json \ "photos" \ "pages").as[JsNumber].value.toInt
Logger.debug(s"page $currentPage of $numPages")
Logger.debug(s"photos this page: ${photosThisPage.value.size}")
photos = photosThisPage.value.seq
if (photos.isEmpty) {
Logger.debug("No photos found, stopping")
onCompleteThenStop()
} else {
currentPage = currentPage + 1
sendSearchResults()
context.become(accepting)
}
}
}
}
def sendSearchResults() {
if (photos.isEmpty) {
getNextPageOrStop()
} else {
while(isActive && totalDemand > 0) {
onNext(photos.head)
photos = photos.tail
if (photos.isEmpty) {
getNextPageOrStop()
}
}
}
}
getNextPageOrStop()
val receive = searching
}
I try write some simple akka-http and akka-streams based application, that handle http requests, always with one precompiled stream, because I plan to use long time processing with back-pressure in my requestProcessor stream
My application code:
import akka.actor.{ActorSystem, Props}
import akka.http.scaladsl._
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server._
import akka.stream.ActorFlowMaterializer
import akka.stream.actor.ActorPublisher
import akka.stream.scaladsl.{Sink, Source}
import scala.annotation.tailrec
import scala.concurrent.Future
object UserRegisterSource {
def props: Props = Props[UserRegisterSource]
final case class RegisterUser(username: String)
}
class UserRegisterSource extends ActorPublisher[UserRegisterSource.RegisterUser] {
import UserRegisterSource._
import akka.stream.actor.ActorPublisherMessage._
val MaxBufferSize = 100
var buf = Vector.empty[RegisterUser]
override def receive: Receive = {
case request: RegisterUser =>
if (buf.isEmpty && totalDemand > 0)
onNext(request)
else {
buf :+= request
deliverBuf()
}
case Request(_) =>
deliverBuf()
case Cancel =>
context.stop(self)
}
#tailrec final def deliverBuf(): Unit =
if (totalDemand > 0) {
if (totalDemand <= Int.MaxValue) {
val (use, keep) = buf.splitAt(totalDemand.toInt)
buf = keep
use foreach onNext
} else {
val (use, keep) = buf.splitAt(Int.MaxValue)
buf = keep
use foreach onNext
deliverBuf()
}
}
}
object Main extends App {
val host = "127.0.0.1"
val port = 8094
implicit val system = ActorSystem("my-testing-system")
implicit val fm = ActorFlowMaterializer()
implicit val executionContext = system.dispatcher
val serverSource: Source[Http.IncomingConnection, Future[Http.ServerBinding]] = Http(system).bind(interface = host, port = port)
val mySource = Source.actorPublisher[UserRegisterSource.RegisterUser](UserRegisterSource.props)
val requestProcessor = mySource
.mapAsync(1)(fakeSaveUserAndReturnCreatedUserId)
.to(Sink.head[Int])
.run()
val route: Route =
get {
path("test") {
parameter('test) { case t: String =>
requestProcessor ! UserRegisterSource.RegisterUser(t)
???
}
}
}
def fakeSaveUserAndReturnCreatedUserId(param: UserRegisterSource.RegisterUser): Future[Int] =
Future.successful {
1
}
serverSource.to(Sink.foreach {
connection =>
connection handleWith Route.handlerFlow(route)
}).run()
}
I found solution about how create Source that can dynamically accept new items to process, but I can found any solution about how than obtain result of stream execution in my route
The direct answer to your question is to materialize a new Stream for each HttpRequest and use Sink.head to get the value you're looking for. Modifying your code:
val requestStream =
mySource.map(fakeSaveUserAndReturnCreatedUserId)
.to(Sink.head[Int])
//.run() - don't materialize here
val route: Route =
get {
path("test") {
parameter('test) { case t: String =>
//materialize a new Stream here
val userIdFut : Future[Int] = requestStream.run()
requestProcessor ! UserRegisterSource.RegisterUser(t)
//get the result of the Stream
userIdFut onSuccess { case userId : Int => ...}
}
}
}
However, I think your question is ill posed. In your code example the only thing you're using an akka Stream for is to create a new UserId. Futures readily solve this problem without the need for a materialized Stream (and all the accompanying overhead):
val route: Route =
get {
path("test") {
parameter('test) { case t: String =>
val user = RegisterUser(t)
fakeSaveUserAndReturnCreatedUserId(user) onSuccess { case userId : Int =>
...
}
}
}
}
If you want to limit the number of concurrent calls to fakeSaveUserAndReturnCreateUserId then you can create an ExecutionContext with a defined ThreadPool size, as explained in the answer to this question, and use that ExecutionContext to create the Futures:
val ThreadCount = 10 //concurrent queries
val limitedExecutionContext =
ExecutionContext.fromExecutor(Executors.newFixedThreadPool(ThreadCount))
def fakeSaveUserAndReturnCreatedUserId(param: UserRegisterSource.RegisterUser): Future[Int] =
Future { 1 }(limitedExecutionContext)
I'm trying to obtain internal state of an actor in my unit test, but by some reason the old state persists.
My actor should be adding/removing/listing self-registering actor services:
class DirectoryServiceActor extends Actor {
var servicesMap: Map[String, List[ActorRef]] = Map.empty[String, List[ActorRef]]
def receive = {
case AddService(serviceType) ⇒
servicesMap = servicesMap + (serviceType -> (sender :: servicesMap.getOrElse(serviceType, List.empty[ActorRef])))
sender ! Ack
case RemoveService ⇒
val oldMap = servicesMap
servicesMap = servicesMap.mapValues(list ⇒ (if (list.contains(sender)) list.diff(List(sender)) else list).toList)
println(servicesMap)
if (servicesMap.equals(oldMap)) {
sender ! Nack
} else {
sender ! Ack
}
case ListServices ⇒
sender ! services
}
def services: Map[String, List[ActorRef]] = this.servicesMap
}
And my test is
"Remove existing service successfully" in {
implicit val timeout = 10 millis
val probe = new TestProbe(system)
val directoryService = TestActorRef[DirectoryServiceActor]
val actor = directoryService.underlyingActor
directoryService.tell(AddService("test"), probe.ref)
probe.expectMsg(timeout, Ack)
directoryService.tell(RemoveService, probe.ref)
probe.expectMsg(timeout, Ack)
println("TEST: " + actor.services)
actor.services("test") should not contain (probe.ref)
}
Judging by failed test and console output it seems that actor.underlyingActor.services returns the old value:
Map(test -> List())
TEST: Map(test -> List(Actor[akka://myApp/system/testActor3#-2080677614]))
Even though inside of the actor, the variable has already been set to a new value. What have I missed?
Update: Seems not to be related to Akka, actually, but can be worked around using futures in the test:
"Remove existing service successfully" in {
implicit val timeout = Timeout(100 millis)
val directoryService = TestActorRef[DirectoryServiceActor]
val addResponseFuture = directoryService ? AddService(self, "test")
addResponseFuture.value.get should be(Success(Ack(self)))
val removeResponseFuture = directoryService ? RemoveService(self)
removeResponseFuture.value.get should be(Success(Ack(self)))
val listResponseFuture = directoryService ? ListServices
listResponseFuture.value.get should be(Success(Map("test" -> List())))
val actor = directoryService.underlyingActor
actor.services("test") should not contain (self)
}
I suppose that it is happening due to mapValue not actually creating a new map: Scala: Why mapValues produces a view and is there any stable alternatives?
For some reason, I think that mapValues is what's causing issues for you. Try changing the RemoveService handling as follows:
case RemoveService =>
val oldMap = servicesMap
servicesMap = servicesMap.map{
case (key, list) => (key, list.filterNot(_ == sender))
}
if (servicesMap.equals(oldMap)) {
sender ! Nack
} else {
sender ! Ack
}