How to run a function every 3 seconds in Play Scala? [duplicate]

This question already has answers here:
How to correctly schedule task in Play Framework 2.4.2 scala?
(2 answers)
Closed 6 years ago.
I am new to Play Scala. My use case is to run a function every 3 seconds (every 3 seconds I check a MySQL DB; once I get a particular value, I proceed to the next step). Are there any other possibilities to do this?

You should inject the Akka ActorSystem into your class and schedule a job with it, like:
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import scala.concurrent.duration._ // needed for the 10.milliseconds syntax
// ...
system.scheduler.scheduleOnce(10.milliseconds) {
  file.delete()
}
This code is taken from the Play docs; you should read them. For more complicated tasks, use Actors, of course.
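Since the question asks for a recurring task every 3 seconds rather than a one-shot, here is a minimal sketch using the recurring scheduler on the injected ActorSystem (the class name PollingTask and the poll method are made up for illustration; in newer Akka versions, scheduleAtFixedRate/scheduleWithFixedDelay replace schedule):

import javax.inject.Inject
import akka.actor.ActorSystem
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._

class PollingTask @Inject()(system: ActorSystem)(implicit ec: ExecutionContext) {
  // Fires the block immediately, then every 3 seconds.
  system.scheduler.schedule(initialDelay = 0.seconds, interval = 3.seconds) {
    poll()
  }

  private def poll(): Unit = {
    // Query MySQL here; once the expected value appears, proceed to the next step.
  }
}

Bind the class as an eager singleton in a Play module so the schedule starts with the application.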

Related

Spark: Multiple independent actions in parallel [duplicate]

This question already has answers here:
How to run concurrent jobs(actions) in Apache Spark using single spark context
(2 answers)
Processing multiple files as independent RDD's in parallel
(3 answers)
How to run multiple Spark jobs in parallel?
(3 answers)
Closed 4 years ago.
I am using the SQL API of Spark 2.0.0.
I would like to know what the good practice is when I have two independent actions that have to be performed on my data. Here is a basic example:
val ds = sc.parallelize(List(
  ("2018-12-07T15:31:48Z", "AAA", 3),
  ("2018-12-07T15:32:48Z", "AAA", 25),
  ("2018-12-07T15:33:48Z", "AAA", 20),
  ("2018-12-07T15:34:48Z", "AAA", 10),
  ("2018-12-07T15:35:48Z", "AAA", 15),
  ("2018-12-07T15:36:48Z", "AAA", 16),
  ("2018-12-07T15:37:48Z", "AAA", 8),
  ("2018-12-07T15:31:48Z", "BBB", 15),
  ("2018-12-07T15:32:48Z", "BBB", 0),
  ("2018-12-07T15:33:48Z", "BBB", 0),
  ("2018-12-07T15:34:48Z", "BBB", 1),
  ("2018-12-07T15:35:48Z", "BBB", 8),
  ("2018-12-07T15:36:48Z", "BBB", 7),
  ("2018-12-07T15:37:48Z", "BBB", 6)
)).toDF("timestamp", "tag", "value")

val newDs = commonTransformation(ds).cache()
newDs.count() // force computation of the dataset

val dsAAA = newDs.filter($"tag" === "AAA")
val dsBBB = newDs.filter($"tag" === "BBB")

actionAAA(dsAAA)
actionBBB(dsBBB)
Using the following functions:
def commonTransformation(ds: Dataset[Row]): Dataset[Row] = {
  ds // do multiple transformations on the dataframe
}

def actionAAA(ds: Dataset[Row]): Unit = {
  Thread.sleep(5000) // sleep to simulate an action that takes time
  ds.show()
}

def actionBBB(ds: Dataset[Row]): Unit = {
  Thread.sleep(5000) // sleep to simulate an action that takes time
  ds.show()
}
In this example, we have an input dataset that contains multiple time series identified by the 'tag' column. Some transformations are applied to this whole dataset.
Then, I want to apply different actions to my data depending on the tag of the time series.
In my example, I get the expected result, but I had to wait a long time for both of my actions to execute, even though I had executors available.
I partially solved the problem by using the Java Future class, which allows me to start my actions asynchronously. But with this solution, Spark becomes very slow if I start too many actions relative to its resources, and it ends up taking more time than running the actions one by one.
So for now, my solution is to start multiple actions with a maximum limit on the number of actions running at the same time, but I don't feel like it is the right way to do it (and the maximum limit is hard to guess).
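For reference, a minimal sketch of that bounded-parallelism idea, using Scala Futures on a fixed-size thread pool (the pool size of 2 and the reuse of actionAAA/actionBBB from above are illustrative assumptions):

import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// A dedicated pool caps how many actions are submitted concurrently (here: 2).
implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))

// Each action runs on its own driver-side thread; Spark's scheduler then
// interleaves the resulting jobs across the available executors.
val fAAA = Future { actionAAA(dsAAA) }
val fBBB = Future { actionBBB(dsBBB) }

// Block until both actions have finished.
Await.result(Future.sequence(Seq(fAAA, fBBB)), Duration.Inf)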

Optimize database actions Slick 3

I have created a database with Slick, and I am trying to create table schemas, select some information, and so on. Here is my code for schema creation:
import scala.concurrent.Await
import scala.concurrent.duration._ // for the "2 seconds" syntax

val createUserTable = UserTable.table.schema.create
val createTaskTable = TaskTable.table.schema.create
Await.result(db.run(DBIO.seq(Queries.createUserTable, Queries.createTaskTable)), 2 seconds)
This code works just fine, but I do not want to use Await.result with every query. What I am looking for is executing them in batches, at least grouped by purpose (creation, selection, and so on). So far I have created this method to pass different actions:
def executeAction[T](action: DBIO[T]): T =
  Await.result(db.run(action), 2 seconds)
So I am curious how can I change it to pass some data structure which holds a sequence of queries? For example, List(createUserTable, createTaskTable)
Your help is appreciated!
There are two ways to avoid an Await for every DBIO action:
1. Create a list of DBIO actions, gather them with DBIO.seq, and execute the result.
2. Use a for-comprehension to compose all DBIO actions into one DBIO action.
Either way saves you from calling Await again and again to wait for the results of intermediate DBIO actions (see the sketch below).
In both cases, you still have to wait for the final result in the main thread (i.e. stop the main thread from exiting) using Await.result at least once.
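A minimal sketch of both approaches, reusing the table objects and the 2-second timeout from the question (db and the Slick profile import are assumed to already exist in scope):

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global // needed to compose DBIO actions
import scala.concurrent.duration._

// 1. Independent actions gathered into one with DBIO.seq
val setupAction: DBIO[Unit] =
  DBIO.seq(UserTable.table.schema.create, TaskTable.table.schema.create)

// 2. Dependent actions composed with a for-comprehension
//    (each step could use the result of the previous one)
val composedAction: DBIO[Unit] = for {
  _ <- UserTable.table.schema.create
  _ <- TaskTable.table.schema.create
} yield ()

// One Await at the edge of the program is still required;
// composedAction can be run through db.run the same way.
Await.result(db.run(setupAction), 2.seconds)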

Apache Spark multiple job to access single method

Say I have 5 jobs that want to access a single method that reads a big file and puts it into an RDD. Instead of reading the file multiple times (because all 5 jobs run the same method), there is a "mother" class that checks whether a job has already called the method.
Assuming these 5 jobs are executed in sequence, you can read the file and cache it with rdd.cache() in the first job itself; all the remaining jobs can then check whether the RDD is already cached and just use it, or read the file again otherwise.
For more info, refer to the RDD API.
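A minimal sketch of such a "mother" class, assuming all jobs run inside the same driver application (the object name and file path are made up):

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object SharedData {
  @volatile private var cached: Option[RDD[String]] = None

  // Reads and caches the file on the first call only; every later call
  // (from any of the 5 jobs) gets the already-cached RDD back.
  def bigFile(sc: SparkContext): RDD[String] = synchronized {
    cached.getOrElse {
      val rdd = sc.textFile("/data/big-file.txt").cache() // hypothetical path
      cached = Some(rdd)
      rdd
    }
  }
}

Each job then calls SharedData.bigFile(sc) instead of reading the file itself.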

Gatling test that uses a different action every request

We're trying to stress test our REST-ish app with Gatling. We want our users to make a post with a different fileBody every request.
Our scenario looks like:
scenario("100%")
.during(15 minutes) {
exec(requestStream.next())
.pause(118 seconds, 120 seconds)
}
.users(2)
.delay(2 minutes)
.protocolConfig(httpConf)
...build up several scenarios...
setUp(severalScenarios)
This runs fine, but it appears that the block with the exec is only executed once, when each scenario is first built. We thought the block would be executed every time the during(...) loop came around, giving each user a new Request from the iterator on every iteration over the 15 minutes.
Are we missing something? Is there a smarter way of doing this?
No, that's not the way the DSL works. The DSL elements are actually builders that are resolved once and for all when the simulation is loaded.
What you want is to inject dynamic data into your scenario elements, and for that you have to use Feeders, the user Session, Gatling EL, etc. What does your requestStream look like?
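For illustration, a sketch of the Feeder approach in the newer Gatling DSL (the nextPayloadPath helper and the /endpoint URL are made up; the question's own code uses an older DSL, but the idea carries over):

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Hypothetical helper producing a fresh payload file path per call.
def nextPayloadPath(): String = ???

// The feeder is consumed lazily: one new map per user per loop iteration.
val payloadFeeder = Iterator.continually(Map("payloadPath" -> nextPayloadPath()))

val scn = scenario("100%")
  .during(15.minutes) {
    feed(payloadFeeder) // resolved at run time, unlike requestStream.next()
      .exec(
        http("post payload")
          .post("/endpoint") // hypothetical endpoint
          .body(RawFileBody("${payloadPath}"))
      )
      .pause(118.seconds, 120.seconds)
  }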

How to make a @Schedule method be called only once at a time

We are using the @Schedule EJB annotation to start a scheduled job that does a delta SOLR import every x seconds. Of course we want as little time shift in indexing as possible. Currently I try to do the delta every 5 seconds.
So my question is: can I make sure the job is only called once at a time?
So let's say the indexing lasts longer than 5 seconds; will the next scheduled call wait until the previous one has returned?
Yes. You can make the bean a @Singleton and use @Lock(LockType.WRITE) to make sure that timers don't run in parallel. Note that LockType.WRITE is the default in a @Singleton, so the explicit annotation would not even be required.
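A minimal sketch of that setup, written in Scala to match the rest of this page (the bean name and method are made up; the javax.ejb annotations are the real ones):

import javax.ejb.{Lock, LockType, Schedule, Singleton}

@Singleton
class DeltaImportJob {
  // WRITE is already the default lock type on a singleton; shown explicitly.
  // While the write lock is held, a new timer invocation waits until the
  // previous one has finished, so delta imports never overlap.
  @Lock(LockType.WRITE)
  @Schedule(second = "*/5", minute = "*", hour = "*", persistent = false)
  def runDeltaImport(): Unit = {
    // trigger the SOLR delta import here
  }
}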