Share an object over multiple processes

Goal
I want to share an object with multiple processes. The object has a list of method references. Each process should check the list for work and process it.
Problem
As soon as I share the object, the list becomes empty.
Code
###manager.BlockNet.py:###
import manager.BlockNetProcess
from multiprocessing import queues

class blockmgr():
    WAITLIST = []
    kill = False

    def __init__(self):
        pass

    def adJob(self, job):
        if job not in self.WAITLIST:
            self.WAITLIST.append(job)

    def getJob(self):
        j = self.WAITLIST.pop(0)
        return j

    def __empty(self, id):
        print('empty')

    def startProcesses(self, mgr, number=8):
        self.queue = queues.Queue()  # was misspelled 'self.qeue', which would break the next line
        self.queue.put(mgr)
        result = manager.BlockNetProcess.start(self.queue, number)  # was 'q', which is undefined here
        return result
mgr is the object I want to share between all processes
###manager.BlockNetProcess.py:###
from multiprocessing import pool, queues

def worker(q):
    '''this is the method each process executes'''
    print('start Worker')
    mgr = q.get()
    while not mgr.kill:
        mgr = q.get()  # blocks until another object is put on the queue
        job = mgr.getJob()
        print(job)
        # job[0](job[1])
    return []

def start(q, number):
    for i in range(number):
        p = pool.Pool(processes=1)
        result = p.apply_async(worker, [q])
    return result
After starting the processes I get an error because I cannot pop from a list if it's empty. But I added things to the list before calling the startProcesses method.
Where is my mistake? Or is there a better method to exchange objects between processes?

Related

Pytest setup class once before testing

I'm using pytest for testing, with the mixer library for generating model data. So now I'm trying to set up my tests once before they run. I grouped them into test classes and gave my fixtures 'class' scope, but this doesn't work for me.
@pytest.mark.django_db
class TestCreateTagModel:
    @classmethod
    @pytest.fixture(autouse=True, scope='class')
    def _set_up(cls, create_model_instance, tag_model, create_fake_instance):
        cls.model = tag_model
        cls.tag = create_model_instance(cls.model)
        cls.fake_instance = create_fake_instance(cls.model)
        print('setup')

    def test_create_tag(self, tag_model, create_model_instance, check_instance_exist):
        tag = create_model_instance(tag_model)
        assert check_instance_exist(tag_model, tag.id)
conftest.py
@pytest.fixture(scope='class')
@pytest.mark.django_db(transaction=True)
def create_model_instance():
    instance = None
    def wrapper(model, **fields):
        nonlocal instance
        if not fields:
            instance = mixer.blend(model)
        else:
            instance = mixer.blend(model, **fields)
        return instance
    yield wrapper
    if instance:
        instance.delete()
@pytest.fixture(scope='class')
@pytest.mark.django_db(transaction=True)
def create_fake_instance(create_related_fields):
    """
    Function for creating a fake instance of a model (fake means the instance doesn't exist in the DB)
    Args:
        related (bool, optional): Flag which indicates whether to create related objects. Defaults to False.
    """
    instance = None
    def wrapper(model, related=False, **fields):
        with mixer.ctx(commit=False):
            instance = mixer.blend(model, **fields)
            if related:
                create_related_fields(instance, **fields)
            return instance
    yield wrapper
    if instance:
        instance.delete()
@pytest.fixture(scope='class')
@pytest.mark.django_db(transaction=True)
def create_related_fields():
    django_rel_types = ['ForeignKey']
    def wrapper(instance, **fields):
        for f in instance._meta.get_fields():
            if type(f).__name__ in django_rel_types:
                rel_instance = mixer.blend(f.related_model)
                setattr(instance, f.name, rel_instance)
    return wrapper
But I'm catching an exception in mixer's gen_value method: Database access not allowed, use django_db mark (which I'm already using). Do you have any ideas how this can be implemented?
You can set things up once before a run by returning the results of the setup, rather than modifying the testing class directly. From my own attempts, it seems any changes to the class made within class-scoped fixtures are lost when individual tests are run. So here's how you should be able to do this. Replace your _set_up fixture with these:
@pytest.fixture(scope='class')
def model_instance(self, tag_model, create_model_instance):
    return create_model_instance(tag_model)

@pytest.fixture(scope='class')
def fake_instance(self, tag_model, create_fake_instance):
    return create_fake_instance(tag_model)
And then these can be accessed through:
def test_something(self, model_instance, fake_instance):
    # Check that model_instance and fake_instance are as expected
I'm not familiar with Django myself though, so there might be something else with it going on. This should at least help you solve one half of the problem, if not the other.

Running a background task to update a cache

I'm creating a web server using Akka-HTTP. When receiving a request (to a certain route), the server calls a REST API. The call to this API is fairly slow, so I'm currently caching the result and subsequent requests use the cache. I want a background task that periodically updates the cache (by calling the API). When receiving a request, the server would then use the cached result instead of having to call the API; the cache would only be updated through the background task.
How would I do that? I can use Akka's scheduling module to run the task periodically but I don't know how to update the cache once the task has run.
Currently, I have something like that:
val roster = Util.get_roster()
var pcache = new SRCache(roster)

val route = cache(lfuCache, keyerFunction)(
  pathSingleSlash {
    get {
      complete(
        HttpEntity(
          ContentTypes.`text/html(UTF-8)`, Views.index(pcache.get).toString))
    }
  }
)
pcache.get calls the API (which is quite slow), and I want to replace the API call with something that simply returns the content of the cache, plus a background task that updates the cache.
Assuming you are using the cache from this example: https://doc.akka.io/docs/akka-http/current/common/caching.html.
import akka.http.caching.scaladsl.Cache
import akka.http.caching.scaladsl.CachingSettings
import akka.http.caching.LfuCache
import akka.http.scaladsl.server.RequestContext
import akka.http.scaladsl.server.RouteResult
import akka.http.scaladsl.model.Uri
import akka.http.scaladsl.server.directives.CachingDirectives._
import scala.concurrent.duration._

// Use the request's URI as the cache's key
val keyerFunction: PartialFunction[RequestContext, Uri] = {
  case r: RequestContext => r.request.uri
}

val defaultCachingSettings = CachingSettings(system)
val lfuCacheSettings =
  defaultCachingSettings.lfuCacheSettings
    .withInitialCapacity(25)
    .withMaxCapacity(50)
    .withTimeToLive(20.seconds)
    .withTimeToIdle(10.seconds)
val cachingSettings =
  defaultCachingSettings.withLfuCacheSettings(lfuCacheSettings)
val lfuCache: Cache[Uri, RouteResult] = LfuCache(cachingSettings)

// Create the route
val route = cache(lfuCache, keyerFunction)(innerRoute)
Your background task should be scheduled to update lfuCache. Here is the interface of this cache class you can use: https://doc.akka.io/api/akka-http/10.1.10/akka/http/caching/scaladsl/Cache.html.
Methods of interest:
abstract def get(key: K): Option[Future[V]]
// Retrieves the future instance that is currently in the cache for the given key.

abstract def getOrLoad(key: K, loadValue: (K) ⇒ Future[V]): Future[V]
// Returns either the cached Future for the given key,
// or applies the given value loading function on the key, producing a Future[V].

abstract def put(key: K, mayBeValue: Future[V])(implicit ex: ExecutionContext): Future[V]
// Cache the given future if not cached previously.
This is the Scheduler interface you can use:
https://doc.akka.io/docs/akka/current/scheduler.html
val cancellable =
system.scheduler.schedule(0 milliseconds, 5 seconds, ...)
Your scheduler will call lfuCache.put(...) every n seconds and update the cache.
Next, your code can follow one of these patterns:
Use the cached route as you are already doing with:
val route = cache(lfuCache, keyerFunction)(....
Or simply call lfuCache.get(key) or lfuCache.getOrLoad(...) without using the caching DSL directives (i.e. without cache(lfuCache,...)).
If you are using Cache class directly for putting and retrieving values, then consider using simpler keys instead of URI values.
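For example, here is a minimal sketch of that second pattern, assuming the system and cachingSettings values from the snippets above are in scope. fetchFromApi is a hypothetical stand-in for the slow REST call, not something from the original code:

import akka.http.caching.LfuCache
import akka.http.caching.scaladsl.Cache
import scala.concurrent.Future
import scala.concurrent.duration._

implicit val ec = system.dispatcher

// Hypothetical placeholder for the slow REST call
def fetchFromApi(): String = ???

// A dedicated cache keyed by a simple String instead of a URI
val apiCache: Cache[String, String] = LfuCache(cachingSettings)

// Refresh the entry periodically; remove it first, because put only
// stores a value if the key is not already cached
val cancellable = system.scheduler.schedule(0.seconds, 5.seconds) {
  apiCache.remove("roster")
  apiCache.put("roster", Future(fetchFromApi()))
}

// The route then reads the warm value instead of calling the API directly
val cachedRoster: Future[String] =
  apiCache.getOrLoad("roster", _ => Future(fetchFromApi()))

The scheduled block never blocks the route: requests only ever read whatever Future is currently cached.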

How can I run parallel instances of a function that returns a Try?

I have a function that returns a Try, and I want to run multiple instances of it in parallel, but I’m coming up blank on how to do that – I can only seem to run it one after the other.
Context: this function is meant to acquire a lock so that if multiple threads/workers are running in parallel, they don’t tread on each other’s toes. In the tests, I want to run five instances simultaneously, and assert that all but one of them was locked out. This was working when the function returned a Future, but I’ve done some refactoring and now it returns a Try, and the test has stopped working.
The behaviour doesn’t seem to be related to the locking code – it seems I just don’t understand concurrency!
I’ve been trying to use Future.fromTry, and execute them in parallel. For example:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.util.{Success, Try}

object Main extends App {
  def greet(name: String): Try[Unit] = Try {
    println(s"Hello $name!")
    Thread.sleep(1000)
    println(s"Goodbye $name!")
    ()
  }

  Seq("alice", "bob", "carol", "dave", "eve").map { name =>
    Future.fromTry { greet(name) }
  }
}
I’d expect to see all the “Hello” messages, and then all the “Goodbye” messages – instead, it seems to be executing them one after the other.
Hello alice!
Goodbye alice!
Hello bob!
Goodbye bob!
Hello carol!
Goodbye carol!
Hello dave!
Goodbye dave!
Hello eve!
Goodbye eve!
I looked around, and found suggestions about tweaking the ExecutionContext and adding parallelism – thing is, this environment seems perfectly happy to run Futures in parallel.
On the same machine, with the same global ExecutionContext, if I tweak the function to return a Future, not a Try, I see the output I’d expect, and the functions appear to be running in parallel.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.util.{Success, Try}

object Main extends App {
  def greet(name: String): Future[Unit] = Future {
    println(s"Hello $name!")
    Thread.sleep(1000)
    println(s"Goodbye $name!")
    ()
  }

  Seq("faythe", "grace", "heidi", "ivan", "judy").map { name =>
    greet(name)
  }

  Thread.sleep(2000) // Let the futures finish
}
Hello faythe!
Hello ivan!
Hello grace!
Hello judy!
Hello heidi!
Goodbye ivan!
Goodbye grace!
Goodbye heidi!
Goodbye judy!
Goodbye faythe!
What am I doing wrong with Future.fromTry that means it’s waiting for the Futures to finish? How do I make it match the second example?
Or am I barking up the wrong tree entirely?
The documentation explicitly states that fromTry will create an already completed Future from the result; it first evaluates the function and then lifts the result into the Future context. As such, it is completely serial.
You can first create a List[Future[String]] from the names, and then map the list, mapping the inner Futures to execute your function.
Or, since a Future already represents the possibility of failure (and internally uses Try), why not simply use Future in your function (as you said it was before)?
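For instance, a minimal sketch (not from the original answer) that keeps greet returning a Try but defers its evaluation with Future.apply, so the work runs on the ExecutionContext:

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.util.Try

// Future.apply defers evaluation to the ExecutionContext, so the five
// greet calls run in parallel; Future.fromTry would evaluate greet(name)
// eagerly on the calling thread instead
val running: Seq[Future[Try[Unit]]] =
  Seq("alice", "bob", "carol", "dave", "eve").map { name =>
    Future(greet(name))
  }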

Testing rx-observables from Futures/Iterables

I have:
val observable: Observable[Int] = Observable.from(List(5))
and I can test that the input list is indeed passed on to the observable by testing:
materializeValues(observable) should contain (5)
where materializeValues is:
def materializeValues[T](observable: Observable[T]): List[T] = {
  observable.toBlocking.toIterable.toList
}
Now, if I create an observable from a future, I can't seem to use materializeValues for the test as the test times out. So if I have:
val futVal = Future.successful(5)
val observable: Observable[Int] = Observable.from(futVal)
materializeValues(observable) should contain(5)
it times out and does not pass the test. What is different in the process of materializing these two observables, which leads to me not being able to block on it?
Also, what is the idiomatic way of testing an observable? Is there any way of doing it without calling toBlocking?
I think the problem is that you use AsyncWordSpecLike (by the way why AsyncWordSpecLike instead of AsyncWordSpec?). AsyncWordSpecLike/AsyncWordSpec are designed to simplify testing Future. Unfortunately Observable is a more powerful abstraction that can't be easily mapped onto a Future.
In particular, AsyncWordSpecLike/AsyncWordSpec allow your tests to return Future[Assertion]. To make that possible, they provide a custom implicit ExecutionContext that they can force to execute everything, so they know when all scheduled jobs have finished. However, the same custom ExecutionContext is the reason why your second snippet doesn't work: processing of the scheduled jobs starts only after the execution of your test code has finished, but your code blocks on futVal because, unluckily for you, the callback registered in Future.onComplete is scheduled to run on that ExecutionContext. This means you have a kind of deadlock with your own thread.
I'm not sure what the official way to test an Observable in Scala is. In Java I think TestSubscriber is the suggested tool. As I said, Observable is a fundamentally more powerful thing than Future, so I think to test an Observable you should avoid using AsyncWordSpecLike/AsyncWordSpec. If you switch to FlatSpec or WordSpec, you can do something like this:
class MyObservableTestSpec extends WordSpec with Matchers {
  import scala.concurrent.ExecutionContext.Implicits.global

  val testValue = 5

  "observables" should {
    "be testable if created from futures" in {
      val futVal = Future.successful(testValue)
      val observable = Observable.from(futVal)
      val subscriber = TestSubscriber[Int]()
      observable(subscriber)
      subscriber.awaitTerminalEvent
      // now after awaitTerminalEvent you can use various subscriber.assertXyz methods
      subscriber.assertNoErrors
      subscriber.assertValues(testValue)
      // or you can use Matchers as
      subscriber.getOnNextEvents should contain(testValue)
    }
  }
}

Parallel file processing in Scala

Suppose I need to process files in a given folder in parallel. In Java I would create a FolderReader thread to read file names from the folder and a pool of FileProcessor threads. FolderReader reads file names and submits the file processing function (Runnable) to the pool executor.
In Scala I see two options:
create a pool of FileProcessor actors and schedule a file processing function with Actors.Scheduler.
create an actor for each file name while reading the file names.
Does it make sense? What is the best option?
Depending on what you're doing, it may be as simple as
for (file <- files.par) {
  // process the file
}
I suggest with all my energy that you keep as far away from raw threads as you can. Luckily we have better abstractions which take care of what's happening below; in your case it appears to me that you do not need to use actors (though you can), but you can use a simpler abstraction called Futures. They are part of the Akka open-source library, and I think in the future they will be part of the Scala standard library as well.
A Future[T] is simply something that will return a T in the future.
All you need to run a future is an implicit ExecutionContext, which you can derive from a Java executor service. Then you will be able to enjoy the elegant API and the fact that a future is a monad: you can transform collections into collections of futures, collect the results, and so on. I suggest you take a look at http://doc.akka.io/docs/akka/2.0.1/scala/futures.html
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object TestingFutures {
  implicit val executorService = Executors.newFixedThreadPool(20)
  implicit val executorContext = ExecutionContext.fromExecutorService(executorService)

  def testFutures(myList: List[String]): List[String] = {
    // Future.traverse turns a List[String] into a Future[List[String]]
    val listOfFutures: Future[List[String]] = Future.traverse(myList) { aString =>
      Future {
        aString.reverse
      }
    }
    val result: List[String] = Await.result(listOfFutures, 1.minute)
    result
  }
}
There's a lot going on here:
I am using Future.traverse, which receives as its first parameter an M[T] <: Traversable[T] and as its second parameter a T => Future[T] (or, if you prefer, a Function1[T, Future[T]]), and returns a Future[M[T]].
I am using the Future.apply method to create an anonymous class of type Future[T]
There are many other reasons to look at Akka futures.
Futures can be mapped because they are monads, i.e. you can chain Future executions:
Future { 3 }.map { _ * 2 }.map { _.toString }
Futures have callbacks: future.onComplete, onSuccess, onFailure, andThen, etc.
Futures support not only traverse, but also for comprehensions (see the sketch below)
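A minimal sketch of that last point, assuming the standard global ExecutionContext:

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

// A for comprehension desugars to map/flatMap, so Futures compose directly
val sum: Future[Int] = for {
  a <- Future(2)
  b <- Future(3)
} yield a + b // completes with 5 once both futures finish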
Ideally you should use two actors. One for reading the list of files, and one for actually reading the file.
You start the process by simply sending a single "start" message to the first actor. The actor can then read the list of files, and send a message to the second actor. The second actor then reads the file and processes the contents.
Having multiple actors, which might seem complicated, is actually a good thing in the sense that you have a bunch of objects communicating with each other, like in a theoretical OO system.
Edit: you REALLY shouldn't be doing concurrent reading of a single file.
I was going to write up exactly what @Edmondo1984 did except he beat me to it. :) I second his suggestion in a big way. I'll also suggest that you read the documentation for Akka 2.0.2. As well, I'll give you a slightly more concrete example:
import akka.dispatch.{ExecutionContext, Future, Await}
import akka.util.duration._
import java.util.concurrent.Executors
import java.io.File

val execService = Executors.newCachedThreadPool()
implicit val execContext = ExecutionContext.fromExecutorService(execService)

val tmp = new File("/tmp/")
val files = tmp.listFiles()

val workers = files.map { f =>
  Future {
    f.getAbsolutePath()
  }
}.toSeq

val result = Future.sequence(workers)

result.onSuccess {
  case filenames =>
    filenames.foreach { fn =>
      println(fn)
    }
}

// Artificial just to make things work for the example
Thread.sleep(100)
execContext.shutdown()
Here I use sequence instead of traverse, but the difference is going to depend on your needs.
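For comparison, a short sketch (not part of the original answer) of the same pipeline written with traverse, which fuses the map and the sequence into a single call:

// Future.traverse maps each file to a Future and collects the results
// in one step, instead of map followed by Future.sequence
val result2 = Future.traverse(files.toSeq) { f =>
  Future {
    f.getAbsolutePath()
  }
}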
Go with the Future, my friend; the Actor is just a more painful approach in this instance.
But if we use actors, what's wrong with that?
Say we have to read from / write to some property file. Here is my Java example, but still with Akka actors.
Let's say we have an actor ActorFile that represents one file. Hm... probably it cannot represent one file, right? (It would be nice if it could.) So then it represents several files, like PropertyFilesActor.
Why not use something like this:
import java.util.LinkedHashMap;
import java.util.Map;
import akka.actor.UntypedActor;

// WriteMessage and ReadMessage are assumed to be simple message classes
// with fileName / stringToWrite fields
public class PropertyFilesActor extends UntypedActor {
    Map<String, String> filesContent = new LinkedHashMap<String, String>();
    { // here we should use real files of course
        filesContent.put("file1.xml", "");
        filesContent.put("file2.xml", "");
    }

    @Override
    public void onReceive(Object message) throws Exception {
        if (message instanceof WriteMessage) {
            WriteMessage writeMessage = (WriteMessage) message;
            String content = filesContent.get(writeMessage.fileName);
            String newContent = content + writeMessage.stringToWrite;
            filesContent.put(writeMessage.fileName, newContent);
        }
        else if (message instanceof ReadMessage) {
            ReadMessage readMessage = (ReadMessage) message;
            String currentContent = filesContent.get(readMessage.fileName);
            // Send the current content back to the sender
            getSender().tell(new ReadMessage(readMessage.fileName, currentContent), getSelf());
        }
        else unhandled(message);
    }
}
...each message carries a parameter (fileName).
It has its own inbox, accepting messages like:
WriteLine(fileName, string)
ReadLine(fileName, string)
Those messages will be stored in the inbox in order, one after another. The actor does its work by receiving messages from the box (storing/reading) and meanwhile sends feedback to the sender with sender ! message.
Thus, say we write to the property file and then show the content on a web page: we can start rendering the page right after we send the message to store the data in the file, and as soon as we receive the feedback, update part of the page with the data from the just-updated file (via AJAX).
Well, grab your files and stick them in a parallel structure
scala> new java.io.File("/tmp").listFiles.par
res0: scala.collection.parallel.mutable.ParArray[java.io.File] = ParArray( ... )
Then...
scala> res0 map (_.length)
res1: scala.collection.parallel.mutable.ParArray[Long] = ParArray(4943, 1960, 4208, 103266, 363 ... )