How to know the used buffer size of a SourceQueue in akka? - scala

So I'm a fan of using Source.queue, and now I want to monitor it to know how much of its buffer is in use at a given point. I haven't found a way to get that information. The only thing that comes to mind is a workaround: keep a mutable counter that I increment when I offer an element and decrement when the operation has completed.

You could change the QueueSource class to provide a materialized queue that exposes its internal buffer size. This can be done by creating a new trait - e.g.
trait SourceQueueWithCompleteAndSize[T] extends SourceQueueWithComplete[T] {
def size: Int
}
Then, in the final part of the stage (see the original code for reference), you need to provide your new trait implementation instead of the SourceQueueWithComplete. You should be able to access the size of the internal buffer with the used method.
(stageLogic, new SourceQueueWithCompleteAndSize[T] {
// other methods implementations
override def size: Int = stageLogic.buffer.used // NEW BIT
})
It's a fair amount of code to copy, but it might be better than adding an external counter around your stage. Might as well be a valid contribution to akka-stream-contrib.
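For comparison, the external-counter workaround from the question can be sketched as below. This is only a sketch under assumptions: `CountingOffer` is a hypothetical helper, not part of Akka, and `offer` stands in for a call such as `queue.offer(_)` on the materialized queue. It counts offers whose Future has not completed yet, which is at best an approximation of how full the internal buffer is.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}
import scala.util.control.NonFatal

// Hypothetical helper, not part of Akka: `offer` stands in for a call such as
// `queue.offer(_)` on a materialized SourceQueue.
final class CountingOffer[T, R](offer: T => Future[R])(implicit ec: ExecutionContext) {
  private val inFlight = new AtomicInteger(0)

  /** Number of offers whose Future has not completed yet. */
  def pending: Int = inFlight.get()

  def apply(elem: T): Future[R] = {
    inFlight.incrementAndGet()
    val result =
      try offer(elem)
      catch { case NonFatal(e) => inFlight.decrementAndGet(); throw e }
    // Decrement on success and on failure alike.
    result.onComplete(_ => inFlight.decrementAndGet())
    result
  }
}
```

An AtomicInteger is used because the completion callback may run on a different thread from the one that called the wrapper.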

Related

Can it be safe to share a var?

My application has a class ApplicationUsers that has no mutable members. Upon creation of instances, it reads the entire user database (relatively small) into an immutable collection. It has a number of methods to query the data.
I am now faced with the problem of having to create new users (or modify some of their attributes). My current idea is to use an Akka actor that, at a high level, would look like this:
class UserActor extends Actor{
var users = new ApplicationUsers
def receive = {
case GetUsers => sender ! users
case SomeMutableOperation => {
PerformTheChangeOnTheDatabase() // does not alter users (which is immutable)
users = new ApplicationUsers // reads the database from scratch into a new immutable instance
}
}
}
Is this safe? My reasoning is that it should be: whenever users is changed by SomeMutableOperation, any other threads using previous instances of users already hold a handle to an older version and should not be affected. Also, any GetUsers request will not be acted upon until the new instance has been safely constructed.
Is there anything I am missing? Is my construct safe?
UPDATE: I probably should be using Agents to do this, but the question still holds: is the above safe?
You are doing it exactly right: have immutable data types and reference them via var within the actor. This way you can freely share the data and mutability is confined to the actor. The only thing to watch out for is if you reference the var from a closure which is executed outside of the actor (e.g. in a Future transformation or a Props instance). In such a case you need to make a stack-local copy:
val currentUsers = users
other ? Process(users) recoverWith { case _ => backup ? Process(currentUsers) }
In the first case you just grab the value—which is fine—but asking the backup happens from a different thread, hence the need for val currentUsers.
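The stack-local copy can be shown without any actor machinery. The sketch below is an illustration under assumptions: `UserHolder`, `countWhen`, and `SnapshotDemo` are hypothetical names, and a plain class stands in for the actor. The point is only that the closure captures the val, so a later reassignment of the var cannot change what the asynchronous code sees.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

// Hypothetical stand-in for the actor: `users` is a var holding immutable data.
final class UserHolder(initial: Vector[String]) {
  private var users: Vector[String] = initial

  def reload(fresh: Vector[String]): Unit = users = fresh

  // Returns a Future whose body runs on another thread once `gate` completes.
  def countWhen(gate: Future[Unit])(implicit ec: ExecutionContext): Future[Int] = {
    val currentUsers = users          // stack-local copy, taken on the calling thread
    gate.map(_ => currentUsers.size)  // the closure captures the val, never the var
  }
}

object SnapshotDemo {
  def run(): Int = {
    import ExecutionContext.Implicits.global
    val holder = new UserHolder(Vector("alice"))
    val gate = Promise[Unit]()
    val count = holder.countWhen(gate.future)
    holder.reload(Vector("alice", "bob")) // a later change does not affect the snapshot
    gate.success(())
    Await.result(count, 5.seconds)
  }
}
```

Here `SnapshotDemo.run()` returns 1, the size of the collection at the moment the snapshot was taken.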
Looks fine to me. You don't seem to need Agents here.

Akka tracking when the actors finished

I have the following code in class1, which traverses a list of people and calls a callback for each of them.
def syncPeople(callback: Person => _) = Future {
person.findAll(criteria).foldLeft(0L) { (count, obj) =>
callback(obj)
count + 1
}
}
Callback and the call to syncPeople is in class2 and looks similar to this
def getActor(person: Person):ActorRef = {
if(person.isMale) maleActor
else femaleActor
}
def process(person: Person): Unit = {
val workActor = getActor(person)
workActor ! person
} //The actor does the actual work and may be quite intense
syncPeople(process)
Now I want to track the total time taken to sync all people, i.e. until the last workActor completes its work. I am using a third actor, MonitorActor, to keep track of start and end times. The MaleActor and FemaleActor can send messages to it when they have processed an individual.
What's the best way to keep track of these spawned processes?
I explored
Future.sequence // but the class sending the workActor the message is not an actor. so the future does not receive the message
Keeping track of personIds as they finish, but without a var to accumulate the received messages in MonitorActor this is not possible to implement, and using a var is not the preferred way of doing things.
What other ways of implementing this are there?
Funny, I'm working on a very similar problem to this at the moment. The solution I would suggest is using akka-fsm which keeps track of state.
Essentially in something outside of your state object, do something like generate a Long that represents an id:
def getId(): Long = System.currentTimeMillis() / 1000L
The state object when implemented correctly is immutable, so you just keep reusing this id throughout the transaction.
I know this answer is missing a lot of the implementation details but I'm still working on the implementation myself in my own code. Hopefully after reading about akka-fsm a bit and playing with it, this answer will make sense?
Don't demonize mutable state; it's SHARED mutable state that causes most of the problems. You don't have shared mutable state inside an actor, because you always talk to actorRefs and an actor processes only one message at a time (no race conditions and other evil stuff). What I'm saying is that it's OK to use a var (unless you spawn futures inside the actor that mutate the var, because then you are back to SHARING mutable state). FSM is another solution, as @devnulled suggested, but it sounds like overkill for your use case.
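A minimal sketch of the bookkeeping such a MonitorActor would need, written as a plain immutable value so it runs without Akka. All names here (`MonitorMsg`, `TotalKnown`, `PersonDone`, `Progress`) are hypothetical and not taken from the question's code. It assumes the workers report each finished person and that the total becomes known when the Future returned by syncPeople completes.

```scala
// Hypothetical messages and state, not taken from the question's code.
sealed trait MonitorMsg
final case class TotalKnown(total: Long) extends MonitorMsg // sent when syncPeople's Future completes
case object PersonDone extends MonitorMsg                   // sent by a worker after each person

final case class Progress(startNanos: Long, expected: Option[Long] = None, done: Long = 0L) {
  def handle(msg: MonitorMsg): Progress = msg match {
    case TotalKnown(total) => copy(expected = Some(total))
    case PersonDone        => copy(done = done + 1)
  }

  // Finished only once the total is known and every person has reported back.
  def finished: Boolean = expected.contains(done)

  def elapsedNanos(now: Long): Long = now - startNanos
}
```

Inside the actor this value could live in a var that is reassigned on each message, or be passed along through context.become. Either way the state stays confined to the actor, and the check works even if some workers finish before the total is known.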

How to re-boot liftweb?

I have my Boot.scala with a boot method in it where I do my setup.
At the end, I make the call to LiftRules.statelessDispatchTable and append a new instance of my class that extends RestHelper, which has the serve block.
At some point, I get a signal and need to change this class, so I need to make another call into the statelessDispatchTable to remove the original one and add a new one.
What's a good way to do this?
Thanks!
EDIT: I AM GOING TO UPDATE THE QUESTION WITH THE ANSWER I GOT FROM DAVID POLLAK:
You can't. Once your app is started, there's no way to change LiftRules.
However, the stuff you're adding to statelessDispatchTable is a PartialFunction[Req, Box[LiftResponse]] so you can write a PartialFunction that looks like:
object RestThing1 extends RestHelper { .... }
object RestThing2 extends RestHelper {....}
object MyDynamicRestThing extends PartialFunction[Req, Box[LiftResponse]] {
def isDefinedAt(in: Req): Boolean = if (testCondition) RestThing1.isDefinedAt(in) else RestThing2.isDefinedAt(in)
def apply(in: Req): Box[LiftResponse] = if (testCondition) RestThing1.apply(in) else RestThing2.apply(in)
}
LiftRules.statelessDispatchTable.append(MyDynamicRestThing)
You could create a second-level dispatch...e.g., an object that receives the requests, then according to some other logic proxies the requests on to the real handler. Then you don't have to mess with the top-level dispatch table at all.
Would really make sense to do this if what you are needing to do is toggle it based on a signal (e.g. it will revert back at some point), or if there is additional logic that would benefit from being in a proper abstraction.
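The second-level dispatch idea can be sketched independently of Lift. In the sketch below, `SwitchablePF` is a hypothetical class, and the generic types `A` and `B` stand in for whatever request and response types the real dispatch table expects. The table is given one stable handler, and the signal only swaps the delegate behind it.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical proxy: registered once, while the real handler behind it can be swapped.
final class SwitchablePF[A, B](initial: PartialFunction[A, B]) extends PartialFunction[A, B] {
  private val current = new AtomicReference[PartialFunction[A, B]](initial)

  /** Called when the signal arrives; requests after this use `next`. */
  def switchTo(next: PartialFunction[A, B]): Unit = current.set(next)

  def isDefinedAt(in: A): Boolean = current.get().isDefinedAt(in)

  def apply(in: A): B = current.get().apply(in)
}
```

One caveat of this design: a switch that happens between isDefinedAt and apply for the same request means the two calls consult different delegates, so both handlers should accept the same set of requests or the caller should tolerate a MatchError.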

Serialize Function1 to database

I know it's not directly possible to serialize a function/anonymous class to the database but what are the alternatives? Do you know any useful approach to this?
To present my situation: I want to award a user "badges" based on his scores. So I have different types of badges that can be easily defined by extending this class:
class BadgeType(id:Long, name:String, detector:Function1[List[UserScore],Boolean])
The detector member is a function that walks the list of scores and return true if the User qualifies for a badge of this type.
The problem is that each time I want to add/edit/modify a badge type I need to edit the source code, recompile the whole thing and re-deploy the server. It would be much more useful if I could persist all BadgeType instances to a database. But how to do that?
The only thing that comes to mind is to have the body of the function as a script (ex: Groovy) that is evaluated at runtime.
Another approach (that does not involve a database) might be to have each badge type into a jar that I can somehow hot-deploy at runtime, which I guess is how a plugin-system might work.
What do you think?
My very brief advice is that if you want this to be truly data-driven, you need to implement a rules DSL and an interpreter. The rules are what get saved to the database, and the interpreter takes a rule instance and evaluates it against some context.
But that's overkill most of the time. You're better off having a little snippet of actual Scala code that implements the rule for each badge, give them unique IDs, then store the IDs in the database.
e.g.:
trait BadgeEval extends Function1[User,Boolean] {
def badgeId: Int
}
object Badge1234 extends BadgeEval {
def badgeId = 1234
def apply(user: User) = {
user.isSufficientlyAwesome // && ...
}
}
You can either have a big whitelist of BadgeEval instances:
val weDontNeedNoStinkingBadges = Map(
1234 -> Badge1234,
5678 -> Badge5678,
// ...
)
def evaluator(id: Int): Option[BadgeEval] = weDontNeedNoStinkingBadges.get(id)
def doesUserGetBadge(user: User, id: Int) = evaluator(id).map(_(user)).getOrElse(false)
... or if you want to keep them decoupled, use reflection:
def badgeEvalClass(id: Int) = Class.forName("com.example.badge.Badge" + id + "$").asInstanceOf[Class[BadgeEval]]
... and if you're interested in runtime pluggability, try the service provider pattern.
You can try and use Scala Continuations - they can give you the ability to serialize the computation and run it at later time or even on another machine.
Some links:
Continuations
What are Scala continuations and why use them?
Swarm - Concurrency with Scala Continuations
Serialization relates to data rather than methods. You cannot serialize functionality this way: the behaviour lives in the class file, whereas object serialization only serializes the fields of an object.
So like Alex says, you need a rule engine.
Try this one if you want something fairly simple, which is string based, so you can serialize the rules as strings in a database or file:
http://blog.maxant.co.uk/pebble/2011/11/12/1321129560000.html
Using a DSL has the same problems unless you interpret or compile the code at runtime.
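To make the "rules as data plus an interpreter" suggestion concrete, here is a minimal sketch. The rule vocabulary (`MinTotal`, `MinCount`, `And`, `Or`) is invented for illustration, and plain Int scores stand in for the question's UserScore type. How the rule tree is stored (JSON, a string format, database rows) is left open.

```scala
// Hypothetical rule language: each badge is described by a value of type Rule,
// which is plain data and can therefore be persisted.
sealed trait Rule
final case class MinTotal(points: Int) extends Rule          // sum of all scores is at least `points`
final case class MinCount(count: Int) extends Rule           // at least `count` scores recorded
final case class And(left: Rule, right: Rule) extends Rule
final case class Or(left: Rule, right: Rule) extends Rule

object RuleInterpreter {
  // Evaluates a rule against a user's scores (Int stands in for UserScore).
  def eval(rule: Rule, scores: List[Int]): Boolean = rule match {
    case MinTotal(points) => scores.sum >= points
    case MinCount(count)  => scores.size >= count
    case And(left, right) => eval(left, scores) && eval(right, scores)
    case Or(left, right)  => eval(left, scores) || eval(right, scores)
  }
}
```

For example, `RuleInterpreter.eval(And(MinTotal(10), MinCount(2)), List(6, 5))` is true. Adding a badge then means inserting a new rule value, but adding a new kind of condition still requires a code change to the interpreter.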

Are singletons automatically persisted between requests in ASP.NET MVC?

I have a lookup table (LUT) of thousands of integers that I use on a fair number of requests to compute stuff based on what was fetched from the database.
If I simply create a standard singleton to hold the LUT, is it automatically persisted between requests or do I specifically need to push it to the Application state?
If they are automatically persisted, then what is the difference storing them with the Application state?
What would a correct singleton implementation look like? It doesn't need to be lazily initialized, but it needs to be thread-safe (thousands of theoretical users per server instance) and have good performance.
EDIT: Jon Skeet's 4th version looks promising http://csharpindepth.com/Articles/General/Singleton.aspx
public sealed class Singleton
{
static readonly Singleton instance=new Singleton();
// Explicit static constructor to tell C# compiler
// not to mark type as beforefieldinit
static Singleton()
{
}
Singleton()
{
}
public static Singleton Instance
{
get
{
return instance;
}
}
// randomguy's specific stuff. Does this look good to you?
private int[] lut = new int[5000];
public int Compute(Product p) {
return lut[p.Goo];
}
}
Yes, static members persist (not the same thing as being persisted - it's not "saved", it just never goes away), which includes implementations of a singleton. You get a degree of lazy initialisation for free: if it's created in a static assignment or static constructor, that code won't run until the relevant class is first used. That creation locks by default, but all other uses would have to be threadsafe, as you say. Given the degree of concurrency involved, unless the singleton is going to be immutable (your look-up table doesn't change for the application lifetime) you would have to be very careful about how you update it (one way is a fake singleton - on update you create a new object and then lock around assigning it to replace the current value; not strictly a singleton, though it looks like one "from the outside").
The big danger is that anything introducing global state is suspect, and especially when dealing with a stateless protocol like the web. It can be used well though, especially as an in-memory cache of permanent or near-permanent data, particularly if it involves an object graph that cannot be easily obtained quickly from a database.
The pitfalls are considerable though, so be careful. In particular, the risk of locking issues cannot be overstated.
Edit, to match the edit in the question:
My big concern would be how the array gets initialised. Clearly this example is incomplete, as it'll only ever have 0 for each item. If it gets set at initialisation and is then read-only, fine. If it's mutable, then be very, very careful about your threading.
Also be aware of the negative effect of too many such look-ups on scaling. While you save work for most requests by pre-calculating, the effect is a period of very heavy work whenever the singleton is updated. A long-ish start-up will likely be tolerable (as it won't happen very often), but arbitrary slow-downs happening afterwards can be tricky to trace to their source.
I wouldn't rely on a static being persisted between requests. [There is always the, albeit unlikely, chance that the process would be reset between requests.] I'd recommend HttpContext's Cache object for persisting shared resources between requests.
Edit: See Jon's comments about read-only locking.
It's been a while since I've dealt with singletons (I prefer letting my IOC container deal with lifetimes), but here's how you can handle the thread-safety issues. You'll need to lock around anything that mutates the state of the singleton. Read-only operations, like your Compute(Product), won't need locking.
// I typically create one lock per collection, but you really need one per set of atomic operations; if you ever modify two collections together, use one lock.
private object lutLock = new object();
private int[] lut = new int[5000];
public int Compute(Product p) {
return lut[p.Goo];
}
public void SetValue(int index, int value)
{
//lock as little code as possible. since this step is read only we don't lock it.
if(index < 0 || index >= lut.Length) // valid indices are 0 .. lut.Length - 1
{
throw new ArgumentException("Index not in range", "index");
}
// going to mutate state so we need a lock now
lock(lutLock)
{
lut[index] = value;
}
}