Should mutexes be used to ensure thread safety while using POSIX message queues?

I am using a single POSIX message queue in my application which is accessed by multiple readers. Should I use mutexes in this scenario?

This is similar to "Is msgsnd() thread- and/or process-safe?".
The short answer: it's already thread-safe, so there's no need to use mutexes.
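As a rough illustration (my own sketch, not part of the answer above), here is a minimal C program with several reader threads draining one queue and no mutex in sight. The queue name /myqueue is an assumption, the buffer size assumes the Linux default mq_msgsize of 8192, and error handling is minimal:

/* Sketch: multiple readers sharing one POSIX message queue without a mutex.
 * Assumes a queue named /myqueue already exists and its mq_msgsize is at
 * most 8192. Compile on Linux with -lrt -lpthread. */
#include <fcntl.h>
#include <mqueue.h>
#include <pthread.h>
#include <stdio.h>

#define NUM_READERS 4

static void *reader(void *arg)
{
    mqd_t mq = *(mqd_t *)arg;
    char buf[8192];

    for (;;) {
        ssize_t n = mq_receive(mq, buf, sizeof buf, NULL);
        if (n == -1) {
            perror("mq_receive");
            break;
        }
        printf("reader %lu got %zd bytes\n",
               (unsigned long)pthread_self(), n);
    }
    return NULL;
}

int main(void)
{
    /* Open an existing queue; no locking around the descriptor. */
    mqd_t mq = mq_open("/myqueue", O_RDONLY);
    if (mq == (mqd_t)-1) {
        perror("mq_open");
        return 1;
    }

    pthread_t readers[NUM_READERS];
    for (int i = 0; i < NUM_READERS; i++)
        pthread_create(&readers[i], NULL, reader, &mq);
    for (int i = 0; i < NUM_READERS; i++)
        pthread_join(readers[i], NULL);

    mq_close(mq);
    return 0;
}

Each mq_receive() call removes exactly one whole message, so two readers can never see the same message or a partial one.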

Read-Write lock with GCD

My application makes heavy use of GCD, and almost everything is split up into small tasks handled by dispatches. However, the underlying data model is mostly read and only occasionally written.
I currently use locks to prevent changes to the critical data structures while reading. But after looking into locks some more today, I found NSConditionLock and some page about read-write locks. The latter is exactly what I need.
I found this implementation: http://cocoaheads.byu.edu/wiki/locks. My question is, will this implementation work with GCD, seeing that it uses pthreads?
It will still work. pthreads is the threading API which underlies all of the other thread-using APIs on Mac OS X. (Under that there's Mach thread activations, but that's SPI, not API.) Anyway, the pthreads locks don't really require that you use pthreads threads.
However, GCD offers a better alternative as of iOS 5: dispatch_barrier_async(). Basically, you have a private concurrent queue. You submit all read operations to it in the normal fashion. You submit write operations to it using the barrier routines. Ta-da! Read-write locking.
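As a rough sketch of that pattern (my own illustration, not code from the answer; the queue label and the model_ function names are made up), using the plain C libdispatch API with Clang's blocks extension:

/* Sketch: reader/writer access guarded by a private concurrent queue plus
 * dispatch_barrier_async(). Compile with clang -fblocks; libdispatch is
 * built in on macOS/iOS. */
#include <dispatch/dispatch.h>
#include <stdio.h>

static dispatch_queue_t model_queue;
static int shared_value;            /* stand-in for the data model */

static void model_init(void)
{
    model_queue = dispatch_queue_create("com.example.model",
                                        DISPATCH_QUEUE_CONCURRENT);
}

static void model_write(int new_value)
{
    /* Barrier block: waits for earlier readers to drain, then runs alone. */
    dispatch_barrier_async(model_queue, ^{
        shared_value = new_value;
    });
}

static int model_read(void)
{
    __block int result;
    /* Plain submit: readers may run concurrently with each other. */
    dispatch_sync(model_queue, ^{
        result = shared_value;
    });
    return result;
}

int main(void)
{
    model_init();
    model_write(42);
    printf("value = %d\n", model_read());
    return 0;
}

Reads submitted normally can overlap each other, while the barrier block runs exclusively, which is exactly the read-write-lock behaviour described above.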
You can learn more about this if you have access to the WWDC 2011 session video for Session 210 - Mastering Grand Central Dispatch.
You might also want to consider maintaining a serial queue for all read/write operations. You can then dispatch_sync() writes to that queue to ensure that changes to the data model are applied promptly and dispatch_async() all the reads to make sure you maintain nice performance in the app.
Since you have a single serial queue on which all the reads and writes take place you ensure that no reads can happen during a write. This is far less costly than a lock but it means you cannot execute multiple 'read' operations simultaneously. This is unlikely to cause a problem for most applications.
Using dispatch_barrier_async() might mean that writes you make take an arbitrary amount of time to actually be committed since all the pre-existing tasks in the queue have to be completed before your barrier block executes.
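For comparison, here is a similarly hypothetical sketch of the serial-queue alternative (names are again illustrative; writes are synchronous, reads asynchronous):

/* Sketch: one serial queue serialises every access to the model, so no
 * read can overlap a write. Compile with clang -fblocks; libdispatch is
 * built in on macOS/iOS. */
#include <dispatch/dispatch.h>
#include <stdio.h>

static dispatch_queue_t model_queue;
static int shared_value;                 /* stand-in for the data model */

static void model_init(void)
{
    model_queue = dispatch_queue_create("com.example.model.serial",
                                        DISPATCH_QUEUE_SERIAL);
}

static void model_write(int new_value)
{
    /* dispatch_sync: the change is applied before the caller moves on. */
    dispatch_sync(model_queue, ^{
        shared_value = new_value;
    });
}

static void model_read_async(void (^handler)(int))
{
    /* dispatch_async: readers never stall the calling thread. */
    dispatch_async(model_queue, ^{
        handler(shared_value);
    });
}

int main(void)
{
    model_init();
    model_write(7);

    dispatch_semaphore_t done = dispatch_semaphore_create(0);
    model_read_async(^(int value) {
        printf("value = %d\n", value);
        dispatch_semaphore_signal(done);
    });
    dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
    return 0;
}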

What's the best way to handle incoming messages?

I'm writing a server for an online game that should be able to handle 1,000-2,000 clients in the end. The 3 ways I found to do this were basically:
1 thread/connection (blocking)
Making a list of clients and looping through them (non-blocking)
select() (basically a blocking call that waits on all clients at once, with an optional timeout?)
In the past I've used 1, but as we all know, it doesn't scale well. 2 is okay, but I have mixed feelings about one client technically being able to make everyone else freeze. 3 sounds interesting (a bit better than 2), but I've heard it's not suitable for too many connections.
So, what would be the best way to do it (in D)? Are there other options?
The usual approach is closest to 3: asynchronous programming with a higher-performance select alternative, such as the poll or epoll system calls on Linux, IOCP on Windows, or higher-level libraries wrapping them. D does not support them directly, but you can find D bindings or 3rd-party D libraries (e.g. Tango) providing support for them.
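For reference, here is a rough sketch, in C because these are the underlying system calls, of the kind of epoll loop such libraries wrap. The port number is arbitrary, and error handling and buffering are mostly omitted:

/* Sketch of a single-threaded epoll event loop (Linux). A real server
 * would add error handling, per-connection buffers and timeouts. */
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

int main(void)
{
    /* Non-blocking listening socket on an arbitrary port. */
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(4000);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof addr);
    listen(listen_fd, SOMAXCONN);
    fcntl(listen_fd, F_SETFL, O_NONBLOCK);

    /* One epoll instance watches the listener and every client. */
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* New connection: register it for readiness notifications. */
                int client = accept(listen_fd, NULL, NULL);
                if (client < 0)
                    continue;
                fcntl(client, F_SETFL, O_NONBLOCK);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
            } else {
                /* Readable client: read what is available without blocking. */
                char buf[4096];
                ssize_t got = read(fd, buf, sizeof buf);
                if (got <= 0) {          /* closed or error: drop the client */
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                } else {
                    /* ...hand buf/got off to the game's protocol logic... */
                }
            }
        }
    }
}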
Higher-performance servers (e.g. nginx) use one thread/process per CPU core, and use asynchronous event handling within that thread/process.
One option to consider is to have a single thread that runs the select/poll/epoll but does not process the results. Rather, it queues up connections known to have results and lets a thread pool feed from that queue. If checking that a full request has been read in is cheap, you might do that in the poll thread with non-blocking IO and only queue up complete requests.
I don't know if D provides any support for any of that aside from (possibly) the inter-thread communication and queuing.

Why does the Scala Actor implementation involve synchronized code?

My understanding is that the queue-based approach to concurrency can be implemented without locking. But I see lots of synchronized keywords in the Actor.scala file (looking at 2.8.1). Is it synchronized? Is that necessary? Would it make a difference if there were an implementation that was not synchronized?
Apparently the question wasn't clear enough: my understanding is that this can be implemented with a non-blocking queue. Why was it not done that way? Why use the synchronized keyword anywhere in here? There may be a very good reason, or it might be just because that's the way it was done and it's not necessary. I was just curious which it is.
The point is that the reactions, which you write in the "act" method, do not need to concern themselves with synchronization. Also, assuming that you do not expose the actor's state, your program will be fully thread safe.
That is not to say that there is no sync at all: synchronization is absolutely necessary [1] to implement read/write access to the actor's mailbox (i.e. the sending and receiving of messages) and also to ensure the actor's private state is consistent across any subsequent reacts.
This is achieved by the library itself and you, the user, need not concern yourself with how it is done. Your state is safe (you don't even need to use volatile fields) because of the JMM's happens before guarantees. That is, if a main-memory write happens before a sync point, then any read occurring after a sync will observe the main memory state left by the write.
[1] - by "synchronization", I mean some mechanism to guarantee a happens-before relationship in the Java Memory Model. This includes the synchronized keyword, the volatile modifier and/or the java.util.concurrent locking primitives

Is there any Non-blocking IO open source implementation for Scala's actors?

I have quite large files that I need to deal with (500Meg+ zip files).
Are there any non-blocking IO open source implementations for Scala's actors?
If I got your question right, you need non-blocking IO for files. I have bad news for you then.
NIO
Java NIO in Java 6 supports only blocking operations when working with files. You may notice this from the fact that FileChannel does not implement the SelectableChannel interface. (NIO does, however, support non-blocking mode for sockets.)
There is the NIO.2 (JSR 203) specification, aimed at overcoming many of the current limits of java.io and NIO and at providing support for asynchronous IO on files as well. NIO.2 is to be released with Java 7, as far as I understand.
These are Java library limits, so you will run into them in Scala as well.
Actors
Actors are based on Doug Lea's fork-join framework (at least in the 2.7.x branch, up to version 2.7.7). One quote from the FJTask class:
There is nothing actually preventing you from blocking within a FJTask, and very short waits/blocks are completely well behaved. But FJTasks are not designed to support arbitrary synchronization since there is no way to suspend and resume individual tasks once they have begun executing. FJTasks should also be finite in duration -- they should not contain infinite loops. FJTasks that might need to perform a blocking action, or hold locks for extended periods, or loop forever can instead create normal java Thread objects that will do so. FJTasks are just not designed to support these things.
The FJ library is enhanced in Scala to provide a unified model that lets an actor behave either like a thread or like an event-based task, depending on the number of worker threads and on "library activity" (you can find an explanation in the technical report "Actors that Unify Threads and Events" by Philipp Haller and Martin Odersky).
Solution?
After all, if you run blocking code in an actor it behaves just as if it were a thread, so why not use an ordinary Thread for the blocking reads and send events to event-based actors from that thread?
Are you talking about Remote actors? A standard Actor is of course an intra-JVM entity. I'm not aware of an NIO implementation of remote actors, I'm afraid.
Hello, is that an option for you?
bigdata(R) is a scale-out storage and computing fabric supporting optional transactions, very high concurrency, and very high aggregate IO rates.
http://sourceforge.net/projects/bigdata/
Not that I know of, but you could probably get a lot of mileage out of looking at Naggati, a Scala wrapper around Apache Mina. Mina is a networking library that uses NIO; Naggati translates this into a Scala style of coding.

Are nonblocking I/O operations in Perl limited to one thread? Good design?

I am attempting to develop a service that contains numerous client and server sockets (a server service as well as clients that connect out to managed components and persist) that are synchronously polled through IO::Select. The idea was to handle the I/O and/or request processing needs that arise through pools of worker threads.
The shared keyword that makes data shareable across threads in Perl (threads::shared) has its limits--handle references are not among the primitives that can be made shared.
Before I figured out that handles and/or handle references cannot be shared, the plan was to have a select() thread that takes care of the polling, and then puts the relevant handles in certain ThreadQueues spread across a thread pool to actually do the reading and writing. (I was, of course, designing this so that modification to the actual descriptor sets used by select would be thread-safe and take place in one thread only--the same one that runs select(), and therefore never while it's running, obviously.)
That doesn't seem like it's going to happen now because the handles themselves can't be shared, so the polling as well as the reading and writing is all going to need to happen from one thread. Is there any workaround for this? I am referring to the decomposition of the actual system calls across threads; clearly, there are ways to use queues and buffers to have data produced in other threads and actually sent in others.
One problem that arises from this situation is that I have to give select() a timeout, and expect that it'll be high enough not to cause any issues with polling a rather large set of descriptors while low enough not to introduce too much latency into my timing event loop - although I do understand that if there is actual I/O set membership detected in the polling process, select() will return early, which partly mitigates the problem. I'd rather have some way of waking select() up from another thread, but since handles can't be shared, I cannot easily think of a way of doing that, nor do I see the value in doing so; how would the other thread know when it's appropriate to wake select() anyway?
If no workaround, what is a good design pattern for this type of service in Perl? I have a requirement for a rather high amount of scalability and concurrent I/O, and for that reason went the nonblocking route rather than just spawning threads for each listening socket and/or client and/or server process, as many folks using higher-level languages these days are wont to do when dealing with sockets - it seems to be kind of a standard practice in Java land, and nobody seems to care about java.nio.* outside the narrow realm of systems-oriented programming. Maybe that's just my impression. Anyway, I don't want to do it that way.
So, from the point of view of an experienced Perl systems programmer, how should this stuff be organised? Monolithic I/O thread + pure worker (non-I/O) threads + lots of queues? Some sort of clever hack? Any thread safety gotchas to look out for beyond what I have already enumerated? Is there a Better Way? I have extensive experience architecting this sort of program in C, but not with Perl idioms or runtime characteristics.
EDIT: P.S. It has definitely occurred to me that perhaps a program with these performance requirements and this design should simply not be written in Perl. But I see an awful lot of very sophisticated services produced in Perl, so I am not sure about that.
Bracketing out your several, larger design questions, I can offer a few approaches to sharing filehandles across perl threads.
One may pass $client to a thread start routine or simply reference it in a new thread:
$client = $server_socket->accept();
threads->new(\&handle_client, $client);
async { handle_client($client) };
# $client will be closed only when all threads' references
# to it pass out of scope.
For a Thread::Queue design, one may enqueue() the underlying fd:
# assumes: use threads; use Thread::Queue; use IO::Handle; use POSIX ();
$q->enqueue( POSIX::dup(fileno $client) );
# we dup(2) so that $client may safely go out of scope,
# closing its underlying fd but not the duplicate thereof
async {
    my $client = IO::Handle->new_from_fd( $q->dequeue, "r+" );
    handle_client($client);
};
Or one may just use fds exclusively, and the bit vector form of Perl's select.