ZeroMQ pipeline pattern cannot work in multiprocess?

I think I am not using ZeroMQ in the right pattern. What I want to do is:
send messages via ZeroMQ from multiple processes;
receive the messages in multiple clients, but each message should be received only once.
Given the second requirement I thought a pipeline ( PUSH/PULL ) should be OK, but this mode does not seem to work across processes:
import concurrent.futures
import zmq

def foo(i):
    return i

def producer():
    context = zmq.Context()
    zmq_socket = context.socket(zmq.PUSH)
    zmq_socket.bind("tcp://127.0.0.1:5559")
    with concurrent.futures.ProcessPoolExecutor(max_workers=10) as executor:
        futs = [executor.submit(foo, i) for i in range(10)]
        for fut in concurrent.futures.as_completed(futs):
            work_message = { 'num': fut.result() }
            zmq_socket.send_json(work_message)  # send the dict; a bare str would fail under Python 3

producer()
So maybe I should use the PUB/SUB pattern instead, but that cannot meet the second requirement.
In fact, what I want is something like this:
PUSH |-----|                 |-----| PULL
PUSH |-----|                 |-----| PULL
PUSH |-----|---- DEVICE ----|-----| PULL
PUSH |-----|                 |-----| PULL
PUSH |-----|                 |-----| PULL

Not exactly. Roger, the ZeroMQ PUSH/PULL pattern works;
it just does something other than what you would like it to do.
ZeroMQ is a wonderful toolbox of pre-baked behaviours with immense potential for assembling more complex behaviour models, where needed.
Start with understanding the primitive actors, then design for your functional requirements.
The PUSH/PULL Formal Communication Scenario gets two players:
1st: PUSH picks up the phone and calls. Whom? The connected PULL side. PUSH leaves a voice-mail message for PULL to listen to, once it decides to pick it up from the voicemail.
2nd: The PULL side ( at some point in time, not necessarily right upon 1. ) hears the bell ringing and picks up the phone.
3rd: PULL, if instructed to, processes the message received from PUSH.
Nothing more, per se.
Assemblies of ZeroMQ primitive components:
Yes, there is a way to step farther towards your goal:
Just complement your functional requirements with the appropriate ZeroMQ primitives, inter-connected so as to meet your additional requirements.
The most trivial one, from Chapter 1 of "Code Connected, Volume I", is a round-robin forwarding of messages towards a pool of "worker" processes, as sketched right below.
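A minimal pyzmq sketch of that trivial round-robin assembly ( the port number and the payload shape are illustrative assumptions, not anything prescribed by the Guide ):

import zmq

def ventilator():
    ctx = zmq.Context()
    sender = ctx.socket(zmq.PUSH)
    sender.bind("tcp://127.0.0.1:5557")       # illustrative port
    for task_id in range(10):
        sender.send_json({'num': task_id})    # each message is delivered to exactly one worker

def worker():
    ctx = zmq.Context()
    receiver = ctx.socket(zmq.PULL)
    receiver.connect("tcp://127.0.0.1:5557")
    while True:
        task = receiver.recv_json()           # PUSH round-robins across all connected PULL-ers
        print("got task", task['num'])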
More specialised assemblies may create additional functionality for smart, distributed, complex behavioural models:
Typically a "control + signalling" SIG_PLANE is implemented in parallel to the primary, functional processing.
Self-diagnostics services, heartbeat/self-healing signalling for non-stop processing scenarios, a remote non-blocking logging service, and a front-end MVC/GUI plane are the most typical layered design goals.
Your task?
If you are not interested in more features, just .connect() your Futures-calculating PUSH-ers towards a PULL-ing endpoint.
This middle step ( a performance / failure-analysis singularity ) can collect on its PULL-ing entry side and immediately PUSH on its other endpoint, here on a round-robin basis, towards the pool of actual workers ( who crowd-wait with their .connect()-ed PULL-er entry endpoints for an incoming task from this "sink"-collector entity ). A task from the collected FIFO ( beware the buffering capacity and the performance overheads you pay for this ) just goes to the "next" worker down the line.
process(0)|....[PUSH].connect(A)|-\                                        /-|.connect(B)[PULL] process(-1)
process(1)|....[PUSH].connect(A)|--\    ________________________________  /--|.connect(B)[PULL] process(-2)
process(2)|....[PUSH].connect(A)|--->  |.bind(A)[PULL]:NOP:[PUSH].bind(B)|--->|.connect(B)[PULL] process(-3)
process(3)|....[PUSH].connect(A)|--/   |________________________________|  \--|.connect(B)[PULL] process(-4)
...                                                                           ...
process(n)|....[PUSH].connect(A)|-/                                        \-|.connect(B)[PULL] process(-m)
It's that simple.
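For completeness, a minimal pyzmq sketch of that middle entity; the A and B endpoints are illustrative assumptions, and zmq.proxy() stands in for the :NOP: forwarding loop:

import zmq

def collector_distributor():
    ctx = zmq.Context()
    frontend = ctx.socket(zmq.PULL)           # .bind(A): collects from all the PUSH-ers
    frontend.bind("tcp://127.0.0.1:5559")
    backend = ctx.socket(zmq.PUSH)            # .bind(B): round-robins towards the worker pool
    backend.bind("tcp://127.0.0.1:5560")
    zmq.proxy(frontend, backend)              # blocking forwarder: PULL in, PUSH out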
Enjoy the cool ZeroMQ toolkit.

Related

ZMQ socket - disconnect when all requests are served

I am trying to implement the ZMQ REQ/REP model in Java.
I have a Server role, running on port 5564, which acts as the Replier:
ZMQ.Socket repSock = context.socket(ZMQ.REP);
I have a Client role, running on port 5563:
ZMQ.Socket syncclient = context.socket(ZMQ.REQ);
I have a proxy server in the middle, which passes requests and responses:
ZMQ.proxy(reqSocket, repSocket, null);
The good thing about having a proxy is that I can add multiple Servers:
repSocket.connect("tcp://" + addr.getHostAddress() + ":" + port);
This works fine.
Now, when I remove a Server node from the Proxy:
repSocket.disconnect("tcp://" + addr.getHostAddress() + ":" + port);
the Client gets stuck, as a request has been made and the REQ socket waits for a response.
So the process gets stuck at syncclient.recvStr():
// effectively an endless request/reply loop
for (int request_nbr = 0; ; request_nbr++) {
    syncclient.send(str.getBytes(), 0);
    System.out.println("Send Dataaaa....... ");
    String data = syncclient.recvStr(Charset.defaultCharset());
    System.out.println(" here.. " + data);
}
I searched and couldn't find a way to track the REQ socket.
I need one of two things:
a way to keep track of a Socket instance which I am about to disconnect, and wait until all messages are processed, so that syncclient.recvStr() will not block; or
a way to reset the syncclient socket, so that I can keep the REQ/REP exchange going without interruption.
In real-world scenarios, rather avoid using the blocking mode of the ZeroMQ .send() / .recv() methods and use .poll() instead.
While this may require a few more SLOCs of code, the result leaves you in control, whereas a blocking SLOC takes all the control from your code and you cannot do much about that until ( if at all ) a next message gets delivered. That is a very wrong design practice, and except for the most simplistic schoolbook examples, blocking calls are actually an anti-pattern for the real world.
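For illustration, a non-blocking receive sketched in pyzmq ( the Java binding exposes the same idea via ZMQ.Poller ); the endpoint and the timeout are illustrative assumptions:

import zmq

ctx = zmq.Context()
req = ctx.socket(zmq.REQ)
req.connect("tcp://127.0.0.1:5563")          # illustrative endpoint

poller = zmq.Poller()
poller.register(req, zmq.POLLIN)

req.send_string("request")
events = dict(poller.poll(timeout=1000))     # wait at most 1000 ms, then regain control
if req in events:
    reply = req.recv_string()
else:
    pass                                     # no reply yet: your code stays in control and decides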
So, do not expect Question 2 to become somehow magically solved; this is not a part of the ZeroMQ API ( for many rather loudly evangelised reasons ). Better decide between .setsockopt( ZMQ.REQ_RELAXED, 1 ), if the API version and the context of use permit, or do not use the trivial REQ/REP pattern at all ( due to its known risk of falling into an unsalvageable mutual deadlock; ref. my other posts on this very subject, where this phenomenon was both illustrated and countless times explained ).
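In pyzmq terms ( assuming a libzmq 4.x underneath, where these options exist ), the relaxed-REQ tuning on the req socket from the sketch above would read:

req.setsockopt(zmq.REQ_RELAXED, 1)      # permit a fresh .send() even after a lost reply
req.setsockopt(zmq.REQ_CORRELATE, 1)    # correlate replies to requests, discard stale ones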
In a similar manner, asking Question 1 seems reasonable only in cases where you have never read the ZeroMQ specifications and/or documentation and the ZeroMQ "Best Practices". Having spent some time on these, your options would be crystal clear: there are no such tools built in. One can add some add-on, if in need of any similar non-core logic for her/his own needs. The only setting that indirectly influences the behaviour of aSocket.close() is available via .setsockopt( ZMQ.LINGER, 0 ), which may help to prevent a system from transitioning into an effective hangup state, where aSocket would wait infinitely for a state that will never happen, in cases when a message queue is still non-empty ( messages still waiting to get delivered ).
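And the LINGER knob, again on the same illustrative req socket:

req.setsockopt(zmq.LINGER, 0)   # on .close(), drop undelivered messages instead of waiting forever
req.close()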
Going into distributed-systems design is like entering a new world. No sequences are guaranteed ( non-serial code-execution paths happen ). There are no means of local control over remote entities: their states, their failures, their presence at all, their actual ZeroMQ API version.
Indeed a challenging world to enter.
N.b.:
You might already know that one can .connect() aSocket instance ( better: an Access Point to aSocket instance ) to more than one remote end, without using the proxy. Some additional .setsockopt() tuning ( setting ZMQ.IMMEDIATE to a value of 1 ) will help better manage the round-robin distribution policy, irrespective of the transport classes used for the actual message delivery ( { tcp:// | ipc:// | vmci:// | pgm:// | epgm:// | inproc:// } ). All that at your fingertips.
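A pyzmq sketch of that proxy-less fan-out; both server endpoints are illustrative assumptions:

import zmq

ctx = zmq.Context()
req = ctx.socket(zmq.REQ)
req.setsockopt(zmq.IMMEDIATE, 1)            # queue messages only onto completed connections
req.connect("tcp://serverA.example:5564")   # illustrative endpoints; requests get
req.connect("tcp://serverB.example:5564")   # distributed round-robin across the servers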

Where to invoke SagaManager in CQRS event handling

I am new to microservices and CQRS event handling. I am trying to understand it with one simple task. In this task I have three external REST services to handle one transaction/request. The three services are:
step 1: create the customer.
step 2: create a business for the customer.
step 3: create an address for the business.
I want to implement a SAGA for these events with an InMemorySagaRepository and a saga manager.
Where exactly do I have to initiate the SagaManager with the repository? Is it in the RestController or in the CommandHandler?
Can you please help me understand the saga flow?
Thanks in advance.
Half a year later, I'm making an edit, as I've now taken a course held by Greg Young called Greg Young's CQRS, Domain Events, Event Sourcing and how to apply DDD.
I really recommend it to anyone thinking about CQRS. It helps A LOT to understand what these things actually are.
Original answer
In our product we use Sagas as something that reacts to events.
This means that our sagas are really just Subscribers to a specific Event. The saga then holds some logic as to whether it should do something or not.
If the saga finds that an action should be taken, it creates a Command which it puts on the CommandBus.
This means that Sagas are just 'reactors' that take the same path in as a user would (skipping the APIs etc.); a minimal sketch of this 'reactor' shape follows below.
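A hypothetical Python sketch of that saga-as-reactor shape; all the names here ( OrderPlaced, ReserveStock, the command-bus interface ) are illustrative assumptions, not anything from the question:

from dataclasses import dataclass

@dataclass
class OrderPlaced:              # the event the saga subscribes to
    order_id: int
    total: float

@dataclass
class ReserveStock:             # the command the saga may emit
    order_id: int

class OrderPlacedSaga:
    """A saga as a plain reactor: subscribe, decide, dispatch a command."""
    def __init__(self, command_bus):
        self.command_bus = command_bus

    def handle(self, event: OrderPlaced) -> None:
        # the saga's only logic: decide whether an action should be taken
        if event.total > 0:
            self.command_bus.dispatch(ReserveStock(event.order_id))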
But what a Saga really is, and what it should do, differs from one person talking about them to another. (Disclaimer: this is how I read the posts below; they might actually all say the same thing, but in a way too fluffy for me [and my team] to see that.)
http://blog.jonathanoliver.com/cqrs-sagas-with-event-sourcing-part-i-of-ii/ for example raises the point that Sagas should not contain 'business logic' (anything that contains an 'if' is business logic, according to the post).
https://msdn.microsoft.com/en-us/library/jj591569.aspx talks about Sagas as 'Process managers' which coordinate things between different Aggregates (remember that Aggregate1 can't talk to Aggregate2 directly, so a 'Process manager' is required to orchestrate the communication). To put it simply: Event -> Saga -> Command -> Event -> Saga... until you reach the final destination.
https://lostechies.com/jimmybogard/2013/03/21/saga-implementation-patterns-variations/ talks about two different patterns of what a Saga is. One is 'Publish-gatherer', which basically coordinates what should happen based on a Command. The other is 'Reporter', which just reports the status of things to where they need to go. It doesn't coordinate things; it just reports whatever it needs to report.
http://kellabyte.com/2012/05/30/clarifying-the-saga-pattern/ has a write-up of what the Saga pattern 'is'. The claim is that Sagas should/could compensate for different workflows that break.
http://cqrs.nu/Faq/sagas has a very short description of what Sagas are and basically says 'they are state machines that let aggregates react to other aggregates'.
So, given that, what is it you actually want the Saga to do? Should it coordinate everything? Or should it just react and not care what the Aggregates do?
My edited part
So, after taking the course on CQRS and talking with Greg about this, I've come to the conclusion that there is quite a lot of confusion out there on the web.
Let's start with just the concept 'Saga'. A Saga actually has nothing to do with CQRS; it is not one of its concepts. A 'Saga' is a form of two-phase commit, only optimised for success rather than failure ( https://en.wikipedia.org/wiki/Compensating_transaction ).
Now, what most people mean when they talk CQRS and say "Saga" is "Process Manager". And process managers are quite complicated, it seems (Greg has a whole other course just for Process Managers).
Basically, what they do is manage the whole process of something (as the name suggests). The Microsoft link above is pretty much what it's all about.
To answer the question:
Where exactly do I have to initiate the SagaManager with the repository? Is it in the RestController or in the CommandHandler?
Outside of them both. A Process Manager is its own thing: it spans aggregates and repositories. Conceptually it might be better to look at it as a user doing all the things you want the PM to do, just that you program the user's interactions and tell it what to listen for.
Disclaimer: I do not work for Greg, or anyone who stands to gain from my recommending his courses. It's just that I learned a lot from it, so I recommend it just like I would recommend reading Eric Evans' book on DDD.
In my application I've built a Saga process manager using this MSDN documentation. My Saga is implemented in the Application Service layer; it listens to events of the Sales, Warehouse & Billing bounded contexts and, on event occurrence, sends commands via the Service Bus.
A simple example; hope it helps you analyse how to build your saga (I am registering the saga as a handler in the Composition Root) ;):
SAGA:
public class SalesSaga : Saga<SalesSagaData>,
                         ISagaStartedBy<OrderPlaced>,
                         IMessageHandler<StockReserved>,
                         IMessageHandler<PaymentAccepted>
{
    private readonly ISagaPersister storage;
    private readonly IBus bus;

    public SalesSaga(ISagaPersister storage, IBus bus)
    {
        this.storage = storage;
        this.bus = bus;
    }

    public void Handle(OrderPlaced message)
    {
        // Send ReserveStock command
        // Save SalesSagaData
    }

    public void Handle(StockReserved message)
    {
        // Restore & update SalesSagaData
        // Send BillCustomer command
        // Save SalesSagaData
    }

    public void Handle(PaymentAccepted message)
    {
        // Restore & update SalesSagaData
        // Send AcceptOrder command
        // Complete saga (dispose SalesSagaData)
    }
}
InMemorySagaPersister: (as the SalesSagaData ID I am using the OrderID; it is unique across the whole process)
public sealed class InMemorySagaPersister : ISagaPersister
{
    private static readonly Lazy<InMemorySagaPersister> instance =
        new Lazy<InMemorySagaPersister>(() => new InMemorySagaPersister());

    private InMemorySagaPersister()
    {
    }

    public static InMemorySagaPersister Instance
    {
        get { return instance.Value; }
    }

    ConcurrentDictionary<int, ISagaData> data = new ConcurrentDictionary<int, ISagaData>();

    public T GetByID<T>(int id) where T : ISagaData
    {
        T value;
        var tData = new ConcurrentDictionary<int, T>(data.Where(c => c.Value.GetType() == typeof(T))
                                                         .Select(c => new KeyValuePair<int, T>(c.Key, (T)c.Value))
                                                         .ToArray());
        tData.TryGetValue(id, out value);
        return value;
    }

    public bool Save(ISagaData sagaData)
    {
        bool result;
        ISagaData existingValue;
        data.TryGetValue(sagaData.Id, out existingValue);
        if (existingValue == null)
            result = data.TryAdd(sagaData.Id, sagaData);
        else
            result = data.TryUpdate(sagaData.Id, sagaData, existingValue);
        return result;
    }

    public bool Complete(ISagaData sagaData)
    {
        ISagaData existingValue;
        return data.TryRemove(sagaData.Id, out existingValue);
    }
}
One approach might be to have some sort of starting command that starts the Saga. In this scenario it would be configured in your composition root to listen to a certain command type. Once a command has been received in your message dispatcher (or whatever middleware messaging stuff you have) it would look for any Sagas that have been registered to be started by the command. You would then create the Saga and pass it the command. It could then react to other commands and events as they happen.
In your scenario I would suggest that your Saga is a type of command handler, so it would be initiated upon receiving a command; see the sketch below.
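A hypothetical Python sketch of that wiring; the dispatcher, the registration call and the saga factory are all illustrative assumptions:

class Dispatcher:
    """Starts a saga when a command type registered as a 'saga starter' arrives."""
    def __init__(self):
        self.saga_starters = {}                  # command type -> saga factory

    def register_saga_starter(self, command_type, saga_factory):
        self.saga_starters[command_type] = saga_factory

    def dispatch(self, command):
        factory = self.saga_starters.get(type(command))
        if factory is not None:
            saga = factory()                     # create the saga instance...
            saga.handle(command)                 # ...and hand it the starting command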

Forwarding AnyEvent::Log messages to a callback if certain requirements are met

I am working on a project that uses AnyEvent::Log in the main program as well as in several dependent modules/packages. I currently have each module writing to its own context, and all contexts are added to the main program's context as slaves. This project is part of a much larger project, and in addition to writing out a local log file, there are certain messages that I would like to send to a remote program, which will then be responsible for presenting the messages to users.
The problem is that in order to send to the remote program, I have to have a piece of information that is only available from the main program, so it's not feasible to just implement a method at the package level to send messages. The piece of information I need is more or less a transaction ID, and the log messages are interesting events from a particular transaction.
The main program has two contexts ( main, secondary ). The messages I am interested in will come either from the secondary ctx OR from one of the package/module contexts. I am interested in sending only info - crit level messages to users, but ONLY WHEN the txID exists in the main program. I ALWAYS want messages to be written to my local log file, regardless of whether or not a deployment is running. I would like this to be something that I set up in the main program rather than in a module, because the modules are tasked to do certain things and shouldn't even be aware of the fact that there is an ID associated with the task at hand.
Here is a quick breakdown of the log configuration specific code in the main program.
# Immediately after Proc::Daemon::Init
my $logger = AnyEvent::Log::ctx "desman";

# configure() is done before daemonization to allow for --nodaemon
sub configure {
    my ( $level, $file ) = @_;    # '@_' was mangled to '#_' in the original paste
    $AnyEvent::Log::FILTER->level($level);
    $AnyEvent::Log::LOG->log_to_file($file);
}

sub log_event {
    # ... logic to send messages as tx event ...
}

sub worker_init {
    threads->create(sub {
        $logger->attach( my $worklog = AnyEvent::Log::ctx "worker" );
        # ... more stuff for worker specifics ...
    });
}
Ideally, I would be able to use one or both of log_cb and fmt_cb to handle the formatting and sending of messages to the remote program using the log_event sub. I have tried a few different things, and so far I'm stuck.
# doesn't seem to do anything
$logger->fmt_cb( sub { ... } );
$logger->log_cb( sub { ... } );

# broke everything
$AnyEvent::Log::COLLECT->attach( my $evtlog = new AnyEvent::Log::Ctx
    fmt_cb => \&event_formatter,
    log_cb => \&log_event
);
$evtlog->levels('crit','warning','notice','info');
I've been searching around for more examples than what's in the docs, but haven't found much yet. Not much of a surprise there, since AE::log is pretty much awesome as it is, but anything that helps will be greatly appreciated.

Can my nginx module make a connection in the master process?

I'm writing an nginx module that wants to subscribe to a zeromq pubsub socket and update an in-memory data-structure based on the messages it receives. To save bandwidth, it makes sense that only one process should make the subscription, and the data structure should be in shm so that all processes can make use of it. To me it seems natural that that one process should be the master (since if it was a worker, the code would have to somehow decide which worker).
But when I call ngx_get_connection from my init_master or init_module callbacks, it segfaults, apparently due to ngx_cycle not being initialized yet. Google searches on plugins doing work in the master process seem pretty pessimistic. Is there a better way to accomplish my goal of making a single outgoing connection to the pubsub socket per server, regardless of how many workers it has?
Here's a sample of code that works in a worker context but not from the master:
/* runs inside a module init callback, where a valid ngx_cycle_t *cycle is at hand */
void *zmq_context = zmq_ctx_new();
void *control_socket = zmq_socket(zmq_context, ZMQ_SUB);
int control_fd;
size_t fdsize = sizeof(int);
ngx_connection_t *control_connection;

zmq_connect(control_socket, "tcp://somewhere:1234");
zmq_setsockopt(control_socket, ZMQ_SUBSCRIBE, "", 0);   /* subscribe to everything */
zmq_getsockopt(control_socket, ZMQ_FD, &control_fd, &fdsize);

control_connection = ngx_get_connection(control_fd, cycle->log);
control_connection->read->handler = my_read_handler;
control_connection->read->log = cycle->log;
ngx_add_event(control_connection->read, NGX_READ_EVENT, 0);
and elsewhere:
void my_read_handler (ngx_event_t *ev) {
    int events;
    size_t events_size = sizeof(events);

    /* ZMQ_FD is edge-triggered, so drain ZMQ_EVENTS until POLLIN clears */
    zmq_getsockopt(control_socket, ZMQ_EVENTS, &events, &events_size);
    while (events & ZMQ_POLLIN) {
        /* ...
           read a message, do something with it
           ... */
        events = 0;
        zmq_getsockopt(control_socket, ZMQ_EVENTS, &events, &events_size);
    }
}
To save bandwidth, it makes sense that only one process should make the subscription, and the data structure should be in shm so that all processes can make use of it. To me it seems natural that that one process should be the master (since if it was a worker, the code would have to somehow decide which worker).
As I already said, all you need is to set aside that natural idea and just use one worker process for your purpose.
Which worker? Well, let it be the first one started.

Debug missing messages in Akka

I have the following architecture at the moment:
Load (Play app with a basic interface for load tests) -> Gateway (Spray application with a REST interface for incoming messages) -> Processor (Akka app that works with MongoDB) -> MongoDB
Everything works fine as long as the number of messages I am pushing through is low. However, when I try to push 10000 events, which will eventually end up in MongoDB as documents, it stops at random places, for example on message 742 or message 982, and does nothing afterwards.
What would be the best way to debug such situations? On the load side I am just pushing hard into the REST service:
for (i ← 0 until users) workerRouter ! Load(server, i)
and then in the workerRouter
WS.url(server + "/track/message").post(Json.toJson(newUser)).map { response =>
    println(response.body)
    true
}
On the spray side:
pathPrefix("track") {
path("message") {
post {
entity(as[TrackObj]) { msg =>
processors ! msg
complete("")
}
}
}
}
On the processor side it's just basically an insert into a collection. Any suggestions on where to start from?
Update:
I tried moving the logic of creating messages to the Gateway and ran a cycle from 1 to 10000: it works just fine. However, when Spray and Play are both involved in the pipeline, it stops at random places. Any suggestions on how to debug in this case?
In a distributed and parallel environment it is next to impossible to create a system that works reliably. Whatever debugging method you use, it will only allow you to find the few bugs that happen during that debug session.
Once our team spent 3 months (!) tuning an application for robust 24/7 operation, and still there were bugs. Then we applied the method of model checking (Spin). Within a couple of weeks we implemented a model that allowed us to get a robust application. However, model checking requires a somewhat different way of thinking, and it can be difficult to start.
I moved the load-test app to the Spray framework and now it works like a charm, so I suppose the problem was somewhere in the way I used the WS API in the Play framework:
WS.url(server + "/track/message").post(Json.toJson(newUser)).map { response =>
    println(response.body)
    true
}
The problem is resolved but not solved; I won't work on a solution based on Play.