Is it possible to use EventMachine calls inside Thin without extra initialization?
Currently, I have a Sinatra app run by Thin (which is running as a service). When I try to use EventMachine.connect_unix_domain, I get "eventmachine not initialized ..." even though Thin (and presumably EventMachine) is running.
class App < Sinatra::Base
  $sock = EventMachine.connect_unix_domain("/tmp/appsock.sock")

  # import all routes
  Dir.glob("controllers/*.rb").each { |r| require_relative r }
end
My guess (sorry, I don't have EM installed on this box) is that the issue is that this code is evaluated when the class is loaded. At that point, Thin probably isn't set up yet and EM probably isn't initialized.
You could try wrapping the $sock = ... call in an EM.next_tick {} block, which should delay the execution until EM has actually started.
I believe, if memory serves, you can add callbacks via next_tick before EM is actually initialized.
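A minimal sketch of that approach, reusing the socket path from the question (this leans on EM queueing next_tick callbacks before the reactor starts, as described above):

class App < Sinatra::Base
  # defer the connection until the reactor is running; Thin starts EM,
  # and the queued block fires on the first tick
  EM.next_tick do
    $sock = EventMachine.connect_unix_domain("/tmp/appsock.sock")
  end

  # import all routes
  Dir.glob("controllers/*.rb").each { |r| require_relative r }
end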
I work on a fairly large Mojolicious application, which takes many seconds to compile.
Parts of the test suite of that application are written using Playwright, which currently sets up a pristine database for each test case and spins up an instance of the Mojolicious application using @mojolicious/server-starter.
The compile time of the application is starting to make it impractical to significantly expand the Playwright test suite, and I'd like to address that without giving up the isolation that separate test databases and Mojolicious application instances currently afford me.
To achieve that, the idea I'm currently pursuing is a small Perl application that can pre-load the larger Mojolicious application, and which the Playwright test suite can ask to spawn a new instance of the larger application, with a pristine database, on an open port.
I'd like to communicate with that small Perl application using HTTP, mainly out of convenience, and I'd like it to use Mojolicious for the HTTP side, because that's also convenient and consistent with the rest of the code base.
I've tried some naive approaches to implementing this idea, which looked roughly like this:
use TheBigApp;

$app->routes->post('/spawn-child')->to(cb => sub ($c) {
    my ($sock, $port) = new_listen_socket();
    if (my $pid = fork()) {
        # parent: record $pid to later be able to shut the child down
        $c->render(json => { url => "http://localhost:$port" });
    } else {
        # child: build a fresh instance of the big app and serve it
        my $bapp = TheBigApp->new($c->req->json);
        my $s = Mojo::Server::Daemon->new(listen => ...);
        $s->app($bapp);
        $bapp->start;
        return;
    }
});
All of the implementations I've tried along these lines ran into issues with the various singletons, such as the IOLoop. Even when I overrode IOLoop->singleton to return a new instance with a new reactor inside the sub-process, the forked-off child processes still appeared to be listening on the same socket as the parent process that spawned them.
Are there perhaps lower-level Mojolicious APIs that could make this use-case work? Would it perhaps be simpler to implement the small parent process without Mojolicious to sidestep the issue entirely?
Thanks!
Digging a little into the Mojo::Server::Daemon code, it always(?) sets up 'ReusePort' on the IO::Socket.
On Linux (not macOS or BSD; Windows I have no idea about), that means TCP connections to the same IP and port combination are 'load balanced' across multiple server instances by the kernel.
It is unclear from your post whether the spawned processes are listening on the same port or not.
Assuming you are running Linux, changing the listening port for each spawned child might help; see the sketch below.
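A sketch of that, using Mojo::IOLoop::Server->generate_port to pick a currently free port for each child (note the inherent race between picking the port and binding to it, which is usually tolerable in a test harness; TheBigApp is the application class from the question):

use Mojo::IOLoop::Server;
use Mojo::Server::Daemon;
use TheBigApp;

# in the child: bind the fresh app instance to its own port instead of
# sharing the parent's ReusePort socket
my $port   = Mojo::IOLoop::Server->generate_port;
my $daemon = Mojo::Server::Daemon->new(listen => ["http://127.0.0.1:$port"]);
$daemon->app(TheBigApp->new);
$daemon->run;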
When my connection is open, the application won't exit. This causes some nasty problems for me (the code is highly concurrent and nested, using a shared session, so I don't know when each part is finished). Is there a way to make sure that the cluster doesn't "hang" the application?
For example here:
import com.datastax.driver.core.Cluster

object ZombieTest extends App {
  val session = Cluster.builder().addContactPoint("localhost").build().connect()

  // the app doesn't exit unless doing:
  session.getCluster.close() // won't exit unless this is called
}
In a slightly biased answer, you could look at https://github.com/outworkers/phantom instead of using the standard Java driver.
You get a scala.concurrent.Future, a monix.eval.Task or even a com.twitter.util.Future from a query automatically; you can choose between all three.
DB connection pools are better isolated inside the ContactPoint and Database abstraction layers, which have shutdown methods you can easily wire into your app lifecycle.
It's far faster than the Java driver, as the serialization and deserialization of types is wired in at compile time via more advanced macro mechanisms.
The short answer is that you need a lifecycle hook that calls session.close or session.closeAsync when you shut down everything else; that is how the driver is designed to work.
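A minimal sketch of that with the plain Java driver (the object name is invented; the cluster/session setup mirrors the question):

import com.datastax.driver.core.Cluster

object GracefulTest extends App {
  val cluster = Cluster.builder().addContactPoint("localhost").build()
  val session = cluster.connect()

  try {
    // ... run queries on the shared session ...
  } finally {
    // closing these stops the driver's non-daemon threads,
    // which is what finally lets the JVM exit
    session.close()
    cluster.close()
  }
}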
My mod_perl2-based intranet app uses DBI->connect_cached(), which is supposedly overridden by Apache::DBI's version of the same. It has normally worked quite well, but just recently we started having an issue on our testing server, which had only two users connected: our app would sometimes, but not always, die when trying to reload a page, failing with 'FATAL: sorry, too many clients already' from our PostgreSQL 9.0 backend, despite all of the connections showing as <IDLE> in pgAdmin III's stats.
The backend is separate from our development and production backends, but they're all configured with max_connections = 100. Likewise the httpd services are all separate, but configured with
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 99
MaxClients 99
MaxRequestsPerChild 4000
....
PerlModule Apache::DBI
I had been under the impression that I shouldn't call disconnect() on my database handles if I wanted them to actually benefit from caching. Was I wrong about that? If not, I guess I'll ask about the above error separately. Just wanted to make sure it wasn't this setup...
Apache::DBI's docs say:
When loading the DBI module (do not confuse this with the Apache::DBI
module) it checks if the environment variable 'MOD_PERL' has been set
and if the module Apache::DBI has been loaded. In this case every
connect request will be forwarded to the Apache::DBI module.
....
There is no need to remove the disconnect statements from your code.
They won't do anything because the Apache::DBI module overloads the
disconnect method.
If you are developing new code that is strictly for use in mod_perl,
you may choose to use DBI->connect_cached() instead, but consider
adding an automatic rollback after each request, as described above.
So I guess for my mod_perl2-only app, I don't need Apache::DBI because Apache::DBI's devs recommend using DBI->connect_cached. And I don't need disconnect statements.
But then DBI's docs say:
Note that the behaviour of [ connect_cached ] differs in several
respects from the behaviour of persistent connections implemented by
Apache::DBI. However, if Apache::DBI is loaded then connect_cached
will use it.
This makes it sound like Apache::DBI will actually affect connect_cached, in that instead of getting DBI->connect_cached behaviour when I call that, I'll get Apache::DBI->connect behaviour. And Apache::DBI's docs recommend against that.
UPDATE: I've set the first 5 parameters in the above config all to 1, and my app is still using up more and more connections as I hit its pages. This I don't understand at all--it should only have one process, and that one process should be re-using its connection.
Unless you plan on dropping Apache::DBI, the answer is a firm no, because Apache::DBI's override really does nothing:
# overload disconnect
{
    package Apache::DBI::db;
    no strict;
    @ISA = qw(DBI::db);
    use strict;

    # the overloaded disconnect only logs and returns true;
    # the underlying connection is deliberately left open
    sub disconnect {
        my $prefix = "$$ Apache::DBI ";
        Apache::DBI::debug(2, "$prefix disconnect (overloaded)");
        1;
    }
}
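If you do end up dropping Apache::DBI, note that plain DBI->connect_cached keeps one handle per process for each distinct set of connect arguments, pinging and reusing it on subsequent calls. A sketch (the DSN and credentials are placeholders, not your real config):

use DBI;

# repeated calls with identical arguments return the same cached
# handle after a ping, so each httpd child holds a single connection
my $dbh = DBI->connect_cached(
    "dbi:Pg:dbname=myapp;host=dbhost", "appuser", "secret",
    { AutoCommit => 1, RaiseError => 1 },
);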
Is there any way to pause and resume the work of an embedded Python interpreter at exactly the place where I need it? For example:
C++ pseudo-code part:
main()
{
    script = "python_script.py";
    ...
    RunScript(script); //-- python script runs till the command 'stop'
    while (true)
    {
        //... read values from some variables in the python script
        //... do some work ...
        //... write new values to some other variables in the python script
        ResumeScript(script); //-- python script resumes its work where it
                              //   was stopped, not from the beginning!
    }
    ...
}
Python script pseudo-code part:
# ... do some init-work
while True:
    # ... do some work
    stop  # <- here the script stops and the C++ function RunScript()
          #    returns control to the C++ part
    # ... after the C++ function ResumeScript is called,
    #     the work continues from this line
Is this possible to do with Python/C API?
Thanks
I too have recently been searching for a way to manually "drive" an embedded language and I came across this question and figured I'd share a potential workaround.
I would implement the "blocking" behavior either through a socket or some kind of messaging system. Instead of actually stopping the whole Python interpreter, just have it block while it waits for C++ to do its evaluations.
C++ starts the embedded runtime, then enters a loop of some sort that waits for Python to "throw the signal" that it's ready. For instance: C++ listens on port 5000 and starts Python; Python does its work and then connects to port 5000 on localhost; C++ sees the connection, grabs the data from Python, performs work on it, and shuffles the result back over the socket to Python; Python receives the data and leaves its blocking loop.
I still need a way to fully pause the virtual runtime, but in your case you could achieve the same thing with a socket and some blocking behavior that uses the socket to coordinate the two pieces of code; a sketch of the Python side is below.
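A sketch of the Python side of that handshake (the port number and the line-based framing are assumptions, not a fixed protocol):

import socket

# connect to the C++ side, which is assumed to be listening on port 5000
conn = socket.create_connection(("localhost", 5000))

while True:
    # ... do some work, producing a result for C++ ...
    conn.sendall(b"ready\n")   # signal C++ that this side is done
    data = conn.recv(4096)     # block until C++ sends new values back
    # ... continue working with `data` ...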
Good luck :)
EDIT: You may be able to hook the "injection" functionality used in this answer to completely stop Python; just modify it to inject a wait-loop, perhaps:
Stopping embedded Python
Let's say there are two modules in our framework: Handler.pm and Queries.pm.
Queries.pm is optional and is loaded at FastCGI process startup:
BEGIN {
    &{"check_module_$_"}() foreach (Queries);
}

sub check_module_queries {
    ...
    require Eludia::Content::Queries;
    ...
}
Every module's functions are loaded into one common namespace.
Now, there are two functions with the same name (setup_page_content), one in Handler.pm and one in Queries.pm.
setup_page_content is called in Handler.pm.
It looks like the original author intended that Queries.pm's setup_page_content would be called whenever Queries.pm is loaded.
Sometimes that doesn't happen: a traceback (obtained via caller()) in those cases indicates that setup_page_content was called from Handler.pm.
I logged %INC just before the call, and in those cases it contains Queries.pm and its full path.
This behaviour is inconsistent and pops up in maybe 2-3% of attempts on a production installation, mostly when I send two parallel identical HTTP requests. Given the effort required to reproduce it, I haven't yet determined whether it is installation-specific.
How is it decided which version of a function with the same name will be called?
Is this well-defined behaviour?
There should be a reason the original author wrote the code this way.
UPDATE
perl version is v5.10.1 built for x86_64-linux-gnu-thread-multi
UPDATE 2
code: Handler.pm and Queries.pm
Queries.pm loading occurs in check_module_queries (in the BEGIN {} of Eludia.pm),
which is loaded for every request using Loader.pm (via use Loader.pm <params>).
I'd like Custom.pm's setup_page_content to be called whenever Custom.pm is loaded.
So you'd like to call Custom::setup_page_content when it exists.
if (exists(&Custom::setup_page_content)) {
    Custom::setup_page_content(...);
} else {
    Handler::setup_page_content(...);
}
Sometimes it doesn't happen.
The total lack of information about this prevents me from commenting. Code? Anything?
Is there a way to 'unload' a module but leave it in %INC?
Loading a module is simply running it. You can't "unrun" code. At best, you can manually undo some of the effects of running it. Whether that includes changing %INC or not is entirely up to you.
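For example, you can wipe everything a module installed into its own package while leaving %INC alone, so require still considers it loaded. A sketch using the core Symbol module (this only helps if the module keeps to its own package, which is not the case for the shared namespace described above):

use Symbol qw(delete_package);

# removes all subs and variables from the Queries package;
# %INC is untouched, so a later require of the file is a no-op
delete_package('Queries');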
When there is a namespace collision, the most recently defined/loaded function will take precedence.
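A tiny demonstration of that (under use warnings, the second definition triggers a "redefined" warning):

use strict;
use warnings;

sub setup_page_content { "from Handler" }
sub setup_page_content { "from Queries" }   # replaces the first definition

print setup_page_content(), "\n";           # prints "from Queries"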
These are not proper modules; they are simply libraries. If you turn them into proper modules with proper namespacing, you won't have this issue. You barely need to do anything more than:
package Queries;

sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $self  = {};
    bless($self, $class);
    return $self;
}

1;  # a module loaded via require/use must return a true value
And you'll be well on your way. Of course your program will need a couple of changes to actually access the class's functions. You can look at exporting the functions to possibly save some refactoring time; see the sketch below.
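If you go the exporting route, a minimal sketch with the core Exporter module (using the function name from the question):

package Queries;
use Exporter 'import';

# callers that say "use Queries qw(setup_page_content)" get the sub
# imported into their namespace explicitly, instead of by collision
our @EXPORT_OK = qw(setup_page_content);

1;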
I have forked the code at github and initiated a pull request that I hope will address your problem.
The behaviour is perfectly defined: when you call PACKAGENAME::function_name, PACKAGENAME::function_name is called. There cannot be more than one PACKAGENAME::function_name at a time.
It is possible to redefine each function as many times as you wish. When you, say, execute some code like
eval "sub PACKAGENAME::function_name {...}";
the function is [re]defined. When you require some .pm file, it's much like eval'ing a string with the file's content.
If PACKAGENAME is not given explicitly, the __PACKAGE__ value is used.
So if you observe that the version of setup_page_content is taken from Handler.pm, it must mean that Handler.pm's content was executed (by use, require, read/eval, etc.) after Queries.pm was loaded (for the last time).
So you have to track all the events of Handler.pm being loaded. The easy way is to add a debug print at the start of Handler.pm.
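A one-line sketch of such a debug print (logs the PID plus the file and line that triggered the load):

# at the very top of Handler.pm:
warn sprintf "[%d] Handler.pm loaded from %s line %d\n", $$, (caller)[1, 2];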