Using shared memory in a mod_perl environment

I have a requirement wherein I have to place a data structure (a Perl hash) in memory, so that each HTTP process (running a Perl script) can use that hash. The hash is around 300 MB, and the environment is mod_perl.
I thought of creating a module, loaded at Apache startup, that creates the hash in a shared region and returns a reference to it.
Can you comment on the behaviour of this approach, or suggest alternative solutions? Please also point to some good resources with examples.

If you place the huge hash data in mod_perl's memory, the mod_perl parent process can read it during the server startup phase.
First, create Your/HugeData.pm in a directory that is in @INC:
package Your::HugeData;
our %dictionary = (
....
);
Next, have the Apache process load it at startup:
# In apache.conf (or any other Apache configuration file)
PerlModule Your::HugeData
Then your script can use %Your::HugeData::dictionary as a package variable.
# In a mod_perl handler script or a ModPerl::Registry (CGI emulation) script.
use Your::HugeData;
...
my $tokyo = $Your::HugeData::dictionary{tokyo};
When you use the prefork MPM with Apache on Linux, the OS uses a copy-on-write mechanism, so the forked child processes share the parent process's data as long as they only read it. In other words, very little memory should be wasted.

I would be thinking in terms of handling it via Storable: store it to a file and retrieve it at startup.
If it needs to change, you'd need to use flock to arbitrate I/O, and potentially some mechanism for checking when it last changed (e.g. check the mtime).
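A minimal sketch of that idea follows; the file path and variable names are illustrative, not from the original post. Storable's lock_store/lock_retrieve use flock() internally to arbitrate access.
use strict;
use warnings;
use Storable qw(lock_store lock_retrieve);

# Hypothetical location for the serialized hash.
my $cache_file = '/var/cache/myapp/dictionary.stor';

# Writer side (run whenever the data changes): lock_store() holds an
# exclusive flock() while writing.
# lock_store(\%dictionary, $cache_file);

# Reader side (each mod_perl child): reload only when the file's mtime
# has moved on since the last read.
my ($data, $last_mtime) = (undef, 0);

sub get_dictionary {
    my @st = stat $cache_file or die "cannot stat $cache_file: $!";
    my $mtime = $st[9];
    if ( !$data || $mtime > $last_mtime ) {
        $data       = lock_retrieve($cache_file);   # shared flock() while reading
        $last_mtime = $mtime;
    }
    return $data;    # hash reference
}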

Related

Perl local libraries - Sybase

I'm going to build an extremely small script for dumping a Sybase database in Perl. The problem is that Perl doesn't come with Sybase support preinstalled. I don't have root access on the server, so I can't install any packages, and I can't reach the Perl folder. The server is not configured for internet access, so I have to deliver the packages "manually" through FTP.
So, my question is whether there are any easy ways of doing this. The only library I need is DBI::Sybase or standalone Sybase (maybe I haven't done enough research and don't even need this much?), which means I would love to just be able to put the .pm file there, loading it through
use localModule
and then run my small script.
The solution has to work on both Red Hat and Solaris, if I understood my supervisor correctly.
Best regards
Since you are primarily concerned with dumping the database, and not with data retrieval and manipulation, you could probably get by without DBI::Sybase or any other Perl module that is not preinstalled.
Without more details it's hard to be very specific, but here's the overview: your Perl script can execute some SQL scripts which dump the databases.
You can either put the list of databases you wish to dump in a config file (or env file), or you can generate it dynamically by calling isql with the -b option to suppress headers and "set nocount on" to suppress footers, and store the output in an array.
Once you have the list of databases, just loop over them, running another isql command to dump each database.
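A rough sketch of that loop; the server name, credentials, and dump directory are placeholders you would replace with your own values.
use strict;
use warnings;

my ($server, $user, $pass) = ('SYBSERVER', 'sa', 'secret');   # placeholders
my $dump_dir = '/dumps';                                      # placeholder

# Build the database list: -b suppresses column headers, and
# "set nocount on" suppresses the row-count footer.
my @databases = `isql -S $server -U $user -P $pass -b <<'SQL'
set nocount on
select name from master..sysdatabases
go
SQL`;
s/^\s+|\s+$//g for @databases;              # trim whitespace and newlines
@databases = grep { length } @databases;

# Dump each database by piping a dump command into another isql run.
for my $db (@databases) {
    open my $isql, '|-', "isql -S $server -U $user -P $pass"
        or die "cannot run isql: $!";
    print {$isql} qq{dump database $db to "$dump_dir/$db.dmp"\ngo\n};
    close $isql or warn "isql exited with status $? while dumping $db\n";
}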

Version 5 UUID in Perl

Off topic:
I'm new to Stack Overflow, and I wanted to say hello!
On topic:
I'm generating a version 5 UUID for an application that needs randomized folder creation and deletion, keyed off a timestamp from time(), through
my $md5_UUID = create_uuid_as_string(UUID_MD5, time."$job");
These folders are generated per run on each job, and are deleted after running. If the same UUID is somehow generated, the ~1,000 jobs that are running could halt.
Is there any information I can pull from this, or any possibility of collisions (different data generating the same UUID)? Are they truly unique? Also, which UUID version should I use, SHA-1 or MD5?
Use OS Tools
There's probably a pure Perl solution, but it may be overkill. If you are on a Linux system, you can capture the results of mktemp or uuidgen and use them in your Perl script. For example:
$ perl -e 'print `mktemp --directory`'
/tmp/tmp.vx4Fo1Ifh0
$ perl -e '$folder = `uuidgen`; print $folder'
113754e1-fae4-4685-851d-fb346365c9f0
The mktemp utility is nice because it will atomically create the directory for you, in addition to returning the directory name. You also have the ability to give more meaningful names to the directory by modifying the template (see man 1 mktemp); in contrast, UUIDs are not really good at conveying useful semantics.
If the folders last only the length of a job, and all the jobs are running on the same machine, you can just use the pid as a folder name. No need for uuids at all.
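For example, a minimal sketch of the pid approach (the base directory /tmp/myjobs is just an example):
use strict;
use warnings;
use File::Path qw(make_path remove_tree);

my $job_dir = "/tmp/myjobs/$$";   # $$ is the current process id
make_path($job_dir);              # create the per-job working directory
# ... do the job's work in $job_dir ...
remove_tree($job_dir);            # clean up when the job finishes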
Use a v1 UUID
Perl's time() function is accurate to the second. So, if you're starting your jobs multiple times per second, or simultaneously on separate hosts, you could easily get collisions.
By contrast, a v1 UUID's time field has 100-nanosecond granularity, and it includes the MAC address of the generating host. See RFC 4122 for details. I can imagine a case where that wouldn't guarantee uniqueness (the client machines are VMs on separate layer-3 virtual networks, all with the same virtual MAC address), but that seems pathologically contrived.
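Assuming the create_uuid_as_string in the question comes from UUID::Tiny (other UUID modules have similar interfaces), a time-based v1 UUID would look something like this:
use strict;
use warnings;
use UUID::Tiny ':std';    # exports create_uuid_as_string, UUID_V1, etc.

# A v1 UUID combines a 100-nanosecond timestamp with a node identifier,
# so two calls in the same second still produce different values.
my $v1_uuid = create_uuid_as_string(UUID_V1);
my $job_dir = "/tmp/jobs/$v1_uuid";    # example per-job directory name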

Perl Proc module running method keeps the program hung and never returns - why?

The Perl Proc::PID::File module's running function never exits; it keeps running indefinitely:
#!/usr/bin/perl -w
use Proc::PID::File;
my %g_args = ( name => "temp", verify => 1, dir => "/home/username/");
print "Hello , world";
print Proc::PID::File->running(%g_args);
exit(0);
Even on Ctrl+C it is not killed.
It's not even throwing any exception. Where am I going wrong?
I'm a complete beginner with Perl.
File locking on NFS-mounted disks is problematic, even at the best of times. Proc::PID::File seems designed to operate on local filesystems (at least my perusal of the code doesn't indicate that it's taking the special care required to handle remote filesystems). Hanging is unfortunately typical of some NFS-related problems, and you will not be able to kill the process easily.
Is there some reason that you need to use the home directory? If you only need synchronization for jobs running on a single machine, /tmp should suffice. If you need to synchronize across multiple machines, then you should consider modules which are known to be more NFS-safe, or use a client-server model and avoid filesystems entirely. CPAN is full of solutions.
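For the single-machine case, a minimal sketch (keeping the names from the question) points the module at a local directory such as /tmp instead of the NFS-mounted home directory:
#!/usr/bin/perl
use strict;
use warnings;
use Proc::PID::File;

# Use a local filesystem for the pid file rather than an NFS mount.
my %g_args = ( name => "temp", verify => 1, dir => "/tmp" );

if ( Proc::PID::File->running(%g_args) ) {
    die "Another instance is already running, exiting\n";
}
# ... rest of the job ...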

WebApp configuration in mod_perl 2 environment

I have a web app I'm writing in mod_perl 2. (It's a custom handler module, not registry or perlrun scripts.) There are several configuration options I'd like to have set at server initialization, preferably from a configuration file. The problem I'm having is that I haven't found a good place to pass a filename for my app's config file.
I first tried loading "./app.conf" but the current directory isn't the location of the modules, so it's unpredictable and error-prone. Or, I have to assume some path -- relative or absolute. This is inflexible and could be problematic if the host OS distribution is changed. I don't want to hard-code a path (though, something in /etc may be acceptable if there's just no better way).
I also tried PerlSetVar, but the value isn't available until request time. While this is workable, it means I'm potentially reading a config file from disk at least once per child (thread) init. I would rather load at server init and have an immutable static hash that is part of the spawned environment when a child is created.
I considered using a config.pl, but this means I either have a config.pl with one option to configure where to find the app.conf file, or I move the options themselves into config.pl and require end-users to respect Perl syntax when setting options. Future users will be internal admins, so that's not unreasonable, but it's more complicated than I'd like.
So what am I missing? Any good alternatives?
Usually a top priority is to avoid having configuration files amongst your executables; otherwise a server misconfiguration could accidentally show your private configuration info to the world. I put everything the app needs under /srv/app0, with a cfg subdirectory that is a sibling of the directories containing executables.
If you're pre-loading modules via PerlPostConfigRequire startup.pl to access mod/startup.pl, then that's the best place to put the configuration-file location (../cfg/app.cnf), and you have complete flexibility in how you store the configuration in memory. An alternative is to PerlModule your modules and load the configuration (with a relative path as above) in a BEGIN block within one of them.
Usually processing a configuration file doesn't take appreciable time, so a popular option is to lazy-load: if the code detects that the configuration is missing, it loads it before continuing. That's no use if the code needed to know the configuration earlier than that, but it avoids lots of problems, especially when migrating code to a non-mod_perl environment.
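As a concrete sketch of the startup.pl route, assuming the /srv/app0-style layout above and a simple key=value config format; the module name and parsing below are illustrative, not a prescribed API.
package My::App::Config;            # hypothetical module, required from startup.pl

use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

our %CONFIG;                         # treated as read-only after startup

# Resolve ../cfg/app.cnf relative to this file, not the current directory.
my $cfg_file = File::Spec->catfile( dirname(__FILE__), '..', 'cfg', 'app.cnf' );

open my $fh, '<', $cfg_file or die "Cannot read $cfg_file: $!";
while ( my $line = <$fh> ) {
    next if $line =~ /^\s*(?:#|$)/;                        # skip comments and blanks
    my ($key, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
    $CONFIG{$key} = $value;
}
close $fh;

1;
Because this runs while the parent process loads startup.pl (e.g. via PerlPostConfigRequire), every forked child inherits %My::App::Config::CONFIG without re-reading the file.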

Can I run my mod_perl application as an ordinary user

Can I run my mod_perl application as an ordinary user, similar to running a plain vanilla CGI application under suexec?
From the source:
Is it possible to run mod_perl enabled Apache as suExec?
The answer is No. The reason is that you can't "suid" a part of a process. mod_perl lives inside the Apache process, so its UID and GID are the same as the Apache process.
You have to use mod_cgi if you need this functionality.
Another solution is to use a crontab to call some script that will check whether there is something to do and will execute it. The mod_perl script will be able to create and update this todo list.
A more nuanced answer, with some possible workarounds, from the "Practical mod_perl" book:
(I hope that's not pirated content; if it is, please edit it out.)
mod_perl 2.0 improves the situation, since it allows a pool of Perl interpreters to be dedicated to a single virtual host. It is possible to set the UIDs and GIDs of these interpreters to be those of the user for which the virtual host is configured, so users can operate within their own protected spaces and are unable to interfere with other users.
Additional solutions from the same book are in appendix C2.
As mod_perl runs within the Apache process, I would think the answer is generally no. You could, for example, run a separate Apache process as the ordinary user and use the main Apache process as a proxy for it.
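A sketch of that proxy arrangement; the port and paths are only examples. The main Apache (running as the usual server user) forwards requests to a second Apache instance that the ordinary user starts and owns.
# In the main server's configuration (mod_proxy must be loaded):
ProxyPass        /myapp/ http://127.0.0.1:8080/
ProxyPassReverse /myapp/ http://127.0.0.1:8080/

# The second instance is started from the ordinary user's account with its own
# configuration, e.g.:  httpd -f /home/someuser/apache/httpd.conf
# and that configuration contains  Listen 127.0.0.1:8080  plus the mod_perl setup.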