I had a recent problem where Tie::File proved the best answer I could work with for a Perl program. I'm at a point where I'm ready to work with CGI, and I need to ask: are there Perl modules that can't be used in CGI, especially that Tie::File? If there are any complications, are there ways to reconcile them?
A CGI is basically just a program that reads a request on STDIN and spews header + HTML on STDOUT. It isn't really special: there aren't any modules you can't use, if you try hard enough.
You could even get graphical (e.g., GTK) ones working with enough pain. Not that you'd want to. Unless you're a third-party vendor I've had the displeasure of making that work for.
But remember that multiple copies of your program may be running simultaneously (one per simultaneous web request), so if you're using flat files, you'll have to deal with locking.
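As a rough sketch of that locking with Tie::File (the helper name append_record and the file layout are assumptions for illustration, not anything from the question), the module's own flock method can be called right after the tie:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Tie::File;

# Hypothetical helper: append one record to a flat file while holding
# an exclusive lock, so concurrent CGI processes do not interleave writes.
sub append_record {
    my ($path, $record) = @_;

    my $tied = tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!";

    # Lock immediately after the tie, before reading or writing anything.
    $tied->flock(LOCK_EX);

    push @lines, $record;

    # Drop our reference to the object first, then untie: closing the
    # file writes out pending changes and releases the lock.
    undef $tied;
    untie @lines;
    return;
}

# Usage (hypothetical path):
#   append_record('/var/data/guestbook.txt', 'hello from request 42');
```

The lock is advisory, so it only helps if every program that touches the file takes it as well.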
Make sure your data file is readable and writable by your CGI process. I'm adding this answer because it led to a very odd bug. I had a script that wouldn't run from CGI. In fact the CGI could read the contents of the data just fine, but Tie::File failed (even though it worked fine if I called it from the command line). It turns out the permissions were set -rw-rw-r-- which means world-readable, but only my user and group could write to it. Since the CGI process didn't have write permissions, Tie::File failed in CGI.
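A small sketch of how a script could report this kind of problem instead of failing silently (tie_checked is a made-up helper name; the checks use Perl's standard -r and -w file tests, which apply to the effective user the CGI runs as):

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: refuse to tie a file the current process cannot
# both read and write, and say which user is affected.
sub tie_checked {
    my ($path) = @_;

    die "$path is not readable by effective uid $>\n" unless -r $path;
    die "$path is not writable by effective uid $>\n" unless -w $path;

    tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!\n";

    return \@lines;    # reference to the tied array
}
```

Run from the web server, the error message then names the server's user rather than your own login, which points straight at the permission bits.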
Related
How can I start a mod_perl handler (called MyCacheHandler.pm) directly from another Perl module (called MyModule.pm)? Currently I'm starting the handler via a web browser, but it would be a little easier to call it from MyModule.
As I understand it, you want to have it (MyCacheHandler) running in the background, and it won't produce any visible (to a browser) output? (Just side effects).
If that's correct, why is it even implemented as a mod_perl handler? Just implement it as a script and run it from cron, or as a daemon of some kind.
You could still control MyCacheHandler from MyModule (say via IPC).
Do some refactoring. Split MyCacheHandler.pm into two modules: one that does the hard work and no longer depends on mod_perl (that is, no more dealing with $r), so it's callable from other modules. The other would be a now-thin mod_perl handler that calls the first module.
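A minimal sketch of that split, shown in one file for brevity. MyCache::Worker and its refresh function are hypothetical names, and the body of refresh stands in for whatever the real handler does:

```perl
use strict;
use warnings;

# --- Would live in MyCache/Worker.pm (hypothetical): no mod_perl here ---
package MyCache::Worker;

sub refresh {
    my (%args) = @_;
    # ... the real cache-building work would go here ...
    return 'refreshed ' . ($args{key} // 'all');
}

# --- MyCacheHandler.pm: thin wrapper that only talks to $r ---
package MyCacheHandler;

sub handler {
    my $r = shift;    # the request object mod_perl passes in

    my $msg = MyCache::Worker::refresh();
    $r->content_type('text/plain');
    $r->print("$msg\n");

    # 0 is the value of Apache2::Const::OK; a real handler would
    # import that constant instead of hard-coding it.
    return 0;
}

# --- MyModule.pm: calls the worker directly, no web request involved ---
package MyModule;

sub update_cache {
    my ($key) = @_;
    return MyCache::Worker::refresh(key => $key);
}

package main;
```

With this layout, MyModule::update_cache('users') runs the same code the browser used to trigger, without going through Apache at all.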
Or leave it as is, and just use LWP::UserAgent to access MyCacheHandler from MyModule.
I am trying to find a way to monitor the contents of a directory for changes. I have tried two approaches.
Use kqueue to monitor the directory
Use GCD to monitor the directory
The problem I am encountering is that I can't find a way to detect which file has changed. I am attempting to monitor a directory with potentially thousands of files in it and I do not want to call stat on every one of them to find out which ones changed. I also do not want to set up a separate dispatch source for every file in that directory. Is this currently possible?
Note: I have documented my experiments monitoring files with kqueue and GCD
My advice is to just bite the bullet and do a directory scan in another thread, even if you're talking about thousands of files. But if you insist, here's the answer:
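To make the scan idea concrete, here is a minimal sketch in Perl (chosen only for illustration; the same logic carries over to C or Objective-C). The function names snapshot and changes are made up. It records each file's modification time and compares two snapshots, which you would take whenever kqueue reports that the directory changed:

```perl
use strict;
use warnings;

# Record name => mtime for every entry in a directory.
sub snapshot {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my %mtime;
    while (defined(my $name = readdir $dh)) {
        next if $name eq '.' or $name eq '..';
        my @st = stat("$dir/$name") or next;   # entry vanished mid-scan
        $mtime{$name} = $st[9];                # element 9 is mtime
    }
    closedir $dh;
    return \%mtime;
}

# Compare two snapshots and report what was added, removed or modified.
sub changes {
    my ($old, $new) = @_;
    my @added    = grep { !exists $old->{$_} } keys %$new;
    my @removed  = grep { !exists $new->{$_} } keys %$old;
    my @modified = grep { exists $old->{$_} && $old->{$_} != $new->{$_} }
                   keys %$new;
    return {
        added    => [ sort @added ],
        removed  => [ sort @removed ],
        modified => [ sort @modified ],
    };
}
```

This still stats every file, which is exactly the cost the question wants to avoid; the point is only that the cost can be paid off the main thread.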
There's no way to do this without rolling up your sleeves and going kernel-diving.
Your first option is to use the FSEvents framework, which sends out notifications when a file is created, edited or deleted (as well as things to do with attributes). Overview is here, and someone wrote an Objective C wrapper around the API, although I haven't tried it. But the overview doesn't mention the part about events surrounding file changes, just directories (like with kqueue). I ended up using the code from here along with the header file here to compile my own logger which I could use to get events at the individual file level. You'd have to write some code in your app to run the logger in the background and monitor it.
Alternatively, take a look at the "fs_usage" command, which constantly monitors all filesystem activity (and I do mean all). This comes with Darwin already, so you don't have to compile it yourself. You can use kqueue to listen for directory changes, while at the same time monitoring the output from "fs_usage". If you get a notification from kqueue that a directory has changed, you can look at the output from fs_usage, see which files were written to, and check the filenames against the directory that was modified. fs_usage is a firehose, so be prepared to use some options along with grep to tame it.
To make things more fun, both your FSEvents logger and fs_usage require root access, so you'll have to get authorization from the user before you can use them in your OS X app (check out the Authorization Services Programming Guide for info on how to do it).
If this all sounds horribly complicated, that's because it is. But at least you didn't have to find out the hard way!
I am using the wonderful AnyEvent for creating an asynchronous TCP server (specifically, a MUD server).
In order to keep everything running smoothly and with as few blocking/synchronous pieces of code as possible, I have replaced some modules I was using with their asynchronous counterparts, for example AnyEvent::Memcached and AnyEvent::Gearman. This allows the main program to be quite speedy, which is desirable. I have coded around the need for some of these calls to be synchronous.
One problem I currently have, and the focus of this question, is logging.
Before turning to AnyEvent for this server program, I was using Log::Log4perl as it allows me to fine-tune which modules or subroutines should be logged, at which level and to which log output (screen, file, etc).
The problem here is that the Log4perl actions (warn, info, etc) are currently performed synchronously but I have no requirement for that as long as the log lines eventually end up on the screen / file (and in the correct order).
Is Log::Log4perl still the right choice when using an asynchronous event handler such as AnyEvent, or should I look at a different module? If so, which is recommended?
AnyEvent::Log, which comes with AnyEvent, uses AnyEvent::IO, which appends to files asynchronously when IO::AIO is available (and synchronously when not).
What are you trying to avoid? If it's synchronous file IO (writing to log files, stdout, etc.), then your problem would probably be solved with an asynchronous and/or buffering appender rather than by replacing all use of Log4perl in your code.
Log::Log4perl::Appender::Buffer seems like it might be a good start, but a completely async appender doesn't appear to exist anymore.
I am writing a couple of scripts that collect data from a number of servers. The number will grow and I'm trying to future-proof my scripts, but I'm a little stuck.
To start off with, I have a script that looks up an IP in a MySQL database, then connects to each server, grabs some information, and puts it into the database again.
What I have been thinking is that there is a limited amount of time to do this, and if I have 100 servers it will take a while to go out to each server, get the information, and push it to a DB. So I have thought about using either forks or threads in Perl.
Which would be the preferred option in my situation? And has anyone got any examples?
Thanks!
Edit: OK, so a bit more information is needed: I'm running on Linux, and what I thought was that I could get the master script to collect the DB information, then send off each subprocess/task to connect and gather information, then push the information back to the DB.
Which is best depends a lot on your needs; but for what it's worth here's my experience:
Last time I used perl's threads, I found it was actually slower and more problematic for me than forking, because:
Threads copied all data anyway, as a fork would, but did it all up front
Threads didn't always clean up complex resources on exit; causing a slow memory leak that wasn't acceptable in what was intended to be a server
Several modules didn't handle threads cleanly, including the database module I was using which got seriously confused.
One trap to watch for is the "forks" library, which emulates "threads" but uses real forking. The problem I faced here was many of the behaviours it emulated were exactly what I was trying to get away from. I ended up using a classic old-school "fork" and using sockets to communicate where needed.
Issues with forks (the library, not the fork command):
Still confused the database system
Shared variables still very limited
Overrode the 'fork' command, resulting in unexpected behaviour elsewhere in the software
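For reference, a minimal sketch of the old-school fork approach, using pipes rather than sockets to pass results back to the parent. collect_from is a hypothetical stand-in for the real per-server work, and the sketch forks one child per host with no limit on concurrency:

```perl
use strict;
use warnings;

# Hypothetical stand-in for "connect to the server and grab information".
sub collect_from {
    my ($host) = @_;
    return "$host: ok";
}

# Fork one child per host; each child sends one line back over a pipe.
sub collect_all {
    my @hosts = @_;
    my @jobs;    # [ host, pid, read handle ]

    for my $host (@hosts) {
        pipe(my $rd, my $wr) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: open any database handle here, never reuse the parent's.
            close $rd;
            my $line = eval { collect_from($host) } // "$host: failed";
            print {$wr} "$line\n";
            close $wr;
            exit 0;
        }

        close $wr;    # parent keeps only the read end
        push @jobs, [ $host, $pid, $rd ];
    }

    my %result;
    for my $job (@jobs) {
        my ($host, $pid, $rd) = @$job;
        my $line = <$rd>;
        close $rd;
        waitpid($pid, 0);    # reap the child
        chomp $line if defined $line;
        $result{$host} = $line;
    }
    return %result;
}
```

Having the parent write the collected results to the database itself keeps a single database connection and sidesteps the confusion described above.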
Forking is more "resource safe" (think database modules and so on) than threading, so you might want to end up on that road.
Depending on your platform of choice, on the other hand, you might want to avoid fork()-ing in Perl. Quote from perlfork(1):
Perl provides a fork() keyword that corresponds to the Unix system call of the same name. On most Unix-like platforms where the fork() system call is available, Perl's fork() simply calls it.
On some platforms such as Windows where the fork() system call is not available, Perl can be built to emulate fork() at the interpreter level. While the emulation is designed to be as compatible as possible with the real fork() at the level of the Perl program, there are certain important differences that stem from the fact that all the pseudo child "processes" created this way live in the same real process as far as the operating system is concerned.
I am writing a Bulk Mail scheduler controlled from a Perl/CGI application and would like to learn about "good" ways to fork a CGI program to run a separate task. Should one do it at all? Or is it better to suffer the overhead of running a separate job-queue engine like Gearman or TheSchwartz, as has been suggested recently? Does the answer/perspective change when using a near-MVC framework like CGI::Application over vanilla CGI.pm? The last comes from a possible project that I have in mind for a CGI::Application plugin that would make "forking" a process relatively simple to call.
Look at Proc::Daemon - it's the simplest thing that works. From your CGI script, do the CGI business (getting input, returning a response to the browser), then call Proc::Daemon::init() which does the fork, daemonizes your process and makes the parent exit. Then your script (now a daemon) does its long-running tasks and exits when they're done.
You'll want to update something (file, database record) while running as a daemon, so subsequent CGI invocations can check what it did (or how it's progressing).
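To illustrate the overall shape, here is a hand-rolled sketch using only core fork and POSIX::setsid. It is not Proc::Daemon's implementation, and unlike the approach above it lets the parent carry on rather than exit. The names run_detached and write_status and the status-file format are assumptions for illustration:

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Write a status line atomically so a later CGI request never sees
# a half-written file.
sub write_status {
    my ($path, $text) = @_;
    my $tmp = "$path.tmp.$$";
    open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
    print {$fh} "$text\n";
    close $fh or die "Cannot close $tmp: $!";
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
    return;
}

# Fork a child that detaches from the web server and runs $task.
# The parent gets the child's pid back and can finish its response.
sub run_detached {
    my ($task) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent

    # Child: leave the server's session and let go of its handles, so
    # the browser is not kept waiting on our STDOUT.
    my $ok = eval {
        setsid() or die "setsid failed: $!";
        open STDIN,  '<', '/dev/null' or die "reopen STDIN: $!";
        open STDOUT, '>', '/dev/null' or die "reopen STDOUT: $!";
        open STDERR, '>', '/dev/null' or die "reopen STDERR: $!";
        $task->();
        1;
    };
    exit($ok ? 0 : 1);
}

# Usage (hypothetical path and task):
#   run_detached(sub {
#       write_status('/var/tmp/bulkmail.status', 'sending');
#       # ... send the mail ...
#       write_status('/var/tmp/bulkmail.status', 'done');
#   });
```

A later CGI request can then read the status file to show progress.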
Would something like POE be useful? It's more event-driven than forked, but it may meet your needs.