I'm developing a web app in Perl, with some C where necessary for heavy-duty number crunching. The main problem I'm having so far is deciding whether I should use mod_perl, mod_fastcgi, or both to run my scripts, because I'm having a difficult time analyzing the pros and cons of each module.
Can anyone post a summary or give a link where I can find some comparison information and perhaps some recommendations with examples?
They are quite different beasts.
mod_fastcgi (by the way, mod_fcgid is recommended) just supports the FastCGI protocol to execute CGIs faster, with some knobs to control how many processes it will run simultaneously, and not much more.
mod_perl, on the other hand, is a platform for developing applications that exposes most Apache internals to you, so you can tweak every webserver knob from your code; it also accelerates CGIs, and much more.
If all you wish is to run your CGIs quickly, and want to support as many hosts as possible, you should stick with supporting those two ways to run your code and probably standard CGI as well.
If you care about maximum efficiency at the cost of flexibility, you could aim for a single platform, probably mod_perl.
But probably the sanest option is to run everywhere and use a framework to build the application that'll take care of using the advantages of a particular way of executing if present, like Catalyst.
I would advise you to use a framework such as Catalyst that takes care of such details. For most applications, it doesn't matter how the program connects to the webserver, as long as it is done in an efficient way. The choice between mod_perl and FastCGI should be made by the sysadmin who deploys it, not the developer.
Here is a site with some actual performance comparisons of mod_perl, mod_fastcgi, cgi (Perl) and a Java servlet - for a very basic script: https://sites.google.com/site/arjunwebworld/Home/programming/apache-jmeter
In summary:
cgi - 1200+ requests per minute
mod_perl - 6000+ requests per minute (ModPerl::PerlRun only)
fast_cgi - 6000+ requests per minute
mod_perl - 6000+ requests per minute (ModPerl::Registry)
servlets - 2438 requests per minute.
There is an old thread on PerlMonks comparing mod_perl and fastcgi here: http://www.perlmonks.org/?node_id=108008
I'm a beginner in Perl.
My understanding is as follows.
FCGI is a protocol
It is a gateway interface between web server and web applications
The process stays alive for a specific period (such as 5 minutes) and accepts multiple requests, so the response is fast.
You can cache some data before the worker processes are created, so that the cache is shared by all processes and memory is saved through copy-on-write.
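The preload-then-fork idea in the list above can be sketched in plain Perl. This is only an illustration under stated assumptions: the %cache contents and the run_workers helper are hypothetical, and a real FastCGI process manager keeps its workers alive in a request loop instead of letting each one answer once and exit. How much memory actually stays shared depends on the operating system and on how the data is touched.

```perl
use strict;
use warnings;

$| = 1;    # flush STDOUT at once so children never inherit buffered output

# Hypothetical "expensive" data, built once in the parent before forking.
my %cache = map { $_ => $_ * $_ } 1 .. 1000;

# Fork a few workers; each one inherits %cache without rebuilding it.
sub run_workers {
    my ($count) = @_;
    my @results;
    for my $id (1 .. $count) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: read from the inherited cache and report back.
            close $reader;
            print {$writer} "worker $id sees $cache{$id * 10}\n";
            close $writer;
            exit 0;
        }
        # Parent: collect the child's line, then reap the child.
        close $writer;
        my $line = <$reader>;
        close $reader;
        waitpid($pid, 0);
        chomp $line;
        push @results, $line;
    }
    return @results;
}

print "$_\n" for run_workers(3);
```

Each worker prints a value it never computed itself, which is the point: the data was prepared once, before the fork.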
It looks nice.
However, I have never come across FCGI in my experience of modern development with Golang, Nginx, and so on.
Do modern web applications no longer need FCGI?
What were the disadvantages of FCGI, and what advantages does it still have?
It would be more accurate to say that there are better alternatives than to call anything dead or alive. As of 2021 I still see code running under FCGI in production, and it works well. The latest comment on the GitHub repository is from 2019. Everything has its time frame: being old doesn't mean bad or dead, and being younger doesn't mean good or alive.
For modern web development there are many frameworks available right now -
Catalyst
Mojolicious
Dancer2
Kelp
Raisin
The top three are the most common ones. Mojolicious is my personal favorite.
You can use them with Plack/uWSGI and you are good to go in no time. They will take care of everything.
Since, as you mentioned, FastCGI is a protocol and not an implementation, it isn't specific to any language. There are implementations in different languages (though perhaps not popular ones), and you can find them with a single search. One example is Nginx.
There are various other questions asked before similar to this. Have a look at those. They will give you more clarity.
Is there a speed difference between WSGI and FCGI?
Is mod_perl what I'm looking for? FastCGI? PSGI/Plack?
Perl CGI vs FastCGI
Which is better perl-CGI, mod_perl or PSGI?
To expand slightly on Maverick's answer...
When writing a web application, there are two questions you need to ask yourself:
What language/framework am I going to write this application in?
How am I going to deploy this application?
In the bad old days, the answers to the two questions were intertwined in annoying ways. You might write a CGI program originally and then switch to a different deployment environment (FCGI or mod_perl, perhaps) at a later date when you wanted better performance from your application. But changing to a different deployment environment would usually mean making a lot of changes to your code. The application needed too much knowledge about the deployment environment. And that made life hard.
If you use the frameworks in the other answer, then things are different. Most (maybe all) of those frameworks run on top of a protocol called PSGI (Perl Web Server Gateway Interface). If you write your application using one of those frameworks, it will interact with the web server through the PSGI protocol. And there are PSGI adaptors available for all web deployment environments. So you can start by deploying your Dancer2 application as a CGI program (very few people do that, but it's completely possible) and then move it to an FCGI or mod_perl environment without changing your code at all.
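To make the adaptor idea concrete, here is a minimal sketch in core Perl. The application is an ordinary PSGI-style code reference; render_as_cgi is a hypothetical, heavily simplified stand-in for a real adaptor (the real ones, such as Plack::Handler::CGI, Plack::Handler::FCGI and Plack::Handler::Apache2, fill in many more environment keys and handle streaming).

```perl
use strict;
use warnings;

# A PSGI application is just a code reference: it takes an environment
# hash and returns [ status, [ headers ], [ body chunks ] ].
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} || '/';
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello from $path\n" ] ];
};

# Hypothetical CGI-style adaptor (illustration only): build the
# environment from CGI variables, call the app, format the reply.
sub render_as_cgi {
    my ($psgi_app, %cgi_env) = @_;
    my $env = {
        REQUEST_METHOD => $cgi_env{REQUEST_METHOD} || 'GET',
        PATH_INFO      => $cgi_env{PATH_INFO}      || '/',
    };
    my ($status, $headers, $body) = @{ $psgi_app->($env) };
    my $out = "Status: $status\r\n";
    my @pairs = @$headers;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        $out .= "$name: $value\r\n";
    }
    return $out . "\r\n" . join('', @$body);
}

print render_as_cgi($app, PATH_INFO => '/demo');
```

The application code never mentions CGI, FCGI or mod_perl; only the adaptor does. That separation is what lets the deployment environment change without touching the application.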
I still see people deploying applications in FCGI. It's certainly more common than using mod_perl these days. But, to be honest, the most common deployment environment I see is to set your application up as a standalone service running on a high port number and to use a server like nginx to proxy requests to your service. That's what I'd recommend.
Short version:
What criteria should I use to evaluate possible candidates for a Perl "app server" (mod_perl replacement)?
We are looking for some sort of framework which will allow executing various Perl programs repeatedly (as a service) without the costs of:
re-launching the perl interpreter once per execution
loading/compiling Perl modules once per execution
(both of which are the benefits that running mod_perl provides)
Notes:
We do NOT care much about any additional benefits afforded by mod_perl, such as deep Apache integration.
This will be a pure app server, meaning there is no need for any web specific functionality (it's not a problem if the app server provides it, just won't be needed).
We will of course consider the obvious criteria (raw speed, production-ready stability, active development, ability to run on OSs we care about). What I'm interested in is less trivial and subtle things that we may wish from such a framework/server.
Background:
At $work, the powers that be decided that they want to replace the current setup (simple webapps developed in Embperl and deployed via Apache/mod_perl).
The decision was made to use a (home-grown) MVC system that will have a Java Spring front end for the View; the Controller will parcel out back-end service requests to per-app services that perform Model duties (don't get hung up on the details of this, as they are not very relevant to the main question).
One of the options for back-end services is Perl, so that we can leverage all our existing Perl IP (libraries, webapp backend code) going forward, and not have to port 100% of it to Java.
To summarize:
    | View    | Model/app | Model loaded/executed by:                           |
================================================================================
OLD | Embperl | Model.pm  | mod_perl has Model.pm loaded, called from view.epl  |
NEW | Java    | Model.pm  | perl generic_model.pl -model Model (does "require") |
================================================================================
Now, those of you who did Perl Web development for a while, will immediately notice the most glaring problem with the new design:
    | Perl interpreter starts  | Perl modules are loaded and compiled |
=======================================================================
OLD | Once per mod_perl thread | Once per mod_perl thread             |
NEW | Once per EVERY! request  | Once per EVERY! request              |
=======================================================================
In other words, in the new model, we no longer have any performance benefits afforded by mod_perl as a persistent server side app container!!!
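The difference between the two rows can be shown with a small sketch in core Perl. Everything here is hypothetical: the Model.pm written to a temporary directory is a throwaway stand-in, and handle_request only imitates what a persistent container would do for each incoming call. The point is that, inside one long-lived process, "require" compiles the module only the first time.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a throwaway stand-in for Model.pm that counts how many times
# its file is actually compiled and run.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/Model.pm" or die "cannot write module: $!";
print {$fh} <<'EOM';
package Model;
$main::compile_count++;
sub answer { my ($class, $n) = @_; return $n * 2 }
1;
EOM
close $fh;
unshift @INC, $dir;

our $compile_count = 0;

# One "request": load the model by name and call it.
sub handle_request {
    my ($model, $arg) = @_;
    (my $file = "$model.pm") =~ s{::}{/}g;
    require $file;    # no-op after the first call: %INC remembers the file
    return $model->answer($arg);
}

# A persistent process serves many requests but compiles the module once.
my @replies = map { handle_request('Model', $_) } 1 .. 5;
print "replies: @replies\n";
print "Model.pm compiled $compile_count time(s)\n";
```

Running "perl generic_model.pl" once per request pays the interpreter start-up and the compilation every time; any container that keeps this process alive between requests pays them once.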
Therefore, we are looking at possible app containers to serve the same function.
(as a side note, yes, we thought about simply running an instance of Apache with mod_perl as such an app container, as a viable possibility. However, since web functionality is not required, I'd like to see if any other options may fit the bill).
Starman is a high-performance preforking PSGI/Plack web server that may be used in that context. It's easy to build a REST application that serves stateless JSON objects (this is a simple use case).
Starman is a production-ready server, and it's really easy to install a set of Starman instances behind a reverse proxy for scaling purposes (this SO question may help you).
I think you've already identified what you need to know and what to test: execution time versus memory. You also need to evaluate the flexibility and ease of deployment that you get with mod_perl and the big win of preserving the usefulness of software you've already developed (and the accumulated experience of your staff). Remember you can easily separate services by CPU and machine if your new front end is going to be talking to your applications inside your own network. A lot depends on how "web-ish" you can make your services and if they can be efficiently "daemonized". Of course there's lots of ways for web front ends to talk to other services and perl can handle pretty much all of them ... TIMTOWTDI.
Since you mention "tuits" (i.e. "manpower") as a constraint, perhaps the best approach in the short term is to set up a separate apache - mod_perl instance as an "application container" and run your applications that way (since they run well that way already, is this correct?). After all, apache (and mod_perl) are well known, battle tested, and eminently tweakable and configurable. The deployment options are pretty flexible (same machine, different machine(s), failover, load balancing, cloud, local, VMs) and they have been well tested as well.
Once you get that running you could then begin experimenting with various "low manpower required" approaches to magically daemonizing your applications and services without apache. Plack and Starman have been mentioned; Mojolicious is another framework that is capable of working with JSON, WebSockets, etc. (and Plack). These too have been well tested but are perhaps less familiar than mod_perl and Apache. Still, if you are a perl shop you shouldn't have difficulty working with these "modern" tools. One day, if you do end up with more resources, you could build out a sophisticated networked platform for all your perl based services and use all the cool "new" (or at least newer than mod_perl) stuff on CPAN like POE, Plack, etc. You might have a lot of fun developing cool stuff as you solve your business problem.
To clarify my earlier comment: Ubic (see Ubic on MetaCPAN) can daemonize (and thus precompile) your perl tools and offers some monitoring and management facilities that come for free with the framework. There is one Ubic module designed for use with Plack, called Ubic::Service::Plack. In and of itself Ubic does not provide an easy solution for your new Java/Spring application to talk to your perl applications, but it might help manage/monitor whatever solution you come up with.
Good luck and have fun!
You can create a simple daemon using HTTP::Daemon, and have all the benefits of compiling the necessary parts of your code either later (with require) or in advance, before the daemon starts.
I'm going to design an open-source web service which should collect ("web-scrape") some data from multiple - currently three - web sites.
The web sites do not expose any web service nor any API, they just publish web pages.
Data will be collected 'live' on any client's request from all the web sites in parallel, and will then be parsed to XML to be returned to the client.
The server operating system will be Linux.
The clients will initially be just an Android application of mine.
The concurrent clients will possibly number about 100 or more, if the project is successful... ;-).
Currently my preferences go to the adoption of:
perl (for the service language)
mod_perl2 with ModPerl::Registry (for an Apache embedded fast perl interpreter)
perl module CHI::Driver::FastMmap (for a modern and fast cache handler)
perl module Coro (for an async event loop to place many requests in parallel)
Since I suppose the specifications on the project can be of general use and interest, and since I am getting many problems with the combined use of Coro with mod_perl2, I ask:
Are my adoption preferences well matched?
Do you see any incompatibilities or potential problems?
Do you have any suggestion to enhance (in this order):
compatibility among components
neatness of the implementation
ease of maintainability
performance
You probably don't want to develop using mod_perl for any new project anymore. You really want to use something Plack based, or maybe even Plack itself. If you want to use Coro, using an AnyEvent-based backend such as Twiggy may make the most sense (though you may want to put a reverse proxy in front of it).
Are you happy sticking with apache?
If so, forget Coro and let apache handle concurrency; preload your modules and configuration, and write a super-efficient apache RequestHandler. (That's the way I go whenever apache2+modperl2 is available.)
If not, start learning Plack which is server-agnostic.
If you choose the first route, I'd recommend avoiding traditional CGI and instead adopting CGI::Application, which gives almost the lightness and speed of CGI but with a much much nicer/modern development environment and framework (and is Plack-compatible).
I have been trying to decide if my web project is a candidate for implementation using PSGI, but I don't really see what good it would do for my application at this stage.
I don't really understand all the fuss. To me PSGI seems like a framework that provides a common interface between different Apache modules, which lets you move your application between them, e.g. easily move your application from running on mod_perl to fastcgi, and give the application support for running on both.
Is that right, or have I missed something?
As I and the team I am a part of not only develop the application, but also pretty much do maintenance and setup of servers I don't see the value for us of being able to run on fastcgi, cgi, and mod_perl, we do just fine with just mod_perl.
Have I misunderstood the PSGI functionality, or is it just not suitable for my project?
Forget the Apache bit. It's a way of writing your application so that the choice of webserver becomes less relevant. At $work we switched to Plack/PSGI after finding our app running with very high CPU load after upgrading to Apache2 - benchmarking various Apache configs and NYTProf'ing were unable to determine the reason, and using PSGI and the Starman webserver worked out much better for us.
Now everything is handled in one place by our PSGI app (URL re-writes, static content, expiry headers, etc) rather than Apache configuration, so it's a) Perl, and b) easily tested via our standard /t/ scripts. Also our tests are now testing exactly what a user sees, rather than just the basic app itself.
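As a rough illustration of handling such concerns in Perl instead of in Apache configuration, here is a sketch of two hand-written middleware wrappers in core Perl. The function names, the /old/ to /new/ rewrite rule and the max-age value are all hypothetical, and the wrappers only handle plain array-reference responses; in practice one would normally reach for existing Plack::Middleware modules.

```perl
use strict;
use warnings;

# The inner application (hypothetical): knows nothing about rewrites or caching.
my $app = sub {
    my ($env) = @_;
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "page: $env->{PATH_INFO}\n" ] ];
};

# Middleware is a function that takes an app and returns a wrapped app.
# This one rewrites a legacy URL on the way in ...
sub rewrite_legacy {
    my ($inner) = @_;
    return sub {
        my ($env) = @_;
        my %copy = %$env;
        $copy{PATH_INFO} =~ s{^/old/}{/new/};
        return $inner->(\%copy);
    };
}

# ... and this one adds a caching header on the way out.
sub add_cache_header {
    my ($inner, $max_age) = @_;
    return sub {
        my ($env) = @_;
        my $res = $inner->($env);
        return [ $res->[0], [ @{ $res->[1] }, 'Cache-Control' => "max-age=$max_age" ], $res->[2] ];
    };
}

my $wrapped = add_cache_header(rewrite_legacy($app), 3600);
my $res = $wrapped->({ PATH_INFO => '/old/index' });
print "$res->[0] @{ $res->[1] } $res->[2][0]";
```

Because the rewrite and the header are ordinary Perl, they can be exercised from a test script like any other function.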
It may well not be relevant to you if you're happy with Apache and mod_perl, and I'm sure others will be able to give much better answers, but for us not having to deal with anything Apache-related again is such a relief in itself. The ease of testing, and the ability to just stick in a Data::Dumper and see what's going on rather than wrestling with mod_rewrite and friends, is a great boon.
Borrowing from a recent blog post by chromatic, Why PSGI/Plack Matters (Testing), here's what it is:
It's a good idea borrowed from Python's WSGI and Ruby's Rack but made Perlish; it's a simple formalizing of a pattern of web application development, where the entry point into the application is a function reference and the exit point is a tuple of header information and a response body.
That's it. That's as simple as it can be, and that simplicity deceives a lot of people who want to learn it.
An important benefit is, ibid.,
Given a Plack application, you don't have to deploy to a web server—even locally—to test your application as if it were deployed … Plack and TWMP (and Plack::Test) use the well-defined Plack pattern to make something which was previously difficult into something amazingly easy. They're not the first and they won't be the last, but they do demonstrate the value of Plack.
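A minimal sketch of what the quote describes, using only core Perl and Test::More: since the application is a code reference, a test can call it directly with a hand-built environment. The application and the get helper are hypothetical; Plack::Test provides proper request and response objects, so this only shows the underlying idea.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical application under test.
my $app = sub {
    my ($env) = @_;
    return [ 404, [ 'Content-Type' => 'text/plain' ], [ "not found\n" ] ]
        unless $env->{PATH_INFO} eq '/hello';
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello, world\n" ] ];
};

# Call the code reference directly with a hand-built environment.
sub get {
    my ($path) = @_;
    return $app->({ REQUEST_METHOD => 'GET', PATH_INFO => $path });
}

is( get('/hello')->[0],    200,              'known path returns 200' );
is( get('/hello')->[2][0], "Hello, world\n", 'body is as expected' );
is( get('/nope')->[0],     404,              'unknown path returns 404' );
```

No web server is started at any point, which is why such tests are fast and fit naturally into a normal t/ directory.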
I started writing an answer and deleted it after 50 lines, simply because it is impossible to explain briefly why PSGI is so extremely cool. I'm new to PSGI too, but a zillion things are now much easier than they were in my Apache/mod_perl era.
I can give you the following advice:
read the Plack advent calendar, all the days, step by step. You must understand the basic philosophy, what is good about onions, and so on... :)
search CPAN for "Plack::Middleware::" and read the first few lines of each result. There are MANY. (There really should be a short overview of each one somewhere; unfortunately I don't know of any faster way.) It is simply good to know which middlewares have already been developed. (For example, you will surely need Plack::Middleware::Session, Plack::Middleware::Static, and so on.)
read about Plack::Builder (already done once you have finished the advent calendar) :)
try writing some apps with it, and you will find that Plack is like the first sex: afterwards you can't understand how you could have lived without it.
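The "onion" mentioned in the advice above refers to the way middleware layers wrap an application. The following sketch in core Perl only imitates that idea: layer and compose are hypothetical helpers, not the Plack::Builder API, and the layer names are just labels. It records the order in which a request passes through the layers.

```perl
use strict;
use warnings;

my @trace;

# Hypothetical middleware factory: records when a request enters and
# leaves a layer.
sub layer {
    my ($name) = @_;
    return sub {
        my ($inner) = @_;
        return sub {
            my ($env) = @_;
            push @trace, "$name in";
            my $res = $inner->($env);
            push @trace, "$name out";
            return $res;
        };
    };
}

# Wrap the app so that the first middleware listed ends up outermost.
sub compose {
    my ($app, @middleware) = @_;
    $app = $_->($app) for reverse @middleware;
    return $app;
}

my $core = sub { push @trace, 'app'; return [ 200, [], ['ok'] ] };
my $app  = compose($core, layer('session'), layer('static'));

$app->({});
print join(' -> ', @trace), "\n";
```

The request travels inward through each layer to the application, and the response travels back out through the same layers in reverse order.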
P.S.:
If there were something like a "Perl Oscar", I would surely nominate Miyagawa-san. :)
The only place where I found information on the G-WAN web server was the project web site, and it looked very much like an advertisement.
What I would really like to know is whether, for someone who is proficient in C, it is as easy to use and extend as other architectures. For now I would mostly focus on its scripting abilities.
Are C scripts on G-WAN easy to write?
Can you easily update and upload new C scripts to the server (say, as easily as PHP or Java pages on other architectures)? Do you have to restart the server when doing so?
Can you easily extend it with third-party or existing C libraries?
Any other feedback welcome.
Well, now that G-WAN is available on Linux, I have been using it for more than 6 months.
The C scripts are fully ANSI C compatible, so there is no difference for any seasoned C programmer.
To update them on the server, you can edit them directly in the /csp folder (remotely via SSH) or locally on a test machine (and copy them later): G-WAN reloads scripts on-the-fly when they have been changed on disk (no server stop required).
G-WAN C scripts can use any existing library (starting with all those under /usr/lib) without any configuration or interface: you just have to write a '#pragma link' followed by the name of the library at the top of your script.
What I found really useful is the ability to edit C scripts and refresh the view in the Internet browser to see how my code works.
If there is a compilation error, then G-WAN outputs the line in the source code (just like any C compiler).
But where it enters the extraordinary area, is when you have a C script crash: here also it gives you THE LINE NUMBER IN THE SOURCE CODE (with the faulty call and the backtrace).
Kind of black-magic when you are used to Apache modules.
My experience with G-WAN and its C scripts is:
The G-WAN community is very small. Questions you have are mostly answered by its single developer.
I consider the API not mature: it's not as "clean" as Java APIs.
The limitation, but at the same time the power, of C: it's a systems programming language. So writing application logic in it must be done carefully.
You generally need to be a good developer to get good results: if you do something wrong, the server crashes fast and hard (Unix-style).
I've written some scripts to try out G-WAN. Overall, it's been very "productive": there are not many bugs, and it works as long as you follow the guidelines and don't expect all the extras that mature web servers offer. However, I have the feeling that I'm reinventing the wheel a lot of the time.
G-WAN also supports scripts written in other programming languages (C++, Objective-C, Java, etc.), so you will benefit from whatever native libraries each language implements.
For C scripts, well, the /usr/lib directory lists more than 1,500 libraries that G-WAN can re-use with a simple #pragma link "library".
I found it neat to be able to write a Web application with a part in C, another in C++ and a third one in Java!
This benchmark shows G-WAN faring poorly in these tests:
http://joshitech.blogspot.sg/2012/04/performance-nginx-netty-cppcms.html
I have been using G-WAN for about two years. I consider it highly stable and production ready for static files. I have a number of static sites that have been running for over a year with no issues.
I have built some small-scale dynamic sites in C with it as demos/test projects: a BitTorrent tracker and a real-time analytics platform, both using the KV Store for data backing.
In my view, building large-scale dynamic sites in G-WAN is possible, but only with a significant investment in development and support. G-WAN is better suited to building robust, highly scalable "enterprise grade" applications than to tossing something together over a weekend.
I use G-WAN for a CMS, http://solicms.com, but for now I use Ruby as the primary language.
I have used G-WAN for some preliminary testing and it does benchmark well. However, I have found a few points of concern that mean I am unlikely to use it for any of my projects. It seems to cache responses for about 0.5 seconds to increase the number of responses per second, and I can't have only some of the responses hitting the application code. Also, the key/value store is great for caching and temporary data storage, but I'm not sure how well it will work as a real back-end storage method.