Standalone WSGI servers for production REST APIs

I've been learning about WSGI for REST APIs in Python. I've got a working setup with Lighttpd + FastCGI.
However, this path will be dedicated to serving the API: static content will be delivered via a content delivery network, and any web sites can be set up as REST clients to the API.
There are far too many Python WSGI servers. It seems like, besides the one built into Python, every WSGI module, framework, and my dog includes one, and they almost universally come with a "use it for development, but you may want a proper production-quality WSGI stack" disclaimer.
Python Paste looks promising, but is it really stable, and does it duplicate too much of my existing web.py+army-of-modules framework?
My primary criteria are:
Stability: I want something I can pretty much configure and then not worry about.
Security: It must not introduce security holes.
Performance: It should perform well enough. I certainly don't want it to be the bottleneck in my implementation, but I've seen benchmarks showing WSGI servers handling many hundreds of requests per second, so as long as the server is not abnormally slow I don't expect this to be an issue.
What other aspects of the WSGI server do I need to be concerned about in a high-volume environment?

I've seen Gunicorn used in pretty important production environments, so that would probably be your best choice. I can also do a shameless plug here for netius, a Python network library for the rapid creation of asynchronous, non-blocking servers and clients. It has no dependencies, it's cross-platform, and it ships with some sample netius-powered servers out of the box, including a production-ready WSGI server. I can't recommend the project on the grounds of broad adoption, although we do use it for a mission-critical SaaS service of ours under significant load. The one advantage to you in particular is that the codebase is small, strictly structured, and extensively commented, so you can easily audit it for security yourself.
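Whichever standalone server you pick, they all speak the same WSGI interface, so a minimal app like the following (file name and callable name are my own illustrative choices, not anything from the question) runs unchanged under Gunicorn, uWSGI, CherryPy's server, and so on:

    # hello.py -- minimal WSGI application (illustrative sketch, not a specific project's code)
    def application(environ, start_response):
        # Return a tiny JSON body for any request.
        body = b'{"status": "ok"}'
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

You would then run something like "gunicorn --workers 4 --bind 127.0.0.1:8000 hello:application" and keep Lighttpd or nginx in front of it as a reverse proxy for the API path.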

Related

Is FCGI dead? What is the alternative these days?

I'm a beginner with Perl.
My understanding is as follows:
FCGI is a protocol.
It is a gateway interface between the web server and web applications.
The process stays alive for a certain period (such as 5 minutes) and accepts multiple requests, so responses are fast.
You can cache some data before the processes are spawned, so those caches are shared by all processes and you save memory through copy-on-write.
It looks nice.
However, I have never come across FCGI in my experience with modern development using Go, Nginx, or anything else.
Don't modern web applications need FCGI anymore?
What were the disadvantages of FCGI, and what advantages does it still have?
It would be more accurate to say there are better alternatives or approaches than to declare anything dead or alive. Even in 2021 I have seen code running with FCGI in production, and it's going well. The latest comment on GitHub was in 2019. Everything has its time frame; being old doesn't mean bad or dead, and being younger doesn't mean good or alive.
For modern web development there are many frameworks available right now:
Catalyst
Mojolicious
Dancer2
Kelp
Raisin
The top three are the most common ones; Mojolicious is my personal favorite.
You can use them with Plack/uWSGI and you are good to go in no time. They will take care of everything.
Since, as you mentioned, FastCGI is a protocol and not an implementation, it isn't specific to any language. There are implementations across different languages (maybe not popular ones), and you can find them with a single search; Nginx is one example of a server that can speak it.
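To make the cross-language point concrete, here is a minimal sketch (my own illustrative code, not something from the question) of a Python WSGI application served over FastCGI using the flup package; a front end such as Nginx or Lighttpd then forwards requests to it through its FastCGI support:

    # fcgi_app.py -- minimal FastCGI backend using the flup package (assumed installed).
    # Names and the port are illustrative placeholders.
    from flup.server.fcgi import WSGIServer

    def application(environ, start_response):
        # Any WSGI-compliant app works; FastCGI is only the transport to the web server.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"served over FastCGI\n"]

    if __name__ == "__main__":
        # Bind a TCP socket that the front-end web server forwards requests to.
        WSGIServer(application, bindAddress=("127.0.0.1", 9000)).run()

In nginx, a location with "fastcgi_pass 127.0.0.1:9000;" would then hand requests to this process, and the process stays alive between requests, which is exactly the performance benefit described in the question.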
Various similar questions have been asked before. Have a look at those; they will give you more clarity:
Is there a speed difference between WSGI and FCGI?
Is mod_perl what I'm looking for? FastCGI? PSGI/Plack?
Perl CGI vs FastCGI
Which is better perl-CGI, mod_perl or PSGI?
To expand slightly on Maverick's answer...
When writing a web application, there are two questions you need to ask yourself:
What language/framework am I going to write this application in?
How am I going to deploy this application?
In the bad old days, the answers to the two questions were intertwined in annoying ways. You might write a CGI program originally and then switch to a different deployment environment (FCGI or mod_perl, perhaps) at a later date when you wanted better performance from your application. But changing to a different deployment environment would usually mean making a lot of changes to your code. The application needed too much knowledge about the deployment environment. And that made life hard.
If you use the frameworks in the other answer, then things are different. Most (maybe all) of those frameworks run on top of a protocol called PSGI (the Perl Web Server Gateway Interface). If you write your application with one of those frameworks, it will interact with the web server using the PSGI protocol, and there are PSGI adaptors available for all web deployment environments. So you can start by deploying your Dancer2 application as a CGI program (very few people do that, but it's completely possible) and then move it to an FCGI or mod_perl environment without changing your code at all.
I still see people deploying applications in FCGI. It's certainly more common than using mod_perl these days. But, to be honest, the most common deployment environment I see is to set your application up as a standalone service running on a high port number and to use a server like nginx to proxy requests to your service. That's what I'd recommend.
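For comparison with the WSGI discussion at the top of this page, that "standalone service on a high port behind a proxy" pattern looks roughly like this in Python terms (a standard-library sketch with placeholder names of mine; a real deployment would use a preforking server such as Starman on the Perl side or Gunicorn on the Python side):

    # standalone_service.py -- run the app as its own HTTP service on a high port;
    # nginx (or another reverse proxy) terminates ports 80/443 and forwards to it.
    # Illustrative sketch only; wsgiref's server is single-threaded.
    from wsgiref.simple_server import make_server

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "application/json")])
        return [b'{"proxied": true}']

    if __name__ == "__main__":
        with make_server("127.0.0.1", 8080, application) as httpd:
            httpd.serve_forever()

The reverse proxy then forwards public traffic to 127.0.0.1:8080, and the application process stays up between requests.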

What criteria should I use to evaluate a Perl "app server" (mod_perl replacement)?

Short version:
What criteria should I use to evaluate possible candidates for a Perl "app server" (mod_perl replacement)?
We are looking for some sort of framework which will allow executing various Perl programs repeatedly (as a service) without the costs of:
re-launching the Perl interpreter once per execution
loading/compiling Perl modules once per execution
(both of which are the benefits that running mod_perl provides)
Notes:
We do NOT care much about any additional benefits afforded by mod_perl, such as deep Apache integration.
This will be a pure app server, meaning there is no need for any web specific functionality (it's not a problem if the app server provides it, just won't be needed).
We will of course consider the obvious criteria (raw speed, production-ready stability, active development, ability to run on OSs we care about). What I'm interested in is less trivial and subtle things that we may wish from such a framework/server.
Background:
At $work, the powers that be decided that they want to replace the current situation (simple webapps developed in Embperl and deployed via Apache/mod_perl).
The decision was made to use a (home-grown) MVC system that will have a Java Spring front end for the View; the Controller will parcel out back-end service requests to per-app services that perform Model duties (don't get hung up on the details of this - it's not very relevant to the main question).
One of the options for back-end services is Perl, so that we can leverage all our existing Perl IP (libraries, webapp backend code) going forward, and not have to port 100% of it to Java.
To summarize:
|     | View    | Model/app | Model loaded/executed by:                           |
====================================================================================
| OLD | Embperl | Model.pm  | mod_perl has Model.pm loaded, called from view.epl  |
| NEW | Java    | Model.pm  | perl generic_model.pl -model Model (does "require") |
====================================================================================
Now, those of you who have done Perl web development for a while will immediately notice the most glaring problem with the new design:
|     | Perl interpreter starts  | Perl modules are loaded and compiled |
==========================================================================
| OLD | Once per mod_perl thread | Once per mod_perl thread             |
| NEW | Once per EVERY request   | Once per EVERY request               |
==========================================================================
In other words, in the new model, we no longer have any performance benefits afforded by mod_perl as a persistent server side app container!!!
Therefore, we are looking at possible app containers to serve the same function.
(as a side note, yes, we thought about simply running an instance of Apache with mod_perl as such an app container, as a viable possibility. However, since web functionality is not required, I'd like to see if any other options may fit the bill).
Starman is a high-performance preforking PSGI/Plack web server that can be used in that context. It's easy to build a REST application that serves stateless JSON objects (a simple use case).
Starman is a production-ready server, and it's really easy to install a set of Starman instances behind a reverse proxy (this SO question may help you) for scaling purposes.
I think you've already identified what you need to know and what to test: execution time versus memory. You also need to evaluate the flexibility and ease of deployment that you get with mod_perl and the big win of preserving the usefulness of software you've already developed (and the accumulated experience of your staff). Remember that you can easily separate services by CPU and machine if your new front end is going to be talking to your applications inside your own network. A lot depends on how "web-ish" you can make your services and whether they can be efficiently "daemonized". Of course there are lots of ways for web front ends to talk to other services, and Perl can handle pretty much all of them ... TIMTOWTDI.
Since you mention "tuits" (i.e. "manpower") as a constraint, perhaps the best approach in the short term is to set up a separate Apache + mod_perl instance as an "application container" and run your applications that way (since they run well that way already, is that correct?). After all, Apache (and mod_perl) are well known, battle-tested, and eminently tweakable and configurable. The deployment options are pretty flexible (same machine, different machine(s), failover, load balancing, cloud, local, VMs) and they have been well tested as well.
Once you get that running, you could then begin experimenting with various "low manpower required" approaches to magically daemonizing your applications and services without Apache. Plack and Starman have been mentioned; Mojolicious is another framework that is capable of working with JSON, WebSockets, etc. (and Plack). These too have been well tested but are perhaps less familiar than mod_perl and Apache. Still, if you are a Perl shop you shouldn't have difficulty working with these "modern" tools. One day, if you do end up with more resources, you could build out a sophisticated networked platform for all your Perl-based services and use all the cool "new" (or at least newer than mod_perl) stuff on CPAN like POE, Plack, etc. You might have a lot of fun developing cool stuff as you solve your business problem.
To clarify my earlier comment: Ubic (see Ubic on MetaCPAN) can daemonize (and thus precompile) your Perl tools and offers some monitoring and management facilities that come for free with the framework. There is one Ubic module, Ubic::Service::Plack, designed for use with Plack. In and of itself, Ubic does not provide an easy solution for your new Java/Spring front end to talk to your Perl applications, but it might help you manage and monitor whatever solution you come up with.
Good luck and have fun!
You can create a simple daemon using HTTP::Daemon and get all the benefits of compiling the necessary parts of your code either lazily (via require) or in advance, before the daemon starts.
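The same preload-once, serve-many-requests idea, sketched here in Python with the standard library purely for illustration (the Perl HTTP::Daemon route is analogous; everything below is placeholder code of mine, not part of the answer):

    # preloaded_daemon.py -- load the expensive parts once at startup,
    # then serve many requests from the same long-lived process. Illustrative sketch.
    import json
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    # Expensive imports / data loading happen once, before the daemon starts serving.
    PRELOADED = {"model": "loaded once at startup"}

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(PRELOADED).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        ThreadingHTTPServer(("127.0.0.1", 8081), Handler).serve_forever()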

Advantages of using an Erlang web server for a web application

Note: This question is heavily affected by the main requirement of the web application I am building: high availability and fault tolerance. All the other requirements (like scalability and number of users) are not in question here.
I got advice from one of the members of this community to use an Erlang web server as the back end for my web application.
The suggestion was that I could use something like Mochiweb as a backend and Django/Ruby on Rails as a front end using JSON and the Service Oriented Model.
The only obvious advantage of this approach that I can understand is that the development of the front-end part is 'as usual' - regular MVC stuff, Ruby on Rails or any other common framework of someone's choice.
But what about other advantages? Do they actually exist?
Sure, Erlang/OTP adds fault tolerance to the system in question, but doesn't adding a web front-end layer reduce that fault tolerance to a much lower level?
Don't we introduce a single point of failure by coupling Ruby on Rails with Mochiweb? Of course, Mochiweb can cope with faults, but what if something goes wrong on the front-end side?
Technically, the Erlang/OTP platform does not do anything about fault tolerance and high availability on its own. It just makes it easy to implement concurrent and distributed software. An Erlang web server running on a single machine can fail like any other, simply because hardware fails.
So for HA sites it's more important to have proper redundancy and fallback scenarios for hardware and software failures than to use any particular software stack or platform. It will probably just be slightly easier to implement in Erlang than on other platforms (if you are familiar with it, of course), but exactly the same results can be achieved with pure Ruby/Python/Java/C or almost anything else.
The web industry has tons of experience setting up fault-tolerant front ends. It's just a matter of setting up multiple web machines (often lightweight reverse proxies) and some sort of HA manager (built into many load-balancing solutions). The back end is usually the harder part.
I wouldn't use Erlang as a front-end web server if the back end is some other technology.
Many of the benefits of Erlang as a web server come about when the back end is also using Erlang. The biggest of these is lower I/O costs. When your front end and back end are completely separate software stacks, you lose that benefit.
If you're going to build something on Rails, you might as well use something you can get more help with on the front end, such as nginx.

Perl, mod_perl2 or CGI for a web-scraping service?

I'm going to design an open-source web service which should collect ("web-scrape") some data from multiple (currently three) web sites.
The web sites do not expose any web service nor any API, they just publish web pages.
Data will be collected 'live' on any client's request from all the web sites in parallel, and will then be parsed to XML to be returned to the client.
The server operating system will be Linux.
The clients will initially be just an Android application of mine.
There will possibly be about 100 or more concurrent clients, if the project is successful... ;-)
Currently my preferences go to the adoption of:
Perl (for the service language)
mod_perl2 with ModPerl::Registry (for a fast Perl interpreter embedded in Apache)
perl module CHI::Driver::FastMmap (for a modern and fast cache handler)
perl module Coro (for an async event loop to place many requests in parallel)
Since I suppose the project's specifications may be of general use and interest, and since I am running into many problems with the combined use of Coro and mod_perl2, I ask:
Are my adoption preferences well matched?
Do you see any incompatibilities or potential problems?
Do you have any suggestions to enhance (in this order):
compatibility among components
neatness of the implementation
ease of maintainability
performance
You probably don't want to develop with mod_perl for any new project anymore. You really want to use something Plack-based, or maybe even Plack itself. If you want to use Coro, an AnyEvent-based backend such as Twiggy may make the most sense (though you may want to put a reverse proxy in front of it).
Are you happy sticking with apache?
If so, forget Coro and let Apache handle concurrency; preload your modules and configuration, and write a super-efficient Apache RequestHandler. (That's the way I go whenever Apache 2 + mod_perl2 is available.)
If not, start learning Plack which is server-agnostic.
If you choose the first route, I'd recommend avoiding traditional CGI and instead adopting CGI::Application, which gives you almost the lightness and speed of CGI with a much, much nicer and more modern development environment and framework (and it is Plack-compatible).
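To illustrate the "fetch several sites concurrently on each request" part that Coro was meant to handle, here is a minimal sketch of the same idea in Python using asyncio and aiohttp (the Perl equivalent would sit behind Plack with AnyEvent::HTTP or Mojo::UserAgent; the URLs and names below are placeholders of mine):

    # parallel_fetch.py -- fetch several sites concurrently, then combine the results.
    # Illustrative sketch; the URLs are placeholders, not the asker's real sources.
    import asyncio
    import aiohttp

    SOURCES = [
        "https://example.org/page-a",
        "https://example.org/page-b",
        "https://example.org/page-c",
    ]

    async def fetch(session, url):
        # Each request runs concurrently; a timeout keeps one slow site from stalling the rest.
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
            return url, await resp.text()

    async def scrape_all():
        async with aiohttp.ClientSession() as session:
            return await asyncio.gather(*(fetch(session, u) for u in SOURCES))

    if __name__ == "__main__":
        for url, body in asyncio.run(scrape_all()):
            print(url, len(body))

The per-request results would then be converted to XML (or JSON) and cached, exactly as the question describes, before being returned to the client.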

Socket vs HTTP based communication for a mobile client/server application

I've recently decided to take on a pretty big software engineering project that will involve developing a client-server based application. My plan is to develop as many clients as possible: including native iPhone, Android and Blackberry Apps as well as a web-based app.
For my server I'm planning on using a VPS (possibly from slicehost.com) running a flavor of Linux with a MySQL database. My first question is what should be my strategy for clients to interface with the server. My ideas are:
HTTP-POST or GET based communication with a PHP script.
This is something I'm very familiar with - passing information to a PHP script from a form, working with it and returning output. I'm assuming I'd want to return output to clients as some sort of XML or JSON based string. I'm also assuming I'd want to create a well defined API for clients that want to interface with my server.
Socket based communication with either a PHP script, Java program, or C++ program
This I'm less familiar with. I've worked through basic tutorials on creating a script or simple application that creates a socket, listens for a connection, and returns data. I'm assuming there is far less communication overhead with this method than with an HTTP-based method. My dream is for there to be A LOT of concurrent clients in use, all working with the server/database. I'm not sure whether a simple HTTP/PHP-script-based communication design can scale effectively to meet the needs of many clients. Also, I may eventually want the capability of a server push to clients triggered by various server events. I'm also unsure of what programming language is best suited for this. If efficiency is a big concern, I'd imagine a PHP script might not be efficient enough.
Is there a commonly accepted way of doing this? For me this is an attempt to bridge a gap between some of my current skills. I have a lot of experience with PHP and interfacing with a MySQL database to serve dynamic web pages. I also have a lot of experience developing native iPhone applications (though none that have had any significant server-based communication). I've also worked with Java/C++, and I've developed applications in both languages that have interfaced with MySQL.
I don't anticipate my clients sending or receiving a tremendous amount of data to and from the server. Something on par with a set of strings per given client-side event.
Another question: Using a VPS - good idea? I obviously don't want to pay for a full-dedicated server (slicehost offers a VPS starting at ~ $20/month), and I'm assuming a VPS will be capable of meeting the requirements of a few initial clients. As more and more users begin to interface with my server, I'm assuming it will be possible to migrate to larger and larger 'slices' and possibly eventually moving to a full-dedicated server if necessary.
Thanks for the advice! :)
I'd say go with the simplicity of HTTP, at least until your needs outgrow its capabilities. (The more stateful your application needs to be, the less HTTP fits).
For low cost and scalability, you probably can't go wrong with a cloud like Rackspace's or Amazon's. But I'm just getting started with those, my servers have been VPSs from tektonic until now.
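To underline how little ceremony the HTTP route needs on the client side, here is a minimal sketch of posting an event and reading back JSON (written in Python rather than the asker's PHP/Objective-C stack just for consistency with the rest of this page; the URL and payload fields are placeholders of mine):

    # post_event.py -- minimal client-side call to an HTTP + JSON API.
    # The URL and payload fields are placeholders, not from the question.
    import json
    import urllib.request

    def post_event(url, payload):
        # POST a small JSON payload and return the decoded JSON response.
        data = json.dumps(payload).encode("utf-8")
        req = urllib.request.Request(
            url,
            data=data,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))

    if __name__ == "__main__":
        print(post_event("https://example.org/api/events", {"event": "login", "user": "demo"}))

A raw socket protocol only starts to pay off once you need persistent connections or server push; until then, the well-understood HTTP + JSON pattern above keeps every client (iPhone, Android, BlackBerry, web) on the same simple footing.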