Perl web-service (server) best practices

I'm currently using a modified HTTP::Daemon::Threaded server in
combination with SOAP::WSDL and Pod::WSDL to provide web services
used by a variety of client types and roles.
---- that's not the question, the following is -----
I'd like to arrive at an optimal solution (as far as is possible) with respect to the following topics:
Request/Dispatch/Response speed
Protocol security (proper use of client-authenticated SSLv3/TLS)
Resource security (security roles/traits on per-resource & per-method bases)
Declarative specification of types, method signatures, and required security roles & traits.
Questions:
I'd like to be using an IO::Select- or IO::Async::Loop::IO_Ppoll-based server, but I understand that this is not compatible with in-server client-authenticated SSL. Is my understanding correct?
I'd like to move away from verifying the client certificate on each request, to something like CA SiteMinder, where I give out a time-limited session cookie after successful client-certificate verification, which can be used on subsequent requests to avoid this time penalty (and to lessen server load). Is this going to be as secure (or even good enough)? There's a sketch of this pattern after these questions.
Is there some module/framework I can build on to provide Trait- and Role-based authorisation for the exposed objects and methods? Pod::WSDL really only deals with types (and not even complex ones). I'd like to use/implement some declarative annotation-based (or external YAML-based) scheme to handle complex WSDL typing as well as Trait & Role authorisation; a sketch of the YAML idea also follows these questions. Has anyone done this (even separately)? Are there any other modules which might be a good starting point?
Finally - am I just crazy for doing this in Perl 5? ;)
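On the session-cookie question, here is a minimal sketch of one common pattern: an HMAC-signed, time-limited token issued after the one-time certificate check. The function names and field layout are illustrative, not taken from SiteMinder or any particular framework:

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

my $SECRET = $ENV{SESSION_SECRET};   # server-side secret, never sent to clients

# Issue after a successful client-certificate handshake.
sub issue_session_token {
    my ($cert_cn, $ttl) = @_;
    my $expires = time() + ($ttl || 900);               # e.g. 15-minute lifetime
    my $payload = "$cert_cn|$expires";
    return $payload . '|' . hmac_sha256_hex($payload, $SECRET);
}

# Verify on each subsequent request instead of re-checking the certificate.
sub verify_session_token {
    my ($token) = @_;
    my ($cn, $expires, $mac) = split /\|/, $token, 3;
    return unless defined $mac;
    return if $expires !~ /^\d+$/ or $expires < time(); # malformed or expired
    return unless hmac_sha256_hex("$cn|$expires", $SECRET) eq $mac;
    return $cn;                                         # authenticated identity
}
```

This is roughly what SiteMinder-style schemes do; the security hinges on the secret staying private, the token travelling only over TLS, and the lifetime being short. A constant-time comparison would be preferable to `eq` to avoid timing side channels.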
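And on the YAML-based authorisation question, a sketch of one way to structure it: an external YAML file mapping class and method to required roles, checked at dispatch time. The file layout and the authorise() helper are hypothetical, not an existing module:

```perl
use strict;
use warnings;
use YAML::XS qw(LoadFile);

# roles.yml (illustrative layout):
#   AccountService:
#     get_balance:    [ viewer, admin ]
#     transfer_funds: [ admin ]
my $acl = LoadFile('roles.yml');

sub authorise {
    my ($class, $method, @user_roles) = @_;
    my $required = $acl->{$class}{$method}
        or die "No ACL entry for ${class}::${method}\n";   # deny by default
    my %have = map { $_ => 1 } @user_roles;
    for my $role (@$required) {
        return 1 if $have{$role};
    }
    die "Access denied to ${class}::${method}\n";
}
```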

Ok, everyone's answering everything but the real questions.
I'll break this out into specific questions in separate posts, and won't make any mention at all of the server make-up - a topic which everyone in this thread seems to want to discuss, and which is completely irrelevant.

I know this is an old question, but FYI IO::Async has worked just fine with SSL ever since the release of the IO::Async::SSL module.
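A minimal sketch of a non-blocking listener that requires a client certificate, assuming IO::Async::SSL passes the SSL_* options through to IO::Socket::SSL as its documentation describes; the port and certificate file names are placeholders:

```perl
use strict;
use warnings;
use IO::Async::Loop;
use IO::Async::SSL;    # adds SSL support to the loop's listen/connect
use IO::Socket::SSL qw(SSL_VERIFY_PEER SSL_VERIFY_FAIL_IF_NO_PEER_CERT);

my $loop = IO::Async::Loop->new;

$loop->SSL_listen(
    service  => 8443,
    socktype => 'stream',

    # Server identity.
    SSL_cert_file => 'server-cert.pem',
    SSL_key_file  => 'server-key.pem',

    # Demand a client certificate and verify it against our CA.
    SSL_ca_file     => 'client-ca.pem',
    SSL_verify_mode => SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT,

    on_stream => sub {
        my ($stream) = @_;
        $stream->configure(
            on_read => sub {
                my ($self, $buffref, $eof) = @_;
                # ... parse and dispatch the request here ...
                $$buffref = '';
                return 0;
            },
        );
        $loop->add($stream);
    },

    on_listen_error  => sub { die "Cannot listen: $_[-1]" },
    on_resolve_error => sub { die "Cannot resolve: $_[-1]" },
    on_ssl_error     => sub { warn "SSL handshake failed: $_[-1]" },
);

$loop->run;
```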

You're crazy for doing this in Perl :-)
Seriously though, more power to you. My question, presuming you have some reason to reinvent this wheel, is: why not consider Python? Perl is alive and well, but so much of this kind of thing (low-level scripting) is being done in Python now.
Finally, presuming you don't have an actual reason to be doing this (aside from fun), you should really consider a web framework (Django, of course) and something like nginx to handle the HTTP interaction.

Related

Is REST only adequate for applications with human-computer interaction?

I am fairly new to building applications using the RESTful architecture. As a matter of fact, everything I have done so far would be categorized as Level 2 REST by Leonard Richardson, and I know Fielding would happily categorize it as non-RESTful.
I have spent hours trying to understand HATEOAS and how to reach the final, hypermedia level of Richardson's model, and I see it more clearly now. I conceptualize the application as a series of state transitions, where the resources dynamically provide links with information on how to move from one state to another.
But everything related to HATEOAS seems inherent to human-computer interaction. I mean, even when the resources provide the links that enable the application user to move to the next state, it is ultimately the user who drives the application from one state to the other by following the provided links.
But how are things supposed to work when we are dealing with computer-to-computer interaction? After all, when it comes to service-orientation the idea of service composition is key, and we cannot naively assume that the client is always going to be a human being. Many services are designed to be consumed by non-human users, and some interactions/orchestrations might be fairly complex - the type of thing that is typically modeled with BPM or BPEL.
Is REST, and particularly HATEOAS, only usable in applications that imply human intervention, and if not, how is this supposed to work otherwise?
I am getting the vibe that REST is only good for certain types of solutions and inadequate for others, but the literature out there has failed to explain those inadequacies, selling REST instead as the cure for all evils. I just don't quite get how to use it for proper service composition when humans are not the drivers.
I'd really appreciate any references or insights on this because, believe me, I have spent two days straight reading everything I could find on this topic, and I have not yet been able to reach any reasonable and well-documented conclusions.
Well, your client app can parse the response to get the possible actions. In that case the actual URLs are obtained not from prior knowledge of the API, but from the response to the initial request (usually a GET). All human-less.
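To illustrate: the machine client knows only the entry-point URL in advance; every subsequent URL comes out of a response. A Perl sketch, where the host and the links/rel/href response shape are made up for illustration (and assuming a recent LWP::UserAgent with the delete() helper):

```perl
use strict;
use warnings;
use LWP::UserAgent;
use JSON::PP qw(decode_json);

my $ua = LWP::UserAgent->new;

# The entry point is the only URL the client knows in advance.
my $res  = $ua->get('https://api.example.com/orders/42');
my $body = decode_json($res->decoded_content);

# Discover the next state transition from the hypermedia controls,
# not from hard-coded knowledge of the URL structure.
my ($cancel) = grep { $_->{rel} eq 'cancel' } @{ $body->{links} };
$ua->delete($cancel->{href}) if $cancel;
```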
It sounds almost as if you're comparing SOA to REST/hypermedia and failing to see that SOA is a strategy for designing a complex system made out of other systems, while REST/hypermedia is a software architecture style applying a bunch of constraints on client-server communication. The client, however, can be either a server or a human; it doesn't matter.
Whether or not to use REST/hypermedia is not something to bother with when outlining/designing service composition. It's a question that comes into play when trying to achieve syntactic interoperability, and many times it comes down to comparing REST to SOAP and other technical details.

How to implement XEP-0289 FMUC plugin on a XMPP server?

I need to implement a distributed XMPP MUC application along the lines of XEP-0289, minus some of the features; in essence I want a bare-bones implementation of the plugin. My concern is to address fault tolerance, and as of now I do not want to worry about the performance considerations specified in XEP-0289.
I have looked into SleekXMPP as a tool for developing server-side plugins, but I don't know how comfortable it would be to use for such an implementation. Other options I have looked at are Openfire and Tigase. I am comfortable with Python/Java, and other key features to consider would be good documentation, ease of use, etc. Keeping that in mind, I would like to know the preferred path to take for this development.
Any guidance will be appreciated.
You should be able to write a MUC component that includes FMUC (or similar). The general way to do this would be to use a library that supports XEP-0114 components (e.g. SleekXMPP (Python) or Swiften (C++)) and implement MUC+FMUC through that. You haven't said what your concerns with SleekXMPP are, but it's a fairly well-respected library in the XMPP community, so it seems a fair choice (I'd pick Swiften, but I'm biased as one of the authors).
Your second option (patching the server directly) isn't generally the XMPPish way of adding customisations (as it's vendor-specific), but should also work if you can find someone sufficiently familiar with the server code, or if you're willing to become so.
To achieve fault tolerance (assuming you mean resilience to server failures) you'd need to run your XMPP server clustered, and also cluster your FMUC implementation. With that done, the usual XMPP fail-over using SRV records in DNS should ensure other servers retry connections to another host.
On a side note, the next version of FMUC (XEP-0289) will have some features of the current revision stripped out and a number of improvements made based on deployment experience, so if your work is not time-critical it might be of benefit to you to read that when it's released. I also note that at least one implementation of FMUC already exists (Isode's M-Link, on which I work), and there is interest from other vendors, so using the standard protocol might benefit you in terms of not re-inventing the wheel.

perl check if proxy is online

How can I check with Perl whether a proxy is online and working properly? I was considering running a GET request and comparing the output, but I'll be running this check so frequently that the overhead would be huge. Is there a more lightweight alternative?
No, this is exactly how you do it. If you used a lighter-weight method such as HEAD, TRACE or OPTIONS instead, you could not know whether the proxy is actually useful, or censoring, or even subtly subverting the unencrypted data.
You can keep the overhead small by testing against a minimal useful HTML document.
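A sketch of that approach; the probe URL and the expected body stand in for a tiny document you control and know byte for byte:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Fetch a tiny known document through the proxy and compare the body,
# so a proxy that is up but censoring or mangling traffic is caught too.
sub proxy_works {
    my ($proxy_url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 5);
    $ua->proxy(['http'], $proxy_url);

    my $res = $ua->get('http://example.com/probe.html');
    return 0 unless $res->is_success;

    # Compare against the known-good content, not just the status code.
    return $res->decoded_content eq "<html><body>OK</body></html>\n";
}

print proxy_works('http://203.0.113.7:3128') ? "usable\n" : "broken\n";
```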
Like daxim said, I think that testing against a very small HTML document is going to be lightweight enough for most scenarios.
The ultimate lightweight solution would be to use a web service that responds with minimal data about your proxy's IP address - whether it is online, working fast enough, and so on. This of course brings a third party into the picture (one that will be doing the not-so-light work of making requests to all the proxies), and this, like everything, has its pros and cons.
I use this proxy checker from Google Code to do exactly what you need, and it also gives me some more info about each IP address, like the country and a couple of proxy speed measurements. It's very simple code that consumes a web service from http://proxyipchecker.com/ .
PS: The example is in PHP, but it is trivial to do the same in Perl.

PSGI: What is it and what's the fuss about?

I have been trying to decide if my web project is a candidate for implementation using PSGI, but I don't really see what good it would do for my application at this stage.
I don't really understand all the fuss. To me PSGI seems like a framework that provides a common interface between different Apache modules, which lets you move your application between them - e.g. easily move your application from running on mod_perl to FastCGI, and support running on both.
Is that right, or have I missed something?
As I and the team I am a part of not only develop the application but also do pretty much all of the maintenance and server setup, I don't see the value for us of being able to run on FastCGI, CGI, and mod_perl; we do just fine with mod_perl alone.
Have I misunderstood the PSGI functionality, or is it just not suitable for my project?
Forget the Apache bit. It's a way of writing your application so that the choice of webserver becomes less relevant. At $work we switched to Plack/PSGI after finding our app running with a very high CPU load following an upgrade to Apache 2 - benchmarking various Apache configs and profiling with NYTProf were unable to determine the reason, and using PSGI and the Starman webserver worked out much better for us.
Now everything is handled in one place by our PSGI app (URL re-writes, static content, expiry headers, etc) rather than Apache configuration, so it's a) Perl, and b) easily tested via our standard /t/ scripts. Also our tests are now testing exactly what a user sees, rather than just the basic app itself.
It may well not be relevant to you if you're happy with Apache and mod_perl, and I'm sure others will be able to give much better answers, but for us not having to deal with anything Apache-related again is such a relief in itself. The ease of testing, and the ability to just stick in a Data::Dumper and see what's going on rather than wrestling with ModRewrite and friends, is a great boon.
Borrowing from a recent blog post by chromatic, Why PSGI/Plack Matters (Testing), here's what it is:
It's a good idea borrowed from Python's WSGI and Ruby's Rack but made Perlish; it's a simple formalizing of a pattern of web application development, where the entry point into the application is a function reference and the exit point is a tuple of header information and a response body.
That's it. That's as simple as it can be, and that simplicity deceives a lot of people who want to learn it.
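To make that concrete, here is a complete PSGI application; the whole thing is one code reference returning the status, headers and body:

```perl
# hello.psgi - run it with:  plackup hello.psgi
my $app = sub {
    my ($env) = @_;   # PSGI environment hash (CGI-like keys: PATH_INFO, etc.)

    return [
        200,                                  # HTTP status
        [ 'Content-Type' => 'text/plain' ],   # headers, as an array ref
        [ "Hello from PSGI\n" ],              # body, as an array ref of strings
    ];
};
```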
An important benefit is, ibid.,
Given a Plack application, you don't have to deploy to a web server—even locally—to test your application as if it were deployed … Plack and TWMP (and Plack::Test) use the well-defined Plack pattern to make something which was previously difficult into something amazingly easy. They're not the first and they won't be the last, but they do demonstrate the value of Plack.
I started writing an answer and deleted it after 50 lines, simply because it is impossible to explain briefly why PSGI is so extremely cool. I'm new to PSGI too, but a zillion things are now much easier than they were back in my Apache/mod_perl era.
I can give you the following advice:
Read the Plack Advent Calendar - all days, step by step. You need to understand the basic philosophy: what is good about onions (layers of middleware) and so on... :)
Search CPAN for "Plack::Middleware::" and read the first few lines of each module. There are MANY. (There really should be a short overview of each one somewhere; unfortunately I don't know of one. It is simply good to know what middleware has already been developed. For example, you will surely need Plack::Middleware::Session, or Plack::Middleware::Static, and so on...)
Read about Plack::Builder (already done if you've finished the advent calendar); there's a sketch of it right after this list. :)
Try writing some apps with it, and you will soon find that you can no longer imagine living without Plack.
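Here is a sketch of Plack::Builder wiring up the middleware "onion" mentioned above; the paths and options are illustrative:

```perl
# app.psgi
use Plack::Builder;

my $app = sub {
    my ($env) = @_;
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello\n" ] ];
};

builder {
    # Each middleware layer wraps the one below it - the "onion".
    enable 'Session';    # Plack::Middleware::Session, default in-memory store
    enable 'Static', path => qr{^/static/}, root => './public/';
    $app;
};
```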
PS: If there were something like a "Perl Oscar", I would surely nominate MyiagavaSan. :)

Is SOAP now a legacy technology?

Are people still writing SOAP services or is it a technology that has passed its architectural shelf life? Are people returning to binary formats?
The alternative to SOAP is not binary formats.
I think you're seeing a surge in the desire to leave the complexities of WS-* behind in favor of REST and JSON, because they're much simpler to use and don't require frameworks to be used successfully. The problems that WS-* ostensibly tries to solve aren't problems for most users, but those users have to pay for the complexity anyway.
I still write WS-*-based services. Somewhat surprisingly, I've had less trouble with them when trying to interoperate with less capable developers. This is because if I send them a WSDL file, they know how to crank it through their tool and get an API they can call, while being blissfully unaware of what is happening under the hood. To give customers a RESTful service, I have to start talking to them about HTTP and XML, which they really don't understand as well as they think they do, and then I start getting a headache.
In other words, to be successful with REST, both the service provider and consumer have to know what they're doing (and they can keep things simple and come up with a great, non–WS-* solution). With WS-* technologies, it can still succeed even if only one party has a clue.
I think, however, that REST-oriented standards that are much less complicated than current WS standards, will eventually emerge, and when that happens, comparable tools will be available too.
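For what it's worth, the "crank the WSDL through a tool" workflow mentioned above looks like this in Perl with SOAP::Lite; the WSDL URL and operation name are hypothetical:

```perl
use strict;
use warnings;
use SOAP::Lite;

# SOAP::Lite reads the WSDL and builds the client-side stubs for us;
# the consumer never has to look at the envelopes on the wire.
my $service = SOAP::Lite->service('http://example.com/quote.wsdl');
my $price   = $service->getQuote('PERL');   # operation defined in the WSDL
print "Quote: $price\n";
```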
I think so. RESTful solutions are more and more sensible for the vast majority of use cases; the complexities of SOAP and other RPC technologies just aren't worth the effort anymore.
I wouldn't consider SOAP legacy at all. REST vs. SOAP is really just a continuation of the debate of COM/CORBA vs. HTTP POST/GET, etc. SOAP is nothing more than an updated version of the same principles defined by COM and CORBA (contracts, providers, consumers, etc.). It's just that SOAP appears to have succeeded (at least partially) where the other two failed (and it could be that SOAP just has a better marketing team); that is, SOAP really does allow two different systems to connect rather easily compared to its predecessors. That being said, it still suffers from the same drawbacks that COM/CORBA did... it can get really complex.
I think REST is just coming back into style at the moment. It's nothing new, people are just taking another look at it. Look at the web. It's REST and it's been around for years. 5 years from now people are going to look back and say the same thing about it being legacy and the need to change. It's the nature of software development. Everything goes in cycles.
The debate about which one is better is going to be just like the tabs vs. spaces debate. There are going to be people on different sides swearing that one is better. Really in the end, they both accomplish the same goal. Sure one will be a better solution than the other in some situations, but in the end neither will be superior 100% of the time.
We were using SOAP, but since we control both messaging endpoints (a thick client out on the web connecting to our servers) we decided that the "lingua franca" of XML wasn't offering any real benefit. Instead, we're experimenting with binary serialization via Google protocol buffers, and we like everything we've learned so far. It's somewhat CORBA-esque, but doesn't make me grumpy the way CORBA did. We still haven't found the best fit for the RPC layer, but we're pretty sure the payload will be protocol buffers.
The point I'm trying to make is that if you control both sides of the conversation, there are significant efficiency advantages in bypassing the XML tax.
Yes, some people still are (and now it's 2011!). I think the main reason is that MS WCF automatically generates SOAP bindings. The horror.
It's impossible to define the best technology solution without considering what the problem is; in other words, what the context is. Both REST and SOAP have their place. If you have a high-traffic site and a development audience comfortable with REST, then SOAP would be a bad choice, primarily because the message size is so incredibly bloated. If you have a small-scale site with a modest development budget, then SOAP will be a superior choice due to automatic proxy generation from WSDL. To make a fair comparison, it should be mentioned that implementing a REST conversation takes more development time and is therefore more expensive, a very relevant fact for your boss.
While it is true that SOAP is a more complicated protocol, in my experience this doesn't translate into maintainability issues. That's because messages ride on HTTP and can be debugged just as easily as REST messages, and the SOAP stacks available on major platforms are very solid.
The complexity of SOAP is of course an advantage if your requirements include sophisticated items like federated message security. On the other hand, those kinds of requirements are not seen that often in my experience. The WS standards committee may have been vulnerable to some YAGNI issues. Now that web service communication is commonplace, it's turning out to be simpler than was originally envisioned.