I'm required to use wget for my project because of its ability to work like a crawler. But everywhere in my Google searches I see people recommending LWP instead of wget. Can you enlighten me on why this is so?
If you're writing Perl and need to make an HTTP request, you should generally use LWP. It's silly to shell out to do something that is easily supported within the Perl process.
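A minimal sketch of what "within the Perl process" looks like with LWP::UserAgent (the URL is a placeholder):

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(timeout => 10);
my $response = $ua->get('http://example.com/');

if ($response->is_success) {
    print $response->decoded_content;   # body, decoded per the response headers
} else {
    die "Request failed: ", $response->status_line, "\n";
}
```

Because the response comes back as an object, you get the status line, headers, and decoded body directly, rather than parsing wget's output or checking its exit code.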
If you want to do something more complex, like recursive web crawling, you may want to look at the WWW::Mechanize or Mojolicious modules available from CPAN. But at that point it might be reasonable to shell out to take advantage of an external tool that already has recursive web-crawling capability.
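As a rough sketch of the building block for a crawler, WWW::Mechanize can fetch a page and enumerate its links (the starting URL is a placeholder; a real crawler would also need a visited-set and politeness controls):

```perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 1 makes failed requests die instead of failing silently
my $mech = WWW::Mechanize->new(autocheck => 1);
$mech->get('http://example.com/');

# links() returns WWW::Mechanize::Link objects for every link on the page
for my $link ($mech->links) {
    print $link->url_abs, "\n";   # absolute URL, resolved against the base
}
```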
If you're writing a shell script rather than a Perl program, then you have no choice but to use an external tool. The choice among wget, curl, and the LWP scripts (lwp-request, GET, etc) really comes down to what's easiest for your use case. They all have have approximately the same features, but some things are easier in one tool than the others. Use what's readily available for your system; there's usually more than one option, in which case you should give them all a try - read the doc, try a few use cases, see which one you like.
Related
I have an interactive Perl shell script that asks the user for some input and, according to that choice, returns another set of options to choose from.
Now, soon I will have a demand to create a web interface for it. No more black terminals with blinking prompt for our users.
I was wondering what is the best way to pull this off with minimum changes to the original script? How to keep the script running and wait for user input? What is the best way to send user input to the script and then return it?
Or should I refactor the script to use CGI? I want to hear some general direction that I should take. I already know just the basic of Perl and CGI.
If you are using CGI these days you might as well be using something modern and fast like Plack. Since Plack will wrap a CGI style application you can even use a CGI application "inside" Plack to get modern features, pluggable middleware, etc., alongside olde-fashioned 1990's traditions.
Depending on how complex your application is there's a number of small frameworks and helper modules that can make working with Plack easier. Converting a console application to use a web interface might be easier if you use something like Squatting or Kelp.
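For reference, a complete Plack application is just a code ref that maps a request environment to a status/headers/body triple; this minimal sketch (saved as e.g. `app.psgi` and run with `plackup`) shows the shape your console logic would be wrapped in:

```perl
# app.psgi -- run with: plackup app.psgi
use strict;
use warnings;

my $app = sub {
    my $env = shift;   # PSGI environment hash (method, path, query, etc.)
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from Plack\n" ],
    ];
};

$app;   # the .psgi file must return the app code ref
</imports>
```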
I work in a project that uses Perl scripts on machines that are maintained by system folks, and installing packages such as Perl modules is no easy task - you usually have to call the person with the permissions to do that, wait several days, learn the API of the package, and then remember to install it on every newly configured machine ever installed.
The alternative that is often chosen is just calling system() (or backtick notation, ``, inside Perl) and using the output of the shell command that does what you want. Of course it's not always possible, and even when it is, there is usually some wrapping to do around the command call, but it's generally easier.
My question is when is the trade-off pulling towards either side, in your experience?
Edit: adding an example:
I want to print the total size of a directory in a human-readable format, or list all regular files bigger than a certain size, and du seems to make this easier than installing modules that do the same thing...
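For that particular example, core Perl already covers it: File::Find ships with every Perl, so no module installation is needed. A sketch of the `du -s`-style total (the MB formatting is just one choice of "human-readable"):

```perl
use strict;
use warnings;
use File::Find;

# Return total size in bytes of regular files under $dir, like `du -sb $dir`
sub dir_size {
    my ($dir) = @_;
    my $total = 0;
    find(sub { $total += -s $_ if -f $_ }, $dir);
    return $total;
}

printf "%.1f MB\n", dir_size($ARGV[0] // '.') / (1024 * 1024);
```

Listing regular files above a size threshold is the same `find` call with a `print if -f && -s _ > $limit` test in the callback.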
Oh, that's an easy one. One always prefers modules because the tighter integration makes it possible to pass around data that do not suit the traditional IPC.
What, did you expect help in rationalising your suffering under crappy sysadminship?
The core issue seems to be the perceived difficulty and length of time to install Perl modules. I would identify why they have problems installing the packages and try and help streamline their process.
A common solution is to modify your process. Admins don't typically like to install straight from CPAN, but as a developer you can use a local CPAN repo. You "freeze" the packages you use there, and then promote them for use in production.
That said, the trade-off between using modules and shelling out is as follows:
Data
Modules typically return structured data, shelling out returns unstructured text that you have to parse.
Performance/Resource Usage
Shelling out creates a whole other process, modules usually provide functionality within the current operating process.
Additional Dependencies
Shelling out makes your program dependent on whatever you're shelling out to. Keep in mind that some basic programs change in output and behavior from OS to OS, so you may also be coupling yourself to a particular OS or set of OSes.
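The "Data" point above can be illustrated with a file's size: the core File::stat module hands back an object, while the shell route hands back a line of `ls -l` text whose column layout varies by platform (a sketch; `/etc/hosts` is just a convenient example file):

```perl
use strict;
use warnings;
use File::stat;

# Module: structured data back as an object
my $st = stat('/etc/hosts') or die "stat failed: $!";
print "size via module: ", $st->size, " bytes\n";

# Shelling out: unstructured text you must parse, and ls output
# differs between OSes, coupling you to a particular platform
my $line = `ls -l /etc/hosts`;
my ($size) = (split ' ', $line)[4];
print "size via ls:     $size bytes\n";
```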
Modules, always modules. And, when I say always, I mean "almost always."
Modules are easier to control and to debug. It's easier to require a module be installed than some unrelated utility. It's more portable to use a module than to rely on the chancy command line syntax of a particular command.
You have a bunch of modules that go with the core distro. Just use them.
You can install modules right in your home directory and when the time comes negotiate with the sysadmins: http://perl.jonallen.info/writing/articles/install-perl-modules-without-root
As said above, always modules, because ls ain't dir...
I've been looking at ways to add a Perl-based SOAP server to our systems. Every time I look at the SOAP tools in Perl, I'm disappointed at how much more developer overhead seems to be required than for folks in the VS.net world. (And I've been looking for a long time.)
From reviewing other questions here, it looks like XML::Compile::SOAP is the popular tool in recent times. I've worked out a way to get my code served as a service through it, but I also had to manually massage a .WSDL file I initially generated with Pod::WSDL.
Is there a more direct approach that I'm not finding on CPAN or elsewhere? Pod::WSDL by default spits out rpc/encoded, which I understand is not the 'best' way anymore. The docs say that document/literal is not supported yet, and reading the source you can see parts of the code that do document/literal, but it doesn't work directly with XML::Compile::SOAP. Ideally I add a few annotations to our code in either POD or perl5 attributes, and push a button, and get WSDL that I can feed to the XML::Compile package.
Thanks in advance.
As far as I know there isn't a better approach. At least for simple services.
I found that if I write my web services with an RPC mindset then SOAP::Lite is ludicrously easy to use.
Write a small Perl program with the subroutine in the same file.
Refactor in OO style, with params and return values as Perl objects.
Move subroutine to a separate module, test.
Move subroutine module to server. Copy data transfer object modules there too.
Add a SOAP::Lite dispatcher to server (a handful of lines).
Add a handful of lines of SOAP::Lite at the top of perl program (client). Done.
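A sketch of what those "handful of lines" look like on each end, using SOAP::Lite's documented CGI dispatcher; `My::Service`, `some_method`, and the URLs are placeholders for your own module and deployment:

```perl
# --- server side: soap-server.cgi ---
use strict;
use warnings;
use SOAP::Transport::HTTP;

SOAP::Transport::HTTP::CGI
    ->dispatch_to('My::Service')   # module whose subs become SOAP methods
    ->handle;

# --- client side ---
use SOAP::Lite;

my $soap = SOAP::Lite
    ->uri('urn:My/Service')                                  # namespace URI
    ->proxy('http://example.com/cgi-bin/soap-server.cgi');   # endpoint

my $result = $soap->some_method('arg')->result;
```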
It is true that this uses the RPC/Encoded style of SOAP, which is out of favour. However, if you add a little bit of POD markup, you can generate WSDL which (post-massage) .Net clients can happily consume. (At least, I tested with Mono and a third party tested with MS C#.)
As you say, Pod::WSDL's output does need massaging, I recall I had to fix one namespace as part of my build process.
The downside is that a lot of advanced SOAP-related XML processing is seemingly missing or isn't well integrated and that SOAP::Lite and Pod::WSDL seem a little moribund.
But for simple remote method invocation, passing Perl objects, I find it orders of magnitude simpler than the Java frameworks I've had a stab at. Your Mileage May Vary.
I just noticed CGI::Simple while looking something up for the CGI.pm module. Should I be using CGI::Simple instead? What will it offer me over CGI.pm, which I've used for eight years? I see that CGI::Simple doesn't do HTML generation; what should I be using for that? And will it integrate with CGI::Simple by allowing me to make form values persist, as CGI.pm does?
I think it boils down to this line from the docs: "In practical testing this module loads and runs about twice as fast as CGI.pm depending on the precise task."
If you aren't concerned by the speed of your CGI program, I think it is safe to ignore this module. If you are concerned with speed I would suggest you look into CGI::Fast first.
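For context, CGI::Fast attacks the speed problem differently: under FastCGI the interpreter stays resident, so compile-time cost is paid once rather than per request. The conversion is a sketch like this, wrapping your existing CGI code in the accept loop:

```perl
use strict;
use warnings;
use CGI::Fast;

# Each loop iteration handles one incoming request; the process,
# and everything compiled at startup, persists between requests.
while (my $q = CGI::Fast->new) {
    print $q->header('text/plain');
    print "Hello, ", $q->param('name') // 'world', "\n";
}
```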
I have rarely used the HTML generation facilities of CGI.pm. For that, I prefer HTML::Template, usually in conjunction with CGI::Application. CGI::Application can use any $cgi object, specified in the call to its constructor.
I think CGI still has its place. I like CGI::Simple because it provides a clean OO interface.
I maintain CGI.pm and have helped patch CGI::Simple as well. I've looked at the code for both in depth and have benchmarked them. I think there are minimal benefits to switching to CGI::Simple. You will find some headaches in the process, like incompatible syntax for handling file uploads that would need to be changed.
I agree with the sentiment of some others here that if you are going to move forward, you should look beyond either of these. I recommend looking towards something that natively works with PSGI.
I'm kinda surprised you're still using CGI at all. Consider a more adult framework like a Catalyst/TT/DBIx stack.
You might try CGI::Simple for new things, but otherwise let sleeping dogs lie. If your old programs are working, leave them alone. :)
CGI.pm has a good install base; most Perl installs have it. A refactored and slightly minimized CGI::Simple doesn't do it for me, really. I would hate to reach a point where I ended up needing CGI.pm for something and had to maintain both.
I find CGI's HTML generator a great tool for handling escaping and encoding and for producing solid, compliant HTML.
As you seem ready to migrate, please stop writing dirty old CGI-based scripts. Use instead a modern and clean web engine such as Dancer or Mojolicious.
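For a sense of scale, a complete Mojolicious::Lite application fits in a few lines; this sketch (run with `morbo hello.pl` during development) serves one route:

```perl
# hello.pl -- run with: morbo hello.pl
use Mojolicious::Lite;

get '/' => sub {
    my $c = shift;                                # the controller object
    $c->render(text => 'Hello from Mojolicious');
};

app->start;
```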
Which Google API search module for Perl do you recommend?
REST::Google::Search or Google::Search
Ease of use is important because I only really need to get the top two, maybe three results from the web search.
Also, is there a module anyone knows of for getting things from Wikipedia? Again, ease of use is important.
There doesn't appear to be much to choose between them.
For what it's worth, I'd go with Google::Search, mostly because it's written using Moose under the hood :D
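A hedged sketch of grabbing the top few results based on Google::Search's documented interface (note the underlying Google AJAX Search API that these modules wrapped has since been retired, so treat this as illustrative):

```perl
use strict;
use warnings;
use Google::Search;

my $search = Google::Search->Web(query => 'perl lwp');

# Pull just the top three results, as the question asks
for my $i (1 .. 3) {
    my $result = $search->next or last;   # iterator over result objects
    print $result->uri, "\n";
}
```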