Yahoo Pipes clone script?

Yahoo Pipes lacks processing power and does not work well with websites from the Far East. I need to apply complex regexes to feeds from multiple locations, hundreds of posts every minute, and Yahoo Pipes fails to produce the results.
Is there any code or script that acts like Yahoo Pipes which I can run on my own server?

Pipe2py is a compiler script that will generate a Python equivalent of a Yahoo Pipe given the URL of the pipe:
https://github.com/ggaughan/pipe2py/
(Note that not all Pipes blocks have yet been implemented.)
A "hosted" version of Pipe2Py is also available on Google App Engine: http://pipes-engine.appspot.com/

My inclination would be to use something like LWP with Perl
http://metacpan.org/pod/LWP
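A minimal sketch of that approach, assuming the feeds are ordinary RSS; it uses LWP::UserAgent for the fetch and XML::RSS for parsing, and the feed URL and pattern below are placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use XML::RSS;

    # Placeholder feed URL and pattern -- substitute your own.
    my $feed_url = 'http://example.com/feed.rss';
    my $pattern  = qr/breaking/i;

    my $ua       = LWP::UserAgent->new( timeout => 30 );
    my $response = $ua->get($feed_url);
    die 'Fetch failed: ' . $response->status_line unless $response->is_success;

    my $rss = XML::RSS->new;
    $rss->parse( $response->decoded_content );

    # Apply the regex to each item, much like a Pipes "Filter" block.
    for my $item ( @{ $rss->{items} } ) {
        next unless $item->{title} =~ $pattern;
        print "$item->{title}\t$item->{link}\n";
    }

Run something like this from cron (or a small loop) per feed; since it is plain Perl on your own server, there is no limit on how many posts per minute you push through it.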

Related

How is my/the user's web browser displaying a web page built in Perl?

This isn't a specific programming question, but more of a conceptual/software-engineering question.
I'm a new web dev hire at a small local company, and I was given a really cool chance to learn and grow as a professional. They were kind enough to give me a chance, and I'd like to be proactive in learning as much as I can about how their back-end system works, considering it's what I'll be working in most of the time.
From what I've gathered, their entire in-house-built job tracking interface is built in Perl (with the aid of CSS, JS, and SQL), where the HTML pages are generated and spat out as the user wants to access them.
For example, if I want to access a specific job, it'll look like this in the user's url. https://tracking.ourcompanywebsite/jobtracker/job/1234
On the internal side, I know we have a "viewing" script called something like "JobView" that will literally query all of the fields in the Perl script and structure an HTML page around the data we are requesting.
My question is: how the fudge is this happening? How does a user putting that address into the URL bar trigger a Perl script to run on our server and generate a page that is spat back out to the user?
I guess that's my main curiosity. In your average bare-bones web development courses in college, I learned to make HTML, CSS, and JS files. When you want to view a web page, you simply put in the path to that HTML page, and the browser constructs everything from that.
When you put the path to a Perl file in a browser, it will just show the raw Perl code, haha.
I'm sure there are some modules and various add-ons in our software that allow this all to work, which I may be missing, so please forgive me.
I know you guys don't have the codebase in front of you, but I figured conceptually there is something to be learned that doesn't necessarily need all of the specifics.
I hope this question can be useful to any other amateur devs who have the same questions.
Consider the following two snippets:
cat file | program
printf 'foo\n' | cat | program
In the first snippet, cat reads its input from a file. In the second, it gets it from another program. But program doesn't care about any of that; it just reads whatever is provided on its STDIN.
The web browser is like program. It doesn't care where the web server got the HTML or image or whatever it requested. It sends a URL, and it receives a response with a document from the web server.
The web server, like cat, can obtain what it needs from multiple sources. Specifically, it can be configured to get the requested document in a few different ways.
The "default" would be to map the URL to a directory and return the file found there. But that's not the only option. There are two other major options commonly found in web servers:
Common Gateway Interface (CGI)
Some web servers can be configured to run a program based on the URL received. Information about the request is passed to the program, which is tasked with producing a response. The web server simply returns the output of this program to the requesting browser.
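For example, a minimal CGI program in Perl might look like the sketch below (the path and field name are made up); the web server runs it and relays whatever it prints:

    #!/usr/bin/perl
    # Hypothetical /cgi-bin/jobview.cgi -- the server maps a URL to this file.
    use strict;
    use warnings;
    use CGI;

    my $q   = CGI->new;                    # parses QUERY_STRING / POST body
    my $job = $q->param('job') // 'none';  # e.g. /cgi-bin/jobview.cgi?job=1234

    # Everything printed to STDOUT becomes the HTTP response.
    print $q->header('text/html');
    print "<html><body><h1>Job $job</h1></body></html>\n";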
FastCGI
It can be quite wasteful to spawn a new child process for each request. FastCGI allows a web server to talk to an existing persistent process, or pool of processes, that listens for requests from the web server. Again, the web server simply returns the response from this process to the requesting browser.
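A sketch of the persistent-process side, using the FCGI module from CPAN (the web server still has to be configured to dispatch requests to this process, and how that is done varies by server):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use FCGI;

    my $request = FCGI::Request();
    my $count   = 0;

    # The process stays alive; each Accept() waits for the next request
    # handed over by the web server.
    while ( $request->Accept() >= 0 ) {
        $count++;
        print "Content-Type: text/plain\r\n\r\n";
        print "Served $count requests from PID $$\n";
    }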

Is there a way to use PowerShell to open an SSH session to a Cisco router, run a show command, and extract that information?

I have been tasked with something at work that requires me to use PowerShell or some other scripting language to parse an email and pull part of the body out of it, or, alternatively, to grab the concurrent user count for Citrix users out of Goliath. I then need to open an SSH session to a router, run a sh vpn-sessiondb command, grab the output of that command, and combine it with the data parsed from the email. If someone can help with even a single part of the script it will go a long way for me :)
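Since the question allows "some other scripting language", here is a rough sketch of just the SSH half in Perl using Net::OpenSSH (hostname, credentials, and the pattern matched in the output are placeholders, and Net::OpenSSH needs a Unix-like system with the OpenSSH client installed):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Net::OpenSSH;

    # Placeholder host and credentials.
    my $ssh = Net::OpenSSH->new(
        'router.example.com',
        user     => 'admin',
        password => 'secret',
    );
    $ssh->error and die 'SSH connection failed: ' . $ssh->error;

    # Run the show command and capture its output line by line.
    my @output = $ssh->capture('show vpn-sessiondb');
    $ssh->error and die 'Remote command failed: ' . $ssh->error;

    # Pick out whichever line holds the session count you need.
    my ($sessions) = grep { /Active/ } @output;
    print "Router reports: $sessions" if defined $sessions;

The email-parsing half would then just combine this value with whatever was extracted from the message body.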

POST an HTML form to a PowerShell script

I just need a plain, static .html page with a form that POSTs to a PowerShell script.
I've seen plenty of material on the PowerShell Invoke-WebRequest cmdlet, but there PowerShell is always initiating the HTTP request (and then handling the HTTP response).
Thank you!
The short answer is that you cannot POST directly to a PowerShell script. When you POST to a website, you are passing arguments to the web server that are then presented to code on that web server (the target of your POST request) that the web server is capable of executing. Web servers do not understand PowerShell (unless Microsoft has implemented this, which a few quick googles suggest they haven't).
That being said, your ultimate goal is likely that you want to consume data sourced from a form via a PowerShell script. You will need to implement a backend on the web server to consume the POST request and pass it to the operating-system level to be run via PowerShell. This is generally not a good idea, but if you are doing it for an internal site to get something running quickly, then so be it.
Here is the process to call a PowerShell script from ASP.NET: http://jeffmurr.com/blog/?p=142
You could approach this problem in many other ways. You could write your backend site to save the data from the POST request to a file and come along and parse that file on a schedule with PowerShell. You could use a database in the same manner, or you could create a trigger in the database to run the script each time a row is appended.
I suspect that if you work down one of these pathways you will ultimately find that the technology you are using on the backend (like ASP.NET, PHP, or JavaScript) is capable of doing the work you need done, and that you would have far fewer moving parts if you stuck with one of those. Don't be afraid to learn something new. Jumping to JavaScript from PowerShell is not that difficult.
And the world moves too fast. Here is a NodeJS-like implementation of a web server in PowerShell:
https://gallery.technet.microsoft.com/scriptcenter/Powershell-Webserver-74dcf466
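As a rough illustration of the "save the POST to a file and process it later" idea mentioned above, here is a sketch of a backend receiver written as a Perl CGI script (the backend can be any language your web server supports; the spool path and behaviour are made up):

    #!/usr/bin/perl
    # Hypothetical receiver: accepts the form POST and appends it to a spool
    # file that a scheduled PowerShell task can pick up and process later.
    use strict;
    use warnings;
    use CGI;

    my $q     = CGI->new;
    my $spool = '/var/spool/form-posts.txt';   # placeholder path

    open my $fh, '>>', $spool or die "Cannot open $spool: $!";
    for my $name ( $q->param ) {
        printf {$fh} "%s=%s\n", $name, scalar $q->param($name);
    }
    print {$fh} "---\n";
    close $fh;

    print $q->header('text/plain'), "Received.\n";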

Is there any way to allow failed uploads to resume with a Perl CGI script?

The application is simple: an HTML form that POSTs to a Perl script. The problem is that our customers sometimes upload very large files (> 500 MB) and their internet connections can be unreliable at times.
Is there any way to resume a failed transfer like in WinSCP or is this something that can't be done without support for it in the client?
AFAIK, it must be supported by the client. Basically, the client and the server need to negotiate which parts of the file (likely defined as parts in a "multipart/form-data" POST) have already been uploaded, and then the server code needs to be able to merge the newly uploaded data with the existing data.
The best solution is custom uploader code, usually implemented in Java, though I think this may be possible in Flash as well. You might even be able to do this via JavaScript; see the two sections with examples below.
Here's an example of how Google did it with YouTube: http://code.google.com/apis/youtube/2.0/developers_guide_protocol_resumable_uploads.html
It uses a "308 Resume Incomplete" HTTP response, in which the server sends a Range: bytes=0-408 header to indicate what has already been uploaded.
For additional ideas on the topic:
http://code.google.com/p/gears/wiki/ResumableHttpRequestsProposal
Someone implemented this using Google Gears on the client side and PHP on the server side (the latter you can easily port to Perl):
http://michaelshadle.com/2008/11/26/updates-on-the-http-file-upload-front/
http://michaelshadle.com/2008/12/03/updates-on-the-http-file-upload-front-part-2/
It's a shame that your clients can't use FTP uploads, since FTP already includes abilities like that. There is also "chunked transfer encoding" in HTTP; I don't know which Perl modules might already support it.
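For a sense of what the server half involves, here is a very rough Perl CGI sketch that assumes a custom client sending the chunk offset in a Content-Range-style header and the raw bytes in the request body; the header names, status line, and storage path are all assumptions, and a real implementation would need locking and validation:

    #!/usr/bin/perl
    # Hypothetical resumable-upload receiver.
    use strict;
    use warnings;

    my $dir    = '/var/uploads';                             # placeholder
    my $id     = $ENV{HTTP_X_UPLOAD_ID}   // 'anonymous';
    my $range  = $ENV{HTTP_CONTENT_RANGE} // '';
    my ($from) = $range =~ /bytes\s+(\d+)-/;                 # start offset
    $id =~ s/[^A-Za-z0-9_-]//g;                              # naive sanitising
    my $path = "$dir/$id.part";

    open my $fh, '+>>', $path or die "Cannot open $path: $!";
    my $have = -s $fh;

    # Only append if the client is sending exactly the bytes we are missing.
    if ( defined $from && $from == $have ) {
        binmode STDIN;
        binmode $fh;
        my $buf;
        print {$fh} $buf while read STDIN, $buf, 65536;
        $have = -s $fh;
    }

    # Tell the client how much we now hold so it can resume from there.
    print "Status: 308 Resume Incomplete\r\n";
    print 'Range: bytes=0-' . ( $have ? $have - 1 : 0 ) . "\r\n";
    print "Content-Type: text/plain\r\n\r\n";
    print "Received $have bytes\n";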

How do I use a Perl CGI locally without using curl and apache2?

I would like to submit a form to a CGI script locally (w3c-markup-validator), but it is too slow using curl and Apache. I want to use this CGI script more than 5,000 times from another script, and currently that takes more than one hour.
What should I do to hand the form data directly to the CGI script (I currently upload a file with curl)?
Edit: It seemed too complicated and time-consuming for what I needed, so instead I waited an hour and a half each time I needed to test my generated XHTML files.
In the end I didn't test any of the answers below, so the question will remain open.
Depending on the details of the script you might be able to create a fake CGI environment using HTTP::Request::AsCGI and then sourcing the CGI script with the "do" operator. But when it comes to speed and maintainability your best bet would be to factor the important part of the script's work into its own module, and rewrite the CGI as a client of that module. That way you don't have to invoke it as a CGI -- the batch job you're talking about now would be just another program using the same module to do the same work, but without CGI or the webserver environment getting in the way.
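A sketch of that fake-environment approach, assuming the CGI script sits at ./validator.cgi and takes a file upload in a field named uploaded_file (both names are placeholders; HTTP::Request::AsCGI and HTTP::Request::Common are on CPAN):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTTP::Request::Common qw(POST);
    use HTTP::Request::AsCGI;

    # Build the request the form would have produced; field names are made up.
    my $request = POST 'http://localhost/check',
        Content_Type => 'form-data',
        Content      => [ uploaded_file => ['page.xhtml'] ];

    my $c = HTTP::Request::AsCGI->new($request)->setup;

    # Run the CGI script in the current process against the faked environment.
    do './validator.cgi';
    die "Script failed: $@" if $@;

    $c->restore;
    my $response = $c->response;   # HTTP::Response built from captured output
    print $response->status_line, "\n";
    print $response->decoded_content;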
OK, I looked at the source code for this thing and it is not easy to extract the validation stuff from all the rest. So, here is what I would do.
First, ditch curl. Starting a new process for each file you want to validate is not a good idea. You are going to need to write a driver script that takes a list of URLs and submits them to your local server running on localhost. In fact, you might later want to parallelize this, because there will normally be a bunch of httpd processes alive anyway. But I am getting ahead of myself.
This driver script can use LWP, because all you are doing is submitting some data to the CGI script on localhost and storing/processing the results. You do not need full WWW::Mechanize functionality.
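A sketch of such a driver, assuming the validator is reachable at http://localhost/w3c-validator/check and accepts a multipart upload in a field named uploaded_file (adjust both to match your local install):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request::Common qw(POST);

    my $validator = 'http://localhost/w3c-validator/check';   # assumed URL
    my $ua        = LWP::UserAgent->new( keep_alive => 1 );

    # Files to validate are passed on the command line.
    for my $file (@ARGV) {
        my $response = $ua->request(
            POST $validator,
            Content_Type => 'form-data',
            Content      => [ uploaded_file => [$file], output => 'soap12' ],
        );
        unless ( $response->is_success ) {
            warn "$file: request failed (" . $response->status_line . ")\n";
            next;
        }
        # Store the raw result; parse it however you need afterwards.
        open my $out, '>', "$file.result" or die "Cannot write $file.result: $!";
        print {$out} $response->decoded_content;
        close $out;
    }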
As for the validator CGI script, you should configure that as a mod_perl registry script. Make sure you preload all necessary libraries.
This should boost documents processed per second from 1.3 to something more palatable.
CGI is a pretty simple API. All it does is read data either from an environment variable (for GET requests) or from STDIN (for POST requests). So all you need to do is set up the environment and call the script. See the docs for details.
If the script uses CGI.pm, you can run it from the command line by supplying the '-debug' switch (to CGI.pm, in the use statement). That will then allow you to send the POST variables on STDIN. You may have to tweak the script a little to make this work.
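A small illustration of both suggestions, assuming a CGI.pm-based script named check.cgi and using made-up parameter values:

    #!/usr/bin/perl
    # Tiny harness: fake a CGI environment and run the script in-process.
    use strict;
    use warnings;

    $ENV{GATEWAY_INTERFACE} = 'CGI/1.1';
    $ENV{REQUEST_METHOD}    = 'GET';
    $ENV{QUERY_STRING}      = 'name=value;mode=test';   # example data

    do './check.cgi';
    die "check.cgi failed: $@" if $@;

    # Alternatively, add the -debug switch inside check.cgi itself:
    #     use CGI qw(-debug);
    # and run it directly from the command line; CGI.pm will then read
    # name=value pairs from STDIN.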