How can I validate an image file in Perl? - perl

How would I validate that a jpg file is a valid image file. We are having files written to a directory using FTP, but we seem to be picking up the file before it has finished writing it, creating invalid images. I need to be able to identify when it is no longer being written to. Any ideas?

Easiest way might just be to write the file to a temporary directory and then move it to the real directory after the write is finished.
Or you could check here.
JPEG::Error
[arguments: none] If the file reference remains undefined after a call to new, the file is to be considered not parseable by this module, and one should issue some error message and go to another file. An error message explaining the reason of the failure can be retrieved with the Error method:
EDIT:
Image::TestJPG might be even better.

You're solving the wrong problem, I think.
What you should be doing is figuring out how to tell when whatever FTPd you're using is done writing the file - that way when you come to have the same problem for (say) GIFs, DOCs or MPEGs, you don't have to fix it again.
Precisely how you do that depends rather crucially on what FTPd on what OS you're running. Some do, I believe, have hooks you can set to trigger when an upload's done.
If you can run your own FTPd, Net::FTPServer or POE::Component::Server::FTP are customizable to do the right thing.
In the absence of that:
1) try tailing the logs with a Perl script that looks for 'upload complete' messages
2) use something like lsof or fuser to check whether anything is locking a file before you try and copy it.

Again looking at the FTP issue rather than the JPG issue.
I check the timestamp on the file to make sure it hasn't been modified in the last X (5) mins - that way I can be reasonably sure they've finished uploading
# time in seconds that the file was last modified
my $last_modified = (stat("$path/$file"))[9];
# get the time in secs since epoch (ie 1970)
my $epoch_time = time();
# ensure file's not been modified during the last 5 mins, ie still uploading
unless ( $last_modified >= ($epoch_time - 300)) {
# move / edit or what ever
}

I had something similar come up once, more or less what I did was:
var oldImageSize = 0;
var currentImageSize;
while((currentImageSize = checkImageSize(imageFile)) != oldImageSize){
oldImageSize = currentImageSize;
sleep 10;
}
processImage(imageFile);

Have the FTP process set the readonly flag, then only work with files that have the readonly flag set.

Related

Openpanel and symbol communication not working

I am trying to make a patch that plays audio when a bang is pressed. I have put a symbol so that I don't need to keep reimporting the file. However it works sometimes but not all the time.
A warning in the Pd console reads: Start requested with no prior open
However I have imported an audio file
Is there something that I have done wrong?
Use [trigger] to get the order-of-execution correct.
One problem is, that whenever you send a [1( to [readsf~] you must have sent an [open ...( message directly beforehand.
Even if you have just successfully opened a file, but then stopped it (with [0() or played it through (so it has been closed automatically), you have to send the filename again.
The real problem is, that your messages are out of order: you should never have a fan-out (that is: connecting a message outlet to multiple inlets), as this will create undefined behavior.
Use [trigger] to get the order-of-execution correct.
(Mastering [trigger] is probably the single most important step in learning to program Pd)

Why does this code work successfully with Enumerator.fromFile?

I wrote the file transferring code as follows:
val fileContent: Enumerator[Array[Byte]] = Enumerator.fromFile(file)
val size = file.length.toString
file.delete // (1) THE FILE IS TEMPORARY SO SHOULD BE DELETED
SimpleResult(
header = ResponseHeader(200, Map(CONTENT_LENGTH -> size, CONTENT_TYPE -> "application/pdf")),
body = fileContent)
This code works successfully, even if the file size is rather large (2.6 MB),
but I'm confused because my understanding about .fromFile() is a wrapper of fromCallBack() and SimpleResult actually reads the file buffred,but the file is deleted before that.
MY easy assumption is that java.io.File.delete waits until the file gets released after the chunk reading completed, but I have never heard of that process of Java File class,
Or .fromFile() has already loaded all lines to the Enumerator instance, but it's against the fromCallBack() spec, I think.
Does anybody knows about this mechanism?
I'm guessing you are on some kind of a Unix system, OSX or Linux for example.
On a Unix:y system you can actually delete a file that is open, any filesystem entry is just a link to the actual file, and so is a file handle which you get when you open a file. The file contents won't become unreachable /deleted until the last link to it is removed.
So: it will no longer show up in the filesystem after you do file.delete but you can still read it using the InputStream that was created in Enumerator.fromFile(file) since that created a file handle. (On Linux you actually can find it through the special /proc filesystem which, among other things, contains the filehandles of each running process)
On windows I think you will get an error though, so if it is to run on multiple platforms you should probably check test your webapp on windows as well.

Perl CGI gets parameters from a different request to the current URL

This is a weird one. :)
I have a script running under Apache 1.3, with Apache::PerlRun option of mod_perl. It uses the standard CGI.pm module. It's a regularly accessed script on a busy server, accessed over https.
The URL is typically something like...
/script.pl?action=edit&id=47049
Which is then brought into Perl the usual way...
my $action = $cgi->param("action");
my $id = $cgi->param("id");
This has been working successfully for a couple of years. However we started getting support requests this week from our customers who were accessing this script and getting blank pages. We already had a line like the following that put the current URL into a form we use for customers to report an issue about a page...
$cgi->url(-query => 1);
And when we view source of the page, the result of that command is the same URL, but with an entirely different query string.
/script.pl?action=login&user=foo&password=bar
A query string that we recognise as being from a totally different script elsewhere on our system.
However crazy it sounds, it seems that when users are accessing a URL with a query string, the query string that the script is seeing is one from a previous request on another script. Of course the script can't handle that action and outputs nothing.
We have some automated test scripts running to see how often this happens, and it's not every time. To throw some extra confusion into the mix, after an Apache restart, the problem seems to initially disappear completely only to come back later. So whatever is causing it is somehow relieved by a restart, but we can't see how Apache can possibly take the request from one user and mix it up with another.
This, it appears, is an interesting combination of Apache 1.3, mod_perl 1.31, CGI.pm and Apache::GTopLimit.
A bug was logged against CGI.pm in May last year: RT #57184
Which also references CGI.pm params not being cleared?
CGI.pm registers a cleanup handler in order to cleanup all of it's cache.... (line 360)
$r->register_cleanup(\&CGI::_reset_globals);
Apache::GTopLimit (like Apache::SizeLimit mentioned in the bug report) also has a handler like this:
$r->post_connection(\&exit_if_too_big) if $r->is_main;
In pre mod_perl 1.31, post_connection and register_cleanup appears to push onto the stack, while in 1.31 it appears as if the GTopLimit one clobbers the CGI.pm entry. So if your GTopLimit function fires because the Apache process has got to large, then CGI.pm won't be cleaned up, leaving it open to returning the same parameters the next time you use it.
The solution seems to be to change line 360 of CGI.pm to;
$r->push_handlers( 'PerlCleanupHandler', \&CGI::_reset_globals);
Which explicitly pushes the handler onto the list.
Our restart of Apache temporarily resolved the problem because it reduced the size of all the processes and gave GTopLimit no reason to fire.
And we assume it has appeared over the past few weeks because we have increased the size of the Apache process either through new developments which included something that wasn't before.
All tests so far point to this being the issue, so fingers crossed it is!

NSStream Event Timer - iPhone

All,
Is there a way to have a minimum time to keep a stream open before it closes? For some reason, my stream is closing prematurely which is causing errors elsewhere. I need to keep it open to make sure ALL of the data is gathered, and then it can run the other code.
Thanks,
James
In the case that someone falls upon this question later, I ended up creating nested if statements to pull it off.
Basically, there is one statement that checks if the end tag is not found (for my code, the END of the ENTIRE data that I should be receiving is </SessionData> - So, I did if([outputString rangeOfString:#"</SessionData>"].location == NSNotFound). I created a string called totalOutput that would have the outputString added onto the end of totalOutput until the </SessionData> is found.
If anyone ever needs help, just go ahead and comment on here and I can give more information.

How can I get file size in Perl before processing an upload request?

I want to get file size I'm doing this:
my $filename=$query->param("upload_file");
my $filesize = (-s $filename);
print "Size: $filesize ";`
Yet it is not working. Note that I did not upload the file. I want to check its size before uploading it. So to limit it to max of 1 MB.
You can't know the size of something before uploading. But you can check the Content-Length request header sent by the browser, if there is one. Then, you can decide whether or not you want to believe it. Note that the Content-Length will be the length of the entire request stream, including other form fields, and not just the file upload itself. But it's sufficient to get you a ballpark figure for conformant clients.
Since you seem to be running under plain CGI, you should be able to get the request body length in $ENV{CONTENT_LENGTH}.
Also want to sanity check against possibly already having post max set (from perldoc CGI):
$CGI::POST_MAX
If set to a non-negative integer, this variable puts a ceiling on the size of
POSTings, in bytes. If CGI.pm detects a POST that is greater than the ceiling,
it will immediately exit with an error message. This value will affect both
ordinary POSTs and multipart POSTs, meaning that it limits the maximum size of
file uploads as well. You should set this to a reasonably high value, such as
1 megabyte.
The uploaded file is stashed in a tmp location on the server when the form is submitted, check the file size there.
Supply the value for $field.
my $upload_filehandle = $query->upload($field);
my $tmpfilename = $query->tmpFileName($upload_filehandle);
my $file_size = (-s $tmpfilename);
This has nothing to do with Perl.
You are trying to read the filesize of a file on the user's computer using commands that read files on your server, what you want can't be done using Perl.
This is something that has to be done in the browser, and looking briefly at these questions it's either very hard or impossible.
Your best bet is to allow the user to start the upload and abort if the file is too big.
If you want to check before you process the request, you might be better off checking on the web page that triggers the request. I don't think the web browser can do it on it's own, but if you don't mind Flash, there are many Flash upload tools that can check things like size (as well as file types) and prevent uploading.
A good one to start with is the YUI Uploader. Lots more here: What is the best multiple file JavaScript / Flash file uploader?
Obviously you would want to check on the server side too, but by the time the user has started sending the request to the server, you are already using up your CPU cycles and bandwidth.
Thanks everyone for your replies; I just found out why $filesize = (-s $filename); was not working before, it is due that I was checking file size while sending Ajax request and not while re submitting the page.That's why I was having size to be zero. I fixed that to submit the page and it worked. Thanks.
Just read this post but while checking the content-length is a good approximate pre-check you could also save the file to temporary folder and then perform any kind of check on it. If it doesn't meet your criteria just delete and don't send it to it's final destination.
Look at the perl documentation for file stats -X - perldoc.perl.org and stat-perldoc.perl.org. Also, you can look at this upload script which is doing the similar thing what you are trying to do.