Perl: Uploading a file from a web page - perl

I am trying to upload a file from a web page without using:
my $query = CGI->new;
my $filename = $query->param("File");
my $upload_filehandle = $query->upload("File");
My form has several input fields, and only one of them is a file name. When I parse the form, I parse all of the input fields in one pass. This means that I already have the filename without using my $filename = $query->param("File"); but, as far as I can tell, it also means I can't use my $upload_filehandle = $query->upload("File"); to do the actual uploading.
Any advice is appreciated.
Regards.

How are you parsing the input fields? More importantly, why are you using CGI and then reimplementing some of its functionality?
This should still work:
my %args = how_you_are_parsing_the_url();
my $query = CGI->new; #this parses the url as well
my $upload_filehandle = $query->upload("File"); #assuming the input is named File
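As a rough sketch of how the two approaches can coexist, the script below reads every form field in one pass with Vars and still asks CGI for the upload handle. The upload directory and the safe_name helper are assumptions made for illustration, not part of the original question, and the file field is assumed to be named File.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use File::Basename qw(basename);

# Hypothetical upload directory; it must exist and be writable.
my $upload_dir = '/tmp/uploads';

my $query  = CGI->new;
my %fields = $query->Vars;    # every form field, parsed in one pass

# Hypothetical helper: reduce a client-supplied name to a safe base name.
sub safe_name {
    my ($name) = @_;
    $name =~ s{\\}{/}g;                 # treat Windows separators like Unix ones
    $name = basename($name);            # drop any directory part
    $name =~ s/[^A-Za-z0-9._-]/_/g;     # replace anything unexpected
    return $name;
}

print $query->header('text/plain');

my $fh = $query->upload('File');        # undef if no file was sent
if (defined $fh) {
    my $target = "$upload_dir/" . safe_name($fields{File});
    my $in     = $fh->handle;           # IO::Handle-compatible view of the upload
    open my $out, '>', $target or die "Cannot write $target: $!";
    binmode $out;
    my $buffer;
    while ($in->read($buffer, 8192)) {
        print {$out} $buffer;
    }
    close $out or die "Cannot close $target: $!";
    print "Saved upload as $target\n";
}
else {
    print "No file uploaded\n";
}
```

The form must be submitted with enctype="multipart/form-data", otherwise upload returns undef and only the file name arrives.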

Related

Session data not being updated by script using that session

Thinking I have narrowed down the issue, here is a better question.
My script, 'index', opens an existing session - because a session does exist from when it was created by a login script.
It does correctly use values from that session in the page output so evidently, it's accessing the session from either memory or the server's sessions_storage dir.
The script is written so as to add two values to the session but, that's not actually happening. And this is where it gets annoyingly frustrating.
After running the script, I check the session file in filezilla. Those two values do not exist. However, if I output a dump of the session, at the bottom of my script, the two values show in that output.
If I delete the session from my browser and then reload the page, those two values and a few others are showing in the new session file but, of course, the other values stored from previous files (eg login) are missing.
This I have worked out thus far:-
All other files (from login thru to this 'index') are creating and/or storing and retrieving to/from the session without issue.
'Index' script is not adding to the existing session file.
New session forced by deleting session cookie from browser shows the data is being stored as expected in the correct server dir.
Using flush(); at the end of my script (or anywhere after session creation/loading); has made no difference.
Can any of you with fresh eyes tell me what's (not) going on?
my $sessions_dir_location = '/' . $var . '/' . $www . '/' . $vhosts . '/' . $domain . '/name_of_sessions_storage_dir/';
my $session = new CGI::Session(undef, $cgi, {Directory=>"$sessions_dir_location"}) or die CGI::Session->errstr;
my $session_id = $session->id();
$session->flush();
my %vars = $cgi->Vars;
my $business_id = $vars{'business_id'};
print qq(<pre>bid=$business_id</pre>); #successful
$session->param('business_id', $business_id); #unsuccessful
print qq(<pre>session_id = $session_id); #successful
print $session->dump; # shows the business_id value as being stored.
print qq(</pre>);
The following works for me; it increases the session parameter business_id by one on each call:
use strict;
use warnings;
use CGI;
use CGI::Session;
my $cgi = CGI->new();
my $sessions_dir_location = "/tmp/sessions";
# Data Source Name, defaults to "driver:file;serializer:default;id:md5"
my $dsn = undef;
# new() : returns new session object, or undef on failure. Error message is
# accessible through errstr() - class method.
my $session = CGI::Session->new(
$dsn, $cgi, {Directory=>"$sessions_dir_location"}) or die CGI::Session->errstr();
my %vars = $cgi->Vars;
my $cgi_bsid = $vars{business_id};
my $session_bsid = $session->param("business_id");
my $new_bsid = $cgi_bsid // $session_bsid // 0;
$new_bsid++;
$session->param('business_id', $new_bsid);
$session->flush() or die CGI::Session->errstr();
# CGI::Session will use this cookie to identify the user at his/her next request
# and will be able to load his/her previously stored session data.
print $session->header();
my $session_id = $session->id();
print $cgi->start_html();
print join "<br>",
qq(session_id=$session_id),
qq(cgi_bsid=$cgi_bsid),
qq(session_bsid=$session_bsid),
qq(new_id=$new_bsid);
print $cgi->end_html();

Including an embedded image in an Outlook HTML email via Perl

I need to generate an HTML email with an embedded banner image. It must go through an Outlook 2007 mail client. I tried to base64-encode the image and put it inline (it looked good), but Outlook would not send the email. I have combed through many different articles (in various programming languages) that have gotten me to this point, but it is still not working. This code creates the email and attaches the image, but the image is not displayed.
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Outlook';
my $oMailer = new Win32::OLE('Outlook.Application') or
die "Unable to start an Outlook instance: $!\n";
my $oEmail = $oMailer->CreateItem(0) or
die "Unable to create mail item: $!\n";
$oEmail->{'To'} = 'me@here.org';
$oEmail->{'Subject'} = "Embedded image test";
$oEmail->{'BodyFormat'} = olFormatHTML;
$oEmail->{'HTMLBody'} = "<html><body><img src=\"cid:banner.jpg\"></body></html>";
my $attachments = $oEmail->Attachments();
my $bannerAttachment = $attachments->Add('C:/test/banner.jpg', olEmbeddeditem);
$bannerAttachment->PropertyAccessor->SetProperty(
"http://schemas.microsoft.com/mapi/proptag/0x3712001E", "banner.jpg");
$oEmail->save();
(BTW, I removed all the Win32::OLE->LastError() checks before posting because none of them failed anyway.)
When adding the attachment, it does not set the attachment Type to olEmbeddeditem (5). I don't know whether this is relevant to the problem.
The SetProperty does not set the value either. That is supposed to set the Content ID (cid) that is referenced in the img src in the HTML. I used the below code to GetProperty and it returns an empty string.
my $CIDvalue = $bannerAttachment->PropertyAccessor->GetProperty(
"http://schemas.microsoft.com/mapi/proptag/0x3712001E");
print ">>>CIDvalue = $CIDvalue\n";
So close I can taste it!
A careful reading of the Perl docs for Win32::OLE revealed that the module has its own SetProperty method, which was apparently being called instead of the Microsoft one I thought I was calling. Changing the code to:
$bannerAttachment->PropertyAccessor->Invoke('SetProperty', "http://schemas.microsoft.com/mapi/proptag/0x3712001E", "banner.jpg");
made it work and there was great rejoicing :)

Perl WWW::Mechanize Web Spider. How to find all links

I am currently attempting to create a Perl webspider using WWW::Mechanize.
What I am trying to do is create a webspider that will crawl the whole site of the URL (entered by the user) and extract all of the links from every page on the site.
What I have so far:
use strict;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
my $urlToSpider = $ARGV[0];
$mech->get($urlToSpider);
print "\nThe url that will be spidered is $urlToSpider\n";
print "\nThe links found on the url's starting page\n";
my @foundLinks = $mech->find_all_links();
foreach my $linkList (@foundLinks) {
    # Prefix relative links with the starting URL
    unless ($linkList->[0] =~ m{^https?://}i) {
        $linkList->[0] = $urlToSpider . $linkList->[0];
    }
    print "$linkList->[0]";
    print "\n";
}
What it does:
1. At present it will extract and list all links on the starting page
2. If the links found are in /contact-us or /help format it will add 'http://www.thestartingurl.com' to the front of it so it becomes 'http://www.thestartingurl.com/contact-us'.
The problem:
At the moment it also finds links to external sites, which I do not want it to do. E.g. if I want to spider 'http://www.tree.com', it will find links such as http://www.tree.com/find-us.
However it will also find links to other sites like http://www.hotwire.com.
How do I stop it finding these external urls?
After finding all the URLs on the page, I then also want to save this new list of internal-only links to a new array called @internalLinks, but cannot seem to get it working.
Any help is much appreciated, thanks in advance.
This should do the trick:
my @internalLinks = $mech->find_all_links(url_abs_regex => qr/^\Q$urlToSpider\E/);
If you don't want CSS links, try:
my @internalLinks = $mech->find_all_links(url_abs_regex => qr/^\Q$urlToSpider\E/, tag => 'a');
Also, the regex you're using to add the domain to any relative links can be replaced with:
print $linkList->url_abs();
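To see what the \Q...\E prefix match does on its own, here is a minimal core-Perl sketch that filters a plain list of absolute URLs. The internal_links function and the sample URLs are hypothetical; the extra boundary check after the prefix is an addition that goes beyond the regex above, so that a host such as www.tree.com.example.org is not mistaken for an internal link.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the URLs that live under $base.
sub internal_links {
    my ($base, @urls) = @_;
    $base =~ s{/+\z}{};    # ignore trailing slashes on the base URL
    # \Q...\E quotes regex metacharacters in the base (e.g. the dots);
    # the group requires the prefix to end at a path, query, or fragment
    # boundary, or at the end of the string.
    return grep { m{^\Q$base\E(?:[/?#]|\z)} } @urls;
}

my @found = (
    'http://www.tree.com/find-us',
    'http://www.hotwire.com',
    'http://www.tree.com.example.org/',
    'http://www.tree.com',
);

my @internalLinks = internal_links('http://www.tree.com/', @found);
print "$_\n" for @internalLinks;
```

With WWW::Mechanize, the same pattern could be passed as url_abs_regex, or the function could be fed the result of calling url_abs on each link.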

What does this Lucene-related code actually do?

#!/usr/bin/perl
use Plucene::Document;
use Plucene::Document::Field;
use Plucene::Index::Writer;
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Search::HitCollector;
use Plucene::Search::IndexSearcher;
use Plucene::QueryParser;
my $content = "I am the law";
my $doc = Plucene::Document->new;
$doc->add(Plucene::Document::Field->Text(content => $content));
$doc->add(Plucene::Document::Field->Text(author => "Philip Johnson"));
my $analyzer = Plucene::Analysis::SimpleAnalyzer->new();
my $writer = Plucene::Index::Writer->new("my_index", $analyzer, 1);
$writer->add_document($doc);
undef $writer; # close
my $searcher = Plucene::Search::IndexSearcher->new("my_index");
my @docs;
my $hc = Plucene::Search::HitCollector->new(collect => sub {
    my ($self, $doc, $score) = @_;
    push @docs, $searcher->doc($doc);
});
$searcher->search_hc($query => $hc);
Try as I may, I don't understand what this code does. I understand the familiar Perl syntax and what's going on on that end...but what is a Lucene Document, Index::Writer - etc.? Most importantly, when I run this code I expect something to be generated...yet I see nothing.
I know what an Analyzer is...thanks to this doc linked to in CPAN: http://onjava.com/pub/a/onjava/2003/01/15/lucene.html?page=2. But I am just not getting why I run this code and it doesn't seem to DO anything...
Lucene is a search engine designed to search huge amounts of text very fast.
My Perl is not strong, but this is what I understand from the Lucene objects:
my $content = "I am the law";
my $doc = Plucene::Document->new;
$doc->add(Plucene::Document::Field->Text(content => $content));
$doc->add(Plucene::Document::Field->Text(author => "Philip Johnson"));
This part creates a new document object and adds two text fields to it, content and author, in preparation for adding it to a Lucene index as searchable data.
my $analyzer = Plucene::Analysis::SimpleAnalyzer->new();
my $writer = Plucene::Index::Writer->new("my_index", $analyzer, 1);
$writer->add_document($doc);
undef $writer; # close
This part creates the index files and adds the previously created document to that index. At this point, you should have a "my_index" folder in your application directory, containing several index files that hold the document's data as searchable text.
my $searcher = Plucene::Search::IndexSearcher->new("my_index");
my @docs;
my $hc = Plucene::Search::HitCollector->new(collect => sub {
    my ($self, $doc, $score) = @_;
    push @docs, $searcher->doc($doc);
});
$searcher->search_hc($query => $hc);
This part attempts to search the index created above for the same document data you just used to build it. Presumably, you'll have your search results in @docs at this point, which you might want to display to the user (though this sample does not).
This seems to be a "hello world" application for Lucene usage in Perl. In real-life applications, I don't see a scenario where you would create the index and then search it from the same piece of code.
Where did you get this code from? It is a copy of the code in the Synopsis at the start of the Plucene POD documentation.
I guess it was an attempt by someone to begin learning about Plucene. The code in a module's synopsis isn't necessarily meant to achieve something useful on its own.
As the documentation you refer to says, Lucene is a Java library that adds text indexing and searching capabilities to an application. It is not a complete application that one can just download, install, and run.
Where did you get the idea that you should run the code you show?

What's the best method to generate Multi-Page PDFs with Perl and PDF::API2?

I have been using the PDF::API2 module to generate a PDF. I work at a warehousing company and we are trying to switch from text packing slips to PDF packing slips. A packing slip lists the items needed for a single order. Currently my program generates a single-page PDF, and that was all working fine. But now I realize that the PDF will need to be multiple pages if there are more than 30 items in an order. I was trying to think of an easy(ish) way to do that, but couldn't come up with one. The only thing I could think of involves creating another page and having logic that redefines the coordinates of the line items if there are multiple pages. So I was trying to see if there was a different method, or something I was missing, that could help, but I wasn't really finding anything on CPAN.
Basically, I need to create a single-page PDF unless there are more than 30 items, in which case it will need to be multiple pages.
I hope that made sense and any help at all would be greatly appreciated as I am relatively new to programming.
Since you already have the code working for one-page PDFs, changing it to work for multi-page PDFs shouldn't be too hard.
Try something like this:
use PDF::API2;
sub create_packing_list_pdf {
    my @items = @_;
    my $pdf = PDF::API2->new();
    my $page = _add_pdf_page($pdf);
    my $max_items_per_page = 30;
    my $item_pos = 0;
    while (defined(my $item = shift(@items))) {
        $item_pos++;
        # Create a new page, if needed
        if ($item_pos > $max_items_per_page) {
            $page = _add_pdf_page($pdf);
            $item_pos = 1;
        }
        # Add the item at the appropriate height for that position
        # (you'll need to declare $base_height and $line_height)
        my $y = $base_height - ($item_pos - 1) * $line_height;
        # Your code to display the line here, using $y as needed
        # to get the right coordinates
    }
    return $pdf;
}
sub _add_pdf_page {
    my $pdf = shift();
    my $page = $pdf->page();
    # Your code to display the page template here.
    #
    # Note: You can use a different template for additional pages by
    # looking at e.g. $pdf->pages(), which returns the page count.
    #
    # If you need to include a "Page 1 of 2", you can pass the total
    # number of pages in as an argument, computed before the loop
    # empties @items:
    #   int((scalar(@items) + $max_items_per_page - 1) / $max_items_per_page)
    return $page;
}
The main thing is to split up the page template from the line items so you can easily start a new page without having to duplicate code.
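The page arithmetic can be checked separately from any PDF drawing. The sketch below uses only core Perl; the function names and the values of $base_height and $line_height are placeholders chosen for illustration, not values taken from the question.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

my $max_items_per_page = 30;
my $base_height        = 700;   # assumed y coordinate of the first line
my $line_height        = 20;    # assumed distance between lines

# Number of pages needed for a given number of items (at least one page).
sub page_count {
    my ($item_count) = @_;
    return 1 if $item_count < 1;
    return ceil($item_count / $max_items_per_page);
}

# Page number (1-based) and y coordinate for the item at a zero-based index.
sub item_position {
    my ($index) = @_;
    my $page = int($index / $max_items_per_page) + 1;
    my $row  = $index % $max_items_per_page;
    my $y    = $base_height - $row * $line_height;
    return ($page, $y);
}

for my $n (1, 30, 31, 60, 61) {
    printf "%d items -> %d page(s)\n", $n, page_count($n);
}

my ($page, $y) = item_position(30);   # the 31st item
print "item 31 goes on page $page at y=$y\n";
```

Rounding up with ceil keeps an order of exactly 30 items on one page, whereas adding 1 to a truncated division would report two.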
PDF::API2 is low-level. It doesn't have most of what you would consider necessary for a document, such as margins, blocks, and paragraphs. Because of this, I'm afraid you're going to have to do things the hard way. You may want to look at PDF::API2::Simple. It might meet your criteria and it's simple to use.
I use PDF::FromHTML for some similar work. Seems to be a reasonable choice, I guess I am not too big on positioning by hand.
The simplest method is to use PDF::API2::Simple:
use PDF::API2::Simple;
my @content;    # the lines to print
my $pdf = PDF::API2::Simple->new(file => $name);    # $name is the output file path
$pdf->add_font('Courier');
$pdf->add_page();
foreach my $line (@content)
{
    $pdf->text($line, autoflow => 'on');
}
$pdf->save();