escape HTML by default in Template Toolkit - perl

Can I somehow configure Template Toolkit so that:
[% foo %]
does what you would now need to say:
[% foo | html %]
that is, escape HTML in foo? And do something else, like:
[% foo | noHtml %]
if I don't want to escape?

Came across your question when trying to answer the same question for myself.
http://search.cpan.org/~mithaldu/Template-AutoFilter/ seems to do what we want, but it does require installing another module. I'm going to give it a try anyway.

I guess you could create your own stash by extending Template::Stash so that it would by default escape variables.
That said I think it is not a good idea. Better to stick with default behaviour and refrain from custom modifications as they are certainly confusing.

I spent some time on this problem recently. Here is the outline of my solution.
I created a new class called HtmlSafe which contains strings that can be safely written to WWW client without a security vulnerability. The idea was that the functions that generate HTML tags return HtmlSafe objects, and variables out of the box are not HtmlSafe. Whatever creates a HtmlSafe also has vouched for the safety for the string in question. Concatenating a non-HTML-safe string with HTML-safe string causes the non-HTML-safe string to be escaped through CGI::escapeHTML, and then joined with the HTML-safe string. Concatenating another instance of HtmlSafe to HtmlSafe just joins the strings in question without escaping. I ended up using an overload so I could redefine the . operator for HtmlSafe class.
Armed with this thing, I gave a $template->process() function an $output variable that was actually a sub that invoked a concatenation with HtmlSafe, like this:
my $output = HtmlSafe->new("");
$template->process($vars, sub { $output .= $_[0]; });
return $output->unwrap(); # remove HtmlSafe and return underlying string
We are almost ready with the HtmlSafe TT2. The big changed I had to actually do was to change the textblock() function in Template::Directive, which is used by the Template::Parser to generate HtmlSafe instances of whatever text block it tried to emit. These appear to correspond with the TEXT nodes of the parsed template, so just doing
package MyDirective;
use base "Template::Directive";
sub textblock { my $self = shift; return "$Template::Directive::OUTPUT HtmlSafe->new(" . $self->text(#_) . ")"; }
which I gave to parser like so:
my $parser = Template::Parser->new({
FACTORY => "MyDirective",
});
In addition to this, I defined a "none" filter for TT2 which simply wraps whatever is defined as HtmlSafe, so you can output raw HTML if you have to. This avoids escaping things. The default "html" filter is no-op, because anything concatenated to HtmlSafe gets escaped now anyway.

Related

Using Powershell and HTMLBody.Replace how do I replace values inside an existing table?

I have an existing email template file for Outlook with To, CC, Subject and Body prefilled.
I can replace the values I need on the subject just fine, however, when it comes to the HTMLBody part, it only replaces values outside the table; I've tested this by putting all 15 placeholders outside the table.
In Powershell, I defined an array with the items that will be replaced and another that reads the values from a JSON file, then I loop through both in order to replace the values on the HTMLBody.
This is the code in question:
$emailToreplaceValues=#(
"[DailyReportDate]",
"[DailyReportSuccess]",
"[DailyReportFailure]",
"[DailyReportFailureRate]"
)
$newValues=#(
$valuesJSON.DailyReport.Date,
$valuesJSON.DailyReport.Success,
$valuesJSON.DailyReport.Failure,
$dailyReportFailureRate
)
$reportEmail = $outlookObj.CreateItemFromTemplate("$emailTemplate")
$reportEmail.Subject = $reportEmail.Subject.Replace("[date]", $date)
for($i=0;$i -le $newValues.Count;$i++) {
$reportEmail.HTMLBody = $reportEmail.HTMLBody.Replace($emailToreplaceValues[$i], $newValues[$i])
}
There's more values but for the sake of brevity, I only included a few of the values, from my understanding, the issue is that some of those values are inside a HTML table cell but I don't know if I can access the table or cells directly.
Firstly, do not use MailItem.HTMLBody property as variable - it is expensive to set and read, and it might not be the same HTML you set as Outlook performs some massaging and validation. Introduce an explicit variable, set it to the value of HTMLBody, do all your string replacements in a loop using that variable, then set the MailItem.HTMLBody property once.
You can also try to output the value of that variable to make sure the old values to be replaced are really there and are not broken by HTML formatting or encoding.
For the sake of future reference, the only way I was able to fix this, was by grabbing the html code off the email that I based my email template off.
I organized it so that any tags I want replaced are in their own line without anything else other than the spaces for indentation, then defined it as a variable that goes through the replace cycle and gets assigned to the MailItem.HTMLBody property after the replace cycle.

Handle POST data sent as array

I have an html form which sends a hidden field and a radio button with the same name.
This allows people to submit the form without picking from the list (but records a zero answer).
When the user does select a radio button, the form posts BOTH the hidden value and the selected value.
I'd like to write a perl function to convert the POST data to a hash. The following works for standard text boxes etc.
#!/usr/bin/perl
use CGI qw(:standard);
sub GetForm{
%form;
foreach my $p (param()) {
$form{$p} = param($p);
}
return %form;
}
However when faced with two form inputs with the same name it just returns the first one (ie the hidden one)
I can see that the inputs are included in the POST header as an array but I don't know how to process them.
I'm working with legacy code so I can't change the form unfortunately!
Is there a way to do this?
I have an html form which sends a hidden field and a radio button with
the same name.
This allows people to submit the form without picking from the list
(but records a zero answer).
That's an odd approach. It would be easier to leave the hidden input out and treat the absence of the data as a zero answer.
However, if you want to stick to your approach, read the documentation for the CGI module.
Specifically, the documentation for param:
When calling param() If the parameter is multivalued (e.g. from multiple selections in a scrolling list), you can ask to receive an array. Otherwise the method will return the first value.
Thus:
$form{$p} = [ param($p) ];
However, you do seem to be reinventing the wheel. There is a built-in method to get a hash of all paramaters:
$form = $CGI->new->Vars
That said, the documentation also says:
CGI.pm is no longer considered good practice for developing web applications, including quick prototyping and small web scripts. There are far better, cleaner, quicker, easier, safer, more scalable, more extensible, more modern alternatives available at this point in time. These will be documented with CGI::Alternatives.
So you should migrate away from this anyway.
Replace
$form{$p} = param($p); # Value of first field named $p
with
$form{$p} = ( multi_param($p) )[-1]; # Value of last field named $p
or
$form{$p} = ( grep length, multi_param($p) )[-1]; # Value of last field named $p
# that has a non-blank value

Exim getting random credential in exim.conf

I have been trying to get perl subroutine value and substitution to get the required part of string from randomips subroutine in exim.conf. However when i use string substitution i get error as follow:
Here is what I am trying to achieve
I am trying to split string by colon and get first occurrence as "interface". I'll be using second occurrence as the "helo_data.
exim.pl
sub randomhosts {
#inet = ("x.x.x.1:hostname1.domain.com","x.x.x.2:hostname2.domain.com","x.x.x.3:hostname3.domain.com"
);
return $inet[int rand($#inet+1)];
}
exim.conf
dkim_remote_smtp:
driver = smtp
interface = "${perl{randomhosts}%:*}"
helo_data = "${sender_address_domain}"
Error I get is as follow:
"failed to expand "interface" option for dkim_remote_smtp transport: missing '}' after 'perl'".
Probably the syntax.
Any help?
The code that you are trying to copy was written by someone who doesn't know much about Perl. It includes this line:
return $inet[int rand($#inet+1)];
A Perl programmer would write this as
return $inet[rand #inet];
I think there are a couple of issues here - one with your Exim syntax and one with your Perl syntax.
Exim is giving you this error:
failed to expand "interface" option for dkim_remote_smtp transport: missing '}' after 'perl'
I don't know anything about calling Perl from Exim, but this page mentions a syntax like ${perl{foo}} (which is similar to the one used in the page you are copying from) and one like ${perl{foo}{argument}} for calling a subroutine and passing it an argument. Nowhere does it mention syntax like yours:
${perl{randomhosts}%:*}
I'm not sure where you have got that syntax from, but it seems likely that this is what is causing your first error.
In a comment, you say
I am stying to get first part of string before colon for each random array value for "interface" and part after colon for "helo_data"
It seems to me that Exim doesn't support this requirement. You would need to call the function twice to get the two pieces of information that you require. You might be able to do this in the Perl using something like state variables - but it would be far more complex than the code you currently have there.
Secondly, your Perl code has a syntax error, so even if Exim was able to call your code, it wouldn't work.
The code you're copying sets up #inet like this:
#inet = ("x.x.x.1", "x.x.x.2", "x.x.x.3", "x.x.x.4");
Your equivalent code is this:
#inet = (
"x.x.x.1:hostname1.domain.com",
"x.x.x.2:hostname2.domain.com,
x.x.x.3:hostname3.domain.com
);
I've reformatted it, to make the problems more obvious. You are missing a number of quote marks around the elements of the array. (Note: I see that while I have been writing this answer, you have fixed that.)
Update: Ok, here is some code to put into exim.pl that does what you want.
use feature qw[state];
sub randomhosts {
state $current;
my #inet = (
"x.x.x.1:hostname1.domain.com",
"x.x.x.2:hostname2.domain.com",
"x.x.x.3:hostname3.domain.com"
);
if ($_[0] eq 'generate') {
shift;
#{$current}{qw[ip host]} = split /:/, $inet[rand #inet];
}
return $current->{$_[0]};
}
It generates a new ip/host pair if its first argument is 'generate'. It will then return either the hostname or the ip address from the generated pair. I think you can probably call it from your Exim config file like this:
dkim_remote_smtp:
driver = smtp
interface = "${perl{randomhosts}{generate}{ip}}"
helo_data = "${perl{randomhosts}{host}}"
But I'm no expert in Exim, so that syntax might need tweaking.
First I would like to note I have not worked with exim so I cannot say what exactly you are trying to do and why you have done things exactly so.
In the link you posted, a method called 'randinet' is added to exim.pl and the interface line in exim.conf is replaced by
interface = "${perl{randinet}}"
You have implemented a 'randomhosts' method and replaced the interface line with
interface = "${perl{randomhosts}%:*}"
Now the parser complains about not finding the closing bracket. That is likely due to the symbols you felt free to add but the parser does not have the freedom to ignore.
I suggest you try
interface = "${perl{randomhosts}}"

Perl XML::SAX - character() method error

I'm new to using Perl XML::SAX and I encountered a problem with the characters event that is triggered. I'm trying to parse a very large XML file using perl.
My goal is to get the content of each tag (I do not know the tag names - given any xml file, I should be able to crack the record pattern and return every record with its data and tag like Tag:Data).
While working with small files, everything is ok. But when running on a large file, the characters{} event does partial reading of the content. There is no specific pattern in the way it cuts down the reading. Sometimes its the starting few characters of data and sometimes its last few characters and sometimes its just one letter from the actual data.
The Sax Parser is:
$myhandler = MyFilter->new();
$parser = XML::SAX::ParserFactory->parser(Handler => $myhandler);
$parser->parse_file($filename);
And, I have written my own Handler called MyFilter and overridding the character method of the parser.
sub characters {
my ($self, $element) = #_;
$globalvar = $element->{Data};
print "content is: $globalvar \n";
}
Even this print statement, reads the values partially at times.
I also tried loading the Parsesr Package before calling the $parser->parse() as:
$XML::SAX::ParserPackage = "XML::SAX::ExpatXS";
Stil doesn't work. Could anyone help me out here? Thanks in advance!
Sounds like you need XML::Filter::BufferText.
http://search.cpan.org/dist/XML-Filter-BufferText/BufferText.pm
From the description "One common cause of grief (and programmer error) is that XML parsers aren't required to provide character events in one chunk. They can, but are not forced to, and most don't. This filter does the trivial but oft-repeated task of putting all characters into a single event."
It's very easy to use once you have it installed and will solve your partial character data problem.

HOP::Lexer with overlapping tokens

I'm using HOP::Lexer to scan BlitzMax module source code to fetch some data from it. One particular piece of data I'm currently interested in is a module description.
Currently I'm searching for a description in the format of ModuleInfo "Description: foobar" or ModuleInfo "Desc: foobar". This works fine. But sadly, most modules I scan have their description defined elsewhere, inside a comment block. Which is actually the common way to do it in BlitzMax, as the documentation generator expects it.
This is how all modules have their description defined in the main source file.
Rem
bbdoc: my module description
End Rem
Module namespace.modulename
This also isn't really a problem. But the line after the End Rem also contains data I want (the module name). This is a problem, since now 2 definitions of tokens overlap each other and after the first one has been detected it will continue from where it left off (position of content that's being scanned). Meaning that the token for the module name won't detect anything.
Yes, I've made sure my order of tokens is correct. It just doesn't seem possible (somewhat understandable) to move the cursor back a line.
A small piece of code for fetching the description from within a Rem-End Rem block which is above a module definition (not worked out, but working for the current test case):
[ 'MODULEDESCRIPTION',
qr/[ \t]*\bRem\n(?:\n|.)*?\s*\bEnd[ \t]*Rem\nModule[\s\t]+/i,
sub {
my ($label, $value) = #_;
$value =~ /bbdoc: (.+)/;
[$label, $1];
}
],
So in my test case I first scan for a single comment, then the block above (MODULEDESCRIPTION), then a block comment (Rem-End Rem), module name, etc.
Currently the only solution I can think of is setup a second lexer only for the module description, though I wouldn't prefer that. Is what I want even possible at all with HOP::Lexer?
Source of my Lexer can be found at https://github.com/maximos/maximus-web/blob/develop/lib/Maximus/Class/Lexer.pm
I've solved it by adding (a slightly modified version of) the MODULEDESCRIPTION. Inside the subroutine I simply filter out the module name and return an arrayref with 4 elements, which I later on iterate over to create a nice usable array with tokens and their values.
Solution is again at https://github.com/maximos/maximus-web/blob/develop/lib/Maximus/Class/Lexer.pm
Edit: Or let me just paste the piece of code here
[ 'MODULEDESCRIPTION',
qr/[ \t]*\bRem\R(?:\R|.)*?\bEnd[ \t]*Rem\R\bModule[\s\t]\w+\.\w+/i,
sub {
my ($label, $value) = #_;
my ($desc) = ($value =~ /\bbbdoc: (.+)/i);
my ($name) = ($value =~ /\bModule (\w+\.\w+)/i);
[$label, $desc, 'MODULENAME', $name];
}
],