Escape Single Quotes in Template Toolkit - perl

Do you ever escape single quotes in Template Toolkit for necessary JavaScript handlers? If so, how do you do it?
[% SET s = "A'B'C" %]
ABC
html_entity obviously doesn't work because it only handles the double quote. So how do you do it?

I don't use inline event handlers, for the same reason I refuse to use the style attribute for CSS. jQuery just makes it too easy to put class="foo" on the HTML and $('.foo').click( function () {} ) in an external .js file.
But, for the purpose of doing my best to answer this question, check out these docs on Template::Filters for the ones in core.
It seems as if you could do [% s | replace( "'", "\\'" ) %] to escape single quotes. Or you could probably write a more complex sanitizing JavaScript parser that permits only function calls, and make your own Template::Filter.
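For comparison, here is a minimal plain-Perl sketch of the same substitution that the replace filter performs. The helper name js_squote_escape is made up for illustration, and the sketch only covers backslashes and single quotes; it is not a complete JavaScript sanitizer.

```perl
use strict;
use warnings;

# Hypothetical helper: escape a string for use inside a single-quoted
# JavaScript string literal. Backslashes are escaped first so that the
# backslashes added for the quotes are not doubled afterwards.
sub js_squote_escape {
    my ($text) = @_;
    $text =~ s/\\/\\\\/g;   # \  becomes  \\
    $text =~ s/'/\\'/g;     # '  becomes  \'
    return $text;
}

print js_squote_escape("A'B'C"), "\n";   # prints A\'B\'C
```

A sub like this could be passed into the template as a variable and called from there, which keeps the escaping logic in Perl rather than in the template.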

2018 update for reference:
TT has a method for this called squote for escaping single quotes and dquote for double quotes.
[% tim = "Tim O'Reilly" %]
[% tim.squote %] # Tim O\'Reilly
The link from the question would be something like:
ABC
http://www.template-toolkit.org/docs/manual/VMethods.html#section_squote

You can try: popup('[% s | html %]').

Perl isn't my strongest language... But!
Easiest way I've found is to use the JSON module. In a module called JS.pm or something:
use JSON;
sub encode {
    my $self   = shift;
    my $string = shift;
    # allow_nonref lets the encoder accept a plain string, not just a reference
    my $json = JSON->new->allow_nonref;
    return $json->encode( $string );
}
More here: http://search.cpan.org/~makamaka/JSON-2.90/lib/JSON.pm
Then in your template:
[% USE JS %]
<script>
var escaped_string = [% JS.encode( some_template_variable ) %];
</script>
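To see what the encoder actually emits, here is a small standalone sketch. It assumes JSON::PP (which ships with Perl and offers the same new/allow_nonref/encode calls) in place of JSON, and the helper name js_literal is made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: turn a Perl scalar into a JavaScript literal.
sub js_literal {
    my ($value) = @_;
    # allow_nonref lets the encoder accept a bare string, not just a
    # hash or array reference.
    return JSON::PP->new->allow_nonref->encode($value);
}

print js_literal("A'B'C"), "\n";        # prints "A'B'C"
print js_literal(qq{say "hi"}), "\n";   # prints "say \"hi\""
```

Note that the result is wrapped in double quotes and leaves single quotes untouched, so it suits a script block like the one above rather than an attribute value that is itself delimited by double quotes.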

Remember to double the backslash in the replacement; otherwise it will be interpreted as escaping the apostrophe.
[% string.replace( "'", "\\'" ) %]

Related

Extract email from string using Template Toolkit

I'm guessing this is relatively simple, but I can't find the answer.
From a string such as '"John Doe" <email#example.com>', how can I extract the email portion using Template Toolkit?
An example string to parse is this:
$VAR1 = {
'date' => '2021-03-25',
'time' => '03:58:18',
'href' => 'https://example.com',
'from' => 'fezius#evrostroyserov.ru on behalf of Caroline <fezius#evrostroyserov.ru>',
'bytes' => 13620,
'pmail' => 'user#example.com',
'sender' => 'sender#example.com',
'subject' => 'Some Email Subject'
};
My code, based on @dave-cross's help below, where $VAR1 is the output of dumper.dump(item):
[% text = item.from -%]
[% IF (matches = text.match('(.*?)(\s)?+<(.*?)>')) -%]
<td>[% matches.1 %]</td>
[% ELSE -%]
<td>[% text %]</td>
[% END %]
However, it's still not matching against $VAR1.
This does what you want, but it's pretty fragile and this really isn't the kind of thing that you should be doing in TT code. You should either get the data parsed outside of the template and passed into variables, or you should pass in a parsing subroutine that can be called from inside the template.
But, having given you the caveats, if you still insist this is what you want to do, then this is how you might do it:
In test.tt:
[% text = '"John Doe" <email#example.com>';
matches = text.match('"(.*?)"\s+<(.*?)>');
IF matches -%]
Name: [% matches.0 %]
Email: [% matches.1 %]
[% ELSE -%]
No match found
[% END -%]
Then, testing using tpage:
$ tpage test.tt
Name: John Doe
Email: email#example.com
But I cannot emphasise enough that you should not be doing it like this.
Update: I've used this test template to investigate your further problem.
[% item = { from => '"John Doe" <email#example.com>' };
text = item.from -%]
[% IF (matches = text.match('(.*?)(\s)?+<(.*?)>')) -%]
<td>[% matches.1 %]</td>
[% ELSE -%]
<td>[% text %]</td>
[% END %]
And running it, I get this:
$ tpage test2.tt
<td> </td>
That's what I'd expect to see for a match. You're printing matches.1. That's the second item from the matches array. And the second match group is (\s). So I'm getting the space between the name and the opening angle bracket.
You probably don't want that whitespace match in your matches array, so I'd remove the parentheses around it, to make the regex (.*?)\s*<(.*?)> (note that \s* is a simpler way to say "zero or more whitespace characters").
You can now use matches.0 to get the name and matches.1 to get the email address.
Oh, and there's no need to copy item.from into text. You can call the match vmethod on any scalar variable, so it's probably simpler to just use:
[% matches = item.from.match(...) -%]
Did I mention that this is all a really terrible idea? :-)
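For comparison, doing the parsing in Perl before the template runs might look like the following sketch. It uses the simplified regex suggested above; the helper name split_from is made up for illustration, and, like the template version, it is fragile for real-world address formats.

```perl
use strict;
use warnings;

# Hypothetical helper: split a "Name <address>" string into its two parts.
# In list context it returns (name, address), or an empty list when the
# string does not match.
sub split_from {
    my ($from) = @_;
    return $from =~ /(.*?)\s*<(.*?)>/;
}

my ($name, $email) = split_from('"John Doe" <email#example.com>');
print "Name: $name\n";     # prints Name: "John Doe"
print "Email: $email\n";   # prints Email: email#example.com
```

The two values can then be passed to the template as ordinary variables, so the template only has to print them.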
Update2:
This is all going to be far easier if you give me complete, runnable code examples in the same way that I am doing for you. Any time I have to edit something in order to get an example running, we run the risk that I'm guessing incorrectly how your code works.
But, bearing that in mind, here's my latest test template:
[% item = {
'date' => '2021-03-25',
'time' => '03:58:18',
'href' => 'https://example.com',
'from' => 'fezius#evrostroyserov.ru on behalf of Caroline <fezius#evrostroyserov.ru>',
'bytes' => 13620,
'pmail' => 'user#example.com',
'sender' => 'sender#example.com',
'subject' => 'Some Email Subject'
};
text = item.from -%]
[% IF (matches = text.match('(.*?)(\s)?<(.*?)>')) -%]
<td>[% matches.2 %]</td>
[% ELSE -%]
<td>[% text %]</td>
[% END %]
I've changed the definition of item to have your full example. I've left the regex as it was before my suggestions. And (because I haven't changed the regex) I've changed the output to print matches.2 instead of matches.1.
And here's what happens:
$ tpage test3.tt
<td>fezius#evrostroyserov.ru</td>
So it works.
If yours doesn't work, then you need to identify the differences between my (working) code and your (non-working) code. I'm happy to help you identify those differences, but you have to give me your non-working example in order for me to do that.
Update3:
Again I've tried to incorporate the changes that you're talking about. But again, I've had to guess at stuff because you're not sharing complete runnable examples. And again, my code works as expected.
[% USE dumper -%]
[% item = {
'date' => '2021-03-25',
'time' => '03:58:18',
'href' => 'https://example.com',
'from' => 'fezius#evrostroyserov.ru on behalf of Caroline <fezius#evrostroyserov.ru>',
'bytes' => 13620,
'pmail' => 'user#example.com',
'sender' => 'sender#example.com',
'subject' => 'Some Email Subject'
};
-%]
[% matches = item.from.match('(.*?)(\s)?<(.*?)>') -%]
[% dumper.dump(matches) %]
And testing it:
$ tpage test4.tt
$VAR1 = [
'fezius#evrostroyserov.ru on behalf of Caroline',
' ',
'fezius#evrostroyserov.ru'
];
So that works. If you want any more help, then send a complete runnable example. If you don't do that, I won't be able to help you any more.
There's a very old (and unmaintained) module, Template::Extract, that lets you define a template, then work backward from a string that might have been produced by that template:
use Template::Extract;
use Data::Dumper;
my $obj = Template::Extract->new;
my $template = qq("[% name %]" <[% email %]>);
my $string = '"John Doe" <email#example.com>';
my $extracted = $obj->extract($template, $string);
print Dumper( $extracted );
The output is:
$VAR1 = {
'email' => 'email#example.com',
'name' => 'John Doe'
};
However, there are modules that already do this job for you and will handle many more situations.
I have no idea how Template Toolkit can help you. Use Email::Address or Email::Address::XS to parse an e-mail address.

Perl split string at character entity reference

Quick Perl question with hopefully a simple answer. I'm trying to perform a split on a string containing non-breaking spaces (&nbsp;). This is after reading in an HTML page using HTML::TreeBuilder::XPath and retrieving the string needed with $titleString = $tree->findvalue('/html/head/title')
use HTML::TreeBuilder::XPath;
$tree = HTML::TreeBuilder::XPath->new;
$tree->parse_file( "filename" );
$titleString = $tree->findvalue('/html/head/title');
print "$titleString\n";
Pasted below is the original string and below that the string that gets printed:
Mr&nbsp;Dan Perkins&nbsp;(Active)
Mr?Dan Perkins?(Active)
I've tried splitting $titleString with @parts = split('\?',$titleString); and also with the original &nbsp;, though neither has worked. My hunch is that there's a simple piece of encoding code to be added somewhere?
HTML code:
<html>
<head>
<title>Dan Perkins (Active)</title>
</head>
</html>
You shouldn't have to know how the text in the document is encoded. findvalue returns an actual non-breaking space (U+00A0) wherever the document contains &nbsp;, so you'd use any of the following:
split(/\xA0/, $title_string)
-or-
split(/\x{00A0}/, $title_string)
-or-
split(/\N{U+00A0}/, $title_string)
-or-
split(/\N{NBSP}/, $title_string)
-or-
split(/\N{NO-BREAK SPACE}/, $title_string)
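As a quick self-contained check, the sketch below builds a sample string with U+00A0 characters in place of the entities (an assumption standing in for what findvalue returns) and splits it. The helper name split_on_nbsp is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split already-decoded text on U+00A0 (no-break space).
sub split_on_nbsp {
    my ($text) = @_;
    return split /\x{00A0}/, $text;
}

# Sample title with the entities already decoded to U+00A0 characters.
my $title_string = "Mr\x{00A0}Dan Perkins\x{00A0}(Active)";
my @parts = split_on_nbsp($title_string);

print join('|', @parts), "\n";   # prints Mr|Dan Perkins|(Active)
```

The ordinary space inside "Dan Perkins" is left alone, because only the no-break space is used as the separator.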

access variable member "*" in template

I have a variable
var1 = {
    '*' => {
        'counter' => {
            'default' => '0',
            'description' => 'test'
        }
    }
}
in Perl Template Toolkit.
How can I access the content of '*' in the template?
[% var1.*.counter %]
does not work because the symbol is not a valid name.
You can define a variable to be equal to * within the template, and use that instead.
[% star = '*' %]
[% var1.$star.counter.description %]
But I wonder why you have to have an asterisk as a key in the first place? It would be far better to provide a sensible key at the Perl level, perhaps by writing
$vars->{var1}{star} = $vars->{var1}{'*'}
before you call the Template Toolkit.
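A minimal sketch of that Perl-side aliasing, assuming the data is held in a hash reference named $vars as in the line above; the helper name add_star_alias is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: give the awkward '*' key a template-friendly alias.
# Both keys end up pointing at the same inner hash, so nothing is copied.
sub add_star_alias {
    my ($vars) = @_;
    $vars->{var1}{star} = $vars->{var1}{'*'};
    return $vars;
}

my $vars = add_star_alias({
    var1 => {
        '*' => { counter => { default => '0', description => 'test' } },
    },
});

print $vars->{var1}{star}{counter}{description}, "\n";   # prints test
```

With the alias in place, the template can use [% var1.star.counter.description %] without any intermediate variable.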

PHP preg_replace with any variation of upper/lowercase?

I needed to write a custom module in drupal to help out with my location search. Initially I simply needed to remove a comma from queries, and then I realized that I would need to replace all instances of states with their abbreviation (California -> CA) because of how information is stored in my database. However, upon doing this I found out that my method of using preg_replace seems to be dependent on upper/lowercase. So in this line:
$form_state['values'] = preg_replace("/alabama/", 'al', $form_state['values']);
"alabama" will be replaced with "al", but "Alabama" or "ALABAMA" will not. Is there a way to replace any instance of Alabama with its abbreviation without accounting for every possible variation in casings?
You can also try str_ireplace(), which is case-insensitive:
<?php
$str = 'alabama ,Alabama,ALABAMA';
$replace = str_ireplace('alabama','al',$str);
echo $str;
echo "<br/>";
echo $replace;
?>
$form_state['values'] = preg_replace("/alabama/i", 'al', $form_state['values']);
The 'i' modifier will make the pattern case-insensitive.

Extracting links inside <div>'s with HTML::TokeParser & URI

I'm an old-newbie in Perl, and I'm trying to create a subroutine using HTML::TokeParser and URI.
I need to extract ALL valid links enclosed within one div called "zone-extract".
This is my code:
#More perl above here... use strict and other subs
use HTML::TokeParser;
use URI;
sub extract_links_from_response {
my $response = $_[0];
my $base = URI->new( $response->base )->canonical;
# "canonical" returns it in the one "official" tidy form
my $stream = HTML::TokeParser->new( $response->content_ref );
my $page_url = URI->new( $response->request->uri );
print "Extracting links from: $page_url\n";
my($tag, $link_url);
while ( my $div = $stream->get_tag('div') ) {
my $id = $div->get_attr('id');
next unless defined($id) and $id eq 'zone-extract';
while( $tag = $stream->get_tag('a') ) {
next unless defined($link_url = $tag->[1]{'href'});
next if $link_url =~ m/\s/; # If it's got whitespace, it's a bad URL.
next unless length $link_url; # sanity check!
$link_url = URI->new_abs($link_url, $base)->canonical;
next unless $link_url->scheme eq 'http'; # sanity
$link_url->fragment(undef); # chop off any "#foo" part
print $link_url unless $link_url->eq($page_url); # Don't note links to itself!
}
}
return;
}
As you can see, I have 2 loops, first using get_tag 'div' and then look for id = 'zone-extract'. The second loop looks inside this div and retrieve all links (or that was my intention)...
The inner loop works: it extracts all links correctly when run standalone. But I think there are some issues inside the first loop, which looks for my desired div 'zone-extract'. I'm using this post as a reference: How can I find the contents of a div using Perl's HTML modules, if I know a tag inside of it?
But all I have at the moment is this error:
Can't call method "get_attr" on unblessed reference
Some ideas? Help!
My HTML (Note URL_TO_EXTRACT_1 & 2):
<more html above here>
<div class="span-48 last">
<div class="span-37">
<div id="zone-extract" class="...">
<h2 class="genres"><img alt="extracting" class="png"></h2>
<li><a title="Extr 2" href="**URL_TO_EXTRACT_1**">2</a></li>
<li><a title="Con 1" class="sel" href="**URL_TO_EXTRACT_2**">1</a></li>
<li class="first">Pàg</li>
</div>
</div>
</div>
<more stuff from here>
I find that TokeParser is a very crude tool that requires too much code; its fault is that it only supports the procedural style of programming.
A better alternative, which requires less code thanks to declarative programming, is Web::Query:
use Web::Query 'wq';
my $results = wq($response)->find('div#zone-extract a')->map(sub {
my (undef, $elem_a) = @_;
my $link_url = $elem_a->attr('href');
return unless $link_url && $link_url !~ m/\s/ && …
# Further checks like in the question go here.
return [$link_url => $elem_a->text];
});
Code is untested because there is no example HTML in the question.