I am trying to display some HTML-encoded information in a document that is generated by a scheduled run of a PowerShell script.
The following MVP illustrates my issue:
@{ a="<div style=""color:red;"">Hello</div>"; b="Hi"}.GetEnumerator() | Select Key, Value | ConvertTo-Html | Out-File -Encoding utf8 -FilePath C:\Scripts\Test.html
Which outputs:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>HTML TABLE</title>
</head><body>
<table>
<colgroup><col/><col/></colgroup>
<tr><th>Key</th><th>Value</th></tr>
<tr><td>a</td><td>&lt;div style=&quot;color:red;&quot;&gt;Hello&lt;/div&gt;</td></tr>
<tr><td>b</td><td>Hi</td></tr>
</table>
</body></html>
When opened in a browser, the page shows the div markup as literal text.
But I want my Hello to be red, and I don't want to see the escaped HTML div code.
Is there any way to tell ConvertTo-Html not to escape my inputs?
Note: This MVP only illustrates the issue I'm facing. I actually have a very complex report that I would like to decorate for easier viewing (color coding, symbol, et al).
This is the report I am trying to configure:
The main purpose of the ConvertTo-Html cmdlet is to provide an easy-to-use tool for converting lists of objects into tabular HTML reports. The input for this conversion is expected to be non-HTML data, and characters that have a special meaning in HTML are automatically escaped. This cannot be turned off.
Unescaped HTML fragments can be inserted into the HTML report via the parameters -Body, -PreContent, and -PostContent before or after tabular data. However, for more complex reports this probably isn't versatile enough. The best approach in situations like that is to generate the individual parts of your report as fragments, e.g.
$ps = Get-Process | ConvertTo-Html -PreContent '<p>Process list</p>' -Fragment
and then combine all fragments with a here-string:
$html = @"
<html>
<head>
...
</head>
<body>
${ps}
<hr>
${other_fragment}
...
</body>
</html>
"@
As for individual formatting of particular parts of generated fragments: that is not supported. You need to modify the resulting HTML code yourself, either via search and replace (in the fragments or in the full HTML content) or by parsing and modifying the full HTML content.
This is probably pretty basic, but I'm struggling with the command line. Suppose I want to turn a Markdown file myDoc.md into a PDF file. Markedjs provides a command line tool to convert Markdown to HTML, and wkhtmltopdf can convert HTML to PDF, so I have the command
marked myDoc.md | wkhtmltopdf - myDoc.pdf
That works and generates the PDF, but the PDF is pretty ugly, so I want to prepend a style section to the HTML before passing it to wkhtmltopdf. Yes, I could put the style section in the Markdown document, but I don't want to pollute the markup with it. I want to use marked to generate HTML, prepend a style section, and feed the result to wkhtmltopdf, without any intermediate files to clean up. Something like this pseudocode:
myStyle="<style>
*{
font-family: arial;
}
h1{
text-align:center;
}
</style>"
marked myDoc.md | concatenatestrings myStyle - | wkhtmltopdf - myDoc.pdf
Where I'm having trouble is that I don't know how to handle the multiline string for myStyle, or how to find something that does what the hypothetical concatenatestrings command does: take a string from stdin, prepend myStyle, and write the result to stdout.
I would use a template file, then use a subshell to output the template and output of marked myDoc.md to stdout, then pipe the results to the rest of your chain.
So, let us create the template file...
template.html
<style>
*{
font-family: arial;
}
h1{
text-align:center;
}
</style>
...and use it
$ (cat template.html && marked myDoc.md) | wkhtmltopdf - myDoc.pdf
I haven't tested this with your command (I don't want to install marked just to test it), but have tested it with the following...
$ (echo ree && echo cola) | cat
ree
cola
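If a separate template file is not wanted either, a here-document inside a small shell function can hold the multi-line style block. This is only a minimal sketch, assuming a POSIX shell. The name prepend_style is hypothetical, and the printf line stands in for `marked myDoc.md` so the example runs without marked installed:

```shell
#!/bin/sh
# Hypothetical helper: print the style block first, then copy stdin through unchanged.
prepend_style() {
  # Quoting the delimiter ('EOF') keeps the shell from expanding anything in the CSS.
  cat <<'EOF'
<style>
* { font-family: arial; }
h1 { text-align: center; }
</style>
EOF
  # This second cat reads the function's stdin, i.e. whatever was piped in.
  cat
}

# Stand-in for `marked myDoc.md` (assumption: marked writes its HTML to stdout).
printf '<h1>Title</h1>\n' | prepend_style
```

With marked and wkhtmltopdf installed, the intended pipeline would be `marked myDoc.md | prepend_style | wkhtmltopdf - myDoc.pdf`, which leaves no intermediate files behind.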
I'm using OSCommerce for my online store and I'm currently optimizing my product page for rich snippets.
Some of my Google Indexed pages are being marked as "Failed" by Google due to double quotes in the description field.
I'm using existing code that strips the HTML tags and truncates anything after 197 characters.
<?php echo substr(trim(preg_replace('/\s\s+/', ' ', strip_tags($product_info['products_description']))), 0, 197); ?>
How can I include the removal of quotes in that code so that the following string:
<strong>This product is the perfect "fit"</strong>
becomes:
This product is the perfect fit
This happened to me too. Try using:
tep_output_string($product_info['products_description'])
" becomes &quot;
We can try using preg_replace_callback here:
$input = "SOME TEXT HERE <strong>This product is the perfect \"fit\"</strong> SOME MORE TEXT HERE";
$output = preg_replace_callback(
"/<([^>]+)>(.*?)<\/\\1>/",
function($m) {
return str_replace("\"", "", $m[2]);
},
$input);
echo $output;
This prints:
SOME TEXT HERE This product is the perfect fit SOME MORE TEXT HERE
The regex pattern used does the following:
<([^>]+)> match an opening HTML tag, and capture the tag name
(.*?) then match and capture the content inside the tag
<\/\\1> finally match the same closing tag
Then, we use a callback function which does an additional replacement to strip off all double quotes.
Note that in general using regex against HTML is bad practice. But, if your text only has single level/occasional HTML tags, then the solution I gave above might be viable.
I'm using PoEdit to create my messages.mo file to be used in my php web app.
I checked my encoding is UTF-8 and still, my accents are not showing (e.g. "é", "è", ...). Actually, both source and target files are defined with UTF-8...
Here's the code I use to enable gettext:
<?php
$dir = "../locale";
$lang="fr_FR";
$domain="messages";
putenv("LANG=$lang");
setlocale(LC_ALL, $lang);
bindtextdomain ($domain, $dir);
textdomain ($domain);
echo gettext("TEST 1") . "\n";
echo __("Test 2"); // works if using gettext("Test 2");
?>
EDIT: I also add here the header of my page, stating I should be using UTF-8...
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
EDIT 2: Here a link to the po file. Also, I try to copy-paste the result here
R�gle plurielle 1 accessible en �criture.
I should get
Règle plurielle 1 accessible en écriture.
Any idea how to resolve this?
I generate a CSV file as one string with semicolons as field separator and chr(13) chr(10) as line separator. I attach this to an email with the cfmailparam tag using disposition="attachment", where the content attribute contains that string. The attachment seems to be encoded in UTF-8, which Excel does not like, so my umlauts are destroyed. Is it possible to give the cfmailparam tag a charset attribute to ensure the file is attached and sent Windows-1252 encoded?
Or is it better to store the string with the cffile tag using the Windows-1252 encoding and attach it to the mail with the cfmailparam file attribute?
<cfset arrTitel = [
"Titel"
, "Geschäftsbereich"
]>
<cfsavecontent variable="csv"><cfoutput><cfloop from="1" to="#ArrayLen( arrTitel )#" index="spaltentitel"><cfif spaltentitel gt 1>;</cfif>"#arrTitel[spaltentitel]#</cfloop>#chr(13)##chr(10)#</cfoutput></cfsavecontent>
<cfmail from="#mail#" to="#mailempf#" subject="subj" type="text/html">
<cfmailparam
content="#csv#"
disposition="attachment"
file="Report.csv"
>
</cfmail>
This is what it basically looks like.
Please don't advise to change the encoding of the cfm file. Other values containing Umlauts are not hard coded but come from a database.
I want to extract some text which is present in a specific table cell in the HTML page.
Now, the problem is, this cell is present inside a table tag which has no ID/Name.
I am using HTML::TreeBuilder::XPath to extract the value using XPATH expressions.
Here is what the HTML content looks like:
<table border="0">
<tr>
<td>Some Text</td>
<td>The Text I want comes here</td>
</tr>
This is what my XPath expression looks like:
@nodes=$tree->findnodes(q{//table[8]/tr/td[2]/text()});
print $_->string_value."\n" foreach(@nodes); # corrected, thanks mirod.
It does not display the output.
I have used table[8] above since this is the eighth table tag in the HTML page (assuming the index starts from 1).
Also, I have used td[2] since I want the innerHTML of the second td tag.
Thanks.
What happens if you remove the text() at the end of the XPath query? I would think that calling string_value on the td itself would be enough.
Also method calls are not interpolated in strings, so you need to write print $_->string_value, "\n".
This will give you the text of the content, not the markup though. For that you would need to use as_HTML, and strip the outer tags (there is no method in HTML::Element that gives you the inner HTML):
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
my $tree= HTML::TreeBuilder::XPath->new_from_content( <DATA>);
my @nodes=$tree->findnodes(q{//table[1]/tr/td[2]});
print $_->string_value, "\n" foreach(@nodes); # text
print $_->as_HTML, "\n" foreach(@nodes); # outerHTML
__DATA__
<html>
<body>
<table border="0">
<tr>
<td>Some Text</td>
<td>The Text I want comes here with <b>nested</b> content</td>
</tr>
</body>
</html>
The mirod approach should work for you.
But I recommend using findvalues instead of findnodes if you only need the text content.
Try running this code and show the output:
my @values=$tree->findvalues(q{//table[8]//tr[1]//td});
print $_, "\n" foreach(@values);