Using mPDF, the PDF document does not display Unicode symbols

I am using mPDF to generate a PDF, and I have a problem with Unicode characters.
For example, I have:
<tr>
<td>□ Yes</td>
<td>☑ No</td>
</tr>
When the PDF is generated, I get the same view instead of the normal symbols:
□ Yes
☑ No
I am initialising mPDF and passing in the HTML like this:
$mpdf=new mPDF('lt','A4','','',5,5,10,5,1,1);
$mpdf->WriteHTML($css, 1);
$mpdf->WriteHTML($html);
$mpdf->Output($pdf_filename, $pdf_destination);
What could be wrong with the Unicode encoding?

Related

itext7/html2pdf: Inline image not aligned with text in generated PDF, but is in HTML

I want to generate a PDF from input that contains LaTeX math markup. So I am converting the math markup to png images (using jlatexmath) and inserting them as base64 encoded inline images in an HTML document, which is then converted to PDF using itext7/html2pdf.
The HTML version is aligned correctly:
However the PDF version is not:
The HTML looks like this:
<p>Here is an example using LaTeX math: <img
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAHgAAAAdCAIAAADuPz9sAAAFqUlEQVR4Xu2ZXUiUWxSGu4huFb3ySooUrxQUbS4K80pBBX8QFS1BiMKEskBLUVFRkeOgUVqgRKBS9gtqRlNJYJTY0cZGzZ8KRXMMsTBTLB2+8zL7zJ496/uZ0dP5xgufi2F/7946r8u911p7Zp+0hy7so8KOMJvNVNo17BJvWoGem5uLiIj4/v07nXDFaDR+/vyZqrsGeINDquqOVqBBRkYGlVx5/PhxW1sbVR2cOHEiKytLsv+1JpOptbX1zp07dNH/Dxw+efKEqvqiFej19fXc3FyqCmxsbCQlJVHVgcViQaDDw8Mxttls+Je8f/8eA7pOF5KTk+GWqjqiHOgHDx48evSopqamqamJzglcvXq1vb2dqg7KysrwS6xWK8bYyA0NDW/fvp2dnaXrdAE+4ZaqOqIQaBzz/Px8DOrq6oaGhui0QFRU1NraGlUdHD9+fGlpiapeAj4jIyOpqiMKgcZG7urqwiA7O3tzc5NOOxgfH4+Li6OqA/xgSEgIVb1KbGzs6OgoVfVCIdCXL1+enp5GpFJSUkZGRui0gytXrhQXF1PVwfDwcHx8PFW9yqVLlxobG6mqFwqBfvfuHdqD7u7uCxcu4JVOOzh58uS9e/eo6uD69euFhYVU1WRgYOD06dMdHR3s8ePHj8+ePXNdIvX393/58oWIDLfNaGdnJzxT1WNEe4reJE17CoH2kCNHjoyNjVHVAdqVmzdvUlUTRIFnqq2trfPnz7vO/4tGTdNuRnE6DQYDVT2G29PwJqnb23mgDx8+PD8/T1UHYWFh2AJU1SQ1NZWP0fbcvn1bmHSCKZw5qnrQjM7MzAQFBVHVY7g9DW+Sur2dB9rPz29lZYWqdn79+nXgwIEfP37QCXXQ/0VHR7948YI9ZmZmsjM4NTWFk/H06VOM8/Ly8Pr169dTp04JP+rSjMrXc5BV/P39RcVzRHvcG7Y2Ekh9ff39+/fRp01OTkpK9hjOQH/48OFvTRA+4Qel/fv3451EhWM2mwMDA6mqCX5/bW0tf+StGIpqS0sLAre6usozw7Fjx/hK0owqrmfALTyLCpiYmGhwBQVTvoFEe9wb7l9IJujQscEHBwdxpJgu2uPsPNA+Pj4/f/4UFQ6uvNttOW7dutXT08MfkXn4GJd4XOqwm3jPIM7Km1H5egZOmK+vr6h4jmhPfHeAfE1uYWQBQyF1uC3fjIMHDy4sLFDVDhopjc5PkYsXL4p2jx49ysc4qpI9oK9fv2btpnj1kDej8vUM/F2HDh3ijwzcBv5yxWg0yne0aI97w+bDSralXr16xRdze2J8FAItuSvfDPwzcO6oaicxMfHu3btU1SQnJ0d8PHPmDK+0VVVVJpOpvLz8xo0byIBwn56ezlfKm1Gynq9EjwTP/HFbiPa4t5KSEpxdvFFfXx9SFpvl9paXl8UNoRBot+WbkZaWJh52keDg4E+fPlFVBSRZJLuCggJRxE5Uq+xohzX6dw2QYeCZqu6Q29PwJqnbcwm0WL4VS6oIllVWVhIR/P79OyAggKrqPHz4sLS01GKxEB11iSiMc+fOqRVhbSoqKqqrq6nqDkV7at4kdXvOQJPyrVZSOWiTFT8jxQlF6qDq9llcXHz+/DkRkRPQwBHRQ5KTk9+8eUPVHaHoTdK05wy0vHxLSiWVY7PZUF7lnzrhBCBLSvbvBNDnKV5V9Qc+4VZxr+mDM9CkfKuVVJGysjL5NxcoDuy+wBoPxfSiP729vcgAVNURZ6BJ+VYsqQQUVnl/kpCQ8O3bNwzOnj2LV/w2ssAroAzCLVV1RKHr2BYomy9fvmRjq9WKVB4TE8Me2Y727j5i4PKi/VWRDvzXQEv2nMM+1ggNDb127VpzczPT0fwh/3g9RyMBFhUVUVV3/kCgcd9FepHsn28ZDAa1e7m3gDfvfi3L+AOB3sMT9gKtE
3uB1ol/ADKVewttxwYlAAAAAElFTkSuQmCC"/>.
And here is another: <img
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEIAAAAPCAIAAADf8I4NAAAB6klEQVR4Xt2WTatBQRjHfQzfAR/AZ7CwUUpethYWslXeyurSRRZeUyiSnQVSslAKSRZessGCjY1IeYn7dM69g2fOPY6bW+79rWae/8zT/OeZmXNE53+BiLT0en0sFruSXpp4PK7T6Uj3YkMikZA2odFo2Gy2brebzWaxJphfSnK94Ds2JpNJMplstVrlchlrDIPBAIco7iYRAp3kARu5XG4+n8Nhq9VqWGOQSqU4RHE3iRDoJHw2jsdjpVLxeDz5fN5kMo1GIzKAE9r8cDgMh8PsnkEGpD4EbL9KpXI6nZyXls9Gr9c7HA7pdDqTyTSbze12SwZwQtvodDrRaBRsrNdrjUaDVAK4fb/F5/OtVis0DCqgVqv3+z2Kn/ltsJjN5ul0Sro0gUDgjUEsFrMNv99/Op1YVavV7na7arUKw27nPQa4UiqVi8UCCwx8NtrtNkxWKBTQrtfrRP0OuhoAWwS32w2VwdoX/X6f9U/wer3X1YBNgTzwhEBwPB5fTf2Ez4bVak2lUna7HfaSZxEEThsul6tUKsnlcrhpWBMMXE6Hw2EwGIxG4w8PlXBkMhmKJBKJQqGwXC4tFguSnsszbcxmMxSB71SxWAwGg5vNBknPhdsG/IyEQiHSfXEikQg8JKR7sfGn+Sc2PgBzdMQ49YHi+gAAAABJRU5ErkJggg=="/>.
</p>
and the CSS contains:
img {
vertical-align:middle;
}
Any idea how to make it align in the PDF version?

Displaying unicode characters in codemirror

I have XML with lots of Unicode characters. It comes out of a database where the original ¶ is stored as a numeric character reference such as &#182; (which in turn is correctly rendered as ¶ in HTML). However, CodeMirror displays the raw sequence &#182; instead. Is there some way of having CodeMirror render these sequences as HTML does, i.e. as ¶?
I figured out a solution: convert the entities before passing the text to CodeMirror. See "Value &# to unicode convert".
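A minimal sketch of that kind of pre-conversion, written in Python purely for illustration (the original solution presumably ran in the browser, and the name decode_numeric_refs is made up here). It assumes the text contains decimal or hexadecimal numeric character references and leaves named entities such as &amp; alone so the XML stays well-formed:

```python
import re

# Matches decimal (&#182;) and hexadecimal (&#xB6;) numeric character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_refs(text):
    """Replace numeric character references with the characters they denote."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid Unicode code point: keep the reference unchanged.
            return match.group(0)

    return NUMERIC_REF.sub(replace, text)
```

Note that this sketch would also decode references to markup characters (for example &#60; for "<"), so real XML may need those excluded before the text is handed to the editor.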

Using mPDF to create a PDF from an HTML form

I want to use mPDF to create my PDF files because I use Norwegian letters such as ÆØÅ. The information in the PDF file will mostly consist of text written by the user in an HTML form. But I have some problems.
When using this code:
$mpdf->WriteHTML('Text with ÆØÅ');
The PDF will show the special characters.
But when using this:
<?php
include('mpdf/mpdf.php');
$name = 'Name - <b>' . $_POST['name'] . '</b>';
$mpdf = new mPDF();
$mpdf->WriteHTML($name);
$mpdf->Output();
exit;
?>
The special characters will not show.
The HTML form looks like this:
<form action="hidden.php" method="POST">
<p>Name:</p>
<input type="text" name="name">
<input type="submit" value="Send"><input type="reset" value="Clear">
</form>
Why won't the special characters show with this method? And which method should I use?
Since echoing the POST data back onto the website does not show the characters either, this clearly isn't an issue with mPDF. When content includes non-ASCII characters, you have to take special care with the website's character encoding.
The mPDF documentation shows that it supports UTF-8, so you will want to use that for your data. POST data is received in the same encoding that the website uses. So if the website is in Latin-1 (ISO-8859-1), you need to call utf8_encode() to convert the POST data to UTF-8. If the website already uses UTF-8, you should be fine.
If you don't set a specific encoding in the website header (which you should always do to avoid this kind of trouble), the encoding may depend on several factors, such as the operating system and configuration of the server or the encoding of the original PHP source file, which, as it turns out, is influenced by your own OS configuration and choice of editor.
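To make the byte-level problem concrete, here is a small Python illustration (not PHP, and not part of mPDF; the helper name to_utf8 and its page_charset parameter are made up for this sketch). It shows that the Latin-1 bytes for ÆØÅ are not valid UTF-8, and that re-encoding them from the page's charset yields the bytes a UTF-8 consumer expects:

```python
def to_utf8(raw, page_charset="utf-8"):
    """Re-encode submitted form bytes from the page's charset into UTF-8."""
    return raw.decode(page_charset).encode("utf-8")


name = "\u00c6\u00d8\u00c5"  # the Norwegian letters from the question

latin1_bytes = name.encode("latin-1")  # what a Latin-1 page would submit
utf8_bytes = name.encode("utf-8")      # what a UTF-8 page would submit

# The Latin-1 bytes cannot be read as UTF-8, which is why a UTF-8 consumer
# ends up showing garbled or missing characters.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

# Converting from the page's real charset gives proper UTF-8 bytes.
converted = to_utf8(latin1_bytes, page_charset="latin-1")
```

The conversion only works if you know which charset the page really used, which is another reason to declare it explicitly in the header.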

converting html to text with perl

I have a bunch of HTML files and need to convert and format them to text with Perl, i.e. something like <br/> should be interpreted as \n.
I found the Perl module HTML::FormatText on CPAN. It formats the text well, but if there is a link, it strips it.
Is there any option in HTML::FormatText to format the HTML to text as is, but keep the URL when there are links like this:
<a href="http://www.microsoft.com">http://www.microsoft.com</a>
i.e. something like this:
<br /><b>Microsoft</b><br /><a href="http://www.microsoft.com">
will be converted to:
microsoft
http://www.microsoft.com
Take a look at HTML::FormatText::WithLinks
Setting the after_link option to, say, " (%l)" will put the link in line after the anchor text. In your example you would get Microsoft (http://www.microsoft.com).
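To see roughly what that setting produces, here is a Python imitation built on the standard html.parser module. It is not the Perl module and does not reproduce its formatting rules; the class TextWithLinks and the function html_to_text are hypothetical names, and the " (%l)" default simply mirrors the value suggested above rather than any default of HTML::FormatText::WithLinks:

```python
from html.parser import HTMLParser


class TextWithLinks(HTMLParser):
    """Collect text, turn <br> into newlines, and append each link's URL."""

    def __init__(self, after_link=" (%l)"):
        super().__init__()
        self.after_link = after_link  # "%l" is replaced by the link URL
        self.parts = []
        self.href_stack = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")
        elif tag == "a":
            self.href_stack.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            href = self.href_stack.pop()
            if href:
                self.parts.append(self.after_link.replace("%l", href))

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = TextWithLinks()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

For instance, html_to_text('<a href="http://www.microsoft.com">Microsoft</a>') gives the anchor text followed by the URL in parentheses, the same shape of output as described for the Perl option.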

Formatting Field values using itextsharp

How can I format the string "i am fine here" using iTextSharp?
fields.SetField("tgPara2", message2);
where message2 is "i am fine here". I want only the word "fine" to be bold.
Any help would be
iText has partial support for "rich text values" for text fields. You can get and set the rich values, but iText won't actually draw those values properly. You need to turn off SetGenerateAppearances, and open the PDF in Acrobat/Reader to see the rich text.
This means flattening isn't going to work (unless you open the PDF in Acrobat, then save it again... clunky).
You might want to check out the PDF Specification (Section 12.7.3.4, Rich Text Strings) for further information on what is and isn't legal. <b> is legal, as is the font-weight CSS style.