Convert HTML to PDF in Angular 5 - Unicode

I need to convert HTML elements to PDF, but I don't want to use html2canvas, because that package first converts the HTML to an image and then converts the image to PDF. That approach isn't efficient enough for my needs.
I used the latest version of jsPDF (1.5.3, which is said to have built-in support for Unicode and UTF-8 characters) together with jspdf-autotable.
Everything worked, except that Persian characters came out corrupted.
Do I need to configure anything for Unicode characters in the latest version of jsPDF?
Do I have to use a custom font, or do the default settings already support Unicode characters?
I have read every article I could find about this issue and tried what they suggest, but I still haven't solved my problem.

Try the html2pdf library to export Unicode and Persian/Arabic characters properly.
With this library the HTML is first converted to a canvas and then converted to PDF.
function export_to_pdf() {
    // Get the element.
    var element = document.getElementById('root');
    // Generate the PDF.
    html2pdf().from(element).set({
        margin: 0,
        filename: 'test.pdf',
        html2canvas: { scale: 1 },
        jsPDF: { orientation: 'portrait', unit: 'in', format: 'a4', compressPDF: true }
    }).save();
}

Related

Convert unicode to emoji

I'm looking for a way to represent an emoji 📄 in my code as unicode which is then displayed as an actual 'image' in output text. I'd like to use http://apps.timwhitlock.info/unicode/inspect/hex/1F4C4 to display the 'page facing up' in application, but I don't like the idea of having pictures in my code (though it is working fine) ;)
You can use arbitrary Unicode characters directly in your source code:
let string = "📄"
or use the Swift Unicode escape sequence:
let string = "\u{1F4C4}"
More information in the section about "String Literals" in the Swift reference.

html2pdf special characters not rendered

I am using the html2pdf library to generate PDFs with a bookmarked index. By default it works well for English content, but I need to generate content that includes both English and Arabic text. The "aefurat" font works relatively well, except that some special characters (’, ‘, “, ”, ...) are rendered as boxes ([]).
The code I used is:
require_once(dirname(__FILE__).'/../html2pdf.class.php');
$html2pdf = new HTML2PDF('P', 'A4', 'en', true, 'UTF-8', 0);
$html2pdf->setDefaultFont('aefurat');
$html2pdf->writeHTML($content);
$html2pdf->Output('bookmark.pdf');
Sample content that includes Arabic and special characters:
’This is Arabic’ "العربية" Example With TCPDF... some text here some
text here some “text here”.
I am wondering whether I need to use another font or change some configuration. Any advice would be appreciated.

Converting an MS Word document's special characters to HTML

I have a Word document and the following code, which converts the document to HTML using the Apache POI API.
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
But the numbering (1), a), i), etc.) and the bullet point characters are not converted correctly. I get garbage characters like 1?, and when I open the HTML file in an editor I see the numbers followed by unwanted boxes. I have tried a lot of things but haven't found a proper solution.
Please help me get rid of this encoding issue.
Thanks

FPDF utf-8 encoding (HOW-TO)

Does anybody know how to set the encoding in FPDF package to UTF-8? Or at least to ISO-8859-7 (Greek) that supports Greek characters?
Basically I want to create a PDF file containing Greek characters.
Any suggestions would help.
George
Don't use UTF-8 encoding. Standard FPDF fonts use ISO-8859-1 or Windows-1252. It is possible to perform a conversion to ISO-8859-1 with utf8_decode():
$str = utf8_decode($str);
But some characters such as Euro won't be translated correctly. If the iconv extension is available, the right way to do it is the following:
$str = iconv('UTF-8', 'windows-1252', $str);
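To see why the target encoding matters for the Euro sign, here is a minimal sketch in Python (not PHP, purely for illustration; the helper name try_encode and the sample string are made up). Python raises an error where PHP's utf8_decode() would silently substitute a "?", but the underlying character tables are the same: ISO-8859-1 has no slot for the Euro sign, Windows-1252 does.

```python
# Illustration only: compare ISO-8859-1 and Windows-1252 for the Euro sign.
text = "Price: 5 \u20ac"  # "Price: 5 €"


def try_encode(s: str, encoding: str):
    """Return the encoded bytes, or None if the encoding cannot represent s."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError:
        return None


latin1_result = try_encode(text, "iso-8859-1")  # Euro is not in ISO-8859-1
cp1252_result = try_encode(text, "cp1252")      # Euro is byte 0x80 in Windows-1252

print(latin1_result)
print(cp1252_result)
```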
There is also an official UTF-8 version of FPDF called tFPDF: http://www.fpdf.org/en/script/script92.php
You can easily switch from the original FPDF. Just make sure you also use a Unicode font, as shown in the example at the link above or in my code:
<?php
//this is a UTF-8 file, we won't need any encode/decode/iconv workarounds
//define the path to the .ttf files you want to use
define('FPDF_FONTPATH',"../fonts/");
require('tfpdf.php');
$pdf = new tFPDF();
$pdf->AddPage();
// Add Unicode fonts (.ttf files)
$fontName = 'Helvetica';
$pdf->AddFont($fontName,'','HelveticaNeue LightCond.ttf',true);
$pdf->AddFont($fontName,'B','HelveticaNeue MediumCond.ttf',true);
//now use the Unicode font in bold
$pdf->SetFont($fontName,'B',12);
//anything else is identical to the old FPDF, just use Write(),Cell(),MultiCell()...
//without any encoding trouble
$pdf->Cell(100,20, "Some UTF-8 String");
//...
?>
I think it's much more elegant to use this instead of spamming utf8_decode() everywhere, and being able to use .ttf files directly in AddFont() is a plus too.
Any other answer here is just a way to avoid or work around the problem, and avoiding UTF-8 is not a real option for an up-to-date project.
There are also alternatives like mPDF or TCPDF (and others) which are based on FPDF but offer advanced functions, support UTF-8, and can interpret HTML code (to a limited extent, of course, as there is no direct way to convert HTML to PDF).
Most FPDF code can be used directly with those libraries, so it's pretty easy to migrate.
https://github.com/mpdf/mpdf
http://www.tcpdf.org/
There is a really simple solution to this problem.
In the file fpdf.php, go to the line that says:
if($txt!=='')
{
It is line 648 in my version of fpdf.
Insert the following line of code:
$txt = iconv('utf-8', 'cp1252', $txt);
(above the line of code)
if($align=='R')
This works for all German special characters and should also work for Greek special characters. Otherwise simply replace cp1252 with the respective alphabet you require. You can see all supported characters here: http://en.wikipedia.org/wiki/Windows-1252
I saw the solution here: http://fudforum.org/forum/index.php?t=msg&goto=167345
Please use my example code above, as the original author forgot to insert a dash between utf and 8.
Hope the above was helpful.
Daan
You need to generate a font first, using the MakeFont utility included in the FPDF package. On Linux I used this slightly extended version of the script from the demo:
<?php
// Generation of font definition file for tutorial 7
require('../makefont/makefont.php');
$dir = opendir('/usr/share/fonts/truetype/ttf-dejavu/');
while (($relativeName = readdir($dir)) !== false) {
    if ($relativeName == '..' || $relativeName == '.')
        continue;
    MakeFont("/usr/share/fonts/truetype/ttf-dejavu/$relativeName",'ISO-8859-2');
}
?>
Then I copied the generated files to the font directory of my website and used this:
$pdf->Cell(80,70, iconv('UTF-8', 'ISO-8859-2', 'Buňka jedna'),1);
(I was working on a table.) That worked for my language (Buňka jedna is Czech for "Cell one"). Czech is a Central European language, which is covered by ISO-8859-2. Regrettably, the FPDF user is forced to give up the advantages of UTF-8 encoding. You cannot get this in your PDF:
Městečko Fruens Bøge
The Danish letter ø becomes ř in ISO-8859-2.
Suggestion of solution: You need to get a Greek font, generate the font using proper encoding (ISO-8859-7) and use iconv with the same target encoding as the one the font has been generated with.
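The ø/ř clash can be checked with a short sketch, written in Python rather than PHP purely for illustration (the helper name can_encode is hypothetical). It shows that the same byte means different letters in ISO-8859-1 and ISO-8859-2, and that a string mixing Czech and Danish letters does not fit into ISO-8859-2 at all.

```python
# Illustration only: one byte, two meanings, depending on the code page.
raw = bytes([0xF8])
as_latin1 = raw.decode("iso-8859-1")  # Western European reading of 0xF8
as_latin2 = raw.decode("iso-8859-2")  # Central European reading of 0xF8


def can_encode(s: str, encoding: str) -> bool:
    """True if every character of s exists in the given single-byte encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


czech_only = "Bu\u0148ka jedna"                  # "Buňka jedna"
mixed = "M\u011bste\u010dko Fruens B\u00f8ge"    # "Městečko Fruens Bøge"

print(as_latin1, as_latin2)
print(can_encode(czech_only, "iso-8859-2"))
print(can_encode(mixed, "iso-8859-2"))
```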
How do I create PDFs in FPDF that support Chinese, Japanese, Russian, etc.?
(snapshots of code in use below)
I'd like to provide: a summary of the problem, the solution, a github project with the working code, and an online example with the expected, resultant PDF.
The Problem:
As stated by Tarsis, swap FPDF for tFPDF.
You also need a font that actually contains the characters you are using.
For example, merely using Helvetica and trying to display Japanese will not work. If you open the font in FontForge, or some other font tool, you can scroll to the Chinese characters and see that they are blank.
Google has a font family (Noto) that covers all languages, but it is 20 MB, which is usually many times the size of your text. So you can see why most fonts simply don't cover every single language.
The Solution:
I'm using rounded-mgenplus-20140828.ttf and ZCOOL_QingKe_HuangYou.ttf font packs for Japanese and Chinese, which are open source and can be found in many open source projects. In tFPDF itself, or a new inheriting class of it, like class HTMLtoPDF extends tFPDF {...}, you'll do this...
$this->AddFont('japanese', '', 'rounded-mgenplus-20140828.ttf', true);
$this->SetFont('japanese', '', 14);
$this->Write(14, '日本語');
Should be nothing more to it!
Code Package on GitHub :
https://github.com/HoldOffHunger/php-html-to-pdf
Working, Online Demo of Japanese :
https://www.earthfluent.com/privacy.pdf?language=ja
This answer didn't work for me on its own; I also needed to run an HTML entity decode on the string:
iconv('UTF-8', 'windows-1252', html_entity_decode($str));
Props go to emfi, from the question "html_entity_decode in FPDF (using tFPDF extension)".
Just edit the Cell function in the fpdf.php file. Look for the line that looks like this:
function cell ($w, $h = 0, $txt = '', $border = 0, $ln = 0, $align = '', $fill = false, $link = '')
{
After finding that line, add the following right after the {:
$txt = utf8_decode($txt);
Save the file and you're done; accents and UTF-8 text will work :)
There is an extension of FPDF called mPDF that allows Unicode fonts.
http://www.mpdf1.com/mpdf/index.php
None of the above solutions are going to work.
Try this:
function filter_html($value){
    $value = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
    return $value;
}
You can make a class to extend FPDF and add this:
class utfFPDF extends FPDF {
    function Cell($w, $h=0, $txt="", $border=0, $ln=0, $align='', $fill=false, $link='')
    {
        if (!empty($txt)){
            if (mb_detect_encoding($txt, 'UTF-8', false)){
                $txt = iconv('UTF-8', 'ISO-8859-5', $txt);
            }
        }
        parent::Cell($w, $h, $txt, $border, $ln, $align, $fill, $link);
    }
}
I wanted to answer this for anyone who hasn't switched over to TFPDF for whatever reason (framework integration, etc.)
Go to: http://www.fpdf.org/makefont/index.php
Use a .ttf compatible font for the language you want to use. Make sure to choose the encoding number that is correct for your language. Download the files and paste them in your current FPDF font directory.
Use this to activate the new font: $pdf->AddFont($font_name,'','Your_Font_Here.php');
Then you can use $pdf->SetFont normally.
On the text itself, use iconv to convert from UTF-8 to the encoding you generated the font with. For example, for Hebrew you would use iconv('UTF-8', 'windows-1255', $first_name).
Substitute the Windows code page number for your language.
For right-to-left text, a quick fix is something like strrev(iconv('UTF-8', 'windows-1255', $first_name)).
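The order of operations in that quick fix matters: reversing is only safe after the conversion to a single-byte code page. A rough sketch of the idea, in Python rather than PHP and with a made-up sample word, assuming a plain byte reversal like strrev() performs:

```python
# Illustration only: byte reversal is safe in a single-byte code page,
# but breaks multi-byte UTF-8 sequences.
name = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom", 4 letters

encoded = name.encode("cp1255")        # Windows-1255: one byte per letter
visual = encoded[::-1].decode("cp1255")  # reversed bytes are still valid text

try:
    name.encode("utf-8")[::-1].decode("utf-8")
    utf8_reversal_ok = True
except UnicodeDecodeError:
    # Each Hebrew letter is two bytes in UTF-8; reversing splits the pairs.
    utf8_reversal_ok = False

print(len(encoded), utf8_reversal_ok)
```

This only reverses the whole string, so mixed text (digits or Latin words inside Hebrew) would come out reversed as well; it is a quick fix, not real bidirectional layout.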
You can apply this function on your text :
$yourtext = iconv('UTF-8', 'windows-1252', $yourtext);
Thanks
As many have said here:
$yourtext = iconv('UTF-8', 'windows-1252', $yourtext);
BUT with '//IGNORE' appended to windows-1252 (or, in my case, CP1252), like this:
iconv("UTF-8", "CP1252//IGNORE", $row['project_name'])
This one worked for me, I hope it works for you!
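What //IGNORE buys you is that characters missing from the target code page are dropped instead of aborting the conversion. The closest Python analogue is errors="ignore"; the sketch below uses it purely as an illustration (the sample string is made up, and this is not the PHP call itself).

```python
# Illustration only: dropping characters that the target code page lacks.
project_name = "Caf\u00e9 \u65e5\u672c\u8a9e"  # "Café 日本語"

try:
    project_name.encode("cp1252")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False  # the Japanese characters are not in Windows-1252

# Unrepresentable characters are silently removed, the rest is kept.
ignored = project_name.encode("cp1252", errors="ignore")

print(strict_ok, ignored)
```

Note that the dropped characters are simply lost, so this suits stray symbols better than text you actually need to show.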
Not sure if it will do for Greek, but I had the same issue for Brazilian Portuguese characters and my solution was to use html entities. I had basically two cases:
String may contain UTF-8 characters.
For these, I first encoded it to html entities with htmlentities() and then decoded them to iso-8859-1. Example:
$s = html_entity_decode(htmlentities($my_variable_text), ENT_COMPAT | ENT_HTML401, 'iso-8859-1');
Fixed string with html entities:
For these, I just left the htmlentities() call out. Example:
$s = html_entity_decode("Treasurer/Tr&eacute;sorier", ENT_COMPAT | ENT_HTML401, 'iso-8859-1');
Then I passed $s to FPDF, like in this example:
$pdf->Cell(100, 20, $s, 0, 0, 'L');
Note: ENT_COMPAT | ENT_HTML401 is the standard value for parameter #2, as in http://php.net/manual/en/function.html-entity-decode.php
Hope that helps.
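The second case (a string that already contains HTML entities) boils down to two steps: decode the entities, then encode to the single-byte encoding the FPDF font expects. A minimal sketch of those two steps in Python, for illustration only (html.unescape stands in for PHP's html_entity_decode here):

```python
# Illustration only: decode HTML entities, then encode to ISO-8859-1.
import html

label = html.unescape("Treasurer/Tr&eacute;sorier")  # entity becomes a real character
encoded = label.encode("iso-8859-1")                 # one byte per character

print(label)
print(encoded)
```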
For posterity.
How I managed to add Russian to FPDF on my Linux machine:
1) Go to http://www.fpdf.org/makefont/ and convert your TTF font (for example ArialRegular.ttf) into 2 files using ISO-8859-5 encoding: ArialRegular.php and ArialRegular.z
2) Put these 2 files into fpdf/font directory
3) Use it in your code:
$pdf = new \FPDI();
$pdf->AddFont('ArialMT','','ArialRegular.php');
$pdf->AddPage();
$tplIdx = $pdf->importPage(1);
$pdf->useTemplate($tplIdx, 0, 0, 211, 297); //width and height in mms
$pdf->SetFont('ArialMT','',35);
$pdf->SetTextColor(255,0,0);
$fullName = iconv('UTF-8', 'ISO-8859-5', 'Алексей');
$pdf->SetXY(60, 54);
$pdf->Write(0, $fullName);
Instead of this iconv solution:
$str = iconv('UTF-8', 'windows-1252', $str);
You could use the following:
$str = mb_convert_encoding($str, "Windows-1252", "UTF-8");
See: How to convert Windows-1252 characters to values in php?
There's an extension to FPDF called UFPDF:
http://acko.net/blog/ufpdf-unicode-utf-8-extension-for-fpdf/
But, IMHO, it's better to use mPDF if it's possible for you to change the class.
I use FPDF for ASP, where the iconv function is not available.
It seems strange, but I solved the UTF-8 problem by adding a fake image (a 1x1px JPEG) to the PDF, just after the AddPage() call:
pdf.Image "images/fpdf.jpg",0,0,1
This way, accented characters are added to my PDF correctly. Don't ask me why, but it works.
I know this question is old, but I think my answer may help those who haven't found a solution in the other answers. My problem was that I couldn't display Croatian characters in my PDF. At first I used FPDF, but as far as I can tell it does not support Unicode. What finally solved my problem was tFPDF, the version of FPDF that supports Unicode. This is the example that worked for me:
require('tFPDF/tfpdf.php');
$pdf = new tFPDF();
$pdf->AddPage();
$pdf->AddFont('DejaVu','','DejaVuSansCondensed.ttf',true);
$pdf->AddFont('DejaVu', 'B', 'DejaVuSansCondensed-Bold.ttf', true);
$pdf->SetFont('DejaVu','',14);
$txt = 'čćžšđČĆŽŠĐ';
$pdf->Write(8,$txt);
$pdf->Output();

How to draw Thai text to a PDF file using the libharu library

I am using the free PDF library libharu to generate a PDF file,
but I have an encoding problem: I cannot draw Thai text in the PDF file.
All the text shows up as "???..".
Does somebody know how to fix it?
Thanks
I have succeeded in rendering CJK text (not Thai, but Chinese and Japanese) using libharu. First of all, I used Unicode mode; please refer to the documentation of the HPDF_UseUTFEncodings() function.
For C, here is the sequence of libharu API calls needed to get past your problem:
HPDF_UseUTFEncodings(docHandle);
HPDF_SetCurrentEncoder(docHandle, "UTF-8");
Here docHandle is a valid HPDF_Doc object.
The next part is loading the fonts correctly for UTF-8:
const char * libFontName = HPDF_LoadTTFontFromFile(docHandle, fontFileName.c_str(), font_embed::EmbedFonts);
HPDF_Font font = HPDF_GetFont(docHandle, libFontName, "UTF-8");
After these calls you can render Unicode text containing Thai characters. Also note the embedding flag (the 3rd parameter of HPDF_LoadTTFontFromFile): without embedding, your PDF file may be unreadable on other machines because it only references external fonts. If output PDF size is not a concern, you can simply embed the fonts.
I've tested a couple of Thai .ttf fonts found via Google and they rendered OK. Also (it may be important, but I'm not sure), I'm using the fork of libharu at https://github.com/kdeforche/libharu, which is now merged into the master branch.
When you write text to the PDF, use the correct font and encoding. In the libharu documentation you have all the possibilities: https://github.com/libharu/libharu/wiki/Fonts
In your case, you must use the ISO8859-11 Thai, TIS 620-2569 character set
An example (in Spanish):
HPDF_Font fontEn = HPDF_GetFont(pdf, "Helvetica-Bold", "ISO8859-2");
HPDF_Page_TextOut(page1, 50.00, 750.00, [@"Código para correcta codificación en libharu" cStringUsingEncoding:NSISOLatin1StringEncoding]);
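If you go the single-byte route for Thai, the text handed to libharu has to be converted from UTF-8 to that code page first. The sketch below only illustrates the conversion step, in Python rather than C, and uses Python's own codec names (iso8859_11, tis_620), which are not libharu identifiers; the sample word is made up.

```python
# Illustration only: Thai text in the single-byte ISO-8859-11 / TIS-620 code page.
text = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"  # Thai greeting "sawasdee", 6 code points

encoded = text.encode("iso8859_11")   # one byte per code point
same_in_tis620 = text.encode("tis_620")

round_trip = encoded.decode("iso8859_11")

print(len(text), len(encoded))
print(encoded == same_in_tis620, round_trip == text)
```

In C the same conversion would typically be done with iconv() before calling HPDF_Page_TextOut, with the font obtained using the matching encoder name.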