Convert PDF file to HTML and create a template for e-signature (like DocuSign) - pdfhtml

We have a business case where a user uploads a PDF or Word document, creates a template, and marks the sections where the e-signature should be captured. Each user then uses this template to e-sign the document online. I can do this through DocuSign, but we want to integrate it within our Angular application. Suggestions are welcome.
Converting the PDF to HTML is one part, but we are still trying to find out how to let users mark the sections for e-signing.

Related

iText PDF - Radio button (on/off) not appearing for some PDFs

In our application we are using the iText PDF 5.5.3 library.
We have checked some PDFs in which checkboxes are displayed correctly (checked/unchecked).
However, there are some PDFs with radio buttons that do not display the radio button state (on/off) correctly.
I also used this link to validate the PDFs, together with this Java code:
String[] values = form.getAppearanceStates("Checkbox");
It returns null values.
I also tried iText RUPS and found that the PDFs that work show form field names in the RUPS Form tab, while the PDFs that do not work show no form fields.
I tried generating a PDF from a Word document; it does not display form fields in RUPS, and I cannot check/uncheck the checkbox in Adobe Acrobat Reader either.
What could be the solution to display radio buttons with their on/off state?
Edit -
I have created a sample web application to reproduce the issue.
Please set up the attached web application and let me know the fix for the issue.
Please download it from this link
You have successfully discovered the difference between interactive PDF forms and "flat" PDF documents that look like a form to the human eye, but that aren't interactive forms.
To make the "flat" forms interactive, you need to open those flat documents in PDF editing software (e.g. Adobe Acrobat) and you need to add a form field manually.
You can ask Acrobat to guess where it should add fields, but Acrobat will be wrong in many cases for obvious reasons. You always need a human if you want it to be done correctly.
As for creating an interactive PDF from Word... Forget about it. Use OpenOffice or LibreOffice.

ASP.NET web application to convert PDF to Word

Is there any clear and proper process to convert a PDF file into a Word file, with all formatting and images, in an ASP.NET web application?
The best way to do that is by using OCR. It will recognize the text and the images in the PDF file, and then you can save the result to a DOC file. I know a third-party toolkit named LEADTOOLS that should help you meet your requirements, since it supports the ASP.NET environment. You can check their Online OCR Demo.
Also, you can check their website for more information, or contact their support team.
PDF is a presentational format where all the content is placed at absolute positions. There are no paragraphs or other structural elements (unless it is a Tagged PDF). Technically, a PDF can output the characters of every word in any order and still look like normal text visually. Thus, a proper conversion to Word requires content recognition or some kind of OCR (e.g. ABBYY FineReader).
There are some paid components on the market that do text extraction, and some that convert pages to images (obviously, this is not a desirable approach for converting to Word).
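To illustrate why absolute positioning makes conversion hard, here is a minimal Python sketch of the simplest kind of content recognition: grouping positioned text fragments back into lines. It assumes some extraction library has already produced (x, y, text) fragments; the function name fragments_to_lines and the y_tolerance parameter are hypothetical names introduced for this illustration, not part of any real library. A real converter also has to deal with fonts, columns, rotation, and tables.

```python
from typing import List, Tuple

# (x, y, text); y grows upward, as in default PDF user space.
Fragment = Tuple[float, float, str]


def fragments_to_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[str]:
    """Group fragments with similar baselines into lines, top to bottom."""
    lines = []  # each entry: [baseline_y, [(x, text), ...]]
    # Visit fragments from the top of the page to the bottom.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            # Close enough to the current baseline: same visual line.
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    # Within a line, order fragments left to right before joining them.
    return [" ".join(t for _, t in sorted(parts)) for _, parts in lines]
```

Fragments that arrive in drawing order, for example "world" before "Hello", come out in reading order only because of the two sorts; nothing in the PDF itself says they belong to the same line.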

User Fill in for Adobe forms

I am using Adobe LiveCycle Designer to create documents in my application. I have all my documents in Word, and I use the "export to" option in Adobe LiveCycle Designer to get each document converted. Now I need to have a user fill in the exported document. Can someone please suggest how this would work? We use JavaScript behind it.
You could have them fill the form in Adobe land, then use the scripting method exportData to get the form data as XML, then inject that XML into your Word docx as a custom xml part.
From there, Word will use the XML in any content controls bound to it.

Converting large amounts of text and dynamic data into PDF

I have a three page Word document that needs to be converted into PDF. This Word document was given to me as a template to show me what the PDF output should look like. I tried converting this document into PDF, created a PDF form and used iTextSharp to open the form, populate it with data and return it back to the client. This is all great but due to large amounts of data stored, the placeholders were insufficient and the text would be truncated or hidden.
My second attempt was to create an MVC 2 View without master page, pass the model to the view, take the HTML representation of the View, pass it over to iTextSharp and render the PDF. The problem here was that iTextSharp failed on some tags (one of them was <hr> tag). I managed to get rid of the problematic tag, but then tables were not rendered properly. Namely, the border attribute was ignored so I ended up with borderless tables. That attempt failed.
I need a suggestion or advice on the most efficient way to create a PDF document in MVC 2 which would be maintainable in the long run. I really don't want my actions to be 200+ lines long. Working directly with the Word document is not the best solution as I have never worked with VSTO so I don't quite know what it would look like to open Word and manipulate text inside of it and add dynamic data and then convert that dynamically into PDF.
Any suggestion is highly welcome.
Best regards!
One thing that I've done in the past is to save the Word file as a DOCX and unzip it, since DOCX is just a renamed zip file. Within the archive, open up /word/document.xml and you'll see your document. There are a lot of weird XML tags in there, but overall you should get a pretty good idea of where your content is. Then just add placeholder text like {FIRST_NAME}, save the file and re-zip.
Then from code you can just perform the same steps, unzipping with something like SharpZipLib or DotNetZip, swapping placeholder copy, re-zipping and then using very simple Word automation to Save-As a PDF.
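The unzip, swap and re-zip steps can be sketched as follows. This is a minimal illustration in Python using only the standard library, rather than the SharpZipLib or DotNetZip calls mentioned above; the function name fill_docx_template is hypothetical. It assumes each placeholder such as {FIRST_NAME} sits inside a single text run, which is not guaranteed, because Word sometimes splits typed text across several runs. The final Save-As-PDF step through Word automation is not shown.

```python
import io
import zipfile
from typing import Dict
from xml.sax.saxutils import escape


def fill_docx_template(template_bytes: bytes, values: Dict[str, str]) -> bytes:
    """Return a copy of the DOCX with {NAME} tokens replaced in word/document.xml."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                data = src.read(item.filename)
                if item.filename == "word/document.xml":
                    xml = data.decode("utf-8")
                    for name, value in values.items():
                        # Escape the value so characters like & and < keep the XML well-formed.
                        xml = xml.replace("{" + name + "}", escape(value))
                    data = xml.encode("utf-8")
                # Every other part of the archive is copied through unchanged.
                dst.writestr(item, data)
    return out.getvalue()
```

Usage would look something like fill_docx_template(open("template.docx", "rb").read(), {"FIRST_NAME": "Ann"}), with the returned bytes written to a new .docx file.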
The other route is to fully utilize iTextSharp and actually write Paragraphs and PdfPTable and everything else. It takes a lot longer to setup but would give you the most control.
Q: You say "... but due to large amounts of data stored, the placeholders were insufficient and the text would be truncated or hidden".
How do you end up having too much data? If the Word template can "hold" the data in 3 pages, it should fit in 3 PDF pages.
I used to use iTextSharp to create my PDFs, but I almost always ended up building the PDF document from scratch myself (not really a <200 line solution). Have you considered another library? I recently switched to MigraDoc's PDFSharp. It is way simpler to use than iText, with lots of examples and docs.
Just my two cents
The Word document object model is quite easy to understand. The document body contains a series of paragraphs and tables. Using the Open XML SDK, you can iterate through each paragraph or table in the Word document and retrieve its content and styles. Then you can generate the PDF document on the fly from the retrieved information. This will work under MVC too.
But if your Word document contains complex elements, this approach will take more time to implement. Also, it only works with Word 2007 and 2010 files.
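As a rough illustration of that iteration, here is a minimal Python sketch that reads word/document.xml directly with the standard library instead of going through the Open XML SDK. The names iter_body_blocks and _text are hypothetical helpers for this example. It only handles top-level paragraphs and simple tables, and it ignores styles entirely.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}tag form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _text(element):
    """Concatenate the text of all w:t runs found below the element."""
    return "".join(t.text or "" for t in element.iter(W + "t"))


def iter_body_blocks(docx_bytes):
    """Yield ("paragraph", text) or ("table", rows) for each top-level body element."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    body = root.find(W + "body")
    if body is None:
        return
    for child in body:
        if child.tag == W + "p":
            yield "paragraph", _text(child)
        elif child.tag == W + "tbl":
            # One list of cell texts per table row.
            rows = [[_text(tc) for tc in tr.findall(W + "tc")]
                    for tr in child.findall(W + "tr")]
            yield "table", rows
```

Each yielded block could then be mapped to the matching PDF construct, for example a paragraph or a table in the PDF library of your choice.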
Also, the HTML-to-PDF options currently available in the iTextSharp library work only with a known set of tags, as far as I know.
Another suggestion is to make use of commercially available .NET components. There are a lot of good solutions available, for example Syncfusion.

How can I output tables to a PDF file with the iPhone SDK?

I want to output a PDF using UIKit's PDF creation methods. I see plenty of information on the web about creating a graphic context in a PDF, but I want to create smart text tables whose cells the user can later copy and paste into other applications (Word, Excel, etc.). How do I do this?
Unfortunately, that's not trivial. I recommend the libharu PDF library for iPhone as a good starting point.