iText HTML to PDF: content runs off the page - itext

I'm trying to convert this piece of html without any css:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
</head>
<body>
<div id="184981f8-654a-4e90-a0f5-e75d1edaf2ca" class="act">
<div id="b995877a-0d3c-439f-984e-f9f809d124a5" class="footnotes">
<table>
<tbody>
<tr>
<td id="f29aca16-143d-1fc6-8f6a-d2aa116cde25">1</td>
<td>Ezechiel HAVRENNE is a lecturer at the University of Luxembourg on Investment Funds. Views expressed
in this article reflect some of the author’s experience to date on the subject matter. As the
Luxembourg investment fund market continues to develop these views may – and will most likely
continue to – evolve in one way or another. This article should in no way be construed as legal,
business or structuring advice rendered by the author or any other entity, nor should it be
construed as reflecting the views of such entity(ies)
</td>
</tr>
<tr>
<td id="434b1865-a5ea-1f96-b0fa-09ea9e4fb76a">2</td>
<td>The Preqin Quarterly Update: Private Debt, Q3 2020, 7 October 2020, page 12; <a
id="0e11d32d-c25b-65c1-8266-39da10bb62f3"
href="https://www.preqin.com/insights/research/quarterly-updates/preqin-quarterly-update-private-debt-q3-2020"
target="_blank" class="tech_external" rel="noopener">https://www.preqin.com/insights/research/quarterly-updates/preqin-quarterly-update-private-debt-q3-2020</a>
(accessed 15 March 2021). These figures drastically contrast with those reported by Lipper as of
October 2016, whereby “<em>the gross AuM of all funds that invest primarily in loan participations
was approximately USD 218 billon</em>” as mentioned in IOSCO’s final report; IOSCO
FR03/2017, ib., page 4
</td>
</tr>
<tr>
<td id="6bf035e5-d434-1eec-a550-58147bed84a0">3</td>
<td>According to EU recommendation 2003/361, 2 factors determine whether a business is an SME: (i) the
number of employees and (ii) either turnover or balance sheet total. A medium-sized company has up
to 250 employees, a turnover of up to €50 million or a balance sheet total of up to €43 million.
A small-sized company has up to 50 employees & a turnover or balance sheet total of up to €10
million. A micro-company has up to 10 employees & a turnover or balance sheet total of up to
€2 million
</td>
</tr>
<tr>
<td id="5028557e-4efe-1066-9fd4-28809a6d0653">4</td>
<td>For instance, one of the driving forces that has led European jurisdictions to consider permitting
funds to originate loans was the adoption of the EU regulation on European long-term investment
funds allowing funds the origination of loans under certain conditions. As a result, many
jurisdictions in Europe now allow loan originations by funds
</td>
</tr>
<tr>
<td id="cd0ac4df-9139-1c0a-9dd0-c15cca78845a">5</td>
<td>See IOSCO’s final report FR03/2017, <em>Findings of the Survey on Loan Funds</em>, February 2017,
page 4 <a id="76d9ff09-04f9-61a4-a311-2cfee0e19245"
href="https://www.iosco.org/library/pubdocs/pdf/IOSCOPD555.pdf" target="_blank"
class="tech_external" rel="noopener">https://www.iosco.org/library/pubdocs/pdf/IOSCOPD555.pdf</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="a0dd548b-cfa4-182c-9472-624a6be46538">6</td>
<td>See the Glossary of Summaries published on EUR-Lex, <a id="3052c250-b9c1-60f7-b36c-45ab06665101"
href="https://eur-lex.europa.eu/summary/glossary/sme.html"
target="_blank" class="tech_external"
rel="noopener">https://eur-lex.europa.eu/summary/glossary/sme.html</a>
(accessed 13 April 2021) as well as the European Commission’s page titled “<em>Access to finance
for SMEs</em>”,<a id="b8b721ff-fd48-67aa-aaac-e5b1d0d02b60"
href="https://ec.europa.eu/growth/access-to-finance_en" target="_blank"
class="tech_external" rel="noopener">
https://ec.europa.eu/growth/access-to-finance_en</a> (accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="d98d8f00-f797-1b37-9540-36713cfdc8a7">7</td>
<td><em>Ib.</em></td>
</tr>
<tr>
<td id="3868e384-a464-1b26-933a-8ec3a95f86d5">8</td>
<td>For more information see <a id="dc357707-f043-68ce-a7bc-c9a5d9d86c7d"
href="https://ec.europa.eu/growth/smes/cosme_en" target="_blank"
class="tech_external" rel="noopener">https://ec.europa.eu/growth/smes/cosme_en</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="6766e322-fdf8-16b8-99e4-006e43fdecbd">9</td>
<td>See the European Commission’s page titled “COSME Financial Instruments”, <a
id="62cbd917-994d-6388-b0db-786a5c792685"
href="https://ec.europa.eu/growth/access-to-finance/cosme-financial-instruments_en"
target="_blank" class="tech_external" rel="noopener">https://ec.europa.eu/growth/access-to-finance/cosme-financial-instruments_en</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="11773190-b10f-1399-b71f-3a5fcfa5a5fc">10</td>
<td>Even if the eligibility for participation in the COSME LGF programme was extended to Loan
Origination funds it does not appear from the EIF’s register published as at 31 January 2021 that
any would have made the list. See<a id="cf5536ce-bff2-6220-9ed7-e4011b938b0e"
href="https://www.eif.org/what_we_do/guarantees/single_eu_debt_instrument/cosme-loan-facility-growth/cosme_lgf_signatures.pdf"
target="_blank" class="tech_external" rel="noopener">
https://www.eif.org/what_we_do/guarantees/single_eu_debt_instrument/cosme-loan-facility-growth/cosme_lgf_signatures.pdf</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="12b455e1-ceff-10b6-ba3d-df5b441fe989">11</td>
<td>Those associated countries include Iceland, Montenegro, Turkey, the Republic of North Macedonia,
Albania, Serbia, Bosnia and Herzegovina, and Kosovo
</td>
</tr>
<tr>
<td id="d8103a16-44fa-1096-8295-d478456b0117">12</td>
<td>Connor Hussey, Luxembourg private debt industry grows 36% from 2019, Private Funds CFO, 3 December
2020, <a id="0facc75b-6776-606c-b47d-e2025d559bf2"
href="https://www.privatefundscfo.com/luxembourg-private-debt-industry-grows-36-2-from-2019"
target="_blank" class="tech_external" rel="noopener">https://www.privatefundscfo.com/luxembourg-private-debt-industry-grows-36-2-from-2019</a>/
(accessed 13 April 2021). These figures should be in line with the then reality based on the 2017
final report of IOSCO whereby it stated that “<em>in Luxembourg, the net AuM of all domestic Loan
Funds (i.e., Funds with their primary activity engaged in lending and across various loan
activities, encompassing also activities such as microfinance, real estate debt or
infrastructure financing) is EUR 37.3 bn, constituting 1% of all domestic Funds</em>”, IOSCO
FR03/2017, ib., page 9
</td>
</tr>
<tr>
<td id="228c3276-de18-1393-9860-66ff5272b741">13</td>
<td>KPMG – ALFI Private Debt Fund Survey 2020, pages 4 and 5, <br><a
id="6d4a0dff-557a-603a-8b28-c47bd843b6b4"
href="https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf"
target="_blank" class="tech_external" rel="noopener">https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf?</a>utm_source=Sailthru&utm_medium=email&utm_campaign=Loan%20Note%203%20December%202020&utm_term=PDI_LONENOTE_SUBSCRIBER<br>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</body>
</html>
But the content gets cropped every time I run HtmlConverter.convertToPdf() with the HTML content as a string, giving this as the result:
However, when I remove the last tr element, I get the expected result:
What do you think is causing this? Is it because the table element has too many children?
--- Question Update ----
After reading the comment from CptCave, I tried changing the HTML to the following format, using the word-break CSS property, which is supposed to work in this case:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<style>
.word-break{
word-break: break-all;
}
</style>
</head>
<body>
<div id="b995877a-0d3c-439f-984e-f9f809d124a5" class="footnotes">
<table class="word-break">
<tbody>
<tr>
<td id="7673aebd-bc37-198d-932f-987fb16fb503">94</td>
<td>See ESMA Consultation Paper Guidelines on transaction reporting, reference data, order record
keeping & clock synchronisation, 23 December 2015, ESMA/2015/1909, p. 49; <a
id="5326eab7-02a4-69ec-9069-2d0c8eb5f180"
href="https://www.esma.europa.eu/sites/default/files/library/2015-1909_guidelines_on_transaction_reporting_reference_data_order_record_keeping_and_clock_synchronisation.pdf"
target="_blank" class="tech_external" rel="noopener">https://www.esma.europa.eu/sites/default/files/library/2015-1909_guidelines_on_transaction_reporting_reference_data_order_record_keeping_and_clock_synchronisation.pdf</a>
(accessed on 13 April 2021)
</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>
However, I got this as the result:
The solution was to add inline CSS:
<table style="word-wrap: break-word"/>
To accomplish this, I changed the document structure with jsoup before converting it:
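// Requires: import org.jsoup.Jsoup; import org.jsoup.nodes.Document;
Document document = Jsoup.parse(html);
// Set an inline wrapping style on every table (this replaces any existing style attribute)
document.getElementsByTag("table").forEach(table -> {
table.attr("style", "word-wrap: break-word");
});
// document.outerHtml() then yields the modified markup to pass to the converter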

As far as I can see, your issue is caused by the lack of word wrapping. Your last table row contains a long uninterrupted string: the link with the UTM parameters. If you remove the UTM parameters from it, the cropping goes away.
<tr>
<td id="228c3276-de18-1393-9860-66ff5272b741">13</td>
<td>KPMG – ALFI Private Debt Fund Survey 2020, pages 4 and 5, <br><a
id="6d4a0dff-557a-603a-8b28-c47bd843b6b4"
href="https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf"
target="_blank" class="tech_external" rel="noopener">https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf</a><br>
</td>
</tr>
The more durable solution is to enable word wrapping in CSS by setting the overflow-wrap property to break-word.
There is a full example of this in the iText KB: https://kb.itextpdf.com/home/it7kb/examples/pdfhtml-support-for-overflow-wrap-word-break-css-properties
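As a rough sketch of that approach in Java, assuming the HTML is available as a string and that the pdfHTML version in use honors overflow-wrap as the KB article describes: the helper below adds a style rule to the document before conversion. The class name WrapStyle, the method inject, and the choice of the td selector are illustrative assumptions, not part of iText.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of iText): adds a wrapping rule to an HTML string.
final class WrapStyle {
    // Assumed rule: let long unbroken strings (such as URLs) wrap inside table cells.
    static final String RULE = "<style>td { overflow-wrap: break-word; }</style>";

    private static final Pattern HEAD_CLOSE =
            Pattern.compile("</head>", Pattern.CASE_INSENSITIVE);

    // Inserts RULE just before the first </head>; prepends it if there is no head.
    static String inject(String html) {
        Matcher m = HEAD_CLOSE.matcher(html);
        if (!m.find()) {
            return RULE + html;
        }
        return html.substring(0, m.start()) + RULE + html.substring(m.start());
    }
}
```

The returned string can then be passed to HtmlConverter.convertToPdf() in place of the original HTML. If the rule has no effect in a given setup, the inline style attribute shown in the question is the fallback that was reported to work.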

Related

How to auto-fill form fields based on data from a table?

I have a form whose data is submitted to a table.
Using the column 'month_date_show', I would like to auto-fill the form
if that date exists in the table, since the form is only submitted once a month.
If the date exists in the table, the user will be able to edit those results; if it does not exist,
there is no need to auto-fill, and the user will submit new results for that month.
I'm not sure whether this requires Ajax or JavaScript.
Right now I have hard-coded the values in the form, but I would like to make it dynamic.
<cfquery datasource ="intranet" name="GetSummary">
SELECT * from cse_result_summary
</cfquery>
<form method="post" name="myform" action="cse_execoffice_datepicker_test.cfm" onsubmit="return validateForm()">
<table >
<tr>
<td>
<input type="text" id="dpMonthYear" NAME="month_date_show" value="9/2014" style="width:80px;" />
</td>
<td>
<img alt="Month/Year Picker" onclick="showCalendarControl('dpMonthYear');"
src="pictures/datepicker.gif" />
</td>
</tr>
</table>
<table >
<tbody>
<tr>
<td> Rising Star Award Winner:</td>
<td><input type="text" name="risingstar" size="50" class="get_branches_departments_displaynum" value="john"></td>
</tr>
<tr>
<td>Department Average:</td>
<td><input type="text" name="risingstar_ave" size="8" class="get_branches_departments_displaynum" value="5"></td>
</tr>
<tr>
<td> Rising Star Award Winner runner-up:</td>
<td><input type="text" name="risingstar_runner" size="50" class="get_branches_departments_displaynum" value="joe"></td>
</tr>
</tbody>
</table>
<p><input type="submit" name="Submit" value="Submit"></p>
</form>
Well, if you are actually storing the month and year of each submitted form, and assuming that users only submit the form for the current month and year (i.e. it is September 2014 now, so users would be submitting the form for 09/2014 and not for any other month, e.g. 08/2014), then by the time a user gets to your page you can already tell whether a new form is needed by querying the database.
You can then populate the form accordingly.
It depends. If you don't care whether the page refreshes for the user, just submit the form onChange and run a query to select the information for that date:
SELECT winner, average, runup
FROM tablename
WHERE datecol = <cfqueryparam cfsqltype="<whatevertypeappropriate>" value="#val_name#">
In your form, just have the value attribute assigned dynamically:
<input type="text" name="winner" value="#queryname.winner#">
If you don't want them to have to resubmit the form, I think there'll have to be some kind of binding.

Mailchimp repeatable blocks issue

I have the following table in my template:
<table ...>
<tr mc:repeatable mc:hideable>
<td mc:edit="mc-news-item-image"><img .../></td>
<td mc:edit="mc-news-item-h3"><h3>Lorem ipsum.</h3></td>
<td mc:edit="mc-news-item-date"><span>22.10.13</span></td>
</tr>
</table>
In edit-campaign mode, if I try to duplicate a row I get a copy of it, but I can't edit any block in the template. Any help would be much appreciated!
It appears that Mailchimp doesn't allow the hideable and repeatable attributes on the same table/div.
So you will have to use one or the other.

Creating two grids side by side, each with two columns, using Dojo

I want to have two grids side by side, each containing two columns. I need complete source code for this using Dojo.
Basically, I am comparing data between two employees.
I have already created one grid that displays one employee's data, but I failed to create another grid beside it. I need help displaying two grids side by side using Dojo.
Have you tried setting up two different grids in two different nodes?
Example:
<div id="emp1"> insert grid1</div> <div id="emp2">insert grid2</div>
Make two different stores for the grids and start up both separately.
Regards
<table dojoType="dojox.grid.DataGrid" id="table1">
<thead>
<tr>
<th width="300px" field="Title">
Title of Movie
</th>
<th width="50px">
Year
</th>
</tr>
<tr>
<th colspan="2">
Producer
</th>
</tr>
</thead>
</table>
<table dojoType="dojox.grid.DataGrid" id="table2">
<thead>
<tr>
<th width="300px" field="Title">
Title of Movie
</th>
<th width="50px">
Year
</th>
</tr>
<tr>
<th colspan="2">
Producer
</th>
</tr>
</thead>
</table>
Then, using JS, attach a store to each grid to initialize it:
var table1 = dijit.byId('table1');
var store1 = new ...whatEverStore;
table1.setStore(store1);
//do same for table2
Hope this helps

storeTextPresent by ID selenium IDE

---Jump down to my edit with a simplistic example---
I have searched ad nauseam and spent hours getting this close, but I still have not solved my automation problem.
Here's the thing: I convert the local paper from the print edition to the online edition through PDFs. The content gets pushed to the website but is not live until I go in and edit some settings, a lot of which are redundant. So if I can get past this one point I'm golden and can shave literally hours off the time it takes to do this work.
The paper has twenty or thirty writers, and the one_off_byline can vary a bit. Here are some examples:
id=id_one_off_byline value="Michael Reid"; however, it may also look like the next two, or some variation thereof.
id=id_one_off_byline value="By Michael Reid"
id=id_one_off_byline value="By Michael Reid - Your Daily Paper"
I have used storeTextPresent to find Michael on the page. The problem, however, is that there is another select box on the page that contains every writer's name, which is what I'm actually trying to populate. So here is what I have:
<tr>
<td>storeTextPresent</td>
<td>Jeff</td>
<td>IsTextAppears</td>
</tr>
<tr>
<td>gotoIf</td>
<td>${isTextAppears}</td>
<td>Jeff</td>
</tr>
<tr>
<td>storeTextPresent</td>
<td>Graham</td>
<td>IsTextAppears</td>
</tr>
<tr>
<td>gotoIf</td>
<td>${isTextAppears}</td>
<td>Graham</td>
</tr>
<tr>
<td>label</td>
<td>Jeff</td>
</tr>
<tr>
<td>label</td>
<td>Graham</td>
</tr>
Another way to phrase this, I hope:
Form Field 1 (ignore the syntax, I'm just setting up an example here; the id= is the important part in the two form fields).
<select name="select" id="pick_animal" value="">
< option="the fast cat" id="1">
< option="the fast dog" id="2">
</select>
Form Field 2
<input type="text" id="some_animal" value="the fast cat from dover">
I need to detect that id="some_animal" contains cat
so I can perform an action on the correct option in field 1.
I can do the second part just fine; I just can't detect "cat" in only the input with id "some_animal".
storeTextPresent just looks for cat anywhere on the page. Ugh!!
<tr>
<td>storeTextPresent</td>
<td>Jeff</td>
<td>IsTextAppears</td>
</tr>
<tr>
<td>gotoIf</td>
<td>${IsTextAppears}</td>
<td>Jeff</td>
</tr>
<tr>
<td>goto</td>
<td>END</td>
<td></td>
</tr>
<tr>
<td>label</td>
<td>Jeff</td>
<td></td>
</tr>
<tr>
<td>echo</td>
<td>something</td>
<td></td>
</tr>
<tr>
<td>label</td>
<td>END</td>
<td></td>
</tr>
Try like this.

jQuery .each function across browsers (works in ff, ie8, not ie7)

I've been messing with this for far too long, and managed to get IE8 working, but IE7 has me stumped.
I've got a table, and for each column, I am trying to extract a number of divs. I am only extracting divs which match specific selectors, not all divs in the column.
My original jquery selector was
jQuery('div.a1, div.a3, div.a4, div.a7','table#a'+tableId+' td:nth-child('+columnNum+')').each(function(){
alert(jQuery(this).attr('id'));
});
This worked great in FF, but didn't trigger the .each function at all in IE.
After messing around for a bit, I got to
jQuery('td:nth-child('+columnNum+') > div.a1, td:nth-child('+columnNum+') > div.a3, td:nth-child('+columnNum+') > div.a4, td:nth-child('+columnNum+') > div.a7', 'table#a'+tableId).each(function(){
alert(jQuery(this).attr('id'));
});
Not so nice, but works in IE8.
I had tried all sorts of combinations using .eq(+'columnNum+') but nothing else was working.
Now I go and test in IE7, and again the .each isn't being triggered.
What is the nicest way (and cross-browser compatible) to work with this sort of .each element?
--------------addition--------------
After further testing and playing around with suggestions from DrJ and bdukes, I've found that the table#'+tableId breaks the function in both IE7&8.
I've gone back to my original code
jQuery('div.a1, div.a3, div.a4, div.a7','table#a'+tableId+' td:nth-child('+columnNum+')').each(function(){
alert(jQuery(this).attr('id'));
});
as that seems to me the most efficient.
If I remove 'table#a'+tableId, I get the correct response in all browsers, except that it combines the results from all tables, and I need to be able to get only the results from one table at a time.
I have also tried 'table#a'+tableId+'>td:nth-child('+columnNum+')').each, but that doesn't work either.
The first function I used works perfectly in Firefox.
----------------the html being selected---------------------------
The tables are created dynamically in JavaScript, so I can't really copy and paste the markup, but here is what the output looks like. It ends up looking somewhat like a Gantt chart in a table.
<table id="a1">
<tr>
<th colspan="5">
Group Name
</th>
</tr>
<tr class="rowId1" >
<td>
<div class="a1" id="a43" style="margin-left:13px; width:60px" ></div>
</td>
<td>
</td>
<td>
<div class="a3" id="a93" style="margin-left:4px; width: 80px" ></div>
<div class="a2" id="a94" style="margin-left:4px; width: 30px" ></div>
</td>
<td>
<div class="a1" id="a24" style="margin-left: 15px; width: 65px;" ></div>
</td>
<td>
</td>
</tr>
<tr class="rowId1" >
<td>
<div class="a7" id="a24" style="margin-left:10px; width:60px" ></div>
</td>
<td>
<div class="a2" id="a15" style="margin-left:14px; width: 22px" ></div>
</td>
<td>
<div class="a2" id="a105" style="margin-left: 8px; width: 50px" ></div>
</td>
<td>
</td>
<td>
<div class="a4" id="a102" style="margin-left: 5px; width: 45px;" ></div>
</td>
</tr>
</table>
It turns out this was an issue with IE failing when two different elements have the same ID. Apparently this breaks the .each function.
I had two tables:
table.notes#a1 & table.inputs#a1
The .each function should have gone through each table but instead found neither.
jQuery also wouldn't run in IE with
jQuery('div.a1, div.a3, div.a4, div.a7','table.inputs#a'+tableId+' td:nth-child('+columnNum+')').each(function(){
alert(jQuery(this).attr('id'));
});
which it should have done, as I am then pointing directly to a specific table even if the id is not unique.
I'm using ids retrieved from the database, and IE doesn't like ids that start with numbers, so I had just added an 'a' to the beginning of each id.
However, IE apparently doesn't like that either, so now I'm adding the first letter of the class and then the '1' or whatever the id number is.
This solves the issue.