How can I render a completed CGI form as a PDF? - perl

I have an HTML form which a user may have filled in or partially filled in. I want to snapshot that state and render it as a PDF document. I've been using wkhtmltopdf.
I've tried this from both the client side and the server side, and the rendered result is always the original form, never the filled-in one.
I notice if I reload the filled-in form page I get back the filled-in form, but if I cut and paste the form's URL into a new window, I get the initial, non-filled-in form.
So I've convinced myself that, if I could use CGI::Session properly, I could successfully open a session identical to the filled-in session. I tried using CGI::Session::Plugin::Redirect with no joy. I think the key is that window.open() has to use the SID of the filled-in form window.
I don't have a lot of experience with CGI session management, so this has been a four-day quest to nowhere. Any advice is appreciated, even if it's to abandon this approach and go back to the more common post->render a new form in a new window, and generate the PDF from that. I'd like to avoid all of that if I can.

Say you have the following HTML document on your web server:
/var/www/html/index.html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
<form action="/process.cgi">
<input type="text" name="foo">
</form>
</body>
</html>
When you navigate to http://hostname/index.html in your browser, the webserver returns this document and the browser displays it.
When you fill in the text field in your browser, the document on the webserver doesn't change. So anybody who navigates to http://hostname/index.html will get the original, unmodified form. This is why you can't simply copy and paste the URL into another browser tab and get the filled-in form.
Most browsers use caching by default. When you fill in some fields in a form, the browser caches what you entered. When you reload the page, the webserver sends the exact same document as before* (i.e. the unmodified form), but the browser uses the cached data to fill in the form fields the way you had them. If you override the cache when you reload the page (Ctrl+F5 in Firefox), the form fields will not be filled in. Note that neither the URL nor the document on the server has changed. This is why you can't copy and paste the URL into another browser tab after reloading the page and get the filled-in form.
wkhtmltopdf takes a URL, renders the corresponding page, and generates a PDF based on what is rendered. Based on the explanation above, it should now be clear why wkhtmltopdf always generates a PDF of the unmodified form: it requests the URL itself and never sees the values held in your browser.
The solution
If filling in form fields doesn't change anything on the webserver, what does it change? It changes the DOM, a structure describing the document in your browser that you can access using JavaScript.
One approach would be to use a client-side JavaScript PDF generator like jsPDF; since it runs on the client, it has access to the DOM that the user is interacting with, so it can "see" the values the user enters into the form fields.
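As a rough illustration of reading the live state from the DOM, the sketch below gathers the current values of a form's controls. The helper name `collectFormValues` is hypothetical, and it assumes only that each control exposes `name`, `type`, `value`, and `checked`, as browser form controls do. The result could then be handed to a client-side PDF generator or posted to the server.

```javascript
// Hypothetical helper: gather the values a user has entered into a form.
// `controls` is any array-like of form controls, e.g. form.elements in a browser.
function collectFormValues(controls) {
  const values = {};
  for (const control of Array.from(controls)) {
    // Skip unnamed controls; they are never submitted.
    if (!control.name) continue;
    // Unchecked checkboxes and radio buttons contribute nothing.
    const isToggle = control.type === 'checkbox' || control.type === 'radio';
    if (isToggle && !control.checked) continue;
    values[control.name] = control.value;
  }
  return values;
}
```

In a browser, `collectFormValues(document.forms[0].elements)` would report what is currently typed into the form, not what the server originally sent.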
* Actually, the webserver will typically send a 304 Not Modified response to save bandwidth, but form caching works the same either way.

The explanation from ThisSuitIsBlackNot is accurate about why your design is failing. Typing characters into form fields in a browser changes only your screen and the data in the memory allocated to the browser.
I suggest a different solution. The WWW::Mechanize::Firefox module is a variant of WWW::Mechanize that uses a real browser application to retrieve and render web pages. It is mostly chosen when a site requires JavaScript support, but it is useful here because it has a content_as_png method which returns a PNG image of the current page. Hopefully that is enough for you to build a PDF file with the required content.

Related

Using Adobe Test-and-Target, how do I avoid seeing the first page before the redirect kicks in?

I've got an A/B Test set up in Adobe Test-and-Target. The idea is that 50% of the time, visitors to a certain page should be redirected to a different page instead. It is working correctly, in that half of users are sent to the new page.
However, sometimes the entire original page is loaded before the redirect happens. I put the mbox in the head tag of the page, which I thought would ensure the redirect happened before any HTML was displayed to the user, but that's not happening.
How can I create a seamless result for the user, where the redirected users only see the new page loading, and never see the original page?
For our site, the <script src="http://maur.imageg.net/js/mbox.js"></script> element is at the very end of the head tag and works fine.
Your mbox.js should be as close to the top as possible and then your inline mbox should be defined preferably right after the tag. This way the request is made before the content starts to render, and the redirect kicks in before the guest even sees the page.
Avoid using anything post DOM related for example jQuery's:
$(document).ready(function () {});
If you paste the code you're using for the A/B test, we'll be able to review it and respond accordingly.
However, pure JavaScript and pure CSS should execute seamlessly.
You can first use CSS to hide everything:
<style>
body {display:none!important}
</style>
Then use JavaScript to redirect the page to the new page.
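A minimal sketch of that hide-then-redirect ordering is below. It is not Test-and-Target's own mechanism: the helper name `pickDestination`, the variant URL, and the `hide-body` id on the `<style>` element are all assumptions made for illustration, and a real test would also remember each visitor's assignment instead of rolling again on every load.

```javascript
// Hypothetical helper: pick a destination for a 50/50 split.
// `roll` is a number in [0, 1), for example from Math.random().
function pickDestination(roll, originalUrl, variantUrl) {
  return roll < 0.5 ? variantUrl : originalUrl;
}

// Browser-only part, meant to run inline in the head before the body renders.
if (typeof window !== 'undefined') {
  var here = window.location.href;
  // The variant URL is a placeholder, not a real address.
  var target = pickDestination(Math.random(), here, 'https://example.com/variant');
  if (target !== here) {
    // replace() keeps the original page out of the back-button history.
    window.location.replace(target);
  } else {
    // Staying on this page: remove the hiding rule so the body can display.
    // Assumes the <style> element was given id="hide-body".
    var hider = document.getElementById('hide-body');
    if (hider) hider.remove();
  }
}
```

Because the script runs synchronously in the head, the decision is made before any body content is painted.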

Indexing an HTML page that redirects onload

I have a pure GWT-based website and, as we are aware, search engines cannot index pure GWT-based websites. Thus, I have created an alternate web page, shown below, which is stored as a separate HTML file in the war folder. The purpose of this web page is to list details about my website so that they can be indexed. This page is never displayed on my website; it is meant only for indexing. The URL leading to this web page is part of the Sitemaps.xml, so I am assuming that the HTML below will be indexed. Here are my questions:
Will the content I give in the div with id "crawler" be indexed given the fact that it is scheduled for removal onload and that the browser is redirected to another url on load?
Is there a better way to get the content indexed for a pure GWT website which does not have any html based user interface?
I can also have urls that will invoke a servlet and return a response that is meant for indexing. But then the same url will be displayed in search results, which is not useful. In other words, I am trying to figure out a way in which the content gets indexed, but when the user clicks the search result he should be redirected to the home page instead of showing the indexed content.
<head>
<script>
function load(){
var element = document.getElementById("crawler");
element.parentNode.removeChild(element);
window.location.href='http://<mysite>.com';
}
</script>
</head>
<body onLoad='load()'>
<div id="crawler">
<CONTENT TO BE INDEXED>......
</div>
</body>
As you can see here, the div (crawler) that contains all the content meant for indexing is removed as soon as the body loads. Apart from this, the page also redirects to the home page of the site on load.
The crawler will read in the entire contents of the page for indexing, so it will have no trouble picking up the portion within the div. The onload is not executed by the crawler prior to reading the page.
A method I have used in the past was to generate static html versions of the pages and reference these through the sitemap.xml. Users landing on the html page would then be directed to the equivalent dynamic page when they click on a link (ie: Buy or Specifications). This worked well for search engine placement with many pages appearing in the top ten.
The best way to notify the search engines about an undiscoverable website's content is to create an HTML website (as you did). If you create redirects based on the crawler, search engines will not love you. I think you have to fill out your HTML with relevant content and add a
<link rel="canonical" href="https://gwtsite.com/exact_url"/>
tag to your website's head section. This will notify the search engines that the other site has to appear in the SERPs instead of the HTML one.
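To make the idea concrete, here is a small sketch that assembles such a static page with the canonical link in its head. The function names and the page layout are hypothetical; only the `rel="canonical"` link and the `crawler` div come from the discussion above.

```javascript
// Escape text for safe use inside HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: build a static, crawlable page that names the
// dynamic page as its canonical URL.
function buildIndexablePage(title, bodyText, canonicalUrl) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head>',
    '<title>' + escapeHtml(title) + '</title>',
    '<link rel="canonical" href="' + escapeHtml(canonicalUrl) + '"/>',
    '</head>',
    '<body>',
    '<div id="crawler">' + escapeHtml(bodyText) + '</div>',
    '</body>',
    '</html>',
  ].join('\n');
}
```

A build step could call this once per dynamic page and list the generated files in the sitemap.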

Jeditable displays entire HTML document as replacement for the editable field after trigger/submit

I am using Jeditable and it is behaving very strangely.
After I edit the editable field and submit it, instead of showing the new content it displays the entire HTML document in the text box (the placeholder for the editable content).
Question: in the example where the author used save.php, what was the content of save.php?
Is it necessary to send the result to a PHP file? Can't an HTML file work?
I believe that in the comments box at the bottom of the author's main page, somebody has kindly provided a version of the save.php file for people to use and modify as needed.
The save.php file is used to actually save the values of the editable field/s. Without it, nothing would happen to the data and it would reset to the default text if the page is refreshed.
Options instead of a PHP file could be:
Saving the text/select changes to a cookie
Using another server-side method such as ASP, JSP, Rails or .NET to process the saving of the changes.
An HTML page is a static page with no processing facility per se to communicate with the web server, so no, HTML is not suitable for such a need.
The saving script must return the string you want to display on the page after editing. You are currently returning a full HTML page.
Source for all demo files can be found on GitHub.
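The requirement that the saving script return only the display string can be sketched as follows. This is not the author's save.php: the function name is hypothetical, storage is left out, and it assumes Jeditable's default parameter names `id` and `value` in a URL-encoded POST body. The value is escaped on the assumption that the response is inserted into the page as markup.

```javascript
// Hypothetical handler logic: given the URL-encoded POST body, return only
// the text that should replace the editable element.
function saveResponseBody(postBody) {
  const params = new URLSearchParams(postBody);
  const value = params.get('value') || '';
  // ... persist params.get('id') together with value here (out of scope) ...
  // Respond with a plain escaped string, never a full HTML document.
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}
```

Whatever server-side language is used, the response body should contain just this string; a static HTML file cannot do that, which is why the whole document ends up in the field.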

ASP Classic - Passing form data to Iframe

I'm looking to pass data from a form into an iFrame, but I have a slight problem.
The form page I can edit with no restrictions
The page I send the data to I cannot edit unless it's HTML or JavaScript
The data needs to end up in an iframe within this page, which I can edit with no restrictions
I'm incorporating a search function into a CMS system which is why I cannot edit the iframe's parent page, and why I am using iframes at all.
At the moment the data is sent to the parent page but is not picked up within the iframe. I am sending it via the POST method.
I got it.
I added an extra page which converts the POST data into session data.
If anyone knows a better way, I would like to hear it though.
And they are on the same domain, but editing the CMS system would have taken ages to look through, as it's not mainstream or developed by me.
Maybe I'm oversimplifying the problem, but can't you use the "target" attribute of the form tag to post to the Iframe?
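A sketch of the `target` approach, for the case where the form and the iframe are on the same page. The helper name and the fallback name are hypothetical; the only DOM properties used are `name` and `target`.

```javascript
// Hypothetical helper: make a form submit into a named iframe.
// Works on real DOM elements; only `name` and `target` are touched.
function aimFormAtIframe(form, iframe, fallbackName) {
  // The iframe needs a name for the form's target to refer to.
  if (!iframe.name) iframe.name = fallbackName;
  form.target = iframe.name;
  return form;
}
```

In a browser this might be called as `aimFormAtIframe(document.forms[0], document.querySelector('iframe'), 'results')`, after which submitting the form loads the response inside the iframe.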

Form submit to iframe on new page

I have a form which submits to an iframe. This works fine if you are on a page with the iframe.
I want to be able to have the form on any page and, when submit is pressed, load a page and send the submission to the iframe
e.g.
On page "article.php" and press submit
Open page "results.php" and
Send post data from form clicked in "article.php" to iframe "DataHere" on "results.php"
Thanks in advance
You could try detecting if the frame exists when the form is submitted and if it does not, reload the whole page and generate the iFrame.
If you need a hand, check out http://www.java-scripts.net/javascripts/Check-Frames-Page-Script.phtml
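The frame check could look roughly like the sketch below. Both helper names are hypothetical; "DataHere" is the iframe name from the question, and `_self` is used so that a missing frame results in a normal full-page submit.

```javascript
// Hypothetical helper: is there an iframe with this name in the document?
function frameExists(doc, frameName) {
  return Array.from(doc.getElementsByTagName('iframe'))
    .some(function (frame) { return frame.name === frameName; });
}

// Choose where the form should submit: into the iframe if it is present,
// otherwise as a normal full-page load that can generate the iframe.
function chooseTarget(doc, frameName) {
  return frameExists(doc, frameName) ? frameName : '_self';
}
```

A submit handler could then set `form.target = chooseTarget(document, 'DataHere')` before the form is sent.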
If you are able to comfortably sanitize your initial POST data to avoid XSS, you could create an intermediate page for your iframe destination that does your POST for you:
On page article.php with <form action="results.php">, press submit.
results.php validates that the input isn't XSS, and renders with an <iframe src="negotiator.php?my=form&data=here"></iframe>
negotiator.php takes the query string arguments (and runs the same sanitizing as results.php) and POSTs them to your intended url.
Your results will load in the iframe.
It's pretty important that you make sure your input isn't insane. If your form requires arbitrary text, punctuation, and special characters, this is not safe for you.
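The validation-and-render step can be sketched as below, in JavaScript instead of PHP and purely as an illustration. The allow-list pattern and the function name are assumptions; a real form would need its own rules, and anything outside the list is rejected outright.

```javascript
// Hypothetical allow-list: letters, digits, spaces, and a few safe marks only.
const SAFE_VALUE = /^[A-Za-z0-9 _.-]*$/;

// Build the iframe src for negotiator.php from validated form fields.
function buildNegotiatorSrc(fields) {
  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(fields)) {
    if (!SAFE_VALUE.test(name) || !SAFE_VALUE.test(value)) {
      throw new Error('Rejected input outside the allow-list');
    }
    query.append(name, value);
  }
  return 'negotiator.php?' + query.toString();
}
```

For example, `buildNegotiatorSrc({ my: 'form', data: 'here' })` yields `negotiator.php?my=form&data=here`. When the result is written into an HTML attribute, the `&` separators should additionally be escaped as `&amp;`.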