Puppeteer Generate PDF from multiple HTML strings - html-to-pdf

I am using Puppeteer to generate PDF files from HTML strings.
Reading the documentation, I found two ways of generating the PDF files:
First, passing a URL and calling the goto method as follows:
await page.goto('https://example.com');
await page.pdf({format: 'A4'});
The second one, which is my case, is calling the setContent method as follows:
await page.setContent('<p>Hello, world!</p>');
await page.pdf({format: 'A4'});
The thing is that I have 3 different HTML strings sent from the client, and I want to generate a single PDF file with 3 pages (one page per HTML string).
Is there a way of doing this with Puppeteer? I am open to other suggestions, but I need to use headless Chrome.

I was able to do this as follows:
Generate 3 different PDFs with Puppeteer. You can either save each file locally or keep it in a variable.
I saved the files locally because all the PDF merge plugins that I found only accept file paths or URLs; they don't accept buffers, for instance. After generating the PDFs locally one after another, I merged them using PDF Easy Merge.
The code looks like this:
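// Run this inside an async function, since it uses await.
const path = require('path');
const puppeteer = require('puppeteer');
const pdfMerge = require('easy-pdf-merge'); // assumption: "PDF Easy Merge" is the easy-pdf-merge package
const page1 = '<h1>HTML from page1</h1>';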
const page2 = '<h1>HTML from page2</h1>';
const page3 = '<h1>HTML from page3</h1>';
const browser = await puppeteer.launch();
const tab = await browser.newPage();
await tab.setContent(page1);
await tab.pdf({ path: './page1.pdf' });
await tab.setContent(page2);
await tab.pdf({ path: './page2.pdf' });
await tab.setContent(page3);
await tab.pdf({ path: './page3.pdf' });
await browser.close();
pdfMerge([
'./page1.pdf',
'./page2.pdf',
'./page3.pdf',
],
path.join(__dirname, `./mergedFile.pdf`), async (err) => {
if (err) return console.log(err);
console.log('Successfully merged!');
})
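An alternative that avoids the merge step altogether, sketched here under the assumption that the HTML strings are body-level fragments (not full documents with their own head and styles): join them into one document and force a page break between them with CSS. The helper name combineHtmlPages is made up for illustration.

```javascript
// Wrap each HTML fragment in its own block and force a page break after
// every block except the last, so each fragment starts on a new PDF page.
function combineHtmlPages(fragments) {
  const sections = fragments.map((html, i) => {
    const isLast = i === fragments.length - 1;
    // break-after is the current property; page-break-after is the legacy alias
    const style = isLast ? '' : ' style="break-after: page; page-break-after: always;"';
    return `<section${style}>${html}</section>`;
  });
  return `<!DOCTYPE html><html><body>${sections.join('')}</body></html>`;
}

// Hypothetical usage with the `tab` object from the answer above:
//   await tab.setContent(combineHtmlPages([page1, page2, page3]));
//   await tab.pdf({ path: './mergedFile.pdf', format: 'A4' });
```

Note that a fragment longer than one page will still spill onto the next page, and styles from different fragments can affect each other because they end up in the same document.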

I was able to generate PDFs from multiple URLs and merge them with the code below:
package.json
{
............
............
"dependencies": {
"puppeteer": "^1.1.1",
"easy-pdf-merge": "0.1.3"
}
..............
..............
}
index.js
const puppeteer = require('puppeteer');
const merge = require('easy-pdf-merge');
var pdfUrls = ["http://www.google.com","http://www.yahoo.com"];
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
var pdfFiles=[];
for(var i=0; i<pdfUrls.length; i++){
await page.goto(pdfUrls[i], {waitUntil: 'networkidle2'});
var pdfFileName = 'sample'+(i+1)+'.pdf';
pdfFiles.push(pdfFileName);
await page.pdf({path: pdfFileName, format: 'A4'});
}
await browser.close();
await mergeMultiplePDF(pdfFiles);
})();
const mergeMultiplePDF = (pdfFiles) => {
return new Promise((resolve, reject) => {
merge(pdfFiles,'samplefinal.pdf',function(err){
if(err){
console.log(err);
// return so that the promise is not also resolved after a failure
return reject(err);
}
console.log('Success');
resolve();
});
});
};
RUN Command: node index.js

pdf-merger-js is another option. page.setContent should work just the same as a drop-in replacement for page.goto below:
const PDFMerger = require("pdf-merger-js"); // 3.4.0
const puppeteer = require("puppeteer"); // 14.1.1
const urls = [
"https://news.ycombinator.com",
"https://en.wikipedia.org",
"https://www.example.com",
// ...
];
const filename = "merged.pdf";
let browser;
(async () => {
browser = await puppeteer.launch();
const [page] = await browser.pages();
const merger = new PDFMerger();
for (const url of urls) {
await page.goto(url);
merger.add(await page.pdf());
}
await merger.save(filename);
})()
.catch(err => console.error(err))
.finally(() => browser?.close())
;
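For the HTML-string case from the question, the loop above might look roughly like this. It is only a sketch: the function name mergeHtmlStrings is made up, and the page and merger objects are passed in (a Puppeteer page and a pdf-merger-js instance are assumed), so the snippet itself needs no imports.

```javascript
// Render each HTML string with an already-open Puppeteer page and hand the
// resulting PDF buffer to a merger (e.g. a pdf-merger-js instance).
async function mergeHtmlStrings(page, merger, htmlStrings, filename) {
  for (const html of htmlStrings) {
    await page.setContent(html);
    // awaiting is harmless if merger.add is synchronous in the installed version
    await merger.add(await page.pdf({ format: 'A4' }));
  }
  await merger.save(filename);
  return filename;
}
```

Inside the async function above, the for loop and the save call could then be replaced with something like await mergeHtmlStrings(page, new PDFMerger(), [page1, page2, page3], filename).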

Related

Redirecting url with Puppeteer by changing url

I am trying to change my request URL and see the new URL in the response:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on('request', interceptedRequest => {
if (interceptedRequest.url().includes('some-string')) {
interceptedRequest.respond({
status: 302,
headers: {
url: 'www.new.url.com'
},
})
}
interceptedRequest.continue()
});
page.on('response', response => {
console.log(response.url())
})
await page.goto('www.orginal.url.com')
// some code omitted
})();
In the interceptedRequest.respond call I'm trying to update the value of the url. Originally I tried:
interceptedRequest.continue({url: 'www.new.url.com'})
but that approach is no longer supported in the current version of Puppeteer.
I was expecting to get www.new.url.com in the response, but I actually get the original URL with www.new.url.com appended to the end.
Thanks in advance for any help.
This worked for me: you need to change the url header to location.
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setRequestInterception(true);
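page.on('request', interceptedRequest => {
if (interceptedRequest.url().includes('some-string')) {
// a location without a scheme is resolved relative to the current URL;
// use an absolute URL (https://...) in practice
interceptedRequest.respond({
status: 302,
headers: {
location: 'www.new.url.com'
},
})
} else {
// every intercepted request must be handled, otherwise it stalls
interceptedRequest.continue()
}
});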
page.on('response', response => {
console.log(response.url())
})
await page.goto('www.orginal.url.com')
// some code omitted
})();
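The important detail in the handler is that each intercepted request is handled exactly once: either respond or continue, never both and never neither. A small sketch of that idea, with a made-up helper name (makeRedirectHandler) and a lookup table of redirects as assumptions:

```javascript
// Decide once per request: answer with a redirect, or let it through.
// `redirects` maps a substring of the original URL to an absolute target URL.
function makeRedirectHandler(redirects) {
  return function onRequest(request) {
    const url = request.url();
    for (const [needle, target] of Object.entries(redirects)) {
      if (url.includes(needle)) {
        // return here so continue() is never called on a handled request
        return request.respond({ status: 302, headers: { location: target } });
      }
    }
    return request.continue();
  };
}

// Hypothetical usage:
//   await page.setRequestInterception(true);
//   page.on('request', makeRedirectHandler({ 'some-string': 'https://www.new.url.com' }));
```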

Unit and integration test of Express REST API and multer single file update middleware

Introduction
Hello everybody,
I'm pretty new to unit and integration testing. The REST API I'm currently working on involves file uploads and the file system. To explain the API in a few sentences: imagine a system like Microsoft Word. There are only users, and users have documents. Users' documents are plain JSON files, and users add a document by uploading a JSON file. My API currently has 3 routes and 2 middlewares.
Routes:
auth.js (authorization route)
documents.js (document centered CRUD operations)
users.js
Middlewares:
auth.js (To check if there is valid JSON web token to continue)
uploadFile.js (To upload single file using multer)
I have been able to unit/integration test the auth.js and users.js routes and the auth.js middleware. These routes and middleware only involve small amounts of data I/O, so they were pretty easy for me. But the documents.js router and the uploadFile.js middleware are much harder for me to handle.
Let me share my problems.
Source codes
documents.js Router
.
.
.
router.post('/mine', [auth, uploadFile], async (req, res) => {
const user = await User.findById(req.user._id);
user.leftDiskSpace(function(err, leftSpace) {
if(err) {
return res.status(400).send(createError(err.message, 400));
} else {
if(leftSpace < 0) {
fs.access(req.file.path, (err) => {
if(err) {
res.status(403).send(createError('Your plan\'s disk space is exceeded.', 403));
} else {
fs.unlink(req.file.path, (err) => {
if(err) res.status(500).send('The document could not be deleted from disk.');
else res.status(403).send(createError('Your plan\'s disk space is exceeded.', 403));
});
}
});
} else {
let document = new Document({
filename: req.file.filename,
path: `/uploads/${req.user.username}/${req.file.filename}`,
size: req.file.size
});
document.save()
.then((savedDocument) => {
user.documents.push(savedDocument._id);
user.save()
.then(() => res.send(savedDocument));
});
}
}
});
});
.
.
.
uploadFile.js Middleware
const fs = require('fs');
const path = require('path');
const createError = require('./../helpers/createError');
const jsonFileFilter = require('./../helpers/jsonFileFilter');
const multer = require('multer');
const storage = multer.diskStorage({
destination: function(req, file, cb) {
console.log('file: ', file);
if(!req.user.username) return cb(new Error('No name was specified for the folder the document will be uploaded to.'), null);
let uploadDestination = path.join(process.cwd(), 'uploads', req.user.username);
fs.access(uploadDestination, (err) => {
if(err) {
// Directory with username doesn't exist in uploads folder, so create one
fs.mkdir(uploadDestination, (err) => {
// return so that cb is not called a second time after an error
if(err) return cb(err, null);
cb(null, uploadDestination);
});
} else {
// Directory with username exists
cb(null, uploadDestination);
}
});
},
filename: function(req, file, cb) {
cb(null, `${file.originalname.replace('.json', '')}--${Date.now()}.json`);
}
});
module.exports = function(req, res, next) {
multer({ storage: storage, fileFilter: jsonFileFilter }).single('document')(req, res, function(err) {
if(req.fileValidationError) return res.status(400).send(createError(req.fileValidationError.message, 400));
else if(!req.file) return res.status(400).send(createError('No document was selected.', 400));
else if(err instanceof multer.MulterError) return res.status(500).send(createError(err.message, 500));
else if(err) return res.status(500).send(createError(err, 500));
else next();
});
}
Questions
1. How can I test the user.leftDiskSpace(function(err, leftSpace) { ... }); call, which takes a callback and contains Node.js fs methods that also take callbacks?
I want to cover the branches and statements inside the user.leftDiskSpace() callback. I thought of using mock functions for this, but I don't know how to do so.
2. How can I change the multer disk storage's upload destination to a dedicated testing folder?
Currently my API uploads the test documents to the development/production uploads destination. What is the best way to change the upload destination for testing? I thought of using the NODE_ENV environment variable to check whether the API is being tested and change the destination in the uploadFile.js middleware, but I'm not sure if that's a good solution to this problem. What should I do?
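To make the NODE_ENV idea from question 2 concrete, here is a rough sketch of how the destination could be derived from the environment. The UPLOAD_ROOT variable, the test-uploads folder name, and both function names are assumptions for illustration, not part of the API above.

```javascript
const path = require('path');

// Pick the uploads root from the environment (hypothetical UPLOAD_ROOT
// variable), falling back to a separate folder when NODE_ENV is "test".
function resolveUploadRoot(env = process.env, cwd = process.cwd()) {
  if (env.UPLOAD_ROOT) return path.resolve(cwd, env.UPLOAD_ROOT);
  const folder = env.NODE_ENV === 'test' ? 'test-uploads' : 'uploads';
  return path.join(cwd, folder);
}

// Per-user destination, mirroring path.join(process.cwd(), 'uploads', username)
function resolveUserUploadDir(username, env = process.env, cwd = process.cwd()) {
  return path.join(resolveUploadRoot(env, cwd), username);
}
```

The destination callback in uploadFile.js could then call resolveUserUploadDir(req.user.username) instead of building the path inline, and the tests could delete the test folder in afterEach.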
Current documents.test.js file
const request = require('supertest');
const { Document } = require('../../../models/document');
const { User } = require('../../../models/user');
const mongoose = require('mongoose');
const path = require('path');
let server;
describe('/api/documents', () => {
beforeEach(() => { server = require('../../../bin/www'); });
afterEach(async () => {
server.close();
await User.deleteMany({});
await Document.deleteMany({});
});
.
.
.
describe('POST /mine', () => {
let user;
let token;
let file;
const exec = async () => {
return await request(server)
.post('/api/documents/mine')
.set('x-auth-token', token)
.attach('document', file);
}
beforeEach(async () => {
user = new User({
username: 'user',
password: '1234'
});
await user.save();
token = user.generateAuthToken();
file = path.join(process.cwd(), 'tests', 'integration', 'files', 'test.json');
});
it('should return 400 if no documents attached', async () => {
file = undefined;
const res = await exec();
expect(res.status).toBe(400);
});
it('should return 400 if a non-JSON document attached', async () => {
file = path.join(process.cwd(), 'tests', 'integration', 'files', 'test.png');
const res = await exec();
expect(res.status).toBe(400);
});
});
});
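For question 1, one possible direction is to stub the callback-style method in the test so that each branch of the route can be triggered on demand. The helper below is a plain JavaScript sketch with a made-up name (stubCallbackMethod); it assumes leftDiskSpace is a method reachable on User.prototype, which I can't confirm from the code shown. With Jest, jest.spyOn(...).mockImplementation(...) serves the same purpose.

```javascript
// Temporarily replace a callback-style method on an object and return a
// function that restores the original. Works on instances or prototypes.
function stubCallbackMethod(target, name, err, value) {
  const original = target[name];
  target[name] = function (cb) {
    // call back asynchronously, like a real I/O-backed method would
    setImmediate(() => cb(err, value));
  };
  return function restore() {
    target[name] = original;
  };
}

// Hypothetical usage in a test:
//   const restore = stubCallbackMethod(User.prototype, 'leftDiskSpace', null, -1);
//   const res = await exec();   // should now hit the "disk space exceeded" branch
//   restore();
```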

Flutter Web multipart formdata file upload progress bar

I'm using Flutter web and strapi headless cms for backend. I'm able to send the files successfully, but would like its progress indication. Backend restrictions: File upload must be multipart form-data, being it a buffer or stream. Frontend restrictions: Flutter web doesn't have access to system file directories; files must be loaded in memory and sent using its bytes.
I'm able to upload the file using flutter's http package or the Dio package, but have the following problems when trying to somehow access upload progress:
Http example code:
http.StreamedResponse response;
final uri = Uri.parse(url);
final request = MultipartRequest(
'POST',
uri,
);
request.headers['authorization'] = 'Bearer $_token';
request.files.add(http.MultipartFile.fromBytes(
'files',
_fileToUpload.bytes,
filename: _fileToUpload.name,
));
response = await request.send();
var resStream = await response.stream.bytesToString();
var resData = json.decode(resStream);
What I tried:
When listening to response.stream via onData, it only fires once the server has sent back the finished response (even though the method's documentation suggests it should give some indication of progress).
Dio package code
Response response = await dio.post(url,
data: formData,
options: Options(
headers: {
'authorization': 'Bearer $_token',
},
), onSendProgress: (int sent, int total) {
setState(() {
pm.progress = (sent / total) * 100;
});
});
The problem:
It seems the package is able to report progress, but the Dio package for Flutter web has a bug that has not been fixed: requests block the UI and the app freezes until the upload is finished.
Hi, you can use the universal_html package (universal_html/html.dart) to implement the progress bar. Here are the steps:
Import the universal_html package:
import 'package:universal_html/html.dart' as html;
Select files with an HTML input element instead of using file picker packages:
_selectFile() {
html.FileUploadInputElement uploadInput = html.FileUploadInputElement();
uploadInput.multiple = false;
uploadInput.accept = '.png,.jpg,.glb';
uploadInput.click();
uploadInput.onChange.listen((e) {
_file = uploadInput.files.first;
});
}
Create upload_worker.js in the web folder. My example uploads to an S3 presigned POST URL:
self.addEventListener('message', async (event) => {
var file = event.data.file;
var url = event.data.uri;
var postData = event.data.postData;
uploadFile(file, url, postData);
});
function uploadFile(file, url, presignedPostData) {
var xhr = new XMLHttpRequest();
var formData = new FormData();
// if you use postData, uncomment the following lines
//Object.keys(presignedPostData).forEach((key) => {
// formData.append(key, presignedPostData[key]);
//});
formData.append('Content-Type', file.type);
// var uploadPercent;
formData.append('file', file);
xhr.upload.addEventListener("progress", function (e) {
if (e.lengthComputable) {
console.log(e.loaded + "/" + e.total);
// pass progress bar status to flutter widget
postMessage(e.loaded/e.total);
}
});
xhr.onreadystatechange = function () {
if (xhr.readyState == XMLHttpRequest.DONE) {
// postMessage("done");
}
}
xhr.onerror = function () {
console.log('Request failed');
// only triggers if the request couldn't be made at all
// postMessage("Request failed");
};
xhr.open('POST', url, true);
xhr.send(formData);
}
In Flutter web, call the upload worker to upload the file and listen for the progress status:
class Upload extends StatefulWidget {
@override
_UploadState createState() => _UploadState();
}
class _UploadState extends State<Upload> {
html.Worker myWorker;
html.File file;
_uploadFile() async {
String _uri = "/upload";
final postData = {};
myWorker.postMessage({"file": file, "uri": _uri, "postData": postData});
}
_selectFile() {
html.InputElement uploadInput = html.FileUploadInputElement();
uploadInput.multiple = false;
uploadInput.click();
uploadInput.onChange.listen((e) {
file = uploadInput.files.first;
});
}
@override
void initState() {
myWorker = new html.Worker('upload_worker.js');
myWorker.onMessage.listen((e) {
setState(() {
//progressbar,...
});
});
super.initState();
}
@override
Widget build(BuildContext context) {
return Column(
children: [
RaisedButton(
onPressed: _selectFile,
child: Text("Select File"),
),
RaisedButton(
onPressed: _uploadFile,
child: Text("Upload"),
),
],
);
}
}
That's it, I hope it helps.
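The worker above posts a bare number for progress and leaves the "done" and "failed" messages commented out. If the Flutter side needs to tell these cases apart, the worker could post small tagged objects instead. This is only a sketch; the function names and the message shape are assumptions.

```javascript
// Build the messages a worker could post for each XHR event, so the
// listener can tell progress, success, and failure apart.
function progressMessage(e) {
  if (!e.lengthComputable || e.total === 0) return null; // nothing to report
  return { type: 'progress', fraction: e.loaded / e.total };
}

function doneMessage(status) {
  // treat any 2xx status as success
  const ok = status >= 200 && status < 300;
  return { type: ok ? 'done' : 'error', status };
}

// Hypothetical usage inside upload_worker.js:
//   xhr.upload.addEventListener('progress', (e) => {
//     const msg = progressMessage(e);
//     if (msg) postMessage(msg);
//   });
//   xhr.onload = () => postMessage(doneMessage(xhr.status));
```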

access document.documentElement from puppeteer

I can get access to the entire HTML for any URL by opening dev-tools and typing:
document.documentElement
I am trying to replicate the same behavior using puppeteer, however, the snippet below returns {}
const puppeteer = require('puppeteer'); // v 1.1.0
const iPhone = puppeteer.devices['Pixel 2 XL'];
async function start(canonical_url) {
const browserURL = 'http://127.0.0.1:9222';
const browser = await puppeteer.connect({browserURL});
const page = await browser.newPage();
await page.emulate(iPhone);
await page.goto(canonical_url, {
waitUntil: 'networkidle2',
});
const data = await page.evaluate(() => document.documentElement);
console.log(data);
}
returns:
{}
Any idea on what I could be doing wrong here?
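One likely explanation is that page.evaluate can only return values that survive serialization between the browser and Node, and a DOM element does not, so it arrives as an empty object. Returning a string from the page avoids that. A minimal sketch, with getPageHtml as a made-up helper name:

```javascript
// DOM nodes cannot be serialized across the browser/Node boundary, so
// return a string from the page instead of the element itself.
async function getPageHtml(page) {
  return page.evaluate(() => document.documentElement.outerHTML);
}

// Hypothetical usage in start():
//   const data = await getPageHtml(page);
//   console.log(data);
```

Puppeteer's page.content() also returns the full HTML of the page as a string.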

delete and create folder before uploading files into the folder using Multer

I am trying to upload files into a folder using multer, and it is working fine.
Now my requirement is that before a file is uploaded into the 'uploads' folder, the folder should be deleted and recreated, and only then should the upload happen.
I only want to operate on the newly uploaded files, not on previously stored data.
Code:
const fs = require("fs-extra");
const path = require("path");
const uploadPath = path.resolve(__dirname, "uploads");
const multer = require("multer");
const storage = multer.diskStorage({
destination: "./uploads/",
filename: function(req, file, cb) {
cb(null, file.originalname);
}
});
const upload = multer({ storage: storage });
router.post("/fileupload", upload.array("docs", 10), async function(
req,
res,
next
) {
let result = {};
try {
if (fs.existsSync(uploadPath)) {
fs.removeSync(uploadPath);
console.log("dir removed");
fs.ensureDirSync(uploadPath);
console.log("directory created");
} else {
fs.ensureDirSync(uploadPath);
console.log("directory created");
}
const uploadObj = util.promisify(upload.any());
await uploadObj(req, res);
result.message = "Upload successful";
res.send(result);
} catch (e) {
console.error(e);
console.error("Upload error");
}
});
I also tried to make the code async, but then it does not upload any file. My understanding is that upload.array is a middleware, so it runs first whenever the POST request arrives and the rest runs afterwards. So multer uploads the data into the existing folder, and then, once execution reaches the POST handler, fs deletes and recreates it.
How can I make it work?
Thanks
I found a way to make it work.
Since multer is a middleware, it executes before the rest of the code. So I moved the middleware call from the route definition into the handler body. Below is the full code.
const create_upload_dir = () => {
if (fs.existsSync(uploadPath)) {
fs.removeSync(uploadPath);
console.log("dir removed");
fs.ensureDirSync(uploadPath);
console.log("directory created");
} else {
fs.ensureDirSync(uploadPath);
console.log("directory created");
}
return Promise.resolve("Success");
};
const util = require("util");
const multer = require("multer");
const upload_documents = () => {
const storage = multer.diskStorage({
destination: "./uploads/",
filename: function(req, file, cb) {
cb(null, file.originalname);
}
});
const upload = multer({ storage: storage });
return upload;
};
router.post("/fileupload", async function(req, res, next) {
try {
await create_upload_dir();
const upload = upload_documents();
// run the multer middleware manually, after the folder has been recreated
const uploadObj = util.promisify(upload.array("docs", 10));
await uploadObj(req, res);
console.log("upload successful");
res.send("Upload successful");
} catch (e) {
console.error(e);
console.error("Upload error");
// answer the request so the client is not left waiting
res.status(500).send("Upload error");
}
});
Notice that upload.array("docs", 10) is now invoked inside the try block, instead of being passed as route middleware:
router.post("/fileupload",upload.array("docs", 10), async function(req, res, next)
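Another way to get the same ordering, sketched here with Node's built-in fs instead of fs-extra, is to put the cleanup into its own middleware and list it before multer in the route. The name resetDirMiddleware is made up for illustration.

```javascript
const fs = require('fs');

// Express-style middleware factory: empties `dir` (creating it if needed)
// before passing control to the next middleware, e.g. multer.
function resetDirMiddleware(dir) {
  return function (req, res, next) {
    try {
      fs.rmSync(dir, { recursive: true, force: true }); // no error if missing
      fs.mkdirSync(dir, { recursive: true });
      next();
    } catch (err) {
      next(err); // let the error handler respond
    }
  };
}

// Hypothetical usage:
//   router.post("/fileupload", resetDirMiddleware(uploadPath), upload.array("docs", 10), handler);
```

Keep in mind that with either approach two simultaneous requests would wipe each other's files, since they share one folder.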