How do I make 2 (or more) calls with Adobe PDF Services and skip using the file system (in between?) - adobe-pdfservices

It's fairly simple to make one call to Adobe PDF Services, get the result, and save it, for example:
// more stuff above
exportPdfOperation.execute(executionContext)
.then(result => result.saveAsFile(output))
But if I want to do two, or more, operations, do I need to keep saving the result to the file system and re-providing it (is that even a word ;) to the API?

So this tripped me up as well. In most demos, you'll see:
result => result.saveAsFile()
towards the end. However, the object passed to the resolved promise, result, is a FileRef object that can then be used as the input to another call.
Here's a sample that takes an input Word doc and calls the API method to create a PDF. It then takes that result and runs OCR on it. Both functions that wrap the API calls return FileRefs, so at the end I call saveAsFile on the last one. (Note, this demo is using v1 of the SDK, it would work the same w/ v2.)
const PDFToolsSdk = require('@adobe/documentservices-pdftools-node-sdk');
const fs = require('fs');
//clean up previous
(async ()=> {
// hamlet.docx was too big for conversion
const input = './hamlet2.docx';
const output = './multi.pdf';
const creds = './pdftools-api-credentials.json';
if(fs.existsSync(output)) fs.unlinkSync(output);
let result = await createPDF(input, creds);
console.log('got a result');
result = await ocrPDF(result, creds);
console.log('got second result');
await result.saveAsFile(output);
})();
async function createPDF(source, creds) {
return new Promise((resolve, reject) => {
const credentials = PDFToolsSdk.Credentials
.serviceAccountCredentialsBuilder()
.fromFile(creds)
.build();
const executionContext = PDFToolsSdk.ExecutionContext.create(credentials),
createPdfOperation = PDFToolsSdk.CreatePDF.Operation.createNew();
// Set operation input from a source file
const input = PDFToolsSdk.FileRef.createFromLocalFile(source);
createPdfOperation.setInput(input);
// Execute the operation and resolve with the resulting FileRef.
createPdfOperation.execute(executionContext)
.then(result => resolve(result))
.catch(err => reject(err));
});
}
async function ocrPDF(source, creds) {
return new Promise((resolve, reject) => {
const credentials = PDFToolsSdk.Credentials
.serviceAccountCredentialsBuilder()
.fromFile(creds)
.build();
const executionContext = PDFToolsSdk.ExecutionContext.create(credentials),
ocrOperation = PDFToolsSdk.OCR.Operation.createNew();
// Set operation input from the FileRef returned by the previous operation.
ocrOperation.setInput(source);
// Execute the operation and resolve with the resulting FileRef.
ocrOperation.execute(executionContext)
.then(result => resolve(result))
.catch(err => reject(err));
});
}
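The chaining idea above can be boiled down to a small helper. This is a minimal sketch, not SDK code: runPipeline is a hypothetical name, and the two stand-in steps only mimic operations that accept an input reference and resolve with a new one, the way execute() resolves with a FileRef in the sample. Since execute() already returns a promise, real steps could return it directly without a new Promise wrapper.

```javascript
// Hypothetical helper: feed the result of each async step into the next one,
// so nothing has to be written to disk between operations.
async function runPipeline(initialInput, steps) {
  let current = initialInput;
  for (const step of steps) {
    current = await step(current);
  }
  return current;
}

// Stand-ins for SDK-backed steps (assumptions, not real API calls).
// In real code each would set its input, call execute(), and return that promise.
const fakeCreatePdf = async (ref) => ({
  ...ref,
  type: 'pdf',
  history: [...ref.history, 'createPDF'],
});
const fakeOcr = async (ref) => ({
  ...ref,
  history: [...ref.history, 'ocr'],
});

runPipeline({ type: 'docx', history: [] }, [fakeCreatePdf, fakeOcr])
  .then((ref) => console.log(ref.type, ref.history.join(' -> ')));
```

Only the final result would need saveAsFile; every intermediate value stays in memory as the input to the next step.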

Related

How to get data from react query "useQuery" hook in a specific type

When I get data from the useQuery hook, I need to parse it into a specific type before it is returned to the user. I want the data returned from the useQuery hook to be of type "MyType", using the parsing function I created below. I am unable to find a way to use my parsing function. Is there any way to do it? I don't want to rely on the schema structure for the data type.
type MyType = {
id: number;
//some more properties
}
function parseData(arr: any[]): MyType[]{
return arr.map((obj, index)=>{
return {
id: obj.id,
//some more properties
}
})
}
const {data} = useQuery('fetchMyData', async ()=>{
return await axios.get('https://fake-domain.com')
}
)
I would take the response from the api and transform it inside the queryFn, before you return it to react-query. Whatever you return winds up in the query cache, so:
const { data } = useQuery('fetchMyData', async () => {
const response = await axios.get('https://fake-domain.com')
return parseData(response.data)
}
)
data returned from useQuery should then be of type MyType[] | undefined
There are a bunch of other options to do data transformation as well, and I've written about them here:
https://tkdodo.eu/blog/react-query-data-transformations
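As a rough sketch of the approach above, the transformation can be wrapped around any fetcher before it is handed to useQuery. withTransform and the fake fetcher are hypothetical names for illustration; only the shape of the axios response (response.data) is taken from the snippet above.

```javascript
// Hypothetical helper: build a query function that fetches, then parses,
// so whatever lands in the query cache is already in the parsed shape.
function withTransform(fetcher, parse) {
  return async () => {
    const response = await fetcher();
    return parse(response.data);
  };
}

// Parsing function in the spirit of the question's parseData.
function parseData(arr) {
  return arr.map((obj) => ({ id: Number(obj.id) }));
}

// Stand-in for axios.get(...): resolves with an object that has a `data` field.
const fakeFetcher = async () => ({
  data: [{ id: '1', extra: 'x' }, { id: '2' }],
});

const queryFn = withTransform(fakeFetcher, parseData);
queryFn().then((rows) => console.log(rows));
```

With react-query, queryFn would be passed as the second argument, as in useQuery('fetchMyData', queryFn).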
I think you should create your own hook and perform normalisation there:
const useParseData = () => {
const { data } = useQuery('fetchMyData', async () => {
const response = await axios.get('https://fake-domain.com')
return response.data
})
// data is undefined until the query resolves
return data ? parseData(data) : undefined
}
And where you need this data you could just call const parsedData = useParseData()

nextjs parse XLSX on API route from incoming request

I have been trying to reduce my NextJS bundle size by moving my XLSX parsing to an API route. It uses the npm xlsx (sheetjs) package, and extracts JSON from a selected XLSX.
What I am doing in the frontend is
let res;
let formData = new FormData();
formData.append("file", e.target.files[0]);
try {
res = await axios.post("/api/importExcel", formData);
} catch (e) {
createCriticalError(
"Critical error during file reading from uploaded file!"
);
}
On the API route I am unable to read the file using XLSX.read().
I believe NextJS uses body-parser on the incoming requests but I am unable to convert the incoming data to an array buffer or any readable format for XLSX.
Do you have any suggestions about how to approach this issue?
I tried multiple solutions. The most promising one was this, but it still does not work:
export default async function handler(req, res) {
console.log(req.body);
let arr;
let file = req.body;
let contentBuffer = await new Response(file).arrayBuffer();
try {
var data = new Uint8Array(contentBuffer);
var workbook = XLSX.read(data, { type: "array" });
var sheet = workbook.Sheets[workbook.SheetNames[0]];
arr = XLSX.utils.sheet_to_json(sheet);
} catch (e) {
console.error("Error while reading the excel file");
console.log({ ...e });
res.status(500).json({ err: e });
}
res.status(200).json(arr);
}
Since you're uploading a file, you should start by disabling the body parser to consume the body as a stream.
I would also recommend using a third-party library like formidable to handle and parse the form data. You'll then be able to read the file using XLSX.read() and convert it to JSON.
import XLSX from "xlsx";
import formidable from "formidable";
// Disable `bodyParser` to consume as stream
export const config = {
api: {
bodyParser: false
}
};
export default async function handler(req, res) {
const form = new formidable.IncomingForm();
try {
// Promisified `form.parse`
const jsonData = await new Promise(function (resolve, reject) {
form.parse(req, async (err, fields, files) => {
if (err) {
reject(err);
return;
}
try {
// `path` is the property name in formidable v1; later major versions renamed it to `filepath`
const workbook = XLSX.readFile(files.file.path);
const sheet = workbook.Sheets[workbook.SheetNames[0]];
const jsonSheet = XLSX.utils.sheet_to_json(sheet);
resolve(jsonSheet);
} catch (err) {
reject(err);
}
});
});
return res.status(200).json(jsonData);
} catch (err) {
console.error("Error while parsing the form", err);
return res.status(500).json({ error: err });
}
}
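The promisified form.parse step can also be pulled out into a reusable helper. A minimal sketch: parseForm is a hypothetical name, and the fake form object exists only so the snippet runs without formidable installed; it mimics the (err, fields, files) callback used above.

```javascript
// Hypothetical helper: wrap a callback-style `form.parse(req, cb)` in a promise.
function parseForm(form, req) {
  return new Promise((resolve, reject) => {
    form.parse(req, (err, fields, files) => {
      if (err) {
        reject(err);
        return;
      }
      resolve({ fields, files });
    });
  });
}

// Stand-in for a formidable form (assumption for illustration only).
const fakeForm = {
  parse(req, callback) {
    if (!req.hasFile) {
      callback(new Error('no file uploaded'));
      return;
    }
    callback(null, {}, { file: { path: '/tmp/upload.xlsx' } });
  },
};

parseForm(fakeForm, { hasFile: true })
  .then(({ files }) => console.log('uploaded to', files.file.path))
  .catch((err) => console.error(err.message));
```

In the handler, the XLSX reading would then sit after an await parseForm(form, req) call inside the existing try/catch, which keeps the parsing errors and the form errors on the same path.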

Puppeteer and express can not load new data using REST API

I'm using Puppeteer to scrape a page whose contents change periodically, and Express to present the data through a REST API.
If I run Chrome with headless mode turned off to see what is being shown in the browser, the new data is there, but it does not show up in get() at http://localhost:3005/api-weather. A normal browser only shows the original data.
const express = require('express');
const server = new express();
const cors = require('cors');
const morgan = require('morgan');
const puppeteer = require('puppeteer');
server.use(morgan('combined'));
server.use(
cors({
allowedHeaders: ['sessionId', 'Content-Type'],
exposedHeaders: ['sessionId'],
origin: '*',
methods: 'GET, HEAD, PUT, PATCH, POST, DELETE',
preflightContinue: false
})
);
const WEATHER_URL = 'https://forecast.weather.gov/MapClick.php?lat=40.793588904953985&lon=-73.95738513173298';
const hazard_url2 = `file://C:/Users/xdevtran/Documents/vshome/wc_api/weather-forecast-nohazard.html`;
(async () => {
try {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on("request", request => {
console.log(request.url());
request.continue();
});
await page.goto(hazard_url2, { timeout: 0, waitUntil: 'networkidle0' });
hazard = {
"HazardTitle": "stub",
"Hazardhref": "stub"
}
let forecast = await page.evaluate(() => {
try {
let forecasts = document.querySelectorAll("#detailed-forecast-body.panel-body")[0].children;
let weather = [];
for (var i = 0, element; element = forecasts[i]; i++) {
period = element.querySelector("div.forecast-label").textContent;
forecast = element.querySelector("div.forecast-text").textContent;
weather.push(
{
period,
forecast
}
)
}
return weather;
} catch (err) {
console.log('error in evaluate: ', err);
res.end();
}
}).catch(err => {
console.log('err.message :', err.message);
});
weather = forecast;
server.get('/api-weather', (req, res) => {
try {
res.end(JSON.stringify(weather, null, ' '));
console.log(weather);
} catch (err) {
console.log('failure: ', err);
res.sendStatus(500);
res.end();
return;
}
});
} catch (err) {
console.log('caught error :', err);
}
browser.close();
})();
server.listen(3005, () => {
console.log('http://localhost:3005/api-weather');
});
I've tried several solutions (waitUntil, waitFor, .then, and sleep), but nothing seems to work.
I wonder if it has something to do with Express get()? I'm using res.end() instead of res.send() because the JSON looks better when I use res.end(). I don't really know the distinction.
I'm also open to using this reload solution. But I received errors and didn't use it.
I also tried waitForNavigation(), but I don't know how it works, either.
Maybe I'm using the wrong search term to find the solution. Could anyone point me in the right direction? Thank you.
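One thing that stands out in the code above is that the page is scraped once at startup, and the /api-weather handler keeps returning that first result. A minimal sketch of one possible direction, assuming that is the cause: re-run the scrape whenever the cached value is older than some limit. createCachedScraper, maxAgeMs, and the fake scrape function are hypothetical; in the real server, scrape would wrap the Puppeteer calls.

```javascript
// Hypothetical helper: cache the last scrape and refresh it once it is stale.
// `now` is injectable so the staleness logic can be exercised without waiting.
function createCachedScraper(scrape, maxAgeMs, now = Date.now) {
  let cached;
  let fetchedAt = -Infinity; // forces a scrape on the first call
  return async function getLatest() {
    if (now() - fetchedAt >= maxAgeMs) {
      cached = await scrape();
      fetchedAt = now();
    }
    return cached;
  };
}

// Stand-in for the Puppeteer scrape (assumption): returns a new value per call.
let scrapeCount = 0;
const fakeScrape = async () => ({ period: 'Tonight', version: ++scrapeCount });

// A maxAgeMs of 0 means every call triggers a fresh scrape.
const getWeather = createCachedScraper(fakeScrape, 0);
getWeather()
  .then(() => getWeather())
  .then((weather) => console.log('scrape version:', weather.version));
```

Inside the Express route this could be used as server.get('/api-weather', async (req, res) => res.json(await getWeather())), so each request sees data no older than maxAgeMs.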

Cannot use results of an api call in conversation

I am trying to use the results of an api call in conversation but haven't been able to pass the results so that I can use them in conv.ask. In the example here, I am able to log the "wind inner" but when I try to use it in conv.ask, I get "undefined." I know it is a scoping issue, but I haven't been able to solve it. Thanks!
app.intent('weather', (conv) => {
var url = "http://api.wunderground.com/api/"+apiKey+"/yesterday/q/55417.json";
var request = http.get(url, function (response) {
var buffer = "",
data,
history;
response.on("data", function (chunk) {
buffer += chunk;
});
response.on("end", function (err) {
console.log(buffer);
console.log("\n");
data = JSON.parse(buffer);
history = data.history;
var wind = (history.dailysummary[0].maxwspdi);
console.log("wind inner: ", wind);//this works
});
});
conv.ask("the wind speed is" + wind + "miles per hour");
//unable to get the wind variable to be defined ouside the api call
});
You should be using a Promise to return asynchronous responses. Additionally, make sure you wait until you get the response data to generate your text-to-speech. Here's a snippet that should work.
app.intent('weather', (conv) => {
return new Promise((resolve, reject) => {
const url = "http://api.wunderground.com/api/"+apiKey+"/yesterday/q/55417.json";
const request = http.get(url, function (response) {
var buffer = "",
data,
history;
response.on("data", function (chunk) {
buffer += chunk;
});
response.on("end", function (err) {
console.log(buffer);
console.log("\n");
data = JSON.parse(buffer);
history = data.history;
const wind = (history.dailysummary[0].maxwspdi);
console.log("wind inner: ", wind);//this works
conv.ask("the wind speed is " + wind + " miles per hour");
resolve();
});
}).on("error", reject); // reject on network errors so the promise does not hang
});
});
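The promise wrapping can also be kept separate from the intent handler. A minimal sketch using Node's built-in http module: fetchJson is a hypothetical helper name, its second parameter exists only so a fake getter can be swapped in, and fakeGet is a stand-in that lets the example run without calling the weather API.

```javascript
const http = require('http');
const { EventEmitter } = require('events');

// Hypothetical helper: resolve with the parsed JSON body of a GET request.
// `get` defaults to http.get; passing another function is only for illustration.
function fetchJson(url, get = http.get) {
  return new Promise((resolve, reject) => {
    const request = get(url, (response) => {
      let buffer = '';
      response.on('data', (chunk) => { buffer += chunk; });
      response.on('end', () => {
        try {
          resolve(JSON.parse(buffer));
        } catch (err) {
          reject(err); // body was not valid JSON
        }
      });
    });
    request.on('error', reject); // network-level failures
  });
}

// Stand-in for http.get (assumption): emits a canned JSON body in two chunks.
function fakeGet(url, onResponse) {
  const request = new EventEmitter();
  const response = new EventEmitter();
  setImmediate(() => {
    onResponse(response);
    response.emit('data', '{"history":{"dailysummary":');
    response.emit('data', '[{"maxwspdi":"12"}]}}');
    response.emit('end');
  });
  return request;
}

fetchJson('http://example.invalid/weather.json', fakeGet).then((data) => {
  const wind = data.history.dailysummary[0].maxwspdi;
  console.log('the wind speed is ' + wind + ' miles per hour');
});
```

In the intent handler, the returned promise would be handed back directly, for example return fetchJson(url).then((data) => conv.ask(...)), so the response is only built once the data has arrived.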

Add streamed email data to MongoDB in meteor

I have a route set up to receive a webhook from SendGrid, which sends MultiPart/form data. I can get the various fields to output in the console with busboy, but I'm struggling to fill in the final piece of the puzzle: getting this parsed data into a Collection object (or just into MongoDB if not familiar with meteor).
I thought something like the following would work, but the data arrays in the db are always blank. I'm guessing I'm missing a crucial step in knowing when the stream has finished?
WebApp.connectHandlers.use('/applicants', (req, res, next) => {
let body = '';
req.on('data', Meteor.bindEnvironment(function (data) {
body += data;
let bb = new Busboy({ headers: req.headers });
let theEmail = [];
bb.on('field', function(fieldname, val) {
console.log('Field [%s]: value: %j', fieldname, val);
let theObject = [];
theObject[fieldname] = val;
theEmail.push(theObject);
}).on('error', function(err) {
console.error('oops', err);
}).on('finish', Meteor.bindEnvironment(function() {
console.log('Done parsing form!');
// try to add data to database....
Meteor.call('applicants.add', theEmail);
}));
return req.pipe(bb);
}));
req.on('end', Meteor.bindEnvironment(function () {
res.writeHead(200);
res.end();
}));