It's fairly simple to make one call to Adobe PDF Services, get the result, and save it, for example:
// more stuff above
exportPdfOperation.execute(executionContext)
.then(result => result.saveAsFile(output))
But if I want to do two, or more, operations, do I need to keep saving the result to the file system and re-providing it (is that even a word ;) to the API?
So this tripped me up as well. In most demos, you'll see:
result => result.saveAsFile()
towards the end. However, the object passed to the resolved promise, result, is a FileRef object that can be used directly as the input to another call.
Here's a sample that takes an input Word doc and calls the API method to create a PDF. It then takes that result and runs OCR on it. Both functions that wrap the API calls return FileRefs, so at the end I call saveAsFile on the final one. (Note: this demo uses v1 of the SDK; it would work the same with v2.)
const PDFToolsSdk = require('@adobe/documentservices-pdftools-node-sdk');
const fs = require('fs');
//clean up previous
(async ()=> {
// hamlet.docx was too big for conversion
const input = './hamlet2.docx';
const output = './multi.pdf';
const creds = './pdftools-api-credentials.json';
if(fs.existsSync(output)) fs.unlinkSync(output);
let result = await createPDF(input, creds);
console.log('got a result');
result = await ocrPDF(result, creds);
console.log('got second result');
await result.saveAsFile(output);
})();
async function createPDF(source, creds) {
return new Promise((resolve, reject) => {
const credentials = PDFToolsSdk.Credentials
.serviceAccountCredentialsBuilder()
.fromFile(creds)
.build();
const executionContext = PDFToolsSdk.ExecutionContext.create(credentials),
createPdfOperation = PDFToolsSdk.CreatePDF.Operation.createNew();
// Set operation input from a source file
const input = PDFToolsSdk.FileRef.createFromLocalFile(source);
createPdfOperation.setInput(input);
// Execute the operation and resolve with the resulting FileRef.
createPdfOperation.execute(executionContext)
.then(result => resolve(result))
.catch(err => reject(err));
});
}
async function ocrPDF(source, creds) {
return new Promise((resolve, reject) => {
const credentials = PDFToolsSdk.Credentials
.serviceAccountCredentialsBuilder()
.fromFile(creds)
.build();
const executionContext = PDFToolsSdk.ExecutionContext.create(credentials),
ocrOperation = PDFToolsSdk.OCR.Operation.createNew();
// Set operation input from the FileRef returned by the previous operation.
ocrOperation.setInput(source);
// Execute the operation and resolve with the resulting FileRef.
ocrOperation.execute(executionContext)
.then(result => resolve(result))
.catch(err => reject(err));
});
}
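If you end up chaining more than two operations, the same idea can be written as a small loop. This is only a sketch: runPipeline is a hypothetical helper name, not part of the SDK, and it assumes every step is a function that takes the previous result and returns (a promise for) the next one.

```javascript
// Hypothetical helper: run async steps in order, feeding each result to the next.
async function runPipeline(input, steps) {
  let current = input;
  for (const step of steps) {
    // Each step receives the previous step's result (e.g. a FileRef).
    current = await step(current);
  }
  return current;
}

// Usage sketch with the createPDF and ocrPDF wrappers from the sample above:
// const result = await runPipeline(input, [
//   src => createPDF(src, creds),
//   ref => ocrPDF(ref, creds),
// ]);
// await result.saveAsFile(output);
```

Nothing touches the file system until the final saveAsFile call, which is the point of passing the FileRef along.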
When I get data from the useQuery hook, I need to parse it into a specific type before it is returned to the caller. I want the data returned from useQuery to be of type "MyType", using the parsing function I created below. I can't find a way to plug in my parsing function. Is there any way to do it? I don't want to rely on the schema structure for the data type.
type MyType = {
id: number;
//some more properties
}
function parseData(arr: any[]): MyType[]{
return arr.map((obj, index)=>{
return {
id: obj.id,
//some more properties
}
})
}
const {data} = useQuery('fetchMyData', async ()=>{
return await axios.get('https://fake-domain.com')
}
)
I would take the response from the api and transform it inside the queryFn, before you return it to react-query. Whatever you return winds up in the query cache, so:
const { data } = useQuery('fetchMyData', async () => {
const response = await axios.get('https://fake-domain.com')
return parseData(response.data)
}
)
data returned from useQuery should then be of type MyType[] | undefined
There are a bunch of other options to do data transformation as well, and I've written about them here:
https://tkdodo.eu/blog/react-query-data-transformations
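One of those other options is the select option of useQuery. Here is a rough sketch in plain JavaScript, assuming the v3-style useQuery(key, queryFn, options) signature used in the question. makeQueryFn and the injected httpGet are hypothetical names used only so the pieces can be tried without a real request. With select, the raw payload is what goes into the query cache and the transformation is applied when the hook hands data to the component.

```javascript
// Same shape as the parseData function from the question, without the type annotations.
function parseData(arr) {
  return arr.map(obj => ({
    id: obj.id,
    // some more properties
  }));
}

// Hypothetical queryFn factory: returns the untransformed response body.
// httpGet is injected (e.g. url => axios.get(url)) so it can be stubbed.
const makeQueryFn = httpGet => async () => {
  const response = await httpGet('https://fake-domain.com');
  return response.data;
};

// Wiring it up with react-query would look roughly like this:
// const { data } = useQuery(
//   'fetchMyData',
//   makeQueryFn(url => axios.get(url)),
//   { select: parseData }
// );
```

Whether you transform in the queryFn or in select mostly depends on whether you want the cache to hold the raw or the parsed shape.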
I think you should create your own hook and perform the normalisation there:
const useParseData = () => {
const { data } = useQuery('fetchMyData', async () => {
const response = await axios.get('https://fake-domain.com')
return response.data
})
// data is undefined until the query resolves
return data ? parseData(data) : undefined
}
And where you need this data you could just call const parsedData = useParseData()
I have been trying to reduce my NextJS bundle size by moving my XLSX parsing to an API route. It uses the npm xlsx (sheetjs) package, and extracts JSON from a selected XLSX.
What I am doing in the frontend is
let res;
let formData = new FormData();
formData.append("file", e.target.files[0]);
try {
res = await axios.post("/api/importExcel", formData);
} catch (e) {
createCriticalError(
"Critical error during file reading from uploaded file!"
);
}
On the API route I am unable to read the file using XLSX.read().
I believe NextJS uses body-parser on the incoming requests but I am unable to convert the incoming data to an array buffer or any readable format for XLSX.
Do you have any suggestions about how to approach this issue?
I tried multiple solutions. The most promising one seemed to be this, but it still does not work:
export default async function handler(req, res) {
console.log(req.body);
let arr;
let file = req.body;
let contentBuffer = await new Response(file).arrayBuffer();
try {
var data = new Uint8Array(contentBuffer);
var workbook = XLSX.read(data, { type: "array" });
var sheet = workbook.Sheets[workbook.SheetNames[0]];
arr = XLSX.utils.sheet_to_json(sheet);
} catch (e) {
console.error("Error while reading the excel file");
console.log({ ...e });
res.status(500).json({ err: e });
}
res.status(200).json(arr);
}
Since you're uploading a file, you should start by disabling the body parser to consume the body as a stream.
I would also recommend using a third-party library like formidable to handle and parse the form data. You'll then be able to read the file using XLSX.read() and convert it to JSON.
import XLSX from "xlsx";
import formidable from "formidable";
// Disable `bodyParser` to consume as stream
export const config = {
api: {
bodyParser: false
}
};
export default async function handler(req, res) {
const form = new formidable.IncomingForm();
try {
// Promisified `form.parse`
const jsonData = await new Promise(function (resolve, reject) {
form.parse(req, async (err, fields, files) => {
if (err) {
reject(err);
return;
}
try {
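// "file" matches the field name used in formData.append on the frontend.
// Note: formidable v1 exposes the temp file as `path`; later versions call it `filepath`.
const workbook = XLSX.readFile(files.file.path);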
const sheet = workbook.Sheets[workbook.SheetNames[0]];
const jsonSheet = XLSX.utils.sheet_to_json(sheet);
resolve(jsonSheet);
} catch (err) {
reject(err);
}
});
});
return res.status(200).json(jsonData);
} catch (err) {
console.error("Error while parsing the form", err);
return res.status(500).json({ error: err });
}
}
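If the nested callback gets hard to read, the promisified form.parse can be pulled out into its own function. This is just a sketch: parseForm is a hypothetical helper name, and it only assumes that the form object has a Node-style parse(req, callback) method like the one used above.

```javascript
// Hypothetical helper: wraps a Node-style parse(req, callback) method in a Promise.
function parseForm(form, req) {
  return new Promise((resolve, reject) => {
    form.parse(req, (err, fields, files) => {
      if (err) {
        reject(err);
        return;
      }
      resolve({ fields, files });
    });
  });
}

// Inside the handler this might then read:
// const { files } = await parseForm(new formidable.IncomingForm(), req);
// const workbook = XLSX.readFile(files.file.path);
```

The handler's existing try/catch still catches both parse errors and XLSX errors, since both surface as a rejected await or a thrown exception.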
I'm using puppeteer to scrape a page whose contents change periodically, and Express to present the data in a REST API.
If I turn on headless chrome to see what is being shown in the browser, the new data is there, but the data is not showing up in get() and http://localhost:3005/api-weather. The normal browser only shows the original data.
const express = require('express');
const server = new express();
const cors = require('cors');
const morgan = require('morgan');
const puppeteer = require('puppeteer');
server.use(morgan('combined'));
server.use(
cors({
allowHeaders: ['sessionId', 'Content-Type'],
exposedHeaders: ['sessionId'],
origin: '*',
methods: 'GET, HEAD, PUT, PATCH, POST, DELETE',
preflightContinue: false
})
);
const WEATHER_URL = 'https://forecast.weather.gov/MapClick.php?lat=40.793588904953985&lon=-73.95738513173298';
const hazard_url2 = `file://C:/Users/xdevtran/Documents/vshome/wc_api/weather-forecast-nohazard.html`;
(async () => {
try {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on("request", request => {
console.log(request.url());
request.continue();
});
await page.goto(hazard_url2, { timeout: 0, waitUntil: 'networkidle0' });
hazard = {
"HazardTitle": "stub",
"Hazardhref": "stub"
}
let forecast = await page.evaluate(() => {
try {
let forecasts = document.querySelectorAll("#detailed-forecast-body.panel-body")[0].children;
let weather = [];
for (var i = 0, element; element = forecasts[i]; i++) {
period = element.querySelector("div.forecast-label").textContent;
forecast = element.querySelector("div.forecast-text").textContent;
weather.push(
{
period,
forecast
}
)
}
return weather;
} catch (err) {
console.log('error in evaluate: ', err);
res.end();
}
}).catch(err => {
console.log('err.message :', err.message);
});
weather = forecast;
server.get('/api-weather', (req, res) => {
try {
res.end(JSON.stringify(weather, null, ' '));
console.log(weather);
} catch (err) {
console.log('failure: ', err);
res.sendStatus(500);
res.end();
return;
}
});
} catch (err) {
console.log('caught error :', err);
}
browser.close();
})();
server.listen(3005, () => {
console.log('http://localhost:3005/api-weather');
});
I've tried several solutions (WaitUntil, WaitFor, .then and sleep), but nothing seems to work.
I wonder if it has something to do with Express's get()? I'm using res.end() instead of res.send() because the JSON looks better when I use res.end(). I don't really know the distinction.
I'm also open to using this reload solution. But I received errors and didn't use it.
I also tried waitForNavigation(), but I don't know how it works, either.
Maybe I'm using the wrong search term to find the solution. Could anyone point me in the right direction? Thank you.
I am trying to use the results of an api call in conversation but haven't been able to pass the results so that I can use them in conv.ask. In the example here, I am able to log the "wind inner" but when I try to use it in conv.ask, I get "undefined." I know it is a scoping issue, but I haven't been able to solve it. Thanks!
app.intent('weather', (conv) => {
var url = "http://api.wunderground.com/api/"+apiKey+"/yesterday/q/55417.json";
var request = http.get(url, function (response) {
var buffer = "",
data,
history;
response.on("data", function (chunk) {
buffer += chunk;
});
response.on("end", function (err) {
console.log(buffer);
console.log("\n");
data = JSON.parse(buffer);
history = data.history;
var wind = (history.dailysummary[0].maxwspdi);
console.log("wind inner: ", wind);//this works
});
});
conv.ask("the wind speed is" + wind + "miles per hour");
//unable to get the wind variable to be defined outside the api call
});
You should be using a Promise to return asynchronous responses. Additionally, make sure you wait until you get the response data to generate your text-to-speech. Here's a snippet that should work.
app.intent('weather', (conv) => {
return new Promise((resolve, reject) => {
const url = "http://api.wunderground.com/api/"+apiKey+"/yesterday/q/55417.json";
const request = http.get(url, function (response) {
var buffer = "",
data,
history;
response.on("data", function (chunk) {
buffer += chunk;
});
response.on("end", function (err) {
console.log(buffer);
console.log("\n");
data = JSON.parse(buffer);
history = data.history;
const wind = (history.dailysummary[0].maxwspdi);
console.log("wind inner: ", wind);//this works
conv.ask("the wind speed is " + wind + " miles per hour");
resolve();
});
});
});
});
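Note that the snippet above never calls reject, so a network failure or a malformed body would leave the promise pending. One way to cover that is to move the request into a small helper. This is a sketch under assumptions: getJson is a hypothetical name, and the second parameter exists only so the helper can be tried with a stubbed get function.

```javascript
const http = require('http');

// Hypothetical helper: resolves with the parsed JSON body,
// rejects on request errors or if the body is not valid JSON.
function getJson(url, get = http.get) {
  return new Promise((resolve, reject) => {
    const request = get(url, response => {
      let buffer = '';
      response.on('data', chunk => { buffer += chunk; });
      response.on('end', () => {
        try {
          resolve(JSON.parse(buffer));
        } catch (err) {
          reject(err);
        }
      });
    });
    request.on('error', reject);
  });
}

// The intent handler could then be an async function:
// app.intent('weather', async (conv) => {
//   const data = await getJson(url);
//   const wind = data.history.dailysummary[0].maxwspdi;
//   conv.ask("the wind speed is " + wind + " miles per hour");
// });
```

An async handler returns a promise, so the library still waits for the response before sending the reply.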
I have a route set up to receive a webhook from SendGrid, which sends multipart/form-data. I can get the various fields to output in the console with busboy, but I'm struggling to fill in the final piece of the puzzle: getting this parsed data into a Collection object (or just into MongoDB, if you're not familiar with Meteor).
I thought something like the following would work, but the data arrays in the db are always blank. I'm guessing I'm missing a crucial step in knowing when the stream has finished?
WebApp.connectHandlers.use('/applicants', (req, res, next) => {
let body = '';
req.on('data', Meteor.bindEnvironment(function (data) {
body += data;
let bb = new Busboy({ headers: req.headers });
let theEmail = [];
bb.on('field', function(fieldname, val) {
console.log('Field [%s]: value: %j', fieldname, val);
let theObject = [];
theObject[fieldname] = val;
theEmail.push(theObject);
}).on('error', function(err) {
console.error('oops', err);
}).on('finish', Meteor.bindEnvironment(function() {
console.log('Done parsing form!');
// try to add data to database....
Meteor.call('applicants.add', theEmail);
}));
return req.pipe(bb);
}));
req.on('end', Meteor.bindEnvironment(function () {
res.writeHead(200);
res.end();
}));