How to parse a PDF in nodejs - pdf2json

I am trying to parse a pdf and categorize information based on text formatting/decoration. How do you suggest I do that?
For example, I have a pdf in which the structure is repeated:
S.No. BOLD+UNDERLINED TITLE para
How do I categorize this data into an array of objects based on text decoration:
[
{ sno: "", title: "", desc: "" },
...
]

I went through the documentation for pdf2json and figured that I might have to use pdfData.formImage.Pages[pageNumber].Texts[wordNumber].R[0] object after parsing the pdf to get hold of values I need.
The property TS of the above object is an array, the value at TS[2] corresponds to whether the text is bold (value = 1) or not (value = 0). I could not find any details on data related to underline text-decoration.
I also needed to initialize the parser as follows:
let pdfParser = new PDFParser(null, 1);
Check this for more details.
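As a rough sketch of how the bold flag could drive the grouping: the helper below walks a page's Texts array and builds the { sno, title, desc } records. The function name groupTexts is hypothetical, and it assumes that a serial number is a text run such as "1." or "12", that R[0].T is URI-encoded, and that TS[2] === 1 marks bold text as described above. Underline is not detected.

```javascript
// Hypothetical helper: groups pdf2json text runs into { sno, title, desc } records.
// Assumptions: a serial number looks like "1." or "12"; bold runs (TS[2] === 1)
// belong to the title; every other run belongs to the description.
function groupTexts(texts) {
  const records = [];
  let current = null;

  for (const text of texts) {
    const run = text.R[0];
    const value = decodeURIComponent(run.T).trim();
    if (!value) continue; // skip empty runs

    if (/^\d+\.?$/.test(value)) {
      // A serial number starts a new record.
      current = { sno: value, title: "", desc: "" };
      records.push(current);
    } else if (current) {
      const key = run.TS[2] === 1 ? "title" : "desc";
      current[key] = current[key] ? `${current[key]} ${value}` : value;
    }
  }
  return records;
}

// Example input shaped like pdfData.formImage.Pages[n].Texts
const sampleTexts = [
  { R: [{ T: "1.", TS: [0, 15, 0, 0] }] },
  { R: [{ T: "Sample%20Title", TS: [0, 15, 1, 0] }] },
  { R: [{ T: "First%20paragraph", TS: [0, 15, 0, 0] }] },
];
console.log(groupTexts(sampleTexts));
```

Runs that appear before the first serial number are ignored in this sketch; adjust the serial-number pattern to match your document.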

Related

How to filter an array of object in Mui DataGrid?

I recently changed my tables to Mui-datagrid on Material UI 5, and I have a special use case with an array of objects. I want to enable the phone number filter in this column, but the number is provided as an object list.
phone: [
{ type: "home", number: "795-946-1806" },
{ type: "mobile", number: "850-781-8104" }
]
I was expecting a 'customFilterAndSearch' or an option to customise how to search in this specific field.
customFilterAndSearch: (term, rowData) =>
!!rowData?.suppressedOptions.find(({ description }) =>
description?.toLowerCase().includes(term.toLowerCase())
),
I have made some tries with the filterOperators, but no success yet. I have made a full example here https://codesandbox.io/s/mui-data-grid-vs05fr?file=/demo.js
As far as I can see from the DataGrid documentation, there is no way to change the filter function for a specific column.
Likely the best workaround for your use case is to convert the data to a string before you pass it to the DataGrid, though you will lose the styling that you currently apply by making the phone type bold.
On second thought, your best bet would probably be to split the phone column into two columns, which would be the cleanest way of solving your problem.
Add helper function.
You could potentially add a helper function to just map all the phone lists to something like mobilePhone or homePhone
const mapPhoneObject = (rows) => {
rows.forEach((row) => {
row.phone.forEach((phone) => {
row[`${phone.type}Phone`] = phone.number;
});
});
return rows
};
I've added a fork of your snippet with my function, it is I think the most viable solution for your problem: https://codesandbox.io/s/mui-data-grid-forked-ppii8y
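For reference, here is a minimal sketch of the same idea that does not mutate the original rows, together with column definitions that would make each phone type filterable on its own. The name flattenPhones and the homePhone/mobilePhone fields are illustrative and assume the phone types are "home" and "mobile" as in the question.

```javascript
// Hypothetical non-mutating variant of the helper above:
// returns new row objects with one flat field per phone type.
function flattenPhones(rows) {
  return rows.map((row) => {
    const flat = { ...row };
    for (const phone of row.phone ?? []) {
      flat[`${phone.type}Phone`] = phone.number;
    }
    return flat;
  });
}

// Column definitions for the flattened fields (plain string columns,
// so the default DataGrid filter operators apply to them).
const phoneColumns = [
  { field: "homePhone", headerName: "Home phone", width: 160 },
  { field: "mobilePhone", headerName: "Mobile phone", width: 160 },
];

const sampleRows = [
  {
    id: 1,
    phone: [
      { type: "home", number: "795-946-1806" },
      { type: "mobile", number: "850-781-8104" },
    ],
  },
];
console.log(flattenPhones(sampleRows), phoneColumns);
```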

Arbitrary HTTP API Call to Enter Cell Value into a MULTI_PICKLIST Column

I am quite new to Smartsheet and to programming.
I am using Integromat to update various stuff in Smartsheets - 99% operations are done via a nice interface for dummies.
But I have an issue with one column which is MULTI_PICKLIST and which cannot be processed with native dummy-friendly UI.
Basically, I'm adding a new row and one of the columns on the way is the MULTI_PICKLIST one. In order to enter value into this cell, I need to make an arbitrary HTTP API call.
I know row ID, I know column ID. I just need to construct the body of the HTTP request.
The possible picklist values are John, Maya, or Paul. Assume I need to enter "John" into the column.
Attached, you will find my "progress". Obviously, I'm stuck with the BODY part. Can someone give me a little push, please? I think it's gotta be like 5 lines of code.
This is what I have:
A few things...
First, the value that you're using for URL doesn't look quite right. It should be in the following format, where {sheetId} is replaced with the ID of the sheet you're updating:
sheets/{sheetId}/rows
Second, I don't think you need the key/value that you've specified for Query String -- I'd suggest that you delete this info.
Next, I'm not sure what the other possible values are for Type (based on your screenshot, it looks like a picklist) -- but if JSON is an option, I'd suggest choosing that option instead of Text.
Finally, here's an example of the correct structure/contents for Body to update a MULTI_PICKLIST cell with the value John -- replace the value of the id property (5225480965908356) with your Row ID and replace the value of the columnId property (8436269809198980) with your Column ID:
[
{
"id": "5225480965908356",
"cells": [
{
"columnId": "8436269809198980",
"objectValue": {
"objectType": "MULTI_PICKLIST",
"values": ["John"]
}
}
]
}
]
If you want to select multiple values for a MULTI_PICKLIST cell, here's an example that specifies two values for the cell (John and Maya):
[
{
"id": "5225480965908356",
"cells": [
{
"columnId": "8436269809198980",
"objectValue": {
"objectType": "MULTI_PICKLIST",
"values": ["John", "Maya"]
}
}
]
}
]
** UPDATE **
My initial answer above assumed you wanted to update a cell value in a MULTI_PICKLIST column (because you've selected PUT for the Method value in your screenshot, which is the verb used to update a row). Having re-read your question just now, though, it sounds like maybe you want to add a new row. Is that correct? If so, then the value for Method should be POST (not PUT), and Body will need to include additional objects within the cells array to specify values of other cells in the new row. The following example request (when used with the verb POST) adds a new row and populates 3 cells in that row, the first of which is a MULTI_PICKLIST cell:
[
{
"cells": [
{
"columnId": "8436269809198980",
"objectValue": {
"objectType": "MULTI_PICKLIST",
"values": ["John"]
}
},
{
"columnId": 6101753539127172,
"value": "test value"
},
{
"columnId": 4055216160040836,
"value": 10
}
]
}
]
More info about the Add Rows request can be found in the Smartsheet API docs: Add Rows.
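If you ever build the Body in code rather than by hand, a small helper along these lines could produce both shapes shown above. This is only a sketch: buildMultiPicklistRow is a hypothetical name, and it covers just the single MULTI_PICKLIST cell. Pass a row ID for an update (PUT) and omit it when adding a row (POST).

```javascript
// Hypothetical helper: builds the request body for one row that sets a
// MULTI_PICKLIST cell. With rowId the body suits an update (PUT);
// without it the body suits adding a new row (POST).
function buildMultiPicklistRow(columnId, values, rowId) {
  const row = {
    cells: [
      {
        columnId,
        objectValue: { objectType: "MULTI_PICKLIST", values },
      },
    ],
  };
  if (rowId !== undefined) {
    row.id = rowId;
  }
  return [row];
}

// Update an existing row with two selected values.
console.log(
  JSON.stringify(
    buildMultiPicklistRow("8436269809198980", ["John", "Maya"], "5225480965908356"),
    null,
    2
  )
);
```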

Algolia search for array that contains value

I am using Algolia search, and right now I use this to find a specific item by id:
algolia.getObject(id)
However, I need to make a search by barcode rather than ID - need a pointer in the right direction here.
The barcodes field is an array that can contain one or more barcode numbers.
You can trigger a search with filters on the barcodes attribute. The filters parameter supports multiple formats, numeric values included. It does not matter whether the attribute holds a single value or multiple values (an array). Here is an example with the JavaScript client:
const algoliasearch = require('algoliasearch');
const client = algoliasearch('YOUR_APP_ID', 'YOUR_API_KEY');
const index = client.initIndex('YOUR_INDEX_NAME');
index
.search({
filters: 'barcodes = YOUR_BARCODE_VALUE',
})
.then(response => {
console.log(response.hits);
});
The above example assumes that your records have a structure like this one:
{
"barcodes": [10, 20, 30]
}
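To match any one of several barcodes, the numeric conditions can be combined with OR in the filters string. A minimal sketch, assuming the barcodes are stored as numbers as in the record above (buildBarcodeFilter is a hypothetical helper name):

```javascript
// Hypothetical helper: builds an Algolia `filters` string that matches
// records whose numeric `barcodes` attribute contains any given value.
function buildBarcodeFilter(barcodes) {
  if (!Array.isArray(barcodes) || barcodes.length === 0) {
    throw new Error("Expected at least one barcode");
  }
  return barcodes
    .map((code) => {
      if (typeof code !== "number" || !Number.isFinite(code)) {
        throw new Error(`Barcode must be a finite number: ${code}`);
      }
      return `barcodes = ${code}`;
    })
    .join(" OR ");
}

console.log(buildBarcodeFilter([10, 20]));
```

The resulting string can be passed as the filters value in the search call shown earlier.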

How to specify the gid (tabs) in Google Spreadsheet API v4?

I am trying to use the RESTful API to gather data from a Google spreadsheet.
I read the documentation at https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/get but I cannot find a way to specify a particular GID (tab) of the spreadsheet.
After calling the API, I was able to get a value from the first tab of the spreadsheet, but I want to read from the second tab (another GID).
Any help would be appreciated. Thanks.
Edit: Added example:
I have a sample spreadsheet here: https://docs.google.com/spreadsheets/d/1da3e6R08S8OoibbOgshyAQy7m9VRGPOrSGzVIvMSXf4/edit#gid=0
When I want to get the value on A1, I can call the RESTful API:
https://content-sheets.googleapis.com/v4/spreadsheets/1da3e6R08S8OoibbOgshyAQy7m9VRGPOrSGzVIvMSXf4/values/A1?key=API_KEY
And I will get:
{
"range": "Sheet1!A1",
"majorDimension": "ROWS",
"values": [
[
"This is on the first tab!"
]
]
}
But as you can see in my spreadsheet, I have two tabs (the second is sheet2). How can I get the value "This is on the second tab!"?
To target a specific tab of the spreadsheet, put the sheet name in the range, wrapped in single quotes, and wrap the whole range string in double quotes. Hope it helps.
var query = {
auth: authClient,
spreadsheetId: 'spreadsheetID',
range: "'sheet1'!A1:D3",
};
$data[] = new Google_Service_Sheets_ValueRange([
'range' => 'MGM!A1:Z' . $rowNum,
'values' => $values
]);
Here, MGM is the name of your tab (sheet).
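For the plain REST call from the question, the same idea applies: the tab is selected by its name inside the A1 range, not by its gid. A minimal sketch of building such a URL, where buildValuesUrl is a hypothetical helper and SPREADSHEET_ID and API_KEY are placeholders:

```javascript
// Hypothetical helper: builds a spreadsheets.values.get URL for a named tab.
// The sheet name is wrapped in single quotes (embedded quotes are doubled)
// and the whole range is URL-encoded.
function buildValuesUrl(spreadsheetId, sheetName, a1Range, apiKey) {
  const quotedName = `'${sheetName.replace(/'/g, "''")}'`;
  const range = encodeURIComponent(`${quotedName}!${a1Range}`);
  return (
    `https://sheets.googleapis.com/v4/spreadsheets/${spreadsheetId}` +
    `/values/${range}?key=${encodeURIComponent(apiKey)}`
  );
}

console.log(buildValuesUrl("SPREADSHEET_ID", "Sheet2", "A1", "API_KEY"));
```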

JQGrid Dynamic Select Data

I have utilised the example code at this link
and I have got my grid to show a dynamically constructed select dropdown on add and edit. However when it is just showing the data in the grid it shows the dropdown index instead of its associated data. Is there a way to get the grid to show the data associated with the index instead of the index itself.
e.g. the data on my select could be "0:Hello;1:World"; The drop down on the edit/add window is showing Hello and World and has the correct indexes for them. If the cell has a value of 1 I would expect it to show World in the grid itself but it is showing 1 instead.
Here is the row itself from my grid:
{ name: 'picklist', index: 'picklist', width: 80, sortable: true, editable: true,
edittype: "select", formatter: "select", editrules: { required: true} },
I am filling the dynamic data content in the loadComplete event as follows:
$('#mygrid').setColProp('picklist', { editoptions: { value: picklistdata} });
picklist data is a string of "0:Hello;1:World" type value pairs.
Please can anyone offer any help. I am fairly new to JQGrids so please could you also include examples.
I know you have already solved the problem but I faced the same problem in my project and would like to offer my solution.
First, I declare a custom formatter for my select column (in this case, the 'username' column).
$.extend($.fn.fmatter, {
selectuser: function(cellvalue, options, rowdata) {
var userdata;
$.ajax({
url:'dropdowns/json/user',
async:false,
dataType:'json',
cache:true,
success: function(data) {
userdata = data;
}
});
return typeof cellvalue != 'undefined' ? userdata[cellvalue] : cellvalue ;
}
});
This formatter loads up the mapping of id and user in this case, and returns the username for the particular cellvalue. Then, I set the formatter:'selectuser' option to the column's colModel, and it works.
Of course, this does one json request per row displayed in the grid. I solved this problem by setting 10 seconds of caching to the headers of my json responses, like so:
private function set_caching($seconds_to_cache = 10) {
$ts = gmdate("D, d M Y H:i:s", time() + $seconds_to_cache) . " GMT";
header("Expires: $ts");
header("Pragma: cache");
header("Cache-Control: max-age=$seconds_to_cache");
}
I know this solution is not perfect, but it was adequate for my application. Cache hits are served by the browser instantly and the grid flows smoothly. Ultimately, I hope the built-in select formatter will be fixed to work with json data.
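An alternative to relying on HTTP caching would be to load the mapping once and reuse it for every cell. The sketch below shows only the caching idea and is not jqGrid API: createCachedLookup and loadMapping are hypothetical names, and loadMapping stands in for whatever synchronous call returns the id-to-name object.

```javascript
// Hypothetical helper: wraps a loader so the id -> label mapping is fetched
// only once, then reused for every formatted cell.
function createCachedLookup(loadMapping) {
  let cached = null;
  return function lookup(cellvalue) {
    if (cached === null) {
      cached = loadMapping(); // runs on the first call only
    }
    const known =
      cellvalue !== undefined &&
      Object.prototype.hasOwnProperty.call(cached, cellvalue);
    return known ? cached[cellvalue] : cellvalue;
  };
}

// Example with an in-memory loader standing in for the JSON request.
const lookupUser = createCachedLookup(() => ({ 0: "Hello", 1: "World" }));
console.log(lookupUser(1), lookupUser(0));
```

Inside the custom formatter, the body could then reduce to a single call to the cached lookup, so only the first displayed row triggers a request.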
If you save the ids of the select elements in jqGrid and want to show the corresponding texts, then you should use formatter:'select' in the colModel (see http://www.trirand.com/jqgridwiki/doku.php?id=wiki:predefined_formatter#formatter_type_select) together with edittype: "select".
Using stype: 'select' could also be interesting for you if you plan to support data searching.