How do I convert files above 200 KB in size to JSON Schema in Liquid XML? - liquid-xml

I'm trying to convert a very large JSON file to JSON Schema, but when I upload the file to Liquid XML it shows a prompt saying the file size is above 200 KB. Can buying a licensed version fix this issue? If so, which license should I buy?

The online JSON converter tool has a 200 KB limit, so you would need to use the desktop app, Liquid Studio.
You would need a JSON Editor Edition license to use all of the JSON and JSON Schema tools.
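If the desktop tooling is not an option, schema inference can also be scripted outside Liquid Studio; a minimal sketch, assuming the Python genson package and a placeholder file name large.json (neither is part of the answer above):
import json
from genson import SchemaBuilder

# Read the large JSON document; no 200 KB limit applies here
with open('large.json') as f:
    data = json.load(f)

# Infer a draft JSON Schema from the document's structure
builder = SchemaBuilder()
builder.add_object(data)
print(builder.to_json(indent=2))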

Related

Convert OpenStreetMap POI Data to CSV

I am looking to extract some Point of Interest (POI) data from OpenStreetMap in a tabular format. I have used the link below, navigated to the relevant country, and downloaded the file:
http://download.geofabrik.de/
What I get is a file with the .osm.pbf extension. However, I think it is possible to download the files in other formats like .shp.zip and .osm.bz2. Is there some way that I can convert this data into a tabular format like a CSV file?
I came across a tool called Osmosis which can be used to manipulate data in these formats, but I am not sure if it can be used for this purpose:
https://wiki.openstreetmap.org/wiki/Osmosis
I was able to successfully install it on my Windows machine, though.
To be frank, I am not even sure if this gives me what I want.
In essence, what I am looking for is Sri Lankan POI data that contains the following attributes:
https://data.humdata.org/dataset/hotosm_lka_points_of_interest
If the conversion of this file does not give me data in this format, then I am open to other approaches as well. What is the best way to go about acquiring this data?
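For illustration, one possible route is scripting the extraction directly; a minimal sketch, assuming the Python pyosmium package (not mentioned above) and the Geofabrik sri-lanka-latest.osm.pbf download, with the amenity tag filter and CSV columns as placeholders:
import csv
import osmium

class POIHandler(osmium.SimpleHandler):
    """Collects nodes that carry an amenity tag (a rough stand-in for POIs)."""
    def __init__(self, writer):
        super().__init__()
        self.writer = writer

    def node(self, n):
        if 'amenity' in n.tags:
            self.writer.writerow([n.id, n.location.lon, n.location.lat,
                                  n.tags.get('name', ''), n.tags['amenity']])

with open('poi.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['osm_id', 'lon', 'lat', 'name', 'amenity'])
    POIHandler(writer).apply_file('sri-lanka-latest.osm.pbf')
This only looks at nodes; POIs mapped as ways or relations would need extra handling, so treat it as a starting point rather than a complete extractor.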

Can I use a SQL query or script to create format description files for multiple tables in an IBM DB2 for System i database?

I have an AS400 with an IBM DB2 database, and I need to create a Format Description File (FDF) for each table in the DB. I can create an FDF file using the IBM Export tool, but it will only create one file at a time, which will take several days to complete. I have not found a way to create the files systematically using a tool or query. Is this possible, or should this be done using scripting?
First of all, to correct a misunderstanding...
A Format Description File has nothing at all to do with the format of a Db2 table. It actually describes the format of the data in a stream file that you are uploading into the Db2 table. Sure you can turn on an option during the download from Db2 to create the FDF file, but it's still actually describing the data in the stream file you've just downloaded the data into. You can use the resulting FDF file to upload a modified version of the downloaded data or as the starting point for creating an FDF file that matches the actual data you want to upload.
Which explains why there's no built-in way to create an appropriate FDF file for every table on the system.
I question why you think you actually need to generate an FDF file for every table.
As I recall, the format of the FDF (or its newer variant, FDFX) is pretty simple; it shouldn't be all that difficult to generate if you really wanted to. But I don't have one handy at the moment, and my Google-fu has failed me.
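If you do want to script something, the column layout of every table can at least be pulled from the SQL catalog as a starting point; a minimal sketch, assuming Python with pyodbc and an IBM i Access ODBC DSN, where the DSN and library names are placeholders and the actual FDF/FDFX syntax (not reproduced here) still has to be written out by you:
import pyodbc

# Connect through an IBM i Access ODBC data source (DSN, user and password are placeholders)
conn = pyodbc.connect('DSN=MYAS400;UID=myuser;PWD=mypassword')
cur = conn.cursor()

# Pull the column layout of every table in one library from the SQL catalog
cur.execute("""
    SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, LENGTH, NUMERIC_SCALE
    FROM QSYS2.SYSCOLUMNS
    WHERE TABLE_SCHEMA = 'MYLIB'
    ORDER BY TABLE_NAME, ORDINAL_POSITION
""")

# Group the catalog rows per table; turning each group into FDF text is the part
# that still depends on the FDF syntax discussed above
layouts = {}
for table, column, data_type, length, scale in cur.fetchall():
    layouts.setdefault(table, []).append((column, data_type, length, scale))

for table, columns in layouts.items():
    print(table, columns)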

How to Read a Word Document in PostgreSQL

I am new to this stuff and not sure if this can be achieved. I am expecting a bunch of Word documents daily containing structured data, which I need to process and store in my Postgres database. I searched the internet, and all I could find is storing the Word document in blob/bytea format, then encoding, decoding, etc., which again returns text that I cannot process. Can this be achieved? If so, can you please provide sample code that can count words/characters/lines in the Word document; I can extend that to my needs and requirements. I am using Ubuntu on AWS.
show server_encoding;
UTF8
I have tried the following:
pg_read_file('/var/lib/postgresql/docs/testDoc.docx');
pg_read_binary_file('/var/lib/postgresql/docs/testDoc.docx')
encode(pg_read_binary_file('/var/lib/postgresql/docs/testDoc.docx'),'base64')
decode(encode(pg_read_binary_file('/var/lib/postgresql/docs/testDoc.docx'),'base64'),'base64')::text;
Regards
Bharat
You can check out the OpenXML library from Microsoft.
This is a .NET Framework based OSS library that maps Office documents as objects.
With this library you can build, for example, a program that extracts the information and sends the data to your PostgreSQL.
This library is available for .NET Core too, so you can build a program that can also run on Ubuntu (see the .NET Core library on NuGet).
Another way is to write a Java program. Same concept.
In Java you can use the Apache POI library to read Office documents.
Remember that an Office document is a ZIP-compressed file that contains XML data representing the document.
One option is to use some front-end language to read the .docx files and upload them to Postgres.
In Ruby you could:
Install the docx and pg gems:
gem install docx pg
and then create something like the following Ruby file:
require 'docx'
require 'pg'

# Open the Word document and connect to PostgreSQL
doc = Docx::Document.open('document.docx')
conn = PG.connect( dbname: 'postgres_db', user: 'username', password: 'password' )

# Insert the text of each paragraph as one row
doc.paragraphs.each do |p|
  conn.exec_params( "INSERT INTO paragraphs (paragraph) VALUES ( $1 )", [ p.text ] )
end
I'm sure this could be done in Python or whatever language you know best.
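For reference, roughly the same idea in Python; a minimal sketch, assuming the python-docx and psycopg2 packages and the same hypothetical paragraphs table:
from docx import Document
import psycopg2

# Read the Word document and connect to PostgreSQL
doc = Document('document.docx')
conn = psycopg2.connect(dbname='postgres_db', user='username', password='password')
cur = conn.cursor()

# Store the text of each paragraph as one row
for p in doc.paragraphs:
    cur.execute("INSERT INTO paragraphs (paragraph) VALUES (%s)", (p.text,))
conn.commit()
conn.close()

# The counts asked about in the question can be derived from the same objects
words = sum(len(p.text.split()) for p in doc.paragraphs)
characters = sum(len(p.text) for p in doc.paragraphs)
print(len(doc.paragraphs), words, characters)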

Is there a possibility to import documents in bulk on Alfresco

Is there a possibility to bulk import documents and their metadata in Alfresco? In fact, what I want is to upload a bunch of documents and inject their metadata from an XML file.
Thanks in advance.
The link that Abbas pointed you to is the best option. The Bulk File System Import Tool supports bulk importing content as well as metadata.
Write a script that exports your spreadsheet into the format the BFSIT expects. Then upload your content and each of the content's metadata descriptor files (generated from your spreadsheet) to the server. Finally, run the import.
If instead what you are trying to do is not import files and metadata but instead you just want to set metadata from your spreadsheet on a bunch of existing content that is already in the repository, then what you can do is write a script that reads your spreadsheet and uses something like Python cmislib or OpenCMIS (both are from Apache Chemistry) to set that metadata on objects in the repository in bulk.
You can also use CMIS to upload files, but the BFSIT is much more efficient.
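To make the metadata-only approach concrete, a minimal sketch, assuming Python cmislib, the spreadsheet exported to a metadata.csv file with path and department columns, and a hypothetical my:department property; the CMIS endpoint URL, credentials and property id are placeholders that depend on your Alfresco version and content model:
import csv
from cmislib import CmisClient

# Connect to the repository's CMIS AtomPub endpoint (URL and credentials are placeholders)
client = CmisClient(
    'http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom',
    'admin', 'admin')
repo = client.defaultRepository

# Each CSV row names an existing document by repository path plus the value to set;
# 'my:department' stands in for whatever property your content model actually defines
with open('metadata.csv', newline='') as f:
    for row in csv.DictReader(f):
        doc = repo.getObjectByPath(row['path'])
        doc.updateProperties({'my:department': row['department']})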

Importing and exporting KML into and from DashDB for Geospatial Analysis

I am a recent convert to IBM's DashDB, and I am considering proposing to use it at my work. My case would be greatly bolstered if I could show good, easy integration for geospatial analytics data, namely loading and performing SQL filtering on geodata currently in .shp or .kml formats. If it were possible to also export the filtered data as a KML result, that would be AMAZING.
So, to give a practical example, say I have an .shp file with all the ZIP codes in the US; I want to select the shape for the 02138 ZIP code and send it to the query sender in KML format.
Does anyone have experience with that?
Loading a .kml file directly into dashDB is currently not possible, as answered in another post here. You may use ArcGIS for Desktop to connect to dashDB if you want to do this.
https://gis.stackexchange.com/questions/141194/importing-and-exporting-kml-into-and-from-dashdb-for-geospatial-analysis
On the other hand, uploading a .shp file is natively supported and is demonstrated in the tutorial below. Note that dashDB requires the uploaded file to be compressed in the .zip, .tar or .gz file format.
https://www-01.ibm.com/support/knowledgecenter/SS6NHC/com.ibm.swg.im.dashdb.doc/tutorial/analyze-geospatial-data.html
Also, ArcGIS for Desktop should let you use its export features as well.
There are two options for loading shape files: via the GUI ("Load geospatial data") or via the CLPPlus command IDA LOADGEOSPATIALDATA.
Once the data is loaded, it resides in the internal geometry format (e.g. as ST_Polygon) and can then be accessed in various formats through spatial functions embedded in SQL calls. Unfortunately, KML is not one of the formats that dashDB supports as of now; the options are GML, WKT and WKB.
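As a worked example of that last point, here is one way to pull a single geometry back as WKT from SQL; a minimal sketch, assuming Python with the ibm_db driver, a hypothetical ZIPCODES table with ZIP and SHAPE columns loaded as described above, and placeholder connection details (ST_AsText may need a schema qualifier such as DB2GSE on your system):
import ibm_db

# Placeholder dashDB connection string
conn = ibm_db.connect(
    "DATABASE=BLUDB;HOSTNAME=myhost;PORT=50000;PROTOCOL=TCPIP;UID=myuser;PWD=mypassword;",
    "", "")

# Return the 02138 shape as WKT, one of the output formats mentioned above
stmt = ibm_db.exec_immediate(
    conn,
    "SELECT ZIP, ST_AsText(SHAPE) AS WKT FROM ZIPCODES WHERE ZIP = '02138'")

row = ibm_db.fetch_assoc(stmt)
while row:
    print(row['ZIP'], row['WKT'])
    row = ibm_db.fetch_assoc(stmt)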