How can you get XML out of a Data Factory?
Great, there is an XML format, but it is only supported as a source ... not as a sink
So how can ADF write XML output?
I've looked around and there have been suggestions of using external services, but I'd like to keep it all "in Data Factory"
e.g. I could knock together an Azure Function which takes JSON and converts it to XML, along the lines of the sketch below
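Something like this rough Python sketch (standard library only; element names simply mirror the JSON keys, so they would need to be valid XML names):

    import json
    import xml.etree.ElementTree as ET

    def json_to_xml(json_text, root_name="root"):
        """Convert a JSON document into an XML string (keys become element names)."""
        def build(parent, obj):
            if isinstance(obj, dict):
                for key, value in obj.items():
                    build(ET.SubElement(parent, key), value)
            elif isinstance(obj, list):
                for item in obj:
                    build(ET.SubElement(parent, "item"), item)
            else:
                parent.text = "" if obj is None else str(obj)

        root = ET.Element(root_name)
        build(root, json.loads(json_text))
        return ET.tostring(root, encoding="unicode")

    # e.g. json_to_xml('{"order": {"id": 1, "lines": [{"sku": "A"}]}}')
    # -> '<root><order><id>1</id><lines><item><sku>A</sku></item></lines></order></root>'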
But how can I then get ADF to, e.g., write this XML to a File System?
No, this is not possible.
If you just want to copy the files, then using the Binary format is fine. But if you are trying to have ADF output XML, it is not possible (as the documentation you mentioned says).
I have a pipeline in ADF v2 that calls a SOAP endpoint which returns a base64 encoded string, which is actually a zip file containing 2 files. I am only interested in file[1] (the 2nd one). I want to take this file, and write it to a storage account.
What's the best way to do this in ADF without resorting to external things like a Functions call, etc.?
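For context, if I did fall back to a Function, the decode-and-extract step itself would be small; a Python sketch, assuming the SOAP response body is already in hand as a string (names are placeholders):

    import base64
    import io
    import zipfile

    def extract_second_file(b64_payload: str) -> bytes:
        """Decode the base64 string and return the bytes of the 2nd file in the zip."""
        zip_bytes = base64.b64decode(b64_payload)
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            names = archive.namelist()
            return archive.read(names[1])  # file[1], the second entry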
I have already created the T-SQL export to CSV.
Now I need to export in HL7 V2 format.
Do you know what tools are available?
Can someone share some SQL code as a starting point?
Thanks
JCS
If you already have a CSV exported using T-SQL, then I guess you can use Mirth (NextGen) Connect as the interface engine. It allows you to pick up the CSV file and then create an HL7 v2 message from it.
This link should help you: https://github.com/nextgenhealthcare/connect
The good thing is that it is designed for healthcare interoperability and supports HL7 v2, FHIR, and other data types like CSV, JSON, delimited text, etc.
Mirth Connect will help you with that. You must create a "File Reader" channel and map out your CSV headers. Then you can define the destination and the type of connector of your choice (transmission method: HTTP Sender, File Writer, Database Writer, TCP Sender, Web Service Sender, etc.) and transform your CSV into an HL7 v2 message to be sent.
Also, you can create a channel that queries the data straight from the database using a Database Reader and builds your HL7 v2 message from there, skipping the CSV extraction and saving some time.
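To give a feel for the mapping a transformer ends up doing, here is a rough Python sketch (not Mirth's JavaScript transformer language) that turns one CSV row into a minimal ADT^A01 message; the segment fields and column names are only placeholders:

    import csv
    from datetime import datetime

    def row_to_hl7(row: dict) -> str:
        """Build a minimal ADT^A01 message from one CSV row (placeholder column names)."""
        now = datetime.now().strftime("%Y%m%d%H%M%S")
        msh = (f"MSH|^~\\&|CSV_EXPORT|MY_FACILITY|RECEIVER|THEIR_FACILITY|{now}||"
               f"ADT^A01|{row['msg_id']}|P|2.3")
        pid = (f"PID|1||{row['patient_id']}||{row['last_name']}^{row['first_name']}||"
               f"{row['dob']}|{row['sex']}")
        return "\r".join([msh, pid])  # HL7 v2 segments are separated by carriage returns

    with open("export.csv", newline="") as f:
        for row in csv.DictReader(f):
            print(row_to_hl7(row))  # or hand each message to the sender of your choice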
I want to create an ADF pipeline that accesses an API, uses some filter parameters to get data from it, and writes the output in JSON format to a Data Lake. How can I do that?
Once the JSON is available in the lake, it needs to be converted to a CSV file. How can that be done?
You can create a pipeline with a copy activity from the HTTP connector to the Data Lake connector. Use HTTP as the copy source to access the API (https://learn.microsoft.com/en-us/azure/data-factory/connector-http) and specify the format in the dataset as JSON; see https://learn.microsoft.com/en-us/azure/data-factory/supported-file-formats-and-compression-codecs#json-format on how to define the schema. Use the Data Lake connector as the copy sink, specify the format as Text format, and adjust settings such as the row delimiter and column delimiter according to your needs.
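If it helps to picture what that copy is doing, the manual equivalent looks roughly like this Python sketch: call the API with a filter parameter and land the response as JSON (the URL and parameter name are placeholders; in ADF the copy activity handles this for you):

    import json
    import requests

    # Placeholder API URL and filter parameter
    response = requests.get("https://example.com/api/data",
                            params={"modifiedAfter": "2021-01-01"}, timeout=30)
    response.raise_for_status()

    # Land the raw JSON; in ADF this is the file the Data Lake sink dataset points at
    with open("api_output.json", "w") as f:
        json.dump(response.json(), f)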
The workflow below may meet your requirement:
Use a Copy activity in ADF v2, where the source dataset is an HTTP data store and the destination is Azure Data Lake Store. The HTTP source data store lets you fetch data by calling the API, and the Copy activity will copy the data into your destination data lake.
Chain a U-SQL activity after the Copy activity; once the Copy activity succeeds, it will run a U-SQL script to convert the JSON file to a CSV file.
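The conversion itself is just a flattening step; here is a rough Python illustration of the logic such a script would implement, assuming the copied JSON is an array of flat objects (file names are placeholders):

    import csv
    import json

    # Assumes the copied file is a JSON array of flat objects, e.g. [{"id": 1, "name": "a"}, ...]
    with open("api_output.json") as src:
        records = json.load(src)

    with open("api_output.csv", "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)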
I am trying to utilise Django REST APIs to insert data into the database, instead of writing to it directly. I've been able to read JSON data using the tRESTClient component, but I am not too sure about the insertion/POST. Could someone point me to the components (and their connections) that I should use?
The current job that I have is mostly:
Read data from raw file -> tMap -> DB
and I wish to do something like:
Read data from raw file -> tMap -> (pass on data to REST endpoint via POST)
I used the tRESTClient component after my tMap and I could see the records getting inserted into the DB, but all of them are without any data. Strangely, nowhere was I asked to specify the JSON tree. The number of records getting inserted equals the number of rows being read from the raw file, so at least something is right. But I couldn't locate the menu/options for specifying which data element read from the raw file should map to which JSON element.
How do I specify the data to JSON mapping?
PS: I realise that this might not be the most efficient way to ingest data, but that's what the business wants, since it brings in an additional layer of control.
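To make the intent concrete, the per-row behaviour I am after is roughly this plain Python sketch (not Talend configuration; the endpoint URL and field names are placeholders):

    import requests

    def post_row(row: dict) -> None:
        """POST one flat record as JSON to a (placeholder) Django REST endpoint."""
        payload = {
            "name": row["name"],          # each raw-file column is mapped...
            "quantity": row["quantity"],  # ...to the JSON key the API expects
        }
        response = requests.post("http://example.com/api/items/", json=payload, timeout=10)
        response.raise_for_status()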
Actually, I need to create a transformation that will read a JSON file from the system directory and rename the JSON fields (keys) based on metadata inputs. Finally, it should write the modified JSON into a '.js' file using the JSON output step. This conversion must be done using the ETL Metadata Injection step.
Since I am new to the Pentaho Data Integration tool, can anyone help me with sample '.ktr' files for the above scenario?
Thanks in advance.
The same use case is covered in the official Pentaho documentation here, except it does it with Excel files rather than JSON objects.
Now, the Metadata Injection step requires building some rather sophisticated machinery, whereas JSON is rather simple to rework with a bit of simple JavaScript. So, where do you get the "dictionary" (source field name -> target field name) from?
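For comparison, outside of PDI the renaming itself is tiny; a rough Python sketch, assuming the metadata is just an old-key -> new-key mapping and the file holds an array of objects (names are placeholders):

    import json

    # Placeholder metadata: source field name -> target field name
    rename_map = {"cust_nm": "customer_name", "ord_dt": "order_date"}

    with open("input.json") as src:
        records = json.load(src)  # assumed to be an array of objects

    # Rename mapped keys on every record; unmapped keys pass through unchanged
    renamed = [{rename_map.get(k, k): v for k, v in rec.items()} for rec in records]

    with open("output.js", "w") as dst:
        json.dump(renamed, dst, indent=2)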