Is there a way to load a .DAT file into an MS SQL Server table using ADF? - azure-data-factory

I have a .DAT file on an FTP server, and I have to load its data into a SQL table. I am trying to create datasets in ADF for this .DAT file, using FTP as the source and ADLS as the sink, but the dataset is not getting created. Please suggest how this can be done, or whether there are any alternatives.
I tried loading the .DAT file to ADLS using the Binary file format, but I am unable to look at the data.
If I use the CSV file format, then I cannot use a comma, space, tab, blank space, etc. as the delimiter, and ADF requires me to enter one.

I took a simple .dat file containing month data and transferred it into a SQL Server table using the procedure below:
(Screenshots: dataset settings, pipeline source settings, and a successful pipeline run.)
"I tried loading the .DAT file to ADLS using file format as binary but I am unable to look at data."
You cannot transfer data to a SQL table using Binary as the source: if you take Binary as the source format, your sink must also be in Binary format, and a SQL Server table is not a binary format.
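In the procedure above, the .DAT file is treated as delimited text. As a minimal sketch of the source dataset (the dataset name, linked service, file path, and the pipe delimiter are assumptions for illustration, not taken from the question), the JSON definition could look like:

{
  "name": "DatFileFtpSource",
  "properties": {
    "linkedServiceName": {
      "referenceName": "FtpLinkedService",
      "type": "LinkedServiceReference"
    },
    "type": "DelimitedText",
    "typeProperties": {
      "location": {
        "type": "FtpServerLocation",
        "folderPath": "data",
        "fileName": "month.dat"
      },
      "columnDelimiter": "|",
      "firstRowAsHeader": false
    }
  }
}

A Copy activity can then use this as the source and an Azure SQL table dataset as the sink, since ADF cares about the format settings, not the file extension.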

Related

Retrieve Large XML file from HTTP and copy to Azure SQL

I'm currently using an API call to download a 600 MB XML file and copying it to Azure SQL using the Copy activity. Because the file is so large, no run has ever succeeded; the debug run keeps going and never completes.
I have tried splitting the source file using Data Flow, Lookup, and Get Metadata activities; however, HTTP request support isn't available for them. Is there a way to copy a larger XML file from an HTTP request to Azure SQL?

Export a CSV file from AS400 to my PC through a CL program

I want to export a database file, created through a query, from the AS400 machine to my PC as a CSV file.
Is there a way to create that connection between the AS400 and my PC through a CL program?
An idea of what I want to do can be derived from the following code:
DCL        VAR(&PATH) TYPE(*CHAR) LEN(256)
DCL        VAR(&PATH1) TYPE(*CHAR) LEN(256)
DCL        VAR(&CMD) TYPE(*CHAR) LEN(123)
CLRPFM     FILE(DTABASENAME)                              /* Clear the output file       */
RUNQRY     QRY(QRYTEST1)                                  /* Repopulate it via the query */
CHGVAR     VAR(&PATH) VALUE('C:\TESTS')
CHGVAR     VAR(&PATH1) VALUE('C:\TESTS')
CHGVAR     VAR(&CMD) VALUE(%TRIM(&PATH) *CAT '/DTABASENAME.CSV' *BCAT &PATH *BCAT &PATH1)
STRPCO     PCTA(*YES)                                     /* Start the PC Organizer      */
STRPCCMD   PCCMD(&CMD) PAUSE(*YES)                        /* Run the built command on PC */
where I somehow get my database file, give the path that I want it to be saved to on my PC, and lastly run the PC command accordingly.
Take a look at Copy From Query File (CPYFRMQRYF), which will allow you to create a database physical file from the query.
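For example (a minimal sketch with placeholder file, library, and selection names; CPYFRMQRYF works against a file opened with OPNQRYF):

OPNQRYF    FILE((MYLIB/MYPF)) QRYSLT('STATUS *EQ "A"')    /* Open a query over the data   */
CPYFRMQRYF FROMOPNID(MYPF) TOFILE(MYLIB/QRYOUT) MBROPT(*REPLACE) CRTFILE(*YES)
CLOF       OPNID(MYPF)                                    /* Close the open query file    */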
You may also want to look at Copy To Import File (CPYTOIMPF), which will copy data from a database physical file to an Integrated File System (IFS) stream file (such as a .CSV); those are the type of files you'd find on a PC.
For example:
CPYTOIMPF FROMFILE(MYLIB/MYPF) TOSTMF('/home/myuser/DTABASENAME.CSV') RCDDLM(*CRLF) DTAFMT(*DLM) STRDLM(*DBLQUOTE) STRESCCHR(*STRDLM) RMVBLANK(*TRAILING) FLDDLM(',')
However, there's no single command to transfer data to your PC. Well, technically, I suppose that's not true: if you configure an SMB or NFS file share on your PC and configure the IBM SMB or NFS client, you could in fact CPYTOIMPF directly to that file share, or use the Copy Object (CPY) command to copy from the IFS to the network share.
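For instance (a sketch; the server, share, and paths are placeholders), once the share is visible in the IFS:

CPY OBJ('/home/myuser/DTABASENAME.CSV') TODIR('/QNTC/MYSERVER/TESTS')   /* Copy from the IFS to the Windows share */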
If your PC has an FTP server available, you could send the data via the IBM i's FTP client. Similarly, if you have an SSH server on your PC, OpenSSH is available via PASE, and SFTP or SCP could be used. You could also email the file from the i.
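As a sketch of the FTP-client route (the host address, user ID, and member names are hypothetical), the IBM i FTP client can be scripted by overriding its INPUT and OUTPUT files to source members:

OVRDBF FILE(INPUT) TOFILE(MYLIB/QFTPSRC) MBR(FTPCMDS)   /* FTP subcommands to run */
OVRDBF FILE(OUTPUT) TOFILE(MYLIB/QFTPSRC) MBR(FTPLOG)   /* Capture the FTP log    */
FTP RMTSYS('192.168.1.10')

where the FTPCMDS member contains something like:

myuser mypassword
ascii
put /home/myuser/DTABASENAME.CSV DTABASENAME.CSV
quit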
Instead of trying to send the file to your PC from the i, an easier solution would be to kick off a process on the PC that runs the download. My preference would be an Access Client Solutions (ACS) data transfer.
You configure and save the transfer (as a .dtfx file), then you can kick it off with:
STRPCCMD PCCMD('java -jar C:\ACS\acsbundle.jar /plugin=download C:\testacs.dtfx')
More detailed information can be found in the Automating ACS Data Transfer document.
The ACS download component is SQL based, so you could probably remove the need to use Query/400 at all.
Assuming that you have the IFS QNTC file system mapped to your network domain, you can use the CPYTOIMPF command to copy data directly from an IBM i Db2 file to a network directory.
This sample would result in a CSV file.
CPYTOIMPF FROMFILE(file) TOSTMF('//QNTC/servername or ip/path/filename.csv') STMFCCSID(*PCASCII) RCDDLM(*CRLF) STRDLM(*NONE)
Add the FLDDLM(';') option to produce semicolon-separated values, or omit it to use a comma as the value separator.
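For example, with semicolons (the server name and path are placeholders, as above):

CPYTOIMPF FROMFILE(file) TOSTMF('//QNTC/servername/path/filename.csv') STMFCCSID(*PCASCII) RCDDLM(*CRLF) STRDLM(*NONE) FLDDLM(';')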

Some questions about Google Data Fusion

I am discovering the tool and I have some questions:
- What exactly do you mean by the type File in (Source, Sink)?
- Is it also possible to send the result of the pipeline directly to an FTP server?
I checked the documentation, but I did not find this information.
Thank you
Short answer: File refers to the filesystem where the pipelines run. In the Data Fusion context, if you use a File sink, the contents are written to HDFS on the Dataproc cluster.
Data Fusion has an SFTP Put action that can be used to write to SFTP. Here is a simple pipeline showing how to write to SFTP from GCS:
Step 1: GCS Source to File Sink - this writes the content of GCS to HDFS on the Dataproc cluster when the pipeline is run.
Step 2: SFTP Put action, which takes the output of the File sink and uploads it to SFTP.
You need to configure the SFTP Put action's source path to be the same as the File sink's output path.

How can we copy any file within Azure Data Lake Store folders?

We already have Move-AzureRmDataLakeStoreItem, which will move files between folders inside Azure Data Lake. What I am seeking is to copy files within the Data Lake without affecting the original file.
The possibilities that I know of are:
using U-SQL to EXTRACT data from the source file and then OUTPUT to the destination file - but I am trying to copy all sorts of files (.gz, .txt, .info, .exe, .msi) and I am not sure if U-SQL can help me with .gz, .exe, or .msi files
using Data Factory to copy data from/to the Data Lake Store
So, my ask here is: do we have anything else at our disposal with which we can perform a copy of files within Azure Data Lake Store?
You have a couple of other options:
run DistCp on an HDInsight cluster - similar to the instructions provided here: https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-copy-data-wasb-distcp
use AdlCopy if you are copying a limited amount of data (say, tens to hundreds of GB): https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-copy-data-azure-storage-blob
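As a rough sketch of both options (the account names and paths are placeholders; check the linked docs for the exact options):

hadoop distcp adl://myaccount.azuredatalakestore.net/sourcefolder adl://myaccount.azuredatalakestore.net/destfolder

AdlCopy /Source swebhdfs://myaccount.azuredatalakestore.net/sourcefolder/ /Dest swebhdfs://myaccount.azuredatalakestore.net/destfolder/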
Does this suffice? Or do you want something natively supported by Azure Data Lake Store via its REST APIs?
Thanks,
Sachin Sheth
Program Manager, Azure Data Lake.

Understanding Tableau File Types better

I have done a lot of research on Tableau file types, but I would still like to know more about some of the extensions:
What is the relation between twbx and tdsx?
Is a twbx file twb + tde, or twb + tde + tds?
What would be the main difference between a tds and a tdsx file?
How do you use a tps (preferences) file in a Tableau workbook?
Can all the file extensions be used on Server, or only some?
A twbx contains the workbook including a copy of all the data that you connected your workbook to, while a tdsx contains connection information for remote data sources (server IPs, tables, etc.) as well as any local data that somebody else wouldn't have access to otherwise (e.g. an Excel file on your computer). No dashboards are involved in a tdsx. https://onlinehelp.tableau.com/current/pro/desktop/en-us/export_connection.html
A twbx is, if you want, a twb plus a tde file. Remote and local data are stored within your workbook so that other people can access them. https://onlinehelp.tableau.com/current/pro/desktop/en-us/save_savework_packagedworkbooks.html
A tds only includes references to data sources but no actual data, so if you connect to a local Excel file that is not accessible via a network, a colleague of yours won't be able to use this file to get the data. A tdsx includes these datasets so you can share it. https://onlinehelp.tableau.com/current/pro/desktop/en-us/export_connection.html
A tps contains custom colour palettes; how you can use it is described here: https://onlinehelp.tableau.com/current/pro/desktop/en-us/formatting_create_custom_colors.html
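As an illustration (the palette name and colours here are made up), a Preferences.tps file is a small XML document along these lines:

<?xml version='1.0'?>
<workbook>
  <preferences>
    <color-palette name="My Custom Palette" type="regular">
      <color>#1F77B4</color>
      <color>#FF7F0E</color>
      <color>#2CA02C</color>
    </color-palette>
  </preferences>
</workbook>

After editing the Preferences.tps in your Tableau repository and restarting Tableau Desktop, the palette becomes selectable when formatting marks.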
Generally, all files that you can connect to with Tableau Desktop can be accessed via the server (assuming that the server can access them, i.e. the files or sources are on the network). You might, however, be required to install additional drivers on the server to access, for example, SAP BW. These drivers are not contained in the default installation.