Add Double-Quote on Header of Blob file - azure-data-factory

I have a Copy data activity, where the source is SQL Server Query and sink is a blob file.
The blob file is created successfully, but the header values are not wrapped in double quotes the way the data rows are. Can that be configured in ADF?
Blob file:

Unfortunately, that is not possible in Azure Data Factory. When you explicitly declare the first row as the header, that row is treated as column names and is written without double quotes, unlike the data rows. The Quote character and Escape character settings apply only to the data rows; you can also use them to avoid quotes in the rows.
One workaround is to run a second Copy activity that uses the previous output blob as the source and another blob as the sink, and to not declare the first row as header on either the source or the sink dataset; the header row is then treated as data and gets quoted like every other row:

I found a better solution that does not require another Copy activity. In the Mapping section of the Copy activity, just add double quotes (") around each column name.
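
For illustration (the column names and values here are made up): by default the file is written with an unquoted header,

Id,Name,Amount
"1","Alice","10.50"

whereas after adding quotes around the names in the Mapping, the header comes out quoted as well:

"Id","Name","Amount"
"1","Alice","10.50"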

Related

How to pass special character as parameter in Azure Data Factory?

I am trying to parametrize the Column Delimiter field in a CSV dataset in Azure Data Factory.
(https://i.stack.imgur.com/JGkD5.png)
This unfortunately doesn't work when I pass a special character as a parameter.
When I hardcode the special character in the column delimiter field all works as expected.
This works
However, when I have \u0006 as a parameter in SQL DB (varchar(10) type)
(https://i.stack.imgur.com/GyTdr.png)
and I pass it in the pipeline
(https://i.stack.imgur.com/CUeRf.png)
The Copy Data activity doesn't detect this special character as a delimiter.
My guess is that when I use a parameter it passes \u0006 as a string, but I can't find anywhere how to bypass that.
I tried passing \u0006 to the column delimiter as dynamic content, but it was not treated as a column delimiter and all the data came through as a single column.
Therefore, I passed the character that \u0006 stands for (ACK) as the dynamic value for the column delimiter, and that worked. The \u0006 string can be converted into that character with a SQL script. Below are the steps to do this.
File delimiters are stored in a table.
To convert this column into the equivalent character, the \u prefix is removed and the remaining digits are cast to an integer (for \u0006 this yields 6). The nchar() function is then applied to that integer.
select nchar(cast(right(file_delimiter,4) as int)) as file_delimiter from t5
The above SQL query is used in a Lookup activity in ADF.
When this value is passed as dynamic content to the column delimiter of the dataset, the values are properly delimited.
Once the pipeline is run, the data is copied successfully.
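
For example, if the Lookup activity is named Lookup1 (a placeholder name), the dataset's column delimiter parameter can be set with dynamic content along these lines:

@activity('Lookup1').output.firstRow.file_delimiter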

Copy activity fails in Azure Data Factory: column delimiter issue

I have a source CSV file which I am loading into a SQL DB using a Copy activity. In the 45th row I have a cell with this kind of data, containing unwanted characters:
Atualmente, as solicitações de faturamento manual de serviços de mobilidade de clientes da Região
I tried loading the file and it throws an error at row 45 saying the row has a higher column count than expected. When I removed the unwanted characters from this text, the Copy activity executed successfully. In the source, my delimiter is the default (,). How can I handle this situation? The source CSV file is in UTF-8 format, and in the SQL DB I have set every column to varchar(max).
I reproduced this and got the same error when I had the same data in my 3rd row without any double quotes around it.
If you want to keep the default delimiter (,), wrap the values in the rows in double quotes (").
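
For illustration (the surrounding values are made up), a row like

45,Atualmente, as solicitações de faturamento manual de serviços de mobilidade de clientes da Região,2023

is split on the embedded comma and read as four columns instead of the expected three, while quoting the cell keeps that comma as data:

45,"Atualmente, as solicitações de faturamento manual de serviços de mobilidade de clientes da Região",2023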
Target data after copy activity:

Need recommendation on ADF pipeline source properties while loading delimited text files from Azure Blob to Snowflake

We are trying to load a delimited file located in Azure Blob which has blank data for a few columns, and we would like to get a value like NA in our target Snowflake table whenever we encounter a blank value in the source CSV file. We have been trying to provide NA against the Null option, but it is not working. Any suggestions?
Here is a screenshot of what I have mentioned above.
I used a Data Flow activity in Azure Data Factory to resolve this issue.
Source file with NULL value in “Name” column.
Now use a Derived Column transformation. In the Derived Column settings, select the column name and use the expression iifNull({Name}, 'NA').
In the data preview, the Null value in the Name column is replaced with NA.
You can follow the above steps to replace Null values and sink the data from Blob storage to Snowflake.
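
For reference, the relevant part of the data flow script might look roughly like this; the source schema and stream names are placeholders, so treat this as a sketch rather than the exact script:

source(output(
        Id as string,
        Name as string
    ),
    allowSchemaDrift: true,
    validateSchema: false) ~> source1
source1 derive(Name = iifNull({Name}, 'NA')) ~> DerivedColumn1
DerivedColumn1 sink(allowSchemaDrift: true,
    validateSchema: false) ~> sink1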

Azure ADF Copy Activity with Trailing Column Delimiter

I have a strange source CSV file that contains a trailing column delimiter at the end of each record, just before the carriage return/new line.
When ADF previews this data, it displays only 2 columns and all the data rows without issue. However, the Copy activity fails with the following exception.
ErrorCode=DelimitedTextColumnNameNotAllowNull,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The name of column index 3 is empty. Make sure column name is properly specified in the header
Now I understand why it's complaining about this, given the trailing delimiter, but my question is whether there is a way to deal with this condition. I've tried including the trailing comma in the row delimiter (,\r\n), but then it just pivots the data so that all the columns become rows.
Is there a way to address this condition in copy activity?
When previewing the data in the dataset, it seems correct:
But in the Copy activity the data is actually split into 3 columns by the column delimiter ",", and the third column is empty or NULL. This is what causes the error.
If you use a Data Flow and import the projection from the source, you can see the third column:
For now, the Copy activity doesn't support modifying the data schema. You must use a Data Flow Derived Column transformation to create a new schema for the source. For example:
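
A rough sketch of that idea in data flow script terms (stream and column names are placeholders; byPosition() pulls a column by its ordinal, so the empty trailing column is simply never referenced):

source(allowSchemaDrift: true,
    validateSchema: false) ~> source1
source1 derive(Id = toString(byPosition(1)),
    Name = toString(byPosition(2))) ~> NewSchema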
Then mapping the new column/schema to sink will solve the problem.
HTH.
Use a different encoding for your CSV. CSV utf-8 will do the trick.

Copy text file using postgres with custom delimiter by character size

I need to copy a text file which has a confusing delimiter. I believe the delimiter is a space. However, some of the column values are empty and I cannot tell which column is which, which makes it harder to load the data into the database since the spacing doesn't indicate anything. Thus, when I try to COPY, the mapping is not right and I am getting ERROR: extra data after last expected column.
I have tried changing the delimiter to a comma and the like, but I still get the same error as above. The command below works when I try to load some dummy data with a proper delimiter.
COPY usm00070219(HEADREC_ID,YEAR,MONTH,DAY,HOUR,RELTIME,NUMLEV,P_SRC,NP_SRC,LAT,LON) FROM 'D:\....\USM00070219-data.txt' DELIMITER ' ';
This is example data:
It should have 11 columns, but the first row only shows 10 values and the empty column cannot be identified. The spacing is not helpful at all!
Is there any way I can split the columns by character size, effectively using a fixed width as the delimiter, and force the data to be divided at the given sizes?
COPY is not made to handle fixed-width text files. I can think of two options:
Load the file as it is into a table with a single text column using COPY. Then use regexp_split_to_array to split it into its components and insert these into another table; see the sketch after these two options.
You can use file_fdw to create a foreign table with a single text column like above and operate on that. That saves loading the file into the database.
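
A minimal sketch of the first option, assuming the table from the question and whitespace-separated values (the staging table name is made up, and only the first few target columns are shown for brevity):

-- staging table with a single text column
CREATE TABLE usm_raw (line text);

-- load each raw line into the single column (assumes the data contains no tabs or backslashes,
-- which the default text format of COPY would treat specially)
COPY usm_raw (line) FROM 'D:\....\USM00070219-data.txt';

-- split each line on runs of whitespace and insert the pieces,
-- casting as needed to match the actual column types;
-- for a truly fixed-width layout, substr(line, start, length) per column is the alternative
INSERT INTO usm00070219 (HEADREC_ID, YEAR, MONTH, DAY)
SELECT parts[1], parts[2]::int, parts[3]::int, parts[4]::int
FROM (SELECT regexp_split_to_array(line, '\s+') AS parts FROM usm_raw) AS s;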
There is a foreign data wrapper for fixed-width text files that you can try.