ADF json to sql copy empty value not inserted as null - azure-data-factory

I am attempting to copy json data to a sql table and noticed that any empty value is not being inserted as a null, even though the column is nullable. It seems to be inserting an empty string.
I have tried to add nullValue and treatEmptyAsNull parameters like the code below, but that made no difference:
"source": {
    "type": "BlobSource",
    "recursive": true,
    "nullValue": "",
    "treatEmptyAsNull": true
},
I am expecting a null to be inserted.
Is this standard behavior for an ADF copy using JSON as a source, to not insert empty values as null? Are there other properties I need to add to the JSON?

The value inserted into the SQL database can't be null directly because your source data is an empty string "", not a null value. The ADF copy activity can't convert an empty string to null automatically for you.
However, you could invoke a stored procedure in the SQL Server dataset. In that stored procedure, you can convert "" to a null value as you want before the columns are inserted into the table. Please follow the detailed steps in my previous answer: Azure Data factory copy activity failed mapping strings (from csv) to Azure SQL table sink uniqueidentifier field.
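A sketch of that approach, using NULLIF to turn empty strings into NULLs before the insert. The procedure, table, type, and column names here are invented for illustration; only the technique (a stored procedure sink with a table-valued parameter) comes from the answer:

```sql
-- Hypothetical names for illustration; NULLIF(x, '') yields NULL for ''.
CREATE PROCEDURE dbo.spInsertPerson
    @InputTable dbo.PersonType READONLY
AS
BEGIN
    INSERT INTO dbo.Person (FirstName, MiddleName)
    SELECT FirstName,
           NULLIF(MiddleName, '')   -- empty string becomes NULL
    FROM @InputTable;
END
```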

Related

Azure data factory -ingesting the data from csv file to sql table- data activity sql sink stored procedure -table type and table type parameter name

I am running one of the existing Azure Data Factory pipelines that contains an insert into a SQL table, where the sink is a SQL Server stored procedure.
I supplied the table type and the table type parameter name, which maps to my table.
I am getting the error: Failure happened on 'Sink' side.
Stored procedure:
CREATE PROCEDURE [ods].[Insert_EJ_Bing]
    @InputTable [ods].[EJ_Bing_Type] READONLY,
    @FileName varchar(1000)
AS
BEGIN
    INSERT INTO #workingtable
    (
        [ROWHASH],
        [Date]
    )
    SELECT [ROWHASH],
           [Date]
    FROM @InputTable
END
Error message:
Failure happened on 'Sink' side.
ErrorCode=InvalidParameter,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The value of the property '' is invalid:
'Invalid 3 part name format for TypeName.'.,Source=,''Type=System.ArgumentException,Message=Invalid 3 part name format for TypeName.,Source=System.Data
Can anyone please tell me where I am going wrong?
There is nothing wrong with your approach or code. You are getting this error because of the way you specified the Table type value in the sink.
I got the same error as you with that table type.
Change the Table type from [dbo.][tabletype] to [dbo].[tabletype]
You can see my copy activity is successful.
Rather than typing the stored procedure name and table type name by hand, you can pick them from the dropdowns in the sink settings.
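For reference, a table type matching the procedure above could be defined like this (the column datatypes are assumptions, since the question only shows the column names); the same two-part name, [ods].[EJ_Bing_Type], is what belongs in the sink's Table type box:

```sql
-- Column types are assumed; only the names appear in the question.
CREATE TYPE [ods].[EJ_Bing_Type] AS TABLE
(
    [ROWHASH] varchar(64),
    [Date]    date
);
```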

not able to do copy activity with bit value in azure data factory without column mapping for sink as postgresql

I have multiple csv files in a folder, like employee.csv, student.csv, etc., with headers.
I also have tables for all the files (the header and the table column names are the same).
employee.csv
id|name|is_active
1|raja|1
2|arun|0
student.csv
id|name
1|raja
2|arun
Table structure:
employee:
id INT, name VARCHAR, is_active BIT
student:
id INT, name VARCHAR
Now I'm trying to run a copy activity over all the files using a ForEach activity.
The student table copied successfully, but the employee table did not; it throws an error while reading employee.csv.
Error Message:
{"Code":27001,"Message":"ErrorCode=TypeConversionInvalidHexLength,Exception occurred when converting value '0' for column name 'is_active' from type 'String' (precision:, scale:) to type 'ByteArray' (precision:0, scale:0). Additional info: ","EventType":0,"Category":5,"Data":{},"MsgId":null,"ExceptionType":"Microsoft.DataTransfer.Common.Shared.PluginRuntimeException","Source":null,"StackTrace":"","InnerEventInfos":[]}
Use a Data Flow activity.
In the data flow, select the Source.
After this, add a Derived Column transformation and change the datatype of the is_active column from BIT to string.
As shown in the screenshot below, the Salary column has a string datatype, so I changed it to integer.
To modify the datatype, use the expression builder; for example, you can use toString().
This way you can change the datatype before the sink.
As a last step, set the sink to PostgreSQL and run the pipeline.
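An alternative worth noting, which is my own assumption rather than part of the answer above: in PostgreSQL, BIT is a bit-string type, which is why ADF tries a ByteArray conversion; declaring the column as BOOLEAN instead usually maps 0/1 flags cleanly. Table and column names below are from the question:

```sql
-- BIT in PostgreSQL is a bit string, not a flag; BOOLEAN handles 0/1.
ALTER TABLE employee
    ALTER COLUMN is_active TYPE boolean
    USING is_active::int::boolean;
```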

Azure Data Factory -> Copy from SQL to Table Storage (boolean mapping)

I am adding pipeline in Azure Data factory to migrate data from SQL to Table storage.
All seems to be working fine; however, I observed that the bit column is not getting copied as expected.
I have a field "IsMinor" in the SQL DB.
If I don't add an explicit mapping for the bit column, it is copied as null.
If I set it to 'True' or 'False' from SQL, it is copied as a String instead of a Boolean in Table Storage.
I also tried to specify the type while mapping the field, i.e. "IsMinor (Boolean)", but that didn't work either.
Following is my sample table
I want the bit value above to be copied as "Boolean" in Table storage instead of String.
I tried copying the boolean data from my SQL database to Table Storage, and it works.
As you know, SQL Server doesn't support a boolean data type, so I created a table like this:
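A minimal sketch of such a test table; apart from IsMinor, which the question mentions, the names are assumed, with BIT standing in for boolean:

```sql
-- Names other than IsMinor are illustrative.
CREATE TABLE dbo.Person
(
    Id      INT PRIMARY KEY,
    Name    VARCHAR(50),
    IsMinor BIT          -- 1 = true, 0 = false
);
INSERT INTO dbo.Person VALUES (1, 'Alice', 1), (2, 'Bob', 0);
```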
The data preview all looks fine in the source dataset:
I just created a table test1 in Table Storage and let the data factory create the PartitionKey and RowKey automatically.
Run the pipeline and check the data in test1 with Storage Explorer:
Per the document Understanding the Table service data model, Table Storage does support Boolean property types.
Hope this helps.

PostgreSQL: COPY from csv missing values into a column with NOT NULL Constraint

I have a table with an INTEGER Column which has NOT NULL constraint and a DEFAULT value = 0;
I need to copy data from a series of csv files.
In some of these files this column is an empty string.
So far, I have set the NULL parameter in the COPY command to a non-existent value so that the empty string is not converted to NULL, but now I get an error saying that an empty string is an invalid value for the INTEGER column.
I would like to use COPY command because of its speed, but maybe it is not possible.
The file contains no header. All columns in the file have their counterparts in the table.
Is there a way to specify that:
an empty string is zero, or
if there is an empty string, the default column value should be used?
You could create a view on the table that does not contain the column and create an INSTEAD OF INSERT trigger on it. When you COPY data into that view, the default value will be used for the table. I don't know whether the performance will be good enough.
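One variant of that idea, sketched with assumed names since the question doesn't give the table definition: expose the integer column as text through the view, and convert it in the trigger so that COPY never writes the NOT NULL column directly:

```sql
-- All names are illustrative. COPY into a view works when the view
-- has an INSTEAD OF INSERT trigger (PostgreSQL 9.3+; use
-- EXECUTE PROCEDURE instead of EXECUTE FUNCTION before version 11).
CREATE TABLE t (
    id  integer,
    val integer NOT NULL DEFAULT 0
);

CREATE VIEW t_load AS
    SELECT id, val::text AS val FROM t;

CREATE FUNCTION t_load_insert() RETURNS trigger AS $$
BEGIN
    INSERT INTO t (id, val)
    VALUES (NEW.id, coalesce(nullif(NEW.val, '')::integer, 0));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_load_trigger
    INSTEAD OF INSERT ON t_load
    FOR EACH ROW EXECUTE FUNCTION t_load_insert();

-- Then: COPY t_load FROM '/path/to/file.csv' WITH (FORMAT csv);
```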

PostgreSQL COPY CSV with two NULL strings

I have a source of csv files from a web query which contains two variations of a string that I would like to class as NULL when copying to a PostgreSQL table.
e.g.
COPY my_table FROM STDIN WITH CSV DELIMITER AS ',' NULL AS ('N/A', 'Not applicable');
I know this query will throw an error, so I'm looking for a way to specify two separate NULL strings in a COPY CSV query.
Since COPY does not support multiple NULL strings, I think your best bet is to set the NULL string argument to one of them and then, once everything is loaded, run an UPDATE that sets any column holding the other string to an actual NULL (the exact query depends on which columns could contain those values).
If you have a bunch of columns, you could use CASE expressions in your SET clause to return NULL if the value matches your special string, or the value otherwise. NULLIF would be more compact, e.g. NULLIF(col1, 'Not applicable').
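A sketch of that two-step approach, with column names assumed since the question doesn't list them:

```sql
-- Column names are illustrative.
COPY my_table FROM STDIN WITH (FORMAT csv, NULL 'N/A');

UPDATE my_table
SET col1 = NULLIF(col1, 'Not applicable'),
    col2 = NULLIF(col2, 'Not applicable');
```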