ADF CopyData: Is it possible to have dynamic Additional Columns that can be nullable? - azure-data-factory

I have a configuration table with file names, destination table names (and other configs) used to copy data into SQL tables. Sometimes I want the file name in a new column, but not for every file.
Is it possible to have a default value so that no additional column is generated for some files?
I tried:
@json(
    if(
        equals(item().AdditionalColumns, null),
        '{}',
        item().AdditionalColumns
    )
)
But I get this error: The value of property 'additionalColumns' is in unexpected type 'IList`1'.
And:
@json(
    if(
        equals(item().AdditionalColumns, null),
        '{[]}',
        item().AdditionalColumns
    )
)
But I get this error: The function 'json' parameter is not valid. The provided value '{[]}' cannot be parsed: 'Invalid property identifier character: [. Path '', line 1, position 1
Thank you

I figured it out:
@json(
    if(
        equals(item()?.AdditionalColumns, null),
        '[]',
        item()?.AdditionalColumns
    )
)
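For context, the non-null AdditionalColumns values in the configuration table would be JSON arrays in the shape the Copy activity's additionalColumns setting expects; a hypothetical entry (the column name here is only an illustration, and $$FILEPATH is ADF's reserved value for the source file path) could look like:

[
    { "name": "source_file_name", "value": "$$FILEPATH" }
]

Rows without a value stay NULL, so the expression falls back to '[]' and no additional column is generated for those files.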

Related

Snowflake null values quoted in CSV breaks PostgreSQL unload

I am trying to move data from Snowflake to PostgreSQL, and to do so I first unload it to S3 in CSV format. Commas can appear in the table's text columns, so I use Snowflake's FIELD_OPTIONALLY_ENCLOSED_BY unloading option to quote the contents of the problematic cells. However, when that is combined with NULL values, I can't manage to produce a valid CSV for PostgreSQL.
I created a simple table so you can understand the issue. Here it is:
CREATE OR REPLACE TABLE PUBLIC.TEST(
TEXT_FIELD VARCHAR,
NUMERIC_FIELD INT
);
INSERT INTO PUBLIC.TEST VALUES
('A', 1),
(NULL, 2),
('B', NULL),
(NULL, NULL),
('Hello, world', NULL)
;
COPY INTO @STAGE/test
FROM PUBLIC.TEST
FILE_FORMAT = (
COMPRESSION = NONE,
TYPE = CSV,
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
NULL_IF = ''
)
OVERWRITE = TRUE;
From that, Snowflake creates the following CSV:
"A",1
"",2
"B",""
"",""
"Hello, world",""
But after that, it is impossible for me to copy this CSV into a PostgreSQL table as-is.
Even though the PostgreSQL documentation says, next to the NULL option:
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.
Not setting any COPY options in the PostgreSQL COPY will result in a failed load; it won't work anyway, because we also have to specify the quote character using QUOTE. Here it will be QUOTE '"'.
Therefore, during the PostgreSQL load:
FORMAT csv, HEADER false, QUOTE '"' gives:
DataError: invalid input syntax for integer: "" CONTEXT: COPY test, line 3, column numeric_field: ""
FORMAT csv, HEADER false, NULL '""', QUOTE '"' gives:
NotSupportedError: CSV quote character must not appear in the NULL specification
FYI, to test loading from S3 I use this command in PostgreSQL:
CREATE TABLE IF NOT EXISTS PUBLIC.TEST(
TEXT_FIELD VARCHAR,
NUMERIC_FIELD INT
);
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
SELECT aws_s3.table_import_from_s3(
'PUBLIC.TEST',
'',
'(FORMAT csv, HEADER false, NULL ''""'', QUOTE ''"'')',
'bucket',
'test_0_0_0.csv',
'aws_region'
)
Thanks a lot for any ideas on how to make this work. I would love to find a solution that doesn't require modifying the CSV between Snowflake and PostgreSQL. I think the issue is more on the Snowflake side, as it doesn't really make sense to quote null values, but PostgreSQL is not helping either.
When you set the NULL_IF value to '', you are actually telling Snowflake to convert NULLs to a blank string, which then gets quoted. When you are copying out of Snowflake, the copy options are "backwards" in a sense, and NULL_IF acts more like an IFNULL.
This is the code that I'd use on the Snowflake side, which will result in an unquoted empty string in your CSV file:
FILE_FORMAT = (
COMPRESSION = NONE,
TYPE = CSV,
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
NULL_IF = ()
)
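If the unload options behave as described above, the NULLs should come out as bare, unquoted empty fields, so the sample table would unload roughly as:

"A",1
,2
"B",
,
"Hello, world",

which PostgreSQL's COPY with FORMAT csv and QUOTE '"' should then accept with its default NULL setting (an unquoted empty string).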

Cannot insert character 'N' to PostgreSQL in Sequelize

I have a data object as below:
const sStatus = "N";
const oMemberCard = {
code: memberCardCode,
member: memberCode,
status: sStatus,
status_datetime: moment().format(),
};
I use Sequelize to insert it but get this error:
Executing (default): INSERT INTO "mb_member_card"
("member","code","status","status_datetime") VALUES
('A001132203','0315099359','NaN','2020-10-18 00:33:32.000 +07:00');
[ERROR] SequelizeDatabaseError: value too long for type character(1)
I tried using escape("N") but without success.
Can anybody help me? Thanks.
SOLVED
I had the wrong field type in my model. Sorry!
The problem is that you're trying to insert NaN, not N, into your status column, which is a character type with a length limit of 1. Check the oMemberCard variable, especially the status element, before passing it to Sequelize; as you found, the wrong field type in the model is what turned "N" into the NaN seen in the generated INSERT.

Identifying isNull and isNotNull in a DataFrame

I'm trying to identify which columns are null and which are not null and, depending on that, insert a string.
ab_final = join_df.withColumn("linked_A_B",
    when(col("a3_inbound").isNull() & ("a3_item").isNull(), 'No Value')
    .when(col("it_item").isNull() & ("it_export").isNull(), 'NoShipment')
    .when(col("a3_inbound").isNotNull() & ("it_export").isNotNull(), 'Export')
)
I'm getting the below error:
'str' object has no attribute 'isnull'
Please help.
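A likely cause (not confirmed in the thread) is that bare strings such as ("a3_item") are used where Column objects are required, since a plain Python str has no isNull/isNotNull method. A minimal sketch of the corrected logic, assuming the join_df DataFrame and column names from the question:

from pyspark.sql.functions import col, when

# Wrap every column name in col() so .isNull()/.isNotNull() are called
# on Column objects rather than on plain strings.
ab_final = join_df.withColumn(
    "linked_A_B",
    when(col("a3_inbound").isNull() & col("a3_item").isNull(), "No Value")
    .when(col("it_item").isNull() & col("it_export").isNull(), "NoShipment")
    .when(col("a3_inbound").isNotNull() & col("it_export").isNotNull(), "Export"),
)

Rows matching none of the conditions are left NULL in linked_A_B; an .otherwise(...) call could supply a default if one is needed.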

DB2 SQL Error: SQLCODE=-302 while executing prepared statement

I have a SQL query which takes user input, hence a security flaw is present.
The existing query is:
SELECT BUS_NM, STR_ADDR_1, CITY_NM, STATE_CD, POSTAL_CD, COUNTRY_CD,
BUS_PHONE_NB,PEG_ACCOUNT_ID, GDN_ALERT_ID, GBIN, GDN_MON_REF_NB,
ALERT_DT, ALERT_TYPE, ALERT_DESC,ALERT_PRIORITY
FROM ( SELECT A.BUS_NM, AE.STR_ADDR_1, A.CITY_NM, A.STATE_CD, A.POSTAL_CD,
CC.COUNTRY_CD, A.BUS_PHONE_NB, A.PEG_ACCOUNT_ID, 'I' ||
LPAD(INTL_ALERT_DTL_ID, 9,'0') GDN_ALERT_ID,
LPAD(IA.GBIN, 9,'0') GBIN, IA.GDN_MON_REF_NB,
DATE(IAD.ALERT_TS) ALERT_DT,
XMLCAST(XMLQUERY('$A/alertTypeConfig/biqCode/text()' passing
IAC.INTL_ALERT_TYPE_CONFIG as "A") AS CHAR(4)) ALERT_TYPE,
ROW_NUMBER() OVER () AS "RN"
FROM ACCOUNT A, Other tables
WHERE IA.GDN_MON_REF_NB = '100'
AND A.PEG_ACCOUNT_ID = IAAR.PEG_ACCOUNT_ID
AND CC.COUNTRY_CD = A.COUNTRY_ISO3_CD
ORDER BY IA.INTL_ALERT_ID ASC )
WHERE ALERT_TYPE IN (" +TriggerType+ ");
I changed it to accept TriggerType via setString like:
SELECT BUS_NM, STR_ADDR_1, CITY_NM, STATE_CD, POSTAL_CD, COUNTRY_CD,
BUS_PHONE_NB,PEG_ACCOUNT_ID, GDN_ALERT_ID, GBIN, GDN_MON_REF_NB,
ALERT_DT, ALERT_TYPE, ALERT_DESC,ALERT_PRIORITY
FROM ( SELECT A.BUS_NM, AE.STR_ADDR_1, A.CITY_NM, A.STATE_CD, A.POSTAL_CD,
CC.COUNTRY_CD, A.BUS_PHONE_NB, A.PEG_ACCOUNT_ID,
'I' || LPAD(INTL_ALERT_DTL_ID, 9,'0') GDN_ALERT_ID,
LPAD(IA.GBIN, 9,'0') GBIN, IA.GDN_MON_REF_NB,
DATE(IAD.ALERT_TS) ALERT_DT,
XMLCAST(XMLQUERY('$A/alertTypeConfig/biqCode/text()' passing
IAC.INTL_ALERT_TYPE_CONFIG as "A") AS CHAR(4)) ALERT_TYPE,
ROW_NUMBER() OVER () AS "RN"
FROM ACCOUNT A, other tables
WHERE IA.GDN_MON_REF_NB = '100'
AND A.PEG_ACCOUNT_ID = IAAR.PEG_ACCOUNT_ID
AND CC.COUNTRY_CD = A.COUNTRY_ISO3_CD
ORDER BY IA.INTL_ALERT_ID ASC )
WHERE ALERT_TYPE IN (?);
Setting trigger type as below:
if (StringUtils.isNotBlank(request.getTriggerType())) {
preparedStatement.setString(1, triggerType != null ? triggerType.toString() : "");
}
I am getting this error:
Caused by: com.ibm.db2.jcc.am.SqlDataException: DB2 SQL Error: SQLCODE=-302, SQLSTATE=22001, SQLERRMC=null, DRIVER=4.19.26
The -302 SQLCODE indicates a conversion error of some sort.
SQLSTATE 22001 narrows that down a bit by telling us that you are trying to force a big string into a small variable. Given the limited information in your question, I am guessing it is the XMLCAST that is the culprit.
DB2 won't jam 30 pounds of crap into a 4-pound bag, so to speak; it gives you an error. Giving the XMLCAST some extra room might help. If you need to make sure it ends up being only 4 characters long, you could explicitly do a LEFT(XMLCAST( ... AS VARCHAR(64)), 4). That way the XMLCAST has the space it needs, but you cut it back to fit your variable on the fetch.
The other thing could be that the variable being passed to the parameter marker is too long. DB2 will guess the type and length based on the length of ALERT_TYPE. Note that you can only pass a single value through a parameter marker. If you pass a comma-separated list, it will not behave as expected (unless you expect ALERT_TYPE to also contain a comma-separated list). If you are getting the comma-separated list from a table, you can use a sub-select instead.
Wrong IN predicate use with a parameter.
Do not expect that IN ('AAAA, M250, ABCD') (as you try to do passing a comma-separated string as a single parameter) works as IN ('AAAA', 'M250', 'ABCD') (as you need). These predicates are not equivalent.
You need some "string tokenizer" if you want to pass such a comma-separated string, like below.
select t.*
from
(
select XMLCAST(XMLQUERY('$A/alertTypeConfig/biqCode/text()' passing IAC.INTL_ALERT_TYPE_CONFIG as "A") AS CHAR(4)) ALERT_TYPE
from table(values xmlparse(document '<alertTypeConfig><biqCode>M250, really big code</biqCode></alertTypeConfig>')) IAC(INTL_ALERT_TYPE_CONFIG)
) t
--WHERE ALERT_TYPE IN ('AAAA, M250, ABCD')
join xmltable('for $id in tokenize($s, ",\s?") return <i>{string($id)}</i>'
passing cast('AAA, M250 , ABCD' as varchar(200)) as "s"
columns token varchar(200) path '.') x on x.token=t.ALERT_TYPE
;
Run the statement as-is. Then you may uncomment the line with the WHERE clause and comment out the join to see what you are currently trying to do.
P.S.:
The error you get is probably because you don't specify the data type of the parameter (you don't use something like IN (cast(? as varchar(xxx)))), so the DB2 compiler assumes that its length is equal to the length of the ALERT_TYPE expression (4 bytes).

How to import data into Teradata tables from a delimited file using BTEQ import?

I am trying to execute the following BTEQ command in a Linux environment, but I couldn't load the data properly into the Teradata DB server. Can someone please advise how to resolve the issue I am facing while loading?
BTEQ command used:
.SET width 64000;
.SET session transaction btet;
.logmech ldap
.logon XXXXXXX/XXXXXXXX,********;
DATABASE corecm;
.PACK 1000
.IMPORT VARTEXT '~' FILE=/v/global/user/application_event_bus_evt
.REPEAT *
USING(APPLICATION_EVENT_ID CHAR(24),BUS_EVT_ID CHAR(24),BUS_EVT_VID BIGINT,BUS_EVT_RESTATE_IN SMALLINT)
insert into corecm.application_event_bus_evt (APPLICATION_EVENT_ID
, BUS_EVT_ID
, BUS_EVT_VID
, BUS_EVT_RESTATE_IN
)
values
( COALESCE(:APPLICATION_EVENT_ID,1)
, COALESCE(:BUS_EVT_ID,1)
, COALESCE(:BUS_EVT_VID,1)
, COALESCE(:BUS_EVT_RESTATE_IN,1)
) ;
.LOGOFF;
.EXIT;
Sample input file, delimiter "~" [ /v/global/user/application_event_bus_evt ]:
Ckn3gMxLEeOgIQBQVgErYA==~g+GDDtlaY3n7BdUrYshDFA==~1~1
CL1kEcxLEeOgIQBQVgErYA==~qoKoiuGDbClpcGt/z6RKGw==~1~1
oYIVcMxKEeOgIQBQVgErYA==~mfmQiwl7yAteevzJfilMvA==~1~1
5N7ME5bM4xGhM7exj3ykUw==~yFM2FZbM4xGhM7exj3ykUw==~1~0
JLBH4JfM4xGDH9s5+Ds/8w==~doZ/7pfM4xGDH9s5+Ds/8w==~1~0
fGvpoMxKEeOgIQBQVgErYA==~mQUQIK2mY6WIPcszfp5BTQ==~1~1
Table definition:
CREATE MULTISET TABLE CORECM.APPLICATION_EVENT_BUS_EVT ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
APPLICATION_EVENT_ID CHAR(26) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
BUS_EVT_ID CHAR(26) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
BUS_EVT_VID BIGINT NOT NULL,
BUS_EVT_RESTATE_IN SMALLINT)
UNIQUE PRIMARY INDEX ( APPLICATION_EVENT_ID ,BUS_EVT_ID ,BUS_EVT_VID )
INDEX APPLICATION_EVENT_BUS_EVT_IDX1 ( APPLICATION_EVENT_ID )
INDEX APPLICATION_EVENT_BUS_EVT_IDX2 ( BUS_EVT_ID ,BUS_EVT_VID );
The result set in the DB server looks like this:
APPLICATION_EVENT_ID BUS_EVT_ID BUS_EVT_VID BUS_EVT_RESTATE_IN
1 Ckn3gMxLEeOgIQBQVgErYA == g+GDDtlaY3n7BdUrYshD 85,849,873,219,141,958 12,544
2 CL1kEcxLEeOgIQBQVgErYA == qoKoiuGDbClpcGt/z6RK 85,849,873,219,155,783 12,544
3 oYIVcMxKEeOgIQBQVgErYA == mfmQiwl7yAteevzJfilM 85,849,873,219,142,006 12,544
4 5N7ME5bM4xGhM7exj3ykUw == JAf0GpbM4xGhM7exj3yk 85,849,873,219,155,797 12,288
5 JLBH4JfM4xGDH9s5+Ds/8w == Du6T7pfM4xGDH9s5+Ds/ 85,849,873,219,155,768 12,288
6 fGvpoMxKEeOgIQBQVgErYA == mQUQIK2mY6WIPcszfp5B 85,849,873,219,146,068 12,544
If we look at the data, we can see two issues:
The first two columns' data is 24 characters long (as per the input file), but the data has been shifted two characters into the next column.
Columns BUS_EVT_VID and BUS_EVT_RESTATE_IN have wrong data, 85,849,873,219,141,958 and 12,544 instead of 1 and 1 respectively (this may be because the first two columns' data got shifted).
I tried the following options to resolve the issue, but couldn't:
Modified the table definition, i.e. changed the datatypes to
CHAR(28), CHAR(24), CHAR(26)
Modified the table definition column
datatypes to VARCHAR(24), VARCHAR(26)
Modified the BTEQ command, i.e. altered the datatypes in the line below:
USING(APPLICATION_EVENT_ID CHAR(24),BUS_EVT_ID CHAR(24),BUS_EVT_VID BIGINT,BUS_EVT_RESTATE_IN SMALLINT)
Thanks in advance.
When you use .IMPORT VARTEXT, all input columns in the USING clause must be defined as VARCHAR, but you used CHAR, BIGINT and SMALLINT.
This should work, with the VARCHAR lengths based on the definition of your target table:
USING(
APPLICATION_EVENT_ID VARCHAR(26),
BUS_EVT_ID VARCHAR(26),
BUS_EVT_VID VARCHAR(19),
BUS_EVT_RESTATE_IN VARCHAR(6)
)