How to resolve "the maximum reject threshold was reached" while reading data from SRCTable to TGTTable using ADF - azure-data-factory

I am getting the below-mentioned error while loading data from the Synapse SRC table to the Synapse TGT table.
SQLServerException: Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.\nColumn ordinal: 26, Expected data type: VARCHAR(255) collate SQL_Latin1_General_CP1_CI_AS NOT
Please suggest how to overcome the above-mentioned issue.
Regards,
Ashok

The error may be due to data truncation in column 26 of your source file.
As a first check, I would suggest increasing the destination table column from VARCHAR(255) to VARCHAR(MAX) and then running the copy again.
ALTER TABLE TGT ALTER COLUMN [column 26] VARCHAR(MAX);
If it succeeds, you can easily run a MAX on that destination table column to determine how big it really needs to be.
SELECT MAX(LEN([column 26])) FROM TGT;
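If that check shows, for example, values around 900 characters, a hypothetical follow-up would be to right-size the column rather than leaving it at MAX (the 1000 below is only an illustrative choice):
-- assuming MAX(LEN(...)) came back around 900, leave a little headroom
ALTER TABLE TGT ALTER COLUMN [column 26] VARCHAR(1000);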
Some related reading about PolyBase copy:
https://medium.com/microsoftazure/azure-synapse-data-load-using-polybase-or-copy-command-from-vnet-protected-azure-storage-da8aa6a9ac68
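Separately, the "(0 rows)" in the error is the reject threshold defined on the external table itself. If a handful of bad rows is acceptable, PolyBase external tables can be created with an explicit reject setting; a minimal sketch, with placeholder table, location, data source and file format names:
CREATE EXTERNAL TABLE ext_src (
    [column 26] VARCHAR(255)
    -- ... remaining columns ...
)
WITH (
    LOCATION = '/path/to/data/',
    DATA_SOURCE = [my_data_source],
    FILE_FORMAT = [my_file_format],
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 10  -- tolerate up to 10 rejected rows instead of 0
);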

Related

SQL3116W The field value in row and column is missing, but the target column is not nullable. How to specify to use Column Default

I'm using the LOAD command to get data into a table where one of the columns has a default value of the current timestamp. I had a NULL value in the data being read, as I thought it would cause the table to use the default value, but based on the above error that's not the case. How do I avoid the above error in this case?
Here is the full command; the input file is a text file:
LOAD FROM ${LOADDIR}/${InputFile}.exp OF DEL MODIFIED BY COLDEL| INSERT INTO TEMP_TABLE NONRECOVERABLE
Try:
LOAD FROM ${LOADDIR}/${InputFile}.exp OF DEL MODIFIED BY USEDEFAULTS COLDEL| INSERT INTO TEMP_TABLE NONRECOVERABLE
The usedefaults modifier has been available in Db2-LUW since V7.x, as long as those versions are fully serviced (i.e. have had the final fix pack correctly applied).
Note that some Db2-LUW versions place restrictions on the usage of the usedefaults modifier, as detailed in the documentation: for example, restrictions relating to its use with other modifiers, load modes, or target table types.
Always specify your Db2-server version and platform when asking for help, because the answer can depend on these facts.
You can specify which columns from the input file go into which columns of the table using METHOD P. If you omit the column you want the default for, the utility will throw a warning but the default will be populated:
$ db2 "create table testtab1 (cola int, colb int, colc timestamp not null default)"
DB20000I The SQL command completed successfully.
$ cat tt1.del
1,1,1
2,2,2
3,3,99
$ db2 "load from tt1.del of del method P(1,2) insert into testtab1 (cola, colb)"
SQL27967W The COPY NO recoverability parameter of the Load has been converted
to NONRECOVERABLE within the HADR environment.
SQL3109N The utility is beginning to load data from file
"/home/db2inst1/tt1.del".
SQL3500W The utility is beginning the "LOAD" phase at time "07/12/2021
10:14:04.362385".
SQL3112W There are fewer input file columns specified than database columns.
SQL3519W Begin Load Consistency Point. Input record count = "0".
SQL3520W Load Consistency Point was successful.
SQL3110N The utility has completed processing. "3" rows were read from the
input file.
SQL3519W Begin Load Consistency Point. Input record count = "3".
SQL3520W Load Consistency Point was successful.
SQL3515W The utility has finished the "LOAD" phase at time "07/12/2021
10:14:04.496670".
Number of rows read = 3
Number of rows skipped = 0
Number of rows loaded = 3
Number of rows rejected = 0
Number of rows deleted = 0
Number of rows committed = 3
$ db2 "select * from testtab1"
COLA COLB COLC
----------- ----------- --------------------------
1 1 2021-12-07-10.14.04.244232
2 2 2021-12-07-10.14.04.244232
3 3 2021-12-07-10.14.04.244232
3 record(s) selected.

How to query parquet data files from Azure Synapse when data may be structured and exceed 8000 bytes in length

I am having trouble reading, querying, and creating external tables from Parquet files stored in Data Lake Storage Gen2 from Azure Synapse.
Specifically, I see this error while trying to create an external table through the UI:
"Error details
New external table
Previewing the file data failed. Details: Failed to execute query. Error: Column 'members' of type 'NVARCHAR' is not compatible with external data type 'JSON string. (underlying parquet nested/repeatable column must be read as VARCHAR or CHAR)'. File/External table name: [DELETED] Total size of data scanned is 1 megabytes, total size of data moved is 0 megabytes, total size of data written is 0 megabytes.
. If the issue persists, contact support and provide the following id :"
My main hunch is that, since a couple of columns were originally JSON types and some of the rows are quite long (up to 9000 characters right now, which could increase at any point during my ETL), this is some kind of conflict with the default limits I have seen referenced in the documentation. The data appears internally like the following example; please bear in mind it can sometimes be much longer:
["100.001", "100.002", "100.003", "100.004", "100.005", "100.006", "100.023"]
If I try to manually create the external table (which has worked every other time I have tried), following code similar to this:
CREATE EXTERNAL TABLE example1(
[id] bigint,
[column1] nvarchar(4000),
[column2] nvarchar(4000),
[column3] datetime2(7)
)
WITH (
LOCATION = 'location/**',
DATA_SOURCE = [datasource],
FILE_FORMAT = [SynapseParquetFormat]
)
GO
the table is created with no errors or warnings, but when I try a very simple select:
SELECT TOP (100) [id],
[column1],
[column2],
[column3]
FROM [schema1].[example1]
The following error is shown:
"External table 'dbo' is not accessible because content of directory cannot be listed."
It can also show the equivalent:
"External table 'schema1' is not accessible because content of directory cannot be listed."
This error persists even when creating the external table with the "max" argument, as it appears in this doc.
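For reference, this is roughly what I mean by the "max" variant (same placeholder names as above):
CREATE EXTERNAL TABLE example1_max(
[id] bigint,
[column1] nvarchar(max),
[column2] nvarchar(max),
[column3] datetime2(7)
)
WITH (
LOCATION = 'location/**',
DATA_SOURCE = [datasource],
FILE_FORMAT = [SynapseParquetFormat]
)
GO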
Summary: how can I create an external table from Parquet files with fields exceeding 4000 or 8000 bytes, or even up to 2 GB, which would be the maximum size according to this?
Thank you all in advance

What does "tuple (0,79)" in postgres log file mean when a deadlock happened?

In postgres log:
2016-12-23 15:28:14 +07 [17281-351 trns: 4280939, vtrns: 3/20] postgres#deadlocks HINT: See server log for query details.
2016-12-23 15:28:14 +07 [17281-352 trns: 4280939, vtrns: 3/20] postgres#deadlocks CONTEXT: while locking tuple (0,79) in relation "account"
2016-12-23 15:28:14 +07 [17281-353 trns: 4280939, vtrns: 3/20] postgres#deadlocks STATEMENT: SELECT id FROM account where id=$1 for update;
When I provoke a deadlock I can see the text: tuple (0,79).
As far as I know, a tuple is just several rows in a table, but I don't understand what (0,79) means. I have only 2 rows in the account table; it's just a play and self-learning application.
So what does (0,79) mean?
This is the data type of the system column ctid. A tuple ID is a pair
(block number, tuple index within block) that identifies the physical
location of the row within its table.
read https://www.postgresql.org/docs/current/static/datatype-oid.html
It means block number 0, row index 79.
Also read http://rachbelaid.com/introduction-to-postgres-physical-storage/
Also run SELECT id, ctid FROM account WHERE id = $1 with the right $1 to check it out.
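As a quick illustration, on a two-row table like the one described you might see something along these lines (the values below are made up):
SELECT ctid, id FROM account;
--  ctid   | id
--  (0,1)  | 10
--  (0,79) | 20
-- Both rows live in block 0; a tuple index as high as 79 can simply be the result of
-- many UPDATEs, since each UPDATE writes a new row version into the block.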

SQL Server 2008 R2, "string or binary data would be truncated" error

In SQL Server 2008 R2, I am trying to insert 30 million records from a source table into a target table. Out of these 30 million records, a few records have bad data and exceed the length of the target field. Due to this bad data, the whole insert gets aborted with the "string or binary data would be truncated" error, without loading any rows into the target table, and SQL Server also does not specify which row had the problem. Is there a way that we can insert the rest of the rows and catch the bad data rows without a big impact on performance (because performance is the main concern in this case)?
You can use the len function in your where condition to filter out long values:
select ...
from ...
where len(yourcolumn) <= 42
gives you the "good" records
select ...
from ...
where len(yourcolumn) > 42
gives you the "bad" records. You can use such where conditions in an insert select syntax as well.
You can also truncate your strings instead, like:
select
left(col, 42) col
from yourtable
In the examples I assumed that 42 is your character limit.
You did not mention how you insert the data, i.e. bulk insert or SSIS.
In this situation I prefer SSIS, in which you have more control and can solve your issue: you can insert the proper data as @Lajos suggests, and for the bad data you can create a temporary table and collect the bad rows.
You can direct the flow of your logic via transformations and also add error handling. You can search further on this too:
https://www.simple-talk.com/sql/reporting-services/using-sql-server-integration-services-to-bulk-load-data/
https://www.mssqltips.com/sqlservertip/2149/capturing-and-logging-data-load-errors-for-an-ssis-package/
http://www.techbrothersit.com/2013/07/ssis-how-to-redirect-invalid-rows-from.html

ORA-01652 Unable to extend temp segment by in tablespace

I am creating a table like
create table tablename
as
select * from table2
I am getting the error
ORA-01652 Unable to extend temp segment by in tablespace
When I googled, I usually found the ORA-01652 error showing some value, like
Unable to extend temp segment by 32 in tablespace
I am not getting any such value. I ran this query:
select
fs.tablespace_name "Tablespace",
(df.totalspace - fs.freespace) "Used MB",
fs.freespace "Free MB",
df.totalspace "Total MB",
round(100 * (fs.freespace / df.totalspace)) "Pct. Free"
from
(select
tablespace_name,
round(sum(bytes) / 1048576) TotalSpace
from
dba_data_files
group by
tablespace_name
) df,
(select
tablespace_name,
round(sum(bytes) / 1048576) FreeSpace
from
dba_free_space
group by
tablespace_name
) fs
where
df.tablespace_name = fs.tablespace_name;
Taken from: Find out free space on tablespace
and I found that the tablespace I am currently using has around 32 GB of free space. I even tried creating the table like
create table tablename tablespace tablespacename
as select * from table2
but I am getting the same error again. Can anyone give me an idea where the problem is and how to solve it? For your information, the select statement would fetch 40,000,000 records.
I found the solution to this. There is a temporary tablespace called TEMP which is used internally by the database for operations like DISTINCT, joins, etc. Since my query (which has 4 joins) fetches almost 50 million records, the TEMP tablespace does not have enough space to hold all of that data, so the query fails even though my regular tablespace has free space. After increasing the size of the TEMP tablespace the issue was resolved. Hope this helps someone with the same issue. Thanks :)
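As a rough sketch of what that involves (the file paths and sizes below are examples only, not the actual values from my system):
-- check the current size of the TEMP tablespace
select tablespace_name, round(sum(bytes) / 1048576) as size_mb
from dba_temp_files
group by tablespace_name;

-- either grow an existing tempfile ...
alter database tempfile '/u01/oradata/ORCL/temp01.dbf' resize 8192M;

-- ... or add another one
alter tablespace TEMP add tempfile '/u01/oradata/ORCL/temp02.dbf' size 4096M autoextend on;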
Create a new datafile by running the following command:
alter tablespace TABLE_SPACE_NAME add datafile 'D:\oracle\Oradata\TEMP04.dbf'
size 2000M autoextend on;
You don't need to create a new datafile; you can extend your existing tablespace data files.
Execute the following to determine the filename for the existing tablespace:
SELECT * FROM DBA_DATA_FILES;
Then extend the size of the datafile as follows (replace the filename with the one from the previous query):
ALTER DATABASE DATAFILE 'D:\ORACLEXE\ORADATA\XE\SYSTEM.DBF' RESIZE 2048M;
I encountered the same error message but don't have access to views like "dba_free_space" because I am not a DBA. I used some of the previous answers to check available space, and I still had a lot of space. However, after reducing the full table scans as much as possible, the problem was solved. My guess is that Oracle uses temp space to store the full-table-scan data; if the data size exceeds the limit, it shows the error. Hope this helps someone with the same issue.