DB adapter receives duplicate rows from DB2 table - db2

This is on version 11g.
A DB adapter has been created with the operation type set to perform a SELECT on a DB2 table/view. The DB2 table/view does not have a PRIMARY KEY defined in the database, but the DB adapter configuration wizard has a step that requires selecting the primary key column of the table/view (if a primary key is already defined, it is selected by default). Suppose the DB2 view has data like the following:
A|B |C
1|X1|2.2
1|Y1|4.6
1|Z1|4.6
1|X1|3.3
1|Z1|(null)
The primary key column is selected as 'B' just to proceed to the next step, since there isn't really a column with unique values.
Now, when the business service for this case is executed in the OSB console, the response contains duplicate rows, as below.
<collection>
  <row>
    <A>1</A>
    <B>X1</B>
    <C>2.2</C>
  </row>
  <row>
    <A>1</A>
    <B>Y1</B>
    <C>4.6</C>
  </row>
  <row>
    <A>1</A>
    <B>Z1</B>
    <C>4.6</C>
  </row>
  <row>
    <A>1</A>
    <B>X1</B>
    <C>2.2</C>
  </row>
  <row>
    <A>1</A>
    <B>Z1</B>
    <C>4.6</C>
  </row>
</collection>
As seen from the response XML, the data present in the table is not fetched exactly: if there are two rows where the value of B is X1 but with different values of A and C, the response has two rows, but both carry the A, B and C values of only the first row (as shown by the values of C in the XML above for X1 and Z1).
There is also a way to select multiple columns to create a composite primary key, but all three columns cannot be selected, because column C can contain (null) values and a primary key column cannot have nulls, which throws an error.
Only the columns A and B can be selected as a composite primary key; the response XML then still has 5 records, but with the data of the first row in all five records, as the value of A is the same for all the rows.
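One idea I am toying with (purely a sketch, not something from the adapter documentation, and the view/column names are mine) is to wrap the DB2 view in another view that exposes a synthetic unique column the wizard could use as the primary key:
-- Hypothetical wrapper view: ROW_NUMBER() gives every row a distinct value,
-- so the adapter wizard has a genuinely unique "primary key" column to pick.
CREATE VIEW MYSCHEMA.MYVIEW_WITH_KEY AS
SELECT ROW_NUMBER() OVER (ORDER BY A, B) AS ROW_KEY,
       A, B, C
FROM   MYSCHEMA.MYVIEW;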
I hope I have been able to explain the problem clearly. Please share your thoughts and ideas if you have ever faced the same or a similar issue.
Thanks in advance

Related

Adding a partition from another DB to existing partitioned table in Postgres 12

Short description of the problem
I need to build a partitioned table "users" with 2 partitions located on separate servers (Moscow and Hamburg); each partition is a table with the columns:
id - integer primary key with auto increment
region - smallint partition key, which equals either 100 for Hamburg or 200 for Moscow
login - unique character varying with length of 100.
I intended to make sequences for id as n*1000 + 100 for Hamburg and n*1000 + 200 for Moscow, so that just by looking at the primary key I will know which partition it belongs to.
region is intended to be read only and never change after creation, so no records will move between partitions.
SELECT queries must be able to return records from all partitions and UPDATE queries must be able to modify records on all partitions, INSERT/DELETE queries must be able to add/delete records only to local partition, so data stored in them is not completely isolated.
What was done
Using pgAdmin4
I created a "test" table on Hamburg server, added all column info, marked it as partitioned table with partition key region and partition type List.
I created a "hamburg" partition in this table, adding primary key constraint as id,region and unique key constraint as login,region.
I created a "moscow" table on Moscow server with the same column info as "test"
I added postgres_fdw extension to Hamburg server, created Foreign server pointing to DB on Moscow server and User mapping.
I added "moscow" foreign table to Hamburg server pointing to "moscow" table on Moscow server.
What is my problem
I couldn't figure out how to attach this foreign table as the second partition of the "test" table.
When I tried to attach the partition through the pgAdmin dialog in the "test" table's partition properties, it showed me an error: cannot unpack non-iterable Response object
When I tried to add partition with query as follows:
ALTER TABLE public.test ATTACH PARTITION public.moscow FOR VALUES IN (200);
It shows me an error:
ERROR: cannot attach foreign table "moscow" as partition of partitioned table "test"
DETAIL: Table "test" contains unique indexes.
SQL state: 42809
I removed the unique constraint from the login column, but it shows the same error.
When I build a partitioned table with the same properties and both partitions initially located on the same server, everything works well, except that Postgres checks login uniqueness per partition rather than across the whole table, but I assume this is a limitation.
So, how can I attach a table located on the second server as a partition of the partitioned table located on the first one?
The error message is pretty clear: since you cannot create an index on a foreign table, PostgreSQL cannot create a partition of the unique index on it. But the unique index is required to implement the constraint.
See this source comment:
/*
* If we're attaching a foreign table, we must fail if any of the indexes
* is a constraint index; otherwise, there's nothing to do here. Do this
* before starting work, to avoid wasting the effort of building a few
* non-unique indexes before coming across a unique one.
*/
Either drop the unique constraint or don't use foreign tables as partitions.
OK, I was finally able to add a foreign table as a partition of the partitioned table.
It was necessary to drop the primary key on id and the unique constraint on login for the partitioned table.
After that I was able to attach the foreign table as a partition of the partitioned table.
Later I added the primary key on id and the unique constraint on login for each local partition.
So in the end I have a globally unique id, as it is generated by per-DB sequences whose values never intersect. For login uniqueness I have to manually check the global table for an existing record before inserting (a sketch of the final setup is below).
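Roughly, the final setup on the Hamburg side looks like this (names are mine; the Moscow side mirrors it with its own sequence starting at 200):
-- Parent table with NO primary key or unique constraint, so a foreign
-- table can be attached as a partition.
CREATE TABLE test (
    id     integer      NOT NULL,
    region smallint     NOT NULL,
    login  varchar(100) NOT NULL
) PARTITION BY LIST (region);

-- Region-specific sequence: 100, 1100, 2100, ... never collides with
-- Moscow's 200, 1200, 2200, ...
CREATE SEQUENCE hamburg_id_seq START WITH 100 INCREMENT BY 1000;

-- The local partition carries its own constraints.
CREATE TABLE hamburg PARTITION OF test FOR VALUES IN (100);
ALTER TABLE hamburg ALTER COLUMN id SET DEFAULT nextval('hamburg_id_seq');
ALTER TABLE hamburg ADD PRIMARY KEY (id);
ALTER TABLE hamburg ADD UNIQUE (login);

-- With no unique indexes on the parent, the foreign table attaches fine.
ALTER TABLE test ATTACH PARTITION moscow FOR VALUES IN (200);

-- Login uniqueness across regions has to be checked by hand before an INSERT.
SELECT 1 FROM test WHERE login = 'new_login';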
P.S. Hopefully, this partitioning mechanism in postgres is suitable for geographically distant regions.

How do I verify in CQL that all the rows have successfully copied from a CSV to a Cassandra table? ***SELECT statements are not returning all results

I am trying to understand Cassandra by playing with a public dataset.
I inserted 1.5M rows from a CSV into a table on my local instance of Cassandra, created WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }.
The table was created with one field as the partition key and one more column as part of the primary key.
I had a confirmation that 1.5M rows were processed.
COPY Completed
But when I run SELECT or SELECT COUNT(*) on the table, I always get a max of 182 rows.

Secondly, the number of records returned when querying with clustering columns seems to be higher than with single columns, which does not make sense to me. What am I missing from Cassandra's architecture and querying point of view?
Lastly, I have also tried reading the same Cassandra table from the pyspark shell, and it reads 182 rows too.
Your primary key is PRIMARY KEY (state, severity). With this primary key definition, all rows for accidents in the same state with the same severity will overwrite each other. You probably only have 182 distinct (state, severity) combinations in your dataset.
You could include another clustering column to record the unique accident, such as an accident_id, as sketched below.
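For example, a table definition along these lines (accident_id and the other column names are only illustrative) keeps every accident as its own row instead of overwriting:
-- Adding accident_id as an extra clustering column makes each accident a
-- distinct row within its state partition instead of overwriting rows
-- that share (state, severity).
CREATE TABLE accidents (
    state       text,
    severity    int,
    accident_id uuid,
    description text,
    PRIMARY KEY ((state), severity, accident_id)
);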
This blog highlights the importance of the primary key, and has some examples:
https://www.datastax.com/blog/2016/02/most-important-thing-know-cassandra-data-modeling-primary-key

Ebean request giving 2 identic rows

I have one row in my PostgreSQL table with the name I am looking for, but the Ebean query gives me two identical results (same primary key). Each row has a unique name, so I should be able to use findUnique():
finder.where().eq("name", name).findUnique()
A handwritten SQL query gives me only one:
String sql = "select id, name from totem where name ilike :name";
Any idea how to get findUnique() working?
Is it an Ebean bug?
I have a OneToOne relationship with table B, so Ebean generates an inner join with this table.
Table B has two entries corresponding to the object I'm fetching from table A, so the request returns 2 rows (illustrated below).
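In SQL terms, the generated query was effectively doing something like this (table and column names here are illustrative), which is why the single totem row came back twice:
-- Two rows in table_b reference the same totem row, so the inner join
-- repeats that totem row once per matching table_b row.
SELECT t.id, t.name
FROM totem t
INNER JOIN table_b b ON b.totem_id = t.id
WHERE t.name ILIKE 'my_totem';   -- returns 2 identical (id, name) pairs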

Duplicate Key error when using INSERT DEFAULT

I am getting a duplicate key error, DB2 SQL Error: SQLCODE=-803, SQLSTATE=23505, when I try to INSERT records. The primary key is one column, INTEGER 4, Generated, and it is the first column.
The insert looks like this: INSERT INTO SCHEMA.TABLE1 VALUES (DEFAULT, ?, ?, ...)
It's my understanding that using the value DEFAULT will just let DB2 auto-generate the key at the time of insert, which is what I want. This works most of the time, but sometimes/randomly I get the duplicate key error. Thoughts?
More specifically, I'm running against DB2 9.7.0.3, using Scriptella to copy a bunch of records from one database to another. Sometimes I can process a bunch with no problems, other times I'll get the error right away, other times after 2 records, or 20 records, or 30 records, etc. Does not seem to be a pattern, nor is it the same record every time. If I change the data to copy 1 record instead of a bunch, sometimes I'll get the error one time, then it's fine the next time.
I thought maybe some other process was inserting records during my batch program, and creating keys at the same time. However, the tables I'm copying TO should not have any other users/processes trying to INSERT records during this same time frame, although there could be READS happening.
Edit: adding create info:
Create table SCHEMA.TABLE1 (
SYSTEM_USER_KEY INTEGER NOT NULL
generated by default as identity (start with 1 increment by 1 cache 20),
COL2...,
)
alter table SCHEMA.TABLE1
add constraint SYSTEM_USER_SYSTEM_USER_KEY_IDX
Primary Key (SYSTEM_USER_KEY);
You most likely have records in your table with IDs that are bigger than the next value of your identity sequence. To find out what value your sequence is currently at, run the following query:
select s.nextcachefirstvalue-s.cache, s.nextcachefirstvalue-s.increment
from syscat.COLIDENTATTRIBUTES as a inner join syscat.sequences as s on a.seqid=s.seqid
where a.tabschema='SCHEMA'
and a.TABNAME='TABLE1'
and a.COLNAME='SYSTEM_USER_KEY'
So basically what happened is that somehow you got records in your table with ids that are bigger than the current last value of your identity sequence, so sooner or later those ids will collide with identity-generated ids.
There are different ways this could have happened. One possibility is that data was loaded that already contained values for the id column, or that records were inserted with an explicit value for the ID. Another option is that the identity sequence was reset to start at a lower value than the max id in the table. A small illustration of the first case is below.
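For instance, with a GENERATED BY DEFAULT identity (the values and column list here are made up), an explicit id is accepted without advancing the sequence, and the collision only surfaces later:
-- Explicit id is accepted; the identity sequence does not move.
INSERT INTO SCHEMA.TABLE1 (SYSTEM_USER_KEY, COL2) VALUES (50, 'loaded row');

-- Later, once the sequence itself generates 50, the key collides:
INSERT INTO SCHEMA.TABLE1 VALUES (DEFAULT, 'new row');  -- fails with SQLCODE=-803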
Whatever the cause, you may also want the fix:
SELECT MAX(<primary_key_column>) FROM <table>;
ALTER TABLE <table> ALTER COLUMN <primary_key_column> RESTART WITH <number from previous query + 1>;

Insert into table with Identity and foreign key columns

I was trying to insert values from one table to another from two different databases.
My issue is that I have two related tables, and the first table also has an identity column.
E.g. table first(id, Name) and table second(id, address).
Both tables exist with values in one DB, and I am trying to copy the values from this DB to another DB.
When I insert the values from the first DB into the second DB, the first table generates values for the Id column by itself, so I now have to link that id to the second table.
How can I do that?
UPDATE: using MS SQL Server 2000
You can use SCOPE_IDENTITY() immediately after your insert in SQL Server 2000, which will give you the last id generated within the current scope, but I'm not sure how that would work with bulk inserting of data:
http://msdn.microsoft.com/en-us/library/ms190315.aspx
If this were SQL Server 2005 or later I would suggest using the OUTPUT clause in your insert statement to retrieve the ids just inserted, but that was not available in SQL Server 2000.
If your data contains some column or combination of columns which is unique other than the identity column, then you can query your first table on those columns to get the ids and use them to populate your second table, as sketched below.
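For example, if Name is unique in the first table, something like this would fill second in the target database (database and table names are placeholders):
-- Map old ids to new ids through the unique Name column, then copy the
-- dependent rows using the new ids.
INSERT INTO target_db.dbo.second (id, address)
SELECT nf.id, so.address
FROM source_db.dbo.second so
INNER JOIN source_db.dbo.first sf ON sf.id = so.id
INNER JOIN target_db.dbo.first nf ON nf.Name = sf.Name;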
If the target tables were empty you could use SET IDENTITY_INSERT ON; this would allow inserting the original values into the identity columns, and you would not have to update the referenced IDs. Of course, if any existing ids could overlap the inserted ids, that is not the solution.
If names in the first table are unique, you could build a mapping between new and old ids and perform an update something like this:
UPDATE S
SET S.id = F.id
FROM second S
INNER JOIN first_original FO ON FO.id = S.id
INNER JOIN first F ON F.name = FO.name
If names are not unique, then the original ids should be saved in "first" in order to provide a mapping between old and new ids. This can be a temporary new column that is deleted after the ids in "second" have been updated.
Or, as Rich Andrews said, you could use SCOPE_IDENTITY(), but in this case you will have to perform the inserts one by one: declare a cursor on the source table, insert each record, get its new id, and insert it into the "second" table, roughly as sketched below.
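A rough sketch of that cursor-based approach (database and column names are placeholders; SQL Server 2000 syntax):
-- Insert parent rows one at a time, capture each new identity value,
-- and reuse it for the matching child rows.
DECLARE @old_id int, @name varchar(100), @new_id int;

DECLARE src CURSOR FOR
    SELECT id, Name FROM source_db.dbo.first;

OPEN src;
FETCH NEXT FROM src INTO @old_id, @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO first (Name) VALUES (@name);
    SET @new_id = SCOPE_IDENTITY();

    INSERT INTO second (id, address)
    SELECT @new_id, address FROM source_db.dbo.second WHERE id = @old_id;

    FETCH NEXT FROM src INTO @old_id, @name;
END
CLOSE src;
DEALLOCATE src;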