Merge join not able to join properly on varchar column - tsql

I've created below code to implement SCD type 2 using merge, when i run the code i'm getting primary key violations on csname field. I have the below values as part of primary key, not sure whether merge SQL does support for varchar or not.
if I run the normal inner join SQL on the same key then i'm getting the matching records as well.
Any help much appreciated
csname
ER - Building Complaints
TR - Building Applications
CREATE PROCEDURE dbo.load_target
AS
BEGIN
INSERT INTO [TR_DW].[enum].[Rt]([csname],[enddatetime],[EffectiveToDate],[EffectiveFromDate],[CurrentRecord])
SELECT[csname],[enddatetime],[EffectiveToDate],[EffectiveFromDate],[CurrentRecord]
FROM
(
MERGE [TR_DW].[enum].[Rt] RtCSQSuTT
USING [TR].[enum].[Rt] RtCSQSuST
ON (RtCSQSuTT.csname = RtCSQSuST.csname)
WHEN NOT MATCHED THEN
INSERT ([csname],[enddatetime],[EffectiveToDate],[EffectiveFromDate],[CurrentRecord])
VALUES ([csname],[enddatetime],'12/31/9999', getdate(), 'Y')
WHEN MATCHED AND RtCSQSuTT.[CurrentRecord] = 'Y' AND
(ISNULL(RtCSQSuTT.[enddatetime], '') != ISNULL(RtCSQSuST.[enddatetime], ''))THEN
UPDATE SET
RtCSQSuTT.[CurrentRecord] = 'N',
RtCSQSuTT.[EffectiveFromDate] = GETDATE() - 1,
RtCSQSuTT.[EffectiveToDate] = GETDATE()
OUTPUT $Action Action_Taken,RtCSQSuST.[csqname],RtCSQSuST.[enddatetime],'12/31/9999' AS[EffectiveToDate],GETDATE() AS[EffectiveFromDate],'Y' AS[CurrentRecord]
)AS MERGE_OUT21
WHERE MERGE_OUT21.Action_Taken = 'UPDATE';
END
GO

Related

COALESCE failing following CTE Deletion. (PostgreSQL)

PostgreSQL 11.1 PgAdmin 4.1
This works some of the time:
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
WITH _in(trx, lastname, firstname, birthdate, old_disp, old_medname, old_sig, old_form, new_disp, new_medname, new_sig, new_form, new_refills) AS (
VALUES ('2001-06-07 00:00:00'::timestamp,
UPPER(TRIM('JONES')), UPPER(TRIM('TOM')), '1952-12-30'::date,
64::integer,
LOWER(TRIM('adipex 37.5mg tab')), LOWER(TRIM('one tab po qd')), LOWER(TRIM('tab')),
63::integer,
LOWER(TRIM('adipex 37.5mg tab')), LOWER(TRIM('one tab po qd')), LOWER(TRIM('tab')),
33::integer
)
),
_s AS ( -- RESOLVE ALL SURROGATE KEYS.
SELECT n.*, d1.recid as old_medication_recid, d2.recid as new_medication_recid, pt.recid as patient_recid
FROM _in n
JOIN medications d1 ON (n.old_medname, n.old_sig, n.old_form) = (d1.medname, d1.sig, d1.form)
JOIN medications d2 ON (n.new_medname, n.new_sig, n.new_form) = (d2.medname, d2.sig, d2.form)
JOIN patients pt ON (pt.lastname, pt.firstname, pt.birthdate) = (n.lastname, n.firstname, n.birthdate)
),
_t AS ( -- REMOVE CONFLICTING RECORD, IF ANY.
DELETE FROM rx r
USING _s n
WHERE (r.trx::date, r.disp, r.patient_recid, r.medication_recid)=(n.trx::date, n.new_disp, n.patient_recid, n.new_medication_recid)
RETURNING r.*
),
_u AS( -- GET NEW SURROGATE KEY.
SELECT COALESCE(_t.recid, r.recid) as target_recid, r.recid as old_recid
FROM _s n
JOIN rx r ON (r.trx, r.disp, r.patient_recid, r.medication_recid) = (n.trx, n.old_disp, n.patient_recid, n.old_medication_recid)
LEFT JOIN _t ON (_t.trx::date, _t.disp, _t.patient_recid, _t.medication_recid) = (n.trx::date, n.new_disp, n.patient_recid, n.new_medication_recid)
)
UPDATE rx r -- UPDATE ORIGINAL RECORD WITH NEW VALUES.
SET disp = n.new_disp, medication_recid = n.new_medication_recid, refills = n.new_refills, recid = _u.target_recid
FROM _s n, _u
WHERE r.recid = _u.old_recid
RETURNING r.*;
COMMIT;
Where table rx is defined as:
CREATE TABLE phoenix.rx
(
recid integer NOT NULL DEFAULT nextval('rx_recid_seq'::regclass),
trx timestamp without time zone NOT NULL,
disp integer NOT NULL,
refills integer,
tprinted timestamp without time zone,
tstop timestamp without time zone,
modified timestamp without time zone DEFAULT now(),
patient_recid integer NOT NULL,
medication_recid integer NOT NULL,
dposted date NOT NULL,
CONSTRAINT pk_rx_recid PRIMARY KEY (recid),
CONSTRAINT rx_unique UNIQUE (dposted, disp, patient_recid, medication_recid)
DEFERRABLE,
CONSTRAINT rx_medication_fk FOREIGN KEY (medication_recid)
REFERENCES phoenix.medications (recid) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE RESTRICT
DEFERRABLE,
CONSTRAINT rx_patients FOREIGN KEY (patient_recid)
REFERENCES phoenix.patients (recid) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE RESTRICT
)
After many hours, it is found that the "Delete.." of a conflicting record works as expected, but the "COALESCE" STATEMENT seems to fail when deciding on the new surrogate key (primary key) of rx.recid -- it does not seem to receive the result of the delete. (Or maybe the timing is wrong???)
Any help would be most appreciated.
TIA
This is documented:
The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13, so they cannot “see” one another's effects on the target tables.
Don't use the same table twice in a statement with a CTE if it occurs in a DML statement. Rather, use DELETE ... RETURNING and use the returned values in the other parts of the statement.
If you cannot rewrite the statement like that, use more than one statement instead of putting everything into a single CTE.
#LaurenzAlbe is totally correct in his answer. Below is a working solution to my problem. There are a few things to note:
The unique constraint is formed on a column in rx defined as a date and created by a trigger on update/insert that casts the timestamp of trx to a date: as in trx::date. For reasons I am not clear on, using r.trx::date in place of r.dposted leads to many records being identified and not the one record I want. Not sure why???. So the first fix was to use r.dposted, not r.trx::date.
Although the cte's are designed to be independent of each other, by using "RETURNING..." and incorporating the cte's in a step-wise fashion, one can be built upon another to obtain a final result set.
The working code is:
WITH _in(trx, lastname, firstname, birthdate, old_disp, old_medname, old_sig, old_form, new_disp, new_medname, new_sig, new_form, new_refills) AS (
VALUES ('2001-06-07 00:00:00'::timestamp,
UPPER(TRIM('smith')), UPPER(TRIM('john')), '1957-12-30'::date,
28::integer,
LOWER(TRIM('test')), LOWER(TRIM('i am sig')), LOWER(TRIM('tab')),
28::integer,
LOWER(TRIM('test 1')), LOWER(TRIM('i am sig')), LOWER(TRIM('tab')),
8::integer
)
),
_m AS (
SELECT n.*, d1.recid as old_medication_recid, d2.recid as new_medication_recid, pt.recid as patient_recid
FROM _in n
JOIN patients pt ON (pt.lastname, pt.firstname, pt.birthdate) = (n.lastname, n.firstname, n.birthdate)
JOIN medications d1 ON (n.old_medname, n.old_sig, n.old_form) = (d1.medname, d1.sig, d1.form)
LEFT JOIN medications d2 ON (n.new_medname, n.new_sig, n.new_form) = (d2.medname, d2.sig, d2.form)
),
_t AS ( -- REMOVE CONFLICTING RECORD, IF ANY.
DELETE FROM rx r
USING _m
WHERE (r.dposted, r.disp, r.patient_recid, r.medication_recid) = (_m.trx::date,_m.new_disp, _m.patient_recid, _m.new_medication_recid)
RETURNING r.*
),
_s AS ( -- GET NEW SURROGATE KEY
SELECT _m.*, r1.recid as old_recid, r2.recid as new_recid, COALESCE(r2.recid, r1.recid) as target_recid
FROM _m
JOIN rx r1 ON (r1.dposted, r1.disp, r1.patient_recid, r1.medication_recid) = (_m.trx::date,_m.old_disp, _m.patient_recid, _m.old_medication_recid)
LEFT JOIN rx r2 ON (r2.dposted, r2.disp, r2.patient_recid, r2.medication_recid) = (_m.trx::date,_m.new_disp, _m.patient_recid, _m.new_medication_recid)
LEFT JOIN _t ON (_t.recid = r2.recid)
)
UPDATE rx -- UPDATE ORIGINAL RECORD WITH NEW VALUES.
SET disp = _s.new_disp, medication_recid = _s.new_medication_recid, refills = _s.new_refills, recid = _s.target_recid
FROM _s
WHERE rx.recid = _s.old_recid
RETURNING rx.*;
COMMIT;
Hope this helps somebody.

DB2 Query : insert data in history table if not exists already

I have History table and transaction table.....and reference table...
If status in reference table is CLOSE then take those record verify in History table if not there insert from transaction table..... wiring query like this .... checking better one... please advice.. this query can be used for huge data ?
INSERT INTO LIB1.HIST_TBL
( SELECT R.ACCT, R.STATUS, R.DATE FROM
LIB2.HIST_TBL R JOIN LIB1.REF_TBL C
ON R.ACCT = C.ACCT WHERE C.STATUS = '5'
AND R.ACCT NOT IN
(SELECT ACTNO FROM LIB1.HIST_TBL)) ;
If you're on a current release of DB2 for i, take a look at the MERGE statement
MERGE INTO hist_tbl H
USING (SELECT * FROM ref_tbl R
WHERE r.status = 'S')
ON h.actno = r.actno
WHEN NOT MATCHED THEN
INSERT (actno,histcol2, histcol3) VALUES (r.actno,r.refcol2,r.refcol3)
--if needed
WHEN MATCHED
UPDATE SET (actno,histcol2, histcol3) = (r.actno,r.refcol2,r.refcol3)

Column is of type timestamp without time zone but expression is of type character

I'm trying to insert records on my trying to implement an SCD2 on Redshift
but get an error.
The target table's DDL is
CREATE TABLE ditemp.ts_scd2_test (
id INT
,md5 CHAR(32)
,record_id BIGINT IDENTITY
,from_timestamp TIMESTAMP
,to_timestamp TIMESTAMP
,file_id BIGINT
,party_id BIGINT
)
This is the insert statement:
INSERT
INTO ditemp.TS_SCD2_TEST(id, md5, from_timestamp, to_timestamp)
SELECT TS_SCD2_TEST_STAGING.id
,TS_SCD2_TEST_STAGING.md5
,from_timestamp
,to_timestamp
FROM (
SELECT '20150901 16:34:02' AS from_timestamp
,CASE
WHEN last_record IS NULL
THEN '20150901 16:34:02'
ELSE '39991231 11:11:11.000'
END AS to_timestamp
,CASE
WHEN rownum != 1
AND atom.id IS NOT NULL
THEN 1
WHEN atom.id IS NULL
THEN 1
ELSE 0
END AS transfer
,stage.*
FROM (
SELECT id
FROM ditemp.TS_SCD2_TEST_STAGING
WHERE file_id = 2
GROUP BY id
HAVING count(*) > 1
) AS scd2_count_ge_1
INNER JOIN (
SELECT row_number() OVER (
PARTITION BY id ORDER BY record_id
) AS rownum
,stage.*
FROM ditemp.TS_SCD2_TEST_STAGING AS stage
WHERE file_id IN (2)
) AS stage
ON (scd2_count_ge_1.id = stage.id)
LEFT JOIN (
SELECT max(rownum) AS last_record
,id
FROM (
SELECT row_number() OVER (
PARTITION BY id ORDER BY record_id
) AS rownum
,stage.*
FROM ditemp.TS_SCD2_TEST_STAGING AS stage
)
GROUP BY id
) AS last_record
ON (
stage.id = last_record.id
AND stage.rownum = last_record.last_record
)
LEFT JOIN ditemp.TS_SCD2_TEST AS atom
ON (
stage.id = atom.id
AND stage.md5 = atom.md5
AND atom.to_timestamp > '20150901 16:34:02'
)
) AS TS_SCD2_TEST_STAGING
WHERE transfer = 1
and to short things up, I am trying to insert 20150901 16:34:02 to from_timestamp and 39991231 11:11:11.000 to to_timestamp.
and get
ERROR: 42804: column "from_timestamp" is of type timestamp without time zone but expression is of type character varying
Can anyone please suggest how to solve this issue?
Postgres isn't recognizing 20150901 16:34:02 (your input) as a valid time/date format, so it assumes it's a string.
Use a standard date format instead, preferably ISO-8601. 2015-09-01T16:34:02
SQLFiddle example
Just in case someone ends up here trying to insert into a postgresql a timestamp or a timestampz from a variable in groovy or Java from a prepared statement and getting the same error (as I did), I managed to do it by setting the property stringtype to "unspecified". According to the documentation:
Specify the type to use when binding PreparedStatement parameters set
via setString(). If stringtype is set to VARCHAR (the default), such
parameters will be sent to the server as varchar parameters. If
stringtype is set to unspecified, parameters will be sent to the
server as untyped values, and the server will attempt to infer an
appropriate type. This is useful if you have an existing application
that uses setString() to set parameters that are actually some other
type, such as integers, and you are unable to change the application
to use an appropriate method such as setInt().
Properties props = [user : "user", password: "password",
driver:"org.postgresql.Driver", stringtype:"unspecified"]
def sql = Sql.newInstance("url", props)
With this property set, you can insert a timestamp as a string variable without the error raised in the question title. For instance:
String myTimestamp= Instant.now().toString()
sql.execute("""INSERT INTO MyTable (MyTimestamp) VALUES (?)""",
[myTimestamp.toString()]
This way, the type of the timestamp (from a String) is inferred correctly by postgresql. I hope this helps.
Inside apache-tomcat-9.0.7/conf/server.xml
Add "?stringtype=unspecified" to the end of url address.
For example:
<GlobalNamingResources>
<Resource name="jdbc/??" auth="Container" type="javax.sql.DataSource"
...
url="jdbc:postgresql://127.0.0.1:5432/Local_DB?stringtype=unspecified"/>
</GlobalNamingResources>

SQL SErver Trigger not evaluating as Insert or Update properly

I want to have one trigger to handle updates and inserts. Most of the sql actions in the trigger are for both. The only exception is the fields I'm using to record date and username for an insert and an update. This is what I have, but the updates of the fields used to track update and insert are not firing right. If I insert a new record, I get CreatedBy, CreatedOn, LastEditedBy, LastEditedOn populated, with LastEditedOn as 1 second after CreatedOn (which I dont want to happen). When I update the record, only the LastEditedBy & LastEditedOn changes (which is correct). I'm including my full trigger for reference:
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
-- =================================================================================
-- Author: Paul J. Scipione
-- Create date: 2/15/2012
-- Update date: 6/5/2012
-- Description: To concatenate several fields into a set formatted UnitDescription,
-- to total Span & Loop footages, to set appropriate AcctCode, & track
-- user inserts
-- =================================================================================
IF OBJECT_ID('ProcessCable', 'TR') IS NOT NULL
DROP TRIGGER ProcessCable
GO
CREATE TRIGGER ProcessCable
ON Cable
AFTER INSERT, UPDATE
AS
BEGIN
SET NOCOUNT ON;
-- IF TRIGGER_NESTLEVEL() > 1 RETURN
IF ((SELECT TRIGGER_NESTLEVEL()) > 1 )
RETURN
ELSE
BEGIN
-- record user and date of insert or update
IF EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
ELSE IF NOT EXISTS (SELECT * FROM DELETED)
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
-- reset Suffix if applicable
UPDATE Cable SET Suffix = NULL WHERE Suffix = 'n/a'
-- create UnitDescription value
UPDATE Cable SET UnitDescription =
isnull (Type, '') +
isnull (CONVERT (NVARCHAR (10), Size), '') +
'-' +
isnull (CONVERT (NVARCHAR (10), Gauge), '') +
CASE
WHEN ExtraTrench IS NOT NULL AND ExtraTrench > 0 THEN
CASE
WHEN Suffix IS NULL THEN 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')'
ELSE 'TE' + '(' + CONVERT (NVARCHAR (10), ExtraTrench) + ')' + Suffix
END
ELSE isnull (Suffix, '')
END
-- convert any accidental negative numbers entered
UPDATE Cable SET Length = ABS(Length)
-- sum Length with LoopFootage into TotalFootage
UPDATE Cable SET TotalFootage = isnull(Length, 0) + isnull(LoopFootage, 0)
-- set proper AcctCode based on Type
UPDATE Cable SET AcctCode =
CASE
WHEN Type IN ('SEA', 'CW', 'CJ') THEN '32.2421.2'
WHEN Type IN ('BFC', 'BJ', 'SEB') THEN '32.2423.2'
WHEN Type IN ('TIP','UF') THEN '32.2422.2'
WHEN Type = 'unknown' OR Type IS NULL THEN 'unknown'
END
WHERE AcctCode IS NULL OR AcctCode = ' '
END
END
GO
A few things jump out at me when I look at your trigger:
You are doing several additional updates rather than a single update (performance-wise, a single update would be better).
Your update statements are unconstrained (there is no JOIN to the inserted/deleted tables to limit the number of records that you perform these additional updates on).
Most of this logic feels like it should be in the application layer rather than in the database; Or, perhaps in some cases implemented differently.
Some quick examples:
Suffix of "n/a" should be removed before inserted.
Cable length absolute value should be done before inserted (with a CHECK CONSTRAINT to verify that bad data cannot be inserted).
TotalFootage should be a computed column so it is always correct.
The Type/AcctCode relationship seems like it should be a column value in a foreign key reference.
But ultimately, I think the reason you are seeing the unexpected dates is because of the unconstrained updates. Without addressing any of the other concerns I brought up above, the statement that sets the audit fields should be more like this:
UPDATE Cable SET LastEditedOn = getdate(), LastEditedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN deleted on Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
UPDATE Cable SET CreatedOn = getdate(), CreatedBy = REPLACE(user_name(), 'GRTINET\', '')
FROM Cable
JOIN inserted on Cable.PrimaryKeyColumn = inserted.PrimaryKeyColumn
LEFT JOIN deleted on Cable.PrimaryKeyColumn = deleted.PrimaryKeyColumn
WHERE deleted.PrimaryKeyColumn IS NULL

TSQL CTE Error: Incorrect syntax near ')'

I am developing a TSQL stored proc using SSMS 2008 and am receiving the above error while generating a CTE. I want to add logic to this SP to return every day, not just the days with data. How do I do this? Here is my SP so far:
ALTER Proc [dbo].[rpt_rd_CensusWithChart]
#program uniqueidentifier = NULL,
#office uniqueidentifier = NULL
AS
DECLARE #a_date datetime
SET #a_date = case when MONTH(GETDATE()) >= 7 THEN '7/1/' + CAST(YEAR(GETDATE()) AS VARCHAR(30))
ELSE '7/1/' + CAST(YEAR(GETDATE())-1 AS VARCHAR(30)) END
if exists (
select * from tempdb.dbo.sysobjects o where o.xtype in ('U') and o.id = object_id(N'tempdb..#ENROLLEES')
) DROP TABLE #ENROLLEES;
if exists (
select * from tempdb.dbo.sysobjects o where o.xtype in ('U') and o.id = object_id(N'tempdb..#DISCHARGES')
) DROP TABLE #DISCHARGES;
declare #sum_enrollment int
set #sum_enrollment =
(select sum(1)
from enrollment_view A
join enrollment_info_expanded_view C on A.enrollment_id = C.enroll_el_id
where
(#office is NULL OR A.group_profile_id = #office)
AND (#program is NULL OR A.program_info_id = #program)
and (C.pe_end_date IS NULL OR C.pe_end_date > #a_date)
AND C.pe_start_date IS NOT NULL and C.pe_start_date < #a_date)
select
A.program_info_id as [Program code],
A.[program_name],
A.profile_name as Facility,
A.group_profile_id as Facility_code,
A.people_id,
1 as enrollment_id,
C.pe_start_date,
C.pe_end_date,
LEFT(datename(month,(C.pe_start_date)),3) as a_month,
day(C.pe_start_date) as a_day,
#sum_enrollment as sum_enrollment
into #ENROLLEES
from enrollment_view A
join enrollment_info_expanded_view C on A.enrollment_id = C.enroll_el_id
where
(#office is NULL OR A.group_profile_id = #office)
AND (#program is NULL OR A.program_info_id = #program)
and (C.pe_end_date IS NULL OR C.pe_end_date > #a_date)
AND C.pe_start_date IS NOT NULL and C.pe_start_date >= #a_date
;WITH #ENROLLEES AS (
SELECT '7/1/11' AS dt
UNION ALL
SELECT DATEADD(d, 1, pe_start_date) as dt
FROM #ENROLLEES s
WHERE DATEADD(d, 1, pe_start_date) <= '12/1/11')
The most obvious issue (and probably the one that causes the error message too) is the absence of the actual statement to which the last CTE is supposed to pertain. I presume it should be a SELECT statement, one that would combine the result set of the CTE with the data from the #ENROLLEES table.
And that's where another issue emerges.
You see, apart from the fact that a name that starts with a single # is hardly advisable for anything that is not a local temporary table (a CTE is not a table indeed), you've also chosen for your CTE a particular name that already belongs to an existing table (more precisely, to the already mentioned #ENROLLEES temporary table), and the one you are going to pull data from too. You should definitely not use an existing table's name for a CTE, or you will not be able to join it with the CTE due to the name conflict.
It also appears that, based on its code, the last CTE represents an unfinished implementation of the logic you say you want to add to the SP. I can suggest some idea, but before I go on I'd like you to realise that there are actually two different requests in your post. One is about finding the cause of the error message, the other is about code for a new logic. Generally you are probably better off separating such requests into distinct questions, and so you might be in this case as well.
Anyway, here's my suggestion:
build a complete list of dates you want to be accounted for in the result set (that's what the CTE will be used for);
left-join that list with the #ENROLLEES table to pick data for the existing dates and some defaults or NULLs for the non-existing ones.
It might be implemented like this:
… /* all your code up until the last WITH */
;
WITH cte AS (
SELECT CAST('7/1/11' AS date) AS dt
UNION ALL
SELECT DATEADD(d, 1, dt) as dt
FROM cte
WHERE dt < '12/1/11'
)
SELECT
cte.dt,
tmp.[Program code],
tmp.[program_name],
… /* other columns as necessary; you might also consider
enveloping some or all of the "tmp" columns in ISNULLs,
like in
ISNULL(tmp.[Program code], '(none)') AS [Program code]
to provide default values for absent data */
FROM cte
LEFT JOIN #ENROLLEES tmp ON cte.dt = tmp.pe_start_date
;