How to get this query logic (instead of using Checksum)? - tsql

I have been struggling for more than 15 days to get the right data using CHECKSUM, and I am now looking for another way.
I want to find any rows that changed between the previous day's file and today's file in the HOUR of a punch card's punch_start, caused by an unexpected time zone shift (the hour changes, the minutes do not).
See the sample data below.
Dataset1 (Yesterday's file):
chcecksum person_id applied_date punch_start punch_end punch_hours
-1552866149 650067 2022-09-04 2022-09-04T20:11:00Z 2022-09-04T22:52:00Z 2.68333333333333
-1367087212 650067 2022-09-04 2022-09-04T22:52:00Z 2022-09-04T23:26:00Z 0.566666666666667
Dataset2 (Today's file):
chcecksum person_id applied_date punch_start punch_end punch_hours
-1564056421 650067 2022-09-04 2022-09-04T20:11:00Z 2022-09-04T22:52:00Z 2.683333333
-1470176798 650067 2022-09-04 2022-09-04T20:52:00Z 2022-09-04T23:26:00Z 0.566666667
So, what I am trying to do is: if the HOUR changes on punch_start only (as in this example), the query should flag (or select) those rows.
In this case, there was change from 22:52:00Z to 20:52:00Z on the second entry.
CHECKSUM does not work here because a change such as 2.683333333 to 2.68333 (with no change to punch_start) still produces a different checksum value.
The challenge is finding unique ID for those corresponding entries of two datasets, and it has been a struggle for me.
I have been using something like the following to create a unique ID for each entry:
,concat(
[person_id],
[applied_date] ,
[punch_hours],
datepart(minute, convert(datetime, cast([punch_start] as datetime), 112))
)
But it still gives me a lot of duplicates, because if somebody works from
9:00 AM -- 12:00 PM &
1:00 PM -- 5:00 PM on the same day,
the IDs collide: the entries share the same [applied_date], the same [punch_hours] and the same minute value.
How do we tackle this?

Have you looked at using EXCEPT?
-- Prep data
select *
INTO #yesterday
from (values
(-1552866149 ,650067 , '2022-09-04', cast('2022-09-04T20:11:00Z' as datetime), cast('2022-09-04T22:52:00Z' as datetime) , 2.68333333333333 ),
(-1367087212 ,650067 , '2022-09-04', cast('2022-09-04T22:52:00Z' as datetime), cast('2022-09-04T23:26:00Z' as datetime) , 0.566666666666667)
)t1(chcecksum ,person_id ,applied_date ,punch_start ,punch_end ,punch_hours)
select *
INTO #today
from (values
(-1564056421 , 650067 ,'2022-09-04', cast('2022-09-04T20:11:00Z' as datetime), cast('2022-09-04T22:52:00Z' as datetime), 2.683333333),
(-1470176798 , 650067 ,'2022-09-04', cast('2022-09-04T20:52:00Z' as datetime), cast('2022-09-04T23:26:00Z' as datetime), 0.566666667)
)t2(chcecksum ,person_id ,applied_date ,punch_start ,punch_end ,punch_hours)
-- output
select
person_id,
applied_date,
punch_end,
Round(punch_hours, 4) as punch_hours, -- hope this is acceptable
datepart(HH, punch_start) as punch_start_hour, -- only looking for changes to HOUR
format(punch_start, 'yyyy-MM-dd XX:mm') as punch_start_hourless -- mask the hour with XX so the rest of the datetime can still be compared
from #yesterday
except
select
person_id,
applied_date,
punch_end,
Round(punch_hours, 4) as punch_hours,
datepart(HH, punch_start) as punch_start_hour,
format(punch_start, 'yyyy-MM-dd XX:mm') as punch_start_hourless
from #today
Wrap the 'output' query in this if you want to get the original values back (minus the checksum):
SELECT
person_id
,applied_date
,Cast(REPLACE(punch_start_hourless, 'XX', punch_start_hour) as Datetime) as punch_start
,punch_end
,punch_hours
FROM (
-- insert query from above
) sub

You can use FULL OUTER JOIN to identify rows that exist in one table but not in the other:
select *
from Dataset1 d1
full outer join Dataset2 d2 on d1.person_id = d2.person_id
and d1.applied_date = d2.applied_date
and d1.punch_start = d2.punch_start
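Building on the two answers above, here is a minimal sketch that pairs the rows on columns the time zone shift should not touch and then compares only the hour of punch_start. It assumes the #yesterday and #today temp tables from the EXCEPT answer exist, and it assumes (this is not confirmed by the question) that person_id, applied_date and punch_end together identify a punch and that punch_end is not shifted along with punch_start.

```sql
-- Sketch only: the pairing key (person_id, applied_date, punch_end) is an assumption.
SELECT
    y.person_id,
    y.applied_date,
    y.punch_start AS punch_start_yesterday,
    t.punch_start AS punch_start_today,
    y.punch_end
FROM #yesterday AS y
INNER JOIN #today AS t
    ON  t.person_id    = y.person_id
    AND t.applied_date = y.applied_date
    AND t.punch_end    = y.punch_end
-- keep only pairs where the hour moved but the minute stayed the same
WHERE DATEPART(HOUR,   t.punch_start) <> DATEPART(HOUR,   y.punch_start)
  AND DATEPART(MINUTE, t.punch_start) =  DATEPART(MINUTE, y.punch_start);
```

Because punch_hours is not part of the join or the filter, rounding differences such as 2.68333333333333 versus 2.683333333 no longer matter.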

Repeated use of parameter for multiple UDF's in FROM throws an invalid column name error

When using multiple table-valued functions in a query like the one below, SSMS throws an error. Also, the [Date] parameter of [PRECALCPAGES_asof] is underlined in red.
I am trying to understand why this fails. I think it might be related to the way the SQL Server engine works. I have looked through the documentation on MSDN, but I do not know what to look for. What causes this, and is there a way around it?
Query
SELECT
[Date]
, COUNT(*)
FROM
[Warehouse].[dbo].[DimDate]
CROSS APPLY
[PROJECTS_asof]([Date])
INNER JOIN
[PRECALCPAGES_asof]([Date]) ON [PRECALCPAGES_asof].[PROJECTID] = [PROJECTS_asof].[PROJECTID]
GROUP BY
[Date]
Error
Msg 207, Level 16, State 1, Line 9
Invalid column name 'Date'.
Functions
CREATE FUNCTION [ProfitManager].[PROJECTS_asof]
(
@date DATETIME
)
RETURNS TABLE AS
RETURN
(
SELECT
[PROJECTID]
, [PROJECT]
, ...
FROM
Profitmanager.[PROJECTS_HISTORY]
WHERE
[RowStartDate] <= @date
AND
[RowEndDate] > @date
)
GO
CREATE FUNCTION [ProfitManager].[PRECALCPAGES_asof]
(
@date DATETIME
)
RETURNS TABLE AS
RETURN
(
SELECT
[PAGEID]
, [PAGENAME]
, ...
FROM
Profitmanager.[PRECALCPAGES_HISTORY]
WHERE
[RowStartDate] <= @date
AND
[RowEndDate] > @date
)
GO
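A table-valued function on the right-hand side of a JOIN cannot take a column from another table in the FROM clause as its argument, which is why [Date] is reported as an invalid column. Use CROSS APPLY for the second function as well, and move the join condition into the WHERE clause: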
SELECT
[Date]
, COUNT(*)
FROM
[Warehouse].[dbo].[DimDate]
CROSS APPLY
[PROJECTS_asof]([Date])
CROSS APPLY
[PRECALCPAGES_asof]([Date])
WHERE
[PRECALCPAGES_asof].[PROJECTID] = [PROJECTS_asof].[PROJECTID]
GROUP BY
[Date]
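The same idea with explicit aliases and schema-qualified function names, as a sketch. It assumes the functions live in the [ProfitManager] schema of the current database (as in the CREATE FUNCTION statements) and that [PRECALCPAGES_asof] returns a [PROJECTID] column among the elided ones; the aliases d, p and pp and the column name PageCount are made up for illustration.

```sql
-- Sketch only: aliases and the PageCount column name are illustrative.
SELECT
    d.[Date],
    COUNT(*) AS PageCount
FROM [Warehouse].[dbo].[DimDate] AS d
CROSS APPLY [ProfitManager].[PROJECTS_asof](d.[Date])     AS p   -- evaluated once per DimDate row
CROSS APPLY [ProfitManager].[PRECALCPAGES_asof](d.[Date]) AS pp  -- may reference d.[Date] because of APPLY
WHERE pp.[PROJECTID] = p.[PROJECTID]
GROUP BY d.[Date];
```

Aliasing each function call also avoids having to repeat the function name as a column prefix.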

Stored Procedure Date appears long date

ALTER PROCEDURE [dbo].[SP_My_Procedured]
AS
BEGIN
SELECT Mission_Time
FROM Mission_Table
WITH (NOLOCK)
WHERE
cast(getdate() as Date)=Mission_Time
END
When I run SP_My_Procedured,
I see Mission_Time as
"2014-01-04 08:35:05.510"
"2014-01-03 10:49:00.697"
But I want to see it like this:
"2014-01-04"
"2014-01-03"
How can I do this in the stored procedure's SELECT?
Any help will be appreciated.
Cast the value in the SELECT list to Date:
ALTER PROCEDURE [dbo].[SP_My_Procedured]
AS
BEGIN
SELECT
Mission_Time = cast(Mission_Time as Date)
FROM
Mission_Table (NOLOCK)
WHERE
cast(Mission_Time as Date) = cast(getdate() as Date)
END
[BTW: there are dangers to using NOLOCK]
[Also, casting the column to date may not be necessary if it is already of type Date (you don't specify its type). Doing so may result in an appropriate index not being used]
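If index usage on Mission_Time is a concern, one common alternative is to leave the column untouched in the WHERE clause and compare it against a half-open range covering today. This is a sketch that assumes Mission_Time is a datetime column; the variable name @today is made up for illustration.

```sql
-- Sketch only: assumes Mission_Time is a datetime column.
DECLARE @today date = CAST(GETDATE() AS date);

SELECT
    CAST(Mission_Time AS date) AS Mission_Time   -- display the date part only
FROM Mission_Table
WHERE Mission_Time >= @today                     -- from midnight today ...
  AND Mission_Time <  DATEADD(DAY, 1, @today);   -- ... up to, but not including, midnight tomorrow
```

The cast in the SELECT list only affects how the value is returned; the filter compares the raw column.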
Try
SELECT CONVERT(VARCHAR(10), Mission_Time, 120) AS MissionTime
Style 120 formats the value as yyyy-mm-dd hh:mi:ss, so truncating it to 10 characters keeps only the date part.

How to select first and last records between certain date parameters?

I need a Query to extract the first instance and last instance only between date parameters.
I have a table recording financial information, with a financialyearenddate field, linked to the Company table via companyID. Each company is also linked to the programme table and can have multiple programmes. I have a report that pulls the financials for each company
on a certain programme, which I have adjusted to pull only the first and last instance (using MIN & MAX). However, I need the first instance
after a certain date parameter and the last instance before a certain date parameter.
Example: Company ABloggs has financials for 1999,2000,2001,2004,2006,2007,2009 but the programme ran from 2001 to 2007 so I only want
the first financial record and last financial record between those years i.e. 2001 & 2007 records. Any help appreciated.
At the moment I am using 2 queries as I needed the data in a hurry but I need it in 1 query and only where financial year end dates are between parameters and only where there are minimum of 2 GVA records for a company.
Query1:
SELECT
gva.ccx_companyname,
gva.ccx_depreciation,
gva.ccx_exportturnover,
gva.ccx_financialyearenddate,
gva.ccx_netprofitbeforetax,
gva.ccx_totalturnover,
gva.ccx_totalwages,
gva.ccx_statusname,
gva.ccx_status,
gva.ccx_company,
gva.ccx_totalwages + gva.ccx_netprofitbeforetax + gva.ccx_depreciation AS GVA,
gva.ccx_nofulltimeequivalentemployees
FROM
(
SELECT
ccx_companyname,
MAX(ccx_financialyearenddate) AS LatestDate
FROM Filteredccx_gva AS Filteredccx_gva_1
GROUP BY ccx_companyname
) AS min_1
INNER JOIN Filteredccx_gva AS gva
ON min_1.ccx_companyname = gva.ccx_companyname AND
min_1.LatestDate = gva.ccx_financialyearenddate
WHERE (gva.ccx_status = ACTUAL)
Query2:
SELECT
gva.ccx_companyname,
gva.ccx_depreciation,
gva.ccx_exportturnover,
gva.ccx_financialyearenddate,
gva.ccx_netprofitbeforetax,
gva.ccx_totalturnover,
gva.ccx_totalwages,
gva.ccx_statusname,
gva.ccx_status,
gva.ccx_company,
gva.ccx_totalwages + gva.ccx_netprofitbeforetax + gva.ccx_depreciation AS GVA,
gva.ccx_nofulltimeequivalentemployees
FROM
(
SELECT
ccx_companyname,
MIN(ccx_financialyearenddate) AS FirstDate
FROM Filteredccx_gva AS Filteredccx_gva_1
GROUP BY ccx_companyname
) AS MAX_1
INNER JOIN Filteredccx_gva AS gva
ON MAX_1.ccx_companyname = gva.ccx_companyname AND
MAX_1.FirstDate = gva.ccx_financialyearenddate
WHERE (gva.ccx_status = ACTUAL)
Can't you just add a WHERE clause using the first and last date parameters? Something like this:
SELECT <companyId>, MIN(<date>), MAX(<date>)
FROM <table>
WHERE <date> BETWEEN @firstDate AND @lastDate
GROUP BY <companyId>
declare @programme table (ccx_companyname varchar(max), start_year int, end_year int);
insert @programme values
('ABloggs', 2001, 2007);
declare @companies table (ccx_companyname varchar(max), ccx_financialyearenddate int);
insert @companies values
('ABloggs', 1999)
,('ABloggs', 2000)
,('ABloggs', 2001)
,('ABloggs', 2004)
,('ABloggs', 2006)
,('ABloggs', 2007)
,('ABloggs', 2009);
select c.ccx_companyname, min(ccx_financialyearenddate), max(ccx_financialyearenddate)
from @companies c
join @programme p on c.ccx_companyname = p.ccx_companyname
where c.ccx_financialyearenddate >= p.start_year and c.ccx_financialyearenddate <= p.end_year
group by c.ccx_companyname
having count(*) > 1;
You can combine your two original queries into a single query by putting the MIN and MAX aggregates in the same GROUP BY query of the derived table. Adding COUNT(*) and HAVING COUNT(*) > 1 ensures that each company has at least 2 dates. The query should look like this:
SELECT
gva.ccx_companyname,
gva.ccx_depreciation,
gva.ccx_exportturnover,
gva.ccx_financialyearenddate,
gva.ccx_netprofitbeforetax,
gva.ccx_totalturnover,
gva.ccx_totalwages,
gva.ccx_statusname,
gva.ccx_status,
gva.ccx_company,
gva.ccx_totalwages + gva.ccx_netprofitbeforetax + gva.ccx_depreciation AS GVA,
gva.ccx_nofulltimeequivalentemployees
FROM
(SELECT
ccx_companyname,
ccx_status,
MIN(ccx_financialyearenddate) AS FirstDate,
MAX(ccx_financialyearenddate) AS LastDate,
COUNT(*) AS NumDates
FROM Filteredccx_gva AS Filteredccx_gva_1
WHERE (ccx_status = ACTUAL)
GROUP BY ccx_companyname, ccx_status
HAVING COUNT(*) > 1
) AS MinMax
INNER JOIN Filteredccx_gva AS gva
ON MinMax.ccx_companyname = gva.ccx_companyname AND
(MinMax.FirstDate = gva.ccx_financialyearenddate OR
MinMax.LastDate = gva.ccx_financialyearenddate)
WHERE (gva.ccx_status = MinMax.ccx_status)
ORDER BY gva.ccx_companyname, gva.ccx_financialyearenddate
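To also restrict the rows to the programme's date range, the two ideas above can be combined. The following is a sketch using window functions instead of a self-join; the parameter names @firstDate and @lastDate and their values are hypothetical, ccx_financialyearenddate is assumed to hold a date, and only a few of the original columns are listed to keep it short.

```sql
-- Sketch only: parameter names and values are hypothetical.
DECLARE @firstDate date = '2001-01-01',
        @lastDate  date = '2007-12-31';

WITH ranked AS (
    SELECT
        gva.ccx_companyname,
        gva.ccx_financialyearenddate,
        gva.ccx_totalwages + gva.ccx_netprofitbeforetax + gva.ccx_depreciation AS GVA,
        ROW_NUMBER() OVER (PARTITION BY gva.ccx_companyname
                           ORDER BY gva.ccx_financialyearenddate ASC)  AS rn_first,
        ROW_NUMBER() OVER (PARTITION BY gva.ccx_companyname
                           ORDER BY gva.ccx_financialyearenddate DESC) AS rn_last,
        COUNT(*)     OVER (PARTITION BY gva.ccx_companyname)           AS num_records
    FROM Filteredccx_gva AS gva
    WHERE gva.ccx_financialyearenddate BETWEEN @firstDate AND @lastDate
    -- add the ccx_status filter from the original queries here if it is needed
)
SELECT ccx_companyname, ccx_financialyearenddate, GVA
FROM ranked
WHERE num_records >= 2                 -- at least two records inside the range
  AND (rn_first = 1 OR rn_last = 1)    -- first and last record per company
ORDER BY ccx_companyname, ccx_financialyearenddate;
```

For the ABloggs example this would keep the 2001 and 2007 records and ignore 1999, 2000 and 2009.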

TSQL CTE Error: Incorrect syntax near ')'

I am developing a TSQL stored proc using SSMS 2008 and am receiving the above error while generating a CTE. I want to add logic to this SP to return every day, not just the days with data. How do I do this? Here is my SP so far:
ALTER Proc [dbo].[rpt_rd_CensusWithChart]
@program uniqueidentifier = NULL,
@office uniqueidentifier = NULL
AS
DECLARE @a_date datetime
SET @a_date = case when MONTH(GETDATE()) >= 7 THEN '7/1/' + CAST(YEAR(GETDATE()) AS VARCHAR(30))
ELSE '7/1/' + CAST(YEAR(GETDATE())-1 AS VARCHAR(30)) END
if exists (
select * from tempdb.dbo.sysobjects o where o.xtype in ('U') and o.id = object_id(N'tempdb..#ENROLLEES')
) DROP TABLE #ENROLLEES;
if exists (
select * from tempdb.dbo.sysobjects o where o.xtype in ('U') and o.id = object_id(N'tempdb..#DISCHARGES')
) DROP TABLE #DISCHARGES;
declare @sum_enrollment int
set @sum_enrollment =
(select sum(1)
from enrollment_view A
join enrollment_info_expanded_view C on A.enrollment_id = C.enroll_el_id
where
(@office is NULL OR A.group_profile_id = @office)
AND (@program is NULL OR A.program_info_id = @program)
and (C.pe_end_date IS NULL OR C.pe_end_date > @a_date)
AND C.pe_start_date IS NOT NULL and C.pe_start_date < @a_date)
select
A.program_info_id as [Program code],
A.[program_name],
A.profile_name as Facility,
A.group_profile_id as Facility_code,
A.people_id,
1 as enrollment_id,
C.pe_start_date,
C.pe_end_date,
LEFT(datename(month,(C.pe_start_date)),3) as a_month,
day(C.pe_start_date) as a_day,
@sum_enrollment as sum_enrollment
into #ENROLLEES
from enrollment_view A
join enrollment_info_expanded_view C on A.enrollment_id = C.enroll_el_id
where
(@office is NULL OR A.group_profile_id = @office)
AND (@program is NULL OR A.program_info_id = @program)
and (C.pe_end_date IS NULL OR C.pe_end_date > @a_date)
AND C.pe_start_date IS NOT NULL and C.pe_start_date >= @a_date
;WITH #ENROLLEES AS (
SELECT '7/1/11' AS dt
UNION ALL
SELECT DATEADD(d, 1, pe_start_date) as dt
FROM #ENROLLEES s
WHERE DATEADD(d, 1, pe_start_date) <= '12/1/11')
The most obvious issue (and probably the one that causes the error message too) is the absence of the actual statement to which the last CTE is supposed to pertain. I presume it should be a SELECT statement, one that would combine the result set of the CTE with the data from the #ENROLLEES table.
And that's where another issue emerges.
You see, apart from the fact that a name that starts with a single # is hardly advisable for anything that is not a local temporary table (a CTE is not a table indeed), you've also chosen for your CTE a particular name that already belongs to an existing table (more precisely, to the already mentioned #ENROLLEES temporary table), and the one you are going to pull data from too. You should definitely not use an existing table's name for a CTE, or you will not be able to join it with the CTE due to the name conflict.
It also appears that, based on its code, the last CTE represents an unfinished implementation of the logic you say you want to add to the SP. I can suggest some idea, but before I go on I'd like you to realise that there are actually two different requests in your post. One is about finding the cause of the error message, the other is about code for a new logic. Generally you are probably better off separating such requests into distinct questions, and so you might be in this case as well.
Anyway, here's my suggestion:
build a complete list of dates you want to be accounted for in the result set (that's what the CTE will be used for);
left-join that list with the #ENROLLEES table to pick data for the existing dates and some defaults or NULLs for the non-existing ones.
It might be implemented like this:
… /* all your code up until the last WITH */
;
WITH cte AS (
SELECT CAST('7/1/11' AS date) AS dt
UNION ALL
SELECT DATEADD(d, 1, dt) as dt
FROM cte
WHERE dt < '12/1/11'
)
SELECT
cte.dt,
tmp.[Program code],
tmp.[program_name],
… /* other columns as necessary; you might also consider
enveloping some or all of the "tmp" columns in ISNULLs,
like in
ISNULL(tmp.[Program code], '(none)') AS [Program code]
to provide default values for absent data */
FROM cte
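LEFT JOIN #ENROLLEES tmp ON cte.dt = tmp.pe_start_date
OPTION (MAXRECURSION 0) /* 7/1/11 to 12/1/11 needs more than the default limit of 100 recursions */
;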

Dynamic pivot - how to obtain column titles parametrically?

I wish to write a Query for SAP B1 (t-sql) that will list all Income and Expenses Items by total and month by month.
I have successfully written a Query using PIVOT, but I do not want the column headings to be hardcoded like: Jan-11, Feb-11, Mar-11 ... Dec-11.
Rather I want the column headings to be parametrically generated, so that if I input:
--------------------------------------
Query - Selection Criteria
--------------------------------------
Posting Date greater or equal 01.09.10
Posting Date smaller or equal 31.08.11
[OK] [Cancel]
the Query will generate the following columns:
Sep-10, Oct-10, Nov-10, ..... Aug-11
I guess DYNAMIC PIVOT can do the trick.
So, I modified one SQL obtained from another forum to suit my purpose, but it does not work. The error message I get is Incorrect Syntax near 20100901.
Could anybody help me locate my error?
Note: In SAP B1, '[%1]' is an input variable
Here's my query:
/*Section 1*/
DECLARE @listCol VARCHAR(2000)
DECLARE @query VARCHAR(4000)
-------------------------------------
/*Section 2*/
SELECT @listCol =
STUFF(
( SELECT DISTINCT '],[' + CONVERT(VARCHAR, MONTH(T0.RefDate), 102)
FROM JDT1
FOR XML PATH(''))
, 1, 2, '') + ']'
------------------------------------
/*Section 3*/
SET @query = '
SELECT * FROM
(
SELECT
T0.Account,
T1.GroupMask,
T1.AcctName,
MONTH(T0.RefDate) as [Month],
(T0.Debit - T0.Credit) as [Amount]
FROM dbo.JDT1 T0
JOIN dbo.OACT T1 ON T0.Account = T1.AcctCode
WHERE
T1.GroupMask IN (4,5,6,7) AND
T0.[Refdate] >= '[%1]' AND
T0.[Refdate] <= '[%2]'
) S
PIVOT
(
Sum(Amount)
FOR [Month] IN ('+@listCol+')
) AS pvt
'
--------------------------------------------
/*Section 4*/
EXECUTE (@query)
I don't know SAP, but a couple of things spring to mind:
It looks like you want @listCol to contain a list of numbers in square brackets, for example [7],[8],[9].... Check that the string really starts with a [: STUFF(..., 1, 2, '') removes only the first two characters of the leading '],[', which should leave the opening bracket in place.
Try replacing the lines
T0.[Refdate] >= '[%1]' AND
T0.[Refdate] <= '[%2]'
with
T0.[Refdate] >= ''[%1]'' AND
T0.[Refdate] <= ''[%2]''
(I also added a space before the AND in the first of these two lines while I was editing your question.)
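An alternative that avoids the nested quotes altogether is to pass the dates to the dynamic statement as parameters through sp_executesql. The sketch below runs in plain T-SQL; the variables @from and @to are hypothetical stand-ins for the SAP B1 inputs [%1] and [%2], and I have not checked how the SAP B1 query manager handles this form.

```sql
-- Sketch only: @from/@to stand in for the SAP B1 inputs [%1]/[%2].
DECLARE @from date = '20100901',
        @to   date = '20110831';
DECLARE @listCol nvarchar(max),
        @query   nvarchar(max);

-- Build [9],[10],...,[8] from the months that occur in the selected range
SELECT @listCol = STUFF((
        SELECT ',' + QUOTENAME(CAST(m AS varchar(2)))
        FROM (SELECT DISTINCT MONTH(RefDate) AS m
              FROM dbo.JDT1
              WHERE RefDate >= @from AND RefDate <= @to) AS months
        ORDER BY m
        FOR XML PATH('')), 1, 1, '');   -- strip the leading comma

SET @query = N'
SELECT * FROM
(
    SELECT T0.Account, T1.GroupMask, T1.AcctName,
           MONTH(T0.RefDate) AS [Month],
           (T0.Debit - T0.Credit) AS [Amount]
    FROM dbo.JDT1 T0
    JOIN dbo.OACT T1 ON T0.Account = T1.AcctCode
    WHERE T1.GroupMask IN (4,5,6,7)
      AND T0.RefDate >= @from
      AND T0.RefDate <= @to
) S
PIVOT (SUM(Amount) FOR [Month] IN (' + @listCol + N')) AS pvt;';

-- The dates travel as typed parameters, so no quote escaping is needed
EXEC sp_executesql @query, N'@from date, @to date', @from = @from, @to = @to;
```

Only the column list is concatenated into the statement text, and QUOTENAME takes care of the square brackets.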