How do I write SQL to bring this back - T-SQL

I need to bring back distinct rows from table A. I need to take employee 300 from tables B and C: he can see SiteId 1 and 2, and EmployingActivityId 10 and 50. What is the most efficient way to grab these records?
Table A
id Employeeid SiteId EmployingActivityId
1 123 1 10
2 124 2 10
3 125 3 30
4 126 2 40
5 127 5 50
6 128 2 60
Table b
employeeid SiteID
300 1
300 2
400 2
table C
employeeid EmployingActivityId
300 10
300 50
400 20
I know this is not right, but....
select distinct id, Employeeid from tableA as a
inner join
(select siteID from tableb (where employee = 300) on tableb.siteID = tableA.siteid
inner join
(select siteID from tablec (where employee = 300) on tablec.EmployingActivityId = tableA.EmployingActivityId
I need to bring back from table A
id Employeeid
1 123
2 124
5 127
6 128

I would do something like this:
declare @a as table(
id int
,employeeId int
,siteId int
,employeeActivityId int
)
declare @b as table(
employeeId int
,siteId int
)
declare @c as table(
employeeId int
,employeeActivityId int
)
insert into @a values (1,123,1,10)
insert into @a values (2,124,2,10)
insert into @a values (3,125,3,30)
insert into @a values (4,126,2,40)
insert into @a values (5,127,5,50)
insert into @a values (6,128,2,60)
insert into @b values (300,1)
insert into @b values (300,2)
insert into @b values (400,2)
insert into @c values (300,10)
insert into @c values (300,50)
insert into @c values (400,20)
select distinct a.id, a.employeeId
from @a a
left join @b b
on a.siteId = b.siteId
left join @c c
on a.employeeActivityId = c.employeeActivityId
where 300 in (b.employeeId, c.employeeId)
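The logic is easy to sanity-check outside SQL Server; here is a minimal sketch using SQLite via Python's sqlite3 (table and column names follow the question). Note that under this OR visibility rule, row 4 also qualifies, since employee 300 can see SiteId 2 — the sample expected output in the question appears to omit it:

```python
import sqlite3

# In-memory database with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INT, employeeId INT, siteId INT, employingActivityId INT);
CREATE TABLE b (employeeId INT, siteId INT);
CREATE TABLE c (employeeId INT, employingActivityId INT);
INSERT INTO a VALUES (1,123,1,10),(2,124,2,10),(3,125,3,30),
                     (4,126,2,40),(5,127,5,50),(6,128,2,60);
INSERT INTO b VALUES (300,1),(300,2),(400,2);
INSERT INTO c VALUES (300,10),(300,50),(400,20);
""")

# Rows from a visible to employee 300: site matches b OR activity matches c.
rows = conn.execute("""
    SELECT DISTINCT a.id, a.employeeId
    FROM a
    LEFT JOIN b ON a.siteId = b.siteId AND b.employeeId = 300
    LEFT JOIN c ON a.employingActivityId = c.employingActivityId
               AND c.employeeId = 300
    WHERE b.employeeId IS NOT NULL OR c.employeeId IS NOT NULL
    ORDER BY a.id
""").fetchall()
print(rows)  # [(1, 123), (2, 124), (4, 126), (5, 127), (6, 128)]
```

The LEFT JOIN plus IS NOT NULL filter is equivalent to the `where 300 in (...)` trick in the T-SQL answer above.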


Postgresql Partition - Function call in Select Query - is slow

Our system is a SaaS-based system, and we use ClientID as a mask for data fetching.
The DB load depends on the size of the company, so we partitioned the DB based on ClientID.
Example: Before Partition
clienttable
clientid clientname clientaddress
1 ABC ...
2 EMN ...
3 XYZ ...
employeetable
clientid employeeid employeename
1 123 AAA
1 124 BBB
2 125 CCC
2 126 DDD
3 127 EEEE
jobtable
clientid jobid jobname
1 234 YTR
1 235 DER
2 236 SWE
3 237 VFT
3 238 GHJ
Example: After Partition
clienttable
clientid clientname clientaddress
1 ABC ...
2 EMN ...
3 XYZ ...
employeetable
employeetable_1
clientid employeeid employeename
1 123 AAA
1 124 BBB
employeetable_2
clientid employeeid employeename
2 125 CCC
2 126 DDD
employeetable_3
clientid employeeid employeename
3 127 EEE
jobtable
jobtable_1
clientid jobid jobname
1 234 YTR
1 235 DER
jobtable_2
clientid jobid jobname
2 236 SWE
jobtable_3
clientid jobid jobname
3 237 VFT
3 238 GHJ
When we write select queries:
Select employeeid,employeename from employeetable where clientid=2;
This query runs faster after partitioning. The problem we face is that we have some user-defined functions to manipulate data.
CREATE OR REPLACE FUNCTION GET_JOB_COUNT(NUMERIC, NUMERIC) RETURNS NUMERIC AS $BODY$
DECLARE
    p_client_id ALIAS FOR $1;
    p_employee_id ALIAS FOR $2;
    v_is_count NUMERIC := 0;
BEGIN
    SELECT COUNT(JOB_ID) INTO v_is_count FROM JOBTABLE WHERE CLIENTID = p_client_id AND CREATEDBY = p_employee_id;
    RETURN v_is_count;
END; $BODY$
LANGUAGE plpgsql;
Select employeeid,employeename,GET_JOB_COUNT(2,employeeid) from employeetable where clientid=2;
This query is slow after partitioning. Does this mean the GET_JOB_COUNT function is run across all partitions?
If that is the problem, does it mean we can't use functions like this in a SELECT query after partitioning?
The function will be called once for each and every row from the employeetable (that is selected through the WHERE clause). I doubt you can improve the performance in any significant way using that approach.
It's better to do the aggregation (=count) for all rows at once, rather than for each row separately:
select e.employeeid, e.employeename, j.cnt
from employeetable e
left join (
select clientid, createdby, count(job_id) as cnt
from jobtable
group by clientid, createdby
) j on j.clientid = e.clientid and j.createdby = e.employeeid
where e.clientid = 2;
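The grouped-join rewrite can be checked on a small sample; a sketch using SQLite via Python's sqlite3 (the createdby column and the extra sample jobs 239/240 are assumptions here, since the question's jobtable sample doesn't show who created each job):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employeetable (clientid INT, employeeid INT, employeename TEXT);
CREATE TABLE jobtable (clientid INT, jobid INT, jobname TEXT, createdby INT);
INSERT INTO employeetable VALUES (2,125,'CCC'),(2,126,'DDD'),(3,127,'EEE');
INSERT INTO jobtable VALUES (2,236,'SWE',125),(2,239,'QQQ',125),(2,240,'RRR',126);
""")

# One aggregation over jobtable, then a join -- instead of a per-row function call.
rows = conn.execute("""
    SELECT e.employeeid, e.employeename, COALESCE(j.cnt, 0) AS cnt
    FROM employeetable e
    LEFT JOIN (
        SELECT clientid, createdby, COUNT(jobid) AS cnt
        FROM jobtable
        GROUP BY clientid, createdby
    ) j ON j.clientid = e.clientid AND j.createdby = e.employeeid
    WHERE e.clientid = 2
    ORDER BY e.employeeid
""").fetchall()
print(rows)  # [(125, 'CCC', 2), (126, 'DDD', 1)]
```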
Another option to try is to use a lateral join to eliminate rows from the jobtable early - I am not sure if the optimizer is smart enough for that in the query above. So you can try this as an alternative:
select e.employeeid, employeename, j.cnt
from employeetable e
left join lateral (
select count(jt.job_id) as cnt
from jobtable jt
where jt.clientid = e.clientid
and jt.createdby = e.employeeid
) j on true
where e.clientid = 2;
If you really do want to stick with the function, maybe making it a SQL function helps the optimizer. It at least removes the overhead of calling PL/pgSQL code:
CREATE OR REPLACE FUNCTION get_job_count(p_client_id numeric, p_employee_id numeric)
returns bigint
as
$body$
SELECT COUNT(JOB_ID)
FROM JOBTABLE
where CLIENTID = p_client_id
AND CREATEDBY = p_employee_id;
$BODY$
LANGUAGE sql
stable
parallel safe;
But I doubt that you will see a substantial improvement from that.
As a side note: using numeric for an "ID" column seems like a rather strange choice. Why aren't you using int or bigint for that?

T-SQL query, multiple values in a field

I have two tables in a database. The first table tblTracker contains many columns, but the column of particular interest is called siteAdmin and each row in that column can contain multiple loginIDs of 5 digits like 21457, 21456 or just one like 21444. The next table users contains columns like LoginID, fname, and lname.
What I would like to be able to do is take the loginIDs contained in tblTracker.siteAdmin and return fname + lname from users. I can successfully do this when there is only one loginID in the row such as 21444 but I cannot figure out how to do this when there is more than one like 21457, 21456.
Here is the SQL statement I use for when there is one loginID in that column
SELECT b.FName + ' ' + b.LName AS siteAdminName
FROM tblTracker a
LEFT OUTER JOIN users b ON a.siteAdmin= b.Login_Id
However this doesn't work when it tries to join a siteAdmin with more than one LoginID in it
Thanks!
I prefer the number-table approach to splitting a string in T-SQL.
For this method to work, you need to do this one-time table setup:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
@SplitOn char(1) --REQUIRED, the character to split the @List string on
,@List varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
----------------
--SINGLE QUERY-- --this will not return empty rows
----------------
SELECT
ListValue
FROM (SELECT
LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
FROM (
SELECT @SplitOn + @List + @SplitOn AS List2
) AS dt
INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
WHERE SUBSTRING(List2, number, 1) = @SplitOn
) dt2
WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a CSV string into a table and join on it:
select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')
OUTPUT:
ListValue
-----------------------
1
2
3
4
5
6777
(6 row(s) affected)
You can now use a CROSS APPLY to split every row in your table, like:
DECLARE @users table (LoginID int, fname varchar(5), lname varchar(5))
INSERT INTO @users VALUES (1, 'Sam', 'Jones')
INSERT INTO @users VALUES (2, 'Don', 'Smith')
INSERT INTO @users VALUES (3, 'Joe', 'Doe')
INSERT INTO @users VALUES (4, 'Tim', 'White')
INSERT INTO @users VALUES (5, 'Matt', 'Davis')
INSERT INTO @users VALUES (15,'Sue', 'Me')
DECLARE @tblTracker table (RowID int, siteAdmin varchar(50))
INSERT INTO @tblTracker VALUES (1,'1,2,3')
INSERT INTO @tblTracker VALUES (2,'2,3,4')
INSERT INTO @tblTracker VALUES (3,'1,5')
INSERT INTO @tblTracker VALUES (4,'1')
INSERT INTO @tblTracker VALUES (5,'5')
INSERT INTO @tblTracker VALUES (6,'')
INSERT INTO @tblTracker VALUES (7,'8,9,10')
INSERT INTO @tblTracker VALUES (8,'1,15,3,4,5')
SELECT
t.RowID, u.LoginID, u.fname+' '+u.lname AS YourAdmin
FROM @tblTracker t
CROSS APPLY dbo.FN_ListToTable(',',t.siteAdmin) st
LEFT OUTER JOIN @users u ON st.ListValue=u.LoginID --to get all rows even if missing siteAdmin
--INNER JOIN @users u ON st.ListValue=u.LoginID --to remove rows without any siteAdmin
ORDER BY t.RowID,u.fname,u.lname
OUTPUT:
RowID LoginID YourAdmin
----------- ----------- -----------
1 2 Don Smith
1 3 Joe Doe
1 1 Sam Jones
2 2 Don Smith
2 3 Joe Doe
2 4 Tim White
3 5 Matt Davis
3 1 Sam Jones
4 1 Sam Jones
5 5 Matt Davis
7 NULL NULL
7 NULL NULL
7 NULL NULL
8 3 Joe Doe
8 5 Matt Davis
8 1 Sam Jones
8 15 Sue Me
8 4 Tim White
(18 row(s) affected)
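The split-then-join semantics can also be mimicked outside SQL; a plain-Python sketch of the same CROSS APPLY + LEFT OUTER JOIN over the sample data above (empty CSV rows disappear, unknown IDs survive with NULLs):

```python
# Sample data from the answer above.
users = {1: ('Sam', 'Jones'), 2: ('Don', 'Smith'), 3: ('Joe', 'Doe'),
         4: ('Tim', 'White'), 5: ('Matt', 'Davis'), 15: ('Sue', 'Me')}
tracker = [(1, '1,2,3'), (2, '2,3,4'), (3, '1,5'), (4, '1'),
           (5, '5'), (6, ''), (7, '8,9,10'), (8, '1,15,3,4,5')]

result = []
for row_id, site_admin in tracker:
    # CROSS APPLY: split the CSV, skipping empty entries (so row 6 vanishes).
    for token in filter(None, site_admin.split(',')):
        login_id = int(token)
        u = users.get(login_id)  # LEFT JOIN: None when no matching user
        result.append((row_id, login_id if u else None,
                       f"{u[0]} {u[1]}" if u else None))

print(result[:3])  # [(1, 1, 'Sam Jones'), (1, 2, 'Don Smith'), (1, 3, 'Joe Doe')]
```

This produces the same 18 rows as the SQL output, just ordered by CSV position rather than by fname/lname.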

T-SQL: Selecting column from an alternate table when null

I have the following table, TableA, with data:
ID ColA
0 10
1 null
2 20
I have another table, TableB, with the following data.
ID ColB ColC
1 30 80
1 40 70
3 50 100
I need to select rows in TableA but when ColA in the row is null, I want to retrieve the value of ColB in TableB (if one exists) and use it in place of ColA. If no value in ColB exists, then the value of ColA in the result should be null. The join is done on TableA.ID and TableB.ID. TableB can have multiple rows where the ID column repeats. TableB.ID and TableB.ColC together make a row unique. So my result should look like this if ColC is limited to the value of 70:
ID ColA
0 10
1 40
2 20
Not sure how to do this. Thanks for your help!
select a.ID, COALESCE(a.ColA, b.ColB) as 'ColA'
from TableA a
left join TableB b on a.ID = b.ID and b.ColC = 70
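The same COALESCE pattern, checked against the sample data (a sketch with SQLite via Python's sqlite3):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (ID INT, ColA INT);
CREATE TABLE TableB (ID INT, ColB INT, ColC INT);
INSERT INTO TableA VALUES (0,10),(1,NULL),(2,20);
INSERT INTO TableB VALUES (1,30,80),(1,40,70),(3,50,100);
""")

# Putting the ColC = 70 filter in the join condition keeps TableA rows
# with no TableB match (their ColB is simply NULL, so ColA wins or stays NULL).
rows = conn.execute("""
    SELECT a.ID, COALESCE(a.ColA, b.ColB) AS ColA
    FROM TableA a
    LEFT JOIN TableB b ON a.ID = b.ID AND b.ColC = 70
    ORDER BY a.ID
""").fetchall()
print(rows)  # [(0, 10), (1, 40), (2, 20)]
```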
This seems to do what you want if I have correctly interpreted your question:
SELECT a.ID,
ISNULL(a.ColA, b.ColB) ColA
FROM TableA a
LEFT JOIN
TableB b
ON a.ID = b.ID
AND b.ColC = 70
I have literally limited ColC to the value 70, as you stated.
Sounds like you're looking for a CASE expression. Try: case when TableA.ColA is null then TableB.ColB else TableA.ColA end
SQL Case Statements

TSQL - Mapping one table to another without using cursor

I have tables with following structure
create table Doc(
id int identity(1, 1) primary key,
DocumentStartValue varchar(100)
)
create table Metadata (
DocumentValue varchar(100),
StartDesignation char(1),
PageNumber int
)
GO
Doc contains
id DocumentStartValue
1000 ID-1
1100 ID-5
2000 ID-8
3000 ID-9
Metadata contains
Documentvalue StartDesignation PageNumber
ID-1 D 0
ID-2 NULL 1
ID-3 NULL 2
ID-4 NULL 3
ID-5 D 0
ID-6 NULL 1
ID-7 NULL 2
ID-8 D 0
ID-9 D 0
What I need to is to map Metadata.DocumentValues to Doc.id
So the result I need is something like
id DocumentValue PageNumber
1000 ID-1 0
1000 ID-2 1
1000 ID-3 2
1000 ID-4 3
1100 ID-5 0
1100 ID-6 1
1100 ID-7 2
2000 ID-8 0
3000 ID-9 0
Can it be achieved without the use of a cursor?
Something like this (sorry, I can't test it):
;WITH RowList AS
( --assign RowNums to each row...
SELECT
ROW_NUMBER() OVER (ORDER BY id) AS RowNum,
id, DocumentStartValue
FROM
doc
), RowPairs AS
( --pair each row with the next one to create [Start, End) ranges;
  --a LEFT JOIN keeps the last, open-ended range
SELECT
R.id, R.DocumentStartValue AS StartValue,
R1.DocumentStartValue AS EndValue
FROM
RowList R LEFT JOIN RowList R1 ON R.RowNum + 1 = R1.RowNum
)
--use ranges to join back and get the data
SELECT
RP.id, M.DocumentValue, M.PageNumber
FROM
RowPairs RP
JOIN Metadata M ON RP.StartValue <= M.DocumentValue
               AND (RP.EndValue IS NULL OR M.DocumentValue < RP.EndValue)
Edit: This assumes that you can rely on the ID-x values matching and being ascending. If so, StartDesignation is superfluous/redundant and may conflict with the Doc table's DocumentStartValue.
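A testable version of the range-pair idea (sketched with SQLite via Python's sqlite3, which has ROW_NUMBER() since 3.25; End is renamed EndValue because END is a reserved word, and a LEFT JOIN keeps the last, open-ended range):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Doc (id INT PRIMARY KEY, DocumentStartValue TEXT);
CREATE TABLE Metadata (DocumentValue TEXT, StartDesignation TEXT, PageNumber INT);
INSERT INTO Doc VALUES (1000,'ID-1'),(1100,'ID-5'),(2000,'ID-8'),(3000,'ID-9');
INSERT INTO Metadata VALUES
  ('ID-1','D',0),('ID-2',NULL,1),('ID-3',NULL,2),('ID-4',NULL,3),
  ('ID-5','D',0),('ID-6',NULL,1),('ID-7',NULL,2),('ID-8','D',0),('ID-9','D',0);
""")

# Pair each start value with the next one to form [StartValue, EndValue) ranges,
# then join Metadata rows into the range they fall in.
rows = conn.execute("""
    WITH RowList AS (
        SELECT ROW_NUMBER() OVER (ORDER BY id) AS RowNum, id, DocumentStartValue
        FROM Doc
    ),
    RowPairs AS (
        SELECT r.id, r.DocumentStartValue AS StartValue,
               r1.DocumentStartValue AS EndValue
        FROM RowList r
        LEFT JOIN RowList r1 ON r1.RowNum = r.RowNum + 1
    )
    SELECT rp.id, m.DocumentValue, m.PageNumber
    FROM RowPairs rp
    JOIN Metadata m
      ON rp.StartValue <= m.DocumentValue
     AND (rp.EndValue IS NULL OR m.DocumentValue < rp.EndValue)
    ORDER BY rp.id, m.DocumentValue
""").fetchall()
print(rows)
```

Note this relies on lexicographic comparison of the ID-x strings, which only works while the numeric suffixes stay single-digit.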
with rm as
(
select DocumentValue
,PageNumber
,case when StartDesignation = 'D' then 1 else 0 end as IsStart
,row_number() over (order by DocumentValue) as RowNumber
from Metadata
)
,gm as
(
select
DocumentValue as DocumentGroup
,DocumentValue
,PageNumber
,RowNumber
from rm
where RowNumber = 1
union all
select
case when rm.IsStart = 1 then rm.DocumentValue else gm.DocumentGroup end
,rm.DocumentValue
,rm.PageNumber
,rm.RowNumber
from gm
inner join rm on rm.RowNumber = (gm.RowNumber + 1)
)
select d.id, gm.DocumentValue, gm.PageNumber
from Doc d
inner join gm on d.DocumentStartValue = gm.DocumentGroup
Try the query above (you may also need to add option (maxrecursion ...)) and add an index on DocumentValue for the Metadata table. Also, if possible, it would be better to store the appropriate group when inserting the Metadata rows.
UPD: I've tested it and fixed errors in my query; now it works and gives the result shown in the initial question.
UPD2: And recommended indexes:
create clustered index IX_Metadata on Metadata (DocumentValue)
create nonclustered index IX_Doc_StartValue on Doc (DocumentStartValue)

DB2 query group by id but with max of date and max of sequence

My table is like
ID FName LName Date(mm/dd/yy) Sequence Value
101 A B 1/10/2010 1 10
101 A B 1/10/2010 2 20
101 X Y 1/2/2010 1 15
101 Z X 1/3/2010 5 10
102 A B 1/10/2010 2 10
102 X Y 1/2/2010 1 15
102 Z X 1/3/2010 5 10
I need a query that should return 2 records
101 A B 1/10/2010 2 20
102 A B 1/10/2010 2 10
that is, the max date and the max sequence within it, grouped by id.
Could anyone assist with this?
-----------------------
-- get me my rows...
-----------------------
select * from myTable t
-----------------------
-- limiting them...
-----------------------
inner join
----------------------------------
-- ...by joining to a subselection
----------------------------------
(select m.id, m.date, max(m.sequence) as max_seq from myTable m inner join
----------------------------------------------------
-- first group on id and date to get max-date-per-id
----------------------------------------------------
(select id, max(date) as date from myTable group by id) y
on m.id = y.id and m.date = y.date
group by m.id, m.date) x
on t.id = x.id
and t.date = x.date
and t.sequence = x.max_seq
This would be a simple solution; it does not take account of ties, nor of rows where sequence is NULL.
EDIT: I've added an extra group to first select max-date-per-id, and then join on this to get max-sequence-per-max-date-per-id before joining to the main table to get all columns.
I have assumed your table name is employee; check whether the below helps you.
select emp1.* from employee emp1
join (select id, max(date) as dat from employee group by id) d
on emp1.id = d.id and emp1.date = d.dat
join (select id, date as dat, max(sequence) as seq from employee group by id, date) emp2
on emp1.id = emp2.id and emp1.date = emp2.dat and emp1.sequence = emp2.seq
I'm a fan of using the WITH clause in SELECT statements to organize the different steps. I find that it makes the code easier to read.
WITH max_date(id, max_date)
AS (
SELECT ID, MAX(Date)
FROM my_table
GROUP BY ID
),
max_seq(id, max_seq)
AS (
SELECT t.ID, MAX(t.Sequence)
FROM my_table t
JOIN max_date md ON t.ID = md.id AND t.Date = md.max_date
GROUP BY t.ID
)
SELECT t.*
FROM my_table t
JOIN max_date md ON t.ID = md.id AND t.Date = md.max_date
JOIN max_seq ms ON t.ID = ms.id AND t.Sequence = ms.max_seq;
You should be able to optimize this further as needed.
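As a cross-check, the per-id "latest date, then highest sequence" pick can also be done with a single window function, which DB2 supports as well; a sketch in SQLite via Python's sqlite3 (the dt/seq column names and ISO date strings are assumptions for the sketch, so dates sort correctly as text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (id INT, fname TEXT, lname TEXT, dt TEXT, seq INT, value INT);
INSERT INTO myTable VALUES
  (101,'A','B','2010-01-10',1,10),(101,'A','B','2010-01-10',2,20),
  (101,'X','Y','2010-01-02',1,15),(101,'Z','X','2010-01-03',5,10),
  (102,'A','B','2010-01-10',2,10),(102,'X','Y','2010-01-02',1,15),
  (102,'Z','X','2010-01-03',5,10);
""")

# Rank rows per id by date desc, then sequence desc; keep the top row of each id.
rows = conn.execute("""
    SELECT id, fname, lname, dt, seq, value
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id
                                  ORDER BY dt DESC, seq DESC) AS rn
        FROM myTable t
    )
    WHERE rn = 1
    ORDER BY id
""").fetchall()
print(rows)
# [(101, 'A', 'B', '2010-01-10', 2, 20), (102, 'A', 'B', '2010-01-10', 2, 10)]
```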