I have data in a table that looks like this:
select email_body from email_table
email_body
----------
Ashely, call me. thanks --- Original message --- From: Ashley To: Lee Subject: Homework Sent: 3/6/2017 2:32:54 AM GMT I have a quick question.
Mike, I have all the data you need. Jim --- Original message --- From: Mike To: Jim Subject: Not Sure Sent: 3/18/2017 12:02:51 AM GMT Are you available to go over this?
William, Are you around. thanks --- Original message --- From: Joe To: William Subject: Nothing much Sent: 4/16/2017 4:17:23 PM GMT I need some sleep.
Joan, call me. Ralph --- Original message --- From: Ralph To: Joan Subject: I need help Sent: 3/30/2017 5:12:50 AM GMT Call Rich.
I would like to just return the date and time listed in the email_body:
Results:
Original_message
----------------
3/6/2017 2:32:54 AM
3/18/2017 12:02:51 AM
4/16/2017 4:17:23 PM
3/30/2017 5:12:50 AM
Because the date length can vary (for example 1/1/2017 vs 12/12/2017), you need to find where it starts (that's easy: 6 chars after the start of Sent:) and where it ends (at the M of M GMT, which is the last character of AM/PM).
The rest is done with SUBSTRING, whose third argument is a length: the end position minus the start position plus 1.
SELECT SUBSTRING(email_body, CHARINDEX('Sent:', email_body) + 6,
CHARINDEX('M GMT', email_body) - CHARINDEX('Sent:', email_body) - 5) as Original_message
FROM email_table
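The position arithmetic can be checked outside SQL Server. Here is a minimal sketch in Python using the built-in sqlite3 module, where instr and substr stand in for CHARINDEX and SUBSTRING (both are 1-based, like their T-SQL counterparts). The in-memory table and its id column are assumptions made only for this demo.

```python
import sqlite3

bodies = [
    "Ashely, call me. thanks --- Original message --- From: Ashley To: Lee Subject: Homework Sent: 3/6/2017 2:32:54 AM GMT I have a quick question.",
    "Mike, I have all the data you need. Jim --- Original message --- From: Mike To: Jim Subject: Not Sure Sent: 3/18/2017 12:02:51 AM GMT Are you available to go over this?",
    "William, Are you around. thanks --- Original message --- From: Joe To: William Subject: Nothing much Sent: 4/16/2017 4:17:23 PM GMT I need some sleep.",
    "Joan, call me. Ralph --- Original message --- From: Ralph To: Joan Subject: I need help Sent: 3/30/2017 5:12:50 AM GMT Call Rich.",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_table (id INTEGER PRIMARY KEY, email_body TEXT)")
conn.executemany("INSERT INTO email_table (email_body) VALUES (?)", [(b,) for b in bodies])

# start  = position of 'Sent:' + 6 (skips 'Sent:' and the following space)
# length = position of 'M GMT' - start + 1 = instr('M GMT') - instr('Sent:') - 5
query = """
SELECT substr(email_body,
              instr(email_body, 'Sent:') + 6,
              instr(email_body, 'M GMT') - instr(email_body, 'Sent:') - 5)
FROM email_table
ORDER BY id
"""
sent_values = [row[0] for row in conn.execute(query)]
for value in sent_values:
    print(value)
```

This only holds as long as 'Sent:' and 'M GMT' each appear once per body and in that order.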
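You could use something like this, provided every date has the same length (PATINDEX needs % wildcards around the pattern, and a fixed length of 21 returns trailing characters for shorter dates):
select SUBSTRING(email_body, PATINDEX('%Sent: %', email_body) + 6, 21) from email_table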
If you are open to a Table-Valued-Function which will quickly and safely extract one or even many values based on a pattern, AND, you are tired of all the required string manipulations, consider the following:
Example
Declare @email_table table (id int,email_body varchar(max))
Insert Into @email_table values
(1,'Ashely, call me. thanks --- Original message --- From: Ashley To: Lee Subject: Homework Sent: 3/6/2017 2:32:54 AM GMT I have a quick question.'),
(2,'Mike, I have all the data you need. Jim --- Original message --- From: Mike To: Jim Subject: Not Sure Sent: 3/18/2017 12:02:51 AM GMT Are you available to go over this?'),
(3,'William, Are you around. thanks --- Original message --- From: Joe To: William Subject: Nothing much Sent: 4/16/2017 4:17:23 PM GMT I need some sleep.'),
(4,'Joan, call me. Ralph --- Original message --- From: Ralph To: Joan Subject: I need help which was Sent: 3/30/2017 5:12:50 AM GMT Call Rich.')
Select A.ID
,Original_message = B.RetVal
From @email_table A
Cross Apply [dbo].[udf-Str-Extract](A.email_body,' Sent: ',' GMT ') B
Returns
ID Original_message
1 3/6/2017 2:32:54 AM
2 3/18/2017 12:02:51 AM
3 4/16/2017 4:17:23 PM
4 3/30/2017 5:12:50 AM
The actual TVF returns
RetSeq RetPos RetLen RetVal
1 101 19 3/6/2017 2:32:54 AM
1 115 21 3/18/2017 12:02:51 AM
1 114 20 4/16/2017 4:17:23 PM
1 111 20 3/30/2017 5:12:50 AM
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By N)
,RetPos = N
,RetLen = charindex(@Delimiter2,RetVal)-1
,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1)
From (Select A.N,RetVal = ltrim(rtrim(Substring(@String, A.N, A.L))) From cte4 A ) A
Where charindex(@Delimiter2,RetVal)>1
)
/*
Max Length of String 1MM characters
Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[udf-Str-Extract] (@String,'[[',']]')
*/
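For readers who want to see the idea without the tally-table machinery, here is a rough Python analogue. It is not a port of the function: extract_between is a hypothetical helper, it returns only the values (no RetSeq, RetPos or RetLen), and it ignores any text before the first start delimiter.

```python
def extract_between(text, start, end):
    """Return each value found between a start delimiter and the next end delimiter."""
    results = []
    pos = text.find(start)
    while pos != -1:
        begin = pos + len(start)
        nxt = text.find(start, begin)  # a segment stops at the next start delimiter
        segment = text[begin:nxt if nxt != -1 else len(text)].strip()
        cut = segment.find(end)
        if cut > 0:  # keep only segments that contain a non-empty value before the end delimiter
            results.append(segment[:cut])
        pos = nxt
    return results


placeholders = extract_between("Dear [[FirstName]] [[LastName]], ...", "[[", "]]")
sent = extract_between(
    "Joan, call me. Ralph --- Original message --- From: Ralph To: Joan "
    "Subject: I need help which was Sent: 3/30/2017 5:12:50 AM GMT Call Rich.",
    " Sent: ", " GMT ")
print(placeholders)
print(sent)
```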
I have a table with a list of individuals that have an effective date and a termination date.
Example: Person 1, 20171201, 20180601
For each record, I need to output a list of years and months they were "active" between the two dates.
So the output would look like
Data Output
This is in SQL Server 2016
Any help would be appreciated!
Just because I did not see it offered, here is yet another approach, which uses an ad-hoc calendar table.
Note the base date of 2000-01-01 and 10,000 days; expand or contract if needed.
Example
Declare @YourTable table (Person int,startdate date, enddate date)
Insert Into @YourTable values
(1,'20171201','20180601')
Select Distinct
A.Person
,ActiveYear = year(D)
,ActiveMonth = month(D)
From @YourTable A
Join (
Select Top 10000 D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),'2000-01-01')
From master..spt_values n1,master..spt_values n2
) B on D between startdate and enddate
Returns
Person ActiveYear ActiveMonth
1 2017 12
1 2018 1
1 2018 2
1 2018 3
1 2018 4
1 2018 5
1 2018 6
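The expected rows can also be derived with plain month arithmetic, which is handy for checking any of the SQL variants. This is a minimal sketch in Python; active_months is a hypothetical helper, and it assumes both the start month and the end month count as active, as in the output above.

```python
from datetime import date


def active_months(start, end):
    """List (year, month) pairs from the start month through the end month, inclusive."""
    first = start.year * 12 + (start.month - 1)  # months counted from year 0
    last = end.year * 12 + (end.month - 1)
    return [(index // 12, index % 12 + 1) for index in range(first, last + 1)]


months = active_months(date(2017, 12, 1), date(2018, 6, 1))
for year, month in months:
    print(1, year, month)
```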
Based on info you've provided, I did a little guessing. Perhaps a recursive CTE will help:
DECLARE @tab TABLE (Person INT, EffectiveDate DATE, TerminationDate DATE)
INSERT @tab VALUES
(1, '2017-12-01', '2018-05-31'),
(2, '2017-10-01', '2018-01-01'),
(3, '2018-02-01', '2018-12-01')
;WITH t AS (
SELECT Person, EffectiveDate AS Dt
FROM @tab
UNION ALL
SELECT Person, DATEADD(mm,1,Dt)
FROM t
WHERE t.Dt < (SELECT DATEADD(mm,-1,TerminationDate) FROM @tab tt WHERE tt.Person = t.Person)
)
SELECT *
FROM t
ORDER BY Dt
Then take the DATEPART()
SELECT t.Person
, t.Dt
, DATEPART(yyyy, t.Dt) ActiveYear
, DATEPART(mm, t.Dt) ActiveMonth
FROM t
ORDER BY Dt
Welcome to Stack Overflow! For better help, it's good to include DDL and easily consumable sample data that we can use to quickly re-create what you are doing and provide a solution. Note this sample data:
DECLARE @yourtable TABLE(person VARCHAR(20), date1 DATE, date2 DATE)
INSERT @yourtable (person, date1, date2)
VALUES ('Person1','20171201', '20180601'), ('Person2','20171001', '20180101'),
('Person3','20180101', '20180301');
SELECT t.* FROM @yourtable AS t;
Returns:
person date1 date2
-------------------- ---------- ----------
Person1 2017-12-01 2018-06-01
Person2 2017-10-01 2018-01-01
Person3 2018-01-01 2018-03-01
(I added a couple rows). Here's my solution:
DECLARE @yourtable TABLE(person VARCHAR(20), date1 DATE, date2 DATE)
INSERT @yourtable (person, date1, date2)
VALUES ('Person1','20171201', '20180601'), ('Person2','20171001', '20180101'),
('Person3','20180101', '20180301');
SELECT
Person = t.person,
ActiveYear = YEAR(st.DT),
ActiveMonth = MONTH(st.Dt)
FROM @yourtable AS t
CROSS APPLY
(
SELECT TOP (DATEDIFF(MONTH,t.date1,t.date2)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS a(x)
CROSS JOIN (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS b(x)
) AS iTally(N)
CROSS APPLY (VALUES(DATEADD(MONTH, iTally.N-1, t.date1))) AS st(Dt);
Which returns:
Person ActiveYear ActiveMonth
-------------------- ----------- -----------
Person1 2017 12
Person1 2018 1
Person1 2018 2
Person1 2018 3
Person1 2018 4
Person1 2018 5
Person2 2017 10
Person2 2017 11
Person2 2017 12
Person3 2018 1
Person3 2018 2
Let me know if you have questions.
I would like to remove duplicate rows based on event_dates and case_ids.
I have a query that looks like this (the query is much longer, this is just to show the problem):
SELECT
event_date,
event_id,
event_owner
FROM eventtable
This gives me results such as the following:
event_date event_id event_owner
2018-02-06 00:00:00 123456 UNASSIGNED
2018-02-07 00:00:00 123456 UNASSIGNED
2018-02-07 00:00:00 123456 Mickey Mouse
2018-02-08 00:00:00 123456 Mickey Mouse
2018-02-09 00:00:00 123456 Minnie Mouse
2018-02-10 00:00:00 123456 Minnie Mouse
2018-02-11 00:00:00 123456 Mickey Mouse
.
.
.
Problem:
I have duplicate entries on 2018-02-07. I would like to have only the second one to remain.
So the result should be this:
event_date event_id event_owner
2018-02-06 00:00:00 123456 UNASSIGNED
2018-02-07 00:00:00 123456 Mickey Mouse
2018-02-08 00:00:00 123456 Mickey Mouse
2018-02-09 00:00:00 123456 Minnie Mouse
2018-02-10 00:00:00 123456 Minnie Mouse
2018-02-11 00:00:00 123456 Mickey Mouse
.
.
.
I've tried to use SELECT DISTINCT ... , but that gives back all the results since it takes all 3 columns into consideration, and in that sense all rows are unique. I only want to apply DISTINCT on 2 columns: event_date and event_id.
Should I use nested sub-queries? Or where lies the truth? All help is much appreciated.
You can use the ROW_NUMBER analytic function for this purpose, but you should clarify the order when you say "I would like to have only the second one to remain". That order doesn't exist in the data, so you need to generate it yourself.
Try this query:
select event_date, event_id, event_owner
from (
select
row_number() over (partition by event_date, event_id order by case when event_owner='UNASSIGNED' then 0 else 1 end desc) as rn,
*
from eventtable
) t
where rn=1
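To try the idea without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The sample rows are taken from the question. The rule "an assigned owner beats UNASSIGNED" is an assumption about the intended order, and the ORDER BY expression is written ascending here for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eventtable (event_date TEXT, event_id INTEGER, event_owner TEXT)")
conn.executemany("INSERT INTO eventtable VALUES (?, ?, ?)", [
    ("2018-02-06 00:00:00", 123456, "UNASSIGNED"),
    ("2018-02-07 00:00:00", 123456, "UNASSIGNED"),
    ("2018-02-07 00:00:00", 123456, "Mickey Mouse"),
    ("2018-02-08 00:00:00", 123456, "Mickey Mouse"),
])

# Rank rows within each (event_date, event_id) pair so that an assigned owner
# sorts before UNASSIGNED, then keep only the first row of every pair.
deduped = conn.execute("""
    SELECT event_date, event_id, event_owner
    FROM (
        SELECT event_date, event_id, event_owner,
               ROW_NUMBER() OVER (
                   PARTITION BY event_date, event_id
                   ORDER BY CASE WHEN event_owner = 'UNASSIGNED' THEN 1 ELSE 0 END
               ) AS rn
        FROM eventtable
    ) AS t
    WHERE rn = 1
    ORDER BY event_date
""").fetchall()

for row in deduped:
    print(row)
```

If two assigned owners share the same date and id, the winner is arbitrary unless another tie-breaker column is added to the ORDER BY.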
I have 3 columns. SSN|AccountNumber|OpenDate
1 SSN may have multiple AccountNumbers
Each AccountNumber has a corresponding OpenDate
In my list I have many SSN's, each containing several account numbers which may have been opened on different days.
I want the results of my query to be SSN|earliest OpenDate|AccountNumber that corresponds with the earliest OpenDate.
I'm dealing with about 200,000 records.
EDIT: First I did
select SSN, min(OpenDate), AcctNumber from Table Group By SSN, AccountNumber
but that didn't quite give me the correct data.
The raw data gives me something like this:
SSN | AcctNumber | OpenDate
---------------------------
10 101 Jan
10 102 Feb
10 103 Mar
Here I got 10, Jan, and AcctNumber 102, which is not the account number associated with the Jan OpenDate. After looking at others, I found that the account number I got was just one of the account numbers associated with that SSN rather than the one that corresponds with the min(OpenDate).
WITH CTE AS (
SELECT SSN, AcctNumber, OpenDate,
ROW_NUMBER() OVER (PARTITION BY SSN ORDER BY OpenDate ASC) AS RN
FROM [Table] -- placeholder for your table name
)
SELECT SSN, AcctNumber, OpenDate FROM CTE WHERE RN = 1;
If your table is like this:
SSN | AcctNumber | OpenDate
---------------------------
10 101 April
10 101 May
10 102 April
20 201 June
20 201 July
Do you want your query to return this?
SSN | AcctNumber | OpenDate
---------------------------
10 101 April
10 102 April
20 201 June
Then you would use this query:
select ssn, min(OpenDate), acctNumber from tbl group by ssn, acctNumber
You can try this:
select SSN , AcctNumber, OpenDate
from (SELECT SSN , AcctNumber, OpenDate
, ROW_NUMBER() OVER ( PARTITION BY SSN ORDER BY OpenDate ASC ) AS RN
FROM [table]) AS temp
WHERE temp.RN= 1
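The difference between grouping by account and ranking within an SSN is easy to see side by side. This sketch uses Python's built-in sqlite3 module (SQLite 3.25 or later for ROW_NUMBER). The accounts table name and the ISO-formatted dates are assumptions for the demo; month names such as Jan would sort alphabetically rather than by date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (SSN INTEGER, AcctNumber INTEGER, OpenDate TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", [
    (10, 101, "2017-01-15"), (10, 102, "2017-02-15"), (10, 103, "2017-03-15"),
    (20, 201, "2017-06-01"), (20, 202, "2017-05-01"),
])

# Grouping by SSN and AcctNumber yields one row per account, not one per SSN.
per_account = conn.execute("""
    SELECT SSN, MIN(OpenDate), AcctNumber
    FROM accounts
    GROUP BY SSN, AcctNumber
    ORDER BY SSN, AcctNumber
""").fetchall()

# Ranking inside each SSN keeps the account number tied to the earliest date.
earliest_per_ssn = conn.execute("""
    SELECT SSN, AcctNumber, OpenDate
    FROM (
        SELECT SSN, AcctNumber, OpenDate,
               ROW_NUMBER() OVER (PARTITION BY SSN ORDER BY OpenDate ASC) AS RN
        FROM accounts
    ) AS ranked
    WHERE RN = 1
    ORDER BY SSN
""").fetchall()

print(len(per_account), "rows when grouping by account")
print(earliest_per_ssn)
```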
using T-SQL, I have the following data:
ID name number q1 q2 q3
--- ----- ------ -- -- --
1 paul 7777 yes no maybe
2 steve 8786 no yes definitely
and I am looking to unpivot it so that it represents:
ID name number question answer
-- ---- ----- -------- ------
1 paul 7777 Q1 yes
1 paul 7777 Q2 no
1 paul 7777 Q3 maybe
2 steve 8786 Q1 no
2 steve 8786 Q2 yes
2 steve 8786 Q3 definitely
so far I have managed to unpivot the id, name, number and question parts, but cannot get the answer to complete accordingly.
I have used:
select [name],[number],[id],[question_number] from (select [name],[number],[id],
[q1],[q2],[q3]) unpivot
(something for [question_number] in ([Q1],[Q2],[Q3])) as unpvt
This is obviously a simplified version of my data, but the requirement is still the same. Can anyone help please?
Thanks.
My first answer :)
Without pivot:
select ID,name,number,'Q1' as question ,Q1 as answer from #yourtable
union all select ID,name,number,'Q2',Q2 from #yourtable
union all select ID,name,number,'Q3',Q3 from #yourtable
Here is the full example:
create table #yourtable (
ID int,
name nvarchar(20),
number int,
q1 nvarchar(20),
q2 nvarchar(20),
q3 nvarchar(20));
insert into #yourtable values(1 ,'paul', 7777,'yes','no','maybe');
insert into #yourtable values(2, 'steve', 8786, 'no', 'yes', 'definitely');
select ID,name,number,'Q1' as question ,Q1 as answer from #yourtable
union all select ID,name,number,'Q2',Q2 from #yourtable
union all select ID,name,number,'Q3',Q3 from #yourtable
This has to be a simple error on my part. I have a table with permits (applicants have one permit); about 600 expired last season and about 900 the season before. I need to generate a mailing list of unique applicants that had permits in the last two seasons.
SELECT COUNT(*) FROM Backyard_Burn WHERE YEAR(Expiration_Date)= 2014
SELECT COUNT(*) FROM Backyard_Burn WHERE YEAR(Expiration_Date)= 2013
SELECT COUNT(*) FROM Backyard_Burn WHERE YEAR(Expiration_Date)= 2013
AND Applicant_Mail_ID NOT IN(
SELECT Applicant_Mail_ID
FROM Backyard_Burn
WHERE YEAR(Expiration_Date)= 2014)
Which returns : 618, 923, and 0
Why 0 and not a number somewhere near 923 - 618 assuming most are repeat applicants?
NOT IN can be dangerous. The problem is probably caused because Applicant_Mail_id takes on NULL values. You can fix this readily with:
SELECT COUNT(*)
FROM Backyard_Burn
WHERE YEAR(Expiration_Date) = 2013 AND
Applicant_Mail_ID NOT IN (SELECT Applicant_Mail_ID
FROM Backyard_Burn
WHERE YEAR(Expiration_Date) = 2014 AND Applicant_Mail_ID IS NOT NULL
);
If any of those values are NULL, then NOT IN can only return FALSE or NULL -- the condition can never allow records through.
For this reason, I think it is better to use NOT EXISTS, which has the semantics you expect when some of the values might be NULL:
SELECT COUNT(*)
FROM Backyard_Burn bb
WHERE YEAR(Expiration_Date) = 2013 AND
NOT EXISTS (SELECT 1
FROM Backyard_Burn bb2
WHERE YEAR(bb2.Expiration_Date) = 2014 AND
bb2.Applicant_Mail_ID = bb.Applicant_Mail_ID
);
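The NULL behaviour is standard SQL, so it can be reproduced in a few lines with Python's built-in sqlite3 module. The five sample rows are made up for the demo, and strftime('%Y', ...) stands in for T-SQL's YEAR() because SQLite has no YEAR function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Backyard_Burn (Applicant_Mail_ID INTEGER, Expiration_Date TEXT)")
conn.executemany("INSERT INTO Backyard_Burn VALUES (?, ?)", [
    (1, "2013-06-30"), (2, "2013-06-30"), (3, "2013-06-30"),
    (1, "2014-06-30"), (None, "2014-06-30"),  # one 2014 row has a NULL id
])


def count(sql):
    """Run a single-value COUNT query and return the number."""
    return conn.execute(sql).fetchone()[0]


# The NULL in the subquery makes every NOT IN test FALSE or NULL.
not_in = count("""
    SELECT COUNT(*) FROM Backyard_Burn
    WHERE strftime('%Y', Expiration_Date) = '2013'
      AND Applicant_Mail_ID NOT IN (
          SELECT Applicant_Mail_ID FROM Backyard_Burn
          WHERE strftime('%Y', Expiration_Date) = '2014')
""")

# Filtering the NULLs out of the subquery restores the expected result.
not_in_filtered = count("""
    SELECT COUNT(*) FROM Backyard_Burn
    WHERE strftime('%Y', Expiration_Date) = '2013'
      AND Applicant_Mail_ID NOT IN (
          SELECT Applicant_Mail_ID FROM Backyard_Burn
          WHERE strftime('%Y', Expiration_Date) = '2014'
            AND Applicant_Mail_ID IS NOT NULL)
""")

# NOT EXISTS is unaffected because NULL never matches in the equality test.
not_exists = count("""
    SELECT COUNT(*) FROM Backyard_Burn bb
    WHERE strftime('%Y', bb.Expiration_Date) = '2013'
      AND NOT EXISTS (
          SELECT 1 FROM Backyard_Burn bb2
          WHERE strftime('%Y', bb2.Expiration_Date) = '2014'
            AND bb2.Applicant_Mail_ID = bb.Applicant_Mail_ID)
""")

print(not_in, not_in_filtered, not_exists)
```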
EDIT:
By the way, an alternative way of formulating this is to use group by and having:
select Applicant_Mail_ID
from Backyard_Burn
group by Applicant_Mail_ID
having sum(case when year(Expiration_Date) = 2013 then 1 else 0 end) > 0 and
sum(case when year(Expiration_Date) = 2014 then 1 else 0 end) > 0;
This avoids the problem with NULLs and makes it easy to add new conditions, such as applicants who did not have any records in 2012.
You need applicants from the last two seasons, so use a greater-than-or-equal comparison.
It's better to compare against a full date than to extract the year with YEAR(), since wrapping the column in a function generally prevents an index seek on Expiration_Date.
To get the unique applicants, you can use DISTINCT.
Which results in:
select count(distinct Applicant_Mail_ID)
from Backyard_Burn
where Expiration_Date >= '20130101';