T-SQL - how to get around the order by restriction in CTEs - tsql

I have the following CTE. Its purpose is to provide unique Month/Year pairs. Later code will use the CTE to produce a concatenated string list of the Month/Year pairs.
;WITH tblStoredWillsInPeriod AS
(
SELECT DISTINCT Kctc.GetMonthAndYearString(DateWillReceived) Month
FROM Kctc.StoredWills
WHERE DateWillReceived BETWEEN '2010/01/01' AND '2010/03/31'
ORDER BY DateWillReceived
)
I have omitted the implementation of the GetMonthAndYearString function as it is trivial.
Edit: As requested by Martin, here is the surrounding code:
DECLARE @PivotColumnHeaders nvarchar(MAX)
--CTE declaration as above---
SELECT @PivotColumnHeaders =
COALESCE(
@PivotColumnHeaders + ',[' + Month + ']',
'[' + Month + ']'
)
FROM tblStoredWillsInPeriod
SELECT @PivotColumnHeaders
Sadly, it seems T-SQL is always one step ahead. When I run this code, it tells me I'm not allowed to use ORDER BY in a CTE unless I also use TOP (or FOR XML, whatever that is.) If I use TOP, it tells me I can't use it with DISTINCT. Yup, T-SQL has all the answers.
Can anyone think of a solution to this problem which is quicker than simply slashing my wrists? I understand that death from blood loss can be surprisingly lingering, and I have deadlines to meet.
Thanks for your help.
David

Will this work?
DECLARE @PivotColumnHeaders VARCHAR(MAX)
;WITH StoredWills AS
(
SELECT GETDATE() AS DateWillReceived
UNION ALL
SELECT '2010-03-14 11:48:07.580'
UNION ALL
SELECT '2010-03-12 11:48:07.580'
UNION ALL
SELECT '2010-02-12 11:48:07.580'
),
tblStoredWillsInPeriod AS
(
SELECT DISTINCT STUFF(RIGHT(convert(VARCHAR, DateWillReceived, 106),8), 4, 1, '-') AS MMMYYYY,
DatePart(Year,DateWillReceived) AS Year,
DatePart(Month,DateWillReceived) AS Month
FROM StoredWills
WHERE DateWillReceived BETWEEN '2010-01-01' AND '2010-03-31'
)
SELECT @PivotColumnHeaders =
COALESCE(
@PivotColumnHeaders + ',[' + MMMYYYY + ']',
'[' + MMMYYYY + ']'
)
FROM tblStoredWillsInPeriod
ORDER BY Year, Month
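The idea in the answer above is to keep a sortable key (year, month) alongside the display string and to apply the ordering in the outer query rather than inside the CTE. As a language-neutral illustration, here is a minimal Python sketch of the same idea. The sample dates and the MONTHS list are made up for the example and are not part of the original schema.

```python
from datetime import date

# Hypothetical sample dates standing in for StoredWills.DateWillReceived.
received = [
    date(2010, 3, 14),
    date(2010, 3, 12),
    date(2010, 2, 12),
    date(2010, 1, 5),
]

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# De-duplicate on the sortable (year, month) key, not on the display string.
pairs = sorted({(d.year, d.month) for d in received})

# Format only after sorting, so the order is chronological, not alphabetical.
headers = ",".join("[%s-%d]" % (MONTHS[m - 1], y) for y, m in pairs)
print(headers)
```

Sorting on the formatted string instead would put Feb before Jan, which is why the answer carries Year and Month as separate columns.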

Could you clarify why you need the data in the CTE to be ordered, and why you are not able to order the data in the query that uses the CTE? Remember that data in an ordinary subquery can't be ordered either.

What about?
;WITH tblStoredWillsInPeriod AS
(
SELECT DISTINCT Kctc.GetMonthAndYearString(DateWillReceived) Month
FROM Kctc.StoredWills
WHERE DateWillReceived BETWEEN '2010/01/01' AND '2010/03/31'
),
tblStoredWillsInPeriodOrdered AS
(
SELECT TOP 100 PERCENT Month
FROM tblStoredWillsInPeriod
ORDER BY Month
)

And you think you know T-SQL syntax!
Turns out I was wrong about not being able to use TOP and DISTINCT together.
This yields a syntax error...
SELECT TOP 100 PERCENT DISTINCT...
whereas this is absolutely fine...
SELECT DISTINCT TOP 100 PERCENT...
Work that one out.
One drawback is that you have to include the ORDER BY field in the SELECT list, which in all likelihood will interfere with your expected DISTINCT results. Sometimes T-SQL has you running around in circles.
But for now, my wrists are left unmarked.

SELECT DISTINCT TOP 100 PERCENT ...
ORDER BY ...

Related

T-SQL apply where clause to function fields not working

I'm having a hard time filtering this view by CreateDate. The CreateDate in the table is in the following format: 2013-10-14 15:53:33.900
I managed to DATEPART the year, month and day into separate columns, but now it's not letting me use my WHERE clause on those newly created columns. Specifically, the error is "Invalid Column Name CreateYear" for both lines. What am I doing wrong here, guys? Is there a better/easier way to do this than parsing out the day, month, and year? It seems like overkill. I've spent quite a few hours on this to no avail.
SELECT convert(varchar, DATEPART(month,v.CreateDate)) CreateMonth,
convert(varchar, DATEPART(DAY,v.CreateDate)) CreateDay,
convert(varchar, DATEPART(YEAR,v.CreateDate)) CreateYear,
v.CreateDate,
v.customerName
From
vw_Name_SQL_DailyPartsUsage v
full outer join
ABC.serviceteamstechnicians t on v.TechnicianNumber = t.AgentNumber
full outer join
ABC.ServiceTeams s on t.STID = s.STID
where
CreateYear >= '02/01/2018'
and
CreateYear <= '02/20/2018'
You cannot reference an alias from the SELECT list in the WHERE clause.
Even if you could, why would you expect the year to be '02/01/2018'?
Why are you converting to varchar?
where year(v.CreateDate) = 2018
or
select crdate, cast(crdate as date), year(crdate), month(crdate), day(crdate)
from sysObjects
where cast(crdate as date) <= '2014-2-20'
and cast(crdate as date) >= '2000-2-10'
order by crdate
You could use:
SELECT convert(varchar, DATEPART(month,v.CreateDate)) CreateMonth,
convert(varchar, DATEPART(DAY,v.CreateDate)) CreateDay,
convert(varchar, DATEPART(YEAR,v.CreateDate)) CreateYear,
v.CreateDate,
v.customerName
From vw_Name_SQL_DailyPartsUsage v
full outer join
ABC.serviceteamstechnicians t on v.TechnicianNumber = t.AgentNumber
full outer join
ABC.ServiceTeams s on t.STID = s.STID
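where v.CreateDate >= '20180201' and v.CreateDate < '20180221';
The reason lies in logical query processing order: the WHERE clause is evaluated before the SELECT list, so you cannot refer to a SELECT column alias in WHERE without wrapping the query in a subquery or using CROSS APPLY. Filtering on the original CreateDate column avoids the problem, and the half-open range keeps every row from 20 February regardless of its time of day.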
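To illustrate filtering on the raw date column instead of on derived year/month/day aliases, here is a small sketch using Python's built-in sqlite3 module. The table name parts_usage and its rows are hypothetical, and SQLite stands in for SQL Server only for illustration (SQLite is more lenient about aliases and stores these dates as text), so treat it as a sketch of the range logic rather than of T-SQL behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the view in the question.
conn.execute("CREATE TABLE parts_usage (customerName TEXT, CreateDate TEXT)")
conn.executemany(
    "INSERT INTO parts_usage VALUES (?, ?)",
    [
        ("A", "2018-01-31 23:59:59.900"),  # before the range
        ("B", "2018-02-01 00:00:00.000"),  # first instant of the range
        ("C", "2018-02-20 15:53:33.900"),  # afternoon of the last day
        ("D", "2018-02-21 00:00:00.000"),  # just after the range
    ],
)

# Half-open range on the original column: >= start, < day after the end.
rows = conn.execute(
    """
    SELECT customerName
    FROM parts_usage
    WHERE CreateDate >= '2018-02-01'
      AND CreateDate <  '2018-02-21'
    ORDER BY CreateDate
    """
).fetchall()
print(rows)
```

The upper bound is the day after the last wanted day, so a row stamped on the afternoon of 20 February is still included.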

multiple extract() with WHERE clause possible?

So far I have come up with the below:
WHERE (extract(month FROM orderdate)) =
(SELECT min(extract(month from orderdate))
FROM orders)
However, that will consequently return zero to many rows, and in my case, many, because many orders exist within that same earliest (minimum) month, i.e. 4th February, 9th February, 15th Feb, ...
I know that a WHERE clause can contain multiple columns, so why wouldn't the below work?
WHERE (extract(day FROM orderdate)), (extract(month FROM orderdate)) =
(SELECT min(extract(day from orderdate)), min(extract(month FROM orderdate))
FROM orders)
I simply get: SQL Error: ORA-00920: invalid relational operator
Any help would be great, thank you!
Sample data:
02-Feb-2012
14-Feb-2012
22-Dec-2012
09-Feb-2013
18-Jul-2013
01-Jan-2014
Output:
02-Feb-2012
14-Feb-2012
Desired output:
02-Feb-2012
I recreated your table and found out you just messed up the brackets a bit. The following works for me:
where
(extract(day from OrderDate),extract(month from OrderDate))
=
(select
min(extract(day from OrderDate)),
min(extract(month from OrderDate))
from orders
)
Use something like this:
with cte1 as (
select
extract(month from OrderDate) date_month,
extract(day from OrderDate) date_day,
OrderNo
from tablename
), cte2 as (
select min(date_month) min_date_month, min(date_day) min_date_day
from cte1
)
select cte1.*
from cte1
where (date_month, date_day) = (select min_date_month, min_date_day from cte2)
A common table expression enables you to restructure your data and then run your select against the restructured result. The first CTE block (cte1) selects the month and the day for each of your table rows. cte2 then selects min(month) and min(day). The last select combines both CTEs to return all rows from cte1 that have the desired month and day.
There is probably a shorter solution, but I like common table expressions because they are almost always easier to understand than the "optimal, shortest" query.
If that is really what you want, as bizarre as it seems, then as a different approach you could forget the extracts and the subquery against the table to get the minimums, and use an analytic approach instead:
select orderdate
from (
select o.*,
row_number() over (order by to_char(orderdate, 'MMDD')) as rn
from orders o
)
where rn = 1;
ORDERDATE
---------
01-JAN-14
The row_number() effectively adds a pseudo-column to every row in your original table, based on the month and day in the order date. The rn values are unique, so there will be one row marked as 1, which will be from the earliest day in the earliest month. If you have multiple orders with the same day/month, say 01-Jan-2013 and 01-Jan-2014, then you'll still only get exactly one with rn = 1, but which is picked is indeterminate. You'd need to add further order by conditions to make it deterministic, but I have no idea what you might want.
That is done in the inner query; the outer query then filters so that only the record marked with rn = 1 is returned, so you get exactly one row back from the overall query.
This also avoids the situation where the earliest day number is not in the earliest month number - say if you only had 02-Jan-2014 and 01-Feb-2014; comparing the day and month separately would look for 01-Jan-2014, which doesn't exist.
SQL Fiddle (with Thomas Tschernich's answer thrown in too, giving the same result for this data).
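The difference between taking the minimum day and minimum month separately and ordering by month and day together can be sketched in a few lines of Python. The two dates below are hypothetical, chosen so that the smallest day number does not fall in the smallest month.

```python
from datetime import date

# Hypothetical data: smallest day (1) is in February, smallest month is January.
orders = [date(2014, 1, 2), date(2014, 2, 1)]

# Separate minima, as in the tuple comparison against min(day), min(month).
min_month = min(d.month for d in orders)
min_day = min(d.day for d in orders)
separate = [d for d in orders if (d.month, d.day) == (min_month, min_day)]

# Combined ordering, as in row_number() over (order by to_char(orderdate, 'MMDD')).
first = sorted(orders, key=lambda d: (d.month, d.day))[0]

print(separate)  # no order falls on month 1, day 1
print(first)
```

With the separate minima the filter looks for month 1, day 1 and matches nothing, while the combined ordering still returns the 2 January order.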
To join the result against your invoice table, you don't need to join to the orders table again - especially not with a cross join, which is skewing your results. You can do the join (at least) two ways:
SELECT
o.orderno,
to_char(o.orderdate, 'DD-MM-YYYY'),
i.invno
FROM
(
SELECT o.*,
row_number() over (order by to_char(orderdate, 'MMDD')) as rn
FROM orders o
) o, invoices i
WHERE i.invno = o.invno
AND rn = 1;
Or:
SELECT
o.orderno,
to_char(o.orderdate, 'DD-MM-YYYY'),
i.invno
FROM
(
SELECT orderno, orderdate, invno
FROM
(
SELECT o.*,
row_number() over (order by to_char(orderdate, 'MMDD')) as rn
FROM orders o
)
WHERE rn = 1
) o, invoices i
WHERE i.invno = o.invno;
The first looks like it does more work but the execution plans are the same.
SQL Fiddle with your pastebin-supplied query that gets two rows back, and these two that get one.

In Firebird, how to aggregate the first N rows?

I would like to do something like this:
CNT=2;
//[edit]
select avg(price) from (
select first :CNT p.Price
from Price p
order by p.Date desc
);
This does not work, Firebird does not allow :cnt as a parameter to FIRST. I need to average the first CNT newest prices. The number 2 changes so it can not be hard-coded.
This can be broken out into a FOR SELECT loop and break when a count is reached. Is that the best way though? Can this be done in a single SQL statement?
Creating the SQL as a string and running it is not the best fit either. It is important that the database compile my SQL statement.
You don't have to use CTE, you can do it directly:
select avg(price) from (
select first :cnt p.Price
from Price p
order by p.Date desc
);
You can use a CTE (Common Table Expression) (see http://www.firebirdsql.org/refdocs/langrefupd21-select.html#langrefupd21-select-cte) to select the rows before calculating the average.
See example below:
with query1 as (
select first 2 p.Price
from Price p
order by p.Date desc
)
select avg(price) from query1
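For comparison, the "average of the N newest rows, with N supplied as a parameter" pattern can be sketched with Python's built-in sqlite3 module. This is SQLite rather than Firebird, so it uses LIMIT instead of FIRST, and the price table, its price_date column and the avg_newest helper are all hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; price_date stands in for the Date column in the question.
conn.execute("CREATE TABLE price (price REAL, price_date TEXT)")
conn.executemany(
    "INSERT INTO price VALUES (?, ?)",
    [(10.0, "2024-01-01"), (20.0, "2024-01-02"), (40.0, "2024-01-03")],
)

def avg_newest(cnt):
    """Average the cnt most recent prices; cnt is bound as a query parameter."""
    row = conn.execute(
        """
        SELECT AVG(price)
        FROM (SELECT price FROM price ORDER BY price_date DESC LIMIT ?)
        """,
        (cnt,),
    ).fetchone()
    return row[0]

print(avg_newest(2))
```

The statement text stays constant and only the bound value changes, which is the property the question asks for.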

How do I replace a SSN with a 9 digit random number in SQL Server 2008R2?

To satisfy security requirements, I need to find a way to replace SSNs with unique, random 9-digit numbers before providing said database to a developer. The SSN is in a column in a table of a database. There may be tens of thousands of rows in said table. The number does not need hyphens. I am a beginner with SQL and programming in general.
I have been unable to find a solution for my specific needs. Nothing seems quite right. But if you know of a thread that I have missed, please let me know.
Thanks for any help!
Here is one way.
I'm assuming that you already have a backup of the real data as this update is not reversible.
Below I've assumed your table name is Person with your ssn column named SSN.
UPDATE Person SET
SSN = CAST(LEFT(CAST(ABS(CAST(CAST(NEWID() as BINARY(10)) as int)) as varchar(max)) + '00000000',9) as int)
If they do not have to be random, you could just replace them with ascending numeric values. Failing that, you'd have to generate a random number. As you may have discovered, the RAND function will only generate a single value per query statement (select, update, etc.); the work-around is the newid() function, which generates a GUID for each row produced by a query (run SELECT newid() from MyTable to see how this works). Wrap this in a checksum() to generate an integer; take it modulo 1,000,000,000 to get a value within the SSN range (0 to 999,999,999); and, assuming you're storing it as a char(9), prefix it with leading zeros.
Next trick is ensuring it’s unique for all values in your table. This gets tricky, and I’d do it by setting up a temp table with the values, populating it, then copying them over. Lessee now…
DECLARE @DummySSN as table
(
PrimaryKey int not null
,NewSSN char(9) not null
)
-- Load initial values
INSERT @DummySSN
select
UserId
,right('000000000' + cast(abs(checksum(newid()))%1000000000 as varchar(9)), 9)
from Users
-- Check for dups
select NewSSN from @DummySSN group by NewSSN having count(*) > 1
-- Loop until values are unique
WHILE exists (SELECT 1 from @DummySSN group by NewSSN having count(*) > 1)
UPDATE @DummySSN
set NewSSN = right('000000000' + cast(abs(checksum(newid()))%1000000000 as varchar(9)), 9)
where NewSSN in (select NewSSN from @DummySSN group by NewSSN having count(*) > 1)
-- Check for dups
select NewSSN from @DummySSN group by NewSSN having count(*) > 1
This works for a small table I have, and it should work for a large one. I don't see this turning into an infinite loop, but even so you might want to add a check to exit the loop after, say, 10 iterations.
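The generate, check for duplicates, and re-draw approach described above can also be sketched outside the database. The following Python sketch is only an illustration of the logic; the function name random_ssn_map and the use of integer keys are assumptions made for the example, not part of the original answer.

```python
import secrets

def random_ssn_map(keys):
    """Assign each key a distinct, zero-padded, random 9-digit string."""
    used = set()
    result = {}
    for key in keys:
        candidate = "%09d" % secrets.randbelow(1_000_000_000)
        # Re-draw on collision so every assigned value is unique.
        while candidate in used:
            candidate = "%09d" % secrets.randbelow(1_000_000_000)
        used.add(candidate)
        result[key] = candidate
    return result

# Hypothetical primary keys 0..9999 standing in for the table's rows.
mapping = random_ssn_map(range(10_000))
print(len(mapping), len(set(mapping.values())))
```

Checking each candidate against the set of values already used removes the need for a separate pass over duplicates, at the cost of holding all assigned values in memory.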
I've run a couple million tests in this and it seems to generate random (URN) 9 digit numbers (no leading zeros).
I cannot think of a more efficient way to do this.
SELECT CAST(FLOOR(RAND(CHECKSUM(NEWID())) * 900000000 ) + 100000000 AS BIGINT)
The test used;
;WITH Fn(N) AS
(
SELECT CAST(FLOOR(RAND(CHECKSUM(NEWID())) * 900000000 ) + 100000000 AS BIGINT)
UNION ALL
SELECT CAST(FLOOR(RAND(CHECKSUM(NEWID())) * 900000000 ) + 100000000 AS BIGINT)
FROM Fn
)
,Tester AS
(
SELECT TOP 5000000 *
FROM Fn
)
SELECT LEN(MIN(N))
,LEN(MAX(N))
,MIN(N)
,MAX(N)
FROM Tester
OPTION (MAXRECURSION 0)
Not so fast, but easiest... I added some dots...
DECLARE @tr NVARCHAR(40)
SET @tr = CAST(ROUND((888*RAND()+111),0) AS CHAR(3)) + '.' +
CAST(ROUND((8888*RAND()+1111),0) AS CHAR(4)) + '.' + CAST(ROUND((8888*RAND()+1111),0) AS
CHAR(4)) + '.' + CAST(ROUND((88*RAND()+11),0) AS CHAR(2))
PRINT @tr
If the requirement is to obfuscate a database then this will return the same unique value for each distinct SSN in any table preserving referential integrity in the output without having to do a lookup and translate.
SELECT CAST(RAND(SSN)*999999999 AS INT)

counting all occurrences in the last year

I have a question, although I can't really go into specifics.
Will the following query:
SELECT DISTINCT tableOuter.Property, (SELECT COUNT(ID) FROM table AS tableInner WHERE tableInner.Property = tableOuter.Property)
FROM table AS tableOuter
WHERE tableOuter.DateTime > DATEADD(year, -1, GETDATE())
AND tableOuter.Property IN (
...
)
Select one instance of each property in the IN clause, together with how often a row with that property occurred in the last year?
I just read up on Correlated Subqueries on MSDN, but am not sure if I got it right.
If I understand you correctly, you want to count the occurrences of each Property in the last year, am I right?
Then use GROUP BY, with the row filters in the WHERE clause. (A HAVING clause cannot test DateTime here, because DateTime is neither grouped nor aggregated.)
SELECT tableOuter.Property, COUNT(*) AS Count
FROM table AS tableOuter
WHERE tableOuter.DateTime > DATEADD(year, -1, GETDATE())
AND tableOuter.Property IN ( .... )
GROUP BY tableOuter.Property
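As a runnable illustration of counting rows per property within the last year, here is a sketch using Python's built-in sqlite3 module. The events table, its columns and the sample rows are hypothetical, and SQLite's datetime('now', ...) stands in for DATEADD/GETDATE. The date filter is applied to individual rows before grouping, so only rows from the last year are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the one in the question.
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, property TEXT, happened_at TEXT)"
)
conn.executemany(
    "INSERT INTO events (property, happened_at) VALUES (?, datetime('now', ?))",
    [
        ("a", "-1 month"),
        ("a", "-2 months"),
        ("a", "-3 years"),   # outside the window, must not be counted
        ("b", "-10 days"),
        ("c", "-5 days"),    # recent, but not in the IN list
    ],
)

rows = conn.execute(
    """
    SELECT property, COUNT(*) AS occurrences
    FROM events
    WHERE happened_at > datetime('now', '-1 year')
      AND property IN ('a', 'b')
    GROUP BY property
    ORDER BY property
    """
).fetchall()
print(rows)
```

By contrast, the correlated subquery in the question has no date condition of its own, so it would count every row for a property, including ones older than a year.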