nested SELECT statements interact in ways that I don't understand - tsql

I thought I understood how I can do a SELECT from the results of another SELECT statement, but there seems to be some sort of blurring of scope that I don't understand. I am using SQL Server 2008R2.
It is easiest to explain with an example.
Create a table with a single nvarchar column - load the table with a single text value and a couple of numbers:
CREATE TABLE #temptable( a nvarchar(30) );
INSERT INTO #temptable( a )
VALUES('apple');
INSERT INTO #temptable( a )
VALUES(1);
INSERT INTO #temptable( a )
VALUES(2);
select * from #temptable;
This will return: apple, 1, 2
Use IsNumeric to get only the rows of the table that can be cast to numeric - this will leave the text value apple behind. This works fine.
select cast(a as int) as NumA
from #temptable
where IsNumeric(a) = 1 ;
This returns: 1, 2
However, if I use that exact same query as an inner select, and try to do a numeric WHERE clause, it fails saying cannot convert nvarchar value 'apple' to data type int. How has it got the value 'apple' back??
select x.NumA
from
(
    select cast(a as int) as NumA
    from #temptable
    where IsNumeric(a) = 1
) x
where x.NumA > 1;
Note that the failing query works just fine without the WHERE clause:
select x.NumA
from
(
    select cast(a as int) as NumA
    from #temptable
    where IsNumeric(a) = 1
) x;
I find this very surprising. What am I not getting? TIA

If you take a look at the estimated execution plan you'll find that it has optimized the inner query into the outer and combined the WHERE clauses.
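Roughly speaking, the flattened query the optimizer works with looks something like this (an illustrative sketch, not actual plan output):
select cast(a as int) as NumA
from #temptable
where IsNumeric(a) = 1
  and cast(a as int) > 1; -- no guaranteed evaluation order between these two predicates, so the cast can reach 'apple'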
Using a CTE to isolate the operations works (in SQL Server 2008 R2):
declare @temptable as table ( a nvarchar(30) );
INSERT INTO @temptable( a )
VALUES ('apple'), ('1'), ('2');
with Numbers as (
    select cast(a as int) as NumA
    from @temptable
    where IsNumeric(a) = 1
)
select * from Numbers
where NumA > 1; -- the predicate that caused the error before

The reason you are getting this is fairly simple. When a query is executed, it goes through a few steps: parse, algebrize, optimize, and compile.
The algebrize step resolves all the objects the query needs. The optimizer then uses those objects to build the cheapest query plan, which is compiled and executed.
If you look at that plan, you will see it does a table scan on #temptable, and #temptable is defined exactly the way you created it: the column still has the nvarchar datatype. That you compute something on it later is a separate matter.
To understand how this works, you have to know how a query is logically read: first all the objects are retrieved (FROM table, INNER JOIN table), then the predicates (WHERE, ON), then the grouping and such, then the SELECT list of columns (with the cast), and finally the ORDER BY.
With that in mind, the optimizer still processes a combination of SELECTs the same way. Since your SELECT list is logically subordinate to the FROM and WHERE parts of the query, the cast is not guaranteed to happen after the outer filter, and that is why you get this error.
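As a rough illustration, here is the inner query annotated with its logical evaluation order (the optimizer is free to rearrange the physical order, which is exactly where the error comes from):
select cast(a as int) as NumA -- 3. SELECT: the cast logically belongs here
from #temptable               -- 1. FROM: the nvarchar column is read here
where IsNumeric(a) = 1;       -- 2. WHERE: predicates are applied here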
I hope that makes it a little clearer.

The optimizer is free to move expressions around in the query plan in order to produce the most cost-efficient plan for retrieving the data (the evaluation order of the predicates is not guaranteed). Using a CASE expression like the one below produces NULL in the absence of an ELSE clause, and thus takes 'apple' out:
select a from #temptable where case when isnumeric(a) = 1 then a end > 1
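For what it's worth, on SQL Server 2012 and later (not the asker's 2008 R2), TRY_CONVERT sidesteps the problem entirely, because it returns NULL instead of raising an error when the conversion fails:
select try_convert(int, a) as NumA
from #temptable
where try_convert(int, a) > 1; -- 'apple' becomes NULL and simply fails the comparison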


Is there a way to add the same row multiple times with different ids into a table with postgresql?

I am trying to add the same data for a row into my table x number of times in PostgreSQL. Is there a way of doing that without manually entering the same values x number of times? I am looking for the equivalent of GO [count] in SQL Server for Postgres... if that exists.
Use the function generate_series(), e.g.:
insert into my_table
select id, 'alfa', 'beta'
from generate_series(1,4) as id;
Test it in db<>fiddle.
Idea
Produce a resultset of a given size and cross join it with the record that you want to insert x times. What would still be missing is the generation of proper PK values. A specific suggestion would require more details on the data model.
Query
The sample query below presupposes that your PK values are autogenerated.
CREATE TABLE test ( id SERIAL, a VARCHAR(10), b VARCHAR(10) );

INSERT INTO test (a, b)
WITH RECURSIVE Numbers(i) AS (
    SELECT 1
    UNION ALL
    SELECT i + 1
    FROM Numbers
    WHERE i < 5 -- This is the value `x`
)
SELECT adhoc.*
FROM Numbers n
CROSS JOIN ( -- This is the single record to be inserted multiple times
    SELECT 'value_a' a
         , 'value_b' b
) adhoc;
See it in action in this db fiddle.
Note / Reference
The solution is adapted from here with minor modifications (there are a host of other solutions for generating x consecutive numbers with SQL hierarchical / recursive queries, so the choice of reference is somewhat arbitrary).

Db2 convert rows to columns

I need the results below.
Table:

Order  postcode  qnty
123    2234      1

Expected result:

Order     123
Postcode  2234
Qnty      1
SQL Server:
Select pvt.element_name, pvt.element_value
from (select [order], postcode from mytable) up
unpivot (element_value for element_name in ([order], [postcode])) as pvt;
How can I achieve this in Db2?
Db2 for IBM i doesn't have a built-in unpivot function. AFAIK, it's not available on any Db2 platform... unless it's been added recently.
The straightforward method:
select 'ORDER' as key, char("ORDER") as value
from mytable
UNION ALL
select 'POSTCODE', postcode
from mytable
UNION ALL
select 'QNTY', char(qnty)
from mytable;
A better performing method is to do a cross join between the source table and a correlated VALUES clause with as many rows as there are columns to unpivot.
select key, value
from mytable t,
     lateral (values ('ORDER', varchar(t."ORDER"))
                   , ('POSTCODE', t.postcode)
                   , ('QNTY', varchar(t.qnty))
             ) as unpivot(key, value);
However, you'll need to know ahead of time which columns you're unpivoting.
If you don't know the values, there are some ways to unpivot with the XMLTABLE (possibly JSON_TABLE) that might work. I've never used them, and I'm out of time to spend answering this question. You can find some examples via google.
I have created a stored procedure for LUW that rotates a table:
https://github.com/angoca/db2tools/blob/master/pivot.sql
You just need to call the stored procedure, passing the table name as a parameter, and it will return a cursor with the column headers in the first column.
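Assuming the procedure is installed under the name used in that script (check the source for the exact signature; the names here are guesses), the call would look something like:
CALL PIVOT('SOMETABLE'); -- hypothetical procedure and table names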

TSQL order by but first show these

I'm researching a dataset.
And I just wonder if there is a way to order like below in 1 query
Select * From MyTable where name ='international%' order by id
Select * From MyTable where name != 'international%' order by id
So first show all international items, then the names that don't start with international.
My question is not about adding columns to make this work, or using multiple DBs, or a larger TSQL script to clone a DB into a new order.
I just wonder if anything after the WHERE or ORDER BY can be tricked into doing this.
You can use expressions in the ORDER BY:
Select * From MyTable
order by
    CASE
        WHEN name like 'international%' THEN 0
        ELSE 1
    END,
    id
(From your narrative, it also sounded like you wanted like, not =, so I changed that too)
Another way (slightly cleaner and a tiny bit faster)
-- Sample Data
DECLARE @mytable TABLE (id INT IDENTITY, [name] VARCHAR(100));
INSERT @mytable([name])
VALUES('international something'),('ACME'),('international waffles'),('ABC Co.');
-- solution
SELECT t.*
FROM @mytable AS t
ORDER BY -PATINDEX('international%', t.[name]);
Note too that you can add a persisted computed column for -PATINDEX('international%', t.[name]) to speed things up.
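A minimal sketch of that computed-column idea (the table, column, and index names here are hypothetical):
ALTER TABLE dbo.MyTable
    ADD international_rank AS -PATINDEX('international%', [name]) PERSISTED;

CREATE INDEX IX_MyTable_international_rank
    ON dbo.MyTable (international_rank, id);
With that in place, ORDER BY international_rank, id can be answered from the index instead of requiring a sort.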

Generate result set of a certain length

I need to insert certain amount of rows into some table with values taken from variables. I certainly can do a loop inserting single row at a time, but that's too straightforward. I am looking for more elegant solution. My current thoughts are around INSERT INTO ... SELECT ... statement, but now I need a query that will generate the amount of rows that I need. I tried to write recursive CTE to do it:
CREATE FUNCTION ufGenerateRows(@numRows INT = 1)
RETURNS @RtnValue TABLE
(
    RowID INT NOT NULL
)
AS
BEGIN
    WITH numbers AS
    (
        SELECT 1 as N
        UNION ALL
        SELECT N + 1
        FROM numbers
        WHERE N + 1 <= @numRows
    )
    INSERT INTO @RtnValue
    SELECT N
    FROM numbers

    RETURN
END
GO
It works, but the recursion depth is limited to 100 by default, which is inappropriate for me. Can you suggest alternatives?
Always use the dbo. schema prefix when creating or referencing objects, especially functions.
You should strive to create inline table-valued functions, as opposed to multi-statement table-valued functions, when possible.
Recursive CTEs are about the least efficient way to generate a set (see this three-part series for much better examples):
http://www.sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1
http://www.sqlperformance.com/2013/01/t-sql-queries/generate-a-set-2
http://www.sqlperformance.com/2013/01/t-sql-queries/generate-a-set-3
Here is one example:
CREATE FUNCTION dbo.GenerateRows(@numRows INT = 1)
RETURNS TABLE
AS
RETURN
(
    SELECT TOP (@numRows) RowID = ROW_NUMBER() OVER (ORDER BY s1.[number])
    FROM master.dbo.spt_values AS s1
    -- CROSS JOIN master.dbo.spt_values AS s2
    ORDER BY s1.[number]
);
If you need more than ~2,500 rows, you can cross join spt_values with itself (uncomment the CROSS JOIN above), or with another table.
Even better would be to create your own numbers table (again, see the links above for examples).
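For example, a permanent numbers table could be built like this (a sketch along the lines of those articles; the object names are mine):
SELECT TOP (1000000) n = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
INTO dbo.Numbers
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2;

ALTER TABLE dbo.Numbers
    ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (n);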
Don't think iteratively - looping - but set-based - all at once.
An INSERT INTO...SELECT TOP x… should do what you need without repeated inserts.
I will follow with an example when I'm not bound to my phone.
UPDATE:
What @AaronBertrand said. :} A CROSS JOIN in the SELECT is spot-on.
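For completeness, the promised example: a minimal sketch of the set-based insert, reusing dbo.GenerateRows from the answer above (the target table, column, and value are hypothetical):
DECLARE @x INT = 500;
DECLARE @someValue VARCHAR(50) = 'repeated value';

INSERT INTO dbo.TargetTable (SomeColumn)
SELECT @someValue
FROM dbo.GenerateRows(@x);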

Is there a way to find TOP X records with grouped data?

I'm working with a Sybase 12.5 server and I have a table defined as such:
CREATE TABLE SomeTable(
[GroupID] [int] NOT NULL,
[DateStamp] [datetime] NOT NULL,
[SomeName] varchar(100),
PRIMARY KEY CLUSTERED (GroupID,DateStamp)
)
I want to be able to list, per [GroupID], only the latest X records by [DateStamp]. The kicker is X > 1, so plain old MAX() won't cut it. I'm assuming there's a wonderfully nasty way to do this with cursors and what-not, but I'm wondering if there is a simpler way without that stuff.
I know I'm missing something blatantly obvious and I'm gonna kick myself for not getting it, but .... I'm not getting it. Please help.
According to the online manual, Sybase 12.5 supports WINDOW functions and ROW_NUMBER(), though their syntax differs from standard SQL slightly.
Try something like this:
SELECT SP.*
FROM (
    SELECT *, ROW_NUMBER() OVER (windowA ORDER BY [DateStamp] DESC) AS RowNum
    FROM SomeTable
    WINDOW windowA AS (PARTITION BY [GroupID])
) AS SP
WHERE SP.RowNum <= 3
ORDER BY RowNum DESC;
I don't have an instance of Sybase, so I haven't tested this. I'm just synthesizing this example from the doc.
I made a mistake. The doc I was looking at was Sybase SQL Anywhere 11. It seems that Sybase ASA does not support the WINDOW clause at all, even in the most recent version.
Here's another query that could accomplish the same thing. You can use a self-join to match each row of SomeTable to all rows with the same GroupID and a later DateStamp. If there are three or fewer later rows, then we've got one of the top three.
SELECT s1.[GroupID], s1.[DateStamp], s1.[Foo], s1.[Bar], s1.[Baz]
FROM SomeTable s1
LEFT OUTER JOIN SomeTable s2
    ON s1.[GroupID] = s2.[GroupID] AND s1.[DateStamp] < s2.[DateStamp]
GROUP BY s1.[GroupID], s1.[DateStamp], s1.[Foo], s1.[Bar], s1.[Baz]
HAVING COUNT(*) < 3
ORDER BY s1.[DateStamp] DESC;
Note that you must list the same columns in the SELECT list as you list in the GROUP BY clause. Basically, all columns from s1 that you want this query to return.
Here's quite an unscalable way!
SELECT GroupID, DateStamp, SomeName
FROM SomeTable ST1
WHERE X >
    (SELECT COUNT(*)
     FROM SomeTable ST2
     WHERE ST1.GroupID = ST2.GroupID AND ST2.DateStamp > ST1.DateStamp)
Edit: Bill's solution is vastly preferable though.