Get cartesian product of two columns - tsql

How can I get the cartesian product of two columns in one table?
I have table
A 1
A 2
B 3
B 4
and I want a new table
A 1
A 2
A 3
A 4
B 1
B 2
B 3
B 4

fiddle demo
your table
try this using joins
select distinct b.let,a.id from [dbo].[cartesian] a join [dbo].[cartesian] b on a.id<>b.id
will result like this

Create this table :
CREATE TABLE [dbo].[Table_1]
(
[A] [int] NOT NULL ,
[B] [nvarchar](50) NULL ,
CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED ( [A] ASC )
WITH ( PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON ) ON [PRIMARY]
)
ON [PRIMARY]
Fill table like this :
INSERT INTO [dbo].[Table_1]
VALUES ( 1, 'A' )
INSERT INTO [dbo].[Table_1]
VALUES ( 2, 'A' )
INSERT INTO [dbo].[Table_1]
VALUES ( 3, 'B' )
INSERT INTO [dbo].[Table_1]
VALUES ( 4, 'C' )
SELECT *
FROM [dbo].[Table_1]
Use this query
SELECT DISTINCT
T1.B ,
T2.A
FROM dbo.Table_1 AS T1 ,
dbo.Table_1 AS T2
ORDER BY T1.B

To clarify loup's answer (in more detail that allowable in a comment), any join with no relevant criteria specified will naturally produce a Cartesian product (which is why a glib answer to your question might be "all too easily"-- mistakenly doing t1 INNER JOIN t2 ON t1.Key = t1.Key will produce the same result).
However, SQL Server does offer an explicit option to make your intentions known. The CROSS JOIN is essentially what you're looking for. But like INNER JOIN devolving to a Cartesian product without a useful join condition, CROSS JOIN devolves to a simple inner join if you go out of your way to add join criteria in the WHERE clause.
If this is a one-off operation, it probably doesn't matter which you use. But if you want to make it clear for posterity, consider CROSS JOIN instead.

Related

SQL left join case statement

Need some help working out the SQL. Unfortunately the version of tsql is SybaseASE which I'm not too familiar with, in MS SQL I would use a windowed function like RANK() or ROW_NUMBER() in a subquery and join to those results ...
Here's what I'm trying to resolve
TABLE A
Id
1
2
3
TABLE B
Id,Type
1,A
1,B
1,C
2,A
2,B
3,A
3,C
4,B
4,C
I would like to return 1 row for each ID and if the ID has a type 'A' record that should display, if it has a different type then it doesn't matter but it cannot be null (can do some arbitrary ordering, like alpha to prioritize "other" return value types)
Results:
1, A
2, A
3, A
4, B
A regular left join (ON A.id = B.id and B.type = 'A') ALMOST returns what I am looking for however it returns null for the type when I want the 'next available' type.
You can use a INNER JOIN on a SubQuery (FirstTypeResult) that will return the minimum type per Id.
Eg:
SELECT TABLEA.[Id], FirstTypeResult.[Type]
FROM TABLEA
JOIN (
SELECT [Id], Min([Type]) As [Type]
FROM TABLEB
GROUP BY [Id]
) FirstTypeResult ON FirstTypeResult.[Id] = TABLEA.[Id]

Postgres left joining 3 tables with a condition

I have a query like this:
SELECT x.id
FROM x
LEFT JOIN (
SELECT a.id FROM a
WHERE [condition1] [condition2]
) AS A USING (id)
LEFT JOIN (
SELECT b.id FROM b
WHERE [condition1] [condition3]
) AS B USING (id)
LEFT JOIN (
SELECT c.id FROM c
WHERE [condition1] [condition4]
) AS C USING (id)
WHERE [condition1]
As you can see the [condition1] is common for subqueries and the outer query.
When [in general] might it be worth to remove [condition1] from subqueries (as the result is same) for performance reasons? Please don't give answers like "run it and see". There are lots of data and its changing so we need good worst case behaviour.
I have tried to do some tests but they are far from being conclusive. Will Postgres figure out that the condition applied to subqueries as well and propagate it?
Examples for condition1:
WHERE a.id NOT IN (SELECT id FROM {ft_geom_in}) (this is slow, I know, this is just for example)
WHERE a.id > x
It is difficult to give a clear general answer, since much depends on the actual data model (especially indexes) and queries (conditions).
However, in many cases it makes sense to place condition1 in joined subqueries.
This applies particularly when condition2 excludes much less rows than condition1.
In such cases, the filter on condition1 may significantly reduce the number of checks of condition2.
On the other hand, it seems unlikely that the presence of condition1 in subqueries could substantially slow down the query.
Simple tests do not give general answers, but might serve as an illustration.
create table x (id integer, something text);
create table a (id integer, something text);
insert into x select i, i::text from generate_series (1, 1000000) i;
insert into a select i, i::text from generate_series (1, 1000000) i;
Query A: condition2 excludes few rows.
A1: with condition1
explain analyse
select x.id
from x
left join (
select id from a
where id < 500000 and length(something) > 1
) as a using (id)
where id < 500000;
Average execution time: ~620.000 ms
A2: without condition1
explain analyse
select x.id
from x
left join (
select id from a
where length(something) > 1
) as a using (id)
where id < 500000;
Average execution time: ~810.000 ms
Query B: condition2 excludes many rows.
B1: with condition1
explain analyse
select x.id
from x
left join (
select id from a
where id < 500000 and length(something) = 1
) as a using (id)
where id < 500000;
Average execution time: ~220.000 ms
B2: without condition1
explain analyse
select x.id
from x
left join (
select id from a
where length(something) = 1
) as a using (id)
where id < 500000;
Average execution time: ~230.000 ms
Note, that the queries do not need to have subqueries. Queries with simple left joins and conditions in a common where clause should be a little bit faster. For example, this is the equivalent of query B1:
explain analyse
select x.id
from x
left join a using(id)
where x.id < 500000
and a.id < 500000
and length(a.something) = 1
Average execution time: ~210.000 ms

Using EXISTS as a column in TSQL

Is it possible to use the value of EXISTS as part of a query?
(Please note: unfortunately due to client constraints, I need SQLServer 2005 compatible answers!)
So when returning a set of results, one of the columns is a boolean value which states whether the subquery would return any rows.
For example, I want to return a list of usernames and whether a different table contains any rows for each user. The following is not syntactically correct, but hopefully gives you an idea of what I mean...
SELECT T1.[UserName],
(EXISTS (SELECT *
FROM [AnotherTable] T2
WHERE T1.[UserName] = T2.[UserName])
) AS [RowsExist]
FROM [UserTable] T1
Where the resultant set contains a column called [UserName] and boolean column called [RowsExist].
The obvious solution is to use a CASE, such as below, but I wondered if there was a better way of doing it...
SELECT T1.[UserName],
(CASE (SELECT COUNT(*)
FROM [AnotherTable] T2
WHERE T1.[UserName] = T2.[UserName]
)
WHEN 0 THEN CAST(0 AS BIT)
ELSE CAST(1 AS BIT) END
) AS [RowsExist]
FROM [UserTable] T1
Your second query isn't valid syntax.
SELECT T1.[UserName],
CASE
WHEN EXISTS (SELECT *
FROM [AnotherTable] T2
WHERE T1.[UserName] = T2.[UserName]) THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END AS [RowsExist]
FROM [UserTable] T1
Is generally fine and will be implemented as a semi join.
The article Subqueries in CASE Expressions discusses this further.
In some cases a COUNT query can actually perform better though as discussed here
I like the other guys sql better but i just wrote this:
with bla as (
select t2.username, isPresent=CAST(1 AS BIT)
from t2
group by t2.username
)
select t1.*, isPresent = isnull(bla.isPresent, CAST(0 AS BIT))
from t1
left join blah on t1.username=blah.username
From what you wrote here I would alter your first query into something like this
SELECT
T1.[UserName], ISNULL(
(
SELECT
TOP 1 1
FROM [AnotherTable]
WHERE EXISTS
(
SELECT
1
FROM [AnotherTable] AS T2
WHERE T1.[UserName] = T2.[UserName]
)
), 0)
FROM [UserTable] T1
But actually if you use TOP 1 1 you would not need EXISTS, you could also write
SELECT
T1.[UserName], ISNULL(
(
SELECT
TOP 1 1
FROM [AnotherTable] AS T2
WHERE T1.[UserName] = T2.[UserName]
), 0)
FROM [UserTable] T1

How to use parameters in a SQL query with NOT EXISTS?

How can I change following query, so that I'm able to parameterize the SparePartNames?
It returns all ID's of repairs where not all mandatory spareparts were changed, in other words where at least one part is missing.
Note that the number of spareparts might change in future not only the names. Is it possible without using a stored procedure with dynamic SQL? If not, how could this SP look like?
Edit: Note that i do not need to know how to pass a list/array as parameter, this is asked myriads of time on SO. I've also already a Split table-valued-function. I'm just wondering how i could rewrite the query to be able to join(or whatever) with a list of mandatory parts, so that i'll find all records where at least one part is missing. So is it possible to use a varchar-parameter like '1264-3212,1254-2975' instead of a list of NOT EXISTS? Sorry for the confusion if it was not clear in the first place.
SELECT d.idData
FROM tabData d
INNER JOIN modModel AS m ON d.fiModel = m.idModel
WHERE (m.ModelName = 'MT27I')
AND (d.fiMaxServiceLevel >= 2)
AND (d.Manufacture_Date < '20120511')
AND (NOT EXISTS
(SELECT NULL
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON sp.idSparePart = td.fiSparePart
WHERE (td.fiData = d.idData)
AND (sp.SparePartName = '1264-3212'))
OR (NOT EXISTS
(SELECT NULL
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON sp.idSparePart = td.fiSparePart
WHERE (td.fiData = d.idData)
AND (sp.SparePartName = '1254-2975'))
)
)
Unfortunately I don't see how I could use sp.SparePartName IN/NOT IN(#sparePartNames) here.
One way to do it is to create a function to split delimited strings:
CREATE FUNCTION [dbo].[Split]
(
#Delimiter char(1),
#StringToSplit varchar(512)
)
RETURNS table
AS
RETURN
(
WITH Pieces(pieceNumber, startIndex, delimiterIndex)
AS
(
SELECT 1, 1, CHARINDEX(#Delimiter, #StringToSplit)
UNION ALL
SELECT pieceNumber + 1, delimiterIndex + 1, CHARINDEX(#Delimiter, #StringToSplit, delimiterIndex + 1)
FROM Pieces
WHERE delimiterIndex > 0
)
SELECT
SUBSTRING(#StringToSplit, startIndex, CASE WHEN delimiterIndex > 0 THEN delimiterIndex - startIndex ELSE 512 END) AS Value
FROM Pieces
)
populate a table variable with the spare part names:
DECLARE #SpareParts TABLE
(
SparePartName varchar(50) PRIMARY KEY CLUSTERED
);
INSERT INTO #SpareParts
SELECT Value FROM dbo.Split(',', '1264-3212,1254-2975');
and then join to the table variable:
SELECT d.idData
FROM tabData d
INNER JOIN modModel AS m ON d.fiModel = m.idModel
WHERE (m.ModelName = 'MT27I')
AND (d.fiMaxServiceLevel >= 2)
AND (d.Manufacture_Date < '20120511')
AND EXISTS (
SELECT 1
FROM tabDataDetail AS td
INNER JOIN tabSparePart AS sp ON sp.idSparePart = td.fiSparePart
LEFT JOIN #SpareParts AS s ON s.SparePartName = sp.SparePartName
WHERE td.fiData = d.idData
AND s.SparePartName IS NULL
)
Assuming there is (or will be) a table or view of mandatory spare parts, a list of exists can be replaced with a left join to tabDataDetail / tabSparePart pair on SparePartName; non-matches are reported back using td.fiSparePart is null.
; with mandatorySpareParts (SparePartName) as (
select '1264-3212'
union all
select '1254-2975'
)
SELECT d.idData
FROM tabData d
INNER JOIN modModel AS m ON d.fiModel = m.idModel
WHERE (m.ModelName = 'MT27I')
AND (d.fiMaxServiceLevel >= 2)
AND (d.Manufacture_Date < '20120511')
AND exists
(
SELECT null
from mandatorySpareParts msp
left join ( tabDataDetail AS td
INNER JOIN tabSparePart AS sp
ON sp.idSparePart = td.fiSparePart
AND td.fiData = d.idData
)
ON msp.SparePartName = sp.SparePartName
WHERE td.fiSparePart is null
)
Part names should be replaced by their id's, which would simplify left join and speed the query up.
EDIT: i've errorneously left filtering of td in where clause, which invalidated left join. It is now in ON clause where it belongs.
Use a table-variable and join on that.

Query to get row from one table, else random row from another

tblUserProfile - I have a table which holds all the Profile Info (too many fields)
tblMonthlyProfiles - Another table which has just the ProfileID in it (the idea is that this table holds 2 profileids which sometimes become monthly profiles (on selection))
Now when I need to show monthly profiles, I simply do a select from this tblMonthlyProfiles and Join with tblUserProfile to get all valid info.
If there are no rows in tblMonthlyProfile, then monthly profile section is not displayed.
Now the requirement is to ALWAYS show Monthly Profiles. If there are no rows in monthlyProfiles, it should pick up 2 random profiles from tblUserProfile. If there is only one row in monthlyProfiles, it should pick up only one random row from tblUserProfile.
What is the best way to do all this in one single query ?
I thought something like this
select top 2 * from tblUserProfile P
LEFT OUTER JOIN tblMonthlyProfiles M
on M.profileid = P.profileid
ORder by NEWID()
But this always gives me 2 random rows from tblProfile. How can I solve this ?
Try something like this:
SELECT TOP 2 Field1, Field2, Field3, FinalOrder FROM
(
select top 2 Field1, Field2, Field3, FinalOrder, '1' As FinalOrder from tblUserProfile P JOIN tblMonthlyProfiles M on M.profileid = P.profileid
UNION
select top 2 Field1, Field2, Field3, FinalOrder, '2' AS FinalOrder from tblUserProfile P LEFT OUTER JOIN tblMonthlyProfiles M on M.profileid = P.profileid ORDER BY NEWID()
)
ORDER BY FinalOrder
The idea being to pick two monthly profiles (if that many exist) and then 2 random profiles (as you correctly did) and then UNION them. You'll have between 2 and 4 records at that point. Grab the top two. FinalOrder column is an easy way to make sure that you try and get the monthly's first.
If you have control of the table structure, you might save yourself some trouble by simply adding a boolean field IsMonthlyProfile to the UserProfile table. Then it's a single table query, order by IsBoolean, NewID()
In SQL 2000+ compliant syntax you could do something like:
Select ...
From (
Select TOP 2 ...
From tblUserProfile As UP
Where Not Exists( Select 1 From tblMonthlyProfile As MP1 )
Order By NewId()
) As RandomProfile
Union All
Select MP....
From tblUserProfile As UP
Join tblMonthlyProfile As MP
On MP.ProfileId = UP.ProfileId
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) >= 1
Union All
Select ...
From (
Select TOP 1 ...
From tblUserProfile As UP
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) = 1
Order By NewId()
) As RandomProfile
Using SQL 2005+ CTE you can do:
With
TwoRandomProfiles As
(
Select TOP 2 ..., ROW_NUMBER() OVER ( ORDER BY UP.ProfileID ) As Num
From tblUserProfile As UP
Order By NewId()
)
Select MP.Col1, ...
From tblUserProfile As UP
Join tblMonthlyProfile As MP
On MP.ProfileId = UP.ProfileId
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) >= 1
Union All
Select ...
From TwoRandomProfiles
Where Not Exists( Select 1 From tblMonthlyProfile As MP1 )
Union All
Select ...
From TwoRandomProfiles
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) = 1
And Num = 1
The CTE has the advantage of only querying for the random profiles once and the use of the ROW_NUMBER() column.
Obviously, in all the UNION statements the number and type of the columns must match.