View all data of two related tables, even if something is not registered from table A to table B - tsql

I have two SQL Server tables like this:
CREATE TABLE [Management].[Person](
[PersonsID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](50) NOT NULL,
[LastName] [nvarchar](100) NOT NULL,
[Semat] [nvarchar](50) NOT NULL,
[Vahed] [nvarchar](50) NOT NULL,
[Floor] [int] NOT NULL,
[ShowInList] [bit] NOT NULL,
[LastState] [nchar](10) NOT NULL)
and
CREATE TABLE [Management].[PersonEnters](
[PersonEnters] [int] IDENTITY(1,1) NOT NULL,
[PersonID] [int] NOT NULL,
[Vaziat] [nchar](10) NOT NULL,
[Time] [nchar](10) NOT NULL,
[PDate] [nchar](10) NOT NULL)
PersonID in the second table is a foreign key to PersonsID in the first.
I register every person's entry to the system in the PersonEnters table.
I want to show every person's entry status on a certain date (the PDate field): if a person entered the system, show their entry information, and if not, show NULL instead.
I tried this query:
select * from [Management].[Person] left outer join [Management].[PersonEnters]
on [Management].[Person].[PersonsID] = [Management].[PersonEnters].[PersonID]
where [Management].[PersonEnters].PDate = '1392/11/14'
but it only shows the entry data of people registered on 1392/11/14 and shows nothing for the others.
I want to show this data plus NULL or a constant string like "NOT REGISTERED" for the other people who have no entry in the PersonEnters table on '1392/11/14'. Please help me.

Logically, the WHERE clause will be applied after the join. If some Person entries do not have matches in PersonEnters, they will have NULLs in PDate as a result of the join, but the WHERE clause will filter them out because the comparison NULL = '1392/11/14' will not yield true.
If I understand your question correctly, you essentially want an outer join to a subset of PersonEnters (the one where PDate = '1392/11/14'), not to the entire table. One way to express that could be like this:
SELECT *
FROM Management.Person AS p
LEFT JOIN (
SELECT *
FROM Management.PersonEnters
WHERE PDate = '1392/11/14'
) AS pe
ON p.PersonsID = pe.PersonID
;
As you can see, this query very explicitly tells the server that a particular subset should be derived from PersonEnters before the join takes place – because you want to indicate matches with that particular subset, not with the whole table.
However, the same intent could be rewritten in a more concise way (without a derived table):
SELECT *
FROM Management.Person AS p
LEFT JOIN Management.PersonEnters AS pe
ON p.PersonsID = pe.PersonID AND pe.PDate = '1392/11/14'
;
The effect of the above query would be the same and you would get all Person entries, with matching results from PersonEnters only if they have PDate = '1392/11/14'.
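The difference between filtering in WHERE and filtering in ON can be checked on a tiny data set. The following is only a minimal sketch, not the asker's real schema: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, drops the Management schema prefix, and the two people and their entry rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonsID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL);
    CREATE TABLE PersonEnters (PersonID INTEGER NOT NULL, PDate TEXT NOT NULL);
    INSERT INTO Person VALUES (1, 'Ali'), (2, 'Sara');
    -- Only person 1 entered on the date of interest (made-up sample rows).
    INSERT INTO PersonEnters VALUES (1, '1392/11/14'), (2, '1392/11/13');
""")

# Date filter in WHERE: applied after the join, so people without a
# matching entry on that date are removed from the result.
where_rows = conn.execute("""
    SELECT p.FirstName, pe.PDate
    FROM Person AS p
    LEFT JOIN PersonEnters AS pe ON p.PersonsID = pe.PersonID
    WHERE pe.PDate = '1392/11/14'
    ORDER BY p.PersonsID
""").fetchall()

# Date filter in ON: every person is kept, and COALESCE turns the
# NULL of an unmatched person into a constant string.
on_rows = conn.execute("""
    SELECT p.FirstName, COALESCE(pe.PDate, 'NOT REGISTERED')
    FROM Person AS p
    LEFT JOIN PersonEnters AS pe
        ON p.PersonsID = pe.PersonID AND pe.PDate = '1392/11/14'
    ORDER BY p.PersonsID
""").fetchall()

print(where_rows)
print(on_rows)
```

In this sketch the first query returns only the person who entered on the date, while the second returns both people, with 'NOT REGISTERED' for the one who did not.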

select *
from [Management].[Person]
left outer join [Management].[PersonEnters]
on [Management].[Person].[PersonsID] = [Management].[PersonEnters].[PersonID]
and [Management].[PersonEnters].PDate = '1392/11/14'

Postgresql How do I join these tables to get the correct output?

I am trying to find out which people are noted as being principal actors in a movie, but aren't noted as playing a character in that movie.
The schema I have is:
CREATE TABLE public.movies (
id integer NOT NULL,
title text NOT NULL,
year_made public.yeartype NOT NULL,
runtime public.minutes,
rating double precision,
nvotes public.counter
);
CREATE TABLE public.people (
id integer NOT NULL,
name text NOT NULL,
year_born public.yeartype,
year_died public.yeartype
);
CREATE TABLE public.plays (
movie_id integer NOT NULL,
person_id integer NOT NULL,
"character" text NOT NULL
);
CREATE TABLE public.principals (
movie_id integer NOT NULL,
ordering public.counter NOT NULL,
person_id integer NOT NULL,
role text NOT NULL
);
So far the query I have used works for some actors. However, I think I have done the joins incorrectly, because another principal actor is given a character they should not have (the character name comes from another movie they were in). This is my query:
select name as actor, movies.title as movie,character
from principals
inner join people on principals.person_id=people.id
inner join movies on principals.movie_id=movies.id
left outer join plays on principals.person_id=plays.person_id
where principals.role = 'actor' and character is null
Can anyone help me with this?
This is a summary of the results; the join adds all of a person's character names to every movie in which they were a principal.
https://drive.google.com/file/d/1NVRLiYBVbKuiazynx9Egav7c4_VHFEzP/view?usp=sharing
I think a second join condition is required for table plays, as shown in the AND line below:
FROM principals
LEFT JOIN plays ON principals.person_id = plays.person_id
AND principals.movie_id = plays.movie_id
WHERE principals.role = 'actor'
AND plays.person_id IS NULL
However, I feel that a NOT EXISTS approach will make this query easier to understand:
SELECT
people.name AS actor
, movies.title AS movie
FROM principals
INNER JOIN people ON principals.person_id = people.id
INNER JOIN movies ON principals.movie_id = movies.id
WHERE principals.ROLE = 'actor'
AND NOT EXISTS (
SELECT NULL
FROM plays
WHERE principals.person_id = plays.person_id
AND principals.movie_id = plays.movie_id
)
You're pretty close, but you also need to join plays on movie_id:
select name as actor, movies.title as movie
from principals
join people on principals.person_id=people.id
join movies on principals.movie_id=movies.id
left join plays on principals.person_id=plays.person_id
and plays.movie_id=principals.movie_id
where principals.role = 'actor'
and character is null
Removed character from select because it’s always null.
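To see why the movie_id condition matters, the anti-join can be tried on a few made-up rows. This is a hedged sketch rather than the asker's data: it uses Python's sqlite3 module in place of PostgreSQL, trims the tables to the columns the queries need, and the single actor and two movies are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE plays (movie_id INTEGER NOT NULL, person_id INTEGER NOT NULL,
                        "character" TEXT NOT NULL);
    CREATE TABLE principals (movie_id INTEGER NOT NULL, person_id INTEGER NOT NULL,
                             role TEXT NOT NULL);
    -- Made-up rows: one actor, principal in two movies, character in only one.
    INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO people VALUES (10, 'Actor X');
    INSERT INTO principals VALUES (1, 10, 'actor'), (2, 10, 'actor');
    INSERT INTO plays VALUES (1, 10, 'Hero');
""")

BASE = """
    SELECT people.name, movies.title
    FROM principals
    JOIN people ON principals.person_id = people.id
    JOIN movies ON principals.movie_id = movies.id
"""

# Joining plays on person_id alone: the 'Hero' row matches both movies,
# so no principal row is left with a NULL on the plays side.
person_only = conn.execute(BASE + """
    LEFT JOIN plays ON principals.person_id = plays.person_id
    WHERE principals.role = 'actor' AND plays.person_id IS NULL
""").fetchall()

# Joining on person_id and movie_id: Movie B has no plays row for the actor.
both_keys = conn.execute(BASE + """
    LEFT JOIN plays ON principals.person_id = plays.person_id
                   AND principals.movie_id = plays.movie_id
    WHERE principals.role = 'actor' AND plays.person_id IS NULL
""").fetchall()

# Equivalent NOT EXISTS formulation.
not_exists = conn.execute(BASE + """
    WHERE principals.role = 'actor'
      AND NOT EXISTS (SELECT 1 FROM plays
                      WHERE plays.person_id = principals.person_id
                        AND plays.movie_id = principals.movie_id)
""").fetchall()

print(person_only, both_keys, not_exists)
```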

I'm trying to insert tuples into a table A (from table B) if the primary key of the table B tuple doesn't exist in table A

Here is what I have so far:
INSERT INTO Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
SELECT CURRENT_DATE, NULL, NewRentPayments.Rent, NewRentPayments.LeaseTenantSSN, FALSE from NewRentPayments
WHERE NOT EXISTS (SELECT * FROM Tenants, NewRentPayments WHERE NewRentPayments.HouseID = Tenants.HouseID AND
NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber)
So, HouseID and ApartmentNumber together make up the primary key. If there is a tuple in table B (NewRentPayments) that doesn't exist in table A (Tenants) based on the primary key, then it needs to be inserted into Tenants.
The problem is, when I run my query, it doesn't insert anything (I know for a fact there should be 1 tuple inserted). I'm at a loss, because it looks like it should work.
Thanks.
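Your subquery is not correlated: because it lists NewRentPayments again in its own FROM clause, it is a self-contained join query, and it returns rows (making NOT EXISTS false for every outer row) as soon as any payment matches any tenant.
As per the description of your problem, you don't need this join.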
Try this:
insert into Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
select current_date, null, p.Rent, p.LeaseTenantSSN, FALSE
from NewRentPayments p
where not exists (
select *
from Tenants t
where p.HouseID = t.HouseID
and p.ApartmentNumber = t.ApartmentNumber
)
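The effect of correlating the subquery can be reproduced on a small example. The sketch below makes several assumptions: it runs on SQLite through Python's sqlite3 module, the tables are reduced to the two key columns plus Rent, the insert also copies the key columns so that the primary key is populated, and the sample rows are invented.

```python
import sqlite3

def make_db():
    """Build a fresh in-memory database with one existing tenant and two payments."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Tenants (HouseID INTEGER NOT NULL, ApartmentNumber INTEGER NOT NULL,
                              Rent INTEGER, PRIMARY KEY (HouseID, ApartmentNumber));
        CREATE TABLE NewRentPayments (HouseID INTEGER NOT NULL,
                                      ApartmentNumber INTEGER NOT NULL, Rent INTEGER);
        INSERT INTO Tenants VALUES (1, 1, 500);
        INSERT INTO NewRentPayments VALUES (1, 1, 500), (2, 5, 700);
    """)
    return conn

def tenant_count(conn):
    return conn.execute("SELECT COUNT(*) FROM Tenants").fetchone()[0]

# Uncorrelated version: the inner NewRentPayments hides the outer one, so the
# subquery finds the (1, 1) match for every outer row and nothing is inserted.
conn_a = make_db()
conn_a.execute("""
    INSERT INTO Tenants (HouseID, ApartmentNumber, Rent)
    SELECT HouseID, ApartmentNumber, Rent FROM NewRentPayments
    WHERE NOT EXISTS (SELECT * FROM Tenants, NewRentPayments
                      WHERE NewRentPayments.HouseID = Tenants.HouseID
                        AND NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber)
""")
uncorrelated_count = tenant_count(conn_a)

# Correlated version: the subquery refers to the outer row through alias p.
conn_b = make_db()
conn_b.execute("""
    INSERT INTO Tenants (HouseID, ApartmentNumber, Rent)
    SELECT p.HouseID, p.ApartmentNumber, p.Rent FROM NewRentPayments p
    WHERE NOT EXISTS (SELECT * FROM Tenants t
                      WHERE p.HouseID = t.HouseID
                        AND p.ApartmentNumber = t.ApartmentNumber)
""")
correlated_count = tenant_count(conn_b)

print(uncorrelated_count, correlated_count)
```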

T-SQL - compare column values and column names

I am trying to get the values from a small table, that are not present as columns in an existing table.
Here is some code I tried in SQL Server 2012:
BEGIN TRAN;
GO
CREATE TABLE [dbo].[TestValues](
[IdValue] [int] IDENTITY(1,1) NOT NULL,
[Code] [nvarchar](100) NULL,
) ON [PRIMARY];
GO
CREATE TABLE [dbo].[TestColumns](
[DateHour] [datetime2](7) NULL,
[test1] [nvarchar](100) NULL,
[test2] [nvarchar](100) NULL
) ON [PRIMARY]
GO
INSERT INTO [dbo].[TestValues] ([Code])
VALUES
(N'test1')
, (N'test2')
, (N'test3')
;
GO
SELECT
v.[Code]
, c.[name]
FROM
[dbo].[TestValues] AS v
LEFT OUTER JOIN [sys].[columns] AS c ON v.[Code] = c.[name]
WHERE
(c.[object_id] = OBJECT_ID(N'[dbo].[TestColumns]'))
AND (c.[column_id] > 1)
;
WITH
cteColumns AS (
SELECT
c.[name]
FROM
[sys].[columns] AS c
WHERE
(c.[object_id] = OBJECT_ID(N'[dbo].[TestColumns]'))
AND (c.[column_id] > 1)
)
SELECT
v.[Code]
, c.[name]
FROM
[dbo].[TestValues] AS v
LEFT OUTER JOIN cteColumns AS c ON v.[Code] = c.[name]
;
GO
ROLLBACK TRAN;
GO
In my opinion the two selects should have the same output. Can someone offer an explanation please?
TestValues is a table receiving data. TestColumns is a table that was created, when the project was started, by persisting a PIVOT query. Recently the process inserting into TestValues received some new data. I tried to get the new values using the first query and I was surprised when the result didn't show anything new.
Edit 1: Thank you dean for the answer, it sounds like a good explanation. Do you have any official page describing the behaviours of unpreserved tables? I did a quick google search and all I got was links towards Oracle. (added as an edit because I do not have enough reputation points to comment)

Most efficient way to join to a two-part key, with a fallback to matching only the first part?

In purely technical terms
Given a table with a two-column unique key, and input values for those two columns, what is the most efficient way to return the first matching row based on the following two-step match?
If an exact match exists on both key parts, return that
Otherwise, return the first (if any) matching row based on the first part alone
This operation will be done in many different places, on many rows. The "payload" of the match will be a single string column (nvarchar(400)). I want to optimize for fast reads. Paying for this with slower inserts and updates and more storage is acceptable. So having multiple indexes with the payload included is an option, as long as there is a good way to execute the two-step match described above. There absolutely will be a unique index on (key1, key2) with the payload included, so essentially all reads will be going off of this index alone, unless there is some clever approach that would use additional indexes.
A method that returns the entire matching row is preferred, but if a scalar function that only returns the payload is an order of magnitude faster, then that is worth considering.
I've tried three different methods, two of which I have posted as answers below. The third method was about 20x more expensive in the explain plan cost, and I've included it at the end of this post as an example of what not to do.
I'm curious to see if there are better ways, though, and will happily vote someone else's suggestion as the answer if it is better. In my dev database the query planner estimates similar costs to my two approaches, but my dev database doesn't have anywhere near the volume of multilingual text that will be in production, so it's hard to know if this accurately reflects the comparative read performance on a large data set. As tagged, the platform is SQL Server 2012, so if there are new applicable features available as of that version do make use of them.
Business context
I have a table LabelText that represents translations of user-supplied dynamic content:
create table Label ( LabelID bigint identity(1,1) not null primary key );
create table LabelText (
LabelTextID bigint identity(1,1) not null primary key
, LabelID bigint not null
, LanguageCode char(2) not null
, LabelText nvarchar(400) not null
, constraint FK_LabelText_Label
foreign key ( LabelID ) references Label ( LabelID )
);
There is a unique index on LabelID and LanguageCode, so there can only be one translation of a text item for each ISO 2-character language code. The LabelText field is also included, so reads can access the index alone without having to fetch back from the underlying table:
create unique index UQ_LabelText
on LabelText ( LabelID, LanguageCode )
include ( LabelText);
I'm looking for the fastest-performing way to return the best match from the LabelText table in a two-step match, given a LabelID and LanguageCode.
For examples, let's say we have a Component table that looks like this:
create table Component (
ComponentID bigint identity(1,1) not null primary key
, NameLabelID bigint not null
, DescriptionLabelID bigint not null
, constraint FK_Component_NameLabel
foreign key ( NameLabelID ) references Label ( LabelID )
, constraint FK_Component_DescLabel
foreign key ( DescriptionLabelID ) references Label ( LabelID )
);
Users will each have a preferred language, but there is no guarantee that a text item will have a translation in their language. In this business context it makes more sense to show any available translation rather than none, when the user's preferred language is not available. So for example a German user may call a certain widget the 'linkenpfostenklammer'. A British user would prefer to see an English translation if one is available, but until there is one it is better to see the German (or Spanish, or French) version than to see nothing.
What not to do: Cross apply with dynamic sort
Whether encapsulated in a table-valued function or included inline, the following use of cross apply with a dynamic sort was about 20x more expensive (per explain plan estimate) than either the scalar-valued function in my first answer or the union all approach in my second answer:
declare @LanguageCode char(2) = 'de';
select
c.ComponentID
, c.NameLabelID
, n.LanguageCode as NameLanguage
, n.LabelText as NameText
from Component c
outer apply (
select top 1
lt.LanguageCode
, lt.LabelText
from LabelText lt
where lt.LabelID = c.NameLabelID
order by
(case when lt.LanguageCode = @LanguageCode then 0 else 1 end)
) n
I think this is going to be most performant
select lt.*, c.*
from ( select LabelText, LabelID from LabelText
where LabelTextID = @LabelTextID and LabelID = @LabelID
union
select LabelText, min(LabelID) from LabelText
where LabelTextID = @LabelTextID
and not exists (select 1 from LabelText
where LabelTextID = @LabelTextID and LabelID = @LabelID)
group by LabelTextID, LabelText
) lt
join component c
on c.NameLabelID = lt.LabelID
OP solution 1: Scalar function
A scalar function would make it easy to encapsulate the lookup for reuse elsewhere, though it does not return the language code of the text actually returned. I'm also unsure of the cost of executing multiple times per row in denormalized views.
create function GetLabelText(@LabelID bigint, @LanguageCode char(2))
returns nvarchar(400)
as
begin
declare @text nvarchar(400);
select @text = LabelText
from LabelText
where LabelID = @LabelID and LanguageCode = @LanguageCode
;
if @text is null begin
select @text = LabelText
from LabelText
where LabelID = @LabelID;
end
return @text;
end
Usage would look like this:
declare @LanguageCode char(2) = 'de';
select
ComponentID
, NameLabelID
, DescriptionLabelID
, dbo.GetLabelText(NameLabelID, @LanguageCode) AS NameText
, dbo.GetLabelText(DescriptionLabelID, @LanguageCode) AS DescriptionText
from Component
OP solution 2: Inline table-valued function using top 1, union all
A table-valued function is nice because it encapsulates the lookup for reuse just as with a scalar function, but also returns the matching LanguageCode of the row that was actually selected. In my dev database with limited data the explain plan cost of the following use of top 1 and union all is comparable to the scalar function approach in "OP Solution 1":
create function GetLabelText(@LabelID bigint, @LanguageCode char(2))
returns table
as
return (
select top 1
A.LanguageCode
, A.LabelText
from (
select
LanguageCode
, LabelText
from LabelText
where LabelID = @LabelID
and LanguageCode = @LanguageCode
union all
select
LanguageCode
, LabelText
from LabelText
where LabelID = @LabelID
) A
);
Usage:
declare @LanguageCode char(2) = 'de';
select
c.ComponentID
, c.NameLabelID
, n.LanguageCode AS NameLanguage
, n.LabelText AS NameText
, c.DescriptionLabelID
, d.LanguageCode AS DescriptionLanguage
, d.LabelText AS DescriptionText
from Component c
outer apply GetLabelText(c.NameLabelID, @LanguageCode) n
outer apply GetLabelText(c.DescriptionLabelID, @LanguageCode) d
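The two-step match (exact language first, any language otherwise) can be tried out on a small table. The sketch below is an illustration under assumptions, not a performance claim: it runs on SQLite through Python's sqlite3 module rather than SQL Server 2012, the helper name best_label is hypothetical, and the translations are made up. Because SQL does not guarantee row order without ORDER BY, the sketch adds an explicit Priority column to the union all branches; that column is an addition and is not part of the function above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LabelText (
        LabelID      INTEGER NOT NULL,
        LanguageCode TEXT    NOT NULL,
        LabelText    TEXT    NOT NULL,
        UNIQUE (LabelID, LanguageCode)
    );
    -- Made-up sample translations.
    INSERT INTO LabelText VALUES
        (1, 'de', 'linkenpfostenklammer'),
        (1, 'en', 'left post clamp'),
        (2, 'de', 'nur deutsch');
""")

def best_label(label_id, language_code):
    """Return (LanguageCode, LabelText) for the best match, or None (hypothetical helper)."""
    return conn.execute("""
        SELECT LanguageCode, LabelText
        FROM (
            -- Step 1: exact match on both key parts gets priority 0.
            SELECT 0 AS Priority, LanguageCode, LabelText
            FROM LabelText
            WHERE LabelID = :id AND LanguageCode = :lang
            UNION ALL
            -- Step 2: any row for the label gets priority 1.
            SELECT 1 AS Priority, LanguageCode, LabelText
            FROM LabelText
            WHERE LabelID = :id
        )
        ORDER BY Priority, LanguageCode
        LIMIT 1
    """, {"id": label_id, "lang": language_code}).fetchone()

print(best_label(1, 'en'))  # exact match exists
print(best_label(2, 'en'))  # falls back to the only available language
print(best_label(3, 'en'))  # no row for this label at all
```

The second sort key, LanguageCode, only makes the fallback deterministic when several other languages are available; any other tie-break would do.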

Comparing 2 tables for new or updated rows using composite keys

I'm writing tsql for SQL Server 2008. I've got two tables with roughly 2 million rows each. The Source table gets updated daily and changes are pushed to the Destination table based on a last_edit date. If this date is newer in source than destination then update the destination row. If a new row exists in source compared to destination insert it into destination. This is really only a one way process that I'm concerned with, from source to destination. The source and destination table use a unique identifier across 4 columns, serialid, itemid, systemcode, and role.
My tables are modeled similarly to the script below. There are many data columns, but I've limited it to 3 in this example. I'm looking for 2 outputs: 1 set of data with rows to update and 1 set of data with rows to add.
CREATE TABLE [dbo].[TABLE_DEST](
[SERIALID] [nvarchar](20) NOT NULL,
[ITEMID] [nvarchar](20) NOT NULL,
[SYSTEMCODE] [nvarchar](20) NOT NULL,
[ROLE] [nvarchar](10) NOT NULL,
[LAST_EDIT] [datetime] NOT NULL,
[DATA_COLUMN_1] [nvarchar](10) NOT NULL,
[DATA_COLUMN_2] [nvarchar](10) NOT NULL,
[DATA_COLUMN_3] [nvarchar](10) NOT NULL
)
CREATE TABLE [dbo].[TABLE_SOURCE](
[SERIALID] [nvarchar](20) NOT NULL,
[ITEMID] [nvarchar](20) NOT NULL,
[SYSTEMCODE] [nvarchar](20) NOT NULL,
[ROLE] [nvarchar](10) NOT NULL,
[LAST_EDIT] [datetime] NOT NULL,
[DATA_COLUMN_1] [nvarchar](10) NOT NULL,
[DATA_COLUMN_2] [nvarchar](10) NOT NULL,
[DATA_COLUMN_3] [nvarchar](10) NOT NULL
)
Here's what I've got for the update dataset.
select s.*
from table_dest d (nolock) inner join table_source s (nolock)
on s.SYSTEMCODE = d.SYSTEMCODE
and s.ROLE = d.ROLE
and s.SERIALID = d.SERIALID
and s.ITEMID = d.ITEMID
and s.LAST_EDIT > d.LAST_EDIT
I don't know how best to accomplish finding the rows to add. But the solution has to be pretty efficient for the database.
Unmatched rows can be found with a left/right join by checking the target table's keys for NULL:
select s.*, case when d.key1 is null then 'insert' else 'update' end [action]
from [table_dest] d right join [table_source] s on (d.key1 = s.key1 /* etc.. */)
If you need these rows only to perform the respective operations, the MERGE statement does both in one step:
merge [table_dest] d
using [table_source] s on (d.key1 = s.key1 /* etc.. */)
when matched then
update set d.a = s.a
when not matched by target then
insert (key1, .., a) values (s.key1, ..., s.a);
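The join-and-check-for-NULL idea can be spelled out with the four-column key and the LAST_EDIT comparison from the question. This is a minimal sketch under assumptions: it uses SQLite through Python's sqlite3 module (which has no MERGE statement), stores LAST_EDIT as ISO date text, keeps a single data column, and the three source rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server 2008 (assumption).
conn = sqlite3.connect(":memory:")
DDL = """
    CREATE TABLE {name} (
        SERIALID TEXT NOT NULL, ITEMID TEXT NOT NULL,
        SYSTEMCODE TEXT NOT NULL, ROLE TEXT NOT NULL,
        LAST_EDIT TEXT NOT NULL,          -- ISO dates compare correctly as text
        DATA_COLUMN_1 TEXT NOT NULL,
        PRIMARY KEY (SERIALID, ITEMID, SYSTEMCODE, ROLE)
    );
"""
conn.executescript(DDL.format(name="table_dest") + DDL.format(name="table_source"))
conn.executescript("""
    -- Made-up rows: S1 is newer in source, S2 is unchanged, S3 is new.
    INSERT INTO table_dest VALUES
        ('S1', 'I1', 'SYS', 'R', '2020-01-01', 'old'),
        ('S2', 'I2', 'SYS', 'R', '2020-01-05', 'same');
    INSERT INTO table_source VALUES
        ('S1', 'I1', 'SYS', 'R', '2020-02-01', 'new'),
        ('S2', 'I2', 'SYS', 'R', '2020-01-05', 'same'),
        ('S3', 'I3', 'SYS', 'R', '2020-03-01', 'added');
""")

# Left join from source to destination on all four key columns.
# A NULL destination key means the row is missing and must be inserted;
# a newer LAST_EDIT in source means the row must be updated.
changes = conn.execute("""
    SELECT s.SERIALID,
           CASE WHEN d.SERIALID IS NULL THEN 'insert' ELSE 'update' END AS change_type
    FROM table_source AS s
    LEFT JOIN table_dest AS d
           ON d.SERIALID = s.SERIALID
          AND d.ITEMID = s.ITEMID
          AND d.SYSTEMCODE = s.SYSTEMCODE
          AND d.ROLE = s.ROLE
    WHERE d.SERIALID IS NULL OR s.LAST_EDIT > d.LAST_EDIT
    ORDER BY s.SERIALID
""").fetchall()

print(changes)
```

The WHERE clause here is safe on an outer join only because it explicitly keeps the rows where the destination key is NULL.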