T-SQL - compare column values and column names

I am trying to get the values from a small table, that are not present as columns in an existing table.
Here is some code I tried in SQL Server 2012:
BEGIN TRAN;
GO
CREATE TABLE [dbo].[TestValues](
[IdValue] [int] IDENTITY(1,1) NOT NULL,
[Code] [nvarchar](100) NULL
) ON [PRIMARY];
GO
CREATE TABLE [dbo].[TestColumns](
[DateHour] [datetime2](7) NULL,
[test1] [nvarchar](100) NULL,
[test2] [nvarchar](100) NULL
) ON [PRIMARY];
GO
INSERT INTO [dbo].[TestValues] ([Code])
VALUES
(N'test1')
, (N'test2')
, (N'test3')
;
GO
SELECT
v.[Code]
, c.[name]
FROM
[dbo].[TestValues] AS v
LEFT OUTER JOIN [sys].[columns] AS c ON v.[Code] = c.[name]
WHERE
(c.[object_id] = OBJECT_ID(N'[dbo].[TestColumns]'))
AND (c.[column_id] > 1)
;
WITH
cteColumns AS (
SELECT
c.[name]
FROM
[sys].[columns] AS c
WHERE
(c.[object_id] = OBJECT_ID(N'[dbo].[TestColumns]'))
AND (c.[column_id] > 1)
)
SELECT
v.[Code]
, c.[name]
FROM
[dbo].[TestValues] AS v
LEFT OUTER JOIN cteColumns AS c ON v.[Code] = c.[name]
;
GO
ROLLBACK TRAN;
GO
In my opinion the two SELECTs should produce the same output. Can someone offer an explanation, please?
TestValues is a table receiving data. TestColumns is a table that was created when the project started, by persisting a PIVOT query. Recently the process inserting into TestValues received some new data. I tried to get the new values using the first query and was surprised when the result didn't show anything new.
Edit 1: Thank you, dean, for the answer; it sounds like a good explanation. Do you have an official page describing the behaviour of non-preserved tables? A quick Google search only turned up links about Oracle. (Added as an edit because I do not have enough reputation points to comment.)
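For what it's worth, the usual fix for this class of problem is to move the predicates on the right-hand table into the ON clause, which keeps the outer join outer-preserved. A sketch of the likely intent (not dean's exact answer):
```sql
-- Predicates on the right-hand table of a LEFT JOIN belong in the ON
-- clause; in the WHERE clause they discard the NULL-extended rows,
-- silently turning the LEFT JOIN into an INNER JOIN.
SELECT
v.[Code]
, c.[name]
FROM
[dbo].[TestValues] AS v
LEFT OUTER JOIN [sys].[columns] AS c
ON v.[Code] = c.[name]
AND c.[object_id] = OBJECT_ID(N'[dbo].[TestColumns]')
AND c.[column_id] > 1
;
-- Rows where c.[name] IS NULL are the values not present as columns.
```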

Related

How to return different format of records from a single PL/pgSQL function?

I am a frontend developer, but I have started to write backend code. I have spent quite some time trying to figure out how to solve this, and I really need some help.
Here are the simplified definitions and relations of two tables:
Relationship between tables
CREATE TABLE IF NOT EXISTS items (
item_id uuid NOT NULL DEFAULT gen_random_uuid() ,
parent_id uuid DEFAULT NULL ,
parent_table parent_tables NOT NULL
);
CREATE TABLE IF NOT EXISTS collections (
collection_id uuid NOT NULL DEFAULT gen_random_uuid() ,
parent_id uuid DEFAULT NULL
);
Our product is an online document collaboration tool; a page can have nested pages.
I have a piece of PostgreSQL code that gets all ancestor records for given item_ids.
WITH RECURSIVE ancestors AS (
SELECT *
FROM items
WHERE item_id in ( ${itemIds} )
UNION
SELECT i.*
FROM items i
INNER JOIN ancestors a ON a.parent_id = i.item_id
)
SELECT * FROM ancestors
It works fine for nesting regular pages, but it breaks if I am going to support nesting collection pages, where some items' parent_id may refer to the collections table's collection_id. In my limited experience I don't think pure SQL can solve this. I think writing a PL/pgSQL function might be a solution, but I need to get all ancestor records for the given itemIds, which means returning a mix of items and collections records.
So how can I return records of different formats from a single PL/pgSQL function? I did some research but haven't found any example.
You can make it work by returning a superset as the row type, comprised of an item and a collection; one of the two will be NULL in each result row.
WITH RECURSIVE ancestors AS (
SELECT 0 AS lvl, i.parent_id, i.parent_table, i AS _item, NULL::collections AS _coll
FROM items i
WHERE item_id IN ( ${itemIds} )
UNION ALL -- !
SELECT lvl + 1, COALESCE(i.parent_id, c.parent_id), COALESCE(i.parent_table, 'i'), i, c
FROM ancestors a
LEFT JOIN items i ON a.parent_table = 'i' AND i.item_id = a.parent_id
LEFT JOIN collections c ON a.parent_table = 'c' AND c.collection_id = a.parent_id
WHERE a.parent_id IS NOT NULL
)
SELECT lvl, _item, _coll
FROM ancestors
-- ORDER BY ?
Use UNION ALL, not UNION.
This assumes a collection's parent is always an item, while an item can go either way.
We need a LEFT JOIN on both potential parent tables so each row stays in the race.
I added an optional lvl column to keep track of the hierarchy level.
About decomposing row types:
Combine postgres function with query
Record returned from function has columns concatenated
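As a minimal illustration of that decomposition, the composite _item and _coll columns can be expanded into ordinary columns in the final SELECT of the recursive query above (a sketch; the aliases avoid the parent_id name clash between the two row types):
```sql
-- Appended as the final SELECT of the CTE above: (_row).col expands a
-- composite column; one side is NULL on every row by construction.
SELECT lvl
, (_item).item_id
, (_item).parent_id  AS item_parent_id
, (_coll).collection_id
, (_coll).parent_id  AS coll_parent_id
FROM ancestors;
```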

SQL NOT LIKE comparison against dynamic list

Working on a new T-SQL stored procedure, I want to get all rows where the values in a specific column don't start with any of a specific set of 2-character substrings.
The general idea is:
SELECT * FROM table WHERE value NOT LIKE 's1%' AND value NOT LIKE 's2%' AND value NOT LIKE 's3%'.
The catch is that I am trying to make it dynamic so that the specific substrings can be pulled from another table in the database, which can have more values added to it.
While I have never used the IN operator before, I think something along these lines should do what I am looking for. However, I don't think it is possible to use wildcards with IN, so I might not be able to compare just the substrings.
SELECT * FROM table WHERE value NOT IN (SELECT substrings FROM subTable)
To get around that limitation, I am trying to do something like this:
SELECT * FROM table WHERE SUBSTRING(value, 1, 2) NOT IN (SELECT Prefix FROM subTable WHERE Prefix IS NOT NULL)
but I'm not sure this is right, or if it is the most efficient way to do this. My preference is to do this in a Stored Procedure, but if that isn't feasible or efficient I'm also open to building the query dynamically in C#.
Here's an option: load the values you want to filter on into a table, left outer join, and use PATINDEX().
DECLARE @FilterValues TABLE
(
[FilterValue] NVARCHAR(10)
);
-- Table with the values we want to filter on.
INSERT INTO @FilterValues (
[FilterValue]
)
VALUES ( N's1' )
, ( N's2' )
, ( N's3' );
DECLARE @TestData TABLE
(
[TestValues] NVARCHAR(100)
);
-- Load some test data.
INSERT INTO @TestData (
[TestValues]
)
VALUES ( N's1 Test Data' )
, ( N's2 Test Data' )
, ( N's3 Test Data' )
, ( N'test data not filtered out' )
, ( N'test data not filtered out 1' );
SELECT a.*
FROM @TestData [a]
LEFT OUTER JOIN @FilterValues [b]
ON PATINDEX([b].[FilterValue] + '%', [a].[TestValues]) > 0
WHERE [b].[FilterValue] IS NULL;
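An equivalent formulation, if you prefer LIKE over PATINDEX(), is an anti-join via NOT EXISTS. A sketch assuming the same table variables as above (note that table variables take the @ prefix that DECLARE ... TABLE requires):
```sql
-- Keep only rows whose value starts with none of the stored prefixes.
SELECT a.*
FROM @TestData AS a
WHERE NOT EXISTS (
SELECT 1
FROM @FilterValues AS b
WHERE a.[TestValues] LIKE b.[FilterValue] + N'%'  -- prefix match
);
```
Either way, the optimizer cannot use a seek for a predicate applied to an expression over the outer column, so on large tables expect a scan of the data table with a probe into the prefix list.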

Most efficient way to join to a two-part key, with a fallback to matching only the first part?

In purely technical terms
Given a table with a two-column unique key, and input values for those two columns, what is the most efficient way to return the first matching row based on a two-step match?
If an exact match exists on both key parts, return that
Otherwise, return the first (if any) matching row based on the first part alone
This operation will be done in many different places, on many rows. The "payload" of the match will be a single string column (nvarchar(400)). I want to optimize for fast reads. Paying for this with slower inserts and updates and more storage is acceptable. So having multiple indexes with the payload included is an option, as long as there is a good way to execute the two-step match described above. There absolutely will be a unique index on (key1, key2) with the payload included, so essentially all reads will be going off of this index alone, unless there is some clever approach that would use additional indexes.
A method that returns the entire matching row is preferred, but if a scalar function that only returns the payload is an order of magnitude faster, then that is worth considering.
I've tried three different methods, two of which I have posted as answers below. The third method was about 20x more expensive in the explain plan cost, and I've included it at the end of this post as an example of what not to do.
I'm curious to see if there are better ways, though, and will happily vote someone else's suggestion as the answer if it is better. In my dev database the query planner estimates similar costs to my two approaches, but my dev database doesn't have anywhere near the volume of multilingual text that will be in production, so it's hard to know if this accurately reflects the comparative read performance on a large data set. As tagged, the platform is SQL Server 2012, so if there are new applicable features available as of that version do make use of them.
Business context
I have a table LabelText that represents translations of user-supplied dynamic content:
create table Label ( LabelID bigint identity(1,1) not null primary key );
create table LabelText (
LabelTextID bigint identity(1,1) not null primary key
, LabelID bigint not null
, LanguageCode char(2) not null
, LabelText nvarchar(400) not null
, constraint FK_LabelText_Label
foreign key ( LabelID ) references Label ( LabelID )
);
There is a unique index on LabelID and LanguageCode, so there can only be one translation of a text item for each ISO 2-character language code. The LabelText field is also included, so reads can access the index alone without having to fetch back from the underlying table:
create unique index UQ_LabelText
on LabelText ( LabelID, LanguageCode )
include ( LabelText);
I'm looking for the fastest-performing way to return the best match from the LabelText table in a two-step match, given a LabelID and LanguageCode.
For examples, let's say we have a Component table that looks like this:
create table Component (
ComponentID bigint identity(1,1) not null primary key
, NameLabelID bigint not null
, DescriptionLabelID bigint not null
, constraint FK_Component_NameLabel
foreign key ( NameLabelID ) references Label ( LabelID )
, constraint FK_Component_DescLabel
foreign key ( DescriptionLabelID ) references Label ( LabelID )
);
Users will each have a preferred language, but there is no guarantee that a text item will have a translation in their language. In this business context it makes more sense to show any available translation rather than none, when the user's preferred language is not available. So for example a German user may call a certain widget the 'linkenpfostenklammer'. A British user would prefer to see an English translation if one is available, but until there is one it is better to see the German (or Spanish, or French) version than to see nothing.
What not to do: Cross apply with dynamic sort
Whether encapsulated in a table-valued function or included inline, the following use of cross apply with a dynamic sort was about 20x more expensive (per explain plan estimate) than either the scalar-valued function in my first answer or the union all approach in my second answer:
declare @LanguageCode char(2) = 'de';
select
c.ComponentID
, c.NameLabelID
, n.LanguageCode as NameLanguage
, n.LabelText as NameText
from Component c
outer apply (
select top 1
lt.LanguageCode
, lt.LabelText
from LabelText lt
where lt.LabelID = c.NameLabelID
order by
(case when lt.LanguageCode = @LanguageCode then 0 else 1 end)
) n
I think this is going to be the most performant:
select lt.*, c.*
from ( select LabelText, LabelID from LabelText
where LabelID = @LabelID and LanguageCode = @LanguageCode
union all
select top 1 LabelText, LabelID from LabelText
where LabelID = @LabelID
and not exists (select 1 from LabelText
where LabelID = @LabelID and LanguageCode = @LanguageCode)
) lt
join Component c
on c.NameLabelID = lt.LabelID
OP solution 1: Scalar function
A scalar function would make it easy to encapsulate the lookup for reuse elsewhere, though it does not return the language code of the text actually returned. I'm also unsure of the cost of executing it multiple times per row in denormalized views.
create function GetLabelText(@LabelID bigint, @LanguageCode char(2))
returns nvarchar(400)
as
begin
declare @text nvarchar(400);
select @text = LabelText
from LabelText
where LabelID = @LabelID and LanguageCode = @LanguageCode
;
if @text is null begin
select @text = LabelText
from LabelText
where LabelID = @LabelID;
end
return @text;
end
Usage would look like this:
declare @LanguageCode char(2) = 'de';
select
ComponentID
, NameLabelID
, DescriptionLabelID
, dbo.GetLabelText(NameLabelID, @LanguageCode) AS NameText
, dbo.GetLabelText(DescriptionLabelID, @LanguageCode) AS DescriptionText
from Component
OP solution 2: Inline table-valued function using top 1, union all
A table-valued function is nice because it encapsulates the lookup for reuse just as with a scalar function, but also returns the matching LanguageCode of the row that was actually selected. In my dev database with limited data the explain plan cost of the following use of top 1 and union all is comparable to the scalar function approach in "OP Solution 1":
create function GetLabelText(@LabelID bigint, @LanguageCode char(2))
returns table
as
return (
select top 1
A.LanguageCode
, A.LabelText
from (
select
LanguageCode
, LabelText
from LabelText
where LabelID = @LabelID
and LanguageCode = @LanguageCode
union all
select
LanguageCode
, LabelText
from LabelText
where LabelID = @LabelID
) A
);
Usage:
declare @LanguageCode char(2) = 'de';
select
c.ComponentID
, c.NameLabelID
, n.LanguageCode AS NameLanguage
, n.LabelText AS NameText
, c.DescriptionLabelID
, d.LanguageCode AS DescriptionLanguage
, d.LabelText AS DescriptionText
from Component c
outer apply GetLabelText(c.NameLabelID, #LanguageCode) n
outer apply GetLabelText(c.DescriptionLabelID, #LanguageCode) d

View all Data of two related tables , even if something is not registered from table A to table B

I have two SQL Server tables like this:
CREATE TABLE [Management].[Person](
[PersonsID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [nvarchar](50) NOT NULL,
[LastName] [nvarchar](100) NOT NULL,
[Semat] [nvarchar](50) NOT NULL,
[Vahed] [nvarchar](50) NOT NULL,
[Floor] [int] NOT NULL,
[ShowInList] [bit] NOT NULL,
[LastState] [nchar](10) NOT NULL)
and
CREATE TABLE [Management].[PersonEnters](
[PersonEnters] [int] IDENTITY(1,1) NOT NULL,
[PersonID] [int] NOT NULL,
[Vaziat] [nchar](10) NOT NULL,
[Time] [nchar](10) NOT NULL,
[PDate] [nchar](10) NOT NULL)
PersonsID in the second table is a foreign key.
I register every person's entry into the system in the PersonEnters table.
I want to show every person's entry status on a certain date (the PDate field): if a person entered the system, show their information, and if not, show NULL instead.
I tried this query:
select * from [Management].[Person] left outer join [Management].[PersonEnters]
on [Management].[Person].[PersonsID] = [Management].[PersonEnters].[PersonID]
where [Management].[PersonEnters].PDate = '1392/11/14'
but it only shows the entries registered on 1392/11/14 and shows nothing for the others.
I want to show this data plus NULL, or a constant string like "NOT REGISTERED", for the persons who did not register their entry in the PersonEnters table on '1392/11/14'. Please help me.
Logically, the WHERE clause will be applied after the join. If some Person entries do not have matches in PersonEnters, they will have NULLs in PDate as a result of the join, but the WHERE clause will filter them out because the comparison NULL = '1392/11/14' will not yield true.
If I understand your question correctly, you essentially want an outer join to a subset of PersonEnters (the one where PDate = '1392/11/14'), not to the entire table. One way to express that could be like this:
SELECT *
FROM Management.Person AS p
LEFT JOIN (
SELECT *
FROM Management.PersonEnters
WHERE PDate = '1392/11/14'
) AS pe
ON p.PersonsID = pe.PersonID
;
As you can see, this query very explicitly tells the server that a particular subset should be derived from PersonEnters before the join takes place – because you want to indicate matches with that particular subset, not with the whole table.
However, the same intent could be rewritten in a more concise way (without a derived table):
SELECT *
FROM Management.Person AS p
LEFT JOIN Management.PersonEnters AS pe
ON p.PersonsID = pe.PersonID AND pe.PDate = '1392/11/14'
;
The effect of the above query would be the same and you would get all Person entries, with matching results from PersonEnters only if they have PDate = '1392/11/14'.
select *
from [Management].[Person]
left outer join [Management].[PersonEnters]
on [Management].[Person].[PersonsID] = [Management].[PersonEnters].[PersonID]
and [Management].[PersonEnters].PDate = '1392/11/14'
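To get the literal "NOT REGISTERED" string the question asks for, rather than plain NULLs, a COALESCE over the outer-joined columns is enough. A sketch using the columns defined above (the selected Person columns are illustrative):
```sql
-- Unmatched persons get NULL from the outer join; COALESCE replaces it
-- with a constant status string.
SELECT p.[PersonsID]
, p.[FirstName]
, p.[LastName]
, COALESCE(pe.[Vaziat], N'NOT REGISTERED') AS [EnterStatus]
, pe.[Time]
, pe.[PDate]
FROM [Management].[Person] AS p
LEFT OUTER JOIN [Management].[PersonEnters] AS pe
ON p.[PersonsID] = pe.[PersonID]
AND pe.[PDate] = '1392/11/14';
```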

an empty row with null-like values in not-null field

I'm using postgresql 9.0 beta 4.
After inserting a lot of data into a partitioned table, I found a weird thing: when I query the table, I can see an empty row with null-like values in NOT NULL fields.
The weird query result looks like the one below.
The 689th row is empty. The first 3 fields (stid, d, ticker) compose the primary key, so they should not be null. The query I used is this:
select * from st_daily2 where stid=267408 order by d
I can even do a group by on this data:
select stid, date_trunc('month', d) ym, count(*) from st_daily2
where stid=267408 group by stid, date_trunc('month', d)
The group by result still has the empty row.
The 1st row is empty.
But if I query where stid or d is null, it returns nothing.
Is this a bug in PostgreSQL 9.0 beta 4, or some data corruption?
EDIT :
I added my table definition.
CREATE TABLE st_daily
(
stid integer NOT NULL,
d date NOT NULL,
ticker character varying(15) NOT NULL,
mp integer NOT NULL,
settlep double precision NOT NULL,
prft integer NOT NULL,
atr20 double precision NOT NULL,
upd timestamp with time zone,
ntrds double precision
)
WITH (
OIDS=FALSE
);
CREATE TABLE st_daily2
(
CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
REFERENCES strgs (stid) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
OIDS=FALSE
);
The data in this table is simulation results. Multiple multithreaded simulation engines written in C# insert data into the database using Npgsql.
psql also shows the empty row.
You'd better file a report at http://www.postgresql.org/support/submitbug
Some questions:
Could you show us the table definitions and constraints for the partitions?
How did you load your data?
Do you get the same result when using another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release, preferably the latest point-release of the current version.
That is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL; it's -infinity.
To filter for such a row use:
WHERE
case when mytestcolumn = '-infinity'::timestamp or
mytestcolumn = 'infinity'::timestamp
then NULL else mytestcolumn end IS NULL
instead of:
WHERE mytestcolumn IS NULL
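The same filter can be written without the CASE wrapper, which reads more directly (equivalent under the assumption that exactly NULL, infinity and -infinity should match):
```sql
-- Match rows that are NULL or hold either timestamp infinity.
WHERE mytestcolumn IS NULL
   OR mytestcolumn IN ('-infinity'::timestamp, 'infinity'::timestamp)
```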