Index statistics available for not existing index? - tsql

I am running the following query to get index external fragmentation:
SELECT SCH.[name] as 'Schema'
,TBL.[name] as 'Table'
,IX.[name] as 'Index'
,IXStats.[avg_fragmentation_in_percent]
,IXStats.[page_count]
FROM [sys].[dm_db_index_physical_stats] (DB_ID(), NULL, NULL, NULL, NULL) AS IXStats
INNER JOIN [sys].[tables] AS TBL
on TBL.[object_id] = IXStats.[object_id]
INNER JOIN [sys].[schemas] SCH
on TBL.[schema_id] = SCH.[schema_id]
INNER JOIN [sys].[indexes] AS IX
ON IX.[object_id] = IXStats.[object_id]
AND IXStats.[index_id] = IX.[index_id]
WHERE IXStats.database_id = DB_ID()
ORDER BY IXStats.[avg_fragmentation_in_percent] DESC
And for several tables (on which there is no index or primary key) the query returns fragmentation.
How is this possible? Is SQL Server creating some internal indexes?

So, if you run your query and grab the index_id column too like so:
SELECT SCH.[name] as 'Schema'
,TBL.[name] as 'Table'
,IX.[name] as 'Index'
,IX.index_id
,IXStats.[avg_fragmentation_in_percent]
,IXStats.[page_count]
FROM [sys].[dm_db_index_physical_stats] (DB_ID(), NULL, NULL, NULL, NULL) AS IXStats
INNER JOIN [sys].[tables] AS TBL
on TBL.[object_id] = IXStats.[object_id]
INNER JOIN [sys].[schemas] SCH
on TBL.[schema_id] = SCH.[schema_id]
INNER JOIN [sys].[indexes] AS IX
ON IX.[object_id] = IXStats.[object_id]
AND IXStats.[index_id] = IX.[index_id]
WHERE IXStats.database_id = DB_ID()
ORDER BY IXStats.[avg_fragmentation_in_percent] DESC
You should see that tables that have no index defined for them have an ID of 0, this means that the table is a heap (i.e. a table with no clustered index). If you see the index ID of 1, this means that the table is a clustered index.
See here for more info about heaps:
Heaps (Tables without Clustered Indexes)

Related

Using ANY with raw data work but not subquery

I just can't figure it out why this query work
SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY(
'{abc-xyz-123,678-ght-nmp}'
)
But this query wont work with error operator does not exist: uuid = uuid[]
SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY(
SELECT organization_ids
FROM admins
WHERE id = 'jkl-iop-345'
)
When the subquery
SELECT organization_ids
FROM admins
WHERE id = 'jkl-iop-345'
give the exact result of {abc-xyz-123,678-ght-nmp}.
I'm using postgres (PostgreSQL) 13.3
The subquery produces one row that contains an array.
If you use = ANY (SELECT ...), the result set is converted to an array, so you end up with
{{abc-xyz-123,678-ght-nmp}}
which is an array of arrays.
You probably want
SELECT id, name, organization_id
FROM facilities
WHERE EXISTS (SELECT 1 FROM admins
WHERE admins.id = 'jkl-iop-345'
AND facilities.organization_id = ANY (admins.organization_ids)
);
Let me remark that storing references to other tables in an array, JSON or other composite data type is an exceptionally bad idea. A normalized schema with a junction table would serve you better.

How to get the matching column name when searching across multiple columns in PostgreSQL?

Is there a way to get the matching column name when searching across multiple columns in PostgreSQL?
Say I have the following table structure and query:
CREATE TABLE document (
id serial PRIMARY KEY,
document_content VARCHAR
);
CREATE TABLE story (
id serial PRIMARY KEY,
headline VARCHAR
);
-----
SELECT
"document".*,
story.id,
story.headline
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
INNER JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(document_content) ## to_tsquery('foo')
OR to_tsvector(headline) ## to_tsquery('foo');
I was thinking of concatenating the value of the two columns, run the full text search, then create a sub query for both columns and re-run the search individually and record the result as a reference, but this would mean executing the search 3x:
SELECT
"document".*,
story.id AS story_id,
story.headline
(SELECT "document".id WHERE to_tsvector(document_content) ## to_tsquery('foo')) AS "matching_document_id",
(SELECT story_id WHERE to_tsvector(headline) ## to_tsquery('foo')) AS "matching_story_id"
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
RIGHT JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(document_content || ' ' || headline) ## to_tsquery('foo');
How could I get a reference to the column: document_content or headline, where the keyword "foo" was found in one query?
Thanks!
Since the columns are in different tables the best you can do is translate the OR into a UNION:
SELECT
"document".*,
story.id,
story.headline
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
INNER JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(document_content) ## to_tsquery('foo')
UNION
SELECT
"document".*,
story.id,
story.headline
FROM
"document"
INNER JOIN story_document AS Documents_join ON "document".id = Documents_join.document_id
INNER JOIN story ON "story".id = Documents_join.story_id
WHERE to_tsvector(headline) ## to_tsquery('foo');
Then PostgreSQL doesn't have to build the complete join just to filter out most of the rows. My variant will be fast if the conditions are selective and indexed and you have indexes on the join conditions as well, so that you can get fast nested loop joins.
Here is some more about dealing with OR.

How does this query populate data?

It is my understanding that when this query runs it would not populate any data any number of times it runs because of the where clause
where c.company_id = lot.company_id
and p.product_id = lot.product_id
and l.packlevel_id = lot.packlevel_id
It looks to me that at the very beginning when the table fact_table_lot is empty the where clause would return with empty data because it would not find anything in an empty table and it would happen everytime. Is my understanding wrong?
insert into fact_table_lot(company_id, product_id, packlevel_id, l_num, sn_count, comm_loct, comm_start, commdate_end, man_date, exp_date, user_id, created_datetime)
select c.company_id, p.product_id, l.packlevel_id, l_num, sn_count, comm_loct, comm_start, commdate_end, man_date, exp_date, user_id, sysdate
from staging_serials s
left outer join fact_table_lot lot on s.lotnumber = lot.l_num
join company c on c.lsc_company_id = s.companyid
join product p on s.compositeprodcode = p.compositeprodcode
join level l on l.unit_of_measure = p.packaginguom
where c.company_id = lot.company_id
and p.product_id = lot.product_id
and l.packlevel_id = lot.packlevel_id
and lot.created_datetime is null
In your query staging_serials s left outer join fact_table_lot lot on s.lotnumber= lot.l_num this will give the result set containing all records from staging_serials and since fact table is empty null values for those column from fact table. If you want no records to be returned use a inner join instead of left join.

I'm trying to insert tuples into a table A (from table B) if the primary key of the table B tuple doesn't exist in tuple A

Here is what I have so far:
INSERT INTO Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
SELECT CURRENT_DATE, NULL, NewRentPayments.Rent, NewRentPayments.LeaseTenantSSN, FALSE from NewRentPayments
WHERE NOT EXISTS (SELECT * FROM Tenants, NewRentPayments WHERE NewRentPayments.HouseID = Tenants.HouseID AND
NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber)
So, HouseID and ApartmentNumber together make up the primary key. If there is a tuple in table B (NewRentPayments) that doesn't exist in table A (Tenants) based on the primary key, then it needs to be inserted into Tenants.
The problem is, when I run my query, it doesn't insert anything (I know for a fact there should be 1 tuple inserted). I'm at a loss, because it looks like it should work.
Thanks.
Your subquery was not correlated - It was just a non-correlated join query.
As per description of your problem, you don't need this join.
Try this:
insert into Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
select current_date, null, p.Rent, p.LeaseTenantSSN, FALSE
from NewRentPayments p
where not exists (
select *
from Tenants t
where p.HouseID = t.HouseID
and p.ApartmentNumber = t.ApartmentNumber
)

How to find primary key on a table in DB2?

I am unable to fetch primary key in DB2. I used following code but It is not working for me.
SELECT TBCREATOR, TBNAME, NAME, KEYSEQ
FROM SYSIBM.SYSCOLUMNS
WHERE TBCREATOR = 'DSN8710'
AND TBNAME = 'EMPLOYEE'
AND KEYSEQ > 0
ORDER BY KEYSEQ;
And what is the means of TBCREATOR in this code and how to modified TBCREATOR value according to my case?
I'll answer your last question first. creator is sometimes referred to as schema. If you're familiar with Oracle, this is roughly analogous to a database user (though not exactly).
As far as getting the "primary key" information, you probably want to know which index is the "clustering" index (which is what usually, but not always, determines the physical ordering of the rows on disk).
How you find the clustering index depends on the platform you're running:
Mainframe (z/OS):
SELECT
RTRIM(name) AS index_name
,RTRIM(creator) AS index_schema
,uniquerule
,clustering
FROM sysibm.sysindexes
WHERE tbname = #table
AND tbcreator = #schema
AND clustering = 'Y'
Then, to see the actual columns in that index, you perform this query:
SELECT colname AS name
FROM sysibm.sysindexes a
JOIN sysibm.syskeys b
ON a.name = b.ixname
AND a.tbcreator = b.ixcreator
WHERE a.name = #index_name
AND a.tbcreator = #index_schema
ORDER BY COLSEQ
Linux/Unix/Windows:
SELECT
RTRIM(indname) AS index_name
,RTRIM(indschema) AS index_schema
,uniquerule
,indextype
FROM syscat.indexes
WHERE tabname = #table
AND tabschema = #schema
AND indextype = 'CLUS'
Then, to see the actual columns in that index, you perform this query:
SELECT colnames as name
FROM sysibm.sysindexes
WHERE name = #index_name
AND tbcreator = #index_schema
ORDER BY NAME
LUW returns the list of columns as one string, delimited by +, which is kind of weird...