Will adding a non-clustered index to a frequently accessed table with fewer than 1,000 rows improve performance?

I have a table with just 400-500 rows, but it is accessed very often, so I was wondering whether I should add a non-clustered index on one of its columns to see any improvement.
This table holds the same data all the time and is rarely updated.
Here's the structure of the table:
CREATE TABLE [dbo].[tbl_TimeZones](
    [country] [char](2) NOT NULL,
    [region] [char](2) NULL,
    [timezone] [varchar](50) NOT NULL
) ON [PRIMARY]
With this clustered index:
CREATE CLUSTERED INDEX [IX_tbl_TimeZones] ON [dbo].[tbl_TimeZones]
(
[country] ASC,
[region] ASC,
[timezone] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
This table doesn't have a primary key because the region column can be null, which is why I haven't added one.
So I want to add a non-clustered index on the timezone column to improve performance.

Short answer: an index will probably improve performance for you.
Longer answer:
Even with just that number of records, you could see query improvement with a well-chosen index. Assuming this table is used in joins, you could see changes (improvements) in the query plans that join through that table, which may give you a bigger benefit than you might anticipate.
You give the impression that you expect to index just one column. Indexing a single column is probably not the optimal solution; a "covering" index is generally going to be a better choice (search for "covering index").
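For illustration, a minimal sketch of a covering index for the table in the question, assuming lookups filter on timezone and read back country and region (the index name here is my own):

-- Key column serves the WHERE clause; INCLUDE columns make the index covering,
-- so the query never has to touch the clustered index.
CREATE NONCLUSTERED INDEX IX_tbl_TimeZones_timezone
    ON dbo.tbl_TimeZones (timezone)
    INCLUDE (country, region);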
Now, having said all of that, I suspect that the best performance may come from how the clustered index is defined. You did not indicate how the data is actually queried. But if the queries almost always access the table in the same way (e.g., the WHERE and JOIN clauses always reference the same columns), then you might find that changing the clustered index gives the most improvement.
Also, part of the art of choosing indexes involves balancing query performance versus insert/update performance. You don't have that challenge if the data aren't changing. Keep that in mind when reading general index-tuning advice.
Bottom line: I suspect that a clustered index over the columns used in the WHERE and JOIN clauses is the answer. Column order matters. Selectivity matters.
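As a sketch under that assumption: if queries almost always filter on country and region, the existing clustered index could be redefined over just those columns, using DROP_EXISTING to rebuild it in place:

-- Rebuild the clustered index over only the commonly filtered columns.
CREATE CLUSTERED INDEX [IX_tbl_TimeZones] ON [dbo].[tbl_TimeZones]
(
    [country] ASC,
    [region] ASC
)
WITH (DROP_EXISTING = ON);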

Related

Does having a dedicated a single-column index have a substantial performance benefit over a composite index with the same column leading it?

I have a mapping table that looks like:
group_id (int)
item_id (int)
There already exist two composite indexes, (group_id, item_id) and (item_id, group_id).
I'm finding that deleting all records by group_id from the table is very slow (e.g. DELETE FROM table_name WHERE group_id = 1). From what I've read, and from what I can see by using EXPLAIN, the composite index with group_id as the leading column, (group_id, item_id), will get used even though there is no single-column index on group_id. I've seen people mention on here that you can get even better performance by having a dedicated single-column index on the first column. How much of a performance benefit should I expect? Would it be a marginal improvement, or something more substantial?
On a side note, I'm also curious whether it's the (item_id, group_id) index that's hurting delete performance by needing to clean up indexes.
A smaller index might help by fitting into cache more easily. But that would help when you are jumping all around the index, reading only one row from each spot, not when reading a big chunk of adjacent index entries like you are here. Deletes don't incur direct index-maintenance cost. They do create work for some future vacuum to clean up, but that doesn't seem to be what is happening here (and it is mostly independent of the number of columns in the index anyway). Whatever is slowing down your delete, it is not this. The biggest culprits for slowing down non-join deletes are triggers and FK constraints.
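One way to verify that: EXPLAIN ANALYZE reports trigger and FK-enforcement time separately from the scan itself, and wrapping the statement in a transaction you roll back leaves the data untouched. A sketch, reusing table_name and the value from the question:

BEGIN;

-- Trigger/FK time shows up as "Trigger ...: time=... calls=..." lines in the output.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM table_name WHERE group_id = 1;

ROLLBACK;  -- undo the delete; we only wanted the timing breakdown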

Trigger to update parent table based on max value of child table

This is my first foray into SSMS/T-SQL (coming from Access). I have a trigger set up that keeps a column in a parent table always equal to the MAX value of a column in a child table, based on the key between them. To calculate the MAX I have a UDF defined that I think works OK.
The problem I seem to have is that the trigger executes for EVERY key in the table and not just the one that got updated/deleted/inserted (or so I gather from the debugger).
Here is the parent table:
CREATE TABLE [dbo].[factMeasures](
    [MeasureID] [int] IDENTITY(1,1) NOT NULL,
    [QARTOD] [int] NULL,
    [Param] [char](10) NOT NULL,
    [Value] [real] NOT NULL,
    CONSTRAINT [PK_factMeasures] PRIMARY KEY CLUSTERED
    (
        [MeasureID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Here is the child table:
CREATE TABLE [dbo].[dt_QCflags](
    [QC_ID] [int] IDENTITY(1,1) NOT NULL,
    [fkMeasureID] [int] NOT NULL,
    [RuleValue] [int] NOT NULL,
    CONSTRAINT [PK_dt_QCflags] PRIMARY KEY CLUSTERED
    (
        [QC_ID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[dt_QCflags] WITH CHECK ADD CONSTRAINT [FK_dt_QCflags_factMeasures] FOREIGN KEY([fkMeasureID])
REFERENCES [dbo].[factMeasures] ([MeasureID])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[dt_QCflags] CHECK CONSTRAINT [FK_dt_QCflags_factMeasures]
GO
Here is the UDF that calculates the MAX value of [RuleValue] for the input [MeasureID]:
CREATE FUNCTION [dbo].[MaxQC](@MeasureID INT)
RETURNS INT
AS
BEGIN
    RETURN
    (SELECT
        MAX([dt_QCflags].[RuleValue]) AS Max_RuleValue
     FROM
        dbo.dt_QCflags
     WHERE
        dt_QCflags.fkMeasureID = @MeasureID
     GROUP BY
        fkMeasureID);
END
And here is the trigger on the child table:
ALTER TRIGGER [dbo].[UpdateQARTOD]
ON [dbo].[dt_QCflags]
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
UPDATE factMeasures
SET QARTOD = dbo.MaxQC(MeasureID) -- by QARTOD Definition, QARTOD flag is set to the MAX of all sub-test results
END
So what I want is for the column in the parent (factMeasures.QARTOD) to always contain the maximum of the column in the child table (dt_QCflags.RuleValue) for the given MeasureID value.
When I debug this, it seems to be running the trigger for EVERY record in the parent table, so I think I need to modify the trigger, but I'm not sure how to get the MeasureID of JUST the record that was added/deleted/modified.
I'm guessing it has something to do with the "magic tables" (inserted, deleted, etc.) but I can't seem to get the syntax right.
Thanks!
I would argue that unless you have a very good reason, storing values that can easily be computed at query time is a mistake.
This seems like one of many cases I've seen where people think they gain something by storing values in one table that are calculated from values in another table, when in fact the opposite is true: now you have two pieces of data that need to be synchronized at all times, and since the process synchronizing them is a trigger, you don't really have control over that. It's quite easy to disable/enable triggers, for instance.
Therefore, my advice would be to remove that trigger altogether and simply calculate the value when you need it.
Please note that SQL Server supports MAX() OVER (PARTITION BY ...), meaning you don't even need a GROUP BY if you want to calculate the max of a column.
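For instance, a minimal sketch of computing the value at query time with the window function (table and column names taken from the question's schema):

-- Computes QARTOD on the fly; note this returns one row per QC flag,
-- each carrying the measure-wide maximum.
SELECT m.MeasureID,
       m.Param,
       m.Value,
       MAX(q.RuleValue) OVER (PARTITION BY m.MeasureID) AS QARTOD
FROM dbo.factMeasures AS m
JOIN dbo.dt_QCflags AS q
    ON q.fkMeasureID = m.MeasureID;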
Update: following your comments on this answer, it seems you do have a good reason to store these values.
Having said all that, here's a direct answer to the question you've asked.
In SQL Server triggers, the database enables you to query two special tables called inserted and deleted. These tables contain the data that was (or is going to be, in the case of INSTEAD OF triggers) inserted into or deleted from the table on which the trigger is declared.
Please note that in SQL Server, triggers are fired per statement, not per row. This means that the inserted and deleted tables might contain zero, one, or many rows.
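For example, this single statement (with hypothetical values) fires an AFTER INSERT trigger exactly once, and inserted will contain both rows:

-- One statement, one trigger firing, two rows in the inserted pseudo-table.
INSERT INTO dbo.dt_QCflags (fkMeasureID, RuleValue)
VALUES (1, 3), (1, 4);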
If you still want to calculate the value using triggers, I would advise a trigger for insert/update and another trigger for deletes.
This would make for much simpler code.
In the delete trigger, you LEFT JOIN the deleted table to the physical table, so that a measure whose last flag was deleted still gets its QARTOD updated (to NULL):
UPDATE T
SET QARTOD = D.MaxValue
FROM factMeasures AS T
JOIN
(
    SELECT d.fkMeasureID, MAX(t.RuleValue) AS MaxValue
    FROM deleted AS d
    LEFT JOIN dt_QCflags AS t
        -- Join on the foreign key, not QC_ID: the deleted rows are already
        -- gone from the table, so we want the max of the surviving siblings.
        ON t.fkMeasureID = d.fkMeasureID
    GROUP BY d.fkMeasureID
) AS D
    ON T.MeasureID = D.fkMeasureID
In the insert/update trigger, you write very similar code, but you don't need to refer to the physical table in this case, only the inserted table:
UPDATE T
SET QARTOD = I.MaxValue
FROM factMeasures AS T
JOIN
(
    SELECT fkMeasureID, MAX(RuleValue) AS MaxValue
    FROM inserted
    GROUP BY fkMeasureID
) AS I
    ON T.MeasureID = I.fkMeasureID
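For completeness, a sketch of how the insert/update version might be wrapped up as a trigger (the trigger name here is my own; SET NOCOUNT ON just suppresses the extra rowcount messages):

CREATE TRIGGER [dbo].[trg_QCflags_InsUpd]
ON [dbo].[dt_QCflags]
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Recompute QARTOD only for the measures touched by this statement.
    UPDATE T
    SET QARTOD = I.MaxValue
    FROM factMeasures AS T
    JOIN
    (
        SELECT fkMeasureID, MAX(RuleValue) AS MaxValue
        FROM inserted
        GROUP BY fkMeasureID
    ) AS I
        ON T.MeasureID = I.fkMeasureID;
END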

Which index is used to answer aggregates when we have several indexes?

I have a table which is partitioned on a daily basis; each partition has a primary key and several other indexes on columns which are not null. If I get the query plan for the following:
SELECT COUNT(*) FROM parent_table;
I can see that different indexes are used: sometimes the primary key index and sometimes others. How is Postgres able to decide which index to use? Note that my table is not clustered and has never been clustered before. Also, the primary key is serial.
What are the catalog/statistics tables which are used to make this decision?
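One place to start, assuming the cost-based planner is simply picking the cheapest index to scan: per-relation statistics live in pg_class (relpages, reltuples, relallvisible) and per-column statistics in the pg_stats view, both populated by ANALYZE. A sketch for inspecting the sizes the planner weighs for each index on the table (or substitute a specific daily partition):

-- Smaller relpages generally means a cheaper scan for COUNT(*).
SELECT c.relname AS index_name,
       c.relpages,
       c.reltuples
FROM pg_class AS c
JOIN pg_index AS i ON i.indexrelid = c.oid
WHERE i.indrelid = 'parent_table'::regclass;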

SQL Server index performance

I'm interested in databases and have started to play around with SQL Server 2008. I've read that using appropriate indexes on tables can help to improve the overall performance of the database.
I have two tables and auto-generated 1 million rows in each table using SQL Data Generator. Table 1 is a customers table and Table 2 is a renters table; the designs are as follows:
Customer:
    CustomerID (PK)
    ForeName (non-clustered index)
    SurName
    Email

Renters:
    RentersID (PK)
    StartDate
    EndDate
    RentalNights
    CustomerID (FK) (non-clustered index)
I've read that placing a non-clustered index on the most commonly used columns, as well as on foreign key columns, will help to improve performance. I created a simple join query before and after adding the indexes, but I can't see any increased performance when using them. Can anybody help me out? The images below are the execution plans before and after adding the indexes.
Before Indexes:
After Indexes:
EDIT:
This is the SQL syntax I am using:
SELECT cu.ForeName + ' ' + cu.SurName AS 'Name'
FROM dbo.Customers cu
INNER JOIN dbo.Renters re ON re.CustomerID = cu.CustomerID
WHERE cu.ForeName = 'Daniel'
EDIT:
This is my index syntax, using the indexes posted in the reply below:
CREATE NONCLUSTERED INDEX [ix_Customer] ON [dbo].[Customers]
(
[ForeName] ASC,
[CustomerID] ASC
)
INCLUDE ( [SurName]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Based on your query, the best nonclustered indexes to build would be:
CREATE NONCLUSTERED INDEX ix_IndexA on dbo.Customers (Forename, CustomerID)
INCLUDE (SurName)
CREATE NONCLUSTERED INDEX ix_IndexB on dbo.Renters (CustomerID)
You want your key fields to be your filter or JOIN columns, and your INCLUDE columns are stored at the leaf level so they can be returned in the SELECT without an extra lookup.
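With a million generated rows, the graphical plans may look similar even when the work changes; comparing logical reads before and after creating the indexes makes the difference visible. A sketch using the query from the question:

SET STATISTICS IO ON;

SELECT cu.ForeName + ' ' + cu.SurName AS 'Name'
FROM dbo.Customers cu
INNER JOIN dbo.Renters re ON re.CustomerID = cu.CustomerID
WHERE cu.ForeName = 'Daniel';
-- Check the Messages tab: fewer logical reads after adding the indexes means less I/O.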

Doubt about clustered and non-clustered indexes

I have a doubt: if my table doesn't have any constraint like a primary key, foreign key, unique key, etc., can I still create a clustered index on the table, and can the clustered index have duplicate records?
My second question is: where exactly should we use a non-clustered index, and when is it useful and beneficial to create one on a table?
My third question is: how can we create the 249 non-clustered indexes in a table? Does that mean creating a non-clustered index on 249 columns?
Can anyone help me clear up my confusion about this?
First, the definition of a clustered index is that it is the physical ordering of the data on disk. Every time you do an insert into that table, the new record will be placed on the physical disk in order, based on its value in the clustered index column. Because it is the physical location on the disk, it is (A) the most rapidly accessible column in the table but (B) only possible to define a single clustered index per table. Which column (or columns) you use as the clustered index depends on the data itself and its use. Primary keys are typically the clustered index, especially if the primary key is sequential (e.g. an integer that increments automatically with each insert). This will provide the fastest insert/update/delete functionality. If you are more interested in performing reads (SELECT * FROM table), you may want to cluster on a date column, as most queries have a date in the WHERE clause, the GROUP BY clause, or both.
Second, clustered indexes (at least in the DBs I know) need not be unique (they CAN have duplicates). Constraining the column to be unique is a separate matter. If the clustered index is a primary key, its uniqueness is a function of being a primary key.
Third, I can't follow your question concerning 249 columns. A non-clustered index is basically a tool for accelerating queries at the expense of extra disk space. It's hard to think of a case where creating an index on each column is necessary. If you want a quick rule of thumb (a short example follows the list)...
Write a query using your table.
If a column is required to do a join, index it.
If a column is used in a WHERE clause, index it.
Remember, all the indexes are doing for you is speeding up your queries. If your queries run fast, don't worry about them.
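A hypothetical illustration of that rule of thumb (the table, column, and index names here are made up):

-- The query joins on CustomerID and filters on OrderDate...
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= '20230101';

-- ...so index the JOIN column and the WHERE column:
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID ON dbo.Orders (CustomerID);
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate ON dbo.Orders (OrderDate);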
This is just a thumbnail sketch of a large topic. There are tons of more informative and comprehensive resources on this matter, some of which depend on the database system, so just Google it.