A smart way towards Indexes in SQL
Indexes are special lookup tables that the database search engine use to speed up data retrieval. This article focuses on how SQL Server uses indexes to read and write data. Data is arranged in SQL Server in the form of extents and pages. Each extent is of size 64 KB, having 8 pages of 8 KB sizes. An extent may have data from multiple or same table, but each page holds data from a single table only. Logically, data is stored in record sets in tables. A table is nothing but a collection of record sets; by default, rows are stored in the form of heaps unless a clustered index has been defined on the table, in which case, record sets are sorted and stored on the clustered index. The heaps structure is a simple arrangement where the inserted record is stored in the next available space on the table page.
Indexes are arranged in the form of a B-Tree where the leaf node holds the data or a pointer to the data. Since the stored data is in a sorted order, indexes precisely know which record is sitting where. Hence an index optimizes and enhances the data retrieval immensely.
We have two types of Indexes in SQL Server:-
1- Clustered indexes
2- Non-Clustered indexes
Clustered Indexes
A clustered index is something that reorganizes the way records in the table are physically stored. Therefore a table can have only one clustered index. The leaf nodes of a clustered index contain the data pages, by which I mean the key-value pair in the clustered index has the index key and the actual data value. Also remember, a clustered index will be created on a table by default the moment a primary key is created on the table.
Syntax for creating the clustered index is:-
CREATE CLUSTERED INDEX CI_ID ON Sales(ID)
Non Clustered Indexes
A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non-clustered index does not consist of the data pages but a pointer to it.
Important points about Indexes
1- SQL Server allows at most one clustered index in any version. As far as non-clustered indexes are concerned, Version 2005 allows 249 of them to be created while version 2008 allows 999 non-clustered indexes.
2- When we create a primary key, by default clustered index is created on that column.
3- We can use DTA to know the recommended index.
Interview Question
Q:- Once we declare a primary key, a clustered index is created on the column by default; what if I wish to create a clustered index and a primary key on two different columns? Is it possible?
A:- It is very much possible to have two different columns as primary key and clustered indexes. But remember, if I create a Primary Key on a table first, a CI will also be created. Now, in case I need them on two different columns, drop the Primary Key constraint and the CI shall automatically vanish. Now create a CI on column A and declare column B as Primary Key, and column B will have a NCI created by default on it instead of a CI. This way, we can have two columns as Primary Key and CI declared on them.
Q:- What are the disadvantages of index?
A:- 1-Use of intexes slow down Data modification operations (such as INSERT, UPDATE, DELETE).
2- Every time data changes in the table, all the indexes need to be updated.
3- Indexes need disk space, the more indexes you have, more disk space is used.
Q:- Explain the difference between clustered and non-clustered index.
A:- A clustered index reorders the way records are stored. A non clustered index is in which the logical order of the index does not match the physical stored order of the rows on disk. A clustered index is must faster because the index entries are actually data records. There can be just one clustered index per table while there can be up to 249 non clustered indexes
Q:- Define Clustered and Non-Clustered Index
A:- Clustered index exists as sorted row on disk.
Clustered index re-orders the table record.
Clustered index contains record in the leaf level of the B-tree.
There can be only one Clustered index possible in a table.
Non-clustered is the index in which logical order doesn’t match with physical order of stored data on disk.
Non-clustered index contains index key to the table records in the leaf level.
There can be one or more Non-clustered indexes in a table.
Q:- What is Unique Index?
A:- Unique index is the index that is applied to any column of unique value.
A unique index can also be applied to a group of columns.
Q:- Explain the 'Fillfactor' concept in Indexes.
A:- The fill factor option is provided for smoothening index data storage and performance.
The percentage of space on each leaf level page to be filled with data is determined by the fill factor value When an index is created. This reserves a percentage of free space for future growth
Q:- What is it unwise to create wide clustered index keys?
A:- A clustered index is a good choice for searching over a range of values. After an indexed row is found, the remaining rows being adjacent to it can be found easily. However, using wide keys with clustered indexes is not wise because these keys are also used by the non-clustered indexes for look ups and are also stored in every non-clustered index leaf entry
Q:- What is full-text indexing?
A:- Full text indexes are stored in the file system and are administered through the database.
Only one full-text index is allowed for one table. They are grouped within the same database in full-text catalogs and are created, managed and dropped using wizards or stored procedures
Q:- What are the different types of indexes?
A:-
- Clustered: It sorts and stores the data row of the table or view in order based on the index key.
- Non clustered: it can be defined on a table or view with clustered index or on a heap. Each row contains the key and row locator.
- Unique: ensures that the index key is unique
- Spatial: These indexes are usually used for spatial objects of geometry
- Filtered: It is an optimized non clustered index used for covering queries of well defined data