Database Indexes : Working, Caution & Implementation

Aditya Rewari
3 min readMay 28, 2021

What are Indexes
- Data structure technique which is used to quickly locate and access the data in a database.
- Implementation provided by database itself.
- Runs in background of database.
- Helps in faster data search

Database cases where Indexes can be applied
Column at WHERE clause :
Any column being frequently used at where clause of multiple queries, is eligible as a good candidate for applying indexes

— Columns used at JOINS :
The columns on which the tables are being joined ON are also a good candidate for indexes.
The reason being, as join operation itself performs internal look ups for producing the join result, having indexes facilitate this look up.

— Sorting :
As the indexes are themselves sorted. i.e., the data structure implementing indexes stores the column values in sorted order. Therefore, performing ASC & DESC ordering on indexes is also faster.
This is another reason, if we need to sort some data on basis Date, then that Date column is a good candidate for indexing.
(An exception to above will be Hash Indexes, as they are not sorted. Covered later in the article)

When NOT to use Indexes
Small Table Size :
If the number of total records in the table is less, then the total response time of database queries will be comparable in both the cases (including index vs without index)

— Columns having repetitive data :
Indexing is not considered efficient in this case.
After narrowing search through indexing, the repeated data set will have to be searched again.
Such cases giving surprising delays. If query would had been on some value which was not repeating, then response time will be quick. And, query on some column value which was repetitive, response time will be comparatively higher. It becomes a difficult debug task identifying such issues.

— Columns that contain a high number of NULL values :
These repetitive NULL values again will fall in above mentioned case.
This case is mentioned explicitly as their can be a scenario in which though the values being distinct, but the column might have a lot of NULL values too.

— Columns having frequent UPSERTS
Columns having frequent updates/inserts are a bad candidate for indexes.
This condition though has to be compromised many times especially if number of READ operations exceed UPSERT operations.

Implicit Indexes
Indexes which are automatically created by the database server are called Implicit Indexes. Applying following constraint on column causes implicit indexes.

  • Primary Key
  • Unique Constraint

Multiple Column Indexes (Composite Indexes)

  • When we include multiple columns under a single Index.
  • Syntax:
    CREATE INDEX index_name ON table_name (col1, col2, col3)
  • index_name is a composite index on col1, col2, col3
  • Composite index follow Left to Right rule
  • Index will be used on following searches
    * Col1
    * Col1 , Col2
    * Col1, Col2, Col3
  • Index will not be used under following search
    * Col2
    * Col3
    * Col2 , Col3

Structure of Index Node
The data structure implementing Index stores the data in following structure :

  • Search Key : stores the column value
  • Data Reference : stores the reference/pointer to disk space, where that particular record is saved.

Data Structures Implementing Indexes

Indexes are actually a programmatic implementations offered by databases. Based on the algorithms used in implementation, there are different types available of indexes.

  • B Tree:
    * This is the default implementation
    * These use B+ Tree for implementation
    * These are sorted in nature
    * search time is O(log n) (as Tree search is performed)
  • Hash Indexes
    * They run a hash algorithm on column value. Based on the hash obtained, a bucket is allotted to it.
    * Search time is O(1)
    * These are NOT sorted

--

--