I recently created my own SQLite database to do a one-off analysis on a special project. My database was pretty simple and had a couple of very large tables that consisted of millions of rows.
When I tried to join my tables together, the operation took almost an entire day. My query was really simple so I didn’t really understand why that would be happening. Up until this point, I had only really interacted with a database by reading data. I didn’t really know all the nuts and bolts behind the scenes. This is when I learned that Indexing can dramatically improve the performance of your query.
What is Indexing?
Indexing, simply put, helps organize data in a table to make lookups much more efficient. It’s like having a lot of bookmarks in a book. It’s much easier to use a bookmark to find a specific passage than to start from the beginning of thee book and read each word to find the location. This is why my query was so slow. Since the table wasn’t indexed, each item in my table was being compared to each row in the target table row by row.
For more information, check out this article which has a much better explanation that what I can provide.
When to use Indexing
The basic rule of thumb is this – the more indexing the better unless your data changes constantly.
Indexing takes up a lot of memory so creating a ton of indexing can dramatically slow down the performance of your database if your code is constantly writing to the database. This is because each time there are changes to the table, the new data has to be indexed.
If your data is static, then index as much as you can. It will greatly decrease the execution time of your queries.
Single Column Indexes
A single column index is when only one column is used in the index. This could simply be the user_id
column in a table, for example. This type of index is used when your join statement only references a single column. An example of that would look like this:
SELECT *
FROM table_1
INNER JOIN table_2 ON table_2.user_id = table_1.user_id;
Composite Indexes
A composite index contains multiple columns. Any join operation that uses multiple columns benefits tremendously from having a composite index. An example query would look like this:
SELECT *
FROM table_1
INNER JOIN table_2 ON table_2.user_id = table_1.user_id AND table_2.company_id = table_1.company_id AND table_2.city = table_1.city;
Following the ON
operator, you’ll see three columns that need to match. You could create single indexes for each column, but a composite index will be much more efficient.
How to Create an Index in SQLite
I will be using DB Browser for SQLite in the screenshots that follow.
On the Database Structure tab, click the Create Index button.
In the next pop up window, you have to name your index. The name doesn’t really matter and it’s not referenced in your queries. You will only see it in the Database Structure tab.
Then you can select one or many columns to use as the index. Add the column to the right box by using the arrow buttons. You can change the order of your columns in the right box by clicking on the column and using the up and down arrows.
The order of the columns has to match the order of your join query for the index to work properly. So the query would look something like the below to match the screenshot above.
SELECT *
FROM table_1
INNER JOIN table_2 ON table_2.field1 = table_1.field1 AND table_2.A = table_1.A AND table_2.B = table_1.B;
An index only has to be created on the table being joined. So that means that you don’t need to create the exact same index on the table specified in the FROM
statement of your query.
This was a pretty quick introduction to indexing and there is a lot more to learn, especially as your database grows in complexity.