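Rebuilding an index is the most direct way to compact an SQLite index that has become fragmented. On its own it does not shrink the database file: the pages it frees go onto SQLite’s freelist for reuse, and returning them to the filesystem takes a VACUUM.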
Let’s see it in action. Imagine a table users with a primary key id and an index on email.
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    name TEXT
);
CREATE INDEX idx_email ON users (email);
-- Populate with some data
INSERT INTO users (email, name) VALUES
    ('alice@example.com', 'Alice'),
    ('bob@example.com', 'Bob'),
    ('charlie@example.com', 'Charlie');
-- Now, let's simulate some churn: delete and insert
DELETE FROM users WHERE email = 'bob@example.com';
INSERT INTO users (email, name) VALUES ('david@example.com', 'David');
INSERT INTO users (email, name) VALUES ('eve@example.com', 'Eve');
DELETE FROM users WHERE email = 'alice@example.com';
INSERT INTO users (email, name) VALUES ('frank@example.com', 'Frank');
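After these operations, the idx_email index might have unused space within its B-tree structure. With only a handful of rows the effect is negligible; it becomes visible on tables with thousands of rows and sustained churn. This unused space, or fragmentation, arises because SQLite doesn’t aggressively rearrange data when rows are deleted or updated. A deletion leaves free space inside the affected B-tree page, and a page that empties completely is moved to the database’s freelist for later reuse, but the file itself does not shrink and the surviving pages are not necessarily repacked into a compact, contiguous layout. Over time, especially with frequent writes and deletes, an index can occupy significantly more pages than its logical content requires.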
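To make the effect visible, here is a minimal sketch using Python’s standard sqlite3 module. The row count, the generated email addresses, and the helper name churn_demo are assumptions made for illustration; the schema is the one from above. It deletes most of the rows and then reads PRAGMA page_count and PRAGMA freelist_count.

```python
import sqlite3


def churn_demo(rows=5000):
    """Return (pages_before, pages_after, freelist_after) for a delete-heavy run.

    Hypothetical helper: the row count and generated emails are illustrative only.
    """
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    # Assumption: auto_vacuum is off, which is SQLite's usual default.
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT)"
    )
    conn.execute("CREATE INDEX idx_email ON users (email)")

    # Bulk-load inside one explicit transaction.
    conn.execute("BEGIN")
    conn.executemany(
        "INSERT INTO users (email, name) VALUES (?, ?)",
        ((f"user{i}@example.com", f"User {i}") for i in range(rows)),
    )
    conn.execute("COMMIT")
    pages_before = conn.execute("PRAGMA page_count").fetchone()[0]

    # Delete roughly 90% of the rows.
    conn.execute("DELETE FROM users WHERE id <= ?", (rows * 9 // 10,))
    pages_after = conn.execute("PRAGMA page_count").fetchone()[0]
    freelist_after = conn.execute("PRAGMA freelist_count").fetchone()[0]
    conn.close()
    return pages_before, pages_after, freelist_after


before, after, free = churn_demo()
print(f"pages before delete: {before}, after: {after}, on freelist: {free}")
```

The page count does not drop after the delete, while the freelist grows: the space is reusable by SQLite but has not been handed back to the filesystem.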
Before looking for bloat, run PRAGMA integrity_check. It doesn’t report fragmentation, but it’s your first stop to ensure the database isn’t fundamentally corrupted.
PRAGMA integrity_check;
If integrity_check returns ok, your database is structurally sound. The next step is to identify the indexes on the table. PRAGMA index_list and PRAGMA index_info describe them, although neither reports their size.
PRAGMA index_list(users);
This returns one row per index on the table, including its name and a flag indicating whether it is UNIQUE. Then, for a specific index, say idx_email:
PRAGMA index_info(idx_email);
This lists the columns the index covers, not its size. To actually measure the bloat, you compare the number of pages the index occupies with the amount of data it logically contains. SQLite doesn’t expose a direct "fragmentation percentage" command, and PRAGMA page_count reports the page count of the whole database file, not of a single index. Per-index page usage is available from the dbstat virtual table (in builds compiled with SQLITE_ENABLE_DBSTAT_VTAB) or from the sqlite3_analyzer utility. A common heuristic is to rebuild an index when its page count is disproportionately large for the number of rows in the table. A more practical signal is performance degradation on queries that rely heavily on that index: when they become noticeably slower, especially on large tables, index bloat is a plausible suspect, though not the only one.
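As a rough sketch of such a measurement, the following Python snippet asks the dbstat virtual table how many pages an index occupies and how many bytes inside them are unused. It assumes the SQLite build behind Python’s sqlite3 module may or may not include dbstat, so the hypothetical helper index_page_usage returns None when it is missing; the sample data and temporary file are also illustrative.

```python
import os
import sqlite3
import tempfile


def index_page_usage(conn, index_name):
    """Return page statistics for an index, or None if dbstat is unavailable."""
    try:
        row = conn.execute(
            "SELECT count(*), sum(pgsize), sum(unused) FROM dbstat WHERE name = ?",
            (index_name,),
        ).fetchone()
    except sqlite3.OperationalError:
        # Typically "no such table: dbstat" when the build lacks the module.
        return None
    if row[0] == 0:
        return None  # no pages found under that name
    return {"pages": row[0], "bytes": row[1], "unused": row[2]}


def demo():
    """Build a small throwaway database and measure idx_email."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "demo.db"), isolation_level=None)
        conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT)"
        )
        conn.execute("CREATE INDEX idx_email ON users (email)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO users (email, name) VALUES (?, ?)",
            ((f"user{i}@example.com", f"User {i}") for i in range(2000)),
        )
        conn.execute("COMMIT")
        usage = index_page_usage(conn, "idx_email")
        conn.close()
        return usage


usage = demo()
if usage is None:
    print("dbstat is not available in this SQLite build")
else:
    print(f"idx_email: {usage['pages']} pages, {usage['unused']} of {usage['bytes']} bytes unused")
```

A high share of unused bytes relative to total bytes is one indication that a rebuild may be worthwhile; what counts as "high" depends on the workload.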
The solution is to rebuild the index. You can do this by dropping and recreating it, or more elegantly using REINDEX.
Option 1: Drop and Recreate
DROP INDEX idx_email;
CREATE INDEX idx_email ON users (email);
This completely removes the old index structure and builds a new, compact one from scratch based on the current table data. The pages of the old index are returned to the freelist, and the new index occupies only the pages necessary for its current entries. The database file itself does not get smaller until you run VACUUM, unless auto_vacuum is enabled.
Option 2: REINDEX
REINDEX idx_email;
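The REINDEX command tells SQLite to delete and recreate the specified index from scratch. The result is equivalent to dropping and recreating it, but you don’t have to repeat the index definition. REINDEX users rebuilds every index on the table, including the ones SQLite creates automatically for UNIQUE and PRIMARY KEY constraints, which cannot be removed with DROP INDEX. Either way, SQLite rereads all the indexed data and reconstructs the index’s B-tree structure, eliminating any fragmentation.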
Both methods achieve the same goal: a defragmented, space-efficient index.
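The two options can be sketched side by side from Python. The helper names rebuild_index and uses_index are hypothetical, and wrapping the drop-and-recreate variant in a transaction is a design choice of this sketch (so the table is never left without its index if the CREATE fails), not something the steps above require.

```python
import sqlite3


def rebuild_index(conn, name, create_sql=None):
    """Rebuild with REINDEX, or drop and recreate when create_sql is given."""
    if create_sql is None:
        conn.execute(f'REINDEX "{name}"')
        return
    conn.execute("BEGIN")
    try:
        conn.execute(f'DROP INDEX "{name}"')
        conn.execute(create_sql)
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # restore the old index on failure
        raise


def uses_index(conn, name):
    """Check whether an email lookup is planned through the named index."""
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM users WHERE email = ?", ("x",)
    ).fetchall()
    return any(name in row[-1] for row in plan)  # last column holds the detail text


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT)")
conn.execute("CREATE INDEX idx_email ON users (email)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [("charlie@example.com", "Charlie"), ("david@example.com", "David")],
)

rebuild_index(conn, "idx_email")  # Option 2: REINDEX
rebuild_index(conn, "idx_email", "CREATE INDEX idx_email ON users (email)")  # Option 1

print(uses_index(conn, "idx_email"))
print(conn.execute("PRAGMA integrity_check").fetchone()[0])
```

After either rebuild the index has the same name and columns, so existing queries continue to use it without changes.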
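After rebuilding, you can re-run your slow queries. If performance improves, the bloat was likely the culprit. Note that the size of the database file will not change after the rebuild alone: compare PRAGMA freelist_count before and after to see how many pages were released, then run VACUUM if you want the file itself to shrink.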
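A minimal sketch of that check, assuming a throwaway database file, auto_vacuum left off, and illustrative row counts (the helper name rebuild_and_vacuum is hypothetical). It records the file size and PRAGMA freelist_count after the deletes, after REINDEX, and after a VACUUM, which is the step that returns free pages to the filesystem.

```python
import os
import sqlite3
import tempfile


def rebuild_and_vacuum(rows=5000):
    """Return file-size and freelist figures around REINDEX and VACUUM."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
        conn.execute("PRAGMA auto_vacuum = NONE")  # assumption: the usual default
        conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT)"
        )
        conn.execute("CREATE INDEX idx_email ON users (email)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO users (email, name) VALUES (?, ?)",
            ((f"user{i}@example.com", f"User {i}") for i in range(rows)),
        )
        conn.execute("COMMIT")
        conn.execute("DELETE FROM users WHERE id <= ?", (rows * 9 // 10,))

        def freelist():
            return conn.execute("PRAGMA freelist_count").fetchone()[0]

        stats = {
            "size_after_delete": os.path.getsize(path),
            "freelist_after_delete": freelist(),
        }
        conn.execute("REINDEX idx_email")
        stats["size_after_reindex"] = os.path.getsize(path)
        stats["freelist_after_reindex"] = freelist()

        conn.execute("VACUUM")  # must run outside a transaction
        stats["size_after_vacuum"] = os.path.getsize(path)
        stats["freelist_after_vacuum"] = freelist()
        conn.close()
        return stats


for key, value in rebuild_and_vacuum().items():
    print(f"{key}: {value}")
```

In this sketch the file only gets smaller at the VACUUM step; the rebuild changes how many pages sit on the freelist, not the size on disk.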
The next error you’re likely to hit is a constraint violation: if the table already contains rows that violate a UNIQUE index, rebuilding that index can fail until the offending rows are fixed.