Skip to content

Optimize OFFSET on indexed table. #8878

Open
@nicktobey

Description

It should be possible to efficiently access an offset into an indexed table access.

For example, this query:

SELECT * FROM tbl WHERE pk > 1234 LIMIT 4000 OFFSET 2000

results in this plan:

+-----------------------------------+
| plan                              |
+-----------------------------------+
| Limit(4000)                       |
|  └─ Offset(2000)                  |
|      └─ IndexedTableAccess(tbl)   |
|          ├─ index: [tbl.pk]       |
|          ├─ filters: [{(1234, ∞)}]|
|          └─ columns: [pk]         |
+-----------------------------------+

This is a common pattern used in paginated results.

Currently, the engine implements this query by seeking to the beginning of the range, and then discarding the first 2000 values it pulls from the iterator. However, this we can do better: because every node in a Dolt prolly tree stores the total number of rows of each of it's children nodes (the "TreeCount"), it's possible to quickly advance the cursor by a given number of positions, by skipping entire tree nodes if none of the node's children are within range.

This is similar to, but distinct from #8779. Both are optimizations leveraging the fact that every node stores size information for its children. However, since this structure isn't visible to GMS, Dolt needs to add its own analysis pass, or there otherwise needs to be an interface that the Dolt table can implement to make the optimization possible.

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions