01/22/2020
We've started experimenting with Multi-Range Read (MRR) in MyRocks.
The MySQL optimizer can pass multiple ranges to a storage engine, but the default implementation in various storage engines just runs a nested loop of single-range gets. Passing multiple ranges was initially a big deal for NDB (MySQL Cluster).
If we pass multiple ranges, we indicate up front the larger batch of operations we are about to do. Index dives have a higher setup cost in MyRocks (a dive may have to read index blocks and bloom filter blocks at multiple levels of data, instead of doing a single B-Tree dive), so that kind of batching can be very useful.
We added a multi-get interface to RocksDB (and Jens Axboe already contributed io_uring support for it), so by plugging it through we can have a single query issue multiple I/Os in parallel for different ranges.
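To make the difference concrete, here is a toy sketch of a nested loop of single gets next to a batched read that submits every lookup before waiting on any of them. Everything in it is hypothetical: `STORE`, `point_read`, and the thread pool are stand-ins for illustration, and this is not the RocksDB multi-get API or how it schedules I/O.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for storage: key -> row.
STORE = {k: f"row-{k}" for k in range(100)}

def point_read(key):
    """One lookup; stands in for an index dive plus a data block read."""
    return STORE.get(key)

def nested_loop_reads(keys):
    """Default behaviour: one lookup at a time, each waiting for the previous one."""
    return [point_read(k) for k in keys]

def batched_reads(keys, pool):
    """Batched behaviour: submit every lookup first, then collect results in key order."""
    futures = [pool.submit(point_read, k) for k in keys]
    return [f.result() for f in futures]

keys = [5, 42, 7, 1000]  # 1000 is deliberately missing from STORE
with ThreadPoolExecutor(max_workers=4) as pool:
    batched = batched_reads(keys, pool)
```

Both functions return the same rows in the same order; the only thing that changes is how many reads can be in flight at once.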
There are more wins to be had from this kind of batching: testing an index block or bloom filter against multiple keys at once is more cache-efficient, block cache LRU management gets cheaper because some of these blocks are checked out once for the whole batch rather than per individual key dive, and so on.
In production we see multiple concurrent queries hitting the same range (via a secondary index lookup or a join). That kind of workload ends up thrashing on the LRU shard mutex of the index blocks. Checking out a block less frequently by holding onto it during MRR helps us there.
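A rough way to see why holding a block for the whole batch helps is to count how often the cache lock is taken. The sketch below is a toy model under made-up assumptions (`ToyBlockCache`, `BLOCK_SIZE`, and `block_of` are all hypothetical, and a real block cache is sharded and far more involved); it only counts lock acquisitions for per-key checkouts versus one checkout per block per batch.

```python
import threading
from collections import defaultdict

BLOCK_SIZE = 16  # hypothetical: number of keys covered by one index block

class ToyBlockCache:
    """Toy cache shard: every checkout takes the shard lock once."""
    def __init__(self):
        self._lock = threading.Lock()
        self.lock_acquisitions = 0

    def checkout(self, block_id):
        with self._lock:
            self.lock_acquisitions += 1
            return block_id  # a real cache would return a pinned block handle

def block_of(key):
    return key // BLOCK_SIZE

def lookup_per_key(cache, keys):
    """Single-key dives: the block is checked out again for every key."""
    for k in keys:
        cache.checkout(block_of(k))

def lookup_batched(cache, keys):
    """Batched dive: group keys by block and check each block out once."""
    by_block = defaultdict(list)
    for k in keys:
        by_block[block_of(k)].append(k)
    for block_id in by_block:
        cache.checkout(block_id)
        # every key in by_block[block_id] would be probed here while the block is held

keys = list(range(64))
per_key_cache, batched_cache = ToyBlockCache(), ToyBlockCache()
lookup_per_key(per_key_cache, keys)
lookup_batched(batched_cache, keys)
```

With 64 consecutive keys and 16 keys per block, the per-key path takes the lock 64 times and the batched path takes it 4 times.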
We have to be careful not to do too much prefetching/reading for queries that have a LIMIT clause, but other than that there are wins all over the place.
Initially we miscalculated our limits a bit, and it was possible to pin your whole buffer pool to a single query. Adding proper limits, checks for the kill flag, and tuning the optimizer choices are all part of the effort to roll features like this live.
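The kind of guard rails described here can be sketched as a bounded batch reader. This is only an illustration under stated assumptions, not the MyRocks code: `read_in_batches`, `fetch_batch`, `max_batch_size`, `row_limit`, and `is_killed` are hypothetical names. It caps how many keys are fetched (and so pinned) at once, never fetches more keys than a LIMIT still needs, and checks a kill flag between batches.

```python
from itertools import islice

class QueryKilled(Exception):
    """Raised when the (hypothetical) kill flag is seen between batches."""

def read_in_batches(keys, fetch_batch, max_batch_size, row_limit=None,
                    is_killed=lambda: False):
    """Yield rows, fetching at most max_batch_size keys per batch."""
    it = iter(keys)
    produced = 0
    while True:
        if is_killed():
            raise QueryKilled()
        size = max_batch_size
        if row_limit is not None:
            # Do not read ahead past what the LIMIT can still consume.
            size = min(size, row_limit - produced)
        if size <= 0:
            return
        batch = list(islice(it, size))
        if not batch:
            return
        for row in fetch_batch(batch):
            yield row
            produced += 1

batch_sizes = []

def fetch_batch(batch):
    """Hypothetical fetch: records the batch size and echoes the keys as rows."""
    batch_sizes.append(len(batch))
    return batch

rows = list(read_in_batches(range(10), fetch_batch, max_batch_size=4, row_limit=5))
```

With ten keys, a batch cap of 4, and a limit of 5, the reader fetches a batch of 4 and then a batch of 1, and never touches the remaining five keys.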
This work is a collaboration with Sergey Petrunia from MariaDB.
Summary: First implementation.
Supports scans on full primary key.
Cost-based choice whether to use MRR/BKA is not supported (set mrr=on,mrr_cost_based=off).
Pull Request resolved: https://github.com...