multi-pack-index

The Git object directory contains a pack directory containing packfiles (with suffix ".pack") and pack-indexes (with suffix ".idx"). The pack-indexes provide a way to look up objects and navigate to their offset within the pack, but these must come in pairs with the packfiles. This pairing depends on the file names, as the pack-index differs only in suffix from its packfile.

While the pack-indexes provide fast lookup per packfile, this performance degrades as the number of packfiles increases, because resolving an abbreviated object ID requires inspecting every packfile, and a lookup is more likely to miss on the most-recently-used packfile. For some large repositories, repacking into a single packfile is not feasible due to storage space or excessive repack times.
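The name-based pairing described above can be sketched in a few lines of Python. This is only an illustration, not Git's implementation: the function name pair_packs and the sample file names are hypothetical, and the only rule taken from the text is that a pack-index and its packfile differ in suffix alone.

```python
def pair_packs(file_names):
    """Pair each pack-index with its packfile by swapping the suffix.

    Hypothetical helper for illustration; only the ".idx"/".pack"
    suffix rule comes from the text above.
    """
    names = set(file_names)
    pairs = []
    for name in sorted(names):
        if name.endswith(".idx"):
            # The packfile shares the base name and differs only in suffix.
            pack = name[: -len(".idx")] + ".pack"
            if pack in names:
                pairs.append((pack, name))
    return pairs
```

A pack-index whose packfile is missing yields no pair, which reflects the requirement that the two files come together.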

The multi-pack-index (MIDX for short) stores a list of objects and their offsets into multiple packfiles. It contains:

Thus, we can provide O(log N) lookup time for any number of packfiles.
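One way to picture the logarithmic lookup is a single sorted list of object IDs covering all packfiles, searched by bisection. The sketch below is a toy in-memory model under stated assumptions: the names build_midx and midx_lookup, the tuple layout, and the rule for objects present in several packs (keep the copy from the first pack in name order) are all chosen for illustration and do not describe the on-disk format or Git's own code.

```python
from bisect import bisect_left

def build_midx(packs):
    """Build a toy multi-pack index.

    packs maps a pack name to a dict of object ID -> offset.
    All names and the layout here are assumptions for illustration.
    """
    pack_names = sorted(packs)
    entries = {}
    for pack_id, name in enumerate(pack_names):
        for oid, offset in packs[name].items():
            # Assumption: when an object is in several packs, keep the
            # copy from the first pack in name order.
            entries.setdefault(oid, (pack_id, offset))
    oids = sorted(entries)
    meta = [entries[oid] for oid in oids]
    return pack_names, oids, meta

def midx_lookup(midx, oid):
    """Return (pack name, offset) for oid, or None if it is absent."""
    pack_names, oids, meta = midx
    # One binary search over all objects, regardless of pack count.
    i = bisect_left(oids, oid)
    if i < len(oids) and oids[i] == oid:
        pack_id, offset = meta[i]
        return pack_names[pack_id], offset
    return None
```

In this model a lookup costs one binary search over the combined object list, so the number of packfiles no longer multiplies the work the way a pack-by-pack search would.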

Design details

Future work

[0] https://bugs.chromium.org/p/git/issues/detail?id=6 Chromium work item for: Multi-Pack Index (MIDX)

[1] https://lore.kernel.org/git/20180107181459.222909-1-dstolee@microsoft.com/ An earlier RFC for the multi-pack-index feature

[2] https://lore.kernel.org/git/alpine.DEB.2.20.1803091557510.23109@alexmv-linux/ Git Merge 2018 Contributor’s summit notes (includes discussion of MIDX)
