This header-only library contains a highly tuned (uncompressed) bit vector implementation with multiple space efficient rank and select support data structures. Our fastest rank and select support has a space overhead of only ~3.51% and makes use of data level parallelism via SIMD instructions.
This header-only library contains a highly tuned (uncompressed) bit vector implementation with multiple space efficient rank and select support data structures.
Our fastest rank and select support has a space overhead of only ~3.51% and makes use of data level parallelism via SIMD instructions.
If you use this code in a scientific context, please cite our paper.
@inproceedings{Kurpicz2022CompactRankSelect,
author = {Florian Kurpicz},
title = {Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors},
booktitle = {{SPIRE}},
series = {Lecture Notes in Computer Science},
volume = {13617},
pages = {257--272},
publisher = {Springer},
year = {2022},
doi = {10.1007/978-3-031-20643-6\_19}
}
This repository contains the following algorithms and data structures.
Our documentation contains in-depth on the usage of all these algorithms and data structures including easy to follow examples.
You can find the example in the screenshot below as text, too.
Bit vectors play an important role in many compressed text indices, e.g., the FM-index.
This repository contains the following bit vector implementations:
Dong Zhou and David G. Andersen and Michael Kaminsky,
Space-Efficient, High-Performance Rank and Select Structures on Uncompressed Bit Sequences,
SEA 2013.
Since this is a header-only library, you have to simply add it to your projects include path to use it.
A small example can be found below.
We refer to the documentation for more information.
#include <pasta/bit_vector/bit_vector.hpp>
#include <pasta/bit_vector/support/flat_rank_select.hpp>
// Create a bit vector of size 1000 containing only zeros and flip every other bit.
pasta::BitVector bv(1000, 0);
for (size_t i = 0; i < bv.size(); ++i) {
if (i % 2 == 0) { bv[i] = 1; }
}
// Print the bit vector to see that everything worked ;-)
for (auto it = bv.begin(); it != bv.end(); ++it) {
std::cout << ((*it == true) ? '1' : '0');
}
std::cout << std::endl;
// Create rank and select support and print the result of some queries.
pasta::FlatRankSelect rs(bv);
std::cout << rs.rank0(250) << ", " << rs.rank1(250)
<< ", "
<< rs.select0(250) << ", " << rs.rank1(250)
<< std::endl;
There exist an easy to use benchmark, which helps to compare the implementations in this repository.
To build the benchmark, run the CMake command with -DPASTA_BIT_VECTOR_BUILD_BENCHMARKS=On
.
Our tests are contained in the folder tests.
To build the tests, run the CMake command with -DPASTA_BIT_VECTOR_BUILD_TESTS=On
.
We also conducted an extensive experimental evaluation.
To this end, we use our rank and select benchmark where we compare our implementations with many other compact rank and select data structures.
We refer to our paper for a full description of the results, i.e., hardware, inputs, and competitors.
Below, you can find some of the figures we present in the paper.
Below, we list all commands that are required to build the code in this repository.
To this end, we provide three CMake presets (debug, release, and release with debug information).
debug
folder and uses the compiler flags -DDEBUG -O0 -g -ggdb -fsanitize=address
.build
folder and uses the compiler flags -DNDEBUG -march=native -O3
.build_with_debug_info
folder and uses the compiler flags -DDEBUG -g -march=native -O3
.Per default, we use the following compiler flags: -Wall -Wextra -Wpedantic -fdiagnostics-color=always
.
pasta::bit_vector is written in C++20 and requires a compiler that supports it.
We use Ninja as build system.
For more information on how to use this library, please refer to our documentation.
To just clone the source code, use the following.
git clone git@github.com:pasta-toolbox/bit_vector
cd bit_vector
git submodule update --init --recursive
If you also want to build the test, please continue with the following commands.
cmake --preset=[debug|build|relwithdeb]-DPASTA_BIT_VECTOR_BUILD_TESTS=On
cmake --build --preset=[debug|release|relwithdeb]
ctest --test-dir [debug|build|relwithdeb]
A Helmholtz Pilot Program