SicHash

A perfect hash function is a function that has no collisions on a given set. SicHash places objects in a cuckoo hash table and then stores the final hash function choice of each object in a retrieval data structure. Using irregular cuckoo hashing, each object has a different number of hash functions

173 commits | Last commit 1 month ago

What SicHash can do for you

A Perfect Hash Function (PHF) is a hash function that has no collisions on a given input set. PHFs can be used for space efficient storage of data in an array, or for determining a compact representative of each object in the set. We present the PHF construction algorithm SicHash - Small Irregular Cuckoo Tables for Perfect Hashing. At its core, SicHash uses a known technique: It places objects in a cuckoo hash table and then stores the final hash function choice of each object in a retrieval data structure. We combine the idea with irregular cuckoo hashing, where each object has a different number of hash functions. Additionally, we use many small tables that we overload beyond their asymptotic maximum load factor. The most space efficient competitors often use brute force methods to determine the PHFs. SicHash provides a more direct construction algorithm that only rarely needs to recompute parts. Our implementation improves the state of the art in terms of space usage versus construction time for a wide range of configurations. At the same time, it provides very fast queries.

Keywords
No keywords available
Programming languages
  • C++ 78%
  • TeX 10%
  • Shell 8%
  • CMake 3%
  • Dockerfile 2%
License
  • GPL-3.0-only
</>Source code

Reference papers

Related projects

no image

Core Informatics

A Helmholtz Pilot Program

Updated 8 months ago
In progress

Related software

ShockHash

SH

A perfect hash function is a function that has no collisions on a given set. ShockHash constructs very compact perfect hash functions significantly faster than previous approaches.

Updated 2 weeks ago