Hachis

hachis is an OCaml library that offers hash sets and hash maps.

These data structures handle collisions via linear probing, a technique that relies on linear searches within an array. All of the data is stored within one large array (for hash sets) or two large arrays (for hash maps). As a result, these data structures offer good locality.

  1. For hash sets, use the functor Hachis.HashSet.Make.
  2. For hash maps, use the functor Hachis.HashMap.Make.

Hash sets are slightly faster than hash maps. A hash set occupies twice less space in memory than a hash map with the same cardinality.

Performance

Compared with the standard library's hash maps, hachis's hash sets and hash maps are faster, more compact, and have better locality.

Some benchmarks carried out using version 5.2.0+flambda of the OCaml compiler suggest than hachis can be 2x faster than Hashtbl on insertions, 2x-4x faster than Hashtbl on deletions, and 1x-3x faster than Hashtbl on lookups. These numbers vary depending on the scenario and on the cardinality of the hash set or hash map.

It seems safe to say that hachis almost always outperforms Hashtbl.

Words of Caution

hachis's sets and maps are not thread-safe and do not include a protection against data races: concurrent accesses to a set or map, where at least one thread attempts to modify the set or map, are forbidden and can compromise memory safety (that is, they can cause a hard crash or silent memory corruption). Concurrent read accesses (mem, find, find_key, find_value) are safe.

hachis's maps do not include a protection against memory leaks. That is, after a pair of a key x and value v has been deleted from a map m via remove m x, the map m can still contain a spurious pointer to v, causing the value v to remain reachable in the eyes of the garbage collector. This problem can be avoided (only) by explicitly calling reset, which empties the whole map.

hachis's higher-order operations do not detect an attempt to modify a set or map while an operation is in progress. For example, in iter f m, the function f must not modify the map m, but a violation of this rule is not detected by hachis.

Installation and Usage

Type opam install hachis.

In your dune file, add (libraries hachis) to the description of your library or executable.