A hash table is very universal data structure. It does most of it’s operations in O(1) average time. The library contains a header to generate hash tables suiting your needs.

Mandatory macros

  • HASH_NODE — a data type where a node dwells. It is usually a structure.

  • HASH_PREFIX(x) — the name generating macro.

  • Key type and name. Must be one of following.

    • HASH_KEY_ATOMIC — the key (node\->HASH_KEY_ATOMIC) is an atomic type which can be compared using ==.

    • HASH_KEY_STRING — the key is a zero-terminated string, allocated separately from the rest of the node.

    • HASH_KEY_ENDSTRING — a zero-terminated string which lives at the end of node (it is allocated together with the node). It should be declared as char key[1].

    • HASH_KEY_MEMORY — the node\->HASH_KEY_MEMORY is to be compared using memcmp() function. In this case, you need to provide HASH_KEY_SIZE macro as well, to specify the length of the key.

    • HASH_KEY_COMPLEX(x) — the key is compound of more than one component. The macro should expand to x key1, x key2, ..., x kn. Furthermore, you need to provide a HASH_KEY_DECL macro. It is used to define function parameters. Therefore it should expand to type1 key1, type2 key2, ..., typen keyn. And HASH_GIVE_HASHFN and HASH_GIVE_EQ are mandatory for this key type.

Optional function switches

You can define any of these macros and provide corresponding functions to customize the behaviour. The macros are:

  • HASH_GIVE_HASHFN — the table will use uint HASH_PREFIX(hash)(key) to calculate hash of key. There is a sensible default for integers and strings. In the case of HASH_KEY_COMPLEX, it is mandatory to provide this macro and function.

  • HASH_GIVE_EQ — tells the table to use int HASH_PREFIX(eq)(key1, key2) function to decide if key1 and key2 are equal. Default for atomic types is == and strcmp() or strcasecmp() for strings (depends on HASH_NOCASE switch). It is mandatory when you use HASH_KEY_COMPLEX.

  • HASH_GIVE_EXTRA_SIZE — function int HASH_PREFIX(extra_size)(key) returns how many bytes after the node should be allocated. It defaults to 0 or to length of key in case of HASH_KEY_ENDSTRING.

  • HASH_GIVE_INIT_KEY — function void HASH_PREFIX(init_key)(node *, key) is used to initialize key in newly created node. The default is assignment for atomic keys and static strings (HASH_KEY_ATOMIC, HASH_KEY_STRING) and strcpy() for HASH_KEY_ENDSTRING.

  • HASH_GIVE_INIT_DATA — function void HASH_PREFIX(init_data)(node *) is used to initialize the rest of node. Useful if you use HASH_PREFIX(lookup())

  • HASH_GIVE_ALLOC — you need to provide void \*HASH_PREFIX(alloc)(uint size and void HASH_PREFIX(free)(void \*) to allocate and deallocate the nodes. Default uses xmalloc() and xfree(), [mempool routines] or [eltpool routines], depending on HASH_USE_POOL, HASH_AUTO_POOL, HASH_USE_ELTPOOL and HASH_AUTO_ELTPOOL switches.

  • [HASH_GIVE_TABLE_ALLOC] — you need to provide void \*HASH_PREFIX(table_alloc)(uint size and void HASH_PREFIX(table_free)(void \*) to allocate and deallocate the table itself. Default uses xmalloc() and xfree() or the functions from HASH_GIVE_ALLOC depending on [HASH_TABLE_ALLOC] switch.

Optional parameters

You can customize the hash table a little more by these macros:

  • HASH_NOCASE — use case-insensitive comparison for strings.

  • HASH_DEFAULT_SIZE — use approximately this many elements when creating the hash table.

  • HASH_CONSERVE_SPACE — use as little space as possible.

  • HASH_FN_BITS — the hash function only provides this many significants bits.

  • HASH_ATOMIC_TYPE — the type of atomic key (HASH_KEY_ATOMIC) is not int, but this type.

  • HASH_USE_POOL — tells to use mempool allocation to allocate the nodes. You should define it to the name of mempool variable to be used for this purpose.

  • HASH_AUTO_POOL — like above, but it creates it’s own mempool. Define it to the block size of the pool.

  • HASH_USE_ELTPOOL — tells to use eltpool allocation to allocate the nodes. You should define it to the name of eltpool variable to be used for this purpose.

  • HASH_AUTO_ELTPOOL — like above, but it creates it’s own mempool. Define it to the number of preallocated nodes in each chunk of memory.

  • HASH_ZERO_FILL — initialize new nodes to all zeroes.

  • HASH_TABLE_ALLOC — allocate the table the same way as nodes. If not provided, xmalloc() is used.

  • HASH_TABLE_GROWING — never decrease the size of allocated table of nodes.

  • HASH_TABLE_DYNAMIC — By default, only one global hash table is used. With this macro defined, all functions gain new first parameter of type HASH_PREFIX(table) * to allow them work with multiple hash tables.

  • HASH_TABLE_VARS — extra variables to be defined at head of HASH_PREFIX(table) * structure. It can be useful in combination with [HASH_TABLE_DYNAMIC] to access per-table custom variables from macros or function switches before you include the generator.

  • HASH_LOOKUP_DETECT_NEW — the prototype for lookup is changed to node *lookup(key, int *new_item), new_item must not be NULL and returns 1 whether lookup just created a new item in the hashtable or 0 otherwise.

Functionality switches

Each of these macros enables some of the functionality the table has.

Generated functions

These are the function that the header file generates for you. The strange first parameter of each function is a place where the HASH_PREFIX(table) * resides when you define HASH_TABLE_DYNAMIC. If you do not, the parameter is empty.


static void HASH_PREFIX(init)(TA);

Initializes the hash table. This one is available no matter what HASH_WANT_ macros you defined or not.


static void HASH_PREFIX(cleanup)(TA);

Deallocates the hash table, including the nodes. It is available if you defined HASH_WANT_CLEANUP.


static HASH_NODE* HASH_PREFIX(find)(TAC HASH_KEY_DECL);

Finds a node with given key (specified in the HAS_KEY_DECL parameter). If it does not exist, NULL is returned.

Enabled by the HASH_WANT_FIND macro.


static HASH_NODE* HASH_PREFIX(find_next)(TAC P(node) *start);

Finds next node with the same key. Returns NULL if it does not exist.

Enabled by the HASH_WANT_FIND_NEXT macro.


static HASH_NODE * HASH_PREFIX(new)(TAC HASH_KEY_DECL);

Generates a new node with a given key.

Enabled by the HASH_WANT_NEW macro.


static HASH_NODE* HASH_PREFIX(lookup)(TAC HASH_KEY_DECL, int *new_item);

Finds a node with a given key. If it does not exist, a new one is created. It is strongly recommended to use HASH_GIVE_INIT_DATA.

This one is enabled by the HASH_WANT_LOOKUP macro. The new_item argument is available only if HASH_LOOKUP_DETECT_NEW was given.


static int HASH_PREFIX(delete)(TAC HASH_KEY_DECL);

Removes a node with the given key from hash table and deallocates it.

Success is returned.

This one is enabled by HASH_WANT_DELETE macro.


static void HASH_PREFIX(remove)(TAC HASH_NODE *n);

Removes a given node and deallocates it. It differs from HASH_PREFIX(delete)() in its type of parameter — this one deletes a specific node, that one looks for it by a key.

Enabled by HASH_WANT_REMOVE macro.

Iterator

You can use the HASH_FOR_ALL iterator macro to run trough all the nodes. Lets say your HASH_PREFIX(x) macro is defined as prefix_##x. Then you would do something like:

HASH_FOR_ALL(prefix, node_variable)
{
   do_something_with_node(node_variable);
}
HASH_END_FOR;

If you use HASH_TABLE_DYNAMIC, use HASH_FOR_ALL_DYNAMIC(prefix, table, node_variable) instead.

You may not modify the table inside the block. Use HASH_BREAK and HASH_CONTINUE instead of break and continue statements.