3.11. Hash Tables

Hash tables provide a powerful data structure to maintain associations between lisp objects. These data structures behave like arrays except that any lisp object is an acceptable subscript. This simple and flexible access makes hash tables extremely convenient.

They are so convenient that you will be quickly tempted to use them everywhere. However, keep in mind that a simple array requires less memory (about five times less) and less processor cycles than a hash table.

3.11.0. Representing Associations with Hash Tables

See: (= n1 n2 )
See: (== n1 n2 )
A hash table is an instance of class |HTABLE| . Like most lisp objects, hash tables can be compared using function = and saved using function bwrite .

A hash table maintains a set of associations between two lisp objects, namely a key and a value. This set is organized in a way that allows fast retrieval of the value associated to a given key.

Hash table queries search an association whose key is equal to the queried key. Since there are two ways to test equality (i.e. functions = and == ), there are two kinds of hash tables.

The most useful hash tables use the logical equality, as implemented by function = . Two objects are logically equal if they convey the same useful information. These hash table work properly as long as you do not modify the objects used as keys by the hash table. This can happen if you used arrays or custom objects as keys in your hash table. See the description of function htable-rehash for more information about this point.

Hash tables can also rely on pointer equality, as implemented by function == . These hash tables are useful to associate a piece of information with a particular lisp object. Functions getp and putp , for instance, make use of pointer-equality hash tables. The associations of a pointer-equality hash table are automatically removed when the key object is deallocated. (htable [nelem [flag]])

Function htable returns a new empty hash table. Optional arguments nelem and flag are seldom useful:

Argument nelem specifies an estimate of the number of elements in the hash table. This estimate is useful for saving a few processor cycles when filling the table.

Argument flag is a boolean value specifying whether the hash table will use logical equality (the default) or pointer equality (when argument flag is not nil.) (htable key)

This expression returns the value associated with key key in hash table htable . Expression key can be any lisp object. If no association is defined for key key , the empty list is returned. (htable key value)

This expression associates value value to key key in the hash table htable . The value can be subsequently retrieved by expression (htable key) .

This operation is performed in two logical steps: (a) any existing association matching key key is first deleted, and (b) a new association is created if argument value is not the empty list.


? (setq a (htable))
= ::HTABLE:6f3cac
? (a 'hello "hi")
= ::HTABLE:6f3cac
? (a '(3 4) "ho")
= ::HTABLE:6f3cac
? (a 'hello)
= "hi"
? (a '(3 4 5))
= ()
? (a '(3 4))
= "ho" (htable-size htable)

Function htable-size returns the number of associations managed by the hash table. (htable-keys htable)

Function htable-keys returns a list of the keys of all the associations managed by the hash table. It is guaranteed that querying these keys will eturn a non nil result.
? (setq a (htable))
= ::HTABLE:6f43c
? (for (i 0 10) (a i (sqrt i))
= ::HTABLE:6f43c
? (htable-keys a)
= (7 6 3 10 5 9 8 4 2 0 1) (htable-alist htable)

See: (assoc key alist )
Function htable-alist returns an alist representing all associations managed by the hash table. This function also provide a way to iterate over all associations:
? (setq a (htable))
= ::HTABLE:6f43c
? (for (i 0 10) (a i (sqrt i))
= ::HTABLE:6f43c
?  (each (((key . value) (htable-alist a))) 
     (printf "%4d %6.4f\n" key value))))
   7 2.6458
   6 2.4495
   3 1.7321
  10 3.1623
   5 2.2361
   9 3.0000
   8 2.8284
   4 2.0000
   2 1.4142
   0 0.0000
   1 1.0000
= () (htable-rehash htable)

Each association managed by a hash table contains a pointer to the key object and a pointer to the value object. Functions that modify the values of these objects also modify the state of the hash table.

Hash table queries search an association whose key is logically equal to the queried key (unless pointer equality was specified when creating the hash table). The hash table works by storing each association in locations depending on the information conveyed by the key objects.

Modifying the object used as a key creates a lot of problems. The corresponding association is no longer stored in a location corresponding to the new information conveyed by the key objects. The hash table state is no longer consistent.

Function htable-rehash triggers the relocation of all associations of the hash table htable to the location corresponding to the new values of the key objects. You can then reliably use the hash table again.

? (setq a (htable))
= ::HTABLE:6f32c
? (setq m [1 2])
= [1 2]
? (a m 3)
= ::HTABLE:6f32c
? (a [1 2])
= 3
? (m 0 0)
= [ 0.00 2.00 ]
? (a [1 2])
= ()      ;; we lost it
? (a [0 2])
= ()      ;; result is undefined
? (htable-rehash a)
= 1
? (a [0 2])
= 3       ;; entry is no longer lost
? (a [1 2])
= ()      ;; this key no longer exists
The relocation of the hash table entries actually happens when you first access the hash table after calling htable-rehash . You can call function htable-rehash several times in a row. The cost of relocating all the hash table entries will be only incurred once when you will access the hash table again. (htable-info htable)

Function htable-info returns a list containing statistical information about the hash table. This list is an alist with the following form:
((size . 1294) (buckets . 3841) (hits 1132 145 14 3))
In this example, the hash table contains 1294 associations. Associations are stored into 3841 buckets. Out of the 1294 associations, 1132 canbe retrieved directly, 145 require one extra search iteration, 14 require two extra search iteration, etc.

3.11.1. Representing Sets with Hash Tables

Sets are easily represented by creating a hash table and defining associations whose keys are the set elements and whose value is non nil. Here are some of the basic hash table expressions that can be used:

Other functions given in this section perform additional operations on sets represented by hash tables. (hset eltlist [flag])
[DE] (sysenv.lsh)

Author(s): Raymond Martin (count flag part)

Function hset returns a hash table representing the set of all elements of list eltlist (as a set, the elements are only entered once). The hash table associations are not returned in any particular order (e.g. when using htable-keys , htable-alist , etc.), use sort-list or other function to get a sorted or ordered list or other representation.

Argument flag is a boolean value specifying whether or not the hash table will maintain a count as the value of the same elements encountered when building the hash table (as opposed to t when flag is nil ). By default, flag is nil .

These counts can be useful for performing statistics or other calculations.

? (setq a (hset '(1 2 3 4 5 4 3 2 1 5 4 2 6) t ))
= ::HTABLE:6f43c
? (each (((elem . value) (htable-alist a))) 
     (printf "%4d %4d\n" elem value))))
   4   3
   2   3
   1   2
   6   1
   3   2
   5   2
= () (hset-and htable1 htable2)
[DE] (sysenv.lsh)

Function hset-and returns a new hash table representing the intersection of the sets represented by hash tables htable1 and htable2 .

? (htable-keys (hset-and (hset '(1 2 3 4)) (hset '(2 4 5 6))))
= (4 2) (hset-or htable1 htable2)
[DE] (sysenv.lsh)

Function hset-or returns a new hash table representing the union of the sets represented by hash tables htable1 and htable2 .

? (htable-keys (hset-or (hset '(1 2 3 4)) (hset '(2 4 5 6))))
= (4 2 1 6 3 5)