Skip to content

String Hashing

We need a hash function to convert a string into an integer, called the hash of that string. The following condition has to hold: For two strings SS and TT,

S=T    hash(S)=hash(T)S = T \implies \text{hash}(S) = \text{hash}(T) \\

but reverse is not guaranteed. We only hope that

P(ST    hash(S)hash(T))is very highP(S \not= T \implies \text{hash}(S) \not= \text{hash}(T)) \quad \text{is very high} \\ hash(s)=s[0]+s[1]p+s[2]p2++s[n1]pn1modm=i=0n1s[i]pimodm,\begin{aligned} \text{hash}(s) &= s[0] + s[1] \cdot p + s[2] \cdot p^2 + \cdots + s[n-1] \cdot p^{n-1} \mod m \\ &= \sum_{i=0}^{n-1} s[i] \cdot p^i \mod m, \end{aligned}

where n=length(s)n = \text{length}(s) and pp and mm are some chosen, positive numbers.

// Precomputing the powers of p might give a performance boost.

Technique: Fast Polynomial Rolling Hash of Substrings

Section titled “Technique: Fast Polynomial Rolling Hash of Substrings”