Challenge: Most Frequent Element

Here’s a nice problem from Notes on Streaming Algorithms:

Suppose we have a sequence {x_1, x_2, ..., x_n} drawn from a set epsilon, and we know that one value from epsilon occurs more than n/2 times in the sequence. Find the most frequent value in a single sweep, using only two variables with a total memory of lg(n) + lg(max value) bits.

Spoiler 1
lg(n) bits can store any value up to n, which is enough for a counter.
Spoiler 2
lg(max value) bits can store any single value from epsilon.
The two variables are count and major. Scan the list once: each element votes for or against the current candidate major, and count holds the candidate's net votes.

major = first element
count = 1
for each remaining element
    element == major ? count++ : count--
    if (count == 0)
        major = element
        count = 1
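
The voting loop above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation referred to in this post: the function name majority_element is hypothetical, the candidate is seeded from the first element, and the result is only meaningful when some value really does occur more than n/2 times.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def majority_element(stream: Iterable[T]) -> Optional[T]:
    """Return the surviving candidate after one pass over stream.

    Assumes some value occurs more than n/2 times. Without that
    guarantee the returned candidate may not be the most frequent value.
    Returns None for an empty stream.
    """
    it = iter(stream)
    try:
        major = next(it)  # seed the candidate with the first element
    except StopIteration:
        return None
    count = 1

    for element in it:
        # Each element votes for or against the current candidate.
        if element == major:
            count += 1
        else:
            count -= 1
        # The candidate ran out of votes: the current element takes over.
        if count == 0:
            major = element
            count = 1
    return major


print(majority_element([3, 1, 3, 2, 3, 3, 1]))  # prints 3
```

Because the function only calls next() on an iterator, it also works on a generator that cannot be rewound, which matches the single-sweep requirement. Only major and count persist between elements.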

My implementation is in git and below.
