# Challenge: Most Frequent Element

Here’s a nice problem from Notes on Streaming Algorithms:

Suppose we have sequence from a set , and that we know that one value from occurs more than times in the sequence; find the most frequent value using only two variables having total memory of only bits and in only a single sweep.

Spoiler 1
Spoiler 2

My implementation is in git and below.

# Largest Rectangle in a Histogram

Given a histogram/bar graph, what is the largest rectangle (area) within?

This can be solved in using a simple stack algorithm.

The gist:
– Scan from left to right.
– Add to the stack when height is going up.
– Otherwise, this marks the end of a (previous) taller rectangle, which is checked for area.

References:
geeksforgeeks.com
Wikipedia

My implementation is in git and below:

# Red John is Back

This post is in reference to SPOJ 21525. Red John is Back PPBRJB.

You need to build a solid wall that’s 4-high and N-wide using 4×1 bricks. The question: how many distinct walls can you build?

If you place a brick vertically, that takes up one entire column. Placing 4 bricks horizontally takes up four. The solution makes use of the simple dynamic programming recursion shown below:

Without further ado, my DP implementation is in git and below:

# Binary-Indexed Tree

A Binary-Indexed Tree is a standard array with extra range indexes, and looks like the following:

Each single cell (shown as an empty square on each row) is an array item, indexed from [1, n]. Each vertical bar (striped part + non-striped top cell) is an index range.

Index Ranges and Responsibility:
Each index range is named after it’s top cell, and its responsibility is given by the last -digit (and following s). Range has responsibility , meaning it covers from down to . In other words, take the last and move it to the end e.g. to , to , to .

Then, to add into any range, start by adding to each index range that lies completely within the add range. To add to , well, that’s covered completely by itself, so gets added to range , fini. To add to , that’s covered by . The latter — — is not a valid range! How’s that handled?

Index Values Too
Each item in the structure is both a range and an individual item, and its actual value is a sum of this item along with all responsible (covering) ranges. E.g. the value of item is its individual value + range values . (Draw a line from the item straight left and see which ranges are hit.)

So, to answer the previous question, when we encounter our invalid range we simply add our value to item itself; that’s the best we can do.

This implementation is suitable for:

– adding values to all indices across a range
– setting individual values
– querying individual values

Not suitable for:
– querying sum totals across a range

Interface:
int64 get(id)
void set(id, int64 value)
void add(ida, idb, int64 value)

Reference: TopCoder – Binary Indexed Trees
N.B. This reference enables a cumulative-sum query from individual inserts/additions. My implementation flips this, enabling addition to entire ranges with querying kept to individual values. (I’ll work on the former soon.)

My implementation is in git and below:

# Exact Cover Problem and Knuth’s Algorithm X

I‘ve put together an implementation of Knuth’s Algorithm X (Wikipedia), which solves the Exact Cover problem (Wikipedia).

The example from Algorithm X’s Wikipedia page is as follows:

Set
Set
Set
Set
Set
Set

Is there a set of sets that covers each column exactly once? In this example, the solution is , since that yields — none left out and none duplicated.

Algorithm X is a recursive back-tracking search; it simply tries one set, prunes down the search space, rolls back if it hits a dead end, tries another set. The secret sauce, however, is the data structure and the concept called Dancing Links (my little write-up here).

Implementation reference: Dancing Links, Knuth. If you want to understand my implementation, this reference is a must-read.

My implementation is in git and below:

# Quick Union Find

Let’s say you have 9 vials of liquid and the list of pairs to combine as follows:

1 & 2 (new batch)
5 & 8 (new batch)
9 & 4 (new batch)
9 & 6 (new 6 goes into existing 9)
3 & 5 (new 3 goes into existing 5)
7 & 1 (new 7 goes into existing 1)
2 & 5 (combine separate existing batches)
2 & 3 (same batch already)

How many separate batches do you have after each move? Are 2 & 4 in the same one?

Solution

This is solved with a Quick Union data structure; this structure has the following operations:

Ensures two IDs exist in the same set

This belongs to which set?

How many total sets?

Where and , the running time is .

My implementation is in git and below:

# 2-Dimensional String Search

In my previous two posts, I implemented KMP search with continuation and the Aho-Corasick dictionary. Now we can put it all together and perform a 2D search, finding a rectangular needle in a larger rectangular haystack. This implementation runs in .

The basic steps:

1. Create Aho-Corasick dictionary from needle rows.

2. Haystack states: for each row of the haystack, for each character run it through our dictionary and record each state. (That’s one state per character.)

3. Needle fingerprint: for each row of the needle, run it through the dictionary and record the very final state. (That’s one state per row.)

4. Finally, search the haystack states down the columns for the needle fingerprint using KMP.

The implementation is in git and below:

# Aho-Corasick String Matching Dictionary

This is an implementation of the Aho-Corasick String Matching Dictionary (Wikipedia).

This algorithm lets you search for (all instances of) multiple substrings in a single pass through the haystack. This is done by first compiling the substrings into a finite-state machine (tree) which is fed each haystack character in turn, adjusts its (specific) state, and reports any matches. This runs in .

The implementation is in git and below:

# KMP Search with Continuation

This is a C++ implementation of the Knuth-Morris-Pratt string-search algorithm (Wikipedia). In addition, this implements a continuation for finding all instances. Running time is .

The code is in git and as follows:

Here is a demonstration of the Dancing Links idea (Wikipedia definition).

The idea is this: if you remove an item from a doubly-linked list, you can put it right back in as long as you hold on to it. And, if you remove items , you can put them right back as long as you reverse the order exactly, .

This is done as follows:

```// Remove: `curr` is forgotten by neighbors
curr->prev->next = curr->next;
curr->next->prev = curr->prev;

// Reinsert: 'curr' knows where to go, and steps right back in!
curr->next->prev = curr;
curr->prev->next = curr;

// Works over and over, as long as it's done in stack order:
// remove=push, reinsert=pop.