# Kosaraju’s Algorithm

Kosaraju’s algorithm (Wikipedia) finds a graph’s strongly connected components in linear time.

The algorithm runs in two phases:

Phase 1:

1. grab an unseen node and perform depth-first search.
2. add node to stack after children processed.
3. while unseen nodes exist…

Phase 2:

1. pop node off stack.
2. graph traversed against directed edges.
3. all within traversal belong to one strongly-connected component.
4. while stack is not empty…

This algorithm is used in my solving of SPOJ 13990. Web islands WEBISL; implementation is in git and below:

# Binary-Indexed Tree

A Binary-Indexed Tree is a standard array with extra range indexes, and looks like the following:

Each single cell (shown as an empty square on each row) is an array item, indexed from [1, n]. Each vertical bar (striped part + non-striped top cell) is an index range.

Index Ranges and Responsibility:
Each index range is named after it’s top cell, and its responsibility is given by the last -digit (and following s). Range has responsibility , meaning it covers from down to . In other words, take the last and move it to the end e.g. to , to , to .

Then, to add into any range, start by adding to each index range that lies completely within the add range. To add to , well, that’s covered completely by itself, so gets added to range , fini. To add to , that’s covered by . The latter — — is not a valid range! How’s that handled?

Index Values Too
Each item in the structure is both a range and an individual item, and its actual value is a sum of this item along with all responsible (covering) ranges. E.g. the value of item is its individual value + range values . (Draw a line from the item straight left and see which ranges are hit.)

So, to answer the previous question, when we encounter our invalid range we simply add our value to item itself; that’s the best we can do.

This implementation is suitable for:

– adding values to all indices across a range
– setting individual values
– querying individual values

Not suitable for:
– querying sum totals across a range

Interface:
int64 get(id)
void set(id, int64 value)

Reference: TopCoder – Binary Indexed Trees
N.B. This reference enables a cumulative-sum query from individual inserts/additions. My implementation flips this, enabling addition to entire ranges with querying kept to individual values. (I’ll work on the former soon.)

My implementation is in git and below:

# Exact Cover Problem and Knuth’s Algorithm X

I‘ve put together an implementation of Knuth’s Algorithm X (Wikipedia), which solves the Exact Cover problem (Wikipedia).

Set
Set
Set
Set
Set
Set

Is there a set of sets that covers each column exactly once? In this example, the solution is , since that yields — none left out and none duplicated.

Algorithm X is a recursive back-tracking search; it simply tries one set, prunes down the search space, rolls back if it hits a dead end, tries another set. The secret sauce, however, is the data structure and the concept called Dancing Links (my little write-up here).

Implementation reference: Dancing Links, Knuth. If you want to understand my implementation, this reference is a must-read.

My implementation is in git and below:

# Quick Union Find

Let’s say you have 9 vials of liquid and the list of pairs to combine as follows:

1 & 2 (new batch)
5 & 8 (new batch)
9 & 4 (new batch)
9 & 6 (new 6 goes into existing 9)
3 & 5 (new 3 goes into existing 5)
7 & 1 (new 7 goes into existing 1)
2 & 5 (combine separate existing batches)
2 & 3 (same batch already)

How many separate batches do you have after each move? Are 2 & 4 in the same one?

Solution

This is solved with a Quick Union data structure; this structure has the following operations:

Ensures two IDs exist in the same set

This belongs to which set?

How many total sets?

Where and , the running time is .

My implementation is in git and below:

# 2-Dimensional String Search

In my previous two posts, I implemented KMP search with continuation and the Aho-Corasick dictionary. Now we can put it all together and perform a 2D search, finding a rectangular needle in a larger rectangular haystack. This implementation runs in .

The basic steps:

1. Create Aho-Corasick dictionary from needle rows.

2. Haystack states: for each row of the haystack, for each character run it through our dictionary and record each state. (That’s one state per character.)

3. Needle fingerprint: for each row of the needle, run it through the dictionary and record the very final state. (That’s one state per row.)

4. Finally, search the haystack states down the columns for the needle fingerprint using KMP.

The implementation is in git and below:

# Aho-Corasick String Matching Dictionary

This is an implementation of the Aho-Corasick String Matching Dictionary (Wikipedia).

This algorithm lets you search for (all instances of) multiple substrings in a single pass through the haystack. This is done by first compiling the substrings into a finite-state machine (tree) which is fed each haystack character in turn, adjusts its (specific) state, and reports any matches. This runs in .

The implementation is in git and below:

Here is a demonstration of the Dancing Links idea (Wikipedia definition).

The idea is this: if you remove an item from a doubly-linked list, you can put it right back in as long as you hold on to it. And, if you remove items , you can put them right back as long as you reverse the order exactly, .

This is done as follows:

```// Remove: `curr` is forgotten by neighbors
curr->prev->next = curr->next;
curr->next->prev = curr->prev;

// Reinsert: 'curr' knows where to go, and steps right back in!
curr->next->prev = curr;
curr->prev->next = curr;

// Works over and over, as long as it's done in stack order:
// remove=push, reinsert=pop.
```

This works well where you’re drilling in to search but need to backtrack too — you can do so without the need to work with multiple copies.

And finally, this is demonstrated in git and as follows: