Suppose we have sequence from a set , and that we know that one value from occurs more than times in the sequence; find the most frequent value using only two variables having total memory of only bits and in only a single sweep.
This can be solved in using a simple stack algorithm.
– Scan from left to right.
– Add to the stack when height is going up.
– Otherwise, this marks the end of a (previous) taller rectangle, which is checked for area.
You need to build a solid wall that’s 4-high and N-wide using 4×1 bricks. The question: how many distinct walls can you build?
If you place a brick vertically, that takes up one entire column. Placing 4 bricks horizontally takes up four. The solution makes use of the simple dynamic programming recursion shown below:
Each single cell (shown as an empty square on each row) is an array item, indexed from [1, n]. Each vertical bar (striped part + non-striped top cell) is an index range.
Index Ranges and Responsibility:
Each index range is named after it’s top cell, and its responsibility is given by the last -digit (and following s). Range has responsibility , meaning it covers from down to . In other words, take the last and move it to the end e.g. to , to , to .
Then, to add into any range, start by adding to each index range that lies completely within the add range. To add to , well, that’s covered completely by itself, so gets added to range , fini. To add to , that’s covered by . The latter — — is not a valid range! How’s that handled?
Index Values Too
Each item in the structure is both a range and an individual item, and its actual value is a sum of this item along with all responsible (covering) ranges. E.g. the value of item is its individual value + range values . (Draw a line from the item straight left and see which ranges are hit.)
So, to answer the previous question, when we encounter our invalid range we simply add our value to item itself; that’s the best we can do.
This implementation is suitable for:
– adding values to all indices across a range
– setting individual values
– querying individual values
Not suitable for:
– querying sum totals across a range
void set(id, int64 value)
void add(ida, idb, int64 value)
Reference: TopCoder – Binary Indexed Trees
N.B. This reference enables a cumulative-sum query from individual inserts/additions. My implementation flips this, enabling addition to entire ranges with querying kept to individual values. (I’ll work on the former soon.)
The example from Algorithm X’s Wikipedia page is as follows:
Is there a set of sets that covers each column exactly once? In this example, the solution is , since that yields — none left out and none duplicated.
Algorithm X is a recursive back-tracking search; it simply tries one set, prunes down the search space, rolls back if it hits a dead end, tries another set. The secret sauce, however, is the data structure and the concept called Dancing Links (my little write-up here).
Implementation reference: Dancing Links, Knuth. If you want to understand my implementation, this reference is a must-read.
1 & 2 (new batch)
5 & 8 (new batch)
9 & 4 (new batch)
9 & 6 (new 6 goes into existing 9)
3 & 5 (new 3 goes into existing 5)
7 & 1 (new 7 goes into existing 1)
2 & 5 (combine separate existing batches)
2 & 3 (same batch already)
How many separate batches do you have after each move? Are 2 & 4 in the same one?
This is solved with a Quick Union data structure; this structure has the following operations:
Ensures two IDs exist in the same set
This belongs to which set?
How many total sets?
Where and , the running time is .
The basic steps:
1. Create Aho-Corasick dictionary from needle rows.
2. Haystack states: for each row of the haystack, for each character run it through our dictionary and record each state. (That’s one state per character.)
3. Needle fingerprint: for each row of the needle, run it through the dictionary and record the very final state. (That’s one state per row.)
4. Finally, search the haystack states down the columns for the needle fingerprint using KMP.
This algorithm lets you search for (all instances of) multiple substrings in a single pass through the haystack. This is done by first compiling the substrings into a finite-state machine (tree) which is fed each haystack character in turn, adjusts its (specific) state, and reports any matches. This runs in .
The idea is this: if you remove an item from a doubly-linked list, you can put it right back in as long as you hold on to it. And, if you remove items , you can put them right back as long as you reverse the order exactly, .
This is done as follows:
// Remove: `curr` is forgotten by neighbors curr->prev->next = curr->next; curr->next->prev = curr->prev; // Reinsert: 'curr' knows where to go, and steps right back in! curr->next->prev = curr; curr->prev->next = curr; // Works over and over, as long as it's done in stack order: // remove=push, reinsert=pop. dancing_link_remove(A); dancing_link_remove(B); dancing_link_reinsert(B) dancing_link_remove(C); dancing_link_reinsert(C) dancing_link_reinsert(A)
This works well where you’re drilling in to search but need to backtrack too — you can do so without the need to work with multiple copies.