# Week 3 - Maximum Flow and Minimum Cut & String Sorts

// TODO

## String Sorts

### strings in Java

• Sequence of characters (immutable)

• String vs. StringBuilder

• Linear time

#### Alphabets

• Digital key. Sequence of digits over fixed alphabet.
• Radix. Number of digits R in alphabet.

### key-indexed counting

• Assumption. Keys are integers between 0 and R - 1.

• Goal. Sort an array a[] of N integers between 0 and R - 1.

• Implementation.

• * **Notice**: use a for 0, b for 1, c for 2, d for 3, e for 4, f for 5 for better explanation
• Proposition. Key-indexed counting uses ~ 11 N + 4 R array accesses to sort N items whose keys are integers between 0 and R - 1.

• Least-signiﬁcant-digit-ﬁrst string sort
• Consider characters from right to left.
• Stably sort using d^th character as the key (using key-indexed counting).

• Most-signiﬁcant-digit-ﬁrst string sort

• Partition array into R pieces according to first character (use key-indexed counting).
• Recursively sort all strings that start with each character (key-indexed counts delineate subarrays to sort).
• Treat strings as if they had an extra char at end (smaller than any char).

• implementation

• potential for disastrous performance

1. much too slow for small subarrays
2. Huge number of small subarrays
• Solution: Cutoff to insertion sort for small subarrays.
• Insertion sort, but start at d th character.
• Implement less() so that it compares starting at d^th character.

• faster than the other two algorithms, but not stable.

### suffix arrays

#### Suffix sort

• Can be used for finding longest repeated substring
• worst-case input: longest repeated substring very long.
• LRS needs at least 1 + 2 + 3 + … + D character compares, where D = length of longest match.

#### Manber-Myers 马上到！ algorithm

• Sufﬁx sorting in linearithmic time
• linear: N, linearithmic: a * N
• overview
• Phase 0: sort on first character using key-indexed counting sort.
• Phase i: given array of suffixes sorted on first 2 i - 1 characters, create array of suffixes sorted on first 2 i characters.
• Key process: Constant-time string compare by indexing into inverse
• inverse[]:
• before: {0: 17, 1: 16, 2: 15, …, 9: 1, …, 12: 2, …, 14: 0 }
• after: { 0: 14, 1: 9, 2: 12, …}