Overview

This post explores the design and implementation details of Timsort, the hybrid sorting algorithm derived from merge sort and insertion sort that has been the default sorting algorithm in Python since version 2.3. We analyze its heuristics for minimizing expensive comparisons and maintaining merge balance.

🏷️ The Genesis of Timsort

Timsort was designed by Tim Peters to leverage the structure of real-world data, which often contains pre-sorted sequences (“natural runs”). While based on merge sort, it incorporates several sophisticated heuristics to achieve worst-case performance while significantly outperforming traditional algorithms on practical datasets.

🏷️ Design Principles

The architecture of Timsort is driven by four key observations:

  1. Hybridization: Insertion sort is highly efficient for small arrays, while merge sort provides optimal asymptotic complexity.
  2. Comparison Cost: In dynamic languages like Python, comparing two objects is significantly more expensive than moving them in memory.
  3. Merge Balance: Merge sort performance degrades when merging arrays of highly disparate lengths (e.g., merging 5 elements into 5000).
  4. Stable Sorting: Timsort is designed to be stable, preserving the relative order of equal elements.

🏷️ The Concept of Runs

Timsort identifies runs---contiguous blocks of elements that are already sorted (ascending or strictly descending).

  • To avoid excessive overhead from very short natural runs, Timsort enforces a minimum run length, minrun.
  • If a natural run is shorter than minrun, it is artificially extended using insertion sort with binary search to minimize the number of comparisons.

🏷️ Dynamic minrun Selection

The minrun value is chosen dynamically based on the total number of elements . It is typically between 32 and 64, selected such that is at or slightly below a power of 2, ensuring balanced merge operations.

int minrun(int n) {
  assert (n >= 0);
  int r = 0;
  while (n >= MIN_MERGE) {
    r |= (n & 1);
    n >>= 1;
  }
  return n + r;
}

🏷️ The Merge Strategy

Identified runs are pushed onto a stack. To maintain efficiency, Timsort maintains an invariant on the stack to ensure merges are performed only on runs of similar size, mimicking the behavior of a balanced binary tree.

The stack invariants (for runs from top to bottom) are:

If these invariants are violated, the algorithm triggers a merge of the two smallest adjacent runs. This strategy ensures that the merge complexity remains logarithmic relative to the number of runs.

🏷️ Galloping Mode

When merging two runs and , Timsort monitors how many consecutive elements are taken from the same run. If this number exceeds a threshold (MIN_GALLOP), it enters Galloping Mode.

Instead of linear comparison, the algorithm performs an exponential search (comparing indices ) to find the insertion point. This reduces the number of comparisons from to during the merge process, which is particularly effective when the data has long-range structure.

📚 References