4.7 heap

4.7.1 heap 概述

heap 并不属于 STL 容器组件，它作为 priority-queue 的助手。priority queue 允许用户以任何元素推入容器内，但取出时一定是从优先权最高（也就是数值最高）的元素开始取。binary max heap 正是具有这样的特性，适合作为 priority 的底层机制。

binary hea 就是一种 complete binary tree（完全二叉树）。它没有任何节点漏洞，我们可以利用数组来存储所有节点。假设将 array 的#0 元素保留（或设为 infinity），那么当 complete binary tree 种的某个节点位于 array 的 i 处时，其左子节点必位于 array 的 2i 处，其右子节点必位于 array 的 2i+1 处。其父节点必位于 array 的 i/2 处。这种以 array 表述 tree 的方式，我们称为隐式表述法 (implicit representation)。

array 的缺点是无法动态改变大小，而 heap 却需要这项功能，因此，以 vector 代替 array 是更好的选择。

根据元素排列方式，heap 可分为 max-heap 和 min-heap 两种，前者每个节点的键值都大于或等于其子节点键值，后者的每个节点键值都小于或等于其子节点键值。因此，max-heap 的最大值在根节点，并总是位于底层 array 或 vector 的开头处；min-heap 的最小值在根节点，并总是位于底层 array 或 vector 的开头处。STL 提供的是 max-heap，以下讨论的 heap 指的是 max-heap。

4.7.2 heap 算法

push_heap 算法

下图所示的是 push_heap 算法的展示。为了满足完全二叉树的条件，新加入的元素一定要放在最下一层作为叶节点，并填补在由左到右的第一个空格，也就是把新元素插入在底层 vector 的 end() 处。

新元素是否适合其现有位置呢？为满足 max-heap 条件，我们执行 percolate up（上溯） 程序：将新节点与其父节点比较，如果其键值大于符父节点，就交换父子位置。如此一直上溯，直到不需要交换或到根节点为止。

下面是 push_heap 算法的实现细节。该函数接受两个迭代器，用来表现一个 heap 底部容器（vector）的首尾，并且新元素已经插入到底部容器的最尾端。

template <class RandomAccessIterator>
inline void push_heap(RandomAccessIterator first, 
                      RandomAccessIterator last) {
  // 注意，此函数被调用时，新元素已置于底部容器的最尾端
  __push_heap_aux(first, last, distance_type(first),
                               value_type(first));
}

template <class RandomAccessIterator, class Distance, class T>
inline void __push_heap_aux(RandomAccessIterator first,
                            RandomAccessIterator last,
                            Distance*, T*) {
  __push_heap(first, Distance((last - first) - 1), Distance(0),
                                 T(*(last - 1)));
  // 以上根据 implicit representation 的结构特性：
  // 新值必位于底部容器的最尾端，此即第一个洞号：(last - first) - 1
}

// 以下这组 push_back() 不允许指定“大小比较标准”
template <class RandomAccessIterator, class Distance, class T>
void __push_heap(RandomAccessIterator first, Distance holeIndex,
                 Distance topIndex, T value) {
  Distance parent = (holeIndex - 1) / 2;    // 找出父节点
  // 如果新元素比父亲节点大，就交换父子位置
  while (holeIndex > topIndex && *(first + parent) < value) {
    *(first + holeIndex) = *(first + parent); // 令洞值为父值
    holeIndex = parent;    // percolate up: 调整洞号，向上提升至父节点
    parent = (holeIndex - 1) / 2;   // 新洞的父节点
  }   // 持续至顶端，或满足 heap 的次序特性为止
  *(first + holeIndex) = value;  // 将新值放到洞，完成插入操作
}

pop_heap 算法

下图所示的是 push_heap 算法的展示。pop 操作取走根节点，为了满足完全二叉树的条件，必须割舍最下层最右边的叶节点，并将其值重新安插至 max-heap。

为了满足 max-heap 次序特性，我们执行 percolate down（下溯） 程序：将空间节点和其他较大子节点交换，并持续下放，直到叶节点为止。然后将被割舍元素的值设给这个已到达叶层的空洞节点，再对它执行一次 percolate up（上溯）程序，便完成操作。

下面是 pop_heap 算法的实现。该函数接受两个迭代器，用来表现一个 heap 底部容器（vector）的首尾。

template <class RandomAccessIterator>
inline void pop_heap(RandomAccessIterator first, 
                     RandomAccessIterator last) {
  __pop_heap_aux(first, last, value_type(first));
}

template <class RandomAccessIterator, class T>
inline void pop_heap_aux(RandomAccessIterator first, 
                          RandomAccessIterator last, T*) {
  __pop_heap(first, last - 1, last - 1, T(*(last - 1)),
             distance_type(first));
  // 根据 implicit representation heap 的次序特性，pop 的操作结果
  // 应为底部容器的第一个元素。因此，首先设定欲调整值为尾值，然后将首值调至
  // 尾节点（result = last - 1），然后重整 [first,last-1)，
  // 使之成为合格的 heap
}

// __pop_heap() 不允许指定“大小比较标准”
template <class RandomAccessIterator,  class T, class Distance>
void __pop_heap(RandomAccessIterator first, RandomAccessIterator last,
                RandomAccessIterator result, T value, Distance*) {
  *result = *first;   // 设定尾值尾首值，于是为止即为欲求结果
  __adjust_heap(first, Distance(0), Distance(last - first), T(*(last - 1)));
  // 重新调整 heap，洞号为 0 处欲调整值为 value（原尾值）
}

// __adjust_heap() 不允许指定“大小比较标准”
template <class RandomAccessIterator, class Distance, class T>
void __adjust_heap(RandomAccessIterator first, Distance holeIndex,
                   Distance len, T value) {
  Distance topIndex = holeIndex;
  Distance secondChild = 2 * holeIndex + 2;  // 洞节点的右子节点
  while (secondChild < len) {
    // 比较洞节点的左右两个子值，然后以 secondChild 代表较大子节点
    if (*(first + secondChild) < *(first + (secondChild - 1)))
      secondChild--;
    // percolate down 令较大子节点为洞值，再令洞号下移至较大子节点处
    *(first + holeIndex) = *(first + secondChild);
    holeIndex = secondChild;
    // 找处新洞节点的右子节点
    secondChild = 2 * (secondChild + 1);
  }
  if (secondChild == len) {  // 如果洞节点只有左子节点
    // percolate down 令左子节点为洞值，再令洞号下移至较大子节点处
    *(first + holeIndex) = *(first + (secondChild - 1));
    holeIndex = secondChild - 1;
  }
  // __push_heap(first, holeIndex, topIndex, value);
  // bug: 不可再执行 percolate up
}

注意，pop_heap 之后，最大元素只是被放置于底部容器的最尾端，尚未被取走。如果要取其值，可使用底部容器的 back() 函数，如果要移除它，可使用底部容器提供的 pop_back() 函数。

sort_heap 算法

既然每次 pop_heap 可获得 heap 中键值最大的元素，如果持续对整个 heap 做 pop_heap 操作，每次将操作范围从后向前缩减一个元素，当整个程序执行完毕时，我们便有了一个递增序列。下图展示 sort_heap 的过程：

下面是 sort_heap 算法的实现细节。该函数接受两个迭代器，用来表现一个 heap 底部容器（vector）的头尾。注意，排序过后，原来的 heap 就不再是一个合法的 heap 了。

// 以下这个 sort_heap() 不允许指定“大小比较标准”
template <class RandomAccessIterator>
void sort_heap(RandomAccessIterator first, RandomAccessIterator last) {
  // 每执行一次 pop_heap，极值被放在最尾端
  // 扣除尾端再执行一次 pop_heap，次极值又被放在新尾端
  // 一直下去，最后即得排序结果
  while (last - first > 1) {
    pop_heap(first, last--);  // 每执行 pop_heap() 一次，操作范围退缩一格
}

make_heap 算法

这个算法用来将一段现有的数据转化为一个 heap。

// 将 [first,end) 排列为一个 heap
template <class RandomAccessIterator>
inline void make_heap(RandomAccessIterator first, 
                      RandomAccessIterator last) {
  __make_heap(first, last, value_type(first), distance_type(first));
}

// 下面 make_heap() 不允许指定“大小比较标准”
template <class RandomAccessIterator, class T, class Distance>
void __make_heap(RandomAccessIterator first, 
                 RandomAccessIterator last, T*, Distance*) {
  if (last - first < 2) return;     // 如果长度为 0 或 1，不必重新排列
  Distance len = last - first;
  // 找出第一个需要重排的子树头部，以 parent 标示出
  // 由于任何叶节点不需要执行 percolate down，所以有以下计算
  Distance parent = (len - 2)/2;    // parent 应命名为 holeIndex 更合理

  while (true) {
    // 重排以 parent 为首的子树。len 是为了让__adjust_heap() 判断操作范围
    __adjust_heap(first, parent, len, T(*(first + parent)));
    if (parent == 0) return;    // 走完根节点就结束
    parent--;                   // （已重排完子树的）头部向前一个节点
  }
}

4.7.3 heap 没有迭代器

heap 的所有元素都必须遵循完全二叉树排列规则，所以 heap 不提供遍历功能，也不提供迭代器。

4.7.4 heap 测试实例

// file: 4heap-test.cpp
#include <iostream>
#include <vector>
#include <algorithm>  // heap algorithms

int main() {
  {
    // test heap (vector)
    int ia[9] = {0, 1, 2, 3, 4, 8, 9, 3, 5};
    std::vector<int> v(ia, ia + 9);

    std::make_heap(v.begin(), v.end());
    for (int i = 0; i < v.size(); i++)
      std::cout << v[i] << " ";   // out: 9 5 8 3 4 0 2 3 1
    std::cout << std::endl;

    i.push_back(7);
    std::push_heap(v.begin(), v.end());
    for (int i = 0; i < v.size(); i++)
      std::cout << v[i] << " ";   // out: 9 7 8 3 5 0 2 3 1 4
    std::cout << std::endl;

    std::pop_heap(v.begin(), v.end());
    std::cout << v.back() << " ";   // out: 9
    v.pop_back();

    for (int i = 0; i < v.size(); i++) 
      std::cout << v[i] << " ";   // out: 8 7 4 3 5 0 2 3 1
    std::cout << std::endl;

    std::sort_heap(v.begin(), v.end());
    for (int i = 0; i < v.size(); i++)
      std::cout << v[i] << " ";   // out: 0 1 2 3 3 4 5 7 8
    std::cout << std::endl;
  }
  {
    // test heap (array)
    int ia[9] = {0, 1, 2, 3, 4, 8, 9, 3, 5};
    std::make_heap(ia, ia + 9);
    // array 无法动态改变大小，因此不可以对满载的 array 进行 push_back() 操作

    std::sort_heap(ia, ia + 9);
    for (int i = 0; i < 9; i++)
      std::cout << ia[i] << " ";   // out: 0 1 2 3 3 4 5 7 8 9
    std::cout << std::endl;
    
    // 重新再做一个 heap
    std::make_heap(ia, ia + 9);
    std::pop_heap(ia, ia + 9);
    std::cout << ia[8] << " ";   // out: 9
  }
  {
    // test heap (array)
    int ia[6] = {4, 1, 7, 6, 2, 5};
    std::make_heap(ia, ia + 6);
    for (int i = 0; i < 6; i++)
      std::cout << ia[i] << " ";   // out: 7 6 5 1 2 4
    std::cout << std::endl;
  }
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

4_7.md

4_7.md

4.7 heap

4.7.1 heap 概述

4.7.2 heap 算法

push_heap 算法

pop_heap 算法

sort_heap 算法

make_heap 算法

4.7.3 heap 没有迭代器

4.7.4 heap 测试实例

Files

4_7.md

Latest commit

History

4_7.md

File metadata and controls

4.7 heap

4.7.1 heap 概述

4.7.2 heap 算法

push_heap 算法

pop_heap 算法

sort_heap 算法

make_heap 算法

4.7.3 heap 没有迭代器

4.7.4 heap 测试实例