東京大学新領域創成科学研究科メディカル情報生命専攻 2015年8月実施問題9

Author

Description

We sort an integer array, x. Answer the following questions, assuming that integer operations such as comparison, addition, subtraction, loading from memory, and storing to memory, take a unit time.

(1) Fill (A) and (B) to complete the function below that constructs an array \(y[s, s+1, \cdots, e-1]\) sorted in ascending order by merging two subarrays \(x[s, s+1, \cdots, m-1]\) and \(x[m, m+1, \cdots, e-1]\), each of which is sorted in ascending order. You may write multiple lines if needed.

void merge_two_arrays(int x[], int s, int m, int e, int y[]) {
    int i = s, j = m, k = s;
    while(i < m && j < e) {
        if(x[i] < x[j]) {
            // (A)
        } else {
            // (B)
        }
    }
    while(i < m) { y[k] = x[i]; k++; i++; }
    while(j < e) { y[k] = x[j]; k++; j++; }
}

(2) Fill (C) to complete the function below that takes an integer array \(x[s, s+1, \cdots, e-1]\) as an input and outputs the sorted array \(x[s, s+1, \cdots, e-1]\). Array \(y\) is a temporary array of the same size as \(x\).

void merge_sort(int x[], int s, int e, int y[]) {
    if(e - s <= 1) return;
    int m = (s + e) / 2;
    // (C)
    merge_two_arrays(x, s, m, e, y);
    for(int i = s; i < e; i++) { x[i] = y[i]; }
}

(3) Find the worst-case time complexity of sorting an integer array of size \(n\) by merge_sort().

(4) In order to accelerate sorting by merge_sort(), we implement a hardware that calculates a function cmp(x1, x2, x3) that returns 1 when x1 is the smallest element among x1, x2 and x3, 2 when x2 is the smallest, and 3 when x3 is the smallest. Show how to accelerate merge_sort() using the function cmp(). We assume that the function cmp() and branching by its return value take a unit time, respectively.

我们对整数数组 x 进行排序。回答以下问题，假设整数操作（如比较、加法、减法、从内存加载和存储到内存）占用单位时间。

(1) 填写 (A) 和 (B)，以完成以下函数，该函数通过合并两个升序排列的子数组 \(x[s, s+1, \cdots, m-1]\) 和 \(x[m, m+1, \cdots, e-1]\) 来构造一个升序排列的数组 \(y[s, s+1, \cdots, e-1]\)。如有需要，可以写多行代码。

void merge_two_arrays(int x[], int s, int m, int e, int y[]) {
    int i = s, j = m, k = s;
    while(i < m && j < e) {
        if(x[i] < x[j]) {
            // (A)
        } else {
            // (B)
        }
    }
    while(i < m) { y[k] = x[i]; k++; i++; }
    while(j < e) { y[k] = x[j]; k++; j++; }
}

(2) 填写 (C)，以完成以下函数，该函数将整数数组 \(x[s, s+1, \cdots, e-1]\) 作为输入，并输出已排序的数组 \(x[s, s+1, \cdots, e-1]\)。数组 \(y\) 是与 \(x\) 大小相同的临时数组。

void merge_sort(int x[], int s, int e, int y[]) {
    if(e - s <= 1) return;
    int m = (s + e) / 2;
    // (C)
    merge_two_arrays(x, s, m, e, y);
    for(int i = s; i < e; i++) { x[i] = y[i]; }
}

(3) 通过 merge_sort() 找出对大小为 \(n\) 的整数数组进行排序的最坏情况下的时间复杂度。

(4) 为了加速 merge_sort() 排序，我们实现了一种硬件，它计算一个函数 cmp(x1, x2, x3)，当 x1 是 x1、x2 和 x3 中最小的元素时返回 1，当 x2 是最小的时返回 2，当 x3 是最小的时返回 3。展示如何使用函数 cmp() 加速 merge_sort()。我们假设函数 cmp() 和根据其返回值进行的分支分别占用单位时间。

Kai

(1)

The task is to fill in the parts labeled as (A) and (B) to correctly merge two sorted subarrays.

void merge_two_arrays(int x[], int s, int m, int e, int y[]) {
    int i = s, j = m, k = s;
    while(i < m && j < e) {
        if(x[i] < x[j]) {
            y[k] = x[i]; // (A)
            i++; k++; // (A)
        } else {
            y[k] = x[j]; // (B)
            j++; k++; // (B)
        }
    }
    while(i < m) { y[k] = x[i]; k++; i++; }
    while(j < e) { y[k] = x[j]; k++; j++; }
}

Explanation:

(A): When the element at x[i] is less than x[j], we copy x[i] into y[k] and increment both i and k.
(B): When x[i] is not less than x[j], we copy x[j] into y[k] and increment both j and k.

(2)

The task is to fill in the part labeled as (C) to correctly implement the merge sort algorithm.

void merge_sort(int x[], int s, int e, int y[]) {
    if(e - s <= 1) return;
    int m = (s + e) / 2;
    merge_sort(x, s, m, y); // (C)
    merge_sort(x, m, e, y); // (C)
    merge_two_arrays(x, s, m, e, y);
    for(int i = s; i < e; i++) { x[i] = y[i]; }
}

Explanation:

(C): The merge sort algorithm recursively divides the array into two halves until each subarray has only one element. Then, it merges the subarrays using the merge_two_arrays function. The two recursive calls are to merge_sort(x, s, m, y) and merge_sort(x, m, e, y).

(3)

The worst-case time complexity of merge sort is O(n \log n), where \(n\) is the number of elements in the array. This complexity arises because the array is divided into two halves \(\log n\) times, and each level of recursion requires \(O(n)\) operations to merge the subarrays.

(4)

To further optimize the merge sort, we can leverage the hardware-supported cmp(x1, x2, x3) function by modifying the traditional two-way merge sort into a three-way (ternary) merge sort. This approach divides the array into three subarrays instead of two, which can reduce the depth of the recursive calls and potentially improve the sorting speed.

Approach

Three-way Split:
Divide the array into three nearly equal parts. For an array x[s..e-1], calculate the indices m1 and m2 such that m1 = s + (e - s) / 3 and m2 = s + 2 * (e - s) / 3.
The subarrays are then x[s..m1-1], x[m1..m2-1], and x[m2..e-1].
Merge Process:
Utilize the cmp(x1, x2, x3) function to find the smallest element among the current elements of the three subarrays.
Insert the smallest element into the temporary array y, and move the corresponding pointer forward.
Implementation:
Maintain three pointers for the start of each subarray and one for the position in the merged array.
Continue merging until all elements are processed.

Example Code

void merge_three_arrays(int x[], int s, int m1, int m2, int e, int y[]) {
    int i = s, j = m1, k = m2, l = s;
    while (i < m1 && j < m2 && k < e) {
        int cmp_result = cmp(x[i], x[j], x[k]);
        if (cmp_result == 1) {
            y[l] = x[i];
            i++;
        } else if (cmp_result == 2) {
            y[l] = x[j];
            j++;
        } else {
            y[l] = x[k];
            k++;
        }
        l++;
    }
    // Merge remaining elements of the first two arrays
    while (i < m1 && j < m2) {
        y[l++] = (x[i] < x[j]) ? x[i++] : x[j++];
    }
    // Merge remaining elements of the first and third arrays
    while (i < m1 && k < e) {
        y[l++] = (x[i] < x[k]) ? x[i++] : x[k++];
    }
    // Merge remaining elements of the second and third arrays
    while (j < m2 && k < e) {
        y[l++] = (x[j] < x[k]) ? x[j++] : x[k++];
    }
    // Copy any remaining elements
    while (i < m1) { y[l++] = x[i++]; }
    while (j < m2) { y[l++] = x[j++]; }
    while (k < e) { y[l++] = x[k++]; }
}

void merge_sort_ternary(int x[], int s, int e, int y[]) {
    if(e - s <= 1) return;
    int m1 = s + (e - s) / 3;
    int m2 = s + 2 * (e - s) / 3;
    merge_sort_ternary(x, s, m1, y);
    merge_sort_ternary(x, m1, m2, y);
    merge_sort_ternary(x, m2, e, y);
    merge_three_arrays(x, s, m1, m2, e, y);
    for(int i = s; i < e; i++) { x[i] = y[i]; }
}

Conclusion

By implementing a three-way merge sort using the cmp(x1, x2, x3) function, we reduce the number of levels in the recursion tree, leading to a shallower depth. While the time complexity remains \(O(n \log n)\), the base of the logarithm changes from \(2\) to \(3\), resulting in fewer recursive steps and potentially faster execution, especially with large data sets. This method efficiently utilizes the three-element comparison capability provided by the hardware, optimizing the sorting process.

Knowledge

归并排序复杂度分析递归排序算法

难点思路

归并排序的难点主要在于如何有效地实现合并两个有序子数组。尤其是在大数据集下，如何优化合并过程中的比较操作是一个关键问题。通过引入专用的硬件指令（如 cmp 函数），可以有效减少比较次数，提高算法效率。

解题技巧和信息

分治法：将一个问题分成多个子问题，递归解决每个子问题，然后合并结果。归并排序即为典型的分治算法。
优化比较操作：在排序过程中，比较操作的次数对整体效率影响很大。可以通过硬件支持或者改进算法来减少不必要的比较。

重点词汇

Merge Sort: 归并排序
Recursion: 递归
Time Complexity: 时间复杂度
Space Complexity: 空间复杂度
Hardware Acceleration: 硬件加速
Comparison Operation: 比较操作

参考资料

《算法导论》第三版，第 2 章：排序与查找
《计算机算法设计与分析》第 5 章：分治策略
《数据结构与算法分析：C 语言描述》：归并排序章节

東京大学 新領域創成科学研究科 メディカル情報生命専攻 2015年8月実施 問題9