東京大学 新領域創成科学研究科 メディカル情報生命専攻 2022年8月実施 問題7
Author
Description
Given a point set, \(P = \{ \mathbf{v_1}, \ldots, \mathbf{v_n} \} (\mathbf{v_i} \in \mathbb{R}^2)\), the convex hull, \(C_P\), is a convex polygon \(\{ \mathbf{v_{x_1}}, \mathbf{v_{x_2}}, \ldots, \mathbf{v_{x_m}} \} (m \leq n)\) such that \(\{ \mathbf{v_{x_1}}, \mathbf{v_{x_2}}, \ldots, \mathbf{v_{x_m}} \}\) comprises a subset of \(P\) arranged in a clockwise order and that all points in \(P\) are included in or on the edge of the polygon.
(1) The algorithm shown below computes \(C_P\) from a given \(P\). Show the complexity of line 4-7 in the \(O()\) notation. Explain the reason. Arithmetic operations over real numbers have infinite precision and can be computed in a unit time. \(\mathrm{cw}(a, b, c)\) returns true only if \(\overrightarrow{ab}\) and \(\overrightarrow{bc}\) turns clockwise (i.e., the cross product \(\overrightarrow{ab} \times \overrightarrow{bc}\) is negative).
(2) Suppose \(X = \{ x_1, \ldots, x_n \} (x_i \in \mathbb{R})\) is given. Let \(y_i = (x_i, x_i^2)\), and we computed the convex hull \(C_Y\) for \(Y = \{ y_1, \ldots, y_n \} (y_i \in \mathbb{R}^2)\). Given \(C_Y\), prove that the sorted array of \(X\) can be computed in linear time in terms of \(n\).
(3) Prove that the complexity of an algorithm that computes the convex hull of \(n\) points (not necessarily the algorithm shown in (1)) is not smaller than the complexity of a sorting algorithm for \(n\) elements.
(4) Suppose we use a random function that returns true/false at 50% of probability, respectively. Prove that we need to call \(r()\) at least \(\log_2 n!\) times to output one of all possible permutations of integers from 1 to \(n\) with the equal probability.
(5) Consider a computation model where the sorting algorithm with the smallest complexity is based on comparison between two numbers. Prove that the complexity of computing a convex hull is intrinsically \(\Theta(n \log n)\). Use the Stirling’s approximation formula, \(\ln(n!) = n \ln(n) - n + O(\log n)\), if need be.
Kai
(1) Complexity of lines 4-7 in the algorithm
To analyze the complexity of lines 4-7, we note that the primary operation in the for
loop is the while
loop which pops elements from the stack and then pushes the current point \(\mathbf{v}\). The key observation is that each point is pushed and popped from the stack at most once.
- Push Operation: Each point is pushed exactly once.
- Pop Operation: Each point is popped at most once.
Therefore, the total number of operations in the while loop is \(O(n)\). Hence, the complexity of lines 4-7 is \(O(n)\).
(2) Sorting \(X\) from \(C_Y\)
Given:
- \(X = \{x_1, \ldots, x_n\}\) where \(x_i \in \mathbb{R}\)
- \(Y = \{y_1, \ldots, y_n\}\) where \(y_i = (x_i, x_i^2)\)
- \(C_Y\) is the convex hull of \(Y\)
To prove: The sorted array of \(X\) can be computed in linear time \(O(n)\) given \(C_Y\).
- Observe the relationship between \(X\) and \(Y\):
- Each point in \(Y\) is of the form \((x, x^2)\)
-
This forms a parabola in the 2D plane
-
Key insight:
- The convex hull \(C_Y\) captures the "outer" points of this parabola
-
These outer points correspond to the smallest and largest elements of \(X\)
-
Properties of the convex hull \(C_Y\):
- It consists of an upper hull and a lower hull
-
The leftmost and rightmost points of \(C_Y\) correspond to the minimum and maximum elements of \(X\) respectively
-
Algorithm to sort \(X\) using \(C_Y\): a) Identify the leftmost point of \(C_Y\). This gives the minimum of \(X\). b) Identify the rightmost point of \(C_Y\). This gives the maximum of \(X\). c) Traverse the upper hull of \(C_Y\) from left to right. Each point \((x, x^2)\) encountered gives the next element in the sorted \(X\).
-
Time complexity analysis:
- Finding the leftmost and rightmost points of \(C_Y\): \(O(n)\)
- Traversing the upper hull of \(C_Y\): \(O(n)\)
-
Total time complexity: \(O(n)\)
-
Correctness:
- The upper hull of \(C_Y\) contains all points that are not "hidden" by other points when viewed from above
- These points, when projected onto the x-axis, give the sorted order of \(X\)
Therefore, given the convex hull \(C_Y\), we can indeed sort \(X\) in linear time \(O(n)\).
(3) Complexity of computing convex hull vs. sorting
Let's assume we have an algorithm \(A\) that can compute the convex hull of \(n\) points faster than any comparison-based sorting algorithm. We will show that this leads to a contradiction by leveraging the known lower bound of comparison-based sorting algorithms.
Consider a set of \(n\) distinct real numbers \({x_1, x_2, \ldots, x_n}\) that we want to sort. Transform these numbers into a set of points \(P = {(x_1, 0), (x_2, 0), \ldots, (x_n, 0)}\).
All points in \(P\) lie on a straight line (the \(x\)-axis). The convex hull of these collinear points will consist of exactly two points: the leftmost point and the rightmost point. The leftmost point corresponds to the minimum value in our original set, and the rightmost point corresponds to the maximum value.
Assume algorithm \(A\) computes the convex hull faster than any comparison-based sorting algorithm. Then we could:
- Treat our \(n\) numbers as \(n\) points.
- Compute the convex hull with \(A\).
- Convert the sorted \(n\) points back to \(n\) numbers in order.
Thus, we would have a way to sort the elements as quickly as \(A\), which contradicts to our assumption that algorithm \(A\) computes the convex hull faster than any comparison-based sorting algorithm.
Therefore, we prove that the complexity of computing the convex hull of \(n\) points cannot be smaller than the complexity of sorting \(n\) elements in the general case.
(4) Number of calls to r()
for random permutations
- The number of permutations of \(n\) integers is \(n!\).
- To represent \(n!\) different outcomes (permutations) uniformly, we need at least \(\log_2 n!\) bits of information, as this is the minimum number of bits required to encode \(n!\) distinct values.
- Each call to
r()
generates one bit of information.
Therefore, to generate a random permutation uniformly, we need at least \(\log_2 n!\) bits, and hence we need to call r()
at least \(\log_2 n!\) times.
To prove all possible permutations of integers from 1 to \(n\) has the equal probability to be generated by r(n)
for \(\log_2 n!\) times, we can number all the permutations from 0 to \(n!-1\) and use the binary representation of the numbers to determine the permutation.
Since r()
generates a random bit with equal probability, the binary representation of the numbers will be uniformly distributed, and thus all permutations will have an equal probability of being generated.
(5) Convex hull complexity using comparison model
Sorting based on comparison between 2 numbers can be viewed as constructing a decision tree where each comparison narrows down the possible orderings. The height of this tree represents the number of comparisons needed, which is \(\log_2 n!\) in the worst case.
Using Stirling's approximation formula \(\ln(n!) = n \ln(n) - n + O(\log n)\), we have:
So sorting \(n\) elements requires \(\Theta(n \log n)\) comparisons in the comparison model.
To show that the complexity of computing the convex hull is intrinsically \(\Theta(n \log n)\), we can consider both the lower bound and the upper bound of the complexity:
- Lower bound:
-
Based on the conclusion in (3), the complexity of computing the convex hull is not smaller than the complexity of sorting \(n\) elements, so the lower bound of the complexity of computing the convex hull is \(\Omega(n \log n)\).
-
Upper bound:
- Based on the algorithm given in (1), computing the convex hull consists of sorting the points based on the slope of the line from the leftmost point, which can be done in \(O(n \log n)\) time, and construct the convex hull in \(O(n)\) time using a stack, which can be done in \(O(n)\) time based on the analysis in (1). Therefore, the upper bound of the complexity of computing the convex hull is \(O(n \log n) + O(n) = O(n \log n)\).
Hence, the complexity of computing the convex hull is intrinsically \(\Theta(n \log n)\).
Knowledge
计算几何 #凸包 #排序算法 #复杂度分析
难点解题思路
- Understanding the relationship between the convex hull of a set of points and the sorted order of the corresponding values.
- Analyzing the complexity of algorithms based on the operations performed and the number of times each operation is executed.
- Relating the lower bound of comparison-based sorting algorithms to the complexity of computing the convex hull.
- Using the concept of decision trees to analyze the complexity of sorting algorithms based on comparisons.
- Applying Stirling's approximation formula to analyze the complexity of sorting algorithms and the convex hull computation.
- Demonstrating the intrinsic complexity of computing the convex hull based on the lower and upper bounds.
解题技巧和信息
- 在分析算法复杂度时,注意识别关键操作及其频率。
- 将一个复杂问题归约到已知下界问题(如排序问题)来确定其复杂度。
重点词汇
- convex hull: 凸包
- cross product: 叉积
- algorithm complexity: 算法复杂度
参考资料
- "Introduction to Algorithms" by Cormen, Leiserson, Rivest, and Stein, Chap. 33.
- "Computational Geometry: Algorithms and Applications" by de Berg, van Kreveld, Overmars, and Schwarzkopf, Chap. 1 and 3.