在陣列中查找兩個數字，使得它們之間的所有元素都小于兩個數字中的最小值-有解無憂

如何在陣列中找到對，使得它們之間的元素小于對中的最小元素。

例如。
10 4 6 8 7

在上面的對中可以形成如下：
(10,4), (4,6), (6,8), (8,7)因為它們之間沒有元素所以它也有效
(10,6) As only它們之間的元素是4和4 < 6這是最小的 10,6
(10, 8) * As 4 < 8 and also 6 < 8 *

但 (10,7)不會成對為8 > 7

所有對：

(10, 4), (4, 6), (6, 8), (8, 7), (10, 6), (10, 8)

我的代碼：

def findPirs(n, values):
    res = 0
    arr = []
    for start in range(n-1):
        currMax = 0
        for end in range(start 1, n):
            m = min(values[start], values[end])
            if currMax < m:
                arr.append((values[start], values[end]))
                res  = 1
            currMax = max(currMax, values[end])

    return res, arr

這是我當前的代碼，它有效但效率非常低，因為它的時間復雜度為O(n^2)。
關于如何改進的任何建議？

uj5u.com熱心網友回復：

你不需要最低限度。如果新端點大于您迄今為止檢測到的最大值，那么您就獲勝了。

def findPairs(values):
    arr = []
    for start in range(len(values)-1):
        currMax = 0
        for end in values[start 1:]:
            if end > currMax:
                arr.append((values[start], end))
                currMax =  end
    return arr

nums = [10, 4, 6, 8, 7]
print(findPairs( nums ))

uj5u.com熱心網友回復：

我在上面的評論中對我暗示的解決方案進行了更多思考，我認為它可以作業。在最壞的情況下，您永遠不會比 O(n2) 更好，因為您可能必須輸出 O(n2) 對，但我們的想法是，當您不必輸出太多時，您可能希望擁有一種快速的演算法，然后（必然）在有很多輸出時變慢。O(n z) 演算法，其中 z 是輸出的大小，最適合需要查看所有資料并且需要輸出所有對的情況。

所以，我的解決方案，如下所示。如果輸出很大，它就無法與快速 O(n2) 解決方案競爭——它有更多的開銷——但是當你輸出 o(n2) 時它應該可以正常作業。

這個想法僅適用于超過兩個的間隔，所以我將所有鄰居視為我剛剛報告的特例。我可以在 O(n) 中輕松做到這一點。對于更長的時間間隔，我認為輸入是一系列山丘和山谷，山頂位于位置 i，其中 x[i-1] < x[i] 和 x[i] < x[i 1]。我不知道您想如何處理運行相同值的情況，但您可以調整代碼來處理它。這僅取決于您想對它們做什么。

如果我有兩個相鄰的山頂，它們之間有一個山谷，我必須報告兩邊的間隔。在山谷中間會有一些我不能使用的值，最初它是一個區間中的最小值（不能包括在內），后來我屏蔽了更大的塊，把小山丘變成了山谷。

因此，總的來說，我采用成對的相鄰山頂，在它們之間的山坡上輸出間隔。

def report_valid_intervals(x: list[int]):
    """Report intervals (i,j) such that for all i <= k < j, x[k] <= min(x[i],x[j])."""

    # All neighbours are valid, and these are a special case that we handle first.
    for i in range(len(x) - 1):
        yield i, i 1

    # Identify hills and valleys in the sequence. These are the valleys we use
    # to report intervals where the left and right points are larger than the
    # interval values. Identifying the hill tops (tops) and valleys (masks)
    # is done in O(n).
    masks = [False] * len(x)
    tops = find_tops(x, masks)

    # Now process valleys one by one. In each valley, we report the intervals inside it
    # in O(z) -- linear in what we report -- and then we update the mask, throwing
    # away one hill top (making it into a valley) for each pair, also in O(z) (we
    # only look at indices we also used in the output). When we are through all the
    # valleys we have elimiated at least half of them. Because we eliminate half each
    # time, the running time (excluding reporting) is the sum n   n/2   n/4   ... a
    # geomemtric sum that converges to O(n).
    while len(tops) > 1:
        for left, right in zip(tops, tops[1:]):
            for i, j in report_intervals_in_valley(x, left, right, masks):
                yield i, j
            update_valley_mask(x, left, right, masks)

        # eliminate the smaller tops
        tops = [top for top in tops if not masks[top]]

您可以像這樣輕松識別山丘和山谷（但您可能希望重新定義頂部以處理具有相同值的運行的情況）。

# Identifying hills and valleys...
def find_tops(x: list[int], masks: list[bool]) -> list[int]:
    """Locate the tops of the hills and mask out the bottom of the valleys.
    Runs in O(n) and only runs once."""
    # Get hill tops
    start = [0] if x[0] > x[1] else []
    end = [len(x)-1] if x[-1] > x[-2] else []
    mid = [i for i in range(1, len(x)-1) if x[i-1] < x[i] > x[i 1]]
    tops = start   mid   end

    # mask valley bottoms. Doesn't matter with the end-points,
    # since they are only relevant as tops.
    for i in range(1, len(x)-1):
        if x[i-1] > x[i] < x[i 1]:
            masks[i] = True

    return tops

迭代地穿過相鄰的山頂是線性時間，因為我們每次消除一半的山頂，得到一個幾何和。如果我們更多地分析底層運行時間，我不確定這里是否需要論證，但如果我們需要它，它就在那里。我們只需要按與我們從中獲得的輸出成比例的時間處理每個谷。在這里它變得有點棘手。

You can output intervals in a valley by running down one side, and then running down the other side as long as you do not include a value smaller than the next value on the first side. If you output something every time you do this, you get O(z), but you could run down a side and test the first value on the other side repeatedly, without outputting anything. Then you spend O(m) time, where m is the length of the side of the valley, and then the running time goes above O(n z).

However, we are going to mask out indices that cannot be used after we have processed a valley. All indices with a value smaller than the smallest peak cannot be part of an interval that spans across a hill top, and we can mask those out so we do not consider them. If we process valleys such that we do the expensive work on the smaller hill, we will mask out the indices we process, and since they are only masked out once and then never considered again, we are back to O(n z).

# When reporting from a valley, work out the valid left and right interval to
# consider. Those are the indices from the left/right respectively until we see a
# masked index. The masked indices are valleys that we have already processed,
# and we do not need to consider anything in there, as intervals spanning them
# must also contain values that are too large for a hill top inbetween.


def report_intervals_in_valley_left(x: list[int],
                                    left_top: int, right_top: int,
                                    masks: list[bool]):
    """Report all valid intervals in a valley. The running time is O(lv z) where
    z is the number of intervals we report and lv is the size of the left side
    of the valley (because we look at all those indices even if we do not
    report). We will mask out those intervals afterward so we never look at
    them again, so they cannot contribute more than O(n) to the algorithm in
    total."""
    for i in range(left_top, right_top):
        if masks[i]:  # end of this side of the valley
            break
        for j in range(right_top, left_top, -1):
            if masks[j]:
                break
            if x[j] < x[i   1]:  # we cannot go below the next on the other side
                break  # no more hits from i
            yield i, j


def report_intervals_in_valley_right(x: list[int],
                                     left_top: int, right_top: int,
                                     masks: list[bool]):
    """Report all valid intervals in a valley. The running time is O(rv z) where
    z is the number of intervals we report and rv is the size of the right side
    of the valley (because we look at all those indices even if we do not
    report). We will mask out those intervals afterward so we never look at
    them again, so they cannot contribute more than O(n) to the algorithm in
    total."""
    for j in range(right_top, left_top, -1):
        if masks[j]:
            break
        for i in range(left_top, right_top):
            if masks[i]:  # end of this side of the valley
                break
            if x[i] < x[j - 1]:  # we cannot go below the next on the other side
                break  # no more hits from i
            yield i, j


def report_intervals_in_valley(x: list[int],
                               left_top: int, right_top: int,
                               masks: list[bool]):
    """Report all valid intervals in a valley. The running time is O(v z) where
    z is the number of intervals we report and v is the size of the side of the
    valley that will be masked out and not contribute to the running time again."""
    # make sure it is the side with the smallest top that risks doing
    # the most work
    if x[left_top] < x[right_top]:
        yield from report_intervals_in_valley_left(x, left_top, right_top, masks)
    else:
        yield from report_intervals_in_valley_right(x, left_top, right_top, masks)

When you update a valley by masking, you remove the values smaller than the smallest peak. Here, my current code doesn't quite do the right thing.

def update_valley_mask(x: list[int],
                       left_top: int, right_top: int,
                       masks: list[bool]):
    """Mask out the indices that are smaller than the smallest top. Those cannot
    be part of valid pairs when we merge this valey with its neighbours. The running
    time is less than the time we spent reporting intervals, since each index we
    consider was part of at least one pair that we outputted."""
    mi = min(x[left_top], x[right_top])
    masks[left_top if x[left_top] == mi else right_top] = True

    for i in range(left_top, right_top):  # range is entire interval, but we break
        if masks[i]:
            break  # we enter the already masked, so get out to get the correct running time
        if x[i] < mi:
            masks[i] = True
    for i in range(right_top, left_top, -1):
        if masks[i]:
            break
        if x[i] < mi:
            masks[i] = True

The code processes the sides of the valley, and it looks at indices that do not get masked out. That means that it doesn't guarantee the O(n z) running time. However, it should be simple enough to keep track of the closest masked index on the left and right of each hill top. We know the tops of our valley when we mask, so it would just be updating a list. If we do that, then we can start from the closest masked value and work our way up towards the hill top, and then we will mask all the indices we process exact the last, and then the running time is correct again.

I can have a go at it, but I suspect this is only of theoretical interest. The O(n2) solutions are probably fast enough for most cases (and if you have to output many intervals you cannot do better). But I'm reasonably sure that you can do it in O(n z), and probably smarter than this solution.

轉載請註明出處，本文鏈接：https://www.uj5u.com/yidong/312889.html

標籤：Python 数学数据结构

上一篇：廣義關聯勒讓德多項式

下一篇：將雙數轉換為數字，反之亦然？