最小增量數，以便所有元素都有一個公約數-有解無憂

我最近在一次采訪中遇到了這個問題：

給定一組數字X = [X_1, X_2, ...., X_n]，其中X_i <= 500for 1 <= i <= n。增加集合中的數字（僅正增量），以便集合中的每個元素都有一個公約數 >=2，并且所有增量的總和最小。

例如，如果X = [5, 7, 7, 7, 7]新集合是X = [7, 7, 7, 7, 7]因為您可以將 2 添加到 X_1。X = [6, 8, 8, 8, 8]有一個公分母 2 但不正確，因為我們要加 6（在 4 個 7 的每一個上加 2 到 5 和 1）。

我有一個看似有效的解決方案（因為它通過了所有測驗用例），它回圈遍歷 <500 的素數，并為每個X_iinX找到大于 X_i 的素數的最接近倍數。

function closest_multiple(x, y)
    return ceil(x/y)*y

min_increment = inf
for each prime_number < 500:
    total_increment = 0
    for each element X_i in X:
        total_increment  = closest_multiple(X_i, prime_number) - X_i
    min_increment = min(min_increment, total_increment)

return min_increment

它在技術上是 O(n) 但有沒有更好的方法來解決這個問題？有人建議我使用動態編程，但我不確定它如何適合這里。

uj5u.com熱心網友回復：

常量有界條目案例

當X_i受常數限制時，漸近達到的最佳時間是O(n)，因為讀取所有輸入至少需要那么長時間。有一些實際改進：

過濾掉重復項，這樣您就可以使用（元素，頻率）對的串列。
提前停止回圈。
的更快計算closest_multiple(x, p) - x。這稍微依賴于硬體/語言，但單個整數模運算幾乎肯定比 int -> float cast、float 除法、ceiling() 呼叫和對相同數量的乘法更快。

freq_counts <- Initialize-Counter(X) // List of (element, freq) pairs
min_increment = inf

for each prime_number < 500:
    total_increment = 0
    for each pair X_i, freq in freq_counts:
        total_increment  = (prime_number - (X_i % prime_number)) * freq
        if total_increment >= min_increment: break
    min_increment = min(min_increment, total_increment)

return min_increment

大型條目案例

對于統一選擇的隨機資料，答案幾乎總是使用“2”作為除數，而更大的質數除數幾乎是不可能的。但是，讓我們解決最壞的情況。

在這里，讓 max(X) = M，所以我們的輸入大小是O(n (log M))位。我們想要一個在該輸入大小中是次指數的解決方案，因此不可能找到低于 M（甚至 sqrt(M)）的所有質數。我們正在尋找任何能夠為我們提供min-total-increment 的素數；我們將把這樣的素數稱為min-prime。找到這樣的素數后，我們可以得到線性時間內的最小總增量。我們將使用分解方法以及兩個觀察結果。

觀察 1：答案總是至多n，因為素數 2 除以所需的增量X_i至多為 1。
觀察 2：我們試圖找到除我們的大部分條目X_i之外的質數或比X_i其稍大的數X_i。讓Consecutive-Product-Divisors[i]是除以任何一個的所有素數的集合X_i or X_i 1，我將其縮寫為CPD[i]。這正是除的所有素數的集合X_i * (1 X_i)。
（觀察 2 續）如果U是我們答案的已知上限（此處最多為 n），并且p是 X 的最小素數，則p必須除以我們的 CPD 條目中的一個X_i或X_i 1至少N - U/2一個。使用 CPD 陣列上的頻率計數來查找所有這些素數。

一旦您有了候選素數串列（保證所有最小素數都在此串列中），您就可以使用您的演算法單獨測驗每個素數。由于數字 k 最多可以有O(log k)不同的素數除數，這給出了O(n log M)可能的不同素數，這些素數可以整除
[X_1*(1 X_1), X_2*(1 X_2), ... X_n*(1 X_n)]構成我們候選串列的至少一半的數字。您可以通過一些更仔細的分析來降低這個界限，但它可能不會強烈影響整個演算法的漸近運行時間。

大型條目的更優化復雜性

這個解決方案的復雜性很難用簡短的形式寫出來，因為瓶頸是分解n最大尺寸的數字M，加上對最大尺寸數字的O(n^2 log M)算術（即加法、減法、乘法、模）運算M。這并不意味著運行時未知：如果您選擇任何整數分解演算法和大整數算術演算法，您可以準確地推匯出運行時。不幸的是，由于因式分解，上述演算法最著名的運行時間是超多項式（但次指數）。

How can we do better? I did find a more complicated solution, based on Greatest Common Divisors (GCD) and dynamic-programming-like that runs in polynomial time (although likely much slower on non-astronomical-size inputs) since it doesn't rely on factoring.

The solution relies on the fact that at least one of the following two statements is true:

The number 2 is a min-prime for X, or
For at least one value of i, 1 <= i <= n there is an optimal solution where X_i remains unincremented, i.e. where one of the divisors of X_i produces a min-total-increment.

GCD-Based polynomial time algorithm

We can test 2 and all small primes quickly for their minimum costs. In fact, we'll test all primes p, p <= n, which we can do in polynomial time, and factor out these primes from X_i and its first n increments. This leads us to the following algorithm:

// Given: input list X = [X_1, X_2, ... X_n].
// Subroutine compute-min-cost(list A, int p) is 
// just the inner loop of the above algorithm.

min_increment = inf;
for each prime p <= n:
    min_increment = min(min_increment, compute-min-cost(X, p));

// Initialize empty, 2-D, n x (n 1) list Y[n][n 1], of offset X-values
for all 1 <= i <= n:
    for all 0 <= j <= n:
        Y[i][j] <- X[i]   j;

for each prime p <= n:  // Factor out all small prime divisors from Y
    for each Y[i][j]:
        while Y[i][j] % p == 0:
            Y[i][j] /= p;

for all 1 <= i <= n: // Loop 1
    // Y[i][0] is the test 'unincremented' entry
    // Initialize empty hash-tables 'costs' and 'new_costs'
    // Keys of hash-tables are GCDs,
    // Values are a running sum of increment-costs for that GCD

    costs[Y[i][0]] = 0;
    for all 1 <= k <= n: // Loop 2
        if i == k: continue;
        clear all entries from new_costs  // or reinitialize to empty
        for all 0 <= j < n: // Loop 3
            for each Key in costs: // Loop 4
                g = GCD(Key, Y[k][j]);
                if g == 1: continue;
                if g is not a key in new_costs:
                    new_costs[g] = j   costs[Key];
                else:
                    new_costs[g] = min(new_costs[g], j   costs[Key]);
        
        swap(costs, new_costs);
    if costs is not empty:
        min_increment = min(min_increment, smallest Value in costs);

return min_increment;

The correctness of this solution follows from the previous two observations, and the (unproven, but straightforward) fact that there is a list [X_1 r_1, X_2 r_2, ... , X_n r_n] (with 0 <= r_i <= n for all i) whose GCD is a divisor with minimum increment cost.

The runtime of this solution is trickier: GCDs can easily be computed in O(log^2(M)) time, and the list of all primes up to n can be computed in low poly(n) time. From the loop structure of the algorithm, to prove a polynomial bound on the whole algorithm, it suffices to show that the maximum size of our 'costs' hash-table is polynomial in log M. This is where the 'factoring-out' of small primes comes into play. After iteration k of loop 2, the entries in costs are (Key, Value) pairs, where each Key is the GCD of k 1 elements:
our initial Y[i][0], and [Y[1][j_1], Y[2][j_2], ... Y[k][j_k]] for some 0 <= j_l < n. The Value for this Key is the minimum increment sum needed for this divisor (i.e. sum of the j_l) over all possible choices of j_l.

There are at most O(log M) unique prime divisors of Y[i][0]. Each such prime divides at most one key in our 'costs' table at any time: Since we've factored out all prime divisors below n, any remaining prime divisor p can divide at most one of the n consecutive numbers in any Y[j] = [X_j, 1 X_j, ... n-1 X_j]. This means the overall algorithm is polynomial, and has a runtime below O(n^4 log^3(M)).

From here, the open questions are whether a simpler algorithm exists, and how much better than this bound can you achieve. You can definitely optimize this algorithm (including using the early-stopping and frequency counts from before). It's also likely that better bounds on counting large-and-distinct-prime-divisors for consecutive numbers shows this solution is already better than that stated runtime, but a simplification of this solution would be very interesting.

轉載請註明出處，本文鏈接：https://www.uj5u.com/gongcheng/346030.html

標籤：数学动态规划

上一篇：求和藥劑

下一篇：AJAX成功/錯誤功能不起作用