以最佳方式拆分陣列并具有多個功能-有解無憂

我正在尋找一個演算法名稱，最好是一個可以采用陣列并以最佳方式在函式之間拆分它的庫。我不在乎復雜性（它適用于非常低的資料集）。遞回檢查就足夠了。

一個很好的例子是陣列：

const list = [{gender: "female"}, {gender: "female"}, {gender: "female"}, {gender: "male"}]

所以如果我們用特殊函式運行它

specialFunc(list, [(val) => val === 'female', (val) => val === 'male']);

我們會得到這個

[
 [{gender: "female"}, {gender: "female"}, {gender: "female"}],
 [{gender: "male"}]
]

因為這是我們能得到的最好的分割。

但是，如果我們通過這個函式運行它：

specialFunc(list, [(val) => !!val, (val) => val === 'male']);

我會得到這個：

[
 [{gender: "female"}, {gender: "female"}, {gender: "female"}],
 [{gender: "male"}]
]

“最好的方法”是指每個陣列之間的數距（陣列長度）最小，每個陣列中的記錄數應該最大。

我已經搜索了 npmjs 和 github 很多，但找不到任何東西。

非常非常感謝你！

uj5u.com熱心網友回復：

我想我理解這些要求。您有許多要用于對專案進行分組的謂詞函式。對于同一個專案，多個謂詞可能回傳 true，因此有各種可用的分組。您想找到一個分組，以最大限度地減少結果大小的變化。

我不覺得你的例子很有說服力。我會嘗試我自己的。如果您的專案是8, 6, 7, 5, 3, 0, 9]并且您有三個謂詞：(n) => n < 7, (n) => n > 3, and (n) => n % 2 == 1，那么 the8只能進入第二組（它大于 3，但不少于 7 且不是奇數。）The6可以進入前兩組中的任何一組，在5可以在其中任何一個，等等，像這樣：

  8      6       7        5         3     0      9
[[1], [0, 1], [1, 2], [0, 1, 2], [0, 2], [0], [1, 2]]
  ^    ^  ^    ^  ^    ^  ^  ^    ^  ^    ^    ^  ^  
  |    |  |    |  |    |  |  |    |  |    |    |  |
  |     --|----|--|---- --|--|---- --|---- ----|--|------> Group 0 (n => n < 7)
  |       |    |  |       |  |       |         |  |
   ------- ---- --|------- --|-------|--------- --|------> Group 1 (n => n > 3)
                  |          |       |            |
                   ---------- ------- ------------ ------> Group 2 (n => n % 2 == 1)

由于第一個有一個選擇，第二個有兩個選擇，第三個有兩個，依此類推，可能的磁區數是1 * 2 * 2 * 3 * 2 * 1 * 2，或48。它們可能看起來像這樣：

[//     < 7          > 3      odd
  [ [6, 5, 3, 0], [8, 7, 9], [] ], 
  [ [6, 5, 3, 0], [8, 7],    [9] ], 
  [ [6, 5, 0],    [8, 7, 9], [3] ], 
  [ [6, 5, 0],    [8, 7],    [3, 9] ], 
  // ... (42 rows elided)  
  [ [0],          [8, 6, 9], [7, 5, 3] ], 
  [ [0],          [8, 6],    [7, 5, 3, 9] ]
]

然后，從這些中，我們需要選擇磁區大小變化最小的那些。我們可以對這個¹使用統計方差，即值與其均值的距離的平方和，所以[[6, 5, 3, 0], [8, 7, 9], []]，長度為4，3，和0；這有一個方差8.667。第二個的長度為4，2并且1方差為4.667。我們最好的可能性是3，2和2，方差為0.667。所以像這樣的答案[[6, 5, 0], [8, 7], [3, 9]]是合理的。可能有很多類似的行為；下面的實作只是選擇第一個。

如果這正確地描述了問題，那么這里有一些我認為可以處理它的代碼：

const range = (lo, hi) => Array .from ({length: hi - lo}, (_, i) => i   lo)

const sum = (ns) => ns .reduce ((a, b) => a   b, 0)

const filterMap = (f, m) => (xs) => 
  xs .flatMap ((x, i, a) => f (x, i, a) ? [m (x, i, a)] : [])

const cartesian = ([xs, ...xss]) =>
  xs == undefined
    ? [[]]
  : xs .flatMap (x => cartesian (xss) .map (ys => [x, ...ys]))

const variance = (ns, avg = sum (ns) / (ns .length || 1)) =>
  sum (ns .map (n => (n - avg) * (n - avg)))

const groupIndices = (count) => (xs) =>
  Object .values (xs .reduce (
    (a, x, i) => ((a [x] .push (i)), a), 
    Object .fromEntries (range (0, count) .map (n => [n, []]))
  ))

const specialFunc = (xs, preds) =>
  cartesian (xs .map ((x) => filterMap ((pred, i) => pred (x), (_, i) => i) (preds)))
    .map (groupIndices (preds .length))
    .reduce (
      ({min, v}, xs, _, __, v2 = variance (xs .map (x => x .length))) => 
          v2 < v ? {min: xs, v: v2} : {min, v}, 
      {min: [], v: Infinity}
    ) .min .map (ys => ys .map (i => xs [i]))

console .log (specialFunc (
  [8, 6, 7, 5, 3, 0, 9], 
  [n => n < 7, n => n > 3, n => n % 2 == 1]
)) //=> [[6, 5, 0], [8, 7], [3, 9]]

.as-console-wrapper {max-height: 100% !important; top: 0}

We start with some fairly standard utility functions. range calculates an integer range inclusive at the bottom, exclusive at the top, so that, for instance, range (3, 12) returns [3, 4, 5, 6, 7, 8, 9 ,10, 11]. sum simply totals an array of numbers, filterMap combines filtering with mapping, first testing whether an input matches a filter, and if so, transforming the result before putting it in the result. This implementation is unusual, in that the filter and mapping functions take more than just the value, but also the index and array properties found in things like map and filter. We need that as we'll use it to collect indices that match. (There are plenty of other ways to do that bit, but filterMap is a useful, reusable function.) cartesian returns the cartesian product of an array of arrays. For instance, cartesian ([1, 2, 3], [true], ['a', 'b']]) will return [[1, true, 'a'], [1, true, 'b'], [2, true, 'a'], [2, true, 'b'], [3, true, 'a'], [3, true, 'b']]. And finally variance calculate the statistical variance of a list of numbers.

Then we have a helper function, groupIndices. This might be easiest to show with an example. One of the 48 results from our cartesian product will be [1, 0, 1, 0, 2, 0, 1], which means that our original numbers (8, 6, 7, 5, 3, 0, 9], recall) are in groups 1, 0, 1, 0, 2, 0, and 1, respectively. groupIndices takes the number of groups and then takes that cartesian combination, and transforms it into [[1, 3, 5], [0, 2, 6], [4]], giving the indices of the values that are mapped to each group. (If I wasn't out of time, I'm sure we could skip this working with the indices and go directly against the values, but this works.)

現在我們找到了 main 函式，我還沒有嘗試為它找到一個好名字，所以它仍然被稱為specialFunc. 這用于filterMap將我們的串列變成[[1], [0, 1], [1, 2], [0, 1, 2], [0, 2], [0], [1, 2]]，呼叫cartesian結果，映射groupIndices這些值，然后用于reduce查找（第一個）方差最小的值。最后，它將結果索引映射回實際值。

同樣，我們可能會清理它并使用值而不是索引，但我首先想知道這是否是您正在尋找的行為型別。

^{¹的標準偏差有更清晰的含義，但它只是方差的平方根，也將被責令方式與方差相同，并不會涉及計算平方根。}

uj5u.com熱心網友回復：

function splitGroups<T>(items: T[], filters: ((x: T) => boolean)[]) {
    let options = filters.map(f => items.filter(f));
    let groups = filters.map((_, index) => ({ data: [] as T[], index }));
    let res: T[][] = [];
    while (options.reduce((partial_sum, a) => partial_sum   a.length, 0) > 0) {
        groups.sort((a, b) => a.data.length - b.data.length);
        let smallGroup = groups[0];
        const smallGroups = groups.filter(g => g.data.length === smallGroup.data.length);
        if (smallGroups.length > 1) {
            smallGroup = smallGroups[Math.floor(Math.random() * (smallGroups.length - 1))];
        }
        if (options[smallGroup.index].length === 0) {
            res.push(smallGroup.data);
            groups = groups.filter(x => x !== smallGroup);
            continue;
        }
        const item = options[smallGroup.index][0];
        options = options.map(x => x.filter(y => y !== item));
        smallGroup.data.push(item);
    }
    res = [...res, ...groups.map(x => x.data)];
    return res;
}

function accurateSplitGroups<T>(items: T[], filters: ((x: T) => boolean)[], times: number) {
    const options: { data: T[][]; diff: number }[] = [];
    for (let i = 0; i < times; i  ) {
        const res = splitGroups(items, filters);
        let diffBetweenGroups = 0;
        const groupsLens = res.map(x => x.length);
        for (let i = 0; i < groupsLens.length; i  ) {
            for (let j = 0; j < groupsLens.length; j  ) {
                diffBetweenGroups  = Math.abs(groupsLens[i] - groupsLens[j]);
            }
        }
        options.push({ data: res, diff: diffBetweenGroups });
    }
    return options.sort((a, b) => a.diff - b.diff)[0].data;
}

const items = [{ gender: 'female' }, { gender: 'female' }, { gender: 'female' }, { gender: 'male' }];
const filters = [(x: any) => !!x.gender, (x: any) => !!x.gender];
const res = accurateSplitGroups(items, filters, 100);
const b = res;

轉載請註明出處，本文鏈接：https://www.uj5u.com/gongcheng/374285.html

標籤：javascript 数组算法递归分裂

上一篇：是否有一種演算法可以為給定范圍內的每個條目找到最短的二進制表示？

下一篇：樹遞回-如何在深度優先搜索中包含條件？