I'm trying to find the best way to merge arrays (or create new ones) based on the values they share.

List<String[]> dictionary = new ArrayList<String[]>();

This is my "dictionary". It is filled with 2-word arrays; for example, it contains the arrays:

["A","B"]
["B","C"]
["D","E"]
["F","C"]
["G","H"]
["T","D"]

I need to merge them by the values they share, so the finished "dictionary" (or a completely new list) would look like this:

["A","B","C","F"];
["D","E","T"];
["G","H"];

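Also, the old arrays don't have to be removed; they can stay in the "dictionary". I just need the merged arrays, and I'm having a hard time figuring it out.

The arrays don't have to be sorted in any way.

This is what I have so far, but it doesn't work: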

    public static void SynonymsMerge(List<String[]> dictionary){
    ArrayList<ArrayList<String>> newDictionary = new ArrayList<ArrayList<String>>();
    for(int i=0;i < dictionary.size(); i++){
        ArrayList<String> synonyms = new ArrayList<String>();
        for(int j=0; j < dictionary.get(i).length; j++){
            synonyms.add(dictionary.get(i)[j]);
        }
        newDictionary.add(synonyms);
    }
    for(int i=0;i< newDictionary.size();i++){
        for(int j=0; j < newDictionary.size();j++){
            for (int k=0; k < newDictionary.get(j).size() ;k++) {
                    if (newDictionary.get(i).equals(newDictionary.get(j)))
                        continue;
                    if (newDictionary.get(i).contains(newDictionary.get(j).get(k)))
                        newDictionary.get(i).addAll(newDictionary.get(j));
            }
        }
    }
    }

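uj5u.com user reply:

First, here's the code. I changed the input type from List<String[]> to List<List<String>>, because mixing lists and arrays doesn't really make sense. The same applies to the output type.

Code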

import java.util.ArrayList;
import java.util.List;

// Note: Stream.toList() requires Java 16 or later.
public static List<List<String>> merge(List<List<String>> dictionary) {
        List<List<String>> newDictionary = new ArrayList<>();

        for (List<String> stringPair : dictionary) {

            // Collect the index of every existing group that shares a string with the pair.
            List<Integer> matchIndices = new ArrayList<>();
            for (int i = 0; i < newDictionary.size(); i++) {
                List<String> newStrings = newDictionary.get(i);

                for (String str : stringPair) {
                    if (newStrings.contains(str)) {
                        matchIndices.add(i);
                        break; // record each group only once, even if it contains both strings
                    }
                }
            }
            if (matchIndices.size() == 0) {
                // No overlap: the pair becomes a new group.
                newDictionary.add(new ArrayList<>(stringPair));
                continue;
            }

            matchIndices.sort(Integer::compareTo);

            if (matchIndices.size() == 1) {
                newDictionary.get(matchIndices.get(0)).addAll(stringPair);
            } else {
                // Several groups overlap: fold them all into the first one.
                int last = matchIndices.remove(0);
                while (matchIndices.size() > 0) {
                    int i = matchIndices.get(0);
                    newDictionary.get(last).addAll(newDictionary.get(i));
                    newDictionary.remove(i);
                    matchIndices.remove(0);
                    // Removing group i shifts every later group down by one.
                    matchIndices = new ArrayList<>(matchIndices.stream().map(a -> a - 1).toList());
                }
                // Also add the pair itself, in case it holds strings not yet in any group.
                newDictionary.get(last).addAll(stringPair);
            }
        }
        // Remove the duplicates that accumulated inside each group.
        newDictionary = newDictionary.stream()
                .map(strings -> strings.stream().distinct().toList())
                .toList();

        return newDictionary;
    }

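How does it work?

Input: dictionary, of type List<List<String>> (the inner lists have a maximum size of 2, even though the function can in theory handle more strings)
Output: newDictionary, of type List<List<String>>

The following steps are executed for every input pair/list of strings in dictionary:

1. Find all existing distinct "groups" (by index) in newDictionary that already contain a string from the pair. This list of indices is called matchIndices.
   Example: stringPair=["A","E"] and newDictionary=[["I", "A", "O"], ["P", "D"]] result in matchIndices=[0], because only "A" occurs, and only in the first element of newDictionary.
2. If matchIndices.size() is 0, create a new group in newDictionary from the string pair. Back to 1.
3. If matchIndices.size() is 1, append the strings of the pair to the group in newDictionary whose index is given in matchIndices. Back to 1.
4. If matchIndices.size() is greater than 1, the groups in newDictionary whose indices are given in matchIndices must be merged together in a loop. Back to 1.

Finally, we have to make sure that the groups in newDictionary contain no duplicates.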
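
To make the matchIndices lookup concrete, here is a minimal sketch of just that first step, isolated from the rest of the method. The class name MatchIndicesDemo and the helper findMatchIndices are hypothetical names chosen for this illustration; the sketch assumes each group should be recorded at most once.

```java
import java.util.ArrayList;
import java.util.List;

class MatchIndicesDemo {
    // Returns the index of every group that contains at least one string of the pair.
    static List<Integer> findMatchIndices(List<List<String>> groups, List<String> pair) {
        List<Integer> matchIndices = new ArrayList<>();
        for (int i = 0; i < groups.size(); i++) {
            for (String s : pair) {
                if (groups.get(i).contains(s)) {
                    matchIndices.add(i);
                    break; // one hit is enough; do not record the same group twice
                }
            }
        }
        return matchIndices;
    }

    public static void main(String[] args) {
        List<List<String>> newDictionary = List.of(List.of("I", "A", "O"), List.of("P", "D"));
        System.out.println(findMatchIndices(newDictionary, List.of("A", "E"))); // prints [0]
        System.out.println(findMatchIndices(newDictionary, List.of("O", "P"))); // prints [0, 1]
        System.out.println(findMatchIndices(newDictionary, List.of("X", "Y"))); // prints []
    }
}
```

The three calls correspond to the three cases the method distinguishes: one matching group, several matching groups, and no match at all.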

Main method

    public static void main(String[] args) {
        List<List<String>> dictionary = new ArrayList<>(List.of(
                List.of("A", "B"),
                List.of("B", "C"),
                List.of("D", "E"),
                List.of("F", "C"),
                List.of("G", "H"),
                List.of("T", "D")));
        
        System.out.println(merge(dictionary));
    }

Why do we need step 4?

In your specific example we never have to merge multiple groups.
But with input data like this:

List<List<String>> dictionary = new ArrayList<>(List.of(
                List.of("A", "B"),
                List.of("B", "C"),
                List.of("D", "E"),
                List.of("F", "E"),
                List.of("E", "A")));

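we eventually reach the state newDictionary=[[A, B, B, C], [D, E, F, E]] and then have to insert [E, A]. At that point both groups in newDictionary have to be merged together.
This produces the output [[A, B, C, D, E, F]], in which both groups have been merged and the duplicates removed.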
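
The multi-group merge can also be looked at in isolation. The following is a small sketch under stated assumptions, not the answer's exact code: GroupMergeDemo, mergeGroups and distinct are hypothetical names, and the sketch removes groups from the highest index downwards so that no index adjustment is needed.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

class GroupMergeDemo {
    // Merges the groups at the given ascending indices into the first of them.
    static void mergeGroups(List<List<String>> groups, List<Integer> ascendingIndices) {
        List<String> target = groups.get(ascendingIndices.get(0));
        // Walk from the highest index down so removals do not shift the indices still pending.
        for (int k = ascendingIndices.size() - 1; k >= 1; k--) {
            int index = ascendingIndices.get(k);
            target.addAll(groups.remove(index)); // remove(int) returns the removed group
        }
    }

    // Drops duplicates while keeping first-seen order.
    static List<String> distinct(List<String> group) {
        return new ArrayList<>(new LinkedHashSet<>(group));
    }

    public static void main(String[] args) {
        List<List<String>> newDictionary = new ArrayList<>();
        newDictionary.add(new ArrayList<>(List.of("A", "B", "B", "C")));
        newDictionary.add(new ArrayList<>(List.of("D", "E", "F", "E")));

        // Inserting [E, A] matches both groups, so the groups at indices 0 and 1 are merged.
        mergeGroups(newDictionary, List.of(0, 1));
        newDictionary.set(0, distinct(newDictionary.get(0)));
        System.out.println(newDictionary); // prints [[A, B, C, D, E, F]]
    }
}
```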

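P.S.

I'm not really happy with this solution, because it isn't very clear what is actually going on, but I'm posting it anyway because you said you'd be happy with any solution. :)

uj5u.com user reply: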

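For this problem you need to connect A with B (and vice versa), then C with B, and C with F, and so on. Lots of intersections and lots of loops.

This is a perfect case for using a graph as the data structure.

To be precise, this dataset can be represented as an undirected, disconnected graph (which may contain cycles) made up of several connected components. In graph-theory terms, the task boils down to discovering all the components of the graph.

To do that, we need to take the following steps:

1. Create and initialize the graph by parsing the input data.
2. Iterate over the collection of vertices and traverse every connected component that is encountered. Each vertex that has not been seen before indicates that the component it belongs to has not been discovered yet. As the traversal algorithm I chose depth-first search (but for the purposes of this task, breadth-first search would work just as well).

Implementation: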

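import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Graph {
    private Map<String, Vertex> vertexByName = new HashMap<>();
    
    private Graph() {} // no way and no need to invoke this constructor outside the class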
    
    public static Graph getInstance(List<List<String>> dictionary) { // method responsible for instantiation and initialization of the graph
        Graph graph = new Graph();
        graph.init(dictionary);
        return graph;
    }
    
    private void init(List<List<String>> dictionary) {
        for (List<String> list: dictionary) {
            for (String name: list) {
                addVertex(name, list);
            }
        }
    }
    
    private void addVertex(String name, List<String> neighbours) {
        
        Vertex cur = vertexByName.computeIfAbsent(name, Vertex::new);
    
        for (String neighbourName: neighbours) {
            if (neighbourName.equals(name)) continue;
        
            Vertex neighbour = vertexByName.computeIfAbsent(neighbourName, Vertex::new);
            cur.addNeighbour(neighbour);
            neighbour.addNeighbour(cur); // this graph is undirected, i.e. both vertices in each connected pair should hold a reference to one another
        }
    }
    
    public List<List<String>> getComponents() {
        List<List<String>> components = new ArrayList<>();
        
        Set<Vertex> seen = new HashSet<>();
        for (Vertex vertex: vertexByName.values()) {
            if (seen.contains(vertex)) continue;
            
            components.add(getComponentNames(vertex, seen));
        }
        return components;
    }

    // Depth first search implementation
    private List<String> getComponentNames(Vertex vertex, Set<Vertex> seen) {
        Deque<Vertex> stack = new ArrayDeque<>();
        List<String> names = new ArrayList<>();
        stack.push(vertex);
        seen.add(vertex);
        
        while(!stack.isEmpty()) {
            Vertex current = stack.pop();
            names.add(current.getName());
            
            for (Vertex neighbour: current.getNeighbours()) {
                if (seen.contains(neighbour)) continue;
                
                seen.add(neighbour);
                stack.push(neighbour);
            }
        }
        return names;
    }
    
    private class Vertex {
        private String name;
        private Set<Vertex> neighbours = new HashSet<>();
        
        public Vertex(String name) {
            this.name = name;
        }
    
        public boolean addNeighbour(Vertex neighbour) {
            return neighbours.add(neighbour);
        }
    
        public String getName() {
            return name;
        }
    
        public Set<Vertex> getNeighbours() {
            return neighbours;
        }
    }
}

main() - demo

public static void main(String[] args) {
    List<List<String>> dictionary =
        List.of(List.of("A","B"), List.of("B","C"),
                List.of("D","E"), List.of("F","C"),
                List.of("G","H"), List.of("T","D"));
    
    Graph graph = Graph.getInstance(dictionary);
    List<List<String>> componentNames = graph.getComponents();

    System.out.println(componentNames);
}

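Output (the order within and between components may differ, because HashMap and HashSet do not guarantee iteration order)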

[[A, B, C, F], [T, D, E], [G, H]]
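
Since the answer notes that breadth-first search would work equally well, here is a compact sketch of that variant. It is an illustration under assumptions rather than a rewrite of the Graph class: the name BfsComponents is hypothetical, the graph is kept as a plain adjacency map instead of Vertex objects, and LinkedHashMap/LinkedHashSet are used only to make the output order deterministic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class BfsComponents {
    static List<List<String>> components(List<List<String>> dictionary) {
        // Adjacency map: every word points to the words it shares an input list with.
        Map<String, Set<String>> adjacency = new LinkedHashMap<>();
        for (List<String> words : dictionary) {
            for (String a : words) {
                Set<String> neighbours = adjacency.computeIfAbsent(a, k -> new LinkedHashSet<>());
                for (String b : words) {
                    if (!a.equals(b)) neighbours.add(b);
                }
            }
        }

        List<List<String>> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String start : adjacency.keySet()) {
            if (!seen.add(start)) continue; // already part of an earlier component

            List<String> component = new ArrayList<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String current = queue.poll(); // FIFO order makes the traversal breadth-first
                component.add(current);
                for (String next : adjacency.get(current)) {
                    if (seen.add(next)) queue.add(next);
                }
            }
            result.add(component);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> dictionary = List.of(
                List.of("A", "B"), List.of("B", "C"), List.of("D", "E"),
                List.of("F", "C"), List.of("G", "H"), List.of("T", "D"));
        System.out.println(components(dictionary)); // prints [[A, B, C, F], [D, E, T], [G, H]]
    }
}
```

The only structural difference from the depth-first version is the queue: taking vertices from the front instead of the top of a stack changes the visiting order, not the set of components found.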

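uj5u.com user reply:

An unpolished first idea: index the arrays, then merge them. Repeat.

1. Iterate over the arrays in the ArrayList;
2. Index the items in the arrays;
3. Merge the arrays that overlap;
4. Repeat the same procedure on the merged result until there are no overlaps.

Using your example: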

[A, B] (Call this #1 array)
[B, C] (Call this #2 array)
[D, E] (Call this #3 array)
[F, C] (Call this #4 array)
[G, H] (Call this #5 array)
[T, D] (Call this #6 array)

Now, prepare an index, like:

A -> 1   (because A occurs only in array 1)
B -> 1,2 (because B occurs in array 1 and 2)
C -> 2,4 ...
D -> 3,6 ...
E -> 3   ...
F -> 4   ...
G -> 5   ...
H -> 5   ...
T -> 6   ...

Looking at the index above, we know that we should merge 1 and 2, 2 and 4, and 3 and 6. That gives us:

[A, B, C] (This is our new #1)
[B, C, F] (This is our new #2)
[D, E, T] (This is our new #3)
[G, H]    (This is our new #4)

Repeat the steps for the new ArrayList of arrays. Re-indexing gives...

A -> 1
B -> 1,2
C -> 1,2
D -> 3
E -> 3
F -> 2
G -> 4
H -> 4
T -> 3

Merge the overlaps again. This time only 1 and 2 overlap. Merging them gives you:

[A, B, C, F] (This is our new #1)
[D, E, T]    (This is our new #2)
[G, H]       (This is our new #3)

Once again, re-index:

A -> 1
B -> 1
C -> 1
D -> 2
E -> 2
F -> 1
G -> 3
H -> 3
T -> 2

Since no arrays overlap this time, there is nothing left to merge, and this is the final answer.
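
The answer gives no code, so the following is one possible sketch of the index-and-merge rounds, not the author's implementation. IndexAndMerge is a hypothetical name; the sketch assumes sets instead of arrays for the groups, and it relies on an outer set to drop identical groups that are produced twice in the same round (as happens with B and C in the second round of the walkthrough).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class IndexAndMerge {
    static List<Set<String>> merge(List<String[]> dictionary) {
        // One group per input array; the outer set silently drops identical groups.
        Set<Set<String>> groups = new LinkedHashSet<>();
        for (String[] words : dictionary) {
            groups.add(new LinkedHashSet<>(Arrays.asList(words)));
        }

        while (true) {
            List<Set<String>> current = new ArrayList<>(groups);

            // Index: word -> positions of the groups that contain it.
            Map<String, List<Integer>> index = new LinkedHashMap<>();
            for (int i = 0; i < current.size(); i++) {
                for (String word : current.get(i)) {
                    index.computeIfAbsent(word, k -> new ArrayList<>()).add(i);
                }
            }

            // Every word found in two or more groups yields the union of those groups.
            Set<Set<String>> next = new LinkedHashSet<>();
            boolean[] overlapped = new boolean[current.size()];
            boolean anyOverlap = false;
            for (List<Integer> positions : index.values()) {
                if (positions.size() < 2) continue;
                anyOverlap = true;
                Set<String> merged = new LinkedHashSet<>();
                for (int p : positions) {
                    merged.addAll(current.get(p));
                    overlapped[p] = true;
                }
                next.add(merged);
            }
            if (!anyOverlap) return current; // no word is shared any more: done

            // Groups that overlapped with nothing are carried over unchanged.
            for (int i = 0; i < current.size(); i++) {
                if (!overlapped[i]) next.add(current.get(i));
            }
            groups = next;
        }
    }

    public static void main(String[] args) {
        List<String[]> dictionary = List.of(
                new String[]{"A", "B"}, new String[]{"B", "C"}, new String[]{"D", "E"},
                new String[]{"F", "C"}, new String[]{"G", "H"}, new String[]{"T", "D"});
        System.out.println(merge(dictionary)); // prints [[A, B, C, F], [D, E, T], [G, H]]
    }
}
```

On the example input the rounds match the walkthrough: the first round produces [A, B, C], [B, C, F], [D, E, T] and [G, H], the second collapses the first two, and the third finds no shared word and returns.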

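uj5u.com user reply:

Adding a solution that uses union-find. The goal here is to iterate over all the strings while finding their common "leader".

After that we iterate over the dictionary again, but this time every string already has a leader, so we group each string under its common leader and then build the merged dictionary.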

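import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UnionFind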
{
    private Map<String, String> graph;

    public UnionFind()
    {
        graph = new HashMap<>();
    }

    public String find(String str)
    {
        if (str == null) throw new IllegalArgumentException("Invalid String");

        if (graph.getOrDefault(str, str).equals(str))
            graph.put(str, str);
        else 
            graph.put(str, find(graph.get(str)));

        return graph.get(str);
    }

    public void union(String str1, String str2)
    {
        String root1 = find(str1);
        String root2 = find(str2);


        if (!root1.equals(root2))
        {
            if (root1.equals(str1)) graph.put(graph.get(root1), graph.get(root2));
            else graph.put(graph.get(root2), graph.get(root1));
        }
    }

    public static void main(String[] args)
    {
        List<List<String>> dictionary = prepareDictionary();

        UnionFind unionFind = new UnionFind();

        for (List<String> list : dictionary)
        {
            for (int i = 1; i < list.size(); i++)
            {
                unionFind.union(list.get(i - 1), list.get(i));
            }
        }

        Map<String, Set<String>> map = new HashMap<>();

        for (List<String> list : dictionary)
        {
            for (String str : list)
            {
                String parent = unionFind.find(str);
                if (!map.containsKey(parent))
                    map.put(parent, new LinkedHashSet<>());

                map.get(parent).add(str);
            }
        }

        List<List<String>> result = new ArrayList<>();
        for(Map.Entry<String, Set<String>> entry : map.entrySet())
        {
            result.add(new ArrayList<>(entry.getValue()));
        }

        System.out.println(result);
    }

    private static List<List<String>> prepareDictionary()
    {
        List<List<String>> dictionary = new ArrayList<>();

        dictionary.add(Arrays.asList("A", "B"));
        dictionary.add(Arrays.asList("B", "C"));
        dictionary.add(Arrays.asList("D", "E"));
        dictionary.add(Arrays.asList("F", "C"));
        dictionary.add(Arrays.asList("G", "H"));
        dictionary.add(Arrays.asList("T", "D"));

        return dictionary;
    }
}

Result:

[[A, B, C, F], [D, E, T], [G, H]]

uj5u.com user reply:

Here is another solution, which uses sets of strings for your structure:

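// Requires: import java.util.*; import java.util.stream.Collectors;
public void merge(List<String[]> dictionary) {
    List<Set<String>> dictionaryList = dictionary.stream()
            .<Set<String>>map(x -> new HashSet<>(Arrays.asList(x)))
            .collect(Collectors.toList());

    for (int i = 0; i < dictionaryList.size(); i++) {
        Set<String> list = dictionaryList.get(i);

        for (int j = i + 1; j < dictionaryList.size(); j++) {
            Set<String> otherList = dictionaryList.get(j);
            // Strings that occur in both sets.
            Set<String> result = list.stream().filter(otherList::contains).collect(Collectors.toSet());

            if (!result.isEmpty()) {
                list.addAll(otherList);
                dictionaryList.remove(j);
                // The set grew and the list shifted, so rescan from i + 1;
                // otherwise sets that only overlap with the newly added strings are missed.
                j = i;
            }
        }
    }
    System.out.println(dictionaryList);
}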

Result

[[A, B, C, F], [D, T, E], [G, H]]

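uj5u.com user reply:

All the answers are great, thank you! But I can't use them, because I may have to explain the code and I don't fully understand them (first time coding in Java, yaaay!). What I ended up doing: I saved all the input into a list of HashSets, because a HashSet can't contain duplicate values, and then ran all the input through 2 nested for loops with the if statement

if (!Collections.disjoint(hashset1, hashset2))

After that I merge them using set.addAll(hashset1); set.addAll(hashset2);

But the result was still incomplete: some sets that should have been merged were not. So I ran it through the 2 nested for loops with the same if statement a second time, and it worked (I hope). It works for an input of 2000 words, and I hope it works for larger inputs too :D Thanks everyone for the help.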
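
Running the nested loops exactly twice is not guaranteed to be enough for every input, because each pass can create new overlaps. A hedged sketch of the same idea that simply repeats the pass until nothing changes could look like this; DisjointMerge is a hypothetical name, and LinkedHashSet is used instead of HashSet only to keep the printed order predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DisjointMerge {
    static List<Set<String>> merge(List<String[]> dictionary) {
        List<Set<String>> sets = new ArrayList<>();
        for (String[] words : dictionary) {
            sets.add(new LinkedHashSet<>(Arrays.asList(words)));
        }

        boolean merged = true;
        while (merged) { // keep making passes until one completes without any merge
            merged = false;
            for (int i = 0; i < sets.size(); i++) {
                // Scan from the end so that removing set j never skips an element.
                for (int j = sets.size() - 1; j > i; j--) {
                    if (!Collections.disjoint(sets.get(i), sets.get(j))) {
                        sets.get(i).addAll(sets.remove(j)); // fold set j into set i
                        merged = true;
                    }
                }
            }
        }
        return sets;
    }

    public static void main(String[] args) {
        List<String[]> dictionary = List.of(
                new String[]{"A", "B"}, new String[]{"B", "C"}, new String[]{"D", "E"},
                new String[]{"F", "C"}, new String[]{"G", "H"}, new String[]{"T", "D"});
        System.out.println(merge(dictionary)); // prints [[A, B, C, F], [D, E, T], [G, H]]
    }
}
```

On the example input the first pass leaves [F, C] unmerged, because [A, B] is compared with it before [A, B] has absorbed [B, C]; the second pass picks it up, and the third pass confirms that nothing is left to merge.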
