我有一個如下所示的資料集 -
List((X,Set(" 1", " 7")), (Z,Set(" 5")), (D,Set(" 2")), (E,Set(" 8")), ("F ",Set(" 5", " 9", " 108")), (G,Set(" 2", " 11")), (A,Set(" 7", " 5")), (M,Set(108)))
這里 X 與 A 有關,因為它們之間共有 7 個
Z 與 A 有關,因為它們之間共有 5 個
F 與 A 有關,因為它們之間共有 5 個
M與F有關,因為它們之間共有108個
所以,X、Z、A、F 和 M 是相關的
D 和 G 是相關的,因為 2 是它們之間的共同點
E 與任何人無關
因此,輸出將是 ((X, Z, A, F, M), (D,G), (E))
順序在這里并不重要。
我在這里使用了 Scala,但 Scala/Python 或偽代碼中的解決方案對我有用。
uj5u.com熱心網友回復:
構建一個無向圖,其中每個標簽都連接到相應集合中的每個數字(即(A, { 1, 2 })會給出兩條邊:A <-> 1和A <-> 2)
計算連通分量(例如,使用深度優先搜索)。
僅從連接的組件中過濾掉標簽。
import util.{Left, Right, Either}
import collection.mutable
def connectedComponentsOfAsc[F, V](faces: List[(F, Set[V])]): List[List[F]] = {
type Node = Either[F, V]
val graphBuilder = mutable.HashMap.empty[Node, mutable.HashSet[Node]]
def addEdge(a: Node, b: Node): Unit =
graphBuilder.getOrElseUpdate(a, mutable.HashSet.empty[Node]) = b
for
(faceLabel, vertices) <- faces
vertex <- vertices
do
val faceNode = Left(faceLabel)
val vertexNode = Right(vertex)
addEdge(faceNode, vertexNode)
addEdge(vertexNode, faceNode)
val graph = graphBuilder.view.mapValues(_.toSet).toMap
val ccs = connectedComponents(graph)
ccs.map(_.collect { case Left(faceLabel) => faceLabel }.toList)
}
def connectedComponents[V](undirectedGraph: Map[V, Set[V]]): List[Set[V]] = {
val visited = mutable.HashSet.empty[V]
var connectedComponent = mutable.HashSet.empty[V]
val components = mutable.ListBuffer.empty[Set[V]]
def dfs(curr: V): Unit = {
if !visited(curr) then
visited = curr
connectedComponent = curr
undirectedGraph(curr).foreach(dfs)
}
for v <- undirectedGraph.keys do
if !visited(v) then
connectedComponent = mutable.HashSet.empty[V]
dfs(v)
components = connectedComponent.toSet
components.toList
}
可以這樣使用:
@main def main(): Unit = {
println(connectedComponentsOfAsc(
List(
("X",Set("1", "7")),
("Z",Set("5")),
("D",Set("2")),
("E",Set("8")),
("F",Set("5", "9", "108")),
("G",Set("2", "11")),
("A",Set("7", "5")),
("M",Set("108"))
)
).map(_.sorted).sortBy(_.toString))
}
產生:
List(List(A, F, M, X, Z), List(D, G), List(E))
所有步驟都是O(n)(與輸入大小成線性比例)。
這個答案是獨立的,但在這里使用某種圖形庫顯然是有利的。
uj5u.com熱心網友回復:
// 我把一些值放在引號中,所以我們有一致的字串輸入
val initialData :List[(String, Set[String])] = List(
("X",Set(" 1", " 7")),
("Z",Set(" 5")),
("D",Set(" 2")),
("E",Set(" 8")),
("F ",Set(" 5", " 9", " 108")),
("G",Set(" 2", " 11")),
("A",Set(" 7", " 5")),
("M",Set("108"))
)
// 通過將集合內的字串資料轉換為 Ints 來清理集合。
val cleanedData = initialData.map(elem => (elem._1, elem._2.map(_.trim.toInt)))
> cleanedData: List[(String, scala.collection.immutable.Set[Int])] = List((X,Set(1, 7)), (Z,Set(5)), (D,Set(2)), (E,Set(8)), ("F ",Set(5, 9, 108)), (G,Set(2, 11)), (A,Set(7, 5)), (M,Set(108)))
// 將集合分解成一個簡單映射串列。X -> 1, X -> 7 單獨。
val explodedList = cleanedData.flatMap(x => x._2.map(v => (x._1, v)))
> explodedList: List[(String, Int)] = List((X,1), (X,7), (Z,5), (D,2), (E,8), ("F ",5), ("F ",9), ("F ",108), (G,2), (G,11), (A,7), (A,5), (M,108))
按新鍵將它們組合在一起
val mappings = explodedList.groupBy(_._2)
> mappings: scala.collection.immutable.Map[Int,List[(String, Int)]] = Map(5 -> List((Z,5), ("F ",5), (A,5)), 1 -> List((X,1)), 9 -> List(("F ",9)), 2 -> List((D,2), (G,2)), 7 -> List((X,7), (A,7)), 108 -> List(("F ",108), (M,108)), 11 -> List((G,11)), 8 -> List((E,8)))
列印輸出
mappings.foreach { case (key, items) =>
println(s"${items.map(_._1).mkString(",")} are all related because of $key")
}
> Z,F ,A are all related because of 5
> X are all related because of 1
> F are all related because of 9
> D,G are all related because of 2
> X,A are all related because of 7
> F ,M are all related because of 108
> G are all related because of 11
> E are all related because of 8
uj5u.com熱心網友回復:
- 讀取輸入,創建對向量
例如
X 1
X 7
Z 5
...
- 按對的第二個成員的順序對向量進行排序
例如
X 1
D 2
G 2
...
- 遍歷已排序的向量,只要第二個成員不變,就添加到“pass1 組”。如果確實發生了變化,請啟動一個新的 pass1 組。
例如
X
D G
Z F A
X A
E
F
G
- 將 pass1 組與公共成員合并以提供輸出組。
這是實作此功能的 C 代碼
#include <string>
#include <iostream>
#include <vector>
#include <algorithm>
bool merge(
std::vector<char> &res,
std::vector<char> &vg)
{
bool ret = false;
for (char r : res)
{
for (char c : vg)
{
if (c == r)
ret = true;
}
}
if (!ret)
return false;
for (char c : vg)
{
if (std::find(res.begin(), res.end(), c) == res.end())
res.push_back(c);
}
return true;
}
void add(
std::vector<std::vector<char>> &result,
std::vector<char> &vg)
{
std::vector<char> row;
for (char c : vg)
row.push_back(c);
result.push_back(row);
}
main()
{
std::string input = "List((X,Set(\" 1\", \" 7\")), (Z,Set(\" 5\")), (D,Set(\" 2\")), (E,Set(\" 8\")), (F,Set(\" 5\", \" 9\", \" 108\")), (G,Set(\" 2\", \" 11\")), (A,Set(\" 7\", \" 5\")), (M,Set(108)))";
std::vector<std::pair<char, int>> vinp;
int p = input.find("Set");
int q = input.find("Set", p 1);
while (p != -1)
{
char c = input[p - 2];
int s = input.find_first_of("0123456789", p);
while (s < q)
{
vinp.push_back(std::make_pair(
c,
atoi(input.substr(s).c_str())));
s = input.find_first_of("0123456789", s 3);
}
p = q;
q = input.find("Set", p 1);
}
std::sort(vinp.begin(), vinp.end(),
[](std::pair<char, int> a, std::pair<char, int> b)
{
return a.second < b.second;
});
std::cout << "sorted\n";
for (auto &p : vinp)
std::cout << p.first << " " << p.second << "\n";
std::vector<std::vector<char>> vpass1;
std::vector<char> row;
int sec = -1;
for (auto &p : vinp)
{
if (p.second != sec)
{
// new group
if (row.size())
vpass1.push_back(row);
sec = p.second;
row.clear();
}
row.push_back(p.first);
}
std::cout << "\npass1\n";
for (auto &row : vpass1)
{
for (char c : row)
std::cout << c << " ";
std::cout << "\n";
}
std::vector<std::vector<char>> result;
std::vector<char> pass2group;
bool fmerge2 = true;
while (fmerge2)
{
fmerge2 = false;
for (auto &vg : vpass1)
{
if (!result.size())
add(result, vg);
else
{
bool fmerge1 = false;
for (auto &res : result)
{
if (merge(res, vg))
{
fmerge1 = true;
fmerge2 = true;
break;
}
}
if (!fmerge1)
add(result, vg);
}
}
if (fmerge2)
{
vpass1 = result;
result.clear();
}
}
std::cout << "\n(";
for (auto &res : result)
{
if (res.size())
{
std::cout << "(";
for (char c : res)
std::cout << c << " ";
std::cout << ")";
}
}
std::cout << ")\n";
return 0;
}
它產生正確的結果
sorted
X 1
D 2
G 2
Z 5
F 5
A 5
X 7
A 7
E 8
F 9
G 11
F 108
pass1
X
D G
Z F A
X A
E
F
G
((X A Z F )(D G )(E ))
轉載請註明出處,本文鏈接:https://www.uj5u.com/qukuanlian/529165.html
下一篇:在特定字數處中斷字串
