有一個任務。該陣列包含任意字串。我們需要計算每個字串在陣列中出現的次數。在一個執行緒和多執行緒中解決任務,比較執行時間。
出于某種原因,單執行緒版本比多執行緒版本運行得更快:90 毫秒對 300 毫秒。如何修復多執行緒版本,使其運行速度比單執行緒版本快?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
namespace ParallelDictionary
{
class Program
{
static void Main(string[] args)
{
List<string> strs = new List<string>();
for (int i=0; i<1000000; i )
{
strs.Add("qqq");
}
for (int i=0;i< 5000; i )
{
strs.Add("aaa");
}
F(strs);
ParallelF(strs);
}
private static void F(List<string> strs)
{
Dictionary<string, int> freqs = new Dictionary<string, int>();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i=0; i<strs.Count; i )
{
if (!freqs.ContainsKey(strs[i]))
freqs[strs[i]] = 1;
else
freqs[strs[i]] ;
}
stopwatch.Stop();
Console.WriteLine("single-threaded {0} ms", stopwatch.ElapsedMilliseconds);
foreach (var kvp in freqs)
{
Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
}
}
private static void ParallelF(List<string> strs)
{
ConcurrentDictionary<string, int> freqs = new ConcurrentDictionary<string, int>();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Parallel.ForEach(strs, str =>
{
freqs.AddOrUpdate(str, 1, (key, value) => value 1);
});
stopwatch.Stop();
Console.WriteLine("multi-threaded {0} ms", stopwatch.ElapsedMilliseconds);
foreach (var kvp in freqs)
{
Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
}
}
}
}
uj5u.com熱心網友回復:
通過使用磁區器將資料拆分為您單獨處理的塊,可以使多執行緒版本比單執行緒版本快一點。
然后每個塊可以被處理成一個單獨的非并發字典,而不需要任何鎖定。最后,在每個范圍的末尾,您可以更新一個非并發結果字典(您必須鎖定它)。
像這樣的東西:
private static void ParallelF(List<string> strs)
{
Dictionary<string, int> result = new Dictionary<string, int>();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
object locker = new object();
Parallel.ForEach(Partitioner.Create(0, strs.Count), range =>
{
var freqs = new Dictionary<string, int>();
for (int i = range.Item1; i < range.Item2; i)
{
if (!freqs.ContainsKey(strs[i]))
freqs[strs[i]] = 1;
else
freqs[strs[i]] ;
}
lock (locker)
{
foreach (var kvp in freqs)
{
if (!result.ContainsKey(kvp.Key))
{
result[kvp.Key] = kvp.Value;
}
else
{
result[kvp.Key] = kvp.Value;
}
}
}
});
stopwatch.Stop();
Console.WriteLine("multi-threaded {0} ms", stopwatch.ElapsedMilliseconds);
foreach (var kvp in result)
{
Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
}
}
在我的系統上給出以下結果(對于發布版本,.NET 6):
single-threaded 50 ms
qqq 1000000
aaa 5000
multi-threaded 26 ms
qqq 1000000
aaa 5000
它只是快一點......如果這值得你決定。
uj5u.com熱心網友回復:
這是另一種方法,它與基于Matthew Watson的Partitioner解決方案有相似之處,但使用了不同的 API。它使用高級Parallel.ForEach多載,其簽名如下所示:
/// <summary>
/// Executes a foreach (For Each in Visual Basic) operation with thread-local data
/// on an System.Collections.IEnumerable in which iterations may run in parallel,
/// loop options can be configured, and the state of the loop can be monitored and
/// manipulated.
/// </summary>
public static ParallelLoopResult ForEach<TSource, TLocal>(
IEnumerable<TSource> source,
ParallelOptions parallelOptions,
Func<TLocal> localInit,
Func<TSource, ParallelLoopState, TLocal, TLocal> body,
Action<TLocal> localFinally);
TLocal使用本地字典,其中包含由單個作業執行緒計算的部分結果:
static Dictionary<string, int> GetFrequencies(List<string> source)
{
Dictionary<string, int> frequencies = new();
ParallelOptions options = new()
{
MaxDegreeOfParallelism = Environment.ProcessorCount
};
Parallel.ForEach(source, options, () => new Dictionary<string, int>(),
(item, state, partialFrequencies) =>
{
ref int occurences = ref CollectionsMarshal.GetValueRefOrAddDefault(
partialFrequencies, item, out bool exists);
occurences ;
return partialFrequencies;
}, partialFrequencies =>
{
lock (frequencies)
{
foreach ((string item, int partialOccurences) in partialFrequencies)
{
ref int occurences = ref CollectionsMarshal.GetValueRefOrAddDefault(
frequencies, item, out bool exists);
occurences = partialOccurences;
}
}
});
return frequencies;
}
上面的代碼還演示了低級CollectionsMarshal.GetValueRefOrAddDefaultAPI 的使用,它允許使用鍵的單個散列來搜索和更新字典。
我沒有測量它(也沒有測驗它),但我希望它比 Matthew Watson 的解決方案慢。原因是source以同步方式列舉。如果您可以處理復雜性,則可以考慮將這兩種方法結合起來以獲得最佳性能。
轉載請註明出處,本文鏈接:https://www.uj5u.com/yidong/517042.html
下一篇:Java中的并發安全佇列
