Parallel.ForEach和ConcurrentDictionary比常規Dictionary慢-有解無憂

有一個任務。該陣列包含任意字串。我們需要計算每個字串在陣列中出現的次數。在一個執行緒和多執行緒中解決任務，比較執行時間。

出于某種原因，單執行緒版本比多執行緒版本運行得更快：90 毫秒對 300 毫秒。如何修復多執行緒版本，使其運行速度比單執行緒版本快？

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

namespace ParallelDictionary
{
    class Program
    {
        static void Main(string[] args)
        {
            List<string> strs = new List<string>();
            for (int i=0; i<1000000; i  )
            {
                strs.Add("qqq");
            }
            for (int i=0;i< 5000; i  )
            {
                strs.Add("aaa");
            }

            F(strs);
            ParallelF(strs);
        }

        private static void F(List<string> strs)
        {
            Dictionary<string, int> freqs = new Dictionary<string, int>();

            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            for (int i=0; i<strs.Count; i  )
            {
                if (!freqs.ContainsKey(strs[i]))
                    freqs[strs[i]] = 1;
                else
                    freqs[strs[i]]  ;
            }
            stopwatch.Stop();

            Console.WriteLine("single-threaded {0} ms", stopwatch.ElapsedMilliseconds);
            foreach (var kvp in freqs)
            {
                Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
            }
        }

        private static void ParallelF(List<string> strs)
        {
            ConcurrentDictionary<string, int> freqs = new ConcurrentDictionary<string, int>();

            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            Parallel.ForEach(strs, str =>
            {
                freqs.AddOrUpdate(str, 1, (key, value) => value   1);
            });
            stopwatch.Stop();

            Console.WriteLine("multi-threaded {0} ms", stopwatch.ElapsedMilliseconds);
            foreach (var kvp in freqs)
            {
                Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
            }
        }
    }
}

uj5u.com熱心網友回復：

通過使用磁區器將資料拆分為您單獨處理的塊，可以使多執行緒版本比單執行緒版本快一點。

然后每個塊可以被處理成一個單獨的非并發字典，而不需要任何鎖定。最后，在每個范圍的末尾，您可以更新一個非并發結果字典（您必須鎖定它）。

像這樣的東西：

private static void ParallelF(List<string> strs)
{
    Dictionary<string, int> result = new Dictionary<string, int>();

    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();

    object locker = new object();

    Parallel.ForEach(Partitioner.Create(0, strs.Count), range => 
    {
        var freqs = new Dictionary<string, int>();

        for (int i = range.Item1; i < range.Item2;   i)
        {
            if (!freqs.ContainsKey(strs[i]))
                freqs[strs[i]] = 1;
            else
                freqs[strs[i]]  ;
        }

        lock (locker)
        { 
            foreach (var kvp in freqs)
            {
                if (!result.ContainsKey(kvp.Key))
                {
                    result[kvp.Key] = kvp.Value;
                }
                else
                {
                    result[kvp.Key]  = kvp.Value;
                }
            }
        }
    });

    stopwatch.Stop();

    Console.WriteLine("multi-threaded {0} ms", stopwatch.ElapsedMilliseconds);
    foreach (var kvp in result)
    {
        Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
    }
}

在我的系統上給出以下結果（對于發布版本，.NET 6）：

single-threaded 50 ms
qqq 1000000
aaa 5000
multi-threaded 26 ms
qqq 1000000
aaa 5000

它只是快一點......如果這值得你決定。

uj5u.com熱心網友回復：

這是另一種方法，它與基于Matthew Watson的Partitioner解決方案有相似之處，但使用了不同的 API。它使用高級Parallel.ForEach多載，其簽名如下所示：

/// <summary>
/// Executes a foreach (For Each in Visual Basic) operation with thread-local data
/// on an System.Collections.IEnumerable in which iterations may run in parallel,
/// loop options can be configured, and the state of the loop can be monitored and
/// manipulated.
/// </summary>
public static ParallelLoopResult ForEach<TSource, TLocal>(
    IEnumerable<TSource> source,
    ParallelOptions parallelOptions,
    Func<TLocal> localInit,
    Func<TSource, ParallelLoopState, TLocal, TLocal> body,
    Action<TLocal> localFinally);

TLocal使用本地字典，其中包含由單個作業執行緒計算的部分結果：

static Dictionary<string, int> GetFrequencies(List<string> source)
{
    Dictionary<string, int> frequencies = new();
    ParallelOptions options = new()
    {
        MaxDegreeOfParallelism = Environment.ProcessorCount
    };
    Parallel.ForEach(source, options, () => new Dictionary<string, int>(),
        (item, state, partialFrequencies) =>
    {
        ref int occurences = ref CollectionsMarshal.GetValueRefOrAddDefault(
            partialFrequencies, item, out bool exists);
        occurences  ;
        return partialFrequencies;
    }, partialFrequencies =>
    {
        lock (frequencies)
        {
            foreach ((string item, int partialOccurences) in partialFrequencies)
            {
                ref int occurences = ref CollectionsMarshal.GetValueRefOrAddDefault(
                    frequencies, item, out bool exists);
                occurences  = partialOccurences;
            }
        }
    });
    return frequencies;
}

上面的代碼還演示了低級CollectionsMarshal.GetValueRefOrAddDefaultAPI 的使用，它允許使用鍵的單個散列來搜索和更新字典。

我沒有測量它（也沒有測驗它），但我希望它比 Matthew Watson 的解決方案慢。原因是source以同步方式列舉。如果您可以處理復雜性，則可以考慮將這兩種方法結合起來以獲得最佳性能。

轉載請註明出處，本文鏈接：https://www.uj5u.com/yidong/517042.html

標籤：C＃多线程并行.foreach并发字典

上一篇：正確使用同步塊、notify()和wait()

下一篇：Java中的并發安全佇列