Click here to Skip to main content
15,889,651 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
In the below below code UE is dictionary and bg is the list.
I am using PLINQ to get the values from the dictionary and returning the list.
Is the below code is thread safe?

TIA

What I have tried:

t = UE.AsParallel().Where(x => bg.Contains(x.Key, StringComparer.OrdinalIgnoreCase)).Select(y => y.Value).Distinct().ToList();
Posted
Updated 7-Mar-16 14:20pm

As long as you're only reading and not modifying values anything is thread safe.
 
Share this answer
 
Sascha is correct about thread safety.
However, if you are using the .AsParallel() to improve performance, you're going about it incorrectly!

Since bg is a List<string>, the .Contains() is a very poor way to check for membership in the list. It is O(n) (where n is bg.Length). So the overall query is O(n*m) (where m is UE.Count).
Instead, you should make a HashSet<string> from bg and check for membership in the set. That is O(1), making the overall query directly proportional to UE.Count.

Using the HashSet is about 100 times faster (on my system) than the parallelized scan of bg.
Parallelizing the HashSet version actually slows it down, presumably due to the parallelization setup and context switching and coordination overhead.
Here's my testing code:
C#
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace ConsoleApplication10
{
  class Program
  {
    static void Main(string[] args)
    {
      Random rand = new Random();
      Stopwatch swp = new Stopwatch();
      Stopwatch swh = new Stopwatch();
      Stopwatch swph = new Stopwatch();
      bool tail = false;
      const int count = 200;
      for (int i = 0; i <= count; i++)
      {
        // Populate the collections
        Dictionary<string, int> UE = new Dictionary<string, int>();
        List<string> bg = new List<string>();
        int roff = rand.Next(1000);
        foreach (var item in Enumerable.Range(roff, 10000))
        {
          string key = item.ToString("D");
          UE.Add(key, item % (roff + 500));
          if (item % 10 == 0)
            bg.Add(key);
        }

        // skip timing the first time through so we're sure the JITTER isn't included.
        if (tail)
          swp.Start();
        List<int> tp = UE.AsParallel().Where(x => bg.Contains(x.Key, StringComparer.OrdinalIgnoreCase)).Select(y => y.Value).Distinct().ToList();
        if (tail)
          swp.Stop();

        if (tail)
          swh.Start();
        HashSet<string> hs = new HashSet<string>(bg, StringComparer.OrdinalIgnoreCase);
        List<int> th = UE.Where(x => hs.Contains(x.Key)).Select(y => y.Value).Distinct().ToList();
        if (tail)
          swh.Stop();

        if (tail)
          swph.Start();
        HashSet<string> hps = new HashSet<string>(bg, StringComparer.OrdinalIgnoreCase);
        List<int> tph = UE.AsParallel().Where(x => hps.Contains(x.Key)).Select(y => y.Value).Distinct().ToList();
        if (tail)
          swph.Stop();

        tail = true;

        // so GC of per-timing-iteration garbage isn't included in timing
        UE = null;
        bg = null;
        hs = null;
        hps = null;
        GC.Collect();
      }

      Console.WriteLine("Parallel processing: {0:F2} ticks (avg)", swp.ElapsedTicks / (double)count);
      Console.WriteLine("HashSet processing: {0:F2} ticks (avg)", swh.ElapsedTicks / (double)count);
      Console.WriteLine("Parallel & HashSet processing: {0:F2} ticks (avg)", swph.ElapsedTicks / (double)count);
      Console.WriteLine("Return to exit");
      Console.ReadLine();
    }
  }
}
 
Share this answer
 
Comments
MYQueries1 8-Mar-16 0:24am    
How about where clause on the List and looking in the dict. is this thread safe ? and Performance is better without AsParallel
bg.AsParallel().Where(x => !UE.ContainsKey(x)).Distinct().ToList();
Matt T Heffron 8-Mar-16 12:47pm    
As Sascha said initially, these operations are thread-safe if nothing is modifying either collection. Otherwise, there is no thread-safe solution using these data structures other than putting locks around these queries and all places where the collections are modified. And that could have serious effects (possible deadlocks!) unless done very carefully.

This new Linq statement is different in that it's looking for the strings in bg that aren't keys in UE. (And the .Distinct() probably would be better placed directly on bg, to limit the "work" done in the rest of the query.)
I don't know if it would be more efficient. Because it is getting a different result, it is an "apples vs. oranges" comparison. Try plugging it into the code I used and see.

If you describe the problem you're trying to solve, we may be able to suggest a different strategy that would work better.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900