Thursday, 18 September 2014

Lazy loading with ConcurrentDictionary for Thread-Safe Factory Methods

Developers who have started using the ConcurrentDictionary class in a multithreaded environment may have noticed some unusual behaviour with the GetOrAdd and AddOrUpdate methods. The overloads that take delegates are NOT atomic: the delegates are invoked outside the dictionary's internal locks, so when several threads access the same key at the same time the delegate can be called more than once. Only one of the resulting values is stored (for GetOrAdd, every caller gets that stored value back), but the extra calls may not be what you intended, especially if the delegate does time-consuming work (e.g. accessing a data layer).

This behaviour is by design. ConcurrentDictionary runs the delegates outside of its internal lock mechanism so that unknown, possibly slow code cannot block every other thread accessing the dictionary. The class is designed to keep access from multiple threads as quick as possible.
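
The behaviour described above can be made visible with a small sketch. It is not taken from the framework documentation; the class name FactoryRaceDemo and the use of a Barrier to hold both threads inside the delegate are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

static class FactoryRaceDemo
{
    // Returns how many times the value factory ran for one key.
    public static int CountFactoryCalls()
    {
        var cache = new ConcurrentDictionary<string, int>();
        int calls = 0;

        // Two participants: each thread waits inside the factory until the
        // other thread has entered it too (or until the timeout elapses).
        using (var bothInside = new Barrier(2))
        {
            Func<string, int> factory = key =>
            {
                int n = Interlocked.Increment(ref calls);
                bothInside.SignalAndWait(TimeSpan.FromSeconds(5));
                return n;
            };

            // Dedicated threads so both calls really do run concurrently.
            var t1 = new Thread(() => cache.GetOrAdd("key", factory));
            var t2 = new Thread(() => cache.GetOrAdd("key", factory));
            t1.Start();
            t2.Start();
            t1.Join();
            t2.Join();
        }

        return calls;
    }
}
```

Neither thread can add the key until both are inside the delegate, so FactoryRaceDemo.CountFactoryCalls() should report two invocations on a normal run, even though the dictionary ends up holding a single entry.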

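To get around this issue, ConcurrentDictionary can be combined with the Lazy class. The dictionary stores Lazy wrappers instead of the values themselves: several wrappers may be created under contention, but only one is stored, and only that one ever has its Value read, so the delegate is invoked just once.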

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        Consumer consumer = new Consumer();
        Parallel.For(0, 10, (i) => consumer.Consume("NonLazyKey"));
        Parallel.For(0, 10, (i) => consumer.LazyConsume("LazyKey"));
        Console.ReadKey();
    }
}
 
class Consumer
{
    ConcurrentDictionary<string, int> _cache = new ConcurrentDictionary<string, int>();
    ConcurrentDictionary<string, Lazy<int>> _lazyCache = new ConcurrentDictionary<string, Lazy<int>>();
 
    public int Consume(string key)
    {
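        //pass a delegate (not a pre-computed value) so GetValue only runs when the key appears to be missing
        return _cache.GetOrAdd(key, k => this.GetValue(k));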
    }
 
    public int LazyConsume(string key)
    {
        return _lazyCache.GetOrAdd(key, new Lazy<int>(() => this.GetValue(key))).Value;
    }
 
    int GetValue(string key)
    {
        Console.WriteLine("Getting Value for Key: {0}", key);
        return 1;
    }
}

In the example provided the Consumer class implements two internal caches: one declared with a string key and an integer value, the other with a string key and a Lazy<int> value. Two methods are provided to get the integer value for a given key from each cache, and both end up calling the GetValue method to get a "calculated" value (this GetValue method could be long running). The GetValue method writes to the Console every time it is called, so that you can see how many times each Consume method calls it. In this example LazyConsume only ever calls GetValue once, whereas the non-lazy Consume method can call it several times.

Using the ConcurrentDictionary and Lazy classes together is a safe design pattern for ensuring your factory value generators aren't called multiple times.
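
One way to package the pattern is as an extension method. This is only a sketch: GetOrAddLazy and LazyCacheDemo are hypothetical names, not part of the framework, and the thread-safety mode is spelled out explicitly although it is also the default for this Lazy constructor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class LazyCacheExtensions
{
    // Hypothetical helper: stores a Lazy<TValue> per key and returns its value.
    public static TValue GetOrAddLazy<TKey, TValue>(
        this ConcurrentDictionary<TKey, Lazy<TValue>> cache,
        TKey key,
        Func<TKey, TValue> valueFactory)
    {
        // Several Lazy wrappers may be created under contention, but only the
        // stored one is returned here, so only its factory ever runs.
        Lazy<TValue> lazy = cache.GetOrAdd(
            key,
            k => new Lazy<TValue>(
                () => valueFactory(k),
                LazyThreadSafetyMode.ExecutionAndPublication));

        return lazy.Value;
    }
}

static class LazyCacheDemo
{
    // Returns how many times the value factory ran across 100 parallel lookups.
    public static int CountFactoryCalls()
    {
        var cache = new ConcurrentDictionary<string, Lazy<int>>();
        int calls = 0;

        Parallel.For(0, 100, i => cache.GetOrAddLazy("LazyKey", k =>
        {
            Interlocked.Increment(ref calls);
            return 1;
        }));

        return calls;
    }
}
```

LazyCacheDemo.CountFactoryCalls() should report a single invocation. One design point to be aware of: in this mode a Lazy caches an exception thrown by its factory, so a failed call stays in the dictionary until the entry is removed.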

Sunday, 3 August 2014

Linq 2 SQL ObjectTrackingEnabled Memory Leaks

If you're working with disconnected entities in Linq 2 SQL, it is important to set the ObjectTrackingEnabled flag on the data context to FALSE to prevent memory leaks caused by subscribed events, because the flag defaults to TRUE.

Take the following data access layer snippet for retrieving a Person entity (AdventureWorks db):
public Person GetPerson()
{
    using (AdventureWorksDataContext ctx = new AdventureWorksDataContext())
    {
        //ctx.ObjectTrackingEnabled = false;
        return ctx.Persons.First();
    }
}
If the ObjectTrackingEnabled flag is NOT set to FALSE on the data context, the context's change tracker object stays alive for as long as the entity does, even after the context has been disposed. This is because the change tracker subscribes to the PropertyChanged and PropertyChanging events on the entities it materialises when ObjectTrackingEnabled is TRUE, so each entity holds a reference back to the tracker. This can become problematic when calling this method many times to retrieve numerous People within your application, because every data context instance creates its own change tracker. You therefore end up with one change tracker for every entity! However, once the entity becomes eligible for garbage collection, its change tracker can be collected too.
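
The reference chain described above can be modelled without a database. The sketch below is a deliberately simplified stand-in, not Linq 2 SQL's actual implementation: TrackedPerson, ToyChangeTracker and TrackerLifetimeDemo are hypothetical names, and only the PropertyChanging event is modelled.

```csharp
using System;
using System.ComponentModel;

// Hypothetical entity that raises PropertyChanging, as generated entities do.
class TrackedPerson : INotifyPropertyChanging
{
    public event PropertyChangingEventHandler PropertyChanging;

    string _name;

    public string Name
    {
        get { return _name; }
        set
        {
            var handler = PropertyChanging;
            if (handler != null)
                handler(this, new PropertyChangingEventArgs("Name"));
            _name = value;
        }
    }

    // Exposes the first subscriber so the reference chain can be inspected.
    public object FirstSubscriber()
    {
        var handler = PropertyChanging;
        return handler == null ? null : handler.GetInvocationList()[0].Target;
    }
}

// Hypothetical stand-in for a change tracker owned by a data context.
class ToyChangeTracker
{
    public int ChangesSeen { get; private set; }

    public void Track(TrackedPerson entity)
    {
        // The entity's event now holds a delegate whose Target is this tracker.
        entity.PropertyChanging += OnPropertyChanging;
    }

    void OnPropertyChanging(object sender, PropertyChangingEventArgs e)
    {
        ChangesSeen++;
    }
}

static class TrackerLifetimeDemo
{
    // Mimics a data access method: the tracker is local, the entity escapes.
    public static TrackedPerson LoadPerson()
    {
        var tracker = new ToyChangeTracker();
        var person = new TrackedPerson();
        tracker.Track(person);
        return person; // the returned entity still references the tracker
    }
}
```

After calling TrackerLifetimeDemo.LoadPerson(), the returned entity's FirstSubscriber() is the tracker created inside the method, which is why the tracker cannot be collected while the entity is reachable.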

To prove this point the following code snippet shows two examples: the first, where the change tracker is kept alive by leaving ObjectTrackingEnabled set to TRUE, and the second, where the change tracker is discarded by setting ObjectTrackingEnabled to FALSE.  The GetPerson method in the DataAccessLayer now uses reflection to retrieve the ChangeTracker object from the data context and assigns it to the WeakReference output parameter so the lifetime of the object can be tracked.
using System;
using System.Linq;
using System.Reflection;

static void Main(string[] args)
{
    DataAccessLayer layer = new DataAccessLayer();

    WeakReference refCTWithTracking;
    Person person1 = layer.GetPerson(true, out refCTWithTracking);
    GC.Collect();
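    Console.WriteLine("ChangeTrackerWithTracking IsAlive: {0}", refCTWithTracking.IsAlive);
    //keep the entity reachable up to this point so an optimised build cannot collect it (and its tracker) early
    GC.KeepAlive(person1);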

    WeakReference refCTWithoutTracking;
    Person person2 = layer.GetPerson(false, out refCTWithoutTracking);
    GC.Collect();
    Console.WriteLine("ChangeTrackerWithoutTracking IsAlive: {0}", refCTWithoutTracking.IsAlive);

    Console.ReadKey();
}

public class DataAccessLayer
{
    public Person GetPerson(bool objectChangeTracking, out WeakReference refChangeTracker)
    {
        using (AdventureWorksDataContext ctx = new AdventureWorksDataContext())
        {
            ctx.ObjectTrackingEnabled = objectChangeTracking;

            PropertyInfo propServices = typeof(AdventureWorksDataContext).GetProperty("Services", BindingFlags.NonPublic | BindingFlags.Instance);
            PropertyInfo propChangeTracker = propServices.PropertyType.GetProperty("ChangeTracker", BindingFlags.Instance | BindingFlags.NonPublic);
            refChangeTracker = new WeakReference(propChangeTracker.GetValue(propServices.GetValue(ctx)));

            return ctx.Persons.First();
        }
    }
}
It would have been a better design for Microsoft to explicitly unsubscribe from the PropertyChanged and PropertyChanging events when disposing of the data context, as the lifetime of the change tracker should be scoped to that of the data context.  In my opinion, the change tracker is redundant once the data context has been disposed.  Linq 2 SQL even prevents you from reattaching a disconnected entity to another context if its events are still subscribed to by another change tracker.

Friday, 1 August 2014

.NET Event Memory Leaks

A common cause of memory leaks in .NET applications is a subscribing class failing to unsubscribe from the events of another class it consumes before it is disposed of.

The example below shows how such a leak arises (copy and paste into LINQPad):
void Main()
{
 //subscriber subscribes to publisher event
 PublisherSubscriberExample(true);

 //subscriber does NOT subscribe to publisher event
 PublisherSubscriberExample(false);
}

void PublisherSubscriberExample(bool subscribeToEvents)
{
 String.Format("SubscribeToEvents: {0}", subscribeToEvents).Dump();
 Publisher pub = new Publisher();
 Subscriber sub = new Subscriber(pub, subscribeToEvents);
 WeakReference refPub = new WeakReference(pub);
 WeakReference refSub = new WeakReference(sub);
 //print publisher and subscriber object state pre dispose
 String.Format("Pub: {0}", refPub.IsAlive).Dump();
 String.Format("Sub: {0}", refSub.IsAlive).Dump();
 //dispose of the subscriber object
 sub = null;
 //force garbage collection for the subscriber object
 GC.Collect();
 //print publisher and subscriber object state post dispose
 String.Format("Pub: {0}", refPub.IsAlive).Dump();
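 String.Format("Sub: {0}", refSub.IsAlive).Dump();
 //keep the publisher reachable up to this point so an optimised build cannot collect it early
 GC.KeepAlive(pub);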
}

class Publisher
{
 public event EventHandler PublisherEvent;

 public Publisher() {}
}

class Subscriber
{
 Publisher _publisher;

 public Subscriber(Publisher publisher, bool subscribeToEvent)
 {
  _publisher = publisher;
 
  if (subscribeToEvent)
   _publisher.PublisherEvent += PublisherEvent;
 }

 void PublisherEvent(object sender, EventArgs e)
 {
 }
}
We have a Subscriber and a Publisher class; the Subscriber takes a Publisher instance and subscribes to an event.  When the last reference to the Subscriber is dropped (here by setting it to NULL and forcing a garbage collection), the Publisher instance still holds a reference to the Subscriber through the event's delegate, because the Subscriber has NOT explicitly unsubscribed. For as long as the Publisher instance is alive, the Subscriber instance stays alive too.

To get round this issue, it is good practice to implement IDisposable on the Subscriber class so that it can unsubscribe from the event when it is disposed:
class Subscriber : IDisposable
{
 Publisher _publisher;

 public Subscriber (Publisher publisher, bool subscribeToEvent)
 {
  _publisher = publisher;
 
  if (subscribeToEvent)
   _publisher.PublisherEvent += PublisherEvent;
 }

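 public void Dispose()
 {
  //public entry point of the dispose pattern
  this.Dispose(true);
 }

 void Dispose(bool disposing)
 {
  //the null check makes repeated Dispose calls safe
  if (disposing && _publisher != null)
  {
   //free managed resources: unsubscribe so the publisher no longer references this object
   _publisher.PublisherEvent -= PublisherEvent;
   _publisher = null;
  }
 }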

 void PublisherEvent(object sender, EventArgs e)
 {
 }
}
You would then call Dispose on objects which expose this method (or wrap them in a using statement) rather than just setting the reference to NULL.
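
As a rough illustration of that calling pattern, the sketch below wraps the subscriber in a using statement. The HandlerCount property and the DisposeDemo class are hypothetical additions made only so the effect of Dispose can be observed; the Subscriber here is a trimmed-down variant of the one above that always subscribes.

```csharp
using System;

class Publisher
{
    public event EventHandler PublisherEvent;

    // Number of handlers currently attached (added for illustration only).
    public int HandlerCount
    {
        get
        {
            var handler = PublisherEvent;
            return handler == null ? 0 : handler.GetInvocationList().Length;
        }
    }
}

class Subscriber : IDisposable
{
    Publisher _publisher;

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.PublisherEvent += OnPublisherEvent;
    }

    public void Dispose()
    {
        if (_publisher == null) return; // already disposed

        // Unsubscribe so the publisher no longer references this object.
        _publisher.PublisherEvent -= OnPublisherEvent;
        _publisher = null;
    }

    void OnPublisherEvent(object sender, EventArgs e)
    {
    }
}

static class DisposeDemo
{
    // Returns the handler count while the subscriber is in use and after disposal.
    public static Tuple<int, int> Run()
    {
        var pub = new Publisher();
        int during;

        using (var sub = new Subscriber(pub))
        {
            during = pub.HandlerCount;
        } // Dispose runs here, even if an exception is thrown inside the block

        return Tuple.Create(during, pub.HandlerCount);
    }
}
```

DisposeDemo.Run() should report one handler inside the using block and none afterwards, meaning the publisher no longer keeps the subscriber alive.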