Tuesday, 9 February 2016

Retry Pattern for failed Invocations

There are numerous times when invoking a delegate may fail with an unexpected exception, in which case we would like to automatically retry the invocation up to a certain number of attempts.  This can easily be achieved using the Retry Pattern as shown below:

public static bool TryInvoke<TException, T>(Func<T> del, out T value, out IList<Exception> exceptions, int attempts = 3, IInterval interval = null)
 where TException : Exception
{
 if (del == null) throw new ArgumentNullException("del");
 if (attempts <= 0) throw new ArgumentOutOfRangeException("attempts");

 exceptions = new List<Exception>();

 for (int i = 0; i < attempts; i++)
 {
  try
  {
   value = del();
   return true;
  }
  catch (TException ex)
  {
   // record the failure and optionally wait before the next attempt
   // (no point sleeping after the final attempt)
   exceptions.Add(ex);
   if (interval != null && i < attempts - 1)
    interval.Sleep();
  }
 }

 // all attempts failed - follow the Try-Parse convention
 value = default(T);
 return false;
}

The above generic implementation of the Retry Pattern also follows the structure of the Try-Parse Pattern: if the invocation of the delegate fails on all of its attempts then the "out" parameter for the return value of the delegate is set to its default value and the method returns false.  Conversely, if the delegate succeeds then the return value of the delegate is assigned to the "out" parameter and the method returns true - this means we can invoke delegates as follows:

int value;
Func<int> calculateValue = () => 21 * 2; // any computation that may throw
if (Invoker.TryInvoke<int>(calculateValue, out value))
{
 //print value
 Console.WriteLine("Value: {0}", value);
}
else
{
 //show error message to user
 Console.WriteLine("Error calculating value after X attempts");
}

It is also possible for us to gracefully retry on specific types of exceptions and let all others propagate.  For example, I may expect a failed attempt whilst downloading some data from an external source, therefore I may want to retry ONLY when a DownloadException is thrown - any other exception should surface as an unhandled exception:

string data;
Func<string> getData = () => new Downloader().Download(); // may throw DownloadException
IList<Exception> exceptions;
try
{
 if (Invoker.TryInvoke<DownloadException, string>(getData, out data, out exceptions))
 {
  //succeeded
 }
 else
 {
  //gracefully handle exceptions
 }
}
catch (Exception ex)
{
 //unexpected exception
}

We may also want to run this logic asynchronously, for example when a user clicks a button on a GUI and requests to download some data from an external source:

Func<string> downloadData = () =>
{
 return new Downloader().Download();
};
var result = await Invoker.TryInvokeAsync<string>(downloadData);
if (result.Success)
{
 //invocation passed
 Console.WriteLine(result.Value);
}

Note how the syntax of the TryInvokeAsync method has changed slightly in that it no longer contains an out parameter for the assigned value; instead a result object is returned, as async methods are unable to support out parameters.
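
For completeness, here is a minimal sketch of how TryInvokeAsync could be implemented - the actual implementation is on GitHub, but assuming a simple InvokeResult<T> wrapper in place of the out parameters:

public class InvokeResult<T>
{
 public bool Success { get; set; }
 public T Value { get; set; }
 public IList<Exception> Exceptions { get; set; }
}

public static Task<InvokeResult<T>> TryInvokeAsync<T>(Func<T> del, int attempts = 3, IInterval interval = null)
{
 // run the synchronous retry logic on a thread-pool thread and package
 // the out parameters into a result object instead
 return Task.Run(() =>
 {
  T value;
  IList<Exception> exceptions;
  bool success = TryInvoke<Exception, T>(del, out value, out exceptions, attempts, interval);
  return new InvokeResult<T> { Success = success, Value = value, Exceptions = exceptions };
 });
}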

The complete source code of the Retry Invoker can be found on GitHub.

Wednesday, 12 August 2015

Enumerable Timer with Stopwatch using an Extension Method

There have been numerous times when I’ve wanted to run a deferred query using either Linq-to-SQL (IQueryable<T>) or Linq-to-Objects (IEnumerable<T>) to capture the execution time for analysis.  Doing this for Linq-to-Objects is relatively easy using an extension method:

 public static IEnumerable<T> WithTimer<T>(this IEnumerable<T> source, Stopwatch watch)
 {
#if TRACE
  // compiled as an iterator when TRACE is defined: the watch starts on the
  // first MoveNext and stops once enumeration completes
  watch.Reset();
  watch.Start();
  foreach (var item in source)
   yield return item;
  watch.Stop();
#else
  return source;
#endif
 }

We simply pass in a Stopwatch which is started and stopped pre- and post-enumeration respectively.  This allows us to get the total elapsed time outside of the enumeration as follows:

 Stopwatch watch = new Stopwatch();
 Enumerable.Range(0, 1000000).Where(n => n % 2 == 0).WithTimer(watch).ToList();
 Console.WriteLine(watch.ElapsedMilliseconds);

However, doing this for Linq-to-SQL is marginally more difficult, as yield return isn't supported for IQueryable<T> - it isn't one of the iterator interface types.  We therefore have to create our own queryable wrapper (TimerQueryable<T>) for the IQueryable source and then wrap the enumerator returned by GetEnumerator in a TimerEnumerator<T> to control starting and stopping the timer.

    public class TimerQueryable<T> : IQueryable<T>
    {
        readonly IQueryable<T> _queryable;
        readonly IQueryProvider _provider;
        readonly Stopwatch _watch;

        public TimerQueryable(IQueryable<T> queryable, Stopwatch watch)
        {
            _watch = watch;
            _queryable = queryable;
            _provider = queryable.Provider;
        }

        public IEnumerator<T> GetEnumerator()
        {
            return new TimerEnumerator<T>(this._provider.CreateQuery<T>(Expression).GetEnumerator(), _watch);
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }

        public Type ElementType
        {
            get { return _queryable.ElementType; }
        }

        public System.Linq.Expressions.Expression Expression
        {
            get { return _queryable.Expression; }
        }

        public IQueryProvider Provider
        {
            get { return _queryable.Provider; }
        }
    }

    public class TimerEnumerator<T> : IEnumerator<T>
    {
        IEnumerator<T> _enumerator;
        Stopwatch _watch;

        public TimerEnumerator(IEnumerator<T> enumerator, Stopwatch watch)
        {
            _enumerator = enumerator;
            _watch = watch;
        }

        public T Current
        {
            get { return _enumerator.Current; }
        }

        public void Dispose()
        {
            _watch.Stop();
            _enumerator.Dispose();
        }

        object System.Collections.IEnumerator.Current
        {
            get { return this.Current; }
        }

        public bool MoveNext()
        {
            if (!_watch.IsRunning)
            {
                _watch.Reset();
                _watch.Start();
            }
            return this._enumerator.MoveNext();
        }

        public void Reset()
        {
            this._enumerator.Reset();
        }
    }

The Stopwatch in the TimerEnumerator<T> class is started upon the first call to MoveNext and stopped when the enumerator is disposed of in Dispose.  The watch is stopped rather than reset on disposal so that the total elapsed time can still be read outside of the enumerator.  As IEnumerator<T> derives from IDisposable it is fair to assume the enumerator will be disposed of correctly, provided the developer adheres to the IDisposable pattern.

As there is only one instance of the Stopwatch created and passed into multiple TimerEnumerator<T> instances, it's worth mentioning that this isn't inherently thread-safe - but then again, do you know of an IQueryable or IEnumerable source which is?  Trying to execute two queries at the same time against an Entity Framework context from different threads would almost certainly cause exceptions to be thrown.

Additionally, as we don't want this overhead in production code, the timers are guarded by the TRACE constant and only do any work when it is defined in the project - it doesn't make sense to run these timers when efficiency is paramount.
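
For symmetry, the IQueryable<T> overload of the extension method just needs to wrap the source in the TimerQueryable<T>.  This is my sketch of it (the GitHub source may differ slightly), using the same TRACE guard:

 public static IQueryable<T> WithTimer<T>(this IQueryable<T> source, Stopwatch watch)
 {
#if TRACE
  // the TimerEnumerator starts and stops the watch during enumeration
  return new TimerQueryable<T>(source, watch);
#else
  return source;
#endif
 }

Usage then mirrors the Linq-to-Objects version, e.g. context.Products.Where(p => p.Price > 10).WithTimer(watch).ToList() against a hypothetical context.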

The full source code of the Enumerable Timer can be found on GitHub.

Wednesday, 10 June 2015

Return Await vs Returning Task

I'm a true believer in keeping code as simple as possible by minimising the amount I have to write whilst not compromising functionality and readability.  I ran into a problem the other day when using the async/await pattern along with the Task Parallel Library (TPL).  Observe the following code snippet:

Task<int> CalculateValue()
{
    using (var calc = new Calculator())
    {
        return calc.Calculate();
    }
}

First thing to mention is that I like to conform to standards - if my class directly uses unmanaged resources, or indirectly uses them through another managed class, then I implement the IDisposable pattern to ensure there are no resource leaks throughout the lifetime of the application.

So back to the problem at hand: the Calculate method on the Calculator class returns a Task<int>, as this is a long running calculation performed on another thread.  All is good except for one crucial detail - as soon as Calculate returns, execution leaves the using block in CalculateValue and the Dispose method on the Calculator is called, long before the Calculate task completes.  The reason is simply that I didn't decorate the calc.Calculate call with an await and the CalculateValue method with the async keyword.
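
To make the failure concrete, imagine a Calculator along these lines (a hypothetical sketch - the real class could be doing any long running work):

class Calculator : IDisposable
{
 bool _disposed;

 public Task<int> Calculate()
 {
  // Calculate returns immediately with a pending Task; by the time the
  // work runs, the enclosing using block may already have disposed us
  return Task.Run(() =>
  {
   Thread.Sleep(1000); // simulate a long running calculation
   if (_disposed)
    throw new ObjectDisposedException("Calculator");
   return 42;
  });
 }

 public void Dispose()
 {
  _disposed = true;
 }
}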

To resolve this problem I should change to the following:

async Task<int> CalculateValue()
{
    using (var calc = new Calculator())
    {
        return await calc.Calculate();
    }
}

So once the Task returned by the Calculate method has completed, the Dispose method on the Calculator is subsequently called.

The reason I didn't implement it like this in the first place is that I don't like adding complexity where I believe it isn't required.  Adding the await keyword to the calc.Calculate call implies there is code to run after the call completes - in my opinion there was nothing further to do apart from correctly disposing of the calculator, which I easily missed.  Implementing the call this way would have been perfectly fine (no async and await keywords necessary):

Calculator _calc = new Calculator();

Task<int> CalculateValue()
{
    return _calc.Calculate();
}

As I inject most of my dependencies using a DI framework, it has been a rare occurrence that I've run into this problem.

Saturday, 10 January 2015

EF Mapping with Discriminator Column and Concrete Tables for Derived Types

A problem I recently ran into with Entity Framework 6 was wanting to use two features at the same time, specifically a Table-Per-Hierarchy discriminator column combined with concrete tables for the derived types.  Using both at the same time causes a conflict which the framework is unable to handle.

The following database schema was used for mapping to my domain model, where I wanted the Person.PersonType column as a discriminator for my derived types.

[Schema diagram: dbo.Person (PersonId, Name, PersonType) joined to dbo.Customer (PersonId, CustomerNumber) and dbo.Operator (PersonId, OperatorNumber) on PersonId]

My domain model was as follows (PersonType property has been omitted as discriminators cannot be mapped to properties):

class Person
{
 public int PersonId { get; set; }
 public string Name { get; set; }
}

class Customer : Person
{
 public string CustomerNumber { get; set; }
}

class Operator : Person
{
 public string OperatorNumber { get; set; }
}

The model mapping I had in place was as follows:
  modelBuilder
    .Entity<Person>()
    .HasKey(n => n.PersonId)
    .Map(map =>
    {
        map.Properties(p => new { p.Name });
        map.ToTable("dbo.Person");
    })
    .Map<Customer>(map =>
    {
        map.Requires("PersonType").HasValue("C");
        map.Properties(p => new { p.CustomerNumber });
        map.ToTable("dbo.Customer");
    })
    .Map<Operator>(map =>
    {
        map.Requires("PersonType").HasValue("O");
        map.Properties(p => new { p.OperatorNumber });
        map.ToTable("dbo.Operator");
    });

Which caused the following SQL exception to be thrown:

EntityCommandExecutionException
An error occurred while executing the command definition. See the inner exception for details
Invalid column name 'PersonType'.Invalid column name 'PersonType'.

The SQL which was generated by the entity framework was as follows:

SELECT 
    CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN '0X' WHEN (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1], 
    [Extent1].[PersonId] AS [PersonId], 
    [Extent1].[Name] AS [Name], 
    CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[CustomerNumber] END AS [C2], 
    CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [Project2].[OperatorNumber] END AS [C3]
    FROM   [dbo].[Person] AS [Extent1]
    LEFT OUTER JOIN  (SELECT 
        [Extent2].[PersonId] AS [PersonId], 
        [Extent2].[CustomerNumber] AS [CustomerNumber], 
        cast(1 as bit) AS [C1]
        FROM [dbo].[Customer] AS [Extent2]
        WHERE [Extent2].[PersonType] = N'C' ) AS [Project1] ON [Extent1].[PersonId] = [Project1].[PersonId]
    LEFT OUTER JOIN  (SELECT 
        [Extent3].[PersonId] AS [PersonId], 
        [Extent3].[OperatorNumber] AS [OperatorNumber], 
        cast(1 as bit) AS [C1]
        FROM [dbo].[Operator] AS [Extent3]
        WHERE [Extent3].[PersonType] = N'O' ) AS [Project2] ON [Extent1].[PersonId] = [Project2].[PersonId]

As you can see from the generated SQL, it is trying to use the discriminator column on our satellite tables (dbo.Operator and dbo.Customer) rather than on our base table (dbo.Person).  As of yet I haven't found a way to change how the Entity Framework mapping treats the discriminator column so as to achieve both the discriminator mapping and the concrete derived-type tables.  To get around this problem I had to create a view on top of all of the tables involved in the hierarchy mapping:

create view dbo.vw_PersonExtended
as

    select
        p.Name, p.PersonId, p.PersonType, c.CustomerNumber, o.OperatorNumber
    from
        dbo.Person p
        left join dbo.Customer c on c.PersonId = p.PersonId
        left join dbo.Operator o on o.PersonId = p.PersonId

And mapping this view to the base class type Person, removing the derived class table mappings, as follows:
   modelBuilder
    .Entity<Person>()
    .HasKey(n => n.PersonId)
    .Map(map =>
    {
        map.Properties(p => new { p.Name });
        map.ToTable("dbo.vw_PersonExtended");
    })
    .Map<Customer>(map =>
    {
        map.Requires("PersonType").HasValue("C");
        map.Properties(p => new { p.CustomerNumber });
    })
    .Map<Operator>(map =>
    {
        map.Requires("PersonType").HasValue("O");
        map.Properties(p => new { p.OperatorNumber });
    });

The derived types Customer and Operator are no longer mapped to their individual concrete tables; instead their values come from the mapping to the view.

Additionally, we have to create either an INSTEAD OF INSERT trigger on top of this view or create a stored procedure to allow our DB context to support persisting new entities.

The stored procedure would be:

create procedure dbo.usp_InsertCustomer
    @Name varchar(50),
    @CustomerNumber varchar(50)
as
begin

        set nocount on
        declare @id int

        insert into dbo.Person (Name, PersonType)
        values (@Name, 'C')
        set @id = scope_identity()

        insert into dbo.Customer (PersonId, CustomerNumber)
        values (@id, @CustomerNumber)

        select @id as PersonId

end

And mapped as follows:

modelBuilder
        .Entity<Customer>()
        .MapToStoredProcedures(map => map.Insert(i => i.HasName("usp_InsertCustomer")));
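
With that in place, persisting a new Customer goes through the stored procedure transparently.  A sketch of the usage, assuming a PersonContext derived from DbContext with a Customers set (both hypothetical names, as are the sample values):

using (var context = new PersonContext())
{
    // EF executes usp_InsertCustomer for the insert during SaveChanges
    context.Customers.Add(new Customer { Name = "Jane", CustomerNumber = "C0001" });
    context.SaveChanges();
}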

The obvious drawback to this approach is that you have to create views and stored procedures on the database side whenever you want to implement the discriminator hierarchy pattern in Entity Framework, as well as the extra plumbing work in the database context mapping.  This shouldn't impede the performance of the implementation, it just makes the implementation more time consuming.

Wednesday, 17 December 2014

SQL Scalar Value Function Inefficiencies

Are scalar value functions (SVF) really the best place to centralise the logic in your DB?

Although I’m against it many developers place business logic regardless of complexity into the database through the use of inline case statements where this could be a simple calculation to calculate a shipping cost for example. As the database grows and the number of SQL queries increases where they have a dependency on this calculation it becomes good practice to centralise the code for reuse rather than duplicating the logic in numerous queries.

One of the options for centralising this logic is a scalar value function, called inline in the SQL statement - either in the SELECT list or as a predicate in the WHERE clause.  However, using an SVF inside a query actually impedes performance: the call is executed in a separate context to the main query, incurring an additional cost for every row processed.

If we take a simple Products table with 100k records as an example (the following script simply generates some random numbers for the properties of each product):

set nocount on

create table dbo.temp_Products
(
 [ProductID] int,
 [Type] tinyint,
 [Price] float,
 [Weight] float 
)

declare @rowcnt int = 0
while (@rowcnt <= 100000)
begin
 insert into dbo.temp_Products
 select @rowcnt, 1+rand()*4, 1+rand()*100, 1+rand()*10

 set @rowcnt = @rowcnt + 1
end

And a scalar value function which does a simple logical calculation for calculating a shipping cost:

create function dbo.usvf_CalculateShipping
(
    @PricePerKG float = 1,
    @Type tinyint,
    @WeightInKG float = 1
)
returns float
as
begin

    return  /*get the appropriate factor to apply*/
            case
                when @Type = 1 then 0.1
                when @Type = 2 then 0.2
                when @Type = 3 then 0.35
                when @Type = 4 then 0.43
            end * @PricePerKG * @WeightInKG

end
go

When we call the SVF inside a query the resulting performance is impaired relative to calling the logic inline. To compare the differences between the two take the following two SQL statements:

select ProductID, case
                when [Type] = 1 then 0.1
                when [Type] = 2 then 0.2
                when [Type] = 3 then 0.35
                when [Type] = 4 then 0.43
            end *Price*[Weight] from temp_Products

select ProductID, dbo.usvf_CalculateShipping(Price, [Type], [Weight]) from temp_Products

The first query, which runs the logic inline, produces a result in a quicker time than the second query.  The results are as follows:

(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 78 ms,  elapsed time = 212 ms.

(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 842 ms,  elapsed time = 938 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

To get around this, the solution is to use an inline table-valued function (TVF) to hold the logic; the function body is then expanded into the plan of the calling query, effectively placing the logic inline.

create function dbo.utvf_CalculateShipping
(
    @PricePerKG float = 1,
    @Type tinyint,
    @WeightInKG float = 1
)
returns table as return 
select case
         when @Type = 1 then 0.1
         when @Type = 2 then 0.2
         when @Type = 3 then 0.35
         when @Type = 4 then 0.43
       end * @PricePerKG * @WeightInKG as Shipping
go

If I run the following two queries you'll find the performance of both is almost identical:

select ProductID, case
                when [Type] = 1 then 0.1
                when [Type] = 2 then 0.2
                when [Type] = 3 then 0.35
                when [Type] = 4 then 0.43
            end *Price*[Weight] from temp_Products
select ProductID, tt.Shipping from temp_Products t cross apply dbo.utvf_CalculateShipping(Price, Type, Weight) tt



(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 109 ms,  elapsed time = 310 ms.

(100001 row(s) affected)
Table 'temp_Products'. Scan count 1, logical reads 390, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 109 ms,  elapsed time = 285 ms.

Monday, 13 October 2014

Thread-Safe Singleton Construction

The Singleton design pattern is a common technique for ensuring only one instance of a class exists during runtime:

class NonThreadSafeSingleton
{
 static NonThreadSafeSingleton instance;

 public static NonThreadSafeSingleton Instance
 {
  get
  {
   if (instance == null)
    instance = new NonThreadSafeSingleton();
   return instance;
  }
 }
}

The above implementation of the Singleton pattern may work, but it isn't safe to use between multiple threads.  If multiple threads were to access the above Singleton implementation we could experience the following sequence of execution at runtime:

Order  Thread 1                                   Thread 2
1      if (instance == null)
2                                                 if (instance == null)
3      instance = new NonThreadSafeSingleton();
4                                                 instance = new NonThreadSafeSingleton();
5      return instance;
6                                                 return instance;

As you can see from the order of execution, the NonThreadSafeSingleton class is instantiated twice due to a race condition, when in reality our desired execution is as follows:

Order  Thread 1                                   Thread 2
1      if (instance == null)
2      instance = new NonThreadSafeSingleton();
3      return instance;
4                                                 return instance;

If we're going to make this thread-safe one approach in .NET 2.0 would be to use an exclusive lock:

class ThreadSafeSingletonWithLock
{
 static ThreadSafeSingletonWithLock instance;
 static object locker = new Object();

 public static ThreadSafeSingletonWithLock Instance
 {
  get
  {
   lock (locker)
   {
    if (instance == null)
     instance = new ThreadSafeSingletonWithLock();
   }
   return instance;
  }
 }
}

But looking at the sequence of execution we now run into the problem of a lock always being acquired even if the instance is already constructed (executions 1 and 5 listed below):

Order  Thread 1                                        Thread 2
1      lock (locker)
2      if (instance == null)
3      instance = new ThreadSafeSingletonWithLock();
4      return instance;
5                                                      lock (locker)
6                                                      return instance;

This causes an additional overhead which isn't desired.  It is actually possible to achieve the same synchronised object construction whilst minimising the amount of locking required:

class ThreadSafeSingletonWithDoubleCheckLock
{
 static ThreadSafeSingletonWithDoubleCheckLock instance;
 static object locker = new Object();

 public static ThreadSafeSingletonWithDoubleCheckLock Instance
 {
  get
  {
   if (instance == null)
   {
    lock (locker)
    {
     if (instance == null)
      instance = new ThreadSafeSingletonWithDoubleCheckLock();
    }
   }
   return instance;
  }
 }
}

It is still possible for two threads to block on the lock during object construction, although if the object construction is quick the likelihood of this occurring is minimal.  Any instance retrievals after instantiation are done outside of the lock, providing an improvement at runtime.  (Strictly speaking, the instance field should also be declared volatile for double-checked locking to be correct under the .NET memory model.)  Although the above implementation of the thread-safe Singleton pattern is better than the prior implementation, it feels a bit messy having to write the null check twice - an inexperienced developer may think the inner check is unnecessary and remove it.

An even better implementation is provided in the .NET 4.0 BCL with Lazy<T>.  This allows a single instance of a class to be instantiated upon first request in a thread-safe way.  An example of its usage is shown below:

class ThreadSafeLazySingleton
{
 readonly static Lazy<ThreadSafeLazySingleton> instance = new Lazy<ThreadSafeLazySingleton>(() => new ThreadSafeLazySingleton());

 private ThreadSafeLazySingleton(){}

 public static ThreadSafeLazySingleton Instance
 {
  get { return instance.Value; }
 }
}

The Lazy<T> overloaded constructors provide performance options for construction.  For example, if the instance isn't going to be shared between multiple threads then the Lazy instance can be declared as non-thread-safe to ensure the highest possible performance.
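
For example (StringBuilder used purely as a placeholder type):

// thread-safe (the default): at most one thread will ever run the factory
var threadSafe = new Lazy<StringBuilder>(() => new StringBuilder(), LazyThreadSafetyMode.ExecutionAndPublication);

// no locking at all - cheaper, but only valid for single-threaded use
var singleThreaded = new Lazy<StringBuilder>(() => new StringBuilder(), isThreadSafe: false);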

By using the Lazy class we achieve the following:
  1. Encapsulation of the locking mechanism: as the synchronisation of instantiation is internal to the Lazy<T> class, any future optimisations made by the .NET team should not affect the usage of this class unless they intend to introduce a breaking change;
  2. The singleton instance can be used safely in a multi-threaded environment;
  3. The double-check locking is gracefully encapsulated inside the class without the developer having to implement it themselves.
The private constructors in the earlier examples have been omitted for brevity.

Wednesday, 8 October 2014

Binding ConverterParameter

There have been numerous cases where I've wanted to bind the ConverterParameter of a converter instance to either a property on another object or a property on my view model.

XAML Binding
<Label Content="{Binding BoundedValue
            , Converter={StaticResource valueIfNullConverter}
            , ConverterParameter={Binding DefaultNullValue}}"/>

Converter
    public class ValueIfNullConverter : IValueConverter
    {
        public object Convert(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)
        {
            return value ?? System.Convert.ToString(parameter);
        }

        public object ConvertBack(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)
        {
            // one-way conversion only
            throw new NotSupportedException();
        }
    }

Only to find this failing at runtime with the following exception:

A 'Binding' cannot be set on the 'ConverterParameter' property of type 'Binding'. A 'Binding' can only be set on a DependencyProperty of a DependencyObject.

This implies that the ConverterParameter property isn't exposed as a DependencyProperty and therefore can't be the target of a binding.

To overcome this problem we can use a MultiBinding along with a multi-value converter whereby we bind the "values" of the converter to both the data source and converter parameter properties as follows:

        <Label>
            <Label.Content>
                <MultiBinding Converter="{StaticResource valueIfNullConverter}">
                    <Binding Path="BoundedValue" />
                    <Binding Path="DefaultNullValue" />
                </MultiBinding>
            </Label.Content>
        </Label>

And changing our converter to a multi-value converter:
    public class ValueIfNullConverter : IMultiValueConverter
    {
        public object Convert(object[] values, Type targetType, object parameter, System.Globalization.CultureInfo culture)
        {
            if (values != null && values.Length == 2)
            {
                return values[0] ?? System.Convert.ToString(values[1]);
            }

            return values;
        }

        public object[] ConvertBack(object value, Type[] targetTypes, object parameter, System.Globalization.CultureInfo culture)
        {
            // one-way conversion only
            throw new NotSupportedException();
        }
    }

In the multi-value converter above, the bound value and the parameter are now accessible through the 'values' array argument instead of through the 'parameter' argument.
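
For reference, the view model behind these bindings only needs to expose the two bound properties.  A hypothetical example (the class name is illustrative; the property names must match the binding paths above):

public class MainViewModel
{
    // the value to display, which may legitimately be null
    public object BoundedValue { get; set; }

    // the fallback text shown when BoundedValue is null
    public string DefaultNullValue { get; set; }
}

A real view model would also implement INotifyPropertyChanged so the Label updates whenever either property changes.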