2008/03/30

IQueryable<T> vs. IEnumerable<T>

In one of our current projects we're using LINQ TO SQL to conquer the object-relational impedance mismatch. We all had some experience with LINQ and deferred execution. But it was getting obvious that we all needed to deeply internalize the difference between IEnumerable and IQueryable.

EntitySet<T> and Table<T> both implement the IEnumerable<T> interface. However, if there would not be IQueryable all querying functionality - including filtering and sorting - would be executed on the client. To optimize a query for a specific data source we need a way to analyze a query and its definition. That’s where expression trees are coming in. As we know an expression tree represents the logical definition of a method call which can be manipulated and transformed.

So we have a mutable logical definition of our query on one side and a queryable data source on the other. The Property Provider on IQueryable now returns an IQueryProvider which is exactly what we need here.

public interface IQueryProvider {
  IQueryable CreateQuery(Expression expression);
  IQueryable<TElement> CreateQuery<TElement>(Expression     expression);
  object Execute(Expression expression);
  TResult Execute<TResult>(Expression expression);
}


There are 2 interesting operations and their generic counterparts. The generic versions are used most of the time and they perform better because we can avoid using reflection for object instantiation. CreateQuery() does precisely what we are looking for. It takes an expression tree as argument and returns another IQueryable based on the logical definition of the tree. When the returned IQueryable is enumerated it will invoke the query provider which will then process this specific query expression.
The Execute() method now is used to actually executing your query expression. This explicit entry point – instead of just relying on IEnumerator.GetEnumerator() – allows executing ET’s which do not necessarily yield sequences of elements. (For example aggregate functions like count or sum.)

We finally have our two worlds nicely connected together. The mutability of ET and the deferred execution of IEnumerable combined to a construct that can analyze an arbitrary mutated and extended query at the last possible moment and execute an optimized query against its data source. It’s not even too hard to implement your own IQueryProvider for your own data source. Maybe I’ll cover that in a later post. This is really nice work Eric Meijer and his team has done here.