2010 07 03 building tuple expressions

Published on July 3nd, 2010 at 20:24

Building tuple expressions

In my previous post I explained how the re-linq SQL backend moves OrderBy clauses from sub-queries to the outermost surrounding query in order to be compatible with SQL sub-query restrictions. Because the outer query might not have access to the properties required for the ordering, it is necessary to replace the subquery’s projection with a tuple containing both the original projection and all the ordering expressions:

from o in Orders
from product in (from oi in o.OrderItems
                 orderby oi.Quantity descending
                 select oi.Product)
select new { Order = o, Product = product }

becomes

from o in Orders
from product in (  
  from oi in o.OrderItems
  select new KeyValuePair<string, int\> (oi.Product, oi.Quantity))
orderby product.Value descending
select new { Order = o, Product = product.Key }

The SQL backend uses the predefined KeyValuePair type defined by .NET to construct tuples, but this presents a problem: what to do if there is more than one OrderBy expression in the sub-query? KeyValuePair can only store two values – the original projection and one OrderBy expression. So what if we have to store more than that?

[Side note: Of course, we could just implement our own Triple with three elements. And Quadruple, Quintuple, Sextuple with four, five, and six items. But we can’t build a tuple type that can take an arbitrary number of items. We could use an array, but the re-linq backend does not currently support analyzing array index operations.]

One elegant solution is to use KeyValuePair as a recursive data type. If we need to store more than two values, we’ll put a second KeyValuePair into the first one’s Value property – and thus gain one more place to store something:

Values to be stored	Expression
1	1
1,2	new KVP(1,2)
1,2,3	new KVP(1,new KVP(2,3))
1,2,3,4	new KVP(1,new KVP(2,new KVP(3,4)))

As you can see, this is quite flexible. It is, by the way, how many functional and logic-oriented programming languages construct lists.

Since building such tuple expressions can be handy in different situations while building a LINQ provider, re-linq now contains a helper class, TupleExpressionBuilder, which helps with building such tuples and accessing the members held by a tuple.

Here is the code:

public class TupleExpressionBuilder
{
 public static Expression AggregateExpressionsIntoTuple (  
 IEnumerable<Expression\> expressions)
  {
    ArgumentUtility.CheckNotNull ("expressions", expressions);

    return expressions
        .Reverse ()
        .Aggregate ((current, expression) => CreateTupleExpression (  
            expression, current));
  }

  public static IEnumerable<Expression\> GetExpressionsFromTuple (  
      Expression tupleExpression)
  {
    ArgumentUtility.CheckNotNull ("tupleExpression", tupleExpression);

    while (tupleExpression.Type.IsGenericType  
        && tupleExpression.Type.GetGenericTypeDefinition()  
            == typeof (KeyValuePair<,>))
    {
      yield return Expression.MakeMemberAccess (  
          tupleExpression,  
          tupleExpression.Type.GetProperty ("Key"));
      tupleExpression = Expression.MakeMemberAccess (  
          tupleExpression,  
          tupleExpression.Type.GetProperty ("Value"));
    }

    yield return tupleExpression;
  }

  private static Expression CreateTupleExpression (  
      Expression left,  
      Expression right)
  {
    var tupleType =  
        typeof (KeyValuePair<,>).MakeGenericType (left.Type, right.Type);
    var newTupleExpression =
        Expression.New (
            tupleType.GetConstructor (new[] { left.Type, right.Type }),
            new[] { left, right },
            new[] {  
                tupleType.GetMethod ("get_Key"),  
                tupleType.GetMethod ("get_Value") });
    return newTupleExpression;
  }
}

Is anybody else feeling the LISP-ness of those tuples? I only say car and cdr…
(Although, actually, I prefer the ./2 predicate.)

Exercise for the reader: Get rid of the while loop in GetExpressionsFromTuple. It would be so much easier if we had a NIL element at the end of the list…

- Fabian

Comments

Stefan Wenig - July 5th, 2010 at 8:48

Cool! Is the transformed LINQ query (the one with KeyValuePair) an actual intermediate result that is then used to generate SQL from? If so, when do you apply that transformation?

Fabian Schmied - July 5th, 2010 at 11:16

Yeah, hard to find with search engines that ignore punctuation characters :)

./2 is the list constructor of Prolog. In Prolog, a list [1, 2, 3] can be written as .(1, .(2, .(3, []))) (or as [1|[2|[3|[]]]], or in a few other notations…). Why you’d do that? Because you can easily write algorithms that recurse over the elements of a list that way. Most functional languages have a smilar notation for lists.

Note that Prolog (like many functional languages) uses a NIL element to terminate the list, which makes many recursive algorithms easier to write declaratively.

We originally also has a NIL element at the end of our KeyValuePair list structure, but we got rid of it because it required special casing in other situations. Now, however, the GetExpressionsFromTuple algorithm requires a dedicated "end" case (yield return tupleExpression) that makes it hard to write declaratively using LINQ.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2010 07 03 building tuple expressions

Building tuple expressions

Comments

Stefan Wenig - July 5th, 2010 at 8:48

Fabian Schmied - July 5th, 2010 at 11:16

Clone this wiki locally