-
Notifications
You must be signed in to change notification settings - Fork 58
2010 07 03 building tuple expressions
Published on July 3nd, 2010 at 20:24
In my previous post I explained how the re-linq SQL backend moves OrderBy
clauses from sub-queries to the outermost surrounding query in order to be compatible with SQL sub-query restrictions. Because the outer query might not have access to the properties required for the ordering, it is necessary to replace the subquery’s projection with a tuple containing both the original projection and all the ordering expressions:
from o in Orders
from product in (from oi in o.OrderItems
orderby oi.Quantity descending
select oi.Product)
select new { Order = o, Product = product }
becomes
from o in Orders
from product in (
from oi in o.OrderItems
select new KeyValuePair<string, int\> (oi.Product, oi.Quantity))
orderby product.Value descending
select new { Order = o, Product = product.Key }
The SQL backend uses the predefined KeyValuePair
type defined by .NET to construct tuples, but this presents a problem: what to do if there is more than one OrderBy
expression in the sub-query? KeyValuePair
can only store two values – the original projection and one OrderBy
expression. So what if we have to store more than that?
[Side note: Of course, we could just implement our own Triple with three elements. And Quadruple, Quintuple, Sextuple with four, five, and six items. But we can’t build a tuple type that can take an arbitrary number of items. We could use an array, but the re-linq backend does not currently support analyzing array index operations.]
One elegant solution is to use KeyValuePair
as a recursive data type. If we need to store more than two values, we’ll put a second KeyValuePair
into the first one’s Value
property – and thus gain one more place to store something:
Values to be stored | Expression |
---|---|
1 | 1 |
1,2 | new KVP(1,2) |
1,2,3 | new KVP(1,new KVP(2,3)) |
1,2,3,4 | new KVP(1,new KVP(2,new KVP(3,4))) |
As you can see, this is quite flexible. It is, by the way, how many functional and logic-oriented programming languages construct lists.
Since building such tuple expressions can be handy in different situations while building a LINQ provider, re-linq now contains a helper class, TupleExpressionBuilder
, which helps with building such tuples and accessing the members held by a tuple.
Here is the code:
public class TupleExpressionBuilder
{
public static Expression AggregateExpressionsIntoTuple (
IEnumerable<Expression\> expressions)
{
ArgumentUtility.CheckNotNull ("expressions", expressions);
return expressions
.Reverse ()
.Aggregate ((current, expression) => CreateTupleExpression (
expression, current));
}
public static IEnumerable<Expression\> GetExpressionsFromTuple (
Expression tupleExpression)
{
ArgumentUtility.CheckNotNull ("tupleExpression", tupleExpression);
while (tupleExpression.Type.IsGenericType
&& tupleExpression.Type.GetGenericTypeDefinition()
== typeof (KeyValuePair<,>))
{
yield return Expression.MakeMemberAccess (
tupleExpression,
tupleExpression.Type.GetProperty ("Key"));
tupleExpression = Expression.MakeMemberAccess (
tupleExpression,
tupleExpression.Type.GetProperty ("Value"));
}
yield return tupleExpression;
}
private static Expression CreateTupleExpression (
Expression left,
Expression right)
{
var tupleType =
typeof (KeyValuePair<,>).MakeGenericType (left.Type, right.Type);
var newTupleExpression =
Expression.New (
tupleType.GetConstructor (new[] { left.Type, right.Type }),
new[] { left, right },
new[] {
tupleType.GetMethod ("get_Key"),
tupleType.GetMethod ("get_Value") });
return newTupleExpression;
}
}
Is anybody else feeling the LISP-ness of those tuples? I only say car and cdr…
(Although, actually, I prefer the ./2
predicate.)
Exercise for the reader: Get rid of the while loop in GetExpressionsFromTuple
. It would be so much easier if we had a NIL element at the end of the list…
- Fabian
Cool! Is the transformed LINQ query (the one with KeyValuePair) an actual intermediate result that is then used to generate SQL from? If so, when do you apply that transformation?
Yeah, hard to find with search engines that ignore punctuation characters :)
./2
is the list constructor of Prolog. In Prolog, a list [1, 2, 3]
can be written as .(1, .(2, .(3, [])))
(or as [1|[2|[3|[]]]]
, or in a few other notations…). Why you’d do that? Because you can easily write algorithms that recurse over the elements of a list that way. Most functional languages have a smilar notation for lists.
Note that Prolog (like many functional languages) uses a NIL element to terminate the list, which makes many recursive algorithms easier to write declaratively.
We originally also has a NIL element at the end of our KeyValuePair list structure, but we got rid of it because it required special casing in other situations. Now, however, the GetExpressionsFromTuple algorithm requires a dedicated "end" case (yield return tupleExpression
) that makes it hard to write declaratively using LINQ.