Skip to content

Computation Expression macro

Alex Zimin edited this page Sep 14, 2011 · 3 revisions

Table of Contents

Asynchronous programming and Computation Expressions

Translated from Асинхронное программирование и Computation Expressions

The goal of this article is to serve as an introduction to asynchronous programming and computation expressions in Nemerle, but it might also be useful for those studying F#, since F# served as an inspiration for asynchronous programming implementation in Nemerle. On the other hand, someone might be interested by how difficult problems from other languages can be solved in a few lines using computation expressions.

Most likely everyone who has understood monads wrote an article about them. I am no exception. I will try to describe them very briefly, so that those who know would recall, while others got a clue for what follows.

Monads

A monad is a generator pattern built into the language. The pattern has the following pair at its core:

  • a polymorphic type:
interface F<T> {
...
}
  • and a couple of operations on it:
class M {
  static public F<T> Return<T>(T obj) { ... }
  static F<U> Bind<T,U>(F<T> obj, Func<T, F<U>> fun) { ... }
}

Return "wraps" any value into a monad, and Bind manipulates it. A very weak analogy can be made with StringBuilder, which takes the initial value in its constructor and then modifies it using the Append* methods.

If we replace F with IEnumerable, then the signature of Bind resembles the signature of SelectMany from Linq. This is not surprising, since Linq, with some serious reservations, is a monad. There is, by the way, an interesting explanation by Bart De Smet at PDC2010, titled "LINQ, Take Two – Realizing the LINQ to Everything Dream" (link).

If Linq is a Monad, let's try to rewrite a simple Linq expression:

var nums = Enumerable.Range(-2, 7);
var sqrt = from n in nums where n >=0 select Math.Sqrt(n);

with M-expressions. First, let's declare the M-expressions:

static class MList
{
    static public IEnumerable<T> Return<T>(T obj) 
    { 
        var list = new List<T>();
        list.Add(obj);
        return list;
    }
    static public IEnumerable<U> Bind<T, U>(this IEnumerable<T> obj, Func<T, IEnumerable<U>> fun) 
    {
        return obj.SelectMany(fun);
    }
    static public IEnumerable<T> Empty<T>()
    {
        return Enumerable.Empty<T>();
    }
}

Now, rewrite the Linq expression:

var nums = Enumerable.Range(-2, 7);
var sqrt = nums
    .Bind(n => n >= 0 ? MList.Return(n) : MList.Empty<int>())
    .Bind(n => MList.Return(Math.Sqrt(n)));

It looks even worse than Linq, since C# has no built-in monad support. In case of Nemerle, this code can be rewritten better. Let's declare some M-expressions:

class MList
{
  public Return[T](obj : T) : IEnumerable[T]
  { 
    def data = List();
    data.Add(obj);
    data
  }
  public Bind[T, U](obj : IEnumerable[T], f : T->IEnumerable[U]) : IEnumerable[U]
  {
    obj.SelectMany(f)
  }
  public Empty[T]() : IEnumerable[T]
  {
    Enumerable.Empty()
  }
}

And rewrite the Linq expression again:

def mlist = MList();
def nums = Enumerable.Range(-2, 7);
def sqrt = comp mlist 
{
  defcomp n = nums;
  defcomp n = if (n >= 0) mlist.Return(n) else mlist.Empty();
  return Math.Sqrt(n :> double);
};

First, recall that the Nemerle def is the immutable version of C# var, if is the ternary operator (?:), and constructor calls do not require new. Now, getting down to business, the comp operator declares the beginning of monadic computations, the following parameter supplies the M-operations, and the expressions themselves come next.

Compared with Linq, we have three lines, instead of one, but these lines look like regular code working with a variable, while in fact it generates a new collection from the original. This example serves as a demonstration. The subsequent examples are very difficult to reproduce in regular code. Let's figure out how it works.

Defcomp is a magic operator that "expands" a monad (in this case of type IEnumerable[T]) into a value (of type T), while return converts a value into a monad. In reality, there is no magic here: the compiler simply turns the expression:

defcomp n = nums;
...

into:

mlist.Bind(nums, n => ...)

Computation expressions

If we were discussing Haskell, the story could have stopped right there. However, the hybrid (functional/imperative) language case is much more complicated, since we have to deal with such control structures as conditional operators, loops, and yield. In order to understand the problem, we will try to express monadic expressions with loops with M-expressions inside the defcomp operator.

The solution for this problem is quite simple: we just need to extend the set of M-expressions with conditional expression and loop transformation methods. For example, While would have the following signature:

public F<FakeVoid> While<T>(Func<bool> cond, Func<F<FakeVoid>> body)

So, when the compiler encounters a cycle containing monadic operators, it first transforms the loop body into a chain of Bind calls, since Bind returns F, then wraps this chain in a lambda: "() => body()", of type >. The compiler also wraps the loop condition in a lambda, then feeds those lambdas into the While M-expression.

Every M-expression has to return a monad, but loops do not return anything, so there is no value to wrap into a monad. This is when we use the FakeVoid singleton.

We can now give an informal description of a computation expression: a monad for imperative languages. In Haskell's case, the compiler rewrites only defcomp and return inside monadic expressions, but imperative languages make it necessary to rewrite the control structures, as well. The following table lists all operators that get rewritten:

defcomp expands monad into a value; similar to assignment
callcomp expands monad; used when the value is not important
return wraps the argument into a monad; used at the end of a monadic computation block, similar to return from a function
returncomp the argument is a monad; returns the monad as the result of a monadic computation block; unlike return, does not wrap the argument again
yield wraps the argument into a monad and performs actions analogous to yield return
yieldcomp the argument is a monad; performs actions analogous to yield return; unlike yield, does not wrap the argument again
if, when, unless, while, do, foreach, for, using, try...catch...finally monadic versions of the usual control structures

Two more words about M-operation providers: officially, they are called builders; computation expression construction uses duck typing, so builders have to have the methods used by the compiler, but don't have to implement a full interface. This solution makes it possible to implement builders only partially when not intending to use all computation expression features. By the way, we have already used this approach when implementing the MList builder (with only defcomp and return).

Another reason for not using interfaces is that they put unecessarily strict constraints on M-expression signatures. We only need monads and M-expressions to have compatible types. For example, the preceding examples presumed that a monad has one generic parameter, but it could easily have several. It does not matter for the following discussion, but the principle could be examined in the Continuation monad example.

For those who want to write their own builders, the best advice is to study the source code: Computation expressions.

Standard builder examples

The Computation Expressions library includes several standard builders. Instead of instantiating them to pass as parameters, it is enough to pass them by name.

List

The list builder supports the language's standard control constructs, as well as yield and yieldcomp. It uses list[T] (the standard Nemerle list) as the monad. This builder is interesting in that it implements C# programmers' two old wishes: yielding from lambdas and yielding collections. First, let's look at an analogue of the Linq query from the beginning of the article:

def num = Enumerable.Range(-2, 7);
def sqrt : list[double] = comp list 
{
  foreach(n in num)
    when(n >= 0)
      yield Math.Sqrt(n);
}

You can see that the list builder lets you use yield expressions in-place, without creating functions for it. It can also be used instead of Linq to objects. I think this code is much easier to read than the equivalent Linq expression.

Now let's look at another wish: yielding collections. First, I declare a local function generating a sequence, then call it twice and yield it as a collection.

def upTo(n)
{
  comp list { for(mutable i=0;i<n;i++) yield i }
}
def twice = comp list
{
  repeat(2) yieldcomp upTo(3);
}
Console.WriteLine(twice);
//[0, 1, 2, 0, 1, 2]

I had to write my own generator rather than use Enumerable.Range(0, 3), because of its type: yieldcomp requires a monad as input, having type list[int] in this case, but Enumerable.Range(0, 3) returns IEnumerable[int]. In order to overcome this disparity, we have another builder - enumerable.

Enumerable

This builder repeats much of the list builder, except that it uses IEnumerable[T] as its monad type and allows you to create infinite sequences. Let's rewrite the last example:

def twice = comp enumerable
{
  repeat(2) yieldcomp Enumerable.Range(0, 3);
}
foreach(item in twice) Console.Write($"$item ");
//0, 1, 2, 0, 1, 2 

Array

Array works like list and enumerable, except that it uses array[T] as the monad type.

Async

This is the most complicated, but also the most useful builder, very similar to async/await in C#. It is used for creation of asynchronous computations, combining existing asynchronous computations.

It supports all operations, except yield and yieldcomp.

This builder's monad type is Async[T]. Objects of this type describe asynchronous computations returning values of type T (similarly to Task in C#). If an asynchronous operation does not return a value, then FakeVoid is used as T. The Bind operation, having type Async[T]*(T->Async[U])->Async[U], "continues" asynchronous computation (of type Async[T]) with a function. This function takes an object of type T as input and returns a new asynchronous computation of type Async[U].

ExecutionContext is another key type. Its subtypes' isntances are responsible for initiating asynchronous operations (for example, in current thread, in a ThreadPool, or using SynchronizationContext). Here is the signature:

public abstract class ExecutionContext
{
  public abstract Execute(computatuion : void -> void) : void;
}

In order to initiate an asynchronous operation, you have to call the Start method of the object describing the asynchronous operation (class Async[T]), passing it an ExecutionContext object. If the method is called without arguments, the asynchronous operation is initiated using ThreadPool.QueueUserWorkItem.

The async CTP extension, which makes it possible to use async/wait in C#, has many extension methods that add asynchronous operations to existing classes. The async monad library does not have these extensions, but offers simple tools for adding them. For example, let's look at a part of the current HttpWebRequest signature, responsible for completing requests, that exists since the first version of the framework:

public class HttpWebRequest : WebRequest, ISerializable
{
  public override IAsyncResult BeginGetResponse(AsyncCallback callback, object state);
  public override WebResponse EndGetResponse(IAsyncResult asyncResult);
}

Now, let's create an asynchronous extension suitable for use in monadic computations, using these primitives.

public module AsyncExtentions
{
  public GetResponseAsync(this request : HttpWebRequest) : Async[WebResponse]
  {
    Async.FromBeginEnd(request.BeginGetResponse(_, null), request.EndGetResponse(_))
  }
}

You should keep in mind that _ is a special symbol in Nemerle. In this case, it is used for currying (f(_) is equivalent to x => f(x)). Likewise, we can create wrappers for any standard asynchronous computations.

Lets try to write something from (C# 101) Async samples, such as parallel loading of several web pages and printing of their headers. I omit the code for GetHtml() and GetTitle() extensions, since this article is long enough as it is.

public PrintTitles() : Async[FakeVoid]
{
  comp async
  {
    def response1 = HttpWebRequest.Create("http://www.ya.ru").GetResponseAsync();
    def response2 = HttpWebRequest.Create("http://www.habr.ru").GetResponseAsync();
    defcomp response1 = response1;
    defcomp response2 = response2;
    Console.WriteLine(response1.GetHtml().GetTitle());
    Console.WriteLine(response2.GetHtml().GetTitle());
  }
}

The first two lines initiate asynchronous page loading operations. These methods return objects describing asynchronous operations at execution time. From the compiler's point of view, these objects have Async[WebResponce] (monad). In the following two lines, the monads are expanded into values. In other words, we wait for the data. The last two lines process the results.

The most important thing you have to remember is that a monad is a generator pattern. You create a function describing an asynchronous computation without initiating it. Initiating could be done with PrintTitles().Start(). This is very important, because it could be a source of errors. If a method returns Async[T], you have to know whether the code initiates a computation or simply constructs it. It might be useful to have a naming convention to distinguish between the two, like adding the suffix Async to methods initiating computations.

Nemerle has a lot of flexibility regarding asynchronous computation result processing. It has the ability to shift computations from one thread to another. Let's take a look at a button press handler:

private button1_Click (sender : object,  e : System.EventArgs) : void
{
  def formContext = SystemExecutionContexts.FromCurrentSynchronizationContext();
  def task = comp async
  {
    Thread.Sleep(5000);
    callcomp Async.SwitchTo(formContext);
    label1.Text = "success";
  }
  _ = task.Start(SystemExecutionContexts.ThreadPool());
}

First, we get an ExecutionContext, which initiates the computation in the current SynchronizationContext. We then describe an asynchronous operation: Thread.Sleep emulates an intensive computation. We switch the execution context to the GUI thread and print the result. The computation itself is performed in the thread pool execution context.

It looks like magic, but we have really seen all of this before: callcomp simply expands the monad when its value does not matter. "Why did we even need to expand it?" you might ask. The reason has to do with side effects and state. Monadic operations pass along state. At expansion time, the monad has access to this state and is able to change it — this is what happens here. In this example, the state keeps execution context information, and when this information is changed, switches to the new context. I recommend reading the source code for details — they are interesting.

In addition to Async.SwitchTo, there are other interesting monads affecting the execution thread. For example, Async.Yield says that execution context has changed without changing it. In some cases, this does nothing, but when used with a ThreadPool, this action triggers jump to another pool thread.

Conclusion

To conclude, I would like to remark that monads are a very rich subject: this article does not even touch on such classic monads as State, Cont (continuation), Maybe (another C# fanboy wish). You can read about these in other articles. Here, I tried to give a practical explanation sufficient to begin using asynchronous programming and monads list/enumerable in Nemerle while knowing what happens under the hood.

The C# await/async implementation and the Nemerle asynchronous programming approach have much in common, but with a big difference: await/async requires language support, but asynchronous programming in Nemerle is implemented with monads, which are part of the standard library, not the language itself.

Clone this wiki locally