Design Doc: vectorizing model operations

Fritz Obermeyer edited this page May 31, 2017 · 3 revisions

Objective

Speed up inference in models of >100 nodes by vectorizing model$calculate() invocations in compiled inference code.

Overview

We can vectorize operations on models simply by supporting block indexing in the code that compiles models down to nimble DSL code; model vectorization can then leverage our larger effort to vectorize nimble compilation. That larger effort can start with pure Eigen vectorization before we commit to a GPU framework.

Detailed Design

The easy part: block indexing in model$calculate()

Node functions already inherit from a base class nodeFun. To allow vectorized overrides of these functions (where vectorization is possible), we can mark the base-class methods nodeFun::*Block() as virtual, e.g.

class Example : public nodeFun ... {
   public:
    // Caches and returns the logprob of a single node.
    double calculate(indexedNodeInfo const& ARG1_INDEXEDNODEINFO__) const override;
    // Caches each node's logprob and returns the total logprob summed over all nodes.
    double calculateBlock(const useInfoForIndexedNodeInfo &ARG1_BLOCK__) const override;
    ...
};

The calculateBlock method will then be compiled from DSL code using compileNimble.
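To make the dispatch pattern concrete, here is a minimal, self-contained C++ sketch of the virtual-override design. Only nodeFun, calculate, and calculateBlock come from the doc; indexedNodeInfo's layout, nodeBlockInfo, and normalNodes are hypothetical stand-ins, and the base class supplies a scalar-loop fallback so non-vectorized node functions still work unchanged.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for nimble's internal types.
struct indexedNodeInfo { std::size_t index; };
using nodeBlockInfo = std::vector<indexedNodeInfo>;

class nodeFun {
 public:
  virtual ~nodeFun() = default;
  // Log probability of a single node.
  virtual double calculate(const indexedNodeInfo& info) const = 0;
  // Default block version: a scalar loop. Vectorized node functions
  // override this with a single bulk computation.
  virtual double calculateBlock(const nodeBlockInfo& block) const {
    double total = 0.0;
    for (const auto& info : block) total += calculate(info);
    return total;
  }
};

// Example node function: iid Normal(mu, sigma) observations.
class normalNodes : public nodeFun {
 public:
  normalNodes(std::vector<double> x, double mu, double sigma)
      : x_(std::move(x)), mu_(mu), sigma_(sigma) {}

  double calculate(const indexedNodeInfo& info) const override {
    const double log2pi = std::log(8.0 * std::atan(1.0));  // log(2*pi)
    double z = (x_[info.index] - mu_) / sigma_;
    return -0.5 * z * z - std::log(sigma_) - 0.5 * log2pi;
  }

  double calculateBlock(const nodeBlockInfo& block) const override {
    // Vectorized override: one pass over the block, no per-node
    // virtual dispatch; algebra is hoisted out of the loop.
    const double log2pi = std::log(8.0 * std::atan(1.0));
    double ss = 0.0;
    for (const auto& info : block) {
      double d = x_[info.index] - mu_;
      ss += d * d;
    }
    double n = static_cast<double>(block.size());
    return -0.5 * ss / (sigma_ * sigma_)
           - n * (std::log(sigma_) + 0.5 * log2pi);
  }

 private:
  std::vector<double> x_;
  double mu_, sigma_;
};
```

The key design point is that the base-class default keeps existing scalar node functions correct, so vectorized overrides can be rolled out incrementally, one distribution at a time.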

The hard part: vectorizing arbitrary DSL code

We propose to generalize our limited "eigenization" to a more general "tensorization", and then perform the tensor-to-eigen compilation in a later compilation stage.
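As a rough picture of what the tensor-to-Eigen stage would emit, the sketch below computes a blockwise Normal log probability as one elementwise array expression plus a reduction, rather than an n-step scalar loop. It uses std::valarray in place of Eigen so the example stays dependency-free; the function name and signature are illustrative, not part of the design.

```cpp
#include <cassert>
#include <cmath>
#include <valarray>

// Illustrative lowering target: a DSL block expression such as
// dnorm(x[1:n], mu, sigma, log = TRUE) becomes one elementwise
// array computation followed by a sum reduction.
double normalBlockLogProb(const std::valarray<double>& x,
                          double mu, double sigma) {
  const double log2pi = std::log(8.0 * std::atan(1.0));  // log(2*pi)
  std::valarray<double> z = (x - mu) / sigma;            // elementwise
  std::valarray<double> lp = -0.5 * (z * z + log2pi) - std::log(sigma);
  return lp.sum();                                       // reduction
}
```

With Eigen the same shape would be an Array expression, and Eigen's expression templates would fuse the elementwise work and the reduction into a single loop.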

Tensorization

We can start tensorizing

Tensor representation

It remains to be seen whether Eigen is suitable for

Tasks

  • PR Mark nodeFun::*Block() methods as virtual to allow vectorized overrides.

  • Add vectorized versions of nodeFun::*Block() methods during nndf_createMethodList() in nimbleFunction_nodeFunctionNew.R.

    • nodeFun::calculateBlock()
    • nodeFun::calculateDiffBlock()
    • nodeFun::getLogProbBlock()
    • nodeFun::simulateBlock()

...much more...