Merge pull request #15729 from JuliaLang/teh/val_manual
Expand documentation on Val{T}
timholy committed Apr 21, 2016
2 parents 9b029a3 + c56d161 commit 72ba87a
Showing 2 changed files with 183 additions and 17 deletions.
151 changes: 149 additions & 2 deletions doc/manual/performance-tips.rst
@@ -587,8 +587,10 @@ loop. There are several possible fixes:
- Declare the type of ``x``: ``x::Float64 = 1``
- Use an explicit conversion: ``x = one(T)``

Separate kernel functions
-------------------------
.. _man-performance-kernel-functions:

Separate kernel functions (aka, function barriers)
--------------------------------------------------

Many functions follow a pattern of performing some set-up work, and then
running many iterations to perform a core computation. Where possible,
@@ -637,6 +639,151 @@ Functions like ``strange_twos`` occur when dealing with data of
uncertain type, for example data loaded from an input file that might
contain either integers, floats, strings, or something else.

.. _man-performance-val:

Types with values-as-parameters
-------------------------------

Let's say you want to create an ``N``-dimensional array that
has size 3 along each axis. Such arrays can be created like this:

.. doctest::

    A = fill(5.0, (3, 3))

This approach works very well: the compiler can figure out that ``A``
is an ``Array{Float64,2}`` because it knows the type of the fill value
(``5.0::Float64``) and the dimensionality (``(3, 3)::NTuple{2,Int}``).
This implies that the compiler can generate very efficient code for
any future usage of ``A`` in the same function.

But now let's say you want to write a function that creates a
3×3×... array in arbitrary dimensions; you might be tempted to write a
function

.. doctest::

    function array3(fillval, N)
        fill(fillval, ntuple(d->3, N))
    end

This works, but (as you can verify for yourself using ``@code_warntype
array3(5.0, 2)``) the problem is that the output type cannot be
inferred: the argument ``N`` is a *value* of type ``Int``, and
type-inference does not (and cannot) predict its value in
advance. This means that code using the output of this function has to
be conservative, checking the type on each access of ``A``; such code
will be very slow.
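
As a rough complement to reading ``@code_warntype`` output, one can also ask
inference for the return type directly. The sketch below assumes Julia 1.x,
whose syntax differs slightly from the version this manual targets;
``array3_unstable`` is a name chosen here for illustration, and
``Base.return_types`` is an unexported reflection helper.

```julia
# Assumes Julia 1.x. `array3_unstable` is a hypothetical name for illustration.
array3_unstable(fillval, N) = fill(fillval, ntuple(d -> 3, N))

# The call works fine at run time...
A = array3_unstable(5.0, 2)

# ...but given only the argument *types* (Float64, Int), inference cannot
# settle on one concrete array type, because the dimensionality depends on
# the run-time value of N.
inferred = first(Base.return_types(array3_unstable, (Float64, Int)))
println(inferred, " is concrete: ", isconcretetype(inferred))
```

A non-concrete inferred type here is the same signal that ``@code_warntype``
highlights.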

Now, one very good way to solve such problems is by using the
:ref:`function-barrier technique
<man-performance-kernel-functions>`. However, in some cases you might
want to eliminate the type-instability altogether. In such cases, one
approach is to pass the dimensionality as a parameter, for example
through ``Val{T}`` (see :ref:`man-val-trick`):

.. doctest::

    function array3{N}(fillval, ::Type{Val{N}})
        fill(fillval, ntuple(d->3, Val{N}))
    end

Julia has a specialized version of ``ntuple`` that accepts a
``Val{::Int}`` as the second argument; by passing ``N`` as a
type-parameter, you make its "value" known to the compiler.
Consequently, this version of ``array3`` allows the compiler to
predict the return type.
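
For readers on a newer Julia, a minimal sketch of the same idea follows. It
assumes Julia 1.x, where method type parameters use ``where`` and ``ntuple``
takes a ``Val`` *instance* (``Val(N)``) rather than the type; the call-site
convention of passing the ``Val{N}`` type is kept from the text above.

```julia
# Assumes Julia 1.x: `where` syntax, and `ntuple` accepting a `Val` instance.
array3(fillval, ::Type{Val{N}}) where {N} = fill(fillval, ntuple(d -> 3, Val(N)))

A = array3(5.0, Val{2})

# N is now part of the argument's type, so inference can work out the
# dimensionality of the result from the argument types alone.
inferred = first(Base.return_types(array3, (Float64, Type{Val{2}})))
println(inferred, " is concrete: ", isconcretetype(inferred))
```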

However, making use of such techniques can be surprisingly subtle. For
example, it would be of no help if you called ``array3`` from a
function like this:

.. doctest::

    function call_array3(fillval, n)
        A = array3(fillval, Val{n})
    end

Here, you've created the same problem all over again: the compiler
can't guess the value of ``n``, so it doesn't know the type of
``Val{n}``. Attempting to use ``Val``, but doing so incorrectly, can
easily make performance *worse* in many situations. (Code like the
above should be used only when you're effectively combining ``Val``
with the function-barrier trick, to make the kernel function more
efficient.)
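
The pitfall can be checked the same way. This is a sketch assuming Julia 1.x
syntax, with ``array3`` redefined locally so the block stands alone:

```julia
# Assumes Julia 1.x. `array3` is redefined here so the example is self-contained.
array3(fillval, ::Type{Val{N}}) where {N} = fill(fillval, ntuple(d -> 3, Val(N)))

# `n` is an ordinary run-time value, so `Val{n}` is a type that is only
# known once the function actually runs.
call_array3(fillval, n) = array3(fillval, Val{n})

A = call_array3(5.0, 2)   # still works, via run-time dispatch

inferred = first(Base.return_types(call_array3, (Float64, Int)))
println(inferred, " is concrete: ", isconcretetype(inferred))
```

The result is correct, but the wrapper is no easier to infer than the original
``array3(fillval, N)``.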

An example of correct usage of ``Val`` would be:

.. doctest::

    function filter3{T,N}(A::AbstractArray{T,N})
        kernel = array3(1, Val{N})
        filter(A, kernel)
    end

In this example, ``N`` is passed as a parameter, so its "value" is
known to the compiler. Essentially, ``Val{T}`` works only when ``T``
is either hard-coded (``Val{3}``) or already specified in the
type-domain.
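
A small sketch of the "already in the type domain" case, again assuming Julia
1.x syntax. ``kernel_for`` is a hypothetical helper introduced for
illustration; it builds only the kernel and leaves out the filtering step.

```julia
# Assumes Julia 1.x. `kernel_for` is a hypothetical name for illustration.
array3(fillval, ::Type{Val{N}}) where {N} = fill(fillval, ntuple(d -> 3, Val(N)))

# N is taken from the type of A, so it is known when this method is compiled.
kernel_for(A::AbstractArray{T,N}) where {T,N} = array3(one(T), Val{N})

K = kernel_for(zeros(Float64, 4, 4, 4))

inferred = first(Base.return_types(kernel_for, (Array{Float64,3},)))
println(inferred, " is concrete: ", isconcretetype(inferred))
```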

The dangers of abusing multiple dispatch (aka, more on types with values-as-parameters)
---------------------------------------------------------------------------------------

Once one learns to appreciate multiple dispatch, there's an
understandable tendency to go crazy and try to use it for everything.
For example, you might imagine using it to store information, e.g.
::

    immutable Car{Make,Model}
        year::Int
        ...more fields...
    end

and then dispatch on objects like ``Car{:Honda,:Accord}(year, args...)``.

This might be worthwhile when the following are true:

- You require CPU-intensive processing on each ``Car``, and it becomes
  vastly more efficient if you know the ``Make`` and ``Model`` at
  compile time.

- You have homogeneous lists of the same type of ``Car`` to
  process, so that you can store them all in an
  ``Array{Car{:Honda,:Accord},N}``.

When the latter holds, a function processing such a homogeneous array
can be productively specialized: Julia knows the type of each element
in advance (all objects in the container have the same concrete type),
so Julia can "look up" the correct method calls when the function is
being compiled (obviating the need to check at run-time) and thereby
emit efficient code for processing the whole list.
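
The difference between the two situations shows up in the element type of the
container. The sketch below assumes Julia 1.x, where ``struct`` replaces
``immutable``; ``ParamCar`` is a hypothetical stand-in for ``Car`` with a
single field.

```julia
# Assumes Julia 1.x. `ParamCar` is a hypothetical stand-in for `Car`.
struct ParamCar{Make,Model}
    year::Int
end

# All elements share one concrete type, so the array's element type is concrete.
homogeneous = [ParamCar{:Honda,:Accord}(2010), ParamCar{:Honda,:Accord}(2012)]

# Different makes and models are different types; the array falls back to a
# non-concrete element type and each access needs a run-time type check.
mixed = [ParamCar{:Honda,:Accord}(2010), ParamCar{:Toyota,:Corolla}(2012)]

println(eltype(homogeneous), " is concrete: ", isconcretetype(eltype(homogeneous)))
println(eltype(mixed), " is concrete: ", isconcretetype(eltype(mixed)))
```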

When these do not hold, then it's likely that you'll get no benefit;
worse, the resulting "combinatorial explosion of types" will be
counterproductive. If ``items[i+1]`` has a different type than
``items[i]``, Julia has to look up the type at run-time, search for the
appropriate method in method tables, decide (via type intersection)
which one matches, determine whether it has been JIT-compiled yet (and
do so if not), and then make the call. In essence, you're asking the
full type system and JIT-compilation machinery to basically execute
the equivalent of a switch statement or dictionary lookup in your own
code.
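
For comparison, the dictionary-lookup alternative might look like the
following sketch (Julia 1.x syntax assumed). ``PlainCar``,
``SERVICE_INTERVAL``, and ``service_interval`` are hypothetical names, and the
interval numbers are made up purely for illustration.

```julia
# Assumes Julia 1.x. All names and numbers below are hypothetical.
struct PlainCar
    make::Symbol    # stored as ordinary field values, not type parameters
    model::Symbol
    year::Int
end

# Per-model data kept in a dictionary and looked up at run time.
const SERVICE_INTERVAL = Dict((:Honda, :Accord) => 12, (:Toyota, :Corolla) => 10)

# Fall back to a default interval for models that are not in the table.
service_interval(c::PlainCar) = get(SERVICE_INTERVAL, (c.make, c.model), 6)

cars = [PlainCar(:Honda, :Accord, 2012),
        PlainCar(:Toyota, :Corolla, 2015),
        PlainCar(:Ford, :Focus, 2010)]

intervals = map(service_interval, cars)
```

Every element of ``cars`` has the same concrete type, so only one
specialization of ``service_interval`` is needed no matter how many makes and
models appear.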

Some run-time benchmarks comparing (1) type dispatch, (2) dictionary
lookup, and (3) a "switch" statement can be found `on the mailing list
<https://groups.google.com/d/msg/julia-users/jUMu9A3QKQQ/qjgVWr7vAwAJ>`_.

Perhaps even worse than the run-time impact is the compile-time
impact: Julia will compile specialized functions for each different
``Car{Make, Model}``; if you have hundreds or thousands of such types,
then every function that accepts such an object as an argument (from a
custom ``get_year`` function you might write yourself, to the generic
``push!`` function in the standard library) will have hundreds or
thousands of variants compiled for it. Each of these increases the
size of the cache of compiled code, the length of internal lists of
methods, etc. Excess enthusiasm for values-as-parameters can easily
waste enormous resources.


Access arrays in memory order, along columns
--------------------------------------------

49 changes: 34 additions & 15 deletions doc/manual/types.rst
@@ -1237,39 +1237,58 @@ If you apply :func:`supertype` to other type objects (or non-type objects), a
julia> supertype(Union{Float64,Int64})
ERROR: `supertype` has no method matching supertype(::Type{Union{Float64,Int64}})

.. _man-val-trick:

"Value types"
-------------

As one application of these ideas, Julia includes a parametric type,
``Val{T}``, designated for dispatching on bits-type *values*. For
example, if you pass a boolean to a function, you have to test the
value at run-time:
In Julia, you can't dispatch on a *value* such as ``true`` or
``false``. However, you can dispatch on parametric types, and Julia
allows you to include "plain bits" values (Types, Symbols,
Integers, floating-point numbers, tuples, etc.) as type parameters. A
common example is the dimensionality parameter in ``Array{T,N}``,
where ``T`` is a type (e.g., ``Float64``) but ``N`` is just an
``Int``.
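
A minimal sketch of this point, assuming Julia 1.x syntax; ``dimensionality``
is a hypothetical helper that uses dispatch to read the value parameter back
out of the array type.

```julia
# Assumes Julia 1.x. `dimensionality` is a hypothetical name for illustration.
A = zeros(Float64, 3, 3)

# The second type parameter of Array is a plain integer value, not a type.
same_type = typeof(A) === Array{Float64,2}

# Dispatch binds N to that value, so it can be returned like any other Int.
dimensionality(::AbstractArray{T,N}) where {T,N} = N

n = dimensionality(A)
```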

.. doctest::
You can create your own custom types that take values as parameters,
and use them to control dispatch of custom types. By way of
illustration of this idea, let's introduce a parametric type,
``Val{T}``, which serves as a customary way to exploit this technique
for cases where you don't need a more elaborate hierarchy.

function firstlast(b::Bool)
return b ? "First" : "Last"
end
``Val`` is defined as::

println(firstlast(true))
immutable Val{T}
end

You can instead cause the conditional to be evaluated during function
compilation by using the ``Val`` trick:
There is no more to the implementation of ``Val`` than this. Some
functions in Julia's standard library accept ``Val`` types as
arguments, and you can also use it to write your own functions. For
example:

.. doctest::

firstlast(::Type{Val{true}}) = "First"
firstlast(::Type{Val{false}}) = "Last"

println(firstlast(Val{true}))
julia> println(firstlast(Val{true}))
First

Any legal type parameter (Types, Symbols, Integers, floating-point
numbers, tuples, etc.) can be passed via ``Val``.
julia> println(firstlast(Val{false}))
Last

For consistency across Julia, the call site should always pass a
``Val`` type rather than creating an instance, i.e., use
``Val`` *type* rather than creating an *instance*, i.e., use
``foo(Val{:bar})`` rather than ``foo(Val{:bar}())``.

It's worth noting that it's extremely easy to misuse parametric
"value" types, including ``Val``; in unfavorable cases, you can easily
end up making the performance of your code much *worse*. In
particular, you would never want to write actual code as illustrated
above. For more information about the proper (and improper) uses of
``Val``, please read the more extensive discussion in :ref:`the
performance tips <man-performance-val>`.

.. _man-nullable-types:

Nullable Types: Representing Missing Values