From f38993c701add95f98dfabeb55f3b8fe0faac94c Mon Sep 17 00:00:00 2001 From: Tim Holy Date: Wed, 20 Apr 2016 09:52:26 -0500 Subject: [PATCH] Add info about abuse of type system [ci skip] Also adds numerous tweaks in response to review comments. --- doc/manual/performance-tips.rst | 83 ++++++++++++++++++++++++++++----- doc/manual/types.rst | 49 +++++++++++++------ 2 files changed, 107 insertions(+), 25 deletions(-) diff --git a/doc/manual/performance-tips.rst b/doc/manual/performance-tips.rst index 1ae3ecde5af033..169bcd8c46b47a 100644 --- a/doc/manual/performance-tips.rst +++ b/doc/manual/performance-tips.rst @@ -589,8 +589,8 @@ loop. There are several possible fixes: .. _man-performance-kernel-functions: -Separate kernel functions -------------------------- +Separate kernel functions (aka, function barriers) +-------------------------------------------------- Many functions follow a pattern of performing some set-up work, and then running many iterations to perform a core computation. Where possible, @@ -641,8 +641,8 @@ contain either integers, floats, strings, or something else. .. _man-performance-val: -Using the ``Val{T}`` trick --------------------------- +Types with values-as-parameters +------------------------------- Let's say you want to create an ``N``-dimensional array that has size 3 along each axis. Such arrays can be created like this: @@ -652,13 +652,14 @@ has size 3 along each axis. Such arrays can be created like this: A = fill(5.0, (3, 3)) This approach works very well: the compiler can figure out that ``A`` -is an ``Array{Float64,2}`` because it know the type of the fill value +is an ``Array{Float64,2}`` because it knows the type of the fill value (``5.0::Float64``) and the dimensionality (``(3, 3)::NTuple{2,Int}``). This implies that the compiler can generate very efficient code for any future usage of ``A`` in the same function. 
-But now let's say you want to support this for arbitrary dimensions; -you might be tempted to write a function +But now let's say you want to write a function that creates a +3×3×... array in arbitrary dimensions; you might be tempted to write a +function .. doctest:: @@ -678,7 +679,8 @@ Now, one very good way to solve such problems is by using the :ref:`function-barrier technique `. However, in some cases you might want to eliminate the type-instability altogether. In such cases, one -approach is to make use of ``Val{T}`` (see :ref:`man-val-trick`): +approach is to pass the dimensionality as a parameter, for example +through ``Val{T}`` (see :ref:`man-val-trick`): .. doctest:: @@ -692,8 +694,9 @@ type-parameter, you make its "value" known to the compiler. Consequently, this version of ``array3`` allows the compiler to predict the return type. -However, making use of ``Val`` can be surprisingly subtle: it would be -of no help if you called ``array3`` from a function like this: +However, making use of such techniques can be surprisingly subtle. For +example, it would be of no help if you called ``array3`` from a +function like this: .. doctest:: @@ -718,11 +721,69 @@ An example of correct usage of ``Val`` would be: filter(A, kernel) end -In this example, ``N`` is specified as a parameter, so its "value" is +In this example, ``N`` is passed as a parameter, so its "value" is known to the compiler. Essentially, ``Val{T}`` works only when ``T`` is either hard-coded (``Val{3}``) or already specified in the type-domain. +The dangers of abusing multiple dispatch (aka, more on types with values-as-parameters) +--------------------------------------------------------------------------------------- + +Once one learns to appreciate multiple dispatch, there's an +understandable tendency to go crazy and try to use it for everything. +For example, you might imagine using it to store information, e.g. +:: + immutable Car{Make,Model} + year::Int + ...more fields... 
+ end + +and then dispatch on objects like ``Car{:Honda,:Accord}(year, args...)``. + +This might be worthwhile when the following are true: + +- You require CPU-intensive processing on each ``Car``, and it becomes + vastly more efficient if you know the ``Make`` and ``Model`` at + compile time. + +- You have huge, homogeneous lists of the same type of ``Car`` to + process, so that you can store them all in an + ``Array{Car{:Honda,:Accord},N}``. + +When the latter holds, a function processing such a homogeneous array +can be productively specialized: Julia knows the type of each element +in advance (all objects in the container have the same concrete type), +so Julia can "look up" the correct method calls when the function is +being compiled (obviating the need to check at run-time) and thereby +emit efficient code for processing the whole list. + +When these do not hold, then it's likely that you'll get no benefit; +worse, the resulting "combinatorial explosion of types" will be +counterproductive. If ``items[i+1]`` has a different type than +``items[i]``, Julia has to look up the type at run-time, search for the +appropriate method in method tables, decide (via type intersection) +which one matches, determine whether it has been JIT-compiled yet (and +do so if not), and then make the call. In essence, you're asking the +full type system and JIT-compilation machinery to basically execute +the equivalent of a switch statement or dictionary lookup in your own +code. + +Some run-time benchmarks comparing (1) type dispatch, (2) dictionary +lookup, and (3) a "switch" statement can be found `on the mailing list
+ +Perhaps even worse than the run-time impact is the compile-time +impact: julia will compile specialized functions for each different +``Car{Make, Model}``; if you have hundreds or thousands of such types, +then every function that accepts such an object as a parameter (from a +custom ``get_year`` function you might write yourself, to the generic +``push!`` function in the standard library) will have hundreds or +thousands of variants compiled for it. Each of these increases the +size of the cache of compiled code, the length of internal lists of +methods, etc. Excess enthusiasm for values-as-parameters can easily +waste enormous resources. + + Access arrays in memory order, along columns -------------------------------------------- diff --git a/doc/manual/types.rst b/doc/manual/types.rst index 60606c099dce42..ae5401b2243dad 100644 --- a/doc/manual/types.rst +++ b/doc/manual/types.rst @@ -1242,31 +1242,52 @@ If you apply :func:`supertype` to other type objects (or non-type objects), a "Value types" ------------- -As one application of these ideas, Julia includes a parametric type, -``Val{T}``, designated for dispatching on bits-type *values*. -Normally, you can't dispatch on a value such as ``true`` or ``false``, -but the ``Val`` type makes this possible: +In Julia, you can't dispatch on a *value* such as ``true`` or +``false``. However, you can dispatch on parametric types, and Julia +allows you to include "plain bits" values (Types, Symbols, +Integers, floating-point numbers, tuples, etc.) as type parameters. A +common example is the dimensionality parameter in ``Array{T,N}``, +where ``T`` is a type (e.g., ``Float64``) but ``N`` is just an +``Int``. + +You can create your own custom types that take values as parameters, +and use them to control dispatch of custom types. 
By way of +illustration of this idea, let's introduce a parametric type, +``Val{T}``, which serves as a customary way to exploit this technique +for cases where you don't need a more elaborate hierarchy. + +``Val`` is defined as:: + + immutable Val{T} + end + +There is no more to the implementation of ``Val`` than this. Some +functions in Julia's standard library accept ``Val`` types as +arguments, and you can also use it to write your own functions. For +example: .. doctest:: firstlast(::Type{Val{true}}) = "First" firstlast(::Type{Val{false}}) = "Last" - println(firstlast(Val{true})) + julia> println(firstlast(Val{true})) + First -Any legal type parameter (Types, Symbols, Integers, floating-point -numbers, tuples, etc.) can be passed via ``Val``. + julia> println(firstlast(Val{false})) + Last For consistency across Julia, the call site should always pass a -``Val`` type rather than creating an instance, i.e., use +``Val`` *type* rather than creating an *instance*, i.e., use ``foo(Val{:bar})`` rather than ``foo(Val{:bar}())``. -It's worth noting that it's extremely easy to mis-use the ``Val`` -trick, and you can easily end up making the performance of your code -much *worse*. For example, you would never want to write actual code -as illustrated above. For more information about the proper (and -improper) uses of ``Val``, please read the more extensive discussion -in :ref:`the performance tips `. +It's worth noting that it's extremely easy to misuse parametric +"value" types, including ``Val``; in unfavorable cases, you can easily +end up making the performance of your code much *worse*. In +particular, you would never want to write actual code as illustrated +above. For more information about the proper (and improper) uses of +``Val``, please read the more extensive discussion in :ref:`the +performance tips `. .. _man-nullable-types: