
Feature comparison

Konrad Rudolph edited this page Dec 28, 2015 · 7 revisions

Comparison with …

Source files (source)

Because of this package’s design, modules can directly replace source statements in code; in most cases,

source('relative/path/file.r')

can be replaced by

import('relative/path/file', attach = TRUE)

– albeit with marked improvements:

  • Module content is loaded into its own private environment, akin to setting the local=TRUE option. It thus avoids polluting the global environment.

  • Since modules are environments, a module’s content can be listed easily via ls(modulename), and R shells provide auto-completion when writing modulename$ and pressing Tab repeatedly.

  • Modules can be executed directly (via Rscript module.r or similar) or imported. Unlike with source, a module knows when it’s being imported, which allows code to run only when the file is executed directly:

    if (is.null(module_name())) {
        …
    }

    This is of course similar to Python’s if __name__ == '__main__': … mechanism. module_name returns a module’s name. Module source files which are being executed directly don’t act as modules and hence have no name (module_name() is NULL).

  • Modules can import other modules relative to their own path, without having to chdir to the module’s path (similar to the source option chdir=TRUE, but preserving getwd()).

  • import uses a standardised, customisable search path to locate modules, making it easy to reuse source files across projects without having to copy them around.

  • Repeatedly importing a module, even from different modules, loads it only once. This makes modules particularly well suited for structuring projects into small, decomposable units. This project was mainly born out of the frustration of repeatedly sourcing the same file, or alternatively of having one “master header” file which includes all other source files.

  • Doc comments inside a module source file are parsed during import, and interactive help on module contents is subsequently available via the usual mechanisms (e.g. ?mod$fun).

  • All module source files are assumed to be encoded as UTF-8, which is nowadays the only sane default.
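
The points about private environments and loading only once can be illustrated without the package. The following is a minimal base-R sketch, not the package’s implementation: load_once and cache are hypothetical names introduced here, and the temporary file merely stands in for a module source file. It evaluates a file in its own environment and hands back the same environment on later requests.

```r
# Base-R illustration only. `load_once` and `cache` are hypothetical
# helpers, not functions provided by the package.

# A temporary file standing in for a module source file.
path = tempfile(fileext = '.r')
writeLines(c(
    'counter = 0',
    'increment = function () counter + 1'
), path)

# Loaded environments, keyed by normalised file path.
cache = new.env()

load_once = function (file) {
    key = normalizePath(file)
    if (! exists(key, envir = cache, inherits = FALSE)) {
        # Evaluate the file in a fresh environment rather than in the
        # global environment, so its names stay private.
        env = new.env(parent = globalenv())
        sys.source(file, envir = env)
        assign(key, env, envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
}

mod = load_once(path)
same = load_once(path)

ls(mod)               # names defined by the file
identical(mod, same)  # TRUE: the file was evaluated only once
mod$increment()       # 1

# The global environment did not receive the file's definitions.
loaded_globally = exists('increment', envir = globalenv(), inherits = FALSE)
```

Because the result is an ordinary environment, ls and $ work on it exactly as the list above describes for modules.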

Packages (library)

Modules are conceived as a lightweight alternative to packages (see rationale). They differ from packages in the following ways:

  • Most importantly, modules often consist of single source code files.

  • Modules do not need a DESCRIPTION file or similar.

  • Modules offer more stringent protection against name clashes. While attaching to the R search() path is supported, it’s not the default and, as in Python, it’s generally discouraged in favour of explicitly qualifying accesses with the module name (or an alias).

  • Changing a module does not necessitate reinstalling it; the changes are available directly to clients (and even to running sessions, via reload).

  • Modules can be local to a project. This allows structuring projects internally, something that packages only allow at a coarse level. In particular, modules can be nested as in Python to create hierarchies, and this is in fact encouraged.

  • As of now, there is no support for non-R code or dynamic libraries (but one may of course use facilities such as dyn.load and Rcpp to include compiled code).

  • Control over exported and imported symbols is, for now, less fine-grained than for packages with a namespace. This is intentional, since modules handle namespaces (via environments) more stringently than packages by default. However, this might still change in the future to allow more control.
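
To see why qualified access protects against name clashes, consider the following base-R sketch. It does not use the package: the two environments stats_utils and text_utils are hypothetical stand-ins for modules, and normalise is an invented function name that both happen to define.

```r
# Base-R illustration only. The environments below are hypothetical
# stand-ins for two modules that define the same name.
stats_utils = new.env()
stats_utils$normalise = function (x) (x - mean(x)) / sd(x)

text_utils = new.env()
text_utils$normalise = function (x) tolower(trimws(x))

# Qualified access: both definitions coexist without ambiguity.
stats_utils$normalise(c(1, 2, 3))  # -1 0 1
text_utils$normalise('  Hello ')   # "hello"

# An alias keeps qualified access short.
tu = text_utils
tu$normalise(' World')             # "world"

# Attaching both to the search path makes the later one mask the
# earlier one, so an unqualified call silently picks the text version.
attach(stats_utils, name = 'stats_utils', warn.conflicts = FALSE)
attach(text_utils, name = 'text_utils', warn.conflicts = FALSE)
masked = normalise(c(1, 2, 3))     # "1" "2" "3", not -1 0 1

# Clean up the search path again.
detach('text_utils', character.only = TRUE)
detach('stats_utils', character.only = TRUE)
```

The unqualified call neither fails nor warns; it simply returns the wrong kind of result, which is the sort of clash that qualified access avoids.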

Python’s import mechanism

R modules are heavily inspired by Python modules, but embedded in R syntax.

  • There is one general form of the import function, corresponding to import modname in Python. Arguments can be used to emulate the other forms: import(x, attach = TRUE) loosely corresponds to from x import *. import(x, attach = c('foo', 'bar')) corresponds to from x import foo, bar.

  • Like in Python, imports are absolute by default. This means that if there are two modules of the same name, one in the global search path and one in the local directory, importing that module will resolve to the one in the global search path, and in order to import the local module instead, the user has to specify a relative path: import('./modname'). Unlike in Python, modules can always be specified as relative imports, not only for submodules.

  • When specifying attach = TRUE, names of the imported module are made available directly in the calling scope, but unlike in Python they are not copied into that scope, so local names may shadow imported names.

  • As a consequence of this, modules export functions and objects they define, but they do not export symbols they themselves import: if a module a contains import('b', attach = TRUE), none of the symbols from b will be visible to any code importing a. Where this is not the desired behaviour, users can use the export_submodule function instead of import.
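
The last two points follow naturally if attached names sit in a parent environment rather than being copied into the importing scope. The base-R sketch below models that idea with plain environments; it is an assumption about how such behaviour can be modelled, not a description of the package’s internals, and a, b, attached, greet, shout and welcome are all hypothetical names.

```r
# Base-R illustration only; no package functions are used.

# Stand-in for module b: an environment holding its definitions.
b = new.env()
b$greet = function (name) paste('Hello,', name)
b$shout = function (text) toupper(text)

# Stand-in for module a after attaching b: b's names are placed in a
# separate parent environment, so a can see them without owning them.
attached = list2env(as.list(b), parent = globalenv())
a = new.env(parent = attached)

local({
    # `greet` is resolved through the parent chain.
    welcome = function (name) shout(greet(name))
    # A local definition shadows the attached name; it does not
    # overwrite b's definition.
    shout = function (text) paste0(text, '!')
}, envir = a)

a$welcome('R')  # "Hello, R!" (uses a's own `shout`)
b$shout('hi')   # "HI" (b is untouched)

# Only a's own definitions are visible to code that inspects a.
ls(a)           # "shout" "welcome"
exists('greet', envir = a, inherits = FALSE)  # FALSE
```

In this model, code that receives a sees welcome and shout but not greet, mirroring the rule that a module does not re-export what it imports.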
