This repository has been archived by the owner on Feb 21, 2025. It is now read-only.

Developing Shared Resources

James Baker edited this page Apr 27, 2017 · 2 revisions

Shared resources are instances of a class that are accessible across a pipeline. They can be used to implement shared state (such as history), to share complex or memory-intensive data structures (such as gazetteers), or to share access to remote resources.

Shared resources are derived from BaleenResource (which in turn is derived from Resource_ImplBase). At a basic level, developers should subclass BaleenResource and then add the methods required for that resource.

Resources will typically fall into two categories:

  1. Those which hold and manage an item such as a gazetteer or database connection. In this case the SharedResource is likely to have methods such as get() and effectively acts as a 'pipeline singleton'.

  2. Those which offer common functionality, allowing pluggable implementations to be offered to pipeline elements. These are likely to have more functional methods, e.g. categorise(document) or translate(text).
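
The two categories can be sketched as follows. This is a minimal illustration, not Baleen code: SketchResource is a hypothetical stand-in for BaleenResource so that the example compiles without Baleen on the classpath, and the class names CountryGazetteerResource, CategoriserResource and KeywordCategoriserResource are invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for BaleenResource so the sketch compiles on its own.
abstract class SketchResource {}

// Category 1: holds and manages an item and hands it out through get().
class CountryGazetteerResource extends SketchResource {
  private final Set<String> terms = new HashSet<>(Arrays.asList("France", "Germany", "Spain"));

  Set<String> get() {
    return terms;
  }
}

// Category 2: a functional contract that pipeline elements can code against.
abstract class CategoriserResource extends SketchResource {
  abstract String categorise(String document);
}

// One pluggable implementation of that contract (deliberately trivial).
class KeywordCategoriserResource extends CategoriserResource {
  @Override
  String categorise(String document) {
    String lower = document.toLowerCase(Locale.ROOT);
    return lower.contains("bank") ? "finance" : "other";
  }
}
```

A pipeline element that depends only on CategoriserResource does not need to change if a different implementation is configured, which is the main appeal of the second category.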

As each pipeline is single-threaded and processes only a single document at a time, shared resources need not be thread-safe. However, implementations may need to make requests to other services (e.g. databases or remote web services) which are synchronous or slow, and they may also cache data. This can lead to unintended side effects.

Shared resources have a very basic lifecycle:

  • doInitialize: called to create the resource and initialise it.
  • afterResourcesInitialized: called after all other resources have been initialised.
  • doDestroy: called when the resource is no longer required to clean up, free memory, etc.
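
As a rough sketch of the three hooks in one resource, the example below uses a hypothetical stand-in base class, LifecycleSketchResource, in place of BaleenResource. The hook signatures here are simplified assumptions (the real hooks receive uimaFIT arguments and may throw initialisation exceptions), and TermListResource is an invented name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for BaleenResource with simplified hook signatures.
abstract class LifecycleSketchResource {
  public abstract boolean doInitialize();

  public void afterResourcesInitialized() {
    // Default: nothing to do once other resources are available.
  }

  public abstract void doDestroy();
}

class TermListResource extends LifecycleSketchResource {
  // Records the order in which the hooks run, purely for illustration.
  final List<String> calls = new ArrayList<>();
  private List<String> terms;

  @Override
  public boolean doInitialize() {
    calls.add("doInitialize");
    // Create state that does not depend on any other resource.
    terms = new ArrayList<>(Arrays.asList("alpha", "beta"));
    return true;
  }

  @Override
  public void afterResourcesInitialized() {
    calls.add("afterResourcesInitialized");
  }

  @Override
  public void doDestroy() {
    calls.add("doDestroy");
    // Release whatever was created in doInitialize.
    terms = null;
  }

  List<String> get() {
    return terms;
  }
}
```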

The lifecycle matters most for resources which themselves depend on other resources (i.e. which inject other shared resources). An example is a gazetteer which depends on a shared database resource to provide its terms. When the gazetteer's doInitialize is called, the database shared resource may not yet be initialised and may not yet have been injected as a uimaFIT dependency (that is, @ExternalResource private Database sharedDb; will be null).

Behind the scenes, uimaFIT will first call doInitialize on all resources, then inject resources into one another (and into annotators etc). If your initialisation requires access to other resources, you should perform that initialisation in afterResourcesInitialized (and not in doInitialize).
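
The ordering described above can be simulated in plain Java. This sketch does not use uimaFIT: OrderedResource, Database, DbBackedGazetteer and InitOrderDemo are hypothetical names, the hooks have simplified signatures, and the injection that uimaFIT would perform for an @ExternalResource field is replaced by a plain assignment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in base class with simplified lifecycle hooks.
abstract class OrderedResource {
  void doInitialize() {}

  void afterResourcesInitialized() {}
}

class Database extends OrderedResource {
  private List<String> rows;

  @Override
  void doInitialize() {
    rows = Arrays.asList("London", "Paris");
  }

  List<String> fetchTerms() {
    return rows;
  }
}

class DbBackedGazetteer extends OrderedResource {
  // In a real resource this field would carry @ExternalResource.
  Database sharedDb;
  boolean dbAvailableInDoInitialize;
  final List<String> terms = new ArrayList<>();

  @Override
  void doInitialize() {
    // Too early: nothing has been injected yet, so sharedDb is still null.
    dbAvailableInDoInitialize = (sharedDb != null);
  }

  @Override
  void afterResourcesInitialized() {
    // Safe: the dependency is both initialised and injected by now.
    terms.addAll(sharedDb.fetchTerms());
  }
}

class InitOrderDemo {
  static DbBackedGazetteer start() {
    Database db = new Database();
    DbBackedGazetteer gazetteer = new DbBackedGazetteer();
    // Phase 1: initialise every resource (order between them is not guaranteed).
    gazetteer.doInitialize();
    db.doInitialize();
    // Phase 2: inject resources into one another.
    gazetteer.sharedDb = db;
    // Phase 3: notify every resource that its dependencies are ready.
    db.afterResourcesInitialized();
    gazetteer.afterResourcesInitialized();
    return gazetteer;
  }
}
```

Calling InitOrderDemo.start() leaves dbAvailableInDoInitialize false while terms is populated, which is why dependent work belongs in afterResourcesInitialized.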

Using shared resources

For an example of using a SharedResource, see Developing Annotators with Resources. The same approach can be used with consumers, collection readers and indeed other resources.

Resources are configured from the global and pipeline configuration (the pipeline overrides the global configuration as usual).
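
The override rule can be pictured as a key-by-key merge. The sketch below is only a model of that rule, not Baleen's configuration code: ResourceConfigSketch and its effective method are hypothetical, and treating the configuration as flat key/value maps is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ResourceConfigSketch {
  // Pipeline entries win over global entries that share the same key;
  // global entries with no pipeline counterpart are kept as they are.
  static Map<String, Object> effective(Map<String, Object> global, Map<String, Object> pipeline) {
    Map<String, Object> merged = new HashMap<>(global);
    merged.putAll(pipeline);
    return merged;
  }
}
```

For example, with hypothetical keys db.host and db.name set globally and only db.name set in the pipeline, the resource would see the global db.host and the pipeline's db.name.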