dagger

Dagger is a small DAG execution engine for dependent tasks. It is meant to be very simple and lightweight.

Dagger takes a configuration consisting of a collection of operations to execute and named, typed input/output keys, and generates an execution DAG from the operation and key interdependencies. Expressions of key/operation dependencies are called execution components. These components can be composed ad-hoc in a modular fashion.
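
As a rough illustration of how an execution order can be derived from key interdependencies, the sketch below groups steps into stages, where each stage contains the steps whose input keys are already available. `Step`, `StagePlanner`, and `plan` are hypothetical names made up for this example and are not part of Dagger's API; keys are reduced to plain strings.

```scala
// Hypothetical model for illustration; these are not Dagger's actual types.
final case class Step(name: String, inputs: Set[String], output: String)

object StagePlanner {
  // Group steps into stages: a step is ready once all of its input keys
  // are available, either seeded or produced by an earlier stage.
  def plan(steps: Seq[Step], seeded: Set[String]): List[List[String]] = {
    @annotation.tailrec
    def loop(pending: Seq[Step], available: Set[String],
             acc: List[List[String]]): List[List[String]] =
      if (pending.isEmpty) acc.reverse
      else {
        val (ready, blocked) = pending.partition(_.inputs.subsetOf(available))
        // No step is ready: a key is missing or the dependencies form a cycle.
        require(ready.nonEmpty,
          "unsatisfiable steps: " + blocked.map(_.name).mkString(", "))
        loop(blocked, available ++ ready.map(_.output),
          ready.map(_.name).toList :: acc)
      }
    loop(steps, seeded, Nil)
  }
}

object StagePlannerDemo {
  def main(args: Array[String]): Unit = {
    val steps = Seq(
      Step("join strings", Set("first-word", "second-word", "delimiter"), "combined-words"),
      Step("generate first word", Set("word-size"), "first-word"),
      Step("generate second word", Set("word-size"), "second-word"))
    // prints List(List(generate first word, generate second word), List(join strings))
    println(StagePlanner.plan(steps, Set("word-size", "delimiter")))
  }
}
```

Steps that land in the same stage do not depend on each other, so they are candidates for parallel execution.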

This modular setup is well suited to situations in which there exists a library of specific entities, operations on those entities, and dependencies between them.

Different execution graphs can be assembled by composing subsets of the module library.

Any operations that can be executed in parallel are scheduled in parallel using regular Scala futures.
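
A minimal sketch of the general technique, using only the standard library: every task in a group of independent operations is started as a `Future`, and the group completes when all of them have finished. `ParallelStage`, `runStage`, and the ten-second timeout are assumptions made for this example; Dagger's own scheduler may be structured differently.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper for illustration; not part of Dagger's API.
object ParallelStage {
  // Start every task of one stage as a Future, then wait for all of them.
  // Results come back in the same order as the tasks.
  def runStage[A](tasks: Seq[() => A])(implicit ec: ExecutionContext): Seq[A] = {
    val started = tasks.map(task => Future(task()))
    Await.result(Future.sequence(started), 10.seconds)
  }
}

object ParallelStageDemo {
  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val words = ParallelStage.runStage(Seq(() => "first", () => "second"))
    println(words.mkString(" - ")) // prints first - second
  }
}
```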

Dagger builds an execution trace that records the input and output values of each operation, the order of execution, and timing information. Since both keys and execution components are given string-valued names, explanatory trace output is easy to generate.
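
The timing part of such a trace can be sketched as a small wrapper that measures how long a named operation takes. `TraceEntry` and `Tracing.timed` are hypothetical names, and nanoseconds are an assumed unit chosen for this example; neither is taken from Dagger's source.

```scala
// Hypothetical trace record for illustration; not Dagger's actual trace type.
final case class TraceEntry(step: String, nanos: Long)

object Tracing {
  // Evaluate `body`, returning its result together with the elapsed time.
  def timed[A](step: String)(body: => A): (A, TraceEntry) = {
    val start = System.nanoTime()
    val result = body
    (result, TraceEntry(step, System.nanoTime() - start))
  }
}

object TracingDemo {
  def main(args: Array[String]): Unit = {
    val (joined, entry) = Tracing.timed("join strings")(String.join(" - ", "a", "b"))
    println(joined)                           // prints a - b
    println(entry.step + ".Time -> " + entry.nanos)
  }
}
```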

DSL

There is a convenient Scala DSL for expressing operation and key dependencies. Given the following operations, expressed as Scala functions:

import scala.util.Random

// generate a random alphabetic string
// (filter keeps letters only; dropWhile would only skip leading digits)
val randomString = (size: Int) =>
  Random.alphanumeric.filter(_.isLetter).take(size).mkString

// join two strings using a delimiter
val joinStringsFn = (first: String, second: String, delimiter: String) =>
  String.join(delimiter, first, second)

And the following keys, which map to the parameters and return values of the functions:

val wordSizeKey = new MessageKey[Int]("word-size")
val delimiterKey = new MessageKey[String]("delimiter")
val firstWordKey = new MessageKey[String]("first-word")
val secondWordKey = new MessageKey[String]("second-word")
val combinedKey = new MessageKey[String]("combined-words")
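
To show why typed keys are useful, here is a minimal sketch of a key-indexed store in which the key's type parameter decides the type of the value that can be written or read. `Key`, `Msg`, and their methods are hypothetical stand-ins written for this example; they do not reproduce Dagger's `MessageKey` or `Message` implementation.

```scala
// Hypothetical stand-ins for illustration; not Dagger's MessageKey/Message.
final class Key[A](val name: String)

final class Msg private (values: Map[Key[_], Any]) {
  // Only a value of the key's own type can be stored under that key.
  def +[A](kv: (Key[A], A)): Msg = new Msg(values + (kv._1 -> kv._2))

  // The cast is safe because `+` is the only way to add an entry.
  def get[A](key: Key[A]): Option[A] = values.get(key).map(_.asInstanceOf[A])
}

object Msg {
  val empty: Msg = new Msg(Map.empty)
}

object MsgDemo {
  def main(args: Array[String]): Unit = {
    val wordSize = new Key[Int]("word-size")
    val delimiter = new Key[String]("delimiter")
    val msg = Msg.empty + (wordSize -> 5) + (delimiter -> " - ")
    println(msg.get(wordSize))  // prints Some(5)
    // Msg.empty + (wordSize -> "five") would not compile.
  }
}
```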

Using the DSL, one can build a library of execution components that express operation and key dependencies:

val firstWord = "generate first word" :: wordSizeKey ~> randomString ~> firstWordKey
val secondWord = "generate second word" :: wordSizeKey ~> randomString ~> secondWordKey
val joinStrings = "join strings" :: (firstWordKey, secondWordKey, delimiterKey) ~> joinStringsFn ~> combinedKey

// wrap the dependency relationships into a collection of execution components
override val stepConfig = Seq(
  firstWord,
  secondWord,
  joinStrings)

// A "Message" holds all intermediate data during execution.
// It can be seeded with initial values before execution.
val result = processMessage(new Message()
  + (delimiterKey, " - ")
  + (wordSizeKey, 5))

println(result)

Running this configuration will yield the following output:

Message(MessageKey[String](delimiter) ->  - 
MessageKey[int](word-size) -> 5
MessageKey[String](second-word) -> lQMWG
MessageKey[String](combined-words) -> leXNc - lQMWG
MessageKey[String](first-word) -> leXNc)

Trace:
         -> KV(parallel [ generate first word, generate second word ].Time,48835686)
         -> KV(generate first word.Time,5801722)
         -> KV(generate second word.Time,5112642)
         -> KV(parallel [ join strings ].Time,5682546)
         -> KV(join strings.Time,5342818)

See the Full Example for more info.