Skip to content
This repository has been archived by the owner on Mar 21, 2019. It is now read-only.

Graph Computation #52

Open
3 tasks
jeannieyeliu opened this issue Dec 16, 2018 · 6 comments
Open
3 tasks

Graph Computation #52

jeannieyeliu opened this issue Dec 16, 2018 · 6 comments
Assignees
Labels
research Something that neeeds to be researched

Comments

@jeannieyeliu
Copy link
Contributor

jeannieyeliu commented Dec 16, 2018

https://github.com/janestreet/incremental/blob/master/src/incremental_intf.ml
research on graph computations like:

  • graph incremental computation

  • graph rewrite

  • optional transformation

@jeannieyeliu
Copy link
Contributor Author

jeannieyeliu commented Dec 17, 2018

Graph Incremental

Graph Incremental defines a collection of interdependent values, and automatically tracks all the dependencies between these values and can bubble-up or propagate changed variables and recompute the incremental values that depend on them.

*several things that needs to be considered concerning graph incrementation

  • The data structure that should be used for storing the graph node data, the edge info and the recomputation rule.
1. should a graph be a DAG or could bi-directional when propagating the changed values?
 need to specify the order.?
  • The Propagation algorithms for tracking node value changes.

@jeannieyeliu jeannieyeliu self-assigned this Dec 18, 2018
@jeannieyeliu jeannieyeliu added the research Something that neeeds to be researched label Dec 18, 2018
@jeannieyeliu
Copy link
Contributor Author

after studying on the incremental computation on janestreet/incremental
Will talk about it in four-leveled aspect (Maybe more), namely:

  • The single nodes(vertices) representation/structure
  • How nodes are constructed in a heap, add/delete node in a heap
  • how the state machine is constructed for the incremental computing, (Probably need to use a stack)
  • the recomputation algorithm

@jeannieyeliu
Copy link
Contributor Author

jeannieyeliu commented Dec 26, 2018

Incremental DAG‘s Node

Node [t] is constructed in a struct. The node has parent and child, and the height of the parent node is always bigger than that of children when this node is necessary. A node is necessary iff Node [t] is a descendant of an observer node , or [t] is a [Freeze] node. A Node [t] is marked as an observer node when we want data from it [t] . A node which needs to be recomputed when changes occur should be stored in recompute heap.

some major element in a DAG Node [t]:

  • id: unique Node id
  • value_opt

parent-child relationship

parents:
the nodes that depend on [t], and that should be recomputed when this node [t] changes

  • number_of_parent
  • parent0: the first parent
  • parent1_and_beyond: remaining parents

scope

  • create_in: the scope that this node is created in. a scope is a bind in which nodes are created. a bind is
  • height: used for visiting node in topological order. the parent node has a higher height than its children.

recompute heap

a heap structure thats stores all nodes that needs to be recomputed

  • height_in_recompute_heap: nodes that need to be recomputed are stored in a heap. this represents [t]'s height in the heap. regularly is equal to height. during the adjust_heap, this value could be less
  • prev_in_recompute_heap, next_in_recompute_heap: consider it as a doubly-linked list, [prev_in_recompute_heap] and [next_in_recompute_heap] doubly link all nodes of the
    same height in the recompute heap.

adjust_height_heap

this heap structure is used only during height adjustment

  • height_in_adjust_heights_heap
  • next_in_adjust_heights_heap

stabilization

the elements that needs for recomputation procedure

  • old_value_opt: holds the pre-stabilization value of [t]
  • on_update_handlers: An on-update handler is stored in a node or an observer, and is run at the end of a stabilization either when the value of the node changes, when the handler is installed, or when the node becomes invalid. We only run the handler if was created in an earlier stabilization cycle. If the handler was created by another on-update handler during the running of on-update handlers in the current stabilization, we treat the added handler as if it were added after this stabilization finished. We will run it at the next stabilization, because the node with the handler was pushed on [state.handle_after_stabilization].
  • observers: head of the doubly-linked list of observers of [t].
In order to get information out of an incremental computation, 
we need to explicitly mark which nodes we want data from by creating observer nodes.
  • is_in_handle_after_stabilization: avoid pushing the same node multiple
    times onto [state.handle_after_stabilization]

  • on_update_handlers: the functions supplied to [Incremental.on_update] to be run
    as described in the module [On_update_handler]. [on_update_handlers] does not
    contain the on-update handlers in [t.observers]. [on_update_handlers] only ever
    gets longer; there is no way to remove elements. *)

Other Kinds of DAG node

  • The Bind node
    see this
  • immutable nodes
    • Const: value and status doesn't change or dependent
    • Freeze: takes on the value of another node and doesn't change thereafter.
  • time-based nodes:
    • At:value is [Before] until the clock reaches a certain time, at which point its value becomes [After]
    • At_intervals:interval of the [At] node
    • Snapshot: creating snapshot of the node status,
    • Step_function: represents a function from [Time_ns.t] to ['a] with a finite number of steps. The steps are in nondecreasing time order

(Ordered/Unordered)array_fold:an immutable value that holds the
children of type ['a] and can [compute] the fold to produce a value of type ['b].

If_then_else: Value of the nodes is conditional
Expert: he only kind of node that can update its value and set of children
incrementally. The operations to change the set of children and to react to various
events
Map: the map function of nodes recomputation
Var: leaf node in DAG
Join:a DAG node which contains one node on the left-hand side and one node on the right-hand side. right depends on left.
Redude_balanced:

@jeannieyeliu
Copy link
Contributor Author

jeannieyeliu commented Dec 26, 2018

The Heap Data Structure that organise the DAG nodes

Bind

  • The Bind structure is stored in a bind node, where a set of nodes on the right-hand side depends on the node on the right-hand side.

  • each bind [t] will track the changes of the node on its left-hand side. and then by calling a user-supplied function [t.f], replaces the values of the nodes on the right-hand side.

The structure of bind:

('a, 'b) t:
f: a user-supplied function we run every time [t.lhs] changes
lhs: represent the node on the left-hand side
lhs_change:
rhs: node on the right-hand side.
all_nodes_created_on_hrs: the head of a singly-linked list of nodes created on the right-hand side of [t]

Scope

A scope is a bind in which nodes are created. It is either [top], for nodes not in a
bind, or [Uopt.some packed_bind] for nodes created on the right-hand side of a bind.

some major element of Scope

top
is_top
height
is_necessary
add_node

The Recompute Heap

The recompute heap holds the set of nodes such that [Node.needs_to_be_computed]. It is used during stabilization to visit the nodes that need to be computed in topological order, using the recompute heap to visit them in increasing order of height.

The Adjust Heights Heap

The adjust-heights heap is used after an edge is added to the graph from a child node to a parent node. If the child's height is greater than or equal to the parent's height, then [Adjust_heights_heap.adjust_heights] increases the height of the parent and its ancestors as necessary in order to restore the height invariant. This is done by visiting ancestors in topological order, using the adjust-heights heap to visit them in increasing order of pre-adjusted height.

[adjust_heights t recompute_heap ~child ~parent] is called when [parent] is added as a
parent of [child] and [child.height >= parent.height]. It restores the node height
invariant: [child.height < parent.height] (for [parent] and all its ancestors).

Pre and post-conditions:

- [t] is empty.  Thus, for all nodes [n], [n.height_in_adjust_heights_heap = -1].
- For all nodes [n] in [recompute_heap], [n.height = n.height_in_recompute_heap].

[adjust_heights] adds a node [n] to the adjust-heights heap when it detects that
[c.height >= n.height] for some child [c] of [n].  It adds [n] with
[n.height_in_adjust_heights_heap] set to the pre-adjusted height of [n], and then sets
[n.height] to [c.height + 1].  [adjust_heights] then does not change
[n.height_in_adjust_heights_heap] until [n] is removed from [t], at which point it is
reset to [-1].  [adjust_heights] may increase [n.height] further as it detects other
children [c] of [n] with [c.height >= n.height].  A node's [height_in_recompute_heap]
changes at most once during [adjust_heights], once the node's final adjusted height is
known.

Here is the algorithm.

while [t] is not empty:
1. remove an [n] in [t] with minimum [n.height_in_adjust_heights_heap].
2. [Recompute_heap.increase_height recompute_heap n].
3. for all parents [p] of [n], if [n.height >= p.height], then ensure [p] is in [t]
and set [p.height] to [n.height + 1] and

If [adjust_heights] ever encounters [child] while visiting the ancestors of [parent],
then there is a cycle in the graph and [adjust_heights] raises.

[adjust_heights] raises if a node's height needs to be increased beyond
[max_height_allowed t].

@jeannieyeliu
Copy link
Contributor Author

#Observer

An observer is a "handle" to an {!Internal_observer} that is given to user code -- the handle exists so the implementation can hold on to the internal observer and use a finalizer to detect when the user is done with the observer. The finalizer disallows future use of the observer if it has no on-update handlers, so even if user code uses a finalizer to resurrect the observer, it will still have [not (use_is_allowed t)].
an {Internal observer} is the root of the incremental DAG, all descendants of an observer are "necessary", so that stabilization ensures their values are up to date.
use a doubly linked list to store all the observers.

state transition:

         Created --> In_use --> Disallowed --> Unlinked
           |                                     ^
           \-------------------------------------/

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
research Something that neeeds to be researched
Development

No branches or pull requests

1 participant