Skip to content

recursive aggregating treetable and pivot table for hypergrid

License

Notifications You must be signed in to change notification settings

lstefano71/hypertree

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Hypertree

Concepts

Structure

Events

Parameters

Calculation

Use

Scripts

Examples

Concepts

Hypertree is a q server which adds tree- and pivot-table calculation to Hypergrid.

An earlier version of this algorithm (bottom-up aggregation) is described here.

What follows is a brief overview:

Suppose our tree has the following structure:

0
 1
  2
  3
 4
  5
   6
   7
   8

Data-elements are attached to the leaves of the tree:

0
 1
  2	- 20
  3 - 30
 4
  5
   6 - 60
   7 - 70
   8 - 80

To calculate sums for the nodes of the tree we calculate iteratively, bottom-up:

0 = 2 + 3 + 6 + 7 + 8
1 = 2 + 3
4 = 5
5 = 6 + 7 + 8

The iterative method requires that we have in hand the ultimate constituents -- the leaves -- of each node:

0 : 2 3 6 7 8
1 : 2 3
4 : 5
5 : 6 7 8

we compute the leaves of each node by first constructing the "parent vector" of our tree:

p:0 0 1 1 0 4 5 5 5

p[i] is the parent of element i:

q)p 5
4

We can "climb the tree" from any leaf to the root node:

q)p 8
5
q)p 5
4
q)p 4
0

The root is self-parenting:

q)p 0
0
q)p p p p p p 0
0

So using "converge":

q)p over 8
0
q)p scan 8
8 5 4 0

To generate all paths from ultimate constituents:

q)i:(p scan)each til count p
q)i
,0
1 0
2 1 0
3 1 0
4 0
5 4 0
6 5 4 0
7 5 4 0
8 5 4 0

The ultimate constituents:

q)l:til[count p]except p
q)l
2 3 6 7 8

Now we can populate the leaves with data-elements:

q)d:@[count[p]#0;l;:;l*10]
q)d
0 0 20 30 0 0 60 70 80

and use the path-list to sum up from the leaves:

q)s:@[count[p]#0;i;+;d]
q)s
260 50 20 30 210 210 60 70 80

Hypertree also supports one- and two-dimensional pivot-table views.

For this, we rely on a simplified version of an algorithm devised by Nick Psaris in his book Q-TIPS.

Hypertree thus provides drill-down interaction on both X and Y axes.

Structure

Given table T (the underlying table) + Hypertree Parameters (see below) .ht.cons produces Z, the Hypertree table.

Column names a_ through z_ are reserved for Hypertree use.

All other column names ending in _ are virtual rollups: the leaves are nulled in the display.

The Hypertree table structure is encoded in the following six columns:

q).ht.C
`n_`e_`l_`o_`p_`g_

q)?[Z;();0b;.ht.C!.ht.C]
n_                          e_ l_ o_ p_ g_         
---------------------------------------------------
()                          0  0  1  0             
,`abbott                    0  1  0  0  abbott     
,`chico                     0  1  1  0  chico      
`chico`energy               0  2  0  2  energy     
`chico`financials           0  2  1  2  financials 
`chico`financials`chart     0  3  0  4  chart      
`chico`financials`house     0  3  0  4  house      
`chico`financials`indexarb  0  3  0  4  indexarb   

n_:  path to record
e_:  leaf?
o_:  open?
p_:  parent vector
g_:  last each n_ (hierarchy column)

z_ is used internally as a dummy column.

Events

Hypertree understands a limited number of user-initiated events:

row:		click on an element in the Hierarchy column to expand from or contract to 
			that row.

col:		click on a column to pivot the table.

cell:		click on a cell to restrict the table to the row-value and pivot the resulting
			table on the column-value.

sorts:		multi-sort the table ascending, descending, ascending-absolute-value, 
			descending-absolute-value.

get:		send a subtable of the current state of the table to Hypergrid.

groups:		press the alt/option key to summon a window which allows drag-and-drop 
			regrouping of the table, and to select which columns of the table are visible.

reset:		clear sorts, expansions, pivots, &c.

expand:		expand the tree to the level just above the leaves (i.e. the last group).

collapse:	collapse the tree to the first level (i.e. the first group).

Hypertree also supports update. For example, if the underlying table is t:

\t 1000
.z.ts:{update ... from `t ...;.js.upd`}

This causes Hypertree to recalculate and redisplay:

Hypercalc can also be used to define automatically recalculating underlying views.

Parameters

Hypertree behavior is controlled by a set of programmer-defined global variables.

Variables defined as dependencies can be overriden by the programmer.

A: Rollups

Rollup expressions are q parse-trees:

A:()!()
A[`f]:(avg;`f)

Hypertree will use sensible defaults (e.g. sum for numeric.)

Multiple rollups can be defined for a single column:

A[`f`g`h]:((sum;`a);(avg;`a);(max;`a))

Rollup functions map lists to atoms.  If the type of column c is t, then the
type of the rollup of c must be -t.

E: Universe of columns

E::distinct key[A],cols first T

F: Visible columns in order

F::E except G,I

G: Grouping columns

G:()

Any subset of H.

When G is empty, the underlying table together with its grand total row is displayed:

H: Groupable columns

H::exec c from meta first T where t in"bhijspmdznuvt"

Hypertree understands grouping by symbolic and "discrete" values (including dates and 
times.)

I: Invisible columns

I::E except G,F

F, G, and I can be overridden s.t. (F,G,I) is a (proper) subset of cols T.

J: Infinity check

J:0b

json chokes on q infinities.  At some not-insignificant computational cost, these can be 
filtered out of the result before conversion to json by .j.j.  Better to not generate them
at all!

L: Expand to leaves

L:1b

By default, Hypertree expands to display the leaves of the treetable.  If this behavior
is not desired (for example, because there are too many ultimate records, or because the
data at the leaves is not interesting), set L to 1b.

O: Object properties

O.:(::)

Display properties can be coordinated between Hypertree and Hypergrid with the multi-level 
dictionary O.  For example:

O.somecolumn.type:`price99

Consult the Hypergrid documentation for a full description of this feature.

P: Instruction state = (current;prior)

P:.ht.P

The state of the Hypertree is represented as a pair of keytables.  P 0 represents the 
current state, and P 1 represents the previous state.  n (the key) is a list of paths to
nodes which have been opened, and v is a boolean vector saying whether the corresponding 
node is open or closed.

For example:

q)P 0
n                                             | v
----------------------------------------------| -
(`symbol$())!()                               | 1
(,`trader)!,`costello                         | 0
`trader`sector!`costello`energy               | 1
`trader`sector`strategy!`costello`energy`house| 1

q)P 1
n                                             | v
----------------------------------------------| -
(`symbol$())!()                               | 1
(,`trader)!,`costello                         | 1
`trader`sector!`costello`energy               | 1
`trader`sector`strategy!`costello`energy`house| 1

Q: q-types

Q::.ht.qtype get Z

Q is a dictionary of q-types by column.  For example, Q.N = "j".

R: Rows -> Hypergrid (scrolling)

R:`start`end!0 100

S: Sorts = cols!(..{adAD}..)

S:.ht.S = ()!()

For example:  S:`foo`bar!`D`a for ascending bar within descending-absolute foo.

T: Table

The table underlying Hypertree must be unkeyed (type 98h).

default: 

	T:`

dynamic rollup calculation: 

	T:`t

static rollups, u is a dictionary of rollup tables:  

	T:(`t;u)

U: Drill to the underlying table = allow empty grouping

U:0b

V: Pop up grouping/visibility panel

V:1b

Setting V to 0b blocks the pop-up option-window.

W: Pivot state = ((z-col;Q);selects;groups)

W:.ht.W = (();();())

Pivot and Y-Axis drill-down operations are mutually exclusive.

W keeps track of the sequence of pivot operations, and allows the user to unwind from the 
current to the previous pivot-state.

X: Allow X-axis drilldown (pivot)

X:1b

Y: Allow Y-axis drilldown (treetable)

Y:1b

Z: Hypertree

Z:`

Z is symmetrical with T:  a pointer to the hypertree result-table.

Calculation

Hypertree calculates rollups on every Y-axis drilldown event, and on every X- or XY-pivot event, and on every update to the underlying data.

Our rollup algorithm performs well in the case of medium-sized tables (a few million records), and for tables with not too many rollups and/or not too many exposed subnodes.

For static data, and especially for very large tables which update off-line, a different approach is required.

Dynamic rollups:

T:`t

Static rollups:

Create a dictionary whose keys are a subset of the permutations of G.

q).af.set`:t
q).af.get`:t
sector   strategy trader  | :../t/sector.strategy.trader
sector   trader   strategy| :../t/sector.trader.strategy
strategy sector   trader  | :../t/strategy.sector.trader
strategy trader   sector  | :../t/strategy.trader.sector
trader   sector   strategy| :../t/trader.sector.strategy
trader   strategy sector  | :../t/trader.strategy.sector

Then in d.q:

T:(`t;.af.get`:t)   (OR:  u:.af.get`:t;T:`t`u)

u is a dictionary whose keys are drawn from the permutations g of G and whose values 
are symbols of, or the actual complete rollup trees for t and each g.

if U is set to 1b, then H should be set to G, to prevent the definition of hierarchies
which are not keys of d.

Use

Single process:

Start hypertree process (default port 12345, \t = 0):

	cd hypertree
	q h.q

The one-process version single-threads q updates and Hypergrid interaction.

Two-process:

Start hypertree client (default port 12345):

	cd hypertree
	q c.q

Start hypertree server (default port 12346):

	cd hypertree
	q s.q

Client and server processes may be started in either order.

The two-process version decouples q update and Hypergrid interaction.

By default, the both client and server start on the same host.

After starting either the single process or two-process version of Hypergrid, start Hypergrid.

Scripts

d.q             data and parameter definition
e.q             hypergrid events
t.q             hypertree calculation
j.q             javascript interaction utilities
x.q             hypertree parameters
a.q             auxilliary functions
l.q             console logging functions

h.q             standalone hypertree process

c.q             hypertree client process
s.q             hypertree server process

Examples

The d.q script currently contains three examples. These should be regarded as temporary testing examples, and are sure to change as we begin to explore the real-world uses of Hypertree.

Example 0 is a simple table:

Example 1 uses a table generated from my Itunes library.

Example 2 links yahoo stock prices (h/t Simon Garland) to a table of literal stock information.

About

recursive aggregating treetable and pivot table for hypergrid

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published