Add a Table Data Type! #3

Open
5 tasks
oparisblue opened this issue Jun 29, 2020 · 0 comments
Labels
node-suggestion Suggesting a new node planned This issue should be implemented / fixed; and is welcoming Pull Requests! suggestion A suggestion which is not for a new node

Comments

@oparisblue
Owner

The Table format stores CSV-style tables.
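As a rough illustration of what "CSV-style table" could mean in code, here is a minimal sketch. The `Table` and `Cell` types and the `parseCsv` helper are hypothetical names invented for this example, not part of the project, and the parser assumes a header row and no quoted fields or embedded commas.

```typescript
// Hypothetical shape for a table: named columns plus rows of cell values.
type Cell = string | number;

interface Table {
  columns: string[];
  rows: Cell[][];
}

// Naive CSV parser (assumption: first line is the header, no quoted fields).
// Cells that look numeric are converted to numbers; everything else stays text.
function parseCsv(text: string): Table {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  const [header, ...body] = lines.map((line) =>
    line.split(",").map((cell) => cell.trim())
  );
  const toCell = (raw: string): Cell =>
    raw !== "" && !Number.isNaN(Number(raw)) ? Number(raw) : raw;
  return {
    columns: header ?? [],
    rows: body.map((row) => row.map(toCell)),
  };
}

const table = parseCsv("name,score\nada,3\nbob,5\n");
```

A real "CSV Upload" node would likely want a proper CSV library to handle quoting and escapes; the point here is only the resulting data shape.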

We want to support the following operations:

  • CSV Upload => make table
  • Filter / Query / Aggregate functions:
    • Drop Column
    • Filter Rows
    • Add Column
    • Sum / Average / etc
    • Group
    • Limit
    • etc

However, this seems overly limiting for the user and poses some intricate UI problems. For example, what if you want to sum only the odd rows? Perhaps you would filter the table down to the odd rows and then run the stock sum function (of note: which column does this sum???). But this just pushes the problem back a level: how do we choose what gets filtered?

Therefore, there will just be one key node: "For Each Row".

[Image: proposed node]

What's special about this node is that it contains a "subgraph", i.e. a graph which will be executed once for each row. This subgraph will have unremovable "Input" and "Output" nodes in it, which items can be piped in and out of on each iteration. The input node has a dynamic number of outputs (one for each column), as well as a counter (i.e. which row number this is), a total (i.e. how many rows there are), and the accumulator (initially the base case, but afterwards carrying the result of all previous iterations). The output node accepts just one input: something of the same type as the base case / accumulator.

If you're used to .reduce() in JavaScript, or foldl / foldr in other languages, then this should be familiar, and clearly it provides the flexibility that's needed - e.g. for our previous example, we could do something like "0" as the base case, and then e.g. (accumulator + (column value * (row counter % 2))) in the subgraph.
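To make the fold analogy concrete, here is a minimal sketch of how "For Each Row" might behave if written as plain code. The names `forEachRow`, `RowInput`, and `Table` are hypothetical, the subgraph is modelled as an ordinary function, and the row counter is assumed to start at 0 (the proposal does not say).

```typescript
type Cell = string | number;

interface Table {
  columns: string[];
  rows: Cell[][];
}

// What the subgraph's "Input" node would expose on each iteration.
interface RowInput<A> {
  columns: Record<string, Cell>; // one value per column of the current row
  counter: number;               // index of the current row (assumed 0-based)
  total: number;                 // how many rows there are
  accumulator: A;                // base case first, then the previous output
}

// Runs the "subgraph" once per row; its return value plays the role of the
// "Output" node and becomes the accumulator for the next row.
function forEachRow<A>(
  table: Table,
  baseCase: A,
  subgraph: (input: RowInput<A>) => A
): A {
  return table.rows.reduce<A>((accumulator, row, counter) => {
    const columns: Record<string, Cell> = {};
    table.columns.forEach((name, i) => {
      columns[name] = row[i] ?? "";
    });
    return subgraph({ columns, counter, total: table.rows.length, accumulator });
  }, baseCase);
}

// The example from the text: base case 0, then
// accumulator + (column value * (row counter % 2)).
const sample: Table = {
  columns: ["value"],
  rows: [[10], [20], [30], [40]],
};

const oddRowSum = forEachRow(sample, 0, ({ columns, counter, accumulator }) =>
  accumulator + Number(columns["value"]) * (counter % 2)
);
// With a 0-based counter this adds rows 1 and 3: 20 + 40 = 60.
```

Note that the subgraph names the column it reads, which also sidesteps the "which column does this sum?" question raised earlier.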

However, as this could be confusing for some of the simpler operations, I think we should offer both: "For Each Row" for complex cases, and basic nodes like "Sum", "Filter", etc., with dropdown menus / custom UIs to set up the basic tasks.


To-do:

  • CSV Upload => make table
  • For-each node and subgraph
  • Abstract "Aggregate" node with four nodes extending it: Min, Max, Average, and Sum (or more if we think they're useful)
  • Drop Column node (by number)

potentially more --- feel free to edit!!
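One possible reading of the "Aggregate" and "Drop Column" items, sketched as plain classes and a function. Everything here is an assumption made for illustration: the class layout, the `run` and `combine` method names, the choice to address columns by 0-based index, and returning `NaN` for the average of an empty column.

```typescript
type Cell = string | number;

interface Table {
  columns: string[];
  rows: Cell[][];
}

// Abstract aggregate: subclasses only decide how to combine a column's values.
abstract class Aggregate {
  protected abstract combine(values: number[]): number;

  // columnIndex is 0-based (assumption); cells are coerced with Number().
  run(table: Table, columnIndex: number): number {
    const values = table.rows.map((row) => Number(row[columnIndex]));
    return this.combine(values);
  }
}

class Sum extends Aggregate {
  protected combine(values: number[]): number {
    return values.reduce((a, b) => a + b, 0);
  }
}

class Min extends Aggregate {
  protected combine(values: number[]): number {
    return Math.min(...values);
  }
}

class Max extends Aggregate {
  protected combine(values: number[]): number {
    return Math.max(...values);
  }
}

class Average extends Aggregate {
  protected combine(values: number[]): number {
    if (values.length === 0) return NaN; // no rows: average is undefined
    return values.reduce((a, b) => a + b, 0) / values.length;
  }
}

// Drop Column (by number): returns a new table without the given column.
function dropColumn(table: Table, columnIndex: number): Table {
  const keep = <T>(items: T[]): T[] => items.filter((_, i) => i !== columnIndex);
  return { columns: keep(table.columns), rows: table.rows.map(keep) };
}

const scores: Table = {
  columns: ["name", "score"],
  rows: [["ada", 3], ["bob", 5]],
};

const total = new Sum().run(scores, 1);       // 3 + 5 = 8
const mean = new Average().run(scores, 1);    // 8 / 2 = 4
const withoutNames = dropColumn(scores, 0);   // only the "score" column remains
```

Taking the column index as a parameter is one way to answer which column an aggregate applies to; a dropdown in the node UI could supply it.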


Eventual:

  • Excel Upload => make table
@oparisblue oparisblue added suggestion A suggestion which is not for a new node planned This issue should be implemented / fixed; and is welcoming Pull Requests! node-suggestion Suggesting a new node labels Jun 29, 2020