Suggestion: tree-shakeable transformations #30

Open
luucvanderzee opened this issue Sep 13, 2019 · 0 comments
Labels
enhancement New feature or request question Further information is requested

Contributor

luucvanderzee commented Sep 13, 2019

While our transformations currently have a pretty decent API, the current class-based piping/chaining approach has one big drawback: the transformations are not tree-shakeable. Because the transformations are attached to the class, a bundler cannot drop the ones you never call, so the code of all transformations is always sent to the client.

I think I just came up with an approach to circumvent this problem. Say that we want to do this with the current API:

import DataContainer from '@snlab/florence-datacontainer'

const data = new DataContainer({
  fruit: ['apple', 'apple', 'banana', 'banana'],
  price: [1, 2, 3, 4]
})

const meanPricePerFruit = data
  .groupBy('fruit')
  .summarise({ mean_price: { price: 'mean' } })
  .arrange({ mean_price: 'descending' })

In the newly proposed API, this would become:

import DataContainer, { groupBy, summarise, arrange } from '@snlab/florence-datacontainer'

const data = new DataContainer({
  fruit: ['apple', 'apple', 'banana', 'banana'],
  price: [1, 2, 3, 4]
})

const meanPricePerFruit = data.pipe(
  groupBy('fruit'),
  summarise({ mean_price: { price: 'mean' } }),
  arrange({ mean_price: 'descending' })
)
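
To make the idea concrete, here is a minimal sketch of how such a `pipe` could work. It assumes that every transformation is a curried function that takes column-oriented data and returns column-oriented data. `MiniContainer`, `scale` and `rename` are hypothetical names used only for illustration; none of this is the actual florence-datacontainer implementation.

```javascript
// Hypothetical sketch, not the real library code.
// A transformation is a function: columnOrientedData => columnOrientedData.

// Apply transformations left to right, feeding each result into the next.
const pipe = (data, ...transformations) =>
  transformations.reduce((result, transform) => transform(result), data)

// Minimal stand-in for DataContainer that only stores column-oriented data.
class MiniContainer {
  constructor (data) {
    this._data = data
  }

  // Copy the columns first so the original container stays untouched,
  // then run the transformations and wrap the result in a new container.
  pipe (...transformations) {
    const copy = Object.fromEntries(
      Object.entries(this._data).map(([name, values]) => [name, [...values]])
    )
    return new MiniContainer(pipe(copy, ...transformations))
  }

  column (name) {
    return this._data[name]
  }
}

// Two illustrative transformations. Each is a standalone function,
// so a bundler can drop it when it is never imported.
const scale = (columnName, factor) => data => ({
  ...data,
  [columnName]: data[columnName].map(value => value * factor)
})

const rename = (from, to) => data => {
  const { [from]: values, ...rest } = data
  return { ...rest, [to]: values }
}

const prices = new MiniContainer({
  fruit: ['apple', 'banana'],
  price: [1, 2]
})

const result = prices.pipe(
  scale('price', 100),
  rename('price', 'priceInCents')
)

console.log(result.column('priceInCents')) // [100, 200]
```

In this sketch the class only needs to know how to call functions, so adding a transformation never touches the class itself.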

The advantages of this method are:

  1. Tree-shakeable transformations, as mentioned above
  2. Easy for users to write and use custom transformations (see below)
  3. Transformations can be used without a DataContainer (if you are using column-oriented data at least)
  4. Cleaner separation of code: tests can focus purely on the transformations
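
Points 3 and 4 can be illustrated with a rough sketch. It assumes a transformation is a plain curried function over column-oriented data; `sortBy` is a hypothetical stand-in written for this example (it only handles numeric columns) and is not the library's `arrange`.

```javascript
// Hypothetical transformation: sort all columns by the values of one
// numeric column. Not the library's `arrange`, just an illustration.
const sortBy = (columnName, order = 'ascending') => data => {
  const direction = order === 'descending' ? -1 : 1

  // Compute the new row order from the chosen column.
  const indices = data[columnName]
    .map((_, i) => i)
    .sort((a, b) => direction * (data[columnName][a] - data[columnName][b]))

  // Reorder every column using that same row order.
  const sorted = {}
  for (const name of Object.keys(data)) {
    sorted[name] = indices.map(i => data[name][i])
  }
  return sorted
}

// No DataContainer involved: the transformation is applied
// directly to a plain column-oriented object.
const columns = { fruit: ['apple', 'banana', 'cherry'], price: [2, 3, 1] }
const sorted = sortBy('price', 'descending')(columns)

console.log(sorted.fruit) // ['banana', 'apple', 'cherry']
console.log(sorted.price) // [3, 2, 1]
```

Because the function takes and returns plain objects, a test can call it directly and compare the output, without constructing a container first.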

An example of how a user could write a custom toQuantitative transformation to convert categorical data to quantitative data:

import DataContainer from '@snlab/florence-datacontainer'

const toQuantitative = columnName => {
  return data => {
    const categoricalColumn = data[columnName]
    data[columnName] = categoricalColumn.map(value => parseFloat(value))

    return data
  }
}

const dataContainer = new DataContainer({ 
  amount: [1, 2, 3, 4], 
  price: ['1', '2', '3', '4']
}).pipe(toQuantitative('price'))

console.log(dataContainer.column('price')) // [1, 2, 3, 4]

Thoughts?

luucvanderzee added the enhancement (New feature or request) and question (Further information is requested) labels Sep 13, 2019