Suggestion: tree-shakeable transformations #30

luucvanderzee · 2019-09-13T15:43:14Z

While our transformations currently have a pretty decent API, the current class-based piping/chaining approach has one big drawback: the transformations are not tree-shakeable. Because every transformation is a method on the class, a bundler cannot drop the ones you never call, so the code for all transformations is always sent to the client.

I think I just came up with an approach to circumvent this problem. Say that we want to do this with the current API:

import DataContainer from '@snlab/florence-datacontainer'

const data = new DataContainer({
  fruit: ['apple', 'apple', 'banana', 'banana'],
  price: [1, 2, 3, 4]
})

const meanPricePerFruit = data
  .groupBy('fruit')
  .summarise({ mean_price: { price: 'mean' } })
  .arrange({ mean_price: 'descending' })

In the newly proposed API, this would become:

import DataContainer, { groupBy, summarise, arrange } from '@snlab/florence-datacontainer'

const data = new DataContainer({
  fruit: ['apple', 'apple', 'banana', 'banana'],
  price: [1, 2, 3, 4]
})

const meanPricePerFruit = data.pipe(
  groupBy('fruit'),
  summarise({ mean_price: { price: 'mean' } }),
  arrange({ mean_price: 'descending' })
)
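
To make the idea concrete, here is a minimal sketch of how pipe could work. It assumes each transformation is a function that takes column-oriented data and returns new column-oriented data. MiniContainer, select and scale are hypothetical names made up for this illustration and are not part of florence-datacontainer.

```javascript
// Hypothetical stand-in for DataContainer, only to illustrate pipe.
class MiniContainer {
  constructor (data) {
    this._data = data
  }

  // Apply each transformation in order, feeding the output of one
  // into the next, and wrap the final result in a new container.
  pipe (...transformations) {
    const result = transformations.reduce(
      (data, transformation) => transformation(data),
      this._data
    )
    return new MiniContainer(result)
  }

  column (name) {
    return this._data[name]
  }
}

// Hypothetical transformation: keep only the named columns.
const select = (...columnNames) => data => {
  const selected = {}
  for (const name of columnNames) selected[name] = data[name]
  return selected
}

// Hypothetical transformation: multiply a numeric column by a factor.
const scale = (columnName, factor) => data => ({
  ...data,
  [columnName]: data[columnName].map(value => value * factor)
})

const result = new MiniContainer({
  fruit: ['apple', 'banana'],
  price: [1, 2]
}).pipe(select('price'), scale('price', 10))

console.log(result.column('price')) // [ 10, 20 ]
```

Since the container only ever calls the functions it is handed, the transformations could live in separate exports, and a bundler would be free to drop the ones that are never imported.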

The advantages of this method are:

Tree-shakeable transformations, as mentioned above
Easy for users to write and use custom transformations (see below)
Transformations can be used without a DataContainer (at least if you are using column-oriented data)
Cleaner separation of code: tests can focus purely on the transformations
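
As a rough illustration of the last two points, a transformation written as a plain function can be applied directly to column-oriented data, with no container involved, which is also how it could be unit-tested. sortBy is a hypothetical transformation written for this sketch; it is not the library's arrange and it only handles numeric columns.

```javascript
// Hypothetical standalone transformation: sort all columns by one numeric column.
const sortBy = (columnName, order = 'ascending') => data => {
  const direction = order === 'descending' ? -1 : 1
  const key = data[columnName]

  // Sort row indices rather than values, so every column can be reordered the same way.
  const indices = key.map((_, i) => i)
  indices.sort((a, b) => direction * (key[a] - key[b]))

  const sorted = {}
  for (const name of Object.keys(data)) {
    sorted[name] = indices.map(i => data[name][i])
  }
  return sorted
}

// Plain column-oriented data, no DataContainer needed.
const columns = { fruit: ['apple', 'banana', 'cherry'], price: [2, 3, 1] }
const sorted = sortBy('price', 'descending')(columns)

console.log(sorted.fruit) // [ 'banana', 'apple', 'cherry' ]
console.log(sorted.price) // [ 3, 2, 1 ]
```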

An example of how a user could write a custom toQuantitative transformation to convert categorical data to quantitative data:

import DataContainer from '@snlab/florence-datacontainer'

const toQuantitative = columnName => {
  return data => {
    const categoricalColumn = data[columnName]
    data[columnName] = categoricalColumn.map(value => parseFloat(value))

    return data
  }
}

const dataContainer = new DataContainer({ 
  amount: [1, 2, 3, 4], 
  price: ['1', '2', '3', '4']
}).pipe(toQuantitative('price'))

console.log(dataContainer.column('price')) // [1, 2, 3, 4]
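
One thing to watch out for: the transformation above assigns to the object it receives. If pipe were to hand transformations the container's internal data by reference (an assumption, since that part is not designed yet), this would change the original container as well. A non-mutating variant could look like this sketch, shown on plain column-oriented data:

```javascript
// Non-mutating variant: returns a new object and leaves the input as it was.
const toQuantitative = columnName => data => ({
  ...data,
  [columnName]: data[columnName].map(value => parseFloat(value))
})

const original = { amount: [1, 2, 3, 4], price: ['1', '2', '3', '4'] }
const converted = toQuantitative('price')(original)

console.log(converted.price) // [ 1, 2, 3, 4 ]
console.log(original.price) // [ '1', '2', '3', '4' ]
```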

Thoughts?


luucvanderzee added the enhancement (New feature or request) and question (Further information is requested) labels on Sep 13, 2019

luucvanderzee mentioned this issue on Sep 25, 2019: Lazy transformations #39 (Open)
