Lazily-evaluated transformations on Dat archives. Inspired by Resilient Distributed Datasets (RDD).
```
npm i dat-transform
```
Word-count example:
```js
const {RDD, kv} = require('dat-transform')
const Hyperdrive = require('hyperdrive')
const ram = require('random-access-memory')

const archive = new Hyperdrive(ram, '<DAT-ARCHIVE-KEY>', {sparse: true})

// define transforms (no computation happens yet)
const wc = RDD(archive)
  .splitBy(/[\n\s]/)
  .filter(x => x !== '')
  .map(word => kv(word, 1))

// actual run (action)
wc.reduceByKey((x, y) => x + y)
  .toArray(res => {
    console.log(res) // [{bar: 2, baz: 1, foo: 1}]
  })
```
Transforms are lazily-evaluated functions on a Dat archive. Defining a transform on an RDD does not trigger computation immediately; instead, transforms are pipelined and computed only when the result is actually needed, which opens up opportunities for optimization. Transforms are applied to each file separately.
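For instance, in the minimal sketch below (reusing the `archive` from the example above), defining the pipeline reads nothing from the archive; data only flows once the `toArray` action runs:

```js
// Defining the pipeline is cheap: no data is read from the archive yet.
const lines = RDD(archive)
  .splitBy(/\n/)
  .filter(line => line !== '')

// The action triggers the actual computation, pulling data through
// the pipelined transforms.
lines.toArray(res => {
  console.log(res.length)
})
```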
The following transforms are included (see the sketch after this list):

- `map(f)`
- `filter(f)`
- `splitBy(f)`
- `sortBy(f)` (check `test/index.js` for a gotcha)
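A hedged sketch of chaining these transforms, assuming `sortBy(f)` takes a key function in the style of Spark's `RDD.sortBy` (see `test/index.js` for the exact semantics):

```js
// Sort non-empty words by length; nothing runs until toArray is called.
RDD(archive)
  .splitBy(/[\n\s]/)
  .filter(x => x !== '')
  .sortBy(word => word.length) // assumed key-function signature
  .toArray(res => console.log(res))
```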
Actions are operations that return a value to the application.
Examples of actions (a usage sketch follows this list):

- `collect()`
- `take(n)`
- `reduceByKey(f)`
- `count()`
- `sum()`
- `takeSortedBy()`
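For example, a total word count could look like the sketch below. Only `toArray`'s callback style is confirmed by the example above, so the assumption here is that `sum()` delivers its result the same way:

```js
// Count every non-empty word in the archive.
RDD(archive)
  .splitBy(/[\n\s]/)
  .filter(x => x !== '')
  .map(() => 1)
  .sum(total => console.log('total words:', total)) // assumed callback style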
`dat-transform` provides indexing via hyperdrive's list of entries. You can specify the entries you want to compute with, which can greatly reduce bandwidth usage (see the sketch below):

- `get(entryName)`
- `select(f)`
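A sketch of both, assuming `select(f)` takes a predicate over hyperdrive entry objects and `get(entryName)` narrows the RDD to a single entry:

```js
// Compute only over .txt entries instead of the whole archive.
RDD(archive)
  .select(entry => entry.name.endsWith('.txt')) // assumed entry shape
  .splitBy(/[\n\s]/)
  .toArray(res => console.log(res))

// Or over a single named entry ('words.txt' is a hypothetical name).
RDD(archive)
  .get('words.txt')
  .toArray(res => console.log(res))
```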
Partitions let you re-index and cache the computed result in another archive:

- `partition(outArchive)` (returns a promise)
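A minimal sketch, assuming `partition` is called at the end of a chain and the target can be any writable hyperdrive archive:

```js
// Cache the word-count result into a fresh, writable archive.
const outArchive = new Hyperdrive(ram)

wc.reduceByKey((x, y) => x + y)
  .partition(outArchive)
  .then(() => console.log('result cached to outArchive'))
```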
Transforms can be marshalled as JSON, which allows execution on a remote machine (a round-trip sketch follows):

- `RDD.marshal`
- `unmarshal`
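A hedged sketch of the round trip; the exact `RDD.marshal` and `unmarshal` signatures are assumptions based on the names above:

```js
// Serialize the transform chain on one machine...
const payload = JSON.stringify(RDD.marshal(wc)) // assumed signature

// ...and rebuild it against an archive on the remote machine.
const remote = unmarshal(JSON.parse(payload), archive) // assumed signature
remote.toArray(res => console.log(res))
```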
`dat-transform` uses streams from highland.js, which provide lazy evaluation and back-pressure.
The MIT License