Manipulate data from API responses with ease.
Inspired by modern Object-Relational Mappers, DataCollection.js is a JavaScript library for storing, filtering, manipulating and accessing large datasets. It is ideal for working with data returned from RESTful API endpoints.
DataCollection.js boasts synchronous performance that nears native Array manipulation for large (>10,000 records) datasets, so let it do your heavy lifting for you.
You can begin using DataCollection.js by embedding the following script (assumes it has been placed in your root directory):
<script src="/data_collection-1.1.6.js"></script>
Alternatively, embed the minified version:
<script src="/data_collection-1.1.6-min.js"></script>
You can then start using DataCollection objects with:
var dc = new DataCollection();
In Node.js, install the package with npm:
$ npm install data-collection
Then require it in your script:
var DataCollection = require('data-collection');
Woohoo!
DataCollection can be used for fast, synchronous processing of large datasets (arrays of objects), e.g. a RESTful API response.
It is especially useful for maintaining maps of specific keys and indexing results.
Let's say that I have a standardized Array containing the results of a RESTful API response. My data set looks like this:
var characters = [
{
id: 1,
first_name: 'Jon',
last_name: 'Snow',
gender: 'm',
age: 14,
location: 'Winterfell'
},
{
id: 2,
first_name: 'Eddard',
last_name: 'Stark',
gender: 'm',
age: 35,
location: 'Winterfell'
},
{
id: 3,
first_name: 'Catelyn',
last_name: 'Stark',
gender: 'f',
age: 33,
location: 'Winterfell'
},
{
id: 4,
first_name: 'Roose',
last_name: 'Bolton',
gender: 'm',
age: 40,
location: 'Dreadfort'
},
{
id: 5,
first_name: 'Ramsay',
last_name: 'Snow',
gender: 'm',
age: 15,
location: 'Dreadfort'
}
];
First off, let's load this data into a DataCollection...
var charDC = new DataCollection(characters);
Now, let's approach some problems...
filter allows us to look for a specific value.
var bastards = charDC.query().filter({last_name: 'Snow'}).values();
A simple max() call will do the trick.
var topAge = charDC.query().max('age');
DataCollection provides an easy distinct function.
var locations = charDC.query().distinct('location');
No problem!
charDC.query().filter({first_name__in: ['Catelyn', 'Eddard']}).remove();
// Will return Jon, Eddard and Ramsay
charDC.query()
.filter({gender: 'm', age__lt: 40})
.values();
// Updates location
charDC.query()
.filter({location: 'Winterfell'})
.exclude({first_name: 'Jon'})
.update({location: 'King\'s Landing'});
// Finds Roose, Ramsay
charDC.query()
.filter({first_name__contains: 'R'})
.values();
// Finds Roose, Ramsay, Eddard --- case insensitive
charDC.query()
.filter({first_name__icontains: 'R'})
.values();
// Creates a mapping for current and future values...
charDC.createMapping('is_bastard', function(row) {
return row.last_name === 'Snow';
});
// true
charDC.query().filter({first_name: 'Jon'}).first().is_bastard;
// false
charDC.query().filter({first_name: 'Catelyn'}).first().is_bastard;
// Add an entry (Can accept each entry as an argument, or an array)
charDC.insert({
id: 6,
first_name: 'Rob',
last_name: 'Stark',
gender: 'm',
age: 14,
location: 'Winterfell'
});
// new entry, but is also false
charDC.query().filter({first_name: 'Rob'}).first().is_bastard;
// will return Eddard and Catelyn rows
charDC.query()
.sort('age', true) // orderDesc = true
.limit(1, 2)
.values();
What if I have objects inside of my DataCollection? Can I filter and sort by those fields as well?
Of course! Separate nested fields by double underscores (__).
Please note that when using this method, exact values must be checked for using '__is'.
var dc = new DataCollection();
dc.load([{
main: {
sub: 7,
sub2: {
low: 8
}
}
}, {
main: {
sub: 7,
sub2: {
low: 20
}
}
}, {
main: {
sub: -1,
sub2: {
low: 16
}
}
}]);
// Returns only first two rows
dc.query().filter({main__sub__is: 7}).values();
// Returns only last two rows
dc.query().filter({main__sub2__low__gte: 10}).values();
And there's more! Try playing around.
DataCollection( [Optional Array] data )
Constructor (used with new keyword)
If provided data, will run DataCollection.prototype.load(data)
defineIndex( [String] key )
returns self
Define a unique key to use as an index for this collection, used for DataCollection.prototype.exists, DataCollection.prototype.fetch and DataCollection.prototype.destroy
All indexed values will be converted to strings, so be careful about uniqueness.
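The string-conversion caveat is easiest to see in plain JavaScript. The sketch below does not use DataCollection's internals (which are not described here); it only assumes that indexed values are stringified before being used as lookup keys. The helper name buildIndex is hypothetical.

```javascript
// Hypothetical helper (not part of DataCollection): builds a lookup table
// keyed by String(row[key]), mimicking "all indexed values become strings".
function buildIndex(rows, key) {
  var index = Object.create(null);
  for (var i = 0; i < rows.length; i++) {
    index[String(rows[i][key])] = rows[i];
  }
  return index;
}

var rows = [
  {id: 1, name: 'numeric id'},
  {id: '1', name: 'string id'}
];

var index = buildIndex(rows, 'id');

// Both ids stringify to '1', so only one entry survives in the lookup table.
console.log(Object.keys(index).length); // 1
console.log(index['1'].name); // 'string id'
```

If your API mixes numeric and string identifiers, it is probably worth normalizing them before loading.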
createMapping( [String] key, [Function] map -> ([Object] row) )
returns self
Define a mapped key and a function that returns the associated value based on the input row. Can be used at any time; new mappings are applied to your DataCollection immediately.
var dc = new DataCollection();
dc.createMapping('c', function(row) { return row['a'] + row['b']; });
dc.load([{a: 1, b: 2}, {a: 2, b: 3}]);
console.log(dc.query().last()); // logs {a: 2, b: 3, c: 5}
exists( [String] indexedValue )
returns boolean
Determine whether the DataCollection has an entry with the specified index based on your index key
fetch( [String] indexedValue )
returns Object
Fetches the object (if it exists) associated with the specified index based on your index key. Otherwise, returns null.
destroy( [String] indexedValue )
returns true
Destroys object (if it exists) associated with the specified index based on your index key. Otherwise, throws an error.
load( [Object] row_1, ..., [Object] row_n )
load( [Array] data )
returns true
Loads (truncates, then adds) new data from individual row Objects or an array of row Objects
insert( [Object] row_1, ..., [Object] row_n )
insert( [Array] data )
returns true
Inserts new data from individual row Objects or an array of row Objects
truncate()
returns true
Empties all data from DataCollection
query()
returns DataCollectionQuery
Returns a new DataCollectionQuery containing a referential set of all data from the parent DataCollection.
DataCollectionQuery()
Constructor, only accessible via DataCollection.prototype.query()
filter( [Object] filters_1, ..., [Object] filters_n )
returns new DataCollectionQuery
Returns a new DataCollectionQuery containing a referential subset of its parent. Contains filtered values (see: Filters).
Providing new filter objects via separate arguments does a logical OR between the filter sets. (Within a filter set is logical AND.)
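The AND/OR rule can be sketched in plain JavaScript. This is only an illustration of the semantics described above, not the library's implementation; the helpers matchesSet and matchesAny are hypothetical and handle exact matches only (no __ comparators).

```javascript
// Hypothetical helper: a row matches a filter set only if EVERY key matches (AND).
function matchesSet(row, filterSet) {
  return Object.keys(filterSet).every(function(key) {
    return row[key] === filterSet[key];
  });
}

// Hypothetical helper: a row is kept if it matches ANY of the filter sets (OR).
function matchesAny(row, filterSets) {
  return filterSets.some(function(filterSet) {
    return matchesSet(row, filterSet);
  });
}

var people = [
  {first_name: 'Jon', last_name: 'Snow', location: 'Winterfell'},
  {first_name: 'Ramsay', last_name: 'Snow', location: 'Dreadfort'},
  {first_name: 'Roose', last_name: 'Bolton', location: 'Dreadfort'},
  {first_name: 'Catelyn', last_name: 'Stark', location: 'Winterfell'}
];

// (last_name is Snow AND location is Winterfell) OR (last_name is Bolton)
var matched = people.filter(function(row) {
  return matchesAny(row, [
    {last_name: 'Snow', location: 'Winterfell'},
    {last_name: 'Bolton'}
  ]);
});

var matchedNames = matched.map(function(row) { return row.first_name; });
console.log(matchedNames); // ['Jon', 'Roose']
```

Going by the signature above, the corresponding call would pass the two filter sets as separate arguments: filter({last_name: 'Snow', location: 'Winterfell'}, {last_name: 'Bolton'}).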
exclude( [Object] filters )
returns new DataCollectionQuery
Returns a new DataCollectionQuery containing a referential subset of its parent. Excludes filtered values (see: Filters)
spawn( [Boolean] ignoreIndex )
returns new DataCollection
Creates a new DataCollection object (non-referential, new values) from all data contained within the current DataCollectionQuery. Will inherit the parent DataCollection's index unless ignoreIndex is set to true.
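The difference between a referential subset and a non-referential copy can be shown with plain arrays. This is a sketch of the concept only; it assumes a shallow copy of each row, and whether spawn copies nested objects deeply is not stated here.

```javascript
var source = [
  {id: 1, location: 'Winterfell'},
  {id: 2, location: 'Dreadfort'}
];

// Referential subset: a new array that holds the SAME row objects.
var referential = source.filter(function(row) { return row.id === 1; });

// Non-referential copy: new row objects with the same (shallow) values.
var copied = source.map(function(row) { return Object.assign({}, row); });

referential[0].location = 'The Wall';
copied[1].location = 'Harrenhal';

console.log(source[0].location); // 'The Wall' (the shared object was changed)
console.log(source[1].location); // 'Dreadfort' (only the copy was changed)
```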
each( [Function] callback -> ([Object] row, [Integer] index) )
returns self
Loops through all rows of data, and performs callback for each one
var dc = new DataCollection([{a: 1, b: 2}, {a: 2, b: 3}]);
var query = dc.query();
query.each(function(row, index) {
console.log(index + ': ' + row['a'] + ', ' + row['b']);
});
// logs
// 0: 1, 2
// 1: 2, 3
update( [Object] values )
returns self
Assigns all key-value pairs from values to every row in the current selection (updates parent DataCollection)
remove()
returns true
Removes all rows contained in DataCollectionQuery from the parent DataCollection
order( [String] key, [Optional Boolean] orderDesc = false )
returns DataCollectionQuery
Returns a new DataCollectionQuery containing the parent's rows, sorted by a specific key (descending if orderDesc = true).
Sort order by type is as follows (regardless of ASC or DESC): Function, Object, Date Object, String, Boolean, Number, NaN, null, undefined
Strings, Booleans, and Numbers will be sorted based on their values (ASC/DESC). Functions, Objects and identical values will be sorted based on the order in which they were inserted (stable sort).
sort( [String] key, [Optional Boolean] orderDesc = false )
returns DataCollectionQuery
Alias of .order
sequence( [String or Number] indexedValue_1, ..., [String or Number] indexedValue_n )
sequence( [Array] indexedValues )
returns DataCollectionQuery
Returns a new DataCollectionQuery containing the parent's rows with specified indices, ordered in the sequence provided.
Can accept indices as arguments or in an Array.
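As a rough illustration of what sequence does, the plain-JavaScript sketch below reorders rows by a list of index values. The helper sequenceBy is hypothetical, and two details are assumptions rather than documented behavior: index values are compared as strings, and values with no matching row are skipped.

```javascript
// Hypothetical helper: returns rows in the order given by indexedValues.
function sequenceBy(rows, key, indexedValues) {
  var lookup = Object.create(null);
  rows.forEach(function(row) {
    lookup[String(row[key])] = row;
  });
  return indexedValues
    .map(function(value) { return lookup[String(value)]; })
    .filter(function(row) { return row !== undefined; }); // assumption: skip misses
}

var rows = [
  {id: 1, first_name: 'Jon'},
  {id: 2, first_name: 'Eddard'},
  {id: 3, first_name: 'Catelyn'}
];

// Strings and numbers are both accepted, matching the signature above.
var ordered = sequenceBy(rows, 'id', [3, '1', 2]);
var orderedNames = ordered.map(function(row) { return row.first_name; });
console.log(orderedNames); // ['Catelyn', 'Jon', 'Eddard']
```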
values( [Optional String] key )
returns Array
Returns an array of all row Objects (each Object is referential!) in the DataCollectionQuery, or an array of all values from a specific key if provided
json( [Optional String] key )
returns String
Returns a JSON-stringified version of .values()
max( [String] key )
returns Float
Returns the maximum value (JavaScript "greater than (>)") contained in key from the DataCollectionQuery subset
min( [String] key )
returns Float
Returns the minimum value (JavaScript "less than (<)") contained in key from the DataCollectionQuery subset
sum( [String] key )
returns Float
Returns the numeric sum of all values contained in key from the DataCollectionQuery subset
avg( [String] key )
returns Float
Returns the numeric average of all values contained in key from the DataCollectionQuery subset
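For reference, here is what max, min, sum and avg amount to in plain JavaScript, using the ages from the characters example. This is a sketch of the arithmetic, not the library's code, and it does not cover how the library treats an empty subset or non-numeric values.

```javascript
var ages = [14, 35, 33, 40, 15];

// max/min: keep the larger/smaller value using > and < comparisons.
var maxAge = ages.reduce(function(best, value) { return value > best ? value : best; });
var minAge = ages.reduce(function(best, value) { return value < best ? value : best; });

// sum/avg: numeric total, then divide by the number of values.
var sumAge = ages.reduce(function(total, value) { return total + value; }, 0);
var avgAge = sumAge / ages.length;

console.log(maxAge); // 40
console.log(minAge); // 14
console.log(sumAge); // 137
console.log(avgAge); // 27.4
```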
transform( [Object] keyMapPair )
returns new DataCollectionQuery
Maps each row of the current DataCollectionQuery to a new object with specified keys.
The "map" in keyMapPair can be either a string representation of a key or a mapping function.
dc.load([
{a: 1, b: 2, c: 3},
{a: 4, b: 5, c: 6}
]);
dc.query().transform({d: 'a', e: 'b', f: function(row) { return row.a + row.b + row.c; }}).values();
/* will return...
[
{d: 1, e: 2, f: 6},
{d: 4, e: 5, f: 15}
]
*/
reduce( [String] key, [Function] callback -> ([Any] prevValue, [Any] curValue, [Any] index) )
returns Any
Runs a specified reduction function on all values contained in key from the DataCollectionQuery subset.
distinct( [String] key )
returns Array
Returns an array of all unique values (converted to String) with specified key from the DataCollectionQuery subset
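Because distinct converts values to String, values that differ only by type collapse into one. The plain-JavaScript sketch below shows the effect; the helper distinctValues is hypothetical, and returning values in order of first appearance is an assumption.

```javascript
// Hypothetical helper: unique String(row[key]) values, in order of first appearance.
function distinctValues(rows, key) {
  var seen = Object.create(null);
  var result = [];
  rows.forEach(function(row) {
    var value = String(row[key]);
    if (!seen[value]) {
      seen[value] = true;
      result.push(value);
    }
  });
  return result;
}

var uniqueAges = distinctValues([{age: 14}, {age: '14'}, {age: 35}], 'age');
console.log(uniqueAges); // ['14', '35']
```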
limit( [Integer] count )
limit( [Integer] offset, [Integer] count )
returns new DataCollectionQuery
Returns a new DataCollectionQuery containing the first count items from the current DataCollectionQuery, or containing count items beginning at offset
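The two call forms behave like Array.prototype.slice. The sketch below uses a hypothetical limitRows helper on a plain array and assumes a zero-based offset, which is consistent with the earlier limit(1, 2) example returning the second and third rows.

```javascript
// Hypothetical helper mirroring limit(count) and limit(offset, count).
function limitRows(rows, offsetOrCount, count) {
  if (count === undefined) {
    return rows.slice(0, offsetOrCount); // limit(count)
  }
  return rows.slice(offsetOrCount, offsetOrCount + count); // limit(offset, count)
}

var agesDesc = [40, 35, 33, 15, 14];

var firstTwo = limitRows(agesDesc, 2);
var secondAndThird = limitRows(agesDesc, 1, 2);

console.log(firstTwo); // [40, 35]
console.log(secondAndThird); // [35, 33]
```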
count()
returns Integer
Returns the number of items (rows) in the current DataCollectionQuery
DataCollection supports a number of filters in the filter() and exclude() functions. Many will be familiar if you've used the Django ORM or checked out another project of ours, FastAPI.
All filters are prefixed with a double underscore when used.
Please note that DataCollection supports filtering (and sorting) based on nested objects. The syntax for finding the is filtered value of a nested field would be {field__nestedField__is: 7}. You can nest indefinitely using double underscores.
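One way to picture the double-underscore syntax is as a path followed by a comparator name. The sketch below is a plain-JavaScript approximation under that assumption, not the library's parser; testFilter is a hypothetical helper, and the comparator table only includes suffixes that appear in this document's examples.

```javascript
// Comparators limited to suffixes used in this document's examples.
var comparators = {
  is: function(a, b) { return a === b; },
  gte: function(a, b) { return a >= b; },
  lt: function(a, b) { return a < b; }
};

// Hypothetical helper: tests one row against one filter key and value.
function testFilter(row, filterKey, expected) {
  var parts = filterKey.split('__');
  // A bare field name means exact match; otherwise the last part names the comparator.
  var name = parts.length > 1 ? parts.pop() : 'is';
  if (!Object.prototype.hasOwnProperty.call(comparators, name)) {
    throw new Error('Unknown comparator: ' + name);
  }
  // Walk the remaining parts as a path into nested objects.
  var value = row;
  for (var i = 0; i < parts.length; i++) {
    if (value === null || typeof value !== 'object') {
      return false;
    }
    value = value[parts[i]];
  }
  return comparators[name](value, expected);
}

var nested = [
  {main: {sub: 7, sub2: {low: 8}}},
  {main: {sub: 7, sub2: {low: 20}}},
  {main: {sub: -1, sub2: {low: 16}}}
];

var exact = nested.filter(function(row) {
  return testFilter(row, 'main__sub__is', 7);
});
var atLeastTen = nested.filter(function(row) {
  return testFilter(row, 'main__sub2__low__gte', 10);
});
var lowValues = atLeastTen.map(function(row) { return row.main.sub2.low; });

console.log(exact.length); // 2
console.log(lowValues); // [20, 16]
```

In this sketch the trailing comparator is what separates a path segment from an operation, which may be why nested lookups need an explicit __is.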
a === b
Checks for exact equivalence. Equivalent to no specified filter (only the field name). Exists for the purpose of standardization and edge cases (i.e. if your field ends with __).
a !== b
Checks for inequivalence. (Not exactly matching.)
a > b
Checks if contained value is greater than provided value.
a >= b
Checks if contained value is greater than or equal to provided value.
a < b
Checks if contained value is less than provided value.
a <= b
Checks if contained value is less than or equal to provided value.
a.indexOf(b) > -1
Checks if contained value contains the provided value. Works for strings or arrays.
a.toLowerCase().indexOf(b.toLowerCase()) > -1
Case-insensitive contains. Only works for string comparisons.
b.indexOf(a) > -1
Checks if the contained value exists in the provided value. Works for strings or arrays.
b.indexOf(a) === -1
Checks if the contained value does not exist in the provided value. Works for strings or arrays.
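The containment filters differ mainly in which side is searched. The plain functions below restate the expressions listed above, with a as the value stored in the row and b as the value you provide. The function names are illustrative only; they echo the contains, icontains and in suffixes used in the earlier examples.

```javascript
// a = value contained in the row, b = value provided in the filter.
function contains(a, b) { return a.indexOf(b) > -1; }
function icontains(a, b) { return a.toLowerCase().indexOf(b.toLowerCase()) > -1; }
function isIn(a, b) { return b.indexOf(a) > -1; }
function isNotIn(a, b) { return b.indexOf(a) === -1; }

console.log(contains('Roose', 'R')); // true
console.log(contains('Eddard', 'R')); // false (case sensitive)
console.log(icontains('Eddard', 'R')); // true
console.log(contains(['a', 'b'], 'a')); // true (arrays work too)
console.log(isIn('Catelyn', ['Catelyn', 'Eddard'])); // true
console.log(isNotIn('Jon', ['Catelyn', 'Eddard'])); // true
```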
Current test coverage is 100%.
Included with this repository are tests (in /tests) to make sure everything is running as expected.
There is a node webserver in the root repository directory that can be used for testing on localhost:8888. To start the server (with node installed) simply run:
$ node testserv.js
Tests are run using QUnit, with coverage sampled using Blanket.js.
A few benchmarks are logged in the JavaScript developer console.
DataCollection is MIT licensed; feel free to use it wherever you'd like. Thanks for checking us out! We welcome good, thoughtful contributions.
DataCollection was created at Storefront, Inc. in 2014 by Keith Horwood.
Feel free to follow on Twitter:
Or check out our GitHub Repositories for more libraries: