Skip to content

The fastest functions for fuzzy search and items filtering

License

Notifications You must be signed in to change notification settings

axules/fuzzy-tools

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

fuzzy-tools

  1. What is it?
  2. Live example
  3. Installation
  4. What is new?
  5. Match function
  6. Filter function
  7. Benchmark results
  8. Tests

What is it?

It is the fastest functions for fuzzy matching and items filtering.

fuzzy-tools demo

https://codesandbox.io/s/fuzzytoolsexample-4r2ej

Installation

npm install --save fuzzy-tools

What is new?

v.2.0.0

  • caseInsensitive removed (use caseSensitive: true instead of caseInsensitive: false)
  • mask not only String, but Array.of(String). In case when it is array then search will be work as fuzzy but not by chars, but by words. ['fuz', 'sea'] will not be found in fuuuuzzzzzy seeeeearch, but will be found in fuzzzzzy search.

Match function

match(mask, where, options):Object - returns result of string matching. Returns object if mask matches where (e.g. 'fzz' in 'fuzzy'), null - if it is not ('fzza' in 'fuzzy').

Arguments

args type default note
mask string or array.of(string) no provide what you want to find ('fzz' or ['fuz', 'sea'])
where string or array no destination string or array of strings
options object {...} additional options
  • mask
    • String: 'fzz' will be found in 'fuzzy search'

    • Array.of(String):

      ['fzz'] will NOT be found in 'fuzzy search'.

      ['fuz', 'sea'] will be found in 'fuzzy search'.

      ['fuz', 'seaa'] will NOT be found in 'fuzzy search', because ALL words should be in where.

        match('fzz', 'fuzzy search'); // { score: 1 }
        match(['fzz'], 'fuzzy search'); // null
        match(['fuz', 'rch'], 'fuzzy search'); // { score: 1 }
        match(['fuz', 'rch1111'], 'fuzzy search'); // null
  • where
    • String: result contains score is score of matching.
        match('fuz', 'fuzzy'); // { score: 1 }
    • Array of strings or Object({ key: String, key2: String }): result contains score is min of scores of each string (score = min(match(mask, item[0]), match(mask, item[1]), ...)).
        match('fuz', ['fuzzy', 'it is fuzzy']);
        // { score: 1, matches: { 0: { score: 1 }, 1: { score: 1 } } }
        match('fuz', { v1: 'fuzzy', v2: 'it is fuzzy' });
        // { score: 1, matches: { v1: { score: 1 }, v2: { score: 1 } } }
    • Array of Object({ value: String, rate: Number(>0 and <=1) }): result contains score is min of score of each value * rate (score = min(match(mask, item[0].value) / item[0].rate, ...))
        match('fuz',
          [
            { value: 'fuzzy', rate: 0.75 },
            { value: 'it is fuzzy', rate: 0.25 }
          ]
        );
        match('fuz',
          {
            v1: { value: 'fuzzy', rate: 0.75 },
            v2: { value: 'it is fuzzy', rate: 0.25 }
          }
        )
  • options
    • caseSensitive: Boolean (default: false) - when it is true, then FZZ will not be matched with fuzzy. Char's case will be ignored.
    • withScore: Boolean (default: false) - when it is true, then score will be computed for matched strings, else score will be 1.
    • withWrapper: String or Function (default: false) - when it is true, then result will contains wrapped. It is needed to render highlighted results.
      • String: template string, (e.g. '<b>{?}</b>', {?} will be replaced by matched word. fzz in fuzzy => '<b>f</b>u<b>zz</b>y'.
      • Function(word: String): String, (e.g. (word) => '<b>'+word+'</b>')
    • withRanges: Boolean (default: false) - when it is true, then result will contains ranges. It is array of Object({ begin: Number, end: Number }) with ranges of matched parts.
    • rates: or(Array, Object) of rates (> 0 and <= 1).
        match('fuz',
          ['fuzzy', 'it is fuzzy'],
          { rates: [0.75, 0.25] }
        )
        // or
        match('fuz',
          ['fuzzy', 'it is fuzzy'],
          { rates: { 0: 0.75, 1: 0.25 } }
        )
        // the same that
        match('fuz',
          {
            v1: { value: 'fuzzy', rate: 0.75 },
            v2: { value: 'it is fuzzy', rate: 0.25 }
          }
        )

Result

  • score - from 0.001 to infinity, less is better. If withScore is false then score is 1 always.

  • matches - It will be if where is Array or Object. It is object with results, key is index or key, value is Object

        {
          score: Number,
          original: String,
          index: or(Number, String),
          [wrapped: String],
          [ranges: Array]
        }
  • wrapped - contains wrapped original string or result of withWrapper function. It is undefined if withWrapper is false.

  • ranges - array with matched ranges { begin: Number, end: Number }. It is undefined if withRanges is false.

import { match } from 'fuzzy-tools';

match('fzz', 'fuzzy', { withScore: true }); // { score: 1.74 }

match('fzz', ['fu', 'fuzza'], { withScore: true }) // { score: 1.74, matches: {1:{score: 1.74}} }

match('fzz', [{ value: 'fuzza', rate: 0.75 }, { value: 'fuzzy', rate: 0.10 }], { withScore: true })
// { score: 2.32, matches: {0: {score: 2.32}, 1: {score: 17.4}} }
// score = Min(1.74 / 0.75, 1.74 / 0.10)

match('fzz', 'fuzzy'); // { score: 1 }

match('fZZ', 'fuzzy', { caseSensitive: true }); // null

match('fZZ', 'fuzzy', { withWrapper: '<i>{?}</i>' });
// { score: 1, wrapped: '<i>f</i>u<i>zz</i>y' }

match('fZZ', ['fuzzy'], { withWrapper: '<i>{?}</i>' });
// { score: 1, matches: {0: {score: 1, original: 'fuzzy', index: 0, wrapped: '<i>f</i>u<i>zz</i>y', original: 'fuzzy'}}}

match('fZZ', { v: 'fuzzy' }, { withWrapper: '<i>{?}</i>' });
// { score: 1, matches: { v: {score: 1, original: 'fuzzy', index: 'v', wrapped: '<i>f</i>u<i>zz</i>y', original: 'fuzzy'}}}

match('fZZ', 'fuzzy', { withWrapper: w => `<b>${w}</b>` });
// { score: 1, wrapped: '<b>f</b>u<b>zz</b>y' }

match('fZZ', ['fuzzy'], { withWrapper: w => `<b>${w}</b>`, withScore: true });
// { score: 1, matches: {0: {score: 1, wrapped: '<b>f</b>u<b>zz</b>y', original: 'fuzzy'}}}

match('fZZ', 'fuzzy', { withRanges: true });
// { score: 1, ranges: [{begin: 0, end: 0}, {begin: 2, end: 3}]}

match('fZZ', ['fuzzy'], { withRanges: true });
// { score: 1, matches: {0: {score: 1, ranges: [{begin: 0, end: 0}, {begin: 2, end: 3}], original: 'fuzzy'}}}

Match in string

import { match } from 'fuzzy-tools';
// or
// import { matchString } from 'fuzzy-tools';

match('fzz', 'fuzzy', { withScore: true }); // { score: 1.74 }

match('fzz', 'fuzzy'); // { score: 1 }

match('fZZ', 'fuzzy', { caseSensitive: true }); // null

match('fZZ', 'fuzzy', { withWrapper: '<i>{?}</i>', withScore: true });
// { score: 1.74, wrapped: '<i>f</i>u<i>zz</i>y' }

match('fZZ', 'fuzzy', { withWrapper: w => `<b>${w}</b>` });
// { score: 1, wrapped: '<b>f</b>u<b>zz</b>y' }

match('fZZ', 'fuzzy', { withRanges: true });
// { score: 1, ranges: [{begin: 0, end: 0}, {begin: 2, end: 3}]}

Match in list of strings

import { match } from 'fuzzy-tools';
// or
// import { matchList } from 'fuzzy-tools';

match('fzz', ['fu', 'fuzza'], { withScore: true })
// { score: 1.74, matches: {1: {score: 1.74, original: 'fuzza', index: 1}} }

match('fzz', { v1: 'fu', v2: 'fuzza' })
// { score: 1, matches: {v2: {score: 1, original: 'fuzza', index: 'v2'}} }

match('fzz', [{ value: 'fuzza', rate: 0.75 }, { value: 'fuzzy', rate: 0.10 }], { withScore: true })
// {
//   score: 2.3708148148148145,
//   matches: {
//     0: { score: 2.3708148148148145, original: 'fuzza', rate: 0.75, index: 0 },
//     1: { score: 17.78111111111111, original: 'fuzzy', rate: 0.10, index: 1 }
//   }
// }
// score = min(2.3708148148148145, 17.78111111111111)

// the same but more compact
match('fzz', ['fuzza', 'fuzzy'], { rates: [0.75, 0.10], withScore: true })
// {
//   score: 2.3708148148148145,
//   matches: {
//     0: { score: 2.3708148148148145, original: 'fuzza', rate: 0.75, index: 0 },
//     1: { score: 17.78111111111111, original: 'fuzzy', rate: 0.10, index: 1 }
//   }
// }
// score = min(2.3708148148148145, 17.78111111111111)

match('fzz', {v1: { value: 'fuzza', rate: 0.75 }, v2: { value: 'fuzzy', rate: 0.10 } }, { withScore: true })
// {
//   score: 2.3708148148148145,
//   matches: {
//     v1: { score: 2.3708148148148145, original: 'fuzza', rate: 0.75, index: 'v1' },
//     v2: { score: 17.78111111111111, original: 'fuzzy', rate: 0.10, index: 'v2' }
//   }
// }
// score = min(2.3708148148148145, 17.78111111111111)

// the same but more compact
match('fzz', { v1: 'fuzza', v2: 'fuzzy' }, { rates: { v1: 0.75, v2: 0.10 }, withScore: true })
// {
//   score: 2.3708148148148145,
//   matches: {
//     v1: { score: 2.3708148148148145, original: 'fuzza', rate: 0.75, index: 'v1' },
//     v2: { score: 17.78111111111111, original: 'fuzzy', rate: 0.10, index: 'v2' }
//   }
// }
// score = min(2.3708148148148145, 17.78111111111111)

match('fZZ', ['fuzzy'], { caseSensitive: true });
// null

match('fZZ', ['fuzzy'], { withWrapper: '<i>{?}</i>' });
// { score: 1, matches: {0: {score: 1, index: 0, wrapped: '<i>f</i>u<i>zz</i>y', original: 'fuzzy'}}}

match('fZZ', ['fuzzy'], { withWrapper: w => `<b>${w}</b>`, withScore: true });
// { score: 1.74, matches: {0: {score: 1.74, index: 0, wrapped: '<b>f</b>u<b>zz</b>y', original: 'fuzzy'}}}

match('fZZ', ['fuzzy'], { withRanges: true });
// { score: 1.74, matches: {0: {score: 1.74, index: 0, ranges: [{begin: 0, end: 0}, {begin: 2, end: 3}], original: 'fuzzy'}}}

Filter function

filter(mask: OR(String, Array.of(String)), items: Array, options: Object): Array - returns list of matched items.

Arguments

args type default note
mask string or array.of(string) no what you want to find (fzz)
items array no items list
options object {...} additional options
  • mask - read more in Match function
  • where
    • Array of Strings
    • Array of Objects
  • options
    • extract: required to filter Array of Objects
      • String: field name - this field will be extrtacted to match with mask
      • Array: fields names - these fields will be extrtacted to match with mask, each field will have 1 as rate.
      • Object: { fieldName: rateNumber } - field name is key, rate is value.
      • Function(item):String or Array - function takes item and should return String, Array of Strings, Object of Strings or Array of Object({ value: String, rate: Number })
    • itemWrapper: function(item, matchResult, { index: Number, result: Array }): any - function takes item and matchResult and should return value that will be pushed into result list. If it returns empty value (false, null, undefined, '', 0), then it will not be pushed into result.
    • caseSensitive: Boolean (default: false) - when it is true, then FZZ will not be matched with fuzzy. If it is false then char's case will be ignored.
    • withScore: Boolean (default: false) - when it is true, then score will be computed for matched strings, else score will be 1.
    • withWrapper: String or Function (default: false) - when it is true, then match result for each item will contain wrapped. It is needed to render highlighted results.
      • String: template string, (e.g. '<b>{?}</b>', {?} will be replaced by matched word. fzz in fuzzy => '<b>f</b>u<b>zz</b>y'.
      • Function(word: String): String, (e.g. (word) => '<b>'+word+'</b>')
    • withRanges: Boolean (default: false) - when it is true, then match result for each item will contain ranges. It is array of Object({ begin: Number, end: Number }) with ranges of matched parts.

Filter strings list

import { filter } from 'fuzzy-tools';

const data = ['fuzzy', 'fazzy', 'furry', 'funny', 'fuuuuuzzer'];
filter('fZZ', data);
// ['fuzzy', 'fazzy', 'fuuuuuzzer'];

filter('fZZ', data, { itemWrapper: item => item.toUpperCase() });
// ['FUZZY', 'FAZZY', 'FUUUUUZZER'];

filter('fZZ', data, { withWrapper: '<{?}>', itemWrapper: (item, match) => match.wrapped });
// ['<f>u<zz>y', '<f>a<zz>y', '<f>uuuuu<zz>er'];

filter('fZZ', data, { extract: (item) => item.slice(0, 4) });
// ['fuzzy', 'fazzy'];

filter('FZZ', data, { caseSensitive: true });
// [];

filter('FZZ', data, { caseSensitive: true, extract: item => item.toUpperCase() });
// ['fuzzy', 'fazzy', 'fuuuuuzzer'];

Filter objects list

import { filter } from 'fuzzy-tools';

const data = [{ v: 'fuzzy' }, { v: 'funny' }, { v: 'fuuuuuzzer'}];

filter('fZZ', data, { extract: 'v' });
// [{ v: 'fuzzy' }, { v: 'fuuuuuzzer'}];

filter('fZZ', data, { extract: 'vvv' });
// [];

filter('fZZ', data, { extract: 'v', itemWrapper: item => item.v });
// ['fuzzy', 'fuuuuuzzer'];

filter('fZZ', data, { extract: 'v', withWrapper: '<{?}>', itemWrapper: (item, m) => m.wrapped });
// ['<f>u<zz>y', '<f>uuuuu<zz>er'];

// if extract contains more than 1 field, then result of matching will contain matches array
filter('fZZ', data, {
    extract: ['v', 'v2'],
    withWrapper: '<{?}>',
    itemWrapper: (item, m) => m.matches[0].wrapped
});
// ['<f>u<zz>y', '<f>uuuuu<zz>er'];

Benchmark results

Compare next libraries

Methodology

  1. generate pairs mask (random length) and string (255 or 5000 chars length). 5000 pairs.
  2. call match function from each library
  3. compare results

Results

It is more real case - 5000 items with 255 length string

fuzzy-tools vs others in real case

It is test case - 5000 items with 2000 length string. And it is not mistake, it is real results.

fuzzy-tools vs others

Benchmark project

https://codesandbox.io/s/fuzzytoolsbenchmarkexample-t9fk1?file=/src/index.js