Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer - feature request: Buffer.compare with offsets. #521

Closed
rootslab opened this issue Jan 20, 2015 · 15 comments
Closed

Buffer - feature request: Buffer.compare with offsets. #521

rootslab opened this issue Jan 20, 2015 · 15 comments
Assignees
Labels
buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js.

Comments

@rootslab
Copy link
Contributor

I'm really impressed with the new iojs features, an awesome work!
Buffer.compare is a great iojs addition ( memcmp is very fast!), but, unfortunately, the current method implementation doesn't offer a way to compare 2 portions of Buffers: you are forced to slice one or both Buffers to compare desired portions of data.

Now, Buffer.compare accepts 2 buffers as arguments, instead, it could accept 4 args (+1 optional), or something like that:

function ( Buffer b1, Number start1, Buffer b2, Number start2 [, Number length ] ) {}
/*
 * For example, comparing 10 bytes:
 *  - b1 from index 0 to 9 
 *  - b2 from index 16 to 25
 */
Buffer.compare( b1, 0, b2, 16, 10 )
// or, to mantain current signature:
Buffer.compare( b1, b2, 0, 16, 10 )

@trevnorris It would be great if it was planned to add this functionality, it simplify things a lot!!

@feross
Copy link
Contributor

feross commented Jan 20, 2015

What about just slicing the buffers before comparing? Slicing will
reference memory in the parent buffer so there's no copy. How much overhead
would this actually save?
On Mon, Jan 19, 2015 at 7:45 PM gu [email protected] wrote:

I'm really impressed with the new iojs features, an awesome work!
Buffer.compare is another great iojs addition ( memcmp is very
fast!), but, unfortunately, the current method implementation doesn't offer
a way to compare 2 portions of Buffers: you are forced to slice one or both
Buffers to compare desired portions of data.

Now, Buffer.compare accepts 2 buffers as arguments, instead, it could
accept 4 args (+1 optional), or something like that:

function ( Buffer b1, Number start1, Buffer b2, Number start2 [, Number length ] ) {}/* * For example, comparing 10 bytes: * - b1 from index 0 to 9 * - b2 from index 16 to 25 */
Buffer.compare( b1, 0, b2, 16, 10 )

@trevnorris https://github.com/trevnorris It would be great if it was
planned to add this functionality, it speeds up things a lot!!


Reply to this email directly or view it on GitHub
#521.

@rootslab
Copy link
Contributor Author

Hi @feross,
"What about just slicing the buffers before comparing? "
I obviously know that :) ".. you are forced to slice one or both Buffers to compare desired portions of data.
For example, If you compare a fixed pattern against a data stream, you have to repeatedly slice, the same chunk of incoming data, to the current start offset, instead of simple incrementing an index and executing memcmp / Buffer.compare on a portion of it (anyway, native memcmp gets 3 arguments, 2 pointers and a number for specifying length).

@rootslab
Copy link
Contributor Author

P.S
@feross, It is also possible to use Buffer.copy without indexes if you slice input buffers to desired offsets :)), but we prefer to use direct offset indexes; for an analogue reason, I think, we could get a Buffer.compare method with offset indexes.
Probably, as you says, it doesn't saves a lot in terms of average execution time, I don't know that, I really mean "simplify things". Thanks for your interest.

@jorangreef
Copy link
Contributor

Whenever I compare buffers, it's also almost always using an offset and common length argument, exactly as @rootslab suggested, for example compare(buffer1, offset1, buffer2, offset2, comparisonLength), so this would really fit my typical use case. Slicing to compare within a binary search (usually where I need to compare) would make no sense.

@trevnorris trevnorris added buffer Issues and PRs related to the buffer subsystem. enhancement labels Jan 20, 2015
@trevnorris trevnorris self-assigned this Jan 20, 2015
@trevnorris
Copy link
Contributor

I'm cool with the idea. Though having that many overloads would become a pain to check. How about we start with the simple case:

Buffer#compare(buf[, offset[, length]])

We can work on the syntax for Buffer.compare() afterwards.

@trevnorris
Copy link
Contributor

Oh, and it's not an iojs addition. The feature is also in v0.11. :-)

@rvagg
Copy link
Member

rvagg commented Jan 20, 2015

https://github.com/bnoordhuis/node-buffertools is my go-to for this kind of thing but I'm a big +1 on putting the most useful stuff in core because having to compile an addon just to do simple things often makes me resort to this kind of thing which I'm sure would make @trevnorris scream: buf1.toString('hex') == buf2.toString('hex').

@rootslab
Copy link
Contributor Author

@jorangreef, then, I'm not alone ;))

@trevnorris ops.. I didn't see it in v0.11! ;)
Unfortunately (or fortunately?) my c++/v8 skills are very poor , but I have written some js code for checking index ranges before calling native compare. It's very "tight code", I know, but it works and it could be easy modified or probably used as a simple reference.

@rvagg thanks for suggestion, I know buffertools by @bnoordhuis, however, let me scream together with @trevnorris about toString('hex') ;)) it also converts every byte of raw data to 2 bytes (ASCII) chars, before comparing.

Thanks to all! ;)

@rootslab
Copy link
Contributor Author

@trevnorris @rvagg, I hope it is useful for something :)

compare : function ( buf1, buf2, bpos1, bpos2, bytes ) {

    if ( ! ( buf1 instanceof Buffer &&
             buf2 instanceof Buffer ) )
        throw new TypeError( 'Arguments must be Buffers' );

    var abs = Math.abs
        , min = Math.min
        , blen1 = b1.length
        , blen2 = b2.length
        // normalize arg values
        , s1 = bpos1 >>> 0 ? abs( + bpos1 ) : 0
        , s2 = bpos2 >>> 0 ? abs( + bpos2 ) : 0
        , len = bytes >>> 0 ? abs( + bytes ) : min( blen1 - s1, blen2 - s2 )
        ;

    if ( ! blen1 || ! blen2 )
        throw new Error( '..0 length buffer..' );

    if ( ( len <= 0 ) ||
         ( s1 + len > blen1 ) ||
         ( s2 + len > blen2 ) )
        throw new RangeError( 'out of range index' );
  /*
    * Indexes range:
    *
    * - buf1 -> slice(s1, s1 + len)
    * - buf2 -> slice(s2, s2 + len)
    *
    * call native compare, with safe indexes:
    */
    return internal.compare( buf1, buf2, s1, s2, len );
}

@brendanashworth brendanashworth added feature request Issues that request new features to be added to Node.js. and removed enhancement labels May 6, 2015
@Trott
Copy link
Member

Trott commented Feb 26, 2016

@trevnorris et al.: Assuming this is still a desirable feature, how would the arguments for Buffer#compare(buf[, offset[, length]]) work? Specifically, would offset and length apply to both buffers or just to buf?

Trott added a commit to Trott/io.js that referenced this issue Feb 26, 2016
@trevnorris
Copy link
Contributor

@Trott If the hope is to skip the .slice() operation then you'll need a signature of:

compare(buf[, bufOffset[, bufLength[, thisOffset]]])

It's a bit ugly, but don't see much way around that.

EDIT: Guess thisLength wouldn't be necessary.

@jasnell
Copy link
Member

jasnell commented Mar 22, 2016

Fairly -1 on this. Using slice is a bit more verbose but it works.

@jorangreef
Copy link
Contributor

I wanted to use buffer.compare for a fast substring search implementation the other day, but could not, because of all the garbage that .slice() generates. It was in a hot function and the C memcmp would have been useful but I had to use a while loop instead.

I think Trevor's API suggestion is right.

@trevnorris
Copy link
Contributor

I can't see a technical reason why this API isn't a valid proposition. We already support similar with Buffer#copy() to also allow bypassing .slice() because the operation is expensive enough to hinder the hot path.

@jasnell
Copy link
Member

jasnell commented Mar 23, 2016

Ok. That works for me then.
On Mar 22, 2016 11:18 PM, "Trevor Norris" [email protected] wrote:

I can't see a technical reason why this API isn't a valid proposition. We
already support similar with Buffer#copy() to also allow bypassing
.slice() because the operation is expensive enough to hinder the hot path.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#521 (comment)

jasnell added a commit to jasnell/node that referenced this issue Apr 9, 2016
Adds additional `targetStart`, `targetEnd`, `sourceStart,
and `sourceEnd` arguments to `Buffer.prototype.compare`
to allow comparison of sub-ranges of two Buffers without
requiring Buffer.prototype.slice()

Fixes: nodejs#521
@jasnell jasnell closed this as completed in a246689 Apr 9, 2016
MylesBorins pushed a commit that referenced this issue Apr 20, 2016
Adds additional `targetStart`, `targetEnd`, `sourceStart,
and `sourceEnd` arguments to `Buffer.prototype.compare`
to allow comparison of sub-ranges of two Buffers without
requiring Buffer.prototype.slice()

Fixes: #521
PR-URL: #5880
Reviewed-By: Trevor Norris <[email protected]>
MylesBorins pushed a commit that referenced this issue Apr 21, 2016
Adds additional `targetStart`, `targetEnd`, `sourceStart,
and `sourceEnd` arguments to `Buffer.prototype.compare`
to allow comparison of sub-ranges of two Buffers without
requiring Buffer.prototype.slice()

Fixes: #521
PR-URL: #5880
Reviewed-By: Trevor Norris <[email protected]>
jasnell added a commit that referenced this issue Apr 26, 2016
Adds additional `targetStart`, `targetEnd`, `sourceStart,
and `sourceEnd` arguments to `Buffer.prototype.compare`
to allow comparison of sub-ranges of two Buffers without
requiring Buffer.prototype.slice()

Fixes: #521
PR-URL: #5880
Reviewed-By: Trevor Norris <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants