Expanded whitespace test case to catch more inefficiencies and updated code to optimize. #5
@@ -33,6 +33,35 @@ test('efficiency', (t) => {
    }, 0)
  })

  t.test('flanking whitespace', (t) => {
    const timeoutId = setTimeout(() => {
      t.fail('did not pass in 10ms')
    }, 10)

    t.deepEqual(trimLines(whitespace + '\na\n' + whitespace), '\na\n')

    setTimeout(() => {
      clearTimeout(timeoutId)
      t.end()
    }, 0)
  })

  t.test('internalized whitespace ', (t) => {
    const timeoutId = setTimeout(() => {
      t.fail('did not pass in 30ms')
    }, 30)

Why is this 30 instead of 10?

No, this test is slower because it's the edge case that originally caused problems. It might pass in 10ms with the fully non-regex version though I'd have to try it.

Actually on second test, it appears it does pass in 10 ms, I must have accidentally tested 1ms or something.

On a third revision it appears they all need 20-30ms, they are intermittently failing with 10. Good call.

Also, we’re not testing

    t.deepEqual(
      trimLines('\na' + whitespace + 'b\n'),
      '\na' + whitespace + 'b\n'
    )

    setTimeout(() => {
      clearTimeout(timeoutId)
      t.end()
    }, 0)
  })

  t.test('whitespace around line', (t) => {
    const timeoutId = setTimeout(() => {
      t.fail('did not pass in 10ms')

At this point, it seems like a bad idea to use regexes for this. At least, it’s hard to read. Can you check if something like this works?
(formatted to match project)
(perhaps good to check for some lines that just include whitespace, and lines that include nothing at all)
I prefer the regex approach where readability is concerned. It means people don't have to look up the char codes, and I feel it clarifies the intent over the above code.
I understand your opinion, but I disagree. Can you change it?
Regexes are generally slow. Character codes, especially with parsing projects such as all of unified, are common and searchable.
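As a rough illustration of why a trailing-whitespace regex can be slow on whitespace inside a line, the sketch below models, in simplified form, how a backtracking matcher might handle a pattern such as `/[ \t]+$/`. The pattern, the function names `countRegexLikeSteps` and `countBackwardSteps`, and the step model itself are assumptions made for illustration; this is not a measurement of a real regex engine or of the code in this PR.

```javascript
// Simplified model (an assumption, not a real engine): try `[ \t]+$` at
// every start position and count each whitespace character consumed.
function countRegexLikeSteps(value) {
  let steps = 0

  for (let start = 0; start < value.length; start++) {
    let index = start

    // Greedily consume tabs (9) and spaces (32).
    while (index < value.length) {
      const code = value.charCodeAt(index)
      if (code !== 9 && code !== 32) break
      steps++
      index++
    }

    // `$` only matches at the end of the input, so succeed only there.
    if (index > start && index === value.length) return steps
  }

  return steps
}

// Scanning backwards from the end stops at the first non-whitespace character.
function countBackwardSteps(value) {
  let steps = 0
  let index = value.length

  while (index > 0) {
    const code = value.charCodeAt(index - 1)
    if (code !== 9 && code !== 32) break
    steps++
    index--
  }

  return steps
}

const internal = 'a' + ' '.repeat(1000) + 'b'
console.log(countRegexLikeSteps(internal)) // 500500
console.log(countBackwardSteps(internal)) // 0
```

In this model the work grows quadratically with the length of an internal whitespace run, while the backward scan does not touch that run at all.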
I propose the following as the function. Please still consider it pseudocode; however, I did test it and it seems to work.
This:
- … `!start` as well, meaning that regexes are no longer needed
- Uses `charCodeAt`, which is faster than `charAt`, and no longer needs small strings. I understand that these codes might be new to you, and hence you do not prefer them, but I consider them common enough in parsing, in the 100s of projects I am maintaining, that I strongly prefer them.
- Slices `value` only once, without reassigning it, or even not at all for empty lines. Reassigning a parameter is slow, because JavaScript “links” `arguments[0]` and `value` together. Not slicing at all for empty lines is likely also fast in edge cases of large blank lines.

All reasons why this should be faster.
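The reviewer's pseudocode is not preserved here. The following is a minimal sketch of the approach described in the points above, assuming a hypothetical helper `trimLine` with `start` and `end` flags; it should not be read as the code that was proposed or merged. It compares character codes directly, slices each line at most once, and skips the slice for lines that end up empty.

```javascript
// Hypothetical sketch; not the reviewer's code.
function trimLines(value) {
  const source = String(value)
  const parts = []
  let lineStart = 0
  let index = 0

  while (index < source.length) {
    const code = source.charCodeAt(index)

    if (code === 10 /* \n */ || code === 13 /* \r */) {
      // Treat "\r\n" as one line ending.
      const size = code === 13 && source.charCodeAt(index + 1) === 10 ? 2 : 1

      // The start of the first line is never trimmed.
      parts.push(
        trimLine(source, lineStart, index, lineStart > 0, true),
        source.slice(index, index + size)
      )
      index += size
      lineStart = index
    } else {
      index++
    }
  }

  // The end of the last line is never trimmed.
  parts.push(trimLine(source, lineStart, source.length, lineStart > 0, false))
  return parts.join('')
}

// Trim tabs (9) and spaces (32) from source[from..to) as the flags request.
function trimLine(source, from, to, start, end) {
  let startIndex = from
  let endIndex = to

  if (start) {
    let code = source.charCodeAt(startIndex)
    while (startIndex < endIndex && (code === 9 || code === 32)) {
      code = source.charCodeAt(++startIndex)
    }
  }

  if (end) {
    let code = source.charCodeAt(endIndex - 1)
    while (endIndex > startIndex && (code === 9 || code === 32)) {
      code = source.charCodeAt(--endIndex - 1)
    }
  }

  // Slice once, and not at all for lines that end up empty.
  return endIndex > startIndex ? source.slice(startIndex, endIndex) : ''
}

console.log(JSON.stringify(trimLines(' a \n b \n c '))) // " a\nb\nc "
```

Because every character is visited a bounded number of times, a long run of whitespace in the middle of a line costs no more than any other text.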
I suggest using constants for the character codes to make it more readable:
And I agree with not using regex, for readability, for performance, and to avoid the risk of ReDoS.
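The snippet from this suggestion is not preserved here. As a hedged sketch of what named constants for the character codes could look like, the names `tab`, `space`, `isTabOrSpace`, and `leadingWhitespaceEnd` below are assumptions chosen for illustration, not the identifiers used in the PR.

```javascript
// Hypothetical names; the constants from the original suggestion are not shown here.
const tab = 9 // '\t'
const space = 32 // ' '

// True for the two characters treated as trimmable whitespace.
function isTabOrSpace(code) {
  return code === tab || code === space
}

// Example use: find the index where a line's leading tabs and spaces stop.
// `charCodeAt` returns NaN past the end, which ends the loop.
function leadingWhitespaceEnd(line) {
  let index = 0
  while (isTabOrSpace(line.charCodeAt(index))) index++
  return index
}

console.log(leadingWhitespaceEnd('\t  abc')) // 3
```

The comparisons stay numeric, so nothing changes at run time; the constants only give the codes a readable name at the point of use.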
Just saw this, I think it resolves my concern about readability without compromising performance. I'll make the constants now.