Add character APIs for locations #1809

Merged
kddnewton merged 1 commit into main from character-apis on Nov 20, 2023

Conversation

kddnewton
Collaborator

Fixes #1788 by introducing:

  • Location#start_character_offset
  • Location#end_character_offset
  • Location#start_character_column
  • Location#end_character_column
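
To illustrate how byte positions and character positions diverge once the source contains multibyte characters, here is a minimal sketch in plain Ruby. It does not use Prism; `character_offset` and `character_column` are hypothetical helpers written only for this example, and they assume a UTF-8 source and a byte offset that falls on a character boundary.

```ruby
# Hypothetical helpers for illustration only, not the code in this PR.

# Number of characters that precede the given byte offset.
def character_offset(source, byte_offset)
  source.byteslice(0, byte_offset).length
end

# Number of characters between the start of the line and the byte offset.
def character_column(source, byte_offset)
  prefix = source.byteslice(0, byte_offset)
  line_start = (prefix.rindex("\n") || -1) + 1
  prefix.length - line_start
end

source = "x = 1\ncaf\u00E9 = 2"

# Byte offset of the second "=": the "é" before it occupies two bytes.
byte_offset = source[0, source.rindex("=")].bytesize

puts byte_offset                           # byte offset
puts character_offset(source, byte_offset) # character offset
puts character_column(source, byte_offset) # character column
```

The byte offset is one larger than the character offset here because of the single two-byte character before the `=`.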

Comment on lines 62 to 65
# A lazily-computed cache of byte offset => character offset mappings.
def character_offsets
  @character_offsets ||= { 0 => 0 }
end
Member

I'm not sure this would pay off because it could become pretty large for a big file and consume N bytes for every byte in the source.
N seems to be about 60 for integer entries in a Hash:

ruby --disable-gems -e 'h={}; 1.times {|i| h[i]=i}'
12MB
ruby --disable-gems -e 'h={}; 10_000_000.times {|i| h[i]=i}'
588MB
irb(main):003:0> 576 * 1024**2 / 10_000_000.0
=> 60.3979776

I think it's better to leave this uncached for now, or do one cache entry per line.
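
For what it's worth, the "one cache entry per line" idea could look roughly like the sketch below. This is not the code that was merged; `LineOffsetCache` and its methods are hypothetical names, and it assumes byte offsets fall on character boundaries. It stores one pair of offsets per line instead of one Hash entry per byte, then counts characters only within the target line.

```ruby
# Hypothetical sketch of a per-line cache, not the merged implementation.
class LineOffsetCache
  def initialize(source)
    @source = source
    # One [byte_offset, character_offset] pair for the start of each line.
    @line_starts = []
    byte = 0
    char = 0
    source.each_line do |line|
      @line_starts << [byte, char]
      byte += line.bytesize
      char += line.length
    end
    @line_starts << [0, 0] if @line_starts.empty?
  end

  # Return the character offset for the given byte offset.
  def character_offset(byte_offset)
    # Index of the first line that starts after byte_offset, if any.
    index = @line_starts.bsearch_index { |(start, _)| start > byte_offset }
    line_byte, line_char = @line_starts[(index || @line_starts.length) - 1]
    # Count characters only from the start of the containing line.
    line_char + @source.byteslice(line_byte, byte_offset - line_byte).length
  end
end

cache = LineOffsetCache.new("h\u00E9llo\nw\u00F6rld\n")
puts cache.character_offset(10)
```

Memory then grows with the number of lines rather than the number of bytes, at the cost of re-scanning part of a line on each lookup.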

Collaborator Author

Fair point!

@kddnewton kddnewton requested a review from eregon November 14, 2023 21:15
@kddnewton kddnewton force-pushed the character-apis branch 2 times, most recently from eb14cf5 to f005fa0 Compare November 14, 2023 21:42
end

# Return the character offset for the given byte offset.
def character(byte_offset)
Member

Suggested change
- def character(byte_offset)
+ def character_offset(byte_offset)

It seems clearer (e.g. it doesn't return a character), and given this is a public method, it seems even more important to be clear.

@kddnewton kddnewton merged commit d493ccd into main Nov 20, 2023
@kddnewton kddnewton deleted the character-apis branch November 20, 2023 16:07
Successfully merging this pull request may close these issues.

Location APIs for characters
2 participants