Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove string allocation from GC_set_warn_proc #11729

Merged
merged 2 commits into from
Jan 12, 2022

Conversation

straight-shoota
Copy link
Member

#11289 introduced logic to suppress some common GC warning messages which are identified by comparing the start of the warning message.

In oder to use String#starts_with? from Crystal's stdlib, the C string passed to the proc needs to be converted to a String which requires allocating memory for the new string. This isn't mentioned in the documentation of libgc, but it's seems to be not a good idea to allocate from within an allocation warning handler.

This patch removes the String allocation and instead recognizes a matching message directly from the C string.

Resolves #11728

/cc #6389 which is similarly about avoiding String allocations when working with C strings.

@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:runtime kind:regression Something that used to correctly work but no longer works labels Jan 8, 2022
beta-ziliani
beta-ziliani previously approved these changes Jan 10, 2022
Copy link
Member

@beta-ziliani beta-ziliani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long run we probably need to have some useful set of functions to operate with C-strings without requiring to allocate more memory.

@beta-ziliani beta-ziliani added this to the 1.3.1 milestone Jan 10, 2022
asterite
asterite previously approved these changes Jan 10, 2022
Copy link
Member

@asterite asterite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Same comment as @beta-ziliani : I'm sure there's a more succinct/efficient way to do this. For example using LibC.strlen and then uses LibC.strncmp. But for now this is good!

src/gc/boehm.cr Outdated
# This method implements `String#starts_with?` without allocating a `String` (#11728)
def self.starts_with?(string_pointer, start)
start.to_slice.each_with_index do |chr, i|
return false unless chr == string_pointer[i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this watch for string_pointer[i] == 0 and then break the loop?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think once string_pointer[I] is zero it won't match the char in the expected string, so the loop will break.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Still very implicit way of dealing with this that relies on non-attacker controlled input in start, so might be worth an explicit check or an explanatory comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is customized for this specific use case and we're controlling the value of start (which governs the iteration length). So this should be completely safe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until somebody starts to copy or reuse it without much thinking. As said I'm fine with a comment but I would not leave this as an uncommented example in a codebase without doubt many use as an inspiration :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're controlling the value of start (which governs the iteration length)

I don't think this matters. It governs the iteration length, but what happens if the string is longer than start? Then almost by chance the iteration stops because C strings are guaranteed to end with a final null byte. But, as @jhass says, it's not immediately clear.

I would much rather prefer to use LibC.strlen and using LibC.strncmp.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll refactor to use strlen.

We currently don't have bindings for strncmp, but I believe we don't need that. As soon as we know the length of the string, we can use a slice for the comparison.

@straight-shoota straight-shoota dismissed stale reviews from asterite and beta-ziliani January 11, 2022 12:26

Code changes

@straight-shoota straight-shoota merged commit 1607e49 into crystal-lang:master Jan 12, 2022
@straight-shoota straight-shoota deleted the fix/gc-warn branch January 12, 2022 17:30
rdp pushed a commit to rdp/crystal that referenced this pull request Jan 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. kind:regression Something that used to correctly work but no longer works topic:stdlib:runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Allocation in GC warning suppression proc blocks process
4 participants