match is type unstable #10550
Comments
I should add that I am willing to work on this if there is some general consensus to change it. |
+1... but I'm not sure how you can do this without breaking lots of existing code. Edit: thinking about it, pandas made this change to their str.match method, for the same reason. |
+1 |
BTW, I've just bumped on the same pattern with |
The big issue here is design. How do you design the equivalent functionality in a type-stable way? One possibility is using the relatively new I had at one point proposed an else clause for
for m = match(r"([M,G])([0-9.])+", "foo")
    # do stuff with m
else
    # handle not matching
end
Then |
I think this is a great case for Nullable, it's actually been on my todo
On Wed, Mar 18, 2015 at 5:19 AM, Stefan Karpinski
|
As mentioned in #1820 (comment), there's also the new PCRE2 API and its features to think about if you're digging into regex functionality. |
I like the idea of for/else. It's a nice construct that would remove some ugly flags in various bits of code. I don't see any real downsides to having it, since it's not like it uses an extra keyword. |
Returning an iterable is also an appealing alternative, if it can be made as efficient as EDIT: There's also |
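To illustrate the iterable style discussed above, here is a minimal sketch that treats "zero or one match" as a loop over `eachmatch`. It assumes a recent Julia 1.x; `first_number` and its `-1` fallback are hypothetical names chosen for this example, not anything proposed in the thread.

```julia
# Hypothetical helper: the loop body runs only if there is a match,
# and the code after the loop plays the role of the proposed `else` branch.
function first_number(s::AbstractString)
    for m in eachmatch(r"[0-9]+", s)
        # `m` is always a RegexMatch here, so no `nothing` check is needed
        return parse(Int, m.match)
    end
    return -1  # reached only when the pattern did not match at all
end
```

Because the function returns an `Int` on both paths, callers never see a `Union` with `nothing`.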
A while ago, I had implemented this as a This behavior was actually one of my main motivations for creating |
From a performance perspective, an iterable may be better than a Nullable. Until we have escape analysis, constructing a |
There may not be a big performance advantage to making |
My personal sense: the most important thing is that we standardize on an idiom for all functions that return results that may not exist. Nullable seems like the simplest approach, but I don't really care what we do as long as we always do the same thing. We can easily make Nullable iterable: this has been discussed before. The heap allocation issue raised by @simonster is more serious, but I think we need to solve it since the use cases for something equivalent to Nullable are going to keep piling up. |
In theory you could certainly work out that
|
I think it's more complicated than that: |
@one-more-minute Type inference is smart enough to figure that out:
julia> function f(str)
x = match(r"\w", str)
x.match
end;
julia> code_typed(f, (UTF8String,))
1-element Array{Any,1}:
:($(Expr(:lambda, Any[:str], Any[Any[:x],Any[Any[:str,UTF8String,0],Any[:x,Union(RegexMatch,Void),18]],Any[],Any[UTF8String,Regex,UTF8String,Regex]], :(begin # none, line 1:
GenSym(1) = r"\w"
x = match(GenSym(1),str::UTF8String,1,(top(box))(Int32,(top(checked_trunc_sint))(Int32,0)))::Union(RegexMatch,Void) # line 2:
return (top(getfield))(x::Union(RegexMatch,Void),:match)::SubString{UTF8String}
end::SubString{UTF8String}))))
@johnmyleswhite |
julia> type Foo x::Int end
julia> type Bar y::Float64 end
julia> foo() = rand()<0.5? Foo(1) : Bar(1)
foo (generic function with 1 method)
julia> bar() = foo().x
bar (generic function with 1 method)
julia> @code_typed bar()
1-element Array{Any,1}:
:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin # none, line 1:
return (top(getfield))(foo()::Union(Foo,Bar),:x)::Int64
end::Int64))))
That's pretty sweet, I had no idea we could do that kind of inference on field names. It even types it as |
Cool isn't it :D |
Keen for this. It catches me out every time, because it's not the error I expect. I wonder though if I think a Regex not matching is an exceptional situation. |
I don't think this is true. In my experience it is very common to first check if the string matches and then do further processing using the information of the matching. |
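The check-then-process pattern described above might look like the following sketch, assuming a recent Julia 1.x. `parse_size` is a hypothetical helper, and the pattern is adapted from the regex quoted earlier in the thread (anchored, with the `+` moved inside the capture group so the whole number is captured).

```julia
# Hypothetical helper: check for a match first, then use its captures.
function parse_size(s::AbstractString)
    m = match(r"^([MG])([0-9.]+)$", s)
    m === nothing && return nothing   # no match: nothing to process
    unit, value = m.captures          # safe: `m` is a RegexMatch from here on
    scale = unit == "M" ? 1e6 : 1e9   # "M" scales by 1e6, "G" by 1e9
    return parse(Float64, value) * scale
end
```

The explicit `m === nothing` test is what lets the rest of the function work with a concrete `RegexMatch`.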
I think this can be closed now because in a turn of events this now matches what other functions do (e.g. |
I feel like this is a duplicate issue, but I could not find it. A little ironic I guess.
This behavior is documented, but I feel like it should not be the default for performance reasons. This seems like a use case for Nullables.
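As a minimal illustration of the behavior in question (a sketch assuming a recent Julia 1.x, with variable names chosen for this example), the same call yields values of two different types depending on the input:

```julia
# Same function, same argument types, two possible result types.
hit  = match(r"[0-9]+", "abc 123")   # pattern matches
miss = match(r"[0-9]+", "abc")       # pattern does not match

println(hit isa RegexMatch)   # a RegexMatch is returned on success
println(miss === nothing)     # `nothing` is returned on failure
println(hit.match)            # the matched substring; `miss.match` would throw instead
```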