ccall should automatically convert integers #132
Comments
This makes me think: can we use type inference with numeric literals? This is sort of the opposite of what other languages do, but it would probably give us 99% of what we want: when a numeric literal is used in a context where we can figure out what type it ought to be — e.g. |
In fact, maybe this is just a specific case of that, which would make it less confusing: variables have to be cast, literals will automatically be converted to the right type. |
Actually that is exactly what Haskell does. + is defined only for arguments of the same type, so one argument can determine the type of the operation and force a literal to be of the same type. I don't see how we can do that though. Both arguments determine what method to apply, and any method signature can be defined, so there isn't a unique answer to what type an argument "ought" to be. |
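To make the dispatch point concrete, here is a minimal sketch. It uses present-day Julia spelling (`Int32(1)` rather than the `int32(1)` used in this thread), which is an assumption about the reader's Julia version, not the syntax of the time. It only shows that `+` accepts mixed argument types, so neither argument alone fixes what type a literal "ought" to be.

```julia
# Both argument types take part in method selection, and promotion
# decides the result type. There is no single "expected" literal type.
a = Int32(1) + Int64(1)   # Int32 + Int64
b = Int32(1) + 1.5        # Int32 + Float64
c = Int32(1) + Int32(1)   # same type on both sides

println(typeof(a))
println(typeof(b))
println(typeof(c))
```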
Back to the original issue: can we have |
The big problem here is the way we handle pointers. If you pass an Int32 to a ccall expecting Ptr{Int32}, it passes a pointer to your integer value. But converting an integer to a pointer uses the bits of the integer as the value of the pointer. Another issue is when to insert the calls to convert. Since ccall uses normal function call syntax, it's hard for the front end to know exactly what is and isn't a ccall. I guess we could make |
See, I didn't know that.
I would be completely fine with |
If a garbage collection occurs between calling pointerto() and the ccall that uses it, you could get a segfault. There's a reason for the magic. Actually,
|
Hmm. That's nasty. How about preventing GC for the duration of the function instead? |
Preventing GC during the ccall doesn't help, since the problem is GC between |
How about this: |
I think that would do it, since we need some way to tag that a value needs to be converted differently. Doing heap allocation for this is unfortunate though. Actually you can already do this by wrapping numbers in arrays, I can rewrite the ccall to keep the original arguments so the code generator can make sure they are gc roots:
So there is no need to disable GC. |
Do you mean this:
? Any chance that the compiler can figure out that heap allocation is unnecessary because the |
Well, that's probably possible with enough of my time :) I think the bigger problem is that a lapack call would go from this:
to this:
which is not the kind of improvement I was hoping for by filing this issue :) |
But the LAPACK calling convention is clearly not the norm — it's very ugly. Optimizing for it seems silly. I think that the latter version is actually much simpler to understand: it's shorter and the fields that should be passed as pointers are instead passed as Refs. That's a pretty simple thing for users to wrap their heads around:
This would be infinitely more explainable and usable than what we currently have, IMO. |
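A rough sketch of the "pass a Ref where C wants a pointer" idea, written with today's `Ref{T}` type. This is an assumption about how the idea looks in current Julia, not the exact syntax proposed in this comment. libc's `memcpy` stands in for a function that takes pointers to scalars.

```julia
# Present-day spelling, not the syntax discussed in this thread.
src = Ref{Int32}(42)   # mutable box holding one Int32
dst = Ref{Int32}(0)

# void *memcpy(void *dest, const void *src, size_t n)
# Declaring the pointer arguments as Ref{Int32} makes ccall pass the
# address of each box and keep both boxes alive for the whole call.
ccall(:memcpy, Ptr{Cvoid}, (Ref{Int32}, Ref{Int32}, Csize_t),
      dst, src, sizeof(Int32))

println(dst[])
```

The caller never handles a raw address, so there is no window in which the boxed value could be collected before the C function runs.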
You know, the syntax |
Yes!! |
- conversions are inserted for the arguments, to the ccall argument types - syntax &x is used to pass a pointer to a scalar
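As an illustration of the first bullet, a minimal sketch in present-day Julia (type names such as `Cint` are today's spellings and are an assumption relative to this thread). It calls libc's `abs` with an ordinary integer literal and lets `ccall` insert the conversion to the declared argument type. The `&x` form from the second bullet is not shown, since current Julia expresses that with `Ref` instead.

```julia
# int abs(int): the literal -3 is an Int, and ccall converts it to Cint.
r = ccall(:abs, Cint, (Cint,), -3)
println(r)
println(typeof(r))

# A value that does not fit in the declared type is rejected by the
# inserted conversion instead of being silently truncated.
rejected = try
    ccall(:abs, Cint, (Cint,), Int64(2)^40)
    false
catch err
    err isa InexactError
end
println(rejected)
```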
Ok, I have this change basically implemented on a new branch. However, it is causing a lot of unexpected difficulty. Mostly, there are some performance regressions of about 10-15% in randmatstat and printfd. There were some changes to inlining behavior, but those did not seem to be the cause of the regressions, and I'm still not sure what's going on. The other problem is GC roots. The reason I wrote
is so the code generation for ccall has access to the original arguments before conversion, in case they are new arrays that get converted to pointers. Now, this might seem silly, since lowering has to pull expressions into variables anyway to avoid duplicating side-effects:
But what if we have
We have to avoid lowering and inlining this to
Oops! I don't have this fully worked out. If lowering emits the argument twice, that at least signals to the inliner that it needs to generate a temporary variable. I guess I could alternate real/dummy arguments:
Every other argument is completely superfluous, but has to be there! Needless to say, this problem isn't unique to ccall, and any use of array-to-pointer conversion is a potential ticking time bomb. But in ccall we have to deal with it since we insert the conversions. Yet another problem is conversion inside
The current implementation doesn't support this, and you have to write
To insert the conversion automatically, we'd have to take apart the type
|
I can't comment on the performance issue since I'm not really in a position to advise. I'm not clear on how gc can possibly occur during a ccall and cause the gc rooting thing to be a problem. No gc will happen while the ccall is executing and by the time it returns, it's safe to free the memory unless it's rooted somewhere else. The |
Just think of the example |
Spell it out. I'm not getting it. The ccall takes the pointers and computes something with them. As long as it doesn't return the pointers, expecting them to still point to valid memory, it's fine. If f does something like that, then you shouldn't pass it un-rooted memory. |
When |
A crappy hack would be to temporarily root all literals while evaluating the expression they appear in. |
Well, it has nothing to do with literals. It could be any expression
There is no problem if you don't call |
Right, I get that pointer can still be used to make "hand grenades" like this, but my hack would take care of the specific case of ccall, no? That was all I was proposing. We can either do away with pointers in Julia altogether or allow them and make sure people know they're dangerous like they are in C. There's a lot fewer places where they can and should be used — almost exclusively to pass to ccall, in fact. Maybe we can design a mechanism that's pretty specific to ccall and is safe. Like using an ArrayPointer object that contains the array and an index and is converted specially by ccall into a pointer. That still leaves the problem of pointers returned from C. Maybe we can have a BarePointer type that can be returned and converted into an ArrayPointer by combining it with the appropriate array object. If the BarePointer doesn't point into the array object given, an exception is thrown; if it does point into the array object given, it can safely be converted into an ArrayPointer object. You can still do arithmetic, etc. with an ArrayPointer object, but a BarePointer is completely opaque and cannot be operated on. |
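One way the ArrayPointer half of this proposal could be sketched in present-day Julia. `ArrayPointer` is a hypothetical type named after the comment above, not something Julia provides, and the sketch assumes the current `Base.cconvert` / `Base.unsafe_convert` hooks that `ccall` uses for argument conversion. The BarePointer half is not covered.

```julia
# Hypothetical type, named after the proposal above; not part of Julia.
struct ArrayPointer{T}
    array::Vector{T}   # holding the array keeps it reachable
    index::Int         # 1-based element the pointer refers to
end

# ccall calls cconvert first; returning the wrapper itself lets ccall
# keep it (and therefore the array) rooted during the call.
Base.cconvert(::Type{Ptr{T}}, ap::ArrayPointer{T}) where {T} = ap

# A raw address is produced only at the last moment.
function Base.unsafe_convert(::Type{Ptr{T}}, ap::ArrayPointer{T}) where {T}
    checkbounds(ap.array, ap.index)
    return pointer(ap.array, ap.index)
end

src = Int32[10, 20, 30]
dst = Int32[0, 0, 0]
# Copy one element from src[2] into dst[3] through libc's memcpy.
ccall(:memcpy, Ptr{Cvoid}, (Ptr{Int32}, Ptr{Int32}, Csize_t),
      ArrayPointer(dst, 3), ArrayPointer(src, 2), sizeof(Int32))
println(dst)
```

Because the wrapper carries the array, the address never exists apart from something the GC can see.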
This is still in progress! I put the branch back in sync. How do you feel about merging into master even if the mysterious printfd perf regression remains? Perhaps we can address it over time while enjoying the benefits of the better ccall? |
I'm cool with that. I think at this point shaking out the issues in the new-and-improved ccall is more important than printf performance. So, yeah, let's do that. Better send a message to the dev list explaining the change though. |
Pardon the potentially stupid question, but why, in your above example I guess unless you are tracing your AST in realtime to see if the reference is still needed, maybe you can just say "don't mark or sweep anything on the current stack-frame"? Which, thinking out-loud, would lead to some incredibly unwieldy memory usage for poorly written applications with lots of single-use, large, temporary variables. Or really, isn't the issue just with anonymous variables -- especially those pointing to literals? It seems like you might have a nice corner case for root objects pointing to the current stack frame's literal pool? |
For one thing, the AST doesn't exist anymore at run-time because everything is JITed to machine code before running. Not sure about the rest... |
Don't worry about it. Ignore the man behind the curtain. Everything will be fine. |
Hahahahaha. Well, I'll trust in that. |
Well, the lack of AST makes sense at run-time if everything is JITed (I forgot that pleasant detail of Julia -- still getting acquainted over here). I still don't understand why the literal is getting marked by the GC. I'd prefer to not ignore the man behind the curtain and rather understand his seemingly crazy behavior. |
The core issue is that a |
Also, you will want to re-read your sources on mark/sweep GC; marking is what happens to live objects. |
Are you saying that the literal isn't live? I'm assuming you're saying it isn't live because it is unreachable. I would say that it SHOULD be live, or at least that is what I would expect in writing that code. Treating Ptr as an integer type and not checking if it is referring to a julia object certainly leads to "unexpected" behavior from a code writer's point of view. I understand that checking if Ptr refers to a julia object definitely creates overhead, but it seems like the right solution to match coder expectation. Plus, and maybe this is just my naiveté, but it seems to me like the overhead isn't in the speed-critical areas of the code I would write anyway -- just in conversions with ccall. |
The important thing to remember is that you don't really need to explicitly convert objects to Ptr. The only real use of that is calling C, and if you use ccall normally (which converts for you, as in |
It's not a matter of checking anything about the Ptr object itself. Making the existence of a Ptr object that points to a Julia object prevent that Julia object from being garbage collected requires scanning all live Ptr objects each time a Julia object is considered for GC to see if that pointer happens to point to the object. That's really expensive. |
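A small sketch of what this means in practice, using today's `GC.@preserve` macro (an assumption about the reader's Julia version; it did not exist when this thread was written). Since a `Ptr` is only an address, the code has to say explicitly how long the underlying array must stay alive.

```julia
a = Int32[1, 2, 3]

# Holding `p` does not keep `a` alive. GC.@preserve marks `a` as in use
# for the extent of the block, so the address stays valid while we read
# through it.
second = GC.@preserve a begin
    p = pointer(a)
    unsafe_load(p, 2)   # read the 2nd element through the raw pointer
end
println(second)
```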
One could potentially make Ptrs more like some kind of Ref object that explicitly references another Julia object:
This kind of construct interacts just fine with GC. However, then the problem comes when you want a pointer that doesn't refer to any Julia object: an opaque pointer returned from |
This can be done for arguments whose types can be inferred. It will make things more convenient in general, but the cost is that it won't be easy to predict when an explicit conversion needs to be written. At the very least it will improve the situation with literals, so you don't need to write int32(0).