remove Scheduler #5687
This is really really exciting! Thanks so much for working on this!
@bass3m might also be interested in trying this out
@vtjnash looking forward to precompiled packages! It looks like you pulled in some unintended changes (see test/ccall and test/linalg.jl)
current_module needs to be rooted probably.
@@ -103,10 +103,11 @@
#end

# type Task
# parent::Task
# current_module::Module
How come you don't need to keep track of the parent task anymore?
The parent task used to be whoever first switched to the task. That isn't a very helpful definition, and it caused issues with tasks being awakened by some other random task exiting rather than by the condition variable they were waiting for. It was broken before, but harder to notice since the Scheduler was forbidden from waiting on stuff.
@loladiro where? it's more rooted now than it used to be
end
return default
end
Not searching the entire parent task chain seems rather likely to break source_path quite badly: we only populate the TLS lazily and this search process is how we find the correct source path for tasks that haven't populated this entry. We should probably generalize this approach since it's quite likely to be something that will be used for other values stored in TLS.
See above for why I think this definition of source_path may have been broken anyways. (task Alice spawns task Bob, task Bob yields for a bit then calls source_path, meanwhile Alice has finished loading the file and proceeded to include some other file. Bob now reports that it is in Alice's other file.)
The idea was for it to at least work for tasks whose lifetimes are contained within loading the file. With this change, it no longer works in that simple case.
source_path is dynamic; it's the file currently being loaded. Setting up tasks to use it potentially long after the file is done loading is a strange thing to do. A function inside the file that calls source_path will also give a different answer after loading the file. This isn't too different from the task case.
Does it make sense to capture this as a per-module attribute then? Or should I restore the parent parameter, but set it equal to the task that created the task (as opposed to the task that first switched to the task)?
Maybe, but source files and modules aren't 1-to-1 so that won't really fix it. The nice thing about the parent field is that you can get good default values for things in task-local storage without copying the dictionary for each new task.
Going by the creating task instead of the first-switching task sounds like a good idea; you're right that in general somebody might make a task, stash it in the work queue, and some random other task switches to it later.
The creating task does seem like a much saner choice for the parent. The fact that the creating task and the first task that switches to it are often the same is a lucky accident.
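The parent-chain lookup discussed in this thread can be sketched roughly as follows. This is a toy model written in current Julia syntax, not the code in this PR and not Base's `Task` type: `ToyTask`, `spawn`, and `lookup` are hypothetical names, and the parent is assumed to be the creating task, as proposed above.

```julia
# Hypothetical model of tasks and task-local storage; not Base's Task type.
mutable struct ToyTask
    parent::Union{ToyTask,Nothing}   # the task that *created* this one
    storage::Dict{Symbol,Any}        # task-local storage, populated lazily
end

# Creating a task records the creator as parent, no matter who switches to it first.
spawn(creator::ToyTask) = ToyTask(creator, Dict{Symbol,Any}())

# Walk the parent chain until some task has a value for `key`.
function lookup(t::ToyTask, key::Symbol, default=nothing)
    cur = t
    while cur !== nothing
        haskey(cur.storage, key) && return cur.storage[key]
        cur = cur.parent
    end
    return default
end

root = ToyTask(nothing, Dict{Symbol,Any}(:source_path => "a.jl"))
child = spawn(root)
grandchild = spawn(child)
```

Here `lookup(grandchild, :source_path)` finds the value stored on `root` without any dictionary being copied at task creation, which is the benefit of the parent field mentioned above. Because the lookup happens at query time, a later change in an ancestor's storage is visible to descendants that never populated their own entry.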
I may be missing something, but is
It is rooted in
Ah, I see.
@@ -1,4 +1,8 @@
# source path in tasks
path = Base.source_path()
@test endswith(path, joinpath("test","test_sourcepath.jl"))
@test yieldto(@task Base.source_path()) == path
Breaking this test is a no-no – this definitely needs to work.
This is a great change, thanks for tackling it.
The binary file
perform_work()
process_events(false)
end
yield()
This causes worker processes to busy wait (rather, this whole while true loop).
I fixed this locally, but hadn't pushed it yet.
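For readers unfamiliar with the busy-wait concern: a loop that calls `yield()` unconditionally keeps the process spinning even when there is nothing to do, whereas a loop that blocks on a queue suspends the task until work arrives. A minimal sketch of the blocking variant, assuming the `Channel` API from current Julia (not the code in this PR; `run_worker`, `queue`, and `results` are hypothetical names):

```julia
# Illustrative only: a Channel stands in for the work queue.
function run_worker(queue::Channel{Function}, results::Vector{Any})
    # Iterating a Channel suspends this task until a job arrives,
    # and ends once the channel is closed and drained.
    for job in queue
        push!(results, job())
    end
end

queue = Channel{Function}(8)
results = Any[]
worker = @async run_worker(queue, results)

put!(queue, () -> 1 + 1)
put!(queue, () -> "done")
close(queue)   # lets the for loop in run_worker finish
wait(worker)
```

The worker consumes no CPU between jobs because it is parked inside the channel iteration rather than repeatedly yielding.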
Yes, definitely a great change to have.
I fixed the ccalltest file. I need the linalg changes to run the tests locally. So, to me, they aren't unrelated. I should probably cherry-pick it directly onto master then rebase this.
this passes all tests now @JeffBezanson ping
@@ -196,7 +196,7 @@ debug && println("Solve upper trianguler system")

debug && println("Solve lower triangular system")
x = tril(a)\b
@test_approx_eq_eps tril(a)*x b 1000ε
@test_approx_eq_eps tril(a)*x b 1050ε
These changes are unrelated and may be a reversion.
my quick git blame work told me these are new tests added ~1 week ago (just prior to my branch which is why they ended up here)
@vtjnash I'd appreciate it if you could cherry-pick those linalg test changes, they've been blocking the OSX nightlies for almost a week now. I know @jiahao has been working on a systematic re-think of the linalg tests, but if you've got a relaxation of the bounds that works, I'd love to have that on
@staticfloat I can relax those tests now. My only problem is that all tests pass on my own machines as well as Travis so I need some feedback. I'll relax those that have been reported already.
@andreasnoackjensen I have machines that fail on tests. I'll run any branches/commits you want me to and report on how it works, just ping me.
Okay. Please try to run
Confirmed working. On Wed, Feb 5, 2014 at 1:59 PM, Andreas Noack Jensen <
|
I've also been using this and testing on osx and linux. Works great so far!
@WestleyArgentum, I'm pretty sure that @staticfloat's response was in regard to a side conversation with @andreasnoackjensen, and not to this PR. ;-) But it's great that this PR is working for you!
@kmsquire heh, right - sorry for the spam
@JeffBezanson rebased && ready to merge
local v
try
v = yieldto(P, values...)
if ct.last !== P
You don't have to change this now, but this almost seems more like an error --- there should be no reason a task other than P would yield to this one. Something is really broken if that happens.
Was needed to pass pollfd test
Ok, but we should look into why that happens.
ok. for some reason the old tasks get woken up, finish their work, and then wake up their parent (who was intending to wait for its new task)
First, we keep track of whether a task has asked to be in a particular module (e.g. by initiating a source file load). If it has, then current_module is always set appropriately for that task. That should be uncontroversial. Next, if a task doesn't care what module it is in, use the current_module from its parent task. This is similar to the previous commit's behavior of setting task->current_module at task creation, except that it sees changes in the parent task's current module. So if the parent task finishes loading something, the spawned task won't think it is still inside that module. In other words, remember where we got current_module rather than remembering the module itself.
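The rule in this commit message, "remember where we got current_module rather than remembering the module itself", can be illustrated with a small model. This is only a sketch in current Julia syntax under stated assumptions: `ModTask`, `own_module`, and `current_module_of` are hypothetical names, and Symbols stand in for real modules. It is not the runtime's actual implementation.

```julia
# Hypothetical model; `ModTask` and its fields are illustrative, not Julia internals.
mutable struct ModTask
    parent::Union{ModTask,Nothing}
    own_module::Union{Symbol,Nothing}   # set only if the task asked to be in a module
end

# Resolve at query time: the task's own choice if any, else whatever the parent reports now.
function current_module_of(t::ModTask, toplevel::Symbol=:Main)
    t.own_module !== nothing && return t.own_module
    t.parent === nothing && return toplevel
    return current_module_of(t.parent, toplevel)
end

loader = ModTask(nothing, :Foo)      # loader is currently loading module Foo
worker = ModTask(loader, nothing)    # spawned task with no preference of its own
during = current_module_of(worker)   # follows the parent while Foo is loading
loader.own_module = nothing          # loader finishes loading Foo
after = current_module_of(worker)    # the spawned task no longer reports Foo
```

Had `worker` copied `:Foo` at creation time instead, `after` would still be `:Foo`, which is the stale behavior the commit message says the previous commit had.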
state don't print fatal error for non-root task without exception handler. this is not fatal and in fact can happen regularly. small cleanups to yield_until()
now multi.jl deals only with distributed parallelism remove no-longer-used WeakRemoteRef
Ok I'm going to merge this now, as it works perfectly well. We can get deeper into other task-related changes afterwards.
I know I say this a lot, but yay!
This is totally cool! I love having rapid access to graphics! I noticed a couple of oddities (e.g., with Gtk compilation fails the 2nd time), but I bet you're aware of them. My biggest concern is how developers are now going to avoid accidentally contributing their favorite sets of packages to
Maybe add a
Or better yet, automatically precompile packages if they are listed in a Make variable that can be set in the
That's annoying. Somehow I hadn't run across this, but I can see why it happens. @JeffBezanson is there some black-magic we could use to delete all old bindings (esp. Modules) from
ihnorton's solution is nice since it allows arbitrary code to be added to the sysimg
Arbitrary code is definitely more flexible than a make variable. I do not fully understand the I just wanted to suggest a way to avoid creating a new user editable file, and to simplify debugging because the automatically generated
I think the
I did the
@StefanKarpinski It might be good to move this thread over to pull request #5746, where I address the issue of how fragile it is and the need to use
Ah, I didn't realize that was what you guys were talking about over there. Yes, will do.
This removes the Scheduler. @JeffBezanson you've talked about doing this for a while since it should improve performance by avoiding the extra task switch.
@WestleyArgentum This change set frees the sysimg build of the need for starting the Scheduler. It was needed to permit pulling in other modules into the sysimg. After this change, I can add the following lines to the bottom of my sysimg.jl file to have Gtk fully & instantly available at the REPL.
edit: This is WIP because I need to create a flush_gc_msgs Task to avoid the warning that the scheduler is being called recursively. Also need Jeff's thoughts on how source_path should behave (cf. f9dd0bb).