-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add code tour #21
Add code tour #21
Conversation
examples/tour.jl
Outdated
|
||
# Now let's start the batcher - but given we already used the RandomBatcher previously, we | ||
# need to provide the correct new_state | ||
# XXX: why does it work if we use state=RNG above? why do we need to do this twice? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
not really sure tbh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
beacuse the batcher itself does not store the state anywhere; we used to do that but it was too easy for it to get out of sync and then you lose teh guarantees about reproducibility (that's also why the take!(::Batcher, state)
interface requires the state, so that the batcher itself can remain stateless)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I see...
To me then it seems that allowing Batcher(...;start=true, state=RNG)
would contradict that...because in that case it not only seems like the Batcher
does know something about the state (because it's passed as an argument) but that it also controls the iteration state by being able to start
from construction.
(tangentially, the fact that "state is ignored if start=false" just confuses the matter because now you have conditional application of kwargs)
Overall I feel the separation of concerns has been blurred which partly contributes to my confusion
examples/tour.jl
Outdated
|
||
# A more convenient way to do this is to just take! from the batcher.channel, which obviates | ||
# the need for passing around the new_state | ||
((X4, Y4), new_state), old_state = take!(batcher.channel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not documented that it can/should be used this way but I've seen this used on beacon-internal code? is it sanctioned? seems safer & more convenient than carrying around a state
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this is no good 🙅 for the reasons above: it's the responsibility of the caller to handle iteration state. the other reason is that take!(::Batcher, state)
handles error propagation, avoiding deadlock, etc. all that good stuff (which is surprisingly tricky to get right).
if a user wants to just have a channel of batches, they are absolutely free to call start_batching
with the workers they wanna use; the idea of the Batcher
is to encapsulate all that trickiness of running this stuff in a distributed way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha! I might show you where I saw this pattern internally and you might shed light on what I might be misunderstanding
# https://github.com/beacon-biosignals/Legolas.jl/blob/main/examples/tour.jl | ||
# https://github.com/beacon-biosignals/Onda.jl/blob/main/examples/tour.jl | ||
# | ||
# Why OndaBatches? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to the best of my understanding - this is why it exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a great start and really awesome to have!
examples/tour.jl
Outdated
s1, l1 = load_labeled_signal(labeled_signals[1, :]) | ||
|
||
@test s1 isa Samples | ||
@test l1 isa Samples | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be worthwhile to explain a bit about how label_span
relates to the Signal's span
field (they're both relative to recording start), and maybe to give an example of using sub_label_span
to select a sub-span (using the correct alignment). woudl be easier to see the nuances here if span
doesn't start at 0. so maybe this should be a separate section somewhere...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added, though I'm confused about one part - highlighted below
examples/tour.jl
Outdated
|
||
# Now let's start the batcher - but given we already used the RandomBatcher previously, we | ||
# need to provide the correct new_state | ||
# XXX: why does it work if we use state=RNG above? why do we need to do this twice? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
beacuse the batcher itself does not store the state anywhere; we used to do that but it was too easy for it to get out of sync and then you lose teh guarantees about reproducibility (that's also why the take!(::Batcher, state)
interface requires the state, so that the batcher itself can remain stateless)
examples/tour.jl
Outdated
|
||
# A more convenient way to do this is to just take! from the batcher.channel, which obviates | ||
# the need for passing around the new_state | ||
((X4, Y4), new_state), old_state = take!(batcher.channel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this is no good 🙅 for the reasons above: it's the responsibility of the caller to handle iteration state. the other reason is that take!(::Batcher, state)
handles error propagation, avoiding deadlock, etc. all that good stuff (which is surprisingly tricky to get right).
if a user wants to just have a channel of batches, they are absolutely free to call start_batching
with the workers they wanna use; the idea of the Batcher
is to encapsulate all that trickiness of running this stuff in a distributed way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still haven't really given this a super thorough review but it's been open so long we might as well merge it.
Closes #20