Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add code tour #21

Merged
merged 15 commits into from
Mar 20, 2024
Merged

Add code tour #21

merged 15 commits into from
Mar 20, 2024

Conversation

glennmoy
Copy link
Contributor

@glennmoy glennmoy commented Apr 26, 2023

Closes #20


@glennmoy glennmoy marked this pull request as draft April 26, 2023 19:21
examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated

# Now let's start the batcher - but given we already used the RandomBatcher previously, we
# need to provide the correct new_state
# XXX: why does it work if we use state=RNG above? why do we need to do this twice?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

not really sure tbh

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beacuse the batcher itself does not store the state anywhere; we used to do that but it was too easy for it to get out of sync and then you lose teh guarantees about reproducibility (that's also why the take!(::Batcher, state) interface requires the state, so that the batcher itself can remain stateless)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see...

To me then it seems that allowing Batcher(...;start=true, state=RNG) would contradict that...because in that case it not only seems like the Batcher does know something about the state (because it's passed as an argument) but that it also controls the iteration state by being able to start from construction.

(tangentially, the fact that "state is ignored if start=false" just confuses the matter because now you have conditional application of kwargs)

Overall I feel the separation of concerns has been blurred which partly contributes to my confusion

examples/tour.jl Outdated

# A more convenient way to do this is to just take! from the batcher.channel, which obviates
# the need for passing around the new_state
((X4, Y4), new_state), old_state = take!(batcher.channel)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not documented that it can/should be used this way but I've seen this used on beacon-internal code? is it sanctioned? seems safer & more convenient than carrying around a state?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is no good 🙅 for the reasons above: it's the responsibility of the caller to handle iteration state. the other reason is that take!(::Batcher, state) handles error propagation, avoiding deadlock, etc. all that good stuff (which is surprisingly tricky to get right).

if a user wants to just have a channel of batches, they are absolutely free to call start_batching with the workers they wanna use; the idea of the Batcher is to encapsulate all that trickiness of running this stuff in a distributed way

Copy link
Contributor Author

@glennmoy glennmoy Apr 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha! I might show you where I saw this pattern internally and you might shed light on what I might be misunderstanding

examples/tour.jl Outdated Show resolved Hide resolved
@glennmoy glennmoy changed the title Add code tour RFC: Add code tour Apr 26, 2023
# https://github.com/beacon-biosignals/Legolas.jl/blob/main/examples/tour.jl
# https://github.com/beacon-biosignals/Onda.jl/blob/main/examples/tour.jl
#
# Why OndaBatches?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to the best of my understanding - this is why it exists

Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a great start and really awesome to have!

examples/tour.jl Outdated
Comment on lines 81 to 85
s1, l1 = load_labeled_signal(labeled_signals[1, :])

@test s1 isa Samples
@test l1 isa Samples

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worthwhile to explain a bit about how label_span relates to the Signal's span field (they're both relative to recording start), and maybe to give an example of using sub_label_span to select a sub-span (using the correct alignment). woudl be easier to see the nuances here if span doesn't start at 0. so maybe this should be a separate section somewhere...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added, though I'm confused about one part - highlighted below

examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated

# Now let's start the batcher - but given we already used the RandomBatcher previously, we
# need to provide the correct new_state
# XXX: why does it work if we use state=RNG above? why do we need to do this twice?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beacuse the batcher itself does not store the state anywhere; we used to do that but it was too easy for it to get out of sync and then you lose teh guarantees about reproducibility (that's also why the take!(::Batcher, state) interface requires the state, so that the batcher itself can remain stateless)

examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated Show resolved Hide resolved
examples/tour.jl Outdated

# A more convenient way to do this is to just take! from the batcher.channel, which obviates
# the need for passing around the new_state
((X4, Y4), new_state), old_state = take!(batcher.channel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is no good 🙅 for the reasons above: it's the responsibility of the caller to handle iteration state. the other reason is that take!(::Batcher, state) handles error propagation, avoiding deadlock, etc. all that good stuff (which is surprisingly tricky to get right).

if a user wants to just have a channel of batches, they are absolutely free to call start_batching with the workers they wanna use; the idea of the Batcher is to encapsulate all that trickiness of running this stuff in a distributed way

examples/tour.jl Outdated Show resolved Hide resolved
@glennmoy glennmoy changed the title RFC: Add code tour Add code tour May 2, 2023
@glennmoy glennmoy marked this pull request as ready for review May 2, 2023 09:55
@glennmoy glennmoy requested a review from kleinschmidt May 3, 2023 18:49
examples/tour.jl Outdated Show resolved Hide resolved
Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still haven't really given this a super thorough review but it's been open so long we might as well merge it.

@kleinschmidt kleinschmidt merged commit 7b807bf into main Mar 20, 2024
2 checks passed
@kleinschmidt kleinschmidt deleted the gm/codetour branch March 20, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants