Samplers always return the same number of batches in distributed mode #267
Conversation
lhotse/dataset/sampling.py
Outdated
```
The formula used to determine which batches are returned is
``(batch_idx + rank) % world_size == 0``.
This ensures that we can return an equal number of batches in all distributed workers
in spite of using a dynamic batch size, at the cost of skipping at most ``world_size`` batches.
```
Should it be "skipping at most ``world_size - 1`` batches"?
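For intuition, here is a minimal sketch (illustrative only, not lhotse code) of the equalization argument: each rank keeps the batch indices matched by the docstring's formula and truncates to a common per-rank length, so the number of dropped batches is ``num_batches % world_size``, which is at most ``world_size - 1``.

```python
# Sketch of the equal-batch-count argument (illustrative, not lhotse code).
# Each rank keeps the batch indices matched by the docstring's formula,
# then truncates to a common per-rank length so all workers stay in sync.

def kept_batches(num_batches: int, rank: int, world_size: int) -> list[int]:
    kept = [b for b in range(num_batches) if (b + rank) % world_size == 0]
    # Truncating to num_batches // world_size equalizes the ranks; the
    # remainder, num_batches % world_size <= world_size - 1, is skipped.
    return kept[: num_batches // world_size]

if __name__ == "__main__":
    num_batches, world_size = 10, 4
    for rank in range(world_size):
        print(rank, kept_batches(num_batches, rank, world_size))
    # Every rank yields exactly 2 batches; 10 - 4 * 2 = 2 batches are
    # skipped, within the world_size - 1 = 3 bound.
```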
lhotse/dataset/sampling.py
Outdated
```
DistributedSampler -- instead of partitioning the underlying cuts into equally sized chunks,
it will return every N-th batch and skip the other batches (where ``N == world_size``).
The formula used to determine which batches are returned is
``(batch_idx + rank) % world_size == 0``.
```
Should it be ``(batch_idx + (world_size - rank)) % world_size == 0``?
yes, you're right, thanks @danpovey @csukuangfj
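To see the difference the sign makes, here is a quick illustrative check (a sketch, not lhotse code): both variants split the batches into disjoint 1/``world_size`` slices, but only the corrected formula reduces to ``batch_idx % world_size == rank``, so rank ``r`` starts at batch ``r`` rather than batch ``world_size - r``.

```python
# Compare the two rank offsets (a sketch, not lhotse code). Both give each
# rank a disjoint 1/world_size slice of the batches, but only the corrected
# formula assigns batch_idx == rank as the first batch of each rank.

world_size = 4
batches = range(12)

for rank in range(world_size):
    original = [b for b in batches if (b + rank) % world_size == 0]
    corrected = [b for b in batches if (b + (world_size - rank)) % world_size == 0]
    # The corrected condition is equivalent to batch_idx % world_size == rank.
    assert corrected == [b for b in batches if b % world_size == rank]
    print(f"rank {rank}: original={original} corrected={corrected}")

# rank 0: original=[0, 4, 8]  corrected=[0, 4, 8]
# rank 1: original=[3, 7, 11] corrected=[1, 5, 9]
# rank 2: original=[2, 6, 10] corrected=[2, 6, 10]
# rank 3: original=[1, 5, 9]  corrected=[3, 7, 11]
```

Since Python's ``%`` returns a non-negative result even for a negative left operand, ``(batch_idx - rank) % world_size == 0`` would be an equivalent way to write the corrected condition.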