Fix 627 #642
assert feature_frame_num == label_frame_num

if self._split_sentence_threshold == -1 or self._split_perturb == -1 or self._split_sub_sentence_len == -1 or self._split_sentence_threshold >= feature_frame_num:
Please keep each line under the 80-character limit; this applies to comments as well.
fix
LGTM
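One way to bring the condition above under 80 columns is sketched below. The function name `should_skip_split` and its plain arguments are hypothetical stand-ins for the `self._split_*` attributes in the diff; this is not the code that was merged.

```python
def should_skip_split(threshold, perturb, sub_len, frame_num):
    """Return True when the sample should be kept whole (no splitting)."""
    # Any parameter set to -1 disables splitting altogether.
    split_disabled = -1 in (threshold, perturb, sub_len)
    # Parentheses let the condition wrap without backslashes.
    return (split_disabled
            or threshold >= frame_num)
```

Naming the intermediate result also documents what the `-1` sentinel means, which the single long line did not.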
@@ -61,10 +61,21 @@ class SampleInfoBucket(object):
label_bin_paths (list|tuple): Files containing the binary label data.
label_desc_paths (list|tuple): Files containing the description of
samples' label data.
split_perturb(int): split long sentence' perturb sub-sentence length value.
split_sentence_threshold(int): sentence length large than
sentence --> Sentence
large --> larger
operator --> operation
sentence length large than split_sentence_threshold trigger split operator. -->
Sentences whose length is larger than this value will trigger the split operation.
OK
@@ -61,10 +61,21 @@ class SampleInfoBucket(object):
label_bin_paths (list|tuple): Files containing the binary label data.
label_desc_paths (list|tuple): Files containing the description of
samples' label data.
split_perturb(int): split long sentence' perturb sub-sentence length value.
split_perturb(int) --> split_perturb (int)
split long sentence' perturb sub-sentence length value.
Please refine this comment.
Randomly perturb the sub-sentence length when splitting a long sentence.
How about "Maximum perturbation value for the length of a sub-sentence when splitting a long sentence."?
nice
remain_frame_num = feature_frame_num
while True:
    if remain_frame_num > self._split_sentence_threshold:
        cur_frame_len = self._split_sub_sentence_len + random.randint(
This seems to exceed 80 columns.
Please use self._rng instead of calling random.randint directly.
self._rng.randint(0, self._split_perturb)
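The point of the suggestion is reproducibility: drawing from a generator owned by the object lets a fixed seed reproduce the same splits. Below is a minimal sketch of the splitting loop with an injected generator. The function `split_frame_lengths`, its arguments, and the seed are hypothetical, and `random.Random` is assumed here only for illustration. Its `randint` includes both bounds, whereas a NumPy `RandomState.randint` excludes the upper bound, so the actual type of `self._rng` matters.

```python
import random


def split_frame_lengths(frame_num, threshold, sub_len, perturb, rng):
    """Return sub-sentence lengths that sum to frame_num."""
    if -1 in (threshold, perturb, sub_len) or threshold >= frame_num:
        return [frame_num]  # splitting disabled or sentence short enough
    if sub_len < 1:
        raise ValueError("sub_len must be positive")
    lengths = []
    remain = frame_num
    while remain > threshold:
        # Draw from the injected generator, not the global random module.
        cur = sub_len + rng.randint(0, perturb)
        cur = min(cur, remain)  # never take more frames than are left
        lengths.append(cur)
        remain -= cur
    if remain > 0:
        lengths.append(remain)  # the tail is at most `threshold` frames
    return lengths


rng = random.Random(0)  # hypothetical seed
lengths = split_frame_lengths(1000, 400, 300, 50, rng)
```

Two generators created with the same seed yield the same list of lengths, which is not guaranteed when every caller shares the module-level `random` state.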
@@ -244,11 +291,20 @@ def read_bytes(fpath, start, size):
sample_info.feature_start,
sample_info.feature_size)

assert sample_info.feature_frame_num * sample_info.feature_dim * 4 == len(
This seems to exceed 80 columns.
agree
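A possible way to shorten the assertion is to name the expected size first. The sketch below assumes the factor 4 is the byte width of a 32-bit float, which the diff does not state; `check_feature_size` and `BYTES_PER_VALUE` are hypothetical names.

```python
import struct

BYTES_PER_VALUE = 4  # assumption: features are stored as 32-bit floats


def check_feature_size(feature_bytes, frame_num, feature_dim):
    """Raise AssertionError if the buffer size does not match the shape."""
    expected_size = frame_num * feature_dim * BYTES_PER_VALUE
    assert expected_size == len(feature_bytes), (
        "expected %d bytes, got %d" % (expected_size, len(feature_bytes)))


# Two frames of three values each, packed as little-endian float32.
payload = struct.pack("<6f", *[0.0] * 6)
check_feature_size(payload, frame_num=2, feature_dim=3)
```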
@@ -273,7 +329,8 @@ def read_bytes(fpath, start, size):
time.sleep(0.001)

# drop long sentence
if self._drop_frame_len >= sample_data[0].shape[0]:
if self._drop_frame_len == -1 or self._drop_frame_len >= sample_data[
Same as above.
fix
LGTM
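For reference, the revised drop check can be written within 80 columns as in the sketch below. `keep_sample` and its arguments are hypothetical names, and the reading that `-1` disables dropping is inferred from the changed condition in the diff.

```python
def keep_sample(drop_frame_len, frame_num):
    """Return True if the sample should be kept rather than dropped."""
    # -1 turns the length filter off; otherwise keep samples whose
    # frame count does not exceed the configured limit.
    return (drop_frame_len == -1
            or drop_frame_len >= frame_num)
```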
Split long sentences; splitting and dropping are disabled by setting the value to -1