Fix 627 #642

zhxfl · 2018-02-06T09:28:04Z

Split long sentences; setting the value to -1 disables the split and drop-sentence policies.

kuke · 2018-02-07T07:13:18Z

fluid/DeepASR/data_utils/data_reader.py

-
+                assert feature_frame_num == label_frame_num
+
+                if self._split_sentence_threshold == -1 or self._split_perturb == -1 or self._split_sub_sentence_len == -1 or self._split_sentence_threshold >= feature_frame_num:


Please keep each line under the 80-character limit; this applies to comments as well.
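One way to bring the flagged condition under the limit is to wrap it in parentheses and put one clause on each line. The sketch below is only an illustration: the function name `split_is_skipped` and its plain arguments are hypothetical stand-ins for the `self._split_*` attributes in the diff, not code from this PR.

```python
def split_is_skipped(threshold, perturb, sub_len, frame_num):
    """Return True when the sentence should be kept whole.

    Hypothetical helper: mirrors the condition in the diff, with plain
    arguments in place of the self._split_* attributes.
    """
    # Parentheses allow implicit line continuation, so every line stays
    # well below 80 columns without backslashes.
    return (threshold == -1
            or perturb == -1
            or sub_len == -1
            or threshold >= frame_num)
```

Any of the three settings being -1 disables splitting, and a sentence no longer than the threshold is also left untouched.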

kuke · 2018-02-07T11:36:28Z

LGTM

pkuyym · 2018-02-08T02:47:40Z

fluid/DeepASR/data_utils/data_reader.py

@@ -61,10 +61,21 @@ class SampleInfoBucket(object):
        label_bin_paths (list|tuple): Files containing the binary label data.
        label_desc_paths (list|tuple): Files containing the description of
                                       samples' label data.
+        split_perturb(int): split long sentence' perturb sub-sentence length value. 
+        split_sentence_threshold(int): sentence length large than 


sentence --> Sentence
large --> larger
operator --> operation

sentence length large than split_sentence_threshold trigger split operator. -->
Sentence whose length is larger than the value will trigger the split operation.

pkuyym · 2018-02-08T03:02:38Z

fluid/DeepASR/data_utils/data_reader.py

@@ -61,10 +61,21 @@ class SampleInfoBucket(object):
        label_bin_paths (list|tuple): Files containing the binary label data.
        label_desc_paths (list|tuple): Files containing the description of
                                       samples' label data.
+        split_perturb(int): split long sentence' perturb sub-sentence length value. 


split_perturb(int) --> split_perturb (int)

"split long sentence' perturb sub-sentence length value." Please refine this comment.

Random perturb sub-sentence length when split long sentence

How about "Maximum perturbation value for the length of a sub-sentence when splitting a long sentence."?
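Put together, the docstring suggestions from the two threads above might look like the sketch below. The class name `SampleInfoBucketSketch` is a hypothetical stand-in, and the wording is only one reading of the reviewer's proposals, not the text that was finally merged.

```python
class SampleInfoBucketSketch(object):
    """Hypothetical stand-in used only to show the docstring layout.

    Args:
        split_perturb (int): Maximum perturbation value for the length of
                             a sub-sentence when splitting a long sentence.
        split_sentence_threshold (int): A sentence whose length is larger
                                        than this value triggers the
                                        split operation.
    """

    def __init__(self, split_perturb, split_sentence_threshold):
        # Stored under the same private names the diff uses.
        self._split_perturb = split_perturb
        self._split_sentence_threshold = split_sentence_threshold
```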

pkuyym · 2018-02-08T03:35:20Z

fluid/DeepASR/data_utils/data_reader.py

+                    remain_frame_num = feature_frame_num
+                    while True:
+                        if remain_frame_num > self._split_sentence_threshold:
+                            cur_frame_len = self._split_sub_sentence_len + random.randint(


This seems to exceed 80 columns.

Please use self._rng instead of calling random.randint directly.

self._rng.randint(0, self._split_perturb)
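To illustrate why a dedicated generator helps, here is a minimal sketch of the splitting loop that takes the generator as an argument. It assumes `self._rng` is a `random.Random` instance (whose `randint` includes both endpoints) and that the sub-sentence length is at least 1. The function `split_frame_ranges` and its return format are hypothetical, since the loop body in the diff is truncated.

```python
import random


def split_frame_ranges(frame_num, threshold, sub_len, perturb, rng):
    """Return (start, length) pairs that cover frame_num frames.

    Hypothetical sketch: rng is assumed to be a random.Random instance
    and sub_len is assumed to be at least 1.
    """
    # Any -1 disables splitting; short sentences are kept whole.
    if -1 in (threshold, sub_len, perturb) or threshold >= frame_num:
        return [(0, frame_num)]

    ranges = []
    start = 0
    remain = frame_num
    while remain > threshold:
        # Nominal length plus 0..perturb extra frames, never more than
        # what is left of the sentence.
        cur_len = min(sub_len + rng.randint(0, perturb), remain)
        ranges.append((start, cur_len))
        start += cur_len
        remain -= cur_len
    if remain > 0:
        # The tail is no longer than the threshold, so it stays whole.
        ranges.append((start, remain))
    return ranges
```

Passing a seeded `random.Random` makes the split points reproducible across runs, which the module-level `random.randint` does not guarantee when other code draws from the same global generator.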

pkuyym · 2018-02-08T03:35:46Z

fluid/DeepASR/data_utils/data_reader.py

@@ -244,11 +291,20 @@ def read_bytes(fpath, start, size):
                                           sample_info.feature_start,
                                           sample_info.feature_size)

+                assert sample_info.feature_frame_num * sample_info.feature_dim * 4 == len(


This seems to exceed 80 columns.
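A size check like this can be shortened by computing the expected byte count on its own line first. The sketch below assumes the factor 4 is the size of one 32-bit float. The helper `check_feature_bytes` and the argument `feature_bytes` are hypothetical names, since the right-hand side of the assert is truncated in the diff.

```python
import struct

FLOAT32_SIZE = 4  # assumption: each feature value is a 4-byte float


def check_feature_bytes(feature_bytes, frame_num, feature_dim):
    """Assert that the raw buffer holds frame_num * feature_dim floats."""
    expected_size = frame_num * feature_dim * FLOAT32_SIZE
    # Splitting the arithmetic from the assert keeps both lines short.
    assert expected_size == len(feature_bytes), (
        "expected %d bytes, got %d" % (expected_size, len(feature_bytes)))
    return expected_size


# Usage: two frames of three float32 values each occupy 24 bytes.
demo_bytes = struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
demo_size = check_feature_bytes(demo_bytes, frame_num=2, feature_dim=3)
```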

pkuyym · 2018-02-08T03:36:00Z

fluid/DeepASR/data_utils/data_reader.py

@@ -273,7 +329,8 @@ def read_bytes(fpath, start, size):
                    time.sleep(0.001)

                # drop long sentence
-                if self._drop_frame_len >= sample_data[0].shape[0]:
+                if self._drop_frame_len == -1 or self._drop_frame_len >= sample_data[


Same as above.
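The drop condition can also be kept short by naming the "policy disabled" case. This is a sketch only: `keep_sample` is a hypothetical helper, and `frame_num` stands in for `sample_data[0].shape[0]` from the diff.

```python
def keep_sample(drop_frame_len, frame_num):
    """Return True when the sample should be kept.

    Hypothetical helper: frame_num stands in for
    sample_data[0].shape[0] in the diff.
    """
    # -1 disables the drop policy, so every sample is kept.
    policy_disabled = drop_frame_len == -1
    return policy_disabled or drop_frame_len >= frame_num
```

With the policy enabled, a sample is kept only when it is no longer than `drop_frame_len` frames.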

pkuyym

LGTM

zhxfl added 2 commits February 6, 2018 15:32

Merge branch 'develop', remote branch 'origin' into fix-627

85d8e5c

split long sentence, drop sentence close by -1

84d28c3

zhxfl requested review from pkuyym and kuke February 6, 2018 09:29

zhxfl added the DeepASR label Feb 6, 2018

add more assert check

8d39596

kuke reviewed Feb 7, 2018


zhxfl added 5 commits February 7, 2018 16:01

merge develop

8bb8132

code line too long

6fe9377

merge develop

d6be02c

code line too long

42bb326

comment drop_long_sentence -1 to disable the policy

ec4fb54

pkuyym requested changes Feb 8, 2018


zhxfl added 3 commits February 8, 2018 12:37

fix by review

93fecc8

fix by review

bf28104

fix by review

6e6ed6b

pkuyym approved these changes Feb 8, 2018


zhxfl merged commit 1184109 into PaddlePaddle:develop Feb 8, 2018
