Segmentation fault when extracting feature and raise ERROR when training #3

Open
YangangCao opened this issue Aug 2, 2021 · 2 comments


YangangCao commented Aug 2, 2021

Hi, thanks for your work. I ran into two problems:

  1. I set count to 10000000 and used the original 48 kHz speech and noise as the dataset. The output looks like this:
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:1
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:2
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:3
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:4
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:5
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:6
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:7
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:8
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:9
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:10
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:11
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:12
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:13
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:14
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:15
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:16
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:17
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:18
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:19
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:20
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:21
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:22
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:23
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:24
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:25
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:26
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:27
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:28
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:29
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:30
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:31
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:32
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:33
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:34
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:35
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:36
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:37
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:38
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:39
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:40
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:41
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:42
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:43
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:44
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:45
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:46
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:47
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:48
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:49
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:50
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:51
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:52
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:53
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:54
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:55
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:56
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:57
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:58
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:59
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:60
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:61
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:62
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:63
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:64
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:65
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:66
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:67
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:68
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:69
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:70
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:71
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:72
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:73
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:74
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:75
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:76
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:77
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:78
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:79
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:80
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:81
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:82
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:83
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:84
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:85
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:86
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:87
Segmentation fault (core dumped)

I retried several times, and the segmentation fault occurs every time after the total count reaches 87. I have no idea why. When I set count to 1000000, it works normally, so I used that setting to train the model and ran into the second problem.

  2. When I run rnn_train.py (tensorflow-gpu 2.5.0), it raises an error like this:
Traceback (most recent call last):
  File "rnn_train.py", line 206, in <module>
    callbacks=[checkpoint_cb],
  File "/home/edev/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1122, in fit
    steps_per_execution=self._steps_per_execution)
  File "/home/edev/.local/lib/python3.6/site-packages/keras/engine/data_adapter.py", line 1348, in get_data_handler
    return DataHandler(*args, **kwargs)
  File "/home/edev/.local/lib/python3.6/site-packages/keras/engine/data_adapter.py", line 1136, in __init__
    adapter_cls = select_data_adapter(x, y)
  File "/home/edev/.local/lib/python3.6/site-packages/keras/engine/data_adapter.py", line 978, in select_data_adapter
    _type_name(x), _type_name(y)))
ValueError: Failed to find data adapter that can handle input: <class '__main__.CustomDataGen'>, <class 'NoneType'>

Can you please tell me your TensorFlow version?

Update: I changed some code and fixed the second problem. However, training only works on the CPU; when I use the GPU, it raises this error:
Unknown: CUDNN_STATUS_BAD_PARAM

Can you please tell me whether you are able to train on the GPU?

YangangCao changed the title from "training use GPU raise ERROR" to "Segmentation fault when extracting feature and raise ERROR when training" on Aug 3, 2021

xyx361100238 commented

Yes, I have the same problem:
x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:1

Why does NaN occur? I checked RNNoise with the same data, and it never happened.

cookcodes (Owner) commented Aug 9, 2021

I use TensorFlow 2.3.0 and Keras 2.4.3.
Yes, the GPU is used for training.
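
The ValueError in the traceback above is raised when Keras does not recognize the object passed to fit as a supported input type. One possible cause (an assumption, not confirmed in this thread) is that CustomDataGen inherits from a base class that belongs to a different Keras package than the one running fit, which can happen with versions other than the ones listed here. A minimal sketch for checking where the base classes come from; the Sequence and CustomDataGen classes below are stand-ins so that the snippet runs on its own, and base_class_origins is a hypothetical helper name:

```python
def base_class_origins(obj):
    """Return 'module.ClassName' for every class in obj's method resolution order."""
    return [f"{cls.__module__}.{cls.__qualname__}" for cls in type(obj).__mro__]


# Stand-in classes (assumptions) so the sketch runs without TensorFlow installed.
class Sequence:
    """Placeholder for the Keras Sequence base class."""


class CustomDataGen(Sequence):
    """Placeholder for the generator class defined in rnn_train.py."""


if __name__ == "__main__":
    # In rnn_train.py, pass the real generator instance instead.
    for name in base_class_origins(CustomDataGen()):
        print(name)
```

If the module names printed for the real generator start with a different package than the one shown in the traceback paths, a version or import mismatch is worth investigating.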

Also, in rnn_train.py, some numbers need to be changed based on the size of your input data:
traingen = CustomDataGen(35200,batch_size,0,35200 *window_size,window_size)
valgen = CustomDataGen(4800,batch_size,35200 *window_size,40000 *window_size,window_size)
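
The numbers in the two lines above can be derived from the data size rather than hard-coded. The sketch below assumes, based only on those two calls, that the first argument is the number of sequences and that the third and fourth arguments are start and end frame offsets. The helper name split_ranges, the 12 percent validation share, and the window size of 2000 in the usage example are illustrative assumptions, not values taken from the repository:

```python
def split_ranges(total_frames, window_size, val_percent=12):
    """Compute sequence counts and frame offsets for a train/validation split.

    Returns two tuples (n_sequences, start_frame, end_frame), one for training
    and one for validation. Integer arithmetic avoids floating-point rounding.
    """
    total_sequences = total_frames // window_size  # drop any incomplete window
    n_val = total_sequences * val_percent // 100
    n_train = total_sequences - n_val
    train = (n_train, 0, n_train * window_size)
    val = (n_val, n_train * window_size, total_sequences * window_size)
    return train, val


if __name__ == "__main__":
    window_size = 2000  # illustrative value
    train, val = split_ranges(40000 * window_size, window_size)
    print(train)
    print(val)
```

With these results, the calls would become CustomDataGen(train[0], batch_size, train[1], train[2], window_size) and CustomDataGen(val[0], batch_size, val[1], val[2], window_size), under the same assumption about the argument order.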

Regarding "x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count": some code was added to check whether NaN is generated during pitch filtering. Whether this code is still needed requires further checking. For now, you can ignore these messages.

As for the segmentation fault (core dump), it may be caused by insufficient memory. You can use gdb to find the root cause.
