What causes the volume to fluctuate between loud and quiet? #45

skyliwq · 2024-03-02T09:29:42Z

What causes the generated Chinese speech to fluctuate between loud and quiet, especially with long text?

skyliwq · 2024-03-02T10:00:48Z

At its quietest, the audio is almost inaudible.

skyliwq · 2024-03-06T06:00:54Z

Maintainers, can the fluctuating volume problem be fixed?

Zengyi-Qin · 2024-03-07T15:02:26Z

Please just amplify the volume or use some post processing normalizing technique

skyliwq · 2024-03-07T21:19:55Z

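> Please just amplify the volume or use some post processing normalizing technique

I need to hook this up to a large language model. The speech volume keeps fluctuating, which makes for a poor experience. I hope this can be improved.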

luobotaxinghu · 2024-03-08T02:43:57Z

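Judging from the logs, the text is split and synthesized one sentence at a time, so the volume of each sentence inevitably ends up misaligned. According to @Zengyi-Qin's reply, the suggested fix is post-processing such as normalization.
This should really be handled inside the project rather than left to the user. Once the user has the whole concatenated audio, they cannot fix it; the intermediate code that processes the per-sentence audio needs to change.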
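The per-sentence leveling described above could be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each synthesized sentence is available as a float NumPy array in the range [-1, 1], and the function names, the -20 dBFS RMS target, and the 30 dB gain cap are all hypothetical choices made for the example. It uses plain RMS level rather than a perceptual loudness measure.

```python
import numpy as np


def rms_dbfs(segment: np.ndarray) -> float:
    """Return the RMS level of a segment in dB relative to full scale (1.0)."""
    rms = np.sqrt(np.mean(segment.astype(np.float64) ** 2))
    # Floor the value so that a silent segment does not produce log10(0).
    return 20.0 * np.log10(max(rms, 1e-12))


def normalize_segment(segment: np.ndarray,
                      target_dbfs: float = -20.0,
                      max_gain_db: float = 30.0) -> np.ndarray:
    """Scale one sentence so that its RMS level reaches target_dbfs.

    The gain is capped at max_gain_db (an assumed safety limit) so that
    near-silent segments are not boosted into pure noise.
    """
    gain_db = min(target_dbfs - rms_dbfs(segment), max_gain_db)
    return segment * (10.0 ** (gain_db / 20.0))


def join_normalized(segments, target_dbfs: float = -20.0) -> np.ndarray:
    """Level every sentence separately, then concatenate the results."""
    leveled = [normalize_segment(s, target_dbfs) for s in segments]
    # Clip as a last resort in case a boosted peak exceeds full scale.
    return np.clip(np.concatenate(leveled), -1.0, 1.0)
```

Because the gain is computed per sentence before concatenation, a quiet sentence and a loud sentence end up at the same RMS level, which is not possible once only the joined audio is available.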

skyliwq · 2024-03-08T05:41:24Z

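@Zengyi-Qin, yes. If that is the case, the project loses its value in some application scenarios.

> Judging from the logs, the text is split and synthesized one sentence at a time, so the volume of each sentence inevitably ends up misaligned. According to @Zengyi-Qin's reply, the suggested fix is post-processing such as normalization. This should really be handled inside the project rather than left to the user. Once the user has the whole concatenated audio, they cannot fix it; the intermediate code that processes the per-sentence audio needs to change.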

MissingTwins · 2024-03-08T09:20:59Z

pip install ffmpeg-normalize
ffmpeg-normalize input.wav -c:a libopus -b:a 128k -o output.oga -f

WARNING: Input file had loudness range of 10.1. This is larger than the loudness range target (7.0). Normalization will revert to dynamic mode.

Well, normalization does not solve the issue. The dynamic range remains too wide, with the volume fluctuating randomly between loud and soft.

andyweiqiu · 2024-03-12T03:57:11Z

> pip install ffmpeg-normalize
> ffmpeg-normalize input.wav -c:a libopus -b:a 128k -o output.oga -f
>
> WARNING: Input file had loudness range of 10.1. This is larger than the loudness range target (7.0). Normalization will revert to dynamic mode.
>
> Well, normalization does not solve the issue. The dynamic range remains too wide, with the volume fluctuating randomly between loud and soft.

Processing the file directly definitely won't work: the volume of the whole audio is raised or lowered together, so it sounds no different from the original output. The processing has to happen where the individual segments are output.
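The point about whole-file gain can be checked with a small numeric sketch. The two sine bursts below are synthetic stand-ins for a loud and a quiet sentence, and every name and number is made up for illustration. A single gain applied to the joined audio leaves the level gap between the two sentences unchanged, while separate gains per segment can close it.

```python
import numpy as np


def rms_db(x: np.ndarray) -> float:
    """RMS level in dB relative to full scale (1.0)."""
    return 20.0 * np.log10(np.sqrt(np.mean(x.astype(np.float64) ** 2)))


rate = 16000
t = np.arange(rate) / rate
loud = 0.5 * np.sin(2 * np.pi * 200 * t)    # stand-in for a loud sentence
quiet = 0.05 * np.sin(2 * np.pi * 200 * t)  # stand-in for a quiet sentence

# Level gap before any processing: 20 * log10(0.5 / 0.05), about 20 dB.
gap_before = rms_db(loud) - rms_db(quiet)

# One global gain (+6 dB here) shifts both sentences by the same amount,
# so the gap between them stays the same.
global_gain = 10.0 ** (6.0 / 20.0)
gap_after_global = rms_db(global_gain * loud) - rms_db(global_gain * quiet)

# Separate gains per segment bring both sentences to a common RMS target.
target_rms = 0.1
loud_fixed = loud * (target_rms / np.sqrt(np.mean(loud ** 2)))
quiet_fixed = quiet * (target_rms / np.sqrt(np.mean(quiet ** 2)))
gap_after_per_segment = rms_db(loud_fixed) - rms_db(quiet_fixed)

print(f"gap before:            {gap_before:.2f} dB")
print(f"gap after global gain: {gap_after_global:.2f} dB")
print(f"gap after per-segment: {gap_after_per_segment:.2f} dB")
```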

v3ucn · 2024-05-03T10:53:42Z

pip install pyloudnorm soundfile

import soundfile as sf
import pyloudnorm as pyln

# Load the audio file
data, rate = sf.read(r"D:\Downloads\output_v2_zh.wav")

# Peak-normalize to -1 dB
peak_normalized_audio = pyln.normalize.peak(data, -1.0)

# Measure the integrated loudness
meter = pyln.Meter(rate)
loudness = meter.integrated_loudness(data)

# Loudness-normalize to -12 dB LUFS
loudness_normalized_audio = pyln.normalize.loudness(data, loudness, -12.0)

# Save the loudness-normalized audio
sf.write("./normalized_audio.wav", loudness_normalized_audio, rate)
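The -1 dB peak figure above is a level on the decibel scale, which maps to a linear amplitude of 10^(dB/20). Raising speech to a loudness target may push some peaks past full scale, so one option is to apply a peak ceiling after the gain step. The helper below is a hypothetical sketch that uses only NumPy and is not part of pyloudnorm. It scales the audio down only when its peak exceeds the ceiling.

```python
import numpy as np


def db_to_linear(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def apply_peak_ceiling(audio: np.ndarray, ceiling_db: float = -1.0) -> np.ndarray:
    """Scale audio down so its absolute peak does not exceed ceiling_db.

    Audio that is already below the ceiling is returned unchanged.
    """
    peak = float(np.max(np.abs(audio))) if audio.size else 0.0
    ceiling = db_to_linear(ceiling_db)  # -1 dB is roughly 0.891 in linear terms
    if peak <= ceiling:
        return audio
    return audio * (ceiling / peak)
```

Unlike clipping, this keeps the waveform shape intact, at the cost of lowering the overall level slightly when the ceiling is hit.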

v3ucn mentioned this issue May 3, 2024

Fix fluctuating volume in Chinese speech inference #122

Closed

zgldh mentioned this issue Dec 8, 2024

fix: fix_loudness to -12 dB #221

Open
