Comparison of spectrogram between librosa and torchlibrosa #16

Open
ChrisNick92 opened this issue Nov 30, 2024 · 0 comments

Hello everyone, and thank you for this library.

I am trying to understand how librosa and torchlibrosa correspond for spectrogram extraction.
According to the description, the Spectrogram method of torchlibrosa corresponds to the stft method of librosa.

I wrote the following script to check whether this is true, but the two libraries give different results:

import numpy as np
import librosa
import torchlibrosa as tl
import torch

if __name__ == "__main__":

    # Random-noise test signal: 8,000 samples (1 second at 8 kHz)
    audio = np.random.randn(1, 8_000).astype(np.float32)

    # Extract Spectrogram Amplitude with Librosa
    Specl = np.squeeze(np.abs(librosa.stft(y=audio, n_fft=1024, hop_length=256)))
    # Shape: (Freq bins x Time bins)
    print(f"Librosa spec shape: {Specl.shape}")

    # Extract Spectrogram Amplitude with TL
    spec_extractor = tl.Spectrogram(n_fft=1024, hop_length=256, power=1)
    Spectl = torch.squeeze(spec_extractor(torch.from_numpy(audio))).T
    # Shape (Freq bins x Time bins)
    print(f"TorchLibrosa spec shape: {Spectl.shape}")

    # Compare
    print("\n\nLibrosa Spectrogram\n", Specl, "\n\n\n", "Torch Librosa Spectrogram\n", Spectl)
    Specl = torch.from_numpy(Specl)

    print(f"\nL infty norm of difference: {torch.abs(Specl - Spectl).max()}")

When I run the script with python librosa_comparison.py, I get the following output:

Librosa spec shape: (513, 32)
TorchLibrosa spec shape: torch.Size([513, 32])


Librosa Spectrogram
 [[19.054897   1.9908019 17.752724  ... 12.835096  22.917797  10.363778 ]
 [20.790785  24.919458  24.746351  ... 25.476685  15.694681   5.6991477]
 [21.582727  18.81189   16.363668  ... 35.682972  10.927736   3.178584 ]
 ...
 [14.2219305 32.846138  41.570446  ... 15.117013   5.9180603 10.4871025]
 [ 9.384632  12.503995  30.297691  ... 10.411208   8.746181  13.068881 ]
 [ 7.3271813 19.286459  19.321154  ...  1.8899674 15.552687  15.774155 ]] 


 Torch Librosa Spectrogram
 tensor([[38.4898,  8.1263, 17.7527,  ..., 12.8351, 24.5029, 17.0312],
        [38.6057, 30.1623, 24.7463,  ..., 25.4767, 14.2940,  9.5362],
        [24.1956, 16.3059, 16.3637,  ..., 35.6830, 12.6571,  6.5012],
        ...,
        [24.0833, 35.3503, 41.5704,  ..., 15.1170,  6.7059, 10.9721],
        [18.3864,  9.4011, 30.2977,  ..., 10.4112,  8.4296, 19.5919],
        [14.2744, 22.7395, 19.3212,  ...,  1.8900, 17.0934, 27.0281]])

L infty norm of difference: 34.372806549072266
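
In the printed matrices, the third column and the third-from-last column agree to the printed precision, while the outer columns do not. A possible reading is that only frames overlapping the padded signal boundary differ. The sketch below works out which frames those would be. It assumes centered framing (the signal is padded by n_fft // 2 samples on each side before framing), and the helper name `frames_touching_padding` is hypothetical, not part of either library.

```python
def frames_touching_padding(n_samples, n_fft, hop_length):
    """Return indices of centered STFT frames that overlap the padded region.

    Assumes the signal is padded by n_fft // 2 samples on each side, so
    frame t covers original samples [t*hop - n_fft//2, t*hop + n_fft//2).
    """
    half = n_fft // 2
    n_frames = 1 + n_samples // hop_length
    return [
        t
        for t in range(n_frames)
        if t * hop_length - half < 0 or t * hop_length + half > n_samples
    ]


# Parameters from the script above: 8,000 samples, n_fft=1024, hop_length=256
affected = frames_touching_padding(8_000, 1024, 256)
print(affected)  # [0, 1, 30, 31]
```

Under these assumptions, frames 0, 1, 30 and 31 of the 32 frames overlap the padding, which matches the columns that disagree in the output above.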

Am I doing anything wrong? What is the relation between the Spectrogram method of torchlibrosa and the stft method of librosa?

Thanks in advance!
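
One candidate explanation, not confirmed by the maintainers: the two calls may use different boundary padding. It is assumed here, and worth verifying against the installed versions, that recent librosa releases default to pad_mode="constant" in stft while torchlibrosa's Spectrogram defaults to pad_mode="reflect". The NumPy-only sketch below does not call either library. It uses a hypothetical helper, `magnitude_stft`, to show that changing only the padding mode of a centered STFT alters the edge frames and leaves the interior frames untouched.

```python
import numpy as np


def magnitude_stft(y, n_fft, hop_length, pad_mode):
    """Centered magnitude STFT with a periodic Hann window (illustrative only)."""
    # Pad n_fft // 2 samples on each side so frame t is centered at t * hop_length
    padded = np.pad(y, n_fft // 2, mode=pad_mode)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n_fft) / n_fft)
    n_frames = 1 + (len(padded) - n_fft) // hop_length
    frames = np.stack(
        [padded[t * hop_length : t * hop_length + n_fft] for t in range(n_frames)]
    )
    # Transpose to (frequency bins, time frames)
    return np.abs(np.fft.rfft(frames * window, axis=1)).T


rng = np.random.default_rng(0)
y = rng.standard_normal(8_000)

spec_zero = magnitude_stft(y, 1024, 256, "constant")    # zero padding
spec_reflect = magnitude_stft(y, 1024, 256, "reflect")  # reflect padding

# Largest absolute difference per time frame
col_diff = np.abs(spec_zero - spec_reflect).max(axis=0)
differing = np.flatnonzero(col_diff > 1e-9).tolist()

print(spec_zero.shape)  # (513, 32)
print(differing)        # [0, 1, 30, 31]
```

If padding is the cause, passing the same pad_mode explicitly to both librosa.stft and tl.Spectrogram would be expected to bring the edge columns into agreement. That has not been checked here.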
