Is this code used in pyannote-audio >= 2.1? #247

Open
nikifori opened this issue Oct 25, 2024 · 4 comments
Labels
question: Further information is requested

Comments

@nikifori

Based on this paper, may I assume that pyannote-audio versions >= 2.1 use the diart methodology?

For example, if I run this code, will it be executed in an online manner?

    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
    pipeline.to(device)

    # Apply the pipeline to the audio file
    diarization = pipeline(
        audio_path,
        num_speakers=8,
    )

Thanks

@juanmc2005 added the question label on Nov 6, 2024
@juanmc2005
Owner

Hi @nikifori! diart leverages pyannote.audio models, but pyannote.audio does not provide online inference in its pipelines.
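For comparison, streaming inference in diart itself looks roughly like this (a minimal sketch along the lines of the README; the microphone source and output path are placeholders):

    from diart import SpeakerDiarization
    from diart.sources import MicrophoneAudioSource
    from diart.inference import StreamingInference
    from diart.sinks import RTTMWriter

    pipeline = SpeakerDiarization()
    mic = MicrophoneAudioSource()
    inference = StreamingInference(pipeline, mic)

    # Write predictions to an RTTM file as they are produced
    inference.attach_observers(RTTMWriter(mic.uri, "output/live.rttm"))
    prediction = inference()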

@ywangwxd

ywangwxd commented Dec 20, 2024

Based on this paper, may I assume that pyannote-audio versions >= 2.1 use the diart methodology?

For example, if I run this code, will it be executed in an online manner?

    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
    pipeline.to(device)

    # Apply the pipeline to the audio file
    diarization = pipeline(
        audio_path,
        num_speakers=8,
    )

Thanks

I have successfully upgraded the dependent packages to their latest versions based on the feat/diart-asr branch:

pyannote.audio 3.3.2
pyannote.core 5.0.0
pyannote.database 5.1.0
pyannote.metrics 3.2.1
pyannote.pipeline 3.0.1
pytorch-lightning 2.4.0
pytorch-metric-learning 2.8.1
torch 2.5.1
torch-audiomentations 0.11.1
torch_pitch_shift 1.2.5
torchaudio 2.5.1
torchmetrics 0.11.4
torchvision 0.20.1

So yes, you can do it. It shouldn't be difficult; just fix any issues you encounter along the way. I did this because I dislike being restricted to older package versions.
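To double-check which versions actually ended up installed after an upgrade like this, a quick standard-library check is enough (a small sketch; add or drop package names as needed):

    from importlib.metadata import version

    # Print the installed version of each distribution
    for pkg in ("pyannote.audio", "torch", "torchaudio", "pytorch-lightning"):
        print(pkg, version(pkg))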

@juanmc2005
Owner

@ywangwxd would you mind opening a PR with the updated dependencies? I've been meaning to do this for some time.

@ywangwxd

ywangwxd commented Dec 23, 2024

@ywangwxd would you mind opening a PR with the updated dependencies? I've been meaning to do this for some time.

Hi Juanmc2005,

For some reason, it is not convenient for me to commit code onto github.com, so I have attached the two files I changed. I reproduced the upgrade again before posting: the whole process was smooth, I did not see any conflicts, and there is no need to change any code. But if you or anyone else does find a conflict, please let me know and I will try to solve it. I have also been working on integrating faster-whisper into the feat/diart-asr branch and have completed that. All of my work is based on this configuration, so there should not be any big issues.

requirements.txt
setup.txt
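As for the faster-whisper part, the integration boils down to calls along these lines (a hypothetical minimal sketch, not the actual branch code; model size and audio path are placeholders):

    from faster_whisper import WhisperModel

    # Load a Whisper checkpoint with CTranslate2 acceleration
    model = WhisperModel("small", device="cuda", compute_type="float16")

    # transcribe() returns a lazy generator of segments plus stream info
    segments, info = model.transcribe("audio.wav", beam_size=5)
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")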

There are two warning messages, but I have ignored them.

Model was trained with pyannote.audio 0.0.1, yours is 3.3.2. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.5.1+cu124. Bad things might happen unless you revert torch to 1.x.

Btw, I am using Python 3.10.16 on Linux.

Please do not forget to rename setup.txt to setup.cfg.
