This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Azure Kinect DK in C# mono instead of 8 channel audio #29

Open
Weatwagon opened this issue Aug 26, 2019 · 7 comments

Comments

@Weatwagon

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I converted the Java sample Cts.java to a C# console app targeting .NET Framework 4.7.2 or .NET Core 2.2 (Gist: https://gist.github.com/Weatwagon/9be875a040bfaa0439eb0109104c9fdf)

Any log messages given by the failure

From speech config log:
(526): 174ms SPX_DBG_TRACE_VERBOSE: Microsoft::CognitiveServices::Speech::Impl::ISpxPropertyBagImpl::LogPropertyAndValue: this=0x00000186DAB8D2D8; name='AudioConfig_NumberOfChannelsForCapture'; value='1'

Expected/desired behavior

The speech config log should show a value of 8 for AudioConfig_NumberOfChannelsForCapture, as in the following line from the Java speech config log, which I was able to run successfully:
(692): 100ms SPX_DBG_TRACE_VERBOSE: property_bag_impl.h:120 Microsoft::CognitiveServices::Speech::Impl::ISpxPropertyBagImpl::LogPropertyAndValue: this=0x000000002179C648; name='AudioConfig_NumberOfChannelsForCapture'; value='8'
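
For anyone comparing logs the same way, here is a minimal sketch in Python that pulls the channel count out of a speech config log. It assumes the property line looks like the two excerpts above; the function name `capture_channels` is made up for illustration and is not part of the Speech SDK.

```python
import re

# Pattern based on the two log excerpts quoted in this issue (an assumption:
# other SDK versions may format the line differently).
_PROP = re.compile(
    r"name='AudioConfig_NumberOfChannelsForCapture';\s*value='(\d+)'"
)


def capture_channels(log_text):
    """Return the last logged channel count as an int, or None if absent."""
    values = _PROP.findall(log_text)
    return int(values[-1]) if values else None
```

Passing the full text of the log file to `capture_channels` should give 1 for the failing C# run and 8 for the working Java run, going by the lines quoted here.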

OS and Version?

Windows 10, 1903 Ent.

Versions

1.6.0.28

Mention any other details that might be useful

Running the Java example, I experience no issues with transcription and participant identification. When I run the C# version, I don't receive any errors, but no transcription happens. Looking at the log files, I noticed that the difference in output from Java to C# was the AudioConfig_NumberOfChannelsForCapture value.


Thanks! We'll be in touch soon.

@jychoudh
Collaborator

jychoudh commented Sep 4, 2019

Thanks for reporting the issue!

My initial guess would be that Microsoft.CognitiveServices.Speech.extension.pma.dll, libpma.dll and libunimic_runtime.dll are not present at the correct location. These libraries are required to record correctly with the Azure Kinect DK and to do conversation transcription. Can you tell me where you are placing them? How did you get the other libraries, apart from these three?

Thanks,
Jyotesh

@Weatwagon
Author

I am using the NuGet package manager to install Microsoft.CognitiveServices.Speech. After compiling, I can see the following in the bin directory:

  • Microsoft.CognitiveServices.Speech.core.dll
  • Microsoft.CognitiveServices.Speech.csharp.dll
  • Microsoft.CognitiveServices.Speech.csharp.xml
  • Microsoft.CognitiveServices.Speech.extension.kws.dll

I found Microsoft.CognitiveServices.Speech.extension.pma.dll in the Java sample but was not able to locate libpma.dll and libunimic_runtime.dll. Where do I acquire these? I placed Microsoft.CognitiveServices.Speech.extension.pma.dll in the bin directory, but nothing changed.

@jychoudh
Collaborator

jychoudh commented Sep 9, 2019

Sorry, I wrote the wrong names above. I meant pma.dll and unimic_runtime.dll. They are in the same location as Microsoft.CognitiveServices.Speech.extension.pma.dll in the Java sample. Download sdsdk-jre.zip from https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-devices-sdk-windows-quickstart and unzip it. There will be a JRE-Sample-Release.zip in the unzipped folder. Unzip that too. You will then find all three (Microsoft.CognitiveServices.Speech.extension.pma.dll, pma.dll and unimic_runtime.dll) in the unzipped JRE-Sample-Release folder.

You have to copy all three of these DLLs into the directory where the other DLLs are (the bin directory).
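
Until a package ships these files, the manual copy can be scripted. The following is only a sketch in Python under stated assumptions: the three file names come from this thread, while the helper names `missing_dlls` and `copy_dlls` and both directory arguments are hypothetical and depend on where you unzipped the sample and where your project builds.

```python
from pathlib import Path
import shutil

# File names as listed in this thread.
REQUIRED_DLLS = (
    "Microsoft.CognitiveServices.Speech.extension.pma.dll",
    "pma.dll",
    "unimic_runtime.dll",
)


def missing_dlls(bin_dir):
    """Return the required DLL names that are not present in bin_dir."""
    bin_dir = Path(bin_dir)
    return [name for name in REQUIRED_DLLS if not (bin_dir / name).is_file()]


def copy_dlls(sample_dir, bin_dir):
    """Copy the required DLLs from sample_dir into bin_dir.

    Returns the names that could not be found in sample_dir.
    """
    sample_dir, bin_dir = Path(sample_dir), Path(bin_dir)
    not_found = []
    for name in REQUIRED_DLLS:
        src = sample_dir / name
        if src.is_file():
            shutil.copy2(src, bin_dir / name)
        else:
            not_found.append(name)
    return not_found
```

Calling `copy_dlls` with the unzipped JRE-Sample-Release folder and your bin directory, then `missing_dlls` on the bin directory, gives a quick check that nothing was left behind. The copy has to be repeated whenever the output directory is cleaned.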

@Weatwagon
Author

Yes, that did the trick. Conversation transcription is now working in my project. Is there a specific NuGet package that adds these DLLs automatically, or must I add the references to the project manually so they are copied on compile?

@jychoudh
Collaborator

jychoudh commented Sep 9, 2019

Unfortunately, we don't have a NuGet package as of release 1.6. We are planning to add it in a future release. For now, you will have to manually copy the .dlls. Sorry about that.

@Weatwagon
Author

My only remaining issue is that I'm getting a lot of false identifications, which I assume is due to poor voice signatures. Is there somewhere I can learn more about what a quality voice signature should be, or how to reduce incorrect identifications?

Also, I would like to use a recording of a previous conversation, because right now I'm driving my co-workers crazy asking them to "hey say something" (open office floor plan) every time I want to test. What is the best way to capture an 8-channel audio file from the Azure Kinect DK?

@jychoudh
Collaborator

I don't have an answer to your first question. I will ask around internally and get back to you.

You can capture 7-channel audio in Audacity using the Azure Kinect DK. In the Device Toolbar, change Audio Host to Windows WASAPI, Recording Device to Azure Kinect Microphone Array, and Recording Channels to 7. In the Selection Toolbar, change Project Rate to 16 kHz. Capture the audio using this setup. Once you are done recording, note the length of the audio you recorded. Now click the Tracks menu item -> Add New -> Mono Track. Then select the new track and click the Generate menu item -> Silence. Enter a duration equal to your recording duration. You should now have 8-channel audio, with the first 7 channels containing the recorded audio and the 8th channel containing silence.
Now go to the Edit menu item -> Preferences -> Import / Export. Make sure that the "Use custom mix" option is selected in the right pane. Then go to the File menu item -> Export Audio. Select "WAV (Microsoft) signed 16-bit PCM" as the Save as type and save the file. Verify that the exported file is an 8-channel, 16-bit PCM, 16 kHz WAV file.
Then you can use the code given here to do conversation transcription.
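
The final verification step (8 channels, 16-bit PCM, 16 kHz) can be done with a few lines of Python using the standard `wave` module. This is a rough sketch, not an official tool; the function name `check_wav` is invented for illustration. One caveat: on older Python versions `wave.open` may raise `wave.Error` if the exporter wrote a WAVE_FORMAT_EXTENSIBLE header, which some tools use for multichannel files.

```python
import wave


def check_wav(path, channels=8, sample_width_bytes=2, rate_hz=16000):
    """Compare a WAV header with the expected format.

    Returns a list of human-readable mismatches; an empty list means the
    header matches (by default: 8 channels, 16-bit samples, 16 kHz).
    """
    with wave.open(path, "rb") as wf:
        actual = (wf.getnchannels(), wf.getsampwidth(), wf.getframerate())
    expected = (channels, sample_width_bytes, rate_hz)
    labels = ("channels", "bytes per sample", "sample rate (Hz)")
    return [
        f"{label}: expected {want}, got {got}"
        for label, want, got in zip(labels, expected, actual)
        if want != got
    ]
```

An empty list from `check_wav` on your exported file means the header matches what is described above. It only inspects the header, so it does not confirm that the 8th channel is actually silent.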

Thanks,
Jyotesh
