Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dsound: don't restrict the host buffer to 16-bit #774

Merged
merged 1 commit into from
Dec 22, 2023

Conversation

dechamps
Copy link
Contributor

@dechamps dechamps commented Feb 25, 2023

Currently, the DirectSound Host API code only creates 16-bit integer DirectSound buffers. All audio data provided by the user is therefore converted by PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the DirectSound Host API.

If the user buffer format is not 16-bit, this also causes pointless conversions to take place, even if the hardware is running at 16-bit, because modern Windows versions (Vista+) convert the data to floating point behind the scenes before it is handed off to the hardware. This can lead to silly situations where 32-bit float samples from the user are (lossily) converted to 16-bit by PortAudio, then ended off to Windows via DirectSound, only to be converted back to 32-bit float again, before finally being converted to the format the hardware device is configured to use. This can easily lead to two layers of 16-bit dithering (one from PortAudio, and one from Windows) being piled on top of each other, resulting in an elevated noise floor.

This commit fixes this problem by configuring the DirectSound buffers to use the same format as the user buffer. This should stop PortAudio from converting samples in all cases except paInt8, which is not supported by WAVEFORMATEX (only paUInt8 is).

The new code assumes that DirectSound will accept whatever format we throw at it. I feel confident that this is always true on Vista+ regardless of hardware, because starting from Vista DirectSound does not access hardware directly - it always goes through the Windows Audio Engine which supports all formats in both directions (I verified this using paloopback).

The consequences of this change on pre-Vista Windows are less clear, although this thread suggests that this might just work even on 2000/XP as KMixer should be able to handle these conversions as well.

Fixes #54

Currently, the DirectSound Host API code only creates 16-bit integer
DirectSound buffers. All audio data provided by the user is therefore
converted by PortAudio to and from 16-bit, regardless of the user buffer
format.

This basically makes it impossible to pass 24-bit audio through the
DirectSound Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via DirectSound, only to be converted back to 32-bit float
again, before finally being converted to the format the hardware device
is configured to use. This can easily lead to two layers of
16-bit dithering (one from PortAudio, and one from Windows) being piled
on top of each other, resulting in an elevated noise floor.

This commit fixes this problem by configuring the DirectSound buffers to
use the same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is).

The new code assumes that DirectSound will accept whatever format we
throw at it. I feel confident that this is always true on Vista+
regardless of hardware, because starting from Vista DirectSound does not
access hardware directly - it always goes through the Windows Audio
Engine which supports all formats in both directions (I verified this
using paloopback).

The consequences of this change on pre-Vista Windows are less clear,
although [1] suggests that this might just work even on 2000/XP as
KMixer should be able to handle these conversions as well.

[1]: https://microsoft.public.win32.programmer.directx.audio.narkive.com/elgNlPIn/

Fixes PortAudio#54
@RossBencina
Copy link
Collaborator

Thanks Etienne, this is a long standing issue and your patch is very simple, which is great.

My first question is: how is dithering and clipping handled? Are these PortAudio flags respected?

Then there is the question of consistent format conversions. If we pass floating point data directly through to the OS it means that the PortAudio float scaling behavior may not be consistent between platforms. (e.g. for floating point values does outputting a +/- 1.0 amplitude sine wave get clipped? does inputting a full range 16 bit signal give the same floating point range as when our converters are used?).

starting from Vista DirectSound does not access hardware directly - it always goes through the
Windows Audio Engine which supports all formats in both directions (I verified this using
paloopback).

To be clear, which Windows versions did you test this on?

My personal view (not shared by all) is that given that DirectSound is a legacy API, and one of the benefits of PA supporting DirectSound is to support older OS versions, I think it would be best to try to avoid risks of breaking backward compatibility. Ideally all the way back to Windows 98. (There is also the question of compatibility with the WINE audio stack.) I can think of three ways forward:

  1. Merge any patch as-is and risk silently breaking compatibility with old Windows versions.
  2. Exhaustive testing of this patch on all old OS versions (not worth it).
  3. Pick an "oldest target" (Windows 7, say) and test on that. Then put an OS version test in the code that switches to the old 16-bit only mode for Windows versions prior to that version.

The consequences of this change on pre-Vista Windows are less clear, although this thread
suggests
that
this might just work even on 2000/XP as KMixer should be able to handle these conversions as well.

Did DirectSound always go via KMixer though? I have forgotten the details.

@RossBencina RossBencina added the src-dsound MS DirectSound Host API /src/hostapi/dsound label Feb 28, 2023
@dechamps
Copy link
Contributor Author

dechamps commented Feb 28, 2023

My first question is: how is dithering and clipping handled? Are these PortAudio flags respected?

My understanding is that PortAudio flags for customizing dithering or clipping behavior are only relevant if PortAudio is actually doing the conversion. I don't see how these flags could possibly apply if the conversion is done downstream of PortAudio.

In this PR, these flags will always be ignored, because PortAudio will never do any conversions (except between UInt8 and Int8, which is trivial). That makes sense to me.

Today, with the current code, these flags don't do what the user expects anyway, because the Windows Audio Engine will do another set of conversions behind the user's back.

Then there is the question of consistent format conversions. If we pass floating point data directly through to the OS it means that the PortAudio float scaling behavior may not be consistent between platforms. (e.g. for floating point values does outputting a +/- 1.0 amplitude sine wave get clipped? does inputting a full range 16 bit signal give the same floating point range as when our converters are used?).

DirectSound hands off the samples to the Windows Audio Engine (through WASAPI internally), where the signal is filtered by APOs, resampled (if necessary), mixed, possibly gain-adjusted if software volume control is in use, passed through CAudioLimiter (which does "soft" clipping), and finally converted to integer (with dithering) for handoff to the hardware. (See Microsoft docs e.g. Windows Audio Architecture, Audio Processing Object Architecture, my APO notes, etc.)

There is so much processing being done by Windows downstream of PortAudio that I really don't see how PortAudio's "float scaling behavior" could possibly matter in the grand scheme of things. Any decision made by PortAudio in that regard will be immediately undone as soon as the samples get munched through the Windows Audio Engine.

You seem to be operating under the assumption that DirectSound is a bit-perfect pipeline. It absolutely is not - not in post-Vista Windows, anyway. Users who care about bit-perfect operation need to use WASAPI Exclusive or WDM-KS.

Even if I were to concede your point, I would argue that avoiding unnecessary lossy conversions is more important for most users.

To be clear, which Windows versions did you test this on?

For this particular PR, Windows 11.

If you're challenging the "starting from Vista DirectSound does not access hardware directly" quote, well, I think it's well-known public knowledge that Microsoft removed direct audio hardware access through DirectSound starting with the new Windows audio architecture introduced in Vista. In fact it made quite a fuss at the time because audio interface manufacturers would complain that they cannot offer hardware accelerated audio anymore.

My personal view (not shared by all) is that given that DirectSound is a legacy API, and one of the benefits of PA supporting DirectSound is to support older OS versions, I think it would be best to try to avoid risks of breaking backward compatibility.

Is it your view that PortAudio users targetting modern OS versions should not use the DirectSound host API?

If that is your view, and the DirectSound Host API is only meant for use on old systems, then that should be documented. Also, we will need to make it clear to contributors that the DirectSound Host API code is frozen in place and that no PRs will be accepted because there is a "zero risk" backward compatibility policy in place. This way I can stop wasting my time sending these PRs. I will need to change the defaults for my application (FlexASIO), because DirectSound is the Host API it uses by default. I made DS the default in 2018 after being disappointed by the reliability of the WASAPI Host API, and I didn't go for MME because DS seemed more "modern" than MME. In retrospect, maybe I should have gone with MME.

I will point out, at the risk of stating the obvious, that everything is a tradeoff. Rejecting a change that improves PortAudio in most situations, on the off chance that perhaps someone out there running a 25-year-old OS might maybe encounter some unproven problem seems like a very odd way to make this tradeoff.

Ideally all the way back to Windows 98.

I think it is utterly unreasonable for an application developer to expect code written in 2023 to work on Windows 98. That is an incredibly odd requirement to have for an incredibly niche use case. Honestly Ross, I almost felt like I was being trolled when I read this. How many people still have a Windows 98 computer that still works and is still in use? How many would expect an application built in 2023 from a 2023 codebase to work on such a computer?

If developers want to build PortAudio applications that run on Windows 98, then why not tell these people (again, how many of these are there??) to use a PortAudio version that is closer to that era? Thanks to the magic of version control, they can take a time machine and go back to any version along the very long history of PortAudio. For example they could pick one from 2006, which is when Microsoft ended extended support for Windows 98. Since Microsoft cares a great deal about backwards compatibility, it is quite likely their application using a PortAudio from 2006 will run just fine on modern Windows as well. (The opposite is much less likely.) But then again I can't believe I'm even arguing about this.

There is also the question of compatibility with the WINE audio stack

The point of WINE is to reproduce Windows behaviour as closely as possible. If Wine behaviour diverges from Windows behaviour, that's a bug in Wine, not a bug in PortAudio. Also, a main selling point of PortAudio is precisely that you can build cross-platform (Windows and Linux) apps with it that do not require the use of Wine!

I can think of three ways forward:

  1. Merge any patch as-is and risk silently breaking compatibility with old Windows versions.

  2. Exhaustive testing of this patch on all old OS versions (not worth it).

  3. Pick an "oldest target" (Windows 7, say) and test on that. Then put an OS version test in the code that switches to the old 16-bit only mode for Windows versions prior to that version.

Okay Ross, here's a thought. And I apologize this is going to be a long rant again, but I think it needs to be said.

Please consider that your bar for accepting PRs into the master branch may be too high. You are asking for a level of testing that is not consistent with the level of resources the PortAudio project has access to. Your contributors are unpaid and do not have access to a lab with various computers running various versions of Windows going all the way back to Windows 98, nor do they have access to a variety of audio devices on a test bench and a QA department to operate it.

Given your contributors cannot meet your impossible standards, I think you're gonna have a really hard time approving any non-trivial PR with these requirements. You might have contributors who care about backward compatibility to such an extreme degree that they're willing to spend time wondering how their code would behave on Windows 98 (and perhaps even test on it, if they're into archaeology as well), but I'm not one of them. I do care, however, about fixing real (not hypothetical) issues that affect users of my applications (in this case FlexASIO) today, on today's systems, and contributing my fixes upstream so that other PortAudio users do not run into the same issues.

To me the current policy that you are trying to enforce for PortAudio PRs is untenable. I would like to suggest a change in policy. Here's the policy I would suggest: PRs will be accepted into master as long as there is no clear indication that they would break reasonable use cases. The author of the PR is not required to demonstrate that their code works with every potential use case on every potential platform, because that's an impractical requirement and no contributor will have the resources nor the motivation to do that. Instead, the burden of proof lies with the person objecting to the PR. That is, in order to prevent a PR from being merged, one has to demonstrate that the proposed change breaks a reasonable use case (either by showing the code itself is wrong, or by testing the code and producing failing test results). In other words: you can only block PRs if you have a real, substantiated concern backed by actual evidence showing that the PR is problematic. You do not get to block PRs on the off chance that perhaps something might maybe break somewhere, as you're trying to do here.

(Separately, I would further argue that Windows 98 does not qualify as a "reasonable use case". Windows 7, sure. But that's a different point.)

Does that lower the bar to get something into master? Yes, definitely. Does that make it more likely for bugs to make it into master? Yes (but conversely, it also makes it more likely for bugfixes such as this one to make it in as well). The goal is to make it easier for people to contribute. It's a different tradeoff. One nice thing about accepting more changes into master is that they get a lot more testing, because application developers will surely do some testing as they pick up the latest master. If things break, then guess what? Commits can be reverted!

A good reason to reject (or revert) a change is if it has been tested and shown to break. But currently, it doesn't even get to that point. Proposed changes are not being tested because contributors are being asked to do all the testing themselves and to produce forms in triplicate showing they have addressed every potential concern (no matter how remote) that anyone could raise, yet they have neither the resources, nor the motivation, nor the time to do so. Maybe things would be different if people were being paid to do this, but they aren't. The only way out, IMHO, is to assume good faith, assume that changes are fine until proven otherwise, get them merged into master, and revert them if someone yells. Is this an ideal process? No, but the only alternative is a dead project with no contributions at all, except for the small subset of obviously safe, boring changes.

So in that spirit, getting back to your list of 3 options above, I would go for (1), get this PR merged as-is, and then if someone observes that the change broke a reasonable use case, then I would be perfectly fine with the commit being reverted or the code being adjusted accordingly. We both agree that (2) is untenable, and I disagree with (3) because I do not believe we should pollute the code with branching statements that make the code harder to read and increase the risk of bugs (due to the additional complexity and possible combinations of states), on the off chance that perhaps something might maybe break on some very old Windows versions. Again, I would be perfectly fine adding such a branch if and when someone actually complains that this change broke them, and produces evidence to back it up. If there is no real evidence that something would break, then the default position should be to keep the code as simple as possible.

@sagamusix
Copy link

Is it even known if PortAudio, in its current state, works on vanilla Windows 98? For example there's #597 which started to use wide-string Windows APIs unconditionally, and those are not available on a vanilla Windows 98 installations (and nobody asked whether that PR was tested on Windows 98). As a consequence PortAudio already requires either Unicows or KernelEx to function on Windows 98 anyway. So far nobody appears to have complained about that fact. If this doesn't stop a developer from making their software available on Windows 98, they should probably take care of not allowing floating-point audio to be used in their software on such systems.

dechamps added a commit to dechamps/portaudio that referenced this pull request Mar 18, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#790
dechamps added a commit to dechamps/portaudio that referenced this pull request Mar 18, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#790
dechamps added a commit to dechamps/portaudio that referenced this pull request Mar 18, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#790
dechamps added a commit to dechamps/portaudio that referenced this pull request Mar 22, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#139
dechamps added a commit to dechamps/portaudio that referenced this pull request Jul 29, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#139
dechamps added a commit to dechamps/portaudio that referenced this pull request Jul 29, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].
Nevertheless, it's difficult to be sure, so this code checks the Windows
version it's running on and preserves the old behavior (i.e. always use
Int16) on pre-Vista Windows.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#139
RossBencina pushed a commit that referenced this pull request Aug 4, 2023
Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].
Nevertheless, it's difficult to be sure, so this code checks the Windows
version it's running on and preserves the old behavior (i.e. always use
Int16) on pre-Vista Windows.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes #139
Copy link
Collaborator

@philburk philburk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can improve resolution for apps using DSound/
This might break some code running on pre-Vista Windows.
If so then we can make the code conditional on OS like we did for MME.

Copy link
Collaborator

@RossBencina RossBencina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we merged #796 it is consistent to merge this too.

We can develop a unified position on format conversions in the future.

@RossBencina RossBencina merged commit 0c19240 into PortAudio:master Dec 22, 2023
bear101 added a commit to bear101/portaudio that referenced this pull request Mar 6, 2024
* Preparation to start patching - TODOS and directory src/hostapi/Oboe

* Added src/hostapi/oboe/README.md

* Added include/pa_oboe.h and src/hostapi/oboe/pa_oboe.cpp

* Added PA_USE_OBOE section to CMakeLists.txt

* Added PA_USE_OBOE section to CMakeLists.txt

* Heavily reworked CMake dependencies, added FindOboe.cmake, updated pa_oboe.cpp, more work needed

* Included build_all_PaOboe.sh, more work needed on CMake

* Included build_all_PaOboe.sh, more work needed on CMake

* Update src/hostapi/oboe/README.md

* bindings/cpp: CMake: support pkgconfig with RelWithDebInfo (PortAudio#822)

bindings/cpp: add  `RelWithDebInfo` to the configurations allowed to install portaudiocpp.pc, otherwise you won't have support for pkg-config when building a release package with separate debugging information. Besides that, RelWithDebInfo is identical to Release.

* Fix MSVC warning C4018 signed/unsigned mismatch (PortAudio#821)

See PortAudio#810

* Built shared library

* Polished bits

* mme: don't restrict the host buffer to 16-bit

Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].
Nevertheless, it's difficult to be sure, so this code checks the Windows
version it's running on and preserves the old behavior (i.e. always use
Int16) on pre-Vista Windows.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#139

* Don't use absolute path when linking to macOS frameworks (PortAudio#829)

fixes PortAudio#828

* Fixing old problems

* Added some debug messages

* added new build script

* Added paOboe in the paHostApi array

* Added paOboe in pa_unix_hostapis.c

* Working patch

* win: New PaWinUtil_GetOsVersion() function for getting Windows OS version. Refactored WASAPI, DS, MME and WDMKS host back-ends to use PaWinUtil_GetOsVersion() instead of direct OS API.

* Ready for other users to compile/build

* Updated oboe/Readme.md

* Added sharing mode selection

* Fixed minor issue with sharing mode selection

* Removed references to KCTI

* Fixed error callback, added performance mode autoselection

* Minor change to CMakeLists, added low latency costant in pa_oboe.h

* Deleted Build_PaOboe.sh

* Ready to push (removed my paths to Oboe directory and Android NDK)

* Update README.md

* Update README.md

* remove all alternative sample conversion code using lrintf in pa_converters.c (PortAudio#403)

removes the code guarded by PA_USE_C99_LRINTF
See PortAudio#390

* updated readme

* fixed CMakeLists.txt

* fixed FindOboe.cmake

* Preparation to start patching - TODOS and directory src/hostapi/Oboe

* Added src/hostapi/oboe/README.md

* Added include/pa_oboe.h and src/hostapi/oboe/pa_oboe.cpp

* Added PA_USE_OBOE section to CMakeLists.txt

* Added PA_USE_OBOE section to CMakeLists.txt

* Heavily reworked CMake dependencies, added FindOboe.cmake, updated pa_oboe.cpp, more work needed

* Included build_all_PaOboe.sh, more work needed on CMake

* Included build_all_PaOboe.sh, more work needed on CMake

* Built shared library

* Polished bits

* Fixing old problems

* Added some debug messages

* added new build script

* Added paOboe in the paHostApi array

* Added paOboe in pa_unix_hostapis.c

* Working patch

* Ready for other users to compile/build

* Updated oboe/Readme.md

* Added sharing mode selection

* Fixed minor issue with sharing mode selection

* Removed references to KCTI

* Fixed error callback, added performance mode autoselection

* Minor change to CMakeLists, added low latency costant in pa_oboe.h

* Deleted Build_PaOboe.sh

* Ready to push (removed my paths to Oboe directory and Android NDK)

* Update src/hostapi/oboe/README.md

* Update README.md

* Update README.md

* updated readme

* fixed CMakeLists.txt

* fixed FindOboe.cmake

* corrected oboe/Readme.md

* Updated oboe/Readme.md

* PulseAudio Portaudio HostAPI (PortAudio#336)

Adds support for PulseAudio API on Linux.

For more information about Pulseaudio visit: https://www.freedesktop.org/wiki/Software/PulseAudio/

---------

Signed-off-by: Mario Kleiner <[email protected]>
Co-authored-by: sqweek <[email protected]>
Co-authored-by: Daniel Schürmann <[email protected]>
Co-authored-by: Mooneer Salem <[email protected]>
Co-authored-by: Be <[email protected]>
Co-authored-by: Mario Kleiner <[email protected]>

* pulseaudio: Move Pulseaudio include in correct place when using autoconf (PortAudio#843)

As Jack and Pulseaudio both needs Ringbuffer that include can
be done in same place. In configure.in also Pulseaudio header
file was included before it was sure that it was really needed.

Commit makes sure that Pulseaudio include is available only if
it's needed as it can cause failing in build if Pulseaudio
develoment files are not available.

* added .idea to gitignore

* added .idea to gitignore

* restored workflows directory

* Minor fixes to FindOboe.cmake

* Enhanced prebuilt libraries compatibility in FindOboe.cmake

* Minor changes to Pa_Oboe/Readme and pa_oboe.cpp

* Removed auto latency tuning in favor of simpler impleentation in pa_oboe.cpp

* Set paFloat32 as default format in pa_oboe.cpp

* Renamed most of the variables according to best coding practices.

* Added separate callback class to fix the single-stream issue.

* Modified OboeEngine accordingly

* Adjusted the code in the rest of pa_oboe.cpp

* Removed stop and close phases of OboeEngine::restartStream

* Updated functions' description

* minor description corrections

* fixed all compiling errors generated by typos

* Added OboeMediator class in place of OboeCallback, that mediates PortAudio C stream struct and the C++ OboeEngine class

* Fixed allocation problem, working PaOboe implementation

* Fix 'pulseaudioHostApi' use-after-free/null ptr deref in PaPulseAudio_Initialize (PortAudio#847)

The call to PaPulseAudio_UnLock( pulseaudioHostApi->mainloop ) in error-label is performed on 'pulseaudioHostApi' after 'pulseaudioHostApi' has been freed by PaPulseAudio_Free and set to NULL.

* wdmks: declare GUIDs with selectany attribute (PortAudio#846)

* wdmks: declare GUIDs with selectany attribute

Match the behavior of guiddef.h in both mingw and the Windows SDK headers. This prevents linking errors caused by multiply defined symbols when linking against certain Windows SDK libs (like dxguid.lib).

* Make sure this works even if DECLSPEC_SELECTANY is not defined

---------

Co-authored-by: Ross Bencina <[email protected]>

* fixed README.md indenting

* Removed .idea folder from repo

* replaced 'g_' with 'paOboe_' in non-static global variables

* replaced 'm_' prefix with 'm' prefix

* fixed OboeEngine::tryStream as requested

* PulseAudio Portaudio HostAPI (PortAudio#336)

Adds support for PulseAudio API on Linux.

For more information about Pulseaudio visit: https://www.freedesktop.org/wiki/Software/PulseAudio/

---------

Signed-off-by: Mario Kleiner <[email protected]>
Co-authored-by: sqweek <[email protected]>
Co-authored-by: Daniel Schürmann <[email protected]>
Co-authored-by: Mooneer Salem <[email protected]>
Co-authored-by: Be <[email protected]>
Co-authored-by: Mario Kleiner <[email protected]>

* pulseaudio: Move Pulseaudio include in correct place when using autoconf (PortAudio#843)

As Jack and Pulseaudio both needs Ringbuffer that include can
be done in same place. In configure.in also Pulseaudio header
file was included before it was sure that it was really needed.

Commit makes sure that Pulseaudio include is available only if
it's needed as it can cause failing in build if Pulseaudio
develoment files are not available.

* restored workflows directory

* Minor fixes to FindOboe.cmake

* Enhanced prebuilt libraries compatibility in FindOboe.cmake

* Minor changes to Pa_Oboe/Readme and pa_oboe.cpp

* Removed auto latency tuning in favor of simpler impleentation in pa_oboe.cpp

* Set paFloat32 as default format in pa_oboe.cpp

* Renamed most of the variables according to best coding practices.

* Added separate callback class to fix the single-stream issue.

* Modified OboeEngine accordingly

* Adjusted the code in the rest of pa_oboe.cpp

* Removed stop and close phases of OboeEngine::restartStream

* Updated functions' description

* minor description corrections

* fixed all compiling errors generated by typos

* Added OboeMediator class in place of OboeCallback, that mediates PortAudio C stream struct and the C++ OboeEngine class

* Fixed allocation problem, working PaOboe implementation

* fixed README.md indenting

* Removed .idea folder from repo

* replaced 'g_' with 'paOboe_' in non-static global variables

* replaced 'm_' prefix with 'm' prefix

* fixed OboeEngine::tryStream as requested

* Changed names to improve readability, i.e. OboeStream -> PaOboeStream

* fixed all compiling errors generated by typos

* Fixed minor problem with TryStream

* fixed long line in FindOboe.cmake

* fixed typos in pa_oboe.cpp

* set to verbose some logs

* Fixed minor problem with TryStream

* fixed long line in FindOboe.cmake

* fixed typos in pa_oboe.cpp

* set to verbose some logs

* Better handling of format in paOboe, removed junk code from CMakeLists

* Improved readability of some variables

* Removed '#include oboe/Oboe.h' from pa_oboe.h, and modified host api implementation accordingly

* static cast fixes

---------

Signed-off-by: Mario Kleiner <[email protected]>
Co-authored-by: Carlo Benfatti <[email protected]>
Co-authored-by: hopefulGiupplo <[email protected]>
Co-authored-by: Carlo Bramini <[email protected]>
Co-authored-by: Etienne Dechamps <[email protected]>
Co-authored-by: Daniel Schürmann <[email protected]>
Co-authored-by: dmitrykos <[email protected]>
Co-authored-by: Ross Bencina <[email protected]>
Co-authored-by: Tuukka Pasanen <[email protected]>
Co-authored-by: sqweek <[email protected]>
Co-authored-by: Mooneer Salem <[email protected]>
Co-authored-by: Be <[email protected]>
Co-authored-by: Mario Kleiner <[email protected]>
Co-authored-by: Tuukka Pasanen <[email protected]>
Co-authored-by: invertego <[email protected]>
@dechamps dechamps deleted the dsoundformat branch May 25, 2024 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
src-dsound MS DirectSound Host API /src/hostapi/dsound
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable >16bit output sample widths using DirectSound API
4 participants