Chatted with the devs and this is the response:
APOs are not in control of the cadence of the realtime pipeline they operate
in. So in your scenario, the APO cannot get away without implementing a ring
buffer.
The Calc*Frames methods are only relevant for APOs that perform sample rate
conversion.
In particular, CalcInputFrames is used by APOs on the microphone path that
change the sample rate, e.g. the Windows resampler APO when the audio client is
initialized with the AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM flag.
CalcOutputFrames is used similarly by APOs on the speaker path that change the
sample rate.
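For illustration, here is a minimal sketch of the kind of ring buffer the reply refers to. It assumes mono float samples, a 512-frame algorithm block, and a 480-frame engine period. Reblocker, FloatRing, and BlockFn are hypothetical names, not Windows SDK APIs. In an actual APO, process() would presumably be called from APOProcess with the buffers the engine hands in. All storage is allocated up front, so process() itself does not allocate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity FIFO of float samples (not an SDK type).
class FloatRing {
public:
    explicit FloatRing(std::size_t capacity) : buf_(capacity), head_(0), count_(0) {
        assert(capacity > 0);
    }
    std::size_t size() const { return count_; }
    void push(const float* src, std::size_t n) {
        assert(count_ + n <= buf_.size());
        std::size_t tail = (head_ + count_) % buf_.size();
        for (std::size_t i = 0; i < n; ++i) {
            buf_[tail] = src[i];
            tail = (tail + 1) % buf_.size();
        }
        count_ += n;
    }
    void pop(float* dst, std::size_t n) {
        assert(n <= count_);
        for (std::size_t i = 0; i < n; ++i) {
            dst[i] = buf_[head_];
            head_ = (head_ + 1) % buf_.size();
        }
        count_ -= n;
    }
private:
    std::vector<float> buf_;
    std::size_t head_;
    std::size_t count_;
};

// Hypothetical adapter: accepts whatever period the engine delivers and runs
// the algorithm on fixed-size blocks, at the cost of one block of latency.
class Reblocker {
public:
    using BlockFn = void (*)(float* block, std::size_t frames);

    Reblocker(std::size_t blockFrames, std::size_t maxPeriodFrames, BlockFn fn)
        : block_(blockFrames), maxPeriod_(maxPeriodFrames),
          in_(blockFrames + maxPeriodFrames), out_(blockFrames + maxPeriodFrames),
          scratch_(blockFrames, 0.0f), fn_(fn) {
        assert(blockFrames > 0 && fn != nullptr);
        // Prime the output with one block of silence so pop() never underruns.
        out_.push(scratch_.data(), block_);
    }

    // Consumes `frames` input samples and produces exactly `frames` outputs.
    void process(const float* in, float* out, std::size_t frames) {
        assert(frames <= maxPeriod_);
        in_.push(in, frames);
        while (in_.size() >= block_) {
            in_.pop(scratch_.data(), block_);
            fn_(scratch_.data(), block_);  // algorithm always sees block_ frames
            out_.push(scratch_.data(), block_);
        }
        out_.pop(out, frames);
    }

    std::size_t latencyFrames() const { return block_; }

private:
    std::size_t block_;
    std::size_t maxPeriod_;
    FloatRing in_;
    FloatRing out_;
    std::vector<float> scratch_;
    BlockFn fn_;
};
```

With a 512-frame block, this sketch delays the signal by 512 frames (about 10.7 ms at 48 kHz), regardless of the period size the engine uses.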
From: Akshay Cadambi<mailto:akshay@xxxxxxxxxxxx>
Sent: Tuesday, June 16, 2020 1:50 PM
To: wdmaudiodev@xxxxxxxxxxxxx<mailto:wdmaudiodev@xxxxxxxxxxxxx>
Subject: [EXTERNAL] [wdmaudiodev] Behavior of CalcInputFrames and
CalcOutputFrames
Hey Matthew,
We are currently implementing some custom audio processing via an SFX APO and
an EFX APO. Some of our algorithms require a power-of-two or multiple-of-two
buffer size in order to meaningfully process audio. Our current implementation
gets around this by using a ring buffer.
However, we noticed that the IAudioProcessingObjectRT interface provides the
following two methods via baseaudioprocessingobject.h
<https://github.com/tritao/WindowsSDK/blob/master/SDKs/SourceDir/Windows%20Kits/10/Include/10.0.17763.0/um/baseaudioprocessingobject.h>
in the swap APO sample.
STDMETHOD_(UINT32, CalcInputFrames)(_In_ UINT32 u32OutputFrameCount);
STDMETHOD_(UINT32, CalcOutputFrames)(_In_ UINT32 u32InputFrameCount);
Playing around with the debugger, we found that CalcInputFrames gets called with
an output frame count of 480. Given the 10 ms latency requirement in the
Windows audio engine, this makes sense for a sample rate of 48 kHz.
If we override this function to return 480*2, then we notice that APOProcess
gets called with double the number of input frames (as expected).
We also noticed that CalcOutputFrames never gets called.
We are considering using this mechanism as a way to obtain the appropriate
number of frames that our algorithm requires and removing the need to use ring
buffers in our implementation.
We have the following questions:
1. When does CalcOutputFrames get called?
2. Is there any documentation that you can point us to in order to ensure
that we are implementing these correctly?
3. What is the behavior when these functions are overridden?
4. What if we return a non-integer multiple of the input argument in
CalcInputFrames? For example, we receive 480 and return 512 (the next
power of two).
5. Other than the fact that it is called by the real-time thread, are
there any other gotchas that we should be aware of with using these functions?
Apologies for the lack of specificity. We are merely trying to understand the
appropriate use cases for those two methods, and what the behavior of the audio
engine is, before moving forward with a solution.
Any help or documentation would be appreciated. Thanks in advance for your time
and energy!
-Akshay