Status updates at the end
As part of a much larger project, I have a piece that records from the microphone to a number of different file formats. For MP4 (AAC) output, I use Media Foundation from a C# application. It works perfectly on Windows 10 (MF DLLs version 10.0.19041.1), but MP4 output hangs during finalization on Windows 11 (10.0.22621.1). All other output formats work perfectly on both Windows 10 and 11.
What I have tried:
Data is read in from the microphone in PCM format and delivered to whichever one of my various output writer classes is appropriate for the desired format. Code is C# using a wrapper based on MF.Net to talk to the Media Foundation COM objects for MP4 output.
Here is the creation code for the MP4 writer. Error checking is removed here, but it is present in the real code, and all return values are success values.
MFStartup (0x20070, Full);
MFCreateMediaType (out IMFMediaType inputMediaType);
inputMediaType.SetGUID (MF_MT_MAJOR_TYPE, Audio);
inputMediaType.SetGUID (MF_MT_SUBTYPE, PCM);
inputMediaType.SetUINT32 (MF_MT_AUDIO_BITS_PER_SAMPLE, 16);
inputMediaType.SetUINT32 (MF_MT_AUDIO_SAMPLES_PER_SECOND, 48000);
inputMediaType.SetUINT32 (MF_MT_AUDIO_NUM_CHANNELS, 2);
MFCreateMediaType (out IMFMediaType outputMediaType);
outputMediaType.SetGUID (MF_MT_MAJOR_TYPE, Audio);
outputMediaType.SetGUID (MF_MT_SUBTYPE, AAC);
outputMediaType.SetUINT32 (MF_MT_AUDIO_BITS_PER_SAMPLE, 16);
outputMediaType.SetUINT32 (MF_MT_AUDIO_SAMPLES_PER_SECOND, 48000);
outputMediaType.SetUINT32 (MF_MT_AUDIO_NUM_CHANNELS, 2);
MFCreateAttributes (out IMFAttributes attributes, 1);
attributes.SetGUID (MF_TRANSCODE_CONTAINERTYPE, MPEG4);
MFCreateFile (ReadWrite, DeleteIfExist, None, filename, out IMFByteStream byteStream);
MFCreateMPEG4MediaSink (byteStream, null, outputMediaType, out IMFMediaSink mediaSink);
MFCreateSinkWriterFromMediaSink (mediaSink, attributes, out IMFSinkWriter sinkWriter);
sinkWriter.SetInputMediaType (0, inputMediaType, null);
long currentPosition = 0;
sinkWriter.BeginWriting ();
Since the PCM input is delivered to the C# code in a native buffer and must feed into various file formats, I use a custom media buffer class that implements IMFMediaBuffer and wraps the native buffer so I don't have to copy the data. Here is the writing code that executes for each input buffer.
CustomMediaBuffer inBuffer = new CustomMediaBuffer (native buffer info);
MFCreateSample (out IMFSample sample);
sample.AddBuffer (inBuffer);
long durationConverted = (10000000L * native buffer length) / averageBytesPerSecond;
sample.SetSampleTime (currentPosition);
sample.SetSampleDuration (durationConverted);
currentPosition += durationConverted;
sinkWriter.WriteSample (0, sample);
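The timestamp arithmetic in the writing code can be checked in isolation. Here is a minimal sketch, assuming the 48 kHz, 16-bit stereo input from the creation code (192000 bytes per second). The class and method names are hypothetical; they are not part of MF.Net or Media Foundation.

```csharp
using System;

// Hypothetical helper for illustration only; not part of MF.Net.
public static class SampleTiming
{
    // Media Foundation sample times and durations are in 100-nanosecond units.
    public const long TicksPerSecond = 10_000_000L;

    // Converts a PCM byte count to a duration in 100-ns units.
    // Integer division truncates any fractional tick.
    public static long DurationFromBytes(long byteCount, long averageBytesPerSecond)
    {
        if (averageBytesPerSecond <= 0)
            throw new ArgumentOutOfRangeException(nameof(averageBytesPerSecond));
        return (TicksPerSecond * byteCount) / averageBytesPerSecond;
    }
}
```

For example, `SampleTiming.DurationFromBytes(1920, 192000)` gives 100000, i.e. 10 ms. When the buffer length does not divide evenly (4096 bytes gives 213333 with a remainder), summing per-buffer durations accumulates a small truncation error; computing the position from a running byte total with the same formula is one way to avoid that.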
There is a lock around both the write and close logic to prevent samples being added after the close is initiated. The close code is:
sinkWriter.Flush (0);
sinkWriter.Finalize_ ();
mediaSink.Shutdown ();
When this runs on Windows 10, everything works correctly to output the MP4 file. On Windows 11, the call to Finalize_ never returns -- no error, no crash, nothing in the event logs, just crickets.
Has anyone else encountered this or have any sort of suggestion to address the problem?
Thanks
--------------------- update 31 May
Here's an update:
One thing I've been playing with is specifying every possible parameter / attribute appropriate for the media types I'm using. I've now got the following and it has made a difference in the failure mode on Windows 11.
Input Media now contains:
MF_MT_MAJOR_TYPE MFMediaType.Audio
MF_MT_SUBTYPE MFMediaType.PCM
MF_MT_AUDIO_BITS_PER_SAMPLE 16
MF_MT_AUDIO_SAMPLES_PER_SECOND 48000
MF_MT_AUDIO_NUM_CHANNELS 2
MF_MT_AUDIO_BLOCK_ALIGNMENT 4 (numChannels * bitsPerSample / 8)
MF_MT_AUDIO_AVG_BYTES_PER_SECOND 192000 (sampleRate * blockAlign)
MF_MT_ALL_SAMPLES_INDEPENDENT 1 (TRUE)
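The two derived input values in this list follow directly from the basic format parameters. A small sketch of that arithmetic; the class and method names are hypothetical and exist only for illustration.

```csharp
// Hypothetical helper for illustration only; not part of MF.Net.
public static class PcmFormat
{
    // Bytes per sample frame: one sample for every channel.
    public static int BlockAlign(int channels, int bitsPerSample)
    {
        return channels * bitsPerSample / 8;
    }

    // Bytes of PCM data per second of audio.
    public static int AvgBytesPerSecond(int sampleRate, int blockAlign)
    {
        return sampleRate * blockAlign;
    }
}
```

With 2 channels at 16 bits, `BlockAlign` returns 4, and at 48000 Hz `AvgBytesPerSecond` returns 192000, matching the values set on the input media type.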
Output Media now contains:
MF_MT_MAJOR_TYPE MFMediaType.Audio
MF_MT_SUBTYPE MFMediaType.AAC
MF_MT_AUDIO_BITS_PER_SAMPLE 16
MF_MT_AUDIO_SAMPLES_PER_SECOND 48000
MF_MT_AUDIO_NUM_CHANNELS 2
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATOR 0x29 (AAC profile, level 2)
MF_MT_USER_DATA 0, 0, 0x29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x11, 0x90
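For reference, the 14 bytes in MF_MT_USER_DATA can be derived rather than hard-coded. The sketch below assumes my reading of the layout is right: the 12 bytes of HEAACWAVEINFO that follow the WAVEFORMATEX header (payload type, profile level indication, struct type, two reserved fields), followed by a 2-byte MPEG-4 AudioSpecificConfig for AAC-LC. The class and method names are hypothetical, and only mono and stereo are handled.

```csharp
using System;

// Hypothetical helper for illustration only; not part of MF.Net.
public static class AacUserData
{
    // Sampling-frequency index table from the MPEG-4 AudioSpecificConfig.
    private static readonly int[] SampleRates =
    {
        96000, 88200, 64000, 48000, 44100, 32000,
        24000, 22050, 16000, 12000, 11025, 8000, 7350
    };

    public static byte[] Build(int sampleRate, int channels, byte profileLevel)
    {
        int freqIndex = Array.IndexOf(SampleRates, sampleRate);
        if (freqIndex < 0)
            throw new ArgumentException("Unsupported sample rate.", nameof(sampleRate));
        if (channels < 1 || channels > 2)
            throw new ArgumentException("Sketch handles mono and stereo only.", nameof(channels));

        const int aacLc = 2; // audioObjectType for AAC-LC

        // 5 bits object type, 4 bits frequency index, 4 bits channel config, 3 zero bits.
        int config = (aacLc << 11) | (freqIndex << 7) | (channels << 3);

        byte[] data = new byte[14];              // bytes 0-1: payload type 0 (raw AAC)
        data[2] = profileLevel;                  // bytes 2-3: profile level indication
                                                 // bytes 4-11: struct type and reserved, all zero
        data[12] = (byte)(config >> 8);          // AudioSpecificConfig, high byte
        data[13] = (byte)(config & 0xFF);        // AudioSpecificConfig, low byte
        return data;
    }
}
```

`AacUserData.Build(48000, 2, 0x29)` produces the same 14 bytes listed above, ending in 0x11, 0x90.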
I made one change to the "close" code, putting a release of the sink writer after calling Finalize_ but before calling Shutdown on the media sink. Note this DOES NOT affect the Windows 11 failure, since the release occurs after the function call that (previously) hung. This change fixed a problem with the MP4 boxes when recording on my VM; recording on Windows 10 without a VM worked fine either way.
Now to the new failure mode with all these changes. Windows 10 still functions beautifully with all this. Windows 11 now fails with an actual error when calling SetInputMediaType on the sink writer.
Recording session cannot be started: failure setting input format for sink writer. The data specified for the media type is invalid, inconsistent, or not supported by this object.
This error is patently false -- the AAC encoder most certainly does support 48 kHz, 16-bit, stereo PCM input.
I dug into the MFTrace output trying to find out more about the error. When I compared the log from Windows 10 (working) with the log from Windows 11 (failing), I found something very interesting.
-- Windows 10 calls MFTEnumEx and finds the AAC Encoder in mfAACEnc.dll, and then an MFT (Media Foundation transform) instance of that encoder is created and attached. Windows 10 then immediately sets the input media type on the input stream and returns from SetInputMediaType.
-- Windows 11 calls MFTEnumEx and finds the AAC Encoder in mfAACEnc.dll, and then an MFT instance of that encoder is created and attached. This encoder has exactly the same GUID and attributes as the version found on Windows 10. However, Windows 11 then creates a new MFT instance of "Resampler DMO" (resampledmo.dll) and attempts to attach it; that is where the failure happens.
There is no need for a resampler in this data flow, and in fact I do not want a resampler between my input and the AAC Encoder. I want that PCM data to be fed directly to the encoder.
Now my task is finding a way to tell MF to not add that resampler.
------- again May 31
I should have waited to post that update...
After adding a few more things to the output media type, I was able to get the right data flow lined up. I'm now back to the Finalize_ hanging on Windows 11 <sigh>
Additional output attributes:
MF_MT_AUDIO_AVG_BYTES_PER_SECOND 12000
MF_MT_AVG_BITRATE 96000
MF_MT_AUDIO_BLOCK_ALIGNMENT 1
MF_MT_COMPRESSED 1
MF_MT_AUDIO_PREFER_WAVEFORMATEX 1
MF_MT_FIXED_SIZE_SAMPLES 0
MF_MT_AAC_PAYLOAD_TYPE 0
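The two bitrate-related values above need to agree with each other: MF_MT_AUDIO_AVG_BYTES_PER_SECOND is the bitrate in bits per second divided by 8. A small sketch of that check follows. The list of accepted byte rates is an assumption based on my recollection of the AAC Encoder documentation and should be verified there; the class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only; not part of MF.Net.
public static class AacBitrate
{
    // ASSUMPTION: byte rates accepted by the Microsoft AAC encoder,
    // as recalled from its documentation. Verify before relying on this.
    private static readonly int[] AssumedByteRates = { 12000, 16000, 20000, 24000 };

    // Converts a bitrate in bits per second to bytes per second.
    public static int BytesPerSecond(int bitsPerSecond)
    {
        return bitsPerSecond / 8;
    }

    // True when the byte rate is in the assumed list above.
    public static bool IsAssumedSupported(int bytesPerSecond)
    {
        return Array.IndexOf(AssumedByteRates, bytesPerSecond) >= 0;
    }
}
```

`AacBitrate.BytesPerSecond(96000)` returns 12000, which matches the pair of values set on the output media type.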