P/Invoke Jujitsu: A Data Kata

honey the codewitch

Jul 11, 2020

MIT

Exploit the memory layout of your data to make your P/Invoke code more accessible and maintainable

Introduction

P/Invoke is very powerful once you get past the learning curve. It can be intimidating if you didn't cut your teeth in languages with pointer operations, but it's just another resource at your disposal. Ultimately, it's there to help, and .NET gives you P/Invoke functionality that is deep, flexible, and for what it does, easy to use.

Once you've learned some basic P/Invoke, the next hurdle is understanding what more you can do with it, because no single guide out there shows you how to "level up" step by step. Exploring it is the key both to unlocking its potential and to understanding the mechanics behind the dance, so it's pretty important if you plan to write code that relies on it. The more you understand, the better your code will be in the end.

This article aims to imbue you with some intermediate skills in handling your data more flexibly. We won't be using the Marshal class like we did in the last P/Invoke Jujitsu instalment, but we will be doing some deep exploration of the memory layout of our data, and mapping it to different managed types.

Knowing Your Data

What is 32 bits of data? Put simply, it is an int. Put another way, it's a float. Put yet another way, it's an array of 4 bytes. Put yet another way, it's a struct with 4 fields of one byte each, or a struct with two short fields, or a struct with one short field and two byte fields. It is any one of these things, depending on how one looks at it.

It's not enough to know how much data is there; you must also know what it is. That's why we have types: so we can tell whether those 32 bits are supposed to be an int, a float, or something else from the list above.

Here's one of the keys to understanding all of this: A type is not something that "holds" data. It simply "refers to data." The type is only conceptually part of the data. It's not really part of the data. The data in its purest form, is just the bits, like the 32 bits that make up an int. Our programming languages and runtimes impose types on the raw data for us. The type is a facade and an interface to access the data, not the data itself. It is separate. The knowing is not the doing. Types are abstract, while bits are concrete. If we took all the types out of data and just looked at it in raw form, it would be just flat streams of bits. That's what P/Invoke deals with, essentially.
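To make that concrete, here's a minimal sketch that reads the same four raw bytes through three different types. The BitViews class and its method names are made up for illustration; only the BitConverter calls come from the framework.

```csharp
using System;

static class BitViews
{
    // Each method reads the same four raw bytes through a different type.
    // Nothing is converted: only the interpretation of the bits changes.
    public static int AsInt(byte[] raw) { return BitConverter.ToInt32(raw, 0); }

    public static float AsFloat(byte[] raw) { return BitConverter.ToSingle(raw, 0); }

    // Two 16-bit halves, in memory order.
    public static short[] AsShorts(byte[] raw)
    {
        return new[] { BitConverter.ToInt16(raw, 0), BitConverter.ToInt16(raw, 2) };
    }
}
```

For example, the bytes from BitConverter.GetBytes(1.0f) read back as 1.0 through the float view and as 0x3F800000 through the int view. Same bits, different facade.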

Data is nothing by itself. It's how we move with our data that defines it. Our types give us the foundation for our actions with the data. They give it form and shape, without which it's just a sea of bits of a (sometimes) known length. It's all in how you use it that defines it.

Why bother knowing this? Simply put, because we're going to exploit it. Ripping one facade off your data and replacing it with another can be very useful when dealing with P/Invoke code generally, and understanding how to do it will give you a deeper understanding of how both P/Invoke and your data work.

As I said, P/Invoke deals in the bits of your data. It doesn't so much care about the types that reflect it. It believes whatever types you tell it, and will happily let you lie to it. It's perfectly okay to lie to the marshaller, or at least fake it out, and we will be doing so ourselves, just don't get caught! I'll explain as we go, and we'll get there soon.

Marshalling a Method Call

Sometimes, you just need to call an unmanaged platform method and so you'll need to marshal a method call. In order to do so, you'll need to tell .NET where the method that sinks the call lives. The DllImportAttribute covers this. We'll be using the MIDI features of winmm.dll under the Win32 Multimedia API to demonstrate the concepts of this article. All of the calls sink to winmm.dll.

The first thing we need to do is find out what the method we want to call looks like. Unlike a .NET assembly, an unmanaged DLL can't tell us what its types and methods look like. We must go to (in this case) Microsoft's documentation and/or header files. Failing that, we can also use pinvoke.net as a resource but be warned that what's up there is crowdsourced and often contains non-optimal or even wrong definitions. It's a great site for getting const and flag values though, as those aren't in Microsoft's docs typically.

This part is where a background in C and especially Win32 C development will really help you, but I'll try to walk you through it even if you don't have that background. If you don't have a C background, it might behoove you to download P/Invoke definitions from pinvoke.net and then verify and modify them as needed. Don't skip the last bit though, because like I said, P/Invoke definitions on that site are often suboptimal or even wrong.

We're going to start with the midiOutClose() function because it has the fewest parameters. Here's the C definition from Microsoft:

MMRESULT midiOutClose( HMIDIOUT hmo );

We need to translate all the arguments and the return value, leaving a total of two translations that must take place for the above method. Working from left to right, the first is MMRESULT. If you don't know Microsoft's habits that well, you might be tempted to dig into the headers or Google, but it's a 32 bit integer Microsoft uses to report the status of the call - whether there was an error or not. They do this for most of their unmanaged calls. I'll go further and tell you the integer is used to signal success on zero or failure on non-zero and that there is an enumeration of error codes that go with it. However, we don't need the const values for this code. In fact, we don't need them even for professional code because there are better ways of getting a friendly error message than using const values here. I knew that because I've used this API before, so I was able to tell you. If I hadn't, I may have had to Google around for example code in C or C#. In the end, we'll be using an int to represent MMRESULT.

Warning: This only works because int and MMRESULT are both exactly the same size! Always match the sizes of your method parameters and return values! For a parameter in a 32-bit call, I don't care if you use a float where you need an int since they're both 32 bits. The marshaller won't care either. (Don't lean on that for return values or in 64-bit code, though, because floating point values travel in different registers than integers there.) However, if you were to use a short (16 bits) or a long (64 bits) where you needed an int (32 bits), you may as well deliberately crash your app with a hardcoded and unhandled exception and save yourself some debugging pain. What will happen is it will destroy the "call stack" which is the memory the marshaller uses to transfer your call to and from unmanaged code. This corrupts your application irrevocably. There is no error handling that will fix this since you've just compromised the integrity of the memory in your running application. The best case is you crash every time, right away. The worst case is you crash "sometimes", a bit later. Take the time to get your method's return value and arguments correct. Your application's integrity absolutely depends on it. P/Invoke lets you do dangerous things very easily.

The parameter is of type HMIDIOUT, which is a "handle", a fancy way of Microsoft saying "pointer to something we aren't documenting". That's okay; we don't need the documentation for what the handle points to. A handle is pointer-sized, so it's 32 bits in a 32-bit process. Whenever you get a handle, you can just use an IntPtr. Theoretically, in a 32-bit process you could use a 32-bit value like an int, but it's good practice to use IntPtr. Just remember that IntPtr is a different size depending on the word size of the process: 32 bits in a 32-bit process and 64 bits in a 64-bit one. Because it tracks the native pointer size, declarations like the ones in this article work unchanged either way. It's when struct layouts or packing differ between platforms that you may need separate sets of P/Invoke declarations and some conditional compilation blocks. We're going to focus on 32-bit P/Invoke for this exercise. Many .NET Framework apps built with the default "Prefer 32-bit" setting run with an IntPtr size of 32 bits anyway.
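If you'd like to see the sizes in play rather than take my word for it, here's a small sketch. NativeSizes and its member names are invented for this example; sizeof, IntPtr.Size and Environment.Is64BitProcess are standard.

```csharp
using System;

static class NativeSizes
{
    // Sizes, in bytes, of the managed types discussed above.
    public const int Int = sizeof(int);     // 4: matches MMRESULT, UINT and DWORD
    public const int Float = sizeof(float); // 4: same width as an int
    public const int Short = sizeof(short); // 2: too small to stand in for an int
    public const int Long = sizeof(long);   // 8: too large to stand in for an int

    // IntPtr follows the process: 4 bytes in a 32-bit process, 8 in a 64-bit one.
    public static int Pointer { get { return IntPtr.Size; } }
}
```

Comparing NativeSizes.Pointer against Environment.Is64BitProcess at startup is a cheap way to confirm which of the two worlds your app is actually running in.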

Putting all this together, here's our P/Invoke signature for the above method:

[DllImport("winmm.dll")]
static extern int midiOutClose(IntPtr hmo);

That wasn't so bad, but that's why we did it first. midiOutOpen() is more involved:

MMRESULT midiOutOpen(
    LPHMIDIOUT phmo, 
    UINT uDeviceID, 
    DWORD_PTR dwCallback, 
    DWORD_PTR dwInstance, 
    DWORD fdwOpen
);

Make sure to open the docs for this function. Working from left to right, top to bottom, we already know that MMRESULT is an int.

The next one is LPHMIDIOUT. We know from before that HMIDIOUT itself was an IntPtr. If you're familiar with Hungarian notation, you'd know that prefixing something with "LP" means it's a pointer to whatever follows. In this case, it's a pointer to an HMIDIOUT handle, itself another pointer. The docs also suggest that this is an out-value. That would explain why it's a pointer to a thing (LPHMIDIOUT) rather than simply the thing itself (HMIDIOUT). That's the C way of saying "I need to pass this by reference or pass an out-value." In C#, knowing HMIDIOUT is an IntPtr, we could pass by reference using ref IntPtr phmo and this works just fine. The only problem is that the marshaller will then pass the value in as well as out. To tell both C# and the marshaller that we don't care about the in-value, we substitute out for ref, leaving us with out IntPtr phmo. If you're ever not sure which to use, use ref: it works wherever out would work, but out won't work where ref is needed.

Next, if we dig around in Microsoft's Win32 headers, we can find that UINT is a 32-bit unsigned int. Unless I need the unsigned range for an unsigned value I generally use int for these, and we will here, which leaves us with int uDeviceId.

The next parameter is of type DWORD_PTR, a pointer-sized integer, and the docs say what it holds depends on the value of fdwOpen. We've dodged a bullet here, because we don't need this parameter in our code. We can declare the parameter as an IntPtr and pass IntPtr.Zero from our code, leaving us with IntPtr dwCallback.

The next parameter is also of type DWORD_PTR, and the docs say it's a user defined value passed along with the function, for use with the callback mechanisms. We aren't going to use this either. We'll declare it as IntPtr so that it matches the DWORD_PTR's memory footprint, but we'll be passing IntPtr.Zero from our code here as well, which leaves us with IntPtr dwInstance.

Last, but not least, we have a DWORD which is an unsigned 32-bit integer. Again, we prefer using signed values from C# where we can, and the marshaller doesn't care so we'll declare this as int fdwOpen.
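The reason a signed int can stand in for an unsigned DWORD or UINT is that both are the same 32 bits; only the reading of the top bit differs. Here's a quick sketch of that idea. DwordBits and its method names are hypothetical helpers, not part of the article's download.

```csharp
static class DwordBits
{
    // Reinterpret an unsigned 32-bit value as signed without changing any bits.
    // Values above int.MaxValue simply show up as negative numbers.
    public static int ToSigned(uint value) { return unchecked((int)value); }

    // And back again: the round trip is lossless.
    public static uint ToUnsigned(int value) { return unchecked((uint)value); }
}
```

So if an API ever hands you a DWORD with the high bit set, the int you declared still carries every bit of it. You'd only need casts like these when you want to read it as the unsigned number it was meant to be.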

Finally, putting it all together that leaves us with:

[DllImport("winmm.dll")]
static extern int midiOutOpen(out IntPtr phmo, int uDeviceId, 
                              IntPtr dwCallback, IntPtr dwInstance, int fdwOpen);
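Here's a rough sketch of how the two declarations so far might be wrapped so that a non-zero MMRESULT doesn't slip by unnoticed. Treat it as one possible arrangement rather than the way the download does it. The midiOutGetErrorText() declaration is not from this article: it's an assumption about one of the "better ways" of getting a friendly error message mentioned earlier, and the MidiOut, Describe, Open and Close names are purely illustrative. It needs Windows to actually open a device.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class MidiOut
{
    [DllImport("winmm.dll")]
    static extern int midiOutOpen(out IntPtr phmo, int uDeviceId,
                                  IntPtr dwCallback, IntPtr dwInstance, int fdwOpen);

    [DllImport("winmm.dll")]
    static extern int midiOutClose(IntPtr hmo);

    // Assumption: not declared in the article. Asks winmm.dll for the text
    // that goes with an MMRESULT error code.
    [DllImport("winmm.dll", CharSet = CharSet.Unicode)]
    static extern int midiOutGetErrorText(int mmrError, StringBuilder pszText, int cchText);

    // Turns an MMRESULT into text. Zero means success, so no native call is needed.
    public static string Describe(int mmResult)
    {
        if (mmResult == 0) return "No error";
        var text = new StringBuilder(256); // MAXERRORLENGTH in the Win32 headers
        return midiOutGetErrorText(mmResult, text, text.Capacity) == 0
            ? text.ToString()
            : "Unknown MIDI error " + mmResult;
    }

    // Opens the given output device, throwing if the call reports failure.
    public static IntPtr Open(int deviceId)
    {
        IntPtr handle;
        int result = midiOutOpen(out handle, deviceId, IntPtr.Zero, IntPtr.Zero, 0);
        if (result != 0) throw new InvalidOperationException(Describe(result));
        return handle;
    }

    // Closes a handle returned by Open, throwing if the call reports failure.
    public static void Close(IntPtr handle)
    {
        int result = midiOutClose(handle);
        if (result != 0) throw new InvalidOperationException(Describe(result));
    }
}
```

Note that the out parameter needs no initialization on the C# side, which is exactly the point of choosing out over ref here.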

Bringing Form to the Void*

As I said before, data in its raw form has no type - it's just a stream of bits. Sometimes those bits come in different packages. Consider the following P/Invoke method declaration:

[DllImport("winmm.dll")]
static extern int midiOutShortMsg(IntPtr hmo, int dwMsg);

Here, the obvious question is what is a message? Here it's represented by int dwMsg but what does that int represent? It's basically an opaque bitstream of 32 bits as far as we know at this point. That doesn't tell us a lot.

The docs tell us that the message is "packed" into an int, with each of the bytes representing a different part of the message: we have a "status byte" and two "data bytes", plus one unused byte, for a total of 4 bytes, or 32 bits.

Okay, so we can pack a message into an int, like this:

var msg = (data2 << 16) + (data1 << 8) + status;

That's not very clear. Maybe we can do better. Instead of envisioning this as a 24-bit value packed into one 32-bit int, what if we break it apart into four 8-bit fields, one of which is reserved?

This is where we get jiggy with P/Invoke, and really get it to work for us instead of against us:

[StructLayout(LayoutKind.Sequential)]
struct MidiMsg
{
    public byte Status;
    public byte Data1;
    public byte Data2;
    byte Reserved;
}

The structure is laid out from low byte to high byte, and is the size of one 32-bit integer. Remember from before, I said we could lie to the marshaller as long as the sizes were the same? Let's do so by creating another declaration for midiOutShortMsg():

[DllImport("winmm.dll")]
static extern int midiOutShortMsg(IntPtr hmo, MidiMsg dwMsg);

Here we've declared the DWORD parameter (32 bits) as our MidiMsg struct instead of an int. Each of them is 32 bits so it works. It just means we're acting on those bits differently in our code. The end result is the same. The marshaller doesn't care that MidiMsg is not an int. It only sees a 32-bit stream of bits it has to get from one end of the call to the other, in whatever form it comes in.
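Since everything hinges on the struct really being 32 bits, it's worth a sanity check before you lie to the marshaller. The sketch below borrows Marshal.SizeOf() purely for that check, even though the rest of this article leaves the Marshal class alone. MidiMsgCheck and SameSizeAsInt are names I've invented for the example.

```csharp
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct MidiMsg
{
    public byte Status;
    public byte Data1;
    public byte Data2;
    byte Reserved; // unused padding that rounds the struct up to 4 bytes
}

static class MidiMsgCheck
{
    // True when the marshalled struct occupies exactly as many bytes
    // as the int it stands in for.
    public static bool SameSizeAsInt()
    {
        return Marshal.SizeOf(typeof(MidiMsg)) == sizeof(int);
    }
}
```

If you dropped the Reserved field, the check would fail, because three bytes are not a DWORD. That's the kind of mismatch the earlier warning is about.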

Now instead of:

var msg = (data2 << 16) + (data1 << 8) + status;

We can do:

var msg = default(MidiMsg);
msg.Status = status;
msg.Data1 = data1;
msg.Data2 = data2;

And then, either way, we can call midiOutShortMsg(handle, msg);

You probably still don't know what status, data1 and data2 are because nobody has told you, but at least the latter lets us set the message fields more clearly. The status, data1, and data2 are particular to the MIDI protocol, which I cover in this link.
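If you do stick with the int form, a pair of small helpers can at least give the shifts a name and let you pull the bytes back out. This is only a sketch: ShortMsg and its methods are hypothetical and not part of the download.

```csharp
static class ShortMsg
{
    // Packs the three meaningful bytes into the low 24 bits of an int.
    // The high byte is left at zero.
    public static int Pack(byte status, byte data1, byte data2)
    {
        return (data2 << 16) | (data1 << 8) | status;
    }

    // The reverse: mask each byte back out of the packed int.
    public static byte Status(int msg) { return (byte)(msg & 0xFF); }
    public static byte Data1(int msg) { return (byte)((msg >> 8) & 0xFF); }
    public static byte Data2(int msg) { return (byte)((msg >> 16) & 0xFF); }
}
```

ShortMsg.Pack(0x90, 0x3C, 0x7F) gives 0x007F3C90. Taking byte parameters rather than ints means a stray large value can't spill into a neighboring field.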

Let's say we sometimes need this msg as an int, and sometimes as a MidiMsg struct. One option is to change the definition of our struct:

[StructLayout(LayoutKind.Explicit)]
struct MidiMsg2
{
    [FieldOffset(0)] public int Packed;
    [FieldOffset(0)] public byte Status;
    [FieldOffset(1)] public byte Data1;
    [FieldOffset(2)] public byte Data2;
    [FieldOffset(3)] byte Reserved;
}

This is still 32-bits. We've changed the way we laid the struct out, so we're placing each field within MidiMsg2's bitstream individually. Note that our int field comes at the start of the data and also extends for 32 bits, just like the rest of the struct. This creates a kind of "C union" wherein the Packed field refers to the same location in memory as the rest of the struct. It just reflects the data therein differently. Setting it impacts the other 4 fields, and vice versa.

Using the above MidiMsg2, our previous int midiOutShortMsg(IntPtr, int) works so we don't need to make yet another declaration, though we could. We simply pass msg.Packed to the midiOutShortMsg() we've already declared. That just gets our data as a single 32 bit integer value.
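To see the "union" behave like one, here's a short sketch that writes through one view and reads through the other. UnionDemo and its method names are made up for illustration. The results assume a little-endian machine, which is what Windows on x86 and x64 gives you.

```csharp
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
struct MidiMsg2
{
    [FieldOffset(0)] public int Packed;
    [FieldOffset(0)] public byte Status;
    [FieldOffset(1)] public byte Data1;
    [FieldOffset(2)] public byte Data2;
    [FieldOffset(3)] byte Reserved;
}

static class UnionDemo
{
    // Write the byte fields, then read the very same memory back as an int.
    public static int PackViaFields(byte status, byte data1, byte data2)
    {
        var m = default(MidiMsg2);
        m.Status = status;
        m.Data1 = data1;
        m.Data2 = data2;
        return m.Packed;
    }

    // Write the int, then read the lowest-addressed byte back out.
    public static byte StatusOf(int packed)
    {
        var m = default(MidiMsg2);
        m.Packed = packed;
        return m.Status;
    }
}
```

On a little-endian machine, UnionDemo.PackViaFields(0x90, 0x3C, 0x7F) comes back as 0x007F3C90, with no shifting or masking anywhere in sight.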

What About MarshalAs?

You can adjust marshalling behavior by applying MarshalAsAttribute to P/Invoke method parameters and/or return values. However, you should almost never need it, and if you find yourself using it, you're usually asking for trouble. This isn't because it's "advanced stuff" - we're hitting the advanced stuff here. No, it's because the marshaller is really good at marshalling what you give it, without hints. If you have to give it hints, you're probably trying to marshal a data type where it can't work with it in the first place. Trying to use the MarshalAsAttribute, more often than not, is a warning that you've got it wrong. We won't cover it here. There is one primary exception where you do want to use it, but it usually crops up on COM interfaces, which we aren't getting into this time.

Now from Form to Movement

Now that we have concepts and data structures, let's put them in motion:

IntPtr handle;
// 0 = success
if(0==midiOutOpen(out handle, 0, IntPtr.Zero, IntPtr.Zero, 0))
{
    // Below we use two different methods
    // of calling midiOutShortMsg()

    // The reason this works is not because the C function we're calling
    // has multiple overloads - it doesn't. No, the reason it works is because
    // the raw bytes we're passing to the function are the same either way.

    // no matter what, midiOutShortMsg() sees two 32-bit parameters passed into it 
    // and returns 32 bits out of it through the return value.
    // The value of the second parameter however, can vary depending on how we 
    // want it to map to memory. In one way, we chose a 4 byte struct (4x8=32-bits)
    // in the other, we chose an int (32-bits). The only thing that changes is how 
    // *we're* looking at or modifying the data, not the data itself - it is the 
    // same whether or not the data is mapped to a 4 byte int or whether it 
    // is mapped to a 4 byte struct - it's just 32-bits of data!

    // first we'll use the structure method
    // of calling midiOutShortMsg():
    var m = default(MidiMsg);
    m.Status = 0x90; // note on
    m.Data1 = 0x3C; // middle C
    m.Data2 = 0x7F; // max velocity
    midiOutShortMsg(handle, m);

    // Now we'll use the int method of 
    // calling midiOutShortMsg():

    // if we had done so above it would have been:
    // midiOutShortMsg(handle, 0x007F3C90); 

    // note on E above middle C, max velocity: (data1 = 0x40)
    midiOutShortMsg(handle, 0x007F4090);

    // alternate to above is
    //m = default(MidiMsg);
    //m.Status = 0x90; // note on
    //m.Data1 = 0x40; // E above middle C
    //m.Data2 = 0x7F; // max velocity
    //midiOutShortMsg(handle, m);

    // note on G above middle C, max velocity: (data1 = 0x43)
    var m2 = default(MidiMsg2);
    m2.Status = 0x90; // note on
    m2.Data1 = 0x43; // G above middle C
    m2.Data2 = 0x7F; // max velocity
    // use the "Packed" field to get
    // the above as a 32-bit int
    midiOutShortMsg(handle, m2.Packed);

    // alternate to above is
    // midiOutShortMsg(handle, 0x007F4390);

    Console.Error.WriteLine("Press any key to exit...");
    Console.ReadKey();
                
    midiOutClose(handle);
}

This outputs a C major chord rooted in middle C at maximum strike velocity to the first MIDI output controller available, which is usually your computer sound hardware's wavetable synthesizer. Consequently, you should hear the output of a single chord on your computer speakers.

And with that, we've called the same midiOutShortMsg() method several different ways, but with each way, we still used the same essential 32 bit bitstream, no matter whether we created it with a struct or with an int.

Now you've seen how you can work with your data by using different underlying types to represent the same memory-space and improve the flexibility and readability of your P/Invoke code in the process.

History

11th July, 2020 - Initial submission