Don't Cast Function Pointers (Unless You Really Know What You're Doing)

posted in MJP's Last Stand

Published August 15, 2008

Yet another Evil Steve-esque journal entry that I can instantly whip out when needed, instead of typing out a detailed explanation.

*Note: The information here strictly applies only to Microsoft Visual C++; I'm nowhere near experienced enough with any other compiler to comment on it. And as always, if anything is wrong please let me know so I can correct it. I'm much more interested in correctness than in pride. [smile]

**Extra Note: I compiled and ran the sample code on Visual C++ 2005, so your results may differ for other versions. However, that should just serve to show you how unpredictable and nasty this stuff can be.

In my last real journal entry I mentioned how the For Beginners forum typically has all kinds of Bad Win32 Code floating around. Well it doesn't just stop there...it's also brimming with Really Bad C++ Code, and even Completely Horrifying C++ Code. For this entry I'll be tackling something so scary it keeps me lying awake at night: function pointer casts in C++. Anybody who's used C++ for more than a month knows how dangerous casting can be, yet we still see it commonly used as a tool to "simply make the compiler shut up". Casting function pointers is even more dangerous, and we're going to talk about why.

At the lowest level, a function pointer is exactly that: a pointer to a function. Functions have an address in memory where they're located, and a function pointer contains that address. When you want to call the function pointed to by the function pointer, an x86 "call" instruction is used and execution starts at the address contained in the pointer.

However, there's more to calling a function than simply running the code: there's also the function parameters, the return value, and other information that needs to be stuck on the stack (like the return address, so execution can return to the calling function). Exactly how this happens is determined by the calling convention of the function. The calling convention specifies how parameters are passed to the function (usually on the stack), how the return value is passed back, how the "this" pointer is passed (in the case of C++ member functions), and how the stack is eventually cleaned up when the function is finished. This entry from Raymond Chen's blog has a good summary of the calling conventions used in 32-bit Windows. As Raymond puts it on his blog, a calling convention is a contract that defines exactly what happens when that function is called. Both sides (the caller and the callee) must agree to the contract and hold up their respective ends of the bargain in order for things to continue smoothly.

So it should be obvious by now that function pointers are more than just an address: they also specify the parameters, the return value, and the calling convention. When you create a function pointer, all of this information is contained in the pointer type. This is a Good Thing, because it means that the compiler can catch errors for you when you try to assign values to incompatible types. Take this code for instance, which generates a compilation error:

#include <iostream>
#include <conio.h>

using std::cout;

int DoSomething(int num)
{
    return num + 1;
}

int DoSomethingElse(int num, int* numPtr)
{
    int result = num + *numPtr;
    return result;
}

typedef int (*DoSomethingPtr)(int);
typedef int (*DoSomethingElsePtr)(int, int*);

int main()
{
    // Compile error: DoSomethingElse takes two parameters, DoSomethingPtr only one
    DoSomethingPtr fnPtr = DoSomethingElse;
    int result = fnPtr(5);
    cout << result;

    getch();
    return 0;
}

Look at that, the compiler saved our butt. We were trying to do something very bad! But of course since this is C++ we're talking about, the compiler does not have the final say in what happens. If we want, we can say "shut up compiler, and do what I tell you" and it will happily oblige. So go ahead and change the first line of main to this:

DoSomethingPtr fnPtr = (DoSomethingPtr)DoSomethingElse;

and watch the compiler error magically vanish. But now try running the code, in debug mode first. And look at that, an access violation. Why did we get an access violation? Well that's easy: we called a function that expected two parameters on the stack. However, we were using a function pointer that only specified one parameter, so only one was pushed. In other words, we violated the contract on our end. The callee, however, dutifully followed the contract and read two parameters off the stack. The second value it found on the stack happened to be NULL, which caused an exception when the function tried to dereference it.

This is actually a pretty "nice" error. The exception happens right when we call the function, so naturally the first thing we'd do is go back to where we called the function and see what went wrong. So in the event of an accidentally erroneous cast, we'd figure it out pretty quickly. But of course, that's not always the case. Try compiling and running in release mode. And look at that: no crash! However that return value of "-1559444344" sure does look funky...clearly we weren't so lucky this time. Now instead of a nice informative crash, we have a function that just produces a completely bogus value. Maybe that value will be used for something immediately afterwards and we'll notice it's bogus, or maybe we won't notice until we've made 8 calculations based on it. Either way something down the line will get screwed up, and the chance that you'll trace it back to a bogus function pointer gets slimmer and slimmer every step of the way.

But wait...the fun doesn't end there. Casting problems can be more subtle than that...as well as more catastrophic. Let's try this nearly-identical program instead:

#include <iostream>
#include <conio.h>

using std::cout;

int __stdcall DoSomething(int num)
{
    return num + 1;
}

int __stdcall DoSomethingElse(int num, int* numPtr)
{
    int result = num + *numPtr;
    return result;
}

// No calling convention specified, so these default to __cdecl
typedef int (*DoSomethingPtr)(int);
typedef int (*DoSomethingElsePtr)(int, int*);

int main()
{
    // The cast hides the __stdcall vs. __cdecl mismatch from the compiler
    DoSomethingPtr fnPtr = (DoSomethingPtr)DoSomething;
    int result = fnPtr(5);
    cout << result;

    getch();
    return 0;
}

Look at that, we're actually pointing our function pointer to the right function this time! This should work perfectly, right? Right? Go ahead and run it. And what do you know, it spits out the anticipated result! But now go ahead and press a key to let the program close up and....crash. A strange one too...access violation? At address 0x00000001? No source code available? What the heck code are we even executing? A look at the call stack shows that we're somehow executing in the middle of nowhere!

So how did this happen? Once again, we're crooks who violated the contract. The functions were declared with the calling convention __stdcall, which specifies that the function being called cleans up the stack. However, our function pointers were never given an explicit calling convention, which means they got the default (which is __cdecl), where the caller cleans up the stack. This meant we put our parameter and other stuff on the stack, we called the function, the function cleaned up the junk on the stack by popping it off, and then when we returned, the main function once again cleaned junk off the stack. Except that since the junk had already been cleaned up, we instead completely bungled up our stack and wound up with an instruction pointer pointing to no-man's land. Beautiful. For those wondering, the correct way to declare the function pointers would be like this:

typedef int (__stdcall *DoSomethingPtr)(int);
typedef int (__stdcall *DoSomethingElsePtr)(int, int*);

And of course, the even smarter thing to do would have been to have no cast at all, since then the compiler would have caught our mistake and whacked us over the head for it.

By now I hope I've gotten my point across. If I haven't, my point is this: don't cast function pointers unless you're extremely careful about it, and you absolutely have no choice. Type safety exists for a reason: to save us from ourselves. Make use of it whenever you can.

EXTRA: On a somewhat related note, sometimes what you think is a function pointer isn't really a function pointer at all. For instance...what you get back when you pass GWLP_WNDPROC to GetWindowLongPtr. Yet another reason to be careful with function pointers.

Comments

Evil Steve

Woo, another bookmark for my selection [smile]

Looks all good to me, I never thought about mentioning that cast function pointers might work in Release but not Debug. It might also be worth adding in a jab about casting the WindowProc pointer when filling in the WNDCLASSEX struct; I've seen that done more times than I care to count...

August 16, 2008 11:21 AM

MJP

Author
