Tuesday, February 06, 2007 7:46 AM
6 "Pointers" on Debugging Unmanaged Code
Under our .NET hood, we use C and C++ at Atalasoft. When you're trying to process images fast enough to keep up with high-speed scanners, pure managed C# isn't good enough. Since, ultimately, all of our components have managed interfaces, we package all unmanaged code in managed/unmanaged C++ assemblies (more on why we do that in our .NET imaging philosophy article).
For that, and other reasons, we're forced to use unmanaged code with pointers. Along with that comes the need to worry about unmanaged debugging. Here are some of the things you can do to track down your next crash inside of unmanaged code.
- Unit Test: There are many good reasons to unit test, but for the purposes of this article, the important thing is to find possible bugs in unmanaged code. You also need to use a code coverage tool to make sure that you're hitting everything -- we use AutomatedQA's AQtime specifically because of its excellent support of managed/unmanaged hybrid assemblies.
When you are trying to expose bugs in unmanaged code, throw the kitchen sink at your code. We have an extensive database of pathological image files we can throw at our code, and building up a test resource like that is essential to finding the boundary cases.
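As a rough illustration of "throwing the kitchen sink" at a decoder, the sketch below runs a table of hostile headers through a size check. Everything in it is hypothetical: `ImageHeader`, `pixelBufferSize`, and the 1 GiB cap are invented for this example and are not Atalasoft code or any real file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image header; not taken from any real file format.
struct ImageHeader {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t bytesPerPixel;
};

// Arbitrary cap chosen for this sketch: refuse pixel buffers over 1 GiB.
constexpr std::uint64_t kMaxPixelBytes = 1ull << 30;

// Returns true and sets outBytes only when the header describes a
// non-empty buffer whose size stays within kMaxPixelBytes.
bool pixelBufferSize(const ImageHeader& h, std::size_t& outBytes) {
    if (h.width == 0 || h.height == 0 || h.bytesPerPixel == 0) return false;
    // Two 32-bit values multiplied in 64 bits cannot overflow.
    const std::uint64_t rowBytes = std::uint64_t(h.width) * h.bytesPerPixel;
    // Compare by division so rowBytes * height is never computed unchecked.
    if (rowBytes > kMaxPixelBytes / h.height) return false;
    outBytes = static_cast<std::size_t>(rowBytes * h.height);
    return true;
}

struct Case {
    ImageHeader header;
    bool shouldAccept;
};

// Runs every pathological case and counts the ones that behave unexpectedly.
int countFailures() {
    const std::vector<Case> cases = {
        {{640, 480, 3}, true},                    // ordinary image
        {{0, 480, 3}, false},                     // zero width
        {{1, 1, 0}, false},                       // zero bytes per pixel
        {{65536, 65536, 4}, false},               // 16 GiB, over the cap
        {{0xFFFFFFFFu, 0xFFFFFFFFu, 4}, false},   // would overflow 64 bits
    };
    int failures = 0;
    for (const Case& c : cases) {
        std::size_t bytes = 0;
        if (pixelBufferSize(c.header, bytes) != c.shouldAccept) ++failures;
    }
    return failures;
}

// A unit test would simply assert that no case misbehaved.
void checkPathologicalHeaders() {
    assert(countFailures() == 0);
}
```

In a real suite the table would come from the database of bad files rather than being hard-coded, and each new crash report would add a row.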
- Use Debugging Tools for Windows: If you don't have this, go get it now. For serious debugging you get WinDBG, which I will blog about soon. But the utility I use most is GFLAGS. One of its most useful features is the ability to place every heap allocation at the end of its own page, followed by an inaccessible guard page (+hpa). This means that you can detect an out-of-bounds memory error at the point of the initial dereference, not later when you end up scribbling on something important.
This option makes your process use much more memory and run noticeably slower. I recommend running your entire unit test suite in the debugger after running this line from the command prompt (if you are using the NUnit GUI test runner):
gflags -i nunit-gui.exe +hpa
Replace nunit-gui.exe with the name of your test runner .exe, and remember to turn the option off again with -hpa when you are done. More on GFLAGS.
WinDBG is essential for tracking down unmanaged memory leaks. You could also check out AutomatedQA's tools for this, which are orders of magnitude easier to use -- I still think it's worth learning WinDBG though.
- Learn how to recognize common bugs. This comes with experience. Basically, you should be able to classify which memory is being corrupted (Stack, Heap, or Global), and what kinds of things cause that.
If it's your stack, look at local variables, arguments, and the return address (common problems are mismatched calling conventions, or casting a function pointer to the wrong type and calling it). With the heap, make sure you aren't dereferencing bad pointers (these can be caused by double-deleting pointers, running past array bounds, using the wrong type sizes, etc.).
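To make the "out of array bounds" class concrete, here is a minimal sketch of a buffer wrapped in guard bytes. `GuardedBuffer`, `fillBuggy`, and `fillFixed` are hypothetical names invented for the example. Unlike the GFLAGS page heap, this only detects the damage after the fact, but it shows what the corruption looks like: the off-by-one write lands just past the payload, in memory that in a real heap would belong to something else.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kGuard = 0xCD;      // arbitrary fill pattern
constexpr std::size_t kGuardBytes = 16;    // arbitrary guard size

// Hypothetical helper: a payload with guard bytes before and after it.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t size)
        : size_(size), storage_(size + 2 * kGuardBytes, kGuard) {}

    std::uint8_t* data() { return storage_.data() + kGuardBytes; }
    std::size_t size() const { return size_; }

    // True if something wrote before the start of the payload.
    bool underrun() const { return !intact(0); }
    // True if something wrote past the end of the payload.
    bool overrun() const { return !intact(kGuardBytes + size_); }

private:
    bool intact(std::size_t offset) const {
        for (std::size_t i = 0; i < kGuardBytes; ++i) {
            if (storage_[offset + i] != kGuard) return false;
        }
        return true;
    }

    std::size_t size_;
    std::vector<std::uint8_t> storage_;
};

// Deliberate off-by-one: "<=" writes one byte past the payload.
void fillBuggy(std::uint8_t* p, std::size_t n) {
    for (std::size_t i = 0; i <= n; ++i) p[i] = 0;
}

// Corrected loop bound.
void fillFixed(std::uint8_t* p, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) p[i] = 0;
}

// Returns true when the guards classify the two fills as expected.
bool guardsClassifyCorrectly() {
    GuardedBuffer good(8), bad(8);
    fillFixed(good.data(), good.size());
    fillBuggy(bad.data(), bad.size());
    assert(!good.underrun() && !bad.underrun());
    return !good.overrun() && bad.overrun();
}
```

The stray write stays inside storage owned by the sketch, so the example itself is well defined; in production code the same loop would be scribbling on a neighboring heap block.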
- Learn to read Intel Assembly. Sometimes you're going to have to look at the Disassembly view in Visual Studio, and you should learn how to read it. Practice on things you have the source to, because the source will be interspersed with its corresponding assembly code.
If you want to learn what you need to know to read Intel Assembly (especially the kind generated by compilers), check out Write Great Code, Volume 2. It has 3 chapters on what you need to know about assembly language to apply it to writing in high-level languages. You'll need Volume 1 if you don't have any computer architecture background.
Reading Assembly code will give you more clues to what is going on -- crashing on a ret? Better check the stack -- specifically calling conventions.
The other reason you need to be able to read assembly is if you end up having to debug a release version and you don't have debugging information -- which brings us to...
- Always build with debug information -- even your release versions. Build it as a Program Database and check in all PDB files that go with builds you release to customers. This way if a customer can reproduce a problem, but you can't do it with the latest or with the debug version -- at least you can debug the version you released. You might have to debug it on their machine -- which will be a lot easier if you have the PDB that goes with the version you gave them.
You'll also want to make sure you understand some common optimizations -- because debugging release code is not as easy (it won't match up with the source as well). The watch window becomes less useful because more variables end up in registers and they are reused. Only trust a watched value if you know it's being used on the line of code you're on -- better to just know how each variable is being stored and watch the memory or register directly.
- Reproduce the problem reliably in the debugger. Actually, this is the point of the previous five steps. Unit tests give us small bits of code to work with, GFLAGS makes the runtime unforgiving of mistakes, and knowing what kind of problem we have tells us which GFLAGS options are likely to cause the problem to happen most frequently.
If you can't reliably reproduce the problem, you won't really know if you fixed it. It's great to be able to reproduce it in a unit test alone, but I find that I need the extra-sensitive memory debugging you get with GFLAGS to get it to happen every time. Also, you'll need to turn on Unmanaged Debugging in Visual Studio to be able to step into unmanaged code.
The corollary to this tip is to get it to crash as early as possible. It's no fun to figure out a crash bug once the culprit function has already returned. You really want the root cause somewhere on the call stack when it's detected.
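One way to get failures to surface early in your own code, sketched below under assumptions, is to check indices at the point of access so the faulty caller is still on the call stack when the error is raised. `GrayImage` and `clearBuggy` are hypothetical names made up for this illustration, and throwing `std::out_of_range` is just one possible policy; an assertion or a debugger break would serve the same purpose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical 8-bit grayscale image with a checked row accessor.
class GrayImage {
public:
    GrayImage(std::size_t width, std::size_t height)
        : width_(width), height_(height),
          pixels_(width * height, std::uint8_t{0}) {
        assert(width > 0 && height > 0);
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    // Fails immediately on a bad index instead of handing back a pointer
    // past the buffer that would corrupt memory some time later.
    std::uint8_t* row(std::size_t y) {
        if (y >= height_) {
            throw std::out_of_range("row index past image height");
        }
        return pixels_.data() + y * width_;
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<std::uint8_t> pixels_;
};

// Deliberately buggy caller: "<=" asks for one row too many.
void clearBuggy(GrayImage& img) {
    for (std::size_t y = 0; y <= img.height(); ++y) {
        std::fill_n(img.row(y), img.width(), std::uint8_t{0});
    }
}

// Returns true if the bad access was reported while clearBuggy was running.
bool bugCaughtAtCallSite() {
    GrayImage img(4, 2);
    try {
        clearBuggy(img);
    } catch (const std::out_of_range&) {
        return true;   // detected with the culprit still on the stack
    }
    return false;
}
```

With the debugger set to break when the exception is thrown, the call stack at the break shows `clearBuggy` directly above `row`, which is the situation the tip is asking for.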
And, by the way, .NET 3.0 doesn't make these problems go away -- at least for imaging. Even though .NET 3.0 is putting a managed API on more graphically intense underlying code, extending WIC APIs has to be done with COM. Check out this article for how we're approaching mixing COM with .NET. The upshot is -- all this stuff still applies.
And if you know how to do this already -- take a look at our software engineering job openings.