Russinovich: A possible cure for exploitable heap corruption in Windows 7

"The Big Fix"

With more data on hand thanks to the vastly improved and partly repurposed Windows Error Reporting service, engineers could craft more effective ways to address the root problem of as much as one-third of key categories of crashes. That's what enabled them to implement what may go down as the Big Fix.

"We introduced this thing called fault-tolerant heap," stated Mark Russinovich. "The basic idea is, it's monitoring for heap corruptions. When it sees a heap corruption in a process, it enables heap mitigations, then it monitors the effectiveness of that heap mitigation, and if it's effective, it keeps it on. At the process shutdown, [FTH] is keeping a record of mitigations it's detected it had to apply with stack traces, and then capturing that and sending it up to Windows Error Reporting, so that you've got a record of exactly where your application was causing a heap problem. And then we can look at it, see if it's our code or your code [that's to blame] (it's probably your code) and fix it."

The core system library in Windows is NTDLL. A crash inside NTDLL can have many causes, but a thread that crashes on account of heap corruption does so in only one place, and that's NTDLL.

Once again, an existing Windows architectural element is used for a new purpose. Vista introduced the idea of the shim to accompany older programs loaded into memory, to help ease their transition to the new system and reduce incompatibility problems. When you run an old program in "compatibility mode," you're introducing a shim. Now, with the new event triggers in place, the FTH system can install a new kind of shim atop a process whenever it faults. The purpose of this new shim is to capture information about the state of the process's heap.

As Russinovich explained, faulting processes have a tendency to reference their heaps after they're supposed to have freed them. With the shim in place, such a call to an officially freed heap will be redirected to a new 4 MB buffer that will, as far as the crashing app is concerned, look like the heap. This is a separate area of memory managed by FTH. What this means is, during a heap corruption event, instead of less control in Windows, there's now more control. What's more, each page of allocated memory in the heap is now buffered with an extra eight bytes of otherwise superfluous non-data, simply to mitigate the effects of a likely buffer overrun -- another extremely exploitable event.
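
To make those two mitigations concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not FTH's actual implementation: the class and method names are hypothetical, the padding is applied per allocation for simplicity, and the 4 MB buffer is modeled as a first-in, first-out holding area for freed blocks, which is one plausible reading of the description above.

```python
from collections import deque

PAD_BYTES = 8                        # extra bytes of padding, per the article
QUARANTINE_LIMIT = 4 * 1024 * 1024   # 4 MB buffer size, per the article


class ToyMitigatedHeap:
    """Toy model only. Names and structure are hypothetical, not FTH's."""

    def __init__(self):
        self._next_handle = 0
        self.live = {}                # handle -> block still allocated
        self.quarantine = deque()     # (handle, block) pairs freed but kept
        self.quarantined_bytes = 0

    def alloc(self, size):
        # Over-allocate so a small overrun lands in padding, not a neighbor.
        handle = self._next_handle
        self._next_handle += 1
        self.live[handle] = bytearray(size + PAD_BYTES)
        return handle

    def free(self, handle):
        # Park the block instead of releasing it immediately.
        block = self.live.pop(handle)
        self.quarantine.append((handle, block))
        self.quarantined_bytes += len(block)
        # Once the buffer exceeds its limit, release the oldest blocks.
        while self.quarantined_bytes > QUARANTINE_LIMIT:
            _, oldest = self.quarantine.popleft()
            self.quarantined_bytes -= len(oldest)

    def _find(self, handle):
        if handle in self.live:
            return self.live[handle]
        for held_handle, block in self.quarantine:
            if held_handle == handle:
                return block          # stale reference still finds its data
        raise KeyError(handle)

    def read(self, handle, offset):
        return self._find(handle)[offset]

    def write(self, handle, offset, value):
        self._find(handle)[offset] = value
```

In this model, a write one byte past a 16-byte request succeeds because it lands in the padding, and a read through a freed handle keeps working until enough later frees push that block out of the 4 MB buffer.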

"When a process faults and we've detected that it's crashed in the heap, and FTH has been applied to it, what we do is watch for further crashes of the process that look like heap crashes," Russinovich explained. "We start out with...our starting value. If the process exits without a heap corruption, then we leave it where it is. If it crashes with a heap corruption, then we say we weren't effective at capturing that, and decrement that count. If that count goes to zero, then we end up removing the shim. The shim's cost depends on the application's use of the heap, and for things like Internet Explorer -- which runs a lot of dodgy stuff inside of it, and might get shims applied to it -- we want to be very careful about only applying the shims if it's really having a beneficial effect."

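The bookkeeping Russinovich describes can be sketched in a few lines of Python. This is a minimal illustration, not Microsoft's code: the `ShimState` name and its fields are hypothetical, and the starting value of 3 is an assumption, since the quote does not give the real number.

```python
from dataclasses import dataclass


@dataclass
class ShimState:
    """Hypothetical per-application record, not FTH's real data structure."""

    credits: int = 3           # assumed starting value; the talk does not state it
    shim_enabled: bool = True

    def record_exit(self, heap_corruption: bool) -> None:
        """Update the count after one run of the process ends."""
        if not self.shim_enabled:
            return
        if heap_corruption:
            # The mitigation did not prevent this crash, so count against it.
            self.credits -= 1
            if self.credits <= 0:
                # Not effective enough to justify its cost: remove the shim.
                self.shim_enabled = False
        # A clean exit leaves the count where it is.
```

Each run that ends in a heap crash costs the shim one credit, while clean exits cost nothing, so a shim that keeps failing to help is eventually removed.
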
After years of semi-fruitful brain-wracking over how the operating system should best respond to heap corruption events, a side benefit of an effort to simply reduce the number of running services, so the OS could respond better to things like Bluetooth, led to perhaps the most important reliability and security enhancement to Windows since address space layout randomization (ASLR). When someone asks, "What does improving performance an iota or two really matter to users?" here is your response.

Having fewer running processes is partly responsible for the measurable size reduction in Windows 7 over Windows Vista.


11:00 am EST December 31, 2009 · If I were to give a pop quiz to several of the various media aggregators who read and "reported" on this story, posted yesterday, I would be handing out F's until I had to borrow some from my own last name. So in the interest of correcting the News from the Alternate Universe:

  • No, there is no new zero-day Windows flaw or bug discovered by Mark Russinovich. This is a story of a chronic Windows problem addressed by improvements to the operating system architecture that were rolled out in Windows 7 and Windows Server 2008 R2.
  • Russinovich did not build these new features, such as Unified Background Process Manager, alone. There is a large company called Microsoft which is his employer, and which has hired a battalion of developers.
  • This is not some future feature. These improvements are things that are in Windows 7 now. If you've installed it on your computer, you're using them now.
  • Heap corruption is not the only cause of crashes. Likewise, not every process crash leaves the heap corrupted.

As always, Betanews is not responsible for misinterpreted facts on the part of aggregators beyond our control.
