Barcelona: Quad-Core Opterons Now Feature Virtualization Support

SCOTT FULTON, BetaNews: It's been said that one of the benefits of being able to run servers in virtual environments is the security benefits, essentially any software that runs there that thinks it can do incursion or damage to a system is doing it to a virtual system...[which] doesn't really have access to memory, not directly. Well, what you're talking about is something that does have access to memory directly, and I'm wondering now that you don't need the translator, now that you can speak Mandarin, you need to be more careful of what you're saying in Mandarin, in essence.
JOHN FRUEHE, Worldwide Mktg. Dev. Mgr., AMD: Absolutely. So there are security features...At the hardware level today, there are features that prevent one application from writing over the memory space of another application. So that continues to happen. What happens is, you've got TLBs - translation lookaside buffers - that work within the hardware to make sure that one application isn't writing over the memory space of another application, and in virtualization with AMD-V, you have guest TLBs which essentially do the exact same thing, it's doing it at the hardware level, and it's tracking it all at the hardware level to ensure that you're not having a security incursion from one application to another. That's happening in hardware - in standard virtualization that would all be done in the software, and the software would prevent one application from writing over the memory space of another.
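The two-level translation Fruehe describes can be sketched in a toy model. This is a simplified illustration of nested (two-stage) address translation, not AMD-V's actual implementation; all table contents and page numbers below are invented, and real hardware caches these lookups in the TLBs rather than walking tables on every access.

```python
# Toy model of nested address translation: a guest-maintained table maps
# guest-virtual pages to guest-physical pages, and a hypervisor-maintained
# table maps guest-physical pages to host-physical pages. All page numbers
# here are invented for illustration.

GUEST_PAGE_TABLE = {0x1: 0x10, 0x2: 0x11}    # guest-virtual -> guest-physical
HOST_PAGE_TABLE  = {0x10: 0xA0, 0x11: 0xA1}  # guest-physical -> host-physical

def translate(guest_virtual_page):
    """Walk both levels; a miss at either level is a fault.

    Faulting on unmapped pages is what keeps one guest (or application)
    out of memory that was never mapped for it.
    """
    guest_physical = GUEST_PAGE_TABLE.get(guest_virtual_page)
    if guest_physical is None:
        raise MemoryError("guest page fault")
    host_physical = HOST_PAGE_TABLE.get(guest_physical)
    if host_physical is None:
        raise MemoryError("nested page fault")
    return host_physical
```

A guest that asks for an unmapped page simply faults; it cannot name, let alone reach, another guest's host-physical memory.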
The most important part is, we're not taking over that functionality. We're hardware-enabling that functionality, but you're always going to need virtualization software. AMD is not going to be a virtualization platform per se; you're always going to need some virtualization software. We work very closely with VMware, we work very closely with Microsoft, with Xen, really with all of the virtualization platforms, and we're giving functionality that they can integrate into their software to make their software run better on our platforms, but we're definitely not taking over that role for them.
SCOTT FULTON: [Your chart mentions] IOMMU for security and performance?
JOHN FRUEHE: What IOMMU does is, it lets you virtualize your I/O traffic so your network connections, your hard drive connections, SAN connections, all of that can be virtualized, and you get significantly better performance because again today, all of your I/O traffic has to go through the hypervisor layer, and at the software level, VMware or Microsoft or Xen is actually translating all of those I/O calls to allow the applications to be able to take advantage of those I/O peripherals in a virtualized world. So with IOMMU, much the same way we can allow for the virtual machines to access memory directly, when you get to IOMMU, it allows virtual machines to access virtual I/O channels and provide significantly better performance, and as well, you've got to add security in there because, as you pointed out, you can't just let all the virtual machines have access to everything willy-nilly; you've got to have some protection levels built in there to ensure that you've got security built in, and you're not letting the I/O stream of one virtual machine corrupt the I/O stream of another virtual machine.
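The isolation role of the IOMMU can be sketched the same way. The idea is that each device (or the VM it is assigned to) gets its own DMA translation table, so a device's memory accesses are remapped and bounds-checked in hardware. The device names, tables, and addresses below are invented for illustration; this is a conceptual sketch, not the IOMMU's actual table format.

```python
# Toy sketch of IOMMU-style DMA remapping: each device has a private
# translation table, so a device assigned to one VM cannot reach memory
# belonging to another VM. Names and addresses are invented.

IOMMU_TABLES = {
    "nic-vm1":  {0x100: 0x5000},   # device-visible address -> system address
    "disk-vm2": {0x100: 0x9000},
}

def dma_access(device, io_virtual_addr):
    """Translate a device's DMA target; anything unmapped is blocked."""
    table = IOMMU_TABLES[device]
    if io_virtual_addr not in table:
        raise PermissionError(f"{device}: DMA blocked by IOMMU")
    return table[io_virtual_addr]
```

Note that both devices use the same device-visible address 0x100, yet land in different system memory, and neither can touch the other's pages, which is the "not letting the I/O stream of one virtual machine corrupt the I/O stream of another" property in miniature.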
SCOTT FULTON: A lot of what you're talking about points to the need you were mentioning earlier: closer cooperation between AMD and the software developers themselves...[especially] to be able to take advantage of what VMware would call "paravirtualization."
JOHN FRUEHE: Exactly. For us, the key is, the world is really becoming very software-dependent, and if you think about the way that the food chain works from an applications standpoint, when somebody buys desktops, companies will buy desktops 10 to 20 at a time, they'll buy them by the pallet and they'll show up and they'll deploy them and run a variety of different applications, and nobody really knows exactly what the desktop is going to do - that really is up to the user. But we don't have the luxury in the server world of people buying a pallet of servers and figuring out what they'll do with them later.
When you buy a server, it has a very specific function, and the software really leads the charge. The hardware guy is the last guy to get invited to the party. So they have a business need, there's a software application that'll solve that business need, that application runs on this operating system, and lo and behold, this hardware is really optimized to run that operating system. That's essentially the food chain, and so it really is more incumbent from an enterprise perspective for us to be tied in with the software vendors and, as we get more and more cores, the linkage between the hardware vendors and the software vendors gets even tighter, because if we want to deliver better performance in an era where there's going to be more cores, we've got to make sure that the software applications really optimize to take advantage.
In today's world, most of these apps...were really designed back around the late '90s or 2000 or 2001, when a four-socket system with four single-core processors was really state-of-the-art. And all these apps scaled phenomenally well up to four threads, and then beyond that you start to get diminishing returns unless you've really optimized the application to take advantage of more than four. So you see today the standard platform is a two-way server with two dual-core processors, and that scales great on the software. As you look at quad-core, you've gotta really make sure that your operating system and your application are really designed to scale well beyond those four threads.
SCOTT FULTON: Is there something inherent about [Non-Uniform Memory Access] that will enable software developers to get some help in that regard? In other words, if they took advantage of NUMA, would they be able to look at those 1981 four-thread architecture source codes and say, "We can redo that and take advantage of eight threads by moving to NUMA?"
JOHN FRUEHE: I think it's less about NUMA and it's more about multiple processors. NUMA is, in the best-case scenario, really invisible to the application. They're just different pools of memory, but where you really get the efficiency is in being able to really work with your scheduler to do multiple tasks at the same time.
In too many software applications, you find things happen in a very serial manner. You do A + B, and you take the result of that and it becomes A + B added to C, and you're doing things in a kind of serial fashion. What you really want to do is do more things at the same time. So one of the announcements that AMD had this past week was the introduction of SSE5, which will allow for more concurrent types of commands to happen simultaneously, and you combine that with better job scheduling and you start to see applications that can really take advantage of this and do more things, so you're really leveraging each cycle on the clock and being able to get better performance and more efficiency, because more things are happening with each cycle simultaneously, versus waiting for the next cycle in order to execute that command.
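The serial-versus-concurrent contrast Fruehe draws can be shown at thread level. (SSE5 itself targets instruction-level data parallelism, which Python can't demonstrate directly; this is a thread-level analogue of the same idea, that independent work need not wait its turn.) The chunked-sum workload below is invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def serial_sum(chunks):
    """A + B, then that result plus C: each step waits on the last."""
    total = 0
    for chunk in chunks:
        total += sum(chunk)
    return total

def parallel_sum(chunks):
    """Compute independent partial sums concurrently, then combine them.

    Because addition is associative, reordering the work is safe.
    """
    with ThreadPoolExecutor() as pool:
        partials = pool.map(sum, chunks)
    return sum(partials)

chunks = [range(0, 1000), range(1000, 2000), range(2000, 3000)]
assert serial_sum(chunks) == parallel_sum(chunks)  # same answer either way
```

Restructuring work so the pieces are independent, whether across threads as here or across SIMD lanes as with SSE5, is what lets "more things happen with each cycle" instead of waiting for the next one.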
Virtualization really helps in many ways because, for all of these applications that may not scale well beyond four threads, if you can start putting several of them on a single platform and using virtualization, you don't necessarily have to have your application optimized for more than four threads because [with] virtualization, you'll be able to run multiple applications on the same platform and take advantage of all those [resources].