Pages

Tuesday, December 2, 2008

Managed, Unmanaged, Native: What Kind of Code Is This? By Kate Gregory

With the release of Visual Studio .NET 2003 (formerly known as Everett) on April 24th, many developers are now willing to consider using the new technology known as managed code. But especially for C++ developers, it can be a bit confusing. That's because C++, as I pointed out in my first column here, is special.

What Is Managed Code?

Managed Code is what Visual Basic .NET and C# compilers create. It compiles to Intermediate Language (IL), not to machine code that could run directly on your computer. The IL is kept in a file called an assembly, along with metadata that describes the classes, methods, and attributes (such as security requirements) of the code you've created. This assembly is the one-stop-shopping unit of deployment in the .NET world. You copy it to another server to deploy the assembly there—and often that copying is the only step required in the deployment.

Managed code runs in the Common Language Runtime. The runtime offers a wide variety of services to your running code. In the usual course of events, it first loads and verifies the assembly to make sure the IL is okay. Then, just in time, as methods are called, the runtime arranges for them to be compiled to machine code suitable for the machine the assembly is running on, and caches this machine code to be used the next time the method is called. (This is called Just In Time, or JIT compiling, or often just Jitting.)

As the assembly runs, the runtime continues to provide services such as security, memory management, threading, and the like. The application is managed by the runtime.

Visual Basic .NET and C# can produce only managed code. If you're working with those applications, you are making managed code. Visual C++ .NET can produce managed code if you like: When you create a project, select one of the application types whose name starts with .Managed., such as .Managed C++ application..

What Is Unmanaged Code?

Unmanaged code is what you use to make before Visual Studio .NET 2002 was released. Visual Basic 6, Visual C++ 6, heck, even that 15-year old C compiler you may still have kicking around on your hard drive all produced unmanaged code. It compiled directly to machine code that ran on the machine where you compiled it—and on other machines as long as they had the same chip, or nearly the same. It didn't get services such as security or memory management from an invisible runtime; it got them from the operating system. And importantly, it got them from the operating system explicitly, by asking for them, usually by calling an API provided in the Windows SDK. More recent unmanaged applications got operating system services through COM calls.

Unlike the other Microsoft languages in Visual Studio, Visual C++ can create unmanaged applications. When you create a project and select an application type whose name starts with MFC, ATL, or Win32, you're creating an unmanaged application.

This can lead to some confusion: When you create a .Managed C++ application., the build product is an assembly of IL with an .exe extension. When you create an MFC application, the build product is a Windows executable file of native code, also with an .exe extension. The internal layout of the two files is utterly different. You can use the Intermediate Language Disassembler, ildasm, to look inside an assembly and see the metadata and IL. Try pointing ildasm at an unmanaged exe and you'll be told it has no valid CLR (Common Language Runtime) header and can't be disassembled—Same extension, completely different files.

What about Native Code?

The phrase native code is used in two contexts. Many people use it as a synonym for unmanaged code: code built with an older tool, or deliberately chosen in Visual C++, that does not run in the runtime, but instead runs natively on the machine. This might be a complete application, or it might be a COM component or DLL that is being called from managed code using COM Interop or PInvoke, two powerful tools that make sure you can use your old code when you move to the new world. I prefer to say .unmanaged code. for this meaning, because it emphasizes that the code does not get the services of the runtime. For example, Code Access Security in managed code prevents code loaded from another server from performing certain destructive actions. If your application calls out to unmanaged code loaded from another server, you won't get that protection.

The other use of the phrase native code is to describe the output of the JIT compiler, the machine code that actually runs in the runtime. It's managed, but it's not IL, it's machine code. As a result, don't just assume that native = unmanaged.

Does Managed Code Mean Managed Data?

Again with Visual Basic and C#, life is simple because you get no choice. When you declare a class in those languages, instances of it are created on the managed heap, and the garbage collector takes care of lifetime issues. But in Visual C++, you get a choice. Even when you're creating a managed application, you decide class by class whether it's a managed type or an unmanaged type. This is an unmanaged type:

class Foo
{
private:
int x;
public:
Foo(): x(0){}
Foo(int xx): x(xx) {}
};

This is a managed type:

__gc class Bar
{
private:
int x;
public:
Bar(): x(0){}
Bar(int xx): x(xx) {}
};

The only difference is the __gc keyword on the definition of Bar. But it makes a huge difference.

Managed types are garbage collected. They must be created with new, never on the stack. So this line is fine:

Foo f;

But this line is not allowed:

Bar b;

If I do create an instance of Foo on the heap, I must remember to clean it up:

Foo* pf = new Foo(2);
// . . .
delete pf;

The C++ compiler actually uses two heaps, a managed an unmanaged one, and uses operator overloading on new to decide where to allocate memory when you create an instance with new.

If I create an instance of Bar on the heap, I can ignore it. The garbage collector will clean it up some after it becomes clear that no one is using it (no more pointers to it are in scope).

There are restrictions on managed types: They can't use multiple inheritance or inherit from unmanaged types, they can't allow private access with the friend keyword, and they can't implement a copy constructor, to name a few. So, you might not want your classes to be managed classes. But that doesn't mean you don't want your code to be managed code. In Visual C++, you get the choice.

About the Author

Kate Gregory is a founding partner of Gregory Consulting Limited (www.gregcons.com). In January 2002, she was appointed MSDN Regional Director for Toronto, Canada. Her experience with C++ stretches back to before Visual C++ existed. She is a well-known speaker and lecturer at colleges and Microsoft events on subjects such as .NET, Visual Studio, XML, UML, C++, Java, and the Internet. Kate and her colleagues at Gregory Consulting specialize in combining software develoment with Web site development to create active sites. They build quality custom and off-the-shelf software components for Web pages and other applications. Kate is the author of numerous books for Que, including Special Edition Using Visual C++ .NET.