39. Virtual Machines

Part of CS:2820 Object Oriented Software Development Notes, Spring 2016
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

A Problem

When object-oriented programs get big, how can we organize them so that we can make better sense of them? There are at least two parts to this question. First, when classes get big, how can we organize their interior details to make them easier to understand? Second, when there are large numbers of classes, how can we make sense of their relationships?

O-functions and V-functions

When a class gets large, with a large collection of methods, it is useful to find some way of organizing those methods to help readers understand the class. David Parnas, one of the early developers of ideas relating to object-oriented programming, proposed the following general classification system for methods:

O-functions
Methods that Operate on the value of an object, changing it.

V-functions
Methods that work in terms of the Value of an object without changing it.

Note: Parnas used the term function, in part, because the term method was not yet in general use. Today, he might have called them methods. Regardless of what you call them, the classification remains useful. Consider this simple class:

class C {
	private int f; // some field of the object
	public void set( int v ) { // a simple o-function
		f = v;
	}
	public int get() { // a simple v-function
		return f;
	}
}

When a class gets large, separating the o-functions from the v-functions can be one useful way of organizing the class definition to make it easier to read.
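As a slightly larger illustration of this organization, here is a minimal sketch of a class laid out with its o-functions and v-functions in separate groups. The class IntStack, its fixed capacity of 16, and its method names are all hypothetical, invented for this example.

```java
// Hypothetical example: a small integer stack, organized so that
// the o-functions and the v-functions appear in separate groups.
class IntStack {
    private final int[] items = new int[16]; // fixed capacity keeps the sketch short
    private int count = 0;                   // number of items currently on the stack

    // ---- o-functions: these change the value of the object ----

    public void push(int v) {
        if (count == items.length) throw new IllegalStateException("stack full");
        items[count++] = v;
    }

    public void pop() {
        if (count == 0) throw new IllegalStateException("stack empty");
        count--;
    }

    public void clear() {
        count = 0;
    }

    // ---- v-functions: these report on the value without changing it ----

    public int top() {
        if (count == 0) throw new IllegalStateException("stack empty");
        return items[count - 1];
    }

    public int size() {
        return count;
    }

    public boolean isEmpty() {
        return count == 0;
    }
}
```

Note that pop is declared void here so that it is purely an o-function, with top as the matching v-function. A pop that also returned the removed item would mix the two categories, which is common in practice but makes the classification less clean.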

You might imagine that the o-functions do all the complex work of a class, while the v-functions trivially inspect and return fields, but this is not always the case. Consider the problem of computing perspective projections from some viewpoint:

class Perspective {
	// fields describing the viewpoint

	public void set( Viewpoint v ) { // an o-function
		// do whatever is needed
	}

	public Point transform( Point p ) { // a v-function
		// pseudocode: return the perspective transformation of point p
	}
}

Here, it turns out, the fields describing the perspective transformation from a particular viewpoint are essentially a matrix. When the viewpoint is set (by providing the coordinates of a point and the direction of the view from that point), the code in our o-function must compute this matrix.

When the coordinates of a point are transformed into the coordinates as seen from that viewpoint, the essential computation involves multiplying the coordinate vector by the perspective transformation matrix. Matrix multiplication is an expensive operation, so our v-function does quite a bit of work. Of course, we can also construct alternative problems where the o-functions naturally do all the work while the v-functions do very little.
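To make the division of labor concrete, here is a minimal runnable sketch of such a class. It is not the definitive form of the class outlined above: to keep it short, it assumes the viewer always looks along the positive z axis, so the viewpoint is given only by an eye position (ex, ey, ez) and a hypothetical distance d to the projection plane, and points are plain arrays of three doubles rather than objects of some point class.

```java
// A sketch under stated assumptions: the eye looks along the +z axis,
// and the image plane is d units in front of the eye.
class Perspective {
    // the field describing the viewpoint: a 4x4 homogeneous matrix
    private final double[][] m = new double[4][4];

    // o-function: compute the matrix for a new viewpoint
    public void set(double ex, double ey, double ez, double d) {
        double[][] t = {
            { 1, 0, 0,     -ex     },  // translate so the eye is at the origin
            { 0, 1, 0,     -ey     },
            { 0, 0, 1,     -ez     },
            { 0, 0, 1 / d, -ez / d }   // w = (depth in front of the eye) / d
        };
        for (int i = 0; i < 4; i++) {
            System.arraycopy(t[i], 0, m[i], 0, 4);
        }
    }

    // v-function: multiply the point by the matrix, then divide by w.
    // Calling this before set() yields NaN results, since m is all zeros.
    public double[] transform(double[] p) {
        double[] h = { p[0], p[1], p[2], 1.0 };  // homogeneous coordinates
        double[] r = new double[4];
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                r[i] += m[i][j] * h[j];
            }
        }
        return new double[] { r[0] / r[3], r[1] / r[3], r[2] / r[3] };
    }
}

class PerspectiveDemo {
    public static void main(String[] args) {
        Perspective view = new Perspective();
        view.set(0, 0, -1, 1);  // eye at (0,0,-1), image plane 1 unit ahead
        double[] q = view.transform(new double[] { 2, 4, 1 });
        System.out.println(q[0] + " " + q[1] + " " + q[2]);  // prints 1.0 2.0 1.0
    }
}
```

In this sketch the o-function runs once per change of viewpoint, while the v-function performs 16 multiplications and 3 divisions for every point transformed, so a program that draws many points spends most of its time in the v-function.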

Virtual Machines

The term virtual machine is used fairly frequently these days, and in its broadest sense, it is very relevant to the problem of organizing large programs. First, however, let's look at its primary meanings.

software-defined virtual machines
When you have executable code in the instruction set of one machine, but the hardware you have executes a different instruction set, you can still run that code if you have a virtual (software) implementation of the instruction set you want. Such a virtual implementation is also referred to as an instruction-set emulator.

For example, many computer architecture courses are taught using the ARM and MIPS instruction sets, yet most of the students taking those courses only have access to computers with Intel x86-family processors. So, to run their programming assignments, they use virtual MIPS or ARM processors implemented in software by an appropriate emulator.

For a second example, most Java compilers produce output in a machine language called J-code (more commonly known as Java bytecode), and the simplest Java run-time systems use a virtual J machine, that is, a J-code emulator, to run the code. For a brief time, Rockwell Collins Corp. of Cedar Rapids, Iowa, offered a hardware J-machine for sale. They did this because they had already developed, in house, a microprocessor called the AAMP that was so similar to the J-machine that converting the AAMP to run J-code was fairly easy.
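The idea of an instruction-set emulator can be sketched in a few lines of Java. The machine below is entirely hypothetical: its seven opcodes, its single accumulator and its 32-word memory are invented for this example and do not correspond to MIPS, ARM or J-code. The essential part is the fetch-decode-execute loop in run, which does in software what the control unit of a real processor does in hardware.

```java
// A toy software-defined virtual machine. The instruction set is invented
// for illustration; each instruction is two words: an opcode and an operand.
class ToyMachine {
    static final int HALT = 0, LOADI = 1, LOAD = 2, STORE = 3,
                     ADD = 4, SUBI = 5, JNZ = 6;

    final int[] mem = new int[32];  // memory holds both program and data
    int pc = 0;                     // program counter
    int acc = 0;                    // accumulator

    void load(int[] program) {
        System.arraycopy(program, 0, mem, 0, program.length);
    }

    void run() {
        while (true) {
            int op = mem[pc];       // fetch
            int arg = mem[pc + 1];
            pc += 2;
            switch (op) {           // decode and execute
                case HALT:  return;
                case LOADI: acc = arg; break;
                case LOAD:  acc = mem[arg]; break;
                case STORE: mem[arg] = acc; break;
                case ADD:   acc += mem[arg]; break;
                case SUBI:  acc -= arg; break;
                case JNZ:   if (acc != 0) pc = arg; break;
                default: throw new IllegalStateException("bad opcode " + op);
            }
        }
    }
}

class ToyDemo {
    // Sums 5 + 4 + 3 + 2 + 1, using word 30 for the sum and word 31 as a counter.
    static final int[] SUM_PROGRAM = {
        ToyMachine.LOADI, 5,   //  0: acc = 5
        ToyMachine.STORE, 31,  //  2: counter = acc
        ToyMachine.LOADI, 0,   //  4: acc = 0
        ToyMachine.STORE, 30,  //  6: sum = acc
        ToyMachine.LOAD,  30,  //  8: loop: acc = sum
        ToyMachine.ADD,   31,  // 10: acc += counter
        ToyMachine.STORE, 30,  // 12: sum = acc
        ToyMachine.LOAD,  31,  // 14: acc = counter
        ToyMachine.SUBI,  1,   // 16: acc -= 1
        ToyMachine.STORE, 31,  // 18: counter = acc
        ToyMachine.JNZ,   8,   // 20: if acc != 0 go to loop
        ToyMachine.HALT,  0    // 22: stop
    };

    public static void main(String[] args) {
        ToyMachine vm = new ToyMachine();
        vm.load(SUM_PROGRAM);
        vm.run();
        System.out.println(vm.mem[30]);  // prints 15
    }
}
```

The program in SUM_PROGRAM is machine code for a machine that exists only as the Java class ToyMachine. An emulator for a real instruction set has the same overall shape, with a much larger switch statement and a more detailed model of registers and memory.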

virtual machine monitors
A virtual machine monitor is an operating system that creates one or more virtual machines running on one physical machine. The virtual machines are typically identical in their instruction set to the physical machine, except that each has access to only a subset of the physical resources.

A well-written virtual machine monitor on a computer designed to be virtualizable will typically run user code at a speed very close to the actual hardware speed. Most machine instructions on a virtual machine will be executed by the physical hardware. The only instructions that are executed in software are the instructions that control access to physical resources such as I/O devices and the memory management unit.

For example, VMware (a product built by the company of the same name) tries to virtualize the x86 family instruction set. This software is widely used on server farms that provide cloud computing services, in order to prevent users of cloud services from interfering with each other. An unfortunate problem with the Intel x86 family is that it was not originally designed to be virtualizable, and finding all of the parts of the instruction set that need virtualization has proven to be very difficult.

The VM operating system from IBM that runs on their enterprise server architecture is a far better example, as well as being the first fully developed example in this category.

virtual machines (in general)
The most general definition of a virtual machine is that it is any set of hardware and software resources that, taken together, create an environment in which applications can be developed.

An important point to note, here, is that most applications are developed in the context of a hierarchy of virtual machines. Recognizing that many of the components of large programs are actually defining new virtual machine layers can help make sense of large programs that would otherwise be difficult to digest.
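A small sketch may help show what a hierarchy of virtual machine layers looks like inside one program. The three classes below are hypothetical, invented for this example. Each layer is written only in terms of the operations offered by the layer beneath it, so each lower layer acts as the "machine" on which the next one is programmed.

```java
// Layer 1: a stack machine, built on the arrays provided by the Java
// virtual machine. Bounds checking is omitted to keep the sketch short.
class StackLayer {
    private final double[] cells = new double[32];
    private int top = 0;

    void push(double v) { cells[top++] = v; }
    double pop()        { return cells[--top]; }
}

// Layer 2: an arithmetic machine, built only from StackLayer operations.
// Code at this level never sees the array inside StackLayer.
class ArithmeticLayer {
    private final StackLayer stack = new StackLayer();

    void constant(double v) { stack.push(v); }

    void add() {
        double b = stack.pop();
        double a = stack.pop();
        stack.push(a + b);
    }

    void multiply() {
        double b = stack.pop();
        double a = stack.pop();
        stack.push(a * b);
    }

    double result() {
        double r = stack.pop();
        stack.push(r);  // leave the result on the stack
        return r;
    }
}

// Layer 3: an application, written as a "program" for the arithmetic machine.
class PerimeterApp {
    static double rectanglePerimeter(double w, double h) {
        ArithmeticLayer m = new ArithmeticLayer();
        m.constant(w);
        m.constant(h);
        m.add();
        m.constant(2);
        m.multiply();
        return m.result();
    }

    public static void main(String[] args) {
        System.out.println(rectanglePerimeter(3, 4));  // prints 14.0
    }
}
```

A reader trying to understand PerimeterApp needs to know only the operations of ArithmeticLayer, not how the stack is stored. Below StackLayer the hierarchy continues with the Java virtual machine, the operating system and the hardware.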

The Origin of the Term

The term virtual machine is, in part, a borrowing by computer scientists from the field of optics. In classical optics, when discussing lenses, the key concept is that of an image. You have an object, a lens, and the image of that object cast by the lens. This works so long as you are describing converging lenses (like magnifying glasses). The key defining characteristic of an image is that, if you put a projection screen or a sheet of white paper in the plane of an image, you will see that image projected on the paper.

Many optical systems have numerous image planes in them. For example, in an astronomical refracting telescope, the objective lens projects an image of a patch of sky on the primary image plane. There is no sheet of paper there, but the image exists regardless of whether you put a projection screen there so you can see it. The eyepiece of the telescope is, essentially, a magnifying glass allowing you to look at a small patch of the primary image. The combination of the eyepiece lens and the lens in your eye then projects an image on your retina. That is what you actually see.

When some of the lenses are diverging lenses, we need to modify our terms. When you look through a diverging lens, you see something in the distance through the lens. From your human perspective, it looks like you are looking at an image or at a real object, but if you reach behind the lens to grab something you see, your hand will miss the real object.

The problem with diverging lenses is that they create what is referred to, in classical optics, as virtual images. A virtual image is something you see when you look through a diverging lens. There is either an object or a real image behind the diverging lens, but it is not located where the virtual image you see appears to be.

The term virtual machine comes from this usage. The virtual machine is the machine you see through the lens of the software when it is running on a real machine.