22C:18, Lecture 32, Summer 1997

Douglas W. Jones
University of Iowa Department of Computer Science

Virtual Machines

One important use of trap service routines is to extend the instruction set implemented by hardware. This allows a wide range of different versions of the hardware to be built, all able to execute the same instruction set. Some versions of the hardware may support the entire instruction set, while others may support only parts of it in hardware, using software to pick up the pieces. We say that all of these implementations provide the user with the same virtual machine, even though the physical machines are quite different.

One common example of this is found in many personal computers, where the same code may be run with and without a floating point coprocessor. When the coprocessor is present, the hardware directly executes floating point instructions; when the coprocessor is absent, the hardware merely traps floating point instructions and allows the software to execute them.
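
The coprocessor case can be modeled in a few lines of Python. This is only a toy sketch: the `Machine` class, the `Trap` exception and the opcode names `ADD` and `FMUL` are hypothetical and do not correspond to any real instruction set. It shows the one point that matters here, that the user sees the same results whether an instruction is executed by hardware or picked up by a trap handler.

```python
# Toy model only: Machine, Trap and the opcode names are hypothetical.

class Trap(Exception):
    """Raised by the modeled hardware for an instruction it cannot execute."""


class Machine:
    def __init__(self, has_fpu):
        self.has_fpu = has_fpu

    def hardware_execute(self, op, a, b):
        """What the physical machine does on its own."""
        if op == "ADD":
            return a + b
        if op == "FMUL" and self.has_fpu:
            return a * b          # coprocessor present: done in hardware
        raise Trap(op)            # anything else traps

    def trap_handler(self, op, a, b):
        """System software that picks up what the hardware leaves out."""
        if op == "FMUL":
            return a * b          # stand-in for a software floating point routine
        raise ValueError("undefined instruction: " + op)

    def execute(self, op, a, b):
        """What the user sees: the virtual machine."""
        try:
            return self.hardware_execute(op, a, b)
        except Trap:
            return self.trap_handler(op, a, b)


with_fpu = Machine(has_fpu=True)
without_fpu = Machine(has_fpu=False)
program = [("ADD", 2, 3), ("FMUL", 1.5, 4.0)]
print([with_fpu.execute(*i) for i in program])
print([without_fpu.execute(*i) for i in program])
```

Both machines print the same list, even though only one of them executes `FMUL` without trapping.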

Another example is provided by the evolution of the DEC VAX family of computers. The original VAX, the 11/780, was implemented entirely in hardware. When DEC came out with the VAX 11/750, and again with the MicroVAX, both newer and less expensive versions of the machine, it omitted some of the less frequently used instructions from the hardware, relying on software for those functions. User programs required no changes because the combination of hardware and system software supported the same virtual machine across all members of the VAX family.

IBM has used similar techniques for years with the IBM mainframe family that grew from the IBM 360 of the 1960's. The primary operating system for this family of machines is now known as VM, standing for virtual machine, because of its extensive use of this idea. Microsoft's Windows operating system, with the release of Windows 95, has recently begun to make extensive use of related ideas.

Extending the Hawk Virtual Machine

Let's examine the possibility of extending the Hawk virtual machine by adding a register-to-register multiply instruction. The Hawk machine has many undefined instructions. For example, a quick check of the two-register instruction group shows the following:

 _______________________________
|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
|       |  dst  |       |  src  |

 0 0 0 1         1 1 1 1          TRUNC  r[dst] = setcc(trunc(r[dst],src bits))
 0 0 0 1         1 1 1 0          SXT    r[dst] = setcc(sxt(r[dst],src bits))
 0 0 0 1         1 1 0 1          --
 0 0 0 1         1 1 0 0          --
 0 0 0 1         1 0 1 1          AND    r[dst] = setcc(r[dst]&r[src])
 0 0 0 1         1 0 1 0          OR     r[dst] = setcc(r[dst]|r[src])

For our example, let's consider a software implementation of a multiply instruction that takes the first of the unused instructions listed above and redefines it as:

 0 0 0 1         1 1 0 1          MUL    r[dst] = setcc(r[dst]*r[src])

Of course, we could do this by extending the Hawk emulator to support this instruction; this is equivalent to building new hardware, perhaps designing a next-generation integrated circuit for the machine. A manufacturer who does this would be well advised to offer software upgrades to users of the original machine that support the same instruction, and we will pursue this approach here to illustrate how virtual machine extensions can be built.

We will assume that the basic register save and restore mechanism discussed in the previous lecture is being used, and concentrate on writing a specific handler for the undefined instruction trap. This begins by setting up the linkage to the handler, using the macro defined in the previous lecture. According to the Hawk manual, the instruction trap is trap number 2, so we can link things up as follows:

	HANDLER	2,INSTRUCTION,32

Having done this, all we need to do is write a procedure to handle the trap. We can do this as follows:

;-------------------------------
INSTRUCTION:			; trap handler for instruction traps
				; assumes R2 points to AR
				; assumes R3 points to register save area
	LOAD	R8,R3,PCSV	; get saved program counter
	LOADS	R9,R8		; get instruction that caused trap
	EXTH	R9,R9,R8	; extract the halfword addressed by R8
	LIL	R10,#F0F0	; load mask to check opcode
	AND	R10,R9		; compute opcode
	CMPI	R10,#10D0	; is it MUL?
	BEQ	ISMUL		; -- if so, go do it

	...			; code to handle other possibilities

ISMUL:				; multiply opcode found
	ADDIS	R8,2		; increment program counter
	STORE	R8,R3,PCSV	;   so return will be to right point
	LIS	R10,#F		; mask for src field
	AND	R10,R9		; get src field
	LIL	R11,#0F00	; mask for dst field
	AND	R11,R9		; get dst field
	SL	R10,2
	SRU	R11,6		; make both fields into word offsets
	LEA	R12,R3,R1SV-4	; get pointer to R0 save loc
	ADD	R10,R12,R10	; make pointer to R[src] save loc
	ADD	R11,R12,R11	; make pointer to R[dst] save loc
	MOVE	R12,R3		;   save R3
	LOADS	R4,R10		; get src operand
	LOADS	R3,R11		; get dst operand
	MOVE	R13,R1		;   save R1
	CALL	R1,TIMES	; multiply them
	MOVE	R1,R13		;   restore R1
	STORES	R3,R11		; save the result
	MOVE	R3,R12		;   restore R3
;	note that at this point we could set N and Z to report on the result
	JSRS	R1		; return

Of course, we must also define the MUL instruction for the users so that they can assemble it. For example, we could add the following to hawk.macs:

	MACRO	MUL dst,src
	  H	#10D0 ! (dst << 8) | src
	ENDMAC

There are some unresolved problems here! The effect of this multiply instruction is not well defined for operations on register zero. Of course, we could just state in the programmer's manual that it produces nonsense when R0 is an operand, but it would be better to ensure that the handler at least does not try to store values in the nonexistent save location R0SV.
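
One possible way to guard against this is sketched below in Python. The convention it adopts is an assumption, not something these notes settle on: R0 reads as zero when used as an operand, and a result destined for R0 is simply discarded. The function name `mul_with_r0_guard` and the dictionary standing in for the save area are hypothetical.

```python
def mul_with_r0_guard(saved, dst, src):
    """Multiply saved registers, never touching a save slot for R0.

    saved is a dict {1: value, ..., 15: value}; it has no entry for R0.
    Assumption: R0 reads as zero and writes to R0 are discarded.
    """
    a = saved[dst] if dst != 0 else 0
    b = saved[src] if src != 0 else 0
    product = (a * b) & 0xFFFFFFFF      # keep the low 32 bits
    if dst != 0:
        saved[dst] = product            # only real registers are updated
    return product


saved = {n: 0 for n in range(1, 16)}
saved[3], saved[4] = 6, 7
print(mul_with_r0_guard(saved, 3, 4), saved[3])
print(mul_with_r0_guard(saved, 0, 4), 0 in saved)
```

In the assembly handler, the same effect would need a test of each field for zero before its save location is used.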