Machine Problem 3, due October 20

Part of the homework for 22C:60, Fall 2008
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Submit the source file mp3.a (or mp3.txt if you must) for your solution to ICON on or before the indicated date.

The Problem

The file mp3data.o is an object file containing some null-terminated strings encoded in UTF16 format. The addresses of these strings are given by the external symbols STRING1 and STRING2. (There are many good web pages on UTF16, the 16-bit encoding of Unicode, including the Wikipedia page.)

First, write a subroutine, call it UTF16TOASCII. This subroutine has 3 parameters.

The subroutine should place consecutive characters of the UTF16 string in the ASCII string, replacing any non-ASCII characters with the ASCII question mark. This assignment is relatively easy if you only worry about the common Unicode values U+0000 to U+FFFF. The harder part of the assignment has to do with what the Unicode documentation refers to as surrogate pairs: the simple solution will replace each such pair with two question marks, while a correct solution will recognize that each surrogate pair is just one character.

The subroutine must not ever overflow the bounds of the destination string, and it must guarantee that the destination string is null terminated.

Second, write a main program that allocates a 10-character buffer to hold the ASCII string and then calls your subroutine twice, once to convert STRING1 to ASCII and once to convert STRING2. After each conversion, print out the converted string.

To link your program, after you have assembled your source file, you will need to type:

	link mp3.o mp3data.o