Note on Machine Problem 2

Part of the homework for 22C:50, Summer 2003
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

The solution to MP2 distributed to the class is the third one I produced. My first solution, Version 2.0, included the DEF function in operands, and IF/ENDIF without any ELSE clauses but allowing nesting of IF/ENDIF blocks.

The one tricky thing about supporting DEF was suppressing error messages in the operands passed to DEF for testing. I did this by adding a Boolean parameter to parse_operand(): when true, it allows error messages; when false, it suppresses them. Here is my code for handling IF/ENDIF blocks, excerpted from the body of parse_statement() right after the code for handling the B and W directives:

                        object_word( location,  value );        /*V1.1*/
                        location.value = location.value + 2;
/*V2.0*/
/*V2.0*/        } else if ((SYM_HANDLE)lex_this.val == if_handle) {
/*V2.0*/                OBJECT_VALUE value; /* new value */
/*V2.0*/
/*V2.0*/                /* scan over the IF */
/*V2.0*/                lex_scan();
/*V2.0*/
/*V2.0*/                /* process operand */
/*V2.0*/                value = parse_operand( TRUE );
/*V2.0*/                if (value.value == 0) { /* skip until endif */
/*V2.0*/                        int nestlevel = 1;
/*V2.0*/                        do { /* skip lines until ENDIF */
/*V2.0*/                                lex_scan_line();
/*V2.0*/                                if (lex_this.typ == endfile) {
/*V2.0*/                                        lex_error( &lex_this,
/*V2.0*/                                                "missing endif" );
/*V2.0*/                                        break;
/*V2.0*/                                }
/*V2.0*/                                if ((lex_this.typ == identifier)
/*V2.0*/                                && lex_ispunc( &lex_next, ':' )) {
/*V2.0*/                                        lex_scan(); /* skip label */
/*V2.0*/                                        lex_scan(); /* skip : */
/*V2.0*/                                }
/*V2.0*/
/*V2.0*/                                /* handle nested IF/ENDIF constructs */
/*V2.0*/                                if ((SYM_HANDLE)lex_this.val
/*V2.0*/                                        == if_handle) {
/*V2.0*/                                        nestlevel++;
/*V2.0*/                                } else if ((SYM_HANDLE)lex_this.val
/*V2.0*/                                        == endif_handle) {
/*V2.0*/                                        nestlevel--;
/*V2.0*/                                }
/*V2.0*/
/*V2.0*/                        } while (((SYM_HANDLE)lex_this.val
/*V2.0*/                                        != endif_handle)
/*V2.0*/                                ||  (nestlevel != 0));
/*V2.0*/
/*V2.0*/                        /* scan over the ENDIF */
/*V2.0*/                        if (lex_this.typ != endfile) lex_scan();
/*V2.0*/                }
/*V2.0*/
/*V2.0*/        } else if ((SYM_HANDLE)lex_this.val == endif_handle) {
/*V2.0*/                /* scan over the ENDIF */
/*V2.0*/                lex_scan();
/*V2.0*/                /* other than that, ignore it */
/*V2.0*/
                } else {
                        lex_error( &lex_this, "illegal opcode or pseudo-op" );
                }
        }
}

Version 2.1 of my solution was a first try at supporting ELSE clauses. I did this by blindly stuffing the following new code into parse_statement() between the code for IF and the code for ENDIF:

/*V2.0*/                        if (lex_this.typ != endfile) lex_scan();
/*V2.0*/                }
/*V2.1*/
/*V2.1*/        } else if ((SYM_HANDLE)lex_this.val == else_handle) {
/*V2.1*/                int nestlevel = 1;
/*V2.1*/
/*V2.1*/                for (;;) { /* skip lines until ENDIF */
/*V2.1*/                        lex_scan_line();
/*V2.1*/                        if (lex_this.typ == endfile) {
/*V2.1*/                                lex_error( &lex_this, "missing endif" );
/*V2.1*/                                break;
/*V2.1*/                        }
/*V2.1*/
/*V2.1*/                        /* handle labels on ENDIF */
/*V2.1*/                        if ((lex_this.typ == identifier)
/*V2.1*/                        && lex_ispunc( &lex_next, ':' )) {
/*V2.1*/                                lex_scan(); /* skip label */
/*V2.1*/                                lex_scan(); /* skip : */
/*V2.1*/                        }
/*V2.1*/
/*V2.1*/                        /* handle nested IF/ENDIF constructs */
/*V2.1*/                        if ((SYM_HANDLE)lex_this.val
/*V2.1*/                                == if_handle) {
/*V2.1*/                                nestlevel++;
/*V2.1*/                        } else if ((SYM_HANDLE)lex_this.val
/*V2.1*/                                == endif_handle) {
/*V2.1*/                                nestlevel--;
/*V2.1*/                        }
/*V2.1*/
/*V2.1*/                        if (nestlevel != 0) continue;
/*V2.1*/                        if ((SYM_HANDLE)lex_this.val
/*V2.1*/                                == endif_handle) break;
/*V2.1*/                }
/*V2.1*/
/*V2.1*/                /* scan over the ENDIF */
/*V2.1*/                if (lex_this.typ != endfile) lex_scan();
/*V2.1*/                /* other than that, ignore it */
/*V2.0*/
/*V2.0*/        } else if ((SYM_HANDLE)lex_this.val == endif_handle) {

The above code is a workable start, but the code to skip to an ENDIF after an ELSE obviously duplicates a large part of the code to skip to an ELSE or ENDIF after an IF. This led to my final version, Version 2.2, posted online.
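
One way to remove this kind of duplication is a single skipping routine with a flag that says whether an ELSE at the outermost level should end the skip. The following is a minimal sketch of that idea, not the Version 2.2 code. It assumes a hypothetical model in which each source line has already been reduced to one opcode; the names opcode, skip_block and stop_at_else are invented for illustration and do not appear in the assembler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the opcode found on one source line. */
enum opcode { OP_OTHER, OP_IF, OP_ELSE, OP_ENDIF };

/*
 * Skip lines following line[start], which is assumed to hold the IF or
 * ELSE that triggered the skip.  Returns the index of the line that ends
 * the skip: the matching ENDIF, or the matching ELSE when stop_at_else is
 * nonzero.  Returns n if the input ends first (the "missing endif" case).
 */
size_t skip_block(const enum opcode *line, size_t n,
                  size_t start, int stop_at_else)
{
    int nestlevel = 1;   /* the block being skipped counts as level 1 */
    size_t i;

    assert(line != NULL);

    for (i = start + 1; i < n; i++) {
        if (line[i] == OP_IF) {
            nestlevel++;                 /* entering a nested block */
        } else if (line[i] == OP_ENDIF) {
            nestlevel--;
            if (nestlevel == 0) return i;    /* matching ENDIF */
        } else if (line[i] == OP_ELSE
               &&  nestlevel == 1 && stop_at_else) {
            return i;                    /* matching ELSE */
        }
    }
    return n;   /* ran off the end of the input */
}
```

In this sketch, the handler for an IF with a false operand would call skip_block() with stop_at_else nonzero, while the handler for an ELSE reached in normal processing would call it with stop_at_else zero, so both cases share one loop.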

The test file distributed with Version 2.2 includes some fairly exhaustive tests of IF/ELSE/ENDIF, and these demonstrate some rather nasty things that the assembler supports without complaint: it never complains about extra ENDIF directives, it allows an ELSE that isn't preceded by an IF, and it allows multiple ELSE directives between IF and ENDIF.

It would be perfectly appropriate to add code to catch these "misfeatures", but many production assemblers and compilers have similar misfeatures that are simply left undocumented. The assembler or compiler is considered to be formally correct if it produces correct output for all syntactically valid inputs, and the fact that it does not detect all possible errors in the input is not considered a fatal defect. At least, in the test data for Version 2.2, these misfeatures are clearly documented!
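
For readers who want to catch these cases, one possible approach is to keep a small stack with one entry per open IF, recording whether an ELSE has already been seen at that level. The sketch below is an assumption-laden illustration, not part of any distributed version. It works on a hypothetical array of per-line directives rather than on the assembler's lexical scanner, and every name in it (directive, if_error, check_conditionals, MAX_NEST) is invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the directive found on one source line. */
enum directive { DIR_OTHER, DIR_IF, DIR_ELSE, DIR_ENDIF };

enum if_error {
    IF_OK, IF_EXTRA_ENDIF, IF_ELSE_WITHOUT_IF,
    IF_DUPLICATE_ELSE, IF_MISSING_ENDIF, IF_TOO_DEEP
};

#define MAX_NEST 32   /* arbitrary nesting limit for this sketch */

/*
 * Check the IF/ELSE/ENDIF structure of n lines.  On error, *where is set
 * to the index of the offending line; otherwise it is set to n.
 */
enum if_error check_conditionals(const enum directive *line, size_t n,
                                 size_t *where)
{
    unsigned char seen_else[MAX_NEST];  /* one flag per open IF */
    size_t depth = 0;                   /* number of open IFs */
    size_t i;

    assert(line != NULL && where != NULL);

    for (i = 0; i < n; i++) {
        *where = i;
        switch (line[i]) {
        case DIR_IF:
            if (depth == MAX_NEST) return IF_TOO_DEEP;
            seen_else[depth++] = 0;
            break;
        case DIR_ELSE:
            if (depth == 0) return IF_ELSE_WITHOUT_IF;
            if (seen_else[depth - 1]) return IF_DUPLICATE_ELSE;
            seen_else[depth - 1] = 1;
            break;
        case DIR_ENDIF:
            if (depth == 0) return IF_EXTRA_ENDIF;
            depth--;
            break;
        default:
            break;   /* ordinary lines do not affect nesting */
        }
    }
    *where = n;
    return (depth == 0) ? IF_OK : IF_MISSING_ENDIF;
}
```

In a real assembler the same bookkeeping would have to live alongside the skipping logic, since lines inside a skipped block never reach the normal statement parser; the sketch sidesteps that by examining every line.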