
Definition

The Falcon Programming Language

Part of the documentation for the Falcon language
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science


BNF Notation

The notation used here is BNF. There is no consensus on what BNF stands for. John Backus developed the notation, and Peter Naur simplified it, both while working on the Algol 60 report, published in 1960. Naur named it Backus Normal Form, but it is not, formally speaking, a normal form. Later, Don Knuth suggested that it should be called Backus–Naur Form. There have been many variations on this theme, so we must define our notation here:

Symbols used in the description of a language are divided into two classes:

Terminal symbols actually occur in the language. Terminal symbols in the grammar are enclosed in quotation marks. Thus, 'rock' and "stone" might be terminal symbols in a grammar for the English language. Double and single quotes are both used in order to allow quotation marks themselves to be quoted, as in "'".

Nonterminal symbols are used to describe constructs in the language. Thus, in a grammar of English, you might find <noun phrase> used to describe such strings as "the big red house" and <adverb> used to describe the word "quickly". Angle brackets surround the name of every nonterminal.

The grammar consists of a set of production rules that describe how the terminal symbols of the language combine. Every production rule consists of a designated nonterminal on the left-hand side and, on the right-hand side, a list of terminal and nonterminal symbols that may be combined to form an instance of the designated nonterminal. For example, the following production rule might occur in a grammar for the English language:

        <noun phrase> ::= <article> <noun>

The phrase "the house" follows this rule, so long as "the" is accepted as an <article> and "house" is accepted as a <noun>.

The punctuation "::=" is used to separate the left and right sides of the production rules. (Peter Naur seems to have introduced this mark because it is distinctly different yet still easy to type.)

Multiple production rules may be combined. Consider these two rules:

        <noun phrase> ::= <article> <noun>
        <noun phrase> ::= <article> <adjective> <noun>

These may be combined and written as:

        <noun phrase> ::= <article> <noun>
                       |  <article> <adjective> <noun>

Optional syntactic elements may be enclosed in square brackets. This allows a more compact expression of the above:

        <noun phrase> ::= <article> [ <adjective> ] <noun>

Repeated syntactic elements may be enclosed in curly braces. This allows expressions such as:

        <noun list> ::= <noun> [ { "," <noun> } "and" <noun> ]

This compact expression stands in for the following production rules:

        <noun list> ::= <noun>
                     |  <noun> "and" <noun>
                     |  <noun> <comma noun list> "and" <noun>
        <comma noun list> ::= "," <noun>
                           |  "," <noun> <comma noun list>

Finally, the mark --, occurring without quotation, is used as a comment indicator, with the comment extending to the end of line.

Over the years, numerous other extensions and variants have been proposed for use in BNF grammars. Here, we will limit ourselves to the above.

Character Set

Falcon is defined over the 7-bit ASCII character set, with provisions for extensions to UTF-8, the standard 8-bit encoding of Unicode. The following characters and classes of characters are defined:

        <tab>     ::= HT   -- 09
        <newline> ::= LF   -- 0A
        <break>   ::= VT   -- 0B
                   |  FF   -- 0C
                   |  CR   -- 0D
        <space>   ::= " "  -- 20
        <digit>   ::= "0"  -- 30
                   |  "1"  -- 31
                  ...
                   |  "9"  -- 39
        <letter>  ::= "A"  -- 41
                   |  "B"  -- 42
                  ...
                   |  "Z"  -- 5A
                   |  "a"  -- 61
                   |  "b"  -- 62
                  ...
                   |  "z"  -- 7A

The comments above give the hexadecimal code for the corresponding ASCII character. Note that ASCII is a strict subset of UTF-8. The <letter> category may be extended to accept the accented letters in the Unicode Latin-1 Supplement as well as the Greek and Cyrillic alphabets. Extension to support Hebrew, Arabic or other alphabets that do not render left to right is more problematic.

Additionally, the <space>, <newline> and <break> categories may be extended to include such things as Unicode's reverse linefeed (008D, a form of line break) and no-break space (00A0, a form of space). Unicode is vast, and the appropriate use of all of its obscure and scattered control characters is difficult to determine.

Of more importance, quoted character strings in Falcon may contain any character other than a quotation mark. This requires the definition of two special sets of characters:

        <anything but double quote> ::= NUL  -- 00
                                    ...
                                     |  "!"  -- 21
                                     |  "#"  -- 23
                                    ...
                                     |          FF
        <anything but single quote> ::= NUL  -- 00
                                    ...
                                     |  "&"  -- 26
                                     |  "("  -- 28
                                    ...
                                     |          FF

This permits any 8-bit codes to appear within quoted strings, thus allowing full use of UTF-8 between quotation marks.

Falcon Lexical Structure

At the lexical level, a Falcon program consists of a sequence of lexemes separated by arbitrary amounts of white space and terminated by an end of file:

        <falcon program> ::= { { <white space> } <lexeme> }
                               { <white space> } <end of file>

In what follows, however, we will cheat and view the end of file mark as a kind of lexeme, one that occurs only once in any program and only at the end. This allows us to describe a Falcon program as follows:

        <falcon program> ::= { { <white space> } <lexeme> }

White space consists of spaces, tabs, newlines, comments and other format effectors classified here as breaks:

        <white space> ::= <space>
                       |  <tab>
                       |  <newline>
                       |  <comment>
                       |  <break>

Comments in Falcon begin with a double dash and continue to the end of line:

        <comment> ::= "--" { <anything but newline> } <newline>

For error reporting purposes, newlines may be counted in order to determine the line number in the program. Aside from counting newlines and serving to separate lexemes, the white space in a program has no significance.
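The white space rules above can be sketched as a small scanner helper. This is a minimal illustration in Python, not part of the Falcon definition; the function name `skip_white_space` and its calling convention are assumptions made for the example. It treats space, tab, newline and the break characters (VT, FF, CR) as white space, skips comments up to the end of the line, and counts newlines for error reporting. As a convenience that the grammar does not promise, it also tolerates a final comment that ends at the end of the text with no newline.

```python
def skip_white_space(text, pos, line):
    """Skip white space and comments starting at text[pos].

    Returns (new_pos, new_line), where new_line is the line number after
    counting every newline that was skipped.
    """
    while pos < len(text):
        ch = text[pos]
        if ch == "\n":
            line += 1              # newlines are counted for error reports
            pos += 1
        elif ch in " \t\v\f\r":    # space, tab, and the break characters
            pos += 1
        elif text.startswith("--", pos):
            # a comment runs to the end of line; the newline itself is
            # consumed (and counted) by the next loop iteration
            while pos < len(text) and text[pos] != "\n":
                pos += 1
        else:
            break                  # start of a lexeme
    return pos, line
```

A single minus sign is not a comment, so the helper stops in front of it and leaves it for the punctuation scanner.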

Lexemes include all of the symbols of the programming language, including the end of file:

        <lexeme> ::= <identifier>
                  |  <number>
                  |  <string>
                  |  <punctuation>
                  |  <end of file>

As in most programming languages, an identifier consists of a letter followed by any number of letters or digits:

        <identifier> ::= <letter> { <letter> | <digit> }

There is no limit on the length of an identifier, and all characters within an identifier are significant. Note that the formal definition given here is ambiguous! Because white space between consecutive lexemes is optional, ab could be read as two consecutive identifiers, a followed by b. This ambiguity is resolved informally by asserting that consecutive identifiers must be separated by white space, so where there is ambiguity, we always read the longest possible identifier.
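The longest-match rule for identifiers can be illustrated with a short scanner sketch. The code below is Python, chosen only for illustration; the name `scan_identifier` is hypothetical, and the sketch covers only the 7-bit ASCII letters and digits, not the possible Unicode extensions.

```python
import string

LETTERS = set(string.ascii_letters)   # "A".."Z" and "a".."z"
DIGITS = set(string.digits)           # "0".."9"

def scan_identifier(text, pos):
    """Return (identifier, next_pos) for the longest identifier at pos.

    Returns None when text[pos] is not a letter, since an identifier
    must begin with a letter.
    """
    if pos >= len(text) or text[pos] not in LETTERS:
        return None
    end = pos + 1
    # keep consuming letters and digits: always take the longest identifier
    while end < len(text) and (text[end] in LETTERS or text[end] in DIGITS):
        end += 1
    return text[pos:end], end
```

With this rule, the text ab is always read as the single identifier ab, never as a followed by b.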

Numbers may be specified in any number base from 2 to 32. Decimal numbers are used by default, and radix specification is always given in decimal:

        <number> ::= <decimal number> [ '#' <extended number> ]
        <decimal number> ::= <digit> { <digit> }
        <extended number> ::= <extended digit> { <extended digit> }
        <extended digit> ::= <digit> | <letter>

The # (number sign) character is used to designate numbers in radices other than 10. The decimal number to the left of the number sign gives the radix, while the string of extended digits to the right of the number sign gives the value. This definition of numbers introduces the same ambiguity that is present with identifiers, and it is resolved the same way: Consecutive numbers or identifiers must be separated by white space.

The extended digit set encodes the values of digits using Crockford's Base-32 encoding system:

        digit   0   1   2   3   4   5   6   7
        value   0   1   2   3   4   5   6   7

        digit   8   9   Aa  Bb  Cc  Dd  Ee  Ff
        value   8   9   10  11  12  13  14  15

        digit   Gg  Hh  Ii  Jj  Kk  Ll  Mm  Nn
        value   16  17  1   18  19  1   20  21

        digit   Oo  Pp  Qq  Rr  Ss  Tt  Uu  Vv
        value   0   22  23  24  25  26  32  27

        digit   Ww  Xx  Yy  Zz
        value   28  29  30  31

Digits with values greater than or equal to the radix are, of course, illegal. The end of a string of extended digits making up an extended number does not depend on the radix. For bases such as 8 and 16, this encoding is identical to common octal and hexadecimal. Upper and lower case are equivalent, and for larger bases, the possibility of transcription errors is significantly reduced by interpreting the letter O as zero and the letters I and l as one. The primary use for such larger number bases is to permit compact encoding of large constants such as are used for cryptographic keys and pseudorandom number generation.
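As an illustration of the rules above, here is a minimal Python sketch that computes the value of a number lexeme. The names `DIGIT_VALUES` and `falcon_number_value` are hypothetical. The digit values follow the extended digit set described here: I and L read as 1, O reads as 0, and U carries the value 32, which makes it illegal in every radix from 2 to 32.

```python
# Digit values for the extended digit set; upper and lower case are equivalent.
DIGIT_VALUES = {
    ch: value for value, ch in enumerate("0123456789ABCDEFGHJKMNPQRSTVWXYZ")
}
DIGIT_VALUES.update({"I": 1, "L": 1, "O": 0, "U": 32})

def falcon_number_value(lexeme):
    """Return the integer value of a lexeme such as '42' or '16#FF'."""
    radix_text, sep, digits = lexeme.partition("#")
    if not radix_text or any(c not in "0123456789" for c in radix_text):
        raise ValueError("the part before '#' must be a decimal number")
    if not sep:
        return int(radix_text)          # plain decimal number
    radix = int(radix_text)             # the radix is always given in decimal
    if not 2 <= radix <= 32:
        raise ValueError("radix must be from 2 to 32")
    if not digits:
        raise ValueError("missing digits after '#'")
    value = 0
    for ch in digits:
        digit = DIGIT_VALUES.get(ch.upper())
        # digits with values greater than or equal to the radix are illegal
        if digit is None or digit >= radix:
            raise ValueError(f"illegal digit {ch!r} for radix {radix}")
        value = value * radix + digit
    return value
```

For example, `falcon_number_value("16#ff")` and `falcon_number_value("16#FF")` both give 255, matching ordinary hexadecimal.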

Quoted strings may be enclosed in either double quotes or single quotes. There is no significance to the different types of quotes.

        <string> ::= '"' { <anything but double quote> } '"'
                  |  "'" { <anything but single quote> } "'"

Within a quoted string, the backslash character has special meaning. Backslashes must occur in pairs. Backslashes enclosing white space are ignored -- allowing long strings to be broken across multiple lines. A pair of consecutive backslashes stands for a backslash. Backslashes enclosing the standard name of an ASCII control code stand for that non-printing character. Several control code names, separated by white space, may appear between one pair of backslashes.

        "abc\
        \def"       -- equivalent to "abcdef"

        "\\"        -- a single backslash
        "\LF NUL\"  -- a linefeed followed by a null
        "\SP\"      -- equivalent to " "

        "abc
        def"        -- probably equivalent to "abc\LF\        def"

Nothing else is permitted between backslashes. Names of ASCII control codes must be capitalized.
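The backslash rules can be sketched as a decoder for the text between the quotation marks. This Python sketch is illustrative only; `decode_string_body` and `CONTROL_CODES` are hypothetical names. The table uses the usual ASCII abbreviations for codes 00 through 1F, and it adds SP and DEL on the assumption that they are accepted as well, since both appear in this document.

```python
# Usual ASCII abbreviations for the control codes 00..1F, in code order.
CONTROL_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()
CONTROL_CODES = {name: chr(code) for code, name in enumerate(CONTROL_NAMES)}
CONTROL_CODES["SP"] = " "        # assumption: SP is accepted, as in "\SP\"
CONTROL_CODES["DEL"] = "\x7f"    # assumption: DEL is accepted as well

def decode_string_body(body):
    """Decode the characters between the quotes of a Falcon string."""
    parts = body.split("\\")
    # n backslashes give n+1 parts, so paired backslashes give an odd count
    if len(parts) % 2 == 0:
        raise ValueError("backslashes must occur in pairs")
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            out.append(part)             # ordinary text outside backslashes
        elif part == "":
            out.append("\\")             # two consecutive backslashes
        else:
            # white space alone yields nothing; otherwise each name,
            # which must be capitalized, yields one control character
            for name in part.split():
                if name not in CONTROL_CODES:
                    raise ValueError(f"unknown control code name {name!r}")
                out.append(CONTROL_CODES[name])
    return "".join(out)
```

The sketch does not decide what an unescaped line break inside a string means; it simply passes such characters through unchanged.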

The following punctuation marks occurring in Falcon programs are classified as lexemes:

	<punctuation> ::= ";"   -- statement terminator/separator
	               |  "="   -- assignment and comparison
	               |  ":"   -- definition
	               |  "("   -- parameter lists, array subscripts
	               |  "["   --     "
	               |  "{"   --     "
	               |  ")"   --     "
	               |  "]"   --     "
	               |  "}"   --     "
	               |  ","   -- parameter separator
	               |  "@"   -- pointer definition and use
	               |  ".."  -- subrange constructor
	               |  "<>"  -- comparison (inequality)
	               |  ">"   -- comparison
	               |  ">="  -- comparison
	               |  "<"   -- comparison
	               |  "<="  -- comparison
	               |  "+"   -- arithmetic
	               |  "-"   -- arithmetic
	               |  "*"   -- arithmetic
	               |  "/"   -- arithmetic
	               |  "\/"  -- logic
	               |  "/\"  -- logic
	               |  "~"   -- logical not or one's complement
	               |  "."   -- field selection

As the language is refined, meanings may be given to additional punctuation marks.

Note that several of the punctuation marks above are compound marks made up of several characters. Above the lexical level, these are always seen as single lexemes. Thus, for example, "/\" is a single lexeme quite distinct from "/ \". The latter is seen as two lexemes, "/" followed by "\", the latter having no defined meaning.
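One way to recognize compound marks as single lexemes is to try the longer marks before the shorter ones. The Python sketch below assumes that the longest-match rule used for identifiers applies to punctuation too; the name `scan_punctuation` is hypothetical.

```python
# The punctuation marks listed above ("\\" is one backslash in Python source).
PUNCTUATION = [
    ";", "=", ":", "(", "[", "{", ")", "]", "}", ",", "@", "..",
    "<>", ">", ">=", "<", "<=", "+", "-", "*", "/", "\\/", "/\\", "~", ".",
]
# Try two-character marks first so that "<=" is not read as "<" then "=".
BY_LENGTH = sorted(PUNCTUATION, key=len, reverse=True)

def scan_punctuation(text, pos):
    """Return (mark, next_pos) for the longest mark at pos, or None."""
    for mark in BY_LENGTH:
        if text.startswith(mark, pos):
            return mark, pos + len(mark)
    return None
```

A lone backslash matches nothing in the table, which reflects its lack of a defined meaning outside the compound marks.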

Also note the possibility of natural Unicode extensions supporting, for example, "×" as an alternative to "*" and "∧" as an alternative to "/\". Pretty printing tools and smart text-editors that make such substitutions automatically would be useful, since typing the full Unicode character set is nearly impossible.

The following reserved words are recognized as identifiers by the BNF definitions for the lexical level given above, but are treated as special cases in the Falcon syntax and may not be used except as dictated by the syntax:

	<reserved word> ::= "end"       -- terminates many constructs
	                 |  "code"      -- start code of a block
	                 |  "const"     -- various definitions
	                 |  "type"
	                 |  "exception"
	                 |  "procedure"
	                 |  "function"
	                 |  "private"   -- mark a definition as private
	                 |  "external"  -- mark a function as external
	                 |  "ref"       -- indicate parameter by reference
	                 |  "enum"      -- various types
	                 |  "array"
	                 |  "of"
	                 |  "record"
	                 |  "if"        -- statement
	                 |  "then"
	                 |  "else"
	                 |  "select"    -- statement
	                 |  "case"
	                 |  "while"     -- statement
	                 |  "do"
	                 |  "until"
	                 |  "for"       -- statement
	                 |  "in"
	                 |  "try"       -- statement
	                 |  "handle"
	                 |  "raise"     -- statement

As the language is refined, meanings may be given to additional reserved words, but this should be done sparingly.

Falcon Programs

        <falcon program> ::= <block> [ "end" ] <end of file>

A Falcon program consists of a single block. Execution of the program proceeds by allocating a record in memory to hold the components of the block, executing the initializer for the block, and deallocating the record.

Private variables and functions declared in the block making up a Falcon program are purely local to that program. Public variables and functions are visible to the linker. With care, this permits linkage with variables that are declared in external code with which Falcon programs are linked. Notably, this allows linkage with variables declared in support libraries written in languages like C.

Blocks

        <block> ::= { <statement> [ ";" ] } { <block element> }

        <block element> ::= "code" { <statement> [ ";" ] }
                         |  [ "private" ] "const" { <constant declaration> [ ";" ] }
                         |  [ "private" ] "type" { <type declaration> [ ";" ] }
                         |  [ "private" ] "exception" { <exception declaration> [ ";" ] }
                         |  [ "private" ] "var" { <variable declaration> [ ";" ] }
                         |  [ "private" ] "procedure" <procedure declaration> [ ";" ]
                         |  [ "private" ] "function" <function declaration> [ ";" ]

A block consists of a sequence of declarations and statements. The declarations are elaborated in the sequence they occur, and the statements serve to initialize the record associated with the block and perform other computations. Declarations introduce identifiers into the block on a strict definition-before-use basis, and statements are executed in their order of appearance. All identifiers declared in a particular block must be distinct.

All declarations in a block are public, that is, visible from outside the block, unless they appear in a group of declarations that have been marked as private.

Constant Declarations

        <constant declaration> ::= <identifier> [ "=" ] <expression>

A constant declaration binds the given identifier to the value of an expression. The expression must have a constant value, which is to say, all the operands must be constants and the operators must be computable at compile time. The type of the constant is determined from the expression.

Type Declarations

        <type declaration> ::= <identifier> [ "=" ] <type>

A type declaration binds an identifier to a type.

Exception Declarations

        <exception declaration> ::= <identifier>

An exception declaration binds an identifier to a new exception, distinct from all other exceptions used in the program.

Variable Declarations

        <variable declaration> ::= <identifier> [ ":" ] <type>

A variable declaration binds an identifier to a newly allocated chunk of memory of size sufficient to hold a value of the indicated type.

Procedure and Function Declarations

        <procedure declaration> ::= <identifier> [ <formal parameter list> ]
                                    <body>
        <function declaration> ::= <identifier> [ <formal parameter list> ] [ ":" ] <type>
                                    <body>

	<body> ::= "external" | ( <block> "end" )

        <formal parameter list> ::= "(" <parameter> { [","] <parameter> } ")"
                                 |  "[" <parameter> { [","] <parameter> } "]"
                                 |  "{" <parameter> { [","] <parameter> } "}"
        <parameter> ::= [ "ref" ] <variable declaration>

A procedure or function declaration binds an identifier to a constructor for a new block that, when constructed, inherits from the enclosing block. In effect, the procedure name identifies a new type, comparable to a record type, where the only permitted way to create an instance of that type is to call the procedure. The declarations in the parameter list, if present, are part of this block. When a procedure is called, the constructor is executed, allocating storage for the parameters and variables in the block and executing the associated code.

Parameters qualified by the keyword ref are passed by reference. In that case, the formal parameter will be a reference to the actual parameter when the call is made. Otherwise, the formal parameter will be a copy of the actual parameter.

A procedure call is a type of statement. Functions may be called only from within expressions. The type of the return value is given by the reference immediately following the parameter list, if present.

Falcon subsets may restrict this type reference to a simple identifier, eliminating the possibility of function types such as package.type.

A procedure with an external body refers to a procedure, typically written in some other language such as C, which can be called using the standard calling sequence of the host system and which some kind of linker can bind to the object code produced by the Falcon compiler. Not all C routines can be called from Falcon programs. Notably, routines such as scanf() and printf() that have variable numbers of parameters of uncertain types cannot be called.

Types

        <type> ::= <reference>
                |  <enumeration>
                |  <subrange>
                |  <pointer>
                |  <array>
                |  <record>

A type identifier must have been previously defined as a type and must be visible in the current context.

New types may be constructed using any of the following mechanisms:

Enumeration Types

        <enumeration> ::= "enum" "(" <identifier> { [","] <identifier> } ")"
                       |  "enum" "[" <identifier> { [","] <identifier> } "]"
                       |  "enum" "{" <identifier> { [","] <identifier> } "}"

Each enumeration type is a new scalar type and is not interchangeable with any preexisting type. The identifiers listed in the creation of an enumeration type are the constants of that type. An ordering relationship is established over the enumeration such that the first constant is less than the second, which is less than the third, and so on.

Subrange Types

        <subrange> ::= <expression> ".." <expression>

Subrange types may be defined over any scalar type. A subrange establishes upper and lower bounds on values of that subrange type. The two expressions giving the bounds of the subrange must have constant scalar values, where the values are defined over the same base type, and the first value must be less than or equal to the second. A value is in a particular subrange if it is greater than or equal to the first or lower bound and less than or equal to the second or upper bound.

All practical integer types are subranges of the abstract integer type, which has an infinite range of values. The predefined type char can be considered to be a subrange defined over the alphabet, from "\NUL\" to "\DEL\" for ASCII, or from "\NUL\" to "\NUL\"+255 for UTF-8. Each enumerated type introduces a new base type.

Subranges declared with the upper and lower bound equal are, for practical purposes, new void types, in that an object of such a type occupies no memory.

Pointer Types

        <pointer> ::= '@' <type>

Pointer types give the memory address of an object of the indicated type.

Array Types

        <array> ::= "array" <type> "of" <type>

Array types require an index type, the type of the array index, and an element type, the type of each array element. The index type must be a finite scalar type. The range of values of the index type indicates the number of instances of the element that will be allocated when the array type is elaborated.

Record Types

        <record> ::= "record" <block> "end"

When a variable of a record type is allocated, declarations in the block associated with that record type are elaborated and the code of that block is executed to initialize the record. The persistence of the resulting record depends on where the record was allocated. Note that record types may include procedural components; these may be viewed as methods when applied to instances of that type, so such instances are objects, as the term is used in object-oriented programming.

A record type with an empty block, for example, record end, is a void type. Instances of void records occupy no memory and have no useful value.

Statements

        <statement> ::= <if>
                     |  <case>
                     |  <loop>
                     |  <handler>
                     |  <raise>
                     |  <procedure call>
                     |  <assignment>

Statements direct the computation. Conditional, selection and iteration statements determine which computations are performed, while procedure calls direct the execution of the block of code associated with a procedure and assignment statements direct changes to the values of variables.

If Statements

        <if> ::= "if" <expression> [ "then" ]
                     <block>
               [ "else"
                     <block> ]
                 "end"

If statements begin by evaluating the controlling expression. This expression must have a Boolean value. If the expression evaluates to true, the block in the then clause is executed. If the expression evaluates to false, the block in the else clause is executed, if present. Execution of the controlled blocks proceeds by first allocating storage for any variables declared in the block, and then elaborating the declarations and executing the statements of the block. Both subsidiary blocks inherit from the enclosing block.

Case Statements

        <case> ::= "select" <expression> [ "from" ]
               { "case" <case label> { [ "," ] <case label> } [ ":" ]
                     <block> }
               [ "else"
                     <block> ]
                 "end"

        <case label> ::= <expression> [ ".." <expression> ]

Select-case statements begin by evaluating the selection expression. This expression must have a scalar value. If the value of the expression matches one of the case labels, the labeled block is executed. If no case label matches and if there is an else clause, the block in the else clause is executed. The subsidiary blocks operate in exactly the same way as the subsidiary blocks in an if statement.

The expression or expressions making up a case label must be constant values and their types must be compatible with the type of the controlling expression. Furthermore, no two case labels under one select clause may have the same value, nor may the ranges overlap.
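The rule that case labels may neither repeat nor overlap can be checked mechanically once the constant label values are known. The Python sketch below is an illustration under the assumption that each label has already been reduced to an inclusive pair of integers, with a single value v written as (v, v); the name `labels_overlap` is hypothetical.

```python
def labels_overlap(labels):
    """Return True if any two inclusive (low, high) ranges share a value."""
    ordered = sorted(labels)
    # after sorting by lower bound, any overlap shows up between neighbours
    for (low1, high1), (low2, high2) in zip(ordered, ordered[1:]):
        if low2 <= high1:
            return True
    return False
```

A compiler following this approach would reject a select statement whenever the check returns True for the labels gathered under it.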

Loop Statements

        <loop> ::= <while loop>
                |  <until loop>
                |  <for loop>

        <while loop> ::= "while" <expression> [ "do" ]
                             <block>
                         "end"

        <until loop> ::= "do"
                             <block>
                         "until" <expression>

        <for loop> ::= "for" <identifier> "in" <type> [ "do" ]
                           <block>
                       "end"

The expressions controlling while and until loops must have Boolean values. The while loop tests the value of the expression before each iteration of the block making up the loop body, and terminates as soon as this expression is false. The until loop evaluates the expression after each iteration of the loop body and terminates the loop as soon as this expression is true.

For loops declare a variable named by an identifier as being of a particular finite scalar type and then iterate the block making up the loop body over every value of that type. The order of the enumeration is indeterminate.

As with if and select-case statements, the block making up the loop body inherits from the enclosing block. In the case of for loops, the implicit declaration of the loop control variable may be considered to be in a block that inherits from the enclosing block and that is extended by the loop body. The declarations elaborated during any particular iteration of the block have an effect limited to that iteration. Any communication from one iteration to the next must, therefore, be through variables declared in the block(s) that enclose the loop.

Exception Handlers

        <handler> ::= "try"
                          <block>
                    { "handle" <reference> { [ "," ] <reference> } [ ":" ]
                          <block> }
                    [ "else"
                          <block> ]
                      "end"

The try block is attempted, and if it executes to completion without raising an exception, the handle and else clauses are ignored. If an unhandled exception is raised within the try block, or within any code called from within the try block, execution of that code is abandoned. The references in the handle clauses must name distinct exceptions. If the exception raised in the try block is named by a reference on one of the handle clauses, the block of that clause is executed as an exception handler before control exits the try statement.

The else clause is equivalent to a handle block carrying the names of all other exceptions (even if those names are invisible in this context). If there is no handler in this try statement for the exception that was raised, execution of the try statement itself is also abandoned.

Raising Exceptions

        <raise> ::= "raise" <reference>

The reference in the raise statement must name an exception that is visible in the current block. The raise statement causes execution of the current block to be abandoned, with a transfer of control to the current handler for the named exception.

Falcon subsets may restrict the reference to simple identifiers, eliminating the possibility of raising package.exception.

Procedure Calls

        <procedure call> ::= <reference> [ <actuals> ]

        <actuals> ::= "(" <expression> { "," <expression> } ")"
                   |  "[" <expression> { "," <expression> } "]"
                   |  "{" <expression> { "," <expression> } "}"

The reference given in a procedure call must refer to a declared procedure. Storage for the block associated with that procedure is allocated, the declarations in that block are elaborated, the code in that block is executed, and then the associated storage is deallocated.

If and only if the procedure declaration indicates that it expects parameters, the actual parameter list must be present and the types of the actual parameters must be compatible with the types of the formal parameters specified in the procedure declaration.

Assignment Statements

        <assignment> ::= <reference> "=" <expression>

The reference given on the left-hand side of an assignment statement must identify a variable. The expression on the right-hand side of the assignment statement must have a type that is compatible with the type of that variable.

Expressions

        <expression> ::= <comparand> [ <comparing operator> <comparand> ]
        <comparand> ::= <term> { <adding operator> <term> }
        <term> ::= <factor> { <multiplying operator> <factor> }

Multiplying operators take precedence over adding operators, which take precedence over comparison operators. Aside from those three levels of precedence, operators are evaluated from left to right.
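The three precedence levels map directly onto a recursive-descent evaluator with one function per grammar rule. The Python sketch below is only an illustration: it works on a list of token strings that has already been split, restricts factors to integer literals with an optional unary minus, and leaves out division and the logical operators, whose details are not needed to show precedence. The name `evaluate` is hypothetical.

```python
import operator

COMPARING = {"=": operator.eq, "<>": operator.ne, ">": operator.gt,
             ">=": operator.ge, "<": operator.lt, "<=": operator.le}
ADDING = {"+": operator.add, "-": operator.sub}
MULTIPLYING = {"*": operator.mul}

def evaluate(tokens):
    """Evaluate a list of token strings using the three precedence levels."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        return token

    def factor():                 # [ "-" ] <number>
        negate = peek() == "-"
        if negate:
            advance()
        value = int(advance())
        return -value if negate else value

    def term():                   # <factor> { <multiplying operator> <factor> }
        value = factor()
        while peek() in MULTIPLYING:
            value = MULTIPLYING[advance()](value, factor())
        return value

    def comparand():              # <term> { <adding operator> <term> }
        value = term()
        while peek() in ADDING:
            value = ADDING[advance()](value, term())
        return value

    def expression():             # <comparand> [ <comparing operator> <comparand> ]
        value = comparand()
        if peek() in COMPARING:
            value = COMPARING[advance()](value, comparand())
        return value

    result = expression()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Because each loop consumes operators of one level from left to right, `evaluate(["10", "-", "4", "-", "3"])` groups as (10-4)-3, and `evaluate(["1", "+", "2", "*", "3"])` multiplies before adding.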

In every context where an expression or sub-expression is required, the context may constrain the expression to be of a particular type, for example, boolean, or it may be constrained to be scalar. In any event, the expression itself returns a value of a particular type determined by its operands.

Comparing Operators

        <comparing operator> ::= "="  -- equality
                              |  "<>" -- inequality
                              |  ">"  -- greater
                              |  ">=" -- greater or equal
                              |  "<"  -- less
                              |  "<=" -- less or equal

The comparing operators always give a Boolean result, and the operands must be comparable.

Any two expressions of the same type are comparable for equality and inequality. Pointers are equal if and only if they refer to the same object. Pointers to variables of distinctly different types may not be compared. Comparing two instances of the same record type for equality is equivalent to the logical and of an equality test on each of the fields of the record. Comparing two instances of the same array type operates similarly.

Two scalar types are comparable if they have the same base type, for example, integer or a particular enumeration. If the subranges of the two operands do not overlap, the result of a comparison is a foregone conclusion and the value of the expression is considered to be constant.

Adding Operators

        <adding operator> ::= "+"  -- addition
                           |  "-"  -- subtraction
                           |  "\/" -- logical or

Integers may be added to scalars to produce a result of a comparable scalar type (note that integers are themselves scalars). Thus, for example, 1+1=2 and "a"+1="b". When adding to a subrange type, the bounds on the subrange of the result are the sums of the bounds on the subranges of the operands. If the result is outside the bounds of the base type of the scalar, a range exception is raised.

Two operands of comparable scalar types may be subtracted to produce an integer result, and an integer may be subtracted from a scalar to produce a result of the same scalar type. Thus, for example, "b"-"a"=1 and "b"-1="a". The bounds on the subrange of the result are determined from the bounds on the operands. If the result is outside the bounds of the base type, a range exception is raised.
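The bounds computations for addition and subtraction can be sketched with ordinary interval arithmetic. In this Python illustration, a subrange is an inclusive (low, high) pair of integers and the function names are hypothetical. The addition rule is the one stated above; the subtraction rule shown is an assumption, namely the natural reading in which the smallest result is the lowest left value minus the highest right value.

```python
def add_bounds(left, right):
    """Bounds of left + right: the sums of the corresponding bounds."""
    (a, b), (c, d) = left, right
    return (a + c, b + d)

def subtract_bounds(left, right):
    """Bounds of left - right (assumed rule, see the text above)."""
    (a, b), (c, d) = left, right
    # smallest result: lowest left minus highest right, and vice versa
    return (a - d, b - c)
```

For instance, adding the constant 4, of type 4..4, to a value of type 1..10 gives a result of type 5..14.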

The logical or operator applies to boolean operands. In this case, if either operand is true, the other operand may or may not be evaluated.

Multiplying Operators

        <multiplying operator> ::= "*"  -- multiplication
                                |  "/"  -- division
                                |  "/\" -- logical and

Integers may be multiplied to produce an integer product. The subrange of the result depends on the subranges of the operands. If the result is outside the bounds of the base integer type, a range exception is raised.

Integers may be divided to produce an integer quotient. The subrange of the result depends on the subranges of the operands. If the result is outside the bounds of the base integer type, a range exception is raised.

The logical and operator applies to boolean operands. In this case, if either operand is false, the other operand may or may not be evaluated.

Factors

        <factor> ::= [ "-" | "~" ] ( <number> | <string> | <constructor> | <reference> )

The unary negation operator, if present, applies only to numeric terms and has the usual meaning. The range of the result of applying the negation operator to an integer is computed simply by negating and exchanging the bounds on the original range. Note that, for n-bit two's complement binary numbers in the range -2^(n-1) to 2^(n-1)-1, the result will be in the range -(2^(n-1)-1) to 2^(n-1); two's complement representations in this range require n+1 bits, so the assignment i=-i, for example, may produce a value that is out of bounds.
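
The effect of negation on an n-bit range can be checked with a small Python sketch (not Falcon code). Subranges are modeled as inclusive (low, high) tuples, and the function names are hypothetical.

```python
def negate_bounds(r):
    """Bounds of -x for x in r: negate both bounds and exchange them."""
    return (-r[1], -r[0])


def twos_complement_range(n):
    """Range of values representable in n-bit two's complement."""
    return (-(2 ** (n - 1)), 2 ** (n - 1) - 1)


def fits(inner, outer):
    """True if every value in inner also lies within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]
```

For an 8-bit operand, the range -128..127 negates to -127..128, which no longer fits in 8 bits but does fit in 9.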

The unary "~" is the logical not operator when applied to Boolean terms. Extensions allowing application of "~" as a one's complement operator are contemplated, but the details remain to be worked out.

The type of a numeric factor is a subrange of the base integer type with bounds equal to the value. So, for example, the constant 4 is of type 4..4.

Character strings are constant arrays of characters with an integer index type with bounds from zero to one less than the size of the string. So, for example, the constant "Hello" is of type array 0..4 of char. Note that an array of one element may be used in any context where an object of the element's type is expected, so "a" may be used as a character constant as well as being an array of one character.
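
The types assigned to numeric and string constants can be summarized in a short Python sketch (not Falcon code). Subranges are modeled as inclusive (low, high) tuples, the function names are hypothetical, and the string case follows the "Hello" example, where a five-character string is indexed 0..4. The sketch assumes a non-empty string.

```python
def number_type(value):
    """Subrange type of a numeric constant: both bounds equal the value."""
    return (value, value)


def string_index_type(s):
    """Index subrange of a non-empty string constant, starting at zero."""
    return (0, len(s) - 1)
```

So the constant 4 has type 4..4, "Hello" has index type 0..4, and the one-character string "a" has index type 0..0.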

When a factor consists of a reference, where that reference names a variable or constant, the value of that variable or constant is taken. Where the reference is to a function, the function is called and the return value is taken.

Constructors

        <constructor> ::= "(" <expression> { [ "," ] <expression> } ")"
                       |  "[" <expression> { [ "," ] <expression> } "]"
                       |  "{" <expression> { [ "," ] <expression> } "}"

A constructor found in a context where a scalar value is expected permits only a single expression. That expression must deliver a scalar value.

A constructor following a reference to a procedure or function will construct the actual parameter list for a call to that procedure or function. In that case, the number of expressions found in the constructor must match the number of formal parameters in the procedure or function declaration, and each of the successive actual parameters must be of a type compatible with the declared type of the corresponding formal parameter. In the case of parameters passed by value, the actual parameter must be assignment compatible with the formal parameter. In the case of parameters passed by reference, the actual parameter must be of the same type as the formal parameter, or of a subtype, as follows: Where the parameter is of an array type, the bounds of the actual parameter array may be a subrange of the bounds declared for the formal parameter, so long as the array elements are of identical type.
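
The subtype rule for array parameters passed by reference can be sketched as a simple check. This is a Python illustration, not Falcon code. An array type is modeled as a pair of inclusive index bounds and an element type name; that representation and the function name are assumptions made for the example.

```python
def ref_array_compatible(actual, formal):
    """True if an actual array may be passed by reference to a formal array.

    Each type is ((low, high), element_type_name). The element types must
    be identical, and the actual index bounds must lie within the formal
    index bounds.
    """
    (actual_low, actual_high), actual_elem = actual
    (formal_low, formal_high), formal_elem = formal
    return (actual_elem == formal_elem
            and formal_low <= actual_low
            and actual_high <= formal_high)
```

For instance, a string constant of type array 0..4 of char satisfies this check against a formal parameter declared as ref s:array uint16 of char, since 0..4 lies within 0..65535 and the element types match.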

References

        <reference> ::= <identifier>
                     |  <reference> "@"
                     |  <reference> "." <identifier>
                     |  <reference> <constructor>

References name items declared in some block. All references begin with an identifier that must be bound in the current block or in a block from which the current block inherits. The reference, at this point, refers to whatever that identifier is bound to.

A reference followed by an at-sign, such as p@, is only legal in contexts where the initial reference p names a variable of a pointer type and the value of that variable is not null. In that case, the resulting reference refers to the memory location referenced by that pointer, and the type of the result is the type of the object stored in that location.

A reference followed by a dot, such as v.f, is legal in contexts where the initial reference v refers to a variable of a record type. In this case, the identifier f must be declared in that type and may refer to any public declaration in that record.

A reference followed by a dot, such as t.f, is legal in contexts where the initial reference t refers to a record type (including the activation record types of procedures and functions), and f refers to a public type (including activation record types), constant or exception declared in that record. Note that if v is a variable of type t, references to v.f and t.f are equivalent when referencing constants and exceptions. Record types, however, cannot be instantiated except in the context of an instance, since the initializer code of a record may access variables local to the enclosing context.

A reference followed by a constructor may have several meanings. If the reference names a variable of type array, the constructor will be asked to deliver a single value of the index type of that array, and that value will be used to select an element of the array.

Where a reference to a procedure or function is followed by a constructor, the constructor will be asked to deliver the actual parameters for a call to that procedure or function. For procedures, the value resulting from the reference is void. For functions, the value is the return value of the function.

Where a reference to an array type is followed by a constructor, the constructor will be asked to deliver an object of that array type, with each of the array elements initialized from successive expressions in the constructor.

Where a reference to a record type is followed by a constructor, an instance of that record type is allocated, its initializer is run, and then the public fields are assigned values from the successive expressions in the constructor.

The Standard Prologue

The following Falcon code produces definitions that can be considered to be a standard prologue for every Falcon program.

        exception range;

        type boolean = enum( false, true );
        type char = "\NUL\".."\NUL\"+255;
        type int8 = -128 .. 127;
        type uint8 = 0 .. 255;
        type int16 = -32768 .. 32767;
        type uint16 = 0 .. 65535;
        type int32 = -16#80000000 .. 16#7FFFFFFF;
        type uint32 = 0 .. 16#FFFFFFFF;

        type file = @ record end;
        var input: file;
            output: file;
            errors: file;
        procedure putchar( c:char, f:file ) external;
        function  getchar( f:file ):char external;
        function  eof( f:file ):boolean external;
        procedure putstr( ref s:array uint16 of char, f:file ) external;
        procedure getstr( ref s:array uint16 of char, f:file ) external;