MoJo Semantics

Definitions#

A MoJo program specifies a computation that acts on a sequence of digital components called locations. A variable is a set of locations that represents a mathematical value according to a convention determined by the variable’s type. If a value can be represented by some variable of type T, then we say that the value is a member of T and T contains the value.

An identifier is a symbol declared as a name for a variable, type, procedure, etc. The region of the program over which a declaration applies is called the scope of the declaration. Scopes can be nested. The meaning of an identifier is determined by the smallest enclosing scope in which the identifier is declared.

An expression specifies a computation that produces a value or variable. Expressions that produce variables are called designators. A designator can denote either a variable or the value of that variable, depending on the context. Some designators are readonly, which means that they cannot be used in contexts that might change the value of the variable. A designator that is not readonly is called writable. Expressions whose values can be determined statically are called constant expressions; they are never designators.

A static error is an error that the implementation must detect before program execution. Violations of the language definition are static errors unless they are explicitly classified as runtime errors.

A checked runtime error is an error that the implementation must detect and report at runtime. The method for reporting such errors is implementation-dependent. (If the implementation maps them into exceptions, then a program could handle these exceptions and continue.)

Types#

MoJo is designed to support structural equivalence of types, instead of name equivalence. Two types are the same if their definitions become the same when expanded; that is, when all constant expressions are replaced by their values and all type names are replaced by their definitions. In the case of recursive types, the expansion is the infinite limit of the partial expansions. A type expression is generally allowed wherever a type is required.

Every expression has a statically-determined type, which contains every value that the expression can produce. The type of a designator is the type of the variable it produces.

Assignability and type compatibility are defined in terms of a single syntactically specified subtype relation with the property that if T is a subtype of U, then every member of T is a member of U. The subtype relation is reflexive and transitive.

Every expression has a unique type, but a value can be a member of many types. For example, the value nil is a member of both Refany and Root. It would be ambiguous to talk about “the type of a value”. Thus the phrase “type of x” means “type of the expression x”, while “x is a member of T” means “the value of x is a member of T”.

However, there is one sense in which a value can be said to have a type: every object or traced reference value includes a code for a type, called the allocated type of the reference value.

The type constructors and subtyping rules are discussed below.

Ordinal Types#

There are two ordinal types: bool and int.

Integers and Booleans are collectively called ordinal values. The base type of an ordinal value v is int if v is an integer, otherwise it is the unique type that contains v, which is bool for Boolean values.

The operators First, Last, and Number applied to an ordinal type return the first element, last element, and number of elements, respectively.

Here are the predeclared ordinal types:

int: All integers naturally represented by the implementation.
bool: The enumeration {false, true}.

Array Types#

An array is an indexed collection of component variables, called the elements of the array. The indexes are the values of an ordinal type, called the index type of the array. The elements all have the same size and the same type, called the element type of the array.

There are two kinds of array types, fixed and open. The length of a fixed array is determined at compile time. The length of an open array type is determined at runtime, when it is allocated or bound. The length cannot be changed thereafter.

The shape of a multi-dimensional array is the sequence of its lengths in each dimension. More precisely, the shape of an array is its length followed by the shape of any of its elements; the shape of a non-array is the empty sequence.

Arrays are assignable if they have the same element type and shape. If either the source or target of the assignment is an open array, a runtime shape check is required.

A fixed array type declaration has the form:

type T = [N] Element

where N is a constant integer expression and Element is any type other than an open array type. The values of type T are arrays whose element type is Element and whose length is the value of N.

If a has type T, then a[i] designates the element of a whose position corresponds to the position of i in the subrange [0..N-1], where N is the length of the array.

An open array type declaration has the form:

type T = [] Element

where Element is any type. The values of T are arrays whose element type is Element and whose length is arbitrary. The index type of an open array is the int subrange [0..n-1], where n is the length of the array.

An open array type can be used only as the type of a formal parameter, the referent of a reference type, the element type of another open array type, or as the type in an array constructor.

Examples of array types:

type Transform = [3] int;
     Vector    = [] int;

Record Types#

A record is a sequence of named variables, called the fields of the record. Different fields can have different types. The name and type of each field is statically determined by the record’s type. The expression r.f designates the field named f in the record r.

A record type declaration has the form:

type T = ( FieldList )

where FieldList is a list of field declarations, each of which has the form:

fieldName: Type

where fieldName is an identifier and Type is any non-empty type other than an open array type. The field names must be distinct. A record is a member of T if it has fields with the given names and types, in the given order, and no other fields. Empty records are allowed.

When a series of fields shares the same type, any fieldName can be a list of identifiers separated by commas. Such a list is shorthand for a list in which the type is repeated for each identifier. That is:

f_1, ..., f_m: Type

is shorthand for:

f_1: Type; ...; f_m: Type

This shorthand is eliminated from the expanded definition of the type.

Reference Types#

A reference value is either nil or the address of a variable, called the referent.

A reference type is traced. When all traced references to a piece of allocated storage are gone, the implementation reclaims the storage. A general type is traced if it is a traced reference type, a record type any of whose field types is traced, or an array type whose element type is traced.

A declaration for a traced reference type has the form:

type T = ^Type

where Type is any type. The values of T are traced references to variables of type Type, which is called the referent type of T.

The following reference types are predeclared:

Refany: Contains all traced references.
Null: Contains only nil.

Procedure Types#

A procedure is either nil or a triple consisting of:

the body, which is a block,
the signature, which specifies the procedure’s formal arguments. and result type
the environment, which is the scope with respect to which variable names in the body will be interpreted.

A procedure that returns a result is called a function procedure; a procedure that does not return a result is called a proper procedure. A top-level procedure is a procedure declared in the outermost scope of a module. Any other procedure is a local procedure. A local procedure can be passed as a parameter but not assigned, since in a stack implementation a local procedure becomes invalid when the frame for the procedure containing it is popped.

A procedure constant is an identifier declared as a procedure. (As opposed to a procedure variable, which is a variable declared with a procedure type.)

A procedure signature specification has the form:

(formal_1; ...; formal_n): R

where

Each formal_i is a formal parameter declaration, as described below.
R is the result type, which can be any type but an open array type. The : R can be omitted, making the signature that of a proper procedure.

A formal parameter declaration has the form

Mode Name: Type

where

Mode is a parameter mode, which can be val, var, or readonly. If Mode is omitted, it defaults to val.
Name is an identifier that names the parameter. The parameternames must be distinct.
Type is the type of the parameter.

When a series of parameters share the same mode, and type, name_i can be a list of identifiers separated by commas. Such a list is shorthand for a list in which the mode, and type, are repeated for each identifier. That is:

Mode v_1, ..., v_n: Type

is shorthand for:

Mode v_1: Type; ...; Mode v_n: Type

This shorthand is eliminated from the expanded definition of the type.

A procedure value P is a member of the type T if it is nil or its signature is covered by the signature of T, where signature_1 covers signature_2 if:

They have the same number of parameters, and corresponding parameters have the same type and mode.
They have the same result type, or neither has a result type.

The parameter names and defaults affect the type of a procedure, but not its value.

Object Types#

An object is either nil or a reference to a data record paired with a method suite, which is a record of procedures that will accept the object as a first argument.

An object type determines the types of a prefix of the fields of the data record, as if object were ^(). But in the case of an object type, the data record can contain additional fields introduced by subtypes of the object type. Similarly, the object type determines a prefix of the method suite, but the suite can contain additional methods introduced by subtypes.

If o is an object, then o.f designates the data field named f in o’s data record. If m is one of o’s methods, an invocation of the form o.m( ... ) denotes an execution of o’s m method. An object’s methods can be invoked, but not read or written.

If T is an object type and m is the name of one of T’s methods, then T.m denotes T’s m method. This notation makes it convenient for a subtype method to invoke the corresponding method of one of its supertypes.

A field or method in a subtype masks any field or method with the same name in the supertype.

Object assignment is reference assignment. Objects cannot be dereferenced, since the static type of an object variable does not determine the type of its data record. To copy the data record of one object into another, the fields must be assigned individually.

There is one predeclared object type:

Root: The traced object type with no fields or methods

The declaration of an object type has the form:

type T = ST object { Members }

where ST is an optional supertype, Members is a list of field and method declarations, and method overrides. The fields of T consist of the fields of ST followed by the fields declared in Members. The methods of T consist of the methods of ST modified by method overrides in Members, and followed by the methods declared in Members. The names introduced in Members must be distinct from one another. If ST is omitted, it defaults to Root.

A method declaration has the form:

m sig := proc

where m is an identifier, sig is a procedure signature, and proc is a top-level procedure constant. It specifies that T’s m method has signature sig and value proc. If := proc is omitted, := nil is assumed. If proc is non-nil, its first parameter must have mode val and type some supertype of T, and dropping its first parameter must result in a signature that is covered by sig.

A method override has the form:

m := proc

where m is the name of a method of the supertype ST and proc is a top-level procedure constant. It specifies that the m method for T is proc, rather than ST.m. If proc is non-nil, its first parameter must have mode val and type some supertype of T, and dropping its first parameter must result in a signature that is covered by the signature of ST’s m method.

Examples#

Consider the following declarations:

type 
  A  = object { a: int; p() };
  AB = A object { b: int };

def Pa(self: A) { ... }
def Pab(self: AB) { ... }

The procedures Pa and Pab are candidate values for the p methods of objects of types A and AB. For example:

type T1 = AB object { p := Pab };

declares a type with an AB data record and a p method that expects an AB. T1 is a valid subtype of AB`. Similarly,

type T2 = A object { p := Pa };

declares a type with an A data record and a method that expects an A. T2 is a valid subtype of A. A more interesting example is:

type T3 = AB object { p := Pa };

which declares a type with an AB data record and a p method that expects an A. Since every AB is an A, the method is not too choosy for the objects in which it will be placed. T3 is a valid subtype of AB. In contrast,

type T4 = A object { p := Pab };

attempts to declare a type with an A data record and a method that expects an AB; since not every A is an AB, the method is too choosy for the objects in which it would be placed. The declaration of T4 is a static error.

The following example illustrates the difference between declaring a new method and overriding an existing method. After the declarations

type
  A = object { m() := P };
  B = A object { m := Q };
  C = A object { m() := Q };

var a := New(A); b := New(B); c := New(c);

we have that

a.m() activates P(a)
b.m() activates Q(b)
c.m() activates Q(c)

So far there is no difference between overriding and extending. But c’s method suite has two methods, while b’s has only one, as can be revealed if b and c are viewed as members of type A:

a := b;
a.m(); /* activates Q(b) */
a := c;
a.m(); /* activates P(c) */

The last example uses object subtyping to define reusable queues. First the interface:

type
  Queue = ( head, tail: QueueElem );
  QueueElem = object { link: QueueElem };

def Insert (var q: Queue; x: QueueElem);
def Delete (var q: Queue): QueueElem;
def Clear  (var q: Queue);

Then an example client:

type
  IntQueueElem = QueueElem object { val: int };
var
  q: Queue;
  x: IntQueueElem;
{
  ...
  Clear(q);
  x := New(IntQueueElem);
  Insert(q, x);
  ...
  x := Delete(q)
}

Passing x to Insert is safe, since every IntQueueElem is a QueueElem. Assigning the result of Delete to x cannot be guaranteed valid at compile-time, since other subtypes of QueueElem can be inserted into q, but the assignment will produce a checked runtime error if the source value is not a member of the target type.

Subtyping Rules#

We write T <: U to indicate that T is a subtype of U and U is a supertype of T.

If T <: U, then every value of type T is also a value of type U. The converse does not hold. This section presents the rules that define the subtyping relation.

For array types,

\[\texttt{[]}^m \texttt{[}J_1\texttt{]} \ldots \texttt{[}J_n\texttt{] [}K_1\texttt{]}\ldots \texttt{[}K_p\texttt{] T}\] \[<: \texttt{[]}^m \texttt{[]}^n \texttt{[}I_1\texttt{]} \ldots \texttt{[}I_p\texttt{] T}\]

if $I_i = K_i$ for $i = 1, \ldots, p$.

That is, an array type A is a subtype of an array type B if they have the same ultimate element type, the same number of dimensions, and, for each dimension, either both are open (as in the first $m$ dimensions above), or A is fixed and B is open (as in the next $n$ dimensions above), or they are both fixed and have the same size (as in the last $p$ dimensions above).

Null <: ^T <: Refany

That is, Refany contains all traced references, and nil is a member of every reference type.

Null <: (A): R for any A and R

That is, nil is a member of every procedure type;

(A): Q <: (B): R
if signature "(B): R" covers signature "(A): Q".

That is, for procedure types, T <: U if they are the same except for parameter names.

Root <: Refany
Null <: T object { ... } <: T

That is, every object is a reference, nil is a member of every object type, and every subtype is included in its supertype.

T <: T  for all T
T <: U  and   U <: V  implies  T <: V  for all T, U, V.

That is, <: is reflexive and transitive.

Note that T <: U and U <: T does not imply that T and U are the same.

Predeclared Opaque Types#

The language predeclares the type:

Text <: Refany

Expressions#

Designators#

An identifier is a writable designator if it is declared as a variable, is a var or val parameter. An identifier is a readonly designator if it is a readonly parameter, or a local of a for statement.

The only operations that produce designators are dereferencing, subscripting, and selection. This section defines these operations and specifies the conditions under which they produce designators.

r^ denotes the the referent r; this operation is called dereferencing.
The expression r^ is always a writable designator.
It is a static error if the type of r is Refany, Null, or an object type, and a checked runtime error if r is nil.
The type of r^ is the referent type of r.
a[i] denotes the (i + 1 - First(a))-th element of the array a.
The expression a[i] is a designator if a is, and is writable if a is.
The expression i must be assignable to the index type of a.
The type of a[i] is the element type of a.
If a is a reference to an array, then a[i] is shorthand for a^[i].
r.f, o.f, I.x, T.m
- If r denotes a record, r.f denotes its f field. In this case r.f is a designator if r is, and is writable if r is.
  The type of r.f is the declared type of the field.
  If r is a reference to a record, then r.f is shorthand for r^.f.
- If o denotes an object and f names a data field specified in the type of o, then o.f denotes that data field of o. In this case o.f is a writable designator whose type is the declared type of the field.
- If T is an object type and m is the name of one of T’s methods, then T.m denotes the m method of type T. In this case T.m is not a designator. Its type is the procedure type whose first argument has mode val and type T, and whose remaining arguments are determined by the method declaration for m in T. The name of the first argument is unspecified. T.m is a procedure constant.

Numeric literals#

Numeric literals denote constant non-negative integers. The types of these literals are int.

A literal int has the form base_digits, where base is one of 2, 3, …, 16, and digits is a non-empty sequence of the decimal digits 0 through 9 plus the hexadecimal digits A through F. The base_ can be omitted, in which case base defaults to 10. The digits are interpreted in the given base. Each digit must be less than base. For example, 16_FF and 255 are equivalent integer literals.

If no explicit base is present, the value of the literal must be at most Last(int). If an explicit base is present, the value of the literal must be less than $2^n$ where $n$ is the word size of the target machine. For example, on a sixteen-bit two’s complement machine, 16_FFFF and -1 represent the same value.

Case is not significant in digits, prefixes or scale factors. Embedded spaces are not allowed.

Text and character literals#

A character literal is a pair of single quotes enclosing either a single printing character (excluding single quote) or an escape sequence. The type of a character literal is int.

A text literal is a pair of double quotes enclosing a sequence of printing characters (excluding double quote) and escape sequences. The type of a text literal is Text.

Nil#

The literal nil denotes the value nil. Its type is Null.

Function application#

A procedure call is an expression if the procedure returns a result. The type of the expression is the result type of the procedure.

New#

An allocation operation has the form:

New(T, ...)

where T is a reference type other than Refany or Null. The operation returns the address of a newly-allocated variable of T’s referent type; or if T is an object type, a newly-allocated data record paired with a method suite. The reference returned by New is distinct from all existing references. The allocated type of the new reference is T.

It is a static error if T’s referent type is empty.

The initial state of the referent generally represents an arbitrary value of its type.

If T is a reference to an array with k open dimensions, the New operation has the form:

New(T, n_1, ..., n_k)

where the n’s are integer-valued expressions that specify the lengths of the new array in its first k dimensions. The values in the array will be arbitrary values of their type.

Arithmetic operations#

The basic arithmetic operations are built into the language.

Implementations must not rearrange the computation of expressions in a way that could affect the result. For example, (x+y)+z generally

cannot be computed as x+(y+z), since addition is not associative for bounded integers.

prefix    +  (x: int)    : int
 infix    +  (x,y: int)  : int

As a prefix operator, +x returns x. As an infix operator on numeric arguments, + denotes addition.

prefix    -  (x: int)    : int
 infix    -  (x,y: int)  : int

As a prefix operator, -x is the negative of x. As an infix operator on numeric arguments, - denotes subtraction.

 infix    *  (x,y: int)  : int

On numeric arguments, * denotes multiplication.

 infix    /  (x,y: int)    : int

On numeric arguments, / is the floor of the quotient of x and y; that is, the maximum integer not excedding the real number z such that z * y = x.

Relations#

infix    =  (x, y: Any): bool
infix    != (x, y: Any): bool

The operator = returns true if x and y are equal. The operator != returns true if x and y are not equal. It is a static error if the type of x is not assignable to the type of y or vice versa.

Ordinals are equal if they have the same value. References are equal if they address the same location. Procedures are equal if they agree as closures; that is, if they refer to the same procedure body and environment.

infix    <= (x,y: Ordinal) : bool
infix    >= (x,y: Ordinal) : bool

The operator <= returns true if x is at most as large as y. It is a static error if the type of x is not assignable to the type of y, or vice versa. The expression x >= y is equivalent to y <= x.

infix    >  (x,y: Ordinal) : bool
infix    <  (x,y: Ordinal) : bool

In all cases, x < y means (x <= y) && (x != y), and x > y means y < x. It is a static error if the type of x is not assignable to the type of y, or vice versa.

Boolean operations#

prefix   !  (p: bool)   : bool 
infix    && (p,q: bool) : bool 
infix    || (p,q: bool) : bool

!p is the complement of p.

p && q is true if both p and q are true. If p is false, q is not evaluated.

p || q is true if at least one of p and q is true. If p is true, q is not evaluated.

Type Operations#

Number (T: OrdinalType)    : int
       (A: FixedArrayType) : int
       (a: Array)          : int

For an ordinal type T, Number(T) returns the number of elements in T. For a fixed array type A, Number(A) is the length of A. Similarly, for an array a, Number(a) is the length of the type of a if a has fixed array type, or the allocated length of a if a has open array type. The expression a will be evaluated only if it denotes an open array.

First (T: OrdinalType)    : T
      (A: FixedArrayType) : int
      (a: Array)          : int

Last  (T: OrdinalType)    : T
      (A: FixedArrayType) : int
      (a: Array)          : int

For a non-empty ordinal type T, First returns the smallest value of T and Last returns the largest value.

For a fixed array type A, First(A) is 0 and Last(A) is N-1 where N is Length(A). Similarly, for an array a, First(a) and Last(a) are 0 and N-1, respectively, where N is Length(a). The expression a will be evaluated only if it is an open array.

Constant Expressions#

Constant expressions are a subset of the general class of expressions, restricted by the requirement that it be possible to evaluate the expression statically. All operations are legal in constant expressions except for New and dereferencing (explicit or implicit)

A variable can appear in a constant expression only as an argument to First, Last, Number, and such a variable must not have an open array type. Literals and top-level procedure constants are legal in constant expressions.