Definitions#
A MoJo program specifies a computation that acts on a sequence of
digital components called locations. A variable is a set of
locations that represents a mathematical value according to a
convention determined by the variable’s type. If a value can be
represented by some variable of type T, then we say that the value
is a member of T and T contains the value.
An identifier is a symbol declared as a name for a variable, type, procedure, etc. The region of the program over which a declaration applies is called the scope of the declaration. Scopes can be nested. The meaning of an identifier is determined by the smallest enclosing scope in which the identifier is declared.
An expression specifies a computation that produces a value or variable. Expressions that produce variables are called designators. A designator can denote either a variable or the value of that variable, depending on the context. Some designators are readonly, which means that they cannot be used in contexts that might change the value of the variable. A designator that is not readonly is called writable. Expressions whose values can be determined statically are called constant expressions; they are never designators.
A static error is an error that the implementation must detect before program execution. Violations of the language definition are static errors unless they are explicitly classified as runtime errors.
A checked runtime error is an error that the implementation must detect and report at runtime. The method for reporting such errors is implementation-dependent. (If the implementation maps them into exceptions, then a program could handle these exceptions and continue.)
Types#
MoJo is designed to support structural equivalence of types, instead of name equivalence. Two types are the same if their definitions become the same when expanded; that is, when all constant expressions are replaced by their values and all type names are replaced by their definitions. In the case of recursive types, the expansion is the infinite limit of the partial expansions. A type expression is generally allowed wherever a type is required.
Every expression has a statically-determined type, which contains every value that the expression can produce. The type of a designator is the type of the variable it produces.
Assignability and type compatibility are defined in terms of a single
syntactically specified subtype relation with the property that if T
is a subtype of U, then every member of T is a member of U. The
subtype relation is reflexive and transitive.
Every expression has a unique type, but a value can be a member of
many types. For example, the value nil is a member of both Refany
and Root. It would be ambiguous to talk about “the type of a
value”. Thus the phrase “type of x” means “type of the expression
x”, while “x is a member of T” means “the value of x is a
member of T”.
However, there is one sense in which a value can be said to have a type: every object or traced reference value includes a code for a type, called the allocated type of the reference value.
The type constructors and subtyping rules are discussed below.
Ordinal Types#
There are two ordinal types: bool and int.
Integers and Booleans are collectively called ordinal values. The
base type of an ordinal value v is int if v is an integer,
otherwise it is the unique type that contains v, which is bool for
Boolean values.
The operators First, Last, and Number applied to an ordinal type return the first element, last element, and number of elements, respectively.
Here are the predeclared ordinal types:
-
int: All integers naturally represented by the implementation. -
bool: The enumeration {false, true}.
Array Types#
An array is an indexed collection of component variables, called the elements of the array. The indexes are the values of an ordinal type, called the index type of the array. The elements all have the same size and the same type, called the element type of the array.
There are two kinds of array types, fixed and open. The length of a fixed array is determined at compile time. The length of an open array type is determined at runtime, when it is allocated or bound. The length cannot be changed thereafter.
The shape of a multi-dimensional array is the sequence of its lengths in each dimension. More precisely, the shape of an array is its length followed by the shape of any of its elements; the shape of a non-array is the empty sequence.
Arrays are assignable if they have the same element type and shape. If either the source or target of the assignment is an open array, a runtime shape check is required.
A fixed array type declaration has the form:
type T = [N] Element
where N is a constant integer expression and Element is any type
other than an open array type. The values of type T are arrays whose
element type is Element and whose length is the value of N.
If a has type T, then a[i] designates the element of a whose
position corresponds to the position of i in the subrange
[0..N-1], where N is the length of the array.
An open array type declaration has the form:
type T = [] Element
where Element is any type. The values of T are arrays whose
element type is Element and whose length is arbitrary. The index
type of an open array is the int subrange [0..n-1], where n is
the length of the array.
An open array type can be used only as the type of a formal parameter, the referent of a reference type, the element type of another open array type, or as the type in an array constructor.
Examples of array types:
type Transform = [3] int;
Vector = [] int;
Record Types#
A record is a sequence of named variables, called the fields of the record. Different fields can have different types. The name and type of each field is statically determined by the record’s type. The expression r.f designates the field named f in the record r.
A record type declaration has the form:
type T = ( FieldList )
where FieldList is a list of field declarations, each of which has
the form:
fieldName: Type
where fieldName is an identifier and Type is any non-empty type
other than an open array type. The field names must be distinct. A
record is a member of T if it has fields with the given names and
types, in the given order, and no other fields. Empty records are
allowed.
When a series of fields shares the same type, any fieldName can be a list of identifiers separated by commas. Such a list is shorthand for a list in which the type is repeated for each identifier. That is:
f_1, ..., f_m: Type
is shorthand for:
f_1: Type; ...; f_m: Type
This shorthand is eliminated from the expanded definition of the type.
Reference Types#
A reference value is either nil or the address of a variable,
called the referent.
A reference type is traced. When all traced references to a piece of allocated storage are gone, the implementation reclaims the storage. A general type is traced if it is a traced reference type, a record type any of whose field types is traced, or an array type whose element type is traced.
A declaration for a traced reference type has the form:
type T = ^Type
where Type is any type. The values of T are traced references to
variables of type Type, which is called the referent type of T.
The following reference types are predeclared:
-
Refany: Contains all traced references. -
Null: Contains onlynil.
Procedure Types#
A procedure is either nil or a triple consisting of:
- the body, which is a block,
- the signature, which specifies the procedure’s formal arguments. and result type
- the environment, which is the scope with respect to which variable names in the body will be interpreted.
A procedure that returns a result is called a function procedure; a procedure that does not return a result is called a proper procedure. A top-level procedure is a procedure declared in the outermost scope of a module. Any other procedure is a local procedure. A local procedure can be passed as a parameter but not assigned, since in a stack implementation a local procedure becomes invalid when the frame for the procedure containing it is popped.
A procedure constant is an identifier declared as a procedure. (As opposed to a procedure variable, which is a variable declared with a procedure type.)
A procedure signature specification has the form:
(formal_1; ...; formal_n): R
where
- Each
formal_iis a formal parameter declaration, as described below. -
Ris the result type, which can be any type but an open array type. The: Rcan be omitted, making the signature that of a proper procedure.
A formal parameter declaration has the form
Mode Name: Type
where
-
Modeis a parameter mode, which can beval,var, orreadonly. IfModeis omitted, it defaults toval. -
Nameis an identifier that names the parameter. The parameternames must be distinct. -
Typeis the type of the parameter.
When a series of parameters share the same mode, and type, name_i
can be a list of identifiers separated by commas. Such a list is
shorthand for a list in which the mode, and type, are repeated for
each identifier. That is:
Mode v_1, ..., v_n: Type
is shorthand for:
Mode v_1: Type; ...; Mode v_n: Type
This shorthand is eliminated from the expanded definition of the type.
A procedure value P is a member of the type T if it is nil or
its signature is covered by the signature of T, where
signature_1 covers signature_2 if:
- They have the same number of parameters, and corresponding parameters have the same type and mode.
- They have the same result type, or neither has a result type.
The parameter names and defaults affect the type of a procedure, but not its value.
Object Types#
An object is either nil or a reference to a data record paired with a
method suite, which is a record of procedures that will accept the
object as a first argument.
An object type determines the types of a prefix of the fields of the
data record, as if object were ^(). But in the case of an object
type, the data record can contain additional fields introduced by
subtypes of the object type. Similarly, the object type determines a
prefix of the method suite, but the suite can contain additional
methods introduced by subtypes.
If o is an object, then o.f designates the data field named f in
o’s data record. If m is one of o’s methods, an invocation of
the form o.m( ... ) denotes an execution of o’s m method. An
object’s methods can be invoked, but not read or written.
If T is an object type and m is the name of one of T’s methods,
then T.m denotes T’s m method. This notation makes it convenient
for a subtype method to invoke the corresponding method of one of its
supertypes.
A field or method in a subtype masks any field or method with the same name in the supertype.
Object assignment is reference assignment. Objects cannot be dereferenced, since the static type of an object variable does not determine the type of its data record. To copy the data record of one object into another, the fields must be assigned individually.
There is one predeclared object type:
-
Root: The traced object type with no fields or methods
The declaration of an object type has the form:
type T = ST object { Members }
where ST is an optional supertype, Members is a list of field and
method declarations, and method overrides. The fields of T consist
of the fields of ST followed by the fields declared in
Members. The methods of T consist of the methods of ST modified
by method overrides in Members, and followed by the methods declared
in Members. The names introduced in Members must be distinct from
one another. If ST is omitted, it defaults to Root.
A method declaration has the form:
m sig := proc
where m is an identifier, sig is a procedure signature, and proc is a
top-level procedure constant. It specifies that T’s m method has
signature sig and value proc. If := proc is omitted, := nil is
assumed. If proc is non-nil, its first parameter must have mode val
and type some supertype of T, and dropping its first parameter must
result in a signature that is covered by sig.
A method override has the form:
m := proc
where m is the name of a method of the supertype ST and proc is a top-level procedure constant. It specifies that the m method for T is proc, rather than ST.m. If proc is non-nil, its first parameter must have mode val and type some supertype of T, and dropping its first parameter must result in a signature that is covered by the signature of ST’s m method.
Examples#
Consider the following declarations:
type
A = object { a: int; p() };
AB = A object { b: int };
def Pa(self: A) { ... }
def Pab(self: AB) { ... }
The procedures Pa and Pab are candidate values for the p methods
of objects of types A and AB. For example:
type T1 = AB object { p := Pab };
declares a type with an AB data record and a p method that expects an AB. T1 is a valid subtype of AB`. Similarly,
type T2 = A object { p := Pa };
declares a type with an A data record and a method that expects an A. T2 is a valid subtype of A. A more interesting example is:
type T3 = AB object { p := Pa };
which declares a type with an AB data record and a p method that expects an A. Since every AB is an A, the method is not too choosy for the objects in which it will be placed. T3 is a valid subtype of AB. In contrast,
type T4 = A object { p := Pab };
attempts to declare a type with an A data record and a method that
expects an AB; since not every A is an AB, the method is too
choosy for the objects in which it would be placed. The declaration of
T4 is a static error.
The following example illustrates the difference between declaring a new method and overriding an existing method. After the declarations
type
A = object { m() := P };
B = A object { m := Q };
C = A object { m() := Q };
var a := New(A); b := New(B); c := New(c);
we have that
-
a.m()activatesP(a) -
b.m()activatesQ(b) -
c.m()activatesQ(c)
So far there is no difference between overriding and extending. But c’s method suite has two methods, while b’s has only one, as can be revealed if b and c are viewed as members of type A:
a := b;
a.m(); /* activates Q(b) */
a := c;
a.m(); /* activates P(c) */
The last example uses object subtyping to define reusable queues. First the interface:
type
Queue = ( head, tail: QueueElem );
QueueElem = object { link: QueueElem };
def Insert (var q: Queue; x: QueueElem);
def Delete (var q: Queue): QueueElem;
def Clear (var q: Queue);
Then an example client:
type
IntQueueElem = QueueElem object { val: int };
var
q: Queue;
x: IntQueueElem;
{
...
Clear(q);
x := New(IntQueueElem);
Insert(q, x);
...
x := Delete(q)
}
Passing x to Insert is safe, since every IntQueueElem is a
QueueElem. Assigning the result of Delete to x cannot be
guaranteed valid at compile-time, since other subtypes of QueueElem
can be inserted into q, but the assignment will produce a checked
runtime error if the source value is not a member of the target type.
Subtyping Rules#
We write T <: U to indicate that T is a subtype of U and U is
a supertype of T.
If T <: U, then every value of type T is also a value of type
U. The converse does not hold. This section presents the rules that
define the subtyping relation.
For array types,
\[\texttt{[]}^m \texttt{[}J_1\texttt{]} \ldots \texttt{[}J_n\texttt{] [}K_1\texttt{]}\ldots \texttt{[}K_p\texttt{] T}\] \[<: \texttt{[]}^m \texttt{[]}^n \texttt{[}I_1\texttt{]} \ldots \texttt{[}I_p\texttt{] T}\]if $I_i = K_i$ for $i = 1, \ldots, p$.
That is, an array type A is a subtype of an array type B if they
have the same ultimate element type, the same number of dimensions,
and, for each dimension, either both are open (as in the first $m$
dimensions above), or A is fixed and B is open (as in the next $n$
dimensions above), or they are both fixed and have the same size (as
in the last $p$ dimensions above).
Null <: ^T <: Refany
That is, Refany contains all traced references, and nil is a
member of every reference type.
Null <: (A): R for any A and R
That is, nil is a member of every procedure type;
(A): Q <: (B): R
if signature "(B): R" covers signature "(A): Q".
That is, for procedure types, T <: U if they are the same except for
parameter names.
Root <: Refany
Null <: T object { ... } <: T
That is, every object is a reference, nil is a member of every
object type, and every subtype is included in its supertype.
T <: T for all T
T <: U and U <: V implies T <: V for all T, U, V.
That is, <: is reflexive and transitive.
Note that T <: U and U <: T does not imply that T and U are the same.
Predeclared Opaque Types#
The language predeclares the type:
Text <: Refany
Expressions#
Designators#
An identifier is a writable designator if it is declared as a
variable, is a var or val parameter. An identifier is a readonly
designator if it is a readonly parameter, or a local of a for
statement.
The only operations that produce designators are dereferencing, subscripting, and selection. This section defines these operations and specifies the conditions under which they produce designators.
-
r^denotes the the referentr; this operation is called dereferencing.
The expressionr^is always a writable designator.
It is a static error if the type of r isRefany,Null, or an object type, and a checked runtime error ifrisnil.
The type ofr^is the referent type ofr. -
a[i]denotes the(i + 1 - First(a))-th element of the arraya.
The expressiona[i]is a designator ifais, and is writable ifais.
The expressionimust be assignable to the index type ofa.
The type ofa[i]is the element type ofa.
Ifais a reference to an array, thena[i]is shorthand fora^[i]. -
r.f,o.f,I.x,T.m- If
rdenotes a record,r.fdenotes itsffield. In this caser.fis a designator ifris, and is writable ifris.
The type ofr.fis the declared type of the field.
Ifris a reference to a record, thenr.fis shorthand forr^.f. - If
odenotes an object andfnames a data field specified in the type ofo, theno.fdenotes that data field ofo. In this caseo.fis a writable designator whose type is the declared type of the field. - If
Tis an object type andmis the name of one ofT’s methods, thenT.mdenotes themmethod of typeT. In this caseT.mis not a designator. Its type is the procedure type whose first argument has modevaland typeT, and whose remaining arguments are determined by the method declaration forminT. The name of the first argument is unspecified.T.mis a procedure constant.
- If
Numeric literals#
Numeric literals denote constant non-negative integers. The types of
these literals are int.
A literal int has the form base_digits, where base is one of
2, 3, …, 16, and digits is a non-empty sequence of the
decimal digits 0 through 9 plus the hexadecimal digits A through
F. The base_ can be omitted, in which case base defaults
to 10. The digits are interpreted in the given base. Each digit must
be less than base. For example, 16_FF and 255 are equivalent
integer literals.
If no explicit base is present, the value of the literal must be at
most Last(int). If an explicit base is present, the value of the
literal must be less than $2^n$ where $n$ is the word size of the
target machine. For example, on a sixteen-bit two’s complement
machine, 16_FFFF and -1 represent the same value.
Case is not significant in digits, prefixes or scale factors. Embedded spaces are not allowed.
Text and character literals#
A character literal is a pair of single quotes enclosing either a
single printing character (excluding single quote) or an escape
sequence. The type of a character literal is int.
A text literal is a pair of double quotes enclosing a sequence of
printing characters (excluding double quote) and escape sequences. The
type of a text literal is Text.
Nil#
The literal nil denotes the value nil. Its type is Null.
Function application#
A procedure call is an expression if the procedure returns a result. The type of the expression is the result type of the procedure.
New#
An allocation operation has the form:
New(T, ...)
where T is a reference type other than Refany or Null. The
operation returns the address of a newly-allocated variable of T’s
referent type; or if T is an object type, a newly-allocated data
record paired with a method suite. The reference returned by New is
distinct from all existing references. The allocated type of the new
reference is T.
It is a static error if T’s referent type is empty.
The initial state of the referent generally represents an arbitrary value of its type.
If T is a reference to an array with k open dimensions, the New
operation has the form:
New(T, n_1, ..., n_k)
where the n’s are integer-valued expressions that specify the
lengths of the new array in its first k dimensions. The values in
the array will be arbitrary values of their type.
Arithmetic operations#
The basic arithmetic operations are built into the language.
Implementations must not rearrange the computation of expressions in a way that could affect the result. For example, (x+y)+z generally
cannot be computed as x+(y+z), since addition is not associative for bounded integers.
prefix + (x: int) : int
infix + (x,y: int) : int
As a prefix operator, +x returns x. As an infix operator on
numeric arguments, + denotes addition.
prefix - (x: int) : int
infix - (x,y: int) : int
As a prefix operator, -x is the negative of x. As an infix
operator on numeric arguments, - denotes subtraction.
infix * (x,y: int) : int
On numeric arguments, * denotes multiplication.
infix / (x,y: int) : int
On numeric arguments, / is the floor of the quotient of x and y;
that is, the maximum integer not excedding the real number z such
that z * y = x.
Relations#
infix = (x, y: Any): bool
infix != (x, y: Any): bool
The operator = returns true if x and y are equal. The operator
!= returns true if x and y are not equal. It is a static error
if the type of x is not assignable to the type of y or vice versa.
Ordinals are equal if they have the same value. References are equal if they address the same location. Procedures are equal if they agree as closures; that is, if they refer to the same procedure body and environment.
infix <= (x,y: Ordinal) : bool
infix >= (x,y: Ordinal) : bool
The operator <= returns true if x is at most as large as y. It
is a static error if the type of x is not assignable to the type of
y, or vice versa. The expression x >= y is equivalent to y <=
x.
infix > (x,y: Ordinal) : bool
infix < (x,y: Ordinal) : bool
In all cases, x < y means (x <= y) && (x != y), and x > y means
y < x. It is a static error if the type of x is not assignable to
the type of y, or vice versa.
Boolean operations#
prefix ! (p: bool) : bool
infix && (p,q: bool) : bool
infix || (p,q: bool) : bool
!p is the complement of p.
p && q is true if both p and q are true. If p is false,
q is not evaluated.
p || q is true if at least one of p and q is
true. If p is true, q is not evaluated.
Type Operations#
Number (T: OrdinalType) : int
(A: FixedArrayType) : int
(a: Array) : int
For an ordinal type T, Number(T) returns the number of elements in
T. For a fixed array type A, Number(A) is the length of A.
Similarly, for an array a, Number(a) is the length of the type of
a if a has fixed array type, or the allocated length of a if a
has open array type. The expression a will be evaluated only if it
denotes an open array.
First (T: OrdinalType) : T
(A: FixedArrayType) : int
(a: Array) : int
Last (T: OrdinalType) : T
(A: FixedArrayType) : int
(a: Array) : int
For a non-empty ordinal type T, First returns the smallest value
of T and Last returns the largest value.
For a fixed array type A, First(A) is 0 and Last(A) is N-1
where N is Length(A). Similarly, for an array a, First(a) and
Last(a) are 0 and N-1, respectively, where N is
Length(a). The expression a will be evaluated only if it is an
open array.
Constant Expressions#
Constant expressions are a subset of the general class of expressions,
restricted by the requirement that it be possible to evaluate the
expression statically. All operations are legal in constant
expressions except for New and dereferencing (explicit or implicit)
A variable can appear in a constant expression only as an argument to
First, Last, Number, and such a variable must not have an open
array type. Literals and top-level procedure constants are legal in
constant expressions.