Subtypes

What is a type?

A type is a property of a storage location. Examples of storage locations include: a variable, a field of a struct, an element of an array, a heap block, a function argument or return value. The type of a location is almost always static, i.e. known at compile time and not stored at run time (see [TODO] "doc/dynamic.txt" for the only exception).

The type of a location constrains the value stored at that location. It also tells you how to calculate the types of the locations pointed to by the value. The most obvious kind of constraint is that a type only allows certain values, and forbids all others. A type can also tell you how a value may be used; for example, a function type tells you how to call the value, and that you must not examine it in any other way. There are also more abstract constraints, such as constness. See "doc/data-model.txt", "doc/emails/const.txt" and "doc/emails/ref-const.txt" for more details. [TODO: Edit the two emails into a proper "doc/const.txt" file.]

Partly because the type constrains the value, the type also defines the representation of the value, i.e. its encoding as a string of bytes in memory. The same string of bytes can mean a different value, if stored in a location with a different type. For example, a string of eight zero bytes can mean: the integer zero; the floating-point value zero; an array of length zero; a null pointer; a pair of NUL characters; eight copies of the value false; and so on. This is how Welly achieves good memory density.

Often we say that a value "has" a type, or "is of" a type, or that a type "has" a value or "contains" a value. Indeed, it is easy to think of the type as a property of the value. However, this is really an abbreviated form of language. The precise language is that if a location has the type, then the value can be stored at that location. The type of the value is really the type of the location that holds it.

What is a subtype

If the constraints imposed by a type t include all the constraints imposed by a type u, and if types t and u define the same representation for all values of type t, then we say that t is a subtype of u. In practice, this means that any value stored in a location of type t can be copied, unmodified, to a location of type u.

For example [not implemented], the type ref var int is a subtype of opt ref var int. The type ref var int allows values that are pointers to heap blocks containing a single integer, and it allows the integer to be read and written. The type opt ref var int allows all of those values, with the same permissions, and also allows null.

For another example, the type ref var int is a subtype of ref int. The type ref var int allows the integer to be read and written, but ref int only allows the integer to be read. It adds an extra constraint on how the value may be used.

For a third example, the type ref const int is a subtype of ref int. The type ref const int allows the integer to be read, and guarantees that it will never change, but ref int only allows the integer to be read. By failing to make the guarantee, it adds an extra constraint on how the value may be used.

It is useful to adopt the convention that every type is a subtype of itself. When we want to say that t is a subtype of u but that u is not a subtype of t, we say t is a "proper" subtype of u.

Syntax

On this page, we'll uses the syntax t <= u to mean that t is a subtype of u. Therefore, the syntax t < u tests whether t is a proper subtype of u, and the syntax t == u tests whether they are both subtypes of each other.

Mathematical definition

The subtype relation is defined coinductively, meaning that t is a subtype of u unless there exists a finite disproof using the axioms defined in this section.

The task of searching for a disproof of t <= u has the form of a flood-fill algorithm. The state of the algorithm is a set of pairs (v, w) such that disproving v <= w is sufficient to disprove t <= u. You make progress by recursively descending v and w in order to add more pairs to the set, using the rules below. Eventually, you either find a pair that is trivially disproved by one of the rules, or you find that you can no longer add any more pairs to the set.

All types

The rules in this section are supposed to be exhaustive. It is therefore important to be able to enumerate all possible types. The possible types are defined in the section on data types. For reference, here is a copy of the version of the definition I used to write this document:

union constness {
    BASE
    CONST
    VAR
}

union type {
    PRIMITIVE(uint8, uint8)
    REF(constness, type)
    ARRAY0
    ARRAY(constness, type)
    STRUCT(array const type)
    UNION(array const struct {name: str, type: array const type})
    FUNCTION(array const type, type)
    DYNAMIC
    MACRO
    WITH(type, module)
} with { ... }
In a UNION, the names must be distinct. Note that there is only one type DYNAMIC and only one type MACRO.

Incompatible types

v is not a subtype of w unless the cases of v and w are one of the combinations marked "*" in the following table:

v \ w P R A0 A S U F D M W
PRIMITIVE * *
REF * *
ARRAY0 * * *
ARRAY * *
STRUCT * *
UNION * *
FUNCTION * *
DYNAMIC * *
MODULE * *
WITH * * * * * * * * * *

Primitives

[Not implemented - the current version uses a simpler design.]

PRIMITIVE(s1, a1) <= PRIMITIVE(s2, a2)
only if
s1 == s2
and
a1 == a2

References

REF(c, t) <= REF(d, u)
only if
t <= u
and
c == d or d == BASE
and
u <= t or d != VAR

Arrays

ARRAY0() <= ARRAY0()
ARRAY0() <= ARRAY(d, u)
ARRAY(c, t) <= ARRAY(d, u)
only if
t <= u
and
c == d or d == BASE
and
u <= t or d != VAR

Structs

STRUCT(fs) <= STRUCT(gs)
only if
fs.length == gs.length
and
i.
fs[i] <= gs[i]

Unions

UNION(cs) <= UNION(ds)
only if
i. ∃ j.
m == n
and
ts.length == us.length
and
k.
ts[k] <= us[k]
where
(m, ts) = cs[i]
and
(n, us) = ds[j]

Functions

FUNCTION(ts, tr) <= FUNCTION(us, ur)
only if
tr <= ur
and
ts.length == us.length
and
i.
us[i] <= ts[i]

WITH types

[Not implemented.]

WITH(t, m) <= u
only if
t <= u
t <= WITH(u, n)
only if
t <= u