A type is a property of a storage location. Examples of storage locations include: a variable, a field of a struct, an element of an array, a heap block, a function argument or return value. The type of a location is almost always static, i.e. known at compile time and not stored at run time (see [TODO] "doc/dynamic.txt" for the only exception).
The type of a location constrains the value stored at that location. It also tells you how to calculate the types of the locations pointed to by the value. The most obvious kind of constraint is that a type only allows certain values, and forbids all others. A type can also tell you how a value may be used; for example, a function type tells you how to call the value, and that you must not examine it in any other way. There are also more abstract constraints, such as constness. See "doc/data-model.txt", "doc/emails/const.txt" and "doc/emails/ref-const.txt" for more details. [TODO: Edit the two emails into a proper "doc/const.txt" file.]
Partly because the type constrains the value, the type also defines the representation of the value, i.e. its encoding as a string of bytes in memory. The same string of bytes can mean a different value, if stored in a location with a different type. For example, a string of eight zero bytes can mean: the integer zero; the floating-point value zero; an array of length zero; a null pointer; a pair of NUL characters; eight copies of the value false
; and so on. This is how Welly achieves good memory density.
Often we say that a value "has" a type, or "is of" a type, or that a type "has" a value or "contains" a value. Indeed, it is easy to think of the type as a property of the value. However, this is really an abbreviated form of language. The precise language is that if a location has the type, then the value can be stored at that location. The type of the value is really the type of the location that holds it.
If the constraints imposed by a type t
include all the constraints imposed by a type u
, and if types t
and u
define the same representation for all values of type t
, then we say that t
is a subtype of u
. In practice, this means that any value stored in a location of type t
can be copied, unmodified, to a location of type u
.
For example [not implemented], the type ref var int
is a subtype of opt ref var int
. The type ref var int
allows values that are pointers to heap blocks containing a single integer, and it allows the integer to be read and written. The type opt ref var int
allows all of those values, with the same permissions, and also allows null
.
For another example, the type ref var int
is a subtype of ref int
. The type ref var int
allows the integer to be read and written, but ref int
only allows the integer to be read. It adds an extra constraint on how the value may be used.
For a third example, the type ref const int
is a subtype of ref int
. The type ref const int
allows the integer to be read, and guarantees that it will never change, but ref int
only allows the integer to be read. By failing to make the guarantee, it adds an extra constraint on how the value may be used.
It is useful to adopt the convention that every type is a subtype of itself. When we want to say that t
is a subtype of u
but that u
is not a subtype of t
, we say t
is a "proper" subtype of u
.
On this page, we'll uses the syntax t <= u
to mean that t
is a subtype of u
. Therefore, the syntax t < u
tests whether t
is a proper subtype of u
, and the syntax t == u
tests whether they are both subtypes of each other.
The subtype relation is defined coinductively, meaning that t
is a subtype of u
unless there exists a finite disproof using the axioms defined in this section.
The task of searching for a disproof of t <= u
has the form of a flood-fill algorithm. The state of the algorithm is a set of pairs (v, w)
such that disproving v <= w
is sufficient to disprove t <= u
. You make progress by recursively descending v
and w
in order to add more pairs to the set, using the rules below. Eventually, you either find a pair that is trivially disproved by one of the rules, or you find that you can no longer add any more pairs to the set.
The rules in this section are supposed to be exhaustive. It is therefore important to be able to enumerate all possible types. The possible types are defined in the section on data types. For reference, here is a copy of the version of the definition I used to write this document:
union constness {
BASE
CONST
VAR
}
union type {
PRIMITIVE(uint8, uint8)
REF(constness, type)
ARRAY0
ARRAY(constness, type)
STRUCT(array const type)
UNION(array const struct {name: str, type: array const type})
FUNCTION(array const type, type)
DYNAMIC
MACRO
WITH(type, module)
} with { ... }
In a UNION
, the name
s must be distinct.
Note that there is only one type DYNAMIC
and only one type MACRO
.
v
is not a subtype of w
unless the cases of v
and w
are one of the combinations marked "*" in the following table:
v \ w | P | R | A0 | A | S | U | F | D | M | W |
---|---|---|---|---|---|---|---|---|---|---|
PRIMITIVE | * | * | ||||||||
REF | * | * | ||||||||
ARRAY0 | * | * | * | |||||||
ARRAY | * | * | ||||||||
STRUCT | * | * | ||||||||
UNION | * | * | ||||||||
FUNCTION | * | * | ||||||||
DYNAMIC | * | * | ||||||||
MODULE | * | * | ||||||||
WITH | * | * | * | * | * | * | * | * | * | * |
[Not implemented - the current version uses a simpler design.]
PRIMITIVE(s1, a1) <= PRIMITIVE(s2, a2)
s1 == s2
a1 == a2
REF(c, t) <= REF(d, u)
t <= u
c == d
or d == BASE
u <= t
or d != VAR
ARRAY0() <= ARRAY0()
ARRAY0() <= ARRAY(d, u)
ARRAY(c, t) <= ARRAY(d, u)
t <= u
c == d
or d == BASE
u <= t
or d != VAR
STRUCT(fs) <= STRUCT(gs)
fs.length == gs.length
i
.fs[i] <= gs[i]
UNION(cs) <= UNION(ds)
i
. ∃ j
.m == n
ts.length == us.length
k
.ts[k] <= us[k]
(m, ts) = cs[i]
(n, us) = ds[j]
FUNCTION(ts, tr) <= FUNCTION(us, ur)
tr <= ur
ts.length == us.length
i
.us[i] <= ts[i]
WITH
types[Not implemented.]
WITH(t, m) <= u
t <= u
t <= WITH(u, n)
t <= u