Datatypes
At its simplest, a datatype looks just like an enum declaration. For example, we could say:
datatype Color { Red, Green, Blue };
As with enum, the declaration creates a type (called datatype Color) and three constants Red, Green, and Blue. Unlike enum, these constants do not have type datatype Color. Instead, each variant has its own type, namely datatype Color.Red, datatype Color.Green, and datatype Color.Blue. However, a pointer to one of these values can be treated as a sub-type of a pointer to a datatype Color. So you can write:
datatype Color.Red red = Red;
datatype Color *c = &red;
In this simple example, we are splitting hairs, but we will soon find all these distinctions useful.
Unlike enum, datatype variants may carry any fixed number of values, as in this example:
datatype Shape {
Point,
Circle(float),
Ellipse(float,float),
Polygon(int,float),
};
A Point has no accompanying information, a Circle has a radius, an Ellipse has two axis lengths, and a (regular) Polygon has a number of sides and a radius. (The value fields do not have names, so it is often better style to have a variant carry one value of a struct type, which of course has named members.) This example creates five types: datatype Shape, datatype Shape.Point, datatype Shape.Circle, datatype Shape.Ellipse, and datatype Shape.Polygon. Like in our previous example, datatype Shape.Point* is a subtype of datatype Shape* and Point is a constant of type datatype Shape.Point.
Variants that carry one or more values are treated differently. Circle becomes a constructor; given a float it produces an object of type datatype Shape.Circle, for example Circle(3.0). Similarly, Ellipse(0,0) has type datatype Shape.Ellipse (thanks to implicit casts from int to float for 0) and Polygon(7,4.0) has type datatype Shape.Polygon. The arguments to a constructor can be arbitrary expressions of the correct type, for example, Ellipse(rand(), sqrt(rand())).
Here are some examples which allocate a Point and Circle respectively, but then use subtyping to treat the resulting values as if they are Shape pointers:
datatype Shape *s1 = new Point;
datatype Shape *s2 = new Circle(3.0);
Datatypes are particularly useful for building recursive structures. For example, a small language of arithmetic expressions might look like this:
enum Unops { Negate, Invert};
enum Binops { Add, Subtract, Multiply, Divide };
typedef datatype Exp *@notnull exp_t;
datatype Exp {
Int(int),
Float(float),
Unop(enum Unops, exp_t),
Binop(enum Binops, exp_t, exp_t)
};
A function returning an expression representing the multiplication of its parameter by two can be written like this:
exp_t double_exp(datatype Exp @e) {
return new Binop(Multiply, e, new Int(2));
}
Accessing Datatype Variants
Given a value of a datatype type, such as datatype Shape, we do not know which variant it is. It could be a Circle or Shape, etc. In Cyclone, we use pattern matching to determine which variant a given datatype value actually is, and to extract the arguments that were used to build the datatype value. For example, here is how you could define isCircle:
bool isCircle(datatype Shape *s) {
switch(s) {
case &Circle(r): return true;
default: return false;
}
}
When a switch statement’s argument is a pointer to a datatype, the
cases describe variants. One variant of datatype Shape *
is a pointer
to a Circle, which carries one value. The corresponding pattern has &
for the pointer, Circle for the constructor name, and one identifier
for each value carried by Circle. The identifiers are binding
occurrences (declarations, if you will), and the initial values are
the values of the fields of the Circle at which s points. The scope is
the extent of the case clause.
Here is another example:
(The reader is asked to indulge compiler-writers who have forgotten basic geometry.)
extern area_of_ellipse(float,float);
extern area_of_poly(int,float);
float area(datatype Shape *s) {
float ans;
switch(s) {
case &Point:
ans = 0;
break;
case &Circle(r):
ans = 3.14*r*r;
break;
case &Ellipse(r1,r2):
ans = area_of_ellipse(r1,r2);
break;
case &Polygon(sides,r):
ans = area_of_poly(sides,r);
break;
}
return ans;
}
The cases are compared in order against s. The following are compile-time errors:
- It is possible that a member of the datatype type matches none of the cases. Note that default matches everything.
- A case is useless because it could only match if one of the earlier cases match. For example, a default case at the end of the switch in area would be an error.
As you can discover in Pattern Matching, Cyclone has much richer pattern-matching support than we have used here.
Polymorphism and Datatypes
A datatype declaration may be polymorphic over types and regions just
like a struct or union definition (see Polymorphism). For example,
here is a declaration for binary trees where the leaves can hold some
BoxKind `a
:
datatype Tree<`a> {
Leaf(`a);
Node(datatype Tree<`a>*, datatype Tree<`a>*);
};
In the above example, the root may be in any region, but all children will be in the heap. The following version allows the children to be in any region, but they must all be in the same region. (The root can still be in a different region.)
datatype Tree<`a,`r> {
Leaf(`a);
Node(datatype Tree<`a,`r> *`r,
datatype Tree<`a,`r> *`r);
};