2021/06/12

What is a type?

Another Coffee Compiler Club call, another concept to explain.

What is a type?

It seems like such a simple question, until you try to answer it. What is a type?

When we designed Ecstasy, we were intent on boiling down types to some pure form, some simple atomic model of constructing everything in our programming universe. Types are, in some way, the periodic table of program elements. The protons, neutrons, and electrons of program existence. If you can't explain the basic building blocks of your universe, how can you explain how anything works?

Let's start with the conclusion: An object's type is the sum of its behavior.

Not having any real background in type theory, I have no idea whether this is an obvious truism or a nutty novel notion. I'm going to assume the latter, only because I've never used a language with a type definition like this, and also because it provides an excellent opportunity to explain the concept.

Most type systems that I've known and used are built around some combination of identity, state, and behavior. Java object types, for example, are based entirely on identity: The identity of a class is its type; the identity includes the name of the class, and its ancestors, in terms of superclasses and implemented interfaces. Java types do carry detailed information about state (fields) and behavior (methods), but those details don't define the type; only the identity defines the type. The question "Is some object reference o of type T?" is never answered by what fields the object contains, nor by what methods it has, but rather by its identity, and solely by its identity.

Of course, this was quite a leap forward from the answer in C (and by extension, C++); in C, the question "Is some object reference o of type T?" always has the answer "Yes". Got a pointer? It turns out that your pointer points to whatever type you tell the compiler it points to. Is it a Cat? Is it a Dog? Is it a Car? Is it a House? The answer is always "Yes!" One must admit that C is quite an agreeable language when it comes to types. (With this in mind, it's also easy to understand how C code is responsible for so many security flaws.)

But let's drop all of these notions on the floor, and start over. Let's start with a made-up syntax for defining a type:

type
{
    // things that define what the type is
}

Looks kind of like a C structure. And we often think about types in exactly this way: They have a name (identity), and they have structure (fields).

type Person
{
    String firstName;
    String lastName;
}

But by our definition, this is completely wrong! We claimed that a type is only the sum of its behavior, and this example has only identity and state instead. So let's fix this, temporarily, and in the ugliest manner possible:

type // if we could name it, we'd call it "Person"
{
    String getFirstName();
    void setFirstName(String);
    String getLastName();
    void setLastName(String);
}

Interesting. Ugly, but interesting. We've turned state into behavior. And we turned the identity into a comment. But let's try an experiment: We'll allow a type to have an optional identity (including type name, type parameters, and ancestor types), just to make it easier to describe the types, and we'll create a few types that we can re-use to avoid some of that awful boilerplate:

type Ref<T>
{
    T get();
}

type Var<T> : Ref<T>
{
    void set(T);
}

type Person
{
    Var<String> firstName();
    Var<String> lastName();
}

So there is no state, per se, and the identity exists solely as a convenience for us, the reader, but we're almost back to where we started. In fact, if we introduce a short-hand notation for a zero-parameter method that returns a Var<T>, we are back where we started:

type Person
{
    String firstName; // this just means "Var<String> firstName()"
    String lastName;  // this just means "Var<String> lastName()"
}
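For readers who want to poke at this idea in a language they can run today, here is a rough analogy in Python, not Ecstasy. It uses typing.Protocol to describe Ref, Var, and Person purely by their methods. Cell, Employee, and full_name are hypothetical names invented for this sketch, and nothing here claims to mirror how Ecstasy implements these types.

```python
from typing import Generic, Protocol, TypeVar

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)


class Ref(Protocol[T_co]):
    """Read-only reference: its only behavior is get()."""

    def get(self) -> T_co: ...


class Var(Ref[T], Protocol[T]):
    """Mutable reference: everything a Ref does, plus set()."""

    def set(self, value: T) -> None: ...


class Person(Protocol):
    """A Person is anything with these two zero-parameter methods."""

    def first_name(self) -> Var[str]: ...

    def last_name(self) -> Var[str]: ...


class Cell(Generic[T]):
    """Hypothetical concrete holder; it never mentions Ref or Var."""

    def __init__(self, value: T) -> None:
        self._value = value

    def get(self) -> T:
        return self._value

    def set(self, value: T) -> None:
        self._value = value


class Employee:
    """Hypothetical class; it never mentions Person either."""

    def __init__(self, first: str, last: str) -> None:
        self._first = Cell(first)
        self._last = Cell(last)

    def first_name(self) -> Cell[str]:
        return self._first

    def last_name(self) -> Cell[str]:
        return self._last


def full_name(p: Person) -> str:
    # Only behavior is used here: two method calls and two get() calls.
    return f"{p.first_name().get()} {p.last_name().get()}"


employee = Employee("Ada", "Lovelace")
print(full_name(employee))
```

A static checker such as mypy should accept Employee wherever a Person is expected, because the methods line up, not because of any declared ancestry.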

We're close, but not done, because we have referenced another type, "String". And what is a "String"? It's just another type that has to be defined in exactly the same way as "Person". To define a String, it helps to have an Array. To define an Array, which has a size, it helps to have an Int64. To define an Int64, it helps to have another Array of Bit. To define a Bit, it helps to have an IntLiteral to represent a 0 or 1. And to define an IntLiteral, it helps to have a String.

In other words, the type system forms a closed loop. All types are defined from other types, and types are defined solely by their behavior.

And behavior? Behavior is simply defined as a set of named methods, each taking zero or more typed parameters, and returning zero or more typed results.

So what does this mean?

To oversimplify the conclusion, it means that a mathematician can use set theory to implement a type calculus for such a type system. Really, that's it. You know, like curing cancer, or finding the holy grail.

In Ecstasy, all types are defined like this. Even Type is a Type.
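The set-theory remark can be made a little more concrete with a toy model. This is a sketch in Python, under the assumption that a type is nothing more than a set of method signatures, with type names written as plain strings. The Method tuple and the example types are hypothetical, and the model deliberately ignores variance and generics, so it is far simpler than a real type calculus.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Method(NamedTuple):
    """A named method with typed parameters and typed results."""

    name: str
    params: Tuple[str, ...]
    returns: Tuple[str, ...]


Type = FrozenSet[Method]

GET = Method("get", (), ("String",))
SET = Method("set", ("String",), ())
SIZE = Method("size", (), ("Int64",))

ref_string: Type = frozenset({GET})
var_string: Type = frozenset({GET, SET})
sized: Type = frozenset({SIZE})

# Two types built separately, with no shared name, are the same type
# if they contain the same behavior.
same_behavior: Type = frozenset({Method("get", (), ("String",))})
print(ref_string == same_behavior)

# "Var can be used as a Ref" becomes a superset test in this toy model.
print(var_string >= ref_string)
print(ref_string >= var_string)

# Ordinary set operations combine or compare behavior.
sized_ref: Type = ref_string | sized   # everything from both
common: Type = var_string & ref_string  # only what both can do
added: Type = var_string - ref_string   # what Var adds over Ref
```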

Riddle me this

It should be pretty obvious now why we refer to this as the "Turtles Type System", since it's turtles the whole way down. One of the interesting riddles we encountered early on looked something like this:

type A
{
    B foo();
}

type B
{
    A foo();
}

Question: What is the difference between an A and a B?
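One way to explore the riddle is to compare the two types by behavior alone, treating the names as mere labels. The Python sketch below does that with a hypothetical table of types, and it assumes that a pair already under comparison can be taken as equal when it comes up again, which is what keeps the recursion from chasing its own tail. It illustrates one possible reading of the riddle, not the rule that Ecstasy actually applies.

```python
from typing import Dict, Optional, Set, Tuple

# method name -> (parameter types, return types)
Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]
TypeTable = Dict[str, Dict[str, Signature]]

TYPES: TypeTable = {
    "A": {"foo": ((), ("B",))},
    "B": {"foo": ((), ("A",))},
    "C": {"bar": ((), ("C",))},  # hypothetical third type, for contrast
}


def equivalent(
    x: str,
    y: str,
    types: TypeTable,
    assumed: Optional[Set[Tuple[str, str]]] = None,
) -> bool:
    """Compare two named types purely by the behavior they expose."""
    if assumed is None:
        assumed = set()
    if x == y or (x, y) in assumed:
        return True  # same name, or a pair we are already in the middle of
    if x not in types or y not in types:
        return False  # an unknown name only matches itself
    assumed = assumed | {(x, y)}
    methods_x, methods_y = types[x], types[y]
    if methods_x.keys() != methods_y.keys():
        return False
    for name, (params_x, returns_x) in methods_x.items():
        params_y, returns_y = methods_y[name]
        if len(params_x) != len(params_y) or len(returns_x) != len(returns_y):
            return False
        pairs = zip(params_x + returns_x, params_y + returns_y)
        if not all(equivalent(a, b, types, assumed) for a, b in pairs):
            return False
    return True


print(equivalent("A", "B", TYPES))
print(equivalent("A", "C", TYPES))
```

Under those assumptions, the sketch cannot tell an A from a B: each has a single method foo that returns something that behaves exactly like itself.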

Rules

One of the interesting things with such a type system is how easy it is to construct recursive rules from it. For example, we say that a method m consumes type T if any of the following holds true:

  1. m has a parameter type declared as T;
  2. m has a parameter type that produces T;
  3. m has a return type that consumes T.

Similarly, we say that a method m produces type T if any of the following holds true:

  1. m has a return type declared as T;
  2. m has a return type that produces T;
  3. m has a parameter type that consumes T.

These rules form the basis for checking the legality of things like method variance, such as co-variance and contra-variance, which in turn allows the type system to intelligently enforce type safety.
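The two rule lists above translate almost line for line into code. Here is a sketch in Python over a hypothetical table of types. One thing the rules leave unstated is what it means for a type (as opposed to a method) to consume or produce T; the sketch assumes a type does so if any of its methods does. The seen set is also an addition of this sketch, there only to stop the recursion on types that refer to each other.

```python
from typing import Dict, List, Optional, Set, Tuple

# A method is (parameter types, return types); names are not needed here.
Method = Tuple[Tuple[str, ...], Tuple[str, ...]]
Seen = Set[Tuple[str, str]]

GET: Method = ((), ("String",))          # String get()
SET: Method = (("String",), ())          # void set(String)
NAME: Method = ((), ("RefString",))      # RefString name()
ACCEPT: Method = (("RefString",), ())    # void accept(RefString)
FOO_A: Method = ((), ("B",))             # B foo()
FOO_B: Method = ((), ("A",))             # A foo()

TYPES: Dict[str, List[Method]] = {
    "String": [],
    "RefString": [GET],
    "VarString": [GET, SET],
    "A": [FOO_A],
    "B": [FOO_B],
}


def method_consumes(m: Method, t: str, seen: Optional[Seen] = None) -> bool:
    seen = set() if seen is None else seen
    params, returns = m
    return (
        any(p == t for p in params)                          # rule 1
        or any(type_produces(p, t, seen) for p in params)    # rule 2
        or any(type_consumes(r, t, seen) for r in returns)   # rule 3
    )


def method_produces(m: Method, t: str, seen: Optional[Seen] = None) -> bool:
    seen = set() if seen is None else seen
    params, returns = m
    return (
        any(r == t for r in returns)                         # rule 1
        or any(type_produces(r, t, seen) for r in returns)   # rule 2
        or any(type_consumes(p, t, seen) for p in params)    # rule 3
    )


def type_consumes(name: str, t: str, seen: Seen) -> bool:
    # Assumption: a type consumes t if any of its methods consumes t.
    if ("consumes", name) in seen:
        return False  # already being explored; nothing new to learn
    seen.add(("consumes", name))
    return any(method_consumes(m, t, seen) for m in TYPES.get(name, []))


def type_produces(name: str, t: str, seen: Seen) -> bool:
    # Assumption: a type produces t if any of its methods produces t.
    if ("produces", name) in seen:
        return False
    seen.add(("produces", name))
    return any(method_produces(m, t, seen) for m in TYPES.get(name, []))


print(method_produces(NAME, "String"))    # via the returned RefString
print(method_consumes(ACCEPT, "String"))  # via the RefString parameter
print(method_produces(FOO_A, "String"))   # recursion ends, finds nothing
```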





2021/05/21

What is a Property?

On a recent Cliff Click Coffee Compiler Club call, this question came up: What exactly is an Ecstasy property? It turns out that a property is a very obvious and simple thing, yet explaining it is not so simple.

Developers have different expectations when they hear the word "property", including:

  • It's just a named field in a structure.
  • It's something that has a getter and a setter.

These are logical expectations, because in languages like C++ and Java, "object properties" are just fields in structures, and in Java, the getter and setter methods are a well-known way to expose private fields as public virtual methods.

But unfortunately, starting with this train of thought takes us in the wrong direction, so let's forget all of this historical context, and back up to the beginning: What exactly is an Ecstasy property?

First, it is important to appreciate where an Ecstasy property exists:

  • A property can be declared inside any class, including module and package classes;
  • A property can be declared inside a property;
  • A property can be declared inside a method.

That a property can exist inside a class is not unusual, but it is a bit unusual that a property can exist inside another property, or even inside of a method.

In Ecstasy, everything is an object, so it follows that a property is an object. Objects have types. So what is the type of a property? A property, like a local variable, is an instance of Ref, a reference. If the property is mutable, then (also like a local variable), it is an instance of Var, which extends Ref.

Since a property is an object, and objects are instances of a class, then what is the class of a property? The class of a property is unknowable within Ecstasy. That does not mean that the property does not have a class; it simply means that the class is not visible from within the running code. Let's take a simple example:

class Person
{
    Int age;
}

When we have an instance of Person, we can ask that object's reference for its actual class, and if it was created within the current container, it will return the Person class. But it's also possible that an object reference comes from outside of the container, in which case asking for the actual class will not return the actual class, but will instead return just the interface type through which the object can be viewed; this is the basis for container security, and is a fundamental building block of Ecstasy's strong security model.

When the Ecstasy runtime starts up, and an Ecstasy application is loaded and starts running, it is running in the outermost Ecstasy container, called "container 0". This is the container within which the application's module was loaded, and within which all other containers and objects are created, so one would think that the application's properties would also be created within that "container 0" ... but that would be incorrect. In order for the initial application "container 0" to be created, there had to already be a Container class, and since that class comes from the core Ecstasy module, that means that the core Ecstasy module was already loaded in some container before "container 0" was created. And since it's "turtles the whole way down", it should be obvious that "container 0" is itself actually sitting on top of an infinite stack of turtles, which for purposes of keeping this short, we will simply refer to as "container -1".

"Wait ... what?!?" I can almost hear the WTFs being hurled at computer screens everywhere. But here's the simple truth: Anything outside of the container that the application is loaded within is simply unknowable. So if the application is started in something that we call "container 0", and a container always exists within a container, then we know that there must be some "container -1", if only because otherwise there couldn't be a "container 0". And just to keep this short and as-simple-as-possible, the runtime itself is that unknowable outer container, and the runtime itself is the container that loaded the Ecstasy module, and the runtime itself is the thing that knows how to "new" a class, and to automatically "new" whatever class is automatically used for each property as well. And in reality, that doesn't actually happen -- each property couldn't actually be a new object, right?

As with many things in Ecstasy, the answer is purposefully unknowable. If you ask for a property's reference, you do get back a usable object -- one that you can reflect on, pass around, store in a property somewhere, or whatever it is that you do with objects -- so obviously the property object "exists", by some definition; but like all turtles, it may not have existed before you looked at it, and it may not exist when you're not looking at it.

But here's where things get seriously cool: Since a property does have a class, we can augment that class! Of course, we don't use the extends keyword like when we sub-class (because we don't know what class to extend) ... but we can write a mixin for the property, because we do know the type to mix into! In fact, lots of functionality in Ecstasy is built by writing mixins that can be mixed into properties and local variables, such as futures, lazily calculated values, and watched values.
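To make the mixin idea a little more tangible, here is a small sketch in Python rather than Ecstasy. It leans on Python's cooperative super() so that a mixin can be written knowing only the behavior it mixes into (get and set), not the concrete class. Var, Watched, and WatchedVar are hypothetical names for this sketch; Ecstasy's own mixins are declared and applied differently, and none of that syntax is shown here.

```python
class Var:
    """Hypothetical stand-in for a mutable reference with get() and set()."""

    def __init__(self, value=None):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value


class Watched:
    """Mixin that records every change.

    It only assumes that whatever it is mixed into provides get() and set().
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.changes = []

    def set(self, value):
        old = self.get()
        super().set(value)  # hand off to the class we were mixed into
        self.changes.append((old, value))


class WatchedVar(Watched, Var):
    """The mixin applied to the reference type."""


age = WatchedVar(41)
age.set(42)
age.set(43)
print(age.get())
print(age.changes)
```

The point of the analogy is that Watched was written without knowing, or caring, which class would end up underneath it.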

Furthermore, we can augment a property where we define it, as if it were a class. Here's a silly example:

module Test
{
    @Inject Console console;
    Log log = new ecstasy.io.ConsoleLog(console);

    void run()
    {
        log.add("Simple property example!");

        val o = new TestClass();
        for (Int i : 0..5)
        {
            val n = o.x;
        }

        o.&x.foo(); // &x gets the property, instead of de-referencing it
    }

    class TestClass
    {
        Int x
        {
            @Override Int get()
            {
                ++count;
                return super();
            }

            void foo()
            {
                log.add($"Someone accessed this property {count} times!");
            }

            private Int count;
        }
    }
}

And when we run it:

++++++ Loading module: Test +++++++

Simple property example!
Someone accessed this property 6 times!

Process finished with exit code 0

So we can augment our property with code where we define the property, we can mix in predefined functionality into a property (again, it's not magic, because you could have written those mixins yourself!), and we can even modify a property's behavior on a sub-class (assuming that the property wasn't private), because the sub-class's property's class implicitly extends the super-class's property's class.

Okay, that was a lot of information, but it conveys an important point: A property isn't just some field in a structure. It's a real object, with a real class, and it behaves like a real object, with a real class.
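For anyone who thinks better in a mainstream language, the same shape can be roughly approximated in Python. This is only an analogy: Var, CountingX, and x_ref are hypothetical names, the x_ref attribute plays the part of &x, and a Python @property stands in for the plain dereference o.x. In Ecstasy none of this plumbing is written by hand.

```python
class Var:
    """Hypothetical stand-in for the property object."""

    def __init__(self, value=0):
        self._field = value

    def get(self):
        # End of the call chain: this is where the stored value is read.
        return self._field

    def set(self, value):
        self._field = value


class CountingX(Var):
    """The property 'x', augmented the way the Ecstasy example augments it."""

    def __init__(self, value=0):
        super().__init__(value)
        self._count = 0

    def get(self):
        self._count += 1
        return super().get()  # continue down the chain to the stored value

    def foo(self):
        return f"Someone accessed this property {self._count} times!"


class TestClass:
    def __init__(self):
        self.x_ref = CountingX()  # the property object itself, like &x

    @property
    def x(self):
        return self.x_ref.get()  # a plain read goes through get()

    @x.setter
    def x(self, value):
        self.x_ref.set(value)


o = TestClass()
for _ in range(6):  # six reads, matching the 0..5 loop in the example
    n = o.x

message = o.x_ref.foo()
print(message)  # Someone accessed this property 6 times!
```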

But what about the property's value? Where is it stored? The Ecstasy type system determines which properties require a field for their storage, and automatically includes those fields in the underlying structure that is defined for each class, and thus exists for each object. In other words, all of a property's state is stored in whatever class the property "rolls up" into. Here are two simple examples:

class Example
{
    Int x; // this has a field on Example:struct

    Int y.get()
    {
        return 7; // this does not have a field
    }
}

The type system has rules that determine when a field is required. The compiler uses these rules. The runtime uses these same rules. If a field is required, then the field will exist. If the field is not required, then the field will not exist.

So how is that field accessed? Well, normally, we don't even think about that. If there's an object o with a property x, we just dereference the property o.x, and we never think about the field. But conceptually, the field is accessed by the last method in the call chain for the get() method on the property, so if you don't override the get() method, then accessing the property goes straight to the field.

Alternatively, sometimes it is necessary to work with an object's structure directly. A serialization library, for example, may need to access the field values to store them off, and subsequently build a new structure using that stored-off data to re-instantiate the corresponding object. Instead of trying to paste an example here, it makes more sense just to point to the JSON serialization implementation that does exactly this. While it might seem like a common thing (directly accessing a field), it turns out that the only places in the entire Ecstasy code base where fields are being accessed directly are (i) serialization implementations and (ii) tests of the compiler and runtime itself.

So, back to the initial question: What is an Ecstasy property?

  • A property has a name.
  • A property represents state with a value, of a type.
  • A property is contained within a class, a property, or a method.
  • A property is a container of classes, properties, and methods.
  • A property is itself a class.
  • A property can be customized, mixed into, inherited, and overridden.
  • Properties are virtual.