2020/01/25

Coming from Java, Part III (Singletons)


(This entry is the third installment of "Coming from Java". This is Part III; here is a link to Part II.)

A common pattern in Java is the singleton pattern. It allows you to create a single instance of a class, and then to find that same instance from anywhere in your code:
public class Singleton
    {
    public static final Singleton INSTANCE = new Singleton();

    private Singleton()
        {
        // initialization stuff goes here
        }
    }
Creating a singleton in Ecstasy is accomplished by using the static keyword for the class:
static const Singleton
    {
    construct()
        {
        // initialization stuff goes here
        }
    }
There is one tiny detail, though: In Ecstasy, only const classes and service classes can be singletons, and the reason is quite fundamental to the purposes for which Ecstasy was designed.

To begin with, consider the differences between high-end computers that Java was designed for, and the high-end computers that Ecstasy is designed for. At the time that Java was initially being designed in the early 1990s, very few computers had more than one CPU -- Sun hadn't yet even released its first multiprocessor workstation! A server or a high-end workstation might have had 8MB of RAM, and -- still hard to believe! -- the IBM 370 mainframe of the day topped out at 16MB of RAM. That's megabytes -- a new notebook today has a thousand times that much memory!

Over a decade later, with dual-CPU servers now the norm, Java would get its first working memory model specification, specifying the JVM's guarantees for reads and writes occurring across multiple CPUs and among multiple threads.

Ecstasy, in contrast, was designed explicitly to take advantage of computers with potentially many thousands of cores, and with potentially many terabytes of main memory. To accomplish this, the design focused on disentangling threads from each other, and disentangling the memory -- what Java calls "the heap". The rationale is simple: In a modern computer, a single thread of code can perform on the order of 1-10 billion instructions per second, if and only if the thread does not share read/write memory with any other threads. The moment that a thread starts to use read/write memory that is being used by other threads, the performance (and the predictability of the performance) drops like a lead balloon.

To avoid the lead balloon effect, Ecstasy carves out exclusive zones of mutable memory, each with its own single conceptual thread. Each of these is called a service. An Ecstasy service can be thought of as a simple Turing Machine, or a simple von Neumann machine. And an Ecstasy service can be thought of as a boundary for mutability, because all mutation of a service's memory occurs within that service, and only immutable data can permeate that boundary. Services can communicate with other services, but that communication is conceptually asynchronous in nature, the communication is in the form of invocation, and only immutable data is exchanged.
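For readers who think in Java, the idea of a service can be loosely approximated by a single-threaded executor that owns some mutable state. This is only an analogy under stated assumptions, not a description of how an Ecstasy runtime is implemented; the names CounterService and hitAsync are made up for this sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical analogy for a service: exactly one thread owns the mutable
// state, and callers only ever receive immutable values asynchronously.
final class CounterService
        implements AutoCloseable
    {
    // the single conceptual thread of this "service"
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // mutable state; only ever touched by the executor's thread
    private int count;

    // invocation crosses the boundary as a task; the result that comes
    // back is an immutable Integer wrapped in a future
    CompletableFuture<Integer> hitAsync()
        {
        return CompletableFuture.supplyAsync(() -> ++count, executor);
        }

    @Override
    public void close()
        {
        executor.shutdown();
        }
    }
```

A caller would write something like svc.hitAsync().join() to wait for the result. No lock appears anywhere, because no second thread ever reads or writes count.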

Since a singleton is, by its nature, visible to all code running in an application, it therefore stands to reason that the singleton must be immutable -- so that it can be used by code running in any service -- or the singleton must itself be a service -- so that it can be invoked by code running in any other service.

And here is a straightforward example in Ecstasy:
static service PageCounter
    {
    Int count = 0;
    Int hit()
        {
        return ++count;
        }
    }
Using the singleton is equally simple:
PageCounter.hit();
(Since it is a singleton, the name of the class implies the singleton instance.)

The same example can be constructed in Java, but thread safety is the responsibility of the programmer:
public class PageCounter
    {
    public static final PageCounter INSTANCE = new PageCounter();

    private PageCounter() {}
    
    private int count;
    
    public synchronized void setCount(int count)
        {
        this.count = count;
        }
    
    public synchronized int getCount()
        {
        return count;
        }
    
    public synchronized int hit()
        {
        return ++count;
        }
    }

// how to call the singleton
PageCounter.INSTANCE.hit();
There are a variety of ways to implement the counter in Java in order to make it more concurrent; for example, an atomic integer class can be used, or an atomic updater on a volatile field can be used, and so on. In Ecstasy, on the other hand, the choice of how to make the counter more concurrent is left completely up to the run-time implementation. The choice to allow the run-time to optimize this facet of execution is based on what we learned from Java's own HotSpot JVM -- which is that only the run-time has enough information to know which parts of the application would actually benefit from optimization in the first place, and which optimizations would work best, based on the actual run-time profiling information!
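As one illustration of the "atomic integer" option mentioned above, here is a minimal sketch of the same Java counter built on java.util.concurrent.atomic.AtomicInteger. The class name AtomicPageCounter is made up for this example; it is one possible variation, not a recommendation over the synchronized version.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variation of the PageCounter singleton that replaces the
// synchronized methods with a lock-free atomic counter.
final class AtomicPageCounter
    {
    static final AtomicPageCounter INSTANCE = new AtomicPageCounter();

    private AtomicPageCounter() {}

    private final AtomicInteger count = new AtomicInteger();

    int getCount()
        {
        return count.get();
        }

    int hit()
        {
        // atomically adds one and returns the updated value
        return count.incrementAndGet();
        }
    }
```

The call site is unchanged in shape: AtomicPageCounter.INSTANCE.hit(). The point of the comparison stands, though: in Java the developer has to choose this strategy by hand.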

A few miscellaneous notes to wrap up this singular topic:
  • In Ecstasy, every module, package, and enum is a "static const" class, automatically. That means that modules and packages are all singleton objects, and every enum value is a singleton object.
  • Ecstasy does not require a memory model for explaining the order of reads and writes of mutable data among threads, because (as explained above) the Ecstasy design does not have mutable shared state among threads. (The Ecstasy design also uses services in lieu of explicit developer-managed threads, but that is a topic for another blog entry.)
  • There is no "global heap" in Ecstasy, so there is no "stop the world" garbage collection. Ecstasy is automatically garbage-collected, but each service can manage its own memory. The Ecstasy design effectively eliminates the "GC pause" problem, even for programs that use terabytes of RAM.

 

2020/01/23

Coming from Java, Part II

(This topic is large, so this entry is just the second installment. This is Part II; here is a link to Part I.)

In Object Oriented languages, objects represent the combination of related state and behavior. Java classes declare fields to hold state, and methods to provide behavior. Java fields and the methods are nested immediately within the class that contains them. This is an example of a common pattern for a Java class exposing state, stored in fields, via property accessors:
public class Person
    {
    public Person(String name, String phone)
        {
        setName(name);
        setPhone(phone);
        }
    
    private String name;
    private String phone;

    public String getName()
        {
        return name;
        }

    public void setName(String name)
        {
        assert name != null;
        this.name = name;
        }

    public String getPhone()
        {
        return phone;
        }

    public void setPhone(String phone)
        {
        this.phone = phone;
        }
    }
Ecstasy classes do not declare fields; instead, Ecstasy classes have properties that represent object state. A property is like an object, in that it can have its own nested state, and its own nested behavior. For example, to obtain the value of a property, one can invoke the get() method on the property. If the property is writable, then one can modify the value of the property by invoking the set() method on the property. (Of course, it is possible to use the simple dot notation for property access, which means that explicit calls to get() and set() are unnecessary.) Here is the above class, re-written in Ecstasy:
class Person(String name, String? phone);
You could also write it out in long-hand if you prefer; the following code compiles to the same exact result as the above code:
class Person
    {
    construct(String name, String? phone = Null)
        {
        this.name  = name;
        this.phone = phone;
        }
        
    String name;
    String? phone;
    }
It's possible in Java to make the "getter" and "setter" have different access, such as:
public String getName()
    {
    return name;
    }

private void setName(String name)
    {
    assert name != null;
    this.name = name;
    }
To accomplish this in Ecstasy, the equivalent is:
public/private String name;
The first access, "public", specifies that the property shows up in the public type as a Ref<String>; a Ref represents a read-only reference to a value. The second access, "private", specifies that the property shows up in the private type as a Var<String>; a Var represents both read and write access to the value.
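To make the Ref versus Var distinction a little more concrete for Java readers, here is a rough analogy. The Ref and Var interfaces below are hypothetical Java stand-ins written for this sketch, not the Ecstasy interfaces themselves, and the class name NamedPerson is likewise made up. Unlike Ecstasy, Java cannot stop a determined caller from down-casting the read-only view.

```java
// Hypothetical Java stand-in for a read-only reference
interface Ref<T>
    {
    T get();
    }

// Hypothetical Java stand-in for a read/write reference
interface Var<T>
        extends Ref<T>
    {
    void set(T value);
    }

final class NamedPerson
    {
    // simple holder that plays the role of the property
    private static final class Slot<T>
            implements Var<T>
        {
        private T value;

        Slot(T value)
            {
            this.value = value;
            }

        @Override
        public T get()
            {
            return value;
            }

        @Override
        public void set(T value)
            {
            this.value = value;
            }
        }

    // inside the class, the property is seen as a Var (read and write)
    private final Var<String> name;

    NamedPerson(String name)
        {
        this.name = new Slot<>(name);
        }

    // the outside world only sees a Ref (read-only)
    public Ref<String> name()
        {
        return name;
        }

    // writes happen only through code that belongs to the class
    void rename(String newName)
        {
        name.set(newName);
        }
    }
```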

Remember, though, that a property is like an object. Let's expand the Java example slightly, to validate that the name is not an empty String:
public void setName(String name)
    {
    assert name != null && name.length() > 0;
    this.name = name;
    }
In Ecstasy, the property contains a method called set(String) that we can override:
public/private String name.set(String name)
    {
    assert:arg name.size > 0;
    super(name);
    }
The above is just short-hand notation for:
public/private String name
    {
    @Override void set(String name)
        {
        assert:arg name.size > 0;
        super(name);
        }    
    }
There are a couple of important points here:
  • It's not the set() method that is private. The set() method is public, because it is part of the Var interface, as explained above.
  • Instead, the public Person type (known as Person:public or Person.PublicType) has a property that does not have a set() method, while the private Person type (known as Person:private or Person.PrivateType) has a property that does have a set() method.
  • In Java, "super" refers to the super-class. In Ecstasy, super is a reference to the function (like a function pointer) that is next in line to invoke in the virtual method's invocation chain. In other words, super is a function.
  • While it's not directly related, you can read more about the various specializations of the assert statement on this blog. The assert:arg statement produces an IllegalArgument exception if the assertion fails.
The interesting thing, though, is that we never have to deal with the field. We know it's there, because it has to be in order to hold the value, but the field doesn't have a name, we don't access it, and we don't modify it. Instead, we just call the super function for get() or set(), and at the end of that chain there is some implementation of the method (that we didn't have to write!) that accesses or stores the value for us using the field.

But what if we made it so that we could never reach the end of those method chains?
public/private String name
    {
    String get()
        {
        return "Bob";
        }

    void set(String name)
        {
        assert:arg name.size > 0;
        // do nothing with the name ... do not store it!
        }
    }
In this case, there would be no field for the name property, because it's obvious to the compiler that one is not needed!

So now it should be obvious that fields exist in Ecstasy, but that we never really have to mess around with them. Where are those fields actually held, though?

In one of the examples above, we talked about how the Person class has a public type and a private type, so you probably already guessed that the Person class also has a protected type, and you would be correct!

But the Person class has one more type: the Person:struct type. The Person:struct type has one property for each property of the Person class that needs a field. We call each property on the struct type a "field"; in Ecstasy, a field is just a property on a class' struct type.

The struct type is not user-definable. The struct type is automatically calculated by the compiler at compile-time, and by the linker/loader at run-time. While the public, protected, and private types all refer to the same underlying object -- as if they were three different lenses through which you can view the same object -- the struct, on the other hand, is a separate object that is an implementation of the Struct interface.

For the purpose of this article, this is already way too much low-level information about structs, but the details are important for one reason: To understand constructors.

Constructors are weird. They live in a zone between non-existence and existence. They play by some extraordinary rules in Java, and the same is true in Ecstasy, because they fit into a zone of unknowns. Here's what a constructor looks like in Java, just to pick one at random from our own prototype compiler that was written in Java:
public StringConstant(ConstantPool pool, String sVal)
    {
    super(pool);

    assert sVal != null;
    m_sVal = sVal;
    }
First, in Java, a constructor must call either a different constructor on this class, or a constructor on the super class. Then, it is free to do other stuff, like checking parameters and initializing fields. Fields that aren't explicitly initialized are all set to their defaults, which is easy when null is a sub-class of everything.

Ecstasy is different. Not necessarily simpler or more complicated. Not necessarily better or worse. But it is different for very purposeful reasons:
  • Construction is treated as a finite state automaton. Eliminating unknowns and improving predictability of execution is extremely important, and that is exactly what a finite state automaton does.
  • There is a period of time before the object is constructed. The developer gets complete control over that process.
  • There is a period of time after the object is constructed. The developer gets complete control over that process.
  • In between the before and the after, the developer is completely absent, and completely out of the picture for the moment of creation. During that moment, all the rules of object instantiation can be verified, and the object is created. We say that "the this becomes existent".
In that period before the object creation, there are two phases that the developer can implement:
  1. The construct(...) function(s) allows the developer to specify what information is needed to initialize the state of the object, and the developer can validate that information and initialize the structure of the object, which is the aforementioned struct.
  2. The assert() function allows the developer to collect, in one place, any assertions (or any other last-second work) that must occur before the object is created.
Here is an example of a constructor, from the Date class, which simply delegates to another constructor using the construct keyword:
construct (Int year, Int month, Int day)
    {
    construct Date(calcEpochOffset(year, month, day));
    }
For both construct(...) and assert(), the this variable is the struct -- not the object, because it has not yet been created! After that code has all completed successfully, the struct is checked by the system to make sure that all necessary fields have been assigned a value, and then the object is instantiated based on the struct. Then -- after the moment of creation, and before the newly created "this" reference is returned to the code that invoked the new operator -- one more step occurs: The corresponding finally(...) function for each previously invoked construct(...) function is executed, so that the object itself gets to see itself (and finish anything that it needs to) before being returned to the code that requested it.

Here's an example from the Array class:
protected construct(ArrayDelegate<Element> delegate)
    {
    this.delegate = delegate;
    }
finally
    {
    if (mutability == Constant)
        {
        makeImmutable();
        }
    }
In this example, the construct(...) function fills in the fields of the struct, but because the Array object does not yet exist at this point, the construct(...) function cannot call the makeImmutable() method on the Array -- until the Array actually exists! And that is the purpose of the finally function -- to allow the new Array object to perform behavior that must occur as if it were part of the instantiation of the object, before that object is returned to the code that requested the object to be created.

A class can define an assert() function (with no parameters) as well. Regardless of whether any particular construct(...) function is invoked by the new operator, or by a sub-class -- and note that a sub-class is not required to invoke any construct(...) function on its super-class! -- the assert() function will be invoked before the object is created.
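The phases described above (construct, assert, the moment of creation, and finally) can be imitated in Java with a factory method, which may help in picturing the order of events. This is only a sketch of the sequence, written under the assumption that a plain data holder can stand in for the struct; the names Lifecycle, WidgetStruct, Widget, and create are all made up for the illustration, and none of this reflects how Ecstasy itself is implemented.

```java
final class Lifecycle
    {
    // Stand-in for the struct: plain state, no behavior.
    static final class WidgetStruct
        {
        String  label;
        boolean constant;
        }

    static final class Widget
        {
        private final String  label;
        private final boolean constant;
        private boolean       immutable;

        // Only the factory below may turn a validated struct into an object.
        private Widget(WidgetStruct struct)
            {
            this.label    = struct.label;
            this.constant = struct.constant;
            }

        String getLabel()
            {
            return label;
            }

        boolean isImmutable()
            {
            return immutable;
            }

        void makeImmutable()
            {
            immutable = true;
            }
        }

    static Widget create(String label, boolean constant)
        {
        // "construct" phase: only the struct exists, so only fields are filled in
        WidgetStruct struct = new WidgetStruct();
        struct.label    = label;
        struct.constant = constant;

        // "assert" phase: last-second checks, gathered in one place
        if (struct.label == null || struct.label.isEmpty())
            {
            throw new IllegalArgumentException("label is required");
            }

        // the moment of creation: the object comes into existence from the struct
        Widget widget = new Widget(struct);

        // "finally" phase: the object now exists, so its methods can be called
        if (widget.constant)
            {
            widget.makeImmutable();
            }
        return widget;
        }
    }
```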

There are many details regarding the specific order of execution, handling of exceptions, and so on, but this post hopefully has given you a glimpse into how object structure works in Ecstasy, and how Ecstasy objects are created.

(Continue to Part III.)

2020/01/22

Coming from Java

The first question that we get from new developers working on Ecstasy is how it is similar to, and how it is different from, the languages that they already know and use. One of the goals of Ecstasy was to make the language instantly accessible to programmers who already were comfortable with any of the C family of languages, such as C++, Java, and C#. We'll start by looking at one such language, Java, which is one of the most widely used languages today.

(This topic is large, so this entry is just the first installment.)

Here's the pocket translation guide from Java to Ecstasy with respect to the type system:
  • Java's type system is a combination of primitive (machine) types and class-based types, with a few "hybrid" types, such as arrays, that fit neither category. Ecstasy's type system is simply class-based; there are no primitive types.
  • Java's null type has one value, null, that is assignment compatible with any reference type. The Ecstasy Nullable enumeration defines the value Null, although the lower-case null is also supported by alias. In Ecstasy, the Null enum value is only assignable to a Nullable type, or a super-type thereof such as Object.
  • Java's boolean type has two values, true and false. The Ecstasy Boolean enumeration defines the values False and True, although the lower-case false and true are also supported by alias.
  • Java's char type is a 2-byte unsigned integer that represents a common sub-set of Unicode characters. Ecstasy's Char class represents any Unicode code-point.
  • Java's int type is a 32-bit unchecked signed integer value; Java additionally has byte, short and long types for 8-bit, 16-bit, and 64-bit unchecked signed integer values. Ecstasy provides both checked and unchecked, and both signed and unsigned implementations for 8-bit, 16-bit, 32-bit, 64-bit, 128-bit, and variable-length integers (conceptually similar to Java's BigInteger class). For example, UInt32 is a checked unsigned 32-bit integer, and @Unchecked Int128 is an unchecked signed 128-bit integer. Additionally, the alias Int maps to the 64-bit signed integer, Int64, and the alias Byte maps to the 8-bit unsigned integer, UInt8. (In Java, the byte type is signed.)
  • Java also has some proprietary support for decimal values via the BigDecimal class. Ecstasy provides standard 32-bit, 64-bit, 128-bit, and variable-length decimal value support via the Dec32, Dec64, Dec128, and VarDec classes; these are implementations of the IEEE 754-2008 standard for decimal floating point.
  • Java's float and double represent 32-bit and 64-bit IEEE 754 binary floating point values. Ecstasy provides standard 16-bit, 32-bit, 64-bit, 128-bit, and variable-length IEEE 754 binary floating point values via the Float16, Float32, Float64, Float128, and VarFloat classes. Additionally, Ecstasy provides the ML- and AI-optimized "brain float 16" type, via the BFloat16 class.
  • Java's primitive type system is based on a 32-bit word size. Ecstasy does not have a primitive type system, and thus does not have a "word size", but in practice, Ecstasy defaults to using 64-bit integer, decimal, and binary floating point values.
  • In Java, the value "0" is an int. The compiler converts it, if necessary, to other types. In Ecstasy, the value "0" is an IntLiteral, which has the ability (both at compile-time and run-time) to convert to any numeric type. Unlike Java, there is no need for an "l"/"L" suffix on integers to inform the compiler that a value is a long.
  • In Java, the value "0.0" is a double. In Ecstasy, the value "0.0" is an FPLiteral, which has the ability (both at compile-time and run-time) to convert to any decimal or binary floating point type. Unlike Java, there is no need for an "f"/"F" or "d"/"D" suffix to inform the compiler that a value is a 32-bit or 64-bit value.
  • Java supports the class, enum, and interface keywords for declaring classes. Ecstasy supports these three keywords, plus: module, package, service, mixin, and typedef.
  • Classes such as Int64, Float64, Dec64, Char, and String that are used to hold constant values are implemented in Ecstasy using the const keyword instead of the class keyword. Instances of a const class are automatically made immutable as part of their construction; specifically, no reference to an object of a const class becomes visible until after the object is made immutable.
  • Ecstasy classes such as Nullable and Boolean are enumerations; enumerations are abstract classes that contain enum values, such as False and True. Enum values are singleton const classes.
  • Ecstasy module and package classes are singleton const classes, and are written like any other classes would be. Declarative modularity was recently introduced into Java via Project Jigsaw, with some similar goals. You can read more about Ecstasy modules on this blog.
  • Java does not have any language capabilities similar to a service, a mixin, or a typedef in Ecstasy. A service class provides a boundary for concurrent and/or asynchronous behavior, so it can be thought of in the same manner as a Java thread; however, an Ecstasy application may have millions of service objects, while it is unlikely that so many threads would be desirable in any language. An Ecstasy mixin provides cross-cutting functionality; in Java, some combination of boilerplate, delegation, and cut & paste would be used instead. An Ecstasy typedef is a means to provide a name to a type that itself can be expressed using the type algebra of the Ecstasy language. You can read more about class composition on this blog.
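The integer bullet in the list above contrasts checked with unchecked arithmetic, and signed with unsigned bytes. On the Java side, that difference can be seen with nothing but the standard library. The class and method names below are made up for this sketch; Math.addExact and Byte.toUnsignedInt are standard methods available since Java 8.

```java
final class IntegerBehavior
    {
    // Java's built-in int arithmetic is unchecked: overflow silently wraps.
    static int uncheckedAdd(int a, int b)
        {
        return a + b;
        }

    // Checked arithmetic is opt-in in Java; Math.addExact throws
    // ArithmeticException when the result does not fit in an int.
    static int checkedAdd(int a, int b)
        {
        return Math.addExact(a, b);
        }

    // Java's byte is signed (-128..127); this recovers the unsigned
    // interpretation (0..255) of the same eight bits.
    static int unsignedValue(byte b)
        {
        return Byte.toUnsignedInt(b);
        }
    }
```

For example, uncheckedAdd(Integer.MAX_VALUE, 1) yields Integer.MIN_VALUE, while checkedAdd with the same arguments throws an ArithmeticException.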
To put this into practice, consider this Java example:
package com.mycompany.myproduct.gui;

class Point
        implements Comparable<Point>
    {
    public Point(int x, int y)
        {
        this.x = x;
        this.y = y;
        }

    private final int x;
    private final int y;

    public int getX()
        {
        return x;
        }

    public int getY()
        {
        return y;
        }

    @Override
    public int hashCode()
        {
        return x ^ y;
        }

    @Override
    public boolean equals(Object obj)
        {
        if (obj instanceof Point)
            {
            Point that = (Point) obj;
            return this.x == that.x && this.y == that.y;
            }

        return false;
        }

    @Override
    public String toString()
        {
        return "Point{x=" + x + ", y=" + y + "}";
        }

    @Override
    public int compareTo(Point that)
        {
        // Integer.compare avoids the overflow that subtraction can cause
        int n = Integer.compare(this.x, that.x);
        if (n == 0)
            {
            n = Integer.compare(this.y, that.y);
            }
        return n;
        }
    }
And here is the corresponding Ecstasy code:
const Point(Int x, Int y);
This particular example is dramatic, because the const class declaration in Ecstasy implies automatic implementations of the Comparable, Hashable, Orderable, and Stringable interfaces. Furthermore, the parameters specified at the class level declare two properties, and a constructor.

Local variable declarations are similar, but the use of the comma as a general purpose separator (as in C) is not permitted. For example, in Java:
int a=0, b=0, c=0;
In Ecstasy, these would likely become separate declarations:
Int a=0;
Int b=0;
Int c=0;
It is also possible (and occasionally necessary) to declare and initialize multiple left-hand-side variables ("L-values"); the above example could be written as:
(Int a, Int b, Int c) = (0, 0, 0);
Note that the left-hand-side is in the form of a tuple, and the right-hand-side has a corresponding tuple type. In this form, the type of each left-hand-side variable can differ, and a type is only specified when declaring a variable. For example, if a function "foo()" exists that returns both an Int and a String, then the above-defined variable "c" and a new String variable can be assigned as follows:
(c, String d) = foo();
This introduces a dramatic difference in Ecstasy: Methods and functions can return more than one value, and those return values can be treated either as individual values, or as a tuple of values.
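For comparison, Java has no direct way to return two values, so the usual workaround is to wrap them in a holder object. The sketch below uses the standard Map.Entry type as a makeshift pair; the class name MultiReturn and the values returned by this stand-in foo() are made up for the illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

final class MultiReturn
    {
    // Hypothetical Java counterpart of a foo() that returns an int and a
    // String: both values have to be packed into a single holder object.
    static Map.Entry<Integer, String> foo()
        {
        return new SimpleImmutableEntry<>(42, "answer");
        }

    // The caller then unpacks the holder by hand.
    static String describe()
        {
        Map.Entry<Integer, String> result = foo();
        int    c = result.getKey();
        String d = result.getValue();
        return d + "=" + c;
        }
    }
```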

Furthermore, method and function parameters can also be provided either as individual values, or as a tuple of values, or as named values. Consider this example in Java that uses multiple delegating constructors:
class ErrorList
    {
    public ErrorList(int maxErrors)
        {
        this(maxErrors, false);
        }

    public ErrorList(boolean abortOnError)
        {
        this(0, abortOnError);
        }

    public ErrorList(int max, boolean abortOnError)
        {
        this.max = max;
        this.abortOnError = abortOnError;
        // ...        
        }

    private int max;
    private boolean abortOnError;
    }
Using default parameter values, the Ecstasy equivalent of this example would not need all of those redundant constructors, each with slightly different signatures:
class ErrorList(Int max=0, Boolean abortOnError=False)
    {
    // ...    
    }
And the class could then be constructed using any combination of named parameters, as in the following example:
ErrorList errs = new ErrorList(abortOnError=True);
For the most part, though, the Ecstasy syntax is designed to maintain a high level of compatibility with Java (and C#) syntax. One area in which the syntax differs is with respect to type assertions and type tests. In Java, the type test uses the relational operator "instanceof", and the type assertion uses the C-style cast syntax, which often requires two sets of parentheses, as in this Java code:
if (x instanceof List)
    {
    ((List) x).add(item);
    }
Ecstasy simplifies this syntax dramatically by employing the dot notation that is already so naturally used for property access and method invocation. The "is" keyword replaces "instanceof", and the "as" keyword replaces the awkward use of parentheses for the type assertion (aka "type casting"):
if (x.is(List))
    {
    x.as(List).add(item);
    }
This approach is far easier to read, because it follows left-to-right, with no precedence concerns. Furthermore, if the compiler determines that the value "x" is not subject to concurrent modification, then type inference obviates the need for the type-assertion altogether:
if (x.is(List))
    {
    x.add(item);
    }
Operator precedence also differs slightly from Java, in order to simplify more operators into left-to-right ordering and to resolve a number of cases in which parentheses were awkwardly required in Java:
Operator        Description             Level   Associativity       
--------------  ----------------------  -----   -------------       
&               reference-of              1                         
                                                                    
++              post-increment            2     left to right       
--              post-decrement                                      
()              invoke a method                                     
[]              access array element                                
?               conditional                                         
.               access object member                                
.new            postfix object creation                             
.as             postfix type assertion                              
.is             postfix type comparison                             
                                                                    
++              pre-increment             3     right to left       
--              pre-decrement                                       
+               unary plus                                          
-               unary minus                                         
!               logical NOT                                         
~               bitwise NOT                                         
                                                                    
?:              conditional elvis         4     right to left       
                                                                    
*               multiplicative            5     left to right       
/                                                                   
%               (modulo)                                            
/%              (divide with remainder)                             
                                                                    
+               additive                  6     left to right       
-                                                                   
                                                                    
<< >>           bitwise                   7     left to right       
>>>                                                                 
&                                                                   
^                                                                   
|                                                                   
                                                                    
..              range/interval            8     left to right       
                                                                    
<  <=           relational                9     left to right       
>  >=                                                               
<=>             order ("star-trek")                                 
                                                                    
==              equality                 10     left to right       
!=                                                                  
                                                                    
&&              conditional AND          11     left to right       
                                                                    
^^              conditional XOR          12     left to right       
||              conditional OR                                      
                                                                    
? :             conditional ternary      13     right to left       
                                                                    
:               conditional ELSE         14     right to left       
As you can see, a number of operators are grouped together, which previously each had their own precedence level; this implicitly employs left-to-right precedence for all operators within that grouping. Bitwise operators also have been moved to a significantly higher precedence level, which reduces the need for unnecessarily awkward parenthesization. Additionally, almost all operators map directly to methods, which means that explicit left-to-right behavior can be achieved by replacing a relational operator with the corresponding method invocation.
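A typical case of the awkward parenthesization in Java is a bit-mask test. In Java, the equality operators bind more tightly than the bitwise operators, so the parentheses in the sketch below are required. Going by the table above, where the bitwise operators sit at level 7 and equality at level 10, the same test in Ecstasy should not need them. The class and constant names are made up for the example.

```java
final class PrecedenceDemo
    {
    static final int READ  = 0x1;
    static final int WRITE = 0x2;

    static boolean canWrite(int flags)
        {
        // Without the parentheses, Java would parse this as
        // flags & (WRITE == WRITE), which does not compile
        // because an int cannot be combined with a boolean using &.
        return (flags & WRITE) == WRITE;
        }
    }
```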

(Continue to Part II.)