wiki:Tutorial/Eval/Expressions

Version 72 (modified by benl, 6 years ago) (diff)

--

Expressions and the Evaluator

This section covers how to use the ddci-core (Disciplined Disciplined Compiler Interactive) shell, and introduces the basic features of the language.

For a more complete language definition see the Disciple Core Language Definition.

Previous Section: Installation?

Evaluation

Start the interpreter.

$ ddci-core
DDCi-core, version 0.3.1: http://disciple.ouroborus.net.
Type :help for help.
>

The interpreter evaluates expressions entered at the prompt.

> 42 [R0#] ()
42
> addInt [R0#] [R1#] [R3#] (2 [R0#] ()) (3 [R1#] ())
5

Type arguments are provided in square braces []. The constructors R0# and R1# are region handles that indicate what region of the store the integer objects should be allocated into. Region handles are types, so they are provided in square brackets.


Typing

Use the :type command to ask for the type of an expression:

> :type 42
42 :: [^ : %].Unit -(Alloc ^0 | $0)> Int ^0

The integer 42 is a data constructor. It takes a region handle, the unit constructor () and allocates an integer object into the store. The [^ : %] is a universal quantifier (forall), which says that the constructor can an allocate its object into any region. The % is the kind symbol for region handles.

The ^ is a binder for a deBruijn index, and the ^0 variables reference this binder. Builtin types typically use deBruijn indices for variables, but user programs can use regular named binders as well.

The function constructor is written with an effect term and a closure term. In this case the effect is Alloc ^0 and the closure is empty, which is written $0.

Let's ask for the type of of addInt

:type addInt
addInt :: [^ ^ ^ : %].Int ^2 -> Int ^1 -(Read ^1 + Read ^2 + Alloc ^0 | Use ^2)> Int ^0

The addInt function takes three region handles and two integers. It reads the integer objects in the first two regions and allocates its result into the third. Note that deBruijn indices are stack based, so Alloc ^0 refers to the inner most binder (the right most ^ in the [^^^:%].) quantifier. The closure term Use ^2 reveals that if we partially apply just the first integer argument, then this is captured in the closure of the resulting function object.

Use the :check command to ask for the effect and closure of an expression as well:

> :check 42 [R0#] ()
42 [R0#] ()
 :*: Int R0#
 :!: Alloc R0#
 :$: Use R0#

This says that 42 [R0#] () returns an object of type Int R0# , it allocates into region R0# during evaluation, and region R0# is reachable (used) by the resulting object.

Exercise: what is the type, effect and closure of the addInt expression from the previous section when we partially apply it to each of its type and value arguments? Why is this?


Kinding

Use the :kind command to ask for the kind of a type constructor:

> :kind Int
Int :: % ~> *

The Int type constructor takes a region handle (%) and produces a value type (*).

> :kind R0#
R0# :: %

The R0# type constructor is a region handle.

> :kind (->)
(->) :: * ~> ! ~> $ ~> * ~> *

The function constructor takes a value type, an effect type, a closure type, another value type, and produces a value type.

> :kind Read
Read :: % ~> !

The Read type constructor takes a region handle and produces an effect type.


Tracing evaluation

To see how an expression is being evaluated, enable tracing using the :set command.

:set +TraceEval +TraceStore
ok
> addInt [R0#] [R1#] [R2#] (2 [R0#] ()) (3 [R1#] ())
...
* STEP: L3#
* STORE
  NextLoc: 4
  NextRgn: 3
  Regions: {R0#, R1#, R2#}
  Global:  {R0#, R1#, R2#}

  Binds:
   L1#:R0# -> OBJ 2
      :: Int R0#
   L2#:R1# -> OBJ 3
      :: Int R1#
   L3#:R2# -> OBJ 5
      :: Int R2#

The final state of the program consists of the store location L3# and the above store. The line L3#:R2# -> OBJ 5 reveals that the result of the addition operation has been allocated into region R2#, which is what we asked for.

Disable tracing before moving onto the next section.

:set /TraceEval /TraceStore


Let-expressions and type application sugar

Here is a nicer way to add two numbers:

> :eval..
let x = 2 [R0#] () in
let y = 3 [R0#] () in
addInt [:R0# R0# R0#:] x y;;

5

This time we've split the expression over a few lines to make it easier to write. All ddci-core commands can be split over multiple lines by appending .. to the name of the command (as in :eval..) and terminating with double semicolon (;;). You can cut-and-paste the above example from the : to the ;; into the interpreter console.

We've also used type application sugar to reduce the number of brackets in the code. The syntax addInt [:R0# R0# R0#:] is equivalent to addInt [R0#] [R0#] [R0#].


Recursion, type abstraction and function bindings

Here is the factorial function using an accumulating parameter. The right of a recursive binding must always be a lambda abstraction, as we do not support value recursion yet. Recursive bindings must also be given explicit type signatures, as ddci-core does not infer their types.

> :eval..
letrec {
 fac : [r : %]. Int r -(!0 | Use r)> Int r -(Read r + Alloc r | Use r)> Int r
  = /\(r : %). \(acc n : Int r)
  . case n of {
        0     -> acc;
        1     -> acc;
        _     -> fac [r] (mulInt [:r r r:] acc n)
                         (subInt [:r r r:] n (1 [r] ()))
  }
} in fac [R0#] (1 [R0#] ()) (10 [R0#] ());;

3628800

In the code, the /\(r : %). is a type lambda which binds the region handle, and \(acc n : Int r). is a value lambda. We can also write this function using function binding sugar, to reduce the amount of duplicated type information.

:eval..
letrec {
 fac    [r:%] 
        (acc : Int r) {!0 | Use r}
        (n   : Int r) {Read r + Alloc r | Use r} : Int r
  = case n of {
        0       -> acc;
        1       -> acc;
        _       -> fac [r] (mulInt [:r r r:] acc n)
                           (subInt [:r r r:] n (1 [r] ()))
    }
} in fac [R0#] (1 [R0#] ()) (10 [R0#] ());;

3628800

With this sugar the [r:%] binds the region handle, and we provide the effect and closure of each function using { .. | .. } .


Local regions, witnesses and destructive update

Let's write factorial again, but this time using destructive update of a local accumulator.

At this stage the programs are getting too big to copy and paste, so there is a direct link to the code here: Factorial.dcx. The .dcx extension stands for Disciple Core eXchange. It's just a list of interpeter commands.

You can run ddci-core directly on the source file:

$ ddci-core Factorial.dcx
...

Here is the code and its output:

:eval..
let fac  [r1:%] 
         (n   : Int r1) {Read r1 + Alloc r1 | Use r1} : Int r1
     = letregion r2 with { w : Mutable r2 } in
       let acc = 1 [r2] () in
       let one = 1 [r2] () in
       letrec { 
        loop (i : Int r1) { Read r1 + Read r2 + Write r2 + Alloc r1 + Alloc r2
                          | Use  r1 + Use r2 } : Int r1
         = case i of {
            0    -> copyInt [:r2 r1:] acc;
            1    -> copyInt [:r2 r1:] acc;
            _    -> let _ = updateInt [:r2 r2:] <w> acc 
                                    (mulInt [:r2 r1 r2:] acc i)
                    in  loop (subInt [:r1 r2 r1:] i one);
         }
       } in loop n
in fac [R0#] (10 [R0#] ());;

3628800

This function stores the accumulator state in a local region (r2). The region is created when the letregion block is entered, and deallocated automatically at the end, along with all objects in it. Because region r2 is local to the function, we have copied the result out into region r1 before returning.

In addition to the local region, we also need some pre-existing regions to hold the argument and return value of the function. On startup, the interpreter creates an empty region for every pre-existing region handle in the program. In this case we've used R0#, so we get a region for it automatically. At runtime, when the local letregion block is entered a new region will be created, along with its handle, and the handle will be substituted for r2.

In Disciple Core, regions can have extended properties. These are set when the region is created, and they cannot be changed after creation. In the above example we've set r2 to be mutable so that we can destructively update objects contained within it. We achieve this by binding the variable w to a witness that this property is true. You can imagine the witness is like a file handle with write permission set. When we create the region we get such a file handle, and we need to pass it to every function that wants to perform a write.

In this case, updateInt needs write permission to region r2, so we pass it our witness. Witness arguments are provided in angle brackets <>, similarly to how type arguments are provided in square brackets [].

We can check that updateInt really needs this witness by looking at its type:

:type updateInt
updateInt :: [^ ^ : %].Mutable ^1 => (Int ^1 -> Int ^0 -(Read ^0 + Write ^1 | Use ^1)> Unit)

This says updateInt takes two region handles, a witness to mutability and two integers. It reads an object in the first region, and writes to an object in the second. If we partially apply it to the first integer then this is captured (used) in the closure of the resulting function.

Exercise: Verify that region r2 is deallocated before fac returns by using the TraceEval and TraceStore commands from before.

Exercise: Refactor fac so that the argument and return values are allocated into different regions.

Exercise: Refactor fac again so that when the program terminates, the regions holding the argument and return values contain only those values.

Exercise: (harder) Refactor fac again to eliminate the calls to copyInt from the case alternatives. You will probably still need a single call to copyInt elsewhere in the final program. Why is this?


Local objects cannot escape

In the previous section we saw that a local region introduced by letregion is deallocated when we leave the scope of that block. Of course, it would be bad if we could pass out a reference to an object in that region, as the reference would become invalid when the region was deallocated.

The type system ensures that this cannot happen. The rule is that a region variable introduced by a letregion cannot be free in the type of its body. For example, this doesn't work:

> letregion r1 in 5 [r1] ()
Region variable escapes scope of letregion.
       The region variable: r1 : %
  is free in the body type: Int r1

Nor does this:

> :eval..
letregion r1 in 
let z = 5 [r1] () 
in \(x : Unit). addInt [:r1 r1 R0#:] z (1 [r1] ());;

Region variable escapes scope of letregion.
       The region variable: r1 : %
  is free in the body type: 
       Unit -(Read r1 + Alloc r1 + Alloc R0# | Use r1 + Use R0#)> Int R0#

In this second example, an object in region r1 is captured in the closure of the returned function. If r1 was deallocated, and we tried to apply the function, then z would be invalid. However, as the region variable is visible in the effect and closure type of the function, we can see that it uses this region.

Here is a more subtle example:

:eval..
letregion r1 in
let z  = 5 [r1] () in
\(x : Unit).
 let xs = Cons [R0#] [Int r1] z (Nil [R0#] [Int r1] ()) in
 case xs of {
        Nil      -> 1 [R0#] ();
        Cons _ _ -> 2 [R0#] ();
 };;

Region variable escapes scope of letregion.
       The region variable: r1 : %
  is free in the body type: 
       Unit -(Read R0# + Alloc R0# | Use r1 + Use R0#)> Int R0#

This time the returned function does not read or allocate into r1, so the fact that it references an object in this region is not visible in its effect type. However, the body of the function still holds a reference to this object, and this is revealed by the Use r1 component of its closure type.


Laziness and purification

NOTE: As of DDC 0.3.2 we no longer support the lazy keyword on let-bindings. Support for lazy evaluation is currently being re-factored to use a primitive operator instead of a special keyword.

The following program builds a pair (Pr) of two integers. and then extracts just the second one.

:eval..
let foo [r1 r2 : %] (_ : Unit)
     = let x = 5 [r1] () in
       let y = letregion r3 in
               addInt [:r1 r3 r2:] x (1 [r3] ()) in
       Pr [r1] [Int r1] [Int r2] x y

in  letregion r4 in
    letregion r5 in
    case foo [:r4 r5:] () of {
            Pr _ y -> copyInt [:r5 R0#:] y
    };;

Let's suspend the computation of y with lazy evaluation. This will cause the second component of the tuple to be evaluated only when it is finally demanded by the copyInt function. As Disciple Core supports both destructive update and region (de)allocation, there are two complications when suspending evaluation:

  1. If the suspended computation will read some objects, but the objects are updated before the computation is forced, then the computation may return a different result than if they were updated after it is forced.
  1. If a suspended computation will read, write or allocate into a region, but the region is deallocated before the computation is forced, then all operations on this region will be invalid.

To guard against these problems, the type checker requires that all suspended computations satisfy the following constraints:

  1. They must be pure, meaning that they cannot have visible read, write or allocation effects. This means they can be forced at any time, and we'll still get the same result.
  1. They must be empty, meaning that they cannot have any region variables in their closure. This ensures that they cannot reference objects in regions that might be deallocated before the computation is forced.

We satisfy these constraints as follows:

  1. We purify the effect of the suspended computation. This involves supplying enough information to guarantee that all regions being read or allocated into are constant.
  1. We forget the closure of the suspended computation. This involves supplying enough information to guarantee that all regions being read or allocated into are global. Global regions are also created with letregion, but they are not deallocated in a stack-like manner. Instead, we rely on the garbage collector to reclaim objects in these regions.

Here is the corresponding code. Keep in mind that all of the extra information will be inferred by the source type inferencer! Disciple Core is intended as an intermediate representation for a compiler, and we are only writing these programs directly to learn how it works.

:eval..
let foo [r1 : %] (wc1 : Const r1) (wg1 : Global r1) 
        [r2 : %] (wc2 : Const r2) (wg2 : Global r2) (wl2 : Lazy  r2)
        (_ : Unit)
     = let x = 5 [r1] () in
       let y : Int r2 lazy <wl2> 
             = purify <read [r1] wc1 & alloc [r1] wc1 & alloc [r2] wc2> in
               forget <use  [r1] wg1 & use   [r2] wg2> in
               letregion r3 in 
               addInt [:r1 r3 r2:] x (1 [r3] ()) in
       Pr [r1] [Int r1] [Int r2] x y

in  letregion r4 with { wc4 : Const r4; wg4 : Global r4 } in
    letregion r5 with { wc5 : Const r5; wg5 : Global r5; wl5 : Lazy  r5 } in
    case foo [r4] <:wc4 wg4:> [r5] <:wc5 wg5 wl5:> () of {
            Pr _ y -> copyInt [:r5 R0#:] y
    };;

The purify cast takes a witness of type Pure e and masks effect e from the expression in the body. Likewise, the forget cast takes a witness of type Empty c and masks the closure c from the expression in the body. As we are suspending computation of the y binding, we need to mask its entire effect and closure.

We convert the witnesses of constancy passed into foo into witnesses of purity and emptiness with the read, alloc and use witness constructors. Use the :wtype command to ask for the types of these witness constructors:

> :wtype read
read :: [^ : %].Const ^0 => Pure (Read ^0)

> :wtype alloc
alloc :: [^ : %].Const ^0 => Pure (Alloc ^0)

> :wtype use
use :: [^ : %].Global ^0 => Empty (Use ^0)

The & operator (pronounced join), combines two witnesses into a larger one. For example, if w1 : Pure (Read r1) and w2 : Pure (Read r2) then w1 & w2 : Pure (Read r1 + Read r2).

Finally, the witness of type Lazy r2 allows us to allocate thunks into region r2. This indicates that inspecting objects in this region may diverge. For example, an object of type Int r2 could actually be a thunk that enters an infinite loop when it is forced. Alternatively, if we wish to ensure that objects in some region cannot be thunks, then we can apply the opposing constraint Manifest r2 when creating the region.

Exercise: Try to violate the constraints that Disciple Core places on suspended computations. What is preventing us from destructively updating the first component of the pair after foo has returned? Why can't the object bound to x be deallocated before the computation is forced?


Next Section: Program Transforms?