Heist

github.com/jcoglan/heist

Heist is a Scheme interpreter written in Ruby. It provides an executable for running Scheme code directly, and also allows the interpreter to be easily embedded in, and extended using, Ruby applications. It nominally targets R5RS, though is likely to be more liberal than other implementations.

What is Scheme?

Scheme is a dialect of Lisp, one of the oldest programming languages still in use today. Lisp was the first language to implement conditionals, first class functions (lambdas), closures and recursion. Some Lisp features, such as closures and macros, have yet to penetrate mainstream languages and it has been observed that mainstream languages are slowly converging on the feature set of Lisp.

Scheme in particular is important for its influence on current languages, in particular JavaScript and Ruby (JavaScript is remarkably similar to Scheme if you ignore the syntactic differences). It is also used in the much-praised book “Structure and Interpretation of Computer Programs”, wherein it is used to explore some of the theory surrounding interpreters.

It is a functional language, in that it supports first-class functions and many constructs that would be syntax in other languages resemble function calls in Scheme. However it is not purely functional as it allows side effects, and is more procedural in nature than Haskell and its ilk. Like other Lisps, its lack of predefined syntax and its support for macros (functions that rewrite source code) make it ideal for creating new syntactic forms and embedded languages.

Features

Heist nominally targets R5RS (www.schemers.org/Documents/Standards/R5RS/HTML). It has good support for many Scheme runtime features such as tail call optimisation, macros and first-class continuations, as well as reasonably complete libraries for all the Scheme data types.

Currently implemented R5RS features include:

  • Boolean literals: #t and #f

  • Numeric literals for integers, reals, rationals and complexes. The latter two are mapped to Ruby’s Rational and Complex objects and Ruby handles all math procedures.

  • Character, string and symbol literals

  • Proper and improper lists and full (quasi)quoting with ', `, ,, ,@

  • Vector literals, also with quasiquoting support

  • Definition and assignment: define, set!

  • Lambdas: (lambda (x y) body ...), (lambda args body ...), (lambda (x y . rest) body ...)

  • Conditionals: if, cond, case

  • Binding constructs: let, let*, letrec, begin

  • Iteration: named let, do

  • Quoting: quote, quasiquote and shorthands ', `, ,, ,@

  • Macros: let-syntax, letrec-syntax, define-syntax, syntax-rules

  • Delayed evaluation: delay and force

  • Continuations: call-with-current-continuation, aliased as call/cc

  • Equivalance predicates: eqv?, eq?, equal?

  • Numeric library: number?, complex?, real?, rational?, integer?, exact?, inexact?, =, <, >, <=, >=, zero?, positive?, negative?, odd?, even?, max, min, +, *, -, /, abs, quotient, remainder, modulo, gcd, lcm, numerator, denominator, floor, ceiling, truncate, round, rationalize, exp, log, sin, cos, tan, asin, acos, atan, sqrt, expt, make-rectangular, make-polar, real-part, imag-part, magnitude, angle, number->string, string->number

  • Boolean library: and, or, not, boolean?

  • List library: pair?, cons, car, cdr, set-car!, set-cdr!, caar, cadrcdddar, cddddr, null?, list?, list, length, append, reverse, list-tail, list-ref, memq, memv, member, assq, assv, assoc

  • Symbol library: symbol?, symbol->string, string->symbol

  • Character library: char?, char=?, char<?, char>?, char<=?, char>=?, char-ci=?, char-ci<?, char-ci>?, char-ci<=?, char-ci>=?, char-alphabetic?, char-numeric?, char-whitespace?, char-upper-case?, char-lower-case?, char->integer, integer->char, char-upcase, char-downcase

  • String library: string?, make-string, string, string-length, string-ref, string-set!, string=?, string-ci=?, string<?, string>?, string<=?, string>=?, string-ci<?, string-ci>?, string-ci<=?, string-ci>=?, substring, string-append, string->list, list->string, string-copy, string-fill!

  • Vector library: vector?, make-vector, vector, vector-length, vector-ref, vector-set!, vector->list, list->vector, vector-fill!

  • Control features: procedure?, apply, map, for-each, eval, load

  • Input/output: display, newline

In addition to the above R5RS features, the following are provided. Heist allows the behaviour of some of these features to be configured on a per-runtime-environment basis.

  • More-helpful-than-usual REPL. Heist’s REPL does not clutter your display with prompts and other such cruft. It indents your code automatically, supports tab completion, and output is printed as comments so it’s simple to copy code out of a REPL session into a file without modification.

  • Transparent lazy evaluation (call by need). Expressions are not evaluated before being passed as arguments, instead they are evaluated as and when the called function requires their values.

  • Unhygienic macros. Scheme mandates hygienic macros that preserve lexical scoping, but other Lisps use a simpler system that does not impose this restriction. Heist allows hygiene to be disabled, its author not having the requisite experience to decide which system he prefers yet.

  • Optional continuations. Supporting continuations incurs a constant performance overhead, so if you don’t need them you can switch them off.

Installation and Usage

gem install heist

Heist can be run from the command line or embedded in Ruby applications. To run a Scheme file:

heist path/to/file.scm

To start a REPL session:

heist -i

Both these commands accept a number of runtime configuration options; run heist -h for more information.

Embed in your Ruby app

To embed Heist in a Ruby application, you need to create an instance of the runtime:

require 'heist'
scheme = Heist::Runtime.new(options)

options is an optional argument and is a hash that specifies the behaviour of optional parts of the runtime. For example, to enable lazy evaluation:

scheme = Heist::Runtime.new(:lazy => true)

The full list of options is:

  • :continuations - sets whether continuations and (call/cc) are enabled. Default is false.

  • :lazy - sets whether evaluation is lazy. By default this option is false and evaluation is eager.

  • :unhygienic - set this to true to disable macro hygiene.

Bear in mind that continuations and lazy evaluation are not currently compatible; enabling continuations forces eager evaluation. Also note that lazy evaluation has a slightly unpredictable effect on the use of macros.

To run a Scheme file using the runtime, just call:

scheme.run("path/to/file.scm")

Create Scheme functions using Ruby

Every runtime gets its own copy of all the built-in functions. You can add your own functions written in Ruby and Heist will make them available as Scheme procedures. For example, for running tests I do this in my setup method:

@runtime = Heist::Runtime.new
@runtime.define 'assert-equal' do |expected, actual|
  actual.should == expected
end

This lets me write, for example, (assert-equal 7 (+ 3 4)) in my tests. You can place arbitrary Ruby code in your functions, so this is an easy way to expose Ruby functionality to the Scheme environment.

Run Ruby data as Scheme code

Heist can interpret Scheme code provided as Ruby data, allowing you to write and manipulate Lisp-style expressions inline with Ruby code. For example, using our Heist::Runtime instance from above:

scheme.exec [:define, [:square, :x], [:*, :x, :x]]

scheme.exec [:square, 9]
#=> 81

Arrays map directly to Scheme lists, symbols map to identifiers, strings, numbers and booleans are treated as literals, and any other data is interpreted verbatim. So, you can use any piece of Ruby data in these expressions as long as the functions you’re calling can operate on it.

Due to syntactic limitations, some Scheme constructs cannot be used in this interface. Quoting syntax is out, so the expression '(1 2 3) would need to be written [:quote, [1, 2, 3]]. Dotted pairs and improper lists are also not available, though you can work around this to some extent using :cons. Vectors should be written as [:vector, 1, 2, 3] for example. Characters can be generated using the char procedure with a unit-length string, for example [:char, "Z"] produces the character #\Z.

Notes

This is alpha-quality software. While every attempt has been made to run valid Scheme code, error handling is at present very weak. Anything explicitly documented in this file can be assumed to work according to the Scheme standard, or according to the information presented here, and can be assumed to be reasonably stable.

Ruby-based syntax

I have not documented how to write your own syntax using Ruby because it requires far too much knowledge of Heist’s plumbing at present (I suspect this may be unavoidable). Besides, we have macros so if you want new syntax we’ve got you covered. In fact, writing syntax using macros makes sure that new syntactic forms support continuations correctly, and Heist itself eschews Ruby-based syntax where possible.

Valid symbols

Heist is extremely liberal as regards symbols. Any sequence of characters that does not contain any spaces or parentheses and that is not a boolean literal, a number, a character literal or a string is considered a valid symbol.

Vectors and quoting

Vectors do not need to be quoted, for example the expression #(1 2 3) evaluates to itself. If a vector is quoted, it is made immutable and a quoted vector expression returns the same in-memory object every time it is evaluated. In contrast, an unquoted vector returns a new vector object every time it is evaluated. Unquoted vectors are not frozen as this makes macro transformations hard to implement, and because we cannot let the vector-set! procedure modify the parse tree each unquoted vector is copied on evaluation.

Macros

Macros are slightly more liberal than R5RS, in that this is a valid macro:

(define-syntax my-let (syntax-rules ()
  [(my-let (name value) ... body ...)
    (let [(name value) ...]
      body ...)]))

You are allowed to use ellipses in infix positions, as long as the pattern that follows the ellipsis matches input that the preceeding pattern would fail to match. This is, the ellipsis acts as a greedy star operator in regular expressions. In the above, (name value) ... body ... matches zero or more two-element lists followed by zero or more expressions of any type. The following expression evaluates to 3 using this macro:

(my-let (x 2) (y 3) y)

Speaking of macros, there is no distinct macro expansion stage in Heist. Macros are expanded as they are encountered at runtime, and their expansions are inlined into the AST to improve performance: the same macro use is never expanded more than once. Macros (and native syntactic forms) are first-class, so they can be easily aliased and used anonymously:

(define when if)

((syntax-rules () [(_ expr) expr])
  (+ 3 4)) ; => 7

It’s not clear whether it would be useful to choose syntactic forms conditionally, certainly this will not work with macros as the inlining process will overwrite conditional code using the first expansion generated.

Laziness

Lazy evaluation mode is mainly useful for doing lambda calculus without being concerned about the difference between normal and applicative order, which for example affects the expression for the Y combinator. It displays some interesting properties, such as the fact that this is a perfectly reasonable definition for Y:

(define (Y f)
  (f (Y f)))
; => #<procedure:Y>

(define fact (Y (lambda (rec)
                  (lambda (x)
                    (if (zero? x)
                        1
                        (* x (rec (- x 1))))))))
; => #<procedure:fact>

(fact 6)
; => 720

License

(The MIT License)

Copyright © 2009-2011 James Coglan

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.