Lua and Squirrel: The Case for Squirrel

When considering an embedded scripting language, the most common choice is Lua, though there are many alternatives. One of those is Squirrel, a Lua-like language with a C-like syntax. Based on its history, one might dismiss Squirrel as simply “Lua with C syntax” or assume that its inclusion of classes and other features makes it heavy compared to Lua, perhaps the work of a spiteful C++ programmer who doesn’t like Lua. In actuality, not only is Squirrel light and performant, it addresses memory performance issues in Lua as well as several inconsistencies in the language, and it supplies features that must be hand-built in Lua.

Lua

Released in 1993 by Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes at the Pontifical Catholic University of Rio de Janeiro (PUC-Rio) in Brazil, Lua is a popular language for embedding in C and C++ programs. It is available under the MIT license at http://www.lua.org. Perhaps most famous for its use in World of Warcraft, it is often chosen for its small size, fast performance, and ease of integration with C. From a language design perspective, it is interesting for its minimalism and its use of the table as a universal data container. The most common complaints about Lua are its syntax, minimal set of features, and unpredictable performance impact due to garbage collection. Lua has been used in many projects, most of them games, including Grim Fandango, World of Warcraft, and Garry’s Mod.

There are people who get upset if you write it in all caps as “LUA”. Lua means “moon” in Portuguese.

Squirrel

Squirrel is a Lua-inspired language released by Alberto Demichelis in 2003. It’s up to version 3.0.4 and available under the MIT license at http://squirrel-lang.org. In contrast to Lua, Squirrel uses a C-like syntax and supports classes, enums, constants, and attributes. It uses a mixed approach of automatic reference counting and garbage collection to manage memory. Squirrel is used in the Code::Blocks IDE, Final Fantasy Crystal Chronicles: My Life as a King, Left 4 Dead 2, and Portal 2.

Why Squirrel Exists

Demichelis was motivated by his work on Lua integration at game developer Crytek. One of the most common issues facing game developers who embed Lua is the unpredictability of the garbage collector, which can affect real-time performance. Lua’s incremental garbage collector can be controlled somewhat through the lua_gc() function, which can run collection in explicit steps and accepts a step size argument that can be fine-tuned to manage the performance impact of collection. The optimal setting for this value is different for every project and can change as a project changes.
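As a rough illustration of that kind of tuning, here is a minimal sketch using collectgarbage(), the script-side counterpart of lua_gc(). The simulate_frame() function and the STEP_SIZE value are hypothetical placeholders, not recommendations.

```lua
-- Sketch: spend a bounded slice of GC work per frame instead of letting
-- the collector decide when to run.
collectgarbage("stop")                 -- pause automatic collection

-- Hypothetical stand-in for per-frame game logic that creates garbage.
local function simulate_frame()
    local scratch = {}
    for i = 1, 1000 do
        scratch[i] = { x = i, y = i * 2 }
    end
end

local STEP_SIZE = 10                   -- assumed value; tune per project

for frame = 1, 5 do
    simulate_frame()
    collectgarbage("step", STEP_SIZE)  -- one explicit slice of GC work
end

local kb_in_use = collectgarbage("count")  -- heap size in kilobytes
collectgarbage("restart")              -- hand control back to the collector
print(string.format("Lua heap: %.1f KB", kb_in_use))
```

How much work a given step size performs differs between Lua versions, which is part of why the value has to be re-tuned per project.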

To address the impact of garbage collection in Lua, which didn't have an incremental collector at the time, Demichelis attempted to implement reference counting in Lua. Rather than relying on the collector to periodically scan memory, object ownership would be tracked by counting references to an object and releasing it when there were no more references. The work required to do this led Demichelis to start over with a new language that would address the other issues he encountered with Lua, such as its unconventional syntax (co-workers were constantly asking how to write a for-loop) and small set of features.

Design & Performance

Squirrel is very similar in spirit to Lua. Both are implemented as a register-based VM. Both are small and easily embedded in C and C++ programs, and multiple VMs can be created and managed independently. Unlike Lua, which is written in C, Squirrel is implemented in C++ but exposes a C API modelled after Lua's stack-based API. Each language's source is simple enough to be included directly in a project and modified if needed. Both are around 100-150 kilobytes in compiled form.

Performance-wise, Lua edges out Squirrel in microbenchmarks for a number of reasons. In terms of pure bytecode-crunching, Lua will always be faster than Squirrel because Squirrel has more data types and language features. Most importantly, Lua is able to "cheat" by deferring memory management during the benchmark, while Squirrel performs constant memory management through refcounting. The tradeoff is intentional: Squirrel's memory management performs at a predictable, measurable cost in real-time applications like games, where Lua might initiate garbage collection at the worst moment.

Demichelis wrote on the Squirrel forum in 2006:

Also, I think when looking at the performance of a language of this kind, we have to look at a real scenario. Stuff like The Great language Shootout is, in my opinion, pointless. In a real app, you have several megs of data living in the VM; what is going to affect your performance is not the bytecode speed (that's almost linear). Mostly, the performance problem will be memory. Memory management (if it's a GCed language) plus memory aliasing and pagefaults and cache misses.

Ref counting solves the first issue by giving a linear/predictable cost in memory management. Smaller and more cache friendly data structures improve the aliasing (my attempt to solve this in Squirrel are classes and arrays). To give you an example, in our engine we have written a very good memory manager that keeps blocks of memory of the same size in the same virtual page (we do not use a heap, we go straight to the OS virtual page and take advantage of the 4Kb memory alignment), and Squirrel performs 3 times faster on my memory stress test that tries to reproduce a game loop creating and deleting stuff. It makes the system behave like a language with a copy collector.

Syntax

Many developers looking for a scripting language are drawn to Squirrel because of its C-like syntax. Syntax is often an aesthetic choice and is the least important factor in comparing Lua and Squirrel. There are advantages in Squirrel's C-like familiarity, but Lua's sometimes unconventional syntax isn't a good reason on its own to choose one or the other.

Lua Quirks

Much of this will be familiar to experienced Lua programmers. Lua is a humble little language, and it's hard to dislike it. Even Squirrel's author is fond of Lua and didn't write Squirrel with the intention of supplanting it or out of hatred for it. The criticisms here should be viewed as technical observations. The small scope of an embedded scripting environment somewhat reduces the impact of a language's flaws, because the usage bounds are so small that it's easier to tolerate and work around problems. In many cases, that is simple enough to do in Lua. However, the increasing number of applications using embedded scripting languages means that it's worth taking a look at inconsistencies and areas that have potential for improvement, for the sake of developers embedding the language as well as end-users exposed to the language through a scripting API.

Arrays and Tables

Both Squirrel and Lua have tables (a.k.a. dictionaries or associative arrays), and both support extending table behavior through metamethods. Unlike Lua, Squirrel has separate types for arrays and dictionaries. In Lua, arrays are tables using integers as keys. This used to be a performance advantage for Squirrel until Lua adopted a hybrid approach in which tables are implemented internally using a hash map part and an array part, presented to the user as a single type of data structure.

Lua's use of the table as a universal interface for arrays and dictionaries may lead to unexpected behavior when the interface fails to model the user's intent:

local a = { 1, 2, 3, 4, 5 }
a[3] = nil
--[[ a is now a sparse array:
{
    a[1] = 1,
    a[2] = 2,
    a[4] = 4,
    a[5] = 5
}
]]

This creates an array with a "hole" in it, which makes the length (#) operator unreliable: for a table with holes, Lua only guarantees that the result is some boundary between a non-nil element and a nil one, not the index of the last element. Instead of setting an element to nil, one must use the table.remove() function to remove an element and automatically shift the remaining keys downward. This is typically a benign issue on its own but is amplified when combined with other problems stemming from Lua's treatment of nil.
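For comparison, a short sketch of the table.remove() approach, plus a hand-written counter for tables that may already contain holes. The count_entries() helper is a hypothetical name, not part of Lua's standard library.

```lua
-- Removing an element properly keeps the table a gap-free sequence.
local a = { 1, 2, 3, 4, 5 }
table.remove(a, 3)            -- removes the value 3, shifts 4 and 5 down
print(#a)                     -- 4
print(a[3], a[4])             -- 4    5

-- When holes are possible, count entries explicitly instead of using #.
local function count_entries(t)
    local n = 0
    for _ in pairs(t) do
        n = n + 1
    end
    return n
end

local sparse = { 1, 2, 3, 4, 5 }
sparse[3] = nil
print(count_entries(sparse))  -- 4
```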

Nil In Lua

In Lua, undefined variables evaluate to nil. In Squirrel, attempting to access an undefined variable throws an exception. This is a crucial difference because of its implications for Lua's behavior. Lua makes no distinction between an element with a value of nil and an element that doesn't exist. Conceptually, a new table starts out with all possible keys set to nil, with Lua only tracking the non-nil elements.

local t = {
    a = "apple",
    b = "banana",
    c = "cucumber"
}

print(t.b) -- "banana"
t.b = nil  -- "banana" is effectively removed from the table. t is now { a = "apple", c = "cucumber" }
print(t.b) -- nil
print(t.x) -- nil

Squirrel's equivalent to nil is null, but the behavior is different:

local t = {
    a = "apple"
    b = "banana"
    c = "cucumber"
}

print(t.b) // "banana"
t.b = null // t is now { a = "apple", b = null, c = "cucumber" }
print(t.b) // null
print(t.x) // Throws an exception.

Inserting an element in Squirrel requires using the newslot operator (<-), and deleting requires the delete operator:

local t = {
    a = "apple"
    b = "banana"
    c = "cucumber"
}

t.t <- "tomato" // t is now { a = "apple", b = "banana", c = "cucumber", t = "tomato" }
delete t.t
print(t.t) // Throws an exception because the slot doesn't exist.

Accessing a variable in both Lua and Squirrel is really an access of the current environment table to retrieve the value of the element that has the variable's name as its key. Since Lua makes no distinction between non-existent elements and elements that are nil, undefined variables evaluate to nil, which can lead to subtle bugs:

function example(x)
    print(X) -- Prints nil.
end

-- Unintentional creation of an array with a hole.
local alpha = 1
local beta = 2
local delta = 3
local gamma = 4

local a = { alpha, beta, delte, gamma }

In Squirrel, these examples throw exceptions. There are Lua analysis tools that can catch these types of errors, and some safety can be achieved for undefined global access through metamethods, but having the language catch these mistakes is an advantage.
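The metamethod approach mentioned above can be sketched roughly as follows. The make_strict() helper is a hypothetical name. It is shown on an ordinary table for safety, under the assumption that the same metatable could be applied to the global table to guard global access.

```lua
-- Returns t with a metatable that raises an error when an undefined key
-- is read, instead of silently yielding nil.
local function make_strict(t)
    return setmetatable(t, {
        __index = function(_, key)
            error("undefined variable '" .. tostring(key) .. "'", 2)
        end,
    })
end

local env = make_strict({ alpha = 1, beta = 2 })

print(env.alpha)  -- 1

-- The misspelled name now fails loudly rather than evaluating to nil.
local ok, err = pcall(function()
    return env.delte
end)
print(ok, err)
```

The check only happens at run time and only on the code paths that actually execute, so it is a partial substitute for Squirrel's built-in behavior.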

Boolean Expressions

Lua's treatment of nil also has implications for the evaluation of boolean expressions. For historical reasons, anything other than nil or false evaluates to true, including 0. This is because, prior to 5.0, Lua didn't have the boolean values true and false, so nil represented false. For programmers coming from C, this results in surprising behavior:

while 0 do print("Loops forever") end
while not 1 do print("Does nothing") end
while 1 do print("Loops forever") end
while not 0 do print("Does nothing") end

Squirrel's C-like evaluation of boolean expressions:

while (0) print("Does nothing")
while (!1) print("Does nothing")
while (1) print("Loops forever")
while (!0) print("Loops forever")

Squirrel Features

Squirrel provides many features, some of which must be hand-built in Lua. There are third-party Lua libraries to provide some of this functionality, but they differ from each other in subtle ways and lack integration with the VM and the language.

Classes

Classes in Squirrel are key-value containers, which makes them similar to tables but with built-in behavior.

class Thing {

    name = null
    type = null

    constructor(aName, aType) {
        name = aName
        type = aType
    }

    function Dance() {
        // Do a dance.
    }
}

function Thing::Hug(target) {
    // Hug something.
}

class Monster extends Thing {

    constructor(name) {
        base.constructor(name, "MONSTER_TYPE")
    }

    function Attack(target) {
        // Attack something.
    }
}

local spider = Monster("Spider")
spider.Dance()

local rat = Monster("Rat")
rat.Hug(spider)

if (rat instanceof Monster)
    print("You monster!")

Lua supports object-oriented programming through the __index metamethod, which can be used to redirect failed table accesses to another table, and the colon operator, which passes an implicit self argument in a function call. With these two features, Lua stops just short of supporting classes at the language level. It also doesn't provide instance type introspection, which must be implemented by the programmer.
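To make the comparison concrete, here is a minimal sketch of one common way to hand-build the Thing and Monster classes in Lua. The new() constructors and the instanceof() helper are conventions chosen for this example, not language features, and other Lua class libraries do this differently.

```lua
local Thing = {}
Thing.__index = Thing            -- failed lookups on instances go to Thing

function Thing.new(name, kind)
    return setmetatable({ name = name, kind = kind }, Thing)
end

function Thing:Hug(target)       -- the colon passes an implicit self
    return self.name .. " hugs " .. target.name
end

-- Monster inherits from Thing by chaining __index lookups.
local Monster = setmetatable({}, { __index = Thing })
Monster.__index = Monster

function Monster.new(name)
    local self = Thing.new(name, "MONSTER_TYPE")
    return setmetatable(self, Monster)
end

-- Hand-built type introspection: walk the chain of class tables.
local function instanceof(obj, class)
    local mt = getmetatable(obj)
    while mt do
        if mt == class then
            return true
        end
        local parent = getmetatable(mt)
        mt = parent and parent.__index
    end
    return false
end

local spider = Monster.new("Spider")
local rat = Monster.new("Rat")
print(rat:Hug(spider))           -- Rat hugs Spider
print(instanceof(rat, Monster))  -- true
print(instanceof(rat, Thing))    -- true
```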

By having classes as a built-in feature in Squirrel, it's simpler to work with them through Squirrel's C API than through Lua's. For example, one approach in Lua for representing an object in the host application as a class instance in the VM would be to create a metatable in the registry with the class's methods, which might itself have a superclass metatable, and so on. You'd create a full userdata using the object's pointer (not a light userdata, as only a full userdata can have a metatable), and set its metatable to that of the class. Type introspection might be provided through a GetType() method that returns a string constant that you set during instance creation. A finalizer function can be implemented in C by registering a function for the __gc metamethod.

In Squirrel, after pushing a class onto the stack, call sq_call() to instantiate it on the stack. Then call sq_setinstanceup() to associate an arbitrary pointer with the new instance, and call sq_setreleasehook() to register a finalizer function if needed. This is harder to get wrong than constructing a class system by hand, and it's less code to write.

Additionally, type tags can be associated with classes and userdata using arbitrary pointers via sq_settypetag(). sq_getinstanceup() accepts a type tag for validating types: the class of the instance will have its type tag checked, and if that fails, its base class will be checked. If the type check fails all checks, the function fails. Otherwise, it returns the pointer associated with the instance.

Attributes

Classes can be annotated with attributes, which are just tables that can be accessed in Squirrel or by the C API. This can be used for storing documentation or to provide metadata to IDEs and automatic binding systems.

class Foo </ anAttribute = "Foo class level attribute" /> {

    </ blah = 10, bloop = [1, 2, 3] /> // Attributes of aProperty
    aProperty = null

    </ anAttribute = "Example function attribute" />
    function Example() {}
}

JSON Table Declaration

Squirrel supports table declaration using JSON syntax. External JSON files can be read directly in Squirrel by returning the JSON string in an inline compiled function:

local JSONSource = loadtxtfile("myjson.json");
local compiledJSONFile = compilebuffer( "return " + JSONSource);
local table = compiledJSONFile();

Parameter Type Masks

As in Lua, registered C functions can be called from scripts. Arguments are pushed on the stack and can be retrieved within the function. Lua's auxiliary library supplies the luaL_check* functions for validating arguments, while Squirrel provides a more automatic way of validating arguments via the sq_setparamscheck() function, which creates a parameter-checking scheme. sq_setparamscheck() takes an nparamscheck argument, which is the number of arguments the function takes, as well as a type mask. If nparamscheck is a positive value, the VM expects exactly that number of arguments, and if it's a negative value, the absolute value is treated as a minimum amount. In other words, a value of 3 means the function accepts exactly three arguments, while -3 means it accepts at least three arguments.

The type mask is a string of characters, each representing an argument type:

"tai" Table, array, and integer.
". u f" Anything, userdata, and float. For readability, spaces are allowed between characters and are ignored.
". a|s a|i|o b" Anything, array or string, array or integer or null, and boolean. The | is an OR operator.

The VM compiles the scheme and performs the checks for you when the function gets called. If the arguments don't pass validation, an error is thrown before the function is called. Passing the constant SQ_MATCHTYPEMASKSTRING for the nparamscheck argument tells the VM to extrapolate the number of arguments from the typemask.
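The argument-count rule can be modelled in a few lines. This is a sketch of the rule as described above, written in Lua for illustration. It is not Squirrel's implementation, arg_count_ok() is a hypothetical name, and treating 0 as "no count check" is an assumption.

```lua
-- Models only the nparamscheck counting rule, not the type mask.
local function arg_count_ok(nparamscheck, nargs)
    if nparamscheck == 0 then
        return true                    -- assumed: 0 disables the count check
    elseif nparamscheck > 0 then
        return nargs == nparamscheck   -- exactly this many arguments
    else
        return nargs >= -nparamscheck  -- at least this many arguments
    end
end

print(arg_count_ok(3, 3))    -- true
print(arg_count_ok(3, 4))    -- false
print(arg_count_ok(-3, 5))   -- true
print(arg_count_ok(-3, 2))   -- false
```

As far as I can tell from the Squirrel reference, the implicit this parameter is included in the count, so a native function that takes one script-visible argument would use 2 rather than 1. Verify that against the version you embed.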

Enums & Constants

By convention, Lua constants are just regular variables with all-caps names, and nothing prevents them from being modified. In Squirrel, enums and constants are evaluated at compile-time and are read-only. They're stored in a special constants table, which is a regular Squirrel table accessible to scripts and the C API. Attempting to modify a constant throws an exception.

const foo = 12
const stringConstant = "string constant"

foo = 13 // Throws an exception.

enum {
    zero
    one
    two
    three
}

print(three) // 3

enum MathFood {
    apple = "apple"
    banana = "banana"
    carrot = "carrot"
    pie = 3.14159265
}

print(MathFood.apple) // "apple"
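For contrast, read-only constants in Lua have to be hand-built. One possible sketch uses a proxy table whose __newindex metamethod rejects writes. The readonly() helper is a hypothetical name, and unlike Squirrel's constants the check happens at run time.

```lua
-- Wraps a table of values in an empty proxy. Reads fall through to the
-- values via __index; every write hits __newindex and is rejected.
local function readonly(values)
    return setmetatable({}, {
        __index = values,
        __newindex = function(_, key)
            error("attempt to modify constant '" .. tostring(key) .. "'", 2)
        end,
    })
end

local MathFood = readonly({
    apple = "apple",
    pie = 3.14159265,
})

print(MathFood.apple)  -- apple

local ok = pcall(function()
    MathFood.pie = 3
end)
print(ok)              -- false
print(MathFood.pie)    -- 3.14159265
```

The protection is only as strong as the convention: rawset() bypasses the metamethod, and pairs() on the proxy does not see the wrapped values.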

First-Class Weak References

Lua implements weak references through weak tables, created using the __mode field of the metatable. In Squirrel, weak references are first-class objects returned from the built-in weakref() method. This method is implemented for all types except null. Numeric types like integers and bools simply return themselves since they are passed by value.

local t = {}
local foo = [ "one", "two", "three" ]
t.theWeakRef <- foo.weakref()

foo = "Now foo is a different object."

/* The stack is cleaned conservatively, so a stack variable
   can keep a reference alive for a few instructions.
   Execute enough instructions for stack variables to get
   cleaned up. */
local i = 0; i = 1; i = 2

print(typeof(t.theWeakRef)) // null
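The Lua counterpart looks roughly like this: a sketch of a weak table whose values do not keep objects alive. The table and field names are made up for the example.

```lua
-- __mode = "v" makes the table's values weak: an entry disappears once
-- nothing else references its value.
local cache = setmetatable({}, { __mode = "v" })

local kept = { name = "kept" }
cache.kept = kept                 -- also referenced by the local 'kept'

-- The temporary object is created inside a function so that no local
-- variable in this scope keeps it alive.
local function add_temporary(t)
    t.temporary = { name = "temporary" }
end
add_temporary(cache)

collectgarbage()                  -- force a full collection cycle

print(cache.kept ~= nil)          -- true
print(cache.temporary == nil)     -- true
```

The weakness is a property of the container here, whereas in Squirrel the weak reference is itself a value that can be stored anywhere.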

Squirrel's Community

Squirrel's primary drawback is that it's a younger language and lacks Lua's community, documentation, and overall mindshare. Demichelis is active on the Squirrel website's forum and is considering a move to Google Groups, but the forum has nowhere near the level of activity Lua sees on forums, wikis, and the Lua mailing list. Despite Squirrel being embedded in high-profile projects like Portal 2 and Left 4 Dead 2, there doesn't appear to have been a measurable surge in activity. There is currently no equivalent of Lua's excellent "Programming in Lua" book by Roberto Ierusalimschy.

That said, there is still enough of a community to find help or locate tools. There are the C++ binding systems SqPlus and Sqrat, and at least one fork called SquiLu that adds some Lua features such as --[[]] style multi-line comments. Squirrel has been ported to SL4A (Scripting Layer for Android), and syntax highlighting packages are available for editors like Sublime Text.

Lua Or Squirrel

Knowledge of Lua transfers well to Squirrel. If the differences in features and behavior between the two languages aren't enough to decide between them, external factors might be, such as Squirrel's relative lack of documentation, its use of C++, or having existing code that depends on Lua. Even so, Squirrel shouldn't be overlooked. It has a proven track record in several commercial projects, and its built-in data structures and API features can save time. Most importantly, it's as fun as Lua.
