6.17 A WebAssembly version of LispE - naver/lispe GitHub Wiki

A WebAssembly implementation of LispE

Version française

Now you can enrich your JavaScript programs with Lisp. It becomes possible to execute statements such as: (geometric_distribution 20 0.3) directly in your JS code and get an Int32Array containing the result of this function.

We provide (see below) a playground to test LispE in your favorite browser. In particular, index.html shows you how to integrate this library in your code.

What is WebAssembly?

JavaScript has some limitations, despite significant progress, that make it unusable for some applications such as games or machine learning. The W3C has therefore proposed a new type of code: WebAssembly to overcome these problems. WebAssembly is a binary format, a bytecode that today's browsers can compile and execute. It works as an extension of JavaScript and can approach the performance of native applications, although according to some studies, performance is generally 10% to 60% slower than the corresponding C++ program. Above all, this bytecode has now become a compilation target for the most common programming languages such as C++ or Java. Thus, taking an existing C++ code and compiling it into WebAssembly is now possible.

Emscripten: C++ to WASM (WebAssembly)

There are several solutions for compiling to WebAssembly today, and one of the best known is Emscripten. Emscripten is based on the LLVM compiler, but offers as a result not only a WASM (WebAssembly) library, but also HTML and JS files to use it.

This is the tool we used to transform LispE into a WebAssembly library. Note that we took inspiration from Pragmatic compiling of C++ to WebAssembly. A Guide. to start this project.

Compilation

We have successfully compiled a version of LispE into a WebAssembly library that can run on any web browser. However, we faced some challenges in porting LispE to WebAssembly, especially regarding exception handling.

WebAssembly is a low-level binary format that aims to provide fast and portable execution of web applications. However, WebAssembly does not have native support for exception handling yet. Instead, it relies on JavaScript to catch and rethrow exceptions using a special tag that identifies the type and data of the exception.

In our initial version of LispE WebAssembly library, we used try/catch blocks to handle errors in our code. This worked fine for a binary version, but it proved quite problematic in WASM. First, it was very slow because every exception had to go through JavaScript and back to WebAssembly. Second, it was not very elegant because every function call was wrapped into a JS function.

We wrote a LispE program to scan our C++ code and replace each try/catch block with a simpler mechanism, where an error is signaled by allocating a specific field in the interpreter’s main class. This way, we preserved our original code, while also enabling us to create a much faster version for WASM.

The result was quite striking when we tested this new version against some of our test scripts. For example, when we ran our interpreter on the following script: gradient descent, it took 9ms on an Intel iMac, 153ms with the WASM version using try/catch, and 20ms with the WASM version without try/catch.

Output

The result of the compilation with Emscripten consists of a library lispe.wasm and two files lispe.hmtl and lispe.js. We implemented a fourth file lispe_functions.js to wrap the exported functions from the WASM library in JavaScript (see lispe_functions.js)

The lispe.js file, which is generated by emstdk contains only the minimal code to load the library and initialize it.

WASM Files

You can find all the necessary code in the wasm directory:

  • lispe.wasm is the WebAssembly library that enables you to run Lisp code.
  • lispe.js handles the loading and initialization of the WebAssembly library.
  • lispe_functions.js provides the JS wrappers to access the functions exported by lispe.wasm.
  • index.html demonstrates how to integrate this library into a Web page with a basic interface for experimenting with the language.

JavaScript API

This API has a key feature: it implements methods that return actual data structures that can be used directly in JS. These functions can return numerical values, strings or arrays of numbers and strings. They can also initialize variables within the interpreter. This is especially useful if you want to pass a large string or a large array to the code beforehand. Most of them take strings as input.

//See lispe_functions.js
function callEval(code, size_output) //this function evaluates some LispE code and returns a string

function callEvalToFloats(code, size_output_in_floats) //this function returns an array of Floats (Float64Array)
function callEvalAsInt(code) //this function returns an integer
function callSetqString(a_variable, value) //This function creates a variable whose value is a string 
etc.

We also added an additional function to clean the interpreter:

function callResetLispe() //this function resets the LispE interpreter

Note that the code and variables created in LispE are persistent, hence the need to be able to regenerate the interpreter when needed. Thus, between two calls, the variables keep their value.

Memory Usage

Firstly, it is important to note that WebAssembly operates within a sandbox environment, which ensures that memory cannot be corrupted or lost during program execution. To achieve this, the available memory must be defined when compiling WebAssembly using emcc. We compiled LispE with the following values:

INITIAL_MEMORY=47972352 
STACK_SIZE=20971520

47,972,352 corresponds to 732x65536 memory pages.
20,971,520 corresponds to 320x65536 memory pages.

It is essential to understand that this memory serves not only as the execution space for a WebAssembly program but also as the exchange area for functions exported by the library. Consequently, the arguments of WASM functions must be allocated within this specific memory space.

Strings

One of the most challenging aspects of WebAssembly is that it requires strings to be passed as arrays of integers, where each character is encoded in UTF-16. Therefore, these arrays are first converted to UTF-32 strings in LispE, since this is the basic encoding of strings in this language, and then if the output is also a string, they are converted back to arrays of integers.

We provide many different functions to encode and decode these arrays for characters, integers and floats.

// Encodes a string into an integer array
const encode = function stringToIntegerArray(str, array) { ... };

// Decodes an integer array into a string
const decode = function integerArrayToString(array, sz) { ... };

// Provides characters as integers
function provideCharactersAsInts(code) { ... };

etc.

Playground: index.html

We have provided an example of using this library in the index.html file. You need first to launch a server.

Launching a server

Move to the directory where lispe.wasm is located.

Execute:

emrun --port 8080 .

You can also use python:

python -m http.server --directory . 8080

Then you can use the address: http://localhost:8080 in any browser to interact with LispE.

Running Code

As demonstrated, it is possible to interact with lispe.wasm in a straightforward manner. You can add code, compile, and execute it on the fly. To re-execute code, simply position the cursor at the beginning of the line you wish to re-run.

By default, the callEval method takes code and returns a string result. In many cases, this may be adequate, but it is also possible to return numeric values or lists of numeric values. You can even minimize compilation by initializing variables directly with lists or strings, for instance (see callSetq).

Referring to our previous example: (geometric_distribution 20 0.3), it is possible to execute it in a way that returns a JavaScript array or a string.

Note that in the GUI, the first case is covered by the button As a list of integers, while a simple call to Reset/RUN will return a string. Furthermore, the text area: DATA is automatically transformed into a variable of name DATA (sic) in the interpreter, with as value the content of this text area.

The underlying functions that are called are the following:

var result = callEvalToInts("(geometric_distribution 20 0.3)", 1000), which gives: 8,0,4,1,0,2,2,2,4,1

while

var result = callEval("(geometric_distribution 20 0.3)", 1000), gives: "(2 1 0 1 1 0 0 11 5 11)"

requires us to parse the above string to extract the corresponding table.

Persistence

Each call to a callEval function keeps all the variables created in memory. Thus, it is possible to chain the calls without losing the intermediate values:

(async () => {
    await callEval("(setq r 100)", 1000)
    await callEval("(+= r 1000)", 1000)
    await callEval("(println r)",1000)
    await callEval("r", 100)
})();

Note two things about the code above:

  1. The consecutive calls are asynchronous.
  2. The last line executed in a code returns the final value of the evaluation, in this case r which is 1100

On the other hand, these variables exist only in the memory space of lispe.wasm. This ensures the privacy of the processed data since the execution is local and does not pass through any server. The data is kept in the user's local memory and not in that of a remote server.

Of course, it is much more efficient to call callEval with the whole code at once. Still, the execution will return as a result the last line of the code.

Conclusion

WebAssembly today offers JavaScript developers access to modules compiled in other languages such as C++, Rust or C#. Thanks to tools like emsdk, switching from C++ to WebAssembly code is relatively simple, although memory management and argument passing need some attention. Thus, we can offer a very interesting extension to JavaScript in the form of a LispE interpreter, which complements JavaScript with a very rich set of instructions, especially for manipulating lists and strings, thanks to a tight integration.

In particular, LispE offers the ability to write remarkably compact one-liners :

  • Splitting a string into sub-strings
// Splits a string into sub-strings
(segment "This is the 100 test"); ("This" "is" "the" "100" "test")
  • Powerful filters based on a regular expression:
(filterlist (λ(x) (rgx_match (rgx "%C%a+") x)) (segment "The lady lives in Paris. She lives in Montmartre")

which gives: ("The" "Paris" "She" "Montmartre")

The API makes it possible to exploit the language by returning lists or values directly. Thus, a call to the filter example, will return an array of strings:

lst = callEvalToStrings("(filterlist (λ(x) (rgx_match (rgx "%C%a+") x)) (segment "The lady lives in Paris")), 1000);

// lst is: La,Paris

Finally, the local execution of the code ensures the privacy of the user data and makes it easy to perform verifications or calculations in the browser without having to go through a server.

LispE is a very rich language as shown in the documentation which offers developers other ways to design and implement their code for the Web.