August 16th, 2010

Google Summer of Code 2010, Final Update

The “pencils down” date for Google Summer of Code 2010 is right now. Here’s a quick overview of what I feel I have contributed to scxml-js thus far, and what I feel should be done in the future.

Tests and Testing Framework

Critical to the development of scxml-js was the creation of a robust testing framework. scxml-js was written using a tests-first development style, which is to say that before adding any new feature, I would attempt to map out the implications of that feature, including all possible edge cases, and would then write tests for success, failure, and sanity. By automating these tests, it was possible to avoid regressions when new features were added, and thus maintain robustness as the codebase became more complex.

Testing scxml-js was an interesting challenge with respect to automated testing, as it was necessary to test both the generated target code (using ahead-of-time compilation), and the compiler itself (using just-in-time compilation), running in all the major web browsers, as well as on the JVM under Rhino. This represented many usage contexts, and so a great deal of complexity was bundled into the resulting build script.

The tests written usually conformed to a general format: a given SCXML input file would be compiled and instantiated, and a script would send events into the compiled statechart while asserting that the state had updated correctly. A custom build script, written in JavaScript, automated the process of compiling and running test cases, starting and stopping web browsers, and harvesting results. dojo.doh and Selenium RC were used in the testing framework.
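The general test format described above might be sketched as follows. This is only an illustration: the statechart is a hand-written stand-in rather than real scxml-js output, and the names createToggleStatechart, send, getConfiguration, and runTestScript are hypothetical.

```javascript
// Hypothetical stand-in for a compiled statechart: two states, one event.
function createToggleStatechart() {
  var current = "off";
  var transitions = {
    off: { toggle: "on" },
    on: { toggle: "off" }
  };
  return {
    // Dispatch an event; events with no matching transition are ignored.
    send: function (eventName) {
      var row = transitions[current];
      if (Object.prototype.hasOwnProperty.call(row, eventName)) {
        current = row[eventName];
      }
    },
    getConfiguration: function () {
      return [current];
    }
  };
}

// A test script: each step sends one event and states the expected configuration.
function runTestScript(statechart, steps) {
  var failures = [];
  for (var i = 0; i < steps.length; i++) {
    statechart.send(steps[i].event);
    var actual = statechart.getConfiguration().join(",");
    var expected = steps[i].expected.join(",");
    if (actual !== expected) {
      failures.push("step " + i + ": expected " + expected + ", got " + actual);
    }
  }
  return failures;
}

var failures = runTestScript(createToggleStatechart(), [
  { event: "toggle", expected: ["on"] },
  { event: "unknown", expected: ["on"] },
  { event: "toggle", expected: ["off"] }
]);
console.log(failures.length === 0 ? "all steps passed" : failures.join("\n"));
```

Keeping the script as plain data (event, expected configuration) is what would let the same test run against both ahead-of-time and just-in-time compiled statecharts.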

Going Forward

It would be useful to phase out the custom JavaScript build script for a more standard build tool, such as Maven or Ant. This may be challenging, however, given the number of usage contexts of the scxml-js compiler, as well as the fact that the API it exposes is asynchronous.

Another task I’d like to perform is to take the tests written for Commons SCXML and port them so that they can be used in scxml-js.

Finally, I have often noticed strange behaviour with Selenium. At this moment, when run under Selenium, tests are broken for in-browser compilation under Internet Explorer; however, when run manually, they always pass. I’ve traced where the tests are failing, and it’s a strange and intermittent failure involving parsing an XML document. I think it may be caused by the way that Selenium instruments the code in the page. I feel it may be worthwhile to investigate alternatives to Selenium.

scxml-js Compiler

This page provides an overview of which features work right now, and which do not.

In general, I think scxml-js is probably stable enough to use in many contexts. Unfortunately, scxml-js has had only one user, and that has been me. I’m certain that when other developers do begin using it, they will break it and find lots of bugs.

I’m hoping to prepare a pre-alpha release to coincide with the SVG Open 2010 conference at the end of the month, and in preparation for this, I’m reaching out to people I know to ask them to attempt to use scxml-js in a non-trivial project. This will help me find bugs before I attempt to release scxml-js for general consumption.

Going Forward

There are still edge cases which I have in mind that need to be tested. For example, I haven’t done much testing of nested parallel states.

I also have further performance optimizations which I’d like to implement. For example, I’ve been using JavaScript 1.6 functional Array prototype extensions (e.g. map, filter, and forEach) in the generated code, and augmenting Array.prototype for compatibility with Internet Explorer. However, these methods are often slower than using a regular for loop, especially in IE, and so it would be good to swap them out for regular for loops in the target code.
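As a rough illustration of the swap described above (not actual scxml-js output; the function names and the exitAction callback are hypothetical), the two styles look like this:

```javascript
// Hypothetical fragment of generated code: run an exit action for each active state.
// Relies on Array.prototype.forEach, which must be patched in for older IE.
function exitStatesWithForEach(activeStates, exitAction) {
  activeStates.forEach(function (state) {
    exitAction(state);
  });
}

// The same behaviour for dense arrays, written as a plain for loop.
// No Array.prototype extension is needed, and no closure is created for the loop body.
function exitStatesWithLoop(activeStates, exitAction) {
  for (var i = 0, n = activeStates.length; i < n; i++) {
    exitAction(activeStates[i]);
  }
}

exitStatesWithLoop(["s1", "s2"], function (state) {
  console.log("exiting " + state);
});
```

One difference to keep in mind: forEach skips holes in sparse arrays while the for loop visits every index, so the two are only interchangeable when the arrays are dense.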

Another performance enhancement would be to encode the statechart’s current configuration as a single scalar state variable, rather than encoding it as an array of basic state variables, for statecharts that do not contain parallel states. This would reduce the time required to dispatch events for these types of statecharts, as the statechart instance would no longer need to iterate through each state of the current configuration, thus removing the overhead of the for loop.
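The difference between the two encodings can be sketched as follows. The transition table and function names are hypothetical, and the dispatch logic is deliberately simplified (no guards, actions, or hierarchy):

```javascript
// Hypothetical transition table for a statechart with no parallel states.
var transitionTable = {
  idle: { start: "running" },
  running: { pause: "paused", stop: "idle" },
  paused: { resume: "running", stop: "idle" }
};

// Look up the target of an event in a state, or return the state unchanged.
function nextState(state, eventName) {
  var row = transitionTable[state];
  return Object.prototype.hasOwnProperty.call(row, eventName) ? row[eventName] : state;
}

// Array-based configuration: general enough for parallel states,
// but every dispatch iterates over the configuration.
function dispatchOnArray(configuration, eventName) {
  var next = [];
  for (var i = 0; i < configuration.length; i++) {
    next.push(nextState(configuration[i], eventName));
  }
  return next;
}

// Scalar configuration: a single lookup, with no loop at all.
function dispatchOnScalar(state, eventName) {
  return nextState(state, eventName);
}

console.log(dispatchOnArray(["idle"], "start"));
console.log(dispatchOnScalar("idle", "start"));
```

Without parallel states the array only ever holds one basic state, which is why the scalar form loses nothing.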

I’m sure that once outside developers begin to look at the code, they will have lots of ideas on how to improve performance as well.

There are other interesting parts of the project that still need to be investigated, including exploring the best way to integrate scxml-js with existing JavaScript toolkits, such as jQuery UI and Dojo.

Graph Layout, Visualization, and Listener API

As I stated in the initial project proposal, one of my goals for GSoC was to create a tool that would take an SCXML document, and generate a graphical representation of that document. By targeting SVG, this graphical representation could then be scripted. By attaching a listener to a statechart instance, the SVG document could then be animated in response to state changes.
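A minimal sketch of the listener idea, assuming a hypothetical interface with onEntry and onExit callbacks (the actual scxml-js listener API may differ). A plain object stands in for the SVG document so that the example runs outside a browser:

```javascript
// Hypothetical observable statechart: notifies listeners on state exit and entry.
function createObservableStatechart(transitions, initialState) {
  var current = initialState;
  var listeners = [];
  return {
    addListener: function (listener) {
      listeners.push(listener);
    },
    send: function (eventName) {
      var row = transitions[current];
      if (!Object.prototype.hasOwnProperty.call(row, eventName)) {
        return; // no transition for this event in the current state
      }
      var source = current;
      var target = row[eventName];
      var i;
      for (i = 0; i < listeners.length; i++) listeners[i].onExit(source);
      current = target;
      for (i = 0; i < listeners.length; i++) listeners[i].onEntry(target);
    }
  };
}

// Stand-in for the SVG document: state id -> whether its node is highlighted.
// In a browser, the listener would instead change an attribute on the SVG node for that state.
var highlighted = { red: true, green: false };

var light = createObservableStatechart(
  { red: { go: "green" }, green: { stop: "red" } },
  "red"
);
light.addListener({
  onEntry: function (stateId) { highlighted[stateId] = true; },
  onExit: function (stateId) { highlighted[stateId] = false; }
});

light.send("go");
console.log(highlighted);
```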

I was able to accomplish this by porting several graph layout algorithms written by Denis Dube for his Master’s thesis at the McGill University Modelling, Simulation and Design Lab. Denis was kind enough to license his implementations for release in ASF projects under the Apache License. You can see a demo of some of this work here.

Going Forward

The intention behind this work was to create a tool that would facilitate graphical debugging of statecharts in the web browser. While this is currently possible, it still requires “glue code” to be manually written to generate a graphical representation from an SCXML document, and then hook up the listener. I would like to make this process easier and more automatic. I feel it should operate similarly to other compilers, in that the compiler should optionally include debugging symbols in the generated code which allow it to map to a “concrete syntax” (textual or graphical) representation.

Another issue that needs to be resolved is cross-browser compatibility. It’s currently possible to generate SVG in Firefox and Batik, but there are known issues in Chromium and Opera.

Also, there are several more graph layout algorithms implemented by Denis which I have not yet ported. I’d really like to see this happen.

Finally, my initial inquiries on the svg-developers mailing list indicated that this work would be useful for other projects. I therefore feel that these JavaScript graph layout implementations should be moved into a portable library. Also, rather than generating a graphical representation directly from SCXML, it should be possible to generate a graphical representation from a more neutral markup format for describing graphs, such as GraphML.

Demos

I have written some nice demos that illustrate the various aspects of scxml-js, including how it may be used in the development of rich, Web-based user interfaces. The most interesting and complex examples are the Drawing Tool Demos, which implement a subset of Inkscape’s UI behaviour. The first demo uses scxml-js with a just-in-time compilation technique; the second uses ahead-of-time compilation; and the third uses just-in-time compilation, and generates a graphical representation on the fly, which it then animates in response to UI events. This last demo only works well in Firefox right now, but shows what should be possible going forward.

I have several other ideas for demos, which I will attempt to implement before the SVG Open conference.

Documentation

The main sources of documentation now are the User Guide, the source code for the demos, and Section 5 of my SVG Open paper submission on scxml-js.


This has been an exciting and engaging project to work on, and I’m extremely grateful to Google, the Apache Software Foundation, and my mentor Rahul for facilitating this experience.

June 28th, 2010

Google Summer of Code, Update 3: More Live Demos

Just a quick update this time. scxml-js is moving right along, as I’ve been adding support for new features at a rate of about one per day on average. Today, I reached an interesting milestone: scxml-js is now as featureful as the old SCCJS compiler which I had previously been using in my research. This means that I can now begin porting the demos and prototypes I constructed using SCCJS to scxml-js, as well as begin creating new ones.

New Demos

Here are two new, simple demos that illustrate how scxml-js may be used to describe and implement behaviour of web User Interfaces (tested in recent Firefox and Chromium; will definitely not work in IE due to its use of XHTML):

Both examples use state machines to describe and implement drag-and-drop behaviour of SVG elements. The first example is interesting, because it illustrates how HTML, SVG, and SCXML can be used together in a single compound document to declaratively describe UI structure and behaviour. The second example illustrates how one may create state machines and DOM elements dynamically and procedurally using JavaScript, as opposed to declaratively using XML markup. In this example, each dynamically-created element will have its own state machine, hence its own state.
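To give a sense of the second example’s approach without the DOM, here is a hand-written sketch of a per-element drag-and-drop state machine. It is not the demo’s code: createDragBehaviour is a hypothetical name, and plain objects stand in for SVG elements and mouse events.

```javascript
// Hypothetical drag-and-drop behaviour with two states: "idle" and "dragging".
// Each call creates an independent state machine bound to one element.
function createDragBehaviour(element) {
  var state = "idle";
  var lastX = 0;
  var lastY = 0;
  return {
    getState: function () {
      return state;
    },
    handle: function (event) {
      if (state === "idle" && event.type === "mousedown") {
        lastX = event.x;
        lastY = event.y;
        state = "dragging";
      } else if (state === "dragging" && event.type === "mousemove") {
        // Move the element by the distance the pointer travelled.
        element.x += event.x - lastX;
        element.y += event.y - lastY;
        lastX = event.x;
        lastY = event.y;
      } else if (state === "dragging" && event.type === "mouseup") {
        state = "idle";
      }
      // Any other state/event combination is ignored.
    }
  };
}

var rect = { x: 10, y: 10 };
var circle = { x: 50, y: 50 };
var rectBehaviour = createDragBehaviour(rect);
var circleBehaviour = createDragBehaviour(circle);

rectBehaviour.handle({ type: "mousedown", x: 12, y: 12 });
rectBehaviour.handle({ type: "mousemove", x: 20, y: 17 });
circleBehaviour.handle({ type: "mousemove", x: 99, y: 99 }); // ignored: circle is idle
rectBehaviour.handle({ type: "mouseup", x: 20, y: 17 });

console.log(rect, circle);
```

Because each element gets its own machine, dragging the rectangle has no effect on the circle’s state.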

I think the code in these examples is fairly clean and instructive, and should give a good sense regarding how scxml-js may ultimately be used as a finished product.

June 23rd, 2010

Google Summer of Code 2010, Project Update 2

Here’s another quick update on the status of my Google Summer of Code project.

Finished porting IR-compiler and Code Generation Components to XSLT

As described in the previous post, I finished porting the IR-compiler and Code Generation components from E4X to XSLT.

Once I had this working with the Java XML transformation APIs under Rhino, I followed up with the completion of two related subtasks:

  1. Get the XSL transformations working in-browser, and across all major browsers (IE8, Firefox 3.5, Safari 5, Chrome 5 — Opera still to come).
  2. Create a single consolidated compiler front-end, written in JavaScript, that works in both the browser and in Rhino.

Cross-Browser XSL Transformation

Getting all XSL transformations to work reliably across browsers was something I expressed serious concerns about in my previous post. Indeed, this task posed some interesting challenges, and motivated certain design decisions.

The main issue I encountered in getting these XSL transformations to work was that support for xsl:import in XSL stylesheets, when called from JavaScript, is not very good in most browsers. xsl:import works well in Firefox, but is currently distinctly broken in WebKit and WebKit-based browsers (see here for the Chrome bug report, and here for the WebKit bug report). I also had limited success with it in IE 8.

I considered several possible solutions to work around this bug.

First, I looked into a pure JavaScript solution. In my previous post, I linked to the Sarissa and AJAXSLT libraries. In general, a common task of JavaScript libraries is to abstract out browser differences, so the fact that several libraries existed which appeared to do just that for XSLT offered me a degree of confidence when I was initially choosing XSLT as a primary technology with which to implement scxml-js. Unfortunately, on closer inspection during this development cycle, I found that Sarissa, AJAXSLT, and all other libraries designed to abstract out cross-browser XSLT differences (including Javeline and the jQuery XSL transform plugin) are not actively maintained. As web browsers are rapidly moving targets, maintenance is a major concern when selecting a library dependency. In any case, a pure JavaScript solution did not appear feasible. This left me to get the XSL transformations working using just the “bare metal” of the browser.

My next attempt was to try to use some clever DOM manipulation to work around the Webkit bug. In the Webkit bug, xsl:import does not work because frameless resources cannot load other resources. This meant that loading the SCXML document on its own in Chrome, with an xml-stylesheet processing instruction pointing to the code generation stylesheet, did generate code correctly. My idea, then, was to use DOM to create an invisible iframe, and load into it the SCXML document to transform, along with the requisite processing instruction, and read out the transformed JavaScript. I actually had some success with this, but it seemed to be a brittle solution. I was able to get it to work, but not reliably, and it was difficult to know when and how to read the transformed JavaScript out of the iframe. In any case my attempts at this can be found in this branch here.

My final, and ultimately successful attempt was to use XSL to preprocess the stylesheets that used xsl:import, so as to combine the stylesheet contents, while still respecting the semantics of xsl:import. This was not too difficult, and only took a bit of effort to debug. You can see the results here. Note that there may be some corner cases of XSLT that are not handled by this script, but it works well for the existing scxml-js code generation backends. This is the solution upon which I ultimately settled.

One thing that must still be done, given this solution, is to incorporate this stylesheet preprocessing into the build step. For the moment, I have simply done the quick and dirty thing, which is to check the preprocessed stylesheets into SVN.

It’s interesting to note that IE 8 was the easiest browser to work with in this cycle, as it provided useful and meaningful error messages when XSL transformations failed. By contrast, Firefox would return cryptic error messages, without much useful information, and Safari/Chrome would not provide any error message at all, instead failing silently in the XSLT processor and returning undefined.

Consolidated Compiler Front-end

As I described in my previous post, a thin front-end to the XSL stylesheets was needed. For the purposes of running inside of the browser, the front-end would need to be written in JavaScript. It would have been possible, however, to write a separate front-end in a different language (bash, Java, or anything else), for the purposes of running outside of the browser. A design decision needed to be made, then, regarding how the front-end should be implemented:

  • Implement one unified front-end, written in JavaScript, which relies on modules that provide portable APIs, and provide implementations of these APIs that vary between environments.
  • Implement multiple front-ends, for browser and server environments.

I decided that, with respect to maintainability, it would be easier to maintain one front-end, written in one language, rather than two front-ends in different languages, and so I chose the first option. This worked well, but I’m not yet completely happy with the result, as I have code for Rhino and code for the browser mixed together in the same module. This means that code for Rhino is downloaded to the browser, even though it is never called (see Transformer.js for an example of this). The same is true for code that targets IE versus other browsers. I believe I’ve thought of a way to use RequireJS to selectively download platform-specific modules, and this is an optimization that I’ll make in the near future.
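One way to picture the “portable API, platform-specific implementation” split is the sketch below. It is an assumption-laden illustration, not the contents of Transformer.js: the detection heuristics are simplified, the names detectPlatform, transformerImplementations, and getTransformer are hypothetical, and the transform bodies are placeholders that only describe what a real implementation would call.

```javascript
// Simplified, hypothetical environment detection. Browsers expose `document`;
// Rhino exposes the `Packages` object for Java interop. The browser check comes
// first because a browser with a Java plugin may also expose `Packages`.
function detectPlatform(globalObject) {
  if (typeof globalObject.document !== "undefined") return "browser";
  if (typeof globalObject.Packages !== "undefined") return "rhino";
  return "unknown";
}

// One portable API (`transform`) with one implementation per platform.
// The bodies are placeholders; real code would invoke the platform's XSLT engine.
var transformerImplementations = {
  browser: {
    transform: function (sourceDocument, stylesheet) {
      return "placeholder: would use the browser's XSLT processor";
    }
  },
  rhino: {
    transform: function (sourceDocument, stylesheet) {
      return "placeholder: would use the Java XML transformation APIs";
    }
  }
};

// The front-end asks for a transformer and never branches on the platform itself.
function getTransformer(globalObject) {
  var platform = detectPlatform(globalObject);
  if (!Object.prototype.hasOwnProperty.call(transformerImplementations, platform)) {
    throw new Error("No transformer available for platform: " + platform);
  }
  return transformerImplementations[platform];
}

// Fake global objects are passed in here so the example runs anywhere.
console.log(getTransformer({ document: {} }).transform(null, null));
console.log(getTransformer({ Packages: {} }).transform(null, null));
```

In this sketch both implementations still live in one file; splitting each into its own module is what would allow a loader to fetch only the one that matches the platform.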

In-Browser Demo

The result of this work can be seen in this demo site I threw together:

This demo provides a very crude illustration of what a browser-based Graphical User Interface to the compiler might look like. It takes SCXML as input (top-most textarea), compiles it to JavaScript code (lower-left textarea, read-only), and then allows simulation from the console (bottom-right textarea and text input). For convenience, the demo populates the SCXML input textarea with the KitchenSink executable content example. I’ve tested it in IE8, Safari 5, Chrome 5, Firefox 3.5. It works best in Chrome and Firefox. I haven’t been testing in Opera, but I’m going to start soon.

Future Work

The past three weeks were spent porting and refactoring, which was necessary to facilitate future progress, and now there’s lots to do going forward. My feeling is that it’s now time to get back to the main work, which is adding important features to the compiler, starting with functionality still missing from the current implementation of the core module:

I’m going to be presenting this work at the SVG Open 2010 conference at the end of August, so I’m also keen to prepare some new, compelling demos that will really illustrate the power of Statecharts on the web.

June 6th, 2010

Google Summer of Code 2010, Project Update 1

I’m two weeks into my Google Summer of Code project, and decided it was time to write the first update describing the work I’ve done, and the work I will do.

Project Overview

First, a quick overview of what my project is, what it does, and why one might care about it. The SCXML Code Generation Framework, JavaScript Edition project (SCXMLcgf/js) centers on the development of a particular tool, the purpose of which is to accelerate the development of rich Web-based User Interfaces. The idea behind it is that there is a modelling language, called Statecharts, which is very good at describing dynamic behaviour of objects, and can be used for describing rich UI behaviour as well. The tool I’m developing, then, is a Statechart-to-JavaScript compiler, which takes as input Statechart models as SCXML documents, and compiles them to executable JavaScript code, which can then be used in the development of complex Web UIs.

I’m currently developing this tool under the auspices of the Apache Software Foundation during this year’s Google Summer of Code. For more information on it, you could read my GSoC project proposal here, or even check out the code here.

Week 1 Overview

As I said above, I’m now two weeks into the project. I had already done some work on this last semester, so I’ve been adding in support for additional modules described in the SCXML specification. In Week 1, I added basic support for the Script Module. I wrote some tests for this, and it seemed to work well, so I checked it in.

Difficulties with E4X

I had originally written SCXMLcgf/js entirely in JavaScript, targeting the Mozilla Rhino JavaScript implementation. One feature that Rhino offers is the E4X language extension to JavaScript. E4X was fantastic for rapidly developing my project. It was particularly useful over standard JavaScript in terms of providing an elegant syntax for: templating (multiline strings with embedded parameters, and regular JavaScript scoping rules), queries against the XML document structure (very similar to XPath), and easy manipulation of that structure.

These language features allowed me to write my compiler in a very declarative style: I would execute transformations on the input SCXML document, then query the resulting structure and pass it into templates which generated code in a top-down fashion. I leveraged E4X’s language features heavily throughout my project, and was very productive.

Unfortunately, during Week 1, I ran into some difficulties with E4X. There was some weirdness involving namespaces, and some involving scoping. This wasn’t entirely surprising, as the Rhino implementation of E4X has not always felt very robust to me. Right out of the box, there is a bug that prevents one from parsing XML files with XML declarations, and I have encountered other problems as well. In any case, I lost an afternoon to this problem, and decided that I needed to begin to remove SCXMLcgf/js’s E4X dependencies sooner rather than later.

I had known that it would eventually be necessary to move away from E4X for portability reasons, as it would be desirable to be able to run SCXMLcgf/js in the browser environment, including non-Mozilla browsers. There are a number of reasons for this, including the possibility of using the compiler as a JIT compiler, and the possibility of providing a browser-based environment for Statechart development. Given the problems I had had with E4X in Week 1, I decided to move this task up in my schedule, and deal with it immediately.

So, for Week 2, I’ve been porting most of my code to XSLT.

Justification for Targeting XSLT

At the beginning of Week 2, I knew I needed to migrate away from E4X, but it wasn’t clear what the replacement technology should be. So, I spent a lot of time thinking about SCXMLcgf/js, its architecture, and the requirements that this imposes on the technology.

The architecture of SCXMLcgf/js can be broken into three main components:

  • Front End: Takes in arguments, possibly passed in from the command-line, and passes these in as options to the IR Compiler and the Code Generator.
  • IR Compiler: Analyzes the given SCXML document, and creates an Intermediate Representation (IR) that is easy to generate code from.
  • Code Generator: Generates code from a given SCXML IR. May have multiple backend modules that target different programming languages (it currently only targets JavaScript), and different Statechart implementation techniques (it currently targets three different techniques).

My goal for Week 2 was just to eliminate E4X dependencies in the Code Generator component. The idea behind this component is that its modules should only be used for templating. The primary goal of these template modules is that they should be easy to read, understand, and maintain. In my opinion, this means that templates should not contain procedural programming logic.

Moreover, I came up with other precise feature requirements for a templating system, based on my experience from the first implementation of SCXMLcgf/js:

  • must be able to run under Rhino or the browser
  • multiline text
  • variable substitution
  • iteration (loops)
  • if/else blocks
  • Mechanisms to facilitate Don’t Repeat Yourself (DRY)
    • Something like function modularity, where you separate templates into named regions.
    • Something like inheritance, where a template can import other templates, and override functionality in the parent template.

Because I’m very JavaScript-oriented, I first looked into templating systems implemented in JavaScript. JavaScript templating systems are more plentiful than I had expected. Unfortunately, I did not find any that fulfilled all of the above requirements. I won’t link to any, as I ultimately chose not to go down this route.

This left me to consider XSLT, the other programming language which enjoys good cross-browser support. A quick survey of XSLT indicated to me that it did support all of the above functionality.

I was pretty enthusiastic about this, as I had never used XSLT before, but had wanted to learn it for some time. Nevertheless, I had several serious concerns about targeting XSLT:

  1. How good is the cross-browser support for XSLT?
  2. I’m a complete XSLT novice. How much overhead will be required before I can begin to be productive using it?
  3. Is XSLT going to be ridiculously verbose (do I have to wrap all non-XML text in a <text/> node)?
  4. Is there good free tooling for XSLT?
  5. Another low-priority concern was that I wanted to keep down dependencies on different languages; it would be nice to focus on only one. I’m not sure about XSLT’s expressive power. Would it be possible to port the IR-Compiler component to XSLT?

To address each of these concerns in turn:

  1. There are some nice js libs that abstract out the browser differences: Sarissa, Google’s AJAXSLT.
  2. I did an initial review of XSLT. I found parts of it to be confusing (like how and when the context node changes; the difference between apply-templates with and without the select attribute; etc.), but decided the risk was low enough that I could dive in and begin experimenting with it. As it turned out, it didn’t take long before I was able to be productive with it.
  3. Text node children of an <xsl:template/> are echoed out. This is well-formed XML, but I’m not sure if it’s strictly legal XSLT. Anyhow, it works well, and looks good.
  4. This was pretty bad. The best graphical debugger I found was: KXSLdbg for KDE 3. I also tried the XSLT debugger for Eclipse Web Tools, and found it to be really lacking. In the end, though, I mostly just used <xsl:message/> nodes as printfs in development, which was really slow and awkward. This part of XSLT development could definitely use some improvement.

I’ll talk more about 5. in a second.

XSLT Port of Code Generator and IR-Compiler Components

I started to work on the XSLT port of the Code Generator component last Saturday, and had it completed by Tuesday or Wednesday. This actually turned out not to be very difficult, as I had already written my E4X templates in a very XSLT-like style: top-down, primarily using recursion and iteration. There was some procedural logic in there which needed to be broken out, so there was some refactoring to do, but this wasn’t too difficult.

When hooking everything up, though, I found another problem with E4X, which was that putting the Xalan XSLT library on the classpath caused E4X’s XML serialization to stop working correctly. Specifically, namespaced attributes would no longer be serialized correctly. This was something I used often when creating the IR, so it became evident that it would be necessary to port the IR Compiler component in this development cycle as well.

Again, I had to weigh my technology choices. This component involved some analysis, and transformation of the given SCXML document to include this extra information. For example, for every transition, the Least Common Ancestor state is computed, as well as the states exited and the states entered for that transition.
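To make the analysis concrete, here is a small JavaScript sketch of computing the Least Common Ancestor and the exited and entered states for a transition. It is not the XSLT used in SCXMLcgf/js: the state tree is hypothetical, the source is assumed to be a basic state, and history, parallel states, and initial-state descent are ignored.

```javascript
// Hypothetical state hierarchy: each state id maps to its parent (null for the root).
var parentOf = {
  root: null,
  A: "root", A1: "A", A2: "A",
  B: "root", B1: "B"
};

// Proper ancestors of a state, nearest first.
function ancestors(state) {
  var result = [];
  for (var s = parentOf[state]; s !== null; s = parentOf[s]) {
    result.push(s);
  }
  return result;
}

// The nearest proper ancestor of the source that is also a proper ancestor of the target.
function leastCommonAncestor(source, target) {
  var sourceAncestors = ancestors(source);
  var targetAncestors = ancestors(target);
  for (var i = 0; i < sourceAncestors.length; i++) {
    if (targetAncestors.indexOf(sourceAncestors[i]) !== -1) {
      return sourceAncestors[i];
    }
  }
  return null;
}

// The state itself plus its ancestors strictly below the LCA, innermost first.
function statesBelow(state, lca) {
  var path = [state];
  var up = ancestors(state);
  for (var i = 0; i < up.length && up[i] !== lca; i++) {
    path.push(up[i]);
  }
  return path;
}

// States are exited innermost first and entered outermost first.
function analyzeTransition(source, target) {
  var lca = leastCommonAncestor(source, target);
  return {
    lca: lca,
    exited: statesBelow(source, lca),
    entered: statesBelow(target, lca).reverse()
  };
}

console.log(analyzeTransition("A1", "B1"));
console.log(analyzeTransition("A1", "A2"));
```

For the transition from A1 to B1, the LCA is root, so A1 and A are exited and B and B1 are entered; for A1 to A2, the LCA is A, so only A1 is exited and only A2 is entered. Precomputing these lists per transition is what lets the generated code avoid walking the hierarchy at run time.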

I was doubtful that XSLT would be able to do this work, or that I would have sufficient skill to program it, so I initially began porting this component to just use DOM for transformation, and XPath for querying. However, this quickly proved not to be a productive approach, and I decided to try to use XSLT instead. I don’t have too much to say about this, except to observe that, even though development was often painful due to the lack of a good graphical debugger, it was ultimately successful, and the resulting code doesn’t look too bad. In most cases, I think it’s quite readable and elegant, and I think it will not be difficult to maintain.

Updating the Front End

The last thing I needed to do, then, was update the Front End to match these changes. At this point, I was in the interesting situation of having all of my business logic implemented in XSLT. I really enjoyed the idea of having a very thin front-end, so something like:

xsltproc xslt/normalizeInitialStates.xsl $1 | \
xsltproc xslt/generateUniqueStateIds.xsl - | \
xsltproc xslt/splitTransitionTargets.xsl - | \
xsltproc xslt/changeTransitionsPointingToCompoundStatesToPointToInitialStates.xsl - | \
xsltproc xslt/computeLCA.xsl - | \
xsltproc xslt/transformIf.xsl - | \
xsltproc xslt/appendStateInformation.xsl - | \
xsltproc xslt/appendBasicStateInformation.xsl - | \
xsltproc xslt/appendTransitionInformation.xsl - | \
xsltproc xslt/StatePatternStatechartGenerator.xsl - | \
xmlindent > out.js

There would be a bit more to it than that, as there would need to be some logic for command-line parsing, but this would also mostly eliminate the Rhino dependency in my project (mostly because the code still uses js_beautify as a JavaScript code beautifier, and the build and performance analysis systems are still written in JavaScript). This approach also makes it very clear where the main programming logic is now located.
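The same chain can be expressed in a JavaScript front end as a loop over an ordered list of stylesheets. In the sketch below, the stylesheet names are taken from the pipeline above, while runPipeline and applyStylesheet are hypothetical names; applyStylesheet stands in for whatever XSLT API the platform provides, and a fake version is used here so the example runs on its own.

```javascript
// Stylesheets in the order in which the pipeline above applies them.
var stylesheets = [
  "normalizeInitialStates.xsl",
  "generateUniqueStateIds.xsl",
  "splitTransitionTargets.xsl",
  "changeTransitionsPointingToCompoundStatesToPointToInitialStates.xsl",
  "computeLCA.xsl",
  "transformIf.xsl",
  "appendStateInformation.xsl",
  "appendBasicStateInformation.xsl",
  "appendTransitionInformation.xsl",
  "StatePatternStatechartGenerator.xsl"
];

// Feed the output of each transformation into the next one.
function runPipeline(input, stylesheetNames, applyStylesheet) {
  var result = input;
  for (var i = 0; i < stylesheetNames.length; i++) {
    result = applyStylesheet(stylesheetNames[i], result);
  }
  return result;
}

// Fake transform for demonstration: it only records which stylesheets ran, in order.
function fakeApply(stylesheetName, documentSoFar) {
  return documentSoFar.concat([stylesheetName]);
}

var applied = runPipeline([], stylesheets, fakeApply);
console.log(applied.length + " stylesheets applied, ending with " + applied[applied.length - 1]);
```

Keeping the order in a plain list makes it easy to see, and to change, which passes build the IR and which pass generates code.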

In the interest of saving time, however, I decided to continue to use Rhino for the front end, and use SAX Java APIs for processing the XSLT transformations. I’m not terribly happy with these APIs, and I think Rhino may be making the system perceptibly slower, so I’ll probably move to the thin front end at some point. But right now this approach works and passes all unit tests, and so I’m fairly happy with it.

Future Work

I’m not planning to check this work into the Apache SVN repository until I finish porting the other backends, clean things up, and re-figure out the project structure. I’ve been using git and git-svn for version control, though, which has been useful and interesting (this may be the subject of another blog post). After that, I’ll be back onto the regular schedule of implementing modules described in the SCXML specification.

This work is licensed under GPL - 2009