MODELLING AND SIMULATION, WEB ENGINEERING, USER INTERFACES
October 7th, 2012

New projects: scxml-viz, scion-shell, and scion-web-simulation-environment

I have released three new projects under an Apache 2.0 license:
  • scxml-viz: A library for visualizing SCXML documents.
  • scion-shell: A simple shell environment for the SCION SCXML interpreter. It accepts SCXML events via stdin, and thus can be used to integrate SCXML with Unix shell programming. It integrates scxml-viz, so it can also provide graphical simulation of SCXML models.
  • scion-web-simulation-environment: A simple proof-of-concept web sandbox environment for developing SCXML. Code can be entered on the left, and visualized on the right. Furthermore, SCION is integrated, so code can be simulated and graphically animated. A demo can be found here: http://goo.gl/wG5cq
July 29th, 2012

Syracuse Student Sandbox Hackathon Recap

Yesterday I participated in a hackathon at the Syracuse Student Sandbox. This blog post is meant to provide a quick recap of the interesting technical contributions that came out of this event.

All source code mentioned in this article is available on Github.

What I Did

My project idea was to develop a voice menu interface to the Archive.org live music archive using Twilio. The idea was that you would call a particular phone number, and be presented with a voice menu interface. There would be options to listen to the Archive.org top music pick, or to perform a search.

Core Technology

Archive.org

Archive.org exposes a very nice, hacker-friendly API. It is fairly well-documented here. I only encountered a few gotchas: the API for the main page does not return valid JSON, so its response must be parsed using JavaScript’s eval; and the query API is based on Lucene query syntax, which I did not find documented anywhere.
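
To illustrate the first gotcha, here is a minimal sketch of fetching and parsing such a response in Node.js; the endpoint path and response shape are hypothetical stand-ins:

var http = require('http');

// Fetch a hypothetical archive.org endpoint whose response is almost,
// but not quite, valid JSON.
http.get({ host: 'archive.org', path: '/some/endpoint' }, function (res) {
	var body = '';
	res.on('data', function (chunk) { body += chunk; });
	res.on('end', function () {
		// JSON.parse would reject this response, so evaluate it as a
		// JavaScript expression instead.
		var data = eval('(' + body + ')');
		console.log(data);
	});
});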

Twilio

Developing a Twilio telephony application is just like developing a regular web application. When you register with Twilio, they assign you a phone number, which you can then point to a web server URL. When someone calls the number, Twilio performs an HTTP request (either GET or POST, depending on how you have it configured) to the server which you specified.

Instead of returning HTML, you return TwiML. Each tag in a TwiML document is a verb which tells Twilio what to do. TwiML documents can be modelled as state machines, in that there’s a particular flow between elements. For certain tags, Twilio will simply flow to the next tag after performing the action associated with that tag; however, for other tags, Twilio will perform a request (again, either GET or POST) to a URL specified by the tag’s “action” attribute, and will execute the TwiML document returned by that request. This is analogous to submitting a form in HTML.

Each HTTP request performed by Twilio will submit some data, like the caller’s phone number and location, as well as a variable which allows the server to track the session.

There were a few instances of undocumented behaviour that I encountered, but overall developing a TwiML application was as easy as it sounds. After I had my node.js hosting set up, I had an initial demo working in less than an hour, in which the user could call in, and would be able to hear the archive.org live music pick. This was simply a matter of using Archive.org’s API to retrieve the URL to the file of the top live music pick, and passing this URL to Twilio in a <Play> element. Twilio was then able to stream the MP3 file directly from Archive.org.
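
That initial demo, in sketch form (getTopPickUrl is a hypothetical helper standing in for the archive.org query described above; the TwiML <Response> and <Play> elements are real):

var http = require('http');

http.createServer(function (req, res) {
	// getTopPickUrl is a hypothetical helper that queries the archive.org
	// API and calls back with the URL of the top live music pick's MP3.
	getTopPickUrl(function (mp3Url) {
		// Return TwiML instead of HTML; Twilio streams the MP3 directly.
		res.writeHead(200, { 'Content-Type': 'text/xml' });
		res.end('<?xml version="1.0" encoding="UTF-8"?>' +
			'<Response><Play>' + mp3Url + '</Play></Response>');
	});
}).listen(1337);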

Main Technical Contribution: Using SCXML and SCION to Model Navigation in a Node.js Web Application

I developed the application using Node.js and SCION, an SCXML/Statecharts interpreter library I’ve been working on. In addition to providing a very small module for querying the archive.org API using Node.js, I feel the main technical contribution of this project was using SCXML to model web navigation, and I will elaborate on that contribution in this section.

Using Statecharts to model web navigation is not a new idea (see StateWebCharts, for example); however, I believe this is the first time this technique has been used in conjunction with Node.js.

At a high level, SCXML can be used to describe the possible flows between pages in a Web application. SCXML allows one to model these flows explicitly, so that every possible session state and the transitions between session states are well-defined. Another way to describe this is that SCXML can be used to implement routing which changes depending on session state.

A web server accepts an HTTP request as input and asynchronously returns an HTTP response as output. Each HTTP request can contain parameters, encoded as query parameters on the URL in the case of a GET request, or as POST data for a POST request. These parameters can contain data that allows the server to map the HTTP request to a particular session, as well as other data submitted by the user.

These inputs to the web server were mapped to SCXML in the following way. First, an SCXML session was created for each HTTP session, such that subsequent HTTP requests would be dispatched to that one SCXML session, which maintained all of the session state.

Each HTTP request was turned into an SCXML event and dispatched as input to the SCXML session corresponding to the session of that HTTP request. An SCXML event has “name” and “data” properties. The URL of the request was used as the event name, and the parsed query parameters were used as the event data. Furthermore, the Node.js HTTP request and response objects were also included as event data.
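
A minimal sketch of this mapping, assuming a SCION-like statechart instance exposing a gen() method for event dispatch (the session lookup helper is hypothetical):

var http = require('http'),
	url = require('url');

http.createServer(function (req, res) {
	var parsed = url.parse(req.url, true);

	// Hypothetical helper: look up (or create) the SCXML session
	// corresponding to this HTTP session.
	var sc = getScxmlSessionFor(req);

	// The URL path becomes the event name; the parsed query parameters,
	// plus the request and response objects, become the event data.
	sc.gen({
		name: parsed.pathname,
		data: {
			params: parsed.query,
			request: req,
			response: res
		}
	});
}).listen(1337);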

In this implementation, SCXML states were mapped to individual web pages, which were returned to the user on the HTTP response.

The SCXML document modelling navigation can be found here. Here is a graphical rendering of it (automatically generated using scxmlgui):

Statecharts Diagram

<?xml version="1.0" encoding="UTF-8"?>
<scxml 
	xmlns="http://www.w3.org/2005/07/scxml"
	version="1.0"
	profile="ecmascript">

    <datamodel>
        <data id="serverUrl" expr="'http://jacobbeard.net:1337'"/>
        <data id="api"/>
    </datamodel>

    <script src="./playPick.js"/>
    <script src="./performSearch.js"/>

    <state id="initial_default">
        <transition event="init" target="waiting_for_initial_request">
            <assign location="api" expr="_event.data"/>
        </transition>
    </state>

    <state id="waiting_for_initial_request">
        <transition target="root_menu" event="/"/>
    </state>

    <state id="root_menu">
        <onentry>
            <log label="entering root_menu" expr="_events"/>

            <!-- we want to send this as a response. hack SCION so we can do that somehow -->
            <Response>
                <Gather numDigits="1" action="number_received" method="GET">
                    <Say>Root Menu</Say>
                    <Say>Press 1 to listen to the archive dot org live music pick. Press 2 to search the archive dot org live music archive.</Say>
                </Gather>
            </Response>
        </onentry>

        <transition target="playing_pick" event="/number_received" cond="_event.data.params.Digits === '1'"/>
        <transition target="searching" event="/number_received" cond="_event.data.params.Digits === '2'"/>

        <!-- anything else - catchall error condition -->
        <transition target="root_menu" event="*">
            <Response>
                <Gather numDigits="1" action="number_received" method="GET">
                    <Say>I did not understand your response.</Say>
                    <Say>Press 1 to listen to the archive dot org live music pick. Press 2 to search the archive dot org live music archive.</Say>
                </Gather>
            </Response>
        </transition>
    </state>

    <state id="playing_pick">
        <!-- TODO: move the logic in playPack into SCXML -->
        <onentry>
            <log label="entering playing_pick"/>
            <script>
                playPick(_event.data.response,api);
            </script>
        </onentry>

        <!-- whatever we do, just return -->
        <transition target="root_menu" event="*"/>
    </state>

    <state id="searching">
        <datamodel>
            <data id="searchNumber"/>
            <data id="searchTerm"/>
        </datamodel>

        <onentry>
            <log label="entering searching"/>
            <Response>
                <Gather numDigits="1" action="number_received" finishOnKey="*"  method="GET">
                    <Say>Press 1 to search for an artist. Press 2 to search for a title.</Say>
                </Gather>
                <Redirect method="GET">/</Redirect>
            </Response>

        </onentry>

        <transition target="receiving_search_input" event="/number_received" cond="_event.data.params.Digits === '1' || _event.data.params.Digits === '2'"> 
            <assign location="searchNumber" expr="_event.data.params.Digits"/>
        </transition>
        <transition target="root_menu" event="/"/> 
        <transition target="bad_search_number" event="*"/> 
    </state>

    <state id="receiving_search_input">
        <onentry>
            <Response>
                <Gather numDigits="3" action="number_received" method="GET">
                    <Say>Press the first three digits of the name to search for.</Say>
                </Gather>
                <Redirect method="GET">/</Redirect>
            </Response>

        </onentry>

        <transition target="performing_search" event="/number_received" cond="_event.data.params.Digits"> 
            <assign location="searchTerm" expr="_event.data.params.Digits"/>
        </transition>
        <transition target="bad_search_number" event="/number_received"/> 
        <transition target="root_menu" event="*"/> 
        
    </state>

    <state id="performing_search">
        <onentry>
            <script>
                performSearch(searchNumber,searchTerm,_event.data.response,api);
            </script>
        </onentry>
        
        <transition target="searching" event="/search-complete" />
        <transition target="searching" event="/artist-not-found" />
        <transition target="root_menu" event="*" />
    </state>

    <state id="bad_search_number">
        <onentry>
            <Response>
                <Say>I didn't understand the number you entered.</Say>
                <Redirect method="GET">/</Redirect>
            </Response>

        </onentry>

        <transition target="searching" event="/"/> 
    
    </state>

</scxml>

Note that the transition conditions do not appear in the above diagram, so I would recommend reading the SCXML document as well as the diagram.

In this model, the statechart starts in an initial_default state in which it waits for an init event. The init event is used to pass platform-specific APIs into the state machine. After receiving the init event, the statechart will transition to state waiting_for_initial_request, where it will wait for an initial request to url “/”. After receiving this request, it will transition to state root_menu. Of particular interest here are the actions in the <onentry> tag. The TwiML document to be returned to the user is inlined directly as a custom action within <onentry>, and is executed by the interpreter by writing that document to the node.js response object’s output stream. This document will tell Twilio to wait for the user to press a single digit, and to submit a GET request to URL “/number_received” once a digit has been entered.

There are three transitions originating from root_menu. The first targets state playing_pick, the second targets state searching, and the third loops back to state root_menu. The first two transitions have a cond attribute, which is used to inspect the data sent with the request. So, for example, if the user presses “1”, Twilio would submit a GET request to URL “/number_received?Digits=1” (along with other URL parameters, which I have omitted for simplicity). This would be transformed into the SCXML event {name : '/number_received', data : { params : { Digits : '1' } }}, which would then activate the transition to playing_pick. The system would then transition to playing_pick, which would call a JavaScript function that would query the Archive.org API to retrieve the URL to Archive.org’s top song pick, and would output a TwiML document on the HTTP response object containing the URL to that song.

If the user pressed a “2” instead of a “1”, then the cond attribute would cause the statechart to activate the transition to state searching instead of playing_pick. If the user pressed anything else, or attempted to navigate to any other URL, then the wildcard “*” event on the third transition would simply cause the statechart to loop back to root_menu.

The rest of the application is implemented in a similar fashion.

Comments and Critiques

While I feel this effort was overall successful, and demonstrates a technique that could be used to develop larger and more complex applications, there are ways I would like to improve it.

First, while I feel that being able to inline the response as custom action code in the entry action of a state is a rather elegant approach, it would be useful to make the inline XML templated so that it can use data from the system’s datamodel.

Second, there’s a disconnect between the action specified in the returned document (the URL to which the document will be submitted), and the transitions originating from the state corresponding to that document. For example, it would be possible to return a document containing a form with action attribute “foo”, yet have a transition originating from that state with event /bar. This may not be a desirable behaviour, as there’s no legal way for the returned web page to submit to URL “/bar”. The action attribute on the returned form can be understood as specifying the SCXML events that the page will be able to generate, or the possible flow between pages within the web application, and so it might be better to somehow model the connection between returned form actions and transition events more explicitly.

Third, there are several features of SCXML that this demo did not make use of, including state hierarchy, parallel and history states. Uses for these features may emerge in the development of a more complex web application.

Fourth, there is currently quite a lot of logic in the action code called by the SCXML document. This includes multiple levels of asynchronous callbacks. This is not an ideal approach, as it means that even after an SCXML macrostep has ended, a callback within the action code executed during that macrostep may be asynchronously invoked by the environment. I feel this breaks SCION’s interpretation of Statecharts semantics, and may lead to unexpected behaviour. A better approach would be to feed the result of each asynchronous callback back into the state machine as an event, and use a sequence of intermediate states to model the flow between callbacks.
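
Sketched concretely, and assuming a callback-style variant of performSearch plus a gen()-style dispatch method on the statechart instance sc, the action code would hand the result straight back to the statechart:

// Kick off the asynchronous search, and map its outcome onto SCXML events
// rather than doing further work in nested callbacks.
performSearch(searchNumber, searchTerm, function (err, results) {
	if (err) {
		sc.gen({ name: '/artist-not-found', data: { error: err } });
	} else {
		sc.gen({ name: '/search-complete', data: { results: results } });
	}
});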

Fifth, and finally, I had a few technical difficulties with SCION, in that node.js’s require function was not working correctly in embedded action code. I worked around this by passing the required APIs around as a single object between functions in action code. I fixed this issue in SCION today.

Conclusion

The finished application can be demoed by calling (315) 254-2188. I’m going to leave it up until the account runs out of money, so feel free to try it.

I had a great time at the Hackathon, and I feel my participation was productive on multiple levels. I’m looking forward to further researching how SCXML and SCION can be applied to web application development.

November 30th, 2011

Master Thesis Mini-Update: Initial Release of SCION

I just wanted to quickly announce the release of SCION, a project to develop an SCXML interpreter/compiler framework suitable for use on the Web, and the successor to SCXML-JS.

The project page is here: https://github.com/jbeard4/SCION
Documentation, including demos, may be found here: http://jbeard4.github.com/SCION/

I welcome your feedback.

September 29th, 2010

scxml-js Build Adventures

I presented my work on scxml-js at the SVG Open 2010 conference at the end of August. My hope was that I would be able to have a release prepared by this time, to encourage adoption among interested developers. However, I soon discovered that there was some process overhead involved in preparing releases in Apache Commons. Currently, scxml-js is a Commons Sandbox project, and Sandbox projects are not allowed under any circumstances to publish releases. In order to publish a release, scxml-js would need to be “promoted” to Commons Proper, which would require a vote on the Commons mailing list. In order to pass a vote, it seemed likely that the scxml-js build system would need to be overhauled to use Maven, so as to be able to reuse the Maven parent pom, and thus inherit all of the regulated, well-designed build infrastructure shared by all projects at Apache Commons.

I had originally allocated two weeks to this task, from the end of Google Summer of Code, to the start of the SVG Open conference, but in fact I ended up spending over a month working on just the build infrastructure. I think some interesting new techniques emerged out of this work.

First, a description of what I was migrating from: a single custom build file written in JavaScript and designed to be run under Rhino. The reasoning behind this technique was that JavaScript is quite a nice scripting language, and very useful for many tasks, including writing build scripts; because Rhino can use the RequireJS module system and the dojo.doh unit testing framework natively, writing a custom build script for it seemed to be the fastest, easiest way to perform automated tasks related to unit and performance testing of scxml-js. What it was not useful for, however, was setting up a Java classpath and calling java or javac (it also seemed like too much of an investment to perform a proper topological sort of dependencies between build targets). In the beginning of the project, using java and javac was not needed, as scxml-js would always be run in interpreted mode on the command line. As time went on, however, I wanted to use Rhino’s jsc utility to compile scxml-js to optimized Java bytecode, in order to improve performance as well as provide a standalone executable JAR for easy deployment. To solve this problem, I began to use Ant, which of course has very good integration with tasks relating to Java compilation.

Compiling JavaScript to Java Bytecode with Ant

Invoking jsc using Ant is actually pretty easy. The only complication arises if you have dependencies between your scripts (e.g. using Rhino’s built-in load() function), as jsc will not catch these. What is required is to preprocess your scripts so that all js dependencies are included in a single file, and then to run jsc on that built file. If you’re just using load() to import script dependencies, this can be difficult to accomplish. If you’re using RequireJS, however, then you can make use of its included build script, which does precisely what I described, in that it seeks out module dependencies and includes them in one giant file. It can also include the RequireJS library itself in the file, as well as substitute text (or XML) file dependencies as inline strings, so the end result is that all dependencies are included in this single file. Compilation of scxml-js to Java bytecode is then a two-step process: calling the RequireJS build script to create a single large file that includes all dependencies, and calling jsc on the built file to compile it to bytecode. This will produce a single executable class file. Here’s a snippet to illustrate how this works:


<!-- this is the path to a front-end module that accepts command-line arguments and passes them into the main module -->
<property name="build-js-main-rhino-frontend-module" value="${src}/javascript/scxml/cgf/build/rhino"/>

<!-- RequireJS build script stuff -->
<property name="js-build-script" location="${lib-js}/requirejs/build/build.js"/>
<property name="js-build-dir" location="${lib-js}/requirejs/build"/>

<!-- include a reference to the closure library bundled with the RequireJS distribution -->
<path id="closure-classpath" location="${lib-js}/requirejs/build/lib/closure/compiler.jar"/>

<!-- jsc stuff -->
<property name="build-js-main" location="${build-js}/main-built.js"/>
<property name="build-class-main-name" value="SCXMLCompiler"/>
<property name="build-class-main" location="${build-class}/${build-class-main-name}.class"/>

<target name="compile-single-js">
	<mkdir dir="${build-js}"/>

	<java classname="org.mozilla.javascript.tools.shell.Main">
		<classpath>
			<path refid="rhino-classpath"/>
			<path refid="closure-classpath"/>
		</classpath>
		<arg value="${js-build-script}"/>
		<arg value="${js-build-dir}"/>
		<arg value="name=${build-js-main-rhino-frontend-module}"/>
		<arg value="out=${build-js-main}"/>
		<arg value="baseUrl=."/>
		<arg value="includeRequire=true"/>
		<arg value="inlineText=true"/>
		<arg value="optimize=none"/>
	</java>
</target>

<target name="compile-single-class" depends="compile-single-js">
	<mkdir dir="${build-class}"/>

	<!-- TODO: parameterize optimization level -->
	<java classname="org.mozilla.javascript.tools.jsc.Main">
		<classpath>
			<path refid="maven.plugin.classpath"/>
		</classpath>
		<arg value="-opt"/>
		<arg value="9"/>
		<arg value="-o"/>
		<arg value="${build-class-main-name}.class"/>
		<arg value="${build-js-main}"/>
	</java>
	<move file="${build-js}/${build-class-main-name}.class" todir="${build-class}"/>
</target>

// This is the module referenced by property "build-js-main-rhino-frontend-module".
// It accepts command-line arguments and passes them into the main module
(function(args){
	require(
		["src/javascript/scxml/cgf/main"],
		function(main){
			main(args);
		}
	);
})(Array.prototype.slice.call(arguments));

All of this was not too difficult to set up, and allowed me to accomplish my goal of building a single class file for the scxml-js project.

Importing JavaScript Modules with RequireJS in Ant

Rather than maintain two build scripts, it was desirable to move the unit testing functionality from the Rhino build script into Ant. As I had already put a significant amount of time into developing the Rhino build script, I wanted to reuse this code directly in Ant. This seemed possible, as Ant already provides good integration with Rhino and other scripting languages via its script tag, and either the JSR-223 or the Bean Scripting Framework (BSF) API. Unfortunately, however, when run under Ant, Rhino does not expose any properties on the global object; this means that by default load() and readFile() are not available, making it virtually impossible to import code from other files, and thus impossible to directly import RequireJS modules in an Ant script. However, I sought help on the Rhino mailing list, and a convenient workaround was developed. The following code should be included at the beginning of every script tag, and the manager attribute on the script tag should be set to “bsf”:

var shell = org.mozilla.javascript.tools.shell.Main;
var args = ["-e","var a='STRING';"];
shell.exec(args);

var shellGlobal = shell.global;

//grab functions from shell global and place in current global
var load=shellGlobal.load;
var print=shellGlobal.print;
var defineClass=shellGlobal.defineClass;
var deserialize=shellGlobal.deserialize;
var doctest=shellGlobal.doctest;
var gc=shellGlobal.gc;
var help=shellGlobal.help;
var loadClass=shellGlobal.loadClass;
var quit=shellGlobal.quit;
var readFile=shellGlobal.readFile;
var readUrl=shellGlobal.readUrl;
var runCommand=shellGlobal.runCommand;
var seal=shellGlobal.seal;
var serialize=shellGlobal.serialize;
var spawn=shellGlobal.spawn;
var sync=shellGlobal.sync;
var toint32=shellGlobal.toint32;
var version=shellGlobal.version;
var environment=shellGlobal.environment;

Although, now that I’m reading this again, this also isn’t quite right: while these variables are being defined in the global namespace, technically they are not being added to the global object… but, in any case, this has not proven to be problematic.

Because this is a verbose declaration, and we like code reuse, I defined a macro called “rhinoscript” to abstract it out:


<macrodef name="rhinoscript">
	<text name="text"/>

	<sequential>
		<script language="javascript" manager="bsf">
			<classpath>
				<path refid="maven.plugin.classpath"/>
			</classpath><![CDATA[
				var shell = org.mozilla.javascript.tools.shell.Main;
				var args = ["-e","var a='STRING';"];
				shell.exec(args);

				var shellGlobal = shell.global;

				//grab functions from shell global and place in current global
				var load=shellGlobal.load;
				//import everything else...

				@{text}
		]]></script>
	</sequential>
</macrodef>

<!-- example call -->
<target name="test-call">
	<rhinoscript><![CDATA[
		load("foo.js");
		print("Hello World!");
	]]></rhinoscript>
</target>

This then allowed RequireJS modules to be imported and reused directly, as in the original Rhino build script. The only caveat is that, rather than using nice JavaScript data structures (Arrays, Objects, etc.) to store build-related properties, it was necessary to use Ant data structures (properties, paths, etc.) instead. Here’s the final result of the Ant task that uses RequireJS and dojo.doh to run unit tests:

<target name="run-unit-tests-with-rhino" depends="setup-properties">
	<rhinoscript><![CDATA[
		//load requirejs
		Array.prototype.slice.call(requirejs_bootstrap_paths.list()).forEach(function(requireJsPath){
			load(requireJsPath);
		});

		//this is a bit weird, but we define this here in case we need to load dojo later using the RequireJS loader
		djConfig = {
			"baseUrl" : path_to_dojo_base+"/"
		}

		function tailRecurse(list,stepCallback,baseCaseCallback){
			var target = list.pop();

			if(target){
				stepCallback(target,
					function(){tailRecurse(list,stepCallback,baseCaseCallback)});
			}else{
				if(baseCaseCallback) baseCaseCallback();
			}
		}

		var isComplete = false;

		require(
			{baseUrl:basedir},
			[path_to_dojo,
				"lib/test-js/env.js",
				"test/testHelpers.js"],
			function(){

				dojo.require("doh.runner");

				var forIE = "is-for-ie";
				var scxmlXmlTestPathList = Array.prototype.slice.call(scxml_tests_xml.list());
				var backendsList = backends.split(",");

				print("backendsList : " + backendsList);
				print("backendsList.length : " + backendsList.length);

				var oldDohOnEnd = doh._onEnd;
				doh._onEnd = function() { isComplete = true; oldDohOnEnd.apply(doh); };

				//we use tailRecurse function because of asynchronous RequireJS call used to load the unit test module
				tailRecurse(scxmlXmlTestPathList,
					function(scxmlXmlTestPath,step){
						var jsUnitTestPathPropertyName = scxmlXmlTestPath + "-" + "unit-test-js-module";
						var jsUnitTestPath = project.getProperty(jsUnitTestPathPropertyName);

						require([jsUnitTestPath],
							function(unitTestModule){

								backendsList.forEach(function(backend){
									var jsTargetTestPathPropertyName =
										forIE + "-" + backend + "-" + scxmlXmlTestPath + "-" + "target-test-path";

									var jsTargetTestPath = project.getProperty(jsTargetTestPathPropertyName);

									print("jsTargetTestPathPropertyName : " + jsTargetTestPathPropertyName);
									print("jsTargetTestPath  : " + jsTargetTestPath);

									//load and register
									load(jsTargetTestPath);

									unitTestModule.register(StatechartExecutionContext)
									delete StatechartExecutionContext;
								});

								step();
							});
					},
					function(){
						//run with dojo
						doh.run();
					}
				);

			}
		);

		//hold up execution until doh completes
		while(!isComplete){
			java.lang.Thread.sleep(20);
		}

	]]></rhinoscript>
</target>

I think this is kind of nice because, if you look at newer build tools like Rake or Jake, the big advantage that they give you is the ability to use a real programming language (as opposed to a build Domain Specific Language, like Ant), while at the same time providing facilities for defining build targets with dependencies, and topologically sorting them when they are invoked. Ant still has many advantages, however, including great support in existing continuous integration systems. The approach I have described seems to marry the advantages of using Ant with those of using your preferred scripting language.

Integrating Ant with Maven

At this point, I had brought over most of the existing functionality from the Rhino build script into Ant, and I was beginning to look at ways to then hook into Maven. While I had some previous experience working with Ant, I had never before worked with Maven, and so there was a learning curve. The goal was to hook into the existing Apache Commons Maven-based build infrastructure, while at the same time trying to reuse existing code.

While this part was non-trivial to develop, it is actually the least interesting part of the process to me, and I think the least relevant to this blog (it doesn’t have much to do with JavaScript or Open Web technologies), so I’m only going to briefly describe it. The build architecture is currently as follows:

I felt it was important to maintain both an Ant front-end and Maven front-end to the build, as each has advantages for certain tasks. Common functionality is imported from build-common.xml. Both the Maven (pom.xml) and Ant (build.xml) front-ends delegate to mvn-ant-build.xml, which contains most of the core tasks without the dependencies between targets.

Based on my experience on the Maven mailing list, if you are a “Maven person” (a person who has “drunk the Maven kool-aid” – not my words, Maven people seem to like to use this phrase), then this architecture built around delegation to Ant will likely make you cry. It will seem needlessly complex, when the alternative of creating a set of custom Maven plugins would seem much better. This might be the case, and I proposed investigating this option. The problem, however, seems to be that relying on custom Maven plugins for building is a no-go for Commons projects (with the exception of the Maven Commons plugin), as it is uncertain where these plugins would be hosted. However, building a Maven plugin for the process of compiling JavaScript to Java bytecode using the RequireJS framework, as outlined above, is, I think, something that has value, and which I would like to pursue at some point.

Future Work

I still have not put scxml-js forward for a vote, and even though the refactoring of the build system is more or less complete, I still may not do so. I have just arrived in Belgium where I will be working on my Master’s thesis for three months, and so I may need to deprioritize my work on scxml-js while I prioritize researching the theoretical aspects of my thesis. Also, now that SVG Open has passed, there seems to be less incentive to publish an alpha release. It may be better to give scxml-js more time to mature, and then release later on.

August 16th, 2010

Google Summer of Code 2010, Final Update

The pencils-down date for Google Summer of Code 2010 is right now. Here’s a quick overview of what I feel I have contributed to scxml-js thus far, and what I feel should be done in the future.

Tests and Testing Framework

Critical to the development of scxml-js was the creation of a robust testing framework. scxml-js was written using a test-first development style, which is to say that before adding any new feature, I would attempt to map out the implications of that feature, including all possible edge cases, and would then write tests for success, failure, and sanity. By automating these tests, it was possible to avoid regressions when new features were added, and thus maintain robustness as the codebase became more complex.

Testing scxml-js posed an interesting challenge with respect to automated testing, as it was necessary to test both the generated target code (using ahead-of-time compilation) and the compiler itself (using just-in-time compilation), running in all the major web browsers, as well as on the JVM under Rhino. This represented many usage contexts, and so a great deal of complexity was bundled into the resulting build script.

The tests written usually conformed to a general format: a given SCXML input file would be compiled and instantiated, and a script would send events into the compiled statechart while asserting that the state had updated correctly. A custom build script, written in JavaScript, automated the process of compiling and running test cases, starting and stopping web browsers, and harvesting results. dojo.doh and Selenium RC were used in the testing framework.
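
In outline, an individual test looked something like the following sketch; the compile, fixture, and assert helpers here are illustrative, not the actual harness API:

// Compile an SCXML fixture, instantiate it, then drive it with events
// while asserting that the configuration updates correctly.
var StatechartClass = compile(loadScxmlFixture('basic1.scxml')); // hypothetical helpers
var sc = new StatechartClass();

sc.initialize();
assert(sc.getConfiguration()[0] === 'a'); // starts in state 'a'

sc.gen('t'); // send event 't'
assert(sc.getConfiguration()[0] === 'b'); // the transition was taken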

Going Forward

It would be useful to phase out the custom JavaScript build script for a more standard build tool, such as Maven or Ant. This may be challenging, however, given the number of usage contexts of the scxml-js compiler, as well as the fact that the API it exposes is asynchronous.

Another task I’d like to perform is to take the tests written for Commons SCXML and port them so that they can be used in scxml-js.

Finally, I have often noticed strange behaviour with Selenium. At the moment, when run under Selenium, tests for in-browser compilation are broken under Internet Explorer; however, when run manually, they always pass. I’ve traced where the tests are failing, and it’s a strange and intermittent failure involving parsing an XML document. I think it may be caused by the way that Selenium instruments the code in the page. I feel it may be worthwhile to investigate alternatives to Selenium.

scxml-js Compiler

This page provides an overview of which features work right now, and which do not.

In general, I think scxml-js is probably stable enough to use in many contexts. Unfortunately, scxml-js has had only one user, and that has been me. I’m certain that when other developers do begin using it, they will break it and find lots of bugs.

I’m hoping to prepare a pre-alpha release to coincide with the SVG Open 2010 conference at the end of the month, and in preparation for this, I’m reaching out to people I know to ask them to attempt to use scxml-js in a non-trivial project. This will help me find bugs before I attempt to release scxml-js for general consumption.

Going Forward

There are still edge cases which I have in mind that need to be tested. For example, I haven’t done much testing of nested parallel states.

I also have further performance optimizations which I’d like to implement. For example, I’ve been using JavaScript 1.6 functional Array prototype extensions (e.g. map, filter, and forEach) in the generated code, and augmenting Array.prototype for compatibility with Internet Explorer. However, these methods are often slower than using a regular for loop, especially in IE, and so it would be good to swap them out for regular for loops in the target code.
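
For example (a hypothetical fragment of generated code):

// Generated code using the JavaScript 1.6 Array extensions:
statesToEnter.forEach(function (state) {
	state.onEntry();
});

// The same logic as a plain for loop, which is typically faster,
// especially in IE:
for (var i = 0; i < statesToEnter.length; i++) {
	statesToEnter[i].onEntry();
}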

Another performance enhancement would be to encode the statechart’s current configuration as a single scalar state variable, rather than encoding it as an array of basic state variables, for statecharts that do not contain parallel states. This would reduce the time required to dispatch events for these types of statecharts, as the statechart instance would no longer need to iterate through each state of the current configuration, thus removing the overhead of the for loop.
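
Sketched as hypothetical generated code, event dispatch then reduces to a direct branch on a single variable:

// Without parallel states, the configuration is always a single basic
// state, so it can be stored as a scalar rather than an array.
var currentState = 'a';

function dispatchEvent(event) {
	switch (currentState) {
		case 'a':
			if (event.name === 't') { currentState = 'b'; }
			break;
		case 'b':
			if (event.name === 't2') { currentState = 'c'; }
			break;
		// ...one case per basic state
	}
}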

I’m sure that once outside developers begin to look at the code, they will have lots of ideas on how to improve performance as well.

There are other interesting parts of the project that still need to be investigated, including exploring the best way to integrate scxml-js with existing JavaScript toolkits, such as jQuery UI and Dojo.

Graph Layout, Visualization, and Listener API

As I stated in the initial project proposal, one of my goals for GSoC was to create a tool that would take an SCXML document, and generate a graphical representation of that document. By targeting SVG, this graphical representation could then be scripted. By attaching a listener to a statechart instance, the SVG document could then be animated in response to state changes.

I was able to accomplish this by porting several graph layout algorithms written by Denis Dube for his Master’s thesis at the McGill University Modelling, Simulation and Design Lab. Denis was kind enough to license his implementations for release in ASF projects under the Apache License. You can see a demo of some of this work here.

Going Forward

The intention behind this work was to create a tool that would facilitate graphical debugging of statecharts in the web browser. While this is currently possible, it still requires “glue code” to be manually written to generate a graphical representation from an SCXML document, and then hook up the listener. I would like to make this process easier and more automatic. I feel it should operate similarly to other compilers, in that the compiler should optionally include debugging symbols in the generated code which allow it to map to a “concrete syntax” (textual or graphical) representation.

Another issue that needs to be resolved is cross-browser compatibility. It’s currently possible to generate SVG in Firefox and Batik, but there are known issues in Chromium and Opera.

Also, there are several more graph layout algorithms implemented by Denis which I have not yet ported. I’d really like to see this happen.

Finally, my initial inquiries on the svg-developers mailing list indicated that this work would be useful for other projects. I therefore feel that these JavaScript graph layout implementations should be moved into a portable library. Also, rather than generating a graphical representation directly from SCXML, it should be possible to generate a graphical representation from a more neutral markup format for describing graphs, such as GraphML.

Demos

I have written some nice demos that illustrate the various aspects of scxml-js, including how it may be used in the development of rich, Web-based user interfaces. The most interesting and complex examples are the Drawing Tool Demos, which implement a subset of Inkscape’s UI behaviour. The first demo uses scxml-js with a just-in-time compilation technique; the second uses ahead-of-time compilation; and the third uses just-in-time compilation, and generates a graphical representation on the fly, which it then animates in response to UI events. This last demo only works well in Firefox right now, but shows what should be possible going forward.

I have several other ideas for demos, which I will attempt to implement before the SVG Open conference.

Documentation

The main sources of documentation now are the User Guide, the source code for the demos, and Section 5 of my SVG Open paper submission on scxml-js.

Conclusion

This has been an exciting and engaging project to work on, and I’m extremely grateful to Google, the Apache Software Foundation, and my mentor Rahul for facilitating this experience.

June 28th, 2010

Google Summer of Code, Update 3: More Live Demos

Just a quick update this time. The scxml-js project is moving right along, as I’ve been adding support for new features at, on average, a rate of about one feature per day. Today, I reached an interesting milestone: scxml-js is now as featureful as the old SCCJS compiler which I had previously been using in my research. This means that I can now begin porting the demos and prototypes I constructed using SCCJS to scxml-js, as well as begin creating new ones.

New Demos

Here are two new, simple demos that illustrate how scxml-js may be used to describe and implement behaviour of web User Interfaces (tested in recent Firefox and Chromium; will definitely not work in IE due to their use of XHTML):

Both examples use state machines to describe and implement drag-and-drop behaviour of SVG elements. The first example is interesting, because it illustrates how HTML, SVG, and SCXML can be used together in a single compound document to declaratively describe UI structure and behaviour. The second example illustrates how one may create state machines and DOM elements dynamically and procedurally using JavaScript, as opposed to declaratively using XML markup. In this example, each dynamically-created element will have its own state machine, hence its own state.
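
In sketch form, the second example’s approach looks something like the following; DragAndDropStatechart and the gen() dispatch method are stand-ins for the demo’s actual compiled statechart API:

// Each dynamically-created SVG element gets its own statechart instance,
// and therefore its own independent drag-and-drop state.
function createDraggableRect(svgRoot) {
	var rect = document.createElementNS('http://www.w3.org/2000/svg', 'rect');
	svgRoot.appendChild(rect);

	var sc = new DragAndDropStatechart(rect); // hypothetical compiled statechart
	sc.initialize();

	// Forward UI events to this element's own statechart.
	rect.addEventListener('mousedown', function (e) { sc.gen('mousedown', e); }, false);
	rect.addEventListener('mousemove', function (e) { sc.gen('mousemove', e); }, false);
	rect.addEventListener('mouseup', function (e) { sc.gen('mouseup', e); }, false);
}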

I think the code in these examples is fairly clean and instructive, and should give a good sense regarding how scxml-js may ultimately be used as a finished product.

June 23rd, 2010

Google Summer of Code 2010, Project Update 2

Here’s another quick update on the status of my Google Summer of Code project.

Finished porting IR-compiler and Code Generation Components to XSLT

As described in the previous post, I finished porting the IR-compiler and Code Generation components from E4X to XSLT.

Once I had this working with the Java XML transformation APIs under Rhino, I followed up with the completion of two related subtasks:

  1. Get the XSL transformations working in-browser, and across all major browsers (IE8, Firefox 3.5, Safari 5, Chrome 5 — Opera still to come).
  2. Create a single consolidated compiler front-end, written in JavaScript, that works in both the browser and in Rhino.

Cross-Browser XSL Transformation

Getting all XSL transformations to work reliably across browsers was something I expressed serious concerns about in my previous post. Indeed, this task posed some interesting challenges, and motivated certain design decisions.

The main issue I encountered in getting these XSL transformations to work was that support for xsl:import in xsl stylesheets, when called from JavaScript, is not very good in most browsers. xsl:import works well in Firefox, but is currently distinctly broken in Webkit and Webkit-based browsers (see here for the Chrome bug report, and here for the Webkit bug report). I also had limited success with it in IE 8.

I considered several possible solutions to work around this bug.

First, I looked into a pure JavaScript solution. In my previous post, I linked to the Sarissa and AJAXSLT libraries. In general, a common task of JavaScript libraries is to abstract out browser differences, so the fact that several libraries existed which appeared to do just that for XSLT offered me a degree of confidence when I was initially choosing XSLT as a primary technology with which to implement scxml-js. Unfortunately, in this development cycle, on closer inspection, I found that Sarissa, AJAXSLT, and all other libraries designed to abstract out cross-browser XSLT differences (including Javeline and the jQuery XSL transform plugin) are not actively maintained. As web browsers are rapidly moving targets, maintenance is a major concern when selecting a library dependency. In any case, a pure JavaScript solution did not appear feasible. This left me to get the XSL transformations working using just the “bare metal” of the browser.

My next attempt was to try to use some clever DOM manipulation to work around the Webkit bug. In the Webkit bug, xsl:import does not work because frameless resources cannot load other resources. This meant that loading the SCXML document on its own in Chrome, with an xml-stylesheet processing instruction pointing to the code generation stylesheet, did generate code correctly. My idea, then, was to use DOM to create an invisible iframe, load into it the SCXML document to transform, along with the requisite processing instruction, and read out the transformed JavaScript. I actually had some success with this, but it seemed to be a brittle solution. I was able to get it to work, but not reliably, and it was difficult to know when and how to read the transformed JavaScript out of the iframe. In any case, my attempts at this can be found in this branch here.
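
The gist of that attempt, greatly simplified (the file name is hypothetical, and the actual branch has more machinery around timing):

// Load the SCXML document (which carries an xml-stylesheet processing
// instruction pointing at the code generation stylesheet) into a hidden
// iframe, and try to read the transformed JavaScript back out.
var iframe = document.createElement('iframe');
iframe.style.display = 'none';
iframe.src = 'model.scxml';
iframe.onload = function () {
	// Brittle: it was difficult to know when and what to read out here.
	var generatedJs = iframe.contentDocument.documentElement.textContent;
	console.log(generatedJs);
};
document.body.appendChild(iframe);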

My final, and ultimately successful attempt was to use XSL to preprocess the stylesheets that used xsl:import, so as to combine the stylesheet contents, while still respecting the semantics of xsl:import. This was not too difficult, and only took a bit of effort to debug. You can see the results here. Note that there may be some corner cases of XSLT that are not handled by this script, but it works well for the existing scxml-js code generation backends. This is the solution upon which I ultimately settled.

One thing that must still be done, given this solution, is to incorporate this stylesheet preprocessing into the build step. For the moment, I have simply done the quick and dirty thing, which is to check the preprocessed stylesheets into SVN.

It’s interesting to note that IE 8 was the easiest browser to work with in this cycle, as it provided useful and meaningful error messages when XSL transformations failed. By contrast, Firefox would return a cryptic error message, without much useful information, and Safari/Chrome would not provide any error message at all, instead failing silently in the XSLT processor and returning undefined.

Consolidated Compiler Front-end

As I described in my previous post, a thin front-end to the XSL stylesheets was needed. For the purposes of running inside of the browser, the front-end would need to be written in JavaScript. It would have been possible, however, to write a separate front-end in a different language (bash, Java, or anything else), for the purposes of running outside of the browser. A design decision needed to be made, then, regarding how the front-end should be implemented:

  • Implement one unified front-end, written in JavaScript, which relies on modules which provide portable API’s, and provide implementations of these API’s that vary between environments.
  • Implement multiple front-ends, for browser and server environments.

I decided that, with respect to maintainability, it would be easier to maintain one front-end, written in one language, rather than two front-ends in different languages, and so I chose the first option. This worked well, but I’m not yet completely happy with the result, as I have code for Rhino and code for the browser mixed together in the same module. This means that code for Rhino is downloaded to the browser, even though it is never called (see Transformer.js for an example of this). The same is true for code that targets IE versus other browsers. I believe I’ve thought of a way to use RequireJS to selectively download platform-specific modules, and this is an optimization that I’ll make in the near future.
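
The optimization I have in mind looks roughly like this; the module names are hypothetical:

// Choose the platform-specific module at load time, so that only the
// code for the current environment is actually downloaded.
var transformerModule = (typeof window !== 'undefined') ?
		'scxml/transformer-browser' :
		'scxml/transformer-rhino';

require([transformerModule], function (transformer) {
	// Both modules are assumed to expose the same portable API.
	transformer.transform(scxmlDoc, stylesheet);
});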

In-Browser Demo

The result of this work can be seen in this demo site I threw together:

http://live.echo-flow.com/scxml-js/demo/sandbox/sandbox.html

This demo provides a very crude illustration of what a browser-based Graphical User Interface to the compiler might look like. It takes SCXML as input (top-most textarea), compiles it to JavaScript code (lower-left textarea, read-only), and then allows simulation from the console (bottom-right textarea and text input). For convenience, the demo populates the SCXML input textarea with the KitchenSink executable content example. I’ve tested it in IE8, Safari 5, Chrome 5, Firefox 3.5. It works best in Chrome and Firefox. I haven’t been testing in Opera, but I’m going to start soon.

Future Work

The past three weeks were spent porting and refactoring, which was necessary to facilitate future progress, and now there’s lots to do going forward. My feeling is that it’s now time to get back to the main work, which is adding important features to the compiler, starting with functionality still missing from the current implementation of the core module:

https://issues.apache.org/jira/browse/SCXML-137

I’m going to be presenting this work at the SVG Open 2010 conference at the end of August, so I’m also keen to prepare some new, compelling demos that will really illustrate the power of Statecharts on the web.

June 6th, 2010

Google Summer of Code 2010, Project Update 1

I’m two weeks into my Google Summer of Code project, and decided it was time to write the first update describing the work I’ve done, and the work I will do.

Project Overview

First, a quick overview of what my project is, what it does, and why one might care about it. The SCXML Code Generation Framework, JavaScript Edition project (SCXMLcgf/js) centers on the development of a particular tool, the purpose of which is to accelerate the development of rich Web-based User Interfaces. The idea behind it is that there is a modelling language, called Statecharts, which is very good at describing dynamic behaviour of objects, and can be used for describing rich UI behaviour as well. The tool I’m developing, then, is a Statechart-to-JavaScript compiler, which takes as input Statechart models as SCXML documents, and compiles them to executable JavaScript code, which can then be used in the development of complex Web UIs.

I’m currently developing this tool under the auspices of the Apache Software Foundation during this year’s Google Summer of Code. For more information on it, you could read my GSoC project proposal here, or even check out the code here.

Week 1 Overview

As I said above, I’m now two weeks into the project. I had already done some work on this last semester, and have since been adding support for additional modules described in the SCXML specification. In Week 1, I added basic support for the Script Module. I wrote some tests for this, and it seemed to work well, so I checked it in.

Difficulties with E4X

I had originally written SCXMLcgf/js entirely in JavaScript, targeting the Mozilla Rhino JavaScript implementation. One feature that Rhino offers is the E4X language extension to JavaScript. E4X was fantastic for rapidly developing my project. It was particularly useful over standard JavaScript in terms of providing an elegant syntax for: templating (multiline strings with embedded parameters, and regular JavaScript scoping rules), queries against the XML document structure (very similar to XPath), and easy manipulation of that structure.

These language features allowed me to write my compiler in a very declarative style: I would execute transformations on the input SCXML document, then query the resulting structure and pass it into templates which generated code in a top-down fashion. I leveraged E4X’s language features heavily throughout my project, and was very productive.
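
For a sense of that style, here is a small E4X-flavoured sketch (not actual compiler code) showing an XML literal used as a template, and a descendant query with a filter:

// E4X XML literals with embedded expressions serve as templates...
var stateId = 'root_menu';
var template = <state id={stateId}>
                   <onentry/>
               </state>;

// ...while the descendant (..) and filter (.()) operators give
// XPath-like queries over a parsed document.
var doc = new XML(scxmlDocString); // fails if an XML declaration is present
var transitions = doc..transition.(@event == 't');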

Unfortunately, during Week 1, I ran into some difficulties with E4X. There was some weirdness involving namespaces, and some involving scoping. This wasn’t entirely surprising, as the Rhino implementation of E4X has not always felt very robust to me. Right out of the box, there is a bug that prevents one from parsing XML files with XML declarations, and I have encountered other problems as well. In any case, I lost an afternoon to this problem, and decided that I needed to begin to remove SCXMLcgf/js’s E4X dependencies sooner rather than later.

I had known that it would eventually be necessary to move away from E4X for portability reasons, as it would be desirable to be able to run the SCXMLcgf/js in the browser environment, including non-Mozilla browsers. There are a number of reasons for this, including the possibility of using the compiler as a JIT compiler, and the possibility of providing a browser-based environment for Statechart development. Given the problems I had had with E4X in Week 1, I decided to move this task up in my schedule, and deal with it immediately.

So, for Week 2, I’ve been porting most of my code to XSLT.

Justification for Targeting XSLT

At the beginning of Week 2, I knew I needed to migrate away from E4X, but it wasn’t clear what the replacement technology should be. So, I spent a lot of time thinking about SCXMLcgf/js, its architecture, and the requirements that this imposes on the technology.

The architecture of SCXMLcgf/js can be broken into three main components:

  • Front End: Takes in arguments, possibly passed in from the command-line, and passes these in as options to the IR Compiler and the Code Generator.
  • IR Compiler: Analyzes the given SCXML document, and creates an Intermediate Representation (IR) that is easy to generate code from.
  • Code Generator: Generates code from a given SCXML IR. May have multiple backend modules that target different programming languages (it currently only targets JavaScript), and different Statechart implementation techniques (it currently targets three different techniques).

My goal for Week 2 was just to eliminate E4X dependencies in the Code Generator component. The idea behind this component is that its modules should only be used for templating. The primary goal of these template modules is that they should be easy to read, understand, and maintain. In my opinion, this means that templates should not contain procedural programming logic.

Moreover, I came up with other precise feature requirements for a templating system, based on my experience from the first implementation of SCXMLcgf/js:

  • must be able to run under Rhino or the browser
  • multiline text
  • variable substitution
  • iteration (loops)
  • if/else blocks
  • Mechanisms to facilitate Don’t Repeat Yourself (DRY)
    • Something like function modularity, where you separate templates into named regions.
    • Something like inheritance, where a template can import other templates, and override functionality in the parent template.

Because I’m very JavaScript-oriented, I first looked into templating systems implemented in JavaScript. JavaScript templating systems are more plentiful than I had expected. Unfortunately, I did not find any that fulfilled all of the above requirements. I won’t link to any, as I ultimately chose not to go down this route.

A quick survey of XSLT, however, indicated to me that it did support all of the above functionality. So, this left me to consider XSLT, the other programming language which enjoys good cross-browser support.

I was pretty enthusiastic about this, as I had never used XSLT before, but had wanted to learn it for some time. Nevertheless, I had several serious concerns about targeting XSLT:

  1. How good is the cross-browser support for XSLT?
  2. I’m a complete XSLT novice. How much overhead will be required before I can begin to be productive using it?
  3. Is XSLT going to be ridiculously verbose (do I have to wrap all non-XML text in a <text/> node)?
  4. Is there good free tooling for XSLT?
  5. Another low-priority concern was that I wanted to keep down dependencies on different languages; it would be nice to focus on only one. I’m not sure about XSLT’s expressive power. Would it be possible to port the IR-Compiler component to XSLT?

To address each of these concerns in turn:

  1. There are some nice js libs that abstract out the browser differences: Sarissa, Google’s AJAXSLT.
  2. I did an initial review of XSLT. I found parts of it to be confusing (like how and when the context node changes; the difference between apply-templates with and without the select attribute; etc.), but decided the risk was low enough that I could dive in and begin experimenting with it. As it turned out, it didn’t take long before I was able to be productive with it.
  3. Text node children of an <xsl:template/> are echoed out. This is well-formed XML, but I’m not sure if it’s strictly legal XSLT. Anyhow, it works well, and looks good.
  4. This was pretty bad. The best graphical debugger I found was KXSLdbg for KDE 3. I also tried the XSLT debugger for Eclipse Web Tools, and found it to be really lacking. In the end, though, I mostly just used <xsl:message/> nodes as printfs in development, which was really slow and awkward. This part of XSLT development could definitely use some improvement.

I’ll talk more about 5. in a second.

XSLT Port of Code Generator and IR-Compiler Components

I started to work on the XSLT port of the Code Generator component last Saturday, and had it completed by Tuesday or Wednesday. This actually turned out not to be very difficult, as I had already written my E4X templates in a very XSLT-like style: top-down, primarily using recursion and iteration. There was some procedural logic in there which needed to be broken out, so there was some refactoring to do, but this wasn’t too difficult.

When hooking everything up, though, I found another problem with E4X, which was that putting the Xalan XSLT library on the classpath caused E4X’s XML serialization to stop working correctly. Specifically, namespaced attributes would no longer be serialized correctly. This was something I used often when creating the IR, so it became evident that it would be necessary to port the IR Compiler component in this development cycle as well.

Again, I had to weigh my technology choices. This component involved some analysis, and transformation of the given SCXML document to include this extra information. For example, for every transition, the Least Common Ancestor state is computed, as well as the states exited and the states entered for that transition.

I was doubtful that XSLT would be able to do this work, or that I would have sufficient skill in order to program it, so I initially began porting this component to just use DOM for transformation, and XPath for querying. However, this quickly proved not to be a productive approach, and I decided to try to use XSLT instead. I don’t have too much to say about this, except to observe that, even though development was often painful due to the lack of a good graphical debugger, it was ultimately successful, and the resulting code doesn’t look too bad. In most cases, I think it’s quite readable and elegant, and I think it will not be difficult to maintain.

Updating the Front End

The last thing I needed to do, then, was update the Front End to match these changes. At this point, I was in the interesting situation of having all of my business logic implemented in XSLT. I really enjoyed the idea of having a very thin front-end, so something like:

xsltproc xslt/normalizeInitialStates.xsl $1 | \
xsltproc xslt/generateUniqueStateIds.xsl - | \
xsltproc xslt/splitTransitionTargets.xsl - | \
xsltproc xslt/changeTransitionsPointingToCompoundStatesToPointToInitialStates.xsl - | \
xsltproc xslt/computeLCA.xsl - | \
xsltproc xslt/transformIf.xsl - | \
xsltproc xslt/appendStateInformation.xsl - | \
xsltproc xslt/appendBasicStateInformation.xsl - | \
xsltproc xslt/appendTransitionInformation.xsl - | \
xsltproc xslt/StatePatternStatechartGenerator.xsl | \
xmlindent > out.js

There would be a bit more to it than that, as there would need to be some logic for command-line parsing, but this would also mostly eliminate the Rhino dependency in my project (only “mostly” because the code still uses js_beautify as a JavaScript code beautifier, and the build and performance analysis systems are still written in JavaScript). This approach also makes it very clear where the main programming logic is now located.

In the interest of saving time, however, I decided to continue to use Rhino for the front end, and use the SAX Java APIs for processing the XSLT transformations. I’m not terribly happy with these APIs, and I think Rhino may be making the system perceptibly slower, so I’ll probably move to the thin front end at some point. But right now this approach works, passes all unit tests, and so I’m fairly happy with it.

Future Work

I’m not planning to check this work into the Apache SVN repository until I finish porting the other backends, clean things up, and re-figure out the project structure. I’ve been using git and git-svn for version control, though, which has been useful and interesting (this may be the subject of another blog post). After that, I’ll be back onto the regular schedule of implementing modules described in the SCXML specification.
