September 29th, 2010

scxml-js Build Adventures

I presented my work on scxml-js at the SVG Open 2010 conference at the end of August. My hope was to have a release prepared by then, to encourage adoption among interested developers. However, I soon discovered that preparing releases in Apache Commons involves some process overhead. scxml-js is currently a Commons Sandbox project, and Sandbox projects are not allowed under any circumstances to publish releases. To publish a release, scxml-js would need to be “promoted” to Commons Proper, which requires a vote on the Commons mailing list. To pass a vote, it seemed likely that the scxml-js build system would need to be overhauled to use Maven, so that it could reuse the Maven parent pom and thus inherit all of the regulated, well-designed build infrastructure shared by all projects at Apache Commons.

I had originally allocated two weeks to this task, from the end of Google Summer of Code to the start of the SVG Open conference, but in fact I ended up spending over a month working on the build infrastructure alone. I think some interesting new techniques emerged from this work.

First, a description of what I was migrating from: a single custom build file, written in JavaScript and designed to be run under Rhino. The reasoning behind this technique was that JavaScript is quite a nice scripting language, and very useful for many tasks, including writing build scripts. Because it could use the RequireJS module system and the dojo.doh unit testing framework natively, a custom build script run under Rhino seemed the fastest, easiest way to automate tasks related to unit and performance testing of scxml-js. What it was not useful for, however, was setting up a Java classpath and calling java or javac (it also seemed like too much of an investment to perform a proper topological sort of dependencies between build targets). At the beginning of the project, using java and javac was not needed, as scxml-js was always run in interpreted mode on the command line. As time went on, however, I wanted to use Rhino’s jsc utility to compile scxml-js to optimized Java bytecode, both to improve performance and to provide a standalone executable JAR for easy deployment. To solve this problem, I began to use Ant, which of course has very good integration with tasks related to Java compilation.
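To give a rough idea of what such a hand-rolled build script amounts to (the task names and functions below are hypothetical, not taken from the actual scxml-js build script), each task is just a plain function, and any ordering between prerequisites is wired up by hand rather than derived from a dependency graph:

```javascript
// Hypothetical sketch of a hand-rolled Rhino build script: each task is
// a plain function, and prerequisite ordering is hard-coded rather than
// computed by a topological sort.
var log = [];

function clean()    { log.push("clean"); }
function runTests() { log.push("unit tests"); }

function test() {
    clean();      // no dependency graph: prerequisites are called explicitly
    runTests();
}

// Under Rhino this would be dispatched from the `arguments` array, e.g.
// `rhino build.js test`; here we just call the task directly.
test();
console.log(log.join(" -> ")); // clean -> unit tests
```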

Compiling JavaScript to Java Bytecode with Ant

Invoking jsc from Ant is actually pretty easy. The only complication arises if you have dependencies between your scripts (e.g. via Rhino’s built-in load() function), as jsc will not resolve these. What is required is to preprocess your scripts so that all JavaScript dependencies are included in a single file, and then to run jsc on that built file. If you’re just using load() to import script dependencies, this can be difficult to accomplish. If you’re using RequireJS, however, you can make use of its included build script, which does precisely this: it seeks out module dependencies and includes them in one giant file. It can also include the RequireJS library itself in that file, as well as inline text (or XML) file dependencies as strings, so that all dependencies end up in this single file. Compiling scxml-js to Java bytecode is then a two-step process: call the RequireJS build script to create a single large file that includes all dependencies, then call jsc on the built file to compile it to bytecode. This produces a single executable class file. Here’s a snippet to illustrate how this works:


<!-- this is the path to a front-end module that accepts command-line arguments and passes them into the main module -->
<property name="build-js-main-rhino-frontend-module" value="${src}/javascript/scxml/cgf/build/rhino"/>

<!-- RequireJS build script stuff -->
<property name="js-build-script" location="${lib-js}/requirejs/build/build.js"/>
<property name="js-build-dir" location="${lib-js}/requirejs/build"/>

<!-- include a reference to the closure library bundled with the RequireJS distribution -->
<path id="closure-classpath" location="${lib-js}/requirejs/build/lib/closure/compiler.jar"/>

<!-- jsc stuff -->
<property name="build-js-main" location="${build-js}/main-built.js"/>
<property name="build-class-main-name" value="SCXMLCompiler"/>
<property name="build-class-main" location="${build-class}/${build-class-main-name}.class"/>

<target name="compile-single-js">
	<mkdir dir="${build-js}"/>

	<java classname="org.mozilla.javascript.tools.shell.Main">
		<classpath>
			<path refid="rhino-classpath"/>
			<path refid="closure-classpath"/>
		</classpath>
		<arg value="${js-build-script}"/>
		<arg value="${js-build-dir}"/>
		<arg value="name=${build-js-main-rhino-frontend-module}"/>
		<arg value="out=${build-js-main}"/>
		<arg value="baseUrl=."/>
		<arg value="includeRequire=true"/>
		<arg value="inlineText=true"/>
		<arg value="optimize=none"/>
	</java>
</target>

<target name="compile-single-class" depends="compile-single-js">
	<mkdir dir="${build-class}"/>

	<!-- TODO: parameterize optimization level -->
	<java classname="org.mozilla.javascript.tools.jsc.Main">
		<classpath>
			<path refid="maven.plugin.classpath"/>
		</classpath>
		<arg value="-opt"/>
		<arg value="9"/>
		<arg value="-o"/>
		<arg value="${build-class-main-name}.class"/>
		<arg value="${build-js-main}"/>
	</java>
	<move file="${build-js}/${build-class-main-name}.class" todir="${build-class}"/>
</target>

The front-end module referenced by the “build-js-main-rhino-frontend-module” property accepts the command-line arguments and passes them into the main module:

(function(args){
	require(
		["src/javascript/scxml/cgf/main"],
		function(main){
			main(args);
		}
	);
})(Array.prototype.slice.call(arguments));

All of this was not too difficult to set up, and allowed me to accomplish my goal of building a single class file for the scxml-js project.

Importing JavaScript Modules with RequireJS in Ant

Rather than maintain two build scripts, it was desirable to move the unit testing functionality from the Rhino build script into Ant. As I had already put a significant amount of time into developing the Rhino build script, I wanted to reuse this code directly in Ant. This seemed possible, as Ant already provides good integration with Rhino and other scripting languages via its script task, using either JSR-223 or the Bean Scripting Framework (BSF) API. Unfortunately, however, when Rhino is run under Ant it does not expose the usual shell functions as properties on the global object; this means that by default load() and readFile() are not available, making it virtually impossible to import code from other files, and thus impossible to import RequireJS modules directly in an Ant script. However, I sought help on the Rhino mailing list, and a convenient workaround was developed. The following code should be included at the beginning of every script element, and the manager attribute on the element should be set to “bsf”:

var shell = org.mozilla.javascript.tools.shell.Main;
var args = ["-e","var a='STRING';"];
shell.exec(args);

var shellGlobal = shell.global;

//grab functions from shell global and place in current global
var load=shellGlobal.load;
var print=shellGlobal.print;
var defineClass=shellGlobal.defineClass;
var deserialize=shellGlobal.deserialize;
var doctest=shellGlobal.doctest;
var gc=shellGlobal.gc;
var help=shellGlobal.help;
var loadClass=shellGlobal.loadClass;
var quit=shellGlobal.quit;
var readFile=shellGlobal.readFile;
var readUrl=shellGlobal.readUrl;
var runCommand=shellGlobal.runCommand;
var seal=shellGlobal.seal;
var serialize=shellGlobal.serialize;
var spawn=shellGlobal.spawn;
var sync=shellGlobal.sync;
var toint32=shellGlobal.toint32;
var version=shellGlobal.version;
var environment=shellGlobal.environment;

Although, now that I’m reading this again, it still isn’t quite right: while these variables are being defined in the script’s global namespace, technically they are not being added to the global object. In any case, this has not proven to be problematic.
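A quick way to see the distinction, as a generic JavaScript illustration (not Rhino-specific): a var binding created inside a non-global scope is resolvable by name there, but never appears as a property of the global object:

```javascript
// Inside a function scope, `var` creates a lexical binding, not a
// property of the global object -- analogous to the Ant <script> case.
var result = (function () {
    var load = function (f) { return "loaded " + f; };
    return {
        byName: load("foo.js"),          // resolvable by name lexically
        onGlobal: typeof globalThis.load // but absent from the global object
    };
})();

console.log(result.byName);    // "loaded foo.js"
console.log(result.onGlobal);  // "undefined"
```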

Because this is a verbose declaration, and we like code reuse, I defined a macro called “rhinoscript” to abstract it out:


<macrodef name="rhinoscript">
	<text name="text"/>

	<sequential>
		<script language="javascript" manager="bsf">
			<classpath>
				<path refid="maven.plugin.classpath"/>
			</classpath><![CDATA[
				var shell = org.mozilla.javascript.tools.shell.Main;
				var args = ["-e","var a='STRING';"];
				shell.exec(args);

				var shellGlobal = shell.global;

				//grab functions from shell global and place in current global
				var load=shellGlobal.load;
				//import everything else...

				@{text}
		]]></script>
	</sequential>
</macrodef>

<!-- example call -->
<target name="test-call">
	<rhinoscript><![CDATA[
		load("foo.js");
		print("Hello World!");
	]]></rhinoscript>
</target>

This then allowed RequireJS modules to be imported and reused directly, as in the original Rhino build script. The only caveat is that, rather than using nice JavaScript data structures (Arrays, Objects, etc.) to store build-related properties, it was necessary to use Ant data structures (properties, paths, etc.) instead. Here’s the final result of the Ant task that uses RequireJS and dojo.doh to run unit tests:

<target name="run-unit-tests-with-rhino" depends="setup-properties">
	<rhinoscript><![CDATA[
		//load requirejs
		Array.prototype.slice.call(requirejs_bootstrap_paths.list()).forEach(function(requireJsPath){
			load(requireJsPath);
		});

		//this is a bit weird, but we define this here in case we need to load dojo later using the RequireJS loader
		djConfig = {
			"baseUrl" : path_to_dojo_base+"/"
		}

		function tailRecurse(list,stepCallback,baseCaseCallback){
			var target = list.pop();

			if(target){
				stepCallback(target,
					function(){tailRecurse(list,stepCallback,baseCaseCallback)});
			}else{
				if(baseCaseCallback) baseCaseCallback();
			}
		}

		var isComplete = false;

		require(
			{baseUrl:basedir},
			[path_to_dojo,
				"lib/test-js/env.js",
				"test/testHelpers.js"],
			function(){

				dojo.require("doh.runner");

				var forIE = "is-for-ie";
				var scxmlXmlTestPathList = Array.prototype.slice.call(scxml_tests_xml.list());
				var backendsList = backends.split(",");

				print("backendsList : " + backendsList);
				print("backendsList.length : " + backendsList.length);

				var oldDohOnEnd = doh._onEnd;
				doh._onEnd = function() { isComplete = true; oldDohOnEnd.apply(doh); };

				//we use tailRecurse function because of asynchronous RequireJS call used to load the unit test module
				tailRecurse(scxmlXmlTestPathList,
					function(scxmlXmlTestPath,step){
						var jsUnitTestPathPropertyName = scxmlXmlTestPath + "-" + "unit-test-js-module";
						var jsUnitTestPath = project.getProperty(jsUnitTestPathPropertyName);

						require([jsUnitTestPath],
							function(unitTestModule){

								backendsList.forEach(function(backend){
									var jsTargetTestPathPropertyName =
										forIE + "-" + backend + "-" + scxmlXmlTestPath + "-" + "target-test-path";

									var jsTargetTestPath = project.getProperty(jsTargetTestPathPropertyName);

									print("jsTargetTestPathPropertyName : " + jsTargetTestPathPropertyName);
									print("jsTargetTestPath  : " + jsTargetTestPath);

									//load and register
									load(jsTargetTestPath);

									unitTestModule.register(StatechartExecutionContext)
									delete StatechartExecutionContext;
								});

								step();
							});
					},
					function(){
						//run with dojo
						doh.run();
					}
				);

			}
		);

		//hold up execution until doh completes
		while(!isComplete){
			java.lang.Thread.sleep(20);
		}

	]]></rhinoscript>
</target>

I think this is kind of nice because, if you look at newer build tools like Rake or Jake, the big advantage they give you is the ability to use a real programming language (as opposed to a build Domain Specific Language, like Ant), while still providing facilities for defining build targets with dependencies and topologically sorting them when they are invoked. Ant still has many advantages, however, including great support in existing continuous integration systems. The approach I have described seems to marry the advantages of Ant with those of using your preferred scripting language.
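To make that contrast concrete, here is a minimal sketch (entirely hypothetical, not taken from Rake, Jake, or scxml-js) of what such a target runner looks like in JavaScript: targets declare their dependencies, and a depth-first walk runs each prerequisite exactly once, in topological order:

```javascript
// Minimal sketch of a Jake-style build DSL: targets with declared
// dependencies, executed in topological order via a depth-first walk.
var targets = {};

function target(name, deps, action) {
    targets[name] = { deps: deps, action: action };
}

function run(name, done) {
    done = done || {};
    if (done[name]) return;        // each target runs at most once
    done[name] = true;
    targets[name].deps.forEach(function (dep) { run(dep, done); });
    targets[name].action();
}

// Example build graph (target names hypothetical):
var order = [];
target("clean",   [],                  function () { order.push("clean"); });
target("compile", ["clean"],           function () { order.push("compile"); });
target("test",    ["compile"],         function () { order.push("test"); });
target("dist",    ["compile", "test"], function () { order.push("dist"); });

run("dist");
console.log(order.join(" -> ")); // clean -> compile -> test -> dist
```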

Integrating Ant with Maven

At this point, I had brought over most of the existing functionality from the Rhino build script into Ant, and I was beginning to look at ways to then hook into Maven. While I had some previous experience working with Ant, I had never before worked with Maven, and so there was a learning curve. The goal was to hook into the existing Apache Commons Maven-based build infrastructure, while at the same time trying to reuse existing code.

While this part was non-trivial to develop, it is actually the least interesting part of the process to me, and I think the least relevant to this blog (it doesn’t have much to do with JavaScript or Open Web technologies), so I’m only going to briefly describe it. The build architecture is currently as follows:

I felt it was important to maintain both an Ant front-end and Maven front-end to the build, as each has advantages for certain tasks. Common functionality is imported from build-common.xml. Both the Maven (pom.xml) and Ant (build.xml) front-ends delegate to mvn-ant-build.xml, which contains most of the core tasks without the dependencies between targets.

Based on my experience on the Maven mailing list, if you are a “Maven person” (a person who has “drunk the Maven kool-aid” – not my words; Maven people seem to like this phrase), then this architecture built around delegation to Ant will likely make you cry. It will seem needlessly complex, when the alternative of creating a set of custom Maven plugins would seem much better. That might be the case, and I proposed investigating this option. The problem, however, seems to be that relying on custom Maven plugins for building is a no-go for Commons projects (with the exception of the Maven Commons plugin), as it is uncertain where these plugins would be hosted. Still, a Maven plugin for compiling JavaScript written against the RequireJS framework down to Java bytecode, as outlined above, is I think something that has value, and which I would like to pursue at some point.

Future Work

I still have not put scxml-js forward for a vote, and even though the refactoring of the build system is more or less complete, I still may not do so. I have just arrived in Belgium where I will be working on my Master’s thesis for three months, and so I may need to deprioritize my work on scxml-js while I prioritize researching the theoretical aspects of my thesis. Also, now that SVG Open has passed, there seems to be less incentive to publish an alpha release. It may be better to give scxml-js more time to mature, and then release later on.
