This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft using the following reference:
Java Makes Scripting Languages Irrelevant?
Simplicity does not precede complexity, but follows it.
— Alan J. Perlis
In computing we often solve a complex problem by adding another level of indirection. As an example, on Unix file systems an index node, or inode, data structure allows files to be allocated concurrently and sparsely while still providing efficient random access. When we want to customize large and complex systems or express fluid and rapidly changing requirements, a common tool we employ is a scripting layer on top of the corresponding system. An early instance of this approach appeared in Dan Murphy’s TECO editor, developed on the DEC PDP-1 computer in 1962–63: its command language also doubled as an arcane (to put it politely) macro language.
About 20 years ago adding a scripting language interface to existing applications, which at the time were typically written in C, was all the rage. Lotus 1-2-3 supported macro commands, Framework had the FRED language, and AutoCAD and Emacs could be programmed in a form of Lisp. On the Unix platform system administrators wrote sophisticated sendmail configuration files to bridge email networks that were at the time disparate and mutually incompatible. This was also the time when John Ousterhout developed Tcl/Tk as a general-purpose scripting language, to be integrated with any system that could benefit from such a capability. A few years later Microsoft came up with Visual Basic for Applications as its general-purpose scripting language for all its office productivity applications. All those early developments acquainted programmers with the notion of customizing applications through scripting, and opened the road for powerful general-purpose scripting languages such as Perl, Python, and Ruby (see John K. Ousterhout. Scripting: Higher-level programming for the 21st century. Computer, 31(3):23–30, March 1998). My impression is that with the evolution of Java and Microsoft’s .NET offerings (I’ll use the term Java from now on as a stand-in for both alternatives) the niche occupied by scripting languages is rapidly shrinking; we are approaching the end of an era.
The application scripting languages I described serve an important purpose. Glued onto an application, they can greatly ease its configuration and customization and can allow end-user programming by offering a safe and friendly development environment. Those programming an application using its scripting interface do not have to bother with the intricacies of C’s memory management, the mechanism used for managing character strings in the specific application, and the complexity of the application's internal data structures. Instead, the scripting language typically offers (among other things) automatic memory management, a powerful built-in string data type, sophisticated data structures, a rich repertoire of operations, and an intuitive API for manipulating the application’s data and state. In addition, the application, by interpreting the scripting language, can isolate itself from undesirable effects of the scripting code, such as crashes and corruption of its data.
Notice how most of the nice features applications obtain through the use of scripting languages are now offered by Java:
· automatic memory management through garbage collection,
· a standard string data type,
· collection interfaces implementing most useful data structures, and
· a very rich language library.
In addition, in applications written in Java what can be considered as an API already comes for free as part of their object-oriented design. One only needs to allow an application to dynamically load user-specified classes, expose its API by providing access to some of the application’s objects, limit the application’s exposure through the security manager and exception handlers, and the need for a separate scripting language vanishes.
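The recipe in the preceding paragraph might look roughly like the following sketch. All the names in it (`PluginHost`, `Plugin`, `Document`, `LineCounter`) are hypothetical and stand in for a real application's classes. The sketch covers dynamic loading, a narrow exposed API, and containment through exception handlers. It leaves out the security manager, which is deprecated in recent Java releases, and it loads from the application's own class path, whereas a real host would probably use a dedicated class loader for user code.

```java
import java.util.List;

public class PluginHost {
    /** The slice of the application's object model exposed to extensions (hypothetical). */
    public interface Document {
        List<String> lines();
    }

    /** Contract that every user-specified extension class must implement (hypothetical). */
    public interface Plugin {
        String run(Document doc) throws Exception;
    }

    /** A sample extension; in practice it would arrive in a user-supplied class file or JAR. */
    public static class LineCounter implements Plugin {
        @Override
        public String run(Document doc) {
            return "lines: " + doc.lines().size();
        }
    }

    /** Loads the named class at run time, runs it, and contains any failure it causes. */
    public static String runPlugin(String className, Document doc) {
        try {
            Class<?> cls = Class.forName(className);
            // The cast fails with ClassCastException if the class is not a Plugin.
            Plugin plugin = (Plugin) cls.getDeclaredConstructor().newInstance();
            return plugin.run(doc);
        } catch (Exception | LinkageError e) {
            // The exception handler keeps a faulty extension from taking the host down.
            return "plugin failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Expose only an immutable list of lines, not the application's internals.
        Document doc = () -> List.of("alpha", "beta");
        String name = args.length > 0 ? args[0] : LineCounter.class.getName();
        System.out.println(runPlugin(name, doc));
    }
}
```

Run without arguments, the program loads the bundled `LineCounter` by name and prints `lines: 2`; passing the name of a class that cannot be found yields `plugin failed: ClassNotFoundException` instead of a crash.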
In fact, many modern Java applications that support beans, plugins, and other extension mechanisms follow exactly this strategy. Eclipse, Maven, Ant, Javadoc, ArgoUML, and Tomcat are some notable examples. Even on resource-constrained embedded devices, such as mobile phones, which are still programmed in a system programming language, configuration and customization are currently moving in the Java direction.
Does the trend of customizing applications through a Java interface make scripting languages irrelevant? Yes and no. As an application configuration and extension mechanism, Java is probably the way to go. The cost of marshalling and unmarshalling data objects and types between the application's code written in Java and the conventions expected by a different scripting language is too high for the limited incremental benefits that the scripting language would offer. On the other hand, scripting languages still have an edge in a number of areas, offering us a number of distinct advantages.
A more flexible or imaginative syntax. Think of Perl's numerous quoting mechanisms and its regular expression extension syntax, or Python's use of indentation for grouping statements. These make some program elements a lot easier to read. As an example, variable substitution within Perl’s or the Unix shell’s double-quoted strings is by far the most readable way to represent a program’s output.
Less fuss about types. Most scripting languages are typeless and therefore easier to write programs in. For example, Perl makes writing a client or server for an XML-based web service a breeze, whereas in Java we have to go through a number of contortions to implement the same functionality. Of course, the robustness and maintainability of code written in a typeless language are a different question, as many of us who maintain production code written in a scripting language later discover.
A more aggressive use of reflection. Consider here Perl's eval construct and Python's object emulation features. These allow the programmer to construct and execute code on the fly, or dynamically change a class’s fields. In many cases these features simplify the construction of flexible and adaptable software systems.
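For comparison, here is a minimal sketch of what Java's own reflection API offers. The `ReflectionSketch` and `Settings` classes are hypothetical and exist only for illustration. The code can look up and use existing members by a name supplied at run time, but, unlike Perl's eval, it does not by itself construct new code or add fields to a class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class ReflectionSketch {
    /** A hypothetical settings object whose members the caller names only at run time. */
    public static class Settings {
        public int tabWidth = 8;

        public String describe() {
            return "tabWidth=" + tabWidth;
        }
    }

    /** Sets a public field chosen by name at run time. */
    public static void setField(Object target, String fieldName, Object value)
            throws ReflectiveOperationException {
        Field field = target.getClass().getField(fieldName);
        field.set(target, value); // a boxed Integer is unwrapped for an int field
    }

    /** Invokes a public no-argument method chosen by name at run time. */
    public static Object call(Object target, String methodName)
            throws ReflectiveOperationException {
        Method method = target.getClass().getMethod(methodName);
        return method.invoke(target);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Settings settings = new Settings();
        setField(settings, "tabWidth", 4);
        System.out.println(call(settings, "describe"));
    }
}
```

The program prints `tabWidth=4`. The member names are plain strings, so they could just as well come from a configuration file, but every lookup needs explicit API calls and checked-exception handling that a scripting language would hide.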
Tighter integration with command-line tools. Although Java 1.5 comes with an API containing over 3000 classes—with thousands more being available through both open source and proprietary packages—many operations can still be performed in a more reliable and efficient manner by interfacing with venerable command-line tools. The Unix scripting languages provide many facilities for combining these tools, such as the creation of pipelines, and the processing of data through sophisticated control constructs.
Viability as a command language. Many scripting languages, such as the ones of the operating system shells, can also double as a command language. Command-line interfaces often offer a considerably more expressive working medium than GUI environments (we’ll expand on that in another column). Coupling a command-line interface with a scripting language means that commonly executed command sequences can easily be promoted into automated scripts, which is a boon to developers. This coupling also encourages an exploratory programming style, which many of us find very productive. I often code complex pipelines step by step, examining the output of each step, before tacking another processing element onto the pipeline’s end.
A shorter build cycle. Although for many systems a build cycle that provided time for an elaborate lunch is now sadly history, the tight feedback loop offered by the lack of a compilation step in scripting languages allows for rapid prototyping and exploratory changes, often hand-in-hand with the end-user. This is a feature that those using agile development methodologies can surely appreciate.
So, where do we stand now? The gap between system programming languages and scripting languages is slowly closing. For example, some scripting languages are capitalizing on Java’s infrastructure by having their code compile into JVM bytecode. However, there is still a lot of ground in the middle that is up for grabs. New system programming language designs can offer more of the advantages now available only through scripting, while scripting languages are constantly benefiting from hardware performance advances that make their (real or perceived) efficiency drawbacks less relevant every day. The issue of the result’s quality remains an open question on both fronts.
We developers, as avid tool users, enjoy viewing the battle from atop, reaping the benefits.
Diomidis Spinellis is an associate professor in the Department of Management Science and Technology at the Athens University of Economics and Business and the author of Code Reading: The Open Source Perspective (Addison-Wesley, 2003). Contact him at firstname.lastname@example.org.