Friday 10 July 2009

a groovy scala example

So my last post caused a bit of a stir (though some excellent comments!). Sorry about that! :) I think a few folks got the wrong end of the stick thinking my post was some kinda put down of dynamically typed languages. I still like Groovy/Ruby/JavaScript etc. I tried to focus the blog post purely on folks who were hacking statically typed code in javac; that Scala was a natural long term replacement which is a much better, powerful & more elegant statically typed language. If you've already stopped using javac and using Groovy/Ruby/JavaScript/Clojure/whatever then the post wasn't really meant for you as you don't need a javac replacement as you've already stopped using javac :)

Charles Nutter explains what I was trying to say way more elegantly
Scala, it must be stated, is the current heir apparent to the Java throne. No other language on the JVM seems as capable of being a "replacement for Java" as Scala, and the momentum behind Scala is now unquestionable
Anyway the reason for this post was to show how Scala is very Groovy based a nice example posted to the comments by James Iry. Here's the Scala code first
val (minors, adults) = people partition (_.age < 18 )
I'm sure most of you will agree its pretty clean, concise code and easy to read. It partitions a collection of people into two collections of minor and adults based on age.

Guillaume LaForge posted the Groovy equivalent
def (minors, adults) = people.split { it.age < 18 }
Notice how apart from some pretty minor surface syntax differences they are kinda identical to read. Mostly Groovy forces the dot between people and the split method (I think thats still true?) and uses 'it' for the default parameter name in a closure block rather than the Scala equivalent of _ if no name is specified. (Scala the _ is the wildcard symbol in various parts of the language as * is an operator on lots of objects like numbers).

So Scala and Groovy code can look about the same; nice concise, easy to read - describing the intention of the developer, rather than lots of noisy lines of code, or maybe ugly anonymous inner class noise using non-standard helper libraries to try simulate having nice core language support for working with very common data structures in a powerful expressive way.

There's a ton of Java libraries out there trying to tack on the ability to work with common collections & data structures in the modern Scala/Groovy/Ruby/Python/C#/VB way (dare I use the monad word here?) - I even helped create one many years ago. FWIW if you're stuck on Javac, do take a look at google collections which is really good.

My main criticism of these approaches in Java is they are non-standard (there's a ton of them out there - which to use?) and they are separate to the objects you want to work with. You can't just use a List, Set, Array or Map directly - you have to remember the separate class in a separate package you need to import so you can just do simple basic operations on data structures concisely. This kinda stuff is so basic, it should be a core feature of the programming language you use IMHO.

So both the Scala and Groovy examples above are a big improvement IMHO over their Java equivalents. I just wanted to demonstrate one of the additional benefits the Scala version has. While the code looks about the same, the Scala one creates immutable variables (which are great for concurrency) but the main difference is that the values are all statically typed. That means the compiler / IDE / documentation tool knows the exact types of the variables (minors, adults, people). Big deal I hear some of you say.

Well here's a handy example of why that is useful. Imagine you exposed this line of code as an API folks could invoke. Here's a PersonThingy object which defines a method minorsAndAdults as follows
def minorsAndAdults() = people partition (_.age < 18)
Now notice how the generated API documentation describes the return type of minorsAndAdults() as being (List(Person),List(Person)). i.e. you know its gonna return a 2 value tuple with a list of people for both values. This is very handy when you want to expose your code to other people; you don't have to write documentation (which gets out of date fast) describing what it is you return, the compiler can do all this for you. Plus when folks try to invoke your method their compiler & IDE will type check their code as they type it to ensure its used correctly (e.g. in case the user forgot to extract the first value from the tuple etc).

This also works nicely when using map. For example here is a little example which defines two methods greetings and lengths using the same value and a map method call but just using different expressions in the function passed
def greetings = names map ("Hello " + _)

def lengths = names map (_.size)
which return List[String] and List[Int] respectively.

So while dynamically typed languages and Scala can look as equally as concise; being statically typed can save you yet more typing explaining stuff the compiler/IDE/documentation generator already knows - and it avoids documentation going stale after refactoring - for example to change the data structure.

One closing observation. A common put down (usually from folks who've never read a good Scala book or really tried using Scala in earnest) is "oh its too complex". Yet the code you frequently write when working with objects, collections and data structures often looks very similar to the Ruby/Groovy/JavaScript/Python equivalents - yet folks rarely use the complexity word with those languages.

I wonder if part of the problem is, Scala & its community tends to describe Scala code and what the language is in terms of various (slightly academic) programming language concepts so that you can understand how the language works internally so advanced users can see how you can bend it to do what you need in a more concise & expressive way. For example back to James Iry's example
val (minors, adults) = people partition (_.age < 18)
That one example uses a tuple, pattern matching, a lambda, and partial function application. Describing how the language works tends to make it sound more complex (especially if you use the monad word) than just saying 'here's how to partion a collection'. If you were to take the similar Groovy/Ruby/Python code and tried to explain how the language actually implemented that line of code it would sound about as complex (often with meta object programming in there somewhere).

Scala is frequently described as a unification of functional programming and OO programming language concepts. A common push back from OO folks is, "I don't like functional programming, I just wanna do OO". Yet those same OO folks seem to really like closures, lambdas, for/list comprehensions and so forth. For example VB has lambdas, C# them too and has list comprehensions. Folks tend to really like them - whether they are developers using VB, C#, Ruby, Python, Groovy, JavaScript etc.

i.e. it seems most of us all really like an OO language with functional language features; we just don't like to think of ourselves as functional language developers :) Microsoft BTW is doing really well at turning C# and VB into OO and functional languages - but just avoiding using the 'f' word to avoid scaring folks off. I wonder if the Scala community should use the 'f' word a little less? :)

Monday 6 July 2009

Scala as the long term replacement for java/javac?

Don't get me wrong - I've written tons of Java over the last decade or so & think its been a great evolutionary step from C++ and Smalltalk (lots of other languages have helped too like JavaScript, Ruby, Groovy, Python etc). However I've long wanted a long term replacement to javac. I even created a language to scratch this itch.

Java is a surprisingly complex language (the spec is 600 pages and does anyone really grok generics in Java?), with its autoboxing (and lovely NPE's hiding in there), primitive types, icky arrays which are not collections & general lack of polymorphism across strings/text/buffers/collections/arrays along with extremely verbose syntax for working with any kind of data structure & bean properties and still no closures (even in JDK7) which leads to tons of icky try/catch/finally crapola unless you use frameworks with new custom APIs & yet more complexity. Java even has type inference, it just refuses to use it to let us save any typing/reading.

This issue becomes even more pressing with there being no Java7 (which is even more relevant after Snorcle - I wonder if javac is gonna be replaced with jdkc? :). So I guess javac has kinda reached its pinacle; closures look unlikely as does any kind of simplification or progression.

So whats gonna be the long term replacement for javac? Certainly the dynamic languages like Ruby, Groovy, Python, JavaScript have been getting very popular the last few years - lots of folks like them.

Though my tip though for the long term replacement of javac is Scala. I'm very impressed with it! I can honestly say if someone had shown me the Programming in Scala book by by Martin Odersky, Lex Spoon & Bill Venners back in 2003 I'd probably have never created Groovy.

So why Scala? Scala is statically typed and compiles down to the same fast bytecode as Java so its usually about as fast as Java (sometimes a little faster sometimes a little slower). e.g. compare how well Scala does in some benchmarks with groovy or jruby. Or this. Note speed isn't everything - there are times when you might want to trade code thats 10x slower for more productivity and conciseness; but for a long term replacement for javac speed is important.

Yet Scala has type inference - so its typically as concise as Ruby/Groovy but that everything has static types. This is a good thing; it makes code comprehension, navigation & documentation much simpler. Any token/method/symbol you can click on to navigate to the actual implementation code & documentation. No wacky monkey patching involved, or doubting of who added a method, when and how - which is great for large projects with lots of folks working on the same code over long periods of time. Scala seems to hit the perfect sweet spot between the consise feel of a dynamic language, while actually being completely statically typed. So I never have to remember the magic methods that are available - or run a script in a shell then inspect the object to see what it really looks like - the IDE/compiler just knows while you edit.

Scala has high order functions and closure support along with sequence comprehensions so you can write beautifully concise code. Scala also unifies functional and OO paradigms beautifully together into a language thats considerably simpler than Java (though the type system is of a similar order to truly understand than generics - but then thats usually an issue for framework creators rather than application code developers). It also lets folks gradually migrate from a traiditional OO/Java way of coding to a more functional way - which is particularly relevant for folks writing concurrent or asynchronous code (which due to the GHz of chips no longer going up but instead we're getting more cores is becoming more necessary). You can start the OO way and migrate to using immutable state if/when you need its benefits. Increasingly functional programming is becoming more and more important as we try and make things more concise and higher level (e.g. closures, higher order functions, pattern matching, monads etc) as well as dealing with concurrency and asynchrony via immutable state etc.

Scala also has proper mixins (traits) so you don't have to muck about with AOP wackiness to get nice modular code. There's even structural types in case you really do need some duck typing.

The thing which most impresses me is the core language syntax is pretty small and simple (the spec is about a quarter the size of Java's); but its way more powerful and flexible and is very easy to extend in libraries to add new semantics and features. For example see the Scala Actors. So its ideal for creating either embedded DSLs or external DSLs. There's really no need to have Java , XPath, XSLT, XQuery, JSP, JSTL, EL and SQL - you can just use Scala with some DSLs here and there (examples of this later...).

Scala does take a little bit of getting used to - I confess the first few times I looked at Scala it wasn't that pleasing on the eye - with Java you're kinda used to dumb verbose code which doesn't do very much - it can be quite a shock to see quite a few symbols at first. (It took me a while to get over the use of _ in scala which is the 'wildcard' symbol since * is an identifier/method).

If you've been doing lots of Java then Scala does feel quite different at first - (e.g. the order of types & identifiers in method/variable/parameter declarations - though the reason for that is to make it easy to miss out redundant type information).

e.g. in Java
List<String> list = new ArrayList<String>()
in Scala
val list = new List[String]
or if you want to specify exact typing
val list : List[String] = new List[String]
However if you keep at it, the beauty of Scala soon becomes apparent; its simplified so many of the gremlins in the Java language, allows you to write very concise code describing the intent behind the code rather than the implementation cruft - together with providing a nice migration path to elegant functional programming which is awesome for building concurrent or distributed software.

I highly recommend you take a look at Scala - with an open mind - and see if (once you're brain adjusts) you can see its beauty too.

Some scala links and online presentations
If you have a spare hour or so these video talks are great to watch
Handy Scala frameworks and libraries
  • liftweb the rails of scala
  • specs and ScalaTest for BDD and more literate testing showing how a typesafe DSL can help you write more consise and expressive code that is very IDE friendly
  • scalaz a handy library of utilities
  • dispatch for working with HTTP/JSON services
BTW for those like me who love JAXRS you can now use lift templates with Jersey via the new jersey-lift module.

As an example of this in action you can check out RestMQ which is an open source project I've been working on lately to provide a RESTful API and web console to message orientated middleware which is built on JAXRS (Jersey), Scala and Lift.

From a tooling perspective there's Ant/Maven plugins, an interactive Scala console (REPL) and IDE plugins for IDEA, Eclipse, NetBeans along with the usual editors (TextMate/Emacs etc). The IDE plugins are not yet up to the Java grade, but they are very useful with good code navigation & completion.

I've tried the plugins for NetBeans, Eclipse and IDEA they all have strengths and weaknesses; it seems Scala folks are split between them all. For code navigation and completion along with maven support I've found IDEA to be quite good. When you open a Maven pom.xml it seems to grok the code nicely, finding the scala source so you can navigate through any type/method to see its documentation/source etc. (You do typically have to manually add the Scala facet to run/debug stuff). Though IDEA is not always the best at highlighting syntax errors as you type. They could all use some work to bring them up to line with their Java counterparts though - try them out and see which you prefer.

Scala nits
With any language there's gonna be bits you love and bits you're not so keen on. Early impressions of Scala do seem like there's a bit of an attempt to use a few too many symbols :-; but you don't have to use them all - you can stick to the Java-ish OO side of the fence if you like. But then I guess longer term its probably better to use symbols for the 'special stuff' to avoid clashing with identifiers etc.

I'm not a massive fan of the nested import statement, using _root_.java.util.List to differentiate a 'global' import from a relative import. I'd have preferred a child prefix. e.g. if you have imported com.acme.cheese.model.Foo then to import model.impl.FooImpl i'd prefer an explicit relative prefix, say: import _.impl.FooImpl which would simplify things a little and more in keeping with Scala's attempt at simplifying things and removing cruft (being polymorphic to import java.util._).

However compared to all the massive hairy warts in Java, these downsides of Scala are tiny compared to the beauty, simplicity and power of Scala.

Conclusion
Given that MrJava, MrJRuby and MrGroovy are all tipping Scala as javac's long term replacement, there might be something in it. So what are you waiting for; get the Programming in Scala book or the O'Reilly Scala book and start having fun :)