My Journey Through the World of Programming Languages

Photo by Phil Jackson

My journey through the world of programming languages began in 1987 with the blinking cursor on the black-and-white screen of an Atari ST 1040 computer. After a few hours of playing with the GFA BASIC interpreter, I was hooked. The graphical capabilities of the Atari computer made it possible to program Mandelbrot fractals, the Towers of Hanoi, the Breakout game, and all those things which newbie programmers like to entertain themselves with.

Quite a few of these programs looked peculiarly similar to what people programmed ten years later when the first Java applets appeared. But I am getting ahead of myself. Back in 1987, BASIC was the beginner language. The GFA BASIC dialect was considered quite modern at the time, since it didn’t have line numbers and it was a full-featured procedural language, at least in principle. Yet, it was still a toy. After about six months I felt like writing more ambitious projects and I realised that I had outgrown BASIC. Someone gave me a copy of a C compiler, so I started learning the C language.

This was a good decision as it turned out later, because I was able to use C throughout the first five years of my career. I found Kernighan-Ritchie style C to be conceptually very close to the GFA BASIC I had started with except for pointers, which were completely new. The study of C led me to Unix. I began writing clones of Unix tools and utilities for my own use. This was the late 80s before the GNU and Linux phenomena appeared.

One such project was a text editor that I enhanced with optimised scrolling routines in 386 Assembler language. I wrote the editor after I had exchanged my Atari computer for a PC. After a few months I had a number of common Unix tools and a nice text editor at my disposal which I could use under MS-DOS. Then I read Andrew Tanenbaum’s Minix book and I got into system programming. I wrote a micro-kernel task scheduler for the 386 in Assembler. Multi-tasking was a fascinating thing that seemed to be out of reach for an average personal computer. At the time, I briefly considered expanding the micro-kernel into a more complete OS by adding memory management and file management. However, I soon realised the immensity of this task. I had just started studying informatics, and I figured that I wouldn’t be able to accomplish it while still visiting lectures and doing homework.

At university, we were taught Pascal as a “first” programming language and Lisp as a second. Pascal was very easy, of course; it seemed like a verbose dialect of C. Lisp, on the other hand, I found quite repulsive. I could appreciate the underlying mathematical idea, the lambda calculus, but the syntax was just awful. I believe it was IEEE Scheme. The language seemed great for graph-theoretical problems, but unsuitable for expressing common algorithms in a natural way. In other words, I found it to be a language for eggheads.

At the time, the imperative programming paradigm was predominant. It seemed the best way to get things done, as development tools and libraries for imperative procedural languages were readily available. The next language I learned at university was Modula 2. I thought of it as an elaboration of Pascal with emphasis on data abstraction and encapsulation. From Modula 2 I learned the importance of encapsulation. Although I didn’t use Modula 2 for practical applications, I was able to apply the conceptual foundation in my work that revolved around C programming.

After university, I worked in systems programming. I designed and implemented drivers for a company that manufactured proprietary hardware. Then I changed to work with another company in the field of machine translation and computer based training. After 5 years of coding in C, I thought it was time for a change. This was the early nineties, so I turned my attention to application programming with RAD tools which had just hit the market. I learned SQL inside out and created data-driven programs. Visual Basic 3.0 was the killer application in 1993, as it made the construction of Windows GUIs extremely easy. I was able to tie in with my prior Basic experience. Customers liked the productivity that comes with RAD.

After about a year, I dropped VB in favour of Delphi, which was superior for this purpose. Likewise, I could tie in with my previous Pascal experience. I learned the rudiments of object oriented programming with Object Pascal, which is odd given that C++ would have been the more natural path to object orientation after having programmed in C for many years. However, Object Pascal taught me proper componentisation. This was the mid-nineties and a lot of amazing things happened in the IT industry. The most important change was the commercial breakthrough of the Internet. Almost simultaneously, the Linux phenomenon happened. The IT industry boomed and technological progress was fast-paced. The Internet connected everybody everywhere and Linux brought corporate computing horsepower to the desktop.

As a result of these changes, I began coding HTML in 1996 and I learned JavaScript and Perl in 1997. The next year brought even more changes, as I decided to gear my business towards web development. Perl seemed like an idiosyncratic Unix solution born out of necessity. It was certainly practical for server side programming, but it was also rather painful and hackish. Fortunately, PHP appeared at around the same time and it offered a much cleaner solution for server programming.

Soon I found myself programming web applications in PHP most of the time. LAMP-based applications proliferated on the Internet between 1998 and 2003. During this period, I also learned the rudiments of Java, C++, and C#. I was responsible for the management of projects implemented in all of these languages. Object-oriented programming had become the mainstream paradigm in the late nineties. I decided that I needed to take on one of these languages more seriously.

The obvious choice was Java, since it was general purpose, but still very strong in the field of web development. So I fully immersed myself in Java when the language made the transition from 1.4 to 1.5. At that point, Java was already mature and mainstream. As a latecomer to Java, I found the platform huge, certainly larger than anything I had looked at before, including .NET. The sheer number of APIs was just unbelievable. It required a sustained effort of two years during which I read a shelf of Java books and began moving from trivial programming exercises to small projects and then to larger projects. Since the mid-2000s, Java has been my mainstay.

There are two reasons why I like Java. First, there is a fantastic ecosystem connected to the platform. It ranges from best-of-breed IDEs, VMs, and app-servers to a gazillion libraries and frameworks, and (almost) everything is free. Second, Java is extremely scalable and robust. It is neither the purest object-oriented language nor the richest, but Java is probably the one language that transforms average programmers into software engineers. I argue that this is so because of the high level of standardisation and the endorsement of best practices in the Java community.

I know that there are quite a few people who dispute that. However, there’s a reason why universities teach Java to freshmen and why corporations use Java for enterprise development. It offers the largest and possibly the most robust platform for developing industrial-strength software. Of course, not everything is hunky-dory in the Java department. I perceive that the main problem is the language itself: it’s aging. Although (or perhaps because) it forces programmers to write tidy code and relinquish dirty C tricks, it tends to be tedious, as it involves generous amounts of boilerplate code. It also lacks good paradigms for fine-grained concurrency control.

Fortunately, with the Scala language I discovered a possible solution for these problems. At this point (early 2009) I haven’t yet done any larger projects in Scala, but my eagerness to do so is growing. Adding the functional paradigm to my programming instruments is very beneficial. It even flows over into my Java work, since it has changed the way I phrase algorithms in Java. The only negative effect is that learning Scala has made the limitations of the Java language more evident and thus more painful. While functional programming will probably grow in the near future, Java has such a strong position that it won’t just fade away. Many large systems have been created in Java, so there will be maintenance work for decades to come. Meanwhile, it will be interesting to see how fast the industry embraces functional programming.

Naming Conventions (1)

Nomen est omen. This old Latin proverb means something like “the name says it all”. The ancients were superstitious, and they believed that names carry special powers. Names were thought to predispose their subjects to bring about certain fortunes or to have certain qualities. Today, we have largely done away with such superstitions. In the scientific worldview, names are nothing but symbolic artefacts without intrinsic powers.

However, there is one field where this Latin proverb still applies, and where it is indeed more true than ever. Oddly, this field wasn’t even known to the Romans. I am talking about software development, of course. Names have a special importance to software, or perhaps better, the practice of naming does. The first thing I do when looking at a piece of software written by somebody else is to look at the names given to variables and other program elements. My experience has shown that the quality of the identifier names corresponds directly to the overall quality of the program code.

Identifier names are a crucial part of any program. They provide clues about semantics and program logic. They make or break code readability. They determine whether code is self-documenting or not. So, the old Latin proverb “nomen est omen” can be applied as follows: If you read through a piece of code for the first time and you have no idea what the variables are supposed to represent, or what the methods are supposed to accomplish, then this is a bad omen. It suggests that the author was not quite sure how to formulate the problem (or didn’t care) and it can be expected that the other aspects of the program are at least as confusing.

If you read through a piece of code and the identifier names are easily comprehensible and fit together like the pieces of a jigsaw puzzle, then this is a good omen. It suggests that the author had a clear idea of the task at hand. Naturally, there are many intermediate levels between these two opposites. While contemporary code editors and IDEs are very powerful, identifier naming is one of the things that cannot currently be automated by these tools.

It is up to the programmer to choose identifier names. Since good naming practice is essential for code maintainability, we will first define what makes a good naming practice and then look at some concrete examples of good and bad strategies. There are three basic ingredients for a good naming practice: (1) semantic precision, (2) consistency, (3) the right amount of verbosity.

The first aspect is by far the hardest to get right. Semantic precision means that the chosen name is appropriate, unambiguous, well defined, and compliant with conventions. Consistency means that names are formed according to common patterns and that terms are used consistently throughout the program. The right amount of verbosity relates to identifier length. It means that names do not leave anything open to guesswork while avoiding redundancy.

An identifier usually consists of a single word or a combination of words. In case of the latter, the individual words are often set apart by using CamelCase or the “_” underscore character. One of the most commonly found questionable practices is the use of abbreviations instead of written out words, for example rptCount instead of repeatCount. The word count indicates that this variable is a counter, but what is counted? Repetitions, receptors, recipients, red points, or something else? Reptiles perhaps? By adding a mere three characters and writing out the first word, the ambiguity is eliminated.

This doesn’t mean that abbreviations are always bad. For example, nothing speaks against widely used acronyms like URI for UniformResourceIdentifier, or LCD instead of LiquidCrystalDisplay. Likewise, domain-specific abbreviations are acceptable, if the program is written within that domain, for example FOB (free on board) in the shipping domain, or VAT (value added tax) in the accounting domain. By definition this also includes acronyms in the software domain, such as i18n for internationalisation, or ftp for file transfer protocol. In addition, there are a number of prefixes and suffixes used ubiquitously in programming, such as min, max, fmt, pos, len, num, cnt, etc. which every programmer understands.

Generally speaking, abbreviations and acronyms should be used sparingly and only when they are common and free of ambiguity. This also means that one-letter or two-letter variable names, such as a, b, c, f1, x2, etc. are generally a bad idea, because they say nothing about the content of the variable. There is one exception to this rule: loop indices. Since loop indices (or iterator variables) are only used to iterate through a collection of values, they don’t have any intrinsic meaning. So one might as well give them one-letter names. By convention, the letters i, j, k, etc. are used, where the alphabetic order corresponds to the loop nesting level. This means i is used for the outermost loop, j for the second nested loop, k for the third, and so on.

This is standard practice for loop indices, but in other cases, index position corresponds to certain semantics. In this case, indices do have meaning. For example, one might define an array of counters, where counter[0] contains the number of students, counter[1] contains the number of passed tests and counter[2] contains the number of failed tests. Since the index numbers themselves don’t communicate any meaning, it is appropriate to define an enumerable type or a set of integer constants that conveys this meaning, for example STUDENTS=0, PASSED_TESTS=1, FAILED_TESTS=2, and so on.
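The counter example above can be sketched in Java roughly as follows. This is only an illustration: the names Counter and CounterDemo are made up for this sketch, and it assumes that an enum's declaration order is acceptable as the array layout.

```java
// Hypothetical names for illustration; only STUDENTS, PASSED_TESTS and
// FAILED_TESTS come from the text above.
enum Counter { STUDENTS, PASSED_TESTS, FAILED_TESTS }

final class CounterDemo {

    // One slot per enum constant; ordinal() yields 0, 1, 2 in declaration order.
    static int[] newCounters() {
        return new int[Counter.values().length];
    }

    static void increment(int[] counters, Counter which) {
        counters[which.ordinal()]++;
    }

    static int get(int[] counters, Counter which) {
        return counters[which.ordinal()];
    }

    public static void main(String[] args) {
        int[] counters = newCounters();
        increment(counters, Counter.STUDENTS);
        increment(counters, Counter.PASSED_TESTS);
        // Reads as "number of passed tests" instead of the opaque counters[1].
        System.out.println("Passed tests: " + get(counters, Counter.PASSED_TESTS));
    }
}
```

Compared with bare integer constants, the enum also keeps unrelated numbers from being passed in by accident, at the cost of a call to ordinal().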

This is all pretty much standard programming practice. Next time we will look at common identifier naming schemes, their merits and demerits, as well as language conventions.

The Problem With Cup Typing

First I should explain what I mean by cup typing. When you buy a cup of coffee, you have the choice of a short, tall, or grande sized cup. Sometimes you can also choose decaf or regular. When you declare an integer variable in Java, you have the choice of byte, short, int, and long. Sometimes (in languages like C++) you can also choose between signed and unsigned. The similarity is obvious. And it doesn’t end with integers. Floating point numbers come in two different flavours, namely as regular “float” values (32-bit) and as “double” values (64-bit). Characters come in the form of 7-bit, 8-bit and 16-bit encodings.

In statically typed programming languages, multiplicity is the rule rather than the exception. While Fortran and Pascal offer a moderate choice of two different integers, Java offers four plus a BigInteger implementation (“extra grande”) for really large numbers. However, it’s C# that takes the biscuit in cup typing with 9 different integer types and 3 different real types. Database systems are keeping up with this trend. For example, the popular MySQL RDBMS offers 5 different integer types and 3 different real types. Seeing the evolution from Fortran to C#, it almost appears as if type plurality has increased over time. We must ask two things: How did this come about and is it useful? We appreciate the fact that we can buy coffee in different cup sizes to match our appetite, but does the same advantage apply to data types?

The first question is easy to answer. Graduated types result from the fact that computer architectures have evolved in powers of two. Over several decades, the register width of the CPU of an average PC has expanded from 8 to 16 to 32 to 64 bits. Each step facilitated the use of larger types, and numeric types in particular were closely matched to register width. Expressing data types in a machine-oriented way appears to be a C legacy and quite a few newer programming languages have been strongly influenced by C.

It is my contention that while curly braces and ternary operators are an acceptable C-language tradition, graduated types are definitely not. Why not? Because they counter abstraction. They hinder rather than serve the natural expression of mathematical constructs. Have you ever wondered whether you should index an array with byte- or short-sized integers? Whether you should calculate an offset using int or long values? Whether method calls comply with type widening rules? Whether an arithmetic operation might overflow? Whether a type cast may lose significant bits or not? All of this is a complete waste of time in my view. Wouldn’t it be better to let the virtual machine worry about such low-level questions, or the library if a VM is not present?

Cup typing gets positively annoying when you have to write an API that is flexible enough to deal with parameters of different widths. If there’s no type hierarchy, you inevitably end up with multiple overloaded constructors and methods (one for each type) which add unnecessary bulk. The Java APIs are full of such examples and the valueOf() method is a case in point – it’s really ugly.

However, graduated types are beyond ugly; they are outright evil. They cause an enormous number of bugs and the small numeric types are the prime offenders. I wonder how many times a signed or unsigned byte has caused erratic program behaviour by silently overflowing. Such bugs can be hard to find and worse – they often don’t show until certain boundary conditions are reached. Casts that shorten types also belong to the usual suspects. I shall not even mention the insidious floating point operations that regularly unsettle newbie programmers with funny looking computation results. What numeric types does one really need? Integer numbers and real numbers. One of each and not more. If you want to be generous as a language designer, you can throw in an optimised implementation of a complex number type and a rational number type. However, in an object-oriented language with operator overloading, it’s fairly easy to express these in a library. The fixed-point type (sometimes called decimal type) is the subset of the rational type where the denominator is always a power of ten. So, that’s really all you need – a clean representation of the basic mathematical number systems.
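To make the silent overflow and the narrowing cast mentioned above concrete, here is a small Java sketch. The class and method names (OverflowDemo, addBytes, narrow) are invented for illustration; the behaviour shown is that of Java's ordinary primitive arithmetic and casts.

```java
// Hypothetical demo class; it only exercises standard Java primitive semantics.
final class OverflowDemo {

    // The sum is computed as an int, then the cast keeps only the low 8 bits.
    static byte addBytes(byte a, byte b) {
        return (byte) (a + b);
    }

    // A narrowing cast keeps only the low 16 bits of the int value.
    static short narrow(int value) {
        return (short) value;
    }

    public static void main(String[] args) {
        System.out.println(addBytes((byte) 127, (byte) 1)); // prints -128
        System.out.println(narrow(70000));                  // prints 4464
        System.out.println(0.1 + 0.2);                      // prints 0.30000000000000004
    }
}
```

None of these lines produces a compiler warning or a runtime exception, which is why such errors tend to surface only when the boundary values actually occur.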

At this point, you might object: “but the CPU register is only x bits wide,” or “how do I allocate an array of fifty thousand short values?”, or “can I still have 8-bit chars?” Unfortunately, there is no simple answer to these questions. The natural way to represent integers is to always use the machine’s native word width, but unfortunately that doesn’t solve the problem. First of all, the word width is architecture dependent. Second, it might be wasteful for large arrays that hold small numbers and on the other hand it would still be too small for applications that need big integers. The solution is of course a variable size type, i.e. an integer representation that can grow from byte size to multiple word lengths. We have variable length strings, so why shouldn’t we have variable length numbers? It seems perfectly natural. There is certainly some overhead involved, because variable length types need special encoding. The overhead will be most likely due to loading a descriptor value and/or to bit shifting operations. After all, variable length numbers don’t come for free, but they do offer tremendous advantages. They relieve the programmer from making type width decisions, as well as documenting these decisions – and worse – changing the type width later if the decision turned out to be inadequate. Furthermore, they eliminate the above mentioned bugs resulting from silent overflows and type cast errors, not to mention API proliferation due to type plurality. Thus variable length numbers are generally preferable to common fixed width types.
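Java's BigInteger, the “extra grande” type mentioned earlier, gives a rough impression of what a variable-length integer feels like in practice. The sketch below is only an approximation of the idea: BigInteger is a library class with immutable objects, not the built-in variable-size type argued for here, and the names VariableLengthDemo, factorial and factorialLong are made up for the example.

```java
import java.math.BigInteger;

// Hypothetical demo class comparing a fixed-width long with java.math.BigInteger.
final class VariableLengthDemo {

    // Grows to as many words as the result needs; it never overflows.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Same algorithm with a 64-bit long; it wraps around silently beyond 20!.
    static long factorialLong(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorialLong(20)); // still fits into 64 bits
        System.out.println(factorialLong(21)); // wrapped around: a negative number
        System.out.println(factorial(21));     // prints 51090942171709440000
    }
}
```

The programmer never has to decide on a width for the BigInteger version, which is the advantage described above; the price is the allocation and arithmetic overhead of the variable-length representation.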

Of course, there are situations where you know that you will never need more than a byte. There are also situations where performance is paramount. In addition, APIs and libraries based on multiple fixed types are not going to disappear overnight. To provide backward compatibility and to offer optimisation pathways to the programmer, a language could present these as subsets of the mathematical type. For example, if a language defines the keyword “int” for variable length integer numbers, then “int(8)” could mean a traditional byte, “int(16)” could mean a short word, and so on. Now, this is a bit like reintroducing cup typing through the back door. Therefore the use of subtypes for general purpose computations should be discouraged. However, it’s always better to have a choice of fixed and variable types than having no variable types at all.

Parallel Programming

During the last few years, we have seen a change of trend in CPU design. Until about 2003, CPUs became more powerful through frequency scaling. The number of operations per second increased exponentially over a period of 20 years. Clock speeds went from 4.77 MHz in the first IBM PC (1981) to 3.3 GHz in a high-end PC of the year 2003. Thus, Moore’s law was supported chiefly by increasing clock speed. Then something “strange” happened: clock speeds plateaued. For a brief moment, it appeared that Moore’s law was nearing its end. Not because of the physical limits of miniaturisation, or because higher clock speeds were impossible, but because of thermal problems associated with such high clock frequencies. Water cooled systems were already introduced at the top-end of the market, but obviously the energy efficiency of these systems presents a serious economic problem. Then in 2004, the first PC dual core processor was introduced. Dual cores became widely available in 2005 and by now we have become used to quad-core processors, while eight-core systems are around the corner. The trend is obvious: in the foreseeable future we will move from multi-core to many-core CPUs with dozens and perhaps even more cores.

Increasing the clock speed of a CPU has the effect that program execution speed increases proportionally with clock speed. The increase is independent of software and hardware architectures. A sequential algorithm simply runs twice as fast on a machine where CPU operations take half the time. Unfortunately, this is not the case for a machine with twice as many CPUs. A sequential algorithm runs just as fast on a 3 GHz single core CPU as on a 3 GHz dual core CPU, because it uses only one of the CPU cores. Increased overall execution speed is therefore only achieved when multiple tasks run concurrently, because they can be executed simultaneously on different CPUs.

This problem of parallelism is not new in computer science. Unfortunately though, today’s programming languages aren’t very well equipped to deal with parallel scaling. The computer language idiom that typifies multi-core concurrency is the thread model. Threads are lightweight processes that are typically used for self-contained subtasks. Several languages, notably Java, offer APIs and native support for their implementation. Threads are well suited to asynchronous processing, for example in communication and networking. They can also be used for simple parallelisation, where a computing problem is parallel by nature (for example serving multiple web documents at the same time). However, threads are relatively cumbersome and thus not really suitable for fine-grained parallel programming, such as traversing data structures or executing nested loops in parallel. Edward A. Lee describes these problems in his excellent article “The Problem with Threads”.

The fundamental problem that software engineers face is described by Amdahl’s law. Amdahl’s formula expresses the speed gain of an algorithm achieved by parallelisation: speedup = N / ( (B*N) + (1-B) ), where B is the non-parallelisable (sequential) fraction of the problem, expressed as a value between 0 and 1, and N is the number of worker threads or processor cores. There are two notable things about Amdahl’s law: (1) the speedup is highly dependent on B, and (2) the curve flattens with increasing N and approaches an upper limit of 1/B. It’s also important to know that Amdahl’s law assumes a constant problem size and ignores the control overhead that parallelisation requires, which makes it somewhat optimistic. Nevertheless, we can draw several conclusions from Amdahl’s law. First, the performance gain from parallel scaling is lower than that of frequency scaling. Second, it is not proportional to the number of cores. Third, it is highly dependent on software architecture, since the latter chiefly determines the size of B. From the perspective of a software engineer, the last conclusion is probably the most interesting one. It leads to the question of what can be done to maximise parallelisation at the level of the algorithm by exploiting data and task parallelism. The present thread model is too coarse-grained to provide solutions at the algorithm level. Hence, what is called for are new programming idioms that make this task easier.
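Amdahl's formula is easy to play with in code. The following Java sketch simply evaluates the formula as given above; the class name AmdahlDemo and the sample value B = 0.1 are assumptions chosen for illustration.

```java
// Hypothetical helper that evaluates speedup = N / (B*N + (1-B)).
final class AmdahlDemo {

    // sequentialFraction is B (between 0 and 1), workers is N (at least 1).
    static double speedup(double sequentialFraction, int workers) {
        return workers / (sequentialFraction * workers + (1.0 - sequentialFraction));
    }

    public static void main(String[] args) {
        double b = 0.1; // assumed: 10% of the work cannot be parallelised
        for (int n : new int[] {1, 2, 4, 8, 16, 64, 1024}) {
            System.out.printf("N = %4d  speedup = %.2f%n", n, speedup(b, n));
        }
        // However large N becomes, the speedup stays below 1 / B = 10.
    }
}
```

With a sequential share of only 10%, eight cores yield a speedup of less than five, which shows how strongly the result depends on B rather than on the number of cores.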

Ideally, parallelisation is fully automatic, implemented by the compiler and the underlying hardware architecture. In this case, the application programmer could simply continue to formulate sequential algorithms without having to worry about parallelism. Unfortunately, automatic parallelisation has turned out to be extremely complex and difficult to realise. Despite many years of research in compiler architecture, there are no satisfactory results. Parallelisation at the algorithm level involves several abstraction steps, such as decomposition, mapping, scheduling, synchronisation, and result merging. The hardest among these tasks is very likely decomposition, which means breaking down a sequential task into parts that can be solved in parallel. The preferred method for doing this is divide and conquer. A problem is divided into identical smaller pieces, where the division can be expressed recursively or iteratively. The problem chunks can then be solved in parallel and the subtask results are joined into one final result. Prime examples of this strategy are merge sort and sequential search algorithms. These are easily parallelisable. Likewise many operations on arrays and collections can be parallelised fairly easily. But certain tasks are not obviously parallelisable, for example the computation of the Fibonacci series: f(n) = f(n-1) + f(n-2). Since every step in the computation depends on the previous step, the Fibonacci algorithm is sequential by nature. In theoretical informatics, the question whether problems of the complexity classes P and NP are in principle parallelisable is still unsolved. For practical purposes, some problems are simply non-parallelisable.
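The steps named above (decomposition, mapping, scheduling, synchronisation, result merging) can be illustrated with a simple array sum. The following Java sketch uses a plain thread pool from java.util.concurrent; the class ChunkedSum and its method parallelSum are hypothetical names, and the fixed number of chunks is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: sums an array by iterative divide and conquer.
final class ChunkedSum {

    static long parallelSum(final int[] data, int chunks) {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            // Decomposition: cut the index range into roughly equal pieces.
            int chunkSize = (data.length + chunks - 1) / chunks; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int c = 0; c < chunks; c++) {
                final int from = Math.min(data.length, c * chunkSize);
                final int to = Math.min(data.length, from + chunkSize);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += data[i];
                    }
                    return sum;
                };
                parts.add(pool.submit(task)); // mapping and scheduling are left to the pool
            }
            // Synchronisation and result merging: get() waits for each chunk.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a chunk", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        System.out.println(parallelSum(data, 4));
    }
}
```

Even in this small case most of the code is bookkeeping for chunks, futures, and exceptions rather than the actual summation, which illustrates the drudge work that comes with the thread model.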

Given that automatic parallelisation is currently out of reach, the need for the facilitation of algorithm parallelisation in computer languages boils down to (1) the need for expressive constructs that allow programmers to express parallelism intuitively, and (2) the need for freeing programmers from writing boilerplate code for the drudge work of parallel execution, such as scheduling, controlling, and synchronisation.

There are already a number of computer languages that provide built-in idioms for parallel programming, such as Erlang, Parallel Haskell, MPD, Fortress, Linda, Charm++ and others. However, these are fringe languages with a small number of users. It is questionable whether the parallel scaling trend in hardware will lead to a wide adoption of any of these languages. Perhaps mainstream languages will evolve to acquire new APIs, libraries, and idioms to support parallel programming. For example, there are Compositional C++, MPI, Unified Parallel C, Co-Array Fortran and other existing extensions that make widely used languages more suitable for parallel programming, although there aren’t any established standards yet. It also remains to be seen whether the functional programming paradigm will catch on in view of parallel programming.

Java is promising, because the JDK 7 (Dolphin) will contain a new API for fine-grained parallel processing. In a very informative article, Brian Goetz, the author of Java Concurrency in Practice, introduces the new java.util.concurrent API features of Java SE 7. It will include a fork-join framework that simplifies the expression of parallel processing, for example in loops or in recursive method invocation. It will also provide new data structures, such as ParallelArray, that make parallel iteration easier to express. To learn more about parallel computing, read the free Introduction To Parallel Computing by Blaise Barney or search the free parallel programming books at Google Books.
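As a rough idea of what the fork-join style looks like, here is a recursive merge sort written against the ForkJoinPool and RecursiveAction classes as they exist in java.util.concurrent in current Java releases. The class MergeSortTask, its THRESHOLD value, and the sort helper are assumptions made for this sketch; it is not taken from the article mentioned above.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical task: sorts data[from, to) by recursive divide and conquer.
final class MergeSortTask extends RecursiveAction {
    private static final int THRESHOLD = 1_024; // assumed cut-off for sequential sorting

    private final int[] data;
    private final int from;
    private final int to;

    MergeSortTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            Arrays.sort(data, from, to); // small pieces are sorted sequentially
            return;
        }
        int mid = (from + to) >>> 1;
        // Both halves may run in parallel; invokeAll returns when both are done.
        invokeAll(new MergeSortTask(data, from, mid), new MergeSortTask(data, mid, to));
        merge(mid);
    }

    // Merges the sorted halves [from, mid) and [mid, to) back into data.
    private void merge(int mid) {
        int[] left = Arrays.copyOfRange(data, from, mid);
        int i = 0;
        int j = mid;
        int k = from;
        while (i < left.length && j < to) {
            data[k++] = (left[i] <= data[j]) ? left[i++] : data[j++];
        }
        while (i < left.length) {
            data[k++] = left[i++];
        }
        // Any remaining elements of the right half are already in place.
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool().invoke(new MergeSortTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] numbers = {5, 3, 9, 1, 7};
        sort(numbers);
        System.out.println(Arrays.toString(numbers)); // prints [1, 3, 5, 7, 9]
    }
}
```

The recursion reads almost like the sequential algorithm; scheduling and synchronisation are handled by the pool, which is the kind of relief from boilerplate called for above.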

Eclipsed by Europa

Have you been eclipsed lately? I mean software-wise, of course, using the Eclipse integrated development environment (IDE). Today I came pretty close to feeling eclipsed by the latest Eclipse download dubbed “Europa”. It began with the sheer size of the packages. 125 MB for the Eclipse JEE Europa version, 36 MB for web tools, another couple of megabytes for PHP development tools, visual editor, etc., etc. Not a problem with today’s DSL, you might think, and that is of course true if you live in America or Europe.

Unfortunately it’s a different story in Thailand. The Eclipse download page insisted on assigning a local mirror to me. This may be well-intentioned, but alas it is not very effective, because servers located in Thailand tend to be bandwidth-drained, especially if they belong to a public institution. Regrettably, there was no way to change that. The first connection delivered a whopping 4 kb/s download speed and stalled after 15 minutes. After that I decided to switch to BitTorrent. The BitTorrent software gave me 5 kb/s. That’s still dial-up speed, but at least a small improvement.

My computer spent the night downloading Eclipse files. In the morning, all the precious nuggets had arrived on my hard disk. Installation was a breeze compared to the download. Most of the standard components were already included in the Java EE download. I just had to add WTP, PDT, and a few other favourites on top of the Europa distro. In the past, this wasn’t always an easy task. I remember the time when some plugins depended on different, mutually incompatible versions of other plugins. The plugin architecture of the Eclipse framework is sort of asking for this type of problem.

Fortunately the days of dependency hell are gone thanks to synchronised update cycles of the Eclipse projects (there are 21 individual modules in the Europa release). In just a few minutes I had my shiny new Eclipse up and running. After starting it with the “-clean” parameter over the old workbench directory I was in business. It even accepted my old configuration settings. The thing I noticed first is that Europa launched considerably faster than my previous 3.2 version, but (good gracious) the title bar displayed “Program not responding” when I tried to interact with the menu. It seems that Eclipse is now initialising UI components after showing the IDE window and it keeps UI components locked during that time. Eventually after 10 seconds or longer, the program worked normally. I suspect that Eclipse is not loading faster after all; it just displays faster. The second surprise was that Eclipse opened an external editor when I double-clicked on a JavaScript file in the navigator. What on earth? I thought I had WTP installed, which supposedly includes a JavaScript editor. Instead of getting to the bottom of that question, I decided to install my favourite JavaScript editor JSEdit, which now belongs to Adobe, but is still distributed freely.

Since I use Eclipse for Java as well as PHP development, I gave the new PHP Development Tools (PDT) a spin. PDT is now dubbed the premier PHP editor for Eclipse. To be honest, I was mildly disappointed. First of all, the outline view did not work. While syntax colouring and code folding were okay, the PDT editor lacks some of the features that I have grown used to, such as automatic code completion, marking of uninitialised local variables (which is great for typos in variable names), occurrence highlighting, instant compilation, etc.

Since all of these are real productivity gainers, I quickly reverted to my old PHPEclipse plugin, after playing with PDT for a while. Because PDT is work in progress, I will certainly check back in a few months. The major incentive to use PDT instead of PHPEclipse is the integration of PDT with the Zend debugger plugin. What else is new in Eclipse? The interface now uses a gradient colour scheme which gives the UI a nice new look. Java code editing and refactoring have been improved. Code completion recognises Java types even if the respective imports haven’t been typed out yet. The code assist function is now able to determine the legal types of exceptions in a catch clause based on the contents of the try block.

Unused members and types are now detected, and refactoring can be invoked from the context menu, which makes repetitive tasks, such as renaming identifiers, really easy. Though all of these are small incremental improvements, overall the JDT has become more intelligent as well as faster, which I am sure every Java developer will appreciate. Of course, there are many more new features, in fact too many to list here. In summary, Europa is definitely worth the download, even if you should feel a little “eclipsed” by the number and size of its modules.