Java vs. PHP vs. Scala

I have a bit of a dilemma with programming languages. Next year, I expect to be able to free up a little extra time for a private programming project (call me an optimist!) and I am wondering which language/technology to use. The project is quite straightforward. It's a business application that I use for my own work as a software engineer. It consists of four components. There's a contact manager component (or CRM as it's now fashionably called), a project management component, a time tracking component, and a billing component. That may sound like a tall order, but obviously I don't need the full-blown functionality of applications like Siebel, MS Project, or SAP. I just need an application that brings certain functionality together in a quite specific way to suit my needs.

The software I am currently using for this purpose consists of two different programs. The CRM and billing components are contained in a Delphi application which I wrote more than 10 years ago. The time sheet and project management components are part of a PHP application that I developed in 2002. Needless to say that these two programs are neither cutting-edge, nor are they well integrated. The Delphi application uses an outdated Borland Paradox DB  and the PHP application contains large swathes of ugly procedural code. Although the whole shebang fulfils its purpose, I feel it's high time for a replacement. Of course, I could acquire an existing software package and save a lot of time writing code myself. But hey, I am a software engineer. I do like a creative challenge and I want something that fits my needs. I also want to learn new technologies.

The question I am asking myself now is what to use for the task. I am considering Java, PHP, and Scala. There are pros and cons for each of these:

(1) Java, JSP and a web framework with an app server. This is the obvious choice. Most of my professional work is JEE-based these days. I believe that I can work productively with Java, although the language inevitably involves a lot of boilerplate code and redundancy, which has a negative impact on productivity. In spite of this, it would be an good opportunity to deepen my knowledge of JSF (Java Server Faces), Hibernate, or try out some other persistence layer. It would also offer an opportunity to learn a new Java web framework that I haven't yet worked with such as Spring or Tapestry. From a business point of view, this may be a good choice because Java technologies are in high demand and it is also a very robust platform. The JEE universe is really quite large and there's enough territory that would be fun to explore. The downside is that Java, the language, is slightly tedious.

(2) The second choice is PHP and the Zend framework in combination with some AJAX toolkit, such as YUI or Dojo. I have the feeling this would be the most productive way to go; the biggest bang for the buck so to speak. For a project of this size (around 50 kloc), the development time may be even half of that with Java. PHP 5 and the Zend framework are mature technologies and I am quite familiar with both. Another advantage of PHP is that it's wide spread. Almost every hosting company offers PHP, whereas the number of Java hosting companies is considerably smaller (and usually more expensive). So, there wouldn't be any problem hosting the finished product anywhere. The downside is that PHP, being a dynamic language, is less robust  and slower than JVM bytecode. The language is also less expressive. But the biggest disadvantage is that I'd hardly learn anything new in the process.

(3) The third alternative is using Scala in combination with the Lift framework and a standard web container. I find this the most exciting choice, but it's very likely to be the most time consuming. I am rather new to Scala and functional programming. What I have seen so far is great. Programming in Scala is much more fun than coding in Java or PHP. I am afraid though, it would take a bit of time to wrap my head around it and work productively. Scala is still a foreign language to me. Another downside is that there is a limited choice of frameworks, APIs, and tools available at this point. Actually, Lift is the only Scala web framework I know of. Another question I am asking myself is whether acquiring Scala skills does make any business sense. I haven't seen too many Scala job offerings so far. Seems like the most fun choice, but also the least promising from a business point of view. Decisions, decisions, decisions…

Zend Studio for Eclipse

Usually I don’t talk about new product announcements in this blog, because there are just too many of them and often they are only of interest to a small group. Yet, I think the recent release of Zend Studio for Eclipse (Beta) will be exciting news for almost every PHP developer. Last year I had evaluated the previous version of Zend Studio, which is certainly excellent. However, I’ve grown attached to the Eclipse IDE, and therefore I did not make the switch and kept using the PHP Eclipse plugin instead. It seems that Eclipse has gained popularity across the board in the PHP world. Zend has recognised the signs of the time and released an Eclipse version of its flagship product earlier this month. This puts Zend Studio into direct competition with PDT and PHP Eclipse.

The Zend IDE has all the usual niceties, such as code assist, syntax colouring, code fold, class hierarchy view, property inspectors, auto format, and powerful search/replace functions without which most PHP developers probably cannot live any more, but there is even more. In addition to PHP code editing, the IDE also provides JavaScript and CSS editors. I have always wished for the sort of code refactoring functionality for PHP that you get with the Eclipse Java editor. The new Zend Studio finally offers this, although perhaps somewhat less extensively than what Java programmers are used to. At least there’s support for renaming and moving, as well as automatic includes. This alone should be a huge time saver, especially for larger projects with dozens or even hundreds of source files.

Further high-end features include support for PHPUnit Testing, PHPDoc generation, as well as debugging and profiling. I cannot stress enough how important these features are, especially when working in a team. The PHPUnit support generates skeleton test classes for all code elements and thus finally takes the pain out of creating meaningful test suites. Well, the programmer is still responsible for making them meaningful, but there isn’t so much typing involved any more. Likewise, with full PhpDoc support there is finally no more excuse for not including the programmer documentation in the sources. Project managers will probably welcome this, just as the felicitous integration of CVS and Subversion which works right out of the project view. Given “var_dump” and logging I have rarely felt the need for a debugger, but since debugging has always been a strength of Zend Studio, one might welcome this as an additional luxury. The profiler, on the other hand, can prove to be vital when checking large chunks of unfamiliar code for weaknesses.

Other noteworthy features are support for code templates and snippets (probably great for lazy typers with a good memory), inclusion of the open-source PHP/Java bridge which allows using Java classes from PHP and vice versa, and a number of wizards that help to get projects and classes getting off the ground faster, including support for WSDL/SOAP. Perhaps even more noteworthy are the integrated SQL GUI editor and HTML WYSIWYG editor. The SQL GUI offers the typical DB server explorer tree, table data viewing and editing in grids, a query editor with syntax highlighting and BLOB views. This is certainly helpful for casual database development, although “serious” DB developers might still prefer the respective DB manufacturer’s tools in combination with an ORM library. The HTML WYSIWYG editor is another really big addition. It looks a bit like an early version of Dreamweaver and it offers code, design and split-window code/design views. I doubt that it can replace a high-end designer tool, but it is certainly more than sufficient for the typical web application, and perhaps it even allows less artistically inclined programmers to add some visual polish to their pages.

In summary, Zend Studio for Eclipse is a high-end development tool with many advanced productivity features. The product is offered at a price of $254 USD and the final release is announced for the end of 2007. Further information, including a number of demo videos, is available at the Zend website.

Ajax development with Dojo

Almost exactly 11 years ago I made my first steps into the world of web programming by writing a web-based address book in Perl. The idea then was to teach myself how to create dynamic web pages. It took me about a week to learn Perl and to write this small app. Last week I have repeated the exercise, albeit on a different level. I wanted to teach myself the fundamentals of the Dojo toolkit, Dojo Ajax, and some of the more advanced MySQL features. It was also a welcome break from a longer period of Java work. I used a PHP5 OOP framework for the server side scripting. Mind you, not one of the super-bloated third party frameworks that can do everything from spitting out Pi to the thousandth digit to calculating your mortgage amortisation, but my own small set of classes, which don’t do much more than providing a simplified MVC scheme, DB abstraction, session encapsulation, and authentication. The focus was on Dojo, of course.

Dojo Ajax Application

The good news first. Dojo is very, very powerful. Unless you do really esoteric JavaScript programming, Dojo probably fulfils all scripting needs you will ever have. It offers a great number of widgets from Delphi-like layout containers, windows, tabs, and menus to sophisticated tree controls, sortable tables, and data stores. The out-of-the-box widgets are very easy to create and use. Almost no scripting is required. All you need to do is to add Dojo custom attributes to HTML div tags. It is worth learning Dojo for the widget set alone. Widgets provide an easy way to enhance the GUI of web applications and to create powerful interfaces. But GUI widgets is not all that Dojo has to offer. The second most important thing is probably the Ajax facility, which is likewise a felicitous implementation of a complex functionality. Dojo offers several different transport mechanisms for Ajax requests, including XMLHttpRequest and IFrames. It hides all the underlying complexity from the programmer, such as mechanism differences and browser incompatibilities, and encapsulates Ajax requests into a single function call. Furthermore Dojo offers a JSON-RPC client for use with Ajax, drag-and-drop functionality, form validation, cryptology functions, graphic functions and animation, math functions, xml parsing and more.

Unfortunately there is also bad news. Dojo’s comprehensiveness entails complexity. Granted, this complexity is well hidden from the user. Creating a widget is as easy as adding some Dojo-specific attributes to HTML div tags and including the respective Dojo libraries into a JavaScript section in the head of the HTML file. Things start getting difficult when you interact with widgets and when you customise them. For most non-trivial applications you probably need to do both. What makes this task difficult is chiefly the lack of a proper API documentation. At this time, Dojo only has a partial documentation which means that you either have to look at sample implementations (of which there are only a few), or look at the Dojo source code, or scour the Internet for tips and hints. This can be quite time consuming and frustrating, as the specific information is not always available. It has been said that the Dojo documentation was even more fragmentary before version 0.4. Another problem I came across is that when I started to combine different widgets, such as splitters, containers and layout widgets, the Internet Explorer 6 could not render the page any more. I felt reminded of the bad old days of the browser war when cross-browser compatibility was a software engineer’s pipe dream.

Productivity was likewise not very exalted. I spent 60+ hours on developing a simple address book application (see screenshot), and I am not even counting in the time I spent on learning Dojo. This application has two screens, one for contacts and one for organisations. Each screen consists of an entity list that can be searched in two modes (simple search and advanced), a form with detail data of a single entity, and two linked tables. Almost all data exchange with the server occurs via Ajax calls. Each of the two pages is driven by 475 lines of JavaScript code that handle the dynamics, on top of Dojo. Although it was an instructive experience, and although I am quite impressed with the breadth and power of the Dojo toolkit, I did not find it very practical for real-world application development. Considering that a conventional PHP application could be produced in a third of the time, it would be more economic to use IFrames for producing a similar UI. I reckon that Dojo is still far removed from being a RAD tool for web development. Conceptually it seems to be on the right track, however. If a proper documentation becomes available and if widget integration and interaction is streamlined, it might come out as a winner. IBM and Sun have recently announced their official support for Dojo, so perhaps we should keep an eye on its future development.

The triumph of the lamp

It is human to feel satisfaction when one’s predictions come true. To predict the success of LAMP in 1998 wasn’t that difficult, but neither was it a no-brainer.At the time, the acronym had just been coined by the German c’t magazine and it wasn’t widely known in the corporate world. LAMP stands for Linux-Apache-MySQL-PHP, a set of open-source software that powers web servers with dynamic content. Occasionally, the ‘P’ in LAMP is switched for Perl or Python, although PHP is now by far the most popular scripting language.

carbidelamp.jpgI remember having suggested a LAMP architecture for the implementation of a geo information system and extranet for a large government organisation in 1998. This project was on a tight budget, so I proposed to invest the resources into software development rather than into hardware and licenses. LAMP seemed ideal for it. However, the committee was utterly surprised that the word “Microsoft” did not appear in the proposal and they did not seem to put too much trust into any of the letters of L-A-M-P.

Luckily, another of my then customers was more open to the suggestion. A mid-sized logistics company was looking for new way to do business on the web. Since the company wanted to run their own servers, the LAMP stack offered a perfect solution to do this cost efficiently. It turned out to be foresighted decision. LAMP quickly gathered momentum on the Internet and soon became one of the mainstream web development platforms.

One has to keep in mind that the four pieces of software that make up LAMP have not been designed as a unified platform. On the contrary, they have different histories, and they were not specifically developed to work together. This is what distinguishes them from their competitor platforms ASP/.NET and Java/J2EE.

Let’s briefly go back to the year 1998. It was the time of the browser war and the boom. The Internet then consisted of about 5 million websites, which is less than one tenth of today (2005). Linux was at kernel version 2.8.x, Apache was at 1.3.0 and the Apache Software Foundation was not yet founded; MySQL was at version 3.21, and Andi Gutmans and Zeev Suraski just released the crucial PHP3 release.

Linux and Apache were already strong at the time. Linux was fairly mature OS with a userbase of 7.5 million in 1998. Torvalds had just trademarked the Linux name and the corporate world started to take notice. The Apache web server -originally developed as an extension of NCSA httpd server- already commanded 50% market share. MySQL and PHP, on the other hand, were new kids on the block. PHP had a userbase of only several ten thousand users and MySQL was widely considered a toy database.

The combination of these products, however, did one thing extremely well: powering dynamic websites. Plus they were free. Anyone who wanted to run a web server could use them without paying a single cent for license fees. Although MySQL and PHP had several limitations at the time, it did not really matter for the purpose of serving web pages. MySQL delivered impressive performance and PHP 3 was “cleaner” and easier to program than Perl. Thus the LAMP stack offered a killer platform for web applications.

Today, the quartet is more successful than ever. Apache powers 70% of all web servers. MySQL is the most widely used database on the Internet. It has matured into a full-featured RDBMS with transactions, replication, clustering, and (as of 5.0) stored procedures, views and triggers. PHP is now installed on 20% of all Internet sites. It fully supports the OOP paradigm and it has grown an extremely large function library that makes programmers feel like boys in a candy store.

What is behind the success of LAMP? In the case of Linux and Apache this is fairly easy to tell. They are both free and they offer excellent performance. You can easily pack 50 to 100 low traffic web sites onto a commodity PC. This makes Linux + Apache very popular with hosting companies who take advantage of the low cost of ownership. Furthermore, it allows service providers to customize the server’s configuration and administration model and apply it to an arbitrary number of cheap boxes which translates to arbitrary scalability. Hosting companies love it.

The case is slightly different with MySQL and PHP, because their growth is driven mainly by developers rather than by hosting companies. From the start, MySQL was geared towards web applications, which means its main strength is concurrent reads. Besides, it is easy to use and administer. PHP has become the scripting language of choice because its learning curve is almost zero, which means that programmers can be productive with PHP from day one. In addition, PHP is integrated very well with Apache from a very early time. It uses resources efficiently and avoids the CGI model and its known security problems.

All of the LAMP components originated around 1995 or before, which means that all of them recently passed their 10th birthday. LAMP can now be considered a mature architecture. The case of LAMP proves that using an open source platform is not like buying a pig in a poke. It proves that open source gets the job done, and -in the case of web servers- that it does the job better than anything else.