Parallel Programming

During the last few years, we have seen a change of direction in CPU design. Until about 2003, CPUs became more powerful through frequency scaling. The number of operations per second increased exponentially over a period of 20 years. Clock speeds went from 4.77 MHz in the first IBM PC (1981) to 3.3 GHz in a high-end PC of 2003. Thus, the performance growth associated with Moore’s law was driven chiefly by rising clock speeds. Then something “strange” happened: clock speeds plateaued. For a brief moment, it appeared that Moore’s law was nearing its end, not because of the physical limits of miniaturisation, or because higher clock speeds were impossible, but because of the thermal problems associated with such high clock frequencies. Water-cooled systems had already been introduced at the top end of the market, but the poor energy efficiency of these systems presents a serious economic problem. Then, in 2004, the first dual-core PC processor was introduced. Dual cores became widely available in 2005, and by now we have become used to quad-core processors, while eight-core systems are around the corner. The trend is obvious: in the foreseeable future we will move from multi-core to many-core CPUs with dozens of cores, and perhaps even more.

Increasing the clock speed of a CPU has the effect that program execution speed increases proportionally with clock speed. The increase is independent of software and hardware architecture: a sequential algorithm simply runs twice as fast on a machine where CPU operations take half the time. Unfortunately, this is not the case for a machine with twice as many CPUs. A sequential algorithm runs just as fast on a 3 GHz single-core CPU as on a 3 GHz dual-core CPU, because it uses only one of the CPU cores. An increase in overall execution speed is therefore only achieved when multiple tasks run concurrently, because they can be executed simultaneously on different cores. This problem of parallelism is not new in computer science. Unfortunately, though, today’s programming languages are not well equipped to deal with parallel scaling. The programming-language idiom that typifies multi-core concurrency is the thread model. Threads are lightweight processes that are typically used for self-contained subtasks. Several languages, notably Java, offer APIs and native support for them. Threads are well suited to asynchronous processing, for example in communication and networking. They can also be used for simple parallelisation where a computing problem is parallel by nature (for example, serving multiple web documents at the same time). However, threads are relatively cumbersome and thus not really suitable for fine-grained parallel programming, such as traversing data structures or executing nested loops in parallel. Edward A. Lee describes the problems with threads in his excellent article “The Problem with Threads”.
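To make the thread idiom concrete, here is a minimal Java sketch (the class and task names are invented for illustration): two self-contained subtasks run concurrently and the main thread waits for both to finish.

    // Two self-contained subtasks executed concurrently with plain Java threads.
    public class ThreadExample {
        public static void main(String[] args) throws InterruptedException {
            Thread worker1 = new Thread(new Runnable() {
                public void run() { System.out.println("serving document A"); }
            });
            Thread worker2 = new Thread(new Runnable() {
                public void run() { System.out.println("serving document B"); }
            });
            worker1.start();   // both subtasks now run concurrently
            worker2.start();
            worker1.join();    // wait for both subtasks to finish
            worker2.join();
        }
    }

Even in this trivial case, the boilerplate of creating, starting, and joining threads is apparent, which hints at why the model does not scale down well to fine-grained parallelism.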

The fundamental problem that software engineers face is described by Amdahl’s law. Amdahl’s formula expresses the speedup of an algorithm achieved by parallelisation: speedup = N / (B*N + (1-B)), where B is the non-parallelisable (sequential) fraction of the problem and N is the number of worker threads, or processor cores. There are two notable things about Amdahl’s law: (1) the speedup is highly dependent on B, and (2) the curve flattens with increasing N, approaching the asymptotic limit 1/B. It is also important to know that Amdahl’s law assumes a constant problem size, which is unrealistic given that parallelisation requires a certain amount of control overhead. Nevertheless, we can draw several conclusions from Amdahl’s law. First, the performance gain from parallel scaling is lower than that of frequency scaling. Second, it is not proportional to the number of cores. Third, it is highly dependent on software architecture, since the architecture chiefly determines the size of B. From the perspective of a software engineer, the last conclusion is probably the most interesting one. It leads to the question of what can be done at the algorithm level to maximise parallelisation by exploiting data and task parallelism. The present thread model is too coarse-grained to provide solutions at this level. Hence, what is called for are new programming idioms that make this task easier.
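As a quick worked illustration of the formula (a sketch that merely evaluates it, with an assumed sequential fraction of 10%), the following snippet prints the speedup for a growing number of cores:

    // Evaluates Amdahl's law: speedup = N / (B*N + (1 - B)),
    // where B is the sequential fraction and N the number of cores.
    public class Amdahl {
        static double speedup(double b, int n) {
            return n / (b * n + (1 - b));
        }
        public static void main(String[] args) {
            double b = 0.10;   // assumption: 10% of the work is sequential
            for (int n : new int[] {1, 2, 4, 8, 16, 64}) {
                System.out.printf("N=%3d  speedup=%5.2f%n", n, speedup(b, n));
            }
        }
    }

With B = 0.1, sixteen cores yield a speedup of roughly 6.4 and sixty-four cores less than 9; no number of cores pushes the speedup past 1/B = 10.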

Ideally, parallelisation would be fully automatic, implemented by the compiler and the underlying hardware architecture. In that case, the application programmer could simply continue to formulate sequential algorithms without having to worry about parallelism. Unfortunately, automatic parallelisation has turned out to be extremely complex and difficult to realise; despite many years of research in compiler architecture, there are no satisfactory results. Parallelisation at the algorithm level involves several abstraction steps, such as decomposition, mapping, scheduling, synchronisation, and result merging. The hardest among these is very likely decomposition, which means breaking down a sequential task into parts that can be solved in parallel. The preferred method for doing this is divide and conquer: a problem is divided into identical smaller pieces, where the division can be expressed recursively or iteratively. The problem chunks can then be solved in parallel and the subtask results joined into one final result. Prime examples of this strategy are merge sort and search algorithms, which are easily parallelisable. Likewise, many operations on arrays and collections can be parallelised fairly easily. But certain tasks are not obviously parallelisable, for example the computation of the Fibonacci series: f(n) = f(n-1) + f(n-2). Since every step in the computation depends on the previous steps, the Fibonacci algorithm is sequential by nature. In theoretical computer science, the question of whether all problems in the complexity classes P and NP are in principle parallelisable is still unsolved. For practical purposes, some problems are simply non-parallelisable.
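The divide-and-conquer structure described above can be sketched with plain Java threads for a trivially parallel problem such as summing an array; the class and variable names are invented for the example.

    import java.util.Arrays;

    // Divide and conquer: split the array, sum the halves concurrently, merge the results.
    public class ParallelSum {
        static long sum(long[] a, int from, int to) {
            long s = 0;
            for (int i = from; i < to; i++) s += a[i];
            return s;
        }

        public static void main(String[] args) throws InterruptedException {
            final long[] data = new long[1000000];
            Arrays.fill(data, 1L);
            final long[] partial = new long[2];   // one result slot per worker
            final int mid = data.length / 2;

            Thread left = new Thread(new Runnable() {
                public void run() { partial[0] = sum(data, 0, mid); }
            });
            Thread right = new Thread(new Runnable() {
                public void run() { partial[1] = sum(data, mid, data.length); }
            });
            left.start();
            right.start();
            left.join();      // synchronise with both workers
            right.join();
            System.out.println("total = " + (partial[0] + partial[1]));   // merge
        }
    }

The same split/solve/merge structure carries over to merge sort and similar divide-and-conquer algorithms; only the decomposition and the merge step change.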

Given that automatic parallelisation is currently out of reach, the need to facilitate algorithm parallelisation in computer languages boils down to (1) the need for expressive constructs that allow programmers to express parallelism intuitively, and (2) the need to free programmers from writing boilerplate code for the drudge work of parallel execution, such as scheduling, control, and synchronisation. There are already a number of computer languages that provide built-in idioms for parallel programming, such as Erlang, Parallel Haskell, MPD, Fortress, Linda, Charm++ and others. However, these are fringe languages with a small number of users, and it is questionable whether the parallel scaling trend in hardware will lead to wide adoption of any of them. Perhaps mainstream languages will evolve to acquire new APIs, libraries, and idioms to support parallel programming. For example, Compositional C++, MPI, Unified Parallel C, Co-Array Fortran and other existing extensions make widely used languages more suitable for parallel programming, although there aren’t any established standards yet. It also remains to be seen whether the functional programming paradigm will catch on in view of parallel programming. Java is promising, because JDK 7 (Dolphin) will contain a new API for fine-grained parallel processing. In this very informative article, Brian Goetz, the author of Java Concurrency in Practice, introduces the new java.util.concurrent API features of Java SE 7. It will include a fork-join framework that simplifies the expression of parallel processing, for example in loops or in recursive method invocations. It will also provide new data structures, such as ParallelArray, that make parallel iteration easier to express. To learn more about parallel computing, read the free Introduction To Parallel Computing by Blaise Barney or search the free parallel programming books at Google Books.
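To give a flavour of the fork-join style, here is a small sketch of a recursive array sum, written against the ForkJoinPool and RecursiveTask classes as they appear in java.util.concurrent in Java SE 7; the preview API discussed in Goetz’s article may differ in detail.

    import java.util.Arrays;
    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Recursive summation in the fork-join idiom: small ranges are computed
    // directly, larger ranges are split and the halves processed in parallel.
    public class ForkJoinSum extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10000;
        private final long[] data;
        private final int from, to;

        ForkJoinSum(long[] data, int from, int to) {
            this.data = data; this.from = from; this.to = to;
        }

        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long s = 0;
                for (int i = from; i < to; i++) s += data[i];
                return s;
            }
            int mid = (from + to) / 2;
            ForkJoinSum left = new ForkJoinSum(data, from, mid);
            ForkJoinSum right = new ForkJoinSum(data, mid, to);
            left.fork();                          // schedule the left half asynchronously
            long rightResult = right.compute();   // compute the right half in this thread
            return left.join() + rightResult;     // wait for the left half and merge
        }

        public static void main(String[] args) {
            long[] data = new long[1000000];
            Arrays.fill(data, 1L);
            ForkJoinPool pool = new ForkJoinPool();
            System.out.println("total = " + pool.invoke(new ForkJoinSum(data, 0, data.length)));
        }
    }

Compared to the hand-rolled thread version above, the scheduling, work distribution, and synchronisation are handled by the framework; the programmer only expresses the decomposition.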

Semantic vs. presentational HTML

Today I debated with my colleagues the differences and merits of semantic HTML versus presentational HTML. This may seem a fairly esoteric topic to non-developers. However, for web developers it touches upon a fundamental issue, namely that of best coding practices. Should HTML be coded with semantic or presentational preference? Are there different situations where one coding style is more appropriate than the other? And what constitutes semantic versus presentational HTML in the first place?

Since my colleagues and I left these issues somewhat unresolved, I am going to consider them in some more detail here. Web developers are divided into two camps: the semantic HTML advocates and the presentational HTML advocates. My colleague seemed to be arguing for the presentational approach. Before we look at the reasoning that backs each of the two positions, let us define these terms first.

Semantic HTML is the subset of HTML that describes the content and structure of a document, whereas presentational HTML is the subset of HTML used to determine the appearance of the document. While this definition is straightforward and unambiguous, in practice it is often difficult to draw the exact boundary between the two sets. In other words, it is not always easy to tell whether a given tag belongs in the semantic or in the presentational category.

Some HTML tags can be assigned quite easily, however. For example, <address>, <abbr>, <body>, <code>, and <kbd> are all semantic tags, while <center>, <font>, <hr>, <b>, and <br> (“bed and breakfast markup”) are all presentational. The same categorisation can be extended to distinguish between presentational and semantic attributes. In some cases, HTML offers semantic and presentational alternatives that achieve the same thing. For example, most browsers render <i> (presentational) exactly like <em> (semantic), and <b> (presentational) exactly like <strong> (semantic).

To make things even more complicated, there are HTML tags which have both presentational and semantic aspects and other tags which have neither. Tags like <button>, <caption>, and <table> are examples of hybrids, whereas <script>, <applet>, <object> are neither semantic nor presentational but constitute containers for other types of content.

There are two principal arguments for preferring semantic markup over presentational markup. The first is that semantic markup makes documents easier to understand for machine parsers, such as search engine robots, agents, screen readers, accessibility software, and the like. The second argument is that it is always a good idea to separate content from presentation, because it aids automation and helps to simplify maintenance. This argument gained momentum with the introduction of style sheets, which allow the presentational aspects to be moved to an external document.

There are also good arguments for preferring presentational markup over semantic markup. For example, there is the ease-of-use aspect. It is simply easier to write <b> than <strong> or <span style="font-weight:bold">. Then there is the backward-compatibility aspect. Most if not all of the presentational HTML markup is understood by even the most outdated browsers. The first pro-semantic argument can also be called into question, because today’s robots and search engine spiders are sophisticated enough to interpret the presentational aspects of a document and derive document structure from them.

Finally, the strongest point for giving presentational HTML preference is that HTML itself is designed for document presentation, not for document storage or structuring. My own point of view is that the distinction between presentational and semantic HTML is quite academic and probably irrelevant. We have to live with the fact that HTML is a bit messy by design. In practice, presentational elements are often (ab-)used to create document structure, for example by using <br> for paragraph separation. The reverse is also the case. Semantic and structural elements are often (ab-)used to create a certain visual appearance, as for instance the <blockquote> tag or the various tags used in conjunction with tables.

I tend to see HTML as a language that is chiefly concerned with presentation. In this capacity it has been extremely practical and successful. Ideally, HTML takes care of the document structure whereas CSS takes care of the finer aspects of visual appearance. In practice, however, it is rather difficult to achieve a complete separation. Therefore I suggest abandoning the attempt to rigidly structure content with semantic markup at the expense of visual definition.

If semantic structuring is a design goal, then choose a fitting XML format. XML is much better suited to that task and HTML can be generated quite easily from XML. The semantic approach only makes sense in those cases where rather simple documents are created in HTML and where HTML is the primary format. Otherwise semantics would have to be foisted onto the limited HTML vocabulary. Since the number of dynamically generated pages is outgrowing the number of static pages on the Internet, and since the use of XML is increasing, the distinction becomes less and less important.

Towards web engineering

Perhaps you have never heard of the term web engineering. That is not surprising, because it is not commonly used. The fact that it is not commonly used is, however, surprising. Very surprising, actually. The process of software creation resembles that of website creation, and engineering principles are readily applied to the former. A website can be considered an information system. However, there is one peculiarity: the creation of a website is as much a technical as an artistic process.

The design of graphics and text content (presentation) goes hand in hand with the design of data structures, interactive features, and user interface (application). Despite some crucial differences, web development and software development have many features in common. Since we are familiar with software engineering, and since we understand its principles and advantages, it seems sensible to apply similar principles to web development. Thus web engineering.

Before expanding on this thought and delving into the details of the engineering process, let’s try to define the term “web engineering”. Here is the definition that Prof. San Murugesan suggested in his 2005 paper Web Engineering: Introduction and Perspectives: “Web engineering uses scientific, engineering, and management principles and systematic approaches to successfully develop, deploy, and maintain high-quality web systems and applications.”

Maturation Stages

It seems that current web development practices rarely conform to this definition. Most websites are still implemented in an uncontrolled code-and-fix style without codified procedures and heuristics. The person responsible for implementation is often a webmaster who is essentially a skilled craftsman. The process of crafted web development is quite different from an engineering process. Generally speaking, there is a progression that all fields of industrial production go through, from craftsmanship to engineering. The following figure illustrates this maturation process (Steve McConnell, 2003):

Industry Maturity Stages

At the craft stage, production is carried out by accomplished craftsmen, who rely on artistic skill, rule of thumb, and experience to create a product. Craftsmen tend to make extravagant use of resources and production is often time consuming. Therefore, production cost is high.

The commercial stage is marked by a stronger economic orientation. Growing demand, increased competition, and companies instead of craftsmen carrying out work are hallmarks of the commercial phase. At this stage, the production process is systematically refined and codified. Commercial suppliers provide standardised products at competitive prices.

Some of the technical challenges encountered at the commercial stage cannot be solved, because the research and development costs are too high for individual manufacturers. If the economic stakes are high enough, a corresponding science will emerge. As the science matures, it develops theories that contribute to commercial practice. This is the point at which production reaches the professional engineering stage.

On a global level, software production was in the craft stage until the 1970s and has progressed to the commercial stage since then. Though the level of professional engineering is already on the horizon, the software industry has not reached it yet. Not all software producers make full use of available methods, while other more advanced methods are still being researched.

Recent developments in methodology

Web development became a new field of production with the universal breakthrough of the Internet in the 1990s. During the subsequent decade, web development largely remained in the craft stage. This is now changing, albeit slowly. The move from hand-coded websites to template-based websites, content management systems, and standard application packages signals the transition from the craft to the commercial stage. Nonetheless, web development has not yet drawn level with the state of the art in software development.

Until recently, web development was like a blast from the past. Scripting a web application involved the use of arcane languages for producing non-reusable, non-object-oriented, non-componentised, non-modularised, and in the worst case non-procedural code. HTML, JavaScript, and messy CGI scripts were glued together to create so-called dynamic web pages. In other words, web development was a primitive craft. Ironically, all principles of software engineering were either forgotten or ignored. Thus, in terms of work practices, developers found themselves thrown back twenty years. However, the rapid growth of the Internet quickly made these methods appear outmoded.

During the past 15 years, the World Wide Web underwent a transformation from a linked information repository (for which it was originally designed) to a universal vehicle for worldwide communication and transactions. It has become a major delivery platform for a wide range of applications, such as e-commerce, online banking, community sites, e-government, and others. This transition has created demand for new and advanced web development technologies.

Today, we have some of these technologies. We have unravelled the more obscure aspects of HTML. We have style sheets to encapsulate presentation logic. We have OOP scripting languages for web programming. We have multi-tier architectures. We have more capable componentised browsers. We have specialised protocols and applications for a great variety of purposes.

However, we don’t yet have established methodologies to integrate all these technologies and build robust, maintainable, high-quality web applications. We don’t yet have a universally applicable set of best practices that tells us how to build, say, a financial web application from a server language, a client language, and a database backend. Consequently, there is still a good deal of black magic involved in web development.

If you build a house, you can choose from a number of construction techniques and building materials, such as brick, wood, stone, or concrete. For any of these materials, established methods exist that describe how to join them to create buildings. Likewise, there are recognised procedures to install plumbing, electricity, drainage, and so on. When you build a house, you fit ready-made components together. Normally you don’t fabricate your own bricks, pipes, or sockets.

Unfortunately, this is not so in the field of web development. Web developers do not ubiquitously rely on standard components and established methods. On occasion, they still manufacture the equivalent of bricks, pipes, and sockets for their own purposes. And they fit them together at their own discretion, rather than by following standard procedures. Unsurprisingly, this results in a more time-consuming development process and in less predictable product quality.

Having recognised the need for web engineering, a number of questions arise. What does web engineering have in common with software engineering? What are the differences? Which software engineering methods lend themselves best to web development? Which methods must be defined from scratch? A detailed examination of all these questions is unfortunately beyond the scope of this article. However, we can briefly outline the span of the field and name those aspects that are most crucial to the web engineering process.

Web applications are different

Web applications are inherently different from traditional software applications, because they combine multimedia content (text, graphics, animation, audio, video) with procedural processing. Web development comprises software development as well as the discipline of publishing. This includes, for instance, authoring, proofing, editing, graphic design, layout, etc.

Web applications evolve faster and in smaller increments than conventional software. Installing, fixing, and updating a website is easier than distributing and installing a large number of applications on individual computers. Web applications can be used by anyone with Internet access. Hence, the user community may be vast and may have different cultural and educational backgrounds. Security and privacy requirements of web applications are more demanding. Web applications grow in an environment of rapid technological change. Developers must constantly cope with new standards, tools, and languages.

Multidisciplinary approach

Building a large website involves dissimilar tasks such as photo editing, graphics design, user interface design, copywriting, and programming, which in turn require a palette of dissimilar skills. It is therefore likely that a number of specialists are involved in the creation of a website, each one working on a different aspect of it. For example, there may be a writer, a graphic designer, a Flash specialist, and a programmer in the team. Hence, web development calls for a multidisciplinary approach and teamwork techniques. “Web development is a mixture between print publishing and software development, between marketing and computing, between internal communications and external relations, and between art and technology.” (Powell, 2000)

The website lifecycle

The concept of the website lifecycle is analogous to that of the software lifecycle. Since the field of software engineering knows several competing lifecycle models, there are likewise different approaches to website design. For example, the waterfall model can be used for relatively small web sites with mainly static content:

1. Requirements Analysis
2. Design
3. Implementation
4. Integration
5. Testing and Debugging
6. Installation
7. Maintenance

This methodology obviously fails for larger websites and web applications for the same reason it fails for larger software projects: the development process of large-scale projects is incremental, and requirements typically evolve with the project. But there is another argument that speaks for a more incremental/iterative methodology. Web applications are much easier to roll out and update than traditional software applications. Shorter lifecycles therefore make good practical sense. The “release early, release often” philosophy of the open source community certainly applies to web development. Frequent releases increase customer confidence, improve feedback, and help avoid early design mistakes.

Categories of web applications

Different types of web applications can be distinguished by functionality (Murugesan, 2005):

  • Informational: online newspapers, product catalogues, newsletters, manuals, reports, online classifieds, online books
  • Interactive: registration forms, customised information presentation, online games
  • Transactional: online shopping (ordering goods and services), online banking, online airline reservations, online payment of bills
  • Workflow-oriented: online planning and scheduling, inventory management, status monitoring, supply chain management
  • Collaborative work environments: distributed authoring systems, collaborative design tools
  • Online communities and market places: discussion groups, recommender systems, online market places, e-malls (electronic shopping malls), online auctions, intermediaries


Maintainability

Maintainability is an absolutely crucial aspect in the design of a website. Even small to medium sites can quickly become difficult to maintain. Many developers find it tempting to use static HTML for presentation logic, because it allows for quick prototyping, and because script code can be mixed seamlessly with plain HTML. What is more, presentation logic is notoriously difficult to separate from business and application logic in web applications. However, this approach is only suitable for very small sites. In most other cases, an HTML generator, a template engine, or a content management system will be more appropriate.

Dependency reduction also contributes to maintainability. Web applications have multiple levels of dependencies: internal script code dependencies, HTML and template dependencies, and style sheet dependencies. These issues need to be addressed individually. The reduction of dependencies usually comes at the price of increased redundancy or complexity. For example, it is more complex to use individual style sheets for different templates than to use one master style sheet for all of them. Internal dependencies and complexity levels need to be balanced by an experienced developer.

As in conventional software development, code reusability is paramount to maintainability. Modern object-oriented scripting languages allow for adequate encapsulation and componentisation of web application parts. The problem is that developers do not always make use of these programming techniques, because they require more upfront planning effort. In environments with high economic pressure, there is a tendency towards ad-hoc coding, which yields quick results but lacks maintainability.


Scalability

Scalability is one of the friendlier facets of web development, because the web platform is, at least in theory, inherently scalable. Increasing traffic can often be handled by simply upgrading the server system. Nonetheless, developers still need to consider scalability when designing software systems. Two important issues are session management and database management. Memory usage is proportional to the amount of session management data, and I/O and CPU load are proportional to concurrent database access, so developers do well to anticipate peak loads and optimise the links between the different tiers of an n-tier system.

Ergonomics and aesthetics

The look and feel of a website, its layout, navigation, colour schemes, menus, and so on make up its ergonomics and aesthetics. This is a more artistic aspect of web development, and it should be left to a professional with the relevant skills and a good understanding of usability. Software ergonomics and aesthetics should not be an afterthought, but an integral aspect of the development process. These factors are strongly influenced by culture. A website that appeals to a global audience is more difficult to build than a website targeted at a specific group and culture.

Website testing

Website testing comprises all aspects of conventional software testing. In addition, it involves special fields of testing which are specific to web development:

  • Page display
  • Semantic clarity and cohesion
  • Browser compatibility
  • Navigation
  • Usability
  • User Interaction
  • Performance
  • Security
  • Code standards compliance

Compatibility and interoperability

Web applications are generally intended to run on a large variety of client computers. Users might use different operating systems, different font sets, different languages, different monitors, different screen sizes, and different browsers to display web pages. In some cases, web applications run not only on standard PCs, but also on pocket computers, PDAs, mobile phones, and other devices. While it is nearly impossible to guarantee proper page display on all of these devices, there needs to be a clearly defined scope of interoperability. This scope should spell out at least the browsers, browser versions, languages, and screen sizes that are to be supported.


Steve McConnell (2003). Professional Software Development
San Murugesan (2005). Web Engineering: Introduction and Perspectives
T.A. Powell (1998). Web Site Engineering: Beyond Web Page Design

Climbing mount Java

You don’t want to build your programming career on dynamic languages alone, but you find C++ too messy, C# too proprietary, Delphi too underpowered, and D too esoteric? Java is your friend. Not only does Java top the TIOBE Programming Community Index of the most popular programming languages, it also offers an established framework for everything from mobile application programming to enterprise development. Moreover, it was designed as a cross-platform language from the outset. The “write once, run anywhere” philosophy isn’t a mere vision, or an overly optimistic design goal. With Java it is a reality. The greatest appeal of Java, however, may be its extremely solid software engineering foundation.

Everything in the Java world, from programming paradigms, code conventions, and documentation to unit tests and build systems, is standardised. These standards are applied industry-wide; they come as part of the package, and they are supported worldwide by the Java community. Programmers are almost forced to write maintainable, extensible, reusable, and well-documented code with Java. This is something that many other development platforms claim to achieve, but often fail to deliver. Naturally, the solid software engineering foundation that Java offers comes at a cost. The cost is complexity.

If you are new to Java and eager to learn the language, you are well advised to allocate ample time to the learning process. Despite what some book titles claim, and whatever your experience with scripting languages, Java cannot be learnt in 24 hours. Other book titles suggest “Java in 21 days”, which is somewhat more realistic. Since Java acquired important new features in version 5.0, such as enums, annotations, and generics, a month is probably needed to become familiar with the basics; more if you are also new to the object-oriented programming paradigm.
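To give a flavour of the Java 5.0 additions just mentioned, here is a small self-contained sketch (the names are invented for illustration) that combines an enum, a generic collection, an annotation, and the enhanced for loop:

    import java.util.ArrayList;
    import java.util.List;

    // A small illustration of Java 5.0 features: enums, generics, and annotations.
    public class Java5Features {
        enum Priority { LOW, MEDIUM, HIGH }   // enum: type-safe constants

        @Override                              // annotation: compiler-checked metadata
        public String toString() {
            return "Java5Features demo";
        }

        public static void main(String[] args) {
            List<Priority> queue = new ArrayList<Priority>();   // generics: no casts needed
            queue.add(Priority.HIGH);
            queue.add(Priority.LOW);
            for (Priority p : queue) {                          // enhanced for loop
                System.out.println(p);
            }
        }
    }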

“Okay, but this would be the same for C++ or any other statically typed OOP language,” you say. Yes, but Java is more than a language. It is a platform. It consists of dozens of development tools, hundreds of APIs, and thousands of classes that do everything from database access to 2D rendering. Sun has illustrated this neatly in its SDK documentation, where the Java SE platform is depicted as a brick wall, every brick representing a major API or technology. The Java language itself constitutes only the uppermost layer of the wall. Extrapolating from the time it takes to learn the language, a year seems a reasonable estimate for becoming familiar with each and every one of the depicted Java SE components.

And this is merely the beginning. We haven’t yet touched upon enterprise development with Java EE, e.g. Enterprise JavaBeans, web development with JSP/Servlets, XML, web services, and so on. Nor have we mentioned third-party tools and systems, such as Java IDEs, Ant, JUnit, Tomcat, JBoss, Hibernate, Struts, and whatever else belongs to the advanced Java programmer’s toolkit. From this it becomes clear that climbing mount Java is an extensive journey that requires significant effort. Mastering the platform probably takes several years of continued education and practice. Hopefully, this does not discourage newcomers. The rewards, I believe, are substantial.

So what’s a good starting point? Close to the source, namely on Sun’s own homepage, there is the indispensable JDK/Java SE documentation, which was already mentioned. Sun also offers a highly usable tutorial, which features general as well as specialised trails and topics. For those interested in Java training and certification, there are two excellent websites that offer copious training material; they are a beacon for newcomers. Finally, there is a plethora of Java books on the market. For a beginner, a combination of a reference (“Java in a Nutshell” by David Flanagan), an introductory text (“Head First Java” by Sierra and Bates), and a cookbook (“Effective Java Programming Language Guide” by Joshua Bloch) probably makes sense.

Freebie of the Month: PSPad

A good plain text editor is the Swiss army knife of every programmer. Unfortunately, the Windows operating system offers only the “Notepad” program in this category, which is the equivalent of a $1.50 plastic knife. If you want to do more than open an occasional README.TXT, then Notepad is definitely underpowered. This situation has created a market for commercial text editors, such as UltraEdit, CodeWright, EditPlus, and others. These are excellent products; however, they are not free. In the open source arena there are well-known editors, such as GNU Emacs and vim, which have evolved on the Unix platform. These editors are very powerful, but they are quirky and not exactly easy to learn and use. Why put up with a learning curve, when more user-friendly products are available? A multitude of freeware text editors with varying features is available for the Windows platform.

When I searched the Internet for a freeware editor, I was looking for raw power, speed, and features, in that order. The PSPad editor, written by the Czech author Jan Fiala, fits the bill perfectly. First of all, it is fast. Even on a modest Pentium 4 computer, it starts up in less than two seconds. This is an important characteristic, since a text editor might get loaded dozens of times in succession for viewing or changing different files. It also makes PSPad convenient to use when I don’t want to fire up a “heavy duty” IDE, such as Eclipse.

PSPad’s look is neat and functional. It presents itself with customisable toolbars, tabbed editor windows, and a logically structured menu. Text windows can also be floated or tiled. The feature set of PSPad can compete with commercial high-end products. It includes syntax highlighting for dozens of programming languages, automatic backups, macros, a hex edit mode, integrated diff comparisons, pluggable text converters, a customisable shortcut key map, a spell checker, support for Windows, Unix, and Mac line endings, support for different character sets, and HTML formatting and validation through Tidy. This makes it ideal for editing a wide variety of file types, from C++ source files to HTML pages, SQL statements, XML files, and shell scripts.

One feature I really like is the multi-language code explorer, something otherwise found only in high-end IDEs. The code explorer seems to be capable of displaying almost anything, from the DOM tree of an HTML document to a PHP or Java class. However, the most important aspect of a text editor for me is a powerful search and replace capability. In this area, PSPad once again delivers. It supports Perl-compatible regular expressions for search and replace operations, which is a make-or-break criterion for automated text processing. It also supports search and replace in multiple files, even recursively in subdirectories, which is again great for automated processing. The only limitation is that it cannot do both at the same time: it processes either regular expressions or multiple files, but not both. I am not sure why this limitation exists. Without it, PSPad would be pretty close to perfection.