Parallel Programming

During the last few years, we have seen a change of direction in CPU design. Until about 2003, CPUs became more powerful chiefly through frequency scaling. The number of operations per second increased exponentially over a period of 20 years: clock speeds went from 4.77 MHz in the first IBM PC (1981) to 3.3 GHz in a high-end PC of 2003. Thus, the performance growth associated with Moore’s law was driven chiefly by increasing clock speed. Then something “strange” happened: clock speeds plateaued. For a brief moment, it appeared that Moore’s law was nearing its end; not because of the physical limits of miniaturisation, or because higher clock speeds were impossible, but because of the thermal problems associated with such high clock frequencies. Water-cooled systems had already been introduced at the top end of the market, but the poor energy efficiency of such systems presents a serious economic problem. Then, in 2005, the first dual-core PC processors were introduced and quickly became widely available; by now we have become used to quad-core processors, while eight-core systems are around the corner. The trend is obvious: in the foreseeable future we will move from multi-core to many-core CPUs with dozens of cores, and perhaps even more.

Increasing the clock speed of a CPU has the effect that program execution speed increases proportionally with clock speed. The increase is independent of software and hardware architectures: a sequential algorithm simply runs twice as fast on a machine where CPU operations take half the time. Unfortunately, the same is not true for a machine with twice as many CPUs. A sequential algorithm runs just as fast on a 3 GHz single-core CPU as on a 3 GHz dual-core CPU, because it uses only one of the CPU cores. Increased overall execution speed is therefore only achieved when multiple tasks run concurrently, because they can be executed simultaneously on different CPUs. This problem of parallelism is not new in computer science. Unfortunately though, today’s programming languages aren’t very well equipped to deal with parallel scaling. The computer language idiom that typifies multi-core concurrency is the thread model. Threads are lightweight processes that are typically used for self-contained subtasks. Several languages, notably Java, offer APIs and native support for their implementation. Threads are well suited to asynchronous processing, for example in communication and networking. They can also be used for simple parallelisation, where a computing problem is parallel by nature (for example, serving multiple web documents at the same time). However, threads are relatively cumbersome and thus not really suitable for fine-grained parallel programming, such as traversing data structures or executing nested loops in parallel. Edward A. Lee describes the problems with threads in his excellent article “The Problem with Threads”.
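
To see why threads are considered cumbersome for this kind of work, here is a minimal sketch (my illustration, not from the original text) of parallelising a simple array sum with raw Java threads. The partitioning, the start/join choreography, and the merging of partial results are all manual boilerplate:

    public class ThreadedSum {
        public static void main(String[] args) throws InterruptedException {
            final long[] data = new long[1000000];
            java.util.Arrays.fill(data, 1L);

            final int nThreads = Runtime.getRuntime().availableProcessors();
            final long[] partial = new long[nThreads];
            Thread[] workers = new Thread[nThreads];

            final int chunk = data.length / nThreads;
            for (int t = 0; t < nThreads; t++) {
                final int id = t;
                final int lo = t * chunk;
                final int hi = (t == nThreads - 1) ? data.length : lo + chunk;
                workers[t] = new Thread(new Runnable() {   // one worker per chunk
                    public void run() {
                        long sum = 0;
                        for (int i = lo; i < hi; i++) sum += data[i];
                        partial[id] = sum;                 // workers share nothing
                    }
                });
                workers[t].start();
            }

            long total = 0;
            for (int t = 0; t < nThreads; t++) {           // synchronise and merge
                workers[t].join();
                total += partial[t];
            }
            System.out.println("total = " + total);
        }
    }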

The fundamental problem that software engineers face is described by Amdahl’s law. Amdahl’s formula expresses the speed gain of an algorithm achieved by parallelisation: speedup = N / (B*N + (1-B)), where B is the non-parallelisable (sequential) fraction of the problem and N is the number of worker threads, or processor cores. There are two notable things about Amdahl’s law: (1) the speedup is highly dependent on B, and (2) the curve flattens with increasing N, approaching the asymptote 1/B. It is also important to know that Amdahl’s law assumes a constant problem size and ignores the control overhead that parallelisation itself introduces, so in practice it is, if anything, optimistic. Nevertheless, we can draw several conclusions from it. First, the performance gain from parallel scaling is lower than that from frequency scaling. Second, it is not proportional to the number of cores. Third, it is highly dependent on software architecture, since the latter chiefly determines the size of B. From the perspective of a software engineer, the last conclusion is probably the most interesting one. It leads to the question of what can be done at the level of the algorithm to maximise parallelisation by exploiting data and task parallelism. The present thread model is too coarse-grained to provide solutions at the algorithm level. Hence, what is called for are new programming idioms that make this task easier.
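
To make the numbers concrete, here is a small throwaway snippet (my illustration) that evaluates the formula for a problem whose sequential fraction is 10% (B = 0.1). The printed speedups flatten visibly towards the asymptote 1/B = 10:

    public class Amdahl {
        // speedup = N / (B*N + (1-B)), as given above
        static double speedup(double b, int n) {
            return n / (b * n + (1 - b));
        }

        public static void main(String[] args) {
            double b = 0.1;                          // 10% of the work is sequential
            for (int n = 1; n <= 1024; n *= 2) {
                System.out.printf("N = %4d   speedup = %.2f%n", n, speedup(b, n));
            }
            // No matter how many cores are added, the speedup never exceeds 1/B = 10.
        }
    }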

Ideally, parallelisation would be fully automatic, implemented by the compiler and the underlying hardware architecture. In that case, the application programmer could simply continue to formulate sequential algorithms without having to worry about parallelism. Unfortunately, automatic parallelisation has turned out to be extremely complex and difficult to realise. Despite many years of research in compiler architecture, there are no satisfactory results. Parallelisation at the algorithm level involves several abstraction steps, such as decomposition, mapping, scheduling, synchronisation, and result merging. The hardest among these is very likely decomposition: breaking down a sequential task into parts that can be solved in parallel. The preferred method for doing this is divide and conquer. A problem is divided into smaller pieces of identical structure, where the division can be expressed recursively or iteratively. The problem chunks can then be solved in parallel, and the subtask results are joined into one final result. Prime examples of this strategy are merge sort and linear search, which are easily parallelisable. Likewise, many operations on arrays and collections can be parallelised fairly easily. But certain tasks are not obviously parallelisable, for example the computation of the Fibonacci series: f(n) = f(n-1) + f(n-2). Since every step in the computation depends on the results of previous steps, the Fibonacci algorithm is sequential by nature. In theoretical computer science, the question whether every efficiently solvable problem (the class P) can also be efficiently parallelised is still unsolved. For practical purposes, some problems are simply non-parallelisable.
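
Expressed with the fork-join style of programming discussed below, divide and conquer looks roughly like this sketch of a parallel array sum. This is my illustration against the jsr166y preview of the framework planned for Java 7; the package and class names are preliminary and may change before JDK 7 ships:

    // Preview fork-join API: in the jsr166y draft these classes live in the
    // jsr166y.forkjoin package; they are expected to move into
    // java.util.concurrent when JDK 7 ships.
    import jsr166y.forkjoin.ForkJoinPool;
    import jsr166y.forkjoin.RecursiveTask;

    // Divide and conquer with fork-join: split until chunks are small,
    // solve chunks sequentially, then merge the partial results.
    class SumTask extends RecursiveTask<Long> {
        static final int THRESHOLD = 10000;
        final long[] data; final int lo, hi;

        SumTask(long[] data, int lo, int hi) {
            this.data = data; this.lo = lo; this.hi = hi;
        }

        protected Long compute() {
            if (hi - lo <= THRESHOLD) {        // base case: solve sequentially
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += data[i];
                return sum;
            }
            int mid = (lo + hi) >>> 1;         // divide
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();                       // solve the halves in parallel
            long rightSum = right.compute();
            return rightSum + left.join();     // conquer: merge the sub-results
        }

        public static void main(String[] args) {
            long[] values = new long[5000000];
            java.util.Arrays.fill(values, 1L);
            long total = new ForkJoinPool().invoke(
                    new SumTask(values, 0, values.length));
            System.out.println("total = " + total);
        }
    }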

Given that automatic parallelisation is currently out of reach, the facilitation of algorithm parallelisation in computer languages boils down to two needs: (1) expressive constructs that allow programmers to express parallelism intuitively, and (2) freeing programmers from the boilerplate drudge work of parallel execution, such as scheduling, control, and synchronisation. There are already a number of computer languages that provide built-in idioms for parallel programming, such as Erlang, Parallel Haskell, MPD, Fortress, Linda, Charm++ and others. However, these are fringe languages with a small number of users. It is questionable whether the parallel scaling trend in hardware will lead to a wide adoption of any of them. Perhaps mainstream languages will evolve to acquire new APIs, libraries, and idioms to support parallel programming. For example, Compositional C++, MPI, Unified Parallel C, Co-Array Fortran and other existing extensions make widely used languages more suitable for parallel programming, although there aren’t any established standards yet. It also remains to be seen whether the functional programming paradigm will catch on in view of parallel programming. Java is promising, because JDK 7 (Dolphin) will contain a new API for fine-grained parallel processing. In a very informative article, Brian Goetz, the author of Java Concurrency in Practice, introduces the new java.util.concurrent API features of Java SE 7. It will include a fork-join framework that simplifies the expression of parallel processing, for example in loops or in recursive method invocations. It will also provide new data structures, such as ParallelArray, that make parallel iteration easier to express. To learn more about parallel computing, read the free Introduction To Parallel Computing by Blaise Barney or search the free parallel programming books at Google Books.
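
To give a flavour of ParallelArray, here is a sketch adapted from the style of the examples in Goetz’s article: it filters and maps a data set in parallel. The package name and the Ops helper interfaces are taken from the jsr166y preview, so the exact names are assumptions and may well change before release:

    import jsr166y.forkjoin.ForkJoinPool;
    import jsr166y.forkjoin.ParallelArray;
    import jsr166y.forkjoin.Ops.ObjectToDouble;
    import jsr166y.forkjoin.Ops.Predicate;

    class BestGpa {
        static class Student {
            int graduationYear;
            double gpa;
        }

        // "Find the best GPA among this year's seniors", evaluated in parallel.
        static double bestGpa(Student[] data, final int thisYear, ForkJoinPool pool) {
            ParallelArray<Student> students = new ParallelArray<Student>(pool, data);
            Predicate<Student> isSenior = new Predicate<Student>() {
                public boolean op(Student s) { return s.graduationYear == thisYear; }
            };
            ObjectToDouble<Student> selectGpa = new ObjectToDouble<Student>() {
                public double op(Student s) { return s.gpa; }
            };
            return students.withFilter(isSenior)    // select a subset ...
                           .withMapping(selectGpa)  // ... project one field ...
                           .max();                  // ... and reduce, in parallel
        }
    }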

Eclipsed by Europa

Have you been eclipsed lately? I mean software-wise, of course, using the Eclipse integrated development environment (IDE). Today I came pretty close to feeling eclipsed by the latest Eclipse download, dubbed “Europa”. It began with the sheer size of the packages: 125 MB for the Eclipse JEE Europa version, 36 MB for the web tools, another couple of megabytes for the PHP development tools, the visual editor, etc., etc. Not a problem with today’s DSL, you might think, and that is of course true if you live in America or Europe.

Unfortunately, it’s a different story in Thailand. The Eclipse download page insisted on assigning a local mirror to me. This may be well-intentioned, alas not very effective, because servers located in Thailand tend to be bandwidth-drained, especially if they belong to a public institution. Regrettably, there was no way to change that. The first connection delivered a whopping 4 kb/s download speed and stalled after 15 minutes. After that I decided to switch to BitTorrent. The BitTorrent software gave me 5 kb/s. That’s still dial-up speed, but at least a small improvement.

My computer spent the night downloading Eclipse files. In the morning, all the precious nuggets had arrived on my hard disk. Installation was a breeze compared to the download. Most of the standard components were already included in the Java EE download. I just had to add WTP, PDT, and a few other favourites on top of the Europa distro. In the past, this wasn’t always an easy task. I remember the time when some plugins depended on different, mutually incompatible versions of other plugins. The plugin architecture of the Eclipse framework is sort of asking for this type of problem.

Fortunately, the days of dependency hell are gone thanks to the synchronised update cycles of the Eclipse projects (there are 21 individual modules in the Europa release). In just a few minutes I had my shiny new Eclipse up and running. After starting it with the “-clean” parameter over the old workspace directory, I was in business. It even accepted my old configuration settings. The thing I noticed first was that Europa launched considerably faster than my previous 3.2 version, but, good gracious, the title bar displayed “Program not responding” when I tried to interact with the menu. It seems that Eclipse now initialises UI components after showing the IDE window and keeps them locked during that time. Eventually, after 10 seconds or longer, the program worked normally. I suspect that Eclipse is not loading faster after all; it just displays faster. The second surprise was that Eclipse opened an external editor when I double-clicked on a JavaScript file in the navigator. What on earth? I thought I had WTP installed, which supposedly includes a JavaScript editor. Instead of getting to the bottom of that question, I decided to install my favourite JavaScript editor, JSEdit, which now belongs to Adobe but is still distributed freely.

Since I use Eclipse for Java as well as PHP development, I gave the new PHP Development Tools (PDT) a spin. PDT is now dubbed the premier PHP editor for Eclipse. To be honest, I was mildly disappointed. First of all, the outline view did not work. And while syntax colouring and code folding were okay, the PDT editor lacks some of the features that I have grown used to, such as automatic code completion, marking of uninitialised local variables (which is great for catching typos in variable names), occurrence highlighting, instant compilation, etc.

Since all of these are real productivity gainers, I quickly reverted to my old PHPEclipse plugin after playing with PDT for a while. Because PDT is a work in progress, I will certainly check back in a few months. The major incentive to use PDT instead of PHPEclipse is the integration of PDT with the Zend debugger plugin. What else is new in Eclipse? The interface now uses a gradient colour scheme, which gives the UI a nice new look. Java code editing and refactoring have been improved. Code completion recognises Java types even if the respective imports haven’t been typed out yet. The code assist function is now able to determine the legal types of exceptions in a catch clause based on the contents of the try block.

Unused members and types are now detected, and refactoring can be invoked from the context menu, which makes repetitive tasks, such as renaming identifiers, really easy. Though all of these are small incremental improvements, overall the JDT has become more intelligent as well as faster, which, I am sure, every Java developer will appreciate. Of course, there are many more new features, in fact too many to list here. In summary, Europa is definitely worth the download, even if you should feel a little “eclipsed” by the number and size of its modules.

Towards web engineering

Perhaps you have never heard of the term web engineering. That is not surprising, because it is not commonly used. The fact that it is not commonly used is, however, surprising. Very surprising, actually. The process of website creation resembles that of software creation, and engineering principles are readily applied to the latter. A website can be considered an information system. However, there is one peculiarity: the creation of a website is as much a technical as an artistic process.

The design of graphics and text content (presentation) goes hand in hand with the design of data structures, interactive features, and user interface (application). Despite some crucial differences, web development and software development have many features in common. Since we are familiar with software engineering, and since we understand its principles and advantages, it seems sensible to apply similar principles to web development. Thus web engineering.

Before expanding on this thought and delving into the details of the engineering process, let’s try to define the term “web engineering”. Here is the definition that Prof. San Murugesan suggested in his 2005 paper Web Engineering: Introduction and Perspectives: “Web engineering uses scientific, engineering, and management principles and systematic approaches to successfully develop, deploy, and maintain high-quality web systems and applications.”

Maturation Stages

It seems that current web development practices rarely conform to this definition. Most websites are still implemented in an uncontrolled code-and-fix style without codified procedures and heuristics. The person responsible for implementation is often a webmaster who is basically a skilled craftsman. The process of crafted web development is quite different from an engineering process. Generally speaking, there is a progression that all fields of industrial production go through, from craftsmanship to engineering. The following figure illustrates this maturation process (Steve McConnell, 2003):

Industry Maturity Stages

At the craft stage, production is carried out by accomplished craftsmen, who rely on artistic skill, rule of thumb, and experience to create a product. Craftsmen tend to make extravagant use of resources, and production is often time-consuming. Therefore, production cost is high.

The commercial stage is marked by a stronger economic orientation. Growing demand, increased competition, and companies instead of craftsmen carrying out work are hallmarks of the commercial phase. At this stage, the production process is systematically refined and codified. Commercial suppliers provide standardised products at competitive prices.

Some of the technical challenges encountered at the commercial stage cannot be solved, because the research and development costs are too high for individual manufacturers. If the economic stakes are high enough, a corresponding science will emerge. As the science matures, it develops theories that contribute to commercial practice. This is the point at which production reaches the professional engineering stage.

On a global level, software production was in the craft stage until the 1970s and has progressed to the commercial stage since then. Though the level of professional engineering is already on the horizon, the software industry has not reached it yet. Not all software producers make full use of available methods, while other more advanced methods are still being researched.

Recent developments in methodology

Web development became a new field of production with the universal breakthrough of the Internet in the 1990s. During the subsequent decade, web development largely remained in the craft stage. This is now changing, albeit slowly. The move from hand-coded websites to template-based websites, content management systems, and standard application packages signals the transition from the craft to the commercial stage. Nonetheless, web development has not yet drawn level with the state of the art in software development.

Until recently, web development was like a blast from the past. Scripting a web application involved the use of arcane languages for producing non-reusable, non-object-oriented, non-componentised, non-modularised, and in the worst case non-procedural code. HTML, JavaScript, and messy CGI scripts were glued together to create so-called dynamic web pages. In other words, web development was a primitive craft. Ironically, all principles of software engineering were either forgotten or ignored. Thus, in terms of work practices, developers found themselves thrown back twenty years. However, the rapid growth of the Internet quickly made these methods appear outmoded.

During the past 15 years, the World Wide Web underwent a transformation from a linked information repository (for which it was originally designed) to a universal vehicle for worldwide communication and transactions. It has become a major delivery platform for a wide range of applications, such as e-commerce, online banking, community sites, e-government, and others. This transition has created demand for new and advanced web development technologies.

Today, we have some of these technologies. We have unravelled the more obscure aspects of HTML. We have style sheets to encapsulate presentation logic. We have OOP scripting languages for web programming. We have multi-tier architectures. We have more capable componentised browsers. We have specialised protocols and applications for a great variety of purposes.

However, we don’t yet have established methodologies to integrate all these technologies and build robust, maintainable, high-quality web applications. We don’t yet have a universally applicable set of best practices that tells us how to build, say, a financial web application from a server language, a client language, and a database backend. Consequently, there is still a good deal of black magic involved in web development.

If you build a house, you can choose from a number of construction techniques and building materials, such as brick, wood, stone, or concrete. For any of these materials, established methods exist that describe how to join them to create buildings. Likewise, there are recognised procedures to install plumbing, electricity, drainage, and so on. When you build a house, you fit ready-made components together. Normally you don’t fabricate your own bricks, pipes, or sockets.

Unfortunately, this is not so in the field of web development. Web developers do not ubiquitously rely on standard components and established methods. On occasion, they still manufacture the equivalent of bricks, pipes, and sockets for their own purposes. And they fit them together at their own discretion, rather than by following standard procedures. Unsurprisingly, this results in a more time-consuming development process and in less predictable product quality.

Having recognised the need for web engineering, a number of questions arise. What does web engineering have in common with software engineering? What are the differences? Which software engineering methods lend themselves best to web development? Which methods must be defined from scratch? A detailed examination of all these questions is unfortunately beyond the scope of this article. However, we can briefly outline the span of the field and name those aspects that are most crucial to the web engineering process.

Web applications are different

Web applications are inherently different from traditional software applications, because they combine multimedia content (text, graphics, animation, audio, video) with procedural processing. Web development comprises software development as well as the discipline of publishing. This includes, for instance, authoring, proofing, editing, graphic design, layout, etc.

Web applications evolve faster and in smaller increments than conventional software. Installing, fixing, and updating a website is easier than distributing and installing a large number of applications on individual computers. Web applications can be used by anyone with Internet access. Hence, the user community may be vast and may have different cultural and educational backgrounds. Security and privacy requirements of web applications are more demanding. Web applications grow in an environment of rapid technological change. Developers must constantly cope with new standards, tools, and languages.

Multidisciplinary approach

Building a large website involves dissimilar tasks such as photo editing, graphic design, user interface design, copy writing, and programming, which in turn require a palette of dissimilar skills. It is therefore likely that a number of specialists are involved in the creation of a website, each one working on a different aspect of it. For example, there may be a writer, a graphic designer, a Flash specialist, and a programmer in the team. Hence, web development calls for a multidisciplinary approach and teamwork techniques. “Web development is a mixture between print publishing and software development, between marketing and computing, between internal communications and external relations, and between art and technology.” (Powell, 2000)

The website lifecycle

The concept of the website lifecycle is analogous to that of the software lifecycle. Since the field of software engineering knows several competing lifecycle models, there are likewise different approaches to website design. For example, the waterfall model can be used for relatively small websites with mainly static content:

1. Requirements Analysis
2. Design
3. Implementation
4. Integration
5. Testing and Debugging
6. Installation
7. Maintenance

This methodology obviously fails for larger websites and web applications, for the same reason it fails for larger software projects: the development process of large-scale projects is incremental, and requirements typically evolve with the project. But there is another argument that speaks for a more incremental/iterative methodology. Web applications are much easier to roll out and update than traditional software applications. Shorter lifecycles therefore make good practical sense. The “release early, release often” philosophy of the open source community certainly applies to web development. Frequent releases increase customer confidence and feedback, and help avoid early design mistakes.

Categories of web applications

Different types of web applications can be distinguished by functionality (Murugesan, 2005):

  • Informational: online newspapers, product catalogues, newsletters, manuals, reports, online classifieds, online books
  • Interactive: registration forms, customised information presentation, online games
  • Transactional: online shopping (ordering goods and services), online banking, online airline reservation, online payment of bills
  • Workflow-oriented: online planning and scheduling, inventory management, status monitoring, supply chain management
  • Collaborative work environments: distributed authoring systems, collaborative design tools
  • Online communities and marketplaces: discussion groups, recommender systems, online marketplaces, e-malls (electronic shopping malls), online auctions, intermediaries

Maintainability

Maintainability is an absolutely crucial aspect in the design of a website. Even small to medium sites can quickly become difficult to maintain. Many developers find it tempting to use static HTML for presentation logic, because it allows for quick prototyping, and because script code can be mixed seamlessly with plain HTML. What is more, presentation logic is notoriously difficult to separate from business and application logic in web applications. However, this approach is only suitable for very small sites. In most other cases, an HTML generator, a template engine, or a content management system will be more appropriate.
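
As a trivial illustration of the principle (a contrived sketch, not tied to any particular template engine), even the simplest template mechanism keeps the markup free of code and the code free of markup:

    import java.text.MessageFormat;

    // Minimal template idea: the "template" holds only presentation,
    // the code supplies only data.
    class TemplateDemo {
        static final String TEMPLATE =
            "<html><body><h1>{0}</h1><p>{1}</p></body></html>";

        public static void main(String[] args) {
            String page = MessageFormat.format(TEMPLATE, "Welcome", "Hello, world.");
            System.out.println(page);
        }
    }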

Dependency reduction also contributes to maintainability. Web applications have multiple levels of dependencies: internal script code dependencies, HTML and template dependencies, and style sheet dependencies. These issues need to be addressed individually. The reduction of dependencies usually comes at the price of increased redundancy and complexity. For example, it is more complex to use individual style sheets for different templates than to use one master style sheet for all. Internal dependencies and complexity levels need to be balanced by an experienced developer.

As in conventional software development, code reusability is paramount to maintainability. Modern object-oriented scripting languages allow for adequate encapsulation and componentisation of web application parts. The problem is that developers do not always make use of these programming techniques, because they require more upfront planning effort. In environments with high economic pressure, there is a tendency towards ad-hoc coding, which yields quick results but lacks maintainability.

Scalability

Scalability is one of the friendlier facets of web development, because the web platform is, at least in theory, inherently scalable. Increasing traffic can often be handled by simply upgrading the server system. Nonetheless, developers still need to consider scalability when designing software systems. Two important issues are session management and database management. Memory usage is proportional to the amount of session management data, and I/O and CPU load are proportional to concurrent database access; developers therefore do well to anticipate peak loads and optimise the links between the different tiers of an n-tier system.

Ergonomics and aesthetics

The look-and-feel of a website (its layout, navigation, colour schemes, menus, etc.) makes up its ergonomics and aesthetics. This is a more artistic aspect of web development, and it should be left to a professional with the relevant skills and a good understanding of usability. Software ergonomics and aesthetics should not be an afterthought, but an integral aspect of the development process. These factors are strongly influenced by culture. A website that appeals to a global audience is more difficult to build than a website targeted at a specific group and culture.

Website testing

Website testing comprises all aspects of conventional software testing. In addition, it involves special fields of testing which are specific to web development:

  • Page display
  • Semantic clarity and cohesion
  • Browser compatibility
  • Navigation
  • Usability
  • User Interaction
  • Performance
  • Security
  • Code standards compliance

Compatibility and interoperability

Web applications are generally intended to run on a large variety of client computers. Users might use different operating systems, different font sets, different languages, different monitors, different screen sizes, and different browsers to display web pages. In some cases, web applications do not only run on standard PCs, but also on pocket computers, PDAs, mobile phones, and other devices. While it is nearly impossible to guarantee proper page display on all of these devices, there needs to be a clearly defined scope of interoperability. This scope should spell out at least the browsers, browser versions, languages, and screen sizes that are to be supported.

Bibliography

Steve McConnell (2003). Professional Software Development
San Murugesan (2005). Web Engineering: Introduction and Perspectives
T.A. Powell (1998). Web site engineering: Beyond Web page design

Ten sure-fire ways to crash your IT project

Although I am sure that you don’t need to learn how to crash IT projects, especially not your own, I would like to suggest this topic for three reasons. First, it’s fun. Gloating over the misfortunes of others may not be noble, but it is certainly edifying. Second, ever since Charles Babbage invented the computer, it has been crashing. From blue screens of death to lost space probes, crashes seem to be an intrinsic part of the IT field. Third, we can actually learn from mistakes, even if they are not our own.

(1) Ambiguous specifications
(2) Lack of vision and communication
(3) Planning for disaster
(4) Lack of management commitment
(5) Lack of staff involvement
(6) Arrogance and ignorance
(7) Overambition
(8) Do-it-yourself solutions
(9) Silver bullets
(10) Scope creep

The nature of IT projects is intricate, complex, and sometimes unpredictable. The immense number of failures in the IT industry includes projects that get stuck, projects that never end, projects that overshoot their budget, projects that do not deliver, and projects that do all of the aforementioned. The latter is by far the most common type of failure. Often such occurrences are discreetly swept under the carpet by both the customer and the contractor, since neither the customer’s nor the contractor’s reputation is likely to gain from them. This condition of secrecy is somewhat unfortunate, since the post-mortem analysis of a crashed IT project offers some learning potential.

The author of this article has worked in the IT field for almost two decades and has seen a fair number of IT projects come down in a less than graceful manner (it would be presumptuous to claim otherwise). Having had the opportunity to observe and analyse the circumstances of ill-fated projects, it was possible to identify the conditions and patterns that spell failure. Unsurprisingly, all of these are management issues rather than technology or financial issues. This insight often stands in contradiction to what the responsible managers and contractors claim. Of course, it is more convenient to blame things on the “wrong” technology, the “wrong” product, an “insufficient” budget, and so on.

(1) Ambiguous specifications

The number one killer of IT projects ought to be poor or ambiguous functional specifications. This is so because specifications stand at the beginning of a project, at a time when important course-setting decisions are made. IT projects, like living beings, are most vulnerable in their infancy, when wrong decisions have their greatest impact. Sloppy specifications inevitably lead to misunderstandings, oversights, false assessments, and eventually full-grown disputes.

There is no better way to screw up a project than to have no clear idea of it and put it out to tender. In order to accomplish this, it is best to assign an incompetent employee to write up the functional specifications. Ideally this would be someone with limited IT knowledge and an incomplete understanding of business requirements and work flow. This person should be asked to scribble up a few pages containing computer buzzwords, obscure management talk, and puzzling diagrams.

An invitation to tender of this make-up will doubtless attract numerous bids anyway. After all, contractors cannot be too picky about clients. Some of the bids may give the impression that they are not completely based on guesswork. These are the ones to present to upper management. Upper management will then select the supplier whose logo most closely resembles that of IBM. Should the supplier have the audacity to suggest a requirements analysis at the client’s expense, this should be rejected with the utmost steadfastness and with the hint that further specifications will be worked out along the way.

(2) Lack of vision and communication

The deadly effect of poor specifications is closely rivalled by a lack of vision and communication. The dominant theme is: a problem that is not seen, not heard, and not talked about is not a problem. Naturally, this applies to both the client and the contractor. The client who doesn’t communicate his vision is just as harmful to project success as the contractor who conceals problems. It’s one thing not to have a clear vision, and it’s another not to be able to communicate it clearly.

To achieve the most disastrous results, it is recommended to replace clear vision with vague and unspecific goals. These are best communicated in an opaque language that makes references to features and milestones without actually defining what they consist of. Client participation in the various phases of project implementation should be avoided at all costs. After all, the contractor was hired to solve the problem, so he cannot expect the client to be bothered with answering questions. If client involvement is indispensable, then all work should be delegated to a lower-ranking executive who may be sacked if things go wrong.

(3) Planning for disaster

Planning for disaster results from the opposite attitude. Instead of paying too little attention to project management, it pays too much attention to it, or rather to its administrative details. Hence, the disaster-planning attitude has a tendency to generate a lot of paperwork. The principal assumption is that things will probably go wrong. The strategy is then to define a plethora of procedures to prevent things from going wrong, or at least to document how things went wrong in view of ensuing legal proceedings. Since this strategy is extravagant and costly, it is preferred by large corporations and government organisations.

The trick is to raise the administrative overhead to an insane level. Project members should spend at least two thirds of their time in meetings, filling in forms, and generating documentation. Contracts should be no fewer than 100 pages long. They should stipulate all kinds of provisions for the event of premature termination, breach, default, bankruptcy, death, and natural disaster. For this purpose, a lawyer should be hired from day one. Programmers, system analysts, and technicians must seek approval from their superiors, who must seek approval from top management, who must seek approval from their lawyers.

(4) Lack of management commitment

Every manager knows that “commitment is everything”. Because everybody knows it, managers must make it a point never to admit a lack of commitment. A manager typically says, “I am fully committed to the task, but unfortunately I don’t have time to answer your question right now.” That is an excellent excuse, because everybody understands that managers are very busy people. In addition, it is a diplomatic way of saying, “I’d rather have dinner with my golf mates than ponder mind-numbing techie questions with the geeks from the IT department.” After all, managers have better things to do than rack their brains over bits and bytes.

To develop this technique to its fullest, one must adopt a feudal view of the workplace, where the manager is the sovereign and the IT department is one of the subordinate departments whose primary function is to serve and follow orders. Since every department needs to understand its proper place in the organisation, it is best to let the nerds know that management cannot be bothered with the trivialities of information technology.

Managers may simply claim to be “non-technical” and confer responsibility on one of the lackeys from IT, preferably a yes-man, who is elevated to midlevel management for this purpose. The new midlevel manager, who is now above the common crowd, does well to cover his back and hire an external consultant. The primary function of this consultant is to serve as a scapegoat in case things turn sour. This configuration allows for maximum passing of the buck and leaves the contractor clueless as to who is in charge of the project.

(5) Lack of staff involvement

Lack of staff involvement is a more academic term for the ivory tower syndrome. It is common for IT systems to be implemented by people who have never worked in the position of those for whom the system is designed. Although this does not in itself constitute a problem, the situation may be creatively exploited in order to steer a project downhill. It’s best to consider the user an abstract entity, an unimportant appendage to the system, and to desist completely from involving him in the design of the system. After all, users are replaceable entities. They have low intellects, and they should not be allowed to interfere with the magnificent conceptions of management and engineering.

Interviews, field tests, usability tests, acceptance tests, and pilot schemes are a complete waste of time. A mere user cannot fathom the big picture. He cannot possibly appreciate the complexities of a multi-tiered architecture or a distributed database system. Such work is better left to the technologists, who know perfectly well what the system should look like. Technologists don’t need to know about lowly business issues. The system can be perfected on the drawing board. If the system looks good on the drawing board, it follows that it will work in practice. Once the system is out of the lab and in the real world, trainers may be dispatched to educate the users about the benefits of the system.

(6) Arrogance and ignorance

We have already moved into the wondrous realm of arrogance. No doubt we can further capitalise on this trait to bring virtually any IT project to a screeching halt. The know-it-all IT manager is just one variation on the theme. A know-nothing CEO may have an even more destructive effect, because there’s nothing quite like arrogance combined with ignorance. This person admits to being ignorant of IT, but he considers himself a top-notch business leader. He has seen company X implement system Y and double its profits since. Moreover, system Y is a market leader, and it costs a sum with many zeros. The vendors of system Y wear suits and talk business, unlike the geeks from the IT department. System Y must surely be good.

This leads us to the topic of gullibility. A fair number of company directors become flabbergasted by IT talk. When listening to expressions like “adaptive supply chain network”, “business intelligence platform”, or “data warehouse”, these great leaders just nod in quiet admiration. Yes, these are wonderful things to have. In the course of time, an organisation with gullible leadership might contract consultitis, an affliction that results either from hiring too many consultants, or from hiring a consultant who continuously dazzles the audience with buzzwords and charts instead of solving actual problems.

One of the best ways to bring down an IT project early is to hire multiple consultants to solve the same problem. You can bet your bottom dollar on the consultants fighting a turf war over the client’s patronage. Instead of working on the best solution for a given problem, they will work out solutions that demonstrate how inadequate the other consultant’s approach is, for which they will charge $$$ per hour plus expenses.

(7) Overambition

Overambition is one of the most potent poisons for IT projects. It is usually concocted by planning committees who don’t have a realistic idea of complexity and time frames. The leitmotiv is: “Let’s solve all of our problems at once.” The recipe is fairly simple: Draw up a list of all the issues that the organisation wants to automate, from stock optimisation to HR management. Demand that the system should spit out a complete tax return at the push of a button. Throw in the latest hardware and try to use new and unproven application software. Shake. Do not stir.

Alternatively, you may try the following approach: Set artificially tight deadlines for each milestone and include a contractual clause stipulating that the contractor may be burnt at the stake for missing any of them. During project implementation, insist on incorporating many extras into the system. Urge the IT team to respond to each of the manifold itches of the user community. When a day turns into a week, and a week turns into a month, call for an emergency meeting and define new artificially tight deadlines.

(8) Do-it-yourself solutions

Overambition occasionally takes the form of “I did it my way”. The principal motive for the do-it-yourself approach is a distinctive self-image. First you have to assert that your organisation is unique and special. This results in the deeper heroic insight that none of the standard packages fits the needs of your organisation. At this time it is important to maintain self-confidence. Tell yourself how special you are. Don’t listen to advisers who recommend adopting standard software and changing your work flow. The rules and procedures of your organisation are sacred. They have existed for decades; they are proven and true. The IT system should adapt itself to your work flow, not vice versa.

The only solution is then to courageously pioneer the field and tailor your own IT system. At this point you are looking forward to an exciting time of requirement analyses, feasibility studies, implementation and test cycles. The IT adventure has begun. On your way to success it is likely that you will wear out a number of IT managers and consultants. Don’t let this distract you. The rewards are great. You will obtain an absolutely unique system that costs a hundred times as much as a standard package and takes ages to complete. If this sounds too daring, you should acquire a standard package and customise it beyond recognition. That way you make sure that you will have to go through the entire customisation process again at every update cycle.

(9) Silver bullets

Silver bullets are simultaneously popular and infamous. They are infamous, because everybody knows they don’t work. They are popular, because they hold a huge promise and because there is a fine line between methodology and crankiness. “Methodology” simply means the application of a set of defined procedures and work practices. Methodology turns into crankiness at the point where it becomes a mantra. Contrary to a methodology, a mantra is a mere verbal formula. It is often devoid of meaning. But we are getting ahead of ourselves. How exactly does a methodology become a mantra?

Quite simply by chanting it instead of practicing it. Example: Company X has identified a problem with quality control. Top management has thought long and hard about it and decided that the Six Sigma program is the way to solve it. The promise that Six Sigma holds (fewer than four defects in one million parts) has enticed the company to splash out on a new enterprise-spanning Define-Measure-Analyse-Improve-Control software system. Few people actually understand what it does and what it means, but everybody understands that it’s called Six Sigma. So everyone joins the chorus and sings: “Define-Measure-Analyse-Improve-Control”. Problems will surely disappear if the phrase is repeated often enough.

The psychology of the silver bullet is based on faith. A problem analysis is usually suggestive of certain approaches and solutions. Some of these solutions may be championed by leaders or highly respected individuals in the organisation, which gives them more credibility. When a solution is finally approved by the CEO, it gains even more credibility. People start to believe in the method. If the experts say it’s right, and the bosses say it’s right, then it must be right. People within the organisation stop questioning the solution. At this point, the solution becomes a silver bullet.

(10) Scope creep

The phenomenon of “scope creep” actually deserves to stand higher in this list, because it is quite an effective IT project crasher. It’s also deceptively simple. Scope creep means uncontrolled change in the definition and scope of a project. The best way to achieve it is to have no clear idea of the project from the outset. Just let your nebulous thoughts guide you and resist any attempt to concretise the scope and define clear boundaries. Practice the philosophy that the world is in constant flux. You need to be flexible. Then, during project implementation, demand the same flexibility from the people who implement it, and throw in manifold additions and alterations. Your motto is: I need more features.

Then sit back and watch the spectacle unfold. The project plan gets redrawn after every meeting. Deadlines are constantly missed, and teams get reshuffled. A few months into the project, the initial requirements analysis will look like a cryptic document from ancient times, bearing little resemblance to the current state of affairs. Contractors will jump ship, consultants will come and go, and the project will start to develop a life of its own.

Database duel – MySQL vs. PostgreSQL

Almost all non-trivial applications need to store data of some kind. If the data has the form of records, or n-tuples, it is typically handled by a relational database management system (RDBMS). Relational databases are conceptually founded on set theory and predicate logic. Data in an RDBMS is arranged in tables whose elements can be linked to each other. Today, almost all RDBMS use SQL (Structured Query Language) to implement the relational model. RDBMS with SQL have been in use since the late 1970s. Relational databases were previously an expensive corporate technology; the first open source RDBMS became available during the late 1990s. Presently, PostgreSQL and MySQL are the most popular open source RDBMS.

Both database systems are widely used for web applications. Although MySQL has a much larger user base (an estimated 6 million installations by 2005), the growth of PostgreSQL has recently accelerated. The latter initially came out of an academic environment: PostgreSQL was developed at the University of California, Berkeley, as a successor to the proprietary INGRES database, and until 1995 it used QUEL instead of SQL. Since version 6.0, the software has been maintained and advanced by a team of volunteers and released free under the BSD license. In contrast, MySQL was developed in a commercial environment by the Swedish company TCX Dataconsult, and later by MySQL AB. It started out as a rewrite of the mSQL database and began to acquire more and better features. MySQL is released under a dual licensing scheme (GPL and a paid commercial license).

Since the PostgreSQL developers had a head start of almost 10 years, the PostgreSQL database has hitherto had more features than MySQL, especially advanced features that are desirable in an “enterprise” computing environment. These include advanced database storage, data management tools, replication, and backup tools. MySQL, on the other hand, used to have an edge over PostgreSQL in terms of speed, offering better performance for concurrent database access. Lately, however, this gap has been closing: PostgreSQL is getting faster while MySQL acquires more enterprise features. The crucial 5.0 release of MySQL in October 2005 added stored procedures, triggers, and views.

Let’s look at the commonalities first. Both systems are fully relational, using SQL for data definition, data manipulation, and data retrieval. They run on Windows, Linux, MacOS, and a number of Unices. Both databases come with graphical administration tools and query builders, as well as backup, repair, and optimisation tools. They offer standard connectors such as ODBC and JDBC, as well as APIs for all major programming languages. Both systems support foreign keys and data integrity, subselects, transactions, unions, views, stored procedures, and triggers. Among the high-end features that both RDBMS offer are ACID-compliant transaction processing, multiple isolation levels, procedural languages, schemas (metadata), hot backups, data loading, replication (as an add-on in PostgreSQL), table spaces for disk storage layout, terabyte scalability, and SSL. MySQL and PostgreSQL also both support storage of geographic information (GIS). PostgreSQL additionally has network-aware data types for IPv4 and IPv6 addresses.
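
Since both systems provide JDBC drivers, accessing them from Java differs only in the driver class and the connection URL. A minimal sketch (host, database, and credentials are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    class ConnectDemo {
        public static void main(String[] args) throws Exception {
            // For MySQL instead:  com.mysql.jdbc.Driver  and
            //                     jdbc:mysql://localhost/testdb
            Class.forName("org.postgresql.Driver");
            Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/testdb", "user", "secret");
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery("SELECT version()");   // works on both
            while (rs.next()) System.out.println(rs.getString(1));
            rs.close(); st.close(); con.close();
        }
    }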

Now, let’s look at the differences. PostgreSQL is an object-relational database, which means that it has object-oriented features, such as user-definable database objects and inheritance. Users can define data types, indexes, operators (which can be overloaded), aggregates, domains, casts, and conversions. PostgreSQL supports array data types. Inheritance in PostgreSQL allows a table to inherit characteristics from a parent table. PostgreSQL also has very advanced programming features. In addition to its native procedural language, PL/pgSQL (which resembles Oracle’s PL/SQL), PostgreSQL procedures can be written in scripting languages such as Perl, PHP, Python, etc., or in compiled languages such as C++ and Java. In contrast, MySQL (since version 5.0) only supports a native stored procedure language that follows the ANSI SQL:2003 standard.
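
Table inheritance is the most visible of these object-relational features. The classic example from the PostgreSQL documentation lets a capitals table extend a cities table; here is a sketch executing that DDL over JDBC (connection details are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    class InheritanceDemo {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/testdb", "user", "secret");
            Statement st = con.createStatement();
            st.execute("CREATE TABLE cities (name text, population int)");
            // capitals inherits name and population from cities
            st.execute("CREATE TABLE capitals (state char(2)) INHERITS (cities)");
            // rows inserted into capitals are also visible in queries on cities
            st.execute("INSERT INTO capitals VALUES ('Madison', 230000, 'WI')");
            st.close(); con.close();
        }
    }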

PostgreSQL/MySQL Comparison Chart

(Chart: side-by-side feature comparison of MySQL and PostgreSQL.)

The most evident advantage that MySQL offers in terms of features is its so-called pluggable storage engines. One may choose from a number of different data storage models, which allows the database administrator to optimise databases for the intended application. For example, a web application that makes heavy use of concurrent reads with few write operations may use the MyISAM storage engine to achieve top performance, while an online booking system may use the InnoDB storage engine for ACID-compliant transactions. Another interesting characteristic of MySQL not found in PostgreSQL is its support for distributed databases, which goes beyond mere database replication. Functionality for distributed data storage is offered through the NDB and FEDERATED storage engines, supporting clustered and remote databases respectively.
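
Selecting a storage engine is a per-table decision made in the DDL. A sketch of the booking example above (the table definitions are invented for illustration):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    class EngineDemo {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost/testdb", "user", "secret");
            Statement st = con.createStatement();
            // read-mostly catalogue data: optimise for fast SELECTs
            st.execute("CREATE TABLE catalogue (id INT PRIMARY KEY, "
                     + "title VARCHAR(200)) ENGINE=MyISAM");
            // bookings need ACID transactions: use InnoDB
            st.execute("CREATE TABLE booking (id INT PRIMARY KEY, "
                     + "seat VARCHAR(8)) ENGINE=InnoDB");
            st.close(); con.close();
        }
    }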

There are further differences, of course. MySQL is generally faster than PostgreSQL. It serves all connections from a single multi-threaded process, instead of spawning a new process for each connection as PostgreSQL does. This is a great advantage for web applications that connect on each page view. In addition, the MyISAM storage engine provides tremendous performance for both simple and complex SELECT statements. Stability is another advantage of MySQL. Due to its larger user base, MySQL has been tested more intensively, and it has historically been more stable than PostgreSQL.

PostgreSQL has a slight advantage over MySQL/InnoDB for concurrent transactions, because it makes use of Multiversion Concurrency Control (MVCC), a mechanism otherwise found mainly in enterprise-grade commercial RDBMS. Another advantage of PostgreSQL is its relatively strict compliance with the ANSI SQL 92/99 standards, especially with regard to data types. The ANSI SQL implementation of MySQL is more incomplete by comparison. However, MySQL has a special ANSI mode that disregards its proprietary extensions.

With regard to backup/restore capabilities, MySQL provides somewhat less convenience than PostgreSQL and the commercial enterprise RDBMS. Nevertheless, hot backup and restore operations can be performed with both systems. Both PostgreSQL and MySQL/InnoDB allow transactional tables to be backed up simply by using a single transaction that copies all relevant tables. The disadvantage of this method is that it uses a lot of resources, which might compromise system performance.

With MySQL, a better solution is to use the replication mechanism for a continuous backup. PostgreSQL allows recovery from disk failure through point-in-time recovery (PITR). This method combines file-system-level backups with a write-ahead log that records all changes to the database. Thus it is possible to recreate a snapshot of the database at any point in time. In most cases, a crashed database can be recovered up to the last transaction before the crash. PITR is also convenient for large databases, since it conserves resources.

MySQL Strengths

  • Excellent code stability
  • Excellent performance, fast CONNECT and SELECT
  • Multiple storage engines to choose from
  • Larger user base (thus larger number of applications and libraries)
  • Support for distributed databases
  • Many high-quality GUI tools available
  • Commercial support widely offered

PostgreSQL Strengths

  • Object-oriented features
  • Advanced programming concepts
  • Supports multiple programming languages
  • High ANSI SQL conformance
  • Mature high-end features
  • Robust online backups
  • Very liberal BSD license

In summary, PostgreSQL and MySQL are both mature products with many enterprise-level features. They are both catching up with the best commercial RDBMS and are presently making inroads into the high-end market. The two RDBMS differ in philosophy in several ways. Roughly speaking, MySQL is targeted at developers who expect a workhorse database with proven performance, while PostgreSQL suits developers who expect advanced features and programming concepts. MySQL offers more deployment options, whereas PostgreSQL offers more flexibility for developers.