On the info crawler lane

Sometimes I wonder when exactly Thailand decided to jump on the info highway. Or hasn’t it? I am not sure, because on the one hand telecom and IT are ubiquitous in daily use, on the other hand nobody seems to care much about it, at least not in the upper ranks of government. Of course, there is one organisation that has always cared about ICT in Thailand, namely the CAT. No, it’s not a feline quadruped but an acronym that stands for “Communications Authority of Thailand”. Behind this rather grandiose name is the wonderful state-run telecom company which owns Thailand’s international communication infrastructure. One of its major achievements was to ensure that general Internet access is twice as expensive as in Europe and America and about five to ten times slower.

The CAT, however, is just one piece in the ICT jigsaw puzzle. There are other players, such as the aptly named TOT (which means “dead” in German), the national company that owns much of the domestic telecom infrastructure. Both organisations have proven their competence by dominating the telecom markets for decades. Of course, this is not too difficult if you are a monopoly. In fact, you could reduce customer service to almost zilch and still dominate the market. Not that I am accusing these highly praised organisations of such a thing. As state enterprises, they are forced to follow government policies, aren’t they? They are just the pawns of the ICT ministry. Or perhaps it is the other way round? Goodness knows. At least officially, the ministry of ICT (MICT) reigns supreme over said organisations.

The MICT is a fairly young entity. Established in this millennium, it is certainly younger than the companies it is supposed to control. However, the ministry has quickly gained fame by announcing the privatisation of these companies and then reversing its decision. Rumours that the recently resigned ICT minister did this on the same day in two different press conferences cannot be confirmed, however. Apart from amazing press announcements, most people know the MICT from blocking websites. Of course, the MICT’s “McClean” campaign only targets sleazy and illegal websites which are no good anyway, such as youtube.com. Yet it would be unfair to suggest that the MICT’s activity is restricted to blocking websites. Of course they do a lot more.

For example, the ministry has recently established the ICT Usage Promotion Bureau. This bureau stepped into action right away and produced a so-called “Housekeeper CD” which will be distributed freely in Thailand next month. The software on this CD blocks even more websites. It will protect us from what the enlightened bureaucrats consider evil influences. Of course, it’s up to everyone to decide whether to follow the MICT’s idea of safe Internet usage and install this “Housekeeper”. I can already see throngs of people queuing up to get their hands on the CD. The interesting thing about this case is that one of the first acts of an agency whose mandate is to promote ICT usage is to limit its usage. Quite a remarkable waste of tax money, if you ask me.

Certainly there are more efficient ways to protect unsuspecting citizens from the hazards of ICT usage: Thailand’s so-called cybercrime law, for example, which was passed earlier this year. The new law requires every service provider to keep a record of its users’ Internet usage for 90 days. A “service provider” is by definition everyone who provides an Internet service (quite obviously). This ranges from access providers and Internet cafés to website operators and blog authors. Theoretically, each of these entities is required to keep a log of their users’ full identities, including their names. The cybercrime hunters expect service providers to hand over such information to the police upon request. It will be interesting to see how online services, bloggers and forum operators will go about recording their users’ real identities, given that they never see their users. Of course, if the host computer is located outside Thailand, then the crime of not collecting user particulars is not punishable under Thai law. Oh well. Not exactly a boost to the Thai hosting industry, I suppose.

Another interesting aspect of the new law is the “Photoshop clause” which makes it illegal to post an altered image of a person on the Internet if the image damages that person’s reputation. This does of course immediately raise the question of whether it is also illegal to post a non-altered image that damages a person’s reputation, nude photos of an ex-girlfriend, for example. According to the cybercrime law this would be perfectly legal, although it may be expected that the defamation articles of the civil code would be applied in such cases. One can only marvel at the half-baked authoritarian style of the legislation and the stunning absence of any far-sightedness. It makes one ask: “Are you serious?” Unfortunately they are.

New look, new engine

As of today this blog has a new look (the old layout can still be seen in the picture on the left). Actually it’s a bit more than a new look, because I changed not only the template but the whole software package. The old Mambo version was hopelessly outdated after two years. It started to have problems with the database. This probably had something to do with the MySQL update that my hosting company had installed recently. My first idea was of course to upgrade the existing Mambo installation. When I looked at the latest versions of Mambo and Joomla, however, I didn’t see too much innovation there. The core functionality and UI looked pretty much like they did two years ago. So, I decided to swap the whole thing for a shiny new WordPress installation. And that’s how this website got its new look.

I promise I won’t get religious about which CMS or Blog is the best, as I consider this question rather futile. For some it’s Mambo, for some it’s WordPress (and there are still another gazillion systems of that ilk which one might discuss at length). What I like about WordPress is its simplicity and usability, which means that I get my job done faster. It’s about making things as simple as possible, but not simpler. Not that Mambo is excessively complex, on the contrary, but some things happen to work more smoothly in WordPress, at least I like to think so. For example, inserting images, SEO management, and code editing. Yes, WordPress has its limits. It’s not like typo3 or Alfresco, it’s just blogging software. But blogging it does well. So, welcome to the new WordPress era!

Discussion board moderation

Discussion board moderation is a new “profession” and as such it requires a new set of skills. These are not, as many believe, technical skills. Discussion board moderation is primarily a management task and therefore it requires management skills. Since management is not an exact science, the dos and don’ts of discussion board moderation are not chiselled in granite. Yet, there are some important principles which executive and prospective moderators should consider.

Discussion boards (or “forums”) are a newfangled social phenomenon that came about with the Internet. They are meeting places for people who share a common interest about which they like to talk. An online discussion is essentially a written asynchronous conversation between two or more parties who send and receive questions, answers, and comments with a relative delay. These written conversations are much slower than natural conversations, but still faster than a traditional exchange of letters.

The necessity for moderation exists for several reasons. Usually the board operator desires some level of control over the content posted by other participants in order to ensure that it does not violate laws and regulations. In addition, the operator might want to define specific rules for the discussion board that fit the culture of its community. Such rules usually target netiquette and ethical codes. Finally, the board administrator must uphold the technical functioning of the discussion board system and prevent abuse. The attainment of these goals is usually delegated to the moderator(s) who may or may not be the same person as the board operator.

Common Challenges

Discussion boards provide entertainment, support, and fun for many people, but they are not without challenges. A virtual meeting place is a bit like a masked ball where participants enjoy complete anonymity. This can lead to problems. Anonymity, as well as the lack of physical contact, has a tendency to lower the inhibition threshold for socially unacceptable behaviour in some individuals. Common challenges are angry, hateful, obscene, or otherwise inappropriate posts, cross posting, spamming, trolling, DoS attacks, identity theft, and other more technical problems.

Flaming And Flame Wars

Flames are intentionally hostile or insulting messages that usually result from a heated exchange between people holding different opinions. Flames are the most common problem of discussion boards. The flame character of a message is identified by its design to attack the opponent rather than the argument. Hence, flames are ad hominems with a strong emotional impact. Flame wars are prolonged exchanges of flame posts into which, depending on the group dynamics of the community, many individuals may get drawn. The susceptibility to flame wars depends on many factors, such as community behaviour, the nature of topics discussed, as well as moderation practices. Flaming generally deters and discourages users. Obviously, controversial topics are especially susceptible to flames.

Flames are a rather difficult challenge for the moderator. The most suitable strategy to control flames is to employ non-punitive measures, for example posting placatory comments, appeals to fairness, and conciliation proposals to calm the situation. Diplomacy and humour often work well. Prevention of flames, for example by creating a relaxed and intimate atmosphere, is even better. If this doesn’t work, it may be necessary to remind the opponents of the rules regarding discussion style or to close the thread. If the posted flames are inappropriate it may also be essential to delete offensive passages or posts. Finally, if nothing else works, warning and barring the offending member(s) is the last recourse.


Trolling

A troll is someone who habitually posts disturbing, inflammatory or nonsensical messages that disrupt the discussion and upset the community. Trolls are basically agitators who provoke and create perturbation by some means, usually by flames, in order to draw attention to themselves or to sabotage the discussion. Trolling is best moderated by confronting the offender directly via the personal message system and by putting the troll user on the pre-moderation list if the discussion board software allows it. The motives for trolling are varied. The troll may be a disgruntled user, someone who feels that the board community “has turned against him”, someone with an underlying psychological problem, or merely someone venting temporary frustration. Trolls can be quite problematic. Persistent trolls should be pre-moderated or banned if pre-moderation is not an option.


Spamming

Outright spamming has become somewhat rare on discussion boards, since most board software prevents robots from signing up and submitting spam. Yet, there is still the problem of spam posted by human subscribers. Spam contents range from fairly subtle, such as text links to a commercial website, to blatant, such as advertising banners in user signatures and posts. Spammers frequently seek out communities that fit the target group for their products or services. For example, a shop that sells exercise machines might seek out sports communities. Evidently most spammers have an agenda apart from the community and the discussion. Nothing is lost by immediately deleting the spam posts and blocking the offending user and IP address. The situation is somewhat different if a regular member submits an advertising post. In most cases, deletion and a warning issued via PM or the warning system will be sufficient to deal with a one-time transgression.


Cross-Posting

Cross-posting is the practice of submitting the same message to more than one forum. The intention of the sender is to reach the greatest possible number of readers. The associated problem is fragmentation of the ensuing discussion. If the cross-post is targeted at the same community, people also get the impression of being spammed. Cross-posting within the same discussion board is annoying in most cases. The moderator needs to decide whether cross-posting is appropriate or whether to delete duplicate posts. In order to avoid thread fragmentation, duplicate threads may be closed, ideally with an annotation containing a link that leads to one thread singled out to continue the discussion. Alternatively, the administrator may disallow cross-posting within the same discussion board altogether.

Off-Topic Posts

This is a very common problem, and one that is difficult to control. Off-topic (OT) posts arise from the associative nature of subject matters, a characteristic that goes to the root of human language. Getting off on a tangent is all too easy. For example, a discussion about nuclear energy may divert into a discussion about alternative energies, nuclear weapons, or state regulations. In the natural flow of a discussion, minor diversions are common and probably unobjectionable. However, a thread often develops in a contingent way that spawns discussions about multiple topics, often in parallel, which is confusing in the same way a group of people talking at the same time is confusing. Unfortunately, there are no universally valid guidelines for off-topic moderation. It always depends on context and community. In an informal discussion about philosophy OT posts may be of no concern, while in a more formal setting, such as a technical support forum, off-topic contributions may not be allowed at all.

A topic is usually outlined by its thread title and the tagline (short description). If a thread develops an OT sideline, the OT posts may be swapped out into a new thread by the administrator. Many software packages provide a “split thread” operation for this purpose. To what extent OT posts are moderated and how strongly OT contributions are discouraged depends very much on the nature of the discussion board.


Noise

“Noise” is text and other content that either does not belong to a discussion or interrupts the flow of a discussion. For example, long quotations or distracting signatures can be considered noise. If the noise ratio exceeds a certain value, following the discussion becomes visually tiresome. The best strategy to avoid this is to limit signatures to a certain length (perhaps also to disallow images in signatures) and to discourage full quotes. Quotations are often useful, even necessary to remind the reader of something previously mentioned and to establish the context for a reply. However, a full quote in which the answerer refers only to a tiny fragment within the quote is confusing and counterproductive. To avoid this, the discussion board software may be configured to discourage full quotes, for example by ergonomic means. Alternatively, the moderator may remind people not to overuse full quotes and edit out noise manually if necessary.

Multiple Identities and Impersonation

Multiple identities result from the same user subscribing several times to the same discussion board. This might happen with technically inexperienced users, users who have lost their password, or users who intentionally create multiple identities. Although most software packages can be configured to prohibit multiple subscriptions with the same email address and/or from the same IP number, subscribers may bypass this mechanism by using different email addresses and IPs. Furthermore, blocking IP addresses is problematic with dynamically assigned IPs. In most cases, multiple subscriptions result in a number of dead accounts which can be deleted after a certain period of inactivity. Other cases are more troublesome, especially those which involve the continued use of multiple identities or impersonation (identity theft). These are deceptive tactics which are not always easy to detect. They are popular with trolls. An analysis of IP numbers and time stamps of a sequence of posts is often necessary to uncover this form of abuse. Since this is a serious form of abuse, it usually results in account termination and banning.
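To illustrate what such an analysis of IP numbers and time stamps might look like, here is a minimal Python sketch. The function name, the record layout (user, IP, time stamp), and the one-hour window are my own assumptions and not taken from any particular board software. It merely lists IP addresses from which different accounts posted in close succession. As noted above, dynamically assigned IPs can produce false matches, so the result is a hint for the moderator, not proof.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_shared_ip_accounts(posts, window=timedelta(hours=1)):
    """Return {ip: sorted usernames} for IPs used by different accounts within `window`.

    `posts` is an iterable of (username, ip, timestamp) tuples (hypothetical layout).
    """
    by_ip = defaultdict(list)
    for user, ip, timestamp in posts:
        by_ip[ip].append((timestamp, user))

    suspects = {}
    for ip, entries in by_ip.items():
        entries.sort()  # chronological order
        users = set()
        # Comparing neighbours is enough: if any two posts by different users
        # lie within the window, some adjacent pair between them does too.
        for (t1, u1), (t2, u2) in zip(entries, entries[1:]):
            if u1 != u2 and t2 - t1 <= window:
                users.update((u1, u2))
        if users:
            suspects[ip] = sorted(users)
    return suspects


# Made-up sample data using documentation-only IP ranges.
sample_posts = [
    ("alice", "203.0.113.5", datetime(2007, 9, 1, 10, 0)),
    ("alice_fan", "203.0.113.5", datetime(2007, 9, 1, 10, 4)),
    ("bob", "198.51.100.7", datetime(2007, 9, 1, 10, 5)),
    ("carol", "203.0.113.5", datetime(2007, 9, 3, 9, 0)),
]
suspects = find_shared_ip_accounts(sample_posts)
```

In the sample data only the two accounts posting four minutes apart from the same address are flagged. The third account on that address posted two days later and is left alone.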

Denial of Service Attacks, Hacker Attacks

Denial of Service (DoS) attacks are technical sabotage manoeuvres aimed at disrupting the discussion board service. The most common method is flooding. A flooding robot (a program) sends huge quantities of messages to the board, which then becomes unusable for other users. Most discussion board software packages have basic features to avert such attacks, for example by limiting the number of messages a user can post within a certain period. However, resourceful attackers may find ways to bypass these protection mechanisms. Luckily, DoS attacks are somewhat rare since they require some technical sophistication and quite a bit of dedication to the purpose of sabotage. Hacker attacks, on the other hand, are more common. The most ordinary hacker attack is password sniffing on unencrypted connections, and subsequently using the passwords to gain entry to the discussion board system, preferably as a user with administrator privileges. DoS and hacker attacks are serious forms of abuse and should be reported to the service provider and possibly to the law enforcement authorities. Board operators do not always have the technical means to take on such attacks on their own.
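The flood protection mentioned above, limiting the number of messages a user can post within a certain period, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name, the default limits, and the idea of passing the current time in as a plain number of seconds are hypothetical, and real board software will differ.

```python
from collections import defaultdict, deque


class PostRateLimiter:
    """Allow at most `max_posts` posts per user within `period_seconds` (sliding window)."""

    def __init__(self, max_posts=5, period_seconds=60.0):
        self.max_posts = max_posts
        self.period = period_seconds
        self._history = defaultdict(deque)  # user -> times of recent accepted posts

    def allow(self, user, now):
        """Return True and record the post if `user` may post at time `now` (in seconds)."""
        recent = self._history[user]
        # Forget posts that have dropped out of the time window.
        while recent and now - recent[0] >= self.period:
            recent.popleft()
        if len(recent) >= self.max_posts:
            return False  # limit reached, reject the post
        recent.append(now)
        return True
```

A board would call `allow(user, time.time())` before accepting a post. As said, this only stops the simple cases. A resourceful attacker who spreads the flood over many accounts will not be impressed by a per-user limit.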

Types of Moderation

The Usenet community generally distinguishes between four types of moderation, which are likewise applicable to web-based discussion board systems. These types of moderation differ in the way posts are moderated. They feature different decision and communication flow models.


Post-Moderation

The most common form of moderation is post-moderation, which means that either a single moderator or a group of moderators reviews contributions once they have been posted. In such a setting, messages ought to be reviewed on a regular basis (perhaps daily) and moderators ought to perform editorial tasks as required. Post-moderation is time-consuming if done correctly, because moderators need to review all content and respond to inappropriate content in time. Moderators have full censoring power.


Pre-Moderation

The most restrictive form of moderation is pre-moderation. Again, moderators have full censoring power and need to review every message, but content is reviewed before it goes online, not after. This means that posted messages first go into a waiting queue before they are approved and released by the moderator. The delay that results from this procedure is quite detrimental to discussions, because replies are not available to the community in real time. Since this normally drains the lifeblood from a discussion, pre-moderation is applied only in special situations, where the sensitivity of the topic requires more restrictive action. One example of pre-moderation is the book reviews on amazon.com.
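The waiting queue of pre-moderation can be pictured roughly as follows. This Python sketch is hypothetical, with class and method names that I made up for illustration. Submitted messages stay invisible in a pending queue until a moderator approves them, and rejected messages never appear.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    author: str
    text: str


class PreModeratedThread:
    """Messages wait in `pending` until a moderator releases them to `published`."""

    def __init__(self):
        self.pending = deque()   # submitted, not yet visible to the community
        self.published = []      # approved and visible

    def submit(self, message):
        self.pending.append(message)

    def review_next(self, approve):
        """Apply the moderator's decision to the oldest pending message.

        Returns the reviewed message, or None if nothing is waiting.
        """
        if not self.pending:
            return None
        message = self.pending.popleft()
        if approve:
            self.published.append(message)
        return message
```

The sketch also shows where the delay comes from: nothing moves from `pending` to `published` until somebody calls `review_next`, however lively the discussion may be.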

Reactive Moderation

Reactive moderation relies on alerts from members of the discussion board. It moves the task of supervision from the moderator to the audience by offering easily accessible means of reporting problems to the moderator. The moderator only needs to review those areas with reported problems. This form of moderation is quite effective in conjunction with automatic supervision, such as word filters. Its greatest advantage is the reduction of moderation workload associated with the pre- and post-moderation methods. What is more, the legal responsibilities of the operator seem to move primarily to removing questionable content, rather than preventing it being posted. The principal disadvantage of reactive moderation is that not all breaches of house rules and legal provisions might get reported.
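A reactive set-up with an automatic word filter might be sketched like this in Python. All names here are hypothetical and the filter is deliberately naive (whole words, case-insensitive). The point is only that the moderator looks at flagged posts instead of at everything.

```python
import re


class ReportQueue:
    """Collects posts flagged by members or by a simple word filter."""

    def __init__(self, blocked_words):
        self._filter = None
        if blocked_words:
            pattern = "|".join(re.escape(word) for word in blocked_words)
            # Match whole words only, ignoring case.
            self._filter = re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)
        self.flagged = {}  # post_id -> list of reasons

    def check_post(self, post_id, text):
        """Automatic supervision: flag the post if it contains a blocked word."""
        if self._filter and self._filter.search(text):
            self.flagged.setdefault(post_id, []).append("word filter")

    def report(self, post_id, reporter, reason):
        """A member alerts the moderator about a post."""
        self.flagged.setdefault(post_id, []).append(f"{reporter}: {reason}")

    def needs_review(self):
        """Post ids the moderator should look at, in ascending order."""
        return sorted(self.flagged)
```

Anything that neither trips the filter nor gets reported never shows up in `needs_review`, which is exactly the disadvantage described above.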

Distributed Moderation

The distributed moderation model is even more radical. It dispenses with the concept of a moderator person altogether. Instead it relies on the assumption that a community can collectively decide what is appropriate for itself and what is not. Moderation tasks are thus carried out by the community by means of a voting system. Current implementations of voting systems are often similar to content rating systems. For example, if someone suggests a post for deletion, it takes a number of consenting votes to actually carry out deletion. There are two problems with this approach. First, the community might have different views about “appropriate content” than the board operator. Second, online voting systems are still prone to abuse. Thus distributed moderation is not yet widespread, although some groups, such as slashdot.org and wikipedia.org have used it with great success.
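The deletion-by-vote example can be sketched as follows. This is a simplified Python illustration with made-up names and an arbitrary threshold of three votes. It is not how slashdot.org or wikipedia.org actually work.

```python
class DeletionVote:
    """Delete a post once enough distinct members have voted for its removal."""

    def __init__(self, votes_required=3):
        self.votes_required = votes_required
        self._votes = {}      # post_id -> set of members who voted for deletion
        self.deleted = set()  # post ids removed by community decision

    def vote(self, post_id, voter):
        """Record a vote; return True if the post is (now) deleted."""
        if post_id in self.deleted:
            return True
        voters = self._votes.setdefault(post_id, set())
        voters.add(voter)  # a set ignores repeated votes by the same member
        if len(voters) >= self.votes_required:
            self.deleted.add(post_id)
        return post_id in self.deleted
```

Counting each member only once guards against the crudest form of abuse, but it does nothing against one person voting through several accounts, which is one reason why such systems remain prone to manipulation.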

Choosing a content management system

If you are playing with the idea of using a content management system (CMS), or if your organisation has already decided to deploy a CMS, then you are facing an important but difficult decision. On the one hand, you know that a CMS is the best way to handle your ever-growing content. On the other hand, you are confronted with a bewildering variety of products that leaves you at a complete loss. To make things worse, you know that the choice of a CMS has far-reaching implications for business processes. Choosing a CMS is not an easy task. It is imperative to select your CMS solution wisely. Deploying an inappropriate product may thwart your project, and it may even be worse than deploying no CMS at all.

In the pioneer days of the Web, there was only one way of publishing information: coding it in HTML and uploading it. The extreme simplicity of this approach was offset by its laboriousness. Developing, updating, and maintaining a medium-scale website, say a hundred pages and more, required an insane amount of developer hours, and to make things worse, these were insanely expensive. The software industry soon responded to the dilemma by offering WYSIWYG editors and HTML code generators. With these tools it was possible to design and author websites graphically without having to care about nitty-gritty coding details.

The more advanced editors offered design templates, code snippets, plug-ins, and readymade sequences. They could generate the required set of HTML, JavaScript, and graphic files at a mouse click. These files then had to be uploaded one by one. Although this method is more efficient than manual coding, it still has several drawbacks. Whenever something is changed, pages must be generated and uploaded again, which is time consuming. Sometimes a small change in the design template can mean that hundreds of files need to be replaced. Moreover, the uploaded content is static. This means that it cannot change according to defined parameters, such as user preferences, sort order, date, and so on. Hence, static pages offer limited possibilities for interactive features. This drawback is overcome by the concept of dynamic web pages.

Dynamic pages are generated at request time. A dynamic web page is not a sequence of HTML tags, but an interpreted computer program (a script) that generates an HTML sequence according to predefined rules. This script is typically executed by a script language interpreter which passes on the resulting HTML sequence to the web server. Dynamic web page scripting unfolds its full potential in combination with an information repository, such as a relational database system, which holds the actual text and media contents. HTML code and information are merged when a user requests a page, and the result changes depending on defined conditions. Today, almost all large websites are based on this principle.
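The principle can be shown with a toy example. The following Python sketch is an illustration only: the table layout, the publication dates, and the function names are invented, and an in-memory SQLite database stands in for the information repository. The script merges an HTML template with database content at request time, and the result changes with a parameter (here the sort order).

```python
import sqlite3
from html import escape
from string import Template

# HTML skeleton; $title and $items are filled in at request time.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><ul>$items</ul></body></html>"
)


def setup_demo_db():
    """Create an in-memory repository with two sample articles (dates are made up)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (title TEXT, published TEXT)")
    db.executemany(
        "INSERT INTO articles VALUES (?, ?)",
        [
            ("On the info crawler lane", "2007-08-01"),
            ("New look, new engine", "2007-09-15"),
        ],
    )
    return db


def render_article_list(db, newest_first=True):
    """Generate the HTML page from the database content."""
    order = "DESC" if newest_first else "ASC"  # fixed keywords, never user input
    rows = db.execute(
        f"SELECT title, published FROM articles ORDER BY published {order}"
    ).fetchall()
    items = "".join(
        f"<li>{escape(title)} ({escape(published)})</li>" for title, published in rows
    )
    return PAGE.substitute(title="Articles", items=items)
```

Calling `render_article_list(setup_demo_db(), newest_first=False)` yields the same page with the entries in reverse order, without a single file having to be regenerated or uploaded.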

The CMS principle

A content management system (CMS) is a computer program that facilitates the collaborative creation, storage, delivery, distribution, and maintenance of “content”, that is documents, images, and other information. Typically the CMS is a web application and its content is distributed via the Internet or via a private intranet. A CMS exploits the principle of dynamic page generation and adds a further abstraction layer. It streamlines the process of website creation by automating page generation and by applying templates and predefined features to an entire website. This allows the webmaster to focus on actual content creation and management. A CMS either comes with special client software that allows webmasters to edit content and construct web pages, or it provides a web-based administrator interface that performs this function. The tasks of creating page layout, navigation, scripts and adding modules are left to the CMS. At the heart of every CMS is a database, usually a relational DBMS, which holds the information that constitutes the online content.

Types of CMS

Besides general purpose CMS that facilitate general website creation, there are a number of specialised CMS. For example, Wikis or Wikiwebs are CMS for the collaborative creation of knowledge bases, such as encyclopaedias, travel guides, directories, etc. These systems typically make it easy for anyone to change or add information. Publication CMS (PCMS) allow publishers to deliver massive amounts of content online. They are frequently used by media organisations and publishing houses to create web versions of their print media or broadcasts. Transactional CMS couple e-commerce functions with rich content. As in the case of amazon.com, they are used for applications that go beyond standard shopping cart functionality. Integrated CMS (ICMS) are systems that combine document management with content management. Frequently, the CMS part is an extension of a conventional document management application. Enterprise CMS (ECMS) are large applications that add a variety of specialised functions to the CMS core, such as document management, team collaboration, issue tracking, business process management, work flow management, customer relationship management, and so on.

It is also possible to define market segments by licensing cost. In this case, we can distinguish the following types:

  1. Free open-source CMS (no licensing cost). These products are typically quite simple and focus on general purpose and publishing functionality. Portals and Wikis also belong to this category.
  2. Boxed solutions (up to $3,000.- USD). These products typically offer solutions that allow non-technical users to create and manage websites collaboratively.
  3. Midrange solutions ($3,001.- to $30,000.- USD) commonly have a greatly extended set of functions in comparison to boxed solutions, although scope and philosophy may vary significantly. For example there are web development platforms, as well as powerful ICMS in this category.
  4. High-end solutions ($30,001.- USD up) are usually targeted at the enterprise market. Solutions in this class are often designed to handle massive amounts and types of documents and to automate business processes.
  5. Hosted solutions (for a monthly subscription fee) can be found in all of the three previous categories. Instead of a one-time license cost, there is a monthly fee.
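The price brackets above translate directly into a small lookup. The Python function below is merely a convenience sketch. Its name and the segment labels are my own, and hosted solutions are left out because they are billed monthly rather than by a one-time license fee.

```python
def license_segment(price_usd):
    """Map a one-time license price in USD to the market segments listed above."""
    if price_usd < 0:
        raise ValueError("price must not be negative")
    if price_usd == 0:
        return "free"
    if price_usd <= 3000:
        return "boxed"
    if price_usd <= 30000:
        return "midrange"
    return "high-end"
```

For instance, `license_segment(4500)` falls into the midrange bracket.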

The market is highly fragmented and there is a great variety of products in every segment. The largest segment is general purpose CMS with a multitude of proprietary and open-source, commercial, and non-commercial solutions. The sheer number of products makes a comprehensive review practically impossible. It is vital to narrow down the selection of CMS by compiling a list of requirements beforehand. In particular, the requirements should specify what sort of content you wish to manage, which category of CMS you are likely to prefer, and what its key features and capabilities should be. For example, if you wish to maintain documents and web pages in multiple languages, it is important to look for software that supports this from the outset. Although many CMS can be adapted to handle multilingual content, they do this in different ways. Some may be unsatisfactory to you.

CMS Selection Checklist

Sometimes it is useful to use checklists to determine product features. These can help to narrow down the number of products you might want to review more closely.

Commercial checklist

  • Availability
  • Price
  • Licensing model
  • Total cost of ownership

Technical checklist

  • Supported operating systems
  • Supported web servers
  • Supported browsers
  • Supported database systems
  • Required hardware
  • Programming language
  • System architecture

Functionality checklist

  • Content organisation model (hierarchic/segmented, centralised/decentralised, etc.)
  • Content generation features (editors, spell checkers, etc.)
  • Content attributes (author, publication date, expiry date, etc.)
  • Content delivery (presentation, layout, visualisation, etc.)
  • Content management (moving, deleting, archiving, etc.)
  • Content versioning (multilingual, multiple versions)
  • Media management (images, animations, audio, etc.)
  • Link management (automatic navigation, link consistency checks, etc.)
  • User management (authentication, security, granularity of access privileges, etc.)
  • Template management (design, installation, maintenance)
  • Features for searching and browsing content
  • Special features (email forms, feedback lists, discussion boards, etc.)
  • Extensibility (plug-ins, add-ons, third party modules, etc.)

Integration checklist

  • Integration with external text editors
  • Integration with external image and media editors
  • Integration with external data
  • Integration with static website content
  • Integration with legacy systems

Helpful websites

There are a number of websites that offer CMS comparisons, descriptions, tests, and reviews. These may be helpful in the second phase of selection. After requirements have been gathered and desired key features have been defined, these websites assist prospects in determining concrete products for closer review.

  • www.opensourcecms.com
  • www.cmsmatrix.org
  • www.cmsjournal.com
  • www.cmswatch.com
  • www.contentmanager.net

The final step in CMS selection is to review and evaluate concrete products. This step may be fairly labour-intensive. Vendors must be invited. A trial version of the product must be obtained. It must be installed and configured properly. Its basic functions and features must be learned. Test data must be entered. Meetings and group reviews must be scheduled and held. The whole process may have to be repeated with a number of different products. This may sound off-putting, but the do-it-yourself approach is really the only way to ensure that you get the right product.

Management involvement

As always, management involvement is crucial. The decision making process cannot be completely delegated to IT, because in the end, the job of the CMS is to automate a business function, not an IT function. Depending on the nature of your content, it may be a marketing function, an R&D function, a human resources function, or even a production function as in the case of publishing houses. Depending on how you use the CMS it may also have a large impact on organisational communication. Therefore, management should be involved in phases one and three of the selection process. At the very least, management should review and approve the requirements specification and join the final review meetings. Often it is important to get an idea of the “look and feel” of a product beforehand.

After the acquisition

Once the chosen CMS is acquired and properly installed, people may create and publish content as they wish and live happily ever after. Well, not quite. If users are happy with the system, there may be a quick and uncontrolled growth of content. If they aren’t, the system may gather dust and the electronic catalogues may remain empty. The usual approach to regulate this is to put a content manager in charge of the system. The role of the content manager is different from that of a traditional webmaster. While a webmaster needs to be very tech-savvy, a content manager merely needs to be computer literate. The main responsibility is content editing and organisation. Hence, the role of a typical content manager is that of an editor and librarian.

Long term perspectives

Proprietary content management systems are currently expensive, especially in the enterprise (ECM) segment. The overall market will remain fragmented in the medium term. In the long term, however, the CMS market is likely to be commoditised. This means free open-source systems are likely to dominate the market. Currently open-source products are encroaching on the “boxed solution” and “midrange” markets. There are even a number of powerful open-source CMS with web delivery focus, such as typo3, which are comparable to proprietary high-performance products. As open-source solutions get more powerful, this trend is likely to continue. Extensibility, a large user base, and commercial support will be crucial for a system to assume a market leader position. At this moment, however, there are no candidates in sight.