Web Standards: OpenID

OpenID appears to be red hot right now. Adoption of this emerging standard has accelerated in the first half of 2008 as it has appeared on the radar of web developers. Many large organisations, such as Google, Yahoo, IBM, Microsoft and AOL, provide OpenID servers. Popular Internet sites, such as LiveJournal, Blogger, Jabber, Drupal and Wikitravel, support OpenID logins, and the list is growing. Browser support for OpenID is just around the corner (it’s a feature in Firefox 3, for example). But we are getting ahead of ourselves. What is OpenID and why is it good? Put simply, OpenID solves two common problems: having to manage multiple accounts on different websites, and having to store sensitive account information on websites you don’t control. With a single OpenID account you can log into hundreds of different websites. Best of all, you, the user, manage the account information, not the website owner. In more technical terms, OpenID is an open, decentralised, user-centric digital identity framework. I’ll explain this in some more detail.

OpenID is an open standard because nobody owns it and because it’s free of patents and commercial licensing. The standard is maintained by the OpenID Foundation; free open source implementations are available in many languages, including Java and PHP. It is decentralised because it does not depend on a specific domain server; an existing OpenID provider can be rerouted very easily, as we shall see. It is user-centric because it allows users to manage and control their identity information. Users identify themselves with a URL they own. While traditional authentication relies on a combination of a name or an email address and a password, OpenID requires just one item: a URL or an XRI (Extensible Resource Identifier). To understand how this works, let’s look at the OpenID protocol and see what an OpenID login procedure actually does.

Let’s assume you already have an OpenID. You can use the same OpenID with any OpenID-enabled website (called the “relying party”) by typing it into the OpenID login field or by letting your browser fill out the field automatically. When you click Submit, the relying party performs a “discovery” procedure to retrieve an authentication URL and subsequently performs an “association” procedure to establish a secure information interchange with the OpenID provider. You are then redirected to the authentication URL (the “OpenID provider”). Normally this is a site like yahoo.com or myopenid.com, but nothing keeps you from running your own OpenID server. After authenticating at the OpenID provider’s secure login page, you are redirected back to the relying party. If the relying party has requested identity information (name, gender, date of birth, etc.), you are asked which of this information should be sent to the relying party. Often this information is used to fill in a registration form at the relying party. It isn’t retrieved for a normal login, but the OpenID protocol supports it. Once you are back at the relying party’s website, the relying party checks whether the authentication was approved and verifies that the information was received correctly from the OpenID provider.

It sounds slightly complicated, and by looking at the OpenID specifications you will find that the protocol is indeed quite involved. From the user’s point of view, however, it is really simple. The user only sees the OpenID login screen. If the user has enabled automatic login at the OpenID provider via a certificate or cookie, the only screen the user sees is the “approve/deny” screen. Logging into a website could not be easier. Only one password needs to be remembered. Registration forms can be pre-filled. Logging into specific sites can be fully automated. The best thing is that the user has full control over the OpenID provider thanks to the discovery process. During discovery, the relying party fetches the web page at the OpenID URL and looks for two fields in its HTML header. In HTML discovery, these two fields are named openid.server and openid2.provider. Example:

<link rel="openid.server" href="http://www.myopenid.com/server" />

 <link rel="openid2.provider" href="http://www.myopenid.com/server" />

These two entries commonly point to the same endpoint (the OpenID provider) and are used by version 1 and version 2 of the OpenID protocol, respectively. If you have a website, you could simply edit the HTML of your site to add these entries to the HTML header. You could then use the URL of that page as your OpenID. The advantage of using your own web page is that you control the OpenID endpoint. Hence, you can switch OpenID providers while retaining your OpenID simply by editing your site’s HTML code.
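If your page delegates to a third-party provider in this way, you usually also add two companion fields that tell the relying party which identity at that provider is yours. These are the delegation fields of HTML discovery (the username below is, of course, a placeholder for your own account):

<link rel="openid.delegate" href="http://username.myopenid.com/" />

<link rel="openid2.local_id" href="http://username.myopenid.com/" />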

If you are going to incorporate OpenID into your existing website, you might want to think twice about implementing the protocol yourself. It isn’t trivial, and there are already several open source libraries that can be used, e.g. openid4java if you program in Java, or the JanRain PHP OpenID library, which works with PHP 4.3 and up. Additional libraries for these two languages, as well as Ruby, Python, C#, C++ and others, can be found at http://wiki.openid.net/Libraries.
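To give you an impression of what the relying party’s side looks like in code, here is a minimal sketch based on the openid4java consumer API. The class and method names follow the library’s published sample code, but treat this as an outline rather than production code: session handling, servlet plumbing and error handling are omitted, and you should verify the calls against the current openid4java documentation.

import java.util.List;
import org.openid4java.consumer.ConsumerManager;
import org.openid4java.consumer.VerificationResult;
import org.openid4java.discovery.DiscoveryInformation;
import org.openid4java.message.AuthRequest;
import org.openid4java.message.ParameterList;

public class OpenIdLoginSketch {

    private final ConsumerManager manager = new ConsumerManager();

    // Step 1: the user has submitted an OpenID URL. Perform discovery and
    // association, then redirect the browser to the returned provider URL.
    public String beginLogin(String userSuppliedId, String returnToUrl)
            throws Exception {
        List discoveries = manager.discover(userSuppliedId);              // discovery
        DiscoveryInformation discovered = manager.associate(discoveries); // association
        // keep "discovered" in the HTTP session; it is needed again in step 2
        AuthRequest authRequest = manager.authenticate(discovered, returnToUrl);
        return authRequest.getDestinationUrl(true);                       // redirect the user here
    }

    // Step 2: the provider has redirected the user back to returnToUrl.
    // Verify the response; a non-null verified identifier means success.
    public boolean finishLogin(String receivingUrl, ParameterList responseParams,
                               DiscoveryInformation discovered) throws Exception {
        VerificationResult verification =
                manager.verify(receivingUrl, responseParams, discovered);
        return verification.getVerifiedId() != null;
    }
}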

Grid Computing For A Cause

A few months ago I wondered what to do with the computing power of my new quad-core PC. It seemed that my daily compiler runs, virtual machines, and the occasional game session didn’t make full use of the capacity of this machine. The CPU meter rarely exceeds the 30% mark processing these mundane tasks. It doesn’t even break a sweat when compressing MPEG data. In principle, this is a good thing of course. Yet, the thought that the CPU cores remain underutilised for most of their lifetime appeared slightly wasteful to me. What to do with it? Well, I have found the answer to that question. The answer is BOINC.

BOINC stands for Berkeley Open Infrastructure for Network Computing, which is quite a mouthful, but the program’s purpose is easy to explain: it lets you donate computing resources to the research sector. With BOINC your computer becomes part of a research network. You can choose one or more research projects from a list to which you want to donate computing resources. The BOINC software downloads tasks from these projects, which are then executed on your machine. When the computing tasks are completed, the results are sent back to the project’s host computer. Downloading and uploading happens automatically via the Internet. The project host computer distributes tasks to hundreds or possibly thousands of PCs and coordinates all computing tasks.

This is fashionably called “grid computing”. In BOINC’s case, the grid is made up of the volunteers, or rather their computers, which are located all over the world. BOINC has more than half a million participants who together contribute a whopping 900 to 1000 teraflops from their desktops. This is more computing power than the world’s largest supercomputer, the IBM Blue Gene, currently offers. Unsurprisingly, this quasi-supercomputing platform is used for computationally intensive, or “number crunching”, tasks. The best thing about BOINC, however, is that it doesn’t take away CPU cycles from your applications. The BOINC computing tasks run as low-priority processes in the background and thus only use CPU cycles when no other program needs them. Hence, there is no noticeable performance decrease.
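BOINC achieves this by running its worker processes at the operating system’s lowest (idle) priority. Just to illustrate the principle, here is a toy Java sketch of the same idea using thread priorities. This is not BOINC code, and a JVM thread priority is only a hint to the scheduler, a much weaker mechanism than the OS idle priority BOINC actually uses:

public class BackgroundCruncher {
    public static void main(String[] args) throws InterruptedException {
        Thread cruncher = new Thread(new Runnable() {
            public void run() {
                double x = 0.5;
                while (true) {               // stand-in for a real work unit
                    x = 4.0 * x * (1.0 - x); // keep the core busy
                }
            }
        });
        cruncher.setDaemon(true);                  // don't keep the JVM alive for it
        cruncher.setPriority(Thread.MIN_PRIORITY); // yield to normal-priority work
        cruncher.start();
        Thread.sleep(10000); // your foreground applications would run here
    }
}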

You might wonder at this point what the BOINC projects are about and why you should donate computing resources to them. There are plenty of projects with different aims and scopes, but it all began with one single project: SETI@home, where SETI stands for Search for Extraterrestrial Intelligence. The SETI@home project began in 1999. It differs from other SETI projects in that it relies almost exclusively on donated computing power. The software analyses data from the Arecibo radio telescope and tries to identify potential ETI signals. Although no such signals have been found yet, the project has been a big success and it still draws new volunteers. As one of the first volunteer-based grid computing projects, it has demonstrated not only that the approach is viable, but that the results generally exceed expectations. It has also given people a better understanding of some of the challenges that anonymous grid computing entails.

As mentioned, today there are many different research projects that make use of BOINC. The list has been growing since BOINC was released under the GPL in 2003. I am sure you will find many worthy causes among them. For example, in the medical sector, there is cancer and HIV research as well as malaria control and human genome research. The World Community Grid, which uses BOINC as one type of client software, specialises in research projects that benefit humanity directly. Then there is climateprediction.net, which tries to produce a forecast of the climate in the 21st century. There are a number of biology and bioinformatics projects, such as Rosetta@home, which develops computational methods to accurately predict and design proteins and protein complexes. This may ultimately help to find cures for diseases. Finally, there is a growing number of science projects, from quantum physics and astronomy to mathematics.

I have been running BOINC for a week now, and my computer is happily plodding away at a constant 100% CPU load. The resource usage graph shows all four CPU cores at max. It doesn’t seem to affect the system negatively, although I have to say, the computer does get noticeably hotter at this load. This definitely means higher energy consumption and thus a higher electricity bill. According to the BOINC Wiki at Berkeley, the power consumption increase is around 50%. Admittedly, I was a bit concerned about overheating, because this is the hot season in Thailand and room temperature is often around 30 degrees Celsius. However, my computer has borne up bravely so far. To reduce the heat problem, BOINC allows you to throttle CPU usage to a certain percentage, say 70%, which results in a square-pulse resource usage graph. I might try that if it gets any hotter.
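For the record, the throttle can also be set outside the preferences dialog by placing a global_prefs_override.xml file in the BOINC data directory. The file and tag names below are from memory of the BOINC client documentation, so verify them against your client version:

<global_preferences>
   <cpu_usage_limit>70.0</cpu_usage_limit>
</global_preferences>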

BOINC can be downloaded from the project’s website at http://boinc.berkeley.edu.

Info crawler addendum

Personal note: It’s almost Christmas now. Since the beginning of this month, my TT&T Maxnet Internet connection has been quirky. I am supposed to have a fast DSL subscriber connection; at least that’s what I am paying for. But I am currently getting 6-8 kbps download speed, which doesn’t even make my old Hayes modem envious. Some sites, such as Wikipedia, aren’t accessible at all, although I can still connect via proxy server. I hear that other people in Thailand are experiencing similar… um… surprises. To be honest, most people aren’t really surprised. This experience is sort of common. When I was still subscribed to CAT, I was actually glad to get anything above 0 kbps, because half of the time the connection didn’t work at all. Once I made the mistake of calling their customer service to inquire about this. The receptionist connected me to the “technician”. The technician determined that this was a “special” problem and decided that I had to talk to a specialised technician. The specialised technician then connected me to another technician who was allegedly responsible for my area. That person finally routed me back to the first “technician” so the game could start anew. Well, it could have, but I hung up at that point. I am not even thinking about calling TT&T now. It’s the time to be merry.

On the info crawler lane

Sometimes I wonder when exactly Thailand decided to jump on the info highway. Or hasn’t it? I am not sure, because on the one hand telecom and IT are ubiquitous in daily use, while on the other hand nobody seems to care much about it, at least not in the upper ranks of government. Of course, there is one organisation that has always cared about ICT in Thailand, namely the CAT. No, it’s not a feline quadruped but an acronym that stands for “Communications Authority of Thailand”. Behind this rather grandiose name is the wonderful state-run telecom company that owns Thailand’s international communication infrastructure. One of its major achievements has been to ensure that general Internet access is twice as expensive as in Europe and America and about five to ten times slower.

The CAT, however, is just one piece in the ICT jigsaw puzzle. There are other players, such as the aptly named TOT (which means “dead” in German), the national company that owns much of the domestic telecom infrastructure. Both organisations have proven their competence by dominating the telecom markets for decades. Of course, this is not too difficult if you are a monopoly. In fact, you could reduce customer service to almost zilch and still dominate the market. Not that I am accusing these highly praised organisations of such a thing. As state enterprises, they are forced to follow government policies, aren’t they? They are just the pawns of the ICT ministry. Or perhaps it is the other way round? Goodness knows. At least officially, the Ministry of ICT (MICT) reigns supreme over said organisations.

The MICT is a fairly young entity. Established in this millennium, it is certainly younger than the companies it is supposed to control. However, the ministry quickly gained fame by announcing the privatisation of these companies and then reversing its decision. Rumours that the recently resigned ICT minister did this on the same day in two different press conferences cannot be confirmed, however. Apart from amazing press announcements, most people know the MICT from blocking websites. Of course, the MICT’s “McClean” campaign only targets sleazy and illegal websites which are no good anyway, such as youtube.com. Yet it would be unfair to suggest that the MICT’s activity is restricted to blocking websites. Of course they do a lot more.

For example, the ministry has recently established the ICT Usage Promotion Bureau. This bureau stepped into action right away and produced a so-called “Housekeeper CD”, which will be distributed freely in Thailand next month. The software on this CD blocks even more websites. It will protect us from what the enlightened bureaucrats consider evil influences. Of course, it’s up to everyone to decide whether to follow the MICT’s idea of safe Internet usage and install this “Housekeeper”. I can already see throngs of people queuing up to get their hands on the CD. The interesting thing about this case is that one of the first acts of an agency whose mandate is to promote ICT usage is to limit that usage. Quite a remarkable waste of tax money, if you ask me.

Certainly there are more efficient ways to protect unsuspecting citizens from the hazards of ICT usage: Thailand’s so-called cybercrime law, for example, which was passed earlier this year. The new law requires every service provider to keep a record of its users’ Internet usage for 90 days. A “service provider” is by definition everyone who provides an Internet service (quite obviously). This ranges from access providers and Internet cafés to website operators and blog authors. Theoretically, every one of these entities is required to keep a log of their users’ full identities, including their names. The cybercrime hunters expect service providers to hand over such information to the police upon request. It will be interesting to see how online services, bloggers and forum operators will go about recording their users’ real identities, given that they never see their users. Of course, if the host computer is located outside Thailand, then the crime of not collecting user particulars is not punishable under Thai law. Oh well. Not exactly a boost to the Thai hosting industry, I suppose.

Another interesting aspect of the new law is the “Photoshop clause”, which makes it illegal to post an altered image of a person on the Internet if the image damages that person’s reputation. This immediately raises the question of whether it is also illegal to post a non-altered image that damages a person’s reputation, nude photos of the ex-girlfriend, for example. According to the cybercrime law this would be perfectly legal, although the defamation articles of the civil code may be expected to apply in such cases. One can only marvel at the half-baked authoritarian style of the legislation and the stunning absence of any far-sightedness. It makes one ask: “Are you serious?” Unfortunately they are.

New look, new engine

As of today this blog has a new look (the old layout can still be seen in the picture on the left). Actually it’s a bit more than a new look, because I changed not only the template but the whole software package. The old Mambo version was hopelessly outdated after two years and had started to have problems with the database, which probably had something to do with the MySQL update my hosting company had installed recently. My first idea was of course to upgrade the existing Mambo installation. When I looked at the latest versions of Mambo and Joomla, however, I didn’t see much innovation there. The core functionality and UI looked pretty much like they did two years ago. So, I decided to swap the whole thing for a shiny new WordPress installation. And that’s how this website got its new look.

I promise I won’t get religious about which CMS or blog engine is the best, as I consider this question rather futile. For some it’s Mambo, for some it’s WordPress (and there are still a gazillion other systems of that ilk which one might discuss at length). What I like about WordPress is its simplicity and usability, which means that I get my job done faster. It’s about making things as simple as possible, but not simpler. Not that Mambo is excessively complex, on the contrary, but some things happen to work more smoothly in WordPress, at least I like to think so: inserting images, SEO management, and code editing, for example. Yes, WordPress has its limits. It’s not like Typo3 or Alfresco; it’s just blogging software. But blogging it does well. So, welcome to the new WordPress era!