Oct 7

Scala is a purely object-oriented language in the sense that every value is an object. This includes values of the primitive types, such as Char, Int, Long, Float, etc. For example, it is perfectly legal in Scala to call a method on a literal value, as in 3.1415.round or 88.max(99). Scala defines the following basic types:

Byte     8-bit Integer (-2^7 to 2^7-1)
Short    16-bit Integer (-2^15 to 2^15-1)
Int      32-bit Integer (-2^31 to 2^31-1)
Long     64-bit Integer (-2^63 to 2^63-1)
Char     16-bit Unicode character (0 to 2^16-1)
String   A sequence of Chars
Float    32-bit IEEE 754 single-precision floating point number
Double   64-bit IEEE 754 double-precision floating point number
Boolean  Logical value (true or false)

These data types are defined in the package scala, which is imported automatically along with java.lang by the compiler. Extended functionality for these types (such as the cited max() method) is available via the scala.runtime API. Each type has an associated “rich wrapper class”, e.g. for String there is a RichString class, for Int there is a RichInt class, and so on. Unlike Java, Scala does not distinguish between objects and primitive types. This means that there is no explicit boxing and unboxing, and thus there are fewer semantic details to worry about. Everything is simply an object. Behind the scenes the compiler performs optimisations that amount to auto-unboxing when arithmetic expressions are evaluated. Hence, the expression a + b * c, which is equivalent to the method calls a.+(b.*(c)), is actually compiled to primitive arithmetic, just like in Java.
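The points above can be tried out with a small sketch. The object name BasicTypesDemo and its value names are made up for illustration; the methods and constants it uses (round, max, Long.MaxValue, Char.MaxValue) come from the Scala standard library.

```scala
object BasicTypesDemo {
  // Method calls directly on literal values.
  val rounded: Long = 3.1415.round       // round on a Double yields a Long
  val larger: Int = 88.max(99)           // max comes from the rich wrapper for Int

  // Operators are ordinary methods, so both notations mean the same thing.
  val a = 2
  val b = 3
  val c = 4
  val infix: Int = a + b * c
  val explicit: Int = a.+(b.*(c))

  // The range limits from the table are available as constants.
  val longMax: Long = Long.MaxValue      // 2^63 - 1
  val charMax: Int = Char.MaxValue.toInt // 2^16 - 1

  def main(args: Array[String]): Unit =
    println(s"$rounded $larger $infix $explicit $longMax $charMax")
}
```

Both infix and explicit evaluate the multiplication first, which is the usual precedence rule expressed in method-call form.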

Scala is object-oriented not only in the sense that all values are objects, irrespective of type, but also in the sense that functions are objects, and methods can be turned into such function objects. Since functions are objects, they are treated just like regular values, which means that functions can be assigned to variables. They can also be used as parameters in function calls or as return values of functions and methods. A function that takes another function as a parameter, or returns one, is a so-called higher-order function. In addition, Scala has a construct called a function literal, which is basically a nameless function used in place of a function value, a bit like an anonymous inner class in Java, although function literals are a lot more versatile. Furthermore, operators in Scala are methods. This makes a form of operator overloading possible. Since +, -, *, / etc. are just methods with funny names, you can define operators for your own data types. It is therefore possible to extend the language by adding APIs that contain class definitions for new data types along with their operators. This is illustrated by the following class definition for rational numbers. A rational number can be expressed as a fraction of two integers. The following class defines the data type Rational as well as the four basic arithmetic operations for rational numbers:

class Rational(numerator: Int, denominator: Int) {

  require(denominator != 0)

  private val gcd = greatestCommonDivisor(numerator.abs, denominator.abs)
  val n = numerator / gcd      // reduced numerator
  val d = denominator / gcd    // reduced denominator

  def this(n: Int) = this(n, 1)

  private def greatestCommonDivisor(a: Int, b: Int): Int =
    if (b == 0) a else greatestCommonDivisor(b, a % b)

  def + (that: Rational): Rational =
    new Rational(n * that.d + d * that.n, d * that.d)

  def - (that: Rational): Rational =
    new Rational(n * that.d - d * that.n, d * that.d)

  def * (that: Rational): Rational =
    new Rational(n * that.n, d * that.d)

  def / (that: Rational): Rational =
    new Rational(n * that.d, d * that.n)

  override def toString = n.toString + "/" + d
}

The first line contains the head of the class with the so-called primary constructor, which is part of the class declaration in Scala. The expression in parentheses accepts two integer numbers that constitute the numerator and the denominator of the fraction. The next line, which starts with require, tests the denominator for zero and throws an IllegalArgumentException if the denominator is 0. As you can see, exceptions don’t need to be declared in Scala the way checked exceptions are declared with a throws clause in Java. The following three statements reduce the fraction by calculating the greatest common divisor and subsequently dividing the numerator and denominator by the result. These first four statements are not enclosed by a function definition block or any other block and thus constitute the body of the primary constructor. Next follows what’s called an auxiliary constructor in Scala. Auxiliary constructors are easy to spot, because their definitions always start with def this(...). The expression def this(n: Int) = this(n, 1) provides a constructor that accepts a single integer number as an argument. This makes it easy to represent integer numbers as fractions where needed, simply by setting the denominator implicitly to 1. For example, the expression new Rational(2,3) results in 2/3 and the expression new Rational(2) results in 2/1.

The method definition that follows the constructor code computes the greatest common divisor of two integers using recursion. Since the recursive invocation stands in tail-call position in this instance, the Scala compiler optimises the generated byte code internally and creates a loop instead of a recursive function call. The four methods after the greatestCommonDivisor() method define the four basic arithmetic operations for fractions. In the case of addition and subtraction, the operands are first brought to a common denominator by cross-multiplying numerators and denominators, and then the sum or difference of the numerators is calculated. In doing so, Scala makes use of built-in operator precedence (* and / before + and -), which is determined by the first character of the method name if that name begins with an operator character. The expression n * that.d + d * that.n is thus executed as n.*(that.d).+(d.*(that.n)). Multiplication and division are even easier to implement, as the two fractions are simply multiplied by each other, or respectively multiplied by the reciprocal value. The results are then returned as new Rational objects. Since the constructor finds the GCD of the two integer numbers, the resulting fraction is automatically reduced. Finally, the last method, toString, outputs the fraction in a readable form. Overridden methods must be marked with the override keyword in Scala. Methods that don’t take arguments can be defined without parentheses, hence toString() becomes toString.
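The arithmetic described above can be tried out with a short sketch. It repeats a compact copy of the Rational class so that it runs on its own. RationalDemo and its value names are made up for illustration, and the @tailrec annotation is an addition, not part of the original listing: it merely asks the compiler to confirm that the recursion is turned into a loop.

```scala
import scala.annotation.tailrec

// Compact copy of the Rational class, so the sketch is self-contained.
class Rational(numerator: Int, denominator: Int) {
  require(denominator != 0)
  private val gcd = greatestCommonDivisor(numerator.abs, denominator.abs)
  val n = numerator / gcd
  val d = denominator / gcd

  def this(n: Int) = this(n, 1)

  // Addition for illustration: compilation fails if this is not a tail call.
  @tailrec
  private def greatestCommonDivisor(a: Int, b: Int): Int =
    if (b == 0) a else greatestCommonDivisor(b, a % b)

  def + (that: Rational): Rational = new Rational(n * that.d + d * that.n, d * that.d)
  def - (that: Rational): Rational = new Rational(n * that.d - d * that.n, d * that.d)
  def * (that: Rational): Rational = new Rational(n * that.n, d * that.d)
  def / (that: Rational): Rational = new Rational(n * that.d, d * that.n)

  override def toString: String = n.toString + "/" + d
}

object RationalDemo {
  val half = new Rational(1, 2)
  val third = new Rational(1, 3)

  val sum = half + third                     // 5/6
  val difference = half - third              // 1/6
  val product = half * third                 // 1/6
  val quotient = half / third                // 3/2
  val reduced = new Rational(4, 6)           // reduced to 2/3 by the constructor
  val mixed = half + third * new Rational(3) // * binds tighter than +, giving 3/2

  // require rejects a zero denominator with an IllegalArgumentException.
  val zeroDenominatorRejected: Boolean =
    try { new Rational(1, 0); false }
    catch { case _: IllegalArgumentException => true }

  def main(args: Array[String]): Unit =
    println(List(sum, difference, product, quotient, reduced, mixed).mkString(", "))
}
```

Note that this class, like the original, does not normalise the sign of the denominator; that is left out to stay close to the listing.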

Class Hierarchy

Similar to Smalltalk, every class in Scala inherits from a single common superclass. The universal superclass is named Any. That is to say, any datum in Scala is of type Any. The subclasses of Any fall into two categories: AnyVal (value) and AnyRef (object reference). The basic data types listed above, with the exception of String, are direct descendants of AnyVal; they correspond to the primitive types in Java, such as byte, int, float, etc. All other types, including String, are descendants of AnyRef. The Scala type AnyRef is therefore conceptually identical with java.lang.Object and it comes with analogous methods, e.g. equals, hashCode, clone, wait, etc. Some of these are actually defined further up in the hierarchy, namely in Any. In addition, Scala defines the type scala.ScalaObject, which provides the common superclass for all objects in the Scala APIs. This works because Scala allows multiple inheritance via so-called traits, which will be explained later in more detail.
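A minimal sketch of how the hierarchy shows up in code follows. HierarchyDemo and its member names are hypothetical; only Any, AnyVal, and AnyRef are taken from the text.

```scala
object HierarchyDemo {
  val number: AnyVal = 42          // Int is a value type below AnyVal
  val text: AnyRef = "forty-two"   // String is a reference type below AnyRef

  // Both conform to Any, so they can share one collection.
  val anything: List[Any] = List(number, text)

  // toString is declared on Any and is therefore available on every element.
  val printable: List[String] = anything.map(_.toString)

  // A parameter of type Any accepts values from both branches of the hierarchy.
  def describe(x: Any): String = x match {
    case s: String => "reference: " + s
    case i: Int    => "value: " + i
    case other     => "other: " + other
  }

  def main(args: Array[String]): Unit =
    anything.foreach(x => println(describe(x)))
}
```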

The nesting of this type hierarchy allows interoperability with Java. For example:

var a: java.lang.Integer = 2    // Java Integer
var b: Int = 3                  // Scala Integer
println(a + b)                  // 5

In addition, it is possible to use Java collections for storing Scala data types, or Scala collections for storing Java data types. Furthermore, Scala has several special data types. For example, scala.Null is defined as a subtype of every class that derives from scala.AnyRef. Its only value is the null literal, which can thus be assigned to any object reference, just as in Java. The type scala.Nothing is defined as a subtype of all types; it is the type of expressions that never return a value. This happens in situations where an exception is thrown in a place where a value is expected. Finally, there is the type scala.Unit, which is used for functions that don’t return a meaningful value. This is Scala’s concession to imperative programming. It is analogous to the void keyword in Java.
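The three special types can be seen side by side in a small sketch. SpecialTypesDemo and the helper names fail, safeDivide, and greet are invented for this example.

```scala
object SpecialTypesDemo {
  // null has the type Null, which conforms to every reference type.
  val noName: String = null

  // A method of type Nothing never returns normally; here it always throws.
  def fail(message: String): Nothing = throw new IllegalStateException(message)

  // Nothing is a subtype of Int, so the else branch fits the declared result type.
  def safeDivide(a: Int, b: Int): Int =
    if (b != 0) a / b else fail("division by zero")

  // Unit marks a method that is called only for its side effect.
  def greet(name: String): Unit = println("Hello, " + name + "!")

  def main(args: Array[String]): Unit = {
    greet("World")
    println(safeDivide(6, 3))
  }
}
```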

Sep 26

Because HTML is at the very core of the World Wide Web, you would expect it to be a mature and refined technology. You would also expect it to provide a flexible platform for Web application development and deployment. As most web developers know, the reality is a bit different. HTML started out as a rather simple SGML application for creating hyperlinked documents. It originally provided a basic set of elements for data viewing, data input, and formatting, doing a little bit of everything, yet nothing quite right. While this was practical for whipping up quick-and-dirty websites, it proved to be inadequate for more demanding presentation tasks and fine-tuned user interaction. Thus a whole bunch of supplemental technologies came into being, including CSS, JavaScript, Flash and finally AJAX. You know the story. All of this was quite a messy affair and unfortunately it still is.

While the HTML 4.01 specification has ruled the Web since 1999, the fifth incarnation of HTML was released by the W3C as a working draft earlier this year and has been updated continuously since then. The HTML 5 specification is supposed to pave the way for future Web standards. It incorporates an older draft dubbed “Web Forms 2.0”, which is the W3C’s answer to Web 2.0 and to the World Wide Web becoming a platform for distributed applications. Don’t expect anything too radical, though. It neither delivers the hailed “rich GUI” for the Internet, nor will it replace current technologies like AJAX. Rather, it is designed as a natural extension of these technologies. It provides good backward compatibility while smoothing some of the rough edges of HTML. No more, no less. Let’s have a look at the new features in more detail.

HTML 5 mends the split between the preceding HTML 4 and XHTML 1.0 specifications. Rather than being defined in terms of syntactical rules, it makes the DOM tree its conceptual basis. Thus HTML 5 can be expressed in two similar syntaxes, the “traditional” one and the XML syntax, which both result in the same DOM tree. It goes far beyond the scope of previous specifications, for example by spelling out how markup errors are handled, rather than leaving it to browser vendors, and by specifying APIs for new and old elements. These APIs describe how scripting languages interact with HTML. So, what’s new? The following elements have been dropped from the specification:

  • <acronym>
  • <applet>
  • <basefont>
  • <center>
  • <dir>
  • <font>
  • <frame>
  • <frameset>
  • <isindex>
  • <noframes>
  • <s>
  • <big>
  • <strike>
  • <tt>
  • <u>
  • <xmp>

The following attributes are also gone, at least from some of the elements that carried them in HTML 4:

    abbr, accesskey, align, alink, axis, background, bgcolor, border, cellpadding and cellspacing, char, charoff, charset, classid, clear, compact, codebase, codetype, coords, declare, frame, frameborder, headers, height, hspace, language, link, marginheight and marginwidth, name, nohref, noshade, nowrap, profile, rules, rev, scope, scrolling, shape, scheme, size, standby, summary, target, text, type, valuetype, valign, version, vlink, width.

Some of these elements and attributes are quite obscure, so perhaps they won’t be missed. Others like <center>, align, background, and <u> were heavily used in the past, although most of these were already deprecated in HTML 4. The message here is clear: get rid of presentational markup and use CSS instead. The <b>, <i>, <em> and <strong> tags have miraculously survived, however. Although primarily used for text formatting in the past, these tags have been assigned new (non-presentational) semantics to make them respectable. Another conspicuous omission is frames. Yes, frames are gone! But you might breathe a sigh of relief to know that <iframe> is still there. Speaking of presentational versus semantic HTML, there are quite a few additions to HTML 5 in the latter category. The new semantic tags are designed to aid HTML authors in structuring text and to make it easier for search engine crawlers to parse information in web pages. Here they are (explanations provided by W3C):

  • <section> represents a generic document or application section. It can be used together with h1-h6 to indicate the document structure.
  • <article> represents an independent piece of content of a document, such as a blog entry or newspaper article.
  • <aside> represents a piece of content that is only slightly related to the rest of the page.
  • <header> represents the header of a section.
  • <footer> represents a footer for a section and can contain information about the author, copyright information, et cetera.
  • <nav> represents a section of the document intended for navigation.
  • <dialog> can be used to mark up a conversation in conjunction with the <dt> and <dd> elements.
  • <figure> can be used to associate a caption together with some embedded content, such as a graphic or video.
  • <details> represents additional information or controls which the user can obtain on demand.

Most of these, except the last two, behave like the <div> element, which means their primary use is to identify a block of content that belongs together. Unlike <div>, however, each of these elements carries special semantics. Not very exciting? HTML 5 also introduces the following new elements (explanations again from the W3C document):

  • <audio> and <video> for multimedia content. Both provide an API so application authors can script their own user interface, but there is also a way to trigger a user interface provided by the user agent. Source elements are used together with these elements if there are multiple streams available of different types.
  • <embed> is used for plugin content.
  • <mark> represents a run of marked (highlighted) text.
  • <meter> represents a measurement, such as disk usage.
  • <time> represents a date and/or time.
  • <canvas> is used for rendering dynamic bitmap graphics on the fly, such as graphs, games, et cetera.

The <embed> tag supersedes the <applet> tag and sits alongside <object>. It defines some sort of embedded content that doesn’t expose its internal structure to the DOM tree. The content is typically rendered by a browser plugin. The <audio> and <video> tags are perhaps more interesting, because they make it possible to include multimedia files or streams directly in the HTML document without having to specify a vendor-specific plugin for playing the content. Granted, this could previously be done with the <embed> tag, but the <embed> tag was never a W3C standard and it isn’t supported by all browsers. Evidently, the W3C has decided not to simply follow the mainstream browser implementations and added the <audio> and <video> tags instead, while reserving the <embed> tag for the purpose named above.

Arguably the most exciting additions to HTML 5, at least from the perspective of a web developer, are the extensions to form processing and data rendering, and the related APIs, such as the editing API or the drag-and-drop API. These additions previously evolved as a separate standard under the name Web Forms 2.0 and are now incorporated into HTML 5. The <input> element has been enhanced to support several new data types. New elements for user interface components have been defined, similar to those that can be found in GUI applications. For example, HTML 5 finally features the long-awaited combo box, a combination of text input and drop-down list, which has been a standard component in GUIs for decades. A new <datagrid> element for the interactive/editable representation of data in tabular, list, or tree form is also present. Here are the new <input> types:

  • type="datetime" - a date and time (year, month, day, hour, minute, second, fraction of a second) with the time zone set to UTC.
  • type="datetime-local" - a date and time (year, month, day, hour, minute, second, fraction of a second) with no time zone.
  • type="date" - a date (year, month, day) with no time zone.
  • type="month" - a date consisting of a year and a month with no time zone.
  • type="week" - a date consisting of a year and a week number with no time zone.
  • type="time" - a time (hour, minute, seconds, fractional seconds) with no time zone.
  • type="number" - a numerical value.
  • type="range" - a numerical value, with the extra semantic that the exact value is not important.
  • type="email" - an e-mail address.
  • type="url" - an internationalised resource identifier.

The input element also has several new attributes in HTML 5 that enhance its functionality (many of these also apply to other form controls such as <select>, <textarea>, etc.):

  • list="listname" - used in conjunction with the <datalist> element to create a combobox.
  • required - indicates that the user must provide an input value.
  • autofocus - automatically focuses the control upon page load.
  • form - allows a single control to be associated with multiple forms.
  • inputmode - gives a hint to the user interface as to what kind of input is expected.
  • autocomplete - tells the browser to remember the value when the user returns to the page.
  • min - minimum value constraint.
  • max - maximum value constraint.
  • pattern - specifies pattern constraint.
  • step - specifies step constraint.

The following new elements provide additional user interface components for web applications. The last three are actually not themselves UI components, but components used for scripting the UI through a server side language:

  • <command> represents a command the user can invoke (e.g. toolbar button or icon).
  • <datalist> together with the new list attribute for <input> is used to create comboboxes.
  • <output> represents some type of output, such as from a calculation done through scripting.
  • <progress> represents a completion of a task, such as downloading or when performing a series of expensive operations.
  • <menu> represents a menu. The element has three new attributes: type, label and autosubmit. They allow the element to transform into a menu as found in typical user interfaces as well as providing for context menus in conjunction with the global contextmenu attribute.
  • <datagrid> represents an interactive representation of a tree list or tabular data.
  • <ruby>, <rt> and <rb> allow for marking up Ruby annotations.
  • <eventsource> represents a target that “catches” remote server events.
  • <datatemplate>, <rule> and <nest> provide a templating mechanism for HTML.

Let’s briefly look at the new <datagrid> element. A <datagrid> usually has a <table> child element, although a <select> or <datalist> child is also possible, for example to create a tree control. The columns in the datagrid can have clickable captions for sorting. Columns, rows, and cells can each have specific flags, known as classes, which affect the functionality of the datagrid element. Rows are selectable and single cells (or all cells) can be made editable. A cell can contain a checkbox or values that can be cycled. Rows can also be separator rows. Datagrids have a DOM API for updating, inserting, and deleting rows or columns. They also have a data provider API that controls grid data content and editing.

I hope you found this brief overview useful. Please note that the features mentioned here don’t cover everything that is new in HTML 5, but hopefully they catch the essence. The HTML 5 specification is a work in progress; it is still changing and evolving. You can find the latest editor’s draft at http://www.w3.org/html/wg/html5/. An overview of the changes from HTML 4 is available at http://www.w3.org/TR/html5-diff/.

Sep 13

One sign of the success of the Java platform is the profusion of new languages for the Java Virtual Machine (JVM) that have appeared during the past few years. Some examples are Jython, Groovy, and JRuby. These languages serve niches in domain-specific development, where they typically offer better productivity and time-to-market than traditional Java development. One language stands out, however: Scala. Unlike many others, Scala is not an adaptation of an existing language to the Java platform; it has been designed for the JVM from the beginning and it is fully interoperable with the existing Java APIs.

What is more, Scala is a rich statically typed language that provides the ease of use of a scripting language. Scala’s object-oriented features are at least as powerful as those of Java. Scala also offers a full set of functional programming features. It is the felicitous combination of object-oriented and functional properties that makes this language interesting. As of today, Scala already has closures, support for properties, an Erlang-like concurrency API (excellent for multi-core parallel programming) as well as other advanced features that Java will only have in future releases, or possibly never. For this reason, I would like to introduce this language to interested readers in a brief tutorial. Once you get to know Scala, you might agree that it’s a fascinating language. So let’s get started. The obvious starting point is the (in-)famous “Hello World” program, or rather a slightly more involved version of it:

object Hello {
  def main(args: Array[String]) {
    for(val arg: String <- args)
      System.out.println("Hello, " + arg + "!");
  }
}

Any Java programmer should be able to understand what these six lines of code do, despite their being written in a “foreign” language. The program prints a greeting for each of the arguments passed to it. The invocation “scala Hello World” prints the inspiring yet slightly trite line “Hello, World!”. For more fun, you could pass the names of all 190-odd countries and the program would greet each nation individually. There are some obvious differences to Java. For example, there are no static or void declarations. Another obvious difference is that the program starts with an object definition rather than a class definition. In Scala, the keyword object creates a singleton object, which is akin to a class that contains only static members, except that the Scala object is able to make use of inheritance and polymorphic invocation.
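The difference between a singleton object and a collection of static members can be sketched as follows. The names Greeter, FormalGreeter, and SingletonDemo are invented for this example and do not appear in the Hello program.

```scala
// An ordinary class with an overridable method.
class Greeter {
  def greeting(name: String): String = "Hello, " + name + "!"
}

// A singleton object: it has exactly one instance.
// Unlike Java statics, it can extend a class and override its members.
object FormalGreeter extends Greeter {
  override def greeting(name: String): String = "Good day, " + name + "."
}

object SingletonDemo {
  // The object is an ordinary value and can be passed wherever a Greeter is expected.
  def welcome(greeter: Greeter, name: String): String = greeter.greeting(name)

  def main(args: Array[String]): Unit = {
    println(welcome(new Greeter, "World"))
    println(welcome(FormalGreeter, "World"))
  }
}
```

The call through welcome is dispatched polymorphically, which is something a static Java method cannot offer.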

Method definitions start with the keyword def in Scala. The single argument to the main method is args which is of type Array[String], meaning “array of string”. The method contains a for-loop that looks similar to the Java-5 foreach loop construct. It iterates over the elements in the string array args. The expression val arg: String is equivalent to the Java expression final String arg. The keyword val declares an immutable variable, which means that its value cannot be changed after it has been assigned a value. Scala also has mutable variables which are declared with var instead of val. The variable arg is of type String and it takes the value of the iterator variable. It is possible to simplify the program a little:

object Hello {  
  def main(args: Array[String]) {
    for(val arg <- args)
      println ("Hello, " + arg + "!")
  }
}

In this version, the following items have been omitted:

The type declaration : String after arg was eliminated. Although Scala is a statically typed language, which means that every variable has a fixed type, it can infer type declarations automatically. In this case, it infers that the arg variable is of type String because the args variable is of type array of String. This feature is known as type inference and it contributes to making Scala source code more concise by eliminating redundant type declarations. Other examples for type inference are:

val a = "Abacadabra"
val b = 3.141592
val c = Map(1 -> "alpha", 2 -> "beta", 3 -> "gamma")

The types of the variables a, b, c are derived from the literal values. Variable a is a String, b is a Double, and c is a Map that maps Int values to String values. The above listing also does without the System.out in System.out.println(), which results in just println(). Since println() is a frequently used statement, Scala predefines it in an object whose members are imported automatically. Java programmers may think of this as an automatic static import. Another thing that is absent from the simplified version of the program is the semicolon after the println() call. Semicolons are mostly optional in Scala, except in situations where multiple expressions are strung together on one line. The code already looks simple, but with Scala we can make it even more concise by replacing the for loop with a foreach() invocation:

object Hello { 
  def main(args: Array[String]) {
    args.foreach(arg => println("Hello, " + arg + "!"))
  }
}

The foreach construct iterates over args, binding each element to a String variable arg in turn and calling println() on every iteration. The expression with the arrow (an equals sign followed by a greater-than sign), this => that, can be read as “given the argument this, do that”. It is called a function literal in Scala. Granted, in this case it doesn’t save us much typing, but if we only wanted to print the arg variable itself, without combining it with anything else in the expression following the arrow =>, we could just write args.foreach(println), which is the short form for args.foreach(arg => println(arg)).
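Function literals are not limited to foreach. The following sketch, with the made-up names FunctionLiteralDemo, greet, greetAll, and greeterFor, shows a literal stored in a value, passed as an argument, and returned from a method.

```scala
object FunctionLiteralDemo {
  // A function literal assigned to a value; its type is String => String.
  val greet: String => String = name => "Hello, " + name + "!"

  // A method that takes a function as its second parameter.
  def greetAll(names: Seq[String], f: String => String): Seq[String] = names.map(f)

  // A method that builds and returns a new function from its argument.
  def greeterFor(salutation: String): String => String =
    name => salutation + ", " + name + "!"

  def main(args: Array[String]): Unit = {
    args.foreach(arg => println(arg))             // long form
    args.foreach(println)                         // short form of the same call
    greetAll(args.toSeq, greet).foreach(println)  // a literal passed by name
  }
}
```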

I hope that this first glance at Scala has whetted your appetite for more. The cited examples are fairly unsophisticated. The next section, which discusses Scala’s types and object hierarchy, will introduce some more involved concepts.
