VP8 has landed

Today, WebM/VP8 was announced at Google IO. For the last month I have been working on adding support to Opera, and we have now released labs builds for Windows, Mac and Linux. We have also published an article for web developers who want to start using WebM <video>. Here’s what it looks like: the Sintel trailer on YouTube playing in Opera with WebM <video>. No Flash!

Previously I’ve posted on Opera Core Concerns, but this time I want to share some personal reflections. (In other words: any views expressed are my own, not those of Opera Software.) What follows is the (quite geeky) history of me and the various codecs and organizations I’ve come into contact with over the past 8 years.

Back in gymnasiet I was a compulsive MP3 collector and was very picky about my bitrates. 128 kbps sounded (still sounds) horrible and it pained me, so when I learned about the Vorbis audio codec I was very excited. Not only was it technically superior, it was also completely free. I re-ripped all of my CDs as Vorbis, told all my friends to do the same and started listening to Machinae Supremacy simply because they offered Ogg downloads. I was a fanboy.

In June 2002 On2 released VP3 to the world. It was my summer holidays and I spent most days inside on an extra slow dial-up connection. I clearly remember that upon reading the news I literally bounced out of my chair and threw my hands in the air out of joy. (Remember, I was 17.) The first thing I did when there was code I could compile and run (packaged by Xiph I believe) was to encode and watch Star Trek: First Contact. The example decoder could neither pause nor play in fullscreen, so instead I changed my screen resolution and just watched.

At the time I couldn’t do much to help out, but I wanted to be part of this cool community. One of the first pieces of C code I ever wrote was oggsplit, a not-so-useful tool for splitting multiplexed Ogg streams into separate files. I never used it much, but was quite proud to see it in Xiph’s ogg-tools package.

Fast forward. In the summer of 2006 I began working as a summer intern at Opera Software in Linköping, where I wrote an example plugin for video playback on Opera Devices SDK. I picked GStreamer as the backend and by the time I was done I must have watched A New Computer ~1000 times.

In February 2007 Opera proposed <video> and released a proof-of-concept Ogg Vorbis+Theora build. I had no part in this, so it came as somewhat of a surprise. I initially thought that they had used my plugin, but that turned out not to be the case – it was libogg, libvorbis and libtheora integrated directly into the browser. The most exciting part was the strong stance for open standards, something that I obviously agree with.

What happened after that is pretty well known: the <video> tag makes obvious sense, so it quickly got implemented in other browsers. When I joined Opera’s core department (part time) in the summer of 2008, <video> hadn’t been touched much for over a year, so I was tasked with bringing it back to life. Loving both audio/video and the web, it would be hard to find a more suitable and fun job. I ended up porting my then 2-year-old plugin and thus Opera is now using GStreamer internally. The Codec Wars™ were always a pain, but we did finally release Opera 10.50 with support for Ogg Vorbis+Theora.

After Google announced that they were buying On2 there was lots of speculation that they would release VP8. I certainly hoped it would happen, but it seemed a bit too good to be true. Therefore, my reaction when it was confirmed was similar to when VP3 was released – bouncing like a 17-year-old. That Vorbis is the chosen audio codec for WebM only makes things better. How lucky I am that this time I get to actually be part of the release event. It’s been immensely fun working on this in secrecy, then seeing everything happen in a maelstrom of releases, tweets and blog posts today. Håkon is at Google IO running my code on stage, but just a few weeks ago he was in Opera’s Beijing office, watching sunflowers in one of the first Opera VP8 builds:

While not yet 100% bug-free, VP8 in Opera is well on its way and will be in an official release soon. Today is a good day for open video and the open web. Many thanks to everyone who has worked to make this possible. Live long and prosper, WebM!

Microformats vs RDFa vs Microdata

Warning: The microdata syntax has changed (e.g. item="foo" is now itemscope itemtype="foo") since this blog post was written. Don’t copy the examples.

I spent last weekend with my good friend Emil sketching a REST-style interface for his graph database Neo4j. One of the output formats we wanted was plain HTML for easy debugging via the browser. Wanting to enable JavaScript-based enhancements of these pages we needed a way to annotate the data to make it available to scripts. (Use by clients of the REST API should be possible, but unlikely if XML or JSON output is available.)

The three candidates were microformats, microdata and RDFa. We began with plain HTML:

  I'm Philip Jägenstedt at
  <a href="http://foolip.org/">foolip.org</a>.

The simple task at hand is to make my name and homepage machine-readable using each of these formats. What follows is a more elaborate version of the reasoning we went through while evaluating the strengths and weaknesses of each alternative.


<p class="vcard">
  I'm <span class="fn">Philip Jägenstedt</span> at
  <a class="url" href="http://foolip.org/">foolip.org</a>.
</p>

Microformats are “a set of simple, open data formats”, i.e. predefined vocabularies under centralized control. In this example I’ve used the hCard microformat. One “feature” of microformats is that it is valid HTML 4.01/XHTML 1.0, which is why the class attribute is used in novel ways. Although HTML 4.01 mentions that class may be used “for general purpose processing by user agents” it’s normally only used “as a style sheet selector”, i.e. for CSS. What this means is that we are working in a single global namespace which is already polluted with all the CSS class names ever used.

The only thing that distinguishes microformats from random CSS classes is the tree structure. This structure is quite a limitation though, because it means that you have to find or make a common ancestor element to all of the data in a single hCard. For a data interchange format, it all seems insane and simply too brittle. Emil put it rather bluntly when he tweeted:

Microformats. Pile of shite that just increases our systematic technical debt.

Still, I have great respect for some of the people behind microformats and the down-to-earth philosophy. They openly state that microformats aren’t “infinitely extensible and open-ended” or “a panacea for all taxonomies, ontologies, and other such abstractions”. As microformats was never intended to solve our use case it is no surprise that it really doesn’t.

Certainly anyone can use class="foo" to mean anything they like without going through the microformats process – such data formats are cleverly called poshformats (Plain Old Semantic HTML). All things considered though, the whole approach seems outdated and I hope it won’t still be around 5 years from now. Microformats has shown the need for HTML-embedded machine-readable data, now let’s find a better solution.


To understand RDFa you first need some understanding of RDF. The RDF model is basically a somewhat roundabout way of describing graphs using subject-predicate-object triples. An example is the best way to illustrate:

[Figure: RDF graph of me, my name and my homepage]

This graph represents me, my name, my homepage and the relationships between them. I’m using the FOAF vocabulary because it already has the concepts of “name” and “homepage”. In N3 syntax this corresponds to these two triples:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<#me> foaf:name "Philip Jägenstedt" .
<#me> foaf:homepage <http://foolip.org> .

Everything in <brackets> is a URI, and because URIs tend to be long, prefixes are used: foaf:name (written without brackets) actually means <http://xmlns.com/foaf/0.1/name>. I’ve used #me to represent myself, but this should really be resolved to a full URI.

As you can see, the subject is #me in both statements. The relationships in the graph are the predicates, i.e. foaf:name and foaf:homepage. The object is either another URI or a string literal. Adding RDF triples equates to adding more nodes and relationships to the graph. This is general enough that you can model almost anything you want with it.
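To make the model a bit more concrete, here is a minimal JavaScript sketch of the same two triples as plain data. The array-of-objects representation and the `objectsOf` helper are my own illustration and are not taken from any RDF library.

```javascript
// Hypothetical in-memory representation of the two triples above.
// FOAF is the real namespace URI; everything else is illustrative.
const FOAF = "http://xmlns.com/foaf/0.1/";

const triples = [
  { subject: "#me", predicate: FOAF + "name", object: "Philip Jägenstedt" },
  { subject: "#me", predicate: FOAF + "homepage", object: "http://foolip.org" },
];

// Return every object linked to `subject` through `predicate`.
function objectsOf(triples, subject, predicate) {
  return triples
    .filter((t) => t.subject === subject && t.predicate === predicate)
    .map((t) => t.object);
}

console.log(objectsOf(triples, "#me", FOAF + "name"));
```

Adding a node or a relationship to the graph is then just a matter of pushing another object onto the array.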

Back to RDFa. The “a” refers to how attributes in XHTML are used to serialize RDF:

<p xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me">
  I'm <span property="foaf:name">Philip Jägenstedt</span> at
  <a rel="foaf:homepage" href="http://foolip.org/">foolip.org</a>.
</p>

The use of XML namespaces here is a bit odd. Prefixes in XML are used on element and attribute names, but here it’s only used in the attribute value. These are actually CURIEs, another URL shortening scheme. Jeni Tennison recently wrote an excellent post about the use of prefixes in RDFa which I encourage everyone to read. I also chatted briefly with Henri Sivonen about the problems with xmlns and would recommend reading his mails on those issues.
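The prefix expansion itself is easy to picture. Below is a simplified JavaScript sketch of it; the `prefixes` table and the `expandCurie` function are hypothetical names of mine, and the real CURIE rules have more cases (default prefixes, safe CURIEs in square brackets) than this handles.

```javascript
// Hypothetical prefix table, mirroring the xmlns:foaf declaration above.
const prefixes = { foaf: "http://xmlns.com/foaf/0.1/" };

// Expand "prefix:reference" into a full URI. The input is returned
// unchanged when there is no colon or the prefix is not declared,
// which is a simplification of the real processing rules.
function expandCurie(curie, prefixes) {
  const colon = curie.indexOf(":");
  if (colon === -1) return curie;
  const prefix = curie.slice(0, colon);
  const reference = curie.slice(colon + 1);
  return Object.prototype.hasOwnProperty.call(prefixes, prefix)
    ? prefixes[prefix] + reference
    : curie;
}

console.log(expandCurie("foaf:name", prefixes));
// http://xmlns.com/foaf/0.1/name
```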

If we return to RDFa syntax for a bit, notice how property, rel and rev are used for the exact same purpose (setting the predicate) in different contexts. The intention was probably to mimic existing practices such as rel="next", but the net result is just more room for confusion. While I won’t claim that it’s just too hard, I certainly think it could have been simpler without losing much expressive power.

RDFa began in the now discontinued XHTML2 WG and seems strongly rooted in the Semantic Web (now Linked Data) community and that stack of technologies and tools. It was later made into a module for XHTML 1.1, but there is no W3C-sanctioned way of embedding RDFa in plain HTML. Getting into HTML5 would guarantee RDFa’s survival in the web ecosystem, so its proponents approached the WHATWG/HTML WG suggesting that RDFa be included. There was much heated discussion, the drama of which was my sole source of entertainment for weeks at a time. I’ll again refer to Jeni’s summary of the clash of priorities and “fruitless discussion”. I particularly want to emphasize this conclusion:

It’s just not going to happen for HTML5

I don’t hate RDF(a). I can certainly see the appeal of the RDF model after taking the time to understand it. It may just be a very verbose way of describing graphs, but as a data interchange format it seems to do a good job. However, being able to express arbitrary RDF in HTML in a compact way is not an actual use case for most web developers. If it’s possible without added complexity that’s fine, but HTML is not a triplestore.


As a result of gathering use cases and other input from the big RDFa discussion, the HTML5 microdata section suddenly sprang into existence one day, along with a very long announcement to the WHATWG list from Ian Hickson (our editor). Within 3 hours there was a demo and not long after another. This is it:

<p item="vcard">
  I'm <span itemprop="fn">Philip Jägenstedt</span> at
  <a itemprop="url" href="http://foolip.org/">foolip.org</a>.
</p>

This looks very similar to the microformats example, but the new item and itemprop attributes are used instead of class. The model used is nested groups of name-value pairs, where the name-value pairs are given by the elements with itemprop attributes. In other words, it is quite similar to a DOM tree or a JavaScript object.
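To show what I mean by the JavaScript object comparison, here is the vcard item written out by hand. The shape (a `type` plus a `properties` map whose values are arrays, since a name may occur more than once) is my own assumption for illustration and may not match the JSON that the spec's extraction algorithm produces.

```javascript
// Hand-written object mirroring the markup above; not produced by any API.
const item = {
  type: "vcard",
  properties: {
    fn: ["Philip Jägenstedt"],
    url: ["http://foolip.org/"],
  },
};

console.log(JSON.stringify(item, null, 2));
```

A nested item would simply appear as another such object in place of a string value.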

There are some predefined item types (used above), but it’s possible to use either URLs (http://foolip.org/footype) or reversed DNS identifiers (org.foolip.footype) to define your own types without any risk of namespace pollution. Note however that there are no prefixes or other URL shortening schemes. I don’t think I’m crazy to suggest that services like bit.ly and tr.im have shown a way out of the “long URL” problem. If microdata gains any traction, I think communities will create vocabularies with clever shorthands like http://link.to/the/past, mr.burns or ht.ml5.
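The three kinds of type name can be told apart with a very rough heuristic, sketched here in JavaScript. The `classifyItemType` function and its rule (a colon suggests a URL, a dot without a colon suggests reversed DNS) are my own assumption for illustration, not the exact wording of the spec.

```javascript
// Rough heuristic, assumed for illustration only:
// a colon suggests a URL, a dot without a colon suggests a reversed
// DNS identifier, and anything else is treated as a predefined name.
function classifyItemType(name) {
  if (name.includes(":")) return "url";
  if (name.includes(".")) return "reversed-dns";
  return "predefined";
}

for (const name of ["vcard", "http://foolip.org/footype", "org.foolip.footype"]) {
  console.log(name + " is " + classifyItemType(name));
}
```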

Finally, the subject attribute can be used to avoid the “common ancestor” problem we had with microformats by simply referring to the item element by id:

<p item="vcard" id="me">
  I'm <span itemprop="fn">Philip Jägenstedt</span>.
</p>
<!-- stuff -->
<a itemprop="url" subject="me"
   href="http://foolip.org/">foolip.org</a>.

Microdata is quite straightforward and feels much more native to HTML than RDFa. As Jeni explains, microdata can’t express RDF triples using datatypes or XML literals. I’ll also add that using a blank node as object isn’t possible. Other than that, RDF triples can be expressed by using the about property to give the subject of the name-value (predicate-object) pair. Here’s my FOAF example from earlier:

<p item>
  <a itemprop="about" href="#me"></a>
  I'm <span itemprop="http://xmlns.com/foaf/0.1/name">
    Philip Jägenstedt</span> at
  <a itemprop="http://xmlns.com/foaf/0.1/homepage"
     href="http://foolip.org/">foolip.org</a>.
</p>

It is quite ugly, so if there’s any way to make it simpler I’m sure such suggestions are welcome. In general though, it seems like a better idea to use simple microdata structures and map that against a RDF vocabulary if possible. In fact, the spec already defines how to extract some RDF (and JSON) from microdata so I’m sure it’s not difficult to do.
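As a rough illustration of that kind of mapping, here is a JavaScript sketch that turns a simple item into subject-predicate-object triples. The item shape and the `itemToTriples` function are hypothetical, and this is not the extraction algorithm defined in the spec, only the general idea.

```javascript
// Hypothetical item shape: `about` gives the subject, and the property
// names are full predicate URIs, as in the markup above.
const item = {
  about: "#me",
  properties: {
    "http://xmlns.com/foaf/0.1/name": ["Philip Jägenstedt"],
    "http://xmlns.com/foaf/0.1/homepage": ["http://foolip.org/"],
  },
};

// Emit one [subject, predicate, object] triple per property value.
function itemToTriples(item) {
  const triples = [];
  for (const [predicate, values] of Object.entries(item.properties)) {
    for (const object of values) {
      triples.push([item.about, predicate, object]);
    }
  }
  return triples;
}

for (const triple of itemToTriples(item)) {
  console.log(triple.join(" "));
}
```

Note that this sketch treats every value as a plain string, so it cannot tell a literal from a URI, which is part of what a real mapping would have to decide.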

Returning to the “browsable web” (the one I normally work with), microdata has a DOM API that browsers can implement. The prospect of JavaScript having access to the microdata on a page is so exciting that I didn’t want to wait, so I hacked up MicrodataJS to try it out. You can access my name and URL in the vcard example as such:

var props = document.getItems("vcard")[0].properties;
var fn = props.namedItem("fn")[0].content;
var url = props.namedItem("url")[0].content;
alert("Name: " + fn + "; URL: " + url);

Unsurprisingly there are some issues with the API which I’ve sent feedback on and expect to be fixed to my satisfaction eventually, but the basic functionality is sound. I imagine scripts making dynamic pie charts from tables, providing page-specific autocomplete suggestions and making shiny animated SVG visualizations of the RDF graphs hidden in the tag soup…

Google is now offering to do usability testing of the microdata syntax to see if it can be improved, so if you have any suggestions be sure to bring those to the WHATWG now.


The examples I’ve used are overly simplistic and may utterly fail to show the strengths and weaknesses of each syntax. Still, this is my best effort to make sense of the issues at hand and I haven’t intentionally misrepresented any technology or community. I assume that there is much more debate to come before the dust settles on this issue and perhaps I’ll even change my mind after experimenting more with real-world implementation. I leave you with this unambiguous summary of my views:

  • Microformats, you’re a class attribute kludge
  • RDFa, HTML is not your triplestore
  • Microdata, I like you but you need more review


  • Shelley Powers wrote about RDFa and HTML5’s microdata from the perspective of the RDFa/Semantic Web community. It’s quite a different view from mine, so read that before believing my propaganda.
  • Following James Graham’s suggestion, I have registered mantic.se for fun reverse DNS identifiers like se.mantic.banana. Mostly for fun, don’t take it too seriously…
  • I misunderstood Jeni’s post about expressing RDF in microdata and have fixed that section to be more accurate.

Disclaimer: this post is the result of excess spare time and not part of my work at Opera Software. I know nothing about Opera’s plans (or lack thereof) for microformats, RDFa or microdata.