Internet Alchemy http://iandavis.com/blog 4 8 15 16 23 42 Tue, 12 May 2009 21:36:15 +0000 http://wordpress.org/?v=2.8-bleeding-edge en hourly 1 Google’s RDFa a Damp Squib http://iandavis.com/blog/2009/05/googles-rdfa-a-damp-squib http://iandavis.com/blog/2009/05/googles-rdfa-a-damp-squib#comments Tue, 12 May 2009 21:36:15 +0000 Ian Davis http://iandavis.com/blog/?p=1355 It’s been an interesting week for embedding metadata in HTML. Yesterday I was exploring HTML5 microdata and today Google announced support for RDFa. At first this announcement seemed like a big deal – Google supporting the web of data in a big way, a real push into the world of open structured data. However, a closer look reveals that Google have basically missed the point of RDFa. The RDFa support is limited to the properties and classes defined on a hastily thrown together site called data-vocabulary.org. There you will find classes for Person and Organization and properties for names and addresses, completely ignoring the millions of pieces of data using well established terms from FOAF and the like. That means everyone has to rewrite all their data to use Google’s schema if they want to be featured on Google’s search engine. It’s like saying that, to be listed in their search engine, you have to write your pages using Google’s own version of HTML where all the tags have slightly different spellings!

The result is a hobbled implementation of RDFa. They’ve taken the worst part – the syntax – and thrown away the best – the decentralized vocabularies of terms. It’s like using microformats without the one thing they do well: the simplicity. This is why I believe Google missed the point. They made the mistake of treating RDFa as an alternative to microformats, which completely ignores its true strength as a structured data format.

As I twittered earlier: it seems odd that Google, a company that thrived on the open messy web, seeks to ignore it and go for a controlled vocabulary. I’m hoping that this is just a toe in the water and more will come. But there’s a part of me that thinks otherwise. Surely there’s no way the smart people in Google didn’t know about the existing vocabularies and data for people, places, reviews and businesses? We’ve all seen large companies claim support for key standards yet deliver partial or broken implementations, and some companies use that as a deliberate tactic to undermine the standard itself, to break interoperability or to make it impossibly hard. It’s very easy for these situations to be explained away as a mistake, or as a work in progress, but we need to push and dig deeper and hold companies to their very public claims.

http://iandavis.com/blog/2009/05/googles-rdfa-a-damp-squib/feed
Microdata Experiment http://iandavis.com/blog/2009/05/microdata-experiment http://iandavis.com/blog/2009/05/microdata-experiment#comments Tue, 12 May 2009 01:10:03 +0000 Ian Davis http://iandavis.com/blog/2009/05/microdata-experiment I read the new HTML5 microdata proposal tonight and thought I’d see what it would take to convert my existing homepage which is currently marked up using eRDF. The result is here and it was surprisingly painless to do the conversion. You can try it out using this demo service. The spec is still changing so I don’t know how long my experiment will remain valid (it changed from using property to itemprop attributes while I was converting my html!)

http://iandavis.com/blog/2009/05/microdata-experiment/feed
Tom Ilube Explains the Semantic Web at Davos http://iandavis.com/blog/2009/03/tom-ilube-explains-the-semantic-web-at-davos http://iandavis.com/blog/2009/03/tom-ilube-explains-the-semantic-web-at-davos#comments Wed, 04 Mar 2009 09:55:53 +0000 Ian Davis http://iandavis.com/blog/?p=1346 A great presentation by Tom Ilube, explaining the Semantic Web at the Davos economic forum this year. Succinct, articulate and pitched at just the right level.

http://iandavis.com/blog/2009/03/tom-ilube-explains-the-semantic-web-at-davos/feed
Why Open Data Is More Important than Open Source http://iandavis.com/blog/2009/03/open-data-open-source http://iandavis.com/blog/2009/03/open-data-open-source#comments Wed, 04 Mar 2009 02:56:00 +0000 Ian Davis http://iandavis.com/blog/?p=1336 Last week I delivered the keynote for the final day of code4lib 2009. This was a particular honour because, unlike many conferences, the keynote speakers are proposed and voted on by the code4lib community. So, rather than keynote speakers being used to draw people to the conference, the community draws the speakers to them.

I chose to present on a topic that is close to my heart, to my company’s vision and, I hoped, of interest to the audience: freedom. For this conference I chose a specific expression of freedom that I thought would be of particular interest to a community deeply entrenched in metadata. The title of my keynote was “If you love something… set it free”:

The conference organisers videoed all the sessions so hopefully I can link to a more informative version. As usual with this kind of presentation I wrote copious prompt notes in case I dried up, then found I couldn’t read them and carried on regardless :)

I hope to come back to various points raised in my presentation over time, but right now I want to focus on one area that has sparked a good deal of debate (such as here, here and here with much twittering too). Right in the middle of the presentation I offered three conjectures, the first of which was “data outlasts code”, which led me to assert that open data is therefore more important than open source. This appears to be controversial.

First, it’s important to note what I did not say. I did not say that open source is not important. On the contrary I said that open source was extremely important and it has sounded the death knell for proprietary software. Later speakers at the conference referred to this statement as controversial too :) (What I actually meant to say was that open source has sounded the death knell for proprietary software models.) I also mentioned that open source and free software have a long history and that open data is where open source was 25 years ago (I am using the terms open source and free software interchangeably here).

I also did not say that code does not last nor that algorithms do not last. Of course they last, but data lasts longer. My point was that code is tied to processes usually embodied in hardware whereas data is agnostic to the hardware it resides on. The audience at the conference understand this already: they are archivists and librarians and they deal with data formats like MARC which has had superb longevity. Many of them deal with records every day that are essentially the same as they were two or three decades ago. Those records have gone through multiple generations of code to parse and manipulate the data.

In a recent post Egon Willighagen criticised my conjecture:

Ian Davis was recently quoted saying open data is more important than open source, which was pulled (out of context) from this presentation. The context was (a slide earlier): Data outlasts code.

As far as I can see, this is utter nonsense, even within context of the slide (see also this discussion on FriendFeed). Obviously, within the context of Ian it does makes sense, and I hope he will respond in his blog and explain why he thinks Open Data is more special.

Without code, you have no way of accessing the data. Ask anyone to recover from a hard disk failure. In ODOSOS (Open Standards, Open Data, Open Source) they are all equal. You need them all for progress. You cannot single out one as being more important than another. Why would you anyway? Politics is all I can think of… All three combine and ensure our science is more efficient.

I think the flaw in this argument is this statement: “Without code, you have no way of accessing the data.” It’s true that you need code to access data, but critically it doesn’t have to be the same code from year to year, decade to decade, century to century. Any code capable of reading the data will do, even if it’s proprietary. You can also recreate the code whereas the effort involved in recreating the data could be prohibitively high. This is, of course, a strong argument for open data formats with simple data models: choosing CSV, XML or RDF is going to give you greater data longevity than PDF, XLS or PST because the cost of recreating the parsing code is so much lower.
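To make the cost argument concrete, here is a minimal sketch of a CSV reader written from scratch, with no library support. It assumes a simple dialect (comma separators, optional double-quoted fields, a doubled quote as an escape, no newlines inside fields). The function names and the sample records are invented for this illustration and do not come from any particular system.

```python
def parse_csv_line(line):
    """Split one CSV line into a list of field strings.

    Assumed dialect: commas separate fields, a field may be wrapped in
    double quotes, and "" inside a quoted field means a literal quote.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')  # escaped quote
                    i += 1               # skip the second quote
                else:
                    in_quotes = False    # closing quote
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch == ',':
            fields.append(''.join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append(''.join(current))
    return fields


def parse_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = parse_csv_line(lines[0])
    return [dict(zip(header, parse_csv_line(ln))) for ln in lines[1:]]


# Made-up records, purely for illustration.
SAMPLE = (
    'title,author,year\n'
    '"A Title","Smith, Jane",1990\n'
    '"The ""Second"" Book",Doe,2001\n'
)

records = parse_csv(SAMPLE)
```

Around forty lines are enough to get the data back out, which is the point: if every existing CSV library vanished, the reader could be rewritten in an afternoon. The same cannot be said of a parser for PDF, XLS or PST.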

Here’s the central asymmetry that leads me to conclude that open data is more important than open source: if you have data without code then you could write a program to extract information from the data, but if you have code without data then you have lost that information forever.

Consider also the rise of software as a service. It really doesn’t matter whether the code these services are built on is open source or not if you cannot access the data they manage for you. Even if you reproduce the service completely, using the same components, your data is buried away out of your reach. However, if you have access to the data then you can achieve continuity even if you don’t have access to the underlying source of the application. I’ll say it again: open data is more important than open source.

Of course we want open standards, open source and open data. But in one or two hundred years which will still be relevant? Patents and copyrights on formats expire, hardware platforms and even their paradigms shift and change. Data persists, open data endures.

The problem we have today is that the open data movement is in its infancy when compared to open source. We have so far to go, and there are many obstacles. One of the first steps to maturity is to give people the means to express how open their data is, how reusable it is. The Open Data Commons is an organisation explicitly set up to tackle the problem of open data licensing. If you are publishing data in any way you ought to check out their licences and see if any meet your goals. If you license your data openly then it will be copied and reused and will have an even greater chance of persisting over the long term.

Hopefully I have given plenty of background to my open data conjecture. I’m eager to hear what you think so please comment or email me directly. You may find these links relevant too:

http://iandavis.com/blog/2009/03/open-data-open-source/feed
The Semantic Web Acid Test http://iandavis.com/blog/2009/03/the-semantic-web-acid-test http://iandavis.com/blog/2009/03/the-semantic-web-acid-test#comments Mon, 02 Mar 2009 13:17:27 +0000 Ian Davis http://iandavis.com/blog/?p=1329 Tom Heath writes a cracking post on the current attempts by a few people to brand web applications that happen to perform text analysis as “Semantic Web”. For me, this nails it:

I certainly notice plenty of unjustified attempts at present to co-opt the term Semantic Web, now that it’s no longer a dirty word, and drive it off down some dodgy alleyway. Some of these products, services or companies may be applications or services that use some semantic technology and are delivered over the Web, but that doesn’t make them Semantic Web applications, services or companies. Anything claiming the Semantic Web label needs to get its hands dirty with Linked Data somewhere along the way. That’s just how it is.

Tom’s right. These attempts to label some pretty run-of-the-mill web applications as Semantic Web suggest to me that the marketers see the Semantic Web meme as carrying some useful currency. The problem they face is that the Semantic Web has some well-defined principles that can be used as tests. Here’s the first test: if you see one of these applications, find one of its pages describing something that’s useful to you (e.g. a place or a person) and ask yourself “what’s the URI of the thing this page is describing?”.

http://iandavis.com/blog/2009/03/the-semantic-web-acid-test/feed
What Are The Benefits of MVC? http://iandavis.com/blog/2008/12/what-are-the-benefits-of-mvc http://iandavis.com/blog/2008/12/what-are-the-benefits-of-mvc#comments Tue, 09 Dec 2008 01:08:28 +0000 Ian Davis http://iandavis.com/blog/?p=1316 Since there’s a rather nice discussion going on around my weekend post on RMR and MVC I thought I’d dig out the description of MVC from the Gang of Four book to remind us all what we’re actually talking about. Often people forget that the GOF book didn’t include MVC as a design pattern but as a usage scenario that they decomposed into constituent patterns, most notably Observer, Strategy and Composite.

Here’s the relevant section from the book:

The Model/View/Controller (MVC) triad of classes [first described by Krasner and Pope in 1988] is used to build user interfaces in Smalltalk-80. Looking at the design patterns inside MVC should help you see what we mean by the term “pattern.” MVC consists of three kinds of objects. The Model is the application object, the View is its screen presentation, and the Controller defines the way the user interface reacts to user input. Before MVC, user interface designs tended to lump these objects together. MVC decouples them to increase flexibility and reuse.

MVC decouples views and models by establishing a subscribe/notify protocol between them. A view must ensure that its appearance reflects the state of the model. Whenever the model’s data changes, the model notifies views that depend on it. In response, each view gets an opportunity to update itself. This approach lets you attach multiple views to a model to provide different presentations. You can also create new views for a model without rewriting it.

[...reference to diagram elided...]

Taken at face value, this example reflects a design that decouples views from models. But the design is applicable to a more general problem: decoupling objects so that changes to one can affect any number of others without requiring the changed object to know details of the others. This more general design is described by the Observer design pattern.

Another feature of MVC is that views can be nested. For example, a control panel of buttons might be implemented as a complex view containing nested button views. The user interface for an object inspector can consist of nested views that may be reused in a debugger. MVC supports nested views with the CompositeView class, a subclass of View. CompositeView objects act just like View objects; a composite view can be used wherever a view can be used, but it also contains and manages nested views.

Again, we could think of this as a design that lets us treat a composite view just like we treat one of its components. But the design is applicable to a more general problem, which occurs whenever we want to group objects and treat the group like an individual object. This more general design is described by the Composite design pattern. It lets you create a class hierarchy in which some subclasses define primitive objects (e.g., Button) and other classes define composite objects (CompositeView) that assemble the primitives into more complex objects.

MVC also lets you change the way a view responds to user input without changing its visual presentation. You might want to change the way it responds to the keyboard, for example, or have it use a pop-up menu instead of command keys. MVC encapsulates the response mechanism in a Controller object. There is a class hierarchy of controllers, making it easy to create a new controller as a variation on an existing one.

A view uses an instance of a Controller subclass to implement a particular response strategy; to implement a different strategy, simply replace the instance with a different kind of controller. It’s even possible to change a view’s controller at run-time to let the view change the way it responds to user input. For example, a view can be disabled so that it doesn’t accept input simply by giving it a controller that ignores input events.

The View-Controller relationship is an example of the Strategy design pattern. A Strategy is an object that represents an algorithm. It’s useful when you want to replace the algorithm either statically or dynamically, when you have a lot of variants of the algorithm, or when the algorithm has complex data structures that you want to encapsulate.

MVC uses other design patterns, such as Factory Method to specify the default controller class for a view and Decorator to add scrolling to a view. But the main relationships in MVC are given by the Observer, Composite, and Strategy design patterns.

From this text the two key benefits of MVC are that it allows you to:

  • “attach multiple views to a model to provide different presentations” (view/model decoupling)
  • “change the way a view responds to user input without changing its visual presentation” (view/controller decoupling)

Unexpectedly (for me, anyway) it says nothing about decoupling models and controllers. Anyway, my observation is that if you need the above flexibility then MVC is your best bet. However, if you don’t need those particular decouplings then you are adopting needless complexity and you’ll be paying for it in the long run.
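The two decouplings can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than the Smalltalk-80 design itself: every class and method name below (Model, View, TextView, BarView, IncrementOnClick, IgnoreInput, attach, update, on_input) is hypothetical. The model notifies its views through a subscribe/notify protocol (Observer), and each view delegates input handling to a replaceable controller object (Strategy).

```python
class Model:
    """Application object. It holds a list of observers but knows nothing
    about what kind of views they are (the Observer subject)."""

    def __init__(self, value=0):
        self._value = value
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for observer in self._observers:  # notify every dependent view
            observer.update(self)


class IgnoreInput:
    """Controller strategy that discards all input (a disabled view)."""

    def handle(self, model, event):
        pass


class IncrementOnClick:
    """Controller strategy that bumps the model on a 'click' event."""

    def handle(self, model, event):
        if event == "click":
            model.value += 1


class View:
    """Subscribes to a model and delegates input to its controller."""

    def __init__(self, model, controller):
        self.model = model
        self.controller = controller
        self.rendered = self.render(model)
        model.attach(self)

    def update(self, model):
        self.rendered = self.render(model)

    def render(self, model):
        raise NotImplementedError

    def on_input(self, event):
        self.controller.handle(self.model, event)


class TextView(View):
    def render(self, model):
        return f"value = {model.value}"


class BarView(View):
    def render(self, model):
        return "#" * model.value


model = Model()
text = TextView(model, IncrementOnClick())
bar = BarView(model, IgnoreInput())

text.on_input("click")               # handled: both views re-render
bar.on_input("click")                # ignored by this view's controller
bar.controller = IncrementOnClick()  # swap the strategy at run time
bar.on_input("click")                # now handled
```

Note that the controllers here talk to the model directly. That coupling is a choice made for this sketch, and it is consistent with the observation above that the text says nothing about decoupling models from controllers.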

http://iandavis.com/blog/2008/12/what-are-the-benefits-of-mvc/feed
Second OpenVocab Logo http://iandavis.com/blog/2008/12/second-openvocab-logo http://iandavis.com/blog/2008/12/second-openvocab-logo#comments Sun, 07 Dec 2008 10:45:14 +0000 Ian Davis http://iandavis.com/blog/?p=1313 Darren Geraghty sent me a wonderful design idea for the OpenVocab logo competition. I’ve had two great submissions now.

http://iandavis.com/blog/2008/12/second-openvocab-logo/feed
Happy People http://iandavis.com/blog/2008/12/happy-people http://iandavis.com/blog/2008/12/happy-people#comments Sat, 06 Dec 2008 12:38:53 +0000 Ian Davis http://iandavis.com/blog/?p=1311 Stowe Boyd picks up on some very interesting research that suggests that people’s happiness is related to the happiness of their friends, their friends’ friends, and their friends’ friends’ friends. Specifically the report includes this:

we found that each additional happy friend increases a person’s probability of being happy by about 9%. For comparison, having an extra $5,000 in income (in 1984 dollars) increased the probability of being happy by about 2%.

I suspect the converse is true too. People stuck in networks of unhappy people will tend to be unhappier, which inevitably leads to their friends being slightly unhappier in return. I’ve seen this effect in organisations where people who are unhappy with their workplace tend to band together to share their unhappy experiences. When this group is in the majority then they may have the chance to change the organisation’s behaviour. However when they are in the minority, implying that the majority of the organisation are happy, then it becomes a difficult environment for them which increases their unhappiness, a spiral of despair that can bring the whole group down. That can be very damaging to the individuals, their otherwise happier peers and to the organisation itself. I wonder if this research indicates that these spirals can be unwound by the happy network reaching out to the unhappy one?

http://iandavis.com/blog/2008/12/happy-people/feed
Web Sequence Diagrams http://iandavis.com/blog/2008/12/web-sequence-diagrams http://iandavis.com/blog/2008/12/web-sequence-diagrams#comments Sat, 06 Dec 2008 10:44:41 +0000 Ian Davis http://iandavis.com/blog/?p=1308 Via Ryan Tomayko’s great article on gateway caches I found this neat webapp that generates UML sequence diagrams. Very handy and a clever implementation. It even has an API for integration into other tools.

http://iandavis.com/blog/2008/12/web-sequence-diagrams/feed
The Web is RMR not MVC http://iandavis.com/blog/2008/12/the-web-is-rmr-not-mvc http://iandavis.com/blog/2008/12/the-web-is-rmr-not-mvc#comments Sat, 06 Dec 2008 10:21:38 +0000 Ian Davis http://iandavis.com/blog/?p=1306 Last year I wrote a short post titled MVC Obscures the Mechanics of the Web that drew together some other people’s writings on why MVC is poorly matched to the web. I didn’t give an alternative though, apart from indirectly in the comments. Now Paul James has written a succinct description of that alternative and christened it RMR – Resource Method Representation.

In RMR (which is simply REST as seen on the Web) the user interacts with resources using representations via a restricted set of methods. As Paul explains, typically this is implemented as a set of classes representing the resources, a set of Response classes and some kind of routing to tie it all together. This is how frameworks like Tonic (written by Paul) and Konstrukt work. It’s also how my RDF framework Paget works, although that is very immature at the moment and doesn’t cover responses very well.
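The three pieces can be sketched in Python as follows. To be clear about assumptions: this is not the API of Tonic, Konstrukt or Paget. The names Resource, Response, Note, ROUTES, STORE and handle are all made up for the illustration, and the in-memory dict stands in for real storage. Each resource is a class whose methods are named after the HTTP methods it supports, each method returns a Response, and a small routing table maps URI paths to resource classes.

```python
import re


class Response:
    """A representation plus the status and headers that go with it."""

    def __init__(self, status, body="", headers=None):
        self.status = status
        self.body = body
        self.headers = headers or {}


class Resource:
    """Base resource. A subclass supports an HTTP method simply by
    defining a Python method with the lower-cased name."""

    METHODS = ("GET", "PUT", "DELETE")

    def supported(self):
        return [m for m in self.METHODS if hasattr(self, m.lower())]

    def dispatch(self, method, body=""):
        if method not in self.supported():
            # 405 Method Not Allowed, listing what the resource does accept
            return Response(405, headers={"Allow": ", ".join(self.supported())})
        return getattr(self, method.lower())(body)


STORE = {}  # in-memory stand-in for real storage (assumption)


class Note(Resource):
    """A hypothetical note resource supporting GET and PUT only."""

    def __init__(self, note_id):
        self.note_id = note_id

    def get(self, body):
        if self.note_id not in STORE:
            return Response(404)
        return Response(200, STORE[self.note_id], {"Content-Type": "text/plain"})

    def put(self, body):
        created = self.note_id not in STORE
        STORE[self.note_id] = body
        return Response(201 if created else 204)


# Routing: URI pattern -> resource class
ROUTES = [(re.compile(r"^/notes/(\w+)$"), Note)]


def handle(method, path, body=""):
    """Find the resource for a path and apply the method to it."""
    for pattern, resource_class in ROUTES:
        match = pattern.match(path)
        if match:
            return resource_class(*match.groups()).dispatch(method, body)
    return Response(404)


created = handle("PUT", "/notes/1", "hello")
fetched = handle("GET", "/notes/1")
refused = handle("DELETE", "/notes/1")
missing = handle("GET", "/nowhere")
```

There is no controller or view layer in this sketch: the URI identifies the resource, the method name is the operation, and the Response is the representation sent back.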

Here’s to banishing MVC from the web :)

http://iandavis.com/blog/2008/12/the-web-is-rmr-not-mvc/feed